Enhancing and using electronic dictionaries

Michael Zock (Limsi, CNRS) & Patrick Saint Dizier (Irit, CNRS)

Sunday 29th August 2004

A dictionary is a vital component of any natural language processing system. Its modern, digital form has considerable potential, especially if it is extended and built in a way compatible with the needs and habits of the average language user. There are many ways to make an electronic dictionary useful for people in their daily tasks of processing language. One could assist

  1. reading and writing: adding a transliterator and a morphological generator/parser to a dictionary would put the needed information a mouse click away. Imagine someone trying to look up the meaning (or translation) of a word in a script he can’t read.
  2. language learning: combining dictionaries with a parametrizable flashcard system and a goal-driven exercise generator could help the memorization and automation of words and basic syntactic patterns. In such a system, choosing a goal would trigger syntactic templates, and filling these with words would yield (simple) sentences.
  3. lexical access (navigation): structuring the dictionary along the lines of the human mind, i.e. building an associative network akin to WordNet, but with many more links, in particular on the syntagmatic axis, could assist the writer not only in finding new ideas (brainstorming), but also in finding the word s/he is looking for. Within this framework, word access amounts to entering and navigating a huge associative network. To build such a tool, one could extract associations from an encyclopedia, label them, and add them as links to a resource like WordNet.
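To make the idea of word access as navigation in an associative network more concrete, here is a minimal sketch in Python. It is not a description of any existing system: the class name `AssociativeLexicon`, its methods, the link labels, and the sample words are all hypothetical and chosen only for illustration. The sketch assumes that links are labeled and symmetric, and that a target word can be approached by intersecting the neighbourhoods of the cue words that come to the user's mind.

```python
from collections import defaultdict


class AssociativeLexicon:
    """Toy associative network: words are nodes, labeled links are edges.

    Hypothetical illustration only; a real resource would need far more
    link types, weights, and words.
    """

    def __init__(self):
        # word -> set of (link_label, neighbour) pairs
        self._links = defaultdict(set)

    def add_link(self, source, label, target):
        """Store a labeled association in both directions (an assumption)."""
        self._links[source].add((label, target))
        self._links[target].add((label, source))

    def neighbours(self, word):
        """Return the set of words directly associated with `word`."""
        return {target for _, target in self._links.get(word, set())}

    def candidates(self, cues):
        """Return the words linked to every cue word."""
        sets = [self.neighbours(cue) for cue in cues]
        return set.intersection(*sets) if sets else set()


# Sample data, invented for the example.
lex = AssociativeLexicon()
lex.add_link("coffee", "syntagmatic", "cup")
lex.add_link("coffee", "syntagmatic", "black")
lex.add_link("coffee", "paradigmatic", "tea")
lex.add_link("mocha", "isa", "coffee")
lex.add_link("mocha", "syntagmatic", "chocolate")
lex.add_link("tea", "syntagmatic", "cup")

# A user who cannot recall "mocha" but thinks of "coffee" and "chocolate":
print(lex.candidates(["coffee", "chocolate"]))
```

Each additional cue narrows the candidate set, which is one simple way to picture navigation: the user enters the network at the words that are available and moves towards the one that is not.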

As one can see, there are numerous ways to enhance dictionaries. While the new hardware offers many, sometimes surprising opportunities for novel uses, seizing them requires some rethinking. This is the goal of this one-day workshop.

In particular, we’d like to discuss interesting extensions and enhancements of electronic dictionaries. For example, one could consider merging different, thesaurus-like dictionaries and see what kind of conceptual and navigational aids might be added to support the language user: what are his needs, what information is he looking for? Indeed, the focus may need to shift from the data (content and size of the dictionary) to their organisation and access. After all, what good is a huge dictionary if one cannot find the word one is looking for?

Target audience

The aim of this workshop is to bring together leading researchers involved in the building of electronic dictionaries to discuss modifications of existing resources in line with the users’ needs (i.e. how to capitalize on the advantages of the digital form). Given the breadth of the questions, we welcomed reports on work from many perspectives, including, but not limited to, linguistics, computer science, psycholinguistics, language learning, and ergonomics. We requested that each contribution address computational aspects.

Preliminary program

8:45-9:00 Welcome to participants (introduction)
  9:00-12:00 Long papers
  09:00 J. Breen Multiple Indexing in an Electronic Kanji Dictionary
  09:30 G. Huet Design of a Lexical Database for Sanskrit
  10:00 V. Pekar Linguistic Preprocessing for Distributional Classification of Words
  10:30 L. Romary, S. Salmon-Alt & G. Francopoulo Standards going concrete : from LMF to Morphalou
  11:00-11:30 coffee break
  11:30 M. Zock & S. Bilac Word Lookup on the Basis of Associations : from an Idea to a Roadmap
  12:00-12:50 short poster presentation (5 minutes each)
  12:00 I. A. Bolshakov & A. Gelbukh A Very Large Dictionary with Paradigmatic, Syntagmatic, and Paronymic Links between Entries
  12:05 Elzbieta Dura Concordance of Snippet
  12:10 V. Keselj, T. Keselj & L. Zlatic R{j}ecnik.com : English-Serbo-Croatian Electronic Dictionary
  12:15 M. Maxwell & W. Poser Morphological Interfaces to Dictionaries
  12:20 J. McCracken Rebuilding the Oxford Dictionary of English as a Semantic Network
  12:25 C. Mota, P. Carvalho & E. Ranchhod Multiword Lexical Acquisition and Dictionary Formalization
  12:30 T. O'Hara & J. Wiebe Empirical Acquisition of Differentiating Relations from Definitions
  12:35 V. Reuer Language Resources for a Network-based Dictionary
  12:40 S. Schulte im Walde Identification, Quantitative Description, and Preliminary Distributional Analysis of German Particle Verbs
  12:45 S. Sheremetyeva Application Adaptive Electronic Dictionary with Intelligent Interface
  13:00-14:00 lunch
  14:00-15:30 Project notes (20 minutes each)
  14:00 U. Apel and J. Quint Building a Graphetic Dictionary for Japanese kanji-Character Look-up Based on Brush Strokes or Stroke Groups, and the Display of Kanji as Path Data
  14:20 P. Bernard, J. Dendien & J.M. Pierrel A Computerized Dictionary : Le Trésor de la Langue Française informatisé (TLFi)
  14:40 B. Jacquemin Dictionaries Merger for Text Expansion in Question Answering
  15:00 G. Neumann, C. Fellbaum, A. Geyken, A. Herold, C. Hümmer, F. Körner, U. Kramer, K. Krell, A. Sokirko, D. Stantcheva & E. Stathi A Corpus-based Lexical Resource of German Idioms
  15:30-16:00 coffee break
  16:00-17:00 poster session
  17:00-17:45 wrap up discussion
18:00 End of the workshop

Useful Information

  • size: the maximum size is 120 cm x 198 cm
  • format: how about A0 portrait?
  • installation time: lunch break might be a good moment


    Program Committee:

    * Antonietta Alonge (University of Perugia, Italy)
    * Christian Boitet (GETA, Grenoble, France)
    * Nicoletta Calzolari (ILC-CNR, Pisa, Italy)
    * Christiane Fellbaum (Princeton University, USA)
    * Graeme Hirst (University of Toronto, Canada)
    * Mathieu Mangeot-Nagata (NII, Tokyo)
    * Rada Mihalcea (University of North Texas, USA)
    * Alain Polguère (OLST, University of Montreal, Canada)
    * James Pustejovsky (Brandeis University, USA)
    * Gilles Sérasset (GETA, Grenoble, France)
    * Patrick Saint Dizier (IRIT-CNRS, Toulouse, France)
    * Takenobu Tokunaga (Tokyo Institute of Technology, Tokyo, Japan)
    * Dan Tufis (RACAI, Bucharest, Romania)
    * Jean Véronis (University of Aix en Provence, France)
    * Piek Vossen (Irion Technologies, Delft, The Netherlands)
    * Leo Wanner (University of Stuttgart, Germany)
    * Michael Zock (Limsi-CNRS, France)

    For any queries please contact Michael Zock: zock@limsi.fr