Workshop COLING 2004

Enhancing and using electronic dictionaries

A SIGLEX-endorsed COLING workshop

Michael Zock (Limsi, CNRS) & Patrick Saint Dizier (Irit, CNRS)

Sunday 29th August 2004

Introduction

A dictionary is a vital component of any natural language processing system. Its modern, digital form has considerable potential, especially if it is extended and built in a way compatible with the needs and habits of the average language user. There are many ways to make an electronic dictionary useful for people in their daily tasks of processing language. One could assist:

* reading and writing: adding a transliterator and a morphological generator/parser to a dictionary would put the needed information just a mouse click away. Imagine someone trying to look up the meaning (or translation) of a word in a script he can’t read.
* language learning: combining dictionaries with a parametrizable flashcard system and a goal-driven exercise generator could help the memorization and automation of words and basic syntactic patterns. In such a system, choosing a goal would trigger syntactic templates, and filling these templates with words would yield (simple) sentences.
* lexical access (navigation): structuring the dictionary in a way similar to the human mind, i.e. building an associative network akin to WordNet, but with many more links, in particular along the syntagmatic axis, could assist the writer not only in finding new ideas (brainstorming), but also in finding the word s/he is looking for. Within this framework, word access amounts to entering and navigating a huge associative network. To build such a tool, one could extract associations from an encyclopedia, label them, and add them as links to a resource like WordNet.
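
To make the lexical access idea concrete, here is a minimal sketch of navigating a labelled associative network. It is an illustration only, not a design proposed by the workshop: the toy entries, the link labels (`is_a`, `co_occurs_with`, and so on) and the function name `neighbours_within` are all assumptions, and a real resource such as WordNet would supply far richer data.

```python
from collections import deque

# Hypothetical toy network: word -> list of (link label, associated word).
# Entries and labels are invented for illustration only.
ASSOCIATIONS = {
    "coffee": [("is_a", "beverage"), ("co_occurs_with", "cup"), ("co_occurs_with", "sugar")],
    "cup": [("is_a", "container"), ("co_occurs_with", "saucer")],
    "beverage": [("has_kind", "tea")],
    "sugar": [("has_property", "sweet")],
}


def neighbours_within(start, max_hops):
    """Breadth-first walk over the network.

    Returns a dict mapping every word reachable from `start` in at most
    `max_hops` links to the shortest path of (label, word) steps leading to it.
    """
    paths = {start: []}
    queue = deque([start])
    while queue:
        word = queue.popleft()
        if len(paths[word]) == max_hops:
            continue  # do not expand beyond the hop limit
        for label, target in ASSOCIATIONS.get(word, []):
            if target not in paths:
                paths[target] = paths[word] + [(label, target)]
                queue.append(target)
    del paths[start]  # the start word is not its own neighbour
    return paths


if __name__ == "__main__":
    for word, path in neighbours_within("coffee", 2).items():
        print(word, "via", path)
```

A user who remembers "coffee" but not the target word could browse the returned candidates, with the link labels serving as navigational cues. The hop limit keeps the candidate list small enough to scan.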

As one can see, there are numerous ways to enhance dictionaries. While new hardware offers many, sometimes surprising, opportunities for novel uses, seizing them requires some rethinking. Encouraging this rethinking is the goal of this one-day workshop.

In particular, we’d like to discuss interesting extensions and enhancements of electronic dictionaries. For example, one could consider merging different thesaurus-like dictionaries and see what kind of conceptual and navigational aids might be added to support the language user: what are his needs, and what information is he looking for? Indeed, a shift of focus might be necessary, from the data (content and size of the dictionary) to their organisation and access. After all, what is a huge dictionary good for if one cannot find the word one is looking for?

Target audience

The aim of this workshop is to bring together leading researchers involved in the building of electronic dictionaries to discuss modifications of existing resources in line with the users’ needs (i.e. how to capitalize on the advantages of the digital form). Given the breadth of the questions, we welcomed reports on work from many perspectives, including, but not limited to, linguistics, computer science, psycholinguistics, language learning, and ergonomics. We requested that each contribution address computational aspects.

Preliminary program

8:45-9:00 Welcome to participants (introduction)
	9:00-12:00 Long papers

	09:00	J. Breen	Multiple Indexing in an Electronic Kanji Dictionary

	09:30	G. Huet	Design of a Lexical Database for Sanskrit

	10:00	V. Pekar	Linguistic Preprocessing for Distributional Classification of Words

	10:30	L. Romary, S. Salmon-Alt & G. Francopoulo	Standards going concrete : from LMF to Morphalou

	11:00-11:30 coffee break

	11:30	M. Zock & S. Bilac	Word Lookup on the Basis of Associations : from an Idea to a Roadmap

	12:00-12:50 Short poster presentations (5 minutes each)

	12:00	I. A. Bolshakov & A. Gelbukh	A Very Large Dictionary with Paradigmatic, Syntagmatic, and Paronymic Links between Entries

	12:05	Elzbieta Dura	Concordance of Snippet

	12:10	V. Keselj, T. Keselj & L. Zlatic	R{j}ecnik.com : English-Serbo-Croatian Electronic Dictionary

	12:15	M. Maxwell & W. Poser	Morphological Interfaces to Dictionaries

	12:20	J. McCracken	Rebuilding the Oxford Dictionary of English as a Semantic Network

	12:25	C. Mota, P. Carvalho & E. Ranchhod	Multiword Lexical Acquisition and Dictionary Formalization

	12:30	T. O'Hara & J. Wiebe	Empirical Acquisition of Differentiating Relations from Definitions

	12:35	V. Reuer	Language Resources for a Network-based Dictionary

	12:40	S. Schulte im Walde	Identification, Quantitative Description, and Preliminary Distributional Analysis of German Particle Verbs

	12:45	S. Sheremetyeva	Application Adaptive Electronic Dictionary with Intelligent Interface

	13:00-14:00 lunch
	14:00-15:30 Project notes (20 minutes each)

	14:00	U. Apel and J. Quint	Building a Graphetic Dictionary for Japanese kanji-Character Look-up Based on Brush Strokes or Stroke Groups, and the Display of Kanji as Path Data

	14:20	P. Bernard, J. Dendien & J.M. Pierrel	A Computerized Dictionary : Le Trésor de la Langue Française informatisé (TLFi)

	14:40	B. Jacquemin	Dictionaries Merger for Text Expansion in Question Answering

	15:00	G. Neumann, C. Fellbaum, A. Geyken, A. Herold, C. Hümmer, F. Körner, U. Kramer, K. Krell, A. Sokirko, D. Stantcheva & E. Stathi	A Corpus-based Lexical Resource of German Idioms

	15:30-16:00 coffee break
	16:00-17:00 poster session
	17:00-17:45 wrap up discussion
18:00 End of the workshop

Posters

size: the maximum size is 120 cm x 198 cm

format: how about A0 portrait?

installation time: lunch break might be a good moment

Useful Information

Information about Geneva (transportation, parking, etc.)
How to get to the conference building (Uni-Mail)
- http://www.issco.unige.ch/coling2004/hotel_orientation.html
Map showing the location of the conference venue (Uni-Mail)
- http://w3public.ville-ge.ch/Ville-ge/adrge.nsf/WebPlan?ReadForm&l=339&x=116749&y=499779&id=970916094739&typ=ADR3

Registration
To register for the workshop, please go to the following web site: http://www.issco.unige.ch/coling2004/
The late registration fee is CHF 150 for regular participants and CHF 120 for students.

Program Committee:

* Antonietta Alonge (University of Perugia, Italy)
* Christian Boitet (GETA, Grenoble, France)
* Nicoletta Calzolari (ILC-CNR, Pisa, Italy)
* Christiane Fellbaum (Princeton University, USA)
* Graeme Hirst (University of Toronto, Canada)
* Mathieu Mangeot-Nagata (NII, Tokyo)
* Rada Mihalcea (University of North Texas, USA)
* Alain Polguère (OLST, University of Montreal, Canada)
* James Pustejovsky (Brandeis University, USA)
* Gilles Sérasset (GETA, Grenoble, France)
* Patrick Saint Dizier (IRIT-CNRS, Toulouse, France)
* Takenobu Tokunaga (Tokyo Institute of Technology, Tokyo, Japan)
* Dan Tufis (RACAI, Bucharest, Romania)
* Jean Véronis (University of Aix-en-Provence, France)
* Piek Vossen (Irion Technologies, Delft, The Netherlands)
* Leo Wanner (University of Stuttgart, Germany)
* Michael Zock (Limsi-CNRS, France)

Contact

For any queries please contact Michael Zock: zock@limsi.fr