Michael Zock's homepage

Michael Zock

Emeritus Research Director
& Honorary Professor

LIS-CNRS, UMR 7020
Groupe TALEP
Aix-Marseille Université
Case 901 - 163 Avenue de Luminy
F-13288 MARSEILLE / FRANCE

CogALex		Cognitive Aspects of the Lexicon (COLING workshops).
Research		For a summary see here below.
Short bio		see here.
E-mail		michael.zock (@) lif.univ-mrs.fr
Phone		+33 (0)4 86 09 06 85
Fax		+33 (0)4 91 82 92 75


Communication relies on knowledge : knowledge of language, knowledge about the world around us, and, of course, knowledge concerning people : what do they know, feel and believe in? Being an empirical, lifelong process of learning it is never too early to start practicing, be it for acquiring the skill or for understanding the logic behind it. When my grandchild Augusta had grasped the relationship between this book (1, 2), her and me, she became interested in it. Yet, having her way to make sense of the world she tends to 'read' books upside down. Far from being a handicap, this may turn out to be an asset, helping her to become proficient in other scripts than only the one of her mother tongue. This being said, she is a lot nicer (3), and so much more fun to be with when she is not in the 'study mode' (4, 5, 6, 7, 8). Note that these pictures nicely illustrate the difference between 'impression' and 'expression', the latter being my concern.

'If you talk to a man in a language he understands, that goes to his head.
If you talk to him in his language, that goes to his heart.' (Nelson Mandela)

Situation + background

I started my research career with the French National Research Centre (CNRS) at LIMSI, an A.I. lab close to Paris. 2006, I moved to the south of France (Marseille) to join the NLP group of LIF (Aix-Marseille Université). Currently I am emeritus research director at CNRS and Honorary Professor of the Research Institute of Information and Language Processing (RIILP), university of Wolverhampton, UK (1).

Having learned languages under various circumstances, and having observed language acquisition in different contexts, I was intrigued by the fact that most people succeed so well in a natural setting, while they often miserably fail in school. To get a better understanding of the causes of this problem, I started to look at language learning from a psycholinguistic and computational linguistic (simulation) perspective.

My Ph.D. dissertation was devoted to speaking, a skill that must be learned. Yet, learning to speak and producing language in real-time are two daunting tasks. One must not only acquire a huge amount of knowledge (vocabulary, grammar), but also be able to accomplish rapidly and quasi-simultaneously the following tasks: message planning, formulation, and articulation. What makes this process even more demanding is the fact that the needed operations must be carried out under severe time and space constraints. Speaking is fast and short-term memory is limited.

Teaching the skills of communication (speaking and writing) is one of the missions of school. Alas, schools don't always succeed very well, and there are various reasons for this. Schools tend to emphasize rules and mechanical repetition, leaving aside the need to integrate rules into a process allowing the stepwise transformation of a conceptual input (message) into its corresponding form (sentence). In sum, they do not show how to convert 'declarative knowledge' (principles, rules) into 'procedural knowledge' (the way of using rules). Hence students may know a lot about language without necessarily knowing the language. Not having acquired the necessary operations (process), they don't know how to convert some input (message) into the corresponding output (sentence). It is also noteworthy that what has been learned (product) does not always correspond to what has been taught, an intuition that was confirmed by the eye track data collected during my Ph.D. (" From knowing-what to knowing-how: strategies in language production", Paris, 1980). The data also confirmed that certain methods are inadequate, making learning unnecessarily hard, if not impossible. Yet schools are meant to support learning rather than to prevent us from doing so.

Realizing all these facts, I felt that we needed to change our perspective, and, studying language production from a psychological point of view and trying to emulate the process by computer seemed to be a step in the right direction. Here are some of the questions I've asked myself then: (1) When do we process what? (2) What are the inputs and outputs? (3) What are the specific operations and constraints?

Motivations + approach.

Speaking and writing are resource-intensive processes whose success depends not only on knowledge but also on our momentary ability or skill to access and use it (synthesis). Yet these three conditions are not always met. Hence, our success in producing language lies somewhere in between two extremes: full access to the needed resources, or more or less limited access, yielding sub-optimal performance revealed by gaps, errors, disfluencies, etc.

This being so it makes sense to create authoring aids, i.e. assistive technologies (1, 2, 3, 4, 5), both for supporting the mother tongue or a foreign language. One may even wonder if this is not (also) one of the missions of computational linguistics. There are good reasons to believe that one should widen the scope and take better into account the human factor (needs and constraints of the human language processor), to build then the adequate tools (Interactive NLP, Human-Centered Computational Language Processing). However, to do so and to get these tools used in real-world (desktop, classroom ), true interdisciplinary work is needed, not only discretely, for not saying 'shamefully', in the backyard, but also more deliberately and visible, at the centre-stage. For additional information concerning the mindset of my approach, see 'Interactive Natural Language Generation' (INLG) here below.

Research summary

My research interests lie in communication, cognitive science and language production or language generation by and large. Starting from user needs and empirical findings (psycholinguistics, neurosciences) I try to build tools helping people to acquire the skill of speaking or writing in a foreign language and in their mother tongue. My current research deals with the following four topics:

Message-planning: creation of an interface (i.e. linguistically motivated ontology augmented with a graph generator), to support conceptual authoring (message composition);

Outline-planning: help authors to perceive possible links between their ideas to produce well-organized thoughts, i.e. coherent discourse. Writing is thinking which also implies linking of hitherto unconnected thought;

Lexical access: words being a major gateway to the mind, we must learn how to store, use and access them. My goal is to help authors overcome the tip-of-the-tongue problem. To this end I rely on certain features of the human brain (distribution of knowledge, which may be accessible only in terms of fragments) and the mental lexicon (lexical items being linked via association; functional equivalence of 'navigation' and 'spreading activation'). Of course, words in books, computers, or the human brain are not the same, be it for their storage, (organization) access or representation (holistic vs. decomposed). Still, it does make sense to take inspiration from the mental lexicon to see whether, functionally speaking, we can achieve something equivalent. This means in our case that access or retrieval consists of navigating in a huge semantic network, i.e. within a map where all words are linked via associations. Search space can be kept small because users are generally able to activate at least some of the target's (lemon) direct neighbors (yellow, acid fruit). Often they are even able to name the relationship between the two (synonym, hypernym, ...), and if none of this holds, one can still try to organize the output, and present the direct neighbors of the input (the word coming to the user's mind when looking for a target) in the form of a categorial tree.

Acquisition of basic speaking skills: help students to become quickly fluent in a foreign language (both western and oriental) by learning the basic vocabulary and syntactic structures (learn words in context). The scope is the survival level, and the method used is to build an open, possibly self-extending, multilingual phrasebook augmented with a parametrizable exercise generator.

Here is a summary of my research expressed in a few words, or simply by a wordle.

Organisation of some workshops

1° Electronic dictionaries and the mental lexicon

1.1 CogALex (Cognitive Aspects of the Lexicon), workshop series co-located with COLING :

2024 (Torino, Italy), workshop;
2022 (Taipei, Taiwan), workshop; 2020 (Barcelona, Spain), workshop, shared task;
2016 (Osaka, Japan), workshop, shared task; 2014 (Dublin, Ireland), workshop, shared task;
2012 (Mumbai, India), 2010 (Bejing, China); 2008 (Manchester, UK), and
2004, a forerunner (Geneva, Switzerland).

1.2 ICCS (International Conference on Cognitive Science), Beijing, 2010.

1.3 RLTLN (Lexical graphs and NLP), TALN workshop, Marseille, 2014.

2° Natural Language Processing and Cognitive Science (NLPCS)

NLPCS-2013: to access the proceedings, talks and tutorials of this workshop, click here.
Prior events : 2012, 2011, 2010, 2009, 2008, 2007.

3° European Workshop on Natural Language Generation (1995, 1993, 1991, 1989).

4° Tools for AuthoringAids (cfp and proceedings), LREC, Marrakech, 2008.

Some invited talks

1° Some thoughts concerning the future of the discipline

1.1RING panel : debate with Eduard Hovy (slides), COLING, 2010, Beijing.

1.2'AI + NLP' : abstract + slides (in french), Paris, 2012.

2° Other topics

2.1 Types and uses of semantic networks: genetic, practical and psycholinguistic aspects. Semantic networks: construction and usage (slides), Imera (Marseille, France)

2.2 'Errare humanum est'. Refusing to 'appreciate' this fact could be a big mistake ! (paper, slides), ERRARE (Sinaia, Romania).

2.3 'The striker’s fear at the penalty, or why intelligence is not everything?' IJCAI workshop: The Human factor in AI (HAI), Stockholm, July 15, 2018, abstract, slides (1, 2).

3° Electronic dictionaries and the mental lexicon

3.1Roget, WordNet and beyond. RANLP-2015, Hissar, Bulgaria.

3.2 Needles in a haystack and methods to find them. Can neuroscientists, psychologists, and computational linguists help us (to build a tool) to overcome the Tip of the Tongue problem? NetWordS conference: 'Word Knowledge and Word Usage: Representations and Processes in the Mental Lexicon' (Pisa, Italy).

3.3 Wheels for the mind of the language producer: microscopes, macroscopes, semantic maps and a good compass. LREC, Malta, 2010.

3.4 The mental lexicon, the blueprint of the dictionaries of tomorrow: linguistic, computational and psychological aspects of a highly valuable resource. ESSLLI, Toulouse, 2009.

3.5 How to help authors to overcome the Tip-Of-the-Tongue problem? Lexical graphs, associative networks, and some of their inherent problems. Toulouse (abstract, slides).

3.6Do you (still) love me? A crash course on telling and recognizing lies. Eurolan-2007, Iași, Romania (abstract, slides).

3.7If all roads lead to Rome, they are not all alike. BLRI (Brain & Language Research Institute), Marseille, 2012 (abstract, in French).

Publications

Here is a list of my publications, and here are the links to a book on lexical resources (Gala, N. & Zock, M. (Eds). Ressources Lexicales, John Benjamins, Amsterdam), as well as to some special issues devoted to 'Cognition and the Lexicon' (2015 and 2011). For the respective introductions see here (2015, 2011).

Natural Language Generation systems

List of Natural Language Generators. For a more recent version see here.

Many thanks for this Festschrift, a gift that I highly value
both as a token of your friendship and appreciation of my work.

My heartfelt thanks to all those colleagues and friends who have contributed to this Festschrift ( 1, 2, 3), including the authors of these wonderful introductions (4, 5). I would also like to extend my thanks to all of you for your empathy for my approach, even though it is not a major concern for most members of our community. I have always been interested in people and their ways of expressing feelings and thoughts (communication). Dealing mainly with language production, I am nevertheless interested in its correlate, interpretation, as, to be understood, we must not only speak correctly, but also understand (to some extent) the person we are talking to. “Speech belongs half to the speaker, half to the listener.” (Michel de Montaigne).

Interactive NLG (INLG), i.e. natural language generation mediated by machines, is meant to help people to communicate or to acquire the skills of speaking and writing. This implies finding relevant topics (messages) and expressing them then in a way likely to be understood. Various processes are involved in this (organization of thoughts, finding the relevant words, articulation,....), processes which are only partially understood. Hence gaining a better understanding of each one of them and building tools to support their learning are important goals. Yet to allow for this we need to integrate right from the start researchers with diverse expertise (psychologists, linguists, computer scientists, teachers) as well as the final user, an actor all too often overlooked. While this may look like a detour, it can be an asset, i.e. a potential accelerant to progress. Indeed, one may well make good progress and move ahead without necessarily dashing on the highway, but rather travel quietly on cross-disciplinary country roads, using the fertility of the soil and the 'gained' time to find new solutions to old problems while planting the seeds of future harvests.

Back to the top of the page