Origins of Language
Since 1995, members of the AI Lab have worked on the problem of the origins of language. The basic idea behind this work is that a community of language users (hereafter called agents) can be viewed as a complex adaptive system which collectively solves the problem of developing a shared communication system. To do so, the community must reach agreement on a repertoire of forms (a sound system in the case of spoken language), a repertoire of meanings (the conceptualisations of reality), and a repertoire of form-meaning pairs (the lexicon and grammar).
Although communication is not a general computational problem, it is nevertheless a problem of great interest. First of all, there is strong interest from a scientific point of view. Discovering how communication systems as complex as human natural languages emerge may help to explain how human language itself originated and evolved. This longstanding and fascinating question has been receiving increasing attention, but only clear scientific models that explain how language evolved (as opposed to enumerating conditions for why language evolved) can be expected to steer us away from the many speculations that made the field suspect for a long time. By clear scientific models we mean that the cognitive structures and interaction behaviors of each agent are specified and that it is shown how they collectively lead to a language.
Second, there is interest because of possible applications. On the one hand, autonomous artificial agents that need to coordinate their activity in open-ended environments could use these mechanisms to develop and continuously adapt their communication systems. On the other hand, understanding how language develops and evolves is probably our only hope of ever building technological artefacts that exhibit human-level language understanding and production. Human languages are constantly changing and differ significantly from one speaker to the next and from one context to the next. We therefore need language technologies that exhibit the same adaptivity as humans.
The experiments conducted so far have always had the same form: (1) They involve a population of (artificial) agents, possibly robots. (2) The agents engage in interactions situated in a specific environment. Such an interaction is called a game. (3) Each agent has a sensori-motor apparatus, a cognitive architecture, and a script determining how it interacts with others. (4) There is an environment (possibly the real world) which consists of situations that are ideally open-ended.
Different parts of language are being investigated within this framework:
Phonetics/Phonology
In the experiments on phonetics and phonology, a population of agents develops human-like sets of speech sounds. The agents engage in local interactions, called imitation games, in which they try to imitate each other's sounds. After a while the agents develop realistic sets of speech sounds.
For the production of sounds, a model of the human articulatory system is used with which a number of consonants as well as all basic vowels can be produced. The model is independent of any language in particular. For the perception of sounds a model of human perception of speech sounds is used.
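The imitation game can be illustrated with a heavily simplified sketch. It is not the lab's model: the articulatory and perceptual models are replaced by points in a hypothetical two-dimensional unit square with additive noise, and the update rules (shift a prototype on success, store the heard sound on failure, occasionally invent a random sound) are assumptions chosen only to make the loop concrete. All function and parameter names are illustrative.

```python
import math
import random

def nearest(prototypes, signal):
    """Index of the stored prototype closest to a perceived signal."""
    return min(range(len(prototypes)),
               key=lambda i: math.dist(prototypes[i], signal))

def utter(prototype, noise, rng):
    """Emit a prototype with uniform noise added to each coordinate."""
    return tuple(v + rng.uniform(-noise, noise) for v in prototype)

def play_imitation_game(initiator, imitator, noise, rng, p_new=0.05):
    """Play one game; each agent is a list of 2-D sound prototypes."""
    # Assumption: the initiator sometimes invents a random new sound
    # (and always does so when its repertoire is still empty).
    if not initiator or rng.random() < p_new:
        initiator.append((rng.random(), rng.random()))
    target = rng.randrange(len(initiator))
    heard = utter(initiator[target], noise, rng)
    if not imitator:
        # Nothing to imitate with yet: memorise the heard sound.
        imitator.append(heard)
        return False
    guess = nearest(imitator, heard)
    echo = utter(imitator[guess], noise, rng)
    # The game succeeds if the initiator recognises its own sound.
    success = nearest(initiator, echo) == target
    if success:
        # Assumption: move the used prototype halfway toward what was heard.
        imitator[guess] = tuple(
            (p + h) / 2 for p, h in zip(imitator[guess], heard))
    else:
        # Assumption: store the heard sound as a new prototype.
        imitator.append(heard)
    return success

def run_imitation_games(n_agents=20, n_games=4000, noise=0.05, seed=0):
    """Play repeated games between randomly chosen pairs of agents."""
    rng = random.Random(seed)
    agents = [[] for _ in range(n_agents)]
    outcomes = []
    for _ in range(n_games):
        initiator, imitator = rng.sample(agents, 2)
        outcomes.append(play_imitation_game(initiator, imitator, noise, rng))
    return agents, outcomes
```

The `noise` value here is an arbitrary half-width in the toy space and is not meant to correspond to any noise level used in the actual experiments.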
The first experiments were meant to explain the shape of vowel systems. One example vowel system was obtained after 4000 imitation games in a population of twenty agents that had to communicate under 20% noise. Systems of this type can be found in many human languages, for example in Arabic, because they are very stable.
Experiments have also been done to explain the shape of consonant systems. The results of these experiments indicate that many phenomena in the sound systems of human languages can be explained by functional constraints that are implemented through imitation and learning in a population of agents. No innate features seem to be necessary.
Lexicon formation
In order to talk about the world, agents need to have a shared lexicon. The lexicon formation experiments investigate how a shared lexicon can emerge in a population of agents that initially have no lexicon at all.
The simplest game played by the agents is the naming game. Two agents are chosen from the population. One agent points to an object and says a name for this object. The other agent looks in its lexicon and checks whether it has the same name for the same object. If so, the game is a success. If not, the game is a failure, but the listening agent remembers the name that was given to the object.
Agents prefer to use the most successful names, and discard the unsuccessful ones. After a while a coherent and successful lexicon emerges.
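The naming game described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in the experiments: a lexicon is a dictionary from objects to scored names, a speaker with no name invents a random five-letter string, and "discarding unsuccessful names" is modelled by dropping all competing names after a success. The function names, the scoring scheme and the default parameters are all hypothetical.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def preferred_name(lexicon, obj):
    """Return the highest-scoring name an agent has for obj, or None."""
    names = lexicon.get(obj, {})
    return max(names, key=names.get) if names else None

def play_naming_game(speaker, hearer, obj, rng):
    """Play one game about obj; return True on success."""
    name = preferred_name(speaker, obj)
    if name is None:
        # Assumption: a speaker without a name invents a random one.
        name = "".join(rng.choice(ALPHABET) for _ in range(5))
        speaker.setdefault(obj, {})[name] = 0
    hearer_names = hearer.setdefault(obj, {})
    if name in hearer_names:
        # Success: both agents reinforce the name and drop its competitors.
        speaker[obj] = {name: speaker[obj][name] + 1}
        hearer[obj] = {name: hearer_names[name] + 1}
        return True
    # Failure: the hearer remembers the name for future games.
    hearer_names[name] = 0
    return False

def run_naming_games(n_agents=10, objects=("o1", "o2", "o3"),
                     n_games=5000, seed=0):
    """Play repeated games between randomly chosen speaker/hearer pairs."""
    rng = random.Random(seed)
    agents = [{} for _ in range(n_agents)]
    outcomes = []
    for _ in range(n_games):
        speaker, hearer = rng.sample(agents, 2)
        obj = rng.choice(objects)
        outcomes.append(play_naming_game(speaker, hearer, obj, rng))
    return agents, outcomes
```

Comparing the fraction of successful games early and late in `outcomes` gives a rough picture of how the population moves toward a shared lexicon.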
A number of interesting variations on the basic game have been tried out. One of these involved developing a more abstract spatial vocabulary for expressing the spatial relations between different objects. In another experiment, agents were given spatial positions, and the probability of interaction between two agents depended on the distance between them. In this experiment, depending on the strength of the coupling between the agents, monolingual, bilingual or multilingual agents emerged.
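One way to make interaction probability depend on distance is sketched below. The exponential decay and the `coupling` length scale are assumptions made for illustration; the text does not specify which function of distance was used in the experiment.

```python
import math
import random

def partner_weights(positions, speaker, coupling):
    """Unnormalised probability of the speaker addressing each agent.

    Assumption: the weight decays exponentially with Euclidean distance,
    and `coupling` is a length scale (larger means distant agents
    interact more often).
    """
    sx, sy = positions[speaker]
    weights = []
    for i, (x, y) in enumerate(positions):
        if i == speaker:
            weights.append(0.0)  # an agent never talks to itself
        else:
            distance = math.hypot(x - sx, y - sy)
            weights.append(math.exp(-distance / coupling))
    return weights

def choose_partner(positions, speaker, coupling, rng):
    """Sample a hearer index for the given speaker."""
    weights = partner_weights(positions, speaker, coupling)
    return rng.choices(range(len(positions)), weights=weights, k=1)[0]
```

With a very small `coupling` relative to the distances involved, all weights can underflow to zero and `random.choices` will raise an error, so a practical version would need to guard against that case.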
Grammar and Syntax
The origin of syntax is one of the most challenging questions in science. Our approach to this question has been similar to the approach we have taken to study the emergence of sound systems, meaning repertoires and lexicons. We model a population of agents, situated in a world, that -- by learning from each other -- together create a communication system aimed at transmitting knowledge about that world.
We have studied in different models the following aspects:
- The emergence of a shared lexicon for multiple word utterances;
- The emergence of syntactic categories;
- The creation of hierarchical semantic descriptions;
- The use and implementation of compositional grammars that map complex meanings onto grammatical sentences and vice versa;
- The emergence of morphological markers;
- Grammar induction;
- Language adaptation in "iterated learning";
- Natural selection of grammars.
We are currently working on integrating several components into a system that shows the emergence of a syntactic language in a population, under realistic cognitive and environmental constraints. Although at this point the results are still diverse and limited to only certain aspects of syntax, we do believe they lend support to our working hypothesis: syntactic language emerged from the same processes that still operate in human communication, and much less "innate" knowledge of language is necessary than is usually assumed.
Publications
The origins of language group published the following work on the origins of language between 1997 and 2000:
- De Jong, E.D. (2000). Attractors in the Development of Communication. J.-A. Meyer, A. Berthoz, D. Floreano, H. Roitblat, and S. Wilson (Eds). SAB2000 Proceedings Supplement Book. Honolulu, Hawaii: International Society for Adaptive Behavior.
- De Jong, E.D. (1999). Analyzing the Evolution of Communication from a Dynamical Systems Perspective. In Proceedings of the European Conference on Artificial Life ECAL'99, 689-693. Springer-Verlag LNCS, Berlin.
- De Jong, E.D. (1999). Autonomous Concept Formation. In Proceedings of the International
- De Jong, E.D. and L. Steels (1999). Generation and selection of sensory channels. In Evolutionary Image Analysis, Signal Processing and Telecommunications First European Workshops, EvoIASP'99 and EuroEcTel'99 Joint Proceedings, pp. 90-100. Göteborg, Sweden, May 1999. Springer-Verlag LNCS, Berlin.
- De Jong, E.D. (1999). Coordination Developed by Learning from Evaluations. To appear in the collected papers of the VIM workshops, Springer-Verlag LNAI, Berlin.
- De Jong, E.D. (1998). The Development of a Lexicon Based on Behavior. In: Proceedings of the Tenth Dutch Conference on Artificial Intelligence NAIC'98.
- De Jong, E.D. and P. Vogt (1998). How Should a Robot Discriminate Between Objects? A comparison between two methods. Proceedings of the Fifth International Conference of The Society for Adaptive Behavior SAB'98.
- Steels, L. and P. Vogt (1997). Grounding adaptive language games in robotic agents. In Harvey, P. and P. Husbands (eds.) (1997) Proceedings of the 4th European Conference on Artificial Life 97.
- Vogt, P. (1998). Perceptual grounding in robots. In Birk, A. and J. Demiris (eds.) Proceedings of the 6th European Workshop on Learning Robots 1997. Lecture Notes on Artificial Intelligence. Springer-Verlag 1998.
- Vogt, P. (1998). The evolution of a lexicon and meaning in robotic agents through self-organization. AI-Memo 98-09. Presented at the 2nd International Conference on the Evolution of Language, London, April 6-9 1998.
