Vocabulary and the Computer (Vocabulaire en Computer).
M. Boot | Vakgroep Toegepaste Taalkunde en Computerlinguïstiek Rijksuniversiteit Utrecht
Computational linguistics has developed three classes of models for the automated treatment of natural-language texts. The first class is best characterized as the set model: language is defined as a set of words, and ordinary arithmetic operations are performed on those words. The set model has led to frequency counts. Frequency counts of natural-language material have proved to be of little importance for language analysis and for the study of language learning.
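As an illustration of the set model, a frequency count can be sketched in a few lines of Python. This is only a minimal sketch and not the procedure used in the article: the function name `word_frequencies`, the sample sentence, and the crude tokenization rule (lowercase runs of letters and apostrophes) are all assumptions made for the example.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count how often each word form occurs in `text`.

    Tokenization is a simplifying assumption: the text is lowercased and
    every run of ASCII letters or apostrophes counts as one word.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical sample text, chosen only for illustration.
sample = "the cat saw the dog and the dog saw the cat"
freq = word_frequencies(sample)

# List the word forms from most to least frequent.
for word, count in freq.most_common():
    print(word, count)
```

The result treats the text purely as a collection of countable items: word order and sentence structure play no part, which is the limitation the abstract points to.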
The second class consists of formal linguistic models. Here language is defined not merely as a set of words but as a set of sentences, on which more than purely arithmetic operations can be performed. Important notions in these models are transformations, recursion, and grammars. This class has led to the adaptation of context-free grammars to natural language. Its weak point is that formal grammars are ill-suited to human language.
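To make the idea of a language as a set of sentences concrete, the following sketch recognizes sentences with a toy context-free grammar. Everything in it is an assumption for illustration: the grammar rules, the small English lexicon, and the names `GRAMMAR`, `LEXICON`, `ends`, and `recognizes` do not come from the article. The recognizer is a simple top-down one and relies on the grammar having no left-recursive rules.

```python
# Toy context-free grammar (hypothetical): each nonterminal maps to a list
# of productions, and each production is a sequence of symbols.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"]],
    "VP": [["V", "NP"], ["V"]],
    "PP": [["P", "NP"]],
}

# Toy lexicon (hypothetical): each word class maps to the words it covers.
LEXICON = {
    "Det": {"the", "a"},
    "N": {"dog", "cat", "park"},
    "V": {"saw", "slept"},
    "P": {"in"},
}


def ends(symbol, tokens, start):
    """Return every position where `symbol` can end if it begins at `start`."""
    if symbol in LEXICON:
        # A word class matches exactly one token.
        if start < len(tokens) and tokens[start] in LEXICON[symbol]:
            return {start + 1}
        return set()
    results = set()
    for production in GRAMMAR[symbol]:
        # Thread the set of reachable positions through the production.
        positions = {start}
        for part in production:
            positions = {e for p in positions for e in ends(part, tokens, p)}
        results |= positions
    return results


def recognizes(sentence):
    """True if the whole sentence can be derived from the start symbol S."""
    tokens = sentence.lower().split()
    return len(tokens) in ends("S", tokens, 0)


print(recognizes("the dog saw a cat in the park"))
print(recognizes("dog the saw"))
```

The rule `NP -> Det N PP` together with `PP -> P NP` is recursive, so the grammar defines an unbounded set of sentences from a finite set of rules, something a word count cannot express.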
The third class consists of artificial intelligence models. Here the computer is used to simulate human verbal behavior, and language processes are defined as processes of understanding language. Linguistic knowledge is not defined outside the vocabulary or outside these processes. This class has led to the application of Minsky's frame theory to natural language processing. The lexicon is defined here as a procedural fact device within the language processor itself. This class is the most promising for the study of language learning and of the role vocabulary plays in that process.
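One loose way to picture a frame-based lexicon entry is as a structure whose slots hold facts and whose attached procedures compute missing facts on demand. The sketch below is an assumption-laden illustration, not the system described in the article: the `Frame` class, the verb entry `eat`, its slot names, and the inference rule for the instrument are all hypothetical.

```python
class Frame:
    """A minimal frame: named slots plus procedures that fill gaps on demand."""

    def __init__(self, name, defaults=None, procedures=None):
        self.name = name
        self.slots = dict(defaults or {})            # stored facts and defaults
        self.procedures = dict(procedures or {})     # "if-needed" procedures

    def fill(self, slot, value):
        """Store a fact found while processing a text."""
        self.slots[slot] = value

    def get(self, slot):
        """Return a stored value, else compute one, else None."""
        if self.slots.get(slot) is not None:
            return self.slots[slot]
        if slot in self.procedures:
            return self.procedures[slot](self)
        return None


def infer_instrument(frame):
    # Hypothetical rule: the expected instrument depends on the food slot.
    return "spoon" if frame.get("food") == "soup" else "fork"


# Hypothetical lexicon entry for the verb "eat".
eat = Frame(
    "eat",
    defaults={"agent": None, "food": "something edible"},
    procedures={"instrument": infer_instrument},
)

eat.fill("food", "soup")
print(eat.get("instrument"))
```

In this picture the knowledge about a word lives inside the entry itself, partly as data and partly as procedure, rather than in a grammar that stands apart from the vocabulary.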
Article language: Dutch