The paper is concerned with problems of methodology. Against this background, the state of today's corpora is discussed, and some areas are identified as being far from satisfactory. The place of corpora in linguistics is briefly examined, with the suggestion that the structuralist tradition is the only one to use them extensively. Problems of annotation are raised, together with approaches to it that have proved less successful (statistical) or more successful (rule-based). Some of the most serious shortcomings that computational linguists should address are listed, such as the treatment of multi-word units and the status of language units in general. More generally, the implications and status of paradigmatics and syntagmatics are also discussed, with considerable and critical attention paid to ontologies.