When linguists speak of a corpus today, they usually mean a collection of computer-readable texts. The design of the collection and the nature of the texts may vary considerably from one corpus to another, but the texts, whether spoken or written, must have been produced in an actual context of language use. The utterances that make up the texts are never artificial linguistic objects produced under laboratory conditions for the sole purpose of linguistic research. Two facts largely determine the nature of the linguistic research that corpora are used for: they are computationally accessible, and they are repositories of language use. First, corpus analysis can no longer be carried out without advanced computational tools; second, it is naturally oriented towards the study of language use and is therefore biased towards the study of specific languages, genres and language varieties.
[See also: Statistics]