Corpora in Translation Studies: An Overview and Some Suggestions for Future Research

Mona Baker
UMIST & Middlesex University

Corpus-based research has become widely accepted as a factor in improving the performance of machine translation systems, and corpus-based terminology compilation is now the norm rather than the exception. Within translation studies proper, Lindquist (1984) has advocated the use of corpora for training translators, and Baker (1993a) has argued that theoretical research into the nature of translation will receive a powerful impetus from corpus-based studies. It is becoming increasingly important to take stock of what is happening on this front and to start working towards the development of an explicit and coherent methodology for corpus-based research in the discipline. This paper discusses the current and potential use of corpora in translation studies, with particular reference to theoretical issues.

The potential for using corpora is beginning to take shape in translation studies. Computerised corpora are becoming increasingly popular in those areas of the discipline which have close links with the hard sciences. This is particularly true of terminology and machine translation, where the emphasis is primarily, if not exclusively, on scientific and technical texts.


Atkins, Sue, Jeremy Clear and Nicholas Ostler
1991 “Corpus Design Criteria”. Paper presented at the Workshop on European Textual Corpora, Pisa, 7–10 January 1991.
Baker, Mona
1992 In Other Words: A Coursebook on Translation. London and New York: Routledge.
1993a “Corpus Linguistics and Translation Studies: Implications and Applications”. Baker et al. 1993: 233–250.
1993b Multilingual Databases. Birmingham: University of Birmingham. [Report submitted to the European Commission as a contribution to a European enquiry into corpus work.]
Baker, Mona, Gill Francis and Elena Tognini-Bonelli
eds. 1993 Text and Technology: In Honour of John Sinclair. Amsterdam/Philadelphia: John Benjamins.
Bernardo, Aldo S.
1981 “Maximizing Computer Assistance in Literary Translation: Petrarch’s Familiares”. Marilyn Gaddis Rose, ed. Translation Spectrum: Essays in Theory and Practice. State University of New York Press, 1981. 74–80.
Blum-Kulka, Shoshana and Eddie A. Levenston
1983 “Universals of Lexical Simplification”. Claus Faerch and Gabriele Kasper, eds. Strategies in IL Communication. Longman, 1983. 119–139.
British National Corpus: Written Corpus Design Specification
1991 OUP Promotional Document Dated 2 September.
Catford, J.C.
1965 A Linguistic Theory of Translation: An Essay in Applied Linguistics. Oxford University Press.
Church, Kenneth and William Gale
1991 “Concordances for Parallel Text”. Paper presented at the Seventh Annual Conference of the UW Centre for the New OED and Text Research. St. Catherine’s College, Oxford.
Gale, William and Kenneth Church
1991 “Identifying Word Correspondences in Parallel Texts”. Darpa SLS Workshop.
Hartmann, R.R.K.
1980 Contrastive Textology: Comparative Discourse Analysis in Applied Linguistics. Heidelberg: Julius Groos.
Headland, Thomas
1981 “Information Rate, Information Overload, and Communication Problems in the Casiguran Dumagat New Testament”. Notes on Translation 83. 18–27.
Hofland, K. and S. Johansson
1982 Word Frequencies in British and American English. Bergen: The Norwegian Computing Centre for the Humanities.
Johansson, Stig and Knut Hofland
1993 “Towards an English-Norwegian Parallel Corpus”. Udo Fries, Gunnel Tottie and Peter Schneider, eds. Creating and Using English Language Corpora: Papers from the Fourteenth International Conference on English Language Research on Computerized Corpora. Zurich, 1993. 25–37.
Krishnamurthy, Ramesh
1992 “Basic Access Software: Word Lists”. Birmingham: Cobuild. [Report submitted to the European Commission as a contribution to NERC workpackage 5: Access and Management Software Tools.]
Laffling, John
1991 Towards High-Precision Machine Translation: Based on Contrastive Textology. Berlin-New York: Foris Publications.
1992 “On Constructing a Transfer Dictionary for Man and Machine”. Target 4:1. 17–31.
Larson, Mildred
1984 Meaning-Based Translation: A Guide to Cross-Language Equivalence. Lanham, New York and London: University Press of America.
Leech, Geoffrey
1991 “Corpora”. Kirsten Malmkjær, ed. The Linguistics Encyclopedia. London and New York: Routledge, 1991. 73–80.
Lindquist, Hans
1984 “The Use of Corpus-Based Studies in the Preparation of Handbooks for Translators”. Wolfram Wilss and Gisela Thome, eds. Translation Theory and Its Implementation in the Teaching of Translating and Interpreting. Tübingen: Narr, 1984. 260–270.
Malmkjær, Kirsten
1993 “Who Can Make Nice a Better Word than Pretty?: Collocation, Translation, and Psycholinguistics”. Baker et al. 1993: 213–232.
Marinai, E., C. Peters and E. Picchi
1991 “Bilingual Reference Corpora: A System for Parallel Text Retrieval”. Paper presented at the Seventh Annual Conference of the UW Centre for the New OED and Text Research, St. Catherine’s College, Oxford.
Newton, John
ed. 1992 Computers in Translation: A Practical Appraisal. London and New York: Routledge.
Rettig, Heike
1993 Evaluative Report on the Corpus Survey. Institut für Deutsche Sprache: Mannheim NERC Working Paper 17. [Submitted to the European Commission as a contribution to a European enquiry into corpus work.]
Sager, Juan
1990 A Practical Course in Terminology Processing. Amsterdam/Philadelphia: John Benjamins.
Schubert, Klaus
1992 “Esperanto as an Intermediate Language for Machine Translation”. Newton 1992: 78–95.
Shamaa, Najah
1978 A Linguistic Analysis of Some Problems of Arabic to English Translation. Oxford University. [Ph.D. Thesis.]
Sinclair, John McHardy
1991a Corpus, Concordance, Collocation. Oxford: Oxford University Press.
1991b Council of Europe Multilingual Lexicography Project. [Report submitted to the Council of Europe under contract no. 57/89.]
Stubbs, Michael
1986 “Lexical Density: A Computational Technique”. Talking About Text. Discourse Analysis Monograph 13. University of Birmingham: English Language Research, 1986. 27–42.
1993 “British Traditions in Text Analysis: From Firth to Sinclair”. Baker et al. 1993: 1–33.
Toury, Gideon
1978 “The Nature and Role of Norms in Literary Translation”. James S Holmes, José Lambert and Raymond van den Broeck, eds. Literature and Translation: New Perspectives in Literary Studies. Leuven: ACCO, 1978. 83–100. [A revised version in: Gideon Toury. Descriptive Translation Studies and beyond. Amsterdam/Philadelphia: John Benjamins, 1995. 53–69.]
Vermeer, Hans J.
1987 “What Does It Mean to Translate?”. Gideon Toury, ed. Translation Across Cultures. New Delhi: Bahri Publications, 1987. 25–33.