Multilingual Corpora and Multilingual Corpus Analysis

Thomas Schmidt | University of Hamburg
Kai Wörner | University of Hamburg
ISBN 9789027219343 | EUR 75.00 | USD 113.00
ISBN 9789027273444 | EUR 75.00 | USD 113.00
This volume deals with different aspects of the creation and use of multilingual corpora. The term 'multilingual corpus' is understood in a comprehensive sense, meaning any systematic collection of empirical language data enabling linguists to carry out analyses of multilingual individuals, multilingual societies or multilingual communication. The individual contributions are thus concerned with a variety of spoken and written corpora ranging from learner and attrition corpora, language contact corpora and interpreting corpora to comparable and parallel corpora. The overarching aim of the volume is first to take stock of the variety of existing multilingual corpora, documenting possible corpus designs and uses, second to discuss methodological and technological challenges in the creation and analysis of multilingual corpora, and third to provide examples of linguistic analyses that were carried out on the basis of multilingual corpora.
[Hamburg Studies on Multilingualism, 14] 2012. xiii, 407 pp.
Publishing status: Available
Table of Contents
Thomas Schmidt and Kai Wörner
Section 1. Learner and attrition corpora
The LeaP corpus: A multilingual corpus of spoken learner German and learner English
Ulrike Gut
Technological and methodological challenges in creating, annotating and sharing a learner corpus of spoken German
Hanna Hedeland and Thomas Schmidt
Creation and analysis of a reading comprehension exercise corpus: Towards evaluating meaning in context
Niels Ott, Ramon Ziai and Detmar Meurers
The ALeSKo learner corpus: Design – annotation – quantitative analyses
Heike Zinsmeister and Margit Breckle
Corpora of spoken Spanish by simultaneous and successive German-Spanish bilingual and Spanish monolingual children
Marta Saceda Ulloa, Conxita Lleó and Izarbe Garcia Sanchez
Monolingual and bilingual phonoprosodic corpora of child German and child Spanish
Conxita Lleó
Pragmatic corpus analysis, exemplified by Turkish-German bilingual and monolingual data
Annette Herkenrath and Jochen Rehbein
Corpus of Polish spoken in Germany: Collecting and analysing written & spoken data for investigating contact-induced change
Agnieszka Czachór
The HABLA-corpus (German-French and German-Italian)
Tanja Kupisch, Dagmar Barton, Giulia Bianchi and Ilse Stangen
Section 2. Language contact corpora
The Hamburg Corpus of Argentinean Spanish (HaCASpa)
Christoph Gabriel
Ad hoc contact phenomena or established features of a contact variety? Evidence from corpus analysis
Karoline Kühl
Phonoprosodic corpus of spoken Catalan (PhonCAT)
Ariadna Benet, Susana Cortés and Conxita Lleó
Researching the intelligibility of a (German) dialect
Magdalena Putz
Annotating ambiguity: Insights from a corpus-based study on syntactic change in Old Swedish
Steffen Höder
Section 3. Interpreting corpora
Sharing community interpreting corpora: A pilot study
Philipp Angermeyer, Bernd Meyer and Thomas Schmidt
CoSi – A Corpus of Consecutive and Simultaneous Interpreting
Juliane House, Bernd Meyer and Thomas Schmidt
The corpus “Interpreting in Hospitals”: Possible applications for research and communication training
Kristin Bührig, Ortrun Kliche, Bernd Meyer and Birte Pawlack
Section 4. Comparable and parallel corpora
The GeWiss corpus: Comparing spoken academic German, English and Polish
Christian Fandrych, Cordula Meißner and Adriana Slavcheva
Korpus C4: A distributed corpus of German varieties
Henrik Dittmann, Matej Ďurčo, Alexander Geyken, Tobias Roth and Kai Zimmer
Treebanks in translation studies: The CroCo Dependency Treebank
Oliver Čulo and Silvia Hansen-Schirra
Section 5. Corpus tools
Multilingual phonological corpus analysis: The tools behind the PhonBank Project
Yvan Rose
Finding the balance between strict defaults and total openness: Collecting and managing metadata for spoken language corpora with the EXMARaLDA Corpus Manager
Kai Wörner
General index
Corpora index
Language index
“This collection of multilingual corpora studies, above all, appeals to a wide readership interested in multilingualism and corpus linguistics. In addition, anyone who is to some extent interested in languages or linguistic studies may find the book useful, as it covers a wide range of areas related to linguistics such as contact situation, interpretation and translation studies and language learning process in terms of various language levels and sub-levels (e.g. spoken and written modes, pronunciation, written essays, etc.). The volume differs from related collections, which focus on only one aspect of bilingual corpora on certain languages (e.g. Johansson, 2007, which focuses on the English-Norwegian Parallel Corpus and the Oslo Multilingual Corpus), or just one level and sub-level of language (e.g. Teubert, 2007, which deals with bilingual and multilingual lexicography and annotation issues). Thus, this volume fills a gap in the literature of multilingualism and corpus linguistics. Another important aspect of the volume is that it includes studies on both small and large corpora and studies that deal with both the creation and analysis of multilingual corpora. The editors’ objectives of (i) introducing the audience to a large number of available multilingual corpora, (ii) raising issues frequently encountered in the methodological and technological aspects of corpus creation, and (iii) presenting a selection of linguistic analyses drawn from multilingual corpora clearly appear to have been achieved.”
Cited by

Cited by 9 other publications

No author info given
2020. In World Englishes on the Web [Varieties of English Around the World, G63].
Domke, Christine & Christina Gansel
2014. Zur Einführung: Korpora in der Linguistik - Perspektiven und Positionen zu Daten und Datenerhebung. Mitteilungen des Deutschen Germanistenverbandes 61:1 pp. 1 ff.
Hantgan-Sonko, Abbie
2017. Crossroads Corpus creation: Design and case study. Yearbook of the Poznan Linguistic Meeting 3:1 pp. 1 ff.
Herry-Bénit, Nadine, Stéphanie Lopez, Takeki Kamiyama & Jeff Tennant
2021. The interphonology of contemporary English corpus (IPCE-IPAC). International Journal of Learner Corpus Research 7:2 pp. 275 ff.
Kupisch, Tanja & Jason Rothman
2018. Terminology matters! Why difference is not incompleteness and how early child bilinguals are heritage speakers. International Journal of Bilingualism 22:5 pp. 564 ff.
Labrador, Belen
2016. Translation as an aid to ELT: Using an English–Spanish parallel corpus (P-ACTRES) to study English both and its Spanish counterparts. Digital Scholarship in the Humanities 31:3 pp. 499 ff.
Lázaro Gutiérrez, Raquel & María del Mar Sánchez Ramos
2015. In Yearbook of Corpus Linguistics and Pragmatics 2015 [Yearbook of Corpus Linguistics and Pragmatics, 3], pp. 275 ff.
Trouvain, Jürgen, Frank Zimmerer, Bernd Möbius, Mária Gósy & Anne Bonneau
2017. Segmental, prosodic and fluency features in phonetic learner corpora. International Journal of Learner Corpus Research 3:2 pp. 105 ff.
Vessey, Rachelle
2019. Arja Nurmi, Tanja Rütten, and Päivi Pahta (eds): Challenging the Myth of Monolingual Corpora. Applied Linguistics 40:5 pp. 864 ff.

This list is based on CrossRef data as of 17 January 2022. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.

Subjects & Metadata
BIC Subject: CFDM – Bilingualism & multilingualism
BISAC Subject: LAN009000 – LANGUAGE ARTS & DISCIPLINES / Linguistics / General
U.S. Library of Congress Control Number: 2012021737