Multilingual Corpora and Multilingual Corpus Analysis

Editors
Thomas Schmidt | University of Hamburg
Kai Wörner | University of Hamburg
Hardbound: Available
ISBN 9789027219343 | EUR 75.00 | USD 113.00
 
e-Book
ISBN 9789027273444 | EUR 75.00 | USD 113.00
 
This volume deals with different aspects of the creation and use of multilingual corpora. The term 'multilingual corpus' is understood in a comprehensive sense, meaning any systematic collection of empirical language data enabling linguists to carry out analyses of multilingual individuals, multilingual societies or multilingual communication. The individual contributions are thus concerned with a variety of spoken and written corpora ranging from learner and attrition corpora, language contact corpora and interpreting corpora to comparable and parallel corpora. The overarching aim of the volume is first to take stock of the variety of existing multilingual corpora, documenting possible corpus designs and uses, second to discuss methodological and technological challenges in the creation and analysis of multilingual corpora, and third to provide examples of linguistic analyses that were carried out on the basis of multilingual corpora.
[Hamburg Studies on Multilingualism, 14] 2012. xiii, 407 pp.
Publishing status: Available
Table of Contents
Introduction
Thomas Schmidt and Kai Wörner
xi–xiii
Section 1. Learner and attrition corpora
The LeaP corpus: A multilingual corpus of spoken learner German and learner English
Ulrike Gut
3–23
Technological and methodological challenges in creating, annotating and sharing a learner corpus of spoken German
Hanna Hedeland and Thomas Schmidt
25–46
Creation and analysis of a reading comprehension exercise corpus: Towards evaluating meaning in context
Niels Ott, Ramon Ziai and Detmar Meurers
47–69
The ALeSKo learner corpus: Design – annotation – quantitative analyses
Heike Zinsmeister and Margit Breckle
71–96
Corpora of spoken Spanish by simultaneous and successive German-Spanish bilingual and Spanish monolingual children
Marta Saceda Ulloa, Conxita Lleó and Izarbe Garcia Sanchez
97–106
Monolingual and bilingual phonoprosodic corpora of child German and child Spanish
Conxita Lleó
107–122
Pragmatic corpus analysis, exemplified by Turkish-German bilingual and monolingual data
Annette Herkenrath and Jochen Rehbein
123–152
Corpus of Polish spoken in Germany: Collecting and analysing written & spoken data for investigating contact-induced change
Agnieszka Czachór
153–161
The HABLA-corpus (German-French and German-Italian)
Tanja Kupisch, Dagmar Barton, Giulia Bianchi and Ilse Stangen
163–179
Section 2. Language contact corpora
The Hamburg Corpus of Argentinean Spanish (HaCASpa)
Christoph Gabriel
183–197
Ad hoc contact phenomena or established features of a contact variety?: Evidence from corpus analysis
Karoline Kühl
199–214
Phonoprosodic corpus of spoken Catalan (PhonCAT)
Ariadna Benet, Susana Cortés and Conxita Lleó
215–229
Researching the intelligibility of a (German) dialect
Magdalena Putz
231–243
Annotating ambiguity: Insights from a corpus-based study on syntactic change in Old Swedish
Steffen Höder
245–271
Section 3. Interpreting corpora
Sharing community interpreting corpora: A pilot study
Philipp Angermeyer, Bernd Meyer and Thomas Schmidt
275–294
CoSi – A Corpus of Consecutive and Simultaneous Interpreting
Juliane House, Bernd Meyer and Thomas Schmidt
295–304
The corpus “Interpreting in Hospitals”: Possible applications for research and communication training
Kristin Bührig, Ortrun Kliche, Bernd Meyer and Birte Pawlack
305–315
Section 4. Comparable and parallel corpora
The GeWiss corpus: Comparing spoken academic German, English and Polish
Christian Fandrych, Cordula Meißner and Adriana Slavcheva
319–337
Korpus C4: A distributed corpus of German varieties
Henrik Dittmann, Matej Ďurčo, Alexander Geyken, Tobias Roth and Kai Zimmer
339–346
Treebanks in translation studies: The CroCo Dependency Treebank
Oliver Čulo and Silvia Hansen-Schirra
347–361
Section 5. Corpus tools
Multilingual phonological corpus analysis: The tools behind the PhonBank Project
Yvan Rose
365–381
Finding the balance between strict defaults and total openness: Collecting and managing metadata for spoken language corpora with the EXMARaLDA Corpus Manager
Kai Wörner
383–400
General index
401–404
Corpora index
405–406
Language index
407
“This collection of multilingual corpora studies, above all, appeals to a wide readership interested in multilingualism and corpus linguistics. In addition, anyone who is to some extent interested in languages or linguistic studies may find the book useful, as it covers a wide range of areas related to linguistics such as contact situation, interpretation and translation studies and language learning process in terms of various language levels and sub-levels (e.g. spoken and written modes, pronunciation, written essays, etc.). The volume differs from related collections, which focus on only one aspect of bilingual corpora on certain languages (e.g. Johansson, 2007, which focuses on the English-Norwegian Parallel Corpus and the Oslo Multilingual Corpus), or just one level and sub-level of language (e.g. Teubert, 2007, which deals with bilingual and multilingual lexicography and annotation issues). Thus, this volume fills a gap in the literature of multilingualism and corpus linguistics. Another important aspect of the volume is that it includes studies on both small and large corpora and studies that deal with both the creation and analysis of multilingual corpora. The editors’ objectives of (i) introducing the audience to a large number of available multilingual corpora, (ii) raising issues frequently encountered in the methodological and technological aspects of corpus creation, and (iii) presenting a selection of linguistic analyses drawn from multilingual corpora clearly appear to have been achieved.”
Cited by other publications

Domke, Christine & Christina Gansel
2014. Zur Einführung: Korpora in der Linguistik - Perspektiven und Positionen zu Daten und Datenerhebung. Mitteilungen des Deutschen Germanistenverbandes 61:1 pp. 1 ff.
Kupisch, Tanja & Jason Rothman
2018. Terminology matters! Why difference is not incompleteness and how early child bilinguals are heritage speakers. International Journal of Bilingualism 22:5 pp. 564 ff.
Labrador, Belen
2016. Translation as an aid to ELT: Using an English–Spanish parallel corpus (P-ACTRES) to study English 'both' and its Spanish counterparts. Digital Scholarship in the Humanities 31:3 pp. 499 ff.
Lázaro Gutiérrez, Raquel & María del Mar Sánchez Ramos
2015. In Yearbook of Corpus Linguistics and Pragmatics 2015 [Yearbook of Corpus Linguistics and Pragmatics, 3], pp. 275 ff.
Trouvain, Jürgen, Frank Zimmerer, Bernd Möbius, Mária Gósy & Anne Bonneau
2017. Segmental, prosodic and fluency features in phonetic learner corpora. International Journal of Learner Corpus Research 3:2 pp. 105 ff.
Vessey, Rachelle
2018. Arja Nurmi, Tanja Rütten, and Päivi Pahta (eds): CHALLENGING THE MYTH OF MONOLINGUAL CORPORA. Applied Linguistics

This list is based on CrossRef data as of 27 September 2019. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.

Subjects
BIC Subject: CFDM – Bilingualism & multilingualism
BISAC Subject: LAN009000 – LANGUAGE ARTS & DISCIPLINES / Linguistics / General
U.S. Library of Congress Control Number: 2012021737