Shallow features as indicators of English–German contrasts in lexical cohesion
This paper contrasts lexical cohesion between English and German spoken and
written registers, reporting findings from a quantitative lexical analysis.
After an overview of research aims and motivations, we formulate hypotheses on
the distribution of shallow features as indicators of lexical cohesion across
languages and modes, and with respect to register ranking and variation. The
shallow features analysed are: highly frequent words in texts, lexical density,
standardized type-token ratio, the most frequent content words of each language within
individual registers and texts, and several types of Latinate words. Descriptive
analyses of the corpus are then presented and statistically validated by means of
univariate and multivariate analyses. The results are interpreted
relative to our hypotheses and related to the following properties of texts in
terms of lexical cohesion: semantic variability, cohesive strength, number and
length of nominal chains, degree of specification of lexis, and degree of
variation along all of these properties.
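Two of the shallow features named above, lexical density and the standardized type-token ratio, can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: content words are taken to be tokens tagged NOUN, VERB, ADJ or ADV, the standardized ratio is computed as the mean type-token ratio over consecutive non-overlapping windows, and the window size of 1000 tokens is a hypothetical default. The function names are illustrative and do not reflect the authors' actual procedure.

```python
from typing import List, Tuple

# Assumption: these coarse POS tags count as content words.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def lexical_density(tagged: List[Tuple[str, str]]) -> float:
    """Share of content-word tokens among all (token, tag) pairs."""
    if not tagged:
        return 0.0
    content = sum(1 for _, tag in tagged if tag in CONTENT_TAGS)
    return content / len(tagged)


def standardized_ttr(tokens: List[str], window: int = 1000) -> float:
    """Mean type-token ratio over consecutive non-overlapping windows.

    A trailing partial window is dropped. Texts shorter than one
    window fall back to the plain type-token ratio.
    """
    tokens = [t.lower() for t in tokens]
    if not tokens:
        return 0.0
    n_full = len(tokens) // window
    if n_full == 0:
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[i * window:(i + 1) * window])) / window
        for i in range(n_full)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    sample = [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
              ("on", "ADP"), ("the", "DET"), ("mat", "NOUN")]
    print(lexical_density(sample))
    print(standardized_ttr([tok for tok, _ in sample], window=3))
```

Averaging over fixed-size windows makes the ratio comparable across texts of different lengths, which a plain type-token ratio is not, since it falls as texts grow longer.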
Article outline
- 1. Aims and Motivation
- 2. Methodology
- 2.1 Hypotheses
- 2.2 Data
- 2.3 Statistical Techniques
- 3. Analyses
- 3.1 Most frequent words
- 3.2 Lexical density
- 3.3 Standardized type-token ratio
- 3.4 Top content words
- 3.5 Latinate words
- 3.6 Overall comparison
- 4. Discussion and conclusion
- 4.1 Language contrast English vs. German
- 4.2 Spoken vs. written mode
- 4.3 Register ranking
- 4.4 Register variation
- Notes
- References