Shallow features as indicators of English–German contrasts in lexical cohesion
This paper contrasts lexical cohesion in spoken and written registers of
English and German, reporting findings from a quantitative lexical analysis.
After an overview of research aims and motivations, we formulate hypotheses on
the distributions of shallow features as indicators of lexical cohesion across
languages and modes, and with respect to register ranking and variation. The
shallow features analysed are: highly frequent words in texts, lexical density,
standardized type-token ratio, the most frequent content words of each language
within individual registers and texts, and several types of Latinate words.
Descriptive analyses of the corpus are then presented and statistically
validated with univariate and multivariate analyses. The results are interpreted
relative to our hypotheses and related to the following properties of texts in
terms of lexical cohesion: semantic variability, cohesive strength, number and
length of nominal chains, degree of specification of lexis, and degree of
variation along all of these properties.
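Two of the shallow features named above, lexical density and standardized type-token ratio, have widely used textbook definitions that can be sketched in a few lines. The sketch below is illustrative only and does not reproduce the article's procedure: the set of content-word tags, the chunk size, and the function names `lexical_density` and `sttr` are assumptions made for this example.

```python
from typing import Sequence, Tuple

# Assumption: content words are identified by these Universal-POS-style tags.
# The tag set actually used in the article may differ.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def lexical_density(tagged: Sequence[Tuple[str, str]]) -> float:
    """Share of content words among all (token, tag) pairs."""
    if not tagged:
        return 0.0
    content = sum(1 for _, tag in tagged if tag in CONTENT_TAGS)
    return content / len(tagged)


def sttr(tokens: Sequence[str], chunk_size: int = 1000) -> float:
    """Mean type-token ratio over consecutive full chunks of chunk_size tokens.

    Assumption: a trailing partial chunk is discarded, and texts shorter than
    one chunk fall back to the plain type-token ratio.
    """
    lowered = [t.lower() for t in tokens]
    n_chunks = len(lowered) // chunk_size
    if n_chunks == 0:
        return len(set(lowered)) / len(lowered) if lowered else 0.0
    ratios = []
    for i in range(n_chunks):
        chunk = lowered[i * chunk_size:(i + 1) * chunk_size]
        ratios.append(len(set(chunk)) / chunk_size)
    return sum(ratios) / n_chunks
```

Averaging over fixed-size chunks makes the ratio comparable across texts of different lengths, which a plain type-token ratio is not, since it falls as texts get longer.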
Article outline
- 1. Aims and Motivation
- 2. Methodology
- 2.1 Hypotheses
- 2.2 Data
- 2.3 Statistical Techniques
- 3. Analyses
- 3.1 Most frequent words
- 3.2 Lexical density
- 3.3 Standardized type-token ratio
- 3.4 Top content words
- 3.5 Latinate words
- 3.6 Overall comparison
- 4. Discussion and conclusion
- 4.1 Language contrast English vs. German
- 4.2 Spoken vs. written mode
- 4.3 Register ranking
- 4.4 Register variation
- Notes
- References