Shallow features as indicators of English–German contrasts in lexical cohesion
Kerstin Kunz | Heidelberg University (Germany)
Ekaterina Lapshinova-Koltunski | Saarland University (Germany)
José Manuel Martínez Martínez | Saarland University (Germany)
Katrin Menzel | Saarland University (Germany)
Erich Steiner | Saarland University (Germany)
This paper contrasts lexical cohesion between English and German spoken and
written registers, reporting findings from a quantitative lexical analysis.
After an overview of research aims and motivations, we formulate hypotheses on
distributions of shallow features as indicators of lexical cohesion across
languages and modes and with respect to register ranking and variation. The
shallow features analysed are: highly frequent words in texts, lexical density,
standardized type-token ratio, the most frequent content words of the language within
individual registers and texts, and several types of Latinate words. Descriptive
analyses of the corpus are then presented and statistically validated with the
help of univariate and multivariate analyses. The results are interpreted
relative to our hypotheses and related to the following properties of texts in
terms of lexical cohesion: semantic variability, cohesive strength, number and
length of nominal chains, degree of specification of lexis, and degree of
variation along all of these properties.
Keywords: lexical cohesion, shallow features, written vs. spoken mode, language contrast, English / German
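Two of the shallow features named in the abstract, lexical density and standardized type-token ratio, can be illustrated with a minimal Python sketch. The definitions used here are common working definitions and are assumptions, not necessarily the ones applied in the paper: lexical density is taken as the share of content-word tokens among all tokens, and the standardized type-token ratio as the mean type-token ratio over consecutive chunks of equal size. The tag set `CONTENT_TAGS`, the function names and the default chunk size of 1000 tokens are likewise illustrative choices.

```python
from typing import List, Set, Tuple

# Assumption: content words are nouns, verbs, adjectives and adverbs,
# identified here by coarse part-of-speech tags.
CONTENT_TAGS: Set[str] = {"NOUN", "VERB", "ADJ", "ADV"}


def lexical_density(tagged: List[Tuple[str, str]]) -> float:
    """Share of content-word tokens among all (token, tag) pairs."""
    if not tagged:
        return 0.0
    content = sum(1 for _, tag in tagged if tag in CONTENT_TAGS)
    return content / len(tagged)


def sttr(tokens: List[str], chunk_size: int = 1000) -> float:
    """Mean type-token ratio over consecutive full chunks of chunk_size tokens.

    A trailing partial chunk is ignored. If the text is shorter than one
    chunk, the plain type-token ratio of the whole text is returned.
    """
    n_chunks = len(tokens) // chunk_size
    if n_chunks == 0:
        return len(set(tokens)) / len(tokens) if tokens else 0.0
    ratios = []
    for i in range(n_chunks):
        chunk = tokens[i * chunk_size:(i + 1) * chunk_size]
        ratios.append(len(set(chunk)) / chunk_size)
    return sum(ratios) / n_chunks


if __name__ == "__main__":
    tagged = [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB"), ("soundly", "ADV")]
    print(lexical_density(tagged))
    print(sttr(["a", "b", "a", "b", "c", "c", "c", "c"], chunk_size=4))
```

Averaging over fixed-size chunks makes the ratio comparable across texts of different lengths, which a plain type-token ratio is not. Whether tokens are lower-cased or lemmatized before counting is a further design choice that this sketch leaves to the caller.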
Article outline
- 1. Aims and Motivation
- 2. Methodology
- 2.1 Hypotheses
- 2.2 Data
- 2.3 Statistical Techniques
- 3. Analyses
- 3.1 Most frequent words
- 3.2 Lexical density
- 3.3 Standardized type-token ratio
- 3.4 Top content words
- 3.5 Latinate words
- 3.6 Overall comparison
- 4. Discussion and conclusion
- 4.1 Language contrast English vs. German
- 4.2 Spoken vs. written mode
- 4.3 Register ranking
- 4.4 Register variation
- Notes
- References
Published online: 28 November 2017
https://doi.org/10.1075/lic.16005.kun