Shallow features as indicators of English–German contrasts in lexical cohesion
This paper contrasts lexical cohesion in English and German across spoken and written registers, reporting findings from a quantitative lexical analysis. After an overview of research aims and motivations, we formulate hypotheses on the distribution of shallow features as indicators of lexical cohesion across languages and modes, and with respect to register ranking and variation. The shallow features analysed are highly frequent words in texts, lexical density, standardized type-token ratio, the most frequent content words of the language within individual registers and texts, and several types of Latinate words. Descriptive analyses of the corpus are then presented and statistically validated using univariate and multivariate analyses. The results are interpreted relative to our hypotheses and related to the following properties of texts in terms of lexical cohesion: semantic variability, cohesive strength, number and length of nominal chains, degree of specification of lexis, and degree of variation along all of these properties.
Keywords: lexical cohesion, shallow features, written vs. spoken mode, language contrast, English / German
Published online: 28 November 2017
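Two of the shallow features named in the abstract, lexical density and the standardized type-token ratio, have common textbook operationalizations that can be sketched in a few lines. The sketch below is illustrative only and is not taken from the paper: it assumes lexical density is the share of content-word tokens among all tokens, that content words are those tagged NOUN, VERB, ADJ or ADV in a universal part-of-speech tagset, and that the standardized type-token ratio is the mean type-token ratio over consecutive, non-overlapping windows of fixed size. The function names, the tag set and the default window size of 1000 tokens are hypothetical choices.

```python
from typing import List, Tuple

# Assumption: these universal POS tags count as content words.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def lexical_density(tagged: List[Tuple[str, str]]) -> float:
    """Share of content-word tokens among all (token, tag) pairs."""
    if not tagged:
        return 0.0
    content = sum(1 for _, tag in tagged if tag in CONTENT_TAGS)
    return content / len(tagged)


def sttr(tokens: List[str], window: int = 1000) -> float:
    """Mean type-token ratio over consecutive full windows.

    Tokens left over after the last full window are ignored,
    and types are compared case-insensitively (both assumptions).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    # Start indices of all full, non-overlapping windows.
    starts = range(0, len(tokens) - window + 1, window)
    ratios = [
        len({t.lower() for t in tokens[i:i + window]}) / window
        for i in starts
    ]
    if not ratios:
        raise ValueError("text is shorter than one window")
    return sum(ratios) / len(ratios)
```

Averaging over fixed-size windows makes the ratio comparable across texts of different lengths, which a raw type-token ratio is not, since the raw ratio tends to fall as texts get longer.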