Lexical priming predicts that repeated encounters with lexical patterns will prime users for register awareness (Hoey 2013: 3344). To test this prediction, this chapter reports on a study that determined the dimensions of collocation in American English, that is, the parameters underlying the use of collocations in spoken and written text. The method was inspired by the multidimensional framework for register variation analysis introduced by Biber in the 1980s. The corpus was the 450-million-word Corpus of Contemporary American English (COCA, 1990–2012 version). The most characteristic collocations of each register in COCA (spoken [American radio and television programs], magazine, newspaper, academic, and fiction) were computed using the logDice coefficient (Rychly 2008). These were then entered into a factor analysis, which yielded the statistical groupings of collocation across the registers. Nine dimensions were identified and are described in this chapter. The relationship between collocation and register was tested statistically through the dimensions. The results suggested that register could predict the collocations (via the dimensions) between 39% and 67% of the time, which seems to lend support to the hypothesis that users are primed for register, as far as AmE collocations are concerned.
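For readers unfamiliar with the logDice coefficient mentioned above, the following is a minimal sketch of the score as defined by Rychly (2008): 14 plus the base-2 logarithm of the Dice coefficient of the node and collocate frequencies. The function names and the frequency counts are hypothetical and serve only as illustration; they are not taken from COCA, and the abstract does not describe how the study itself implemented the computation.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """logDice (Rychly 2008): 14 + log2(2 * f_xy / (f_x + f_y)).

    f_xy: co-occurrence frequency of node and collocate
    f_x:  frequency of the node word
    f_y:  frequency of the collocate
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("all frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


def rank_collocations(counts):
    """Sort (pair, f_xy, f_x, f_y) tuples by descending logDice."""
    scored = [(pair, log_dice(f_xy, f_x, f_y)) for pair, f_xy, f_x, f_y in counts]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical counts for illustration only (not COCA data).
example_counts = [
    (("strong", "tea"), 25, 100, 100),
    (("powerful", "tea"), 1, 300, 100),
    (("green", "tea"), 50, 100, 100),
]

for pair, score in rank_collocations(example_counts):
    print(pair, round(score, 2))
```

The score has a theoretical maximum of 14, reached when the two words occur only together, and it does not depend on corpus size, which makes scores comparable across sub-corpora of different sizes such as the registers of a larger corpus.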
Article outline
1. Introduction
2. Method
3. Dimensions of collocation in American English
3.1 Dimension 1: Literate discourse
3.2 Dimension 2: Oral discourse
3.3 Dimension 3: Objects, people, and actions
3.4 Dimension 4: Colloquial and informal language use
3.5 Dimension 5: Organizations and the government
3.6 Dimension 6: Politics and current affairs
3.7 Dimension 7: Feelings and emotions
3.8 Dimension 8: Cooking
3.9 Dimension 9: Education research
4. Assigning collocations to register categories based on their MD profile
References
Berber Sardinha, T. 1997. Automatic Identification of Segments in Written Texts. PhD dissertation, University of Liverpool.
Berber Sardinha, T., Kauffmann, C. & Acunzo, C.M. 2014. A multidimensional analysis of register variation in Brazilian Portuguese. Corpora 9(2): 239–271.
Berber Sardinha, T., Mayer Acunzo, C. & São Bento Ferreira, T. In press. Dimensions of collocation in Brazilian Portuguese: Exploring the Brazilian Corpus on Sketch Engine. In Essays in Lexical Semantics in Honor of Adam Kilgarriff, M. Diab & A. Villavicencio (eds). Berlin: Springer.
Berber Sardinha, T., São Bento Ferreira, T. & Teixeira, R. d. B.S. 2014. Lexical bundles in Brazilian Portuguese. In Working with Portuguese Corpora, T. Berber Sardinha & T. São Bento Ferreira (eds), 33–68. London: Bloomsbury.
Berber Sardinha, T. & Veirano Pinto, M. 2016. Predicting American movie genre categories from linguistic characteristics. Journal of Research Design and Statistics in Linguistics and Communication Science 2(1): 75–102.
Berber Sardinha, T. & Veirano Pinto, M. In press. American television and off-screen registers: A corpus-based comparison. Corpora.
Biber, D. 1988. Variation across Speech and Writing. Cambridge: CUP.
Biber, D. 2010. What can a corpus tell us about registers and genres? In The Routledge Handbook of Corpus Linguistics, A. O’Keeffe & M. McCarthy (eds), 241–254. London: Routledge.
Biber, D. 2012. Register as a predictor of linguistic variation. Corpus Linguistics and Linguistic Theory 8(1): 9–37.
Biber, D. & Conrad, S. 1999. Lexical bundles in conversation and academic prose. In Out of Corpora – Studies in Honour of Stig Johansson, H. Hasselgard & S. Oksefjell (eds), 181–190. Amsterdam: Rodopi.
Biber, D., Davies, M., Jones, J.K. & Tracy-Ventura, N. 2006. Spoken and written register variation in Spanish: A multi-dimensional analysis. Corpora 1(1): 1–37.
Herrmann, J.B. & Berber Sardinha, T. (eds). 2015. Metaphor in Specialist Discourse [Metaphor in Language, Cognition, and Communication 4]. Amsterdam: John Benjamins.
Hoey, M. 2005. Lexical Priming: A New Theory of Words and Language. London: Routledge.
Hoey, M. 2013. Lexical priming. In The Encyclopedia of Applied Linguistics, C. Chapelle (ed.), 3342–3347. Hoboken NJ: Wiley.
Hunston, S. 2002. Corpora in Applied Linguistics. Cambridge: CUP.
Lehrer, A. 1974. Semantic Fields and Lexical Structure. Amsterdam: North-Holland.
McEnery, T., Xiao, R. & Tono, Y. 2006. Corpus-based Language Studies: An Advanced Resource Book. London: Routledge.
Menon, S. & Mukundan, J. 2012. Collocations of high frequency noun keywords in prescribed science textbooks. International Education Studies 5(6): 149–162.
Pace-Sigge, M. 2015. The Function and Use of TO and OF in Multi-word Units. Houndmills: Palgrave Macmillan.
Phillips, M. 1989. Lexical Structure of Text. Birmingham: ELR, University of Birmingham.
Rychly, P. 2008. A lexicographer-friendly association score. In Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2008, P. Sojka & A. Horák (eds), 6–9. Brno: Masaryk University.
Schütze, H. 1998. Automatic word sense discrimination. Computational Linguistics 24(1): 97–123.
Scott, M. 2000. Focusing on the text and its key words. In Rethinking Language Pedagogy from a Corpus Perspective, Vol. 2, L. Burnard & A. McEnery (eds), 103–122. Frankfurt: Peter Lang.
Sinclair, J.M. & Jones, S. 1974/1996. English lexical collocations: A study in computational linguistics. In J.M. Sinclair on Lexis and Lexicography, J.A. Foley (ed.), 22–68. Singapore: UniPress.
Sinclair, J.M., Jones, S. & Daley, R. 1970/2004. English Lexical Studies: The OSTI Report, R. Krishnamurthy (ed.). London: Continuum.
Stubbs, M. 2007. Quantitative data on multi-word sequences in English: The case of the word ‘world’. In Text, Discourse and Corpora, M. Hoey, M. Mahlberg, M. Stubbs & W. Teubert (eds), 163–190. London: Continuum.
Trier, J. 1931. Der deutsche Wortschatz im Sinnbezirk des Verstandes; die Geschichte eines sprachlichen Feldes. Heidelberg: C. Winter.
Veirano Pinto, M. 2014. Dimensions of variation in North American movies. In Berber Sardinha & Veirano Pinto (eds), 109–149.
Yablo, S. 2016. Aboutness. Princeton NJ: Princeton University Press.
Cited by eight other publications
Martínez Caro, Elena
2024. Paragraph boundaries and discourse genre: Applying Lexical Priming to Spanish written texts. Círculo de Lingüística Aplicada a la Comunicación 99, pp. 53 ff.
Ortolani, Katherine O.
2024. The Language of Fashion from a Multidimensional Perspective. In Digital Humanities Looking at the World, pp. 75 ff.
Berber Sardinha, Tony
2023. Corpus linguistics and historiography. Journal of Research Design and Statistics in Linguistics and Communication Science 7:1, pp. 69 ff.
Berber Sardinha, Tony
2019. Lexicogrammar. In The Encyclopedia of Applied Linguistics, pp. 1 ff.
Hoey, Michael & Katie Patterson
2021. Lexical Priming. In The Encyclopedia of Applied Linguistics, pp. 1 ff.
2018. Where Corpus Linguistics and Artificial Intelligence (AI) Meet. In Spreading Activation, Lexical Priming and the Semantic Web, pp. 29 ff.
This list is based on CrossRef data as of 23 September 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers.
Any errors therein should be reported to them.