Article published in: Current Issues in Phraseology
Edited by Sebastian Hoffmann, Bettina Fischer-Starcke and Andrea Sand
[Benjamins Current Topics 74] 2015
pp. 135–164
50-something years of work on collocations
What is or should be next …
This paper explores ways in which research on collocation should be improved. After a discussion of the parameters underlying the notion of collocation, the paper has three main parts. First, I argue that corpus linguistics would benefit from taking more seriously the understudied fact that collocations are not necessarily symmetric, contrary to what most association measures imply. I also introduce an association measure from the associative learning literature that can identify asymmetric collocations, and I show that it also distinguishes well between collocations with high and low association strengths. Second, I summarize some advantages of this measure and brainstorm about ways in which it can help re-examine previous studies as well as support further applications. Finally, I adopt a broader perspective and discuss a variety of ways in which all association measures in corpus linguistics, directional or not, should be improved in order for us to obtain better and more reliable results.
Keywords: association measure, collocation, directionality, dispersion, DP (delta P)
Published online: 10 July 2015
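The directional measure named in the keywords, DP (delta P), can be illustrated with a minimal sketch. It assumes the standard definition from the associative learning literature, P(outcome | cue) minus P(outcome | no cue), applied to a 2x2 co-occurrence table. The function name `delta_p`, the cell labels `a` to `d`, and the counts are hypothetical and are not taken from the paper.

```python
def delta_p(a, b, c, d):
    """Return (DP(word2 | word1), DP(word1 | word2)) for a 2x2 co-occurrence table.

    Cell labels (an assumed layout, not the paper's notation):
      a: word1 occurs with word2
      b: word1 occurs without word2
      c: word2 occurs without word1
      d: neither word occurs
    """
    if min(a + b, c + d, a + c, b + d) == 0:
        raise ValueError("every row and column total must be non-zero")
    # How much does seeing word1 raise the probability of word2?
    dp_word2_given_word1 = a / (a + b) - c / (c + d)
    # How much does seeing word2 raise the probability of word1?
    dp_word1_given_word2 = a / (a + c) - b / (b + d)
    return dp_word2_given_word1, dp_word1_given_word2


# Hypothetical counts: word1 is rare and almost always followed by word2,
# whereas word2 is frequent and mostly occurs without word1.
forward, backward = delta_p(a=80, b=20, c=920, d=98980)
print(f"DP(word2 | word1) = {forward:.3f}")
print(f"DP(word1 | word2) = {backward:.3f}")
```

With counts like these the two directions come apart: word1 predicts word2 much better than the reverse, which is the kind of asymmetry a single bidirectional score cannot express.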