Concordance line sorting in The Prime Machine
Corpus data provide evidence of the patterning of language, and one way word usage can be analysed is through the study of concordance lines. While popular concordancers provide different sorting methods, they are typically only able to display lines in the order in which they occur in the corpus, randomly, or alphabetically by words in slots to the left or right of the word of interest. Less sophisticated users may find recognising patterns from these orderings quite challenging. This paper considers possible needs of language learners in terms of concordance ranking and introduces two methods which have been adopted and developed for The Prime Machine. The first method uses repeated patterns, measuring the number of matches made with other lines in the set. The second method incorporates collocation scores, providing examples with strong collocations from the entire corpus at the top of sampled concordance lines.
Keywords: concordance line ranking, data-driven learning, lexical patterning, collocation
Published online: 07 April 2021
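The first ranking method mentioned in the abstract, counting the matches a line makes with other lines in the set, can be illustrated with a minimal sketch. This is not The Prime Machine's actual implementation. The function name `rank_by_repeated_patterns`, the `window` parameter, the whitespace tokenisation, and the scoring rule (one point for every other line that has the same word in the same slot relative to the node) are all assumptions made for illustration.

```python
from collections import Counter


def rank_by_repeated_patterns(lines, node, window=2):
    """Rank concordance lines by how many slot fillers they share with other lines.

    Hypothetical sketch: a "slot" is a position relative to the node word
    (-1 is the word immediately to the left, +1 immediately to the right).
    """
    rows = []
    for text in lines:
        tokens = text.lower().split()  # naive tokenisation (assumption)
        if node not in tokens:
            continue  # skip lines that do not contain the node word
        i = tokens.index(node)  # first occurrence of the node
        slots = {}
        for offset in range(-window, window + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens):
                slots[offset] = tokens[j]
        rows.append((text, slots))

    # Count how many lines have each (slot, word) pair.
    counts = Counter()
    for _, slots in rows:
        counts.update(slots.items())

    # A line's score is the number of matches with *other* lines,
    # so its own contribution is subtracted from each count.
    scored = [
        (sum(counts[item] - 1 for item in slots.items()), text)
        for text, slots in rows
    ]
    # Python's sort is stable, so tied lines keep their corpus order.
    scored.sort(key=lambda pair: -pair[0])
    return scored


if __name__ == "__main__":
    sample = [
        "she paid close attention to detail",
        "they paid close attention to him",
        "attention was drawn elsewhere",
        "he paid little attention to it",
    ]
    for score, text in rank_by_repeated_patterns(sample, "attention"):
        print(score, text)
```

In this toy sample, lines sharing "paid ... attention to" rise to the top, while the line with no shared slot fillers falls to the bottom. A collocation-based ranking, the second method, would replace the match counts with association scores computed over the whole corpus.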