Concordance line sorting in The Prime Machine

Jeaco, Stephen

doi:10.1075/ijcl.18056.jea

Article published in:

International Journal of Corpus Linguistics
Vol. 26:2 (2021), pp. 284–297

Stephen Jeaco | Xi’an Jiaotong-Liverpool University

Corpus data provide evidence of the patterning of language, and one way word usage can be analysed is through the study of concordance lines. While popular concordancers provide different sorting methods, they are typically only able to display lines in the order in which they occur in the corpus, randomly, or alphabetically by words in slots to the left or right of the word of interest. Less sophisticated users may find recognising patterns from these orderings quite challenging. This paper considers possible needs of language learners in terms of concordance ranking and introduces two methods which have been adopted and developed for The Prime Machine. The first method uses repeated patterns, measuring the number of matches made with other lines in the set. The second method incorporates collocation scores, providing examples with strong collocations from the entire corpus at the top of sampled concordance lines.
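The two ranking ideas summarised above can be illustrated with a minimal sketch. This is not the implementation used in The Prime Machine: the function names, the whitespace tokenisation, the three-word window, and the precomputed `colloc_scores` dictionary are all assumptions made for illustration only.

```python
from collections import Counter


def context_words(line, node, window=3):
    """Return the set of words within `window` slots of the node word.

    Hypothetical helper: tokenisation is a plain lowercase whitespace split.
    """
    tokens = line.lower().split()
    if node not in tokens:
        return set()
    i = tokens.index(node)  # first occurrence of the node word
    return set(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])


def rank_by_repeats(lines, node, window=3):
    """Rank lines by how many context-word matches they share with other lines."""
    contexts = [context_words(line, node, window) for line in lines]
    # Number of lines in which each context word appears.
    line_freq = Counter(word for ctx in contexts for word in ctx)
    # For each context word, count the OTHER lines that also contain it.
    scores = [sum(line_freq[word] - 1 for word in ctx) for ctx in contexts]
    order = sorted(range(len(lines)), key=lambda k: -scores[k])  # stable sort
    return [(lines[k], scores[k]) for k in order]


def rank_by_collocation(lines, node, colloc_scores, window=3):
    """Rank lines by their strongest collocate near the node word.

    `colloc_scores` is an assumed mapping from collocate to a score
    computed beforehand over the whole corpus (for example, an
    association measure); how it is computed is outside this sketch.
    """
    scored = []
    for line in lines:
        ctx = context_words(line, node, window)
        best = max((colloc_scores.get(word, 0.0) for word in ctx), default=0.0)
        scored.append((line, best))
    return sorted(scored, key=lambda pair: -pair[1])  # stable sort


if __name__ == "__main__":
    sample = [
        "she made a strong cup of tea",
        "he drank strong black tea",
        "the tea was cold",
        "a strong cup of tea helps",
    ]
    for line, score in rank_by_repeats(sample, "tea"):
        print(score, line)
    for line, score in rank_by_collocation(sample, "tea", {"strong": 5.0, "cold": 2.0}):
        print(score, line)
```

In this sketch, lines that share more context words with the rest of the set rise to the top under the first function, while the second promotes lines containing a collocate that scores highly in the supplied dictionary.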

Keywords: concordance line ranking, data-driven learning, lexical patterning, collocation

Article outline

1. Introduction
2. Concordance line sorting
- 2.1 Ranking concordance lines using links across texts
- 2.2 Ranking concordance lines using collocations
- 2.3 Combining scores for links across texts and collocations
3. Conclusion
Note
References

Published online: 7 April 2021
