Chapter published in: Applications of Pattern-driven Methods in Corpus Linguistics
Edited by Joanna Kopaczyk and Jukka Tyrkkö
[Studies in Corpus Linguistics 82] 2018
pp. 107–130
Constance and variability
Using PoS-grams to find phraseologies in the language of newspapers
This paper describes a corpus-driven methodology, the retrieval of part-of-speech-grams (PoS-grams), which is extremely effective for discovering phraseologies that might otherwise remain hidden. A PoS-gram is a string of part-of-speech categories (Stubbs 2007: 91); its tokens are the strings of words that have been annotated with those PoS tags. A list of PoS-grams retrieved from a sample corpus can be compared with the corresponding list from a reference corpus. Items whose frequencies differ to a statistically significant degree are then analysed further to identify recurrent patterns and potential phraseologies. The utility of PoS-grams is illustrated through an analysis of the Sassari Newspaper Article Corpus (SNAC), a one-million-token corpus composed of texts from ten sections of The Guardian.
Keywords: PoS-grams, phraseology, journalism, corpus-driven
Published online: 13 March 2018
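The procedure outlined in the abstract (extract PoS-grams, count them in a sample and a reference corpus, keep the statistically significant ones, then inspect their word-level realisations) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the significance test or the tagset, so the log-likelihood (G2) statistic, the 3.84 cut-off (p < 0.05 at one degree of freedom), the Penn-Treebank-style tags, the function names and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def pos_grams(tagged, n):
    """Yield (PoS-gram, word string) for each window of n tagged tokens."""
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        tags = tuple(tag for _, tag in window)
        words = " ".join(word for word, _ in window)
        yield tags, words


def count_pos_grams(tagged, n):
    """Count each PoS-gram and record the word strings that realise it."""
    counts = Counter()
    realisations = defaultdict(Counter)
    for tags, words in pos_grams(tagged, n):
        counts[tags] += 1
        realisations[tags][words.lower()] += 1
    return counts, realisations


def log_likelihood(a, b, total_a, total_b):
    """G2 for frequency a (out of total_a) versus b (out of total_b)."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / expected_a)
    if b > 0:
        g2 += b * math.log(b / expected_b)
    return 2 * g2


def key_pos_grams(sample, reference, n, threshold=3.84):
    """Return PoS-grams overused in the sample, sorted by G2 (assumed test)."""
    s_counts, s_real = count_pos_grams(sample, n)
    r_counts, _ = count_pos_grams(reference, n)
    total_s, total_r = sum(s_counts.values()), sum(r_counts.values())
    results = []
    for tags, a in s_counts.items():
        b = r_counts.get(tags, 0)
        # Keep only items relatively more frequent in the sample corpus.
        if total_r and a / total_s <= b / total_r:
            continue
        g2 = log_likelihood(a, b, total_s, total_r)
        if g2 >= threshold:
            results.append((tags, a, b, g2, s_real[tags].most_common(3)))
    return sorted(results, key=lambda item: item[3], reverse=True)


# Invented, hand-tagged toy data (hypothetical; real corpora are far larger).
sample = [("the", "DT"), ("end", "NN"), ("of", "IN"), ("the", "DT"), ("day", "NN")] * 5
reference = [("she", "PRP"), ("saw", "VBD"), ("the", "DT"), ("end", "NN"), ("yesterday", "RB")] * 5

for tags, a, b, g2, examples in key_pos_grams(sample, reference, n=3):
    print(" ".join(tags), a, b, round(g2, 2), examples)
```

Each result pairs a PoS-gram with its most frequent word-level realisations in the sample, which is the point at which manual inspection for recurrent patterns would begin.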