Article published in: Keyness in Texts
Edited by Marina Bondi and Mike Scott
[Studies in Corpus Linguistics 41] 2010
pp. 113–126
Identifying aboutgrams in engineering texts
This paper uses a new computer-mediated methodology, concgramming, to identify the aboutness of a text. Concgrams are the raw products of the concgramming process and consist of up to five co-occurring words irrespective of whether constituency variation (i.e. AB, A*B where * represents an intervening word) and/or positional variation (i.e. AB, BA) is present. Two engineering research articles are concgrammed to identify the most frequently occurring two-word lexical concgrams. The most frequent two-word lexical concgrams for each text are examined to determine whether the words simply co-occur or are meaningfully associated. Once this has been done, a provisional list of “aboutgrams” is drawn up which is tentatively taken to represent the aboutness of each text. These lists are then referred to a specialised corpus of engineering texts and then a general reference corpus. Those aboutgrams on the lists which are consistently more frequent in the text than in the two corpora are then put forward as representing the aboutness of the text. In the study, the lists of aboutgrams are compared with single word frequency lists to evaluate the advantages to be gained from determining aboutness by means of phraseology rather than key words. The conclusion is that aboutgrams are a better means for uncovering the aboutness of the texts.
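The counting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' concgramming software. The span of 4 words, the tiny list of grammatical words, the per-token frequency normalisation, and all function names are assumptions made for illustration only. Pairs are stored as unordered sets so that AB and BA count together (positional variation), and a window is used so that AB and A*B count together (constituency variation). The manual check of whether co-occurring words are meaningfully associated is not automated here.

```python
from collections import Counter
import re

# Hypothetical, deliberately tiny list of grammatical words (an assumption);
# a real study would use a much fuller list to isolate lexical words.
GRAMMATICAL_WORDS = {"the", "a", "an", "of", "in", "is", "and", "to", "on"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def two_word_concgrams(text, span=4):
    """Count unordered pairs of lexical words at most `span` tokens apart.

    Sentence boundaries are ignored in this sketch.
    """
    tokens = tokenize(text)
    counts = Counter()
    for i, origin in enumerate(tokens):
        if origin in GRAMMATICAL_WORDS:
            continue
        # Look only to the right so each co-occurrence is counted once.
        for j in range(i + 1, min(i + span + 1, len(tokens))):
            other = tokens[j]
            if other in GRAMMATICAL_WORDS or other == origin:
                continue
            counts[frozenset((origin, other))] += 1
    return counts


def candidate_aboutgrams(text, reference_texts, span=4, top_n=10):
    """Keep the top pairs that are relatively more frequent in `text`
    than in every reference corpus (per-token rate, an assumed measure)."""
    target = two_word_concgrams(text, span)
    n_target = max(len(tokenize(text)), 1)
    refs = [
        (two_word_concgrams(ref, span), max(len(tokenize(ref)), 1))
        for ref in reference_texts
    ]
    kept = []
    for pair, count in target.most_common(top_n):
        rate = count / n_target
        if all(rate > ref_counts[pair] / n_ref for ref_counts, n_ref in refs):
            kept.append((tuple(sorted(pair)), count))
    return kept
```

In use, `reference_texts` would hold the specialised engineering corpus and the general reference corpus as two strings, and the returned pairs would serve only as a provisional list for manual inspection.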
Published online: 11 November 2010