Article published in: Phraseology: An interdisciplinary perspective
Edited by Sylviane Granger and Fanny Meunier
[Not in series 139] 2008
pp. 391–406
23. Combined statistical and grammatical criteria for the retrieval of phraseological units in an electronic corpus
The aim of this study is to refine and optimise the mainly statistical and distributional approach to the automatic extraction of phraseological units (PUs) from text corpora by introducing minimal linguistic elements (lemmatisation and grammatical tagging). These operations were first tested on the same corpora as in our previous research (Pamies & Pazos 2003, 2004). This provided us with a new set of results, which we compared with the previous ones. We found that the detection ability had improved substantially, especially for verb + noun and verb + adjective collocations. This methodology was then applied to a larger corpus. Again, the results were encouraging, with phraseological densities of up to 64.5% for the verb + noun category.
Published online: 01 June 2008