Automatic acquisition of verb subcategorization information by exploiting minimal linguistic resources
A set of well-known statistical filtering methods (binomial hypothesis testing, log-likelihood ratio, t-test, thresholds on relative frequencies) is applied to Modern Greek and English corpora in order to automatically acquire verb subcategorization frames that are not limited in number and are not known beforehand. As sophisticated linguistic resources and tools are not available for most languages (including Modern Greek), pre-processing of our corpora reaches merely the stage of elementary, intrasentential, non-embedded phrase chunking. By forming, permuting and counting subsets of the set of phrases neighboring each verb, and by applying the statistical filters mentioned above, valid syntactic frames of verbs are detected. The results achieved were comparable to and, in several cases, better than those of previous approaches, even approaches utilizing richer resources. When the extracted list of frames is incorporated into a shallow parser, the parser's performance increases by almost 6%, which shows the importance of the acquired knowledge.
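The pipeline outlined in the abstract (form subsets of the phrases around a verb, count them, then filter statistically) can be illustrated with a minimal sketch. It is not the authors' implementation: the chunk labels (NP, PP, ADVP), the function names, the `max_size` limit, the error probability `p_error`, and the significance level `alpha` are all assumptions chosen for illustration, and only the binomial hypothesis test is shown. Phrase order is ignored here for simplicity, whereas the paper also considers permutations.

```python
from collections import Counter
from itertools import combinations
from math import comb


def candidate_frames(chunks, max_size=2):
    """Return every non-empty subset of the chunk labels, up to max_size.

    Subsets are stored as sorted tuples, so phrase order is ignored
    (an assumption made to keep the sketch small).
    """
    frames = set()
    for size in range(1, max_size + 1):
        for combo in combinations(chunks, size):
            frames.add(tuple(sorted(combo)))
    return frames


def binomial_tail(n, m, p):
    """P(X >= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def filter_frames(observations, p_error=0.05, alpha=0.05):
    """Keep (verb, frame) pairs that co-occur too often to be noise.

    observations: iterable of (verb, list_of_chunk_labels) pairs.
    p_error: assumed probability that a frame appears with a verb by chance.
    alpha: assumed significance level for rejecting the noise hypothesis.
    """
    verb_totals = Counter()
    cooccurrences = Counter()
    for verb, chunks in observations:
        verb_totals[verb] += 1
        for frame in candidate_frames(chunks):
            cooccurrences[(verb, frame)] += 1

    accepted = {}
    for (verb, frame), m in cooccurrences.items():
        n = verb_totals[verb]
        # Accept the frame if seeing it m or more times in n verb
        # occurrences is unlikely under the noise hypothesis.
        if binomial_tail(n, m, p_error) < alpha:
            accepted.setdefault(verb, set()).add(frame)
    return accepted


if __name__ == "__main__":
    # Toy data: "give" appears 20 times with NP and PP, once with ADVP.
    data = [("give", ["NP", "PP"])] * 20 + [("give", ["ADVP"])]
    print(filter_frames(data))
```

With the toy data, the frames built from NP and PP pass the test, while the single ADVP occurrence is treated as noise. In practice both `p_error` and `alpha` would need tuning per corpus, and the other filters named in the abstract (log-likelihood ratio, t-test, relative-frequency thresholds) could replace `binomial_tail` in the same loop.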
Keywords: Modern Greek, subcategorization, hypothesis testing, shallow parsing
Published online: 29 April 2004