Article published in: Japanese Term Extraction
Kyo Kageura and Teruo Koyama
[Terminology 6:2] 2000
pp. 211–232
Extracting terms by a combination of term frequency and a measure of term representativeness
This article describes a method for extracting terms that combines term frequency with a novel measure of term representativeness (i.e., informativeness or domain specificity). The measure is defined as the normalized distance between the word distribution in the documents which contain the term and the word distribution in the whole corpus. The measure is particularly effective in discarding uninformative terms that frequently appear and has a well-defined threshold value for judging the representativeness of a term. We combined the new measure with term frequency and applied it to the extraction of terms from abstracts of artificial intelligence papers. This article introduces the measure and reports on its effectiveness in term extraction.
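The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which distance, which normalization, or which rule for combining with frequency the article uses, so everything below is an assumption made for illustration: Jensen-Shannon divergence in bits (bounded between 0 and 1) stands in for the normalized distance, and the hypothetical helper `extract_terms` keeps candidates whose score reaches a threshold and ranks them by raw frequency. The toy corpus and all function names are invented.

```python
import math
from collections import Counter

def word_distribution(docs):
    """Relative word frequencies over a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; the value lies in [0, 1].

    ASSUMPTION: a stand-in for the article's normalized distance,
    which the abstract does not specify.
    """
    def kl_to_mixture(a):
        # KL divergence from distribution `a` to the 50/50 mixture of p and q.
        return sum(
            pa * math.log2(pa / (0.5 * p.get(w, 0.0) + 0.5 * q.get(w, 0.0)))
            for w, pa in a.items()
            if pa > 0
        )
    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)

def representativeness(term, corpus):
    """Distance between words in documents containing `term` and the whole corpus."""
    docs_with_term = [doc for doc in corpus if term in doc]
    if not docs_with_term:
        return 0.0
    return js_divergence(word_distribution(docs_with_term),
                         word_distribution(corpus))

def term_frequency(term, corpus):
    """Total number of occurrences of `term` in the corpus."""
    return sum(doc.count(term) for doc in corpus)

def extract_terms(candidates, corpus, threshold):
    """ASSUMPTION: filter by representativeness, then rank by frequency."""
    scored = [(t, term_frequency(t, corpus), representativeness(t, corpus))
              for t in candidates]
    kept = [s for s in scored if s[2] >= threshold]
    return sorted(kept, key=lambda s: s[1], reverse=True)

# Toy corpus (invented): each document is a list of tokens.
corpus = [
    ["the", "neural", "network", "learns"],
    ["the", "planner", "searches"],
    ["the", "neural", "model", "learns"],
    ["the", "parser", "reads"],
]

for term, freq, rep in extract_terms(["the", "neural"], corpus, threshold=0.1):
    print(term, freq, round(rep, 3))
```

In this toy corpus "the" is the most frequent word, but it occurs in every document, so the documents containing it are the whole corpus and its distance is zero. It is therefore discarded despite its frequency, which is the behavior the abstract attributes to the measure.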
Published online: 01 October 2001