Two reasons suggest that combining several term extractors could improve overall system accuracy. On the one hand, there is no clear agreement on whether to follow statistical, linguistic or hybrid approaches for (semi-)automatic term extraction. On the other hand, combining different knowledge sources (e.g. classifiers) has proved successful in improving on the performance of the individual sources in several NLP tasks, some of them closely related to or involved in term extraction, such as context-sensitive spelling correction, part-of-speech tagging, word sense disambiguation, parsing, and text classification and filtering.
In this paper, we present a proposal for combining several term extraction techniques in order to improve the accuracy of the resulting system. The approach has been applied to the medical domain in Spanish, and a number of tests have been carried out with encouraging results.