Article published in: Terminology across Languages and Domains
Edited by Patrick Drouin, Natalia Grabar, Thierry Hamon and Kyo Kageura
[Terminology 21:2] 2015
pp. 151–179
The underpinnings of a composite measure for automatic term extraction
The case of SRC
The corpus-based identification of the lexical units that describe a given specialized domain is usually a complex task, in which an analysis based on word frequency and the likelihood of lexical associations is often ineffective. The goal of this article is to demonstrate that a user-adjustable composite metric such as SRC can accommodate the diversity of domain-specific glossaries to be constructed from small- and medium-sized specialized corpora of unstructured texts. Unlike most research in automatic term extraction, where single metrics are usually combined indiscriminately to produce the best results, SRC is grounded in the theoretical principles of salience, relevance and cohesion, which have been rationally implemented in the three components of the metric.
Keywords: SRC, cohesion, automatic term extraction, relevance, salience
Published online: 31 December 2015