Article published in:
Computational Terminology and Filtering of Terminological Information, edited by Patrick Drouin, Natalia Grabar, Thierry Hamon, Kyo Kageura and Koichi Takeuchi
[Terminology 24:1] 2018
pp. 23–40
Distributed specificity for automatic terminology extraction
Diana Inkpen | University of Ottawa, Canada
T. Sima Paribakht | University of Ottawa, Canada
Farahnaz Faez | Western University, Canada
The present article explores two novel methods that integrate distributed representations with terminology extraction. Both methods assess the specificity of a word (unigram) to the target corpus by comparing its distributed representation in the target domain with its representation in the general domain. The first approach uses this distributed specificity as a filter; the second applies it to the corpus directly. The filter can be mounted on any other Automatic Terminology Extraction (ATE) method, allows merging any number of other ATE methods, and achieves remarkable results with minimal training. The direct approach does not perform as well as the filtering approach, but it reemphasizes that when distributed specificity is used as the word representation, very little data is required to train an ATE classifier. This encourages more minimally supervised ATE algorithms in the future.
Keywords: automatic terminology extraction, neural networks, distributed specificity, representation learning, word embeddings
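The core idea of distributed specificity described in the abstract can be sketched in a few lines. This is an illustrative proxy, not the paper's exact formulation: it scores a word as domain-specific when its embedding trained on the target corpus diverges (low cosine similarity) from its embedding trained on a general corpus. The toy vectors, the `distributed_specificity` helper, and the assumption that the two embedding spaces are comparable are all hypothetical.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def distributed_specificity(word, domain_vecs, general_vecs):
    """Illustrative proxy: a word is more domain-specific the more its
    target-domain embedding diverges from its general-domain embedding.
    (The paper's exact scoring may differ.)"""
    return 1.0 - cosine(domain_vecs[word], general_vecs[word])

# Hand-made toy vectors (hypothetical). In practice one would train
# embeddings such as word2vec separately on the target corpus and on a
# general corpus, and align the two spaces before comparing.
general = {"derivative": np.array([0.9, 0.1, 0.0]),
           "the":        np.array([0.5, 0.5, 0.5])}
domain  = {"derivative": np.array([0.1, 0.9, 0.2]),  # usage shifts in math texts
           "the":        np.array([0.5, 0.5, 0.5])}  # function word, unchanged

for w in ("derivative", "the"):
    print(w, round(distributed_specificity(w, domain, general), 3))
```

A candidate term like "derivative" receives a high score because its domain usage diverges from general usage, while a function word like "the" scores near zero; the filtering approach in the article would then keep only high-scoring candidates proposed by another ATE method.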
Published online: 31 May 2018
https://doi.org/10.1075/term.00012.amj