Tagging terms in text
A supervised sequential labelling approach to automatic term extraction
Ayla Rigouts Terryn | Ghent University
Véronique Hoste | Ghent University
Els Lefever | Ghent University
As with many tasks in natural language processing, automatic term extraction (ATE) is increasingly approached as a machine learning problem. So far, most machine learning approaches to ATE broadly follow the traditional hybrid methodology: first extracting a list of unique candidate terms, and then classifying these candidates based on the predicted probability that they are valid terms. However, with the rise of neural networks and word embeddings, the next development in ATE might be towards sequential approaches, i.e., classifying each occurrence of each token within its original context. To test the validity of such approaches for ATE, two sequential methodologies were developed, evaluated, and compared: one feature-based conditional random fields classifier and one embedding-based recurrent neural network. An additional comparison was made with a machine learning interpretation of the traditional approach. All systems were trained and evaluated on identical data in multiple languages and domains to identify their respective strengths and weaknesses. The sequential methodologies proved to be valid approaches to ATE, and the neural network even outperformed the more traditional approach. Interestingly, a combination of multiple approaches can outperform all of them separately, pointing to new ways to push the state of the art in ATE.
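To make the contrast concrete, sequential approaches like those compared here typically recast term extraction as per-token classification under a BIO scheme (B = begins a term, I = inside a term, O = outside any term). The sketch below is illustrative only, not the authors' code; the function name and example data are invented for the demonstration.

```python
# Minimal sketch (not from the paper): converting span-annotated text into
# BIO labels, the format a sequential tagger (CRF or RNN) is trained on.

def bio_tags(tokens, term_spans):
    """Map term annotations to per-token BIO labels.

    tokens: list of token strings
    term_spans: list of (start, end) token-index pairs, end exclusive
    """
    tags = ["O"] * len(tokens)
    for start, end in term_spans:
        tags[start] = "B"          # first token of the term
        for i in range(start + 1, end):
            tags[i] = "I"          # remaining tokens of the term
    return tags

tokens = ["The", "conditional", "random", "fields", "classifier", "converged"]
spans = [(1, 4)]  # "conditional random fields" annotated as a term
print(list(zip(tokens, bio_tags(tokens, spans))))
# → [('The', 'O'), ('conditional', 'B'), ('random', 'I'),
#    ('fields', 'I'), ('classifier', 'O'), ('converged', 'O')]
```

Note the key difference from the traditional pipeline: each *occurrence* of a token is labelled in context, so the same word can be part of a term in one sentence and not in another, something a list of unique candidate terms cannot express.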
Keywords: terminology, automatic term extraction, sequential labelling
Article outline
- 1. Introduction
- 2. Related research
- 2.1 Machine learning approaches
- 2.2 Evaluation
- 2.3 Features
- 2.4 Sequential approaches
- 3. Data
- 4. System description
- 4.1 CRFSuite feature-based sequential ATE
- 4.2 FlairNLP neural, embedding-based sequential ATE
- 4.3 HAMLET machine learning approach to traditional hybrid ATE
- 5. Experiments and results
- 5.1 Experimental setup
- 5.2 CRF results
- 5.3 RNN results
- 6. Analyses and discussion of results
- 6.1 Choice of experiments and motivation
- 6.2 Results per corpus
- 6.3 Sequential, neural approach vs. traditional, feature-based approach
- 6.4 Complementarity of results
- 7. RNN error analysis
- 8. Conclusion
- Notes
- References
Published online: 10 January 2022
https://doi.org/10.1075/term.21010.rig