User-driven assessment of commercial term extractors
In this paper, we address the evaluation of commercial term extraction tools from the users’ perspective. We first revisit
the gold standard approach commonly practised among researchers and discuss the challenges it may pose for end users, taking
translators as a typical example. Considering the very different motivations and needs of users and researchers, we propose a
user-driven approach as a variation on, and alternative to, the gold standard approach, allowing users to assess and understand
the performance of commercial tools more objectively. Its feasibility and usefulness are demonstrated by deploying a benchmarking
dataset of English-Chinese financial terms, produced by multiple annotators, in a case study with SDL MultiTerm Extract. The
results also provide insight for the future development of term extractors designed for translators, which will hopefully
generate more accurate candidates, offer more customised features, enable a better user experience, and enjoy wider popularity
as computer-aided translation tools.
Keywords: automatic term extraction, bilingual term annotation, computer-aided translation, financial terminology, user-driven system assessment
Article outline
- 1. Introduction
- 2. Related work
- 2.1 Automatic term extraction
- 2.2 The issue of system evaluation
- 3. Creating the user-made benchmark
- 3.1 The corpus
- 3.2 English-Chinese financial terms in existing resources
- 3.3 Term annotation guidelines
- Scope of terms
- Form of terms
- Span of terms
- 3.4 The annotation and the resulting benchmark
- 4. Assessing systems with user-driven benchmarks
- 4.1 SDL MultiTerm Extract
- 4.2 Monolingual English term extraction
- 4.3 Monolingual Chinese term extraction
- 4.4 Bilingual English-Chinese term extraction
- 5. Discussion
- 5.1 User-driven approach to accommodate individual needs
- 5.2 An informal comparison with research-based systems
- 6. Conclusion
- References
Published online: 03 August 2021
https://doi.org/10.1075/term.20032.kwo