Eliciting specialized frames from corpora using argument-structure extraction techniques
Beatriz Sánchez Cárdenas | University of Granada
Carlos Ramisch | Aix-Marseille Université, Université de Toulon, CNRS, LIS
Frame Semantics provides a powerful cross-lingual model for describing the conceptual structure underlying
specialized language. Building specialized frames is challenging because of the complex nature of predicate-argument structures
and because of the domain-specific uses of general-language predicates. We present a semi-automatic method that elicits semantic
frames from specialized corpora. It aims to discover lexical patterns that reveal the structure of specialized frames and to
populate them with corpus-based data. First, we automatically extracted noun-verb-noun triples from corpora, using bootstrapping
to identify noun-verb-noun phraseological patterns. Second, we annotated each noun-verb-noun triple with the lexical domain of the
verb and with the semantic class and role of the noun filling each argument slot. We then used these annotations and patterns to
classify similar triples. In this way, we inferred the structure of each specialized frame and the types of lexical units that
belong to it. An analysis of specialized corpora of environmental science texts in English and Spanish illustrates our methodology.
Keywords: Frame Semantics, frame-based terminology, corpora, corpus-based extraction, argument structure
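The extraction step described in the abstract can be illustrated with a minimal sketch. It assumes that the corpus has already been dependency-parsed with Universal Dependencies labels (`nsubj`, `obj`), and it uses a hand-written toy input rather than real corpus data. The function names `extract_triples` and `slot_fillers` and the example sentences are hypothetical, and the sketch does not implement the bootstrapping or the semantic annotation steps; it only shows how noun-verb-noun triples might be collected and grouped by argument slot.

```python
from collections import Counter, defaultdict

# Hypothetical toy input (not data from the study): each sentence is a list of
# tokens (id, lemma, upos, head, deprel) following Universal Dependencies labels.
SENTENCES = [
    [
        (1, "wave", "NOUN", 2, "nsubj"),
        (2, "erode", "VERB", 0, "root"),
        (3, "the", "DET", 4, "det"),
        (4, "coast", "NOUN", 2, "obj"),
    ],
    [
        (1, "the", "DET", 2, "det"),
        (2, "storm", "NOUN", 3, "nsubj"),
        (3, "damage", "VERB", 0, "root"),
        (4, "the", "DET", 5, "det"),
        (5, "coast", "NOUN", 3, "obj"),
    ],
    [
        (1, "river", "NOUN", 2, "nsubj"),
        (2, "erode", "VERB", 0, "root"),
        (3, "the", "DET", 4, "det"),
        (4, "bank", "NOUN", 2, "obj"),
    ],
]


def extract_triples(sentence):
    """Return (subject, verb, object) lemma triples for each verb that has
    both a noun subject and a noun direct object in the sentence."""
    triples = []
    for tok_id, lemma, upos, _head, _deprel in sentence:
        if upos != "VERB":
            continue
        # Noun dependents of this verb, split by grammatical relation.
        subjects = [l for (_, l, u, h, d) in sentence
                    if h == tok_id and d == "nsubj" and u == "NOUN"]
        objects = [l for (_, l, u, h, d) in sentence
                   if h == tok_id and d == "obj" and u == "NOUN"]
        triples.extend((s, lemma, o) for s in subjects for o in objects)
    return triples


def slot_fillers(sentences):
    """Count, for every verb lemma, the nouns observed in each argument slot."""
    fillers = defaultdict(lambda: {"subj": Counter(), "obj": Counter()})
    for sentence in sentences:
        for subj, verb, obj in extract_triples(sentence):
            fillers[verb]["subj"][subj] += 1
            fillers[verb]["obj"][obj] += 1
    return fillers


if __name__ == "__main__":
    for verb, slots in slot_fillers(SENTENCES).items():
        print(verb, dict(slots["subj"]), dict(slots["obj"]))
```

Grouping fillers by slot in this way gives an annotator a compact view of which nouns occur as the first and second argument of a verb, which is the kind of evidence the annotation of semantic classes and roles would start from.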
Published online: 24 July 2019
https://doi.org/10.1075/term.00026.san