Eliciting specialized frames from corpora using argument-structure extraction techniques
Frame Semantics provides a powerful cross-lingual model to describe the conceptual structure underlying
specialized language. Building specialized frames is challenging because of the complex nature of predicate-argument structures,
and because of the domain-specific uses of general-language predicates. Our semi-automatic method elicits semantic frames from
specialized corpora. It aims to discover lexical patterns that reveal the structure of specialized frames and to populate them
with corpus-based data. First, we automatically extracted noun-verb-noun triples from the corpora, using bootstrapping to identify
noun-verb-noun phraseological patterns. Second, we annotated each triple with the lexical domain of the verb and
the semantic class and role of the noun filling each argument slot. We then used these annotations and patterns to classify
similar triples. In this way, we inferred the structure of each specialized frame and the types of lexical units that belong to it.
An analysis of specialized corpora of environmental science texts in English and Spanish illustrates our methodology.
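The two steps summarized above (triple extraction, then annotation and grouping) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes sentences are already dependency-parsed with Universal Dependencies v2-style labels (`nsubj`, `obj`), it omits the bootstrapping stage entirely, and the `Token` and `Triple` types, the `VERB_DOMAIN` and `NOUN_CLASS` tables, and all labels in them are hypothetical placeholders standing in for manual annotation.

```python
from collections import defaultdict
from typing import NamedTuple


class Token(NamedTuple):
    idx: int      # 1-based position in the sentence
    lemma: str
    upos: str     # coarse part-of-speech tag, e.g. "NOUN" or "VERB"
    head: int     # idx of the syntactic head (0 for the root)
    deprel: str   # dependency relation to the head


class Triple(NamedTuple):
    subject: str
    verb: str
    obj: str


def extract_triples(sentence):
    """Collect noun-verb-noun triples: a verb with a noun subject and a noun object."""
    triples = []
    for verb in sentence:
        if verb.upos != "VERB":
            continue
        deps = [t for t in sentence if t.head == verb.idx and t.upos == "NOUN"]
        subjects = [t.lemma for t in deps if t.deprel == "nsubj"]
        objects = [t.lemma for t in deps if t.deprel == "obj"]
        triples.extend(Triple(s, verb.lemma, o) for s in subjects for o in objects)
    return triples


# Hypothetical annotation tables (assumed labels, not taken from the article).
VERB_DOMAIN = {"emit": "MOVEMENT", "eject": "MOVEMENT", "destroy": "CHANGE"}
NOUN_CLASS = {
    "volcano": "natural_place",
    "forest": "natural_place",
    "ash": "material_entity",
    "lava": "material_entity",
    "eruption": "event",
}


def group_by_pattern(triples):
    """Group triples that share (subject class, verb domain, object class)."""
    groups = defaultdict(list)
    for t in triples:
        key = (
            NOUN_CLASS.get(t.subject, "unknown"),
            VERB_DOMAIN.get(t.verb, "unknown"),
            NOUN_CLASS.get(t.obj, "unknown"),
        )
        groups[key].append(t)
    return dict(groups)


# Three hand-parsed toy sentences (lemmas only).
SENTENCES = [
    [Token(1, "the", "DET", 2, "det"), Token(2, "volcano", "NOUN", 3, "nsubj"),
     Token(3, "emit", "VERB", 0, "root"), Token(4, "ash", "NOUN", 3, "obj")],
    [Token(1, "volcano", "NOUN", 2, "nsubj"), Token(2, "eject", "VERB", 0, "root"),
     Token(3, "lava", "NOUN", 2, "obj")],
    [Token(1, "the", "DET", 2, "det"), Token(2, "eruption", "NOUN", 3, "nsubj"),
     Token(3, "destroy", "VERB", 0, "root"), Token(4, "the", "DET", 5, "det"),
     Token(5, "forest", "NOUN", 3, "obj")],
]

triples = [t for s in SENTENCES for t in extract_triples(s)]
groups = group_by_pattern(triples)
for pattern, members in groups.items():
    print(pattern, [tuple(m) for m in members])
```

In this toy run, the triples for "emit" and "eject" fall into the same group because their verbs and argument nouns share the same assumed labels, which is the kind of generalization over argument structures that a frame is meant to capture.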
Article outline
- 1. Introduction
- 2. Cognitive linguistics applied to specialized language
- 3. Extraction methodology
- 3.1 Corpus description
- 3.2 Query and filtering tools
- Step 1. Querying the corpora
- Step 2. Filtering and sorting the results
- 3.3 Search result bootstrapping
- 4. Frame construction based on argument structure generalization
- 4.1 Characterizing the nuclear meaning of verbs
- 4.2 Characterizing the ontological nature of nouns
- 4.2.1 Natural entities
- a. Inanimate objects: Mineral, rock, branch
- b. Material entities: Gas, smoke, ash, lava
- c. Natural geographical places: Forest, riverbed, shore
- 4.2.2 Actions and events
- 4.3 Characterizing the thematic role of noun-verb pairs
- 4.4 Grouping similar argument structures
- 5. Results and analyses
- 5.1 Error analysis
- 5.2 Semantic frames of volcanic activity
- 6. Conclusions and future work
- Acknowledgements
- Notes
- References