Article published in: Lexical semantic approaches to terminology
Edited by Pamela Faber and Marie-Claude L'Homme
[Terminology 20:2] 2014
pp. 198–224
Hunting for a linguistic phantom
A corpus-linguistic study of knowledge-rich contexts
The importance of semantic descriptions of concepts by means of defining statements is a commonplace tenet of scientific and practical approaches to terminology. While the current understanding of defining statements remains bound to classical concepts of defining, there is limited knowledge about the types of conceptual information that may ease the transfer of knowledge. Furthermore, there is little insight into how defining statements differ epistemologically from non-defining (generic) statements, and equally little into how the two differ linguistically. Finally, it remains unclear how practical terminology work can benefit from corpus-based research on the description of defining statements. This paper aims to shed light on some of these questions by describing a corpus-linguistic study of knowledge-rich contexts in German and Russian web corpora. Hypotheses about linguistic features of knowledge-rich contexts are derived in a theory-driven manner and examined by means of corpus-linguistic methods. Significant features are then investigated further for the German data, using a multivariate method.
Keywords: corpus linguistics, knowledge-rich contexts, Russian, German, web corpora
Published online: 31 October 2014