This paper presents an innovative approach, within the framework of distributional semantics, to exploring semantic similarity in a technical corpus. Complementing a previous quantitative semantic analysis of the same domain, machining terminology, it sets out to uncover fine-grained semantic distinctions in order to explore the semantic heterogeneity of a number of technical items. Multidimensional scaling (MDS) was carried out to cluster the first-order co-occurrences of a technical node on the basis of their shared second-order and third-order co-occurrences. By taking into account the association values between relevant first-order and second-order co-occurrences, semantic similarities and dissimilarities between first-order co-occurrences could be determined and represented as proximities and distances on a graph. In discussing the methodology and results of statistical clustering techniques for semantic purposes, we pay special attention to the linguistic and terminological interpretation.
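To make the procedure concrete, the following minimal sketch shows how first-order co-occurrences might be placed on a two-dimensional map from their association values with second-order co-occurrences. Everything in it is an assumption made for illustration: the words, the association values, the use of cosine distance, and the choice of classical (Torgerson) MDS, which is swapped in here for brevity. The abstract does not specify the association measure or the MDS variant used in the paper, and third-order co-occurrences are left out of the sketch.

```python
import numpy as np

# Hypothetical toy data, invented for illustration only.
# Rows: first-order co-occurrences of a technical node.
# Columns: second-order co-occurrences.
# Cells: association values (any association score would do here).
FIRST_ORDER = ["cutting", "milling", "factory", "plant"]
SECOND_ORDER = ["tool", "speed", "edge", "site", "staff"]
ASSOC = np.array([
    [3.0, 2.5, 2.0, 0.0, 0.0],
    [2.8, 2.2, 1.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 3.1, 2.4],
    [0.3, 0.0, 0.0, 2.9, 2.0],
])


def cosine_distances(matrix):
    """Pairwise cosine distances between the rows of `matrix`."""
    unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    dist = 1.0 - unit @ unit.T
    np.fill_diagonal(dist, 0.0)        # remove rounding noise on the diagonal
    return np.clip(dist, 0.0, None)    # guard against tiny negative values


def classical_mds(dist, dims=2):
    """Classical (Torgerson) MDS: embed a distance matrix in `dims` dimensions."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering   # double centering
    eigvals, eigvecs = np.linalg.eigh(gram)             # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:dims]              # largest ones first
    scale = np.sqrt(np.maximum(eigvals[top], 0.0))      # clamp negative eigenvalues
    return eigvecs[:, top] * scale


dist = cosine_distances(ASSOC)
coords = classical_mds(dist)

for word, (x, y) in zip(FIRST_ORDER, coords):
    print(f"{word:8s} {x:+.3f} {y:+.3f}")
```

In this toy setting, first-order co-occurrences that share strongly associated second-order co-occurrences end up close together on the map, while those with disjoint profiles lie far apart, which is the kind of proximity and distance pattern the abstract refers to.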