Construction mining
Identifying construction candidates for the German constructicon
The German Constructicon Project (www.german-constructicon.de) aims to document grammatical constructions in contemporary standard German on the basis
of annotated corpus examples, including relations among constructions and between constructions and the semantic frames they evoke. So
far, research has focused mainly on developing and computationally implementing a constructicographic workflow
(including a parsing pipeline) that can handle constructions of any kind at varying levels of schematicity,
idiomaticity, and abstractness. However, such an exemplar-driven procedure does not allow us to identify
construction candidates systematically. In this article, we examine ways to operationalize and implement data-mining procedures that
identify construction candidates inductively.
Article outline
- 1. What’s out there in the constructicon?
- 2. Identifying constructions
- 3. Construction mining: Operationalization and implementation of a computational framework
- 3.1 Generating a list of patterns
- 3.2 From patterns to construction candidates
- 3.3 cxnMiner: A framework for mining constructions
- 3.4 Future work: From construction candidates to a constructicon
- 4. Conclusions
- Acknowledgements
- Notes
- References