Article published in: Constructions and Frames: Online-First Articles
The computational learning of construction grammars: State of the art and prospective roadmap
This paper documents and reviews the state of the art concerning computational models of construction grammar
learning. It brings together prior work on the computational learning of form-meaning pairings, which has so far been studied in
several distinct areas of research. The goal of this paper is threefold. First, it aims to synthesise the variety of methodologies that have been proposed to date and the results that have been obtained. Second, it aims to identify those parts of the challenge that have been successfully tackled and to reveal those that require further research. Finally, it aims to provide a roadmap that can help to boost and streamline future research efforts on the computational learning of large-scale, usage-based
construction grammars.
Keywords: construction grammar, computational construction grammar, usage-based linguistics, computational modelling, learning construction grammars
Article outline
- 1. Learning computational construction grammars
- 2. Methodology
  - 2.1 Inclusion criteria
  - 2.2 Discussion criteria
- 3. Review of prior literature
  - 3.1 Learning a maximally concise grammar
  - 3.2 Learning a grammar from utterance-meaning pairs
  - 3.3 Learning a grammar under referential uncertainty
  - 3.4 Learning a grammar from a situation model
- 4. Discussion
  - Representing meaning
  - Representing form
  - Representing constructions
  - Learning constructions
  - Language-independent learning
  - Scaling up
- 5. Conclusion
- Note
- References
Available under the Creative Commons Attribution (CC BY) 4.0 license.
For any use beyond this license, please contact the publisher at [email protected].
Published online: 16 December 2024
https://doi.org/10.1075/cf.23026.dou