Article published in: The Mental Lexicon, Vol. 18:3 (2023), pp. 472–511
References
Abdou, M., Kulmizev, A., Hershcovich, D., Frank, S., Pavlick, E., and Søgaard, A. (2021). Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color. In Proceedings of the 25th Conference on Computational Natural Language Learning, pages 109–132, Stroudsburg, PA, USA. Association for Computational Linguistics.
Anderson, A. J., Bruni, E., Lopopolo, A., Poesio, M., and Baroni, M. (2015). Reading visually embodied meaning from the brain: Visually grounded computational models decode visual-object mental imagery induced by written text. NeuroImage, 120:309–322.
Anschütz, M., Lozano, D. M., and Groh, G. (2023). This is not correct! Negation-aware evaluation of language generation systems.
Baroni, M. (2016). Grounding distributional semantics in the visual world. Language and Linguistics Compass, 10(1):3–13.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22(4).
Barsalou, L. W. (2003). Abstraction in perceptual symbol systems. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 358(1435).
Barsalou, L. W. (2008). Grounded Cognition. Annual Review of Psychology, 59(1).
Barsalou, L. W. (2010). Grounded cognition: Past, present, and future. Topics in Cognitive Science, 2(4):716–724.
Barsalou, L. W., Santos, A., Simmons, W. K., and Wilson, C. D. (2008). Language and simulation in conceptual processing. In Symbols and Embodiment: Debates on meaning and cognition. Oxford University Press.
Bordes, P., Zablocki, E., Soulier, L., Piwowarski, B., and Gallinari, P. (2019). Incorporating visual semantics into sentence representations within a grounded space. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 696–707, Hong Kong, China. Association for Computational Linguistics.
Bruni, E., Tran, N.-K., and Baroni, M. (2014). Multimodal distributional semantics. Journal of Artificial Intelligence Research, 49:1–47.
Brysbaert, M., Warriner, A. B., and Kuperman, V. (2014). Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3):904–911.
Buchanan, E. M., Valentine, K. D., and Maxwell, N. P. (2019). English semantic feature production norms: An extended database of 4436 concepts. Behavior Research Methods, 51(4).
Bulat, L., Clark, S., and Shutova, E. (2017). Speaking, Seeing, Understanding: Correlating semantic models with conceptual representation in the brain. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA. Association for Computational Linguistics.
Castelhano, M. S. and Rayner, K. (2008). Eye movements during reading, visual search, and scene perception: An overview. In Cognitive and cultural influences on eye movements, pages 3–33.
Chatfield, K., Simonyan, K., Vedaldi, A., and Zisserman, A. (2014). Return of the Devil in the Details: Delving Deep into Convolutional Nets. arXiv preprint arXiv:1405.3531.
Chrupała, G., Kádár, Á., and Alishahi, A. (2015). Learning language through pictures. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 112–118, Beijing, China. Association for Computational Linguistics.
Collell Talleda, G., Zhang, T., and Moens, M.-F. (2017). Imagined visual representations as multimodal embeddings. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), pages 4378–4384. AAAI.
Cree, G. S. and McRae, K. (2003). Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese, and cello (and many other such concrete nouns). Journal of Experimental Psychology: General, 132(2).
Cronin, D. A., Hall, E. H., Goold, J. E., Hayes, T. R., and Henderson, J. M. (2020). Eye movements in real-world scene photographs: General characteristics and effects of viewing task. Frontiers in Psychology, 10:2915.
De Deyne, S., Navarro, D. J., Collell, G., and Perfors, A. (2021). Visual and Affective Multimodal Models of Word Meaning in Language and Mind. Cognitive Science, 45(1).
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE.
Dolan, R. J. (2002). Emotion, cognition, and behavior. Science, 298(5596):1191–1194.
Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G., and Ruppin, E. (2001). Placing search in context: The concept revisited. In Proceedings of the 10th International Conference on World Wide Web, pages 406–414.
Gerz, D., Vulić, I., Hill, F., Reichart, R., and Korhonen, A. (2016). SimVerb-3500: A large-scale evaluation set of verb similarity. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2173–2182, Austin, Texas. Association for Computational Linguistics.
Goldstone, R. L. (1995). Effects of Categorization on Color Perception. Psychological Science, 6(5).
Grondin, R., Lupker, S. J., and McRae, K. (2009). Shared features dominate semantic richness effects for concrete concepts. Journal of Memory and Language, 60(1):1–19.
Günther, F., Petilli, M. A., Vergallito, A., and Marelli, M. (2022). Images of the unseen: Extrapolating visual representations for abstract and concrete words in a data-driven computational model. Psychological Research.
Günther, F., Rinaldi, L., and Marelli, M. (2019). Vector-Space Models of Semantic Representation From a Cognitive Perspective: A Discussion of Common Misconceptions. Perspectives on Psychological Science, 14(6):1006–1033.
Halawi, G., Dror, G., Gabrilovich, E., and Koren, Y. (2012). Large-scale learning of word relatedness with constraints. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1406–1414.
Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1–3):335–346.
Harris, Z. S. (1954). Distributional Structure. WORD, 10(2–3).
Hasegawa, M., Kobayashi, T., and Hayashi, Y. (2017). Incorporating visual features into word embeddings: A bimodal autoencoder-based approach. In IWCS 2017 – 12th International Conference on Computational Semantics – Short papers.
Hill, F., Reichart, R., and Korhonen, A. (2015). SimLex-999: Evaluating semantic models with (genuine) similarity estimation. Computational Linguistics, 41(4):665–695.
Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8):1735–1780.
Hoffman, D. (2019). The case against reality: Why evolution hid the truth from our eyes. W. W. Norton & Company.
Hollenstein, N., de la Torre, A., Langer, N., and Zhang, C. (2019). CogniVal: A Framework for Cognitive Word Embedding Evaluation. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), Stroudsburg, PA, USA. Association for Computational Linguistics.
Howell, S. R., Jankowicz, D., and Becker, S. (2005). A model of grounded language acquisition: Sensorimotor features improve lexical and grammatical learning. Journal of Memory and Language, 53(2):258–276.
Kant, I., Guyer, P., and Wood, A. W. (1781/1999). Critique of pure reason. Cambridge University Press.
Kenton, J. D. M.-W. C. and Toutanova, L. K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT, pages 4171–4186.
Kiela, D. and Bottou, L. (2014). Learning image embeddings using convolutional neural networks for improved multi-modal semantics. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 36–45, Doha, Qatar. Association for Computational Linguistics.
Kiela, D., Bulat, L., and Clark, S. (2015). Grounding semantics in olfactory perception. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 231–236.
Kiela, D. and Clark, S. (2015). Multi- and Cross-Modal Semantics Beyond Vision: Grounding in Auditory Perception. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA. Association for Computational Linguistics.
Kiela, D., Conneau, A., Jabri, A., and Nickel, M. (2018). Learning visually grounded sentence representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 408–418, New Orleans, Louisiana. Association for Computational Linguistics.
Kiros, J., Chan, W., and Hinton, G. (2018). Illustrative language understanding: Large-scale visual grounding with image search. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 922–933, Melbourne, Australia. Association for Computational Linguistics.
Lakoff, G. (1987). Women, Fire, and Dangerous Things. University of Chicago Press.
Lakoff, G. and Johnson, M. (1980). The metaphorical structure of the human conceptual system. Cognitive Science, 4(2):195–208.
Landauer, T. K. and Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2).
Langacker, R. W. (1999). A view from cognitive linguistics. Behavioral and Brain Sciences, 22(4).
Lazaridou, A., Chrupała, G., Fernández, R., and Baroni, M. (2016). Multimodal Semantic Learning from Child-Directed Input. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Stroudsburg, PA, USA. Association for Computational Linguistics.
Lazaridou, A., Marelli, M., and Baroni, M. (2017). Multimodal Word Meaning Induction From Minimal Exposure to Natural Text. Cognitive Science, 41.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740–755. Springer.
Louwerse, M. and Connell, L. (2011). A Taste of Words: Linguistic Context and Perceptual Simulation Predict the Modality of Words. Cognitive Science, 35(2):381–398.
Louwerse, M. M. (2011). Symbol interdependency in symbolic and embodied cognition. Topics in Cognitive Science, 3(2):273–302.
Louwerse, M. M. and Zwaan, R. A. (2009). Language Encodes Geographical Information. Cognitive Science, 33(1):51–73.
Luong, T., Socher, R., and Manning, C. (2013). Better word representations with recursive neural networks for morphology. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 104–113, Sofia, Bulgaria. Association for Computational Linguistics.
Lynott, D., Connell, L., Brysbaert, M., Brand, J., and Carney, J. (2020). The Lancaster Sensorimotor Norms: Multidimensional measures of perceptual and action strength for 40,000 English words. Behavior Research Methods, 52(3).
Mandera, P., Keuleers, E., and Brysbaert, M. (2017). Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation. Journal of Memory and Language, 92.
Marelli, M. and Amenta, S. (2018). A database of orthography-semantics consistency (OSC) estimates for 15,017 English words. Behavior Research Methods, 50:1482–1495.
Martin, A. (2007). The Representation of Object Concepts in the Brain. Annual Review of Psychology, 58(1):25–45.
McRae, K., Cree, G. S., Seidenberg, M. S., and McNorgan, C. (2005). Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods, 37(4).
Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
Mkrtychian, N., Blagovechtchenski, E., Kurmakaeva, D., Gnedykh, D., Kostromina, S., and Shtyrov, Y. (2019). Concrete vs. Abstract Semantics: From Mental Representations to Functional Brain Mapping. Frontiers in Human Neuroscience, 13:267.
Montefinese, M. (2019). Semantic representation of abstract and concrete words: A minireview of neural evidence. Journal of Neurophysiology, 121(5):1585–1587.
Park, J. and Myaeng, S.-h. (2017a). A computational study on word meanings and their distributed representations via polymodal embedding. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 214–223, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Park, J. and Myaeng, S.-h. (2017b). A computational study on word meanings and their distributed representations via polymodal embedding. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 214–223.
Pennington, J., Socher, R., and Manning, C. (2014). GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Stroudsburg, PA, USA. Association for Computational Linguistics.
Pezzelle, S., Takmaz, E., and Fernández, R. (2021). Word representation learning in multimodal pre-trained transformers: An intrinsic evaluation. Transactions of the Association for Computational Linguistics, 9:1563–1579.
Rotaru, A. S. and Vigliocco, G. (2020a). Constructing semantic models from words, images, and emojis. Cognitive Science, 44(4):e12830.
Rotaru, A. S. and Vigliocco, G. (2020b). Constructing Semantic Models From Words, Images, and Emojis. Cognitive Science, 44(4):e12830.
Rozenkrants, B., Olofsson, J. K., and Polich, J. (2008). Affective visual event-related potentials: Arousal, valence, and repetition effects for normal and distorted pictures. International Journal of Psychophysiology, 67(2):114–123.
Shahmohammadi, H., Heitmeier, M., Shafaei-Bajestan, E., Lensch, H., and Baayen, H. (2023). Language with vision: A study on grounded word and sentence embeddings. Behavior Research Methods, accepted for publication.
Shahmohammadi, H., Lensch, H. P. A., and Baayen, R. H. (2021). Learning zero-shot multifaceted visually grounded word embeddings via multi-task training. In Proceedings of the 25th Conference on Computational Natural Language Learning, pages 158–170, Online. Association for Computational Linguistics.
Silberer, C. and Lapata, M. (2014). Learning grounded meaning representations with autoencoders. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 721–732, Baltimore, Maryland. Association for Computational Linguistics.
Simmons, W. K., Martin, A., and Barsalou, L. W. (2005). Pictures of Appetizing Foods Activate Gustatory Cortices for Taste and Reward. Cerebral Cortex, 15(10):1602–1608.
Solomon, K. O. and Barsalou, L. W. (2001). Representing Properties Locally. Cognitive Psychology, 43(2):129–169.
Solomon, K. O. and Barsalou, L. W. (2004). Perceptual simulation in property verification. Memory & Cognition, 32(2):244–259.
Tan, H. and Bansal, M. (2020). Vokenization: Improving language understanding with contextualized, visual-grounded supervision. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2066–2080, Online. Association for Computational Linguistics.
Tan, M. and Le, Q. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning, pages 6105–6114. PMLR.
Utsumi, A. (2022). A test of indirect grounding of abstract concepts using multimodal distributional semantics. Frontiers in Psychology, 13.
Vigliocco, G., Ponari, M., and Norbury, C. (2018). Learning and processing abstract words and concepts: Insights from typical and atypical development. Topics in Cognitive Science, 10(3):533–549.
Wang, B., Wang, A., Chen, F., Wang, Y., and Kuo, C.-C. J. (2019). Evaluating word embedding models: Methods and experimental results. APSIPA Transactions on Signal and Information Processing, 8.
Westbury, C. (2014). You Can’t Drink a Word: Lexical and Individual Emotionality Affect Subjective Familiarity Judgments. Journal of Psycholinguistic Research, 43(5).
Westbury, C. and Hollis, G. (2019). Wriggly, squiffy, lummox, and boobs: What makes some words funny? Journal of Experimental Psychology: General, 148(1).
Wood, S. N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B), 73(1):3–36.
Yun, T., Sun, C., and Pavlick, E. (2021). Does vision-and-language pretraining improve lexical grounding? In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 4357–4366, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Zwaan, R. A. and Madden, C. J. (2005). Embodied Sentence Comprehension. In Grounding Cognition. Cambridge University Press.