The variation of action verbs in multilingual spontaneous speech corpora
Semantic typology and corpus design
Most high-frequency verbs referring to Action in ordinary communication are General; that is, they extend productively to different actions in their own meaning. Moreover, languages can categorize actions differently. Despite its importance, the variation of these verbs is largely unknown, and this lack of data prevents us from addressing crucial aspects of lexical typology. The range of productive variation of Action verbs can be induced from spoken corpora, since references to actions are frequent in oral communication. This paper presents data derived from multilingual corpora (English and Italian) within the IMAGACT project and illustrates the methodology, the corpus design requirements, and the overall results obtained in this corpus-based research on cross-linguistic lexical semantics. The methodology identifies data that are relevant to semantic competence, separating the contexts in which the verb is used in its own core meaning from metaphors and phraseology. It uses visual prototypes rather than definitions to represent Action concepts, thus allowing typological variation across languages to be displayed in a simple and informative manner. In the Italian corpus, among 677 verbs referring to Action, 106 are General, each of them comprising 3 to 15 action types. This subset accounts for the majority of the cases in which there is reference to Physical Action and is for this reason a core area in the semantic knowledge of the language. Data regarding semantic variation can emerge only if a large enough variety of interactive contexts is recorded. As a whole, the incidence of metaphorical and phraseological usages in the verb occurrences is high (39%) and is higher still in formal uses of language. Reference to Action is concentrated in informal, interactive contexts, especially in interactions with children in the early phases of language acquisition, which also attest to a higher variation of verbs across action types.