How to do Linguistics with R

Data exploration and statistical analysis

Natalia Levshina | Université catholique de Louvain
Hardbound: Available
ISBN 9789027212245 | EUR 105.00 | USD 158.00
 
Paperback: Available
ISBN 9789027212252 | EUR 36.00 | USD 54.00
 
e-Book
ISBN 9789027268457 | EUR 105.00/36.00* | USD 158.00/54.00*
 

This book provides a linguist with a statistical toolkit for exploration and analysis of linguistic data. It employs R, a free software environment for statistical computing, which is increasingly popular among linguists. How to do Linguistics with R: Data exploration and statistical analysis is unique in its scope, as it covers a wide range of classical and cutting-edge statistical methods, including different flavours of regression analysis and ANOVA, random forests and conditional inference trees, as well as specific linguistic approaches, among which are Behavioural Profiles, Vector Space Models and various measures of association between words and constructions. The statistical topics are presented comprehensively, but without too much technical detail, and illustrated with linguistic case studies that answer non-trivial research questions. The book also demonstrates how to visualize linguistic data with the help of attractive, informative graphs, including the popular ggplot2 system and Google visualization tools.

This book has a companion website: http://doi.org/10.1075/z.195.website

[Not in series, 195]  2015.  xi, 443 pp.
Publishing status: Available
Table of Contents
Acknowledgements | xi–xii
Introduction | 1–6
Chapter 1. What is statistics?: Main statistical notions and principles | 7–20
Chapter 2. Introduction to R | 21–40
Chapter 3. Descriptive statistics for quantitative variables | 41–68
Chapter 4. How to explore qualitative variables: proportions and their visualizations | 69–86
Chapter 5. Comparing two groups: t-test and Wilcoxon and Mann-Whitney tests for independent and dependent samples | 87–114
Chapter 6. Relationships between two quantitative variables: Correlation analysis with elements of linear regression modelling | 115–138
Chapter 7. More on frequencies and reaction times: Linear regression | 139–170
Chapter 8. Finding differences between several groups: Sign language, linguistic relativity and ANOVA | 171–198
Chapter 9. Measuring associations between two categorical variables: Conceptual metaphors and tests of independence | 199–222
Chapter 10. Association measures: collocations and collostructions | 223–240
Chapter 11. Geographic variation of quite: Distinctive collexeme analysis | 241–252
Chapter 12. Probabilistic multifactorial grammar and lexicology: Binomial logistic regression | 253–276
Chapter 13. Multinomial (polytomous) logistic regression models of three and more near synonyms | 277–290
Chapter 14. Conditional inference trees and random forests | 291–300
Chapter 15. Behavioural profiles, distance metrics and cluster analysis | 301–322
Chapter 16. Introduction to Semantic Vector Spaces: Cosine as a measure of semantic similarity | 323–332
Chapter 17. Language and space: Dialects, maps and Multidimensional Scaling | 333–350
Chapter 18. Multidimensional analysis of register variation: Principal Components Analysis and Factor Analysis | 351–366
Chapter 19. Exemplars, categories, prototypes: Simple and multiple correspondence analysis | 367–386
Chapter 20. Constructional change and motion charts | 387–394
Epilogue | 395–396
The most important R objects and basic operations with them | 397–408
Main plotting functions and graphical parameters in R | 409–424
References | 425–432
Subject Index | 433–440
Index of R functions and packages | 441–443
“Levshina’s book achieves something few other books on doing linguistics with R have achieved. She has written a book that makes sense even for novice users of R and for linguists not accustomed to statistical computing. Levshina writes in a pedagogically sensitive style, in friendly language, and with just the right amount of explanatory prose to lead the reader to insightful analyses. Taken together, the chapters introduce the reader to a sparkling variety of statistical methods. Best of all, from the point of view of linguists, real *linguistic* problems take centre stage throughout the book – the statistical methods are the means to answer intriguing linguistic questions, not an end in themselves.”
“This is a fantastic textbook: extremely comprehensive (the book surveys almost all major analysis techniques used in the linguistic literature – from descriptive statistics over regression analysis to semantic vector space modeling), well-written, and with a much appreciated emphasis on good data visualization. Both beginning and more experienced quantitative linguists will find this book an invaluable resource.”
Cited by other publications

No author info given
2016. Moisl, H. (2015). Cluster Analysis for Corpus Linguistics. International Journal of Corpus Linguistics 21:4 pp. 581 ff.
No author info given
2018. In Tag Questions in Conversation [Studies in Corpus Linguistics, 83]
No author info given
2018. In Threatening in English [Pragmatics & Beyond New Series, 284]
No author info given
2019. In Sensory Linguistics [Converging Evidence in Language and Communication Research, 20]
No author info given
2019. In Sensory Linguistics [Converging Evidence in Language and Communication Research, 20], pp. 235 ff.
No author info given
2020. Pò (‘break’), qiē (‘cut’) and kāi (‘open’) in Chinese. Review of Cognitive Linguistics 18:1
No author info given
2020. A long birth. Diachronica
No author info given
2020. Adverb placement in EFL academic writing. International Journal of Corpus Linguistics 25:2
No author info given
2020. Numeral base, numeral classifier, and noun. Language and Linguistics. 語言暨語言學 21:4
Abed Ibrahim, Lina & István Fekete
2019. What Machine Learning Can Tell Us About the Role of Language Dominance in the Diagnostic Accuracy of German LITMUS Non-word and Sentence Repetition Tasks. Frontiers in Psychology 9
Arnold, Taylor, Nicolas Ballier, Paula Lissón & Lauren Tilton
2019. Beyond lexical frequencies: using R for text analysis in the digital humanities. Language Resources and Evaluation 53:4 pp. 707 ff.
Badryzlova, Yulia & Polina Panicheva
2018. In Artificial Intelligence and Natural Language [Communications in Computer and Information Science, 930], pp. 23 ff.
Blas Arroyo, José Luis
2020. “Madrit nos roba”. Spanish in Context 17:1 pp. 30 ff.
Bolly, Catherine T., Ludivine Crible, Liesbeth Degand & Deniz Uygur-Distexhe
2017. In Pragmatic Markers, Discourse Markers and Modal Particles [Studies in Language Companion Series, 186], pp. 71 ff.
Bolyanatz Brown, Mariška A. & Brandon M.A. Rogers
2019. In Recent Advances in the Study of Spanish Sociophonetic Perception [Issues in Hispanic and Lusophone Linguistics, 21], pp. 212 ff.
Brato, Thorsten
2020. Noun phrase complexity in Ghanaian English. World Englishes
De Smet, Hendrik & Freek Van de Velde
2017. Experimenting on the past: a case study on changing analysability in English ly-adverbs. English Language and Linguistics 21:2 pp. 317 ff.
De Smet, Isabeau & Freek Van de Velde
2019. Reassessing the evolution of West Germanic preterite inflection. Diachronica 36:2 pp. 139 ff.
Deibel, Isabel
2020. The contribution of grammar and lexicon to language switching costs: Examining contact-induced languages and their implications for theories of language representation. Bilingualism: Language and Cognition pp. 1 ff.
Deshors, Sandra C.
2020. Contextualizing Past Tenses in L2: Combined Effects and Interactions in the Present Perfect versus Simple Past Alternation. Applied Linguistics
Deshors, Sandra C.
2019. English as a Lingua Franca: A random forests approach to particle placement in multi‐speaker interactions. International Journal of Applied Linguistics
Divjak, Dagmar
2017. The Role of Lexical Frequency in the Acceptability of Syntactic Variants: Evidence From that-Clauses in Polish. Cognitive Science 41:2 pp. 354 ff.
Divjak, Dagmar
2019. In Frequency in Language
Estellés Arguedas, Maria
2020. The Evolution of Parliamentary Debates in Light of the Evolution of Evidentials: Al Parecer and Por lo Visto in 40 Years of Parliamentary Proceedings from Spain. Corpus Pragmatics 4:1 pp. 59 ff.
Fronhofer, Nina-Maria
2019. In Emotion in Discourse [Pragmatics & Beyond New Series, 302], pp. 213 ff.
Fuchs, Robert, Bertus van Rooy & Ulrike Gut
2019. In Corpus Linguistics and African Englishes [Studies in Corpus Linguistics, 88], pp. 38 ff.
García-Castro, Laura
2020. Finite and non-finite complement clauses in postcolonial Englishes. World Englishes
Gerasimova, Anastasia & Ekaterina Lyutikova
2020. Intralingual Variation in Acceptability Judgments and Production: Three Case Studies in Russian Grammar. Frontiers in Psychology 11
Her, One-Soon & Marc Tang
2020. A Statistical Explanation of the Distribution of Sortal Classifiers in Languages of the World via Computational Classifiers. Journal of Quantitative Linguistics 27:2 pp. 93 ff.
Hilpert, Martin & Susanne Flach
2020. Disentangling modal meanings with distributional semantics. Digital Scholarship in the Humanities
Ioannou, Georgios
2017. A corpus-based analysis of the verb pleróo in Ancient Greek. Review of Cognitive Linguistics 15:1 pp. 253 ff.
Janda, Laura A.
2019. Quantitative perspectives in Cognitive Linguistics. Review of Cognitive Linguistics 17:1 pp. 7 ff.
Johns, Michael A, Jorge R Valdés Kroff & Paola E Dussias
2019. Mixing things up: How blocking and mixing affect the processing of codemixed sentences. International Journal of Bilingualism 23:2 pp. 584 ff.
Kang, Hui & Jiajin Xu
2020. A Multifactorial Analysis of Concessive Clause Positioning. Journal of Quantitative Linguistics pp. 1 ff.
Kim, YouJin, Stephen Skalicky & YeonJoo Jung
2020. The Role of Linguistic Alignment on Question Development in Face‐to‐Face and Synchronous Computer‐Mediated Communication Contexts: A Conceptual Replication Study. Language Learning
Kimps, Ditte, Kristin Davidse & Gerard O’Grady
2019. English tag questions eliciting knowledge or action. Functions of Language 26:1 pp. 86 ff.
Kleiber, Ingo
2019. Review of Brezina, V. (2018) Statistics in Corpus Linguistics: A Practical Guide. International Journal of Corpus Linguistics 24:1 pp. 131 ff.
Kocher, Anna
2018. Epistemic and evidential modification in Spanish and Portuguese. Languages in Contrast 18:1 pp. 99 ff.
Kokorniak, Iwona & Alicja Jajko-Siwek
2018. Expressing i think that in Polish. Review of Cognitive Linguistics 16:1 pp. 229 ff.
Kruger, Haidee
2019. That Again: A Multivariate Analysis of the Factors Conditioning Syntactic Explicitness in Translated English. Across Languages and Cultures 20:1 pp. 1 ff.
Laitinen, Mikko
2018. In Modeling World Englishes [Varieties of English Around the World, G61], pp. 109 ff.
Laitinen, Mikko
2020. Empirical perspectives on English as a lingua franca (ELF) grammar. World Englishes
Leal, Tania
2018. In Critical Reflections on Data in Second Language Acquisition [Language Learning & Language Teaching, 51], pp. 63 ff.
Levshina, Natalia
2017. Measuring iconicity. Functions of Language 24:3 pp. 319 ff.
Lieb, Hans-Heinrich
2018. In Essays on Linguistic Realism [Studies in Language Companion Series, 196], pp. 79 ff.
Lorenz, Eliane
2018. In Foreign Language Education in Multilingual Classrooms [Hamburg Studies on Linguistic Diversity, 7], pp. 331 ff.
Lorenz, Eliane, Richard J Bonnie, Kathrin Feindt, Sharareh Rahbari & Peter Siemund
2019. Cross-linguistic influence in unbalanced bilingual heritage speakers on subsequent language acquisition: Evidence from pronominal object placement in ditransitive clauses. International Journal of Bilingualism 23:6 pp. 1410 ff.
Lorenz, Eliane & Peter Siemund
2019. In International Research on Multilingualism: Breaking with the Monolingual Perspective [Multilingual Education, 35], pp. 81 ff.
Meyers, Charlène
2019. Difficulties in Identifying and Translating Linguistic Metaphors: A Survey and Experiment among Translation Students. English Studies at NBU 5:2 pp. 308 ff.
Mikhailov, Mikhail
2020. Цвет в художественном тексте и в переводе: Опыт корпусного исследования. Scando-Slavica 66:1 pp. 23 ff.
Naccarato, Chiara
2019. Agentive (para)synthetic compounds in Russian: a quantitative study of rival constructions. Morphology 29:1 pp. 1 ff.
Nance, Claire
2020. Bilingual language exposure and the peer group: Acquiring phonetics and phonology in Gaelic Medium Education. International Journal of Bilingualism 24:2 pp. 360 ff.
Nesset, Tore & Anastasia Makarova
2018. The decade construction rivalry in Russian. Diachronica 35:1 pp. 71 ff.
Neumann, Stella
2020. On the interaction between register variation and regional varieties in English. Language, Context and Text. The Social Semiotics Forum 2:1 pp. 121 ff.
Norde, Muriel & Kristel Van Goethem
2018. In The Construction of Words [Studies in Morphology, 4], pp. 475 ff.
Octavio de Toledo y Huerta, Álvaro S.
2019. Large Corpora and Historical Syntax: Consequences for the Study of Morphosyntactic Diffusion in the History of Spanish. Frontiers in Psychology 10
Paquot, Magali & Luke Plonsky
2017. Quantitative research methods and study quality in learner corpus research. International Journal of Learner Corpus Research 3:1 pp. 61 ff.
Percillier, Michael
2017. In Data Analytics in Digital Humanities, pp. 91 ff.
Polo Aparisi, Manuel, Eva Maria Schöll & Sabine Marlene Hille
2018. Alpine Marsh Tits Poecile palustris palustris exhibit no clear sexual dimorphism other than in wing length. Ringing & Migration 33:1 pp. 36 ff.
Radu, Malina, Laura Colantoni, Gabrielle Klassen, Matthew Patience, Ana Teresa Pérez-Leroux & Olga Tararova
2018. The perception and interpretation of sentence types by L1 Spanish–L2 English speakers. Linguistic Approaches to Bilingualism
Rastelli, Stefano
2019. The discontinuity model: Statistical and grammatical learning in adult second-language acquisition. Language Acquisition 26:4 pp. 387 ff.
Roothooft, Hanne & Ruth Breeze
2016. A comparison of EFL teachers’ and students’ attitudes to oral corrective feedback. Language Awareness 25:4 pp. 318 ff.
Rosemeyer, Malte
2019. Actual and apparent change in Brazilian Portuguese wh-interrogatives. Language Variation and Change 31:2 pp. 165 ff.
Rusk, Brian V., Johanne Paradis & Juhani Järvikivi
2020. Comprehension of English plural-singular marking by Mandarin-L1, early L2-immersion learners. Applied Psycholinguistics pp. 1 ff.
Röthlisberger, Melanie & Benedikt Szmrecsanyi
2019. In Handbook of the Changing World Language Map, pp. 1 ff.
Röthlisberger, Melanie & Benedikt Szmrecsanyi
2020. In Handbook of the Changing World Language Map, pp. 131 ff.
Rühlemann, Christoph
2020. In Visual Linguistics with R
Schweinberger, Martin
2020. In Corpora and the Changing Society [Studies in Corpus Linguistics, 96], pp. 223 ff.
Silvennoinen, Olli O.
2018. Constructional schemas in variation. Constructions and Frames 10:1 pp. 1 ff.
Sommerer, Lotte & Klaus Hofmann
2020. Constructional competition and network reconfiguration: investigating sum(e) in Old, Middle and Early Modern English. English Language and Linguistics pp. 1 ff.
Spina, Stefania
2019. Role of Emoticons as Structural Markers in Twitter Interactions. Discourse Processes 56:4 pp. 345 ff.
Stange, Ulrike
2020. “Holding Grudges Is So Last Century”: The Use of GenX So as a Modifier of Noun Phrases. Journal of English Linguistics 48:2 pp. 107 ff.
Susanti, Yuni, Takenobu Tokunaga, Hitoshi Nishikawa & Hiroyuki Obari
2017. Controlling item difficulty for automatic vocabulary question generation. Research and Practice in Technology Enhanced Learning 12:1
Szmrecsanyi, Benedikt, Jason Grafmiller & Laura Rosseel
2019. Variation-Based Distance and Similarity Modeling: A Case Study in World Englishes. Frontiers in Artificial Intelligence 2
Szmrecsanyi, Benedikt & Melanie Röthlisberger
2019. In The Cambridge Handbook of World Englishes, pp. 534 ff.
Tagarro, Pablo M. & Nerea Suárez-González
2019. Laurel J. Brinton (ed.). 2017. English historical linguistics: Approaches and perspectives. Studies in Language 43:4 pp. 1038 ff.
Tait, Alastair W., Emma J. Gagen, Siobhan A. Wilson, Andrew G. Tomkins & Gordon Southam
2017. Microbial Populations of Stony Meteorites: Substrate Controls on First Colonizers. Frontiers in Microbiology 8
Tamaredo, Iván, Melanie Röthlisberger, Jason Grafmiller & Benedikt Heller
2020. Probabilistic indigenization effects at the lexis–syntax interface. English Language and Linguistics 24:2 pp. 413 ff.
Tang, Marc & I-Ping Wan
2020. In From Minimal Contrast to Meaning Construct [Frontiers in Chinese Linguistics, 9], pp. 289 ff.
Tantucci, Vittorio & Aiqing Wang
2020. Diachronic change of rapport orientation and sentence-periphery in Mandarin. Discourse Studies 22:2 pp. 146 ff.
Tizón-Couto, David
2017. Exploring the Left Dislocation construction by means of multiple linear regression. Belgian Journal of Linguistics 31 pp. 301 ff.
Tizón-Couto, David
2018. In Explorations in English Historical Syntax [Studies in Language Companion Series, 198], pp. 203 ff.
Trye, David, Andreea S. Calude, Felipe Bravo-Marquez & Te Taka Keegan
2020. Hybrid Hashtags: #YouKnowYoureAKiwiWhen Your Tweet Contains Māori and English. Frontiers in Artificial Intelligence 3
Velupillai, Viveka
2019. Gendered inanimates in Shetland dialect. English World-Wide 40:3 pp. 269 ff.
Viola, Lorella
2020. From linguistic innovation to language change. Revue Romane. Langue et littérature. International Journal of Romance Languages and Literatures 55:1 pp. 95 ff.
Wagner, Susanne
2019. Why very good in India might be pretty good in North America. International Journal of Corpus Linguistics 24:4 pp. 445 ff.
Woodin, Greg & Bodo Winter
2018. Placing Abstract Concepts in Space: Quantity, Time and Emotional Valence. Frontiers in Psychology 9
Yao, Xinyue & Peter Collins
2019. Developments in Australian, British, and American English Grammar from 1931 to 2006: An Aggregate, Comparative Approach to Dialectal Variation and Change. Journal of English Linguistics 47:2 pp. 120 ff.
Çandarlı, Duygu
2018. Changes in L2 writers’ self-reported metalinguistic knowledge of lexical phrases over one academic year. The Language Learning Journal pp. 1 ff.

This list is based on CrossRef data as of 02 July 2020. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.

References

References

Agresti, A.
(2002) Categorical Data Analysis (2nd ed.). Hoboken, NJ: Wiley. Crossref link
Allan, L.G.
(1980) A note on measurement of contingency between two binary variables in judgment tasks. Bulletin of the Psychonomic Society, 15, 147–149. Crossref link
Anishchanka, A.
(2013) Seeing it in color: A usage-based perspective on color naming in advertising. PhD diss., University of Leuven.
Arppe, A., Han, W., & Newman, J.
(2013) Polytomous logistic regression with Shanghainese topic markers. Vignette, CRAN-R Project. http://​cran​.r​-project​.org​/web​/packages​/polytomous​/vignettes​/shanghainese​.pdf (last access 13.12.2014).
Atkins, B.T.S.
(1987) Semantic ID tags: Corpus evidence for dictionary senses. The uses of large text databases. Proceedings of the Third Annual Conference of the UW Centre for the New Oxford English Dictionary (pp. 17–36). Waterloo, Canada.
Baayen, R.
H (2008) Analyzing Linguistic Data. A Practical Introduction to Statistics Using R. Cambridge: Cambridge University Press. Crossref link
Balota, D.A., Yap, M.J., & Cortese, M.J.., et al.
(2007) The English Lexicon Project. Behavior Research Methods, 39(3), 445–459. Crossref link
Barnbrook, G., Mason, O., & Krishnamurthy, R.
(2013) Collocation: Applications and Implications. Basingstoke, Hampshire: Palgrave Macmillan. Crossref link
Bates, E., & Goodman, J.C.
(1997) On the inseparability of grammar and the lexicon: Evidence from acquisition, aphasia and real-time processing. Language and Cognitive Processes, 12(5/6), 507–586. Crossref link
Berlin, B., & Kay, P.
(1969) Basic Color Terms: Their Universality and Evolution. Berkeley, CA: University of California Press.
Biber, D.
(1988) Variation Across Speech and Writing. Cambridge: Cambridge University Press. Crossref link
Bloomfield, L.
(1935) Language. London: Allen & Unwin.
Borg, I., & Groenen, P.
(1997) Modern Multidimensional Scaling: Theory and Applications. New York: Springer. Crossref link
Boroditsky, L.
(2001) Does language shape thought?: Mandarin and English speakers’ conceptions of time. Cognitive Psychology, 43, 1–22. Crossref link
Bowerman, M., & Choi, S.
(2003) Space under construction: Language-specific spatial categorization in first language acquisition. In D. Gentner & S. Goldin-Meadow (Eds.), Language in Mind: Advances in the Study of Language and Thought (pp. 387–427). Cambridge, MA: MIT Press.
Bresnan, J., & Hay, J.
(2008) Gradient Grammar: An effect of animacy on the syntax of give in New Zealand and American English. Lingua, 118(2), 245–259. Crossref link
Brugman, C.
(1988 [1981]) The Story of Over: Polysemy, Semantics and the Structure of the Lexicon. New York: Garland.
Bullinaria, J.A., & Levy, J.P.
(2007) Extracting semantic representations from word co-occurrence statistics: A Computational Study. Behavior Research Methods, 39, 510–526. Crossref link
Bybee, J.
(2001) Phonology and language use. Cambridge: Cambridge University Press. Crossref link
Chambers, J.
(2008) Software for Data Analysis: Programming with R. New York: Springer. Crossref link
Chang, W.
(2012) R Graphics Cookbook. Sebastopol, CA: O’Reilly Media.
Chomsky, N.
(1957) Syntactic Structures. The Hague: Mouton.
Conover, W.J.
(1999) Practical Nonparametric Statistics (3rd ed.). New York: Wiley.
Conover, W.J., Johnson, M.E., & Johnson, M.M.
(1981) A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. Technometrics, 23, 351–361. Crossref link
Cox, T.F., & Cox, M.A.A.
(2001) Multidimensional Scaling (2nd ed.). Boca Raton, FL: Chapman and Hall/CRC Press.
Crawley, M.J.
(2007) The R Book. Chichester: Wiley. Crossref link
Dąbrowska, E.
(2009) Words as constructions. In V. Evans & S. Pourcel (Eds.), New Directions in Cognitive Linguistics (pp. 201–223). Amsterdam: John Benjamins. Crossref link
Davies, M.
(2008) The Corpus of Contemporary American English: 450 million words, 1990 – present. Available online at http://​corpus​.byu​.edu​/coca/.
(2011) N-grams and word frequency data from the Corpus of Historical American English (COHA). Available online at http://​www​.ngrams​.info.
(2013) Corpus of Global Web-Based English: 1.9 billion words from speakers in 20 countries. Available online at http://​corpus2​.byu​.edu​/glowbe/.
de Leeuw, J.
(1977) Applications of convex analysis to multidimensional scaling. In J. Barra, F. Brodeau, G. Romier, & B.V. Cutsem (Eds.), Recent Developments in Statistics (pp. 133–145). Amsterdam: North Holland Publishing Company.
Deerwester, S., Dumais, S.T., Furnas, G.W., Landayer, T.K., & Harshman, R.
(1990) Indexing by Latent Semantic Analysis. Journal of the American Society for Information Science, 41, 391–407. Crossref link
Diessel, H.
(2007) Frequency effects in language acquisition, language use, and diachronic change. New Ideas in Psychology, 25, 108–127. Crossref link
Divjak, D.
(2003) On trying in Russian: A tentative network model for near(er) synonyms. In Belgian Contributions to the 13th International Congress of Slavicists , Ljubljana, 15-21 August 2003. Special issue of Slavica Gandensia . (pp. 25–58).
Divjak, D., & Gries, S. Th
(2006) Ways of trying in Russian: Clustering behavioral profiles. Corpus Linguistics and Linguistic Theory, 2, 23–60. Crossref link
(2009) Corpus-based cognitive semantics: A contrastive study of phasal verbs in English and Russian. In K. Dziwirek & B. Lewandowska-Tomaszczyk (Eds.), Studies in Cognitive Corpus Linguistics (pp. 273–296). Frankfurt am Main: Peter Lang.
Dunning, T.
(1993) Accurate methods for the statistics of surprise and coincidence. Computational Linguistics, 19(1), 61–74.
Ellis, N.
(2006) Language acquisition as rational contingency learning. Applied Linguistics, 27(1), 1–24. Crossref link
Ellis, N., & Ferreira-Junior, F.
G (2009) Constructions and their acquisition: Islands and the distinctiveness of their occupancy. Annual Review of Cognitive Linguistics, 7, 188–221. Crossref link
Ember, C.R., & Ember, M.
(2007) Climate, econiche, and sexuality: Influences on sonority inlanguage. American Anthropologist, 109(1), 180–185. Crossref link
Everett, D.
(2005) Cultural Constraints on Grammar and Cognition in Pirahã: Another Look at the Design Features of Human Language. Current Anthropology, 46, 621–646. Crossref link
Evert, S.
(2004) The Statistics of Word Cooccurrences: Word Pairs and Collocations. IMS, University of Stuttgart.
Everitt, B., & Hothorn, T.
(2011) An Introduction to Applied Multivariate Analysis with R. New York: Springer. Crossref link
Everitt, B.S., Landau, S., Leese, M., & Stahl, D.
(2011) Cluster Analysis (5th ed.). Chichester: Wiley. Crossref link
Faraway, J.J.
(2009) Linear Models with R. Boca Raton, FL: Chapman and Hall/CRC Press.
Fox, J.
(2008) Applied Regression Analysis and Generalized Linear Models (2nd ed.). Thousand Oaks, CA: Sage Publications.
Field, A., Miles, J., & Field, Z.
(2012) Discovering Statistics Using R. Los Angeles: Sage.
Firth, J.R.
(1957) A synopsis of linguistic theory 1930–1955. In J.R. Firth (Ed.), Studies in Linguistic Analysis (pp. 1–32). Oxford: Blackwell.
Friendly, M.
(1996) Paivio, et al. Word List Generator, Online application. Retrieved April 28 2013, from http://​www​.datavis​.ca​/online​/paivio/
Geeraerts, D.
(1999) Idealist and empiricist tendencies in cognitive linguistics. In T. Janssen & G. Redeker (Eds.), Cognitive Linguistics: Foundations, Scope, and Methodology (pp. 163–194). Berlin/New York: Mouton de Gruyter. Crossref link
(2010) Theories of Lexical Semantics. Oxford: Oxford University Press.
Gilquin, G.
(2006) The place of prototypicality in corpus linguistics: Causation in the hot seat. In S. Th. Gries & A. Stefanowitsch (Eds.), Corpora in Cognitive Linguistics: Corpus-Based Approaches to Syntax and Lexis (pp. 159–191). Berlin/New York: Mouton de Gruyter.
(2010) Corpus, Cognition and Causative Constructions. Amsterdam: John Benjamins. Crossref link
Gipper, H.
(1959) Sessel oder Stuhl? Ein Beitrag zur Bestimmung von Wortinhalten im Bereich der Sachkultur. In H. Gipper (Ed.), Sprache – Schlüssel zur Welt: Festschrift für Leo Weisgerber (pp. 271–92). Düsseldorf: Schwann.
Goldberg, A.E., Casenhiser, D., & Sethuraman, N.
(2004) Learning argument structure generalizations. Cognitive Linguistics, 14(3), 289–316.
Gower, J.C.
(1971) A general coefficient of similarity and some of its properties. Biometrics, 27, 857–874. Crossref link
Greenacre, M.
(2007) Correspondence Analysis in Practice (2nd ed.). Boca Raton, FL: Chapman and Hall/CRC Press. Crossref link
Gries, S. Th
(2004) Coll.analysis 3. A program for R for Windows 2.x.
(2006) Corpus-based methods and Cognitive Semantics: The many senses of to run . In S. Th. Gries & A. Stefanowitsch (Eds.), Corpora in Cognitive Linguistics. Corpus-based Approaches to Syntax and Lexis (pp. 57–99). Berlin/New York: Mouton de Gruyter. Crossref link
(2008) Dispersions and adjusted frequencies in corpora. International Journal of Corpus Linguistics, 13(4), 403–437. Crossref link
(2009) Quantitative Corpus Linguistics with R: A Practical Introduction. New York/London: Routledge. Crossref link
(2012) Behavioral Profiles: A fine-grained and quantitative approach in corpus-based lexical semantics. In G. Jarema, G. Libben, & C. Westbury (Eds.), Methodological and Analytic Frontiers in Lexical Research (pp. 57–80). Amsterdam: John Benjamins. Crossref link
(2013) Statistics for Linguistics with R. Berlin/New York: De Gruyter Mouton. Crossref link
Gries, S. Th., Hampe, B., & Schönefeld, D.
(2005) Converging evidence: Bringing together experimental and corpus data on the association of verbs and constructions. Cognitive Linguistics, 16(4), 635–676. Crossref link
Gries, S. Th., & Stefanowitsch, A.
(2004) Extending collostructional analysis: A corpus-based perspective on ‘alternations’. International Journal of Corpus Linguistics, 9(1), 97–129. Crossref link
Hanks, P.
(1996) Contextual dependency and lexical sets. International Journal of Corpus Linguistics, 1(1), 75–98. Crossref link
Harrell, F.E.
(2001) Regression Modeling Strategies. With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York: Springer.
Harris, Z.
(1954) Distributional structure. Word, 10(2/3), 146–162. Crossref link
Hilpert, M.
(2011) Dynamic visualizations of language change: Motion charts on the basis of bivariate and multivariate data from diachronic corpora. International Journal of Corpus Linguistics, 16(4), 435–461. Crossref link
(2013) Constructional Change in English: Developments in Allomorphy, Word Formation, and Syntax. Cambridge: Cambridge University Press. Crossref link
Hosmer, D.W., & Lemeshow, S.
(2000) Applied Logistic Regression. New York: Wiley. Crossref link
Hothorn, T., Hornik, K., & Zeileis, A.
(2006) Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651–674. Crossref link
Huck, S.W.
(2009) Statistical Misconceptions. New York/London: Routledge.
Husson, F., , S., & Pagès, J.
(2010) Exploratory Multivariate Analysis by Example Using R. Boca Raton, FL: Chapman and Hall/CRC Press. Crossref link
Itkonen, E.
(1980) Qualitative vs. quantitative analysis in linguistics. In T.A. Perry (Ed.), Evidence and Argumentation in Linguistics (pp. 334–366). Berlin: Mouton.
Johnson, K.
(2008) Quantiative Methods in Linguistics. Malden, MA: Blackwell Publishing.
Kaufman, L., & Rousseeuw, P.J.
(1990) Finding Groups in Data: An Introduction to Cluster Analysis. New York: Wiley-Interscience. Crossref link
Kay, P., & McDaniel, C.K.
(1978) The linguistic significance of the meanings of Basic Color Terms. Language, 54(3), 610–646. Crossref link
Kepser, S., & Reis, M.
(2005) Evidence in Linguistics. In S. Kepser & M. Reis (Eds.), Linguistic Evidence: Empirical, Theoretical and Computational Perspectives (pp. 1–6). Berlin/New York: Mouton de Gruyter. Crossref link
Keuleers, E., Lacey, P., Rastle, K., & Brysbaert, M.
(2012) The British Lexicon Project: Lexical decision data for 28,730 monosyllabic and disyllabic English words. Behavior Research Methods, 44(1), 287–304. Crossref link
Kortmann, B., & Lunkenheimer, K.
(Eds.) (2013) The Electronic World Atlas of Varieties of English. Leipzig: Max Planck Institute for Evolutionary Anthropology. Retrieved from http://​ewave​-atlas​.org
Kruskal, J.B.
(1964) Multidimensional Scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29(1), 1–27.
Kučera, H., & Francis, W.N.
(1967) Computational Analysis of Present-day American English. Providence: Brown University Press.
Lakoff, G., & Johnson, M.
(1980) Metaphors We Live By. Chicago: University of Chicago Press.
Landauer, T.K., & Dumais, S.T.
(1997) A solution to Plato’s problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211–240.
Langacker, R.W.
(1987) Foundations of Cognitive Grammar: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Larson-Hall, J.
(2010) A Guide to Doing Statistics in Second Language Research Using SPSS. New York: Routledge.
Lehrer, A.
(1974) Semantic Fields and Lexical Structure. Amsterdam: North Holland Publishing Company.
Levshina, N.
(2011) Doe wat je niet laten kan [Do what you cannot let]: A usage-based analysis of Dutch causative constructions. PhD diss., University of Leuven.
(2014) Geographic variation of quite + ADJ in twenty national varieties of English: A pilot study. Yearbook of the German Cognitive Linguistics Association, 2, 109–126.
(In preparation) Convergent evidence of divergent knowledge: A study of the associations between the Russian ditransitive construction and its collexemes.
Levshina, N., Geeraerts, D., & Speelman, D.
(2011) Changing the world vs. changing the mind: Distinctive collexeme analysis of the causative construction with doen in Belgian and Netherlandic Dutch. In F. Gregersen, J. Parrot, & P. Quist (Eds.), Language variation - European perspectives III. Selected papers from the 5th International Conference on Language Variation in Europe, Copenhagen, June 2009 (pp. 111–123). Amsterdam: John Benjamins.
(2013) Towards a 3D-Grammar: Interaction of linguistic and extralinguistic factors in the use of Dutch causative constructions. Journal of Pragmatics, 52, 34–48.
Levshina, N., & Heylen, K.
(2014) A radically data-driven construction grammar: Experiments with Dutch causative constructions. In R. Boogaart, T. Colleman, & G. Rutten (Eds.), Extending the Scope of Construction Grammar (pp. 17–46). Berlin/New York: Mouton de Gruyter.
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L.
(2013) Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, 49, 764–766.
Lijffijt, J., & Gries, S. Th.
(2012) Correction to “Dispersions and adjusted frequencies in corpora”. International Journal of Corpus Linguistics, 17(1), 147–149.
Lin, D.
(1998) Automatic retrieval and clustering of similar words. Proceedings of the 17th International Conference on Computational Linguistics, Montreal, Canada, August 1998 (pp. 768–774).
Louviere, J.J., Hensher, D.A., & Swait, J.D.
(2000) Stated Choice Methods: Analysis and application. Cambridge: Cambridge University Press.
Lund, K., & Burgess, C.
(1996) Producing high-dimensional semantic spaces from lexical co-occurrences. Behavior Research Methods, Instruments, & Computers, 28, 203–208.
Manning, C., & Schütze, H.
(1999) Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
Matloff, N.
(2011) The Art of R Programming: A Tour of Statistical Software Design. San Francisco: No Starch Press.
Michelbacher, L., Evert, S., & Schütze, H.
(2011) Asymmetry in corpus-derived and human word associations. Corpus Linguistics and Linguistic Theory, 7(2), 245–276.
Miller, G.A., & Charles, W.G.
(1991) Contextual correlates of semantic similarity. Language and Cognitive Processes, 6(1), 1–28.
Mitchell, J., & Lapata, M.
(2010) Composition in distributional models of semantics. Cognitive Science, 34(8), 1388–1439.
Newman, J.
(2011) Corpora and cognitive linguistics. Brazilian Journal of Applied Linguistics, 11(2), 521–559.
Núñez, R.E., & Sweetser, E.
(2006) With the future behind them: Convergent evidence from Aymara language and gesture in the crosslinguistic comparison of spatial construals of time. Cognitive Science, 30, 401–450.
Pado, S., & Lapata, M.
(2007) Dependency-based construction of Semantic Space Models. Computational Linguistics, 33(2), 161–199.
Peirsman, Y.
(2008) Word Space Models of semantic similarity and relatedness. In Proceedings of the ESSLLI-2008 Student Session, Hamburg, Germany.
Peirsman, Y., Heylen, K., & Geeraerts, D.
(2010) Applying Word Space Models to sociolinguistics. Religion names before and after 9/11. In D. Geeraerts, G. Kristiansen, & Y. Peirsman (Eds.), Recent Advances in Cognitive Sociolinguistics (pp. 111–137). Berlin/New York: Mouton de Gruyter.
Paivio, A., Yuille, J.C., & Madigan, S.
(1968) Concreteness, imagery, and meaningfulness values for 925 nouns. Journal of Experimental Psychology, 76(1, Pt. 2), 1–25.
Paradis, C.
(1997) Degree Modifiers of Adjectives in Spoken British English. Lund: Lund University Press.
Rosch Heider, E., & Olivier, D.C.
(1972) The structure of the color space in naming and memory for two languages. Cognitive Psychology, 3, 337–345.
Rosch, E.
(1975) Cognitive representation of semantic categories. Journal of Experimental Psychology, 104(3), 192–233.
Rosch, E., & Mervis, C.B.
(1975) Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573–605.
Salkind, N.J.
(2011) Statistics for People Who (Think They) Hate Statistics (4th ed.). Los Angeles: Sage.
Schmid, H.-J.
(2000) English Abstract Nouns as Conceptual Shells. From corpus to cognition. Berlin/New York: Mouton de Gruyter.
Schütze, H.
(1992) Dimensions of meaning. In Proceedings of Supercomputing 92 (pp. 787–796). Minneapolis, MN.
Senghas, A., & Coppola, M.
(2001) Children creating language: How Nicaraguan Sign Language acquired a spatial grammar. Psychological Science, 12(4), 323–328.
Senghas, A., Kita, S., & Özyürek, A.
(2004) Children creating core properties of language: Evidence from an emerging Sign Language in Nicaragua. Science, 305(5691), 1779–1782.
Sheskin, D.J.
(2011) Handbook of Parametric and Nonparametric Statistical Procedures. Boca Raton, FL: Chapman and Hall/CRC Press.
Speelman, D., & Geeraerts, D.
(2009) Causes for causatives: The case of Dutch ‘doen’ and ‘laten’. In T. Sanders & E. Sweetser (Eds.), Causal Categories in Discourse and Cognition (pp. 173–204). Berlin/New York: Mouton de Gruyter.
Steels, L.
(Ed.) (2012) Experiments in Cultural Language Evolution. Amsterdam: John Benjamins.
Steen, G.J., Dorst, A.G., Herrmann, J.B., Kaal, A.A., Krennmayr, T., & Pasma, T.
(2010) A Method for Linguistic Metaphor Identification. From MIP to MIPVU. Amsterdam: John Benjamins.
Stefanowitsch, A.
(2001) Constructing causation: A construction grammar approach to analytic causatives. PhD diss., Rice University.
(2010) Empirical Cognitive Semantics: Some thoughts. In D. Glynn & K. Fischer (Eds.), Quantitative Methods in Cognitive Semantics: Corpus-driven Approaches (pp. 355–380). Berlin/New York: De Gruyter Mouton.
Stefanowitsch, A., & Gries, S. Th.
(2003) Collostructions: Investigating the interaction of words and constructions. International Journal of Corpus Linguistics, 8(2), 209–243.
(2005) Covarying collexemes. Corpus Linguistics and Linguistic Theory, 1(1), 1–43.
Sweetser, E.
(1990) From Etymology to Pragmatics. Cambridge: Cambridge University Press.
Szmrecsanyi, B.
(2010) The English genitive alternation in a cognitive sociolinguistics perspective. In D. Geeraerts, G. Kristiansen, & Y. Peirsman (Eds.), Advances in Cognitive Sociolinguistics (pp. 141–166). Berlin/New York: Mouton de Gruyter.
Tagliamonte, S., & Baayen, R.H.
(2012) Models, forests and trees of York English: Was/were variation as a case study for statistical practice. Language Variation and Change, 24(2), 135–178.
Talmy, L.
(1985) Lexicalization patterns: Semantic structure in lexical forms. In T. Shopen (Ed.), Grammatical Categories and the Lexicon, Vol. III (pp. 57–149). Cambridge: Cambridge University Press.
(2000) Toward a Cognitive Semantics. Cambridge, MA: MIT Press.
Taylor, J.
(2012) The Mental Corpus. How Language is Represented in the Mind. Oxford: Oxford University Press.
Teetor, P.
(2011) R Cookbook. Sebastopol, CA: O’Reilly Media.
Turney, P.D., & Pantel, P.
(2010) From frequency to meaning: Vector Space Models of semantics. Journal of Artificial Intelligence Research, 37, 141–188.
Urdan, T.C.
(2010) Statistics in Plain English (3rd ed.). New York: Routledge.
Verhagen, A., & Kemmer, S.
(1997) Interaction and causation: Causative constructions in modern standard Dutch. Journal of Pragmatics, 27, 61–82.
Verhoeven, J., De Pauw, G., & Kloots, H.
(2004) Speech rate in a pluricentric language: A comparison between Dutch in Belgium and the Netherlands. Language and Speech, 47(3), 297–308.
Wickham, H.
(2009) ggplot2: Elegant Graphics for Data Analysis. New York: Springer.
Wiechmann, D.
(2008) On the computation of Collostruction Strength. Corpus Linguistics and Linguistic Theory, 4(2), 253–290.
Wierzbicka, A.
(2006) English: Meaning and Culture. Oxford: Oxford University Press.
Winke, P., Gass, S., & Sydorenko, T.
(2010) The effects of captioning videos used for foreign language listening activities. Language Learning and Technology, 14(1), 65–86.
Wolk, C., Bresnan, J., Rosenbach, A., & Szmrecsanyi, B.
(2013) Dative and genitive variability in Late Modern English: Exploring cross-constructional variation and change. Diachronica, 30(3), 382–419.
Wulff, S.
(2006) Go-V vs. go-and-V in English: A case of constructional synonymy? In S. Th. Gries & A. Stefanowitsch (Eds.), Corpora in Cognitive Linguistics. Corpus-based Approaches to Syntax and Lexis (pp. 101–125). Berlin/New York: Mouton de Gruyter.
Wulff, S., Gries, S. Th., & Stefanowitsch, A.
(2007) Brutal Brits and persuasive Americans: Variety-specific meaning construction in the into-causative. In G. Radden, K.-M. Köpcke, T. Berg, & P. Siemund (Eds.), Aspects of Meaning Construction (pp. 265–281). Amsterdam: John Benjamins.
Zipf, G.K.
(1935) The Psycho-Biology of Language. Cambridge, MA: MIT Press.
(1949) Human Behavior and the Principle of Least Effort. An Introduction to Human Ecology. Cambridge, MA: Addison Wesley.
Subjects
BIC Subject: CFX – Computational linguistics
BISAC Subject: LAN009000 – LANGUAGE ARTS & DISCIPLINES / Linguistics / General
U.S. Library of Congress Control Number:  2015016708