How to do Linguistics with R

Data exploration and statistical analysis

Université catholique de Louvain
ISBN 9789027212245 | EUR 105.00 | USD 158.00
ISBN 9789027212252 | EUR 36.00 | USD 54.00
ISBN 9789027268457 | EUR 105.00/36.00* | USD 158.00/54.00*

This book provides a linguist with a statistical toolkit for exploration and analysis of linguistic data. It employs R, a free software environment for statistical computing, which is increasingly popular among linguists. How to do Linguistics with R: Data exploration and statistical analysis is unique in its scope, as it covers a wide range of classical and cutting-edge statistical methods, including different flavours of regression analysis and ANOVA, random forests and conditional inference trees, as well as specific linguistic approaches, among which are Behavioural Profiles, Vector Space Models and various measures of association between words and constructions. The statistical topics are presented comprehensively, but without too much technical detail, and illustrated with linguistic case studies that answer non-trivial research questions. The book also demonstrates how to visualize linguistic data with the help of attractive informative graphs, including the popular ggplot2 system and Google visualization tools.

This book has a companion website.

[Not in series, 195]  2015.  xi, 443 pp.
Publishing status: Available
Table of Contents
Chapter 1. What is statistics?: Main statistical notions and principles
Chapter 2. Introduction to R
Chapter 3. Descriptive statistics for quantitative variables
Chapter 4. How to explore qualitative variables: proportions and their visualizations
Chapter 5. Comparing two groups: t-test and Wilcoxon and Mann-Whitney tests for independent and dependent samples
Chapter 6. Relationships between two quantitative variables: Correlation analysis with elements of linear regression modelling
Chapter 7. More on frequencies and reaction times: Linear regression
Chapter 8. Finding differences between several groups: Sign language, linguistic relativity and ANOVA
Chapter 9. Measuring associations between two categorical variables: Conceptual metaphors and tests of independence
Chapter 10. Association measures: collocations and collostructions
Chapter 11. Geographic variation of quite: Distinctive collexeme analysis
Chapter 12. Probabilistic multifactorial grammar and lexicology: Binomial logistic regression
Chapter 13. Multinomial (polytomous) logistic regression models of three and more near synonyms
Chapter 14. Conditional inference trees and random forests
Chapter 15. Behavioural profiles, distance metrics and cluster analysis
Chapter 16. Introduction to Semantic Vector Spaces: Cosine as a measure of semantic similarity
Chapter 17. Language and space: Dialects, maps and Multidimensional Scaling
Chapter 18. Multidimensional analysis of register variation: Principal Components Analysis and Factor Analysis
Chapter 19. Exemplars, categories, prototypes: Simple and multiple correspondence analysis
Chapter 20. Constructional change and motion charts
The most important R objects and basic operations with them
Main plotting functions and graphical parameters in R
Subject Index
Index of R functions and packages
“Levshina’s book achieves something few other books on doing linguistics with R have achieved. She has written a book that makes sense even for novice users of R and for linguists not accustomed to statistical computing. Levshina writes in a pedagogically sensitive style, in friendly language, and with just the right amount of explanatory prose to lead the reader to insightful analyses. Taken together, the chapters introduce the reader to a sparkling variety of statistical methods. Best of all, from the point of view of linguists, real *linguistic* problems take centre stage throughout the book – the statistical methods are the means to answer intriguing linguistic questions, not an end in themselves.”
“This is a fantastic textbook: extremely comprehensive (the book surveys almost all major analysis techniques used in the linguistic literature – from descriptive statistics over regression analysis to semantic vector space modeling), well-written, and with a much appreciated emphasis on good data visualization. Both beginning and more experienced quantitative linguists will find this book an invaluable resource.”
BIC Subject: CFX – Computational linguistics
BISAC Subject: LAN009000 – LANGUAGE ARTS & DISCIPLINES / Linguistics / General
U.S. Library of Congress Control Number:  2015016708