<TitleType>01</TitleType> <TitleText textformat="02">How to do Linguistics with R</TitleText> <Subtitle textformat="02">Data exploration and statistical analysis</Subtitle>

219-7677 10 7500817 John Benjamins Publishing Company Marketing Department / Karin Plijnaar, Pieter Lamers onix@benjamins.nl 201608250348 ONIX title feed eng 01 EUR

76015968 03 01 01 JB John Benjamins Publishing Company 01 JB code Z 195 Eb 15 9789027268457 06 10.1075/z.195 13 2015019027 DG 002 02 <TitleType>01</TitleType> <TitleText textformat="02">How to do Linguistics with R</TitleText> <Subtitle textformat="02">Data exploration and statistical analysis</Subtitle> 01 z.195 01 https://benjamins.com 02 https://benjamins.com/catalog/z.195 1 A01 Natalia Levshina Levshina, Natalia Natalia Levshina Université catholique de Louvain 01 eng 454 xi 443 LAN009000 v.2006 CFX 2 24 JB Subject Scheme LIN.COGN Cognition and language 24 JB Subject Scheme LIN.COMPUT Computational & corpus linguistics 24 JB Subject Scheme LIN.THEOR Theoretical linguistics 06 01 This book provides a linguist with a statistical toolkit for exploration and analysis of linguistic data. It employs R, a free software environment for statistical computing, which is increasingly popular among linguists. How to do Linguistics with R: Data exploration and statistical analysis is unique in its scope, as it covers a wide range of classical and cutting-edge statistical methods, including different flavours of regression analysis and ANOVA, random forests and conditional inference trees, as well as specific linguistic approaches, among which are Behavioural Profiles, Vector Space Models and various measures of association between words and constructions. The statistical topics are presented comprehensively, but without too much technical detail, and illustrated with linguistic case studies that answer non-trivial research questions. The book also demonstrates how to visualize linguistic data with the help of attractive informative graphs, including the popular ggplot2 system and Google visualization tools. This book has a companion website: <a href="http://doi.org/10.1075/z.195.website">http://doi.org/10.1075/z.195.website</a> 05 Levshina’s book achieves something few other books on doing linguistics with R have achieved. She has written a book that makes sense even for novice users of R and for linguists not accustomed to statistical computing. Levshina writes in a pedagogically sensitive style, in friendly language, and with just the right amount of explanatory prose to lead the reader to insightful analyses. Taken together, the chapters introduce the reader to a sparkling variety of statistical methods. Best of all, from the point of view of linguists, real *linguistic *problems take centre stage throughout the book– the statistical methods are the means to answer intriguing linguistic questions, not an end in themselves. John Newman, University of Alberta 05 This is a fantastic textbook: extremely comprehensive (the book surveys almost all major analysis technique used in the linguistic literature – from descriptive statistics over regression analysis to semantic vector space modeling), well-written, and with a much appreciated emphasis on good data visualization. Both beginning and more experienced quantitative linguists will find this book an invaluable resource. Benedikt Szmrecsanyi, University of Leuven 04 09 01 https://benjamins.com/covers/475/z.195.png 04 03 01 https://benjamins.com/covers/475_jpg/9789027212245.jpg 04 03 01 https://benjamins.com/covers/475_tif/9789027212245.tif 06 09 01 https://benjamins.com/covers/1200_front/z.195.hb.png 07 09 01 https://benjamins.com/covers/125/z.195.png 25 09 01 https://benjamins.com/covers/1200_back/z.195.hb.png 27 09 01 https://benjamins.com/covers/3d_web/z.195.hb.png 10 01 JB code z.195.prelim i vi 6 Prelim pages -1 <TitleType>01</TitleType> <TitleText textformat="02">Prelim pages</TitleText> 10 01 JB code z.195.toc vii x 4 Table of contents 0 <TitleType>01</TitleType> <TitleText textformat="02">Table of contents</TitleText> 10 01 JB code z.195.ack xi xii 2 Acknowledgments 1 <TitleType>01</TitleType> <TitleText textformat="02">Acknowledgements</TitleText> 10 01 JB code z.195.intro 1 6 6 Article 2 <TitleType>01</TitleType> <TitleText textformat="02">Introduction</TitleText> 10 01 JB code z.195.c1 7 20 14 Article 3 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 1. What is statistics?</TitleText> <Subtitle textformat="02">Main statistical notions and principles</Subtitle> 01 What is statistics? What can and cannot statistics do for you? How to formulate and test research hypotheses? What kind of statistical tests are there? These and many other questions are discussed in this chapter. In addition, you will also learn about different types of variables, parametric and non-parametric tests, p-values and many other things which you will need in order to understand explanations provided in the following chapters. 10 01 JB code z.195.c2 21 40 20 Article 4 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 2. Introduction to R</TitleText> 01 In this chapter you will learn to install the basic distribution of R, as well as add-on packages. The chapter also introduces the basics of R syntax and demonstrates how to perform simple operations with different R objects. Special attention is paid to importing and exporting your own data to and from R and saving your graphical output. You will also be able to interpret error messages and warnings that R may give you and search for additional information on R functions. 10 01 JB code z.195.c3 41 68 28 Article 5 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 3. Descriptive statistics for quantitative variables</TitleText> 01 This chapter shows how to compute basic descriptive statistics for a quantitative variable. You will learn the most popular measures of central tendency (the mean, the median and the mode) and dispersion (variance, standard deviation, range, IQR, median absolute deviation). The chapter will also demonstrate how to produce different graphs (box-and-whisker plots, histograms, density plots, Q–Q plots, line charts), which visualize univariate distributions and help one determine whether a variable is normally distributed. From the case studies you will learn how to analyse the distribution of word lengths in a sample, to detect suspicious values in subjects’ reaction times in a lexical decision task, and to correct some problems with the shape of a distribution. 10 01 JB code z.195.c4 69 86 18 Article 6 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 4. How to explore qualitative variables</TitleText> <Subtitle textformat="02">proportions and their visualizations</Subtitle> 01 This chapter demonstrates how to explore a categorical variable with the help of tables of counts and proportions. As in the previous chapter, graphs (pie charts, bar plots and dot charts) will play a very important role. You will also learn how to change values of a categorical variable. In addition, we will discuss how one can use Deviation of Proportions to measure dispersion of words in a corpus. This approach will be illustrated by a case study of the Basic Colour Terms in English. 10 01 JB code z.195.c5 87 114 28 Article 7 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 5. Comparing two groups</TitleText> <Subtitle textformat="02">t-test and Wilcoxon and Mann-Whitney tests for independent and dependent samples</Subtitle> 01 Do language learners who are taught by an innovative method show better results than those who are taught traditionally? Do speakers of one language variety speak faster than speakers of another variety? Do people of one gender use more hedging constructions than people of another? In this chapter, you will learn how to make such comparisons using the parametric t-test and the non-parametric Wilcoxon and Mann-Whitney tests for dependent and independent samples. You will learn how to compute the standard error and confidence intervals for the mean. The case studies will involve differences between high- and low-frequency nouns with regard to the number of associations that they trigger and their abstractness/concreteness scores. 10 01 JB code z.195.c6 115 138 24 Article 8 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 6. Relationships between two quantitative variables</TitleText> <Subtitle textformat="02">Correlation analysis with elements of linear regression modelling</Subtitle> 01 Will your knowledge of statistics improve as you read more and more books on the subject? Is there a relationship between the length of a word and its frequency? Does grammatical proficiency of children depend on the number of lexical items which they have mastered? Does the number of phonemes in a language depend on the number of speakers? All these questions involve correlation between two variables. This chapter explains the principles of correlation analysis and demonstrates how it can be carried out using popular parametric and non-parametric tests. You will also learn how to produce correlograms and scatter plots with a regression line. Some fundamental notions of regression analysis, such as residuals, homo- and heteroscedasticity, will be introduced. The case studies investigate the relationship between word frequency and mean reaction time in a lexical decision task and the correlation between vocabulary size and grammatical proficiency in first language acquisition. 10 01 JB code z.195.c7 139 170 32 Article 9 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 7. More on frequencies and reaction times</TitleText> <Subtitle textformat="02">Linear regression</Subtitle> 01 After the previous chapter has introduced some basic elements of regression analysis, this chapter will provide a more thorough discussion of linear regression. This method enables one to model and explain the relationships between one or more explanatory variables at any level of measurement, on the one hand, and one ratio- or interval-scaled response variable, on the other hand. In addition, one can investigate interactions between explanatory variables. You will learn how to fit a multiple linear regression model, to perform its diagnostics and to interpret the results. You will also learn how to carry out non-parametric linear regression with the help of bootstrap. The case study investigates the relationship between reaction times in a lexical decision task, and such factors as word length, corpus frequency and part of speech of lexical stimuli. In contrast with the previous case studies, all these factors are tested here simultaneously in a multiple linear regression model. 10 01 JB code z.195.c8 171 198 28 Article 10 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 8. Finding differences between several groups</TitleText> <Subtitle textformat="02">Sign language, linguistic relativity and ANOVA</Subtitle> 01 This chapter introduces ANOVA (analysis of variance), a special case of linear regression with binary or categorical independent variables. This method is widely used in experimental linguistics, when the researcher compares several groups of experimental objects that undergo different treatments. In this chapter you will learn several types of ANOVA: one-way ANOVA with one factor as an independent variable, factorial ANOVA with two or more categorical independent variables, and repeated-measures and mixed ANOVA. The methods are illustrated by three case studies. The first two focus on grammatical features of an emergent sign language. The third case study deals with cross-linguistic differences in time conceptualization, which are interpreted as evidence in favour of the linguistic relativity hypothesis. 10 01 JB code z.195.c9 199 222 24 Article 11 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 9. Measuring associations between two categorical variables</TitleText> <Subtitle textformat="02">Conceptual metaphors and tests of independence</Subtitle> 01 This chapter focuses on associations between two categorical variables. You will learn how to measure the association strength using odds ratios, Cramér’s V and the φ-coefficient. You will also learn how to test whether the association is statistically significant with the help of the χ2-test and the Fisher exact test. Bar plots, mosaic plots and association plots are used as visualization tools for cross-tabulated data. All these concepts and tools will be illustrated by case studies of metaphoric and non-metaphoric uses of the preposition over and the verb see in different registers. 10 01 JB code z.195.c10 223 240 18 Article 12 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 10. Association measures</TitleText> <Subtitle textformat="02">collocations and collostructions</Subtitle> 01 Collocations, as well as colligations and other co-occurrence patterns, play an important role in corpus linguistics, psycholinguistics and usage-based grammar and lexicology. To measure the degree of attraction between words and other units, one can use diverse association measures, such as collostructional strength, Pointwise Mutual Information or ΔP. From this chapter you will learn how to compute a variety of association measures using a small set of different co-occurrence frequencies. The case study is based on co-occurrence frequencies of different verbs in the Russian ditransitive construction. 10 01 JB code z.195.c11 241 252 12 Article 13 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 11. Geographic variation of quite: Distinctive collexeme analysis</TitleText> 01 This chapter introduces distinctive collexeme analysis, which employs bidirectional association measures discussed in the previous chapter. This method is based on the co-occurrence frequencies of words that occur in two near-synonymous constructions, or in two or more dialectal or diachronic variants of the same construction. Here we will compare the variants of quite + ADJ constructions in different national varieties of English. We will first present a canonical distinctive collexeme analysis with only two varieties, British and American English, and then will show how this approach can be extended to more lects, presenting a unified approach to multiple distinctive collexeme analysis. 10 01 JB code z.195.c12 253 276 24 Article 14 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 12. Probabilistic multifactorial grammar and lexicology</TitleText> <Subtitle textformat="02">Binomial logistic regression</Subtitle> 01 In this chapter you will learn how to model the speaker’s choice between two near synonymous words or constructions on the basis of contextual features. The most popular statistical tool that is used to create such models is logistic regression. The approach is illustrated by a case study of two Dutch causative auxiliaries. As in the case of linear regression, you will learn how to create, test and interpret a logistic model with the help of different R tools. 10 01 JB code z.195.c13 277 290 14 Article 15 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 13. Multinomial (polytomous) logistic regression models of three and more near synonyms</TitleText> 01 This chapter continues the discussion of logistic regression models, which can be used to predict the speaker’s choice between different near synonyms or variants. This time you will learn to model situations when the number of possible outcomes is greater than two. Such models are called multinomial, or polytomous. The method will be illustrated with a case study of three near synonyms: let, allow and permit. 10 01 JB code z.195.c14 291 300 10 Article 16 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 14. Conditional inference trees and random forests</TitleText> 01 This chapter discusses conditional inference trees and random forests. These are non-parametric tree-structure models of regression and classification that can serve as an alternative to multiple regression. They are especially useful in the presence of many high-order interactions and in situations when the sample size is small, but the number of predictors is large. You will learn how to fit such models, interpret their results and evaluate their quality. The case study that illustrates the techniques deals with three English causative constructions make + V, cause + to V and have + V and identifies the set of independent semantic variables that are important for distinguishing between the constructions. 10 01 JB code z.195.c15 301 322 22 Article 17 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 15. Behavioural profiles, distance metrics and cluster analysis</TitleText> 01 This chapter presents the Behavioural Profiles approach, which involves the comparison of contextual features of words or constructions in a corpus. The chapter also discusses several clustering algorithms, which are based on different distance metrics. Cluster analysis is a family of techniques that can help you discover groups of similar objects in the data. Several popular methods of cluster validation and diagnostics are discussed, which involve the computation of average silhouette widths and multiscale bootstrap resampling. The chapter also demonstrates how to interpret clusters with the help of the snake plot and effect size measures. In addition, you will learn to create and interpret scree plots, which are useful for determining the optimal number of clusters. 10 01 JB code z.195.c16 323 332 10 Article 18 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 16. Introduction to Semantic Vector Spaces</TitleText> <Subtitle textformat="02">Cosine as a measure of semantic similarity</Subtitle> 01 This chapter introduces Semantic Vector Spaces, another distributional approach to semantics. This method originates in Natural Language Processing. Unlike Behavioural Profiles discussed in the previous chapter, it uses automatically extracted co-occurrences of target words and contextual features. The characteristic features of the method are weighted co-occurrence frequencies and the use of the cosine as the most popular similarity measure. This chapter provides a general introduction to the method, with a case study of English cooking verbs as an illustration. 10 01 JB code z.195.c17 333 350 18 Article 19 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 17. Language and space</TitleText> <Subtitle textformat="02">Dialects, maps and Multidimensional Scaling</Subtitle> 01 This chapter introduces another popular method that deals with distance matrices. This method is called Multidimensional Scaling. It is a dimensionality reduction technique that represents distances between objects in a low-dimensional space. You will learn how to perform different types of metric and non-metric scaling and carry out the diagnostics of solutions by using the scree plot, the Shepard plot and goodness-of-fit measures. The chapter also shows how one can use R for creation of geographical maps with points and text labels. Finally, you will learn how to measure the correlation between two distance matrices with the help of the Mantel test. The case studies are based on geographic coordinates and several linguistic features of varieties of English all over the world. 10 01 JB code z.195.c18 351 366 16 Article 20 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 18. Multidimensional analysis of register variation</TitleText> <Subtitle textformat="02">Principal Components Analysis and Factor Analysis</Subtitle> 01 In this chapter you will learn about Principal Components Analysis and Factor Analysis. The aim of these methods is to reduce a large number of correlated quantitative variables to a small set of underlying dimensions. You will learn how to use these methods to perform corpus-based multidimensional analysis of register variation. 10 01 JB code z.195.c19 367 386 20 Article 21 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 19. Exemplars, categories, prototypes</TitleText> <Subtitle textformat="02">Simple and multiple correspondence analysis</Subtitle> 01 This chapter introduces Correspondence Analysis. It is similar to PCA, but is designed for visualization and exploration of bivariate and multivariate categorical data. The first case study focuses on register variation of English Basic Colour Terms by using Simple Correspondence Analysis, which can be used for visualization of bivariate categorical data in two-dimensional contingency tables. In the second case study of German lexical categories Stuhl ‘chair’ and Sessel ‘armchair’, you will learn how to perform Multiple Correspondence Analysis with higher-dimensional tables. 10 01 JB code z.195.c20 387 394 8 Article 22 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 20. Constructional change and motion charts</TitleText> 01 This chapter introduces motion charts as a method for dynamic visualization of language change. More specifically, they enable one to detect and explore changes in the use of constructions by visualizing the relative frequencies of different lexemes that fill in the constructional slots. The method is illustrated with a case study that explores the changes in the use of future markers will and be going to by comparing the frequencies of infinitives that follow the markers. 10 01 JB code z.195.epi 395 396 2 Article 23 <TitleType>01</TitleType> <TitleText textformat="02">Epilogue</TitleText> 10 01 JB code z.195.app1 397 408 12 Article 24 <TitleType>01</TitleType> <TitleText textformat="02">The most important R objects and basic operations with them</TitleText> <TitlePrefix>The </TitlePrefix> <TitleWithoutPrefix textformat="02">most important R objects and basic operations with them</TitleWithoutPrefix> 10 01 JB code z.195.app2 409 424 16 Article 25 <TitleType>01</TitleType> <TitleText textformat="02">Main plotting functions and graphical parameters in R</TitleText> 10 01 JB code z.195.refs 425 432 8 References 26 <TitleType>01</TitleType> <TitleText textformat="02">References</TitleText> 10 01 JB code z.195.si 433 440 8 Index 27 <TitleType>01</TitleType> <TitleText textformat="02">Subject Index</TitleText> 10 01 JB code z.195.ri 441 443 3 Index 28 <TitleType>01</TitleType> <TitleText textformat="02">Index of R functions and packages</TitleText> 02 JBENJAMINS John Benjamins Publishing Company 01 John Benjamins Publishing Company Amsterdam/Philadelphia NL 04 20151125 2015 John Benjamins B.V. 02 WORLD 13 15 9789027212245 01 JB 3 John Benjamins e-Platform 03 jbe-platform.com 09 WORLD 21 01 06 Institutional price 00 105.00 EUR R 01 05 Consumer price 00 36.00 EUR R 01 06 Institutional price 00 88.00 GBP Z 01 05 Consumer price 00 30.00 GBP Z 01 06 Institutional price inst 00 158.00 USD S 01 05 Consumer price cons 00 54.00 USD S 412015967 03 01 01 JB John Benjamins Publishing Company 01 JB code Z 195 Hb 15 9789027212245 13 2015016708 BB <TitleType>01</TitleType> <TitleText textformat="02">How to do Linguistics with R</TitleText> <Subtitle textformat="02">Data exploration and statistical analysis</Subtitle> 01 z.195 01 https://benjamins.com 02 https://benjamins.com/catalog/z.195 1 A01 Natalia Levshina Levshina, Natalia Natalia Levshina Université catholique de Louvain 01 eng 454 xi 443 LAN009000 v.2006 CFX 2 24 JB Subject Scheme LIN.COGN Cognition and language 24 JB Subject Scheme LIN.COMPUT Computational & corpus linguistics 24 JB Subject Scheme LIN.THEOR Theoretical linguistics 06 01 This book provides a linguist with a statistical toolkit for exploration and analysis of linguistic data. It employs R, a free software environment for statistical computing, which is increasingly popular among linguists. How to do Linguistics with R: Data exploration and statistical analysis is unique in its scope, as it covers a wide range of classical and cutting-edge statistical methods, including different flavours of regression analysis and ANOVA, random forests and conditional inference trees, as well as specific linguistic approaches, among which are Behavioural Profiles, Vector Space Models and various measures of association between words and constructions. The statistical topics are presented comprehensively, but without too much technical detail, and illustrated with linguistic case studies that answer non-trivial research questions. The book also demonstrates how to visualize linguistic data with the help of attractive informative graphs, including the popular ggplot2 system and Google visualization tools. This book has a companion website: <a href="http://doi.org/10.1075/z.195.website">http://doi.org/10.1075/z.195.website</a> 05 Levshina’s book achieves something few other books on doing linguistics with R have achieved. She has written a book that makes sense even for novice users of R and for linguists not accustomed to statistical computing. Levshina writes in a pedagogically sensitive style, in friendly language, and with just the right amount of explanatory prose to lead the reader to insightful analyses. Taken together, the chapters introduce the reader to a sparkling variety of statistical methods. Best of all, from the point of view of linguists, real *linguistic *problems take centre stage throughout the book– the statistical methods are the means to answer intriguing linguistic questions, not an end in themselves. John Newman, University of Alberta 05 This is a fantastic textbook: extremely comprehensive (the book surveys almost all major analysis technique used in the linguistic literature – from descriptive statistics over regression analysis to semantic vector space modeling), well-written, and with a much appreciated emphasis on good data visualization. Both beginning and more experienced quantitative linguists will find this book an invaluable resource. Benedikt Szmrecsanyi, University of Leuven 04 09 01 https://benjamins.com/covers/475/z.195.png 04 03 01 https://benjamins.com/covers/475_jpg/9789027212245.jpg 04 03 01 https://benjamins.com/covers/475_tif/9789027212245.tif 06 09 01 https://benjamins.com/covers/1200_front/z.195.hb.png 07 09 01 https://benjamins.com/covers/125/z.195.png 25 09 01 https://benjamins.com/covers/1200_back/z.195.hb.png 27 09 01 https://benjamins.com/covers/3d_web/z.195.hb.png 10 01 JB code z.195.prelim i vi 6 Prelim pages -1 <TitleType>01</TitleType> <TitleText textformat="02">Prelim pages</TitleText> 10 01 JB code z.195.toc vii x 4 Table of contents 0 <TitleType>01</TitleType> <TitleText textformat="02">Table of contents</TitleText> 10 01 JB code z.195.ack xi xii 2 Acknowledgments 1 <TitleType>01</TitleType> <TitleText textformat="02">Acknowledgements</TitleText> 10 01 JB code z.195.intro 1 6 6 Article 2 <TitleType>01</TitleType> <TitleText textformat="02">Introduction</TitleText> 10 01 JB code z.195.c1 7 20 14 Article 3 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 1. What is statistics?</TitleText> <Subtitle textformat="02">Main statistical notions and principles</Subtitle> 01 What is statistics? What can and cannot statistics do for you? How to formulate and test research hypotheses? What kind of statistical tests are there? These and many other questions are discussed in this chapter. In addition, you will also learn about different types of variables, parametric and non-parametric tests, p-values and many other things which you will need in order to understand explanations provided in the following chapters. 10 01 JB code z.195.c2 21 40 20 Article 4 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 2. Introduction to R</TitleText> 01 In this chapter you will learn to install the basic distribution of R, as well as add-on packages. The chapter also introduces the basics of R syntax and demonstrates how to perform simple operations with different R objects. Special attention is paid to importing and exporting your own data to and from R and saving your graphical output. You will also be able to interpret error messages and warnings that R may give you and search for additional information on R functions. 10 01 JB code z.195.c3 41 68 28 Article 5 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 3. Descriptive statistics for quantitative variables</TitleText> 01 This chapter shows how to compute basic descriptive statistics for a quantitative variable. You will learn the most popular measures of central tendency (the mean, the median and the mode) and dispersion (variance, standard deviation, range, IQR, median absolute deviation). The chapter will also demonstrate how to produce different graphs (box-and-whisker plots, histograms, density plots, Q–Q plots, line charts), which visualize univariate distributions and help one determine whether a variable is normally distributed. From the case studies you will learn how to analyse the distribution of word lengths in a sample, to detect suspicious values in subjects’ reaction times in a lexical decision task, and to correct some problems with the shape of a distribution. 10 01 JB code z.195.c4 69 86 18 Article 6 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 4. How to explore qualitative variables</TitleText> <Subtitle textformat="02">proportions and their visualizations</Subtitle> 01 This chapter demonstrates how to explore a categorical variable with the help of tables of counts and proportions. As in the previous chapter, graphs (pie charts, bar plots and dot charts) will play a very important role. You will also learn how to change values of a categorical variable. In addition, we will discuss how one can use Deviation of Proportions to measure dispersion of words in a corpus. This approach will be illustrated by a case study of the Basic Colour Terms in English. 10 01 JB code z.195.c5 87 114 28 Article 7 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 5. Comparing two groups</TitleText> <Subtitle textformat="02">t-test and Wilcoxon and Mann-Whitney tests for independent and dependent samples</Subtitle> 01 Do language learners who are taught by an innovative method show better results than those who are taught traditionally? Do speakers of one language variety speak faster than speakers of another variety? Do people of one gender use more hedging constructions than people of another? In this chapter, you will learn how to make such comparisons using the parametric t-test and the non-parametric Wilcoxon and Mann-Whitney tests for dependent and independent samples. You will learn how to compute the standard error and confidence intervals for the mean. The case studies will involve differences between high- and low-frequency nouns with regard to the number of associations that they trigger and their abstractness/concreteness scores. 10 01 JB code z.195.c6 115 138 24 Article 8 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 6. Relationships between two quantitative variables</TitleText> <Subtitle textformat="02">Correlation analysis with elements of linear regression modelling</Subtitle> 01 Will your knowledge of statistics improve as you read more and more books on the subject? Is there a relationship between the length of a word and its frequency? Does grammatical proficiency of children depend on the number of lexical items which they have mastered? Does the number of phonemes in a language depend on the number of speakers? All these questions involve correlation between two variables. This chapter explains the principles of correlation analysis and demonstrates how it can be carried out using popular parametric and non-parametric tests. You will also learn how to produce correlograms and scatter plots with a regression line. Some fundamental notions of regression analysis, such as residuals, homo- and heteroscedasticity, will be introduced. The case studies investigate the relationship between word frequency and mean reaction time in a lexical decision task and the correlation between vocabulary size and grammatical proficiency in first language acquisition. 10 01 JB code z.195.c7 139 170 32 Article 9 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 7. More on frequencies and reaction times</TitleText> <Subtitle textformat="02">Linear regression</Subtitle> 01 After the previous chapter has introduced some basic elements of regression analysis, this chapter will provide a more thorough discussion of linear regression. This method enables one to model and explain the relationships between one or more explanatory variables at any level of measurement, on the one hand, and one ratio- or interval-scaled response variable, on the other hand. In addition, one can investigate interactions between explanatory variables. You will learn how to fit a multiple linear regression model, to perform its diagnostics and to interpret the results. You will also learn how to carry out non-parametric linear regression with the help of bootstrap. The case study investigates the relationship between reaction times in a lexical decision task, and such factors as word length, corpus frequency and part of speech of lexical stimuli. In contrast with the previous case studies, all these factors are tested here simultaneously in a multiple linear regression model. 10 01 JB code z.195.c8 171 198 28 Article 10 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 8. Finding differences between several groups</TitleText> <Subtitle textformat="02">Sign language, linguistic relativity and ANOVA</Subtitle> 01 This chapter introduces ANOVA (analysis of variance), a special case of linear regression with binary or categorical independent variables. This method is widely used in experimental linguistics, when the researcher compares several groups of experimental objects that undergo different treatments. In this chapter you will learn several types of ANOVA: one-way ANOVA with one factor as an independent variable, factorial ANOVA with two or more categorical independent variables, and repeated-measures and mixed ANOVA. The methods are illustrated by three case studies. The first two focus on grammatical features of an emergent sign language. The third case study deals with cross-linguistic differences in time conceptualization, which are interpreted as evidence in favour of the linguistic relativity hypothesis. 10 01 JB code z.195.c9 199 222 24 Article 11 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 9. Measuring associations between two categorical variables</TitleText> <Subtitle textformat="02">Conceptual metaphors and tests of independence</Subtitle> 01 This chapter focuses on associations between two categorical variables. You will learn how to measure the association strength using odds ratios, Cramér’s V and the φ-coefficient. You will also learn how to test whether the association is statistically significant with the help of the χ2-test and the Fisher exact test. Bar plots, mosaic plots and association plots are used as visualization tools for cross-tabulated data. All these concepts and tools will be illustrated by case studies of metaphoric and non-metaphoric uses of the preposition over and the verb see in different registers. 10 01 JB code z.195.c10 223 240 18 Article 12 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 10. Association measures</TitleText> <Subtitle textformat="02">collocations and collostructions</Subtitle> 01 Collocations, as well as colligations and other co-occurrence patterns, play an important role in corpus linguistics, psycholinguistics and usage-based grammar and lexicology. To measure the degree of attraction between words and other units, one can use diverse association measures, such as collostructional strength, Pointwise Mutual Information or ΔP. From this chapter you will learn how to compute a variety of association measures using a small set of different co-occurrence frequencies. The case study is based on co-occurrence frequencies of different verbs in the Russian ditransitive construction. 10 01 JB code z.195.c11 241 252 12 Article 13 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 11. Geographic variation of quite: Distinctive collexeme analysis</TitleText> 01 This chapter introduces distinctive collexeme analysis, which employs bidirectional association measures discussed in the previous chapter. This method is based on the co-occurrence frequencies of words that occur in two near-synonymous constructions, or in two or more dialectal or diachronic variants of the same construction. Here we will compare the variants of quite + ADJ constructions in different national varieties of English. We will first present a canonical distinctive collexeme analysis with only two varieties, British and American English, and then will show how this approach can be extended to more lects, presenting a unified approach to multiple distinctive collexeme analysis. 10 01 JB code z.195.c12 253 276 24 Article 14 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 12. Probabilistic multifactorial grammar and lexicology</TitleText> <Subtitle textformat="02">Binomial logistic regression</Subtitle> 01 In this chapter you will learn how to model the speaker’s choice between two near synonymous words or constructions on the basis of contextual features. The most popular statistical tool that is used to create such models is logistic regression. The approach is illustrated by a case study of two Dutch causative auxiliaries. As in the case of linear regression, you will learn how to create, test and interpret a logistic model with the help of different R tools. 10 01 JB code z.195.c13 277 290 14 Article 15 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 13. Multinomial (polytomous) logistic regression models of three and more near synonyms</TitleText> 01 This chapter continues the discussion of logistic regression models, which can be used to predict the speaker’s choice between different near synonyms or variants. This time you will learn to model situations when the number of possible outcomes is greater than two. Such models are called multinomial, or polytomous. The method will be illustrated with a case study of three near synonyms: let, allow and permit. 10 01 JB code z.195.c14 291 300 10 Article 16 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 14. Conditional inference trees and random forests</TitleText> 01 This chapter discusses conditional inference trees and random forests. These are non-parametric tree-structure models of regression and classification that can serve as an alternative to multiple regression. They are especially useful in the presence of many high-order interactions and in situations when the sample size is small, but the number of predictors is large. You will learn how to fit such models, interpret their results and evaluate their quality. The case study that illustrates the techniques deals with three English causative constructions make + V, cause + to V and have + V and identifies the set of independent semantic variables that are important for distinguishing between the constructions. 10 01 JB code z.195.c15 301 322 22 Article 17 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 15. Behavioural profiles, distance metrics and cluster analysis</TitleText> 01 This chapter presents the Behavioural Profiles approach, which involves the comparison of contextual features of words or constructions in a corpus. The chapter also discusses several clustering algorithms, which are based on different distance metrics. Cluster analysis is a family of techniques that can help you discover groups of similar objects in the data. Several popular methods of cluster validation and diagnostics are discussed, which involve the computation of average silhouette widths and multiscale bootstrap resampling. The chapter also demonstrates how to interpret clusters with the help of the snake plot and effect size measures. In addition, you will learn to create and interpret scree plots, which are useful for determining the optimal number of clusters. 10 01 JB code z.195.c16 323 332 10 Article 18 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 16. Introduction to Semantic Vector Spaces</TitleText> <Subtitle textformat="02">Cosine as a measure of semantic similarity</Subtitle> 01 This chapter introduces Semantic Vector Spaces, another distributional approach to semantics. This method originates in Natural Language Processing. Unlike Behavioural Profiles discussed in the previous chapter, it uses automatically extracted co-occurrences of target words and contextual features. The characteristic features of the method are weighted co-occurrence frequencies and the use of the cosine as the most popular similarity measure. This chapter provides a general introduction to the method, with a case study of English cooking verbs as an illustration. 10 01 JB code z.195.c17 333 350 18 Article 19 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 17. Language and space</TitleText> <Subtitle textformat="02">Dialects, maps and Multidimensional Scaling</Subtitle> 01 This chapter introduces another popular method that deals with distance matrices. This method is called Multidimensional Scaling. It is a dimensionality reduction technique that represents distances between objects in a low-dimensional space. You will learn how to perform different types of metric and non-metric scaling and carry out the diagnostics of solutions by using the scree plot, the Shepard plot and goodness-of-fit measures. The chapter also shows how one can use R for creation of geographical maps with points and text labels. Finally, you will learn how to measure the correlation between two distance matrices with the help of the Mantel test. The case studies are based on geographic coordinates and several linguistic features of varieties of English all over the world. 10 01 JB code z.195.c18 351 366 16 Article 20 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 18. Multidimensional analysis of register variation</TitleText> <Subtitle textformat="02">Principal Components Analysis and Factor Analysis</Subtitle> 01 In this chapter you will learn about Principal Components Analysis and Factor Analysis. The aim of these methods is to reduce a large number of correlated quantitative variables to a small set of underlying dimensions. You will learn how to use these methods to perform corpus-based multidimensional analysis of register variation. 10 01 JB code z.195.c19 367 386 20 Article 21 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 19. Exemplars, categories, prototypes</TitleText> <Subtitle textformat="02">Simple and multiple correspondence analysis</Subtitle> 01 This chapter introduces Correspondence Analysis. It is similar to PCA, but is designed for visualization and exploration of bivariate and multivariate categorical data. The first case study focuses on register variation of English Basic Colour Terms by using Simple Correspondence Analysis, which can be used for visualization of bivariate categorical data in two-dimensional contingency tables. In the second case study of German lexical categories Stuhl ‘chair’ and Sessel ‘armchair’, you will learn how to perform Multiple Correspondence Analysis with higher-dimensional tables. 10 01 JB code z.195.c20 387 394 8 Article 22 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 20. Constructional change and motion charts</TitleText> 01 This chapter introduces motion charts as a method for dynamic visualization of language change. More specifically, they enable one to detect and explore changes in the use of constructions by visualizing the relative frequencies of different lexemes that fill in the constructional slots. The method is illustrated with a case study that explores the changes in the use of future markers will and be going to by comparing the frequencies of infinitives that follow the markers. 10 01 JB code z.195.epi 395 396 2 Article 23 <TitleType>01</TitleType> <TitleText textformat="02">Epilogue</TitleText> 10 01 JB code z.195.app1 397 408 12 Article 24 <TitleType>01</TitleType> <TitleText textformat="02">The most important R objects and basic operations with them</TitleText> <TitlePrefix>The </TitlePrefix> <TitleWithoutPrefix textformat="02">most important R objects and basic operations with them</TitleWithoutPrefix> 10 01 JB code z.195.app2 409 424 16 Article 25 <TitleType>01</TitleType> <TitleText textformat="02">Main plotting functions and graphical parameters in R</TitleText> 10 01 JB code z.195.refs 425 432 8 References 26 <TitleType>01</TitleType> <TitleText textformat="02">References</TitleText> 10 01 JB code z.195.si 433 440 8 Index 27 <TitleType>01</TitleType> <TitleText textformat="02">Subject Index</TitleText> 10 01 JB code z.195.ri 441 443 3 Index 28 <TitleType>01</TitleType> <TitleText textformat="02">Index of R functions and packages</TitleText> 02 JBENJAMINS John Benjamins Publishing Company 01 John Benjamins Publishing Company Amsterdam/Philadelphia NL 04 20151125 2015 John Benjamins B.V. 02 WORLD 01 245 mm 02 174 mm 08 995 gr 01 JB 1 John Benjamins Publishing Company +31 20 6304747 +31 20 6739773 bookorder@benjamins.nl 01 https://benjamins.com 01 WORLD US CA MX 21 4 16 01 02 JB 1 00 105.00 EUR R 02 02 JB 1 00 111.30 EUR R 01 JB 10 bebc +44 1202 712 934 +44 1202 712 913 sales@bebc.co.uk 03 GB 21 16 02 02 JB 1 00 88.00 GBP Z 01 JB 2 John Benjamins North America +1 800 562-5666 +1 703 661-1501 benjamins@presswarehouse.com 01 https://benjamins.com 01 US CA MX 21 16 01 gen 02 JB 1 00 158.00 USD 180016118 03 01 01 JB John Benjamins Publishing Company 01 JB code Z 195 Pb 15 9789027212252 13 2015016708 BC <TitleType>01</TitleType> <TitleText textformat="02">How to do Linguistics with R</TitleText> <Subtitle textformat="02">Data exploration and statistical analysis</Subtitle> 01 z.195 01 https://benjamins.com 02 https://benjamins.com/catalog/z.195 1 A01 Natalia Levshina Levshina, Natalia Natalia Levshina Université catholique de Louvain 01 eng 454 xi 443 LAN009000 v.2006 CFX 2 24 JB Subject Scheme LIN.COGN Cognition and language 24 JB Subject Scheme LIN.COMPUT Computational & corpus linguistics 24 JB Subject Scheme LIN.THEOR Theoretical linguistics 06 01 This book provides a linguist with a statistical toolkit for exploration and analysis of linguistic data. It employs R, a free software environment for statistical computing, which is increasingly popular among linguists. How to do Linguistics with R: Data exploration and statistical analysis is unique in its scope, as it covers a wide range of classical and cutting-edge statistical methods, including different flavours of regression analysis and ANOVA, random forests and conditional inference trees, as well as specific linguistic approaches, among which are Behavioural Profiles, Vector Space Models and various measures of association between words and constructions. The statistical topics are presented comprehensively, but without too much technical detail, and illustrated with linguistic case studies that answer non-trivial research questions. The book also demonstrates how to visualize linguistic data with the help of attractive informative graphs, including the popular ggplot2 system and Google visualization tools. This book has a companion website: <a href="http://doi.org/10.1075/z.195.website">http://doi.org/10.1075/z.195.website</a> 05 Levshina’s book achieves something few other books on doing linguistics with R have achieved. She has written a book that makes sense even for novice users of R and for linguists not accustomed to statistical computing. Levshina writes in a pedagogically sensitive style, in friendly language, and with just the right amount of explanatory prose to lead the reader to insightful analyses. Taken together, the chapters introduce the reader to a sparkling variety of statistical methods. Best of all, from the point of view of linguists, real *linguistic *problems take centre stage throughout the book– the statistical methods are the means to answer intriguing linguistic questions, not an end in themselves. John Newman, University of Alberta 05 This is a fantastic textbook: extremely comprehensive (the book surveys almost all major analysis technique used in the linguistic literature – from descriptive statistics over regression analysis to semantic vector space modeling), well-written, and with a much appreciated emphasis on good data visualization. Both beginning and more experienced quantitative linguists will find this book an invaluable resource. Benedikt Szmrecsanyi, University of Leuven 04 09 01 https://benjamins.com/covers/475/z.195.png 04 03 01 https://benjamins.com/covers/475_jpg/9789027212245.jpg 04 03 01 https://benjamins.com/covers/475_tif/9789027212245.tif 06 09 01 https://benjamins.com/covers/1200_front/z.195.pb.png 07 09 01 https://benjamins.com/covers/125/z.195.png 25 09 01 https://benjamins.com/covers/1200_back/z.195.pb.png 27 09 01 https://benjamins.com/covers/3d_web/z.195.pb.png 10 01 JB code z.195.prelim i vi 6 Prelim pages -1 <TitleType>01</TitleType> <TitleText textformat="02">Prelim pages</TitleText> 10 01 JB code z.195.toc vii x 4 Table of contents 0 <TitleType>01</TitleType> <TitleText textformat="02">Table of contents</TitleText> 10 01 JB code z.195.ack xi xii 2 Acknowledgments 1 <TitleType>01</TitleType> <TitleText textformat="02">Acknowledgements</TitleText> 10 01 JB code z.195.intro 1 6 6 Article 2 <TitleType>01</TitleType> <TitleText textformat="02">Introduction</TitleText> 10 01 JB code z.195.c1 7 20 14 Article 3 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 1. What is statistics?</TitleText> <Subtitle textformat="02">Main statistical notions and principles</Subtitle> 01 What is statistics? What can and cannot statistics do for you? How to formulate and test research hypotheses? What kind of statistical tests are there? These and many other questions are discussed in this chapter. In addition, you will also learn about different types of variables, parametric and non-parametric tests, p-values and many other things which you will need in order to understand explanations provided in the following chapters. 10 01 JB code z.195.c2 21 40 20 Article 4 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 2. Introduction to R</TitleText> 01 In this chapter you will learn to install the basic distribution of R, as well as add-on packages. The chapter also introduces the basics of R syntax and demonstrates how to perform simple operations with different R objects. Special attention is paid to importing and exporting your own data to and from R and saving your graphical output. You will also be able to interpret error messages and warnings that R may give you and search for additional information on R functions. 10 01 JB code z.195.c3 41 68 28 Article 5 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 3. Descriptive statistics for quantitative variables</TitleText> 01 This chapter shows how to compute basic descriptive statistics for a quantitative variable. You will learn the most popular measures of central tendency (the mean, the median and the mode) and dispersion (variance, standard deviation, range, IQR, median absolute deviation). The chapter will also demonstrate how to produce different graphs (box-and-whisker plots, histograms, density plots, Q–Q plots, line charts), which visualize univariate distributions and help one determine whether a variable is normally distributed. From the case studies you will learn how to analyse the distribution of word lengths in a sample, to detect suspicious values in subjects’ reaction times in a lexical decision task, and to correct some problems with the shape of a distribution. 10 01 JB code z.195.c4 69 86 18 Article 6 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 4. How to explore qualitative variables</TitleText> <Subtitle textformat="02">proportions and their visualizations</Subtitle> 01 This chapter demonstrates how to explore a categorical variable with the help of tables of counts and proportions. As in the previous chapter, graphs (pie charts, bar plots and dot charts) will play a very important role. You will also learn how to change values of a categorical variable. In addition, we will discuss how one can use Deviation of Proportions to measure dispersion of words in a corpus. This approach will be illustrated by a case study of the Basic Colour Terms in English. 10 01 JB code z.195.c5 87 114 28 Article 7 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 5. Comparing two groups</TitleText> <Subtitle textformat="02">t-test and Wilcoxon and Mann-Whitney tests for independent and dependent samples</Subtitle> 01 Do language learners who are taught by an innovative method show better results than those who are taught traditionally? Do speakers of one language variety speak faster than speakers of another variety? Do people of one gender use more hedging constructions than people of another? In this chapter, you will learn how to make such comparisons using the parametric t-test and the non-parametric Wilcoxon and Mann-Whitney tests for dependent and independent samples. You will learn how to compute the standard error and confidence intervals for the mean. The case studies will involve differences between high- and low-frequency nouns with regard to the number of associations that they trigger and their abstractness/concreteness scores. 10 01 JB code z.195.c6 115 138 24 Article 8 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 6. Relationships between two quantitative variables</TitleText> <Subtitle textformat="02">Correlation analysis with elements of linear regression modelling</Subtitle> 01 Will your knowledge of statistics improve as you read more and more books on the subject? Is there a relationship between the length of a word and its frequency? Does grammatical proficiency of children depend on the number of lexical items which they have mastered? Does the number of phonemes in a language depend on the number of speakers? All these questions involve correlation between two variables. This chapter explains the principles of correlation analysis and demonstrates how it can be carried out using popular parametric and non-parametric tests. You will also learn how to produce correlograms and scatter plots with a regression line. Some fundamental notions of regression analysis, such as residuals, homo- and heteroscedasticity, will be introduced. The case studies investigate the relationship between word frequency and mean reaction time in a lexical decision task and the correlation between vocabulary size and grammatical proficiency in first language acquisition. 10 01 JB code z.195.c7 139 170 32 Article 9 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 7. More on frequencies and reaction times</TitleText> <Subtitle textformat="02">Linear regression</Subtitle> 01 After the previous chapter has introduced some basic elements of regression analysis, this chapter will provide a more thorough discussion of linear regression. This method enables one to model and explain the relationships between one or more explanatory variables at any level of measurement, on the one hand, and one ratio- or interval-scaled response variable, on the other hand. In addition, one can investigate interactions between explanatory variables. You will learn how to fit a multiple linear regression model, to perform its diagnostics and to interpret the results. You will also learn how to carry out non-parametric linear regression with the help of bootstrap. The case study investigates the relationship between reaction times in a lexical decision task, and such factors as word length, corpus frequency and part of speech of lexical stimuli. In contrast with the previous case studies, all these factors are tested here simultaneously in a multiple linear regression model. 10 01 JB code z.195.c8 171 198 28 Article 10 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 8. Finding differences between several groups</TitleText> <Subtitle textformat="02">Sign language, linguistic relativity and ANOVA</Subtitle> 01 This chapter introduces ANOVA (analysis of variance), a special case of linear regression with binary or categorical independent variables. This method is widely used in experimental linguistics, when the researcher compares several groups of experimental objects that undergo different treatments. In this chapter you will learn several types of ANOVA: one-way ANOVA with one factor as an independent variable, factorial ANOVA with two or more categorical independent variables, and repeated-measures and mixed ANOVA. The methods are illustrated by three case studies. The first two focus on grammatical features of an emergent sign language. The third case study deals with cross-linguistic differences in time conceptualization, which are interpreted as evidence in favour of the linguistic relativity hypothesis. 10 01 JB code z.195.c9 199 222 24 Article 11 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 9. Measuring associations between two categorical variables</TitleText> <Subtitle textformat="02">Conceptual metaphors and tests of independence</Subtitle> 01 This chapter focuses on associations between two categorical variables. You will learn how to measure the association strength using odds ratios, Cramér’s V and the φ-coefficient. You will also learn how to test whether the association is statistically significant with the help of the χ2-test and the Fisher exact test. Bar plots, mosaic plots and association plots are used as visualization tools for cross-tabulated data. All these concepts and tools will be illustrated by case studies of metaphoric and non-metaphoric uses of the preposition over and the verb see in different registers. 10 01 JB code z.195.c10 223 240 18 Article 12 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 10. Association measures</TitleText> <Subtitle textformat="02">collocations and collostructions</Subtitle> 01 Collocations, as well as colligations and other co-occurrence patterns, play an important role in corpus linguistics, psycholinguistics and usage-based grammar and lexicology. To measure the degree of attraction between words and other units, one can use diverse association measures, such as collostructional strength, Pointwise Mutual Information or ΔP. From this chapter you will learn how to compute a variety of association measures using a small set of different co-occurrence frequencies. The case study is based on co-occurrence frequencies of different verbs in the Russian ditransitive construction. 10 01 JB code z.195.c11 241 252 12 Article 13 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 11. Geographic variation of quite: Distinctive collexeme analysis</TitleText> 01 This chapter introduces distinctive collexeme analysis, which employs bidirectional association measures discussed in the previous chapter. This method is based on the co-occurrence frequencies of words that occur in two near-synonymous constructions, or in two or more dialectal or diachronic variants of the same construction. Here we will compare the variants of quite + ADJ constructions in different national varieties of English. We will first present a canonical distinctive collexeme analysis with only two varieties, British and American English, and then will show how this approach can be extended to more lects, presenting a unified approach to multiple distinctive collexeme analysis. 10 01 JB code z.195.c12 253 276 24 Article 14 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 12. Probabilistic multifactorial grammar and lexicology</TitleText> <Subtitle textformat="02">Binomial logistic regression</Subtitle> 01 In this chapter you will learn how to model the speaker’s choice between two near synonymous words or constructions on the basis of contextual features. The most popular statistical tool that is used to create such models is logistic regression. The approach is illustrated by a case study of two Dutch causative auxiliaries. As in the case of linear regression, you will learn how to create, test and interpret a logistic model with the help of different R tools. 10 01 JB code z.195.c13 277 290 14 Article 15 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 13. Multinomial (polytomous) logistic regression models of three and more near synonyms</TitleText> 01 This chapter continues the discussion of logistic regression models, which can be used to predict the speaker’s choice between different near synonyms or variants. This time you will learn to model situations when the number of possible outcomes is greater than two. Such models are called multinomial, or polytomous. The method will be illustrated with a case study of three near synonyms: let, allow and permit. 10 01 JB code z.195.c14 291 300 10 Article 16 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 14. Conditional inference trees and random forests</TitleText> 01 This chapter discusses conditional inference trees and random forests. These are non-parametric tree-structure models of regression and classification that can serve as an alternative to multiple regression. They are especially useful in the presence of many high-order interactions and in situations when the sample size is small, but the number of predictors is large. You will learn how to fit such models, interpret their results and evaluate their quality. The case study that illustrates the techniques deals with three English causative constructions make + V, cause + to V and have + V and identifies the set of independent semantic variables that are important for distinguishing between the constructions. 10 01 JB code z.195.c15 301 322 22 Article 17 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 15. Behavioural profiles, distance metrics and cluster analysis</TitleText> 01 This chapter presents the Behavioural Profiles approach, which involves the comparison of contextual features of words or constructions in a corpus. The chapter also discusses several clustering algorithms, which are based on different distance metrics. Cluster analysis is a family of techniques that can help you discover groups of similar objects in the data. Several popular methods of cluster validation and diagnostics are discussed, which involve the computation of average silhouette widths and multiscale bootstrap resampling. The chapter also demonstrates how to interpret clusters with the help of the snake plot and effect size measures. In addition, you will learn to create and interpret scree plots, which are useful for determining the optimal number of clusters. 10 01 JB code z.195.c16 323 332 10 Article 18 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 16. Introduction to Semantic Vector Spaces</TitleText> <Subtitle textformat="02">Cosine as a measure of semantic similarity</Subtitle> 01 This chapter introduces Semantic Vector Spaces, another distributional approach to semantics. This method originates in Natural Language Processing. Unlike Behavioural Profiles discussed in the previous chapter, it uses automatically extracted co-occurrences of target words and contextual features. The characteristic features of the method are weighted co-occurrence frequencies and the use of the cosine as the most popular similarity measure. This chapter provides a general introduction to the method, with a case study of English cooking verbs as an illustration. 10 01 JB code z.195.c17 333 350 18 Article 19 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 17. Language and space</TitleText> <Subtitle textformat="02">Dialects, maps and Multidimensional Scaling</Subtitle> 01 This chapter introduces another popular method that deals with distance matrices. This method is called Multidimensional Scaling. It is a dimensionality reduction technique that represents distances between objects in a low-dimensional space. You will learn how to perform different types of metric and non-metric scaling and carry out the diagnostics of solutions by using the scree plot, the Shepard plot and goodness-of-fit measures. The chapter also shows how one can use R for creation of geographical maps with points and text labels. Finally, you will learn how to measure the correlation between two distance matrices with the help of the Mantel test. The case studies are based on geographic coordinates and several linguistic features of varieties of English all over the world. 10 01 JB code z.195.c18 351 366 16 Article 20 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 18. Multidimensional analysis of register variation</TitleText> <Subtitle textformat="02">Principal Components Analysis and Factor Analysis</Subtitle> 01 In this chapter you will learn about Principal Components Analysis and Factor Analysis. The aim of these methods is to reduce a large number of correlated quantitative variables to a small set of underlying dimensions. You will learn how to use these methods to perform corpus-based multidimensional analysis of register variation. 10 01 JB code z.195.c19 367 386 20 Article 21 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 19. Exemplars, categories, prototypes</TitleText> <Subtitle textformat="02">Simple and multiple correspondence analysis</Subtitle> 01 This chapter introduces Correspondence Analysis. It is similar to PCA, but is designed for visualization and exploration of bivariate and multivariate categorical data. The first case study focuses on register variation of English Basic Colour Terms by using Simple Correspondence Analysis, which can be used for visualization of bivariate categorical data in two-dimensional contingency tables. In the second case study of German lexical categories Stuhl ‘chair’ and Sessel ‘armchair’, you will learn how to perform Multiple Correspondence Analysis with higher-dimensional tables. 10 01 JB code z.195.c20 387 394 8 Article 22 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 20. Constructional change and motion charts</TitleText> 01 This chapter introduces motion charts as a method for dynamic visualization of language change. More specifically, they enable one to detect and explore changes in the use of constructions by visualizing the relative frequencies of different lexemes that fill in the constructional slots. The method is illustrated with a case study that explores the changes in the use of future markers will and be going to by comparing the frequencies of infinitives that follow the markers. 10 01 JB code z.195.epi 395 396 2 Article 23 <TitleType>01</TitleType> <TitleText textformat="02">Epilogue</TitleText> 10 01 JB code z.195.app1 397 408 12 Article 24 <TitleType>01</TitleType> <TitleText textformat="02">The most important R objects and basic operations with them</TitleText> <TitlePrefix>The </TitlePrefix> <TitleWithoutPrefix textformat="02">most important R objects and basic operations with them</TitleWithoutPrefix> 10 01 JB code z.195.app2 409 424 16 Article 25 <TitleType>01</TitleType> <TitleText textformat="02">Main plotting functions and graphical parameters in R</TitleText> 10 01 JB code z.195.refs 425 432 8 References 26 <TitleType>01</TitleType> <TitleText textformat="02">References</TitleText> 10 01 JB code z.195.si 433 440 8 Index 27 <TitleType>01</TitleType> <TitleText textformat="02">Subject Index</TitleText> 10 01 JB code z.195.ri 441 443 3 Index 28 <TitleType>01</TitleType> <TitleText textformat="02">Index of R functions and packages</TitleText> 02 JBENJAMINS John Benjamins Publishing Company 01 John Benjamins Publishing Company Amsterdam/Philadelphia NL 04 20151125 2015 John Benjamins B.V. 02 WORLD 01 240 mm 02 170 mm 08 850 gr 01 JB 1 John Benjamins Publishing Company +31 20 6304747 +31 20 6739773 bookorder@benjamins.nl 01 https://benjamins.com 01 WORLD US CA MX 21 250 12 01 02 JB 1 00 36.00 EUR R 02 02 JB 1 00 38.16 EUR R 01 JB 10 bebc +44 1202 712 934 +44 1202 712 913 sales@bebc.co.uk 03 GB 21 12 02 02 JB 1 00 30.00 GBP Z 01 JB 2 John Benjamins North America +1 800 562-5666 +1 703 661-1501 benjamins@presswarehouse.com 01 https://benjamins.com 01 US CA MX 21 15 12 01 gen 02 JB 1 00 54.00 USD