Ch. 9 | Exercise 3

Compare the distribution of on (see the data in Exercise 1) with onto, which is used metaphorically 4 times and non-metaphorically 3 times. Are the variables (the preposition and [non]-metaphoricity) independent? Which test would you use for this purpose and why?

Create a table:

> met <- c(735, 3)
> nonmet <- c(331, 4)
> on_onto <- rbind(met, nonmet)
> colnames(on_onto) <- c('on', 'onto')
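Before running a test, it can help to compare the two distributions directly. The following is a minimal sketch that rebuilds the table with the counts exactly as entered above and computes the share of metaphorical and non-metaphorical uses for each preposition. The object name `props` is introduced here for illustration only.

```r
# Rebuild the table with the counts as entered in the solution above
met <- c(735, 3)
nonmet <- c(331, 4)
on_onto <- rbind(met, nonmet)
colnames(on_onto) <- c('on', 'onto')

# Column-wise proportions: each column (preposition) sums to 1
props <- prop.table(on_onto, margin = 2)
round(props, 3)

# Row, column and grand totals, useful for checking the data entry
addmargins(on_onto)
```

With only 7 tokens of onto in total, its proportions rest on very little data, so a visible difference between the columns does not by itself indicate an association.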

The χ² test returns a large p-value (p = 0.2821, well above the conventional 0.05 level) and a warning message:

> chisq.test(on_onto)

	Pearson's Chi-squared test with Yates' continuity correction

data:  on_onto
X-squared = 1.1571, df = 1, p-value = 0.2821

Warning message:
In chisq.test(on_onto) : Chi-squared approximation may be incorrect

The warning appears because two of the expected frequencies are below 5 (both in the onto column):

> chisq.test(on_onto)$expected
             on     onto
met    733.1855 4.814539
nonmet 332.8145 2.185461
Warning message:
In chisq.test(on_onto) : Chi-squared approximation may be incorrect
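To see where these expected frequencies come from, they can be recomputed by hand from the marginal totals. The sketch below assumes the same counts as entered above; the names `expected`, `x2` and `p_value` are chosen here for illustration. It also reproduces the Yates-corrected statistic, on the assumption that every absolute difference between observed and expected counts exceeds 0.5, which holds for this table.

```r
# Rebuild the table with the counts as entered in the solution above
met <- c(735, 3)
nonmet <- c(331, 4)
on_onto <- rbind(met, nonmet)
colnames(on_onto) <- c('on', 'onto')

# Expected count under independence:
# (row total * column total) / grand total
expected <- outer(rowSums(on_onto), colSums(on_onto)) / sum(on_onto)
expected

# Which cells fall below the conventional threshold of 5?
expected < 5

# Yates-corrected statistic: subtract 0.5 from each |O - E| before squaring
x2 <- sum((abs(on_onto - expected) - 0.5)^2 / expected)

# Upper-tail probability of the chi-squared distribution with 1 df
p_value <- pchisq(x2, df = 1, lower.tail = FALSE)
c(X_squared = x2, p_value = p_value)
```

Because the low expected counts are concentrated in the onto column, the χ² approximation is unreliable in exactly the cells that matter for the comparison.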

In this situation, Fisher’s exact test is recommended, because it does not rely on the large-sample χ² approximation. It likewise gives no grounds to reject the null hypothesis of no association (p = 0.2138):

> fisher.test(on_onto)

	Fisher's Exact Test for Count Data

data:  on_onto
p-value = 0.2138
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  0.4973103 20.3013053
sample estimates:
odds ratio 
  2.957369 
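The two-sided p-value can be understood as a sum of hypergeometric probabilities: with the margins held fixed, it adds up the probabilities of all tables that are no more likely than the observed one. The following is a rough sketch of that idea, assuming the same counts as above. The names `m`, `n`, `k`, `x`, `p_manual` and `or_sample` are illustrative, and the small tolerance factor is an assumption added to guard against floating-point ties.

```r
# Rebuild the table with the counts as entered in the solution above
met <- c(735, 3)
nonmet <- c(331, 4)
on_onto <- rbind(met, nonmet)
colnames(on_onto) <- c('on', 'onto')

m <- sum(on_onto[, 'on'])    # all tokens of 'on'
n <- sum(on_onto[, 'onto'])  # all tokens of 'onto'
k <- sum(on_onto['met', ])   # all metaphorical tokens
x <- on_onto['met', 'on']    # observed metaphorical 'on'

# All values the top-left cell can take when the margins are fixed
support <- max(0, k - n):min(k, m)
probs <- dhyper(support, m, n, k)

# Two-sided p-value: total probability of tables no more likely
# than the observed one
p_obs <- dhyper(x, m, n, k)
p_manual <- sum(probs[probs <= p_obs * (1 + 1e-7)])
p_manual

# Simple cross-product odds ratio; fisher.test() reports a conditional
# maximum-likelihood estimate instead, so the two differ slightly
or_sample <- (on_onto[1, 1] * on_onto[2, 2]) / (on_onto[1, 2] * on_onto[2, 1])
or_sample
```

The wide confidence interval for the odds ratio, which includes 1, reflects how little information 7 tokens of onto provide.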