Chapter 9 | Exercise 3
Compare the distribution of on (see the data in Exercise 1) with that of onto, which is used metaphorically 3 times and non-metaphorically 4 times. Are the two variables (the preposition and (non-)metaphoricity) independent? Which test would you use for this purpose and why?
Create a table:
> met <- c(735, 3)
> nonmet <- c(331, 4)
> on_onto <- rbind(met, nonmet)
> colnames(on_onto) <- c('on', 'onto')
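Before testing, it can help to look at the column proportions. The following is a minimal sketch that rebuilds the same table and uses base R's prop.table(); the object name props is introduced here only for illustration.

```r
# Rebuild the contingency table from the counts above
on_onto <- rbind(met = c(735, 3), nonmet = c(331, 4))
colnames(on_onto) <- c("on", "onto")

# Share of metaphorical and non-metaphorical uses within each preposition
# (margin = 2 makes every column sum to 1)
props <- prop.table(on_onto, margin = 2)
round(props, 3)
```

Roughly 69% of the uses of on are metaphorical, against 3 out of 7 for onto, but the onto column rests on only seven observations.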
The χ² test returns a large p-value and a warning message:
> chisq.test(on_onto)
Pearson's Chi-squared test with Yates' continuity correction
data: on_onto
X-squared = 1.1571, df = 1, p-value = 0.2821
Warning message:
In chisq.test(on_onto) : Chi-squared approximation may be incorrect
The warning appears because two of the expected frequencies are smaller than 5 (both are in the onto column), so the χ² approximation is unreliable:
> chisq.test(on_onto)$expected
             on     onto
met    733.1855 4.814539
nonmet 332.8145 2.185461
Warning message:
In chisq.test(on_onto) : Chi-squared approximation may be incorrect
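The expected frequencies can also be reproduced by hand, which makes it clear where the small values come from. This is a sketch under the usual definition (row total times column total, divided by the grand total); the name expected is chosen here for illustration.

```r
# Rebuild the contingency table from the counts above
on_onto <- rbind(met = c(735, 3), nonmet = c(331, 4))
colnames(on_onto) <- c("on", "onto")

# Expected count under independence:
# row total * column total / grand total, for every cell
expected <- outer(rowSums(on_onto), colSums(on_onto)) / sum(on_onto)
expected

# Number of cells below the conventional threshold of 5
sum(expected < 5)
```

Because onto occurs only 7 times in total, both of its expected counts necessarily fall below 5.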
In this situation, Fisher’s exact test is recommended because it does not rely on the large-sample χ² approximation. It likewise gives no grounds to reject the null hypothesis of no association:
> fisher.test(on_onto)
Fisher's Exact Test for Count Data
data: on_onto
p-value = 0.2138
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.4973103 20.3013053
sample estimates:
odds ratio
2.957369
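The odds ratio reported by fisher.test() is a conditional maximum likelihood estimate, so it differs slightly from the simple cross-product ratio computed from the table. A minimal sketch of the comparison follows; sample_or and fisher_or are names introduced here for illustration.

```r
# Rebuild the contingency table from the counts above
on_onto <- rbind(met = c(735, 3), nonmet = c(331, 4))
colnames(on_onto) <- c("on", "onto")

# Sample (cross-product) odds ratio:
# odds of a metaphorical use for 'on' divided by the same odds for 'onto'
sample_or <- (on_onto["met", "on"] * on_onto["nonmet", "onto"]) /
  (on_onto["met", "onto"] * on_onto["nonmet", "on"])
sample_or

# Conditional maximum likelihood estimate, as reported by fisher.test()
fisher_or <- unname(fisher.test(on_onto)$estimate)
fisher_or
```

Both estimates are close to 3, but the 95 percent confidence interval above includes 1, which is consistent with the non-significant p-value.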