# Chapter 5 | Exercise 1

Consider the imagery scores of the high- and low-frequency nouns, available as the `imag` variable in the data frames `pym_high` and `pym_low`. Which group of nouns would you expect to have higher imagery scores? Compute the means and medians. Use an appropriate parametric and non-parametric test to check if the difference between the groups is statistically significant. Are the results of the tests similar?

In this exercise we will test a non-directional hypothesis of difference between the groups. First, load the package and the data. Next, compute the means and the medians:

```
> library(Rling)
> data(pym_high)
> data(pym_low)
> mean(pym_high$imag)
[1] 5.1706
> mean(pym_low$imag)
[1] 4.884902
> median(pym_high$imag)
[1] 5.05
> median(pym_low$imag)
[1] 5.23
```

The high-frequency nouns have a higher mean (5.17 vs. 4.88) but a lower median (5.05 vs. 5.23) than the low-frequency nouns. According to the parametric t-test, the difference between the means is not statistically significant (p = 0.2875):

```
> t.test(pym_high$imag, pym_low$imag)

        Welch Two Sample t-test

data:  pym_high$imag and pym_low$imag
t = 1.0694, df = 98.463, p-value = 0.2875
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2444499  0.8158460
sample estimates:
mean of x mean of y
 5.170600  4.884902
```
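As a reminder of what the output above reports, here is the standard form of the Welch statistic and its approximate degrees of freedom. The symbols are notational assumptions that do not appear in the exercise itself: $\bar{x}_1, s_1^2, n_1$ stand for the mean, variance and size of the high-frequency sample, and $\bar{x}_2, s_2^2, n_2$ for those of the low-frequency sample.

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}
$$

The numerator here is 5.1706 − 4.884902 ≈ 0.2857, which is also the midpoint of the reported confidence interval. The fractional value df = 98.463 comes from the Welch-Satterthwaite approximation for $\nu$. A classical Student's t-test with pooled variance would use $n_1 + n_2 - 2$ degrees of freedom instead.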

The non-parametric Wilcoxon rank sum test likewise shows no statistically significant difference between the groups (p = 0.258):

```
> wilcox.test(pym_high$imag, pym_low$imag, correct = FALSE, conf.int = TRUE)

        Wilcoxon rank sum test

data:  pym_high$imag and pym_low$imag
W = 1441.5, p-value = 0.258
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -0.2000116  0.7400222
sample estimates:
difference in location
             0.2000102
```

The results of the two tests are therefore similar. Note that both tests are unpaired (independent samples) and two-tailed.