# Chapter 10 | Exercise 4

Using the frequencies in Exercise 3, compute the log-likelihood scores for each collexeme. Which adjective has the highest log-likelihood score, and which has the lowest one? The total number of words in COCA at the moment of query was 464 020 256.

Create the vectors with the frequencies *b*, *c* and *d*, and with the expected frequency of *a*. The vectors `a` and `total` are carried over from Exercise 3:

```
> b <- total - a
> c <- 28636 - a
> d <- 464020256 - a - b - c
> aExp <- (a + b)*(a + c)/(a + b + c + d)
```
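To make the cell labels concrete, here is a minimal sketch for a single adjective. The values `a_demo` and `total_demo` are hypothetical and are not taken from Exercise 3; 28636 is read here as the overall frequency of the construction, since *c* is obtained by subtracting *a* from it.

```r
# Hypothetical frequencies for one adjective (NOT the Exercise 3 data)
a_demo <- 100          # adjective inside the construction
total_demo <- 5000     # adjective anywhere in the corpus
cx <- 28636            # construction frequency, as used in the exercise
N <- 464020256         # corpus size, as given in the exercise

b_demo <- total_demo - a_demo            # adjective outside the construction
c_demo <- cx - a_demo                    # other words inside the construction
d_demo <- N - a_demo - b_demo - c_demo   # everything else

# 2 x 2 contingency table; the matrix is filled column by column
tab <- matrix(c(a_demo, b_demo, c_demo, d_demo), nrow = 2,
              dimnames = list(c("in construction", "elsewhere"),
                              c("adjective", "other words")))

# Expected frequency of cell a: row total * column total / grand total
aExp_demo <- sum(tab[1, ]) * sum(tab[, 1]) / sum(tab)
```

The last line is the same formula as `aExp` above, written in terms of the table margins.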

Compute the log-likelihood ratio and add the information about the direction of the relationship:

```
> library(Rling)
> loglik <- LL.collostr(a, b, c, d)
> loglik1 <- ifelse(a<aExp, -loglik, loglik)
> names(loglik1) <- adj
> sort(loglik1, decreasing = TRUE)
       crazy        wrong      haywire        blank   unpunished
22406.529538  7498.978088  4056.346901  3560.834797  3214.194939
  undetected   stir-crazy        batty     hog-wild         sick
 3060.021742   231.245176   210.705804   207.630165     4.910525
```
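For readers who want to see what such a score is made of, the sketch below computes the standard log-likelihood ratio statistic, G² = 2 Σ O log(O/E), over the four cells by hand. The helper `g2()` is hypothetical and is only assumed to correspond to what `LL.collostr()` returns; consult the Rling documentation before relying on an exact match.

```r
# Hypothetical helper: G2 statistic for 2 x 2 tables given as cell vectors.
# Assumption: this mirrors LL.collostr(); it is not taken from Rling itself.
g2 <- function(a, b, c, d) {
  obs <- cbind(a, b, c, d)               # one row per collexeme
  n <- rowSums(obs)                      # grand total of each table
  # Expected cell frequencies under independence, in the order a, b, c, d
  expd <- cbind((a + b) * (a + c), (a + b) * (b + d),
                (a + c) * (c + d), (b + d) * (c + d)) / n
  # Empty cells contribute 0, by the convention 0 * log(0) = 0
  terms <- ifelse(obs == 0, 0, obs * log(obs / expd))
  2 * rowSums(terms)
}

# A table whose cells equal their expected values scores (near) zero
g2(10, 20, 30, 60)
```

Because the statistic itself is never negative, the direction of the association has to be added separately, which is what the `ifelse(a < aExp, ...)` step above does.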

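The highest score belongs to *crazy*, and the lowest to *sick*. All scores are positive, so every adjective in the list occurs in the construction at least as often as expected.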