Ch. 14 | Exercise 1

Using the data set nerd, which was introduced in the exercise for Chapter 12, create a conditional inference tree and visualize it. Which variables are responsible for the splits? Does the model predict the actual outcomes well? Compute the C-index and the accuracy statistic to assess the goodness of fit.

Key

> library(Rling)
> data(nerd)
> library(party)
> nerd.ctree <- ctree(Noun ~ Num + Century + Register + Eval, data = nerd)
> plot(nerd.ctree)

The variables that are responsible for the splits are Eval and Century.

> # Predicted probability of the second level of Noun ("nerd") for each observation
> nerd.ctree.pred <- unlist(treeresponse(nerd.ctree))[c(FALSE, TRUE)]
> library(Hmisc)
> somers2(nerd.ctree.pred, as.numeric(nerd$Noun) - 1)
           C          Dxy            n      Missing 
   0.6727935    0.3455871 1316.0000000    0.0000000
> table(predict(nerd.ctree), nerd$Noun)
      
       geek nerd
  geek  227   61
  nerd  443  585
> (227 + 585)/nrow(nerd)
[1] 0.6170213
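
To put the accuracy in context, it can be compared with a majority-class baseline, i.e. the accuracy obtained by always guessing the more frequent noun. The sketch below re-enters the four counts from the table above by hand, so that it runs in base R without Rling or party. The object names conf, accuracy and baseline are illustrative only and are not part of the original key.

```r
# Confusion matrix copied from the table above:
# rows = predicted Noun, columns = observed Noun
conf <- matrix(c(227, 443, 61, 585), nrow = 2,
               dimnames = list(predicted = c("geek", "nerd"),
                               observed  = c("geek", "nerd")))

# Accuracy: share of observations on the diagonal (correct predictions)
accuracy <- sum(diag(conf)) / sum(conf)

# Baseline: always guess the more frequent observed noun
baseline <- max(colSums(conf)) / sum(conf)

round(c(accuracy = accuracy, baseline = baseline), 3)
```

With these counts the baseline is 670/1316, or about 0.509, against an accuracy of 0.617.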

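The concordance index is C = 0.673 and the accuracy is 0.617. Both values are only moderately above chance level: C = 0.5 corresponds to random discrimination, and always predicting the more frequent noun (geek, 670 of 1,316 observations) would already give an accuracy of 0.509. The tree also misclassifies 443 of the 670 geek observations as nerd, so there is considerable room for improvement.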
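
For readers who want to see what the C-index measures, here is a minimal base R sketch. C is the proportion of (event, non-event) pairs in which the event received the higher predicted probability, with ties counted as one half. Dxy, as reported by somers2, is a rescaling of C: Dxy = 2(C - 0.5). The vectors p and y are hypothetical toy data and are not taken from nerd.

```r
# Hypothetical toy data: predicted probabilities and observed 0/1 outcomes
p <- c(0.2, 0.4, 0.4, 0.7, 0.9)
y <- c(0,   0,   1,   0,   1)

# Difference in predicted probability for every (event, non-event) pair
d <- outer(p[y == 1], p[y == 0], "-")

# Concordant pairs count as 1, tied pairs as 0.5
C <- (sum(d > 0) + 0.5 * sum(d == 0)) / length(d)

# Somers' Dxy is a linear rescaling of C
Dxy <- 2 * (C - 0.5)

c(C = C, Dxy = Dxy)
```

Applying the same rescaling to the value in the key gives 2 * (0.6727935 - 0.5) = 0.345587, which agrees with the reported Dxy up to rounding.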