Ch. 14 | Exercise 1

Using the data set nerd, which was introduced in the exercise for Chapter 12, create a conditional inference tree and visualize it. Which variables are responsible for the splits? Does the model predict the actual outcomes well? Compute the C-index and the accuracy statistic to assess the goodness of fit. 

> library(Rling)
> data(nerd)
> library(party)
> nerd.ctree <- ctree(Noun ~ Num + Century + Register + Eval, data = nerd)
> plot(nerd.ctree)

The variables that are responsible for the splits are Eval and Century.

> nerd.ctree.pred <- unlist(treeresponse(nerd.ctree))[c(FALSE, TRUE)]
> library(Hmisc)
> somers2(nerd.ctree.pred, as.numeric(nerd$Noun) - 1)
           C          Dxy            n      Missing
   0.6727935    0.3455871 1316.0000000    0.0000000
> table(predict(nerd.ctree), nerd$Noun)
       geek nerd
  geek  227   61
  nerd  443  585
> (227 + 585)/nrow(nerd)
[1] 0.6170213
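As a rough illustration of what the C statistic from somers2() measures, the sketch below computes it by hand in base R. C is the proportion of pairs (one observation with outcome 1, one with outcome 0) in which the observation with outcome 1 receives the higher predicted probability, with ties counted as one half. The toy vectors and the helper name c_index are hypothetical and are not part of the nerd data or of Hmisc. The last lines check the relation Dxy = 2 * (C - 0.5) against the values reported above.

```r
# Hypothetical toy data: predicted probabilities and observed 0/1 outcomes.
pred <- c(0.2, 0.4, 0.4, 0.7, 0.9)
obs  <- c(0,   0,   1,   0,   1)

# Hypothetical helper: concordance index from all (1, 0) pairs.
c_index <- function(pred, obs) {
  pos <- pred[obs == 1]            # predictions for outcome 1
  neg <- pred[obs == 0]            # predictions for outcome 0
  diffs <- outer(pos, neg, "-")    # every (1, 0) pair
  # concordant pairs count fully, tied pairs count as one half
  (sum(diffs > 0) + 0.5 * sum(diffs == 0)) / length(diffs)
}

C_toy   <- c_index(pred, obs)      # 4 concordant + 1 tie out of 6 pairs = 0.75
Dxy_toy <- 2 * (C_toy - 0.5)       # 0.5

# Same relation applied to the C value reported for the nerd tree
Dxy_nerd <- 2 * (0.6727935 - 0.5)  # 0.345587, matching the reported Dxy up to rounding
```

On this reading, C = 0.5 corresponds to chance-level discrimination and C = 1 to perfect discrimination.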

The concordance index is C = 0.673 and the accuracy is 0.617. Both values are modest, so the model leaves considerable room for improvement.
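To put the accuracy in context, the following sketch rebuilds the confusion matrix from the counts printed by table() above and compares the accuracy with a majority-class baseline and with per-noun recall. It uses base R only. The object names (conf_mat, accuracy, baseline, recall) are chosen here for illustration.

```r
# Counts copied from the table() output: rows = predicted, columns = observed.
conf_mat <- matrix(c(227, 443, 61, 585), nrow = 2,
                   dimnames = list(predicted = c("geek", "nerd"),
                                   observed  = c("geek", "nerd")))

# Share of correct predictions (the diagonal)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)      # 812/1316 = 0.617

# Baseline: always predict the more frequent observed noun (geek, 670 cases)
baseline <- max(colSums(conf_mat)) / sum(conf_mat)   # 670/1316 = 0.509

# Recall per noun: correct predictions divided by observed totals
recall <- diag(conf_mat) / colSums(conf_mat)         # geek 0.339, nerd 0.906
```

Under these counts the tree predicts nerd for 1,028 of the 1,316 observations, so it recovers most instances of nerd but only about a third of the instances of geek, and its accuracy exceeds the baseline by roughly 11 percentage points.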