Chapter 14 | Exercise 1
Using the data set nerd, which was introduced in the exercise for Chapter 12, create a conditional inference tree and visualize it. Which variables are responsible for the splits? Does the model predict the actual outcomes well? Compute the C-index and the accuracy statistic to assess the goodness of fit.
> library(Rling)
> data(nerd)
> library(party)
> nerd.ctree <- ctree(Noun ~ Num + Century + Register + Eval, data = nerd)
> plot(nerd.ctree)
The variables that are responsible for the splits are Eval and Century.
> nerd.ctree.pred <- unlist(treeresponse(nerd.ctree))[c(FALSE, TRUE)]
> library(Hmisc)
> somers2(nerd.ctree.pred, as.numeric(nerd$Noun) - 1)
           C          Dxy            n      Missing
   0.6727935    0.3455871 1316.0000000    0.0000000
> table(predict(nerd.ctree), nerd$Noun)
       geek nerd
  geek  227   61
  nerd  443  585
> (227 + 585)/nrow(nerd)
[1] 0.6170213
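The index [c(FALSE, TRUE)] in the first command relies on R recycling a logical vector. treeresponse() returns one probability vector per observation, and unlist() strings these together, so keeping every second element leaves the probability of the second level of Noun (nerd). The following is only a small sketch of that idiom in base R; the list toy.response is a hypothetical stand-in for the real treeresponse() output, and its values are invented for illustration.

```r
# Hypothetical stand-in for treeresponse() output:
# one vector c(P(geek), P(nerd)) per observation (values are made up)
toy.response <- list(c(0.8, 0.2), c(0.3, 0.7), c(0.5, 0.5))

# unlist() interleaves the probabilities: geek, nerd, geek, nerd, ...
flat <- unlist(toy.response)

# The logical index c(FALSE, TRUE) is recycled along the vector,
# so it drops elements 1, 3, 5 and keeps elements 2, 4, 6
p.nerd <- flat[c(FALSE, TRUE)]
print(p.nerd)

# A more explicit alternative that picks the second element of each vector
p.nerd2 <- sapply(toy.response, function(p) p[2])
stopifnot(identical(p.nerd, p.nerd2))
```

The explicit sapply() version is longer but does not depend on every observation having exactly two response levels in a fixed order.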
The concordance index is C = 0.673 and the accuracy is 0.617. Both values are modest, so there is considerable room for improvement.
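To put the accuracy in perspective, it can be compared with the share of the more frequent noun, which is what a model that always guesses that noun would achieve. The sketch below uses only the counts from the confusion table printed above, typed in by hand; the baseline comparison is an addition that is not part of the original solution. It also checks the relationship C = 0.5 + Dxy/2 between the two statistics reported by somers2().

```r
# Confusion table as reported above: rows = predicted, columns = observed
# (matrix() fills column by column)
conf <- matrix(c(227, 443, 61, 585), nrow = 2,
               dimnames = list(predicted = c("geek", "nerd"),
                               observed  = c("geek", "nerd")))

# Accuracy: correct predictions (the diagonal) divided by all observations
accuracy <- sum(diag(conf)) / sum(conf)

# Baseline: always guessing the more frequent observed noun
baseline <- max(colSums(conf)) / sum(conf)

print(round(c(accuracy = accuracy, baseline = baseline), 3))

# C and Dxy carry the same information: C = 0.5 + Dxy / 2
C.from.Dxy <- 0.5 + 0.3455871 / 2
print(round(C.from.Dxy, 3))
```

With these counts the baseline comes out at about 0.509 (670 of the 1316 observations are geek), so the tree's accuracy of 0.617 is only a moderate gain over guessing.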