Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Min Split (minsplit)

The minsplit argument specifies the minimum number of observations that must exist at a node in the tree before any further splitting will be attempted.
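To make the definition concrete, here is a minimal sketch. It assumes the kyphosis dataset that ships with rpart (81 rows) as a stand-in for the weather data used on this page, and it passes minsplit through rpart.control. The object names ctrl and k.stump are only illustrative. Setting minsplit above the number of rows means even the root node is too small to split.

```r
library(rpart)

# rpart.control() returns a plain list of tuning parameters.
ctrl <- rpart.control()
ctrl$minsplit    # 20, the default
ctrl$minbucket   # 7, which is round(minsplit/3)

# Assumption: kyphosis (shipped with rpart) stands in for the weather data.
n <- nrow(kyphosis)

# With minsplit larger than the number of rows, the root may not be split.
k.stump <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis,
                 control=rpart.control(minsplit=n + 1))
nrow(k.stump$frame)   # 1, only the root node remains
```

Note that when only minsplit is supplied, rpart.control also derives minbucket (the minimum size of a terminal node) from it, so changing minsplit affects both limits.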

Using rpart directly, we specify minsplit within an option called control, which takes the result of a function called rpart.control. In this example we first build a tree with the default settings and then rebuild it with minsplit=10.



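> library(rpart)
> # weather and weatherAUS are assumed to be loaded already (e.g. from the rattle package)
> set.seed(42)
> w.train <- sample(nrow(weather), 0.5*nrow(weather))
> w.rpart <- rpart(RainTomorrow ~ Sunshine, data=weather[w.train,])
> w.rpart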



n=181 (2 observations deleted due to missingness)

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 181 25 No (0.8618785 0.1381215) *



> table(predict(w.rpart, weatherAUS[-w.train,], type="class"), 
        weatherAUS[-w.train, "RainTomorrow"])



         No   Yes
  No  21956  6204
  Yes     0     0



> w.rpart <- rpart(RainTomorrow ~ Sunshine, data=weather[w.train,], 
                   control=rpart.control(minsplit=10))
> w.rpart



n=181 (2 observations deleted due to missingness)

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 181 25 No (0.86187845 0.13812155)  
   2) Sunshine>=6.45 133 10 No (0.92481203 0.07518797) *
   3) Sunshine< 6.45 48 15 No (0.68750000 0.31250000)  
     6) Sunshine< 3.15 21  4 No (0.80952381 0.19047619)  
      12) Sunshine>=1.25 9  0 No (1.00000000 0.00000000) *
      13) Sunshine< 1.25 12  4 No (0.66666667 0.33333333)  
        26) Sunshine< 0.65 7  1 No (0.85714286 0.14285714) *
        27) Sunshine>=0.65 5  2 Yes (0.40000000 0.60000000) *
     7) Sunshine>=3.15 27 11 No (0.59259259 0.40740741)  
      14) Sunshine< 5.95 19  7 No (0.63157895 0.36842105)  
        28) Sunshine>=5.5 5  1 No (0.80000000 0.20000000) *
        29) Sunshine< 5.5 14  6 No (0.57142857 0.42857143)  
          58) Sunshine< 4.8 11  4 No (0.63636364 0.36363636) *
          59) Sunshine>=4.8 3  1 Yes (0.33333333 0.66666667) *
      15) Sunshine>=5.95 8  4 No (0.50000000 0.50000000) *
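
The printed tree is consistent with the minsplit rule: every node that was split further (181, 48, 21, 12, 27, 19 and 14 observations) holds at least 10 observations. The same check can be sketched in code. The helper split.node.sizes is hypothetical (it is not part of rpart), and the kyphosis data shipped with rpart is again assumed as a stand-in so that the block runs on its own. The helper could equally be applied to w.rpart.

```r
library(rpart)

# Hypothetical helper: sizes of the nodes that were split further.
split.node.sizes <- function(fit) {
  internal <- fit$frame$var != "<leaf>"   # terminal nodes are labelled "<leaf>"
  fit$frame$n[internal]
}

# Assumption: kyphosis (shipped with rpart) stands in for the weather data.
k.rpart <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis,
                 control=rpart.control(minsplit=10))

sizes <- split.node.sizes(k.rpart)
all(sizes >= 10)   # TRUE: no node smaller than minsplit was split
```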



> table(predict(w.rpart, weatherAUS[-w.train,], type="class"), 
        weatherAUS[-w.train, "RainTomorrow"])



         No   Yes
  No  21310  5730
  Yes   646   474
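
The two tables can be summarised numerically. The sketch below simply re-enters the counts printed above (rows are predicted classes, columns are actual classes). The helper functions accuracy and recall.yes are illustrative names, not part of rpart.

```r
# Counts copied from the two tables above; matrix() fills column by column.
cm.default <- matrix(c(21956, 0, 6204, 0), nrow=2,
                     dimnames=list(predicted=c("No", "Yes"),
                                   actual=c("No", "Yes")))
cm.minsplit10 <- matrix(c(21310, 646, 5730, 474), nrow=2,
                        dimnames=list(predicted=c("No", "Yes"),
                                      actual=c("No", "Yes")))

# Illustrative helpers: overall accuracy and the share of actual "Yes" cases found.
accuracy   <- function(cm) sum(diag(cm)) / sum(cm)
recall.yes <- function(cm) cm["Yes", "Yes"] / sum(cm[, "Yes"])

round(accuracy(cm.default), 4)       # 0.7797
round(accuracy(cm.minsplit10), 4)    # 0.7736
round(recall.yes(cm.default), 4)     # 0
round(recall.yes(cm.minsplit10), 4)  # 0.0764
```

On these counts the root-only tree never predicts rain, while the tree built with minsplit=10 identifies some rainy days at the cost of slightly lower overall accuracy.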



Copyright © 2004-2010 Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010