20.18 Summary of the Model
summary(model)
## Call:
## rpart(formula = form, data = ds[tr, vars], model = TRUE)
## n= 123722
##
##           CP nsplit rel error    xerror        xstd
## 1 0.14525806      0 1.0000000 1.0000000 0.005511437
## 2 0.03497808      1 0.8547419 0.8543189 0.005191983
## 3 0.01000000      3 0.7847858 0.7890547 0.005031264
##
## Variable importance
##    humidity_3pm        sunshine        temp_3pm wind_gust_speed        max_temp
##              78               4               4               3               3
##       cloud_3pm    humidity_9am  wind_speed_3pm  wind_speed_9am        rainfall
##               2               1               1               1               1
##
## Node number 1: 123722 observations, complexity param=0.1452581
## predicted class=No expected loss=0.2101647 P(node) =1
## class counts: 97720 26002
## probabilities: 0.790 0.210
## left son=2 (104365 obs) right son=3 (19357 obs)
## Primary splits:
## humidity_3pm < 71.5 to the left, improve=6887.670, (0 missing)
## rainfall < 0.55 to the left, improve=4079.306, (0 missing)
## rain_today splits as LR, improve=4023.390, (0 missing)
## cloud_3pm < 6.5 to the left, improve=2778.256, (0 missing)
## sunshine < 6.05 to the right, improve=2630.088, (0 missing)
## Surrogate splits:
## sunshine < 0.45 to the right, agree=0.852, adj=0.053, (0 split)
## temp_3pm < 10.55 to the right, agree=0.851, adj=0.050, (0 split)
## max_temp < 10.55 to the right, agree=0.848, adj=0.029, (0 split)
## cloud_3pm < 7.5 to the left, agree=0.846, adj=0.018, (0 split)
## rainfall < 30.3 to the left, agree=0.846, adj=0.016, (0 split)
##
## Node number 2: 104365 observations
## predicted class=No expected loss=0.1383127 P(node) =0.8435444
## class counts: 89930 14435
## probabilities: 0.862 0.138
##
## Node number 3: 19357 observations, complexity param=0.03497808
## predicted class=Yes expected loss=0.4024384 P(node) =0.1564556
## class counts: 7790 11567
## probabilities: 0.402 0.598
## left son=6 (10680 obs) right son=7 (8677 obs)
## Primary splits:
## humidity_3pm < 82.5 to the left, improve=933.6523, (0 missing)
## rainfall < 2.1 to the left, improve=636.3515, (0 missing)
## rain_today splits as LR, improve=620.9481, (0 missing)
## wind_gust_speed < 42 to the left, improve=441.6061, (0 missing)
## pressure_3pm < 1011.65 to the right, improve=391.1549, (0 missing)
## Surrogate splits:
## humidity_9am < 90.5 to the left, agree=0.615, adj=0.141, (0 split)
## cloud_3pm < 7.5 to the left, agree=0.610, adj=0.131, (0 split)
## temp_3pm < 11.95 to the right, agree=0.596, adj=0.099, (0 split)
## sunshine < 1.45 to the right, agree=0.590, adj=0.084, (0 split)
## max_temp < 11.85 to the right, agree=0.584, adj=0.072, (0 split)
##
## Node number 6: 10680 observations, complexity param=0.03497808
## predicted class=No expected loss=0.4575843 P(node) =0.08632256
## class counts: 5793 4887
## probabilities: 0.542 0.458
## left son=12 (6441 obs) right son=13 (4239 obs)
## Primary splits:
## wind_gust_speed < 42 to the left, improve=316.7443, (0 missing)
## rainfall < 2.7 to the left, improve=307.9253, (0 missing)
## rain_today splits as LR, improve=292.8988, (0 missing)
## pressure_9am < 1015.55 to the right, improve=257.6968, (0 missing)
## pressure_3pm < 1012.65 to the right, improve=243.6277, (0 missing)
## Surrogate splits:
## wind_speed_3pm < 23 to the left, agree=0.757, adj=0.389, (0 split)
## wind_speed_9am < 19.5 to the left, agree=0.749, adj=0.368, (0 split)
## pressure_9am < 1010.95 to the right, agree=0.644, adj=0.104, (0 split)
## pressure_3pm < 1011.45 to the right, agree=0.630, adj=0.068, (0 split)
## humidity_9am < 65.5 to the right, agree=0.618, adj=0.036, (0 split)
##
## Node number 7: 8677 observations
## predicted class=Yes expected loss=0.2301487 P(node) =0.07013304
## class counts: 1997 6680
## probabilities: 0.230 0.770
##
## Node number 12: 6441 observations
## predicted class=No expected loss=0.3587952 P(node) =0.05206026
## class counts: 4130 2311
## probabilities: 0.641 0.359
##
## Node number 13: 4239 observations
## predicted class=Yes expected loss=0.3923095 P(node) =0.0342623
## class counts: 1663 2576
## probabilities: 0.392 0.608
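
Several figures in this listing are tied together by simple arithmetic, and reproducing a few of them is a useful way to learn to read the output. The sketch below uses only numbers copied from the listing above. The variable names are illustrative, and reading CP as the drop in relative error per added split is an assumption that these numbers happen to bear out.

```r
# All figures are copied from the summary(model) listing; names are illustrative.
n_root      <- 123722
root_counts <- c(No = 97720, Yes = 26002)

# Expected loss of a node: share of its observations outside the predicted class.
root_loss <- 1 - max(root_counts) / sum(root_counts)

# P(node): share of all training observations that reach the node (node 3 here).
p_node3 <- 19357 / n_root

# Misclassified observations in the four leaves (nodes 2, 7, 12 and 13),
# taken as the minority class count of each leaf.
leaf_errors <- c(n2 = 14435, n7 = 1997, n12 = 2311, n13 = 1663)

# Relative error of the final tree: leaf errors relative to the root's errors.
rel_error <- sum(leaf_errors) / root_counts[["Yes"]]

# Assumed reading of the CP table: each split added between consecutive rows
# lowers the relative error by the CP of the earlier row.
cp           <- c(0.14525806, 0.03497808)
splits_added <- c(1, 2)  # nsplit goes 0 -> 1 -> 3
rel_from_cp  <- 1 - cumsum(cp * splits_added)

print(round(c(root_loss = root_loss, p_node3 = p_node3, rel_error = rel_error), 7))
print(round(rel_from_cp, 7))
```

Up to rounding, the printed values should match the expected loss of node 1, the P(node) of node 3, and the rel error column of the complexity table.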
In the following pages we dissect the various components of this summary.
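
The same components can also be pulled directly from the fitted object rather than read off the printed summary. The sketch below is a minimal illustration and not the model built in this chapter: it assumes the kyphosis dataset that ships with rpart as a stand-in for the weather data, and the object names (fit, cp_table, vi_pct, leaves) are hypothetical.

```r
library(rpart)

# kyphosis ships with rpart and stands in for the weather data used in the text.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# The complexity table printed at the top of summary().
cp_table <- fit$cptable

# Raw variable importance, rescaled to percentages that sum to roughly 100,
# which is how summary() reports it.
vi     <- fit$variable.importance
vi_pct <- round(100 * vi / sum(vi))

# The frame has one row per node; leaves are marked "<leaf>" in the var column.
leaves <- fit$frame[fit$frame$var == "<leaf>", c("n", "dev", "yval")]

print(cp_table)
print(vi_pct)
print(leaves)
```

For a classification tree the dev column of the frame holds the number of misclassified observations in each node, so the leaf rows give the counts needed to recompute the relative error.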
Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0
