14.6 Predict Class
20200607 R provides stats::predict() to obtain
predictions from the model. We can obtain the predictions, for
example, on the test dataset and thereby determine the
apparent accuracy of the model.
For different types of models stats::predict() behaves
in a similar way, though there are variations that we need to be
aware of for each. For an rpart model, use type="class" to predict the
class (i.e., Yes or No):
## 1 2 3 4 5 6
## No No No No No No
## Levels: No Yes
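The call that produces output of this shape is not shown above. As a minimal sketch, assuming a small stand-in dataset rather than the one used in this chapter, it might look like the following. The kyphosis data (which ships with rpart), the 70/30 split, and the seed are illustrative assumptions. The names ds, vars, tr, te, and model simply mirror those used in this chapter, so the class levels here are absent and present rather than No and Yes.

```r
library(rpart)

# Hypothetical stand-in for the chapter's dataset: kyphosis ships with rpart.
set.seed(42)
ds   <- kyphosis
vars <- names(ds)
nobs <- nrow(ds)

# Illustrative 70/30 split into training and test row indices.
tr <- sample(nobs, round(0.7 * nobs))
te <- setdiff(seq_len(nobs), tr)

# Fit a classification tree on the training rows only.
model <- rpart(Kyphosis ~ ., data=ds[tr, vars], method="class")

# type="class" returns a factor of predicted class labels,
# one per row of newdata.
predict_te <- predict(model, newdata=ds[te, vars], type="class")

head(predict_te)
```

Without type="class" an rpart classification model returns a matrix of class probabilities instead, one column per class.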
We can then compare this to the actual class for these observations as
is recorded in the original test dataset. The actual classes
have already been stored as the variable target_te:
## [1] Yes Yes No No No No
## Levels: No Yes
We can observe from the above that the model correctly predicts
4 of the first 6 observations from the test dataset, suggesting a
67% accuracy. Over the full 31,275 observations contained in the
test dataset 26,091 are correctly predicted, which is 83% accurate.
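The arithmetic behind these two figures can be checked with a short sketch. The six predicted and actual values are copied from the output shown above, and the full-dataset counts are those quoted in the text. The variable names acc6 and acc_full are introduced here purely for illustration.

```r
# First six predictions and actual classes, as printed above.
predict_te <- factor(c("No", "No", "No", "No", "No", "No"),
                     levels=c("No", "Yes"))
target_te  <- factor(c("Yes", "Yes", "No", "No", "No", "No"),
                     levels=c("No", "Yes"))

# Accuracy is the proportion of predictions that match the actual class.
acc6 <- mean(predict_te == target_te)     # 4 of 6 correct
sprintf("%.0f%%", 100 * acc6)             # "67%"

# Over the full test dataset, using the counts quoted in the text.
acc_full <- 26091 / 31275
sprintf("%.0f%%", 100 * acc_full)         # "83%"
```

Note that the model here predicts No for all six observations, so the 4 correct predictions are simply the 4 observations whose actual class is No.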
For different evaluations of the model we will collect the class predictions from the training and tuning datasets as well:
ds[tr, vars] %>% predict(model, newdata=., type="class") -> predict_tr
ds[tu, vars] %>% predict(model, newdata=., type="class") -> predict_tu
We can also calculate the accuracy for each of these datasets:
## 83%
## 83%
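The code that computes these accuracies is not shown. One way to do it, sketched here under stated assumptions, is a small helper applied to each partition. The kyphosis stand-in data, the 70/15/15 split, the seed, and the helper name accuracy() are all hypothetical and are not taken from this chapter, so the resulting percentages will differ from those above.

```r
library(rpart)

# Hypothetical stand-in data and a 70/15/15 split into
# training (tr), tuning (tu), and test (te) row indices.
set.seed(42)
ds   <- kyphosis
vars <- names(ds)
nobs <- nrow(ds)

idx  <- sample(nobs)
n_tr <- round(0.70 * nobs)
n_tu <- round(0.15 * nobs)
tr   <- idx[seq_len(n_tr)]
tu   <- idx[n_tr + seq_len(n_tu)]
te   <- idx[(n_tr + n_tu + 1):nobs]

model <- rpart(Kyphosis ~ ., data=ds[tr, vars], method="class")

# Illustrative helper: proportion of rows in `rows` whose predicted
# class matches the recorded class.
accuracy <- function(rows) {
  predicted <- predict(model, newdata=ds[rows, vars], type="class")
  mean(predicted == ds[rows, "Kyphosis"])
}

acc <- sapply(list(tr=tr, tu=tu, te=te), accuracy)
sprintf("%.0f%%", 100 * acc)
```

Accuracy on the training dataset is usually optimistic because the model has already seen those observations, which is why the tuning and test figures are the more useful guides.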
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0