14.11 ROC Chart
Another measure of the quality of a model is the ROC curve, and in particular the area under the ROC curve (AUC). This area can be calculated using ROCR::prediction() and ROCR::performance() from ROCR (Sing et al. 2020). These functions work with the predicted probability of a class rather than the predicted class itself.
In the following code block we obtain the predicted probabilities from the model, predicting over the te dataset. The result from calling stats::predict() is a matrix with one column for each possible class value, recording the probability of each class for each observation. The second column is the one of interest (the probability that it will rain tomorrow, or rain_tomorrow==yes). These probabilities are passed on to ROCR::prediction() to compare them with the actual target values. The result is then passed on to ROCR::performance(), from which we obtain the attribute y.values and then use magrittr::extract2() to extract the first value as the area under the curve.
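The steps described above can be sketched as follows. This is a minimal illustration rather than the book's own code: the synthetic data frame ds, its columns humidity and pressure, the index vectors tr and te, and the use of rpart::rpart() as the model are all assumptions made so that the example is self-contained.

```r
library(rpart)     # Decision tree, used here only as a stand-in model.
library(ROCR)      # prediction() and performance().
library(magrittr)  # The pipe operator %>% and extract2().

set.seed(42)

# Hypothetical synthetic data standing in for the weather dataset.
n  <- 500
ds <- data.frame(humidity = runif(n, 0, 100),
                 pressure = rnorm(n, 1010, 8))
ds$rain_tomorrow <- factor(ifelse(ds$humidity + rnorm(n, 0, 20) > 60,
                                  "yes", "no"),
                           levels = c("no", "yes"))

# Assumed row indices for the training (tr) and test (te) observations.
tr <- sample(n, 0.7 * n)
te <- setdiff(seq_len(n), tr)

model <- rpart(rain_tomorrow ~ ., data = ds[tr, ])

# predict() returns a matrix of class probabilities with one column
# per class. The second column is the probability of "yes".
pr <- predict(model, newdata = ds[te, ], type = "prob")[, 2]

# Compare the probabilities with the actual target values, request
# the AUC, then pull the first element out of the y.values attribute.
auc <- pr %>%
  prediction(ds$rain_tomorrow[te]) %>%
  performance("auc") %>%
  attr("y.values") %>%
  extract2(1)

auc
```

The value returned lies between 0 and 1, with 0.5 corresponding to a model that ranks observations no better than chance. Because the data here are simulated, the particular value obtained says nothing about the weather model discussed in the text.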
References
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0