18.2 Algorithms Data and Variables
20210103
The rattle::weatherAUS dataset is loaded into the template variable ds, and further template variables are set up as introduced by Graham J. Williams (2017). See Chapter 8 for details.
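# Load the dataset by name and record its size and original variable names.
dsname <- "weatherAUS"
ds     <- get(dsname)
nobs   <- nrow(ds)
vnames <- names(ds)

# Normalise the variable names and map the new names to the originals.
ds %<>% clean_names(numerals="right")
names(vnames) <- names(ds)

# Identify the target and move it to the end of the variable list.
vars   <- names(ds)
target <- "rain_tomorrow"
vars   <- c(target, vars) %>% unique() %>% rev()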
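The final step of the listing above places the target at the end of vars. A minimal base R sketch on a hypothetical four-name vector (not the actual weatherAUS names) illustrates the effect. Note that the remaining names come out in reverse order as a side effect of rev().

```r
# Toy variable names, assumed for illustration only.
vars   <- c("date", "location", "rain_tomorrow", "risk_mm")
target <- "rain_tomorrow"

# Put the target first, drop its duplicate, then reverse so it ends up last.
vars <- rev(unique(c(target, vars)))
print(vars)
```

The same idiom works for any target name that appears in vars, since unique() keeps only the first occurrence.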
For the rattle::weatherAUS dataset we similarly define the following template variables (Graham J. Williams 2017) used for predictive modelling. See Chapter 8 for details.
# Roles: the risk variable and the identifiers are ignored as model inputs.
risk   <- "risk_mm"
id     <- c("date", "location")
ignore <- c(risk, id)
vars   <- setdiff(vars, ignore)
inputs <- setdiff(vars, target)
form   <- formula(target %s+% " ~ .")

# Impute missing values in the retained variables.
ds[vars] %<>% na.roughfix()

# Partition the row indices into training, tuning and testing sets.
SPLIT <- c(0.70, 0.15, 0.15)
nobs %>% sample(SPLIT[1]*nobs) -> tr
nobs %>% seq_len() %>% setdiff(tr) %>% sample(SPLIT[2]*nobs) -> tu
nobs %>% seq_len() %>% setdiff(tr) %>% setdiff(tu) -> te

# Record the actual target values for each partition.
ds %>% slice(tr) %>% pull(target) -> actual_tr
ds %>% slice(tu) %>% pull(target) -> actual_tu
ds %>% slice(te) %>% pull(target) -> actual_te

# Record the risk values for each partition.
ds %>% slice(tr) %>% pull(risk) -> risk_tr
ds %>% slice(tu) %>% pull(risk) -> risk_tu
ds %>% slice(te) %>% pull(risk) -> risk_te
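As a quick sanity check on the partitioning step, the sketch below repeats the same sampling logic in base R, without the pipe operators. The seed and the nobs value of 1000 are assumptions chosen for illustration, not values from the weatherAUS dataset. It confirms that tr, tu and te are disjoint and together cover every row index.

```r
set.seed(42)                      # assumed seed, for reproducibility only
nobs  <- 1000                     # hypothetical number of observations
SPLIT <- c(0.70, 0.15, 0.15)

tr <- sample(nobs, SPLIT[1] * nobs)                        # training rows
tu <- sample(setdiff(seq_len(nobs), tr), SPLIT[2] * nobs)  # tuning rows
te <- setdiff(setdiff(seq_len(nobs), tr), tu)              # testing rows

# The three index sets should not overlap and should cover 1..nobs.
stopifnot(length(intersect(tr, tu)) == 0,
          length(intersect(tr, te)) == 0,
          length(intersect(tu, te)) == 0,
          setequal(c(tr, tu, te), seq_len(nobs)))

# Report the size of each partition.
print(lengths(list(tr = tr, tu = tu, te = te)))
```

Because te takes whatever remains after tr and tu are drawn, any rounding in the first two sizes is absorbed by the testing set.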
References
Williams, Graham J. 2017. The Essentials of Data Science: Knowledge Discovery Using R. The R Series. CRC Press.
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0
