11.2 Visualisation Data

The rattle::weatherAUS dataset is loaded into the template variable ds, and further template variables are set up as introduced by Graham J. Williams (2017). See Chapter 8 for details.

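# Load the packages that provide the dataset and the functions used below.
library(rattle)        # weatherAUS
library(magrittr)      # %>% and %<>%
library(janitor)       # clean_names()
library(randomForest)  # na.roughfix()

dsname <- "weatherAUS"
ds     <- get(dsname)
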
nobs   <- nrow(ds)

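# Keep the original names, normalise the column names, and map new to old.
vnames <- names(ds)
ds    %<>% clean_names(numerals="right")
names(vnames) <- names(ds)

# List the variables with the target moved to the end.
vars   <- names(ds)
target <- "rain_tomorrow"
vars   <- c(target, vars) %>% unique() %>% rev()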
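
The effect of the name mapping and the reordering of vars can be checked on a small example. The following is a minimal sketch in base R, assuming a made-up data frame toy and assuming that clean_names() would convert the original names shown here into the snake case names used for its columns. It is not the weatherAUS data itself.

```r
# Original names as they might appear before cleaning (illustrative only).
orig_names <- c("Date", "MinTemp", "RainTomorrow", "RISK_MM")

# Toy stand-in for the dataset, with names as clean_names() is assumed
# to produce them. The values are made up.
toy <- data.frame(date          = as.Date("2020-01-01") + 0:2,
                  min_temp      = c(10.1, 12.3, 9.8),
                  rain_tomorrow = factor(c("No", "Yes", "No")),
                  risk_mm       = c(0, 3.2, 0))

# Map each cleaned name back to its original name.
vnames <- orig_names
names(vnames) <- names(toy)
print(vnames["min_temp"])

# Put the target first, drop its duplicate, then reverse so it ends up last.
target <- "rain_tomorrow"
vars   <- rev(unique(c(target, names(toy))))
print(vars)
```

Placing the target at the end of vars keeps the inputs and the target in a predictable order for later subsetting.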

We also do a little more to set the data up for demonstrating various approaches to visualisation. As with the model template, a number of template variables are identified here. We also do a little data wrangling to remove all missing values by imputing them with randomForest::na.roughfix().

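# Variables to ignore: the risk variable and the identifiers.
risk   <- "risk_mm"
id     <- c("date", "location")
ignore <- c(risk, id)
vars   <- setdiff(vars, ignore)
inputs <- setdiff(vars, target)

# Impute missing values in the remaining variables.
ds[vars] %<>% na.roughfix()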
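
To make the imputation step concrete, here is a minimal sketch in base R of the kind of fix that na.roughfix() is documented to apply: missing numeric values are replaced with the column median and missing factor values with the most frequent level. The helper rough_fix_sketch() and the data frame toy are hypothetical names introduced only for this illustration. The real function breaks ties between levels at random, whereas this sketch simply takes the first.

```r
# Hypothetical helper mimicking the documented behaviour of
# randomForest::na.roughfix(): median for numeric columns, most
# frequent level for factors (ties go to the first level found).
rough_fix_sketch <- function(df) {
  for (v in names(df)) {
    x <- df[[v]]
    if (!any(is.na(x))) next
    if (is.numeric(x)) {
      x[is.na(x)] <- median(x, na.rm = TRUE)
    } else if (is.factor(x)) {
      counts <- table(x)  # table() excludes NA by default
      x[is.na(x)] <- names(counts)[which.max(counts)]
    }
    df[[v]] <- x
  }
  df
}

# Made-up data with one missing value in each column.
toy <- data.frame(min_temp   = c(10, NA, 14, 12),
                  rain_today = factor(c("No", "No", NA, "Yes")))

fixed <- rough_fix_sketch(toy)
print(fixed)
```

This kind of imputation is quick and convenient for visualisation, but it does shift the distribution of each variable towards its centre.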

References

Williams, Graham J. 2017. The Essentials of Data Science: Knowledge Discovery Using R. The R Series. CRC Press.

Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0