14.2 ML Data and Variables
20210104
The rattle::weatherAUS dataset is loaded into the template variable ds, and further template variables are set up as introduced by Graham J. Williams (2017). See Chapter 8 for details.
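library(rattle)    # weatherAUS dataset
library(magrittr)  # %>% and %<>% pipes
library(janitor)   # clean_names()
library(dplyr)     # sample_frac(), select()

dsname <- "weatherAUS"
ds     <- get(dsname)
nobs   <- nrow(ds)
vnames <- names(ds)
ds    %<>% clean_names(numerals="right")
names(vnames) <- names(ds)
vars   <- names(ds)
target <- "rain_tomorrow"
vars   <- c(target, vars) %>% unique() %>% rev()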
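The effect of the last few template assignments can be hard to see on the full dataset, so here is a minimal sketch on a toy data frame using base R only. The four columns and the hand-written snake_case names are assumptions for illustration: they stand in for weatherAUS and for the result of clean_names(), and are not the real data. The sketch shows that vnames becomes a lookup from cleaned names back to the original names, and that the unique/rev idiom places the target last while reversing the order of the other variables.

```r
# Toy stand-in for the dataset; columns and values are illustrative only.
ds <- data.frame(Date         = as.Date("2020-01-01") + 0:2,
                 Location     = c("Albany", "Canberra", "Penrith"),
                 RainToday    = c("No", "Yes", "No"),
                 RainTomorrow = c("Yes", "No", "No"))

vnames <- names(ds)                 # remember the original names

# Hand-written stand-in for clean_names(): assign snake_case names.
names(ds) <- c("date", "location", "rain_today", "rain_tomorrow")

names(vnames) <- names(ds)          # lookup: cleaned name -> original name

vars   <- names(ds)
target <- "rain_tomorrow"

# Base R equivalent of: c(target, vars) %>% unique() %>% rev()
# Putting target first, de-duplicating, then reversing moves it to the end.
vars <- rev(unique(c(target, vars)))

print(vars)                         # target is now the last element
print(vnames[target])               # original name of the target column
```

With the target at the end of vars, the input variables are simply every element of vars except the last one.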
It is always useful to remind ourselves of the dataset with a random sample:
ds %>% sample_frac() %>% select(date, location, sample(3:length(vars), 5))
## # A tibble: 217,049 × 7
## date location humidity_3pm wind_speed_3pm rainfall press…¹ wind_…²
## <date> <chr> <int> <dbl> <dbl> <dbl> <dbl>
## 1 2017-06-07 BadgerysCreek 90 19 17.4 1021. 43
## 2 2008-12-05 Penrith 48 15 0 NA 28
## 3 2022-04-28 Albany NA NA 19.8 1013. NA
## 4 2017-10-08 Mildura 33 17 0.2 1008. 39
## 5 2011-06-02 WaggaWagga 46 15 0 1023. 28
## 6 2013-08-06 Penrith 33 24 0 NA 50
## 7 2015-11-28 Canberra 30 17 0 1013. 52
## 8 2011-01-08 CoffsHarbour 87 28 5.6 1009. 33
....
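The sampling step does two things: it shuffles the rows, and it keeps date and location together with five columns chosen at random from position 3 onward. The following is a rough base R sketch of the same idea on a toy data frame, not the pipeline used above. The column names and values are made up for illustration, and set.seed() is an addition here so that the sample is repeatable.

```r
set.seed(42)  # assumption: fix the seed so the same sample is drawn each run

# Toy data frame with two identifying columns and six numeric columns.
ds <- data.frame(date         = as.Date("2020-01-01") + 0:9,
                 location     = rep(c("Albany", "Canberra"), 5),
                 min_temp     = rnorm(10, mean = 12),
                 max_temp     = rnorm(10, mean = 24),
                 rainfall     = rexp(10),
                 humidity_9am = runif(10, 30, 100),
                 humidity_3pm = runif(10, 20, 90),
                 pressure_9am = rnorm(10, mean = 1015, sd = 5))

rows <- sample(nrow(ds))            # random permutation of the row indices
cols <- sample(3:ncol(ds), 5)       # five random column positions, skipping 1 and 2

# Shuffled rows; date and location first, then the five sampled columns.
smpl <- ds[rows, c(1, 2, cols)]

print(head(smpl, 3))
```

Because the columns are drawn at random, each run without a fixed seed shows a different slice of the dataset, which is the point of the reminder.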
References
Williams, Graham J. 2017. The Essentials of Data Science: Knowledge Discovery Using R. The R Series. CRC Press.
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0
