14.2 ML Data and Variables
20210104
The rattle::weatherAUS dataset is loaded into the template variable ds, and further template variables are set up as introduced by Graham J. Williams (2017). See Chapter 8 for details.
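library(rattle)    # Provides the weatherAUS dataset.
library(magrittr)  # Provides the pipe operators %>% and %<>%.
library(janitor)   # Provides clean_names().
library(dplyr)     # Provides sample_frac() and select().

dsname <- "weatherAUS"                  # Name of the dataset.
ds     <- get(dsname)                   # The dataset itself.
nobs   <- nrow(ds)                      # Number of observations.
vnames <- names(ds)                     # Original variable names.
ds %<>% clean_names(numerals="right")   # Normalise the variable names.
names(vnames) <- names(ds)              # Map normalised names to the originals.
vars   <- names(ds)                     # Variables of interest.
target <- "rain_tomorrow"               # Variable to predict.
vars   <- c(target, vars) %>% unique() %>% rev()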
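The effect of the final reordering of vars, and of naming vnames by the normalised names, is easy to miss. The following is a minimal sketch in base R using a handful of hypothetical column names as a stand-in for names(ds); the variable inputs is likewise an illustrative assumption and not one of the template variables above.

```r
# Hypothetical original and normalised names standing in for the real dataset.
vnames <- c("Date", "Location", "MinTemp", "RainToday", "RainTomorrow")
vars   <- c("date", "location", "min_temp", "rain_today", "rain_tomorrow")
names(vnames) <- vars   # Lookup table: normalised name -> original name.
target <- "rain_tomorrow"

# c(target, vars) places the target first, unique() drops its later
# duplicate, and rev() reverses the vector so that the target ends up last.
vars <- rev(unique(c(target, vars)))
print(vars)
# "rain_today" "min_temp" "location" "date" "rain_tomorrow"

# Recover the original name of a normalised variable.
print(vnames[[target]])
# "RainTomorrow"

# Illustrative only: the candidate input variables are everything but the target.
inputs <- setdiff(vars, target)
print(inputs)
```

Note that rev() reverses the order of the remaining variables as well as moving the target to the end.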
It is always useful to remind ourselves of the dataset with a random sample:
ds %>% sample_frac() %>% select(date, location, sample(3:length(vars), 5))
## # A tibble: 176,747 × 7
## date location cloud_3pm sunshine evaporation max_temp min_temp
## <date> <chr> <int> <dbl> <dbl> <dbl> <dbl>
## 1 2017-11-28 Canberra 3 NA NA 28.4 11.5
## 2 2012-06-26 Woomera 6 1.4 2.6 15.5 4.9
## 3 2015-01-18 Cairns 7 4.1 6.8 34.2 26.2
## 4 2018-06-11 Sale NA NA NA 15.3 2.2
## 5 2009-10-10 Watsonia 4 10.4 3.2 19.3 2.7
## 6 2010-04-12 Richmond NA NA 9.2 24.1 10.2
## 7 2008-11-06 Hobart 7 5 5.2 15.2 9.6
## 8 2011-10-21 CoffsHarbour 1 11.8 3.8 23.3 13.2
## 9 2013-07-01 Newcastle 6 NA NA 17.5 10.6
## 10 2011-01-20 Adelaide NA 13.5 7 37 17
## # … with 176,737 more rows
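Because both the row order and the choice of columns are random, the sample differs on every run. When a repeatable view is wanted, the random number generator can be seeded first. The sketch below assumes a small hypothetical data frame, toy, in place of ds, and a hypothetical helper, peek(), that wraps the same pipeline.

```r
library(dplyr)

# Hypothetical stand-in for ds: two identifying columns plus three measurements.
toy <- data.frame(
  date     = as.Date("2021-01-01") + 0:5,
  location = c("Canberra", "Sydney", "Hobart", "Cairns", "Perth", "Darwin"),
  min_temp = c(11.5, 4.9, 26.2, 2.2, 2.7, 10.2),
  max_temp = c(28.4, 15.5, 34.2, 15.3, 19.3, 24.1),
  sunshine = c(NA, 1.4, 4.1, NA, 10.4, NA)
)

# Return the rows in random order, keeping date and location plus two
# randomly chosen columns from position 3 onwards.
peek <- function(data, seed) {
  set.seed(seed)  # Fix the state of the random number generator.
  data %>%
    sample_frac() %>%
    select(date, location, sample(3:ncol(data), 2))
}

first  <- peek(toy, seed = 42)
second <- peek(toy, seed = 42)

# With the same seed the two samples should be the same.
print(identical(first, second))
print(first)
```

The seed value itself is arbitrary; what matters is that it is set immediately before the sampling pipeline is run.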
References
Williams, Graham J. 2017. The Essentials of Data Science: Knowledge Discovery Using R. The R Series. CRC Press.
Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0
