10.19 Data Ingestion
20180721 Having identified the source of the dataset, we can ingest it into the computer's memory using readr::read_csv(), which returns an enhanced data frame.
We set up a reference to the data frame’s location in the computer’s
memory by assigning the result of the call to the function
readr::read_csv() to the R variable weatherAUS.
# Ingest the dataset.
weatherAUS <- read_csv(file=dspath)
As a side effect of calling the function readr::read_csv(), helpful messages are displayed that identify the data type guessed for each of the variables found in the ingested dataset. We should review these to ensure they match our expectations. If they don’t, optional arguments to readr::read_csv(), such as col_types=, let us specify the types explicitly.
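As a minimal sketch of overriding the guessed types, the col_types= argument of readr::read_csv() accepts a column specification built with readr::cols(). The small temporary CSV file and its contents are made up here purely for illustration and are not the actual dataset. The variable name demo is likewise hypothetical.

```r
library(readr)

# A small stand-in CSV file (hypothetical data, for illustration only).

tmp <- tempfile(fileext=".csv")
writeLines(c("Date,Location,RainToday",
             "2008-12-01,Albury,No",
             "2008-12-02,Albury,Yes"),
           tmp)

# Declare the expected types rather than relying on guessing.

demo <- read_csv(file=tmp,
                 col_types=cols(Date=col_date(format="%Y-%m-%d"),
                                Location=col_character(),
                                RainToday=col_factor(levels=c("No", "Yes"))))

# Review the column specification that was actually used.

spec(demo)
```

Listing only the columns that were guessed incorrectly should also work, since readr guesses the type of any column not named in cols().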
Note that the rattle (G. Williams 2020) package also provides a smaller
rattle::weather dataset as an R dataset, named
weather. Simply by attaching the rattle (G. Williams 2020)
package from the library, a variable called weather becomes
available. Running the above command masks any dataset of the same
name provided by rattle (G. Williams 2020). Having done so, we can
still access the weather dataset provided by rattle (G. Williams 2020)
using the package prefix, as in rattle::weather.
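The masking behaviour can be sketched as follows, assuming the rattle package is installed. The one-row data frame assigned to weather is a hypothetical stand-in for an ingested dataset, not real data.

```r
library(rattle)

# After attaching rattle, weather refers to the package's dataset.

dim(weather)

# A stand-in for our own ingested data (hypothetical values).

weather <- data.frame(Location="Albury", RainToday="No")

# The variable in the global environment now masks the rattle dataset.

dim(weather)

# The package prefix still reaches the version provided by rattle.

dim(rattle::weather)
```

Removing our own variable with rm(weather) would make the unprefixed name refer to the rattle dataset again.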
Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.