10.22 Data Ingestion

20180721 Having identified the source of the dataset, we can ingest it into the computer's memory using readr::read_csv(), which returns an enhanced data frame (a tibble).

We set up a reference to the data frame’s location in the computer’s memory by assigning the result of the call to readr::read_csv() to the R variable weatherAUS.

# Ingest the dataset.

weatherAUS <- read_csv(file=dspath)

As a side effect of calling readr::read_csv(), helpful messages are displayed that identify the data type chosen for each variable in the ingested dataset. We should review these to ensure they match our expectations. If they don’t, optional arguments to readr::read_csv(), such as col_types=, can be used to specify the types explicitly.
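As a sketch of how the guessed types might be overridden, the following uses the col_types= argument of readr::read_csv(). The temporary file and its two rows are made up for illustration and only borrow a few column names from the weather data; the particular types chosen are assumptions, not a recommendation for the full dataset.

```r
library(readr)

# Hypothetical stand-in for the real file: a tiny CSV written to a
# temporary path. The rows are illustrative only.

tmp <- tempfile(fileext=".csv")
writeLines(c("Date,Location,MinTemp,RainToday",
             "2008-12-01,Albury,13.4,No",
             "2008-12-02,Albury,7.4,Yes"),
           tmp)

# By default read_csv() guesses the column types and reports them.

guessed <- read_csv(file=tmp)

# Specify the types explicitly where the guess is not what we want.

typed <- read_csv(file=tmp,
                  col_types=cols(Date=col_date(format="%Y-%m-%d"),
                                 Location=col_character(),
                                 MinTemp=col_double(),
                                 RainToday=col_factor(levels=c("No", "Yes"))))

# Review the column specification that was actually used.

spec(typed)
```

Here RainToday is read as a factor rather than as a character string, which is one common reason to override the defaults.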

Note that the rattle package (G. Williams 2023) also provides a smaller rattle::weather dataset as an R dataset, named weather. Simply attaching rattle from the library makes a variable called weather available. Assigning ingested data to a variable of the same name hides the dataset provided by rattle for the rest of the session. We can still access the packaged dataset using the package prefix, as in rattle::weather.
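The interaction between our own variable and the dataset supplied by rattle can be sketched as follows. This assumes the rattle package is installed; the one-row data frame is a made-up stand-in for an ingested dataset.

```r
library(rattle)       # Attaching the package makes weather visible.

dim(weather)          # The dataset supplied by rattle.

# Assign our own data to the same name. A tiny made-up data frame
# stands in for the ingested dataset.

weather <- data.frame(Date=as.Date("2008-12-01"), MinTemp=13.4)

dim(weather)          # Now refers to our own data frame.
dim(rattle::weather)  # The packaged dataset is still reachable.

# Removing our variable uncovers the packaged dataset again.

rm(weather)
dim(weather)
```

The package prefix is the safer habit whenever a name might be in use for something else in the session.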

References

Williams, Graham. 2023. Rattle: Graphical User Interface for Data Science in R. https://rattle.togaware.com/.


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0