3.17 Pipes: Tee Pipe Load CSV Files

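20210103 This use case loads all .csv.gz files in the data folder into a single data frame, printing the path of each file as it is loaded so that progress can be monitored. The tee pipe %T>% passes the path to cat() for the side effect of printing it, then forwards the path itself, rather than the value returned by cat(), to the next step of the pipeline.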

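library(magrittr)  # Provides the %>% and %T>% operators.

fpath <- "data"
# dir() matches a regular expression, not a shell glob.
files <- dir(fpath, pattern="\\.csv\\.gz$")
ds    <- data.frame()
for (i in seq_along(files))
{
  fpath %>%
    file.path(files[i]) %T>%  # Tee: print the path, then pass it on unchanged.
    cat("\n") %>%
    readr::read_csv() %>%
    rbind(ds, .) ->
  ds
}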
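
The loop above grows ds one file at a time. As a rough alternative sketch, the same tee pipe can sit inside lapply() so that the files are read into a list and bound together once at the end. The folder tee_demo, the file names part1.csv.gz and part2.csv.gz, and their contents are hypothetical, created here only so that the example runs on its own, and base read.csv() stands in for readr::read_csv() to avoid the extra dependency.

```r
library(magrittr)

# Hypothetical demo data: two small gzipped CSV files in a temporary folder.
fpath <- file.path(tempdir(), "tee_demo")
dir.create(fpath, showWarnings=FALSE)
for (n in 1:2)
{
  out <- gzfile(file.path(fpath, paste0("part", n, ".csv.gz")))
  # write.csv() opens and closes the unopened connection itself.
  write.csv(data.frame(id=n, value=n*10), out, row.names=FALSE)
}

# The pattern is a regular expression; full.names=TRUE returns usable paths.
files <- dir(fpath, pattern="\\.csv\\.gz$", full.names=TRUE)

# Read each file, printing its path through the tee pipe, then bind once.
ds <- files %>%
  lapply(function(f) f %T>% cat("\n") %>% read.csv()) %>%
  do.call(rbind, .)

ds
```

Binding once at the end avoids copying the accumulated data frame on every iteration, which may matter when there are many files.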

The tee pipe is also useful when combined with a sub-pipeline introduced with curly braces, where a global assignment operator (->>) saves the result of the sub-pipeline for later use. The example is a little contrived, though illustrative. The sub-pipeline finds, for each wind gust direction, the maximum of max_temp over all days in that group, sorts the directions by that maximum, and saves the resulting order in the global variable lvls. Because the sub-pipeline is introduced by the tee pipe, its own result is discarded and the filtered data frame passes through unchanged to mutate(), which uses lvls to reorder the levels of the wind_gust_dir variable. This is typically done within a pipeline that feeds into a plot where we want the order of the levels to be meaningful, for example to order the bars in a bar plot.

levels(ds$wind_gust_dir)
##  [1] "N"   "NNE" "NE"  "ENE" "E"   "ESE" "SE"  "SSE" "S"   "SSW" "SW"  "WSW"
## [13] "W"   "WNW" "NW"  "NNW"
ds %>%
  filter(rainfall>0) %T>%
  { 
    select(., wind_gust_dir, max_temp) %>%
      group_by(wind_gust_dir) %>% 
      summarise(max_max_temp=max(max_temp), .groups="drop") %>%
      arrange(max_max_temp) %>%
      pull(wind_gust_dir) ->>
    lvls
  } %>%
  mutate(wind_gust_dir=factor(wind_gust_dir, levels=lvls)) %$%
  levels(wind_gust_dir)
##  [1] "N"   "NNE" "NE"  "ENE" "E"   "ESE" "SE"  "SSE" "S"   "SSW" "SW"  "WSW"
## [13] "W"   "WNW" "NW"  "NNW"
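
The same pattern can be tried without the weather data. The following is a minimal sketch that assumes the built-in mtcars data set as a stand-in: cyl plays the role of wind_gust_dir, hp the role of max_temp, and the names cars, max_hp and new_levels are introduced here purely for illustration. The levels are sorted in descending order of the group maximum so that the change from the default order is visible.

```r
library(dplyr)
library(magrittr)

# mtcars stands in for the weather data (an assumption for this sketch).
cars <- mtcars %>% mutate(cyl=factor(cyl))

levels(cars$cyl)
## [1] "4" "6" "8"

cars %>%
  filter(mpg > 15) %T>%
  {
    # Sub-pipeline: order the cyl groups by their largest hp, biggest first.
    group_by(., cyl) %>%
      summarise(max_hp=max(hp), .groups="drop") %>%
      arrange(desc(max_hp)) %>%
      pull(cyl) %>%
      as.character() ->>
    lvls
  } %>%
  # The tee pipe discards the sub-pipeline result; the filtered data arrives here.
  mutate(cyl=factor(cyl, levels=lvls)) %$%
  levels(cyl) ->
new_levels

new_levels
## [1] "8" "6" "4"
```

The global assignment is what makes lvls visible to mutate(), since the tee pipe throws away the value of the braced block itself.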


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0