10.51 Ordered Factor

20180723 Because compass directions have a natural order, we convert each of the wind direction variables into an ordered factor. We do so by passing ordered=TRUE to base::factor(), with the levels supplied in the required order through the compass vector.
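As a minimal sketch of what ordered=TRUE changes, the following builds an ordered factor from a small made-up sample. The compass vector defined here is an assumption (a 16-point compass listed clockwise from north) standing in for the one used in this chapter, and wd is hypothetical data.

```r
# Assumed 16-point compass, clockwise from north. This is a stand-in
# for the compass vector used in the chapter, not its actual definition.
compass <- c("N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW")

# A small made-up sample of wind directions, including a missing value.
wd <- c("SW", "N", "ESE", "N", NA)

# Levels follow the order given in compass rather than alphabetical order.
f <- factor(wd, levels=compass, ordered=TRUE)

class(f)   # "ordered" "factor"
f < "E"    # Comparisons use the level order: FALSE TRUE FALSE TRUE NA
sort(f)    # Sorted by level order; missing values are dropped by default.
```

An unordered factor would reject the comparison f < "E" with a warning and return missing values, so the ordering is what makes such comparisons meaningful.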

# Note the names of the wind direction variables.

ds %>% 
  select(contains("_dir")) %>% 
  names() %T>%
  print() ->
vnames
## [1] "wind_gust_dir" "wind_dir_9am"  "wind_dir_3pm"

# Convert these variables from character to ordered factor.

ds[vnames] %<>% 
  lapply(factor, levels=compass, ordered=TRUE) %>% 
  data.frame() %>% 
  as_tibble()

# Confirm they are now ordered factors.

ds[vnames] %>% sapply(class)
##      wind_gust_dir wind_dir_9am wind_dir_3pm
## [1,] "ordered"     "ordered"    "ordered"   
## [2,] "factor"      "factor"     "factor"
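The class check returns a matrix because each ordered factor carries two classes, "ordered" and "factor". A more compact check, sketched here on a hypothetical two-column data frame with an assumed compass vector, uses base::is.ordered() to get one logical value per variable. The conversion is written in plain base R assignment form as an alternative to the pipeline.

```r
# Assumed 16-point compass, clockwise from north (a stand-in for the
# chapter's compass vector).
compass <- c("N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW")

# Hypothetical data with two character columns of wind directions.
df <- data.frame(wind_dir_9am=c("N", "SW", "ESE"),
                 wind_dir_3pm=c("E", "N", NA),
                 stringsAsFactors=FALSE)
vnames <- names(df)

# Base R equivalent of the conversion: assigning a list back into the
# selected columns keeps df a data frame.
df[vnames] <- lapply(df[vnames], factor, levels=compass, ordered=TRUE)

# One TRUE/FALSE per variable instead of a two-row matrix of class names.
checks <- sapply(df[vnames], is.ordered)
print(checks)

# Stop with an error if any variable was not converted.
stopifnot(all(checks))
```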

We can again tabulate the variables to confirm that only the data type has changed and the distribution of values is the same as before.

# Verify the distribution has not changed.

ds %>%
  select(contains("_dir")) %>%
  sapply(table)
##     wind_gust_dir wind_dir_9am wind_dir_3pm
## N           12894        16321        12402
## NNE          9520        11693         9663
## NE          10340        10906        11925
## ENE         11713        11255        11306
## E           13074        13304        11832
## ESE         10792        11423        12195
## SE          13364        13367        15053
## SSE         12815        13065        13001
## S           13019        12295        13797
## SSW         12819        11003        11829
## SW          12625        11990        12976
....
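One thing the tabulation will not show by default is missing values: base::table() drops them unless useNA is set, and base::factor() silently converts any value not listed in levels to NA. A rough sketch of an extra check, using made-up data and an assumed compass vector, compares the number of missing values before and after the conversion and lists any values that did not match a level.

```r
# Assumed 16-point compass, clockwise from north (a stand-in for the
# chapter's compass vector).
compass <- c("N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW")

# Made-up sample: "sw" is deliberately not a valid level.
x <- c("N", "SW", "sw", NA, "ESE")
f <- factor(x, levels=compass, ordered=TRUE)

# Count missing values before and after the conversion.
na_before <- sum(is.na(x))
na_after  <- sum(is.na(f))
print(c(before=na_before, after=na_after))

# Values present in the data that are not among the levels.
unmatched <- setdiff(unique(x[!is.na(x)]), compass)
print(unmatched)

# Include missing values in the tabulation.
table(f, useNA="ifany")
```

If the two counts differ, the values listed in unmatched are the ones lost in the conversion.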


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0