17.4 CSV Files Basics

20200421 The mlr command from the miller package is a powerful command-line tool for working with CSV, TSV, and tabular JSON files.

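A verb is required to specify an action. Here we simply cat the file. With no format option, mlr assumes its default key=value format rather than CSV, so every line, including the header, is treated as data and the fields are keyed by position: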

$ mlr cat example.csv
1=color,2=shape,3=flag,4=index,5=quantity,6=rate
1=yellow,2=triangle,3=1,4=11,5=43.6498,6=9.8870
...

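Tell mlr that the file is CSV using --csv, which sets both the input and output formats (--icsv alone sets only the input, leaving the output as key=value pairs):

$ mlr --csv cat example.csv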
color,shape,flag,index,quantity,rate
yellow,triangle,1,11,43.6498,9.8870
red,square,1,15,79.2778,0.0130
...

Pretty print the output using --opprint:

$ mlr --icsv --opprint cat example.csv
color  shape    flag index quantity rate
yellow triangle 1    11    43.6498  9.8870
red    square   1    15    79.2778  0.0130
...
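
Miller also offers combined shorthands for common format pairs. The sketch below assumes mlr is installed and on the PATH, and compares --c2p (CSV in, pretty print out) with the longer --icsv --opprint form. The three-column sample is a shortened, illustrative stand-in for example.csv, piped in on standard input.

```shell
# A two-record sample in the style of example.csv (shortened for illustration).
sample='color,shape,flag
yellow,triangle,1
red,square,1'

if command -v mlr >/dev/null 2>&1; then
  long=$(printf '%s\n' "$sample" | mlr --icsv --opprint cat)
  short=$(printf '%s\n' "$sample" | mlr --c2p cat)
  # Both forms are expected to print the same aligned table.
  [ "$long" = "$short" ] && printf '%s\n' "$short"
else
  echo "mlr not found; install the miller package first" >&2
fi
```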

Pretty print the output as a bordered table by adding --barred:

$ mlr --icsv --opprint --barred cat example.csv
+--------+----------+------+-------+----------+--------+
| color  | shape    | flag | index | quantity | rate   |
+--------+----------+------+-------+----------+--------+
| yellow | triangle | 1    | 11    | 43.6498  | 9.8870 |
| red    | square   | 1    | 15    | 79.2778  | 0.0130 |
| red    | circle   | 1    | 16    | 13.8103  | 2.9010 |
| red    | square   | 0    | 48    | 77.5542  | 7.4670 |
| purple | triangle | 0    | 51    | 81.2290  | 8.5910 |
| red    | square   | 0    | 64    | 77.1991  | 9.5310 |
| purple | triangle | 0    | 65    | 80.1405  | 5.8240 |
| yellow | circle   | 1    | 73    | 63.9785  | 4.2370 |
| yellow | circle   | 1    | 87    | 63.5058  | 8.3350 |
| purple | square   | 0    | 91    | 72.3735  | 8.2430 |
+--------+----------+------+-------+----------+--------+
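
To try these commands without the original file, the ten rows shown in the table above can be saved as example.csv. The following is a minimal sketch, assuming a POSIX shell and that mlr is on the PATH. The head verb and the --ojson option are not used elsewhere in this section and appear here only to preview the first two records as JSON.

```shell
# Recreate example.csv from the ten rows displayed in the table above.
cat > example.csv <<'EOF'
color,shape,flag,index,quantity,rate
yellow,triangle,1,11,43.6498,9.8870
red,square,1,15,79.2778,0.0130
red,circle,1,16,13.8103,2.9010
red,square,0,48,77.5542,7.4670
purple,triangle,0,51,81.2290,8.5910
red,square,0,64,77.1991,9.5310
purple,triangle,0,65,80.1405,5.8240
yellow,circle,1,73,63.9785,4.2370
yellow,circle,1,87,63.5058,8.3350
purple,square,0,91,72.3735,8.2430
EOF

# Preview the first two records as JSON, if mlr is available.
if command -v mlr >/dev/null 2>&1; then
  mlr --icsv --ojson head -n 2 example.csv
else
  echo "mlr not found; install the miller package first" >&2
fi
```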

Select fields from the file using the verb cut. The fields are output in the order they appear in the file, not the order given to -f:

$ mlr --csv --opprint cut -f flag,shape example.csv
shape    flag
triangle 1
square   1
...

Add -o to output the fields in the order given to -f:

$ mlr --icsv --opprint cut -o -f flag,shape example.csv
flag shape
1    triangle
1    square
...
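
The cut verb can also drop fields rather than keep them. The sketch below assumes mlr is installed and uses the -x option of cut, which excludes the fields listed with -f. The sample is a shortened copy of example.csv piped in on standard input, and --ocsv is chosen here so the result stays in CSV.

```shell
# First two records of example.csv, shortened for illustration.
sample='color,shape,flag,index,quantity,rate
yellow,triangle,1,11,43.6498,9.8870
red,square,1,15,79.2778,0.0130'

if command -v mlr >/dev/null 2>&1; then
  # -x inverts the selection: keep every field except quantity and rate.
  trimmed=$(printf '%s\n' "$sample" | mlr --icsv --ocsv cut -x -f quantity,rate)
  printf '%s\n' "$trimmed"
else
  echo "mlr not found; install the miller package first" >&2
fi
```

With this sample the color, shape, flag, and index columns should remain, in their original order.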

Further details at https://johnkerl.org/miller/doc/.



Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.