7.8 kmeans pipeline
20211015 As with all mlhub commands, a goal is to support powerful combinations of commands through pipelines. We might process a CSV file through a number of steps, for example to normalise the columns, and then pipe the result into the train command followed by the predict command, which outputs a CSV file with each observation labelled with a cluster number.
$ cat iris.csv | ml train kmeans 3 | ml predict kmeans iris.csv
sepal_length,sepal_width,petal_length,petal_width,label
5.0,3.6,1.4,0.2,1
7.7,3.8,6.7,2.2,0
6.1,3.0,4.9,1.8,2
5.4,3.7,1.5,0.2,2
...
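The normalising step mentioned above is not shown in the pipeline, so here is a minimal sketch of what such a filter might look like. It is not an mlhub command: `normalise_csv` is a hypothetical shell function built on standard awk. It assumes the input has a header row and that every column is numeric, and it rescales each column to the range 0 to 1 (min-max scaling), which is only one of several ways to normalise.

```shell
# Hypothetical helper (not part of mlhub): min-max scale every column of a
# CSV read from stdin to the range [0, 1], keeping the header row unchanged.
# Assumes all columns are numeric.
normalise_csv() {
  awk -F, -v OFS=, '
    NR == 1 { print; next }          # pass the header through untouched
    {
      rows++
      for (i = 1; i <= NF; i++) {
        cell[rows, i] = $i           # remember the value for the second pass
        if (rows == 1 || $i + 0 < min[i]) min[i] = $i + 0
        if (rows == 1 || $i + 0 > max[i]) max[i] = $i + 0
      }
      if (NF > ncol) ncol = NF
    }
    END {
      for (r = 1; r <= rows; r++) {
        line = ""
        for (i = 1; i <= ncol; i++) {
          range = max[i] - min[i]
          # a constant column has no spread, so map it to 0
          v = (range == 0) ? 0 : (cell[r, i] - min[i]) / range
          line = line (i > 1 ? OFS : "") sprintf("%.3f", v)
        }
        print line
      }
    }'
}

# Small demonstration on inline data rather than iris.csv.
printf 'a,b\n1,10\n3,20\n2,10\n' | normalise_csv
```

Under those assumptions the function could sit at the front of the pipeline, as in `cat iris.csv | normalise_csv | ml train kmeans 3`. The file handed to the predict command would presumably need the same scaling for the cluster labels to be meaningful.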
Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0