8.4 apriori itemsets
UNDER DEVELOPMENT 20220106
ml itemsets apriori [options] [datafile]
    -i <name>  --id=<name>        The id column name.
    -s <0-1>   --support=<0-1>    Minimum support threshold.
The input file is a two-column CSV file: one column identifies the basket
and the other names an item in that basket. If no datafile is named on the
command line, the data is read from stdin.
id,item
u1234567,comp1234
u1234567,comp2345
u1234567,comp3456
u1234567,comp4567
u1234568,comp1234
u1234568,comp4567
...
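To make the input layout concrete, here is a minimal Python sketch that groups the rows of such a file into baskets. It is an illustration only, not the code behind `ml itemsets apriori`; the function name `read_baskets` and its parameters are hypothetical, and the sample data is just the rows shown above.

```python
import csv
import io
from collections import defaultdict


def read_baskets(stream, id_column="id", item_column="item"):
    """Group a two-column CSV into {basket id: set of items}.

    Hypothetical helper for illustration; the column names are assumptions
    taken from the sample header above.
    """
    baskets = defaultdict(set)
    for row in csv.DictReader(stream):
        baskets[row[id_column]].add(row[item_column])
    return dict(baskets)


# The sample rows from the text, held in memory instead of a file or stdin.
sample = """id,item
u1234567,comp1234
u1234567,comp2345
u1234567,comp3456
u1234567,comp4567
u1234568,comp1234
u1234568,comp4567
"""

baskets = read_baskets(io.StringIO(sample))
for basket_id, items in sorted(baskets.items()):
    print(basket_id, sorted(items))
```

Passing `sys.stdin` instead of the `io.StringIO` object would mirror the behaviour described above when no datafile is named.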
Output to stdout is one row for each item set (a combination of items found together in a basket), with its frequency and support:
$ ml itemsets apriori mcomp.csv
pattern,freq,support
comp1234:comp4567,145,0.75
comp2345,123,0.45
...
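As a rough illustration of what the `pattern,freq,support` columns could mean, the sketch below counts every item combination that occurs in at least one basket. It assumes the common definition of support as the fraction of baskets containing the item set; the text does not state how the command itself computes support, so treat that as an assumption. The function name `count_itemsets` is hypothetical, and the two baskets are the sample rows from the input example, not the data behind the output above.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(baskets):
    """Count how many baskets contain each item combination.

    Brute-force enumeration for illustration only; it is exponential in
    basket size and is not the Apriori algorithm itself.
    """
    counts = Counter()
    for items in baskets.values():
        ordered = sorted(items)
        for size in range(1, len(ordered) + 1):
            for combo in combinations(ordered, size):
                counts[combo] += 1
    return counts


baskets = {
    "u1234567": {"comp1234", "comp2345", "comp3456", "comp4567"},
    "u1234568": {"comp1234", "comp4567"},
}

counts = count_itemsets(baskets)
n_baskets = len(baskets)

# Assumed definition: support = baskets containing the set / total baskets.
rows = [(":".join(pattern), freq, freq / n_baskets)
        for pattern, freq in counts.items()]
rows.sort(key=lambda row: (-row[1], row[0]))

print("pattern,freq,support")
for pattern, freq, support in rows:
    print(f"{pattern},{freq},{support:.2f}")
```

Joining the items with `:` mirrors the pattern format shown in the output above.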
The output can be filtered to include only those item sets whose support is at least a specified threshold. The default threshold is 10% (0.1).
$ ml itemsets apriori --support=0.5 mcomp.csv
pattern,freq,support
comp1234:comp4567,145,0.75
...
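The support threshold is what lets the Apriori approach avoid enumerating every combination: an item set can only be frequent if all of its subsets are frequent. The following is a minimal level-wise sketch of that idea in Python, assuming support is the fraction of baskets containing the item set. It is not the implementation used by the command; the function name `apriori` and the `min_support` parameter are hypothetical, with the default chosen to match the 0.1 default described above.

```python
from itertools import combinations


def apriori(baskets, min_support=0.1):
    """Return {sorted item tuple: frequency} for item sets meeting min_support.

    Illustrative level-wise search; support is assumed to be
    frequency / number of baskets.
    """
    transactions = [frozenset(items) for items in baskets.values()]
    n = len(transactions)
    all_items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in all_items]
    frequent = {}
    size = 1
    while candidates:
        # Count the baskets that contain each candidate item set.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: f for c, f in counts.items() if f / n >= min_support}
        frequent.update(level)
        # Join frequent sets of this size into candidates one item larger.
        size += 1
        keys = list(level)
        joined = {a | b for a in keys for b in keys if len(a | b) == size}
        # Prune candidates that have any infrequent subset.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return {tuple(sorted(c)): f for c, f in frequent.items()}


baskets = {
    "u1234567": {"comp1234", "comp2345", "comp3456", "comp4567"},
    "u1234568": {"comp1234", "comp4567"},
}

result = apriori(baskets, min_support=0.75)
print("pattern,freq,support")
for pattern, freq in sorted(result.items()):
    print(f"{':'.join(pattern)},{freq},{freq / len(baskets):.2f}")
```

With these two sample baskets and a threshold of 0.75, only `comp1234`, `comp4567`, and `comp1234:comp4567` remain, since each appears in both baskets.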
A column named id is expected. If it has a different name it can be
specified with the --id option:
$ ml itemsets apriori --id=ID mcomp.csv
pattern,freq,support
comp1234:comp4567,145,0.75
...
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0