20 Decision Trees

Decision trees are widely used in data mining and are well supported in R (R Core Team 2024). Decision tree learning deploys a divide-and-conquer approach known as recursive partitioning. It is usually implemented as a greedy search, using a measure such as information gain or the Gini index to select the best input variable on which to partition the dataset at each step.
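To make the split criteria concrete, the following sketch hand-codes the Gini index and information gain for a candidate binary split. The functions gini(), entropy(), and info_gain() are illustrative helpers written for this example, not functions from any package.

gini <- function(y)
{
  # Gini index: 1 minus the sum of squared class proportions.
  p <- table(y)/length(y)
  1 - sum(p^2)
}

entropy <- function(y)
{
  # Shannon entropy of the class distribution, in bits.
  p <- table(y)/length(y)
  p <- p[p > 0]
  -sum(p*log2(p))
}

info_gain <- function(y, left)
{
  # Reduction in entropy from partitioning y by the logical vector 'left'.
  n  <- length(y)
  nl <- sum(left)
  entropy(y) - (nl/n)*entropy(y[left]) - ((n-nl)/n)*entropy(y[!left])
}

# Evaluate one candidate partition of the built-in iris data.
gini(iris$Species)
info_gain(iris$Species, iris$Petal.Length < 2.45)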

Here we introduce rattle (G. Williams 2024) and rpart (Therneau and Atkinson 2023) for building decision trees. We begin with a step-by-step example of building a decision tree using Rattle, and then illustrate the process using R, beginning with Section ??. We cover both classification trees and regression trees, with a minimal sketch of each below.
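As a first taste of what is to come, the code below builds one tree of each kind with rpart, here using the built-in iris and mtcars datasets rather than the datasets used later in the chapter.

library(rpart)

# Classification tree: predict the species of an iris from its measurements.
ctree <- rpart(Species ~ ., data=iris, method="class")
print(ctree)

# Regression tree: predict fuel consumption (mpg) from the mtcars data.
rtree <- rpart(mpg ~ ., data=mtcars, method="anova")
print(rtree)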

We briefly introduced decision trees as an algorithm in Section @ref(sec:algorithms_dtrees), and Chapter @ref(cha:model_template) uses a decision tree as the model builder to demonstrate the model template. Examples of decision tree induction are available through the rain, iris, and pyiris packages from MLHub.

References

R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Therneau, Terry, and Beth Atkinson. 2023. Rpart: Recursive Partitioning and Regression Trees. https://github.com/bethatkinson/rpart.
Williams, Graham. 2024. Rattle: Graphical User Interface for Data Science in R. https://rattle.togaware.com/.

