DATA MINING
Desktop Survival Guide
by Graham Williams

Decision Trees

Todo: UNDER CONSTRUCTION

[Figure: dtrees:rattle_weather_rpart_draw_crop]

Decision trees (also referred to as classification and regression trees) are the traditional building blocks of data mining and one of the classic machine learning algorithms. Since their development in the 1980s they have been the most widely deployed machine-learning-based model builder in data mining. The attraction lies in the simplicity of the resulting model: a decision tree (at least one that is not too large) is quite easy to view, to understand, and, indeed, to explain to management. Decision trees do not always deliver the best performance; they represent a trade-off between performance and simplicity of explanation. The decision tree structure can represent both classification and regression models.
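To make the simplicity claim concrete, here is a minimal sketch of a decision tree as a plain data structure, written in Python purely for illustration. It is not the model shown in the figure and not how Rattle or R stores a tree: the names `Leaf`, `Node`, `predict` and `rules`, and every variable, threshold and leaf value, are hypothetical. A leaf holding a class label gives a classification tree, and a leaf holding a number gives a regression tree.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    # A class label (classification) or a number (regression).
    value: Union[str, float]


@dataclass
class Node:
    variable: str                    # variable tested at this node
    threshold: float                 # split point for that variable
    below: Union["Node", Leaf]       # branch taken when value < threshold
    otherwise: Union["Node", Leaf]   # branch taken when value >= threshold


def predict(tree, observation):
    """Walk from the root to a leaf and return the leaf's value."""
    while isinstance(tree, Node):
        if observation[tree.variable] < tree.threshold:
            tree = tree.below
        else:
            tree = tree.otherwise
    return tree.value


def rules(tree, path=()):
    """Flatten the tree into one human-readable rule per leaf."""
    if isinstance(tree, Leaf):
        condition = " and ".join(path) or "always"
        return [f"if {condition} then {tree.value}"]
    test = f"{tree.variable} < {tree.threshold}"
    negated = f"{tree.variable} >= {tree.threshold}"
    return (rules(tree.below, path + (test,))
            + rules(tree.otherwise, path + (negated,)))


# Hypothetical classification tree: will it rain tomorrow?
# All variables and thresholds here are invented for illustration.
rain_tree = Node("pressure", 1012.0,
                 Node("sunshine", 8.9, Leaf("yes"), Leaf("no")),
                 Leaf("no"))

# Hypothetical regression tree: the same structure with numeric leaves.
rainfall_tree = Node("humidity", 70.0, Leaf(0.4), Leaf(6.2))

if __name__ == "__main__":
    print(predict(rain_tree, {"pressure": 1005.0, "sunshine": 3.0}))
    print(predict(rainfall_tree, {"humidity": 85.0}))
    for rule in rules(rain_tree):
        print(rule)
```

Each path from the root to a leaf reads as a single if-then rule, which is what makes a small tree easy to explain. The cost is that each prediction depends on only a handful of threshold tests.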

In this chapter we discuss the decision tree structure as a knowledge representation language (Section 12.1). Section 12.2 presents a heuristic search algorithm for finding a good decision tree, and Section 12.3 discusses the measures it uses. Section 12.4 then illustrates the building of a decision tree in Rattle and directly through R. The options for building a decision tree are covered in Section 12.5.


The PDF version is a formatted comprehensive draft book (with over 800 pages).
Brought to you by Togaware. This page generated: Sunday, 22 August 2010