Desktop Survival Guide
by Graham Williams

Model Selection

The question that now comes to mind is which model builder to use. That question has challenged us for a long time, and there is still no definitive answer. It all depends on how well the model builder works on your data and, in fact, on how you measure the performance of the model. We review some of the insights that might help us choose the right model builder and, indeed, the right model for our task.

Contrary to expectations, there are few comprehensive comparative studies of the performance of various model builders. A notable exception is the study by Caruana and Niculescu-Mizil, who compared most modern model builders across numerous datasets using a variety of performance measures. Their key conclusion was that boosted trees and random forests generally perform the best, and that decision trees, logistic regression and boosted stumps generally perform the worst. Perhaps more importantly, though, the ranking often depends on which performance criterion is being measured and on the characteristics of the data being modelled.

An overall conclusion from such comparative studies, then, is that it is often best to deploy several model builders over the dataset to investigate which performs the best. This is better than a single shot at the bullseye. We also need to be sure to select appropriate criteria for evaluating performance. The criteria should match the task at hand. For example, if the task is one of information retrieval then a Precision/Recall measure may be best. If the task is in the area of health, then the area under the ROC curve is an often-used measure. For marketing, perhaps it is lift. For risk assessment, Risk Charts are a good measure.
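
The idea of running several model builders over the same dataset and scoring them with one chosen criterion can be sketched in a few lines. This is only an illustration under stated assumptions: the three toy builders (majority_builder, stump_builder, nearest_neighbour_builder), the cross_validate and compare helpers, and the small synthetic dataset are all hypothetical names and data invented for this example. They stand in for real model builders and are not taken from any particular package.

```python
import random

def accuracy(actual, predicted):
    """Fraction of cases where the prediction equals the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def majority_builder(X, y):
    """Baseline: always predict the most common training label."""
    label = max(sorted(set(y)), key=y.count)
    return lambda row: label

def stump_builder(X, y):
    """Single-feature threshold rule with the best training accuracy."""
    best_acc, best_rule = -1.0, None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for above in (0, 1):
                preds = [above if row[j] > t else 1 - above for row in X]
                acc = accuracy(y, preds)
                if acc > best_acc:
                    best_acc, best_rule = acc, (j, t, above)
    j, t, above = best_rule
    return lambda row: above if row[j] > t else 1 - above

def nearest_neighbour_builder(X, y):
    """Predict the label of the closest training row (squared distance)."""
    def predict(row):
        dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for x in X]
        return y[dists.index(min(dists))]
    return predict

def cross_validate(builder, X, y, metric, k=5, seed=0):
    """Mean of the metric over k held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = builder([X[i] for i in train], [y[i] for i in train])
        preds = [model(X[i]) for i in fold]
        scores.append(metric([y[i] for i in fold], preds))
    return sum(scores) / k

def compare(builders, X, y, metric, k=5):
    """Score every builder with the same folds and the same metric."""
    return {name: cross_validate(b, X, y, metric, k)
            for name, b in builders.items()}

# Synthetic two-class data (an assumption for illustration only):
# the classes are well separated on the first feature.
X = [(i, i % 3) for i in range(10)] + [(i + 20, i % 3) for i in range(10)]
y = [0] * 10 + [1] * 10

builders = {
    "majority": majority_builder,
    "stump": stump_builder,
    "nearest_neighbour": nearest_neighbour_builder,
}
results = compare(builders, X, y, accuracy)
best = max(results, key=results.get)
for name, score in results.items():
    print(f"{name}: {score:.2f}")
print("best under this criterion:", best)
```

The metric is passed in as a function so that the criterion can be swapped to suit the task without touching the comparison loop. With a different metric the ranking of the builders may well change.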

So, in conclusion, it is good to build multiple models using multiple model builders. The tricky bits are tuning the model builders (which requires an understanding of the sometimes very many and very complex model builder parameters) and selecting the right criterion to assess the performance of the model (a criterion that matches the task at hand, noting that raw accuracy is not always, and maybe not often, the right criterion).
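
To see why raw accuracy can mislead, consider a small worked example. It is a sketch under assumptions: the data are made up (100 cases, of which 5 are positive), the evaluate helper is a hypothetical name, and lift is taken here to mean the precision among flagged cases divided by the overall rate of positives.

```python
def evaluate(actual, predicted, positive=1):
    """Accuracy, precision, recall and lift for one set of predictions."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    flagged = sum(p == positive for _, p in pairs)
    positives = sum(a == positive for a, _ in pairs)
    correct = sum(a == p for a, p in pairs)
    base_rate = positives / len(pairs)
    precision = tp / flagged if flagged else 0.0
    recall = tp / positives if positives else 0.0
    # Lift: how much richer in positives the flagged cases are
    # than the dataset as a whole.
    lift = precision / base_rate if base_rate else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "lift": lift,
    }

# Made-up outcomes: 5 positive cases followed by 95 negative cases.
actual = [1] * 5 + [0] * 95

# Model A never flags anything.
always_negative = [0] * 100

# Model B flags 10 cases: 4 of the 5 positives and 6 negatives.
flagger = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89

score_a = evaluate(actual, always_negative)
score_b = evaluate(actual, flagger)
for name, score in (("always_negative", score_a), ("flagger", score_b)):
    print(name, {k: round(v, 2) for k, v in score.items()})
```

On these made-up data the model that never flags anything has the higher accuracy (0.95 against 0.93) yet finds none of the positive cases, while the second model recovers 4 of the 5 positives, with a recall of 0.8 and a lift of 8. Which of the two is "better" depends entirely on the criterion chosen.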

Copyright © 2004-2010 Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010