Togaware DATA MINING
Desktop Survival Guide
by Graham Williams
Google

Predicted versus Observed

This produces a plot of the actual or observed values (X axis) with the model predicted values (Y axis). A linear model is also fit to the predicted value, based on the actual value, and is displayed as the blue line. The diagonal line (Predicted=Observed) is the perfect model (i.e. perfect correlation between the predicted values and the observed values).

A pseudo-R-square value is also reported. When we are building models to predict categoric outputs there is no measure equivalent to the R-square of ordinary least squares regression which predicts numeric values. Instead, a so called pseudo-R-square can be calculated, which is a measure of the correlation between the predicted and observed values. A higher value indicates that our model fits the data better. The pseudo-R-square can not be interpreted in the same way as the R-square for ordinary regression. There are also several approaches to calculating a pseudo-R-square--we have used a correlation.



Copyright © 2004-2010 Togaware Pty Ltd
Support further development through the purchase of the PDF version of the book.
The PDF version is a formatted comprehensive draft book (with over 800 pages).
Brought to you by Togaware. This page generated: Sunday, 22 August 2010