This project is rather open-ended. Students may like to work in groups
of 3 or 4, so that a variety of approaches might be considered. In
that case, submissions should indicate what each student contributed
to the project.
The project provides an opportunity to think about how to develop good
models in a limited time frame. A plan should be developed to tackle
the problem. Like all real data mining projects, the plan can be
adjusted as the project proceeds, reflecting discoveries made about
- 50% for the technical component of the work. This will cover
how well the data mining task was performed, the choice and use of
algorithm(s) and method(s) to obtain best performance, and the
interpretation of the results.
- 25% for the report. The report should be clear, concise, and
complete. It will outline your plan and summarise the performance of
the models you built. You should identify which model you are
putting forward as the best and explain how you arrived at it. You must
include your performance estimate, as this will be compared to the
actual performance on the testing dataset and large discrepancies
will be penalised. Keep the report to a suitable length, and
include graphs that clearly present specific points.
- 25% for the performance of the model. This will be based on the
accuracy, the RMSE, and the area under the ROC curve (AUC) for your
model. Note that three separate model predictions can be submitted,
one tuned for each of the different measures of performance.
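
As a rough illustration of the three performance measures named above, the following minimal Python sketch computes each one on a small made-up holdout set. It assumes a binary target coded 0/1, a model that outputs a score between 0 and 1, and RMSE taken between the scores and the 0/1 outcomes; the brief does not specify any of this. The function names and the example values are hypothetical and are not part of the project materials.

```python
import math


def accuracy(actual, predicted_class):
    """Fraction of cases where the predicted class equals the actual class."""
    matches = sum(1 for a, p in zip(actual, predicted_class) if a == p)
    return matches / len(actual)


def rmse(actual, predicted_score):
    """Root mean squared error between 0/1 outcomes and predicted scores."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted_score)]
    return math.sqrt(sum(squared) / len(squared))


def roc_auc(actual, predicted_score):
    """Area under the ROC curve, computed as the proportion of
    (positive, negative) pairs in which the positive case receives the
    higher score. Tied scores count as half a pair."""
    positives = [s for a, s in zip(actual, predicted_score) if a == 1]
    negatives = [s for a, s in zip(actual, predicted_score) if a == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up holdout data, for illustration only.
actual = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

# Assumed decision threshold of 0.5 to turn scores into classes.
classes = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(actual, classes))
print("RMSE:    ", rmse(actual, scores))
print("AUC:     ", roc_auc(actual, scores))
```

Accuracy depends on the threshold chosen, RMSE on how well calibrated the scores are, and AUC only on how the scores rank the cases, which is one reason a model tuned for one measure need not be the best on the others. Computing the measures on data held out from model building is one common way to obtain the performance estimate the report asks for.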
Copyright © 2004-2010 Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010