DATA MINING
Desktop Survival Guide
by Graham Williams
List of Figures
List of Tables
Data Mining with Rattle
Introduction
Data Mining with Rattle
Data Sources
Selecting Data
Exploring Data
Transforming Data
Descriptive Models
Predictive Models
Evaluation and Deployment
Issues
Moving into R
Troubleshooting
R for the Data Miner
R
Data
Graphics in R
Understanding Data
Preparing Data
Descriptive and Predictive Analytics
Issues
Evaluating Models
Reporting
Cluster Analysis
Text Mining
Text Mining
Algorithms
Bagging
Bayes Classifier
Cluster Analysis
Conditional Trees
Hierarchical Clustering
K-Nearest Neighbours
Linear Models
Neural Networks
Support Vector Machines
Open Products
AlphaMiner
Borgelt Data Mining Suite
KNIME
R
Rattle
Weka
Closed Products
C4.5
Clementine
Equbits Foresight
GhostMiner
InductionEngine
ODM
Enterprise Miner
Statistica Data Miner
TreeNet
Virtual Predict
Appendices
Glossary
Bibliography
Index
Data Mining with Rattle
Subsections
Introduction
Data Mining
Types of Analysis
Data Mining Applications
A Framework for Modelling
Agile Data Mining
Why Rattle?
Data Preparation
Number of Algorithms
Repeatability
Why R?
Books on R
Data Mining with Rattle
Installing GTK, R, and Rattle
Preliminaries
Install Debian
Install MS/Windows
Installation Details
Initiating Rattle
.Rprofile and Rprofile.site
Scripts
The Initial Interface
Interacting with Rattle
Paradigms
Menus and Buttons
Project Menu and Buttons
Edit Menu
Tools Menu and Toolbar
Execute
Export
Settings
Help
Interacting with Plots
Summary
Data Sources
Nomenclature
Loading Data
CSV Data
ARFF Data
ODBC Sourced Data
R Data
R Dataset
Data Entry
Selecting Data
Sampling Data
Variable Roles
Automatic Role Identification
Weights Calculator
Exploring Data
Summarising Data
Summary
Describe
Basics
Kurtosis
Skewness
Missing
Exploring Distributions
Box Plot
Histogram
Cumulative Distribution Plot
Benford's Law
Other Digits
Stratified Benford Plots
Bar Plot
Dot Plot
Mosaic Plot
Sophisticated Exploration with GGobi
Scatterplot
Data Viewer: Identifying Entities in Plots
Other Options
Further GGobi Documentation
Correlation Analysis
Hierarchical Correlation
Principal Components
Single Variable Overviews
Transforming Data
Normalising Data
Recenter
Scale [0,1]
Rank
Median/MAD
Impute
Zero/Missing
Mean/Median/Mode
Constant
Remap
Binning
Indicator Variables
Join Categoricals
Math Transforms
Outliers
Cleanup
Delete Ignored
Delete Selected
Delete Missing
Delete Entities with Missing
Descriptive Models
Cluster Analysis
KMeans
Export KMeans Clusters
Discriminant Coordinates Plot
Number of Clusters
Hierarchical Clusters
Association Rules
Basket Analysis
General Rules
Predictive Models
Building Models
Risk Charts
Linear Regression
Decision Trees
Tutorial Example
Formalities
Tuning Parameters
Priors (prior)
Loss Matrix
Min Split (minsplit)
Min Bucket (minbucket)
Complexity (cp)
Max Depth (maxdepth)
Other Options
Boosting
Tutorial Example
Formalities
Tuning Parameters
Random Forests
Tutorial Example
Formalities
Tuning Parameters
Number of Trees
Sample Size
Number of Variables
Neural Network
Support Vector Machine
Bibliographic Notes
Evaluation
The Evaluate Tab
Confusion Matrix
Measures
Graphical Measures
Lift
ROC Curves
Area Under Curve
Precision versus Recall
Sensitivity versus Specificity
Predicted versus Observed
Scoring
Issues
Model Selection
Overfitting
Imbalanced Classification
Sampling
Cost Based Learning
Model Deployment and Interoperability
SQL
PMML
Bibliographic Notes
Moving into R
The Current Rattle State
Data
Samples
Projects
The Rattle Log
Further Tuning Models
Troubleshooting
Cairo: cairo_pdf_surface_create could not be located
A factor has new levels
Copyright © 2004-2008 Togaware Pty Ltd
Support further development through the purchase of the PDF version of the book.
The PDF version is properly formatted and forms a comprehensive book (a draft of over 600 pages).
Brought to you by Togaware.