DATA MINING
Desktop Survival Guide
by Graham Williams

## Density Plot

    plot(density(iris$Petal.Length))

http://rattle.togaware.com/code/rplot-iris-density.R

Here is an example using uniform random samples. For small samples the histogram bars vary noticeably in height even though the underlying distribution is flat, whereas a quantile-quantile plot against the uniform quantiles shows the uniformity (or density) more effectively.

    # Histograms of uniform samples of increasing size
    hist(runif(100))
    hist(runif(1000))
    hist(runif(10000))
    hist(runif(100000))
    hist(runif(1000000))
    hist(runif(10000000))
    hist(runif(100000000))

    # Quantile-quantile plots against the uniform quantiles, on a 2x2 grid
    par(mfrow=c(2,2))
    for (i in c(10, 100, 1000, 10000))
    {
      qqplot(runif(i), qunif(seq(1/i, 1, length.out=i)), main=i,
             xlim=c(0,1), ylim=c(0,1),
             xlab="runif", ylab="Uniform distribution quantiles")
      abline(0, 1, col="lightgray")
    }
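The variability of the histogram bars can also be put into numbers. The following is only a sketch: the helper name `bin_spread`, the choice of 10 equal-width bins, and the seed are assumptions made for illustration and are not part of the original example. It computes the coefficient of variation of the bin counts for each sample size, without drawing anything.

```r
# Sketch: relative spread of histogram bin counts for uniform samples.
# bin_spread, bins = 10 and the seed are illustrative assumptions.
set.seed(42)

bin_spread <- function(n, bins = 10)
{
  # Count observations in equal-width bins on [0, 1] without plotting
  counts <- hist(runif(n),
                 breaks = seq(0, 1, length.out = bins + 1),
                 plot = FALSE)$counts
  # Coefficient of variation: standard deviation relative to the mean count
  sd(counts) / mean(counts)
}

sizes  <- c(100, 1000, 10000, 100000)
spread <- sapply(sizes, bin_spread)
print(data.frame(n = sizes, spread = round(spread, 3)))
```

The relative spread should shrink as the sample grows, which matches the flatter appearance of the histograms for the larger samples.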

Histograms are not particularly good density estimators. Most of the time, however, they are used as an exploratory tool to help us understand our data. Small bin widths help reveal unexpected gaps and patterns in the data, and give an initial view of the distribution.
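As a small illustration of the effect of bin width, the sketch below compares a coarse and a fine histogram of `iris$Petal.Length`. The bin widths of 2 and 0.25 and the variable names `coarse` and `fine` are arbitrary choices for this example.

```r
# Sketch: compare coarse and fine bins for the same variable.
# The bin widths (2 and 0.25) are arbitrary choices for illustration.
x <- iris$Petal.Length

coarse <- hist(x, breaks = seq(1, 7, by = 2),    plot = FALSE)
fine   <- hist(x, breaks = seq(1, 7, by = 0.25), plot = FALSE)

# Midpoints of the fine bins that contain no observations
print(fine$mids[fine$counts == 0])

# Draw both histograms side by side
par(mfrow = c(1, 2))
plot(coarse, main = "Bin width 2",    xlab = "Petal.Length")
plot(fine,   main = "Bin width 0.25", xlab = "Petal.Length")
```

With the coarse bins every bar is occupied, while the fine bins leave several empty bins between the two groups of petal lengths.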

Copyright © 2004-2010 Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010