Togaware DATA MINING
Desktop Survival Guide
by Graham Williams


Graphics in R

As well as being a package of choice for many statisticians, R is capable of producing excellent graphics in many formats, including the vector formats PostScript and PDF and the raster formats PNG (well suited to line drawings and text) and JPG (well suited to photographic, many-colored images).

R has a number of options for generating plots. The traditional, though now quite dated, option is the graphics package.

The grid package provides a newer approach to generating plots, whilst the lattice package provides many high level graphics functions with good defaults for effective presentation, based on Cleveland's book Visualizing Data. The lattice package allows quite complex plots to be generated relatively easily, but it is less suited to building new types of plots; use grid for that.
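The contrast between the two packages can be sketched briefly. The example below is a minimal illustration only: the choice of the built-in iris data, the temporary output file, and the particular grid primitives are assumptions made here, not taken from the book.

```r
library(lattice)  # recommended package shipped with standard R distributions
library(grid)     # low level graphics system that lattice builds on

# Assumption: draw to a temporary PDF so the sketch runs without a display.
out <- tempfile(fileext = ".pdf")
pdf(out)

# lattice: a single formula yields one panel per species.
p <- xyplot(Sepal.Length ~ Sepal.Width | Species, data = iris,
            xlab = "Sepal width", ylab = "Sepal length")
print(p)  # lattice plots are objects; print() is what draws them

# grid: assemble a new page directly from low level primitives.
grid.newpage()
grid.rect(width = 0.8, height = 0.5, gp = gpar(col = "grey40"))
grid.text("built from grid primitives", gp = gpar(fontsize = 14))

invisible(dev.off())  # close the device so the file is written
```

Note that inside a script or function a lattice plot must be printed explicitly; at the interactive prompt the automatic printing of results does this for you.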

The ggplot2 package is the newest plotting development, based on Wilkinson's idea of a grammar of graphics. Its default theme for plots draws on the collective wisdom of Tufte, Brewer, Carr, Sun, and Cleveland.
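As a rough sketch of the grammar idea, a ggplot2 plot is composed by adding layers to a base specification. The example assumes ggplot2 has been installed (it is not part of the base R distribution) and uses the built-in iris data and a temporary output file purely for illustration.

```r
# Assumption: ggplot2 is installed; the guard skips the sketch otherwise.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Data and aesthetic mappings first, then layers added with "+".
  g <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, colour = Species)) +
    geom_point() +                 # geometry: one point per observation
    facet_wrap(~ Species) +        # one panel per species
    labs(x = "Sepal width", y = "Sepal length")

  # Assumption: a temporary file; width and height are in inches.
  out <- tempfile(fileext = ".pdf")
  ggsave(out, plot = g, width = 7, height = 3)
}
```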

Example code presented in the following chapters illustrates the generation of publication quality PDF (Portable Document Format) graphics, which can be viewed with many viewers, including Adobe's Acrobat. R also supports many other output formats, including PNG (Portable Network Graphics, supported by many web browsers and importable into many word processors), JPG, and PostScript. Another supported format is XFIG. Such output is editable with the xfig graphics editor, allowing further annotations and modifications to be made to the automatically generated plot. The XFIG graphics can then be converted to an even larger collection of graphics formats, including PDF. The graphics actually presented in this book were, in fact, generated by R as XFIG output and then converted to PDF. The code examples here generate PDF directly, so their layouts may differ slightly from the figures that appear in the book.
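Choosing an output format comes down to opening the matching graphics device, drawing, and closing the device. The following is a minimal sketch: the file names are temporary files, the helper draw() is hypothetical, and the built-in cars data is used only for illustration. The grDevices package provides similar postscript() and jpeg() devices.

```r
# Assumption: output goes to temporary files; substitute real paths as needed.
pdf_file <- tempfile(fileext = ".pdf")
png_file <- tempfile(fileext = ".png")

# Hypothetical helper so that each device receives the same plot.
draw <- function() {
  plot(cars$speed, cars$dist,
       xlab = "Speed (mph)", ylab = "Stopping distance (ft)", main = "cars")
}

pdf(pdf_file, width = 6, height = 4)  # vector output; dimensions in inches
draw()
invisible(dev.off())                  # closing the device writes the file

# PNG support depends on how R was built, so check before using it.
if (capabilities("png")) {
  png(png_file, width = 600, height = 400)  # raster output; dimensions in pixels
  draw()
  invisible(dev.off())
}
```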

A highly interoperable approach is to generate graphs in FIG format, which can then be loaded into the xfig application for further editing. This allows minor changes to be made to fine tune the graphics, but at the cost of losing the ability to regenerate the plot automatically from the original R code. For LaTeX processing, the rubber package (under Debian GNU/Linux) will automatically convert FIG files to the appropriate EPS or PDF format. Of course, xfig can also export PNG, JPG, and many other formats.

The basic concept of R's graphics model is that a plot is built up bit by bit. Each later component of the plot overlays earlier components. A plot also has two components. The plotting area is identified through the usr parameter as four numbers, $x_1$, $x_2$, $y_1$, and $y_2$, giving the extremes of the plotting region in user coordinates. You can retrieve the current plotting region (which is defined by the first component of a plot) with:

> plot(rnorm(10))
> par("usr")
[1]  0.640000 10.360000 -1.390595  1.153828
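The overlay behaviour can be seen by adding components one at a time. This is a sketch only: the seed, colours, and temporary output file are assumptions chosen for illustration.

```r
out <- tempfile(fileext = ".pdf")  # assumption: a temporary output file
pdf(out)

set.seed(42)                       # assumption: fixed seed for repeatability
y <- rnorm(10)

plot(y, type = "n")                # first component: sets up region and axes only
usr <- par("usr")                  # c(x1, x2, y1, y2) in user coordinates

# Each call below draws on top of whatever is already there.
rect(usr[1], usr[3], usr[2], usr[4], col = "grey95", border = NA)  # background
abline(h = 0, lty = 2)             # reference line over the background
lines(y, col = "steelblue")        # line over the reference line
points(y, pch = 19)                # points drawn last, so they sit on top
box()                              # redraw the frame overlapped by the rectangle

invisible(dev.off())
```

With the default axis style the region extends 4% beyond the data range on each side, which is why ten points indexed 1 to 10 give x limits of 0.64 and 10.36 in the output above.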

The whole figure encompasses the plotting region and the region around it, which is used for axis information and labels. Outside of the figure region is the device region. Normally, components added outside of the plotting region have no effect; they are cropped. To ensure they are not cropped, set the graphics parameter xpd to TRUE:

> par(xpd=TRUE)
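A common use is placing a legend in the margin. The sketch below assumes a widened right margin, the built-in cars data, and a temporary output file; the legend position is an arbitrary choice.

```r
out <- tempfile(fileext = ".pdf")  # assumption: a temporary output file
pdf(out)

# Widen the right margin and allow drawing outside the plotting region.
# par() returns the previous settings, saved here so they can be restored.
old <- par(mar = c(5, 4, 4, 8), xpd = TRUE)

plot(cars$speed, cars$dist, xlab = "Speed", ylab = "Distance")
usr <- par("usr")

# Position the legend just to the right of the plotting region, in the margin.
legend(x = usr[2] + 0.02 * (usr[2] - usr[1]), y = usr[4],
       legend = "cars", pch = 1, bty = "n")

par(old)  # restore the earlier margin and clipping settings
invisible(dev.off())
```

According to the par documentation, xpd=TRUE clips drawing to the figure region, while xpd=NA clips only to the device region.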



Copyright © Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010