Togaware DATA MINING
Desktop Survival Guide
by Graham Williams
Google

Overlays and Point in Polygon

Overlays can be used to locate points within polygons. For example, suppose we have a collection of earthquake locations (geocoded to longitude and latitude) and want to locate the area within which the earthquake is located (e.g., states of Australia).

We first construct a data frame containing the spatial coordinates of the earthquakes (from http://www.ga.gov.au/bin/listQuakes), together with their magnitude and a textual description of the location.

Todo: Move this data into a dataset within Rattle.



> earthquakes <- data.frame(latitude = c(-38.3624, -38.42, -30.26, -30.205, -16.314,
                              -31.4918, -30.306, -32.635, -30.233, -37.423, -31.633, 
                              -30.286, -31.213, -30.237, -33.39), 
                            longitude = c(145.8692, 145.91, 117.716, 117.783, 121.252, 
                              138.5274, 117.646, 138.066, 117.73, 146.04, 138.741, 
                              117.711, 138.559, 117.742, 116.64), 
                            magnitude = c(3.3, 3.1, 2.5, 2.6, 5.1, 3.3, 2.3, 
                              3.5, 2.5, 2.2, 2.1, 2.4, 2, 2.3, 2.6),
                            location = c("NW of Korumburra VIC", 
                              "Near Korumburra VIC", "NW of Beacon WA", 
                              "N of Beacon WA", "NW of Broome WA", 
                              "NE of Hawker SA", "NW of Beacon WA", 
                              "SE of Pt Augusta SA", "NW of Beacon WA", 
                              "SW of Jamieson VIC", "E of Wilpena Pound SA", 
                              "NW of Beacon WA", "South of Hawker SA", 
                              "NW of Beacon WA", "SW of Darkan WA"))

We can then convert the xy data (the longitude and latitude) in the data frame into a spatial object of class SpatialPointsDataFrame from the sp package.



> library(sp)
> head(earthquakes)



  latitude longitude magnitude             location
1 -38.3624  145.8692       3.3 NW of Korumburra VIC
2 -38.4200  145.9100       3.1  Near Korumburra VIC
3 -30.2600  117.7160       2.5      NW of Beacon WA
4 -30.2050  117.7830       2.6       N of Beacon WA
5 -16.3140  121.2520       5.1      NW of Broome WA
6 -31.4918  138.5274       3.3      NE of Hawker SA



> class(earthquakes)



[1] "data.frame"



> coordinates(earthquakes) <- ~longitude+latitude
> head(earthquakes)



           coordinates magnitude
1  (145.869, -38.3624)       3.3
2     (145.91, -38.42)       3.1
3    (117.716, -30.26)       2.5
4   (117.783, -30.205)       2.6
5   (121.252, -16.314)       5.1
6  (138.527, -31.4918)       3.3
7   (117.646, -30.306)       2.3
8   (138.066, -32.635)       3.5
9    (117.73, -30.233)       2.5
10   (146.04, -37.423)       2.2
11  (138.741, -31.633)       2.1
12  (117.711, -30.286)       2.4
13  (138.559, -31.213)       2.0
14  (117.742, -30.237)       2.3
15    (116.64, -33.39)       2.6



> class(earthquakes)



[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"



> str(earthquakes)



Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
  ..@ data       :'data.frame':	15 obs. of  2 variables:
  .. ..$ magnitude: num [1:15] 3.3 3.1 2.5 2.6 5.1 3.3 2.3 3.5 2.5 2.2 ...
  .. ..$ location : Factor w/ 11 levels "E of Wilpena Pound SA",..: 7 2 5 4 6 3 5 8 5 11 ...
  ..@ coords.nrs : int [1:2] 2 1
  ..@ coords     : num [1:15, 1:2] 146 146 118 118 121 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "longitude" "latitude"
  ..@ bbox       : num [1:2, 1:2] 116.6 -38.4 146 -16.3
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "longitude" "latitude"
  .. .. ..$ : chr [1:2] "min" "max"
  ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slots
  .. .. ..@ projargs: chr NA

We also need some region data, or some other overlay, so we can map the points to the regions. Using freely available data we might use the Australian states data.

Now we can use overlay to find within which state each earthquake belongs to.



> proj4string(earthquakes) <- CRS("+proj=longlat +datum=WGS84")
> overlay(aus, earthquakes)$ADMIN_NAME



 [1] Victoria          Victoria          Western Australia
 [4] Western Australia <NA>              South Australia  
 [7] Western Australia South Australia   Western Australia
[10] Victoria          South Australia   Western Australia
[13] South Australia   Western Australia Western Australia
8 Levels: Australian Capital Territory ... Western Australia

For reassurance we can visually match the names in the original data above to the list of states here. The missing value turns out to lay outside the state, in the ocean.

Copyright © Togaware Pty Ltd
Support further development through the purchase of the PDF version of the book.
The PDF version is a formatted comprehensive draft book (with over 800 pages).
Brought to you by Togaware. This page generated: Sunday, 22 August 2010