Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Reading a Large File

Suppose we have a very large dataset to score. We may not be able to load all of the data into R at once. One approach is to partition the data: read one block at a time, score it, and save the results. The following snippet hints at how this might be done. It opens a connection to the file, skips a given number of lines in blocks of 10,000, and then reads the next line.



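# Open a read connection so that successive readLines() calls
# continue from where the previous call stopped.
f <- file("hugedata.csv", "r")

# Skip the first 1,562,739 lines in blocks of 10,000 to limit memory use.
skip <- 1562739
while (skip > 10000) {
   junk <- readLines(f, 10000)
   skip <- skip - 10000
}
junk <- readLines(f, skip)

# Read the next line (line 1,562,740), then close the connection.
readLines(f, 1)
close(f)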
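
The snippet above only shows how to move through the file in blocks. As a minimal sketch of the full read, score, and save loop, the following might serve as a starting point. The function score_in_blocks, its arguments, and the toy scoring function are hypothetical names introduced here for illustration, not part of the original text. It assumes the first line of the file is a header and that no quoted field contains an embedded newline.

```r
# Hypothetical helper: score a CSV file block by block so that only
# block_size rows are held in memory at any one time.
# score_fun is a placeholder for whatever model scoring is required.
score_in_blocks <- function(infile, outfile, score_fun, block_size = 10000) {
  f <- file(infile, "r")
  on.exit(close(f))

  # Assumption: the first line holds the column names.
  header <- readLines(f, 1)
  first <- TRUE
  total <- 0

  repeat {
    lines <- readLines(f, block_size)
    if (length(lines) == 0) break   # end of file reached

    # Parse the block, prepending the header so columns are named.
    block <- read.csv(text = c(header, lines), stringsAsFactors = FALSE)
    block$score <- score_fun(block)

    # Write the header with the first block only, then append.
    write.table(block, outfile, sep = ",", row.names = FALSE,
                col.names = first, append = !first)
    first <- FALSE
    total <- total + nrow(block)
  }
  invisible(total)
}

# Small demonstration, with a temporary file standing in for hugedata.csv
# and a toy scoring function (the sum of two columns).
infile <- tempfile(fileext = ".csv")
outfile <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:25, y = 25:1), infile, row.names = FALSE)
n <- score_in_blocks(infile, outfile, function(d) d$x + d$y, block_size = 10)
```

Because each block is parsed independently, R may infer different column types for different blocks. Passing colClasses to read.csv is one way to keep the types consistent.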



Copyright © Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010