DATA MINING
Desktop Survival Guide
by Graham Williams

Reading a Large File

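Suppose we have a very large dataset to score, one too large to load into R all at once. One approach is to partition the data: read a block of rows at a time, score it, and save the results before moving on to the next block. The following snippet hints at how this might be done. It uses readLines() on an open connection to skip a given number of lines in blocks of 10,000 and then reads the next line.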

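f <- file("hugedata.csv", "r")   # open connection: each read resumes where the last stopped
skip <- 1562739                  # number of lines to skip
while (skip > 10000) {
  junk <- readLines(f, 10000)    # discard a block of 10,000 lines
  skip <- skip - 10000
}
junk <- readLines(f, skip)       # discard the remaining lines
readLines(f, 1)                  # the next line after the skipped ones
close(f)                         # release the connection when done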
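
The same open-connection idea can be extended from skipping lines to scoring them. The following is only a minimal sketch under stated assumptions, not a definitive implementation. The function score_in_blocks() and its arguments are hypothetical names introduced here. The input is assumed to be a comma-separated file with a single header line and no fields containing embedded newlines. score_fun stands in for whatever produces one score per row, such as a call to predict() on a fitted model.

```r
# Hypothetical helper: score a large CSV file one block of rows at a time.
# Assumes a one-line header and no embedded newlines inside fields.
score_in_blocks <- function(infile, outfile, score_fun, block_size = 10000) {
  con <- file(infile, "r")
  on.exit(close(con))

  # Read the header once so that every block gets the same column names.
  header <- scan(text = readLines(con, n = 1), what = "", sep = ",",
                 quiet = TRUE)

  first <- TRUE
  total <- 0
  repeat {
    # Each call continues from where the previous read stopped.
    lines <- readLines(con, n = block_size)
    if (length(lines) == 0) break

    block <- read.csv(text = lines, header = FALSE, col.names = header,
                      stringsAsFactors = FALSE)
    block$score <- score_fun(block)

    # Write the header with the first block only, then append.
    write.table(block, outfile, sep = ",", row.names = FALSE,
                col.names = first, append = !first)
    first <- FALSE
    total <- total + nrow(block)
  }
  invisible(total)   # number of rows scored
}
```

With a fitted model in an object called model (an assumption for illustration), a call might look like score_in_blocks("hugedata.csv", "scores.csv", function(d) predict(model, newdata = d)). Only one block is held in memory at a time, so block_size trades memory use against the number of read and write operations.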

Brought to you by Togaware. This page generated: Sunday, 22 August 2010