
getting it wrong with R

the journal of Michael Werneburg

twenty-seven years and one million words

Toronto, 2017.07.23

I'm taking a "MOOC" on Coursera in data science. There's an R programming element to it, and I'm currently taking that class, the second in the series.

Today I spent a few hours doing a twenty-minute assignment because I misread it. But if anyone's interested in a way to fairly quickly read a raft of (similarly formatted) CSV files into one matrix, here's one.

library(plyr)

corr <- function(directory, threshold = 0) {
  # 'directory' is the name of a valid subdirectory
  # 'threshold' is an optional cut-off: a file's records are retained
  # only if it has at least this many complete records

  # step zero, set up an empty matrix with the two critical
  # fields from the files
  dat <- matrix(data = NA, nrow = 0, ncol = 2)
  colnames(dat) <- c("sulfate", "nitrate")

  files <- list.files(directory, all.files = TRUE, full.names = TRUE, recursive = TRUE)

  for (filename in files) {
    # skip anything that does not end in ".csv"
    # (the dot is escaped so it matches a literal dot)
    if (!grepl("\\.csv$", filename)) {
      next
    }

    # e.g. poldata <- read.csv(file = "specdata/002.csv", header = TRUE, sep = ",", as.is = TRUE)
    poldata <- read.csv(file = filename, header = TRUE, sep = ",", as.is = TRUE)

    # remove any incomplete records
    poldata <- poldata[complete.cases(poldata), ]

    # get a count of good records in the file
    rowsGood <- nrow(poldata)

    if (rowsGood >= threshold) {
      # this was by far the fastest route I could find
      # 1. cast the just-loaded data.frame as a matrix
      chunk <- as.matrix(poldata[c("sulfate", "nitrate")])

      # 2. bulk-copy the records (using plyr library)
      dat <- rbind.fill.matrix(dat, chunk)
    }
  }

  # correlation matrix of the two pooled columns
  cor(data.frame(dat[, 1], dat[, 2]))
}
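For comparison, here is a minimal base R sketch of the same read-everything-into-one-matrix idea that can be run without the course data. It is not the approach above, and it makes no claim about which is faster: it reads each file with lapply() and stacks the pieces once with do.call(rbind, ...) instead of growing the matrix inside a loop. The demo directory, the two tiny CSV files and the helper name read_pairs() are all made up for illustration; only the "sulfate" and "nitrate" column names come from the function above.

```r
# Hypothetical demo data: two small CSV files in a temporary directory.
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, 2, NA), nitrate = c(2, 4, 6)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(3, 4), nitrate = c(6, 8)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

# read_pairs() is an illustrative helper, not part of any package.
read_pairs <- function(directory, threshold = 0) {
  # only file names ending in ".csv"
  files <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)

  chunks <- lapply(files, function(f) {
    d <- read.csv(f)
    # keep complete records and the two columns of interest
    d <- d[complete.cases(d), c("sulfate", "nitrate")]
    # files with too few complete records contribute nothing
    if (nrow(d) >= threshold) as.matrix(d) else NULL
  })

  # stack all pieces in one call; rbind() ignores the NULL entries
  do.call(rbind, chunks)
}

combined <- read_pairs(demo_dir)
dim(combined)  # 4 rows (one demo row is incomplete), 2 columns
cor(combined[, "sulfate"], combined[, "nitrate"])
```

In the demo data nitrate is always twice sulfate, so the correlation comes out at 1; with real files the same call returns whatever the pooled records give. If no file meets the threshold, read_pairs() returns NULL rather than an empty matrix, which a caller would need to check for.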

Again, this is not the assignment from the Coursera course; it's something more difficult. I misread the assignment while in the middle of one of my damn headaches, because I was working against a deadline. I probably would have been better served by resting for that time, then reading the assignment correctly.
