Comparison (6) is possible only for atomic and list types

r

#1

I am currently working on some code and I want to do a comparison in it, but I am getting an error.

corr <- function(directory, threshold = 0) { 
 complete <- function(directory, id = 1:332) {
  files_full <- list.files("specdata",full.names=TRUE) #sets the file path
  for (i in id) { 
    readfiles <- read.csv(files_full[i]) # read the data
    nobs <- sum(complete.cases(readfiles)) # number of complete cases
  }
}

if(nobs > threshold) {
      corrdata <- na.omit(dat)
      correlation <- rbind(correlation,cor(corrdata$nitrate,corrdata$sulfate))   
      }
}

#2

I don’t have the dataset or the rest of the code, so I’m not sure this will work. If you haven’t already explicitly declared threshold, try

threshold = 0

before the if statement.

Hope this helps. Let me know if it doesn’t.


#3

There are several problems with your code, so it is hard to know where to start. First, to answer your direct question about the error: the if(nobs > threshold) line cannot work, because no variable called nobs exists at that point. You defined a function, complete, that would create nobs when it runs, but you never call complete, and even if you did, nobs would be local to that function. R therefore finds the built-in function nobs from the stats package and tries to compare a function with a number, which is what the error message is complaining about.
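The scoping issue can be reproduced in isolation. This is a minimal sketch, assuming a fresh R session with no global variable named nobs; outer_fn and inner_fn are hypothetical names used only for illustration:

```r
outer_fn <- function(threshold = 0) {
  inner_fn <- function() {
    nobs <- 10  # local to inner_fn; it never becomes visible in outer_fn
  }
  # inner_fn is never called, so the name nobs is looked up outside outer_fn
  # and resolves to the function stats::nobs. Comparing a function with a
  # number raises the "possible only for atomic and list types" error.
  tryCatch(nobs > threshold, error = function(e) conditionMessage(e))
}

msg <- outer_fn()
print(msg)
print(is.function(stats::nobs))
```

The exact wording of the message differs between R versions (older versions print the operator as a code such as 6, newer ones print the operator itself), but the cause is the same.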

There is no need to have the embedded function. Try instead:

corr <- function(directory, threshold = 0) { 
  # full paths of the monitor CSV files in the given directory
  files_full <- list.files(directory, full.names = TRUE, pattern = "csv")
  correlation <- matrix()  # starts as a single NA, removed again at the end
  for (i in seq_along(files_full)) { 
    readfiles <- read.csv(files_full[i]) # read the data
    nobs <- sum(complete.cases(readfiles)) # number of complete cases
  
    if (nobs > threshold) {
      corrdata <- na.omit(readfiles)
      correlation <- rbind(correlation, cor(corrdata$nitrate, corrdata$sulfate))   
    }
  }
  correlation[-1]  # drop the NA placeholder
} 

We can test with

head(corr("specdata", 400))
[1] -0.01895754 -0.04389737 -0.06815956 -0.07588814
[5]  0.76312884 -0.15782860

summary(corr("specdata"))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-1.00000 -0.05282  0.10720  0.13680  0.27830  1.00000 

Both calls produce the desired output.
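If you don't have the dataset at hand, the same logic can be checked on made-up data. The sketch below is only an illustration: the directory specdata_demo, the three generated files, and the function name corr_sketch are all hypothetical, and the random values have nothing to do with the real monitor data.

```r
# Hypothetical stand-in for the specdata directory: three small monitor files.
set.seed(1)
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (id in 1:3) {
  n <- 20
  monitor <- data.frame(
    Date    = as.character(seq(as.Date("2010-01-01"), by = "day", length.out = n)),
    sulfate = rnorm(n),
    nitrate = rnorm(n),
    ID      = id
  )
  # Blank out 5, 10, and 15 sulfate values, leaving 15, 10, and 5 complete rows.
  monitor$sulfate[seq_len(id * 5)] <- NA
  write.csv(monitor, file.path(demo_dir, sprintf("%03d.csv", id)), row.names = FALSE)
}

# Same idea as corr, but reading from the directory argument and
# growing a plain numeric vector instead of a matrix.
corr_sketch <- function(directory, threshold = 0) {
  files_full <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  correlation <- numeric(0)
  for (f in files_full) {
    complete_rows <- na.omit(read.csv(f))
    if (nrow(complete_rows) > threshold) {
      correlation <- c(correlation,
                       cor(complete_rows$nitrate, complete_rows$sulfate))
    }
  }
  correlation
}

length(corr_sketch(demo_dir))      # every file has more than 0 complete rows
length(corr_sketch(demo_dir, 12))  # only the file with 15 complete rows passes
```

When no file passes the threshold, this version returns an empty numeric vector.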


#4

Hi Pierre,
why did you add:
correlation[-1]


#5

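It takes out the first element, which is unnecessary: correlation <- matrix() creates a matrix holding a single NA, and that placeholder is not a real correlation.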
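A quick way to see this for yourself, using two made-up values (0.5 and -0.2) in place of real correlations:

```r
correlation <- matrix()                   # a 1 x 1 matrix containing NA
correlation <- rbind(correlation, 0.5)    # hypothetical first correlation
correlation <- rbind(correlation, -0.2)   # hypothetical second correlation

with_placeholder <- as.vector(correlation)  # still starts with the NA
cleaned <- correlation[-1]                  # NA dropped, plain numeric vector

print(with_placeholder)
print(cleaned)
```

Starting from numeric(0) and appending with c() would avoid the placeholder, so no [-1] would be needed at the end.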