# Converting loop to apply function

#1

Hi All,

I have written code that uses a for loop to count, for each group, how many values in a column are less than certain quantiles of that column. How can I convert this loop to an apply function so that it runs faster? The code is below.

Code:

``````
library(sqldf)  # provides sqldf(), used in the loop below

set.seed(1729)
temp <- data.frame(groups=c(1,2),value1=rnorm(12),value2=rnorm(12))

# Number of rows and columns
ngroup<-length(unique(temp[,1]))
iteration=ncol(temp)-1

#Default Table
Table1<- data.frame(matrix(0, nrow=ngroup, ncol=(3*iteration)+1))
p<-colnames(temp)[2:ncol(temp)]
q<-c("0.25","0.5","0.75")
colnames(Table1)=c("Groups",as.vector(t(outer(p, q, paste, sep="-"))))

# Editing Table with counts
for(i in seq(from=1, to=ngroup, by=1)){
Table1[i,"Groups"]<-i
}

for(j in seq(from=1, to=ngroup, by=1)){

for(i in seq(from=2, to=(3*iteration)+1, by=1)){

namecol<-colnames(temp)[ceiling((i-1)/3)+1]
if ((i%%3)==2){
quant<-quantile(temp[,ceiling((i-1)/3)+1],probs = as.numeric(q[1]))
}
else if ((i%%3)==0){
quant<-quantile(temp[,ceiling((i-1)/3)+1],probs = as.numeric(q[2]))
}
else{
quant<-quantile(temp[,ceiling((i-1)/3)+1],probs = as.numeric(q[3]))
}

query<-sprintf("select count(%s) from temp where groups=%s and %s< %s",namecol,j,namecol,quant)
Table1[j,i]<-sqldf(query)

}

}

print(Table1)
``````

Regards,
Surya

#2

As far as I understand, you can use the sapply function for computations on columns. Nested for loops take a long time to execute; this approach is typically faster.
Let's say I want to select the observations in a column that fall below its 50% quantile. It can be done as:

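``````
set.seed(1729)
temp <- data.frame(groups=c(1,2),value1=rnorm(12),value2=rnorm(12))

# logical matrix: TRUE where a value is below its own column's 50% quantile
cols <- sapply(temp[-1], function(x) x < quantile(x, probs = 0.50))

# subset: keep the rows where value1 is below its 50% quantile
newdata <- temp[cols[, "value1"], ]
``````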
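The same idea can probably be extended to the per-group counts asked for in the first post. The following is only a minimal sketch, not a drop-in replacement: it assumes the quantiles are taken over the whole column (as in the original loop), that there are at least two groups, and it replaces the sqldf queries with `tapply`. The names `probs`, `value_cols` and `count_below` are made up for this illustration.

```r
set.seed(1729)
temp <- data.frame(groups = c(1, 2), value1 = rnorm(12), value2 = rnorm(12))

probs <- c(0.25, 0.5, 0.75)                   # quantile levels to test against
value_cols <- setdiff(names(temp), "groups")  # every column except the grouping one

# For one column, return a matrix with one row per group and one column
# per quantile level, holding the number of values below that quantile.
count_below <- function(col) {
  cutoffs <- quantile(temp[[col]], probs = probs)  # quantiles of the whole column
  counts <- sapply(cutoffs, function(cut) {
    tapply(temp[[col]] < cut, temp$groups, sum)    # count TRUE values per group
  })
  colnames(counts) <- paste(col, probs, sep = "-")
  counts
}

# Bind the per-column matrices side by side and add the group labels
counts <- do.call(cbind, lapply(value_cols, count_below))
Table1 <- data.frame(Groups = as.numeric(rownames(counts)), counts,
                     check.names = FALSE, row.names = NULL)
print(Table1)
```

The column names follow the `value1-0.25` pattern of the original table. Most of the time in the original loop is likely spent issuing one SQL query per cell, so dropping sqldf may matter more than the switch from for to sapply.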

For high-dimensional data sets, you can use parallel functions such as parSapply or foreach.
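As a rough illustration of the parallel route, here is a small sketch using `parSapply` from the base `parallel` package. The two-worker cluster is an arbitrary choice for the example, and on a data set this small the cost of starting the workers will most likely outweigh any gain; it only starts to pay off with many columns or expensive per-column work.

```r
library(parallel)

set.seed(1729)
temp <- data.frame(groups = c(1, 2), value1 = rnorm(12), value2 = rnorm(12))
probs <- c(0.25, 0.5, 0.75)

cl <- makeCluster(2)  # two local worker processes (assumed; adjust to your machine)

# Compute the quantiles of each value column on the workers
quants_par <- parSapply(cl, temp[c("value1", "value2")], quantile, probs = probs)

stopCluster(cl)       # always release the workers when done

# Sequential equivalent, for comparison
quants_seq <- sapply(temp[c("value1", "value2")], quantile, probs = probs)
print(all.equal(quants_par, quants_seq))
```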

#3

Hi Manish,

Yes, you are right that the execution time will be longer. That is why I am trying to run it using sapply. However, I need to understand how to convert the above code.

I need the output in a specific format, which you can see by running the code.

In the meantime, I am also trying.