I have the following dataset:
A1Ratings.csv (1.3 KB)
For each movie, I am trying to find the percentage of ratings >= 4. This is what I have so far:
tot_count <- data.frame(apply(movie,2,function(x)sum(x >= 1,na.rm = T)))
req_count <- data.frame(apply(movie,2,function(x)sum(x>=4,na.rm = T)))
count <- data.frame(tot_count,req_count,colnames(movie))
colnames(count)[1] <- "tot_count"
colnames(count)[2] <- "req_count"
colnames(count)[3] <- "movie"
count <- count[-1,]
count$percent_rating = (count$req_count/count$tot_count)*100
But apparently the result is not in the correct order.
Hi @Pierre_Lafortune,

Yes, the file has to be sorted on the movie percent_rating.
For the file, can you please try [here][1]?
df <- read.csv('conditional.csv')
df[-c(1,2)] <- sapply(df[-c(1,2)], function(x) x >= 4)
newdf <- data.frame(sapply(df[-c(1,2)], mean, na.rm=T))
colnames(newdf) <- c('Percent_Rating')
newdf[order(newdf$Percent_Rating, decreasing=T),, drop=F]
X318..Shawshank.Redemption..The..1994.                       0.70000000
X260..Star.Wars..Episode.IV...A.New.Hope..1977.              0.53333333
X3578..Gladiator..2000.                                      0.50000000
X541..Blade.Runner..1982.                                    0.44444444
X593..Silence.of.the.Lambs..The..1991.                       0.43750000
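
The same idea can be written a bit more compactly with `colMeans()`. This is only a minimal sketch on made-up data: the `ratings` data frame and its column names (`User`, `Gender`, `MovieA`, ...) are hypothetical stand-ins for the real file, and it assumes, as in the code above, that the first two columns are identifiers rather than ratings. It also multiplies by 100 so the result reads as a percentage instead of a fraction.

```r
# Toy ratings table (hypothetical data): two id columns, then one column per movie.
ratings <- data.frame(
  User   = 1:4,
  Gender = c("F", "M", "F", "M"),
  MovieA = c(5, 4, NA, 2),
  MovieB = c(3, NA, 4, 1),
  MovieC = c(4, 5, 4, NA)
)

# Drop the id columns and keep only the rating columns.
movie_cols <- ratings[-c(1, 2)]

# Comparing a data frame to a number gives a logical matrix, and colMeans()
# of that matrix is the share of TRUE values per column. na.rm = TRUE leaves
# missing ratings out of the denominator.
percent_rating <- colMeans(movie_cols >= 4, na.rm = TRUE) * 100

# Sort from the highest to the lowest share.
sort(percent_rating, decreasing = TRUE)
```

Because missing ratings are dropped before averaging, each movie's percentage is taken over the users who actually rated it, which matches the `tot_count` / `req_count` logic in the question.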