I am working on a dataset where I want to see the trends of a categorical variable (Minimum_Duration) for different sets of rows.
Set 1: All the rows of the dataset (total).
Set 2: The rows for which another categorical variable (Preferred_location) takes particular values (the ones with maximum frequency).
I want to do this analysis to determine whether the two variables have a different kind of relationship within this particular subset.
Here is what I did:
This is a snapshot of the 2 variables I am dealing with right now.
I arranged the levels of Preferred_location in decreasing order of frequency and built a data frame of the top 10 levels.
library(dplyr)  # provides arrange() and desc()
loc <- as.data.frame(table(total$Preferred_location))
loc <- arrange(loc, desc(Freq))
top <- loc[1:10, ]
Now the data frame top looks like this:
Now I want to see the frequencies of all the levels of Minimum_Duration for the rows with these 10 Preferred_location values, compared with the frequencies of those levels across the whole dataset.
However, the command to plot them gives an error:
ggplot(total,aes(x=total$Minimum_Duration[which(total$Preferred_location %in% top$Var1)],fill=as.factor(total$Minimum_Duration)))+geom_bar(position="dodge")
Error: Aesthetics must be either length 1 or the same as the data (300010): x, fill
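As far as I understand it, ggplot2 expects each aesthetic to have length 1 or as many elements as the data passed to ggplot() has rows, and the x expression above selects only a subset of the rows, so the lengths cannot match. A minimal sketch of one possible way around this is to subset the rows first and stack both sets into one data frame with a label column. The small `total` built here is a made-up stand-in for the real data, and the names `subset_rows`, `compare` and `set` are hypothetical:

```r
library(ggplot2)

# Hypothetical stand-in for `total`; the real data has far more rows.
set.seed(1)
total <- data.frame(
  Preferred_location = sample(letters[1:15], 500, replace = TRUE),
  Minimum_Duration   = sample(c("1 month", "3 months", "6 months"), 500, replace = TRUE)
)

# Top 10 locations by frequency (base R equivalent of the arrange() step).
loc <- as.data.frame(table(total$Preferred_location))
loc <- loc[order(-loc$Freq), ]
top <- loc[1:10, ]

# Subset the rows first, instead of subsetting a column inside aes().
subset_rows <- total[total$Preferred_location %in% top$Var1, ]

# Stack both sets and label each row with the set it belongs to.
compare <- rbind(
  data.frame(Minimum_Duration = total$Minimum_Duration, set = "All rows"),
  data.frame(Minimum_Duration = subset_rows$Minimum_Duration, set = "Top 10 locations")
)

# Every aesthetic now refers to a column of the data given to ggplot().
p <- ggplot(compare, aes(x = Minimum_Duration, fill = set)) +
  geom_bar(position = "dodge")
print(p)
```

The bars here show raw counts, so the "All rows" bars are always at least as tall as the subset bars. Comparing proportions within each set might be more informative, but that is a separate design choice.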
Please suggest an alternate way to do this.
Thanks in advance!