Getting the mean of a numeric variable by a categorical variable in R


#1

I was trying to calculate the mean of the variable Age for all the females in my dataset, but the result is NaN. Kindly help me out with this.
mean(Age[Gender == "Female"])
[1] NaN


#2

Hi @sanchi_singh1

Please post reproducible code and a sample data frame to remove ambiguity.

Here is my attempt:

  1. Assuming "df" is the data frame

  2. Data manipulation using "dplyr"

     # Install dplyr if it is missing, then load it
     if (!requireNamespace("dplyr", quietly = TRUE)) {
       install.packages("dplyr")
     }
     library(dplyr)
    

# The code below gives the mean age for males as well as females
df %>% group_by(Gender) %>% summarize(Average = mean(Age)) %>% ungroup() %>% arrange(desc(Average)) %>% as.data.frame()
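
For a reproducible check, here is a minimal sketch on a small made-up data frame (the values and the name df are assumptions, not the original poster's data). It groups by Gender and passes na.rm = TRUE so that a missing age does not turn a group mean into NA.

```r
library(dplyr)

# Hypothetical sample data; replace with your own data frame
df <- data.frame(
  Gender = c("Female", "Male", "Female", "Male", "Female"),
  Age    = c(30, 25, 50, NA, 40)
)

# Mean age per gender, ignoring missing ages, highest mean first
result <- df %>%
  group_by(Gender) %>%
  summarise(Average = mean(Age, na.rm = TRUE)) %>%
  arrange(desc(Average)) %>%
  as.data.frame()

print(result)
```

If a group mean still comes back as NaN, that group has no non-missing ages at all.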


#3

You have many options to do this:

#---------------- With data.table -------------
library(data.table)
DT <- data.table(
  Gender = ifelse(sample(c(0, 1), 100, replace = TRUE) == 0, "Female", "Male"),
  Age = sample(1:100, 100, replace = TRUE)
)

DT[Gender == "Female", .(AgeMean = mean(Age))]

#---------------- With the standard function "by" -------------
DF <- as.data.frame(DT)
by(DF[, "Age"], DF[, "Gender"], mean)

#-------------- With the sqldf package -----------------------------
library(sqldf)
sqldf("select Gender, avg(Age) as avgAge from DF group by Gender")
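
If you would rather not load any package, base R's tapply and aggregate do the same job. This is only a sketch: the data frame DF2, the seed, and the simulated values are made up for illustration.

```r
set.seed(42)  # arbitrary seed so the simulated data is reproducible

# Simulated data in the same shape as DT above (hypothetical values)
DF2 <- data.frame(
  Gender = sample(c("Female", "Male"), 100, replace = TRUE),
  Age    = sample(1:100, 100, replace = TRUE)
)

# tapply returns a named vector with one mean per gender
means_by_gender <- tapply(DF2$Age, DF2$Gender, mean)
print(means_by_gender)

# aggregate returns the same means as a data frame
means_df <- aggregate(Age ~ Gender, data = DF2, FUN = mean)
print(means_df)
```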

Regards,
Carlos Ortega - Spain


#4

Hi @sanchi_singh1

If age is a separate variable in the dataset:
# x: assign your data frame here, or read it in from a file

mean(x$Age)

This returns the mean age over the whole dataset.
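
As for the NaN in the original question: mean() returns NaN when it is given an empty numeric vector. So one likely cause (an assumption, since the data was not posted) is that no row matched "Female" exactly, for example because of different capitalisation or stray spaces. A minimal sketch with made-up data:

```r
# Hypothetical data with inconsistent gender labels
x <- data.frame(
  Gender = c("female", "Male", " Female", "FEMALE", "Male"),
  Age    = c(25, 40, 35, NA, 50)
)

# No label equals "Female" exactly, so the subset is empty
empty_result <- mean(x$Age[x$Gender == "Female"])
print(empty_result)

# Inspect the labels that are actually present
print(table(x$Gender, useNA = "ifany"))

# Normalise the labels, then average while skipping missing ages
gender_clean <- tolower(trimws(x$Gender))
female_mean <- mean(x$Age[gender_clean == "female"], na.rm = TRUE)
print(female_mean)
```

Running table() on the grouping column is usually the quickest way to spot this kind of mismatch.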