R: How to write Loop for PostHoc Tukey test and multiple comparision

r

#1

This is my sample of a big data matrix & and each column has named with multiple information and separated by an underscore.

structure(list(Gene = c(“AGI4120.1_UBQ”, “AGI570.1_Acin”), WT_Tissue_0T_1 = c(0.886461437, 1.093164915), WT_Tissue_0T_2 = c(1.075140682, 1.229862834), WT_Tissue_0T_3 = c(0.632903012, 1.094003128), WT_Tissue_1T_1 = c(0.883151274, 1.26322126), WT_Tissue_1T_2 = c(1.005627276, 0.962729188), WT_Tissue_1T_3 = c(0.87123469, 0.968078993), WT_Tissue_3T_1 = c(0.723601456, 0.633890322), WT_Tissue_3T_2 = c(0.392585237, 0.534819363), WT_Tissue_3T_3 = c(0.640185369, 1.021934772), WT_Tissue_5T_1 = c(0.720291294, 0.589244505), WT_Tissue_5T_2 = c(0.362131744, 0.475251717), WT_Tissue_5T_3 = c(0.549486925, 0.618177919), mut1_Tissue_0T_1 = c(1.464415756, 1.130533457), mut1_Tissue_0T_2 = c(1.01489573, 1.114915728), mut1_Tissue_0T_3 = c(1.171797418, 1.399956009), mut1_Tissue_1T_1 = c(0.927507448, 1.231911575), mut1_Tissue_1T_2 = c(1.089705396, 1.256782289 ), mut1_Tissue_1T_3 = c(0.993048659, 0.999044465), mut1_Tissue_3T_1 = c(1.000993049, 1.103486794), mut1_Tissue_3T_2 = c(1.062562066, 0.883617224 ), mut1_Tissue_3T_3 = c(1.037404833, 0.851875438), mut1_Tissue_5T_1 = c(0.730883813, 0.437440083), mut1_Tissue_5T_2 = c(0.480635551, 0.298762126 ), mut1_Tissue_5T_3 = c(0.85468388, 0.614923997)), row.names = c(NA, -2L), class = c(“tbl_df”, “tbl”, “data.frame”), spec = structure(list( cols = list(Gene = structure(list(), class = c(“collector_character”, “collector”)), WT_Tissue_0T_1 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_0T_2 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_0T_3 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_1T_1 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_1T_2 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_1T_3 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_3T_1 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_3T_2 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_3T_3 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_5T_1 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_5T_2 = structure(list(), class = c(“collector_double”, “collector”)), WT_Tissue_5T_3 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_0T_1 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_0T_2 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_0T_3 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_1T_1 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_1T_2 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_1T_3 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_3T_1 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_3T_2 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_3T_3 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_5T_1 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_5T_2 = structure(list(), class = c(“collector_double”, “collector”)), mut1_Tissue_5T_3 = structure(list(), class = c(“collector_double”, “collector”))), default = structure(list(), class = c(“collector_guess”, “collector”))), class = “col_spec”))

I want to follow a Tukey Test and plot bar charts for each Gene (Response vs. Time; filled by both the genotypes) with multiple comparisons letters.

Syntax

df1 <- df %>%
gather(var, response, WT_Tissue_0T_1:mut1_Tissue_5T_3) %>%
separate(var, c(“Genotype”, “Tissue”, “Time”), sep = “_”) %>%
arrange(desc(Gene))
df2 <- df1 %>%
group_by(Gene,Genotype,Tissue,Time) %>%
mutate(Response=mean(response),n=n(),se=sd(response)/sqrt(n))

Two-way ANOVA

fit1 <- aov(Response ~ Genotype*Time, df1)

Hereafter, I want to go proceed Tukey Test (Multiple Comparison), for e.g. Gene “AGI4120.1_UBQ”, plot Response vs. Time, and see how each genotype (WT & mut1) behave in each Time point (0T, 1T, 3T, and 5T)? if the response is significantly different or not and denote by letters in the plot.

As below lsmeans syntax combine all the Genes to one and give output, how can I make it to loop for each gene separately (i.e. “AGI4120.1_UBQ”, “AGI570.1_Acin”) and get letters to show statistically distinct groups (aka “compact letter display”)

lsmeans(fit1, pairwise ~ Genotype | Time)

My final aim is to plot each gene like this below graph and denote significance letters.

df2$genotype <- factor(df2$genotype, levels = c(“WT”,“mut1”)) colours <- c(’#336600’,’#ffcc00’)library(ggplot2)ggplot(df2,aes( x=Time, y=Response, fill=Genotype))+geom_bar(stat=‘identity’, position=‘dodge’)+scale_fill_manual(values=colours)+geom_errorbar(aes(ymin=average_measure-se, ymax=average_measure+se)+facet_wrap(~Gene)+labs(x=‘Time’, y=‘Response’)

Expecting Graph for denoting compact letter display

I would appreciate your kind help, if possible.


#2

Hi @kynda, your code is not reproducible. Could you give a sample data in data.frame format along with the reproducible code?


#3

Hi Joshi,

Thank you for the message. Could you please access my DF and codes https://www.biostars.org/p/317650/#318104