How to do Pairwise Comparisons in R?

A one-way ANOVA is used to evaluate whether there is a statistically significant difference between the means of three or more independent groups.

A one-way ANOVA uses the following null and alternative hypotheses.

H0: All group means are equal.
HA: Not all group means are equal.

If the overall p-value of the ANOVA is less than a predetermined significance level (for example, α = .05), we reject the null hypothesis and conclude that not all of the group means are equal.

We can then conduct post hoc pairwise comparisons to determine which group means differ.

Example: One-Way ANOVA in R

Consider a teacher who is curious about whether or not the use of three different study methods affects pupils’ exam results.

To test this, she randomly assigns ten students to each study method and records their exam results.

To conduct a one-way ANOVA in R and check for differences in the mean exam scores among the three groups, use the following code:

Let’s create a data frame

df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each=10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))

head(df)

  technique score
1     tech1   276
2     tech1   377
3     tech1   407
4     tech1   581
5     tech1   182
6     tech1   112
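
Before fitting the model, it can help to look at the mean score of each group. The snippet below is a minimal sketch in base R; it recreates the same data frame so that it runs on its own, and the name group_means is only an illustrative choice.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))

# Mean exam score for each study technique
group_means <- sapply(split(df$score, df$technique), mean)
print(group_means)

# Number of students per technique (the design is balanced: 10 each)
group_sizes <- lengths(split(df$score, df$technique))
print(group_sizes)
```

With this data the group means are 337.6, 205.8, and 150.5, so tech1 has the highest average score. The ANOVA tests whether differences of this size are larger than chance alone would explain.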

Now we can perform one-way ANOVA

model <- aov(score ~ technique, data = df)

View output of ANOVA

summary(model)

            Df Sum Sq Mean Sq F value  Pr(>F)   
technique    2 184786   92393   6.159 0.00626 **
Residuals   27 405053   15002               
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Because the overall p-value of the ANOVA (.00626) is less than α = .05, we reject the null hypothesis that the mean exam score is the same for each study method.

We can now conduct post hoc pairwise comparisons to identify which groups have different means.
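
The decision rule can also be applied in code rather than by reading the printed table. The following is a sketch that assumes the same data and model as in the example; the names anova_table, f_value, p_value, and alpha are illustrative.

```r
# Recreate the example data and model so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))
model <- aov(score ~ technique, data = df)

# summary() of an aov fit is a list; its first element is the ANOVA table
anova_table <- summary(model)[[1]]

# Row 1 is the technique effect, row 2 is the residuals
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]

alpha <- 0.05  # significance level chosen in advance
if (p_value < alpha) {
  message("Reject H0: not all group means are equal (p = ", signif(p_value, 3), ")")
} else {
  message("Fail to reject H0 (p = ", signif(p_value, 3), ")")
}
```

Rows are indexed by position here because the row names in the summary table are padded with trailing spaces.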

The Tukey Method

The Tukey post hoc method works best when the sample sizes of the groups are equal.

The built-in TukeyHSD() function in R can be used to implement the Tukey post hoc method:

Let’s use the Tukey post-hoc analysis

TukeyHSD(model, conf.level=.95)

Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = score ~ technique, data = df)
$technique
              diff       lwr        upr     p adj
tech2-tech1 -131.8 -267.6121   4.012102 0.0584488
tech3-tech1 -187.1 -322.9121 -51.287898 0.0055676
tech3-tech2  -55.3 -191.1121  80.512102 0.5773136

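From the output, we can see that only one adjusted p-value (“p adj”) is less than 0.05: the tech3-tech1 comparison (p adj = 0.0056). The mean exam scores for technique 1 and technique 3 are therefore significantly different from each other, while the other two pairs are not.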
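
The Tukey result can also be filtered in code, which is convenient when there are many groups. This is a minimal sketch assuming the same data and model as above; the names tukey, pairs, and significant are illustrative.

```r
# Recreate the example data and model so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))
model <- aov(score ~ technique, data = df)

tukey <- TukeyHSD(model, conf.level = 0.95)

# The result for each factor is a matrix with columns diff, lwr, upr, "p adj"
pairs <- tukey$technique

# Keep only the pairs whose adjusted p-value is below 0.05
significant <- pairs[pairs[, "p adj"] < 0.05, , drop = FALSE]
print(significant)
```

Calling plot(tukey) draws the same confidence intervals; an interval that does not cross zero corresponds to a significant pair.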

The Scheffe Method

When comparing group means, the Scheffe method yields the widest confidence intervals and is the most conservative of the post hoc pairwise comparison methods covered here.

To implement the Scheffe post-hoc approach in R, use the ScheffeTest() function from the DescTools package:

library(DescTools)

Now we are ready to perform the Scheffe post-hoc method

ScheffeTest(model)

Posthoc multiple comparisons of means: Scheffe Test
    95% family-wise confidence level
$technique
              diff   lwr.ci    upr.ci   pval   
tech2-tech1 -131.8 -273.671  10.07105 0.0726 . 
tech3-tech1 -187.1 -328.971 -45.22895 0.0078 **
tech3-tech2  -55.3 -197.171  86.57105 0.6064   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

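From the output, we can see that only the tech3-tech1 comparison has a p-value (“pval”) below 0.05 (0.0078), so only techniques 1 and 3 are significantly different from each other. The Scheffe confidence intervals are also wider than the Tukey intervals for the same pairs.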
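
To see where Scheffe p-values of this kind come from, the calculation can be sketched in base R without DescTools. The approach below assumes the standard Scheffe formula for a pairwise contrast: the squared mean difference divided by its variance and by (k - 1), compared against an F distribution with k - 1 and residual degrees of freedom. It is an illustration, not the internal code of ScheffeTest(), and the helper scheffe_p() is hypothetical.

```r
# Recreate the example data and model so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))
model <- aov(score ~ technique, data = df)

groups   <- split(df$score, df$technique)
means    <- sapply(groups, mean)        # group means
n        <- lengths(groups)             # group sizes
k        <- length(groups)              # number of groups
df_resid <- df.residual(model)          # residual degrees of freedom
mse      <- sum(residuals(model)^2) / df_resid  # mean squared error

# Hypothetical helper: Scheffe p-value for the difference between groups a and b
scheffe_p <- function(a, b) {
  diff   <- means[[a]] - means[[b]]
  se2    <- mse * (1 / n[[a]] + 1 / n[[b]])   # variance of the difference
  f_stat <- diff^2 / se2 / (k - 1)            # Scheffe-scaled F statistic
  pf(f_stat, k - 1, df_resid, lower.tail = FALSE)
}

p_scheffe <- c("tech2-tech1" = scheffe_p("tech2", "tech1"),
               "tech3-tech1" = scheffe_p("tech3", "tech1"),
               "tech3-tech2" = scheffe_p("tech3", "tech2"))
print(round(p_scheffe, 4))
```

Dividing by (k - 1) is what makes the method conservative: it protects against every possible contrast among the group means, not just the three pairwise ones.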

The Bonferroni Method

The Bonferroni method is well suited to a set of pre-planned pairwise comparisons.

To use the Bonferroni post-hoc procedure, we can use the R syntax shown below:

Let’s use the Bonferroni post-hoc analysis

pairwise.t.test(df$score, df$technique, p.adjust.method = "bonferroni")

               Pairwise comparisons using t tests with pooled SD
data:  df$score and df$technique
      tech1  tech2
tech2 0.0697 -    
tech3 0.0061 0.9650
P value adjustment method: bonferroni
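
The Bonferroni adjustment itself is simple: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. The sketch below checks this against R's built-in p.adjust(); it assumes the same data as above, and the names raw_p, bonf_manual, and bonf_builtin are illustrative.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))

# Unadjusted pairwise t tests (pooled SD); $p.value is a lower-triangular matrix
raw <- pairwise.t.test(df$score, df$technique, p.adjust.method = "none")$p.value

# Drop the empty (NA) cell; order is tech2-tech1, tech3-tech1, tech3-tech2
raw_p <- raw[!is.na(raw)]
m <- length(raw_p)  # number of comparisons

# Bonferroni by hand: multiply by the number of comparisons, cap at 1
bonf_manual <- pmin(1, raw_p * m)

# The same adjustment using the built-in function
bonf_builtin <- p.adjust(raw_p, method = "bonferroni")

print(round(bonf_manual, 4))
```

After adjustment, only the tech3-tech1 comparison (0.0061) remains below 0.05.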

The Holm Method

The Holm method is also suited to a set of pre-planned pairwise comparisons. It is never less powerful than the Bonferroni approach and is often more powerful.

The Holm post-hoc approach can be used in R using the syntax shown below:

Let’s use the Holm post-hoc analysis

pairwise.t.test(df$score, df$technique, p.adjust.method = "holm")

               Pairwise comparisons using t tests with pooled SD
data:  df$score and df$technique
      tech1  tech2
tech2 0.0465 -    
tech3 0.0061 0.3217
P value adjustment method: holm
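
The Holm adjustment is a step-down version of Bonferroni: the smallest p-value is multiplied by the number of comparisons m, the next smallest by m - 1, and so on, and the adjusted values are then forced to be non-decreasing. The following sketch works through those steps by hand and compares the result with p.adjust(); it assumes the same data as above, and the variable names are illustrative.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))

# Unadjusted p-values in the order tech2-tech1, tech3-tech1, tech3-tech2
raw <- pairwise.t.test(df$score, df$technique, p.adjust.method = "none")$p.value
raw_p <- raw[!is.na(raw)]
m <- length(raw_p)

# Step 1: sort the p-values from smallest to largest
ord <- order(raw_p)

# Step 2: multiply the i-th smallest p-value by (m - i + 1)
stepped <- raw_p[ord] * (m - seq_len(m) + 1)

# Step 3: enforce monotonicity with a running maximum, then cap at 1
holm_sorted <- pmin(1, cummax(stepped))

# Step 4: put the adjusted values back in the original order
holm_manual <- numeric(m)
holm_manual[ord] <- holm_sorted

print(round(holm_manual, 4))
```

In this example the tech2-tech1 comparison is significant under Holm (0.0465) but not under Bonferroni (0.0697), which illustrates the extra power of the step-down procedure.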