Data Science Tutorials

How to do Pairwise Comparisons in R?

Posted on November 23 by Jim

A one-way ANOVA is used to determine whether there is a statistically significant difference between the means of three or more independent groups.

The following null and alternate hypotheses are used in a one-way ANOVA.

H0: All group means are equal.
HA: Not all group means are equal.

If the overall p-value of the ANOVA is less than a predetermined significance level (for example, α = .05), we reject the null hypothesis and conclude that not all of the group means are equal.

We can then conduct post hoc pairwise comparisons to determine which group means differ.
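
As a rough illustration of what "pairwise" means here, the sketch below counts and lists the comparisons for three groups. The labels tech1, tech2 and tech3 are the ones used in the example that follows; the variable names are only illustrative.

```r
# Group labels taken from the example in this article
groups <- c("tech1", "tech2", "tech3")

# Number of distinct pairs among k groups: k * (k - 1) / 2
n_pairs <- choose(length(groups), 2)

# Enumerate the pairs, one pair per column
pairs <- combn(groups, 2)

n_pairs
pairs
```

Each post hoc method discussed below adjusts for the fact that several such comparisons are tested at once.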

Example: One-Way ANOVA in R

Consider a teacher who is curious about whether or not the use of three different study methods affects pupils’ exam results.

To test this, she randomly assigns ten students to each study method and then records their exam results.

To conduct a one-way ANOVA in R and check for differences in the mean exam scores among the three groups, use the following code:

Let’s create a data frame

df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each=10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))
head(df)
  technique score
1     tech1   276
2     tech1   377
3     tech1   407
4     tech1   581
5     tech1   182
6     tech1   112
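
Before fitting the model, it can help to look at the group means that the ANOVA will compare. The following is a minimal sketch using base R's aggregate() on the same data; the object name group_means is just an illustrative choice.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))

# Mean exam score for each study technique
group_means <- aggregate(score ~ technique, data = df, FUN = mean)
group_means
```

For these data the means are 337.6 (tech1), 205.8 (tech2) and 150.5 (tech3), which are the values behind the "diff" columns reported by the post hoc tests later on.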

Now we can perform one-way ANOVA

model <- aov(score ~ technique, data = df)

View output of ANOVA

summary(model)
           Df Sum Sq Mean Sq F value  Pr(>F)  
technique    2 184786   92393   6.159 0.00626 **
Residuals   27 405053   15002               
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Because the overall p-value of the ANOVA (.00626) is less than α = .05, we reject the null hypothesis that the mean exam score is the same for each study method.

We can now conduct post hoc pairwise comparisons to identify which groups have different means.
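
If you would rather check this decision in code than read it off the printed table, the p-value can be pulled out of the summary object. This is a small sketch; alpha, anova_table and p_value are illustrative names, and the row is selected by position because the printed row names are padded with spaces.

```r
# Recreate the example data and model so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))
model <- aov(score ~ technique, data = df)

# summary() on an aov object returns a list; the first element is the ANOVA table
anova_table <- summary(model)[[1]]

# First row is the technique effect, second row is the residuals
p_value <- anova_table[["Pr(>F)"]][1]

alpha <- 0.05
p_value < alpha  # TRUE means we reject the null hypothesis
```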

The Tukey Method

The Tukey post hoc method works best when the sample sizes of the groups are equal.

The built-in TukeyHSD() function in R can be used to implement the Tukey posthoc method:

Let’s use the Tukey post-hoc analysis

TukeyHSD(model, conf.level=.95)
Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = score ~ technique, data = df)
$technique
              diff       lwr        upr     p adj
tech2-tech1 -131.8 -267.6121   4.012102 0.0584488
tech3-tech1 -187.1 -322.9121 -51.287898 0.0055676
tech3-tech2  -55.3 -191.1121  80.512102 0.5773136

From the output, we can see that the only adjusted p-value (“p adj”) less than 0.05 belongs to the tech3-tech1 comparison (0.0055676), so tech1 and tech3 are the only pair whose means differ significantly.
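
With more groups the Tukey table gets long, so it may be convenient to filter it. The sketch below keeps only the rows whose adjusted p-value is below 0.05; the object names tukey and significant are illustrative.

```r
# Recreate the example data and model so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))
model <- aov(score ~ technique, data = df)

# TukeyHSD() returns a list with one matrix per factor in the model
tukey <- TukeyHSD(model, conf.level = 0.95)$technique

# Keep only the comparisons with an adjusted p-value below 0.05;
# drop = FALSE keeps the result a matrix even if one row remains
significant <- tukey[tukey[, "p adj"] < 0.05, , drop = FALSE]
significant
```

For a significant pair, the confidence interval (lwr, upr) does not contain zero, which is another way to read the same result.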

The Scheffe Method

When comparing group means, the Scheffe method yields the widest confidence intervals and is the most conservative post hoc pairwise comparison method.

To implement the Scheffe post-hoc approach in R, use the ScheffeTest() function from the DescTools package:

library(DescTools)

Now we are ready to perform the Scheffe post-hoc method

ScheffeTest(model)
Posthoc multiple comparisons of means: Scheffe Test
    95% family-wise confidence level
$technique
              diff   lwr.ci    upr.ci   pval   
tech2-tech1 -131.8 -273.671  10.07105 0.0726 . 
tech3-tech1 -187.1 -328.971 -45.22895 0.0078 **
tech3-tech2  -55.3 -197.171  86.57105 0.6064   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From the output, we can see that the only p-value (“pval”) less than 0.05 belongs to the tech3-tech1 comparison (0.0078), so tech1 and tech3 are again the only pair whose means differ significantly.
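
To see where these p-values come from, here is a base R sketch of the standard Scheffe calculation for pairwise contrasts. It does not use DescTools, and it is not a description of how ScheffeTest() is implemented internally. For each pair, the squared difference in means is divided by its estimated variance and by (k - 1), and the result is compared with an F distribution with (k - 1) and residual degrees of freedom. All object names are illustrative.

```r
# Recreate the example data and model so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))
model <- aov(score ~ technique, data = df)

# Group means and group sizes
by_group <- split(df$score, df$technique)
means <- sapply(by_group, mean)
n <- lengths(by_group)
k <- length(means)

# Residual degrees of freedom and mean squared error from the ANOVA
df_res <- df.residual(model)
mse <- deviance(model) / df_res

# All pairs of groups, one pair per column
pairs <- combn(names(means), 2)
diffs <- means[pairs[2, ]] - means[pairs[1, ]]
var_diff <- mse * (1 / n[pairs[2, ]] + 1 / n[pairs[1, ]])

# Scheffe F statistic and p-value for each pairwise contrast
f_stat <- diffs^2 / var_diff / (k - 1)
p_scheffe <- pf(f_stat, k - 1, df_res, lower.tail = FALSE)
names(p_scheffe) <- paste(pairs[2, ], pairs[1, ], sep = "-")

round(p_scheffe, 4)
```

Dividing by (k - 1) is what makes the method conservative: it protects against every possible contrast among the group means, not just the pairwise ones.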

The Bonferroni Method

When you want to make a set of pre-planned pairwise comparisons, the Bonferroni method is the best to apply.

To use the Bonferroni post-hoc procedure, we can use the R syntax shown below:

Let’s use the Bonferroni post-hoc analysis

pairwise.t.test(df$score, df$technique, p.adjust.method = "bonferroni")
               Pairwise comparisons using t tests with pooled SD
data:  df$score and df$technique
      tech1  tech2
tech2 0.0697 -    
tech3 0.0061 0.9650
P value adjustment method: bonferroni
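
The Bonferroni adjustment itself is simple: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. The sketch below reproduces the adjusted values by hand from the unadjusted pooled-SD t tests; the object names are illustrative.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))

# Unadjusted p-values from pairwise t tests with pooled SD
raw <- pairwise.t.test(df$score, df$technique, p.adjust.method = "none")$p.value

# Flatten the lower-triangular matrix, dropping the empty (NA) cell
p_raw <- raw[!is.na(raw)]
names(p_raw) <- c("tech2-tech1", "tech3-tech1", "tech3-tech2")

# Bonferroni: multiply by the number of comparisons, cap at 1
m <- length(p_raw)
p_bonf <- pmin(1, p_raw * m)

round(p_bonf, 4)
```

With these data, only the tech3-tech1 comparison (adjusted p = 0.0061) falls below 0.05. The same numbers can be obtained with p.adjust(p_raw, method = "bonferroni").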

The Holm Method

When you want to make a set of pairwise comparisons planned in advance, you can also use the Holm method. It is never less powerful than the Bonferroni approach and is often more powerful.

The Holm post-hoc approach can be used in R using the syntax shown below:

Let’s use the Holm post-hoc analysis

pairwise.t.test(df$score, df$technique, p.adjust.method = "holm")
               Pairwise comparisons using t tests with pooled SD
data:  df$score and df$technique
      tech1  tech2
tech2 0.0465 -    
tech3 0.0061 0.3217
P value adjustment method: holm
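
The Holm adjustment is a step-down version of Bonferroni. The unadjusted p-values are sorted from smallest to largest, the smallest is multiplied by m, the next by m - 1, and so on, and each adjusted value is forced to be at least as large as the one before it. Below is a minimal sketch of that procedure, checked against p.adjust(); the object names are illustrative.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(technique = rep(c("tech1", "tech2", "tech3"), each = 10),
                 score = c(276, 377, 407, 581, 182, 112, 483, 484, 185, 289,
                           81, 82, 183, 183, 183, 584, 187, 190, 192, 193,
                           77, 78, 177, 178, 179, 140, 178, 195, 145, 158))

# Unadjusted p-values from pairwise t tests with pooled SD
raw <- pairwise.t.test(df$score, df$technique, p.adjust.method = "none")$p.value
p_raw <- raw[!is.na(raw)]
names(p_raw) <- c("tech2-tech1", "tech3-tech1", "tech3-tech2")

m <- length(p_raw)
ord <- order(p_raw)  # positions of the p-values from smallest to largest

# Multiply the i-th smallest p-value by (m - i + 1)
stepped <- p_raw[ord] * (m - seq_len(m) + 1)

# Enforce monotonicity with a running maximum, then cap at 1
holm_sorted <- pmin(1, cummax(stepped))

# Put the adjusted values back in the original order
p_holm <- setNames(holm_sorted[order(ord)], names(p_raw))

round(p_holm, 4)
```

Under Holm, the tech2-tech1 comparison has an adjusted p-value of 0.0465, which is below 0.05, whereas under Bonferroni it was 0.0697. This is an example of the extra power the Holm method can provide.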

Comment (1) on “How to do Pairwise Comparisons in R?”

  1. Mario Hasler says:
    November 25 at 8:35 pm

    No! Multiple contrast tests are the best way. See the R-packages lsmeans, multcomp, nparcomp, …!

Copyright © 2023 Data Science Tutorials.

Powered by PressBook News WordPress theme