
How to perform the Kruskal-Wallis test in R?

Posted on May 13 by Admin

When there are more than two groups, the Kruskal-Wallis test by rank is a non-parametric alternative to the one-way ANOVA test.

It extends the two-sample Wilcoxon test to more than two groups. This method is advised when the assumptions of the one-way ANOVA test are not met.

This article will show you how to use R to compute the Kruskal-Wallis test.

How to perform the Kruskal-Wallis test in R

We’ll use the PlantGrowth data set that comes with R. It provides the weight of plants produced under two distinct treatment conditions and a control condition.

data <- PlantGrowth

Let’s print the first rows of the data set:

head(data)
  weight group
1   4.17  ctrl
2   5.58  ctrl
3   5.18  ctrl
4   6.11  ctrl
5   4.50  ctrl
6   4.61  ctrl

The column “group” is a factor in R, and its categories (“ctrl”, “trt1”, “trt2”) are the factor levels. By default, the levels are listed in alphabetical order.

Display group levels

levels(data$group)
[1] "ctrl" "trt1" "trt2"

If the levels are not already in the desired order, reorder them as follows (note that ordered() also turns the column into an ordered factor):

data$group <- ordered(data$group,
                         levels = c("ctrl", "trt1", "trt2"))
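
If you only need to control the order of the levels and would rather keep a plain (unordered) factor, a minimal sketch using base R's factor() could look like this. The variable name data follows the tutorial; the choice to keep the factor unordered is an assumption about what you want, not a requirement of the test.

```r
# Work on a copy of the built-in data set, as in the tutorial
data <- PlantGrowth

# factor() with explicit levels fixes the level order
# but, unlike ordered(), keeps the factor unordered
data$group <- factor(data$group, levels = c("ctrl", "trt1", "trt2"))

levels(data$group)      # level order used in tables and plots
is.ordered(data$group)  # FALSE: still a plain factor
```

kruskal.test() treats the grouping variable as a set of categories either way, so the level order mainly affects how tables and plots are arranged.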

Summary statistics can be computed by group using the dplyr package.

Install dplyr if it is not already available:

install.packages("dplyr")

Compute summary statistics by groups:

library(dplyr)
group_by(data, group) %>%
  summarise(
    count = n(),
    mean = mean(weight, na.rm = TRUE),
    sd = sd(weight, na.rm = TRUE),
    median = median(weight, na.rm = TRUE),
    IQR = IQR(weight, na.rm = TRUE)
  )

Source: local data frame [3 x 6]

   group count  mean        sd median    IQR
  (fctr) (int) (dbl)     (dbl)  (dbl)  (dbl)
1   ctrl    10 5.032 0.5830914  5.155 0.7425
2   trt1    10 4.661 0.7936757  4.550 0.6625
3   trt2    10 5.526 0.4425733  5.435 0.4675
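
If you prefer not to install dplyr, roughly the same table can be sketched in base R. The helper name summarise_weight and the object name stats are hypothetical and used only for illustration.

```r
data <- PlantGrowth

# Hypothetical helper: the same five statistics as in the dplyr call
summarise_weight <- function(w) {
  c(count  = length(w),
    mean   = mean(w, na.rm = TRUE),
    sd     = sd(w, na.rm = TRUE),
    median = median(w, na.rm = TRUE),
    IQR    = IQR(w, na.rm = TRUE))
}

# split() groups the weights by treatment; sapply() applies the helper
# to each group; t() puts groups in rows and statistics in columns
stats <- t(sapply(split(data$weight, data$group), summarise_weight))
stats
```

The result is a numeric matrix with one row per group, so a single value can be read with, for example, stats["trt1", "median"].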

Use box plots to visualize the data.

Box plots can also be drawn with R base graphics. For easy ggplot2-based data visualization, we’ll use the ggpubr R package.

Install the latest version of ggpubr from CRAN:

install.packages("ggpubr")

Let’s plot weight by group, colored by group:

library("ggpubr")
ggboxplot(data, x = "group", y = "weight",
          color = "group", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          order = c("ctrl", "trt1", "trt2"),
          ylab = "Weight", xlab = "Treatment")
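
As a rough base R counterpart that needs no extra package, the same comparison can be sketched with boxplot(). The colors are simply reused from the ggpubr call above; the object name bp is an assumption for illustration.

```r
# Base R box plot of weight by treatment group.
# boxplot() draws the plot and invisibly returns the statistics it used.
bp <- boxplot(weight ~ group, data = PlantGrowth,
              col = c("#00AFBB", "#E7B800", "#FC4E07"),
              xlab = "Treatment", ylab = "Weight")

bp$names  # group labels, in plotting order
bp$n      # number of observations per group
```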

Plot the group means with standard-error bars (mean_se) and jittered points:

library("ggpubr")
ggline(data, x = "group", y = "weight",
       add = c("mean_se", "jitter"),
       order = c("ctrl", "trt1", "trt2"),
       ylab = "Weight", xlab = "Treatment")

Compute Kruskal-Wallis test

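We want to know whether plant weights differ significantly among the three experimental conditions. Because the Kruskal-Wallis test works on ranks, it compares the distributions of the groups rather than their means.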

The test can be run using the kruskal.test() function as follows.

kruskal.test(weight ~ group, data = data)

    Kruskal-Wallis rank-sum test

data:  weight by group
Kruskal-Wallis chi-squared = 7.9882, df = 2, p-value = 0.01842
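
The printed result is an object of class "htest", so its parts can be pulled out for reporting or further computation. A minimal sketch (the name res is an assumption for illustration):

```r
# Store the test result instead of only printing it
res <- kruskal.test(weight ~ group, data = PlantGrowth)

res$statistic  # Kruskal-Wallis chi-squared
res$parameter  # degrees of freedom (number of groups minus 1)
res$p.value    # p-value

# Example decision rule at the 5% significance level
res$p.value < 0.05
```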

Inference

Because the p-value (0.01842) is less than the significance level of 0.05, we can conclude that there are significant differences between the treatment groups.
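
To see where the chi-squared value comes from, the statistic can be reproduced by hand from the ranks. The sketch below uses the standard Kruskal-Wallis formula with a correction for ties; all variable names are illustrative assumptions.

```r
x <- PlantGrowth$weight
g <- PlantGrowth$group
n <- length(x)                      # total number of observations, N

r         <- rank(x)                # mid-ranks: tied values share the average rank
rank_sums <- tapply(r, g, sum)      # R_i: sum of ranks within each group
sizes     <- tapply(r, g, length)   # n_i: size of each group

# Uncorrected statistic: H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
H <- 12 / (n * (n + 1)) * sum(rank_sums^2 / sizes) - 3 * (n + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N),
# where t is the number of observations sharing each distinct value
ties        <- table(x)
H_corrected <- H / (1 - sum(ties^3 - ties) / (n^3 - n))

# The p-value uses a chi-squared distribution with (number of groups - 1) df
p_value <- pchisq(H_corrected, df = nlevels(g) - 1, lower.tail = FALSE)

H_corrected
p_value
```

The tie correction matters here because the value 4.17 occurs twice in the data.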

Multiple pairwise comparisons between groups

The Kruskal-Wallis test tells us that there is a significant difference between groups, but not which pairs of groups differ.

The function pairwise.wilcox.test() can be used to compute pairwise comparisons between group levels, with corrections for multiple testing.

pairwise.wilcox.test(PlantGrowth$weight, PlantGrowth$group,
                 p.adjust.method = "BH")

    Pairwise comparisons using the Wilcoxon rank-sum test

data:  PlantGrowth$weight and PlantGrowth$group
     ctrl  trt1
trt1 0.199 -   
trt2 0.095 0.027

p-value adjustment method: BH
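
The adjusted p-values are stored as a matrix in the returned object, which makes it easy to pick out the significant pairs programmatically. A minimal sketch (the name pw is an assumption; expect R to warn that exact p-values cannot be computed with ties, since 4.17 appears in both ctrl and trt1):

```r
# Same call as above, but keep the result
pw <- pairwise.wilcox.test(PlantGrowth$weight, PlantGrowth$group,
                           p.adjust.method = "BH")

pw$p.value  # lower-triangular matrix of BH-adjusted p-values

# Row/column positions of pairs significant at the 5% level;
# which() skips the NA cells in the upper triangle
which(pw$p.value < 0.05, arr.ind = TRUE)
```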

Conclusion

In the pairwise comparison, only trt1 and trt2 differ significantly (adjusted p = 0.027 < 0.05).

Copyright © 2025 Data Science Tutorials.

Powered by PressBook News WordPress theme