How to perform the Kruskal-Wallis test in R?

Posted on May 13 by Jim

The Kruskal-Wallis test by rank is a non-parametric alternative to the one-way ANOVA test for comparing more than two groups.

It extends the two-sample Wilcoxon test to more than two groups, and it is recommended when the assumptions of the one-way ANOVA test (such as normally distributed residuals and equal variances) are not met.
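One way to judge whether the ANOVA assumptions hold is sketched below. It is only an illustration using base R functions on the built-in PlantGrowth data; the object names (fit, normality, homogeneity) are made up for this example, and which checks to run is a judgment call.

```r
# Sketch: check two common one-way ANOVA assumptions on PlantGrowth.
fit <- aov(weight ~ group, data = PlantGrowth)

# Normality of the residuals (Shapiro-Wilk test)
normality <- shapiro.test(residuals(fit))

# Homogeneity of variances across groups (Bartlett test)
homogeneity <- bartlett.test(weight ~ group, data = PlantGrowth)

# Small p-values suggest the corresponding assumption is violated
normality$p.value
homogeneity$p.value
```

If either check gives cause for concern, or the sample is too small to check reliably, a rank-based test such as Kruskal-Wallis is a reasonable fallback.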

This article will show you how to use R to compute the Kruskal-Wallis test.

How to perform the Kruskal-Wallis test in R

We’ll use the PlantGrowth data set that comes with R. It provides the weight of plants produced under two distinct treatment conditions and a control condition.

data <- PlantGrowth

Let’s print the first rows of the data set:

head(data)
  weight group
1   4.17  ctrl
2   5.58  ctrl
3   5.18  ctrl
4   6.11  ctrl
5   4.50  ctrl
6   4.61  ctrl

The column “group” is a factor in R, and its categories (“ctrl”, “trt1”, “trt2”) are the factor levels. By default the levels are listed in alphabetical order.

Display group levels

levels(data$group)
[1] "ctrl" "trt1" "trt2"

If the levels are not in the correct order automatically, reorder them as follows:

data$group <- ordered(data$group,
                         levels = c("ctrl", "trt1", "trt2"))
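Note that ordered() creates an ordered factor. If you only want to control the order in which the levels appear, a plain factor is enough. The following is a minimal sketch of that alternative, not part of the original workflow; it starts again from PlantGrowth so that it runs on its own.

```r
# Sketch: set the level order while keeping an unordered factor.
data <- PlantGrowth
data$group <- factor(data$group,
                     levels = c("ctrl", "trt1", "trt2"),
                     ordered = FALSE)

levels(data$group)      # "ctrl" "trt1" "trt2"
is.ordered(data$group)  # FALSE
```

Either form works with kruskal.test(), which only uses the factor to split the observations into groups.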

Summary statistics can be calculated by group, for example with the dplyr package.

Type this to install the dplyr package:

install.packages("dplyr")

Compute summary statistics by groups:

library(dplyr)
group_by(data, group) %>%
  summarise(
    count = n(),
    mean = mean(weight, na.rm = TRUE),
    sd = sd(weight, na.rm = TRUE),
    median = median(weight, na.rm = TRUE),
    IQR = IQR(weight, na.rm = TRUE)
  )

Source: local data frame [3 x 6]

   group count  mean        sd median    IQR
  (fctr) (int) (dbl)     (dbl)  (dbl)  (dbl)
1   ctrl    10 5.032 0.5830914  5.155 0.7425
2   trt1    10 4.661 0.7936757  4.550 0.6625
3   trt2    10 5.526 0.4425733  5.435 0.4675
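If you would rather not install dplyr, the same per-group summaries can be sketched in base R. The names summarise_weights and summary_stats are hypothetical helpers introduced for this example only.

```r
# Sketch: per-group summary statistics with base R only.
summarise_weights <- function(w) {
  c(count  = length(w),
    mean   = mean(w, na.rm = TRUE),
    sd     = sd(w, na.rm = TRUE),
    median = median(w, na.rm = TRUE),
    IQR    = IQR(w, na.rm = TRUE))
}

# split() gives one weight vector per group; sapply() applies the helper,
# and t() turns the result into one row per group
summary_stats <- t(sapply(split(PlantGrowth$weight, PlantGrowth$group),
                          summarise_weights))
summary_stats
```

The result is a numeric matrix with one row per group, so individual values can be read with, for example, summary_stats["ctrl", "median"].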

Use box plots to visualize the data.

Base R graphics would also work, but for easy ggplot2-based data visualization we’ll use the ggpubr R package.

Download and install the most recent version of ggpubr.

install.packages("ggpubr")

Let’s plot weight by group and color by group

library("ggpubr")
ggboxplot(data, x = "group", y = "weight",
          color = "group", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          order = c("ctrl", "trt1", "trt2"),
          ylab = "Weight", xlab = "Treatment")

Plot the group means with error bars (mean_se) and the individual observations:

library("ggpubr")
ggline(data, x = "group", y = "weight",
       add = c("mean_se", "jitter"),
       order = c("ctrl", "trt1", "trt2"),
       ylab = "Weight", xlab = "Treatment")

Compute Kruskal-Wallis test

We want to know whether plant weights differ significantly among the three experimental conditions.

The test can be run using the kruskal.test() function as follows.

kruskal.test(weight ~ group, data = data)

    Kruskal-Wallis rank sum test

data:  weight by group
Kruskal-Wallis chi-squared = 7.9882, df = 2, p-value = 0.01842
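To see where the reported statistic comes from, the calculation can be sketched by hand: rank all observations together, sum the ranks within each group, and compare the result with a chi-squared distribution. This is an illustrative reconstruction using the standard H formula with a tie correction; the variable names are chosen for this example.

```r
# Sketch: compute the Kruskal-Wallis H statistic manually.
weight <- PlantGrowth$weight
group  <- PlantGrowth$group

r   <- rank(weight)               # joint ranks, ties get mid-ranks
n   <- length(weight)             # total number of observations
R_i <- tapply(r, group, sum)      # rank sum per group
n_i <- tapply(r, group, length)   # group sizes

# H = 12 / (n (n + 1)) * sum(R_i^2 / n_i) - 3 (n + 1)
H <- 12 / (n * (n + 1)) * sum(R_i^2 / n_i) - 3 * (n + 1)

# Correction for tied values
tie_sizes <- as.numeric(table(weight))
H <- H / (1 - sum(tie_sizes^3 - tie_sizes) / (n^3 - n))

# Under the null hypothesis H is approximately chi-squared with k - 1 df
p_value <- pchisq(H, df = nlevels(group) - 1, lower.tail = FALSE)

H        # should match the chi-squared value from kruskal.test()
p_value  # should match its p-value
```

Because the statistic depends only on ranks, a few extreme weights have much less influence here than they would on an ANOVA F statistic.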

Inference

Because the p-value (0.01842) is below the 0.05 significance level, we can conclude that there are significant differences between the treatment groups.
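The same decision can be made programmatically, since kruskal.test() returns a list-like object. The sketch below assumes a significance level of 0.05; the names res and alpha are illustrative.

```r
# Sketch: extract the test result and compare it with a chosen alpha.
res   <- kruskal.test(weight ~ group, data = PlantGrowth)
alpha <- 0.05  # assumed significance level

res$statistic        # Kruskal-Wallis chi-squared
res$parameter        # degrees of freedom
res$p.value < alpha  # TRUE when the groups differ significantly
```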

Multiple pairwise comparisons between groups

The Kruskal-Wallis test tells us that there is a significant difference between groups, but not which pairs of groups differ.

The function pairwise.wilcox.test() calculates pairwise comparisons between group levels, with a correction for multiple testing.

pairwise.wilcox.test(PlantGrowth$weight, PlantGrowth$group,
                 p.adjust.method = "BH")

    Pairwise comparisons using Wilcoxon rank sum test

data:  PlantGrowth$weight and PlantGrowth$group
     ctrl  trt1
trt1 0.199 -   
trt2 0.095 0.027

p-value adjustment method: BH
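To make the correction step visible, the following sketch first gets the unadjusted pairwise p-values and then applies the Benjamini-Hochberg (BH) adjustment with p.adjust(). The object names are illustrative. suppressWarnings() is used only because one weight value is tied between ctrl and trt1, which makes R warn that an exact p-value cannot be computed for that pair.

```r
# Sketch: raw pairwise p-values followed by a manual BH adjustment.
raw <- suppressWarnings(
  pairwise.wilcox.test(PlantGrowth$weight, PlantGrowth$group,
                       p.adjust.method = "none")
)$p.value

# Keep the three actual comparisons (lower triangle of the matrix)
raw_p <- raw[lower.tri(raw, diag = TRUE)]
names(raw_p) <- c("ctrl-trt1", "ctrl-trt2", "trt1-trt2")

# Benjamini-Hochberg adjustment controls the false discovery rate
adj_p <- p.adjust(raw_p, method = "BH")

round(raw_p, 3)
round(adj_p, 3)
```

The adjusted values should agree with the table printed by pairwise.wilcox.test() when p.adjust.method = "BH" is used.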

Conclusion

In the pairwise comparisons, only trt1 and trt2 differ significantly (adjusted p = 0.027 < 0.05).
