Data Science Tutorials

Hypothesis Testing in R

Posted on December 4 by Jim

A hypothesis test is a formal statistical test used to reject or fail to reject a statistical hypothesis.

This tutorial demonstrates the following hypothesis tests in R.

  • One-sample t-test
  • Two-sample t-test
  • Paired samples t-test

Each type of test can be run using the R function t.test().

The t.test() function uses the following syntax:

t.test(x, y = NULL,
       alternative = c("two.sided", "less", "greater"),
       mu = 0, paired = FALSE, var.equal = FALSE,
       conf.level = 0.95, ...)

where:

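x, y: The samples of data. y is only needed for two-sample and paired tests.

alternative: The alternative hypothesis of the test ("two.sided" by default).

mu: The hypothesized value of the mean, or of the difference in means when two samples are given.

paired: Whether to run a paired t-test.

var.equal: Whether to assume that the variances of the two samples are equal.

conf.level: The confidence level to use for the confidence interval.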
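
As a minimal sketch of how these arguments fit together, the call below runs a one-sided test and reports a 90% confidence interval. The scores vector and the hypothesized mean of 12 are invented purely for illustration and are not part of the examples in this tutorial.

```r
# Hypothetical measurements, invented for illustration only
scores <- c(12.1, 11.8, 12.6, 12.3, 11.9, 12.4, 12.0, 12.5)

# H0: mean = 12 versus HA: mean > 12, reported with a 90% confidence interval
res <- t.test(x = scores, alternative = "greater", mu = 12, conf.level = 0.90)

res$alternative   # the alternative hypothesis that was tested
res$null.value    # the hypothesized mean passed in through mu
res$conf.int      # one-sided interval, so the upper bound is Inf
```

With alternative = "greater" or "less", the confidence interval is one-sided, so one of its bounds is infinite.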

The following examples show how to use this function in practice.

Example 1: One-Sample t-test in R

A one-sample t-test is used to determine whether the population’s mean is equal to a given value.

Suppose we want to determine whether the mean weight of a particular species of turtle is 310 pounds. We collect a simple random sample of turtles with the weights listed below.

Weights: 301, 305, 312, 315, 318, 319, 310, 318, 305, 313, 305, 305, 305

The following code shows how to perform this one sample t-test in R:

# define a vector of turtle weights
weights <- c(301, 305, 312, 315, 318, 319, 310, 318, 305, 313, 305, 305, 305)

# perform a one-sample t-test
t.test(x = weights, mu = 310)
               One Sample t-test
data:  weights
t = 0.045145, df = 12, p-value = 0.9647
alternative hypothesis: true mean is not equal to 310
95 percent confidence interval:
 306.3644 313.7895
sample estimates:
mean of x
 310.0769

From the output we can see:

t-test statistic: 0.045145

degrees of freedom: 12

p-value: 0.9647

95% confidence interval for true mean: [306.3644, 313.7895]

mean of turtle weights: 310.0769

We fail to reject the null hypothesis since the test’s p-value (0.9647) is greater than 0.05.

This means that we lack sufficient evidence to conclude that the mean weight of this species of turtle is different from 310 pounds.
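
The values in the printed output can also be pulled out of the object that t.test() returns, which is handy when the decision has to be made in code. The sketch below does that and recomputes the t statistic by hand as the sample mean minus the hypothesized mean, divided by sd / sqrt(n). The names result, t_manual, alpha, and reject_null are our own choices, and the 0.05 threshold is the one used throughout this tutorial.

```r
weights <- c(301, 305, 312, 315, 318, 319, 310, 318, 305, 313, 305, 305, 305)

# store the test result instead of only printing it
result <- t.test(x = weights, mu = 310)

result$statistic   # t statistic
result$parameter   # degrees of freedom
result$p.value     # p-value
result$conf.int    # 95% confidence interval for the mean
result$estimate    # sample mean

# recompute the t statistic by hand to see where it comes from
n <- length(weights)
t_manual <- (mean(weights) - 310) / (sd(weights) / sqrt(n))
t_manual

# decision at the 0.05 significance level
alpha <- 0.05
reject_null <- result$p.value < alpha
reject_null
```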

Example 2: Two-Sample t-test in R

To determine whether the means of two populations are equal, a two-sample t-test is employed.

Suppose we want to determine whether the mean weights of two different species of turtles are equal. To test this, we collect a simple random sample of turtles from each species with the following weights.

Sample 1: 310, 311, 310, 315, 311, 319, 310, 318, 315, 313, 315, 311, 313

Sample 2: 335, 339, 332, 331, 334, 339, 334, 318, 315, 331, 317, 330, 325

The following code shows how to perform this two-sample t-test in R:

# create a vector of turtle weights for each sample
sample1 <- c(310, 311, 310, 315, 311, 319, 310, 318, 315, 313, 315, 311, 313)
sample2 <- c(335, 339, 332, 331, 334, 339, 334, 318, 315, 331, 317, 330, 325)

# perform a two-sample t-test
t.test(x = sample1, y = sample2)
               Welch Two Sample t-test
data:  sample1 and sample2
t = -6.7233, df = 15.366, p-value = 6.029e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -21.16313 -10.99071
sample estimates:
mean of x mean of y
 313.1538  329.2308

We reject the null hypothesis because the test’s p-value (6.029e-06) is less than 0.05.

This means we have sufficient evidence to conclude that the mean weights of the two species are not equal.
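
The output above is labeled "Welch Two Sample t-test" because var.equal defaults to FALSE. As a hedged sketch, the block below compares that default with the pooled-variance version obtained by setting var.equal = TRUE. Whether equal variances is a reasonable assumption for these turtles is not something this tutorial establishes; the comparison only shows what the argument changes. The names welch and pooled are our own.

```r
sample1 <- c(310, 311, 310, 315, 311, 319, 310, 318, 315, 313, 315, 311, 313)
sample2 <- c(335, 339, 332, 331, 334, 339, 334, 318, 315, 331, 317, 330, 325)

# default: Welch's t-test, which does not assume equal variances
welch <- t.test(x = sample1, y = sample2)

# pooled-variance t-test, which assumes equal variances
pooled <- t.test(x = sample1, y = sample2, var.equal = TRUE)

# Welch adjusts the degrees of freedom; the pooled test uses n1 + n2 - 2
welch$parameter
pooled$parameter

# compare the test statistics and p-values
c(welch = unname(welch$statistic), pooled = unname(pooled$statistic))
c(welch = welch$p.value, pooled = pooled$p.value)
```

Because the two samples here have the same size, both versions produce the same t statistic and differ only in their degrees of freedom, and therefore in their p-values.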

Example 3: Paired Samples t-test in R

When each observation in one sample can be paired with an observation in the other sample, a paired samples t-test is used to compare the means of the two samples.

For instance, let’s say we want to determine if a particular training program may help basketball players raise their maximum vertical jump (in inches).

To test this, we might recruit a simple random sample of 12 college basketball players and measure each player’s maximum vertical jump. Then, after each player has followed the training program for a month, we measure their maximum vertical jump again.

The following data show the maximum jump height (in inches) for each player before and after the training program.

Before: 122, 124, 120, 119, 119, 120, 122, 125, 124, 123, 122, 121

After: 123, 125, 120, 124, 118, 122, 123, 128, 124, 125, 124, 120

The following code shows how to perform this paired samples t-test in R:

# define before and after max jump heights
before <- c(122, 124, 120, 119, 119, 120, 122, 125, 124, 123, 122, 121)
after <- c(123, 125, 120, 124, 118, 122, 123, 128, 124, 125, 124, 120)

# perform a paired samples t-test
t.test(x = before, y = after, paired = TRUE)
               Paired t-test
data:  before and after
t = -2.5289, df = 11, p-value = 0.02803
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.3379151 -0.1620849
sample estimates:
mean of the differences
                  -1.25

We reject the null hypothesis since the test’s p-value (0.02803) is less than 0.05.

This means we have sufficient evidence to conclude that the mean jump height before and after the training program is not equal.
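
A paired t-test is equivalent to a one-sample t-test on the within-pair differences, which the sketch below checks. It also runs a one-sided variant with alternative = "less"; choosing that direction is an assumption on our part, made because the question is whether jump height went up after training. The names diffs, one_sample, and one_sided are our own.

```r
before <- c(122, 124, 120, 119, 119, 120, 122, 125, 124, 123, 122, 121)
after <- c(123, 125, 120, 124, 118, 122, 123, 128, 124, 125, 124, 120)

paired <- t.test(x = before, y = after, paired = TRUE)

# the same test expressed as a one-sample t-test on the differences
diffs <- before - after
one_sample <- t.test(x = diffs, mu = 0)

# one-sided version: HA is that the mean of (before - after) is below 0,
# i.e. that jump height was higher after training
one_sided <- t.test(x = before, y = after, paired = TRUE, alternative = "less")

paired$p.value
one_sample$p.value
one_sided$p.value
```

Since the observed mean difference is negative, the one-sided p-value is half of the two-sided one.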

Comment (1) on “Hypothesis Testing in R”

  1. StatistikinDD says:
    December 14 at 1:13 am

    Nice post!
    Did you exclude the R code for the two sample t-test?
