Data Science Tutorials

Two Sample Proportions test in R-Complete Guide

Posted on May 28 / May 27 By Jim

The two-proportions z-test is used to compare two observed proportions.

This article explains the fundamentals of the two-proportions z-test and gives practical examples using R.

We have two groups of people, for example:

Group A (people with lung cancer): n = 500

Group B (healthy people): n = 500

The number of smokers in each group is as follows:

Group A (lung cancer): n = 500, 450 smokers, pA = 450/500 = 0.9

Group B (healthy): n = 500, 400 smokers, pB = 400/500 = 0.8

The overall proportion of smokers is $p = \frac{450 + 400}{500 + 500} = 0.85$

The overall proportion of non-smokers is q = 1 − p = 0.15
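
The arithmetic above can be reproduced in base R. This is only a small sketch; the variable names (x_a, n_a, and so on) are illustrative and not part of the original example.

```r
# Counts taken from the smoking example in the text
x_a <- 450; n_a <- 500   # smokers and group size, Group A (lung cancer)
x_b <- 400; n_b <- 500   # smokers and group size, Group B (healthy)

p_a <- x_a / n_a                  # observed proportion in Group A
p_b <- x_b / n_b                  # observed proportion in Group B
p   <- (x_a + x_b) / (n_a + n_b)  # overall (pooled) proportion of smokers
q   <- 1 - p                      # overall proportion of non-smokers

print(c(pA = p_a, pB = p_b, p = p, q = q))
```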

We’d like to know if the proportions of smokers in the two categories of people are the same.

The following are examples of typical research questions:

  1. whether the proportion of smokers in group A (pA) is the same as the proportion of smokers in group B (pB)?
  2. whether the observed proportion of smokers in group A (pA) is lower than that in group B (pB)?
  3. whether the proportion of smokers in group A (pA) is higher than the proportion of smokers in group B (pB)?

In statistics, the corresponding null hypotheses (H0) are defined as follows:

  1. H0: pA = pB
  2. H0: pA ≥ pB
  3. H0: pA ≤ pB

The corresponding alternative hypotheses (Ha) are:

  1. Ha: pA ≠ pB (different)
  2. Ha: pA < pB (less)
  3. Ha: pA > pB (greater)

Note that:

A two-tailed test is used for hypothesis 1.

One-tailed tests are used for hypotheses 2 and 3.

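The test statistic is $z = \frac{p_A - p_B}{\sqrt{pq/n_A + pq/n_B}}$, where pA and pB are the observed proportions in groups A and B, nA and nB are the group sizes, and p and q are the overall proportions.

For a two-tailed test, if |z| is less than 1.96, the difference is not significant at the 5% level.

If |z| is greater than or equal to 1.96, the difference is significant at the 5% level.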

The z-table contains the corresponding significance level (p-value) for the z-statistic. We’ll look at how to do it in R.
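
Before turning to prop.test(), the z-statistic for the smoking example can be computed by hand. The sketch below uses the usual pooled-standard-error form of the statistic and pnorm() in place of a z-table; the variable names are illustrative.

```r
# Counts from the smoking example: successes (smokers) and group sizes
x <- c(450, 400)
n <- c(500, 500)

p_hat  <- x / n             # observed proportions pA and pB
p_pool <- sum(x) / sum(n)   # overall proportion p
q_pool <- 1 - p_pool        # overall proportion q

# Pooled standard error of the difference pA - pB
se <- sqrt(p_pool * q_pool * sum(1 / n))
z  <- (p_hat[1] - p_hat[2]) / se

# Two-sided p-value from the standard normal distribution
p_two_sided <- 2 * pnorm(-abs(z))

print(z)             # about 4.43, well above the 1.96 cutoff
print(p_two_sided)
```

Because this hand calculation applies no continuity correction, its p-value will not match the default prop.test() output exactly.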

Two Sample Proportions test in R

R function: prop.test()

prop.test(x, n, p = NULL, alternative = "two.sided", correct = TRUE)

x: a vector of counts of successes

n: a vector of counts of trials

alternative: the alternative hypothesis, given as a character string ("two.sided", "less" or "greater")

correct: a logical value indicating whether Yates’ continuity correction should be applied where possible

Note that prop.test() applies the Yates continuity correction by default, which matters most when the expected number of successes or failures in either group is less than 5.
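
One way to check the "less than 5" condition is to compute the expected counts under the null hypothesis of equal proportions. The following is a rough sketch for the smoking example; the object name expected is an illustrative choice.

```r
# Counts from the smoking example
x <- c(450, 400)
n <- c(500, 500)

# Under H0 both groups share the overall proportion of smokers
p_pool <- sum(x) / sum(n)

# Expected successes (smokers) and failures (non-smokers) per group
expected <- rbind(successes = n * p_pool,
                  failures  = n * (1 - p_pool))
colnames(expected) <- c("Group A", "Group B")

print(expected)
print(all(expected >= 5))   # TRUE here: every expected count is far above 5
```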

To turn the correction off, pass correct = FALSE to prop.test(); the default is TRUE.

(Setting this option to FALSE makes the test mathematically equivalent to the uncorrected two-proportions z-test.)
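
The equivalence can be checked numerically. In this sketch, the chi-squared statistic reported with correct = FALSE is compared with the square of a hand-computed pooled z-statistic for the smoking example; the names res_uncorrected and z are illustrative.

```r
x <- c(450, 400)
n <- c(500, 500)

# Test without the Yates continuity correction
res_uncorrected <- prop.test(x = x, n = n, correct = FALSE)

# Hand-computed pooled z-statistic for comparison
p_pool <- sum(x) / sum(n)
z <- (x[1] / n[1] - x[2] / n[2]) /
  sqrt(p_pool * (1 - p_pool) * sum(1 / n))

print(unname(res_uncorrected$statistic))  # X-squared, about 19.61
print(z^2)                                # the same value
```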

To test whether the proportions of smokers in the two groups are the same, run:

res <- prop.test(x = c(450, 400), n = c(500, 500))
res
2-sample test for equality of proportions with continuity correction
data:  c(450, 400) out of c(500, 500)
X-squared = 18.831, df = 1, p-value = 1.428e-05
alternative hypothesis: two.sided
95 percent confidence interval:
 0.05417387 0.14582613
sample estimates:
prop 1 prop 2
   0.9    0.8

The function returns:

the value of Pearson’s chi-squared test statistic

a p-value and a 95 percent confidence interval for the difference in proportions

the estimated probability of success in each group (here, the proportion of smokers)
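
The object returned by prop.test() is a list of class "htest", so these pieces can be pulled out individually. A brief sketch using the same counts:

```r
res <- prop.test(x = c(450, 400), n = c(500, 500))

res$statistic   # Pearson's chi-squared statistic (X-squared = 18.831)
res$p.value     # p-value of the test (1.428e-05)
res$conf.int    # 95 percent confidence interval for pA - pB
res$estimate    # estimated proportions: prop 1 = 0.9, prop 2 = 0.8
```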

Take note of the following:

Type this to see if the observed proportion of smokers in group A (pA) is less than the observed proportion of smokers in group B (pB).

prop.test(x = c(450, 400), n = c(500, 500), alternative = "less")

Alternatively, type this to see if the observed proportion of smokers in group A (pA) is greater than the observed proportion of smokers in group B (pB).

prop.test(x = c(450, 400), n = c(500, 500), alternative = "greater")
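
To see how the choice of alternative changes the result, the p-values of the three tests can be placed side by side. This is a small sketch; the object names are illustrative.

```r
x <- c(450, 400)
n <- c(500, 500)

two_sided <- prop.test(x, n)                           # Ha: pA != pB
less      <- prop.test(x, n, alternative = "less")     # Ha: pA <  pB
greater   <- prop.test(x, n, alternative = "greater")  # Ha: pA >  pB

p_values <- c(two.sided = two_sided$p.value,
              less      = less$p.value,
              greater   = greater$p.value)
print(signif(p_values, 4))
```

Because the observed pA (0.9) is above pB (0.8), the "greater" test gives a very small p-value (half of the two-sided one), while the "less" test gives a p-value close to 1.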

Interpretation of the result

The p-value of the test is 1.428e-05, which is below the significance level alpha = 0.05. We therefore reject the null hypothesis and conclude that the proportions of smokers in the two groups are significantly different.

Copyright © 2022 Data Science Tutorials.
