Data Science Tutorials

For Data Science Learners

Two Sample Proportions test in R-Complete Guide

Posted on May 28 / May 27 By Admin

The two-proportions z-test is used to compare two observed proportions.

This article explains the fundamentals of the two-proportions z-test and gives practical examples using R.

We have two groups of people, for example:

Group A (people with lung cancer): n = 500

Group B (healthy people): n = 500

The number of smokers in each group is as follows:

Group A (lung cancer): n = 500, 450 smokers, pA = 450/500 = 0.9

Group B (healthy): n = 500, 400 smokers, pB = 400/500 = 0.8

The overall proportion of smokers is p = (450 + 400) / (500 + 500) = 0.85

The overall proportion of non-smokers is q = 1 − p = 0.15
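As a quick check, the proportions above can be reproduced in R. This is a minimal sketch; the variable names (x_a, n_a, and so on) are illustrative and do not come from the original example.

```r
# Counts from the example: number of smokers and group sizes
x_a <- 450; n_a <- 500   # Group A (lung cancer)
x_b <- 400; n_b <- 500   # Group B (healthy)

p_a <- x_a / n_a                   # proportion of smokers in Group A
p_b <- x_b / n_b                   # proportion of smokers in Group B
p   <- (x_a + x_b) / (n_a + n_b)   # overall (pooled) proportion of smokers
q   <- 1 - p                       # overall proportion of non-smokers

c(pA = p_a, pB = p_b, p = p, q = q)
```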

We’d like to know if the proportions of smokers in the two categories of people are the same.

The following are examples of typical research questions:

  1. Is the proportion of smokers in group A (pA) the same as the proportion of smokers in group B (pB)?
  2. Is the proportion of smokers in group A (pA) lower than the proportion of smokers in group B (pB)?
  3. Is the proportion of smokers in group A (pA) higher than the proportion of smokers in group B (pB)?

In statistics, the corresponding null hypotheses (H0) are defined as follows:

  1. H0: pA = pB
  2. H0: pA ≥ pB
  3. H0: pA ≤ pB

The corresponding alternative hypotheses (Ha) are:

  1. Ha: pA ≠ pB (different)
  2. Ha: pA < pB (less)
  3. Ha: pA > pB (greater)

Note that:

A two-tailed test is used to test hypothesis 1.

One-tailed tests are used to test hypotheses 2 and 3.

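The test statistic is z = (pA − pB) / sqrt(p × q × (1/nA + 1/nB)), where p and q are the overall proportions of smokers and non-smokers and nA and nB are the two group sizes.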

If |z| is less than 1.96, the difference is not significant at 5%.

If |z| is greater than or equal to 1.96, the difference is significant at 5%.
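The decision rule above can be sketched in R with the standard pooled two-proportion z statistic, z = (pA − pB) / sqrt(p × q × (1/nA + 1/nB)). The variable names are illustrative, and the 1.96 cutoff applies to a two-sided test at the 5% level.

```r
# Example counts: smokers (x) out of group sizes (n) for groups A and B
x <- c(450, 400)
n <- c(500, 500)

p_hat <- x / n               # observed proportions pA and pB
p     <- sum(x) / sum(n)     # overall proportion of smokers
q     <- 1 - p               # overall proportion of non-smokers

# Standard error of the difference under H0 (pooled estimate)
se <- sqrt(p * q * (1 / n[1] + 1 / n[2]))

z <- (p_hat[1] - p_hat[2]) / se
z                            # about 4.43 for these counts

# Two-sided decision at the 5% level
significant <- abs(z) >= 1.96
significant
```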

The z-table contains the corresponding significance level (p-value) for the z-statistic. We’ll look at how to do it in R.
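Instead of a printed z-table, the p-value can be obtained from the standard normal distribution with pnorm(). The sketch below assumes the pooled z statistic without continuity correction, so its p-values will differ slightly from the default prop.test() output; the variable names are illustrative.

```r
# Example counts: smokers (x) out of group sizes (n) for groups A and B
x <- c(450, 400)
n <- c(500, 500)

p <- sum(x) / sum(n)   # overall proportion of smokers
z <- (x[1] / n[1] - x[2] / n[2]) /
  sqrt(p * (1 - p) * (1 / n[1] + 1 / n[2]))

# p-values from the standard normal distribution
p_two_sided <- 2 * pnorm(-abs(z))            # Ha: pA != pB
p_greater   <- pnorm(z, lower.tail = FALSE)  # Ha: pA > pB
p_less      <- pnorm(z)                      # Ha: pA < pB

c(two.sided = p_two_sided, greater = p_greater, less = p_less)
```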

Two Sample Proportions test in R

R functions: prop.test()

prop.test(x, n, p = NULL, alternative = "two.sided", correct = TRUE)

x: a vector of counts of successes

n: a vector of counts of trials

alternative: a character string specifying the alternative hypothesis; one of "two.sided" (the default), "less", or "greater"

correct: a logical indication of whether or not Yates’ continuity correction should be used when it is possible

It’s worth noting that prop.test() applies the Yates continuity correction by default. The correction matters most when the expected number of successes or failures in either group is less than 5.

If you don’t want the correction, use the prop.test() function’s additional argument correct = FALSE. TRUE is the default value.

(Setting this option to FALSE makes the test mathematically equivalent to the uncorrected two-proportions z-test: the reported X-squared statistic is then equal to z squared.)
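The relationship between the uncorrected test and the z statistic can be checked directly. This is a minimal sketch on the example counts; res_uncorrected and z are illustrative names.

```r
# Example counts: smokers (x) out of group sizes (n) for groups A and B
x <- c(450, 400)
n <- c(500, 500)

# prop.test() without Yates' continuity correction
res_uncorrected <- prop.test(x = x, n = n, correct = FALSE)

# Pooled z statistic computed by hand
p <- sum(x) / sum(n)
z <- (x[1] / n[1] - x[2] / n[2]) /
  sqrt(p * (1 - p) * (1 / n[1] + 1 / n[2]))

# The chi-squared statistic should match z squared
chi_sq <- unname(res_uncorrected$statistic)
all.equal(chi_sq, z^2)
```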

We’d like to know if the proportions of smokers in the two categories of people are the same.

res <- prop.test(x = c(450, 400), n = c(500, 500))
res
2-sample test for equality of proportions with continuity correction
data:  c(450, 400) out of c(500, 500)
X-squared = 18.831, df = 1, p-value = 1.428e-05
alternative hypothesis: two.sided
95 percent confidence interval:
 0.05417387 0.14582613
sample estimates:
prop 1 prop 2
   0.9    0.8

The following is what the function returns:

the value of Pearson’s chi-squared test statistic

a p-value

a 95 percent confidence interval for the difference in proportions

the estimated probabilities of success (the proportions of smokers in the two groups)
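The object returned by prop.test() is a list of class "htest", so each of these values can be extracted by name. The sketch below uses the same call as above.

```r
res <- prop.test(x = c(450, 400), n = c(500, 500))

res$statistic   # Pearson's chi-squared test statistic
res$parameter   # degrees of freedom
res$p.value     # p-value of the test
res$conf.int    # confidence interval for pA - pB
res$estimate    # estimated proportions in the two groups
```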

Take note of the following:

Type this to see if the observed proportion of smokers in group A (pA) is less than the observed proportion of smokers in group B (pB).

prop.test(x = c(450, 400), n = c(500, 500), alternative = "less")

Alternatively, type this to see if the observed proportion of smokers in group A (pA) is greater than the observed proportion of smokers in group B (pB).

prop.test(x = c(450, 400), n = c(500, 500), alternative = "greater")
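To see how the choice of alternative affects the result, the following sketch runs all three tests on the example counts and collects the p-values. The vector name p_values is illustrative.

```r
# Example counts: smokers (x) out of group sizes (n) for groups A and B
x <- c(450, 400)
n <- c(500, 500)

# p-value for each alternative hypothesis
p_values <- c(
  two.sided = prop.test(x, n, alternative = "two.sided")$p.value,
  less      = prop.test(x, n, alternative = "less")$p.value,
  greater   = prop.test(x, n, alternative = "greater")$p.value
)
p_values
```

Because the observed pA (0.9) is larger than pB (0.8), the "greater" p-value is half the two-sided one, while the "less" p-value is close to 1.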

The result’s interpretation

The p-value of the two-sided test is 1.428e-05, which is less than the significance level alpha = 0.05. We therefore reject the null hypothesis and conclude that the proportions of smokers in the two groups are significantly different.
