How to compare variances in R

The F-test is used to assess whether two populations (A and B) have the same variance.

When should the F-test be used?

Comparing two variances is useful in a variety of situations, including:

Before running a two-sample t-test, when you want to check that the two samples have equal variances.
When comparing the variability of a new measurement method with that of an older one: does the new method reduce the variability of the measurement?

Research questions and statistical hypotheses

Typical research questions are:

1. Is the variance of group A (σ²_A) equal to the variance of group B (σ²_B)?
2. Is the variance of group A (σ²_A) greater than the variance of group B (σ²_B)?
3. Is the variance of group A (σ²_A) less than the variance of group B (σ²_B)?

The corresponding null hypotheses (H0) are:

H0: σ²_A = σ²_B
H0: σ²_A ≤ σ²_B
H0: σ²_A ≥ σ²_B

The corresponding alternative hypotheses (Ha) are:

Ha: σ²_A ≠ σ²_B (different)
Ha: σ²_A > σ²_B (greater)
Ha: σ²_A < σ²_B (less)

Note that:

Hypothesis 1 is tested with a two-tailed test.

Hypotheses 2 and 3 are tested with one-tailed tests.
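
In R, these three cases correspond to the alternative argument of var.test(), which is described further down. The sketch below assumes the built-in ToothGrowth data set, with OJ playing the role of group A and VC the role of group B. It runs all three tests and checks that the two-tailed p-value is twice the smaller of the two one-tailed p-values:

```r
# Assumption: group A = "OJ" and group B = "VC", the order of the factor levels of supp.
p_two_sided <- var.test(len ~ supp, data = ToothGrowth, alternative = "two.sided")$p.value
p_greater   <- var.test(len ~ supp, data = ToothGrowth, alternative = "greater")$p.value
p_less      <- var.test(len ~ supp, data = ToothGrowth, alternative = "less")$p.value

# Show the three p-values side by side.
print(c(two.sided = p_two_sided, greater = p_greater, less = p_less))

# The two-tailed p-value is twice the smaller one-tailed p-value.
print(all.equal(p_two_sided, 2 * min(p_greater, p_less)))
```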

The F-test requires both samples to come from normally distributed populations.

To compare two variances, use the R function var.test() as follows:

Method 1

var.test(values ~ groups, data,
         alternative = "two.sided")

Method 2

var.test(x, y, alternative = "two.sided")

x,y: numeric vectors

alternative: the alternative hypothesis. Allowed values are “two.sided” (default), “greater” or “less”.
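
As a rough illustration, the two calling styles should give the same result when they describe the same data. The sketch below uses simulated data; the data frame sim, its columns values and groups, and the group labels A and B are made up for this example:

```r
# Simulated example data (hypothetical): two groups with different spreads.
set.seed(42)
sim <- data.frame(
  values = c(rnorm(20, mean = 10, sd = 2), rnorm(25, mean = 10, sd = 3)),
  groups = factor(rep(c("A", "B"), times = c(20, 25)))
)

# Method 1: formula interface, with the data stored in a data frame.
res_formula <- var.test(values ~ groups, data = sim, alternative = "two.sided")

# Method 2: two numeric vectors, one per group.
x <- sim$values[sim$groups == "A"]
y <- sim$values[sim$groups == "B"]
res_vectors <- var.test(x, y, alternative = "two.sided")

# Both calls test var(A) / var(B), so statistic and p-value should agree.
print(all.equal(unname(res_formula$statistic), unname(res_vectors$statistic)))
print(all.equal(res_formula$p.value, res_vectors$p.value))
```

With the formula interface, the first level of the grouping factor ends up in the numerator of the variance ratio, which matters when choosing between “greater” and “less”.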

As an example, we use R's built-in ToothGrowth data set:

data <- ToothGrowth

To get a sense of how the data looks, we use the sample_n() function in the dplyr package to display a random sample of 10 rows.

library("dplyr")
sample_n(data, 10)

    len supp dose
1  25.5   VC  2.0
2  14.5   VC  1.0
3  14.5   OJ  1.0
4   9.7   OJ  0.5
5  16.5   VC  1.0
6  27.3   OJ  2.0
7   9.4   OJ  0.5
8  22.5   VC  1.0
9  11.2   VC  0.5
10  8.2   OJ  0.5
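
In dplyr 1.0.0 and later, sample_n() is superseded by slice_sample(). The sketch below shows two other ways to draw the same kind of preview. The second one assumes dplyr is installed, and the set.seed() call is an addition to make the draw reproducible:

```r
data <- ToothGrowth

# Fix the random seed so the preview is reproducible (not part of the original code).
set.seed(123)

# Base R: draw 10 row indices without replacement and subset the data frame.
preview_base <- data[sample(nrow(data), 10), ]
print(preview_base)

# dplyr >= 1.0.0: slice_sample() is the successor of sample_n().
if (requireNamespace("dplyr", quietly = TRUE)) {
  preview_dplyr <- dplyr::slice_sample(data, n = 10)
  print(preview_dplyr)
}
```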

We want to test whether tooth length (“len”) has the same variance in the two groups of the column “supp”, OJ and VC.

Preliminary test to check the F-test assumptions

The F-test is very sensitive to departures from the normality assumption. Before applying the F-test, make sure the data in each group are normally distributed.

To check whether the normality assumption holds, apply the Shapiro-Wilk test to each group. A Q-Q plot (quantile-quantile plot) can also be used to assess normality visually.

A Q-Q plot displays the sample quantiles against the theoretical quantiles of a normal distribution; points that fall close to a straight line suggest that the data are approximately normal.

If you are unsure whether your data are normally distributed, use Levene's test or the Fligner-Killeen test instead, which are less sensitive to departures from normality.
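
These checks could be run on the ToothGrowth data roughly as follows. This is a sketch, not a complete diagnostic workflow. shapiro.test(), qqnorm(), qqline() and fligner.test() ship with base R; Levene's test is taken from the car package here, which is assumed to be installed and is skipped otherwise:

```r
data <- ToothGrowth

# Shapiro-Wilk test per group: a small p-value suggests a departure from normality.
shapiro_by_group <- tapply(data$len, data$supp,
                           function(x) shapiro.test(x)$p.value)
print(shapiro_by_group)

# Q-Q plot per group: points close to the reference line suggest normality.
op <- par(mfrow = c(1, 2))
for (g in levels(data$supp)) {
  x <- data$len[data$supp == g]
  qqnorm(x, main = paste("Normal Q-Q plot:", g))
  qqline(x)
}
par(op)

# Fligner-Killeen test of equal variances, less sensitive to non-normality.
res_fligner <- fligner.test(len ~ supp, data = data)
print(res_fligner)

# Levene's test, only if the car package is available (assumption).
if (requireNamespace("car", quietly = TRUE)) {
  print(car::leveneTest(len ~ supp, data = data))
}
```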

Compute F-test

res.ftest <- var.test(len ~ supp, data = data)
res.ftest

F test to compare two variances

data:  len by supp
F = 0.6386, num df = 29, denom df = 29, p-value = 0.2331
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.3039488 1.3416857
sample estimates:
ratio of variances
         0.6385951

The p-value of the F-test is 0.2331, which is greater than the significance level of 0.05. We therefore cannot reject the null hypothesis: there is no significant difference between the two variances.
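
The numbers in this output can be reproduced by hand, which shows what the test computes. The sketch below assumes the usual definition of the statistic: the ratio of the two sample variances, compared against an F distribution with n_A - 1 and n_B - 1 degrees of freedom. The variable names are illustrative:

```r
# Split tooth length by supplement group.
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# F statistic: ratio of the sample variances (OJ in the numerator).
f_stat <- var(oj) / var(vc)
df1 <- length(oj) - 1
df2 <- length(vc) - 1

# Two-sided p-value: twice the smaller tail probability.
p_lower <- pf(f_stat, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

# 95 percent confidence interval for the ratio of variances.
ci <- c(f_stat / qf(0.975, df1, df2), f_stat / qf(0.025, df1, df2))

# Compare with the components stored in the var.test() result.
res <- var.test(len ~ supp, data = ToothGrowth)
print(all.equal(f_stat, unname(res$statistic)))
print(all.equal(p_value, res$p.value))
print(all.equal(ci, as.numeric(res$conf.int)))
```

The same components (statistic, p.value, conf.int, estimate) can be read directly from res.ftest when the values are needed in later code.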
