# How to Perform Bootstrapping in R

Bootstrapping is a method for estimating the standard error of any statistic and generating a confidence interval for that statistic.

The basic bootstrapping procedure is as follows:

1. Draw k samples from the dataset with replacement, each the same size as the original data.

2. Calculate the statistic of interest for each sample.

This yields k different estimates of the statistic, which you can then use to calculate its standard error and construct a confidence interval.
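The procedure above can be sketched in base R without any package. This is only a minimal illustration: the statistic (the sample mean of `mtcars$mpg`), the seed, and k = 2000 are choices made here for the example, not part of the original text.

```r
# Manual bootstrap of the sample mean, using base R only.
set.seed(1)          # arbitrary seed, for reproducibility
x <- mtcars$mpg      # built-in dataset, used as example data
k <- 2000            # number of bootstrap samples (illustrative choice)

# Draw k samples with replacement and compute the statistic for each one
boot_means <- replicate(k, mean(sample(x, size = length(x), replace = TRUE)))

# Summarize the k estimates
boot_se <- sd(boot_means)                          # bootstrap standard error
boot_ci <- quantile(boot_means, c(0.025, 0.975))   # simple percentile interval

boot_se
boot_ci
```

The boot package automates these same steps and offers several interval types beyond the simple percentile interval used here.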

We can perform bootstrapping in R using the following functions from the boot package:

## 1. Generate bootstrap samples.

boot(data, statistic, R, …)

where:

data: A vector, matrix, or data frame

statistic: A function that produces the statistic(s) to be bootstrapped

R: Number of bootstrap replicates
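As a small sketch of how these arguments fit together: for the ordinary nonparametric bootstrap, the statistic function receives the data as its first argument and a vector of resampled indices as its second. The function name `median_fn` and the choice of `mtcars$mpg` are assumptions made for illustration.

```r
library(boot)
set.seed(1)  # arbitrary seed, for reproducibility

# Hypothetical statistic function: the median of the resampled values.
# boot() supplies `indices`; we only use it to subset the data.
median_fn <- function(data, indices) {
  median(data[indices])
}

b <- boot(data = mtcars$mpg, statistic = median_fn, R = 1000)

b$t0      # statistic computed on the original data
dim(b$t)  # one row per bootstrap replicate, one column per statistic
```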

## 2. Create a confidence interval using the bootstrap method.

boot.ci(bootobject, conf, type)

where:

bootobject: An object returned by the boot() function

conf: The confidence level of the interval to be computed. The default value is 0.95.

type: The type of confidence interval to compute. Options are “norm”, “basic”, “stud”, “perc”, “bca”, and “all”. The default is “all”.
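To illustrate the conf and type arguments, the sketch below requests two interval types at the 90% level. The statistic function `mean_fn` and the data are assumptions chosen for the example.

```r
library(boot)
set.seed(1)  # arbitrary seed, for reproducibility

# Hypothetical statistic function: the mean of the resampled values
mean_fn <- function(data, indices) mean(data[indices])

b <- boot(data = mtcars$mpg, statistic = mean_fn, R = 1000)

# `type` accepts a character vector, so several intervals can be requested at once
ci <- boot.ci(b, conf = 0.90, type = c("norm", "perc"))

ci$normal   # columns: confidence level, lower bound, upper bound
ci$percent  # columns: confidence level, two order-statistic positions, lower, upper
```

Note that the studentized interval (“stud”) needs bootstrap variance estimates, so requesting “all” without supplying them will generally produce a warning and omit that interval.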

The examples below demonstrate how to use these functions in practice.


**Bootstrapping a Single Statistic**

The code below demonstrates how to compute the standard error for the R-squared of a simple linear regression model:

set.seed(123)
library(boot)

Next, define a function that calculates R-squared for a resampled dataset:

rsq_function <- function(formula, data, indices) {
  d <- data[indices, ]  # rows selected by boot() for this replicate
  fit <- lm(formula, data = d)
  return(summary(fit)$r.squared)
}

Now perform bootstrapping with 3000 replications:

reps <- boot(data=mtcars, statistic=rsq_function, R=3000, formula=mpg~disp)

We can now view the results of the bootstrap:


reps

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = mtcars, statistic = rsq_function, R = 3000, formula = mpg ~ disp)

Bootstrap Statistics :
     original       bias    std. error
t1* 0.7183433  0.003027851  0.06410851

We can see from the results:

This regression model’s estimated R-squared is 0.7183433.

This estimate has a standard error of 0.06410851.
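The three columns of the printed output can also be recovered from the returned object, which helps when the numbers are needed in later code. The sketch below repeats the setup so that it runs on its own; it assumes the bias and standard error are the usual summaries of the replicates stored in `reps$t`.

```r
library(boot)
set.seed(123)

rsq_function <- function(formula, data, indices) {
  d <- data[indices, ]
  fit <- lm(formula, data = d)
  summary(fit)$r.squared
}

reps <- boot(data = mtcars, statistic = rsq_function, R = 3000, formula = mpg ~ disp)

reps$t0                  # "original": R-squared on the full dataset
mean(reps$t) - reps$t0   # "bias": mean of the replicates minus the original
sd(reps$t)               # "std. error": standard deviation of the replicates
```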

We can also quickly view the distribution of the bootstrapped estimates:


plot(reps)

We can also compute a 95% confidence interval for the model’s R-squared. The code below requests the adjusted bootstrap percentile (BCa) interval:

boot.ci(reps, type="bca")

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 3000 bootstrap replicates

CALL :
boot.ci(boot.out = reps, type = "bca")

Intervals :
Level       BCa
95%   ( 0.5474,  0.8160 )

We can see from the output that the 95% bootstrapped confidence interval for the true R-squared value is (0.5474, 0.8160).
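The interval bounds can also be extracted from the object returned by boot.ci() rather than read off the printout. The sketch below repeats the setup so it is self-contained and, as an addition not in the original example, also requests the plain percentile interval for comparison.

```r
library(boot)
set.seed(123)

rsq_function <- function(formula, data, indices) {
  fit <- lm(formula, data = data[indices, ])
  summary(fit)$r.squared
}

reps <- boot(data = mtcars, statistic = rsq_function, R = 3000, formula = mpg ~ disp)

ci <- boot.ci(reps, conf = 0.95, type = c("perc", "bca"))

# The last two columns of each component hold the lower and upper bounds
bca_bounds  <- ci$bca[1, 4:5]
perc_bounds <- ci$percent[1, 4:5]

bca_bounds
perc_bounds
```

The two intervals generally differ somewhat, because the BCa method adjusts the percentile positions for bias and skewness in the bootstrap distribution.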