How to Perform Bootstrapping in R

Bootstrapping is a method for estimating the standard error of a statistic and for constructing a confidence interval for that statistic.

The basic bootstrapping procedure is as follows:

Draw k samples with replacement from the dataset, each the same size as the original.

Calculate the statistic of interest for each sample.

This yields k different estimates of the statistic, which you can then use to calculate the statistic’s standard error and create a confidence interval.
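The procedure above can be sketched by hand in base R before turning to any package. This is only an illustration: the statistic (the mean of mtcars$mpg), the seed, and k = 2000 are assumptions chosen for the example, and the interval shown is the simple percentile interval.

```r
# Manual bootstrap of the mean of mtcars$mpg (illustrative choice of statistic)
set.seed(1)                 # arbitrary seed, for reproducibility only
x <- mtcars$mpg
k <- 2000                   # number of bootstrap samples (assumed value)

# Draw k samples with replacement and compute the statistic on each one
boot_means <- replicate(k, mean(sample(x, size = length(x), replace = TRUE)))

# Standard error: the standard deviation of the k bootstrap estimates
boot_se <- sd(boot_means)

# Simple percentile 95% confidence interval
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))

boot_se
boot_ci
```

The boot package wraps these same steps and adds further interval types.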

We can perform bootstrapping in R with the following functions from the boot package:

1. Generate bootstrap samples.

boot(data, statistic, R, …)

where:

data: A vector, matrix, or data frame

statistic: A function that produces the statistic(s) to be bootstrapped

R: Number of bootstrap replicates
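As a minimal sketch of what the statistic argument looks like: with boot()'s default settings, the function receives the original data and a vector of resampled indices, and it should compute the statistic on the indexed data. The name median_fun, the seed, and R = 1000 are assumptions made for illustration.

```r
library(boot)
set.seed(1)   # arbitrary seed, for reproducibility only

# Statistic function: first argument is the data, second is the index vector
# that boot() generates for each bootstrap sample
median_fun <- function(data, indices) {
  median(data[indices])
}

med_boot <- boot(data = mtcars$mpg, statistic = median_fun, R = 1000)

med_boot$t0       # statistic computed on the original data
dim(med_boot$t)   # matrix of bootstrap replicates: R rows, one column per statistic
```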

2. Create a confidence interval using the bootstrap method.

boot.ci(bootobject, conf, type)

where:

bootobject: An object returned by the boot() function

conf: The confidence level of the interval. The default value is 0.95.

type: The type of confidence interval to compute. Options are “norm”, “basic”, “stud”, “perc”, “bca”, and “all”; the default is “all”.
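A short sketch of how conf and type combine: type also accepts a vector of interval types, and the returned object stores each interval in its own component. The helper mean_fun, the seed, R = 1000, and the 90% level are assumptions chosen for the example.

```r
library(boot)
set.seed(1)   # arbitrary seed, for reproducibility only

# Hypothetical statistic: the mean of the resampled values
mean_fun <- function(data, indices) mean(data[indices])
mean_boot <- boot(data = mtcars$mpg, statistic = mean_fun, R = 1000)

# Request two interval types at the 90% confidence level
ci <- boot.ci(mean_boot, conf = 0.90, type = c("norm", "perc"))

ci$normal[2:3]    # lower and upper endpoints of the normal-approximation interval
ci$percent[4:5]   # lower and upper endpoints of the percentile interval
```

Interval types that were not requested (for example ci$bca here) are simply absent from the result.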

The examples below demonstrate how to use these functions in practice.

Bootstrapping a Single Statistic

The code below demonstrates how to compute the standard error for the R-squared of a simple linear regression model:

set.seed(123)
library(boot)

Now we can define a function that calculates R-squared:

rsq_function <- function(formula, data, indices) {
  d <- data[indices, ]           # rows selected by boot() for this bootstrap sample
  fit <- lm(formula, data = d)   # refit the model on the resampled data
  return(summary(fit)$r.squared)
}

Let’s perform bootstrapping with 3000 replications:

reps <- boot(data=mtcars, statistic=rsq_function, R=3000, formula=mpg~disp)

Now we can view the results of the bootstrap:

reps

ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = mtcars, statistic = rsq_function, R = 3000, formula = mpg ~
    disp)
Bootstrap Statistics :
     original      bias    std. error
t1* 0.7183433 0.003027851  0.06410851

We can see from the results:

This regression model’s estimated R-squared is 0.7183433.

This estimate has a standard error of 0.06410851.
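To see where the bias and std. error columns come from, the sketch below recomputes them from the replicates stored in the boot object. It repeats the setup so that it runs on its own; the names se_by_hand and bias_by_hand are introduced here purely for illustration.

```r
library(boot)
set.seed(123)

rsq_function <- function(formula, data, indices) {
  d <- data[indices, ]
  fit <- lm(formula, data = d)
  summary(fit)$r.squared
}

reps <- boot(data = mtcars, statistic = rsq_function, R = 3000, formula = mpg ~ disp)

# reps$t holds the 3000 bootstrapped R-squared values; reps$t0 is the original estimate
se_by_hand   <- sd(reps$t[, 1])                 # corresponds to the "std. error" column
bias_by_hand <- mean(reps$t[, 1]) - reps$t0     # corresponds to the "bias" column

se_by_hand
bias_by_hand
```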

We can also quickly plot the distribution of the bootstrapped R-squared estimates:

plot(reps)

We can also use the following code to compute the 95% confidence interval for the model’s estimated R-squared:

Adjusted bootstrap percentile (BCa) interval calculation

boot.ci(reps, type="bca")

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 3000 bootstrap replicates
CALL :
boot.ci(boot.out = reps, type = "bca")
Intervals :
Level       BCa         
95%   ( 0.5474,  0.8160 )

We can see from the output that the 95% bootstrapped confidence interval for the true R-squared value is (0.5474, 0.8160).
