Data Science Tutorials

How to Standardize Data in R?

Posted on July 28 / July 27 by Admin

Standardization means scaling a dataset so that its mean is 0 and its standard deviation is 1.

The most commonly used method is z-score standardization, which scales each value using the following formula:

(xi – xbar) / s

where:

xi: The ith value in the dataset

xbar: The sample mean

s: The sample standard deviation
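
As a quick check of the formula, the sketch below applies it by hand to a small vector and compares the result with base R's scale(). The values in x are made up for illustration only and are not part of this tutorial's data.

```r
# Hypothetical example values, chosen only for illustration
x <- c(2, 4, 6, 8, 10)

# Apply the z-score formula directly: (xi - xbar) / s
z_manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix, so convert it to a plain vector
z_scale <- as.vector(scale(x))

# The two approaches should agree up to floating-point error
all.equal(z_manual, z_scale)
```

Both sd() and scale() use the sample standard deviation (dividing by n - 1), which is why the two results match.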

The examples below show how to scale one or more variables in a data frame with z-score standardization, using the scale() function together with the dplyr package.

Standardize just one variable

In a data frame containing three variables, the following code demonstrates how to scale just one of the variables.

library(dplyr)

# make this example reproducible
set.seed(123)

# create the original data frame
df <- data.frame(var1 = runif(10, 0, 50),
                 var2 = runif(10, 2, 20),
                 var3 = runif(10, 5, 30))

# view the original data frame
df
        var1      var2      var3
1  14.378876 19.223000 27.238483
2  39.415257 10.160015 22.320085
3  20.448846 14.196271 21.012670
4  44.150870 12.307401 29.856744
5  47.023364  3.852644 21.392645
6   2.277825 18.196849 22.713262
7  26.405274  6.429579 18.601651
8  44.620952  2.757072 19.853551
9  27.571751  7.902573 12.228993
10 22.830737 19.181066  8.677841

# scale var1 to have mean = 0 and standard deviation = 1

df2 <- df %>% mutate_at(c('var1'), ~(scale(.) %>% as.vector))
df2
         var1      var2      var3
1  -0.98619132 19.223000 27.238483
2   0.71268801 10.160015 22.320085
3  -0.57430484 14.196271 21.012670
4   1.03402981 12.307401 29.856744
5   1.22894699  3.852644 21.392645
6  -1.80732540 18.196849 22.713262
7  -0.17012290  6.429579 18.601651
8   1.06592790  2.757072 19.853551
9  -0.09096999  7.902573 12.228993
10 -0.41267825 19.181066  8.677841

Only the first variable was scaled; the other two variables are unchanged.

We can quickly confirm that the new scaled variable has a mean of 0 and a standard deviation of 1.

# compute the mean of the scaled variable
mean(df2$var1)
[1] 2.638406e-17  # basically zero

# calculate the standard deviation of the scaled variable

sd(df2$var1)
[1] 1
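
The code above pipes the result of scale() into as.vector. A minimal sketch of why that step helps, assuming the same seed and range as var1: scale() returns a one-column matrix that carries the centering and scaling values as attributes, and as.vector() strips these so the column is stored as a plain numeric vector.

```r
set.seed(123)
x <- runif(10, 0, 50)  # same draw as var1 in the example data frame

scaled <- scale(x)

# scale() returns a matrix with the centering and scaling values as attributes
is.matrix(scaled)
attr(scaled, "scaled:center")  # the mean that was subtracted
attr(scaled, "scaled:scale")   # the standard deviation that was divided by

# as.vector() drops the matrix shape and attributes
z <- as.vector(scaled)
is.matrix(z)
```

The stored attributes can be handy if you later need to transform values back to the original scale.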

Standardize Multiple Variables

The following code scales several variables in a data frame at once:

# scale var1 and var2 to have mean = 0 and standard deviation = 1

df3 <- df %>% mutate_at(c('var1', 'var2'), ~(scale(.) %>% as.vector))
df3
       var1       var2      var3
1  -0.98619132  1.2570692 27.238483
2   0.71268801 -0.2031057 22.320085
3  -0.57430484  0.4471923 21.012670
4   1.03402981  0.1428686 29.856744
5   1.22894699 -1.2193121 21.392645
6  -1.80732540  1.0917418 22.713262
7  -0.17012290 -0.8041315 18.601651
8   1.06592790 -1.3958243 19.853551
9  -0.09096999 -0.5668114 12.228993
10 -0.41267825  1.2503130  8.677841
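
Since dplyr 1.0.0, mutate_at() and mutate_all() are superseded by across(). The sketch below is one possible equivalent of the step above written with across(); the name df3_across is chosen here for illustration and is not part of the original tutorial.

```r
library(dplyr)

set.seed(123)
df <- data.frame(var1 = runif(10, 0, 50),
                 var2 = runif(10, 2, 20),
                 var3 = runif(10, 5, 30))

# Same idea as mutate_at(), written with across() (requires dplyr >= 1.0.0)
df3_across <- df %>%
  mutate(across(c(var1, var2), ~ as.vector(scale(.x))))

# var1 and var2 are standardized; var3 is left untouched
sapply(df3_across[c("var1", "var2")], mean)
sapply(df3_across[c("var1", "var2")], sd)
```

The older scoped verbs still work, so switching is mostly a matter of keeping code in line with current dplyr style.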

Standardize All Variables

Using the mutate_all function, the following code demonstrates how to scale each variable in a data frame.

# scale all variables to have mean = 0 and standard deviation = 1

df4 <- df %>% mutate_all(~(scale(.) %>% as.vector))
df4
        var1       var2        var3
1  -0.98619132  1.2570692  1.09158171
2   0.71268801 -0.2031057  0.30768348
3  -0.57430484  0.4471923  0.09930665
4   1.03402981  0.1428686  1.50888235
5   1.22894699 -1.2193121  0.15986731
6  -1.80732540  1.0917418  0.37034828
7  -0.17012290 -0.8041315 -0.28496363
8   1.06592790 -1.3958243 -0.08543481
9  -0.09096999 -0.5668114 -1.30064291
10 -0.41267825  1.2503130 -1.86662844
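
When every column is numeric, a base R alternative is to pass the whole data frame to scale() and convert the resulting matrix back to a data frame. The sketch below assumes the same example data and also checks the mean and standard deviation of every column; df4_base is a name introduced here for illustration.

```r
set.seed(123)
df <- data.frame(var1 = runif(10, 0, 50),
                 var2 = runif(10, 2, 20),
                 var3 = runif(10, 5, 30))

# scale() accepts an all-numeric data frame and returns a matrix
df4_base <- as.data.frame(scale(df))

# Each column should have a mean of (approximately) 0 and a standard deviation of 1
round(colMeans(df4_base), 10)
sapply(df4_base, sd)
```

This approach would fail on data frames that contain non-numeric columns, which is one reason to prefer the dplyr version when column types are mixed.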
