How to Scale Only Numeric Columns in R
To scale only the numeric columns of a data frame in R, use the following dplyr syntax.
library(dplyr)
df %>% mutate(across(where(is.numeric), scale))
The example below shows how to use this syntax in practice.
Example: Use dplyr to Scale Only Numeric Columns
Suppose we have the following R data frame, which contains information about several basketball players.
# Create the data frame
df <- data.frame(Team=c('P1', 'P2', 'P3', 'P4', 'P5'),
                 points=c(2, 3, 7, 22, 8),
                 value=c(27, 39, 49, 82, 54))

# View the data frame
df

  Team points value
1   P1      2    27
2   P2      3    39
3   P3      7    49
4   P4     22    82
5   P5      8    54
R’s scale() function uses the following basic syntax:
scale(x, center = TRUE, scale = TRUE)
x: Name of the object to scale
center: Whether to subtract the mean before scaling. The default is TRUE.
scale: Whether to divide by the standard deviation after centering. The default is TRUE.
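To see what the two arguments do in isolation, here is a small sketch that applies scale() to the points values from the example data frame. The variable names (x, centered, standardized) are illustrative and not part of the original example.

```r
# Illustrative vector: the points column from the example data frame
x <- c(2, 3, 7, 22, 8)

# Center only: subtract the mean, leave the spread unchanged
centered <- scale(x, center = TRUE, scale = FALSE)

# Defaults (center = TRUE, scale = TRUE): subtract the mean,
# then divide by the sample standard deviation
standardized <- scale(x)

# scale() returns a one-column matrix; as.numeric() converts it to a plain vector
as.numeric(centered)
as.numeric(standardized)
```

With center = TRUE and scale = FALSE, the result is simply x - mean(x), which can be a handy check that the arguments behave as described.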
The function calculates scaled values using the following formula:
x_scaled = (x_original - x̄) / s
x_original: The original x-value
x̄: The sample mean
s: The sample standard deviation
This process, which converts each original value into a z-score, is also known as standardizing the data (it is often loosely called normalizing).
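The formula can be checked by hand. The following is a minimal sketch, assuming the points values from the example data frame, that computes the z-scores directly and compares them with the output of scale(). The names z_manual and z_scale are illustrative.

```r
# Illustrative vector: the points column from the example data frame
x <- c(2, 3, 7, 22, 8)

# Apply the formula directly: subtract the sample mean,
# then divide by the sample standard deviation
z_manual <- (x - mean(x)) / sd(x)

# The same calculation with scale(), converted from a matrix to a vector
z_scale <- as.numeric(scale(x))

# Both approaches should agree up to floating-point tolerance
all.equal(z_manual, z_scale)  # TRUE
```

Note that sd() uses the sample standard deviation (dividing by n - 1), which matches the s in the formula.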
Suppose we want to use R’s scale() function to scale only the numeric columns of the data frame.
We can use the following syntax to do so:
# Scale only the numeric columns of the data frame
df %>% mutate(across(where(is.numeric), scale))
  Team      points      value
1   P1 -0.79813157 -1.1284228
2   P2 -0.67342351 -0.5447558
3   P3 -0.17459128 -0.0583667
4   P4  1.69602958  1.5467175
5   P5 -0.04988322  0.1848279
The Team column is unchanged, but the values in the two numeric columns (points and value) have been scaled.
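One detail that may matter in later steps: scale() returns a one-column matrix, so each scaled column in the result above is stored as a matrix rather than a plain numeric vector. The sketch below shows one possible way around this, assuming the same example data frame. The helper name scale_vec is hypothetical and introduced only for illustration.

```r
library(dplyr)

df <- data.frame(Team = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 points = c(2, 3, 7, 22, 8),
                 value = c(27, 39, 49, 82, 54))

# Hypothetical helper: scale a vector and drop the matrix structure
# so the result is an ordinary numeric vector
scale_vec <- function(x) as.numeric(scale(x))

# Scale only the numeric columns, keeping them as plain vectors
df_scaled <- df %>% mutate(across(where(is.numeric), scale_vec))

# Inspect the column types
str(df_scaled)
```

The scaled values are the same as before. Only the column type differs, which can make the result easier to pass to functions that expect ordinary numeric columns.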