Data Science Tutorials

For Data Science Learners

How to apply a transformation to multiple columns in R?

Posted on July 17 by Admin

To apply a transformation to many columns in R, use the across() function from the dplyr package.

This function has many applications; the following examples highlight some typical ones:

First Approach: Apply a Function to Several Columns

# Multiply the values in col1 and col2 by 2
df %>% mutate(across(c(col1, col2), function(x) x * 2))

Second Approach: Calculate One Summary Statistic for Multiple Columns

# Calculate the mean of col1 and col2
df %>% summarise(across(c(col1, col2), function(x) mean(x, na.rm = TRUE)))

Third Approach: Calculate Multiple Summary Statistics for Multiple Columns

# Calculate the mean and standard deviation of col1 and col2
df %>% summarise(across(c(col1, col2),
                        list(mean = function(x) mean(x, na.rm = TRUE),
                             sd = function(x) sd(x, na.rm = TRUE))))

The examples below demonstrate each technique using the following data frame.

Let’s create a data frame

df <- data.frame(team = c('P1', 'P1', 'P1', 'P2', 'P2', 'P2'),
                 points = c(26, 22, 28, 15, 32, 28),
                 rebounds = c(16, 15, 16, 12, 13, 10))

Now we can view the data frame

df
  team points rebounds
1   P1     26       16
2   P1     22       15
3   P1     28       16
4   P2     15       12
5   P2     32       13
6   P2     28       10

Example 1: Apply a Function to Multiple Columns

The following code uses the across() function to multiply the values in the points and rebounds columns by 2.

library(dplyr)

# Multiply the values in the points and rebounds columns by 2
df %>% mutate(across(c(points, rebounds), function(x) x * 2))
  team points rebounds
1   P1     52       32
2   P1     44       30
3   P1     56       32
4   P2     30       24
5   P2     64       26
6   P2     56       20
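
Example 1 overwrites the original columns. If the untransformed values are still needed, the .names argument of across() can write the results to new columns instead. The following is a minimal sketch under that assumption; the "_x2" suffix and the object name doubled are illustrative choices, not part of the original example.

```r
library(dplyr)

# Same data frame as in the tutorial
df <- data.frame(team = c('P1', 'P1', 'P1', 'P2', 'P2', 'P2'),
                 points = c(26, 22, 28, 15, 32, 28),
                 rebounds = c(16, 15, 16, 12, 13, 10))

# {.col} is replaced by each selected column name, so the doubled
# values land in points_x2 and rebounds_x2 (illustrative suffix)
# while points and rebounds stay unchanged
doubled <- df %>%
  mutate(across(c(points, rebounds), function(x) x * 2, .names = "{.col}_x2"))

names(doubled)
```

The result keeps team, points, and rebounds and appends one new column per transformed column.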

Example 2: Calculate One Summary Statistic for Multiple Columns

The following code uses the across() function to calculate the mean of both the points and rebounds columns.

# Calculate the mean of the points and rebounds columns
df %>% summarise(across(c(points, rebounds), function(x) mean(x, na.rm = TRUE)))
    points rebounds
1 25.16667 13.66667
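
The tutorial's data frame has no missing values, so na.rm = TRUE has no visible effect on the result above. The sketch below shows why the argument matters, assuming a hypothetical copy of the data (df_na) in which one points value has been replaced with NA.

```r
library(dplyr)

# Same data frame as in the tutorial
df <- data.frame(team = c('P1', 'P1', 'P1', 'P2', 'P2', 'P2'),
                 points = c(26, 22, 28, 15, 32, 28),
                 rebounds = c(16, 15, 16, 12, 13, 10))

# Hypothetical copy with one missing value, for illustration only
df_na <- df
df_na$points[2] <- NA

# Without na.rm, a single NA makes the whole column mean NA
with_na <- df_na %>%
  summarise(across(c(points, rebounds), function(x) mean(x)))

# With na.rm = TRUE, the NA is dropped before the mean is taken
without_na <- df_na %>%
  summarise(across(c(points, rebounds), function(x) mean(x, na.rm = TRUE)))

with_na
without_na
```

In with_na the points mean is NA, while in without_na it is the mean of the five remaining values.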

Note that we can also pass where(is.numeric) to across() to select every numeric column in the data frame automatically, instead of naming each column.

# Calculate the mean of every numeric column in the data frame
df %>% summarise(across(where(is.numeric), function(x) mean(x, na.rm = TRUE)))
    points rebounds
1 25.16667 13.66667
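
The same selection also works on grouped data. As a sketch, assuming we want one row of means per team (the object name team_means is illustrative), group_by() can be placed before summarise(); the grouping column is excluded from the across() selection automatically.

```r
library(dplyr)

# Same data frame as in the tutorial
df <- data.frame(team = c('P1', 'P1', 'P1', 'P2', 'P2', 'P2'),
                 points = c(26, 22, 28, 15, 32, 28),
                 rebounds = c(16, 15, 16, 12, 13, 10))

# Mean of every numeric column, calculated separately for each team
team_means <- df %>%
  group_by(team) %>%
  summarise(across(where(is.numeric), function(x) mean(x, na.rm = TRUE)))

team_means
```

The result has one row for P1 and one for P2, with the per-team means of points and rebounds.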

Example 3: Calculate Multiple Summary Statistics for Multiple Columns

The following code uses the across() function to calculate the mean and standard deviation of the points and rebounds columns.

# Calculate the mean and standard deviation of the points and rebounds columns
df %>% summarise(across(c(points, rebounds),
                        list(mean = function(x) mean(x, na.rm = TRUE),
                             sd = function(x) sd(x, na.rm = TRUE))))
    points_mean points_sd rebounds_mean rebounds_sd 
1    25.16667  5.946988      13.66667     2.42212 
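
The output columns are named by joining the column name and the name given to each function in the list. If a different pattern is preferred, the .names argument accepts a template built from {.col} and {.fn}. The sketch below assumes we want the function name first; the template and the object name stats are illustrative.

```r
library(dplyr)

# Same data frame as in the tutorial
df <- data.frame(team = c('P1', 'P1', 'P1', 'P2', 'P2', 'P2'),
                 points = c(26, 22, 28, 15, 32, 28),
                 rebounds = c(16, 15, 16, 12, 13, 10))

# {.fn} is the name from the function list, {.col} is the column name
stats <- df %>%
  summarise(across(c(points, rebounds),
                   list(mean = mean, sd = sd),
                   .names = "{.fn}_{.col}"))

names(stats)
```

With this template the columns are named mean_points, sd_points, mean_rebounds, and sd_rebounds.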

This nearly completes our tour of dplyr techniques. We will discuss the transmute() function in an upcoming post.

Copyright © 2025 Data Science Tutorials.