How to do Conditional Mutate in R: it’s common to want to add a new variable to an existing data frame based on a condition. Fortunately, the mutate() and case_when() functions from the dplyr package make this task simple.
Using the following data frame, this lesson walks through several examples of how to apply these functions.
How to do Conditional Mutate in R
Let’s create a data frame
df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'), position = c('A', 'B', 'A', 'B', 'B'), points = c(102, 215, 319, 125, 112), rebounds = c(22, 12, 19, 23, 36))
Let’s view the data frame
df
  player position points rebounds
1     P1        A    102       22
2     P2        B    215       12
3     P3        A    319       19
4     P4        B    125       23
5     P5        B    112       36
Example 1: Based on one existing variable, create a new variable
The following code creates a new variable called “score” based on the value in the “points” column.
library(dplyr)
Let’s define the new variable ‘score’ using mutate() and case_when()
df %>%
  mutate(score = case_when(points < 105 ~ 'LOW',
                           points < 212 ~ 'MED',
                           points < 450 ~ 'HIGH'))
  player position points rebounds score
1     P1        A    102       22   LOW
2     P2        B    215       12  HIGH
3     P3        A    319       19  HIGH
4     P4        B    125       23   MED
5     P5        B    112       36   MED
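One detail the example above does not exercise: case_when() returns NA for any row that matches none of the conditions. The sketch below shows that behaviour and one common way to add a catch-all, using TRUE as the final condition. The sixth player ‘P6’ with 500 points and the ‘VERY HIGH’ label are assumptions made up for this illustration and are not part of the lesson’s data.

```r
library(dplyr)

# Points from the lesson's data frame, plus a hypothetical player P6
# (an assumption, added only so that one row matches no condition)
df2 <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5', 'P6'),
                  points = c(102, 215, 319, 125, 112, 500))

# Without a catch-all, P6 matches no condition and score is NA
no_default <- df2 %>%
  mutate(score = case_when(points < 105 ~ 'LOW',
                           points < 212 ~ 'MED',
                           points < 450 ~ 'HIGH'))

# With TRUE as the last condition, every remaining row gets a label
with_default <- df2 %>%
  mutate(score = case_when(points < 105 ~ 'LOW',
                           points < 212 ~ 'MED',
                           points < 450 ~ 'HIGH',
                           TRUE ~ 'VERY HIGH'))

print(no_default)
print(with_default)
```

Whether a catch-all is appropriate depends on the data; leaving unmatched rows as NA can be useful for spotting values the conditions did not anticipate.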
Example 2: Based on a number of existing variables, create a new variable
The following code demonstrates how to create a new variable called “Type” based on the values in the player and position columns.
library(dplyr)
Now we can define the new variable ‘Type’ using mutate() and case_when()
df %>%
  mutate(Type = case_when(player == 'P1' | player == 'P2' ~ 'starter',
                          player == 'P3' | player == 'P4' ~ 'backup',
                          position == 'B' ~ 'reserve'))
  player position points rebounds    Type
1     P1        A    102       22 starter
2     P2        B    215       12 starter
3     P3        A    319       19  backup
4     P4        B    125       23  backup
5     P5        B    112       36 reserve
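When several values of the same column map to one label, the chained == and | comparisons can also be written with %in%. The following is a minimal sketch of that alternative, assuming the same player and position values as the data frame above; the name by_in is only an illustrative choice.

```r
library(dplyr)

# Only the two columns the conditions use, with the lesson's values
df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 position = c('A', 'B', 'A', 'B', 'B'))

# player %in% c('P1', 'P2') is TRUE when player is 'P1' or 'P2',
# so it selects the same rows as player == 'P1' | player == 'P2'
by_in <- df %>%
  mutate(Type = case_when(player %in% c('P1', 'P2') ~ 'starter',
                          player %in% c('P3', 'P4') ~ 'backup',
                          position == 'B' ~ 'reserve'))

print(by_in)
```

This form tends to stay readable as the list of matching values grows.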
Example 3: Based on conditions on two numeric variables, create a new variable
The following code creates a new variable called “value” based on the values in both the points and rebounds columns.
library(dplyr)
Let’s define the new variable ‘value’ using mutate() and case_when()
df %>%
  mutate(value = case_when(points <= 102 & rebounds <= 45 ~ 2,
                           points <= 215 & rebounds > 55 ~ 4,
                           points < 225 & rebounds < 28 ~ 6,
                           points < 325 & rebounds > 29 ~ 7,
                           points >= 25 ~ 9))
  player position points rebounds value
1     P1        A    102       22     2
2     P2        B    215       12     6
3     P3        A    319       19     9
4     P4        B    125       23     6
5     P5        B    112       36     7
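In the output above, P5 satisfies both the fourth condition and the last one, yet receives 7. That is because case_when() checks the conditions in order and uses the first one that is true, so the order in which conditions are written matters. As a small sketch of this, the code below reuses the points values from the data frame with the “score” thresholds from Example 1 written in reverse order; the name reordered is only an illustrative choice.

```r
library(dplyr)

# Points values from the lesson's data frame
df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 points = c(102, 215, 319, 125, 112))

# The broadest condition comes first, so it matches every row and the
# two narrower conditions below it are never reached
reordered <- df %>%
  mutate(score = case_when(points < 450 ~ 'HIGH',
                           points < 212 ~ 'MED',
                           points < 105 ~ 'LOW'))

print(reordered)
```

Every row is labelled ‘HIGH’ here, which is why overlapping conditions are usually listed from most specific to most general.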
Hopefully the concept is now clear.