Create New Variables from Existing Variables in R
To create new variables from existing variables in R, use the case_when() function from the dplyr package.
The following is the fundamental syntax for this function.
library(dplyr)

df %>%
  mutate(new_var = case_when(var1 < 25 ~ 'low',
                             var2 < 35 ~ 'med',
                             TRUE ~ 'high'))
The final TRUE condition works like an “else” branch: it catches every row that did not match an earlier condition.
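As a minimal sketch of how the conditions are evaluated, the snippet below applies the same pattern to a plain vector. The vector x and its values are hypothetical and chosen only for illustration. Each element takes the value of the first condition that is TRUE, and the final TRUE branch collects everything else, including NA, because a comparison with NA is NA rather than TRUE.

```r
library(dplyr)

# Hypothetical scores, used only for illustration
x <- c(10, 30, 50, NA)

labels <- case_when(
  x < 25 ~ "low",   # checked first
  x < 35 ~ "med",   # only reached when x < 25 was not TRUE
  TRUE   ~ "high"   # fallback: also catches NA, since NA < 25 is NA, not TRUE
)

print(labels)
# "low" "med" "high" "high"
```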
The following examples demonstrate how to use this function in practice with the data frame below.
Let’s create a data frame
df <- data.frame(player = c('A', 'B', 'C', 'D', 'E', 'F'),
                 position = c('R1', 'R2', 'R3', 'R4', 'R5', NA),
                 points = c(102, 105, 219, 322, 232, NA),
                 assists = c(405, 407, 527, 412, 211, NA))
Now we can view the data frame
df
  player position points assists
1      A       R1    102     405
2      B       R2    105     407
3      C       R3    219     527
4      D       R4    322     412
5      E       R5    232     211
6      F     <NA>     NA      NA
Example 1: Create New Variable from One Existing Variable
The following code demonstrates how to make a new variable named quality with values generated from the points column.
df %>%
  mutate(quality = case_when(points > 215 ~ 'high',
                             points > 120 ~ 'med',
                             TRUE ~ 'low'))

  player position points assists quality
1      A       R1    102     405     low
2      B       R2    105     407     low
3      C       R3    219     527    high
4      D       R4    322     412    high
5      E       R5    232     211    high
6      F     <NA>     NA      NA     low

The case_when() function created the values for the new column in the following way. Conditions are checked in order, and each row takes the value of the first condition that is TRUE, so the stricter condition comes first.
If the value in the points column is greater than 215, the value in the quality column is “high.”
Otherwise, if the value in the points column is greater than 120, the value in the quality column is “med.” No player in this data frame has more than 120 but at most 215 points, so “med” does not appear in the output.
Otherwise, if the value in the points column is less than or equal to 120 (or is a missing value like NA), the value in the quality column is “low.”
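Because case_when() returns the value of the first condition that is TRUE, the order of the conditions matters. The sketch below compares two orderings on a hypothetical vector pts (not part of the data frame above): when the broader condition comes first, the narrower one can never match.

```r
library(dplyr)

# Hypothetical point totals, used only for illustration
pts <- c(102, 150, 219)

# Broader condition first: every value above 215 is also above 120,
# so the second branch is never reached
broad_first <- case_when(
  pts > 120 ~ "high",
  pts > 215 ~ "med",
  TRUE      ~ "low"
)

# Stricter condition first: all three labels are reachable
strict_first <- case_when(
  pts > 215 ~ "high",
  pts > 120 ~ "med",
  TRUE      ~ "low"
)

print(broad_first)   # "low" "high" "high"
print(strict_first)  # "low" "med"  "high"
```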
Example 2: Create New Variable from Multiple Variables
The following code demonstrates how to make a new variable named quality with values drawn from both the points and assists columns.
df %>%
  mutate(quality = case_when(points > 215 & assists > 10 ~ 'great',
                             points > 215 & assists > 5 ~ 'good',
                             TRUE ~ 'average'))
  player position points assists quality
1      A       R1    102     405 average
2      B       R2    105     407 average
3      C       R3    219     527   great
4      D       R4    322     412   great
5      E       R5    232     211   great
6      F     <NA>     NA      NA average
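In the output above, every player with more than 215 points also has more than 10 assists, so the 'good' label never appears. As a rough illustration of how the second condition behaves, the sketch below uses a small hypothetical data frame named demo (its values are invented for this example) in which one row has more than 5 but not more than 10 assists.

```r
library(dplyr)

# Hypothetical data, invented only to exercise every branch
demo <- data.frame(points  = c(300, 300, 100),
                   assists = c(12, 7, 12))

demo_out <- demo %>%
  mutate(quality = case_when(points > 215 & assists > 10 ~ 'great',
                             points > 215 & assists > 5  ~ 'good',
                             TRUE ~ 'average'))

print(demo_out$quality)
# "great" "good" "average"
```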
It’s worth noting that the is.na() function can also be used to explicitly assign strings to NA values.
df %>%
  mutate(quality = case_when(is.na(points) ~ 'missing',
                             points > 215 & assists > 150 ~ 'great',  # stricter condition first
                             points > 215 & assists > 100 ~ 'good',
                             TRUE ~ 'average'))
  player position points assists quality
1      A       R1    102     405 average
2      B       R2    105     407 average
3      C       R3    219     527   great
4      D       R4    322     412   great
5      E       R5    232     211   great
6      F     <NA>     NA      NA missing
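To see why the explicit is.na() branch is useful, the sketch below compares two variants on a hypothetical vector pts. Without a TRUE fallback, an element that matches no condition comes back as NA. With an is.na() branch placed first, the missing value gets its own label before the other conditions are checked.

```r
library(dplyr)

# Hypothetical point totals with one missing value
pts <- c(102, 322, NA)

# No fallback: the NA element matches neither condition, so the result is NA
no_fallback <- case_when(
  pts > 215  ~ "great",
  pts <= 215 ~ "average"
)

# Explicit NA branch first, then the usual conditions
with_na_branch <- case_when(
  is.na(pts) ~ "missing",
  pts > 215  ~ "great",
  TRUE       ~ "average"
)

print(no_fallback)     # "average" "great" NA
print(with_na_branch)  # "average" "great" "missing"
```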