How to Rank by Group in R
To rank a variable within groups in dplyr, combine group_by() with mutate() and rank().
The examples below use a small data frame to show this syntax in practice.
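As a rough sketch of the general pattern, the snippet below ranks a numeric column within each level of a grouping column. The data and the names `toy`, `group_var` and `value_var` are placeholders made up for illustration, not part of the data frame used in the examples.

```r
library(dplyr)

# Hypothetical toy data: 'group_var' and 'value_var' are placeholder names
toy <- data.frame(
  group_var = c('x', 'x', 'y', 'y', 'y'),
  value_var = c(3, 1, 9, 4, 6)
)

# Rank value_var separately within each level of group_var,
# then drop the grouping so later steps act on the whole table
ranked <- toy %>%
  group_by(group_var) %>%
  mutate(rank = rank(value_var)) %>%
  ungroup()

ranked
```

Because the ranking happens inside group_by(), the ranks restart at 1 for every group.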
Let’s create a data frame
df <- data.frame(team = c('A', 'A', 'A', 'A', 'P2', 'B', 'B', 'P3', 'C', 'C'),
                 points = c(10, 55, 25, 36, 45, 41, 82, 25, 33, 25),
                 rebounds = c(15, 17, 17, 10, 10, 14, 15, 11, 5, 18))
Now we can view the data frame
df
   team points rebounds
1     A     10       15
2     A     55       17
3     A     25       17
4     A     36       10
5    P2     45       10
6     B     41       14
7     B     82       15
8    P3     25       11
9     C     33        5
10    C     25       18
Example 1: Rank in Ascending Order
The code below ranks the points scored by players within each team in ascending order, so the lowest points value in each team receives rank 1.
library(dplyr)
Now rank points scored, grouped by team
df %>% arrange(team, points) %>% group_by(team) %>% mutate(rank = rank(points))
   team  points rebounds  rank
   <chr>  <dbl>    <dbl> <dbl>
 1 A         10       15     1
 2 A         25       17     2
 3 A         36       10     3
 4 A         55       17     4
 5 B         41       14     1
 6 B         82       15     2
 7 C         25       18     1
 8 C         33        5     2
 9 P2        45       10     1
10 P3        25       11     1
Example 2: Rank in Descending Order
To rank in descending order instead, place a negative sign in front of the variable inside rank(), so the highest points value in each team receives rank 1.
library(dplyr)
Now rank points scored in descending order, grouped by team
df %>% arrange(team, points) %>% group_by(team) %>% mutate(rank = rank(-points))
   team  points rebounds  rank
   <chr>  <dbl>    <dbl> <dbl>
 1 A         10       15     4
 2 A         25       17     3
 3 A         36       10     2
 4 A         55       17     1
 5 B         41       14     2
 6 B         82       15     1
 7 C         25       18     2
 8 C         33        5     1
 9 P2        45       10     1
10 P3        25       11     1
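The negative sign can also be written with dplyr's desc() helper, which some readers may find easier to read. The short sketch below compares the two on a made-up vector (`scores` is a hypothetical name, not a column from the data frame above).

```r
library(dplyr)

# Made-up scores for illustration only
scores <- c(10, 25, 36, 55)

# Negating the values reverses the ranking order
neg_rank <- rank(-scores)

# desc() from dplyr transforms the vector so it sorts in descending order
desc_rank <- rank(desc(scores))

neg_rank
desc_rank
```

Both calls should give the largest value rank 1, so inside mutate() either rank(-points) or rank(desc(points)) can be used.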
How to Handle Ranking Ties
When ranking numerical values, we can specify how ties should be handled with the ties.method argument of rank().
rank(points, ties.method='average')
To control how ties are handled, choose one of the following options:
average: (default) Assigns each tied element the average rank (elements ranked in the 3rd and 4th positions would both receive a rank of 3.5)
first: Assigns ranks to tied elements in order of appearance (elements ranked in the 3rd and 4th positions would receive ranks 3 and 4 respectively)
min: Assigns each tied element the lowest rank (elements ranked in the 3rd and 4th positions would both receive a rank of 3)
max: Assigns each tied element the highest rank (elements ranked in the 3rd and 4th positions would both receive a rank of 4)
random: Assigns the tied elements ranks in random order (either element tied for the 3rd and 4th positions could receive either rank)
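To see the tie-handling options side by side, here is a small sketch on a made-up vector `x` (a hypothetical example, not taken from the data frame above) in which two values tie for the 3rd and 4th positions.

```r
# Made-up values with a tie between the 3rd and 4th positions
x <- c(5, 8, 12, 12, 20)

r_average <- rank(x, ties.method = 'average')  # 1.0 2.0 3.5 3.5 5.0
r_first   <- rank(x, ties.method = 'first')    # 1 2 3 4 5
r_min     <- rank(x, ties.method = 'min')      # 1 2 3 3 5
r_max     <- rank(x, ties.method = 'max')      # 1 2 4 4 5
r_random  <- rank(x, ties.method = 'random')   # tied pair gets 3 and 4 in random order
```

The same argument can be passed inside a grouped mutate(), for example mutate(rank = rank(points, ties.method = 'min')).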