Crosstab calculation in R
To create a crosstab using functions from the dplyr and tidyr packages in R, load both packages with library(dplyr) and library(tidyr), then use the following basic syntax.
df %>% group_by(var1, var2) %>% tally() %>% spread(var1, n)
The examples below show how to use this syntax in practice.
Example 1: Make a simple crosstab
Suppose we have the following R data frame, created as follows:
df <- data.frame(team=c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'), position=c('A', 'A', 'B', 'C', 'C', 'C', 'D', 'D'), points=c(107, 207, 208, 211, 213, 215, 219, 313))
Now we can view the data frame:
df
  team position points
1    X        A    107
2    X        A    207
3    X        B    208
4    X        C    211
5    Y        C    213
6    Y        C    215
7    Y        D    219
8    Y        D    313
To make a crosstab for the ‘team’ and ‘position’ variables, use the following syntax:
df %>% group_by(team, position) %>% tally() %>% spread(team, n)
  position     X     Y
  <chr>    <int> <int>
1 A            2    NA
2 B            1    NA
3 C            1     2
4 D           NA     2
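Here is how to interpret the values in the crosstab:
There are 2 players who have a position of ‘A’ and belong to team ‘X’.
There is 1 player who has a position of ‘B’ and belongs to team ‘X’.
An NA means no player has that combination; for example, no player on team ‘Y’ has a position of ‘A’.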
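The NA cells are easier to understand by looking at the intermediate counts that spread() receives. The sketch below is not part of the original example: it reuses the same data frame, prints the long-format counts, and then uses the fill argument of spread() to show 0 instead of NA. The names counts and crosstab_filled are illustrative.

```r
library(dplyr)
library(tidyr)

# Same data frame as in Example 1
df <- data.frame(team=c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 position=c('A', 'A', 'B', 'C', 'C', 'C', 'D', 'D'),
                 points=c(107, 207, 208, 211, 213, 215, 219, 313))

# Intermediate long-format counts: one row per observed team/position pair
counts <- df %>% group_by(team, position) %>% tally()
print(counts)

# Pairs that never occur (e.g. team 'Y', position 'A') have no row in counts,
# so spread() has nothing to place in those cells and leaves them as NA.
# Passing fill = 0 asks spread() to use 0 for those cells instead.
crosstab_filled <- counts %>% spread(team, n, fill = 0)
print(crosstab_filled)
```

Whether 0 or NA is the better display depends on the analysis: 0 reads naturally for counts, while NA makes it explicit that the combination never appeared in the data.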
Note that we can swap the rows and columns of the crosstab by changing the variable passed to the spread() function.
Let’s produce a crosstab with ‘position’ along the columns:
df %>% group_by(team, position) %>% tally() %>% spread(position, n)
  team      A     B     C     D
  <chr> <int> <int> <int> <int>
1 X         2     1     1    NA
2 Y        NA    NA     2     2
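The tidyr documentation marks spread() as superseded by pivot_wider(), so the same crosstab can also be written with the newer function. The following is a minimal sketch rather than part of the original example: it assumes tidyr 1.0.0 or later, uses count() as a shorthand for group_by() plus tally(), and fills empty cells with 0 through values_fill. The name wide is illustrative.

```r
library(dplyr)
library(tidyr)

# Same data frame as in Example 1
df <- data.frame(team=c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 position=c('A', 'A', 'B', 'C', 'C', 'C', 'D', 'D'),
                 points=c(107, 207, 208, 211, 213, 215, 219, 313))

# count() tallies each team/position pair and returns an ungrouped result
wide <- df %>%
  count(team, position) %>%
  pivot_wider(names_from = position,        # one column per position
              values_from = n,              # cell values are the counts
              values_fill = list(n = 0L))   # show 0 where a pair never occurs
print(wide)
```

Swapping names_from to team would put the teams along the columns instead, mirroring the change of the first argument to spread().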