How to create a ggalluvial plot in R
The ggalluvial package's vaccinations data set comes from a survey “that asked respondents their likelihood of receiving an influenza vaccination.”
Each row records the survey's designation (survey), the frequency of respondents (freq), and the response variable (response), which is the discretized probability of immunization.
Let's install and load the ggalluvial package, then preview the data.
install.packages("ggalluvial")
library(ggalluvial)
head(vaccinations)

     survey freq subject response start_date   end_date
1 ms153_NSA   48       1   Always 2010-09-22 2010-10-25
2 ms153_NSA    9       2   Always 2010-09-22 2010-10-25
3 ms153_NSA   66       3   Always 2010-09-22 2010-10-25
4 ms153_NSA    1       4   Always 2010-09-22 2010-10-25
5 ms153_NSA   11       5   Always 2010-09-22 2010-10-25
6 ms153_NSA    1       6   Always 2010-09-22 2010-10-25
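Before plotting, it can help to see how many respondents fall into each combination of survey and response, since those totals drive the sizes of the strata and flows. The following is a minimal sketch using base R's xtabs(); the object name counts is only illustrative and not part of the package.

```r
library(ggalluvial)  # also makes the vaccinations data set available

# Cross-tabulate respondent counts: rows are surveys, columns are responses.
# `counts` is an illustrative name chosen for this sketch.
counts <- xtabs(freq ~ survey + response, data = vaccinations)
counts

# Row totals: number of respondents represented in each survey.
rowSums(counts)
```

Using xtabs() with freq on the left-hand side sums the frequencies rather than counting rows, which matches how the y = freq aesthetic weights the plot.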
Alluvial plot
To construct alluvial diagrams in ggplot2, use the geom_alluvium and geom_stratum functions from the ggalluvial package.
Pass a data frame with one column per axis variable and a column of frequencies, specify the axis variables (axis1, axis2, ...) and y inside aes, and then add the two geoms.
You can also change the ggplot2 theme, label each stratum with text, and name the axis variables, although the axis names will not be visible in the examples that follow because they use theme_void.
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response)) +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
More categorical variables in your dataset can be passed to aes (axis1, axis2, axis3,…) as in the example below.
ggplot(data = vaccinations,
       aes(axis1 = survey,   # First variable on the X-axis
           axis2 = response, # Second variable on the X-axis
           axis3 = survey,   # Third variable on the X-axis
           y = freq)) +
  geom_alluvium(aes(fill = response)) +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
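The vaccinations data are stored with one row per subject group and survey, which ggalluvial calls lodes (long) format. As an alternative sketch, similar to an example in the package vignette, the same data can be drawn with the x, stratum, and alluvium aesthetics, so that each survey becomes an axis and the flows track how responses change between surveys. The object name p_long is illustrative.

```r
library(ggplot2)
library(ggalluvial)

# Long-format sketch: one axis per survey, strata are the responses,
# and `subject` links the same group of respondents across surveys.
p_long <- ggplot(vaccinations,
                 aes(x = survey, stratum = response, alluvium = subject,
                     y = freq, fill = response, label = response)) +
  scale_x_discrete(expand = c(0.1, 0.1)) +
  geom_flow() +                       # flows between adjacent axes
  geom_stratum(alpha = 0.5) +         # semi-transparent stratum boxes
  geom_text(stat = "stratum", size = 3) +
  theme_void()

p_long
```

This form answers a different question from the axis1/axis2 examples: it shows movement between surveys rather than the relationship between a survey and its responses.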
The curve_type argument of the geom_alluvium function controls the shape of the flows in the plot region.
The default value, “xspline”, draws approximation splines with four points per curve. The alternative options are as follows:
Linear
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response),
                curve_type = "linear") +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
Cubic
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response),
                curve_type = "cubic") +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
Quintic
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response),
                curve_type = "quintic") +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
Sine
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response),
                curve_type = "sine") +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
Arctangent
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response),
                curve_type = "arctangent") +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
Sigmoid
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response),
                curve_type = "sigmoid") +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
For the alternative curve types based on asymptotic functions, note that there is an additional argument named curve_range. Run ?geom_alluvium for more information.
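As a rough illustration of curve_range, the sketch below draws the sigmoid version of the plot with two different values so the resulting shapes can be compared side by side. The helper sigmoid_plot and the values 2 and 10 are assumptions chosen for this example, not recommendations from the package.

```r
library(ggplot2)
library(ggalluvial)

# Hypothetical helper: builds the sigmoid plot for a given curve_range,
# i.e. how far along the asymptote the curve is truncated.
sigmoid_plot <- function(rng) {
  ggplot(vaccinations,
         aes(axis1 = survey, axis2 = response, y = freq)) +
    geom_alluvium(aes(fill = response),
                  curve_type = "sigmoid", curve_range = rng) +
    geom_stratum() +
    geom_text(stat = "stratum",
              aes(label = after_stat(stratum))) +
    scale_x_discrete(limits = c("Survey", "Response"),
                     expand = c(0.15, 0.05)) +
    theme_void()
}

p_small <- sigmoid_plot(2)   # smaller truncation value
p_large <- sigmoid_plot(10)  # larger truncation value

p_small
p_large
```

Wrapping the plot in a function keeps the two calls identical apart from the one argument being explored.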
Customization of colours
The colours used for the flows can be changed. They can have a single colour assigned to them, a predetermined colour palette, or a custom colour vector.
Note that the stratum boxes’ colours can also be changed.
Fill color
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(fill = "red") +  # constant colour, set outside aes()
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
Color palette
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response)) +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  scale_fill_viridis_d() +
  theme_void()
Custom colors
colors <- hcl.colors(4, "Red-Blue")

ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response)) +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  scale_fill_manual(values = colors) +
  theme_void()
Stratum color
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response)) +
  geom_stratum(aes(fill = response)) +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()
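The stratum boxes can also take constant colours instead of a mapped variable. The sketch below, with colour values and the object name p_strata chosen purely for illustration, gives every box the same grey fill and border and makes the flows slightly transparent.

```r
library(ggplot2)
library(ggalluvial)

p_strata <- ggplot(vaccinations,
                   aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response), alpha = 0.7) +   # translucent flows
  geom_stratum(fill = "grey90", color = "grey40") +    # constant box colours
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void()

p_strata
```

Because fill and color are set outside aes() in geom_stratum, they apply uniformly to the boxes and do not add entries to the legend.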
Legend customization
Because the ggalluvial package is built on ggplot2, you can change the legend's title, key labels, and position, or remove the legend entirely. The examples below show several of these adjustments.
Legend title
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response)) +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void() +
  guides(fill = guide_legend(title = "Title"))
Legend key labels
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response)) +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void() +
  scale_fill_hue(labels = c("A", "B", "C", "D"))
Remove the legend
ggplot(data = vaccinations,
       aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response)) +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void() +
  theme(legend.position = "none")
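Legend position

The same legend.position theme setting can move the legend instead of removing it. As a small sketch (the object name p_legend and the choice of "bottom" are illustrative), the legend is placed underneath the diagram:

```r
library(ggplot2)
library(ggalluvial)

p_legend <- ggplot(vaccinations,
                   aes(axis1 = survey, axis2 = response, y = freq)) +
  geom_alluvium(aes(fill = response)) +
  geom_stratum() +
  geom_text(stat = "stratum",
            aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Survey", "Response"),
                   expand = c(0.15, 0.05)) +
  theme_void() +
  theme(legend.position = "bottom")  # also accepts "top", "left", "right"

p_legend
```

The theme() call has to come after theme_void(), since a complete theme added later would overwrite the legend setting.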