Data Science Tutorials

How to create a Sankey plot in R?

Posted on October 19 By Jim

To produce a Sankey diagram in ggplot2, install the ggsankey package and reshape your dataset with the package’s make_long function.

The reshaped data must contain four columns: x (current stage), next_x (next stage), node (current node), and next_node (the following node).

Keep in mind that rows belonging to the final stage have no successor, so their next_x and next_node are NA.
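
To make the four-column layout concrete, here is a minimal base R sketch that builds it by hand. The toy data frame wide and its columns a, b, c are hypothetical, and this is only an illustration of the layout, not ggsankey’s own implementation (make_long may order the rows differently).

```r
# Hypothetical toy data: three stages (a, b, c), one row per observation
wide <- data.frame(a = c("x", "y"), b = c("p", "q"), c = c("m", "m"))
stages <- names(wide)

# Build one block of rows per stage; the last stage has no successor, so NA
long <- do.call(rbind, lapply(seq_along(stages), function(i) {
  last <- i == length(stages)
  data.frame(
    x         = stages[i],
    node      = wide[[stages[i]]],
    next_x    = if (last) NA_character_ else stages[i + 1],
    next_node = if (last) NA_character_ else wide[[stages[i + 1]]]
  )
}))
long
```

Each observation contributes one row per stage, and only the rows of the last stage carry NA in next_x and next_node.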

Let’s install the remotes package first:

install.packages("remotes")

Now we can install the ggsankey package from GitHub and load it:

remotes::install_github("davidsjoberg/ggsankey")
library(ggsankey)
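
Before going further, it can help to check which of the packages used in this post are already available. The sketch below only reports what is missing and installs nothing; the variable names pkgs, installed and missing_pkgs are chosen here for illustration.

```r
# Packages this tutorial relies on
pkgs <- c("ggplot2", "dplyr", "ggsankey")

# requireNamespace() returns FALSE instead of erroring when a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
}
```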

Load Data

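We can use the built-in mtcars dataset and reshape it with make_long. The pipe operator %>% comes from dplyr, so load that first.

library(dplyr)  # provides the %>% pipe

df <- mtcars %>%
  make_long(cyl, vs, am, gear, carb)
head(df)  # first six rows of the long data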
    x   node next_x next_node
1  cyl    6     vs         0
2   vs    0     am         1
3   am    1   gear         4
4 gear    4   carb         4
5 carb    4   <NA>        NA
6  cyl    6     vs         0
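
As a quick sanity check on the reshaped data, the base R sketch below works out how many rows the long format should have and how many cars sit at each node. It uses only mtcars; the names stages, expected_rows and node_counts are introduced here for illustration, and the link between these counts and the node heights in the plot is an assumption about how geom_sankey sizes nodes.

```r
stages <- c("cyl", "vs", "am", "gear", "carb")

# Each car contributes one row per stage to the long data
expected_rows <- nrow(mtcars) * length(stages)
expected_rows  # 160

# Number of cars at each node of each stage
node_counts <- lapply(mtcars[stages], table)
node_counts$cyl
```

If nrow(df) differs from expected_rows, the reshaping step probably did not receive all five columns.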

Sankey plot with ggsankey

The ggsankey package provides a geom called geom_sankey for constructing Sankey diagrams in ggplot2.

Keep in mind that you must pass a factor as the fill colour when mapping the variables inside aes. The package also provides its own theme, theme_sankey.

Let’s load ggplot2 and draw the plot:

library(ggplot2)
library(dplyr)
ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node))) +
  geom_sankey() +
  theme_sankey(base_size = 16)

How to add labels to a Sankey plot

The package’s geom_sankey_label function lets you add labels to Sankey diagrams.

Remember to map the variable you want to display to the label aesthetic inside aes.

ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node),
               label = node)) +
  geom_sankey() +
  geom_sankey_label() +
  theme_sankey(base_size = 16)

How to customize colors in a Sankey plot

A variety of arguments can be changed to alter how the Sankey diagram looks. The author of the package produced the following pictures as examples.

[Figures: geom_sankey aesthetics; geom_sankey geometries]

Color and fill of the Sankey plot

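For instance, by switching the fill palette (here viridis option "A", the magma palette) and adjusting a few arguments of the geom_sankey function, such as the flow transparency and the node outline color, we can produce the following plot.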

ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node),
               label = node)) +
  geom_sankey(flow.alpha = 0.5, node.color = 1) +
  geom_sankey_label(size = 3.5, color = 1, fill = "white") +
  scale_fill_viridis_d(option = "A", alpha = 0.95) +
  theme_sankey(base_size = 16)

Changing the title of the legend

As with other ggplot2 charts, you can change the title of the legend. One way is to pass a title to guide_legend inside guides.

ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node),
               label = node)) +
  geom_sankey(flow.alpha = 0.5, node.color = 1) +
  geom_sankey_label(size = 3.5, color = 1, fill = "white") +
  scale_fill_viridis_d() +
  theme_sankey(base_size = 16) +
  guides(fill = guide_legend(title = "Title"))
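
Two other standard ggplot2 ways to set the legend title are sketched below. To keep the example runnable with ggplot2 alone, it uses a plain bar chart of mtcars as a stand-in; the names base_plot, p_labs and p_scale and the title "Cylinders" are illustrative, and the same layers should be addable to the Sankey plot above.

```r
library(ggplot2)

# Plain bar chart used as a stand-in for the Sankey plot
base_plot <- ggplot(mtcars, aes(x = factor(gear), fill = factor(cyl))) +
  geom_bar()

# Option 1: name the fill aesthetic with labs()
p_labs <- base_plot + labs(fill = "Cylinders")

# Option 2: set the title through the scale's name argument
p_scale <- base_plot + scale_fill_viridis_d(name = "Cylinders")
```

Option 2 is convenient when you are already adding a fill scale, as the Sankey examples do with scale_fill_viridis_d.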

Removing the legend

Finally, to remove the legend from the Sankey plot, set legend.position to "none" inside theme.

ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node),
               label = node)) +
  geom_sankey(flow.alpha = 0.5, node.color = 1) +
  geom_sankey_label(size = 3.5, color = 1, fill = "white") +
  scale_fill_viridis_d() +
  theme_sankey(base_size = 16) +
  theme(legend.position = "none")
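
For comparison, here is a short sketch of two related ggplot2 options: dropping only the fill legend with guides, and moving the legend instead of removing it. As an assumption for runnability, it again uses a plain bar chart of mtcars as a stand-in for the Sankey plot; base_plot, p_no_fill and p_bottom are illustrative names.

```r
library(ggplot2)

# Plain bar chart used as a stand-in for the Sankey plot
base_plot <- ggplot(mtcars, aes(x = factor(gear), fill = factor(cyl))) +
  geom_bar()

# Drop only the legend for the fill aesthetic
p_no_fill <- base_plot + guides(fill = "none")

# Keep the legend but place it below the plot
p_bottom <- base_plot + theme(legend.position = "bottom")
```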

Copyright © 2023 Data Science Tutorials.
