Data Science Tutorials

For Data Science Learners

Using describeBy() in R: A Comprehensive Guide

Posted on July 27 by Admin

When working with data in R, you often need descriptive statistics for every column in a data frame, calculated separately for each level of a grouping column.

This can be a tedious task, especially when dealing with large datasets. Fortunately, the describeBy() function from the psych package in R makes this process much easier.

In this article, we’ll explore how to use describeBy() to calculate descriptive statistics for each column in a data frame, grouped by a character column.

The Syntax

The describeBy() function uses the following syntax:

describeBy(x, group=NULL, mat=FALSE, type=3, digits=15, ...)

Where:

  • x: A data frame or matrix
  • group: A grouping variable or list of grouping variables
  • mat: A logical value indicating whether to return a matrix output (default is FALSE)
  • type: The type of skewness and kurtosis to calculate (default is 3)
  • digits: The number of digits to report if mat is TRUE (default is 15)
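
As a rough illustration of the group, mat, and digits arguments, the sketch below summarizes a small data frame first by one grouping variable and then by a list of two. The data frame toy, its position column, and the object names by_team and by_both are invented for this illustration. Passing the grouping vectors directly (toy$team) is only one way to supply group.

```r
library(psych)

# Toy data, invented for this sketch only
toy <- data.frame(team     = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                  position = c('G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'),
                  points   = c(10, 14, 12, 18, 20, 26, 21, 23),
                  assists  = c(3, 5, 4, 8, 6, 7, 2, 9))

# Only the numeric columns are passed as x
num_cols <- toy[, c('points', 'assists')]

# mat=TRUE returns one data frame instead of a list of per-group tables;
# digits=2 rounds the reported values
by_team <- describeBy(num_cols, group = toy$team, mat = TRUE, digits = 2)

# A list of grouping variables yields one set of rows per combination
by_both <- describeBy(num_cols, group = list(toy$team, toy$position),
                      mat = TRUE, digits = 2)

by_team[, c('group1', 'n', 'mean', 'sd')]
by_both[, c('group1', 'group2', 'n', 'mean')]
```

With mat=TRUE the result is an ordinary data frame, so it can be filtered, sorted, or written to a file like any other.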

Example

Let’s create a sample data frame with information about basketball players:

# Create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

# View data frame
df

The data frame contains information about eight basketball players, with columns for the team, points scored, assists made, and rebounds gained.

Suppose we want to calculate descriptive statistics for each numeric column in the data frame, grouped by the team column. We can use the following syntax:

library(psych)

# Calculate descriptive statistics for numeric columns grouped by team
describeBy(df, group='team')

This will produce the following output:

 Descriptive statistics by group 
group: A
         vars n  mean    sd median trimmed  mad min max range  skew kurtosis
team*       1 4  1.00  0.00    1.0    1.00 0.00   1   1     0   NaN      NaN
points      2 4 85.25 12.84   87.0   85.25 9.64  68  99    31 -0.30    -1.86
assists     3 4 29.00  5.48   29.5   29.00 5.19  22  35    13 -0.18    -1.97
rebounds    4 4 26.50  3.00   26.0   26.50 2.97  24  30     6  0.14    -2.28
           se
team*    0.00
points   6.42
assists  2.74
rebounds 1.50
------------------------------------------------------------------------ 
group: B
         vars n  mean    sd median trimmed   mad min max range  skew kurtosis
team*       1 4  1.00  0.00    1.0    1.00  0.00   1   1     0   NaN      NaN
points      2 4 85.00 10.55   85.5   85.00 12.60  74  95    21 -0.03    -2.37
assists     3 4 34.50  7.42   32.5   34.50  4.45  28  45    17  0.51    -1.84
rebounds    4 4 31.25  3.20   30.0   31.25  0.74  29  36     7  0.70    -1.72
           se
team*    0.00
points   5.28
assists  3.71
rebounds 1.60

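The output shows the descriptive statistics for each numeric column in the data frame, grouped by the team column. The asterisk in the team* row marks a non-numeric column that describeBy() converted to numeric codes before summarizing, so that row can be ignored.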
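
The mean and se columns can be cross-checked without psych. The sketch below recomputes them with base R's aggregate(), taking se as the standard deviation divided by the square root of the group size. The object names means, sds, and ses are chosen for this illustration, and the group size of 4 is hard-coded because each team has four players in this data frame.

```r
# Same data frame as in the example above
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

# Group means and standard deviations for the three numeric columns
means <- aggregate(cbind(points, assists, rebounds) ~ team, data = df, FUN = mean)
sds   <- aggregate(cbind(points, assists, rebounds) ~ team, data = df, FUN = sd)

# Standard error of the mean: sd / sqrt(n), with n = 4 players per team
ses <- sds
ses[, -1] <- round(sds[, -1] / sqrt(4), 2)

means
ses
```

For team A, this gives a mean of 85.25 and a standard error of 6.42 for points, matching the describeBy() output.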

Conclusion

The describeBy() function from the psych package calculates descriptive statistics for each column in a data frame, grouped by a character column. A single call reports the count, mean, standard deviation, median, trimmed mean, MAD, minimum, maximum, range, skewness, kurtosis, and standard error for every group.

In this article, we’ve demonstrated how to use describeBy() to calculate descriptive statistics for each column in a data frame grouped by the team column. We’ve also covered the syntax and options available for customizing the output.

Whether you’re working with small or large datasets, describeBy() is an invaluable tool that can save you time and effort when summarizing your data.

So next time you need to calculate descriptive statistics for your data frame in R, give describeBy() a try!

Copyright © 2025 Data Science Tutorials.
