Using describeBy() in R

When working with data in R, it’s often necessary to calculate descriptive statistics for each column in a data frame, grouped by a particular column.

This can be a tedious task, especially when dealing with large datasets. Fortunately, the `describeBy()` function from the `psych` package in R makes this process much easier.

In this article, we’ll explore how to use `describeBy()` to calculate descriptive statistics for each column in a data frame, grouped by a character column.

**The Syntax**

The `describeBy()` function uses the following syntax:

`describeBy(x, group=NULL, mat=FALSE, type=3, digits=15, ...)`

Where:

- `x`: The name of the data frame
- `group`: A grouping variable or list of grouping variables
- `mat`: A logical value indicating whether to return a matrix output (default is `FALSE`)
- `type`: The type of skewness and kurtosis to calculate (default is 3)
- `digits`: The number of digits to report if `mat` is `TRUE` (default is 15)
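To make the `type` argument more concrete, here is a minimal base R sketch of what `type=3` skewness and kurtosis appear to compute. It assumes the type 3 definitions are the third and fourth central moments divided by powers of the ordinary sample standard deviation; the variable names are illustrative only. The input vector is the set of points for team A from the example in this article.

```r
# Points scored by the four team A players in the article's example data
x <- c(99, 68, 86, 88)
n <- length(x)
dev <- x - mean(x)

# Central moments divide by n; sd() divides by n - 1
m3 <- sum(dev^3) / n
m4 <- sum(dev^4) / n
s  <- sd(x)

# Assumed type 3 definitions (not taken from the psych source code)
skew3 <- m3 / s^3        # skewness
kurt3 <- m4 / s^4 - 3    # excess kurtosis

round(c(skew = skew3, kurtosis = kurt3), 2)
```

Under these assumptions the result rounds to -0.30 and -1.86, which agrees with the skew and kurtosis reported for points in group A.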

**Example**

Let’s create a sample data frame with information about basketball players:

    # Create data frame
    df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                     points=c(99, 68, 86, 88, 95, 74, 78, 93),
                     assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                     rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

    # View data frame
    df

The data frame contains information about eight basketball players, with columns for the team, points scored, assists made, and rebounds gained.


Suppose we want to calculate descriptive statistics for each numeric column in the data frame, grouped by the team column. We can use the following syntax:

    library(psych)

    # Calculate descriptive statistics for numeric columns grouped by team
    describeBy(df, group='team')

This will produce the following output:

    Descriptive statistics by group
    group: A
             vars n  mean    sd median trimmed  mad min max range  skew kurtosis   se
    team*       1 4  1.00  0.00    1.0    1.00 0.00   1   1     0   NaN      NaN 0.00
    points      2 4 85.25 12.84   87.0   85.25 9.64  68  99    31 -0.30    -1.86 6.42
    assists     3 4 29.00  5.48   29.5   29.00 5.19  22  35    13 -0.18    -1.97 2.74
    rebounds    4 4 26.50  3.00   26.0   26.50 2.97  24  30     6  0.14    -2.28 1.50
    ------------------------------------------------------------------------
    group: B
             vars n  mean    sd median trimmed   mad min max range  skew kurtosis   se
    team*       1 4  1.00  0.00    1.0    1.00  0.00   1   1     0   NaN      NaN 0.00
    points      2 4 85.00 10.55   85.5   85.00 12.60  74  95    21 -0.03    -2.37 5.28
    assists     3 4 34.50  7.42   32.5   34.50  4.45  28  45    17  0.51    -1.84 3.71
    rebounds    4 4 31.25  3.20   30.0   31.25  0.74  29  36     7  0.70    -1.72 1.60

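The output shows the descriptive statistics for each column in the data frame, grouped by the team column. The asterisk in `team*` marks a non-numeric column that was converted to numeric codes before being summarized, so its row can be ignored.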
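If the grouped list output is awkward to work with, one option is to pass only the numeric columns and request a single table with `mat=TRUE`. The sketch below rebuilds the example data frame so that it runs on its own; the names `num_cols` and `stats` are illustrative, and the `group1` column name is what `describeBy()` is expected to use for the first grouping variable in matrix output.

```r
library(psych)

# Same example data as in the article
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

# Describe only the numeric columns; the grouping values are passed as a vector
num_cols <- c("points", "assists", "rebounds")
stats <- describeBy(df[, num_cols], group=df$team, mat=TRUE, digits=2)

# One row per variable/group combination; group1 holds the team label
stats[, c("group1", "n", "mean", "sd", "median", "min", "max")]
```

Because the character column is left out of `x`, no `team*` row appears, and the single table is easier to filter or export than the per-group printout.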

**Conclusion**

The `describeBy()` function calculates descriptive statistics for each column in a data frame, grouped by a character column, in a single call. Its syntax is simple, and options such as `mat`, `type`, and `digits` let you customize the output.

In this article, we’ve demonstrated how to use `describeBy()` to summarize a data frame grouped by the team column, and we’ve covered the syntax and the available options.

Whether you’re working with small or large datasets, `describeBy()` can save you time and effort when summarizing your data, so give it a try the next time you need grouped descriptive statistics in R.
