Data Science Tutorials

How to Create Summary Tables in R

Posted on July 26 by Jim
The describe() and describeBy() functions from the psych package are the simplest way to create summary tables in R.

library(psych)

Let’s create a summary table

describe(df)

We can also create a summary table that is grouped by a specific variable.

describeBy(df, group=df$var_name)

The following examples show how to use these functions in practice.

Example 1: Create a simple summary table

Let’s say we have the R data frame shown below:

Create the data frame:

df <- data.frame(team=c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P1'),
points=c(150, 222, 229, 421, 330, 211, 219),
rebounds=c(17, 28, 36, 16, 17, 29, 15),
steals=c(11, 151, 152, 73, 85, 79, 58))

Now we can view the data frame

df
   team points rebounds steals
1   P1    150       17     11
2   P1    222       28    151
3   P1    229       36    152
4   P2    421       16     73
5   P2    330       17     85
6   P2    211       29     79
7   P1    219       15     58

The describe() function creates a summary table with one row for each variable in the data frame.

library(psych)

Now we will create a summary table

describe(df)
         vars n   mean    sd median trimmed   mad min max range skew kurtosis
team*       1 7   1.43  0.53      1    1.43  0.00   1   2     1 0.23    -2.20
points      2 7 254.57 90.56    222  254.57 16.31 150 421   271 0.71    -1.03
rebounds    3 7  22.57  8.30     17   22.57  2.97  15  36    21 0.44    -1.73
steals      4 7  87.00 50.34     79   87.00 31.13  11 152   141 0.08    -1.47
            se
team*     0.20
points   34.23
rebounds  3.14
steals   19.03

Here’s how to interpret each value in the output:

vars: column number

n: Number of valid cases

mean: The mean value

sd: The standard deviation

median: The median value

trimmed: The trimmed mean (default trims 10% of observations from each end)

mad: The median absolute deviation (from the median)

min: The minimum value

max: The maximum value

range: The range of values (max – min)

skew: The skewness

kurtosis: The kurtosis

se: The standard error
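To make a few of these definitions concrete, the sketch below recomputes some of the statistics for the "points" column by hand in base R. It assumes that describe() uses a 10% trim, the usual scaling constant of 1.4826 for the median absolute deviation, and sd / sqrt(n) for the standard error; the variable names are only illustrative.

```r
# Same values as the "points" column in the example data frame
points <- c(150, 222, 229, 421, 330, 211, 219)

n       <- length(points)
trimmed <- mean(points, trim = 0.1)     # drops floor(0.1 * n) values per end (0 here)
mad_val <- mad(points)                  # median absolute deviation, scaled by 1.4826
rng     <- max(points) - min(points)    # range as max minus min
se      <- sd(points) / sqrt(n)         # standard error of the mean

round(c(trimmed = trimmed, mad = mad_val, range = rng, se = se), 2)
```

With only seven observations, a 10% trim removes nothing from either end, which is why the trimmed mean equals the ordinary mean in the output above.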

Any variable marked with an asterisk (*) is a categorical or logical variable that has been converted to a numeric variable whose values represent the ordering of its levels.
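The conversion can be imitated in base R. This is a rough sketch of the idea rather than the package's internal code: the team labels are turned into a factor and then into integer level codes.

```r
# Team labels from the example data frame
team <- c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P1')

# Factor levels are sorted alphabetically, so P1 becomes 1 and P2 becomes 2
team_codes <- as.numeric(factor(team))
team_codes

# The mean of these codes is what appears in the team* row
mean(team_codes)
```

The mean of the codes is about 1.43, which matches the team* row but says nothing meaningful about the teams themselves.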

We shouldn’t interpret the summary statistics for the variable “team”, because it has been converted into a numerical variable.

Also note that setting fast=TRUE computes only the most common summary statistics.

Now we can create a smaller summary table

describe(df, fast=TRUE)
         vars n   mean    sd min  max range    se
team        1 7    NaN    NA Inf -Inf  -Inf    NA
points      2 7 254.57 90.56 150  421   271 34.23
rebounds    3 7  22.57  8.30  15   36    21  3.14
steals      4 7  87.00 50.34  11  152   141 19.03

We can also compute the summary statistics for only a subset of the data frame’s variables:

Create a summary table using only the columns “points” and “rebounds”:

describe(df[ , c('points', 'rebounds')], fast=TRUE)
         vars n   mean    sd min max range    se
points      1 7 254.57 90.56 150 421   271 34.23
rebounds    2 7  22.57  8.30  15  36    21  3.14
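Instead of naming columns one by one, the numeric columns can be selected programmatically. The sketch below uses only base R and builds a small table with the same columns as the fast=TRUE output; the names is_num, num_df and quick are illustrative, and the result is a plain data frame rather than a psych object.

```r
df <- data.frame(team=c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P1'),
                 points=c(150, 222, 229, 421, 330, 211, 219),
                 rebounds=c(17, 28, 36, 16, 17, 29, 15),
                 steals=c(11, 151, 152, 73, 85, 79, 58))

# Keep only the numeric columns (drops "team")
is_num <- sapply(df, is.numeric)
num_df <- df[, is_num, drop = FALSE]

# One row per variable, one column per statistic
quick <- data.frame(n    = sapply(num_df, length),
                    mean = sapply(num_df, mean),
                    sd   = sapply(num_df, sd),
                    min  = sapply(num_df, min),
                    max  = sapply(num_df, max))
quick$range <- quick$max - quick$min
quick$se    <- quick$sd / sqrt(quick$n)

round(quick, 2)
```

The same num_df subset could also be passed to describe(), which avoids the uninformative row for "team".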

Example 2: Create a summary table grouped by a variable

The describeBy() function creates a separate summary table for each level of a grouping variable. The following code groups the summary table by the variable “team”.

Create the summary table grouped by team:

describeBy(df, group=df$team, fast=TRUE)

Descriptive statistics by group

group: P1
         vars n mean    sd min  max range    se
team        1 4  NaN    NA Inf -Inf  -Inf    NA
points      2 4  205 36.91 150  229    79 18.45
rebounds    3 4   24  9.83  15   36    21  4.92
steals      4 4   93 70.22  11  152   141 35.11
-------------------------------------------------------------
group: P2
         vars n   mean     sd min  max range    se
team        1 3    NaN     NA Inf -Inf  -Inf    NA
points      2 3 320.67 105.31 211  421   210 60.80
rebounds    3 3  20.67   7.23  16   29    13  4.18
steals      4 3  79.00   6.00  73   85    12  3.46

The output displays the summary statistics for each of the two teams in the data frame.
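As a quick cross-check of the grouped means, the same figures can be reproduced with aggregate() from base R. This is an independent sketch, not part of the psych workflow, and group_means is just an illustrative name.

```r
df <- data.frame(team=c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P1'),
                 points=c(150, 222, 229, 421, 330, 211, 219),
                 rebounds=c(17, 28, 36, 16, 17, 29, 15),
                 steals=c(11, 151, 152, 73, 85, 79, 58))

# Mean of each numeric column within each team
group_means <- aggregate(cbind(points, rebounds, steals) ~ team,
                         data = df, FUN = mean)
group_means
```

The P1 row should show means of 205, 24 and 93, in line with the "mean" column of the P1 group above.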

Copyright © 2023 Data Science Tutorials.

Powered by PressBook News WordPress theme