Data Science Tutorials

Descriptive Statistics in R

Posted on June 15 by Admin

Descriptive Statistics in R: A Step-by-Step Guide

Descriptive statistics are a crucial part of data analysis, as they provide a snapshot of the central tendency and variability of a dataset.

In base R, two functions cover most everyday needs: summary(), which reports a fixed set of statistics, and sapply(), which applies any statistic we choose to every column of a data frame.

In this article, we will explore how to use these functions to gain a deeper understanding of our data.

Method 1: Using the summary() Function

The summary() function is a simple and efficient way to calculate various descriptive statistics for each variable in a data frame. To use this function, simply call it on your data frame, like so:

summary(my_data)

For each numeric variable, summary() returns the minimum, first quartile, median, mean, third quartile, and maximum.

For example, let’s say we have the following data frame:

df <- data.frame(x=c(1, 4, 4, 5, 6, 7, 10, 12),
                 y=c(2, 2, 3, 3, 4, 5, 11, 11),
                 z=c(8, 9, 9, 9, 10, 13, 15, 17))

We can use the summary() function to calculate descriptive statistics for each variable:

summary(df)

This will output:

       x                y                z        
 Min.   : 1.000   Min.   : 2.000   Min.   : 8.00  
 1st Qu.: 4.000   1st Qu.: 2.750   1st Qu.: 9.00  
 Median : 5.500   Median : 3.500   Median : 9.50  
 Mean   : 6.125   Mean   : 5.125   Mean   :11.25  
 3rd Qu.: 7.750   3rd Qu.: 6.500   3rd Qu.:13.50  
 Max.   :12.000   Max.   :11.000   Max.   :17.00  
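
To see where these numbers come from, the values for a single column can be rebuilt by hand. The following is a minimal sketch, assuming the default settings of quantile(); the name manual is only illustrative.

```r
# Same values as the x column of the data frame above
x <- c(1, 4, 4, 5, 6, 7, 10, 12)

# quantile() with default settings gives the minimum, quartiles, median, and maximum
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))

# Place the mean between the median and the third quartile,
# in the same order that summary() prints its values
manual <- c(Min = q[[1]], Q1 = q[[2]], Median = q[[3]],
            Mean = mean(x), Q3 = q[[4]], Max = q[[5]])
print(manual)

# Compare the hand-built values with summary() on the same vector
print(all.equal(unname(manual), as.numeric(summary(x))))
```

This is a quick way to check what each row of the summary() output means before relying on it.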

Method 2: Using the sapply() Function

The sapply() function is a more flexible option for calculating descriptive statistics. It applies a function of our choice to each variable in the data frame and, where possible, simplifies the result to a vector or matrix.

For example, we can use the sapply() function to calculate the standard deviation of each variable:

sapply(df, sd, na.rm=TRUE)

This will output:

       x        y        z 
3.522884 3.758324 3.327376 
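
The same pattern can return several statistics at once. The sketch below assumes a helper named describe_column, which is hypothetical and not part of base R; because it returns a named vector, sapply() should simplify the result to a matrix with one column per variable.

```r
# Same data frame as above
df <- data.frame(x = c(1, 4, 4, 5, 6, 7, 10, 12),
                 y = c(2, 2, 3, 3, 4, 5, 11, 11),
                 z = c(8, 9, 9, 9, 10, 13, 15, 17))

# Hypothetical helper: returns three named statistics for one column
describe_column <- function(col) {
  c(mean   = mean(col, na.rm = TRUE),
    sd     = sd(col, na.rm = TRUE),
    median = median(col, na.rm = TRUE))
}

# One column per variable, one row per statistic
stats <- sapply(df, describe_column)
print(round(stats, 3))
```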

We can also use the sapply() function to calculate more complex descriptive statistics by defining a custom function within it.

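For example, let’s say we want to calculate the range (maximum minus minimum) of each variable. Extra arguments given to sapply() are passed on to the custom function, so na.rm=TRUE has to be set inside the function, on max() and min() themselves:

sapply(df, function(x) max(x, na.rm=TRUE) - min(x, na.rm=TRUE))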

This will output:

 x  y  z 
11  9  9 
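
The example data has no missing values, so the effect of na.rm is not visible in it. The sketch below uses a small hypothetical data frame, df_na, to show the difference; diff(range(...)) is used here as an equivalent way of writing maximum minus minimum.

```r
# Hypothetical data frame with one missing value in each column
df_na <- data.frame(x = c(1, NA, 12),
                    y = c(2, 11, NA))

# Without na.rm, max() and min() return NA when a column contains NA
range_default <- sapply(df_na, function(col) max(col) - min(col))
print(range_default)

# With na.rm = TRUE, missing values are dropped before the range is computed
range_na_rm <- sapply(df_na, function(col) diff(range(col, na.rm = TRUE)))
print(range_na_rm)
```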

Conclusion

In this article, we have explored two methods for calculating descriptive statistics in R: the summary() function and the sapply() function.

The summary() function provides a quick and easy way to calculate common descriptive statistics for each variable in a data frame.

The sapply() function offers more flexibility and allows us to define custom functions to calculate more complex descriptive statistics.

By using these functions effectively, we can gain a deeper understanding of our data and make more informed decisions about our analysis and visualization strategies.

Copyright © 2025 Data Science Tutorials.
