Compute summary statistics for one or multiple numeric variables.

get_summary_stats(data, ..., type = c("full", "common", "robust",
"five_number", "mean_sd", "mean_se", "mean_ci", "median_iqr",
"median_mad", "quantile", "mean", "median", "min", "max"), show = NULL,
probs = seq(0, 1, 0.25))

## Arguments

data a data frame (optional) One or more unquoted expressions (or variable names) separated by commas. Used to select a variable of interest. If no variable is specified, then the summary statistics of all numeric variables in the data frame is computed. type of summary statistics. Possible values include: "full", "common", "robust", "five_number", "mean_sd", "mean_se", "mean_ci", "median_iqr", "median_mad", "quantile", "mean", "median", "min", "max" a character vector specifying the summary statistics you want to show. Example: show = c("n", "mean", "sd"). This is used to filter the output after computation. numeric vector of probabilities with values in [0,1]. Used only when type = "quantile".

## Value

A data frame containing descriptive statistics, such as:

• n: the number of individuals

• min: minimum

• max: maximum

• median: median

• mean: mean

• q1, q3: the first and the third quartile, respectively.

• iqr: interquartile range

• mad: median absolute deviation (see ?MAD)

• sd: standard deviation of the mean

• se: standard error of the mean

• ci: 95 percent confidence interval of the mean

## Examples

# Full summary statistics
data("ToothGrowth")
ToothGrowth %>% get_summary_stats(len)#> # A tibble: 1 x 13
#>   variable     n   min   max median    q1    q3   iqr   mad  mean    sd    se
#>   <chr>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 len         60   4.2  33.9   19.2  13.1  25.3  12.2  9.04  18.8  7.65 0.988
#> # … with 1 more variable: ci <dbl>
# Summary statistics of grouped data
# Show only common summary
ToothGrowth %>%
group_by(dose, supp) %>%
get_summary_stats(len, type = "common")#> # A tibble: 6 x 12
#>   supp   dose variable     n   min   max median   iqr  mean    sd    se    ci
#>   <fct> <dbl> <chr>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 OJ      0.5 len         10   8.2  21.5  12.2   6.48 13.2   4.46 1.41   3.19
#> 2 VC      0.5 len         10   4.2  11.5   7.15  4.95  7.98  2.75 0.869  1.96
#> 3 OJ      1   len         10  14.5  27.3  23.4   5.35 22.7   3.91 1.24   2.80
#> 4 VC      1   len         10  13.6  22.5  16.5   2.02 16.8   2.52 0.795  1.80
#> 5 OJ      2   len         10  22.4  30.9  26.0   2.5  26.1   2.66 0.84   1.90
#> 6 VC      2   len         10  18.5  33.9  26.0   5.42 26.1   4.80 1.52   3.43
# Robust summary statistics
ToothGrowth %>% get_summary_stats(len, type = "robust")#> # A tibble: 1 x 4
#>   variable     n median   iqr
#>   <chr>    <dbl>  <dbl> <dbl>
#> 1 len         60   19.2  12.2
# Five number summary statistics
ToothGrowth %>% get_summary_stats(len, type = "five_number")#> # A tibble: 1 x 7
#>   variable     n   min   max    q1 median    q3
#>   <chr>    <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>
#> 1 len         60   4.2  33.9  13.1   19.2  25.3
# Compute only mean and sd
ToothGrowth %>% get_summary_stats(len, type = "mean_sd")#> # A tibble: 1 x 4
#>   variable     n  mean    sd
#>   <chr>    <dbl> <dbl> <dbl>
#> 1 len         60  18.8  7.65
# Compute full summary statistics but show only mean, sd, median, iqr
ToothGrowth %>%
get_summary_stats(len, show = c("mean", "sd", "median", "iqr"))#> # A tibble: 1 x 6
#>   variable     n  mean    sd median   iqr
#>   <chr>    <dbl> <dbl> <dbl>  <dbl> <dbl>
#> 1 len         60  18.8  7.65   19.2  12.2