Proportion Test

Performs proportion tests to either evaluate the homogeneity of proportions (probabilities of success) in several groups or to test that the proportions are equal to certain given values.

These functions are wrappers around the R base function prop.test(), with the added advantage of performing pairwise and row-wise z-tests of two proportions. These serve as post-hoc tests following a significant chi-square test of homogeneity for 2xc and rx2 contingency tables.

See the Datanovia tutorial Proportion Z-Test in R for a worked walkthrough.

prop_test(
  x,
  n,
  p = NULL,
  alternative = c("two.sided", "less", "greater"),
  correct = TRUE,
  conf.level = 0.95,
  detailed = FALSE
)

pairwise_prop_test(xtab, p.adjust.method = "holm", ...)

row_wise_prop_test(xtab, p.adjust.method = "holm", detailed = FALSE, ...)

Arguments

x: a vector of counts of successes, a one-dimensional table with two entries, or a two-dimensional table (or matrix) with 2 columns, giving the counts of successes and failures, respectively.
n: a vector of counts of trials; ignored if x is a matrix or a table.
p: a vector of probabilities of success. The length of p must be the same as the number of groups specified by x, and its elements must be greater than 0 and less than 1.
alternative: a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter. Only used for testing the null that a single proportion equals a given value, or that two proportions are equal; ignored otherwise.
correct: a logical indicating whether Yates' continuity correction should be applied where possible.
conf.level: confidence level of the returned confidence interval. Must be a single number between 0 and 1. Only used when testing the null that a single proportion equals a given value, or that two proportions are equal; ignored otherwise.
detailed: logical value. Default is FALSE. If TRUE, a detailed result is shown.
xtab: a cross-tabulation (or contingency table) with two columns and multiple rows (rx2 design). The columns give the counts of successes and failures respectively.
p.adjust.method: method to adjust p values for multiple comparisons. Used when pairwise comparisons are performed. Allowed values include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none". If you don't want to adjust the p value (not recommended), use p.adjust.method = "none".
...: Other arguments passed to the function prop_test().
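
To make the effect of p.adjust.method concrete, here is a minimal sketch using base R's p.adjust() rather than the functions documented on this page. It assumes the adjustment is the standard one from stats::p.adjust(). The input p-values are the rounded, unadjusted values printed in the smokers pairwise example of the Examples section, so the adjusted values may differ from the printed ones in the last digit. The variable names are illustrative only.

```r
# Unadjusted p-values for six pairwise comparisons
# (rounded values copied from the smokers example output)
p_raw <- c(1.25e-10, 3.09e-13, 1, 6.41e-6, 7.01e-2, 3.06e-2)

# Holm adjustment (the default) and no adjustment, for comparison
p_holm <- p.adjust(p_raw, method = "holm")
p_none <- p.adjust(p_raw, method = "none")

# Side-by-side view: adjusted p-values are never smaller than the raw ones
data.frame(p = p_raw, p.holm = p_holm, p.none = p_none)
```

With "holm", the smallest p-value is multiplied by the number of comparisons, the next smallest by one less, and so on, with results capped at 1.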

Value

Returns a data frame with some of the following columns:

n: the number of participants.
group: the categories in the row-wise proportion tests.
statistic: the value of Pearson's chi-squared test statistic.
df: the degrees of freedom of the approximate chi-squared distribution of the test statistic.
p: p-value.
p.adj: the adjusted p-value.
method: the statistical test used.
p.signif, p.adj.signif: the significance level of p-values and adjusted p-values, respectively.
estimate: a vector with the sample proportions x/n.
estimate1, estimate2: the proportion in each of the two populations.
alternative: a character string describing the alternative hypothesis.
conf.low, conf.high: lower and upper bounds of a confidence interval. This is an interval for the true proportion if there is one group, or for the difference in proportions if there are 2 groups and p is not given; it is NULL otherwise. When it is not NULL, the returned confidence interval has an asymptotic confidence level as specified by conf.level, and is appropriate to the specified alternative hypothesis.

The returned object has an attribute called args, which is a list holding the test arguments.
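
As a rough illustration of where the statistic, df and p columns come from, the sketch below recomputes the one-sample case from the Examples section by hand and compares it with base R's prop.test(). It assumes the default Yates' continuity correction (correct = TRUE). The names stat_manual and yates are illustrative and not part of the package.

```r
# One-sample case: 95 successes in 160 trials, null proportion 0.5
x <- 95
n <- 160
p0 <- 0.5

# Pearson's chi-squared statistic with Yates' continuity correction:
# 0.5 is subtracted from |observed - expected|, capped at that difference
yates <- min(0.5, abs(x - n * p0))
stat_manual <- (abs(x - n * p0) - yates)^2 / (n * p0 * (1 - p0))

# The same quantity from base R
res <- prop.test(x, n, p = p0, correct = TRUE)

stat_manual                                      # about 5.26
unname(res$statistic)                            # agrees with the manual value
pchisq(stat_manual, df = 1, lower.tail = FALSE)  # about 0.0219
```

The square root of this statistic is the absolute value of the continuity-corrected z statistic, which is why the test is described both as a z-test and as a chi-squared test.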

Functions

prop_test(): performs one-sample and two-sample z-tests of proportions. Wrapper around the function prop.test().
pairwise_prop_test(): performs pairwise comparisons between proportions, a post-hoc test following a significant chi-square test of homogeneity for a 2xc design. Wrapper around pairwise.prop.test().
row_wise_prop_test(): performs row-wise z-tests of two proportions, a post-hoc test following a significant chi-square test of homogeneity for an rx2 contingency table. The z-test of two proportions is calculated for each category (row).
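
The row-wise procedure can be sketched in base R as follows. This is not the package's implementation. It assumes that each row's counts are compared against the column totals with a two-proportion test, and that the raw p-values are then adjusted with p.adjust(). For the Titanic data used in the Examples section, this assumption reproduces the statistic printed there for the first class (about 121). The names counts, col_totals and row_tests are illustrative.

```r
# Titanic counts by class and gender, stored as a plain matrix
counts <- rbind(
  c(180, 145), c(179, 106),
  c(510, 196), c(862, 23)
)
dimnames(counts) <- list(
  Class = c("1st", "2nd", "3rd", "Crew"),
  Gender = c("Male", "Female")
)

# Column totals play the role of the number of trials for each gender
col_totals <- colSums(counts)

# For each row: two-proportion test of that row's counts against the totals
row_tests <- lapply(rownames(counts), function(cls) {
  prop.test(x = counts[cls, ], n = col_totals)
})
names(row_tests) <- rownames(counts)

# Collect statistics and p-values, then adjust for multiple comparisons
p_raw <- sapply(row_tests, function(res) res$p.value)
data.frame(
  group = rownames(counts),
  statistic = sapply(row_tests, function(res) unname(res$statistic)),
  p = p_raw,
  p.adj = p.adjust(p_raw, method = "holm"),
  row.names = NULL
)
```

Each test here asks whether the share of a given class differs between males and females, which is why the column totals, not the row totals, are used as denominators.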

Examples

# Comparing an observed proportion to an expected proportion
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
prop_test(x = 95, n = 160, p = 0.5, detailed = TRUE)
#> # A tibble: 1 × 11
#>       n    n1 estimate statistic      p    df conf.low conf.high method   
#> * <dbl> <dbl>    <dbl>     <dbl>  <dbl> <int>    <dbl>     <dbl> <chr>    
#> 1   160    95    0.594      5.26 0.0219     1    0.513     0.670 Prop test
#> # ℹ 2 more variables: alternative <chr>, p.signif <chr>

# Comparing two proportions
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Data: frequencies of smokers between two groups
xtab <- as.table(rbind(c(490, 10), c(400, 100)))
dimnames(xtab) <- list(
  group = c("grp1", "grp2"),
  smoker = c("yes", "no")
)
xtab
#>       smoker
#> group  yes  no
#>   grp1 490  10
#>   grp2 400 100
# compare the proportion of smokers
prop_test(xtab, detailed = TRUE)
#> # A tibble: 1 × 13
#>       n    n1    n2 estimate1 estimate2 statistic        p    df conf.low
#> * <dbl> <dbl> <dbl>     <dbl>     <dbl>     <dbl>    <dbl> <dbl>    <dbl>
#> 1  1000   500   500      0.98       0.8      80.9 2.36e-19     1    0.141
#> # ℹ 4 more variables: conf.high <dbl>, method <chr>, alternative <chr>,
#> #   p.signif <chr>

# Homogeneity of proportions between groups
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# H0: the proportion of smokers is the same in the four groups
# Ha: the proportion differs in at least one of the groups
#
# Data preparation
grp.size <- c( 106, 113, 156, 102 )
smokers  <- c( 50, 100, 139, 80 )
no.smokers <- grp.size - smokers
xtab <- as.table(rbind(
  smokers,
  no.smokers
))
dimnames(xtab) <- list(
  Smokers = c("Yes", "No"),
  Groups = c("grp1", "grp2", "grp3", "grp4")
)
xtab
#>        Groups
#> Smokers grp1 grp2 grp3 grp4
#>     Yes   50  100  139   80
#>     No    56   13   17   22

# Compare the proportions of smokers between groups
prop_test(xtab, detailed = TRUE)
#> # A tibble: 1 × 15
#>       n    n1    n2    n3    n4 estimate1 estimate2 estimate3 estimate4
#> * <dbl> <dbl> <dbl> <dbl> <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
#> 1   477   106   113   156   102     0.472     0.885     0.891     0.784
#> # ℹ 6 more variables: statistic <dbl>, p <dbl>, df <dbl>, method <chr>,
#> #   alternative <chr>, p.signif <chr>

# Pairwise comparison between groups
pairwise_prop_test(xtab)
#> # A tibble: 6 × 5
#>   group1 group2        p    p.adj p.adj.signif
#> * <chr>  <chr>     <dbl>    <dbl> <chr>       
#> 1 grp1   grp2   1.25e-10 6.23e-10 ****        
#> 2 grp1   grp3   3.09e-13 1.86e-12 ****        
#> 3 grp2   grp3   1.00e+ 0 1.00e+ 0 ns          
#> 4 grp1   grp4   6.41e- 6 2.56e- 5 ****        
#> 5 grp2   grp4   7.01e- 2 1.40e- 1 ns          
#> 6 grp3   grp4   3.06e- 2 9.19e- 2 ns          


# Pairwise proportion tests
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Data: Titanic
xtab <- as.table(rbind(
  c(122, 167, 528, 673),
  c(203, 118, 178, 212)
))
dimnames(xtab) <- list(
  Survived = c("No", "Yes"),
  Class = c("1st", "2nd", "3rd", "Crew")
)
xtab
#>         Class
#> Survived 1st 2nd 3rd Crew
#>      No  122 167 528  673
#>      Yes 203 118 178  212
# Compare survival proportions between classes
pairwise_prop_test(xtab)
#> # A tibble: 6 × 5
#>   group1 group2        p    p.adj p.adj.signif
#> * <chr>  <chr>     <dbl>    <dbl> <chr>       
#> 1 1st    2nd    3.13e- 7 9.38e- 7 ****        
#> 2 1st    3rd    2.55e-30 1.27e-29 ****        
#> 3 2nd    3rd    6.90e- 7 1.38e- 6 ****        
#> 4 1st    Crew   1.62e-35 9.73e-35 ****        
#> 5 2nd    Crew   1.94e- 8 7.75e- 8 ****        
#> 6 3rd    Crew   6.03e- 1 6.03e- 1 ns          

# Row-wise proportion tests
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Data: Titanic
xtab <- as.table(rbind(
  c(180, 145), c(179, 106),
  c(510, 196), c(862, 23)
))
dimnames(xtab) <- list(
  Class = c("1st", "2nd", "3rd", "Crew"),
  Gender = c("Male", "Female")
)
xtab
#>       Gender
#> Class  Male Female
#>   1st   180    145
#>   2nd   179    106
#>   3rd   510    196
#>   Crew  862     23
# Compare the proportion of males and females in each category
row_wise_prop_test(xtab)
#> # A tibble: 4 × 7
#>   group     n statistic    df        p    p.adj p.adj.signif
#> * <chr> <dbl>     <dbl> <dbl>    <dbl>    <dbl> <chr>       
#> 1 1st    2201     121.      1 3.40e-28 1.02e-27 ****        
#> 2 2nd    2201      47.8     1 4.65e-12 9.30e-12 ****        
#> 3 3rd    2201      24.9     1 6.18e- 7 6.18e- 7 ****        
#> 4 Crew   2201     308.      1 5.51e-69 2.20e-68 ****
