`wtdPropTable` is a function for calculating the proportion of a dataset
present in any number of groups, with or without weights. It takes up to
4 input values: a dataframe, two variables for grouping the dataset, and
an optional column of weights.
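To make the idea of a weighted proportion concrete, here is a minimal base R sketch. The helper `wtd_share` is hypothetical and is not part of the package; it only illustrates the arithmetic (sum of weights in a group divided by the sum of all weights), and we are not claiming it matches the internals of `wtdPropTable`.

```r
# Hypothetical helper (not part of the package): weighted share of each group.
# With the default weights of 1 it reduces to plain, unweighted proportions.
wtd_share <- function(group, w = rep(1, length(group))) {
  totals <- tapply(w, group, sum)  # sum of the weights within each group
  totals / sum(w)                  # normalise so the shares add up to 1
}

grp <- c("a", "a", "b", "c")
wtd_share(grp)                     # unweighted: a = 0.50, b = 0.25, c = 0.25
wtd_share(grp, w = c(1, 1, 2, 4))  # weighted:   a = 0.25, b = 0.25, c = 0.50
```

Giving a row a larger weight makes it count for more of the total, which is why group `c` rises from a quarter to a half in the second call.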
For testing, we use the `mtcars` dataset that comes stock with R. It
contains data from the 1974 Motor Trend US magazine, with fuel economy
and other statistics for a set of 32 cars. It looks something like this:
glimpse(mtcars)
#> Rows: 32
#> Columns: 11
#> $ mpg <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
#> $ cyl <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
#> $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
#> $ hp <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
#> $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
#> $ wt <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
#> $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
#> $ vs <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
#> $ am <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
#> $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
#> $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…
We add random weights to the data for testing below.
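The chunk that builds the test data is not shown here, so the following is only a sketch of one way to do it. The seed, the uniform distribution, and its range are our assumptions; only the names `test_data` and `rand_wt` come from the calls further down, and weights drawn this way will not reproduce the exact figures in the tables below.

```r
# Assumption: the seed, distribution and range are illustrative only;
# the weights actually used for the tables in this document are not shown.
set.seed(42)

test_data <- mtcars                              # start from the stock dataset
test_data$rand_wt <- runif(nrow(test_data),      # one random weight per car
                           min = 0.5, max = 1.5) # strictly positive weights

summary(test_data$rand_wt)
```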
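The code chunk below applies `wtdPropTable` to the test data, comparing
proportions when weighting either by horsepower (`hp`) or the random
weight variable we created (`rand_wt`). Switching the weighting scheme
only changes the weighted overall proportions (in the rightmost `Total`
column). The per-cylinder columns stay the same because, as the warning
notes, placeholder in-group weights of 1 are used in both calls.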
wtdPropTable(test_data, x = "cyl", y = "gear", totWeightVar = "hp") |>
  knitr::kable()
#> Warning in wtdPropTable.data.frame(test_data, x = "cyl", y = "gear", totWeightVar = "hp"): Using placeholder in-group weights of 1
gear | 4 | 6 | 8 | Total |
---|---|---|---|---|
n | 11 | 7 | 14 | 32 |
3 | 9.1% | 28.6% | 85.7% | 56.3% |
4 | 72.7% | 57.1% | NA | 22.9% |
5 | 18.2% | 14.3% | 14.3% | 20.8% |
wtdPropTable(test_data, x = "cyl", y = "gear", totWeightVar = "rand_wt") |>
  knitr::kable()
#> Warning in wtdPropTable.data.frame(test_data, x = "cyl", y = "gear", totWeightVar = "rand_wt"): Using placeholder in-group weights of 1
gear | 4 | 6 | 8 | Total |
---|---|---|---|---|
n | 11 | 7 | 14 | 32 |
3 | 9.1% | 28.6% | 85.7% | 47.5% |
4 | 72.7% | 57.1% | NA | 36.8% |
5 | 18.2% | 14.3% | 14.3% | 15.6% |
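As a cross-check on the `hp`-weighted table above, the same figures can be approximated in base R. This is a sketch of the arithmetic the output appears to follow, not the function's actual implementation: the per-cylinder columns are plain column proportions, and the `Total` column is each gear's share of total horsepower. It uses `mtcars` directly, since `hp` is unchanged in the test data. One visible difference is that `prop.table()` reports the empty cell (8 cylinders, 4 gears) as 0 rather than `NA`.

```r
# Unweighted within-group shares: how gears are distributed inside each
# cylinder group (each column sums to 1).
within_cyl <- prop.table(table(gear = mtcars$gear, cyl = mtcars$cyl), margin = 2)
round(100 * within_cyl, 1)

# hp-weighted overall shares: horsepower summed per gear, divided by the
# total horsepower of all 32 cars.
hp_total <- tapply(mtcars$hp, mtcars$gear, sum) / sum(mtcars$hp)
round(100 * hp_total, 1)
# gear 3: 56.3, gear 4: 22.9, gear 5: 20.8
```

Replacing `mtcars$hp` with another weight column changes only the second result, which matches the behaviour described above.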
wtdPropTable(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", pct_format = FALSE) |>
  mutate(across(2:5, as.numeric)) |>
  mutate(across(2:5, ~ifelse(gear == "n", round(.x, 0), round(.x, 2)) |>
                  as.character())) |>
  knitr::kable(digits = 2, drop0trailing = TRUE)
#> Warning in wtdPropTable.data.frame(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", : Using placeholder in-group weights of 1
gear | 4 | 6 | 8 | Total |
---|---|---|---|---|
n | 11 | 7 | 14 | 32 |
3 | 0.09 | 0.29 | 0.86 | 0.56 |
4 | 0.73 | 0.57 | NA | 0.23 |
5 | 0.18 | 0.14 | 0.14 | 0.21 |
wtdPropTable(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", pct_format = TRUE, accuracy = 0.1) |>
  knitr::kable()
#> Warning in wtdPropTable.data.frame(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", : Using placeholder in-group weights of 1
gear | 4 | 6 | 8 | Total |
---|---|---|---|---|
n | 11 | 7 | 14 | 32 |
3 | 9.1% | 28.6% | 85.7% | 56.3% |
4 | 72.7% | 57.1% | NA | 22.9% |
5 | 18.2% | 14.3% | 14.3% | 20.8% |
wtdPropTable(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", accuracy = 1) |>
  knitr::kable()
#> Warning in wtdPropTable.data.frame(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", : Using placeholder in-group weights of 1
gear | 4 | 6 | 8 | Total |
---|---|---|---|---|
n | 11 | 7 | 14 | 32 |
3 | 9% | 29% | 86% | 56% |
4 | 73% | 57% | NA | 23% |
5 | 18% | 14% | 14% | 21% |
wtdPropTable(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", tot.label = "Statewide") |>
  knitr::kable()
#> Warning in wtdPropTable.data.frame(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", : Using placeholder in-group weights of 1
gear | 4 | 6 | 8 | Statewide |
---|---|---|---|---|
n | 11 | 7 | 14 | 32 |
3 | 9.1% | 28.6% | 85.7% | 56.3% |
4 | 72.7% | 57.1% | NA | 22.9% |
5 | 18.2% | 14.3% | 14.3% | 20.8% |