wtdPropTable

wtdPropTable is a function for calculating the proportion of a dataset that falls into each of any number of groups, with or without weights. It takes up to four inputs: a data frame, two variables for grouping the dataset, and an optional column of weights.
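
As a rough illustration of what a weighted proportion means, the sketch below computes each group's share of the total weight in base R. The helper name `wtd_share` and the toy vectors are hypothetical and are not part of wtdPropTable; this shows one common definition, not necessarily the function's internals.

```r
# Hypothetical helper (not part of wtdPropTable): each group's share of the total weight.
# With the default weights of 1, this is the ordinary proportion of rows per group.
wtd_share <- function(group, weight = rep(1, length(group))) {
  tapply(weight, group, sum) / sum(weight)
}

grp <- c("a", "a", "b", "b")  # toy grouping variable
wts <- c(1, 1, 2, 4)          # toy weights

wtd_share(grp)       # unweighted: a = 0.50, b = 0.50
wtd_share(grp, wts)  # weighted:   a = 0.25, b = 0.75
```

The groups have equal row counts, so the unweighted shares are equal, while the weighted shares follow the weight totals (2 and 6 out of 8).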

Sample data

For testing, we use the mtcars dataset that ships with R. It contains data from the 1974 Motor Trend US magazine, with fuel economy and other statistics for a set of 32 cars. It looks something like this:

library(dplyr)  # provides glimpse(), mutate(), and row_number() used below

glimpse(mtcars)
#> Rows: 32
#> Columns: 11
#> $ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
#> $ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
#> $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
#> $ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
#> $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
#> $ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
#> $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
#> $ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
#> $ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
#> $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
#> $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…

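We add random weights to the data for testing below. Each car is assigned to one of three random bins, and each bin receives a weight drawn from a normal distribution (mean 1, sd 0.333).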

set.seed(1234)

rand_wt <- data.frame(rand_bin = 1:3,
                      rand_wt = rnorm(3, mean = 1, sd = 0.333))

test_data <- mtcars |> 
  mutate(index = row_number(),
         rand_bin = sample(1:3, nrow(mtcars), replace = TRUE)) |> 
  merge(rand_wt,
        by = "rand_bin")

Changing weights

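The code chunk below applies wtdPropTable to the test data, comparing proportions when weighting either by horsepower (hp) or by the random weight variable we created (rand_wt). Switching the weighting scheme changes only the weighted overall proportions (the rightmost column). The in-group columns stay the same because no in-group weights are supplied, so the function falls back to placeholder weights of 1, as the warning notes.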

wtdPropTable(test_data, x = "cyl", y = "gear", totWeightVar = "hp") |> 
    knitr::kable()
#> Warning in wtdPropTable.data.frame(test_data, x = "cyl", y = "gear", totWeightVar = "hp"): Using placeholder in-group weights of 1
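| gear | 4     | 6     | 8     | Total |
|:-----|:------|:------|:------|:------|
| n    | 11    | 7     | 14    | 32    |
| 3    | 9.1%  | 28.6% | 85.7% | 56.3% |
| 4    | 72.7% | 57.1% | NA    | 22.9% |
| 5    | 18.2% | 14.3% | 14.3% | 20.8% |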
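
The hp-weighted table above can be cross-checked by hand. The sketch below is an independent base R calculation on mtcars, not the function's own code. It assumes the overall column is each gear group's share of total horsepower and that the in-group columns are unweighted row shares within each cylinder class. The test data contains the same 32 cars as mtcars, so the hp shares are the same for both.

```r
# Overall column: share of total horsepower contributed by each gear group.
hp_by_gear <- tapply(mtcars$hp, mtcars$gear, sum)
wtd_total  <- hp_by_gear / sum(mtcars$hp)
round(100 * wtd_total, 1)

# In-group columns: unweighted share of each gear group within each cylinder class.
in_group <- prop.table(table(gear = mtcars$gear, cyl = mtcars$cyl), margin = 2)
round(100 * in_group, 1)
```

The first result comes out to 56.3, 22.9, and 20.8 for 3, 4, and 5 gears, which agrees with the overall column. Note that prop.table reports 0 for the empty 8-cylinder, 4-gear cell, where wtdPropTable shows NA.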

wtdPropTable(test_data, x = "cyl", y = "gear", totWeightVar = "rand_wt") |>
    knitr::kable()
#> Warning in wtdPropTable.data.frame(test_data, x = "cyl", y = "gear", totWeightVar = "rand_wt"): Using placeholder in-group weights of 1
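| gear | 4     | 6     | 8     | Total |
|:-----|:------|:------|:------|:------|
| n    | 11    | 7     | 14    | 32    |
| 3    | 9.1%  | 28.6% | 85.7% | 47.5% |
| 4    | 72.7% | 57.1% | NA    | 36.8% |
| 5    | 18.2% | 14.3% | 14.3% | 15.6% |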
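
Setting pct_format = FALSE returns raw proportions instead of formatted percentages. The example below rounds them to two decimal places for display.

wtdPropTable(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", pct_format = FALSE) |>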
    mutate(across(2:5, as.numeric)) |> 
    mutate(across(2:5, ~ifelse(gear == "n", round(.x, 0), round(.x, 2)) |> 
                      as.character())) |> 
    knitr::kable(digits = 2, drop0trailing = TRUE)
#> Warning in wtdPropTable.data.frame(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", : Using placeholder in-group weights of 1
| gear | 4    | 6    | 8    | Total |
|:-----|:-----|:-----|:-----|:------|
| n    | 11   | 7    | 14   | 32    |
| 3    | 0.09 | 0.29 | 0.86 | 0.56  |
| 4    | 0.73 | 0.57 | NA   | 0.23  |
| 5    | 0.18 | 0.14 | 0.14 | 0.21  |

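With pct_format = TRUE, the accuracy argument sets the rounding precision of the percentages. Here accuracy = 0.1 gives one decimal place.

wtdPropTable(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", pct_format = TRUE, accuracy = 0.1) |>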
    knitr::kable()
#> Warning in wtdPropTable.data.frame(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", : Using placeholder in-group weights of 1
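| gear | 4     | 6     | 8     | Total |
|:-----|:------|:------|:------|:------|
| n    | 11    | 7     | 14    | 32    |
| 3    | 9.1%  | 28.6% | 85.7% | 56.3% |
| 4    | 72.7% | 57.1% | NA    | 22.9% |
| 5    | 18.2% | 14.3% | 14.3% | 20.8% |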

With accuracy = 1, percentages are rounded to whole numbers.

wtdPropTable(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", accuracy = 1) |>
    knitr::kable()
#> Warning in wtdPropTable.data.frame(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", : Using placeholder in-group weights of 1
| gear | 4   | 6   | 8   | Total |
|:-----|:----|:----|:----|:------|
| n    | 11  | 7   | 14  | 32    |
| 3    | 9%  | 29% | 86% | 56%   |
| 4    | 73% | 57% | NA  | 23%   |
| 5    | 18% | 14% | 14% | 21%   |

The tot.label argument renames the overall column, here from Total to Statewide.

wtdPropTable(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", tot.label = "Statewide") |>
    knitr::kable()
#> Warning in wtdPropTable.data.frame(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", : Using placeholder in-group weights of 1
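| gear | 4     | 6     | 8     | Statewide |
|:-----|:------|:------|:------|:----------|
| n    | 11    | 7     | 14    | 32        |
| 3    | 9.1%  | 28.6% | 85.7% | 56.3%     |
| 4    | 72.7% | 57.1% | NA    | 22.9%     |
| 5    | 18.2% | 14.3% | 14.3% | 20.8%     |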