wtdFreqTable

wtdFreqTable is a function for calculating the frequency of any number of groups in a dataset, with or without weights. It takes up to 4 input values.

Sample data

For testing, we use the mtcars dataset that comes stock with R. It contains data from the 1974 Motor Trend US magazine, with fuel economy and other statistics for a set of 32 cars. It looks something like this:

library(dplyr)

glimpse(mtcars)
#> Rows: 32
#> Columns: 11
#> $ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
#> $ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
#> $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
#> $ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
#> $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
#> $ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
#> $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
#> $ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
#> $ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
#> $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
#> $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…

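We add random weights to the data for the tests below: each car is assigned at random to one of three bins, and each bin gets a weight drawn from a normal distribution with mean 1 and standard deviation 0.333.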

set.seed(1234)

rand_wt <- data.frame(rand_bin = 1:3,
                      rand_wt = rnorm(3, mean = 1, sd = 0.333))

test_data <- mtcars %>% 
  mutate(index = row_number(),
         rand_bin = sample(1:3, nrow(mtcars), replace = TRUE)) %>% 
  merge(rand_wt,
        by = "rand_bin")
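A quick sanity check on the merged data can catch join mistakes before any frequencies are computed. The sketch below rebuilds test_data with base R only (an assumption made so the block runs without dplyr; `transform()` stands in for `mutate()`), then confirms that the merge kept all 32 rows and that every bin maps to a single weight.

```r
set.seed(1234)

rand_wt <- data.frame(rand_bin = 1:3,
                      rand_wt = rnorm(3, mean = 1, sd = 0.333))

# Base R stand-in for the mutate() call
test_data <- transform(mtcars,
                       index = seq_len(nrow(mtcars)),
                       rand_bin = sample(1:3, nrow(mtcars), replace = TRUE))
test_data <- merge(test_data, rand_wt, by = "rand_bin")

# merge() is an inner join by default, so a bin missing from rand_wt
# would silently drop rows
stopifnot(nrow(test_data) == nrow(mtcars))

# Each bin should map to exactly one weight
weights_per_bin <- tapply(test_data$rand_wt, test_data$rand_bin,
                          function(w) length(unique(w)))
stopifnot(all(weights_per_bin == 1))

# Number of cars that landed in each bin
bin_counts <- table(test_data$rand_bin)
print(bin_counts)
```

The exact bin counts depend on the R version's sampling algorithm, so none are listed here.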

Changing weights

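The code chunks below apply wtdFreqTable to the test data, comparing frequencies when the totals are weighted either by miles per gallon (mpg) or by the random weight variable we created (rand_wt). In each table, the columns are cyl values, the rows are gear values, and the n row gives the unweighted count for each cyl group. Switching totWeightVar changes only the weighted overall frequencies in the rightmost column; the in-group cells change only when inGroupWeightVar is supplied.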

wtdFreqTable does not adjust the weights so that each group's weighted count matches its unweighted count, hence the wonky results when weighting by mpg and hp. In the outputs below, the weighted totals still sum to 32, the number of rows.

wtdFreqTable(test_data, x = "cyl", y = "gear", totWeightVar = "mpg") %>% 
    knitr::kable()
#> Warning in wtdFreqTable.data.frame(test_data, x = "cyl", y = "gear", totWeightVar = "mpg"): Using placeholder in-group weights of 1
| gear |  4 |  6 |  8 |    Total |
|:-----|---:|---:|---:|---------:|
| n    | 11 |  7 | 14 | 32.00000 |
| 3    |  1 |  2 | 12 | 12.02551 |
| 4    |  8 |  4 | NA | 14.65360 |
| 5    |  2 |  1 |  2 |  5.32089 |
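With placeholder in-group weights of 1, the in-group cells are plain counts, so they can be cross-checked against base R. A minimal sketch using `table()` on mtcars follows (the merge above does not change cyl or gear, so mtcars gives the same counts as test_data). Note that base R reports the empty 4-gear, 8-cylinder cell as 0, where the output above shows NA.

```r
# Unweighted counts of gear (rows) by cyl (columns)
counts <- table(gear = mtcars$gear, cyl = mtcars$cyl)
print(counts)

# Column totals correspond to the n row of the frequency table
n_by_cyl <- colSums(counts)
print(n_by_cyl)
```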
  
wtdFreqTable(test_data, x = "cyl", y = "gear", totWeightVar = "mpg", inGroupWeightVar = "rand_wt", accuracy = 0.01) %>%
    knitr::kable()
| gear |    4 |    6 |     8 | Total |
|:-----|-----:|-----:|------:|------:|
| n    |   11 |    7 |    14 |    32 |
| 3    | 0.54 | 2.23 | 12.44 | 12.03 |
| 4    | 7.78 | 4.01 |    NA | 14.65 |
| 5    | 1.78 | 0.99 |  2.23 |  5.32 |
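The in-group cells in this table sum to 32.00, the number of rows, which is consistent with summing the in-group weights per cell and then rescaling by one overall factor. The sketch below shows that calculation in base R. It is an inference from the printed output rather than a copy of the function's internals, and `cell_weighted_counts` is a hypothetical helper name used only for illustration.

```r
# Rebuild the test data with base R so the block runs on its own
set.seed(1234)
rand_wt <- data.frame(rand_bin = 1:3,
                      rand_wt = rnorm(3, mean = 1, sd = 0.333))
test_data <- merge(
  transform(mtcars, rand_bin = sample(1:3, nrow(mtcars), replace = TRUE)),
  rand_wt, by = "rand_bin")

# Hypothetical helper: weighted cell counts rescaled to the number of rows
cell_weighted_counts <- function(data, row_var, col_var, weight_var) {
  # Sum of weights in each cell; empty cells come back as NA
  cell_sums <- tapply(data[[weight_var]],
                      list(data[[row_var]], data[[col_var]]),
                      sum)
  # Rescale so that all cells together sum to nrow(data)
  cell_sums * nrow(data) / sum(data[[weight_var]])
}

cells <- cell_weighted_counts(test_data, "gear", "cyl", "rand_wt")
print(round(cells, 2))
```

Because the random bins depend on the R version's sampling algorithm, the printed cells may differ from the table above even if the arithmetic is the same.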

wtdFreqTable(test_data, x = "cyl", y = "gear", totWeightVar = "rand_wt") %>%
    knitr::kable()
#> Warning in wtdFreqTable.data.frame(test_data, x = "cyl", y = "gear", totWeightVar = "rand_wt"): Using placeholder in-group weights of 1
| gear |  4 |  6 |  8 |    Total |
|:-----|---:|---:|---:|---------:|
| n    | 11 |  7 | 14 | 32.00000 |
| 3    |  1 |  2 | 12 | 15.21235 |
| 4    |  8 |  4 | NA | 11.78520 |
| 5    |  2 |  1 |  2 |  5.00245 |

wtdFreqTable(test_data, x = "cyl", y = "gear", totWeightVar = "rand_wt", inGroupWeightVar = "rand_wt", accuracy = 0.01) %>%
    knitr::kable()
| gear |    4 |    6 |     8 | Total |
|:-----|-----:|-----:|------:|------:|
| n    |   11 |    7 |    14 |    32 |
| 3    | 0.54 | 2.23 | 12.44 | 15.21 |
| 4    | 7.78 | 4.01 |    NA | 11.79 |
| 5    | 1.78 | 0.99 |  2.23 |  5.00 |

The tot.label argument renames the total column, here to "Statewide", with hp as the total weight:

wtdFreqTable(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", tot.label = "Statewide") %>%
    knitr::kable()
#> Warning in wtdFreqTable.data.frame(mtcars, x = "cyl", y = "gear", totWeightVar = "hp", : Using placeholder in-group weights of 1
| gear |  4 |  6 |  8 | Statewide |
|:-----|---:|---:|---:|----------:|
| n    | 11 |  7 | 14 | 32.000000 |
| 3    |  1 |  2 | 12 | 18.011078 |
| 4    |  8 |  4 | NA |  7.321687 |
| 5    |  2 |  1 |  2 |  6.667235 |
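The Statewide column above can be reproduced by hand: each value matches that gear's share of total hp, multiplied by the 32 rows. A minimal base R sketch of that arithmetic follows. It is inferred from the printed numbers, not taken from the function's source.

```r
# Sum of the weighting variable within each gear
hp_by_gear <- tapply(mtcars$hp, mtcars$gear, sum)

# Share of the total weight, scaled to the number of rows
weighted_total <- hp_by_gear / sum(mtcars$hp) * nrow(mtcars)
print(round(weighted_total, 6))
# gear 3: 18.011078, gear 4: 7.321687, gear 5: 6.667235
```

This explains why the 15 three-gear cars count for about 18 under hp weighting: they account for 2642 of the 4694 total horsepower, a larger share than their share of rows.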