---
title: "makeWeights"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{makeWeights}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(coreStatsNMR)
library(dplyr)
library(data.table)
set.seed(1234)
```

`makeWeights` is a function for creating tables of weighting values. It takes 3 input values: a dataframe (or data.table), a column with sample counts, and a column with population counts. The dataset should contain one row for each unique group requiring a weight.

### Sample data

For testing, we use the `HairEyeColor` dataset that comes stock with R. It's a table containing counts (`Freq`) for each combination of three categorical variables: `Eye`, `Hair`, and `Sex`. It looks something like this:

```{r HairEyeColor}
HairEyeColor
```

We tweak it by adding a random population variable. Next, we'll apply the `makeWeights` function to it to look at what it adds onto our dataframe.

```{r test_data}
test_data <- HairEyeColor %>%
  data.frame() %>%
  mutate(Population = round(runif(length(HairEyeColor), 10000, 20000), 0))

test_dt <- data.table(test_data)

test_data %>% glimpse()
```

### Sample table summary

The code chunk below applies `makeWeights` to the test data, summarizing the proportion and population weights for each record, along with some (optional) intermediate values for checking the calculations. Let's look at the first few rows:

```{r makeWeights_0}
test_weights <- makeWeights(data = test_data,
                            sampleVal = "Freq",
                            populationVal = "Population",
                            checkCols = TRUE)

knitr::kable(head(test_weights), digits = 2)
```

The function calculates the proportion (`propWt`) which would transform the observed sample count (`Freq`) for a given row into a value (`sampleVal_wtd`) proportional to that row's share of the overall population.
It also produces the value (`populationWt`) that would scale the observed sample count up to the population count (`Population`) for that row. Along the way it produces some intermediate values as well, for checking results. Eventually, these checks will happen behind the scenes in a more automated fashion and will only appear if the user requests them.
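
To make the two weights concrete, the chunk below sketches how they could be computed by hand in base R. It is only an illustration based on the description above, not the package's actual implementation: the formulas are assumptions, and the `_manual` column names and the `manual` data frame are hypothetical names chosen so they cannot be confused with the columns `makeWeights` returns.

```{r manual_weights_sketch}
# Hypothetical hand calculation of the weights described above.
# The formulas are assumptions; this is NOT the makeWeights source code.
set.seed(1234)

# Rebuild a small test table: one row per Hair/Eye/Sex group
manual <- as.data.frame(HairEyeColor)
manual$Population <- round(runif(nrow(manual), 10000, 20000), 0)

# Each row's share of the overall population
pop_share <- manual$Population / sum(manual$Population)

# Sample count each row would have if the sample mirrored the population
manual$sampleVal_wtd_manual <- sum(manual$Freq) * pop_share

# Assumed proportion weight: Freq * propWt = sampleVal_wtd
manual$propWt_manual <- manual$sampleVal_wtd_manual / manual$Freq

# Assumed population weight: Freq * populationWt = Population
manual$populationWt_manual <- manual$Population / manual$Freq

# Sanity checks: proportion weights preserve the total sample size,
# and population weights reproduce each row's population count
stopifnot(
  isTRUE(all.equal(sum(manual$Freq * manual$propWt_manual), sum(manual$Freq))),
  isTRUE(all.equal(manual$Freq * manual$populationWt_manual, manual$Population))
)

head(manual)
```

Under these assumptions, the proportion-weighted counts still sum to the original sample size, while the population-weighted counts sum to the total population. If the `makeWeights` output differs from these hand-calculated columns, the function's own definitions take precedence.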