---
title: "pairTtest"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{pairTtest}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(coreStatsNMR)
library(dplyr)

set.seed(314)
```

`pairTtest` is a function for testing pairs of group means for statistically significant differences. It requires three inputs: a dataframe, the name of a numeric column whose means are compared, and the name of a column that defines the groups. Each pair of groups is then compared with a t-test.

## Sample data

For testing, we use the `airquality` dataset that ships with R. It contains daily air quality measurements, including ozone levels, taken in New York from May through September 1973.

```{r airquality}
glimpse(airquality)
```

We add a random bin (`rand_bin`) as a placeholder categorical variable standing in for different data collection periods, along with a random `weight` column that the weighted tests use later on. We then compare mean ozone levels across two groupings:

- `Month`
- `rand_bin`

This way we can look at how mean ozone levels vary from month to month and across the random bins ("data collection periods"):

```{r test_data, echo=TRUE}
# Add a random grouping variable and random weights to the built-in data
test_data <- airquality %>% 
  mutate(rand_bin = as.character(sample(1:3, n(), replace = TRUE)),
         weight = sample(c(0.5, 1, 2), n(), replace = TRUE))

pairT <- pairTtest(test_data,
                   "Ozone", "Month",
                   alpha = 0.05, n.min = 4)
```

We can inspect the first rows of the pairwise t-test output in table form.

```{r pairTtest_sample}
pairT %>% 
  head(8) %>% 
  knitr::kable()
```

A quick glance at the `pairT` table above suggests that many of the statistically significant differences in ozone levels (`r nrow(pairT[pairT$sig,])` rows) arise because the middle months (July and August) differ from the months bookending the dataset (May and September).

```{r pairTtest_sig}
filter(pairT, sig) %>%
  head(8) %>% 
  knitr::kable()
```
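
As a rough cross-check, base R offers `stats::pairwise.t.test()`, which runs the same kind of month-by-month comparison. This is only a sketch: it is not necessarily how `pairTtest` is implemented, and the choices of `pool.sd = FALSE` (a separate Welch test per pair) and Holm adjustment are assumptions made for illustration, so its p-values need not match the `pairT` output.

```{r base_pairwise_sketch}
# Pairwise Welch t-tests on Ozone between all pairs of months, base R only.
# pool.sd = FALSE and p.adjust.method = "holm" are illustrative assumptions,
# not settings taken from pairTtest.
base_pw <- pairwise.t.test(
  x = airquality$Ozone,
  g = airquality$Month,
  p.adjust.method = "holm",
  pool.sd = FALSE
)

# Lower-triangular matrix of adjusted p-values (rows and columns are months)
round(base_pw$p.value, 4)
```

Each cell holds the adjusted p-value for one pair of months, which gives a quick way to sanity-check which pairs stand out.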

Next, we run `pairTtest` again, this time grouping by our `rand_bin` variable instead of the months in the dataset. Perhaps unsurprisingly, there are no significant differences in means across these randomly assigned groups.

```{r rand_bin}
pairTtest(test_data, "Ozone", "rand_bin") %>%
  head(8) %>% 
  knitr::kable()
```

## Weighted tests

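The code chunk below applies `wtdPairTtest` to the test data, comparing weighted mean ozone levels between months, with each observation weighted by our randomly assigned `weight` column. We run it twice: once with `p.adjust.method` left at its default and once with `p.adjust.method = "none"`.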

```{r pairTtest_0}
pairT_wtd <- wtdPairTtest(test_data,
                          "Ozone", "Month", "weight",
                          alpha = 0.05, n.min = 4)

pairT_wtd_noAdj <- wtdPairTtest(test_data,
                                "Ozone", "Month", "weight",
                                p.adjust.method = "none",
                                alpha = 0.05, n.min = 4)
```
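
To get a feel for what the weights change, it can help to compare weighted and unweighted group means directly. The sketch below uses base R's `weighted.mean()` on a copy of `airquality`; the `aq_demo` dataframe and its deterministic `weight` column are hypothetical stand-ins introduced here, and the code shows only the group means, not the test statistic that `wtdPairTtest` computes.

```{r wtd_means_sketch}
# Illustrative weights only: a deterministic cycle of 0.5, 1, 2.
# `aq_demo` and its `weight` column are hypothetical, not package objects.
aq_demo <- airquality
aq_demo$weight <- rep_len(c(0.5, 1, 2), nrow(aq_demo))

by_month <- split(aq_demo, aq_demo$Month)

# Weighted mean of Ozone within each month; na.rm = TRUE drops missing
# Ozone values together with their weights.
wtd_means <- sapply(
  by_month,
  function(d) weighted.mean(d$Ozone, d$weight, na.rm = TRUE)
)

# Unweighted means for comparison
raw_means <- sapply(by_month, function(d) mean(d$Ozone, na.rm = TRUE))

round(cbind(weighted = wtd_means, unweighted = raw_means), 2)
```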

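We can inspect both sets of weighted results in table form: first `pairT_wtd`, which leaves `p.adjust.method` at its default, then `pairT_wtd_noAdj`, which applies no adjustment.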

```{r pairTtest_sample_wtd}
pairT_wtd %>% 
  head(8) %>% 
  knitr::kable()

pairT_wtd_noAdj %>% 
  head(8) %>% 
  knitr::kable()
```

And just the significant results from each weighted run, with the `estimate` through `df` columns dropped for brevity:

```{r}
filter(pairT_wtd, sig) %>%
  select(-c(estimate:df)) %>% 
  head(8) %>% 
  knitr::kable()

filter(pairT_wtd_noAdj, sig) %>%
  select(-c(estimate:df)) %>% 
  head(8) %>% 
  knitr::kable()
```
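
The adjusted and unadjusted runs can flag different pairs because correcting for multiple comparisons raises p-values. A small sketch with base R's `p.adjust()` shows the effect. The p-values below are made up for illustration, and Holm is used as an example method rather than as a statement about the default in `wtdPairTtest`.

```{r p_adjust_sketch}
# Hypothetical raw p-values from five pairwise comparisons (not real results)
p_raw <- c(0.001, 0.012, 0.03, 0.04, 0.2)

# Holm adjustment; method = "none" would return p_raw unchanged
p_holm <- p.adjust(p_raw, method = "holm")

# Number of comparisons flagged at alpha = 0.05 before and after adjustment
alpha <- 0.05
c(unadjusted = sum(p_raw < alpha), holm = sum(p_holm < alpha))
```

Fewer comparisons clear the threshold after adjustment, which is the pattern to expect when comparing the adjusted and unadjusted tables.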