Correction for multiplicity in the ’emmeans’ package | R bloggers

Correction for multiplicity in the ’emmeans’ package | R bloggers

6 minutes, 13 seconds Read

In my recent book (see below), on pages 166 and earlier, I made the point that in pairwise comparisons and, more generally, whenever simultaneous statistical tests are performed, it is necessary to provide P-values ​​that take into account the family error rate, that is, the probability of committing at least one false rejection within the entire family of simultaneous tests (that is, adjusted P-values). In this context, it may be useful to recall that, for a single non-significant test, the error rate \(E_c\) is the chance of a wrong rejection for that one test (based on an uncorrected P-value), while the chance of at least one wrong rejection within a family of \(k\) comparisons are much higher.

In pairwise comparisons, a single test is usually based on the ratio of a difference to its standard error (a t-test), which is assumed to follow a univariate t-distribution when the null hypothesis is true. When multiple simultaneous t-tests are performed, the vector of all t-ratios can be assumed to follow a multivariate t-distribution, under the hypothesis that the null is true for all simultaneous tests (Bretz et al., 2011). Therefore, adjusted P values ​​can be obtained by using the likelihood function of a multivariate t-distribution instead of the simple univariate t-distribution.

As an example, let’s look at the ‘mixture data’ used in Chapter 9 of the main book. Three herbicide mixtures and an untreated control were tested for their weed control potential against an important weed in tomato, i.e. Black nightshade. In the code below, we load the data and fit a one-way ANOVA model using the weight of cannabis plants per pot as the response variable and the herbicide treatment as the explanatory factor. For simplicity, we omit the usual checks of the basic assumptions (see the main book). The ANOVA table shows that the treatment effect is significant and therefore we proceed to pairwise comparison of the treatment means. The P values ​​shown below do not take into account the within-family error rate, but only the comparison error rate; these P values ​​can be reproduced using the likelihood function of a univariate Student’s t distribution (pt() function in R).

library(statforbiology)
library(emmeans)
library(multcomp)
dataset <- getAgroData("mixture")
dataset$Treat <- factor(dataset$Treat)
model <- lm(Weight ~ Treat, data = dataset)
anova(model)
## Analysis of Variance Table
## 
## Response: Weight
##           Df  Sum Sq Mean Sq F value    Pr(>F)    
## Treat      3 1089.53  363.18  23.663 2.509e-05 ***
## Residuals 12  184.18   15.35                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
groupMeans <- emmeans(model, ~Treat)
tab <- contrast(groupMeans, method = "pairwise", adjust = "none")
tab
##  contrast                         estimate   SE df t.ratio p.value
##  Metribuzin__348 - Mixture_378        4.05 2.77 12   1.461  0.1697
##  Metribuzin__348 - Rimsulfuron_30    -7.68 2.77 12  -2.774  0.0168
##  Metribuzin__348 - Unweeded         -17.60 2.77 12  -6.352  <.0001
##  Mixture_378 - Rimsulfuron_30       -11.73 2.77 12  -4.235  0.0012
##  Mixture_378 - Unweeded             -21.64 2.77 12  -7.813  <.0001
##  Rimsulfuron_30 - Unweeded           -9.91 2.77 12  -3.578  0.0038
#
# The P-value is obtained from the univariate t distribution (two-tails test)
abst <- abs(as.data.frame(tab)$t.ratio)
2 * pt(abst, 12, lower.tail = FALSE)
## [1] 1.696785e-01 1.683167e-02 3.651239e-05 1.157189e-03 4.782986e-06
## [6] 3.794451e-03

To obtain family error rates, we need to switch from the univariate to the multivariate t-distribution. For example, let’s look at the first t-ratio in the previous code box (t = 1.461). We should ask ourselves, “what is the probability of obtaining a t-ratio as extreme as, or more extreme than, 1.461 from a multivariate t-distribution with six dimensions (i.e. the number of simultaneous tests)?”. In this calculation we must also take into account that the six tests are, at least to some extent, correlated, because they share some common elements, for example the same error term in the denominator. In the simplest case (homoscedasticity and balanced data), this correlation is equal to 0.5 for all pairwise comparisons.

In earlier times, when computing power was limited, calculating probabilities based on the multivariate t-distribution was a daunting task. However, for some specific cases (e.g. linear models with homoscedastic and balanced data), adjusted P values ​​can be obtained using the Studentised Range distribution (the so-called ‘tukey’ method), which is the default option in the contrast() function of the emmeans package, as shown in the following code box.

tab <- contrast(groupMeans, method = "pairwise")
# tab <- contrast(groupMeans, method = "pairwise", adjust = "tukey") # same as above
tab
##  contrast                         estimate   SE df t.ratio p.value
##  Metribuzin__348 - Mixture_378        4.05 2.77 12   1.461  0.4885
##  Metribuzin__348 - Rimsulfuron_30    -7.68 2.77 12  -2.774  0.0698
##  Metribuzin__348 - Unweeded         -17.60 2.77 12  -6.352  0.0002
##  Mixture_378 - Rimsulfuron_30       -11.73 2.77 12  -4.235  0.0055
##  Mixture_378 - Unweeded             -21.64 2.77 12  -7.813  <.0001
##  Rimsulfuron_30 - Unweeded           -9.91 2.77 12  -3.578  0.0173
## 
## P value adjustment: tukey method for comparing a family of 4 estimates
# The P-value is obtained from the Studentised Range Distribution (two-tails test)
abst <- abs(as.data.frame(tab)$t.ratio)
ptukey(sqrt(2) * abst, 4, 12, lower.tail = FALSE)
## [1] 4.884620e-01 6.981178e-02 1.853807e-04 5.501451e-03 2.473776e-05
## [6] 1.725725e-02

This simple method yields exact family error rates with balanced data – which represent the vast majority of designed field experiments in agriculture – and performs reasonably well in the presence of small degrees of imbalance. Within the framework of traditional multiple comparison testing procedures, the approach described above leads to the same results as Tukey’s HSD for balanced data and the Tukey-Kramer test for unbalanced data.

More recently, it has become possible to calculate probabilities directly from the multivariate t-distribution, which is particularly useful because it provides a more general approach to obtaining family error rates. This distribution is implemented in the ‘mvtnorm’ package via the pmvt() function. To perform the calculation, we must specify for each dimension the interval over which the probability is to be calculated (in this case the interval for the first t-ratio is \(\pm 1.461081\)), the number of degrees of freedom (12) and the correlation matrix of the linear combinations, which can be retrieved directly from the ’emmGrid’ object. The code below illustrates these calculations. The quantity ‘plev’ represents the probability of sampling within the interval (i.e. none of the six null hypotheses is incorrectly rejected), while the family error rate corresponds to the probability of sampling outside the interval (i.e. at least one null hypothesis is incorrectly rejected), which is obtained by subtraction.

library(mvtnorm)
t1 <- abs(as.data.frame(tab)$t.ratio)[1]
ncontr <- 6
corMat <- cov2cor(vcov(tab))
plev <- pmvt(lower = rep(-t1, ncontr), upper=rep(t1, ncontr), df = 12,
     corr = corMat)[1]
1 - plev
## [1] 0.4883843

In R, such an approximation can be obtained using the adjust = "mvt" argument.

tab <- contrast(groupMeans, method = "pairwise", adjust = "mvt")
tab
##  contrast                         estimate   SE df t.ratio p.value
##  Metribuzin__348 - Mixture_378        4.05 2.77 12   1.461  0.4885
##  Metribuzin__348 - Rimsulfuron_30    -7.68 2.77 12  -2.774  0.0698
##  Metribuzin__348 - Unweeded         -17.60 2.77 12  -6.352  0.0002
##  Mixture_378 - Rimsulfuron_30       -11.73 2.77 12  -4.235  0.0054
##  Mixture_378 - Unweeded             -21.64 2.77 12  -7.813  <.0001
##  Rimsulfuron_30 - Unweeded           -9.91 2.77 12  -3.578  0.0172
## 
## P value adjustment: mvt method for 6 tests

The above function uses numerical integration methods and is based on simulation; consequently, the results are not fully reproducible. However, it is easy to see that these results are asymptotically equivalent to the results obtained with the Tukey fitting method shown above. Due to this intrinsic complexity, the use of the adjust = "mvt" This argument is not recommended for pairwise comparisons in balanced experiments, while it may be useful in other situations, for example in the presence of highly unbalanced data.

Thanks for reading – and don’t forget to check out my new book below!

Andrea Onofri
Department of Agricultural, Food and Environmental Sciences
University of Perugia (Italy)
Send comments to: [email protected]


  1. Bretz, F., Hothorn, T., Westfall, P., 2011. Multiple Comparisons Using R. CRC Press, Boca Raton, FL.


#Correction #multiplicity #emmeans #package #bloggers

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *