7 Nonparametric tests
- These tests are useful when the normality assumption is violated
- They do not assume a particular distribution for the data
- Wilcoxon signed rank test (for paired data)
- Wilcoxon rank sum test (also called the Mann-Whitney U test, for two independent samples); both calls are sketched below
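Both tests are available through R's wilcox.test. As a quick orientation, here is a minimal sketch on simulated (skewed) data; this is illustrative only and separate from the survey data analyzed below.
Code
set.seed(1)
x1 = rexp(20); x2 = rexp(20) + 0.5            # paired measurements, skewed
g1 = rexp(30); g2 = rexp(30, rate = 0.5)      # two independent groups
wilcox.test(x1, x2, paired = TRUE)            # Wilcoxon signed rank test (paired data)
wilcox.test(g1, g2)                           # Wilcoxon rank sum / Mann-Whitney U test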
7.2 Two sample
## Warning in hist.default(puf$iralcfm, plot = FALSE, probability = TRUE): argument 'probability'
## is not made use of
Code
# histinfo: breaks from hist(puf$iralcfm, plot = FALSE, ...) in the collapsed chunk above,
# reused here so both groups are plotted on the same bins
hist(puf$iralcfm[puf$mrjmon == 'yes'], breaks = histinfo$breaks, main = "Histogram of Alcohol use",
     xlab = "Days used past month", col = rgb(1, 0, 0, .5), border = NA, probability = TRUE)
hist(puf$iralcfm[puf$mrjmon == 'no'], breaks = histinfo$breaks, col = rgb(0, 0, 1, .5),
     add = TRUE, border = NA, probability = TRUE)
legend('topright', fill = c(rgb(1, 0, 0, .5), rgb(0, 0, 1, .5)),
       legend = c('Used MJ', "Didn't use MJ"), bty = 'n')
7.3 Wilcoxon/Mann-Whitney Rank Sum Test
- Let \(X_i\) be 1 if subject \(i\) used marijuana in the past month and zero otherwise.
- \(Y_i\) is alcohol use for subject \(i\) and assume \(X_i=0\) for \(i=1, \ldots, n_0\) and \(X_i=1\) for \(i=n_0+1, \ldots, n\).
- This is a two-sample test (similar to the t-test)
- This method tests the null hypothesis that
\[
\mathbb{P}(Y_0>Y_1) = \mathbb{P}(Y_0<Y_1)
\]
where \(Y_0\) is alcohol use for a randomly selected non-marijuana user and \(Y_1\) is alcohol use for a randomly selected marijuana user.
- Many people incorrectly describe it as a test for the equality of the medians
- If \(\mathbb{P}(Y_0>Y_1) > \mathbb{P}(Y_0<Y_1)\), then \(Y_0\) is said to be “stochastically greater” than \(Y_1\); the null hypothesis above says that neither group is stochastically greater
- Frank Harrell recommends using the Wilcoxon rank sum test over the t-test (see his comments about the advantages of the Wilcoxon test)
- The Wilcoxon/Mann-Whitney \(U\)-test statistic is not very interpretable on its own, but it is proportional to an estimate of \(\mathbb{P}(Y_0>Y_1)\), which is equivalent to the area under the curve (AUC) of the receiver operating characteristic (ROC) curve.
- The code below computes the test statistic and the AUC. It uses the aucRoc package to compute confidence intervals for the AUC.
Code
spuf = puf[sample(nrow(puf), 100),]
# full data set test
wtestfull = wilcox.test(puf$iralcfm[ puf$mrjmon=='yes'], puf$iralcfm[ puf$mrjmon=='no'])
# subset test
wtest = wilcox.test(spuf$iralcfm[ spuf$mrjmon=='yes'], spuf$iralcfm[ spuf$mrjmon=='no'], conf.int = TRUE)
# Probability that an MJ smoker drinks more
wtest
##
## Wilcoxon rank sum test with continuity correction
##
## data: spuf$iralcfm[spuf$mrjmon == "yes"] and spuf$iralcfm[spuf$mrjmon == "no"]
## W = 630.5, p-value = 0.2448
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## -4.033002e-05 1.000015e+00
## sample estimates:
## difference in location
## 0.9999813
## W
## 0.5970644
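The W value printed above is the rank-sum statistic rescaled to the AUC scale. The exact line that produced it is in a collapsed chunk, but the rescaling is just the statistic divided by the product of the two group sizes; a minimal sketch, assuming the objects defined in the chunk above:
Code
# assumed rescaling: the Mann-Whitney statistic divided by n1*n0 estimates the probability
# that a randomly chosen marijuana user reports more drinking days than a non-user (the ROC AUC)
n1 = sum(spuf$mrjmon == 'yes')
n0 = sum(spuf$mrjmon == 'no')
unname(wtest$statistic) / (n1 * n0)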
Code
## W
## 0.5970644
Code
7.5 Wilcoxon Signed-rank test
- This is a one-sample test (similar to the one-sample/paired t-test)
- Let’s consider the association between cigarette smoking and alcohol that we looked at before
- The normality assumption is violated here
- We also showed that the t-test is robust to this violation (meaning that the type I error rate stays close to its nominal level)
- The signed-rank test takes the null hypothesis that the median is equal to zero (or some other chosen value)
- There are a couple of different forms of the test statistic; Wikipedia gives the formula and the variance. R uses the one that isn't described on Wikipedia.
- p-values can be computed exactly for small sample sizes (an exact computation is sketched after the output below).
- In large samples, the test statistic is approximately normal with variance \(N_r(N_r+1)(2N_r +1)/6\), where \(N_r\) is the number of nonzero differences
Code
## Warning in wilcox.test.default(diffs): cannot compute exact p-value with ties
## Warning in wilcox.test.default(diffs): cannot compute exact p-value with zeroes
## V
## 352.5
Code
sdiffs = sign(diffs[ diffs != 0 ])           # diffs: paired differences from the collapsed chunk above
diffsrank = rank(abs(diffs[ diffs != 0 ]))   # ranks of the absolute nonzero differences
nr = length(diffsrank)
# test statistic as computed by R: sum of the ranks for the positive differences
statistic = sum(diffsrank[sdiffs > 0])
# test statistic that has an asymptotic normal distribution
Zstatistic = sum(diffsrank*sdiffs)/sqrt(nr*(nr+1)*(2*nr+1)/6)
pvalue = 2*pnorm(abs(Zstatistic), lower.tail = FALSE)
# wilcox.test result (the signed-rank test run in the collapsed chunk above), for comparison
c(wtest$statistic, wtest$p.value)
## V
## 352.5000000 0.7995264
Code
## [1] 352.5000000 -0.2610410 0.7940609
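When there are no ties and no zero differences, the exact p-value mentioned above can be computed directly from the signed-rank null distribution, available in base R as psignrank. A minimal sketch on a small simulated sample (not the survey data, which has ties and zeros):
Code
set.seed(3)
x = rnorm(10)                                  # small sample, no ties or zeros
v = sum(rank(abs(x))[x > 0])                   # signed-rank statistic (R's form)
n = length(x)
# exact two-sided p-value: double the smaller tail of the signed-rank distribution
p_exact = min(2 * min(psignrank(v, n), psignrank(v - 1, n, lower.tail = FALSE)), 1)
c(p_exact, wilcox.test(x)$p.value)             # should agree with wilcox.test's exact p-value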
7.6 Permutation testing
- Permutation testing can be used to compute “exact” p-values without distributional assumptions.
7.6.1 Permutation testing intuition
- Consider the two mean example where we are comparing alcohol use between individuals who use marijuana versus those who don’t.
- Our test statistic is \[ T = (\hat\mu_1 - \hat\mu_0)/\sqrt{\hat{\text{Var}}(\hat\mu_1 - \hat\mu_0)} = T(X, Y) \]
- \(X_i\) is paired with \(Y_i\) for each \(i\). If they are independent, the pairing doesn't matter, so \(Y_i\) could just as well have been paired with any other \(X_j\).
- We can permute the pairing – this combination of \(X_i\) and \(Y_i\) is equally likely under the null and we could compute a test statistic \[ T^{p} = T(X^p, Y) \] where \(X^p\) is the permuted marijuana use data.
- In this case, there are \({n \choose n_1}\) possible pairings. That’s a lot.
- In practice, because there are so many possible permutations, we just randomly choose a large number of permutations
- Permutation tests compute the p-value from \(P\) random permutations, \(p=1, \ldots, P\), as \[ \frac{1}{P} \sum_{p=1}^P I\{ |T(X^p, Y)| \ge |T|\} \]
- That is, the proportion of permuted data sets whose test statistic is at least as extreme as the observed value (the absolute values make this a two-sided test, as in the code below).
Code
permutationTest = function(X, Y, nperm=1000){
Tobs = t.test(Y[X==0], Y[X==1])$statistic
# randomly permutes X
Tperms = replicate(nperm, {Xperm = sample(X, replace=FALSE); t.test(Y[Xperm==0], Y[Xperm==1])$statistic} )
c(Tobs=Tobs, pvalue=mean(abs(Tperms)>abs(Tobs)))
}
# In the subset of data
permtest = permutationTest(Y=spuf$iralcfm, X = as.numeric(spuf$mrjmon=='yes'))
wtest = wilcox.test(spuf$iralcfm[ spuf$mrjmon=='yes'], spuf$iralcfm[ spuf$mrjmon=='no'], conf.int = TRUE)
ttest = t.test(spuf$iralcfm[ spuf$mrjmon=='yes'], spuf$iralcfm[ spuf$mrjmon=='no'])
permtest
## Tobs.t pvalue
## 0.1600367 0.8820000
##
## Wilcoxon rank sum test with continuity correction
##
## data: spuf$iralcfm[spuf$mrjmon == "yes"] and spuf$iralcfm[spuf$mrjmon == "no"]
## W = 630.5, p-value = 0.2448
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## -4.033002e-05 1.000015e+00
## sample estimates:
## difference in location
## 0.9999813
##
## Welch Two Sample t-test
##
## data: spuf$iralcfm[spuf$mrjmon == "yes"] and spuf$iralcfm[spuf$mrjmon == "no"]
## t = -0.16004, df = 18.192, p-value = 0.8746
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.369032 2.891759
## sample estimates:
## mean of x mean of y
## 3.500000 3.738636
Code
## Tobs.t pvalue
## -37.05836 0.00000
##
## Wilcoxon rank sum test with continuity correction
##
## data: puf$iralcfm[puf$mrjmon == "yes"] and puf$iralcfm[puf$mrjmon == "no"]
## W = 226081314, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## 2.000021 2.000031
## sample estimates:
## difference in location
## 2.000047
##
## Welch Two Sample t-test
##
## data: puf$iralcfm[puf$mrjmon == "yes"] and puf$iralcfm[puf$mrjmon == "no"]
## t = 37.058, df = 7352.6, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 3.691486 4.103838
## sample estimates:
## mean of x mean of y
## 6.926761 3.029099
7.7 There are many other nonparametric tests
- Permutation testing can be used in a wide array of applications
- Other tests for comparing distributions
- Kruskal-Wallis
- Spearman’s correlation
- Kolmogorov-Smirnov test for comparing two distributions (and related tests such as Anderson-Darling; Shapiro-Wilk specifically tests for normality); example calls are sketched below
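All of these are available in base R. A minimal sketch of the calls, reusing the variables from above (puf$ircigfm is assumed here to be the cigarette-frequency variable referenced earlier; any numeric pair would do):
Code
# Kruskal-Wallis: rank-based comparison across groups (here just the two mrjmon groups)
kruskal.test(iralcfm ~ mrjmon, data = puf)
# Spearman's rank correlation between two numeric variables (ircigfm is an assumed variable name)
cor.test(puf$iralcfm, puf$ircigfm, method = 'spearman')
# Kolmogorov-Smirnov: compare the two alcohol-use distributions (p-value is approximate with ties)
ks.test(puf$iralcfm[puf$mrjmon == 'yes'], puf$iralcfm[puf$mrjmon == 'no'])
# Shapiro-Wilk: tests normality of a single sample (limited to 5000 observations)
shapiro.test(sample(puf$iralcfm, 1000))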