7 Nonparametric tests

  • These tests are used when the normality assumption is violated
    • They do not make assumptions about the distribution of the data
  • Wilcoxon signed-rank test (for paired data)
  • Wilcoxon rank sum test (also called the Mann-Whitney U test)
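Both tests are available in base R through `wilcox.test`; a minimal sketch on simulated data (the variables here are made up for illustration, not from the course data):

```r
set.seed(1)
# simulated skewed outcomes for two independent groups
g1 <- rexp(30, rate = 1)
g2 <- rexp(30, rate = 0.5)   # stochastically larger group
# simulated paired differences for the one-sample test
d <- rexp(25) - rexp(25)

# rank sum (Mann-Whitney) test: two independent samples
rs <- wilcox.test(g1, g2)
# signed-rank test: one sample of paired differences
sr <- wilcox.test(d)
c(ranksum.p = rs$p.value, signedrank.p = sr$p.value)
```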

7.1 Examples of violation of normality assumption

7.2 Two sample

Code
histinfo = hist(puf$iralcfm, plot = FALSE)
Code
hist(puf$iralcfm[ puf$mrjmon=='yes'], breaks = histinfo$breaks, main="Histogram of Alcohol use", 
     xlab="Days used past month", col=rgb(1,0,0,.5), border=NA, probability=TRUE)
hist(puf$iralcfm[ puf$mrjmon=='no'], breaks=histinfo$breaks, col=rgb(0,0,1,.5), add=TRUE, border=NA, probability=TRUE )
legend('topright', fill=c(rgb(1,0,0,.5), rgb(0,0,1,.5)), legend=c('Used MJ', "Didn't use MJ"), bty='n')

7.3 Wilcoxon/Mann-Whitney Rank Sum Test

  • Let \(X_i\) be 1 if subject \(i\) used marijuana in the past month and zero otherwise.
  • \(Y_i\) is alcohol use for subject \(i\) and assume \(X_i=0\) for \(i=1, \ldots, n_0\) and \(X_i=1\) for \(i=n_0+1, \ldots, n\).
  • This is a two-sample test (similar to the t-test)
  • This method tests the null hypothesis that \[ \mathbb{P}(Y_0>Y_1) = \mathbb{P}(Y_0<Y_1) \] where \(Y_0\) is alcohol use for a randomly selected non-marijuana user and \(Y_1\) is alcohol use for a randomly selected marijuana user.
    • Many people incorrectly describe it as a test for the equality of the medians
    • When one side of the equation above is larger, that variable is said to be “stochastically greater”
  • Frank Harrell recommends using the Wilcoxon rank sum test over the t-test (see his comments on the advantages of the Wilcoxon)
  • The Wilcoxon/Mann-Whitney \(U\)-test statistic is not very interpretable on its own, but it is proportional to an estimate of \(\mathbb{P}(Y_1>Y_0)\), which is equivalent to the area under the receiver operating characteristic (ROC) curve.
  • The code below computes the test statistic and the AUC. The auRoc package can be used to compute confidence intervals for the AUC.
Code
spuf = puf[sample(nrow(puf), 100),]
# full data set test
wtestfull = wilcox.test(puf$iralcfm[ puf$mrjmon=='yes'], puf$iralcfm[ puf$mrjmon=='no'])
# subset test
wtest = wilcox.test(spuf$iralcfm[ spuf$mrjmon=='yes'], spuf$iralcfm[ spuf$mrjmon=='no'], conf.int = TRUE)

# Probability that an MJ smoker drinks more
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  spuf$iralcfm[spuf$mrjmon == "yes"] and spuf$iralcfm[spuf$mrjmon == "no"]
## W = 630.5, p-value = 0.2448
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  -4.033002e-05  1.000015e+00
## sample estimates:
## difference in location 
##              0.9999813
Code
# this is the AUC
wtest$statistic/sum(spuf$mrjmon=='yes')/sum(spuf$mrjmon=='no')
##         W 
## 0.5970644
Code
#auRoc::

# Probability that a non-MJ smoker drinks more
wtest2 = wilcox.test(spuf$iralcfm[ spuf$mrjmon=='no'], spuf$iralcfm[ spuf$mrjmon=='yes'], conf.int = TRUE)
wtest2$statistic/sum(spuf$mrjmon=='yes')/sum(spuf$mrjmon=='no')
##         W 
## 0.4029356
Code
plot(puf$iralcfm[puf$mrjmon=='yes'], rank(puf$iralcfm)[puf$mrjmon=='yes'], xlab='Days of alcohol use', ylab='rank')

Code
n1 = sum(spuf$mrjmon=='yes')
n2 = sum(spuf$mrjmon=='no')
wtestStatistic = sum(rank(spuf$iralcfm)[spuf$mrjmon=='yes']) - n1*(n1+1)/2
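The manual computation above can be verified against `wilcox.test` on simulated data; with a continuous outcome there are no ties, and the two statistics match exactly (a sketch with made-up data):

```r
set.seed(2)
x <- rep(c(1, 0), times = c(15, 20))   # group indicator
y <- rnorm(35)                         # continuous outcome (no ties)
n1 <- sum(x == 1)

# manual statistic: sum of group-1 ranks in the pooled sample,
# minus the smallest possible value of that sum
W_manual  <- sum(rank(y)[x == 1]) - n1 * (n1 + 1) / 2
W_builtin <- unname(wilcox.test(y[x == 1], y[x == 0])$statistic)
c(W_manual, W_builtin)   # identical
```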

7.4 One sample

Code
diffs = round(puf$iralcfm-puf$ircigfm)
hist(diffs, xlab='alcohol - cigarettes', main='Hist of diff', freq = FALSE, breaks=seq(-30.5, 30.5, length.out=62))

7.5 Wilcoxon Signed-rank test

  • This is a one-sample test (similar to the one-sample/paired t-test)
  • Let’s consider the association between cigarette smoking and alcohol that we looked at before
  • The normality assumption is violated here
  • We also showed that the t-test is robust to this violation (meaning the type 1 error rate stays close to the nominal level)
  • The signed-rank test takes the null hypothesis that the median of the differences is equal to zero (or some other chosen value)
  • There are a couple of different forms of the test statistic; Wikipedia gives one formula and its variance, while R uses a different form
  • p-values can be computed for small sample sizes exactly.
  • In large samples, the test statistic is approximately normal with variance \(N_r(N_r+1)(2N_r +1)/6\)
Code

diffs = diffs[sample(length(diffs), 100)]

wtest = wilcox.test(diffs)
## Warning in wilcox.test.default(diffs): cannot compute exact p-value with ties
## Warning in wilcox.test.default(diffs): cannot compute exact p-value with zeroes
Code
wtest$statistic
##     V 
## 352.5
Code
sdiffs = sign(diffs[ diffs !=0] )
diffsrank = rank(abs(diffs[ diffs !=0]))
nr = length(diffsrank)

# test statistic in R
statistic= sum(diffsrank[sdiffs>0])
# test statistic that has asymptotic normal distribution
Zstatistic = sum(diffsrank*sdiffs)/sqrt(nr*(nr+1)*(2*nr+1)/6)
pvalue = 2*pnorm(abs(Zstatistic), lower.tail = FALSE)

# wilcox.test result
c(wtest$statistic, wtest$p.value)
##           V             
## 352.5000000   0.7995264
Code
# manual result, not exactly the same. May need correction for ties.
c(statistic, Zstatistic, pvalue)
## [1] 352.5000000  -0.2610410   0.7940609
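The same kind of check works on simulated data: with continuous differences there are no ties or zeros, so the manual statistic matches `wilcox.test` exactly and the normal approximation can be compared directly (a sketch with made-up data):

```r
set.seed(3)
d <- rnorm(40, mean = 0.2)     # simulated continuous paired differences
r <- rank(abs(d))

# R's statistic: sum of the ranks of the positive differences
V_manual  <- sum(r[d > 0])
V_builtin <- unname(wilcox.test(d)$statistic)

# large-sample Z statistic using the variance nr*(nr+1)*(2*nr+1)/6
nr <- length(d)
Z  <- sum(r * sign(d)) / sqrt(nr * (nr + 1) * (2 * nr + 1) / 6)
pnormal <- 2 * pnorm(abs(Z), lower.tail = FALSE)
c(V_manual, V_builtin, pnormal)
```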

7.6 Permutation testing

  • Permutation testing can be used to compute “exact” p-values without distributional assumptions.

7.6.1 Permutation testing intuition

  • Consider the two-sample mean example where we are comparing alcohol use between individuals who use marijuana versus those who don’t.
  • Our test statistic is \[ T = (\hat\mu_1 - \hat\mu_0)/\sqrt{\hat{\text{Var}}(\hat\mu_1 - \hat\mu_0)} = T(X, Y) \]
  • \(X_i\) is paired with \(Y_i\) for each \(i\). If \(X\) and \(Y\) are independent, the pairing doesn’t matter, so any reassignment of the \(X_i\) to the \(Y_i\) is equally likely under the null.
  • We can permute the pairing – each permuted combination of \(X_i\) and \(Y_i\) is equally likely under the null, and we can compute a test statistic \[ T^{p} = T(X^p, Y) \] where \(X^p\) is the permuted marijuana use data.
  • In this case, there are \({n \choose n_1}\) possible pairings. That’s a lot.
  • In practice, because there are so many possible permutations, we just randomly choose a large number of permutations
  • Permutation tests compute a p-value this way, for permutations \(p=1, \ldots, P\): \[ \frac{1}{P} \sum_{p=1}^P I\{ T(X^p, Y) \ge T\} \]
  • This is the proportion of permuted data sets where the test statistic is as large as or larger than the observed value.
Code
permutationTest = function(X, Y, nperm=1000){
 Tobs = t.test(Y[X==0], Y[X==1])$statistic
 # randomly permutes X
 Tperms = replicate(nperm, {Xperm = sample(X, replace=FALSE); t.test(Y[Xperm==0], Y[Xperm==1])$statistic} )
 c(Tobs=Tobs, pvalue=mean(abs(Tperms)>abs(Tobs)))
}


# In the subset of data
permtest = permutationTest(Y=spuf$iralcfm, X = as.numeric(spuf$mrjmon=='yes'))
wtest = wilcox.test(spuf$iralcfm[ spuf$mrjmon=='yes'], spuf$iralcfm[ spuf$mrjmon=='no'], conf.int = TRUE)
ttest = t.test(spuf$iralcfm[ spuf$mrjmon=='yes'], spuf$iralcfm[ spuf$mrjmon=='no'])

permtest
##    Tobs.t    pvalue 
## 0.1600367 0.8820000
Code
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  spuf$iralcfm[spuf$mrjmon == "yes"] and spuf$iralcfm[spuf$mrjmon == "no"]
## W = 630.5, p-value = 0.2448
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  -4.033002e-05  1.000015e+00
## sample estimates:
## difference in location 
##              0.9999813
Code
ttest
## 
##  Welch Two Sample t-test
## 
## data:  spuf$iralcfm[spuf$mrjmon == "yes"] and spuf$iralcfm[spuf$mrjmon == "no"]
## t = -0.16004, df = 18.192, p-value = 0.8746
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.369032  2.891759
## sample estimates:
## mean of x mean of y 
##  3.500000  3.738636
Code
# In the full sample
permtest = permutationTest(Y=puf$iralcfm, X = as.numeric(puf$mrjmon=='yes'))
wtest = wilcox.test(puf$iralcfm[ puf$mrjmon=='yes'], puf$iralcfm[ puf$mrjmon=='no'], conf.int = TRUE)
ttest = t.test(puf$iralcfm[ puf$mrjmon=='yes'], puf$iralcfm[ puf$mrjmon=='no'])

permtest
##    Tobs.t    pvalue 
## -37.05836   0.00000
Code
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  puf$iralcfm[puf$mrjmon == "yes"] and puf$iralcfm[puf$mrjmon == "no"]
## W = 226081314, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  2.000021 2.000031
## sample estimates:
## difference in location 
##               2.000047
Code
ttest
## 
##  Welch Two Sample t-test
## 
## data:  puf$iralcfm[puf$mrjmon == "yes"] and puf$iralcfm[puf$mrjmon == "no"]
## t = 37.058, df = 7352.6, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  3.691486 4.103838
## sample estimates:
## mean of x mean of y 
##  6.926761  3.029099

7.7 There are many other nonparametric tests

  • Permutation testing – can be used in a wide array of applications
  • Kruskal-Wallis test (rank-based comparison of more than two groups)
  • Spearman’s correlation (rank-based correlation)
  • Tests for comparing distributions: the Kolmogorov-Smirnov test (and other similar tests; Anderson-Darling, Shapiro-Wilk)
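All of these are available in base R; a quick sketch on simulated data (the groups and variables here are made up):

```r
set.seed(4)
g <- factor(rep(c('a', 'b', 'c'), each = 20))
y <- rexp(60, rate = rep(c(1, 1, 2), each = 20))   # group 'c' differs
x <- rnorm(60)

kw <- kruskal.test(y ~ g)                  # rank-based test for 3+ groups
sp <- cor.test(x, y, method = 'spearman')  # rank-based correlation
ks <- ks.test(y[g == 'a'], y[g == 'b'])    # compares two full distributions
sw <- shapiro.test(y)                      # tests normality of one sample
c(kw$p.value, sp$p.value, ks$p.value, sw$p.value)
```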

7.8 There are more for high-dimensional data

  • Distance correlation
  • Normalized mutual information
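Distance correlation detects arbitrary (not just linear or monotone) dependence. The energy package implements it, but it can also be sketched in a few lines of base R; this implementation is an illustration, not from the course notes:

```r
# sample distance correlation
dcor <- function(x, y) {
  # double-center a Euclidean distance matrix:
  # subtract row means and column means, add back the grand mean
  center <- function(d) sweep(sweep(d, 1, rowMeans(d)), 2, colMeans(d)) + mean(d)
  A <- center(as.matrix(dist(x)))
  B <- center(as.matrix(dist(y)))
  # distance covariance over the sqrt of the product of distance variances
  sqrt(mean(A * B) / sqrt(mean(A^2) * mean(B^2)))
}

set.seed(5)
x <- rnorm(200)
dcor(x, x^2)   # clearly positive: detects the nonlinear dependence
cor(x, x^2)    # Pearson correlation, by contrast, is small
```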