7 Nonparametric tests

  • These tests are used when the normality assumption is violated
    • They do not make assumptions about the distribution of the data
  • Wilcoxon signed-rank test (for paired data)
  • Wilcoxon rank sum test (also called the Mann-Whitney U test)
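Both tests are available in base R through `wilcox.test`; a minimal sketch on simulated data (the variables here are made up for illustration, not from the course data):

```r
set.seed(1)
# simulated skewed outcomes for two independent groups
g1 <- rexp(30, rate = 1)
g2 <- rexp(30, rate = 0.5)   # stochastically larger group
# simulated paired differences for the one-sample test
d <- rexp(25) - rexp(25)

# rank sum (Mann-Whitney) test: two independent samples
rs <- wilcox.test(g1, g2)
# signed-rank test: one sample of paired differences
sr <- wilcox.test(d)
c(ranksum.p = rs$p.value, signedrank.p = sr$p.value)
```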

7.1 Examples of violation of normality assumption

7.2 Two sample

Code
histinfo = hist(puf$iralcfm, plot = FALSE)
Code
hist(puf$iralcfm[ puf$mrjmon=='yes'], breaks = histinfo$breaks, main="Histogram of Alcohol use", 
     xlab="Days used past month", col=rgb(1,0,0,.5), border=NA, probability=TRUE)
hist(puf$iralcfm[ puf$mrjmon=='no'], breaks=histinfo$breaks, col=rgb(0,0,1,.5), add=TRUE, border=NA, probability=TRUE )
legend('topright', fill=c(rgb(1,0,0,.5), rgb(0,0,1,.5)), legend=c('Used MJ', "Didn't use MJ"), bty='n')

7.3 Wilcoxon/Mann-Whitney Rank Sum Test

  • Let \(X_i\) be 1 if subject \(i\) used marijuana in the past month and zero otherwise.
  • \(Y_i\) is alcohol use for subject \(i\) and assume \(X_i=0\) for \(i=1, \ldots, n_0\) and \(X_i=1\) for \(i=n_0+1, \ldots, n\).
  • This is a two-sample test (similar to the t-test)
  • This method tests the null hypothesis that \[ \mathbb{P}(Y_0>Y_1) = \mathbb{P}(Y_0<Y_1) \] where \(Y_0\) is alcohol use for a randomly selected non-marijuana user and \(Y_1\) is alcohol use for a randomly selected marijuana user.
    • Many people incorrectly describe it as a test for the equality of the medians
    • When one side of the equation above is larger, that variable is said to be “stochastically greater”
  • Frank Harrell recommends using the Wilcoxon rank sum test over the t-test (see his comments on the advantages of the Wilcoxon)
  • The Wilcoxon/Mann-Whitney \(U\)-test statistic is not very interpretable on its own, but it is proportional to an estimate of \(\mathbb{P}(Y_1>Y_0)\), which is equivalent to the area under the receiver operating characteristic (ROC) curve.
  • The code below computes the test statistic and the AUC. The auRoc package can be used to compute confidence intervals for the AUC.
Code
spuf = puf[sample(nrow(puf), 100),]
# full data set test
wtestfull = wilcox.test(puf$iralcfm[ puf$mrjmon=='yes'], puf$iralcfm[ puf$mrjmon=='no'])
# subset test
wtest = wilcox.test(spuf$iralcfm[ spuf$mrjmon=='yes'], spuf$iralcfm[ spuf$mrjmon=='no'], conf.int = TRUE)

# Probability that an MJ smoker drinks more
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  spuf$iralcfm[spuf$mrjmon == "yes"] and spuf$iralcfm[spuf$mrjmon == "no"]
## W = 630.5, p-value = 0.2448
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  -4.033002e-05  1.000015e+00
## sample estimates:
## difference in location 
##              0.9999813
Code
# this is the AUC
wtest$statistic/sum(spuf$mrjmon=='yes')/sum(spuf$mrjmon=='no')
##         W 
## 0.5970644
Code
#auRoc::

# Probability that a non-MJ smoker drinks more
wtest2 = wilcox.test(spuf$iralcfm[ spuf$mrjmon=='no'], spuf$iralcfm[ spuf$mrjmon=='yes'], conf.int = TRUE)
wtest2$statistic/sum(spuf$mrjmon=='yes')/sum(spuf$mrjmon=='no')
##         W 
## 0.4029356
Code
plot(puf$iralcfm[puf$mrjmon=='yes'], rank(puf$iralcfm)[puf$mrjmon=='yes'], xlab='Days of alcohol use', ylab='rank')

Code
n1 = sum(spuf$mrjmon=='yes')
n2 = sum(spuf$mrjmon=='no')
wtestStatistic = sum(rank(spuf$iralcfm)[spuf$mrjmon=='yes']) - n1*(n1+1)/2
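The manual computation above can be verified against `wilcox.test` on simulated data; with a continuous outcome there are no ties, and the two statistics match exactly (a sketch with made-up data):

```r
set.seed(2)
x <- rep(c(1, 0), times = c(15, 20))   # group indicator
y <- rnorm(35)                         # continuous outcome (no ties)
n1 <- sum(x == 1)

# manual statistic: sum of group-1 ranks in the pooled sample,
# minus the smallest possible value of that sum
W_manual  <- sum(rank(y)[x == 1]) - n1 * (n1 + 1) / 2
W_builtin <- unname(wilcox.test(y[x == 1], y[x == 0])$statistic)
c(W_manual, W_builtin)   # identical
```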

7.4 One sample

Code
diffs = round(puf$iralcfm-puf$ircigfm)
hist(diffs, xlab='alcohol - cigarettes', main='Hist of diff', freq = FALSE, breaks=seq(-30.5, 30.5, length.out=62))

7.5 Wilcoxon Signed-rank test

  • This is a one-sample test (similar to the one-sample/paired t-test)
  • Let’s consider the association between cigarette smoking and alcohol that we looked at before
  • The normality assumption is violated here
  • We also showed that the t-test is robust to this violation (meaning the type 1 error rate stays close to the nominal level)
  • The signed-rank test takes the null hypothesis that the median of the differences is equal to zero (or some other chosen value)
  • There are a couple of different forms of the test statistic; Wikipedia gives one formula and its variance, while R uses a different form
  • p-values can be computed for small sample sizes exactly.
  • In large samples, the test statistic is approximately normal with variance \(N_r(N_r+1)(2N_r +1)/6\)
Code

diffs = diffs[sample(length(diffs), 100)]

wtest = wilcox.test(diffs)
## Warning in wilcox.test.default(diffs): cannot compute exact p-value with ties
## Warning in wilcox.test.default(diffs): cannot compute exact p-value with zeroes
Code
wtest$statistic
##     V 
## 352.5
Code
sdiffs = sign(diffs[ diffs !=0] )
diffsrank = rank(abs(diffs[ diffs !=0]))
nr = length(diffsrank)

# test statistic in R
statistic= sum(diffsrank[sdiffs>0])
# test statistic that has asymptotic normal distribution
Zstatistic = sum(diffsrank*sdiffs)/sqrt(nr*(nr+1)*(2*nr+1)/6)
pvalue = 2*pnorm(abs(Zstatistic), lower.tail = FALSE)

# wilcox.test result
c(wtest$statistic, wtest$p.value)
##           V             
## 352.5000000   0.7995264
Code
# manual result, not exactly the same. May need correction for ties.
c(statistic, Zstatistic, pvalue)
## [1] 352.5000000  -0.2610410   0.7940609
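The same kind of check works on simulated data: with continuous differences there are no ties or zeros, so the manual statistic matches `wilcox.test` exactly and the normal approximation can be compared directly (a sketch with made-up data):

```r
set.seed(3)
d <- rnorm(40, mean = 0.2)     # simulated continuous paired differences
r <- rank(abs(d))

# R's statistic: sum of the ranks of the positive differences
V_manual  <- sum(r[d > 0])
V_builtin <- unname(wilcox.test(d)$statistic)

# large-sample Z statistic using the variance nr*(nr+1)*(2*nr+1)/6
nr <- length(d)
Z  <- sum(r * sign(d)) / sqrt(nr * (nr + 1) * (2 * nr + 1) / 6)
pnormal <- 2 * pnorm(abs(Z), lower.tail = FALSE)
c(V_manual, V_builtin, pnormal)
```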

7.6 Permutation testing

  • Permutation testing can be used to compute “exact” p-values without distributional assumptions.

7.6.1 Permutation testing intuition

  • Consider the two-sample mean example where we are comparing alcohol use between individuals who use marijuana versus those who don’t.
  • Our test statistic is \[ T = (\hat\mu_1 - \hat\mu_0)/\sqrt{\hat{\text{Var}}(\hat\mu_1 - \hat\mu_0)} = T(X, Y) \]
  • \(X_i\) is paired with \(Y_i\) for each \(i\). If \(X\) and \(Y\) are independent, the pairing doesn’t matter, so any reassignment of the \(X_i\) to the \(Y_i\) is equally likely under the null.
  • We can permute the pairing – each permuted combination of \(X_i\) and \(Y_i\) is equally likely under the null, and we can compute a test statistic \[ T^{p} = T(X^p, Y) \] where \(X^p\) is the permuted marijuana use data.
  • In this case, there are \({n \choose n_1}\) possible pairings. That’s a lot.
  • In practice, because there are so many possible permutations, we just randomly choose a large number of permutations
  • Permutation tests compute a p-value this way, for permutations \(p=1, \ldots, P\): \[ \frac{1}{P} \sum_{p=1}^P I\{ T(X^p, Y) \ge T\} \]
  • This is the proportion of permuted data sets where the test statistic is as large as or larger than the observed value.
Code
permutationTest = function(X, Y, nperm=1000){
 Tobs = t.test(Y[X==0], Y[X==1])$statistic
 # randomly permutes X
 Tperms = replicate(nperm, {Xperm = sample(X, replace=FALSE); t.test(Y[Xperm==0], Y[Xperm==1])$statistic} )
 c(Tobs=Tobs, pvalue=mean(abs(Tperms)>abs(Tobs)))
}


# In the subset of data
permtest = permutationTest(Y=spuf$iralcfm, X = as.numeric(spuf$mrjmon=='yes'))
wtest = wilcox.test(spuf$iralcfm[ spuf$mrjmon=='yes'], spuf$iralcfm[ spuf$mrjmon=='no'], conf.int = TRUE)
ttest = t.test(spuf$iralcfm[ spuf$mrjmon=='yes'], spuf$iralcfm[ spuf$mrjmon=='no'])

permtest
##    Tobs.t    pvalue 
## 0.1600367 0.8820000
Code
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  spuf$iralcfm[spuf$mrjmon == "yes"] and spuf$iralcfm[spuf$mrjmon == "no"]
## W = 630.5, p-value = 0.2448
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  -4.033002e-05  1.000015e+00
## sample estimates:
## difference in location 
##              0.9999813
Code
ttest
## 
##  Welch Two Sample t-test
## 
## data:  spuf$iralcfm[spuf$mrjmon == "yes"] and spuf$iralcfm[spuf$mrjmon == "no"]
## t = -0.16004, df = 18.192, p-value = 0.8746
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.369032  2.891759
## sample estimates:
## mean of x mean of y 
##  3.500000  3.738636
Code
# In the full sample
permtest = permutationTest(Y=puf$iralcfm, X = as.numeric(puf$mrjmon=='yes'))
wtest = wilcox.test(puf$iralcfm[ puf$mrjmon=='yes'], puf$iralcfm[ puf$mrjmon=='no'], conf.int = TRUE)
ttest = t.test(puf$iralcfm[ puf$mrjmon=='yes'], puf$iralcfm[ puf$mrjmon=='no'])

permtest
##    Tobs.t    pvalue 
## -37.05836   0.00000
Code
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  puf$iralcfm[puf$mrjmon == "yes"] and puf$iralcfm[puf$mrjmon == "no"]
## W = 226081314, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  2.000021 2.000031
## sample estimates:
## difference in location 
##               2.000047
Code
ttest
## 
##  Welch Two Sample t-test
## 
## data:  puf$iralcfm[puf$mrjmon == "yes"] and puf$iralcfm[puf$mrjmon == "no"]
## t = 37.058, df = 7352.6, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  3.691486 4.103838
## sample estimates:
## mean of x mean of y 
##  6.926761  3.029099

7.7 There are many other nonparametric tests

  • Permutation testing – can be used in a wide array of applications
  • Kruskal-Wallis test (rank-based comparison of more than two groups)
  • Spearman’s correlation (rank-based correlation)
  • Tests for comparing distributions: the Kolmogorov-Smirnov test (and other similar tests; Anderson-Darling, Shapiro-Wilk)
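All of these are available in base R; a quick sketch on simulated data (the groups and variables here are made up):

```r
set.seed(4)
g <- factor(rep(c('a', 'b', 'c'), each = 20))
y <- rexp(60, rate = rep(c(1, 1, 2), each = 20))   # group 'c' differs
x <- rnorm(60)

kw <- kruskal.test(y ~ g)                  # rank-based test for 3+ groups
sp <- cor.test(x, y, method = 'spearman')  # rank-based correlation
ks <- ks.test(y[g == 'a'], y[g == 'b'])    # compares two full distributions
sw <- shapiro.test(y)                      # tests normality of one sample
c(kw$p.value, sp$p.value, ks$p.value, sw$p.value)
```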

7.8 There are more for high-dimensional data

  • Distance correlation
  • Normalized mutual information
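Distance correlation detects arbitrary (not just linear or monotone) dependence. The energy package implements it, but it can also be sketched in a few lines of base R; this implementation is an illustration, not from the course notes:

```r
# sample distance correlation
dcor <- function(x, y) {
  # double-center a Euclidean distance matrix:
  # subtract row means and column means, add back the grand mean
  center <- function(d) sweep(sweep(d, 1, rowMeans(d)), 2, colMeans(d)) + mean(d)
  A <- center(as.matrix(dist(x)))
  B <- center(as.matrix(dist(y)))
  # distance covariance over the sqrt of the product of distance variances
  sqrt(mean(A * B) / sqrt(mean(A^2) * mean(B^2)))
}

set.seed(5)
x <- rnorm(200)
dcor(x, x^2)   # clearly positive: detects the nonlinear dependence
cor(x, x^2)    # Pearson correlation, by contrast, is small
```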