knitr::opts_chunk$set(echo = TRUE, eval=TRUE, cache=FALSE)

We will analyze diagnostic and age associations in individuals with autism spectrum disorder and typically developing controls. The data we will analyze are called fractional amplitude of low frequency fluctuations (fALFF), a measure computed from rs-fMRI data.
The fALFF data are available already processed from the Autism Brain Imaging Data Exchange (ABIDE) dataset in MNI space at 3mm resolution.
To analyze the data, we will look at associations of fALFF with diagnosis and age, controlling for other covariates.
\[ \text{fALFF}_i \sim \text{age}_i + \text{sex}_i + \text{diagnosis}_i + \text{motion covariates}_i + \text{site}_i \]
# install the latest versions of the PCP and pbj packages; PCP includes a function to download the preprocessed ABIDE data.
# devtools::install_github('statimagcoll/PCP', ref='master')
# devtools::install_github('statimagcoll/pbj', ref='master')
### LIBRARIES ###
library(RNifti)
library(parallel)
library(splines)
library(progress)
library(table1)
library(pbj)
library(PCP)

The PCP R package has access to the preprocessed ABIDE data. Here, the code downloads the fALFF images, a measure derived from the rs-fMRI data.
## DATA LOCATION ##
# SMN data folder
datadir = file.path(dirname(dirname(getwd())), 'data/abide')
derivative='falff'
# will be created by downloadABIDE
dbimagedir = file.path(datadir, 'neuroimaging/cpac/filt_global', derivative)
templatefile = file.path(datadir, 'neuroimaging/MNI152_T1_3mm.nii.gz')
# created by simSetup... Not used in this analysis
dbresimagedir = file.path(datadir, 'neuroimaging/cpac/alff_res')
# created below
maskfile = file.path(datadir, 'neuroimaging/cpac/mask.nii.gz')
overlapfile = file.path(datadir, 'neuroimaging/cpac/overlap.nii.gz')
# load in data and get directories
dat = ABIDE(datadir, derivatives=derivative)
# some data curation is done on download: subset to participants who pass visual inspection, make sex and dx factors, and remove rows with no file name
dat$imgname = paste(dat$file_id, paste0(derivative, '.nii.gz'), sep='_')
dat$files = file.path(dbimagedir, dat$imgname)
dat$sex = factor(dat$sex, levels=1:2, labels = c('Male', 'Female'))
label(dat$sex) = "Sex"
dat$dx_group = factor(toupper(dat$dx_group))
dat$site_id = factor(dat$site_id)
label(dat$age_at_scan) = "Age"
label(dat$func_mean_fd) = "Mean displacement"
label(dat$dx_group) = "Diagnosis"
table1(~ sex + func_mean_fd + age_at_scan | dx_group, data=dat, caption = "ABIDE analysis summary table.")

| | ASD (N=505) | HC (N=530) | Overall (N=1035) |
|---|---|---|---|
| Sex | | | |
| Male | 443 (87.7%) | 435 (82.1%) | 878 (84.8%) |
| Female | 62 (12.3%) | 95 (17.9%) | 157 (15.2%) |
| Mean displacement | | | |
| Mean (SD) | 0.157 (0.188) | 0.103 (0.0986) | 0.129 (0.152) |
| Median [Min, Max] | 0.0926 [0.0180, 1.43] | 0.0718 [0.0161, 0.890] | 0.0834 [0.0161, 1.43] |
| Missing | 1 (0.2%) | 0 (0%) | 1 (0.1%) |
| Age | | | |
| Mean (SD) | 17.1 (8.55) | 16.8 (7.45) | 17.0 (8.00) |
| Median [Min, Max] | 14.4 [7.00, 64.0] | 14.7 [6.47, 56.2] | 14.5 [6.47, 64.0] |
This is just a check to make sure we don’t have any missing non-imaging variables. One additional participant is excluded.
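The check itself is not shown in this document; a minimal sketch of what it could look like, using the covariates that appear in the models below (the variable selection and code here are assumptions, not the original chunk):

# covariates used in the group-level models below
covars = c('sex', 'age_at_scan', 'func_mean_fd', 'func_fber', 'func_outlier', 'dx_group', 'site_id')
# count missing values per covariate, then drop incomplete rows (one participant here)
colSums(is.na(dat[, covars]))
dat = dat[complete.cases(dat[, covars]), ]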
After downloading the data, we create a study-specific mask based on the subset of the data where all subjects have coverage. We make a small number of exclusions to increase the coverage above 30,000 voxels at 3mm isotropic resolution. This first chunk of code creates a study-level mask as the intersection of voxels where all participants' data are non-zero. It then iteratively excludes the participant whose removal most increases the study-level mask size, until we reach a coverage over 30,000 voxels.
imgs = simplify2array(readNifti(dat$files) )
overlap = rowMeans(imgs!=0, dims=3)
overlapImg = readNifti(dat$files[1])
overlapImg[,,] = overlap[,,]
writeNifti(overlapImg, file=overlapfile)
# track indices and subject IDs of participants excluded from the mask
inds=numeric()
ids = c()
subids = dat$sub_id
# number of voxels with no zeros
nnzero = 0
# iteratively remove the subject whose exclusion adds the most voxels to the mask
# (voxSums counts the number of participants with a zero at each voxel)
while(nnzero<30000){
voxSums = rowSums(imgs==0, dims=3)
tab = as.data.frame(table(voxSums))
nnzero=tab[1,2]
# number of voxels that are zero in only that one subject
uniquevox = apply(imgs, 4, function(img) sum(img==0 & voxSums==1) )
inds = which.max(uniquevox)
cat('\nIteration:\nsubject removed: ', paste(subids[inds], collapse=', '), '\nmask size is now ', nnzero+sum(uniquevox[inds]), '\nNumber of voxels added:', sum(uniquevox[inds]) )
imgs = imgs[,,,-inds]
subids = subids[-inds]
nnzero=nnzero+sum(uniquevox[inds])
}
Iteration:
subject removed: 51469
mask size is now 8746
Number of voxels added: 5402
Iteration:
subject removed: 51478
mask size is now 23062
Number of voxels added: 14316
Iteration:
subject removed: 50209
mask size is now 25587
Number of voxels added: 2525
Iteration:
subject removed: 50727
mask size is now 27681
Number of voxels added: 2094
Iteration:
subject removed: 50195
mask size is now 28677
Number of voxels added: 996
Iteration:
subject removed: 50216
mask size is now 29426
Number of voxels added: 749
Iteration:
subject removed: 51466
mask size is now 30010
Number of voxels added: 584
Create the mask from the overlap.
badDat = dat[! dat$sub_id %in% subids,]
dat = dat[ dat$sub_id %in% subids,]
# now create the mask
mask = readNifti(dat$files[1])
imgs = simplify2array(readNifti(dat$files) )
# mask actually still has pretty low coverage
mask[,,] = 0
mask[,,] = apply(imgs>0, 1:3, all)
writeNifti(mask, maskfile)
nvox = sum(mask)
ulimit = quantile(apply(imgs, 4, function(x) x[mask!=0]), 0.9)
rm(imgs)

Visualize a slice of the data for all participants included in the analysis.
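The visualization chunk is not shown; a rough sketch of one way to do it with base graphics, using ulimit (computed above) as the upper intensity limit. The slice index, panel layout, and 64-participant limit are arbitrary choices for illustration, not the original code.

# plot one axial slice per participant (first 64 shown), clamped at the 90th-percentile intensity
imgList = readNifti(dat$files)
par(mfrow = c(8, 8), mar = c(0, 0, 1, 0))
for (i in seq_len(min(64, length(imgList)))) {
  image(pmin(imgList[[i]][, , 30], ulimit), axes = FALSE,
        col = gray.colors(64), main = dat$sub_id[i], cex.main = 0.6)
}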
There are issues with the orientation of the images for a small number of participants. We can visualize them here to see what the problem is.
analysisDir = "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results"
# file.path(getwd(), "/4group-level_analysis/group-level_tutorial/abide_results")
# need age_at_scan in both models for testing nonlinear functions
form = paste0(" ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + dx_group + site_id" )
formred = paste0(" ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + site_id")
formredAge = paste0(" ~ sex + dx_group + func_mean_fd + func_fber + func_outlier + site_id")
mask = maskfile
ncores = 4

fsl_glm: OLS regression

We can use FSL's fsl_glm command to do a group-level analysis using a simple GLM (least squares approach). Here we create a text file with the list of images, and text files with the design matrix and contrasts for diagnosis and age.
# feat setup files directory
featDir = file.path(analysisDir, 'feat')
dir.create(featDir, recursive=TRUE, showWarnings = FALSE)
# write text file with all images
write.table(dat$files, file = file.path(featDir, 'images.txt'), row.names=FALSE, col.names=FALSE, quote=FALSE)
# create design and contrast files for fsl_glm
# create the design matrix (model.matrix includes the intercept column, which FSL needs to be specified explicitly) and save it as a text file
design = model.matrix(as.formula(form), data=dat)
write.table(design, file=file.path(featDir, 'design.txt'), row.names=FALSE, col.names=FALSE, quote=FALSE)
# create contrasts for dx and age
contrasts = matrix(0, nrow=2, ncol=ncol(design))
colnames(contrasts) = colnames(design)
rownames(contrasts) = c('dx', 'age')
contrasts[1, which(colnames(contrasts)=='dx_groupHC')] = 1
contrasts[2, which(colnames(contrasts)=='age_at_scan')] = 1
# save contrasts
write.table(contrasts, file=file.path(featDir, 'contrasts.txt'), row.names=FALSE, col.names=FALSE, quote=FALSE)
getwd()

[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial"
# output directory
featDir=$(pwd)/abide_results/feat
# merge images to 4d file
fslmerge -t $featDir/falff_outcome.nii.gz $(cat $featDir/images.txt)
# convert text file to .mat format for fsl
Text2Vest $featDir/design.txt $featDir/design.mat
# convert contrast text file to fsl format
Text2Vest $featDir/contrasts.txt $featDir/design.con
# mask file
mask=$(pwd)/../../data/abide/neuroimaging/cpac/mask.nii.gz
# run glm command
fsl_glm -i $featDir/falff_outcome.nii.gz -d $featDir/design.mat -c $featDir/design.con -m $mask \
-o $featDir/pe.nii.gz --out_z=$featDir/zstat.nii.gz --out_t=$featDir/tstat.nii.gz --out_p=$featDir/pstat.nii.gz --out_cope=$featDir/cope.nii.gz

flameo: random effects model with subject-specific error variance

FSL's FLAME model accounts for differences in variance among participants. To do this, it needs an estimate of the variances of the outcome images. We don't have this for fALFF data, so we can't fit the model here. It can also account for repeated measurements among participants. It might be helpful to review slides 12-16 in the group-level analysis notes.
randomise: permutation approach

Not included as part of this tutorial yet.
pbj package in R: robust variance estimation

First, replicate the GLM results from FSL. This fits OLS at each voxel and computes the same Z-statistics as FSL's FEAT.
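The model-fitting chunk is hidden; below is a sketch of roughly what the lmPBJ and write.statMap calls could look like, given the settings described and printed below. The exact call, including the argument order and the assumption that write.statMap takes an output directory, is not confirmed by this document.

# OLS fit at each voxel, testing the diagnosis effect (full vs. reduced formula defined above)
abideDXglm = lmPBJ(dat$files, form = as.formula(form), formred = as.formula(formred),
                   mask = mask, data = dat, robust = FALSE, HC3 = FALSE, transform = 't')
# write the Z-statistic image and sqrtSigma object to the dxGLM output directory
write.statMap(abideDXglm, outdir = file.path(analysisDir, 'dxGLM'))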
- HC3 is an option for the heteroskedasticity-consistent covariance estimator that improves small-sample performance. It defaults to TRUE.
- transform = t indicates that we want to transform the test statistics from t- to Z-statistics.
- robust = FALSE indicates that we want to use the standard OLS covariance estimator, which assumes homoskedasticity. It defaults to TRUE.

Full formula:  ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + dx_group + site_id
Reduced formula: ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + site_id
30307 voxels in mask
StatMap quantiles (0, 0.01, 0.05, 0.95, 0.99, 1):
[ 0, 0, 0.01, 6.01, 9.61, 18.77 ]
sqrtSigma:
[n = 914 ; df = 1 ; rdf = 888 ]
id variable is:
NULL
lmPBJ inference settings:
robust = FALSE ; HC3 = FALSE ; transform = t
pbjInference not run yet.
Writing stat and coef images.
Writing sqrtSigma object.
$stat
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/dxGLM/stat_z.nii.gz"
$sqrtSigma
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/dxGLM/sqrtSigma.rds"
This analysis uses the default settings, which include a robust covariance estimator that accommodates heteroskedasticity. Using transform = t converts the test statistics to Z-statistics using the CDF of the t-distribution and the inverse CDF of the standard normal. For hypothesis testing, this helps make them more normal, especially for extreme values.
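Again the chunk is hidden; a sketch of the corresponding call with the default robust settings (the call itself is an assumption based on the printed settings):

# robust (heteroskedasticity-consistent) fit for the diagnosis effect; robust and HC3 are left at their defaults (TRUE)
abideDX = lmPBJ(dat$files, form = as.formula(form), formred = as.formula(formred),
                mask = mask, data = dat, transform = 't')
write.statMap(abideDX, outdir = file.path(analysisDir, 'dx'))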
Full formula: ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + dx_group + site_id
Reduced formula: ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + site_id
30307 voxels in mask
StatMap quantiles (0, 0.01, 0.05, 0.95, 0.99, 1):
[ 0, 0, 0.01, 5.71, 9.13, 18.86 ]
sqrtSigma:
[n = 914 ; df = 1 ; rdf = 888 ]
id variable is:
NULL
lmPBJ inference settings:
robust = TRUE ; HC3 = TRUE ; transform = t
pbjInference not run yet.
Writing stat and coef images.
Writing sqrtSigma object.
$stat
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/dx/stat_z.nii.gz"
$sqrtSigma
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/dx/sqrtSigma.rds"
Writing stat and coef images.
Writing sqrtSigma object.
$stat
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/dx/stat_s.nii.gz"
$sqrtSigma
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/dx/sqrtSigma.rds"
Load the results interactively with fsleyes. Also open the MNI 1mm template for better-resolution viewing; fsleyes can overlay them correctly.
The same approach but testing the age effect.
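A sketch of what the analogous call could look like for age, using the reduced formula without age_at_scan defined earlier (the object name abideAge and the exact call are assumptions):

# robust fit for the age effect: full model vs. reduced model without age_at_scan
abideAge = lmPBJ(dat$files, form = as.formula(form), formred = as.formula(formredAge),
                 mask = mask, data = dat, transform = 't')
write.statMap(abideAge, outdir = file.path(analysisDir, 'age'))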
Writing stat and coef images.
Writing sqrtSigma object.
$stat
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/age/stat_z.nii.gz"
$sqrtSigma
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/age/sqrtSigma.rds"
Writing stat and coef images.
Writing sqrtSigma object.
$stat
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/age/stat_s.nii.gz"
$sqrtSigma
[1] "/Users/vandeks/Library/CloudStorage/Box-Box/SMN/4group-level_analysis/group-level_tutorial/abide_results/age/sqrtSigma.rds"
The last model fits the intercept after subtracting the mean from each image. This will show regions that tend to have higher fALFF (relative to the whole-brain average) consistently across all participants.
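The demeaning step is not shown; a rough sketch of one way to do it with RNifti, where demeanedDir and files_demeaned are hypothetical helper names introduced only for illustration:

# subtract each participant's whole-brain mean (within the mask) from their image
maskImg = readNifti(maskfile)
demeanedDir = file.path(analysisDir, 'demeaned')
dir.create(demeanedDir, showWarnings = FALSE)
dat$files_demeaned = file.path(demeanedDir, dat$imgname)
for (i in seq_len(nrow(dat))) {
  img = readNifti(dat$files[i])
  img[,,] = img[,,] - mean(img[maskImg > 0])
  writeNifti(img, dat$files_demeaned[i])
}
# a model with only an intercept fit to these demeaned images then tests regional deviation from the whole-brain average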
easythresh

easythresh is a command line tool from FSL to perform GRF-based cluster extent inference. It takes as input a Z-statistic image, a brain mask, a cluster-forming threshold (CFT), a cluster extent threshold (CET), and a template image for MNI space. It outputs thresholded images and a text file with cluster extent inference results. Here are two code chunks applying the easythresh command to the FSL FEAT results and the pbj OLS results.
# output directory
featDir=$(pwd)/abide_results/feat
# cluster forming threshold Z>3.1 (approximately p<0.001). Note: GRF-based cluster extent inference is NOT recommended based on prior literature showing it may not control the type 1 error rate
cft=3.1
# cluster extent threshold p<0.5. I set this anti-conservatively to include extra results for us to look at. easythresh will only include significant clusters in the output
cet=0.5
# split the zstat image into dx and age
fslsplit $featDir/zstat.nii.gz $featDir/zstat_ -t
mask=/Users/vandeks/Library/CloudStorage/Box-Box/SMN/data/abide/neuroimaging/cpac/mask.nii.gz
template=/Users/vandeks/Library/CloudStorage/Box-Box/SMN/data/abide/neuroimaging/MNI152_T1_3mm.nii.gz
# output directory
easyOutDir=$featDir/easythresh_cft${cft}_cet${cet}
mkdir $easyOutDir 2>/dev/null
# run easythresh command
# cd to the output directory -- easythresh writes output to current directory
curDir=$(pwd)
cd $easyOutDir
easythresh $featDir/zstat_0000.nii.gz $mask $cft $cet $template
cd $curDir

pbj output

pbjDir=$(pwd)/abide_results/dxGLM
# other variables are defined above
# output directory
easyOutDir=$pbjDir/easythresh_cft${cft}_cet${cet}
mkdir $easyOutDir 2>/dev/null
# run easythresh command
# cd to the output directory -- easythresh writes output to current directory
curDir=$(pwd)
cd $easyOutDir
# run the command. Last argument is suffix for output images
easythresh $pbjDir/stat_z.nii.gz $mask $cft $cet $template easythresh
cd $curDir

Two tables of the output from easythresh, using the FSL analysis and the pbj analysis as input. Results are for the diagnostic effect. They should be identical because they are performing the exact same analysis.
# read in text file cluster_easythresh.txt and print as table using knitr
easythreshFile = file.path(featDir, 'easythresh_cft3.1_cet0.5', 'cluster_easythresh.txt')
easythreshTab = read.table(easythreshFile, header=TRUE, sep='\t', check.names = FALSE)
knitr::kable(easythreshTab, caption = "Cluster extent inference results from FSL's easythresh command.")

| Cluster Index | Voxels | P | -log10(P) | MAX | MAX X (vox) | MAX Y (vox) | MAX Z (vox) | COG X (vox) | COG Y (vox) | COG Z (vox) |
|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 53 | 0.0717 | 1.140 | 4.19 | 22 | 22 | 36 | 23.7 | 20.8 | 36.5 |
| 7 | 33 | 0.1820 | 0.741 | 4.10 | 32 | 36 | 28 | 32.8 | 37.2 | 27.6 |
| 6 | 30 | 0.2100 | 0.677 | 3.79 | 22 | 32 | 42 | 20.2 | 34.0 | 40.5 |
| 5 | 16 | 0.4260 | 0.370 | 4.20 | 38 | 30 | 43 | 38.3 | 29.9 | 42.7 |
| 4 | 16 | 0.4260 | 0.370 | 3.93 | 36 | 46 | 27 | 36.3 | 46.2 | 28.2 |
| 3 | 15 | 0.4490 | 0.348 | 3.80 | 22 | 22 | 40 | 22.6 | 22.5 | 40.6 |
| 2 | 13 | 0.4970 | 0.303 | 3.56 | 29 | 32 | 43 | 30.7 | 32.0 | 42.6 |
| 1 | 13 | 0.4970 | 0.303 | 4.33 | 23 | 14 | 30 | 23.2 | 14.4 | 29.9 |
pbjDir = file.path(analysisDir, 'dxGLM')
# read in text file cluster_easythresh.txt and print as table using knitr
easythreshFile = file.path(pbjDir, 'easythresh_cft3.1_cet0.5', 'cluster_easythresh.txt')
easythreshTab = read.table(easythreshFile, header=TRUE, sep='\t', check.names = FALSE)
knitr::kable(easythreshTab, caption = "Cluster extent inference results from FSL's easythresh command applied to the PBJ OLS results.")

| Cluster Index | Voxels | P | -log10(P) | MAX | MAX X (vox) | MAX Y (vox) | MAX Z (vox) | COG X (vox) | COG Y (vox) | COG Z (vox) |
|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 53 | 0.0717 | 1.140 | 4.19 | 22 | 22 | 36 | 23.7 | 20.8 | 36.5 |
| 7 | 33 | 0.1820 | 0.741 | 4.10 | 32 | 36 | 28 | 32.8 | 37.2 | 27.6 |
| 6 | 30 | 0.2100 | 0.677 | 3.79 | 22 | 32 | 42 | 20.2 | 34.0 | 40.5 |
| 5 | 16 | 0.4260 | 0.370 | 4.20 | 38 | 30 | 43 | 38.3 | 29.9 | 42.7 |
| 4 | 16 | 0.4260 | 0.370 | 3.93 | 36 | 46 | 27 | 36.3 | 46.2 | 28.2 |
| 3 | 15 | 0.4490 | 0.348 | 3.80 | 22 | 22 | 40 | 22.6 | 22.5 | 40.6 |
| 2 | 13 | 0.4970 | 0.303 | 3.56 | 29 | 32 | 43 | 30.7 | 32.0 | 42.6 |
| 1 | 13 | 0.4970 | 0.303 | 4.33 | 23 | 14 | 30 | 23.2 | 14.4 | 29.9 |
pbj bootstrap procedure

This uses a parametric bootstrap procedure to estimate the distribution of the maximum cluster size under the null hypothesis. It is semiparametric, in the sense that the robust covariance estimator allows misspecification of the spatial covariance structure. I've used a small number of bootstraps here (500); in practice, I recommend 2000-5000. I've used a \(p\)-value cluster-forming threshold (CFT) of 0.01. The code also has the option to specify the CFT in terms of the effect size or the chi-square statistic. The rdata_rds argument saves the bootstrap results to an RDS file, so you can load them later without re-running the bootstrap. When you pass this argument, it runs the analyses in the background, so you can go about other tasks in your RStudio session. The status object will let you know when the bootstrap is done.
The default method for pbjInference is to do CEI, but you can also compute \(p\)-values for local maxima, do FWER control, or specify a custom statistic, so it’s really flexible, but you have to know what you’re doing.
This is using the parametric bootstrap to do inference for the diagnostic effect in the OLS model.
Using 500 bootstraps with an equivalent cluster-forming threshold of Z>3.1 (p<0.001) on the chi-square scale.
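The inference chunk is hidden; a sketch of roughly what it could look like. The nboot and cft_chisq argument names are assumptions (cft_chisq matches the argument used for the image and table functions below); rdata_rds is the argument described above, and reloading the saved results with readRDS is also an assumption about how they are stored.

# run the parametric bootstrap (500 samples) for cluster extent inference in the background
status = pbjInference(abideDXglm, nboot = 500, cft_chisq = 3.1^2,
                      rdata_rds = file.path(analysisDir, 'dxGLM', 'pbjInference.rds'))
# once the background job finishes, reload the results saved to the RDS file
abideDXglm = readRDS(file.path(analysisDir, 'dxGLM', 'pbjInference.rds'))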
This uses the standard visualizations from the pbj package to show the results. The cluster extents and centroids should be the same as in the easythresh results above, since they are performing the same analysis, but the FWER p-values will differ because pbj uses the bootstrap procedure to compute them. Instead of reporting a max Z-statistic, the pbj package outputs the maximum robust effect size index (RESI), a standardized effect size that is more comparable across studies (Vandekar and Stephens, 2021). The \(p\)-values are pretty similar to the results in easythresh, but not identical because of the different ways of estimating the null distribution of the maximum cluster size. Likely, they are similar because the assumptions of easythresh are valid at this high cluster forming threshold.
Full formula: ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + dx_group + site_id
Reduced formula: ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + site_id
30010 voxels in mask
StatMap quantiles (0, 0.01, 0.05, 0.95, 0.99, 1):
[ 0, 0, 0.01, 6.05, 9.62, 18.77 ]
sqrtSigma:
[n = 914 ; df = 1 ; rdf = 888 ]
id variable is:
NULL
lmPBJ inference settings:
robust = FALSE ; HC3 = FALSE ; transform = t
pbjInference results:
CEI run with cft = 9.61
image(abideDXglm, cft_chisq = 3.1^2)
tabout = table.statMap(abideDXglm, cft_chisq=3.1^2)[1:10,]
tabout[, 'easythresh p-val'] = NA
# hard-coded. Only got the 8 largest clusters from easythresh
tabout[1:8, 'easythresh p-val'] = easythreshTab$P
# move Max RESI column to the back of tabout
tabout = tabout[, c(setdiff(colnames(tabout), 'Max RESI'), 'Max RESI')]
knitr::kable(tabout, caption = "Ten largest clusters from the parametric bootstrap procedure applied to OLS model in the pbj package compared to the easythresh p-values.")

| | Cluster ID | Cluster Extent | Centroid (vox) | Unadjusted p-value | FWER p-value | easythresh p-val | Max RESI |
|---|---|---|---|---|---|---|---|
| 8 | 8 | 53 | 25, 22, 37 | 0.0042087 | 0.072 | 0.0717 | 0.1345862 |
| 24 | 24 | 33 | 34, 38, 29 | 0.0100153 | 0.128 | 0.1820 | 0.1313660 |
| 2 | 2 | 30 | 21, 35, 41 | 0.0120096 | 0.146 | 0.2100 | 0.1210006 |
| 4 | 4 | 16 | 39, 31, 44 | 0.0377485 | 0.282 | 0.4260 | 0.1347982 |
| 21 | 21 | 16 | 37, 47, 29 | 0.0377485 | 0.282 | 0.4260 | 0.1255713 |
| 5 | 5 | 15 | 24, 24, 42 | 0.0422093 | 0.304 | 0.4490 | 0.1212052 |
| 3 | 3 | 13 | 32, 33, 44 | 0.0544400 | 0.366 | 0.4970 | 0.1130873 |
| 23 | 23 | 13 | 24, 15, 31 | 0.0544400 | 0.366 | 0.4970 | 0.1394304 |
| 10 | 10 | 12 | 44, 36, 38 | 0.0608202 | 0.394 | NA | 0.1121541 |
| 11 | 11 | 11 | 37, 54, 37 | 0.0695590 | 0.424 | NA | 0.1292112 |
Cluster extent inference results from the parametric bootstrap procedure in the pbj package.
pbj

This uses the "robust" covariance approach. Now I'm running it at Z>3.1, the threshold used with easythresh above, but the bootstrap procedure is more flexible, so we can use Z>2.32 as a threshold as well.
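A sketch of the analogous inference call with both cluster-forming thresholds (same caveats as above about the argument names):

# bootstrap CEI for the robust model at chi-square thresholds 3.1^2 = 9.61 and 2.32^2 ≈ 5.38
status = pbjInference(abideDX, nboot = 500, cft_chisq = c(3.1^2, 2.32^2),
                      rdata_rds = file.path(analysisDir, 'dx', 'pbjInference.rds'))
abideDX = readRDS(file.path(analysisDir, 'dx', 'pbjInference.rds'))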
Full formula: ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + dx_group + site_id
Reduced formula: ~ sex + age_at_scan + func_mean_fd + func_fber + func_outlier + site_id
30010 voxels in mask
StatMap quantiles (0, 0.01, 0.05, 0.95, 0.99, 1):
[ 0, 0, 0.01, 5.75, 9.19, 18.86 ]
sqrtSigma:
[n = 914 ; df = 1 ; rdf = 888 ]
id variable is:
NULL
lmPBJ inference settings:
robust = TRUE ; HC3 = TRUE ; transform = t
pbjInference results:
CEI run with cft = 9.61, 5.38
| | Cluster ID | Cluster Extent | Centroid (vox) | Unadjusted p-value | FWER p-value | Max RESI |
|---|---|---|---|---|---|---|
| 1 | 1 | 710 | 29, 30, 36 | 0.0006987 | 0.036 | 0.1336704 |
| 6 | 6 | 187 | 42, 33, 40 | 0.0047842 | 0.182 | 0.1292265 |
| 35 | 35 | 145 | 38, 49, 30 | 0.0066836 | 0.228 | 0.1232188 |
| 2 | 2 | 123 | 20, 35, 41 | 0.0083315 | 0.266 | 0.1190937 |
| 55 | 55 | 92 | 25, 20, 23 | 0.0125374 | 0.348 | 0.0975146 |
| 15 | 15 | 84 | 37, 17, 33 | 0.0143496 | 0.376 | 0.1148660 |
| 48 | 48 | 83 | 40, 20, 24 | 0.0147334 | 0.384 | 0.1088026 |
| 16 | 16 | 42 | 37, 54, 37 | 0.0365157 | 0.618 | 0.1262520 |
| 21 | 21 | 38 | 50, 41, 35 | 0.0421439 | 0.660 | 0.1038552 |
| 40 | 40 | 33 | 21, 39, 29 | 0.0513619 | 0.708 | 0.0947552 |
Cluster extent inference results from the semiparametric bootstrap procedure in the pbj package.
We used open-access data from the Autism Brain Imaging Data Exchange (ABIDE) to analyze associations of fractional amplitude of low frequency fluctuations (fALFF) with diagnosis and age. fALFF quantifies the strength of spontaneous brain activity during rest. Resting state functional magnetic resonance imaging (rs-fMRI) data were preprocessed using the CPAC pipeline, and fALFF images were computed as the ratio of power in the 0-0.08 Hz band to the power in the 0-0.25 Hz band at each voxel. Further details of preprocessing are available on the preprocessed connectomes project website (http://preprocessed-connectomes-project.org/abide/).
fALFF images were analyzed using voxel-wise linear models to assess associations with diagnosis (autism spectrum disorder vs. typically developing controls) and age, controlling for sex, mean relative framewise displacement, foreground-to-background energy ratio, mean number of outliers found in each volume, and site. Group-level analyses were performed using the pbj package in R, which implements robust variance estimation to accommodate unknown heteroskedasticity in the imaging data (Vandekar et al., 2019). Cluster extent inference (CEI) was performed using a cluster-forming threshold of \(Z>2.32\). A parametric bootstrap procedure with 500 bootstrap samples was used to estimate the distribution of the maximum cluster size.
Classic visualization of the results includes:
| Cluster Extent | Centroid (vox) | FWER p-value | Max RESI |
|---|---|---|---|
| 710 | 29, 30, 36 | 0.036 | 0.1336704 |
| 187 | 42, 33, 40 | 0.182 | 0.1292265 |
| 145 | 38, 49, 30 | 0.228 | 0.1232188 |
| 123 | 20, 35, 41 | 0.266 | 0.1190937 |
| 92 | 25, 20, 23 | 0.348 | 0.0975146 |
| 84 | 37, 17, 33 | 0.376 | 0.1148660 |
| 83 | 40, 20, 24 | 0.384 | 0.1088026 |
| 42 | 37, 54, 37 | 0.618 | 0.1262520 |
| 38 | 50, 41, 35 | 0.660 | 0.1038552 |
| 33 | 21, 39, 29 | 0.708 | 0.0947552 |
This is an example of the brain visualization. You would typically include colorbars, hemisphere labels (L/R), and slice labels in a publication figure, but the package doesn't easily provide those features at the moment. I would recommend saving the Z-statistic out and making the final figure in FSL or ITK-SNAP.
# get the Z-statistic map
zstat = stat.statMap(abideDX, method = 'Z')
# make it zero except in the significant ROI
# CEI 2 is the second threshold used to compute CEI inference. Here, it was 2.32^2
clusterMask = abideDX$pbj$ROIs$`CEI 2`
clusterMask[ ! clusterMask %in% clusterTable$`Cluster ID`[clusterTable$`FWER p-value`<0.05] ] = 0
# multiply their image arrays
zstat[,,] = zstat[,,] * as.logical(clusterMask[,,])
# z-index chosen within range of first (largest) cluster
image(zstat, BGimg = templatefile, crop=TRUE, index=33:41)

# plot effect size in cluster 1
out = plotData.statMap(abideDX, emForm = "~ dx_group", cft_chisq=2.32^2, roiInds=1, statMethod = 'Z')
# plot fALFF by diagnosis as a scatter plot and overlay with emmeans estimates and confidence intervals
EM = as.data.frame(out$emmeans[[1]])
plot(jitter(as.numeric(out$data$dx_group), factor=1.5),
EM[ match(out$data$dx_group, EM$dx_group), 'emmean'] + resid(out$models[[1]]),
xlab='Diagnosis', ylab='fALFF', bty='l', xaxt='n', pch=16, col='lightgray', main='Cluster 1: fALFF by diagnosis')
axis(1, at=1:2, labels=levels(out$data$dx_group))
points(EM$emmean, pch=19)
arrows(x0=1:2, y0=EM$emmean, x1=1:2, y1=EM$lower.CL, angle=90, length=0.1)
arrows(x0=1:2, y0=EM$emmean, x1=1:2, y1=EM$upper.CL, angle=90, length=0.1)
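As recommended above, the thresholded Z-statistic map can be written out for final figure-making in FSL or ITK-SNAP; a minimal sketch using writeNifti from RNifti (the output filename is arbitrary):

# save the Z-statistic map masked to the FWER-significant clusters
writeNifti(zstat, file.path(analysisDir, 'dx', 'zstat_FWERsignificant.nii.gz'))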