Brain Age Prediction Using Deep Learning Uncovers Associated Sequence Variants

ARTICLE
https://doi.org/10.1038/s41467-019-13163-9 OPEN
B.A. Jonsson 1,2, G. Bjornsdottir 1, T.E. Thorgeirsson 1, L.M. Ellingsen 2, G. Bragi Walters 1,2,
D.F. Gudbjartsson 1,2, H. Stefansson 1, K. Stefansson 1,2* & M.O. Ulfarsson 1,2*
Machine learning algorithms can be trained to estimate age from brain structural MRI. The
difference between an individual’s predicted and chronological age, predicted age difference
(PAD), is a phenotype of relevance to aging and brain disease. Here, we present a new deep
learning approach to predict brain age from a T1-weighted MRI. The method was trained on a
dataset of healthy Icelanders and tested on two datasets, IXI and UK Biobank, utilizing
transfer learning to improve accuracy on new sites. A genome-wide association study
(GWAS) of PAD in the UK Biobank data (discovery set: N = 12378, replication set:
N = 4456) yielded two sequence variants, rs1452628-T (β = −0.08, P = 1.15 × 10^-9) and
rs2435204-G (β = 0.102, P = 9.73 × 10^-12). The former is near KCNK2 and correlates with
reduced sulcal width, whereas the latter correlates with reduced white matter surface area
and tags a well-known inversion at 17q21.31 (H2).
1 deCODE Genetics/Amgen, Inc., 101 Reykjavik, Iceland. 2 University of Iceland, 101 Reykjavik, Iceland. *email: kstefans@decode.is; mou@hi.is
NATURE COMMUNICATIONS | (2019)10:5409 | https://doi.org/10.1038/s41467-019-13163-9 | www.nature.com/naturecommunications 1

Ageing has a significant structural impact on the brain that correlates with decreased mental and physical fitness1 and increased risk of neurodegenerative diseases such as Alzheimer’s disease2 and Parkinson’s disease3. Recent publications have demonstrated that MRIs can be used to predict chronological age with reasonably good accuracy1,4,5. Such predictions provide an estimate of biological brain age in independent samples. The traditional way to perform brain age prediction is to extract features from brain MRIs followed by classification or regression analysis. This includes extracting principal components4, cortical thickness and surface curvature6, volume of gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF)7, and constructing a similarity matrix8. The drawback of using feature extraction methods is loss of information since the features are likely not designed explicitly for extracting information relevant to brain age. Recently, deep learning (DL) methods have garnered much interest9. These methods learn features that are important without a priori bias or hypothesis. Convolutional neural networks (CNNs)10 are deep learning techniques that are especially powerful for image processing and computer vision. Previously, they have been applied to brain age prediction11,12. Notably, Cole et al.12 implemented a 3D CNN trained on T1-weighted MRIs to predict brain age and achieved promising results.

PAD (the difference between predicted brain age and chronological age) estimates the deviation from healthy ageing. Studies have shown that positive PAD correlates with measures of reduced mental and physical fitness, including weaker grip strength, poorer lung function, slower walking speed, lower fluid intelligence, higher allostatic load, and increased mortality risk1. In addition, positive PAD has been shown to associate with cognitive impairments5,8,13,14, diabetes15, traumatic brain injuries8, schizophrenia16,17, and chronic pain18. On the other hand, a negative PAD associates with higher educational attainment19, increased physical activity19, and meditation20. Moreover, PAD has been demonstrated to be heritable12,21 and to have a polygenic overlap with brain disorders such as schizophrenia, bipolar disorder, multiple sclerosis, and Alzheimer’s disease21. Furthermore, the high degree of genetic correlation found among psychiatric and some neurological disorders suggests that current diagnostic boundaries do not necessarily reflect underlying biology22. Hence, defining a novel phenotype capturing global age-related changes in brain structure could, via variants in the sequence of the genome that associate with these changes, provide novel biological insights.

Here we present a new brain age prediction method (Fig. 1) that uses a 3D CNN trained on MRIs to predict brain age. The input data are a T1-weighted image registered to Montréal Neurological Institute (MNI) space and data derived from the T1-weighted image, i.e., a Jacobian map, and gray and white matter segmented images (Fig. 1). The input data also include information about the subject’s sex and the type of MRI scanner. The output of the network is the predicted brain age.

As mentioned above, Cole et al.12 trained a 3D CNN to perform brain age prediction. Our network is different in four key ways. (1) We use a significantly different architecture. While their architecture resembles a standard VGGNet architecture23, our architecture uses the recent ResNet design24. One of the drawbacks of the VGG architecture is that the vanishing gradient problem limits the potential depth of the network. In contrast, the ResNet architecture has no such depth limits. ResNets also have smoother loss surfaces25, which in turn helps speed up convergence. (2) We add inputs to the final CNN layer to factor in information about sex and scanner. (3) Our technique is the first to use deformation information encoded in Jacobian maps to predict brain age. (4) As we have mentioned, our method combines predictions from multiple CNNs by either averaging predictions or by training a data blender.

In experiments, we compare our proposed method to a few brain age prediction methods based on feature extraction and machine learning. We also demonstrate that transfer learning is useful for adapting a CNN trained to predict brain age on one site to a new site while retaining predictive accuracy. And we look at how the PAD calculated with our method is affected by random weight initialization and retraining. We then check for associations between PAD and performance on neuropsychological tests. Finally, we perform genetic analysis on PAD using UK Biobank data, resulting in identification of associations with five sequence variants for which we provide detailed phenotypic characterizations.

Results
Combining CNN outputs improves prediction accuracy. Our brain age prediction method was developed using images from structural brain MRIs for 1264 healthy Icelanders. To overcome problems caused by training a DL method on such a small dataset we use multiple images of the same individuals and utilize a data augmentation strategy. We start off by training the method independently on the four previously mentioned image types (Table 1A). The CNN that predicts the test set with the least error is the CNN trained on T1-weighted images followed by the CNN trained on WM segmented images (Supplementary Figs. 4 and 5 show scatter plots of the CNN test set predictions against chronological age).

Having four predictions from four different data sources opens up the possibility of combining the predictions. The most straightforward way of fusing the forecasts is by using a majority voting scheme, e.g., by averaging the predictions made by the four CNNs. Another way to combine forecasts is to implement a data blender, for example, by implementing a linear regression model trained to predict brain age from the four CNN brain age predictions. This technique attempts to find the best linear combination of the four brain age predictions, so in theory it should be at least as good as the best predicting CNN method, at least on the data the blender is fitted to. To demonstrate this, we tried combining CNN brain age predictions using majority voting and linear regression data blending (Table 1B). Comparing the test set results of Table 1B to the results in Table 1A, we see that combining predictions results in lower test error than achieved by the CNN trained on T1-weighted images.

It is not straightforward to compare the accuracy of our method to previous brain age prediction methods, because they are evaluated on other datasets. However, to establish a baseline for the CNN-based techniques, we investigated methods based on feature extraction such as surface-based morphometry (SBM)26, voxel-based morphometry (VBM)27, and similarity matrices. Machine learning regression methods were trained on these three types of features separately. For each feature type, we trained eight different types of regression methods. The list of methods we tried is far from exhaustive; instead, these methods were chosen to represent commonly used and relatively simple to tune regression methods. In addition, we considered methods such as relevance vector regression (RVR)28 and Gaussian process regression (GPR)29, which have previously been successfully used to predict brain age4,8. Table 1C shows the prediction results for the regression models with the lowest test error for each feature type (the Methods section and Supplementary Table 1 include more information and results for the regression methods trained on these features). In addition, Table 1C shows results for combining the best predictions for SBM, VBM and similarity matrix features using the same methods used to combine the
[Fig. 1a flowchart components: Data, Preprocessing, four 3D CNNs, Combination, Brain age prediction]
Fig. 1 Illustration of the proposed method and input data. a A flowchart showing a high-level overview of the proposed brain age prediction system.
b Examples of image types generated by the preprocessing step. From left to right: a registered T1-weighted slice, a Jacobian map slice, a GM segmented
slice, and a WM segmented slice.
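The last stage of the flowchart in Fig. 1a yields one brain age estimate per subject, from which PAD follows by subtraction. The snippet below is a minimal sketch of that final step only. The four prediction values and the chronological age are hypothetical numbers chosen for illustration, and the function names are not taken from the authors' code.

```python
from statistics import fmean

def combine_by_averaging(predictions):
    """Average the brain age estimates produced for one subject (years)."""
    return fmean(predictions)

def predicted_age_difference(predicted_age, chronological_age):
    """PAD = predicted brain age minus chronological age (years)."""
    return predicted_age - chronological_age

# Hypothetical outputs of the four CNNs (T1, Jacobian map, GM, WM) for one subject.
cnn_outputs = {"T1": 63.0, "JM": 66.0, "GM": 64.0, "WM": 67.0}
chronological_age = 61.0  # hypothetical

brain_age = combine_by_averaging(list(cnn_outputs.values()))
pad = predicted_age_difference(brain_age, chronological_age)
print(f"brain age = {brain_age:.1f} years, PAD = {pad:+.1f} years")
```

A positive PAD, as in this toy case, means the brain is predicted to be older than the subject's chronological age.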
Table 1 Chronological age prediction accuracy for the considered methods.

     Type                        Method  Val MAE  Val R2  Test MAE  Test R2  No. I
(A)  T1-weighted                 CNN     3.996    0.810   4.006     0.829    1815
     Jacobian                    CNN     4.801    0.710   4.804     0.758    1815
     Gray matter                 CNN     4.766    0.721   4.641     0.776    1815
     White matter                CNN     4.676    0.735   4.189     0.812    1815
(B)  MV (T1 and JM)              CNN     4.102    0.803   3.919     0.841    1815
     MV (GM and WM)              CNN     4.172    0.790   3.674     0.849    1815
     MV (T1, JM, and GM)         CNN     3.964    0.813   3.838     0.847    1815
     MV (T1, JM, GM, and WM)     CNN     3.845    0.849   3.584     0.849    1815
     LRB (T1, JM, GM, and WM)    CNN     3.581    0.847   3.388     0.872    1815
(C)  SBM                         RR      5.268    0.689   5.176     0.697    1320
     VBM                         GPR     4.278    0.781   4.317     0.766    1794
     SM                          RR      4.898    0.722   4.937     0.728    1815
     MV (SBM, VBM, and SM)       GPR/RR  4.008    0.808   3.940     0.761    1246
     LRB (SBM, VBM, and SM)      GPR/RR  3.906    0.812   3.849     0.766    1246

(A) The performance of the CNNs that were trained using T1-weighted images, Jacobian maps, GM and WM segmented images. Training set (N = 1171), validation set (N = 298), and test set (N = 346). (B) The performance when combining CNN predictions. (C) The results of the best methods trained on SBM, VBM and similarity matrix features.
(A) The best results are shown in bold. (B) The training/validation/test split is the same as for (A). (C) The cross validation was performed using 10-fold cross validation. The SBM feature training/test split was 1056/264, the VBM feature training/test split was 1438/356, and the SM feature training/test split was 1469/346.
CV cross validation, GM gray matter, I images, JM Jacobian map, LRB linear regression blender, MV majority voting, MAE mean absolute error, RR ridge regression, SM similarity matrix, SBM surface-based morphometry, val validation, VBM voxel-based morphometry, WM white matter
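The two combination schemes compared in Table 1B, averaging (MV) and a linear regression blender (LRB), can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the ages and the two prediction columns are made-up numbers, and the blender is fitted with a small hand-written least-squares solver so that the block needs nothing beyond the standard library. Because the simple average is itself one particular linear combination, the fitted blender cannot have a higher squared error than the average on the data it was fitted to; nothing of the sort is guaranteed on held-out data.

```python
def mse(pred, truth):
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)

def mae(pred, truth):
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

def majority_vote(rows):
    """Average the per-model predictions for each subject."""
    return [sum(r) / len(r) for r in rows]

def fit_blender(rows, ages):
    """Ordinary least squares for age ~ b0 + sum_j b_j * prediction_j."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Augmented normal equations [X^T X | X^T y].
    a = [[sum(d[i] * d[j] for d in design) for j in range(k)]
         + [sum(d[i] * y for d, y in zip(design, ages))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def blend(rows, coefs):
    """Apply fitted intercept and weights to each row of predictions."""
    return [coefs[0] + sum(c * p for c, p in zip(coefs[1:], r)) for r in rows]

# Hypothetical chronological ages and predictions from two models (years).
ages = [20.0, 35.0, 50.0, 65.0, 80.0, 45.0]
preds = [(24.0, 18.0), (37.0, 38.0), (49.0, 53.0),
         (61.0, 70.0), (73.0, 84.0), (45.0, 44.0)]

mv = majority_vote(preds)
coefs = fit_blender(preds, ages)
lrb = blend(preds, coefs)
single_mse = [mse([r[j] for r in preds], ages) for j in range(2)]

print("single-model MSE:", [round(v, 3) for v in single_mse])
print("MV  MSE / MAE:", round(mse(mv, ages), 3), round(mae(mv, ages), 3))
print("LRB MSE / MAE:", round(mse(lrb, ages), 3), round(mae(lrb, ages), 3))
```

In practice the blender weights would be fitted on one split and evaluated on another, as the validation and test columns of Table 1 suggest.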
CNN predictions. Similarly to the CNNs, the combined predictions have lower test MAE than any of the methods limited to single feature types. However, if we compare the results in Table 1B and C we see that the predictions made with combined CNN outputs are more accurate than any of those based on the feature extraction methods.

Testing the CNN on other datasets. Next, we examine how the method performs if we predict brain age of images from other datasets. To do so, we evaluate it on the IXI and UK Biobank30 datasets and combine predictions using majority voting. We use this combination method rather than data blending because it has similar accuracy to the linear regression blender with the added benefit that it is unnecessary to train an extra linear model on the predictions. We observe that the initial prediction error of the method is high (Table 2). The problem is that there can be subtle differences between data from different scanning sites which can cause a model trained on one site to fail when predicting on the other site. There are multiple reasons for this. The MRI scanner type and parameters between sites can be different, which can cause differences in resolution, contrast and noise levels. Also, the distribution of age can be different between sites; for example, it is problematic if the new site has a wider age range than the training set.

We hypothesize that a CNN that is already proficient at predicting brain age at one site only needs a small adjustment to adapt to data from a new site. A transfer learning strategy achieves this: First, we freeze the model weights of the convolutional layers so that only the fully connected layers are trainable. Second, the CNN is re-trained on a portion of the data from the new site. An advantage of this strategy is that there are

Table 2 UK Biobank and IXI prediction performance with and without transfer learning.

          IXI                               UK Biobank
TL used   Val MAE  Val R2  Val set size     Test MAE  Test R2  Test set size
No        6.420    0.778   104              8.494     −0.630   12395
Yes       4.149    0.907   104              3.631     0.614    12395

The best results are shown in bold
S subjects, TL transfer learning, val validation
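The transfer learning step whose effect is reported in Table 2 freezes the convolutional weights and retrains only the fully connected layers on data from the new site. The toy below illustrates that idea on a deliberately tiny model: a single scalar `conv_w` stands in for the frozen feature extractor and `fc_w`, `fc_b` stand in for the trainable head. All names, values, and the synthetic "new site" data are assumptions made for illustration; this is not the authors' network or training code.

```python
def features(x, conv_w):
    """Stand-in for the frozen convolutional feature extractor."""
    return conv_w * x

def predict(x, params):
    return params["fc_w"] * features(x, params["conv_w"]) + params["fc_b"]

def mse(params, xs, ys):
    return sum((predict(x, params) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fine_tune(params, xs, ys, lr=0.1, steps=2000):
    """Gradient descent that updates only the fully connected parameters."""
    p = dict(params)
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            f = features(x, p["conv_w"])
            err = p["fc_w"] * f + p["fc_b"] - y
            grad_w += 2 * err * f / n
            grad_b += 2 * err / n
        p["fc_w"] -= lr * grad_w
        p["fc_b"] -= lr * grad_b
        # conv_w is never touched: the feature extractor stays frozen.
    return p

# Hypothetical weights learned on the original site.
pretrained = {"conv_w": 2.0, "fc_w": 10.0, "fc_b": 45.0}

# Hypothetical new-site data with a different input-to-age relationship.
new_site_x = [0.1 * i for i in range(1, 11)]
new_site_y = [30.0 * x + 40.0 for x in new_site_x]

adapted = fine_tune(pretrained, new_site_x, new_site_y)
print("error before:", round(mse(pretrained, new_site_x, new_site_y), 4))
print("error after: ", round(mse(adapted, new_site_x, new_site_y), 4))
print("conv_w unchanged:", adapted["conv_w"] == pretrained["conv_w"])
```

Only two parameters are optimized here, which mirrors the point made in the text that freezing most of the network leaves fewer parameters to fit from the new site's data.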
now fewer parameters to train, which means we can use less data and training will be faster. We carry out the transfer learning strategy by retraining the majority voting CNN on 440 images from the IXI dataset.

The re-trained CNN is validated on 104 images from the IXI dataset left out during training (validation set) and tested on 12395 images from the UK Biobank dataset (test set). Table 2 shows that the prediction accuracy is increased significantly by doing so. In addition, the test set predictions before and after transfer learning are shown on a scatter plot against chronological age (Supplementary Fig. 6). Surprisingly, the accuracy of predictions for the UK Biobank site improves even though the CNN was not explicitly trained on it. This is intriguing and is perhaps explained by the fact that the IXI set includes a wider age range than the Icelandic set and includes 3T MRI images unlike the Icelandic set.

In subsequent sections, the CNNs trained with transfer learning on the IXI sample will be used in downstream analysis of the UK Biobank sample. While it is likely that transfer learning on a small subset of the UK Biobank sample will lower the UK Biobank test MAE, we refrained from doing this because we want to use as many subjects as possible in the downstream analysis and because of the limited age range in the UK Biobank sample (all subjects are in the age range 45–80 years). Training on such a limited age range would severely bias the model towards predicting ages inside this range. To get around this, it is necessary to train the model on a sample with a wider age range. This is why we use the CNNs trained on the IXI sample, which includes subjects in the age range 20–86 years, in the downstream analysis.

Effect of random CNN weight initialization on PAD. We know that because CNNs start out in random initial states, and because they have highly non-convex loss functions25, it is possible that two randomly initialized instances of our brain age prediction method will converge to two distinct local minima. These states could in theory both predict age equally well but have uncorrelated PAD values. Here we face a potential problem, because in the absence of a ground truth for the PAD there is no way to tell if either one of these PAD predictions is accurate. This sort of unreliable CNN behavior would be problematic for any downstream analysis that utilizes the brain age prediction, because any conclusions made about the PAD would depend on the initialization of the CNN. In light of this, it would be reassuring if we could demonstrate that our method generally converges to similar PAD predictions after training.

To test this, four additional randomly initialized instances of our brain age prediction method are trained and the agreement between their PADs is examined. This procedure entails repeating these three main steps four times: (1) Train four CNNs on the Icelandic dataset on the four previously mentioned image types. (2) Freeze the convolutional layers and train the CNNs on the IXI dataset (transfer learning step). (3) Predict brain age in the UK Biobank dataset using CNNs, combine the predictions with majority voting and calculate PAD values.

After repeating these steps, we get four instances of the brain age prediction method that predict brain age of the 12395 subjects in the UK Biobank with mean absolute error (MAE) equal to 4.6, 5.5, 5.4, and 4.9 years, respectively. The reason why the error is higher here compared with the original results is that we did not reinitialize and retrain the CNNs in cases where the optimization got stuck in a poor local minimum or a saddle point. Nevertheless, if we look at the agreement of the original and the four new PADs we find that the intraclass correlation (ICC) is estimated to be equal to 0.86 (95% confidence interval [CI] = [0.855, 0.863]). This indicates that the UK Biobank PAD calculated using our method stays rather consistent between the five different training runs and is relatively robust to random weight initialization.

Associations between PAD and performance on neuropsychological tests. As mentioned above, previous studies have linked high PAD to cognitive impairment5,8,13,14. In light of this, we are interested in whether PAD associates with performance on neuropsychological tests, specifically performance on tests administered by the UK Biobank that are designed to measure fluid intelligence, numeric memory, visual memory, prospective memory, simple processing speed, complex processing speed, visual attention, and verbal fluency. To estimate PAD in the UK Biobank, we train four CNNs on the Icelandic set, then the IXI set using transfer learning, and combine their predictions using majority voting. We did not find evidence of association between PAD and performance on the fluid intelligence, numeric memory, pairs matching, and prospective memory tests (Supplementary Table 2 includes these results). However, we see from Table 3 that PAD is associated with worse performance on the digit substitution test (DSST), trail making tests (TMTs), and the reaction time test (a more detailed description of the tests can be found in Supplementary Notes 1–7). As expected, these results indicate that PAD is in fact associated with cognitive impairment.

Genome-wide association study. PAD has previously been shown to be heritable12,21; however, to our knowledge no sequence variants conferring risk of or protecting against PAD have been identified. In order to look for such variants, we ran a genome-wide association scan (GWAS) in the UK Biobank sample on PAD (same PAD as from the section Testing the CNN on other datasets) using BOLT-LMM31. This scan yields two sequence variants, rs2435204-G and rs1452628-T (Fig. 2 and Table 4A) (Supplementary Figs. 8 and 9 show locus zoom plots for the two genome-wide significant variants). In addition, given that sequence variants known to associate with brain structure are likely to be enriched for variants that associate with PAD, we decided to test a smaller set of 331 brain structure variants for association with PAD. This yielded associations with three additional variants (Table 4B). The ‘Statistical methods’ section describes how the brain structure variants were identified.

The high number of tests conducted in GWAS combined with the general small effect size of common markers greatly

[Fig. 2 plot axes: chromosome (x), –log10(p) (y)]
Fig. 2 Manhattan plot of the GWAS results for the UK Biobank data. The horizontal line denotes the P value threshold for genome-wide significant effect.
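The vertical axis of a Manhattan plot is –log10(p), so the significance line sits at –log10 of the chosen threshold. The sketch below applies this to the discovery-set P values listed in Table 4. Two thresholds are assumptions on our part, since the text here does not state them: the conventional genome-wide value of 5 × 10^-8, and a Bonferroni correction of 0.05/331 for the restricted set of 331 brain structure variants.

```python
import math

# Assumption: conventional genome-wide threshold, not stated in the text.
GENOME_WIDE_P = 5e-8
# Assumption: Bonferroni correction over the 331 pre-selected variants.
RESTRICTED_P = 0.05 / 331

# Discovery-set P values as listed in Table 4 (A, B).
discovery_p = {
    "rs2435204": 1.4e-12,
    "rs1452628": 2.3e-09,
    "rs2790099": 8.9e-06,
    "rs6437412": 6.8e-06,
    "rs2184968": 7.5e-05,
}

def neg_log10(p):
    """Height of a variant on the Manhattan plot."""
    return -math.log10(p)

threshold_height = neg_log10(GENOME_WIDE_P)
genome_wide = [rs for rs, p in discovery_p.items() if p < GENOME_WIDE_P]
restricted_only = [rs for rs, p in discovery_p.items()
                   if GENOME_WIDE_P <= p < RESTRICTED_P]

print(f"threshold line at -log10(p) = {threshold_height:.2f}")
for rs, p in discovery_p.items():
    print(f"{rs}: -log10(p) = {neg_log10(p):.2f}")
print("genome-wide significant:", genome_wide)
print("significant only in the restricted set:", restricted_only)
```

Under these assumed thresholds the split matches the grouping in Table 4: two variants clear the genome-wide line and three more pass only the restricted-set threshold.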
Table 3 Pearson’s r correlation between PAD and performance on neuropsychological tests.
Neuropsychological test PAD correlation 95% CI P Value No. subjects

DSST −0.080 (−0.104, −0.054) 4.3e−11 6849
TMT B 0.076 (0.051, 0.103) 3.1e−09 6076
TMT A 0.053 (0.027, 0.078) 3.8e−05 6076
TMT B - A 0.050 (0.024, 0.075) 1.3e−04 5918
Reaction time 0.030 (0.012, 0.047) 7.9e−04 12387
Negative DSST, positive TMT, and positive reaction time indicate worse performance
CI confidence interval, DSST digit substitution test, TMT trail making test
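For readers who want to sanity-check intervals like those in Table 3, the sketch below computes Pearson's r and an approximate confidence interval via the Fisher z-transformation. The use of the Fisher transformation is an assumption: the text does not say how the tabulated intervals were obtained, so the result should be expected to be close to, but not necessarily identical to, the published bounds. The small x/y sample is made up purely to exercise `pearson_r`.

```python
import math
from statistics import NormalDist

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fisher_ci(r, n, level=0.95):
    """Approximate CI for a correlation using the Fisher z-transformation."""
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# Hypothetical toy sample, only to show pearson_r in use.
toy_pad = [-3.0, -1.0, 0.5, 2.0, 4.0]
toy_score = [22.0, 20.0, 21.0, 18.0, 17.0]
print("toy r:", round(pearson_r(toy_pad, toy_score), 3))

# DSST row of Table 3: r = -0.080 with 6849 subjects.
low, high = fisher_ci(-0.080, 6849)
print(f"approximate 95% CI for DSST: ({low:.3f}, {high:.3f})")
```

With several thousand subjects the interval is narrow even for a weak correlation, which is why correlations of magnitude below 0.1 in Table 3 still reach small P values.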
Table 4 Sequence variants associated with PAD estimated using BOLT-LMM.

     rs Number   Position (GRCh38)   Allele (min/maj)   MAF (%)   Effect   95% CI            P Value
(A)  rs2435204   chr17:45910839      G/A                26.6      0.11     (0.08, 0.14)      1.4e−12
     rs1452628   chr1:214966544      T/A                36.2      −0.08    (−0.10, −0.05)    2.3e−09
(B)  rs2790099   chr6:45475612       C/T                36.0      −0.06    (−0.09, −0.03)    8.9e−06
     rs6437412   chr3:194747684      C/T                28.2      −0.06    (−0.09, −0.04)    6.8e−06
     rs2184968   chr6:126439848      C/T                46.0      0.05     (0.03, 0.08)      7.5e−05
(C)  rs2435204   chr17:45910839      G/A                26.6      0.08     (0.03, 0.13)      1.5e−03
     rs1452628   chr1:214966544      T/A                36.2      −0.07    (−0.12, −0.03)    8.8e−04
(D)  rs2790099   chr6:45475612       C/T                36.0      −0.07    (−0.11, −0.02)    2.9e−03
     rs6437412   chr3:194747684      C/T                28.2      −0.05    (−0.09, 0.00)     4.9e−02
     rs2184968   chr6:126439848      C/T                46.0      0.06     (0.02, 0.10)      2.9e−03

(A, B) Association between sequence variants and PAD for 12378 subjects in discovery set. (A) Genome-wide significant sequence variants. (B) Sequence variants associated with structural MRI brain phenotypes that also associate with PAD. (C, D) Association between sequence variants and PAD for 4456 subjects from the replication set. (C) Genome-wide significant sequence variants. (D) Sequence variants associated with structural MRI brain phenotypes that also associate with PAD. Note that the reported effect sizes are for PAD normalized to unit variance. Before normalization the standard deviation of PAD was ~4 years. Thus the associated lowering of the protective allele of rs1452628 is approximately −0.32 years
CI confidence interval, MAF minor allele frequency
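The note to Table 4 converts a normalized effect into years by multiplying by the pre-normalization standard deviation of PAD (about 4 years). The short sketch below repeats that arithmetic for all five discovery-set effects. The value 4.0 is the approximate figure quoted in the note, so the outputs are rough per-allele estimates rather than exact values.

```python
PAD_SD_YEARS = 4.0  # approximate pre-normalization SD of PAD, from the Table 4 note

# Discovery-set effects in units of PAD standard deviations (Table 4 A, B).
normalized_effect = {
    "rs2435204-G": 0.11,
    "rs1452628-T": -0.08,
    "rs2790099-C": -0.06,
    "rs6437412-C": -0.06,
    "rs2184968-C": 0.05,
}

def effect_in_years(beta_sd, sd_years=PAD_SD_YEARS):
    """Convert an effect on unit-variance PAD to an approximate effect in years."""
    return beta_sd * sd_years

years = {allele: effect_in_years(beta) for allele, beta in normalized_effect.items()}
for allele, value in years.items():
    print(f"{allele}: {value:+.2f} years per allele (approximate)")
```

For rs1452628-T this reproduces the figure of roughly −0.32 years given in the table note.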
increases the risk of false positives32. To protect against potential confound effects we adjusted for potential nuisance variables, such as age, gender, total intracranial volume, principal components from genetic ancestry analysis, head motion, genotyping array, and imaging center. In addition, we removed individuals of non-white British ancestry and one subject from each related pair of individuals (the Statistical Methods section provides more information about the exact adjustment procedure). Then, to thoroughly vet each hit, we took two steps. (1) We performed a replication test on held-out data. (2) We checked whether the reported variants associate with other phenotypes related to brain ageing.

(1) The five reported sequence variants also associated with PAD in a replication set of 4456 subjects (Table 4 [C, D]). Four other variants which came up in the discovery stage were omitted because they did not replicate. (2) The identified sequence variants also associate with brain structure likely to be affected by brain ageing. Associations between these sequence variants and SBM/VBM brain structure phenotypes and correlation between PAD and the brain structure phenotypes are shown in Supplementary Tables 4 and 5. Supplementary Table 4 shows that both PAD and rs1452628-T associate with lower CSF throughout the cerebral cortex, which is consistent with reduced cortical sulcal openings. On the other hand, rs2435204-G associates with lower total white matter surface area, and reduced area in a number of cortical brain regions (Supplementary Table 5). Although it was known a priori that the other three sequence variants would associate with structural brain

phenotypes, the specific structural brain phenotypes that associate with these variants are listed in Supplementary Tables 6–8.

Running LD score regression33 on the GWAS results, we estimated the SNP-heritability for PAD to be h²_SNP = 0.264 (95% CI = [0.178, 0.350]). In addition, the intercept of the LD score regression model is equal to 0.991 (95% CI = [0.979, 1.003]), which indicates that the model did not find any evidence of confounding effects in the PAD GWAS results. This h²_SNP estimate is close to the one previously estimated by Kaufmann et al.21 (h²_SNP = 0.1828 [SE = 0.02]). And predictably our h²_SNP is lower than the narrow-sense heritability estimate of PAD estimated by Cole et al.12 (h² = 0.66 [SE = 0.09]) using a twin study sample.

Discussion
Here, we have presented a novel deep learning approach, using residual convolutional neural networks to predict brain age from a T1-weighted MRI, a Jacobian map, and gray and white matter segmented images, to study the discrepancy between age-related structural brain changes and chronological age. The MRI-based deep learning system was shown to predict brain age from T1-weighted MRI data with MAE = 3.39 and R² = 0.87 on test data. Comparing our approach to other machine learning methods trained on surface-based morphometry, voxel-based morphometry, and similarity matrix features, we showed that our approach predicts brain age more accurately. We showed that transfer learning can be used to efficiently increase prediction accuracy for new sites. The PAD calculated using this method was shown to be relatively robust to random weight initialization and retraining, a result that indicates that the PAD estimated using our method can be used as a reliable phenotype in the study of brain ageing, as well as in the study of specific disorders of the brain. We also proposed that PAD could be an informative phenotype for genetic association studies, and indeed, our association analysis of PAD in a discovery set of 12378 subjects and replication set of 4456 subjects yielded five sequence variants.

The sequence variant with the strongest association, rs2435204-G, tags the H2 (inverted) form of the 17q21.31 inversion polymorphism34. This inversion spans ~1 Mb and includes 10 genes, including MAPT, a gene that encodes the tau protein which has been implicated in various dementias35. In addition, micro-deletions within the inversion are known to cause intellectual disability36. The H1 inversion haplotype has been associated with increased risk of Parkinson’s disease, male-pattern baldness, and several other phenotypes, whereas H2 has been associated with a number of phenotypes including neuroticism37, fibromyalgia18, lower educational attainment, increased fecundity38, and smaller intracranial volume (ICV)39 (note that PAD is adjusted for ICV, thus the observed effect on PAD is not caused by ICV). Due to the extensive linkage disequilibrium (LD) in the 17q21.31 inversion region, reported markers for various associations in the region often differ between studies. For example, the most recent GWAS meta-analysis of Parkinson’s disease reports an association with rs17649553-T, that is fixated on and highly correlated with the H2-tagging rs2435204-G (r² = 0.82, D′ = 1), with OR = 0.78 (95% CI = [0.76, 0.80]), P = 1.26 × 10^-68 (their meta-analysis was carried out with a fixed-effects model based on inverse-variance weighting)40.

rs2435204-G also associates with brain structure phenotypes. Supplementary Table 5 shows that both PAD and rs2435204-G associate with increased thickness and decreased area in cortical brain regions. Interestingly, this pattern of increased thickness and decreased area has previously been associated with neuroticism41. Thus, lifestyle or phenotypes associated with a high neuroticism score, including anxiety, worry, fear, anger, frustration, depressed mood, and loneliness may associate with PAD.

The other genome-wide significant sequence variant, rs1452628-T, is located close to KCNK2 (also known as TREK1), which belongs to the two-pore domain potassium channel family and is mainly expressed in the brain42. In mice, KCNK2 has been implicated in neuroinflammation43, cerebral ischemia44, and blood-brain barrier dysfunction45. rs1452628-T correlates with SNPs that have previously been associated with cortical sulcal opening and GM thickness, rs6667184 (r² = 0.68), and rs864736 (r² = 0.49)46.

In addition, we identified three sequence variants associated with PAD by restricting the analysis to SNPs known a priori to associate with structural phenotypes. (1) rs2790099-C is located in an intron of RUNX2, a gene that encodes the RUNX2 protein which is essential for osteoblastic differentiation and skeletal morphogenesis and has been shown to play several roles in cell cycle regulation47. Supplementary Fig. 7 shows that rs2790099-C is a possible cis-eQTL of RUNX2 and it is most expressed in the basal ganglia (caudate and putamen). This lines up with the a priori brain structure GWAS that shows that rs2790099-C has genome-wide significant associations with white matter volume of regions in the basal ganglia (putamen and pallidum) (Supplementary Table 6). (2) rs6437412-C is an intron variant of LINC01968 that associates with increased cortical CSF (Supplementary Table 7). (3) rs2184968-C is located in an intron of CENPW, a gene that has previously been associated with traits such as height48, cognitive performance49, and male-pattern baldness50. Our analysis shows that rs2184968-C is associated with increased CSF in subcortical regions and increased size of the fourth ventricle (Supplementary Table 8).

Confound effects are a problem for big imaging studies due to the huge number of imaging artifacts that can potentially influence both imaging and non-imaging variables of interest32. Some of the confound effects we have tried to control for are effects due to age, sex, head size, population structure, and scanner type. Head motion is another potentially problematic confound effect, because it causes reduction of estimated gray matter volume and thickness in MRI images similar to what we expect to see due to ageing51. While head motion is not important in the evaluation of our method (see Cole et al.12), it is potentially a problematic confound for GWAS analysis because certain clinical groups associate more with scanner motion. Elliott et al.52 suggest using fMRI-derived head motion estimates to correct for confound effects due to head motion when running GWAS analysis on brain structure phenotypes. We adjusted PAD for head motion as they suggest; however, this correction only had a small effect on our results. Other potential confounds that we looked at were sample relatedness (the first 40 principal components from genetic ancestry analysis), genotyping array, and the assessment center where neuropsychological testing was performed. As with head motion, adjusting for these variables did not affect our results.

From our analysis we see that PAD associates with worse performance on neuropsychological tests, specifically poor performance on DSST, TMT, and the reaction time tests (Table 3). Interestingly, both the DSST and the reaction time test are designed to measure cognitive processing speed. The TMT is designed to assess visual attention. However, psychomotor speed is a factor in successful TMT performance53. Furthermore, a decline in processing speed along with impairment of reasoning, memory, and executive function are well documented to occur in age-associated cognitive decline54. As such, these results are in line with other studies that link high PAD to cognitive impairment5,8,13,14. We note that the association between PAD and TMT is consistent with the previous finding of Cole et al.8. However, the large dataset used here gives more conclusive results. Supporting this, we additionally find that schizophrenia, a

brain disorder characterized by complex patterns of cognitive

impairment, correlates with positive PAD (greater brain ageing
Max pooling
Max pooling
Max pooling
Res block 1
Res block 2
Res block 3
MR image
than chronological age) and (Supplementary Table 3).
In conclusion, we have presented a new method for predicting brain age using cutting-edge machine learning techniques. Our deep learning method produces a single measure (PAD) from raw MRI data that captures complex underlying correlated changes in MRI and can be used to study various traits and diseases, and in particular for genetic discovery. Using such a method represents one potential way of overcoming the challenges of high-dimensional data and multiple testing that plague MRI research. Applying our method to large genomic datasets such as the UK Biobank has enabled us to identify novel associations between sequence variants and brain ageing. The variants identified are common SNPs with small effects on PAD (Table 4), accounting for only a fraction of the trait variance. However, these first findings provide a foothold, and further research into these loci as well as additional GWAS have the potential to shed light on the biological underpinnings of the ageing brain and its connection to various diseases and disorders.

Fig. 3 A flowchart showing the components of the proposed CNN architecture. Residual (Res), fully connected (FC). The flowchart runs from the MR image through Res blocks 1–5, each followed by max pooling, to the FC block and the age prediction.
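As a rough check on the architecture summarized in Fig. 3, the layer arithmetic reported in the Methods (input size 121 × 145 × 121, five residual blocks each followed by max pooling of stride 2, and the feature-map rule $2^{n+2}$) can be traced in a few lines of Python. The sketch below is not the authors' implementation. It assumes that each residual block preserves the spatial size and that each max pooling layer rounds odd dimensions up ("same"-style padding); the padding is not stated in the text, but this assumption reproduces the reported 4 × 5 × 4 output. The helper names `pooled_shape` and `trace_cnn_shapes` are hypothetical.

```python
import math


def pooled_shape(shape, stride=2):
    """Spatial shape after one max pooling layer.

    Assumption: 'same'-style padding, so odd dimensions are rounded up.
    """
    return tuple(math.ceil(d / stride) for d in shape)


def trace_cnn_shapes(input_shape=(121, 145, 121), n_blocks=5):
    """Return (block number, feature maps, spatial shape) after each
    residual block and its max pooling layer."""
    shape = input_shape
    trace = []
    for n in range(1, n_blocks + 1):
        feature_maps = 2 ** (n + 2)  # rule 2^(n+2) given in the Methods
        # Assumption: the residual block (stride 1) keeps the spatial size,
        # so only the pooling layer changes it.
        shape = pooled_shape(shape)
        trace.append((n, feature_maps, shape))
    return trace


trace = trace_cnn_shapes()
for n, maps, shape in trace:
    print(f"block {n}: {maps} feature maps of size {shape}")

# Number of inputs to the fully connected block: feature maps x voxels.
_, maps, shape = trace[-1]
fc_inputs = maps * math.prod(shape)
print("inputs to the fully connected block:", fc_inputs)
```

Under these assumptions the trace ends at 128 feature maps of size 4 × 5 × 4, that is 10240 inputs, which matches the values reported for the fully connected block.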
Methods
Datasets. The proposed method was evaluated on T1-weighted MR images from three independent datasets: an Icelandic dataset, the UK Biobank dataset, and the IXI dataset. deCODE genetics provided the Icelandic MR data, consisting of scans from 1264 healthy subjects aged between 18 and 75 years. This dataset includes 1815 scans in total, since some subjects have several scans. The Icelandic data were acquired using two different scanners, a 1.5T Philips Achieva scanner and a 1.5T Siemens Magnetom Aera scanner. Scans were imaged using a T1-weighted gradient echo sequence (Philips Achieva: repetition time (TR) = 8.6 ms, echo time (TE) = 4.0 ms, flip angle (FA) = 8°, 170 slices, slice thickness = 1.2 mm, acquisition matrix = 192 × 192, FOV = 240 × 240 mm; Siemens Aera: TR = 2400 ms, TE = 3.54 ms, FA = 8°, 160 slices, slice thickness = 1.2 mm, acquisition matrix = 192 × 192, FOV = 240 × 240 mm). Subjects with any serious neurological disorder were prescreened and removed. In addition, we removed from the training and holdout sets subjects diagnosed with neurodevelopmental and mental disorders such as autism, bipolar disorder, intellectual disability, or schizophrenia, and subjects with any copy number variations previously associated with neurodevelopmental or psychiatric disorders.
The UK Biobank dataset consists of T1-weighted MR images of 15040 healthy subjects aged between 46 and 79 years. The data were all collected using a 3T Siemens Skyra scanner. It is well known that the presence of undetected population structure can lead to both false positive results and failure to detect genuine associations in genetic association studies⁵⁵. In an effort to combat this, our analysis was constrained to 12378 individuals of white British ancestry. An additional release of MRI images by UK Biobank was added to a replication set. This set contains 6888 subjects (of whom 4456 are of white British ancestry) aged between 47 and 80 years. The images in this set were collected using the same protocol as the previous UK Biobank set.
The IXI dataset consists of T1-weighted MR images of 544 healthy subjects and is freely available online. The subjects’ age at imaging was between 20 and 86 years. The IXI data were collected from three different sites: Hammersmith Hospital using a Philips 3T system, Guy’s Hospital using a Philips 1.5T system, and the Institute of Psychiatry using a GE 1.5T system. Histograms of the age distribution of the three datasets mentioned are shown in Supplementary Figs. 1–3.

Preprocessing. Preprocessing was carried out using the computational anatomy toolbox (CAT12)⁵⁶. First, the input data were inhomogeneity corrected. Then the skull and other non-brain elements were removed. Finally, the images were registered into the standard MNI space using the deformable registration algorithm DARTEL⁵⁷. For further information, refer to the CAT12 manual⁵⁸.
The preprocessing step generates three types of images. The first is an MNI-registered image. The second is a Jacobian map, which is a by-product of the deformable registration. The last is a gray matter and white matter soft segmented image. All of the image types mentioned above have voxel size 1.5 mm$^3$ and voxel resolution 121 × 145 × 121.

CNN architecture. The CNN architecture we developed is based on the residual architecture²⁴ (Fig. 3). It was implemented using Keras with TensorFlow as backend⁵⁹ and consists of five residual blocks, each followed by a max pooling layer of stride 2 × 2 × 2 and kernel size 3 × 3 × 3, and one fully connected block. The convolutional part of the CNN reduces the input image from size 121 × 145 × 121 to 128 feature maps of size 4 × 5 × 4. The fully connected part reduces these feature maps down to an age prediction.
The residual block, displayed in Fig. 4, consists of a combination of layers which is repeated twice inside each residual block. This combination is composed of a 3D convolutional layer with stride 1 × 1 × 1 and kernel size 3 × 3 × 3, a batch re-normalization layer⁶⁰, and an ELU activation function⁶¹. The defining element of the residual block is the skip connection, which adds the signal feeding into the residual block to the output of a layer close to the end of the block. The number of feature maps in block number $n$ was chosen by the rule $2^{n+2}$.

Fig. 4 A flowchart showing the components of the proposed residual block. Batch re-normalization (BRN), convolutional layer (Conv).

The fully connected block, depicted in Fig. 5, is a multilayer perceptron (MLP)⁶² with one hidden layer. The input layer has 128 × 4 × 5 × 4 = 10240 neurons, the hidden layer (FC 1) has 256 neurons that use an ELU activation function, and the output layer has a single neuron. Following the hidden layer, a dropout⁶³ layer with keep rate equal to 0.8 is employed. The output layer (FC 2) has no activation function, which means that it performs a linear regression on the hidden layer features. To account for factors such as scanner type and sex that can affect the estimated brain age of an individual, we include them as inputs in the linear regression by concatenating them with the hidden features of the MLP.

Fig. 5 A flowchart showing the components of the proposed fully connected block. Fully connected layer one (FC1), concatenation layer (Concat), fully connected layer two (FC2).

The mean absolute error was used as the loss function and the CNN was optimized using Adam⁶⁴ with parameters: learning rate = 0.001, decay = $10^{-6}$, $\beta_1 = 0.9$, $\beta_2 = 0.999$, and batch size = 4. The He initialization strategy⁶⁵ was used to initialize the weights, and each trainable node in the CNN was regularized with L2 weight decay⁶⁶, with $\lambda = 5 \times 10^{-5}$. Early stopping⁶⁷ with model checkpointing was used, i.e., if the validation error did not improve in 100 epochs the training was stopped and the weights with the lowest validation error were selected. Furthermore, to reduce the risk of overfitting, data augmentation⁶⁸ was used to generate new training instances by applying a coordinate transformation to a random subset of the training data, consisting of a combined 3D rotation and a 3D translation. The rotation angles were between −40° and 40° with equal probability,

and the translation distance, for each direction, was selected between −10 and 10 voxels with equal probability.
Our CNN implementation uses about 8 GB of memory, and the training time using an Intel Xeon Gold 6130 CPU with 32 GB of RAM and an NVIDIA Tesla V100 16 GB GPU was about 2 days.

SBM, VBM, and similarity matrix brain age prediction. The SBM features were generated using FreeSurfer’s recon-all algorithm⁶⁹ and the VBM features were generated using the CAT12 toolbox (the specific names of the SBM and VBM features are listed in Supplementary Data 1). The similarity matrix was constructed by taking the inner product between the combined gray and white matter segmented images of each subject. The SBM and VBM features were adjusted for intracranial volume, sex, and scanner type. The features were then zero centered and normalized to unit variance. The regression methods tested were linear regression⁷⁰, lasso⁷¹, ridge regression⁷², elastic net⁷³, random forest regression⁷⁴, support vector regression⁷⁵, relevance vector regression²⁸, and Gaussian process regression (GPR)²⁹. A grid search was used to find the tuning parameters corresponding to the lowest cross-validation error for the methods mentioned. The regression models were implemented using scikit-learn⁷⁶, except relevance vector machines, which used scikit-rvm.
In addition, we tested combining predictions made by models trained on these three feature types. We decided to pick predictions for the method with the lowest CV MAE for each feature type. Specifically, these regression methods were GPR with a Matérn kernel for both the SBM and VBM features, and ridge regression for the similarity matrix features. The methods were picked by CV MAE instead of test MAE to prevent data leakage. Since the SBM and VBM features were not available for every image in the Icelandic dataset, we first calculated the average brain age prediction for each subject, before inner joining the predictions by subject into a single data frame. This resulted in a combined data frame with 1246 rows and three columns containing brain age predictions for the three regression methods. Since training these regression models is faster than training the CNNs, we were able to combine them using 10-fold cross-validation predictions. Thus, the linear regression blender can train on predictions from the whole training set, instead of being limited to the 298 images in the validation set, as is the case for the CNN prediction combination.

Statistical methods. To assess the accuracy of the machine learning methods we performed simple training and validation splits, and selected a suitable model by evaluating the validation MAE. The subjects from the Icelandic sample were split between training, validation, and test sets, and if a subject had multiple images, the images were all put in the same set. The data were divided into a 64% training set ($N_s = 809$, $N_i = 1171$), a 16% validation set ($N_s = 202$, $N_i = 298$), and a 20% test set ($N_s = 253$, $N_i = 346$), where $N_s$ is the number of subjects and $N_i$ is the number of images. When evaluating the machine learning models, the MAE and $R^2$ score for the images in the validation and test sets are calculated.
To assess the transfer learning performance, the IXI dataset was split into an 80% training set ($N = 440$) and a 20% validation set ($N = 104$), and the whole UK Biobank dataset was used as a test set ($N = 12395$). As before, we evaluate accuracy by calculating the MAE and $R^2$ score on the validation and test sets.
In order to test the reliability of PAD, the intraclass correlation was calculated with ICCbare from the ICC R package. The 95% confidence interval was estimated using bootstrapping with 2000 sampling iterations.
The Pearson correlation coefficient was calculated in order to test for association between PAD and performance on neuropsychological tests. Before performing the association test we first removed individuals of non-white British ancestry and subjects from related pairs. We then adjusted the PAD for age at imaging visit, age$^2$, sex, age × sex, age$^2$ × sex, total intracranial volume, the first 40 principal components from genetic ancestry analysis, head motion, genotyping array, imaging center, and assessment center where neuropsychological tests were conducted. The adjustments were performed using linear regression. Adjustment for variables such as genotyping array is probably not necessary for testing for association between PAD and performance on neuropsychological tests. However, we included them to keep the adjusted PAD similar to the one we perform the GWAS on. Nine correlation tests were performed, so a Bonferroni adjusted significance level $\alpha_{B2} = 0.05/9 \approx 0.00556$ was used. We estimated the 95% confidence interval using bootstrapping with 2000 sampling iterations.
We performed a GWAS on PAD using BOLT-LMM³¹ to find associated sequence variants. For the genetic analysis we used version 3 of the imputed genetic dataset released by UK Biobank in July 2017⁷⁷. The UK Biobank genetic data were assayed using two very similar genotyping arrays (95% of marker content is shared). Roughly 10% of the subjects were genotyped using the Applied Biosystems UK BiLEVE Axiom Array by Affymetrix and the rest using the closely related Applied Biosystems UK Biobank Axiom Array⁷⁷. Variants with imputation quality score below 0.3 and minor allele frequency below 0.1% were filtered out, which left ~20 million variants to be considered for GWAS. Before performing GWAS, we removed individuals of non-white British ancestry and subjects from related pairs. We then adjusted the PAD for age at imaging visit, age$^2$, sex, age × sex, age$^2$ × sex, total intracranial volume, the first 40 principal components from genetic ancestry analysis, head motion, genotyping array, and imaging center using linear regression. The adjusted PAD was then normalized with an inverse normal transformation. After normalization the linear regression adjustments were reapplied. Sequence variants associated with PAD are only reported if they reach genome-wide significance. If two genome-wide significant variants are in LD ($r^2 > 0.1$) we report the variant with the lower P-value.
In addition, we tested for association between PAD and sequence variants known to associate with structural brain phenotypes. These variants were found by performing GWAS separately on 305 SBM phenotypes generated with recon-all by using the FreeSurfer 6.0 software⁶⁹ and 540 VBM phenotypes generated by using CAT12⁵⁶. All genome-wide significant markers were then aggregated into a single list. In cases where variants were in LD ($r^2 > 0.5$), only the variant with the lower P-value was selected. The final list included 331 variants. To account for testing these variants for the second time, a Bonferroni adjusted significance level $\alpha_{B3} = 0.05/(2 \times 331) \approx 7.5 \times 10^{-5}$ was used for the PAD association test.
To reduce the risk of false positive sequence variant associations we additionally checked for association in a replication set of 4456 subjects. To pass this test the association between the variants under consideration and PAD needs to show evidence of statistical significance ($\alpha_R < 0.05$).

Heritability analysis. To estimate SNP-heritability ($h^2_{\mathrm{snp}}$) we ran LD score regression³³ on the PAD GWAS summary statistics. We used the ldsc command line tool and followed standard procedure when running it. To train the LD score regression model, we used precomputed European 1000 Genomes phase 3 LD Scores, and filtered out rare variants with MAF < 0.01 and imputation quality score < 0.9. The slope of the trained regression model times the number of SNPs and divided by the sample size is an estimate of $h^2_{\mathrm{snp}}$³³. In addition, the intercept of the model minus one is a measure of confounding bias in the test statistics due to confounding effects, such as cryptic relatedness and population stratification³³.

eQTL analysis. To investigate if any of the variants are expression quantitative trait loci (eQTLs) we used the GTEx database (GTEx Analysis Release V7 [dbGaP Accession phs000424.v7.p2])⁷⁸. Our eQTL analysis was carried out by logging onto https://gtexportal.org, typing in the corresponding rs number of identified variants, and checking if they have any associated eQTLs. However, identifying whether a variant is truly causal in both GWAS and eQTL is challenging because of the uncertainty caused by LD⁷⁹. Therefore, we only report variants as eQTLs of genes if they are close to being the most significant eQTL of that specific gene.

Ethical regulations. The Icelandic participants in this study were recruited by deCODE genetics to study the cognitive and neurological effects of rare variants previously associated with schizophrenia and autism spectrum disorder. The UK Biobank oversaw the recruitment of subjects of British nationality. Approval for the aforementioned schizophrenia study was obtained from the National Bioethics Committee of Iceland and the Icelandic Data Protection Authority. Written informed consent was obtained from all participants or their guardians before blood samples or phenotypic data were obtained. All sample identifiers were encrypted in accordance with the regulations of the Icelandic Data Protection Authority. Information about ethics oversight in the UK Biobank can be found at https://www.ukbiobank.ac.uk/ethics/.

Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Code availability
Any custom code or software used to implement the brain age prediction method detailed in this paper will be made available upon request.

Data availability
The genetic and phenotype datasets generated by UK Biobank used in this study are available via the UK Biobank data access process (see http://www.ukbiobank.ac.uk/register-apply/). Detailed information about the genetic data and MRI data available in UK Biobank is listed here: http://www.ukbiobank.ac.uk/scientists-3/genetic-data/, https://www.fmrib.ox.ac.uk/ukbiobank/. The Icelandic data used in this publication are not publicly available due to information, contained within them, that could compromise research participant privacy. The authors declare that the data supporting the findings of this study are available within the article, its supplementary information, and upon request.

Received: 2 April 2019; Accepted: 21 October 2019;

References
1. Cole, J. H. et al. Brain age predicts mortality. Mol. Psychiatry 23, 1385 (2018).
2. Abbott, A. A problem for our age. Nature 475, S2 (2011).

3. Reeve, A., Simcox, E. & Turnbull, D. Ageing and Parkinson’s disease: why is advancing age the biggest risk factor? Ageing Res. Rev. 14, 19–30 (2014).
4. Franke, K., Ziegler, G., Klöppel, S. & Gaser, C., Alzheimer’s Disease Neuroimaging Initiative. et al. Estimating the age of healthy subjects from T1-weighted MRI scans using kernel methods: exploring the influence of various parameters. Neuroimage 50, 883–892 (2010).
5. Liem, F. et al. Predicting brain-age from multimodal imaging data captures cognitive impairment. NeuroImage 148, 179–188 (2017).
6. Wang, J. Age estimation using cortical surface pattern combining thickness with curvatures. Med. Biol. Eng. Comput. 52, 331–341 (2014).
7. Kondo, C. et al. An age estimation method using brain local features for T1-weighted images. In Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE, 666–669 (IEEE, 2015).
8. Cole, J. H., Leech, R. & Sharp, D. J., Alzheimer’s Disease Neuroimaging Initiative. Prediction of brain age suggests accelerated atrophy after traumatic brain injury. Ann. Neurol. 77, 571–581 (2015).
9. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436 (2015).
10. LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
11. Huang, T.-W. et al. Age estimation from brain MRI images using deep learning. In Biomedical Imaging (ISBI 2017), 2017 IEEE 14th International Symposium on Biomedical Imaging, 849–852 (IEEE, 2017).
12. Cole, J. H. et al. Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker. NeuroImage 163, 115–124 (2017).
13. Franke, K., Luders, E., May, A., Wilke, M. & Gaser, C. Brain maturation: predicting individual brainage in children and adolescents using structural MRI. NeuroImage 63, 1305–1312 (2012).
14. Gaser, C. et al. Brainage in mild cognitive impaired patients: predicting the conversion to Alzheimer’s disease. PLoS One 8, e67346 (2013).
15. Franke, K., Gaser, C., Manor, B. & Novak, V. Advanced brainage in older adults with type 2 diabetes mellitus. Front. Aging Neurosci. 5, 90 (2013).
16. Koutsouleris, N. et al. Accelerated brain aging in schizophrenia and beyond: a neuroanatomical marker of psychiatric disorders. Schizophrenia Bull. 40, 1140–1153 (2013).
17. Schnack, H. G. et al. Accelerated brain aging in schizophrenia: a longitudinal pattern recognition study. Am. J. Psychiatry 173, 607–616 (2016).
18. Kuchinad, A. et al. Accelerated brain gray matter loss in fibromyalgia patients: premature aging of the brain? J. Neurosci. 27, 4004–4007 (2007).
19. Steffener, J. et al. Differences between chronological and brain age are related to education and self-reported physical activity. Neurobiol. Aging 40, 138–144 (2016).
20. Luders, E., Cherbuin, N. & Gaser, C. Estimating brain age using high-resolution pattern recognition: younger brains in long-term meditation practitioners. Neuroimage 134, 508–513 (2016).
21. Kaufmann, T. et al. Common brain disorders are associated with heritable patterns of apparent aging of the brain. Nat. Neurosci. 22, 1617–1623 (2019).
22. The Brainstorm Consortium, Anttila, V., Bulik-Sullivan, B., Finucane, H. K. & Walter, R. K. Analysis of shared heritability in common disorders of the brain. Science 360, 6395 (2018).
23. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. Preprint at http://arXiv.org/abs/1409.1556 (2014).
24. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778 (2016).
25. Li, H., Xu, Z., Taylor, G., Studer, C. & Goldstein, T. Visualizing the loss landscape of neural nets. In Advances in Neural Information Processing Systems, 6389–6399 (2018).
26. Fischl, B. & Dale, A. M. Measuring the thickness of the human cerebral cortex from magnetic resonance images. Proc. Natl Acad. Sci. USA 97, 11050–11055 (2000).
27. Ashburner, J. & Friston, K. J. Voxel-based morphometry—the methods. Neuroimage 11, 805–821 (2000).
28. Tipping, M. E. The relevance vector machine. In Advances in Neural Information Processing Systems, 652–658 (2000).
29. Rasmussen, C. E. Gaussian processes in machine learning. In Advanced Lectures on Machine Learning, 63–71 (Springer, 2004).
30. Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
31. George, P.-R. et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat. Genet. 47, 284 (2015).
32. Smith, S. M. & Nichols, T. E. Statistical challenges in “big data” human neuroimaging. Neuron 97, 263–268 (2018).
33. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291 (2015).
34. Stefansson, H. et al. A common inversion under selection in Europeans. Nat. Genet. 37, 129–137 (2005).
35. Rademakers, R., Cruts, M. & Van Broeckhoven, C. The role of tau (MAPT) in frontotemporal dementia and related tauopathies. Hum. Mut. 24, 277–295 (2004).
36. Koolen, D. A. et al. A new chromosome 17q21.31 microdeletion syndrome associated with a common inversion polymorphism. Nat. Genet. 38, 999 (2006).
37. Nagel, M., Jansen, P. R. & Stringer, S. et al. Meta-analysis of genome-wide association studies for neuroticism in 449484 individuals identify novel genetic loci and pathways. Nat. Genet. 50, 920–927 (2018).
38. Kong, A. et al. Selection against variants in the genome associated with educational attainment. Proc. Natl Acad. Sci. USA 114, E727–E732 (2017).
39. Ikram, M. A., Fornage, M. & Smith, A. V. et al. Common variants at 6q22 and 17q21 are associated with intracranial volume. Nat. Genet. 44, 539–544 (2013).
40. Nalls, M. A. et al. A meta-analysis of genome-wide association studies identifies 17 new Parkinson’s disease risk loci. Nat. Genet. 49, 1511–1516 (2017).
41. Riccelli, R., Toschi, N., Nigro, S., Terracciano, A. & Passamonti, L. Surface-based morphometry reveals the neuroanatomical basis of the five-factor model of personality. Soc. Cogn. and Affect. Neurosci. 12, 671–684 (2017).
42. Hervieu, G. J. et al. Distribution and expression of TREK-1, a two-pore-domain potassium channel, in the adult rat CNS. Neuroscience 103, 899–919 (2001).
43. Bittner, S., Ruck, T., Fernández-Orth, J. & Meuth, S. G. Trekking the blood–brain-barrier. J. Neuroimm. Pharmacol. 9, 293–301 (2014).
44. Cai, Y., Peng, Z., Guo, H., Wang, F. & Zeng, Y. TREK-1 pathway mediates isoflurane-induced memory impairment in middle-aged mice. Neurobiol. Learn. Mem. 145, 199–204 (2017).
45. Wang, W. et al. Lig4-4 selectively inhibits TREK-1 and plays potent neuroprotective roles in vitro and in rat MCAO model. Neurosci. Lett. 671, 93–98 (2018).
46. Guen, Y. L. et al. eQTL of KCNK2 regionally influences the brain sulcal widening: evidence from 15,597 UK Biobank participants with neuroimaging data. Brain Struct. Funct. 224, 847–857 (2018).
47. Stein, G. S. et al. Runx2 control of organization, assembly and activity of the regulatory machinery for skeletal gene expression. Oncogene 23, 4315 (2004).
48. Nagy, R. et al. Exploration of haplotype research consortium imputation for genome-wide association studies in 20,032 Generation Scotland participants. Gen. Med. 9, 23 (2017).
49. Lee, J. J. et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat. Genet. 50, 1112 (2018).
50. Pirastu, N. et al. GWAS for male-pattern baldness identifies 71 susceptibility loci explaining 38% of the risk. Nat. Commun. 8, 1584 (2017).
51. Reuter, M. et al. Head motion during MRI acquisition reduces gray matter volume and thickness estimates. Neuroimage 107, 107–115 (2015).
52. Elliott, L. T. et al. Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature 562, 210 (2018).
53. Salthouse, T. A. What cognitive abilities are involved in trail-making performance? Intelligence 39, 222–232 (2011).
54. Deary, I. J. et al. Age-associated cognitive decline. Br. Med. Bull. 92, 135–152 (2009).
55. Marchini, J., Cardon, L. R., Phillips, M. S. & Donnelly, P. The effects of human population structure on large genetic association studies. Nat. Genet. 36, 512 (2004).
56. Gaser, C. & Dahnke, R. Cat-a computational anatomy toolbox for the analysis of structural MRI data. HBM 2016, 336–348 (2016).
57. Ashburner, J. A fast diffeomorphic image registration algorithm. Neuroimage 38, 95–113 (2007).
58. Kurth, F. & Gaser, C. Manual - Computational Anatomy Toolbox - CAT12 (2017).
59. Chollet, F. et al. Keras. https://keras.io (2015).
60. Ioffe, S. Batch renormalization: Towards reducing minibatch dependence in batch-normalized models. In Advances in Neural Information Processing Systems, 1945–1953 (2017).
61. Clevert, D.-A., Unterthiner, T. & Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). Preprint at http://arXiv.org/abs/1511.07289 (2015).
62. Zell, A. Simulation Neuronaler Netze, volume 1 (Addison-Wesley, Bonn, 1994).
63. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
64. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at http://arXiv.org/abs/1412.6980 (2014).
65. He, K., Zhang, X., Ren, S. & Sun, J. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, 1026–1034 (2015).

66. Krogh, A. & Hertz, J. A. A simple weight decay can improve generalization. In Advances in Neural Information Processing Systems, 950–957 (1992).
67. Morgan, N. & Bourlard, H. Generalization and parameter estimation in feedforward nets: Some experiments. In Advances in Neural Information Processing Systems, 630–637 (1990).
68. Goodfellow, I., Bengio, Y., Courville, A. & Bengio, Y. Deep Learning: Data Augmentation, volume 1 (MIT Press, Cambridge, 2016).
69. Fischl, B. Freesurfer. Neuroimage 62, 774–781 (2012).
70. Seber, G. A. F. & Lee, A. J. Linear Regression Analysis, volume 329 (John Wiley & Sons, 2012).
71. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B 58, 267–288 (1996).
72. Hoerl, A. E. & Kennard, R. W. Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12, 55–67 (1970).
73. Zou, H. & Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. B 67, 301–320 (2005).
74. Ho, T. K. Random decision forests. In Document Analysis and Recognition, Proceedings of the Third International Conference on Document Analysis and Recognition, volume 1, 278–282 (IEEE, 1995).
75. Smola, A. J. & Schölkopf, B. A tutorial on support vector regression. Stat. Comput. 14, 199–222 (2004).
76. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
77. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203 (2018).
78. GTEx Consortium. et al. Genetic effects on gene expression across human tissues. Nature 550, 204 (2017).
79. Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).

Acknowledgements
This research has been conducted using the UK Biobank Resource under Application Number 24898. The research leading to these results has received support from the Innovative Medicines Initiative Joint Undertaking under grant agreements no. 115008 (NEWMEDS) and no. 115300 (EUAIMS), of which resources are composed of EFPIA in-kind contribution and financial contribution from the European Union’s Seventh Framework Programme (EU-FP7/2007-2013). The financial support from the European Commission to the NeuroPain project (FP7#HEALTH-2013-602891-2) is acknowledged. The authors are grateful to the participants, and we thank the research nurses and staff at the Recruitment centre (Þjónustumiðstöð rannsóknarverkefna).

Author contributions
B.A.J. implemented the method, wrote the code, and performed experiments. B.A.J. and M.O.U. developed the method and designed statistical experiments. G.B., T.T., G.B.W., L.M.E., D.F.G., H.S., and K.S. contributed to analyses of the data and writing the manuscript.

Competing interests
B.A.J., G.B., T.T., G.B.W., D.F.G., H.S., K.S., and M.O.U. are employed by deCODE genetics/Amgen, Inc. L.M.E. declares no competing interests.

Additional information
Supplementary information is available for this paper at https://doi.org/10.1038/s41467-019-13163-9.

Correspondence and requests for materials should be addressed to K.S. or M.O.U.

Peer review information Nature Communications thanks James Cole and the other, anonymous, reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Reprints and permission information is available at http://www.nature.com/reprints

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2019
