Multi-task deep neural networks recover evolutionary dynamics and parameters with high accuracy
To infer the evolutionary processes acting within each blood cell population, we trained an ensemble of deep neural networks (DNNs), hereafter the “classifier”, using summary statistics derived from populations simulated across a range of evolutionary scenarios as input features (Fig. 1b–d). We produced a total of 4.6 million simulations and used them to create a look-up grid of parameter combinations representing a comprehensive range of plausible evolutionary scenarios (see “Methods” section, Table 1, Supplementary Fig. 1). Forward simulations were performed and four parameters were varied between simulations: mutation rate, probability of a mutation being beneficial (p), coefficient of positive selection (sp; corresponding to the relative fitness advantage of cells with this mutation), and coefficient of negative selection (sn; relative fitness disadvantage of cells with this mutation)23. In our simulations, we expect selection to act on nonsynonymous sites, whereas synonymous sites were simulated as evolving under a neutral model. Each unique combination of the four parameters corresponds to a distinct evolutionary model. However, each model can be collapsed into one of four overarching evolutionary classes: neutral (no selection), positive selection only, negative selection only, and combination models, which allow for the accumulation of mutations subject to both positive and negative selection within the same cellular population (see “Methods” section). Each neural network within our ensemble classifies a population into one of the four evolutionary classes and estimates the four parameters comprising a given model (Fig. 1c). By comparing the outputs of our classifier, we can determine the uncertainty in the classifications (Fig. 1d)24.
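The collapse from parameter combinations to the four overarching classes, and the use of ensemble disagreement as an uncertainty measure, can be sketched as follows. This is an illustrative sketch only: the collapsing rule in `evolutionary_class` and the normalised vote entropy in `ensemble_vote` are assumptions made for this example, and both function names are hypothetical. The exact definitions used in the study are given in its Methods.

```python
import math
from collections import Counter

def evolutionary_class(p, sp, sn):
    """Collapse one parameter combination into an overarching class.

    Assumed rule (not taken from the Methods): positive selection is present
    when beneficial mutations can arise (p > 0) and carry an advantage
    (sp > 0); negative selection is present when non-beneficial mutations can
    arise (p < 1) and carry a cost (sn > 0). The mutation rate does not
    affect the class under this rule.
    """
    positive = p > 0 and sp > 0
    negative = p < 1 and sn > 0
    if positive and negative:
        return "combination"
    if positive:
        return "positive"
    if negative:
        return "negative"
    return "neutral"

def ensemble_vote(predictions):
    """Return the majority class and the normalised entropy of the votes.

    Entropy is 0 when every network agrees and 1 when the votes are spread
    evenly over all four classes (an assumed uncertainty measure).
    """
    counts = Counter(predictions)
    total = len(predictions)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    top_class = counts.most_common(1)[0][0]
    return top_class, entropy / math.log(4)

# Hypothetical votes from a four-network ensemble for one population.
label, uncertainty = ensemble_vote(
    ["combination", "combination", "positive", "combination"]
)
```

A population on which the networks disagree receives a higher uncertainty value, mirroring the idea of comparing outputs across the ensemble.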
We obtained high classification accuracy for populations simulated under positive-only (0.99) and combination (0.97) models, and relatively high accuracy for populations simulated under neutral (0.80) and negative-only (0.83) models, when testing our classifier on a held-out test set of simulations (10% of our simulated data) (Fig. 1e, Supplementary Fig. 2). A perturbation analysis of our inverse model showed that we can predict positive-only or combination evolutionary classes with high certainty but that our model has some difficulty distinguishing between neutral and negative-only selection when there are few mutations. To ensure that our model performs well on data from evolutionary parameters not included in our training data, we generated a novel set of simulations using new parameter combinations that do not appear in the training set. We find that we are able to achieve a similar degree of accuracy in evolutionary class prediction (Supplementary Fig. 3). Fewer mutations, or a lower level of variability in the population, could arise following a selective sweep or when populations are subject to a lower mutation rate (Supplementary Fig. 4). In addition, a lack of mutational information could be attributable to the selective removal of SNPs as a result of negative selection (Supplementary Fig. 5). Distinguishing between neutral evolution and negative selection is a challenge in population genetics as weakly damaging mutations can segregate in the population at low frequencies and have a mild impact on reducing variability at linked loci12,13. Further, while we can distinguish between overarching evolutionary classes with high accuracy, as well as the presence or absence of positive or negative selection, our model struggles to discriminate amongst the weaker coefficients of selection, which is notoriously challenging in population genetics12.
As such, we limit our inferences of selective dynamics in blood to the overarching evolutionary class which we are able to discriminate with high accuracy.
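The per-class accuracies quoted above are accuracies conditional on the class under which a population was simulated. A minimal sketch of that calculation is shown below; the function name `per_class_accuracy` and the toy labels are hypothetical and are not the study's test set.

```python
from collections import defaultdict

def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of populations of each simulated class that were classified
    into that same class (i.e. per-class recall)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, predicted in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    return {cls: correct[cls] / total[cls] for cls in total}

# Toy held-out labels (synthetic, for illustration only).
truth = ["neutral", "neutral", "negative", "negative", "positive", "combination"]
predicted = ["neutral", "negative", "negative", "neutral", "positive", "combination"]
accuracy = per_class_accuracy(truth, predicted)
```

In this toy example the only errors are confusions between the neutral and negative-only classes, which is the same qualitative pattern described in the text.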
Hematopoietic populations show evidence of positive and negative selection regardless of disease outcome
We applied our classifier to preleukemic cases and healthy controls to infer population-level evolutionary processes. We find that in the majority of individuals (71%), both controls and cases, the hematopoietic population does not evolve neutrally (Fig. 2a). We reject models of neutral evolution in the majority of cases (79%) and controls (64%), and we observe a significantly greater departure from neutrality in cases than in controls (χ2(1) = 7.32, p-value = 0.007, ncases = 73, ncontrol = 246). The majority of cases (62%) and a plurality of controls (43%) fit combination classes of evolution; in other words, we are able to detect signatures of both positive and negative selection in their blood biopsy, indicating a functional impact of negative selection acting on passenger mutations. We observe that higher levels of predictive uncertainty correspond with a reduction in segregating mutations in the mature blood cell pool, in keeping with our classifier’s performance on simulated populations (Supplementary Fig. 6). In summary, few cases or controls are evolving neutrally, and we find evidence of positive selection and of negative selection in the majority of preleukemic cases and healthy controls.
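The comparison of departure from neutrality between cases and controls can be checked with a 2x2 chi-squared test. The sketch below rests on two assumptions that the text does not state explicitly: that the reported n values (73 and 246) are the numbers of non-neutral individuals among the 92 cases and 385 controls reported elsewhere in this section (73/92 ≈ 79%, 246/385 ≈ 64%), and that Yates' continuity correction was applied. The function name is hypothetical.

```python
import math

def yates_chi_squared(a, b, c, d):
    """Chi-squared test with Yates' continuity correction for the 2x2 table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = numerator / denominator
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Counts reconstructed from the reported figures (an assumption, see above).
cases_non_neutral, cases_total = 73, 92
controls_non_neutral, controls_total = 246, 385

stat, p = yates_chi_squared(
    cases_non_neutral, cases_total - cases_non_neutral,
    controls_non_neutral, controls_total - controls_non_neutral,
)
print(f"chi2(1) = {stat:.2f}, p = {p:.3f}")  # chi2(1) = 7.32, p = 0.007
```

Under these assumptions the sketch reproduces the reported statistic, which supports reading the reported n values as non-neutral counts rather than group sizes.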
Hematopoietic populations evolve in an age-dependent manner
As ARCH is known to be an age-associated phenomenon, we investigated if there is any association between the age of an individual and the selective pressures governing their hematopoietic dynamics. Participants were binned into age groups spanning ten-year intervals and the proportion of participants in each age range fitting each evolutionary class was calculated. We find clear associations with age and the dominant class of evolution in individuals (Fig. 2b). Specifically, we observe an age-related decline in the proportion of controls fitting negative-only or neutral classes of evolution and a parallel increase in controls fitting the combination class. Our results are consistent with a model in which individuals accumulate passenger mutations as they age, some of which will have a slightly damaging effect. In parallel, with increasing mutation accumulation, there is an increased likelihood of a rare driver event occurring which would cause an individual to shift to a combination, or positive-only, class of evolution. We find that many preleukemic cases show evidence of positive selection at a younger age than controls. In particular, in preleukemic cases, where age-associated clonal expansions have been previously reported, we observe an increased overall proportion of individuals fitting combination models in younger age groups indicating that driver events have occurred earlier. At a young age, driver mutations are likely to be arising on a background with fewer mildly damaging passenger mutations and thus may experience a relatively higher fitness advantage compared to the same mutation arising on a background with a greater number of mildly damaging passenger mutations; a hypothesis that we investigate in the following section.
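The binning step described above can be sketched as follows. The bin edges (decades starting at multiples of ten), the function name `class_proportions_by_decade`, and the toy participants are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def class_proportions_by_decade(participants):
    """participants: iterable of (age_in_years, evolutionary_class) pairs.

    Returns {decade_start: {class: proportion of that decade's participants}}.
    """
    counts = defaultdict(Counter)
    for age, evo_class in participants:
        decade = (int(age) // 10) * 10  # e.g. ages 60-69 fall in bin 60
        counts[decade][evo_class] += 1
    return {
        decade: {cls: n / sum(tally.values()) for cls, n in tally.items()}
        for decade, tally in sorted(counts.items())
    }

# Toy participants (synthetic, for illustration only).
toy_participants = [
    (52, "neutral"), (58, "negative"),
    (61, "combination"), (67, "combination"), (69, "neutral"),
    (74, "combination"),
]
proportions = class_proportions_by_decade(toy_participants)
```

Plotting these proportions per class against the decade gives the kind of age profile summarised in Fig. 2b.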
Controls have a higher ratio of passenger to driver mutations than cases
To investigate if the proportion of mutations in known driver genes compared to non-driver genes could explain some variation in outcomes, we compared the types and patterns of mutations between the cases and controls. We annotated mutations as drivers if they occurred in driver genes found to be highly mutated in the Cancer Genome Atlas Acute Myeloid Leukemia project (Supplementary Fig. 7). We first asked whether cases simply have more mutations, thus predisposing their blood populations to cancer. Consistent with previous reports7, our cases had more mutations on average than age-matched healthy controls (Wilcoxon rank-sum test, W = 12,196, df = 1, p-value = 2.9e−06, ncases = 92, ncontrols = 385) and, in the combination class of evolution, more mutations in known driver genes (Supplementary Fig. 7). A higher mutation count in cases is consistent with our classifier’s prediction that there is a small increase in mutation rate in a preleukemic context (mean mutation rate: μ = 1.2e−10 per bp per division) compared to healthy controls (mean mutation rate: μ = 1.1e−10 per bp per division) (Wilcoxon rank-sum test, W = 14,336, df = 1, p-value = 0.004, ncases = 92, ncontrols = 385) (Fig. 2c–d). We estimated the mutation rate assuming a population size of 10,000, and we have scaled our estimates to account for varying estimates of HSC population size (Supplementary Figs. 8–10).
Intriguingly, we do observe mutations in driver genes in healthy controls fitting positive models of selection. It is possible that these individuals do not progress to disease if driver mutations are arising in competing clones, thus preventing one clone from rising to dominance. Another possible explanation for the differences in outcome is the proportion of driver to passenger mutations. Using linear regression, we compare the relationship between the number of mutations falling into known driver genes versus non-driver genes for cases and controls fitting the combination and positive evolutionary classes. In the combination model, we find a significant interaction between the number of mutations occurring in non-driver genes compared to driver genes in controls (β = 5.76) and cases (β = 0.642); F(1, 224) = 28.5, p-value = 2.23e−07. In the positive class, by contrast, we do not find a significant interaction between the number of mutations occurring in non-driver genes compared to driver genes in controls (β = 0.16) and cases (β = 0.17); F(1, 12) = 0.0004, p-value = 0.98. Our sample size in the positive class is low, however, so we may not be powered to detect such a difference. The increased proportion of mutations in passenger genes compared to driver genes in the combination class is consistent with a model in which negative selection acting on mildly damaging passenger mutations is playing a protective role in inhibiting or stalling clonal expansions.
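In a linear model with a full group interaction, the interaction coefficient equals the difference between the slopes obtained by fitting each group separately. The sketch below illustrates this with ordinary least squares on synthetic counts. The direction of the regression (non-driver count on driver count), the function name `ols_slope`, and all data values are assumptions for illustration, and the F-test for the interaction is not reproduced here.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Synthetic per-individual mutation counts (not data from the study).
control_drivers = [1, 2, 3, 4]
control_non_drivers = [6, 11, 18, 23]
case_drivers = [1, 2, 3, 4]
case_non_drivers = [3, 4, 4, 5]

slope_controls = ols_slope(control_drivers, control_non_drivers)
slope_cases = ols_slope(case_drivers, case_non_drivers)

# The group-by-driver-count interaction term is the difference in slopes.
interaction = slope_controls - slope_cases
```

A steeper slope in controls than in cases corresponds to controls carrying more non-driver mutations per driver mutation, which is the pattern reported for the combination class.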
Distinct patterns of inferred pathogenicity associate with evolutionary classes
To determine whether some passengers are playing a protective role, we scored each mutation, after blood sample classification, according to how likely it was to affect protein function and conservation. In doing so, we can independently evaluate the performance of our evolutionary predictions. We scored mutations using the Combined Annotation-Dependent Depletion (CADD v. 1.4) algorithm (Fig. 3a)25. Using a combination of functional prediction, conservation, epigenetic measurements, gene annotations, and the sequence surrounding a given variant, CADD provides a measure of the functional impact of single nucleotide variants, and small insertions/deletions, in the genome. CADD scores assess whether a mutation alters protein function but cannot readily distinguish whether it changes protein expression, inhibits protein activity, or causes the protein to be constitutively active; as such, we will call mutations with high CADD scores “function-altering”.
Overall, we observe that mutations falling in known driver genes tend to have a higher CADD score than mutations in non-driver genes. However, in keeping with our expectations of neutral evolution, we do not observe a significant difference in CADD score between mutations in known driver genes (n = 32) and non-driver genes in neutral cases (n = 899) (Wilcoxon rank-sum test, W = 7598.5, p-value = 0.3), suggesting that these mutations are not function-altering and confer no relative fitness advantage or disadvantage to the clone in which they are found. In comparison, in individuals showing evidence of positive selection (positive (n = 47) and combination (n = 401)), mutations in known driver genes had significantly higher CADD scores than mutations in non-driver genes (n = 21 and n = 1487, for positive and combination classes, respectively) (Wilcoxon rank-sum test, positive models: W = 548, p-value = 0.001; combination models: W = 278,136, p-value < 2.2e−16). Further, we observe that the average CADD score assigned to passenger mutations in negative-only models (n = 153) is significantly lower (Wilcoxon rank-sum test, W = 39,298, p-value = 0.004) than passenger mutation CADD scores in the neutral class (n = 899), suggesting that negative selection plays a role in removing the more damaging mutations and decreasing the overall pathogenicity of segregating mutations. The role of negative selection in decreasing the overall pathogenicity of the blood pool is further supported by the average CADD scores of passenger mutations in the combination class being greater and smaller than the average score of passenger mutations in the negative-only and neutral class, respectively. In the absence of recombination, mutations which would typically be removed are able to continue to segregate in the blood population in the presence of positively selected driver mutations, and, accordingly, we observe higher average pathogenicity of passenger mutations in the combination class.
Finally, it is worth noting that the passenger mutations in the positive-only class have significantly lower pathogenicity than those in the combination class. Passenger mutations with higher pathogenicity are likely to be subject to stronger negative selection thus conferring a protective effect to the individual in the presence of positive selection acting on drivers. A better understanding of how these potentially protective mutations are distributed across genes would allow us to identify which genes might be critical in preventing clonal expansions.
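The CADD score comparisons above rely on the Wilcoxon rank-sum test. A minimal sketch of the test is given below, using midranks for ties and a normal approximation without continuity correction. These implementation choices, the function name `rank_sum_test`, and the toy scores are assumptions for illustration; the W values reported in the text may have been computed with different settings.

```python
import math

def rank_sum_test(sample_a, sample_b):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney) test.

    Returns (W, p_value), where W is the rank sum of sample_a minus its
    minimum possible value, and p_value comes from a normal approximation
    with a tie-corrected variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        size = j - i
        tie_term += size ** 3 - size
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    w = rank_sum_a - n_a * (n_a + 1) / 2
    mean_w = n_a * n_b / 2
    var_w = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w, p_value

# Toy CADD scores (synthetic, not data from the study).
driver_scores = [24.1, 27.3, 31.0, 22.5, 35.2]
non_driver_scores = [8.4, 12.0, 15.7, 9.9, 23.0, 11.3]
w_stat, p_val = rank_sum_test(driver_scores, non_driver_scores)
```

W counts the number of (driver, non-driver) pairs in which the driver mutation has the higher score, so values far from half the number of pairs indicate a shift between the two distributions.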
Clusters of genes are enriched for function-altering mutations across evolutionary classes
To investigate if certain genes are more frequently found to play a protective role when mutated, we determined which genes are enriched for function-altering mutations in each evolutionary class, as well as the overlap of genes with function-altering mutations across evolutionary classes (Fig. 3b). For this comparison, we used a lower threshold of a CADD score of 10 to determine which mutations are likely to be function-altering. The majority of genes harboring function-altering mutations are observed in combination and neutral classes of evolution. However, there are subsets of genes that are enriched exclusively for function-altering mutations in the presence of positive or negative selection. Reassuringly, we find that many known driver genes (DNMT3A, TET2, IDH2, TP53) are enriched for function-altering mutations among positive and combination classes of evolution only and not among neutral or negative classes of evolution (Supplementary Fig. 11). Further, we observe that there is an overlap of genes enriched for function-altering mutations in negative-only and combination classes of evolution indicating that these genes might experience stronger negative selection.
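The overlap analysis can be sketched with set operations on per-class gene lists. The threshold of 10 comes from the text; treating it as inclusive (`>=`), the function name `function_altering_genes`, the placeholder gene names, and the toy scores are assumptions for illustration. Note that this sketch only records the presence of a qualifying mutation and does not perform a statistical enrichment test.

```python
from collections import defaultdict

CADD_THRESHOLD = 10  # threshold stated in the text; ">=" is an assumption

def function_altering_genes(mutations, threshold=CADD_THRESHOLD):
    """mutations: iterable of (gene, evolutionary_class, cadd_score) tuples.

    Returns {class: set of genes with at least one mutation at or above
    the threshold}.
    """
    genes = defaultdict(set)
    for gene, evo_class, cadd in mutations:
        if cadd >= threshold:
            genes[evo_class].add(gene)
    return dict(genes)

# Placeholder genes and scores (synthetic, for illustration only).
toy_mutations = [
    ("GENE_A", "positive", 28.0),
    ("GENE_A", "combination", 25.5),
    ("GENE_B", "negative", 14.2),
    ("GENE_B", "combination", 17.9),
    ("GENE_C", "neutral", 11.0),
    ("GENE_D", "neutral", 3.1),  # below threshold, ignored
]
by_class = function_altering_genes(toy_mutations)

# Genes shared by the negative-only and combination classes.
shared_negative_combination = by_class["negative"] & by_class["combination"]

# Genes seen only in classes with positive selection.
positive_selection_only = (by_class["positive"] | by_class["combination"]) - (
    by_class["neutral"] | by_class["negative"]
)
```

In this toy example `GENE_B` falls in the negative-only and combination overlap, and `GENE_A` appears only in classes with positive selection, mirroring the two patterns highlighted in the text.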
We next asked if the inferred pathogenicity of mutations in the dominant clone corresponds with the frequency at which it is observed in the mature blood cell pool. To do so, we evaluated the relationship between the CADD score and the frequency of the dominant clone, defined as the clone with the highest variant allele frequency in an individual, in each class (Fig. 3c). We find that clones are able to rise to fixation in the absence of both negative and positive selection, where the primary driving force of evolution is genetic drift. Clones rising to a high frequency stochastically could, in part, be explained by a reduction in the effective population size of the HSC population owing to a small population of stem cells with a higher fitness dominating blood cell production. With a reduced population size, mutations are able to rise to a higher frequency and become fixed in a population more rapidly. However, only mutations with a low CADD score are found at high frequencies in the neutral class. As expected, in the presence of negative selection, we observe a depletion of clones in the higher pathogenicity categories as they have likely been removed by selection. Clones that persist in the negative-only model could indicate a functional threshold at which mutations are not efficiently removed by selection and continue to segregate in the population. Conversely, in the positive-only class, clones, including those with high pathogenicity, are found at higher frequencies. We observe a higher variance in CADD scores in the combination class, which is consistent with our expectation that, when neither positive nor negative selection is able to act efficiently, variants will segregate at intermediate frequencies rather than sweeping to fixation or being purged from the population, respectively.
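Selecting the dominant clone as defined above (the mutation with the highest variant allele frequency in each individual) can be sketched as follows. The record layout, the function name `dominant_clones`, and the toy values are assumptions for illustration.

```python
def dominant_clones(mutations):
    """mutations: iterable of dicts with keys 'individual', 'vaf', 'cadd'.

    Returns {individual: the mutation record with the highest VAF}.
    Ties keep the first record encountered.
    """
    best = {}
    for mutation in mutations:
        current = best.get(mutation["individual"])
        if current is None or mutation["vaf"] > current["vaf"]:
            best[mutation["individual"]] = mutation
    return best

# Toy mutation calls (synthetic, for illustration only).
toy_calls = [
    {"individual": "P1", "vaf": 0.04, "cadd": 12.0},
    {"individual": "P1", "vaf": 0.21, "cadd": 27.5},
    {"individual": "P2", "vaf": 0.09, "cadd": 6.3},
]
top = dominant_clones(toy_calls)
```

Pairing each dominant clone's `vaf` with its `cadd` value, grouped by evolutionary class, gives the relationship examined in Fig. 3c.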
Selective interference may be associated with slowing clonal expansions
Having established that in the combination class, controls have significantly higher non-driver to driver ratios and that these non-drivers have significantly higher CADD scores than those in the positive-only class, we then ask whether non-drivers played a role in preventing progression to AML through selective interference. Selective interference is particularly relevant as studies report that driver mutations, while found in both healthy controls and preleukemic cases, tend to segregate at a much higher frequency in a preleukemic context7. We propose that selective interference, where the linkage between sites under multiple selective pressures will define the overall impact of selection acting on the population, could play a role in preventing mutations from rising to a high frequency in controls either through passenger mutations hitchhiking within the same clones as driver mutations, or if driver mutations arise in different clones and are competing for dominance in a finite cell pool13,26,27. We expect that clones under purely positive selection will be found at higher frequencies in blood compared to those which are subject to a combination of positive and negative selection where interference might play a role in preventing selective sweeps.
We find that mutations in preleukemic cases fitting a combination class (n = 403) tend to segregate at a significantly higher frequency compared to controls (n = 1095) (Wilcoxon rank-sum test, W = 254,988, p-value < 2.2e−16) (Fig. 3d). However, we do not observe a difference in the frequency at which mutations are found to segregate between cases and controls fitting positive models of evolution or between cases fitting combination and positive models of evolution. Decreased variant allele frequencies in healthy individuals fitting combination models are consistent with our prediction that selection acting on a subset of mutations in healthy controls prevents progression to disease even in the presence of positive selection. However, we do observe signatures of negative selection in preleukemic contexts which suggests that the impact of selection acting on passenger mutations, while detectable through our methods which incorporate multiple summaries of the data, remains negligible with respect to clonal progression. Further, we do observe that, while not significant owing to sample size, preleukemic cases fitting combination models tend to have a later age of diagnosis than preleukemic cases fitting positive-only evolutionary classes indicating that negative selection might play a role in slowing progression to disease.
Indeed, we find that ARCH occurring in the absence of positive selection, that is, in individuals who fit negative or neutral classes of evolution, is associated with a lower risk of progression to AML compared to individuals who have signatures of positive selection (log-rank test, p-value = 2e−04) (Fig. 4a). We find that individuals who fit combination classes of evolution have an approximately two-fold increased risk of progressing to AML compared to individuals fitting neutral models of evolution (hazard ratio, 2.43; 95% confidence interval, 1.45–4.06, Fig. 4b). Owing to the small number of individuals fitting positive-only classes of evolution, we cannot infer if the negative selection acting on passenger mutations in individuals fitting combination classes of evolution reduces the risk of progressing to AML. However, a scenario where multiple clones compete for dominance, thus maintaining clones at intermediate frequencies, would explain the greater risk conferred to individuals fitting combination classes of evolution. Our findings suggest that not all passenger mutations are equal in that some might be more efficient in preventing disease-associated clonal expansions. Further, by accounting for mutations that segregate alongside driver mutations, we would be able to greatly improve our understanding of ARCH as a biomarker for disease and better predict who is at risk of progressing to cancer.
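As a consistency check on the reported hazard ratio, a confidence interval that is symmetric on the log scale should have the point estimate as the geometric mean of its bounds. The sketch below assumes a Wald-type 95% interval on the log hazard ratio, which is common for proportional hazards models but is not stated in the text.

```python
import math

# Values reported in the text.
hazard_ratio = 2.43
ci_lower, ci_upper = 1.45, 4.06

# Assumption: the interval is exp(log HR +/- 1.96 * SE).
log_midpoint = (math.log(ci_lower) + math.log(ci_upper)) / 2
implied_hr = math.exp(log_midpoint)
implied_se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * 1.96)

print(f"implied HR = {implied_hr:.2f}, SE(log HR) = {implied_se:.2f}")
# implied HR = 2.43, SE(log HR) = 0.26
```

The implied point estimate matches the reported hazard ratio, so the interval is internally consistent under this assumption.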