Currently favored sampling practices for tumor sequencing can produce optimal results in the clinical setting
We picked five different setups for in-depth analysis. Single tumor region, local sample, and global sample were taken and analyze from five ovarian cancer patients (see Materials and Methods, Fig. 1A). We selected three patients with a germline mutation affecting genes in the BRCA pathway (Fig. 1B) termed as “HR-deficient” or “BRCA-deficient” patients. The first patient termed as “BRCA1-deficient” harbored a frameshift BRCA1 mutation (rs397507208), that was covered by somatic loss of heterozygosity event (p = 1.5E−03). This tumor also had a heterozygous frame-shift variant in BRCA2 that was not covered by a LOH event. The second patient termed as “BRCA2-deficient” had a pathogenic frameshift mutation in the BRCA2 gene (rs80359550), which was covered by a LOH event (p = 6.5E−05). A third patient (hence on termed as “PALB2-mutant”) harbored a mismatch mutation and somatic LOH (p = 1.7E−08) in the PALB2 gene, which was previously associated with breast cancer7.
Copy-number alterations and loss of heterozygosity (LOH) events were very dissimilar among the patients (Supplementary Table 1). We identified a “copy-number stable” patient (Fig. 1C), with one partial amplification on the first chromosome, and one patient with multiple amplifications affecting entire chromosomes (Fig. 1C) with three-fold more somatic mutations (hence termed as “hypermutating”). The HR-deficient tumors displayed multiple events on smaller chromosome segments (Fig. 1C). One patient (“copy number stable”) displayed far fewer events than the remaining patients did (n = 250 in the “copy number stable” vs. n = 1,015–5,050 in other samples).
Somatic mutation patterns in HR-deficient tumors displayed the BRCA-deficiency associated Signature 3 (Fig. 2A), where base substitution proportions were more evenly distributed across the four nucleotides (Fig. 2B). We identified signatures 2 and 13 in the copy-number stable tumor, which are associated to the activity of the APOBEC cytidine deaminases family and highly increased C > T somatic substitutions. In case of the hypermutating tumor, two signatures with unknown origin were found (Supplementary Table 2).
Of note, when investigating mutations in key oncogenes and tumor suppressor genes (Fig. 2C), HR-deficient tumors were driven by DNA-repair defect paired with somatic TP53 mutations. In case of the hypermutating tumor, we identified the two most common KRAS and PIK3CA activating mutations. The copy-number stable tumor contained multiple mutations affecting key genes in the PI3K pathway, such as PTEN, FGFR2 and two deleterious mutations in the PIK3R1 gene.
When evaluating sequencing outcomes related to different sampling strategies, high frequency clonal mutations were common across all samples, while low-frequency sub-clonal mutations displayed clonal evolution patterns (Fig. 3A). The number of trunk mutations (shared in every region) spanned between 79 and 112 in all patients. While in terms of absolute value, most mutations were identified in the biopsy sample of the hypermutating tumor (n = 583), only 15.9% of these were trunk mutations (Supplemental Fig. 1A). The proportion of trunk mutations increased as the sample size increased and reached 71.4% of mutations in the global sample, comparable to proportions detected in the other samples. This shift in mutation proportions in the hypermutating tumor may be caused by the dilution of sub-clones with increasing tumor size. The highest level of homogeneity was found in the copy-number stable tumor, where > 81.7% of mutations in any sample were trunk mutations.
By comparing mutation variant allele frequencies (VAFs) between samples, we identified different evolution patterns. In case of the BRCA1/2 mutant (HR-deficient) tumors, somatic mutation VAFs were similar between samples and evenly distributed (Fig. 3B). The copy-number stable and PALB2-mutant tumors harbored predominantly high frequency mutations, possibly hinting on a “big bang” model of evolution8. In case of the hypermutating tumor, high frequency clonal mutations were similar between samples, while sub-clonal mutations had higher VAFs in the biopsy sample (Fig. 3B).
In the PALB2-deficient, BRCA2-deficient, and copy-number stable tumors the global sample identified slightly higher proportions of mutations, while in the BRCA1 and hypermutating tumors the biopsy samples covered higher proportions of mutations (Fig. 3C). However, there were no statistically significant differences in the proportion of identified mutations when performing pairwise comparisons between the biopsy, local, and global samples were computed (Fig. 3C).
Cases with available metastatic samples can be used to simulate sample pooling. For this analysis, we utilized data from a clear renal cell carcinoma (ccRCC) multi-region sequencing study1. Gerlinger et al. uncovered the branched evolution of ccRCC by sequencing multiple distinct regions of the same tumor. While some mutations were shared among distinct regions of the tumor (trunk), other branch mutations were shared across samples, and there were also several sample specific (private) mutations. We utilized samples from their study because primary and metastatic samples were available, and because ccRCC tumors show similar tumor evolution patterns as ovarian tumors9,10,11. Our analysis focused on the possibility of pooling samples to capture a wider set of variants without the need to sequence multiple samples from each tumor. We modelled the effects of sample size by sampling from the primary and metastatic sites, changing the composition ratios of the sites and coverage depths. Through combination of data derived from primary and metastatic samples we modelled the effect of high intra-tumor heterogeneity on sampling and mutation composition shifts (Supplemental Fig. 1B). From the in silico results we uncovered that most clonal branch mutations of the minor sample could be identified when proportion reached 20% composition, where 93% of all clonal mutations (including both trunk and region specific) were identified. Most clonal mutations (> 95%) were identified when the composition of the two samples ranged between 30 and 60%. At the same time, there was no composition generating a clear maximum in mutation detection. We have also seen this when we examined the effect of sequencing coverage on mutation calling. When coverage reached the original sequencing coverage (onefold), 92% of clonal mutations were identified, but interestingly only 10% of sub-clonal mutations. As coverage increased, there was no radical increase in identified clonal mutations, while the percentage of identified sub-clonal mutations increased in a linear fashion until coverages reached ~ 300 × (Supplemental Fig. 1C).