Assessment of polygenic architecture and risk prediction based on common variants across fourteen cancers.

Published on Jul 3, 2020 in Nature Communications
DOI: 10.1038/S41467-020-16483-3
Genome-wide association studies (GWAS) have identified hundreds of susceptibility loci across cancers, but the impact of further studies remains uncertain. Here we analyse summary-level data from GWAS of European ancestry across fourteen cancer sites to estimate the number of common susceptibility variants (polygenicity) and the underlying effect-size distribution. All cancers show a high degree of polygenicity, involving at a minimum thousands of loci. We project that the sample sizes required to explain 80% of GWAS heritability vary from 60,000 cases for testicular cancer to over 1,000,000 cases for lung cancer. The maximum relative risk achievable for subjects at the 99th risk percentile of the underlying polygenic risk scores (PRS), compared to average risk, ranges from 12 for testicular cancer to 2.5 for ovarian cancer. We show that PRS have potential for risk stratification for cancers of the breast, colon and prostate, but less so for others because of modest heritability and lower incidence.

In cancer, many gene variants may contribute to disease etiology, but their effect sizes may vary. Here, the authors analyse summary statistics of genome-wide association studies from fourteen cancers and show that the utility of polygenic risk scores may vary depending on cancer type.
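To make the "relative risk at the 99th percentile compared to average risk" quantity concrete, here is a minimal sketch. It assumes a simplified model that is not necessarily the one the authors used: the PRS is normally distributed with mean 0 and variance `prs_variance` on the log-risk scale, and disease risk is log-linear in the PRS (a rare-disease approximation). Under those assumptions, the risk at percentile p relative to the population-average risk is exp(z_p·σ − σ²/2), where z_p is the standard normal quantile. The function name and the variance values in the loop are hypothetical and chosen for illustration only; they are not estimates from the paper.

```python
import math
from statistics import NormalDist


def relative_risk_at_percentile(prs_variance: float, percentile: float) -> float:
    """Risk at a given PRS percentile relative to the population-average risk.

    Assumes (illustrative model, not taken from the paper):
      * PRS ~ Normal(0, prs_variance) on the log-risk scale
      * risk is proportional to exp(PRS), so risk is log-normally distributed
    The population-average risk is then proportional to exp(prs_variance / 2).
    """
    if prs_variance < 0:
        raise ValueError("prs_variance must be non-negative")
    if not 0.0 < percentile < 1.0:
        raise ValueError("percentile must lie strictly between 0 and 1")

    sigma = math.sqrt(prs_variance)
    z = NormalDist().inv_cdf(percentile)  # standard normal quantile z_p
    # Risk at the percentile is exp(z * sigma); divide by the mean exp(sigma^2 / 2).
    return math.exp(z * sigma - prs_variance / 2.0)


if __name__ == "__main__":
    # Hypothetical PRS variances, for illustration only.
    for variance in (0.1, 0.5, 1.0):
        rr = relative_risk_at_percentile(variance, 0.99)
        print(f"PRS variance {variance:.1f}: relative risk at 99th percentile = {rr:.2f}")
```

In this simplified model, a larger PRS variance (more heritability captured by the score) raises the relative risk in the upper tail, which is the sense in which the achievable stratification differs between cancer types.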