Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations

Published on Aug 13, 2018 in Nature Genetics 27.603
· DOI: 10.1038/S41588-018-0183-Z
Amit Khera (Harvard University), estimated H-index: 75
Mark Chaffin (Broad Institute), estimated H-index: 26
+ 8 authors
Sekar Kathiresan, estimated H-index: 137
A key public health need is to identify individuals at high risk for a given disease to enable enhanced screening or preventive therapies. Because most common diseases have a genetic component, one important approach is to stratify individuals based on inherited DNA variation1. Proposed clinical applications have largely focused on finding carriers of rare monogenic mutations at several-fold increased risk. Although most disease risk is polygenic in nature2–5, it has not yet been possible to use polygenic predictors to identify individuals at risk comparable to monogenic mutations. Here, we develop and validate genome-wide polygenic scores for five common diseases. The approach identifies 8.0, 6.1, 3.5, 3.2, and 1.5% of the population at greater than threefold increased risk for coronary artery disease, atrial fibrillation, type 2 diabetes, inflammatory bowel disease, and breast cancer, respectively. For coronary artery disease, this prevalence is 20-fold higher than the carrier frequency of rare monogenic mutations conferring comparable risk6. We propose that it is time to contemplate the inclusion of polygenic risk prediction in clinical care, and discuss relevant issues.
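As a rough illustration of the quantities in the abstract, the sketch below uses the standard general definition of a polygenic score (a weighted sum of risk-allele dosages). The abstract does not describe how the authors derived their weights, so the function names, toy dosages, and toy weights here are hypothetical. The second helper simply works through the abstract's statement that the 8.0% high-risk prevalence for coronary artery disease is 20-fold higher than the monogenic carrier frequency.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def implied_carrier_frequency(high_risk_pct: float, fold: float) -> float:
    """Carrier frequency (in %) implied by 'high_risk_pct is fold times higher'."""
    return high_risk_pct / fold


# Hypothetical toy inputs: three variants with made-up per-allele weights.
toy_dosages = [0, 1, 2]
toy_weights = [0.10, -0.05, 0.20]
toy_score = polygenic_score(toy_dosages, toy_weights)

# From the abstract: 8.0% of the population exceeds threefold CAD risk,
# described as 20-fold higher than the monogenic carrier frequency.
cad_monogenic_pct = implied_carrier_frequency(8.0, 20.0)
```

Under these assumptions, the implied monogenic carrier frequency for coronary artery disease is about 0.4% of the population. A genome-wide score would sum over far more variants than this toy example.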
#1 Yan Zhang (Johns Hopkins University), H-Index: 12
#2 Guanghao Qi (Johns Hopkins University), H-Index: 6
Last. Nilanjan Chatterjee (Johns Hopkins University), H-Index: 15
view all 4 authors...
We developed a likelihood-based approach for analyzing summary-level statistics and external linkage disequilibrium information to estimate effect-size distributions of common variants, characterized by the proportion of underlying susceptibility SNPs and a flexible normal-mixture model for their effects. Analysis of results available across 32 genome-wide association studies showed that, while all traits are highly polygenic, there is wide diversity in the degree and nature of polygenicity. Psy...
111 Citations
#1 Kyriaki Michailidou (University of Cambridge), H-Index: 53
#2 Sara Lindström (Harvard University), H-Index: 60
Last. Douglas F. Easton (University of Cambridge), H-Index: 172
view all 359 authors...
Breast cancer risk is influenced by rare coding variants in susceptibility genes, such as BRCA1, and many common, mostly non-coding variants. However, much of the genetic contribution to breast cancer risk remains unknown. Here we report the results of a genome-wide association study of breast cancer in 122,977 cases and 105,974 controls of European ancestry and 14,068 cases and 13,104 controls of East Asian ancestry1. We identified 65 new loci that are associated with overall breast cancer risk...
555 Citations
#1 Robert A. Scott (University of Cambridge), H-Index: 106
#2 Laura J. Scott (UM: University of Michigan), H-Index: 74
Last. Inga Prokopenko (University of Oxford), H-Index: 89
view all 175 authors...
To characterize type 2 diabetes (T2D)-associated variation across the allele frequency spectrum, we conducted a meta-analysis of genome-wide association data from 26,676 T2D case and 132,532 control subjects of European ancestry after imputation using the 1000 Genomes multiethnic reference panel. Promising association signals were followed up in additional data sets (of 14,545 or 7,397 T2D case and 38,994 or 71,604 control subjects). We identified 13 novel T2D-associated loci (P < 5 × 10⁻⁸), inc...
370 Citations
#1 Anna Fry, H-Index: 1
Last. Naomi E. Allen, H-Index: 116
view all 8 authors...
The UK Biobank cohort is a population-based cohort of 500,000 participants recruited in the United Kingdom (UK) between 2006 and 2010. Approximately 9.2 million individuals aged 40-69 years who lived within 25 miles (40 km) of one of 22 assessment centers in England, Wales, and Scotland were invited to enter the cohort, and 5.5% participated in the baseline assessment. The representativeness of the UK Biobank cohort was investigated by comparing demographic characteristics between nonresponder...
711 Citations
#1 Yan Zhang (Johns Hopkins University), H-Index: 12
#2 Guanghao Qi (Johns Hopkins University), H-Index: 6
Last. Nilanjan Chatterjee (Johns Hopkins University), H-Index: 90
view all 4 authors...
Summary-level statistics from genome-wide association studies are now widely used to estimate heritability and co-heritability of traits using the popular linkage-disequilibrium-score (LD-score) regression method. We develop a likelihood-based approach for analyzing summary-level statistics and external LD information to estimate common variants effect-size distributions, characterized by proportion of underlying susceptibility SNPs and a flexible normal-mixture model for their effects. Analysis...
7 Citations
#1 Clare Bycroft (University of Oxford), H-Index: 6
#2 Colin Freeman (University of Oxford), H-Index: 32
Last. Jonathan Marchini (University of Oxford), H-Index: 68
view all 15 authors...
The UK Biobank project is a large prospective cohort study of ~500,000 individuals from across the United Kingdom, aged between 40-69 at recruitment. A rich variety of phenotypic and health-related information is available on each participant, making the resource unprecedented in its size and scope. Here we describe the genome-wide genotype data (~805,000 markers) collected on all individuals in the cohort and its quality control procedures. Genotype data on this scale offers novel opportunities...
418 Citations
#1 Ingrid E. Christophersen (Broad Institute), H-Index: 18
#2 Michiel Rienstra (UG: University of Groningen), H-Index: 54
Last. Patrick T. Ellinor (Broad Institute), H-Index: 101
view all 162 authors...
Atrial fibrillation affects more than 33 million people worldwide and increases the risk of stroke, heart failure, and death. Fourteen genetic loci have been associated with atrial fibrillation in European and Asian ancestry groups. To further define the genetic basis of atrial fibrillation, we performed large-scale, trans-ancestry meta-analyses of common and rare variant association studies. The genome-wide association studies (GWAS) included 17,931 individuals with atrial fibrillation and 115,...
154 Citations
#1 Alicia R. Martin, H-Index: 29
#2 Christopher R. Gignoux (Stanford University), H-Index: 42
Last. Eimear E. Kenny, H-Index: 37
view all 9 authors...
The vast majority of genome-wide association studies (GWASs) are performed in Europeans, and their transferability to other populations is dependent on many factors (e.g., linkage disequilibrium, allele frequencies, genetic architecture). As medical genomics studies become increasingly large and diverse, gaining insights into population history and consequently the transferability of disease risk measurement is critical. Here, we disentangle recent population history in the widely used 1000 Geno...
571 Citations
#1 Amit Khera, H-Index: 75
#2 Sekar Kathiresan (Broad Institute), H-Index: 137
Age-adjusted mortality from coronary artery disease (CAD) has decreased substantially in recent decades, in large part, related to a combination of lifestyle modifications, pharmacological therapies, and revascularization strategies. But do we need a new approach? The All of Us Research Program (a cohort study within the Precision Medicine Initiative) will begin enrollment of ≥1million participants in 2017. This landmark resource will enable investigation into the substantial interindividual var...
22 Citations
#1 Pradeep Natarajan (Harvard University), H-Index: 39
#2 Robin Young (University of Cambridge), H-Index: 27
Last. Sekar Kathiresan (Harvard University), H-Index: 137
view all 14 authors...
Background: Relative risk reduction with statin therapy has been consistent across nearly all subgroups studied to date. However, in analyses of two randomized controlled primary prevention trials (ASCOT and JUPITER), statin therapy led to a greater relative risk reduction among a subgroup at high genetic risk. Here, we sought to confirm this observation in a third primary prevention randomized controlled trial. Additionally, we assessed if those at high genetic risk had a greater burden of subc...
194 Citations
Cited By 867
#1 Rishi Caleyachetty (University of Oxford), H-Index: 1
#2 Thomas J. Littlejohns (University of Oxford), H-Index: 13
Last. Naomi E. Allen (University of Oxford), H-Index: 116
An increasing number of people are now living with cardiovascular disease (CVD), with concomitant CVD-related hospitalizations, operations, and prescriptions. To ultimately deliver optimal cardiovascular care, access to population-based biobanks with data on multiomics, phenotypes, and lifestyle risk factors are crucial. UK Biobank is a cohort study that incorporated data between 2006 and 2010 from over half a million individuals (40 to 69 years of age) at recruitment from across the United King...
Introduction: Ticagrelor is widely considered superior to clopidogrel; however, a pharmacogenetic substudy of PLATO indicated that the majority of this difference is due to genetic nonresponders to clopidogrel. We evaluated patient outcomes following genotyping for CYP2C19 in a propensity-matched acute coronary syndrome cohort treated with either clopidogrel, ticagrelor or aspirin monotherapy. Methods: ICD10 coding identified 6,985 acute coronary syndrome patients at Waitematā District Health B...
#1 Evan D. Muse (Torrey Pines Institute for Molecular Studies), H-Index: 21
#2 Shang-Fu Chen (Torrey Pines Institute for Molecular Studies), H-Index: 5
Last. Ali Torkamani (Torrey Pines Institute for Molecular Studies), H-Index: 1
Purpose of the review: Coronary artery disease (CAD) is a common disease globally attributable to the interplay of complex genetic and lifestyle factors. Here, we review how genomic sequencing advances have broadened the fundamental understanding of the monogenic and polygenic contributions to CAD and how these insights can be utilized, in part by creating polygenic risk estimates, for improved disease risk stratification at the individual patient level. Recent findings: Initial stu...
#1 Nicholas S. Diab (Yale University), H-Index: 1
#2 Syndi Barish (Yale University), H-Index: 2
Last. Sheng Chih Jin (WashU: Washington University in St. Louis), H-Index: 21
Congenital heart disease (CHD) is the most common congenital malformation and the leading cause of mortality therein. Genetic etiologies contribute to an estimated 90% of CHD cases, but so far, a molecular diagnosis remains unsolved in up to 55% of patients. Copy number variations and aneuploidy account for ~23% of cases overall, and high-throughput genomic technologies have revealed additional types of genetic variation in CHD. The first CHD risk genotypes identified through high-throughput seq...
#1 Erik Widen (MSU: Michigan State University), H-Index: 1
#2 Timothy G. Raben (MSU: Michigan State University), H-Index: 4
Last. Stephen D. H. Hsu (MSU: Michigan State University), H-Index: 30
We use UK Biobank data to train predictors for 65 blood and urine markers such as HDL, LDL, lipoprotein A, glycated haemoglobin, etc. from SNP genotype. For example, our Polygenic Score (PGS) predictor correlates ∼0.76 with lipoprotein A level, which is highly heritable and an independent risk factor for heart disease. This may be the most accurate genomic prediction of a quantitative trait that has yet been produced (specifically, for European ancestry groups). We also train predictors of commo...
#1 Dimitrios Vlachakis (UoA: National and Kapodistrian University of Athens), H-Index: 19
#2 Eleni Papakonstantinou (AUA: Agricultural University of Athens), H-Index: 3
Last. Dimitrios Avramopoulos (JHBMC: Johns Hopkins Bayview Medical Center), H-Index: 47
The treatment of complex and multifactorial diseases constitutes a big challenge in day-to-day clinical practice. As many parameters influence clinical phenotypes, accurate diagnosis and prompt therapeutic management is often difficult. Significant research and investment focuses on state-of-the-art genomic and metagenomic analyses in the burgeoning field of Precision (or Personalized) Medicine with genome-wide-association-studies (GWAS) helping in this direction by linking patient genotypes at ...
#1 Brooke N. Wolford (UM: University of Michigan), H-Index: 11
#2 Ida Surakka (UM: University of Michigan), H-Index: 34
Last. Whitney E. Hornsby (UM: University of Michigan), H-Index: 22
Clinicians have historically used family history and other risk prediction algorithms to guide patient care and preventive treatment such as statin therapeutics for coronary artery disease. As polygenic scores move towards clinical use, we have begun to consider the interplay of these scores with other predictors for optimal second generation risk prediction. Here, we assess the use of family history and polygenic scores as independent predictors of coronary artery disease and type 2 diabetes. W...
#1 Gunn S (BU: Boston University)
#2 Michael Wainberg (Institute for Systems Biology), H-Index: 14
Last. Nathan O. Stitziel (WashU: Washington University in St. Louis), H-Index: 33
Background: A surprising and well-replicated result in genetic studies of human longevity is that centenarians appear to carry disease-associated variants in numbers similar to the general population. With the proliferation of large genome-wide association studies (GWAS) in recent years, investigators have turned to polygenic scores to leverage GWAS results into a measure of genetic risk that can better predict risk of disease than individual significant variants alone. Methods: We selected 54 p...
#1 Charles A. German (Wake Forest University), H-Index: 4
#2 Michael D. Shapiro (Wake Forest University), H-Index: 45
Purpose of review: The purpose of this review is to understand the conceptual basis and implications of polygenic risk scores (PRS) in assessing risk of future coronary artery disease (CAD). Recent findings: Genetic information from the USA and beyond has been pooled together to create population-based biobanks, composed of millions of genotyped individuals, which have helped further our understanding of the relationship between single nucleotide polymorphisms (SNPs) and CAD. Contem...
#1 Ida Surakka (UM: University of Michigan), H-Index: 34
#2 Brooke N. Wolford (UM: University of Michigan), H-Index: 11
Last. Kristian Hveem (NTNU: Norwegian University of Science and Technology), H-Index: 83
Background: The 10-year Atherosclerotic Cardiovascular Disease (ASCVD) risk score is the standard approach to predict risk of incident cardiovascular events and recently, addition of CAD polygenic scores (PGS_CAD) have been evaluated. Although age and sex strongly predict the risk of CAD, their interaction with genetic risk prediction has not been systematically examined. Objectives: This study performed an in-depth evaluation of age and sex effects in genetic CAD risk prediction. Methods: Th...