Polygenic Prediction via Bayesian Regression and Continuous Shrinkage Priors

Abstract

Polygenic prediction has shown promise in identifying individuals at high risk for complex diseases, and may become clinically useful as the predictive performance of polygenic risk scores (PRS) improves. To date, most applications calculate PRS using a subset of largely independent genetic markers, but this approach discards information and limits the predictive value of PRS. More sophisticated Bayesian genomic prediction methods that jointly model genetic markers across the genome are computationally challenging and do not accurately account for linkage disequilibrium (LD) structure. Here, we present PRS-CS, a novel polygenic prediction method that infers posterior SNP effect sizes using GWAS summary statistics and an external LD reference panel. PRS-CS uses a high-dimensional Bayesian regression framework, and differs from previous work by placing a continuous shrinkage (CS) prior on SNP effect sizes, which is robust to varying genetic architectures, provides substantial computational advantages, and enables multivariate modeling of local LD patterns. Simulation studies using data from the UK Biobank show that PRS-CS outperforms existing methods across a wide range of effect size distributions, especially when the training sample size is large. We apply PRS-CS to predict six common, complex diseases and six quantitative traits in the Partners HealthCare Biobank, for which external large-scale GWAS summary statistics are publicly available, and further demonstrate the improvement of PRS-CS in prediction accuracy over alternative methods.
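To make the idea of shrinking SNP effects jointly under LD more concrete, the following is a minimal illustrative sketch, not the PRS-CS implementation. It assumes standardized genotypes, marginal GWAS effect estimates, and per-SNP prior scales that are held fixed, whereas a full Bayesian treatment with a continuous shrinkage prior would infer those scales from the data. Under these assumptions the conditional posterior mean reduces to a generalized ridge solve. The function names `shrink_effects` and `polygenic_score` are hypothetical.

```python
import numpy as np

def shrink_effects(beta_hat, ld, scales):
    """Conditional posterior mean of joint SNP effects for FIXED prior scales.

    Assumption (illustrative only): solves (D + T^{-1}) beta = beta_hat,
    where D is the LD (correlation) matrix of a block of SNPs and
    T = diag(scales). A larger scale means less shrinkage for that SNP.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    ld = np.asarray(ld, dtype=float)
    scales = np.asarray(scales, dtype=float)
    return np.linalg.solve(ld + np.diag(1.0 / scales), beta_hat)

def polygenic_score(genotypes, weights):
    """Score each individual as the weighted sum of their genotypes."""
    return np.asarray(genotypes, dtype=float) @ np.asarray(weights, dtype=float)

# Toy example: two uncorrelated SNPs, unit prior scales.
beta_hat = [0.2, -0.4]             # marginal effect estimates (made up)
ld = np.eye(2)                     # no LD between the two SNPs
weights = shrink_effects(beta_hat, ld, scales=[1.0, 1.0])

genotypes = [[0, 1],               # rows: individuals, columns: SNPs
             [2, 1]]
scores = polygenic_score(genotypes, weights)
```

With no LD and unit scales, each marginal estimate is simply halved; with a non-identity LD matrix, the solve adjusts each effect for its correlated neighbors, which is the part that pruning to independent markers discards.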

Publication
Nature Communications, 10(1), 1776.
[Selected as Editors’ Highlights]