Equivalence of LD-Score Regression and Individual-Level-Data Methods
Abstract
LD-score (LDSC) regression disentangles the contributions of polygenic signal, expressed as SNP-based heritability, and population stratification, expressed as a so-called intercept, to GWAS test statistics. Whereas LDSC regression uses summary statistics, methods such as Haseman-Elston (HE) regression and genomic-relatedness-matrix restricted maximum likelihood (GREML) estimation infer SNP-based heritability directly from individual-level data. Because the two types of methods use different data, they are typically regarded as distinct classes of methods. Nevertheless, recent work has shown that LDSC and HE regression yield approximately equivalent estimates of SNP-based heritability when confounding stratification is absent. We extend this equivalence: under the stratification assumed by LDSC regression, we show that the LDSC intercept can be estimated by performing a simple regression of the phenotype on the leading principal component of the genomic-relatedness matrix and transforming this estimate using the corresponding eigenvalue. Using simulated phenotypes, we find that, in the case of two discrete populations, intercept estimates obtained from individual-level data are nearly equivalent to estimates from LDSC regression itself (R² = 99.9% between estimates from the different methods). Moreover, using three discrete populations, we show that the intercept, as estimated by LDSC regression, can be retrieved nearly perfectly from a linear combination of the squared coefficients from a regression of the phenotype on the leading two principal components, weighted by a function of the corresponding eigenvalues and the sample size (R² = 99.1% between estimates). An empirical application corroborates these findings. Hence, an equivalence principle holds even for complex forms of stratification.
Consequently, methods such as LDSC regression are not profoundly different from methods that use individual-level data: parameters that are identified within the LDSC framework can be identified equally well by methods using individual-level data.
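As an illustration of the first step described above, the following is a minimal sketch, not the authors' implementation, of regressing a phenotype on the leading principal component of a genomic-relatedness matrix (GRM) for two simulated discrete populations. The function name `leading_pc_regression`, the simulation settings (sample size, number of SNPs, allele-frequency differences, phenotype shift), and the GRM definition as a cross-product of standardized genotypes are all assumptions made for this example. The transformation of the slope into an LDSC intercept via the eigenvalue is not specified in the abstract, so it is not reproduced here; the sketch only returns the two quantities that the transformation would use.

```python
import numpy as np


def leading_pc_regression(genotypes, phenotype):
    """Regress a phenotype on the leading eigenvector of the GRM.

    Hypothetical helper for illustration only.
    Returns (slope, leading eigenvalue, leading eigenvector).
    """
    g = np.asarray(genotypes, dtype=float)
    g = g[:, g.std(axis=0) > 0]              # drop monomorphic SNPs
    z = (g - g.mean(axis=0)) / g.std(axis=0)  # standardize each SNP
    grm = z @ z.T / z.shape[1]               # assumed GRM definition
    eigvals, eigvecs = np.linalg.eigh(grm)   # eigenvalues in ascending order
    lam, pc = eigvals[-1], eigvecs[:, -1]

    y = np.asarray(phenotype, dtype=float)
    y = y - y.mean()
    pc_c = pc - pc.mean()
    slope = (pc_c @ y) / (pc_c @ pc_c)       # simple OLS slope with intercept
    return slope, lam, pc


# Assumed toy setting: two discrete populations with differing allele
# frequencies and a phenotype that differs in mean between them
# (pure stratification, no genetic effects).
rng = np.random.default_rng(0)
n_per_pop, m = 100, 2000
p1 = rng.uniform(0.2, 0.8, size=m)
p2 = np.clip(p1 + rng.normal(0.0, 0.15, size=m), 0.05, 0.95)
genotypes = np.vstack([
    rng.binomial(2, p1, size=(n_per_pop, m)),
    rng.binomial(2, p2, size=(n_per_pop, m)),
])
pop = np.repeat([0.0, 1.0], n_per_pop)
phenotype = 1.0 * pop + rng.normal(size=2 * n_per_pop)

slope, lam, pc = leading_pc_regression(genotypes, phenotype)
print("slope on leading PC:", slope)
print("leading eigenvalue:", lam)
```

In this toy setting the leading principal component essentially encodes population membership, which is why a regression on it picks up the stratification in the phenotype. The sign of the slope is arbitrary, because an eigenvector is only defined up to sign.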
Authors
- Ronald de Vlaming (VU Amsterdam)
- Magnus Johannesson (Stockholm School of Economics)
- Patrik KE Magnusson (Karolinska Institutet)
- M Arfan Ikram (Erasmus Medical Center)
- Peter M Visscher (University of Queensland)
Topic Area
Statistical Methods
Session
2A-OS » Methods (13:15 - Thursday, 29th June, Sal A)