Supplementary Material for: Global Individual Ancestry Using Principal Components for Family Data
Dataset posted on 09.07.2015, 00:00 by de Andrade M., Ray D., Pereira A.C., Soler J.P.
Studies of complex human diseases and traits associated with candidate genes are potentially vulnerable to bias (confounding) due to population stratification and inbreeding, especially in admixed populations. In GWAS, the principal components (PCs) method provides a global ancestry value per subject, allowing corrections for population stratification. However, these coefficients are typically estimated assuming unrelated individuals; if family structure is present and ignored, it may induce artifactual PCs. Extensions of the PCs method have been proposed by Konishi and Rao [Biometrika 1992;79:631-641], taking into account only sibling relatedness, and by Oualkacha et al. [Stat Appl Genet Mol Biol 2012, DOI: 10.2202/1544-6115.1711], taking into account large pedigrees and high-dimensional phenotype data. In this work, we extend these methods to estimate the global individual ancestry coefficients from PCs derived from different variance component matrix estimators, using SNPs from two simulated data sets and two real data sets: the GENOA sibship data, consisting of European and African-American subjects, and the Baependi Heart Study, consisting of 80 extended Brazilian families, both with genotyping data from the Affymetrix 6.0 chip. Our results show that family structure plays an important role in the estimation of the global individual ancestry value for extended pedigrees but not for sibships.
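For orientation, the baseline that the abstract contrasts against (PCs estimated as if all subjects were unrelated) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' family-based estimators: the function name `ancestry_pcs`, the simulated allele frequencies, and the sample sizes are all assumptions made up for the example, and the genotypes are random draws rather than GENOA or Baependi data.

```python
import numpy as np

def ancestry_pcs(genotypes, n_pcs=2):
    """Hypothetical helper: PCs from an (n_subjects, n_snps) matrix of 0/1/2
    allele counts, treating all subjects as unrelated."""
    G = np.asarray(genotypes, dtype=float)
    p = G.mean(axis=0) / 2.0                  # sample allele frequency per SNP
    keep = (p > 0) & (p < 1)                  # drop monomorphic SNPs
    G, p = G[:, keep], p[keep]
    # Center and scale each SNP by its binomial standard deviation.
    Z = (G - 2 * p) / np.sqrt(2 * p * (1 - p))
    # Subject-by-subject genetic relationship matrix.
    grm = Z @ Z.T / Z.shape[1]
    eigvals, eigvecs = np.linalg.eigh(grm)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_pcs] # keep the largest n_pcs
    return eigvecs[:, order], eigvals[order]

# Simulated example (assumed values): two populations with different
# allele frequencies, 30 unrelated subjects each, 500 SNPs.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=500)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, size=500), 0.05, 0.95)
pop_a = rng.binomial(2, freq_a, size=(30, 500))
pop_b = rng.binomial(2, freq_b, size=(30, 500))
pcs, eigvals = ancestry_pcs(np.vstack([pop_a, pop_b]))
```

Each row of `pcs` gives one subject's coordinates on the leading PCs, which is what is used as a global ancestry value. With related subjects, the relationship matrix also carries within-family correlation, which is the source of the artifactual PCs mentioned above.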