Monday, July 07, 2008

Structure-informative SNPs among European Americans

100-200 carefully selected SNPs are enough, they say, to adequately control for structure among European Americans "while maintaining power in association studies". Their method makes no assumptions about the origin or ancestry of individuals; unfortunately, they do not have any information on it either.
Also, they find very little overlap between the AIMs (ancestry informative markers, or more accurately structure informative markers) they propose here and those found in several previous studies of genetic diversity in Europe, but:
"The greatest overlap is found between the panel we propose here and the 1,441 SNPs proposed by Tian et al. [27] as distinguishing between northern European and Ashkenazi Jewish ancestry."
I'm sure Dienekes will have a field day with this one.

Tracing Sub-Structure in the European American Population with PCA-Informative Markers
Peristera Paschou, Petros Drineas, Jamey Lewis, Caroline M. Nievergelt, Deborah A. Nickerson, Joshua D. Smith, Paul M. Ridker, Daniel I. Chasman, Ronald M. Krauss, Elad Ziv
PLoS Genetics 4(7)
Abstract: Genetic structure in the European American population reflects waves of migration and recent gene flow among different populations. This complex structure can introduce bias in genetic association studies. Using Principal Components Analysis (PCA), we analyze the structure of two independent European American datasets (1,521 individuals–307,315 autosomal SNPs). Individual variation lies across a continuum with some individuals showing high degrees of admixture with non-European populations, as demonstrated through joint analysis with HapMap data. The CEPH Europeans only represent a small fraction of the variation encountered in the larger European American datasets we studied. We interpret the first eigenvector of this data as correlated with ancestry, and we apply an algorithm that we have previously described to select PCA-informative markers (PCAIMs) that can reproduce this structure. Importantly, we develop a novel method that can remove redundancy from the selected SNP panels and show that we can effectively remove correlated markers, thus increasing genotyping savings. Only 150–200 PCAIMs suffice to accurately predict fine structure in European American datasets, as identified by PCA. Simulating association studies, we couple our method with a PCA-based stratification correction tool and demonstrate that a small number of PCAIMs can efficiently remove false correlations with almost no loss in power. The structure informative SNPs that we propose are an important resource for genetic association studies of European Americans. Furthermore, our redundancy removal algorithm can be applied on sets of ancestry informative markers selected with any method in order to select the most uncorrelated SNPs, and significantly decreases genotyping costs.
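
To make the idea in the abstract a bit more concrete, here is a rough sketch of what "select SNPs that reproduce the PCA structure, then remove redundant ones" could look like. This is not the authors' code. The function names, the toy data, and the 0.8 threshold are all my own assumptions. The scoring step (squared loadings on the top principal components) is my reading of the general approach, and the greedy correlation filter is only a simple stand-in for the paper's redundancy removal algorithm, which the abstract does not describe.

```python
import numpy as np


def rank_snps_by_pca(genotypes, n_components=1):
    """Score each SNP by its squared loadings on the top principal components."""
    # Rows are individuals, columns are SNPs coded 0/1/2; center each SNP.
    centered = genotypes - genotypes.mean(axis=0)
    # Rows of vt are the principal axes; vt[i, j] is SNP j's loading on axis i.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = (vt[:n_components] ** 2).sum(axis=0)
    # Highest-scoring SNPs first.
    return np.argsort(scores)[::-1], scores


def drop_redundant(genotypes, ranked, max_r2=0.8):
    """Greedy stand-in for redundancy removal (assumed, not the paper's method)."""
    kept = []
    for j in ranked:
        # Squared correlation between candidate SNP j and every SNP already kept.
        r2 = [np.corrcoef(genotypes[:, j], genotypes[:, k])[0, 1] ** 2 for k in kept]
        if all(v < max_r2 for v in r2):
            kept.append(int(j))
    return kept


# Toy data (made up): SNPs 0 and 1 are identical and separate two groups,
# SNPs 2-5 are random noise.
rng = np.random.default_rng(0)
group = np.repeat([0, 2], 20)
noise = rng.binomial(2, 0.5, size=(40, 4))
genotypes = np.column_stack([group, group, noise]).astype(float)

ranked, scores = rank_snps_by_pca(genotypes)
panel = drop_redundant(genotypes, ranked)
print(ranked[:2], panel)
```

In the toy data the two group-separating SNPs carry the same information, so the filter should keep only one of them. That is the "genotyping savings" point in miniature.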
