Here's a paper from PLoS Genetics comparing the SNPs in the Affymetrix GeneChip Human Mapping 100K set with the HapMap and Perlegen SNP sets.
Coverage and Characteristics of the Affymetrix GeneChip Human Mapping 100K SNP Set
Dan L. Nicolae, Xiaoquan Wen, Benjamin F. Voight, Nancy J. Cox
PLoS Genetics 2(5):e67
The paper mentions that the HapMap SNPs were ascertained in Utah Europeans and Yoruba Nigerians. It doesn't say in which populations the SNPs for the Affymetrix 100K set were ascertained.
The authors mention the need to take into account LD between SNPs to reduce redundancy.
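To make the redundancy point concrete, here is a minimal sketch of greedy LD pruning: a SNP is kept only if its pairwise r^2 with every SNP already kept falls below a cutoff. This is a generic illustration, not the method used in the paper; the function names, the 0.8 threshold, and the toy haplotype data are all assumptions made for the example.

```python
def r_squared(a, b):
    """Squared Pearson correlation between two 0/1 allele vectors.

    Each vector has one entry per phased haplotype (0 = one allele, 1 = the other).
    Returns 0.0 if either SNP is monomorphic in the sample.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov * cov / (var_a * var_b)


def greedy_ld_prune(snps, threshold=0.8):
    """Keep a SNP only if its r^2 with every already-kept SNP is below threshold.

    `snps` maps a SNP name to its 0/1 allele vector; order of insertion decides
    which member of a redundant pair survives.
    """
    kept = []
    for name, alleles in snps.items():
        if all(r_squared(alleles, snps[k]) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: 8 haplotypes, 3 SNPs (not from the paper).
toy_snps = {
    "snp1": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp2": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to snp1, so fully redundant
    "snp3": [0, 1, 0, 1, 0, 1, 0, 1],  # only weakly correlated with snp1
}
print(greedy_ld_prune(toy_snps))
```

In this toy set snp2 carries no information beyond snp1, so it is dropped, while snp3 is retained. Real pruning tools typically work in sliding windows along the chromosome rather than comparing every pair.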
A major finding is that the SNPs in the Affymetrix set are undersampled from coding regions and oversampled from areas outside genes, compared to the Perlegen and HapMap SNPs, although the difference is not all that great; see this table.
They also find, not surprisingly, more LD in Europeans than in Africans.
Abstract: Improvements in technology have made it possible to conduct genome-wide association mapping at costs within reach of academic investigators, and experiments are currently being conducted with a variety of high-throughput platforms. To provide an appropriate context for interpreting results of such studies, we summarize here results of an investigation of one of the first of these technologies to be publicly available, the Affymetrix GeneChip Human Mapping 100K set of single nucleotide polymorphisms (SNPs). In a systematic analysis of the pattern and distribution of SNPs in the Mapping 100K set, we find that SNPs in this set are undersampled from coding regions (both nonsynonymous and synonymous) and oversampled from regions outside genes, relative to SNPs in the overall HapMap database. In addition, we utilize a novel multilocus linkage disequilibrium (LD) coefficient based on information content (analogous to the information content scores commonly used for linkage mapping) that is equivalent to the familiar measure r² in the special case of two loci. Using this approach, we are able to summarize for any subset of markers, such as the Affymetrix Mapping 100K set, the information available for association mapping in that subset, relative to the information available in the full set of markers included in the HapMap, and highlight circumstances in which this multilocus measure of LD provides substantial additional insight about the haplotype structure in a region over pairwise measures of LD.
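For reference, the familiar two-locus measure mentioned in the abstract can be computed directly from haplotype counts. The sketch below uses the standard definition, D = p(AB) - p(A)p(B) and r^2 = D^2 / (p(A)(1 - p(A))p(B)(1 - p(B))). It does not implement the paper's multilocus coefficient, and the function name and example counts are hypothetical.

```python
def pairwise_r2(n_AB, n_Ab, n_aB, n_ab):
    """Pairwise LD r^2 from counts of the four two-locus haplotypes.

    A/a are the alleles at the first SNP, B/b at the second.
    Returns 0.0 if either SNP is monomorphic in the sample.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A
    p_B = (n_AB + n_aB) / total   # frequency of allele B
    d = p_AB - p_A * p_B          # LD coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        return 0.0
    return d * d / denom


# Hypothetical counts: 100 haplotypes, mostly AB and ab.
print(round(pairwise_r2(40, 10, 10, 40), 4))  # 0.36
# Only AB and ab observed: the two SNPs are perfectly correlated.
print(round(pairwise_r2(50, 0, 0, 50), 4))    # 1.0
```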