Sunday, February 28, 2010

Predicting hair, eye, and skin color from a small set of SNPs

The authors (citation below) examined the association between 75 SNPs in 24 genes and skin, eye, and hair color among 789 people of various ethnic backgrounds. Since this is for forensic purposes, they were looking for a small set of SNP markers (i.e., three) that could reliably predict these pigmentation phenotypes, independent of ethnic origin. Their sample consisted mostly of individuals of European descent, but also included a decent number of individuals from several other ethnic groups.

Hair color:
SLC45A2, SLC24A5 and MC1R - R squared: 76.3% (one SNP per gene listed)

Skin color:
SLC24A5, SLC45A2, ASIP - R squared: 45.7% ... an interaction term between ASIP and SLC45A2 increased R squared to 49.6% (one SNP per gene listed)

Eye color:
HERC2, SLC24A5, SLC45A2 - R squared: 76.4% ... (HERC2 appears to be doing the vast majority of the explaining)
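
To make the R squared figures above concrete, here is a minimal sketch of how the proportion of variance explained can be compared between a three-SNP additive model and the same model with one interaction term added. Everything in it is an assumption for illustration: the genotypes are simulated (coded 0/1/2), the allele frequencies and effect sizes are made up, and the helper names `solve`, `r_squared` and `genotype` are hypothetical. It is not the paper's data or the authors' code.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(rows, y):
    """Fit y ~ intercept + predictors by ordinary least squares and return R squared."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)
n = 789  # sample size mentioned in the post; the data themselves are simulated


def genotype(freq):
    """Number of copies (0, 1 or 2) of an allele with the given frequency."""
    return sum(rng.random() < freq for _ in range(2))


# Three hypothetical SNPs with made-up allele frequencies.
g1 = [genotype(0.4) for _ in range(n)]
g2 = [genotype(0.3) for _ in range(n)]
g3 = [genotype(0.5) for _ in range(n)]

# Simulated pigmentation score: additive effects, one g1*g2 interaction, and noise.
y = [1.0 * a + 0.8 * b + 0.5 * c + 0.6 * a * b + rng.gauss(0.0, 1.0)
     for a, b, c in zip(g1, g2, g3)]

r2_additive = r_squared(list(zip(g1, g2, g3)), y)
r2_interaction = r_squared([(a, b, c, a * b) for a, b, c in zip(g1, g2, g3)], y)

print(f"additive model:        R squared = {r2_additive:.3f}")
print(f"with interaction term: R squared = {r2_interaction:.3f}")
```

Because the second model contains the first, its R squared can only stay the same or go up, which is why a gain like 45.7% to 49.6% is better judged against the cost of the extra term than taken at face value.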

The obvious remaining question from all this is: how high does the proportion of variance explained go if you use information from all markers together? Anyway, as they mention, five SNPs in five genes appear to account for much of the variation.
Given that most subjects were European, it would have been nice to see the extent to which they were driving the results, for example by repeating the same analysis on them alone. In other words, how different would the results be if most subjects were African or Native American, etc.?
I did not know that HERC2 is adjacent (5' side) to OCA2, and contains a promoter region for OCA2.

Predicting Phenotype from Genotype: Normal Pigmentation
Valenzuela RK, Henderson MS, Walsh MH, Garrison NA, Kelch JT, Cohen-Barak O, Erickson DT, Meaney FJ, Walsh JB, Cheng KC, Ito S, Wakamatsu K, Frudakis T, Thomas M, Brilliant MH.
J Forensic Sci. 2010 Feb 11. [Epub ahead of print]

Abstract: Genetic information in forensic studies is largely limited to CODIS data and the ability to match samples and assign them to an individual. However, there are circumstances, in which a given DNA sample does not match anyone in the CODIS database, and no other information about the donor is available. In this study, we determined 75 SNPs in 24 genes (previously implicated in human or animal pigmentation studies) for the analysis of single- and multi-locus associations with hair, skin, and eye color in 789 individuals of various ethnic backgrounds. Using multiple linear regression modeling, five SNPs in five genes were found to account for large proportions of pigmentation variation in hair, skin, and eyes in our across-population analyses. Thus, these models may be of predictive value to determine an individual's pigmentation type from a forensic sample, independent of ethnic origin.

Monday, February 15, 2010

Predicting lactase persistence from genetic data ... not yet!

...especially in Africa, SE Europe, and parts of Asia.
It seems like their genetic information consists of the four SNPs that are so far known to be associated with lactase persistence.

A worldwide correlation of lactase persistence phenotype and genotypes

Itan Y, Jones BL, Ingram CJ, Swallow DM, Thomas MG
BMC Evol Biol. 2010 Feb 9;10(1):36. [Epub ahead of print]
ABSTRACT: BACKGROUND: The ability of adult humans to digest the milk sugar lactose - lactase persistence - is a dominant Mendelian trait that has been a subject of extensive genetic, medical and evolutionary research. Lactase persistence is common in people of European ancestry as well as some African, Middle Eastern and Southern Asian groups, but is rare or absent elsewhere in the world. The recent identification of independent nucleotide changes that are strongly associated with lactase persistence in different populations worldwide has led to the possibility of genetic tests for the trait. However, it is highly unlikely that all lactase persistence-associated variants are known. Using an extensive database of lactase persistence phenotype frequencies, together with information on how those data were collected and data on the frequencies of lactase persistence variants, we present a global summary of the extent to which current genetic knowledge can explain lactase persistence phenotype frequency. RESULTS: We used surface interpolation of Old World lactase persistence genotype and phenotype frequency estimates obtained from all available literature and perform a comparison between predicted and observed trait frequencies in continuous space. By accommodating additional data on sample numbers and known false negative and false positive rates for the various lactase persistence phenotype tests (blood glucose and breath hydrogen), we also apply a Monte Carlo method to estimate the probability that known lactase persistence-associated allele frequencies can explain observed trait frequencies in different regions. CONCLUSION: Lactase persistence genotype data is currently insufficient to explain lactase persistence phenotype frequency in much of western and southern Africa, southeastern Europe, the Middle East and parts of central and southern Asia. We suggest that further studies of genetic variation in these regions should reveal additional nucleotide variants that are associated with lactase persistence.
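
As a rough illustration of the comparison the abstract describes, the sketch below predicts the lactase persistence frequency from allele frequencies and then uses a simple Monte Carlo simulation to ask how often that prediction would produce at least as many persistent individuals as were observed. It is a simplification under stated assumptions, not the authors' procedure: it assumes Hardy-Weinberg proportions, a fully dominant trait, known variants that sit on different haplotypes (so their frequencies can be summed), and no uncertainty in the allele frequency estimates. The function names, parameters and example numbers are all hypothetical.

```python
import random


def predicted_lp_frequency(allele_freqs):
    """Expected lactase persistence frequency for a dominant trait under Hardy-Weinberg.

    allele_freqs: frequencies of the known persistence-associated alleles,
    assumed to lie on different haplotypes so that they can be summed.
    """
    p = sum(allele_freqs)
    return 1.0 - (1.0 - p) ** 2


def prob_observed_at_least(observed_lp, n_tested, allele_freqs,
                           false_neg, false_pos, trials=2000, seed=0):
    """Monte Carlo estimate of P(simulated LP count >= observed LP count).

    false_neg: chance that a truly persistent person is scored non-persistent.
    false_pos: chance that a truly non-persistent person is scored persistent.
    A value near zero suggests the known alleles cannot explain the phenotype data.
    """
    rng = random.Random(seed)
    true_freq = predicted_lp_frequency(allele_freqs)
    # Probability that one tested person is scored as lactase persistent.
    p_scored = true_freq * (1.0 - false_neg) + (1.0 - true_freq) * false_pos
    hits = 0
    for _ in range(trials):
        count = sum(rng.random() < p_scored for _ in range(n_tested))
        if count >= observed_lp:
            hits += 1
    return hits / trials


# Hypothetical region: known alleles sum to 5%, yet 90 of 100 people test as persistent.
p_value = prob_observed_at_least(observed_lp=90, n_tested=100, allele_freqs=[0.05],
                                 false_neg=0.1, false_pos=0.05)
print(predicted_lp_frequency([0.05]), p_value)
```

In a region like this made-up one, the known alleles fall far short of the observed phenotype frequency, which is the kind of gap the authors report for parts of Africa, southeastern Europe and Asia.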

 