The authors also bring up the point that in the Rosenberg (2002, Science) data set of 340 or so microsatellites typed in the CEPH panel: "Europeans proved to be more similar to Asians than to other Europeans 38% of the time." This was quite surprising to me.
From their conclusion:
"Thus the answer to the question "How often is a pair of individuals from one population genetically more dissimilar than two individuals chosen from two different populations?" depends on the number of polymorphisms used to define that dissimilarity and the populations being compared... Given ten loci and three distinct populations, the answer is w=0.3, or nearly one-third of the time. With 100 loci, the answer is about 20% of the time; and even using 1,000 loci, w=10%. However, if genetic similarity is measured over many thousands of loci, the answer becomes "never" when individuals are sampled from geographically separated populations."
They then ask:
"How can the observations of accurate classifiability be reconciled with high between-population similarities among individuals?"
and go on to discuss the crucial reliance on "aggregate properties of populations".
They go on to discuss the implications of the complex disease phenotype - genotype - group membership relationships.
Then in their final paragraph:
"The fact that, given enough genetic data, individuals can be correctly assigned to their population of origin is compatible with the observation that most human genetic variation is found within populations, not between them. It is also compatible with our finding that, even when the most distinct populations are considered and hundreds of loci are used, individuals are frequently more similar to members of other populations than to their own population. Thus caution should be used when using geographic or genetic ancestry to make inferences about individual phenotypes."
I wish I could have read this paper more closely, but I am somewhat pressed for time, and it's not in published PDF format yet, which makes it unpleasant to read (figures at the end of the text, etc.).
Genetic Similarities Within and Between Human Populations
David J Witherspoon, Stephen Wooding, Alan R Rogers, Elizabeth E Marchani, W Scott Watkins, Mark A Batzer and Lynn B Jorde
Genetics. Published Articles Ahead of Print: March 4, 2007
Abstract: The proportion of human genetic variation due to differences between populations is modest, and individuals from different populations can be genetically more similar than individuals from the same population. Yet sufficient genetic data can permit accurate classification of individuals into populations. To resolve this apparent conflict, we analyzed the question "How often is a pair of random individuals from two different populations genetically more similar than a pair of individuals randomly selected from any single population?" We compared this frequency (w) with error rates for classification methods, using data sets that vary in number of loci, diversity of populations, and polymorphism ascertainment strategies. Classification methods achieve higher discriminatory power than the individual-based measure, w, because of their use of aggregate properties of populations. The number of loci analyzed is the most critical variable: with one hundred polymorphisms, accurate classification is possible, but w remains sizable, even when using populations as distinct as sub-Saharan Africans and Europeans. Phenotypes controlled by a dozen or fewer loci can be expected to show substantial overlap between human populations. This provides empirical justification for caution when using population labels in biomedical settings, with broad implications for personalized medicine, pharmacogenetics, and the meaning of race.
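To get a feel for how w in the abstract behaves as loci are added, here is a rough toy simulation in Python. It is not the authors' method or data. It assumes two hypothetical populations, independent biallelic loci, allele frequencies that differ by a fixed made-up amount (`shift`), and a simple allele-count distance between diploid individuals. All function names and parameter values are my own illustrative choices.

```python
import random


def make_frequencies(n_loci, shift, rng):
    """Toy allele frequencies (pop 0, pop 1) per locus; all values are assumptions."""
    freqs = []
    for _ in range(n_loci):
        base = rng.uniform(0.2, 0.8)        # shared "ancestral" frequency
        d = rng.choice((-shift, shift))     # which population is higher at this locus
        freqs.append((base + d, base - d))
    return freqs


def draw_individual(freqs, pop, rng):
    """Diploid genotype: number of copies (0, 1 or 2) of the allele at each locus."""
    return [(rng.random() < f[pop]) + (rng.random() < f[pop]) for f in freqs]


def distance(a, b):
    """Sum over loci of the absolute difference in allele counts."""
    return sum(abs(x - y) for x, y in zip(a, b))


def estimate_w(n_loci, shift=0.1, trials=2000, seed=0):
    """Estimate how often a between-population pair is more similar
    than a within-population pair (ties count as one half)."""
    rng = random.Random(seed)
    freqs = make_frequencies(n_loci, shift, rng)
    hits = 0.0
    for _ in range(trials):
        pop = rng.randrange(2)
        within = distance(draw_individual(freqs, pop, rng),
                          draw_individual(freqs, pop, rng))
        between = distance(draw_individual(freqs, 0, rng),
                           draw_individual(freqs, 1, rng))
        if between < within:
            hits += 1.0
        elif between == within:
            hits += 0.5
    return hits / trials


if __name__ == "__main__":
    for n in (10, 100, 500):
        print(n, "loci: w is roughly", round(estimate_w(n, trials=500), 3))
```

Under these assumptions the estimate should fall as the number of loci grows, which is the qualitative pattern the paper describes. The actual values depend entirely on the made-up `shift` and should not be compared with the paper's numbers.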