14 October 2011

23andMe ancestry analysis

Those of us who have had our autosomal DNA analysed by 23andMe sometimes feel ignored, given all the attention the company pays to health matters. So it's refreshing to see ancestry analysis get some attention this week at ICHG/ASHG 2011 in Montreal. The 23andMe blog, The Spittoon, mentions two presentations in a posting at http://spittoon.23andme.com/2011/10/13/the-nuts-and-bolts-behind-health-and-ancestry/

They are:

Assigning Intra-European Ancestry to Identical-by-Descent Segments using a Large Database of Self-Reported Ancestry. J. M. Macpherson, B. T. Naughton, C. B. Do, J. Y. Tung, J. L. Mountain. 23andMe.com, Mountain View, CA.
For assignment of ancestry to a given genomic region, traditional autosomal ancestry analyses rely on probabilistic models of haplotype frequencies. This approach has been successful in assigning ancestry in individuals with ancestry from widely-separated geographic regions, for example in admixture mapping studies of African-American and Latino populations. However, this approach has difficulty in discriminating between haplotypes from more closely-related populations. Here we introduce a method for autosomal ancestry assignment using identical-by-descent (IBD) segments from a large database of individuals of European ancestry who have themselves provided information about their, their parents', and their grandparents' ancestry. The method is frequently able to identify the European countries of origin of segments in individuals of known ancestry correctly, which suggests its use in identifying the origin of segments in individuals of unknown ancestry. The method is based on the idea that, if an individual shares an IBD segment with an individual of uniform ancestry from a given country, the segment likely derives from that country. To guard against the possibility of erroneous or misleading ancestry information, we use a procedure based on principal components analysis to filter the dataset. We examined the concordance of the method's results with the individuals' own self-reported ancestry information; depending on the country of origin, the method correctly identifies European country of origin from 55% to 85% of the time, and correctly identifies European region of origin 65 to 100% of the time. We also explore the accuracy of the method in Ashkenazi Jewish individuals, finding 85% concordance in individuals with self-reported Ashkenazi Jewish ancestry. We conclude by analyzing how this method's coverage and accuracy depend on database size and mean population IBD sharing.

and

Guidelines for evaluating genetic associations for use in direct-to-consumer personal genetic analysis. S. Wu, G. M. Benton, J. Y. Tung, A. B. Chowdry, J. L. Mountain, B. T. Naughton. 23andMe, Inc., Mountain View, CA.
Crowd-sourced, free, and online tools mingle alongside commercial and non-profit ventures now offering genetic testing and analysis to individuals. Groups that offer information about how genetics may influence health and physical traits base their services on the scientific literature, primarily genome-wide association studies (GWAS), but how they evaluate and report on this literature can vary. The standards for evidence used by different groups can range from very permissive to quite stringent, balancing different needs for validity, stability, flexibility, scalability, and relevancy to the state of research. 23andMe has developed and evolved a robust set of guidelines for evaluating genetic associations reported in the literature. These guidelines account for sources of error and bias -- including confirmation bias, multiple hypothesis testing, population stratification, and type I error -- by considering parameters such as sample size, existence of replications, effect size, and correction for multiple hypotheses. In addition, the guidelines address broader issues raised by reporting on research primarily focused on populations of European descent and on associations with different levels of evidence to a diverse consumer audience. We describe lessons learned from several years of evaluating genetic associations in the context of connecting individuals to their genetic information.
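
To make the idea in the first abstract a little more concrete, here is a minimal Python sketch of my own. It is not 23andMe's code. It assumes we already have a list of IBD matches, each pairing a segment with the single country reported by a matching individual of uniform ancestry. The function names, the toy data, and the simple majority vote used to settle conflicting matches are all my assumptions, since the abstract does not say how conflicts are resolved. The principal components filtering step is left out entirely.

```python
from collections import Counter


def assign_segment_ancestry(ibd_matches):
    """Give each segment the most common country among its IBD matches.

    ibd_matches: iterable of (segment_id, country) pairs, where country is the
    single self-reported country of a matching individual of uniform ancestry.
    Majority voting is an assumption of this sketch, not the published method.
    """
    votes = {}
    for segment_id, country in ibd_matches:
        votes.setdefault(segment_id, Counter())[country] += 1

    assignments = {}
    for segment_id, counter in votes.items():
        ranked = counter.most_common(2)
        top_country, top_count = ranked[0]
        # Leave the segment unassigned when the top two countries tie.
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        assignments[segment_id] = None if tied else top_country
    return assignments


def concordance(assignments, self_reported_country):
    """Fraction of assigned segments that agree with the self-reported country."""
    called = [c for c in assignments.values() if c is not None]
    if not called:
        return 0.0
    return sum(1 for c in called if c == self_reported_country) / len(called)


# Hypothetical matches for one person who reports Irish ancestry.
matches = [
    ("chr1:10-25", "Ireland"),
    ("chr1:10-25", "Ireland"),
    ("chr1:10-25", "Norway"),
    ("chr7:40-52", "Italy"),
    ("chr9:5-12", "France"),
    ("chr9:5-12", "Germany"),
]
result = assign_segment_ancestry(matches)
score = concordance(result, "Ireland")
```

With this made-up data the first segment is assigned to Ireland, the second to Italy, and the third is left unassigned because its two matches disagree. The concordance figure is the kind of comparison against self-reported ancestry that the abstract describes, though the real analysis is surely more careful than this.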
