The GLP aggregated and excerpted this blog/article to reflect the diversity of news, opinion and analysis.
Let’s say you have a patient with a severe inherited muscle disorder, the kind that Daniel MacArthur from the Broad Institute of Harvard and MIT specializes in. They’re probably a child, with debilitating symptoms and perhaps no diagnosis. To discover the gene(s) that underlie the kid’s condition, you sequence their genome, or perhaps just their exome: the 1 percent of their DNA that codes for proteins. The results come back, and you see tens of thousands of variants—sites where, say, the usual A has been replaced by a T, or the typical C is instead a G.
You’d then want to know if those variants have ever been associated with diseases, and how common they are in the general population. In an ideal world, you would compare all of a patient’s variants against “every individual who has ever been sequenced in the history of sequencing.”
This is not that world, at least not yet. When MacArthur launched his lab in 2012, he started by sequencing the exomes of some 300 patients with rare muscle diseases. But he quickly realized that he had nothing decent to compare them against. It has never been easier, cheaper, or quicker to sequence a person’s genome, but interpreting those sequences is tricky, absent a comprehensive reference library of human genetic variation. No such library existed, or at least nothing big or diverse enough. So, MacArthur started making one.
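To make the comparison concrete, here is a minimal sketch in Python of checking a patient's variants against a reference library of population frequencies and keeping only the rare ones. It is an illustration, not MacArthur's actual pipeline: the `Variant` type, the `rare_variants` function, the 0.1 percent cutoff, and all of the data are hypothetical. It also assumes that a variant missing from the library has a frequency of zero, which in practice could simply mean that the site was never sequenced well.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    """A single-letter change at one position (hypothetical minimal record)."""
    chrom: str
    pos: int
    ref: str  # the usual letter at this site
    alt: str  # the letter seen in the patient


def rare_variants(
    patient: List[Variant],
    reference_freqs: Dict[Variant, float],
    max_freq: float = 0.001,  # assumed cutoff, chosen only for illustration
) -> List[Variant]:
    """Keep variants that are absent from the reference or at most max_freq.

    Assumption: a variant missing from reference_freqs is treated as
    frequency 0.0, i.e. never observed in the reference population.
    """
    return [v for v in patient if reference_freqs.get(v, 0.0) <= max_freq]


# Made-up example data: three patient variants, two of them in the library.
patient = [
    Variant("1", 1000, "A", "T"),
    Variant("2", 2000, "C", "G"),
    Variant("X", 3000, "G", "A"),
]
reference_freqs = {
    Variant("1", 1000, "A", "T"): 0.25,    # common, unlikely to explain a rare disease
    Variant("2", 2000, "C", "G"): 0.0001,  # rare in the reference population
}

candidates = rare_variants(patient, reference_freqs)
for v in candidates:
    print(v)
```

The larger and more diverse the reference library, the more of the tens of thousands of variants this kind of filter can rule out as too common to cause a rare disorder.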
Read full, original post: How Data-Wranglers Are Building the Great Library of Genetic Variation