In 2012, scientists with the ENCODE project, a huge effort to catalog the functional elements in the human genome, declared that 80 percent of our DNA was active and performing some function. Now scientists at Oxford have analyzed the human genome and claim that less than 10 percent of our DNA is functional.
Who’s correct? It’s possible, in fact likely, that both groups are. It depends on what is meant by the word “functional.” The explanation seems to be that, while some 80 percent of our DNA is doing stuff, less than 10 percent of it is doing such important stuff that natural selection has preserved it largely intact in the mammal line for 100 million years. (Anatomically modern humans, that’s all 7 billion of us, the last Homo standing, have only been around for a couple of hundred thousand years.)
The ENCODE scientists (ENCODE stands for Encyclopedia of DNA Elements) defined “functional” as meaning the DNA has some specific biochemical activity. For example, activity that makes copies of DNA during cell division is functional (essential, in fact) because cells must divide in order for life to go on.
But that activity is not functional in the way the new paper, published in PLOS Genetics July 24, uses that term. The Oxford researchers want to count as “functional” only DNA involved in shaping a person’s body and behavior (the phenotype), that is, DNA that is acted upon by natural selection.
What is included in functional DNA?
First, let me make clear that I’m being deliberately inexact when I say “less than 10 percent.” The new paper estimates that 8.2 percent of the human genome is functional (with a range of 7.1–9.2 percent). But I’m sticking with “less than 10 percent” here for a couple of reasons. One is that saying 8.2 percent gives a misleading impression of precision. That figure is really an estimate.
The other reason, which is surely the important point, is that the Oxford researchers are saying that only a small proportion of our genome, less than a tenth, is so crucial to our existence that natural selection weeds out injurious mutations and works hard to keep it mostly intact.
They believe the ENCODE folks would largely agree. “We don’t think our figure is actually too different from what you would get looking at ENCODE’s bank of data using the same definition for functional DNA,” says joint senior author Chris Ponting of the MRC Functional Genomics Unit at Oxford.
Ponting is right. In 2012, on publication of the ENCODE papers, the project’s lead analysis coordinator Ewan Birney, who is at the European Bioinformatics Institute, blogged that under a very strict classical definition of “functional,” one limited to places where scientists are pretty certain there is a specific DNA-protein contact, the functional portion accounts for about 8 percent of the genome.
Add the exons (the sequences that specify the code for making the proteins that carry out our bodily functions) and that pushes the percentage of “functional” sequences up to 9 percent, which is pretty close to 8.2 percent. (Yes, it is an astonishing fact that protein-coding sequences, which are what we mostly mean when we say “genes,” occupy only a little over 1 percent of the human genome.)
The idea, Birney said in 2012, is that the 8 percent is nearly all regulatory sequences, DNA that governs the behavior of the 1 percent of DNA that codes for proteins. He noted that, until the ENCODE project, scientists thought regulatory sequences would take up about the same amount of space as protein-coding sequences. It was a big surprise to learn that the DNA that regulates genes takes up about eight times as much space.
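For readers who like to check the arithmetic, here is a minimal sketch that uses only the rounded percentages quoted above. The variable and function names are my own, invented for illustration, and the 1 percent figure for exons is a rounding of “a little over 1 percent,” so treat the results as back-of-the-envelope numbers rather than anything taken from either group’s analysis.

```python
# Shares of the human genome, in percent, as quoted in this article.
# All names here are hypothetical and exist only for this illustration.
REGULATORY_PCT = 8.0         # Birney's strict "DNA-protein contact" figure
CODING_PCT = 1.0             # exons, rounded from "a little over 1 percent"
OXFORD_RANGE = (7.1, 9.2)    # range reported around the 8.2 percent estimate
ENCODE_ACTIVE_PCT = 80.0     # ENCODE's "biochemically active" figure


def strict_functional_pct(regulatory: float, coding: float) -> float:
    """Add the regulatory and protein-coding shares of the genome."""
    return regulatory + coding


def within(value: float, bounds: tuple[float, float]) -> bool:
    """Return True if value lies inside the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


strict_total = strict_functional_pct(REGULATORY_PCT, CODING_PCT)
regulatory_to_coding = REGULATORY_PCT / CODING_PCT

print(f"Strict functional share: {strict_total:.1f} percent")
print(f"Regulatory DNA relative to coding DNA: {regulatory_to_coding:.0f}x")
print(f"Strict share inside Oxford range: {within(strict_total, OXFORD_RANGE)}")
print(f"ENCODE 80 percent inside Oxford range: {within(ENCODE_ACTIVE_PCT, OXFORD_RANGE)}")
```

The point of the comparison is simply that Birney’s strict figure lands inside the Oxford range while the 80 percent figure does not, which is what you would expect if the two numbers answer different questions.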
Let me add that these regulatory sequences are believed to be where most of the evolutionary action takes place. Our genes code for pretty much the same proteins that the genes of mice and rhinos do. What look like huge differences between our species arise largely from differences in regulation: at what point in life a particular protein-coding gene turns on or off, and in what cells.
The new paper largely confirms what Birney said in 2012. One of the Oxford scientists, Gerton Lunter, told me in an email that they can’t give an exact breakdown of what functional categories constitute the 8.2 percent, but they have concluded that basically all protein-coding genes are there. Regulatory material, all or nearly all of it, makes up a large share too, he said.
It may be junk, but it’s not just noise
The rest of the genome, Birney said in 2012, is not, as some would have it, just biological “noise.” He prefers to call that DNA “biologically neutral,” meaning “that there are totally reproducible, cell-type-specific biochemical events that natural selection does not care about.” Which is what the new PLOS Genetics paper says too.
So what’s the other 90 percent of the human genome up to? It has been called junk, and although scientists don’t much like that term, junk is what a lot of it appears to be. Some of that DNA, for instance, is leftover fragments of dead viruses that invaded our ancestors’ genomes aeons ago.
But most of the 90 percent consists of DNA sequences called transposons. Occupying fully half the human genome, transposons are stretches of DNA that can hop around in host DNA. They are somewhat related to viruses, and Sean Eddy, who does computational biology at the Howard Hughes Medical Institute’s Janelia Farm, calls them “molecular parasites.”
The extent to which a genome is infested with transposons varies from species to species. The Homo sap genome is stuffed with them. We do delete them, but they replicate so fast that we can’t keep up. Transposons, however, are not essential for a successful life. As I wrote here at GLP last fall, it’s perfectly possible to flourish with a lean and mean genome that contains hardly any transposons. My example was a little weed called the humped bladderwort. Only 2.5 percent of its genome is transposons, compared to 50 percent transposons in Homo sap.
Having few transposons could be an advantage. That’s because some transposons are more than just excess baggage. They can be detrimental. A transposon that jumps into the middle of a gene can disrupt the gene’s work. Transposons can also boost the potency of genes involved in disease. They are a factor in some kinds of cancer.
On the other hand, transposons are also the raw material of evolution. When these mobile elements move around in eggs and sperm, they can increase genetic diversity. A 2010 paper suggested that transposons have contributed to human success by speeding our unusually rapid evolution, especially evolution of our big brains.
So transposons can sometimes be damaging and sometimes useful. Mostly, they are neither. Just . . . junk.