
‘Google of sorts’: DNA database harnesses power of genome sequences

February 15, 2019

In 2015, scientists discovered a pig in China that would set off a frantic, worldwide search. The pig carried bacteria resistant to colistin, a drug used to cure infections when almost all other drugs have failed. …

In England, where colistin is reserved for patients in rare and dire circumstances, public-health officials worried. Could colistin-resistant bacteria also be lurking in that country?

[T]he search took 256 computers working together for an entire weekend, says Zamin Iqbal, a computational genomicist at the European Bioinformatics Institute … . The researchers there did find colistin resistance among their 24,000 samples, and eventually, countries all over the world found it, too.

Why did this process take so long? The computers at Public Health England had to open up and search the sequencing files of 24,000 genomes one by one.


So Iqbal decided to build a Google of sorts for bacterial and viral genomes. He and his colleagues downloaded all available genomes—nearly 500,000 at the time—from a public database called the European Nucleotide Archive. The 170,000-gigabyte data set took six whole weeks to download. … The resulting tool is called BIGSI, for BItsliced Genomic Signature Index.

Searching for colistin resistance through nearly 500,000 sequences now takes just a few seconds.
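
The excerpt does not describe how BIGSI works internally, so the following is only a toy sketch of the general idea its name suggests: give each genome a small bit signature (here a Bloom filter of its k-mers, which is an assumption), store the signatures as columns of a bit matrix, and answer a query by AND-ing a handful of rows instead of opening every file. All function names, the sample sequences, and the parameters `K`, `M`, and `H` are hypothetical and chosen for illustration.

```python
import hashlib

K = 5    # k-mer length (toy value, assumed)
M = 256  # bits per signature (toy value, assumed)
H = 3    # hash functions per k-mer (toy value, assumed)

def kmers(seq, k=K):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def rows_for(kmer, m=M, h=H):
    """Map a k-mer to h row indices using salted SHA-256 digests."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{kmer}".encode()).digest()[:8], "big") % m
        for i in range(h)
    ]

def linear_scan(samples, pattern):
    """Baseline: look inside every genome one by one."""
    return [j for j, seq in enumerate(samples) if pattern in seq]

def build_index(samples):
    """Build the bit matrix: bit j of rows[r] is set when sample j sets signature bit r."""
    rows = [0] * M
    for j, seq in enumerate(samples):
        for km in kmers(seq):
            for r in rows_for(km):
                rows[r] |= 1 << j
    return rows

def query(rows, n_samples, pattern):
    """AND together the rows hit by the pattern's k-mers; surviving bits are candidates."""
    hits = (1 << n_samples) - 1  # start with every sample as a candidate
    for km in kmers(pattern):
        for r in rows_for(km):
            hits &= rows[r]
    return [j for j in range(n_samples) if (hits >> j) & 1]

samples = [
    "ACGTACGTTAGCCGATAGCTAGGCT",
    "TTGACCATGCGTAAGCTTGACCGTA",
    "GGCATCGATAGCTAGGCTTACGATC",
]
pattern = "GATAGCTAGGCT"

index = build_index(samples)               # done once, up front
exact = linear_scan(samples, pattern)      # touches every genome
candidates = query(index, len(samples), pattern)  # touches only a few rows
print(exact, candidates)
```

In this sketch the index can report a genome that does not actually contain the pattern (a Bloom filter false positive), but it never misses one that does, so candidates would normally be confirmed against the underlying sequences. The cost of a query depends on the number of k-mers in the pattern rather than on how many genomes are stored, which is the kind of trade-off that would turn a weekend-long scan into a lookup of a few seconds.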

Read full, original post: The Problem With Big DNA

The GLP aggregated and excerpted this article to reflect the diversity of news, opinion, and analysis. Click the link above to read the full, original article.