Search “coronavirus” on GenBank, a public repository for genomes, and today you’ll find more than 35,000 sequences. Alpaca coronaviruses. Hedgehog coronaviruses. Beluga whale coronaviruses. And, of course, lots and lots of bat coronaviruses.
But very few people have carried out the downstream laboratory work—figuring out how these coronaviruses behave, how they get into the bodies of their hosts, and how likely it is that they could make the hop to humans. “I realized just how much data there is and how little we know about all of it,” says [virologist Michael] Letko.
…
So he decided to build a platform that could experimentally test the world’s collection of coronavirus genomes, to see which ones had the highest likelihood of infecting human cells.
At any given time, there are tens of thousands of unique coronaviruses being carried by animals. But only a handful have ever crossed into humans. If you could understand what makes those viruses different, Letko hypothesized, you could create a prediction engine for forecasting which ones have the potential to emerge in human populations. “If you want to figure out where the next pandemic is going to come from,” he says, “the coronaviruses are a good place to start.”