About a year ago, the sequences of more than 200 viral samples from early cases of COVID-19 in Wuhan, China, disappeared from online scientific databases.
Now, by rooting through files stored on Google Cloud, a researcher in Seattle reports that he has recovered 13 of those original sequences. They offer intriguing new information for working out when and how the virus may have spilled over from bats or other animals into humans.
A new analysis, released on Tuesday, supports earlier suggestions that a variety of coronaviruses may have been circulating in Wuhan before the first outbreaks linked to animal and seafood markets in December 2019.
As the Biden administration investigates the disputed origin of the virus, known as SARS-CoV-2, the study neither strengthens nor weakens the hypothesis that the pathogen leaked from a famous Wuhan laboratory. It does, however, raise questions about why the original sequences were deleted, and it suggests that there may be more revelations to recover from far corners of the internet.
“This is certainly a great piece of detective work, and it will greatly advance efforts to understand the origin of SARS-CoV-2,” said Michael Worobey, an evolutionary biologist at the University of Arizona who was not involved in the study.
Jesse Bloom, a virus expert at the Fred Hutchinson Cancer Research Center who wrote the new report, called the removal of these sequences suspicious. “It seems likely that the sequences were deleted to obscure their existence,” he wrote in the paper, which has not yet been peer-reviewed or published in a scientific journal.
Bloom and Worobey belong to an outspoken group of scientists who have called for more research into how the pandemic began. In a letter published in May, they complained that there was not enough information to determine whether it was more likely that a laboratory leak spread the coronavirus, or that it jumped to humans through contact with an infected animal outside a laboratory.
The sequences of viral samples hold important clues about how SARS-CoV-2 moved to our species from another animal, most likely a bat. Most valuable of all are the sequences from early in the pandemic, because they bring scientists closer to the original spillover event.
While reviewing genetic data published by various research groups, Bloom came across a March 2020 study with a spreadsheet that included information on 241 genetic sequences collected by scientists at Wuhan University. The spreadsheet indicated that the scientists had uploaded the sequences to an online database called the Sequence Read Archive, which is maintained by the U.S. government’s National Library of Medicine.
However, when Bloom searched the database for the Wuhan sequences earlier this month, the only result he got was “Item not found”.
Puzzled, he returned to the spreadsheet for more clues. It indicated that the 241 sequences had been collected by a scientist named Aisi Fu at Renmin Hospital in Wuhan. Searching the medical literature, Bloom eventually found another study, posted online by Fu and colleagues in March 2020, describing a new experimental test for SARS-CoV-2. The Chinese scientists published it in a scientific journal three months later.
In that study, the scientists wrote that they had looked at 45 samples from nasal swabs taken “from outpatients suspected of having COVID-19 in the early stages of the epidemic.” They then searched the swabs for a portion of the genetic material of SARS-CoV-2. The researchers did not publish the actual sequences of the genes they obtained from the samples. Instead, they reported only some of the mutations in the viruses.
However, a number of clues indicated to Bloom that these samples were the source of the 241 missing sequences. The papers offered no explanation of why the sequences had been uploaded to the Sequence Read Archive, only to disappear later.
Perusing the archive, Bloom realized that many of its sequences were stored as files on Google Cloud. Each sequence was contained in its own file in the cloud, he reported, and the names of the files all shared the same basic format.
Bloom swapped in the code for one of the missing Wuhan sequences. Suddenly, he had the sequence. In all, he was able to recover 13 sequences from the cloud this way.
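The file-name trick described above amounts to filling a fixed template with each missing sequence code. The sketch below illustrates only that idea. The article does not give the real path format, so the host name, the template, and the accession codes are hypothetical placeholders, not the locations Bloom actually used.

```python
# Hypothetical template: the real cloud path format is not given in the article.
URL_TEMPLATE = "https://storage.example.invalid/archive/{accession}/{accession}.1"


def candidate_urls(accessions, template=URL_TEMPLATE):
    """Map each accession code to the URL produced by filling in the template."""
    return {acc: template.format(accession=acc) for acc in accessions}


if __name__ == "__main__":
    # Made-up placeholder codes standing in for the missing sequence identifiers.
    missing = ["SRR00000001", "SRR00000002"]
    for acc, url in candidate_urls(missing).items():
        print(acc, url)
```

Each candidate address would then have to be requested to see whether a file still exists there; the sketch stops short of that step because the real locations are not known from the article.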
With this new data, Bloom looked back once more at the early stages of the pandemic. He combined the 13 sequences with other published sequences of early coronaviruses, hoping to make progress on building the SARS-CoV-2 family tree.
Scientists still have only a limited number of samples to study, making it difficult to work out all the steps by which SARS-CoV-2 evolved from a bat virus. Some of the earliest samples come from the Huanan Seafood Wholesale Market in Wuhan, where an outbreak occurred in December 2019.
However, those market viruses actually have three extra mutations that are missing from SARS-CoV-2 samples collected weeks later. In other words, the later viruses look more like the coronaviruses found in bats, supporting the idea that there was an early lineage of the virus that did not pass through the seafood market.
Bloom found that the deleted sequences he recovered from the cloud also lack those extra mutations. “They are three steps more similar to the bat coronaviruses than the viruses from the Huanan fish market,” Bloom said.
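The "three steps" comparison can be pictured as counting the positions at which each virus differs from a bat coronavirus reference. The sketch below uses short, made-up sequences that are assumed to be already aligned. It is a toy illustration of the counting, not Bloom's method or data, and a real analysis would rely on proper alignment and phylogenetic tools.

```python
def differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Both sequences are assumed to be aligned and of equal length.
    Positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Made-up toy sequences, not real viral data.
bat_like = "ACGTACGTACGT"
recovered = "ACGTACGTACGA"  # differs from bat_like at one position
market = "ATGTACCTAGGA"     # carries that difference plus three more

if __name__ == "__main__":
    extra = len(differences(bat_like, market)) - len(differences(bat_like, recovered))
    print("extra mutations in the market sample:", extra)
```

In this toy case the market sequence sits three mutations further from the reference than the recovered one, which mirrors the pattern the article describes.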
This suggests, he said, that by the time SARS-CoV-2 reached the market, it had already been circulating for some time in Wuhan or beyond. He argued that the market viruses do not represent the full diversity of coronaviruses that were already loose in late 2019.
“Perhaps our picture of what was present early on in Wuhan, based on what has been sequenced, may be somewhat biased,” he said.
Bloom acknowledged in his report that this conclusion must be confirmed by a deeper analysis of the viral sequences. Worobey said he and his colleagues are working on a large-scale study of SARS-CoV-2 sequences to better understand the virus’s origin, and that they will now add the 13 sequences Bloom recovered.
“These additional data play a major role in that effort,” said Worobey.
It is not clear why this valuable information went missing in the first place. Scientists can request that files be deleted by sending an email to the managers of the Sequence Read Archive. The National Library of Medicine, which maintains the archive, said the 13 sequences were removed last summer.
“These SARS-CoV-2 sequences were submitted for posting in SRA in March 2020 and subsequently requested to be withdrawn by the submitting investigator in June 2020,” said Renate Myles, a spokeswoman for the National Institutes of Health.
She said the investigator, whom she did not name, had told the archive managers that the sequences were being updated and would be added to another database. But Bloom has searched every database he knows of and has yet to find them.
“Obviously, I can’t rule out that the sequences are somewhere in some other database or web page, but I couldn’t find them in any of the obvious places I looked,” he said.
Three of the co-authors of the 2020 testing study that produced the 13 sequences did not immediately respond to emails inquiring about Bloom’s discovery. The study did not provide contact information for another co-author, Fu, who was also named in the spreadsheet from the other study.
Some scientists are skeptical that there is anything sinister behind the deletion of the sequences, among them Stephen Goldstein, a virus expert at the University of Utah.
Goldstein noted that the testing paper listed the individual mutations the Wuhan researchers found in their tests. Although the complete sequences are no longer in the archive, the key information has been public for over a year, he said. It was simply tucked away in a format that was difficult for researchers to find.
“We all missed this relatively obscure paper,” Goldstein said.
“I can’t really say why they were removed,” Bloom admitted in an interview. “What can be said is that the practical result of removing them was that people were unaware that they existed.” He also noted that the Chinese government had ordered the destruction of many early samples of the virus and had banned the publication of papers on the coronavirus without its approval.
For his part, Worobey still wants answers. “I hope to hear from the authors who generated but then deleted these important sequences, and to get a better understanding of their motives for doing so,” he said. “It certainly seems odd at face value and really needs an explanation.”
Regardless of what happened to these 13 sequences, Bloom wonders what other clues can be found online. All of these clues are potentially important in reconstructing the origin of COVID-19.
“Ideally, you should try to find as many other early sequences as possible,” he said. “And I think this study suggests where we should look.”
This article was originally published in The New York Times.
Carl Zimmer c.2021 The New York Times Company
Scientists have discovered an early viral sequence that was mysteriously removed from an early Covid case in China.