Coronavirus origins: genome analysis suggests two viruses may have combined
Alexandre Hassanin, Muséum national d’histoire naturelle (MNHN)
In the space of a few weeks, we have all learned a lot about COVID-19 and the virus that causes it: SARS-CoV-2. But there have also been a lot of rumours. And while the number of scientific articles on this virus is increasing, there are still many grey areas as to its origins.
In which animal species did it originate? A bat, a pangolin or another wild species? Where does it come from? From a cave or a forest in the Chinese province of Hubei, or elsewhere?
In December 2019, 27 of the first 41 people hospitalised (66%) passed through a market located in the heart of Wuhan city in Hubei province. But, according to a study conducted at Wuhan Hospital, the very first human case identified did not frequent this market. Instead, a molecular dating estimate based on the SARS-CoV-2 genomic sequences indicates an origin in November. This raises questions about the link between this COVID-19 epidemic and wildlife.
The SARS-CoV-2 genome was rapidly sequenced by Chinese researchers. It is an RNA molecule of about 30,000 bases containing 15 genes, including the S gene which codes for a protein located on the surface of the viral envelope (for comparison, our genome is in the form of a double helix of DNA about 3 billion bases in size and contains about 30,000 genes).
Comparative genomic analyses have shown that SARS-CoV-2 belongs to the group of Betacoronaviruses and that it is very close to SARS-CoV, responsible for an epidemic of acute pneumonia which appeared in November 2002 in the Chinese province of Guangdong and then spread to 29 countries in 2003. A total of 8,098 cases were recorded, including 774 deaths. It is known that bats of the genus Rhinolophus (potentially several cave species) were the reservoir of this virus and that a small carnivore, the palm civet (Paguma larvata), may have served as an intermediate host between bats and the first human cases.
Since then, many Betacoronaviruses have been discovered, mainly in bats, but also in humans. For example, RaTG13, isolated from a bat of the species Rhinolophus affinis collected in China’s Yunnan Province, has recently been described as very similar to SARS-CoV-2, with genome sequences that are 96% identical. These results indicate that bats, and in particular species of the genus Rhinolophus, constitute the reservoir of the SARS-CoV and SARS-CoV-2 viruses.
But how do you define a reservoir? A reservoir is one or several animal species that naturally host one or several viruses while showing little or no sensitivity to them. The absence of symptoms of the disease is explained by the effectiveness of their immune system, which keeps viral proliferation in check.
On February 7, 2020, we learned that a virus even closer to SARS-CoV-2 had been discovered in a pangolin. With 99% genomic concordance reported, this suggested a more likely reservoir than bats. However, a recent study under review shows that the genome of the coronavirus isolated from the Malayan pangolin (Manis javanica) is less similar to SARS-CoV-2, with only 90% genomic concordance. This would indicate that the virus isolated from the pangolin is not responsible for the COVID-19 epidemic currently raging.
However, the coronavirus isolated from the pangolin is 99% similar to SARS-CoV-2 in a specific region of the S protein, which corresponds to the 74 amino acids involved in the ACE2 (Angiotensin Converting Enzyme 2) receptor binding domain, the one that allows the virus to enter human cells to infect them. By contrast, the virus RaTG13 isolated from the bat R. affinis is highly divergent in this specific region (only 77% similarity). This means that the coronavirus isolated from the pangolin is capable of entering human cells whereas the one isolated from the bat R. affinis is not.
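To make the distinction between whole-genome similarity and similarity in one specific region concrete, here is a minimal Python sketch. The sequences, the names `reference`, `virus_a` and `virus_b`, and the region boundaries are all invented for illustration; they are not real viral data, and a real comparison would first require aligning the genomes with dedicated tools.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return 100.0 * matches / len(seq_a)


# Toy, pre-aligned sequences (hypothetical, 20 positions each).
reference = "ACGTACGTACGTACGTACGT"
virus_a = "ACGTACGTTTAAACGTACGT"  # differs from the reference only inside the region
virus_b = "AGGTTCGAACGTCCGTACTT"  # differs from the reference only outside the region

# Hypothetical region of interest: positions 8 to 11 (Python slice 8:12).
region = slice(8, 12)

whole_a = percent_identity(reference, virus_a)
whole_b = percent_identity(reference, virus_b)
region_a = percent_identity(reference[region], virus_a[region])
region_b = percent_identity(reference[region], virus_b[region])

print("whole sequence:", whole_a, whole_b)
print("region only:   ", region_a, region_b)
```

In this toy case `virus_a` is the closer relative over the whole sequence, yet `virus_b` is the closer one in the chosen region. This is the same kind of pattern the article describes for RaTG13 and the pangolin virus.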
In addition, these genomic comparisons suggest that the SARS-CoV-2 virus is the result of a recombination between two different viruses, one close to RaTG13 and the other closer to the pangolin virus. In other words, it is a chimera between two pre-existing viruses.
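One common way to look for such a mosaic pattern is to slide a window along the aligned genomes and ask, window by window, which candidate relative is most similar to the query. The Python sketch below illustrates the idea on invented toy sequences. The names `parent_x`, `parent_y` and `query`, and the window size, are assumptions chosen for readability. It is not the method used in the studies mentioned here, which rely on real alignments and statistical tests.

```python
def window_identities(query, other, window, step):
    """Percent identity between two aligned sequences in successive windows."""
    if len(query) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        o = other[start:start + window]
        matches = sum(1 for a, b in zip(q, o) if a == b)
        profile.append((start, 100.0 * matches / window))
    return profile


def closest_parent_by_window(query, parents, window, step):
    """For each window, name the candidate parent most similar to the query."""
    profiles = {name: window_identities(query, seq, window, step)
                for name, seq in parents.items()}
    starts = range(0, len(query) - window + 1, step)
    result = []
    for i, start in enumerate(starts):
        # On a tie, max() keeps the first parent in dictionary order.
        best = max(profiles, key=lambda name: profiles[name][i][1])
        result.append((start, best))
    return result


# Toy, pre-aligned sequences (hypothetical): the query copies its middle
# segment from parent_y and its two ends from parent_x.
parents = {
    "parent_x": "AAAAAAAAAAAA",
    "parent_y": "CCCCCCCCCCCC",
}
query = "AAAACCCCAAAA"

mosaic = closest_parent_by_window(query, parents, window=4, step=4)
print(mosaic)
```

A switch in the closest parent from one window to the next, as happens here in the middle segment, is the kind of signal that prompts a recombination hypothesis. On real data, such a switch would still need statistical support before being interpreted.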
This recombination mechanism had already been described in coronaviruses, in particular to explain the origin of SARS-CoV. It is important to know that recombination results in a new virus potentially capable of infecting a new host species. For recombination to occur, the two divergent viruses must have infected the same organism simultaneously.
Two questions remain unanswered: in which organism did this recombination occur (a bat, a pangolin or another species)? And above all, under what conditions did this recombination take place?
Alexandre Hassanin, Associate Professor (Maître de Conférences, HDR) at Sorbonne Université, ISYEB – Institut de Systématique, Evolution, Biodiversité (CNRS, MNHN, SU, EPHE, UA), Muséum national d’histoire naturelle (MNHN)