Coronaviruses are enveloped viruses with a positive-sense RNA genome and a nucleocapsid of helical symmetry. The SARS epidemic boosted interest in research on coronavirus biodiversity and genomics. Before 2003, complete genomes were available for only 10 coronaviruses. Between the SARS epidemic and December 2008, complete genomes were sequenced for a further 16 coronaviruses. These include two human coronaviruses (human coronavirus NL63 and human coronavirus HKU1), 10 other mammalian coronaviruses [bat SARS coronavirus, bat coronavirus (bat-CoV) HKU2, bat-CoV HKU4, bat-CoV HKU5, bat-CoV HKU8, bat-CoV HKU9, bat-CoV 512/2005, bat-CoV 1A, equine coronavirus, and beluga whale coronavirus] and four avian coronaviruses (turkey coronavirus, bulbul coronavirus HKU11, thrush coronavirus HKU12, and munia coronavirus HKU13). In September 2012, a novel coronavirus was found in the UK. Coronaviruses are divided into three groups: group 1 coronaviruses (alphacoronavirus), group 2 coronaviruses (betacoronavirus), and group 3 coronaviruses (gammacoronavirus). Two subgroups of group 2 coronaviruses (groups 2c and 2d) and two novel subgroups of group 3 coronaviruses (groups 3b and 3c) have been proposed.
The diversity of coronaviruses results from the infidelity of the RNA-dependent RNA polymerase, the high frequency of homologous RNA recombination, and the large size of coronavirus genomes. Among all hosts, coronavirus diversity is most evident in bats and birds, which may be a result of their species diversity, ability to fly, environmental pressures, and habits of roosting and flocking. Present evidence suggests that bat coronaviruses are the gene pool of group 1 and group 2 coronaviruses, whereas bird coronaviruses are the gene pool of group 3 coronaviruses. As the number of known coronaviruses has increased, more and more closely related coronaviruses have been observed in distantly related animals. These are the result of recent interspecies jumping, which may cause disastrous outbreaks of zoonotic disease.
Fig. 3D Structure of Coronavirus
It has been proposed that group 1 coronaviruses can be subdivided into groups 1a and 1b, based on the phylogenetic clustering of group 1a coronaviruses and the >90% overall genome identity among the members of this subgroup. However, no additional genomic evidence, such as gene contents, transcription regulatory sequence (TRS), or other unique genomic features of the kind seen in the subgroups of group 2 and group 3 coronaviruses, supports such a sub-classification. For the group 1b coronaviruses, in addition to the lack of common genomic features, there is no phylogenetic clustering. The group 1b coronaviruses are therefore in fact “non–group 1a” coronaviruses, rather than a distinct lineage defined by common features.
Although the present sub-classification of group 1 coronaviruses into groups 1a and 1b may not be ideal, the best documented example of the generation of a coronavirus species through homologous recombination is found in group 1a. This is the generation of FCoV [also called feline infectious peritonitis virus (FIPV) in some publications] type II strains by double recombination between FCoV (FIPV) type I strains and canine coronavirus (CCoV). It was originally observed that the sequence of S in type II FCoV was closely related to that of CCoV, but that the sequence downstream of E in type II FCoV was closely related to that of type I FCoV. This suggested a homologous RNA recombination event between the 3′ ends of the genomes of CCoV and type I FCoV, giving rise to a type II FCoV genome. Further analysis by multiple alignments pinpointed the site of recombination to a region in the E gene. A few years later, Herrewegh et al. discovered an additional recombination region in the pol gene and concluded that type II FCoV in fact originated from two recombination events between the genomes of CCoV and type I FCoV.
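The reasoning above, in which different genome regions of a recombinant resemble different parental viruses, can be illustrated with a sliding-window identity comparison. The following is a minimal sketch only, not the method used in the original studies: it assumes the sequences have already been aligned to equal length, and the function names (`window_identity`, `closest_parent_by_window`) and the toy sequences are hypothetical, not real FCoV or CCoV data.

```python
def window_identity(query, ref, window, step):
    """Percent identity between two pre-aligned sequences, window by window.

    Returns a list of (window_start, percent_identity) tuples.
    Gap characters ("-") never count as matches.
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = ref[start:start + window]
        matches = sum(1 for a, b in zip(q, r) if a == b and a != "-")
        results.append((start, 100.0 * matches / window))
    return results


def closest_parent_by_window(query, parents, window=10, step=10):
    """For each window, name the candidate parent most similar to the query.

    `parents` maps a label to a sequence aligned to `query`.
    Ties go to the parent listed first.
    """
    names = list(parents)
    profiles = {
        name: window_identity(query, parents[name], window, step)
        for name in names
    }
    calls = []
    for i, (start, _) in enumerate(profiles[names[0]]):
        best = max(names, key=lambda name: profiles[name][i][1])
        calls.append((start, best))
    return calls


# Toy data (hypothetical): a "backbone" parent, a "donor" parent, and a
# recombinant carrying a donor segment between two crossover points.
backbone = "ACGT" * 10
donor = "TGCA" * 10
recombinant = backbone[:10] + donor[10:30] + backbone[30:]

for start, parent in closest_parent_by_window(
    recombinant, {"backbone": backbone, "donor": donor}
):
    print(start, parent)
```

With these toy inputs, the two middle windows are attributed to the donor and the flanking windows to the backbone, so the change in closest parent brackets the two crossover points. Real analyses work from multiple alignments of full genomes and need statistical support for each breakpoint.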
Before the discovery of SARS-CoV, group 2 coronaviruses were considered to comprise a single lineage, with all members possessing haemagglutinin esterase genes and two papain-like proteases (PL1pro and PL2pro) in nsp3 of ORF1ab. However, analyses of SARS-CoV indicated that it is probably an early split-off from the group 2 coronavirus lineage. SARS-CoV was therefore classified as a group 2b coronavirus, and the historical group 2 coronaviruses were classified as group 2a coronaviruses. In 2006 and 2007, two additional subgroups of group 2 coronaviruses were proposed: group 2c and group 2d. These two subgroups form two unique lineages, most closely related to, but distinct from, group 2a and group 2b coronaviruses. In addition to the phylogenetic evidence, there is clear-cut evidence from gene contents and other genomic features that four subgroups exist in group 2 coronaviruses. Group 2a coronaviruses possess both PL1pro and PL2pro in nsp3 of ORF1ab, whereas group 2b, 2c, and 2d coronaviruses possess only one PLpro, which is homologous to PL2pro. Furthermore, the genomes of group 2a coronaviruses, but not those of group 2b, 2c, and 2d coronaviruses, encode haemagglutinin esterase.
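The gene-content criteria described above can be written as a simple decision rule. This is an illustrative sketch only: the function name and its boolean parameters are hypothetical, and the rule encodes nothing beyond the two features named in the text (the papain-like proteases and haemagglutinin esterase). Gene content alone does not distinguish groups 2b, 2c, and 2d from one another; that still requires the phylogenetic evidence.

```python
def classify_group2_by_gene_content(has_pl1pro, has_pl2pro, has_he):
    """Assign a group 2 coronavirus genome to a subgroup set by gene content.

    has_pl1pro: PL1pro present in nsp3 of ORF1ab
    has_pl2pro: PL2pro (or a single PLpro homologous to it) present
    has_he:     haemagglutinin esterase gene present
    """
    # Group 2a: both papain-like proteases plus haemagglutinin esterase.
    if has_pl1pro and has_pl2pro and has_he:
        return "2a"
    # Groups 2b, 2c, 2d: a single PLpro homologous to PL2pro, no HE gene.
    if not has_pl1pro and has_pl2pro and not has_he:
        return "2b, 2c, or 2d"
    # Any other combination is not covered by the criteria in the text.
    return "unresolved"


print(classify_group2_by_gene_content(True, True, True))
print(classify_group2_by_gene_content(False, True, False))
```

The "unresolved" outcome is deliberate: a genome whose gene content fits neither pattern should be examined rather than forced into a subgroup.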
Extensive homologous and heterologous recombination events have been documented in both human and animal group 2 coronaviruses. These events have generated various genotypes and strains within a coronavirus species and have led to the acquisition of new genes from non-coronavirus RNA donors. MHV is one of the most extensively studied examples of homologous recombination in coronaviruses, and it is also the coronavirus in which homologous recombination was first observed. Among the human coronaviruses, the most studied example is HCoV-HKU1.
After its discovery in 1937, IBV remained the only species of group 3 coronavirus for over 50 years. In the last decade of the 20th century and the first few years of the 21st century, a few IBV-like viruses, including TCoV, were described in various species of birds, and some of their genomes were sequenced. The sizes, G + C contents, and organizations of their genomes were similar, indicating that they probably diverged recently from the same ancestor. These 70 years of quiescence were broken by two discoveries in 2008: first, the report of SW1 from a beluga whale, with the largest coronavirus genome; and second, the discovery of a novel subgroup of coronaviruses from birds of different families, with the smallest coronavirus genomes.
SW1 was the first reported mammalian group 3 coronavirus with a complete genome sequence and was phylogenetically distantly related to IBV. Uniquely, eight ORFs, occupying a 4105-base region, were observed between M and N, giving rise to the largest reported coronavirus genome. We propose that this lineage should be designated group 3b coronavirus, whereas IBV and the IBV-like viruses should be designated group 3a coronaviruses.
The novel subgroup of avian coronaviruses, group 3c coronavirus, consists of at least three members (BuCoV HKU11, ThCoV HKU12, and MuCoV HKU13) that infect at least three different families of birds (bulbuls, thrushes, and munias). These coronaviruses are distantly related to IBV and SW1. Most interestingly, these three avian coronaviruses also cluster with a coronavirus recently discovered in the Asian leopard cat (ALC-CoV), for which the complete genome sequence was not available.