I am posting the second part of the first report by Chinese virologist Li-Meng Yan which gives evidence of the Covid-19 virus being created in the Chinese laboratory, Wuhan Institute of Virology. I've posted the first part earlier here. The second part gives the procedures used to create the Covid-19 virus in the lab. I am not including the figures and citations; the full report (in pdf format) with all figures and citations can be accessed via this link:
Unusual Features of the SARS-CoV-2 Genome Suggesting Sophisticated Laboratory Modification Rather Than Natural Evolution and Delineation of Its Probable Synthetic Route
Li-Meng Yan (MD, PhD), Shu Kang (PhD), Jie Guan (PhD), Shanchang Hu (PhD)
Rule of Law Society & Rule of Law Foundation, New York, NY, USA.
2. Delineation of a synthetic route of SARS-CoV-2
In the second part of this report, we describe a synthetic route of creating SARS-CoV-2 in a laboratory setting. It is postulated based on substantial literature support as well as genetic evidence present in the SARS-CoV-2 genome. Although steps presented herein should not be viewed as exactly those taken, we believe that key processes should not be much different. Importantly, our work here should serve as a demonstration of how SARS-CoV-2 can be designed and created conveniently in research laboratories by following proven concepts and using well-established techniques.
Importantly, research labs in both Hong Kong and mainland China are leading the world in coronavirus research, in terms of both resources and research output. The latter is evidenced not only by the large number of publications that they have produced over the past two decades but also by their milestone achievements in the field: they were the first to identify civets as the intermediate host for SARS-CoV and isolated the first strain of the virus; they were the first to uncover that SARS-CoV originated from bats; they revealed for the first time the antibody-dependent enhancement (ADE) of SARS-CoV infections; they have contributed significantly to the understanding of MERS in all domains (zoonosis, virology, and clinical studies); and they made several breakthroughs in SARS-CoV-2 research. Last but not least, they have the world’s largest collection of coronaviruses (genomic sequences and live viruses). The knowledge, expertise, and resources are all readily available within the Hong Kong and mainland research laboratories (they collaborate extensively) to carry out and accomplish the work described below.
2.1 Possible scheme in designing the laboratory-creation of the novel coronavirus
In this sub-section, we outline the possible overall strategy and major considerations that may have been formulated at the designing stage of the project.
To engineer and create a human-targeting coronavirus, they would have to pick a bat coronavirus as the template/backbone. This can be conveniently done because many research labs have been actively collecting bat coronaviruses over the past two decades. However, this template virus ideally should not be one from Dr. Zhengli Shi’s collections, considering that she is widely known to have been engaged in gain-of-function studies on coronaviruses. Therefore, ZC45 and/or ZXC21, novel bat coronaviruses discovered and owned by military laboratories, would be suitable as the template/backbone. It is also possible that these military laboratories had discovered other closely related viruses from the same location and kept some unpublished. Therefore, the actual template could be ZC45, or ZXC21, or a close relative of them. The postulated pathway described below would be the same regardless of which one of the three was the actual template.
Once they have chosen a template virus, they would first need to engineer, through molecular cloning, the Spike protein so that it can bind hACE2. The concept and cloning techniques involved in this manipulation have been well-documented in the literature. With almost no risk of failing, the template bat virus could then be converted to a coronavirus that can bind hACE2 and infect humans.
Second, they would use molecular cloning to introduce a furin-cleavage site at the S1/S2 junction of Spike. This manipulation, based on existing knowledge, would likely produce a strain of coronavirus that is more infectious and pathogenic.
Third, they would produce an ORF1b gene construct. The ORF1b gene encodes the polyprotein Orf1b, which is processed post-translationally to produce individual viral proteins: RNA-dependent RNA polymerase (RdRp), helicase, guanine-N7 methyltransferase, uridylate-specific endoribonuclease, and 2’-O-methyltransferase. All of these proteins are parts of the replication machinery of the virus. Among them, the RdRp protein is the most crucial one and is highly conserved among coronaviruses. Importantly, Dr. Zhengli Shi’s laboratory uses a PCR protocol, which amplifies a particular fragment of the RdRp gene, as their primary method to detect the presence of coronaviruses in raw samples (bat fecal swabs, feces, etc.). As a result of this practice, the Shi group has documented the sequence information of this short segment of RdRp for all coronaviruses that they have successfully detected and/or collected.
Here, the genetic manipulation is less demanding or complicated because Orf1b is conserved and likely Orf1b from any β coronavirus would be competent enough to do the work. However, we believe that they would want to introduce a particular Orf1b into the virus for one of the two possible reasons:
1. Since many phylogenetic analyses categorize coronaviruses based on the sequence similarity of the RdRp gene only, having a different RdRp in the genome could ensure that SARS-CoV-2 and ZC45/ZXC21 are separated into different groups/sub-lineages in phylogenetic studies. Choosing an RdRp gene, however, is convenient because the short RdRp segment sequence has been recorded for all coronaviruses ever collected/detected. Their final choice was the RdRp sequence from bat coronavirus RaBtCoV/4991, which was discovered in 2013. For RaBtCoV/4991, the only information ever published was the sequence of its short RdRp segment, while neither its full genomic sequence nor virus isolation was ever reported. After amplifying the RdRp segment (or the whole ORF1b gene) of RaBtCoV/4991, they would have then used it for subsequent assembly and creation of the genome of SARS-CoV-2. Small changes in the RdRp sequence could either be introduced at the beginning (through DNA synthesis) or be generated via passages later on. On a separate track, when they were engaged in the fabrication of the RaTG13 sequence, they could have started with the short RdRp segment of RaBtCoV/4991 without introducing any changes to its sequence, resulting in a 100% nucleotide sequence identity between the two viruses on this short RdRp segment. This RaTG13 virus could then be claimed to have been discovered back in 2013.
2. The RdRp protein from RaBtCoV/4991 is unique in that it is superior to RdRp from any other β coronavirus for developing antiviral drugs. RdRp has no homologs in human cells, which makes this essential viral enzyme a highly desirable target for antiviral development. As an example, Remdesivir, which is currently undergoing clinical trials, targets RdRp. When creating a novel and human-targeting virus, they would be interested in developing the antidote as well. Even though drug discovery like this may not be easily achieved, it is reasonable for them to intentionally incorporate an RdRp that is more amenable to antiviral drug development.
Fourth, they would use reverse genetics to assemble the gene fragments of spike, ORF1b, and the rest of the template ZC45 into a cDNA version of the viral genome. They would then carry out in vitro transcription to obtain the viral RNA genome. Transfection of the RNA genome into cells would allow the recovery of live and infectious viruses with the desired artificial genome.
Fifth, they would carry out characterization and optimization of the virus strain(s) to improve the fitness, infectivity, and overall adaptation using serial passage in vivo. One or several viral strains that meet certain criteria would then be obtained as the final product(s).
2.2 A postulated synthetic route for the creation of SARS-CoV-2
In this sub-section, we describe in more detail how each step could be carried out in a laboratory setting using available materials and routine molecular, cellular, and virologic techniques. A diagram of this process is shown in Figure 8. We estimate that the whole process could be completed in approximately 6 months.
Step 1: Engineering the RBM of the Spike for hACE2-binding (1.5 months)
The Spike protein of a bat coronavirus is either incapable of or inefficient in binding hACE2 because important residues are missing within its RBM. This can be exemplified by the RBM of the template virus ZC45 (Figure 4). The first and most critical step in the creation of SARS-CoV-2 is to engineer the Spike so that it acquires the ability to bind hACE2. As evidenced in the literature, such manipulations have been carried out repeatedly in research laboratories since 2008, which successfully yielded engineered coronaviruses with the ability to infect human cells. Although there are many possible ways that one can engineer the Spike protein, we believe that what was actually undertaken was that they replaced the original RBM with a designed and possibly optimized RBM using SARS’ RBM as a guide. As described in part 1, this theory is supported by our observation that two unique restriction sites, EcoRI and BstEII, exist at either end of the RBM in the SARS-CoV-2 genome (Figure 5A) and by the fact that such an RBM swap has been successfully carried out by Dr. Zhengli Shi and by her long-term collaborator and structural biology expert, Dr. Fang Li.
Although ZC45 spike does not contain these two restriction sites (Figure 5B), they can be introduced very easily. The original spike gene would be either amplified with RT-PCR or obtained through DNA synthesis (some changes could be safely introduced to certain variable regions of the sequence) followed by PCR. The gene would then be cloned into a plasmid using restriction sites other than EcoRI and BstEII.
Once in the plasmid, the spike gene can be modified easily. First, an EcoRI site can be introduced by converting the highlighted “gaacac” sequence (Figure 5B) to the desired “gaattc” (Figure 5A). The two sequences differ by two consecutive nucleotides. Using the commercially available QuikChange Site-Directed Mutagenesis kit, such a di-nucleotide mutation can be generated in no more than one week.
Subsequently, the BstEII site could be similarly introduced at the other end of the RBM. Specifically, the “gaatacc” sequence (Figure 5B) would be converted to the desired “ggttacc” (Figure 5A), which would similarly require a week of time.
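The statement that each pair of motifs differs by two consecutive nucleotides can be checked with a plain string comparison. The following is a minimal sketch for that check only; the function name is hypothetical, and the only inputs are the short motifs quoted in the text above.

```python
def mismatch_positions(a: str, b: str) -> list[int]:
    """Return the 0-based positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Motifs as quoted in the text (Figure 5B form, then Figure 5A form).
ecori_diff = mismatch_positions("gaacac", "gaattc")
bsteii_diff = mismatch_positions("gaatacc", "ggttacc")

print(ecori_diff)   # [3, 4]
print(bsteii_diff)  # [1, 2]
```

In both cases the differing positions are adjacent, which agrees with the description of a di-nucleotide change.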
Once these restriction sites, which are unique within the spike gene of SARS-CoV-2, were successfully introduced, different RBM segments could be swapped in conveniently and the resulting Spike protein subsequently evaluated using established assays.
As described in part 1, the design of an RBM segment could be well-guided by the high-resolution structures (Figure 3), yielding a sequence that resembles the SARS RBM in an intelligent manner. When carrying out the structure-guided design of the RBM, they would have followed the routine and generated a few (for example a dozen) such RBMs with the hope that some specific variant(s) may be superior to others in binding hACE2. Once the design was finished, they could have each of the designed RBM genes commercially synthesized (quick and very affordable) with an EcoRI site at the 5’-end and a BstEII site at the 3’-end. These novel RBM genes could then be cloned into the spike gene, respectively. The gene synthesis and subsequent cloning, which could be done in a batch mode for the small library of designed RBMs, would take approximately one month.
These engineered Spike proteins might then be tested for hACE2-binding using the established pseudotype virus infection assays. The engineered Spike with good to exceptional binding affinities would be selected. (Although not necessary, directed evolution could be involved here (error-prone PCR on the RBM gene), coupled with either an in vitro binding assay or a pseudotype virus infection assay, to obtain an RBM that binds hACE2 with exceptional affinity.)
Given the abundance of literature on Spike engineering and the available high-resolution structures of the Spike-hACE2 complex, the success of this step would be very much guaranteed. By the end of this step, as desired, a novel spike gene would be obtained, which encodes a novel Spike protein capable of binding hACE2 with high affinity.
Step 2: Engineering a furin-cleavage site at the S1/S2 junction (0.5 month)
The product from Step 1, a plasmid containing the engineered spike, would be further modified to include a furin-cleavage site (segment indicated by green lines in Figure 4) at the S1/S2 junction. This short stretch of gene sequence can be conveniently inserted using several routine cloning techniques, including QuikChange Site-Directed PCR, overlap PCR followed by restriction enzyme digestion and ligation, or Gibson assembly. None of these techniques would leave any trace in the sequence. Whichever cloning method was the choice, the inserted gene piece would be included in the primers, which would be designed, synthesized, and used in the cloning. This step, leading to a further modified Spike with the furin-cleavage site added at the S1/S2 junction, could be completed in no more than two weeks.
Step 3: Obtain an ORF1b gene that contains the sequence of the short RdRp segment from RaBtCoV/4991 (1 month, yet can be carried out concurrently with Steps 1 and 2)
Unlike the engineering of Spike, no complicated design is needed here, except that the RdRp gene segment from RaBtCoV/4991 would need to be included. Gibson assembly could have been used here. In this technique, several fragments, each adjacent pair sharing a 20-40 bp overlap, are combined in one simple reaction to assemble a long DNA product. Two or three fragments, each covering a significant section of the ORF1b gene, would be selected based on known bat coronavirus sequences. One of these fragments would be the RdRp segment of RaBtCoV/4991. Each fragment would be PCR amplified with proper overlap regions introduced in the primers. Finally, all purified fragments would be pooled at equimolar concentrations and added to the Gibson reaction mixture, which, after a short incubation, would yield the desired ORF1b gene in whole.
Step 4: Produce the designed viral genome using reverse genetics and recover live viruses (0.5 month)
Reverse genetics has been frequently used in assembling whole viral genomes, including coronavirus genomes. The most recent example is the reconstruction of the SARS-CoV-2 genome using transformation-assisted recombination in yeast. Using this method, the Swiss group assembled the entire viral genome and produced live viruses in just one week. This efficient technique, which would not leave any trace of artificial manipulation in the created viral genome, has been available since 2017. In addition to the engineered spike gene (from steps 1 and 2) and the ORF1b gene (from step 3), other fragments covering the rest of the genome would be obtained either through RT-PCR amplification from the template virus or through DNA synthesis by following a sequence slightly altered from that of the template virus. We believe that the latter approach was more likely as it would allow sequence changes to be introduced into the variable regions of less conserved proteins, a process that could be easily guided by multiple sequence alignments. The amino acid sequences of proteins with more conserved functions, such as the E protein, might have been left unchanged. All DNA fragments would then be pooled together and transformed into yeast, where the cDNA version of the SARS-CoV-2 genome would be assembled via transformation-assisted recombination. Of course, an alternative method of reverse genetics, one that the WIV has successfully used in the past, could also be employed. Although some earlier reverse genetics approaches may leave restriction sites where different fragments are joined, these traces would be hard to detect as the exact site of ligation can be anywhere in the ~30kb genome. Either way, a cDNA version of the viral genome would be obtained from the reverse genetics experiment.
Subsequently, in vitro transcription using the cDNA as the template would yield the viral RNA genome, which upon transfection into Vero E6 cells would allow the production of live viruses bearing all of the designed properties.
Step 5: Optimize the virus for fitness and improve its hACE2-binding affinity in vivo (2.5-3 months)
Virus recovered from step 4 needs to be further adapted through a classic experiment: serial passage in laboratory animals. This final step would validate the virus’ fitness and ensure its receptor-oriented adaptation toward its intended host, which, according to the analyses above, should be human. Importantly, the RBM and the furin-cleavage site, which were introduced into the Spike protein separately, would now be optimized together as one functional unit. Among various available animal models (e.g. mice, hamsters, ferrets, and monkeys) for coronaviruses, hACE2 transgenic mice (hACE2-mice) should be the most proper and convenient choice here. This animal model was established during the study of SARS-CoV and has been available from the Jackson Laboratory for many years.
The procedure of serial passage is straightforward. Briefly, the selected viral strain from step 4, a precursor of SARS-CoV-2, would be intranasally inoculated into a group of anaesthetized hACE2-mice. Around 2-3 days post infection, the virus in lungs would usually amplify to a peak titer. The mice would then be sacrificed and the lungs homogenized. Usually, the mouse-lung supernatant, which carries the highest viral load, would be used to extract the candidate virus for the next round of passage. After approximately 10-15 rounds of passage, the hACE2-binding affinity, the infection efficiency, and the lethality of the viral strain would be sufficiently enhanced and the viral genome stabilized. Finally, after a series of characterization experiments (e.g. viral kinetics assay, antibody response assay, symptom observation and pathology examination), the final product, SARS-CoV-2, would be obtained, concluding the whole creation process. From this point on, this viral pathogen could be amplified (most probably using Vero E6 cells) and produced routinely.
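The overall estimate of approximately 6 months can be cross-checked against the per-step durations given in the step headings. The sketch below is only an arithmetic check: the variable and function names are hypothetical, and it assumes that Step 3 runs fully in parallel with Steps 1 and 2, as the Step 3 heading states is possible.

```python
# Per-step durations in months as (low, high), taken from the step headings.
STEP1_RBM = (1.5, 1.5)
STEP2_FURIN_SITE = (0.5, 0.5)
STEP3_ORF1B = (1.0, 1.0)             # may run concurrently with Steps 1 and 2
STEP4_REVERSE_GENETICS = (0.5, 0.5)
STEP5_SERIAL_PASSAGE = (2.5, 3.0)    # given as a range in the heading


def total_months(bound: int) -> float:
    """Total duration for bound=0 (low estimate) or bound=1 (high estimate)."""
    spike_track = STEP1_RBM[bound] + STEP2_FURIN_SITE[bound]
    orf1b_track = STEP3_ORF1B[bound]
    # The two tracks overlap in time, so only the longer one counts.
    parallel_phase = max(spike_track, orf1b_track)
    return (parallel_phase
            + STEP4_REVERSE_GENETICS[bound]
            + STEP5_SERIAL_PASSAGE[bound])


print(total_months(0), total_months(1))  # 5.0 5.5
```

Under these assumptions the stated durations sum to 5-5.5 months, which is in line with the rounded figure of approximately 6 months.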
It is noteworthy that, based on the work done on SARS-CoV, hACE2-mice, although suitable for SARS-CoV-2 adaptation, are not a good model to reflect the virus’ transmissibility and associated clinical symptoms in humans. We believe that those scientists might not have used a proper animal model (such as the golden Syrian hamster) for testing the transmissibility of SARS-CoV-2 before the outbreak of COVID-19. If they had done this experiment with a proper animal model, the highly contagious nature of SARS-CoV-2 would have been extremely evident and consequently SARS-CoV-2 would not have been described as “not causing human-to-human transmission” at the start of the outbreak.
We also speculate that the extensive laboratory-adaptation, which is oriented toward enhanced transmissibility and lethality, may have driven the virus too far. As a result, SARS-CoV-2 might have lost the capacity to attenuate on both transmissibility and lethality during its current adaptation in the human population. This hypothesis is consistent with the lack of apparent attenuation of SARS-CoV-2 so far despite its great prevalence and with the observation that a recently emerged, predominant variant only shows improved transmissibility.
Serial passage is a quick and intensive process, where the adaptation of the virus is accelerated. Although intended to mimic natural evolution, serial passage is much more limited in both time and scale. As a result, fewer random mutations would be expected in serial passage than in natural evolution. This is particularly true for conserved viral proteins, such as the E protein. Critical in viral replication, the E protein is a determinant of virulence and engineering of it may render SARS-CoV-2 attenuated.
Therefore, at the initial assembly stage, these scientists might have decided to keep the amino acid sequence of the E protein unchanged from that of ZC45/ZXC21. Due to the conserved nature of the E protein and the limitations of serial passage, no amino acid mutation actually occurred, resulting in a 100% sequence identity on the E protein between SARS-CoV-2 and ZC45/ZXC21. The same could have happened to the marks of molecular cloning (restriction sites flanking the RBM). Serial passage, which should have partially naturalized the SARS-CoV-2 genome, might not have removed all signs of artificial manipulation.
3. Final remarks
Many questions remain unanswered about the origin of SARS-CoV-2. Prominent virologists have implied in a Nature Medicine letter that laboratory escape, while not entirely ruled out, was unlikely and that no sign of genetic manipulation is present in the SARS-CoV-2 genome. However, here we show that genetic evidence within the spike gene of the SARS-CoV-2 genome (restriction sites flanking the RBM; tandem rare codons used at the inserted furin-cleavage site) does exist and suggests that the SARS-CoV-2 genome should be a product of genetic manipulation. Furthermore, the proven concepts, well-established techniques, and knowledge and expertise are all in place for the convenient creation of this novel coronavirus in a short period of time.
Motives aside, the following facts about SARS-CoV-2 are well-supported:
1. If it was a laboratory product, the most critical element in its creation, the backbone/template virus (ZC45/ZXC21), is owned by military research laboratories.
2. The genome sequence of SARS-CoV-2 has likely undergone genetic engineering, through which the virus has gained the ability to target humans with enhanced virulence and infectivity.
3. The characteristics and pathogenic effects of SARS-CoV-2 are unprecedented. The virus is highly transmissible, onset-hidden, multi-organ targeting, sequelae-unclear, lethal, and associated with various symptoms and complications.
4. SARS-CoV-2 caused a world-wide pandemic, taking hundreds of thousands of lives and shutting down the global economy. It has a destructive power like no other.
Judging from the evidence that we and others have gathered, we believe that finding the origin of SARS-CoV-2 should involve an independent audit of the WIV P4 laboratories and the laboratories of their close collaborators. Such an investigation should have taken place long ago and should not be delayed any further.
We also note that in the publication of the chimeric virus SHC014-MA15 in 2015, the attribution of funding of Zhengli Shi by the NIAID was initially left out. It was reinstated in the publication in 2016 in a corrigendum, perhaps after the meeting in January 2016 to reinstate NIH funding for gain-of-function research on viruses. This is unusual scientific behavior, which needs an explanation. What is not thoroughly described in this report is the body of evidence indicating that several recently published coronaviruses (RaTG13, RmYN02, and several pangolin coronaviruses) are highly suspicious and likely fraudulent. These fabrications would serve no purpose other than to deceive the scientific community and the general public so that the true identity of SARS-CoV-2 is hidden.
Although exclusion of details of such evidence does not alter the conclusion of the current report, we do believe that these details would provide additional support for our contention that SARS-CoV-2 is a laboratory-enhanced virus and a product of gain-of-function research. A follow-up report focusing on such additional evidence is now being prepared and will be submitted shortly.