Antibodies broadly neutralizing SARS-CoV-2 and other sarbecoviruses and uses thereof
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TSINGHUA UNIVERSITY
- Filing Date
- 2022-06-24
- Publication Date
- 2026-06-19
Abstract
Description
Technical Field
[0001] This invention belongs to the field of biotechnology and relates to antibodies that broadly neutralize SARS-CoV-2 and other sarbecoviruses, and to their applications. Specifically, the antibodies may be nanobodies or nanobody-Fc fusion antibodies.
Background Technology
[0002] The novel coronavirus (SARS-CoV-2, Severe Acute Respiratory Syndrome Coronavirus 2) belongs to the Sarbecovirus subgenus of the genus Betacoronavirus. As transmission has progressed, SARS-CoV-2 has accumulated numerous mutations, producing increasingly potent variants with varying degrees of immune evasion. Major circulating variants include the Alpha, Beta, Gamma, Delta, and Omicron strains. The emergence of these variants has reduced vaccine efficacy to varying degrees, and many monoclonal antibodies have shown reduced or even completely lost neutralizing activity against these mutant strains.
[0003] Besides SARS-CoV-2, Severe Acute Respiratory Syndrome Coronavirus 1 (SARS-CoV-1) also belongs to the subgenus Sarbecovirus. Sarbecoviruses commonly use bats as animal hosts and can be further divided into three branches: first, SARS-CoV-1 viruses that infect humans and bat-derived SARS-CoV-1 virus strains with similar genomes, such as Bat SARS-like WIV1; second, bat-derived SARS-CoV-1-like virus strains, such as Bat SARS-like CoV ZC45 and Bat SARS-like CoV ZXC21; and third, SARS-CoV-2 viruses that infect humans and bat-derived and pangolin-derived virus strains with similar genomes, such as BatCoV RaTG13. Since the start of the 21st century, sarbecoviruses have caused two major epidemics in humans (SARS-CoV-1 and SARS-CoV-2), and researchers continue to discover more sarbecovirus strains closely related to SARS-CoV-1 in natural hosts. To address future variants and the potential for sarbecoviruses to spill over from their natural hosts and infect humans, discovering universal drugs and vaccines for the prevention and treatment of sarbecovirus infection is crucial to ending the pandemic and to preventing further mutations and possible outbreaks of novel coronaviruses.
[0004] Conventional immunoglobulin IgG antibodies, such as monoclonal antibodies isolated from humans or humanized mouse antibodies, consist of two identical heavy chains and two identical light chains. Besides conventional antibodies, camelids also possess unique heavy chain antibodies (HCAbs), composed of only two identical heavy chains, each consisting of one variable region (VHH) and two constant regions (CH2 and CH3). The VHH is the smallest complete functional unit of a heavy chain antibody and is also known as a nanobody or single-domain antibody. Nanobodies have numerous advantages, including small size, high specificity, high stability, ease of production, strong tissue penetration, and low immunogenicity. Thanks to their small size and high specificity, nanobodies can more easily recognize hidden epitopes on target proteins, which facilitates the identification of more neutralizing epitopes. This provides new ideas for antibody drug and vaccine development, and effective technologies and methods for epidemic prevention and control.
Summary of the Invention
[0005] The purpose of this invention is to provide antibodies that broadly neutralize SARS-CoV-2 and other sarbecoviruses, and their applications.
[0006] This invention provides four types of antibodies (nanobodies). This invention protects any one or any combination of the four antibodies.
[0007] The four antibodies are: TH-34 antibody, TH-41 antibody, TH-44 antibody, and TH-40 antibody.
[0008] This invention also provides four antibodies (fusion antibodies). This invention protects any one or any combination of the four antibodies.
[0009] The four antibodies are: TH-34-Fc antibody, TH-41-Fc antibody, TH-44-Fc antibody and TH-40-Fc antibody.
[0010] The TH-34 antibody includes a framework region (FR) and a complementarity-determining region (CDR); CDR1, CDR2, and CDR3 in the TH-34 antibody are amino acid residues at positions 26-33, 51-57, and 96-105 in sequence 8 of the sequence listing, respectively; FR1, FR2, FR3, and FR4 in the TH-34 antibody are amino acid residues at positions 1-25, 34-50, 58-95, and 105-116 in sequence 8 of the sequence listing, respectively.
[0011] The TH-41 antibody includes a framework region (FR) and a complementarity-determining region (CDR); CDR1, CDR2, and CDR3 in the TH-41 antibody are amino acid residues at positions 26-33, 51-57, and 96-105 in sequence 10 of the sequence listing, respectively; FR1, FR2, FR3, and FR4 in the TH-41 antibody are amino acid residues at positions 1-25, 34-50, 58-95, and 105-116 in sequence 10 of the sequence listing, respectively.
[0012] The TH-44 antibody includes a framework region (FR) and a complementarity-determining region (CDR); CDR1, CDR2, and CDR3 in the TH-44 antibody are amino acid residues at positions 26-33, 51-57, and 96-105 in sequence 12 of the sequence listing, respectively; FR1, FR2, FR3, and FR4 in the TH-44 antibody are amino acid residues at positions 1-25, 34-50, 58-95, and 105-116 in sequence 12 of the sequence listing, respectively.
[0013] The TH-40 antibody includes a framework region (FR) and a complementarity-determining region (CDR); CDR1, CDR2, and CDR3 in the TH-40 antibody are amino acid residues at positions 26-33, 51-58, and 97-115 in sequence 14 of the sequence listing, respectively; FR1, FR2, FR3, and FR4 in the TH-40 antibody are amino acid residues at positions 1-25, 34-50, 59-96, and 116-126 in sequence 14 of the sequence listing, respectively.
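The residue ranges in the four paragraphs above are 1-based and inclusive, so they map directly onto string slices. The following is a minimal Python sketch, not part of the original disclosure: the helper names and the placeholder sequence are hypothetical (the real sequences are in the sequence listing and are not reproduced here), and only the position ranges are taken from the text. The overlap check is included because the ranges as stated for TH-34, TH-41, and TH-44 assign position 105 to both CDR3 and FR4, whereas the TH-40 ranges tile the sequence exactly.

```python
# Region boundaries as stated in the text (1-based, inclusive).
TH34_REGIONS = {
    "FR1": (1, 25), "CDR1": (26, 33), "FR2": (34, 50), "CDR2": (51, 57),
    "FR3": (58, 95), "CDR3": (96, 105), "FR4": (105, 116),
}
TH40_REGIONS = {
    "FR1": (1, 25), "CDR1": (26, 33), "FR2": (34, 50), "CDR2": (51, 58),
    "FR3": (59, 96), "CDR3": (97, 115), "FR4": (116, 126),
}


def slice_regions(sequence, regions):
    """Return {region name: subsequence} for 1-based inclusive ranges."""
    return {name: sequence[start - 1:end] for name, (start, end) in regions.items()}


def shared_positions(regions):
    """Return the sorted positions that are assigned to more than one region."""
    seen, shared = set(), set()
    for start, end in regions.values():
        for pos in range(start, end + 1):
            if pos in seen:
                shared.add(pos)
            else:
                seen.add(pos)
    return sorted(shared)


# Hypothetical placeholder of the right length, NOT a real antibody sequence.
placeholder = "X" * 126
parts = slice_regions(placeholder, TH40_REGIONS)
print({name: len(seq) for name, seq in parts.items()})
print("TH-34 shared positions:", shared_positions(TH34_REGIONS))
print("TH-40 shared positions:", shared_positions(TH40_REGIONS))
```

The same `slice_regions` helper could be applied to sequences 8, 10, 12, and 14 once they are loaded from the sequence listing.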
[0014] Specifically, the TH-34 antibody is shown in sequence 8 of the sequence listing.
[0015] Specifically, the TH-41 antibody is shown in sequence 10 of the sequence listing.
[0016] Specifically, the TH-44 antibody is shown in sequence 12 of the sequence listing.
[0017] Specifically, the TH-40 antibody is shown in sequence 14 of the sequence listing.
[0018] Specifically, the TH-34 antibody comprises the following segments from the N-terminus to the C-terminus: the protein segment shown in Sequence 8 of the sequence listing, and a protein tag. Specifically, the TH-34 antibody consists of the following segments sequentially from the N-terminus to the C-terminus: the protein segment shown in Sequence 8 of the sequence listing, and a protein tag.
[0019] Specifically, the TH-41 antibody comprises the following segments from the N-terminus to the C-terminus: the protein segment shown in Sequence 10 of the sequence listing, and a protein tag. Specifically, the TH-41 antibody consists of the following segments sequentially from the N-terminus to the C-terminus: the protein segment shown in Sequence 10 of the sequence listing, and a protein tag.
[0020] Specifically, the TH-44 antibody comprises the following segments from the N-terminus to the C-terminus: the protein segment shown in Sequence 12 of the sequence listing, and a protein tag. Specifically, the TH-44 antibody consists of the following segments sequentially from the N-terminus to the C-terminus: the protein segment shown in Sequence 12 of the sequence listing, and a protein tag.
[0021] Specifically, the TH-40 antibody comprises the following segments from the N-terminus to the C-terminus: the protein segment shown in Sequence 14 of the sequence listing, and a protein tag. Specifically, the TH-40 antibody consists of the following segments sequentially from the N-terminus to the C-terminus: the protein segment shown in Sequence 14 of the sequence listing, and a protein tag.
[0022] Specifically, the protein tag may be a His6 tag.
[0023] The TH-34-Fc antibody comprises the following segments from the N-terminus to the C-terminus: TH-34 antibody and Fc.
[0024] The TH-41-Fc antibody comprises the following segments from the N-terminus to the C-terminus: TH-41 antibody and Fc.
[0025] The TH-44-Fc antibody comprises the following segments from the N-terminus to the C-terminus: TH-44 antibody and Fc.
[0026] The TH-40-Fc antibody comprises the following segments from the N-terminus to the C-terminus: TH-40 antibody and Fc.
[0027] Specifically, the TH-34-Fc antibody comprises, from N-terminus to C-terminus, the following segments: the protein segment shown in Sequence 8 of the sequence listing, the linker peptide, and Fc.
[0028] Specifically, the TH-41-Fc antibody comprises the following segments from N-terminus to C-terminus: the protein segment shown in Sequence 10 of the sequence listing, the linker peptide, and Fc.
[0029] Specifically, the TH-44-Fc antibody comprises, from N-terminus to C-terminus, the following segments: the protein segment shown in Sequence 12 of the sequence listing, the linker peptide, and Fc.
[0030] Specifically, the TH-40-Fc antibody comprises, from N-terminus to C-terminus, the following segments: the protein segment shown in Sequence 14 of the sequence listing, the linker peptide, and Fc.
[0031] The linker peptide may specifically be "GGGGS".
[0032] The Fc can specifically be a human-derived Fc.
[0033] Specifically, the human Fc is shown in sequence 32 of the sequence listing.
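The N-terminus to C-terminus layout of the fusion antibodies (nanobody segment, linker peptide, Fc) can be sketched as a simple concatenation. This is only an illustration under stated assumptions: the class name and both placeholder sequences are hypothetical, the Fc placeholder length is arbitrary, and only the linker "GGGGS" and the segment order come from the text.

```python
from dataclasses import dataclass

LINKER = "GGGGS"  # linker peptide named in the text


@dataclass(frozen=True)
class FusionAntibody:
    """Hypothetical container for a nanobody-Fc fusion construct."""
    name: str
    vhh: str              # nanobody segment (e.g. Sequence 8 for TH-34)
    fc: str               # human Fc segment (Sequence 32)
    linker: str = LINKER

    def sequence(self) -> str:
        """Concatenate the segments from N-terminus to C-terminus."""
        return self.vhh + self.linker + self.fc


# Placeholders only: 116 matches the stated TH-34 length, 10 is arbitrary.
vhh_placeholder = "V" * 116
fc_placeholder = "F" * 10

th34_fc = FusionAntibody("TH-34-Fc", vhh_placeholder, fc_placeholder)
full = th34_fc.sequence()
print(th34_fc.name, len(full))
```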
[0034] This invention also provides four genes. This invention protects any one or any combination of the four genes.
[0035] The four genes are: the gene encoding TH-34 antibody, the gene encoding TH-41 antibody, the gene encoding TH-44 antibody, and the gene encoding TH-40 antibody.
[0036] This invention also provides four genes. This invention protects any one or any combination of the four genes.
[0037] The four genes are: the gene encoding TH-34-Fc antibody, the gene encoding TH-41-Fc antibody, the gene encoding TH-44-Fc antibody, and the gene encoding TH-40-Fc antibody.
[0038] This invention protects the use of any of the antibodies or antibody combinations described above in the preparation of medicaments for inhibiting coronaviruses.
[0039] The present invention also protects a drug for inhibiting coronaviruses, the active ingredient of which is any of the antibodies or combinations of antibodies described above.
[0040] This invention protects the use of any of the antibodies or antibody combinations described above in the preparation of medicaments for neutralizing coronaviruses.
[0041] The present invention also protects a drug for neutralizing coronaviruses, the active ingredient of which is any of the antibodies or combinations of antibodies described above.
[0042] This invention protects the use of any of the antibodies or antibody combinations described above in the preparation of medicaments for the prevention and / or treatment of diseases caused by coronaviruses.
[0043] This invention also protects a medicament for the prevention and / or treatment of diseases caused by coronaviruses, the active ingredient of which is any of the antibodies or combinations of antibodies described above.
[0044] Specifically, the coronavirus in question is a beta coronavirus.
[0045] Specifically, the coronavirus belongs to the subgenus Sarbecovirus.
[0046] Specifically, the coronavirus in question is the novel coronavirus (SARS-CoV-2).
[0047] Specifically, the novel coronavirus may be any of the following: wild-type novel coronavirus, novel coronavirus Alpha strain, novel coronavirus Beta strain, novel coronavirus Gamma strain, novel coronavirus Delta strain, novel coronavirus Omicron BA.1 strain, or novel coronavirus Omicron BA.2 strain.
[0048] Specifically, the coronavirus in question is Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV-1).
[0049] Specifically, the sarbecovirus is Pangolin CoV GD, Pangolin CoV GX, Bat CoV WIV16, or Bat CoV RaTG13.
[0050] This invention uses the spike proteins of SARS-CoV-2 and SARS-CoV-1 as bait to isolate four nanobodies from immunized alpacas. These nanobodies bind to both the SARS-CoV-2 and the SARS-CoV-1 spike proteins, and are named TH-34 nanobody, TH-41 nanobody, TH-44 nanobody, and TH-40 nanobody. This invention also provides four corresponding fusion antibodies, formed by fusing each of these four nanobodies with human Fc. The antibodies provided by this invention exhibit broad-spectrum neutralizing activity against SARS-CoV-2, with strong neutralizing ability against the wild-type novel coronavirus and naturally occurring variants, and also strong neutralizing ability against other coronaviruses of the Sarbecovirus subgenus. This invention has significant application value for the prevention and control of novel coronaviruses and/or other sarbecoviruses, and will have profound social implications.
Brief Description of the Drawings
[0051] Figure 1 is the gel filtration chromatogram from the purification of the SARS-CoV-2 RBD protein.
[0052] Figure 2 is the gel filtration chromatogram from the purification of the SARS-CoV-2S protein.
[0053] Figure 3 is the gel filtration chromatogram from the purification of the SARS-CoV-1S protein.
[0054] Figure 4 shows the results of the neutralization test of the four nanobodies against wild-type novel coronavirus pseudovirus.
[0055] Figure 5 shows the results of the neutralization test of the four nanobodies against SARS-CoV-1 pseudovirus.
Detailed Description of the Embodiments
[0056] The present invention will now be described in further detail with reference to specific embodiments. The given embodiments are merely illustrative of the invention and not intended to limit its scope. The embodiments provided below can serve as a guide for further improvements by those skilled in the art and do not constitute a limitation on the invention in any way.
[0057] Unless otherwise specified, the experimental methods in the following examples are conventional methods, performed according to the techniques or conditions described in the literature in this field or according to the product instructions. All recombinant plasmids in the examples have been verified by sequencing. Unless otherwise specified, the quantitative experiments in the following examples were performed in triplicate, and the results were averaged. Unless otherwise specified, all materials and reagents used in the following examples are commercially available. 293F cells: Thermo Fisher Scientific, R79007. Plasmid pFastBac-dual: Gibco, 10712024. Plasmid pcDNA3.1(+): Invitrogen, catalog number V790-20. 293T cells: Gader, CRL-11268. Insect cell culture medium (Sf-900II SFM): Gibco, catalog number 10902-088. The adenovirus vaccine AdC68-19S is the ChAdTS-COVID-19S viral fluid prepared in patent application 202010369075.2 (publication number CN113583978A, publication date 2021.11.02). The pMD18-T vector is a component of the pMD™ 18-T Vector Cloning Kit (Takara, catalog number 6011), https://www.takarabiomed.com.cn/ProductShow.aspx?m=20141220150817403017&productID=20141227130215047304.
[0058] hACE2-hela cells (i.e. "HeLa cell lines stably expressing the ACE2 molecules") are described in the following literature: Wang, R., Zhang, Q., Ge, J., Ren, W., Zhang, R., Lan, J., Ju, B., Su, B., Yu, F., Chen, P., Liao, H., Feng, Y., Li, X., Shi, X., Zhang, Z., Zhang, F., Ding, Q., Zhang, T., Wang, X. & Zhang, L. Analysis of SARS-CoV-2 variant mutations reveals neutralization escape mechanisms and the ability to use ACE2 receptors from additional species. Immunity 54, 1611-1621.e1615, doi:10.1016/j.immuni.2021.06.003 (2021).
[0059] The pVRC8400 vector (i.e., the pVRC8400 expression vector in the literature) is described in the following literature: Luke H Chao, Daryl E Klein, Aaron G Schmidt, Jennifer M Stephen C Harrison.; Sequential conformational rearrangements in flavivirus membrane fusion; eLife, 2014;3:e04389.
[0060] Example 1: Screening of Nanobodies
[0061] I. Preparation of Immunogens
[0062] 1. Preparation of SARS-CoV-2 RBD protein
[0063] The small fragment between the BamHI and HindIII restriction sites in plasmid pcDNA3.1(+) was replaced with a double-stranded DNA molecule as shown in Sequence 2 of the sequence listing to obtain the recombinant plasmid pcDNA3.1-SARS-CoV-2RBD.
[0064] The DNA molecule shown in Sequence 2 of the sequence listing encodes the protein shown in Sequence 1 of the sequence listing. The protein shown in Sequence 1 of the sequence listing is named SARS-CoV-2RBD protein. The SARS-CoV-2RBD protein exists in a dimer form, with an expected molecular weight of approximately 50 kDa.
[0065] In sequence 1 of the sequence listing, amino acid residues 1-33 form the signal peptide, amino acid residues 34-256 form the SARS-CoV-2RBD, amino acid residues 257-264 form the strep tag, amino acid residues 265-272 form the Flag tag, and amino acid residues 273-278 form the His6 tag.
[0066] The recombinant plasmid pcDNA3.1-SARS-CoV-2RBD was transfected into 293F cells, which were cultured in SMM 293-TII medium for 72 h, and then centrifuged at 4000 rpm for 30 min to collect the supernatant containing SARS-CoV-2RBD protein.
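Centrifugation conditions in this example are mostly given in rpm, while a few later steps are given in g. The two are related through the rotor radius, which the text does not state. A minimal sketch of the standard conversion, RCF = 1.118 × 10⁻⁵ × r(cm) × rpm², is given below; the 15 cm radius is purely an assumed value for illustration, and the function names are hypothetical.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return (rcf / (1.118e-5 * radius_cm)) ** 0.5


# Assumed rotor radius; the source does not specify the rotor used.
ASSUMED_RADIUS_CM = 15.0

harvest_rcf = rcf_from_rpm(4000, ASSUMED_RADIUS_CM)
print(f"4000 rpm at {ASSUMED_RADIUS_CM} cm is about {harvest_rcf:.0f} x g")
```

Reporting the g value alongside the rpm value would make the harvesting step reproducible on a different rotor.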
[0067] 2. Preparation of SARS-CoV-2S protein
[0068] The small fragment between the BamHI and HindIII restriction sites in plasmid pcDNA3.1(+) was replaced with a double-stranded DNA molecule as shown in sequence 4 of the sequence listing to obtain the recombinant plasmid pcDNA3.1-SARS-CoV-2spike 2P.
[0069] The DNA molecule shown in Sequence 4 of the sequence listing encodes the protein shown in Sequence 3 of the sequence listing. The protein shown in Sequence 3 of the sequence listing is named the SARS-CoV-2 spike 2P extracellular fusion protein, or SARS-CoV-2S protein for short. The SARS-CoV-2S protein exists in trimer form, with an expected molecular weight of 420 kDa.
[0070] In sequence 3 of the sequence listing, amino acid residues 1-13 form the signal peptide, amino acid residues 14-1211 form the extracellular region of the SARS-CoV-2 spike protein, amino acid residues 1212-1219 form the 3C protease cleavage site, amino acid residues 1220-1227 form the linker peptide, amino acid residues 1228-1255 form the trimer tag (which promotes the formation of a stable trimer), amino acid residues 1256-1263 form the strep tag, and amino acid residues 1264-1269 form the His6 tag.
[0071] The corresponding protein in the wild-type novel coronavirus is shown in Sequence 5 of the sequence listing. Compared with the protein shown in Sequence 5, the protein shown in Sequence 3 has the following modifications: two mutations were introduced to increase protein stability, namely, the S1/S2 cleavage site was changed from “RRKR” to “GSAS” and “KV” was changed to “PP”; and a 3C protease cleavage site, a linker peptide, a trimer tag, a strep tag, and a His6 tag were added at the C-terminus.
[0072] The recombinant plasmid pcDNA3.1-SARS-CoV-2spike 2P was transfected into 293F cells, which were cultured in SMM 293-TII medium for 72 h, and then centrifuged at 4000 rpm for 30 min to collect the supernatant containing SARS-CoV-2S protein.
[0073] 3. Preparation of SARS-CoV-1S protein
[0074] (1) Construction of recombinant plasmids
[0075] The small fragment between the EcoRI and XbaI restriction sites in plasmid pFastBac-dual was replaced with a double-stranded DNA molecule as shown in sequence 7 of the sequence listing to obtain the recombinant plasmid pFastBac-SARS-CoV-1spike.
[0076] The DNA molecule shown in Sequence 7 of the sequence listing encodes the protein shown in Sequence 6 of the sequence listing. The protein shown in Sequence 6 of the sequence listing is named the SARS-CoV-1 spike extracellular region fusion protein, or SARS-CoV-1S protein for short. The SARS-CoV-1S protein exists in trimer form, with an expected molecular weight of 420 kDa.
[0077] In sequence 6 of the sequence listing, amino acid residues 1-13 form the signal peptide, amino acid residues 14-1195 form the extracellular region of the SARS-CoV-1 spike protein, amino acid residues 1196-1203 form the 3C protease cleavage site, amino acid residues 1204-1211 form the linker peptide, amino acid residues 1212-1239 form the trimer tag (which promotes the formation of a stable trimer), amino acid residues 1240-1245 form the His6 tag, and amino acid residues 1246-1253 form the strep tag.
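The expected molecular weights quoted for the three immunogens can be roughly cross-checked from the residue ranges given for sequences 1, 3, and 6. The sketch below is a back-of-the-envelope estimate only: it assumes an average residue mass of about 110 Da (a common rule of thumb, not a value from the source), assumes the signal peptide is removed on secretion, and ignores glycosylation. The dictionary layout and function name are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)

# name: (total residues, signal peptide length, copies per assembly, expected kDa in text)
CONSTRUCTS = {
    "SARS-CoV-2RBD protein": (278, 33, 2, 50),
    "SARS-CoV-2S protein": (1269, 13, 3, 420),
    "SARS-CoV-1S protein": (1253, 13, 3, 420),
}


def estimated_mass_kda(total: int, signal: int, copies: int) -> float:
    """Estimate the assembly mass from the mature chain length (no glycans)."""
    mature_residues = total - signal
    return mature_residues * AVG_RESIDUE_MASS_DA * copies / 1000.0


estimates = {
    name: estimated_mass_kda(total, signal, copies)
    for name, (total, signal, copies, _expected) in CONSTRUCTS.items()
}
for name, kda in estimates.items():
    expected = CONSTRUCTS[name][3]
    print(f"{name}: estimated {kda:.1f} kDa, text states about {expected} kDa")
```

Under these assumptions the estimates land within about 10% of the stated values, which is as close as a per-residue average can be expected to get.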
[0078] (2) Preparation of recombinant Bacmid
[0079] ① Add the recombinant plasmid pFastBac-SARS-CoV-1spike to Escherichia coli DH10Bac competent cells and keep on ice for 30 min; heat shock at 42℃ for 75 s and return to ice for 2 min; add 500 μl of LB liquid medium and recover at 37℃ for 5 h; then spread 10 μl on LB solid medium containing 50 μg/ml kanamycin, 7 μg/ml gentamicin, 10 μg/ml tetracycline, 40 μg/ml IPTG and 100 μg/ml X-gal, and incubate in the dark for three days until blue and white colonies are clearly distinguishable.
[0080] ② Pick a single white colony and inoculate it into 5 mL of LB liquid medium containing 50 μg/ml kanamycin, 7 μg/ml gentamicin, and 10 μg/ml tetracycline. Incubate at 37°C with shaking at 220 rpm for 12 hours.
[0081] ③ Extract plasmid from the culture obtained in step ② using a plasmid miniprep kit (QIAprep Spin Miniprep Kit, Qiagen, catalog number 27106, containing reagents P1, P2, and P3). The specific steps are as follows: centrifuge the culture at 13000 rpm for 2 min, collect the bacterial pellet, and resuspend it in reagent P1; add reagent P2 and slowly invert 6-8 times to mix; add reagent P3 and slowly invert 6-8 times to mix (a white precipitate becomes visible), centrifuge at 13000 rpm for 10 min, and collect the supernatant; to 600 μl of the supernatant add 800 μl of pre-chilled isopropanol, incubate at -20°C for 10 min, then centrifuge at 13000 rpm for 15 min and collect the precipitate. Resuspend the precipitate in 500 μl of pre-chilled 70% aqueous ethanol, centrifuge at 13000 rpm for 5 min, and collect the precipitate. After the ethanol has evaporated completely, dissolve the precipitate in ddH2O preheated to 65°C, centrifuge at 13000 rpm for 5 min, and collect the supernatant. This is the recombinant Bacmid solution, referred to as the Bacmid solution.
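The selection plate in step ① is specified by final concentrations only. The volumes of stock solution needed follow from V_stock = C_final × V_medium / C_stock. The sketch below illustrates that arithmetic; the final concentrations are taken from the text, whereas the stock concentrations and the 500 mL batch size are hypothetical values chosen for the example and should be replaced with those actually used.

```python
# Final concentrations from the text, in µg per mL of LB solid medium.
FINAL_UG_PER_ML = {
    "kanamycin": 50, "gentamicin": 7, "tetracycline": 10,
    "IPTG": 40, "X-gal": 100,
}

# Hypothetical stock concentrations in mg/mL (equal to µg/µL); not from the source.
ASSUMED_STOCK_MG_PER_ML = {
    "kanamycin": 50, "gentamicin": 10, "tetracycline": 10,
    "IPTG": 200, "X-gal": 20,
}


def stock_volumes_ul(medium_ml, final=FINAL_UG_PER_ML, stock=ASSUMED_STOCK_MG_PER_ML):
    """Volume of each stock in µL for medium_ml of medium (C_final * V / C_stock)."""
    return {name: final[name] * medium_ml / stock[name] for name in final}


volumes = stock_volumes_ul(500)  # 500 mL batch is an arbitrary example
for name, ul in volumes.items():
    print(f"{name}: {ul:.0f} µL of stock")
```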
[0082] (3) Preparation and amplification of recombinant viruses
[0083] ① Seed Sf9 cells into a 10 cm culture dish and let them stand for 10 minutes so that the cells adhere. Check under a microscope that the cells cover about 70%-80% of the bottom of the dish.
[0084] ② Take 15 μl of Cellfectin II Reagent and dilute it with 100 μl of insect cell culture medium.
[0085] ③ Take 15-20 μl of the Bacmid solution obtained in step (2) and dilute it with 100 μl of insect cell culture medium.
[0086] ④ Slowly add the liquid obtained in step ② to the liquid obtained in step ③, mix gently by pipetting, let stand at room temperature for 30 minutes, and then bring the volume to 2 ml with insect cell culture medium.
[0087] ⑤ Take the culture dish from step ①, discard the supernatant, and slowly and evenly add the liquid obtained in step ④ to the dish. After incubating at 27℃ for 5 hours, discard the supernatant, add 7 ml of fresh insect cell culture medium, seal with sealing film, and incubate at 27℃ for 8 days. Collect the culture medium, centrifuge at 600g for 6 minutes, take the supernatant, and add fetal bovine serum to a volume concentration of 2-5% for long-term storage. This is the virus solution of the P0 generation recombinant virus.
[0088] ⑥ Add the virus solution of the P0 generation recombinant virus, at a volume ratio of 1:1000, to a shake-flask culture of Sf9 cells at a concentration of 2×10⁶ cells/mL. After culturing at 27°C and 110 rpm for 5 days, collect the culture medium, centrifuge at 600g for 6 min, and take the supernatant. This is the virus solution of the P1 generation recombinant virus, referred to as the P1 generation virus solution.
[0089] (4) Protein expression
[0090] Add the P1 generation virus solution, at a volume ratio of 1:100, to 1 L of Sf9 cell culture at a concentration of 2×10⁶ cells/mL. After culturing at 27°C and 125 rpm for 72 hours, centrifuge at 4000 rpm for 15 minutes and collect the supernatant, which is the supernatant containing SARS-CoV-1S protein.
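The virus additions above are specified as volume ratios relative to the culture. A small sketch of the resulting volumes and cell numbers is given below. The 1:100 ratio, the 1 L culture, and the density of 2×10⁶ cells/mL are taken from the expression step; the 100 mL shake-flask volume used for the 1:1000 amplification step is an assumption, since the text does not state it, and the function names are hypothetical.

```python
def inoculum_volume_ml(culture_volume_ml: float, ratio: float) -> float:
    """Virus volume for a virus:culture volume ratio of 1:ratio."""
    return culture_volume_ml / ratio


def total_cells(culture_volume_ml: float, cells_per_ml: float) -> float:
    """Total number of cells in the culture."""
    return culture_volume_ml * cells_per_ml


# Expression step: P1 virus at 1:100 into 1 L of Sf9 cells at 2e6 cells/mL.
p1_volume = inoculum_volume_ml(1000, 100)
expression_cells = total_cells(1000, 2e6)

# Amplification step: P0 virus at 1:1000; the 100 mL flask volume is assumed.
ASSUMED_FLASK_ML = 100
p0_volume = inoculum_volume_ml(ASSUMED_FLASK_ML, 1000)

print(f"P1 virus for expression: {p1_volume:.1f} mL into {expression_cells:.1e} cells")
print(f"P0 virus for amplification: {p0_volume:.2f} mL per {ASSUMED_FLASK_ML} mL flask")
```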
[0091] 4. Protein purification
[0092] The protein solutions to be purified are: the supernatant containing SARS-CoV-2RBD protein prepared in step 1, or the supernatant containing SARS-CoV-2S protein prepared in step 2, or the supernatant containing SARS-CoV-1S protein prepared in step 3.
[0093] (1) Affinity chromatography
[0094] Affinity chromatography column specifications: length 3cm, inner diameter 1cm.
[0095] Affinity chromatography column packing material: nickel column beads (purchased from Qiagen, catalog number 30230).
[0096] Perform the following steps in sequence: ① Load 500 ml of the protein solution to be purified onto the affinity chromatography column and incubate at 4°C for 3 hours; ② Wash the column with 100 mL of HEPES buffer (pH 7.2, 1 M) containing 20 mM imidazole; ③ Elute the target protein with 30 mL of HEPES buffer (pH 7.2, 1 M) containing 500 mM imidazole and collect the post-column solution.
[0097] (2) Concentrate the post-column solution obtained from affinity chromatography with a 10 kDa centrifugal concentrator (purchased from Merck, catalog number UFC800396) to obtain a concentrated solution with a volume of 1 mL.
[0098] (3) Gel filtration chromatography
[0099] The specifications for the gel filtration chromatography column are: length 24cm, inner diameter 2cm.
[0100] Gel filtration chromatography column packing material: Superdex 200 Increase 10/300 GL (purchased from GE Healthcare, catalog number 28-9909-44).
[0101] Perform the following steps: Load 0.5 mL of the concentrated solution obtained in step (2), elute with PBS buffer (pH 7.2, 10 mM) at a flow rate of 0.5 mL / min, and collect the post-column solution corresponding to the target peak, which is the purified protein solution.
[0102] When the protein solution to be purified is the supernatant containing SARS-CoV-2RBD protein prepared in step 1, the gel filtration chromatogram is shown in Figure 1. The retention volume corresponding to the target peak was 18.192 mL, and the purified protein solution was named SARS-CoV-2RBD protein solution.
[0103] When the protein solution to be purified is the supernatant containing SARS-CoV-2S protein prepared in step 2, the gel filtration chromatogram is shown in Figure 2. The retention volume corresponding to the target peak was 8.633 mL, and the purified protein solution was named SARS-CoV-2S protein solution.
[0104] When the protein solution to be purified is the supernatant containing SARS-CoV-1S protein prepared in step 3, the gel filtration chromatogram is shown in Figure 3. The retention volume corresponding to the target peak was 11.627 mL, and the purified protein solution was named SARS-CoV-1S protein solution.
[0105] The proteins used in subsequent steps and examples were all supplied as these purified protein solutions.
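At the stated flow rate of 0.5 mL/min, each retention volume corresponds to an elution time of volume divided by flow rate, which is convenient when setting fraction collection windows. The sketch below simply performs that division and orders the three target peaks; the variable and function names are hypothetical, and the numbers are the retention volumes and flow rate given in the text.

```python
FLOW_RATE_ML_PER_MIN = 0.5  # elution flow rate from the text

# Retention volumes of the target peaks (mL), from the text.
RETENTION_ML = {
    "SARS-CoV-2RBD protein": 18.192,
    "SARS-CoV-2S protein": 8.633,
    "SARS-CoV-1S protein": 11.627,
}


def elution_time_min(retention_ml: float, flow_ml_per_min: float = FLOW_RATE_ML_PER_MIN) -> float:
    """Time after injection at which the peak elutes, in minutes."""
    return retention_ml / flow_ml_per_min


times = {name: elution_time_min(vol) for name, vol in RETENTION_ML.items()}
elution_order = sorted(times, key=times.get)

for name in elution_order:
    print(f"{name}: {RETENTION_ML[name]} mL, about {times[name]:.1f} min")
```

In size-exclusion chromatography larger species elute earlier, so the two spike constructs appearing before the RBD construct is the expected ordering.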
[0106] II. Screening of alpaca nanobody display library after immunization
[0107] 1. Alpaca immunization and construction of the nanobody display library
[0108] The immunization process for alpacas is as follows:
[0109] First immunization: SARS-CoV-2 RBD protein, at a protein dose of 200 μg/animal (specifically, dilute the protein solution to 1 mL, then mix it with 1 mL of complete Freund's adjuvant before use);
[0110] Second immunization: SARS-CoV-2 RBD protein, at a protein dose of 200 μg/animal (specifically, dilute the protein solution to 1 mL, then mix it with 1 mL of incomplete Freund's adjuvant before use);
[0111] Third immunization: SARS-CoV-2 RBD protein, at a protein dose of 200 μg/animal (specifically, dilute the protein solution to 1 mL, then mix it with 1 mL of incomplete Freund's adjuvant before use);
[0112] Fourth immunization: adenovirus vaccine AdC68-19S, at a dose of 10¹¹ vp/animal;
[0113] Fifth immunization: SARS-CoV-2S protein, at a protein dose of 200 μg/animal (specifically, dilute the protein solution to 1 mL, then mix it with 1 mL of incomplete Freund's adjuvant before use);
[0114] Sixth immunization: SARS-CoV-2S protein, at a protein dose of 200 μg/animal (specifically, dilute the protein solution to 1 mL, then mix it with 1 mL of incomplete Freund's adjuvant before use).
[0115] All six immunizations were administered by subcutaneous injection in the neck.
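The six-dose schedule can be written down as a small table, which makes the cumulative protein dose per immunogen and the protein concentration in each emulsion easy to check. The sketch below is illustrative only; the data structure and function names are hypothetical, the protein doses and the 1 mL + 1 mL mixing volumes are from the text, and the adenovirus dose is left out of the protein totals because it is specified in viral particles rather than micrograms.

```python
from collections import defaultdict

# (dose number, immunogen, adjuvant, protein dose in µg per animal or None)
SCHEDULE = [
    (1, "SARS-CoV-2 RBD protein", "complete Freund's adjuvant", 200),
    (2, "SARS-CoV-2 RBD protein", "incomplete Freund's adjuvant", 200),
    (3, "SARS-CoV-2 RBD protein", "incomplete Freund's adjuvant", 200),
    (4, "AdC68-19S adenovirus vaccine", None, None),  # dosed in viral particles
    (5, "SARS-CoV-2S protein", "incomplete Freund's adjuvant", 200),
    (6, "SARS-CoV-2S protein", "incomplete Freund's adjuvant", 200),
]

PROTEIN_VOLUME_ML = 1.0   # diluted protein solution, from the text
ADJUVANT_VOLUME_ML = 1.0  # adjuvant volume, from the text


def protein_totals(schedule):
    """Cumulative protein dose (µg per animal) for each protein immunogen."""
    totals = defaultdict(int)
    for _dose_no, immunogen, _adjuvant, dose_ug in schedule:
        if dose_ug is not None:
            totals[immunogen] += dose_ug
    return dict(totals)


def emulsion_concentration_ug_per_ml(dose_ug: float) -> float:
    """Protein concentration in the final protein/adjuvant mixture."""
    return dose_ug / (PROTEIN_VOLUME_ML + ADJUVANT_VOLUME_ML)


totals = protein_totals(SCHEDULE)
print(totals)
print(emulsion_concentration_ug_per_ml(200), "µg/mL in each 2 mL emulsion")
```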
[0116] Seven days after the sixth immunization, 50 mL of blood was collected from the jugular vein and sent to Chengdu Apak Biotechnology Co., Ltd. for construction of a nanobody-displaying yeast library, yielding a yeast library with a diversity of 10⁸.
[0117] 2. First round of magnetic bead sorting
[0118] (1) Inoculate 10⁹ yeast cells into 100 mL of SDCAA medium and culture at 30°C and 250 rpm until OD600 = 7.
[0119] (2) After completing step (1), centrifuge at 4000 rpm for 5 minutes, discard the supernatant, resuspend the pellet in SGCAA medium (initial OD600 = 0.5), and incubate at 20℃ and 250 rpm until OD600 = 4.
[0120] (3) After completing step (2), take 10⁹ cells, wash them with buffer (PBS buffer containing 1% FBS, the same below), and resuspend them in 2 mL of buffer. Add bait protein (SARS-CoV-2S protein or SARS-CoV-1S protein) to a concentration of 100 nM in the system, incubate at room temperature for 30 minutes and then on ice for 10 minutes, centrifuge at 3500 rpm for 5 minutes, collect the pellet, and wash it with buffer.
[0121] (4) Resuspend the precipitate obtained in step (3) in 3 mL of buffer, add 200 μL of Streptavidin MicroBeads (Miltenyi), incubate on ice for 10 minutes (stir and mix every 2 minutes), then add 5 mL of buffer, mix and centrifuge, resuspend the precipitate in 10 mL of buffer, pass through a 70 μm filter and collect the filtrate (filtration is to remove cell clusters), which is the yeast magnetic bead suspension.
[0122] (5) Place the LS column (Miltenyi Biotec, 130-042-401) on a magnetic rack, rinse with 3 mL of cold buffer, add 8.5 mL of the yeast magnetic bead suspension obtained in step (4), remove the LS column after it has drained and immediately put it back, add 1 mL of buffer to rinse, then add the remaining yeast magnetic bead suspension, rinse 3 times with 3 mL of buffer after it has drained, then remove the LS column from the magnetic rack, add 6 mL of buffer to rinse into a 15 mL centrifuge tube.
[0123] (6) Take the centrifuge tube obtained in step (5), centrifuge at 3500 rpm for 5 minutes, and collect the pellet (the pellet collected here is yeast bound to magnetic beads; the beads are not removed in subsequent culture, but they are progressively diluted out as the cells expand and are passaged). Resuspend the pellet in 1 mL of SDCAA medium and culture for 24 hours, then centrifuge to collect the pellet, resuspend it in SGCAA medium, and culture for 36 hours.
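The cell numbers used in the first sorting round can be related to the library size by a simple coverage estimate. The sketch below reads the library diversity as 10⁸ and the number of cells carried into sorting as 10⁹, and assumes that all clones are equally represented and sampled independently (a Poisson approximation that real libraries only roughly satisfy). The function names are hypothetical.

```python
import math


def coverage(cells_sampled: float, diversity: float) -> float:
    """Average number of cells sampled per distinct clone."""
    return cells_sampled / diversity


def fraction_missed(fold_coverage: float) -> float:
    """Expected fraction of clones with zero cells sampled (Poisson model)."""
    return math.exp(-fold_coverage)


LIBRARY_DIVERSITY = 1e8  # library diversity as read from the text
CELLS_SORTED = 1e9       # cells taken into the first sorting round

fold = coverage(CELLS_SORTED, LIBRARY_DIVERSITY)
missed = fraction_missed(fold)
print(f"{fold:.0f}-fold coverage, expected fraction of clones missed: {missed:.2e}")
```

Under these assumptions a tenfold oversampling leaves only a very small fraction of clones unrepresented at the start of sorting.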
[0124] 3. Second-round sorting
[0125] (1) After completing step 2, centrifuge to collect the cell pellet and wash it with buffer. Resuspend 10⁸ cells in 500 μL of buffer, add bait protein (SARS-CoV-2S protein or SARS-CoV-1S protein) to a concentration of 100 nM, incubate on ice for 30 minutes, then centrifuge at 3500 rpm for 5 minutes, collect the cell pellet, and wash it with buffer.
[0126] (2) After completing step (1), resuspend the cells in 500 μL buffer, add 2.5 μL anti-HA-AF488 antibody (CellSignaling) and 2.5 μL Streptavidin-PE antibody (eBioscience), incubate on ice for 30 minutes, wash three times with 1 mL buffer, resuspend in 4 mL buffer, filter through a 70 μm filter and collect the filtrate, which is the yeast cell suspension.
[0127] (3) Load the yeast cell suspension obtained in step (2) onto an Aria II flow cytometer and sort the FITC-positive and PE-positive yeast into a collection tube containing 2 mL of SDCAA medium.
[0128] (4) After completing step (3), take 2×10³ yeast cells from the collection tube, dilute them in 200 μL of SDCAA medium, spread on an SDCAA plate, and incubate overnight at 30°C; cryopreserve the remaining positive cells.
[0129] 4. Identification of positive yeast monoclonal clones
[0130] (1) After completing step 3, pick yeast single clones from the SDCAA plate into a 96-well deep plate, add 400 μL of SDCAA medium to each well, and incubate overnight at 30°C and 250 rpm.
[0131] (2) After completing step (1), take 50 μL of sample, centrifuge at 4000 rpm for 5 minutes, discard the supernatant, resuspend the precipitate in 400 μL of SGCAA medium, and incubate at 20°C and 250 rpm for 36 hours.
[0132] (3) After completing step (2), take 100 μL of sample, centrifuge to remove the supernatant, wash the precipitate with buffer, then resuspend it in 200 μL of buffer, add bait protein (SARS-CoV-2 S protein or SARS-CoV-1 S protein) to a concentration of 50 nM in the system, incubate on ice for 30 minutes, then centrifuge at 3500 rpm for 5 minutes, collect the cell pellet, and wash it with buffer.
[0133] (4) Resuspend the cell pellet obtained in step (3) in 200 μL buffer, add 1 μL of anti-HA-AF488 antibody and 1 μL of Streptavidin-PE antibody, incubate on ice for 30 minutes, centrifuge to collect the cell pellet, wash with buffer and resuspend in 200 μL buffer.
[0134] (5) Analyze the cell suspension obtained in step (4) on a Fortessa flow cytometer to determine the proportion of positive yeast, and select double-positive yeast monoclones in which more than 20% of the yeast are positive.
[0135] (6) Extract yeast plasmids from double-positive yeast monoclonal samples, sequence them, and obtain nanobody gene sequences.
[0136] Four nanobodies were obtained and named TH-34 nanobody, TH-41 nanobody, TH-44 nanobody and TH-40 nanobody respectively.
[0137] The amino acid sequence of the TH-34 nanobody is shown in Sequence 8 of the sequence listing, and its encoding gene is shown in Sequence 9 of the sequence listing. The amino acid sequence of the TH-41 nanobody is shown in Sequence 10 of the sequence listing, and its encoding gene is shown in Sequence 11 of the sequence listing. The amino acid sequence of the TH-44 nanobody is shown in Sequence 12 of the sequence listing, and its encoding gene is shown in Sequence 13 of the sequence listing. The amino acid sequence of the TH-40 nanobody is shown in Sequence 14 of the sequence listing, and its encoding gene is shown in Sequence 15 of the sequence listing.
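The screening criterion in step 4(5) can be sketched as a short calculation. This is a minimal illustration only: the gate thresholds, the event values, and the function names `double_positive_fraction` and `passes_screen` are hypothetical and are not taken from the specification, which does not state how the gates were set.

```python
def double_positive_fraction(events, af488_gate, pe_gate):
    """Fraction of events above both gates. `events` is a list of (AF488, PE) pairs."""
    if not events:
        raise ValueError("no events recorded")
    hits = sum(1 for af488, pe in events if af488 > af488_gate and pe > pe_gate)
    return hits / len(events)


def passes_screen(fraction, cutoff=0.20):
    """Apply the 'more than 20%' criterion from step (5)."""
    return fraction > cutoff


# Made-up intensities for five events; the gates of 500 are hypothetical.
events = [(1200, 900), (50, 30), (1500, 40), (2000, 1800), (30, 20)]
fraction = double_positive_fraction(events, af488_gate=500, pe_gate=500)
print(fraction, passes_screen(fraction))  # 0.4 True
```

A clone whose double-positive fraction is exactly 20% would not pass under this reading, since the text specifies "more than 20%".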
[0138] Example 2: Preparation of nanobody-Fc fusion protein
[0139] I. Construction of Recombinant Expression Vectors
[0140] 1. Insert the double-stranded DNA molecule shown in Sequence 16 of the sequence listing into the EcoRV restriction site of the pMD18-T vector to obtain the human Fc vector. In Sequence 16 of the sequence listing, nucleotides 51-354 form the CMV Enhancer, nucleotides 355-558 form the CMV Promoter, nucleotides 947-952 form the NheI restriction recognition sequence, nucleotides 954-961 form the NotI restriction recognition sequence, nucleotides 977-1672 encode the human Fc (the amino acid sequence of the human Fc is shown in Sequence 32 of the sequence listing), nucleotides 1673-1675 are the stop codon, and nucleotides 1748-1869 form SV40Poly(A).
[0141] 2. Replace the small fragment between the NheI and NotI restriction enzyme recognition sequences of the human Fc vector (containing the NheI and NotI restriction enzyme recognition sequences, i.e., GCTAGCGCGCGGCCGC) with the double-stranded DNA molecule shown in Sequence 9 of the sequence listing to obtain the recombinant plasmid TH-34-Fc. The recombinant plasmid TH-34-Fc expresses the following fusion protein, which, from N-terminus to C-terminus, consists of a signal peptide, the protein segment shown in Sequence 8 (TH-34 nanobody), a linker peptide (GGGGS), and the protein segment shown in Sequence 32 (human Fc). In cells, the signal peptide is cleaved, leaving the following active protein, which, from N-terminus to C-terminus, consists of the protein segment shown in Sequence 8 (TH-34 nanobody), the linker peptide (GGGGS), and the protein segment shown in Sequence 32 (human Fc).
[0142] 3. Replace the small fragment between the NheI and NotI restriction enzyme recognition sequences of the human Fc vector (containing the NheI and NotI restriction enzyme recognition sequences, i.e., GCTAGCGCGCGGCCGC) with the double-stranded DNA molecule shown in Sequence 11 of the sequence listing to obtain the recombinant plasmid TH-41-Fc. The recombinant plasmid TH-41-Fc expresses the following fusion protein, which, from N-terminus to C-terminus, consists of a signal peptide, the protein segment shown in Sequence 10 (TH-41 nanobody), a linker peptide (GGGGS), and the protein segment shown in Sequence 32 (human Fc). In cells, the signal peptide is cleaved, leaving the following active protein, which, from N-terminus to C-terminus, consists of the protein segment shown in Sequence 10 (TH-41 nanobody), the linker peptide (GGGGS), and the protein segment shown in Sequence 32 (human Fc).
[0143] 4. Replace the small fragment between the NheI and NotI restriction enzyme recognition sequences of the human Fc vector (containing the NheI and NotI restriction enzyme recognition sequences, i.e., GCTAGCGCGCGGCCGC) with the double-stranded DNA molecule shown in Sequence 13 of the sequence listing to obtain the recombinant plasmid TH-44-Fc. The recombinant plasmid TH-44-Fc expresses the following fusion protein, which, from N-terminus to C-terminus, consists of a signal peptide, the protein segment shown in Sequence 12 (TH-44 nanobody), a linker peptide (GGGGS), and the protein segment shown in Sequence 32 (human Fc). In cells, the signal peptide is cleaved, leaving the following active protein, which, from N-terminus to C-terminus, consists of the protein segment shown in Sequence 12 (TH-44 nanobody), the linker peptide (GGGGS), and the protein segment shown in Sequence 32 (human Fc).
[0144] 5. Replace the small fragment between the NheI and NotI restriction enzyme recognition sequences of the human Fc vector (containing the NheI and NotI restriction enzyme recognition sequences, i.e., GCTAGCGCGCGGCCGC) with the double-stranded DNA molecule shown in Sequence 15 of the sequence listing to obtain the recombinant plasmid TH-40-Fc. The recombinant plasmid TH-40-Fc expresses the following fusion protein, which, from N-terminus to C-terminus, consists of a signal peptide, the protein segment shown in Sequence 14 (TH-40 nanobody), a linker peptide (GGGGS), and the protein segment shown in Sequence 32 (human Fc). In cells, the signal peptide is cleaved, leaving the following active protein, which, from N-terminus to C-terminus, consists of the protein segment shown in Sequence 14 (TH-40 nanobody), the linker peptide (GGGGS), and the protein segment shown in Sequence 32 (human Fc).
[0145] II. Preparation of Nanobody-Fc Fusion Protein
[0146] 1. Transfect the recombinant plasmid prepared in step I into 293F cells, culture in SMM 293-TII medium for 72 h, then centrifuge at 4°C and 4000 rpm for 30 min and collect the supernatant.
[0147] 2. Affinity chromatography
[0148] Affinity chromatography column specifications: length 3 cm, inner diameter 1 cm;
[0149] Affinity chromatography column packing material: protein A beads (Thermo, catalog number 10006D);
[0150] Perform the following steps in sequence: ① Load 300 mL of the supernatant obtained in step 1 onto the affinity chromatography column and incubate at 4°C for 16 hours; ② Wash the column with 60 mL of PBS buffer (pH 7.2, 10 mM); ③ Elute the target protein with 30 mL of elution buffer and collect the post-column solution.
[0151] Elution buffer: dissolve 7.5 g of glycine in water and bring the volume to 500 mL; adjust the pH to 3.0 with hydrochloric acid.
[0152] 3. Take the post-column solution obtained in step 2, concentrate it with an ultrafiltration concentrator, and exchange the buffer to PBS (pH 7.2, 10 mM) to obtain 1 mL of antibody solution (antibody concentration approximately 2 mg/mL).
[0153] Step II was performed with the recombinant plasmid TH-34-Fc, and the resulting antibody solution was named TH-34-Fc antibody solution.
[0154] Step II was performed with the recombinant plasmid TH-41-Fc, and the resulting antibody solution was named TH-41-Fc antibody solution.
[0155] Step II was performed with the recombinant plasmid TH-44-Fc, and the resulting antibody solution was named TH-44-Fc antibody solution.
[0156] Step II was performed with the recombinant plasmid TH-40-Fc, and the resulting antibody solution was named TH-40-Fc antibody solution.
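As a quick arithmetic check on the elution buffer recipe in the affinity chromatography step above (7.5 g of glycine brought to 500 mL), the following sketch computes the resulting glycine concentration. The molar mass of glycine (75.07 g/mol) is a standard value that the specification does not state, and the function name `molar_concentration` is hypothetical.

```python
GLYCINE_MOLAR_MASS = 75.07  # g/mol, standard value (not stated in the source)


def molar_concentration(mass_g, molar_mass_g_per_mol, volume_ml):
    """Return concentration in mol/L for a solute mass dissolved to a final volume."""
    return mass_g / molar_mass_g_per_mol / (volume_ml / 1000.0)


glycine_molar = molar_concentration(7.5, GLYCINE_MOLAR_MASS, 500.0)
print(round(glycine_molar, 3))  # 0.2
```

Under that assumption the recipe corresponds to roughly 0.2 M glycine-HCl at pH 3.0.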
[0157] Example 3: Detection of the neutralizing activity of nanobody-Fc fusion proteins against SARS-CoV-2 and other sarbecoviruses.
[0158] The sources of the novel coronavirus membrane proteins are as follows:
[0159] The membrane proteins of the seven novel coronavirus strains originate from the following sources:
[0160] Wild-type novel coronavirus (Genbank: MN908947.3);
[0161] The novel coronavirus Alpha strain (GISAID: EPI_ISL_601443) contains 9 mutations: 69-70del, 144del, N501Y, A570D, D614G, P681H, T716I, S982A, and D1118H; B.1.1.7;
[0162] The novel coronavirus Beta strain (GISAID: EPI_ISL_700450) contains 10 mutations: L18F, D80A, D215G, 242-244del, S305T, K417N, E484K, N501Y, D614G, A701V; B.1.351;
[0163] The novel coronavirus Gamma strain (GISAID: EPI_ISL_792681) contains 12 mutations: L18F, T20N, P26S, D138Y, R190S, K417T, E484K, N501Y, D614G, H655Y, T1027I, V1176F; P.1;
[0164] The novel coronavirus Delta strain (GISAID: EPI_ISL_1534938) contains 10 mutations: T19R, G142D, 156-157del, R158G, A222V, L452R, T478K, D614G, P681R, and D950N; B.1.617.2;
[0165] The novel coronavirus Omicron BA.1 strain (GISAID: EPI_ISL_6752027) contains 34 mutations: A67V, 69-70del, T95I, G142D, 143-145del, 211del, L212I, ins214EPE, G339D, S371L, S373P, S375F, K417N, N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y, Y505H, T547K, D614G, H655Y, N679K, P681H, N764K, D796Y, N856K, Q954H, N969K, L981F; BA.1;
[0166] The novel coronavirus Omicron BA.2 strain (GISAID: EPI_ISL_8515362) contains 29 mutations: T19I, 24-26del, A27S, G142D, V213G, G339D, S371F, S373P, S375F, T376A, D405N, R408S, K417N, N440K, S477N, T478K, E484A, Q493R, Q498R, N501Y, Y505H, D614G, H655Y, N679K, P681H, N764K, D796Y, N969K, Q954H; BA.2;
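The mutation labels in the lists above follow three patterns: substitutions (e.g. N501Y), deletions (e.g. 144del or 69-70del), and insertions (e.g. ins214EPE). A minimal sketch of how such labels might be parsed and counted is given below; the function names and the dictionary layout are hypothetical and serve only to illustrate the notation.

```python
import re

# Hypothetical patterns for the three label styles used in the strain lists.
_SUB = re.compile(r"^([A-Z])(\d+)([A-Z])$")   # e.g. N501Y
_DEL = re.compile(r"^(\d+)(?:-(\d+))?del$")   # e.g. 144del, 69-70del
_INS = re.compile(r"^ins(\d+)([A-Z]+)$")      # e.g. ins214EPE


def parse_mutation(label):
    """Return a dict describing a substitution, deletion, or insertion."""
    label = label.strip()
    m = _SUB.match(label)
    if m:
        return {"kind": "sub", "ref": m.group(1), "pos": int(m.group(2)), "alt": m.group(3)}
    m = _DEL.match(label)
    if m:
        start = int(m.group(1))
        end = int(m.group(2)) if m.group(2) else start
        return {"kind": "del", "start": start, "end": end}
    m = _INS.match(label)
    if m:
        return {"kind": "ins", "pos": int(m.group(1)), "residues": m.group(2)}
    raise ValueError(f"unrecognized mutation label: {label!r}")


def parse_mutation_list(text):
    """Parse a comma-separated list of labels."""
    return [parse_mutation(tok) for tok in text.split(",") if tok.strip()]


# Alpha strain list as given in the text.
ALPHA = "69-70del, 144del, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H"
alpha = parse_mutation_list(ALPHA)
print(len(alpha))  # 9
```

Counting parsed entries this way gives a simple consistency check between each strain's stated number of mutations and its list.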
[0167] I. Preparation of pseudoviruses
[0168] Co-transfecting 293T cells with a plasmid expressing a viral membrane protein and the backbone plasmid pNL4-3R-E-luciferase yields pseudoviruses that are infectious but replication-incompetent, with infectivity similar to that of live virus. The backbone plasmid pNL4-3R-E-luciferase is the luciferase-containing backbone plasmid pNL4-3R-E (the "vector with the luciferase gene containing backbone pNL4-3R-E" in the literature): Wang Q, Liu L, Ren W, Gettie A, Wang H, Liang Q, Shi X, Montefiori DC, Zhou T, Zhang L. Cell Rep. 2019.
[0169] The gene encoding the membrane protein of the virus (the virus being any one of the seven novel coronavirus strains or any one of the five sarbecovirus strains mentioned above) was inserted between the BamHI and EcoRI restriction sites of the pcDNA3.1(+) vector to obtain a plasmid expressing the viral membrane protein. The plasmid expressing the membrane protein and the backbone plasmid pNL4-3R-E-luciferase were co-transfected into 293T cells and incubated at 37°C (using DMEM medium containing 10% fetal bovine serum). The cell culture supernatant was collected 60 hours after transfection; this was the viral solution containing the corresponding pseudovirus.
[0170] The membrane protein of the wild-type novel coronavirus is shown in Sequence 17 of the sequence listing. Using the gene shown in Sequence 18 of the sequence listing (which encodes the protein shown in Sequence 17) as the gene encoding the viral membrane protein and performing the above steps yields the wild-type novel coronavirus pseudovirus. Introducing the corresponding mutations into Sequence 18 of the sequence listing (Table 1 for the novel coronavirus Alpha strain, Table 2 for the Beta strain, Table 3 for the Gamma strain, and Table 4 for the Delta strain), using the mutated sequence as the gene encoding the viral membrane protein, and performing the above steps yields the novel coronavirus Alpha strain pseudovirus, Beta strain pseudovirus, Gamma strain pseudovirus, and Delta strain pseudovirus, respectively. The gene encoding the membrane protein of the novel coronavirus Omicron BA.1 strain is shown in Sequence 19 of the sequence listing; performing the above steps with it yields the novel coronavirus Omicron BA.1 strain pseudovirus. The gene encoding the membrane protein of the novel coronavirus Omicron BA.2 strain is shown in Sequence 20 of the sequence listing; performing the above steps with it yields the novel coronavirus Omicron BA.2 strain pseudovirus.
[0171] Table 1
[0172] Protein mutation | DNA mutation (positions in Sequence 18)
69-70del (deletion of two amino acid residues, "HV") | Deletion of "CACGTG" at positions 205-210
144del (deletion of one amino acid residue, "Y") | Deletion of "TAT" at positions 430-432
N501Y (substitution of one amino acid residue) | "A" at position 1501 mutated to "T"
A570D (substitution of one amino acid residue) | "C" at position 1709 mutated to "A"
D614G (substitution of one amino acid residue) | "A" at position 1841 mutated to "G"
P681H (substitution of one amino acid residue) | "C" at position 2042 mutated to "A"
T716I (substitution of one amino acid residue) | "C" at position 2147 mutated to "T"
S982A (substitution of one amino acid residue) | "AG" at positions 2944-2945 mutated to "GC"
D1118H (substitution of one amino acid residue) | "G" at position 3352 mutated to "C"
[0173] Table 2
[0174] Protein mutation | DNA mutation (positions in Sequence 18)
L18F (substitution of one amino acid residue) | "C" at position 52 mutated to "T", and "G" at position 54 mutated to "C"
D80A (substitution of one amino acid residue) | "A" at position 239 mutated to "C"
D215G (substitution of one amino acid residue) | "A" at position 644 mutated to "G"
242-244del (deletion of three amino acid residues) | Deletion of "CTGGCCCTG" at positions 724-732
S305T (substitution of one amino acid residue) | "G" at position 914 mutated to "C"
K417N (substitution of one amino acid residue) | "G" at position 1251 mutated to "C"
E484K (substitution of one amino acid residue) | "G" at position 1450 mutated to "A"
N501Y (substitution of one amino acid residue) | "A" at position 1501 mutated to "T", and "T" at position 1503 mutated to "C"
D614G (substitution of one amino acid residue) | "A" at position 1841 mutated to "G"
A701V (substitution of one amino acid residue) | "CC" at positions 2102-2103 mutated to "TG"
[0175] Table 3
[0176]
[0177]
[0178] Table 4
[0179] Protein mutation | DNA mutation (positions in Sequence 18)
T19R (substitution of one amino acid residue) | "A" at position 55 mutated to "C", and "C" at position 56 mutated to "G"
G142D (substitution of one amino acid residue) | "G" at position 425 mutated to "A"
156-157del (deletion of two amino acid residues) | Deletion of "AGTTCC" at positions 467-472
R158G (substitution of one amino acid residue) | "C" at position 474 mutated to "T"
A222V (substitution of one amino acid residue) | "C" at position 665 mutated to "T"
L452R (substitution of one amino acid residue) | "T" at position 1355 mutated to "G"
T478K (substitution of one amino acid residue) | "C" at position 1433 mutated to "A"
D614G (substitution of one amino acid residue) | "A" at position 1841 mutated to "G"
P681R (substitution of one amino acid residue) | "CCT" at positions 2041-2043 mutated to "AGA"
D950N (substitution of one amino acid residue) | "G" at position 2848 mutated to "A", and "C" at position 2850 mutated to "T"
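The DNA-level edits listed in Tables 1 to 4 are given as positions in Sequence 18. A minimal sketch of applying such edits in silico follows. It assumes that all positions are 1-based and refer to the unmodified sequence, and it uses a short toy sequence rather than Sequence 18 itself; the function name `apply_dna_edits` and the tuple layout are hypothetical.

```python
def apply_dna_edits(seq, edits):
    """Apply substitutions and deletions given 1-based positions in the ORIGINAL sequence.

    Each edit is (start, ref, alt): `ref` is the expected original text beginning at
    `start`, and `alt` is its replacement (an empty string for a deletion).
    Overlapping edits are not supported by this sketch.
    """
    seq = seq.upper()
    # Work from the 3' end so that deletions do not shift earlier coordinates.
    for start, ref, alt in sorted(edits, key=lambda e: e[0], reverse=True):
        i = start - 1
        found = seq[i:i + len(ref)]
        if found != ref:
            raise ValueError(f"position {start}: expected {ref!r}, found {found!r}")
        seq = seq[:i] + alt + seq[i + len(ref):]
    return seq


# Toy example (not Sequence 18): delete "CACGTG" at 4-9 and change "A" at 10 to "T".
toy = "ATGCACGTGAAC"
edits = [(4, "CACGTG", ""), (10, "A", "T")]
mutated = apply_dna_edits(toy, edits)
print(mutated)  # ATGTAC
```

Checking the reference bases before editing guards against off-by-one errors when transcribing positions from a table.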
[0180] The membrane protein of SARS-CoV-1 is shown in Sequence 21 of the sequence listing, and the coding gene of the membrane protein is shown in Sequence 22 of the sequence listing. The above steps yield SARS-CoV-1 pseudoviruses. The membrane protein of Pangolin CoV GD is shown in Sequence 23 of the sequence listing, and the coding gene of the membrane protein is shown in Sequence 24 of the sequence listing. The above steps yield Pangolin CoV GD pseudoviruses. The membrane protein of Pangolin CoV GX is shown in Sequence 25 of the sequence listing, and the coding gene of the membrane protein is shown in Sequence 26 of the sequence listing. The above steps yield Pangolin CoV GX pseudoviruses. The membrane protein of Bat CoV WIV16 is shown in Sequence 27 of the sequence listing, and the coding gene of the membrane protein is shown in Sequence 28 of the sequence listing. The above steps yield Bat CoV WIV16 pseudoviruses. The membrane protein of Bat CoV RaTG13 is shown in Sequence 29 of the sequence listing, and the coding gene of the membrane protein is shown in Sequence 30 of the sequence listing. The above steps yield Bat CoV RaTG13 pseudoviruses.
[0181] II. Detection of Neutralizing Activity of Monoclonal Antibodies
[0182] The test antibodies are: TH-34-Fc (TH-34-Fc antibody solution prepared in Example 2), TH-41-Fc (TH-41-Fc antibody solution prepared in Example 2), TH-44-Fc (TH-44-Fc antibody solution prepared in Example 2) or TH-40-Fc (TH-40-Fc antibody solution prepared in Example 2).
[0183] 1. Dilute the test antibody in DMEM medium containing 10% FBS to obtain an antibody dilution.
[0184] 2. Mix 100 μl of the antibody dilution obtained in step 1 with 50 μl of the virus solution prepared in step I (virus content of 100 TCID50) and incubate at 37°C for 1 hour. Set up a blank control in which the 100 μl of antibody dilution is replaced with 100 μl of DMEM medium containing 10% FBS.
[0185] 3. After completing step 2, add 50 μl of hACE2-HeLa cell suspension (containing approximately 2×10⁴ cells) and incubate at 37°C for 48 hours.
[0186] 4. After completing step 3, add 100 μl of PBS buffer and 50 μl of cell lysis buffer (Bright-Glo™ Luciferase Assay System, Promega, E2650), incubate for 2 minutes, and then measure the luciferase activity with a chemiluminescence analyzer.
[0187] Three replicates were set for each treatment, and the average value of the results was taken.
[0188] Neutralization activity = (luminescence intensity of blank control group - luminescence intensity of experimental group with added antibody dilution) / luminescence intensity of blank control group × 100%.
[0189] The antibody concentration corresponding to a neutralizing activity of 50% is the IC50 value.
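The neutralization formula and the IC50 definition above can be sketched numerically. The specification does not say how the 50% point was located on the dose-response curve, so the log-linear interpolation between adjacent dilutions used here is an assumption (curve fitting is another common choice); the function names and all signal values are hypothetical.

```python
import math


def neutralization_percent(blank_signal, sample_signal):
    """Neutralization (%) = (blank - sample) / blank x 100, per the formula above."""
    return (blank_signal - sample_signal) / blank_signal * 100.0


def mean(values):
    """Average of replicate readings (three replicates per treatment in the text)."""
    return sum(values) / len(values)


def ic50_by_interpolation(concs, neut):
    """Estimate the concentration giving 50% neutralization.

    Interpolates linearly on log10(concentration) between the two adjacent points
    that bracket 50%. Returns None if 50% is not bracketed.
    """
    pairs = sorted(zip(concs, neut))
    for (c_lo, n_lo), (c_hi, n_hi) in zip(pairs, pairs[1:]):
        if (n_lo - 50.0) * (n_hi - 50.0) <= 0 and n_lo != n_hi:
            frac = (50.0 - n_lo) / (n_hi - n_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Made-up readings: blank control replicates and mean signals at three concentrations.
blank = mean([1000.0, 1100.0, 900.0])
concs = [0.01, 0.1, 1.0]                      # μg/ml, hypothetical dilution series
signals = [800.0, 400.0, 100.0]
neut = [neutralization_percent(blank, s) for s in signals]
ic50 = ic50_by_interpolation(concs, neut)
print(neut)
print(ic50)
```

With these made-up readings the 50% point falls between the 0.01 and 0.1 μg/ml dilutions.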
[0190] The four antibodies TH-34-Fc, TH-41-Fc, TH-44-Fc, and TH-40-Fc all exhibited strong neutralizing activity against both wild-type SARS-CoV-2 and naturally occurring variants. The IC50 values are shown in Table 5. In Table 5, data are expressed in μg/ml. In Table 5, ND indicates that TH-40-Fc has no neutralizing activity against Omicron BA.2.
[0191] The four antibodies TH-34-Fc, TH-41-Fc, TH-44-Fc, and TH-40-Fc all exhibited strong neutralizing activity against the other five sarbecoviruses. The IC50 values are shown in Table 6. In Table 6, data are expressed in μg/ml.
[0192] Table 5
[0193]
[0194] Table 6
[0195] Nanobody | SARS-CoV-1 | Pangolin CoV GD | Pangolin CoV GX | Bat CoV WIV16 | Bat CoV RaTG13
TH-34-Fc | 0.016 | 0.034 | 0.501 | 0.050 | 0.037
TH-41-Fc | 0.058 | 0.015 | 0.313 | 0.037 | 0.031
TH-44-Fc | 0.040 | 0.041 | 0.347 | 0.098 | 0.042
TH-40-Fc | 0.080 | 0.012 | 0.015 | 0.008 | 0.010
[0196] Example 4: Preparation of Nanobodies
[0197] I. Construction of Recombinant Expression Vectors
[0198] 1. Insert the double-stranded DNA molecule shown in Sequence 31 of the sequence listing into the NotI and BamHI restriction sites of the pVRC8400 vector to obtain the pVRC8400-His tag vector. In Sequence 31 of the sequence listing, nucleotides 1-57 encode the signal peptide, nucleotides 58-65 form the NotI restriction recognition sequence, nucleotides 81-86 form the NheI restriction recognition sequence, nucleotides 87-104 encode the His6 tag, and nucleotides 105-107 are the stop codons.
[0199] 2. Replace the small fragment between the NotI and NheI restriction enzyme recognition sequences of the pVRC8400-His tag vector (containing the NotI and NheI restriction enzyme recognition sequences, i.e., GCGGCCGCGGAGGTGGAGGTAGTGCTAGC) with the double-stranded DNA molecule shown in Sequence 9 of the sequence listing to obtain the recombinant plasmid TH-34. The recombinant plasmid TH-34 expresses the following fusion protein, which, from N-terminus to C-terminus, consists of a signal peptide, the protein segment shown in Sequence 8 (TH-34 nanobody), and a His6 tag. In cells, the signal peptide is cleaved, leaving the following active protein, which, from N-terminus to C-terminus, consists of the protein segment shown in Sequence 8 (TH-34 nanobody) and a His6 tag.
[0200] 3. Replace the small fragment between the NotI and NheI restriction enzyme recognition sequences of the pVRC8400-His tag vector (containing the NotI and NheI restriction enzyme recognition sequences, i.e., GCGGCCGCGGAGGTGGAGGTAGTGCTAGC) with the double-stranded DNA molecule shown in Sequence 11 of the sequence listing to obtain the recombinant plasmid TH-41. The recombinant plasmid TH-41 expresses the following fusion protein, which, from N-terminus to C-terminus, consists of a signal peptide, the protein segment shown in Sequence 10 (TH-41 nanobody), and a His6 tag. In cells, the signal peptide is cleaved, leaving the following active protein, which, from N-terminus to C-terminus, consists of the protein segment shown in Sequence 10 (TH-41 nanobody) and a His6 tag.
[0201] 4. Replace the small fragment between the NotI and NheI restriction enzyme recognition sequences of the pVRC8400-His tag vector (containing the NotI and NheI restriction enzyme recognition sequences, i.e., GCGGCCGCGGAGGTGGAGGTAGTGCTAGC) with the double-stranded DNA molecule shown in Sequence 13 of the sequence listing to obtain the recombinant plasmid TH-44. The recombinant plasmid TH-44 expresses the following fusion protein, which, from N-terminus to C-terminus, consists of a signal peptide, the protein segment shown in Sequence 12 (TH-44 nanobody), and a His6 tag. In cells, the signal peptide is cleaved, leaving the following active protein, which, from N-terminus to C-terminus, consists of the protein segment shown in Sequence 12 (TH-44 nanobody) and a His6 tag.
[0202] 5. Replace the small fragment between the NotI and NheI restriction enzyme recognition sequences of the pVRC8400-His tag vector (containing the NotI and NheI restriction enzyme recognition sequences, i.e., GCGGCCGCGGAGGTGGAGGTAGTGCTAGC) with the double-stranded DNA molecule shown in Sequence 15 of the sequence listing to obtain the recombinant plasmid TH-40. The recombinant plasmid TH-40 expresses the following fusion protein, which, from N-terminus to C-terminus, consists of a signal peptide, the protein segment shown in Sequence 14 (TH-40 nanobody), and a His6 tag. In cells, the signal peptide is cleaved, leaving the following active protein, which, from N-terminus to C-terminus, consists of the protein segment shown in Sequence 14 (TH-40 nanobody) and a His6 tag.
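The 29-nucleotide fragment quoted in the steps above (GCGGCCGCGGAGGTGGAGGTAGTGCTAGC) can be inspected with a short script that locates standard restriction recognition sequences and translates the stretch between them. The recognition sequences for NotI (GCGGCCGC), NheI (GCTAGC), and BamHI (GGATCC) are standard values; the helper names and the reduced codon table are hypothetical, and the offset of 57 assumes the fragment starts at nucleotide 58 of Sequence 31 as described for the vector.

```python
SITES = {"NotI": "GCGGCCGC", "NheI": "GCTAGC", "BamHI": "GGATCC"}
CODONS = {"GGA": "G", "GGT": "G", "AGT": "S"}  # only the codons needed for this fragment

FRAGMENT = "GCGGCCGCGGAGGTGGAGGTAGTGCTAGC"


def find_sites(seq, sites):
    """Return {enzyme: 1-based start} for each recognition sequence present in seq."""
    return {name: seq.find(motif) + 1 for name, motif in sites.items() if motif in seq}


def translate(dna, table):
    """Translate in frame using the given codon table."""
    return "".join(table[dna[i:i + 3]] for i in range(0, len(dna), 3))


found = find_sites(FRAGMENT, SITES)
not_end = found["NotI"] - 1 + len(SITES["NotI"])   # 0-based index just past the NotI site
nhe_start = found["NheI"] - 1                      # 0-based index of the NheI site
spacer = FRAGMENT[not_end:nhe_start]

print(found)                      # {'NotI': 1, 'NheI': 24}
print(spacer)                     # GGAGGTGGAGGTAGT
print(translate(spacer, CODONS))  # GGGGS
print(found["NheI"] + 57)         # 81
```

In this fragment the NotI and NheI recognition sequences are found and BamHI is not, and the 15 nucleotides between the two sites translate to GGGGS.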
[0203] II. Preparation of Nanobody Proteins
[0204] The recombinant plasmids are: recombinant plasmid TH-34, recombinant plasmid TH-41, recombinant plasmid TH-44, or recombinant plasmid TH-40.
[0205] 1. Transfect the recombinant plasmid prepared in step I into 293F cells, culture in SMM 293-TII medium for 72 h, then centrifuge at 4°C and 4000 rpm for 30 min and collect the supernatant.
[0206] 2. Affinity chromatography
[0207] Affinity chromatography column specifications: length 3 cm, inner diameter 1 cm;
[0208] Affinity chromatography column packing material: nickel column beads (purchased from Qiagen, catalog number 30230).
[0209] Perform the following steps in sequence: ① Load 200 mL of the supernatant obtained in step 1 onto the affinity chromatography column and incubate at 4°C for 3 hours; ② Wash the column with 100 mL of HEPES buffer (pH 7.2, 1 M) containing 20 mM imidazole; ③ Elute the target protein with 30 mL of HEPES buffer (pH 7.2, 1 M) containing 500 mM imidazole and collect the post-column solution.
[0210] 3. Take the post-column solution obtained in step 2, concentrate it with an ultrafiltration concentrator, and exchange the buffer to PBS (pH 7.2, 10 mM) to obtain 1 mL of antibody solution (antibody concentration approximately 4 mg/mL).
[0211] Step II was performed with the recombinant plasmid TH-34, and the resulting antibody solution was named TH-34 antibody solution.
[0212] Step II was performed with the recombinant plasmid TH-41, and the resulting antibody solution was named TH-41 antibody solution.
[0213] Step II was performed with the recombinant plasmid TH-44, and the resulting antibody solution was named TH-44 antibody solution.
[0214] Step II was performed with the recombinant plasmid TH-40, and the resulting antibody solution was named TH-40 antibody solution.
[0215] Example 5: Detection of the neutralizing activity of nanobodies against SARS-CoV-2 and SARS-CoV-1
[0216] The tested pseudoviruses were: the wild-type novel coronavirus pseudovirus or the SARS-CoV-1 pseudovirus prepared in Example 3.
[0217] The test antibodies were: TH-34 (TH-34 antibody solution prepared in Example 4), TH-41 (TH-41 antibody solution prepared in Example 4), TH-44 (TH-44 antibody solution prepared in Example 4), or TH-40 (TH-40 antibody solution prepared in Example 4).
[0218] The neutralizing activity of the test antibody was detected using the same method as step II of Example 3.
[0219] The results are shown in Figure 4 and Figure 5.
[0220] The four antibodies TH-34, TH-41, TH-44, and TH-40 all exhibited strong neutralizing activity against both wild-type SARS-CoV-2 and SARS-CoV-1. The IC50 values against wild-type SARS-CoV-2 were 0.027, 0.021, 0.020, and 0.053 μg/ml, respectively, and the IC50 values against SARS-CoV-1 were 0.371, 0.125, 0.187, and 0.113 μg/ml, respectively.
[0221] The present invention has been described in detail above. For those skilled in the art, the invention can be practiced in a wide range of ways with equivalent parameters, concentrations, and conditions without departing from its spirit and scope, and without requiring unnecessary experiments. Although specific embodiments have been given, it should be understood that further modifications can be made to the invention. In summary, according to the principles of the invention, this application is intended to include any changes, uses, or improvements to the invention, including changes made using conventional techniques known in the art that depart from the scope disclosed herein. Some of the essential features can be applied within the scope of the following appended claims. sequence list <110> Tsinghua University <120> Broad-spectrum neutralizing antibodies against SARS-CoV-2 and other sabeviruses and their applications <130> CGGNQYX226060 <160> 32 <170> SIPOSequenceListing 1.0 <210> 1 <211> 278 <212> PRT <213> Artificial Sequence <400> 1 Met Leu Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly Ala Val Phe 1 5 10 15 Val Ser Pro Ser Gln Glu Ile His Ala Arg Phe Arg Arg Gly Ala Arg 20 25 30 Gly Arg Val Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr 35 40 45 Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser 50 55 60 Val Tyr Ala Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr 65 70 75 80 Ser Val Leu Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly 85 90 95 Val Ser Pro Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala 100 105 110 Asp Ser Phe Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly 115 120 125 Gln Thr Gly Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe 130 135 140 Thr Gly Cys Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val 145 150 155 160 Gly Gly Asn Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu 165 170 175 Lys Pro Phe Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser 180 185 190 Thr Pro Cys Asn Gly Val Glu Gly Phe 
Asn Cys Tyr Phe Pro Leu Gln 195 200 205 Ser Tyr Gly Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg 210 215 220 Val Val Val Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys 225 230 235 240 Gly Pro Lys Lys Ser Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe 245 250 255 Trp Ser His Pro Gln Phe Glu Lys Asp Tyr Lys Asp Asp Asp Asp Lys 260 265 270 His His His His His His 275 <210> 2 <211> 837 <212> DNA <213> Artificial Sequence <400> 2 atgctgcgcg gactgtgctg cgtgctgcta ctgtgcggcg ccgtgttcgt gagccccagc 60 caggagatcc acgcccgatt caggagagga gccagaggac gcgtgcagcc caccgagagc 120 atcgtgcgct tccccaacat caccaacctg tgccccttcg gcgaggtgtt caacgccacc 180 cgcttcgcca gcgtgtacgc ctggaaccgc aagcgcatca gcaactgcgt ggccgactac 240 agcgtgctgt acaacagcgc cagcttcagc accttcaagt gctacggcgt gagccccacc 300 aagctgaacg acctgtgctt caccaacgtg tacgccgaca gcttcgtgat ccgcggcgac 360 gaggtgcgcc agatcgcccc cggccagacc ggcaagatcg ccgactacaa ctacaagctg 420 cccgacgact tcaccggctg cgtgatcgcc tggaacagca acaacctgga cagcaaggtg 480 ggcggcaact acaactacct gtaccgcctg ttccgcaaga gcaacctgaa gcccttcgag 540 cgcgacatca gcaccgagat ctaccaggcc ggcagcaccc cctgcaacgg cgtggagggc 600 ttcaactgct acttccccct gcagagctac ggcttccagc ccaccaacgg cgtgggctac 660 cagccctacc gcgtggtggt gctgagcttc gagctgctgc acgcccccgc caccgtgtgc 720 ggccccaaga agagcaccaa cctggtgaag aacaagtgcg tgaacttctg gagccacccc 780 cagttcgaga aggactacaa ggacgacgac gacaagcacc accaccacca ccactga 837 <210> 3 <211> 1269 <212> PRT <213> Artificial Sequence <400> 3 Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser Gln Cys Val 1 5 10 15 Asn Leu Thr Thr Arg Thr Gln Leu Pro Pro Ala Tyr Thr Asn Ser Phe 20 25 30 Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Val Leu 35 40 45 His Ser Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp 50 55 60 Phe His Ala Ile His Val Ser Gly Thr Asn Gly Thr Lys Arg Phe Asp 65 70 75 80 Asn Pro Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Thr Glu 85 90 95 Lys Ser Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ser 100 105 110 Lys 
Thr Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile 115 120 125 Lys Val Cys Glu Phe Gln Phe Cys Asn Asp Pro Phe Leu Gly Val Tyr 130 135 140 Tyr His Lys Asn Asn Lys Ser Trp Met Glu Ser Glu Phe Arg Val Tyr 145 150 155 160 Ser Ser Ala Asn Asn Cys Thr Phe Glu Tyr Val Ser Gln Pro Phe Leu 165 170 175 Met Asp Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe 180 185 190 Val Phe Lys Asn Ile Asp Gly Tyr Phe Lys Ile Tyr Ser Lys His Thr 195 200 205 Pro Ile Asn Leu Val Arg Asp Leu Pro Gln Gly Phe Ser Ala Leu Glu 210 215 220 Pro Leu Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr 225 230 235 240 Leu Leu Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asp Ser Ser Ser 245 250 255 Gly Trp Thr Ala Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Pro 260 265 270 Arg Thr Phe Leu Leu Lys Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala 275 280 285 Val Asp Cys Ala Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys 290 295 300 Ser Phe Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val 305 310 315 320 Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys 325 330 335 Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val Tyr Ala 340 345 350 Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu 355 360 365 Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro 370 375 380 Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe 385 390 395 400 Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly 405 410 415 Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys 420 425 430 Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn 435 440 445 Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe 450 455 460 Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys 465 470 475 480 Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly 485 490 495 Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val 500 505 510 Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly Pro Lys 515 520 525 Lys Ser 
Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn 530 535 540 Gly Leu Thr Gly Thr Gly Val Leu Thr Glu Ser Asn Lys Lys Phe Leu 545 550 555 560 Pro Phe Gln Gln Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val 565 570 575 Arg Asp Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe 580 585 590 Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val 595 600 605 Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Glu Val Pro Val Ala Ile 610 615 620 His Ala Asp Gln Leu Thr Pro Thr Trp Arg Val Tyr Ser Thr Gly Ser 625 630 635 640 Asn Val Phe Gln Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val 645 650 655 Asn Asn Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala 660 665 670 Ser Tyr Gln Thr Gln Thr Asn Ser Pro Gly Ser Ala Ser Ser Val Ala 675 680 685 Ser Gln Ser Ile Ile Ala Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser 690 695 700 Val Ala Tyr Ser Asn Asn Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile 705 710 715 720 Ser Val Thr Thr Glu Ile Leu Pro Val Ser Met Thr Lys Thr Ser Val 725 730 735 Asp Cys Thr Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ser Asn Leu 740 745 750 Leu Leu Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr 755 760 765 Gly Ile Ala Val Glu Gln Asp Lys Asn Thr Gln Glu Val Phe Ala Gln 770 775 780 Val Lys Gln Ile Tyr Lys Thr Pro Pro Ile Lys Asp Phe Gly Gly Phe 785 790 795 800 Asn Phe Ser Gln Ile Leu Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser 805 810 815 Phe Ile Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly 820 825 830 Phe Ile Lys Gln Tyr Gly Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp 835 840 845 Leu Ile Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu 850 855 860 Leu Thr Asp Glu Met Ile Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly 865 870 875 880 Thr Ile Thr Ser Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile 885 890 895 Pro Phe Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr 900 905 910 Gln Asn Val Leu Tyr Glu Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn 915 920 925 Ser Ala Ile Gly Lys Ile Gln Asp Ser Leu Ser Ser Thr Ala Ser Ala 930 935 940 Leu Gly Lys 
Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn 945 950 955 960 Thr Leu Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val 965 970 975 Leu Asn Asp Ile Leu Ser Arg Leu Asp Pro Pro Glu Ala Glu Val Gln 980 985 990 Ile Asp Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val 995 1000 1005 Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu 1010 1015 1020 Ala Ala Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val 1025 1030 1035 1040 Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ser Ala 1045 1050 1055 Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ala Gln Glu 1060 1065 1070 Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His Asp Gly Lys Ala His 1075 1080 1085 Phe Pro Arg Glu Gly Val Phe Val Ser Asn Gly Thr His Trp Phe Val 1090 1095 1100 Thr Gln Arg Asn Phe Tyr Glu Pro Gln Ile Ile Thr Thr Asp Asn Thr 1105 1110 1115 1120 Phe Val Ser Gly Asn Cys Asp Val Val Ile Gly Ile Val Asn Asn Thr 1125 1130 1135 Val Tyr Asp Pro Leu Gln Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu 1140 1145 1150 Asp Lys Tyr Phe Lys Asn His Thr Ser Pro Asp Val Asp Leu Gly Asp 1155 1160 1165 Ile Ser Gly Ile Asn Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp 1170 1175 1180 Arg Leu Asn Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu 1185 1190 1195 1200 Gln Glu Leu Gly Lys Tyr Glu Gln Tyr Ile Lys Leu Glu Val Leu Phe 1205 1210 1215 Gln Gly Pro Gly Gly Gly Ser Gly Gly Gly Ser Gly Tyr Ile Pro Glu 1220 1225 1230 Ala Pro Arg Asp Gly Gln Ala Tyr Val Arg Lys Asp Gly Glu Trp Val 1235 1240 1245 Leu Leu Ser Thr Phe Leu Gly Trp Ser His Pro Gln Phe Glu Lys His 1250 1255 1260 His His His His His 1265 <210> 4 <211> 3810 <212> DNA <213> Artificial Sequence <400> 4 atgttcgtgt tcctggtgct gctgcctctg gtgagcagcc agtgcgtgaa tctgaccacc 60 agaacccagc tgcctcctgc ctacaccaat agcttcacca gaggagttta ttatcccgat 120 aaggtgttca gaagtagtgt attacatagt acccaggacc tgttcctacc tttcttcagt 180 aacgtgacct ggttccacgc catccacgtg agcggcacca atggcaccaa gagattcgac 240 aatcctgtgc tgcctttcaa tgacggcgtg tacttcgcca gcaccgagaa 
gagcaatatc 300 atcagaggct ggatcttcgg caccaccttg gattccaaga ctcagagcct gctgattgta 360 aacaacgcta caaatgtggt gatcaaggtg tgcgagttcc agttctgcaa tgaccctttc 420 ctgggtgttt attatcataa gaacaacaag agctggatgg agagcgagtt ccgcgtatat 480 tcgtcggcta ataattgcac cttcgagtac gtgagccagc ctttcctgat ggacctggag 540 ggcaagcagg gcaatttcaa gaatctgaga gagttcgtgt tcaagaatat cgacggctac 600 ttcaagatct acagcaagca cacacccatt aatctggtga gagacctgcc tcagggcttc 660 agcgccctgg agcctctggt ggacctgcct atcggcatca atatcaccag attccagacc 720 ctgctggccc tgcacagatc atatcttaca ccaggcgatt cgtcaagcgg ttggaccgct 780 ggagctgcgg catattacgt gggctacctg cagcctagaa ccttcctgct gaagtacaat 840 gagaatggta cgataaccga cgcagttgat tgtgccctgg accctctgag cgagaccaag 900 tgcaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa tttcagagtg 960 cagcctaccg agagcatcgt gagattccct aatatcacca atctgtgccc tttcggcgag 1020 gtgttcaatg ccaccagatt cgccagcgtg tacgcatgga accgcaagcg gataagcaat 1080 tgcgtggccg actacagcgt gctgtacaat agcgccagct tcagcacctt caaatgttat 1140 ggtgtttcgc caacaaagct gaatgacctg tgcttcacca atgtgtacgc cgacagcttc 1200 gtgatcagag gcgacgaggt gagacagatc gcgccagggc agaccggcaa gatcgccgac 1260 tacaattaca agctgcctga cgacttcacc ggctgcgtga tcgcgtggaa ctctaacaat 1320 ctagattcga aagttggagg caattacaat tacctgtaca gactgttcag aaagagcaat 1380 ctgaagcctt tcgagagaga catcagcacc gagatctacc aggccggcag cacaccgtgt 1440 aatggcgtgg agggcttcaa ttgctacttc cctctgcaga gctacggctt ccagcctacc 1500 aatggcgtgg gctaccagcc ttacagagtg gtggtgctga gcttcgagct gctgcacgct 1560 cccgctaccg tgtgcggccc taagaagc accaatctgg tgaagaataa gtgcgtgaat 1620 ttcaatttca atggtctaac tggaacgggc gtgctgaccg agagcaataa gaagtttctt 1680 ccctttcaac aattcggcag agacatcgcc gacaccacag atgctgtaag agaccctcag 1740 accctggaga tcctggacat cactccgtgt agcttcggcg gcgtgagcgt gatcacaccg 1800 ggtaccaata ccagcaatca ggtggccgtg ctgtaccagg acgtgaattg caccgaggtg 1860 cctgtggcca tccacgccga ccagctgact cccacttgga gggtatattc cacgggaagc 1920 aatgtgttcc agaccagagc cggctgcctg atcggcgccg agcacgtgaa taatagctac 1980 
gagtgcgaca tccctatcgg cgccggcatc tgcgccagct accagaccca gaccaatagc 2040 cctggaagcg caagcagcgt ggccagccag agcatcatcg cctacacat gagcctgggc 2100 gccgagaata gcgtggccta cagcaataat agcatcgcca tccctaccaa tttcaccatc 2160 agcgtgacca ccgaaatatt accagtctcc atgaccaaga ccagcgtgga ctgcaccatg 2220 tacatctgcg gcgacagcac cgagtgcagc aatctgctgc tgcagtacgg cagcttctgc 2280 acccagctga atagagccct gaccggcatc gccgtggagc aggacaagaa tacccaggag 2340 gtgttcgccc aggtgaagca gatctacaag actccgccga tcaaggactt cggcggcttc 2400 aatttcagcc aaatactccc agatccaagc aagcctagca agaggagctt catcgaggac 2460 ctgctgttca ataaggtgac cctggccgac gccggcttca tcaagcagta cggcgactgc 2520 ctaggtgata ttgcggcaag agacctgatc tgcgcccaga agtttaacgg tttgacagta 2580 ctacctcctc tgctgaccga cgagatgata gcacaatata cgtcggcatt gctcgctggc 2640 acgatcacat cgggctggac ttcggcgcc ggagcagcgt tgcaaatccc ttcgccatg 2700 cagatggcct acagattcaa tggcatcggc gtgacccaga atgtgctgta cgagaatcag 2760 aagctgatcg ccaatcagtt caatagcgcc atcggcaaga tccaggacag cctgagcagc 2820 accgccagcg ccctgggcaa gctgcaggac gtggtgaatc agaatgccca ggccctgaat 2880 accctggtga agcagctgag cagcaatttc ggcgccatca gtagtgtact caacgatatc 2940 ctgagcagac tggacccgcc ggaggccgag gtgcaaattg atcgtcttat tactggcaga 3000 ctgcagagcc tgcagaccta cgtgacccag cagctgatca gagccgccga gatcagagcc 3060 agcgccaatc tggccgccac cagatgagc gagtgcgtgc tgggccag cagagagtg 3120 gacttctgcg gcaagggcta ccacctgatg agctccctc agagcgctcc acatggcgtg 3180 gtgttcctgc acgtgaccta cgtgcctgcc caggagaga atttcaccac cgcacccgca 3240 atctgccacg acggcaggc ccactccct agagaggggcg tgttcgtgag caatggcacc 3300 cactggttcg tgacccagag aaatttctac gagcctcaga tcatcaccac cgacaatacc 3360 ttcgtgagcg gcaatgcga cgtggtgatc gggatagtca atatactgt ctacgaccct 3420 ctgcagcctg agctggacag cttcaggag gagctggaca agtactca gatcacacc 3480 agccctgacg tggacctcgg tgatatttcg ggaatcaatg ccagcgtggt gatatccag 3540 aaggaaattg atcggctca cgaagtggcc aagaatctctga atgagagcct gatcgacctg 3600 caggagctgg gcaagtacga gcagtacatc aagctggaag ttctgttcca ggggcccgga 3660 ggaggaagtg gaggaggaag 
tggctatatt ccggaagcgc cgcgcgatgg ccaggcgtat 3720 gtgcgcaaag atggcgaatg ggtgctgctg agcacctttc tgggctggtc ccaccctcag 3780 ttcgagaagc accaccacca ccaccactga 3810 <210> 5 <211> 1211 <212> PRT <213> SARS‐CoV‐2 <400> 5 Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser Gln Cys Val 1 5 10 15 Asn Leu Thr Thr Arg Thr Gln Leu Pro Pro Ala Tyr Thr Asn Ser Phe 20 25 30 Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Val Leu 35 40 45 His Ser Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp 50 55 60 Phe His Ala Ile His Val Ser Gly Thr Asn Gly Thr Lys Arg Phe Asp 65 70 75 80 Asn Pro Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Thr Glu 85 90 95 Lys Ser Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ser 100 105 110 Lys Thr Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile 115 120 125 Lys Val Cys Glu Phe Gln Phe Cys Asn Asp Pro Phe Leu Gly Val Tyr 130 135 140 Tyr His Lys Asn Asn Lys Ser Trp Met Glu Ser Glu Phe Arg Val Tyr 145 150 155 160 Ser Ser Ala Asn Asn Cys Thr Phe Glu Tyr Val Ser Gln Pro Phe Leu 165 170 175 Met Asp Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe 180 185 190 Val Phe Lys Asn Ile Asp Gly Tyr Phe Lys Ile Tyr Ser Lys His Thr 195 200 205 Pro Ile Asn Leu Val Arg Asp Leu Pro Gln Gly Phe Ser Ala Leu Glu 210 215 220 Pro Leu Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr 225 230 235 240 Leu Leu Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asp Ser Ser Ser 245 250 255 Gly Trp Thr Ala Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Pro 260 265 270 Arg Thr Phe Leu Leu Lys Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala 275 280 285 Val Asp Cys Ala Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys 290 295 300 Ser Phe Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val 305 310 315 320 Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys 325 330 335 Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val Tyr Ala 340 345 350 Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu 355 360 365 Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr 
Gly Val Ser Pro 370 375 380 Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe 385 390 395 400 Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly 405 410 415 Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys 420 425 430 Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn 435 440 445 Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe 450 455 460 Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys 465 470 475 480 Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly 485 490 495 Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val 500 505 510 Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly Pro Lys 515 520 525 Lys Ser Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn 530 535 540 Gly Leu Thr Gly Thr Gly Val Leu Thr Glu Ser Asn Lys Lys Phe Leu 545 550 555 560 Pro Phe Gln Gln Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val 565 570 575 Arg Asp Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe 580 585 590 Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val 595 600 605 Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Glu Val Pro Val Ala Ile 610 615 620 His Ala Asp Gln Leu Thr Pro Thr Trp Arg Val Tyr Ser Thr Gly Ser 625 630 635 640 Asn Val Phe Gln Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val 645 650 655 Asn Asn Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala 660 665 670 Ser Tyr Gln Thr Gln Thr Asn Ser Pro Arg Arg Lys Arg Ser Val Ala 675 680 685 Ser Gln Ser Ile Ile Ala Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser 690 695 700 Val Ala Tyr Ser Asn Asn Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile 705 710 715 720 Ser Val Thr Thr Glu Ile Leu Pro Val Ser Met Thr Lys Thr Ser Val 725 730 735 Asp Cys Thr Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ser Asn Leu 740 745 750 Leu Leu Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr 755 760 765 Gly Ile Ala Val Glu Gln Asp Lys Asn Thr Gln Glu Val Phe Ala Gln 770 775 780 Val Lys Gln Ile Tyr Lys Thr Pro Pro Ile Lys Asp Phe 
Gly Gly Phe 785 790 795 800 Asn Phe Ser Gln Ile Leu Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser 805 810 815 Phe Ile Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly 820 825 830 Phe Ile Lys Gln Tyr Gly Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp 835 840 845 Leu Ile Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu 850 855 860 Leu Thr Asp Glu Met Ile Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly 865 870 875 880 Thr Ile Thr Ser Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile 885 890 895 Pro Phe Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr 900 905 910 Gln Asn Val Leu Tyr Glu Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn 915 920 925 Ser Ala Ile Gly Lys Ile Gln Asp Ser Leu Ser Ser Thr Ala Ser Ala 930 935 940 Leu Gly Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn 945 950 955 960 Thr Leu Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val 965 970 975 Leu Asn Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln 980 985 990 Ile Asp Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val 995 1000 1005 Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu 1010 1015 1020 Ala Ala Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val 1025 1030 1035 1040 Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ser Ala 1045 1050 1055 Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ala Gln Glu 1060 1065 1070 Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His Asp Gly Lys Ala His 1075 1080 1085 Phe Pro Arg Glu Gly Val Phe Val Ser Asn Gly Thr His Trp Phe Val 1090 1095 1100 Thr Gln Arg Asn Phe Tyr Glu Pro Gln Ile Ile Thr Thr Asp Asn Thr 1105 1110 1115 1120 Phe Val Ser Gly Asn Cys Asp Val Val Ile Gly Ile Val Asn Asn Thr 1125 1130 1135 Val Tyr Asp Pro Leu Gln Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu 1140 1145 1150 Asp Lys Tyr Phe Lys Asn His Thr Ser Pro Asp Val Asp Leu Gly Asp 1155 1160 1165 Ile Ser Gly Ile Asn Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp 1170 1175 1180 Arg Leu Asn Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu 1185 1190 1195 1200 Gln Glu 
Leu Gly Lys Tyr Glu Gln Tyr Ile Lys 1205 1210 <210> 6 <211> 1253 <212> PRT <213> Artificial Sequence <400> 6 Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly Ser Asp Leu 1 5 10 15 Asp Arg Cys Thr Thr Phe Asp Asp Val Gln Ala Pro Asn Tyr Thr Gln 20 25 30 His Thr Ser Ser Met Arg Gly Val Tyr Tyr Pro Asp Glu Ile Phe Arg 35 40 45 Ser Asp Thr Leu Tyr Leu Thr Gln Asp Leu Phe Leu Pro Phe Tyr Ser 50 55 60 Asn Val Thr Gly Phe His Thr Ile Asn His Thr Phe Gly Asn Pro Val 65 70 75 80 Ile Pro Phe Lys Asp Gly Ile Tyr Phe Ala Ala Thr Glu Lys Ser Asn 85 90 95 Val Val Arg Gly Trp Val Phe Gly Ser Thr Met Asn Asn Lys Ser Gln 100 105 110 Ser Val Ile Ile Ile Asn Asn Ser Thr Asn Val Val Ile Arg Ala Cys 115 120 125 Asn Phe Glu Leu Cys Asp Asn Pro Phe Phe Ala Val Ser Lys Pro Met 130 135 140 Gly Thr Gln Thr His Thr Met Ile Phe Asp Asn Ala Phe Asn Cys Thr 145 150 155 160 Phe Glu Tyr Ile Ser Asp Ala Phe Ser Leu Asp Val Ser Glu Lys Ser 165 170 175 Gly Asn Phe Lys His Leu Arg Glu Phe Val Phe Lys Asn Lys Asp Gly 180 185 190 Phe Leu Tyr Val Tyr Lys Gly Tyr Gln Pro Ile Asp Val Val Arg Asp 195 200 205 Leu Pro Ser Gly Phe Asn Thr Leu Lys Pro Ile Phe Lys Leu Pro Leu 210 215 220 Gly Ile Asn Ile Thr Asn Phe Arg Ala Ile Leu Thr Ala Phe Ser Pro 225 230 235 240 Ala Gln Asp Ile Trp Gly Thr Ser Ala Ala Ala Tyr Phe Val Gly Tyr 245 250 255 Leu Lys Pro Thr Thr Phe Met Leu Lys Tyr Asp Glu Asn Gly Thr Ile 260 265 270 Thr Asp Ala Val Asp Cys Ser Gln Asn Pro Leu Ala Glu Leu Lys Cys 275 280 285 Ser Val Lys Ser Phe Glu Ile Asp Lys Gly Ile Tyr Gln Thr Ser Asn 290 295 300 Phe Arg Val Val Pro Ser Gly Asp Val Val Arg Phe Pro Asn Ile Thr 305 310 315 320 Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Lys Phe Pro Ser 325 330 335 Val Tyr Ala Trp Glu Arg Lys Lys Ile Ser Asn Cys Val Ala Asp Tyr 340 345 350 Ser Val Leu Tyr Asn Ser Thr Phe Phe Ser Thr Phe Lys Cys Tyr Gly 355 360 365 Val Ser Ala Thr Lys Leu Asn Asp Leu Cys Phe Ser Asn Val Tyr Ala 370 375 380 Asp Ser Phe Val Val Lys Gly Asp Asp Val Arg Gln Ile Ala 
Pro Gly 385 390 395 400 Gln Thr Gly Val Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe 405 410 415 Met Gly Cys Val Leu Ala Trp Asn Thr Arg Asn Ile Asp Ala Thr Ser 420 425 430 Thr Gly Asn Tyr Asn Tyr Lys Tyr Arg Tyr Leu Arg His Gly Lys Leu 435 440 445 Arg Pro Phe Glu Arg Asp Ile Ser Asn Val Pro Phe Ser Pro Asp Gly 450 455 460 Lys Pro Cys Thr Pro Pro Ala Leu Asn Cys Tyr Trp Pro Leu Asn Asp 465 470 475 480 Tyr Gly Phe Tyr Thr Thr Thr Gly Ile Gly Tyr Gln Pro Tyr Arg Val 485 490 495 Val Val Leu Ser Phe Glu Leu Leu Asn Ala Pro Ala Thr Val Cys Gly 500 505 510 Pro Lys Leu Ser Thr Asp Leu Ile Lys Asn Gln Cys Val Asn Phe Asn 515 520 525 Phe Asn Gly Leu Thr Gly Thr Gly Val Leu Thr Pro Ser Ser Lys Arg 530 535 540 Phe Gln Pro Phe Gln Gln Phe Gly Arg Asp Val Ser Asp Phe Thr Asp 545 550 555 560 Ser Val Arg Asp Pro Lys Thr Ser Glu Ile Leu Asp Ile Ser Pro Cys 565 570 575 Ala Phe Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Ala Ser Ser 580 585 590 Glu Val Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Asp Val Ser Thr 595 600 605 Ala Ile His Ala Asp Gln Leu Thr Pro Ala Trp Arg Ile Tyr Ser Thr 610 615 620 Gly Asn Asn Val Phe Gln Thr Gln Ala Gly Cys Leu Ile Gly Ala Glu 625 630 635 640 His Val Asp Thr Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile 645 650 655 Cys Ala Ser Tyr His Thr Val Ser Leu Leu Arg Ser Thr Ser Gln Lys 660 665 670 Ser Ile Val Ala Tyr Thr Met Ser Leu Gly Ala Asp Ser Ser Ile Ala 675 680 685 Tyr Ser Asn Asn Thr Ile Ala Ile Pro Thr Asn Phe Ser Ile Ser Ile 690 695 700 Thr Thr Glu Val Met Pro Val Ser Met Ala Lys Thr Ser Val Asp Cys 705 710 715 720 Asn Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ala Asn Leu Leu Leu 725 730 735 Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Ser Gly Ile 740 745 750 Ala Ala Glu Gln Asp Arg Asn Thr Arg Glu Val Phe Ala Gln Val Lys 755 760 765 Gln Met Tyr Lys Thr Pro Thr Leu Lys Tyr Phe Gly Gly Phe Asn Phe 770 775 780 Ser Gln Ile Leu Pro Asp Pro Leu Lys Pro Thr Lys Arg Ser Phe Ile 785 790 795 800 Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly 
Phe Met 805 810 815 Lys Gln Tyr Gly Glu Cys Leu Gly Asp Ile Asn Ala Arg Asp Leu Ile 820 825 830 Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr 835 840 845 Asp Asp Met Ile Ala Ala Tyr Thr Ala Ala Leu Val Ser Gly Thr Ala 850 855 860 Thr Ala Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe 865 870 875 880 Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln Asn 885 890 895 Val Leu Tyr Glu Asn Gln Lys Gln Ile Ala Asn Gln Phe Asn Lys Ala 900 905 910 Ile Ser Gln Ile Gln Glu Ser Leu Thr Thr Thr Ser Thr Ala Leu Gly 915 920 925 Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu 930 935 940 Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn 945 950 955 960 Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp 965 970 975 Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln 980 985 990 Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala 995 1000 1005 Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val Asp Phe 1010 1015 1020 Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ala Ala Pro His 1025 1030 1035 1040 Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ser Gln Glu Arg Asn 1045 1050 1055 Phe Thr Thr Ala Pro Ala Ile Cys His Glu Gly Lys Ala Tyr Phe Pro 1060 1065 1070 Arg Glu Gly Val Phe Val Phe Asn Gly Thr Ser Trp Phe Ile Thr Gln 1075 1080 1085 Arg Asn Phe Phe Ser Pro Gln Ile Ile Thr Thr Asp Asn Thr Phe Val 1090 1095 1100 Ser Gly Asn Cys Asp Val Val Ile Gly Ile Ile Asn Asn Thr Val Tyr 1105 1110 1115 1120 Asp Pro Leu Gln Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu Asp Lys 1125 1130 1135 Tyr Phe Lys Asn His Thr Ser Pro Asp Val Asp Leu Gly Asp Ile Ser 1140 1145 1150 Gly Ile Asn Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp Arg Leu 1155 1160 1165 Asn Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu 1170 1175 1180 Leu Gly Lys Tyr Glu Gln Tyr Ile Lys Trp Pro Leu Glu Val Leu Phe 1185 1190 1195 1200 Gln Gly Pro Gly Gly Gly Ser Gly Gly Gly Ser Gly Tyr Ile Pro Glu 1205 1210 1215 Ala Pro Arg Asp 
Gly Gln Ala Tyr Val Arg Lys Asp Gly Glu Trp Val 1220 1225 1230 Leu Leu Ser Thr Phe Leu Gly His His His His His His Trp Ser His 1235 1240 1245 Pro Gln Phe Glu Lys 1250 <210> 7 <211> 3762 <212> DNA <213> Artificial Sequence <400> 7 atgttcatct tcctgctgtt cctgaccctg accagcggca gcgacctgga ccggtgcacc 60 accttcgacg acgtgcaggc ccccaactac acccagcaca ccagcagcat gcggggcgtg 120 tactaccccg acgagatctt ccggagcgac accctgtacc tgacccagga cctgttcctg 180 cccttctaca gcaacgtgac cggcttccac accatcaacc acaccttcgg caaccccgtg 240 atccccttca aggacggcat ctacttcgcc gccaccgaga agagcaacgt ggtgcggggc 300 tgggtgttcg gcagcaccat gaacaacaag agccagagcg tgatcatcat caacaacagc 360 accaacgtgg tgatccgggc ctgcaacttc gagctgtgcg acaacccctt cttcgccgtg 420 agcaagccca tgggcaccca gacccacacc atgatcttcg acaacgcctt caactgcacc 480 ttcgagtaca tcagcgacgc cttcagcctg gacgtgagcg agaagagcgg caacttcaag 540 cacctgcggg agttcgtgtt caagaacaag gacggcttcc tgtacgtgta caagggctac 600 cagcccatcg acgtggtgcg ggacctgccc agcggcttca acaccctgaa gcccatcttc 660 aagctgcccc tgggcatcaa catcaccaac ttccgggcca tcctgaccgc cttcagcccc 720 gcccaggaca tctggggcac cagcgccgcc gcctacttcg tgggctacct gaagcccacc 780 accttcatgc tgaagtacga cgagaacggc accatcaccg acgccgtgga ctgcagccag 840 aaccccctgg ccgagctgaa gtgcagtgtg aagagcttcg agatcgacaa gggcatctac 900 cagaccagca acttccgggt ggtgcccagc ggcgacgtgg tgcggttccc caacatcacc 960 aacctgtgcc ccttcggcga ggtgttcaac gccaccaagt tccccagcgt gtacgcctgg 1020 gagcggaaga agatcagcaa ctgcgtggcc gactacagcg tgctgtacaa cagcaccttc 1080 ttcagcacct tcaagtgcta cggcgtgagc gccaccaagc tgaacgacct gtgcttcagc 1140 aacgtgtacg ccgacagctt cgtggtgaag ggcgacgacg tgcggcagat cgcccccggc 1200 cagaccggcg tgatcgccga ctacaactac aagctgcccg acgacttcat gggctgcgtg 1260 ctggcctgga acacccggaa catcgacgcc accagcaccg gcaactacaa ctacaagtac 1320 cggtacctgc ggcacggcaa gctgcggccc ttcgagcggg acatcagcaa cgtgcccttc 1380 agccccgacg gcaagccctg cacccccccc gccctgaact gctactggcc cctgaacgac 1440 tacggcttct acaccactac cggcatcggc taccagccct accgggtggt ggtgctgagc 1500 ttcgagctgc 
tgaacgcccc cgccaccgtg tgcggcccca agctgagcac cgacctgatc 1560 aagaaccagt gcgtgaactt caacttcaac ggcctgaccg gcaccggcgt gctgaccccc 1620 agcagcaagc ggttccagcc cttccagcag ttcggccggg acgtgagcga cttcaccgac 1680 agcgtgcggg accccaagac cagcgagatc ctggacatca gcccctgcgc cttcggcggc 1740 gtgagcgtga tcacccccgg taccaacgcc agcagcgagg tggccgtgct gtaccaggac 1800 gtgaactgca ccgacgtgag caccgccatc cacgccgacc agctgacccc cgcctggcgg 1860 atctacagca ccggcaacaa cgtgttccag acccaggccg gctgcctgat cggcgccgag 1920 cacgtggaca ccagctacga gtgcgacatc cccatcggcg ccggcatctg cgccagctac 1980 cacaccgtga gcctgctgcg gagcaccagc cagaagagca tcgtggccta caccatgagc 2040 ctgggcgccg acagcagcat cgcctacagc aacaacacca tcgccatccc caccaacttc 2100 agcatcagca tcaccaccga ggtgatgccc gtgagcatgg ccaagaccag cgtggactgc 2160 aacatgtaca tctgcggcga cagcaccgag tgcgccaacc tgctgctgca gtacggcagc 2220 ttctgcaccc agctgaaccg ggccctgagc ggcatcgccg ccgagcagga ccggaacacc 2280 cgggaggtgt tcgcccaggt gaagcagatg tacaagaccc ccaccctgaa gtacttcggc 2340 ggcttcaact tcagccagat cctgcccgac cccctgaagc ccaccaagcg gagcttcatc 2400 gaggacctgc tgttcaacaa ggtgaccctg gccgacgccg gcttcatgaa gcagtacggc 2460 gagtgcctgg gcgacatcaa cgcccgggac ctgatctgcg cccagaagtt caacggcctg 2520 accgtgctgc cccccctgct gaccgacgac atgatcgccg cctacaccgc cgccctggtg 2580 agcggcaccg ccaccgccgg ctggaccttc ggcgccggcg ccgccctgca gatccccttc 2640 gccatgcaga tggcctaccg gttcaacggc atcggcgtga cccagaacgt gctgtacgag 2700 aaccagaagc agatcgccaa ccagttcaac aaggccatca gccagatcca ggagagcctg 2760 accaccacca gcaccgccct gggcaagctg caggacgtgg tgaaccagaa cgcccaggcc 2820 ctgaacaccc tggtgaagca gctgagcagc aacttcggcg ccatcagcag cgtgctgaac 2880 gacatcctga gccggctgga caaggtggag gccgaggtgc agatcgaccg gctgatcacc 2940 ggccggctgc agagcctgca gacctacgtg acccagcagc tgatccgggc cgccgagatc 3000 cgggccagcg ccaacctggc cgccaccaag atgagcgagt gcgtgctggg ccagagcaag 3060 cgggtggact tctgcggcaa gggctaccac ctgatgagct tcccccaggc cgccccccac 3120 ggcgtggtgt tcctgcacgt gacctacgtg cccagccagg agcggaactt caccaccgcc 3180 cccgccatct gccacgaggg 
caaggcctac ttcccccggg agggcgtgtt cgtgttcaac 3240 ggcaccagct ggttcatcac ccagcggaac ttcttcagcc cccagatcat caccaccgac 3300 aacaccttcg tgagcggcaa ctgcgacgtg gtgatcggca tcatcaacaa caccgtgtac 3360 gaccccctgc agcccgagct ggacagcttc aaggaggagc tggacaagta cttcaagaac 3420 cacaccagcc ccgacgtgga cctgggcgac atcagcggca tcaacgccag cgtggtgaac 3480 atccagaagg agatcgaccg gctgaacgag gtggccaaga acctgaacga gagcctgatc 3540 gacctgcagg agctgggcaa gtacgagcag tacatcaagt ggcccctgga agttctgttc 3600 caggggcccg gaggaggaag tggaggagga agtggctata ttccggaagc gccgcgcgat 3660 ggccaggcgt atgtgcgcaa agatggcgaa tgggtgctgc tgagcacctt tctgggccat 3720 caccatcacc atcactggtc ccaccctcag ttcgagaagt ga 3762 <210> 8 <211> 116 <212> PRT <213> Vicugna pacos <400> 8 Gln Val Gln Leu Gln Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Glu 1 5 10 15 Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly Ser Ile Ser Thr Leu Asn 20 25 30 Val Met Gly Trp Tyr Arg Gln Ala Pro Gly Lys Gln Arg Glu Leu Val 35 40 45 Ala Gln Ile Thr Leu Asp Gly Ser Pro Glu Tyr Ala Asp Ser Val Lys 50 55 60 Gly Arg Phe Thr Ile Thr Lys Asp Gly Ala Gln Ser Thr Leu Tyr Leu 65 70 75 80 Gln Met Asn Asn Leu Lys Pro Glu Asp Thr Ala Val Tyr Phe Cys Lys 85 90 95 Leu Glu Asn Gly Gly Phe Phe Tyr Tyr Trp Gly Gln Gly Thr Gln Val 100 105 110 Thr Val Ser Thr 115 <210> 9 <211> 348 <212> DNA <213> Vicugna pacos <400> 9 caggtgcagc tgcaggagtc ggggggaggc ttggtgcagc ctggggagtc tctgagactc 60 tcctgtgcag cctctggaag tatttctacg ttaaatgtca tgggctggta ccgccaggct 120 ccagggaagc agcgcgagtt ggtcgcacag attactcttg atggtagccc tgagtatgca 180 gactccgtga agggccgatt caccatcacc aaggacggcg cccagagcac gttgtatctg 240 caaatgaaca acttgaaacc tgaggacacg gccgtctatt tctgtaaact cgaaaacggc 300 ggattctttt actactgggg ccaggggacc caggtcacgg tctccaca 348 <210> 10 <211> 116 <212> PRT <213> Vicugna pacos <400> 10 Gln Val Gln Leu Val Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly 1 5 10 15 Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly Ser Ile Ser Thr Leu Asn 20 25 30 Val Met Gly Trp Tyr Arg Gln Ala Pro Gly Lys Gln Arg Glu Leu Val 35 
40 45 Ala Arg Ile Thr Leu Asp Gly Arg Pro Glu Tyr Ala Asp Ser Val Lys 50 55 60 Gly Arg Phe Thr Ile Thr Lys Asp Gly Ala Gln Ser Thr Leu Tyr Leu 65 70 75 80 Gln Met Asn Asn Leu Lys Pro Glu Asp Thr Ala Val Tyr Phe Cys Lys 85 90 95 Leu Glu Asn Gly Gly Phe Phe Tyr Tyr Trp Gly Gln Gly Thr Gln Val 100 105 110 Thr Val Ser Ser 115 <210> 11 <211> 348 <212> DNA <213> Vicugna pacos <400> 11 caggtgcagc tggtggagtc tgggggaggc ttggtgcagc ctggggggtc tctgagactc 60 tcctgtgcag cctctggaag tatttctacg ttaaatgtca tgggctggta ccgccaggct 120 ccagggaagc agcgcgagtt ggtcgcacgg attactcttg atggtagacc tgagtatgca 180 gactccgtga agggccgatt caccatcacc aaggacggcg cccagagcac gttgtatctg 240 caaatgaaca acttgaaacc tgaggacacg gccgtctatt tctgtaaact cgaaaacggc 300 ggattctttt actactgggg ccaggggacc caggtcaccg tctcctcg 348 <210> 12 <211> 116 <212> PRT <213> Vicugna pacos <400> 12 Gln Val Gln Leu Val Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly 1 5 10 15 Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly Ser Ile Ser Thr Leu Asn 20 25 30 Val Met Gly Trp Tyr Arg Gln Ala Pro Gly Lys Gln Arg Glu Leu Val 35 40 45 Ala Gln Ile Thr Leu Asp Gly Arg Pro Glu Tyr Ala Asp Ser Val Lys 50 55 60 Gly Arg Phe Thr Ile Thr Lys Asp Gly Ala Gln Ser Thr Leu Tyr Leu 65 70 75 80 Gln Met Asn Asn Leu Lys Pro Glu Asp Thr Ala Val Tyr Phe Cys Lys 85 90 95 Leu Glu Asn Gly Gly Phe Phe Tyr Tyr Trp Gly Gln Gly Thr Gln Val 100 105 110 Thr Val Ser Ser 115 <210> 13 <211> 348 <212> DNA <213> Vicugna pacos <400> 13 caggtgcagc tggtggagtc tgggggaggc ttggtgcagc ctggggggtc tctgagactc 60 tcctgtgcag cctctggaag tatttctacg ttaaatgtca tgggctggta ccgccaggct 120 ccagggaagc agcgcgagtt ggtcgcacag attactcttg atggtagacc tgagtatgca 180 gactccgtga agggccgatt caccatcacc aaggacggcg cccagagcac gttgtatctg 240 caaatgaaca acttgaaacc tgaggacacg gccgtctatt tctgtaaact cgaaaacggc 300 ggattctttt actactgggg ccaggggacc caggtcaccg tctcctcg 348 <210> 14 <211> 126 <212> PRT <213> Vicugna pacos <400> 14 Gln Val Gln Leu Val Glu Ser Gly Gly Gly Leu Val Gln Ala Gly Gly 1 5 10 15 Ser Leu Arg 
Leu Ser Cys Ala Ala Ser Gly Arg Ile Leu Ser Arg Tyr 20 25 30 Arg Met Gly Trp Phe Arg Gln Ala Pro Gly Lys Glu Arg Glu Phe Val 35 40 45 Ala Ala Val Ser Trp Ser Asp Gly Ser Thr Tyr Tyr Ala Asp Ser Val 50 55 60 Lys Gly Arg Phe Thr Ile Ser Arg Asp Asn Ala Lys Asn Thr Val Tyr 65 70 75 80 Leu Gln Met Asn Ser Leu Lys Pro Glu Asp Thr Ala Val Tyr Ser Cys 85 90 95 Ala Ala Asp Val Gln Asp Tyr Met Gly Tyr Ser Lys Met Tyr Gln Asp 100 105 110 Tyr Asp Tyr Trp Gly Gln Gly Thr Gln Val Thr Val Ser Ser 115 120 125 <210> 15 <211> 378 <212> DNA <213> Vicugna pacos <400> 15 caggtgcagc tggtggagtc ggggggagga ttggtacagg ctgggggctc tctgagactc 60 tcctgtgcag cctctggacg cattttgagt agatatcgca tgggctggtt ccgccaggct 120 ccagggaagg agcgtgagtt tgtagcagcc gttagttgga gtgatggtag cacatactat 180 gcagactcag tgaagggccg attcaccatc tccagagaca acgccaagaa cacggtgtat 240 ctgcagatga acagcctgaa gcctgaggac acggccgttt attcctgtgc agcagatgtg 300 caagactata tgggttactc caagatgtac caagactatg actactgggg ccaggggacc 360 caggtcaccg tctcctcg 378 <210> 16 <211> 1876 <212> DNA <213> Artificial Sequence <400> 16 tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 60 cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 120 atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggag 180 tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 240 cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 300 tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 360 cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 420 ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 480 aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 540 gtctatataa gcagagctcg tttagtgaac cgtcagatcg cctggagacg ccatccacgc 600 tgttttgacc tccatagaag acaccgggac cgatccagcc tccgcggccg ggaacggtgc 660 attggaacgc ggattccccg tgccaagagt gacgtaagta ccgcctatag agtctatagg 720 cccaccccct tggcttcgtt agaacgcggc tacaattaat acataacctt atgtatcata 780 cacatacgat ttaggtgaca 
ctatagaata acatccactt tgcctttctc tccacaggtg 840 tccactccca ggtccaactg cacctcggtt ctatcgattg aattccacca tgggatggtc 900 atgtatcatc ctttttctag tagcaactgc aaccggtgta cattctgcta gccgcggccg 960 cggtggtggt ggttctgagc ccaaatcttg tgacaaaact cacacatgcc caccgtgccc 1020 agcacctgaa ctcctggggg gaccgtcagt cttcctcttc cccccaaaac ccaaggacac 1080 cctcatgatc tcccggaccc ctgaggtcac atgcgtggtg gtggacgtga gccacgaaga 1140 ccctgaggtc aagttcaact ggtacgtgga cggcgtggag gtgcataatg ccaagacaaa 1200 gccgcgggag gagcagtaca acagcacgta ccgtgtggtc agcgtcctca ccgtcctgca 1260 ccaggactgg ctgaatggca aggagtacaa gtgcaaggtc tccaacaaag ccctcccagc 1320 ccccatcgag aaaaccatct ccaaagccaa agggcagccc cgagaaccac aggtgtacac 1380 cctgccccca tcccgggagg agatgaccaa gaaccaggtc agcctgacct gcctggtcaa 1440 aggcttctat cccagcgaca tcgccgtgga gtgggagagc aatgggcagc cggagaacaa 1500 ctacaagacc acgcctcccg tgctggactc cgacggctcc ttcttcctct atagcaagct 1560 caccgtggac aagagcaggt ggcagcaggg gaacgtcttc tcatgctccg tgatgcatga 1620 ggctctgcac aaccactaca cgcagagag cctctccctg tccccggta atgagtgcg 1680 acggccggca agcccccgct cccggggctc tcgcggtcgt acgaggaag cttggccgcc 1740 atggcccaac ttgttttg cagcttataa tggttacaa taaagcaata gcatcacaa 1800 tttcacaaat aaagcatttt ttcactgca ttctagttgt gttttgtcca aactcatca 1860 tgtatcttat catgta 1876 <210> 17 <211> 1281 <212> PRT <213> SARS‐CoV‐2 <400> 17 Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser Gln Cys Val 1 5 10 15 Asn Leu Thr Thr Arg Thr Gln Leu Pro Pro Ala Tyr Thr Asn Ser Phe 20 25 30 Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Val Leu 35 40 45 His Ser Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp 50 55 60 Phe His Ala Ile His Val Ser Gly Thr Asn Gly Thr Lys Arg Phe Asp 65 70 75 80 Asn Pro Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Thr Glu 85 90 95 Lys Ser Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ser 100 105 110 Lys Thr Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile 115 120 125 Lys Val Cys Glu Phe Gln Phe Cys Asn Asp Pro Phe Leu Gly Val Tyr 130 135 140 Tyr His Lys Asn 
Asn Lys Ser Trp Met Glu Ser Glu Phe Arg Val Tyr 145 150 155 160 Ser Ser Ala Asn Asn Cys Thr Phe Glu Tyr Val Ser Gln Pro Phe Leu 165 170 175 Met Asp Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe 180 185 190 Val Phe Lys Asn Ile Asp Gly Tyr Phe Lys Ile Tyr Ser Lys His Thr 195 200 205 Pro Ile Asn Leu Val Arg Asp Leu Pro Gln Gly Phe Ser Ala Leu Glu 210 215 220 Pro Leu Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr 225 230 235 240 Leu Leu Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asp Ser Ser Ser 245 250 255 Gly Trp Thr Ala Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Pro 260 265 270 Arg Thr Phe Leu Leu Lys Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala 275 280 285 Val Asp Cys Ala Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys 290 295 300 Ser Phe Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val 305 310 315 320 Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys 325 330 335 Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val Tyr Ala 340 345 350 Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu 355 360 365 Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro 370 375 380 Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe 385 390 395 400 Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly 405 410 415 Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys 420 425 430 Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn 435 440 445 Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe 450 455 460 Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys 465 470 475 480 Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly 485 490 495 Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val 500 505 510 Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly Pro Lys 515 520 525 Lys Ser Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn 530 535 540 Gly Leu Thr Gly Thr Gly Val Leu Thr Glu Ser Asn Lys Lys Phe Leu 545 550 555 560 Pro Phe Gln Gln 
Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val 565 570 575 Arg Asp Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe 580 585 590 Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val 595 600 605 Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Glu Val Pro Val Ala Ile 610 615 620 His Ala Asp Gln Leu Thr Pro Thr Trp Arg Val Tyr Ser Thr Gly Ser 625 630 635 640 Asn Val Phe Gln Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val 645 650 655 Asn Asn Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala 660 665 670 Ser Tyr Gln Thr Gln Thr Asn Ser Pro Arg Arg Ala Arg Ser Val Ala 675 680 685 Ser Gln Ser Ile Ile Ala Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser 690 695 700 Val Ala Tyr Ser Asn Asn Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile 705 710 715 720 Ser Val Thr Thr Glu Ile Leu Pro Val Ser Met Thr Lys Thr Ser Val 725 730 735 Asp Cys Thr Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ser Asn Leu 740 745 750 Leu Leu Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr 755 760 765 Gly Ile Ala Val Glu Gln Asp Lys Asn Thr Gln Glu Val Phe Ala Gln 770 775 780 Val Lys Gln Ile Tyr Lys Thr Pro Pro Ile Lys Asp Phe Gly Gly Phe 785 790 795 800 Asn Phe Ser Gln Ile Leu Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser 805 810 815 Phe Ile Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly 820 825 830 Phe Ile Lys Gln Tyr Gly Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp 835 840 845 Leu Ile Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu 850 855 860 Leu Thr Asp Glu Met Ile Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly 865 870 875 880 Thr Ile Thr Ser Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile 885 890 895 Pro Phe Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr 900 905 910 Gln Asn Val Leu Tyr Glu Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn 915 920 925 Ser Ala Ile Gly Lys Ile Gln Asp Ser Leu Ser Ser Thr Ala Ser Ala 930 935 940 Leu Gly Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn 945 950 955 960 Thr Leu Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val 965 970 975 Leu Asn Asp Ile Leu 
Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln 980 985 990 Ile Asp Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val 995 1000 1005 Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu 1010 1015 1020 Ala Ala Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val 1025 1030 1035 1040 Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ser Ala 1045 1050 1055 Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ala Gln Glu 1060 1065 1070 Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His Asp Gly Lys Ala His 1075 1080 1085 Phe Pro Arg Glu Gly Val Phe Val Ser Asn Gly Thr His Trp Phe Val 1090 1095 1100 Thr Gln Arg Asn Phe Tyr Glu Pro Gln Ile Ile Thr Thr Asp Asn Thr 1105 1110 1115 1120 Phe Val Ser Gly Asn Cys Asp Val Val Ile Gly Ile Val Asn Asn Thr 1125 1130 1135 Val Tyr Asp Pro Leu Gln Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu 1140 1145 1150 Asp Lys Tyr Phe Lys Asn His Thr Ser Pro Asp Val Asp Leu Gly Asp 1155 1160 1165 Ile Ser Gly Ile Asn Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp 1170 1175 1180 Arg Leu Asn Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu 1185 1190 1195 1200 Gln Glu Leu Gly Lys Tyr Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Ile 1205 1210 1215 Trp Leu Gly Phe Ile Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile 1220 1225 1230 Met Leu Cys Cys Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Cys Cys 1235 1240 1245 Ser Cys Gly Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro Val 1250 1255 1260 Leu Lys Gly Val Lys Leu His Tyr Thr Trp Ser His Pro Gln Phe Glu 1265 1270 1275 1280 Lys <210> 18 <211> 3846 <212> DNA <213> SARS‐CoV‐2 <400> 18 atgttcgtgt tcctgtgct gctgccctg gtgagcagcc agtgcgtgaa tctgaccacc 60 agaacccagc tgcctcctgc ctacaccaat agcttcacca gaggagttta ttatcccgat 120 aaggtgttca gaagtagtgt attacatagt acccaggacc tgttcctacc tttcttcagt 180 aacgtgacct ggttccacgc catccacgtg agcggcacca atggcaccaa gagattcgac 240 aatcctgtgc tgcctttcaa tgacggcgtg tacttcgcca gcaccgagaa gagcaatatc 300 atcagaggct ggatcttcgg caccaccttg gattccaaga ctcagagcct gctgattgta 360 aacaacgcta caaatgtggt gatcaaggtg 
tgcgagttcc agttctgcaa tgaccctttc 420 ctgggtgttt attatcataa gaacaacaag agctggatgg agagcgagtt ccgcgtatat 480 tcgtcggcta ataattgcac cttcgagtac gtgagccagc ctttcctgat ggacctggag 540 ggcaagcagg gcaatttcaa gaatctgaga gagttcgtgt tcaagaatat cgacggctac 600 ttcaagatct acagcaagca cacacccatt aatctggtga gagacctgcc tcagggcttc 660 agcgccctgg agcctctggt ggacctgcct atcggcatca atatcaccag attccagacc 720 ctgctggccc tgcacagatc atatcttaca ccaggcgatt cgtcaagcgg ttggaccgct 780 ggagctgcgg catattacgt gggctacctg cagcctagaa ccttcctgct gaagtacaat 840 gagaatggta cgataaccga cgcagttgat tgtgccctgg accctctgag cgagaccaag 900 tgcaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa tttcagagtg 960 cagcctaccg agagcatcgt gagattccct aatatcacca atctgtgccc tttcggcgag 1020 gtgttcaatg ccaccagatt cgccagcgtg tacgcatgga accgcaagcg gataagcaat 1080 tgcgtggccg actacagcgt gctgtacaat agcgccagct tcagcacctt caaatgttat 1140 ggtgtttcgc caacaaagct gaatgacctg tgcttcacca atgtgtacgc cgacagcttc 1200 gtgatcagag gcgacgaggt gagacagatc gcgccagggc agaccggcaa gatcgccgac 1260 tacaattaca agctgcctga cgacttcacc ggctgcgtga tcgcgtggaa ctctaacaat 1320 ctagattcga aagttggagg caattacaat tacctgtaca gactgttcag aaagagcaat 1380 ctgaagcctt tcgagagaga catcagcacc gagatctacc aggccggcag cacaccgtgt 1440 aatggcgtgg agggcttcaa ttgctacttc cctctgcaga gctacggctt ccagcctacc 1500 aatggcgtgg gctaccagcc ttacagagtg gtggtgctga gcttcgagct gctccacgct 1560 cccgctaccg tgtgcggccc taagaagc accaatctgg tgaagaataa gtgcgtgaat 1620 ttcaatttca atggtctaac tggaacgggc gtgctgaccg agagcaataa gaagtttctt 1680 ccctttcaac aattcggcag agacatcgcc gacaccacag atgctgtaag agaccctcag 1740 accctggaga tcctggacat cactccgtgt agcttcggcg gcgtgagcgt gatcacaccg 1800 ggtaccaata ccagcaatca ggtggccgtg ctgtaccagg acgtgaattg caccgaggtg 1860 cctgtggcca tccacgccga ccagctgact cccacttgga gggtatattc cacgggaagc 1920 aatgtgttcc agaccagagc cggctgcctg atcggcgccg agcacgtgaa taatagctac 1980 gagtgcgaca tccctatcgg cgccggcatc tgcgccagct accagaccca gaccaatagc 2040 cctagagaag ccagaagcgt ggccagccag agcatcatcg cctacaccat 
gagcctgggc 2100 gccgagaata gcgtggccta cagcaataat agcatcgcca tccctaccaa tttcaccatc 2160 agcgtgacca ccgaaatatt accagtctcc atgaccaaga ccagcgtgga ctgcaccatg 2220 tacatctgcg gcgacagcac cgagtgcagc aatctgctgc tgcagtacgg cagcttctgc 2280 acccagctga atagagccct gaccggcatc gccgtggagc aggacaagaa tacccaggag 2340 gtgttcgccc aggtgaagca gatctacaag actccgccga tcaaggactt cggcggcttc 2400 aatttcagcc aaatactccc agatccaagc aagcctagca agaggagctt catcgaggac 2460 ctgctgttca ataaggtgac cctggccgac gccggcttca tcaagcagta cggcgactgc 2520 ctaggtgata ttgcggcaag agacctgatc tgcgcccaga agtttaacgg tttgacagta 2580 ctacctcctc tgctgaccga cgagatgata gcacaatata cgtcggcatt gctcgctggc 2640 acgatcacat cgggctggac ttcggcgcc ggagcagcgt tgcaaatccc ttcgccatg 2700 cagatggcct acagattcaa tggcatcggc gtgacccaga atgtgctgta cgagaatcag 2760 aagctgatcg ccaatcagtt caatagcgcc atcggcaaga tccaggacag cctgagcagc 2820 accgccagcg ccctgggcaa gctgcaggac gtggtgaatc agaatgccca ggccctgaat 2880 accctggtga agcagctgag cagcaatttc ggcgccatca gtagtgtact caacgatatc 2940 ctgagcagac tggacaaggt ggaggccgag gtgcaaattg atcgtcttat tacggcaga 3000 ctgcagagcc tgcagaccta cgtgacccag cagctgatca gagccgccga gatcagagcc 3060 agcgccaatc tggccgccac caagatgagc gagtgcgtgc tgggccagag caagagagtg 3120 gacttctgcg gcaagggcta ccacctgatg agcttccctc agagcgctcc acatggcgtg 3180 gtgttcctgc acgtgaccta cgtgcctgcc caggagaaga atttcaccac cgcacccgca 3240 atctgccacg acggcaaggc ccacttccct agagagggcg tgttcgtgag caatggcacc 3300 cactggttcg tgacccagag aaatttctac gagcctcaga tcatcaccac cgacaatacc 3360 3420 ctgcagcctg agctggacag cttcaaggag gagctggaca agtacttcaa gaatcacacc 3480 agccctgacg tggacctcgg tgatatttcg ggaatcaatg ccagcgtggt gaatatccag 3540 aaggaaattg atcggctcaa cgaagtggcc aagaatctga atgagagcct gatcgacctg 3600 caggagctgg gcaagtacga gcagtacatc aagtggcctt ggtacatctg gctgggcttc 3660 atcgccggcc tgatcgccat cgtgatggtg accatcatgc tgtgctgcat gacctcctgt 3720 tgttcctgtt tgaaagggtg ttgttcgtgt gggtcctgct gcaagttcga cgaggacgac 3780 agcgagcctg tgctgaaggg cgtgaagctg cactacacct ggagccaccc tcagttcgag 
3840 August 3846 <210> 19 <211> 3813 <212> DNA <213> SARS‐CoV‐2 <400> 19 atgttcgtgt tcctggtgct gctgcccctg gtgagcagcc aatgcgtgaa cctgaccaca 60 agaacacagc tgccccccgc ctacaccaac agcttcacaa gaggcgtgta ctaccccgac 120 aaggtgttca gaagcagcgt gctgcacagc acccaagacc tgttcctgcc cttcttcagc 180 aacgtgacct ggttccacgt gatcagcggc accaacggca ccaagagatt cgacaacccc 240 gtgctgcct tcaacgacgg cgtgtacttc gctagcatcg agaagagcaa catcatcaga 300 ggctggatct tcggcaccac cctggacagc aaaacacaga gcctgctgat cgtgaacaac 360 gccaccaacg tggtgatcaa ggtgtgcgag tttcagttct gcaacgaccc cttcttcgac 420 cacaagaaca acaagagctg gatggagagc gagttcagag tgtacagcag cgccaacaac 480 tgcaccttcg agtacgtgag ccaacccttc ctgatggacc tggagggcaa gcaaggcaac 540 ttcaagaacc tgagagagtt cgtgttcaag aacatcgacg gctacttcaa gatctacagc 600 aagcacaccc ccatcatcgt gagagagccc gaggacctgc cccaaggctt cagcgccctg 660 gagcccctgg tggacctgcc catcggcatc aacatcacaa gatttcagac cctgctggcc 720 ctgcacagat cctacctcac ccctggcgac agcagcagcg ggtggacagc tggcgctgcc 780 gcctactacg tgggctacct gcagcctaga accttcctgc tgaagtacaa cgagaacggc 840 accatcaccg acgccgtgga ctgccccctg gaccccctga gccagaccaa gtgcaccctg 900 aagagcttca ccgtggagaa gggcatctat cagacaagca acttcagagt gcagcccacc 960 gagagcatcg tgagattccc caacatcacc aacctgtgcc ccttcgacga ggtgttcaac 1020 gccacaagat tcgctagcgt gtacgcctgg aaccgaaga gaatcagcaa ctgcgtggcc 1080 gactacagcg tgctgtacaa cctggcccccc ttcttcacct tcaagtgcta cggcgtgagc 1140 cccaccaagc tgaacgacct gtgcttcacc aacgtgtacg ccgacagctt cgtgatcaga 1200 ggcgacgagg tgagacagat cgcccccggg cagaccggca acatcgccga ctacaactac 1260 aagctgcccg acgacttcac cggctgcgtg atcgcctgga acagcaacaa gctggacagc 1320 aaggtgtccg gcaactacaa ctacctgtac agactgttca gaaagagcaa cctgaagccc 1380 ttcgagagag acatcagcac cgagatctac caagccggca acaagccctg caacggcgtg 1440 gccggcttca actgctactt ccccctgaga agctacagct tcagacccac ctacggcgtg 1500 ggccatcagc cctacagagt ggtcgtgctg agcttcgagc tgctgcacgc ccccgccacc 1560 gtgtgcggcc ccaagaagag caccaacctg gtgaagaaca agtgcgtgaa cttcaacttc 1620 aacggcctga agggcaccgg 
cgtgctgacc gagagcaaca agaagttcct gccctttcag 1680 cagttcggca gagacatcgc cgacaccacc gacgccgtga gagaccctca gaccctggag 1740 atcctggaca tcaccccctg cagcttcggc ggcgtgagcg tgatcacccc cggcaccaac 1800 acaagcaacc aagtggccgt gctgtaccaa ggcgtgaact gcaccgaggt gcccgtggcc 1860 atccacgccg atcagctgac ccccacctgg agagtgtaca gcaccggcag caacgtgttt 1920 cagaagag ccgggctcct gatcggcgcc gagtacgtga aacagcta cgagtgcgac 1980 atccccatcg gcgccggcat ctgcgctagc tatcagacac agaccaagag ccacggcagt 2040 gctagcagcg tggctagcca aagcatcatc gcctacacca tgagcctggg cgccgagaac 2100 agcgtggcct acagcaacaa cagcatcgcc atccccacca acttcaccat cagcgtgacc 2160 accgagatcc tgcccgtcag catgaccaag acaagcgtgg actgcaccat gtacatctgc 2220 ggcgacagca ccgagtgcag caacctgctc ctgcagtacg gcagcttctg cacacagctg 2280 aagagagccc tgaccggcat cgccgtggag caagacaaga acacccaaga ggtgttcgcc 2340 caagtgaagc agatctacaa gacccccccc atcaagtact tcggcggctt caacttcagc 2400 caaattctgc ctgaccctag caagcctagc aagagaagct tcatcgagga cctgctgttc 2460 aacaaggtga ccctggccga cgccggcttc atcaagcagt acggcgactg cctgggcgac 2520 atcgccgcta gagacctgat ctgcgctcag aagttcaagg gcctgaccgt gctgcccccc 2580 ctgctgaccg acgagatgat cgctcagtat acaagcgccc tgctcgccgg gacaatcacc 2640 tccggctgga cctttggcgc cggcgctgcc ctgcagatcc ccttcgccat gcagatggcc 2700 tacagattca acggcatcgg cgtgacacag aacgtgctgt acgagaatca gaagctgatc 2760 gccaatcagt tcaacagcgc catcggcaag atccaagaca gcctgagcag caccgctagc 2820 gccctgggca agctgcaaga cgtggtgaac cacaacgcccc aagccctgaa caccctggtg 2880 aagcagctga gcagcaagtt cggcgccatc agcagcgtgc tgaacgacat cttcagcaga 2940 ctggaccctc cagaggccga ggtgcagatc gacagactga tcaccggcag actgcagagc 3000 ctgcagacct acgtgaccca gcagctgatc agagccgccg agatcagagc cagcgccaat 3060 ctggccgcca ccaagatgag cgagtgcgtg ctgggccaga gcaagagat ggacttctgc 3120 ggcaagggct accacctgat gagcttccct cagagcgctc cacatggcgt ggtgttcctg 3180 cacgtgacct acgtgcctgc ccaggagaag aatttcacca ccgcacccgc aatctgccac 3240 gacggcaagg cccacttccc tagagagggc gtgttcgtga gcaatggcac ccactggttc 3300 gtgacccaga gaaatttcta cgagcctcag 
atcatcacca ccgacaatac cttcgtgagc 3360 ggcaattgcg acgtggtgat cgggatagtc aataatactg tctacgaccc tctgcagcct 3420 gagctggaca gcttcaagga ggagctggac aagtacttca agaatcacac cagccctgac 3480 gtggacctcg gtgatatttc gggaatcaat gccagcgtgg tgaatatcca gaaggaaatt 3540 gatcggctca acgaagtggc caagaatctg aatgagagcc tgatcgacct gcaggagctg 3600 ggcaagtacg agcagtacat caagtggcct tggtacatct ggctgggctt catcgccggc 3660 ctgatcgcca tcgtgatggt gaccatcatg ctgtgctgca tgacctcctg ttgttcctgt 3720 ttgaaagggt gttgttcgtg tgggtcctgc tgcaagttcg acgaggacga cagcgagcct 3780 gtgctgaagg gcgtgaagct gcactacacc tga 3813 <210> 20 <211> 3813 <212> DNA <213> SARS-CoV-2 <400> 20 atgttcgtgt tcctggtatt gctgccgctg gtgagctctc agtgcgtgaa ccttatcacc 60 agaacccaga gctacaccaa cagcttcacc cggggtgttt actaccccga caaagtgttc 120 cggagctctg ttctgcatag cacccaagac ctgttcctgc ttttcttctc taacgtgacc 180 tggttccacg ccatccacgt gtctggaca aacggacca aaagatttga caacccgtc 240 ctgcctttta atgacggagt gtatttcgcc tccacagaaa agagcaacat catcagaggc 300 tggatctttg gcaccactct ggattccaag acccagagcc tgctgatcgt gaacaacgcc 360 acaaacgtcg tgatcaaggt ctgcgagttc cagttctgta acgatccttt tctggacgtg 420 tactaccaca agaacaacaa gagctggatg gaatctgagt ttcgggtgta cagcagcgct 480 aataattgca ccttcgagta cgtttcccag ccattcctga tggacctgga gggcaagcag 540 ggaaacttca agaacctcag agagttcgtg ttcaaaaaca tcgacggcta cttcaagatc 600 acagcaagc acaccccaat caacctgggc agagacctgc ctcagggctt tagcgccctg 660 
gaccactgg tggatctccc tatcggcatc aacatcacac ggtttcagac cctgctggcc ctccacagaa gctatctgac gcccggcgac holdcctcg gatggaccgc gggcgccgcc 780 gcttactacg tgggctatct gcagcctaga acattcctgt tgaagtacaa cgagaacggg 840 accatcacag atgccgtgga ctgcgccctg gaccctctga gcgagacaaa gtgcaccctg 900 aagagcttca ccgtggaaaa gggcatctac caaaccagca acttcagagt gcagcctaca 960 gagagcattg tcagattccc caacatcacc aatctgtgcc catttgatga ggtgttcaac 1020 gccacccggt tcgccagcgt gtacgcctgg aatagaaaga gaatctccaa ttgcgtggct 1080 gactacagcg tgctgtacaa cttcgctccc ttcttcgcct tcaagtgcta cggcgtctcc 1140 cctacgaagc tgaacgacct ctgtttcaca aatgtgtacg ccgatagctt cgtgatccgg 1200 ggcaatgagg ttagccagat cgcacccggc cagactggca acatcgcgga ttacaactac 1260 aaactgcccg atgacttcac aggctgcgtg atcgcctgga acagcaacaa gctggacagc 1320 aaggtgggag gtaactacaa ctacttgtac cggctgtttc ggaagtctaa ccttaaacct 1380 tttgagagag acatctctac cgagatctac caagccggaa ataaaccttg caacggcgtg 1440 gccggattca actgctactt tcctctgaga agctacggct tcagacctac gtacggcgtt 1500 ggccaccaac cttaccgcgt ggtggttctg agctttgaac tgctgcacgc ccctgccacc 1560 gtgtgcggcc caaagaaaag taccaaccta gtcaagaaca aatgcgtgaa cttcaatttc 1620 aacggcctga ccggcacagg cgtgctgact gagagcaaca agaaattcct gcctttccaa 1680 cagttcggca gagatattgc tgacaccacc gacgccgtgc gggaccccca gaccctggaa 1740 atcctggata tcaccccttg ttcttttgga ggcgtgagcg tgatcactcc tggaaccaac 1800 acgtccaatc aggtggccgt gctgtatcag ggcgtgaact gcaccgaggt gcccgtggcc 1860 atccacgccg accagctgac ccctacatgg cgggtgtaca gcacaggaag caatgtgttc 1920 cagaccagag ccggctgtct gataggagct gaatacgtga acaattctta cgagtgtgac 1980 attcccatcg gcgccggcat ctgtgcctcc taccagaccc agacaaagag ccaccggaga 2040 gccagaagcg tcgccagcca gtccatcatc gcttatacca tgagcctcgg cgctgaaaac 2100 tccgtggcct acagcaacaa cagcatcgcc atccccacca actttacaat cagcgtcaca 2160 accgaaatcc tgcccgtgag catgacgaag acctctgtgg actgtaccat gtacatctgc 2220 ggcgacagca ccgagtgctc caatctgctg ctgcagtacg gctctttttg cacacagctg 2280 aagagagcac tgaccggaat tgctgtggaa caggacaaga acacccagga ggtgttcgcc 2340 caggtgaagc agatctataa 
gacacctcca atcaagtact tcggcggctt caactttagc 2400 cagatcctgc ccgaccccag caagccttct aaacgcagct tcattgagga cctgctgttt 2460 aacaaggtga ccctggccga cgctggtttc atcaagcagt acggcgattg cctgggcgac 2520 atcgcggctc gggacctgat ctgcgcccag aagttcaacg gcctgacagt gctgcctcct 2580 ctgctcacag atgagatgat cgcccagtac accagcgccc tgctcgccgg tacaatcaca 2640 tccggctgga ccttcggcgc tggcgctgcc ctgcaaatcc ctttcgcaat gcagatggcc 2700 tacagattca atggaatcgg cgtcacccag aacgtgctgt acgagaacca gaagctgatc 2760 gccaatcaat tcaacagcgc catcggcaag atccaggatt ccctgagctc taccgccagc 2820 gccctgggca agctgcagga cgtggtgaac caaccgcccc aggccctgaa caacactggtg 2880 aaacagctgt cttctaaatt cggcgccatt tcatccgtgc ttaatgacat cctgtctaga 2940 ctggacaaag tggaagccga agtccagatc gacaggctga ttacaggaag actgcaaagc 3000 ctacagacct acgtgaccca gcaactgatc agagccgctg agatcagagc ctctgccaac 3060 ctggcagcca ccaagatgag cgagtgcgtg ctgggacagt ctaagagggt ggatttctgc 3120 ggaaagggtt atcacctgat gagcttcccc caaagcgccc ctcacggcgt ggtgttcctg 3180 catgtgactt acgtgccagc tcaggagaag aacttcacaa ccgcccctgc catctgccac 3240 gacggcaagg cccatttccc tagagaaggc gttttcgtga gcaatggcac ccactggttc 3300 gtgacccaga ggaacttcta cgagccccag atcatcacca ccgataatac tttcgtgagt 3360 ggcaattgtg acgtggtgat cggcatcgtg aacaacaccg tgtacgaccc tctgcagcct 3420 gagctggata gctttaagga ggaactggat aagtacttca aaaaccacac aagcccggac 3480 gtggacctgg gcgacatcag cggcataaac gccagcgtgg tgaacatcca gaaagaaatc 3540 gacagactga acgaagtggc caagaacctg aatgagagcc tgatcgatct gcaggagctg 3600 ggaagtacg agcagtacat caagtggcct tggtacatct ggctgggatt catcgccggc 3660 ctgatcgcta tcgtgatggt gactattatg ctgtgttgca tgaccagttg ttgtagctgc 3720 ctgaagggct gctgcagctg tggcagctgc tgtaattcg acgaggatga tagtgaacct 3780 gtgctgaagg gcgtgaagtt gcactacacc is 3813 <210> 21 <211> 1255 <212> PRT <213> SARS‐CoV‐1 <400> 21 Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly Ser Asp Leu 1 5 10 15 Asp Arg Cys Thr Thr Phe Asp Asp Val Gln Ala Pro Asn Tyr Thr Gln 20 25 30 His Thr Ser Ser Met Arg Gly Val Tyr Tyr Pro Asp Glu Ile Phe Arg 35 40 45 
Ser Asp Thr Leu Tyr Leu Thr Gln Asp Leu Phe Leu Pro Phe Tyr Ser 50 55 60 Asn Val Thr Gly Phe His Thr Ile Asn His Thr Phe Gly Asn Pro Val 65 70 75 80 Ile Pro Phe Lys Asp Gly Ile Tyr Phe Ala Ala Thr Glu Lys Ser Asn 85 90 95 Val Val Arg Gly Trp Val Phe Gly Ser Thr Met Asn Asn Lys Ser Gln 100 105 110 Ser Val Ile Ile Ile Asn Asn Ser Thr Asn Val Val Ile Arg Ala Cys 115 120 125 Asn Phe Glu Leu Cys Asp Asn Pro Phe Phe Ala Val Ser Lys Pro Met 130 135 140 Gly Thr Gln Thr His Thr Met Ile Phe Asp Asn Ala Phe Asn Cys Thr 145 150 155 160 Phe Glu Tyr Ile Ser Asp Ala Phe Ser Leu Asp Val Ser Glu Lys Ser 165 170 175 Gly Asn Phe Lys His Leu Arg Glu Phe Val Phe Lys Asn Lys Asp Gly 180 185 190 Phe Leu Tyr Val Tyr Lys Gly Tyr Gln Pro Ile Asp Val Val Arg Asp 195 200 205 Leu Pro Ser Gly Phe Asn Thr Leu Lys Pro Ile Phe Lys Leu Pro Leu 210 215 220 Gly Ile Asn Ile Thr Asn Phe Arg Ala Ile Leu Thr Ala Phe Ser Pro 225 230 235 240 Ala Gln Asp Ile Trp Gly Thr Ser Ala Ala Ala Tyr Phe Val Gly Tyr 245 250 255 Leu Lys Pro Thr Thr Phe Met Leu Lys Tyr Asp Glu Asn Gly Thr Ile 260 265 270 Thr Asp Ala Val Asp Cys Ser Gln Asn Pro Leu Ala Glu Leu Lys Cys 275 280 285 Ser Val Lys Ser Phe Glu Ile Asp Lys Gly Ile Tyr Gln Thr Ser Asn 290 295 300 Phe Arg Val Val Pro Ser Gly Asp Val Val Arg Phe Pro Asn Ile Thr 305 310 315 320 Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Lys Phe Pro Ser 325 330 335 Val Tyr Ala Trp Glu Arg Lys Lys Ile Ser Asn Cys Val Ala Asp Tyr 340 345 350 Ser Val Leu Tyr Asn Ser Thr Phe Phe Ser Thr Phe Lys Cys Tyr Gly 355 360 365 Val Ser Ala Thr Lys Leu Asn Asp Leu Cys Phe Ser Asn Val Tyr Ala 370 375 380 Asp Ser Phe Val Val Lys Gly Asp Asp Val Arg Gln Ile Ala Pro Gly 385 390 395 400 Gln Thr Gly Val Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe 405 410 415 Met Gly Cys Val Leu Ala Trp Asn Thr Arg Asn Ile Asp Ala Thr Ser 420 425 430 Thr Gly Asn Tyr Asn Tyr Lys Tyr Arg Tyr Leu Arg His Gly Lys Leu 435 440 445 Arg Pro Phe Glu Arg Asp Ile Ser Asn Val Pro Phe Ser Pro Asp Gly 450 455 460 Lys Pro Cys 
Thr Pro Pro Ala Leu Asn Cys Tyr Trp Pro Leu Asn Asp 465 470 475 480 Tyr Gly Phe Tyr Thr Thr Thr Gly Ile Gly Tyr Gln Pro Tyr Arg Val 485 490 495 Val Val Leu Ser Phe Glu Leu Leu Asn Ala Pro Ala Thr Val Cys Gly 500 505 510 Pro Lys Leu Ser Thr Asp Leu Ile Lys Asn Gln Cys Val Asn Phe Asn 515 520 525 Phe Asn Gly Leu Thr Gly Thr Gly Val Leu Thr Pro Ser Ser Lys Arg 530 535 540 Phe Gln Pro Phe Gln Gln Phe Gly Arg Asp Val Ser Asp Phe Thr Asp 545 550 555 560 Ser Val Arg Asp Pro Lys Thr Ser Glu Ile Leu Asp Ile Ser Pro Cys 565 570 575 Ala Phe Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Ala Ser Ser 580 585 590 Glu Val Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Asp Val Ser Thr 595 600 605 Ala Ile His Ala Asp Gln Leu Thr Pro Ala Trp Arg Ile Tyr Ser Thr 610 615 620 Gly Asn Asn Val Phe Gln Thr Gln Ala Gly Cys Leu Ile Gly Ala Glu 625 630 635 640 His Val Asp Thr Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile 645 650 655 Cys Ala Ser Tyr His Thr Val Ser Leu Leu Arg Ser Thr Ser Gln Lys 660 665 670 Ser Ile Val Ala Tyr Thr Met Ser Leu Gly Ala Asp Ser Ser Ile Ala 675 680 685 Tyr Ser Asn Asn Thr Ile Ala Ile Pro Thr Asn Phe Ser Ile Ser Ile 690 695 700 Thr Thr Glu Val Met Pro Val Ser Met Ala Lys Thr Ser Val Asp Cys 705 710 715 720 Asn Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ala Asn Leu Leu Leu 725 730 735 Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Ser Gly Ile 740 745 750 Ala Ala Glu Gln Asp Arg Asn Thr Arg Glu Val Phe Ala Gln Val Lys 755 760 765 Gln Met Tyr Lys Thr Pro Thr Leu Lys Tyr Phe Gly Gly Phe Asn Phe 770 775 780 Ser Gln Ile Leu Pro Asp Pro Leu Lys Pro Thr Lys Arg Ser Phe Ile 785 790 795 800 Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Met 805 810 815 Lys Gln Tyr Gly Glu Cys Leu Gly Asp Ile Asn Ala Arg Asp Leu Ile 820 825 830 Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr 835 840 845 Asp Asp Met Ile Ala Ala Tyr Thr Ala Ala Leu Val Ser Gly Thr Ala 850 855 860 Thr Ala Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe 865 870 875 880 Ala Met Gln 
Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln Asn 885 890 895 Val Leu Tyr Glu Asn Gln Lys Gln Ile Ala Asn Gln Phe Asn Lys Ala 900 905 910 Ile Ser Gln Ile Gln Glu Ser Leu Thr Thr Thr Ser Thr Ala Leu Gly 915 920 925 Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu 930 935 940 Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn 945 950 955 960 Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp 965 970 975 Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln 980 985 990 Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala 995 1000 1005 Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val Asp Phe 1010 1015 1020 Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ala Ala Pro His 1025 1030 1035 1040 Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ser Gln Glu Arg Asn 1045 1050 1055 Phe Thr Thr Ala Pro Ala Ile Cys His Glu Gly Lys Ala Tyr Phe Pro 1060 1065 1070 Arg Glu Gly Val Phe Val Phe Asn Gly Thr Ser Trp Phe Ile Thr Gln 1075 1080 1085 Arg Asn Phe Phe Ser Pro Gln Ile Ile Thr Thr Asp Asn Thr Phe Val 1090 1095 1100 Ser Gly Asn Cys Asp Val Val Ile Gly Ile Ile Asn Asn Thr Val Tyr 1105 1110 1115 1120 Asp Pro Leu Gln Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu Asp Lys 1125 1130 1135 Tyr Phe Lys Asn His Thr Ser Pro Asp Val Asp Leu Gly Asp Ile Ser 1140 1145 1150 Gly Ile Asn Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp Arg Leu 1155 1160 1165 Asn Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu 1170 1175 1180 Leu Gly Lys Tyr Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Val Trp Leu 1185 1190 1195 1200 Gly Phe Ile Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile Leu Leu 1205 1210 1215 Cys Cys Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Ala Cys Ser Cys 1220 1225 1230 Gly Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro Val Leu Lys 1235 1240 1245 Gly Val Lys Leu His Tyr Thr 1250 1255 <210> 22 <211> 3768 <212> DNA <213> SARS-CoV-1 <400> 22 atgttcatct tcctgctgtt cctgaccctg accagcggca gcgacctgga ccggtgcacc 60 accttcgacg acgtgcaggc ccccaactac 
acccagcaca ccagcagcat gcggggcgtg 120 tactaccccg acgagatctt ccggagcgac accctgtacc tgacccagga cctgttcctg 180 cccttctaca gcaacgtgac cggcttccac accatcaacc acaccttcgg caacccccgtg 240 atccccttca aggacggcat ctacttcgcc gccaccgaga agagcaacgt ggtgcggggc 300 tgggtgttcg gcagcaccat gaacaacaag agccagagcg tgatcatcat caacaacagc 360 accaacgtgg tgatccgggc ctgcaacttc gagctgtgcg acaacccctt cttcgccgtg 420 agcaagccca tgggcaccca gacccacacc atgatcttcg acaacgcctt caactgcacc 480 ttcgagtaca tcagcgacgc cttcagcctg gacgtgagcg agaagagcgg caacttcaag 540 cacctgcggg agttcgtgtt caagaacaag gacggcttcc tgtacgtgta caagggctac 600 cagcccatcg acgtggtgcg ggacctgccc agcggcttca acaccctgaa gcccatcttc 660 aagctgcccc tgggcatcaa catcaccaac ttccgggcca tcctgaccgc cttcagcccc 720 gcccaggaca tctggggcac cagcgccgcc gcctacttcg tgggctacct gaagcccacc 780 accttcatgc tgaagtacga cgagaacggc accatcaccg acgccgtgga ctgcagccag 840 aaccccctgg ccgagctgaa gtgcagtgtg aagagcttcg agatcgacaa gggcatctac 900 cagaccagca acttccgggt ggtgcccagc ggcgacgtgg tgcggttccc caacatcacc 960 aacctgtgcc ccttcggcga ggtgttcaac gccaccaagt tccccagcgt gtacgcctgg 1020 gagcggaaga agatcagcaa ctgcgtggcc gactacagcg tgctgtacaa cagcaccttc 1080 ttcagcacct tcaagtgcta cggcgtgagc gccaccaagc tgaacgacct gtgcttcagc 1140 aacgtgtacg ccgacagctt cgtggtgaag ggcgacgacg tgcggcagat cgcccccggc 1200 cagaccggcg tgatcgccga ctacaactac aagctgcccg acgacttcat gggctgcgtg 1260 ctggcctgga acacccggaa catcgacgcc accagcaccg gcaactacaa ctacaagtac 1320 cggtacctgc ggcacggcaa gctgcggccc ttcgagcggg acatcagcaa cgtgcccttc 1380 agccccgacg gcaagccctg cacccccccc gccctgaact gctactggcc cctgaacgac 1440 tacggcttct acaccactac cggcatcggc taccagccct accgggtggt ggtgctgagc 1500 ttcgagctgc tgaacgcccc cgccaccgtg tgcggcccca agctgagcac cgacctgatc 1560 aagaaccagt gcgtgaactt caacttcaac ggcctgaccg gcaccggcgt gctgaccccc 1620 agcagcaagc ggttccagcc cttccagcag ttcggccggg acgtgagcga cttcaccgac 1680 agcgtgcggg accccaagac cagcgagatc ctggacatca gcccctgcgc cttcggcggc 1740 gtgagcgtga tcacccccgg taccaacgcc agcagcgagg tggccgtgct 
gtaccaggac 1800 gtgaactgca ccgacgtgag caccgccatc cacgccgacc agctgacccc cgcctggcgg 1860 atctacagca ccggcaacaa cgtgttccag acccaggccg gctgcctgat cggcgccgag 1920 cacgtggaca ccagctacga gtgcgacatc cccatcggcg ccggcatctg cgccagctac 1980 cacaccgtga gcctgctgcg gagcaccagc cagaagagca tcgtggccta caccatgagc 2040 ctgggcgccg acagcagcat cgcctacagc aacaacacca tcgccatccc caccaacttc 2100 agcatcagca tcaccaccga ggtgatgccc gtgagcatgg ccaagaccag cgtggactgc 2160 aacatgtaca tctgcggcga cagcaccgaa tgcgccaacc tgctgctgca gtacggcagc 2220 ttctgcaccc aactgaaccg ggccctgagc ggcatcgccg ccgagcagga ccggaacacc 2280 cgggaggtgt tcgcccaggt gaagcagatg tacaagaccc ccaccctgaa gtacttcggc 2340 ggcttcaact tcagccagat cctgcccgac cccctgaagc ccaccaagcg gagcttcatc 2400 gaggacctgc tgttcaacaa ggtgaccctg gccgacgccg gcttcatgaa gcagtacggc 2460 gagtgcctgg gcgacatcaa cgcccgggac ctgatctgcg cccagaagtt caacggcctg 2520 accgtgctgc cccccctgct gaccgacgac atgatcgccg cctacaccgc cgccctggtg 2580 agcggcaccg ccaccgccgg ctggaccttc ggcgccggcg ccgccctgca gatccccttc 2640 gccatgcaga tggcctaccg gttcaacggc atcggcgtga cccagaacgt gctgtacgag 2700 aaccagaagc agatcgccaa ccagttcaac aaggccatca gccagatcca ggagagcctg 2760 accaccacca gcaccgccct gggcaagctg caggacgtgg tgaaccagaa cgcccaggcc 2820 ctgaacaccc tggtgaagca gctgagcagc aacttcggcg ccatcagcag cgtgctgaac 2880 gacatcctga gccggctgga caaggtggag gccgaggtgc agatcgaccg gctgatcacc 2940 ggccggctgc agagcctgca gacctacgtg acccagcagc tgatccgggc cgccgagatc 3000 cgggccagcg ccaacctggc cgccaccaag atgagcgagt gcgtgctggg ccagagcaag 3060 cgggtggact tctgcggcaa gggctaccac ctgatgagct tcccccaggc cgccccccac 3120 ggcgtggtgt tcctgcacgt gacctacgtg cccagccagg agcggaactt caccaccgcc 3180 cccgccatct gccacgaggg caaggcctac ttcccccggg agggcgtgtt cgtgttcaac 3240 ggcaccagct ggttcatcac ccagcggaac ttcttcagcc cccagatcat caccaccgac 3300 aacaccttcg tgagcggcaa ctgcgacgtg gtgatcggca tcatcaacaa caccgtgtac 3360 gaccccctgc agcccgagct ggacagcttc aaggaggagc tggacaagta cttcaagaac 3420 cacaccagcc ccgacgtgga cctgggcgac atcagcggca tcaacgccag cgtggtgaac 
3480 atccagaagg agatcgaccg gctgaacgag gtggccaaga acctgaacga gagcctgatc 3540 gacctgcagg agctgggcaa gtacgagcag tacatcaagt ggccctggta cgtgtggctg 3600 ggcttcatcg ccggcctgat cgccatcgtg atggtgacca tcctgctgtg ctgcatgacc 3660 agctgctgca gctgcctgaa gggcgcctgc agctgcggca gctgctgcaa gttcgacgag 3720 gacgacagcg agcccgtgct gaagggcgtg aagctgcact acacctga 3768 <210> 23 <211> 1265 <212> PRT <213> Pangolin CoV GD <400> 23 Met Leu Phe Phe Phe Phe Leu His Phe Ala Leu Val Asn Ser Gln Cys 1 5 10 15 Val Asn Leu Thr Gly Arg Ala Ala Ile Gln Pro Ser Phe Thr Asn Ser 20 25 30 Ser Gln Arg Gly Val Tyr Tyr Pro Asp Thr Ile Phe Arg Ser Asn Thr 35 40 45 Leu Val Leu Ser Gln Gly Tyr Phe Leu Pro Phe Tyr Ser Asn Val Ser 50 55 60 Trp Tyr Tyr Ala Leu Thr Lys Thr Asn Ser Ala Glu Lys Arg Val Asp 65 70 75 80 Asn Pro Val Leu Asp Phe Lys Asp Gly Ile Tyr Phe Ala Ala Thr Glu 85 90 95 Lys Ser Asn Ile Val Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Asn 100 105 110 Thr Ser Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Ile Ile 115 120 125 Lys Val Cys Asn Phe Gln Phe Cys Tyr Asp Pro Tyr Leu Ser Gly Tyr 130 135 140 Tyr His Asn Asn Lys Thr Trp Ser Thr Arg Glu Phe Ala Val Tyr Ser 145 150 155 160 Ser Tyr Ala Asn Cys Thr Phe Glu Tyr Val Ser Lys Ser Phe Met Leu 165 170 175 Asp Ile Ala Gly Lys Ser Gly Leu Phe Asp Thr Leu Arg Glu Phe Val 180 185 190 Phe Arg Asn Val Asp Gly Tyr Phe Lys Ile Tyr Ser Lys Tyr Thr Pro 195 200 205 Val Asn Val Asn Ser Asn Leu Pro Ile Gly Phe Ser Ala Leu Glu Pro 210 215 220 Leu Val Glu Ile Pro Ala Gly Ile Asn Ile Thr Lys Phe Arg Thr Leu 225 230 235 240 Leu Thr Ile His Arg Gly Asp Pro Met Pro Asn Asn Gly Trp Thr Val 245 250 255 Phe Ser Ala Ala Tyr Tyr Val Gly Tyr Leu Ala Pro Arg Thr Phe Met 260 265 270 Leu Asn Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala Val Asp Cys Ala 275 280 285 Leu Asp Pro Leu Ser Glu Ala Lys Cys Thr Leu Lys Ser Leu Thr Val 290 295 300 Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val Gln Pro Thr Glu 305 310 315 320 Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys Pro Phe Gly Glu 325 330 
335 Val Phe Asn Ala Thr Thr Phe Ala Ser Val Tyr Ala Trp Asn Arg Lys 340 345 350 Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu Tyr Asn Ser Thr 355 360 365 Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro Thr Lys Leu Asn 370 375 380 Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe Val Val Arg Gly 385 390 395 400 Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly Arg Ile Ala Asp 405 410 415 Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys Val Ile Ala Trp 420 425 430 Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn Tyr Asn Tyr Leu 435 440 445 Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe Glu Arg Asp Ile 450 455 460 Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys Asn Gly Val Glu 465 470 475 480 Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly Phe His Pro Thr 485 490 495 Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val Leu Ser Phe Glu 500 505 510 Leu Leu Asn Ala Pro Ala Thr Val Cys Gly Pro Lys Gln Ser Thr Asn 515 520 525 Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn Gly Leu Thr Gly 530 535 540 Thr Gly Val Leu Thr Glu Ser Ser Lys Lys Phe Leu Pro Phe Gln Gln 545 550 555 560 Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val Arg Asp Pro Gln 565 570 575 Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe Gly Gly Val Ser 580 585 590 Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val Ala Val Leu Tyr 595 600 605 Gln Asp Val Asn Cys Thr Glu Val Pro Val Ala Ile His Ala Asp Gln 610 615 620 Leu Thr Pro Thr Trp Ser Val Tyr Ser Thr Gly Ser Asn Val Phe Gln 625 630 635 640 Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val Asn Asn Ser Tyr 645 650 655 Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala Ser Tyr Gln Thr 660 665 670 Gln Thr Asn Ser Arg Ser Val Ser Ser Gln Ala Ile Ile Ala Tyr Thr 675 680 685 Met Ser Leu Gly Ala Glu Asn Ser Val Ala Tyr Ala Asn Asn Ser Ile 690 695 700 Ala Ile Pro Thr Asn Phe Thr Ile Ser Val Thr Thr Glu Ile Leu Pro 705 710 715 720 Val Ser Met Thr Lys Thr Ser Val Asp Cys Thr Met Tyr Ile Cys Gly 725 730 735 Asp Ser Ile Glu Cys Ser Asn Leu Leu Leu Gln Tyr Gly Ser Phe Cys 740 745 750 
Thr Gln Leu Asn Arg Ala Leu Thr Gly Ile Ala Val Glu Gln Asp Lys 755 760 765 Asn Thr Gln Glu Val Phe Ala Gln Val Lys Gln Ile Tyr Lys Thr Pro 770 775 780 Pro Ile Lys Asp Phe Gly Gly Phe Asn Phe Ser Gln Ile Leu Pro Asp 785 790 795 800 Pro Ser Lys Pro Ser Lys Arg Ser Phe Ile Glu Asp Leu Leu Phe Asn 805 810 815 Lys Val Thr Leu Ala Asp Ala Gly Phe Ile Lys Gln Tyr Gly Asp Cys 820 825 830 Leu Gly Asp Ile Ala Ala Arg Asp Leu Ile Cys Ala Gln Lys Phe Asn 835 840 845 Gly Leu Thr Val Leu Pro Pro Leu Leu Thr Asp Glu Met Ile Ala Gln 850 855 860 Tyr Thr Ser Ala Leu Leu Ala Gly Thr Ile Thr Ser Gly Trp Thr Phe 865 870 875 880 Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe Ala Met Gln Met Ala Tyr 885 890 895 Arg Phe Asn Gly Ile Gly Val Thr Gln Asn Val Leu Tyr Glu Asn Gln 900 905 910 Lys Leu Ile Ala Asn Gln Phe Asn Ser Ala Ile Gly Lys Ile Gln Asp 915 920 925 Ser Leu Ser Ser Thr Ala Ser Ala Leu Gly Lys Leu Gln Asp Val Val 930 935 940 Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu Val Lys Gln Leu Ser Ser 945 950 955 960 Asn Phe Gly Ala Ile Ser Ser Val Leu Asn Asp Ile Leu Ser Arg Leu 965 970 975 Asp Lys Val Glu Ala Glu Val Gln Ile Asp Arg Leu Ile Thr Gly Arg 980 985 990 Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln Gln Leu Ile Arg Ala Ala 995 1000 1005 Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala Thr Lys Met Ser Glu Cys 1010 1015 1020 Val Leu Gly Gln Ser Lys Arg Val Asp Phe Cys Gly Lys Gly Tyr His 1025 1030 1035 1040 Leu Met Ser Phe Pro Gln Ser Ala Pro His Gly Val Val Phe Leu His 1045 1050 1055 Val Thr Tyr Val Pro Ser Gln Glu Lys Asn Phe Thr Thr Thr Pro Ala 1060 1065 1070 Ile Cys His Glu Gly Lys Ala His Phe Pro Arg Glu Gly Val Phe Val 1075 1080 1085 Ser Asn Gly Thr His Trp Phe Val Thr Gln Arg Asn Phe Tyr Glu Pro 1090 1095 1100 Gln Ile Ile Thr Thr Asp Asn Thr Phe Val Ser Gly Ser Cys Asp Val 1105 1110 1115 1120 Val Ile Gly Ile Val Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro Glu 1125 1130 1135 Leu Asp Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His Thr 1140 1145 1150 Ser Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile 
Asn Ala Ser Val 1155 1160 1165 Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu Val Ala Lys Asn 1170 1175 1180 Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu Gly Lys Tyr Glu Gln 1185 1190 1195 1200 Tyr Ile Lys Trp Pro Trp Tyr Ile Trp Leu Gly Phe Ile Ala Gly Leu 1205 1210 1215 Ile Ala Ile Ile Met Val Thr Ile Met Leu Cys Cys Met Thr Ser Cys 1220 1225 1230 Cys Ser Cys Leu Lys Gly Cys Cys Ser Cys Gly Ser Cys Cys Lys Phe 1235 1240 1245 Asp Glu Asp Asp Ser Glu Pro Val Leu Lys Gly Val Lys Leu His Tyr 1250 1255 1260 Thr 1265 <210> 24 <211> 3798 <212> DNA <213> Pangolin CoV GD <400> 24 atgctgttct tcttcttcct gcacttcgcc ctggtgaact cccagtgcgt gaacctgacc 60 ggccgggccg ctatccagcc ctctttcacc aactcctccc agagaggcgt gtactacccc 120 gacaccatct tccgctccaa caccctggtg ctgagccagg gctacttcct gcccttctat 180 agcaatgtga gctggtacta cgccctgacc aagaccaaca gcgctgaaaa gcgcgtggac 240 aaccccgtgc tggacttcaa agacggcatt tacttcgctg ccaccgaaaa atccaacatc 300 gtgagaggct ggatcttcgg aaccaccctg gacaacacct cccagagcct gctgattgtg 360 aacaatgcca ccaacgtgat tatcaaggtg tgcaatttcc agttttgcta cgacccctat 420 ctgagcggct actaccacaa caacaaaacc tggtccacca gagaatttgc cgtgtatagc 480 agctacgcca actgcacctt cgagtacgtg tccaagagct tcatgctgga catcgccggc 540 aagagcggcc tgttcgacac actgagagag tttgtgttca gaaacgtgga cggctacttc 600 aaaatctaca gcaagtacac ccccgtgaac gtgaacagca acctgcccat cggatttagc 660 gccctggagc ccctggtgga gatccccgct ggcatcaaca tcaccaaatt ccgcaccctg 720 ctgaccatcc acagaggcga ccccatgccc aacaacggct ggaccgtgtt ctccgccgcc 780 tactatgtgg gctacctggc ccccagaacc ttcatgctga actacaacga aaacggcacc 840 atcaccgacg ccgtggactg cgccctggac ccactgtctg aggctaagtg taccctgaag 900 agcctgacag tggaaaaagg catctaccag acctccaact tcagagtgca gcctacagaa 960 agcattgtga gatttcccaa catcaccaac ctgtgcccct ttggcgaggt gttcaacgcc 1020 accaccttcg ccagcgtgta cgcctggaac agaaaaagaa tctcaaactg cgtggccgac 1080 tacagcgtgc tgtacaacag cacctccttc tccaccttca agtgctatgg cgtgtccccc 1140 accaagctga acgatctgtg ttttaccaac gtgtacgccg actccttcgt ggtgagaggc 1200 gacgaggtgc 
gccagatcgc ccccggacag accggaagga tcgccgacta taactacaag 1260 ctgcccgacg acttcaccgg ctgcgtgatc gcctggaact ccaacaatct ggactctaag 1320 gtgggcggca actacaacta cctgtacaga ctgttcagaa aaagcaacct gaagcccttc 1380 gaaagagaca tctccacaga gatctaccag gccggctcca ccccctgcaa cggcgtggaa 1440 ggcttcaact gctacttccc cctgcagagc tacggcttcc accccaccaa cggcgtgggc 1500 taccagccct acagagtggt ggtgctgtcc ttcgagctgc tgaacgcccc cgccacagtg 1560 tgcggcccta agcagtccac caacctggtg aaaaacaaat gcgtgaactt caatttcaac 1620 ggactgaccg ggaccggcgt gctgaccgag agctccaaaa agtttctgcc cttccagcag 1680 ttcggcagag acatcgccga caccacagac gccgtgagag acccccagac cctggagatt 1740 ctggacatta caccctgctc cttcggcggc gtgtccgtga ttacccccgg caccaacacc 1800 agcaaccagg tggccgtgct gtaccaggat gtgaactgca ccgaggtgcc tgtggccatc 1860 cacgccgacc agctgacccc cacctggagt gtgtacagca ccggcagcaa cgtgttccag 1920 accagagccg gatgcctgat cggcgccgag cacgtgaaca acagctacga gtgcgacatt 1980 cccatcggcg ccggcatctg cgccagctac cagacccaga caaacagcag aagcgtgagc 2040 agccaggcca tcatcgccta caccatgagc ctgggcgccg agaacagcgt ggcctacgcc 2100 aacaattcca ttgccatccc caccaacttc accatcagcg tgaccacaga aatcctgccc 2160 gtgtccatga ccaagaccag cgtggactgc accatgtata tctgcggcga ttccattgag 2220 tgctccaacc tgctgctgca gtacggcagc ttctgcaccc agctgaaccg cgccctgact 2280 ggcatcgccg tggagcagga caagaacacc caggaggtgt tcgctcaggt gaagcagatc 2340 tacaaaaccc cccccattaa ggacttcggc ggcttcaact tcagccagat cctgcccgat 2400 ccctccaagc ccagcaagag aagcttcatc gaagatctgc tgttcaataa ggtgaccctg 2460 gccgacgccg gcttcatcaa acagtatgga gactgcctgg gagacatcgc cgccagagac 2520 ctgatctgcg cccagaaatt caacggcctg accgtgctgc cccccctgct gacagacgag 2580 atgatcgctc agtacaccag cgccctgctg gccggcacaa tcacctccgg atggactttt 2640 ggcgccggcg ccgctctgca gatccccttt gccatgcaga tggcctaccg gttcaacggc 2700 atcggagtga cccagaacgt gctgtacgaa aatcagaagc tgatcgccaa ccagttcaac 2760 tccgctatcg gcaagatcca ggacagcctg agcagcaccg ccagcgccct gggaaagctg 2820 caggatgtgg tgaaccagaa cgcccaggcc ctgaacaccc tggtgaaaca gctgagcagc 2880 2940 gccgaggtgc 
agatcgacag actgatcacc ggcagactgc agagcctgca gacctacgtg 3000 acccagcagc tgatcagagc cgccgagatt agagccagcg ccaacctggc cgccaccaag 3060 atgagcgaat gcgtgctgggg agagcaag agagtggact tctgcggaaa gggatatcac 3120 ctgatgagct tcccacagag cgccccccac ggagtggtgt tcctgcacgt gacctacgtg 3180 cctagccagg agaagaattt caccacaaca cctgccatct gccacgagg caaggcccac 3240 ttcccccgcg aaggcgtgtt tgtgagcaac ggcaccact ggttcgtgac ccagagaaac 3300 ttctacgagc cccagatcat caccaccgac aataccttcg tgagcggaag ctgtgacgtg 3360 gtgatcggaa tcgtgaacaa taccgtgtac gaccccctgc agcccgaact ggacagcttc 3420 aaagaagc tggacaaata cttcaagaac cacaccagcc cagacgtgga tctgggagat 3480 atcagcggca tcaacgccag cgtggtgaac atccagaagg agatcgacag actgaacgaa 3540 gtggccaaaa acctgaacga gagcctgatc gacctgcagg agctgggaaa gtatgaacag 3600 tacattaagt ggccctggta catctggctg ggcttcattg ccggcctgat cgccatcatt 3660 atggtgacca tcatgctgtg ctgcatgacc agctgctgta gctgcctgaa gggatgctgc 3720 tcctgcggca gctgctgcaa gttcgacgaa gacgactctg agcctgtgct gaagggggtg 3780 aagctgcact acacctga 3798 <210> 25 <211> 1267 <212> PRT <213> Pangolin CoV GX <400> 25 Met Phe Val Phe Leu Phe Val Leu Pro Leu Val Ser Ser Gln Cys Val 1 5 10 15 Asn Leu Thr Thr Arg Thr Gly Ile Pro Pro Gly Tyr Thr Asn Ser Ser 20 25 30 Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Ile Leu 35 40 45 His Leu Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp 50 55 60 Phe Asn Thr Ile Asn Tyr Gln Gly Gly Phe Lys Lys Phe Asp Asn Pro 65 70 75 80 Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Thr Glu Lys Ser 85 90 95 Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ala Arg Thr 100 105 110 Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile Lys Val 115 120 125 Cys Glu Phe Gln Phe Cys Thr Asp Pro Phe Leu Gly Val Tyr Tyr His 130 135 140 Asn Asn Asn Lys Thr Trp Val Glu Asn Glu Phe Arg Val Tyr Ser Ser 145 150 155 160 Ala Asn Asn Cys Thr Phe Glu Tyr Ile Ser Gln Pro Phe Leu Met Asp 165 170 175 Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe Val Phe 180 185 190 Lys Asn Val Asp Gly Tyr Phe Lys 
Ile Tyr Ser Lys His Thr Pro Ile 195 200 205 Asp Leu Val Arg Asp Leu Pro Arg Gly Phe Ala Ala Leu Glu Pro Leu 210 215 220 Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr Leu Leu 225 230 235 240 Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asn Leu Glu Ser Gly Trp 245 250 255 Thr Thr Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Gln Arg Thr 260 265 270 Phe Leu Leu Ser Tyr Asn Gln Asn Gly Thr Ile Thr Asp Ala Val Asp 275 280 285 Cys Ser Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys Ser Leu 290 295 300 Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val Gln Pro 305 310 315 320 Thr Ile Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys Pro Phe 325 330 335 Gly Glu Val Phe Asn Ala Ser Lys Phe Ala Ser Val Tyr Ala Trp Asn 340 345 350 Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu Tyr Asn 355 360 365 Ser Thr Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro Thr Lys 370 375 380 Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe Val Val 385 390 395 400 Lys Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly Val Ile 405 410 415 Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys Val Ile 420 425 430 Ala Trp Asn Ser Val Lys Gln Asp Ala Leu Thr Gly Gly Asn Tyr Gly 435 440 445 Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Lys Leu Lys Pro Phe Glu Arg 450 455 460 Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys Asn Gly 465 470 475 480 Gln Val Gly Leu Asn Cys Tyr Tyr Pro Leu Glu Arg Tyr Gly Phe His 485 490 495 Pro Thr Thr Gly Val Asn Tyr Gln Pro Phe Arg Val Val Val Leu Ser 500 505 510 Phe Glu Leu Leu Asn Gly Pro Ala Thr Val Cys Gly Pro Lys Leu Ser 515 520 525 Thr Thr Leu Val Lys Asp Lys Cys Val Asn Phe Asn Phe Asn Gly Leu 530 535 540 Thr Gly Thr Gly Val Leu Thr Thr Ser Lys Lys Gln Phe Leu Pro Phe 545 550 555 560 Gln Gln Phe Gly Arg Asp Ile Ser Asp Thr Thr Asp Ala Val Arg Asp 565 570 575 Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe Gly Gly 580 585 590 Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val Ala Val 595 600 605 Leu Tyr Gln Asp Val Asn Cys Thr Glu 
Val Pro Met Ala Ile His Ala 610 615 620 Glu Gln Leu Thr Pro Ala Trp Arg Val Tyr Ser Ala Gly Ala Asn Val 625 630 635 640 Phe Gln Thr Arg Ala Gly Cys Leu Val Gly Ala Glu His Val Asn Asn 645 650 655 Ser Tyr Glu Cys Asp Ile Pro Val Gly Ala Gly Ile Cys Ala Ser Tyr 660 665 670 His Ser Met Ser Ser Leu Arg Ser Val Asn Gln Arg Ser Ile Ile Ala 675 680 685 Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser Val Ala Tyr Ser Asn Asn 690 695 700 Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile Ser Val Thr Thr Glu Ile 705 710 715 720 Leu Pro Val Ser Met Thr Lys Thr Ser Val Asp Cys Thr Met Tyr Ile 725 730 735 Cys Gly Asp Ser Ile Glu Cys Ser Asn Leu Leu Leu Gln Tyr Gly Ser 740 745 750 Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr Gly Ile Ala Val Glu Gln 755 760 765 Asp Lys Asn Thr Gln Glu Val Phe Ala Gln Val Lys Gln Ile Tyr Lys 770 775 780 Thr Pro Pro Ile Lys Asp Phe Gly Gly Phe Asn Phe Ser Gln Ile Leu 785 790 795 800 Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser Phe Ile Glu Asp Leu Leu 805 810 815 Phe Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Ile Lys Gln Tyr Gly 820 825 830 Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp Leu Ile Cys Ala Gln Lys 835 840 845 Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr Asp Glu Met Ile 850 855 860 Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly Thr Ile Thr Ser Gly Trp 865 870 875 880 Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe Ala Met Gln Met 885 890 895 Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln Asn Val Leu Tyr Glu 900 905 910 Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn Ser Ala Ile Gly Lys Ile 915 920 925 Gln Asp Ser Leu Ser Ser Thr Ala Ser Ala Leu Gly Lys Leu Gln Asp 930 935 940 Val Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu Val Lys Gln Leu 945 950 955 960 Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn Asp Ile Leu Ser 965 970 975 Arg Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp Arg Leu Ile Thr 980 985 990 Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln Gln Leu Ile Arg 995 1000 1005 Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala Thr Lys Met Ser 1010 1015 1020 Glu Cys Val Leu Gly Gln Ser Lys 
Arg Val Asp Phe Cys Gly Lys Gly 1025 1030 1035 1040 Tyr His Leu Met Ser Phe Pro Gln Ser Ala Pro His Gly Val Val Phe 1045 1050 1055 Leu His Val Thr Tyr Val Pro Ala Gln Glu Lys Asn Phe Thr Thr Ala 1060 1065 1070 Pro Ala Ile Cys His Glu Gly Lys Ala His Phe Pro Arg Glu Gly Val 1075 1080 1085 Phe Val Ser Asn Gly Thr His Trp Phe Ile Thr Gln Arg Asn Phe Tyr 1090 1095 1100 Glu Pro Gln Ile Ile Thr Thr Asp Asn Thr Phe Val Ser Gly Ser Cys 1105 1110 1115 1120 Asp Val Val Ile Gly Ile Val Asn Asn Thr Val Tyr Asp Pro Leu Gln 1125 1130 1135 Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn 1140 1145 1150 His Thr Ser Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn Ala 1155 1160 1165 Ser Val Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu Val Ala 1170 1175 1180 Lys Asn Leu Asn Glu Ser Pro Ile Asp Leu Gln Glu Leu Gly Lys Tyr 1185 1190 1195 1200 Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Ile Trp Leu Gly Phe Ile Ala 1205 1210 1215 Gly Leu Ile Ala Ile Ile Met Val Thr Ile Met Leu Cys Cys Met Thr 1220 1225 1230 Ser Cys Cys Ser Cys Leu Lys Gly Cys Cys Ser Cys Gly Ser Cys Cys 1235 1240 1245 Lys Phe Asp Glu Asp Asp Ser Glu Pro Val Leu Lys Gly Val Lys Leu 1250 1255 1260 His Tyr Thr 1265 <210> 26 <211> 3804 <212> DNA <213> Pangolin CoV GX <400> 26 atgtttgtgt tcctattcgt cctacctt gtgtcatcac aatgcgtgaa ccttacaaca 60 agaacaggaa tccctcctgg atacacaaac tcatcaacaa gaggagtgta ctaccctgac 120 aaggtgttta gatcatcaat ccttcacctt acacaagatc tctttctacc gttcttctcg 180 aacgtgacat ggtttaacac aatcaactac caaggaggat ttaagaagtt cgacaaccct 240 gtgcttcctt tcaatgacgg agtgtacttt gcatcaacag agaagtctaa tatcatcaga 300 ggatggatct ttggaacaac acttgatgca agaacacaat cacttcttat tgtcaataat 360 gctacgaacg tggtgatcaa agtgtgcgaa tttcaattct gtactgatcc tttcctaggc 420 gttactacc acaacaacaa caagacctgg gtggagaatg agttcgtgt atagctcg 480 gcgaacaact gcacatttga atacatctca caacctttct taatggatct tgaaggaaag 540 cagggtaact ttaaaaacct tcgtgaattt gtgtttaaga atgtcgatgg atactttaag 600 atatattcaa agcatactcc aatcgacttg gttcgggatc ttcctagagg atttgcagca 660 cttgaacctc 
ttgtggatct tcctatcgga atcaacatca caagatttca aacttctt 720 gcacttcaca gatcatacct tacacctgga aaccttgaat ctggctggac cacgggcgca 780 gcagcatact acgtgggata ccttcaacaa agaacatttc ttctttcata caaccagaat 840 ggaacgatta cagatgcggt cgactgttca cttgatcctc tttcagaaac aaagtgtact 900 cttaaatcac ttacagtgga gaagggtatt taccaaacat caaactttag agtgcaacct 960 acaatctcaa tcgtgagatt tcctaacatc acaaaccttt gccctttcgg tgaagtcttt 1020 aacgcatcaa agttcgcgtc agtgtacgca tggaacagaa agcgcatttc aaactgcgtg 1080 gcagattact cagtgcttta caactcaaca tcattcagca cctttaaatg ctacggagtg 1140 tcacctacaa agttaaatga tctttgcttt acaaacgtgt acgcagattc atttgtggtg 1200 aaaggagatg aagtgagaca aatcgcacct ggacaaacag gagtgatcgc agattacaac 1260 tacaaacttc ctgatgattt cactgggtgc gtgatcgcat ggaactcagt gaaacaagat 1320 gccctgactg gtggcaacta tggttattta tatcgcctct ttcggaagag taagctcaaa 1380 cctttcgagc gggacataag caccgagatc taccaagcag gatcaacacc ttgcaacgga 1440 caagtgggac ttaactgcta ctaccctctt gaaagatacg gatttcaccc tacaacagga 1500 gtgaactacc aacctttccg tgtcgtgtg cttcatttg aactttta cggacctgca 1560 acagtgtgcg gacctaact ttcacaacg ctcgttaagg acagtgcgt gaactttaac 1620 ttcaatggtt tgacggtac tggagtgctt acacatca agaagcagtt tctcccattc 1680 cagcaatttg gacgcgatat ctcagatacg acggacgccg tcgagaccc tcaacactt 1740 gaaatccttg atatcacacc tgctcattt ggaggtgt cagtgatcac acctggaaca 1800 aacacatcaa accaagtgc agtgctttac siagatgtga actgcacaga agtgcctatg 1860 gcaatccacg cagaacaac tacacctgca tggcgcgtat attcggctgg tgcaacgtg 1920 tttcaacaa gagcaggatg ccttgtggga gcagaacacg tgaacactc atacgaatgc 1980 gatatccctg tgggagcagg aatctgcgca tcataccact caatgtcatc acttagatca 2040 gtgaaccaaa gatcaatcat cgcatacaca atgtcacttg gagcagaaaa ctcagtggca 2100 tactcaaca actcaatcgc aatccctaca aactttacaa tctcagtgac aacagagatt 2160 ctcccagtgt caatgacaaa gacttccgtg gattgcacaa tgtacatctg cggattca 2220 atcgaatgct caaaccttct tcttcaatac ggatcattct gtactcaact taacagagcc 2280 ctaacgggca tagctgtgga acaagataag aatacgcaag aagtgtttgc acaagtgaaa 2340 caaatctaca agactccgcc tatcaaagat ttcggcggtt 
tcaatttcag tcagatatta 2400 ccagacccta gcaaaccgag caagcgctca tttatcgaag atctgctctt taataaggtg 2460 acacttgcag atgcaggatt tatcaaacaa tacggagatt gcctcggcga catagccgcc 2520 agagatctta tctgcgcaca gaagtttaac ggcctgaccg tattgcctcc tcttcttaca 2580 gatgaaatga tcgcacaata cacatcagca cttcttgcag ggacgataac tagcggatgg 2640 actttcggtg cgggcgcagc acttcaaatc cctttcgcga tgcaaatggc atacagattt 2700 aacggaatcg gagtgacaca gaatgtactt tacgagaatc agaaactcat tgcaaaccaa 2760 tttaactcag caatcggaaa gattcaggat tcactttcat caacagcatc agcacttgga 2820 aagctccagg atgtggtgaa ccagaatgcg caagcactta acacgttggt taagcagcta 2880 tcatcaaact ttggagcaat ctcatcagtg cttaacgata tcctttcaag acttgataaa 2940 gtggaagcag aagtgcagat tgaccgtcta attacgggaa gacttcaatc acttcaaaca 3000 tacgttaccc agcagttaat aagagcagca gaaatcagag catcagcaaa ccttgcagca 3060 acaaagatgt cggaatgcgt gcttggacaa tcaaagaggg tagatttctg tggcaagggc 3120 taccatctta tgtcatttcc tcaatcagca cctcacggag tggtgtttct tcacgtgaca 3180 tacgtgcctg cacaagagaa gaatttcaca acagcacctg caatctgcca cgaaggaaag 3240 gctcatttcc cgcgagaagg agtgtttgtg tcaaacggaa cacactggtt tatcacacaa 3300 agaaacttct atgagcctca aatcatcaca acagataaca catttgtgtc aggatcatgc 3360 gatgtggtga tcggaatagt taacaacaca gtatatgacc ctcttcaacc tgaacttgat 3420 tcatttaaag aagaacttga taaatacttt aaaaaccaca catcacctga tgtggatttg 3480 ggagatattt cagggataaa cgcatcagtg gtgaacatcc agaaggaaat cgaccgactc 3540 aatgaggtgg caaagaatct aaacgaaagc ccgattgacc tccaggagct tggaaagtat 3600 gagcaataca tcaaatggcc ttggtacatc tggcttggat ttatcgcagg acttatcgca 3660 atcatcatgg tgacaatcat gctttgctgc atgacatcat gttgttcgtg tctcaaggga 3720 tgttgttctt gtggtagttg ctgcaaattt gatgaagagatg attcagaacc tgtgcttaaa 3780 ggagtgaaac flickering atga 3804 <210> 27 <211> 1255 <212> PRT <213> Bat CoV <400> 27 Met Phe Ile Phe Leu Thr Leu Thr Ser Gly Ser Asp Leu 1 5 10 15 Glu Ser Cys Thr Thr Phe Asp Asp Val Gln Ala Pro Asn Tyr Pro Gln 20 25 30 His Ser Ser Ser Arg Gly Val Tyr Tyr Pro Asp Glu Ile Phe Arg 35 40 45 Ser Asp Thr Leu Tyr Leu Thr Gln Asp Leu Phe Leu 
Pro Phe Tyr Ser 50 55 60 Asn Val Thr Gly Phe His Thr Ile Asn His Arg Phe Asp Asn Pro Val 65 70 75 80 Ile Pro Phe Lys Asp Gly Val Tyr Phe Ala Ala Thr Glu Lys Ser Asn 85 90 95 Val Val Arg Gly Trp Val Phe Gly Ser Thr Met Asn Asn Lys Ser Gln 100 105 110 Ser Val Ile Ile Ile Asn Asn Ser Thr Asn Val Val Ile Arg Ala Cys 115 120 125 Asn Phe Glu Leu Cys Asp Asn Pro Phe Phe Ala Val Ser Lys Pro Thr 130 135 140 Gly Thr Gln Thr His Thr Met Ile Phe Asp Asn Ala Phe Asn Cys Thr 145 150 155 160 Phe Glu Tyr Ile Ser Asp Ser Phe Ser Leu Asp Val Ala Glu Lys Ser 165 170 175 Gly Asn Phe Lys His Leu Arg Glu Phe Val Phe Lys Asn Lys Asp Gly 180 185 190 Phe Leu Tyr Val Tyr Lys Gly Tyr Gln Pro Ile Asp Val Val Arg Asp 195 200 205 Leu Pro Ser Gly Phe Asn Ile Leu Lys Pro Ile Phe Lys Leu Pro Leu 210 215 220 Gly Ile Asn Ile Thr Asn Phe Arg Ala Ile Leu Thr Ala Phe Leu Pro 225 230 235 240 Ala Gln Asp Thr Trp Gly Thr Ser Ala Ala Ala Tyr Phe Val Gly Tyr 245 250 255 Leu Lys Pro Ala Thr Phe Met Leu Lys Tyr Asp Glu Asn Gly Thr Ile 260 265 270 Thr Asp Ala Val Asp Cys Ser Gln Asn Pro Leu Ala Glu Leu Lys Cys 275 280 285 Ser Val Lys Ser Phe Glu Ile Asp Lys Gly Ile Tyr Gln Thr Ser Asn 290 295 300 Phe Arg Val Ala Pro Ser Lys Glu Val Val Arg Phe Pro Asn Ile Thr 305 310 315 320 Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Thr Phe Pro Ser 325 330 335 Val Tyr Ala Trp Glu Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr 340 345 350 Ser Val Leu Tyr Asn Ser Thr Ser Phe Ser Thr Phe Lys Cys Tyr Gly 355 360 365 Val Ser Ala Thr Lys Leu Asn Asp Leu Cys Phe Ser Asn Val Tyr Ala 370 375 380 Asp Ser Phe Val Val Lys Gly Asp Asp Val Arg Gln Ile Ala Pro Gly 385 390 395 400 Gln Thr Gly Val Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe 405 410 415 Thr Gly Cys Val Leu Ala Trp Asn Thr Arg Asn Ile Asp Ala Thr Gln 420 425 430 Thr Gly Asn Tyr Asn Tyr Lys Tyr Arg Ser Leu Arg His Gly Lys Leu 435 440 445 Arg Pro Phe Glu Arg Asp Ile Ser Asn Val Pro Phe Ser Pro Asp Gly 450 455 460 Lys Pro Cys Thr Pro Pro Ala Phe Asn Cys Tyr Trp Pro Leu Asn 
Asp 465 470 475 480 Tyr Gly Phe Tyr Ile Thr Asn Gly Ile Gly Tyr Gln Pro Tyr Arg Val 485 490 495 Val Val Leu Ser Phe Glu Leu Leu Asn Ala Pro Ala Thr Val Cys Gly 500 505 510 Pro Lys Leu Ser Thr Asp Leu Ile Lys Asn Gln Cys Val Asn Phe Asn 515 520 525 Phe Asn Gly Leu Thr Gly Thr Gly Val Leu Thr Pro Ser Ser Lys Arg 530 535 540 Phe Gln Pro Phe Gln Gln Phe Gly Arg Asp Val Leu Asp Phe Thr Asp 545 550 555 560 Ser Val Arg Asp Pro Lys Thr Ser Glu Ile Leu Asp Ile Ser Pro Cys 565 570 575 Ser Phe Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Ser 580 585 590 Glu Val Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Asp Val Pro Val 595 600 605 Ala Ile His Ala Asp Gln Leu Thr Pro Ser Trp Arg Val Tyr Ser Thr 610 615 620 Gly Asn Asn Val Phe Gln Thr Gln Ala Gly Cys Leu Ile Gly Ala Glu 625 630 635 640 His Val Asp Thr Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile 645 650 655 Cys Ala Ser Tyr His Thr Val Ser Ser Leu Arg Ser Thr Ser Gln Lys 660 665 670 Ser Ile Val Ala Tyr Thr Met Ser Leu Gly Ala Asp Ser Ser Ile Ala 675 680 685 Tyr Ser Asn Asn Thr Ile Ala Ile Pro Thr Asn Phe Ser Ile Ser Ile 690 695 700 Thr Thr Glu Val Met Pro Val Ser Met Ala Lys Thr Ser Val Asp Cys 705 710 715 720 Asn Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ala Asn Leu Leu Leu 725 730 735 Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Ser Gly Ile 740 745 750 Ala Val Glu Gln Asp Arg Asn Thr Arg Glu Val Phe Ala Gln Val Lys 755 760 765 Gln Met Tyr Lys Thr Pro Thr Leu Lys Asp Phe Gly Gly Phe Asn Phe 770 775 780 Ser Gln Ile Leu Pro Asp Pro Leu Lys Pro Thr Lys Arg Ser Phe Ile 785 790 795 800 Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Met 805 810 815 Lys Gln Tyr Gly Glu Cys Leu Gly Asp Ile Asn Ala Arg Asp Leu Ile 820 825 830 Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr 835 840 845 Asp Asp Met Ile Ala Ala Tyr Thr Ala Ala Leu Val Ser Gly Thr Ala 850 855 860 Thr Ala Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe 865 870 875 880 Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln 
Asn 885 890 895 Val Leu Tyr Glu Asn Gln Lys Gln Ile Ala Asn Gln Phe Asn Lys Ala 900 905 910 Ile Ser Gln Ile Gln Glu Ser Leu Thr Thr Thr Ser Thr Ala Leu Gly 915 920 925 Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu 930 935 940 Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn 945 950 955 960 Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp 965 970 975 Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln 980 985 990 Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala 995 1000 1005 Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val Asp Phe 1010 1015 1020 Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ala Ala Pro His 1025 1030 1035 1040 Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ser Gln Glu Arg Asn 1045 1050 1055 Phe Thr Thr Ala Pro Ala Ile Cys His Glu Gly Lys Ala Tyr Phe Pro 1060 1065 1070 Arg Glu Gly Val Phe Val Phe Asn Gly Thr Ser Trp Phe Ile Thr Gln 1075 1080 1085 Arg Asn Phe Phe Ser Pro Gln Ile Ile Thr Thr Asp Asn Thr Phe Val 1090 1095 1100 Ser Gly Ser Cys Asp Val Val Ile Gly Ile Ile Asn Asn Thr Val Tyr 1105 1110 1115 1120 Asp Pro Leu Gln Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu Asp Lys 1125 1130 1135 Tyr Phe Lys Asn His Thr Ser Pro Asp Val Asp Leu Gly Asp Ile Ser 1140 1145 1150 Gly Ile Asn Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp Arg Leu 1155 1160 1165 Asn Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu 1170 1175 1180 Leu Gly Lys Tyr Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Val Trp Leu 1185 1190 1195 1200 Gly Phe Ile Gly Leu Ile Ile Val Met Val Thr Ile Leu Leu 1205 1210 1215 Cys Cys Met Thr Ser Cys Ser Cys Leu Lys Gly Ala Cys Ser Cys 1220 1225 1230 Gly Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro Val Leu Lys 1235 1240 1245 Gly Val Lys His Tyr Thr 1250 1255 <210> 28 <211> 3768 <212> DNA <213> Bat CoV <400> 28 atgttcatct ttctgttctt cctcactg actagcggtt ccgatctgga gagctgcact 60 accttgacg acgtccaagc tcccattac cccacact ccagctcccg tcgtggtgtc 120 tattaccccg acgaatctt ccgctccgac 
actctgtatc tgactcaaga cctctccctc 180 cctttttaca gcaacgtcac tggctccac accatcaatc accgctttga caacccgtg 240 atccccttca aggacggcgt gtacttcgct gccaccgaa aatccaacgt cgtgcgcggc 300 tgggtcttcg gctccaccat gaacaacag tcccaatccg tcattatcat caacaactcc 360 actaacgtcg tcatccgcgc ttgtaacttc gagctgtgcg ataacccctt cttcgctgtc 420 agcaaaccta ctggtaccca aactcatacc atgatcttcg aaacgcttt taactgtact 480 540 catctgcgcg agttcgtctt tagaataag gatggctttc tgtacgtgta caagggttat 600 caacccatcg acgtcgtccg tgacctcccc agcggcttca atattctcaa acccatcttt 660 aagctgcctc tgggcatcaa catcaccaac ttccgtgcca tcctcactgc ctttctgccc 720 gctcaagata cttggggcac ctccgccgct gcttacttcg tgggttatct gaaacccgcc 780 acttttatgc tgaagtacga tgagaatggt actatcactg acgccgtgga ctgttcccag 840 aacccctcg ctgagctcaa gtgcagcgtc aagtccttcg agatcgacaa gggcatctac 900 caaacctcca atttccgtgt cgctccttcc aaagaggtgg tgcgcttccc caacatcact 960 aacctctgtc cctttggcga agtcttcaat gccaccacct ttccttccgt ctacgcttgg 1020 gagcgcaagc gtatttccaa ctgcgtcgct gactactccg tgctgtacaa tagcacttcc 1080 ttcagcactt ttaagtgtta cggcgtgtcc gctactaagc tcaacgatct gtgcttttcc 1140 aacgtgtacg ccgactcctt cgtggtgaag ggtgatgacg tccgtcagat cgctcccggc 1200 caaactggtg tgattgctga ctacaattac aagctgcccg acgatttcac cggctgtgtc 1260 ctcgcttgga acacccgtaa catcgacgct acccagaccg gcaactacaa ctacaagtat 1320 cgctctctgc gccatggtaa actccgtccc ttcgagcgcg atattagcaa cgtccctttt 1380 agccccgatg gtaagccttg cacccctccc gctttcaact gctattggcc tctgaacgac 1440 tacggtttct atatcactaa cggcatcggt taccagccct accgtgtcgt cgtgctgagc 1500 tttgaactgc tgaatgcccc cgctaccgtg tgcggcccta agctctccac cgatctgatc 1560 aagaaccagt gcgtgaattt taactttaat ggtctgactg gcaccggcgt gctgactcct 1620 agctccaagc gcttccagcc ctttcagcaa ttcggccgtg acgtgctgga cttcactgac 1680 tccgtgcgcg accccaagac ttccgaaatt ctggacatct ccccttgctc ctttggcggc 1740 gtctccgtca ttactcccgg taccaatact agcagcgaag tggctgtgct gtaccaagac 1800 gtgaactgca ctgacgtgcc cgtcgctatt cacgccgacc aactcactcc cagctggcgt 1860 gtgtatagca ctggcaacaa cgtcttccag acccaagctg gctgtctgat 
tggcgctgaa 1920 catgtcgaca cttcctacga gtgcgacatc cccatcggcg ctggtatctg tgcctcctat 1980 cacaccgtct ccagcctccg ctccaccagc cagaagtcca tcgtggctta cactatgtcc 2040 ctcggcgctg actccagcat cgcctactcc aacaacacta tcgccatccc caccaacttc 2100 tccatttcca tcactaccga agtcatgccc gtgagcatgg ccaagacctc cgtggactgt 2160 aacatgtata tctgtggtga ctccaccgag tgcgctaatc tgctgctgca gtatggttcc 2220 ttctgtaccc agctcaaccg cgccctcagc ggcattgctg tcgaacaaga ccgcaatacc 2280 cgtgaagtgt tcgcccaagt gaaacagatg tacaagaccc ctaccctcaa ggactttggc 2340 ggcttcaact tctcccagat tctgcccgac cctctgaagc ctaccaagcg ttccttcatc 2400 gaggatctgc tgttcaataa ggtcactctg gctgacgctg gctttatgaa gcagtatggc 2460 gagtgtctgg gcgacattaa cgcccgcgat ctcatttgcg ctcagaagtt caacggtctc 2520 accgtgctcc ctcctctgct caccgatgac atgatcgccg cttacactgc tgctctggtg 2580 agcggtactg ctaccgctgg ctggactttc ggcgctggtg ctgctctgca aatccctttt 2640 gccatgcaga tggcctaccg cttcaacggt atcggcgtca cccagaacgt cctctacgag 2700 aatcagaagc agattgctaa ccaatttaac aaagctattt cccaaatcca agagtccctc 2760 actaccacta gcactgccct cggcaagctc caagacgtgg tgaaccagaa cgcccaagcc 2820 ctcaacactc tggtgaagca actgagcagc aatttcggcg ccattagcag cgtgctcaat 2880 gacattctgt cccgcctcga caaagtcgag gccgaggtgc agattgaccg tctgatcact 2940 ggtcgtctgc aatccctcca gacctacgtc actcagcagc tgatccgtgc cgctgagatt 3000 cgcgcctccg ccaatctggc cgctactaaa atgagcgagt gcgtcctcgg ccaatccaag 3060 cgcgtcgact tctgtggcaa gggctatcat ctgatgagct ttcctcaagc tgccccccat 3120 ggcgtggtgt ttctgcacgt gacctatgtc ccctcccaag aacgcaactt cactaccgct 3180 cccgctatct gtcacgaggg taaggcttat tttccccgtg agggcgtgtt cgtgttcaat 3240 ggcaccagct ggttcatcac ccaacgtaac ttcttctccc ctcagattat caccaccgac 3300 aatacttttg tctccggcag ctgcgatgtc gtgatcggca tcatcaataa tactgtctat 3360 gatcctctgc agcccgagct cgattccttc aaggaggagc tcgataaata cttcaagaac 3420 cacacctccc ccgacgtcga cctcggtgat attagcggca tcaacgctag cgtggtgaac 3480 atccagaaag agatcgaccg cctcaacgag gtggccaaaa atctgaatga gtctctgatc 3540 gatctgcaag agctcggtaa gtacgagcag tacatcaagt ggccttggta cgtgtggctg 
3600 ggcttcatcg ccggactgat cgccatcgtg atggtgacca ttctgctgtg ctgcatgacc 3660 agctgctgca gctgtctgaa gggcgcttgc agctgcggca gctgctgcaa gttcgacgag 3720 gacgacagcg agcccgtgct gaagggcgtg aagctgcact acacatga 3768 <210> 29 <211> 1278 <212> PRT <213> Bat CoV <400> 29 Met Phe Leu Leu Thr Thr Lys Arg Thr Met Phe Val Phe Leu Val Leu 1 5 10 15 Leu Pro Leu Val Ser Ser Gln Cys Val Asn Leu Thr Thr Arg Thr Gln 20 25 30 Leu Pro Pro Ala Tyr Thr Asn Ser Ser Thr Arg Gly Val Tyr Tyr Pro 35 40 45 Asp Lys Val Phe Arg Ser Ser Val Leu His Leu Thr Gln Asp Leu Phe 50 55 60 Leu Pro Phe Phe Ser Asn Val Thr Trp Phe His Ala Ile His Val Ser 65 70 75 80 Gly Thr Asn Gly Ile Lys Arg Phe Asp Asn Pro Val Leu Pro Phe Asn 85 90 95 Asp Gly Val Tyr Phe Ala Ser Thr Glu Lys Ser Asn Ile Ile Arg Gly 100 105 110 Trp Ile Phe Gly Thr Thr Leu Asp Ser Lys Thr Gln Ser Leu Leu Ile 115 120 125 Val Asn Asn Ala Thr Asn Val Val Ile Lys Val Cys Glu Phe Gln Phe 130 135 140 Cys Asn Asp Pro Phe Leu Gly Val Tyr Tyr His Lys Asn Asn Lys Ser 145 150 155 160 Trp Met Glu Ser Glu Phe Arg Val Tyr Ser Ser Ala Asn Asn Cys Thr 165 170 175 Phe Glu Tyr Val Ser Gln Pro Phe Leu Met Asp Leu Glu Gly Lys Gln 180 185 190 Gly Asn Phe Lys Asn Leu Arg Glu Phe Val Phe Lys Asn Ile Asp Gly 195 200 205 Tyr Phe Lys Ile Tyr Ser Lys His Thr Pro Ile Asn Leu Val Arg Asp 210 215 220 Leu Pro Pro Gly Phe Ser Ala Leu Glu Pro Leu Val Asp Leu Pro Ile 225 230 235 240 Gly Ile Asn Ile Thr Arg Phe Gln Thr Leu Leu Ala Leu His Arg Ser 245 250 255 Tyr Leu Thr Pro Gly Asp Ser Ser Ser Gly Trp Thr Ala Gly Ala Ala 260 265 270 Ala Tyr Tyr Val Gly Tyr Leu Gln Pro Arg Thr Phe Leu Leu Lys Tyr 275 280 285 Asn Glu Asn Gly Thr Ile Thr Asp Ala Val Asp Cys Ala Leu Asp Pro 290 295 300 Leu Ser Glu Thr Lys Cys Thr Leu Lys Ser Phe Thr Val Glu Lys Gly 305 310 315 320 Ile Tyr Gln Thr Ser Asn Phe Arg Val Gln Pro Thr Asp Ser Ile Val 325 330 335 Arg Phe Pro Asn Ile Thr Asn Leu Cys Pro Phe Gly Glu Val Phe Asn 340 345 350 Ala Thr Thr Phe Ala Ser Val Tyr Ala Trp Asn Arg Lys Arg Ile Ser 355 360 
365 Asn Cys Val Ala Asp Tyr Ser Val Leu Tyr Asn Ser Thr Ser Phe Ser 370 375 380 Thr Phe Lys Cys Tyr Gly Val Ser Pro Thr Lys Leu Asn Asp Leu Cys 385 390 395 400 Phe Thr Asn Val Tyr Ala Asp Ser Phe Val Ile Thr Gly Asp Glu Val 405 410 415 Arg Gln Ile Ala Pro Gly Gln Thr Gly Lys Ile Ala Asp Tyr Asn Tyr 420 425 430 Lys Leu Pro Asp Asp Phe Thr Gly Cys Val Ile Ala Trp Asn Ser Lys 435 440 445 His Ile Asp Ala Lys Glu Gly Gly Asn Phe Asn Tyr Leu Tyr Arg Leu 450 455 460 Phe Arg Lys Ala Asn Leu Lys Pro Phe Glu Arg Asp Ile Ser Thr Glu 465 470 475 480 Ile Tyr Gln Ala Gly Ser Lys Pro Cys Asn Gly Gln Thr Gly Leu Asn 485 490 495 Cys Tyr Tyr Pro Leu Tyr Arg Tyr Gly Phe Tyr Pro Thr Asp Gly Val 500 505 510 Gly His Gln Pro Tyr Arg Val Val Val Leu Ser Phe Glu Leu Leu Asn 515 520 525 Ala Pro Ala Thr Val Cys Gly Pro Lys Lys Ser Thr Asn Leu Val Lys 530 535 540 Asn Lys Cys Val Asn Phe Asn Phe Asn Gly Leu Thr Gly Thr Gly Val 545 550 555 560 Leu Thr Glu Ser Asn Lys Lys Phe Leu Pro Phe Gln Gln Phe Gly Arg 565 570 575 Asp Ile Ala Asp Thr Thr Asp Ala Val Arg Asp Pro Gln Thr Leu Glu 580 585 590 Ile Leu Asp Ile Thr Pro Cys Ser Phe Gly Gly Val Ser Val Ile Thr 595 600 605 Pro Gly Thr Asn Ala Ser Asn Gln Val Ala Val Leu Tyr Gln Asp Val 610 615 620 Asn Cys Thr Glu Val Pro Val Ala Ile His Ala Asp Gln Leu Thr Pro 625 630 635 640 Thr Trp Arg Val Tyr Ser Thr Gly Ser Asn Val Phe Gln Thr Arg Ala 645 650 655 Gly Cys Leu Ile Gly Ala Glu His Val Asn Asn Ser Tyr Glu Cys Asp 660 665 670 Ile Pro Ile Gly Ala Gly Ile Cys Ala Ser Tyr Gln Thr Gln Thr Asn 675 680 685 Ser Arg Ser Val Ala Ser Gln Ser Ile Ile Ala Tyr Thr Met Ser Leu 690 695 700 Gly Ala Glu Asn Ser Val Ala Tyr Ser Asn Asn Ser Ile Ala Ile Pro 705 710 715 720 Thr Asn Phe Thr Ile Ser Val Thr Thr Glu Ile Leu Pro Val Ser Met 725 730 735 Thr Lys Thr Ser Val Asp Cys Thr Met Tyr Ile Cys Gly Asp Ser Thr 740 745 750 Glu Cys Ser Asn Leu Leu Leu Gln Tyr Gly Ser Phe Cys Thr Gln Leu 755 760 765 Asn Arg Ala Leu Thr Gly Ile Ala Val Glu Gln Asp Lys Asn Thr Gln 770 775 780 
Glu Val Phe Ala Gln Val Lys Gln Ile Tyr Lys Thr Pro Pro Ile Lys 785 790 795 800 Asp Phe Gly Gly Phe Asn Phe Ser Gln Ile Leu Pro Asp Pro Ser Lys 805 810 815 Pro Ser Lys Arg Ser Phe Ile Glu Asp Leu Leu Phe Asn Lys Val Thr 820 825 830 Leu Ala Asp Ala Gly Phe Ile Lys Gln Tyr Gly Asp Cys Leu Gly Asp 835 840 845 Ile Ala Ala Arg Asp Leu Ile Cys Ala Gln Lys Phe Asn Gly Leu Thr 850 855 860 Val Leu Pro Pro Leu Leu Thr Asp Glu Met Ile Ala Gln Tyr Thr Ser 865 870 875 880 Ala Leu Leu Ala Gly Thr Ile Thr Ser Gly Trp Thr Phe Gly Ala Gly 885 890 895 Ala Ala Leu Gln Ile Pro Phe Ala Met Gln Met Ala Tyr Arg Phe Asn 900 905 910 Gly Ile Gly Val Thr Gln Asn Val Leu Tyr Glu Asn Gln Lys Leu Ile 915 920 925 Ala Asn Gln Phe Asn Ser Ala Ile Gly Lys Ile Gln Asp Ser Leu Ser 930 935 940 Ser Thr Ala Ser Ala Leu Gly Lys Leu Gln Asp Val Val Asn Gln Asn 945 950 955 960 Ala Gln Ala Leu Asn Thr Leu Val Lys Gln Leu Ser Ser Asn Phe Gly 965 970 975 Ala Ile Ser Ser Val Leu Asn Asp Ile Leu Ser Arg Leu Asp Lys Val 980 985 990 Glu Ala Glu Val Gln Ile Asp Arg Leu Ile Thr Gly Arg Leu Gln Ser 995 1000 1005 Leu Gln Thr Tyr Val Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile Arg 1010 1015 1020 Ala Ser Ala Asn Leu Ala Ala Thr Lys Met Ser Glu Cys Val Leu Gly 1025 1030 1035 1040 Gln Ser Lys Arg Val Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser 1045 1050 1055 Phe Pro Gln Ser Ala Pro His Gly Val Val Phe Leu His Val Thr Tyr 1060 1065 1070 Val Pro Ala Gln Glu Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His 1075 1080 1085 Asp Gly Lys Ala His Phe Pro Arg Glu Gly Val Phe Val Ser Asn Gly 1090 1095 1100 Thr His Trp Phe Val Thr Gln Arg Asn Phe Tyr Glu Pro Gln Ile Ile 1105 1110 1115 1120 Thr Thr Asp Asn Thr Phe Val Ser Gly Ser Cys Asp Val Val Ile Gly 1125 1130 1135 Ile Val Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro Glu Leu Asp Ser 1140 1145 1150 Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His Thr Ser Pro Asp 1155 1160 1165 Val Asp Leu Gly Asp Ile Ser Gly Ile Asn Ala Ser Val Val Asn Ile 1170 1175 1180 Gln Lys Glu Ile Asp Arg Leu Asn Glu Val 
Ala Lys Asn Leu Asn Glu 1185 1190 1195 1200 Ser Leu Ile Asp Leu Gln Glu Leu Gly Lys Tyr Glu Gln Tyr Ile Lys 1205 1210 1215 Trp Pro Trp Tyr Ile Trp Leu Gly Phe Ile Ala Gly Leu Ile Ala Ile 1220 1225 1230 Ile Met Val Thr Ile Met Leu Cys Cys Met Thr Ser Cys Cys Ser Cys 1235 1240 1245 Leu Lys Gly Cys Cys Ser Cys Gly Ser Cys Cys Lys Phe Asp Glu Asp 1250 1255 1260 Asp Ser Glu Pro Val Leu Lys Gly Val Lys Leu His Tyr Thr 1265 1270 1275 <210> 30 <211> 3837 <212> DNA <213> Bat CoV <400> 30 atgttcctgc tgaccaccaa gcgcaccatg ttcgtgttcc tggtgctgct gcccctggtg 60 agcagccagt gcgtgaacct gaccacccgc acccagctgc cccccgccta caccaacagc 120 agcacccgcg gcgtgtacta ccccgacaag gtgttccgca gcagcgtgct gcacctgacc 180 caggacctgt tcctgccctt cttcagcaac gtgacctggt tccacgccat ccacgtgagc 240 ggcaccaacg gcatcaagcg cttcgacaac cccgtgctgc ccttcaacga cggcgtgtac 300 ttcgccagca ccgagaagag caacatcatc cgcggctgga tcttcggcac caccctggac 360 agcaagaccc agagcctgct gatcgtgaac aacgccacca acgtggtgat caaggtgtgc 420 gagttccagt tctgcaacga ccccttcctg ggcgtgtact accacaagaa caacaagagc 480 tggatggaga gcgagttccg cgtgtacagc agcgccaaca actgcacctt cgagtacgtg 540 agccagccct tcctgatgga cctggagggc aagcagggca acttcaagaa cctgcgcgag 600 ttcgtgttca agaacatcga cggctacttc aagatctaca gcaagcacac ccccatcaac 660 ctggtgcgcg acctgccccc cggcttcagc gccctggagc ccctggtgga cctgcccatc 720 ggcatcaaca tcacccgctt ccagaccctg ctggccctgc accgcagcta cctgaccccc 780 ggcgacagca gcagcggctg gaccgccggc gccgccgcct actacgtggg ctacctgcag 840 ccccgcacct tcctgctgaa gtacaacgag aacggcacca tcaccgacgc cgtggactgc 900 gccctggacc ccctgagcga gaccaagtgc accctgaaga gcttcaccgt ggagaagggc 960 atctaccaga ccagcaactt ccgcgtgcag cccaccgaca gcatcgtgcg cttccccaac 1020 atcaccaacc tgtgcccctt cggcgaggtg ttcaacgcca ccaccttcgc cagcgtgtac 1080 gcctggaacc gcaagcgcat cagcaactgc gtggccgact acagcgtgct gtacaacagc 1140 accagcttca gcaccttcaa gtgctacggc gtgagcccca ccaagctgaa cgacctgtgc 1200 ttcaccaacg tgtacgccga cagcttcgtg atcaccggcg acgaggtgcg ccagatcgcc 1260 cccggccaga ccggcaagat cgccgactac 
aactacaagc tgcccgacga cttcaccggc 1320 tgcgtgatcg cctggaacag caagcacatc gacgccaagg agggcggcaa cttcaactac 1380 ctgtaccgcc tgttccgcaa ggccaacctg aagcccttcg agcgcgacat cagcaccgag 1440 atctaccagg ccggcagcaa gccctgcaac ggccagaccg gcctgaactg ctactacccc 1500 ctgtaccgct acggcttcta ccccaccgac ggcgtgggcc accagcccta ccgcgtggtg 1560 gtgctgagct tcgagctgct gaacgccccc gccaccgtgt gcggccccaa gaagagcacc 1620 aacctggtga agaacaagtg cgtgaacttc aacttcaacg gcctgaccgg caccggcgtg 1680 ctgaccgaga gcaacaagaa gttcctgccc ttccagcagt tcggccgcga catcgccgac 1740 accaccgacg ccgtgcgcga cccccagacc ctggagatcc tggacatcac cccctgcagc 1800 ttcggcggcg tgagcgtgat cacccccggc accaacgcca gcaaccaggt ggccgtgctg 1860 taccaggacg tgaactgcac cgaggtgccc gtggccatcc acgccgacca gctgaccccc 1920 acctggcgcg tgtacagcac cggcagcaac gtgttccaga cccgcgccgg ctgcctgatc 1980 ggcgccgagc acgtgaacaa cagctacgag tgcgacatcc ccatcggcgc cggcatctgc 2040 gccagctacc agacccagac caacagccgc agcgtggcca gccagagcat catcgcctac 2100 accatgagcc tgggcgccga gaacagcgtg gcctacagca acaacagcat cgccatcccc 2160 accaacttca ccatcagcgt gaccaccgag atcctgcccg tgagcatgac caagaccagc 2220 gtggactgca ccatgtacat ctgcggcgac agcaccgagt gcagcaacct gctgctgcag 2280 tacggcagct tctgcaccca gctgaaccgc gccctgaccg gcatcgccgt ggagcaggac 2340 aagaacaccc aggaggtgtt cgcccaggtg aagcagatct acaagacccc ccccatcaag 2400 gacttcggcg gcttcaactt cagccagatc ctgcccgacc ccagcaagcc cagcaagcgc 2460 agcttcatcg aggacctgct gttcaacaag gtgaccctgg ccgacgccgg cttcatcaag 2520 cagtacggcg actgcctggg cgacatcgcc gcccgcgacc tgatctgcgc ccagaagttc 2580 aacggcctga ccgtgctgcc ccccctgctg accgacgaga tgatcgccca gtacaccagc 2640 gccctgctgg ccggcaccat caccagcggc tggaccttcg gcgccggcgc cgccctgcag 2700 atccccttcg ccatgcagat ggcctaccgc ttcaacggca tcggcgtgac ccagaacgtg 2760 2820 gacagcctga gcagcaccgc cagcgccctg ggcaagctgc aggacgtggt gaaccagaac 2880 gcccaggccc tgaacaccct ggtgaagcag ctgagcagca acttcggcgc catcagcagc 2940 gtgctgaacg acatcctgag ccgcctggac aaggtggagg ccgaggtgca gatcgaccgc 3000 ctgatcaccg gccgcctgca gagcctgcag 
acctacgtga cccagcagct gatccgcgcc 3060 gccgagatcc gcgccagcgc caacctggcc gccaccaaga tgagcgagtg cgtgctggggc 3120 cagagcaagc gcgtggactt ctgcggcaag ggctaccacc tgatgagctt cccccagagc 3180 gccccccacg gcgtggtgtt cctgcacgtg acctacgtgc ccgcccagga gaagaacttc 3240 accaccgcccc ccgccatctg ccacgacggc aaggcccact tcccccgga gggcgtgttc 3300 gtgagcaacg gcaccactg gttcgtgacc cagcgcaact tctacgagcc ccagatcatc 3360 accaccgaca acaccttcgt gagcggcagc tgcgacgtgg tgatcggcat cgtgaacaac 3420 accgtgtacg accccctgca gcccgagctg gacagcttca aggaggagct ggacaagtac 3480 ttcaagaacc acaccagccc cgacgtggac ctgggcgaca tcagcggcat caacgccagc gtggtgaaca tccagaagga gatcgaccgc ctgaacgagg tggccaaga cctgaacgag agcctgatcg acctgcagga gctgggcaag tacgagcagt acatcaagtg gccctggtac 3660. atctggctgg gcttcatcgc cggcctgatc gccatcatca tggtgaccat catgctgtgc 3720 3780. tgcatgacca gctgctgcag ctgcctgaag ggctgctgca gctgcggcag ctgctgcaag ttcgacgagg acgacagcga gcccgtgctg aagggcgtga agctgcacta cacctga 3837 <210> 31 <211> 107 <212> DNA <213> Artificial Sequence <400> 31 atgggatggt catgtatcat cctttttcta gtagcaactg caacctgtgt acattcagcg gccgcggagg tggaggtagt gctagccatc accatcacca tcactaa <210> 32 <211> 232 <212> PRT <213> Artificial Sequence <400> 32 Glu Pro Lys Ser Cys Asp Lys Thr His Thr Cys Pro Pro Cys Pro Ala 1 5 10 15 Pro Glu Leu Leu Gly Gly Pro Ser Val Phe Leu Phe Pro Pro Lys Pro 20 25 30 Lys Asp Thr Leu Met Ile Ser Arg Thr Pro Glu Val Thr Cys Val Val 35 40 45 Val Asp Val Ser His Glu Asp Pro Glu Val Lys Phe Asn Trp Tyr Val 50 55 60 Asp Gly Val Glu Val His Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln 65 70 75 80 Tyr Asn Ser Thr Tyr Arg Val Val Ser Val Leu Thr Val Leu His Gln 85 90 95 Asp Trp Leu Asn Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn Lys Ala 100 105 110 Leu Pro Ala Pro Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro 115 120 125 Arg Glu Pro Gln Val Tyr Thr Leu Pro Pro Ser Arg Glu Glu Met Thr 130 135 140 Lys Asn Gln Val Ser Leu Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser 145 150 155 160 Asp Ile Ala Val Glu Trp Glu Ser Asn Gly Gln Pro Glu 
Asn Asn Tyr 165 170 175 Lys Thr Thr Pro Pro Val Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr 180 185 190 Ser Lys Leu Thr Val Asp Lys Ser Arg Trp Gln Gln Gly Asn Val Phe 195 200 205 Ser Cys Ser Val Met His Glu Ala Leu His Asn His Tyr Thr Gln Lys 210 215 220 Ser Leu Ser Leu Ser Pro Gly Lys 225 230
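As a reading aid (not part of the patent disclosure), the flattened protein listing above can be checked mechanically. The following minimal sketch strips the interleaved position counters from a three-letter listing, converts it to one-letter code, and compares the length with the declared <211> value. The names `AA3_TO_1`, `three_to_one`, and `length_matches` are illustrative assumptions; the sample string is the first 16 residues of SEQ ID NO: 32 as printed above.

```python
# Standard three-letter to one-letter codes for the 20 canonical amino acids.
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing: str) -> str:
    """Convert a flattened three-letter listing to one-letter code.

    Purely numeric tokens (position counters such as "1 5 10 15") are
    skipped. Any other unrecognized token raises ValueError so that
    extraction damage is noticed rather than silently dropped.
    """
    residues = []
    for token in listing.split():
        if token.isdigit():
            continue  # position counter, not a residue
        if token not in AA3_TO_1:
            raise ValueError(f"unrecognized token: {token!r}")
        residues.append(AA3_TO_1[token])
    return "".join(residues)


def length_matches(sequence: str, declared: int) -> bool:
    """Return True if the sequence length equals the declared <211> value."""
    return len(sequence) == declared


# First 16 residues of SEQ ID NO: 32, copied from the listing with counters.
sample = ("Glu Pro Lys Ser Cys Asp Lys Thr His Thr Cys Pro Pro Cys Pro Ala "
          "1 5 10 15")
fc_start = three_to_one(sample)
print(fc_start)
print(length_matches(fc_start, 16))
```

Applied to a complete entry, the same length check would use the entry's own <211> value (232 for SEQ ID NO: 32); a mismatch would point to residues lost or duplicated during extraction.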
Claims
1. A nanobody that binds SARS-CoV-2, the amino acid sequence of which is shown in SEQ ID NO: 8.
2. A fusion protein that binds SARS-CoV-2, composed of the following segments from the N-terminus to the C-terminus: the protein segment shown in SEQ ID NO: 8, a linker peptide, and a human Fc; the amino acid sequence of the human Fc is shown in SEQ ID NO: 32.
3. A gene encoding the nanobody of claim 1 or the fusion protein of claim 2.
4. The use of the nanobody of claim 1 in the preparation of a medicament for inhibiting coronaviruses; wherein the coronavirus is SARS-CoV-2 or SARS-CoV-1.
5. A medicament for inhibiting coronaviruses, wherein the active ingredient is the nanobody of claim 1 or the fusion protein of claim 2.
6. The use of the nanobody of claim 1 in the preparation of a medicament for neutralizing coronaviruses; wherein the coronavirus is SARS-CoV-2 or SARS-CoV-1.
7. A medicament for neutralizing coronaviruses, wherein the active ingredient is the nanobody of claim 1 or the fusion protein of claim 2.
8. The use of the nanobody of claim 1 in the preparation of a medicament for treating diseases caused by coronaviruses; wherein the coronavirus is SARS-CoV-2 or SARS-CoV-1.
9. A medicament for treating diseases caused by coronaviruses, wherein the active ingredient is the nanobody of claim 1 or the fusion protein of claim 2.
10. Use of the fusion protein of claim 2 in the preparation of a medicament for inhibiting coronaviruses; wherein the coronavirus is SARS-CoV-2, SARS-CoV-1, Pangolin CoV GD, Pangolin CoV GX, Bat CoV WIV16, or Bat CoV RaTG13.
11. Use of the fusion protein of claim 2 in the preparation of a medicament for neutralizing coronaviruses; wherein the coronavirus is SARS-CoV-2, SARS-CoV-1, Pangolin CoV GD, Pangolin CoV GX, Bat CoV WIV16, or Bat CoV RaTG13.
12. Use of the fusion protein of claim 2 in the preparation of a medicament for treating diseases caused by coronaviruses; wherein the coronavirus is SARS-CoV-2, SARS-CoV-1, Pangolin CoV GD, Pangolin CoV GX, Bat CoV WIV16, or Bat CoV RaTG13.
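The N-terminus to C-terminus segment order recited in claim 2 can be illustrated as ordered concatenation. This is a minimal sketch under stated assumptions, not the applicants' construct: `NANOBODY_PLACEHOLDER` is a hypothetical stand-in for SEQ ID NO: 8, `LINKER_EXAMPLE` is a hypothetical flexible linker (the claim does not specify the linker sequence), and `FC_START` is only the first 16 residues of SEQ ID NO: 32 in one-letter code. The names `Segment` and `assemble_fusion` are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    sequence: str  # one-letter amino acid code


def assemble_fusion(segments):
    """Concatenate segments in N-to-C order.

    Returns (sequence, boundaries), where boundaries maps each segment
    name to its 1-based inclusive (start, end) position in the fusion.
    """
    parts = []
    boundaries = {}
    pos = 0
    for seg in segments:
        if not seg.sequence:
            raise ValueError(f"segment {seg.name!r} is empty")
        boundaries[seg.name] = (pos + 1, pos + len(seg.sequence))
        parts.append(seg.sequence)
        pos += len(seg.sequence)
    return "".join(parts), boundaries


# Hypothetical stand-ins; none of these is the full claimed sequence.
NANOBODY_PLACEHOLDER = "X" * 10    # placeholder for SEQ ID NO: 8
LINKER_EXAMPLE = "GGGGS"           # assumed linker, not from the claims
FC_START = "EPKSCDKTHTCPPCPA"      # first 16 residues of SEQ ID NO: 32

fusion, bounds = assemble_fusion([
    Segment("nanobody", NANOBODY_PLACEHOLDER),
    Segment("linker", LINKER_EXAMPLE),
    Segment("human_fc", FC_START),
])
print(fusion)
print(bounds)
```

Recording the segment boundaries alongside the concatenated sequence makes it straightforward to confirm that the nanobody occupies the N-terminal position and the Fc the C-terminal position, as the claim requires.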