CD147-S protein complex and use thereof

The three-dimensional structure of the CD147-S protein complex was determined by cryo-electron microscopy, which resolves the previously unclear CD147-S protein interaction, provides a drug design method, and enables the development of effective and safe inhibitory drugs against SARS-CoV-2.

CN122255293A · Pending · Publication Date: 2026-06-23 · FOURTH MILITARY MEDICAL UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
FOURTH MILITARY MEDICAL UNIVERSITY
Filing Date
2026-01-19
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

In the prior art, the structure of the CD147-S protein complex is unclear and direct evidence for how to block the interaction between CD147 and the SARS-CoV-2 S protein is insufficient. This limits the development of antiviral drugs, which in particular show poor inhibition of infection in target cells with low or no ACE2 expression.

Method used

The three-dimensional structure of the complex formed by the CD147 extracellular region and the SARS-CoV-2 S protein trimer was determined by cryo-electron microscopy, revealing the interaction interface and active site between CD147 and the S protein. This provides a computer-aided drug design method to screen for antibodies, peptides, proteins, or small molecules that can bind to the CD147 extracellular region and inhibit or activate the activity of the CD147 protein.

Benefits of technology

The three-dimensional structure of the CD147-S protein complex was clarified, providing a basis for the development of drugs that target SARS-CoV-2 invasion of host cells. Blocking this interaction effectively inhibits viral infection, especially in target cells with low or no ACE2 expression, reduces mortality and viral load in critically ill patients, and shows good safety.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

The present application relates to a CD147-S protein complex and its application. The complex comprises a trimer composed of three novel coronavirus (SARS-CoV-2) S proteins with identical amino acid sequences and one CD147 extracellular region. The application includes drug resistance, screening and synthesis of antibodies, protein substances, peptide substances or small molecule substances for inhibiting SARS-CoV-2 infection based on the structure of the complex.

Description

Technical Field

[0001] This invention relates to the structure and application of the CD147-S protein complex, a complex formed by the spike protein (S protein) of the novel coronavirus (SARS-CoV-2) and the extracellular region (CD147ECD) of the immunoglobulin superfamily molecule CD147.

Background Technology

[0002] Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is an enveloped, single-stranded, positive-sense RNA virus belonging to the β-coronavirus genus. COVID-19 is a disease caused by the SARS-CoV-2 coronavirus. It typically spreads among close contacts. Common symptoms include fever, cough, fatigue, shortness of breath, loss of taste and smell, and muscle aches. The time from infection to symptom onset is usually 1 to 14 days, and one-third of infected individuals are asymptomatic. Most patients with obvious symptoms (81%) experience mild to moderate symptoms (most commonly mild pneumonia), while 14% develop severe symptoms (dyspnea, hypoxia, or radiographic involvement of more than 50% of the lungs), and 5% develop critical symptoms (respiratory failure, shock, or multiple organ failure). Elderly individuals or those with underlying medical conditions are at higher risk of developing severe symptoms. Some people experience a range of effects for months after recovery, and organ damage has been observed.

[0003] The pathogen of this disease, SARS-CoV-2, mainly infects host cells by binding its spike protein (S protein) to receptors on the surface of host cells.

[0004] The SARS-CoV-2 genome is nearly 30 kb and encodes a total of 29 viral proteins. Among them, the envelope protein (E protein), membrane protein (M protein), nucleocapsid protein (N protein), and spike protein (S protein) are important structural proteins of the virus, involved in functions such as viral genome transcription and replication, viral particle packaging, viral counteraction of host-cell antiviral responses, and viral invasion. The S protein is the core protein by which the virus recognizes and binds target-cell receptors and invades the cell, and it is the preferred target for the development of antiviral therapeutics and vaccines.

[0005] CD147, an immunoadhesion molecule, is a widely expressed transmembrane glycoprotein with a molecular weight of 50-60 kDa. It is a potential adhesion molecule that shares functional similarities with N-CAM, I-CAM, and other related IgSF subsets, participating in cell-to-cell and cell-to-matrix adhesion. Furthermore, previous studies have shown that CD147 serves as a universal receptor for SARS-CoV-2 S protein infection of host cells, mediating the infection of host cells by circulating strains and various mutant strains of the novel coronavirus [1,2].

[0006] CD147 can serve as a novel target for antiviral drug development. In vitro experiments have validated the safety of the corresponding antibody drug, and the development of this receptor-blocking drug is not affected by viral mutations. The research team was the first to propose that CD147 is a receptor through which SARS-CoV-2 invades human host cells, and that the CD147-S protein interaction represents a crucial new viral infection pathway independent of ACE2. Therefore, for the development of specific antiviral drugs to treat COVID-19, CD147 is a more promising target than ACE2, which has a positive protective function.

[0007] Previously, the inventors confirmed the interaction between CD147 and the SARS-CoV-2 S protein using methods such as SPR, Co-IP, ELISA, and negative-stain electron microscopy. They also observed the co-localization of CD147 and the S protein in SARS-CoV-2-infected cells and in lung and kidney tissues from COVID-19 patients using immunoelectron microscopy. In vitro experiments showed that knocking down CD147 in target cells, or blocking CD147 with the anti-CD147 antibody meplazumab, significantly inhibited viral replication. Overexpression of human CD147 in hamster cells enabled SARS-CoV-2 infection, a process that could be blocked by CD147 antibodies. Animal experiments showed that SARS-CoV-2 could be detected in the lung tissue of transgenic CD147 humanized mice (hCD147 mice), while wild-type mice could not be infected [1,2]. These results suggest that SARS-CoV-2 invades host cells via the CD147-S protein pathway; however, direct evidence for the CD147-S protein complex (CD147-Spike protein complex) has not yet been obtained, and its structure remains unclear.

[0008] The aforementioned preliminary studies have demonstrated that the interaction between the CD147 molecule and the S protein is a novel pathway for viral invasion of target cells. Meplazumab is a humanized IgG2 antibody recombinantly expressed in CHO cells, with an affinity constant (KD) of 1.7 × 10⁻¹⁰ M. In vitro studies have shown that meplazumab significantly inhibits virus-induced cytopathic effects, with an EC50 of 24.86 μg/mL; it also significantly inhibits viral replication, with an IC50 of 15.16 μg/mL. Human clinical studies have shown that no significant adverse reactions were observed after intravenous infusion of 0.56 mg/kg of meplazumab, indicating good safety characteristics. Meplazumab injection is a therapeutic humanized antibody drug targeting CD147. It inhibits viral infection of target cells by blocking the interaction between CD147 molecules and the viral S protein. By blocking this universal receptor, meplazumab can effectively inhibit infection of host cells by circulating SARS-CoV-2 strains and variants such as Alpha, Beta, Gamma, and Delta. In particular, it inhibits the infection of target cells with low or no ACE2 expression. In addition, it inhibits local and systemic inflammatory responses by blocking the interaction between CD147 and CyPA. The team has conducted a series of phase I, II, and III international and domestic multicenter clinical studies. The results show that meplazumab injection can effectively reduce the mortality rate of critically ill patients, shorten hospital stay, and reduce viral load, while showing good safety in human use [3,4].

[0009] References:

[0010] 1. Wang K, Chen W, Zhang Z, Deng Y, Lian JQ, Du P, Wei D, Zhang Y, Sun XX, Gong L, Yang L, Yang XM, Zhao Z, Sun S, Gu H, Wang Z, Wang CF, Lu Y, Liu YY, Wang QY, Bian H, Zhu P, Chen ZN. CD147-spike protein is a novel route for SARS-CoV-2 infection to host cells. Signal Transduct Target Ther. 2020;5(1):283.
2. Geng J, Chen L, Yuan Y, Wang K, Wang Y, Qin C, Wu G, Chen R, Zhang Z, Wei D, Du P, Zhang J, Lin P, Zhang K, Deng Y, Xu K, Liu J, Sun X, Guo T, Yang X, Wu J, Jiang J, Li L, Zhang K, Wang Z, Zhang J, Yan Q, Zhu H, Zheng Z, Miao J, Fu X, Yang F, Chen X, Tang H, Zhang Y, Shi Y, Zhu Y, Pei Z, Huo F, Liang X, Wang Y, Wang Q, Xie W, Li Y, Shi M, Bian H, Zhu P, Chen ZN. CD147 antibody specifically and effectively inhibits infection and cytokine storm of SARS-CoV-2 and its variants delta, alpha, beta, and gamma. Signal Transduct Target Ther. 2021;6(1):347.
3. Bian H, Zheng ZH, Wei D, Wen A, Zhang Z, Lian JQ, Kang WZ, Hao CQ, Wang J, Xie RH, Dong K, Xia JL, Miao JL, Kang W, Li G, Zhang D, Zhang M, Sun XX, Ding L, Zhang K, Jia J, Ding J, Li Z, Jia Y, Liu LN, Zhang Z, Gao ZW, Du H, Yao N, Wang Q, Wang K, Geng JJ, Wang B, Guo T, Chen R, Zhu YM, Wang LJ, He Q, Yao RR, Shi Y, Yang XM, Zhou JS, Ma YN, Wang YT, Liang X, Huo F, Wang Z, Zhang Y, Yang X, Zhang Y, Gao LH, Wang L, Chen XC, Tang H, Liu SS, Wang QY, Chen ZN, Zhu P. Safety and efficacy of meplazumab in healthy volunteers and COVID-19 patients: a randomized phase 1 and an exploratory phase 2 trial. Signal Transduct Target Ther. 2021;6(1):194.
4. Bian H, Chen L, Zheng ZH, Sun XX, Geng JJ, Chen R, Wang K, Yang X, Chen SR, Chen SY, Xie RH, Zhang K, Miao JL, Jia JF, Tang H, Liu SS, Shi HW, Yang Y, Chen CCC, Dusilek C, Rivabem L, Cavalcante AJW, Lopes SS, Saporito WF, Fucci FJC, Simon-Campos JA, Wang L, Liu LN, Wang QY, Wei D, Zhang Z, Chen ZN, Zhu P. Meplazumab in hospitalized adults with severe COVID-19 (DEFLECT): a multicenter, seamless phase 2/3, randomized, third-party double-blind clinical trial. Signal Transduct Target Ther. 2023;8(1):46.

Summary of the Invention

Based on the inventors' new discovery, this application provides a CD147-S protein complex comprising a trimer composed of three SARS-CoV-2 S proteins with identical amino acid sequences and a CD147 extracellular region. The SARS-CoV-2 S protein described herein has the protein sequence of SEQ ID NO:1; or is a protein having at least 95% amino acid sequence identity with SEQ ID NO:1; or is a protein sequence having one or more amino acid substitutions, deletions, or replacements relative to SEQ ID NO:1. In the trimer, the RBD of one SARS-CoV-2 S protein is in an upward, open conformation and binds the extracellular region of CD147, while the RBDs of the other two SARS-CoV-2 S proteins are in a downward, closed conformation. The amino acid residues at the interface between the SARS-CoV-2 S protein and the CD147 extracellular region include CD147 residues R54, E92, E84, Q100, and S112, and SARS-CoV-2 S protein residues G413, K424, K417, Y489, and G447.

[0011] Furthermore, at the gold-standard Fourier shell correlation (FSC) threshold of 0.143, the density map of the CD147-S protein complex has a resolution of 3.75 Å, with Ramachandran plot statistics of 2.9% and side-chain statistics of 3.7%. Furthermore, the recognition motif for the CD147 extracellular region on the SARS-CoV-2 S protein is buried at the interface between protomers in the closed conformation of the S protein trimer; on binding the CD147 extracellular region, the RBD of the SARS-CoV-2 S protein rotates outward by 32.3°, exposing the receptor-binding motif.
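The 32.3° value describes a rigid-body rotation of the RBD between the closed and open states. The application does not state how the angle was measured; as an illustrative sketch only, the angle encoded by a rotation matrix (for example, one obtained by superposing the two RBD conformations) can be recovered from its trace. The function names and the toy rotation about the z axis are assumptions made for this example.

```python
import math

def rotation_angle_deg(rotation):
    """Return the rotation angle in degrees encoded by a 3x3 rotation matrix."""
    # For a proper rotation R, trace(R) = 1 + 2*cos(theta).
    trace = rotation[0][0] + rotation[1][1] + rotation[2][2]
    # Clamp to [-1, 1] to absorb floating-point round-off before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rotation_about_z(angle_deg):
    """Toy rotation about the z axis; a stand-in for a fitted RBD rotation."""
    t = math.radians(angle_deg)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]

if __name__ == "__main__":
    angle = rotation_angle_deg(rotation_about_z(32.3))
    print(f"rotation angle: {angle:.1f} degrees")
```

The trace formula is independent of the rotation axis, so the same function applies to any superposition matrix, not only the z-axis toy case used here.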

[0012] Furthermore, the linker loop between the two Ig domains of the bound CD147 spans the β6 strand of the SARS-CoV-2 S protein RBD, forming a concave binding surface that can accommodate CD147.

[0013] Furthermore, the SARS-CoV-2 S protein RBD domain has the amino acid sequence of SEQ ID NO: 2; or is a protein with at least 95% amino acid sequence identity with SEQ ID NO: 2; or is a protein sequence with one or more amino acid substitutions, deletions, or replacements relative to SEQ ID NO: 2.

[0014] Furthermore, mutations occur at one or more of the G413, K424, K417, Y489, and G447 sites in the RBD domain of the SARS-CoV-2 S protein.

[0015] Furthermore, the SARS-CoV-2 S protein RBD domain has undergone mutations at one or more of the following sites: A. the G413A mutation on the RBD of the S protein; B. the K424A mutation on the RBD of the S protein; C. the K417A mutation on the RBD of the S protein; D. the Y489A mutation on the RBD of the S protein; E. the G447A mutation on the RBD of the S protein.
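Mutation labels such as K417A follow the usual wild-type residue, position, replacement notation. The sketch below, offered only as an illustration, applies such a label to a sequence string and checks that the wild-type residue matches. The function name, the `offset` parameter, and the toy fragment are assumptions; the fragment is not SEQ ID NO: 2.

```python
import re

def apply_mutation(sequence, mutation, offset=1):
    """Apply a point mutation such as 'K417A' to a one-letter amino acid sequence.

    offset is the residue number of the first character of `sequence`.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - offset
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]

if __name__ == "__main__":
    # Toy fragment numbered from residue 413 (hypothetical, NOT SEQ ID NO: 2):
    # G at 413, K at 417, K at 424, serine filler elsewhere.
    toy_fragment = "GSSSKSSSSSSK"
    mutant = toy_fragment
    for label in ("G413A", "K417A", "K424A"):
        mutant = apply_mutation(mutant, label, offset=413)
    print(mutant)
```

Checking the wild-type residue before substituting catches numbering mismatches, which are a common source of error when a construct does not start at residue 1.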

[0016] The present invention also provides a method for preparing the above-mentioned protein complex, the method comprising the following steps: A. expressing the protein in eukaryotic FreeStyle 293F cells after transient transfection with polyethyleneimine; B. purifying the S protein from the filtered cell supernatant using Anti-Flag affinity resin; C. further purifying the S protein by gel filtration chromatography; D. preparing a complex of the CD147 extracellular region protein and the column-purified novel coronavirus S protein.

[0017] This invention also provides a structure-based, computer-aided method for designing bioactive substances, the method comprising: a. providing a CD147-S protein complex, wherein the complex is the aforementioned protein complex; b. using the structural information of the complex to design or screen antibodies, peptides, proteins, small molecules, or compounds; c. synthesizing the antibodies, peptides, proteins, or small molecules identified in step b; d. evaluating the biological activity of the antibodies, peptides, proteins, or small molecules synthesized in step c.

[0018] Optionally, the screening is performed in one or more chemical compound databases. Further, the method includes modeling the interaction of compounds identified in the screening step with the complex model. Optionally, the design includes targeted drug design or random drug design, to screen for compounds predicted to bind to the three-dimensional structure of the CD147 extracellular functional epitope, or for compounds that bind to the CD147 protein and inhibit or activate its activity.
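One simple way to picture the screening step is to rank candidates by how many of the CD147 interface residues named in this application (R54, E92, E84, Q100, S112) a predicted binding pose contacts. The following is a minimal sketch under stated assumptions: the `Candidate` type, the candidate names, the contact sets, and the 0.4 coverage cutoff are all hypothetical, and the contact sets would in practice come from a docking or modeling tool that is not shown.

```python
from dataclasses import dataclass

# CD147 interface residues listed in this application.
CD147_INTERFACE = frozenset({"R54", "E92", "E84", "Q100", "S112"})

@dataclass(frozen=True)
class Candidate:
    name: str                      # hypothetical identifier
    predicted_contacts: frozenset  # CD147 residues contacted in a predicted pose

def interface_coverage(candidate):
    """Fraction of the listed interface residues that the candidate contacts."""
    shared = candidate.predicted_contacts & CD147_INTERFACE
    return len(shared) / len(CD147_INTERFACE)

def rank_candidates(candidates, min_coverage=0.4):
    """Keep candidates meeting the coverage cutoff, best coverage first."""
    kept = [c for c in candidates if interface_coverage(c) >= min_coverage]
    return sorted(kept, key=interface_coverage, reverse=True)

if __name__ == "__main__":
    # Invented contact sets for illustration only.
    library = [
        Candidate("cand_A", frozenset({"R54", "E92", "Q100"})),
        Candidate("cand_B", frozenset({"S112", "K36"})),
        Candidate("cand_C", frozenset({"R54", "E92", "E84", "Q100"})),
    ]
    for hit in rank_candidates(library):
        print(hit.name, interface_coverage(hit))
```

Coverage of interface residues is only a coarse filter; the synthesis and biological evaluation steps of the method remain necessary to confirm activity.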

[0019] This invention also relates to the application of the above-mentioned protein complex as a target in the preparation of anti-COVID-19 drugs.

[0020] This invention relates to the cryo-electron microscopy electron density of the CD147-S protein complex, a complex formed by the extracellular region of the immune adhesion molecule CD147 and the SARS-CoV-2 spike protein, as well as the active site of action of CD147 and the SARS-CoV-2 S protein obtained from this cryo-electron microscopy structure, and the applications of these structures and active sites. This invention elucidates the three-dimensional structure and active site of the CD147 extracellular region-SARS-CoV-2 spike protein complex, and provides a three-dimensional model and structural basis for drug design targeting this active site for SARS-CoV-2 invasion of host cells.

Description of the Drawings

[0021] Figure 1 shows the exon / intron structure of the CD147 gene.

[0022] Figure 2 shows the induced expression of CD147 in Escherichia coli Origami B (DE3).

[0023] Figure 3 is a schematic diagram of the full length of the SARS-CoV-2 S protein (Wuhan-Hu-1, GenBank: MN908947).

[0024] Figure 4 shows the electrophoretic identification results of the S protein.

[0025] Figure 5 shows the cryo-electron microscopy (cryo-EM) data processing flow and structural analysis of the CD147-SARS-CoV-2 S protein complex; A is an example electron micrograph of the protein complex; B is a representative two-dimensional classification diagram of the unbound S protein and the CD147-S protein complex; C is a representative two-dimensional classification diagram of the CD147-S protein complex; D is a flowchart of the cryo-EM data processing for the CD147-S protein complex, including particle selection, classification, and density reconstruction.

[0026] Figure 6 shows the structure and interaction interface analysis of the CD147-S protein complex; A is the electron density map of the CD147-S protein complex; B is the three-dimensional structure of the CD147-S protein; C is the RBD interaction model of the CD147-S protein, with key amino acid residues at the binding interface labeled.

[0027] Figure 7 shows the interaction between CD147 and the S protein and its mutants as determined by SPR; the upper figure shows the affinity determination of SARS-CoV-2 S protein RBD with wild-type CD147 and CD147 mutants, and the lower figure shows the affinity determination of CD147 with wild-type SARS-CoV-2 S protein RBD and RBD mutants.

[0028] Figure 8 shows the viral load, measured by TaqMan real-time quantitative PCR, after cell invasion mediated by CD147 and its mutants. Figure 9 is a diagram illustrating the interaction pattern between meplazumab, CD147, and the SARS-CoV-2 S protein.

Detailed Implementation

[0029] Unless otherwise specified, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art.

[0030] This invention covers the following: the three-dimensional structure of a complex formed by the extracellular region of CD147 and the SARS-CoV-2 spike protein trimer; the active sites for interaction between CD147 and the SARS-CoV-2 spike protein obtained from the structure; a drug design method based on this structure; antibodies, protein substances, peptide substances, and small molecule substances identified by these methods; and the application of these substances in therapy and diagnosis.

[0031] It should be noted that the terms "homologue," "fragment," "variant," "analogue," and "derivative," as used in this invention and in the claims, refer to molecules formed by substituting, mutating, altering, replacing, deleting, or adding one or more amino acid residues in the amino acid sequence of the CD147-S protein complex provided by this invention, or to molecules formed by selecting a segment of that amino acid sequence.

[0032] The term "protein three-dimensional structure" as used in this article refers to the three-dimensional structure of a protein determined by its amino acid sequence under certain conditions; that is, the three-dimensional structure formed by the folding of a protein with an amino acid sequence under certain conditions. The structure of a protein can be determined using X-ray diffraction, NMR, or cryo-electron microscopy.

[0033] In this invention, "variant" refers to the addition, deletion, or substitution of amino acid residues in the wild-type CD147 extracellular region, the novel coronavirus spike protein S protein sequence, or fragments of the CD147 extracellular region and the novel coronavirus spike protein S protein sequence. The amino acid sequence variants of the CD147-S protein complex of this invention may include (1) deletion of any one or more amino acid residues in the sequence, and (2) substitution of any one or more amino acid residues in the sequence with one or more amino acid residues.

[0034] In this invention, "homologous" refers to similarity in amino acid sequence. Specifically, the amino acid sequence of the CD147-S protein complex in this invention is compared with other amino acid sequences. If at least 70% of the amino acid residues are identical, the sequences are considered homologous. The comparison can be performed visually or using computer software such as CLUSTAL.
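The 70% criterion can be made concrete with a percent-identity calculation over two sequences that have already been aligned (for example by CLUSTAL). The sketch below is an assumption-laden illustration: the application does not define the denominator, and this version counts only aligned columns in which neither sequence has a gap. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns (one of several possible conventions)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

def is_homologous(aligned_a, aligned_b, threshold=70.0):
    """Apply the identity threshold stated in the text (70% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold

if __name__ == "__main__":
    # Toy aligned sequences, invented for illustration.
    seq_a = "MKTAYIAKQR"
    seq_b = "MKTAYLAKQR"
    print(percent_identity(seq_a, seq_b), is_homologous(seq_a, seq_b))
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values when gaps are present, so the convention should be stated alongside any reported percentage.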

[0035] In this invention, "fragment" refers to any segment of the amino acid sequence of the CD147-S protein complex provided by this invention, and this segment can be used for structural studies.

[0036] In this invention, "analog" refers to a molecule whose amino acid sequence is similar to that of the CD147-S protein complex provided in this invention, with amino acid substitutions or deletions that do not impair its ability to form the complex.

[0037] In this invention, "derivative" refers to the amino acid sequence formed by chemically modifying the amino acid sequence of the CD147-S protein complex provided by this invention, such as by substituting hydrogen on alkyl, acyl, or amino groups.

[0038] The CD147-S protein complex described in this invention refers not only to the naturally occurring CD147 molecule structure or the complex structure formed by freezing a wild-type CD147 molecule and the novel coronavirus spike protein S protein, but also to the atomic structure model formed by a mutant of the wild-type CD147 molecule that has the same three-dimensional structure as the wild-type CD147 molecule. Mutants of the wild-type CD147 molecule can be: replacing at least one amino acid residue in the wild-type CD147 molecule, adding / deleting amino acids in the peptide chain of the wild-type CD147 molecule, or adding / deleting amino acids at the N-terminus or C-terminus of the wild-type CD147 molecule peptide chain.

[0039] The CD147-S protein complex described in this invention also includes a complex structure formed by a mutant of the spike protein S protein having the same three-dimensional structure as the novel coronavirus spike protein S protein and a CD147 molecule. The novel coronavirus spike protein S protein mutant may be formed by: replacing at least one amino acid residue in the spike protein S protein molecule, adding / deleting amino acids in the light chain and / or heavy chain of the S protein molecule, or adding / deleting amino acids at the N-terminus or C-terminus of the spike protein S protein molecule.

[0040] In one embodiment, the CD147-S protein complex map is refined using Bayesian polishing and automated refinement to a final overall density map resolution of 3.75 Å, with the resolution determined using the gold-standard Fourier shell correlation (FSC) criterion of 0.143. Additionally, this invention provides a method for preparing the CD147-S protein complex, comprising the following steps: cloning the CD147 extracellular region gene sequence into the pET21a(+) prokaryotic expression vector, expressing it in the Origami B(DE3) strain, and purifying the CD147 extracellular region protein; and expressing and purifying the novel coronavirus S protein and its RBD. For the full-length S protein, the 2019-nCoV S gene (GenBank: MN908947, residues 1-1208) was selected, with proline substitutions at residues 986 and 987 and the furin cleavage site (residues 682-685) replaced with "GSAS". A Twin-Strep-tag and a 6×His tag were added to the C-terminus, and the construct was cloned into the vector pcDNA3.4. For RBD expression, residues 319-591 of the S protein were selected, and the cloning vector was pET32a. Some of the expressed mutant S protein RBDs were obtained from GenScript or Sinocare. Subsequently, the purified CD147 extracellular region protein and the purified novel coronavirus S protein were incubated at 4°C, and the complex was then purified. A 5-20 mg/ml CD147-S protein complex solution is provided and mixed with the pooling solution; the mixture is allowed to stand until the CD147-S protein complex forms in the solution.

[0041] In this embodiment, the three-dimensional structure of the CD147-S protein complex was determined using cryo-electron microscopy. The specific method was as follows: an equimolar protein sample was diluted to a suitable observation concentration in a buffer such as DPBS (Dulbecco's phosphate-buffered saline); a dedicated grid (e.g., Quantifoil 300) was glow-discharged for approximately 20 seconds; approximately 4 μL of the diluted solution was then applied to the carbon film surface of the grid, and, using a Vitrobot Mark IV (Thermo Fisher Scientific) plunge-freezing device at 100% humidity and 4°C, the grid was blotted with filter paper for 3 seconds and then rapidly frozen in liquid ethane. The prepared sample was stored in liquid nitrogen until imaging. The sample was then transferred to an FEI Titan Krios transmission electron microscope operated at 300 kV and equipped with a Falcon III Summit direct detector and a Gatan Quantum GIF energy filter, used in zero-loss mode with a slit width of 20 eV. Automated data collection was performed using EPU software at a nominal magnification of 96,000×, with a pixel size of 0.846 Å. Each movie was acquired for 40 seconds in super-resolution mode, totaling 35 frames, with a dose of 1.53 e⁻/Å² per frame. A total of 2,402 micrographs were acquired in a single session, with defocus values ranging from 1.2 to 2.5 μm. All image processing was performed using RELION-3.1.2 (Fernandez-Leiro and Scheres, 2017). All movie frames were aligned, summed, and motion-corrected using MotionCor2 (Zheng et al., 2017). Images affected by crystalline ice or other visible contamination were removed by manual inspection, and, after estimating the contrast transfer function (CTF) parameters using GCTF (Zhang, 2016), micrographs whose estimated maximum resolution was worse than 4.0 Å were discarded.
641,033 particles were automatically picked using Gautomatch, with the 2D classification results of manually picked particles as references. Particles extracted with a box size of 330 pixels underwent multiple rounds of reference-free 2D classification to remove invalid classes, ultimately retaining 505,182 high-quality particles. Among these, 189,656 particles were selected to generate the initial model of the SARS-CoV-2 S protein-CD147 complex and were subjected to 3D classification. One high-quality class was selected for further 2D and 3D classification. 62,906 particles underwent 3D auto-refinement to generate an initial 4.65 Å density map. These particles were then subjected to Bayesian polishing and a further round of auto-refinement to generate the final 3.75 Å density map, with resolution determined using the gold-standard Fourier shell correlation (FSC) criterion of 0.143. Local resolution was estimated from the unfiltered half-maps in RELION-3.1.2. The initial coordinates of the CD147-S protein complex were based on the structures of the S protein (7DDN) and CD147 (3B5H) from the PDB database and were fitted to the 3.75 Å post-processed density map using UCSF Chimera and the Molecular Dynamics Flexible Fitting (MDFF) protocol to maximize the correlation coefficient between the atomic coordinates and the density map. The fitted structures were then optimized using 100.0 ns molecular dynamics (MD) simulations, with positional restraints derived from the density map applied during the simulations. The system settings for each MD simulation were consistent with previous studies. Coordinates were saved every 10 ps for density cross-correlation analysis, to avoid model overfitting during optimization.
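The FSC = 0.143 criterion reports the resolution at which the correlation between two independently refined half-maps first falls below 0.143. RELION computes this internally; the sketch below only illustrates the read-off step on an FSC curve that is already available, using linear interpolation between shells. The function name and the toy curve are assumptions, not data from this application.

```python
def fsc_resolution(spatial_freqs, fsc_values, threshold=0.143):
    """Resolution in Å where the FSC curve first drops below `threshold`.

    spatial_freqs: shell frequencies in 1/Å, in ascending order.
    fsc_values: FSC value for each shell, same length as spatial_freqs.
    Returns None if the curve never crosses the threshold.
    """
    if len(spatial_freqs) != len(fsc_values):
        raise ValueError("frequency and FSC lists must have the same length")
    for i in range(1, len(fsc_values)):
        prev_fsc, cur_fsc = fsc_values[i - 1], fsc_values[i]
        if prev_fsc >= threshold > cur_fsc:
            prev_f, cur_f = spatial_freqs[i - 1], spatial_freqs[i]
            # Linear interpolation of the crossing frequency between two shells.
            fraction = (prev_fsc - threshold) / (prev_fsc - cur_fsc)
            crossing = prev_f + fraction * (cur_f - prev_f)
            return 1.0 / crossing
    return None

if __name__ == "__main__":
    # Toy FSC curve, invented for illustration only.
    freqs = [0.1, 0.2, 0.3, 0.4]
    fsc = [1.0, 0.8, 0.2, 0.05]
    print(f"resolution at FSC=0.143: {fsc_resolution(freqs, fsc):.2f} Å")
```

Because resolution is the reciprocal of the crossing frequency, small changes in the FSC curve near the threshold can shift the reported value noticeably, which is one reason local resolution estimates are reported alongside the global figure.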

[0042] According to the present invention, those skilled in the art can use this three-dimensional structure to simulate or calculate the three-dimensional structures of CD147 extracellular region mutants, fragments, derivatives, variants, analogs, or homologs with amino acid sequences similar to those of the CD147 extracellular region described in the present invention. These techniques are based on information obtained from the analysis of the complex structure of the CD147 extracellular region and the spike protein. Therefore, based on the data information from the complex structure, it is possible to use some conventional techniques in the art to derive the three-dimensional structures and models of CD147 extracellular region mutants, fragments, derivatives, variants, analogs, or homologs. The derivation of the structure of any CD147 extracellular region mutant, fragment, derivative, variant, analog, or homolog can even be achieved without data on its complex structure. Furthermore, when constructing a complex or supercomplex structure of a CD147 extracellular region mutant, fragment, derivative, variant, analog, or homolog, information obtained from the CD147 extracellular region structure of this invention can be used to optimize the simulation of the three-dimensional structure of the new CD147 extracellular region, the SARS-CoV-2 spike protein, and its RBD sequence mutants, fragments, derivatives, variants, analogs, or homologs. This invention's novel findings make it possible to determine the active sites of the CD147 extracellular region interacting with other antibodies, integrins, CyPA, and other interacting molecules, as well as to implement structure-based drug design and screening. Antibodies, proteins, peptides, or small molecules (compounds) designed and screened in this way can effectively influence the activity of the CD147 molecule. 
Moreover, this invention provides a model for the modification, alteration, and optimization of the SARS-CoV-2 spike protein S protein and other proteins with similar epitopes to the spike protein.

[0043] The structural epitope information of the CD147-S protein complex described in this invention helps in screening antibodies, ligands, and other molecules that interact with the receptor CD147, as well as the corresponding active sites, and in particular in screening inhibitory or activating molecules of CD147, such as antibodies, peptides, proteins, and chemical molecules. Furthermore, this model will facilitate the development, modification, and optimization of innovative antibodies against the novel coronavirus S protein, as well as the modification, alteration, and optimization of other proteins with epitopes similar to those recognized by S protein antibodies. For example, reagents, antagonists, and drugs that can bind to the active binding site of the CD147 extracellular region can be designed and screened with computer assistance. Structure-based drug design refers to using computer simulations to predict the interaction between an antibody, peptide, polypeptide, protein, or chemical substance and a protein conformation. Typically, for a protein to interact effectively with a therapeutic antibody, peptide, protein, or chemical molecule, a compatible conformation must be inferred from the three-dimensional structure of the therapeutic compound to ensure binding. Knowledge of the protein's three-dimensional structure allows those skilled in the art to design diagnostic and therapeutic antibodies, peptides, proteins, or chemical molecules with compatible conformations. Information such as the binding site of the CD147 extracellular region for the novel coronavirus S protein enables those skilled in the art to design an antibody, peptide, protein, or compound that binds to CD147 and inhibits the biological activity and function of CD147.

[0044] Structural information about the CD147-S protein complex provides crucial information for identifying potential active sites on the extracellular region of the CD147 molecule. This structural information facilitates the design of CD147 inhibitors targeting these active sites. For example, computer technology can be used to identify ligands that bind to the active site, or for drug design, or cryo-electron microscopy analysis of the complex structure can be used to identify and locate the binding sites of the ligands.
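Locating a binding site from a solved complex usually comes down to listing residue pairs whose atoms lie within a distance cutoff of each other. The sketch below illustrates that idea with made-up coordinates and placeholder residue labels; the 4.0 Å cutoff is a common convention assumed here, not a value taken from this application, and the function name is hypothetical.

```python
import math

def interface_pairs(chain_a, chain_b, cutoff=4.0):
    """List residue pairs with at least one atom pair within `cutoff` Å.

    chain_a, chain_b: dicts mapping a residue label to a list of (x, y, z)
    atom coordinates in Å.
    """
    pairs = []
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            close = any(
                math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b
            )
            if close:
                pairs.append((res_a, res_b))
    return sorted(pairs)

if __name__ == "__main__":
    # Placeholder residues and invented coordinates, for illustration only;
    # they make no claim about the real CD147-S interface.
    receptor = {
        "rec_res1": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
        "rec_res2": [(10.0, 0.0, 0.0)],
    }
    ligand = {
        "lig_res1": [(4.5, 0.0, 0.0)],
        "lig_res2": [(30.0, 0.0, 0.0)],
    }
    print(interface_pairs(receptor, ligand))
```

The all-against-all loop is adequate for a sketch; for full-size structures a spatial index or a structural biology library would normally replace it.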

[0045] Greer et al. designed thymidylate synthase inhibitors through iterative cycles of computer modeling and structure determination of protein-ligand complexes. Inhibitors of the CD147 molecule can be designed in the same way. For example, using the three-dimensional structure of the CD147-S protein complex of this invention, antibodies, protein molecules, peptides, small molecules, and similar substances that can bind to the functional active sites described in this invention, or to other sites on the three-dimensional structure, can be designed by computer modeling. These substances are then synthesized and allowed to form complexes with CD147. The complex structures are analyzed by cryo-electron microscopy to determine the actual binding sites. The structure and/or functional groups of the ligands are then adjusted as necessary, and the cycle is repeated until the optimal molecule is obtained.

[0046] Furthermore, based on the three-dimensional structure of the CD147-S protein complex, various computer software programs can be used for rational drug design of CD147 inhibitors, including inhibitors of the novel coronavirus S protein, to antagonize viral invasion and infection. For example, automated ligand-receptor docking software can be used (Jones et al., Current Opinion in Biotechnology, Vol. 6 (1995), 652-656).

[0047] The structure-based drug design methods described above all require first identifying substances that can interact with the target biomolecule. Sometimes such substances are known from the literature. More often, however, inhibitors of the target molecule are unknown, or the goal is to discover new ones. In such cases, the first step is to screen databases (such as the Cambridge Structural Database) for compounds that can interact with the active site or another site of the target molecule. When the active site of the target molecule is unknown, screening criteria are usually based on pharmacokinetic properties, such as metabolic stability and toxicity. Determining the structure of the CD147-S protein complex, however, reveals the structure of the CD147 extracellular active site and allows screening based on the structure and properties of that site. One screening criterion could be whether a potential inhibitor can form a three-dimensional pharmacophore that matches the CD147 extracellular active site.

[0048] One embodiment of the present invention is a method for selecting and determining antibodies, ligands, and other interacting molecules of the CD147 extracellular region using computer-aided three-dimensional modeling and molecular docking, comprising: (1) providing a protein structure, including the structure of the CD147-S protein complex of the present invention, and using computer three-dimensional modeling to predict the three-dimensional structures of candidate antibodies, ligands, and other interacting molecules; (2) docking the three-dimensional structure of the CD147 extracellular region with the three-dimensional structures of other spike proteins, ligands, and other interacting molecules; and (3) evaluating whether the three-dimensional structures of the viral spike proteins, ligands, and other interacting molecules can bind to the three-dimensional structure of the active site of the CD147 extracellular region. Further analysis includes: (4) analyzing the biological activity of the CD147 extracellular region antibodies, viral spike proteins, and other molecules that interact with CD147; and (5) determining whether the CD147 extracellular region antibodies, ligands, and other interacting molecules can regulate the interaction of CD147 with the novel coronavirus S protein and the associated biological function.

[0049] Another embodiment of the present invention is a structure-based computer-aided drug design method, comprising: (1) providing a protein structure, including a three-dimensional structure or model of a CD147-S protein complex of the present invention; (2) designing an antibody, peptide, protein substance or compound using the three-dimensional structure or model; and (3) synthesizing the antibody, peptide, protein substance or compound.

[0050] The ligands, interacting molecules, or inhibitory antibodies, peptides, proteins, and chemical molecules of this invention can be identified by various methods known to those skilled in the art. For example, molecules that bind to or interact with the extracellular region of the CD147 protein can be identified, in solution or on cells, using immunoassays such as enzyme-linked immunosorbent assay (ELISA) or radioimmunoassay (RIA), or binding assays such as surface plasmon resonance, yeast two-hybrid, phage peptide library, or antibody library techniques.

[0051] The invention is described in more detail by way of the following examples. These examples are provided to illustrate the invention, and not to limit it.

[0052] Example 1: Formation and structural analysis of the CD147-S protein complex. 1. Expression and purification of CD147 extracellular protein. As shown in Figure 1, the gene sequence of the CD147 extracellular region (GenBank: NM_198589) was cloned into the pET21a(+) prokaryotic expression vector and expressed in the Origami B(DE3) strain, and the extracellular region of CD147 was purified. Prokaryotic expression was induced with 0.1 mM IPTG at 24℃ for 24 hours, after which the cells were collected and lysed for identification by electrophoresis.

[0053] Purification and identification of expressed antigens: The expressed antigen was purified by Ni²⁺ column chromatography, and the results are shown in Figure 2. High-purity antigen was obtained after elution. Finally, the antigen was purified by affinity chromatography to obtain 10 mg of purified CD147 extracellular fragment protein.

[0054] 2. Expression and purification of SARS-CoV-2 S protein. The SARS-CoV-2 S protein expressed in this invention is 2019-nCoV-S (residues 1 to 1208 of GenBank: MN908947, as shown in Figure 3). Other S proteins, such as S protein mutants, can also be built into the vectors designed in this invention and then expressed and purified for structural or antibody activity assays. The construct carries proline substitutions at residues 986 and 987, and the furin cleavage site (residues 682-685) is replaced with "GSAS". A TwinStrepTag and a 6×HisTag are added at the C-terminus. The specific mutant S protein sequence used is shown in SEQ ID NO: 1.

[0055] The above-mentioned SEQ ID NO: 1 was directly synthesized and cloned into the eukaryotic expression vector pcDNA3.4 between the XbaⅠ and AgeⅠ restriction endonuclease sites. HEK293F cells were transiently transfected with this vector for secretory expression. After one week of expression, the spike protein was purified from the filtered cell supernatant using Anti-Flag affinity resin (GenScript), followed by size exclusion chromatography on a Superdex 200 Increase 10/300 column (GE Healthcare) in PBS buffer (pH 7.4). The protein was then concentrated by centrifugation at 3500g and 4°C using 10 kDa ultrafiltration tubes and identified by SDS-PAGE. As shown in Figure 4, the full-length S protein was successfully expressed and purified, yielding approximately 5 mg of S protein. The purified, concentrated protein was stored at -80°C.

[0056] 3. Preparation of CD147-S protein cryo-electron microscopy complex The extracellular region of CD147 obtained in step 1 and the S protein obtained in step 2 were mixed at different molar ratios of 1:3, 1:1 and 3:1 and incubated at 4°C for 2 hours to optimize the formation of the CD147-S protein complex.

[0057] 4. Cryo-electron microscopy data collection, image processing, and model building. Dataset collection: 3.5 μL of the CD147-S protein complex (CD147:S protein molar ratio of 3:1) was applied to a glow-discharged lacey carbon grid (Quantifoil Au 300 mesh, R1.2/1.3) and rapidly frozen at 25°C and 100% humidity using a Vitrobot Mark IV (ThermoFisher Scientific). The frozen grid was transferred to an FEI Titan Krios transmission electron microscope, and high-resolution images were acquired at low temperature. The microscope was operated at 300 kV with a Falcon III Summit direct detector and a Gatan Quantum GIF energy filter in zero-loss mode with a slit width of 20 eV. Automated data acquisition was performed using EPU software at 96,000× nominal magnification, with a pixel size of 0.846 Å. Each multi-frame image sequence was acquired over 40 s in super-resolution mode as 35 frames, at a dose of 1.53 e⁻/Å² per frame. A single acquisition session yielded 2,402 micrographs, with defocus ranging from 1.2 to 2.5 μm. Representative images are shown in Figure 5A.
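The acquisition parameters in paragraph [0057] imply a total exposure per image sequence and a sampling limit that can be checked with simple arithmetic. The sketch below reads the per-frame dose as 1.53 e⁻/Å² (an interpretation of the dose figure quoted above) and takes 0.846 Å as the pixel size of the recorded images; the variable names are illustrative only.

```python
# Acquisition parameters quoted in paragraph [0057].
frames_per_movie = 35
dose_per_frame = 1.53    # e-/A^2 per frame (assumed reading of the quoted dose)
pixel_size = 0.846       # A per pixel
exposure_time = 40.0     # s per multi-frame image sequence

total_dose = frames_per_movie * dose_per_frame   # accumulated dose, e-/A^2
dose_rate = total_dose / exposure_time           # e-/A^2 per second
nyquist_limit = 2 * pixel_size                   # finest spacing the sampling can represent, A

print(f"total dose    : {total_dose:.2f} e-/A^2")
print(f"dose rate     : {dose_rate:.3f} e-/A^2/s")
print(f"Nyquist limit : {nyquist_limit:.3f} A")
```

The Nyquist limit of about 1.7 Å lies well below the 3.75 Å resolution reported later, so under these assumptions pixel sampling is not the limiting factor.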

[0058] Image processing: The 2,402 micrographs obtained in the above steps were subjected to image processing. All image processing was performed using RELION-3.1.2 (Fernandez-Leiro and Scheres, 2017). The specific steps are as follows: (1) Denoising: the frames of each image sequence were grouped, and each group was summed and motion-corrected with MotionCor2 (Zheng et al., 2017). The groups were then merged into a single drift-corrected image for subsequent particle identification. Images affected by crystalline ice or other visible contamination were removed by manual inspection. After the contrast transfer function (CTF) parameters were estimated using GCTF (Zhang, 2016), micrographs whose maximum estimated resolution was worse than 4.0 Å were discarded. After denoising, 2,289 drift-corrected images remained in this embodiment. (2) 641,033 particles were automatically picked from the 2,289 drift-corrected images using Gautomatch. The picked particles were subjected to two-dimensional classification by iterative optimization with a Bayesian approach combined with maximum likelihood estimation, yielding an average image for each class. The two-dimensional classification results were inspected manually to select 50 classes with distinct morphologies. The results are shown in Figures 5B and 5C.

[0059] (3) Using the center coordinates of the 641,033 automatically picked particles, particles were re-extracted from the 2,289 drift-corrected images with a box of 330 × 330 pixels. After multiple rounds of reference-free two-dimensional classification, invalid classes were removed, and 505,182 high-quality particles were retained.
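As a quick consistency check on step (3), the box edge can be converted from pixels to Ångström and the share of particles surviving two-dimensional classification can be computed. This is a minimal sketch that assumes the 0.846 Å pixel size from the data collection step also applies to the extracted particles (no binning); the variable names are illustrative.

```python
pixel_size = 0.846     # A per pixel, from the data collection step (assumed unbinned)
box_pixels = 330       # box edge in pixels
picked = 641_033       # particles picked automatically
retained = 505_182     # particles kept after 2D classification

box_edge = box_pixels * pixel_size       # box edge in A
retained_fraction = retained / picked    # share of picked particles that were kept

print(f"box edge: {box_edge:.1f} A")
print(f"retained after 2D classification: {retained_fraction:.1%}")
```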

[0060] (4) The 505,182 high-quality particles were screened against the typical morphological features of the SARS-CoV-2 S protein-CD147 complex in the two-dimensional class averages, including the trilobal apical structure, the long rod-shaped base domain, and a relatively consistent particle orientation distribution. Classes with a low signal-to-noise ratio, no obvious structural features, or potential aggregates were excluded, leaving 189,656 particles. These particles were used to build the initial model of the SARS-CoV-2 S protein-CD147 complex and were then sorted by three-dimensional classification into three categories. Category 1 presents a complete S protein-CD147 complex structure, with a clearly visible trilobal apical density and a continuous CD147 binding domain density, and is suitable for subsequent high-resolution reconstruction. Category 2 shows a weaker density in the CD147 binding region, which may correspond to partial dissociation or conformational dynamics, while the density of the spike protein body remains relatively complete. Category 3 shows blurred or atypical morphology, which may reflect non-specifically bound, denatured, or contaminating particles, and was therefore not used for subsequent reconstruction. The high-quality category was selected for further two-dimensional and three-dimensional classification, yielding conformationally complete structures of the S protein-CD147 complex with clear binding interfaces. Finally, 62,906 representative particles were selected and subjected to three-dimensional auto-refinement, giving a reconstructed CD147-S protein complex density map with an initial resolution of 4.65 Å, as shown in Figure 5D. (5) Bayesian polishing was then performed on the 62,906 representative particles to correct for local particle motion during exposure, and the projection direction and CTF parameters of each particle were further refined. Finally, using the gold-standard Fourier shell correlation (FSC) criterion at a threshold of 0.143, the final resolution of the CD147-S protein complex density map was determined to be 3.75 Å, with Ramachandran outliers of 2.9% and sidechain outliers of 3.7%. The results are shown in Figure 6A.
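The FSC = 0.143 criterion reports the resolution at which the correlation between two independently refined half-maps first falls below 0.143. The sketch below illustrates that read-out with linear interpolation between Fourier shells. The function name and the FSC curve are hypothetical and are not data from this work; in practice the value comes from the post-processing step of the reconstruction software.

```python
def resolution_at_threshold(spatial_freqs, fsc_values, threshold=0.143):
    """Return the resolution (A) where the FSC curve first drops below threshold.

    spatial_freqs: increasing spatial frequencies in 1/A, one per Fourier shell
    fsc_values:    FSC between the two half-maps in each shell
    The crossing is located by linear interpolation between adjacent shells.
    """
    points = list(zip(spatial_freqs, fsc_values))
    for (f0, c0), (f1, c1) in zip(points, points[1:]):
        if c0 >= threshold > c1:
            f_cross = f0 + (c0 - threshold) * (f1 - f0) / (c0 - c1)
            return 1.0 / f_cross
    return None  # the curve never crosses the threshold


# Hypothetical FSC curve, for illustration only (not data from this work).
freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
fsc = [0.99, 0.95, 0.80, 0.50, 0.20, 0.05]
print(resolution_at_threshold(freqs, fsc))
```

With this made-up curve the crossing falls between the shells at 0.25 and 0.30 Å⁻¹, giving a resolution of roughly 3.7 Å.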

[0061] Model building: (1) The three-dimensional structure of the S protein-CD147 complex was built by molecular dynamics (MD) guided by the electron density map. The initial coordinates of the CD147-S protein complex were based on the structures of the S protein (PDB ID: 7DDN) and CD147 (PDB ID: 3B5H). The initial three-dimensional atomic coordinates of the assembled complex were fitted to the above 3.75 Å density map using UCSF Chimera (Pettersen et al., 2004) and the molecular dynamics flexible fitting (MDFF) protocol (Trabuco et al., 2008) to maximize the correlation coefficient between the atomic coordinates and the density map (He et al., 2016), yielding a finely fitted three-dimensional atomic model of the CD147-S protein complex.

[0062] (2) The fitted three-dimensional atomic model was then optimized by a 100 ns molecular dynamics (MD) simulation to obtain the final three-dimensional atomic model. During the simulation, the positions of the protein atoms were restrained by the 3.75 Å density map (Igaev et al., 2019); that is, protein atoms moved toward high-density regions of the map while secondary structure elements were retained (Li et al., 2024; Zeng et al., 2024). Coordinates were saved every 10 ps for density cross-correlation analysis, to reduce the risk of model overfitting during optimization.
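The sampling scheme in paragraph [0062] and the quantity it monitors can be sketched as follows. The frame count follows directly from the stated 100 ns length and 10 ps interval. The correlation function is a generic Pearson correlation over voxel values, shown as an assumption about what a density cross-correlation measures, not the fitting software's own implementation; the voxel values are toy numbers.

```python
import math


def cross_correlation(map_a, map_b):
    """Pearson correlation between two density maps given as flat lists of voxel values."""
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    return cov / math.sqrt(var_a * var_b)


# Number of saved coordinate sets: 100 ns sampled every 10 ps.
simulation_ps = 100 * 1000
save_interval_ps = 10
n_frames = simulation_ps // save_interval_ps

# Toy voxel values (hypothetical): experimental map vs. map simulated from one frame.
experimental = [0.0, 0.2, 0.9, 1.0, 0.3, 0.0]
simulated = [0.1, 0.3, 0.8, 0.9, 0.2, 0.0]

print(n_frames)
print(round(cross_correlation(experimental, simulated), 3))
```

Tracking this value across the saved frames shows whether the model keeps agreeing with the map as the simulation proceeds.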

[0063] Furthermore, to assess the binding affinity between CD147 and the RBD of the SARS-CoV-2 S protein that actually engages CD147, and between CD147 and RBD variants, 100 conformational snapshots were extracted uniformly from the MD trajectory, and their binding free energy (ΔG_bind) was calculated with the molecular mechanics generalized Born surface area (MM/GBSA) method in AmberTools 19. Based on the above simulations and energy calculations, the final structure of the CD147-S protein complex is shown in Figure 6B. Conformational clustering and density fitting analysis revealed that the structure contains an S protein trimer composed of three identical S protein peptide chains. One of the RBDs is in an upward, open conformation and binds one molecule of CD147, while the other two RBDs are in a downward, closed conformation (Figures 6A and 6B). When the trimer binds CD147, the RBD involved in CD147 binding rotates outward to expose the receptor-binding motif (RBM), thereby mediating formation of the CD147-S protein complex.
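The MM/GBSA estimate described in paragraph [0063] averages, over the extracted snapshots, the difference between the free energy of the complex and those of its two partners. The sketch below shows uniform snapshot selection and that averaging step, assuming the common single-trajectory form ΔG_bind = G_complex − G_receptor − G_ligand. The function names and energy values are hypothetical, and the per-snapshot energies themselves would come from the MM/GBSA software.

```python
from statistics import mean, stdev


def evenly_spaced_indices(n_frames, n_snapshots):
    """Indices of n_snapshots frames taken at a uniform stride through the trajectory."""
    stride = n_frames // n_snapshots
    return [i * stride for i in range(n_snapshots)]


def binding_free_energy(g_complex, g_receptor, g_ligand):
    """Mean and standard deviation of per-snapshot dG = G_complex - G_receptor - G_ligand."""
    per_snapshot = [c - r - l for c, r, l in zip(g_complex, g_receptor, g_ligand)]
    return mean(per_snapshot), stdev(per_snapshot)


# 100 snapshots from a 10,000-frame trajectory (100 ns saved every 10 ps).
snapshot_ids = evenly_spaced_indices(10_000, 100)

# Hypothetical per-snapshot energies in kcal/mol, for illustration only.
g_complex = [-5120.0, -5118.5, -5121.0]
g_receptor = [-4000.0, -3999.0, -4001.5]
g_ligand = [-1080.0, -1081.0, -1079.5]

dg_mean, dg_sd = binding_free_energy(g_complex, g_receptor, g_ligand)
print(snapshot_ids[:3], snapshot_ids[-1])
print(dg_mean, round(dg_sd, 3))
```

A more negative mean ΔG_bind indicates tighter predicted binding, which is how the wild-type RBD and its variants can be ranked against each other.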

[0064] Example 2: Analysis of key epitopes in the binding of the novel coronavirus S protein to CD147. In the structure of the CD147-S protein complex resolved by cryo-electron microscopy in Example 1, the linker loop between the two Ig domains of CD147 spans the β6 strand of the RBD (residues 490-493), forming a concave binding surface that provides the CD147 binding interface. Analysis of the interface between the SARS-CoV-2 spike and CD147 showed that CD147 residues R54, E92, E84, Q100, and S112 form hydrogen bonds with spike residues G413, K424, K417, Y489, and G447, respectively (Figure 6C).

[0065] Example 3: Functional confirmation of key epitopes in the binding of the novel coronavirus S protein to CD147. 1. Preparation and purification of the RBD of the novel coronavirus spike protein (S protein) and its mutants. The S protein of SARS-CoV-2 is a key structural protein on the viral surface, responsible for viral binding to and invasion of host cells. The S protein consists of two subunits, S1 and S2. The S1 subunit contains the receptor-binding domain (RBD), a crucial region for viral binding to host cell receptors (such as CD147 and ACE2). The RBD exists in two conformational states within the S protein: "up" and "down." It can bind to the ACE2 receptor on host cells only when it is in the "up" state. During viral invasion, the RBD transitions from the "down" state to the "up" state, exposing the ACE2 binding site. The conformational changes of the RBD are highly dynamic and vary among viral variants. For example, when the BA.2.86 variant binds ACE2, its binding interface differs subtly between the "up" and "down" states of the RBD; these differences may affect the virus's infectivity and immune evasion capabilities.

[0066] To evaluate the key amino acid residues involved in the recognition of receptor CD147 by the key domain protein RBD of the novel coronavirus S protein, wild-type RBD and five mutant forms (G413A, K424A, K417A, Y489A, and G447A) were prepared by expression based on the binding interface revealed by the CD147-S protein complex of the present invention (Example 2). The mutated amino acids included G413, K424, K417, Y489, and G447 on the RBD that can form hydrogen bonds with CD147. The protein sequence of wild-type RBD is SEQ ID NO: 2. Each mutant was modified by introducing an alanine (A) mutation based on the wild-type protein sequence using a mutation kit. To facilitate the purification of the RBD protein, Flag&6×HisTag (SEQ ID NO: 3) was fused to the C-terminus of the above-mentioned wild-type RBD and mutant proteins G413A, K424A, K417A, Y489A, and G447A.
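At the protein level, the five alanine mutants can be written down directly from the wild-type RBD sequence. The sketch below copies SEQ ID NO: 2 from the sequence listing and assumes that its first residue corresponds to spike residue 331; this offset is not stated in the text, but it is the numbering under which G413, K424, K417, Y489, and G447 all land on the expected wild-type residues, and the function raises an error otherwise. The helper name is hypothetical, and the actual mutants were made at the DNA level with a mutation kit.

```python
# Wild-type RBD, copied from SEQ ID NO: 2 in the sequence listing (194 residues).
RBD_WT = (
    "NITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCY"
    "GVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFT"
    "GCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC"
    "NGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATV"
)
RBD_START = 331  # assumed spike numbering of the first residue of SEQ ID NO: 2


def point_mutant(sequence, mutation, start=RBD_START):
    """Apply a point mutation such as 'K417A', given in spike residue numbering."""
    wild_type, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - start
    if sequence[index] != wild_type:
        raise ValueError(f"{mutation}: expected {wild_type}, found {sequence[index]}")
    return sequence[:index] + new + sequence[index + 1:]


MUTATIONS = ["G413A", "K424A", "K417A", "Y489A", "G447A"]
mutants = {name: point_mutant(RBD_WT, name) for name in MUTATIONS}

for name, seq in mutants.items():
    changed = [i + RBD_START for i, (a, b) in enumerate(zip(RBD_WT, seq)) if a != b]
    print(name, changed)
```

Each mutant differs from the wild type at exactly one position, which makes the check a convenient guard against numbering mistakes when ordering constructs.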

[0067] The wild-type RBD and five mutant forms (G413A, K424A, K417A, Y489A, and G447A) were directly synthesized and cloned into the eukaryotic expression vector pcDNA3.4. The vector was transiently transfected into HEK293F cells for secretory expression. One week after expression, the RBD and mutants of the S protein were purified from the filtered cell supernatant using Anti-Flag affinity resin (GenScript). Subsequently, size exclusion chromatography was performed in PBS buffer (pH 7.4) using a Superdex 200 Increase 10/300 column (GE Healthcare). The protein was then concentrated by centrifugation at 3500g and 4°C using a 10 kDa ultrafiltration tube and stored at -80°C.

[0068] 2. In vitro validation of key epitopes for S protein binding to CD147. To further validate the key epitopes at the binding interface revealed by the S protein-CD147 complex, wild-type RBD (RBD-WT) and RBD mutant proteins (RBD-G413A, RBD-K424A, RBD-K417A, RBD-Y489A, RBD-G447A) were prepared as described in step 1 of Example 3. In parallel, wild-type CD147 (CD147-WT) and CD147 mutants of the residues involved in S protein RBD binding (CD147-R54A, CD147-E84A, CD147-E92A, CD147-Q100A, and CD147-S112A) were prepared as described in step 1 of Example 1.

[0069] Surface plasmon resonance (SPR) analysis was performed on the two groups of RBD and CD147 proteins described above using a Biacore T200 instrument (Cytiva). To determine key epitopes on the CD147 protein, RBD-WT was immobilized as the ligand on a CM5 sensor chip (29149603, Cytiva) using an amine coupling kit (GE Healthcare, BR-1000-50), and wild-type CD147 (CD147-WT) or one of its mutants (CD147-R54A, CD147-E84A, CD147-E92A, CD147-Q100A, CD147-S112A) was injected as the analyte. Affinity was analyzed and characterized using Biacore T200 evaluation software (Cytiva), and the data were visualized using OriginPro 8.5 (OriginLab). To determine key epitopes on the RBD of the S protein, CD147-WT was immobilized as the ligand, and RBD-G413A, RBD-K424A, RBD-K417A, RBD-Y489A, and RBD-G447A were injected as analytes. All other parameters and analytical methods were the same as for the determination of key epitopes on the CD147 protein.

[0070] SPR analysis showed that, compared with wild-type CD147 or wild-type RBD, mutations in either the RBD or CD147 led to a significant decrease in the affinity between CD147 and the S protein RBD (Figure 7). These results demonstrate that the CD147-S protein complex of this invention can be used to determine the interface and key amino acid residues of the CD147-S protein interaction.

[0071] 3. Cellular-level validation of key functional epitopes for S protein binding to CD147. To test the functional role of the key residues mediating binding between CD147 and the SARS-CoV-2 S protein RBD, as revealed by the CD147-S protein complex, eukaryotic overexpression plasmids encoding full-length wild-type CD147 (CD147-WT) and its mutants (CD147-R54A, CD147-E84A, CD147-E92A, CD147-Q100A, and CD147-S112A) were constructed. The plasmids were transfected into Vero E6 CD147/ACE2-KO cells using a multifunctional DNA/siRNA transfection reagent (PT-114-15, Polyplus). These cells are CD147/ACE2 double-knockout Vero E6 cells prepared at our research center and lack wild-type CD147 and ACE2 expression. The Vero E6 CD147/ACE2-KO cells were cultured in 24-well plates and incubated overnight at 37°C. After the culture medium was discarded, SARS-CoV-2 virus (original strain) was added to each well for infection. After 2 hours of incubation, the supernatant was replaced with medium containing 2% fetal bovine serum (FBS), and the cells were cultured at 37°C for another 48 hours. Finally, samples were collected, and viral load was measured by TaqMan real-time quantitative PCR (TaqMan™ Universal Master Mix II, Thermo Scientific™) to assess infection efficiency.

[0072] TaqMan real-time quantitative PCR showed that, compared with the wild-type CD147 group, each CD147 mutation significantly reduced the ability of SARS-CoV-2 to enter host cells (Figure 8). This result indicates that the interface information revealed by the CD147-S protein complex of this invention can be used to identify key amino acid residues on the surface of the CD147 molecule that are involved in SARS-CoV-2 invasion of cells.

[0073] 4. Interaction of meplazumab with CD147 and the SARS-CoV-2 S protein. Meplazumab (MPZ) has demonstrated good safety and efficacy in the treatment of severe COVID-19 (Bian et al., 2023). The structure of the CD147-S protein complex of this invention was therefore compared with the CD147-MPZ Fab complex (PDB ID: 5X0T) previously reported by our research group. The comparison showed extensive overlap between the interfaces through which MPZ and the S protein RBD bind CD147, so that the two would clash sterically.

[0074] This invention uses a competitive inhibition ELISA to test whether meplazumab, by binding to CD147, competitively inhibits the binding of CD147 to the SARS-CoV-2 S protein. Experimental procedure: the prepared RBD-WT protein was coated onto a 96-well plate. The wells were incubated for 1 hour with 1 mg/mL CD147-WT protein and different concentrations of meplazumab (2-fold serial dilutions from 200 to 0.049 µg/mL), washed three times with PBST, incubated with the anti-CD147 murine antibody HAb18, and then incubated with HRP-labeled goat anti-mouse antibody. 100 μL of TMB was added, the color reaction was allowed to develop for 4 minutes, and 1 M H2SO4 was added to stop the reaction. The optical density (OD) at 450 nm was measured using a microplate spectrophotometer (Epoch, BioTek Instruments, Inc.).
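The antibody concentrations in the competitive ELISA follow from the two stated endpoints and the 2-fold dilution factor. The sketch below reconstructs the series under the assumption that 0.049 µg/mL is the rounded final point of an unbroken 2-fold series starting at 200 µg/mL; the variable names are illustrative.

```python
import math

start = 200.0    # highest antibody concentration, ug/mL
end = 0.049      # lowest antibody concentration as quoted, ug/mL (assumed rounded)

# Number of halving steps needed to get from the highest to the lowest concentration.
n_steps = round(math.log2(start / end))
series = [start / 2 ** i for i in range(n_steps + 1)]

print(len(series))
print([round(c, 3) for c in series])
```

Under this assumption the series has 13 points, and the last one (200 / 2^12 µg/mL) rounds to the quoted 0.049 µg/mL.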

[0075] The competitive inhibition experiments showed that meplazumab directly blocked the binding of the S protein RBD to CD147 in a dose-dependent manner (Figure 9). This result indicates that the CD147-S protein complex of this invention can be used to evaluate the mechanism of action of anti-SARS-CoV-2 monoclonal antibodies targeting CD147.

[0076] This invention prepares the CD147-S protein complex and resolves its structure by cryo-electron microscopy, demonstrating that CD147 acts as a receptor mediating ACE2-independent SARS-CoV-2 cell invasion. Like the receptor ACE2, CD147 binds only to the upright RBD in its open conformation and induces significant clockwise rotation of the N-terminal domains (NTDs) of the three protomers. CD147 binding may therefore shift the conformational landscape of the spike trimer toward a more open state. This transition of the spike protein from a closed to an open RBD state may trigger significant changes in the conformational dynamics of the S1 subunit and activate the fusion peptide, thereby preparing the spike protein for membrane fusion.

[0077] The cryo-electron microscopy and molecular dynamics (MD) simulations provided in this invention identified the RBD domain of the S protein in its upwardly extended state as the interface for binding the CD147 receptor. Significant overlap exists between the binding sites of CD147 and ACE2 on the RBD, indicating that these two receptors cannot simultaneously bind to the spike protein, suggesting that CD147 and ACE2 represent two independent pathways for SARS-CoV-2 invasion. The binding interface shared by CD147 and ACE2 is partially embedded between adjacent protomers in the closed conformation of the spike trimer; this structural feature may provide insights into the complex mechanism by which SARS-CoV-2 evades humoral immunity before cellular invasion. Furthermore, through structure-guided mutation binding affinity assays and viral invasion experiments, key amino acid residues mediating the CD147-spike interaction were identified. These residues include R54, E92, E84, Q100, and S112 on CD147, and G413, K424, K417, Y489, and G447 on the S protein. Notably, the Gamma variant carrying the K417T mutation exhibits reduced affinity for CD147, consistent with binding free energy calculations based on the CD147-S protein complex structure. Previous reports have also indicated that the K417T mutation reduces the S protein's affinity for ACE2 (Barton et al., 2021). The persistent presence of the K417T mutation in the S protein may be attributed to its ability to maintain relatively high affinity for both CD147 and ACE2 while mediating immune escape (Greaney et al., 2021).

[0078] After resolving the binding site, we verified with the monoclonal antibody meplazumab that the antibody can inhibit SARS-CoV-2 invasion by sterically blocking the CD147-S protein interaction. Although neutralizing antibodies against the viral spike have been used to treat COVID-19, their activity against emerging SARS-CoV-2 variants may be limited (Dejnirattisai et al., 2022). Therefore, developing small molecules, peptides, and other therapeutic agents that target the host receptor CD147, based on the structure of the CD147-S protein complex in this patent, is a promising strategy for broad-spectrum anti-SARS-CoV-2 drugs directed at host factors.

[0079] Amino acid sequence listing (from the specification):
<110> Chen Zhinan
<120> Structure and application of CD147-S protein complex
<160> 3
<210> 1
<211> 1208
<212> PRT
<213> SARS-CoV-2
<220>
<221> DOMAIN
<222> (1)...(1208)
<223> Extracellular domain of Spike protein
<400> 1
<210> 2
<211> 194
<212> PRT
<213> SARS-CoV-2
<220>
<221> RBD-WT DOMAIN
<222> (1)...(194)
<223> RBD of Spike protein
<400> 2
NITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATV
<210> 3
<211> 14
<212> PRT
<213> Artificial synthesis
<220>
<221> Tag DOMAIN
<222> (1)...(14)
<223> Flag & 6×HisTag
<400> 3
DYKDDDDKHHHHHH

Claims

1. A CD147-S protein complex, characterized in that, The complex comprises a trimer consisting of the SARS-CoV-2 S protein with three identical amino acid sequences and an extracellular CD147 region; The SARS-CoV-2 S protein described herein has the protein sequence of SEQ ID NO: 1; or a protein having at least 95% amino acid sequence identity with SEQ ID NO: 1; or a protein sequence having one or more amino acid substitutions, deletions, or replacements relative to SEQ ID NO: 1. In the spike protein trimer, one of the SARS-CoV-2 S protein's RBDs is in an upward-open conformation, binding to the extracellular region of CD147, while the other two SARS-CoV-2 S protein RBDs are in a downward-closed conformation. The amino acid residues at the interface between the SARS-CoV-2 S protein and the extracellular region of CD147 include CD147 residues R54, E92, E84, Q100, and S112, and SARS-CoV-2 S protein residues G413, K424, K417, Y489, and G447.

2. The CD147-S protein complex structure as described in claim 1, characterized in that, The CD147-S protein complex density map was evaluated using the gold-standard Fourier shell correlation criterion at a threshold of 0.143. The resolution was 3.75 Å, the Ramachandran outliers were 2.9%, and the side-chain outliers were 3.7%.

3. The CD147-S protein complex structure as described in claim 1, characterized in that, The recognition motif of the CD147 extracellular region on the SARS-CoV-2 S protein is embedded in the interface between protomers within the closed conformation of the S protein trimer. When it binds to the extracellular region of CD147, the RBD of the SARS-CoV-2 S protein rotates outward by 32.3°, exposing the receptor-binding motif.

4. The CD147-S protein complex as described in claim 1, characterized in that, The linker loop between the two Ig domains of the bound CD147 spans the β6 strand of the SARS-CoV-2 S protein RBD to form a concave binding surface that can accommodate CD147.

5. The CD147-S protein complex as described in claim 5, characterized in that, The SARS-CoV-2 S protein RBD domain has the amino acid sequence of SEQ ID NO: 2; or is a protein with at least 95% amino acid sequence identity with SEQ ID NO: 2; or is a protein sequence with one or more amino acid substitutions, deletions, or replacements relative to SEQ ID NO: 2.

6. The CD147-S protein complex as described in claim 5, characterized in that, Mutations occur at one or more of the G413, K424, K417, Y489, and G447 sites in the RBD domain of the SARS-CoV-2 S protein.

7. The CD147-S protein complex as described in claim 1, characterized in that, The SARS-CoV-2 S protein RBD domain may be mutated at one or more of the following sites: (A) G413A mutation on the RBD of the S protein; (B) K424A mutation on the RBD of the S protein; (C) K417A mutation on the RBD of the S protein; (D) Y489A mutation on the RBD of the S protein; (E) G447A mutation on the RBD of the S protein.

8. A method for preparing the protein complex according to any one of claims 1-7, characterized in that, The method includes the following steps: (1) Expressing the S protein in eukaryotic FreeStyle 293F cells after polyethyleneimine-mediated transient transfection; (2) Purifying the S protein from the filtered cell supernatant using Anti-Flag affinity resin; (3) Further purifying the S protein by gel filtration chromatography; (4) Preparing a complex of the CD147 extracellular protein and the column-purified novel coronavirus S protein.

9. A structure-based computer-aided drug design method for bioactive substances, characterized in that, The method includes: (a) Providing a CD147-S protein complex, wherein the complex is the protein complex according to any one of claims 1-7; (b) Using the structural information of the protein complex to design or screen antibodies, peptides, proteins, small molecules, or compounds; (c) Synthesizing the antibodies, peptides, proteins, small molecules, or compounds obtained in step (b); (d) Evaluating the bioactivity of the antibodies, peptides, proteins, small molecules, or compounds synthesized in step (c).

10. The method as described in claim 9, characterized in that: The screening mentioned above includes one or more chemical compound databases, wherein the three-dimensional structure of the compounds is known.

11. The method as described in claim 9, characterized in that: The method further includes allowing the compounds identified in the screening step to interact with the complex model.

12. The method as described in claim 9, characterized in that: The design includes targeted or randomized drug design, screening for compounds predicted to bind to the three-dimensional structure of the CD147 extracellular functional epitope, or screening for compounds that bind to the CD147 protein to inhibit or activate the activity of the CD147 protein.

13. The use of the complex according to any one of claims 1-7 as a target for preparing an anti-COVID-19 drug.