Development and application of high-efficiency and high-fidelity adenine base editor

By mutating specific amino acid residues of TadA8e, the off-target effect problem of adenine base editor was solved, achieving efficient and high-fidelity gene editing for the treatment of hereditary diseases.

CN119464262B · Active · Publication Date: 2026-06-23 · AGRICULTURAL GENOMICS INSTITUTE AT SHENZHEN CHINESE ACADEMY OF AGRICULTURAL SCIENCES (SHENZHEN BRANCH GUANGDONG LABORATORY FOR LINGNAN MODERN AGRICULTURE)

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
AGRICULTURAL GENOMICS INSTITUTE AT SHENZHEN CHINESE ACADEMY OF AGRICULTURAL SCIENCES (SHENZHEN BRANCH GUANGDONG LABORATORY FOR LINGNAN MODERN AGRICULTURE)
Filing Date
2024-10-21
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing adenine base editors suffer from off-target effects at the whole-genome level, making it difficult to achieve efficient and high-fidelity gene editing.

Method used

By modifying the amino acid residues of TadA8e, especially by mutations at sites such as Y149, S109, N119, N122, and D147, the off-target effects of the adenine base editor are reduced, and the editing accuracy is improved.

Benefits of technology

It significantly reduced the off-target effects of the adenine base editor, improved the accuracy and efficiency of editing, expanded the scope of editing applications, and was successfully applied to the treatment of hereditary diseases.

✦ Generated by Eureka AI based on patent content.

Abstract

The application provides development and application of a high-efficiency and high-fidelity adenine base editor. The inventors compared the activities of several adenine base editors (ABEs) on a large number of sgRNA target sites, and found that ABE8e showed the highest editing efficiency and a wider editing window; however, ABE8e introduced a serious sgRNA-independent DNA off-target effect. Therefore, the inventors carried out optimization and modification on ABE8e, revealed some sites related to off-target effect and editing accuracy, and obtained a series of TadA8e variants, which significantly reduced the off-target effect of the adenine base editor and improved the editing accuracy.

Description

Technical Field

[0001] This invention belongs to the field of genetic engineering technology; more specifically, this invention relates to the development and application of an adenine base editor based on the CRISPR system.

Background Technology

[0002] Single-base editors can mutate a single base at a target site without generating double-strand breaks (DSBs), thus avoiding chromosomal mismatches, translocations, and recombination caused by double-strand breaks.

[0003] Cytosine base editors (CBEs) and adenine base editors (ABEs) can mediate C-to-T and A-to-G mutations, respectively. Since approximately 58% of known genetic diseases are caused by single-base mutations, and ABEs can treat 47% of single-base mutation diseases through A-to-G editing, this presents significant potential for the therapeutic application of ABEs.

[0004] ABE7.10, the original ABE, contains two TadA (adenine deaminase) domains: a wild-type TadA and a TadA variant (TadA*). Subsequently, researchers obtained several improved ABE variants, such as ABEmax, miniABEmax, and ABE8e, through methods such as codon optimization, addition of nuclear localization signals, or phage-assisted evolution.

[0005] These enhanced ABEs improve editing efficiency or widen the editing window to varying degrees. Among them, ABE8e contains a single TadA* domain into which eight amino acid mutations were introduced (TadA8e), and it exhibits an editing efficiency 3 to 11 times higher than that of ABE7.10. ABE8e has therefore been widely used in the treatment of various genetic disease models, including spinal muscular atrophy (SMA), hereditary heart disease, sickle cell anemia, β-thalassemia, and cystic fibrosis.

[0006] Although base editors do not induce DNA double-strand breaks (DSBs), the use of deaminases for base substitution at target genomic sites raises concerns about their genome-wide specificity and potential safety in clinical applications. Both CBEs and ABEs have been shown to induce RNA off-target effects at the transcriptome level in cells.

[0007] Given the difficulty in detecting off-target effects at the genome-wide level, the inventors developed an unbiased genome-wide off-target detection method (GOTI) through two-cell injection in embryos to distinguish between spontaneous and off-target mutations, thereby enabling accurate assessment of the specificity of base editors. Using this method, it was found that BE3 generates a large number of genome-wide single nucleotide variants (SNVs) in mouse embryos, while ABE7.10 does not have this effect. However, the specificity at the genome-wide level for optimized ABEs remains to be investigated.

Summary of the Invention

[0008] The purpose of this invention is to provide a high-efficiency, high-fidelity adenine base editor and its applications.

[0009] In a first aspect of the invention, a method is provided to reduce the off-target effects of an adenine base editor (ABE) and improve editing accuracy (fidelity), comprising: using TadA8e as an adenine deaminase and modifying the amino acid residues of the TadA8e; said modification comprising mutating the amino acid residues of Y149, S109, N119, N122, D147, R111, I166 or N167 of TadA8e.

[0010] In one or more embodiments, the modification of the amino acid residue includes: mutating the amino acid residue to another amino acid residue (any one of the other 19).

[0011] In one or more embodiments, the modification of the amino acid residues includes: Y149V, Y149L, Y149M, or Y149I for the Y149 site; S109F for the S109 site; N119C, N119Q, or N119D for the N119 site; N122F, N122V, N122L, or N122A for the N122 site; and D147K for the D147 site.

[0012] In one or more embodiments, the modification of the amino acid residues includes Y149V and / or S109F (more preferably, the mutation is Y149V).

[0013] In one or more embodiments, reducing the off-target effects of the adenine base editor includes reducing non-specific binding between the base editor and DNA.

[0014] In one or more embodiments, the positions of the amino acid mutations are numbered according to the full-length TadA8e protein.

[0015] In one or more embodiments, the TadA8e protein includes a protein selected from the group consisting of: (a1) a protein with the amino acid sequence of SEQ ID NO:1; (b1) a protein derived from (a1) having the function of the (a1) protein, formed by substituting, deleting, or adding one or more (e.g., 1-20; preferably 1-15; more preferably 1-10, such as 5, 3) amino acid residues of the amino acid sequence of SEQ ID NO:1; (c1) a protein derived from (a1) having the function of the (a1) protein, having at least 80% (preferably 85%; more preferably 90%; more preferably 95%, such as 98%, 99%) homology to the protein sequence defined by (a1); or (d1) a protein active fragment defined by (a1), or a protein formed by adding a tag sequence, an enzyme digestion sequence, or a reporter protein to both ends thereof. The TadA8e also includes the aforementioned biologically active fragment.

[0016] In another aspect of the invention, a TadA8e variant is provided, comprising a mutation at an amino acid residue selected from the following: Y149, S109, N119, N122, D147, R111, I166, or N167; preferably, the mutation comprises: Y149V, Y149L, Y149M, or Y149I for the Y149 site; S109F for the S109 site; N119C, N119Q, or N119D for the N119 site; N122F, N122V, N122L, or N122A for the N122 site; and / or D147K for the D147 site; more preferably, the mutation comprises: Y149V and / or S109F.

[0017] In another aspect of the invention, an adenine base editor is provided, comprising the aforementioned TadA8e variant and an element operatively linked thereto suitable for performing adenine editing; preferably, the adenine base editor comprises the TadA8e variant operatively linked to a Cas nuclease.

[0018] In one or more embodiments, the adenine base editor includes operatively linked: NLS, TadA8e variant, Cas nuclease, and NLS.

[0019] In one or more embodiments, the Cas nuclease includes (but is not limited to): Cas9 nuclease or its homologs, the homologs including (but not limited to): SpRY nuclease, SaKKH nuclease, IscB nuclease.

[0020] In one or more embodiments, the NLS is a BPNLS.

[0021] In one or more embodiments, a linker peptide is further included between the TadA8e variant and the Cas nuclease.

[0022] In another aspect of the invention, isolated polynucleotides are provided, said polynucleotides encoding said TadA8e variant; or said adenine base editor.

[0023] In another aspect of the invention, a construct or an expression vector containing said construct is provided, which contains said polynucleotide (encoding the corresponding amino acid sequence).

[0024] In one or more embodiments, the construct includes the expression cassette of the adenine base editor and the operatively linked sgRNA expression cassette of the target gene.

[0025] In one or more embodiments, the target gene includes (but is not limited to): Hpd, the construct or expression vector containing the construct being used to prepare a pharmaceutical composition for alleviating or treating hereditary type I tyrosinemia; more preferably, the nucleotide sequence of the sgRNA is shown in SEQ ID NO:7.

[0026] In one or more embodiments, the construct includes: construct 1: an expression cassette for the sgRNA targeting the target gene, an expression cassette for the gene encoding the TadA8e variant and the gene encoding the N-terminal portion of the Cas nuclease; and construct 2: an expression cassette for the C-terminal portion of the Cas nuclease.

[0027] In one or more embodiments, the construct 1 comprises: U6 promoter–sgRNA targeting the target gene–EFS–NLS–TadA8e variant coding gene–linker peptide–Cas nuclease N-terminal coding gene–Intein; preferably, it also includes ITRs at both ends; preferably, a linker peptide is further included between the TadA8e variant coding gene and the Cas nuclease N-terminal coding gene; preferably, the NLS is BPNLS.

[0028] In one or more embodiments, the construct 2 includes: an EFS–Intein–Cas nuclease C-terminal coding gene–NLS; preferably, it also includes ITRs at both ends; preferably, a linker peptide is further included between the TadA8e variant coding gene and the Cas nuclease N-terminal coding gene; preferably, the NLS is BPNLS.
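The element order of the two constructs in [0026] to [0028] can be written down as a small data structure, which makes the split layout easier to check. The following is a minimal Python sketch; the list names, the plain-text element labels, and the `validate_split_pair` helper are illustrative assumptions, and the ITRs and BPNLS are included because the description names them as preferred options.

```python
# Element order of each construct, read from one ITR to the other.
# Labels are plain-text names from the description, not sequences.
CONSTRUCT_1 = [
    "ITR", "U6 promoter", "sgRNA", "EFS", "BPNLS", "TadA8e variant",
    "linker peptide", "Cas nuclease N-terminal portion", "Intein", "ITR",
]
CONSTRUCT_2 = [
    "ITR", "EFS", "Intein", "Cas nuclease C-terminal portion", "BPNLS", "ITR",
]


def validate_split_pair(construct_1, construct_2):
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    for name, construct in (("construct 1", construct_1),
                            ("construct 2", construct_2)):
        if not construct or construct[0] != "ITR" or construct[-1] != "ITR":
            problems.append(f"{name} is not flanked by ITRs")
        if "Intein" not in construct:
            problems.append(f"{name} lacks an intein")
        if "EFS" not in construct:
            problems.append(f"{name} lacks the EFS promoter")
    # The sgRNA cassette, the deaminase and the N-terminal half travel together.
    for element in ("sgRNA", "TadA8e variant", "Cas nuclease N-terminal portion"):
        if element not in construct_1:
            problems.append(f"construct 1 lacks {element}")
    if "Cas nuclease C-terminal portion" not in construct_2:
        problems.append("construct 2 lacks Cas nuclease C-terminal portion")
    return problems


print(validate_split_pair(CONSTRUCT_1, CONSTRUCT_2))
```

The check only confirms that each half carries the elements listed in the description; it says nothing about packaging size or expression levels.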

[0029] In one or more embodiments, the construct further includes a reporter protein expression cassette.

[0030] In another aspect of the invention, a recombinant cell is provided, which contains the aforementioned construct or expression vector, or whose genome is integrated with the aforementioned polynucleotide (encoding the corresponding amino acid sequence).

[0031] In another aspect of the invention, the use of the adenine base editor or the construct or an expression vector containing the construct is provided for gene editing or for preparing reagents for gene editing; wherein the gene editing has reduced off-target effects and improved editing accuracy; preferably, the gene editing is adenine base editor-mediated gene editing; preferably, the gene editing is an A·T to G·C conversion.

[0032] In one or more embodiments, the adenine base editor is used for gene editing as a cellular-level method, including single-cell, two-cell, or multi-cell methods.

[0033] In one or more embodiments, the described use is applied to cells or cell cultures isolated in vitro.

[0034] In one or more embodiments, the use is for purposes other than disease diagnosis or treatment.

[0035] In another aspect of the invention, a method for gene editing (including gene editing primarily aimed at A·T to G·C conversion) is provided, comprising mediating gene editing with the adenine base editor or the construct or an expression vector containing the construct; preferably, gene editing is performed by transfecting or injecting a nucleic acid sequence encoding the adenine base editor into a recipient; preferably, the recipient comprises a somatic cell or a germ cell; preferably, the germ cell comprises an embryonic cell or a fertilized egg.

[0036] In one or more embodiments, the gene editing method is an in vitro method for non-living organisms.

[0037] In one or more embodiments, the gene-editing method is targeted at objects that do not develop into living organisms.

[0038] In one or more embodiments, the gene editing method is a cellular-level method, including single-cell, two-cell, or multi-cell methods.

[0039] In one or more embodiments, the gene editing includes gene editing targeting the target gene Hpd; preferably, the gene editing reagent is used to prepare a pharmaceutical composition for alleviating or treating hereditary type I tyrosinemia; preferably, in the gene editing targeting the target gene Hpd, the nucleotide sequence of the sgRNA is as shown in SEQ ID NO:7.

[0040] In another aspect of the invention, a reagent or kit for gene editing is provided, comprising: the TadA8e variant, or a polynucleotide encoding it; or the adenine base editor, or a polynucleotide encoding it; or the construct or an expression vector containing the construct.

[0041] Other aspects of the invention will be apparent to those skilled in the art from the disclosure herein.

Description of the Attached Figures

[0042] Figure 1. Unbiased comprehensive analysis of Cas9, BE3, YE1-BE3-FNLS, ABE7.10, ABE7.10-F148A, ABEmax, miniABEmax, and ABE8e in 102-sgRNA cells.

[0043] a. Schematic diagram comparing the editing features of Cas9, CBEs, and ABEs in HEK293T cells that have stably integrated 102 sgRNAs.

[0044] b. Box plot of targeting efficiencies of Cas9, CBEs, and ABEs at 102 sgRNA target sites. Specifically, the efficiencies of indels, C-to-T base substitutions, and A-to-G base substitutions were calculated as the targeting efficiencies of Cas9, CBEs, and ABEs, respectively.

[0045] c. Proportion of indels in different groups.

[0046] d. Proportion of C edits in different groups.

[0047] e. Heatmap of average C-to-T editing efficiency of CBEs and A-to-G editing efficiency of ABEs at positions 1–20. Data are from three independent experiments.

[0048] Figure 2. Unbiased genome-wide off-target analysis of Cas9, BE3, YE1-BE3-FNLS, ABE7.10, ABE7.10-F148A, ABEmax, miniABEmax, and ABE8e using the same Tyr sgRNA.

[0049] a. Schematic diagram of the GOTI method for analyzing off-target DNA effects of Cas9 and base editors.

[0050] b. Comparison of detected off-target SNVs. Data are expressed as mean ± standard error (n≥3). P-values were calculated using a two-tailed unpaired t-test.

[0051] c. Distribution of SNV mutation types in the Cre, two CBE, and five ABE injection groups. The numbers represent the percentage of each type of SNV among all SNVs.

[0052] d. Sequence motifs of off-target DNA SNVs from the BE3 and ABE8e groups, respectively.

[0053] e. Overlap of off-target SNVs among three BE3-treated embryos and among three ABE8e-treated embryos injected with the same Tyr sgRNA.

[0054] f. Overlap between off-target SNVs detected by GOTI and off-target sites predicted by Cas-OFFinder and CRISPOR.
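The "mean ± standard error" summary and the unpaired t statistic named in the captions can be sketched in a few lines of Python. The function names and the SNV counts below are hypothetical and are not data from the patent. The pooled-variance (Student) form is also an assumption, since the captions do not say whether equal variances were assumed.

```python
from math import sqrt
from statistics import mean, stdev, variance


def mean_sem(values):
    """Return (mean, standard error of the mean) for at least two values."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    return mean(values), stdev(values) / sqrt(len(values))


def unpaired_t_statistic(group_a, group_b):
    """Student's t statistic (pooled variance) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, n_a + n_b - 2


# Hypothetical off-target SNV counts per embryo (illustration only).
control = [10, 12, 14]
treated = [20, 22, 24]
print(mean_sem(control), mean_sem(treated))
print(unpaired_t_statistic(control, treated))
```

Turning the t statistic into a two-tailed P-value requires the t distribution, which the standard library does not provide; a statistics package would normally be used for that step.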

[0055] Figure 3. Analysis of the DNA off-target properties of Cas9, BE3, YE1-BE3-FNLS, ABE7.10, ABE7.10-F148A, ABEmax, miniABEmax, and ABE8e.

[0056] a. Targeting efficiency in tdTomato+ and tdTomato- cells in the Tyr region, based on whole-genome sequencing (WGS) analysis.

[0057] b, c. Proportion of C-to-T mutations (b) and A-to-G mutations (c) among the detected off-target DNA SNVs.

[0058] d. Distribution of off-target DNA SNVs on chromosomes in mouse embryos.

[0059] e. Comparison of detected off-target indels. Data are expressed as mean ± standard error (n≥3). P-values were calculated using a two-tailed unpaired t-test.

[0060] Figure 4. Unbiased genome-wide off-target analysis of ABE8e using Dmd, Fah, PCSK9, and non-target (NT) sgRNAs.

[0061] a. Schematic diagram of the sgRNA sites at the Dmd, Fah, and PCSK9 gene loci.

[0062] b. Comparison of detected off-target DNA SNVs. Data are expressed as mean ± standard error (n≥3). P-values were calculated using a two-tailed unpaired t-test.

[0063] c. Distribution of off-target SNV mutation types. The numbers represent the percentage of each type of SNV among all SNVs.

[0064] d. Sequence motifs of off-target SNVs.

[0065] e. Overlap of off-target SNVs among different individuals treated with the same sgRNA.

[0066] f. Overlap between off-target SNVs detected by GOTI and off-target sites predicted by Cas-OFFinder and CRISPOR.

[0067] Figure 5. Further analysis of the off-target properties of ABE8e using Dmd, Fah, PCSK9, and non-target (NT) sgRNAs.

[0068] a. Targeting efficiency of ABE8e at the Dmd, Fah, and PCSK9 sites, based on whole-genome sequencing (WGS) analysis.

[0069] b. Distribution of off-target DNA SNVs on chromosomes in mouse embryos.

[0070] c. Proportion of A-to-G mutations among the detected off-target DNA SNVs.

[0071] d. Comparison of detected off-target indels. Data are expressed as mean ± standard error (n≥3). P-values were calculated using a two-tailed unpaired t-test.

[0072] Figure 6. Re-engineering of eight sites of TadA8e by saturation mutagenesis.

[0073] a. Schematic diagram of saturation mutagenesis at eight sites of ABE8e.

[0074] b. Editing window of ABE8e and each of its variants.

[0075] c. Peak A-to-G editing efficiency of ABE8e and its variants at the A5 site. The red dashed line marks 40%, separating efficiencies comparable to (or higher than) ABE8e from efficiencies lower than ABE8e. Data are expressed as mean ± standard error (n=3).

[0076] d. Line plot of average A-to-G editing efficiency at positions 1–20 for 13 variants (S109F, N119Q, N119D, N119C, N122F, N122L, N122V, N122A, D147K, Y149L, Y149I, Y149M, and Y149V).

[0077] e. Bystander A and unintended C edits for 13 variants. A3-A7 is defined as the target window. A2 / A5 and A8 / A5 represent the ratio of unintended A2 and A8 edits to the expected peak A5 edit; C5 / A5 represents the ratio of unintended C5 edits to the expected A5 edit. Data are expressed as mean ± standard error (n=3).

[0078] f. Line plot of average A-to-G editing efficiency at the A1-A14 sites for six double mutants (S109F / N119Q, S109F / D147K, S109F / Y149V, N119Q / D147K, N119Q / Y149V, and D147K / Y149V).

[0079] g. Heatmap of the average A-to-G editing efficiency at the A1-A14 sites for the three variants selected in this study (ABE8e-S109F, ABE8e-Y149V, ABE8e-S109F / Y149V) and for previously reported high-fidelity variants (ABE8e-V106W, ABE8e-V82G, ABE8e-K20A / R21A, ABE8e-F148A, ABE8e-N108Q, or ABE9).
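The bystander ratios described for panel e (A2 / A5, A8 / A5, and C5 / A5) are simple quotients of per-position editing efficiencies. The Python sketch below shows the arithmetic; the function name `bystander_ratios` and the efficiency values are hypothetical and are not measurements from the patent.

```python
def bystander_ratios(efficiency, peak="A5", bystanders=("A2", "A8", "C5")):
    """Ratio of each unintended edit to the expected peak edit.

    `efficiency` maps a position label (e.g. "A5") to an editing
    efficiency in percent.
    """
    peak_value = efficiency[peak]
    if peak_value <= 0:
        raise ValueError("peak editing efficiency must be positive")
    return {f"{site}/{peak}": efficiency[site] / peak_value for site in bystanders}


# Hypothetical per-position efficiencies in percent (illustration only).
example = {"A2": 4.0, "A5": 40.0, "A8": 10.0, "C5": 2.0}
for label, ratio in bystander_ratios(example).items():
    print(f"{label}: {ratio:.3f}")
```

A lower ratio means the variant confines its activity more tightly to the intended position.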

[0080] Figure 7. Genome-wide and transcriptome-wide off-target analysis of ABE8e-S109F, ABE8e-Y149V, and ABE8e-S109F / Y149V.

[0081] a. Comparison of off-target DNA SNVs detected in the ABE8e-S109F, ABE8e-Y149V, and ABE8e-S109F / Y149V groups. Data are expressed as mean ± standard error (n=3). P-values were calculated using a two-tailed unpaired t-test.

[0082] b. Distribution of DNA mutation types in the ABE8e-S109F, ABE8e-Y149V, and ABE8e-S109F / Y149V groups.

[0083] c. Sequence motifs of off-target DNA SNVs from the ABE8e-S109F, ABE8e-Y149V, and ABE8e-S109F / Y149V groups.

[0084] d. Number of off-target RNA SNVs in the ABE8e-S109F, ABE8e-Y149V, and ABE8e-S109F / Y149V groups, analyzed by RNA-seq. Data are expressed as mean ± standard error (n=3). P-values were calculated using a two-tailed unpaired t-test.

[0085] e. Distribution of RNA mutation types in the GFP, ABE8e, ABE8e-S109F, ABE8e-Y149V, and ABE8e-S109F / Y149V groups.

[0086] f. Motif of RNA SNVs.

[0087] Figure 8. Compatibility of TadA8e-Y149V with different CRISPR systems.

[0088] a. A-to-G editing frequencies of SpRY-based ABE8e and ABE8e-Y149V at 10 sites with NNN PAMs. In the heatmap, the editing efficiencies shown represent the mean of three biologically independent replicates.

[0089] b. Average A-to-G editing efficiency of SpRY-based ABE8e and ABE8e-Y149V in the A2-A14 region at 10 sites.

[0090] c. Average C-to-T / G / A editing efficiency of SpRY-based ABE8e and ABE8e-Y149V at 7 target sites. Data are expressed as mean ± standard error (n=3).

[0091] d. Heatmap of A-to-G editing frequencies of SaKKH-based ABE8e and ABE8e-Y149V at 6 sites with NNNRRT PAMs.

[0092] e. Average A-to-G editing efficiency of SaKKH-based ABE8e and ABE8e-Y149V in the A2-A22 region at 6 sites.

[0093] f. Heatmap of A-to-G editing efficiency of IscB-based ABE8e and ABE8e-Y149V at 5 sites with NWRRNA PAMs.

[0094] g. Average A-to-G editing efficiency of IscB-based ABE8e and ABE8e-Y149V in the A1-A16 region at 5 sites.
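The PAM patterns named in the captions (NNN, NNNRRT, NWRRNA) use IUPAC degenerate nucleotide codes, where N is any base, R is A or G, and W is A or T. A minimal Python sketch of how a candidate sequence can be checked against such a pattern follows; the function name `pam_matches` and the test sequences are illustrative assumptions.

```python
import re

# IUPAC nucleotide codes needed for the PAMs above, plus the other
# common two-base codes, expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def pam_matches(pam_pattern, sequence):
    """Return True if `sequence` satisfies the degenerate `pam_pattern`."""
    regex = "".join(IUPAC[code] for code in pam_pattern.upper())
    return re.fullmatch(regex, sequence.upper()) is not None


print(pam_matches("NNNRRT", "ACGGAT"))  # R positions are G and A
print(pam_matches("NNNRRT", "ACGCAT"))  # C is not allowed at an R position
print(pam_matches("NWRRNA", "GTAGCA"))
```

Under this encoding an NNN pattern accepts any three standard bases, which reflects why a PAM-relaxed nuclease such as SpRY broadens the set of targetable sites.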

[0095] Figure 9. Treatment of hereditary tyrosinemia type I mice with ABE8e-Y149V.

[0096] a. Schematic diagram of ABE8e-Y149V-mediated Hpd gene knockout in mice. The Hpd sgRNA was designed to target the start codon (ATG) of the Hpd gene, with the target A located at position 6. The A-to-G mutation can induce knockout of the Hpd gene.

[0097] b. Schematic diagram of the dual AAV system for in vivo delivery of ABE8e-Y149V and the Hpd sgRNA. Fah-/- mice were injected with dual AAVs and treated with nitisinone for 7 days, after which the treatment was discontinued.

[0098] c. Body weight ratio of Fah-/- mice treated with ABE8e-Y149V-Hpd sgRNA or saline. Data are expressed as mean ± standard error (n=3).

[0099] d. Deep sequencing analysis of the Hpd genomic region in liver DNA from Fah-/- mice on day 30 after injection. Data are expressed as mean ± standard error (n=5).

[0100] e. Immunohistochemical (IHC) staining with anti-HPD antibody of livers collected on day 30 post-injection from Fah-/- mice that were not injected but received nitisinone and from Fah-/- mice that were injected with dual AAVs but did not receive nitisinone.

[0101] f-h. Serum levels of aspartate aminotransferase (AST), alanine aminotransferase (ALT), and total bilirubin in Fah-/- mice that were not injected but received nitisinone, Fah-/- mice injected with saline that did not receive nitisinone, and Fah-/- mice injected with dual AAVs that did not receive nitisinone. Data are expressed as mean ± standard error (n = 5). P-values were calculated using a two-tailed unpaired t-test.

Detailed Implementation

[0102] In this invention, the activities of several adenine base editors (ABEs) for a large number of sgRNA target sites were compared. ABE8e was found to exhibit the highest editing efficiency and a wider editing window. Further analysis revealed that ABE8e introduces a severe sgRNA-independent DNA off-target effect. Therefore, the inventors conducted an in-depth analysis of ABE8e, revealing several sites related to off-target effects and editing accuracy. Mutations targeting these sites were performed to optimize and obtain a series of TadA8e variants. These variants significantly reduced the off-target effects of the adenine base editor and improved editing accuracy (fidelity).

[0103] As used in this invention, "mutation" refers to the substitution of a residue in a sequence (e.g., a nucleic acid or amino acid sequence) by another residue, the change of one or more residues in a sequence to another residue, or the occurrence of deletion or insertion. In the editor of this invention, it is desirable to enhance targeted editing performance, generate as many "A·T to G·C" conversions as possible, and reduce conversions of other types of bases or other effects; the enhanced targeted editing performance includes: reduced off-target effects, improved editing accuracy, improved editing efficiency, and improved target specificity.
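An "A·T to G·C" conversion is one event on the double helix: an A-to-G change on one strand reads as a T-to-C change on the complementary strand. When single nucleotide variants are tallied by type, the two readings are therefore usually counted as one class. The Python sketch below shows one way to do this; the function names and the convention of writing each class with A or C as the reference base are assumptions for illustration, not the analysis pipeline of the patent.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def canonical_snv(ref, alt):
    """Collapse an SNV and its opposite-strand reading into one class.

    Classes are written with A or C as the reference base, so that
    T>C is reported as A>G and G>A is reported as C>T.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a valid SNV: {ref}>{alt}")
    if ref in "GT":
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def fraction_a_to_g(snvs):
    """Fraction of SNVs that are A·T to G·C conversions."""
    if not snvs:
        raise ValueError("no SNVs given")
    classes = [canonical_snv(ref, alt) for ref, alt in snvs]
    return classes.count("A>G") / len(classes)


# Hypothetical SNV calls as (reference base, alternate base).
calls = [("A", "G"), ("T", "C"), ("C", "T"), ("G", "T")]
print(fraction_a_to_g(calls))
```

With this convention, a high A>G fraction among off-target SNVs points to deaminase activity rather than spontaneous mutation.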

[0104] As used herein, unless otherwise stated, “TadA8e mutant,” “TadA8e variant,” and “mutant TadA8e” are used interchangeably and refer to the amino acid sequence corresponding to TadA8e, with the mutation selected from the following sites or combinations thereof: Y149, S109, N119, N122, and D147.

[0105] Where the TadA8e before mutation needs to be specified, it may be an enzyme having the amino acid sequence of SEQ ID NO:1. Unless otherwise stated, the mutation sites of the mutants in this invention are numbered according to the sequence shown in SEQ ID NO:1.

[0106] In this invention, unless otherwise stated, a mutant is identified as “original amino acid, position, substituted amino acid”; for example, Y149V means that the Y at position 149 of the starting enzyme is replaced by V.
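The substitution notation can be made concrete with a short Python sketch that parses a label such as Y149V and applies it to a reference sequence. The names `Mutation`, `parse_mutation`, and `apply_mutation` are illustrative assumptions, and the toy sequence is hypothetical; SEQ ID NO:1 is not reproduced here.

```python
import re
from typing import NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes
_LABEL = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Mutation(NamedTuple):
    original: str     # residue in the starting enzyme
    position: int     # 1-based position in the reference sequence
    replacement: str  # residue after substitution


def parse_mutation(label):
    """Parse a label such as 'Y149V' into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Mutation(match.group(1), int(match.group(2)), match.group(3))


def apply_mutation(sequence, label):
    """Return `sequence` with the substitution applied, checking the original residue."""
    mutation = parse_mutation(label)
    if not 1 <= mutation.position <= len(sequence):
        raise ValueError(f"position {mutation.position} is outside the sequence")
    index = mutation.position - 1
    if sequence[index] != mutation.original:
        raise ValueError(
            f"expected {mutation.original} at position {mutation.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + mutation.replacement + sequence[index + 1:]


print(parse_mutation("Y149V"))
print(apply_mutation("MSYVA", "Y3V"))  # toy sequence, NOT SEQ ID NO:1
```

Checking the original residue before substituting catches numbering mistakes, which matters because positions are defined relative to the full-length reference sequence.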

[0107] As used in this invention, "operatively linked" or "operably coupled to" refers to a situation where certain portions of a linear DNA sequence can regulate or control the activity of other portions of the same linear DNA sequence. For example, if a promoter controls transcription of a coding sequence, then it is operably coupled to that coding sequence. "Sequentially operatively linked" refers to the coupling of elements in a specific order, such as from the amino terminus to the carboxyl terminus.

[0108] As used in this invention, an "element" refers to a series of functional nucleic acid / protein sequences useful for protein expression, which are systematically constructed to form an expression construct. The sequences of the "elements" can be those provided in this invention, as well as variants thereof, provided that these variants substantially retain the function of the "elements," obtained by inserting or deleting bases (e.g., 1-50 bp; preferably 1-30 bp, more preferably 1-20 bp, even more preferably 1-10 bp), or by random or site-directed mutagenesis.

[0109] As used in this invention, the term "construct" refers to a single-stranded or double-stranded DNA molecule that has been artificially modified to contain DNA fragments arranged and combined according to sequences not found in nature. The "construct" may include a "plasmid," an "in vitro transcription product," or a viral vector; or, the "construct" may be contained within an expression vector as part of the expression vector.

[0110] As used in this invention, the "sgRNA" refers to "Single-guide RNA (sgRNA)," which is designed based on a "target site on a target gene." Its sequence is sufficient to synergize with the endonuclease Cas to guide a Cas-mediated DNA double-strand break at the target site. In this invention, the "sgRNA" includes sgRNA in RNA form (such as mRNA), as well as sequences in DNA form corresponding to the sgRNA sequence or constructs containing said sequences, as long as they can be processed or converted into active "sgRNA" within the cell.

[0111] In this invention, the "target gene" can also be called the "gene of interest," which refers to a gene that, after being introduced into a cell, is useful for observing changes in the cell, regulating cell performance, or improving cell-related diseases.

[0112] As used in this invention, the term "animal" is not particularly limited, as long as its cells have a genome in the general sense and the gene-editing system is active within its cells. For example, the animal can be a mammal, including humans, non-human primates (monkeys, orangutans), livestock and agricultural animals (e.g., pigs, sheep, cattle), and rodents (e.g., mice, rats, rabbits), etc.

[0113] As used in this invention, the term "cell" includes, but is not limited to: somatic (tissue) cells, induced pluripotent stem cells, and germ cells (such as fertilized egg cells and oocytes).

[0114] As used in this invention, the term "off-target effect" refers to the failure of a modification to a specific location in the genome to achieve the pre-set target, resulting in a deviation from the intended modification or failure to perform the modification. Causes of "off-target effects" include, but are not limited to: inaccurate binding to the target site, inaccurate cleavage operations after sequence recognition, and insufficient precision in editing the cleavage site.

[0115] Although genome editing technology has been widely used in recent years, it suffers from off-target effects and other problems. How to use this technology to correct cellular genomes and alter disease states remains a challenging task. Currently, there are no effective gene-editing-based treatments or drugs for the vast majority of diseases.

[0116] In this invention, genome-wide off-target analysis performed via two-cell embryo injection (GOTI) revealed that ABE8e introduces a severe sgRNA-independent DNA off-target effect. To address this issue, the inventors conducted in-depth research and analysis on TadA8e, generating 152 single mutants and 6 double mutants through saturation mutagenesis of eight sites.
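The mutant counts follow from simple enumeration: each of the eight sites can be replaced by any of the other 19 amino acids, giving 8 × 19 = 152 single mutants. The six double mutants listed for Figure 6 are the pairwise combinations of four single mutations. The Python sketch below reproduces both numbers, assuming that the eight sites are those named in [0009]; the variable and function names are illustrative.

```python
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

# The eight sites named in [0009]: original residue plus position.
SITES = ["S109", "R111", "N119", "N122", "D147", "Y149", "I166", "N167"]


def saturation_library(sites):
    """List every single substitution at the given sites (19 per site)."""
    library = []
    for site in sites:
        original, position = site[0], site[1:]
        for residue in AMINO_ACIDS:
            if residue != original:
                library.append(f"{original}{position}{residue}")
    return library


single_mutants = saturation_library(SITES)

# The double mutants listed for Figure 6 are all pairs of these four mutations.
DOUBLE_PARENTS = ["S109F", "N119Q", "D147K", "Y149V"]
double_mutants = [f"{a} / {b}" for a, b in combinations(DOUBLE_PARENTS, 2)]

print(len(single_mutants))  # 152
print(double_mutants)
```

The pairwise enumeration yields the same six combinations, in the same order, as the double mutants named in the description of Figure 6.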

[0117] Based on the inventor's new discovery, a method is provided to reduce the off-target effects of an adenine base editor (ABE) and improve editing accuracy (fidelity), comprising: using TadA8e as an adenine deaminase and modifying the amino acid residues of TadA8e; the modification includes mutating the amino acid residues of Y149, S109, N119, N122, D147, R111, I166 or N167 of TadA8e.

[0118] Among the mutants, ABE8e-Y149V was identified as the optimal variant due to its high efficiency, comparable to that of ABE8e, and the absence of detectable off-target effects in both DNA and RNA. Furthermore, TadA8e-Y149V is compatible with various Cas homologous proteins, thus expanding the range of editing applications. In addition, adeno-associated virus (AAV)-mediated delivery of ABE8e-Y149V successfully treated a mouse model of hereditary tyrosinemia type I (HTI). This invention highlights the therapeutic significance of the ABE8e mutants, particularly ABE8e-Y149V, in ABE-mediated gene therapy.

[0119] This invention also includes fragments, derivatives, and analogs of the TadA8e mutant. As used herein, the terms “fragment,” “derivative,” and “analog” refer to proteins that substantially retain the same biological function or enzymatic activity as the TadA8e mutant of this invention. Fragments, derivatives, or analogs of TadA8e can be (i) proteins with one or more conserved or non-conserved amino acid residues (preferably conserved amino acid residues) substituted, which may or may not be encoded by the genetic code; or (ii) proteins having substituents in one or more (e.g., 1-20, more preferably 1-10, even more preferably 1-8, 1-5, 1-3, or 1-2) amino acid residues; or (iii) proteins formed by fusing additional amino acid sequences to this protein sequence (e.g., leader sequences, secretory sequences, sequences used to purify this protein, or proprotein sequences, or fusion proteins). These fragments, derivatives, and analogs are within the scope well known to those skilled in the art as defined herein. However, the condition that must be met is that the amino acid sequence of the TadA8e mutant and its fragments, derivatives and analogs must contain at least one mutation specifically pointed out above in this invention, that is, a mutation occurring at the following positions: Y149, S109, N119, N122, D147.

[0120] For example, in this art, substitution with amino acids of similar or identical properties generally does not alter protein function. Similarly, adding or deleting one or more amino acids at the C-terminus and / or N-terminus generally does not change protein function. This term also includes enzyme-active fragments and enzyme-active derivatives of the TadA8e mutant. However, in these variant forms, there is certainly a mutation at at least one of the key positions described above in this invention.

[0121] In the adenine base editor of the present invention, each element can be a recombinant protein or a synthetic protein, preferably a recombinant protein.

[0122] The present invention also includes variants formed by changing the functionality of each element, i.e., functionally conservative variants (including fragments, derivatives and analogues, etc.).

[0123] The present invention also provides a polynucleotide sequence encoding a selected or modified enzyme variant or a conserved variant thereof.

[0124] The polynucleotide encoding the mature protein of the mutant includes: a coding sequence that encodes only the mature protein; a coding sequence of the mature protein and various additional coding sequences; or a coding sequence of the mature protein (and optional additional coding sequences) together with a non-coding sequence. A "polynucleotide encoding a protein" can be a polynucleotide that encodes only the protein, or one that also includes additional coding and / or non-coding sequences.

[0125] The selected, optimized, or modified full-length enzyme nucleotide sequences or fragments thereof of the present invention can generally be obtained by PCR amplification, recombinant methods, or artificial synthesis. For PCR amplification, primers can be designed based on the nucleotide sequences disclosed in the present invention, especially the open reading frame sequences, and the relevant sequences can be amplified using commercially available cDNA libraries or cDNA libraries prepared according to conventional methods known to those skilled in the art as templates. When the sequence is long, it is often necessary to perform two or more PCR amplifications, and then splice the fragments amplified from each amplification in the correct order.

[0126] Once the relevant sequence is obtained, it can be obtained in large quantities using recombination methods. This typically involves cloning it into a vector, transferring it into cells, and then isolating the sequence from the proliferated host cells using conventional methods.

[0127] The present invention also relates to vectors containing the polynucleotides of the present invention, host cells generated by genetic engineering using the vectors of the present invention or selected, optimized or modified enzyme-coding sequences, and methods for generating the proteins of the present invention via recombinant technology.

[0128] Vectors containing the appropriate DNA sequence and appropriate promoter or control sequence can be used to transform appropriate host cells or recipient cells.

[0129] This invention also provides a method for gene editing, including gene editing mediated by the adenine base editor described in this invention. Besides using the adenine base editor described in this invention for gene editing, other gene editing reagents known in the art can be used. In the embodiments of this invention, preferred constructs and implementation methods are provided.

[0130] In this invention, there are no particular limitations on the applicable gene editing targets; they can be somatic cells or germ cells, animal cells or human cells.

[0131] In a preferred embodiment of the invention, a construct for targeted therapy of hereditary type I tyrosinemia is provided, comprising an expression cassette for an adenine base editor carrying the TadA8e mutant of the present invention, and an operatively linked sgRNA expression cassette targeting the Hpd gene.

[0132] In conjunction with the adenine base editor described above, the present invention also provides a preferred sgRNA for Hpd. In a preferred embodiment, the sgRNA is the sgRNA with the sequence shown in SEQ ID NO:7.

[0133] The sgRNA can be introduced into cells separately or together with the adenine base editor.

[0134] The sgRNA and the adenine base editor described herein may be included in the pharmaceutical composition.

[0135] The pharmaceutical composition may also contain a pharmaceutically acceptable carrier. Such carriers include (but are not limited to): saline, buffer solutions, glucose, water, glycerol, ethanol, and combinations thereof. Generally, the pharmaceutical formulation should be matched to the route of administration. The pharmaceutical compositions of the present invention can be formulated as injections, for example, prepared using conventional methods with physiological saline or an aqueous solution containing glucose and other excipients. The pharmaceutical compositions are preferably manufactured under aseptic conditions. The dosage of the active ingredient is a therapeutically effective amount.

[0136] The present invention will be further illustrated below with reference to specific embodiments. It should be understood that these embodiments are for illustrative purposes only and are not intended to limit the scope of the invention. Experimental methods in the following embodiments that do not specify specific conditions are generally performed according to conventional conditions such as those described in J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Science Press, or according to the manufacturer's recommendations.

[0137] Materials and methods

[0138] 1. Vector construction

[0139] Base editor:

[0140] (1) Cytosine base editor 3 (BE3): rAPOBEC1-nCas9-UGI (Cas9 is spCas9).

[0141] (2) BE3-FNLS: the BE3 of (1), in which the N-terminus of the rAPOBEC1 sequence is fused to a Flag tag and an NLS (nuclear localization sequence; sequence: PKKKRKV), with a further NLS at the C-terminus of the base editor.

[0142] (3) YE1-BE3-FNLS: BE3-FNLS in which position 90 of the rAPOBEC1 sequence is mutated from W to Y (W90Y) and position 126 is mutated from R to E (R126E).

[0143] (4) ABE7.10 F148A: the ABE7.10 editor in which the F at position 148 of both the TadA and TadA* sequences is mutated to A.

[0144] (5) ABE8e: BPNLS, TadA8e, linker, nCas9, BPNLS, polyA.

[0145] (6) ABE8e S109F: BPNLS, TadA8e (S is mutated to F at position 109), linker, nCas9, BPNLS, polyA.

[0146] (7) ABE8e Y149V: BPNLS, TadA8e (Y is mutated to V at position 149), linker, nCas9, BPNLS, polyA.

[0147] (8) ABE8e-nSpRY: BPNLS, TadA8e, linker, nSpRY, BPNLS, polyA.

[0148] (9) ABE8e Y149V-nSpRY: BPNLS, TadA8e (Y is mutated to V at position 149), linker, nSpRY, BPNLS, polyA.

[0149] (10) ABE8e-nSaKKH: BPNLS, TadA8e, linker, nSaKKH, BPNLS, polyA.

[0150] (11) ABE8e Y149V-nSaKKH: BPNLS, TadA8e (Y is mutated to V at position 149), linker, nSaKKH, BPNLS, polyA.

[0151] (12) ABE8e-enIscB: BPNLS, TadA8e, linker, enIscB, BPNLS, polyA.

[0152] (13) ABE8e Y149V-enIscB: BPNLS, TadA8e (Y is mutated to V at position 149), linker, enIscB, BPNLS, polyA.

[0153] The linker sequence is: SGGSSGGSSGSETPGTSESATPESSGGSSGGS.

[0154] The amino acid sequence of adenine deaminase TadA8e is as follows (SEQ ID NO:1):

[0155] MSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNSKRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQKKAQSSIN (the positions marked in the sequence are S109, R111, N119, N122, D147, Y149, I166, and N167)
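The position numbering used throughout this description can be checked directly against SEQ ID NO:1. The following is a minimal sketch, not part of the original disclosure: the helper name `apply_substitution` is hypothetical, and the sequence is the one listed above with the position annotations removed.

```python
# TadA8e (SEQ ID NO:1) as listed above, with the position annotations removed.
TADA8E = (
    "MSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTA"
    "HAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNS"
    "KRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQK"
    "KAQSSIN"
)

# The eight annotated positions, written as <residue><1-based position>.
ANNOTATED_SITES = ["S109", "R111", "N119", "N122", "D147", "Y149", "I166", "N167"]


def apply_substitution(seq, mutation):
    """Apply a substitution such as 'Y149V' (1-based position) to a protein sequence.

    Hypothetical helper for illustration; raises if the reference residue
    named in the mutation does not match the sequence.
    """
    ref, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


# Every annotated site carries the residue its name claims.
for site in ANNOTATED_SITES:
    assert TADA8E[int(site[1:]) - 1] == site[0]

tada8e_y149v = apply_substitution(TADA8E, "Y149V")
tada8e_double = apply_substitution(tada8e_y149v, "S109F")
```

Chaining two calls, as in the last line, gives the S109F / Y149V double mutant.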

[0156] Editor expression cassettes (including Cas9, CBE, and ABE) were cloned into the pCMV-editor-backbone vector, and sgRNAs were cloned into the sgRNA vector backbone. Gene templates were obtained from Addgene or synthesized by GENEWIZ. PCR amplification was performed using KOD-Plus-Neo high-fidelity DNA polymerase (Toyobo, KOD-401). Recombinant plasmids and the different ABE8e variants were constructed using NEBuilder HiFi DNA Assembly Master Mix (New England BioLabs). sgRNA-targeting oligonucleotides were synthesized, annealed, and ligated into BbsI or BsaI sites to generate sgRNA expression vectors. The protospacer and PAM sequences for all targets are listed in Table 1.

[0157] Table 1. Protospacer and PAM sequences

[0158]

[0159] 2. Cell culture, transfection, FACS and genomic DNA extraction

[0160] HEK293T and 102-sgRNA cells were cultured in modified DMEM (Gibco) medium supplemented with 10% fetal bovine serum (FBS; Biological Industries) and 1% penicillin / streptomycin (Beyotime) at 37°C and 5% CO2. To compare editing efficiency in parallel, the editor plasmid was transfected alone into 102-sgRNA cells or co-transfected with the sgRNA expression vector into HEK293T cells using PEI (Polyscience). Forty-eight hours after transfection, double-positive cells expressing GFP and mCherry were isolated by flow cytometry (FACS, BD FACSAria III). Approximately 500,000 102-sgRNA cells were sorted, and genomic DNA (gDNA) was extracted using the TIANamp Genomic Kit (Tiangen). In addition, approximately 20,000 HEK293T cells were sorted out, and gDNA was extracted using the One-Step Mouse Genotyping Kit (Vazyme).

[0161] 3. Target site deep sequencing

[0162] Nested PCR (Table 2) was performed using Takara Ex Taq polymerase (Takara) to amplify the genomic sequences of the target sites. Barcodes were added in the second round of PCR to distinguish samples. PCR products were purified using a universal DNA purification kit (TIANGEN), and 150-bp paired-end sequencing was performed on an Illumina NovaSeq 6000 platform (Genewiz Co. Ltd). The raw deep sequencing data were demultiplexed using fastq-multx, and reads from each sample were aligned to the reference target sequence using CRISPResso2 (v2.0.32). The on-target editing efficiency for each target site was calculated using an in-house script (v5.26.2).
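The in-house script itself is not disclosed. As an illustration only, the sketch below shows one plausible way to derive per-position A-to-G frequencies from reads that are already aligned to the protospacer. The function name `a_to_g_efficiency` and the equal-length read representation are assumptions, and indel-containing reads are simply skipped.

```python
def a_to_g_efficiency(protospacer, aligned_reads):
    """Return {'A<pos>': percent of reads carrying G} for each adenine in the protospacer.

    Assumes each read is already aligned and trimmed to the protospacer, so
    reads of a different length (e.g. indel-containing reads) are ignored.
    """
    usable = [r for r in aligned_reads if len(r) == len(protospacer)]
    if not usable:
        return {}
    efficiencies = {}
    for index, base in enumerate(protospacer):
        if base != "A":
            continue
        g_reads = sum(1 for read in usable if read[index] == "G")
        efficiencies[f"A{index + 1}"] = 100.0 * g_reads / len(usable)
    return efficiencies


# Toy example with a 5-nt "protospacer"; the reads are invented for illustration.
toy = a_to_g_efficiency("GACTA", ["GGCTA", "GACTA", "GGCTG", "GACTA", "GACT"])
```

In the toy call, the fifth read is shorter than the protospacer and is therefore excluded from the denominator.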

[0163] Table 2. Primers for the target region

[0164]

[0165]

[0166]

[0167] 4. Animal husbandry

[0168] Heterozygous Ai9 (B6.Cg-Gt(ROSA)26Sortm9(CAG-td-Tomato)Hze / J; JAX 007909) male mice were mated with 4-week-old female C57BL / 6 mice to collect embryos. ICR female mice were used as recipients. Fah-/- mice were used in the in vivo editing experiments. The use and care of these animals were in accordance with bioscience ethics.

[0169] 5. In vitro transcription of mRNA and sgRNA

[0170] Plasmids were amplified by PCR using primers IVT F and R (Table 3), which add a T7 promoter upstream (at the 5' end) of the coding region. The purified PCR product was used as a template for in vitro transcription (IVT) using the mMESSAGE mMACHINE T7 ULTRA kit (Life Technologies). For sgRNA IVT, a T7 promoter was added by PCR amplification of pX330, and the purified PCR product was used as a template for IVT using the MEGAshortscript T7 kit (Life Technologies). Editor mRNA and sgRNA were purified using the MEGAclear kit (Life Technologies).

[0171] Table 3. In vitro transcription primers

[0172]

[0173] 6. Two-cell embryo injection, embryo transfer, and FACS

[0174] Four-week-old superovulated C57BL / 6 female mice were mated with homozygous Ai9 male mice, and fertilized embryos were obtained from the oviducts 24 hours after hCG injection. For two-cell embryo injection, a mixture of Cas9 or base editor mRNA (50 ng / μl), sgRNA (50 ng / μl), and Cre mRNA (2 ng / μl) was injected into one blastomere of the two-cell embryo 48 hours after hCG injection. Injection was performed using a FemtoJet microinjector (Eppendorf) at a constant flow setting, in droplets of M2 culture medium containing 5 μg / ml cytochalasin B (CB). After injection, the embryos were cultured in KSOM medium containing amino acids for 2 hours and then transferred to the oviducts of 0.5 dpc pseudopregnant ICR female mice. On embryonic day 14.5 (E14.5), embryonic tissue was cut into small pieces and digested at 37°C with 5 mL of 0.05% trypsin-EDTA (Gibco) for 30 minutes. Digestion was terminated by adding 5 mL of DMEM medium containing 10% FBS. Subsequently, the fetal tissue was homogenized by pipetting with a 1 mL pipette tip, and the cell suspension was centrifuged at 200 g for 6 minutes. The resulting pellet was resuspended in 2 mL of DMEM containing 10% FBS. Finally, the cell suspension was filtered through a 40 μm cell strainer and sorted by flow cytometry to separate tdTomato+ and tdTomato- cells.

[0175] 7. Whole Genome Sequencing (WGS) and Data Analysis

[0176] gDNA from tdTomato+ and tdTomato- cells was extracted using the DNeasy blood and tissue kit (Qiagen). WGS was performed at an average coverage of 50× on the BGI DNBSEQ-T7 platform. Qualified sequencing reads were aligned to the reference genome (GRCm39) using BWA (v0.7.17). The aligned reads were sorted and duplicates in the mapped BAM file were marked using the Picard tool (v2.25.7). Subsequently, whole-genome SNVs were called using three algorithms, Mutect2 (v3.5), Lofreq (v2.1.2), and Strelka (v2.7.1), with default parameters. In parallel, whole-genome indels were detected using Mutect2 (v3.5), Scalpel (v0.5.3), and Strelka (v2.7.1), with default parameters. Only SNVs or indels identified by all three algorithms were considered true variants.
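The consensus rule described above (a variant counts only when all three callers report it) amounts to a set intersection. The sketch below illustrates that rule under the assumption that each call has already been parsed into a `(chromosome, position, ref, alt)` tuple. The function name `consensus_variants` and the example records are hypothetical and do not reflect the inventors' actual pipeline or data.

```python
def consensus_variants(*call_sets):
    """Return the variants reported by every caller (set intersection)."""
    sets = [set(calls) for calls in call_sets]
    return set.intersection(*sets) if sets else set()


# Invented example calls: (chromosome, 1-based position, ref, alt).
mutect2_calls = {("chr1", 1000, "A", "G"), ("chr2", 500, "T", "C"), ("chr3", 42, "C", "T")}
lofreq_calls = {("chr1", 1000, "A", "G"), ("chr2", 500, "T", "C")}
strelka_calls = {("chr1", 1000, "A", "G"), ("chr2", 500, "T", "C"), ("chr4", 7, "G", "A")}

true_snvs = consensus_variants(mutect2_calls, lofreq_calls, strelka_calls)
```

Keying on all four fields means that two callers reporting different alternate alleles at the same position are not counted as agreeing.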

[0177] 8. RNA library preparation, sequencing, and data analysis

[0178] HEK293T cells were transfected with a base editor and sgRNA plasmid, or with GFP and mCherry plasmids as controls. After 48 hours, approximately 500,000 cells (top 5% GFP / mCherry) were collected, and RNA was extracted using Trizol (Ambion). For library construction, the mRNA was first fragmented and converted to cDNA using random primers or oligo(dT) primers. Adapters were then ligated to the 5' and 3' ends of the cDNA fragments, and correctly ligated fragments were enriched and amplified by PCR. Sequencing was performed on an Illumina NovaSeq 6000 platform. For RNA-seq data analysis, initial quality control was performed using FastQC (v0.11.3). Subsequently, STAR (v2.7.1) was used to align the processed reads to the GRCh38 reference genome in 2-pass mode with default parameters. After alignment, the aligned reads were sorted and duplicates in the BAM file were marked using the Picard tool (v2.25.5). Variants were called using the HaplotypeCaller function in GATK (v4.2.0.0).

[0179] 9. AAV vector cloning, production, and in vivo injection

[0180] AAV vectors were constructed by cloning the U6-Hpd sgRNA-N-TadA8e Y149V-nCas9(2-573)-intein and C-intein-nCas9(574-1368) sequences between the two ITRs of AAV. These vectors, along with helper plasmids and AAV8 packaging plasmids, were transfected into HEK293T cells using PEI. Viral particles were collected from the cell culture medium and cells 3-5 days after transfection. Subsequently, the collected particles were purified and concentrated.

[0181] Fah-/- mice aged 6 to 8 weeks were injected intravenously with 5 × 10⁻⁵ kg of body weight. 11 (vg / kg) of a dual AAV mixture; control Fah-/- mice aged 6 to 8 weeks were injected with only 200 μl of physiological saline.

[0182] 10. Mouse body weight, blood collection, and serum analysis

[0183] Mice were given 10 mg / L nitisinone (Sigma-Aldrich) in their drinking water for the first 7 days after injection, after which nitisinone was withdrawn. Mouse body weight was recorded every other day. Control mice that had stopped nitisinone treatment were euthanized when their body weight had decreased by more than 20%. Blood was collected retro-orbitally and allowed to clot at room temperature for 2 hours. Serum was then separated by centrifugation at 3000 g for 15 minutes at 4°C. Aspartate aminotransferase (AST), alanine aminotransferase (ALT), and total bilirubin were measured using fresh serum samples.

[0184] 11. Immunohistochemical staining

[0185] Liver tissue was fixed with 4% paraformaldehyde (PFA) at 4°C for 2 hours, then dehydrated with ethanol and embedded in paraffin. Paraffin sections were 5 μm thick, dewaxed with xylene, and then rehydrated with ethanol and distilled water. Anti-HPD antibody (ab133515, Abcam) was used as the primary antibody. Sections were sealed with glass slides and photographed using a bright-field microscope system (Leica Microsystems).

[0186] Example 1: ABE8e demonstrates higher editing efficiency and a wider editing window compared to other ABE tools.

[0187] To determine the most suitable ABE for further improvement, the editing efficiency and window size of five ABEs (ABE7.10, ABE7.10 F148A, ABEmax, miniABEmax, and ABE8e) were first compared in parallel, with wild-type Cas9 and two CBEs (BE3 and YE1-BE3-FNLS) used as controls. To ensure a comprehensive and unbiased evaluation, the inventors employed an sgRNA target library detection strategy. In short, a plasmid library containing 102 sgRNAs and their corresponding target sequences was constructed and integrated into the genome of HEK293T cells via lentiviral infection (hereinafter referred to as 102-sgRNA cells). Each target contains at least one adenine within positions 1-20 of the protospacer, counted from the end distal to the protospacer adjacent motif (PAM). The eight editors were transfected into 102-sgRNA cells, and the editing efficiency of the ABEs, CBEs, and Cas9 was calculated from the frequency of A-to-G base substitutions, C-to-T base substitutions, and indels at the target sites, respectively (Figure 1a).

[0188] Deep sequencing analysis of the 102 targets in each editor treatment group showed that ABE8e had significantly higher on-target editing efficiency than the other ABEs (average 35.9%; all other ABEs ≤15.2%) (Figure 1b). In comparison, Cas9 showed an average editing efficiency of 36.4%, while the average C-to-T editing efficiencies for the BE3 and YE1-BE3-FNLS groups were 13.6% and 18.8%, respectively.

[0189] However, purity assessment of the edited products indicated that ABE8e also introduced relatively high levels of indels and cytosine editing (Figure 1c and 1d). Furthermore, ABE8e displayed the widest editing window, significantly wider than those of the other four ABEs and the two CBEs (Figure 1e).

[0190] These results demonstrate that ABE8e provides high gene editing efficiency and a wide editing window among these common base editors.

[0191] Example 2: ABE8e introduces significantly more off-target effects across the entire genome than other ABEs.

[0192] To comprehensively evaluate the genome-wide off-target effects mediated by the five ABEs, a GOTI assay was performed using an sgRNA targeting the tyrosinase (Tyr) gene (Tables 1-2). In this experiment, mRNA encoding ABE7.10, ABE7.10 F148A, ABEmax, miniABEmax, or ABE8e, or encoding Cas9, BE3, or YE1-BE3-FNLS (controls), was injected together with Tyr sgRNA and Cre-encoding mRNA into one blastomere of an Ai9 (CAG-LoxP-Stop-LoxP-tdTomato) mouse two-cell embryo. On embryonic day 14.5 (E14.5), the edited cells (tdTomato+) and unedited cells (tdTomato-) were sorted by flow cytometry (Figure 2a). After validating on-target efficiency via Sanger sequencing, whole-genome sequencing (WGS) was performed on edited and unedited cells (50× coverage) from all eight editor treatment groups (at least three embryos per group). Single nucleotide variants (SNVs) and insertion / deletion variants (indels) were then identified in the edited cells using three independent algorithms (Mutect2, Lofreq, and Strelka), with unedited cells from the same embryo serving as a reference.

[0193] WGS analysis further confirmed the on-target efficiency of the Cas9, CBE, and ABE editing tools (Figure 3a). Off-target analysis revealed a relatively high number of SNVs in cells edited with BE3 (an average of 225 per embryo). Surprisingly, the results also showed that ABE8e, using the same Tyr sgRNA, induced a significantly higher number of off-target mutations than BE3 (an average of 371 per embryo), more than 28 times that of the Cre-only control group (13 SNVs per embryo). In contrast, the average number of SNVs for the other four ABEs, Cas9, and YE1-BE3-FNLS ranged from 12 to 21 per embryo, showing no significant difference from the Cre-only control embryos (Figure 2b).

[0194] Analysis of the base substitution types in each editor group showed that off-target editing in the BE3 group was significantly biased towards C-to-T and G-to-A (accounting for 95% of all off-targets), while the ABE8e group was biased towards A-to-G and T-to-C substitutions (accounting for 94% of all off-targets) (Figure 2c, 3b, and 3c). Furthermore, sequence logos show that BE3 typically introduces off-target edits within a TC sequence context, while ABE8e tends to introduce off-target edits at TA sites (Figure 2d). These preferences match those observed for the cytosine deaminase APOBEC1 and the adenine deaminase TadA8e, respectively. Furthermore, almost no off-target sites were shared between any ABE8e-treated embryos (Figure 2e), and the off-target sites did not overlap with the predicted off-target mutations (Figure 2f), which is similar to the off-target characteristics of BE3. These new mutations are randomly distributed across the chromosomes (Figure 3d). This finding indicates that ABE8e-mediated off-target SNVs are sgRNA-independent, suggesting that they are caused by TadA8e. The inventors also noted that the number of indels in all eight experimental groups was slightly increased relative to the Cre control group, although the average number of indels per embryo did not exceed 13 (Figure 3). These results indicate that ABE8e introduces a significantly higher off-target frequency in mouse embryos than other commonly used base editors.
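The substitution-type bias described above can be illustrated as a simple tally of SNVs by reference and alternate base. The sketch below is illustrative only: the function names and the toy SNV list are assumptions, not the inventors' analysis code or data. A-to-G and T-to-C are grouped together because adenine deamination on either DNA strand yields one of the two when reported against the reference strand, and likewise C-to-T and G-to-A for cytosine deamination.

```python
from collections import Counter

ADENINE_DEAMINATION = ("A>G", "T>C")   # same event seen on either strand
CYTOSINE_DEAMINATION = ("C>T", "G>A")


def substitution_spectrum(snvs):
    """Count SNVs by 'REF>ALT' type; snvs is an iterable of (ref, alt) pairs."""
    return Counter(f"{ref}>{alt}" for ref, alt in snvs)


def fraction_of_types(snvs, types):
    """Fraction of all SNVs whose substitution type is in `types`."""
    counts = substitution_spectrum(snvs)
    total = sum(counts.values())
    return sum(counts[t] for t in types) / total if total else 0.0


# Invented toy list of (ref, alt) pairs, not experimental data.
toy_snvs = [("A", "G")] * 6 + [("T", "C")] * 3 + [("C", "T")]
adenine_fraction = fraction_of_types(toy_snvs, ADENINE_DEAMINATION)
```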

[0195] ABE is frequently used to introduce or reverse point mutations in disease-related genes, enabling the creation or study of mouse disease models. To further validate the sgRNA-independent off-target effects of ABE8e, a GOTI assay was therefore performed using three different sgRNAs targeting the disease-related genes Dmd, Fah, and PCSK9. Additionally, an sgRNA whose target is not present in the human or mouse genome, termed non-target (NT) sgRNA, was also included (Figure 4a). The inventors observed that the ABE8e-mediated on-target editing efficiency of the Dmd (average 85.8%), Fah (average 73.6%), and PCSK9 (average 91.3%) sgRNAs was high (Figure 5a).

[0196] Consistent with the off-target analysis of the Tyr sgRNA, ABE8e induced an average of 279, 343, 387, and 313 SNVs with the Dmd, Fah, PCSK9, and NT sgRNAs, respectively, which were 21 to 30 times higher than in the Cre-only control group (Figure 4b and 5b); A-to-G and T-to-C substitutions accounted for 91%-95% of the off-target types (Figure 4c and 5c).
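The fold increases quoted for ABE8e follow directly from the per-embryo averages and the Cre-only baseline of 13 SNVs per embryo given in paragraph [0193]. The short check below reproduces that arithmetic; the variable and function names are illustrative.

```python
CRE_ONLY_MEAN_SNVS = 13  # Cre-only control, SNVs per embryo ([0193])

# Average off-target SNVs per embryo for ABE8e with each sgRNA ([0193], [0196]).
ABE8E_MEAN_SNVS = {"Tyr": 371, "Dmd": 279, "Fah": 343, "PCSK9": 387, "NT": 313}


def fold_over_control(mean_snvs, control=CRE_ONLY_MEAN_SNVS):
    """Ratio of an editor's mean SNV count to the control mean."""
    return mean_snvs / control


folds = {sgrna: round(fold_over_control(n), 1) for sgrna, n in ABE8E_MEAN_SNVS.items()}
for sgrna, fold in folds.items():
    print(f"{sgrna}: {fold}x the Cre-only control")
```

The four disease-related and NT sgRNAs fall between roughly 21.5-fold and 29.8-fold, matching the stated range of 21 to 30 times, and the Tyr sgRNA gives about 28.5-fold, matching "more than 28 times".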

[0197] It is noteworthy that, despite reports that ABE can catalyze cytosine conversion, the off-target C-to-T / G / A cytosine deamination activity detected by the inventors with these sgRNAs was not significant. The off-target sites of all four sgRNAs showed a consistent preference for A-to-G conversion within the TA motif (Figure 4d). Furthermore, no significant overlap of off-target SNVs was detected between replicate experiments (Figure 4e), and these SNVs also did not overlap with the predicted off-target mutations (Figure 4f). Similarly, ABE8e showed an increase in the number of indels with these four sgRNAs (Figure 5d).

[0198] These results confirm that ABE8e introduces off-target effects across the entire genome in a sgRNA-independent manner.

[0199] Example 3: Mutation Analysis of TadA8e

[0200] In this embodiment, 102-sgRNA cells were used during testing to analyze the editing characteristics of different mutants at 102 sgRNA sites.

[0201] The inventors aimed to improve the accuracy of ABE8e by narrowing its editing window. Comparison with the TadA deaminase components of ABE7.10, ABEmax, and miniABEmax revealed that TadA8e differs from these three ABEs at eight sites: S109, R111, N119, N122, D147, Y149, I166, and N167 (Figure 6a). Given the finding shown in Figure 2 that these three ABEs do not cause significant genome-wide off-target SNVs, the inventors hypothesized that the eight mutations in TadA8e are the cause of its increased off-target activity. In order to reduce off-target editing while maintaining high on-target efficiency, a saturation mutagenesis scheme targeting these eight sites in ABE8e was designed.
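Saturation mutagenesis of a site replaces the existing residue with each of the other 19 standard amino acids, so eight sites yield 8 × 19 = 152 single variants. The sketch below enumerates them by name; it is illustrative only and says nothing about how the variants were physically constructed. The function name is hypothetical.

```python
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# The eight TadA8e sites named in [0201], written as <residue><position>.
SITES = ["S109", "R111", "N119", "N122", "D147", "Y149", "I166", "N167"]


def saturation_variants(sites):
    """List every single substitution (e.g. 'Y149V') at each given site."""
    return [
        f"{site}{new_residue}"
        for site in sites
        for new_residue in STANDARD_AMINO_ACIDS
        if new_residue != site[0]  # skip the residue already present
    ]


single_variants = saturation_variants(SITES)
```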

[0202] A total of 152 ABE8e single-residue variants were generated by saturation mutagenesis of the eight sites. Each variant was then transfected into 102-sgRNA cells, and targeted deep sequencing analysis was performed. To ensure consistent comparisons between variants, the editing window was defined as the target sequence positions at which the average editing efficiency was greater than 30% of the average peak editing efficiency. As shown in Figure 6b and 6c, with the PAM at positions 21-23, the base editing activity window of ABE8e extends from A2 to A10, with the highest editing efficiency occurring at A5 (average 44.0%). Comparison of the activities of the mutants at the eight sites showed that mutations at R111 significantly reduced efficiency, while mutations at I166 and N167 had no significant effect on the editing window size.
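The window definition above (positions whose average efficiency exceeds 30% of the peak average efficiency) can be sketched as follows. The function name and the per-position values are made up for illustration and are not measurements from this disclosure. The window is reported as the span between the first and last qualifying positions.

```python
def editing_window(mean_efficiency_by_position, threshold=0.30):
    """Return (first, last) protospacer position whose mean efficiency
    exceeds `threshold` times the peak mean efficiency, or None if empty."""
    if not mean_efficiency_by_position:
        return None
    peak = max(mean_efficiency_by_position.values())
    qualifying = sorted(
        position
        for position, efficiency in mean_efficiency_by_position.items()
        if efficiency > threshold * peak
    )
    return (qualifying[0], qualifying[-1]) if qualifying else None


# Made-up mean A-to-G efficiencies (%) by protospacer position, illustration only.
toy_profile = {1: 2.0, 2: 8.0, 3: 15.0, 4: 30.0, 5: 40.0, 6: 33.0, 7: 20.0, 8: 13.0, 9: 5.0}
toy_window = editing_window(toy_profile)  # peak 40.0, cutoff 12.0
```

With the toy profile, the cutoff is 12.0%, so positions 3 through 8 qualify and the window is A3-A8.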

[0203] Notably, 16 of the 19 mutations at the S109 position resulted in narrower editing windows, with S109F exhibiting the narrowest window (A3-A7) and the highest editing efficiency (average A5 activity of 42.3%). All mutations at the N119 position narrowed the editing window by 1 to 4 nucleotides, with the N119Q (A3-A9), N119D (A3-A10), and N119C (A3-A10) variants showing higher editing efficiencies than ABE8e (averages of 47.8%, 46.8%, and 48.6%, respectively). At the N122 position, four mutants (N122F, N122L, N122V, and N122A) demonstrated relatively high editing efficiencies (48.1%, 46.6%, 47.9%, and 41.0%, respectively), and their editing windows were one base narrower than that of ABE8e. At position D147, the editing window of the D147K mutant was slightly smaller (A2-A9), and its editing efficiency was significantly higher than that of ABE8e, reaching 53.1%. Most Y149 mutants had a tight editing window (A3-A7), and some of them showed high editing efficiency, including Y149L (average 45.1%), Y149I (40.6%), Y149M (45.9%), and Y149V (42.4%).

[0204] A-to-G editing, bystander A editing, and non-target cytosine conversion at different protospacer positions were further compared for these 13 variants (S109F, N119Q, N119D, N119C, N122F, N122L, N122V, N122A, D147K, Y149L, Y149I, Y149M, and Y149V). Figure 4d and 4e indicate that Y149V has the narrowest editing window and the fewest bystander A and C edits, suggesting that it is the optimal candidate among the single mutants. The inventors also noted that S109F and N119Q, mutations at two other positions, ranked second and third, respectively. Meanwhile, D147K showed the highest efficiency at positions A3 through A8 and fewer C edits than ABE8e, and was therefore included in the subsequent combined mutation tests.

[0205] To examine whether combining these effective mutations could further narrow the editing window and / or improve A-to-G editing efficiency, the inventors paired S109F, N119Q, D147K, and Y149V to generate six double mutants. Among them, the S109F / Y149V pair had a relatively narrow editing window (A3-A6), with slightly lower editing efficiency at position A5 (average 33.8%) (Figure 6f).
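Pairing four single mutations in every possible way gives C(4, 2) = 6 double mutants, which is the number stated above. The brief sketch below lists those pairs; it is purely an enumeration for illustration, and the variable names are not from the source.

```python
from itertools import combinations

# The four single mutations selected for combination in [0205].
SELECTED_MUTATIONS = ["S109F", "N119Q", "D147K", "Y149V"]

# Every unordered pair, written in the "X / Y" style used in this description.
double_mutants = [" / ".join(pair) for pair in combinations(SELECTED_MUTATIONS, 2)]

for name in double_mutants:
    print(name)
```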

[0206] The editing properties of the ABE8e S109F, ABE8e Y149V, and ABE8e S109F/Y149V variants were further compared in 102-sgRNA cells with those of previously reported ABE8e mutants, including ABE8e V106W, ABE8e V82G, ABE8e K20A/R21A, ABE8e F148A, ABE8e N108Q, and ABE8e N108Q/L145T (ABE9). The editing efficiencies of ABE8e S109F and ABE8e Y149V were comparable to those of ABE8e V106W, ABE8e V82G, and ABE8e K20A/R21A, but their editing windows were narrower (Figure 6g). Furthermore, although the editing window of ABE8e S109F/Y149V was slightly wider than that of ABE8e N108Q or ABE9, its editing efficiency at the A5 position was higher than that of these variants.

[0207] The above results indicate that the inventors' screening of 158 variants yielded several ABE variants with high fidelity and high efficiency, among which ABE8e S109F, ABE8e Y149V, and ABE8e S109F/Y149V are the optimal candidates for further off-target analysis.

[0208] Example 4: ABE8e Y149V and ABE8e S109F/Y149V eliminate genome-wide and transcriptome-wide off-target effects

[0209] Considering that ABE8e Y149V and ABE8e S109F have a narrower editing window, lower bystander editing, and comparably high on-target editing efficiency relative to ABE8e, while ABE8e S109F/Y149V has a tight window and slightly lower efficiency, the inventors then examined whether the three variants exhibited reduced off-target effects in DNA and RNA by performing GOTI analysis in Ai9 mouse embryos and RNA sequencing (RNA-seq) in human HEK293T cells.

[0210] Since PCSK9 sgRNA has been used to lower cholesterol in mice and primates, GOTI analysis was used to assess the off-target effects of these three variants when targeting this site. Notably, all three variants showed high average on-target efficiency at the PCSK9 site in tdTomato+ cells (87.4%-94.4%), with significantly fewer off-target effects than ABE8e (Figure 7a). Of the two single mutants, ABE8e S109F produced an average of 117 SNVs per embryo, 70% lower than ABE8e but still significantly higher than the Cre control group. In contrast, ABE8e Y149V induced an average of only 24 SNVs per embryo, similar to the Cre control group. Furthermore, the ABE8e S109F/Y149V double mutant introduced an average of 26 SNVs per embryo, which was not significantly different from the Cre control group (Figure 7a). Further off-target SNV pattern and motif analysis revealed an A-to-G mutation bias (accounting for 73% of all SNVs) and a slight preference for TA motifs with ABE8e S109F, but not with ABE8e Y149V or ABE8e S109F/Y149V (Figure 7b and 7c).
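The "70% lower" figure for ABE8e S109F can be reproduced from the averages given in this description, taking the ABE8e value for the same PCSK9 sgRNA (387 SNVs per embryo, paragraph [0196]) as the reference. The sketch below is a worked check. The function name is illustrative, and the choice of 387 as the reference is an inference from the shared PCSK9 target rather than something the text states explicitly.

```python
def percent_reduction(variant_mean, reference_mean):
    """Percentage by which variant_mean is lower than reference_mean."""
    return 100.0 * (1.0 - variant_mean / reference_mean)


ABE8E_PCSK9_MEAN_SNVS = 387  # ABE8e with PCSK9 sgRNA ([0196]); assumed reference

# Average SNVs per embryo for the three variants at the PCSK9 site ([0210]).
VARIANT_MEAN_SNVS = {"ABE8e S109F": 117, "ABE8e Y149V": 24, "ABE8e S109F/Y149V": 26}

reductions = {
    name: round(percent_reduction(mean, ABE8E_PCSK9_MEAN_SNVS), 1)
    for name, mean in VARIANT_MEAN_SNVS.items()
}
```

Under this assumption, ABE8e S109F comes out at about 69.8% lower than ABE8e, consistent with the stated 70%.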

[0211] These results confirm that ABE8e S109F still causes genome-wide off-target effects, albeit at a relatively low level, while ABE8e Y149V and ABE8e S109F/Y149V significantly reduce DNA off-target effects compared to ABE8e, reaching a level indistinguishable from the unedited control group.

[0212] To evaluate the off-target activity of these candidate ABEs at the RNA level, the inventors transfected ABE8e, ABE8e S109F, ABE8e Y149V, and ABE8e S109F/Y149V into HEK293T cells together with the same sgRNA, targeting HEK293-site 2. Cells transfected with GFP and mCherry backbone plasmids served as vector controls. After validating the on-target efficiency of the four ABE variants using Sanger sequencing, RNA-seq analysis (average depth 125×) showed that ABE8e introduced an average of 11396 RNA SNVs in the transcriptome, while 1831 RNA SNVs were detected in the control group (Figure 7d). Furthermore, 93% of the SNVs in the ABE8e group were A-to-G or U-to-C mutations, preferentially introduced at the TA motif (Figure 7e and 7f), indicating that ABE8e can mediate widespread adenine deamination in the transcriptome of human cells. In contrast, the number of RNA SNVs detected with ABE8e S109F (average 2543 SNVs), ABE8e Y149V (average 1646 SNVs), or ABE8e S109F/Y149V (average 1806 SNVs) was not significantly different from that in the control group. Similarly, no mutation bias or preferential motif was observed relative to the control group (Figure 7e and 7f).

[0213] In summary, these results indicate that the ABE8e S109F variant can reduce DNA off-target effects and eliminate RNA off-target effects, while the ABE8e Y149V and ABE8e S109F/Y149V variants can eliminate off-target effects on both the genome and the transcriptome. Of these, ABE8e Y149V has higher on-target efficiency than ABE8e S109F/Y149V and is therefore the relatively optimal candidate ABE8e variant.

[0214] Example 5: Compatibility of TadA8e Y149V with different CRISPR systems

[0215] Considering the high editing efficiency and negligible off-target effects of ABE8e Y149V, the inventors tested whether TadA8e Y149V can be combined with various CRISPR systems to extend its application to A sites that cannot be reached with the conventional NGG PAM. Since the SpRY nuclease can target almost all PAMs, the ABE8e-nSpRY and ABE8e Y149V-nSpRY base editors were constructed by replacing nSpCas9 with nSpRY. The activity of these two ABEs at 10 genomic sites with NNN PAMs was examined in HEK293T cells (Table 2).

[0216] Targeted deep sequencing analysis showed that ABE8e-nSpRY mediated efficient A-to-G editing between A3 and A10, with an average efficiency exceeding 20% (Figure 8a). In contrast, ABE8e Y149V-nSpRY typically exhibited high editing activity at the A5 and A6 positions, with A-to-G base substitution efficiencies ranging from 9.52% to 37.63% at NAN PAM sites, 33.65% to 83.29% at NGN PAM sites, 38.37% at an NCN PAM site, and 26.38% at an NTN PAM site. The results obtained at these 10 target sites indicate that ABE8e Y149V-nSpRY is on average more efficient at the A6 position than ABE8e-nSpRY (49.6% vs 42.5%) and has a narrower editing window (A4-A6) (Figure 8b). Furthermore, 7 of the 10 target sites contain at least one cytosine at positions 4-6, and these positions were used to evaluate bystander C editing (Figure 8c). At NNN sites 3 and 7, the cytosine conversion activity of ABE8e Y149V-nSpRY was significantly lower than that of ABE8e-nSpRY (2.0% vs 9.2% and 1.2% vs 6.0%, respectively). At the five other genomic loci, the C-editing activity of ABE8e Y149V-nSpRY was similar to that observed in the untreated vector control group.

[0217] In addition, the inventors constructed the ABE8e-nSaKKH and ABE8e Y149V-nSaKKH editors to compare their activity at sites with NNNRRT PAMs (Tables 1-2). Tests at six genomic sites showed that ABE8e-nSaKKH had high efficiency (>50%) between positions A6 and A14. ABE8e Y149V-nSaKKH exhibited high A-to-G substitution activity at positions A10 to A13, with an efficiency of up to 65.8%, slightly lower than that of ABE8e-nSaKKH (Figure 8d). In addition, the editing window of ABE8e Y149V-nSaKKH (typically from A6 to A14) was narrower than that of ABE8e-nSaKKH (from A2 to A17) (Figure 8e).

[0218] The inventors also fused TadA8e Y149V to the Cas9 ancestor IscB (496 amino acids). The ABE8e-enIscB and ABE8e Y149V-enIscB miniature base editors were constructed to compare their activity at NWRRNA TAM sites (Tables 1-2). At five genomic sites, ABE8e Y149V-enIscB showed the highest A-to-G conversion efficiency at position A4 (average 27.2%), comparable to ABE8e-enIscB (average 28.6%). However, at other positions, ABE8e Y149V-enIscB had relatively lower A-editing activity than ABE8e-enIscB (Figures 8f and 8g).

[0219] Overall, these results indicate that TadA8e Y149V is compatible with PAM-relaxed Cas9 variants, ultra-compact CRISPR systems, and ancestral Cas9 systems, thereby expanding the target range for efficient adenine base editing.
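The PAM and TAM requirements named in this example (NGG, NNN, NNNRRT, NWRRNA) are written in IUPAC degenerate-base notation. The Python sketch below shows one way to check which of these patterns a given flanking sequence satisfies. It assumes the flank is read 5' to 3' starting immediately after the protospacer; the function names and the example flanks are hypothetical, and the sketch says nothing about editing efficiency at a matching site.

```python
import re

# IUPAC degenerate nucleotide codes needed for the patterns in this example.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]", "N": "[ACGT]",
}

# PAM/TAM patterns as stated in the text for each nuclease.
PAM_PATTERNS = {
    "SpCas9": "NGG",
    "SpRY": "NNN",
    "SaKKH": "NNNRRT",
    "enIscB": "NWRRNA",
}


def matches_pattern(pattern: str, site: str) -> bool:
    """True if the site satisfies the IUPAC pattern exactly (same length)."""
    regex = "".join(IUPAC[code] for code in pattern)
    return re.fullmatch(regex, site.upper()) is not None


def compatible_systems(flank: str) -> list[str]:
    """Names of the systems whose pattern matches the start of the 3' flank."""
    hits = []
    for name, pattern in PAM_PATTERNS.items():
        if len(flank) >= len(pattern) and matches_pattern(pattern, flank[:len(pattern)]):
            hits.append(name)
    return hits


# Hypothetical flanking sequences.
ngg_like = compatible_systems("TGGAAT")
iscb_like = compatible_systems("CAGATA")
spry_only = compatible_systems("CATAGC")
```

With these toy flanks, the first satisfies NGG, NNN and NNNRRT, the second satisfies NNN and NWRRNA, and the third satisfies only NNN, which illustrates why a near-PAMless nuclease such as SpRY widens the set of reachable adenines.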

[0220] Example 6, ABE8e Y149V precisely edits Hpd in vivo to treat fatal hereditary type I tyrosinemia

To explore the potential application of ABE8e Y149V in gene therapy for genetic diseases, the inventors chose Fah-/- mice, a widely used model of hereditary type I tyrosinemia (HTI), a fatal genetic disease. HTI is caused by loss-of-function mutations in fumarylacetoacetate hydrolase (FAH), an enzyme in the tyrosine metabolic pathway. HTI is characterized by the abnormal accumulation of tyrosine and its metabolites, which leads to hepatotoxicity, ultimately damaging the liver and causing weight loss. Inhibition of the upstream enzyme hydroxyphenylpyruvate dioxygenase (HPD) with nitisinone can prevent the accumulation of these tyrosine metabolites and prevent fatal liver failure. Previous studies have reported that HTI mice can be treated by Cas9 nuclease-mediated Hpd gene disruption and by CBE-mediated introduction of a premature stop codon into the Hpd gene. Here, an alternative gene silencing strategy is proposed: disrupting the Hpd gene through ABE8e Y149V-mediated in vivo mutation of the start codon, as previously described (Figure 9a).
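The start-codon strategy relies on the fact that an adenine base editor converts A•T base pairs to G•C. As a small illustrative sketch in Python, the function below enumerates the sense-strand codons reachable from ATG by a single such conversion on either strand. The function name is hypothetical, and this section does not state which strand the sgRNA (SEQ ID NO: 7) targets, so both cases are listed.

```python
def abe_single_edit_outcomes(codon: str) -> set[str]:
    """Sense-strand codons reachable by one A•T-to-G•C conversion on either strand."""
    outcomes = set()
    for index, base in enumerate(codon):
        if base == "A":
            # Adenine on the sense strand is deaminated: A becomes G.
            outcomes.add(codon[:index] + "G" + codon[index + 1:])
        elif base == "T":
            # The paired adenine on the antisense strand is deaminated,
            # so the sense-strand T is read as C after repair.
            outcomes.add(codon[:index] + "C" + codon[index + 1:])
    return outcomes


start_codon_outcomes = abe_single_edit_outcomes("ATG")
```

For ATG the reachable codons are GTG and ACG, so either edit removes the canonical ATG start codon without introducing a double-strand break.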

[0221] Due to the limited packaging capacity of AAV (less than 5 kb), a dual-AAV system was adopted: ABE8e Y149V was split into two parts, which were fused to the N-terminal and C-terminal halves of the Cfa intein, respectively (Figure 9b). The two constructs were packaged into AAV serotype 8 (AAV8) and administered to 6- to 8-week-old Fah-/- mice via tail vein injection. Saline-injected Fah-/- mice served as the control group. After injection, the mice continued to receive nitisinone for 7 days, after which the drug was withdrawn (Figure 9b). On day 32 post-injection, control group mice that had lost less than 20% of their body weight were euthanized. In contrast, the treated mice behaved no differently from wild-type mice; they gained weight and did not require nitisinone (Figure 9c).

[0222] On day 30, editing efficiency in the liver and liver function were assessed. In ABE8e Y149V-treated mice, deep sequencing of the Hpd gene in liver tissue showed that the average efficiency of the precise A-to-G mutation at position A6 was 69.0%, with almost no bystander A editing, C editing, or insertions/deletions (Figure 9d). Furthermore, immunohistochemical (IHC) staining of liver sections with anti-HPD antibodies revealed extensive patches of HPD-negative hepatocytes (Figure 9e).

[0223] Serum biomarkers, including aspartate aminotransferase (AST), alanine aminotransferase (ALT), and total bilirubin levels, indicated that liver function in ABE8e Y149V-treated Fah-/- mice (without nitisinone) was significantly improved: it was similar to that of nitisinone-treated Fah-/- mice and significantly better than that of saline-injected Fah-/- mice (without nitisinone) (Figures 9f-h).

[0224] These results indicate that ABE8e Y149V, with its efficient and precise in vivo editing, is a useful base editor for treating human genetic diseases.

[0225] The embodiments described above are merely illustrative of several implementations of the present invention, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these modifications and improvements all fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent should be determined by the appended claims.

Claims

1. A method for reducing DNA and RNA off-target effects and improving editing accuracy of an adenine base editor, the method comprising: using TadA8e as the adenine deaminase and modifying amino acid residues of the TadA8e; the modification involves mutating amino acid residues Y149 and S109 of TadA8e to obtain a TadA8e variant, the mutation being selected from: a Y149V mutation, or an S109F and Y149V double mutation; wherein the amino acid sequence of wild-type TadA8e is shown in SEQ ID NO: 1.

2. The method as described in claim 1, characterized in that the adenine base editor comprises an operatively linked TadA8e variant and Cas nuclease.

3. The method as described in claim 1, characterized in that the adenine base editor comprises an operatively linked NLS, TadA8e variant, Cas nuclease, and NLS.

4. The method as described in claim 3, characterized in that the Cas nuclease is selected from Cas9 nuclease or its homologs, the homologs being SpRY nuclease, SaKKH nuclease, and IscB nuclease.

5. The application of an adenine base editor in reducing DNA and RNA off-target effects and improving editing accuracy, the adenine base editor comprising an operatively linked TadA8e variant and Cas nuclease; the TadA8e variant being selected from: a Y149V mutation, or a double mutation of S109F and Y149V; wherein the amino acid sequence of wild-type TadA8e is shown in SEQ ID NO: 1.

6. The application as described in claim 5, characterized in that the adenine base editor comprises an operatively linked NLS, TadA8e variant, Cas nuclease, and NLS.

7. The application as described in claim 5, characterized in that the Cas nuclease includes Cas9 nuclease or its homologs, the homologs being SpRY nuclease, SaKKH nuclease, and IscB nuclease.

8. Use of a construct, or an expression vector containing said construct, for preparing a reagent for gene editing; wherein said gene editing has reduced DNA and RNA off-target effects and improved editing precision, is adenine base editor-mediated gene editing, and produces A•T to G•C conversion; the construct includes an expression cassette of an adenine base editor and an operatively linked sgRNA expression cassette targeting a specific gene; the adenine base editor includes an operatively linked TadA8e variant and Cas nuclease; the TadA8e variant is selected from: a Y149V mutation, or a double mutation of S109F and Y149V; wherein the amino acid sequence of wild-type TadA8e is shown in SEQ ID NO: 1.

9. The use as described in claim 8, characterized in that the target gene is Hpd, and the construct or the expression vector containing the construct is used to prepare a pharmaceutical composition for treating hereditary type I tyrosinemia.

10. The use as described in claim 8, characterized in that the nucleotide sequence of the sgRNA is shown in SEQ ID NO: 7.

11. The use as described in any one of claims 8-10, characterized in that the construct includes: construct 1, comprising an expression cassette for the sgRNA targeting the target gene, an expression cassette for the TadA8e variant gene, and an expression cassette for the N-terminal portion of the Cas nuclease gene; and construct 2, comprising an expression cassette for the C-terminal portion of the Cas nuclease.