Epigenetic editing tool targeting hepatitis B virus gene

By using a fusion molecule to introduce repressive epigenetic modifications into specific regulatory regions of the HBV genome, the application addresses the inability of existing technologies to eliminate HBV cccDNA and integrated DNA, achieving targeted HBV gene silencing with the aim of a functional cure of hepatitis B.

WO2026124637A1 · PCT designated stage · Publication Date: 2026-06-18 · EPIGENIC THERAPEUTICS INC

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
EPIGENIC THERAPEUTICS INC
Filing Date
2025-12-12
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

Existing technologies cannot effectively eliminate hepatitis B virus (HBV) cccDNA or integrated DNA, so viral replication rebounds after hepatitis B treatment stops and a functional cure is not achieved. Furthermore, existing epigenetic editing tools lack target specificity and durability.

Method used

Develop a fusion molecule containing a DNA-binding domain, an epigenetic modification domain, and a transcriptional regulatory domain. A DNA-binding domain, such as a CRISPR enzyme (e.g., Cas9) or a zinc finger nuclease, is fused to a DNA methyltransferase and a transcriptional repressor domain. By targeting specific regulatory regions of the HBV genome, the fusion molecule introduces repressive epigenetic modifications that alter the transcriptional activity of cccDNA and integrated genomic DNA, so that the target gene can be silenced.

Benefits of technology

It effectively inhibits HBV gene expression, reduces viral replication and protein expression, achieves the goal of treating hepatitis B, avoids DNA double-strand breaks and immunogenicity risks, and provides a lasting therapeutic effect.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

An epigenetic editing tool targeting a hepatitis B virus gene and the use thereof. A composition for regulating the expression of a hepatitis B virus (HBV) gene in a cell, which composition comprises a fusion molecule or a nucleic acid sequence encoding the fusion molecule, wherein the fusion molecule comprises at least one DNA-binding domain, at least one epigenetic modification domain, and at least one transcriptional regulatory domain.

Description

Epigenetic editing tools targeting hepatitis B virus genes

Technical Field

[0001] This application relates to the field of biomedicine, specifically to a target-specific epigenetic editing tool and its uses.

Background Technology

[0002] Hepatitis B virus (HBV) is a hepatotropic DNA virus that causes chronic hepatitis B infection, leading to persistent liver inflammation and significantly increasing the risk of cirrhosis and liver cancer. The World Health Organization (WHO) estimates that there are currently over 250 million cases of chronic hepatitis B worldwide, resulting in over 800,000 deaths annually and placing a significant burden on global public health. China, as a populous country, has approximately one-third of the world's hepatitis B carriers. Since the 1980s, with increased hepatitis B vaccination rates, new hepatitis B cases in China have been effectively controlled; however, the number of hepatitis B carriers in the country remains substantial.

[0003] Current treatments for HBV, such as nucleos(t)ide analogs and interferon (IFN), can effectively inhibit viral replication, but achieving a functional cure (reducing HBsAg to undetectable levels) is difficult and requires long-term medication. Once treatment stops, HBV replication rebounds. The main reason is that after HBV infects hepatocytes, it forms covalently closed circular DNA (cccDNA), and the HBV genome can also integrate into the host hepatocyte genome. cccDNA and integrated DNA can serve as stable, long-lived templates for viral replication and protein expression, and current conventional treatments eliminate neither of them. Therefore, eliminating or silencing cccDNA and integrated DNA may be an effective treatment for HBV infection. Because cccDNA assembles with histones in the host hepatocyte nucleus to form chromosome-like structures, its transcription is also subject to epigenetic modification. The HBV genome contains three CpG islands, hereinafter referred to as CG I, CG II, and CG III, which are key regions for epigenetic regulation. Transcription of the HBV genome is regulated by four promoters (Xp, Cp, Sp1, Sp2) and two enhancers (Enh I and Enh II). Xp, Cp, Enh I, and Enh II are located in the CG II region, while Sp1 and Sp2 are located near CG I and CG III. Therefore, epigenetic editing of CpG islands may affect multiple regulatory elements, thereby affecting the transcription of the viral genome.
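As an illustration of how CpG-rich regions such as CG I, CG II, and CG III might be flagged computationally, the following minimal Python sketch scores a sequence window by its GC fraction and its observed/expected CpG ratio. The thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected ratio of at least 0.6) follow the commonly used Gardiner-Garden and Frommer criteria; they are an assumption of this sketch and are not stated in this application. The function names and the toy sequences are hypothetical.

```python
def cpg_metrics(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA window."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap itself, so str.count gives the exact CpG count.
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    # Observed/expected ratio: (CpG count * length) / (C count * G count).
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply the assumed length, GC and observed/expected thresholds."""
    gc_fraction, obs_exp = cpg_metrics(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp


if __name__ == "__main__":
    toy_rich = "CG" * 100   # hypothetical CpG-dense window, not an HBV sequence
    toy_poor = "AT" * 100   # hypothetical CpG-free window
    print(cpg_metrics(toy_rich), looks_like_cpg_island(toy_rich))
    print(cpg_metrics(toy_poor), looks_like_cpg_island(toy_poor))
```

In practice such a scan would be run as a sliding window over a real HBV genome sequence; the toy inputs here only exercise the arithmetic.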

[0004] Introducing epigenetic modifications into specific regulatory regions or sites of the HBV genome can alter chromatin structure, thereby shifting the target gene to a transcriptionally repressed state and achieving target gene silencing. This process does not cleave DNA, avoiding the possibility of genomic double-strand breaks and fundamentally eliminating the risk of activating unpredictable DNA repair mechanisms and of generating potentially immunogenic truncated or mutant proteins. However, current research on epigenetic editing tools specifically targeting the HBV genome is mostly in its early stages, and the development of epigenetic editing technologies capable of durably curing or eliminating HBV still faces many unknowns and limitations.

Summary of the Invention

[0005] This application aims to develop a safe and effective therapy that silences and regulates HBV target genes. It provides a method for introducing repressive epigenetic modifications into specific regulatory regions of the HBV genome using epigenetic editing tools, thereby altering the transcriptional activity of HBV cccDNA and integrated genomic DNA, reducing HBV replication and protein expression, and achieving the goal of treating hepatitis B.

[0006] In one aspect, this application provides a composition for regulating the expression of hepatitis B virus (HBV) genes in cells, the composition comprising a fusion molecule or a nucleic acid sequence encoding the fusion molecule, the fusion molecule comprising at least one DNA-binding domain, at least one epigenetic modification domain and at least one transcriptional regulatory domain.

[0007] For example, the complete genome sequence of HBV type B (serotype adw) is shown below:

[0008] For example, the complete genome sequence of type C HBV (serotype adr) is shown below:

[0009] For example, the complete genome sequence of the D-type HBV (serotype ayw) is shown below:

[0010] In some embodiments, the DNA-binding domain is selected from CRISPR enzymes, zinc finger nucleases (ZFNs), transcription activator-like effector (TALE) domains, homing endonucleases, dCas9-FokI nucleases, Argonaute (Ago) nucleases, or MegaTal nucleases.

[0011] In some embodiments, the CRISPR enzyme is a class 2 Cas protein and/or a mutant thereof.

[0012] In some embodiments, the CRISPR enzyme is one or more of the following Cas proteins: type II-A Cas protein, type II-B Cas protein, type II-C Cas protein, type V-A Cas protein, type V-B Cas protein, type V-C Cas protein, type V-U Cas protein, and mutants thereof.

[0013] In some embodiments, the CRISPR enzyme is the Cas9 protein and / or a mutant thereof.

[0014] In some embodiments, the at least one DNA-binding domain is dCas9.

[0015] In some embodiments, the dCas9 includes Staphylococcus aureus dCas9, Streptococcus pyogenes dCas9, Campylobacter jejuni dCas9, Corynebacterium diphtheriae dCas9, Eubacterium ventriosum dCas9, Streptococcus pasteurianus dCas9, Lactobacillus farciminis dCas9, Sphaerochaeta globus dCas9, Azospirillum (e.g., strain B510) dCas9, Gluconacetobacter diazotrophicus dCas9, Neisseria cinerea dCas9, Roseburia intestinalis dCas9, Parvibaculum lavamentivorans dCas9, Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9, Campylobacter lari (e.g., strain CF89-12) dCas9, and Streptococcus thermophilus (e.g., strain LMD-9) dCas9.

[0016] In some embodiments, the dCas9 comprises the amino acid sequence shown in SEQ ID NO: 40-57.

[0017] In some embodiments, the composition further comprises at least one single guide RNA (sgRNA) or a nucleic acid encoding the sgRNA.

[0018] In some embodiments, the sgRNA is complementary to a target nucleotide sequence near the HBV gene and/or within an HBV gene regulatory element, or contains a partial sequence that is complementary to 15-20 consecutive nucleotides of the target nucleotide sequence.
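The 15-20 nucleotide complementarity condition can be checked mechanically. The Python sketch below, offered only as an illustration, finds the longest stretch of an sgRNA spacer that pairs contiguously with a target DNA strand. It assumes strict Watson-Crick pairing (G-U wobble pairs are ignored) and that both sequences are written 5' to 3'; the function names and the example sequences are hypothetical and are not taken from this application.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(spacer_rna: str, target_dna: str) -> int:
    """Longest run of consecutive spacer bases that pair with target_dna.

    Pairing is antiparallel, so a spacer segment pairs with a target segment
    exactly when the reverse complement of the spacer segment equals the
    target segment. The problem reduces to a longest-common-substring search.
    """
    probe = reverse_complement(spacer_rna.upper().replace("U", "T"))
    target = target_dna.upper()
    best = 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(probe) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if probe[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def meets_complementarity(spacer_rna: str, target_dna: str, min_run: int = 15) -> bool:
    """True if at least min_run consecutive bases are complementary."""
    return longest_complementary_run(spacer_rna, target_dna) >= min_run


if __name__ == "__main__":
    target = "GGATCCGATTACAGGCTAGCTTGA"      # hypothetical target strand
    site = target[2:22]                      # 20-nt site within the target
    spacer = reverse_complement(site).replace("T", "U")
    print(longest_complementary_run(spacer, target))
    print(meets_complementarity(spacer, target))
```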

[0019] In some embodiments, the at least one epigenetic modification domain provides methylation modification of at least one nucleotide in the vicinity of the HBV gene and / or within the HBV gene regulatory element.

[0020] In some embodiments, the at least one epigenetic modification domain comprises a DNA methyltransferase (DNMT) or a functionally active fragment thereof.

[0021] In some embodiments, the epigenetic modification domain is selected from one or more of DNMT3A, DNMT3B, DNMT3C, DNMT1, DNMT2 and DNMT3L.

[0022] In some embodiments, the epigenetic modification domain includes at least one DNMT3A and at least one DNMT3L, which are connected by a linker sequence.

[0023] In some embodiments, the epigenetic modification domain includes DNMT3A and DNMT3L, and the C-terminus of DNMT3A is connected to the N-terminus of DNMT3L, or the C-terminus of DNMT3L is connected to the N-terminus of DNMT3A.

[0024] In some embodiments, the DNA methyltransferase comprises the amino acid sequence shown in any one of SEQ ID NOs:4-9.

[0025] In some embodiments, the transcriptional regulatory domain is a transcriptional repressor domain selected from: KRAB, ZIM3KRAB, ZNF680, ZNF554, ZNF264, ZNF582, ZNF324, ZNF669, ZNF354A, ZNF82, ZNF595, ZNF419, ZNF566, ZIM2, EHMT2, SUV39H1, ZFPM1, TRIM28, EZH2, MXD1, SID, LSD1, HP1a, HDAC3, ZNF436, ZNF257, ZNF675, ZNF490, ZNF320, ZNF331, ZNF816, ZNF41, ZNF189, ZNF528, ZNF543, ZNF140, ZNF610, ZNF350, ZNF8, ZNF30, ZNF98, ZNF677, ZNF596, ZNF214, ZNF37A, ZNF34, ZNF250, ZNF547, ZNF273, ZFP82, ZNF224, ZNF33A, ZNF45, ZNF175, ZNF184, ZFP28-1, ZFP28-2, ZNF18, ZNF213, ZNF394, ZFP1, ZFP14, ZNF416, ZNF557, ZNF729, ZNF254, ZNF764, ZNF785, ZNF10, CBX5, RYBP, YAF2, MGA, CBX1, SCMH1, MPP8, SUMO3, HERC2, BIN1, PCGF2, TOX, FOXA1, FOXA2, IRF2BP1, IRF2BP2, IRF2BPL, IRF-2BP1_2 N-terminal domain, HOXA13, HOXB13, HOXC13, HOXA11, HOXC11, HOXC10, HOXA10, HOXB9, HOXA9, ZFP28, ZN334, ZN568, ZN37A, ZN181, ZN510, ZN862, ZN140, ZN208, ZN248, ZN571, ZN699, ZN726, ZIK1, ZNF2, Z705F, ZNF14, ZN471, ZN624, ZNF84, ZNF7, ZN891, ZN337, Z705G, ZN529, ZN729, ZN419, Z705A, ZN302, ZN486, ZN621, ZN688, ZN33A, ZN554, ZN878, ZN772, ZN224, ZN184, ZN544, ZNF57, ZN283, ZN549, ZN211, ZN615, ZN253, ZN226, ZN730, Z585A, ZN732, ZN681, ZN667, ZN649, ZN470, ZN484, ZN431, ZN382, ZN254, ZN124, ZN607, ZN317, ZN620, ZN141, ZN584, ZN540, ZN75D, ZN555, ZN658, ZN684, RBAK, ZN829, ZN582, ZN112, ZN716, HKR1, ZN350, ZN480, ZN416, ZNF92, ZN100, ZN736, ZNF74, ZN443, ZN195, ZN530, ZN782, ZN791, ZN331, Z354C, ZN157, ZN727, ZN550, ZN793, ZN235, ZN724, ZN573, ZN577, ZN789, ZN718, ZN300, ZN383, ZN429, ZN677, ZN850, ZN454, ZN257, ZN264, ZN485, ZN737, ZNF44, ZN596, ZN565, ZN543, ZFP69, SUMO1, ZNF12, ZN169, ZN433, ZN175, ZN347, ZNF25, ZN519, Z585B, ZN517, ZN846, ZN230, ZNF66, ZN713, ZN816, ZN426, ZN674, ZN627, ZNF20, Z587B, ZN316, ZN233, ZN611, ZN556, ZN234, ZN560, ZNF77, ZN682, ZN614, ZN785, ZN445, ZFP30, ZN225, ZN551, ZN610, ZN528, ZN284, ZN418, ZN490, ZN805, Z780B, ZN763, ZN285, ZNF85, ZN223, ZNF90, ZN557, ZN425, ZN229, ZN606, ZN155, ZN222, ZN442, ZNF91, ZN135, ZN778, ZN534, ZN586, ZN567, ZN440, ZN583, ZN441, ZNF43, ZN589, ZN563, ZN561, ZN136, ZN630, ZN527, ZN333, Z324B, ZN786, ZN709, ZN792, ZN599, ZN613, ZF69B, ZN799, ZN569, ZN564, ZN546, ZFP92, ZN723, ZN439, ZFP57, ZNF19, ZN404, ZN274, CBX3, ZN250, ZN570, ZN675, ZN695, ZN548, ZN132, ZN738, ZN420, ZN626, ZN559, ZN460, ZN268, ZN304, ZN605, ZN844, SUMO5, ZN101, ZN783, ZN417, ZN182, ZN823, ZN177, ZN197, ZN717, ZN669, ZN256, ZN251, CBX4, CDY2, CDYL2, ZN562, ZN461, Z324A, ZN766, ID2, ZN214, CBX7, ID1, CREM, SCX, ASCL1, ZN764, SCML2, TWST1, CREB1, TERF1, ID3, CBX8, GSX1, NKX22, ATF1, TWST2, ZNF17, TOX3, TOX4, ZMYM3, I2BP1, RHXF1, SSX2, I2BPL, ZN680, TRI68, HXA13, PHC3, TCF24, HXB13, HEY1, PHC2, ZNF81, FIGLA, SAM11, KMT2B, HEY2, JDP2, HXC13, ASCL4, HHEX, GSX2, ETV7, ASCL3, PHC1, OTP, I2BP2, VGLL2, HXA11, PDLI4, ASCL2, CDX4, ZN860, LMBL4, PDIP3, NKX25, CEBPB, ISL1, CDX2, PROP1, SIN3B, SMBT1, HXC11, HXC10, PRS6A, VSX1, NKX23, MTG16, HMX3, HMX1, KIF22, CSTF2, CEBPE, DLX2, PPARG, PRIC1, UNC4, BARX2, ALX3, TCF15, TERA, VSX2, HXD12, CDX1, TCF23, ALX1, HXA10, RX, CXXC5, SCML1, NFIL3, DLX6, MTG8, CEBPD, SEC13, FIP1, ALX4, LHX3, PRIC2, MAGI3, NELL1, PRRX1, MTG8R, RAX2, DLX3, DLX1, NKX26, NAB1, SAMD7, PITX3, WDR5, MEOX2, NAB2, DHX8, CBX6, EMX2, CPSF6, HXC12, KDM4B, LMBL3, PHX2A, EMX1, NC2B, DLX4, SRY, ZN777, ZN398, GATA3, BSH, SF3B4, TEAD1, TEAD3, RGAP1, PHF1, GATA2, FOXO3, ZN212, IRX4, ZBED6, LHX4, SIN3A, RBBP7, NKX61, R51A1, MB3L1, DLX5, NOTC1, TERF2, ZN282, RGS12, ZN840, SPI2B, PAX7, NKX62, ASXL2, FOXO1, GATA1, ZMYM5, LRP1, MIXL1, SGT1, LMCD1, CEBPA, SOX14, WTIP, PRP19, NKX11, RBBP4, DMRT2, SMCA2, and their functionally active fragments.

[0026] In some embodiments, the transcriptional repressor domain comprises the amino acid sequence shown in any one of SEQ ID NOs:10-39.

[0027] In some embodiments, the transcriptional repressor domain comprises a zinc finger protein-based transcription factor or a functionally active fragment thereof.

[0028] In some embodiments, the zinc finger protein-based transcription factor is a Krüppel-associated box (KRAB) domain or a KRAB domain derived from ZIM3 (ZIM3 KRAB).

[0029] In some embodiments, the transcriptional regulatory domain comprises two or more zinc finger-based transcription factors or their functionally active fragments, wherein the two or more zinc finger-based transcription factors are of the same or different types and are connected by a linker sequence.

[0030] In some embodiments, the transcriptional repressor domain includes a histone modification domain.

[0031] In some embodiments, the histone modification domain is selected from: EZH2, HDAC3, HDAC1, EHMT2(G9A), PRMT1, PRMT5, SETDB1, hSIRT1, HP1a, LSD1, and their functionally active fragments.

[0032] In some embodiments, the histone modification domain comprises the amino acid sequence shown in any one of SEQ ID NO:24-39.

[0033] In some embodiments, the epigenetic modification domain and the transcriptional regulatory domain are both located at the N-terminus or both located at the C-terminus of the DNA-binding domain, and the domains are connected directly or indirectly via linker sequences.

[0034] In some embodiments, the epigenetic modification domain and the transcriptional regulatory domain are located at the N-terminus and C-terminus of the DNA-binding domain, respectively, and the domains are connected directly or indirectly via linker sequences.

[0035] In some embodiments, the linker sequence is an XTEN linker sequence.

[0036] In some embodiments, the fusion molecule comprises, connected in sequence from the N-terminus to the C-terminus: 1) the epigenetic modification domain, the transcriptional regulatory domain, and the DNA-binding domain; or 2) the transcriptional regulatory domain, the epigenetic modification domain, and the DNA-binding domain; or 3) the DNA-binding domain, the epigenetic modification domain, and the transcriptional regulatory domain; or 4) the DNA-binding domain, the transcriptional regulatory domain, and the epigenetic modification domain; or 5) the epigenetic modification domain, the DNA-binding domain, and the transcriptional regulatory domain; or 6) the transcriptional regulatory domain, the DNA-binding domain, and the epigenetic modification domain.
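The six arrangements listed in paragraph [0036] are exactly the 3! = 6 possible orderings of the three domain types. A minimal Python sketch that enumerates them is given below for illustration; the constant and function names are hypothetical, and the numbering produced by the script need not match the numbering used in the paragraph.

```python
from itertools import permutations

# The three domain types named in the paragraph above.
DOMAINS = (
    "epigenetic modification domain",
    "transcriptional regulatory domain",
    "DNA-binding domain",
)


def domain_orders(domains=DOMAINS):
    """Return every N-terminus-to-C-terminus ordering of the given domains."""
    return [list(order) for order in permutations(domains)]


if __name__ == "__main__":
    for index, order in enumerate(domain_orders(), start=1):
        print(f"{index}) N-terminus - " + " - ".join(order) + " - C-terminus")
```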

[0037] In some embodiments, the fusion molecule comprises, connected in sequence from the N-terminus to the C-terminus: 1) one or a combination of DNMT3A and DNMT3L, one or more zinc finger protein-based transcription factors or histone modification domains, and dCas9; or 2) one or more zinc finger protein-based transcription factors or histone modification domains, one or a combination of DNMT3A and DNMT3L, and dCas9; or 3) dCas9, one or a combination of DNMT3A and DNMT3L, and one or more zinc finger protein-based transcription factors; or 4) dCas9, one or more zinc finger protein-based transcription factors or histone modification domains, and one or a combination of DNMT3A and DNMT3L; or 5) one or a combination of DNMT3A and DNMT3L, dCas9, and one or more zinc finger protein-based transcription factors or histone modification domains; or 6) one or more zinc finger protein-based transcription factors or histone modification domains, dCas9, and one or a combination of DNMT3A and DNMT3L.

[0038] In some embodiments, the fusion molecule comprises the following domains: dCas9-DNMT(3A-3L)-KRAB; dCas9-KRAB-DNMT(3A-3L); KRAB-DNMT(3A-3L)-dCas9; DNMT(3A-3L)-KRAB-dCas9; DNMT(3A-3L)-dCas9-EZH2; DNMT(3A-3L)-dCas9-HDAC3; DNMT(3A-3L)-dCas9-HP1a; DNMT(3A-3L)-dCas9-HDAC1; DNMT(3A-3L)-dCas9-PRMT1; DNMT(3A-3L)-dCas9-SETDB1; DNMT(3A-3L)-dCas9-hSIRT1; DNMT(3A-3L)-dCas9-PRMT5; DNMT(3A-3L)-dCas9-G9A, wherein DNMT(3A-3L) indicates that DNMT3A and DNMT3L are directly or indirectly connected in any order, "-" indicates that the domains on either side are directly or indirectly connected in order from the N-terminus to the C-terminus, and KRAB indicates at least one zinc finger protein-based transcription factor selected from KRAB and ZIM3 KRAB.
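The construct notation in paragraph [0038] can be parsed programmatically. The following Python sketch, given only as an illustration, splits a construct name on the hyphens that separate domains (ignoring the hyphen inside "DNMT(3A-3L)") and checks that each construct contains one DNA-binding domain, one epigenetic modification domain, and one transcriptional regulatory domain. The class labels, the lookup table, and the function names are hypothetical conveniences of this sketch.

```python
import re

# Hypothetical lookup table built from the domain names used in the paragraph.
DOMAIN_CLASS = {
    "dCas9": "DNA-binding",
    "DNMT(3A-3L)": "epigenetic modification",
}
for _name in ("KRAB", "EZH2", "HDAC3", "HP1a", "HDAC1", "PRMT1",
              "SETDB1", "hSIRT1", "PRMT5", "G9A"):
    DOMAIN_CLASS[_name] = "transcriptional regulatory"

CONSTRUCTS = [
    "dCas9-DNMT(3A-3L)-KRAB", "dCas9-KRAB-DNMT(3A-3L)",
    "KRAB-DNMT(3A-3L)-dCas9", "DNMT(3A-3L)-KRAB-dCas9",
    "DNMT(3A-3L)-dCas9-EZH2", "DNMT(3A-3L)-dCas9-HDAC3",
    "DNMT(3A-3L)-dCas9-HP1a", "DNMT(3A-3L)-dCas9-HDAC1",
    "DNMT(3A-3L)-dCas9-PRMT1", "DNMT(3A-3L)-dCas9-SETDB1",
    "DNMT(3A-3L)-dCas9-hSIRT1", "DNMT(3A-3L)-dCas9-PRMT5",
    "DNMT(3A-3L)-dCas9-G9A",
]


def split_construct(name: str) -> list[str]:
    """Split on hyphens that are not inside parentheses."""
    return re.split(r"-(?![^()]*\))", name)


def domain_classes(name: str) -> list[str]:
    """Class of each domain, in N-terminus to C-terminus order."""
    return [DOMAIN_CLASS.get(part, "unknown") for part in split_construct(name)]


def is_complete(name: str) -> bool:
    """True if the construct has exactly one domain of each of the three classes."""
    required = ["DNA-binding", "epigenetic modification", "transcriptional regulatory"]
    return sorted(domain_classes(name)) == sorted(required)


if __name__ == "__main__":
    for construct in CONSTRUCTS:
        print(construct, "->", domain_classes(construct), is_complete(construct))
```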

[0039] In some embodiments, the fusion molecule further includes a nuclear localization signal (NLS) and / or a marker domain.

[0040] In some embodiments, the NLS sequence is directly or indirectly fused to the C-terminus, N-terminus, or both ends of the at least one DNA-binding domain.

[0041] In some embodiments, the fusion molecule comprises the amino acid sequence shown in any one of SEQ ID NO: 58-67.

[0042] In some embodiments, the composition is capable of providing modification of at least one nucleotide in the vicinity of the HBV gene and / or within the HBV gene regulatory element.

[0043] In some embodiments, the composition can inhibit HBV gene expression or reduce HBV gene products.

[0044] In some embodiments, the nucleic acid encoding the fusion molecule is deoxyribonucleic acid (DNA) or messenger ribonucleic acid (mRNA).

[0045] In some embodiments, the fusion molecule or the nucleic acid encoding the fusion molecule is packaged in liposomes or lipid nanoparticles.

[0046] In some embodiments, the fusion molecule or nucleic acid encoding the fusion molecule and the sgRNA or nucleic acid encoding the sgRNA are packaged in the same or different liposomes or lipid nanoparticles (LNPs).

[0047] In some embodiments, the liposomes or the lipid nanoparticles comprise ionizable lipids (20%-70% molar ratio), polyethylene glycol-modified lipids (0%-30% molar ratio), supporting lipids (30%-50% molar ratio), and cholesterol (10%-50% molar ratio).
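The molar-ratio ranges in paragraph [0047] lend themselves to a simple consistency check. The Python sketch below is illustrative only: it verifies that each component of a candidate formulation falls within the stated range and, as an additional assumption not stated in this application, that the four molar percentages sum to 100%. The component keys, the function name, and the example formulations are hypothetical.

```python
# Allowed molar-percentage ranges, taken from the paragraph above.
RANGES = {
    "ionizable_lipid": (20.0, 70.0),
    "peg_lipid": (0.0, 30.0),
    "supporting_lipid": (30.0, 50.0),
    "cholesterol": (10.0, 50.0),
}


def check_formulation(mol_percent: dict, tol: float = 1e-6) -> list[str]:
    """Return a list of problems; an empty list means the formulation passes."""
    problems = []
    for component, (low, high) in RANGES.items():
        value = mol_percent.get(component)
        if value is None:
            problems.append(f"{component}: missing")
        elif not low <= value <= high:
            problems.append(f"{component}: {value}% outside {low}-{high}%")
    total = sum(mol_percent.values())
    # Assumption of this sketch: the listed components make up the whole lipid mix.
    if abs(total - 100.0) > tol:
        problems.append(f"total is {total}%, expected 100%")
    return problems


if __name__ == "__main__":
    candidate = {"ionizable_lipid": 35.0, "peg_lipid": 2.0,
                 "supporting_lipid": 33.0, "cholesterol": 30.0}
    print(check_formulation(candidate) or "within the stated ranges")
```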

[0048] In some embodiments, the ionizable lipid is selected from pH-responsive ionizable lipids, thermoresponsive ionizable lipids, and photoresponsive ionizable lipids.

[0049] In some embodiments, the fusion molecule or the nucleic acid encoding the fusion molecule is packaged in an AAV vector.

[0050] In some embodiments, the fusion molecule or the nucleic acid encoding the fusion molecule and the sgRNA or the nucleic acid encoding the sgRNA are packaged in the same or different AAV vectors.

[0051] In some embodiments, the composition is a pharmaceutical composition comprising a pharmaceutically acceptable carrier.

[0052] In another aspect, this application provides a method for reducing or eliminating the expression of hepatitis B virus (HBV) gene products in cells, the method comprising the step of introducing the composition described in this application into the cells, thereby reducing or eliminating the expression of the HBV gene products in the cells.

[0053] In some embodiments, the method is an in vitro method or an in vivo method.

[0054] In another aspect, this application provides a method for treating a subject with a hepatitis B virus (HBV) infection-related disease or alleviating symptoms of an HBV infection-related disease in a subject, the method comprising the step of introducing an effective amount of the composition described in this application into the subject's cells.

[0055] In some embodiments, the subject is a mammal, such as a human, monkey, mouse, rat, rabbit, pig, horse, cat, or dog.

[0056] In some embodiments, the method includes administering the composition to the subject one or more times.

[0057] In some embodiments, the method includes administering the composition to the subject at least twice.

[0058] In some embodiments, the interval between administrations of the composition is 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 11 days, 12 days, 13 days, 14 days or 15 days.

[0059] In some embodiments, the HBV infection-related diseases include hepatitis, cirrhosis, liver fibrosis, and hepatocellular carcinoma caused by HBV infection.

[0060] In another aspect, this application provides the composition described herein for use in treating a subject with a hepatitis B virus (HBV) infection-related disease or alleviating symptoms of an HBV infection-related disease in a subject.

[0061] In another aspect, this application provides a kit comprising the composition described in this application, a container for holding the composition, and/or instructions for use.

[0062] Other aspects and advantages of this application will readily be apparent to those skilled in the art from the detailed description below. Only exemplary embodiments of this application are shown and described in the following detailed description. As will be appreciated by those skilled in the art, the content of this application enables them to make modifications to the disclosed specific embodiments without departing from the spirit and scope of the invention to which this application pertains. Accordingly, the descriptions in the accompanying drawings and specification of this application are merely exemplary and not restrictive.

Description of the Drawings

[0063] The specific features of the invention involved in this application are shown in the appended claims. The features and advantages of the invention can be better understood by referring to the exemplary embodiments and drawings described in detail below. A brief description of the drawings is as follows:

[0064] Figure 1 shows a schematic diagram of the scheme for validating the epigenetic editing function of the composition of this application in a primary human hepatocyte (PHH) HBV infection model.

[0065] Figures 2A-2B show the results of verifying the epigenetic editing function of the composition of this application using the PHH model shown in Figure 1. Figure 2A shows the degree of inhibition of HBsAg and HBeAg / HBcAg antigen expression levels in the culture medium on days 4, 8, and 12 after application of the composition of this application; Figure 2B shows the degree of inhibition of HBV total RNA and pgRNA expression levels in cells on day 12 after application of the composition of this application.

[0066] Figure 3 shows the inhibitory effect of the composition of this application on the antigen expression levels of HBsAg and HBeAg / HBcAg in the HepG2.2.15 cell line.

[0067] Figure 4 shows the knockdown effect of EPIREG#1 of this application on HBV markers in HBV transgenic mice (the HBV marker content shown in each curve is plotted on a base-10 logarithmic scale).

Detailed Description

[0068] The following specific embodiments illustrate the implementation of the invention. Those skilled in the art can easily understand other advantages and effects of the invention from the content disclosed in this specification.

[0069] Terminology Definition

[0070] In this application, the term "fusion molecule" generally refers to a molecule consisting of at least two parts (a bipartite molecule); in this application, for example, it comprises at least one DNA-binding protein and at least one gene expression regulator described in this application, coupled together to form a single entity. For example, the at least one gene expression regulator may be fused to the at least one DNA-binding protein at the N-terminus, at the C-terminus, or at any amino acid other than the terminal amino acids, and other molecules or parts may also be fused to parts already included in the fusion molecule. The parts constituting the fusion molecule may be separated by a linker or may be directly coupled. In some embodiments, the fusion molecule is a fusion protein, which may be a chimeric protein produced by directly or indirectly, covalently or noncovalently, joining two or more genes that originally encode individual proteins. In some embodiments, translation of the fusion gene produces a single polypeptide having functional properties derived from each original protein. Those skilled in the art are familiar with assays for determining the optimal order and/or combination of the parts in the fusion molecule of this application.

[0071] In this application, the terms “polynucleotide,” “nucleotide,” “nucleotide sequence,” “nucleic acid,” and “oligonucleotide” are used interchangeably. They generally refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogues thereof. Polynucleotides can have any three-dimensional structure and can perform any function, known or unknown. The following are non-limiting examples of polynucleotides: coding or non-coding regions of genes or gene fragments, loci (a locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short hairpin RNA (shRNA), microRNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. Polynucleotides may contain one or more modified nucleotides, such as methylated nucleotides and nucleotide analogues. If present, modifications to the nucleotide structure may be introduced before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. Polynucleotides may be further modified after polymerization, such as by conjugation with a labeling component.

[0072] In this application, the term "DNA-binding protein" generally refers to a larger protein composed of one or more DNA-binding domains together with other domains having different functions, the DNA-binding domain being a folded protein domain containing at least one motif that recognizes double-stranded or single-stranded DNA. For example, the DNA-binding domain may recognize a specific DNA sequence (a recognition or regulatory sequence) or have general affinity for DNA. In some cases, other domains of the DNA-binding protein modulate the activity of the DNA-binding domain; the DNA-binding function may be structural or may involve transcriptional regulation, and sometimes these two functions overlap. In some embodiments of the methods and compositions provided in this application, the DNA-binding protein may comprise a (DNA) nuclease, such as a nuclease capable of targeting DNA in a sequence-specific manner or capable of being directed or instructed to target DNA in a sequence-specific manner, such as the CRISPR-Cas system, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), or meganucleases. In some embodiments, the DNA-binding protein is a DNA nuclease derived from the CRISPR-Cas system. For example, the CRISPR-Cas system-derived DNA nuclease is a Cas protein.

[0073] In this application, the term "Cas protein" is used interchangeably with "CRISPR protein," "CRISPR enzyme," "CRISPR-Cas protein," "CRISPR-Cas enzyme," "Cas," "CRISPR effector," or "Cas effector protein," and refers to a component of the CRISPR-Cas system. A Cas protein (e.g., an engineered Cas protein) may have substantially the same nuclease activity as its wild-type counterpart (e.g., between 80% and 100%, between 90% and 100%, between 95% and 100%, between 98% and 100%, between 99% and 100%, between 99.9% and 100%, or about 100%). In some cases, an engineered Cas protein has higher nuclease activity than its wild-type counterpart (e.g., at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% higher). Alternatively or additionally, the Cas protein (e.g., an engineered Cas protein) may have a specificity that is at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% higher than that of the corresponding wild-type Cas protein. In specific instances, the Cas protein (e.g., an engineered Cas protein) has a specificity that is at least 30% higher than that of the corresponding wild-type Cas protein. As used herein, the term “specificity” for Cas may correspond to the number or percentage of on-target polynucleotide cleavage events relative to all polynucleotide cleavage events (including on-target and off-target events). The activity and specificity of the Cas protein are as described in the following literature: Hsu PD et al., DNA targeting specificity of RNA-guided Cas9 nucleases, Nat Biotechnol. Sep 2013; 31(9):827-832 and Slaymaker IM et al., Rationally engineered Cas9 nucleases with improved specificity, Science. Jan 1 2016; 351(6268):84-88. Examples of methods for detecting the activity and specificity of the Cas protein are also described in these references, which are incorporated herein by reference in their entirety.
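The definition of specificity given in paragraph [0073] (on-target cleavage events relative to all cleavage events) can be written out as a short calculation. The Python sketch below is illustrative only; it additionally assumes that "at least 30% higher" is read as a relative improvement over the wild-type value, which is one possible interpretation. The function names and the event counts are hypothetical.

```python
def cleavage_specificity(on_target: int, off_target: int) -> float:
    """Fraction of all cleavage events that are on-target."""
    total = on_target + off_target
    if total == 0:
        raise ValueError("no cleavage events recorded")
    return on_target / total


def relative_improvement(engineered: float, wild_type: float) -> float:
    """Relative increase of the engineered value over the wild-type value."""
    if wild_type == 0:
        raise ValueError("wild-type value must be non-zero")
    return (engineered - wild_type) / wild_type


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    wild_type = cleavage_specificity(on_target=60, off_target=40)
    engineered = cleavage_specificity(on_target=90, off_target=10)
    gain = relative_improvement(engineered, wild_type)
    print(f"wild-type {wild_type:.2f}, engineered {engineered:.2f}, gain {gain:.0%}")
    print("meets the 30% threshold:", gain >= 0.30)
```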

[0074] Codon optimization can be performed on nucleic acid molecules encoding Cas. Examples of codon-optimized sequences in this context are sequences optimized for expression in eukaryotes (e.g., humans), or sequences optimized for another eukaryote such as the animals or mammals discussed herein; see, for example, the SaCas9 human codon-optimized sequence in WO 2014/093622 (PCT/US2013/074667). While this is preferred, it should be understood that other examples are possible, and codon optimization for host species other than humans or for specific organs is known. In some embodiments, the enzyme-coding sequence encoding Cas is codon-optimized for expression in specific cells such as eukaryotic cells. Eukaryotic cells can be cells of a specific organism (such as mammals, including but not limited to humans or non-human eukaryotes or the animals or mammals described herein, such as mice, rats, rabbits, dogs, livestock, or non-human mammals or primates) or cells derived from a specific organism. In some embodiments, methods for altering human germline genetic traits and/or methods for altering animal genetic traits (which may cause them suffering without any substantial medical benefit to humans or animals), and animals produced by such methods, may be excluded. Generally, codon optimization refers to the process of modifying a nucleic acid sequence to enhance expression in a target host cell by replacing at least one codon of the natural sequence (e.g., about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) with codons that are more frequently or most frequently used in the genes of that host cell, while maintaining the natural amino acid sequence. Different species exhibit specific biases toward specific codons of specific amino acids. Codon bias (differences in codon usage between organisms) is generally associated with the translation efficiency of messenger RNA (mRNA), which is thought to depend, among other things, on the nature of the codons being translated and the availability of specific transfer RNA (tRNA) molecules. The dominance of selected tRNAs in a cell generally reflects the most frequently used codons in peptide synthesis. Therefore, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available in "codon usage databases" such as www.kazusa.or.jp/codon/, and these tables can be adapted in various ways. See Nakamura, Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000," Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimization of specific sequences for expression in specific host cells are also available, such as Gene Forge (Aptagen; Jacobus, PA). In some embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more, or all codons) in the sequence encoding Cas correspond to the most frequently used codon for a specific amino acid.
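The codon optimization procedure described in paragraph [0074] (replace each codon with the host's most frequently used synonymous codon while keeping the amino acid sequence unchanged) can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular tool: the usage table below covers only four amino acids and its frequencies are made-up placeholder numbers, not values from a codon usage database; all names are hypothetical.

```python
# Hypothetical, deliberately tiny usage table: codon -> (amino acid, relative frequency).
# The frequencies are illustrative placeholders, NOT database values.
TOY_USAGE = {
    "CTG": ("L", 0.40), "CTC": ("L", 0.20), "TTA": ("L", 0.08),
    "GCC": ("A", 0.40), "GCT": ("A", 0.26), "GCG": ("A", 0.11),
    "AAG": ("K", 0.57), "AAA": ("K", 0.43),
    "ATG": ("M", 1.00),
}


def preferred_codons(usage: dict) -> dict:
    """Map each amino acid to its most frequently used codon in the table."""
    best = {}
    for codon, (amino_acid, freq) in usage.items():
        if amino_acid not in best or freq > best[amino_acid][1]:
            best[amino_acid] = (codon, freq)
    return {amino_acid: codon for amino_acid, (codon, _) in best.items()}


def translate(cds: str, usage: dict) -> str:
    """Translate a coding sequence using the amino acids listed in the table."""
    cds = cds.upper()
    return "".join(usage[cds[i:i + 3]][0] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, usage: dict) -> str:
    """Replace every codon with the preferred synonymous codon for the host."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    best = preferred_codons(usage)
    return "".join(best[usage[cds[i:i + 3]][0]] for i in range(0, len(cds), 3))


if __name__ == "__main__":
    original = "ATGTTAGCGAAA"  # encodes M-L-A-K with less-preferred codons
    optimized = codon_optimize(original, TOY_USAGE)
    print(original, "->", optimized)
    print("protein unchanged:", translate(original, TOY_USAGE) == translate(optimized, TOY_USAGE))
```

A real workflow would load a complete usage table for the host organism and would usually weigh further constraints (for example GC content or avoidance of unwanted motifs) that this sketch leaves out.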

[0075] In some embodiments, the Cas protein may have nucleic acid cleavage activity. The Cas protein may have RNA binding and DNA cleavage functions. In some embodiments, Cas may direct the cleavage of one or two nucleic acid strands at or near a target sequence location, such as within the target sequence and/or within the complementary sequence of the target sequence or at a sequence associated with the target sequence, for example, within approximately 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500 or more base pairs from the first or last nucleotide of the target sequence. In some embodiments, the Cas protein can direct more than one cleavage (e.g., one, two, three, four, five, or more cleavages) of one or both strands within the target sequence and/or its complementary sequence, or at a sequence associated with the target sequence, and/or within approximately 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500 or more base pairs from the first or last nucleotide of the target sequence. In some embodiments, the cleavage can be blunt, i.e., producing blunt ends. In some embodiments, the cleavage can be staggered, i.e., producing sticky ends.

[0076] In some embodiments, the vector encodes a nucleic acid-targeting Cas protein that may be mutated relative to the corresponding wild-type enzyme such that the mutated Cas protein lacks the ability to cleave one or both strands of a target polynucleotide containing the target sequence. For example, an alteration or mutation in the HNH domain produces a mutated Cas protein that substantially lacks all DNA cleavage activity. A Cas protein may be considered to substantially lack all DNA cleavage activity when the DNA cleavage activity of the mutated enzyme is no more than about 25%, 10%, 5%, 1%, 0.1%, 0.01%, or less of the cleavage activity of the unmutated form of the enzyme; an example is when the cleavage activity of the mutated form is zero or negligible compared to the unmutated form.

[0077] In some embodiments, the Cas protein may form a component of an inducible system. The inducible nature of the system allows spatiotemporal control of gene editing or gene expression using a form of energy. The form of energy may include, but is not limited to, electromagnetic radiation, acoustic energy, chemical energy, and thermal energy. Examples of inducible systems include tetracycline-inducible promoters (Tet-On or Tet-Off), small-molecule two-hybrid transcriptional activation systems (FKBP, ABA, etc.), or light-inducible systems (phytochrome, LOV domain, or cryptochrome). In one embodiment, a CRISPR effector protein may be part of a light-inducible transcriptional effector (LITE) that directs changes in transcriptional activity in a sequence-specific manner. The components of the light-inducible system may include a CRISPR effector protein, a light-responsive cryptochrome heterodimer (e.g., from Arabidopsis thaliana), and a transcriptional activation/repression domain. Other examples of inducible DNA-binding proteins and methods of use thereof are provided in US 61/736,465 and US 61/721,283, and WO 2014018423 A2 (which is hereby incorporated herein by reference in its entirety).

[0078] In some embodiments, the mutated Cas may have one or more mutations that reduce off-target effects, for example improved CRISPR enzymes that (e.g., when complexed with guide RNA) modify the target locus while reducing or eliminating off-target activity, and improved CRISPR enzymes that (e.g., when complexed with guide RNA) have enhanced CRISPR enzyme activity. It should be understood that the mutated enzymes described below can be used in any method described elsewhere herein in accordance with this application. Any methods, products, compositions, and uses described elsewhere herein are equally applicable to the mutated CRISPR enzymes further detailed below.

[0079] Methods and mutations applicable in various combinations to enhance or reduce the activity and / or specificity of the target nuclease compared to off-target activity, or to enhance or reduce the binding and / or specificity of the target binding compared to off-target binding, can be used to compensate for or enhance mutations or modifications made to promote other effects. Such mutations or modifications to promote other effects include mutations or modifications to Cas and / or mutations or modifications to the guide RNA. The methods and mutations of this application are used to regulate Cas nuclease activity and / or binding to chemically modified guide RNA.

[0080] In some embodiments, the catalytic activity of the Cas protein of this application is altered or modified. It should be understood that if the catalytic activity differs from that of the corresponding wild-type Cas protein (e.g., unmutated Cas protein), the mutated Cas has altered or modified catalytic activity. Catalytic activity can be determined by methods known in the art. For example, and not limited to, catalytic activity can be determined in vitro or in vivo by measuring the percentage of insertions / deletions (e.g., after a given time, or at a given dose). In some embodiments, catalytic activity is enhanced. In some embodiments, the catalytic activity is enhanced by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In some embodiments, catalytic activity is reduced. In some embodiments, the catalytic activity is reduced by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or (substantially) 100%. One or more mutations described herein can deactivate the catalytic activity, which can significantly reduce all catalytic activity, reducing the activity below detectable levels or to unmeasurable catalytic activity.
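As a non-limiting numerical illustration of the comparison described above, the following Python sketch computes an insertion/deletion (indel) percentage from read counts and expresses a mutated enzyme's activity as a percent change relative to the wild type. The function names and the read counts are hypothetical and are not data from this application.

```python
def indel_percentage(reads_with_indels: int, total_reads: int) -> float:
    """Percentage of sequenced reads that carry an insertion or deletion."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * reads_with_indels / total_reads


def percent_change(mutant_activity: float, wild_type_activity: float) -> float:
    """Change in activity relative to wild type; negative values mean reduced."""
    if wild_type_activity <= 0:
        raise ValueError("wild_type_activity must be positive")
    return 100.0 * (mutant_activity - wild_type_activity) / wild_type_activity


if __name__ == "__main__":
    # Hypothetical read counts, for illustration only.
    wild_type = indel_percentage(400, 1000)
    mutant = indel_percentage(50, 1000)
    print(wild_type, mutant, percent_change(mutant, wild_type))
```

In this hypothetical case the mutated form would count as having catalytic activity reduced by at least 80%, since the computed change is below -80%.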

[0081] One or more characteristics of an engineered Cas protein may differ from those of the corresponding wild-type Cas protein. Examples of such characteristics include catalytic activity, gRNA binding, Cas protein specificity (e.g., editing to determine target specificity), Cas protein stability, off-target binding, target binding, protease activity, nickase activity, and PFS recognition. In some instances, the engineered Cas protein may contain one or more mutations of the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein exhibits enhanced catalytic activity compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein exhibits decreased catalytic activity compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein exhibits increased gRNA binding compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein exhibits decreased gRNA binding compared to the corresponding wild-type Cas protein. In some embodiments, the Cas protein exhibits enhanced specificity compared to the corresponding wild-type Cas protein. In some embodiments, the Cas protein exhibits decreased specificity compared to the corresponding wild-type Cas protein. In some embodiments, the Cas protein exhibits enhanced stability compared to the corresponding wild-type Cas protein. In some embodiments, the Cas protein exhibits decreased stability compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein further comprises one or more mutations that inactivate catalytic activity. In some embodiments, off-target binding of the Cas protein is increased compared to the corresponding wild-type Cas protein. In some embodiments, off-target binding of the Cas protein is decreased compared to the corresponding wild-type Cas protein. 
In some embodiments, target binding of the Cas protein is increased compared to the corresponding wild-type Cas protein. In some embodiments, target binding of the Cas protein is decreased compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein has higher protease activity or polynucleotide binding capacity compared to the corresponding wild-type Cas protein. In some embodiments, PFS recognition is altered compared to the corresponding wild-type Cas protein.

[0082] Examples of Cas proteins include class 1 (e.g., types I, III, and IV) and class 2 (e.g., types II, V, and VI) Cas proteins, such as Cas9, Cas12 (e.g., Cas12a, Cas12b, Cas12c, Cas12d), Cas13 (e.g., Cas13a, Cas13b, Cas13c, Cas13d), CasX, CasY, Cas14, their variants (e.g., mutant forms, truncated forms), their homologs, and their orthologs. The terms "ortholog" and "homolog" are well known in the art. By way of further guidance, a "homolog" of a protein, as used herein, is a protein of the same species that performs the same or a similar function as the protein of which it is a homolog. Homologous proteins may be, but are not necessarily, structurally related, or are only partially structurally related. An "ortholog" of a protein, as used herein, is a protein of a different species that performs the same or a similar function as the protein of which it is an ortholog. Orthologous proteins may be, but are not necessarily, structurally related, or are only partially structurally related.

[0083] In some embodiments, the Cas protein is a class 2 Cas protein, i.e., a Cas protein of a class 2 CRISPR-Cas system. Class 2 CRISPR-Cas systems may have subtypes, such as type II-A, type II-B, type II-C, type V-A, type V-B, type V-C, or type V-U. In some embodiments, the Cas protein is Cas9, Cas12a, Cas12b, Cas12c, or Cas12d. In some embodiments, Cas9 may be SpCas9, SaCas9, StCas9, or another Cas9 ortholog. Cas12 may be Cas12a, Cas12b, or Cas12c, including FnCas12a or its homologs or orthologs. The definition and exemplary members of CRISPR-Cas systems include those described in the following literature: Kira S. Makarova and Eugene V. Koonin, Annotation and Classification of CRISPR-Cas systems, Methods Mol Biol. 2015; 1311:47-75, and Sergey Shmakov et al., Diversity and evolution of class 2 CRISPR-Cas systems, Nat Rev Microbiol. 2017 Mar; 15(3):169-182.

[0084] In some instances, Cas proteins contain at least one RuvC domain and at least one HNH domain. Cas proteins may also contain first and second linker domains connecting the RuvC domain to the HNH domain. The first linker (L1) and second linker (L2) connecting the HNH domain to the RuvC domain in Cas9 are described in Nishimasu, H. et al., “Crystal structure of Cas9 in complex with guide RNA and target DNA”, Cell 156:935-949 (February 27, 2014), and Ribeiro, L. et al. (2018), “Protein engineering strategies to expand CRISPR-Cas9 applications”, International Journal of Genomics, Volume 2018, Article ID 1652567 (doi.org/10.1155/2018/1652567). Figure 1 of Ribeiro shows the overall organization, structure, and function of Cas9, and is specifically incorporated herein by reference. Specifically, Figure 1A of Ribeiro shows a schematic diagram of the domain organization of SpCas9, illustrating the arrangement of the HNH and RuvC domains, including the linkers L1 (spanning amino acids 765-780) and L2 (spanning amino acids 906-918) as described herein. Similarly, when referring to the first and second linker domains, the domain organization of Staphylococcus aureus Cas9 (SaCas9) can be utilized. In one embodiment, the linker 1 region spans residues 481-519 and connects the RuvC-II domain to the HNH domain in SaCas9. In some embodiments, the linker 2 region spans residues 629-649 and connects the RuvC-III domain of SaCas9 to the HNH domain. Therefore, the first and/or second linker domains can be mutated in Cas9 orthologs, with reference to the corresponding amino acid residues of wild-type SaCas9. See Nishimasu, Cell. 27 Aug 2015; 162(5):1113-1126; doi:10.1016/j.cell.2015.08.007, which is incorporated herein by reference. 
In particular, Figure 1 and Figures S1-S3 of Nishimasu detail the domain organization of the Cas9 protein, the teachings of which are incorporated herein by reference. The first and second linkers may contain about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or more amino acids. The first and second linkers may correspond to the wild-type linkers. In some aspects, the first and/or second linker may contain one or more mutations. In one aspect, the first and/or second linker contains one or more mutations that enhance Cas9 protein specificity. In some embodiments, the linkers L1 and L2 connecting the HNH domain of Cas9 to the RuvC domain contain wild-type amino acid sequences. In some embodiments, the linker connecting the HNH and RuvC domains contains a mutation in one or more amino acids. In one embodiment, the first linker (L1) contains a mutation corresponding to amino acid substitution T769I of SpCas9, and/or the second linker (L2) contains a mutation corresponding to amino acid substitution G915M of SpCas9. In one embodiment, one or more linker mutations, such as T769I and G915M, confer enhanced specificity on the Cas9 protein. In one embodiment, as described herein, one or more mutations in the first and second linkers can be combined with one or more mutations in other parts of the Cas9 protein to further enhance specificity and/or maintain activity substantially equivalent to that of the wild-type Cas9 protein. In one embodiment, mutations in the linkers and/or additional mutations in the Cas protein that enhance specificity relative to wild-type Cas9 while substantially preserving wild-type activity can be identified using methods detailed herein.

[0085] In some embodiments, the Cas protein may be a Cas protein of a type II CRISPR-Cas system (a type II Cas protein), such as Cas9. In some embodiments, a CRISPR/Cas9-based system may include a Cas9 protein or a fragment thereof, a Cas9 fusion protein, a nucleic acid encoding a Cas9 protein or a fragment thereof, or a nucleic acid encoding a Cas9 fusion protein. "Cas9 (CRISPR-associated protein 9)" refers to a polypeptide or fragment thereof having at least about 85% amino acid identity with NCBI accession number NP_269215 and possessing RNA-binding activity, DNA-binding activity, and/or DNA-cleavage activity (e.g., endonuclease or nickase activity). The function of Cas9 can be defined by any of a variety of assays, including but not limited to fluorescence polarization-based nucleic acid binding assays, fluorescence polarization-based strand invasion assays, transcription assays, EGFP disruption assays, DNA cleavage assays, and/or Surveyor assays. "Cas9 nucleic acid molecule" refers to a polynucleotide encoding a Cas9 polypeptide or a fragment thereof. An exemplary Cas9 nucleic acid sequence is provided under genome sequence number NC_002737. In some embodiments, inhibitors of Cas9, such as naturally occurring Cas9 or variants thereof in *Streptococcus pyogenes* (SpCas9) or *Staphylococcus aureus* (SaCas9), are disclosed herein. Cas9 recognizes foreign DNA through the protospacer adjacent motif (PAM) sequence and through base pairing of the guide RNA (gRNA) with the target DNA. The relative ease with which Cas9 induces targeted strand breaks at any genomic locus enables efficient genome editing across a wide range of cell types and organisms. Cas9 derivatives can also be used as transcriptional activators/repressors.

[0086] In some cases, the CRISPR-Cas protein is Cas9 or a variant thereof. In some instances, Cas9 can be wild-type Cas9, including any naturally occurring bacterial Cas9, or can be a codon-optimized or modified form, including any chimera, mutant, homolog, or ortholog. In another aspect of this application, the Cas9 enzyme may contain one or more mutations and can be used as a universal DNA-binding protein, with or without fusion to a functional domain. The mutation can be artificially introduced, or can be a gain-of-function or loss-of-function mutation. Other aspects of this application relate to mutated Cas9 enzymes fused with domains, including but not limited to nucleases, transcriptional activators, transcriptional repressors, recombinases, transposases, histone remodelers, demethylases, DNA methyltransferases, cryptochromes, light-inducible/controllable domains, or chemically inducible/controllable domains. In some cases, the Cas9 enzyme may be, or may be derived from, SpCas9 (Streptococcus pyogenes Cas9), SaCas9 (Staphylococcus aureus Cas9), or StCas9 (wild-type Cas9 of Streptococcus thermophilus). As used herein, the term “derived” for enzymes means that the derived enzyme is largely based on a wild-type enzyme, in the sense of having high sequence homology with it, but has been mutated (modified) in some manner known in the art or described herein. In examples, mutations may include one or more mutations in the first linker domain, the second linker domain, and/or other parts of the protein. High sequence homology relative to the wild-type enzyme may include at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher.

[0087] In specific embodiments, the CRISPR enzyme may be a Cas9 protein derived from or originating from an organism of a taxon including: Streptococcus, Campylobacter, Nitratifractor, Staphylococcus, Parvibaculum, Roseburia, Neisseria, Gluconacetobacter, Azospirillum, Sphaerochaeta, Lactobacillus, Eubacterium, Corynebacter, Carnobacterium, Rhodobacter, Listeria, Paludibacter, Clostridium, Lachnospiraceae, Clostridiaridium, Leptotrichia, Francisella, Legionella, Alicyclobacillus, Methanomethyophilus, Porphyromonas, Prevotella, Bacteroidetes, Helcococcus, Leptospira, Desulfovibrio, Desulfonatronum, Opitutaceae, Tuberibacillus, Bacillus, Brevibacillus, Methylobacterium, Acidaminococcus, Sutterella, Treponema, Filifactor, Mycoplasma, Bacteroides, Flaviivola, or Flavobacterium.

[0088] In some implementations, the CRISPR enzyme may be, or may be derived from, a Cas9 protein of an organism including: *Streptococcus mutans*, *Streptococcus agalactiae*, *Streptococcus equisimilis*, *Streptococcus sanguinis*, *Streptococcus pneumoniae*; *Campylobacter jejuni*, *Escherichia coli*; *Bacillus thuringiensis*, *N. tergarcus*; *Staphylococcus auricularis*, *S. carnosus*; *Neisseria meningitidis*, *N. gonorrhoeae*; *Listeria monocytogenes*, *Listeria ivanovii*; *Clostridium botulinum*, *Clostridium difficile*, *Clostridium tetani*, *Clostridium sordellii*; *Francisella tularensis* 1, *Francisella tularensis* subsp. novicida, *Prevotella albensis*, *Lachnospiraceae bacterium MC20171*, *Butyrivibrio proteoclasticus*, *Peregrinibacteria bacterium GW2011_GWA2_33_10*, *Parcubacteria bacterium GW2011_GWC2_44_17*, *Smithella* sp. SCADC, *Acidaminococcus* sp. BV3L6, *Lachnospiraceae bacterium MA2020*, *Candidatus Methanoplasma termitum*, *Eubacterium eligens*, *Moraxella bovoculi 237*, *Leptospira inadai*, *Lachnospiraceae bacterium ND2006*, *Porphyromonas crevioricanis 3*, *Prevotella disiens*, and *Porphyromonas macacae*. In some embodiments, the Cas9 protein is, or is derived from, the Cas9 of *Streptococcus pyogenes*, *Staphylococcus aureus*, or *Streptococcus thermophilus*.

[0089] In a more preferred embodiment, the Cas9 protein is derived from a bacterial species selected from Streptococcus pyogenes, Staphylococcus aureus, or Streptococcus thermophilus. In some embodiments, Cas9 is derived from a bacterial species selected from *Francisella tularensis* 1, *Prevotella albensis*, *Lachnospiraceae bacterium MC20171*, *Butyrivibrio proteoclasticus*, *Peregrinibacteria bacterium GW2011_GWA2_33_10*, *Parcubacteria bacterium GW2011_GWC2_44_17*, *Smithella* sp. SCADC, *Acidaminococcus* sp. BV3L6, *Lachnospiraceae bacterium MA2020*, *Candidatus Methanoplasma termitum*, *Eubacterium eligens*, *Moraxella bovoculi 237*, *Leptospira inadai*, *Lachnospiraceae bacterium ND2006*, *Porphyromonas crevioricanis 3*, *Prevotella disiens*, and *Porphyromonas macacae*. In some embodiments, the Cas9 protein is derived from a bacterial species selected from *Acidaminococcus* sp. BV3L6 and *Lachnospiraceae bacterium MA2020*. In some embodiments, the effector protein is derived from a subspecies of *Francisella tularensis* 1, including but not limited to *Francisella tularensis* subsp. novicida.

[0090] Cas9 proteins include, but are not limited to, Streptococcus pyogenes serotype M1 Cas9 (UniProt ID: Q99ZW2), Staphylococcus aureus Cas9 (UniProt ID: J7RUA5), Eubacterium ventriosum Cas9 (UniProt ID: A5Z395), Azospirillum sp. (strain B510) Cas9 (UniProt ID: D3NT09), Gluconacetobacter diazotrophicus (strain ATCC 49037) Cas9 (UniProt ID: A9HKP2), Neisseria cinerea Cas9 (UniProt ID: D0W2Z9), Roseburia intestinalis Cas9 (UniProt ID: C7G697), Parvibaculum lavamentivorans (strain DS-1) Cas9 (UniProt ID: A7HP89), Nitratifractor salsuginis (strain DSM 16511) Cas9 (UniProt ID: E6WZS9), and Campylobacter lari Cas9 (UniProt ID: G1UFN3).

[0091] Enzymatic activity of Cas9 derived from Streptococcus pyogenes or any closely related Cas9 produces a double-strand break at a target site sequence that hybridizes to 20 nucleotides of a guide sequence and is followed by a protospacer adjacent motif (PAM) sequence (examples include NGG/NRG, or a PAM that can be determined as described herein). CRISPR activity for site-specific DNA recognition and cleavage via Cas9 is defined by the guide sequence, the tracr sequence that partially hybridizes to the guide sequence, and the PAM sequence. Further details of the CRISPR system are described in Karginov and Hannon, The CRISPR system: small RNA-guided defense in bacteria and archaea, Mol Cell, January 15, 2010; 37(1):7. The type II CRISPR locus from *Streptococcus pyogenes* SF370 comprises a cluster of four genes, Cas9, Cas1, Cas2, and Csn1, along with two non-coding RNA elements: tracrRNA and a characteristic array of repetitive sequences (direct repeats) separated by short stretches of non-repetitive sequence (spacers, each approximately 30 bp). In this system, targeted DNA double-strand breaks (DSBs) are generated in four consecutive steps. First, two non-coding RNAs (the pre-crRNA array and tracrRNA) are transcribed from the CRISPR locus. Second, the tracrRNA hybridizes with the direct repeats of the pre-crRNA, which is then processed into mature crRNAs each containing a single spacer sequence. Third, the mature crRNA:tracrRNA complex directs Cas9 to a DNA target composed of the protospacer and the corresponding PAM, via heteroduplex formation between the spacer region of the crRNA and the protospacer DNA. Finally, Cas9 mediates cleavage of the target DNA upstream of the PAM to generate a DSB within the protospacer. In some implementations, Cas9 may be constitutively present, inducible, conditionally present, administered, or delivered. 
Cas9 optimization can be used to enhance function or develop new functions. Chimeric Cas9 proteins can be generated, and Cas9 can be used as a universal DNA-binding protein. The structural information provided for Cas9 can be used for further engineering and optimization of the CRISPR-Cas system, and this can also infer structure-function relationships in other CRISPR enzyme systems, particularly in other type II CRISPR enzymes or Cas9 orthologs. Furthermore, the Cas9 protein contains an easily identifiable C-terminal region homologous to the transposon ORF-B and includes an active RuvC-like nuclease (an arginine-rich region).
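Purely as an illustrative sketch of the target-site definition given above (a 20-nucleotide target followed by an NGG PAM), the following Python function scans one strand of a DNA sequence for candidate SpCas9 sites. The function name and the toy sequence are hypothetical. The cut position three base pairs upstream of the PAM is an assumption based on commonly reported SpCas9 behavior and is not stated in this application; only the given strand is scanned, and NRG PAMs are not considered.

```python
import re


def find_spcas9_sites(seq: str, guide_len: int = 20) -> list:
    """Return candidate sites: a protospacer of guide_len nt followed by NGG."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for match in pattern.finditer(seq):
        start = match.start()
        sites.append({
            "start": start,                      # 0-based index of protospacer
            "protospacer": match.group(1),
            "pam": match.group(2),
            # Assumption: blunt cut between the 3rd and 4th nt upstream of PAM.
            "cut_index": start + guide_len - 3,
        })
    return sites


if __name__ == "__main__":
    toy = "TT" + "ACGT" * 5 + "TGG" + "AA"   # hypothetical toy sequence
    for site in find_spcas9_sites(toy):
        print(site)
```

A fuller search would also scan the reverse complement, since a target site may lie on either strand.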

[0092] In this application, the term "gene expression regulator" is generally selected from gene expression repressors (e.g., KRAB), gene expression activators, or epigenetic modification regulators (e.g., DNMT3A, DNMT3L, DNMT3A-DNMT3L fusion peptide), or any combination thereof. Various gene expression regulators are known in the art; see, for example, Thakore et al., Nat Methods. 2016; 13:127-37, which is incorporated herein by reference in its entirety.

[0093] In some embodiments, the gene expression regulator includes a gene expression repressor. The repressor can be any known gene expression repressor, for example, selected from the Krüppel-associated box (KRAB) domain, the mSin3 interacting domain (SID), MAX-interacting protein 1 (MXI1), the chromo shadow domain, the EAR repression domain (SRDX), eukaryotic release factor 1 (ERF1), eukaryotic release factor 3 (ERF3), tetracycline repressor, lacI repressor, Catharanthus roseus G-box binding factors 1 and 2, Drosophila Groucho, tripartite motif-containing 28 (TRIM28), nuclear receptor co-repressor 1, nuclear receptor co-repressor 2, or fragments or fusions thereof. For example, the Krüppel-associated box (KRAB) domain is a type of transcriptional repressor domain present in the N-terminal portion of many zinc finger-based transcription factors. The KRAB domain acts as a transcriptional repressor when it is tethered to target DNA via a DNA-binding domain. The KRAB domain is rich in charged amino acids and can be divided into subdomains A and B. The KRAB A and B subdomains can be separated by a variable spacer region, and many KRAB proteins contain only the A subdomain. A 45-amino acid sequence of the KRAB A subdomain has been shown to be important for transcriptional repression. The B subdomain itself does not repress transcription but enhances the repression exerted by the KRAB A subdomain. The KRAB domain recruits the co-repressor KAP1 (KRAB-associated protein-1, also known as transcription intermediary factor 1β, KRAB-A interacting protein, and tripartite motif protein 28) and heterochromatin protein 1 (HP1), as well as other chromatin regulatory proteins, leading to transcriptional repression through heterochromatin formation. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to a KRAB domain or a fragment thereof. In some embodiments, the KRAB domain or a fragment thereof is fused to the N-terminus of dCas9. 
In some embodiments, the KRAB domain or a fragment thereof is fused to the C-terminus of dCas9. In one embodiment, the KRAB domain or a fragment thereof is fused to both the N-terminus and the C-terminus of the dCas9 molecule. In some embodiments, the fusion molecule comprises a KRAB domain comprising the sequence shown in SEQ ID NO:10, a sequence substantially identical to SEQ ID NO:10 (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identity), or a sequence having one, two, three, four, five or more alterations (e.g., amino acid substitutions, insertions or deletions) relative to SEQ ID NO:10, or any fragment thereof. In some embodiments, the KRAB domain is a KRAB domain found in one of the many Krüppel-type C2H2 zinc finger proteins, for example, a KRAB domain derived from ZIM3 (ZIM3 KRAB). For example, the fusion molecule comprises a ZIM3 KRAB domain comprising the sequence shown in SEQ ID NO:11, a sequence substantially identical to SEQ ID NO:11 (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identity), or a sequence having one, two, three, four, five or more alterations (e.g., amino acid substitutions, insertions or deletions) relative to SEQ ID NO:11, or any fragment thereof. Active fragments of other KRAB domains can be identified by any suitable alignment method in the art.
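For illustration only, the identity thresholds recited above (e.g., at least 80% identity to a reference sequence) can be checked on a pair of sequences that have already been aligned, as in the Python sketch below. The function names and the short peptide strings are hypothetical and are not SEQ ID NO:10 or SEQ ID NO:11. The sketch assumes identity is counted over alignment columns, which is only one of several conventions in use, and assumes the alignment itself was produced beforehand by a suitable alignment method.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when the aligned pair reaches the stated identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"   # hypothetical peptide, for illustration only
    variant = "MKTAYLAKQR"     # one substitution relative to the reference
    print(percent_identity(reference, variant))
```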

[0094] In some embodiments, the gene expression regulator includes an activator of gene expression. The activator can be any known gene expression activator, such as the VP16 activation domain, VP64 activation domain, p65 activation domain, Epstein-Barr virus R trans-activator Rta molecule, or fragments thereof. Activation of dCas9 is known in the art. See, for example, Chavez et al., Nat Methods. (2016) 13:563-67, which is incorporated herein by reference in its entirety. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused with VP64, p65, Rta, or any combination thereof. The triple activator VP64-p65-Rta (also known as VPR) (where the three transcriptional activation domains are fused using short amino acid linkers) can effectively upregulate target gene expression when fused with dCas9. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused with VPR.

[0095] In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to a gene expression regulator, wherein the gene expression regulator comprises an epigenetic modification regulator. In some embodiments, the fusion molecule regulates the expression of a target gene at a regulatory element (e.g., promoter, enhancer, or transcription start site) of the target gene through epigenetic modification, such as histone acetylation or methylation, or DNA methylation. The regulator may be any known epigenetic modification regulator, such as a histone acetyltransferase (e.g., p300 catalytic domain), histone deacetylase, histone methyltransferase (e.g., SUV39H1 or G9a (EHMT2)), histone demethylase (e.g., LSD1), DNA methyltransferase (e.g., DNMT3A or DNMT3A-DNMT3L), DNA demethylase (e.g., TET1 catalytic domain or TDG), or fragments thereof.

[0096] In some embodiments, the epigenetic modification regulator may have histone modification activity; histone modification activity may include, but is not limited to, histone deacetylase, histone acetyltransferase, histone demethylase, or histone methyltransferase activity. For example, the epigenetic modification regulator may have histone acetyltransferase activity, wherein the histone acetyltransferase may be p300 or CREB-binding protein (CBP) protein or a fragment thereof. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to acetyltransferase p300 or a fragment thereof (e.g., the catalytic core of p300). In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to CREB-binding protein (CBP) protein or a fragment thereof. Again, for example, the epigenetic modification regulator may have histone demethylase activity. In some embodiments, the epigenetic modification regulator may include an enzyme that removes methyl (CH3-) groups from nucleic acids or proteins (e.g., histones). In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to Lys-specific histone demethylase 1 (LSD1) or a fragment thereof. As another example, the epigenetic modification regulator may have histone methyltransferase activity. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to SUV39H1 or a fragment thereof. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to G9a (EHMT2) or a fragment thereof.

[0097] In some embodiments, the epigenetic modification regulator may have DNA demethylase activity. In some embodiments, the epigenetic modification regulator may convert methylcytosine to hydroxymethylcytosine as a mechanism for demethylating DNA. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to ten-eleven translocation methylcytosine dioxygenase 1 (TET1) or a fragment thereof. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to thymine DNA glycosylase (TDG) or a fragment thereof. As another example, the epigenetic modification regulator may have DNA methyltransferase activity. In some embodiments, the epigenetic modification regulator may have methyltransferase activity involving the transfer of methyl groups to DNA, RNA, proteins, small molecules, cytosine, or adenine. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to DNMT3A or a fragment thereof. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to DNMT3L or a fragment thereof. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to DNMT3A and DNMT3L or fragments thereof. In some embodiments, the methods and compositions provided herein include a fusion molecule comprising dCas9 fused to a DNMT3A-DNMT3L fusion peptide. In some embodiments, the epigenetic modification regulator having DNA methyltransferase activity is human-derived DNMT3L comprising the amino acid sequence shown in SEQ ID NO:7.

[0098] In this application, the term "guide RNA (gRNA)" is used interchangeably with "guide molecule," "guide sequence," and "single guide RNA (sgRNA)," and in the case of a CRISPR-Cas system, it typically includes any polynucleotide sequence that is sufficiently complementary to the target DNA sequence to hybridize with it and guide the nucleic acid targeting complex (e.g., the composition described in this application) to specifically bind to the target DNA sequence. The guide RNA may form a duplex with the target DNA sequence. In some embodiments, the guide RNA is capable of forming a complex with a CRISPR-Cas protein and includes a guide sequence that is sufficiently complementary to the target DNA sequence to hybridize with it and direct sequence-specific binding of the complex to the target DNA sequence. The guide molecule or guide RNA of the CRISPR-Cas protein may include a tracr-mate sequence (including a "direct repeat sequence" in the case of an endogenous CRISPR system) and a guide sequence (also referred to as a "spacer" in the case of an endogenous CRISPR system). In some embodiments, the CRISPR-Cas system or complex described herein does not contain a tracr sequence and / or is independent of the presence of a tracr sequence. In some embodiments, the guide molecule may comprise or consist substantially of a direct repeat sequence fused to or linked to a guide sequence or spacer sequence. Generally, CRISPR-Cas systems are characterized by elements that promote the formation of CRISPR complexes at target DNA sequence sites, wherein hybridization between the target DNA sequence and the guide sequence promotes the formation of the CRISPR complex.

[0099] In some embodiments, the guide sequence or spacer region of the guide molecule is 15 to 50 nucleotides in length. In some embodiments, the spacer region of the guide RNA is at least 15 nucleotides in length. In some embodiments, the spacer region is 15 to 17 nucleotides in length, 17 to 20 nucleotides in length, 20 to 24 nucleotides in length, 23 to 25 nucleotides in length, 24 to 27 nucleotides in length, 27 to 30 nucleotides in length, 30 to 35 nucleotides in length, or longer than 35 nucleotides in length. In some implementations, the length of the guide sequence is any whole number of nucleotides from 15 to 100, inclusive.

[0100] In some implementations, the sequence of the guide molecule (direct repeat sequence and / or spacer region) is selected to reduce the degree of secondary structure within the guide molecule. In some implementations, when optimally folded, approximately or less than approximately 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1% or less of the nucleotides of the nucleic acid-targeting guide RNA are involved in self-complementary base pairing. Optimal folding can be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimum Gibbs free energy. An example of such an algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res 9 (1981), 133-148). Another exemplary folding algorithm is RNAfold, an online web server developed by the Institute of Theoretical Chemistry at the University of Vienna, which uses a centroid structure prediction algorithm (see, for example, A.R. Gruber et al., 2008, Cell 106(1):23-24 and P.A. Carr and G.M. Church, 2009, Nature Biotechnology 27(12):1151-62).
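As a rough illustration of the self-complementarity criterion above, the following Python sketch estimates the fraction of guide nucleotides that could be involved in intramolecular base pairing. It is only a sketch under stated assumptions: it uses the Nussinov maximum-base-pairing recursion rather than the minimum free energy calculation performed by mFold or RNAfold, it allows G-U wobble pairs, and the function name `paired_fraction` and the minimum hairpin loop of 3 nucleotides are illustrative choices that do not come from this application.

```python
def paired_fraction(rna: str, min_loop: int = 3) -> float:
    """Fraction of nucleotides paired in a maximum-base-pair (Nussinov) fold.

    Assumption: this counts base pairs only; it is NOT a free-energy model
    such as the ones used by mFold or RNAfold.
    """
    rna = rna.upper().replace("T", "U")
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(rna)
    if n == 0:
        return 0.0
    # best[i][j] = maximum number of base pairs within rna[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case 1: position i stays unpaired
            # case 2: position i pairs with some k, leaving >= min_loop bases between
            for k in range(i + min_loop + 1, j + 1):
                if (rna[i], rna[k]) in pairs:
                    inner = best[i + 1][k - 1]
                    outer = best[k + 1][j] if k + 1 <= j else 0
                    score = max(score, 1 + inner + outer)
            best[i][j] = score
    return 2 * best[0][n - 1] / n


if __name__ == "__main__":
    # Toy hairpin: four G-C pairs closing an AAAA loop.
    print(round(paired_fraction("GGGGAAAACCCC"), 3))
```

Because maximum pairing ignores stacking and loop energies, the value is best read as a conservative screen; a candidate that passes a threshold here would still be checked with a thermodynamic folding tool.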

[0101] As described above, the CRISPR / Cas9 system utilizes gRNA to provide targeting for CRISPR / Cas9-based systems. The sgRNA is a fusion of two non-coding RNAs: crRNA and tracrRNA. By exchanging the sequence encoding a 20 bp protospacer, the sgRNA can target any desired DNA sequence; the protospacer confers targeting specificity through complementary base pairing with the desired DNA target. The gRNA mimics the naturally occurring crRNA:tracrRNA duplex involved in type II effector systems. This duplex (which may include, for example, a 42-nucleotide crRNA and a 75-nucleotide tracrRNA) acts as a guide for Cas9 to cleave target nucleic acids.

[0102] In another aspect, the sgRNA provided in this application may further include a scaffold sequence having: the nucleotide sequence shown in SEQ ID NO:80; or a nucleotide sequence having at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.9%, or 100% sequence identity with the nucleotide sequence shown in SEQ ID NO:80 and retaining its biological activity; or a nucleotide sequence obtained by modifying the nucleotide sequence shown in SEQ ID NO:80 and retaining its biological activity. For example, the modification may be one or more of base phosphorylation, base sulfidation, base methylation, base hydroxylation, sequence shortening, and sequence lengthening. Further, the sequence shortening and the sequence lengthening may include the deletion or addition of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases relative to the base sequence.

[0103] As an example only, the scaffold sequence can be:

[0104] In some embodiments, the sgRNA provided in this application may further include a CRISPR spacer sequence at the 5' end of the scaffold sequence. The CRISPR spacer sequence is a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length that is capable of complementary pairing with the target sequence. In some preferred embodiments, the CRISPR spacer sequence is a sequence of 20 or 21 nucleotides in length that is capable of complementary pairing with the target sequence. In some embodiments, the sgRNA may further include a terminator at the 3' end of the spacer sequence. For example, the terminator may be a run of consecutive uridines (U), such as at least 6 (e.g., 7 or 8) U.
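The sgRNA layout described above (a spacer at the 5' end of the scaffold, optionally followed by a poly-U terminator) can be sketched in Python as follows. This is a minimal sketch, not the applicant's design procedure. The function name `assemble_sgrna` is hypothetical, the example scaffold string is an arbitrary stand-in rather than SEQ ID NO:80, and two points are assumptions made for illustration: the terminator is appended at the 3' end of the whole transcript, and fewer than 6 uridines is rejected.

```python
def assemble_sgrna(spacer: str, scaffold: str, n_terminal_u: int = 6) -> str:
    """Concatenate spacer + scaffold + poly-U terminator (5' to 3')."""
    spacer = spacer.upper().replace("T", "U")
    scaffold = scaffold.upper().replace("T", "U")
    if set(spacer + scaffold) - set("ACGU"):
        raise ValueError("sequences may contain only A, C, G, U (or T)")
    # Spacer length limits follow the 20 to 30 nt range stated in the text.
    if not 20 <= len(spacer) <= 30:
        raise ValueError("spacer must be 20 to 30 nucleotides long")
    # Assumption: treat 'at least 6 U' as a hard lower bound.
    if n_terminal_u < 6:
        raise ValueError("terminator must contain at least 6 U")
    return spacer + scaffold + "U" * n_terminal_u


if __name__ == "__main__":
    # Hypothetical inputs: an arbitrary 20-nt spacer and a stand-in scaffold.
    example_spacer = "GACCTTCGTCTGCGAGGCGA"
    stand_in_scaffold = "GUUUUAGAGCUAGAAAUAGC"
    print(assemble_sgrna(example_spacer, stand_in_scaffold, n_terminal_u=7))
```

Keeping the scaffold as a parameter means the same helper works for any scaffold variant that satisfies the identity thresholds given earlier.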

[0105] In this application, the terms "target DNA" and "target sequence" or "protospacer sequence" are used interchangeably, generally referring to a nucleotide sequence present in a target nucleic acid containing a nucleobase sequence complementary to an oligonucleotide (e.g., guide RNA) of this application. In some cases, the target sequence consists of a region on the target nucleic acid complementary to a consecutive nucleotide sequence of an oligonucleotide of this application. In some cases, the target sequence is longer than the complementary sequence of a single oligonucleotide and may, for example, represent a preferred region of the target nucleic acid that can be targeted by several oligonucleotides of this application. In some cases, "target sequence" may mean a portion of a target gene, such as one or more exon sequences, intron sequences, or regulatory sequences of a target gene, or a combination of exon and intron sequences, intron and regulatory sequences, exon and regulatory sequences, or exon, intron, and regulatory sequences. In the context of the formation of a CRISPR complex or system of this application, "target DNA" refers to a sequence to which the guide RNA sequence is designed to be complementary, wherein hybridization between the target DNA and the guide RNA sequence facilitates the formation of a CRISPR complex or system. In some embodiments, the target DNA is located in the cell nucleus or cytoplasm. CRISPR / Cas9-based systems may include at least one gRNA, wherein each gRNA targets a different DNA sequence. The target DNA sequences may overlap. The target sequence or protospacer sequence is followed by a PAM sequence located at the 3' end of the protospacer sequence. Different type II systems have different PAM requirements. For example, the type II Streptococcus pyogenes system uses the “NGG” sequence, where “N” can be any nucleotide.
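To illustrate the PAM rule described above, the following Python sketch lists every candidate target (protospacer) on the given strand that is immediately followed at its 3' end by an NGG PAM. It is a simplified sketch: the function name `find_ngg_targets` and the default protospacer length of 20 are illustrative assumptions, only the supplied strand is scanned (the reverse complement would need a second pass), and no specificity or off-target scoring is attempted.

```python
import re


def find_ngg_targets(dna: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam) for each site followed by an NGG PAM.

    Positions are 0-based offsets into the supplied strand.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(
        rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))"
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    # Toy sequence, not an HBV sequence from this application.
    toy = "A" * 20 + "TGG" + "CC"
    for start, protospacer, pam in find_ngg_targets(toy):
        print(start, protospacer, pam)
```

Other type II systems would need only a different PAM pattern in place of `[ACGT]GG`.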

[0106] In some implementations, the number of gRNAs administered to the cells can be at least 1 gRNA, at least 2 different gRNAs, at least 3 different gRNAs, at least 4 different gRNAs, at least 5 different gRNAs, at least 6 different gRNAs, at least 7 different gRNAs, at least 8 different gRNAs, at least 9 different gRNAs, at least 10 different gRNAs, at least 11 different gRNAs, at least 12 different gRNAs, at least 13 different gRNAs, at least 14 different gRNAs, at least 15 different gRNAs, at least 16 different gRNAs, at least 17 different gRNAs, at least 18 different gRNAs, at least 19 different gRNAs, at least 20 different gRNAs, at least 25 different gRNAs, at least 30 different gRNAs, at least 35 different gRNAs, at least 40 different gRNAs, at least 45 different gRNAs, or at least 50 different gRNAs.

[0107] In some implementations, the number of gRNAs administered to the cells can be at least 1 gRNA to at least 50 different gRNAs, at least 1 gRNA to at least 45 different gRNAs, at least 1 gRNA to at least 40 different gRNAs, at least 1 gRNA to at least 35 different gRNAs, at least 1 gRNA to at least 30 different gRNAs, at least 1 gRNA to at least 25 different gRNAs, at least 1 gRNA to at least 20 different gRNAs, at least 1 gRNA to at least 16 different gRNAs, at least 1 gRNA to at least 12 different gRNAs, at least 1 gRNA to at least 8 different gRNAs, at least 1 gRNA to at least 4 different gRNAs, at least 4 different gRNAs to at least 50 different gRNAs, at least 4 different gRNAs to at least 45 different gRNAs, at least 4 different gRNAs to at least 40 different gRNAs, at least 4 different gRNAs to at least 35 different gRNAs, at least 4 different gRNAs to at least 30 different gRNAs, at least 4 different gRNAs to at least 25 different gRNAs, at least 4 different gRNAs to at least 20 different gRNAs, at least 4 different gRNAs to at least 16 different gRNAs, at least 4 different gRNAs to at least 12 different gRNAs, at least 4 different gRNAs to at least 8 different gRNAs, at least 8 different gRNAs to at least 50 different gRNAs, at least 8 different gRNAs to at least 45 different gRNAs, at least 8 different gRNAs to at least 40 different gRNAs, at least 8 different gRNAs to at least 35 different gRNAs, at least 8 different gRNAs to at least 30 different gRNAs, at least 8 different gRNAs to at least 25 different gRNAs, at least 8 different gRNAs to at least 20 different gRNAs, at least 8 different gRNAs to at least 16 different gRNAs, or at least 8 different gRNAs to at least 12 different gRNAs. In some embodiments, gRNAs are selected to increase or decrease the transcription of target genes.

[0108] As used herein, the term "regulatory element" refers to a genetic element that controls the expression of a nucleic acid sequence. Examples include splicing signals, promoter sequences, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites ("IRES"), enhancers, etc., which together enable the replication, transcription, and translation of coding sequences in recipient cells. Not all of these control sequences are required. Transcriptional control signals in eukaryotes typically consist of "promoter" and "enhancer" elements. Promoters and enhancers are short arrays of DNA sequences; promoters are regulatory elements that promote the initiation of transcription of operably linked coding regions, while enhancers are regulatory elements that increase the rate of transcription by increasing the activity of the nearest promoter on the same DNA molecule. These sequences specifically interact with cellular proteins involved in transcription (Maniatis et al., Science 236:1237 (1987), incorporated herein by reference in its entirety). Promoter and enhancer elements have been isolated from a wide range of eukaryotic sources, including genes in yeast, insect and mammalian cells, and viruses (similar control sequences, i.e., promoters, have also been found in prokaryotes). The choice of specific promoters and enhancers depends on the recipient cell type. Some eukaryotic promoters and enhancers have a broad host range, while others function within a limited subgroup of cell types (for reviews, see, e.g., Voss et al., Trends Biochem. Sci., 11:287 (1986); and Maniatis et al. (ibid.), incorporated herein by reference in their entirety). For example, the SV40 early gene enhancer is highly active in a wide range of cell types from many mammalian species and has been used to express proteins in a variety of mammalian cells (Dijkema et al., EMBO J. 4:761 (1985), incorporated herein by reference in its entirety). Promoter and enhancer elements derived from the human elongation factor 1-α gene (Uetsuki et al., J. Biol. Chem., 264:5791 (1989); Kim et al., Gene 91:217 (1990); and Mizushima and Nagata, Nucl. Acids. Res., 18:5322 (1990)), the long terminal repeat of Rous sarcoma virus (Gorman et al., Proc. Natl. Acad. Sci. USA 79:6777 (1982)), and human cytomegalovirus (Boshart et al., Cell 41:521 (1985)) can also be used to express proteins in different mammalian cell types; the aforementioned references are incorporated herein by reference in their entirety. Promoters and enhancers can exist naturally, alone or together. For example, the long terminal repeats of retroviruses contain both promoter and enhancer elements. Generally, the functions of promoters and enhancers are independent of the gene being transcribed or translated. Therefore, the enhancers and promoters used can be “endogenous,” “exogenous,” or “heterologous” relative to the gene to which they are operably linked. An “endogenous” enhancer / promoter is one that is naturally linked to a given gene in the genome. An “exogenous” or “heterologous” enhancer or promoter is one that is juxtaposed with a gene through genetic manipulation (i.e., molecular biology techniques), such that transcription of that gene is directed by the linked enhancer / promoter. The presence of a “splicing signal” on the expression vector typically leads to high levels of expression of the recombinant transcript.

[0109] In some implementations, "splicing signals" mediate the removal of introns from primary RNA transcripts and consist of a splice donor site and a splice acceptor site (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989), pp. 16.7-16.8, incorporated herein by reference in its entirety). A commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.

[0110] In some implementations, the “transcription termination signal” is typically located downstream of the polyadenylation signal and is several hundred nucleotides in length. For example, the term “poly A signal” or “poly A sequence” refers to the DNA sequence that directs the termination and polyadenylation of nascent RNA transcripts. Efficient polyadenylation of recombinant transcripts is often necessary because transcripts lacking a poly A signal are unstable and rapidly degraded. The poly A signal used in expression vectors can be “heterologous” or “endogenous.” An endogenous poly A signal is a signal naturally present at the 3’ end of the coding region of a given gene in the genome. A heterologous poly A signal is a signal isolated from one gene and operatively linked to the 3’ end of another gene. A commonly used heterologous poly A signal is the SV40 poly A signal. The SV40 poly A signal is contained on a 237 bp BamHI / BclI restriction fragment and directs termination and polyadenylation (Sambrook et al., ibid., 16.6–16.7, incorporated herein by reference in its entirety).

[0111] In this application, the term “inactivated Cas9 protein” may be referred to as “dCas9” protein. Known methods for generating Cas9 proteins (or fragments thereof) with inactivated DNA-cleavage domains are described in, for example, Jinek et al., Science 337: 816-821 (2012) and Qi et al., “Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression”, Cell 152(5): 1173-83 (2013), the entire contents of which are incorporated herein by reference. For example, the DNA-cleavage domain of Cas9 is known to comprise two subdomains, the HNH nuclease subdomain and the RuvC1 subdomain. The HNH subdomain cleaves the strand complementary to the gRNA, while the RuvC1 subdomain cleaves the non-complementary strand. Mutations in these subdomains can abolish the nuclease activity of Cas9. For example, the D10A and H840A mutations completely inactivate the nuclease activity of *Streptococcus pyogenes* Cas9 (Jinek et al., Science 337: 816-821 (2012); Qi et al., Cell 152(5): 1173-83 (2013)). Suitable nuclease-inactive or nickase CRISPR DNA-binding domains include, but are not limited to, nuclease-inactive variant Cas9 domains, including the D10A, D10A / D839A / H840A, and D10A / D839A / H840A / N863A mutant domains, as described in WO2015089406A1, which is incorporated herein by reference. In some cases, nuclease-deficient dCas9 from *Streptococcus pyogenes* has been targeted by gRNAs to genes in bacteria, yeast, and human cells to silence gene expression through steric hindrance. As used herein, “dCas” may refer to the dCas protein or a fragment thereof. As used herein, “dCas9” may refer to the dCas9 protein or a fragment thereof. As used herein, the terms “iCas” and “dCas” are used interchangeably to refer to a catalytically inactive CRISPR-associated protein. In one embodiment, the dCas protein contains one or more mutations in its DNA cleavage domain. In one embodiment, the dCas protein contains one or more mutations in its RuvC or HNH domain. In one embodiment, the dCas molecule contains one or more mutations in both its RuvC and HNH domains. In one embodiment, the dCas protein is a fragment of the wild-type Cas protein. In one embodiment, the dCas protein contains a functional domain derived from the wild-type Cas protein, wherein the functional domain is selected from the REC1 domain, the bridge helix domain, or the PAM-interacting domain. In one embodiment, the nuclease activity of dCas is reduced by at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% compared to the nuclease activity of the corresponding wild-type Cas protein.

[0112] Suitable dCas can be derived from wild-type Cas proteins. Cas proteins can be derived from type I, type II, or type III CRISPR-Cas systems. In one embodiment, suitable dCas can be derived from Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, or Cas10. In one embodiment, dCas is derived from the Cas9 protein. For example, dCas9 can be obtained by introducing point mutations (e.g., substitution, deletion, or addition) into the DNA cleavage domains (e.g., nuclease domains, such as the RuvC and / or HNH domains) of the Cas9 protein. See, for example, Jinek et al., Science (2012) 337:816-21, which is incorporated herein by reference in its entirety. For example, introducing two point mutations into the RuvC and HNH domains reduces Cas9 nuclease activity while preserving the sgRNA-binding and DNA-binding activity of Cas9. In one embodiment, the two point mutations within the RuvC and HNH active sites are the D10A and H840A mutations of Streptococcus pyogenes Cas9. Alternatively, the D10 and H840 residues of Streptococcus pyogenes Cas9 can be deleted to eliminate Cas9 nuclease activity while preserving its sgRNA-binding and DNA-binding activity. In one embodiment, the two point mutations within the RuvC and HNH active sites are the D10A and N580A mutations of Staphylococcus aureus Cas9.
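The substitution notation used above (for example D10A: aspartate at position 10 replaced by alanine) can be applied to a protein sequence programmatically, as in the following Python sketch. The helper name `apply_substitutions` is hypothetical and the demonstration sequence is a toy string, not a Cas9 sequence or any SEQ ID NO of this application; the check that the stated wild-type residue is actually present at the stated position is an added safeguard, since numbering differs between Cas9 orthologs.

```python
import re


def apply_substitutions(protein: str, substitutions: list[str]) -> str:
    """Apply substitutions written as <wild-type><1-based position><new residue>."""
    residues = list(protein.upper())
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub.upper())
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{sub}: position outside the sequence")
        # Guard against applying one ortholog's numbering to another protein.
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    # Toy sequence with D at position 10 (illustrative only).
    toy = "M" + "G" * 8 + "D" + "K" * 5
    print(apply_substitutions(toy, ["D10A"]))
```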

[0113] In various embodiments, this application relates to the dCas protein or any variant or mutant thereof. All variants and mutants of dCas9 can be used in the methods, compositions, fusion molecules, or kits disclosed herein, including but not limited to those derived from SpCas9 (Cas9 isolated from Streptococcus pyogenes), SaCas9 (Cas9 isolated from Staphylococcus aureus), StCas9 (Cas9 isolated from Streptococcus thermophilus), NmCas9 (Cas9 isolated from Neisseria meningitidis), FnCas9 (Cas9 isolated from Francisella novicida), CjCas9 (Cas9 isolated from Campylobacter jejuni), ScCas9 (Cas9 isolated from Streptococcus canis), and any variants and mutant forms of the Cas9 proteins listed above, such as high-fidelity Cas9 (Kleinstiver et al., Nature. January 28, 2016) and enhanced SpCas9 (Slaymaker et al., Science. January 1, 2016). For example, the dCas9 sequences shown in SEQ ID NOs:40-57 of this application provide only a few exemplary options and are not exclusive. In one embodiment, the dCas protein is a Streptococcus pyogenes dCas9 protein containing mutations at D10 and / or H840 (as shown in SEQ ID NO:40). In one embodiment, the dCas protein is a Streptococcus pyogenes dCas9 protein containing the D10A and / or H840A mutations (as shown in SEQ ID NO:40). In one embodiment, the dCas9 protein is the Staphylococcus aureus dCas9 protein, comprising the amino acid sequence shown in any one of SEQ ID NO:41-43, a sequence substantially identical to any one of SEQ ID NO:41-43 (e.g., at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher sequence identity), or a sequence having one, two, three, four, five or more alterations (e.g., amino acid substitutions, insertions or deletions) relative to any one of SEQ ID NO:41-43, or any fragment thereof.

[0114] Similar mutations can also be applied to any other naturally occurring Cas9 (e.g., Cas9 from other species) or engineered Cas9. In some embodiments, dCas9 includes *Streptococcus pyogenes* dCas9, *Staphylococcus aureus* dCas9, *Campylobacter jejuni* dCas9, *Corynebacterium diphtheriae* dCas9, *Eubacterium ventriosum* dCas9, *Streptococcus pasteurianus* dCas9, *Lactobacillus farciminis* dCas9, *Sphaerochaeta globus* dCas9, *Azospirillum* (e.g., strain B510) dCas9, *Gluconacetobacter diazotrophicus* dCas9, *Neisseria cinerea* dCas9, *Roseburia intestinalis* dCas9, *Parvibaculum lavamentivorans* dCas9, *Nitratifractor salsuginis* (e.g., strain DSM 16511) dCas9, *Campylobacter lari* (e.g., strain CF89-12) dCas9, *Streptococcus thermophilus* (e.g., strain LMD-9) dCas9, or fragments thereof. In some embodiments, this application also provides a vector comprising nucleotides encoding the following protein molecules: Streptococcus pyogenes dCas9, Staphylococcus aureus dCas9, Campylobacter jejuni dCas9, Corynebacterium diphtheriae dCas9, Eubacterium ventriosum dCas9, Streptococcus pasteurianus dCas9, Lactobacillus farciminis dCas9, Sphaerochaeta globus dCas9, Azospirillum (strain B510) dCas9, Gluconacetobacter diazotrophicus dCas9, Neisseria cinerea dCas9, Roseburia intestinalis dCas9, Parvibaculum lavamentivorans dCas9, Nitratifractor salsuginis (strain DSM 16511) dCas9, Campylobacter lari (strain CF89-12) dCas9, Streptococcus thermophilus (strain LMD-9) dCas9, or fragments thereof.

[0115] In this application, the term "nucleotide modification" may refer to the synthesis or modification of the nucleic acids described in this application by methods well-established in the art, such as those described in "Current protocols in nucleic acid chemistry," Beaucage, S.L. et al. (Eds.), John Wiley & Sons, Inc., New York, NY, USA (which is incorporated herein by reference). Such modifications may include, but are not limited to: terminal modifications, such as 5'-terminal modifications (e.g., phosphorylation, conjugation, inverted linkage) or 3'-terminal modifications (e.g., conjugation, DNA nucleotides, inverted linkage, etc.); base modifications, such as substitution with a stabilizing base, a destabilizing base, or a base that pairs with an expanded repertoire of partners, base removal (abasic nucleotides), or conjugated bases; sugar modifications (e.g., sugar modification at the 2' or 4' position) or sugar substitution; or backbone modifications, including modification or substitution of phosphodiester bonds.

[0116] In this application, the terms "DNA methylation" and "nucleic acid methylation" are used interchangeably. They generally refer to the methylation state of gene fragments, nucleotides, or their bases in this application, often occurring within cells transfected with nucleic acids containing a structural gene encoding a polypeptide operably linked to a promoter, during which cytosine in the promoter nucleic acid is converted to 5-methylcytosine. Promoter nucleic acids in which at least one cytosine is converted to 5-methylcytosine are referred to as "methylated" nucleic acids or DNA. The DNA fragment containing the gene in this application may have methylation on one or more strands, and may also have methylation at one or more sites.

[0117] In this application, the term "part thereof" generally refers to a portion or fragment of a specified whole. For example, when used in this application relative to a specified polypeptide sequence, the term "part thereof" refers to a continuous length of the specified polypeptide sequence shorter than the full-length sequence of the specified polypeptide. A portion of a specified polypeptide can be defined by its first position and its last position, wherein the first and last positions each correspond to positions in the sequence of the specified polypeptide, wherein the sequence position corresponding to the first position is located at the N-terminus of the sequence position corresponding to the last position, and thus the sequence of that portion is a continuous amino acid sequence in the specified polypeptide that begins at the sequence position corresponding to the first position and ends at the sequence position corresponding to the last position. A portion can also be defined by reference to a position in the sequence of the specified polypeptide and the residue length relative to the reference position, thereby the sequence of that portion is a continuous amino acid sequence in the specified polypeptide that has a defined length and is located in the specified polypeptide according to the defined position.
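The two ways of defining a portion described above (by first and last position, or by a reference position plus a residue length) can be sketched in Python as follows. The function names and the toy peptide are illustrative assumptions; positions are 1-based and counted from the N-terminus, and the reference position is taken here to be the first residue of the portion, which is one of several possible readings.

```python
def portion_by_positions(polypeptide: str, first: int, last: int) -> str:
    """Contiguous portion from 1-based position `first` to `last`, inclusive."""
    if not 1 <= first <= last <= len(polypeptide):
        raise ValueError("positions must satisfy 1 <= first <= last <= length")
    return polypeptide[first - 1:last]


def portion_by_length(polypeptide: str, start: int, length: int) -> str:
    """Contiguous portion of `length` residues beginning at 1-based `start`."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return portion_by_positions(polypeptide, start, start + length - 1)


if __name__ == "__main__":
    toy_peptide = "MKTAYIAKQR"  # illustrative only
    print(portion_by_positions(toy_peptide, 3, 6))
    print(portion_by_length(toy_peptide, 3, 4))
```

Both calls describe the same four-residue portion, which shows that the two definitions are interchangeable once the reference convention is fixed.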

[0118] In this application, the term "direct or indirect fusion" generally refers to the relative terms "direct fusion" or "indirect fusion." The term "direct fusion" generally refers to direct linking or direct binding. For example, direct linking can refer to the direct connection of linked substances (e.g., amino acid sequence segments) without any spacer (e.g., amino acid residues or their derivatives); for example, amino acid sequence segment X and another amino acid sequence segment Y are directly linked through an amide bond formed by the C-terminal amino acid of amino acid sequence segment X and the N-terminal amino acid of amino acid sequence segment Y. "Indirect fusion" generally refers to the indirect connection of linked substances (e.g., amino acid sequence segments) with a spacer (e.g., amino acid residues or their derivatives).

[0119] In this application, the carrier used to package the composition, fusion molecule and / or guide molecule (sgRNA) described in this application may contain lipid particles, for example, lipid nanoparticles (LNPs) and liposomes.

[0120] For example, as used herein, the terms "lipid nanoparticle (LNP)" or "one LNP" or "multiple LNPs" generally refer to particles containing multiple (i.e., more than one) lipid molecules physically bound together by intermolecular forces (e.g., covalent or non-covalent). LNPs can be, for example, microspheres (including unilamellar and multilamellar vesicles, such as liposomes), dispersed phases in emulsions, micelles, or internal phases in suspensions. LNPs can encapsulate nucleic acids within cationic lipid particles (e.g., liposomes) and can be delivered to cells relatively easily. In some instances, lipid nanoparticles do not contain any viral components, which helps minimize safety and immunogenicity issues. The lipid particles can be used for in vitro, ex vivo, and in vivo delivery. The lipid particles can also be used for cell populations of various sizes. The LNPs of this application can be readily prepared by various methods known in the art, such as by mixing an organic phase with an aqueous phase. Mixing of the two phases can be achieved using microfluidic devices and impinging jet reactors. The more thoroughly the organic and aqueous phases are mixed, the better the encapsulation efficiency and particle size distribution of the obtained LNPs. Preferably, the particle size of the LNP can be adjusted by changing the mixing rate between the organic and aqueous phases. The faster the mixing rate, the smaller the particle size of the prepared LNP. The encapsulation efficiency can be optimized by adjusting the N / P ratio (the molar ratio of ionizable lipid nitrogen to nucleic acid phosphate) of the LNP system. In some preferred embodiments, the N / P ratio is 1:1 to 9:1. In some instances, LNPs can be used to deliver DNA molecules (e.g., molecules containing coding sequences of DNA-binding proteins and / or sgRNA) and / or RNA molecules (e.g., mRNA encoding Cas, or sgRNA). In some cases, LNPs can be used to deliver Cas / gRNA RNP complexes. In some embodiments, LNPs are used to deliver mRNA and gRNA (e.g., mRNA encoding a fusion molecule comprising DNMT3A-DNMT3L(3A-3L)-dCas9-KRAB or DNMT3A-DNMT3L-ZIM3 KRAB-dCas9, and at least one sgRNA targeting the HBV gene).
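As a worked illustration of the N / P ratio mentioned in this paragraph, the following Python sketch computes the ratio from formulation masses. It is a back-of-the-envelope sketch under stated assumptions: N is counted as moles of ionizable amine nitrogen and P as moles of nucleotide phosphate (one per nucleotide), the function names are hypothetical, and every numeric input in the example (lipid molecular weight, average nucleotide molecular weight, masses) is an illustrative value rather than a value from this application.

```python
def np_ratio(lipid_mass_mg: float, lipid_mw: float,
             rna_mass_mg: float, nucleotide_mw: float,
             amines_per_lipid: int = 1) -> float:
    """N / P ratio = mol ionizable amine nitrogen / mol nucleic acid phosphate.

    Masses in mg and molecular weights in g/mol give amounts in mmol.
    """
    nitrogen_mmol = lipid_mass_mg / lipid_mw * amines_per_lipid
    phosphate_mmol = rna_mass_mg / nucleotide_mw  # one phosphate per nucleotide
    return nitrogen_mmol / phosphate_mmol


def in_preferred_range(ratio: float, low: float = 1.0, high: float = 9.0) -> bool:
    """True if the ratio lies in the 1:1 to 9:1 range given in the text."""
    return low <= ratio <= high


if __name__ == "__main__":
    # Hypothetical inputs chosen only to make the arithmetic easy to follow.
    ratio = np_ratio(lipid_mass_mg=3.6, lipid_mw=600.0,
                     rna_mass_mg=0.33, nucleotide_mw=330.0)
    print(round(ratio, 2), in_preferred_range(ratio))
```

In practice the average nucleotide molecular weight depends on the sequence and any chemical modifications, so it is kept as an explicit parameter rather than a built-in constant.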

[0121] The components of the LNP may include cationic lipids such as 1,2-dilinoleoyl-3-dimethylammonium-propane (DLinDAP), 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (DLinDMA), 1,2-dilinoleyloxy-keto-N,N-dimethyl-3-aminopropane (DLinK-DMA), and 1,2-dilinoleyl-4-(2-dimethylaminoethyl)-. In some embodiments, the LNP may include ionizable lipids. In some embodiments, ionizable lipids include, but are not limited to, pH-responsive ionizable lipids, thermoresponsive ionizable lipids, and light-responsive ionizable lipids. In some embodiments, ionizable lipids include cationic and anionic lipids that ionize under certain conditions (such as, but not limited to, pH, temperature, or light). In some embodiments, the molar ratio of ionizable lipids in the LNP is from about 20% to about 70% (e.g., from about 20% to about 70%, from about 20% to about 65%, from about 20% to about 60%, from about 20% to about 55%, from about 20% to about 50%, from about 20% to about 45%, from about 20% to about 40%, from about 20% to about 35%, from about 20% to about 30%, from about 20% to about 25%, from about 30% to about 70%, from about 30% to about 65%, from about 30% to about 60%, from about 30% to about 55%, from about 30% to about 50%, from about 30% to about 45%, from about 30% to about 40%, from about 30% to about 35%, from about 40% to about 70%, from about 40% to about 65%, from about 40% to about 60%, from about 40% to about 55%, from about 40% to about 50%, from about 40% to about 45%, from about 50% to about 70%, from about 50% to about 65%, from about 50% to about 60%, from about 50% to about 55%, from about 60% to about 70%, or from about 60% to about 65%). In some embodiments, the LNP may comprise polyethylene glycol-modified lipids. In some embodiments, the molar ratio of polyethylene glycol-modified lipids in the LNP is from about 0% to about 30% (e.g., from about 0% to about 30%, from about 0% to about 25%, from about 0% to about 20%, from about 0% to about 15%, from about 0% to about 10%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 10% to about 15%, from about 20% to about 30%, or from about 20% to about 25%). In some embodiments, the LNP may contain supporting lipids. In some embodiments, the molar ratio of supporting lipids in the LNP is from about 30% to about 50% (e.g., from about 30% to about 50%, from about 30% to about 45%, from about 30% to about 40%, from about 30% to about 35%, from about 40% to about 50%, or from about 40% to about 45%). In some embodiments, the LNP may contain cholesterol. In some embodiments, the molar ratio of cholesterol in the LNP is from about 10% to about 50% (e.g., from about 10% to about 50%, from about 10% to about 45%, from about 10% to about 40%, from about 10% to about 35%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 10% to about 15%, from about 20% to about 50%, from about 20% to about 45%, from about 20% to about 40%, from about 20% to about 35%, from about 20% to about 30%, from about 20% to about 25%, from about 30% to about 50%, from about 30% to about 45%, from about 30% to about 40%, from about 30% to about 35%, from about 40% to about 50%, or from about 40% to about 45%). In some embodiments, the LNP may comprise a mixture of ionizable lipids (20%-70% molar ratio), polyethylene glycol-modified lipids (0%-30% molar ratio), supporting lipids (30%-50% molar ratio), and cholesterol (10%-50% molar ratio).
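As an illustration of the four-component mixture described in the preceding paragraph, the following minimal Python sketch checks whether a candidate recipe falls within the stated molar-ratio windows. The function name `check_lnp_composition`, the component keys, and the example recipe are hypothetical and are not taken from this application; the requirement that the four percentages sum to 100% is likewise an assumption made only for this sketch.

```python
# Molar-percentage windows stated in the description, as (min, max).
RANGES = {
    "ionizable_lipid": (20.0, 70.0),
    "peg_lipid": (0.0, 30.0),
    "supporting_lipid": (30.0, 50.0),
    "cholesterol": (10.0, 50.0),
}


def check_lnp_composition(mol_percent, tol=1e-6):
    """Return a list of problems; an empty list means the recipe fits."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = mol_percent.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif not low <= value <= high:
            problems.append(f"{name}: {value}% outside {low}-{high}%")
    # Assumption for this sketch: the four components make up the whole LNP.
    total = sum(mol_percent.values())
    if abs(total - 100.0) > tol:
        problems.append(f"total is {total}%, expected 100%")
    return problems


# Hypothetical recipe, chosen only so that every window is satisfied.
example = {
    "ionizable_lipid": 35.0,
    "peg_lipid": 2.0,
    "supporting_lipid": 30.0,
    "cholesterol": 33.0,
}
print(check_lnp_composition(example))
```

Because the lower bounds of the four windows add up to 60%, recipes satisfying all windows at once exist, but each component constrains the others; a check of this kind makes such conflicts visible before formulation.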

[0122] As used herein, the term "liposome" generally refers to a vesicle whose internal space is isolated from the external medium by one or more bilayer membranes. In some embodiments, the bilayer membrane can be formed from amphiphilic molecules, such as synthetic or naturally derived lipids comprising spatially separated hydrophilic and hydrophobic domains; in other embodiments, the bilayer membrane can be formed from amphiphilic polymers and surfactants. In some embodiments, the liposome is a spherical vesicle structure consisting of a single or multiple lipid bilayers surrounding an internal aqueous compartment and a relatively impermeable outer lipophilic phospholipid bilayer. In some embodiments, the liposome is biocompatible and non-toxic, can deliver both hydrophilic and lipophilic drug molecules, protects its cargo from degradation by plasma enzymes, and transports its load across biological membranes and the blood-brain barrier (BBB). Liposomes can be made from several different types of lipids, such as phospholipids. Liposomes may contain natural phospholipids and lipids such as 1,2-distearoyl-sn-glycero-3-phosphatidylcholine (DSPC), sphingomyelin, lecithin, monosialotetrahexosylganglioside, or any combination thereof. Several other additives may be added to liposomes to modify their structure and properties. For example, liposomes may also contain cholesterol, sphingomyelin, and/or 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), for example, to increase stability and/or prevent leakage of the liposomal cargo.

[0123] In this application, the term "adeno-associated virus (AAV) vector" generally refers to a vector having a functional or partially functional ITR sequence and a transgene. As used herein, the term "ITR" refers to an inverted terminal repeat sequence. ITR sequences may be derived from adeno-associated virus serotypes, including but not limited to AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, and AAV13, as well as any AAV variants or mixtures. However, the ITR need not be a wild-type nucleotide sequence and may be altered (e.g., by insertion, deletion, or substitution of nucleotides) as long as the sequence retains the function of providing functional rescue, replication, and packaging. AAV vectors may have one or more wholly or partially deleted AAV wild-type genes, preferably the rep and/or cap genes, but retain functional flanking ITR sequences. The functional ITR sequence serves, for example, to rescue, replicate, and package AAV virions or particles. Therefore, "AAV vector" is defined in this application as including at least those sequences required for inserting the transgene into the subject's cells. Optionally, it may include those cis sequences necessary for viral replication and packaging (e.g., functional ITR).

[0124] In this application, the term "pharmaceutically acceptable carrier" generally refers to a carrier for administering therapeutic agents, such as antibodies or peptides, genes, and other therapeutic agents. This term refers to any pharmaceutical carrier that does not itself induce antibody production harmful to the individual receiving the composition and can be administered without causing excessive toxicity. Suitable carriers can be large, slowly metabolized macromolecules, such as proteins, polysaccharides, polylactic acid, polyglycolic acid, polyamino acids, amino acid copolymers, lipid aggregates, and inactivated viral particles. These carriers are well known to those skilled in the art. Pharmaceutically acceptable carriers in therapeutic compositions may include liquids such as water, saline, glycerol, and ethanol. These carriers may also contain excipients such as wetting agents or emulsifiers, pH buffers, etc.

[0125] In this application, the terms "sequence encoding..." or "nucleic acid encoding..." generally refer to a nucleic acid (RNA or DNA molecule) containing a nucleotide sequence encoding a protein. The coding sequence may also include start and stop signals operatively linked to regulatory elements, comprising promoters and polyadenylation signals, capable of directing expression in the cells of an individual or mammal to which the nucleic acid has been administered. The coding sequence may be codon-optimized. In some embodiments, the coding nucleic acid may be mRNA; one or more modification techniques may be used to produce more stable mRNA. Known mRNA modification techniques can be broadly categorized into three types: synthesizing mRNA by replacing natural ribonucleosides with synthetic non-natural ribonucleosides; adding a 5' cap, a 3' poly(A) tail, and UTR (untranslated region) sequences; and employing specialized formulation techniques to protect the mRNA. Among these, preferred mRNA modification techniques involve synthesizing mRNA by replacing natural ribonucleosides with synthetic non-natural ribonucleosides. Chemical modifications on eukaryotic mRNA can be broadly categorized into three types: methylation, pseudouridine (Ψ), and hypoxanthine. For example, the chemical modification may be selected from: pseudouridine, N1-methylpseudouridine, N1-ethylpseudouridine, 2-thiouridine, 4'-thiouridine, 5-methylcytosine, 2-thio-1-methyl-1-deaza-pseudouridine, 2-thio-1-methylpseudouridine, 2-thio-5-aza-uridine, 2-thio-dihydropseudouridine, 2-thio-dihydrouridine, 2-thio-pseudouridine, 4-methoxy-2-thio-pseudouridine, 4-methoxy-pseudouridine, 4-thio-1-methylpseudouridine, 4-thio-pseudouridine, 5-aza-uridine, dihydropseudouridine, 5-methyluridine, 5-methoxyuridine, and 2'-O-methyluridine.

[0126] In this application, the term "complementarity" generally refers to Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of a nucleic acid molecule. "Complementarity" refers to a property shared between two nucleic acid sequences such that, when they are aligned antiparallel to each other, the nucleotide bases at each position are complementary.
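The antiparallel, position-by-position property in this definition can be sketched in a few lines of Python. The sketch below is illustrative only: the function name `is_complementary` and the example sequences are hypothetical, both strands are assumed to be written 5' to 3', and only Watson-Crick pairs (A-T/U and C-G) are modelled; Hoogsteen pairing and nucleotide analogs are not covered.

```python
# Watson-Crick pairs only; Hoogsteen pairing is deliberately not modelled.
PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def is_complementary(strand_a, strand_b):
    """True if strand_b, read antiparallel to strand_a, pairs at every position.

    Both strands are given 5'->3', so strand_b is reversed before comparison.
    """
    a = strand_a.upper()
    b = strand_b.upper()
    if len(a) != len(b):
        return False
    return all((x, y) in PAIRS for x, y in zip(a, reversed(b)))


# Hypothetical sequences, used only to exercise the check.
print(is_complementary("GATTACA", "TGTAATC"))  # DNA against DNA
print(is_complementary("GATTACA", "UGUAAUC"))  # DNA against RNA
print(is_complementary("GATTACA", "GATTACA"))  # identical, not complementary
```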

[0127] In this application, the terms “subject” and “patient” are used interchangeably and generally refer to humans and non-human animals. The term “non-human animal” as used in this application includes all vertebrates, such as mammals and non-mammals, including non-human primates, sheep, dogs, cats, horses, cattle, chickens, amphibians, reptiles, etc.

[0128] In this application, the terms "effective amount" and "therapeutically effective amount" or "therapeutically effective dose" are used interchangeably, generally referring to the amount or dose of a fusion molecule (protein), peptide, nucleic acid, lipid nanoparticle, liposome, one or more AAV particles, or one or more virions capable of producing a sufficient amount of the desired protein to modulate protein activity in the desired manner, thereby providing a palliative tool for clinical intervention. In some embodiments, a therapeutically effective amount or dose of the transfected fusion protein, peptide, nucleic acid, one or more AAV particles, or one or more virions as described herein is sufficient to confer inhibition of the gene targeted by the fusion protein/gene therapy construct.

[0129] As used herein, the term “treating” (e.g., a disease) means, in one embodiment, that a subject (e.g., a human) who has a disease, is at risk of developing a disease, and/or experiences symptoms of a disease will experience milder symptoms and/or recover more quickly when the fusion molecule described herein, or the nucleic acid encoding the fusion molecule, and/or the gRNA or the nucleic acid encoding the gRNA, is administered than when it is not administered.

[0130] The embodiments described below are not intended to be limited by any theory, but are merely for illustrating the compositions, methods of use, and applications of this application, and are not intended to limit the scope of the invention.

[0131] Examples

[0132] Example 1

[0133] The composition of this application results in a decrease in the level of HBV markers in HBV-infected primary human hepatocytes (PHH).

[0134] In this example, genotype D hepatitis B virus (GenBank: U95551, provided by WuXi AppTec) was used to infect primary human hepatocytes (PHH) (provided by WuXi AppTec). The experimental protocol is as follows (Figure 1): Two days after the PHH were infected with HBV, different versions of EPIREG mRNA (SEQ ID NO: 68-77) and gRNA SG35 (tool version numbers and tool and sgRNA sequences are shown in Table 1; the mass ratio of mRNA to sgRNA was 1:1) were delivered into the cells using LNP (LNP preparation reference: https://doi.org/10.1038/s41586-021-03534-y). The total amount of EPIREG mRNA and gRNA delivered in each group was 2.5 μg/mL. Culture supernatant was collected every two days after drug administration to detect the levels of HBV surface antigen (HBsAg) and core antigen (HBcAg) secreted by the cells. Cell samples were collected 14 days after drug administration, DNA and RNA were extracted from the cells, and the levels of HBV mRNA and changes in HBV DNA were measured. Myrcludex B (provided by WuXi AppTec, C9023GE070-1 / PE0723), which inhibits HBV infection of hepatocytes, was used as a positive control. The control (Ctrl) group served as a blank control after infection. The NT group received a non-targeting sgRNA that does not target the HBV genome.

[0135] The experimental results are shown in Figures 2A-2B. After different versions of EPIREG and SG35 were delivered to HBV-infected primary human hepatocytes, the levels of HBsAg and HBeAg/HBcAg in the supernatant were inhibited to varying degrees. EPIREG#1 showed an inhibition rate approaching 90%, while EPIREG#8 showed an inhibition rate exceeding 50%, and the inhibitory effect remained stable for at least 12 days. Total HBV RNA and pgRNA in the cells also decreased to varying degrees, with EPIREG#1 showing an inhibition rate approaching 80%. These results indicate that different versions of EPIREG can inhibit expression from episomal HBV cccDNA to varying degrees, with EPIREG#1 showing an inhibition efficiency of nearly 80% at the transcriptional level and nearly 90% at the protein expression level.
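The inhibition rates quoted above are percentages relative to a control. This application does not state the formula used; the following Python sketch assumes the common convention of one minus the ratio of the treated signal to the control signal (for example, the blank Ctrl or the NT group). The function name `inhibition_rate` and the numeric readings are hypothetical and are not data from this application.

```python
def inhibition_rate(treated, control):
    """Percent reduction of a marker relative to a control group.

    Assumed convention for this sketch: (1 - treated / control) * 100.
    """
    if control <= 0:
        raise ValueError("control signal must be positive")
    return (1.0 - treated / control) * 100.0


# Hypothetical antigen readings in arbitrary units, for illustration only.
control_signal = 200.0
treated_signal = 50.0
print(inhibition_rate(treated_signal, control_signal))
```

Under this convention, a treated signal equal to the control gives 0% inhibition, and a treated signal above the control gives a negative value, which is useful for spotting rebound at later time points.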

[0136] Table 1. List of EPIREG tools and gRNA information

[0137] Example 2

[0138] The composition of this application results in a decrease in the level of integrated HBV markers in the HepG2.2.15 cell line.

[0139] This example uses the HepG2.2.15 cell line, which carries an integrated genotype D HBV genome (GenBank: U95551). The experimental protocol is as follows: After cell plating, different versions of EPIREG mRNA (SEQ ID NO: 68-77) and gRNA SG35 (tool version numbers and tool and sgRNA sequences are shown in Table 1; the mass ratio of mRNA to sgRNA was 1:1) were delivered into HepG2.2.15 cells using LNP (LNP preparation reference: https://doi.org/10.1038/s41586-021-03534-y). The total amount of EPIREG mRNA and gRNA delivered in each group was 2.5 μg/mL. The culture supernatant was collected weekly after drug administration, and the levels of HBV surface antigen (HBsAg) and core antigen (HBcAg) secreted by the cells were measured. The control (Ctrl) group served as a blank control; the NT group received a non-targeting sgRNA that does not target the HBV genome.

[0140] The experimental results are shown in Figure 3. After different versions of EPIREG and SG35 were delivered to HepG2.2.15 cells, the levels of HBsAg and HBeAg/HBcAg in the supernatant were inhibited to varying degrees at different time points. The efficacy of each version of EPIREG was similar to that in the PHH experiment described in Example 1. EPIREG#1 showed an inhibition rate of nearly 80%, and EPIREG#8 showed an inhibition rate of nearly 70%, with the inhibitory effect lasting for at least 14 days. These results indicate that different versions of EPIREG can inhibit expression of the integrated HBV genome to varying degrees, with EPIREG#1 showing an inhibition efficiency of nearly 80% at the protein expression level.

[0141] Example 3

[0142] The knockdown effects of different EPIREGs on HBV markers in transgenic HBV mice

[0143] This example used transgenic HBV mice (purchased from Beijing Vitonda Biotechnology Co., Ltd., C57BL/6-HBV; subsequent housing and testing were performed by Beijing Vitonda). This mouse strain carries a 1.28-fold-length HBV genome (genotype A, GenBank: AF305422.1) integrated into its genome. Transgenic HBV mice were treated with different combinations of EPIREG and gRNA SG35 to verify the inhibitory effect of different EPIREGs on the expression of integrated HBV genes in an in vivo model.

[0144] The experimental protocol was as follows: EPIREG was delivered via tail vein injection. A combination of mRNA (SEQ ID NO: 68) encoding EPIREG#1 and sgRNA SG35 (tool version numbers and tool and sgRNA sequences are shown in Table 1; mass ratio of mRNA to sgRNA 1:1) was delivered to transgenic HBV mice using lipid nanoparticles (LNP) (LNP preparation reference: https://doi.org/10.1038/s41586-021-03534-y) at a dose of 5 mg/kg. The negative control group was injected with PBS (200 μL). Blood samples were collected periodically after administration to measure the levels of secreted HBV surface antigen (HBsAg) and core antigen (HBcAg/HBeAg) in serum, and HBV DNA levels were measured by qPCR. (Hepatitis B e antigen quantitative reagent kit: Maccura Biotechnology, catalog number IM4403003; Hepatitis B surface antigen quantitative reagent kit: Maccura Biotechnology, catalog number IM4403001; Hepatitis B virus nucleic acid quantitative reagent kit: Sansure Biotech, catalog number 2015340008; detection was performed according to the kit instructions.)
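The dosing arithmetic in this protocol can be sketched as follows. The sketch assumes that the 5 mg/kg dose refers to the total RNA mass (mRNA plus sgRNA) and uses a hypothetical body weight of 25 g; neither point is stated in this application, and the function name `split_rna_dose` is illustrative.

```python
def split_rna_dose(dose_mg_per_kg, body_weight_g, mrna_parts=1.0, sgrna_parts=1.0):
    """Split a per-animal RNA dose into mRNA and sgRNA masses (micrograms).

    mg/kg multiplied by body weight in grams gives micrograms directly,
    because 1 mg/kg equals 1 microgram per gram.
    """
    if body_weight_g <= 0 or mrna_parts < 0 or sgrna_parts < 0:
        raise ValueError("weights and ratio parts must be non-negative, weight > 0")
    total_parts = mrna_parts + sgrna_parts
    if total_parts == 0:
        raise ValueError("at least one ratio part must be positive")
    total_ug = dose_mg_per_kg * body_weight_g
    mrna_ug = total_ug * mrna_parts / total_parts
    sgrna_ug = total_ug * sgrna_parts / total_parts
    return total_ug, mrna_ug, sgrna_ug


# Assumed 25 g mouse; 5 mg/kg and the 1:1 mass ratio come from the protocol.
total_ug, mrna_ug, sgrna_ug = split_rna_dose(5.0, 25.0)
print(total_ug, mrna_ug, sgrna_ug)
```

Under these assumptions, a 25 g animal would receive 125 μg of RNA in total, split evenly between mRNA and sgRNA by mass.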

[0145] The results after administration are shown in Figure 4. After the combination of EPIREG#1 and SG35 was delivered to transgenic HBV mice, serum levels of HBsAg, HBeAg, and HBV DNA decreased rapidly, by 95%, 66%, and 98%, respectively, one week after administration. They then recovered slightly, and the inhibition efficiencies finally stabilized at 86%, 58%, and 66%, respectively. The experimental results indicate that EPIREG can inhibit the expression of the HBV genome in transgenic HBV mice.

Claims

1. A composition for regulating the expression of hepatitis B virus (HBV) genes in cells, the composition comprising a fusion molecule or a nucleic acid sequence encoding the fusion molecule, the fusion molecule comprising at least one DNA-binding domain, at least one epigenetic modification domain, and at least one transcriptional regulatory domain.

2. The composition according to claim 1, wherein the DNA-binding domain is selected from CRISPR enzymes, zinc finger nucleases (ZFN), transcription activator-like effector (TALE) domains, homing endonucleases, dCas9-FokI nucleases, Argonaute (Ago) nucleases, or MegaTal nucleases.

3. The composition according to claim 1 or 2, wherein the CRISPR enzyme is a class 2 Cas protein and/or a mutant thereof.

4. The composition according to any one of claims 1-3, wherein the CRISPR enzyme is one or more of the following Cas proteins: type II-A Cas protein, type II-B Cas protein, type II-C Cas protein, type V-A Cas protein, type V-B Cas protein, type V-C Cas protein, type V-U Cas protein, and mutants thereof.

5. The composition according to any one of claims 1-4, wherein the CRISPR enzyme is Cas9 protein and / or a mutant thereof.

6. The composition according to any one of claims 1-5, wherein the at least one DNA-binding domain is dCas9.

7. The composition according to claim 6, wherein the dCas9 comprises Staphylococcus aureus dCas9, Streptococcus pyogenes dCas9, Campylobacter jejuni dCas9, Corynebacterium diphtheriae dCas9, Eubacterium ventriosum dCas9, Streptococcus pasteurianus dCas9, Lactobacillus farciminis dCas9, Sphaerochaeta globus dCas9, Azospirillum (e.g., strain B510) dCas9, Gluconacetobacter diazotrophicus dCas9, Neisseria cinerea dCas9, Roseburia intestinalis dCas9, Parvibaculum lavamentivorans dCas9, Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9, Campylobacter lari (e.g., strain CF89-12) dCas9, or Streptococcus thermophilus (e.g., strain LMD-9) dCas9.

8. The composition according to claim 6 or 7, wherein the dCas9 comprises the amino acid sequence shown in SEQ ID NO: 40-57.

9. The composition according to any one of claims 1-8, wherein the composition further comprises at least one single guide RNA (sgRNA) or a nucleic acid encoding said sgRNA.

10. The composition according to claim 9, wherein the sgRNA is complementary to a target nucleotide sequence near the HBV gene and / or within the HBV gene regulatory element, or comprises a partial sequence complementary to the target nucleotide sequence for 15-20 consecutive base pairs.

11. The composition according to any one of claims 1-10, wherein the at least one epigenetic modification domain provides methylation modification of at least one nucleotide in the vicinity of the HBV gene and / or within the HBV gene regulatory element.

12. The composition according to any one of claims 1-11, wherein the at least one epigenetic modification domain comprises a DNA methyltransferase (DNMT) or a functionally active fragment thereof.

13. The composition according to any one of claims 1-12, wherein the epigenetic modification domain is selected from one or more of DNMT3A, DNMT3B, DNMT3C, DNMT1, DNMT2 and DNMT3L.

14. The composition according to any one of claims 1-13, wherein the epigenetic modification domain comprises at least one DNMT3A and at least one DNMT3L, which are connected by a linker sequence.

15. The composition according to any one of claims 1-14, wherein the epigenetic modification domain comprises DNMT3A and DNMT3L, and the C-terminus of DNMT3A is connected to the N-terminus of DNMT3L, or the C-terminus of DNMT3L is connected to the N-terminus of DNMT3A.

16. The composition according to any one of claims 12-15, wherein the DNA methyltransferase comprises the amino acid sequence shown in any one of SEQ ID NOs:4-9.

17. The composition according to any one of claims 1-16, wherein the transcriptional regulatory domain is a transcriptional repressor domain selected from: KRAB, ZIM3 KRAB, ZNF680, ZNF554, ZNF264, ZNF582, ZNF324, ZNF669, ZNF354A, ZNF82, ZNF595, ZNF419, ZNF566, ZIM2, EHMT2, SUV39H1, ZFPM1, TRIM28, EZH2, MXD1, SID, LSD1, HP1a, HDAC3, ZNF436, ZNF257, ZNF675, ZNF490, ZNF320, ZNF331, ZNF816, ZNF41, ZNF189, ZNF528, ZNF543, ZNF140, ZNF610, ZNF350, ZNF8, ZNF30, ZNF98, ZNF677, ZNF596, ZNF214, ZNF37A, ZNF34, ZNF250, ZNF547, ZNF273, ZFP82, ZNF224, ZNF33A, ZNF45, ZNF175, ZNF184, ZFP28-1, ZFP28-2, ZNF18, ZNF213, ZNF394, ZFP1, ZFP14, ZNF416, ZNF557, ZNF729, ZNF254, ZNF764, ZNF785, ZNF10, CBX5, RYBP, YAF2, MGA, CBX1, SCMH1, MPP8, SUMO3, HERC2, BIN1, PCGF2, TOX, FOXA1, FOXA2, IRF2BP1, IRF2BP2, IRF2BPL, IRF-2BP1_2 N-terminal domain, HOXA13, HOXB13, HOXC13, HOXA11, HOXC11, HOXC10, HOXA10, HOXB9, HOXA9, ZFP28, ZN334, ZN568, ZN37A, ZN181, ZN510, ZN862, ZN140, ZN208, ZN248, ZN571, ZN699, ZN726, ZIK1, ZNF2, Z705F, ZNF14, ZN471, ZN624, ZNF84, ZNF7, ZN891, ZN337, Z705G, ZN529, ZN729, ZN419, Z705A, ZN302, ZN486, ZN621, ZN688, ZN33A, ZN554, ZN878, ZN772, ZN224, ZN184, ZN544, ZNF57, ZN283, ZN549, ZN211, ZN615, ZN253, ZN226, ZN730, Z585A, ZN732, ZN681, ZN667, ZN649, ZN470, ZN484, ZN431, ZN382, ZN254, ZN124, ZN607, ZN317, ZN620, ZN141, ZN584, ZN540, ZN75D, ZN555, ZN658, ZN684, RBAK, ZN829, ZN582, ZN112, ZN716, HKR1, ZN350, ZN480, ZN416, ZNF92, ZN100, ZN736, ZNF74, ZN443, ZN195, ZN530, ZN782, ZN791, ZN331, Z354C, ZN157, ZN727, ZN550, ZN793, ZN235, ZN724, ZN573, ZN577, ZN789, ZN718, ZN300, ZN383, ZN429, ZN677, ZN850, ZN454, ZN257, ZN264, ZN485, ZN737, ZNF44, ZN596, ZN565, ZN543, ZFP69, SUMO1, ZNF12, ZN169, ZN433, ZN175, ZN347, ZNF25, ZN519, Z585B, ZN517, ZN846, ZN230, ZNF66, ZN713, ZN816, ZN426, ZN674, ZN627, ZNF20, Z587B, ZN316, ZN233, ZN611, ZN556, ZN234, ZN560, ZNF77, ZN682, ZN614, ZN785, ZN445, ZFP30, ZN225, ZN551, ZN610, ZN528, ZN284, ZN418, ZN490, ZN805, Z780B, ZN763, ZN285, ZNF85, ZN223, ZNF90, ZN557, ZN425, ZN229, ZN606, ZN155, ZN222, ZN442, ZNF91, ZN135, ZN778, ZN534, ZN586, ZN567, ZN440, ZN583, ZN441, ZNF43, ZN589, ZN563, ZN561, ZN136, ZN630, ZN527, ZN333, Z324B, ZN786, ZN709, ZN792, ZN599, ZN613, ZF69B, ZN799, ZN569, ZN564, ZN546, ZFP92, ZN723, ZN439, ZFP57, ZNF19, ZN404, ZN274, CBX3, ZN250, ZN570, ZN675, ZN695, ZN548, ZN132, ZN738, ZN420, ZN626, ZN559, ZN460, ZN268, ZN304, ZN605, ZN844, SUMO5, ZN101, ZN783, ZN417, ZN182, ZN823, ZN177, ZN197, ZN717, ZN669, ZN256, ZN251, CBX4, CDY2, CDYL2, ZN562, ZN461, Z324A, ZN766, ID2, ZN214, CBX7, ID1, CREM, SCX, ASCL1, ZN764, SCML2, TWST1, CREB1, TERF1, ID3, CBX8, GSX1, NKX22, ATF1, TWST2, ZNF17, TOX3, TOX4, ZMYM3, I2BP1, RHXF1, SSX2, I2BPL, ZN680, TRI68, HXA13, PHC3, TCF24, HXB13, HEY1, PHC2, ZNF81, FIGLA, SAM11, KMT2B, HEY2, JDP2, HXC13, ASCL4, HHEX, GSX2, ETV7, ASCL3, PHC1, OTP, I2BP2, VGLL2, HXA11, PDLI4, ASCL2, CDX4, ZN860, LMBL4, PDIP3, NKX25, CEBPB, ISL1, CDX2, PROP1, SIN3B, SMBT1, HXC11, HXC10, PRS6A, VSX1, NKX23, MTG16, HMX3, HMX1, KIF22, CSTF2, CEBPE, DLX2, PPARG, PRIC1, UNC4, BARX2, ALX3, TCF15, TERA, VSX2, HXD12, CDX1, TCF23, ALX1, HXA10, RX, CXXC5, SCML1, NFIL3, DLX6, MTG8, CEBPD, SEC13, FIP1, ALX4, LHX3, PRIC2, MAGI3, NELL1, PRRX1, MTG8R, RAX2, DLX3, DLX1, NKX26, NAB1, SAMD7, PITX3, WDR5, MEOX2, NAB2, DHX8, CBX6, EMX2, CPSF6, HXC12, KDM4B, LMBL3, PHX2A, EMX1, NC2B, DLX4, SRY, ZN777, ZN398, GATA3, BSH, SF3B4, TEAD1, TEAD3, RGAP1, PHF1, GATA2, FOXO3, ZN212, IRX4, ZBED6, LHX4, SIN3A, RBBP7, NKX61, R51A1, MB3L1, DLX5, NOTC1, TERF2, ZN282, RGS12, ZN840, SPI2B, PAX7, NKX62, ASXL2, FOXO1, GATA1, ZMYM5, LRP1, MIXL1, SGT1, LMCD1, CEBPA, SOX14, WTIP, PRP19, NKX11, RBBP4, DMRT2, SMCA2, and their functionally active fragments.

18. The composition according to claim 17, wherein the transcriptional repressor domain comprises the amino acid sequence shown in any one of SEQ ID NOs:10-39.

19. The composition according to claim 17 or 18, wherein the transcriptional repressor domain comprises a zinc finger-based transcription factor or a functionally active fragment thereof.

20. The composition according to claim 19, wherein the zinc finger-based transcription factor is a Krüppel-associated box (KRAB) domain or a KRAB domain derived from ZIM3 (ZIM3 KRAB).

21. The composition according to any one of claims 1-20, wherein the transcriptional regulatory domain comprises two or more of the zinc finger-based transcription factors or their functionally active fragments, wherein the two or more zinc finger-based transcription factors are of the same or different types and are connected by a linker sequence.

22. The composition according to claim 17 or 18, wherein the transcriptional repressor domain comprises a histone modification domain.

23. The composition according to claim 22, wherein the histone modification domain is selected from: EZH2, HDAC3, HDAC1, EHMT2(G9A), PRMT1, PRMT5, SETDB1, hSIRT1, HP1a, LSD1, and their functionally active fragments.

24. The composition according to claim 23, wherein the histone modification domain comprises the amino acid sequence shown in any one of SEQ ID NO:24-39.

25. The composition according to any one of claims 1-24, wherein the epigenetic modification domain and the transcriptional regulatory domain are both located at the N-terminus or both located at the C-terminus of the DNA-binding domain, and the respective domains are connected directly or indirectly via linker sequences.

26. The composition according to any one of claims 1-24, wherein the epigenetic modification domain and the transcriptional regulatory domain are located at the N-terminus and C-terminus of the DNA-binding domain, respectively, and the domains are connected directly or indirectly via linker sequences.

27. The composition according to any one of claims 14, 21, 25 and 26, wherein the linker sequence is an XTEN linker sequence.

28. The composition according to any one of claims 1-27, wherein the fusion molecule comprises, connected in sequence from the N-terminus to the C-terminus: 1) the epigenetic modification domain, the transcriptional regulatory domain, and the DNA-binding domain; or 2) the transcriptional regulatory domain, the epigenetic modification domain, and the DNA-binding domain; or 3) the DNA-binding domain, the epigenetic modification domain, and the transcriptional regulatory domain; or 4) the DNA-binding domain, the transcriptional regulatory domain, and the epigenetic modification domain; or 5) the epigenetic modification domain, the DNA-binding domain, and the transcriptional regulatory domain; or 6) the transcriptional regulatory domain, the DNA-binding domain, and the epigenetic modification domain.

29. The composition according to any one of claims 1-28, wherein the fusion molecule comprises, connected in sequence from the N-terminus to the C-terminus: 1) one or a combination of DNMT3A and DNMT3L, one or more zinc finger-based transcription factors or histone modification domains, and dCas9; or 2) one or more zinc finger-based transcription factors or histone modification domains, one or a combination of DNMT3A and DNMT3L, and dCas9; or 3) dCas9, one or a combination of DNMT3A and DNMT3L, and one or more zinc finger-based transcription factors or histone modification domains; or 4) dCas9, one or more zinc finger-based transcription factors or histone modification domains, and one or a combination of DNMT3A and DNMT3L; or 5) one or a combination of DNMT3A and DNMT3L, dCas9, and one or more zinc finger-based transcription factors or histone modification domains; or 6) one or more zinc finger-based transcription factors or histone modification domains, dCas9, and one or a combination of DNMT3A and DNMT3L.

30. The composition according to any one of claims 1-29, wherein the fusion molecule comprises the following domains: dCas9-DNMT(3A-3L)-KRAB; dCas9-KRAB-DNMT(3A-3L); KRAB-DNMT(3A-3L)-dCas9; DNMT(3A-3L)-KRAB-dCas9; DNMT(3A-3L)-dCas9-EZH2; DNMT(3A-3L)-dCas9-HDAC3; DNMT(3A-3L)-dCas9-HP1a; DNMT(3A-3L)-dCas9-HDAC1; DNMT(3A-3L)-dCas9-PRMT1; DNMT(3A-3L)-dCas9-SETDB1; DNMT(3A-3L)-dCas9-hSIRT1; DNMT(3A-3L)-dCas9-PRMT5; DNMT(3A-3L)-dCas9-G9A; wherein DNMT(3A-3L) indicates that DNMT3A and DNMT3L are connected directly or indirectly in any order, "-" indicates that the domains on either side are connected directly or indirectly in order from the N-terminus to the C-terminus, and KRAB indicates at least one zinc finger protein-based transcription factor selected from KRAB and ZIM3 KRAB.

31. The composition according to any one of claims 1-30, wherein the fusion molecule further comprises a nuclear localization signal (NLS) and / or a marker domain.

32. The composition according to claim 31, wherein the NLS sequence is directly or indirectly fused to the C-terminus, N-terminus, or both ends of the at least one DNA binding domain.

33. The composition according to any one of claims 1-32, wherein the fusion molecule comprises the amino acid sequence shown in any one of SEQ ID NO: 58-67.

34. The composition according to any one of claims 1-33, wherein the composition is capable of providing modification of at least one nucleotide in the vicinity of the HBV gene and / or within the regulatory element of the HBV gene.

35. The composition according to any one of claims 1-34, wherein the composition is capable of inhibiting the expression of the HBV gene or reducing the HBV gene product.

36. The composition according to any one of claims 1-35, wherein the nucleic acid encoding the fusion molecule is deoxyribonucleic acid (DNA) or messenger ribonucleic acid (mRNA).

37. The composition according to any one of claims 1-36, wherein the fusion molecule or the nucleic acid encoding the fusion molecule is packaged in liposomes or lipid nanoparticles.

38. The composition according to any one of claims 1-37, wherein the fusion molecule or nucleic acid encoding the fusion molecule and the sgRNA or nucleic acid encoding the sgRNA are packaged in the same or different liposomes or lipid nanoparticles (LNPs).

39. The composition according to claim 37 or 38, wherein the liposome or the lipid nanoparticle comprises ionizable lipids (20%-70% molar ratio), polyethylene glycol-modified lipids (0%-30% molar ratio), supporting lipids (30%-50% molar ratio), and cholesterol (10%-50% molar ratio).

40. The composition of claim 39, wherein the ionizable lipid is selected from pH-responsive ionizable lipids, thermoresponsive ionizable lipids, and photoresponsive ionizable lipids.

41. The composition according to any one of claims 1-36, wherein the fusion molecule or the nucleic acid encoding the fusion molecule is packaged in an AAV vector.

42. The composition according to any one of claims 1-36 and 41, wherein the fusion molecule or nucleic acid encoding the fusion molecule and the sgRNA or nucleic acid encoding the sgRNA are packaged in the same or different AAV vectors.

43. The composition according to any one of claims 1-42, wherein the composition is a pharmaceutical composition comprising a pharmaceutically acceptable carrier.

44. A method for reducing or eliminating the expression of hepatitis B virus (HBV) gene products in cells, the method comprising the step of introducing the composition of any one of claims 1-43 into the cells, thereby reducing or eliminating the expression of the HBV gene products in the cells.

45. The method according to claim 44, wherein it is an in vitro method or an in vivo method.

46. A method for treating a subject with a hepatitis B virus (HBV) infection-related disease or alleviating symptoms of an HBV infection-related disease in a subject, the method comprising the step of introducing an effective amount of the composition of any one of claims 1-43 into the cells of the subject.

47. The method of claim 46, wherein the subject is a mammal, such as a human, monkey, mouse, rat, rabbit, pig, horse, cat, or dog.

48. The method according to claim 46 or 47, wherein the method comprises administering the composition to the subject one or more times.

49. The method according to any one of claims 46-48, the method comprising administering the composition to the subject at least twice.

50. The method according to claim 48 or 49, wherein the interval between successive administrations of the composition is 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 11 days, 12 days, 13 days, 14 days or 15 days.

51. The method according to any one of claims 46-50, wherein the HBV infection-related diseases include hepatitis, cirrhosis, liver fibrosis, and hepatocellular carcinoma caused by HBV infection.

52. The composition of any one of claims 1-43, wherein the composition is used to treat a subject with hepatitis B virus (HBV) infection-related disease or to alleviate symptoms of HBV infection-related disease in a subject.

53. A kit comprising the composition of any one of claims 1-43, a container for holding the composition, and/or instructions for use.