Methods for targeted modulation of PCSK9 by epigenetic editing and uses thereof
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- EPIGENIC THERAPEUTICS INC
- Filing Date
- 2025-06-20
- Publication Date
- 2026-06-26
AI Technical Summary
Existing PCSK9 inhibitors and gene-editing drugs have poor long-term medication adherence and genomic safety risks, and cannot provide a safe treatment option that can be effective for life with a single dose.
Epigenetic editing tools were used to target the transcriptional regulatory region of the PCSK9 gene. By introducing repressive epigenetic modifications, the transcriptional activity of PCSK9 was altered. A complex containing a recruitment domain, a transcriptional repressor domain, and a DNA methylation domain was used with guide RNA to reduce or eliminate the expression of the PCSK9 gene product.
It achieves safe and sustained inhibition of PCSK9 expression, reduces LDL-C levels in the blood, and treats mixed hyperlipidemia, avoiding the inconvenience of long-term treatment and the risk of heredity.
Abstract
Description
Methods and applications of targeted regulation of PCSK9 via epigenetic editing
[0001] Priority
[0002] This application claims the benefit of and priority to Chinese Application No. 202410817049X, filed on June 21, 2024, the entire contents of which are incorporated herein by reference for all purposes.
Technical Field
[0003] This disclosure pertains to the biomedical field and particularly relates to an epigenetic editing tool and guide RNA for the targeted reduction or elimination of PCSK9 gene products in vivo.
Background
[0004] Familial hypercholesterolemia (FH) is primarily caused by mutations in the low-density lipoprotein receptor (LDLR) gene that impair the clearance of cholesterol from the blood. The LDLR gene encodes the LDL receptor, a cell surface protein responsible for transporting LDL cholesterol from the blood into the liver for metabolism and clearance. When these receptors are impaired or insufficient in number, LDL cholesterol levels in the blood rise, and prolonged elevation of LDL cholesterol can lead to atherosclerosis and other cardiovascular diseases. Heterozygous FH patients carry one normal LDLR gene and one mutated LDLR gene and therefore retain approximately half of normal LDL receptor function, which leads to moderate to severe hypercholesterolemia. Genetic testing can determine the type of LDLR gene mutation, which is crucial for diagnosis and familial screening.
[0005] Currently, various types of drugs are used to lower low-density lipoprotein cholesterol (LDL-C) levels in the blood, including statins, bile acid sequestrants, fibrates, niacin, and PCSK9 (proprotein convertase subtilisin/kexin type 9) inhibitors. Among them, PCSK9 inhibitors have attracted much attention due to their unique mechanism of action. The PCSK9 protein promotes the degradation of LDL receptors (LDL-R) on the surface of liver cells in vivo, which hinders the liver's ability to clear LDL-C from the blood. Reducing PCSK9 expression reduces its binding to LDL-R and prevents LDL-R degradation, thereby increasing the number of LDL-R on the surface of liver cells and improving the clearance of LDL-C from the blood.
[0006] Currently, drug strategies for reducing PCSK9 expression include inhibitors, monoclonal antibodies, siRNA, and gene-editing drugs. Inhibitors, monoclonal antibodies, and siRNA have shown promising therapeutic effects, but the need for long-term medication leads to poor patient adherence. Furthermore, these strategies may cause patient intolerance, and some patients have serious pre-existing complications that preclude treatment. Therefore, more effective drugs with better sustained efficacy are crucial. Gene therapy drugs have shown some therapeutic potential, for example VERVE-101, a gene-editing drug developed by Verve Therapeutics in the United States. This drug uses base editing technology to directly modify the PCSK9 genomic sequence in the liver as a one-time treatment, addressing the burden of long-term medication. However, gene therapy drugs with genomic cleavage activity still pose long-term genomic safety risks, hindering rapid market validation and acceptance. Therefore, there is an urgent need for a safe, single-dose treatment with lifelong efficacy that avoids the inconvenience of long-term treatment and the risks of hereditary complications.
[0007] This disclosure provides a treatment strategy that can effectively and persistently inhibit PCSK9. An epigenetic editing tool targets the transcriptional regulatory region of the PCSK9 gene and introduces repressive epigenetic modifications that alter the transcriptional activity of PCSK9, thereby inhibiting expression of the PCSK9 gene and reducing the level of LDL-C in the blood, thus achieving the purpose of treating mixed hyperlipidemia.
Summary of the Invention
[0008] This disclosure provides a composition comprising: a) a complex comprising a first fusion and a second fusion, or a nucleic acid sequence encoding the complex; b) a guide RNA (sgRNA) complementary to the PCSK9 gene and / or a regulatory element of the PCSK9 gene, wherein the sgRNA comprises a nucleic acid sequence as shown in SEQ ID NO: 1 and / or SEQ ID NO: 2; wherein the first fusion comprises, from N-terminus to C-terminus, a recruitment domain A and a transcriptional repressor domain, and the second fusion comprises, from N-terminus to C-terminus, a DNA methylation domain, a nucleic acid binding domain, and a recruitment domain A'.
[0009] In some embodiments, the composition comprises: a) a complex comprising a first fusion and a second fusion, or a nucleic acid sequence encoding the complex; b) two guide RNAs (sgRNAs) complementary to the PCSK9 gene and / or regulatory elements of the PCSK9 gene, wherein the sgRNAs comprise nucleic acid sequences as shown in SEQ ID NO: 1 and SEQ ID NO: 2; wherein the first fusion comprises, from N-terminus to C-terminus, a recruitment domain A and a transcriptional repressor domain, and the second fusion comprises, from N-terminus to C-terminus, a DNA methylation domain, a nucleic acid binding domain, and a recruitment domain A'.
[0010] In some embodiments, the sgRNA comprises a nucleic acid sequence as shown in SEQ ID NO: 48 or 49. In some embodiments, the sgRNA comprises a nucleic acid sequence as shown in SEQ ID NO: 1 and / or SEQ ID NO: 2, and has more than 80%, more than 85%, more than 90%, more than 95%, or more than 99% identity with SEQ ID NO: 48 or 49.
[0011] In some embodiments, the recruitment domain A is selected from one of the following two groups of domains, and the recruitment domain A' is selected from the other group: 1) general control nonderepressible 4 (GCN4), a GFP11 fragment derived from split green fluorescent protein (GFP), or a GVKESLV polypeptide; and 2) a single-chain antibody (scFv), a GFP1-10 fragment derived from split GFP, or a PDZ protein domain.
[0012] In some implementations, one of the recruitment domains A and A' is a domain of GCN4, and the other is a domain of scFv.
[0013] In some implementations, one of the recruitment domains A and A' is a GFP11 fragment, and the other is a GFP1-10 fragment.
[0014] In some implementations, one of the recruitment domains A and A' is a GVKESLV domain, and the other is a PDZ protein domain.
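The three pairings in paragraphs [0012] to [0014] can be sketched as a small lookup. This is an illustrative sketch only: the names `PARTNER` and `is_valid_pair` are hypothetical, the strings are labels rather than sequences, and the check is deliberately stricter than paragraph [0011] in that it accepts only the three cognate pairs named above.

```python
# Hypothetical lookup of the three recruitment-domain pairs named in the text.
# Each group 1 member is matched with its group 2 partner; names are labels only.
PARTNER = {
    "GCN4": "scFv",
    "GFP11": "GFP1-10",
    "GVKESLV": "PDZ",
}
# Make the lookup symmetric so either fusion may carry either member of a pair.
PARTNER.update({v: k for k, v in list(PARTNER.items())})


def is_valid_pair(domain_a: str, domain_a_prime: str) -> bool:
    """Return True if A and A' form one of the listed cognate pairs."""
    return PARTNER.get(domain_a) == domain_a_prime
```

For example, `is_valid_pair("GCN4", "scFv")` and `is_valid_pair("scFv", "GCN4")` both hold, whereas `is_valid_pair("GCN4", "PDZ")` does not.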
[0015] In some embodiments, the recruitment domain A comprises an amino acid sequence as shown in SEQ ID NO:7, and the recruitment domain A' comprises an amino acid sequence as shown in SEQ ID NO:30.
[0016] In some embodiments, the transcriptional repressor domain is selected from one or more of the following domains: KRAB, ZIM3, ZNF680, ZNF554, ZNF264, ZNF582, ZNF324, ZNF669, ZNF354A, ZNF82, ZNF595, ZNF419, ZNF566, ZIM2, EHMT2, SUV39H1, ZFPM1, TRIM28, EZH2, MXD1, SID, LSD1, HP1a, HDAC3, HDAC1, PRMT1, SETDB1, hSIRT1, ZNF436, ZNF257, ZNF675, ZNF490, ZNF320, ZNF331, ZNF816, ZNF41, ZNF189, ZNF528, ZNF543, ZNF140, ZNF610, ZNF350, ZNF8, ZNF30, ZNF98, ZNF677, ZNF596, ZNF214, ZNF37A, ZNF34, ZNF250, ZNF547, ZNF273, ZFP82, ZNF224, ZNF33A, ZNF45, ZNF175, ZNF184, ZFP28-1, ZFP28-2, ZNF18, ZNF213, ZNF394, ZFP1, ZFP14, ZNF416, ZNF557, ZNF729, ZNF254, ZNF764, ZNF785, ZNF10, CBX5, RYBP, YAF2, MGA, CBX1, SCMH1, MPP8, SUMO3, HERC2, BIN1, PCGF2, TOX, FOXA1, FOXA2, IRF2BP1, IRF2BP2, IRF2BPL, IRF-2BP1_2 N-terminal domain, HOXA13, HOXB13, HOXC13, HOXA11, HOXC11, HOXC10, HOXA10, HOXB9, HOXA9, ZFP28, ZN334, ZN568, ZN37A, ZN181, ZN510, ZN862, ZN140, ZN208, ZN248, ZN571, ZN699, ZN726, ZIK1, ZNF2, Z705F, ZNF14, ZN471, ZN624, ZNF84, ZNF7, ZN891, ZN337, Z705G, ZN529, ZN729, ZN419, Z705A, ZN302, ZN486, ZN621, ZN688, ZN33A, ZN554, ZN878,
ZN772, ZN224, ZN184, ZN544, ZNF57, ZN283, ZN549, ZN211, ZN615, ZN253, ZN226, ZN730, Z585A, ZN732, ZN681,ZN667,ZN649,ZN470,ZN484,ZN431,ZN382,ZN254,ZN124,ZN607,ZN317,ZN620,ZN141,ZN584,ZN540,ZN75D,ZN555,ZN658,ZN684,RBAK,ZN829,ZN582,ZN112,ZN716,HKR1,ZN350,ZN480,ZN416,ZNF92,ZN100,ZN736,ZNF74,ZN443,ZN195,ZN530,ZN782,ZN791,ZN331,Z354C,ZN157,ZN727,ZN550,ZN793,ZN235,ZN724,ZN573,ZN577,ZN789,ZN718,ZN300,ZN383,ZN429,ZN677,ZN850,ZN454,ZN257,ZN264,ZN485,ZN737,ZNF44,ZN596,ZN565,ZN543,ZFP69,SUMO1,ZNF12,ZN169,ZN433,ZN175,ZN347,ZNF25,ZN519,Z585B,ZN517,ZN846,ZN230,ZNF66,ZN713,ZN816,ZN426,ZN674,ZN627,ZNF20,Z587B,ZN316,ZN233,ZN611,ZN556,ZN234,ZN560,ZNF77,ZN682,ZN614,ZN785,ZN445,ZFP30,ZN225,ZN551,ZN610,ZN528,ZN284,ZN418,ZN490,ZN805,Z780B,ZN763,ZN285,ZNF85,ZN223,ZNF90,ZN557,ZN425,ZN229,ZN606,ZN155,ZN222,ZN442,ZNF91,ZN135,ZN778,ZN534,ZN586,ZN567,ZN440,ZN583,ZN441,ZNF43,ZN589,ZN563,ZN561,ZN136,ZN630,ZN527,ZN333,Z324B,ZN786,ZN709,ZN792,ZN599,ZN613,ZF69B,ZN799,ZN569,ZN564,ZN546,ZFP92,ZN723,ZN439,ZFP57,ZNF19,ZN404,ZN274,CBX3,ZN250,ZN570,ZN675,ZN695,ZN548,ZN132,ZN738,ZN420,ZN626,ZN559,ZN460,ZN268,ZN304,ZN605,ZN844,SUMO5,ZN101,ZN783,ZN417,ZN182,ZN823,ZN177,ZN197,ZN717,ZN669,ZN256,ZN251,CBX4,CDY2,CDYL2,ZN562,ZN461,Z324A,ZN766,ID2,ZN214,CBX7,ID1,CREM,SCX,ASCL1,ZN764,SCML2,TWST1,CREB1,TERF1,ID3,CBX8,GSX1,NKX22,ATF1,TWST2,ZNF17,TOX3,TOX4,ZMYM3,I2BP1,RHXF1,SSX2,I2BPL,ZN680,TRI68,HXA13,PHC3,TCF24,HXB13,HEY1,PHC2,ZNF81,FIGLA,SAM11,KMT2B,HEY2,JDP2,HXC13,ASCL4,HHEX,GSX2,ETV7,ASCL3,PHC1,OTP,I2BP2,VGLL2,HXA11,PDLI4,ASCL2,CDX4,ZN860,LMBL4,PDIP3,NKX25,CEBPB,ISL1,CDX2,PROP1,SIN3B,SMBT1,HXC11,HXC10,PRS6A,VSX1,NKX23,MTG16,HMX3,HMX1,KIF22,CSTF2,CEBPE,DLX2,PPARG,PRIC1,UNC4,BARX2,ALX3,TCF15,TERA,VSX2,HXD12,CDX1,TCF23,ALX1,HXA10,RX,CXXC5,SCML1,NFIL3,DLX6,MTG8,CEBPD,SEC13,FIP1,ALX4,LHX3,PRIC2,MAGI3,NELL1,PRRX1,MTG8R,RAX2,DLX3,DLX1,NKX26,NAB1,SAMD7,PITX3,WDR5,MEOX2,NAB2,DHX8,CBX6,EMX2,CPSF6,HXC12,KDM4B,LMBL3,PHX2A,EMX1,NC2B,DLX4,SRY,ZN777,ZN398,GATA3,BSH,SF3B4,TEAD1,TEAD3,RGAP1,PHF1,GATA2,FOXO3,ZN212,IRX4,ZBED6,LHX4,
SIN3A, RBBP7, NKX61, R51A1, MB3L1, DLX5, NOTC1, TERF2, ZN282, RGS12, ZN840, SPI2B, PAX7, NKX62, ASXL2, FOXO1, GATA1, ZMYM5, LRP1, MIXL1, SGT1, LMCD1, CEBPA, SOX14, WTIP, PRP19, NKX11, RBBP4, DMRT2, SMCA2, and functionally active fragments thereof; preferably, the transcriptional repressor domain is ZIM3.
[0017] In some embodiments, the transcriptional repressor domain comprises the amino acid sequence shown in SEQ ID NO:8.
[0018] In some embodiments, the DNA methylation domain comprises at least one DNA methyltransferase or a functionally active fragment thereof.
[0019] In some embodiments, the DNA methyltransferase is selected from DNMT3A, DNMT3B, DNMT3C, DNMT1, DNMT2, and DNMT3L.
[0020] In some embodiments, the DNA methylation domain comprises at least one DNMT3A and at least one DNMT3L.
[0021] In some embodiments, the DNA methylation domain comprises a DNMT3A-DNMT3L domain or a DNMT3L-DNMT3A domain, wherein "-" indicates that the domains on either side are connected, directly or indirectly, in order from the N-terminus to the C-terminus.
[0022] In some embodiments, the DNMT3A comprises an amino acid sequence as shown in SEQ ID NO:10.
[0023] In some embodiments, the DNMT3L comprises an amino acid sequence as shown in SEQ ID NO:11.
[0024] In some implementations, the nucleic acid binding domain is a DNA binding domain.
[0025] In some embodiments, the DNA-binding domain is selected from: a TALE domain, a zinc finger domain, a tetR domain, a meganuclease, a Cas protein, an Argonaute (Ago) protein, and their homologues, modified forms, or variants.
[0026] In some implementations, the DNA-binding domain is capable of binding to guide RNA.
[0027] In some embodiments, the DNA-binding domain is a Cas protein, and the Cas protein is a type II Cas nuclease.
[0028] In some embodiments, the Cas protein is selected from type II Cas nucleases and type V Cas nucleases.
[0029] In some embodiments, the Cas protein is a Cas9 or Cas12 protein, preferably an inactivated Cas9 (dCas9) protein.
[0030] In some embodiments, the dCas9 includes Staphylococcus aureus dCas9, Streptococcus pyogenes dCas9, Campylobacter jejuni dCas9, Corynebacterium diphtheriae dCas9, Eubacterium ventriosum dCas9, Streptococcus pasteurianus dCas9, Lactobacillus farciminis dCas9, Sphaerochaeta globus dCas9, Azospirillum (e.g., strain B510) dCas9, Gluconacetobacter diazotrophicus dCas9, Neisseria cinerea dCas9, Roseburia intestinalis dCas9, Parvibaculum lavamentivorans dCas9, Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9, Campylobacter lari (e.g., strain CF89-12) dCas9, and Streptococcus thermophilus (e.g., strain LMD-9) dCas9.
[0031] In some embodiments, the dCas9 comprises the amino acid sequence shown in any one of SEQ ID NOs: 12-29.
[0032] In some embodiments, the first fusion and the second fusion are linked by a cleavage peptide.
[0033] In some embodiments, the cleavage peptide is a 2A peptide and / or IRES.
[0034] In some embodiments, the 2A peptide is selected from P2A, T2A, E2A, or F2A.
[0035] In some embodiments, the complex comprises an amino acid sequence as shown in SEQ ID NO:42.
[0036] In some embodiments, the complex is packaged in liposomes or lipid nanoparticles.
[0037] In some embodiments, the complex and the sgRNA are packaged in liposomes or lipid nanoparticles.
[0038] In some embodiments, the complex and the sgRNA are packaged in the same liposome or lipid nanoparticle, or in different liposomes or lipid nanoparticles.
[0039] In some embodiments, liposomes or lipid nanoparticles comprise ionizable lipids (20%-70% molar ratio), polyethylene glycol-modified lipids (0%-30% molar ratio), supporting lipids (30%-50% molar ratio), and cholesterol (10%-50% molar ratio).
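As a worked illustration of the molar ranges in paragraph [0039], the following minimal sketch checks a candidate lipid recipe against them. The component keys, the function name `check_formulation`, and the requirement that the four components sum to 100 mol% are assumptions made for the example; the text states only the individual ranges.

```python
# Molar-percent ranges quoted in paragraph [0039]; the keys are illustrative names.
RANGES = {
    "ionizable_lipid": (20.0, 70.0),
    "peg_lipid": (0.0, 30.0),
    "supporting_lipid": (30.0, 50.0),
    "cholesterol": (10.0, 50.0),
}


def check_formulation(mol_percent: dict, tol: float = 1e-6) -> list:
    """Return a list of problems; an empty list means the recipe fits the ranges."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = mol_percent.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif not (low <= value <= high):
            problems.append(f"{name}: {value} outside {low}-{high}")
    # Assumption: the four listed components make up the whole lipid mix.
    total = sum(mol_percent.get(name, 0.0) for name in RANGES)
    if abs(total - 100.0) > tol:
        problems.append(f"total is {total}, expected 100")
    return problems
```

A recipe of 40 mol% ionizable lipid, 2 mol% PEG-modified lipid, 30 mol% supporting lipid, and 28 mol% cholesterol passes; lowering the supporting lipid to 10 mol% is reported as out of range.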
[0040] In some implementations, the ionizable lipids are selected from the group consisting of: pH-responsive ionizable lipids, thermoresponsive ionizable lipids, and light-responsive ionizable lipids.
[0041] In some embodiments, the complex is packaged in an AAV vector.
[0042] In some embodiments, the complex and the sgRNA are packaged in an AAV vector.
[0043] In some embodiments, the complex and the sgRNA are packaged in the same AAV vector or in different AAV vectors.
[0044] In some embodiments, the composition is a pharmaceutical composition comprising a pharmaceutically acceptable carrier.
[0045] This disclosure provides a method for reducing or eliminating the expression of the proprotein convertase subtilisin/kexin type 9 (PCSK9) gene product in cells, the method comprising the step of introducing the composition as described above into the cells of a subject, thereby reducing or eliminating the expression of the PCSK9 gene product in the cells.
[0046] This disclosure provides a method for reducing or eliminating the expression of the PCSK9 gene product in a subject in vivo, the method comprising the step of introducing the composition as described above into the subject's cells, thereby reducing or eliminating the expression of the PCSK9 gene product in the cells.
[0047] This disclosure provides a method for reducing low-density lipoprotein (LDL) cholesterol in a subject, the method comprising the step of introducing the composition as described above into the subject's cells, thereby reducing LDL cholesterol in the subject.
[0048] This disclosure provides a method for expanding a cell population with reduced expression of the PCSK9 gene product, the method comprising the steps of: i) introducing into a plurality of cells a) a complex comprising a first fusion and a second fusion, or a nucleic acid sequence encoding the complex, and b) one or more guide RNAs (sgRNAs) complementary to the PCSK9 gene and / or regulatory elements of the PCSK9 gene, wherein the sgRNAs comprise nucleic acid sequences as shown in SEQ ID NO: 1 and / or SEQ ID NO: 2; and ii) expanding the plurality of cells to generate a plurality of modified cells with reduced expression of the PCSK9 gene product, wherein the first fusion comprises, from N-terminus to C-terminus, a recruitment domain A and a transcriptional repressor domain, and the second fusion comprises, from N-terminus to C-terminus, a DNA methylation domain, a nucleic acid binding domain, and a recruitment domain A', wherein, relative to cells without the introduction of the complex or the nucleic acid sequence encoding the complex, the PCSK9 gene product expression of the plurality of modified cells is reduced by at least 50%, at least 60%, at least 70%, at least 80%, or at least 90%, and wherein the cells are hepatocytes.
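The "at least 50% ... at least 90%" tiers above are reductions relative to cells that did not receive the complex. A minimal sketch of that arithmetic follows; the function names and the use of a single control value and a single treated value are assumptions for illustration, and no particular assay or normalization is implied.

```python
THRESHOLDS = (50, 60, 70, 80, 90)  # "at least X%" tiers listed in the text


def percent_reduction(control: float, treated: float) -> float:
    """Reduction of treated expression relative to control, in percent."""
    if control <= 0:
        raise ValueError("control expression must be positive")
    return (control - treated) * 100.0 / control


def highest_tier_met(control: float, treated: float):
    """Return the largest listed threshold that the reduction reaches, or None."""
    reduction = percent_reduction(control, treated)
    met = [t for t in THRESHOLDS if reduction >= t]
    return max(met) if met else None
```

With a control level of 100 and a treated level of 25 (arbitrary units), the reduction is 75% and the highest tier met is "at least 70%".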
[0049] In another aspect, this disclosure provides the use of the composition as described above in the preparation of a medicament for treating or alleviating PCSK9-related diseases in a subject.
[0050] In some implementations, the subjects are mammals, such as humans, monkeys, mice, rats, rabbits, pigs, horses, cats, and dogs.
[0051] In some embodiments, the PCSK9-related disease is familial hypercholesterolemia (FH), complicated dyslipidemia, hyperlipidemia, hyperlipoproteinemia, coronary syndrome, dyslipidemia, atherosclerosis, or liver cancer.
Beneficial Effects
[0052] This disclosure overcomes the problems associated with current technologies by providing an epigenetic editing tool complex and guide RNA for the targeted reduction or elimination of gene products (e.g., PCSK9) in cells for in vivo gene therapy. The epigenetic editing tool complex and guide RNA of this disclosure can be used to treat hereditary diseases, including, for example, liver diseases, diseases associated with high cholesterol, and diseases associated with cholesterol (e.g., low-density lipoprotein (LDL) cholesterol) disorders.
Brief Description of the Drawings
[0053] This disclosure can be more fully understood with reference to the following figures.
[0054] Figure 1 shows the expression level of PCSK9 protein in humanized mice 42 days after administration.
[0055] Figure 2 shows a comparison of PCSK9 protein levels in the blood of patients who received the EPIREG009 composition and the CRISPRoff composition.
[0056] Figure 3 shows the mRNA expression level of human PCSK9.
[0057] Figure 4 shows the blood PCSK9 level after administration of the EPIREG009 composition.
[0058] Figure 5 shows the blood LDL-C levels after administration of the EPIREG009 composition.
Detailed Description
[0059] The following description of this disclosure is merely intended to illustrate various embodiments of the disclosure. Therefore, the specific modifications discussed should not be construed as limiting the scope of this disclosure. It will be apparent to those skilled in the art that various equivalents, changes, and modifications can be made without departing from the scope of this disclosure, and it should be understood that these equivalent embodiments are included herein. All references cited herein, including publications, patents, and patent applications, are incorporated herein by reference in their entirety.
[0060] I. Definitions
[0061] As used herein, the terms “nucleic acid,” “oligonucleotide,” or “polynucleotide” refer to at least two nucleotides covalently linked together. The description of a single strand also defines the sequence of the complementary strand. Therefore, nucleic acid also encompasses the complementary strand of the described single strand. Many variants of a nucleic acid can be used for the same purpose as a given nucleic acid. Therefore, nucleic acid also encompasses substantially identical nucleic acids and their complements. A single strand provides a probe that can hybridize with a target sequence under stringent hybridization conditions. Therefore, nucleic acid also encompasses probes that hybridize under stringent hybridization conditions. Nucleic acids can be single-stranded or double-stranded, or can contain portions having both double-stranded and single-stranded sequences. Nucleic acids can be DNA (both genomic DNA and cDNA), RNA, or hybrids, wherein nucleic acids can contain combinations of deoxyribonucleotides and ribonucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, and isoguanine. Nucleic acids can be obtained by chemical synthesis or by recombinant methods.
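The statement that a single strand also defines its complementary strand can be made concrete with a short sketch. The helper name `reverse_complement` is illustrative, and the sketch covers only DNA with the bases A, C, G, T (plus N); RNA and the modified bases listed above are not handled.

```python
# Watson-Crick pairing for DNA; N is kept as N. RNA (U) is not handled here.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is one way to see that either strand fully determines the other.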
[0062] As used herein, the term "operably linked" means that expression of a gene is under the control of a promoter to which it is spatially connected. The promoter can be located 5' (upstream) or 3' (downstream) of the gene under its control. The distance between the promoter and the gene can be approximately the same as the distance between that promoter and the gene it controls in the gene from which the promoter is derived. As is known in the art, variations in this distance can be tolerated without loss of promoter function.
[0063] As used herein, the term "promoter" or "core promoter" refers to a synthetic or naturally derived molecule capable of conferring, activating, or enhancing nucleic acid expression in a cell. A promoter may contain one or more specific transcriptional regulatory sequences to further enhance nucleic acid expression and / or alter spatial and / or temporal expression of nucleic acids. Promoters may also include distal enhancers or repressor elements, which can be located up to several thousand base pairs from the transcription start site. Promoters can originate from sources including viruses, bacteria, fungi, plants, insects, and animals. Promoters can constitutively or differentially regulate the expression of genomic molecules in response to external stimuli such as physiological stress, pathogens, metal ions, or inducers, either relative to the cell, tissue, or organ in which expression occurs, or relative to the developmental stage in which expression occurs. Representative examples of promoters include the bacteriophage T7 promoter, bacteriophage T3 promoter, SP6 promoter, lac operator gene promoter, tac promoter, SV40 late promoter, SV40 early promoter, RSV-LTR promoter, and CMV IE promoter.
[0064] As used herein, the term "vector" refers to a nucleic acid sequence containing an origin of replication. Vectors can be viral vectors, bacteriophages, bacterial artificial chromosomes, or yeast artificial chromosomes. Vectors can be DNA or RNA vectors. Vectors can be self-replicating extrachromosomal vectors, such as DNA plasmids.
[0065] As used herein, the terms “adeno-associated virus (AAV) vector,” “AAV gene therapy vector,” and “gene therapy vector” refer to a vector having a functional or partially functional ITR sequence and a transgene. The term “ITR” as used herein refers to an inverted terminal repeat (ITR). ITR sequences can be derived from adeno-associated virus serotypes, including but not limited to AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, and AAV-6. However, the ITR does not necessarily have to be a wild-type nucleotide sequence and can be altered (e.g., by insertion, deletion, or substitution of nucleotides) as long as the sequence retains its function to provide functional rescue, replication, and packaging. One or more AAV wild-type genes, preferably rep and / or cap genes, of an AAV vector may be wholly or partially deleted, but functional flanking ITR sequences are retained. The function of the functional ITR sequence is, for example, to rescue, replicate, and package AAV viral particles or granules. Therefore, “AAV vector” is defined herein as including at least those sequences required for inserting the transgene into the cells of a subject. Optionally, sequences that are necessary for viral replication and packaging (e.g., functional ITRs) are included.
[0066] The terms “subject” and “patient” as used herein are used interchangeably and refer to both humans and non-human animals. The term “non-human animal” in this disclosure includes all vertebrates, such as mammals and non-mammals, such as non-human primates, sheep, dogs, cats, horses, cattle, chickens, amphibians, reptiles, etc.
[0067] As used herein, the term "treatment" (e.g., of a disease) means that a subject (e.g., a human) who has a disease, is at risk of developing a disease, and / or experiences symptoms of a disease, in one embodiment, experiences milder symptoms and / or recovers more quickly when administered, for example, the fusion molecule described herein, a nucleic acid encoding the fusion molecule, a gRNA, and / or a nucleic acid encoding the gRNA, compared to never having been administered the same.
[0068] II. Nucleic acid binding domain
[0069] In certain embodiments of the methods and compositions according to this disclosure as defined herein, the nucleic acid binding domain refers to a DNA binding domain (e.g., DNA-targeting) comprising a (DNA) nuclease, such as a nuclease that can target DNA in a sequence-specific manner or can be directed or instructed to target DNA in a sequence-specific manner, such as a CRISPR-Cas system, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), or meganucleases. In some embodiments, the DNA binding protein is a DNA nuclease derived from the CRISPR-Cas system.
[0070] Transcription activator-like effector nuclease (TALEN) system
[0071] In some embodiments, the nucleic acid-binding protein is a (modified) transcription activator-like effector nuclease (TALEN) system. Transcription activator-like effectors (TALEs) can be engineered to bind virtually any desired DNA sequence. Exemplary methods for genome editing using the TALEN system can be found, for example, in the following literature: Cermak T, Doyle EL, Christian M, Wang L, Zhang Y, Schmidt C, et al. Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Res. 2011; 39:e82; Zhang F, Cong L, Lodato S, Kosuri S, Church GM, Arlotta P. Efficient construction of sequence-specific TAL effectors for modulating mammalian transcription. Nat Biotechnol. 2011; 29:149-153; and U.S. Patent Nos. 8,450,471, 8,440,431 and 8,440,432, each of which is incorporated herein by reference in its entirety.
[0072] As a further guideline, but not limited thereto, naturally occurring TALE, or "wild-type TALE," is a nucleic acid-binding protein secreted by various Proteobacterial species. TALE polypeptides contain a nucleic acid-binding domain consisting of tandem repeat sequences of highly conserved monomeric polypeptides, primarily 33, 34, or 35 amino acids in length, differing from each other mainly at amino acid positions 12 and 13. In some embodiments, the nucleic acid is DNA.
[0073] As used herein, the term “polypeptide monomer” or “TALE monomer” refers to a highly conserved repeating polypeptide sequence within the TALE nucleic acid binding domain, and the term “repeat variable di-residue” or “RVD” refers to the highly variable amino acids at positions 12 and 13 of the polypeptide monomer.
[0074] As provided throughout this disclosure, the amino acid residues of the RVD are described using the IUPAC single-letter codes for amino acids. The general representation of the TALE monomer contained within the DNA-binding domain is X_{1-11}-(X_{12}X_{13})-X_{14-33 or 34 or 35}, where the subscript indicates the position of the amino acid and X represents any amino acid. X_{12}X_{13} represents the RVD. In some polypeptide monomers, the variable amino acid at position 13 is missing or absent, and in such polypeptide monomers, the RVD consists of a single amino acid. In such cases, the RVD can optionally be represented as X*, where X represents X_{12} and (*) indicates that X_{13} is not present. The DNA-binding domain contains several repeats of the TALE monomer, which can be represented as (X_{1-11}-(X_{12}X_{13})-X_{14-33 or 34 or 35})_z. In an advantageous embodiment, z is at least 5 to 40. In another advantageous embodiment, z is at least 10 to 26. The TALE monomer has a nucleotide binding affinity determined by the identity of the amino acids in its RVD. For example, a polypeptide monomer with RVD NI preferentially binds to adenine (A), a polypeptide monomer with RVD NG preferentially binds to thymine (T), a polypeptide monomer with RVD HD preferentially binds to cytosine (C), and a polypeptide monomer with RVD NN preferentially binds to both adenine (A) and guanine (G). In yet another embodiment of this disclosure, a polypeptide monomer with RVD IG preferentially binds to T. Therefore, the number and order of polypeptide monomer repeats in the nucleic acid binding domain of the TALE determine its nucleic acid target specificity. In a further embodiment of this disclosure, a polypeptide monomer with RVD NS recognizes all four base pairs and can bind to A, T, G, or C.
The structure and function of TALENs are further described in, for example, the following references: Moscou et al., Science 326:1501 (2009); Boch et al., Science 326:1509-1512 (2009); and Zhang et al., Nature Biotechnology 29:149-153 (2011), each of which is incorporated herein by reference in its entirety. In some embodiments, targeting is achieved by a polynucleotide-binding TALEN fragment. In some embodiments, the targeting domain comprises or is composed of a catalytically inactive TALEN or its nucleic acid-binding fragment.
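The RVD preferences stated in paragraph [0074] amount to a per-base code, which the following minimal sketch applies to a DNA target. The names `BASE_TO_RVD`, `rvds_for_target`, and `monomer_count_ok` are hypothetical, and choosing NI for A and NN for G is a simplifying assumption of this example (the text lists NN as binding both A and G); it is not presented as a design procedure from the disclosure.

```python
# RVD-to-base preferences as stated in paragraph [0074].
# NN is listed for both A and G; this sketch uses NI for A and NN for G.
BASE_TO_RVD = {"A": "NI", "T": "NG", "C": "HD", "G": "NN"}


def rvds_for_target(target: str) -> list:
    """Return one RVD per base of a DNA target, read 5'->3'."""
    target = target.upper()
    unknown = set(target) - set(BASE_TO_RVD)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [BASE_TO_RVD[base] for base in target]


def monomer_count_ok(rvds: list, low: int = 5, high: int = 40) -> bool:
    """Check the repeat count z against the 5-40 range given in the text."""
    return low <= len(rvds) <= high
```

For the four-base target TACG the sketch yields NG, NI, HD, NN, which is too short for the stated range of z; a ten-base target falls within it.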
[0075] Zinc finger nuclease (ZFN) system
[0076] In some embodiments, the nucleic acid structural protein (e.g., a DNA-binding protein) comprises or is composed of a (modified) zinc finger nuclease (ZFN) system. The ZFN system uses an artificial restriction enzyme created by fusing a zinc finger DNA-binding domain with a DNA-cutting domain, which can be engineered to target a desired DNA sequence. Exemplary methods for genome editing using ZFNs can be found, for example, in U.S. Patent Nos. 6,534,261, 6,607,882, 6,746,838, 6,794,136, 6,824,978, 6,866,997, 6,933,113, 6,979,539, 7,013,219, 7,030,215, 7,220,719, 7,241,573, 7,241,574, 7,585,849, 7,595,376, 6,903,185, and 6,479,626, each of which is incorporated herein by reference in its entirety. As further guidance, but not limited thereto, artificial zinc finger (ZF) technology involves arrays of ZF modules to target novel DNA binding sites in the genome. Each finger module in the ZF array targets three DNA bases. Custom arrays of individual zinc finger domains were assembled into ZF proteins (ZFPs). ZFPs can contain functional domains. The first synthetic zinc finger nuclease (ZFN) was developed by fusing the ZF protein with the catalytic domain of the IIS-type restriction enzyme FokI. (Kim, Y. Get al., 1994, Chimeric restriction endonuclease, Proc. Natl. Acad. Sci. USA 91, 883-887; Kim, Y. Get al., 1996, Hybrid restriction enzymes: zinc finger fusions to FokI cleavage domain. Proc. Natl. Acad. Sci. USA 93, 1156-1160). By using paired ZFN heterodimers, each targeting a different nucleotide sequence separated by short intervals, improved cleavage specificity can be obtained with reduced off-target activity. (Doyon, Y. et al., 2011, Enhancing zinc-finger-nuclease activity with improved obligate heterodimeric architectures. Nat. Methods 8, 74-79). ZFPs can also be designed as transcriptional activators and repressors and have been used to target many genes in a variety of organisms. 
In some embodiments, the targeting domain comprises or consists of a nucleic acid-binding zinc finger nuclease or a nucleic acid-binding fragment thereof. In some embodiments, the nucleic acid-binding zinc finger nuclease (or fragment thereof) is catalytically inactive.
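The statement that each finger module targets three DNA bases can be illustrated with a minimal sketch. The function name and the 9-bp toy site below are assumptions made for illustration only; the sketch does not model finger orientation on the DNA or any context effects between neighboring fingers.

```python
def zinc_finger_triplets(target_site: str) -> list[str]:
    """Split a target site into consecutive 3-base subsites, one per finger.

    Hypothetical helper: assumes the site length is a multiple of three and
    ignores finger orientation and cross-finger context effects.
    """
    site = target_site.upper()
    if not site or len(site) % 3 != 0:
        raise ValueError("target site length must be a non-zero multiple of 3")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


# Toy 9-bp site (illustrative only): three subsites, hence a three-finger array.
subsites = zinc_finger_triplets("GCGTGGGCG")
print(len(subsites), subsites)
```

Under these assumptions, an array of n fingers addresses a site of 3n bases.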
[0077] Meganucleases
[0078] In some embodiments, the nucleic acid structural protein (e.g., a DNA-binding protein) comprises a (modified) meganuclease, which is an endonuclease characterized by a large recognition site (a double-stranded DNA sequence of 12-40 base pairs). Exemplary methods using meganucleases can be found in U.S. Patent Nos. 8,163,514, 8,133,697, 8,021,867, 8,119,361, 8,119,381, 8,124,369, and 8,129,134, each of which is incorporated herein by reference in its entirety. In some embodiments, targeting is achieved by a polynucleotide-binding meganuclease fragment. In some embodiments, targeting is achieved by a polynucleotide-binding, catalytically inactive meganuclease (or fragment thereof). Thus, in certain embodiments, the targeting domain comprises or consists of a nucleic acid-binding meganuclease or a nucleic acid-binding fragment thereof.
[0079] CRISPR-Cas system
[0080] In some embodiments, the nucleic acid structural protein (e.g., DNA-binding protein) and the single guide RNA sequence are derived from a CRISPR-Cas system. This disclosure provides a CRISPR / Cas9-based engineered system for genome editing and the treatment of genetic diseases. The CRISPR / Cas9-based engineered system can be designed to target any gene (e.g., PCSK9), including genes associated with genetic diseases, liver disease, and cholesterol dysregulation such as LDL. This disclosure provides a CRISPR-Cas system comprising a genetically engineered Cas protein and / or guide RNA having desired specificity and activity (e.g., reduced or eliminated expression of the PCSK9 gene product). The CRISPR / Cas9-based system may include a Cas9 protein, a mutant Cas9 protein, or a Cas9 fusion protein (e.g., a DNMT3A-DNMT3L(3A3L)-dCas9-GCN4 fusion molecule) and two sgRNAs (PCSK9 sgRNAs). The Cas9 fusion protein may, for example, include domains with different activities than the endogenous domains of Cas9 (e.g., DNMT3A, DNMT3L, or GCN4).
[0081] Generally, the Cas protein (used herein interchangeably with CRISPR protein, CRISPR enzyme, CRISPR-Cas protein, CRISPR-Cas enzyme, Cas, CRISPR effector, or Cas effector protein) and/or the guide sequence are components of a CRISPR-Cas system. The term CRISPR-Cas system or CRISPR system refers collectively to transcripts and other elements involved in the expression of, or directing the activity of, CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the case of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the case of an endogenous CRISPR system), or “RNA” as that term is used herein (e.g., CRISPR RNA and trans-activating (tracr) RNA, or a single guide RNA (sgRNA; chimeric RNA)), or other sequences and transcripts from a CRISPR locus.
[0082] Generally, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the case of an endogenous CRISPR system). In the engineered systems of this disclosure, direct repeat sequences can include naturally occurring or non-natural sequences. The direct repeat sequences of this disclosure are not limited to naturally occurring lengths and sequences. Furthermore, the direct repeat sequences of this disclosure can include nucleotide insertions, such as aptamers or sequences that bind to adaptor proteins (for binding to functional domains). In some embodiments, a direct repeat containing, for example, an inserted sequence has one end that is substantially the first half of a short DR and another end that is substantially the second half of the short DR.
[0083] Typically, the guide sequence (or spacer sequence) can be sufficiently complementary to the PCSK9 polynucleotide sequence to hybridize with the PCSK9 sequence and guide the CRISPR complex to specifically bind to the PCSK9 polynucleotide sequence. In some embodiments, when optimal alignment is performed using a suitable alignment algorithm, the complementarity between the guide sequence and its corresponding target sequence is equal to or greater than about 50%, 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher.
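As an illustration of how a percent complementarity between a guide sequence and its target could be computed, the following is a minimal sketch. It assumes an ungapped comparison of two equal-length sequences rather than a full alignment algorithm, and the function name and toy sequences are hypothetical (they are not PCSK9 sequences from this disclosure).

```python
# Watson-Crick partner on a DNA target strand for each base of an RNA guide.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna: str, target_dna: str) -> float:
    """Percent of guide positions that base-pair with the target strand.

    Both sequences are written 5'->3'. The target is reversed so that guide
    position i faces its antiparallel partner. Assumption of this sketch:
    ungapped comparison of equal-length sequences, no wobble pairing.
    """
    guide = guide_rna.upper()
    target = target_dna.upper()[::-1]
    if not guide or len(guide) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(RNA_TO_DNA_PARTNER.get(g) == t for g, t in zip(guide, target))
    return 100.0 * paired / len(guide)


# Toy sequences (illustrative only).
print(percent_complementarity("GACU", "AGTC"))  # fully complementary
print(percent_complementarity("GACU", "AGTT"))  # one position unpaired
```

A real workflow would first obtain the optimal alignment with a suitable alignment algorithm, as the paragraph above states, and then apply the same ratio to the aligned positions.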
[0084] In some embodiments, cleavage efficiency can be modulated by introducing mismatches, for example one or more mismatches, such as one or two mismatches, between the spacer sequence and the target sequence, including by choosing the position of the mismatch along the spacer/target region. For example, the closer a double mismatch is to the center (i.e., not at the 3' or 5' end), the greater its impact on cleavage efficiency. Therefore, cleavage efficiency can be adjusted by selecting the position of the mismatch along the spacer. For example, if less than 100% cleavage of the target is desired (e.g., in a cell population), one or more, preferably two, mismatches relative to the target sequence can be introduced into the spacer sequence. The closer the mismatch position is to the center of the spacer, the lower the cleavage percentage.
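The positional reasoning above can be made concrete with a small sketch that lists each mismatch and its distance from the spacer midpoint. The function name and toy sequences are hypothetical, both sequences are assumed to be written in the same sense and alphabet, and no quantitative relationship between distance and cleavage efficiency is implied.

```python
def mismatch_offsets_from_center(spacer: str, protospacer: str) -> list[tuple[int, float]]:
    """Return (1-based position, distance from spacer midpoint) per mismatch.

    Assumptions of this sketch: equal-length sequences written in the same
    sense and alphabet; a smaller distance means a more central mismatch.
    """
    if not spacer or len(spacer) != len(protospacer):
        raise ValueError("sequences must be non-empty and of equal length")
    midpoint = (len(spacer) + 1) / 2
    pairs = zip(spacer.upper(), protospacer.upper())
    return [
        (position, abs(position - midpoint))
        for position, (a, b) in enumerate(pairs, start=1)
        if a != b
    ]


# Toy 10-nt example (illustrative only): mismatches at positions 5 and 10.
print(mismatch_offsets_from_center("ACGTACGTAC", "ACGTTCGTAA"))
```

Following the paragraph above, the mismatch with the smaller distance would be expected to reduce cleavage more than the one at the end of the spacer.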
[0085] The CRISPR-Cas system or its components can be used to introduce one or more mutations into target loci or nucleic acid sequences. These mutations can include the introduction, deletion, or substitution of one or more nucleotides at each target sequence in the cell via guide RNA or sgRNA. The mutations can include the introduction, deletion, or substitution of 1-75 nucleotides at each target sequence in the cell via guide RNA.
[0086] Typically, in the case of an endogenous CRISPR-Cas system, the formation of a CRISPR complex (containing a guide sequence that hybridizes to the target sequence and is complexed with one or more Cas proteins) results in cleavage in or near the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs), but this may depend on, for example, secondary structure, particularly in the case of an RNA target. In some cases, in the case of an endogenous CRISPR system, the formation of a CRISPR complex (containing a guide sequence that hybridizes to the target sequence and is complexed with one or more Cas proteins) results in cleavage of one or both strands (if applicable) in or near the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs).
[0087] In some embodiments, the guide RNA (capable of guiding Cas to the target locus) may comprise (1) a guide sequence capable of hybridizing with a target locus (a polynucleotide target locus, such as an RNA target locus) in a eukaryotic cell; and (2) a direct repeat (DR) sequence, both present in a single RNA, namely an sgRNA (arranged in a 5' to 3' orientation) or a crRNA.
[0088] For general information concerning CRISPR-Cas systems, their components, and the delivery of such components, including all methods, materials, delivery vehicles, vectors, particles, AAVs, and their preparation and use, including quantities and formulations, useful in the practice of this disclosure, reference is made to the following: U.S. Patent Nos. 8,999,641, 8,993,233, 8,945,839, 8,932,814, 8,906,616, 8,895,308, 8,889,418, 8,889,356, 8,871,445, 8,865,406, 8,795,965, 8,771,945, and 8,697,359; U.S. Patent Publication Nos. US2014-0310830, US2014-0287938A1, US2014-0273234A1, US2014-0273232A1, US2014-0273231A1, US2014-0256046A1, US2014-0248702A1, US2014-0242700A1, US2014-0242699A1, US2014-0242664A1, US2014-0234972A1, US2014-0227787A1, US2014-0189896A1, US2014-0186958, US2014-0186919A1, US2014-0186843A1, US2014-0179770A1, US2014-0179006A1, and US2014-0170753; European Patents EP 2784162 B1 and EP 2771468 B1; European Patent Applications EP 2771468, EP 2764103, and EP 2784162; and PCT Patent Publications WO 2021/183807A1 (PCT/US2021/021973), WO 2014/093661 (PCT/US2013/074743), WO 2014/093694 (PCT/US2013/074790), WO 2014/093595 (PCT/US2013/074611), WO 2014/093718 (PCT/US2013/074825), WO 2014/093709 (PCT/US2013/074812), WO 2014/093622 (PCT/US2013/074667), WO 2014/093635 (PCT/US2013/074691), WO 2014/093655 (PCT/US2013/074736), WO 2014/093712 (PCT/US2013/074819), WO 2014/093701 (PCT/US2013/074800), WO 2014/018423 (PCT/US2013/051418), WO 2014/204723 (PCT/US2014/041790), WO 2014/204724 (PCT/US2014/041800), WO 2014/204725 (PCT/US2014/041803), WO 2014/204726 (PCT/US2014/041804), WO 2014/204727 (PCT/US2014/041806), WO 2014/204728 (PCT/US2014/041808), and WO 2014/204729 (PCT/US2014/041809), each of which is incorporated herein by reference in its entirety.
[0089] Cas protein
[0090] Cas proteins (e.g., engineered Cas proteins) may have substantially the same nuclease activity as the corresponding wild-type Cas protein (e.g., between 80% and 100%, between 90% and 100%, between 95% and 100%, between 98% and 100%, between 99% and 100%, between 99.9% and 100%, or about 100% of that activity). In some cases, engineered Cas proteins have higher nuclease activity than the corresponding wild-type Cas protein (e.g., at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% higher).
[0091] Alternatively or additionally, the Cas protein (e.g., an engineered Cas protein) may have a specificity that is at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% higher than that of the corresponding wild-type Cas protein. In specific instances, the Cas protein (e.g., an engineered Cas protein) has a specificity that is at least 30% higher than that of the corresponding wild-type Cas protein. As used herein, the term “specificity” of a Cas protein may correspond to the number or percentage of on-target polynucleotide cleavage events relative to all polynucleotide cleavage events (including on-target and off-target events). The activity and specificity of the Cas protein may be as described in the following literature: Hsu PD et al., DNA targeting specificity of RNA-guided Cas9 nucleases, Nat Biotechnol. Sep 2013; 31(9):827-832 and Slaymaker IM et al., Rationally engineered Cas9 nucleases with improved specificity, Science. Jan 1 2016; 351(6268):84-88. Examples of methods for detecting the activity and specificity of the Cas protein are also described in those references, which are incorporated herein by reference in their entirety, and elsewhere herein.
[0092] In some embodiments, the Cas protein (e.g., its RuvC domain) may slide one base upstream (relative to the PAM) and produce a staggered cleavage, which can be filled in and result in the duplication of a single base (i.e., a +1 insertion). Examples of +1 insertion sites are described in Zuo, Z. and Liu, J. (2016) Cas9-catalyzed DNA Cleavage Generates Staggered Ends: Evidence from Molecular Dynamics Simulations. Scientific Reports 6, 37584. In some embodiments, engineered Cas proteins have a different +1 insertion frequency than their wild-type counterparts. For example, the +1 insertion frequency is higher when guanine is present at the -2 position relative to the PAM than when thymine, cytosine, or adenine is present at that position. In some cases, the +1 insertion depends on host mechanisms in human cells. In some instances, the Cas protein may produce a staggered cleavage. The staggered cleavage may leave a 5' overhang of 1 nucleotide. The staggered cleavage may leave a 3' overhang of 1 nucleotide.
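The definition of specificity given above (on-target cleavage events relative to all cleavage events) can be written out as a short sketch. The function names and event counts are hypothetical and serve only to show the arithmetic; they are not data from this disclosure.

```python
def cleavage_specificity(on_target_events: int, off_target_events: int) -> float:
    """Fraction of all cleavage events that occurred on-target."""
    total = on_target_events + off_target_events
    if total <= 0:
        raise ValueError("at least one cleavage event is required")
    return on_target_events / total


def percent_higher(engineered: float, wild_type: float) -> float:
    """Relative increase of the engineered value over wild type, in percent."""
    if wild_type <= 0:
        raise ValueError("wild-type value must be positive")
    return 100.0 * (engineered - wild_type) / wild_type


# Hypothetical event counts, for illustration only.
wild_type = cleavage_specificity(on_target_events=60, off_target_events=40)
engineered = cleavage_specificity(on_target_events=90, off_target_events=10)
print(wild_type, engineered, percent_higher(engineered, wild_type))
```

With these made-up counts, the engineered protein would meet the "at least 30% higher" criterion; real values would come from assays such as those in the cited references.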
[0093] Codon optimization can be performed on nucleic acid molecules encoding Cas. Examples of codon-optimized sequences in this context are sequences optimized for expression in eukaryotes such as humans (i.e., optimized for expression in humans), or sequences optimized for another eukaryote such as the animals or mammals discussed herein; see, for example, the SaCas9 human codon-optimized sequence in WO 2014/093622 (PCT/US2013/074667). While this is preferred, it should be understood that other examples are possible, and codon optimization for host species other than humans or for specific organs is known. In some embodiments, the enzyme-coding sequence encoding Cas is codon-optimized for expression in specific cells such as eukaryotic cells. Eukaryotic cells can be cells of, or derived from, a specific organism, such as a mammal, including but not limited to a human, or a non-human eukaryote, animal, or mammal as described herein, such as a mouse, rat, rabbit, dog, livestock animal, or non-human mammal or primate. In some embodiments, methods for altering human germline genetic identity and/or methods for altering the genetic identity of animals that are likely to cause them suffering without any substantial medical benefit to humans or animals, and animals produced by such methods, may be excluded. Generally, codon optimization refers to the process of modifying a nucleic acid sequence to enhance expression in a target host cell by replacing at least one codon of the natural sequence (e.g., about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) with codons that are more frequently or most frequently used in the genes of that host cell, while maintaining the natural amino acid sequence. Different species exhibit specific biases toward specific codons of specific amino acids. 
Codon bias (differences in codon usage between organisms) is generally associated with the translation efficiency of messenger RNA (mRNA), which is thought to depend, among other things, on the nature of the codons being translated and the availability of specific transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell generally reflects the codons most frequently used in peptide synthesis. Therefore, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example in the "Codon Usage Database" at www.kazusa.or.jp/codon/, and these tables can be adapted in various ways. See Nakamura, Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000," Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimization of specific sequences for expression in specific host cells are also available, such as Gene Forge (Aptagen; Jacobus, PA). In some embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more, or all codons) in the sequence encoding Cas correspond to the most frequently used codon for a specific amino acid.
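The codon replacement described above can be sketched as follows. This is a minimal illustration, not the method of any tool named in this disclosure: the genetic-code subset covers only four amino acid/stop classes, and the usage frequencies are placeholder values invented for the example, not entries from a real codon usage table.

```python
# Subset of the standard genetic code (DNA alphabet), enough for the toy input.
CODON_TO_AA = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "ATG": "M",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# PLACEHOLDER frequencies for illustration only; NOT real host data.
PLACEHOLDER_USAGE = {
    "GCT": 0.2, "GCC": 0.4, "GCA": 0.3, "GCG": 0.1,
    "AAA": 0.4, "AAG": 0.6,
    "ATG": 1.0,
    "TAA": 0.3, "TAG": 0.2, "TGA": 0.5,
}


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into codons, requiring a whole reading frame."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3].upper() for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate using the toy genetic-code subset above."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(cds))


def optimize_codons(cds: str, usage: dict[str, float]) -> str:
    """Replace every codon with the most-used synonymous codon in `usage`."""
    best: dict[str, str] = {}
    for codon, amino_acid in CODON_TO_AA.items():
        current = best.get(amino_acid)
        if current is None or usage[codon] > usage[current]:
            best[amino_acid] = codon
    return "".join(best[CODON_TO_AA[codon]] for codon in split_codons(cds))


original = "ATGGCTAAATAA"
optimized = optimize_codons(original, PLACEHOLDER_USAGE)
print(optimized, translate(original) == translate(optimized))
```

The final comparison checks the property the paragraph emphasizes: the nucleotide sequence changes while the encoded amino acid sequence is maintained. Replacing only a chosen number of codons, rather than all of them, would be a straightforward variation.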
[0094] In some embodiments, the Cas protein may have nucleic acid cleavage activity. The Cas protein may have RNA binding and DNA cleavage functions. In some embodiments, Cas may direct the cleavage of one or two nucleic acid strands at or near a target sequence location, such as within the target sequence and/or within the complementary sequence of the target sequence or at a sequence associated with the target sequence, for example, within approximately 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500 or more base pairs from the first or last nucleotide of the target sequence. In some embodiments, the Cas protein can direct more than one cleavage (e.g., one, two, three, four, five, or more cleavages) of one or both strands within the target sequence and/or its complementary sequence or a sequence associated with the target sequence, and/or within approximately 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, or more base pairs from the first or last nucleotide of the target sequence. In some embodiments, the cleavage can be blunt, i.e., producing blunt ends. In some embodiments, the cleavage can be staggered, i.e., producing sticky ends.
[0095] In some embodiments, the vector encodes a nucleic acid-targeting Cas protein that may be mutated relative to the corresponding wild-type enzyme such that the mutated nucleic acid-targeting Cas protein lacks the ability to cleave one or both strands of a target polynucleotide containing the target sequence. For example, an alteration or mutation in the HNH domain produces a mutated Cas protein that substantially lacks all DNA cleavage activity, for example, where the DNA cleavage activity of the mutated enzyme is no more than about 25%, 10%, 5%, 1%, 0.1%, 0.01%, or less of the cleavage activity of the unmutated form of the enzyme; an example is when the cleavage activity of the mutated form is zero or negligible compared to the unmutated form. As used herein, the term “derived” with respect to an enzyme means that the derived enzyme is largely based on the wild-type enzyme, in the sense of having high sequence homology with it, but has been mutated (modified) in some manner known in the art or described herein.
[0096] Typically, in the case of endogenous nucleic acid targeting systems, the formation of a nucleic acid targeting complex (containing a guide RNA or crRNA that hybridizes to the target sequence and is complexed with one or more nucleic acid targeting effector proteins) results in the cleavage of one or more DNA strands in or near the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more base pairs). As used herein, the term “one or more sequences associated with a target locus” refers to a sequence close to the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more base pairs from the target sequence, wherein the target sequence is contained within the target locus).
[0097] It should be understood that effector proteins are based on or derived from enzymes, and therefore in some embodiments the term "effector protein" naturally includes "enzyme". However, it should also be understood that, as required in some embodiments, effector proteins may have DNA or RNA binding activity but not necessarily cleavage or nicking activity, including dead-Cas (dCas) protein function.
[0098] In some embodiments, the Cas protein may form a component of an inducible system. The inducible nature of this system would allow for spatiotemporal control of gene editing or gene expression using a form of energy. The form of energy may include, but is not limited to, electromagnetic radiation, acoustic energy, chemical energy, and thermal energy. Examples of inducible systems include tetracycline-inducible promoters (Tet-On or Tet-Off), small-molecule two-hybrid transcriptional activation systems (FKBP, ABA, etc.), or light-inducible systems (phytochrome, LOV domain, or cryptochrome). In one embodiment, a CRISPR effector protein may be part of a light-inducible transcriptional effector (LITE) that directs changes in transcriptional activity in a sequence-specific manner. The components of the light-inducible system may include a CRISPR effector protein, a light-responsive cryptochrome heterodimer (e.g., from Arabidopsis thaliana), and a transcriptional activation/repression domain. Other examples of inducible DNA-binding proteins and methods of use thereof are provided in US 61/736,465 and US 61/721,283, and WO 2014018423 A2 (which is hereby incorporated herein by reference in its entirety).
[0099] In some embodiments, the mutated Cas may have one or more mutations that reduce off-target effects. Examples include improved CRISPR enzymes that modify the target locus while reducing or eliminating off-target activity (e.g., when complexed with guide RNA), and improved CRISPR enzymes with enhanced CRISPR enzyme activity (e.g., when complexed with guide RNA). It should be understood that the mutated enzymes described below can be used in any method according to this disclosure described elsewhere herein. Any methods, products, compositions, and uses described elsewhere herein are equally applicable to the mutated CRISPR enzymes further detailed below.
[0100] Methods and mutations that can be used in various combinations to enhance or reduce the activity and/or specificity of on-target activity compared to off-target activity, or to enhance or reduce the binding and/or specificity of on-target binding compared to off-target binding, can be used to compensate for or enhance mutations or modifications made to promote other effects. Such mutations or modifications made to promote other effects include mutations or modifications to Cas and/or mutations or modifications to the guide RNA. The methods and mutations disclosed herein are used to regulate Cas nuclease activity and/or binding to chemically modified guide RNA.
[0101] In some embodiments, the catalytic activity of the Cas protein of this disclosure is altered or modified. It should be understood that the mutated Cas possesses altered or modified catalytic activity if its catalytic activity differs from that of the corresponding wild-type Cas protein (e.g., unmutated Cas protein). Catalytic activity can be determined by methods known in the art. For example, and without limitation, catalytic activity can be determined in vitro or in vivo by measuring the percentage of insertions/deletions (e.g., after a given time, or at a given dose). In some embodiments, catalytic activity is enhanced. In some embodiments, the catalytic activity is enhanced by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In some embodiments, catalytic activity is reduced. In some embodiments, the catalytic activity is reduced by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or (substantially) 100%. One or more mutations described herein can inactivate the catalytic activity, that is, substantially abolish all catalytic activity, reducing it to below detectable levels or to no measurable catalytic activity.
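The indel-based readout mentioned above can be sketched as a simple calculation. The function names and read counts are hypothetical; the sketch assumes that edited and total sequencing reads at the target site have already been counted by some upstream analysis, which is not shown.

```python
def indel_percentage(reads_with_indel: int, total_reads: int) -> float:
    """Percentage of reads at the target site carrying an insertion/deletion."""
    if total_reads <= 0 or not 0 <= reads_with_indel <= total_reads:
        raise ValueError("need 0 <= reads_with_indel <= total_reads, total_reads > 0")
    return 100.0 * reads_with_indel / total_reads


def activity_change_percent(mutant_indel_pct: float, wild_type_indel_pct: float) -> float:
    """Signed change in activity relative to wild type (negative means reduced)."""
    if wild_type_indel_pct <= 0:
        raise ValueError("wild-type indel percentage must be positive")
    return 100.0 * (mutant_indel_pct - wild_type_indel_pct) / wild_type_indel_pct


# Hypothetical read counts measured at the same time point and dose.
wild_type_pct = indel_percentage(reads_with_indel=500, total_reads=1000)
mutant_pct = indel_percentage(reads_with_indel=250, total_reads=1000)
print(wild_type_pct, mutant_pct, activity_change_percent(mutant_pct, wild_type_pct))
```

A negative result corresponds to reduced catalytic activity and a positive result to enhanced activity, which can then be compared against the percentage thresholds listed in the paragraph above.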
[0102] One or more characteristics of an engineered Cas protein may differ from those of the corresponding wild-type Cas protein. Examples of such characteristics include catalytic activity, gRNA binding, Cas protein specificity (e.g., specificity of editing a defined target), Cas protein stability, off-target binding, target binding, protease activity, nickase activity, and PFS recognition. In some instances, the engineered Cas protein may contain one or more mutations relative to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein exhibits enhanced catalytic activity compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein exhibits decreased catalytic activity compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein exhibits increased gRNA binding compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein exhibits decreased gRNA binding compared to the corresponding wild-type Cas protein. In some embodiments, the Cas protein exhibits enhanced specificity compared to the corresponding wild-type Cas protein. In some embodiments, the Cas protein exhibits decreased specificity compared to the corresponding wild-type Cas protein. In some embodiments, the Cas protein exhibits enhanced stability compared to the corresponding wild-type Cas protein. In some embodiments, the Cas protein exhibits decreased stability compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein further comprises one or more mutations that inactivate catalytic activity. In some embodiments, off-target binding of the Cas protein is increased compared to the corresponding wild-type Cas protein. In some embodiments, off-target binding of the Cas protein is decreased compared to the corresponding wild-type Cas protein. 
In some embodiments, target binding of the Cas protein is increased compared to the corresponding wild-type Cas protein. In some embodiments, target binding of the Cas protein is decreased compared to the corresponding wild-type Cas protein. In some embodiments, the engineered Cas protein has higher protease activity or polynucleotide binding capacity compared to the corresponding wild-type Cas protein. In some embodiments, PFS recognition is altered compared to the corresponding wild-type Cas protein.
[0103] Examples of Cas proteins
[0104] Examples of Cas proteins include class 1 (e.g., types I, III, and IV) and class 2 (e.g., types II, V, and VI) Cas proteins, such as Cas9, Cas12 (e.g., Cas12a, Cas12b, Cas12c, Cas12d), Cas13 (e.g., Cas13a, Cas13b, Cas13c, Cas13d), CasX, CasY, Cas14, their variants (e.g., mutant forms, truncated forms), their homologs, and their orthologs. The terms "ortholog" and "homolog" are well known in the art. By way of further guidance, a "homolog" of a protein, as used herein, is a protein of the same species that performs the same or a similar function as the protein of which it is a homolog. Homologous proteins may be, but need not be, structurally related, or may be only partially structurally related. An "ortholog" of a protein, as used herein, is a protein of a different species that performs the same or a similar function as the protein of which it is an ortholog. Orthologous proteins may be, but need not be, structurally related, or may be only partially structurally related.
[0105] Class 2 Cas proteins
[0106] In some embodiments, the Cas protein is a class 2 Cas protein, i.e., a Cas protein of a class 2 CRISPR-Cas system. The class 2 CRISPR-Cas system can be of a subtype such as type II-A, type II-B, type II-C, type V-A, type V-B, type V-C, or type V-U. In some embodiments, the Cas protein is Cas9, Cas12a, Cas12b, Cas12c, or Cas12d. In some embodiments, Cas9 can be SpCas9, SaCas9, StCas9, or another Cas9 ortholog. Cas12 can be Cas12a, Cas12b, or Cas12c, including FnCas12a, or a homolog or ortholog thereof. The definitions and exemplary members of CRISPR-Cas systems are included in the following literature: Kira S. Makarova and Eugene V. Koonin, Annotation and Classification of CRISPR-Cas systems, Methods Mol Biol. 2015; 1311:47-75; and Sergey Shmakov et al., Diversity and evolution of class 2 CRISPR-Cas systems, Nat Rev Microbiol. 2017 Mar; 15(3):169-182.
[0107] Cas protein linkers
[0108] In some instances, the Cas protein comprises at least one RuvC domain and at least one HNH domain. The Cas protein may further comprise first and second linker domains connecting the RuvC and HNH domains. The first linker (L1) and second linker (L2) connecting the HNH and RuvC domains in Cas9 are described in the following studies: Nishimasu, H. et al. “Crystal structure of Cas9 in complex with guide RNA and target DNA” Cell 156 (Feb. 27, 2014): 935-949 and Ribeiro, L. et al. (2018) “Protein engineering strategies to expand CRISPR-Cas9 applications” International Journal of Genomics Volume 2018, Article ID 1652567 (doi.org/10.1155/2018/1652567). Figure 1 of Ribeiro, which is incorporated herein by reference, illustrates the overall organization, structure, and function of Cas9. Specifically, Figure 1A shows a schematic diagram of the domain organization of SpCas9, indicating the arrangement of the HNH and RuvC domains, including the linkers L1 (spanning amino acids 765-780) and L2 (spanning amino acids 906-918), as described herein.
[0109] Similarly, the domain organization of Staphylococcus aureus Cas9 (SaCas9) can be used when referring to the first and second linker domains. In one aspect, the linker 1 region spans residues 481-519 and links the RuvC-II domain to the HNH domain in SaCas9. In some embodiments, the linker 2 region spans residues 629-649 and links the RuvC-III domain and the HNH domain of SaCas9. Thus, the first and/or second linker domains in Cas9 orthologs can be mutated, and the mutations can be referenced to amino acid residues corresponding to the amino acids in wild-type SaCas9. See Nishimasu et al., Cell. 2015 Aug 27; 162(5):1113-1126; doi:10.1016/j.cell.2015.08.007, which is incorporated herein by reference. Specifically, Figures 1 and S1-S3 of Nishimasu, which are incorporated herein by reference, provide a detailed description of the domain organization of the Cas9 protein.
[0110] The first and second linkers may each contain about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or more amino acids. The first and second linkers may correspond to the wild-type linkers. In some aspects, the first and/or second linker may contain one or more mutations. In one aspect, the first and/or second linker contains one or more mutations that enhance the specificity of the Cas9 protein.
[0111] In some embodiments, the linkers L1 and L2 connecting the HNH and RuvC domains of Cas9 contain wild-type amino acid sequences. In some embodiments, the linkers connecting the HNH and RuvC domains contain mutations in one or more amino acids. In one exemplary embodiment, the first linker (L1) contains a mutation corresponding to the amino acid T769I of SpCas9, and / or the second linker (L2) contains a mutation corresponding to the amino acid G915M of SpCas9. In one exemplary embodiment, one or more linker mutations, such as T769I and G915M, confer improved specificity for the Cas9 protein.
[0112] In one embodiment, one or more mutations in the first and second linkers may be combined with one or more mutations in other parts of the Cas9 protein to further improve specificity and / or maintain activity substantially equivalent to that of the wild-type Cas9 protein, as described herein. In one embodiment, mutations in the linkers and / or additional mutations within the Cas protein may be identified using methods detailed herein that enhance / improve specificity and substantially maintain the wild-type activity of wild-type Cas9.
[0113] Type II Cas proteins (e.g., Cas9)
[0114] In some embodiments, the Cas protein may be a Cas protein of a type II CRISPR-Cas system (a type II Cas protein). In some embodiments, the Cas protein may be a type II Cas protein, such as Cas9. In some embodiments, the CRISPR / Cas9-based system may include a Cas9 protein or a fragment thereof, a Cas9 fusion protein, a nucleic acid encoding a Cas9 protein or a fragment thereof, or a nucleic acid encoding a Cas9 fusion protein. "Cas9 (CRISPR-associated protein 9)" refers to a polypeptide or fragment thereof having at least about 85% amino acid identity with NCBI accession number NP_269215 and possessing RNA-binding activity, DNA-binding activity, and / or DNA-cutting activity (e.g., endonuclease or cleavage enzyme activity). "Cas9 function" can be defined by any of a variety of assays, including but not limited to fluorescence polarization-based nucleic acid binding assays, fluorescence polarization-based chain invasion assays, transcription assays, EGFP destruction assays, DNA-cutting assays, and / or Surveyor assays, as described herein. "Cas 9 nucleic acid molecule" refers to a polynucleotide encoding a Cas9 polypeptide or a fragment thereof. An exemplary Cas9 nucleic acid sequence is provided at genome sequence number NC_002737. In some embodiments, inhibitors of Cas9, such as those naturally occurring in *Streptococcus pyogenes* (SpCas9) or *Staphylococcus aureus* (SaCas9), or variants thereof, are disclosed herein. Cas9 recognizes foreign DNA by pairing its prespacer neighbor motif (PAM) sequence and guide RNA (gRNA) with the target DNA. The relative ease with which Cas9 induces target strand breaks at any genomic site enables efficient genome editing across a wide range of cell types and organisms. Cas9 derivatives can also be used as transcriptional activators / repressors.
[0115] In some cases, the CRISPR-Cas protein is Cas9 or a variant thereof. In some instances, Cas9 can be wild-type Cas9, including any naturally occurring bacterial Cas9. Cas9 orthologs typically share a common organization of 3-4 RuvC domains and one HNH domain. The 5'-most RuvC domain cleaves the non-complementary strand, and the HNH domain cleaves the complementary strand; both designations are made with reference to the guide sequence. The catalytic residue in the 5' RuvC domain is identified by comparing the Cas9 of interest with other Cas9 orthologs (from the *Streptococcus pyogenes* type II CRISPR locus, *Streptococcus thermophilus* CRISPR locus 1, *Streptococcus thermophilus* CRISPR locus 3, and *Francisella novicida* type II CRISPR locus), and the conserved Asp residue (D10) is mutated to alanine to convert Cas9 into a complementary-strand nicking enzyme. CRISPR, Cas, or Cas9 enzymes can be codon-optimized or modified versions, including any chimera, mutant, homolog, or ortholog. In another aspect of this disclosure, the Cas9 enzyme can contain one or more mutations and can be used as a generic DNA-binding protein with or without fusion to a functional domain.
[0116] The mutation may be an artificially introduced mutation or a gain-of-function or loss-of-function mutation. In some embodiments, the transcriptional activation domain may be VP64. In some embodiments, the transcriptional repressor domain may be KRAB or SID4X. Other aspects of this disclosure relate to mutated Cas9 enzymes fused with domains, including but not limited to nucleases, transcriptional activators, repressors, recombinases, transposases, histone remodelers, demethylases, DNA methyltransferases, cryptochromes, photoinducible / controllable domains, or chemically inducible / controllable domains. This disclosure may relate to sgRNA, tracrRNA, guide, or chimeric guide sequences that allow for enhanced performance of these RNAs in the cell. Such type II CRISPR enzymes may be any Cas enzyme. In some cases, the Cas9 enzyme is SpCas9 or SaCas9, or is derived from SpCas9 or SaCas9. As used herein, the term “derived” means, with respect to an enzyme, that the derived enzyme is largely based on the wild-type enzyme (in the sense of high sequence homology with the wild-type enzyme), but has been mutated (modified) in some manner known in the art or described herein. In one instance, the mutation may include one or more mutations in the first linker domain, the second linker domain, and / or other parts of the protein. High sequence homology may include at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher sequence identity relative to the wild-type enzyme.
[0117] Cas enzymes can be identified as Cas9, as this can refer to a general class of enzymes that share homology with the largest nuclease from the type II CRISPR system, which possesses multiple nuclease domains. In some cases, the Cas9 enzyme is SpCas9 (Streptococcus pyogenes Cas9) or SaCas9 (Staphylococcus aureus Cas9), or is derived from either. “StCas9” refers to the wild-type Cas9 (UniProt ID: G3ECR1) from Streptococcus thermophilus. Similarly, “SpCas9” refers to the wild-type Cas9 (UniProt ID: Q99ZW2) from Streptococcus pyogenes. As used herein, the term “derived” means, with respect to an enzyme, that the derived enzyme is largely based on the wild-type enzyme (in the sense of high sequence homology with the wild-type enzyme) but has been mutated (modified) in some manner known in the art or described herein. It should be understood that the terms Cas and CRISPR enzyme are generally used interchangeably herein unless explicitly stated otherwise. As mentioned above, many residue numbers used herein refer to the Cas9 enzyme from the type II CRISPR locus in Streptococcus pyogenes.
[0118] In certain embodiments, the effector protein is a Cas9 effector protein derived from or originating from the following genera: Streptococcus, Campylobacter, Nitratifractor, Staphylococcus, Parvibaculum, Roseburia, Neisseria, Gluconacetobacter, Azospirillum, Sphaerochaeta, Lactobacillus, Eubacterium, Corynebacterium, Carnobacterium, Rhodobacter, Listeria, Paludibacter, Clostridium, Lachnospiraceae, Clostridiaridium, Leptotrichia, Francisella, Legionella, Alicyclobacillus, Methanomethyophilus, Porphyromonas, Prevotella, Bacteroidetes, Helcococcus, Letospira, Desulfovibrio, Desulfonatronum, Opitutaceae, Tuberibacillus, Bacillus, Brevibacilus, Methylobacterium, or Acidaminococcus; or Streptococcus, Campylobacter, Nitratifractor, Staphylococcus, Parvibaculum, Roseburia, Neisseria, Gluconacetobacter, Azospirillum, Sphaerochaeta, Lactobacillus, Eubacterium, Corynebacterium, Sutterella, Legionella, Treponema, Filifactor, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, *Flaviivola*, *Flavobacterium*, *Sphaerochaeta*, *Azospirillum*, *Gluconacetobacter*, *Neisseria*, *Roseburia*, *Parvibaculum*, *Staphylococcus*, *Nitratifractor*, *Mycoplasma*, or *Campylobacter*.
[0119] In some embodiments, the Cas9 protein is derived from or originates from organisms selected from: *Streptococcus mutans*, *Streptococcus agalactiae*, *Streptococcus equisimilis*, *Streptococcus sanguinis*, *Streptococcus pneumoniae*, *Campylobacter jejuni*, *Campylobacter coli*, *N. salsuginis*, *N. tergarcus*, *Staphylococcus auricularis*, *Staphylococcus carnosus*, *Neisseria meningitidis*, *Neisseria gonorrhoeae*, *Listeria monocytogenes*, *Listeria ivanovii*, *Clostridium botulinum*, *Clostridium difficile*, *Clostridium tetani*, *Clostridium sordellii*, *Francisella tularensis* 1, *Francisella tularensis* subsp. *novicida*, *Prevotella albensis*, *Lachnospiraceae bacterium* MC2017 1, *Butyrivibrio proteoclasticus*, *Peregrinibacteria bacterium* GW2011 GWA2_33_10, *Parcubacteria bacterium* GW2011 GWC2_44_17, *Smithella* sp. SCADC, *Acidaminococcus* sp. BV3L6, *Lachnospiraceae bacterium* MA2020, *Candidatus Methanoplasma termitum*, *Eubacterium eligens*, *Moraxella bovoculi* 237, *Leptospira inadai*, *Lachnospiraceae bacterium* ND2006, *Porphyromonas crevioricanis* 3, *Prevotella disiens*, and *Porphyromonas macacae*. In some embodiments, the Cas9 protein is Cas9 derived from or originating from Streptococcus pyogenes, Staphylococcus aureus, or Streptococcus thermophilus.
[0120] In a more preferred embodiment, the Cas9 protein is derived from a bacterial species selected from Streptococcus pyogenes, Staphylococcus aureus, or Streptococcus thermophilus. In some embodiments, the Cas9 is derived from bacterial species selected from the following: *Francisella tularensis* 1, *Prevotella albensis*, *Lachnospiraceae bacterium* MC2017 1, *Butyrivibrio proteoclasticus*, *Peregrinibacteria bacterium* GW2011 GWA2_33_10, *Parcubacteria bacterium* GW2011 GWC2_44_17, *Smithella* sp. SCADC, *Acidaminococcus* sp. BV3L6, *Lachnospiraceae bacterium* MA2020, *Candidatus Methanoplasma termitum*, *Eubacterium eligens*, *Moraxella bovoculi* 237, *Leptospira inadai*, *Lachnospiraceae bacterium* ND2006, *Porphyromonas crevioricanis* 3, *Prevotella disiens*, and *Porphyromonas macacae*. In some embodiments, the Cas9 protein is derived from bacterial species selected from *Acidaminococcus sp.* BV3L6 and *Lachnospiraceae bacterium* MA2020. In some embodiments, the effector protein is derived from a subspecies of Francisella tularensis, including but not limited to Francisella tularensis subsp. novicida.
[0121] Cas9 enzymes include, but are not limited to, *Streptococcus pyogenes* serotype M1 Cas9 (UniProt ID: Q99ZW2), *Staphylococcus aureus* Cas9 (UniProt ID: J7RUA5), *Eubacterium ventriosum* Cas9 (UniProt ID: A5Z395), *Azospirillum* (strain B510) Cas9 (UniProt ID: D3NT09), *Gluconacetobacter diazotrophicus* (strain ATCC 49037) Cas9 (UniProt ID: A9HKP2), *Neisseria cinerea* Cas9 (UniProt ID: D0W2Z9), *Roseburia intestinalis* Cas9 (UniProt ID: C7G697), *Parvibaculum lavamentivorans* (strain DS-1) Cas9 (UniProt ID: A7HP89), *Nitratifractor salsuginis* (strain DSM 16511) Cas9 (UniProt ID: E6WZS9), and *Campylobacter lari* Cas9 (UniProt ID: G1UFN3).
[0122] Enzymatic action of Cas9 derived from *Streptococcus pyogenes*, or any closely related Cas9, produces a double-strand break at a target site sequence that hybridizes to 20 nucleotides of the guide sequence and is followed by a protospacer adjacent motif (PAM) sequence (examples include NGG / NRG or a PAM that can be identified as described herein). CRISPR activity for site-specific DNA recognition and cleavage via Cas9 is defined by the guide sequence, the tracr sequence that hybridizes in part to the guide sequence, and the PAM sequence. Further aspects of the CRISPR system are described in Karginov and Hannon, *The CRISPR system: small RNA-guided defense in bacteria and archaea*, *Mol Cell* 2010, January 15; 37(1):7. The type II CRISPR locus from *Streptococcus pyogenes* SF370 contains a cluster of four genes (Cas9, Cas1, Cas2, and Csn1) along with two non-coding RNA elements: tracrRNA and a characteristic array of repetitive sequences (direct repeats) separated by short non-repetitive sequences (spacers, each approximately 30 bp). In this system, targeted DNA double-strand breaks (DSBs) are generated in four consecutive steps. First, two non-coding RNAs, the pre-crRNA array and tracrRNA, are transcribed from the CRISPR locus. Second, the tracrRNA hybridizes with the direct repeats of the pre-crRNA, which is then processed into mature crRNAs containing the individual spacer sequences. Third, the mature crRNA:tracrRNA complex directs Cas9 to the DNA target, composed of the protospacer and the corresponding PAM, through heteroduplex formation between the spacer region of the crRNA and the protospacer DNA. Finally, Cas9 mediates cleavage of the target DNA upstream of the PAM, generating a DSB within the protospacer. A pre-crRNA array consisting of a single spacer flanked by two direct repeats (DRs) is also covered by the term "tracr-mate sequence".
In some embodiments, Cas9 may be constitutively present, inducibly present, conditionally present, administered, or delivered. Cas9 optimization can be used to enhance function or develop new functions. Chimeric Cas9 proteins can be generated, and Cas9 can be used as a generic DNA-binding protein. The structural information provided for Cas9 can be used for further engineering and optimization of the CRISPR-Cas system, and can also be extrapolated to infer structure-function relationships in other CRISPR enzyme systems, particularly other type II CRISPR enzymes or Cas9 orthologs. Crystal structure information (described in U.S. Provisional Application 61/915,251, filed December 12, 2013; 61/930,214, filed January 22, 2014; 61/980,012, filed April 15, 2014; and Nishimasu et al., “Crystal Structure of Cas9 in Complex with Guide RNA and Target DNA,” Cell 156(5):935-949, DOI: http://dx.doi.org/10.1016/j.cell.2014.02.001 (2014), each of which is incorporated herein by reference in its entirety) provides structural information for truncating and generating modular or multipart CRISPR enzymes that can be incorporated into inducible CRISPR-Cas systems. Specifically, it provides structural information for the *Streptococcus pyogenes* Cas9 (SpCas9), and this can be extrapolated to other Cas9 orthologs or other type II CRISPR enzymes. The Cas9 gene exists in several different bacterial genomes, typically at the same locus as the Cas1, Cas2, and Cas4 genes, as well as the CRISPR cassette. Furthermore, the Cas9 protein contains an easily identifiable C-terminal region homologous to the transposon ORF-B, and includes an active RuvC-like nuclease and an arginine-rich region.
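By way of illustration only, the target-site rule described above for *Streptococcus pyogenes* Cas9 (a 20-nucleotide protospacer immediately followed by an NGG PAM) can be sketched in Python as follows. The function name `find_spcas9_sites` and the example sequence are hypothetical and are not part of this disclosure; the sketch scans only the strand it is given and does not model NRG PAMs, mismatches, or chromatin context.

```python
def find_spcas9_sites(seq, guide_len=20):
    """Return (start, protospacer, pam) tuples for NGG PAM sites on the given strand.

    Assumption: only the canonical NGG PAM is considered, and only the
    strand passed in is scanned (no reverse-complement search).
    """
    seq = seq.upper()
    sites = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for start in range(len(seq) - guide_len - 2):
        protospacer = seq[start:start + guide_len]
        pam = seq[start + guide_len:start + guide_len + 3]
        if pam[1:] == "GG" and set(protospacer + pam) <= set("ACGT"):
            sites.append((start, protospacer, pam))
    return sites


# Hypothetical 25-nt sequence used only to exercise the function.
example = "ACGTACGTACGTACGTACGT" + "AGG" + "CC"
candidate_sites = find_spcas9_sites(example)
```

In this sketch, each returned protospacer is the DNA sequence to which a 20-nucleotide guide would be matched; an actual guide design would additionally screen candidates for off-target similarity.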
[0123] dCas9
[0124] The Cas9 protein can be mutated to inactivate nuclease activity. An inactivated Cas9 protein (iCas9, also known as "dCas9") from *S. pyogenes*, lacking endonuclease activity, has been targeted by gRNA to genes in bacteria, yeast, and human cells to silence gene expression through steric hindrance. As used herein, "dCas molecule" can refer to the dCas protein or a fragment thereof. As used herein, "dCas9 molecule" can refer to the dCas9 protein or a fragment thereof. The terms "iCas" and "dCas" are used interchangeably and refer to catalytically inactivated CRISPR-associated proteins. In one embodiment, the dCas molecule contains one or more mutations in the DNA cleavage domain. In one embodiment, the dCas molecule contains one or more mutations in the RuvC or HNH domain. In one embodiment, the dCas molecule contains one or more mutations in both the RuvC and HNH domains. In one embodiment, the dCas molecule is a fragment of a wild-type Cas molecule. In one embodiment, the dCas molecule comprises a functional domain derived from a wild-type Cas molecule, wherein the functional domain is selected from a Rec1 domain, a bridge helix domain, or a PAM-interacting domain. In one embodiment, the nuclease activity of the dCas molecule is reduced by at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% compared to the corresponding wild-type Cas molecule.
[0125] Suitable dCas molecules can be derived from wild-type Cas molecules. The Cas molecules can originate from type I, type II, or type III CRISPR-Cas systems. In one embodiment, suitable dCas molecules can be derived from Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, or Cas10 molecules. In one embodiment, the dCas molecule is derived from a Cas9 molecule. The dCas9 molecule can be obtained, for example, by introducing point mutations (e.g., substitution, deletion, or addition) into the DNA cleavage domains of the Cas9 molecule, such as the RuvC and / or HNH nuclease domains. See, for example, Jinek et al., Science (2012) 337:816-21, the entirety of which is incorporated herein by reference. For example, introducing two point mutations in the RuvC and HNH domains reduces Cas9 nuclease activity while preserving the sgRNA-binding and DNA-binding activity of Cas9. In one embodiment, the two point mutations within the RuvC and HNH active sites are D10A and H840A mutations in the *S. pyogenes* Cas9 molecule. Alternatively, D10 and H840 of the *S. pyogenes* Cas9 molecule can be deleted to inactivate the Cas9 nuclease activity while retaining its sgRNA and DNA binding activity. In one embodiment, the two point mutations within the RuvC and HNH active sites are D10A and N580A mutations in the *S. aureus* Cas9 molecule.
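As a non-limiting sketch of how such point mutations may be expressed computationally, the following Python example applies substitutions written in the conventional form (for example D10A or H840A) to a protein sequence and checks that the stated wild-type residue is actually present at the stated position. The helper name `apply_substitutions` is hypothetical, and the 900-residue `toy_sequence` is a placeholder, not the SpCas9 sequence or any sequence of this disclosure.

```python
import re


def apply_substitutions(protein, mutations):
    """Apply substitutions such as 'D10A' (1-based positions) to a protein string."""
    residues = list(protein)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation format: {mutation}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if position < 1 or position > len(residues):
            raise ValueError(f"position out of range: {mutation}")
        # Guard against numbering errors: the expected residue must be present.
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder sequence (assumption): glycine everywhere except D at 10 and H at 840.
toy = ["G"] * 900
toy[9] = "D"
toy[839] = "H"
toy_sequence = "".join(toy)

dcas9_like = apply_substitutions(toy_sequence, ["D10A", "H840A"])
```

The wild-type residue check is a design choice intended to catch the common error of applying residue numbering from one ortholog to another.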
[0126] In various embodiments, this disclosure relates to the dCas molecule or any variant or mutant thereof. All variants and mutants of dCas9 can be used in the methods or compositions disclosed herein, including but not limited to those derived from SpCas9 (Cas9 isolated from Streptococcus pyogenes), SaCas9 (Cas9 isolated from Staphylococcus aureus), StCas9 (Cas9 isolated from Streptococcus thermophilus), NmCas9 (Cas9 isolated from Neisseria meningitidis), FnCas9 (Cas9 isolated from Francisella novicida), CjCas9 (Cas9 isolated from Campylobacter jejuni), ScCas9 (Cas9 isolated from Streptococcus canis), and any variants and mutant forms of the Cas9 proteins listed above, such as high-fidelity Cas9 (Kleinstiver et al., Nature. January 28, 2016) and enhanced SpCas9 (Slaymaker et al., Science. January 1, 2016). This list provides only a few exemplary options and is not exhaustive.
[0127] In one embodiment, the dCas molecule is the dCas9 molecule numbered SEQ ID NO: 12.
[0128] In some embodiments, this disclosure provides a carrier or nucleotides encoding dCas9 molecules of Streptococcus pyogenes, Staphylococcus aureus, Campylobacter jejuni, Corynebacterium diphtheriae, Eubacterium ventriosum, Streptococcus pasteurianus, Lactobacillus farciminis, Sphaerochaeta globus, Azospirillum (strain B510), Gluconacetobacter diazotrophicus, *Neisseria cinerea*, *Roseburia intestinalis*, *Parvibaculum lavamentivorans*, *Nitratifractor salsuginis* (strain DSM 16511), *Campylobacter lari* (strain CF89-12), and *Streptococcus thermophilus* (strain LMD-9).
[0129] Exemplary dCas9 proteins include, but are not limited to, those listed in Table 1.
[0130] Table 1. Exemplary dCas9 proteins
[0131] Cas9 fusion protein
[0132] CRISPR / Cas9-based systems may include fusion compounds (e.g., DNMT3A-DNMT3L(3A3L)-dCas9-GCN4). The fusion compound may contain at least one DNA-binding protein (e.g., dCas9) and at least one gene expression regulator (e.g., GCN4, DNMT3A, DNMT3L, DNMT3A-DNMT3L fusion peptide).
[0133] III. DNA methylation domain
[0134] The methods and compositions disclosed herein include DNA methylation domains comprising dCas9 molecules fused with DNMT3A and DNMT3L or fragments thereof. In some embodiments, the methods and compositions disclosed herein include fusion molecules comprising dCas9 molecules fused with a DNMT3A-DNMT3L fusion peptide (SEQ ID NO: 38).
[0135] In one embodiment, the Cas9 fusion protein further includes a nuclear localization sequence (NLS), such as an NLS fused to the N-terminus and / or C-terminus of Cas9.
[0136] Nuclear localization sequences are known in the art. In one embodiment, the NLS comprises the amino acid sequences of SEQ ID NO: 31-33, sequences substantially identical to SEQ ID NO: 31-33 (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identity), sequences having 1, 2, 3, 4, 5 or more variations (e.g., amino acid substitutions, insertions or deletions) relative to SEQ ID NO: 31-33, or any fragment thereof.
[0137] In some embodiments, the CRISPR / Cas9-based system may include a dCas9 molecule and a gene expression regulator, or nucleic acids encoding a dCas9 molecule and a gene expression regulator. In one embodiment, the dCas9 molecule and the gene expression regulator are covalently linked. In one embodiment, the gene expression regulator is directly covalently fused to the dCas9 molecule. In one embodiment, the gene expression regulator is indirectly covalently fused to the dCas9 molecule, for example, via a non-regulator sequence, a linker, or a second regulator. In one embodiment, the gene expression regulator is located at the N-terminus and / or C-terminus of the dCas9 molecule. In one embodiment, the dCas9 molecule and the gene expression regulator are non-covalently linked. Exemplary sequences include, but are not limited to, those listed in Table 2. In some embodiments, the linker between the dCas9 and at least one gene expression regulator comprises the amino acid sequence shown in SEQ ID NO: 34-36. In one embodiment, the dCas9 molecule is fused to a first tag, such as a first peptide tag. In one embodiment, the gene expression regulator is fused to a second tag, such as a second peptide tag. In one embodiment, the first and second tags, such as the first peptide tag and the second peptide tag, interact non-covalently with each other, thereby bringing the dCas9 molecule and the gene expression regulator into close proximity.
[0138] In one embodiment, the CRISPR / Cas9-based system includes a fusion molecule or a nucleic acid encoding the fusion molecule. In one embodiment, the fusion molecule includes a sequence comprising dCas9 fused with a gene expression regulator. In one embodiment, the dCas9 molecule includes *Streptococcus pyogenes* dCas9 molecules, *Staphylococcus aureus* dCas9 molecules, *Campylobacter jejuni* dCas9 molecules, *Corynebacterium diphtheriae* dCas9 molecules, *Eubacterium ventriosum* dCas9 molecules, *Streptococcus pasteurianus* dCas9 molecules, *Lactobacillus farciminis* dCas9 molecules, *Sphaerochaeta globus* dCas9 molecules, *Azospirillum* (strain B510) dCas9 molecules, *Gluconacetobacter diazotrophicus* dCas9 molecules, *Neisseria cinerea* dCas9 molecules, *Roseburia intestinalis* dCas9 molecules, *Parvibaculum lavamentivorans* dCas9 molecules, *Nitratifractor salsuginis* (strain DSM 16511) dCas9 molecules, *Campylobacter lari* (strain CF89-12) dCas9 molecules, and *Streptococcus thermophilus* (strain LMD-9) dCas9 molecules, or fragments thereof.
[0139] In one embodiment, the fusion molecule is a DNMT3A-DNMT3L(3A3L)-dCas9-GCN4 fusion molecule (SEQ ID NO: 39), which comprises, from the N-terminus to the C-terminus, a DNMT3A-DNMT3L fusion peptide (3A3L), a dCas9 peptide, and a GCN4 peptide domain fused directly or indirectly (e.g., via a linker).
[0140] In one embodiment, the fusion molecule comprises the amino acid sequence of SEQ ID NO: 39, a sequence substantially identical to SEQ ID NO: 39 (e.g., sequence identity of at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher), or a sequence having one, two, three, four, five or more variations (e.g., substitution, insertion or deletion) relative to SEQ ID NO: 39, or any fragment thereof.
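The sequence identity thresholds recited above (e.g., at least 80% or at least 95%) may be evaluated in many ways. The following Python sketch is one simplified possibility and is offered only as an assumption-laden illustration: it takes two sequences that have already been aligned to equal length (with "-" marking gaps) and reports identical positions as a percentage of aligned columns. The function names and the short example sequences are hypothetical, and other identity definitions (for example, dividing by the shorter sequence length) would give different values.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns; '-' marks a gap.

    Assumption: the inputs were aligned beforehand by a separate tool.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=80.0):
    """True when the aligned pair reaches the stated percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue peptides differing at one position.
identity_example = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
```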
[0141] IV. Recruitment Domain
[0142] The first and second fusions of the epigenetic editing tools of this application form aggregated complexes through interactions between their respective recruitment domains. Therefore, this application provides non-limiting examples of combinations of recruitment domain A and recruitment domain A': (1) one of the recruitment domains A and A' has a domain of GCN4, and the other domain is scFv; or (2) one of the recruitment domains A and A' has a domain of a GFP11 fragment, and the other domain is GFP1-10; or (3) one of the recruitment domains A and A' has a domain of GVKESLV, and the other domain is a PDZ protein domain. Similarly, the situation where GFP11 and GFP1-10 are derived from splitting GFP to form recruitment domain A and recruitment domain A', respectively, can also be applied to other classes of fluorescent proteins, such as mCherry (GenBank: QSL83322.1), eYFP (GenBank: AAO48597.1), and eCFP (GenBank: AHJ09746.1). Different sets of recruitment domain A and recruitment domain A' can be obtained by splitting mCherry, eYFP, or eCFP for use in the complex provided in this application. In some embodiments, one of the first fusion and the second fusion of the complex of this application may contain two or more recruitment domains, which are linked by a linker sequence. The amino acid sequence of an exemplary recruitment domain may contain the amino acid sequence shown in SEQ ID NO: 7 or 30.
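For illustration, the three non-limiting recruitment domain combinations listed above can be represented as unordered pairs, so that either partner may sit on the first or the second fusion. The Python sketch below is a hypothetical bookkeeping aid; the names `RECRUITMENT_PAIRS` and `can_assemble` and the short string labels are assumptions chosen for readability, and the sketch says nothing about binding affinity or stoichiometry.

```python
# Unordered pairs: either member may be recruitment domain A or A'.
RECRUITMENT_PAIRS = [
    frozenset({"GCN4", "scFv"}),
    frozenset({"GFP11", "GFP1-10"}),
    frozenset({"GVKESLV", "PDZ"}),
]


def can_assemble(domain_a, domain_a_prime):
    """Return True when the two recruitment domains form one of the listed pairs."""
    return frozenset({domain_a, domain_a_prime}) in RECRUITMENT_PAIRS


# Example: a first fusion carrying GCN4 and a second fusion carrying scFv.
first_fusion_domain = "GCN4"
second_fusion_domain = "scFv"
pair_is_compatible = can_assemble(first_fusion_domain, second_fusion_domain)
```

Additional pairs, such as those obtained by splitting mCherry, eYFP, or eCFP, could be appended to the list in the same form.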
[0143] V. Transcriptional repressor domain
[0144] The "transcriptional repression domain" as used herein is selected from one or more of the following domains: KRAB, ZIM3, ZNF680, ZNF554, ZNF264, ZNF582, ZNF324, ZNF669, ZNF354A, ZNF82, ZNF595, ZNF419, ZNF566, ZIM2, EHMT2, SUV39H1, ZFPM1, TRIM28, EZH2, MXD1, SID, LSD1, HP1a, HDAC3, HDAC1, PRMT1, SETDB1, hSIRT1, ZNF436, ZNF257, ZNF675, ZNF490, ZNF320, ZNF331, ZNF816, ZNF41, ZNF189, ZNF528, ZNF543, ZNF140, ZNF610, ZNF350, ZNF8, ZNF30, ZNF98, ZNF677, ZNF596, ZNF214, ZNF37A, ZNF34, ZNF250, ZNF547, ZNF273, ZFP82, ZNF224, ZNF33A, ZNF45, ZNF175, ZNF184, ZFP28-1, ZFP28-2, ZNF18, ZNF213, ZNF394, ZFP1, ZFP14, ZNF416, ZNF557, ZNF729, ZNF254, ZNF764, ZNF785, ZNF10, CBX5, RYBP, YAF2, MGA, CBX1, SCMH1, MPP8, SUMO3, HERC2, BIN1, PCGF2, TOX, FOXA1, FOXA2, IRF2BP1, IRF2BP2, IRF2BPL IRF-2BP1_2 N-terminal domain, HOXA13, HOXB13, HOXC13, HOXA11, HOXC11, HOXC10, HOXA10, HOXB9, HOXA9, ZFP28, ZN334, ZN568, ZN37A, ZN181, ZN510, ZN862, ZN140, ZN208, ZN248, ZN571, ZN699, ZN726, ZIK1, ZNF2, Z705F, ZNF14, ZN471, ZN624, ZNF84, ZNF7, ZN891, ZN337, Z705G, ZN529, ZN729, ZN419, Z705A, ZN302, ZN486, ZN621, ZN688, ZN33A, ZN554, ZN878,
ZN772, ZN224, ZN184, ZN544, ZNF57, ZN283, ZN549, ZN211, ZN615, ZN253, ZN226, ZN730, Z585A, ZN732, ZN681,ZN667,ZN649,ZN470,ZN484,ZN431,ZN382,ZN254,ZN124,ZN607,ZN317,ZN620,ZN141,ZN584,ZN540,ZN75D,ZN555,ZN658,ZN684,RBAK,ZN829,ZN582,ZN112,ZN716,HKR1,ZN350,ZN480,ZN416,ZNF92,ZN100,ZN736,ZNF74,ZN443,ZN195,ZN530,ZN782,ZN791,ZN331,Z354C,ZN157,ZN727,ZN550,ZN793,ZN235,ZN724,ZN573,ZN577,ZN789,ZN718,ZN300,ZN383,ZN429,ZN677,ZN850,ZN454,ZN257,ZN264,ZN485,ZN737,ZNF44,ZN596,ZN565,ZN543,ZFP69,SUMO1,ZNF12,ZN169,ZN433,ZN175,ZN347,ZNF25,ZN519,Z585B,ZN517,ZN846,ZN230,ZNF66,ZN713,ZN816,ZN426,ZN674,ZN627,ZNF20,Z587B,ZN316,ZN233,ZN611,ZN556,ZN234,ZN560,ZNF77,ZN682,ZN614,ZN785,ZN445,ZFP30,ZN225,ZN551,ZN610,ZN528,ZN284,ZN418,ZN490,ZN805,Z780B,ZN763,ZN285,ZNF85,ZN223,ZNF90,ZN557,ZN425,ZN229,ZN606,ZN155,ZN222,ZN442,ZNF91,ZN135,ZN778,ZN534,ZN586,ZN567,ZN440,ZN583,ZN441,ZNF43,ZN589,ZN563,ZN561,ZN136,ZN630,ZN527,ZN333,Z324B,ZN786,ZN709,ZN792,ZN599,ZN613,ZF69B,ZN799,ZN569,ZN564,ZN546,ZFP92,ZN723,ZN439,ZFP57,ZNF19,ZN404,ZN274,CBX3,ZN250,ZN570,ZN675,ZN695,ZN548,ZN132,ZN738,ZN420,ZN626,ZN559,ZN460,ZN268,ZN304,ZN605,ZN844,SUMO5,ZN101,ZN783,ZN417,ZN182,ZN823,ZN177,ZN197,ZN717,ZN669,ZN256,ZN251,CBX4,CDY2,CDYL2,ZN562,ZN461,Z324A,ZN766,ID2,ZN214,CBX7,ID1,CREM,SCX,ASCL1,ZN764,SCML2,TWST1,CREB1,TERF1,ID3,CBX8,GSX1,NKX22,ATF1,TWST2,ZNF17,TOX3,TOX4,ZMYM3,I2BP1,RHXF1,SSX2,I2BPL,ZN680,TRI68,HXA13,PHC3,TCF24,HXB13,HEY1,PHC2,ZNF81,FIGLA,SAM11,KMT2B,HEY2,JDP2,HXC13,ASCL4,HHEX,GSX2,ETV7,ASCL3,PHC1,OTP,I2BP2,VGLL2,HXA11,PDLI4,ASCL2,CDX4,ZN860,LMBL4,PDIP3,NKX25,CEBPB,ISL1,CDX2,PROP1,SIN3B,SMBT1,HXC11,HXC10,PRS6A,VSX1,NKX23,MTG16,HMX3,HMX1,KIF22,CSTF2,CEBPE,DLX2,PPARG,PRIC1,UNC4,BARX2,ALX3,TCF15,TERA,VSX2,HXD12,CDX1,TCF23,ALX1,HXA10,RX,CXXC5,SCML1,NFIL3,DLX6,MTG8,CEBPD,SEC13,FIP1,ALX4,LHX3,PRIC2,MAGI3,NELL1,PRRX1,MTG8R,RAX2,DLX3,DLX1,NKX26,NAB1,SAMD7,PITX3,WDR5,MEOX2,NAB2,DHX8,CBX6,EMX2,CPSF6,HXC12,KDM4B,LMBL3,PHX2A,EMX1,NC2B,DLX4,SRY,ZN777,ZN398,GATA3,BSH,SF3B4,TEAD1,TEAD3,RGAP1,PHF1,GATA2,FOXO3,ZN212,IRX4,ZBED6,LHX4,
SIN3A,RBBP7,NKX61,R51A1,MB3L1,DLX5,NOTC1,TERF2,ZN282,RGS12,ZN840,SPI2B,PAX7,NKX62,ASXL2,FOXO1,GATA1,ZMYM5, LRP1, MIXL1, SGT1, LMCD1, CEBPA, SOX14, WTIP, PRP19, NKX11, RBBP4, DMRT2, SMCA2, and their functionally active fragments. In some specific embodiments, the transcriptional repression domain is ZIM3; in some more specific embodiments, the transcriptional repression domain comprises the amino acid sequence shown in SEQ ID NO:8.
[0145] gRNA
[0146] In the context of the CRISPR-Cas system, the term "guide sequence" as used herein includes a sequence that is sufficiently complementary to the PCSK9 nucleic acid sequence to hybridize with it and direct the nucleic acid targeting complex to specifically bind to the PCSK9 polynucleotide sequence. The guide sequence may form a duplex with the PCSK9 sequence. The duplex may be a DNA duplex, an RNA duplex, or an RNA / DNA duplex. The terms "guide molecule," "guide RNA," and "single guide RNA" are used interchangeably herein and refer to an RNA-based molecule capable of forming a complex with a CRISPR-Cas protein and containing a guide sequence that is sufficiently complementary to the target nucleic acid sequence to hybridize with it and direct sequence-specific binding of the complex to the target nucleic acid sequence. As described herein, the guide RNA (sgRNA) complementary to a DNA sequence near and / or within the PCSK9 regulatory element comprises the nucleic acid sequence shown in SEQ ID NO: 1 and / or 2.
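The notion of a guide sequence being "sufficiently complementary" to a target can be illustrated with a minimal Python sketch that compares an RNA guide against the reverse complement of a DNA target strand and counts mismatches. The function names, the `max_mismatches` parameter, and the 20-nucleotide example strand are hypothetical; the example does not use SEQ ID NO: 1 or 2, and the tolerated number of mismatches is an assumption rather than a value taken from this disclosure.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def guide_hybridizes(guide_rna, target_strand_dna, max_mismatches=0):
    """True when the guide (5'->3', RNA) pairs with the target strand (5'->3', DNA).

    Assumption: pairing is scored position by position with a simple
    mismatch count; bulges and wobble pairs are not modeled.
    """
    guide_as_dna = guide_rna.upper().replace("U", "T")
    expected = reverse_complement(target_strand_dna)
    if len(guide_as_dna) != len(expected):
        return False
    mismatches = sum(1 for g, e in zip(guide_as_dna, expected) if g != e)
    return mismatches <= max_mismatches


# Hypothetical 20-nt target strand and its perfectly matched guide.
target_strand = "AAACCCGGGTTTACGTACGT"
perfect_guide = reverse_complement(target_strand).replace("T", "U")
guide_binds = guide_hybridizes(perfect_guide, target_strand)
```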
[0147] Delivery system
[0148] This disclosure also provides delivery systems for introducing components of the systems and compositions described herein into cells, tissues, organs, or organisms.
[0149] A delivery system may include one or more delivery media and / or vehicles.
[0150] Cargo
[0151] The delivery system may include one or more delivery vehicles. The delivery vehicle may include one or more components of the systems and compositions described herein. The delivery vehicle may contain one or more of the following: i) a plasmid encoding one or more Cas proteins; ii) a plasmid encoding one or more guide RNAs; iii) mRNA of one or more Cas proteins; iv) one or more guide RNAs; v) one or more Cas proteins; vi) any combination thereof. In some instances, the delivery vehicle may contain a plasmid encoding one or more Cas proteins and one or more (e.g., multiple) guide RNAs. In some embodiments, the delivery vehicle may include mRNA encoding one or more Cas proteins and one or more guide RNAs.
[0152] In some instances, the delivery vehicle may include one or more Cas proteins and one or more guide RNAs, for example, in the form of a ribonucleoprotein complex (RNP). The ribonucleoprotein complex can be delivered using the methods and systems described herein. In some cases, the ribonucleoprotein can be delivered using a peptide-based shuttle. In one instance, the ribonucleoprotein can be delivered using a synthetic peptide comprising an endosome leakage domain (ELD) operably linked to a cell penetration domain (CPD), a histidine-rich domain, and the CPD, as described, for example, in WO2016161516.
[0153] Physical delivery
[0154] In some embodiments, the carrier can be introduced into cells via physical delivery methods. Examples of physical methods include microinjection, electroporation, and hydrodynamic delivery.
[0155] Microinjection
[0156] Direct microinjection of the delivery vehicle into cells can achieve high efficiency, such as above 90% or about 100%. In some embodiments, microinjection can be performed using a microscope and a needle (e.g., 0.5–5.0 μm in diameter) to pierce the cell membrane and deliver the delivery vehicle directly to the target site within the cell. Microinjection can be used for both in vitro and ex vivo delivery.
[0157] Plasmids containing sequences encoding Cas proteins and / or guide RNAs, mRNAs, and / or guide RNAs may be delivered by microinjection. In some cases, microinjection can be used to i) deliver DNA directly to the cell nucleus, and / or ii) deliver mRNA (e.g., transcribed in vitro) to the cell nucleus or cytoplasm. In some instances, microinjection can be used to deliver sgRNA directly to the cell nucleus and mRNA encoding Cas to the cytoplasm, for example, to promote translation of Cas and its shuttling to the cell nucleus.
[0158] Microinjection can be used to produce genetically modified animals. For example, gene-editing vectors can be injected into zygotes to allow for effective germline modification. Such methods can produce normal embryos and full-term mouse pups with the desired modifications. Microinjection can also be used to provide transient upregulation or downregulation of specific genes within the cellular genome, for example using CRISPRa and CRISPRi.
[0159] Electroporation
[0160] In some embodiments, the carrier and / or delivery medium can be delivered via electroporation. Electroporation uses pulsed high-voltage current to transiently open nanoscale pores in the membrane of cells suspended in a buffer solution, allowing components with hydrodynamic diameters of tens of nanometers to flow into the cells. In certain cases, electroporation can be used with a variety of cell types and can efficiently transfer carriers into cells. Electroporation can be used for both in vitro and ex vivo delivery.
[0161] Electroporation can also be used to deliver payloads into the nucleus of mammalian cells by applying specific voltages and reagents, such as through nucleofection. Such methods include those described in the following literature: Wu Y, et al. (2015). Cell Res 25:67-79; Ye L, et al. (2014). Proc Natl Acad Sci USA 111:9591-6; Choi PS, Meyerson M. (2014). Nat Commun 5:3728; Wang J, Quake SR. (2014). Proc Natl Acad Sci 111:13157-62. Electroporation can also be used for in vivo delivery of payloads, for example, using the method described in Zuckermann M, et al. (2015). Nat Commun 6:7391.
[0162] Hydrodynamic delivery
[0163] Hydrodynamic delivery can also be used to deliver payloads, for example in vivo. In some instances, hydrodynamic delivery can be performed by rapidly injecting a large volume (8-10% of body weight) of a solution containing the gene-editing payload into the bloodstream of a subject (e.g., an animal or human), such as via the tail vein in mice. Because blood is incompressible, the large fluid volume raises hydrodynamic pressure, temporarily increasing the permeability of endothelial and parenchymal cells and allowing payloads that would normally be unable to cross cell membranes to enter the cells. This method can be used to deliver naked DNA plasmids and proteins. The delivered payload may accumulate in the liver, kidneys, lungs, muscles, and / or heart.
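As a worked illustration of the 8-10% of body weight figure given above, the following Python sketch converts a body weight into an injection volume range. The function name is hypothetical, and the calculation assumes the injected solution has a density of about 1 g/mL so that grams and milliliters are interchangeable; the 20 g mouse is an example value, not a protocol of this disclosure.

```python
def hydrodynamic_volume_ml(body_weight_g, low_fraction=0.08, high_fraction=0.10,
                           density_g_per_ml=1.0):
    """Return the (low, high) injection volume in mL for a given body weight in grams.

    Assumption: solution density is ~1 g/mL, so mass fraction maps to volume.
    """
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    low = body_weight_g * low_fraction / density_g_per_ml
    high = body_weight_g * high_fraction / density_g_per_ml
    return low, high


# Example: a 20 g mouse corresponds to roughly 1.6 to 2.0 mL.
low_ml, high_ml = hydrodynamic_volume_ml(20.0)
```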
[0164] Transfection
[0165] Carriers such as nucleic acids can be introduced into cells through transfection methods. Examples of transfection methods include calcium phosphate-mediated transfection, cationic transfection, liposome transfection, dendrimer transfection, heat shock transfection, magnetofection, lipofection, impalefection, optical transfection, and uptake of nucleic acids enhanced by proprietary reagents.
[0166] Delivery medium
[0167] The delivery system may include one or more delivery media. The delivery media can deliver a carrier to cells, tissues, organs, or organisms (e.g., animals or plants). The carrier may be packaged, carried, or otherwise associated with the delivery media. The delivery media may be selected based on the type of carrier to be delivered and / or whether the delivery is in vitro and / or in vivo. Examples of delivery media include the vectors, viruses, non-viral media, and other delivery reagents described herein.
[0168] The delivery medium according to this disclosure may have a maximum dimension (e.g., diameter) of less than 100 micrometers (μm). In some embodiments, the maximum dimension of the delivery medium is less than 10 μm. In some embodiments, the delivery medium may have a maximum dimension of less than 2000 nanometers (nm). In some embodiments, the delivery medium may have a maximum dimension of less than 1000 nanometers (nm). In some embodiments, the delivery medium may have a maximum dimension (e.g., diameter) of less than 900 nm, less than 800 nm, less than 700 nm, less than 600 nm, less than 500 nm, less than 400 nm, less than 300 nm, less than 200 nm, less than 150 nm, or less than 100 nm or less than 50 nm. In some embodiments, the delivery medium may have a maximum dimension in the range of 25 nm to 200 nm.
[0169] In some embodiments, the delivery medium may be or comprise particles. For example, the delivery medium may be or comprise nanoparticles (e.g., particles with a maximum dimension (e.g., diameter) not exceeding 1000 nm). The particles may be provided in various forms, such as solid particles (e.g., metals such as silver, gold, iron, and titanium, nonmetals, lipid-based solids, polymers), suspensions of particles, or combinations thereof. Metallic, dielectric, and semiconductor particles, as well as hybrid structures (e.g., core-shell particles), can be prepared.
[0170] Vectors
[0171] The systems, compositions, and / or delivery systems described herein may comprise one or more vectors. This disclosure also includes vector systems, which may comprise one or more vectors. In some embodiments, a vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it is linked. Vectors include single-stranded, double-stranded, or partially double-stranded nucleic acid molecules, nucleic acid molecules containing one or more free ends, nucleic acid molecules without free ends (e.g., circular), nucleic acid molecules containing DNA, RNA, or both, and various other polynucleotides known in the art. Vectors may be plasmids, such as circular double-stranded DNA loops into which additional DNA segments can be inserted, for example, using standard molecular cloning techniques. Some vectors are capable of autonomous replication in the host cell into which they are introduced (e.g., bacterial vectors with a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the host cell's genome upon introduction into the host cell and thereby replicate along with the host genome. In some instances, vectors may be expression vectors, such as those capable of directing the expression of genes to which they are operatively linked. In some cases, expression vectors can be used for expression in eukaryotic cells. Common expression vectors useful in recombinant DNA technologies are typically in the form of plasmids.
[0172] Examples of vectors include pGEX, pMAL, pRIT5, E. coli expression vectors (e.g., pTrc, pET 11d), yeast expression vectors (e.g., pYepSec1, pMFa, pJRY88, pYES2, and picZ), baculovirus vectors (e.g., the pAc and pVL series, for expression in insect cells such as SF9 cells), and mammalian expression vectors (e.g., pCDM8 and pMT2PC).
[0173] The vector may contain i) a Cas coding sequence, and / or ii) a single or at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 32, 48, or 50 guide RNA coding sequences. Each RNA coding sequence may have a promoter within a single vector. Optionally or additionally, promoters controlling (e.g., driving transcription and / or expression) multiple RNA coding sequences may be present within a single vector.
[0174] Regulatory elements
[0175] The vector may contain one or more regulatory elements. These regulatory elements may be operatively linked to the coding sequence of a Cas protein, a helper protein, a guide RNA (e.g., a single guide RNA, crRNA, and / or tracrRNA), or a combination thereof. The term "operatively linked" means that the nucleotide sequence of interest is linked to the regulatory element in a manner that allows for nucleotide sequence expression (e.g., in an in vitro transcription / translation system or in a host cell when the vector is introduced into the host cell). In some instances, the vector may contain: a first regulatory element operatively linked to a nucleotide sequence encoding a Cas protein, and a second regulatory element operatively linked to a nucleotide sequence encoding a guide RNA.
[0176] Examples of regulatory elements include promoters, enhancers, internal ribosome entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory elements are described, for example, in Goeddel, *GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY* 185, Academic Press, San Diego, Calif (1990). Regulatory elements include those that direct constitutive expression of nucleotide sequences in many types of host cells, and those that direct nucleotide sequence expression only in certain host cells (e.g., tissue-specific regulatory sequences). Tissue-specific promoters can direct expression primarily in desired tissues of interest such as muscle, neurons, bone, skin, blood, specific organs (e.g., liver, pancreas), or specific cell types (e.g., lymphocytes). Regulatory elements can also direct expression in a time-dependent manner, such as in a cell cycle-dependent or developmental stage-dependent manner, and may or may not be tissue- or cell-type specific.
[0177] Examples of promoters include one or more pol III promoters (e.g., 1, 2, 3, 4, 5 or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5 or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5 or more pol I promoters), or combinations thereof. Examples of pol III promoters include, but are not limited to, the U6 and H1 promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer), the SV40 promoter, the dihydrofolate reductase promoter, the actin promoter, the phosphoglycerate kinase (PGK) promoter, and the EF1α promoter.
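The arrangement described above, in which a first regulatory element is linked to a Cas coding sequence and a second to a guide RNA coding sequence, can be sketched as a small data structure. This is an illustrative sketch only: the class and function names are hypothetical, the promoter labels are plain strings chosen for the example, and pairing a pol II promoter with the Cas sequence and a pol III promoter with the guide RNA is an assumed common convention rather than a requirement stated in this disclosure.

```python
from dataclasses import dataclass
from typing import List

# Promoter classes named in the text; the string labels are illustrative.
POL_III_PROMOTERS = {"U6", "H1"}
POL_II_PROMOTERS = {"RSV LTR", "CMV", "SV40", "DHFR", "actin", "PGK", "EF1a"}  # EF1a = EF1-alpha


def promoter_class(name: str) -> str:
    """Classify a promoter label as pol III, pol II, or unknown."""
    if name in POL_III_PROMOTERS:
        return "pol III"
    if name in POL_II_PROMOTERS:
        return "pol II"
    return "unknown"


@dataclass(frozen=True)
class Cassette:
    """One regulatory element operatively linked to one coding sequence."""
    regulatory_element: str
    coding_sequence: str


def describe(vector: List[Cassette]) -> List[str]:
    """Summarize each cassette of a vector as a readable line."""
    return [
        f"{c.regulatory_element} ({promoter_class(c.regulatory_element)}) -> {c.coding_sequence}"
        for c in vector
    ]


# Hypothetical two-cassette vector (assumed pairing, see lead-in).
example_vector = [Cassette("CMV", "Cas protein"), Cassette("U6", "guide RNA")]
for line in describe(example_vector):
    print(line)
```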
[0178] Viral vector
[0179] The delivery vehicle can be delivered via a virus. In some embodiments, a viral vector is used. The viral vector may include a virus-derived DNA or RNA sequence (e.g., retrovirus, replication-defective retrovirus, adenovirus, replication-defective adenovirus, and adeno-associated virus) for packaging into a virus. The viral vector also contains polynucleotides carried by the virus for transfection into host cells. Viruses and viral vectors can be used for in vitro, ex vivo, and / or in vivo delivery.
[0180] Adeno-associated virus (AAV)
[0181] The systems and compositions described herein can be delivered via adeno-associated virus (AAV). AAV vectors can be used for such delivery. AAV is a single-stranded DNA virus belonging to the genus Dependovirus of the family Parvoviridae. In some embodiments, AAV can provide a persistent source of the provided DNA because the genomic material delivered by AAV can persist in cells indefinitely, for example as exogenous DNA or, with some modification, directly integrated into the host DNA. In some embodiments, AAV does not cause, and is not associated with, any disease in humans. The virus itself is capable of efficiently infecting cells while evoking little to no innate or adaptive immune response or associated toxicity.
[0182] Examples of AAVs that can be used herein include AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, AAV-6, AAV-8, and AAV-9. The type of AAV can be selected based on the cells to be targeted; for example, AAV serotypes 1, 2, or 5, or a hybrid capsid of AAV1, AAV2, AAV5, or any combination thereof, can be selected for targeting brain or neuronal cells, and AAV4 can be selected for targeting cardiac tissue. AAV8 can be used for delivery to the liver. AAV-2-based vectors were initially proposed for CFTR delivery to the CF airway, and other serotypes such as AAV-1, AAV-5, AAV-6, and AAV-9 have shown improved gene transfer efficiency in various lung epithelial models. Examples of cell types targeted by AAVs are described in Grimm, D. et al., J. Virol. 82:5887-5911 (2008) and WO 2021/183807A1, each of which is incorporated herein by reference in its entirety.
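The serotype selection described above amounts to a lookup from target tissue to candidate serotypes. The sketch below records only the examples named in the preceding paragraph; the dictionary keys and function name are hypothetical, and the table is not a complete or authoritative tropism reference.

```python
# Candidate serotypes per target, copied from the examples in the text.
# Keys are illustrative labels, not terms defined in this disclosure.
AAV_SEROTYPES_BY_TARGET = {
    "brain_or_neuronal": ["AAV1", "AAV2", "AAV5"],
    "cardiac": ["AAV4"],
    "liver": ["AAV8"],
    "lung_epithelium": ["AAV1", "AAV5", "AAV6", "AAV9"],
}


def candidate_serotypes(target: str) -> list:
    """Return the serotypes listed for a target, or an empty list if none."""
    return list(AAV_SEROTYPES_BY_TARGET.get(target, []))


print(candidate_serotypes("liver"))  # ['AAV8']
```

A target that is not named in the text returns an empty list rather than a guess.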
[0183] CRISPR-Cas AAV particles can be generated in HEK 293T cells. Once particles with specific tropism are generated, they can be used to infect target cell lines in essentially the same way as natural viral particles. This may allow the CRISPR-Cas component to persist in the infected cell type, which is why this delivery version is particularly well-suited for situations requiring long-term expression. Examples of dosages and formulations of AAV that can be used include those described in U.S. Patent Nos. 8,454,972 and 8,404,658.
[0184] Several strategies can be used to deliver the systems and compositions described herein using AAV. In some instances, the coding sequences for Cas and gRNA can be packaged together on a single DNA plasmid vector and delivered via one AAV particle. In some instances, AAV can be used to deliver gRNA into cells previously engineered to express Cas. In some instances, the coding sequences for Cas and gRNA can be packaged into two separate AAV particles for co-transfection of target cells. In some instances, markers, tags, and other sequences can be packaged within the same AAV particle as coding sequences for Cas and / or gRNA.
[0185] Lentivirus
[0186] The systems and compositions described herein can be delivered via lentiviruses. Lentiviral vectors can be used for such delivery. Lentiviruses are complex retroviruses that can infect, and express their genes in, both mitotic and post-mitotic cells.
[0187] Examples of lentiviruses include human immunodeficiency virus (HIV), which can target a wide range of cell types using envelope glycoproteins from other viruses; and minimal non-primate lentiviral vectors based on equine infectious anemia virus (EIAV), which can be used for ocular treatment. In some embodiments, self-inactivating lentiviral vectors carrying an siRNA targeting a common exon shared by HIV tat / rev, a nucleolar-localizing TAR decoy, and an anti-CCR5-specific hammerhead ribozyme (see, for example, DiGiusto et al. (2010) Sci Transl Med 2:36ra43) can be used and / or are suitable for the nucleic acid targeting systems described herein.
[0188] Lentiviruses can be pseudotyped with other viral proteins, such as the G protein of vesicular stomatitis virus. In doing so, the cell tropism of the lentivirus can be altered as needed, making it either broad or narrow. In some cases, to improve safety, second- and third-generation lentiviral systems may split essential genes across three plasmids, which may reduce the likelihood of accidental reconstitution of viable viral particles within cells.
[0189] In some instances, leveraging their integration capabilities, lentiviruses can be used to create libraries containing cells with various genetic modifications, for example, for screening and / or studying genes and signaling pathways.
[0190] Adenovirus
[0191] The systems and compositions described herein can be delivered via adenovirus. Adenoviral vectors can be used for such delivery. Adenoviruses include non-enveloped viruses having an icosahedral nucleocapsid containing a double-stranded DNA genome. Adenoviruses can infect both dividing and non-dividing cells. In some embodiments, the adenovirus does not integrate into the host cell's genome, which can be used to limit off-target effects of the CRISPR-Cas system in gene editing applications.
[0192] Non-viral vectors
[0193] The delivery medium may include non-viral media. Generally, methods and media capable of delivering nucleic acids and / or proteins can be used to deliver the systems and compositions described herein. Examples of non-viral media include lipid nanoparticles, cell-penetrating peptides (CPPs), DNA nanoclews, gold nanoparticles, streptolysin O, multifunctional envelope-type nanodevices (MENDs), lipid-coated mesoporous silica particles, and other inorganic nanoparticles.
[0194] Lipid particles
[0195] The delivery medium may include lipid particles, such as lipid nanoparticles (LNPs) and liposomes.
[0196] Lipid nanoparticles (LNP)
[0197] LNPs can encapsulate nucleic acids within cationic lipid particles (such as liposomes) and can be delivered to cells relatively easily. In some instances, lipid nanoparticles do not contain any viral components, which helps to minimize safety and immunogenicity concerns. Lipid particles can be used for in vitro, ex vivo, and in vivo delivery. Lipid particles can be used for cell populations of various sizes.
[0198] In some instances, LNPs can be used to deliver DNA molecules (e.g., those containing coding sequences of Cas and / or gRNA) and / or RNA molecules (e.g., Cas mRNA, gRNA). In some cases, LNPs can be used to deliver Cas / gRNA RNP complexes.
[0199] In some embodiments, LNPs are used to deliver mRNA and gRNA, for example an mRNA encoding a fusion molecule comprising DNMT3A-DNMT3L (3A-3L)-dCas9-KRAB, together with at least one sgRNA targeting PCSK9.
[0200] The components of LNPs may include cationic lipids such as 1,2-dilinoleoyl-3-dimethylammonium-propane (DLinDAP), 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (DLinDMA), 1,2-dilinoleyloxy-keto-N,N-dimethyl-3-aminopropane (DLinK-DMA), and 1,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLinKC2-DMA); PEG-lipids such as 3-O-[2-(methoxypolyethyleneglycol 2000)succinoyl]-1,2-dimyristoyl-sn-glycol (PEG-S-DMG) and R-3-[(ω-methoxy-poly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxypropyl-3-amine (PEG-C-DOMG); and any combination thereof. The preparation and encapsulation of LNPs can be adapted from Conway et al., Molecular Therapy, vol.27, no.4, pages 866-877, Apr.2019 and Rosin et al., Molecular Therapy, vol.19, no.12, pages 1286-2200, Dec.201.
[0201] In some embodiments, the LNP may comprise ionizable lipids. In some embodiments, the ionizable lipids include, but are not limited to, pH-responsive, thermoresponsive, and light-responsive ionizable lipids. In some embodiments, the ionizable lipids include cationic and anionic lipids that ionize under certain conditions, such as, but not limited to, pH, temperature, or light. In some embodiments, the molar ratio of ionizable lipids in the LNP is from about 20% to about 70% (e.g., about 20% to about 70%, about 20% to about 65%, about 20% to about 60%, about 20% to about 55%, about 20% to about 50%, about 20% to about 45%, about 20% to about 40%, about 20% to about 35%, about 20% to about 30%, about 20% to about 25%, about 30% to about 70%, about 30% to about 65%, about 30% to about 60%, about 30% to about 55%, about 30% to about 50%, about 30% to about 45%, about 30% to about 40%, about 30% to about 35%, about 40% to about 70%, about 40% to about 65%, about 40% to about 60%, about 40% to about 55%, about 40% to about 50%, about 40% to about 45%, about 50% to about 70%, about 50% to about 65%, about 50% to about 60%, about 50% to about 55%, about 60% to about 70%, or about 60% to about 65%). In some embodiments, the molar ratio of ionizable lipids in the LNP is about 45% to about 50%.
[0202] In some embodiments, the LNP may comprise PEGylated lipids. In some embodiments, the molar ratio of the PEGylated lipids in the LNP is from 0% to about 30% (e.g., from about 0% to about 30%, from about 0% to about 25%, from about 0% to about 20%, from about 0% to about 15%, from about 0% to about 10%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 10% to about 15%, from about 20% to about 30%, or from about 20% to about 25%). In some embodiments, the molar ratio of the PEGylated lipids in the LNP is about 1%.
[0203] In some embodiments, the LNP may comprise supporting lipids. In some embodiments, the molar ratio of the supporting lipids in the LNP is from about 5% to about 50% (e.g., from about 5% to about 50%, from about 5% to about 45%, from about 5% to about 40%, from about 5% to about 35%, from about 5% to about 30%, from about 5% to about 25%, from about 5% to about 20%, from about 5% to about 15%, from about 5% to about 10%, from about 10% to about 50%, from about 10% to about 45%, from about 10% to about 40%, from about 10% to about 35%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 10% to about 15%, from about 20% to about 50%, from about 20% to about 45%, from about 20% to about 40%, from about 20% to about 35%, from about 20% to about 30%, from about 20% to about 25%, from about 30% to about 50%, from about 30% to about 45%, from about 30% to about 40%, from about 30% to about 35%, from about 40% to about 50%, or from about 40% to about 45%). In some embodiments, the molar ratio of the supporting lipids in the LNP is about 9%.
[0204] In some embodiments, the LNP may contain cholesterol. In some embodiments, the molar ratio of cholesterol in the LNP is from 10% to about 50% (e.g., from about 10% to about 50%, from about 10% to about 45%, from about 10% to about 40%, from about 10% to about 35%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 10% to about 15%, from about 20% to about 50%, from about 20% to about 45%, from about 20% to about 40%, from about 20% to about 35%, from about 20% to about 30%, from about 20% to about 25%, from about 30% to about 50%, from about 30% to about 45%, from about 30% to about 40%, from about 30% to about 35%, from about 40% to about 50%, or from about 40% to about 45%). In some embodiments, the molar ratio of cholesterol in the LNP is from about 40% to about 45%.
[0205] In some embodiments, the LNP may comprise a mixture of ionizable lipids (20%-70% molar ratio), PEGylated lipids (0%-30% molar ratio), supporting lipids (30%-50% molar ratio), and cholesterol (10%-50% molar ratio). In some embodiments, the LNP may comprise a mixture of ionizable lipids (45-50% molar ratio), PEGylated lipids (1% molar ratio), supporting lipids (9% molar ratio), and cholesterol (40-50% molar ratio).
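A candidate four-component formulation can be checked against the stated molar ranges, and against the requirement that the molar percentages sum to 100%, with a short routine. The following is a sketch under stated assumptions: the function and key names are hypothetical, the supporting-lipid range is taken as the 5% to 50% of paragraph [0203], and the example composition is simply one point inside the narrower ranges given above, not a formulation disclosed herein.

```python
# Molar-percentage ranges per component (assumed from paragraphs [0201]-[0204]).
LNP_MOLAR_RANGES = {
    "ionizable": (20.0, 70.0),
    "pegylated": (0.0, 30.0),
    "supporting": (5.0, 50.0),
    "cholesterol": (10.0, 50.0),
}


def check_lnp_composition(composition, tolerance=1e-6):
    """Return a list of problems; an empty list means the composition passes."""
    problems = []
    for name, (low, high) in LNP_MOLAR_RANGES.items():
        value = composition.get(name)
        if value is None:
            problems.append(f"missing component: {name}")
        elif not low <= value <= high:
            problems.append(f"{name} = {value}% outside {low}-{high}%")
    total = sum(composition.values())
    if abs(total - 100.0) > tolerance:
        problems.append(f"molar percentages sum to {total}%, not 100%")
    return problems


# Hypothetical composition within the narrower ranges of the text.
example = {"ionizable": 47.5, "pegylated": 1.0, "supporting": 9.0, "cholesterol": 42.5}
print(check_lnp_composition(example))  # []
```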
[0206] Liposomes
[0207] In some embodiments, the lipid particles may be liposomes. Liposomes are spherical vesicles consisting of one or more lipid bilayers surrounding an internal aqueous compartment, with a relatively impermeable outer lipophilic phospholipid bilayer. In some embodiments, liposomes are biocompatible and non-toxic, can deliver both hydrophilic and lipophilic drug molecules, protect their payloads from degradation by plasma enzymes, and transport their loads across biological membranes and the blood-brain barrier (BBB).
[0208] Liposomes can be made from several different types of lipids, such as phospholipids. Liposomes can contain natural phospholipids and lipids, such as 1,2-distearoyl-sn-glycero-3-phosphatidylcholine (DSPC), sphingomyelin, lecithin, monosialotetrahexosylganglioside, or any combination thereof.
[0209] Several other additives can be added to liposomes to modify their structure and properties. For example, liposomes can further contain cholesterol, sphingomyelin, and / or 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE) to, for example, improve stability and / or prevent leakage of the payload held inside the liposomes.
[0210] Stable Nucleic Acid Lipid Particles (SNALP)
[0211] In some embodiments, the lipid particles may be stable nucleic acid lipid particles (SNALP). SNALP may comprise an ionizable lipid (DLinDMA) (e.g., cationic at low pH), a neutral helper lipid, cholesterol, a diffusible polyethylene glycol (PEG)-lipid, or any combination thereof. In some instances, SNALP may comprise synthetic cholesterol, dipalmitoylphosphatidylcholine, 3-N-[(ω-methoxy poly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxypropylamine, and cationic 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane. In some instances, SNALP may comprise synthetic cholesterol, 1,2-distearoyl-sn-glycero-3-phosphocholine, PEG-cDMA, and 1,2-dilinoleyloxy-3-(N,N-dimethyl)aminopropane (DLinDMA).
[0212] Other lipids
[0213] The lipid particles may also contain one or more other types of lipids, such as cationic lipids, for example the amino lipid 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane (DLin-KC2-DMA), DLin-KC2-DMA4, and C12-200, and the co-lipids distearoylphosphatidylcholine, cholesterol, and PEG-DMG.
[0214] Lipoplexes and / or polyplexes
[0215] In some embodiments, the delivery medium comprises lipoplexes and / or polyplexes. Lipoplexes can bind to the negatively charged cell membrane and induce endocytosis. Cationic lipoplexes can be, for example, complexes comprising both lipid and non-lipid components. Examples of lipoplexes and polyplexes include FuGENE-6 reagent, non-liposomal solutions containing lipids and other components, zwitterionic amino lipids (ZALs), Ca2+ (e.g., forming DNA / Ca2+ microcomplexes), polyethyleneimine (PEI) (e.g., branched PEI), and poly(L-lysine) (PLL).
[0216] Cell-penetrating peptides
[0217] In some embodiments, the delivery medium includes cell-penetrating peptides (CPPs). CPPs are short peptides that facilitate cellular uptake of various molecular carriers, such as nanoparticles, small chemical molecules, and large fragments of DNA.
[0218] CPPs can vary in size, amino acid sequence, and charge. In some instances, CPPs can translocate across the plasma membrane and facilitate the delivery of various molecular carriers into the cytoplasm or organelles. CPPs can be introduced into cells through different mechanisms, such as direct penetration into the membrane, endocytosis-mediated entry, and translocation through the formation of temporary structures.
[0219] CPPs can have an amino acid composition containing a relatively high abundance of positively charged amino acids such as lysine or arginine, or a sequence containing an alternating pattern of polar / charged amino acids and nonpolar hydrophobic amino acids. These two types of structures are referred to as polycationic and amphipathic, respectively. A third type of CPP is a hydrophobic peptide containing only nonpolar residues with low net charge or hydrophobic amino acid groups that are essential for cellular uptake. Another type of CPP is the trans-activator of transcription (Tat) from human immunodeficiency virus 1 (HIV-1). Examples of CPPs include penetratin, Tat (48-60), transportan, and (R-AhX-R4) (AhX stands for aminocaproyl). Examples of CPPs and related applications also include those described in U.S. Patent 8,372,951.
[0220] CPPs can be used relatively readily for in vitro and ex vivo applications, but typically require extensive optimization for each payload and cell type. In some instances, CPPs can be directly covalently linked to Cas proteins, then complexed with gRNA and delivered to cells. In other instances, CPP-Cas and CPP-gRNA can be delivered separately to multiple cells. CPPs can also be used to deliver RNPs.
[0221] DNA nanoclews
[0222] In some embodiments, the delivery medium comprises DNA nanoclews. A DNA nanoclew refers to a sphere-like structure of DNA (e.g., having the shape of a ball of yarn). The nanoclew can be synthesized by rolling circle amplification with palindromic sequences that facilitate self-assembly of the structure. The sphere can then be loaded with a payload. Examples of DNA nanoclews are described in Sun W et al, J Am Chem Soc. 2014 Oct 22; 136(42):14722-5; and Sun W et al, Angew Chem Int Ed Engl. 2015 Oct 5; 54(41):12029-33. The DNA nanoclew may have palindromic sequences complementary to the gRNA portion within the Cas:gRNA ribonucleoprotein complex. The DNA nanoclew can be coated, for example with PEI, to induce endosomal escape.
[0223] Gold nanoparticles
[0224] In some embodiments, the delivery medium comprises gold nanoparticles (also known as AuNPs or colloidal gold). The gold nanoparticles can form complexes with carriers such as Cas:gRNA RNPs. The gold nanoparticles can be coated, for example with silicate and the endosomal disruptive polymer PAsp(DET). Examples of gold nanoparticles include AuraSense Therapeutics' Spherical Nucleic Acid (SNA™) constructs, as well as those described in Mout R, et al. (2017). ACS Nano 11:2452-8; Lee K, et al. (2017). Nat Biomed Eng 1:889-901.
[0225] iTOP
[0226] In some embodiments, the delivery medium includes iTOP. iTOP refers to a combination of small molecules that drives the efficient intracellular delivery of native proteins without relying on any transduction peptide. iTOP stands for induced transduction by osmocytosis and propanebetaine: it uses NaCl-mediated hyperosmolality together with a transduction compound (propanebetaine) to trigger macropinocytotic uptake of extracellular macromolecules into the cell. Examples of iTOP methods and reagents include those described in D'Astolfo DS, Pagliero RJ, Pras A, et al. (2015). Cell 161:674-690.
[0227] Polymer-based particles
[0228] In some embodiments, the delivery medium may comprise polymer-based particles (e.g., nanoparticles). In some embodiments, the polymer-based particles may mimic the membrane fusion mechanism of a virus. The polymer-based particles may be a synthetic copy of the influenza virus machinery and form transfection complexes with various types of nucleic acids (siRNA, miRNA, plasmid DNA, shRNA, or mRNA) that cells take up via the endocytosis pathway, a process that involves the formation of an acidic compartment. The low pH in the late endosome acts as a chemical switch, making the particle surface hydrophobic and facilitating membrane crossing. Once in the cytosol, the particle releases its payload for cellular action. This active endosomal escape technique is safe and maximizes transfection efficiency because it uses a natural uptake pathway. In some embodiments, the polymer-based particles may comprise alkylated and carboxyalkylated branched polyethyleneimine. In some instances, the polymer-based particles are VIROMER, such as VIROMER RNAi, VIROMER RED, VIROMER mRNA, or VIROMER CRISPR. Examples of methods for delivering the systems and compositions described herein include those described in the following literature: Bawage SS et al., Synthetic mRNA expressed Cas13a mitigates RNA virus infections, www.biorxiv.org/content/10.1101/370460v1.full, doi:10.1101/370460; Viromer RED, a powerful tool for transfection of keratinocytes, doi:10.13140/RG.2.2.16993.61281; Transfection-Factbook 2018: technology, product overview, users' data, doi:10.13140/RG.2.2.23912.16642.
[0229] Streptolysin O (SLO)
[0230] The delivery medium may be streptolysin O (SLO). SLO is a toxin produced by group A streptococci that acts by creating pores in mammalian cell membranes. SLO can act in a reversible manner, which allows the delivery of proteins (e.g., up to 100 kDa) into the cytosol of the cell without impairing overall viability. Examples of SLO include those described in the following literature: Sierig G, et al. (2003). Infect Immun 71:446-55; Walev I, et al. (2001). Proc Natl Acad Sci USA 98:3185-90; Teng KW, et al. (2017). Elife 6:e25460.
[0231] Multifunctional envelope-type nanodevices (MEND)
[0232] The delivery medium may include a multifunctional envelope-type nanodevice (MEND). The MEND may include condensed plasmid DNA, a PLL core, and a lipid shell. The MEND may further include a cell-penetrating peptide (e.g., stearylated octaarginine). The cell-penetrating peptide may be within the lipid shell. The lipid envelope may be modified with one or more functional components, such as one or more of the following: polyethylene glycol (e.g., to increase vascular circulation time), ligands for targeting specific tissues / cells, other cell-penetrating peptides (e.g., for greater cellular delivery), lipids that enhance endosomal escape, and nuclear delivery tags. In some instances, the MEND may be a tetra-lamellar MEND (T-MEND) that can target both the cell nucleus and mitochondria. Examples of MENDs include those described in the following literature: Kogure K, et al. (2004). J Control Release 98:317-23; Nakamura T, et al. (2012). Acc Chem Res 45:1113-21.
[0233] Lipid-coated mesoporous silica particles
[0234] The delivery medium may include lipid-coated mesoporous silica particles. The lipid-coated mesoporous silica particles may comprise a mesoporous silica nanoparticle core and a lipid membrane shell. The silica core may have a large internal surface area, resulting in high payload loading capacity. In some embodiments, pore size, pore chemistry, and overall particle size may be modified to load different types of payloads. The lipid coating of the particles may also be modified to maximize payload loading, increase circulation time, and provide precise targeting and payload release. Examples of lipid-coated mesoporous silica particles include those described in the following literature: Du X, et al. (2014). Biomaterials 35:5580-90; Durfee PN, et al. (2016). ACS Nano 10:8325-45.
[0235] Inorganic nanoparticles
[0236] The delivery medium may include inorganic nanoparticles. Examples of inorganic nanoparticles include carbon nanotubes (CNTs) (e.g., described in Bates K and Kostarelos K. (2013). Adv Drug Deliv Rev 65:2023-33), bare mesoporous silica nanoparticles (MSNPs) (e.g., described in Luo GF, et al. (2014). Sci Rep 4:6064), and dense silica nanoparticles (SiNPs) (e.g., described in Luo D and Saltzman WM. (2000). Nat Biotechnol 18:893-5).
[0237] Methods of use
[0238] The compositions and systems described herein can be used for a variety of applications, including modifying non-animal organisms such as plants and fungi and modifying animals, for the treatment and diagnosis of diseases in plants, animals, and humans. Typically, the compositions and systems can be introduced into cells, tissues, organs, or organisms where they modify the expression and / or activity of one or more genes (e.g., PCSK9).
[0239] In some embodiments, the expression of the PCSK9 gene product is reduced in cells in which the compositions and systems described herein are introduced. In some embodiments, the reduction in PCSK9 gene product expression is temporary. In some embodiments, the reduction in PCSK9 gene product expression is stable. In some embodiments, the reduction in PCSK9 gene product expression is heritable.
[0240] In some embodiments, a plurality of cells modified with the compositions and systems described herein exhibit expression of the PCSK9 gene product that is reduced by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% compared to cells into which the compositions and systems described herein have not been introduced.
[0241] In some embodiments, cells expanded or derived from a plurality of cells modified with the compositions and systems described herein also exhibit expression of the PCSK9 gene product that is reduced by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% compared to cells into which the compositions and systems described herein have never been introduced.
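The percentage thresholds above can be related to measured values with a simple calculation. The sketch below assumes that expression is quantified as a single positive number per sample (for example, a normalized transcript or protein level) and that the comparison is against unmodified control cells; the function names and example values are hypothetical.

```python
REDUCTION_THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 90)


def percent_reduction(treated_level, control_level):
    """Percent reduction of expression in treated cells relative to control."""
    if control_level <= 0:
        raise ValueError("control expression level must be positive")
    return (1.0 - treated_level / control_level) * 100.0


def highest_threshold_met(treated_level, control_level):
    """Return the largest 'at least N%' threshold satisfied, or None."""
    reduction = percent_reduction(treated_level, control_level)
    met = [t for t in REDUCTION_THRESHOLDS if reduction >= t]
    return max(met) if met else None


# Hypothetical values: treated cells retain 25% of control expression.
print(percent_reduction(0.25, 1.0))      # 75.0
print(highest_threshold_met(0.25, 1.0))  # 70
```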
[0242] Cells and organisms
[0243] This disclosure provides cells, tissues, and organisms comprising the engineered Cas protein, a CRISPR-Cas system, a polynucleotide encoding one or more components of the CRISPR-Cas system, and / or a carrier comprising the polynucleotide. This disclosure also provides nucleotide sequences encoding effector proteins in any of the methods or compositions described herein, said nucleotide sequences being codon-optimized for expression in eukaryotes or eukaryotic cells. In one embodiment of this disclosure, the codon-optimized effector protein is any Cas protein discussed herein and is codon-optimized for operability in eukaryotic cells or organisms, such as those mentioned elsewhere herein, including, but not limited to, yeast cells or mammalian cells or organisms, including mouse cells, rat cells, and human cells or non-human eukaryotic organisms such as plants.
[0244] In some embodiments, modification of the target locus of interest may result in: eukaryotic cells in which the expression of at least one gene product is altered; eukaryotic cells in which the expression of at least one gene product is altered, wherein the expression of the at least one gene product is increased; eukaryotic cells in which the expression of at least one gene product is altered, wherein the expression of the at least one gene product is decreased; or eukaryotic cells comprising an edited genome.
[0245] In some embodiments, the eukaryotic cell may be a mammalian cell or a human cell.
[0246] In other embodiments, the non-naturally occurring or engineered compositions, vector systems, or delivery systems described in this specification may be used for: site-specific gene knockout; site-specific genome editing; RNA sequence-specific interference; or multiplex genome engineering.
[0247] This document also provides a gene product derived from the cells, cell lines, or organisms described herein. In some embodiments, the amount of expressed gene product may be greater or less than the amount of gene product from cells without altered expression or edited genomes. In some embodiments, the gene product may be altered compared to the gene product from cells without altered expression or edited genomes.
[0248] Exemplary Therapy
[0249] This disclosure provides the use of the epigenetic editing tools and sgRNA in the treatment of various diseases and disorders. In some embodiments, the disclosure herein relates to a treatment method in which cells are edited using CRISPR or a base editor to regulate at least one gene, and the edited cells are subsequently administered to a patient in need. In some embodiments, the editing involves knocking in, knocking out, or knocking down the expression of at least one target gene in the cell. In particular embodiments, the editing may include the insertion, trans-or natural or synthetic, of a foreign gene, small gene, or sequence of one or more exons into a target gene locus, a hotspot locus, a safe harbor locus at a genomic location of the gene (where a new gene or gene element can be introduced without disrupting the expression or regulation of adjacent genes), or correction by inserting or deleting one or more mutations in the DNA sequence encoding a regulatory element of the target gene. In some embodiments, the editing includes the introduction of one or more point mutations in the nucleic acids (e.g., genomic DNA) of the target cell.
[0250] In some embodiments, the treatment targets diseases / disorders of organs, including liver diseases, eye diseases, muscle diseases, heart diseases, blood diseases, brain diseases, kidney diseases, or may include treatments for autoimmune diseases, central nervous system diseases, cancer and other proliferative diseases, neurodegenerative diseases, inflammatory diseases, metabolic disorders, musculoskeletal disorders, etc.
[0251] In some embodiments, the disease is associated with high cholesterol and provides regulation of cholesterol (e.g., LDL). In some embodiments, the regulation is influenced by modifications in the target gene PCSK9. PCSK9 is associated with, but is not limited to, the following diseases and disorders: abeta-lipoproteinemia, adenoma, arteriosclerosis, atherosclerosis, cardiovascular disease, gallstones, coronary artery disease, coronary heart disease, non-insulin-dependent diabetes mellitus, hypercholesterolemia, familial hypercholesterolemia, hyperinsulinemia, hyperlipidemia, familial hyperlipidemia with complications, hypobeta-lipoproteinemia, chronic renal failure, liver disease, liver tumors, melanoma, myocardial infarction, somnolence, tumor metastasis, nephroblastoma, obesity, peritonitis, pseudoxanthoma elastica, cerebrovascular accident, vascular disease, xanthoma, peripheral vascular disease, myocardial ischemia. The following conditions are considered as potential causes of hyperlipidemia, impaired glucose tolerance, xanthomas, multifocal hypercholesterolemia, secondary hepatic malignancies, dementia, overweight, chronic hepatitis C, carotid atherosclerosis, type A hyperlipoproteinemia, intracranial atherosclerosis, ischemic stroke, acute coronary syndrome, aortic calcification, cardiovascular disease, type B hyperlipoproteinemia, peripheral artery disease, type II familial aldosteronism, familial hypoβ-lipoproteinemia, autosomal recessive hypercholesterolemia, autosomal dominant hypercholesterolemia, coronary artery disease, liver cancer, ischemic stroke, and atherosclerotic cardiovascular disease (NOS). Epigenetic modification of the PCSK9 gene using any of the methods described herein may be used to treat, prevent, and / or alleviate symptoms of the diseases and disorders described herein.
[0252] Dyslipidemia is a genetic disorder characterized by elevated levels of lipids in the blood, leading to arterial blockage (atherosclerosis). These lipids include plasma cholesterol, triglycerides, high-density lipoprotein (HDL), or low-density lipoprotein (LDL). Dyslipidemia increases the risk of heart attack, stroke, or other circulatory problems. Current treatments include lifestyle modifications such as exercise and dietary adjustments, as well as the use of lipid-lowering medications such as statins. Non-statin lipid-lowering medications include bile acid sequestrants, cholesterol absorption inhibitors, homozygous familial hypercholesterolemia medications, fibrates, niacin, omega-3 fatty acids, and / or combination products. Treatment regimens are typically tailored to the specific lipid abnormality, although different lipid abnormalities often coexist. Treatment in children is more challenging because dietary changes can be difficult to implement, and lipid-lowering therapies have not yet been proven effective. Epigenetic modifications of the PCSK9 gene using any of the methods described herein can be used to treat, prevent, and / or alleviate symptoms of dyslipidemia, such as LDL dyslipidemia.
[0253] PCSK9 activity is primarily confined to the liver, and it is associated with dyslipidemia, PCSK9-associated familial hypercholesterolemia, familial hypercholesterolemia, papillary gastric carcinoma, homozygous familial hypercholesterolemia, and nasopharyngitis. PCSK9-associated familial hypercholesterolemia is an autosomal dominant genetic disorder in which the body exhibits dangerously high blood cholesterol levels due to a lack of low-density lipoprotein cholesterol receptors. Worldwide, it affects approximately 1 in 500 individuals in the heterozygous form and 1 in 1,000,000 individuals in the homozygous form, and is more common in South African whites, French Canadians, Lebanese Christians, and Finns. Common symptoms of PCSK9-associated familial hypercholesterolemia include elevated circulating cholesterol, either alone or including very low-density lipoprotein (VLDL). Current treatments for PCSK9-associated familial hypercholesterolemia include the use of statins to inhibit HMG-CoA reductase in the liver. Another option for treating PCSK9-associated familial hypercholesterolemia is ezetimibe, which inhibits the absorption of cholesterol in the intestines.
[0254] In some embodiments, epigenetic modifications of the PCSK9 gene by any of the methods described herein can target the liver, the primary site of PCSK9 activity.
[0255] Examples
[0256] To enable those skilled in the art to better understand the present disclosure, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below. Obviously, the described embodiments are only some embodiments of the present disclosure, and not all embodiments.
[0257] Example 1: Construction of fusion molecular plasmid
[0258] Design and construction of plasmids containing the complexes of this application, and construction of IVT template plasmids:
[0259] The amino acid sequences of different elements (including scFv, ZIM3, DNMT3A, DNMT3L, dCas9, GCN4, etc.) were optimized by Genscript to be suitable for mammalian expression and synthesized. They were then cloned into IVT vectors (containing T7 promoter, UTR sequence and PolyA sequence) using the NEBuilder kit.
[0260] When optimizing different functional elements, these elements were synthesized by Genscript using nucleic acid sequences optimized for mammalian expression. First, the vector excluding the element to be replaced was amplified by PCR. Then, the element to be replaced was amplified from the synthesized sequence, with homologous arm sequences introduced. Finally, the different elements were recombined into the vector using NEBuilder reagent to construct the final expression plasmid.
[0261] Table 2. Sequence information of the epigenetic editing tool EPIREG009
[0262] Table 3. CRISPRoff sequence information
[0263] Example 2: In vitro transcription of mRNA encoding fusion molecules
[0264] The IVT template was linearized using NEB's XbaI (R0145L) or HindIII (R3104L) restriction endonuclease. The digested DNA was purified by phenol-chloroform extraction followed by ethanol precipitation, and the nucleic acid concentration was quantified using a spectrophotometer.
[0265] The purified template was transcribed in vitro with co-transcriptional capping using an NEB kit (E2040S). The cap analog used was CAG Trimer (ON-134) from Zwei Biotech. The reaction was set up according to the manufacturer's instructions, incubated at 37°C for 2 hours, and then subjected to DNase digestion. The in vitro transcribed RNA was purified using Thermo Fisher Scientific LiCl reagent (AM9480) according to the manufacturer's instructions. The nucleic acid concentration of the purified RNA was quantified using a spectrophotometer.
[0266] The integrity of the obtained RNA was determined by capillary electrophoresis on a Bioptic Qsep1 capillary electrophoresis instrument with an S1 cartridge. RNA with integrity greater than 80% was used for subsequent experiments.
[0267] Example 3: Lipid nanoparticle encapsulation of mRNA and sgRNA encoding fusion molecules
[0268] The LNPs used in this application were prepared according to the method described in the literature (Conway, A. et al. Non-viral Delivery of Zinc Finger Nuclease mRNA Enables Highly Efficient In Vivo Genome Editing of Multiple Therapeutic Gene Targets. Mol Ther 27, 866-877 (2019). https://doi.org/10.1016/j.ymthe.2019.03.003), specifically as follows: An ethanol solution containing 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC), cholesterol, PEG lipid, and ionizable cationic lipid was rapidly mixed with an aqueous solution at pH 4 containing mRNA and sgRNA (weight ratio 1:1) at a flow rate ratio of 1:3 (ethanol phase:aqueous phase). Throughout the study, the N:P ratio between the ionizable lipid and the nucleic acids was maintained at 4 to 6. The resulting LNP formulation was dialyzed overnight against 1×PBS, sterile filtered through a 0.22 μm filter, and stored at 4°C until use.
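As a non-limiting illustration of the mixing parameters described above, the following Python sketch works through the arithmetic of the 1:3 flow rate ratio, the 1:1 mRNA to sgRNA weight ratio, and an N:P ratio of 4 to 6. It is a minimal sketch under stated assumptions and not the formulation procedure of this application: the average nucleotide mass of about 330 g/mol (one phosphate per nucleotide), the single protonatable amine per ionizable lipid, the function names, and every numeric input in the usage lines are hypothetical values chosen only for illustration.

```python
# Illustrative LNP mixing arithmetic. All numeric inputs are hypothetical.

# Assumption: average mass per RNA nucleotide, with one phosphate per nucleotide.
AVG_NT_MASS_G_PER_MOL = 330.0


def phase_flow_rates(total_flow_ml_min, ethanol_parts=1.0, aqueous_parts=3.0):
    """Split a total flow rate into ethanol and aqueous streams (1:3 in the text)."""
    total_parts = ethanol_parts + aqueous_parts
    ethanol = total_flow_ml_min * ethanol_parts / total_parts
    aqueous = total_flow_ml_min * aqueous_parts / total_parts
    return ethanol, aqueous


def split_rna_mass(total_rna_ug, mrna_parts=1.0, sgrna_parts=1.0):
    """Split a total RNA mass into mRNA and sgRNA by weight ratio (1:1 in the text)."""
    total_parts = mrna_parts + sgrna_parts
    return (total_rna_ug * mrna_parts / total_parts,
            total_rna_ug * sgrna_parts / total_parts)


def ionizable_lipid_umol(rna_mass_ug, n_to_p, amines_per_lipid=1):
    """Micromoles of ionizable lipid needed to reach a target N:P ratio."""
    # micrograms divided by g/mol gives micromoles of phosphate
    phosphate_umol = rna_mass_ug / AVG_NT_MASS_G_PER_MOL
    return n_to_p * phosphate_umol / amines_per_lipid


if __name__ == "__main__":
    ethanol, aqueous = phase_flow_rates(total_flow_ml_min=12.0)
    mrna_ug, sgrna_ug = split_rna_mass(total_rna_ug=200.0)
    low = ionizable_lipid_umol(200.0, n_to_p=4.0)
    high = ionizable_lipid_umol(200.0, n_to_p=6.0)
    print(f"ethanol {ethanol} ml/min, aqueous {aqueous} ml/min")
    print(f"mRNA {mrna_ug} ug, sgRNA {sgrna_ug} ug")
    print(f"ionizable lipid between {low:.3f} and {high:.3f} umol")
```

The N:P calculation counts amine groups on the ionizable lipid against phosphate groups on the total RNA payload, so the mRNA and sgRNA masses are summed before conversion.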
[0269] Example 4: PCSK9 gene silencing in PCSK9 humanized mice using LNP delivery of mRNA and sgRNA encoding the fusion molecule.
[0270] For the human PCSK9 gene, PCSK9-Hs-sg3 and PCSK9-Hs-sg8 were selected as the final guide RNAs. The mRNA of the epigenetic editing tool EPIREG009 was encapsulated in LNPs together with sgRNA in three groups: a single PCSK9-Hs-sg3 group (EPIREG009 to PCSK9-Hs-sg3 mass ratio of 1:1), a single PCSK9-Hs-sg8 group (EPIREG009 to PCSK9-Hs-sg8 mass ratio of 1:1), and a combined group in which PCSK9-Hs-sg3 and PCSK9-Hs-sg8 were present at a mass ratio of 1:1 (EPIREG009 to PCSK9-Hs-sg3 to PCSK9-Hs-sg8 mass ratio of 1:0.5:0.5). This yielded LNP test samples targeting the human PCSK9 gene (LNP preparation reference: https://doi.org/10.1038/s41586-021-03534-y).
[0271] PCSK9 humanized mice (purchased from Cyagen (Suzhou) Biotechnology Co., Ltd.) were used as the pharmacodynamic study model. In this model, the mouse PCSK9 genomic sequence is completely replaced with the human PCSK9 gene, so the model can be used to study the efficacy of drugs targeting human PCSK9. In this study, a dosage of 3 mg/kg was used, and blood samples were collected 42 days after administration to measure the expression level of human PCSK9 protein. The results are shown in Figure 1: at a dosage of 3 mg/kg, an inhibition efficiency of over 90% was achieved.
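As a non-limiting illustration of how a weight-based dose and the mass ratios of Example 4 translate into per-component amounts, the following Python sketch may be considered. It assumes, for illustration only, that the 3 mg/kg dosage refers to the total RNA payload (mRNA plus sgRNA); the text does not state this. The body weight, function name, and dictionary keys are likewise hypothetical.

```python
# Illustrative dose arithmetic. The body weight is a hypothetical input, and
# treating 3 mg/kg as total RNA (mRNA + sgRNA) is an assumption.

def component_doses(body_weight_kg, dose_mg_per_kg=3.0,
                    ratio=(("mRNA", 1.0), ("sg3", 0.5), ("sg8", 0.5))):
    """Return the mass in mg of each RNA component for one animal.

    ratio lists (name, parts) pairs; 1:0.5:0.5 corresponds to the combined
    group and (("mRNA", 1.0), ("sg3", 1.0)) to a single-guide group.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    total_mg = body_weight_kg * dose_mg_per_kg
    total_parts = sum(parts for _, parts in ratio)
    return {name: total_mg * parts / total_parts for name, parts in ratio}


if __name__ == "__main__":
    # Hypothetical 25 g mouse, combined sg3 + sg8 group
    print(component_doses(0.025))
    # Hypothetical 25 g mouse, single sg3 group at 1:1
    print(component_doses(0.025, ratio=(("mRNA", 1.0), ("sg3", 1.0))))
```

Expressing the ratio as named parts keeps the single-guide (1:1) and combined (1:0.5:0.5) groups on the same total RNA mass, which is what allows their inhibition efficiencies to be compared at one dosage.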
[0272] Table 4. sgRNA sequence information targeting human PCSK9
[0273] Table 5. Primer information for human PCSK9 qPCR
[0274] Example 5: PCSK9 gene silencing in cynomolgus monkeys
[0275] Following the in vitro and humanized mouse studies of EPIREG009, we further investigated the inhibitory activity of the epigenetic editing tool in large animals. In the non-human primate pharmacodynamic study, cynomolgus monkeys were used as the model and were administered the same dose of EPIREG009 or CRISPRoff. A dosage of 3 mg/kg was used, with the sgRNAs Monkey-sg3 and Monkey-sg12 targeting the monkey PCSK9 gene. EPIREG009 or CRISPRoff mRNA was encapsulated in LNPs together with the two sgRNAs at a mass ratio of 1:0.5:0.5 (LNP preparation reference: https://doi.org/10.1038/s41586-021-03534-y).
[0276] Table 6. sgRNA sequence information targeting monkey PCSK9
[0277] Three days prior to administration, venous blood samples were collected from the cynomolgus monkeys to serve as the baseline for subsequent PCSK9 protein detection. The drug was diluted to 15 ml and administered by intravenous infusion. The animals were kept under close clinical observation and care for 24 hours post-administration. Venous blood samples were collected on days 3, 7, and 14 post-administration, and PCSK9 protein levels were quantified using an ELISA kit (purchased from R&D Systems, catalog number: DPC900) according to the kit instructions. The inhibitory efficiency of the drug on the PCSK9 gene was determined by comparing the post-administration PCSK9 protein levels with the pre-administration levels. The results, shown in Figure 2, indicate that the EPIREG009 composition exhibited superior inhibitory efficiency compared to the CRISPRoff composition at a dose of 3 mg/kg.
[0278] To further characterize the durability of the effect of EPIREG009, PCSK9 and LDL-c levels in the blood of primates (cynomolgus monkeys) were monitored over time after EPIREG009 administration. PCSK9 was measured using the ELISA kit purchased from R&D Systems (catalog number: DPC900) according to the kit instructions, and LDL-c was measured using a biochemical analyzer. Using the PCSK9 and LDL-c levels at day 4 (day 0 being the administration date) as baseline, the results in Table 7 and Figures 4 and 5 show that over the approximately 300-day study, PCSK9 inhibition was maintained at approximately 80%, and LDL-c levels decreased by approximately 40%. This demonstrates that EPIREG009 exhibits good in vivo inhibitory efficiency and durability, making it a potential drug for heterozygous hypercholesterolemia.
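As a non-limiting illustration of the comparison described above, inhibitory efficiency may be expressed as the percentage decrease of the post-administration PCSK9 protein level relative to the pre-administration baseline of the same animal. The following Python sketch shows one way to compute it; the function names are hypothetical, and the concentrations in the usage lines are invented placeholders that do not represent data from this application.

```python
# Illustrative inhibition-efficiency calculation from ELISA readouts.
# All concentrations below are made-up placeholders, not study data.

def inhibition_efficiency(baseline, post):
    """Percent reduction of a post-dose level relative to the pre-dose baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (1.0 - post / baseline) * 100.0


def efficiency_by_day(baseline, levels_by_day):
    """Map each sampling day to its percent inhibition versus baseline."""
    return {day: inhibition_efficiency(baseline, level)
            for day, level in sorted(levels_by_day.items())}


if __name__ == "__main__":
    baseline_ng_ml = 200.0  # hypothetical pre-administration PCSK9 level
    post_ng_ml = {3: 100.0, 7: 50.0, 14: 40.0}  # hypothetical days 3, 7, 14
    for day, pct in efficiency_by_day(baseline_ng_ml, post_ng_ml).items():
        print(f"day {day}: {pct:.1f}% inhibition")
```

A negative result from this calculation would indicate that the measured level rose above baseline, which is why each animal serves as its own reference.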
[0279] Table 7. Blood PCSK9 and LDL-c levels after drug administration
[0280] Example 6: Silencing of the human PCSK9 gene in human hepatocellular carcinoma cell lines
[0281] The test cells were the human hepatocellular carcinoma line Huh7. Cells were seeded into 24-well plates at 50,000 cells/well. Twelve hours after plating, the LNP samples obtained in Example 4 (PCSK9-Hs-sg3 and PCSK9-Hs-sg8 combined with EPIREG009) were added to the culture medium at different concentrations: 0, 0.025, 0.078, 0.25, 0.78, 2.5, 7.8, and 20 μg/ml. After 4-6 hours, the medium was replaced with fresh DMEM + 10% FBS, and the cells were cultured for another 14 days. RNA was then extracted from the cells and reverse transcribed into cDNA, and the mRNA expression level of the target gene PCSK9 was detected by qPCR. The inhibitory efficiency of each concentration of the EPIREG009 composition on the PCSK9 target gene was obtained by comparison with the 0 μg/ml control group. The results, shown in Figure 3, indicate that the EPIREG009 composition showed a good dose-dependent effect.
[0282] Incorporation by reference
[0283] The full contents of every patent and scientific document mentioned in this document are incorporated herein by reference for all purposes.
[0284] Equivalents
[0285] This disclosure may be embodied in other specific forms without departing from its spirit or essential characteristics. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting of the invention described herein. The scope of this disclosure is thus indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.
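As a non-limiting illustration of how qPCR readouts may be converted into an inhibitory efficiency relative to the 0 μg/ml control group, the following Python sketch uses the comparative Ct (2^-ΔΔCt) approach. This quantification method is an assumption made for illustration; the text above does not specify it. The sketch further assumes a reference (housekeeping) gene measured in the same samples and roughly 100% amplification efficiency. The function names and all Ct values are hypothetical placeholders.

```python
# Illustrative comparative-Ct (2^-ddCt) calculation. The quantification
# method, reference gene, and all Ct values are assumptions, not study data.

def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Target expression in a treated sample relative to the untreated control."""
    delta_treated = ct_target - ct_ref            # normalize treated sample
    delta_control = ct_target_ctrl - ct_ref_ctrl  # normalize control sample
    return 2.0 ** -(delta_treated - delta_control)


def knockdown_percent(rel_expr):
    """Convert relative expression into percent inhibition versus control."""
    return (1.0 - rel_expr) * 100.0


if __name__ == "__main__":
    # Hypothetical Ct values for the 0 ug/ml control wells
    ctrl_pcsk9, ctrl_ref = 24.0, 18.0
    # Hypothetical (PCSK9 Ct, reference Ct) pairs keyed by LNP dose in ug/ml
    treated = {0.25: (25.0, 18.0), 2.5: (27.0, 18.0)}
    for dose, (ct_pcsk9, ct_ref) in sorted(treated.items()):
        rel = relative_expression(ct_pcsk9, ct_ref, ctrl_pcsk9, ctrl_ref)
        print(f"{dose} ug/ml: {knockdown_percent(rel):.1f}% inhibition")
```

Normalizing each sample to a reference gene before comparing with the control corrects for differences in RNA input between wells, which matters when cells have been cultured for 14 days at different LNP concentrations.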
Claims
1. A composition comprising: a) a complex comprising a first fusion and a second fusion, or a nucleic acid sequence encoding the complex; b) a guide RNA (sgRNA) complementary to a sequence of a PCSK9 gene and / or a regulatory element of a PCSK9 gene, the sgRNA comprising a nucleic acid sequence as set forth in SEQ ID NO: 1 and / or SEQ ID NO: 2; wherein the first fusion comprises, in order from N-terminus to C-terminus, a recruitment domain A and a transcription repressor domain, and the second fusion comprises, in order from N-terminus to C-terminus, a DNA methylation domain, a nucleic acid binding domain, and a recruitment domain A’.
2. The composition of claim 1, comprising: a) a complex comprising a first fusion and a second fusion, or a nucleic acid sequence encoding the complex; b) two guide RNAs (sgRNAs) complementary to a sequence of a PCSK9 gene and / or a regulatory element of a PCSK9 gene, the sgRNAs comprising nucleic acid sequences as set forth in SEQ ID NO: 1 and SEQ ID NO: 2; wherein the first fusion comprises, in order from N-terminus to C-terminus, a recruitment domain A and a transcription repressor domain, and the second fusion comprises, in order from N-terminus to C-terminus, a DNA methylation domain, a nucleic acid binding domain, and a recruitment domain A’.
3. The composition of claim 1 or 2, wherein the recruitment domain A is selected from any one of one group of the following domains, and the recruitment domain A’ is selected from any one of the other group of the following domains: 1) a general control non-derepressible protein 4 (GCN4), a GFP11 fragment derived from split green fluorescent protein (GFP), or a GVKESLV polypeptide; and 2) a single-chain antibody (scFv), a GFP1-10 fragment derived from split green fluorescent protein (GFP), or a PDZ protein domain.
4. The composition of claim 3, wherein: 1) the domain of one of the recruitment domain A and the recruitment domain A’ is GCN4, and the domain of the other is a scFv; or 2) the domain of one of the recruitment domain A and the recruitment domain A’ is a GFP11 fragment, and the domain of the other is a GFP1-10.
5. The composition of claim 4, wherein the recruitment domain A comprises an amino acid sequence as set forth in SEQ ID NO: 7, and the recruitment domain A’ comprises an amino acid sequence as set forth in SEQ ID NO: 30.
6. The composition of claim 1 or 2, wherein the transcriptional repressor domain is selected from one or more of the following domains: KRAB, ZIM3, ZNF680, ZNF554, ZNF264, ZNF582, ZNF324, ZNF669, ZNF354A, ZNF82, ZNF595, ZNF419, ZNF566, ZIM2, EHMT2, SUV39H1, ZFPM1, TRIM28, EZH2, MXD1, SID, LSD1, HP1a, HDAC3, HDAC1, PRMT1, SETDB1, hSIRT1, ZNF436, ZNF257, ZNF675, ZNF490, ZNF320, ZNF331, ZNF816, ZNF41, ZNF189, ZNF528, ZNF543, ZNF140, ZNF610, ZNF350, ZNF8, ZNF30, ZNF98, ZNF677, ZNF596, ZNF214, ZNF37A, ZNF34, ZNF250, ZNF547, ZNF273, ZFP82, ZNF224, ZNF33A, ZNF45, ZNF175, ZNF184, ZFP28-1, ZFP28-2, ZNF18, ZNF213, ZNF394, ZFP1, ZFP14, ZNF416, ZNF557, ZNF729, ZNF254, ZNF764, ZNF785, ZNF10, CBX5, RYBP, YAF2, MGA, CBX1, SCMH1, MPP8, SUMO3, HERC2, BIN1, PCGF2, TOX, FOXA1, FOXA2, IRF2BP1, IRF2BP2, IRF2BPL IRF-2BP1_2 N-terminal domain, HOXA13, HOXB13, HOXC13, HOXA11, HOXC11, HOXC10, HOXA10, HOXB9, HOXA9, ZFP28, ZN334, ZN568, ZN37A, ZN181, ZN510, ZN862, ZN140, ZN208, ZN248, ZN571, ZN699, ZN726, ZIK1, ZNF2, Z705F, ZNF14, ZN471, ZN624, ZNF84, ZNF7, ZN891, ZN337, Z705G, ZN529, ZN729, ZN419, Z705A, ZN302, ZN486, ZN621, ZN688, ZN33A, ZN554, ZN878, ZN772, ZN224, ZN184, ZN544, ZNF57, ZN283, ZN549, ZN211, ZN615, ZN253, ZN226, ZN730, Z585A, ZN732,ZN681, ZN667, ZN649, ZN470, ZN484, ZN431, ZN382, ZN254, ZN124, ZN607, ZN317, ZN620, ZN141, ZN584, ZN540, ZN75D, ZN555, ZN658, ZN684, RBAK, ZN829, ZN582, ZN112, ZN716, HKR1, ZN350, ZN480, ZN416, ZNF92, ZN100, ZN736, ZNF74, ZN443, ZN195, ZN530, ZN782, ZN791, ZN331, Z354C, ZN157, ZN727, ZN550, ZN793, ZN235, ZN724, ZN573, ZN577, ZN789, ZN718, ZN300, ZN383, ZN429, ZN677, ZN850, ZN454, ZN257, ZN264, ZN485, ZN737, ZNF44, ZN596, ZN565, ZN543, ZFP69, SUMO1, ZNF12, ZN169, ZN433, ZN175, ZN347, ZNF25, ZN519, Z585B, ZN517, ZN846, ZN230, ZNF66, ZN713, ZN816, ZN426, ZN674, ZN627, ZNF20, Z587B, ZN316, ZN233, ZN611, ZN556, ZN234, ZN560, ZNF77, ZN682, ZN614, ZN785, ZN445, ZFP30, ZN225, ZN551, ZN610, ZN528, ZN284, ZN418, ZN490, ZN805, Z780B, ZN763, ZN285, ZNF85, 
ZN223, ZNF90, ZN557, ZN425, ZN229, ZN606, ZN155, ZN222, ZN442, ZNF91, ZN135, ZN778, ZN534, ZN586, ZN567, ZN440, ZN583, ZN441, ZNF43, ZN589, ZN563, ZN561, ZN136, ZN630, ZN527, ZN333, Z324B, ZN786, ZN709, ZN792, ZN599, ZN613, ZF69B, ZN799, ZN569, ZN564, ZN546, ZFP92, ZN723, ZN439, ZFP57, ZNF19, ZN404, ZN274, CBX3, ZN250, ZN570, ZN675, ZN695, ZN548, ZN132, ZN738, ZN420, ZN626, ZN559, ZN460, ZN268, ZN304, ZN605,ZN844, SUMO5, ZN101, ZN783, ZN417, ZN182, ZN823, ZN177, ZN197, ZN717, ZN669, ZN256, ZN251, CBX4, CDY2, CDYL2, ZN562, ZN461, Z324A, ZN766, ID2, ZN214, CBX7, ID1, CREM, SCX, ASCL1, ZN764, SCML2, TWST1, CREB1, TERF1, ID3, CBX8, GSX1, NKX22, ATF1, TWST2, ZNF17, TOX3, TOX4, ZMYM3, I2BP1, RHXF1, SSX2, I2BPL, ZN680, TRI68, HXA13, PHC3, TCF24, HXB13, HEY1, PHC2, ZNF81, FIGLA, SAM11, KMT2B, HEY2, JDP2, HXC13, ASCL4, HHEX, GSX2, ETV7, ASCL3, PHC1, OTP, I2BP2, VGLL2, HXA11, PDLI4, ASCL2, CDX4, ZN860, LMBL4, PDIP3, NKX25, CEBPB, ISL1, CDX2, PROP1, SIN3B, SMBT1, HXC11, HXC10, PRS6A, VSX1, NKX23, MTG16, HMX3, HMX1, KIF22, CSTF2, CEBPE, DLX2, PPARG, PRIC1, UNC4, BARX2, ALX3, TCF15, TERA, VSX2, HXD12, CDX1, TCF23, ALX1, HXA10, RX, CXXC5, SCML1, NFIL3, DLX6, MTG8, CEBPD, SEC13, FIP1, ALX4, LHX3, PRIC2, MAGI3, NELL1, PRRX1, MTG8R, RAX2, DLX3, DLX1, NKX26, NAB1, SAMD7, PITX3, WDR5, MEOX2, NAB2, DHX8, CBX6, EMX2, CPSF6, HXC12, KDM4B, LMBL3, PHX2A, EMX1, NC2B, DLX4, SRY, ZN777, ZN398, GATA3, BSH, SF3B4, TEAD1, TEAD3, RGAP1, PHF1, GATA2, FOXO3, ZN212, IRX4, ZBED6, LHX4, SIN3A, RBBP7, NKX61, R51A1, MB3L1, DLX5, NOTC1, TERF2, ZN282, RGS12, ZN840, SPI2B, PAX7, NKX62, ASXL2, FOXO1,GATA1, ZMYM5, LRP1, MIXL1, SGT1, LMCD1, CEBPA, SOX14, WTIP, PRP19, NKX11, RBBP4, DMRT2, SMCA2, and functionally active fragments thereof.
7. The composition of claim 6, wherein the transcription repressor domain is ZIM3.
8. The composition of claim 7, wherein the transcription repressor domain comprises an amino acid sequence as set forth in SEQ ID NO: 8.
9. The composition of claim 1 or 2, wherein the DNA methylation domain comprises at least one DNA methyltransferase or a functionally active fragment thereof.
10. The composition of claim 9, wherein the DNA methyltransferase is selected from the group consisting of DNMT3A, DNMT3B, DNMT3C, DNMT1, DNMT2, and DNMT3L.
11. The composition of claim 10, wherein the DNA methylation domain comprises at least one DNMT3A and at least one DNMT3L.
12. The composition of claim 11, wherein the DNA methylation domain comprises a DNMT3A-DNMT3L domain or a DNMT3L-DNMT3A domain; wherein "-" indicates that the domains at its two ends are connected directly or indirectly in the order from N-terminus to C-terminus.
13. The composition of claim 12, wherein the DNMT3A comprises an amino acid sequence as set forth in SEQ ID NO: 10.
14. The composition of claim 12, wherein the DNMT3L comprises an amino acid sequence as set forth in SEQ ID NO: 11.
15. The composition of claim 1 or 2, wherein the nucleic acid binding domain is a DNA binding domain.
16. The composition of claim 15, wherein the DNA binding domain is selected from the group consisting of a TALE domain, a zinc finger domain, a tetR domain, a meganuclease, a Cas protein, an Argonaute (Ago) protein, and a homolog, modified version, or variant thereof.
17. The composition of claim 16, wherein the DNA binding domain is capable of binding to a guide RNA.
18. The composition of claim 17, wherein the DNA binding domain is a Cas protein, and the Cas protein is a Class II Cas nuclease.
19. The composition of claim 18, wherein the Cas protein is selected from the group consisting of a Class II Type II Cas nuclease and a Class II Type V Cas nuclease.
20. The composition of claim 19, wherein the Cas protein is a Cas9 protein.
21. The composition of claim 20, wherein the Cas protein is a dead Cas9 (dCas9) protein.
22. The composition of claim 21, wherein the dCas9 comprises an amino acid sequence as set forth in any one of SEQ ID NOs: 12-29.
23. The composition of claim 1 or 2, wherein the first fusion and the second fusion are linked by a cleavable peptide.
24. The composition of claim 23, wherein the cleavable peptide is a 2A peptide selected from the group consisting of P2A, T2A, E2A, and F2A.
25. The composition of any one of claims 1-24, wherein the complex comprises an amino acid sequence as set forth in SEQ ID NO: 42.
26. The composition of any one of claims 1-25, wherein the complex and the sgRNA are packaged in a liposome or a lipid nanoparticle.
27. The composition of claim 26, wherein the complex and the sgRNA are packaged in the same liposome or lipid nanoparticle, or in different liposomes or lipid nanoparticles.
28. The composition of any one of claims 1-27, wherein the complex and the sgRNA are packaged in an AAV vector.
29. The composition of claim 28, wherein the complex and the sgRNA are packaged in the same AAV vector, or packaged in different AAV vectors.
30. The composition of any one of claims 1 to 29, wherein the composition is a pharmaceutical composition comprising a pharmaceutically acceptable carrier.
31. A method for reducing or eliminating expression of a proprotein convertase subtilisin / kexin type 9 (PCSK9) gene product in a cell, the method comprising introducing the composition of any one of claims 1-30 into a cell of a subject, thereby reducing or eliminating expression of the PCSK9 gene product in the cell.
32. A method of reducing or eliminating expression of a PCSK9 gene product in a subject in vivo, the method comprising introducing the composition of any one of claims 1-30 into a cell of a subject, thereby reducing or eliminating expression of the PCSK9 gene product in the cell.
33. A method of reducing low density lipoprotein (LDL) cholesterol in a subject, the method comprising introducing the composition of any one of claims 1-30 into a cell of a subject, thereby reducing LDL cholesterol in the subject.
34. A method of expanding a population of cells with reduced expression of a PCSK9 gene product, the method comprising the steps of: i) introducing into a plurality of cells a) a complex comprising a first fusion and a second fusion, or a nucleic acid sequence encoding the complex; b) one or more guide RNA (sgRNA) complementary to a sequence of a PCSK9 gene and / or a regulatory element of a PCSK9 gene, wherein the sgRNA comprises a nucleic acid sequence as set forth in SEQ ID NO: 1 and / or SEQ ID NO: 2; ii) culturing the plurality of cells to produce a plurality of modified cells with reduced expression of a PCSK9 gene product, wherein the first fusion comprises, in order from N-terminus to C-terminus, a recruitment domain A and a transcription repressor domain, the second fusion comprises, in order from N-terminus to C-terminus, a DNA methylation domain, a nucleic acid binding domain, and a recruitment domain A’, wherein the PCSK9 gene product expression of the plurality of modified cells is reduced at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% relative to cells that have not been introduced with the complex or the nucleic acid sequence encoding the complex, and wherein the cells are liver cells.
35. Use of the composition of any one of claims 1 to 30 in the manufacture of a medicament for treating or alleviating a PCSK9-related disease in a subject.
36. The use of claim 35, wherein the subject is a mammal, such as a human, monkey, mouse, rat, rabbit, pig, horse, cat, or dog.
37. The use of claim 35, wherein the PCSK9-related disease is a liver disease, a disease associated with high cholesterol, or a disease associated with a cholesterol (such as low density lipoprotein (LDL) cholesterol) disorder.
38. The use of claim 35, wherein the PCSK9-related disease is familial hypercholesterolemia (FH), complex dyslipidemia, hyperlipidemia, hyperlipoproteinemia, coronary syndrome, dyslipidemia, atherosclerosis, or liver cancer.