Minimal versatile genetic perturbation technology (MVGPT)

WO2026096419A3 · PCT designated stage · Publication Date: 2026-06-11 · WILLIAM MARCH RICE UNIVERSITY +1

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
WILLIAM MARCH RICE UNIVERSITY
Filing Date
2025-10-28
Publication Date
2026-06-11

AI Technical Summary

Technical Problem

Current CRISPR/Cas systems for genome editing in mammalian cells are inefficient, prone to unintended genetic alterations, and cytotoxic, while base editors lack orthogonality for multiplexed gene editing and activation, posing safety concerns for therapeutic applications.

Method used

A modular platform (mvGPT) using an engineered compact prime editor (PE), a transcriptional fusion activator (MS2-p65-HSF1), and a drive-and-process (DAP) array to enable simultaneous and orthogonal gene editing, activation, and repression in human cells, avoiding double-strand breaks and incorporating a compact RNA Pol III promoter for scalable and multiplexed RNA expression.

Benefits of technology

Enables precise and safe multiplexed gene editing, activation, and repression in human cells, reducing cytotoxicity and enhancing therapeutic potential by leveraging prime editing and orthogonal gene modulation without causing double-strand breaks, compatible with both viral and non-viral delivery modalities.

✦ Generated by Eureka AI based on patent content.
Patent Text Reader

Abstract

Provided herein are drive-and-process (DAP) array architectures for multiplex genome and cellular engineering, including nuclease editing, base editing, prime editing, gene activation, gene repression, and genetic perturbation, in virtually any cell type or organism with minimal genetic payloads. DAP arrays are composed of engineered tRNAs and small RNAs of interest, in a tandemly assembled single-array architecture.

Description

MINIMAL VERSATILE GENETIC PERTURBATION TECHNOLOGY (MVGPT)

REFERENCE TO RELATED APPLICATIONS

[0001] The present application claims the priority benefit of United States provisional application number 63/712,648, filed October 28, 2024, the entire contents of which are incorporated herein by reference.

STATEMENT OF FEDERALLY SPONSORED RESEARCH

[0002] This invention was made with government support under Grant No. R01HL157714 awarded by the National Institutes of Health and Grant No. CBET-2143626 awarded by the National Science Foundation. The government has certain rights in the invention.

REFERENCE TO A SEQUENCE LISTING

[0003] This application contains a Sequence Listing XML, which has been submitted electronically and is hereby incorporated by reference in its entirety. Said Sequence Listing XML, created on October 14, 2025, is named RICEP0162WO_ST26.xml and is 10,697 bytes in size.

BACKGROUND

1. Field

[0004] The present invention relates generally to the fields of molecular biology and genetic engineering. More particularly, it concerns CRISPR array architectures for simultaneous and orthogonal gene editing, activation, and repression in mammalian cells.

2. Description of Related Art

[0005] A compact, robust, and generalizable genetic perturbation system capable of precise genome editing alongside simultaneous and orthogonal modulation of endogenous gene expression in living cells is critical for therapeutic interventions of complex diseases, genetic screening, and metabolic engineering (Yuan & Gao, 2022; McCarty et al., 2020; Gao et al., 2016). While CRISPR/Cas nucleases combined with transcriptional activators can achieve concurrent gene knockout and activation (Dahlman et al., 2015; Kiani et al., 2015), precise gene editing using Cas nucleases relies on homology-directed repair stimulated by double-stranded breaks (DSBs), which is inefficient in most therapeutically relevant cell types (Campa et al., 2019; Kosicki et al., 2018; Haapaniemi et al., 2018; Ihry et al., 2018). Further, DSBs often lead to unintended genetic alterations and cytotoxicity, posing safety concerns for therapeutic applications (Campa et al., 2019; Kosicki et al., 2018; Haapaniemi et al., 2018; Ihry et al., 2018). Base editors (BEs) provide an alternative approach, converting one nucleobase to another without causing DSBs (Komor et al., 2016; Nishida et al., 2016; Gaudelli et al., 2017; Daniel et al., 2023). However, BEs are limited to point mutations and lack orthogonality for multiplexed gene editing and activation (Farzadfard et al., 2019).

[0006] Prime editors (PEs) employ a nickase Cas9 (nCas9)-reverse transcriptase (RT) fusion protein and a prime editing guide RNA (pegRNA) to reverse transcribe a short RNA template into targeted DNA sites without requiring DSBs (Anzalone et al., 2019; Anzalone et al., 2022; Zeng et al., 2024). PEs enable a variety of precise gene editing capabilities, including base substitutions, insertions, deletions, and gene inversions and translocations (Anzalone et al., 2019; Anzalone et al., 2022). PE-based multiplexed genetic perturbation technologies that enable simultaneous gene editing and transcriptomic modulation are needed.

SUMMARY

[0007] Provided herein are constructs and methods for a streamlined and modular platform (mvGPT) to perform simultaneous and orthogonal endogenous gene editing, activation, and repression in human cells. The mvGPT platform comprises three primary molecular modules for the concurrent use of an engineered compact PE, a recruitable transcriptional fusion activator, and an shRNA, which enables simultaneous and orthogonal gene editing, activation, and repression. Functioning as the central component of mvGPT, the drive-and-process (DAP) array utilizes an engineered hCtRNA as a robust RNA Pol III promoter, facilitating the tandem expression of programmable small RNAs. Alternatively, the DAP array may use the mouse CtRNA or QtRNA. These RNAs, released by cellular tRNA processing mechanisms, enable orthogonal genetic perturbations. The absence of lengthy promoters in the DAP array reduces the delivery payload, enabling high scalability and multiplexity with mvGPT.

[0008] Gene editing in the mvGPT platform is accomplished when PEAK complexes with pegRNA and ngRNA, both products of the DAP array. PEAK, the prime editing system with advanced kernel, includes a truncated 451 aa MMLV-RT, the shortest MMLV-RT variant to date derived from PE2, with enhancing mutations D200C and V101R, an engineered N-terminal VirD2 NLS, and a C-terminal SV40 NLS. PEAK can effectively pair with epegRNA encoded in the DAP array to enable enhanced prime editing activity. In contrast to prior Cas nuclease-based technologies, mvGPT employs prime editing to modify the context of the genome without causing DSBs, thus avoiding error-prone and cytotoxic DNA repair mechanisms and enhancing the safety of potential therapeutic applications. The gene activation function operates orthogonally to gene editing by leveraging the transcriptional fusion activator MS2-p65-HSF1 (MPH), which is recruited to PEAK with a spacer-tailored (11-19 nt) agRNA using engineered MS2-binding stem-loops. Finally, gene repression is independently achieved by shRNA produced by the DAP array. Gene repression mediated by the DAP array-generated shRNA avoids the incorporation of additional proteins, thereby minimizing interference with the gene editing and activation process. mvGPT is compatible with both viral and non-viral delivery modalities for enhanced applications. The compact size of mvGPT positions it as a useful tool for studying complex genomic functions or complex genetic diseases where precision and tunable perturbations on both the genome and transcriptome are required.

[0009] mvGPT represents a compact and versatile molecular technology that enables effective simultaneous and orthogonal gene editing, activation, and repression in human cells, providing better support for the genetic interrogation of complex biology, the study of complex genetic diseases, and the development of human gene therapy. mvGPT may incorporate recently engineered prime editors with different RTs and improved efficiencies (Doman et al., 2023). mvGPT may incorporate proximal dead single guide RNAs and truncated sgRNAs to modify chromatin structure and activate the edited gene to enhance PE editing efficiency further (Xiaoyi et al., 2023; Park et al., 2021). Moreover, discovery and adaptation of potent mammalian endogenous aptamer-activator systems to the mvGPT platform may be used to further reduce its size. Applications of mvGPT may extend to primary cells and animal models of complex genetic diseases, underscoring its therapeutic promise.

[0010] Provided herein are nucleic acid constructs comprising, from 5’ to 3’, at least four repetitions of a 5’ leader sequence derived from a tRNA and a small RNA, wherein the small RNAs are, independently, selected from the group consisting of a guide RNA, a nicking guide RNA, a prime editing guide RNA, an engineered prime editing guide RNA, and a short hairpin RNA. The nucleic acid construct may comprise a 3’ poly T termination signal. The nucleic acid constructs may lack any further RNA polymerase III promoter sequence. The nucleic acid constructs comprise at least one nicking guide RNA and at least one prime editing guide RNA or at least one engineered prime editing guide RNA, wherein the nicking guide RNA is positioned upstream of the prime editing guide RNA or the engineered prime editing guide RNA. The small RNA may be a prime editing guide RNA, where the nucleic acid construct further comprises an interval sequence positioned between each pegRNA and the downstream 5’ leader sequence derived from a tRNA, and preferably where the interval sequence comprises a pseudoknot positioned at the 3’ end of the prime editing guide RNA. The guide RNA may be an engineered guide RNA with an RNA aptamer insert, which can recruit a gene activator, such as, for example, an MS2 aptamer, which can recruit an MPH gene activator comprising an MS2 bacteriophage coat protein fused with the activation domains of P65 and HSF1 genes. The guide RNA may comprise a spacer sequence having a length of 11-19 nucleotides. The leader sequence may be derived from a human cysteine tRNA, optionally where the leader sequence comprises or consists of the sequence AGAGGGGGTATAGCTCAGTGGTAGAGCATTTGACTGCAGATCAAGAGGTCCCCGGTTCAAATCCGGGTGCCCCCT (SEQ ID NO: 1).
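As an illustrative sketch only (not the applicants' implementation), the construct layout recited in [0010] can be modeled in Python as alternating leader and small-RNA units followed by a poly-T stop. The leader is SEQ ID NO: 1 as printed in the text; the `Unit` type, the function name, the 10-nt placeholder sequences, the six-T terminator, and the simplified reading of "upstream" (first ngRNA before first pegRNA/epegRNA) are all assumptions made for illustration. Interval sequences are omitted.

```python
from dataclasses import dataclass

# SEQ ID NO: 1 as printed in the text (leader derived from human cysteine tRNA).
HCTRNA_LEADER = (
    "AGAGGGGGTATAGCTCAGTGGTAGAGCATTTGACTGCAGATCAAGAGGTCCCCG"
    "GTTCAAATCCGGGTGCCCCCT"
)

# Small-RNA classes recited in the text.
SMALL_RNA_KINDS = {"gRNA", "ngRNA", "pegRNA", "epegRNA", "shRNA"}
PEG_KINDS = {"pegRNA", "epegRNA"}


@dataclass(frozen=True)
class Unit:
    kind: str      # one of SMALL_RNA_KINDS
    sequence: str  # hypothetical placeholder, not a real guide or hairpin


def assemble_dap_array(units, leader=HCTRNA_LEADER, terminator="TTTTTT"):
    """Join leader + small RNA for every unit (5' to 3'), then add a poly-T stop."""
    if len(units) < 4:
        raise ValueError("expected at least four leader/small-RNA repeats")
    kinds = [u.kind for u in units]
    unknown = set(kinds) - SMALL_RNA_KINDS
    if unknown:
        raise ValueError(f"unknown small-RNA kind(s): {sorted(unknown)}")
    nick_idx = [i for i, k in enumerate(kinds) if k == "ngRNA"]
    peg_idx = [i for i, k in enumerate(kinds) if k in PEG_KINDS]
    if not nick_idx or not peg_idx:
        raise ValueError("need at least one ngRNA and one pegRNA or epegRNA")
    # Simplified reading of "ngRNA upstream of pegRNA": the first ngRNA
    # must come before the first pegRNA/epegRNA in the array.
    if nick_idx[0] > peg_idx[0]:
        raise ValueError("ngRNA must be positioned upstream of the pegRNA")
    return "".join(leader + u.sequence for u in units) + terminator


# Unit order loosely mirrors FIG. 4B: shRNA, activation guide, ngRNA, epegRNA.
example_units = [
    Unit("shRNA", "ACGTACGTAC"),
    Unit("gRNA", "GGCCAATTGG"),
    Unit("ngRNA", "CATGCATGCA"),
    Unit("epegRNA", "TGCATGCATG"),
]
example_array = assemble_dap_array(example_units)
print(len(HCTRNA_LEADER), len(example_array))
```

Because each unit carries its own copy of the leader, the array length grows by only the leader length plus the small-RNA length per added unit, which is the payload argument made for the DAP design.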

[0011] Provided herein are vectors comprising any of the nucleic acid constructs provided herein. The vector may be a plasmid, a DNA virus (e.g., an adeno-associated virus), or an RNA virus (e.g., a lentivirus). The nucleic acid construct may be in reverse orientation relative to the viral RNA genome. The vector may further comprise a Cas nuclease expression cassette, such as, for example, a prime editor. The prime editor may comprise a Cas9 nuclease with an H840A substitution and a reverse transcriptase, wherein the reverse transcriptase is a truncated MMLV reverse transcriptase of amino acids 24-474, comprising D200C and V101R substitutions, and optionally comprising a C-terminal nuclear localization signal derived from SV40 (LrgT) and/or an N-terminal nuclear localization signal derived from VirD2. The nucleic acid construct may further comprise a transcriptional fusion activator MS2-p65-HSF1 (MPH) expression cassette.

[0012] Provided herein are methods of performing multiplex gene knock-out, knock-in, knock-down, deletion, disruption, correction, replacement, reversion, integration, inversion, activation, and/or epigenetic modification comprising contacting a cell with any of the nucleic acid constructs or vectors provided herein.

[0013] Provided herein are methods of treating a disease in a patient comprising administering to the patient any of the nucleic acid constructs or vectors provided herein.

[0014] Provided herein are prime editors comprising Cas9 nuclease with an H840A substitution and a reverse transcriptase, wherein the reverse transcriptase is a truncated MMLV reverse transcriptase of amino acids 24-474, comprising D200C and V101R substitutions, and optionally comprising a C-terminal nuclear localization signal derived from SV40 (LrgT) and/or an N-terminal nuclear localization signal derived from VirD2. Provided herein is a nucleic acid encoding these prime editors.
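The residue range 24-474 recited here and the "451 aa" length used for the truncated MMLV-RT elsewhere in this description agree if the range is read as 1-based and inclusive. The short Python check below makes that arithmetic explicit. The function names and the placeholder polypeptide are hypothetical; no real RT sequence is used, and the numbering convention is an assumption.

```python
def residue_span_length(first: int, last: int) -> int:
    """Number of residues in a 1-based, inclusive range first..last."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return last - first + 1


def truncate(protein: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a protein string."""
    span = residue_span_length(first, last)
    if last > len(protein):
        raise ValueError("range extends past the end of the sequence")
    return protein[first - 1:first - 1 + span]


# Placeholder polypeptide of arbitrary length; NOT the real MMLV-RT sequence.
placeholder_rt = "X" * 700
fragment = truncate(placeholder_rt, 24, 474)
print(len(fragment))
```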

[0015] Provided herein are compositions comprising any of the vectors provided herein, a nucleic acid encoding a prime editor provided herein, and a nucleic acid encoding a transcriptional fusion activator MS2-p65-HSF1 (MPH).

[0016] Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF DRAWINGS

[0017] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

[0018] The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.

[0019] FIGS. 1A-1H. Engineering a compact and efficient prime editing system. (FIG. 1A) Schematic of the drive-and-process (DAP) array. (FIG. 1B) Schematic of the BFP fluorescent reporter. On-target 1-locus MPE converts BFP to GFP and can be used as an indicator of prime editing efficiency. EFS, elongation factor 1α short promoter. (FIG. 1C) Evaluation of PE variants with different C-terminal (left) and N-terminal NLSs using the BFP-to-GFP reporter. Dashed lines show the highest GFP conversion achieved by the best variants: C-terminal SV40 and N-terminal VirD2. X-axes list the PE variants tested. (FIG. 1D) Integration of engineered pegRNA (epegRNA) into the DAP array. (FIG. 1E) Performance of PE variants with truncated MMLV-RT, as indicated by BFP-to-GFP conversion rates. (FIG. 1F) Rational engineering of 451 aa MMLV-RT by introducing previously reported beneficial mutations. Upper dashed line indicates the D200C mutant and lower dashed line indicates the 451 aa MMLV-RT without additional mutations. (FIG. 1G) Comparison among top-performing 451 aa MMLV-RT variants. (FIG. 1H) Comparison of engineered prime editors and PE2 targeting the endogenous HEK3 locus in HEK293T cells. Bars represent the mean ± s.d. from n >= 3 independent biological replicates.

[0020] FIGS. 2A-2K. Multiplex and orthogonal gene activation with PEAK. (FIG. 2A) Schematic of gene activation using either the Cas9 nickase variant or a prime editor. TSS: transcription start site; GOI: gene of interest. (FIG. 2B) Plasmid fluorescent reporters 1-3 with 8x target protospacer, a miniCMV promoter, and green fluorescent proteins of varying half-lives. CL1 and PEST are degrons that can destabilize their fused proteins, and the half-lives of the EGFPs follow the hierarchy: EGFP-CL1-PEST < EGFP-PEST < EGFP. (FIG. 2C) Fluorescence microscope images showcasing the activation of Reporter 1 in HEK293T cells by Cas9 variants and prime editor. Scale bar indicates 100 μm. (FIG. 2D) Flow cytometry analysis of gene activation across Reporters 1-3 in HEK293T, K562 and HeLa cells, quantified by mean fluorescent intensity. (FIG. 2E) Activation of endogenous genes in HEK293T cells using dCas9+MPH+agRNA, comparing agRNAs generated by either the hCtRNA promoter or the human U6 promoter. (FIG. 2F) Multiplex endogenous gene activation using dCas9, MPH, and DAP arrays encoding multiple agRNAs. (FIG. 2G) Comparison between multiplex gene activation by SAM+DAP and by PEAK+MPH+DAP. (FIG. 2H) Schematic of a full-length agRNA and a truncated spacer agRNA. (FIGS. 2I and 2J) Activation of endogenous gene IL1B (FIG. 2I) and RHOXF2 (FIG. 2J) in HEK293T cells via PEAK, MPH, and truncated spacer agRNAs. (FIG. 2K) Endogenous IL1B and RHOXF2 gene activation using SAM system or PEAK with MPH and truncated agRNAs. A 19-nt-spacer agRNA and an 11-nt-spacer agRNA were coupled with PEAK to activate endogenous IL1B and RHOXF2, respectively. Bars represent the mean ± s.d. from n = 3 independent biological replicates.

[0021] FIGS. 3A-3F. Multiplex and orthogonal gene repression with DAP shRNAs. (FIG. 3A) Illustration of gene repression using PE2-mediated CRISPRi strategy targeting an EGFP reporter gene. (FIG. 3B) EGFP reporter repression with 33 different sgRNAs that span the EGFP gene. (FIG. 3C) Schematic of gene silencing achieved via shRNAs produced by the DAP array. (FIG. 3D) Comparative analysis of gene repression efficiencies using three different methods: PE2-mediated CRISPRi, dCas9-KRAB-MeCP2 fusion protein, and DAP shRNA-mediated RNAi. GPP: Broad Institute GPP web portal. (FIG. 3E) Repression of the endogenous MLH1 gene using shRNAs designed by different web tools and expressed from DAP arrays. GEN: GenScript siRNA design tool; INV: InvivoGen siRNA Wizard. (FIG. 3F) Multiplex gene repression with DAP-shRNA array. FWD and REV are two DAP arrays encoding the same set of shRNAs in opposite order. Experiments were performed in HEK293T cells and analysed using reverse transcription-quantitative polymerase chain reaction (RT-qPCR) or flow cytometry. Bars represent mean ± s.d. from n = 3 independent biological replicates.

[0022] FIGS. 4A-4I. Complex genetic diseases study and combinatorial delivery approaches using DAP array, PEAK and MPH. (FIG. 4A) Schematic of a hypothetical complex genetic disease model involving Wilson's disease, Type I diabetes, and Transthyretin amyloidosis. Treatment of the disease model requires orthogonal editing of the ATP7B gene, activation of the PDX1 gene, and repression of the TTR gene. (FIG. 4B) Design of a DAP array encoding an shRNA for gene silencing, a truncated agRNA for gene activation, and an ngRNA and an epegRNA for gene editing. (FIGS. 4C, 4D) Therapeutic genetic perturbation in HepG2 disease cell line transfected by plasmids encoding the DAP array, PEAK, and MPH. REV: the direction of DAP array was reversed as compared to FWD DAP array. (FIG. 4E) Genetic perturbation in HEK293T cells transfected with plasmids encoding the DAP array, PEAK and MPH to install the c.3207 C>A mutation in the ATP7B gene, upregulate the expression of RHOXF2 gene, and silence the MLH1 gene. (FIGS. 4F, 4G) Combinatorial delivery of the DAP array (AAV), PEAK (mRNA), and MPH (mRNA) into HEK293T cells. (FIGS. 4H, 4I) Combinatorial delivery using plasmids for the DAP array and MPH, and lentivirus for PEAK. All controls were eGFP transfected. A stable cell line expressing PEAK was established before introducing the DAP array and MPH via plasmid transfection. Gene editing outcomes were analysed by Sanger sequencing and transcriptional regulation was analysed by RT-qPCR. Error bars represent mean ± s.d. from n = 3 independent biological replicates.

[0023] FIGS. 5A-5N. Development of BFP fluorescent reporter. (FIG. 5A) Efficient and precise MPE with PE2 and DAP array at human endogenous genomic loci in HEK293T cells. Dashed line indicates 50% editing efficiency. NGS results were analyzed by CRISPResso2. (FIG. 5B) Thirteen (EP1.1-EP1.13) DAP arrays designed to edit H66Y (by C-to-T) of BFP were transfected with PE2 and BFP reporter (OP1.19) plasmids. OP1.21 is a positive control expressing GFP. (FIG. 5C) Fluorescence microscope data of 1-locus MPE on plasmid BFP reporter in HEK293T cells. (FIGS. 5D, 5E) EP1.11 enables higher MPE on plasmid BFP reporter than other DAP arrays validated by both mean and median fluorescent intensities. (FIGS. 5F-5H) Sorting of BFP reporter stable cell lines. BFP plasmid was packaged as lentivirus to infect HEK293T, K562 and HeLa cells with 1x, 10x and 100x multiplicity of infection (MOI). 100x MOI infected cells achieved highest BFP intensity, of which the population with top 5% BFP intensity was sorted and expanded as BFP stable cell lines. (FIGS. 5I-5N) EP1.11 enables higher MPE than other DAP arrays validated in BFP reporter stable cell lines. Error bars represent mean ± s.d. from n = 3 or 4 replicates.

[0024] FIG. 6. Development of the BFP reporter v2. The BFP gene and the DAP array EP1.11 (in reverse direction) were packaged into lentivirus and transduced into HEK293T cells with high MOI. The cell populations with the top 5% BFP intensity were sorted and expanded as the BFP v2 reporter cell line. Testing PE variants on the BFP v2 reporter cell line only requires transfection of a plasmid encoding the engineered PE variants.

[0025] FIGS. 7A-7F. 3-color reporter. (FIG. 7A) Schematic of the 3-color fluorescent reporter. On-target 3-loci MPE can rescue the fluorescent signal of EGFP, TagBFP and mCherry by insertion, substitution, and deletion, respectively. (FIG. 7B) Schematic of NLS engineering. (FIG. 7C) Schematic of representative NLS-engineered prime editors. (FIGS. 7D-7F) 3-loci MPE on 3-color reporter stable cell line testing the performance of selected PE variants. Flow cytometry was performed 3 or 5 days post-transfection. Dashed lines indicate 27.4% in (FIG. 7D), 19.2% in (FIG. 7E), and 7.2% in (FIG. 7F).

[0026] FIGS. 8A-8C. Screening of PE variants with engineered N-terminal and C-terminal NLSs. (FIG. 8A) PE N-terminal NLS screen results (1 day after transfection). The PE variant EP1.30N with an N-terminal VirD2 NLS achieved the highest BFP-to-GFP conversion. (FIG. 8B) Additional comparison on selected PE variants. EP2.5 enabled the highest BFP-to-GFP conversion in the BFP v2 stable cell line among all tested variants. (FIG. 8C) Schematic of representative PE variants. Error bars represent mean ± s.d. from n = 3 or 4 replicates.

[0027] FIGS. 9A-9E. Screening of PE variants with rationally engineered 451 aa MMLV-RT. (FIG. 9A) Three independent biological replicates were performed to screen the advantageous mutations that enable higher BFP-to-GFP conversion than the 451 aa MMLV-RT (D200C) (denoted as w/o mutation). (FIG. 9B) Summary of advantageous mutants found in (FIG. 9A). Individual introduction of V101R, D124R, or P127R into the 451 aa MMLV-RT (D200C) showed increased BFP-to-GFP conversion in all three replicates. Introduction of E117R, K398R, K193R, or T306R showed increased activity in two of the three replicates. The mutations with grey shades only showed improvement in one of the three biological replicates. (FIG. 9C) The 451 aa MMLV-RT double mutant, D200C and V101R, showed the highest BFP-to-GFP conversion. (FIGS. 9D, 9E) Additional comparison confirming that the engineered double mutant, D200C and V101R, is the most efficient variant. Error bars represent mean ± s.d. from n = 3-8 replicates.

[0028] FIGS. 10A-10F. Screening of efficient DAP array for 3-loci MPE on 3-color reporter stable cell line. EP1.16-EP1.19 (DAP arrays) were designed to rescue GFP gene on 3-color reporter. EP1.20-EP1.23 (DAP arrays) were designed to rescue TagBFP gene on 3-color reporter. EP1.24-EP1.27 (DAP arrays) were designed to rescue mCherry gene on 3-color reporter. (FIGS. 10A and 10B) EP1.18 DAP array enables the highest MPE efficiency quantified by both mean and median fluorescent intensities. (FIGS. 10C and 10D) EP1.20 DAP array enables the highest MPE efficiency evaluated by both mean and median fluorescent intensities. (FIGS. 10E and 10F) EP1.24 DAP array enables the highest MPE efficiency measured by both mean and median fluorescent intensities. Experiments were performed in HEK293T cells. Error bars represent mean ± s.d. from n = 3 or 4 replicates.

[0029] FIGS. 11A-11B. Validation of the DAP array for 3-loci MPE. (FIG. 11A) DAP arrays from EP1.18, EP1.20, EP1.24 were assembled as a single array with interval sequences for 3-loci MPE. 3-loci MPE was performed by co-transfecting the 3-color reporter stable cell line with PE2 and the EP2 DAP array. (FIG. 11B) Schematic of EP1.18, EP1.20, EP1.24 and EP2. “N” represents nicking guide RNA, “P” represents prime editing guide RNA, “I” represents interval sequence. Error bars represent mean ± s.d. from n = 3 or 4 replicates.

[0030] FIG. 12. Comparison between MPE and eMPE across different DAP array dosages. Response curve for BFP-to-GFP conversion rate versus DAP array concentration using MPE and eMPE with PE4max, PE2, and PE2*. eMPE consistently shows higher efficiency than MPE with different doses of DAP array or PE variants.

[0031] FIGS. 13A-13C. Truncation of the MMLV-RT from PE2 into more compact variants. (FIG. 13A) Schematic of the C-terminal truncation region of MMLV-RT (visualized by Geneious software). The sequence segment shown corresponds to SEQ ID NO: 3. (FIGS. 13B and 13C) Comparison among engineered PE variants with truncated MMLV-RT and PE2 with full-length MMLV-RT. Error bars represent mean ± s.d. from n = 4 replicates.

[0032] FIGS. 14A-14B. Rational engineering of MMLV-RT truncation variants. (FIG. 14A) Each of the 28 selected mutations was incorporated in the 444 aa RT for comparison. Seven were found advantageous in improving the PE efficiency of 444 aa RT. EP2.2, the 444 aa MMLV-RT truncation variant without rational mutations. (FIG. 14B) Additional comparison identifies that Q221R improves prime editing efficiency of 444 aa RT. Error bars represent mean ± s.d. from n = 3-8 replicates.

[0033] FIG. 15. Protein sequence alignment between 451 aa XMRV-RT (SEQ ID NO: 4) and 451 aa MMLV-RT (SEQ ID NO: 5).

[0034] FIGS. 16A-16B. Selected mutations to enhance the binding affinity between DNA/RNA substrate and 451 aa MMLV-RT truncation variants. (FIG. 16A) List of selected amino acids in MMLV-RT that may be spatially close to the DNA/RNA substrate using the structure of XMRV-RT (PDB: 4HKQ; Nowak et al., 2013) as the reference. (FIG. 16B) Structure (PDB: 4HKQ) of selected residues (represented as spheres in grey) of 451 aa RT within given distances of the substrate DNA/RNA.

[0035] FIG. 17. HEK293T cells with GFP fluorescence activated by indicated Cas variants of SAM system on Reporters 1-3. Flow cytometry readout of gene activation on Reporters 1-3 in HEK293T cells, quantified by the median fluorescent intensity (left) or percentage of GFP-positive cells.

[0036] FIGS. 18A-18D. EGFP activation using reporters with increased stringency. (FIG. 18A) Schematic of fluorescent reporters 4-6 with 1x target protospacer, miniCMV promoter and green fluorescent proteins of different half-lives. (FIG. 18B) Schematic of fluorescent reporters 7-9 with flipped 1x target protospacer, miniCMV promoter, and green fluorescent proteins of different half-lives. (FIGS. 18C and 18D) Flow cytometry readout of gene activation on Reporters 4-9 in HEK293T cells, quantified by (FIG. 18C) the mean fluorescent intensity or (FIG. 18D) median fluorescent intensity.

[0037] FIG. 19. Percentage of HEK293T cells with GFP fluorescence activated by indicated Cas variants of SAM system on Reporters 4-9. 3 biological replicates (each with 3-4 technical replicates) were performed. Solid short line represents median of each column.

[0038] FIGS. 20A-20D. IL1B gene activation with truncated agRNA and Cas variants in HEK293T cells. Endogenous IL1B gene activation using agRNAs with truncated spacers of different lengths and (FIG. 20A) wtCas9, (FIG. 20B) dCas9, (FIG. 20C) nCas9(D10A), and (FIG. 20D) nCas9 (H840A). NC, negative control, GFP transfected. Error bars represent mean ± s.d. from n = 3 replicates.

[0039] FIG. 21. IL1B gene activation in HEK293T cells with truncated agRNA and PE2. Truncating the spacer region of IL1B agRNA from 20 nt to 19 nt enhanced the gene activation ability of PE2.

[0040] FIG. 22. PDX1 gene activation in HepG2 cells with truncated agRNA and PEAK. Truncating the spacer region of PDX1 agRNA from 20 nt to 11 nt led to significantly improved gene activation ability of PEAK.

[0041] FIG. 23. Designed sgRNAs tiling the EFS-EGFP reporter. 33 sgRNAs that tile the EFS promoter and the EGFP coding sequence in the EFS-EGFP reporter were designed for evaluating the CRISPRi gene repression strategy. The EFS-EGFP reporter sequence is provided as SEQ ID NO: 6.

[0042] FIGS. 24A-24B. CRISPRi with PEs on EFS-EGFP reporter in HEK293T cells. (FIG. 24A) sgRNA6 enabled PE2 with efficient gene repression. Multiplexing sgRNA5-8 did not further improve the gene repression ability of PE2. (FIG. 24B) With sgRNA6, dCas9-KRAB-MeCP2 enabled the most effective gene repression as compared to PE2, dCas9, and CRISPR-Off V2.1 (Nunez et al., 2021). Error bars represent mean ± s.d. from n = 3 or 4 replicates.

[0043] FIG. 25. Fluorescent microscopy comparison of four types of CRISPRi strategy in HEK293T cells. dCas9-KRAB-MeCP2 enabled the most efficient gene repression. Scale bar: 100 μm.

[0044] FIG. 26. Influence of 3’ poly-T sequence between shRNA and hCtRNA in the DAP array. 0 to 6 thymines were placed between the GPP shRNA 1 and the second hCtRNA in a DAP array. Results were assessed by EGFP repression evaluated by FACS. Error bars represent mean ± s.d. from n = 4 replicates.

[0045] FIG. 27. Effect of MMR repression on editing efficiency in HEK293T cells. PEAK only group (Left) was transfected with an eMPE array targeting the HEK3 +5G>C loci. PEAK + MMR Repression group (Right) was transfected with the eMPE array and DAP array containing shRNA repressing MLH1, MSH2, MSH6, and PMS2. MLH1 expression (green) and editing efficiency normalized to PEAK group editing efficiency (blue) are shown for each group. Error bars represent mean ± s.d. from n = 3 replicates.

[0046] FIG. 28. Comparison among different shRNAs targeting the TTR gene for repression. The shRNA sequence of Patisiran enabled highly efficient gene repression in HepG2 cells. Error bars represent mean ± s.d. from n = 3 replicates.

[0047] FIG. 29. Modeling of Wilson’s disease with PEAK and DAP array in HEK293T cells. ePE3, eMPE with nicking gRNA and epegRNA; ePE2, only epegRNA was used. The eMPE targeting H1069Q was used in the DAP array described in FIG. 4. Error bars represent mean ± s.d. from n = 3 replicates.

[0048] FIGS. 30A-30C. Single cell analysis of genetic perturbations in HEK293T cells. (FIGS. 30A and 30B) Relative MLH1 and RHOXF2 expression in single cells from the control group (green, n=16) or treated group (orange, n=16). Treated group was transfected with eGFP (for selection), PEAK, MPH, and the DAP array targeting each perturbation. Control cells were transfected with eGFP only. (FIG. 30C) Representation of single cells harboring 3 simultaneous genetic perturbations (ATP7B c.3207 C>A editing, MLH1 knockdown, and RHOXF2 activation), 2 simultaneous perturbations, or 1 perturbation. Box plots show the median, 25th and 75th percentiles, maximum, and minimum. Treated single cells were considered successfully activated if the relative RHOXF2 expression was above the 75th percentile of the control cells. Similarly, treated single cells were considered successfully repressed if the relative MLH1 expression was below the 25th percentile of the control cells.
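The per-cell calling rule described for FIG. 30C can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names and dictionary keys are hypothetical, the expression values are made-up numbers chosen only to exercise the logic, and the "inclusive" quantile method is an assumption because the text does not say how percentiles were computed.

```python
from statistics import quantiles


def control_thresholds(control_rhoxf2, control_mlh1):
    """Return (activation cutoff, repression cutoff) from control-cell expression."""
    # quantiles(..., n=4) returns the 25th, 50th and 75th percentile cut points.
    act_cutoff = quantiles(control_rhoxf2, n=4, method="inclusive")[2]  # 75th
    rep_cutoff = quantiles(control_mlh1, n=4, method="inclusive")[0]    # 25th
    return act_cutoff, rep_cutoff


def count_perturbations(cell, act_cutoff, rep_cutoff):
    """Count how many of the three perturbations are called in one treated cell."""
    activated = cell["RHOXF2"] > act_cutoff   # above control 75th percentile
    repressed = cell["MLH1"] < rep_cutoff     # below control 25th percentile
    return int(cell["edited"]) + int(activated) + int(repressed)


# Made-up relative expression values, for illustration only.
control_rhoxf2 = [1.0, 2.0, 3.0, 4.0, 5.0]
control_mlh1 = [1.0, 2.0, 3.0, 4.0, 5.0]
act_cutoff, rep_cutoff = control_thresholds(control_rhoxf2, control_mlh1)

treated = [
    {"edited": True, "RHOXF2": 6.0, "MLH1": 0.5},
    {"edited": False, "RHOXF2": 3.0, "MLH1": 1.5},
    {"edited": False, "RHOXF2": 4.0, "MLH1": 2.0},
]
counts = [count_perturbations(c, act_cutoff, rep_cutoff) for c in treated]
print(counts)
```

The comparisons are strict, so a treated cell sitting exactly on a control cutoff is not called as activated or repressed; whether the original analysis treated ties this way is not stated.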

[0049] FIG. 31. Independence of PEAK, MPH, and the DAP array in HEK293T cells. ATP7B c.3207 C>A editing efficiency (orange), MLH1 expression (blue), and RHOXF2 expression (green) are shown for HEK293T cells transfected with PEAK, MPH, and the DAP array targeting each perturbation. Additional groups lacking either the MPH or PEAK are presented, along with an eGFP transfected control. Error bars represent mean ± s.d. from n = 3 replicates.

[0050] FIG. 32. PEI Max transfection protocol in HEK293T cells. Seed 2e4 cells in each well of a 96 well plate 16 h before transfection. OptiMEM dilution volume: 5 µl (DNA) + 5 µl (PEI Max). Waiting time after mixing DNA and PEI: 5 min. Total DNA amount: 250 ng. PEI Max amount: 0.5 µl. Error bars represent mean ± s.d. from n = 4 replicates. A plasmid expressing EGFP was used as DNA input in this optimization experiment.

[0051] FIGS. 33A-33D. Example gating strategy used in this study. (FIG. 33A) Gating strategy of BFP v2 HEK293T stable cell line control (w/o transfection). (FIG. 33B) Gating strategy of BFP v2 HEK293T stable cell line transfected by PEAK after two days. (FIG. 33C) Gating strategy of 3-color reporter HEK293T stable cell line control (w/o transfection). (FIG. 33D) Gating strategy of 3-color reporter HEK293T stable cell line transfected by EP2.5 after 3 days.

[0052] FIG. 34. A 4-loci DAP multiplex array for base editing was constructed to test multiplex editing in mouse N2a cells. When tested, minimal editing activity was observed in the multiplex array delivered with the C>T base editor tadCBEd, especially compared to the efficiently edited singleplex controls with individual U6 promoters.

[0053] FIG. 35. Screening of various mouse tRNAs that are highly expressed in the mouse brain using a duplex editing system at highly efficient editing sites in N2a cells. Mouse CtRNA and QtRNA had high editing efficiency, substantially outperforming the original tRNA design. When screening additional leader and trailer sequences before and after the tRNA used for tRNA recognition and processing, the editing efficiency was rescued, performing similarly to the singleplexed controls. The Hpd-E sequence is provided as SEQ ID NO: 7. The Pcsk9-A sequence is provided as SEQ ID NO: 8.

DETAILED DESCRIPTION

[0054] Programmable and modular systems capable of orthogonal genomic and transcriptomic perturbations are crucial for biological research and treating human genetic diseases. Provided herein is minimal versatile genetic perturbation technology (mvGPT), a compact and multiplexed RNA expression system for base and prime editing. mvGPT provides a flexible toolkit designed for simultaneous and orthogonal gene editing, activation, and repression in human cells. The mvGPT combines an engineered compact prime editor (PE), a fusion activator MS2-p65-HSF1 (MPH), and a drive-and-process (DAP) multiplex array that produces RNAs tailored to different types of genetic perturbation. mvGPT can precisely edit the human genome via PE coupled with a prime editing guide RNA and a nicking guide RNA, activate endogenous gene expression using PE with a truncated single guide RNA containing MPH-recruiting MS2 aptamers, and silence endogenous gene expression via RNA interference with a short-hairpin RNA. The DAP multiplex array uses a 75 bp human cysteine tRNA (hCtRNA) as a promoter and spacer between RNA elements (FIG. 1A). Following endogenous hCtRNA processing, individual RNA subunits are released from the array, thus avoiding cumbersome individual promoters while retaining similar levels of RNA expression. When the DAP multiplex array is used in mouse studies, the mouse CtRNA or QtRNA may be used in place of the human CtRNA. When integrated into mvGPT, the drive-and-process (DAP) array (Yuan & Gao, 2022; Zhao et al., 2023) orchestrates the production of the RNAs with distinct functionalities for endogenous gene editing, activation, and repression at independent genomic loci. For gene editing, mvGPT utilizes pegRNA and nicking guide RNA (ngRNA) to direct our engineered compact PEs to the target loci, efficiently introducing precise gene edits.
For gene activation, a truncated sgRNA containing MS2-binding stem loops recruits the MPH activation complex to PE, upregulating gene transcription. Finally, the DAP array-generated shRNA ensures potent gene repression through RNA interference (RNAi). mvGPT’s capacity for simultaneous and orthogonal genetic interventions was demonstrated by correcting Wilson’s disease-related c.3207C>A mutation in the ATP7B gene, upregulating the PDX1 gene for treating Type I diabetes, and repressing the TTR gene to manage transthyretin amyloidosis. Moreover, mvGPT was successfully delivered using mRNA, AAV, and lentivirus, highlighting its broad compatibility with different delivery systems and in vivo applications.

I. Drive-and-process (DAP) arrays for multiplex genome engineering

[0055] Provided herein are drive-and-process (DAP) array architectures for multiplex genome and cellular engineering, including nuclease editing, base editing, prime editing, gene activation, gene repression, and genetic perturbation, in virtually any cell type or organism with minimal genetic payloads. DAP arrays are composed of engineered tRNAs and small RNAs of interest, in a tandemly assembled single-array architecture.

[0056] To achieve the highest performance of a DAP array, the 5’ leader sequence of any tRNA can be tuned, nucleotide by nucleotide, to identify the most efficient tRNA sequence for the DAP array. For example, an engineered 75-nt human cysteine tRNA (hCtRNA) is shown below. The hCtRNAs on a DAP array are used to both express and release the small RNAs.

[0057] The sequence of engineered hCtRNA from 5’ to 3’ is as follows: AGAGGGGGTATAGCTCAGTGGTAGAGCATTTGACTGCAGATCAAGAGGTCCCCGGTTCAAATCCGGGTGCCCCCT (SEQ ID NO: 1).

[0058] To express small RNAs, the DNA sequence of hCtRNA is used as an RNA polymerase III promoter, which can efficiently drive the expression of the small RNAs on a DAP array. To release small RNAs after expression, the endogenous tRNA processing machinery, RNase P and RNase Z, can recognize the hCtRNAs and separate the small RNAs from the expressed DAP array. DAP arrays function independently and efficiently, and are compatible with viral delivery, including DNA viruses, such as adeno-associated virus (AAV), and RNA viruses, such as lentivirus.

[0059] When the DAP multiplex array is used in mouse studies, the mouse CtRNA or QtRNA may be used in place of the human CtRNA.

[0060] For multiplex CRISPR-Cas nuclease editing or base editing with DAP array, hCtRNA and guide RNA (gRNA) are tandemly assembled, in architecture of “hCtRNA-gRNA(1)-hCtRNA-gRNA (2) ... hCtRNA-gRNA (N)-poly T termination signal” (FIG. 1). The constructed DAP array can be expressed either in trans or in cis with Cas nuclease or base editor of choice. To construct a single array encoding DAP array and Cas nuclease or base editor with or without other functional proteins, the location of the DAP array should be independent of any existing gene cassettes. For example, DAP arrays can be placed upstream of the Cas protein expression transcript, or downstream of the RNA Pol II termination signal.

[0061] For multiplex prime editing (MPE) with a DAP array, two different DAP array architectures are used to address prime editing with or without using nicking gRNA (ngRNA). For MPE without ngRNA, engineered prime editing gRNA (epegRNA) should be used to assemble the DAP array, in the architecture of “hCtRNA-epegRNA (1)-hCtRNA-epegRNA (2) . . . hCtRNA-epegRNA (n)-poly T termination signal”, because the hairpin structure used in epegRNA can prevent tRNA processing machinery from disrupting the editing information encoded in the 3’ extension of pegRNA. For MPE with ngRNA, ngRNA should be placed upstream of the pegRNA or epegRNA, in the architecture of “hCtRNA-ngRNA (1)-hCtRNA-pegRNA / epegRNA (1) ... hCtRNA-ngRNA (n)-hCtRNA-pegRNA / epegRNA (n)-poly T termination signal”. Notably, an interval sequence (I) should be used to prevent pegRNA from being disrupted by tRNA processing. Other uses of the MPE DAP array are the same as described above.

[0062] For multiplex gene activation with a DAP array, gRNAs are engineered activation gRNAs (agRNA) with RNA aptamers that can recruit gene activators. For example, MS2 aptamers can be inserted in SpCas9 gRNA to recruit an MPH gene activator, which is composed of MS2 bacteriophage coat protein fused with the activation domains of p65 and HSF1. The DAP array for gene activation can be constructed in the architecture of “hCtRNA-agRNA (1)-hCtRNA-agRNA (2) ... hCtRNA-agRNA (n)-poly T termination signal”. Other uses of the DAP array are the same as described above.

[0063] For multiplex gene repression with a DAP array, short-hairpin RNA (shRNA) is used to silence target gene expression via RNA interference (RNAi). The hCtRNA can efficiently express shRNA to knock down endogenous genes. The DAP array for gene repression can be constructed in the architecture of “hCtRNA-shRNA (1)-hCtRNA-shRNA (2) ... hCtRNA-shRNA (n)-poly T termination signal”. Other uses of the DAP array are the same as described above.

[0064] To enable multiple different types of genetic perturbations, including combinations of gene editing, gene activation, and gene repression, the DAP array can be constructed in the architecture of “hCtRNA-small RNA (1)-hCtRNA-small RNA (2) . . . hCtRNA-small RNA (n)-poly T termination signal”, in which each small RNA can be a gRNA, agRNA, ngRNA, pegRNA, epegRNA, shRNA, etc., depending on the genetic perturbations to be included. Other uses of the DAP array are the same as described above.
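The general “hCtRNA-small RNA (1) ... hCtRNA-small RNA (n)-poly T termination signal” architecture can be sketched as a simple concatenation at the DNA level. This is an illustrative sketch only: the function name is hypothetical, the six-thymidine terminator length is an assumption (the source says only “poly T termination signal”), and any small RNA sequences passed in are placeholders rather than sequences from this disclosure. The hCtRNA sequence is SEQ ID NO: 1.

```python
# Engineered human cysteine tRNA, SEQ ID NO: 1 (75 nt).
HCTRNA = (
    "AGAGGGGGTATAGCTCAGTGGTAGAGCATTTGACTGCAGATCAAGAGGTCCCCG"
    "GTTCAAATCCGGGTGCCCCCT"
)

# Assumption: six thymidines as the poly T signal; the source gives no length.
POLY_T = "TTTTTT"


def assemble_dap_array(small_rnas, trna=HCTRNA, terminator=POLY_T):
    """Concatenate tRNA-small RNA units and append a poly T terminator.

    small_rnas: ordered list of (name, DNA sequence) pairs, e.g. gRNA,
    agRNA, ngRNA, pegRNA, epegRNA, or shRNA encodings.
    """
    if not small_rnas:
        raise ValueError("at least one small RNA is required")
    units = []
    for name, seq in small_rnas:
        seq = seq.upper()
        if set(seq) - set("ACGT"):
            raise ValueError(f"{name}: sequence must contain only A, C, G, T")
        # Each small RNA is preceded by its own tRNA (promoter and spacer).
        units.append(trna + seq)
    return "".join(units) + terminator
```

Each unit carries its own tRNA, so adding a perturbation means appending one more (name, sequence) pair rather than one more promoter cassette.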

[0065] DAP arrays can be packaged in viral vectors. The orientation of the DAP array in DNA viral vectors does not influence its effectiveness. However, the DAP array must be reversed in RNA viral vectors to remain functional, which prevents the endogenous tRNA processing machinery from disrupting the viral RNA genome.
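The orientation rule above can be sketched as follows. The sketch assumes that “reversed” means placing the array in reverse-complement (antisense) orientation relative to the viral genome, so that the tRNA structures are not present in the genomic RNA strand; the source does not spell this out, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_for_vector(dap_array, genome_type):
    """Choose the DAP array orientation for a viral vector.

    genome_type: "DNA" (e.g. AAV) keeps the array as given, since
    orientation is stated not to matter; "RNA" (e.g. lentivirus) flips it.
    Assumption: 'reversed' is implemented here as reverse complement.
    """
    if genome_type == "DNA":
        return dap_array
    if genome_type == "RNA":
        return reverse_complement(dap_array)
    raise ValueError("genome_type must be 'DNA' or 'RNA'")
```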

[0066] In summary, the DAP array can be engineered and used for any genetic perturbation with high efficiency. DAP arrays can be used for multiplex gene knock-out, knock-in, knock-down, deletion, disruption, correction, replacement, reversion, integration, inversion, activation, epigenetic modification, and their combinations or orthogonal uses.

II. CRISPR Systems

[0067] Gene editing is a technology that allows for the modification of target genes within living cells. Recently, harnessing the bacterial immune system of CRISPR to perform on demand gene editing revolutionized the way scientists approach genomic editing. The Cas9 protein of the CRISPR system, which is an RNA guided DNA endonuclease, can be engineered to target new sites with relative ease by altering its guide RNA sequence. This discovery has made sequence specific gene editing functionally effective.

[0068] In general, “CRISPR system” refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g. tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), and / or other sequences and transcripts from a CRISPR locus.

[0069] The CRISPR / Cas nuclease or CRISPR / Cas nuclease system can include a non-coding guide RNA molecule, which sequence-specifically binds to DNA, and a Cas protein (e.g., Cas9) with nuclease functionality (e.g., two nuclease domains). One or more elements of a CRISPR system can derive from a type I, type II, or type III CRISPR system, e.g., derived from a particular organism comprising an endogenous CRISPR system, such as Streptococcus pyogenes.

[0070] The CRISPR system can induce double stranded breaks (DSBs) at the target site, followed by disruptions as discussed herein. In other embodiments, Cas9 variants, deemed “nickases,” are used to nick a single strand at the target site. Paired nickases can be used, e.g., to improve specificity, each directed by a pair of different gRNAs targeting sequences such that upon introduction of the nicks simultaneously, a 5' overhang is introduced. In other embodiments, catalytically inactive Cas9 is fused to a heterologous effector domain such as a base editing enzyme or a reverse transcriptase.

[0071] The CRISPR enzyme can be Cas9 (e.g., from S. pyogenes or S. pneumoniae or S. aureus or S. auricularis or S. lugdunensis). The CRISPR enzyme can direct cleavage of one or both strands at the location of a target sequence, such as within the target sequence and / or within the complement of the target sequence. The vector can encode a CRISPR enzyme that is mutated with respect to a corresponding wild-type enzyme such that the mutated CRISPR enzyme lacks the ability to cleave one or both strands of a target polynucleotide containing a target sequence. For example, an aspartate-to-alanine substitution (D10A) in the RuvC I catalytic domain of Cas9 from S. pyogenes converts Cas9 from a nuclease that cleaves both strands to a nickase (cleaves a single strand). In some embodiments, a Cas9 nickase may be used in combination with guide sequence(s), e.g., two guide sequences, which target respectively sense and antisense strands of the DNA target. This combination allows both strands to be nicked and used to induce NHEJ or HDR.

[0072] In some embodiments, a Cas9 polypeptide can be a deactivated (e.g., mutated, dCas9) Cas9 polypeptide, wherein the deactivated Cas9 does not comprise HNH and / or RuvC nickase activities. The HNH and RuvC motifs have been characterized in S. thermophilus (see, e.g., Sapranauskas et al. Nucleic Acids Res. 39:9275-9282 (2011)) and one of skill would be able to identify and mutate these motifs in Cas9 polypeptides from other organisms. For example, the mutations D10A and H840A completely inactivate the nuclease activity of S. pyogenes Cas9. Notably, a Cas9 polypeptide in which the HNH motif and / or RuvC motif is / are specifically mutated so that the nickase activity is reduced, deactivated, and / or absent, can retain one or more of the other known Cas9 functions including DNA, RNA and PAM recognition and binding activities and thus remain functional with regard to these activities, while non-functional with regard to one or both nickase activities.

[0073] In some embodiments, an enzyme coding sequence encoding the CRISPR enzyme is codon optimized for expression in particular cells, such as eukaryotic cells. The eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human primate. In general, codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization.
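Codon optimization as defined above, replacing codons with host-preferred synonyms while keeping the amino acid sequence unchanged, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the preferred-codon table passed by the caller is an illustrative input rather than measured usage data, and real optimization tools also weigh factors this sketch ignores. Only the standard genetic code is built in.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna, preferred):
    """Swap each codon for the caller-supplied preferred synonymous codon.

    preferred: dict mapping one-letter amino acid to a codon. Amino acids
    missing from the dict keep their native codon.
    """
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons)
```

With a hypothetical table `{"L": "CTG", "A": "GCC", "K": "AAG"}`, the toy sequence `ATGCTAGCAAAATAA` becomes `ATGCTGGCCAAGTAA`, and both translate to the same peptide.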

[0074] A single-molecule guide RNA (sgRNA) can comprise, in the 5' to 3' direction, an optional spacer extension sequence, a spacer sequence, a minimum CRISPR repeat sequence, a single-molecule guide linker, a minimum tracrRNA sequence, a 3' tracrRNA sequence and / or an optional tracrRNA extension sequence. The optional tracrRNA extension can comprise elements that contribute additional functionality (e.g., stability) to the guide RNA. The single-molecule guide linker can link the minimum CRISPR repeat and the minimum tracrRNA sequence to form a hairpin structure. The optional tracrRNA extension can comprise one or more hairpins. In particular embodiments, the disclosure provides for an sgRNA comprising a spacer sequence and a tracrRNA sequence.

[0075] The CRISPR enzyme may be part of a fusion protein comprising one or more heterologous protein domains. A CRISPR enzyme fusion protein may comprise any additional protein sequence, and optionally a linker sequence between any two domains. Examples of protein domains that may be fused to a CRISPR enzyme include, without limitation, epitope tags, reporter gene sequences, and protein domains having one or more of the following activities: methylase activity, demethylase activity, transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, RNA cleavage activity, nucleic acid binding activity, base editing activity, or reverse transcription activity. Non-limiting examples of epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags. Examples of reporter genes include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP). A CRISPR enzyme may be fused to a gene sequence encoding a protein or a fragment of a protein that binds DNA molecules or binds other cellular molecules, including but not limited to maltose binding protein (MBP), S-tag, Lex A DNA binding domain (DBD) fusions, GAL4A DNA binding domain fusions, and herpes simplex virus (HSV) BP16 protein fusions. Additional domains that may form part of a fusion protein comprising a CRISPR enzyme are described in US 20110059502, incorporated herein by reference.

III. Prime Editors

[0076] Prime editing is a versatile and precise genome editing method that directly writes new genetic information into a specified DNA site using a CRISPR system working in association with a polymerase (i.e., in the form of a fusion protein or otherwise provided in trans with the CRISPR system), wherein the prime editing system is programmed with a prime editing (pe) guide RNA (“pegRNA”) that both specifies the target site and templates the synthesis of the desired edit in the form of a replacement DNA strand by way of an extension (either DNA or RNA) engineered onto a guide RNA (e.g., at the 5' or 3' end, or at an internal portion of a guide RNA). As such, prime editors allow for prime editing on a target nucleotide sequence in the presence of a pegRNA (or “extended guide RNA”). The pegRNA consists of (from 5’ to 3’) a sgRNA that anneals to a target site, a scaffold for the nCas9, a reverse transcription template (RT template) containing the desired edit, and a primer binding site (PBS) that binds to the non-target strand. The RT template can be programmed to introduce any type of edit, including all possible base transitions and transversions, and insertions and deletions of nucleotides of any length. The prime editing system is further enhanced by including an additional nicking sgRNA that increases editing efficiency by favoring DNA repair to replace the non-edited strand. The term “prime editor” refers to fusion constructs comprising a Cas9 nickase and a reverse transcriptase. The term “prime editor” may refer to the fusion protein or to the fusion protein complexed with a pegRNA, and / or further complexed with a second-strand nicking sgRNA. In some embodiments, the prime editor may also refer to the complex comprising a fusion protein (reverse transcriptase fused to a Cas9), a pegRNA, and a regular guide RNA capable of directing the second-site nicking step of the non-edited strand as described herein. 
In other embodiments, the reverse transcriptase component of the “prime editor” may be provided in trans. Further examples of prime editors and their use are provided in PCT Publn. WO2020191249, which is incorporated by reference herein in its entirety. The RT may be MMLV-RT, which has the sequence of: TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLI IPLKATSTPVS IK QYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVED IHPTVPNPYNLLSGLPP SHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGI SGQLTWT RLPQGFKNSPTLFDEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLG NLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQLREFLGTAGFC RLWIPGFAEMAAPLYP LTKTGTLFNWGPDQQKAYQE IKQALLTAPALGLPDLTKPFELFVDE KQGYAKGVLTQKLGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVI LAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQHNCL DILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSA QRAELIALTQALKMAEGKKLNVYTDSRYAFATAHIHGE IYRRRGLLTSEGKE IKNKDE ILAL LKALFLPKRLS I IHCP GHQKGHSAEARGNRMADQAARKAAITETPDTSTLLIENSSPYTSEH F (SEQ ID NO: 2). The MMLV-RT may comprise D524F, E562Q, and / or D583N substitutions. The MMLV-RT may also comprise D200C and / or V101R substitutions.
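The 5’ to 3’ order of pegRNA components described in this paragraph (target-annealing sequence, scaffold, RT template, primer binding site) can be sketched as a small data structure. All names here are hypothetical, the sequences are caller-supplied placeholders, and the optional 3’ motif field is an assumption used to represent the structured element of an epegRNA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PegRNA:
    """pegRNA components in the 5' to 3' order given in the description."""

    spacer: str          # anneals to the target site
    scaffold: str        # scaffold bound by the Cas9 nickase
    rt_template: str     # encodes the desired edit
    pbs: str             # primer binding site
    motif_3prime: str = ""  # assumption: optional epegRNA 3' structure

    def sequence(self):
        """Concatenate the components 5' to 3'."""
        return (
            self.spacer
            + self.scaffold
            + self.rt_template
            + self.pbs
            + self.motif_3prime
        )

    def extension(self):
        """Return the 3' extension (RT template followed by PBS)."""
        return self.rt_template + self.pbs
```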

[0077] While INDEL profiles from CRISPR-induced DSBs may have some sequence-dependent predictability in insertion and deletion outcomes (Chakrabarti et al., 2019), the INDEL profiles are nonetheless heterogeneous in their outcome and are site-specific. NHEJ-based INDEL correction thus may produce both non-productive edits and productive edits in restoring the ORF. Prime editing has an advantage of specifying the exact insertion or deletion outcome for exon reframing, thereby ensuring that all of the edits are productive in restoring the correct ORF. Furthermore, in NHEJ-based INDEL correction, a non-productive edit prevents the sgRNA from re-annealing to the site and inducing a productive edit. In prime editing, a non-productive event (i.e. no editing as the edited strand is not successfully incorporated leaving the native sequence intact) leaves the sgRNA target site still amenable to re-annealing and another attempt at inducing the desired edit.

[0078] Prime editing can theoretically be used to correct all possible point mutations including base pair transitions and transversions, whereas base editors are limited only to transitions of A:T to G:C or C:G to T:A. In addition, theoretically prime editing is not limited to an editing window as base editing is. Also, prime editing can be used to destroy splice sites. As prime editing necessitates the coordination of multiple pegRNA components for editing, such as the spacer sequence, the primer binding site (PBS), and the reverse transcriptase (RT) template, it is likely that editing events at off-target sites are minimal. However, a recent study demonstrated that two opposite strand nicks using the PE3 system can cause undesired editing outcomes in mouse zygote injections (Aida et al., 2020). These undesired editing outcomes were reduced by utilizing a sgRNA that is mutation-specific and can nick only after successful editing and resolution of the pegRNA nick (PE3b system). Nucleotide editing technologies have the potential to eliminate disease-causing mutations following a single treatment.

IV. Vectors

[0079] Any type of vector may be used for administration of a system described herein. In some embodiments, the vector is a lipid nanoparticle. In some embodiments, the vector is a viral vector. In some embodiments, the viral vector is a non-integrating viral vector (i.e., that does not insert sequence from the vector into a host chromosome). In some embodiments, the viral vector is an adeno-associated virus vector (AAV), a lentiviral vector, an integrase-deficient lentiviral vector, an adenoviral vector, a vaccinia viral vector, an alphaviral vector, or a herpes simplex viral vector.

[0080] Where a vector is used, it may be a viral vector, such as a non-integrating viral vector. In some embodiments, the viral vector is an adeno-associated virus vector, a lentiviral vector, an integrase-deficient lentiviral vector, an adenoviral vector, a vaccinia viral vector, an alphaviral vector, or a herpes simplex viral vector.

[0081] In particular embodiments, the vector is an AAV vector. AAV is a small virus that infects humans and some other primate species. AAV is not currently known to cause disease. The virus causes a very mild immune response, lending further support to its apparent lack of pathogenicity. In many cases, AAV vectors integrate into the host cell genome, which can be important for certain applications, but can also have unwanted consequences. Gene therapy vectors using AAV can infect both dividing and quiescent cells and persist in an extrachromosomal state without integrating into the genome of the host cell, although in the native virus some integration of virally carried genes into the host genome does occur. These features make AAV a very attractive candidate for creating viral vectors for gene therapy, and for the creation of isogenic human disease models. Recent human clinical trials using AAV for gene therapy in the retina have shown promise. AAV belongs to the genus Dependoparvovirus, which in turn belongs to the family Parvoviridae. The virus is a small (20 nm) replication-defective, nonenveloped virus.

[0082] Wild-type AAV has attracted considerable interest from gene therapy researchers due to a number of features. Chief amongst these is the virus's apparent lack of pathogenicity. It can also infect non-dividing cells and has the ability to stably integrate into the host cell genome at a specific site (designated AAVS1) in the human chromosome 19. This feature makes it somewhat more predictable than retroviruses, which present the threat of a random insertion and of mutagenesis, which is sometimes followed by development of a cancer. The AAV genome integrates most frequently into the site mentioned, while random incorporations into the genome take place with a negligible frequency. Development of AAVs as gene therapy vectors, however, has eliminated this integrative capacity by removal of the rep and cap from the DNA of the vector. The desired gene together with a promoter to drive transcription of the gene is inserted between the inverted terminal repeats (ITR) that aid in concatemer formation in the nucleus after the single-stranded vector DNA is converted by host cell DNA polymerase complexes into double-stranded DNA. AAV-based gene therapy vectors form episomal concatemers in the host cell nucleus. In non-dividing cells, these concatemers remain intact for the life of the host cell. In dividing cells, AAV DNA is lost through cell division, since the episomal DNA is not replicated along with the host cell DNA. Random integration of AAV DNA into the host genome is detectable but occurs at very low frequency. AAVs also present very low immunogenicity, seemingly restricted to generation of neutralizing antibodies, while they induce no clearly defined cytotoxic response. This feature, along with the ability to infect quiescent cells, underlies their advantage over adenoviruses as vectors for human gene therapy.

[0083] The AAV genome is built of single-stranded deoxyribonucleic acid (ssDNA), either positive- or negative-sensed, which is about 4.7 kilobase long. The genome comprises inverted terminal repeats (ITRs) at both ends of the DNA strand, and two open reading frames (ORFs): rep and cap. The former is composed of four overlapping genes encoding Rep proteins required for the AAV life cycle, and the latter contains overlapping nucleotide sequences of capsid proteins: VP1, VP2 and VP3, which interact together to form a capsid of an icosahedral symmetry.

[0084] The Inverted Terminal Repeat (ITR) sequences comprise 145 bases each. They were named so because of their symmetry, which was shown to be required for efficient multiplication of the AAV genome. The feature of these sequences that gives them this property is their ability to form a hairpin, which contributes to so-called self-priming that allows primase-independent synthesis of the second DNA strand. The ITRs were also shown to be required for both integration of the AAV DNA into the host cell genome (19th chromosome in humans) and rescue from it, as well as for efficient encapsidation of the AAV DNA combined with generation of a fully assembled, deoxyribonuclease-resistant AAV particles.

[0085] With regard to gene therapy, ITRs seem to be the only sequences required in cis next to the therapeutic gene: structural (cap) and packaging (rep) proteins can be delivered in trans. With this assumption many methods were established for efficient production of recombinant AAV (rAAV) vectors containing a reporter or therapeutic gene. However, it was also published that the ITRs are not the only elements required in cis for the effective replication and encapsidation. A few research groups have identified a sequence designated cis-acting Rep-dependent element (CARE) inside the coding sequence of the rep gene. CARE was shown to augment the replication and encapsidation when present in cis.

[0086] On the “left side” of the genome there are two promoters called p5 and p19, from which two overlapping messenger ribonucleic acids (mRNAs) of different length can be produced. Each of these contains an intron which can be either spliced out or not. Given these possibilities, four various mRNAs, and consequently four various Rep proteins with overlapping sequence can be synthesized. Their names depict their sizes in kilodaltons (kDa): Rep78, Rep68, Rep52 and Rep40. Rep78 and 68 can specifically bind the hairpin formed by the ITR in the self-priming act and cleave at a specific region, designated terminal resolution site, within the hairpin. They were also shown to be necessary for the AAVS1-specific integration of the AAV genome. All four Rep proteins were shown to bind ATP and to possess helicase activity. It was also shown that they upregulate the transcription from the p40 promoter (mentioned below) but downregulate both p5 and p19 promoters.

[0087] The right side of a positive-sensed AAV genome encodes overlapping sequences of three capsid proteins, VP1, VP2 and VP3, which start from one promoter, designated p40. The molecular weights of these proteins are 87, 72 and 62 kiloDaltons, respectively. The AAV capsid is composed of a mixture of VP1, VP2, and VP3 totaling 60 monomers arranged in icosahedral symmetry in a ratio of 1 : 1 : 10, with an estimated size of 3.9 MegaDaltons.
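The capsid figures in this paragraph can be checked with a short calculation. The function name is hypothetical; the inputs (60 monomers, a 1 : 1 : 10 ratio, and molecular weights of 87, 72 and 62 kDa) are taken from the text above.

```python
def capsid_composition(total=60, ratio=(1, 1, 10), masses_kda=(87, 72, 62)):
    """Split `total` monomers by `ratio` and sum the capsid mass in kDa.

    Returns ((VP1, VP2, VP3) copy numbers, total mass in kDa).
    """
    parts = sum(ratio)
    if total % parts:
        raise ValueError("ratio does not divide the monomer count evenly")
    unit = total // parts
    counts = tuple(r * unit for r in ratio)
    mass = sum(c * m for c, m in zip(counts, masses_kda))
    return counts, mass


counts, mass_kda = capsid_composition()
print(counts, mass_kda)  # (5, 5, 50) 3895
```

A 1 : 1 : 10 ratio over 60 monomers gives 5 copies of VP1, 5 of VP2 and 50 of VP3, for a total of 3895 kDa, which agrees with the stated estimate of about 3.9 MegaDaltons.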

[0088] The cap gene produces an additional, non-structural protein called the Assembly-Activating Protein (AAP). This protein is produced from ORF2 and is essential for the capsid-assembly process. The exact function of this protein in the assembly process and its structure have not been solved to date.

[0089] All three VPs are translated from one mRNA. After this mRNA is synthesized, it can be spliced in two different manners: either a longer or shorter intron can be excised resulting in the formation of two pools of mRNAs: a 2.3 kb- and a 2.6 kb-long mRNA pool. Usually, especially in the presence of adenovirus, the longer intron is preferred, so the 2.3 kb-long mRNA represents the so-called “major splice”. In this form the first AUG codon, from which the synthesis of VP1 protein starts, is cut out, resulting in a reduced overall level of VP1 protein synthesis. The first AUG codon that remains in the major splice is the initiation codon for VP3 protein. However, upstream of that codon in the same open reading frame lies an ACG sequence (encoding threonine) which is surrounded by an optimal Kozak context. This contributes to a low level of synthesis of VP2 protein, which is actually VP3 protein with additional N terminal residues, as is VP1.

[0090] Since the bigger intron is preferred to be spliced out, and since in the major splice the ACG codon is a much weaker translation initiation signal, the ratio at which the AAV structural proteins are synthesized in vivo is about 1:1:20, which is the same as in the mature virus particle. The unique fragment at the N terminus of VP1 protein was shown to possess phospholipase A2 (PLA2) activity, which is probably required for the release of AAV particles from late endosomes. Muralidhar et al. reported that VP2 and VP3 are crucial for correct virion assembly. More recently, however, Warrington et al. showed VP2 to be unnecessary for complete virus particle formation and efficient infectivity, and also showed that VP2 can tolerate large insertions in its N terminus, while VP1 cannot, probably because of the presence of the PLA2 domain.

[0091] The AAV vector may be replication-defective or conditionally replication defective. In embodiments, the AAV vector is a recombinant AAV vector. In some embodiments, the AAV vector comprises a sequence isolated or derived from an AAV vector of serotype AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11 or any combination thereof.

V. Nucleic Acid Delivery

[0092] In some embodiments, expression cassettes are employed for use directly in a genetic-based delivery approach. Provided herein are expression vectors which contain one or more nucleic acids encoding fusion proteins or target proteins or genes of interest. In some embodiments, a nucleic acid encoding the first fusion protein and a nucleic acid encoding the second fusion protein are provided on the same vector. In further embodiments, a nucleic acid encoding one or more of the fusion proteins and a nucleic acid encoding a gene of interest or target protein are provided on separate vectors.

[0093] Expression requires that appropriate signals be provided in the vectors and include various regulatory elements such as enhancers / promoters from both viral and mammalian sources that drive expression of the genes of interest in cells. Elements designed to optimize messenger RNA stability and translatability in host cells also are defined. The conditions for the use of a number of dominant drug selection markers for establishing permanent, stable cell clones expressing the products are also provided, as is an element that links expression of the drug selection markers to expression of the polypeptide.

[0094] Throughout this application, the term “expression cassette” is meant to include any type of genetic construct containing a nucleic acid coding for a gene product in which part or all of the nucleic acid encoding sequence is capable of being transcribed and translated, i.e., is under the control of a promoter. A “promoter” refers to a DNA sequence recognized by the synthetic machinery of the cell, or introduced synthetic machinery, required to initiate the specific transcription of a gene. The phrase “under transcriptional control” or “operably linked” means that the promoter is in the correct location and orientation in relation to the nucleic acid to control RNA polymerase initiation and expression of the gene. An “expression vector” is meant to include expression cassettes comprised in a genetic construct that is capable of replication, and thus including one or more of origins of replication, transcription termination signals, poly-A regions, selectable markers, and multipurpose cloning sites.

[0095] The term promoter will be used here to refer to a group of transcriptional control modules that are clustered around the initiation site for RNA polymerase II. Much of the thinking about how promoters are organized derives from analyses of several viral promoters, including those for the HSV thymidine kinase (tk) and SV40 early transcription units. These studies, augmented by more recent work, have shown that promoters are composed of discrete functional modules, each consisting of approximately 7-20 bp of DNA, and containing one or more recognition sites for transcriptional activator or repressor proteins.

[0096] At least one module in each promoter functions to position the start site for RNA synthesis. The best-known example of this is the TATA box, but in some promoters lacking a TATA box, such as the promoter for the mammalian terminal deoxynucleotidyl transferase gene and the promoter for the SV40 late genes, a discrete element overlying the start site itself helps to fix the place of initiation.

[0097] Additional promoter elements regulate the frequency of transcriptional initiation. Typically, these are located in the region 30-110 bp upstream of the start site, although a number of promoters have recently been shown to contain functional elements downstream of the start site as well. The spacing between promoter elements frequently is flexible, so that promoter function is preserved when elements are inverted or moved relative to one another. In the tk promoter, the spacing between promoter elements can be increased to 50 bp apart before activity begins to decline. Depending on the promoter, it appears that individual elements can function either co-operatively or independently to activate transcription.

[0098] In certain embodiments, viral promoters such as the human cytomegalovirus (CMV) immediate early gene promoter, the SV40 early promoter, the Rous sarcoma virus long terminal repeat, rat insulin promoter and glyceraldehyde-3-phosphate dehydrogenase can be used to obtain high-level expression of the coding sequence of interest. The use of other viral or mammalian cellular or bacterial phage promoters which are well-known in the art to achieve expression of a coding sequence of interest is contemplated as well, provided that the levels of expression are sufficient for a given purpose. By employing a promoter with well-known properties, the level and pattern of expression of the protein of interest following transfection or transformation can be optimized. Further, selection of a promoter that is regulated in response to specific physiologic signals can permit inducible expression of the gene product.

[0099] Enhancers are genetic elements that increase transcription from a promoter located at a distant position on the same molecule of DNA. Enhancers are organized much like promoters. That is, they are composed of many individual elements, each of which binds to one or more transcriptional proteins. The basic distinction between enhancers and promoters is operational. An enhancer region as a whole must be able to stimulate transcription at a distance; this need not be true of a promoter region or its component elements. On the other hand, a promoter must have one or more elements that direct initiation of RNA synthesis at a particular site and in a particular orientation, whereas enhancers lack these specificities. Promoters and enhancers are often overlapping and contiguous, often seeming to have a very similar modular organization.

VI. Pharmaceutical Formulations and Routes of Administration

[0100] In another aspect, for administration to a patient in need of such treatment, pharmaceutical formulations (also referred to as pharmaceutical preparations, pharmaceutical compositions, pharmaceutical products, medicinal products, medicines, medications, or medicaments) comprise a therapeutically effective amount of a compound disclosed herein formulated with one or more excipients and / or drug carriers appropriate to the indicated route of administration. In some embodiments, the compounds disclosed herein are formulated in a manner amenable for the treatment of human and / or veterinary patients. In some embodiments, formulation comprises admixing or combining one or more of the compounds disclosed herein with one or more of the following excipients: lactose, sucrose, starch powder, cellulose esters of alkanoic acids, cellulose alkyl esters, talc, stearic acid, magnesium stearate, magnesium oxide, sodium and calcium salts of phosphoric and sulfuric acids, gelatin, acacia, sodium alginate, polyvinylpyrrolidone, and / or polyvinyl alcohol. In some embodiments, e.g., for oral administration, the pharmaceutical formulation may be tableted or encapsulated. In some embodiments, the compounds may be dissolved or slurried in water, polyethylene glycol, propylene glycol, ethanol, corn oil, cottonseed oil, peanut oil, sesame oil, benzyl alcohol, sodium chloride, and / or various buffers. In some embodiments, the pharmaceutical formulations may be subjected to pharmaceutical operations, such as sterilization, and / or may contain drug carriers and / or excipients such as preservatives, stabilizers, wetting agents, emulsifiers, encapsulating agents such as lipids, dendrimers, polymers, proteins such as albumin, nucleic acids, and buffers.

[0101] Pharmaceutical formulations may be administered by a variety of methods, e.g., orally or by injection (e.g., subcutaneous, intravenous, and intraperitoneal). Depending on the route of administration, the compounds disclosed herein may be coated in a material to protect the compound from the action of acids and other natural conditions which may inactivate the compound. To administer the active compound by other than parenteral administration, it may be necessary to coat the compound with, or co-administer the compound with, a material to prevent its inactivation. In some embodiments, the active compound may be administered to a patient in an appropriate carrier, for example, liposomes, or a diluent. Pharmaceutically acceptable diluents include saline and aqueous buffer solutions. Liposomes include water-in-oil-in-water CGF emulsions as well as conventional liposomes.

[0102] The compounds disclosed herein may also be administered parenterally, intraperitoneally, intraspinally, or intracerebrally. Dispersions can be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations may contain a preservative to prevent the growth of microorganisms.

[0103] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (such as, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and vegetable oils. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, sodium chloride, or polyalcohols such as mannitol and sorbitol, in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate or gelatin.

[0104] The compounds disclosed herein can be administered orally, for example, with an inert diluent or an assimilable edible carrier. The compounds and other ingredients may also be enclosed in a hard or soft-shell gelatin capsule, compressed into tablets, or incorporated directly into the patient’s diet. For oral therapeutic administration, the compounds disclosed herein may be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like. The percentage of the therapeutic compound in the compositions and preparations may, of course, be varied. The amount of the therapeutic compound in such pharmaceutical formulations is such that a suitable dosage will be obtained.

[0105] The therapeutic compound may also be administered topically to the skin, eye, ear, or mucosal membranes. Administration of the therapeutic compound topically may include formulations of the compounds as a topical solution, lotion, cream, ointment, gel, foam, transdermal patch, or tincture. When the therapeutic compound is formulated for topical administration, the compound may be combined with one or more agents that increase the permeability of the compound through the tissue to which it is administered. In other embodiments, it is contemplated that the topical administration is administered to the eye. Such administration may be applied to the surface of the cornea, conjunctiva, or sclera. Without wishing to be bound by any theory, it is believed that administration to the surface of the eye allows the therapeutic compound to reach the posterior portion of the eye. Ophthalmic topical administration can be formulated as a solution, suspension, ointment, gel, or emulsion. Finally, topical administration may also include administration to the mucosal membranes such as the inside of the mouth. Such administration can be directly to a particular location within the mucosal membrane such as a tooth, a sore, or an ulcer. Alternatively, if local delivery to the lungs is desired the therapeutic compound may be administered by inhalation in a dry-powder or aerosol formulation.

[0106] In some embodiments, it may be advantageous to formulate parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the patients to be treated; each unit containing a predetermined quantity of therapeutic compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. In some embodiments, the specification for the dosage unit forms of the disclosure are dictated by and directly dependent on (a) the unique characteristics of the therapeutic compound and the particular therapeutic effect to be achieved, and (b) the limitations inherent in the art of compounding such a therapeutic compound for the treatment of a selected condition in a patient. In some embodiments, active compounds are administered at a therapeutically effective dosage sufficient to treat a condition associated with a condition in a patient. For example, the efficacy of a compound can be evaluated in an animal model system that may be predictive of efficacy in treating the disease in a human or another animal.

[0107] In some embodiments, the effective dose range for the therapeutic compound can be extrapolated from effective doses determined in animal studies for a variety of different animals. In some embodiments, the human equivalent dose (HED) in mg / kg can be calculated in accordance with the following formula (see, e.g., Reagan-Shaw et al., FASEB J., 22(3):659-661, 2008, which is incorporated herein by reference):

HED (mg / kg) = Animal dose (mg / kg) × (Animal Km / Human Km)

Use of the Km factors in conversion results in HED values based on body surface area (BSA) rather than only on body mass. Km values for humans and various animals are well known. For example, the Km for an average 60 kg human (with a BSA of 1.6 m²) is 37, whereas a 20 kg child (BSA 0.8 m²) would have a Km of 25. Km values for some relevant animal models are also well known, including: mice Km of 3 (given a weight of 0.02 kg and BSA of 0.007); hamster Km of 5 (given a weight of 0.08 kg and BSA of 0.02); rat Km of 6 (given a weight of 0.15 kg and BSA of 0.025) and monkey Km of 12 (given a weight of 3 kg and BSA of 0.24).
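As a worked illustration of the HED formula above, the following sketch applies the Km values listed in this paragraph. It is a minimal example rather than a dosing tool; the dictionary keys and the function name `human_equivalent_dose` are hypothetical labels chosen for this example.

```python
# Minimal sketch of the HED conversion described above.
# Km values are those listed in the paragraph; names are illustrative.

KM_FACTORS = {
    "human_adult": 37,  # average 60 kg human
    "human_child": 25,  # 20 kg child
    "mouse": 3,
    "hamster": 5,
    "rat": 6,
    "monkey": 12,
}


def human_equivalent_dose(animal_dose_mg_per_kg, species, human="human_adult"):
    """Return HED (mg/kg) = animal dose (mg/kg) x (animal Km / human Km)."""
    if animal_dose_mg_per_kg < 0:
        raise ValueError("dose must be non-negative")
    return animal_dose_mg_per_kg * KM_FACTORS[species] / KM_FACTORS[human]


# Example: a 10 mg/kg dose in each listed animal model, converted for an adult human.
for species in ("mouse", "hamster", "rat", "monkey"):
    hed = human_equivalent_dose(10.0, species)
    print(f"{species}: {hed:.2f} mg/kg")
```

Because the conversion scales with the ratio of Km factors, the same mg / kg dose in a smaller animal corresponds to a lower mg / kg dose in a human.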

[0108] Precise amounts of the therapeutic composition depend on the judgment of the practitioner and are specific to each individual. Nonetheless, a calculated HED dose provides a general guide. Other factors affecting the dose include the physical and clinical state of the patient, the route of administration, the intended goal of treatment and the potency, stability and toxicity of the particular therapeutic formulation.

[0109] The actual dosage amount of a compound of the present disclosure or composition comprising a compound of the present disclosure administered to a patient may be determined by physical and physiological factors such as type of animal treated, age, sex, body weight, severity of condition, the type of disease being treated, previous or concurrent therapeutic interventions, idiopathy of the patient and on the route of administration. These factors may be determined by a skilled artisan. The practitioner responsible for administration will typically determine the concentration of active ingredient(s) in a composition and appropriate dose(s) for the individual patient. The dosage may be adjusted by the individual physician in the event of any complication.

[0110] In some embodiments, the therapeutically effective amount typically will vary from about 0.001 mg / kg to about 1000 mg / kg, from about 0.01 mg / kg to about 750 mg / kg, from about 100 mg / kg to about 500 mg / kg, from about 1 mg / kg to about 250 mg / kg, from about 10 mg / kg to about 150 mg / kg in one or more dose administrations daily, for one or several days (depending of course on the mode of administration and the factors discussed above). Other suitable dose ranges include 1 mg to 10,000 mg per day, 100 mg to 10,000 mg per day, 500 mg to 10,000 mg per day, and 500 mg to 1,000 mg per day. In some embodiments, the amount is less than 10,000 mg per day with a range of 750 mg to 9,000 mg per day.

[0111] In some embodiments, the amount of the active compound in the pharmaceutical formulation is from about 2 to about 75 weight percent. In some of these embodiments, the amount is from about 25 to about 60 weight percent.

[0112] Single or multiple doses of the agents are contemplated. Desired time intervals for delivery of multiple doses can be determined by one of ordinary skill in the art employing no more than routine experimentation. As an example, patients may be administered two doses daily at approximately 12-hour intervals. In some embodiments, the agent is administered once a day.

[0113] The agent(s) may be administered on a routine schedule. As used herein a routine schedule refers to a predetermined designated period of time. The routine schedule may encompass periods of time which are identical, or which differ in length, as long as the schedule is predetermined. For instance, the routine schedule may involve administration twice a day, every day, every two days, every three days, every four days, every five days, every six days, a weekly basis, a monthly basis or any set number of days or weeks there-between. Alternatively, the predetermined routine schedule may involve administration on a twice daily basis for the first week, followed by a daily basis for several months, etc. In other embodiments, the disclosure provides that the agent(s) may be taken orally and that the timing of which is or is not dependent upon food intake. Thus, for example, the agent can be taken every morning and / or every evening, regardless of when the patient has eaten or will eat.

VII. Definitions

[0114] The term “nucleotide editing Cas9” refers to a Cas9 protein fused to a base editor or a prime editor. Non-limiting examples of Cas9 include SpCas9, SpCas9-NG, SaCas9, SaCas9-KKH, SauCas9, and SlugCas9. Non-limiting examples of a base editor include ABEmax, ABE8e, ABE8eV106W, and ABE8.20-m.

[0115] The terms “polynucleotide,” “nucleic acid” and “transgene” are used interchangeably herein to refer to all forms of nucleic acid, oligonucleotides, including deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) and polymers thereof. Polynucleotides include genomic DNA, cDNA and antisense DNA, and spliced or unspliced mRNA, rRNA, tRNA and inhibitory DNA or RNA (RNAi, e.g., small or short hairpin (sh)RNA, microRNA (miRNA), small or short interfering (si)RNA, trans-splicing RNA, or antisense RNA). Polynucleotides can include naturally occurring, synthetic, and intentionally modified or altered polynucleotides (e.g., variant nucleic acid). Polynucleotides can be single stranded, double stranded, or triplex, linear or circular, and can be of any suitable length. In discussing polynucleotides, a sequence or structure of a particular polynucleotide may be described herein according to the convention of providing the sequence in the 5' to 3' direction. A nucleic acid “backbone” can be made up of a variety of linkages, including one or more of sugar-phosphodiester linkages, peptide-nucleic acid bonds (“peptide nucleic acids” or PNA; PCT No. WO 95 / 32305), phosphorothioate linkages, methylphosphonate linkages, or combinations thereof. Sugar moieties of a nucleic acid can be ribose, deoxyribose, or similar compounds with substitutions, e.g., 2’ methoxy or 2’ halide substitutions. Nitrogenous bases can be conventional bases (A, G, C, T, U), analogs thereof (e.g., modified uridines such as 5-methoxyuridine, pseudouridine, or N1-methylpseudouridine, or others); inosine; derivatives of purines or pyrimidines (e.g., N4-methyl deoxyguanosine, deaza- or aza-purines, deaza- or aza-pyrimidines, pyrimidine bases with substituent groups at the 5 or 6 position (e.g., 5-methylcytosine), purine bases with a substituent at the 2, 6, or 8 positions, 2-amino-6-methylaminopurine, O6-methylguanine, 4-thio-pyrimidines, 4-amino-pyrimidines, 4-dimethylhydrazine-pyrimidines, and O4-alkyl-pyrimidines; U.S. Patent 5,378,825 and PCT No. WO 93 / 13121). For general discussion see The Biochemistry of the Nucleic Acids 5-36, Adams et al., ed., 11th ed., 1992. Nucleic acids can include one or more “abasic” residues where the backbone includes no nitrogenous base for position(s) of the polymer (U.S. Patent 5,585,481). A nucleic acid can comprise only conventional RNA or DNA sugars, bases and linkages, or can include both conventional components and substitutions (e.g., conventional bases with 2’ methoxy linkages, or polymers containing both conventional bases and one or more base analogs). Nucleic acid includes “locked nucleic acid” (LNA), an analogue containing one or more LNA nucleotide monomers with a bicyclic furanose unit locked in an RNA mimicking sugar conformation, which enhances hybridization affinity toward complementary RNA and DNA sequences (Vester and Wengel, 2004, Biochemistry 43(42): 13233-41). RNA and DNA have different sugar moieties and can differ by the presence of uracil or analogs thereof in RNA and thymine or analogs thereof in DNA.

[0116] A nucleic acid encoding a polypeptide often comprises an open reading frame that encodes the polypeptide. Unless otherwise indicated, a particular nucleic acid sequence also includes degenerate codon substitutions.

[0117] Nucleic acids can include one or more expression control or regulatory elements operably linked to the open reading frame, where the one or more regulatory elements are configured to direct the transcription and translation of the polypeptide encoded by the open reading frame in a mammalian cell. Non-limiting examples of expression control / regulatory elements include transcription initiation sequences (e.g., promoters, enhancers, a TATA box, and the like), translation initiation sequences, mRNA stability sequences, poly A sequences, secretory sequences, and the like. Expression control / regulatory elements can be obtained from the genome of any suitable organism.

[0118] As used herein, “AAV” refers to an adeno-associated virus vector. As used herein, “AAV” refers to any AAV serotype and variant, including but not limited to an AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAVrh10 (see, e.g., SEQ ID NO: 81 of US 9,790,472, which is incorporated by reference herein in its entirety), AAVrh74 (see, e.g., SEQ ID NO: 1 of US 2015 / 0111955, which is incorporated by reference herein in its entirety), AAV9 vector, AAV9P vector (also known as AAVMYO, see, Weinmann et al., 2020, Nature Communications, 11:5432), and Myo-AAV vectors described in Tabebordbar et al., 2021, Cell, 184: 1-20 (e.g., MyoAAV 1A, 2A, 3A, 4A, 4C, or 4E), wherein the number following AAV indicates the AAV serotype. The term “AAV” can also refer to any known AAV (vector) system. In some embodiments, the AAV vector is a single-stranded AAV (ssAAV). In some embodiments, the AAV vector is a double-stranded AAV (dsAAV). Any variant of an AAV vector or serotype thereof, such as a self-complementary AAV (scAAV) vector, is encompassed within the general terms AAV vector, AAV1 vector, etc. See, e.g., McCarty et al., Gene Ther. 2001;8:1248-54, Naso et al., BioDrugs 2017; 31:317-334, and references cited therein for detailed discussion of various AAV vectors. Structurally, AAVs are small (25 nm), single-stranded DNA, non-enveloped viruses with an icosahedral capsid. Naturally occurring or engineered AAV serotypes and variants that differ in the composition and structure of their capsid protein have varying tropism, i.e., ability to transduce different cell types. When combined with active promoters, this tropism defines the site of gene expression.

[0119] “Guide RNA”, “guide RNA”, and simply “guide” are used herein interchangeably to refer to either a crRNA (also known as CRISPR RNA), or the combination of a crRNA and a trRNA (also known as tracrRNA). The crRNA and trRNA may be associated as a single RNA molecule (single guide RNA, sgRNA) or in two separate RNA molecules (dual guide RNA, dgRNA). “Guide RNA” or “guide RNA” refers to each type. The trRNA may be a naturally-occurring sequence, or a trRNA sequence with modifications or variations compared to naturally-occurring sequences. For clarity, the terms “guide RNA” or “guide” as used herein, and unless specifically stated otherwise, may refer to an RNA molecule (comprising A, C, G, and U nucleotides) or to a DNA molecule encoding such an RNA molecule (comprising A, C, G, and T nucleotides) or complementary sequences thereof. In general, in the case of a DNA nucleic acid construct encoding a guide RNA, the U residues in any of the RNA sequences described herein may be replaced with T residues, and in the case of a guide RNA construct encoded by any of the DNA sequences described herein, the T residues may be replaced with U residues.

[0120] Target sequences for Cas9s include both the positive and negative strands of genomic DNA (i.e., the sequence given and the sequence’s reverse complement), as a nucleic acid substrate for a Cas9 is a double stranded nucleic acid. Accordingly, where a guide sequence is said to be “complementary to a target sequence”, it is to be understood that the guide sequence may direct a guide RNA to bind to the reverse complement of a target sequence. Thus, in some embodiments, where the guide sequence binds the reverse complement of a target sequence, the guide sequence is identical to certain nucleotides of the target sequence (e.g., the target sequence not including the PAM) except for the substitution of U for T in the guide sequence.
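The relationship between a target sequence, its reverse complement, and the corresponding guide sequence can be illustrated with a few lines of code. The sketch below is illustrative only: the 20-nucleotide target is a made-up sequence (PAM excluded), not a sequence from this disclosure, and the helper names are hypothetical.

```python
# Illustrative sketch: a guide sequence that binds the reverse complement of a
# DNA target is identical to the target except that U replaces T.
# The example target below is hypothetical and excludes the PAM.

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA sequence."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def dna_to_rna(dna):
    """Write a DNA sequence as RNA by replacing T with U."""
    return dna.replace("T", "U")


def rna_to_dna(rna):
    """Write an RNA sequence as DNA by replacing U with T."""
    return rna.replace("U", "T")


def guide_binds_strand(guide_rna, dna_strand):
    """True if the guide is fully complementary (antiparallel) to dna_strand."""
    return reverse_complement(rna_to_dna(guide_rna)) == dna_strand


target = "GACGTTACCGGATCAAGCTT"               # hypothetical target sequence (no PAM)
opposite_strand = reverse_complement(target)  # strand the guide base-pairs with
guide = dna_to_rna(target)                    # same as target, with U for T

print(guide)
print(guide_binds_strand(guide, opposite_strand))
```

The same T-to-U and U-to-T substitutions apply when moving between a guide RNA and a DNA construct that encodes it.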

[0121] A “promoter” refers to a nucleotide sequence, usually upstream (5') of a coding sequence, which directs and / or controls the expression of the coding sequence by providing the recognition for RNA polymerase and other factors required for proper transcription. “Promoter” includes a minimal promoter that is a short DNA sequence comprised of a TATA-box and optionally other sequences that serve to specify the site of transcription initiation, to which regulatory elements are added for control of expression.

[0122] An “enhancer” is a DNA sequence that can stimulate transcription activity and may be an innate element of the promoter or a heterologous element that enhances the level or tissue specificity of expression. It is capable of operating in either orientation (5’->3’ or 3’->5’) and may be capable of functioning even when positioned either upstream or downstream of the promoter.

[0123] Promoters and / or enhancers may be derived in their entirety from a native gene or be composed of different elements derived from different elements found in nature, or even be comprised of synthetic DNA segments. A promoter or enhancer may comprise DNA sequences that are involved in the binding of protein factors that modulate / control effectiveness of transcription initiation in response to stimuli, physiological or developmental conditions.

[0124] Non-limiting examples include SV40 early promoter, mouse mammary tumor virus LTR promoter; adenovirus major late promoter (Ad MLP); a herpes simplex virus (HSV) promoter, a cytomegalovirus (CMV) promoter such as the CMV immediate early promoter region (CMVIE), a Rous sarcoma virus (RSV) promoter, pol II promoters, pol III promoters, synthetic promoters, hybrid promoters, and the like. In addition, sequences derived from non-viral genes, such as the murine metallothionein gene, will also find use herein. Exemplary constitutive promoters include the promoters for the following genes which encode certain constitutive or “housekeeping” functions: hypoxanthine phosphoribosyl transferase (HPRT), dihydrofolate reductase (DHFR), adenosine deaminase, phosphoglycerate kinase (PGK), pyruvate kinase, phosphoglycerate mutase, the actin promoter, and other constitutive promoters known to those of skill in the art. In addition, many viral promoters function constitutively in eukaryotic cells. These include: the early and late promoters of SV40; the long terminal repeats (LTRs) of Moloney Leukemia Virus and other retroviruses; and the thymidine kinase promoter of Herpes Simplex Virus, among many others. Accordingly, any of the above-referenced constitutive promoters can be used to control transcription of a heterologous gene insert.

[0125] A “transgene” is used herein to conveniently refer to a nucleic acid sequence / polynucleotide that is intended or has been introduced into a cell or organism. Transgenes include any nucleic acid, such as a gene that encodes an inhibitory RNA or polypeptide or protein, and are generally heterologous with respect to naturally occurring AAV genomic sequences.

[0126] The term “transduce” refers to introduction of a nucleic acid sequence into a cell or host organism by way of a vector (e.g., a viral particle). Introduction of a transgene into a cell by a viral particle can therefore be referred to as “transduction” of the cell. The transgene may or may not be integrated into genomic nucleic acid of a transduced cell. If an introduced transgene becomes integrated into the nucleic acid (genomic DNA) of the recipient cell or organism it can be stably maintained in that cell or organism and further passed on to or inherited by progeny cells or organisms of the recipient cell or organism. Finally, the introduced transgene may exist in the recipient cell or host organism extra chromosomally, or only transiently. A “transduced cell” is therefore a cell into which the transgene has been introduced by way of transduction. Thus, a “transduced” cell is a cell into which, or a progeny thereof in which a transgene has been introduced. A transduced cell can be propagated, transgene transcribed and the encoded inhibitory RNA or protein expressed. For gene therapy uses and methods, a transduced cell can be in a mammal.

[0127] A nucleic acid / transgene is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. A nucleic acid / transgene encoding an RNAi or a polypeptide, or a nucleic acid directing expression of a polypeptide may include an inducible promoter, or a tissue-specific promoter for controlling transcription of the encoded polypeptide. A nucleic acid operably linked to an expression control element can also be referred to as an expression cassette.

[0128] As used herein, the terms “modify” or “variant” and grammatical variations thereof, mean that a nucleic acid, polypeptide or subsequence thereof deviates from a reference sequence. Modified and variant sequences may therefore have substantially the same, greater or less expression, activity or function than a reference sequence, but at least retain partial activity or function of the reference sequence. A particular type of variant is a mutant protein, which refers to a protein encoded by a gene having a mutation, e.g., a missense or nonsense mutation.

[0129] In general, “CRISPR system” refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g. tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), and / or other sequences and transcripts from a CRISPR locus.

[0130] As used herein, a “spacer sequence,” sometimes also referred to herein and in the literature as a “spacer,” “protospacer,” “guide sequence,” or “targeting sequence” refers to a sequence within a guide RNA that is complementary to a target sequence and functions to direct a guide RNA to a target sequence for cleavage by a Cas9. For clarity, the terms “spacer sequence”, “spacer,” “protospacer,” “guide sequence,” or “targeting sequence” as used herein, and unless specifically stated otherwise, may refer to an RNA molecule (comprising A, C, G, and U nucleotides) or to a DNA molecule encoding such an RNA molecule (comprising A, C, G, and T nucleotides) or complementary sequences thereof.

[0131] A “nucleic acid” or “polynucleotide” variant refers to a modified sequence which has been genetically altered compared to wild-type. The sequence may be genetically modified without altering the encoded protein sequence. Alternatively, the sequence may be genetically modified to encode a variant protein. A nucleic acid or polynucleotide variant can also refer to a combination sequence which has been codon modified to encode a protein that still retains at least partial sequence identity to a reference sequence, such as wild-type protein sequence, and also has been codon-modified to encode a variant protein. For example, some codons of such a nucleic acid variant will be changed without altering the amino acids of a protein encoded thereby, and some codons of the nucleic acid variant will be changed which in turn changes the amino acids of a protein encoded thereby.

[0132] The terms “protein” and “polypeptide” are used interchangeably herein. The “polypeptides” encoded by a “nucleic acid” or “polynucleotide” or “transgene” disclosed herein include partial or full-length native sequences, as with naturally occurring wild-type and functional polymorphic proteins, functional subsequences (fragments) thereof, and sequence variants thereof, so long as the polypeptide retains some degree of function or activity. Accordingly, in methods and uses of the disclosure, such polypeptides encoded by nucleic acid sequences are not required to be identical to the endogenous protein that is defective, or whose activity, function, or expression is insufficient, deficient or absent in a treated mammal.

[0133] An example of an amino acid modification is a conservative amino acid substitution or a deletion. In particular embodiments, a modified or variant sequence retains at least part of a function or activity of the unmodified sequence (e.g., wild-type sequence).

[0134] Another example of an amino acid modification is a targeting peptide introduced into a capsid protein of a viral particle. Peptides have been identified that target recombinant viral vectors or nanoparticles to various organs and tissues.

[0135] A “variant” of a molecule is a sequence that is substantially similar to the sequence of the native molecule. For nucleotide sequences, variants include those sequences that, because of the degeneracy of the genetic code, encode the identical amino acid sequence of the native protein. Naturally occurring allelic variants such as these can be identified with the use of molecular biology techniques, as, for example, with polymerase chain reaction (PCR) and hybridization techniques. Variant nucleotide sequences also include synthetically derived nucleotide sequences, such as those generated, for example, by using site-directed mutagenesis, which encode the native protein, as well as those that encode a polypeptide having amino acid substitutions. Generally, nucleotide sequence variants of the disclosure will have at least 40%, 50%, 60%, to 70%, e.g., 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, to 79%, generally at least 80%, e.g., 81%-84%, at least 85%, e.g., 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, to 98%, sequence identity to the native (endogenous) nucleotide sequence. In certain embodiments, the variant is biologically functional (i.e., retains 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or 100% of activity or function of wild-type).

[0136] “Conservative variations” of a particular nucleic acid sequence refers to those nucleic acid sequences that encode identical or essentially identical amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given polypeptide. For instance, the codons CGT, CGC, CGA, CGG, AGA and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded protein. Such nucleic acid variations are “silent variations,” which are one species of “conservatively modified variations.” Every nucleic acid sequence described herein that encodes a polypeptide also describes every possible silent variation, except where otherwise noted. One of skill in the art will recognize that each codon in a nucleic acid (except ATG, which is ordinarily the only codon for methionine) can be modified to yield a functionally identical molecule by standard techniques. Accordingly, each “silent variation” of a nucleic acid that encodes a polypeptide is implicit in each described sequence.
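The degeneracy described in this paragraph can be checked against the standard genetic code. The following is a minimal sketch, assuming the standard nuclear codon table written in DNA letters; the names `CODON_TABLE`, `synonymous_codons`, and `silent_variants` are illustrative helpers and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code in DNA letters (assumption: standard nuclear table).
# Codons are enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def silent_variants(codon):
    """Return the other codons encoding the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon]
    return [c for c in synonymous_codons(amino_acid) if c != codon]


if __name__ == "__main__":
    print("Arginine codons:", synonymous_codons("R"))
    print("Silent alternatives to CGT:", silent_variants("CGT"))
    print("Methionine codons:", synonymous_codons("M"))
```

Under this table, an arginine codon has five silent alternatives, while ATG has none, which matches the exception noted in the text.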

[0137] The term “substantial identity” of polynucleotide sequences means that a polynucleotide comprises a sequence that has at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, or at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, or at least 90%, 91%, 92%, 93%, or 94%, or even at least 95%, 96%, 97%, 98%, or 99% sequence identity, compared to a reference sequence using one of the alignment programs described using standard parameters. One of skill in the art will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning, and the like. Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 70%, at least 80%, 90%, or even at least 95%.
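As an illustration of how a percent-identity threshold might be applied, the sketch below computes identity over two sequences that have already been aligned. It assumes equal-length aligned strings with `-` as the gap character and counts gapped columns in the denominator; alignment programs differ on this convention, so the function names and the gap handling here are assumptions rather than the method of any particular program.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Assumption: both strings have equal length and use '-' for gaps.
    Columns where both sequences have a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def has_substantial_identity(aligned_a, aligned_b, threshold=70.0):
    """Check the identity against a chosen threshold (70% is the lowest value listed)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(has_substantial_identity("ACGT", "AC--"))
```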

[0138] The term “substantial identity” in the context of a polypeptide indicates that a polypeptide comprises a sequence with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, or 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, or at least 90%, 91%, 92%, 93%, or 94%, or even 95%, 96%, 97%, 98% or 99%, sequence identity to the reference sequence over a specified comparison window. An indication that two polypeptide sequences are identical is that one polypeptide is immunologically reactive with antibodies raised against the second polypeptide. Thus, a polypeptide is identical to a second polypeptide, for example, where the two peptides differ only by a conservative substitution.

[0139] The terms “treat” and “treatment” refer to both therapeutic treatment and prophylactic or preventative measures, wherein the object is to prevent, inhibit, reduce, or decrease an undesired physiological change or disorder, such as the development, progression or worsening of the disorder. For purposes of this disclosure, beneficial or desired clinical results include, but are not limited to, alleviation of symptoms, diminishment of extent of disease, stabilizing a (i.e., not worsening or progressing) symptom or adverse effect of disease, delay or slowing of disease progression, amelioration or palliation of the disease state, and remission (whether partial or total), whether detectable or undetectable. “Treatment” can also mean prolonging survival as compared to expected survival if not receiving treatment. Those in need of treatment include those already with the condition or disorder as well as those predisposed (e.g., as determined by a genetic assay).

[0140] As used herein, “essentially free,” in terms of a specified component, means that none of the specified component has been purposefully formulated into a composition and / or is present only as a contaminant or in trace amounts. The total amount of the specified component resulting from any unintended contamination of a composition is therefore well below 0.05%, preferably below 0.01%. Most preferred is a composition in which no amount of the specified component can be detected with standard analytical methods.

[0141] As used herein in the specification, “a” or “an” may mean one or more. As used herein in the claim(s), when used in conjunction with the word “comprising,” the words “a” or “an” may mean one or more than one.

[0142] The use of the term “or” in the claims is used to mean “and / or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and / or.” As used herein “another” may mean at least a second or more.

[0143] Throughout this application, the term “about” is used to indicate that a value includes the inherent variation of error for the device, the inherent variation in the method being employed to determine the value, the variation that exists among the study subjects, or a value that is within 10% of a stated value.

[0144] The terms “comprise,” “have” and “include” are open-ended linking verbs. Any forms or tenses of one or more of these verbs, such as “comprises,” “comprising,” “has,” “having,” “includes” and “including,” are also open-ended. For example, any method that “comprises,” “has” or “includes” one or more steps is not limited to possessing only those one or more steps and also covers other unlisted steps.

[0145] The term “effective,” as that term is used in the specification and / or claims, means adequate to accomplish a desired, expected, or intended result. “Effective amount,” “therapeutically effective amount” or “pharmaceutically effective amount” when used in the context of treating a patient or subject with a compound means that amount of the compound which, when administered to a subject or patient for treating or preventing a disease, is an amount sufficient to effect such treatment or prevention of the disease.

[0146] As used herein, the term “patient” or “subject” refers to a living mammalian organism, such as a human, monkey, cow, sheep, goat, dog, cat, mouse, rat, guinea pig, or transgenic species thereof. In certain embodiments, the patient or subject is a primate. Nonlimiting examples of human patients are adults, juveniles, infants and fetuses.

[0147] The above definitions supersede any conflicting definition in any reference that is incorporated by reference herein. The fact that certain terms are defined, however, should not be considered as indicative that any term that is undefined is indefinite. Rather, all terms used are believed to describe the disclosure in terms such that one of ordinary skill can appreciate the scope and practice the present disclosure.

VIII. Examples

[0148] The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.

Materials & Methods

[0149] Molecular cloning. Plasmids, sgRNAs, and primers were designed and generated using Benchling. DNA templates for polymerase chain reaction (PCR) were from previously established plasmids, Addgene plasmids, or synthesized fragments (IDT, gBlock). Standard PCR amplification was performed using 2× Phanta Max Master Mix (Vazyme, P525) for DNA fragment or vector amplification. The resulting fragments were purified by gel extraction and assembled through Gibson Assembly Master Mix (New England Biolabs, E2611L) or Golden Gate Assembly with BsaI-HFv2 (New England Biolabs, R3733S) or Esp3I (Thermo Fisher Scientific, ER0451) and T4 DNA Ligase (New England Biolabs, M0202S). DNA assembly products were transformed into 10 µl Stbl3 competent cells generated by Mix and Go! E. coli Transformation Kit and Buffer Set (Zymo Research, T3001) and plated on agar plates supplemented with 100 µg/ml ampicillin. Plasmids were obtained via DNA miniprep using the QIAquick PCR Purification Kit (Qiagen) and DNA spin column (Epoch Life Science). Typical PCR reactions (20 µl) included 1 µl template (1-10 ng/µl), 2 µl 10 µM primer pair, 7 µl ultrapure (Millipore) or distilled water, and 10 µl 2× Phanta Max Master Mix (Vazyme, P525). Annealing temperatures, typically set at 60°C, could be adjusted between 55°C and 65°C for optimal amplification yield. Long DNA fragments (>10 kb) could be amplified with high fidelity via 25-cycle PCR.
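The 20 µl PCR recipe above can be sanity-checked and scaled with a few lines of arithmetic. The sketch below is illustrative only: the component volumes come from the text, while the 10% pipetting overage and the helper names (`total_volume`, `final_concentration`, `master_mix`) are assumptions introduced for the example.

```python
# Per-reaction volumes in microliters, taken from the typical 20 ul reaction.
PCR_RECIPE_UL = {
    "template": 1.0,
    "primer pair (10 uM)": 2.0,
    "water": 7.0,
    "2x master mix": 10.0,
}


def total_volume(recipe):
    """Sum of all component volumes in one reaction."""
    return sum(recipe.values())


def final_concentration(stock_conc, stock_volume, reaction_volume):
    """Dilution arithmetic: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_volume / reaction_volume


def master_mix(recipe, n_reactions, overage=0.1, exclude=("template",)):
    """Scale shared components for n reactions.

    Assumption: a 10% overage for pipetting loss; template is added per tube.
    """
    scale = n_reactions * (1.0 + overage)
    return {
        name: round(volume * scale, 2)
        for name, volume in recipe.items()
        if name not in exclude
    }


if __name__ == "__main__":
    volume = total_volume(PCR_RECIPE_UL)
    print("Reaction volume (ul):", volume)
    print("Final primer concentration (uM):", final_concentration(10.0, 2.0, volume))
    print("Master mix for 10 reactions:", master_mix(PCR_RECIPE_UL, 10))
```

With these volumes, the 10 µM primer stock is diluted tenfold and the 2× master mix ends at 1×.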

[0150] Cell culture. All cells were maintained and passaged in 10-cm TC-treated cell culture dishes with vents (Greiner Bio-One, 639160). HepG2 cells (ATCC, HB-8065) were cultured in Eagle’s Minimum Essential Medium (EMEM) (ATCC, 30-2003) supplemented with 10% (v/v) fetal bovine serum (FBS) (Gibco, 10437028) and 1% (v/v) penicillin-streptomycin (Pen-Strep) (Gibco, 15140122). HEK293T cells (ATCC, CRL-3216) and HeLa cells (ATCC, CCL-2) were maintained in Dulbecco’s Modified Eagle’s Medium (DMEM) plus GlutaMAX (Gibco, 10569044) supplemented with 10% (v/v) FBS (Gibco, 10437028) and 1% (v/v) Pen-Strep (Gibco, 15140122). K562 cells (ATCC, CCL-243) were cultured in Roswell Park Memorial Institute (RPMI) 1640 medium plus GlutaMAX (Gibco, 61870036) supplemented with 10% (v/v) FBS (Gibco, 10437028) and 1% (v/v) Pen-Strep (Gibco, 15140122). Cells were incubated at 37 °C with 5% CO2 and passaged upon reaching 80-90% confluency. Cells were authenticated by the supplier using STR (short tandem repeat) analysis.

[0151] Transfection. Cells with low passage number (1-10; freshly thawed cells counted as passage 0) were passaged every other day and counted using a Countess II FL Automated Cell Counter (Thermo Fisher Scientific) before seeding for transfection. The seeded plate was pre-incubated at room temperature on a flat surface for 15 minutes before being placed into the incubator to reduce the edge effect and avoid unevenly seeded cells. For fluorescent reporter assays, cells were plated at 2 × 10^4 cells (Reporters 1-9, BFP v2 reporter cell line, EFS-EGFP and BFP reporters) or 0.75 × 10^4 cells (3-color reporter cell line) per 100 µl culture medium per well in 96-well plates (Corning, 3598) 16-18 hours before transfection. For gene editing, gene activation, and gene repression assays, cells were plated at 0.75-2 × 10^4 (HEK293T cells) or 1.5 × 10^4 (HepG2 cells) per 100 µl culture medium per well in poly-D-lysine coated plates (Corning, 356690) 16 hours before transfection. Transfection reagents included PEI Max (1 mg/ml, pH 7.1, Polysciences), Lipofectamine 2000 (Invitrogen, 11668019), Lipofectamine 3000 (Invitrogen, L3000001), and Lipofectamine MessengerMax (Invitrogen, LMRNA001). In fluorescent reporter assays, both PEI Max and Lipofectamine 2000 were used. In gene editing, gene activation, and gene repression assays, Lipofectamine 2000 (for HEK293T cells) and Lipofectamine 3000 (for HepG2 cells) were used. PEI Max transfection was optimized for each well of a 96-well plate: briefly, 100-250 ng DNA and 0.5 µl PEI Max were each diluted in 5 µl Opti-MEM I Reduced-Serum Medium (Gibco, 31985062). Then, 5 µl diluted DNA was mixed with 5 µl diluted PEI Max for 5 min before being added into each well (Supplementary Fig. 22). Lipofectamine 2000 transfection was performed as described previously¹.
Lipofectamine 3000 transfection was performed following the reagent protocol, but using 0.5 µl Lipofectamine 3000, 0.5 µl P3000, and up to 450 ng DNA for each well of a 96-well plate. Specifically, for the reporter gene activation assay, 150 ng plasmid of sgRNA with MS2 stem loops, 150 ng plasmid of Cas9 variant or prime editor, 150 ng plasmid of MPH, and 50 ng fluorescent reporter plasmid were transfected in HEK293T, K562 and HeLa cells using Lipofectamine 2000. For the prime editing assay quantified by sequencing, 225 ng plasmid of prime editor variant and 75 ng plasmid of MPE DAP array were transfected in HEK293T cells using Lipofectamine 2000. For the prime editing assay quantified by flow cytometry, 150 ng plasmid of prime editor and 50 ng plasmid of fluorescent reporter were transfected in HEK293T cells using PEI Max. For the gene repression assay, 50-500 ng plasmid of DAP-shRNA was transfected using Lipofectamine 2000 (HEK293T cells) or 3000 (HepG2 cells). For the genetic perturbation assay, 150 ng plasmid of PEAK, 150 ng plasmid of MPH and 150 ng plasmid of the DAP array were transfected using Lipofectamine 2000 (HEK293T cells) or 3000 (HepG2 cells). For the multiplex gene activation assay in HEK293T cells, 150 ng plasmid of PEAK / dCas9 / PE2, 150 ng plasmid of MPH, and 200 ng pooled plasmids of DAP array were transfected. A plasmid expressing EGFP was used as control in quantitative reverse transcription PCR (RT-qPCR) assays. Dose-relevant assays comparing MPE and eMPE were performed by changing only the amount of DAP array, with no filler plasmid being used. For combinatorial viral (AAV) and non-viral (mRNA) delivery, 4 days after the AAV transduction of the DAP array, 1500 transduced HEK293T cells were transfected with 150 ng MPH and 200 ng PE mRNA using Lipofectamine MessengerMax, following the manufacturer’s protocol. Cellular DNA and RNA were extracted three days after transfection for gene editing, gene activation, and gene repression analysis.

[0152] Genomic DNA extraction. The culture medium was carefully aspirated from each well for HEK293T, HeLa, and HepG2 cells. Next, 100 µl of freshly prepared lysis buffer [10 mM Tris-HCl, pH 7.5, 0.05% SDS, 25 µl/ml proteinase K (Thermo Fisher Scientific)] was added to each well of the 96-well plate. The samples were incubated at 37°C for 5 min and then heat-inactivated at 80°C for 30 minutes. Genomic DNA lysate was used immediately or stored at 4°C.

[0153] Flow cytometry. Approximately 48-60 hours after transfection, the fluorescence of each well was imaged using the EVOS FLoid Imaging System (Thermo Fisher Scientific). For flow cytometry sample preparation, the culture medium of each well was gently aspirated, followed by the addition of 100 µl TrypLE Express (Thermo Fisher Scientific, 12605028) per well of a 96-well plate. The samples were then incubated at 37 °C for 5 minutes before dilution with 150 µl/well culture medium. High-throughput flow cytometry was performed using the Sony SA3800 Flow Cytometer, and the data were analyzed using FlowJo 10.8.1 (FlowJo, LLC). Cells were gated by forward versus side scatter (FSC vs. SSC) plot to identify cell population and exclude debris, forward scatter height versus forward scatter area (FSC-H vs. FSC-A) plot for doublet exclusion, and FSC-H or histogram vs. EGFP-A, BFP-A, or mCherry-A plot to reflect fluorescence signals. All represented samples had at least three biological replicates. Data are representative of at least 5,000 gated events per condition.

[0154] Fluorescence-activated cell sorting. The Sony MA900 Cell Sorter was used for performing fluorescence-activated cell sorting (FACS) experiments. Cells were gated by forward versus back scatter (FSC vs. BSC) plot to identify cell population and exclude debris. Forward scatter height versus forward scatter area (FSC-H vs. FSC-A) plot was used for doublet exclusion, and FSC-H or histogram vs. BFP-A plot to reflect fluorescence signals. In developing HEK293T, K562, and HeLa stable cell lines integrated with BFP or BFP v2 (BFP gene and the DAP array to enable H66Y prime editing), the cell population with the top 5% BFP fluorescent signal was sorted from cells transduced with 100× MOI (FIG. 32).

[0155] Targeted amplicon sequencing and data analysis. Genomic regions surrounding each target locus were amplified, purified, quantified, and sent for Sanger sequencing (Epoch Life Science) or next-generation sequencing (NGS) (Amplicon-EZ, Genewiz). Partial Illumina adapters provided by Amplicon-EZ were added to the 5’ end of each forward and reverse primer. A typical 10 µl PCR reaction was conducted using 0.5 pmol of each forward and reverse primer (IDT), 1 µl genomic DNA extract, and 5 µl 2× Phanta Master Mix (Vazyme), with a 60°C annealing temperature and 35-cycle amplification. All primer pairs successfully amplified the desired fragments, verified by DNA electrophoresis in a 1% agarose gel. PCR products were purified using the QIAquick PCR Purification Kit (Qiagen) and DNA spin column (Epoch Life Science). For Sanger sequencing, each amplicon was eluted in 20 µl ultrapure water (Millipore) and quantified by NanoDrop OneC (Thermo Scientific). The sequencing premix (15 µl) was prepared by adding 1 µl diluted DNA (10-20 ng) and 2.5 µl 10 µM sequencing primer to 11.5 µl ultrapure water. For NGS, multiple amplicons were pooled, purified, and eluted in 30 µl of ultrapure water, then quantified first by NanoDrop OneC (Thermo Fisher Scientific) to adjust the DNA concentration to 60-80 ng/µl, and subsequently by the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific) to obtain approximately 500 ng amplicon in 25 µl ultrapure water for Amplicon-EZ. Sanger sequencing results were analyzed using EditR (http://baseeditr.com/). NGS results were analyzed using CRISPResso2 (http://crispresso2.pinellolab.org). Sanger sequencing and NGS data were visualized using GraphPad Prism 9.4.1.

[0156] RNA extraction. RNA was extracted using the Quick-RNA Miniprep (Zymo Research, R1055) according to the manufacturer’s instructions. Three days post-transfection, the culture medium was removed, and 300 µl of RNA Lysis Buffer was added to each well in a 96-well plate. The cell lysate was cleared by centrifugation at 13,300 rpm for 1 minute, after which the supernatant was transferred to a Spin-Away Filter in a collection tube and centrifuged at 13,300 rpm for 1 minute to remove most genomic DNA. The flow-through was collected in a separate 1.5 mL microcentrifuge tube, and an equal volume of 100% ethanol was added and mixed thoroughly. This mixture was then transferred to a Zymo-Spin IIICG column in a collection tube and centrifuged at 13,300 rpm for 30 seconds. After discarding the flow-through, 400 µl RNA Prep Buffer was added to the column, and the sample was centrifuged at 13,300 rpm for 30 seconds. The flow-through was discarded again, followed by the addition of 700 µl RNA Wash Buffer to the column and centrifugation at 13,300 rpm for 30 seconds. Another wash step was performed using 400 µl RNA Wash Buffer, followed by centrifugation at 13,300 rpm for 2 minutes to ensure complete removal of the wash buffer. For RNA elution, 50 µl DNase / RNase-free water was added to the column and centrifuged at 13,300 rpm for 1 minute. The eluted RNA was either used immediately or stored at -80°C.

[0157] Reverse transcription (RT). RT was carried out using HiScript III All-in-one RT SuperMix Perfect for qPCR (Vazyme, R333-01) following the manufacturer’s instructions. A 20 µl reaction mix was prepared, which included 4 µl of 5× All-in-one qRT SuperMix, 1 µl Enzyme Mix, and 15 µl of the RNA eluate with the lowest concentration measured by NanoDrop OneC (Thermo Scientific). An equivalent amount by mass of RNA was added for samples in the same batch with higher concentration, and RNase-free ddH2O was used to make up the 20 µl volume. The reaction mix was incubated at 50°C for 15 minutes, followed by 85°C for 5 seconds. The resulting complementary DNA (cDNA) was either used immediately for qPCR or stored at -20°C.
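The equal-mass normalization described above reduces to simple dilution arithmetic: the sample with the lowest concentration fills the full 15 µl RNA volume, and every other sample is added at the volume that delivers the same mass, with water making up the difference. The following sketch illustrates this under that reading of the text; the function name `rt_inputs` and the example concentrations are hypothetical.

```python
def rt_inputs(concentrations_ng_per_ul, rna_volume_ul=15.0):
    """Compute RNA and water volumes so every sample contributes equal RNA mass.

    The target mass is set by the lowest-concentration sample in the batch,
    which is used at the full RNA volume (15 ul in the protocol above).
    """
    if not concentrations_ng_per_ul:
        raise ValueError("at least one sample is required")
    if any(c <= 0 for c in concentrations_ng_per_ul.values()):
        raise ValueError("concentrations must be positive")
    lowest = min(concentrations_ng_per_ul.values())
    target_mass_ng = lowest * rna_volume_ul
    plan = {}
    for sample, conc in concentrations_ng_per_ul.items():
        rna_ul = target_mass_ng / conc
        plan[sample] = {
            "rna_ul": round(rna_ul, 2),
            "water_ul": round(rna_volume_ul - rna_ul, 2),
        }
    return plan


if __name__ == "__main__":
    # Hypothetical NanoDrop readings in ng/ul.
    batch = {"sample_A": 50.0, "sample_B": 100.0, "sample_C": 150.0}
    for name, volumes in rt_inputs(batch).items():
        print(name, volumes)
```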

[0158] Endogenous gene activation sgRNA design. The NCBI reference sequence database (RefSeq) accession number of a transcript of the gene of interest (GOI) was obtained from the UCSC Genome Browser using its Table Browser. This RefSeq accession number was then provided to Benchling to import the sequence of the GOI along with annotations of functional elements. sgRNAs with MS2 stem loops were designed to target the 300-bp region upstream of the transcription start site (TSS) of the GOI.

[0159] Quantitative polymerase chain reaction (qPCR). qPCR primers were selected from the “qPCR primers” track in the UCSC Genome Browser (Zeisel et al., 2013). qPCR was performed using Taq Pro Universal SYBR qPCR Master Mix (Vazyme, Q712-02) following the manufacturer’s protocol. In brief, a 20 µl reaction mix was prepared to contain 10 µl of 2× qPCR master mix, 7 µl of ddH2O, 0.5 µl of 10 µM Forward Primer, 0.5 µl of 10 µM Reverse Primer, and 2 µl cDNA template. The qPCR was conducted on a Bio-Rad C1000 Touch Thermal Cycler with a CFX96 Real-Time System or an Applied Biosystems QS 12K Flex using the following program: Stage 1 - Initial Denaturation, Rep 1, 95°C, 2 minutes; Stage 2 - Cycling Reaction, Rep 45, 95°C for 5 seconds followed by 60°C for 30 seconds; Stage 3 - Melting Curve, Rep 1, 95°C for 5 seconds, followed by 65°C to 95°C ramp at 0.5°C/cycle. Data were extracted using Bio-Rad CFX Maestro 1.1 (Version 4.1.2433.1219) or using the QuantStudio 12K Flex Software v1.6. Data analysis followed the Delta-Delta Ct method.
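For readers unfamiliar with the Delta-Delta Ct method, the calculation can be sketched as follows. This is a minimal illustration that assumes roughly 100% amplification efficiency (a doubling per cycle) and a single reference gene; the reference gene is not named in this section, and the function name and Ct values below are hypothetical.

```python
def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the Delta-Delta Ct method.

    delta Ct       = Ct(target) - Ct(reference gene), per sample
    delta-delta Ct = delta Ct(test) - delta Ct(control)
    fold change    = 2 ** (-delta-delta Ct)

    Assumption: amplification efficiency near 100% for both amplicons.
    """
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta = delta_test - delta_ctrl
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles earlier
    # in the test sample than in the control, with an unchanged reference.
    print(fold_change(22.0, 18.0, 25.0, 18.0))
```

A value above 1 indicates higher target expression than in the control sample (for example, the EGFP-plasmid control mentioned in the transfection methods), and a value below 1 indicates repression.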

[0160] mRNA in vitro transcription. PEAK and MPH mRNAs were transcribed in vitro using the HiScribe T7 ARCA mRNA Kit (with tailing) (New England Biolabs, E2060S) with modified nucleotides following the manufacturer’s instructions. A DNA template containing a T7 promoter upstream of the gene of interest was prepared via standard PCR using 2× Phanta Max Master Mix (Vazyme, P525) and purified through gel electrophoresis. A 20 µl IVT reaction was set up with 1 µg of DNA template, 10 µl of ARCA / NTP Mix (2×), 2.5 µl of 5mCTP (10 mM), 2.5 µl of Pseudo-UTP (10 mM), 2 µl of T7 RNA Polymerase Mix, and ddH2O filled to 20 µl. The reaction was mixed gently and incubated at 37°C for 30 minutes. Then, 2 µl of DNase I was added, mixed well, and incubated at 37°C for 15 minutes. Afterward, 5 µl of Poly(A) Polymerase Reaction Buffer (10×), 5 µl of Poly(A) Polymerase, and 20 µl ddH2O were added directly to the 20 µl IVT reaction, mixed gently, and incubated at 37°C for 30 minutes.

[0161] mRNA purification. The transcribed mRNA was purified by LiCl precipitation using materials provided in the HiScribe T7 ARCA mRNA Kit (with tailing) (New England Biolabs, E2065S) following the manufacturer’s instructions. Briefly, 25 µl LiCl solution was added to the 50 µl tailing reaction mix, mixed well, and incubated at -20°C for 30 minutes. Then, the mixture was centrifuged at 4°C for 15 minutes at maximum speed to pellet the RNA. The supernatant was carefully removed, and the RNA pellet was rinsed with 500 µl of cold 70% ethanol, followed by centrifugation at 4°C for 10 minutes. The ethanol was carefully removed, and any residual liquid was eliminated using a sharp tip. The RNA pellet was air-dried and resuspended in 50 µl of 0.1 mM EDTA RNA storage solution. The RNA was heated at 65°C for 5-10 minutes to ensure complete dissolution and mixed well. mRNA quality and concentration were assessed using NanoDrop OneC (Thermo Scientific). Purified mRNA was either used immediately or aliquoted and stored at -20°C or below until further use.

[0162] Lentivirus and AAV production. Low passage HEK293T cells were seeded at 5 × 10^6 cells per 10 ml culture media [10% v/v FBS (Gibco, 10437028), 90% v/v DMEM plus GlutaMAX (Gibco, 10569044), and penicillin-streptomycin (Gibco, 15140122) diluted to 100 units/mL and 100 µg/mL, respectively] per 10-cm cell culture dish (Greiner Bio-One, 639160) 16 hours before transfection. For lentivirus production per 10-cm dish, 5 µg of transfer vector plasmid containing the construct of interest, 2.5 µg of pMD2.G envelope plasmid (Addgene, #12259), and 4.5 µg of psPAX2 packaging plasmid (Addgene, #12260) were added into 260 µl of serum-free DMEM in a 50-ml tube, followed by addition of 78 µl PEI Max (1 mg/ml, pH 7.1, Polysciences), vortexed, and then incubated at room temperature for 10 min. The transfection mixture was then diluted with 10 ml of culture medium to replace the old medium from the 10-cm dish. After 48 hours, the full volume of supernatant was used directly or collected in a 15-ml tube and centrifuged at 3200 × g for 5 min at room temperature to remove the cell debris, clarified through a 0.45 µm PVDF filter (Millipore) and concentrated using a PEG virus precipitation kit (Biovision) with an optimized protocol. Briefly, 2.5 ml of PEG solution was added to the 10 ml supernatant, inverted evenly, and refrigerated at 4°C for 24 hours. The mixture was then centrifuged at 3200 × g and 4°C for 30 min, followed by several rounds of aspiration and centrifugation to entirely remove the supernatant from the precipitated white pellet. Lastly, the pellet was suspended in 80 µl of virus resuspension solution. The process for AAV production was similar to that of lentivirus production except for the plasmids used. For each 10-cm dish, 3 µg of vector plasmid, 5 µg of pHelper plasmid (Cell Biolabs), and 4 µg of AAV1-Rep-Cap plasmid (Addgene, #112862) were transfected. The freshly prepared lentivirus or AAV was immediately used for transductions.
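The plasmid and PEI Max quantities above imply a fixed PEI-to-DNA mass ratio, which can be worked out as a quick check. The sketch below uses the masses and volumes stated in the text; applying the same 78 µl of PEI Max to the AAV plasmid set is an assumption, since the text says only that the AAV process was similar, and the helper names are illustrative.

```python
# Plasmid masses per 10-cm dish, in micrograms, from the protocol above.
LENTI_PLASMIDS_UG = {"transfer vector": 5.0, "pMD2.G": 2.5, "psPAX2": 4.5}
AAV_PLASMIDS_UG = {"vector": 3.0, "pHelper": 5.0, "AAV1-Rep-Cap": 4.0}

PEI_STOCK_MG_PER_ML = 1.0  # PEI Max stock concentration
PEI_VOLUME_UL = 78.0       # volume added per 10-cm dish


def total_dna_ug(plasmids):
    """Total plasmid mass per dish."""
    return sum(plasmids.values())


def pei_mass_ug(volume_ul, stock_mg_per_ml):
    """1 mg/ml equals 1 ug/ul, so mass in ug is volume (ul) times stock."""
    return volume_ul * stock_mg_per_ml


def pei_to_dna_ratio(plasmids, volume_ul=PEI_VOLUME_UL, stock=PEI_STOCK_MG_PER_ML):
    """Mass ratio of PEI to total plasmid DNA."""
    return pei_mass_ug(volume_ul, stock) / total_dna_ug(plasmids)


if __name__ == "__main__":
    print("Lentivirus DNA per dish (ug):", total_dna_ug(LENTI_PLASMIDS_UG))
    print("AAV DNA per dish (ug):", total_dna_ug(AAV_PLASMIDS_UG))
    print("PEI:DNA mass ratio (lentivirus):", pei_to_dna_ratio(LENTI_PLASMIDS_UG))
```

Both plasmid sets total 12 µg per dish, so the same PEI volume would give the same ratio if it were reused for AAV.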

[0163] Transduction. Low passage cells were seeded at 1500 to 2 × 10^4 cells per 100 µl culture medium per well in a 96-well poly-D-lysine coated plate (Corning, 356690) and pre-incubated at room temperature for 15 min, followed by the addition of freshly prepared lentivirus or AAV, and then placed into the incubator. Cells were transduced at different multiplicities of infection (MOI). When developing HEK293T, K562, and HeLa reporter stable cell lines expressing BFP or BFP v2 (BFP with the DAP array to enable H66Y prime editing), transducing 2 × 10^4 cells/well with 0.1, 1, and 10 µl/well lentivirus concentrate represented 1× MOI, 10× MOI, and 100× MOI, respectively. To develop the HEK293T stable cell line expressing 3-color reporters, 100 µl lentivirus concentrate was added to 2 × 10^4 cells/well. When developing the HEK293T stable cell line expressing PEAK, 2 × 10^4 cells/well were transduced with 20-40 µl lentivirus concentrate. 24 h after lentiviral transduction, 1 µg/ml puromycin (Thermo Fisher Scientific, J67236.8EQ) was supplemented into cell culture media to initiate puromycin selection. Once cells in the 96-well plate reached >90% confluency, they were dissociated and replated into a 10-cm cell culture dish with 10 mL culture media containing 1 µg/ml puromycin. Transduced cells were used in downstream experiments such as FACS or genomic editing once they reached >80% confluency. For AAV transduction, HEK293T cells were seeded at 1500 cells per 100 µl culture medium per well in the 96-well poly-D-lysine coated plate (Corning, 356690), pre-incubated at room temperature for 15 min, transduced with 30 µl AAV (encoding the DAP array) concentrate, and placed into the incubator. Three days after AAV transduction, cells in each transduced well were transfected with the PEAK and MPH mRNAs.

[0164] Prime editing gRNA design. PegRNAs and nicking gRNAs were designed using PrimeDesign (Hsu et al., 2021) (https://drugthatgene.pinellolab.partners.org/). Non-interfering nucleotide linkers between a pegRNA and the 3’ motif were designed using pegLIT (Nelson et al., 2022) (https://peglit.liugroup.us).

[0165] Endogenous gene repression shRNA design. shRNAs in the DAP array were designed using the Broad Institute GPP web portal (https://portals.broadinstitute.org/gpp/public/), the GenScript siRNA Target Finder (https://www.genscript.com/tools/sirna-target-finder), or the InvivoGen siRNA Wizard Online Tool (Fakhr et al., 2016) (https://www.invivogen.com/sirna-wizard).

[0166] Statistics and reproducibility. Values were reported as mean ± SD. Groups were compared using the unpaired two-tailed t-test or the nested one-way ANOVA with Dunnett’s multiple comparisons test. The solid lines and dashed lines of the violin plot represent the quartiles and median. Biologically independent experiments reported here were performed by different researchers using separate splits of the mammalian cell type used.
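As a reading aid for the summary statistics named above, the following Python sketch computes a mean ± SD and the statistic of an unpaired t-test using only the standard library. It assumes the pooled-variance (Student) form of the test, which the text does not specify, and the replicate values are hypothetical placeholders rather than data from this disclosure. Converting the statistic into a two-tailed p-value requires the t distribution, which is left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev, variance


def mean_sd(values):
    """Return (mean, sample standard deviation) for one group."""
    return mean(values), stdev(values)


def unpaired_t(group_a, group_b):
    """Student's unpaired t statistic and degrees of freedom.

    Assumes equal variances in both groups (pooled variance).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


# Hypothetical editing efficiencies (%) from three replicates per group.
control = [10.0, 12.0, 11.0]
treated = [20.0, 22.0, 21.0]

m, sd = mean_sd(treated)
t_stat, df = unpaired_t(treated, control)
print(f"treated: {m:.1f} +/- {sd:.1f} (mean +/- SD)")
print(f"t = {t_stat:.3f}, df = {df}")
```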

[0167] Data availability. Genomic sequencing raw reads are available at NCBI SRA PRJNA9655386. Plasmids for DAP array assembly are available at https://www.addgene.org/Xue_Gao/.

[0168] Code availability. Customized code for analyses described in the study is available on GitHub (https://github.com/qichenyuan/). In the rational engineering, PyMOL (Ver 2.5) (Schrodinger, 2015) was used with “select S1 within X of S2”, which gives atoms in S1 that are within X Angstroms of any atoms in S2, to determine residues within 5/7/10 Å of the DNA-RNA duplex in the crystal structure of 451 aa XMRV-RT (PDB: 4HKQ) (Nowak et al., 2013).

Example 1 - Engineering a compact and efficient prime editing system

[0169] Before exploring PEs for gene regulation, the efficiency and compactness of the prime editing system were optimized. To enable high-throughput screening of engineered PE variants, a reporter assay was developed linking PE activity to the conversion of blue fluorescent protein (BFP) to green fluorescent protein (GFP), achieved by a C-to-T substitution that converts His to Tyr within the BFP Thr-His-Gly chromophore (Glaser et al., 2016) (FIG. 1B). Thirteen DAP arrays were screened, each containing different ngRNA/pegRNA pairs. The DAP array EP1.11 demonstrated the highest prime editing efficiency, measured by the BFP-to-GFP conversion rates (FIGS. 5A-N). The EP1.11 DAP array and the BFP gene were integrated into the HEK293T genome via lentiviral infection, creating the BFP reporter v2 stable cell line to evaluate PE activity (FIG. 6). Additionally, a 3-color reporter system was developed with an efficient 3-loci multiplex prime editing (MPE) DAP array to simultaneously report the PEs’ ability to insert a 9-bp fragment to recover the EGFP TYG chromophore, delete 6 bp of pre-installed stop codons on mCherry, and substitute a 6-bp fragment to recover the TagBFP LYG chromophore (FIGS. 7A, 10, and 11).
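The His-to-Tyr conversion underlying the BFP-to-GFP reporter follows directly from the standard genetic code: both His codons become Tyr codons when the first base changes from C to T on the coding strand. The Python sketch below illustrates this with a lookup table restricted to the four codons involved. The text does not state which His codon the reporter carries, so both are shown, and the function name is illustrative.

```python
# Minimal lookup covering only the codons needed here (standard genetic code).
CODON_TO_AA = {"CAT": "His", "CAC": "His", "TAT": "Tyr", "TAC": "Tyr"}


def c_to_t_at_first_base(codon):
    """Apply a C-to-T substitution at the first position of a codon."""
    if codon[0] != "C":
        raise ValueError(f"codon {codon} does not start with C")
    return "T" + codon[1:]


his_codons = [codon for codon, aa in CODON_TO_AA.items() if aa == "His"]
edits = {codon: c_to_t_at_first_base(codon) for codon in his_codons}

for before, after in edits.items():
    print(f"{before} ({CODON_TO_AA[before]}) -> {after} ({CODON_TO_AA[after]})")
```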

[0170] Starting from the foundational PE2 system, prime editing efficiency was improved by optimizing nuclear trafficking through nuclear localization signal (NLS) engineering (Dingwall et al., 1982; Kalderon et al., 1984; Suzuki et al., 2016). Ninety-one PE2 variants were constructed with different C-terminal NLSs and 31 variants with different N-terminal NLSs sourced from the NLSdb database (Nair et al., 2003), followed by screening using the BFP reporter v2 stable cell line (FIGS. 1C, 8A). Substantial improvements were observed with the N-terminal VirD2 NLS or the C-terminal SV40 NLS (FIG. 1C). Combining these two best NLSs into one variant (EP2.5) produced a synergistic effect, resulting in a 7% increase in BFP-to-GFP conversion rates compared to PE2 with N- and C-terminal BPSV40 NLSs (FIG. 8B). This improvement was consistent across various genetic modifications when tested with the 3-color reporter cell lines, with EP2.5 showing a 10-18% increase in insertion efficiency, a 9-14% increase in substitution efficiency, and a 12-35% increase in deletion efficiency compared to PE2 (FIGS. 7D-7F). To further improve prime editing efficiencies, previously reported engineered pegRNAs (epegRNAs), each of which has a structured 3’ motif for enhanced stability and resistance to degradation (Nelson et al., 2022), were incorporated into the DAP array (hereafter referred to as eMPE) (FIG. 1D). eMPE consistently outperformed MPE across multiple PE variants and different DAP array dosages (FIGS. 1D and 12). Notably, the DAP array with the trimmed pseudoknot evopreQ1 (tevopreQ1, with linker) showed a 10-35% increase in prime editing efficiency as reflected by the BFP-to-GFP reporter and was used for all further eMPE experiments (FIG. 1D).

[0171] Next, the Moloney Murine Leukemia Virus reverse transcriptase (MMLV-RT) used in PE2 was truncated, guided by studies indicating that the RNase H domain and the first 23 amino acid (aa) residues of the MMLV-RT are non-essential (Das & Georgiadis, 2004; Zheng et al., 2022; Gao et al., 2022; Grünewald et al., 2023; Bock et al., 2022). Compared to the canonical PE2 with the full-length 677 aa (1-677) MMLV-RT, a truncated variant with only the polymerase domain (25-468, 444 aa in length) lost nearly 90% of prime editing efficiency, while another variant (24-474, 451 aa in length) retained 70% of the editing efficiency (FIGS. 1E and 13). To improve the activity of the truncated 451 aa RT, 31 variants were constructed harboring mutations that were previously reported to enhance the MMLV-RT performance (Oscorbin & Filipenko, 2021). Using the minimally active 444 aa RT as a baseline, eight mutations were identified that individually increased the prime editing efficiency of the 444 aa RT (FIG. 14). When implemented into the 451 aa RT variant, the D200C mutation increased the prime editing efficiency by 27% (FIG. 1F).
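The stated lengths of the truncated reverse transcriptases follow from their inclusive residue ranges. As a quick consistency check, the Python sketch below recomputes each length and the fraction of the full-length 677 aa MMLV-RT that it represents; the dictionary labels are illustrative names, not terms used in the disclosure.

```python
FULL_LENGTH_AA = 677  # canonical MMLV-RT in PE2, residues 1-677


def span_length(first_residue, last_residue):
    """Length of an inclusive residue range."""
    return last_residue - first_residue + 1


# Residue ranges as given in the text; labels are illustrative.
truncations = {
    "polymerase domain only": (25, 468),
    "451 aa variant": (24, 474),
}

lengths = {name: span_length(*span) for name, span in truncations.items()}
for name, length in lengths.items():
    print(f"{name}: {length} aa ({length / FULL_LENGTH_AA:.1%} of full length)")
```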

[0172] Finally, the electrostatic interactions between the MMLV-RT and the negatively charged DNA/RNA hybrid were enhanced, as guided by the crystal structure of XMRV-RT (PDB: 4HKQ) (Nowak et al., 2013), which shares high homology with our 451 aa MMLV-RT (FIG. 15). Forty-four residues within 10 Å of the DNA/RNA substrate were individually mutated to positively charged arginine within the 451 aa MMLV-RT (D200C) (FIG. 16). Seven additional mutations were discovered that enhanced the editing efficiency. Through subsequent screening of these mutations in different combinations, the double mutant 451 aa MMLV-RT (V101R+D200C) was identified, which demonstrated a 9% increase in BFP-to-GFP conversion efficiency compared to the single mutant 451 aa MMLV-RT (D200C) (FIGS. 1G, 9A-9E). The performance of the engineered PEs was evaluated at the human endogenous HEK3 locus, showing a consistent correlation of the editing efficiency with the reporter systems (FIG. 1H). Thus, the rationally engineered PE (EP3.61), featuring a truncated 451 aa MMLV-RT (V101R+D200C) and optimized NLS sequences, achieved endogenous editing efficiencies similar to those of PE2 with the full-length RT (FIG. 1H). Additionally, when coupled with the DAP eMPE array, EP3.61 achieved 69% higher prime editing efficiency than PE2 (FIG. 1H). Collectively, a compact and efficient prime editing system was engineered with the shortest active MMLV-RT reported to date, termed “prime editing with advanced kernel” (PEAK), which incorporates an N-terminal VirD2 NLS, a C-terminal SV40 NLS, a truncated 451 aa MMLV-RT with the beneficial mutations V101R and D200C, and epegRNAs in the DAP eMPE array (FIG. 1H).

Example 2 - MPH-recruiting prime editors enable transcriptional activation on fluorescent reporters

[0173] To develop a gene activation system using PE, the synergistic activation mediator (SAM) system was incorporated, which is a powerful RNA-guided programmable gene activator composed of a catalytically dead Cas9 (dCas9), an activation sgRNA (agRNA) with two MS2-binding stem loops, and an MS2-p65-HSF1 (MPH) transcriptional activator (Konermann et al., 2015). Following MPH recruitment to the target loci through the MS2-binding stem loops, the SAM system attracts transcription factors and chromatin remodeling complexes for gene upregulation. It was hypothesized that substituting the dCas9 of the SAM system with a nickase Cas9 (nCas9) or a PE could still achieve sufficient gene activation (FIG. 2A). Nine reporter variants with different half-lives of EGFP and copy numbers of protospacer targets were developed. An agRNA with a 20-nt protospacer sequence was used to direct an nCas9 or PE, along with MPH, to the protospacer target region of the reporter for EGFP gene activation, which was subsequently quantified via flow cytometry.

[0174] Reporters 1-3 were designed with eight copies of protospacer targets upstream of a miniCMV promoter that drives the expression of EGFP, EGFP-PEST, and EGFP-CL1-PEST, respectively (FIG. 2B). The protein degradation sequences PEST and CL1 were included to shorten the half-life of the fused proteins, thereby enhancing the fluorescent reporters’ signal-to-background performance (Li et al., 1998; Gilon et al., 1998). In HEK293T cells, all evaluated nCas9 variants (D10A, H840A, and N863A) and PE2 successfully activated Reporters 1-3 when paired with the 20-nucleotide (nt) agRNA and MPH from the SAM system (FIG. 2C). Furthermore, when the testing was extended to different human cell lines, including K562 and HeLa, substantial activation of Reporters 1-3 was observed with the nCas9 variants and PE2 compared to the non-transfected and no-Cas-plasmid transfection controls (FIGS. 2D and 17). Overall, PE2 retained an average of 77% GFP activation, while the nCas9 variants showed activities comparable to dCas9 (82%-108%) (FIGS. 2D and 17). As expected, substituting dCas9 with wild-type Cas9 (wtCas9) resulted in only 6% GFP activation, likely due to double-stranded DNA cleavage (FIGS. 2D and 17).

[0175] To increase the stringency for gene activation, Reporters 4-6 were designed with only one protospacer target instead of eight (FIG. 18A). With these reporters, all nCas9 variants and PE2 still enabled significant gene activation, ranging from 72% to 102% EGFP fluorescence compared to dCas9 (FIGS. 18C and 18D). In contrast, wtCas9 retained only 19% of the activation capability. Finally, to assess the impact of nicking on either the sense or the antisense DNA strand on gene activation, Reporters 7-9 were designed with protospacer regions on the opposite strand compared to Reporters 4-6 (FIG. 18B). No significant difference was observed in gene activation when nicking occurred on the sense or antisense DNA strand (FIGS. 18C, 18D, and 19). Together, these results demonstrate that PE, a fusion of Cas9 nickase (H840A) and MMLV-RT, along with Cas9 nickases (D10A, H840A, and N863A), but not wtCas9, can effectively substitute dCas9 in the SAM system to activate gene expression in fluorescent reporter assays.

Example 3 - Efficient activation with MPH-recruiting prime editors on endogenous gene loci

[0176] Next, it was determined whether agRNAs expressed from the DAP array can effectively activate endogenous genes. agRNAs targeting IL1B or RHOXF2 were expressed using either individual DAP arrays or a conventional U6 promoter, and were co-delivered with the dCas9-SAM system (Yuan & Gao, 2022). Compared to individual U6 promoters, similar gene activation was observed for both genes using the DAP array (with the 75-bp hCtRNA as the promoter), while reducing the promoter length by 70% (FIG. 2E). Further expanding our approach, five DAP arrays each containing six agRNAs were assembled, allowing for 30-gene multiplexed activation using the dCas9-SAM system. Remarkably, 29 out of the 30 targeted genes were activated, with an average 404-fold increase in mRNA expression compared to the GFP-transfected control as measured by RT-qPCR, demonstrating the DAP array’s capacity for efficient multiplex endogenous gene activation (FIG. 2F).
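This section does not state how the RT-qPCR fold changes were calculated. A common convention is the 2^-ΔΔCt method, sketched below in Python purely as an illustration of how a fold increase relative to a control sample can be derived. The method choice, the assumption of roughly 100% amplification efficiency, the use of a reference gene, and all Ct values are assumptions for the example, not details taken from this disclosure.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    Assumes ~100% amplification efficiency and a stably expressed
    reference gene in both samples.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target gene crosses threshold 8 cycles
# earlier in the activated sample, with an unchanged reference gene.
fold = fold_change_ddct(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=30.0, ct_ref_control=18.0,
)
print(f"{fold:g}-fold relative to control")
```

Because each cycle corresponds to a doubling under these assumptions, an 8-cycle shift corresponds to a 256-fold change.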

[0177] However, when dCas9 was substituted with PEAK in the SAM system, gene activation decreased by 89% on average across 15 tested endogenous genes (ranging from 99.5% to 43%) (FIG. 2G). This reduced efficiency was speculated to result from the DNA nicking activity of the nCas9 (H840A) in the PEAK system, as a 15% average reduction (up to 48%) was also observed in the EGFP reporter assay using nCas9 (H840A) or PE2 (FIGS. 2D and 17-19). Inspired by a truncated sgRNA design (Dahlman et al., 2015; Kiani et al., 2015) that effectively inactivates the nuclease activity of wtCas9 for efficient gene activation, a similar truncation in the 20-nt protospacer of the agRNA was hypothesized to enhance endogenous gene activation efficiency by PEAK. Therefore, a series of truncated agRNAs (driven by a DAP array) with spacer lengths ranging from 8 to 20 nt (FIG. 2H) was constructed. The gene activation capabilities of PEAK, along with other Cas9 variants, including dCas9, wtCas9, nCas9(D10A), and nCas9(H840A), were evaluated with truncated agRNAs (FIGS. 2I, 2J, and 20-22). At the IL1B site in HEK293T cells, a 19-nt agRNA demonstrated significantly improved activation, resulting in a 4,676-fold increase with PE2 and a 19,619-fold increase using PEAK. This represents a 280% and 328% improvement in activation capability compared to the 20-nt agRNA paired with PE2 (1,698-fold) or PEAK (5,974-fold), respectively (FIGS. 2I and 21). Similarly, at the RHOXF2 site in HEK293T cells, an 11-nt agRNA enabled highly efficient gene activation (15,649-fold increase), while a 20-nt agRNA resulted in significantly lower activation (2,360-fold increase) with PEAK (FIG. 2J). Further, PEAK with truncated-spacer agRNAs reached levels of activation similar to those of the dCas9-based SAM system with full-length-spacer agRNAs on the IL1B and RHOXF2 genes (FIG. 2K).
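The series of truncated agRNA spacers described above can be enumerated programmatically. The Python sketch below generates spacers of 8 to 20 nt from a full-length spacer. It assumes that truncation removes bases from the 5' (PAM-distal) end so that the PAM-proximal portion is kept, as in the cited truncated-sgRNA designs; this section does not state the truncated end explicitly. The 20-nt sequence is a hypothetical placeholder, not a spacer from this disclosure.

```python
def truncate_spacer(spacer, length):
    """Return the last `length` bases of a spacer.

    Assumption: bases are removed from the 5' (PAM-distal) end,
    keeping the PAM-proximal 3' portion intact.
    """
    if not 1 <= length <= len(spacer):
        raise ValueError("length must be between 1 and the spacer length")
    return spacer[-length:]


# Hypothetical 20-nt placeholder spacer (not a real target sequence).
FULL_SPACER = "ACGTACGTACGTACGTACGT"

# Spacer lengths from 8 to 20 nt, as in the described series.
series = {n: truncate_spacer(FULL_SPACER, n) for n in range(8, 21)}

for n, seq in series.items():
    print(f"{n:2d} nt  {seq:>20}")
```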
In HepG2 cells targeting the PDX1 gene, an agRNA with an 11-nt protospacer allowed PEAK to achieve substantially higher gene activation (438-fold increase) compared to a 20-nt agRNA (83-fold increase), yielding results comparable to the dCas9 SAM system (534-fold increase) (FIG. 22). Together, these results provide a viable strategy for efficient endogenous gene activation using PEAK and the DAP array, through the optimization of truncated agRNAs with spacer lengths between 11 and 19 nucleotides.

Example 4 - DAP shRNA array for efficient and scalable endogenous gene repression

[0178] To develop an efficient and orthogonal method for gene repression using PEAK and the DAP array, a compact and modular approach that can independently silence multiple genes without affecting gene editing or activation was sought. The CRISPR interference (CRISPRi) strategy (Qi Lei et al., 2013), utilizing PE2 and a traditional sgRNA to inhibit the transcription of targeted genes, was first adopted (FIG. 3A). An EGFP reporter was designed, and PE2 was employed to repress EGFP expression from the transfected plasmid in HEK293T cells (FIGS. 3A and 23). While PE2 was able to achieve up to 57% EGFP repression, it required laborious design and screening of the optimum target (FIG. 3B). Only 17 of the 33 tested sgRNAs repressed the EGFP reporter expression by more than 50% (FIG. 3B). In addition, no synergistic effects in EGFP repression were observed when multiple sgRNAs were co-delivered (FIG. 24). The more potent CRISPR repressor dCas9-KRAB-MeCP2 (Yeo et al., 2018) was also tested, which outperformed the PE2-mediated CRISPRi strategy, achieving an average of 88% EGFP repression with the top-performing sgRNAs (sgRNA6, 19, and 21) (FIGS. 3D, 24, and 25). Although fusing PE with KRAB-MeCP2 or other repressor proteins could potentially improve endogenous gene repression, the added repressors would not only interfere with gene activation but also increase the protein fusion size, negating the goal of creating a compact and orthogonal gene perturbation system.

[0179] As an alternative strategy, RNA interference (RNAi) is a well-established mechanism for gene silencing in human cells (Fire et al., 1998; Elbashir et al., 2001). Short hairpin RNAs (shRNAs), typically 48 base pairs in length, have been known to achieve highly efficient gene repression (Paddison et al., 2002; Brummelkamp et al., 2002; Moffat et al., 2006) and present a viable option for incorporation into the DAP array. Given the previous success in using the DAP array to express short RNAs for gene editing and activation, it was hypothesized that shRNAs could also be integrated into the DAP array for orthogonal gene repression (FIG. 3C). To test this hypothesis, the top five shRNAs (GPP shRNA1-5) generated by the genetic perturbation platform (GPP) webtool were selected to target the EGFP transcript. GPP shRNA1 repressed EGFP expression by more than 98%, surpassing the efficiency of protein-based transcription repression systems (FIG. 3D). Additionally, as the 3’ end overhang may affect an shRNA’s gene silencing function (Elbashir et al., 2001), GPP shRNA1-expressing DAP arrays were tested with various poly-A tail sequence lengths, and no influence on gene repression efficiency was found (FIG. 26).

[0180] Next, a dozen shRNAs targeting the endogenous MLH1 gene, which encodes a key protein of cellular DNA repair mechanisms (Chen et al., 2021), were designed to validate the efficacy of shRNAs generated by the DAP array in repressing an endogenous gene in human cells. The GPP1 shRNA achieved the most efficient endogenous MLH1 gene knockdown (92%), outperforming those designed by GenScript (GEN) or InvivoGen (INV) (FIG. 3E). To further examine multiplexed gene repression using this system, a DAP array was designed expressing four shRNAs designed by GPP, each targeting a key gene involved in the DNA mismatch repair (MMR) pathway - MLH1, MSH2, MSH6, and PMS2, potentially mediating prime editing in MMR-proficient cells (Chen et al., 2021). The multiplexed DAP shRNA array demonstrated highly efficient gene repression, with knockdown efficiencies of MLH1, MSH2, MSH6, and PMS2 at 85%, 48%, 66%, and 79%, respectively (FIG. 3F). In addition, the order of shRNAs in the DAP array did not affect the knockdown efficiency (FIG. 3F). When co-delivered with PEAK and an eMPE array targeting the endogenous human HEK3 locus, the DAP shRNA array marginally but significantly improved editing efficiency in HEK293T cells, an effect which could be amplified in MMR-proficient cell lines such as iPSCs or T cells (FIG. 27). Together, these findings underscore the DAP-derived shRNA array as a minimal, versatile RNA module for efficient gene repression, functioning independently of gene editing and activation by PEAK.

Example 5 - Simultaneous genetic perturbations using the DAP array, PEAK, and MPH

[0181] A minimal and versatile genetic perturbation technology (mvGPT) has been established comprising a diverse RNA-encoding DAP array, PEAK, and MPH to orchestrate endogenous gene editing, activation, and repression. To explore its potential for simultaneous gene perturbations, three distinct genetic loci relevant to Wilson’s disease, type I diabetes, or transthyretin amyloidosis were targeted (FIG. 4A). Wilson’s disease, commonly caused by a c.3207C>A; p.H1069Q mutation in the ATP7B gene (Chang & Hahn, 2017), leads to excessive copper accumulation in the body and potential liver failure (Ferenci et al., 2015). For type I diabetes, activating the PDX1 gene can induce hepatocyte transdifferentiation into pancreatic beta-like insulin-producing cells, increasing insulin levels and lowering glucose levels in the blood (Ferber et al., 2000; Liao et al., 2017). Finally, hereditary transthyretin amyloidosis is a severe disease caused by mutations in the transthyretin (TTR) gene, resulting in extracellular amyloid deposition and multiple organ dysfunctions (Adams et al., 2018; Hawkins et al., 2015). Reducing transthyretin levels has been shown to be an effective therapeutic approach to manage this disease (Paddison et al., 2022).

[0182] To simultaneously alleviate conditions associated with these genetic diseases, a DAP array was created encoding an eMPE array (to correct the ATP7B H1069Q mutation via a c.3207A>C substitution), an optimized 11-nt spacer agRNA (to activate the PDX1 gene), and the shRNA from the medication Patisiran (Adams et al., 2018) (to repress the TTR gene), which outperformed all other shRNAs that were designed and tested (FIGS. 4B, 22, and 28). As liver cells are closely involved in the cause and treatment of all three diseases, the human hepatoma cell line HepG2 was used as the testing platform. A HepG2 stable cell line was established with a short sequence containing the pathogenic ATP7B c.3207C>A; p.H1069Q mutation via lentiviral transduction. Subsequently, these HepG2 cells were transfected with plasmids encoding the designed DAP array, PEAK, and MPH (FIG. 4C). The results demonstrated a 5% ATP7B c.3207A>C correction, up to a 1700-fold activation of the PDX1 gene, and a 93% repression of the TTR gene in the HepG2 stable cell line (FIG. 4D). To further validate the capabilities of mvGPT for multiplexed gene perturbations, another DAP array was constructed with different targets. This array encoded an eMPE array to install the Wilson’s disease-causing ATP7B c.3207C>A; p.H1069Q mutation in situ, a truncated agRNA with an 11-nt spacer to activate the RHOXF2 gene, and an shRNA to silence the MLH1 gene. When tested using plasmid delivery, a 25% prime editing efficiency was observed for ATP7B c.3207C>A installation, a 700-fold gene activation was observed for RHOXF2, and an 87% repression was observed for MLH1 gene expression in HEK293T cells (FIGS. 4E and 29). Additionally, as expected, omitting MPH within the transfection group resulted in editing and repression only, while omitting PEAK resulted in only repression, demonstrating the lack of cross-talk within the mvGPT platform (FIG. 30).
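The relationship between the coding-sequence notation c.3207C>A and the protein notation p.H1069Q can be verified with simple codon arithmetic: position 3207 is the third base of codon 1069, and a His codon with C at that position becomes a Gln codon when the C is changed to A. The Python sketch below works through this under the standard genetic code, using a lookup table limited to the His and Gln codons. That the wild-type codon is CAC is inferred here from the notation rather than stated in the text, and the function names are illustrative.

```python
# His and Gln codons only (standard genetic code).
CODON_TO_AA = {"CAT": "His", "CAC": "His", "CAA": "Gln", "CAG": "Gln"}


def locate_in_codon(cds_position):
    """Map a 1-based coding-sequence position to (codon number, position 1-3)."""
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


def substitute(codon, position_in_codon, new_base):
    """Replace one base of a codon (position is 1-based)."""
    i = position_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


codon_number, position_in_codon = locate_in_codon(3207)

# The wild-type His codon must carry a C at the affected position.
wild_type = [codon for codon, aa in CODON_TO_AA.items()
             if aa == "His" and codon[position_in_codon - 1] == "C"][0]

mutant = substitute(wild_type, position_in_codon, "A")     # c.3207C>A
corrected = substitute(mutant, position_in_codon, "C")     # c.3207A>C

print(f"c.3207 is base {position_in_codon} of codon {codon_number}")
print(f"{wild_type} ({CODON_TO_AA[wild_type]}) -> {mutant} ({CODON_TO_AA[mutant]})")
print(f"{mutant} ({CODON_TO_AA[mutant]}) -> {corrected} ({CODON_TO_AA[corrected]})")
```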

[0183] Finally, alternative delivery modalities beyond plasmids were explored for mvGPT. First, the DAP array, PEAK, and MPH were prepared as messenger RNAs (mRNAs). Interestingly, delivering the DAP array as mRNA did not produce functional RNAs in cells, likely due to its inability to enter the nucleus for pre-tRNA processing (Abelson et al., 1998). As a result, instead of delivering all three components as mRNAs, the DAP array was packaged into an AAV vector (FIG. 4F). Four days after transducing HEK293T cells with AAV carrying the DAP array, cells were transfected with MPH and PEAK mRNAs. This strategy led to a 5% prime editing efficiency for installing the ATP7B c.3207C>A; p.H1069Q mutation, a 64-fold gene activation of RHOXF2, and a 75% gene repression of MLH1 (FIG. 4G). Furthermore, PEAK was packaged into a single lentiviral vector with a puromycin resistance gene, and HEK293T cells were transduced for selection. After establishing a stable cell line expressing PEAK, cells were transfected with plasmids encoding the DAP array and MPH (FIG. 4H). This approach achieved a 12% prime editing efficiency for installing the ATP7B c.3207C>A; p.H1069Q mutation, an 81-fold upregulation of RHOXF2, and a 66% gene repression of MLH1 (FIG. 4I). These results demonstrated the effective combinatorial viral and non-viral delivery of the DAP array, PEAK, and MPH for simultaneous endogenous genetic perturbations, including gene editing, gene activation, and gene repression.

Example 6 -

[0184] A 4-loci DAP multiplex array for base editing was constructed to test multiplex editing in mouse N2a cells. When tested, minimal editing activity was observed in the multiplex array delivered with the C>T base editor tadCBEd, especially compared to the efficiently edited singleplex controls with individual U6 promoters (FIG. 34). This result demonstrates that the DAP multiplex strategy, which is used in mvGPT, will not work in mouse N2a cells, and may pose a concern for any in vivo mouse studies of any application of the mvGPT platform.

[0185] It was hypothesized that the issue may be inefficient tRNA processing from the DAP array, potentially due to mouse N2a cells lacking the endogenous RNase P and RNase Z proteins that cleave the tRNA and release the individual sgRNA, shRNA, and agRNA. To test this, various mouse tRNAs that are highly expressed in the mouse brain were screened using a duplex editing system at highly efficient editing sites in N2a cells (FIG. 35). The original human CtRNA did not perform well compared to the singleplexed controls. In contrast, mouse CtRNA and QtRNA had high editing efficiency, substantially outperforming the original tRNA design. When additional leader and trailer sequences flanking the tRNA, which are used for tRNA recognition and processing, were screened, the editing efficiency was rescued, reaching levels similar to those of the singleplexed controls.
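The processing step discussed above, in which cleavage at the tRNA boundaries releases the individual small RNAs from a single tandem transcript, can be pictured with a toy string model. The Python sketch below is only a conceptual illustration: the tRNA and payload sequences are hypothetical placeholders, leader and trailer sequences are omitted, and a string split stands in for enzymatic cleavage, so it says nothing about processing efficiency in any cell type.

```python
# Hypothetical placeholder sequences (not real tRNA or guide sequences).
TRNA = "GGGGTTTTCCCC"
PAYLOADS = ["ACACACAC", "TGTGTGTG", "AGAGAGAG"]  # e.g. stand-ins for sgRNA, agRNA, shRNA


def assemble_array(payloads, trna=TRNA):
    """Build a tandem transcript in which every payload is preceded by a tRNA."""
    return "".join(trna + payload for payload in payloads)


def process_array(transcript, trna=TRNA):
    """Toy model of tRNA excision: cut out every tRNA and keep what lies between.

    Assumes no payload contains the tRNA placeholder sequence.
    """
    return [fragment for fragment in transcript.split(trna) if fragment]


transcript = assemble_array(PAYLOADS)
released = process_array(transcript)

print(transcript)
print(released)
```

In this model the released fragments are exactly the payloads that were assembled, which mirrors the intent of the single-array design: one transcript in, several independent small RNAs out.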

[0186] All of the methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.

REFERENCES

The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.

U.S. Patent 10,308,947
Abelson et al., tRNA Splicing. Journal of Biological Chemistry 273, 12685-12688 (1998).
Adams et al., Patisiran, an RNAi Therapeutic, for Hereditary Transthyretin Amyloidosis. New England Journal of Medicine 379, 11-21 (2018).
Anzalone et al., Programmable deletion, replacement, integration and inversion of large DNA sequences with twin prime editing. Nature Biotechnology 40, 731-740 (2022).
Anzalone et al., Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149-157 (2019).
Anzalone et al., Genome editing with CRISPR-Cas nucleases, base editors, transposases and prime editors. Nature Biotechnology 38, 824-844 (2020).
Bock et al., In vivo prime editing of a metabolic liver disease in mice. Science Translational Medicine 14, eabl9238 (2022).
Brummelkamp et al., A System for Stable Expression of Short Interfering RNAs in Mammalian Cells. Science 296, 550-553 (2002).
Campa et al., Multiplexed genome engineering by Cas12a and CRISPR arrays encoded on single transcripts. Nature Methods 16, 887-893 (2019).
Chang & Hahn, Chapter 3 - The genetics of Wilson disease. In: Handbook of Clinical Neurology (eds Czlonkowska A, Schilsky ML). Elsevier (2017).
Chen et al., Enhanced prime editing systems by manipulating cellular determinants of editing outcomes. Cell 184, 5635-5652.e5629 (2021).
Cong et al., Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823 (2013).
Dahlman et al., Orthogonal gene knockout and activation with a catalytically active Cas9 nuclease. Nature Biotechnology 33, 1159-1161 (2015).
Daniel et al., Revolutionizing genetic disease treatment: Recent technological advances in base editing. Current Opinion in Biomedical Engineering 28, 100472 (2023).
Das & Georgiadis, The Crystal Structure of the Monomeric Reverse Transcriptase from Moloney Murine Leukemia Virus. Structure 12, 819-829 (2004).
Dingwall et al., A polypeptide domain that specifies migration of nucleoplasmin into the nucleus. Cell 30, 449-458 (1982).
Doman et al., Phage-assisted evolution and protein engineering yield compact, efficient prime editors. Cell 186, 3983-4002.e3926 (2023).
Elbashir et al., Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 411, 494-498 (2001).
Elbashir et al., Functional anatomy of siRNAs for mediating efficient RNAi in Drosophila melanogaster embryo lysate. The EMBO Journal 20, 6877-6888 (2001).
Fakhr et al., Precise and efficient siRNA design: a key point in competent gene silencing. Cancer Gene Therapy 23, 73-82 (2016).
Farzadfard et al., Single-Nucleotide-Resolution Computing and Memory in Living Cells. Molecular Cell 75, 769-780.e764 (2019).
Ferber et al., Pancreatic and duodenal homeobox gene 1 induces expression of insulin genes in liver and ameliorates streptozotocin-induced hyperglycemia. Nature Medicine 6, 568-572 (2000).
Ferenci et al., Encephalopathy in Wilson Disease: Copper Toxicity or Liver Failure? Journal of Clinical and Experimental Hepatology 5, S88-S95 (2015).
Fire et al., Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391, 806-811 (1998).
Gao et al., Complex transcriptional modulation with orthogonal and inducible dCas9 regulators. Nature Methods 13, 1043-1049 (2016).
Gao et al., A truncated reverse transcriptase enhances prime editing by split AAV vectors. Molecular Therapy 30, 2942-2951 (2022).
Gaudelli et al., Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature 551, 464-471 (2017).
Gilon et al., Degradation signals for ubiquitin system proteolysis in Saccharomyces cerevisiae. The EMBO Journal 17, 2759-2766 (1998).
Glaser et al., GFP to BFP Conversion: A Versatile Assay for the Quantification of CRISPR/Cas9-mediated Genome Editing. Molecular Therapy - Nucleic Acids 5 (2016).
Grünewald et al., Engineered CRISPR prime editors with compact, untethered reverse transcriptases. Nature Biotechnology 41, 337-343 (2023).
Haapaniemi et al., CRISPR-Cas9 genome editing induces a p53-mediated DNA damage response. Nature Medicine 24, 927-930 (2018).
Hawkins et al., Evolving landscape in the management of transthyretin amyloidosis. Annals of Medicine 47, 625-638 (2015).
Hsu et al., PrimeDesign software for rapid and simplified design of prime editing guide RNAs. Nature Communications 12, 1-6 (2021).
Ihry et al., p53 inhibits CRISPR-Cas9 engineering in human pluripotent stem cells. Nature Medicine 24, 939-946 (2018).
Kalderon et al., A short amino acid sequence able to specify nuclear location. Cell 39, 499-509 (1984).
Kiani et al., Cas9 gRNA engineering for genome editing, activation and repression. Nature Methods 12, 1051-1054 (2015).
Komor et al., Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420-424 (2016).
Konermann et al., Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature 517, 583-588 (2015).
Kosicki et al., Repair of double-strand breaks induced by CRISPR-Cas9 leads to large deletions and complex rearrangements. Nature Biotechnology 36, 765-771 (2018).
Li et al., Generation of Destabilized Green Fluorescent Protein as a Transcription Reporter. Journal of Biological Chemistry 273, 34970-34975 (1998).
Liao et al., In Vivo Target Gene Activation via CRISPR/Cas9-Mediated Trans-epigenetic Modulation. Cell 171, 1495-1507.e1415 (2017).
McCarty et al., Multiplexed CRISPR technologies for gene editing and transcriptional regulation. Nature Communications 11, 1281 (2020).
Moffat et al., A Lentiviral RNAi Library for Human and Mouse Genes Applied to an Arrayed Viral High-Content Screen. Cell 124, 1283-1298 (2006).
Nair et al., NLSdb: database of nuclear localization signals. Nucleic Acids Research 31, 397-399 (2003).
Nelson et al., Engineered pegRNAs improve prime editing efficiency. Nature Biotechnology 40, 402-410 (2022).
Nishida et al., Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science 353, aaf8729 (2016).
Nissim et al., Multiplexed and programmable regulation of gene networks with an integrated RNA and CRISPR/Cas toolkit in human cells. Molecular Cell 54, 698-710 (2014).
Nowak et al., Structural analysis of monomeric retroviral reverse transcriptase in complex with an RNA/DNA hybrid. Nucleic Acids Research 41, 3874-3887 (2013).
Nunez et al., Genome-wide programmable transcriptional memory by CRISPR-based epigenome editing. Cell 184, 2503-2519.e2517 (2021).
Oscorbin & Filipenko, M-MuLV reverse transcriptase: Selected properties and improved mutants. Computational and Structural Biotechnology Journal 19, 6315-6327 (2021).
Paddison et al., Short hairpin RNAs (shRNAs) induce sequence-specific silencing in mammalian cells. Genes & Development 16, 948-958 (2002).
Park et al., Targeted mutagenesis in mouse cells and embryos using an enhanced prime editor. Genome Biology 22, 170 (2021).
Qi Lei et al., Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression. Cell 152, 1173-1183 (2013).
Schrodinger, The PyMOL Molecular Graphics System, Version 2.1 (2015).
Smith et al., Enabling large-scale genome editing at repetitive elements by reducing DNA nicking. Nucleic Acids Research 48, 5183-5195 (2020).
Suzuki et al., In vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration. Nature 540, 144-149 (2016).
Xiaoyi et al., Chromatin context-dependent regulation and epigenetic manipulation of prime editing. bioRxiv, 2023.2004.2012.536587 (2023).
Xie et al., Boosting CRISPR/Cas9 multiplex editing capability with the endogenous tRNA-processing system. Proceedings of the National Academy of Sciences 112, 3570-3575 (2015).
Yeo et al., An enhanced CRISPR repressor for targeted mammalian gene regulation. Nature Methods 15, 611-616 (2018).
Yuan & Gao, Multiplex base- and prime-editing with drive-and-process CRISPR arrays. Nature Communications 13, 2771 (2022).
Zeisel et al., An accessible database for mouse and human whole transcriptome qPCR primers. Bioinformatics 29, 1355-1356 (2013).
Zeng et al., Recent advances in prime editing technologies and their promises for therapeutic applications. Current Opinion in Biotechnology 86, 103071 (2024).
Zetsche et al., Multiplex gene editing by CRISPR-Cpf1 using a single crRNA array. Nature Biotechnology 35, 31-34 (2017).
Zhao et al., Multiplex Base-Editing Enables Combinatorial Epigenetic Regulation for Genome Mining of Fungal Natural Products. Journal of the American Chemical Society 145, 413-421 (2023).
Zheng et al., A flexible split prime editor using truncated reverse transcriptase improves dual-AAV delivery in mouse liver. Molecular Therapy 30, 1343-1351 (2022).

Claims

1. A nucleic acid construct comprising, from 5’ to 3’, at least four repetitions of a 5’ leader sequence derived from a tRNA and a small RNA, wherein the small RNAs are, independently, selected from the group consisting of a guide RNA, a nicking guide RNA, a prime editing guide RNA, an engineered prime editing guide RNA, and a short-hairpin RNA.

2. The nucleic acid construct of claim 1, comprising a 3’ poly T termination signal.

3. The nucleic acid construct of claim 1 or 2, wherein the nucleic acid construct lacks any further RNA polymerase III promoter sequence.

4. The nucleic acid construct of any one of claims 1-3, wherein the nucleic acid construct comprises at least one nicking guide RNA and at least one prime editing guide RNA or at least one engineered prime editing guide RNA, wherein the nicking guide RNA is positioned upstream of the prime editing guide RNA or the engineered prime editing guide RNA.

5. The nucleic acid construct of any one of claims 1-4, wherein the small RNA is a prime editing guide RNA (pegRNA), wherein the nucleic acid construct further comprises an interval sequence positioned between each pegRNA and the downstream 5’ leader sequence derived from a tRNA.

6. The nucleic acid construct of claim 5, further comprising a pseudoknot positioned at the 3’ end of the prime editing guide RNA.

7. The nucleic acid construct of any one of claims 1-6, wherein the guide RNA is an engineered guide RNA with an RNA aptamer insert, which can recruit a gene activator.

8. The nucleic acid construct of claim 7, wherein the RNA aptamer is a MS2 aptamer, which can recruit an MPH gene activator comprising an MS2 bacteriophage coat protein fused with the activation domains of P65 and HSF1 genes.

9. The nucleic acid construct of any one of claims 1-8, wherein the guide RNA comprises a spacer sequence having a length of 11-19 nucleotides.

10. The nucleic acid construct of any one of claims 1-9, wherein the leader sequence is derived from a human cysteine tRNA, optionally wherein the leader sequence comprises or consists of the sequence AGAGGGGGTATAGCTCAGTGGTAGAGCATTTGACTGCAGATCAAGAGGTCCCCGGTTCAAATCCGGGTGCCCCCT (SEQ ID NO: 1).

11. A vector comprising the nucleic acid construct of any one of claims 1-10.

12. The vector of claim 11, wherein the vector is a plasmid, a DNA virus (e.g., an adeno-associated virus), or an RNA virus (e.g., a lentivirus).

13. The vector of claim 11, wherein the nucleic acid construct is in reverse orientation relative to the viral RNA genome.

14. The vector of any one of claims 11-13, further comprising a Cas nuclease expression cassette.

15. The vector of claim 14, wherein the Cas nuclease is a base editor or a prime editor.

16. The vector of claim 15, wherein the prime editor comprises a Cas9 nuclease with an H840A substitution and a reverse transcriptase, wherein the reverse transcriptase is a truncated MMLV reverse transcriptase of amino acids 24-474, comprising D200C and V101R substitutions, and optionally comprising a C-terminal nuclear localization signal derived from SV40 (LrgT) and/or an N-terminal nuclear localization signal derived from VirD2.

17. A method of performing multiplex gene knock-out, knock-in, knock-down, deletion, disruption, correction, replacement, reversion, integration, inversion, activation, and/or epigenetic modification comprising contacting a cell with a nucleic acid construct of any one of claims 1-10 or a viral vector of any one of claims 11-16.

18. A method of treating a disease in a patient comprising administering to the patient a nucleic acid construct of any one of claims 1-10 or a viral vector of any one of claims 11-16.

19. A prime editor comprising Cas9 nuclease with an H840A substitution and a reverse transcriptase, wherein the reverse transcriptase is a truncated MMLV reverse transcriptase of amino acids 24-474, comprising D200C and V101R substitutions, and optionally comprising a C-terminal nuclear localization signal derived from SV40 (LrgT) and/or an N-terminal nuclear localization signal derived from VirD2.

20. A composition comprising the vector of any one of claims 11-16, a nucleic acid encoding a prime editor of claim 19, and a nucleic acid encoding a transcriptional fusion activator MS2-p65-HSF1 (MPH).
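
Illustrative note (not part of the claims): the array layout recited in claims 1, 2, 4, 9 and 10 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the applicants' design tool. The leader is SEQ ID NO: 1 as recited in claim 10; the `SmallRNA` type, the function names, the placeholder small-RNA sequences, and the six-T terminator length are all hypothetical choices made for illustration. The ordering check implements only one possible reading of claim 4 (the first nicking guide RNA precedes the first pegRNA or epegRNA).

```python
from dataclasses import dataclass

# SEQ ID NO: 1 from claim 10 (leader derived from a human cysteine tRNA).
LEADER = ("AGAGGGGGTATAGCTCAGTGGTAGAGCATTTGACTGCAGATCAAGAGGTCCCCGGTTCAAATCC"
          "GGGTGCCCCCT")

# Small-RNA types listed in claim 1 (the short labels are hypothetical).
ALLOWED_KINDS = frozenset({"gRNA", "ngRNA", "pegRNA", "epegRNA", "shRNA"})

# Assumption: a run of six T's stands in for the "3' poly T termination
# signal" of claim 2; the claims do not state a length.
POLY_T = "TTTTTT"


@dataclass(frozen=True)
class SmallRNA:
    kind: str      # one of ALLOWED_KINDS
    sequence: str  # placeholder DNA-sense sequence (hypothetical)


def assemble_array(units, leader=LEADER, terminator=POLY_T):
    """Concatenate leader + small RNA for each unit, 5' to 3', then a terminator."""
    if len(units) < 4:
        raise ValueError("claim 1 recites at least four leader/small-RNA repetitions")
    for unit in units:
        if unit.kind not in ALLOWED_KINDS:
            raise ValueError(f"unsupported small RNA kind: {unit.kind}")
    return "".join(leader + unit.sequence for unit in units) + terminator


def nicking_guide_upstream(units):
    """One reading of claim 4: the first ngRNA precedes the first (e)pegRNA."""
    kinds = [unit.kind for unit in units]
    peg_positions = [i for i, k in enumerate(kinds) if k in ("pegRNA", "epegRNA")]
    if "ngRNA" not in kinds or not peg_positions:
        return False
    return kinds.index("ngRNA") < peg_positions[0]


def spacer_length_ok(spacer):
    """Claim 9: spacer length of 11-19 nucleotides (inclusive)."""
    return 11 <= len(spacer) <= 19


# Usage with made-up placeholder sequences (not real guides).
demo_units = [
    SmallRNA("ngRNA", "ACGTACGTACGTACGTACGT"),
    SmallRNA("pegRNA", "GGCCAAGGCCAAGGCCAAGG"),
    SmallRNA("gRNA", "CATGCATGCATGCAT"),
    SmallRNA("shRNA", "GATCGATCGATCGATCGATC"),
]
demo_array = assemble_array(demo_units)
```

Because each unit carries its own tRNA-derived leader, the sketch needs no additional promoter element between units, which mirrors the single-array layout of claim 3. Interval sequences (claim 5) and pseudoknots (claim 6) are omitted for brevity.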