RNA-guided targeting of genetic and epigenetic regulatory proteins to specific genomic loci
A fusion protein of the catalytically inactive dCas9 protein and the transcriptional activation domain VP64 allows a functional domain to be recruited to specific genomic loci under RNA guidance, which earlier CRISPR/Cas systems did not provide. The fusion increases target gene expression, produces synergistic activation when several guide RNAs are used, and broadens the scope of gene regulation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- THE GENERAL HOSPITAL CORP
- Filing Date
- 2014-03-14
- Publication Date
- 2026-06-19
AI Technical Summary
Existing CRISPR/Cas systems struggle to target heterologous functional domains to specific genomic loci under RNA guidance to regulate gene expression, particularly to achieve transcriptional activation rather than repression.
Develop fusion proteins of catalytically inactive CRISPR-associated 9 (dCas9) proteins with heterologous functional domains (such as the transcriptional activation domain VP64) that regulate gene expression when guide RNAs direct them to specific genomic sites.
It achieves RNA-guided enhancement of gene expression and synergistic activation, provides an efficient means of regulating multiple genes in cells, and expands the application potential of the CRISPR/Cas system.
Figure CN113563476B_ABST
Description
[0001] This application is a divisional application of patent application filed on March 14, 2014, with application number 201480026276.5 and invention title "RNA-guided targeting of genetic and epigenetic regulatory proteins to specific genomic loci".
[0002] Declaration of priority
[0003] This application claims the benefit of U.S. Patent Application Serial No. 61 / 799,647, filed March 15, 2013; U.S. Patent Application Serial No. 61 / 838,178, filed June 21, 2013; U.S. Patent Application Serial No. 61 / 838,148, filed June 21, 2013; and U.S. Patent Application Serial No. 61 / 921,007, filed December 26, 2013. The full contents of the aforementioned patent application serial numbers are incorporated herein by reference.
[0004] Federally funded research or development
[0005] This invention was made with government funding under grant number DP1GM105378 from the National Institutes of Health and grant number W911NF-11-2-0056 from the Defense Advanced Research Projects Agency (DARPA) of the Department of Defense. The government has certain rights in this invention.
Technical Field
[0006] The present invention relates to methods and compositions for RNA-guided targeting of genetic and epigenetic regulatory proteins, such as transcription activators, histone-modifying enzymes, and DNA methylation modifiers, to specific genomic loci.
Background
[0007] Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (cas) genes, together known as the CRISPR / Cas system, are used by various bacteria and archaea to mediate defense against viruses and other foreign nucleic acids. These systems use small RNAs to detect and silence foreign nucleic acids in a sequence-specific manner.
[0008] Three types of CRISPR / Cas systems have been described (Makarova et al., Nat. Rev. Microbiol. 9, 467 (2011); Makarova et al., Biol. Direct 1, 7 (2006); Makarova et al., Biol. Direct 6, 38 (2011)). Recent work has shown that type II CRISPR / Cas systems can be engineered to target double-strand DNA breaks to specific sequences in vitro using a single “guide RNA” complementary to the DNA target site and the Cas9 nuclease (Jinek et al., Science 2012; 337: 816–821). This targeted Cas9-based system also works in cultured human cells (Mali et al., Science. Feb. 15, 2013; 339(6121):823-6; Cong et al., Science. Feb. 15, 2013; 339(6121):819-23) and in vivo in zebrafish (Hwang and Fu et al., Nat Biotechnol. Mar. 2013; 31(3):227-9) to induce targeted alterations in endogenous genes.
[0009] Overview
[0010] This invention is based, at least in part, on the development of fusion proteins comprising a heterologous functional domain (e.g., a transcriptional activation domain) fused to a Cas9 nuclease whose nuclease activity has been inactivated by mutation (also known as “dCas9”). While published studies have used guide RNAs to target catalytically active and catalytically inactive Cas9 nuclease proteins to specific genomic loci, no work has yet adapted this system to recruit additional effector domains. This work also provides the first evidence of an RNA-guided process that leads to an increase (rather than a decrease) in the expression level of the target gene.
[0011] In addition, this disclosure provides the first demonstration that multiple gRNAs can be used to bring multiple dCas9-VP64 fusions to a single promoter, thereby resulting in synergistic activation of transcription.
[0012] Therefore, in a first aspect, the present invention provides a fusion protein comprising a catalytically inactive CRISPR-associated 9 (dCas9) protein linked to a heterologous functional domain (HFD) that modifies gene expression, histones, or DNA, such as a transcriptional activation domain; a transcriptional repressor (e.g., a silencer such as heterochromatin protein 1 (HP1), e.g., HP1α or HP1β, or a transcriptional repression domain, e.g., a Krueppel-associated box (KRAB) domain, an ERF repression domain (ERD), or an mSin3A interaction domain (SID)); an enzyme that modifies the methylation state of DNA (e.g., a DNA methyltransferase (DNMT) or a ten-eleven translocation (TET) protein, e.g., TET1, also known as Tet methylcytosine dioxygenase 1); or an enzyme that modifies histone subunits (e.g., a histone acetyltransferase (HAT), histone deacetylase (HDAC), or histone demethylase). In some embodiments, the heterologous functional domain is a transcriptional activation domain, e.g., a transcriptional activation domain from VP64 or NF-κB p65; an enzyme that catalyzes DNA demethylation, e.g., a TET protein; a histone modifier (e.g., LSD1, a histone methyltransferase, an HDAC, or a HAT); a transcriptional silencing domain, e.g., from heterochromatin protein 1 (HP1), e.g., HP1α or HP1β; or a biological tether, e.g., CRISPR / Cas Subtype Ypest protein 4 (Csy4), MS2, or λN protein.
[0013] In some embodiments, the catalytically inactive Cas9 protein is derived from Streptococcus pyogenes.
[0014] In some embodiments, the catalytically inactive Cas9 protein contains mutations at D10, E762, H983, or D986; and at H840 or N863, for example, at D10 and H840, such as D10A or D10N and H840A or H840N or H840Y.
[0015] In some embodiments, the heterologous functional domain is linked to the N-terminus or C-terminus of the catalytically inactive Cas9 protein, optionally via an intervening linker, wherein the linker does not interfere with the activity of the fusion protein.
[0016] In some embodiments, the fusion protein includes one or both of a nuclear localization sequence and one or more epitope tags (e.g., c-myc, 6His, or FLAG tags) at the N-terminus, at the C-terminus, or between the catalytically inactive CRISPR-associated 9 (Cas9) protein and the heterologous functional domain, optionally with one or more intervening linkers.
[0017] In a further aspect, the present invention provides a nucleic acid encoding the fusion protein described herein, an expression vector comprising said nucleic acid, and a host cell expressing said fusion protein.
[0018] In another aspect, the present invention provides a method for increasing the expression of a target gene in cells. The method includes, for example, expressing a Cas9-HFD fusion protein as described herein in the cells by contacting the cells with an expression vector comprising a sequence encoding the fusion protein, and also, for example, expressing one or more guide RNAs having complementarity to a target gene in the cells by contacting the cells with one or more expression vectors comprising a nucleic acid sequence encoding one or more guide RNAs.
[0019] Unless otherwise defined, all technical and / or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Methods and materials used in this invention are described herein; other suitable methods and materials known in the art may also be used. The materials, methods, and embodiments described are exemplary only and are not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated herein by reference in their entirety. In case of any conflict, the definitions included in this patent specification shall prevail.
[0020] The features and advantages of the invention will be apparent from the following detailed description, accompanying drawings, and claims.
Brief Description of the Drawings
[0021] This patent or application document contains at least one drawing in color. A published copy of this patent or application with color drawings may be provided by the Patent Office upon request and payment of the necessary fees.
[0022] Figure 1A This is a schematic diagram illustrating how a single guide RNA (sgRNA) recruits the Cas9 nuclease to a specific DNA sequence, thereby introducing targeted alterations. The sequence of the guide RNA shown is GGAGCGAGCGGAGCGGUACAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG (SEQ ID NO:9).
[0023] Figure 1B This is a schematic diagram showing a longer form of the sgRNA used to recruit the Cas9 nuclease to a specific DNA sequence and thereby introduce targeted alterations. The sequence of the guide RNA shown is GGAGCGAGCGGAGCGGUACAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU (SEQ ID NO:10).
[0024] Figure 1C This is a schematic diagram showing a Cas9 protein that is fused to a transcriptional activation domain and recruited to a specific DNA sequence by an sgRNA. The Cas9 protein contains D10A and H840A mutations that render the nuclease portion of the protein catalytically inactive. The sequence of the guide RNA shown is GGAGCGAGCGGAGCGGUACAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU (SEQ ID NO:10).
[0025] Figure 1D This is a schematic diagram depicting the dCas9-VP64 fusion protein being recruited to a specific genomic target sequence by a chimeric sgRNA.
[0026] Figure 1E This is a simplified diagram illustrating the location and orientation of 16 sgRNAs targeted to the promoter of the endogenous human VEGFA gene. Small horizontal arrows represent the first 20 nt of gRNAs complementary to the genomic DNA sequence, with the arrows pointing from 5' to 3'. Gray bars indicate DNase I hypersensitive sites previously identified in human 293 cells relative to the transcription start site (right-angled arrows) (Liu et al., J Biol Chem. 2001 Apr 6; 276(14):11323-34).
[0027] Figure 2A This is a bar graph showing the activation of VEGFA protein expression in 293 cells by various sgRNAs, each expressed with (gray bars) or without (black bars) dCas9-VP64. The fold increase in VEGFA activation relative to the off-target sgRNA control as described in the Methods section was calculated. Each experiment was performed in triplicate, and the error bars represent the standard error of the mean. An asterisk indicates a sample significantly elevated above the off-target control, as determined by a paired one-sided t-test (p < 0.05).
[0028] Figure 2B This is a bar graph showing that expression of multiple sgRNAs induces synergistic activation of VEGFA protein expression by the dCas9-VP64 protein. It shows the fold activation of VEGFA protein in 293 cells in which the indicated combinations of sgRNAs were co-expressed with dCas9-VP64. Note that in all these experiments, the amount of each individual sgRNA expression plasmid used for transfection was the same. Fold-activation values were calculated as described for Figure 2A and are shown as gray bars. The sum of the calculated mean fold-activation values induced by the individual sgRNAs in each combination is shown as black bars. An asterisk indicates a combination found to be significantly greater than the expected sum, as determined by analysis of variance (ANOVA) (p < 0.05).
[0029] Figure 3A This is a simplified diagram illustrating the location and orientation of the six sgRNAs targeted to the promoter of the endogenous human NTF3 gene. Horizontal arrows represent the first 20 nt of the sgRNAs complementary to the genomic DNA sequence, pointing from 5' to 3'. The gray line indicates a region of potentially open chromatin identified from the ENCODE DNase I hypersensitivity track on the UCSC Genome Browser, with the thicker part of the bar indicating the first transcribed exon. The displayed numbers are relative to the transcription start site (+1, right-angled arrow).
[0030] Figure 3B This is a bar graph showing the activation of NTF3 gene expression by sgRNA-guided dCas9-VP64 in 293 cells. It shows the relative expression of NTF3 mRNA, normalized to the GAPDH control (ΔCt x 10⁴), as detected by quantitative RT-PCR in 293 cells co-transfected with the indicated amounts of dCas9-VP64 and NTF3-targeting sgRNA expression plasmids. All experiments were performed in triplicate, and the error bars represent the standard error of the mean. An asterisk indicates a sample that was significantly greater than the off-target gRNA control, as determined by a paired one-sided t-test (P < 0.05).
[0031] Figure 3C This is a bar graph showing that expression of multiple gRNAs induces synergistic activation of NTF3 mRNA expression by the dCas9-VP64 protein. It shows the relative expression of NTF3 mRNA, normalized to the GAPDH control (ΔCt x 10⁴), as detected by quantitative RT-PCR in 293 cells co-transfected with dCas9-VP64 and the indicated combinations of NTF3-targeting gRNA expression plasmids. Note that the amount of each individual gRNA expression plasmid used for transfection was identical in all these experiments. All experiments were performed in triplicate, and the error bars represent the standard error of the mean. The sum of the calculated mean fold increases induced by the individual gRNAs in each combination is shown.
[0032] Figure 4 This is an exemplary sequence of an sgRNA expression vector.
[0033] Figure 5 This is an exemplary sequence of the CMV-T7-Cas9 D10A / H840A-3XFLAG-VP64 expression vector.
[0034] Figure 6 This is an exemplary sequence of the CMV-T7-Cas9 recoded D10A / H840A-3XFLAG-VP64 expression vector.
[0035] Figure 7 This is an exemplary sequence of Cas9-HFD, i.e., Cas9-activator. Optional 3xFLAG sequences are underlined; nuclear localization signals PKKKRKVS (SEQ ID NO:11) are shown in lowercase; the two linkers are shown in bold; and the VP64 transcription activator sequence DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDML (SEQ ID NO:12) is boxed.
[0036] Figures 8A-8B These are exemplary sequences of (8A) dCas9-NLS-3XFLAG-HP1α and (8B) dCas9-NLS-3XFLAG-HP1β. Box = nuclear localization signal; underline = triple FLAG tag; double underline = HP1α hinge and chromoshadow domains.
[0037] Figure 9 This is an example sequence of dCas9-TET1.
[0038] Figure 10 This is a bar graph showing the results obtained using various dCas9-VP64 fusion constructs. Of the constructs tested, the optimized dCas9-VP64 architecture includes an N-terminal NLS (NFN) and an additional NLS (N) or FLAG tag / NLS (NF) located between dCas9 and VP64. Expression of the VEGFA gene in human HEK293 cells was activated via RNA-guided transcriptional activation mediated by the dCas9-VP64 fusion. An expression plasmid encoding a variant of dCas9-VP64 was co-transfected with a plasmid expressing three gRNAs targeting sites upstream of the VEGFA start codon (in this experiment, the gRNAs were expressed from a single gRNA and processed out by the Csy4 endoribonuclease). VEGFA protein expression was measured by ELISA, and the results are shown as the mean of two replicates, with error bars representing the standard error of the mean.
[0039] Figures 11A-11B These are bar graphs showing the activity of dCas9-VP64 activators carrying alternative substitution mutations that inactivate Cas9 catalytic function. (11A) Plasmids expressing dCas9-VP64 proteins with various Cas9-inactivating substitutions at residues D10 and H840 were co-transfected into HEK293 cells with either a single gRNA or three different gRNAs targeting the upstream region of VEGFA (blue and red bars, respectively). (11B) Plasmids expressing these dCas9-VP64 variants were also transfected into a HEK293 cell line stably expressing a single VEGFA-targeting gRNA. VEGFA protein levels were measured by ELISA, and the mean of two replicates and the standard error of the mean (error bars) are shown.
[0040] Detailed Description
[0041] Described herein are fusion proteins of heterologous functional domains (e.g., transcriptional activation domains) fused to a catalytically inactive form of the Cas9 protein, for the purpose of enabling RNA-guided targeting of these functional domains to specific genomic locations in cells and living organisms.
[0042] The CRISPR / Cas system evolved in bacteria as a defense mechanism against invading plasmids and viruses. Short protospacer sequences derived from foreign nucleic acids are integrated into a CRISPR locus and subsequently transcribed and processed into short CRISPR RNAs (crRNAs). These crRNAs, complexed with a second RNA, the tracrRNA, then use their sequence complementarity to the invading nucleic acid to guide Cas9-mediated cleavage and subsequent destruction of the foreign nucleic acid. In 2012, Doudna and colleagues demonstrated that a single guide RNA (sgRNA) composed of a fusion of a crRNA with the tracrRNA can mediate the recruitment of the Cas9 nuclease to specific DNA sequences in vitro (Figure 1C; Jinek et al., Science 2012).
[0043] More recently, longer forms of the sgRNA have been used to introduce targeted alterations in human cells and zebrafish (Figure 1B; Mali et al., Science 2013; Hwang and Fu et al., Nat Biotechnol. Mar 2013; 31(3):227-9). Qi et al. demonstrated that gRNA-mediated recruitment of a catalytically inactive mutant form of Cas9 (referred to as dCas9) can repress specific endogenous genes in E. coli and an EGFP reporter gene in human cells (Qi et al., Cell 152, 1173–1183 (2013)). Although this study demonstrated the potential of RNA-guided Cas9 technology to regulate gene expression, it did not test or show whether heterologous functional domains (e.g., transcriptional activation domains) could be fused to dCas9 without disrupting its ability to be recruited to specific genomic sites by programmable sgRNAs or dual gRNAs (dgRNAs, i.e., a customized crRNA and a tracrRNA).
[0044] As described herein, in addition to guiding Cas9-mediated nuclease activity, CRISPR-derived RNAs can be used to target heterologous functional domains fused to Cas9 (Cas9-HFD) to specific sites in the genome (Figure 1C). For example, as described herein, a single guide RNA (sgRNA) may be used to target a Cas9-HFD, such as a Cas9 transcription activator (hereinafter referred to as a Cas9 activator), to the promoter of a specific gene, thereby increasing the expression of the target gene. Thus, a Cas9-HFD can be localized to sites in the genome, with target specificity determined by the sequence complementarity of the guide RNA. Target sequences also include a PAM sequence (a 2-5 nucleotide sequence specified by the Cas9 protein that is adjacent to the sequence specified by the RNA).
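The targeting rule in the preceding paragraph (a guide-complementary sequence followed by a PAM) can be illustrated with a minimal Python sketch. It is not part of the disclosed methods. It assumes the widely reported NGG PAM of Streptococcus pyogenes Cas9 and a 20-nt protospacer, scans only the strand that is supplied, and uses hypothetical function names and a made-up example sequence.

```python
import re

def find_target_sites(dna, protospacer_len=20):
    """Return (start, protospacer, pam) for each candidate site on the given strand.

    Assumption: the PAM is NGG and lies immediately 3' of the protospacer.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

def guide_rna_targeting_region(protospacer):
    """Write the protospacer as RNA (T -> U) to give the guide's targeting region."""
    return protospacer.upper().replace("T", "U")

# Hypothetical sequence used only to exercise the functions.
example = "TTTGGAGCGAGCGGAGCGGTACAGGGAT"
for start, protospacer, pam in find_target_sites(example):
    print(start, guide_rna_targeting_region(protospacer), pam)
```

To cover both strands, the same scan would be repeated on the reverse complement of the input.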
[0045] Cas9-HFD is generated by fusing a heterologous functional domain (e.g., a transcriptional activation domain from VP64 or NF-κB p65) to the N-terminus or C-terminus of a catalytically inactive Cas9 protein.
[0046] Cas9
[0047] Many bacteria express variants of the Cas9 protein. The Cas9 from *Streptococcus pyogenes* is currently the most commonly used; some other Cas9 proteins share a high level of sequence identity with *Streptococcus pyogenes* Cas9 and use the same guide RNAs. Others are more diverse, use different gRNAs, and likewise recognize different PAM sequences (a 2–5 nucleotide sequence specified by the protein that is adjacent to the sequence specified by the RNA). Chylinski et al. classified Cas9 proteins from a large group of bacteria (RNA Biology 10:5,1–12; 2013), and many Cas9 proteins are listed in Supplementary Figure 1 and Supplementary Table 1 thereof, which are incorporated herein by reference. Additional Cas9 proteins are described in Esvelt et al., Nat Methods. 2013 Nov; 10(11):1116-21 and Fonfara et al., “Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems.” Nucleic Acids Res. 2013 Nov 22. [Epub ahead of print] doi:10.1093/nar/gkt1074.
[0048] Cas9 molecules from many species can be used in the methods and compositions described herein. While *Streptococcus pyogenes* and *Streptococcus thermophilus* Cas9 molecules are the subject of much of the disclosure herein, Cas9 molecules from other species listed herein, or Cas9 molecules derived from or based on the Cas9 proteins of said species, can also be used. In other words, while many descriptions herein use *Streptococcus pyogenes* and *Streptococcus thermophilus* Cas9 molecules, Cas9 molecules from other species can be substituted for them. Such species include those shown in the table below, which is based on Supplementary Figure 1 of Chylinski et al., 2013.
[0049]
[0050]
[0051]
[0052] The constructs and methods described herein can include the use of any of those Cas9 proteins and their corresponding guide RNAs or other compatible guide RNAs. Cas9 from the Streptococcus thermophilus LMD-9 CRISPR1 system has been shown to function in human cells in Cong et al. (Science 339, 819 (2013)). Additionally, Jinek et al. showed in vitro that Cas9 orthologs from Streptococcus thermophilus and Listeria innocua (but not from Neisseria meningitidis or Campylobacter jejuni, which may use different guide RNAs) can be guided by a dual Streptococcus pyogenes gRNA to cleave target DNA, although with slightly reduced efficiency.
[0053] In some embodiments, the system utilizes a Cas9 protein from Streptococcus pyogenes (either as encoded in bacteria or codon-optimized for expression in mammalian cells) containing mutations at D10, E762, H983, or D986 and at H840 or N863, such as D10A / D10N and H840A / H840N / H840Y, to render the nuclease portion of the protein catalytically inactive; the substitutions at these positions may be alanine (as they are in Nishimasu et al., Cell 156, 935–949 (2014)) or other residues, such as glutamine, asparagine, tyrosine, serine, or aspartate, for example, E762Q, H983N, H983Y, D986N, N863D, N863S, or N863H (Figure 1C). The sequence of the catalytically inactive Streptococcus pyogenes Cas9 that can be used in the methods and compositions described herein is as follows; the exemplary mutations D10A and H840A are shown in bold and underlined.
[0054]
[0055]
[0056] In some embodiments, the Cas9 nuclease used herein shares at least about 50% sequence identity with the Cas9 sequence of Streptococcus pyogenes, i.e., at least 50% identity with SEQ ID NO:13. In some embodiments, the nucleotide sequence shares about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99%, or 100% identity with SEQ ID NO:13.
[0057] In some embodiments, the non-catalytically active Cas9 used herein has at least about 50% sequence identity with the non-catalytically active Streptococcus pyogenes Cas9, i.e., at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99%, or 100% identity with SEQ ID NO:13, wherein mutations at D10 and H840, such as D10A / D10N and H840A / H840N / H840Y, are retained.
[0058] In some embodiments, any differences from SEQ ID NO:13 are in non-conserved regions, as identified by sequence alignment of the sequences shown in Chylinski et al., RNA Biology 10:5,1–12; 2013 (e.g., in Supplementary Figure 1 and Supplementary Table 1 thereof); Esvelt et al., Nat Methods. Nov 2013; 10(11):1116-21; and Fonfara et al., Nucl. Acids Res. (2014) 42(4):2577-2590 [Epub ahead of print November 22, 2013] doi:10.1093/nar/gkt1074, and wherein mutations at D10 and H840, e.g., D10A / D10N and H840A / H840N / H840Y, are retained.
[0059] To determine the percentage identity of two sequences, the sequences are aligned for optimal comparison purposes (gaps may be introduced in one or both of the first and second amino acid or nucleic acid sequences as needed for optimal alignment, and non-homologous sequences may be disregarded for comparison purposes). The length of the reference sequence aligned for comparison purposes is at least 50% (in some embodiments, about 50%, 55%, 60%, 65%, 70%, 75%, 85%, 90%, 95%, or 100% of the reference sequence length is aligned). Nucleotides or residues at corresponding positions are then compared. When a position in the first sequence is occupied by the same nucleotide or residue as the corresponding position in the second sequence, the molecules are identical at that position. The percentage identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps that need to be introduced for optimal alignment of the two sequences and the length of each gap.
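As an illustration of the position-by-position comparison described above, the following Python sketch computes a percentage identity from two sequences that have already been aligned. The choice of denominator (the full alignment length, gap columns included) is an assumption made for this sketch; it is one common convention and not necessarily the one used by any particular program. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the same residue.

    Assumption: identity is reported over the full alignment length, so each
    gap column lowers the value.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)

# Four identical columns out of six aligned columns.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```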
[0060] Sequence comparison and determination of percentage identity between two sequences can be accomplished using a mathematical algorithm. For the purposes of this application, the percentage identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48: 444-453) algorithm, which has been incorporated into the GAP program in the GCG software package, using a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extension penalty of 4, and a frameshift gap penalty of 5.
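For orientation, the core of Needleman-Wunsch global alignment can be sketched in a few lines of Python. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of the BLOSUM62 matrix and a single linear gap penalty in place of the separate gap, gap extension, and frameshift penalties named above. All parameter values and names are illustrative.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b; returns (aligned_a, aligned_b, score).

    Simplifications (assumptions of this sketch): flat match/mismatch scores
    and a linear gap penalty.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]

print(needleman_wunsch("ACGT", "AGT"))
```

A production comparison would instead rely on an established alignment package configured with the scoring matrix and penalties stated in the text.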
[0061] Heterologous functional domains
[0062] A transcriptional activation domain can be fused to the N-terminus or C-terminus of Cas9. Additionally, although this specification exemplifies transcriptional activation domains, other heterologous functional domains known in the art may also be used, for example: transcriptional repressors (e.g., KRAB, ERD, SID, and others, such as amino acids 473–530 of the ERF repression domain (ERD), amino acids 1–97 of the KRAB domain of KOX1, or amino acids 1–36 of the Mad mSIN3 interaction domain (SID); see Beerli et al., PNAS USA 95:14628-14633 (1998)) or silencers such as heterochromatin protein 1 (HP1, also known as swi6), e.g., HP1α or HP1β; proteins or peptides that can recruit long non-coding RNAs (lncRNAs) fused to a fixed RNA-binding sequence, such as those bound by the MS2 coat protein, the endoribonuclease Csy4, or the λN protein; enzymes that modify the methylation state of DNA (e.g., a DNA methyltransferase (DNMT) or a TET protein); or enzymes that modify histone subunits (e.g., histone acetyltransferases (HAT), histone deacetylases (HDAC), histone methyltransferases (e.g., for methylation of lysine or arginine residues), or histone demethylases (e.g., for demethylation of lysine or arginine residues)). Many sequences of such domains are known in the art, for example, a domain that catalyzes the hydroxylation of methylated cytosines in DNA. Exemplary proteins include the ten-eleven translocation (TET) 1-3 family of enzymes, which convert 5-methylcytosine (5-mC) in DNA to 5-hydroxymethylcytosine (5-hmC).
[0063] The sequences of human TET1-3 are known in the art and are shown in the table below:
[0064]
[0065] *Variant (1) represents the longer transcript and encodes the longer isoform (a). Variant (2) differs from variant (1) in the 5' UTR and 3' UTR as well as in the coding sequence. The resulting isoform (b) is shorter and has a different C-terminus compared to isoform (a).
[0066] In some embodiments, all or part of the full-length sequence of the catalytic domain may be included, for example, a catalytic module comprising the cysteine-rich extension and the 2OGFeDO domain encoded by seven highly conserved exons, e.g., the Tet1 catalytic domain comprising amino acids 1580-2052, the Tet2 catalytic domain comprising amino acids 1290-1905, and the Tet3 catalytic domain comprising amino acids 966-1678. For an alignment illustrating the key catalytic residues in all three Tet proteins, see, for example, Figure 1 of Iyer et al., Cell Cycle. June 1, 2009; 8(11):1698-710 (Epub June 27, 2009); for the full-length sequences (see, for example, seq 2c), see its supplementary material (available at ftp.ncbi.nih.gov/pub/aravind/DONS/supplementary_material_DONS.html). In some embodiments, the sequence comprises amino acids 1418-2136 of Tet1 or the corresponding region in Tet2 / 3.
[0067] Other catalytic molecules can be derived from proteins identified by Iyer et al., 2009.
[0068] In some embodiments, the heterologous functional domain is a biological tether and comprises all or part of the MS2 coat protein, the endoribonuclease Csy4, or the λN protein (e.g., a DNA-binding domain derived therefrom). These proteins can be used to recruit RNA molecules containing a specific stem-loop structure to the site specified by the dCas9 gRNA targeting sequence. For example, dCas9 fused to the MS2 coat protein, the endoribonuclease Csy4, or λN can be used to recruit a long non-coding RNA (lncRNA), such as XIST or HOTAIR, that is linked to the Csy4, MS2, or λN binding sequence; see, for example, Keryer-Bibens et al., Biol. Cell 100:125–138 (2008). Alternatively, the Csy4, MS2, or λN protein binding sequence can be linked to another protein, as described in Keryer-Bibens et al. (ibid.), and said protein can be targeted to the dCas9 binding site using the methods and compositions described herein. In some embodiments, the Csy4 is catalytically inactive.
[0069] In some embodiments, the fusion protein includes a linker between dCas9 and the heterologous functional domain. Linkers that can be used in these fusion proteins (or between fusion proteins in tandem structures) may comprise any sequence that does not interfere with the function of the fusion protein. In a preferred embodiment, the linker is short, for example, 2-20 amino acids, and is generally flexible (i.e., comprising amino acids with a high degree of conformational freedom, such as glycine, alanine, and serine). In some embodiments, the linker comprises one or more units consisting of GGGS (SEQ ID NO:14) or GGGGS (SEQ ID NO:15), for example, 2, 3, 4, or more repeats of the GGGS (SEQ ID NO:14) or GGGGS (SEQ ID NO:15) unit. Other linker sequences may also be used.
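The linker options described above can be written down concretely. The following Python sketch assembles a repeat linker from the GGGS or GGGGS unit, checks it against the preferred 2-20 amino acid range, and joins two protein segments as plain one-letter strings. The function names and the placeholder N-terminal segment are hypothetical; only the repeat units, the preferred length range, and the VP64 sequence come from the text.

```python
PREFERRED_RANGE = (2, 20)  # preferred linker length in amino acids, per the text

def make_linker(unit="GGGGS", repeats=3):
    """Build a flexible linker from repeats of the GGGS or GGGGS unit."""
    if unit not in ("GGGS", "GGGGS"):
        raise ValueError("unit must be GGGS or GGGGS")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats

def is_preferred_length(linker):
    """True when the linker length falls within the preferred range."""
    shortest, longest = PREFERRED_RANGE
    return shortest <= len(linker) <= longest

def fuse(n_terminal_part, c_terminal_part, linker=""):
    """Concatenate two protein segments with an optional linker between them."""
    return n_terminal_part + linker + c_terminal_part

# VP64 activation domain sequence as given for SEQ ID NO:12.
VP64 = "DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDML"
linker = make_linker("GGGGS", 2)
# "MDKK" is a hypothetical stand-in for the end of the dCas9 segment.
print(is_preferred_length(linker), fuse("MDKK", VP64, linker))
```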
[0070] Methods of Use
[0071] The Cas9-HFD system is a useful and versatile tool for modifying the expression of endogenous genes. Current methods for achieving this require the production of a novel engineered DNA-binding protein (such as an engineered zinc finger or transcription activator-like effector DNA-binding domain) for each site to be targeted. Because these methods require the expression of a large protein specifically engineered to bind each target site, they are limited in their capacity for multiplexing. Cas9-HFD, however, requires only the expression of a single Cas9-HFD protein, which can be targeted to multiple sites in the genome by expressing multiple short gRNAs. The system can therefore be readily used to simultaneously induce the expression of many genes or to recruit multiple Cas9-HFDs to a single gene, promoter, or enhancer. This capability will have wide-ranging applications, such as in basic biological research, where it can be used to study gene function and to manipulate the expression of multiple genes in a single pathway, and in synthetic biology, where it will allow researchers to build circuits in cells that are responsive to multiple input signals. The relative ease with which this technique can be implemented and adapted for multiplexing makes it broadly useful.
[0072] The methods described herein involve contacting cells with a nucleic acid encoding the Cas9-HFD described herein and with nucleic acids encoding one or more guide RNAs directed to one or more selected genes, thereby regulating the expression of those genes.
[0073] Guide RNA (gRNA)
[0074] Guide RNAs generally occur in two distinct systems: System 1, which uses separate crRNAs and tracrRNAs that together guide Cas9 cleavage, and System 2, which uses a chimeric crRNA-tracrRNA hybrid that combines the two separate guide RNAs in a single molecule (referred to as a single guide RNA or sgRNA; see also Jinek et al., Science 2012; 337:816–821). The tracrRNA can be variably truncated, and a range of lengths has been shown to be functional in both the separate system (System 1) and the chimeric gRNA system (System 2). For example, in some embodiments, the tracrRNA can be truncated from its 3' end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, or 40 nt. In some embodiments, the tracrRNA molecule can be truncated from its 5' end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, or 40 nt. Alternatively, the tracrRNA molecule can be truncated from both the 5' and 3' ends, for example, by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or 20 nt at the 5' end and by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, or 40 nt at the 3' end. See, for example, Jinek et al., Science 2012; 337:816–821; Mali et al., Science. Feb. 15, 2013; 339(6121):823-6; Cong et al., Science. Feb. 15, 2013; 339(6121):819-23; Hwang and Fu et al., Nat Biotechnol. Mar. 2013; 31(3):227-9; and Jinek et al., Elife 2, e00471 (2013). For System 2, longer chimeric gRNAs have generally shown greater on-target activity, but the relative specificity of gRNAs of different lengths remains undetermined, so shorter gRNAs may be desirable in some cases. In some embodiments, the gRNA is complementary to a region within approximately 100-800 bp upstream of the transcription start site (e.g., within approximately 500 bp upstream), to a region that includes the transcription start site, or to a region within approximately 100-800 bp downstream of the transcription start site (e.g., within approximately 500 bp downstream). In some embodiments, a vector (e.g., a plasmid) encoding more than one gRNA is used, such as a plasmid encoding 2, 3, 4, 5, or more gRNAs directed to different sites within the same region of the target gene.
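As a non-limiting illustration, the positional criteria above (upstream of, spanning, or downstream of the transcription start site) can be expressed as a simple coordinate check. The function `classify_target` and its coordinate convention (inclusive genomic coordinates, a single TSS position, and an 800 bp window) are assumptions made for this sketch only.

```python
# Illustrative sketch only: names and coordinate conventions are assumptions.
def classify_target(start: int, end: int, tss: int,
                    strand: str = "+", window: int = 800) -> str:
    """Classify a gRNA target interval relative to a transcription start site.

    `start` and `end` are inclusive genomic coordinates of the target site,
    `tss` is the TSS coordinate, and `strand` is the strand of the gene.
    Offsets are signed so that negative values are upstream of the TSS.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    if strand == "+":
        lo, hi = start - tss, end - tss
    elif strand == "-":
        lo, hi = tss - end, tss - start
    else:
        raise ValueError("strand must be '+' or '-'")

    if lo <= 0 <= hi:
        return "spans TSS"
    if hi < 0 and lo >= -window:
        return "upstream"
    if lo > 0 and hi <= window:
        return "downstream"
    return "outside window"


if __name__ == "__main__":
    # A 20 bp site about 500 bp upstream of a plus-strand TSS at position 1000.
    print(classify_target(500, 519, 1000))
```

Several gRNAs encoded on one plasmid could each be passed through such a check to confirm that they fall in the same region of the target gene.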
[0075] A guide RNA (e.g., a single gRNA or a tracrRNA / crRNA) having at its 5' end 17-20 nt that are complementary to the complementary strand of the genomic DNA target site can be used to guide the Cas9 nuclease to a specific 17-20 nt genomic target that bears an additional adjacent protospacer adjacent motif (PAM) having, for example, the sequence NGG. Thus, the method may include the use of a single guide RNA comprising a crRNA fused to the normally trans-encoded tracrRNA, such as the single Cas9 guide RNA described in Mali et al., Science 2013 Feb 15; 339(6121):823-6, having at its 5' end a sequence complementary to the target sequence, for example, 25-17, optionally 20 or fewer, nucleotides (nt), e.g., 20, 19, 18, or 17 nt, preferably 17 or 18 nt, of the complementary strand of a target sequence located immediately 5' of a protospacer adjacent motif (PAM) such as NGG, NAG, or NNGG. In some embodiments, the single Cas9 guide RNA consists of the following sequence:
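For illustration only, candidate target sites of the kind described above (a 17-20 nt sequence located immediately 5' of a PAM such as NGG) can be enumerated with a short script. This sketch scans only the strand supplied; the helper names `find_target_sites` and `guide_5prime` are hypothetical, and the input sequence in the usage example is made up.

```python
# Illustrative sketch only: scans one strand; names are hypothetical.
import re


def find_target_sites(dna: str, length: int = 20, pam: str = "NGG"):
    """Return (position, protospacer, PAM) for each site immediately 5' of a PAM."""
    if not 17 <= length <= 20:
        raise ValueError("length must be 17-20 nt")
    dna = dna.upper()
    pam_regex = pam.upper().replace("N", "[ACGT]")
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (length, pam_regex))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


def guide_5prime(protospacer: str) -> str:
    """RNA version of the protospacer, i.e. the 5' end of the guide (X17-20)."""
    return protospacer.upper().replace("T", "U")


if __name__ == "__main__":
    example = "TTTT" + "ACGT" * 5 + "TGG" + "AAAA"   # made-up sequence
    for pos, site, pam in find_target_sites(example, length=20):
        print(pos, guide_5prime(site), pam)
```

Passing `pam="NAG"` or `pam="NNGG"` covers the other PAM examples mentioned above; a practical search would also scan the reverse complement.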
[0076]
[0077] or
[0078]
[0079] wherein X17-20 is a nucleotide sequence complementary to 17-20 consecutive nucleotides of the target sequence. DNA encoding a single guide RNA has been described previously in the literature (Jinek et al., Science. 337(6096): 816-21 (2012) and Jinek et al., Elife. 2: e00471 (2013)).
[0080] The guide RNA may include XN, which can be any sequence that does not interfere with binding of the RNA to Cas9, wherein N (in the RNA) can be 0-200, for example 0-100, 0-50, or 0-20.
[0081] In some embodiments, the guide RNA contains one or more adenine (A) or uracil (U) nucleotides at its 3' end. In some embodiments, as a result of the optional presence of one or more T's that serve as a termination signal for RNA Pol III transcription, the RNA includes one or more U's at the 3' end of the molecule, for example, 1 to 8 or more U's (e.g., U, UU, UUU, UUUU, UUUUU, UUUUUU, UUUUUUU, UUUUUUUU).
[0082] While some examples described herein utilize a single gRNA, the methods can also be used with dual gRNAs (e.g., the crRNA and tracrRNA found in naturally occurring systems). In this case, a single tracrRNA can be used together with several different crRNAs expressed using this system, such as the following sequences:
[0083] (X17-20)GUUUUAGAGCUA (SEQ ID NO:102);
[0084] (X17-20)GUUUUAGAGCUAUGCUGUUUUG (SEQ ID NO:103); or (X17-20)GUUUUAGAGCUAUGCU (SEQ ID NO:104); and a tracrRNA sequence. In this case, the crRNA is used as the guide RNA in the methods and molecules described herein, and the tracrRNA can be expressed from the same or a different DNA molecule. In some embodiments, the method includes contacting the cell with a tracrRNA comprising or consisting of the sequence GGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:8) or an active portion thereof (an active portion is one that retains the ability to form a complex with Cas9 or dCas9). In some embodiments, the tracrRNA molecule may be truncated from its 3' end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, or 40 nt. In another embodiment, the tracrRNA may be truncated from its 5' end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, or 40 nt. Alternatively, the tracrRNA molecule may be truncated from both the 5' and 3' ends, for example, by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or 20 nt from the 5' end and by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, or 40 nt from the 3' end. In addition to SEQ ID NO:8, exemplary tracrRNA sequences include the following: UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:105) or an active portion thereof; or AGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:106) or an active portion thereof.
[0085] In some embodiments, when (X17-20)GUUUUAGAGCUAUGCUGUUUUG (SEQ ID NO:103) is used as the crRNA, the following tracrRNA is used: GGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:8) or an active portion thereof.
[0086] In some embodiments, when (X17-20)GUUUUAGAGCUA (SEQ ID NO:102) is used as the crRNA, the following tracrRNA is used: UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:105) or an active portion thereof.
[0087] In some embodiments, when (X17-20)GUUUUAGAGCUAUGCU (SEQ ID NO:104) is used as the crRNA, the following tracrRNA is used: AGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:106) or an active portion thereof.
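The crRNA / tracrRNA pairings recited above can be summarized, purely for illustration, as a lookup keyed by the crRNA 3' tail. The sketch also makes explicit an observation that can be verified from the recited sequences: SEQ ID NO:106 and SEQ ID NO:105 are 5' truncations of SEQ ID NO:8 by 15 nt and 19 nt, respectively. The helper names `make_crrna`, `tracr_for`, and `truncate` are hypothetical.

```python
# Illustrative sketch only: helper names are hypothetical.
TRACR_105 = "UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC"
TRACR_106 = "AGCA" + TRACR_105              # SEQ ID NO:106
TRACR_8 = "GGAACCAUUCAAAAC" + TRACR_106     # SEQ ID NO:8

# crRNA 3' tail -> tracrRNA recited for use with it.
PAIRING = {
    "GUUUUAGAGCUAUGCUGUUUUG": TRACR_8,
    "GUUUUAGAGCUA": TRACR_105,
    "GUUUUAGAGCUAUGCU": TRACR_106,
}


def make_crrna(spacer: str, tail: str) -> str:
    """Join a 17-20 nt RNA spacer (X17-20) to one of the recited crRNA tails."""
    spacer = spacer.upper().replace("T", "U")
    if not 17 <= len(spacer) <= 20 or set(spacer) - set("ACGU"):
        raise ValueError("spacer must be 17-20 nt of A/C/G/U")
    if tail not in PAIRING:
        raise ValueError("unknown crRNA tail")
    return spacer + tail


def tracr_for(tail: str) -> str:
    """Return the tracrRNA recited for the given crRNA tail."""
    return PAIRING[tail]


def truncate(rna: str, five: int = 0, three: int = 0) -> str:
    """Remove `five` nt from the 5' end and `three` nt from the 3' end."""
    if five < 0 or three < 0 or five + three >= len(rna):
        raise ValueError("invalid truncation")
    return rna[five:len(rna) - three]


if __name__ == "__main__":
    tail = "GUUUUAGAGCUAUGCU"
    print(make_crrna("ACGUACGUACGUACGUACGU", tail))
    print(tracr_for(tail) == truncate(TRACR_8, five=15))
```

Whether a given truncation remains an "active portion" is an experimental question; the sketch only performs the string manipulation.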
[0088] In some embodiments, the gRNA is targeted to a site that differs by at least three mismatches from any sequence in the rest of the genome, in order to minimize off-target effects.
[0089] Modified RNA oligonucleotides, such as locked nucleic acids (LNAs), have been shown to increase the specificity of RNA-DNA hybridization by locking the modified oligonucleotides in a more favorable (stable) conformation. For example, 2'-O-methyl RNA is a modified base in which there is an additional covalent linkage between the 2' oxygen and the 4' carbon; when incorporated into oligonucleotides, it can improve overall thermal stability and selectivity (Formula I).
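The mismatch criterion above can be illustrated with a minimal check against a list of candidate off-target sequences of the same length. How such candidates are collected from a genome (for example, both strands and PAM filtering) is outside this sketch; the function names and example sequences are hypothetical.

```python
# Illustrative sketch only: names and example sequences are hypothetical.
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def passes_specificity(target: str, other_sites, min_mismatches: int = 3) -> bool:
    """True if every other genomic site differs from the target by at least
    `min_mismatches` positions."""
    return all(count_mismatches(target, s) >= min_mismatches for s in other_sites)


if __name__ == "__main__":
    target = "GGTGAGTGAGTGTGTGCGTG"
    near = "TT" + target[2:]      # 2 mismatches: too similar
    far = "TTC" + target[3:]      # 3 mismatches: acceptable
    print(passes_specificity(target, [far]), passes_specificity(target, [far, near]))
```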
[0090]
[0091] Therefore, in some embodiments, the tru-gRNA disclosed herein may comprise one or more modified RNA oligonucleotides. For example, the truncated guide RNA molecule described herein may have one, some, or all of the nucleotides in the region complementary to the target sequence modified, for example, locked (2'-O-4'-C methylene bridge), 5'-methylcytidine, 2'-O-methyl-pseudouridine, or a form in which the ribose phosphate backbone has been replaced by a polyamide chain (peptide nucleic acid), e.g., a synthetic ribonucleic acid.
[0092] In other embodiments, one, some, or all of the nucleotides of the tru-gRNA sequence may be modified, for example, locked (2'-O-4'-C methylene bridge), 5'-methylcytidine, 2'-O-methyl-pseudouridine, or a form in which the ribose phosphate backbone has been replaced by a polyamide chain (peptide nucleic acid), e.g., a synthetic ribonucleic acid.
[0093] In some embodiments, the single guide RNA and / or crRNA and / or tracrRNA may contain one or more adenine (A) or uracil (U) nucleotides at the 3' end.
[0094] Existing Cas9-based RGNs use gRNA-DNA heteroduplex formation to guide targeting to genomic sites of interest. However, RNA-DNA heteroduplexes can form a more promiscuous range of structures than their DNA-DNA counterparts. In effect, DNA-DNA duplexes are more sensitive to mismatches, suggesting that a DNA-guided nuclease may not bind as readily to off-target sequences, making it comparatively more specific than an RNA-guided nuclease. Therefore, the guide RNA usable in the methods described herein can be a hybrid, i.e., one in which one or more deoxyribonucleotides, such as a short DNA oligonucleotide, replace all or part of the gRNA, for example all or part of the complementarity region of the gRNA. This DNA-based molecule can replace all or part of the gRNA in a single gRNA system, or alternatively all or part of the crRNA and / or tracrRNA in a dual crRNA / tracrRNA system. Such a system, which incorporates DNA into the complementarity region, should target the intended genomic DNA sequence more reliably than RNA-DNA duplexes because of the general intolerance of DNA-DNA duplexes to mismatches. Methods for producing such duplexes are known in the art; see, for example, Barker et al., BMC Genomics. April 22, 2005; 6:57; and Sugimoto et al., Biochemistry. September 19, 2000; 39(37):11270-81.
[0095] In addition, in systems using separate crRNA and tracrRNA, one or both may be synthetic and contain one or more modified (e.g., locked) nucleotides or deoxyribonucleotides.
[0096] In a cellular context, the complex of Cas9 with these synthetic gRNAs can be used to enhance the genome-wide specificity of the CRISPR / Cas9 nuclease system.
[0097] The method may include expressing in a cell, or contacting the cell with, a Cas9 gRNA plus a fusion protein as described herein.
[0098] Expression System
[0099] To use the described fusion protein and guide RNA, it may be desirable to express them from the nucleic acids encoding them. This can be done in several ways. For example, the nucleic acid encoding the guide RNA or fusion protein can be cloned into an intermediate vector for transformation into prokaryotic or eukaryotic cells for replication and / or expression. The intermediate vector is typically a prokaryotic vector, such as a plasmid, shuttle vector, or insect vector, which is used to store or manipulate the nucleic acid encoding the fusion protein or to produce the fusion protein. The nucleic acid encoding the guide RNA or fusion protein can also be cloned into an expression vector, for example, for administration to plant cells, animal cells, preferably mammalian or human cells, fungal cells, bacterial cells, or protozoan cells.
[0100] To achieve expression, the sequence encoding the guide RNA or fusion protein is typically subcloned into an expression vector containing a promoter that directs transcription. Suitable bacterial and eukaryotic promoters are well-known in the art and described, for example, in Sambrook et al., *Molecular Cloning, A Laboratory Manual* (3rd edition, 2001); Kriegler, *Gene Transfer and Expression: A Laboratory Manual* (1990); and *Current Protocols in Molecular Biology* (Ausubel et al., eds., 2010). Bacterial expression systems for expressing engineered proteins are available, for example, in *Escherichia coli*, *Bacillus* sp., and *Salmonella* (Palva et al., 1983, Gene 22:229-235). Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast, and insect cells are well-known in the art and are also commercially available.
[0101] The promoter used to direct nucleic acid expression depends on the specific application. For example, a strong constitutive promoter is typically used for the expression and purification of fusion proteins. In contrast, when the fusion protein is to be administered in vivo for gene regulation, either a constitutive or an inducible promoter can be used, depending on the specific use of the fusion protein. In addition, a preferred promoter for administration of the fusion protein can be a weak promoter, such as HSV-TK or a promoter with similar activity. The promoter may also include elements that respond to transactivation, such as hypoxia response elements, Gal4 response elements, lac repressor response elements, and small-molecule control systems such as tetracycline-regulated systems and the RU-486 system (see, for example, Gossen & Bujard, 1992, Proc. Natl. Acad. Sci. USA, 89:5547; Oligino et al., 1998, Gene Ther., 5:491-496; Wang et al., 1997, Gene Ther., 4:432-441; Neering et al., 1996, Blood, 88:1147-55; and Rendahl et al., 1998, Nat. Biotechnol., 16:757-761).
[0102] In addition to the promoter, the expression vector typically contains a transcription unit or expression cassette that includes all the additional elements required for expression of the nucleic acid in host cells (prokaryotic or eukaryotic). A typical expression cassette thus contains a promoter operably linked to, for example, the nucleic acid sequence encoding the fusion protein, together with any signals required for efficient polyadenylation of the transcript, transcription termination, ribosome binding, or translation termination. Additional elements of the cassette may include, for example, enhancers and heterologous spliced intronic signals.
[0103] The specific expression vector used to transport the genetic information into the cell is selected according to the intended use of the fusion protein (e.g., expression in plants, animals, bacteria, fungi, protozoa, etc.). Standard bacterial expression vectors include plasmids such as pBR322-based plasmids, pSKF, pET23D, and commercially available tag-fusion expression systems such as GST and LacZ. A preferred tag-fusion protein is maltose-binding protein (MBP). Such tag-fusion proteins can be used to purify engineered TALE repeat proteins. Epitope tags such as c-myc or FLAG can also be added to recombinant proteins to provide convenient methods of isolation, of monitoring expression, and of monitoring cellular and subcellular localization.
[0104] Expression vectors containing regulatory elements from eukaryotic viruses are often used as eukaryotic expression vectors, for example SV40 vectors, papillomavirus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009 / A+, pMTO10 / A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown to be effective for expression in eukaryotic cells.
[0105] Vectors used to express guide RNA may include RNA Pol III promoters that drive guide RNA expression, such as H1, U6, or 7SK promoters. These promoters allow gRNA expression in mammalian cells after plasmid transfection. Alternatively, the T7 promoter can be used, for example, for in vitro transcription, and the RNA can be transcribed and purified in vitro. Vectors suitable for expressing short RNAs such as siRNA, shRNA, or other small RNAs can be used.
[0106] Some expression systems have markers for selecting stably transfected cell lines, such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. High-yield expression systems are also suitable, such as those using a baculovirus vector in insect cells, with the fusion protein coding sequence under the direction of the polyhedrin promoter or another strong baculovirus promoter.
[0107] Elements typically included in expression vectors also include replicons that function in E. coli, genes encoding antibiotic resistance to allow selection of bacteria with recombinant plasmids, and unique restriction sites in non-essential regions of the plasmid that allow the insertion of recombinant sequences.
[0108] Standard transfection methods can be used to generate bacterial, mammalian, yeast, or insect cell lines expressing large amounts of protein, which are then purified using standard techniques (see, e.g., Colley et al., 1989, J. Biol. Chem., 264:17619-22; Guide to Protein Purification, in Methods in Enzymology, Vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, 1977, J. Bacteriol. 132:349-351; Clark-Curtiss & Curtiss, Methods in Enzymology 101:347-362 (Wu et al., ed., 1983)).
[0109] Any known method for introducing foreign nucleotide sequences into host cells may be used. These methods include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, nucleofection, liposomes, microinjection, naked DNA, plasmid vectors, viral vectors (both episomal and integrative), and any other known method for introducing cloned genomic DNA, cDNA, synthetic DNA, or other foreign genetic material into host cells (see, for example, Sambrook et al., ibid.). The only requirement is that the specific genetic engineering procedure used be capable of successfully introducing at least one gene into a host cell capable of expressing the selected protein.
[0110] In some embodiments, the fusion protein includes a nuclear localization domain that provides for translocation of the protein to the nucleus. Several nuclear localization sequences (NLSs) are known, and any suitable NLS can be used. For example, many NLSs have multiple basic amino acids arranged as so-called bipartite basic repeats (reviewed in Garcia-Bustos et al., 1991, Biochim. Biophys. Acta, 1071:83-101). An NLS containing bipartite basic repeats can be placed in any part of the chimeric protein and results in the chimeric protein being localized within the nucleus. In a preferred embodiment, a nuclear localization domain is incorporated into the final fusion protein, because the ultimate function of the fusion proteins described herein will generally require that the protein be localized in the nucleus. However, where the DBD domain itself, or another functional domain within the final chimeric protein, has an intrinsic nuclear translocation function, it may not be necessary to add a separate nuclear localization domain.
[0111] The present invention includes the vectors and cells comprising the vectors.
Examples
[0112] The invention is further described in the following examples, which do not limit the scope of the invention as described in the claims.
[0113] Example 1. Engineered CRISPR / Cas activation system:
[0114] It was hypothesized that RNA-guided transcriptional activators could be generated by fusing the strong synthetic VP64 activation domain (Beerli et al., Proc Natl Acad Sci USA 95, 14628–14633 (1998)) to the C-terminus of a catalytically inactive dCas9 protein (Figure 1D).
[0115] To express guide RNA (gRNA) in human cells, an engineered vector was used that expresses a full-length chimeric gRNA driven by the U6 promoter (a fusion of crRNA and tracrRNA originally described by Jinek et al. (Science 2012)). The gRNA expression plasmid was constructed as follows: Pairs of DNA oligonucleotides encoding a variable 20nt gRNA target sequence were annealed together to produce a short double-stranded DNA fragment with a 4 bp overhang (Table 1).
[0116]
[0117]
[0118] These fragments were ligated into the BsmBI-digested plasmid pMLM3636 to generate DNA encoding a chimeric ~102 nt single-chain guide RNA expressed from the human U6 promoter (Mali et al., Science. 2013 Feb 15; 339(6121):823-6; Hwang et al., Nat Biotechnol. 2013 Mar; 31(3):227-9). The pMLM3636 plasmid and its complete DNA sequence are available from Addgene. See also Figure 4.
[0119] To engineer Cas9 activators, the D10A and H840A catalytic mutations (previously described in Jinek et al., 2012; and Qi et al., 2013) were introduced into wild-type or codon-optimized Cas9 sequences (Figure 5). These mutations render Cas9 catalytically inactive, so that it no longer induces double-strand breaks. In one construct, a triple FLAG tag, a nuclear localization signal, and the VP64 activation domain were fused to the C-terminus of inactivated Cas9 (Figure 6). Expression of this fusion protein is driven by the CMV promoter.
[0120] The dCas9-VP64 expression plasmids were constructed as follows. DNA encoding the Cas9 nuclease harboring the inactivating D10A / H840A mutations (dCas9) was amplified by PCR from plasmid pMJ841 (Addgene plasmid #39318) using primers that add a T7 promoter site 5' of the start codon and a nuclear localization signal at the C-terminal end of the Cas9 coding sequence. This DNA was then cloned into a plasmid containing the CMV promoter, as previously described (Hwang et al., Nat Biotechnol 31, 227–229 (2013)), to generate plasmid pMLM3629. Oligonucleotides encoding a triple FLAG epitope were annealed and cloned into the XhoI and PstI sites of plasmid pMLM3629 to generate plasmid pMLM3647, which expresses dCas9 with a C-terminal FLAG tag. A DNA sequence encoding a Gly4Ser linker followed by the VP64 activation domain was introduced downstream of the FLAG-tagged dCas9 in plasmid pMLM3647 to generate plasmid pSL690. The D10A / H840A mutations were also introduced by QuikChange site-directed mutagenesis (Agilent) into plasmid pJDS247, which encodes a FLAG-tagged Cas9 sequence codon-optimized for expression in human cells, to generate plasmid pMLM3668. A DNA sequence encoding the Gly4Ser linker and the VP64 activation domain was subsequently cloned into pMLM3668 to generate a codon-optimized dCas9-VP64 expression vector named pMLM3705.
[0121] Cell culture, transfection, and ELISA assays were performed as follows. Flp-In T-Rex 293 cells were maintained in Advanced DMEM supplemented with 10% FBS, 1% penicillin-streptomycin, and 1% Glutamax (Invitrogen). Cells were transfected using Lipofectamine LTX (Invitrogen) according to the manufacturer's instructions. Briefly, 160,000 293 cells were seeded in 24-well plates, and the next day the cells were transfected with 250 ng gRNA plasmid, 250 ng Cas9-VP64 plasmid, 30 ng pmaxGFP plasmid (Lonza), 0.5 μL Plus reagent, and 1.65 μL Lipofectamine LTX. Tissue culture medium from the transfected 293 cells was harvested 40 hours post-transfection, and secreted VEGF-A protein was measured using the R&D Systems Human VEGF-A ELISA kit, "Human VEGF Immunoassay."
[0122] Sixteen sgRNAs were constructed against target sequences within three DNase I hypersensitive sites (HSSs) located upstream of, downstream of, or at the transcription start site of the human VEGFA gene in 293 cells (Figure 1E).
[0123] Before testing the ability of the 16 VEGFA-targeting gRNAs to recruit the novel dCas9-VP64 fusion protein, the ability of each of these gRNAs to direct the Cas9 nuclease to its desired target site in human 293 cells was first determined. For this purpose, the gRNAs and Cas9 expression vectors were transfected at a 1:3 ratio, as previous optimization experiments showed that this plasmid ratio induced high levels of Cas9-induced DNA cleavage in U2OS cells.
[0124] 293 cells were transfected as described above for the dCas9-VP16 VEGFA experiments, except that the cells were transfected with 125 ng of a plasmid encoding a VEGFA-targeting gRNA and a plasmid encoding active Cas9 nuclease (pMLM3639). Forty hours post-transfection, genomic DNA was isolated using the QIAamp DNA Blood Mini Kit (Qiagen) according to the manufacturer's instructions. PCR amplification of three different target regions in the VEGFA promoter was performed using Phusion Hot Start II high-fidelity DNA polymerase (NEB) with 3% DMSO and the following touchdown PCR program: 10 cycles of 98°C, 10 seconds; 72–62°C, -1°C / cycle, 15 seconds; 72°C, 30 seconds, followed by 25 cycles of 98°C, 10 seconds; 62°C, 15 seconds; 72°C, 30 seconds. The -500 region was amplified using primers oFYF434 (5'-TCCAGATGGCACATTGTCAG-3' (SEQ ID NO:82)) and oFYF435 (5'-AGGGAGCAGGAAAGTGAGGT-3' (SEQ ID NO:83)). The region surrounding the transcription start site was amplified using primers oFYF438 (5'-GCACGTAACCTCACTTTCCT-3' (SEQ ID NO:84)) and oFYF439 (5'-CTTGCTACCTCTTTCCTCTTTCT-3' (SEQ ID NO:85)). The +500 region was amplified using primers oFYF444 (5'-AGAGAAGTCGAGGAAGAGAGAG-3' (SEQ ID NO:86)) and oFYF445 (5'-CAGCAGAAAGTTCATGGTTTCG-3' (SEQ ID NO:87)). The PCR products were purified using Ampure XP beads (Agencourt), subjected to a T7 endonuclease I assay, and analyzed on a QIAXCEL capillary electrophoresis system as previously described (Reyon et al., Nat Biotech 30, 460-465 (2012)).
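For illustration, the two-phase thermocycling program described above can be written out cycle by cycle. The sketch assumes that the first touchdown cycle anneals at 72°C and that the annealing temperature drops by 1°C in each subsequent cycle, so that the ten touchdown cycles anneal at 72°C down to 63°C before the 25 cycles at 62°C; this reading of "72–62°C, -1°C / cycle" is an assumption, and the function name is hypothetical.

```python
# Illustrative sketch only: the cycle-by-cycle reading is an assumption.
def touchdown_program(start_anneal: int = 72, final_anneal: int = 62,
                      step: int = 1, plateau_cycles: int = 25):
    """Return a list of cycles; each cycle is a list of (temp_C, seconds) steps."""
    denature = (98, 10)
    extend = (72, 30)
    cycles = []
    anneal = start_anneal
    # Touchdown phase: lower the annealing temperature each cycle.
    while anneal > final_anneal:
        cycles.append([denature, (anneal, 15), extend])
        anneal -= step
    # Plateau phase: fixed annealing temperature.
    for _ in range(plateau_cycles):
        cycles.append([denature, (final_anneal, 15), extend])
    return cycles


if __name__ == "__main__":
    program = touchdown_program()
    print(len(program), [cycle[1][0] for cycle in program[:11]])
```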
[0125] All 16 gRNAs can mediate the efficient introduction of Cas9 nuclease-induced insertion / deletion mutations at their respective target sites, as assessed using the previously described T7E1 genotyping assay (Table 2). Thus, all 16 gRNAs can complex with the Cas9 nuclease and direct its activity to specific target genomic sites in human cells.
[0126] Table 2. Frequency of insertion / deletion mutations induced by VEGFA-targeted gRNAs and Cas9 nucleases
[0127]
[0128] To test whether the dCas9-VP64 protein could also be targeted to specific genomic sites in human cells by the same gRNAs, an enzyme-linked immunosorbent assay (ELISA) of VEGFA protein was performed as follows. Forty hours post-transfection, the culture medium of Flp-In T-Rex HEK293 cells transfected with plasmids encoding a VEGFA-targeting sgRNA and dCas9-VP64 was harvested, and VEGFA protein expression was measured as previously described (Maeder et al., Nat Methods 10, 243–245 (2013)). The fold activation of VEGFA expression was calculated by dividing the concentration of VEGFA protein in the medium of cells expressing the sgRNA and dCas9-VP64 by the concentration of VEGFA protein in the medium of cells expressing an off-target sgRNA (targeted to a sequence in the EGFP reporter gene) and dCas9-VP64.
[0129] When co-expressed with dCas9-VP64 in human 293 cells, 15 of the 16 gRNAs tested induced a significant increase in VEGFA protein expression (Figure 2A). The observed levels of VEGFA induction ranged from 2-fold to 18.7-fold activation, with an average of 5-fold activation. Control experiments showed that expression of any of the 16 gRNAs alone, of dCas9-VP64 alone, or of dCas9-VP64 together with an "off-target" gRNA designed to bind an EGFP reporter gene sequence failed to induce elevated VEGFA expression (Figure 2A). This indicates that co-expression of a specific gRNA and the dCas9-VP64 protein is required for promoter activation. Therefore, dCas9-VP64 is stably expressed and can be directed by gRNAs to activate transcription at specific genomic loci in human cells. The greatest increase in VEGFA was observed in cells transfected with gRNA3, which increased protein expression 18.7-fold. Interestingly, the three best gRNAs, and six of the nine gRNAs capable of inducing expression 3-fold or more, target the -500 region (~500 bp upstream of the transcription start site).
[0130] Because, in one aspect, the system described herein uses variable gRNAs to recruit a common dCas9-VP64 activator fusion, it is conceivable that expression of multiple guide RNAs in a single cell could enable multiplex or combinatorial activation of endogenous gene targets. To test this possibility, 293 cells were transfected with a dCas9-VP64 expression plasmid together with expression plasmids for four gRNAs (V1, V2, V3, and V4), each of which individually induces expression from the VEGFA promoter. Co-expression of all four gRNAs with dCas9-VP64 induced synergistic activation of VEGFA protein expression (i.e., fold activation greater than the expected additive effect of the individual activators) (Figure 2B). In addition, various combinations of three of these four activators also activated the VEGFA promoter synergistically (Figure 2B). Because synergistic activation of transcription is believed to result from the recruitment of multiple activation domains to a single promoter, multiple gRNA / dCas9-VP64 complexes are likely bound to the VEGFA promoter simultaneously in these experiments.
[0131] These experiments demonstrate that co-expression in human HEK293 cells of a Cas9-HFD, such as a Cas9-activator protein (with a VP64 transcriptional activation domain), and an sgRNA with a 20 nt sequence complementary to a site in the human VEGF-A promoter leads to upregulation of VEGF-A expression. The increase in VEGF-A protein was measured by ELISA, and individual gRNAs, together with the Cas9-activator fusion protein, were found to increase VEGF-A protein levels by up to ~18-fold (Figure 2A). In addition, even greater activation can be achieved through transcriptional synergy, by introducing multiple gRNAs targeting different sites in the same promoter together with the Cas9-activator fusion protein (Figure 2B).
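The fold-activation calculation described above is a simple ratio and can be sketched as follows. The function name and the concentration values in the usage example are hypothetical and are not measured data.

```python
# Illustrative sketch only: the example concentrations are made up.
def fold_activation(test_conc: float, control_conc: float) -> float:
    """VEGFA concentration with the targeting sgRNA + dCas9-VP64, divided by
    the concentration with the off-target (EGFP) sgRNA + dCas9-VP64."""
    if control_conc <= 0:
        raise ValueError("control concentration must be positive")
    return test_conc / control_conc


if __name__ == "__main__":
    # Hypothetical concentrations in arbitrary but identical units.
    print(fold_activation(test_conc=400.0, control_conc=100.0))
```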
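As an illustration of the synergy criterion stated above, a combination can be compared against an additive expectation. The paragraph does not specify how the "expected additive effect" is computed; this sketch assumes it is the sum of the individual fold-activation values, which is only one possible reading. The function name and the numbers in the usage example are hypothetical and are not measured data.

```python
# Illustrative sketch only: the additive expectation (sum of individual
# fold activations) is an assumption, and the example values are made up.
def is_synergistic(combined_fold: float, individual_folds) -> bool:
    """True if the combined fold activation exceeds the sum of the
    fold activations obtained with each gRNA alone."""
    individual_folds = list(individual_folds)
    if not individual_folds:
        raise ValueError("at least one individual value is required")
    return combined_fold > sum(individual_folds)


if __name__ == "__main__":
    singles = [2.0, 3.0, 4.0, 5.0]          # hypothetical single-gRNA values
    print(is_synergistic(20.0, singles))    # exceeds the additive expectation of 14
    print(is_synergistic(10.0, singles))    # does not
```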
[0132] Example 2. Engineered CRISPR / Cas activation system targeting endogenous human NTF3 gene
[0133] To extend the generality of this finding, it was tested whether the RNA-guided activator platform could be used to induce expression of the human NTF3 gene. To this end, six sgRNAs were designed against a predicted DNase I hypersensitive site (HSS) in the human NTF3 promoter, and plasmids expressing each of these gRNAs were co-transfected with a plasmid encoding the dCas9-VP64 protein, which had been codon-optimized for expression in human cells (Figure 3A).
[0134] All six gRNAs tested induced significant increases in NTF3 transcript levels, as detected by quantitative RT-PCR (Figure 3B). Although fold-activation values for these six RNA-guided activators cannot be calculated precisely (because baseline transcript levels are essentially undetectable), the mean levels of activated NTF3 mRNA expression varied over a 4-fold range. Reducing the amounts of transfected gRNA and dCas9-VP64 expression plasmids resulted in less NTF3 gene activation (Figure 3B), indicating a clear dose-dependent effect.
[0135] In addition, 293 cells were co-transfected with the dCas9-VP64 expression plasmid and NTF3-targeting gRNA expression plasmids, individually and in single and double combinations. The relative expression of NTF3 mRNA was detected by quantitative RT-PCR and normalized to a GAPDH control (ΔCt × 10^4). The amount of each individual gRNA expression plasmid used for transfection was the same in all of these experiments. Figure 3B demonstrates that multiplex gRNA expression induced synergistic activation of NTF3 mRNA expression by the dCas9-VP64 protein.
[0136] Example 3. Engineered CRISPR / Cas-MS2, -Csy4 and -λN fusion systems - creating biological tethers
[0137] Fusion proteins are generated in which the MS2 capsid protein, the Csy4 nuclease (preferably catalytically inactive Csy4, such as the H29A mutant described in Haurwitz et al., Science 329(5997):1355-8 (2010)), or λN is fused to the N-terminus or C-terminus of inactivated dCas9. MS2 and λN are phage proteins that bind specific RNA sequences and thus can serve as adaptors linking the dCas9 protein to a heterologous RNA sequence tagged with the specific MS2 or λN RNA-binding sequence. The dCas9-MS2 fusion or dCas9-λN fusion is co-expressed with a chimeric long non-coding RNA (lncRNA) fused at its 5' or 3' end to the MS2 or λN stem-loop recognition sequence. The chimeric Xist or chimeric RepA lncRNA will be specifically recruited by the dCas9 fusion, and the ability of this strategy to induce targeted silencing will be assayed by measuring target gene expression. The system will be optimized by testing various alterations to the capsid protein and the chimeric RNA. The N55K and ΔFG mutations in the MS2 capsid protein have previously been shown to prevent protein aggregation and to increase affinity for the stem-loop RNA. In addition, a high-affinity C-loop RNA mutant reported to increase affinity for the MS2 capsid protein will be tested. Exemplary sequences of the MS2 and λN proteins are given below; MS2 functions as a dimer, so the MS2 protein may include a fused single-chain dimer sequence.
[0138] 1. An exemplary sequence of a fusion of a single MS2 capsid protein (wt, N55K, or ΔFG) with the N-terminus or C-terminus of dCas9.
[0139] MS2 capsid protein amino acid sequence:
[0140]
[0141] MS2 N55K:
[0142]
[0143] MS2ΔFG:
[0144]
[0145] 2. Exemplary sequences of fusions of fused dimeric MS2 capsid proteins (wt, N55K, or ΔFG) with the N-terminus or C-terminus of dCas9.
[0146] Dimeric MS2 capsid protein:
[0147]
[0148]
[0149] Dimerized MS2ΔFG:
[0150]
[0151] 3. Exemplary sequences of fusions of λN and the N-terminus or C-terminus of dCas9.
[0152] λN amino acid sequence:
[0153] or
[0154]
[0155] 4. Exemplary sequences of fusions of Csy4 and the N-terminus or C-terminus of dCas9.
[0156] An exemplary sequence of Csy4 (e.g., in an inactivated form) is given in Haurwitz et al., Science 329(5997):1355-8 (2010).
[0157] The constructs are expressed in cells that also express a regulatory RNA, e.g., a long non-coding RNA (lncRNA) such as HOTAIR, HOTTIP, XIST, or XIST RepA, that has been fused at its 5' or 3' end to the cognate stem-loop recognition sequence for λN or MS2. The wild-type and high-affinity sequences for MS2 are AAACAUGAGGAUUACCCAUGUCG (SEQ ID NO:96) and AAACAUGAGGAUCACCCAUGUCG (SEQ ID NO:97), respectively (see Keryer-Bibens et al., ibid., Figure 2); the nutL and nutR BoxB sequences bound by λN are GCCUGAAGAAGGGC (SEQ ID NO:98) and GCCCUGAAAAAGGGC (SEQ ID NO:99), respectively. The sequences bound by Csy4 are GTTCACTGCCGTATAGGCAG (truncated 20 nt) (SEQ ID NO:100) or GUUCACUGCCGUAUAGGCAGCUAAGAAA (SEQ ID NO:101).
[0158] Binding of dCas9 / MS2 to a target site in a cell expressing an lncRNA tagged with an MS2-binding sequence recruits that lncRNA to the dCas9 binding site; where the lncRNA is a repressor, e.g., XIST, genes near the dCas9 binding site are repressed. Similarly, binding of dCas9 / λN to a target site in a cell expressing an lncRNA tagged with a λN-binding sequence recruits that lncRNA to the dCas9 binding site.
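As a rough illustration of how a recognition sequence might be appended to one end of a regulatory RNA, the following Python sketch uses the MS2 and Csy4 sequences quoted in this paragraph. The helper names (`to_rna`, `tag_rna`, `mismatches`) and the short placeholder RNA fragment are hypothetical and chosen only for illustration; no real lncRNA sequence is used, and linker or spacer design is not addressed.

```python
# Recognition sequences as quoted in the text (5'->3').
MS2_WT = "AAACAUGAGGAUUACCCAUGUCG"             # SEQ ID NO:96
MS2_HIGH = "AAACAUGAGGAUCACCCAUGUCG"           # SEQ ID NO:97
CSY4_DNA_20 = "GTTCACTGCCGTATAGGCAG"           # SEQ ID NO:100 (written as DNA)
CSY4_RNA_28 = "GUUCACUGCCGUAUAGGCAGCUAAGAAA"   # SEQ ID NO:101


def to_rna(seq: str) -> str:
    """Rewrite a DNA-alphabet sequence in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def tag_rna(rna: str, tag: str, end: str = "3'") -> str:
    """Append a recognition sequence to the 5' or 3' end of an RNA."""
    if end == "5'":
        return tag + rna
    if end == "3'":
        return rna + tag
    raise ValueError("end must be \"5'\" or \"3'\"")


def mismatches(a: str, b: str) -> list:
    """List (1-based position, base in a, base in b) where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Placeholder fragment, NOT a real lncRNA sequence.
placeholder_rna = "GGCAUCGAUCGA"
print(tag_rna(placeholder_rna, MS2_HIGH, end="3'"))
print(mismatches(MS2_WT, MS2_HIGH))
print(to_rna(CSY4_DNA_20) == CSY4_RNA_28[:20])
```

The comparison shows that the two MS2 sequences quoted above differ at a single position, and that the truncated 20-nt Csy4 site, once written in the RNA alphabet, is the 5' portion of the longer 28-nt site.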
[0159] Example 4. Engineered CRISPR / Cas-HP1 fusion system - sequence-specific silencing
[0160] The dCas9 fusion proteins described herein can also be used to target silencing domains, e.g., heterochromatin protein 1 (HP1, also known as swi6), such as HP1α or HP1β. Truncated forms of HP1α or HP1β in which the chromodomain has been removed can be targeted to specific loci to induce heterochromatin formation and gene silencing. Exemplary sequences of truncated HP1 fused to dCas9 are shown in Figures 8A-8B. The HP1 sequences can be fused to the N-terminus or C-terminus of the catalytically inactive dCas9 as described above.
[0161] Example 5. Engineered CRISPR / Cas-TET fusion system – Sequence-specific demethylation
[0162] The dCas9 fusion proteins described herein can also be used to target enzymes that modify the methylation state of DNA (e.g., DNA methyltransferases (DNMTs) or TET proteins). Truncated forms of TET1 can be targeted to specific loci to catalyze DNA demethylation. An exemplary sequence of a truncated TET1 fused to dCas9 is shown in Figure 9. The TET1 sequence can be fused to the N-terminus or C-terminus of the catalytically inactive dCas9 as described above.
[0163] Example 6. Engineered Optimized CRISPR / Cas-VP64 Fusion
[0164] The activity of dCas9-based transcriptional activators carrying the VP64 activation domain was optimized by varying the number and position of the nuclear localization signals (NLS) and 3xFLAG tags within these fusions (Figure 10). dCas9-VP64 fusions containing both an N-terminal NLS and an NLS located between the dCas9 and VP64 sequences consistently induced higher levels of target gene activation, possibly because of increased nuclear localization of the activator (Figure 10). Furthermore, even higher levels of activation were observed when the 3xFLAG tag was placed between the C-terminus of dCas9 and the N-terminus of VP64. The 3xFLAG tag may act as an artificial linker that provides the necessary spacing between dCas9 and VP64, and may allow better folding of the VP64 domain (which might not be possible when it is constrained near dCas9) or better recognition of VP64 by the transcriptional mediator complex that recruits RNA polymerase II. Alternatively, the negatively charged 3xFLAG tag could also act as an incidental transcriptional activation domain, thereby enhancing the effect of the VP64 domain.
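The remark about the negatively charged 3xFLAG tag can be made concrete with a crude residue count. The Python sketch below is illustrative only: it counts Lys/Arg as +1 and Asp/Glu as -1 and ignores histidine, the termini, and pH. The NLS and VP64 sequences are the one-letter forms of SEQ ID NO:11 and SEQ ID NO:12 from the sequence listing; the 3xFLAG sequence is the commonly used tag sequence and is an assumption here, since this section does not give it.

```python
def crude_net_charge(protein: str) -> int:
    """Count (Lys + Arg) minus (Asp + Glu); a rough estimate, not a pKa-based calculation."""
    protein = protein.upper()
    positive = sum(protein.count(aa) for aa in "KR")
    negative = sum(protein.count(aa) for aa in "DE")
    return positive - negative


NLS = "PKKKRKVS"                                  # SEQ ID NO:11
VP64 = "DALDDFDLDMLGS" * 3 + "DALDDFDLDML"        # SEQ ID NO:12 (50 residues)
FLAG_3X = "DYKDHDGDYKDHDIDYKDDDDK"                # assumed standard 3xFLAG sequence

for name, seq in [("NLS", NLS), ("VP64", VP64), ("3xFLAG", FLAG_3X)]:
    print(name, len(seq), crude_net_charge(seq))
```

By this rough measure both VP64 and the assumed 3xFLAG sequence are net acidic while the NLS is net basic, which is consistent with the suggestion above that the tag might contribute acidic-activator-like character; the count says nothing about actual activity.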
[0165] Example 7. Optimized catalytically inactive Cas9 protein (dCas9)
[0166] Further optimization of the dCas9-VP64 activator was achieved by changing the nature of the inactivating mutations that eliminate the nuclease activity of Cas9 in the dCas9 domain (Figures 11A-B). In studies published to date, the catalytic residues D10 and H840 have been mutated to alanine (D10A and H840A) to disrupt the active-site network that mediates DNA hydrolysis. It was hypothesized that alanine substitutions at these positions might destabilize dCas9 and therefore result in suboptimal activity. Therefore, more structurally conservative substitutions at D10 or H840 (e.g., to asparagine or tyrosine residues: D10N, H840N, and H840Y) were tested to determine whether dCas9-VP64 fusions carrying these different mutations would give greater gene activation. When dCas9-VP64 variants carrying these substitutions were co-transfected into HEK293 cells with three gRNAs targeting the upstream region of the endogenous human VEGFA gene, increased VEGFA protein expression was observed for all but one variant (Figure 11A). However, when the dCas9-VP64 variants were co-transfected with only one of these gRNAs (Figure 11A), or were transfected into a HEK293-derived cell line expressing a single VEGFA-targeting gRNA (Figure 11B), this effect was less pronounced.
[0167] Other embodiments
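The substitution names used above (D10A, D10N, H840A, H840N, H840Y) follow the usual wild-type residue, position, new residue notation. The following Python sketch shows one way such a name could be applied to a sequence with a check on the expected wild-type residue. The function name is hypothetical, and the 16-residue fragment is an assumption: it is the start of SEQ ID NO:13 written in one-letter code with position 10 set to Asp, as implied by the name D10A.

```python
import re


def apply_substitution(seq: str, mutation: str) -> str:
    """Apply a point substitution such as 'D10A' (1-based position) to a protein sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {seq[position - 1]}")
    return seq[:position - 1] + new + seq[position:]


# Assumed N-terminal fragment with Asp at position 10 (illustration only).
WT_NTERM = "MDKKYSIGLDIGTNSV"

for name in ("D10A", "D10N"):
    print(name, apply_substitution(WT_NTERM, name))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, for example when a tag or NLS has been added to the N-terminus and shifts the positions.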
[0168] It should be understood that although the invention has been described in conjunction with its detailed description, the foregoing description is intended to illustrate and not to limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
Sequence Listing
<110> The General Hospital Corporation <120> RNA-guided targeting of genetic and epigenetic regulatory proteins to specific genomic loci <130> 00786-0882WO1 <140> PCT / US2014 / 027335 <141> 2014-03-14 <150> 61 / 921,007 <151> 2013-12-26 <150> 61 / 838,178 <151> 2013-06-21 <150> 61 / 838,148 <151> 2013-06-21 <150> 61 / 799,647 <151> 2013-03-15 <160> 113 <170> PatentIn version 3.5 <210> 1 <211> 262 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided polynucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. <220> <221> Modified bases <222> (63)..(262) <223> a, c, u, g, unknown regions or other regions and this region may include 0-200 nucleotides, some of which may be absent. <400> 1 nnnnnnnnnn nnnnnnnnnn guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60 cgnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240 nnnnnnnnnn nnnnnnnnnn nn 262 <210> 2 <211> 275 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided polynucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. <220> <221> Modified bases <222> (76)..(275) <223> a, c, u, g, unknown regions or other regions and this region may include 0-200 nucleotides, some of which may be absent.
<400> 2 nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcugaaa agcauagcaa guuaaaauaa 60 ggcuaguccg uuaucnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnn 275 <210> 3 <211> 287 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided polynucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. <220> <221> Modified bases <222> (88)..(287) <223> a, c, u, g, unknown regions or other regions and this region may include 0-200 nucleotides, some of which may be absent. <400> 3 nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcuguuu uggaaacaaa acagcauagc 60 aaguuaaaau aaggcuaguc cguuaucnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnn 287 <210> 4 <211> 296 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided polynucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. <220> <221> Modified bases <222> (97)..(296) <223> a, c, u, g, unknown regions or other regions and this region may include 0-200 nucleotides, some of which may be absent. 
<400> 4 nnnnnnnnnn nnnnnnnnnn guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60 cguuaucaac uugaaaaagu ggcaccgagu cggugcnnnn nnnnnnnnnn nnnnnnnnnn 120 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnn 296 <210> 5 <211> 96 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided oligonucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. <400> 5 nnnnnnnnnn nnnnnnnnnn guuuaagagc uagaaauagc aaguuuaaau aaggcuaguc 60 cguuaucaac uugaaaaagu ggcaccgagu cggugc 96 <210> 6 <211> 106 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided polynucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. <400> 6 nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcuggaa acagcauagc aaguuuaaau 60 aaggcuaguc cguuaucaac uugaaaaagu ggcaccgagu cggugc 106 <210> 7 <211> 106 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided polynucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. 
<400> 7 nnnnnnnnnn nnnnnnnnnn guuuaagagc uaugcuggaa acagcauagc aaguuuaaau 60 aaggcuaguc cguuaucaac uugaaaaagu ggcaccgagu cggugc 106 <210> 8 <211> 79 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided oligonucleotides <400> 8 ggaaccauuc aaaacagcau agcaaguuaa aauaaggcua guccguuauc aacuugaaaa 60 aguggcaccg agucggugc 79 <210> 9 <211> 62 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided oligonucleotides <400> 9 ggagcgagcg gagcgguaca guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60 cg 62 <210> 10 <211> 100 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Guided polynucleotides <400> 10 ggagcgagcg gagcgguaca guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60 cguuaucaac uugaaaaagu ggcaccgagu cggugcuuuu 100 <210> 11 <211> 8 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Nuclear localization signal peptide <400> 11 Pro Lys Lys Lys Arg Lys Val Ser 1 5 <210> 12 <211> 50 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic VP64 domain peptide <400> 12 Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu 1 5 10 15 Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe 20 25 30 Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp 35 40 45 Met Leu 50 <210> 13 <211> 1368 <212> PRT <213> Streptococcus pyogenes <400> 13 Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val 1 5 10 15 Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 20 25 30 Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile 35 40 45 Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 50 55 60 Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 65 70 75 80 Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 85 90 95 Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 100 105 
110 His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr 115 120 125 His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 130 135 140 Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His 145 150 155 160 Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro 165 170 175 Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 180 185 190 Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 195 200 205 Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 210 215 220 Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 225 230 235 240 Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 245 250 255 Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 260 265 270 Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp 275 280 285 Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 290 295 300 Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser 305 310 315 320 Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 325 330 335 Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 340 345 350 Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 355 360 365 Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp 370 375 380 Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg 385 390 395 400 Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu 405 410 415 Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe 420 425 430 Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435 440 445 Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 450 455 460 Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu 465 470 475 480 Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr 485 490 495 Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 500 505 510 Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 515 520 525 
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln 530 535 540 Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr 545 550 555 560 Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565 570 575 Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 580 585 590 Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 595 600 605 Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 610 615 620 Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala 625 630 635 640 His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr 645 650 655 Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp 660 665 670 Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 675 680 685 Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 690 695 700 Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu 705 710 715 720 His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 725 730 735 Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 740 745 750 Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 755 760 765 Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 770 775 780 Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro 785 790 795 800 Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 805 810 815 Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 820 825 830 Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys 835 840 845 Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850 855 860 Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys 865 870 875 880 Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys 885 890 895 Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 900 905 910 Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 915 920 925 Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930 935 940 Glu 
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser 945 950 955 960 Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 965 970 975 Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 980 985 990 Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe 995 1000 1005 Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala 1010 1015 1020 Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe 1025 1030 1035 Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala 1040 1045 1050 Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu 1055 1060 1065 Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val 1070 1075 1080 Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr 1085 1090 1095 Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys 1100 1105 1110 Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro 1115 1120 1125 Light Light Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val 1130 1135 1140 Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys 1145 1150 1155 Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser 1160 1165 1170 Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys 1175 1180 1185 Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu 1190 1195 1200 Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly 1205 1210 1215 Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val 1220 1225 1230 Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser 1235 1240 1245 Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys 1250 1255 1260 His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys 1265 1270 1275 Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala 1280 1285 1290 Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn 1295 1300 1305 Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala 1310 1315 1320 Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser 1325 1330 1335 Thr Lys Glu Val Leu Asp Ala Thr Leu Ile 
His Gln Ser Ile Thr 1340 1345 1350 Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp 1355 1360 1365 <210> 14 <211> 4 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Connector peptide <400> 14 Gly Gly Gly Ser 1 <210> 15 <211> 5 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Connector peptide <400> 15 Gly Gly Gly Gly Ser 1 5 <210> 16 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 16 gtgtgcagac ggcagtcact agg 23 <210> 17 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 17 gagcagcgtc ttcgagagtg agg 23 <210> 18 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 18 ggtgagtgag tgtgtgcgtg tgg 23 <210> 19 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 19 gttggagcgg ggagaaggcc agg 23 <210> 20 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 20 gggtgggggg agtttgctcc tgg 23 <210> twenty one <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> twenty one ggctttggaa agggggtggg ggg 23 <210> twenty two <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> twenty two ggggcggggt cccggcgggg cgg 23 <210> twenty three <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: 
Synthetic Oligonucleotides at target binding sites <400> twenty three gctcggaggt cgtggcgctg ggg 23 <210> twenty four <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> twenty four gactcaccgg ccagggcgct cgg 23 <210> 25 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 25 ggcgcagcgg ttaggtggac cgg 23 <210> 26 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 26 ggcgcatggc tccgccccgc cgg 23 <210> 27 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 27 gccacgacct ccgagctacc cgg 23 <210> 28 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 28 gcggcgtgag ccctccccct tgg 23 <210> 29 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 29 ggaggcgggg tggagggggt cgg 23 <210> 30 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 30 gggctcacgc cgcgctccgg cgg 23 <210> 31 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 31 gaccccctcc accccgcctc cgg 23 <210> 32 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 32 gagcgcggag ccatctggcc ggg 23 <210> 33 <211> twenty three <212> DNA <213> Artificial sequence 
<220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 33 gcgcggcgcg gaaggggtta agg 23 <210> 34 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 34 gcggcgcggc gcgggccggc ggg 23 <210> 35 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 35 gccgcgccgc cctcccccgc cgg 23 <210> 36 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 36 gcggttataa ccagccaacc cgg 23 <210> 37 <211> twenty three <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 37 gtgcgcggag ctgttcggaa ggg 23 <210> 38 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 38 acaccgtgtg cagacggcag tcactg 26 <210> 39 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 39 acaccgagca gcgtcttcga gagtgg 26 <210> 40 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 40 acaccggtga gtgagtgtgt gcgtgg 26 <210> 41 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 41 acaccgttgg agcggggaga aggccg 26 <210> 42 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 42 acaccgggtg gggggagttt gctccg 26 <210> 43 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description 
of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 43 acaccggctttggaaagggg gtgggg 26 <210> 44 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 44 acaccggggc ggggtcccgg cggggg 26 <210> 45 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 45 acaccgctcg gaggtcgtgg cgctgg 26 <210> 46 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 46 acaccgactc accggccagg gcgctg 26 <210> 47 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 47 acaccggcgc agcggttagg tggacg 26 <210> 48 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 48 acaccggcgc atggctccgc cccgcg 26 <210> 49 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 49 acaccgccacgacctccgag ctaccg 26 <210> 50 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 50 acaccgcggc gtgagccctccccctg 26 <210> 51 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 51 acaccggagg cggggtggag ggggtg 26 <210> 52 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 52 acaccgggct cacgccgcgc tccggg 26 <210> 53 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides 
at target binding sites <400> 53 acaccgaccc cctccacccc gcctcg 26 <210> 54 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 54 acaccgagcg cggagccatc tggccg 26 <210> 55 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 55 acaccgcgcg gcgcggaagg ggttag 26 <210> 56 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 56 acaccgcggc gcggcgcggg ccggcg 26 <210> 57 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 57 acaccgccgc gccgccctcc cccgcg 26 <210> 58 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 58 acaccgcggt tataaccagc caaccg 26 <210> 59 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 59 acaccgtgcg cggagctgtt cggaag 26 <210> 60 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 60 aaaacagtga ctgccgtctg cacacg 26 <210> 61 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 61 aaaaccactc tcgaagacgc tgctcg 26 <210> 62 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 62 aaaaccacgc acacactcac tcaccg 26 <210> 63 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 63 
aaaacggccttctccccgct ccaacg 26 <210> 64 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 64 aaaacggagc aaactccccc cacccg 26 <210> 65 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 65 aaaaccccaccccctttcca aagccg 26 <210> 66 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 66 aaaacccccg ccgggacccc gccccg 26 <210> 67 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 67 aaaaccagcg ccacgacctc cgagcg 26 <210> 68 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 68 aaaacagcgc cctggccggt gagtcg 26 <210> 69 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 69 aaaacgtcca cctaaccgct gcgccg 26 <210> 70 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 70 aaaacgcggg gcggagccat gcgccg 26 <210> 71 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 71 aaaacggtag ctcggaggtc gtggcg 26 <210> 72 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 72 aaaacagggg gagggctcac gccgcg 26 <210> 73 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 73 aaaacaccccctccaccccgcctccg 26 <210> 74 <211> 26 
<212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 74 aaaacccgga gcgcggcgtg agcccg 26 <210> 75 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 75 aaaacgaggc ggggtggagg gggtcg 26 <210> 76 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 76 aaaacggcca gatggctccg cgctcg 26 <210> 77 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 77 aaaactaacc ccttccgcgc cgcgcg 26 <210> 78 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 78 aaaacgccgg cccgcgccgc gccgcg 26 <210> 79 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 79 aaaacgcggg ggagggcggc gcggcg 26 <210> 80 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 80 aaaacggttg gctggttata accgcg 26 <210> 81 <211> 26 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Oligonucleotides at target binding sites <400> 81 aaaacttccg aacagctccg cgcacg 26 <210> 82 <211> 20 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Primers <400> 82 tccagatggc acattgtcag 20 <210> 83 <211> 20 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Primers <400> 83 agggagcagg aaagtgaggt 20 <210> 84 <211> 20 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Primers <400> 84 gcacgtaacc 
tcactttcct 20 <210> 85 <211> 23 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Primers <400> 85 cttgctacct ctttcctctt tct 23 <210> 86 <211> 22 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Primers <400> 86 agagaagtcg aggaagagag ag 22 <210> 87 <211> 22 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Primers <400> 87 cagcagaaag ttcatggttt cg 22 <210> 88 <211> 130 <212> PRT <213> Enterobacterial bacteriophage λ <400> 88 Met Ala Ser Asn Phe Thr Gln Phe Val Leu Val Asp Asn Gly Gly Thr 1 5 10 15 Gly Asp Val Thr Val Ala Pro Ser Asn Phe Ala Asn Gly Val Ala Glu 20 25 30 Trp Ile Ser Ser Asn Ser Arg Ser Gln Ala Tyr Lys Val Thr Cys Ser 35 40 45 Val Arg Gln Ser Ser Ala Gln Asn Arg Lys Tyr Thr Ile Lys Val Glu 50 55 60 Val Pro Lys Val Ala Thr Gln Thr Val Gly Gly Val Glu Leu Pro Val 65 70 75 80 Ala Ala Trp Arg Ser Tyr Leu Asn Met Glu Leu Thr Ile Pro Ile Phe 85 90 95 Ala Thr Asn Ser Asp Cys Glu Leu Ile Val Lys Ala Met Gln Gly Leu 100 105 110 Leu Lys Asp Gly Asn Pro Ile Pro Ser Ala Ile Ala Ala Asn Ser Gly 115 120 125 Ile Tyr 130 <210> 89 <211> 130 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic λ phage MS2 N55K mutant peptide <400> 89 Met Ala Ser Asn Phe Thr Gln Phe Val Leu Val Asp Asn Gly Gly Thr 1 5 10 15 Gly Asp Val Thr Val Ala Pro Ser Asn Phe Ala Asn Gly Val Ala Glu 20 25 30 Trp Ile Ser Ser Asn Ser Arg Ser Gln Ala Tyr Lys Val Thr Cys Ser 35 40 45 Val Arg Gln Ser Ser Ala Gln Lys Arg Lys Tyr Thr Ile Lys Val Glu 50 55 60 Val Pro Lys Val Ala Thr Gln Thr Val Gly Gly Val Glu Leu Pro Val 65 70 75 80 Ala Ala Trp Arg Ser Tyr Leu Asn Met Glu Leu Thr Ile Pro Ile Phe 85 90 95 Ala Thr Asn Ser Asp Cys Glu Leu Ile Val Lys Ala Met Gln Gly Leu 100 105 110 Leu Lys Asp Gly Asn Pro Ile Pro Ser Ala Ile Ala Ala Asn Ser Gly 115 120 125 Ile Tyr 130 <210> 90 <211> 117 <212> PRT <213> 
Artificial sequence <220> <223> Description of artificial sequences: Synthetic λ phage MS2 ΔFG mutant peptide <400> 90 Met Ala Ser Asn Phe Thr Gln Phe Val Leu Val Asp Asn Gly Gly Thr 1 5 10 15 Gly Asp Val Thr Val Ala Pro Ser Asn Phe Ala Asn Gly Ile Ala Glu 20 25 30 Trp Ile Ser Ser Asn Ser Arg Ser Gln Ala Tyr Lys Val Thr Cys Ser 35 40 45 Val Arg Gln Ser Ser Ala Gln Asn Arg Lys Tyr Thr Ile Lys Val Glu 50 55 60 Val Pro Lys Gly Ala Trp Arg Ser Tyr Leu Asn Met Glu Leu Thr Ile 65 70 75 80 Pro Ile Phe Ala Thr Asn Ser Asp Cys Glu Leu Ile Val Lys Ala Met 85 90 95 Gln Gly Leu Leu Lys Asp Gly Asn Pro Ile Pro Ser Ala Ile Ala Ala 100 105 110 Asn Ser Gly Ile Tyr 115 <210> 91 <211> 262 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Dimeric MS2 capsid peptide <400> 91 Met Ala Ser Asn Phe Thr Gln Phe Val Leu Val Asp Asn Gly Gly Thr 1 5 10 15 Gly Asp Val Thr Val Ala Pro Ser Asn Phe Ala Asn Gly Val Ala Glu 20 25 30 Trp Ile Ser Ser Asn Ser Arg Ser Gln Ala Tyr Lys Val Thr Cys Ser 35 40 45 Val Arg Gln Ser Ser Ala Gln Asn Arg Lys Tyr Thr Ile Lys Val Glu 50 55 60 Val Pro Lys Val Ala Thr Gln Thr Val Gly Gly Val Glu Leu Pro Val 65 70 75 80 Ala Ala Trp Arg Ser Tyr Leu Asn Met Glu Leu Thr Ile Pro Ile Phe 85 90 95 Ala Thr Asn Ser Asp Cys Glu Leu Ile Val Lys Ala Met Gln Gly Leu 100 105 110 Leu Lys Asp Gly Asn Pro Ile Pro Ser Ala Ile Ala Ala Asn Ser Gly 115 120 125 Leu Tyr Gly Ala Met Ala Ser Asn Phe Thr Gln Phe Val Leu Val Asp 130 135 140 Asn Gly Gly Thr Gly Asp Val Thr Val Ala Pro Ser Asn Phe Ala Asn 145 150 155 160 Gly Val Ala Glu Trp Ile Ser Ser Asn Ser Arg Ser Gln Ala Tyr Lys 165 170 175 Val Thr Cys Ser Val Arg Gln Ser Ser Ala Gln Asn Arg Lys Tyr Thr 180 185 190 Ile Lys Val Glu Val Pro Lys Val Ala Thr Gln Thr Val Gly Gly Val 195 200 205 Glu Leu Pro Val Ala Ala Trp Arg Ser Tyr Leu Asn Met Glu Leu Thr 210 215 220 Ile Pro Ile Phe Ala Thr Asn Ser Asp Cys Glu Leu Ile Val Lys Ala 225 230 235 240 Met Gln Gly Leu Leu Lys Asp Gly Asn Pro Ile Pro Ser 
Ala Ile Ala 245 250 255 Ala Asn Ser Leu Ile Asn 260 <210> 92 <211> 262 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Dimeric MS2 N55K mutant capsid peptide <400> 92 Met Ala Ser Asn Phe Thr Gln Phe Val Leu Val Asp Asn Gly Gly Thr 1 5 10 15 Gly Asp Val Thr Val Ala Pro Ser Asn Phe Ala Asn Gly Val Ala Glu 20 25 30 Trp Ile Ser Ser Asn Ser Arg Ser Gln Ala Tyr Lys Val Thr Cys Ser 35 40 45 Val Arg Gln Ser Ser Ala Gln Lys Arg Lys Tyr Thr Ile Lys Val Glu 50 55 60 Val Pro Lys Val Ala Thr Gln Thr Val Gly Gly Val Glu Leu Pro Val 65 70 75 80 Ala Ala Trp Arg Ser Tyr Leu Asn Met Glu Leu Thr Ile Pro Ile Phe 85 90 95 Ala Thr Asn Ser Asp Cys Glu Leu Ile Val Lys Ala Met Gln Gly Leu 100 105 110 Leu Lys Asp Gly Asn Pro Ile Pro Ser Ala Ile Ala Ala Asn Ser Gly 115 120 125 Leu Tyr Gly Ala Met Ala Ser Asn Phe Thr Gln Phe Val Leu Val Asp 130 135 140 Asn Gly Gly Thr Gly Asp Val Thr Val Ala Pro Ser Asn Phe Ala Asn 145 150 155 160 Gly Val Ala Glu Trp Ile Ser Ser Asn Ser Arg Ser Gln Ala Tyr Lys 165 170 175 Val Thr Cys Ser Val Arg Gln Ser Ser Ala Gln Lys Arg Lys Tyr Thr 180 185 190 Ile Lys Val Glu Val Pro Lys Val Ala Thr Gln Thr Val Gly Gly Val 195 200 205 Glu Leu Pro Val Ala Ala Trp Arg Ser Tyr Leu Asn Met Glu Leu Thr 210 215 220 Ile Pro Ile Phe Ala Thr Asn Ser Asp Cys Glu Leu Ile Val Lys Ala 225 230 235 240 Met Gln Gly Leu Leu Lys Asp Gly Asn Pro Ile Pro Ser Ala Ile Ala 245 250 255 Ala Asn Ser Leu Ile Asn 260 <210> 93 <211> 236 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Dimeric MS2 ΔFG mutant capsid peptide <400> 93 Met Ala Ser Asn Phe Thr Gln Phe Val Leu Val Asp Asn Gly Gly Thr 1 5 10 15 Gly Asp Val Thr Val Ala Pro Ser Asn Phe Ala Asn Gly Val Ala Glu 20 25 30 Trp Ile Ser Ser Asn Ser Arg Ser Gln Ala Tyr Lys Val Thr Cys Ser 35 40 45 Val Arg Gln Ser Ser Ala Gln Lys Arg Lys Tyr Thr Ile Lys Val Glu 50 55 60 Val Pro Lys Gly Ala Trp Arg Ser Tyr Leu Asn Met Glu Leu Thr Ile 65 70 75 80 Pro Ile Phe Ala 
Thr Asn Ser Asp Cys Glu Leu Ile Val Lys Ala Met 85 90 95 Gln Gly Leu Leu Lys Asp Gly Asn Pro Ile Pro Ser Ala Ile Ala Ala 100 105 110 Asn Ser Gly Leu Tyr Gly Ala Met Ala Ser Asn Phe Thr Gln Phe Val 115 120 125 Leu Val Asp Asn Gly Gly Thr Gly Asp Val Thr Val Ala Pro Ser Asn 130 135 140 Phe Ala Asn Gly Val Ala Glu Trp Ile Ser Ser Asn Ser Arg Ser Gln 145 150 155 160 Ala Tyr Lys Val Thr Cys Ser Val Arg Gln Ser Ser Ala Gln Lys Arg 165 170 175 Lys Tyr Thr Ile Lys Val Glu Val Pro Lys Gly Ala Trp Arg Ser Tyr 180 185 190 Leu Asn Met Glu Leu Thr Ile Pro Ile Phe Ala Thr Asn Ser Asp Cys 195 200 205 Glu Leu Ile Val Lys Ala Met Gln Gly Leu Leu Lys Asp Gly Asn Pro 210 215 220 Ile Pro Ser Ala Ile Ala Ala Asn Ser Leu Ile Asn 225 230 235 <210> 94 <211> 22 <212> PRT <213> Escherichia virus lambda <400> 94 Met Asp Ala Gln Thr Arg Arg Arg Glu Arg Arg Ala Glu Lys Gln Ala 1 5 10 15 Gln Trp Lys Ala Ala Asn 20 <210> 95 <211> 107 <212> PRT <213> Escherichia virus lambda <400> 95 Met Asp Ala Gln Thr Arg Arg Arg Glu Arg Arg Ala Glu Lys Gln Ala 1 5 10 15 Gln Trp Lys Ala Ala Asn Pro Leu Leu Val Gly Val Ser Ala Lys Pro 20 25 30 Val Asn Arg Pro Ile Leu Ser Leu Asn Arg Lys Pro Lys Ser Arg Val 35 40 45 Glu Ser Ala Leu Asn Pro Ile Asp Leu Thr Val Leu Ala Glu Tyr His 50 55 60 Lys Gln Ile Glu Ser Asn Leu Gln Arg Ile Glu Arg Lys Asn Gln Arg 65 70 75 80 Thr Trp Tyr Ser Lys Pro Gly Glu Arg Gly Ile Thr Cys Ser Gly Arg 85 90 95 Gln Lys Ile Lys Gly Lys Ser Ile Pro Leu Ile 100 105 <210> 96 <211> 23 <212> RNA <213> Enterobacterial bacteriophage λ <400> 96 aaacaugagg auuacccaug ucg 23 <210> 97 <211> 23 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic High-affinity MS2-binding oligonucleotides <400> 97 aaacaugagg aucacccaug ucg 23 <210> 98 <211> 15 <212> RNA <213> Enterobacterial bacteriophage λ <400> 98 gcccugaaga agggc 15 <210> 99 <211> 15 <212> RNA <213> Enterobacterial bacteriophage λ <400> 99 gcccugaaaa agggc 15 <210> 100 <211> 20 <212> DNA 
<213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic truncated Csy4 binding site oligonucleotides <220> <223> Description of combined DNA / RNA molecules: Synthetic truncated Csy4 binding site oligonucleotides <400> 100 gttcactgcc gtataggcag 20 <210> 101 <211> 28 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Csy4 binding site oligonucleotide <400> 101 guucacugcc guauaggcag cuaagaaa 28 <210> 102 <211> 32 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic crRNA oligonucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. <400> 102 nnnnnnnnnn nnnnnnnnnn guuuuagagc ua 32 <210> 103 <211> 42 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic crRNA oligonucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. <400> 103 nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcuguuu ug 42 <210> 104 <211> 36 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic crRNA oligonucleotides <220> <221> Modified bases <222> (1)..(20) <223> a, c, u, g, unknown regions or other regions and this region may include 17-20 nucleotides, some of which may be absent. 
<400> 104 nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcu 36 <210> 105 <211> 60 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic tracrRNA oligonucleotides <400> 105 uagcaaguua aaauaaggcu aguccguuau caacuugaaa aaguggcacc gagucggugc 60 <210> 106 <211> 64 <212> RNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic tracrRNA oligonucleotides <400> 106 agcauagcaa guuaaaauaa ggcuaguccg uuaucaacuu gaaaaagugg caccgagucg 60 gugc 64 <210> 107 <211> 2279 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic RNA expression vector polynucleotide <220> <221> Modified bases <222> (331)..(350) <223> a, c, t, g, unknown or other <400> 107 gacgtcgcta gctgtacaaa aaagcaggct ttaaaggaac caattcagtc gactggatcc 60 ggtaccaagg tcgggcagga agagggccta tttcccatga ttccttcata tttgcatata 120 cgatacaagg ctgttagaga gataattaga attaatttga ctgtaaacac aaagatatta 180 gtacaaaata cgtgacgtag aaagtaataa tttcttgggt agtttgcagt tttaaaatta 240 tgttttaaaa tggactatca tatgcttacc gtaacttgaa agtatttcga tttcttggct 300 ttatatatct tgtggaagg nnnnnnnn nnnnnnn nn gttggagc 360 tagaaatagc aagttaaaat aaggctagtc cgttatcac ttgaaaagt ggcaccgagt 420 cggtgctttt tttaagcttg ggccgctcga ggtaccctc tacatatgac atgtgagcaa 480 aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgtttt ttccataggc 540 tccgcccccc tgacgagcat cacaaaatc gacgctcag tcagaggtgg cgaaacccga 600 caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc 660 cgaccctgcc gcttaccgga tacctgtccg ccttctccc ttcgggaagc gtggcgcttt 720 ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct 780 gtgtgcacga accccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg 840 agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta 900 gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct 960 acactagaag aacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa 1020 gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt 1080 gcaagcagca gattacgc 
agaaaaaaag gatctcaaga agatcctttg atcttttcta 1140 cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat 1200 caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa 1260 gtatatatga gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct 1320 cagcgatctg tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta 1380 cgatacggga gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct 1440 caccggctcc agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg 1500 gtcctgcaac tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa 1560 gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt 1620 cacgctcgtc gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta 1680 catgatcccc catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca 1740 gaagtaagtt ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta 1800 ctgtcatgcc atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct 1860 gagaatagtg tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg 1920 cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac 1980 tctcaaggat cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact 2040 gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa 2100 atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt 2160 ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat 2220 gtatttagaa aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacct 2279 <210> 108 <211> 7786 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic CMV-T7-Cas9 D10A / H840A-3xFlag-VP64 polynucleotide <400> 108 atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc tggcattatg 60 cccagtacat gaccttatgg gactttccta cttggcagta catctacgta ttagtcatcg 120 ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag cggtttgact 180 cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt tggcaccaaa 240 atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa atgggcggta 300 ggcgtgtacg gtgggaggtc tatataagca gagctggttt agtgaaccgt cagatccgct 360 
agagatccgc ggccgctaat acgactcact ataggagag ccgccaccat ggataagaaa 420 tactcaatag gcttagctat cggcacaaat agcgtcggat gggcggtgat cactgatgaa 480 tataaggttc cgtctaaaaa gttcaaggtt ctgggaaata cagaccgcca fotatcaaa 540 aaaaatctta taggggctct tttatttgac agtggagaga cagcggaagc gactcgtctc 600 aaacggacag ctcgtagaag gtatacacgt cggaagaatc gtatttgtta tctacaggag 660 attttttcaa atgagatggc gaaagtagat gatagtttct ttcatcgact tgaagagtct 720 tttttggtgg aagaagacaa gaagcatgaa cgtcatccta tttttggaaa tatagtagat 780 gaagttgctt atcatgagaa atatccaact atctatcatc tgcgaaaaaa attggtagat 840 tctactgata aagcggattt gcgcttaatc tatttggcct tagcgcatat gattaagttt 900 cgtggtcatt ttttgattga gggagattta aatcctgata atagtgatgt ggacaacta 960 tttatccagt tggtacaaac ctacaatcaa ttattgaag aaaaccctat taacgcaagt 1020 ggagtagatg ctaaagcgat tctttctgca cgattgagta aatcaagacg attagaaaat 1080 ctcattgctc agctccccgg tgagaagaaa aatggcttat ttgggaatct cattgctttg 1140 tcattgggtt tgacccctaa ttttaaatca aattttgatt tggcagaaga tgctaaatta 1200 cagctttcaa aagatactta cgatgatgat ttagataatt tattggcgca aattggagat 1260 caatatgctg attgttttt ggcagctaag aatttatcag atgctatttt actttcagat 1320 atcctaagag taaatactga aataactaag gctcccctat cagcttcaat gattaaacgc 1380 tacgatgaac atcatcaaga cttgactctt ttaaaagctt tagttcgaca acaacttcca 1440 gaaaagtata aagaaatctt ttttgatcaa tcaaaaaacg gatatgcagg ttatattgat 1500 gggggagcta gccaagaaga attttataaa tttatcaaac caattttaga aaaaatggat 1560 ggtactgagg aattatggt gaaactaaat cgtgaagatt tgctgcgcaa gcaacggacc 1620 tttgacaacg gctctattcc ccatcaaatt cacttgggtg agctgcatgc tattttgaga 1680 agacaagaag acttttatcc attttaaaaa gacaatcgtg agaagattga aaaaatcttg 1740 acttttcgaa ttccttatta tgttggtcca ttggcgcgtg gcaatagtcg ttttgcatgg 1800 atgactcgga agtctgaaga aacaattacc ccatggaatt ttgaagaagt tgtcgataaa 1860 ggtgcttcag ctcaatcatt tattgaacgc atgacaaact ttgataaaaa tcttccaaat 1920 gaaaaagtac taccaaaaca tagtttgctt tatgagtatt ttacggttta taacgaattg 1980 acaaaggtca aatatgttac tgaaggaatg cgaaaaccag catttctttc aggtgaacag 2040 aagaaagcca ttgttgattt 
actcttcaaa acaaatcgaa aagtaaccgt taagcaatta 2100 aaagaagatt atttcaaaaa aatagaatgt tttgatagtg ttgaaatttc aggagttgaa 2160 gatagattta atgcttcatt aggtacctac catgatttgc taaaattat taaagataaa 2220 gattttttgg ataatgaaga aaatgaagat atcttagagg atattgtttt aacattgacc 2280 ttatttgaag atagggagat gattgaggaa agacttaaaa catatgctca cctctttgat 2340 gataaggtga tgaaacagct taaacgtcgc cgttatactg gttggggacg tttgtctcga 2400 aaattgatta atggtattag ggataagcaa tctggcaaaa caatattaga ttttttgaaa 2460 tcagatggtt ttgccaatcg caattttatg cagctgatcc atgatgatag tttgacattt 2520 aaagaagaca ttcaaaaagc acaagtgtct ggacaaggcg atagtttaca tgaacatatt 2580 gcaaatttag ctggtagccc tgctattaaa aaaggtattt tacagactgt aaaagttgtt 2640 gatgaattgg tcaaagtaat ggggcggcat aagccagaaa atatcgttat tgaaatggca 2700 cgtgaaaatc agacaactca aaagggccag aaaaattcgc gagagcgtat gaaacgaatc 2760 gaagaaggta tcaaagaatt aggaagtcag attcttaaag agcatcctgt tgaaaatact 2820 caattgcaaa atgaaaagct ctatctctat tatctccaaa atggagaaga catgtatgtg 2880 gaccaagaat tagatattaa tcgtttaagt gattatgatg tcgatgccat tgttccacaa 2940 agtttcctta aagacgattc aatagacaat aaggtcttaa cgcgttctga taaaaatcgt 3000 ggtaaatcgg ataacgttcc aagtgaagaa gtagtcaaaa agatgaaaaa ctattggaga 3060 caacttctaa acgccaagtt aatcactcaa cgtaagtttg ataatttaac gaaagctgaa 3120 cgtggaggtt tgagtgaact tgataaagct ggtttatca aacgccaatt ggttgaaact 3180 cgccaaatca ctaagcatgt ggcacaaatt ttggatagtc gcatgaatac taatacgat 3240 gaaaatgata aacttattcg agaggttaaa gtgattacct taaaatctaa attagttct 3300 gacttccgaa aagatttcca attctataaa gtacgtgaga ttaacaatta ccatcatgcc 3360 catgatgcgt atctaaatgc cgtcgttgga actgctttga ttaagaaata tccaaaactt 3420 gaatcggagt ttgtctatgg tgattataaa gtttatgatg ttcgtaaaat gattgctaag 3480 tctgagcaag aaataggcaa agcaaccgca aatattct tttactctaa tatcatgaac 3540 ttcttcaaaa cagaaattac acttgcaaat ggagagattc gcaaacgccc tctaatcgaa 3600 actaatgggg aactggaga aattgtctgg gataaagggc gagattttgc cacagtgcgc 3660 aaagtattgt ccatgcccca agtcaatatt gtcagaaaa cagaagtaca gandaggcgga 3720 ttctccaagg agtcaattt accaaaaga 
aattcggaca agcttattgc tcgtaaaaaa 3780 gactgggatc caaaaaaata tggtggtttt gatagtccaa cggtagctta ttcagtccta 3840 gtggttgcta aggtggaaaa aggaatcg aagaagttaa aatccgttaa agagttacta 3900 gggatcacaa tttaggaag aagttccttt gaaaaaaatc cgattgactt tttagaagct 3960 aaaggatata agagttaa aaagactta atcattaac tacctaata tagtcttttt 4020 gagttagaaa acggtcgtaa acggatgctg gctagtgccg gagattaca aaaggaaat 4080 gagctggctc tgccaagcaa atgtgaat ttttatatt tagctgtca ttatgaaag 4140 ttgaagggta gtccagaga taacgaacaaaacaattgt tgtggagca gcataagcat 4200 tattagatg agattattga gcaatcagt gatttttcta agcgtgttat tttagcagat 4260 gccaatttag aaagttct tagtgcatat aaaaacata gagacaacc atacgtgaa 4320 caagcagaaa atattattca tttatttacg ttgacgaatc ttggagctcc cgctgctttt 4380 aaatattttg atacaacaat tgatcgtaaa cgatatacgt ctacaaaaga agttttagat 4440 gccactctta tccatcaatc catcactggt ctttatgaaa cacgcattga tttgagtcag 4500 ctaggaggtg acggttctcc caagaagaag aggaaagtct cgagcgacta caaagaccat 4560 gacggtgatt ataaagatca tgacatcgat tacaaggatg acgatgacaa ggctgcagga 4620 ggcggtggaa gcgggcgcgc cgacgcgctg gacgatttcg atctcgacat gctgggttct 4680 gatgccctcg atgactttga cctggatatg ttgggaagcg acgcattgga tgactttgat 4740 ctggacatgc tcggctccga tgctctggac gatttcgatc tcgatatgtt ataaccggtc 4800 atcatcacca tcaccattga gtttaaaccc gctgatcagc ctcgactgtg ccttctagtt 4860 gccagccatc tgttgtttgc ccctcccccg tgccttcctt gaccctggaa ggtgccactc 4920 ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt aggtgtcatt 4980 ctattctggg gggtggggtg gggcaggaca gcaaggggga ggattgggaa gacaatagca 5040 ggcatgctgg ggatgcggtg ggctctatgg cttctgaggc ggaaagaacc agctggggct 5100 cgataccgtc gacctctagc tagagcttgg cgtaatcatg gtcatagctg tttcctgtgt 5160 gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag 5220 cctagggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt 5280 tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 5340 gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 5400 ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 5460 
caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 5520 aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 5580 atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 5640 cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 5700 ccgcctttct cccttcggga agcgtggcgc tttctcaatg ctcacgctgt aggtatctca 5760 gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 5820 accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 5880 cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 5940 cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 6000 gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 6060 aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 6120 aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 6180 actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt 6240 taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca 6300 gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca 6360 tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc 6420 ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa 6480 accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc 6540 agtctattaa ttgttgccgg gaagctagag tagtagttc gccagttaat agtttgcgca 6600 acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat 6660 tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag 6720 cggttagctc cttcggtcct ccgatcgttg tcaagtaa gttggccgca gtgttatcac 6780 tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt 6840 ctgtgactgg tgagtactca accaagtcat tctgagata gtgtatgcgg cgaccgagtt 6900 gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc 6960 tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat 7020 ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca 7080 7140 cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg 7200 gttattgtct 
catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg 7260 ttccgcgcac atttccccga aaagtgccac ctgacgtcga cggatcggga gatcgatctc 7320 ccgatcccct agggtcgact ctcagtacaa tctgctctga tgccgcatag ttaagccagt 7380 atctgctccc tgcttgtgtg ttggaggtcg ctgagtagtg cgcgagcaaa atttaagcta 7440 caacaaggca aggcttgacc gacaattgca tgaagaatct gcttagggtt aggcgttttg 7500 cgctgcttcg cgatgtacgg gccagatata cgcgttgaca ttgattattg actagttatt 7560 aatagtaatc aattacgggg tcattagttc atagcccata tatggagttc cgcgttacat 7620 aacttacggt aaatggcccg cctggctgac cgcccaacga cccccgccca ttgacgtcaa 7680 taatgacgta tgttcccata gtaacgccaa tagggacttt ccattgacgt caatgggtgg 7740 actatttacg gtaaactgcc cacttggcag tacatcaagt gtatcc 7786 <210> 109 <211> 7785 <212> DNA <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic CMV-T7-Cas9 recoded D10A / H840A-3xFLAG-VP64 polynucleotide <400> 109 atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc tggcattatg 60 cccagtacat gaccttatgg gactttccta cttggcagta catctacgta ttagtcatcg 120 ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag cggtttgact 180 cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt tggcaccaaa 240 atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa atgggcggta 300 ggcgtgtacg gtgggaggtc tatataagca gagctggttt agtgaaccgt cagatccgct 360 agagatccgc ggccgctaat acgactcact atagggagag ccgccaccat ggataaaaaag 420 tattctattg gtttagccat cggcactaat tccgttggat gggctgtcat aaccgatgaa 480 tacaaagtac cttcaaagaa atttaaggtg ttgggggaaca cagaccgtca ttcgattaaa 540 aagaatctta tcggtgccct cctattcgat agtggcgaaa cggcagaggc gactcgcctg 600 aaacgaaccg ctcggagaag gtatacacgt cgcaagaacc gaatatgtta cttacaagaa 660 atttttagca atgagatggc caaagttgac gattctttct ttcaccgtttt ggaagagtcc 720 ttccttgtcg agaggacaa gaaacatgaa cggcacccca tcttggaaa catagtagat 780 gaggtggcat atcatgaaaa gtacccacg atttatcacc tcagaaaaaa gctagttgac 840 tcaactgata aagcggacct gaggttaatc tacttggctc ttgcccatat gataaagttc 900 cgtgggcact ttctcattga gggtgatcta aatccggaca actcggatgt cgacaactg 960 
ttcatccagt tagtacaaac ctataatcag ttgtttgaag agaaccctat aaatgcaagt 1020 ggcgtggatg cgaaggctat tcttagcgcc cgcctctcta atcccgacg gctagaaaac 1080 ctgatcgcac aattacccgg agagagaaa aatggttgt tcggtaacct tatagcgctc 1140 tcactaggcc tgacaccaaa tttaagtcg aacttcgact tagctgaga tgccaattg 1200 cagcttagta aggacacgta cgatgacgat ctcgacaatc tactggcaca attggagat 1260 cagtatgcgg acttatttt ggctgccaaa aaccttagcg atgcaatcct cctatctgac 1320 atactgagag ttatactga gattaccaag gcgccgttat ccgctcaat gatcaaagg 1380 1440. to take away atcaccaaga cttgacactt ctcaaggccc tagtccgtca gcaactgcct 1500. 1500. 1500. 1500. 1500. 1500. 1500. 1500. 1500. 1500 ggcggagcga gtcaagagga attctacaag tttatcaaac ccatattaga gaagatggat gggacggaag agttgcttgt aaaactcaat cgcgaagatc tactgcgaaa gcagcggact ttcgacaacg gtagcattcc acatcaaatc cacttaggcg aattgcatgc father aggcaggagg atttttatcc gttcctcaaa gacaatcgtg aaaagattga gaaaatccta acctttcgca taccttacta tgtgggaccc ctggcccgag ggaactctcg gttcgcatgg atgacaaga agtccgaaga aacgattact ccatggaatt ttgaggaagt tgtcgataaa ggtgcgtcag ctcaatcgtt catcgagagg atgaccaact ttgacaagaa tttaccgaac gaaaaagtat tgcctaagca cagtttactt tcgagtatt tcacagtgta caatgaactc acgaaagtta agtatgtcac tgagggcatg cgtaaacccg cctttctaag cggagaacag aagaaagcaa tagtagtct gttattcaag accaaccgca aagtgacagt tagcaattg aaagaggact actttaagaa aattgaatgc ttcgattctg tcgagatctc cggggtagaa 2160 gatcgattta atgcgtcact tggtacgtat catgacctcc taaagataat taaagataag 2220 gacttcctgg ataacgaaga gaatgaagat atcttagaag atatagtgtt gactcttacc 2280 ctctttgaag atcgggaaat gattgaggaa agactaaaaa catacgctca cctgttcgac 2340 gataaggtta tgaaacagtt aaagaggcgt cgctatacgg gctgggggacg attgtcgcgg 2400 aaacttatca acgggataag agacaagcaa agtggtaaaa ctattctcga tttcttaaag 2460 agcgacggct tcgccaatag gaactttatg cagctgatcc atgatgactc tttaaccttc 2520 aaagaggata tacaaaaggc acaggtttcc ggacaagggg actcattgca cgaacatatt 2580 gcgaatcttg ctggttcgcc agccatcaaa aagggcatac tccagacagt caaagtagtg 2640 gatgagctag ttaaggtcat gggacgtcac aaaccggaaa acattgtaat cgagatggca 2700 cgcgaaaatc 
aaacgactca gaaggggcaa aaaaacagtc gagagcggat gaagagaata 2760 gaagagggta ttaaagaact gggcagccag atcttaaagg agcatcctgt ggaaaatacc 2820 caattgcaga acgagaact tacctctat tacctacaa atggaaggga catgtatgtt 2880 gatcaggaac tggacataaa ccgtttatct gattacgacg tcgatgccat tgtaccccaa 2940 tcctttttga aggacgattc atcgacaat aaagtgctta cacgctcgga taagaaccga 3000 gggaaaagtg acatgttcc aagcgaggaa gtcgtaaga aaatgaagaa ctattggcgg 3060 cagctcctaa atgcgaaact gataacgca aggaagttcg attackac taagctgag 3120 agggtggct tgtctgaact tgacaggcc ggatttatta aacgtcagct cgtggaaacc 3180 cgccaaatca aaagcatgt tgcacagata ctagattccc gatgaatac gaatacgac 3240 gagaacgata agctgattcg ggaagtcaa gtaatcactt taagtcaa attggtgtcg 3300 gacttcagaa aggattttca attcttaaa gttagggaga taaataacta ccaccatgcg 3360 cacgacgctt atcttaatgc cgtcgtaggg accgcactca ttaagaata cccgaagcta 3420 gaaagtgagt ttgtgtatgg tgattacaa gtttatgacg tccgtagat gatcgcgaaa 3480 agcgaacagg agataggcaa ggctacagcc aaatactct tttattctaa cattatgaat 3540 ttctttaaga cggaaatcac tctggcaaac ggagagatac gcaaacgacc tttaattgaa 3600 accaatgggg agacaggtga aatcgtatgg gataagggcc gggacttcgc gacggtgaga 3660 aaagttttgt ccatgcccca agtcaacata gtaaagaaaa ctgaggtgca gaccggaggg 3720 ttttcaaagg aatcgattct tccaaaaagg aatagtgata agctcatcgc tcgtaaaaag 3780 gactgggacc cgaaaaagta cggtggcttc gatagcccta cagttgccta ttctgtccta 3840 gtagtggcaa aagttgagaa gggaaaatcc aagaaactga agtcagtcaa agaattattg 3900 gggataacga ttatggagcg ctcgtctttt gaaaagaacc ccatcgactt cttgaggcg 3960 aaaggttaca aggaagtaaa aaaggatctc ataattaaac taccaaagta tagtctgttt 4020 gagttagaaa atggccgaaa acggatgttg gctagcgccg gagagcttca aaaggggaac 4080 gaactcgcac taccgtctaa atacgtgaat ttcctgtatt tagcgtccca ttacgagaag 4140 ttgaaaggtt cacctgaaga taacgaacag aagcaacttt ttgttgagca gcacaaacat 4200 tatctcgacg aaatcataga gcaaatttcg gaattcagta agagagtcat cctagctgat 4260 gccaatctgg acaaagtatt aagcgcatac aacaagcaca gggataaacc catacgtgag 4320 caggcggaaa atattatcca tttgtttact cttaccaacc tcggcgctcc agccgcattc 4380 aagtattttg acacaacgat agatcgcaaa cgatacactt 
ctaccaagga ggtgctagac 4440 gcgacactga ttcaccaatc catcacggga ttatatgaaa ctcggataga tttgtcacag 4500 cttgggggtg acggatcccc caagaagaag aggaaagtct cgagcgacta caaagaccat 4560 gacggtgatt ataaagatca tgacatcgat tacaaggatg acgatgacaa ggctgcagga 4620 ggcggtggaa gcgggcgcgc cgacgcgctg gacgatttcg atctcgacat gctgggttct 4680 gatgccctcg atgactttga cctggatatg ttgggaagcg acgcattgga tgactttgat 4740 ctggacatgc tcggctccga tgctctggac gatttcgatc tcgatatgtt ataaccggtc 4800 atcatcacca tcaccattga gtttaaaccc gctgatcagc ctcgactgtg ccttctagtt 4860 gccagccatc tgttgtttgc ccctcccccg tgccttcctt gaccctggaa ggtgccactc 4920 ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt aggtgtcatt 4980 ctattctggg gggtggggtg gggcaggaca gcaaggggga ggattgggaa gacaatagca 5040 ggcatgctgg ggatgcggtg ggctctatgg cttctgaggc ggaaagaacc agctggggct 5100 cgataccgtc gacctctagc tagagcttgg cgtaatcatg gtcatagctg tttcctgtgt 5160 gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag 5220 cctagggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt 5280 tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 5340 gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 5400 ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 5460 caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 5520 aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 5580 atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 5640 cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 5700 ccgcctttct cccttcggga agcgtggcgc tttctcaatg ctcacgctgt aggtatctca 5760 gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 5820 accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 5880 cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 5940 cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 6000 gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 6060 aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 
cgcagaaaaa 6120 aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 6180 actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt 6240 taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca 6300 gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca 6360 tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc 6420 ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa 6480 accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc 6540 agtctattaa ttgttgccgg gaagctagag tagtagttc gccagttaat agtttgcgca 6600 acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat 6660 tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag 6720 cggttagctc cttcggtcct ccgatcgttg tcaagtaa gttggccgca gtgttatcac 6780 tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt 6840 ctgtgactgg tgagtactca accaagtcat tctgagata gtgtatgcgg cgaccgagtt 6900 gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc 6960 tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat 7020 ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca 7080 7140 cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta gaaaataaa caatagggg ttccgcgcac atttccccga aaagtgccac ctgacgtcga cggatcggga gatcgatctc ccgatcccct agggtcgact ctcagtacaa tctgctctga tgccgcatag ttaagccagt atctgctccc tgcttgtgtg ttggaggtcg ctgagtagtg cgcgagcaaa atttaagcta 7440 caacaaggca aggcttgacc cacaattgca tgaagaatct gcttagggtt aggcgttttg cgctgcttcg cgatgtacgg gccagatata cgcgttgaca ttgattattg actagttatt aatagtaatc aattacgggg tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg cctggctgac cgcccaacga cccccgccca ttgacgtcaa 7740. 
taatgacgta tgttcccata gtaacgccaa tagggacttt ccattgacgt caatgggtgg actatttacg gtaaactgcc cacttggcag tacatcaagt gtatc <210> 110 <211> 1461 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic Cas9 polypeptide <400> 110 Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val 1 5 10 15 Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 20 25 30 Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile 35 40 45 Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 50 55 60 Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 65 70 75 80 Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 85 90 95 Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 100 105 110 His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr 115 120 125 His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 130 135 140 Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His 145 150 155 160 Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro 165 170 175 Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 180 185 190 Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 195 200 205 Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 210 215 220 Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 225 230 235 240 Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 245 250 255 Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 260 265 270 Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp 275 280 285 Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 290 295 300 Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser 305 310 315 320 Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 325 330 335 Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 340 345 350 Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 355 360 365 Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu 
Lys Met Asp 370 375 380 Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg 385 390 395 400 Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu 405 410 415 Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe 420 425 430 Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435 440 445 Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 450 455 460 Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu 465 470 475 480 Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr 485 490 495 Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 500 505 510 Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 515 520 525 Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln 530 535 540 Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr 545 550 555 560 Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565 570 575 Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 580 585 590 Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 595 600 605 Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 610 615 620 Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala 625 630 635 640 His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr 645 650 655 Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp 660 665 670 Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 675 680 685 Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 690 695 700 Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu 705 710 715 720 His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 725 730 735 Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 740 745 750 Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 755 760 765 Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 770 775 780 Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu 
His Pro 785 790 795 800 Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 805 810 815 Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 820 825 830 Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys 835 840 845 Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850 855 860 Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys 865 870 875 880 Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys 885 890 895 Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 900 905 910 Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 915 920 925 Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930 935 940 Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser 945 950 955 960 Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 965 970 975 Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 980 985 990 Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe 995 1000 1005 Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala 1010 1015 1020 Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe 1025 1030 1035 Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala 1040 1045 1050 Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu 1055 1060 1065 Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val 1070 1075 1080 Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr 1085 1090 1095 Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys 1100 1105 1110 Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro 1115 1120 1125 Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val 1130 1135 1140 Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys 1145 1150 1155 Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser 1160 1165 1170 Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys 1175 1180 1185 Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu 1190 1195 1200 Phe Glu Leu Glu Asn Gly Arg Lys
Arg Met Leu Ala Ser Ala Gly 1205 1210 1215 Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val 1220 1225 1230 Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser 1235 1240 1245 Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys 1250 1255 1260 His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys 1265 1270 1275 Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala 1280 1285 1290 Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn 1295 1300 1305 Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala 1310 1315 1320 Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser 1325 1330 1335 Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr 1340 1345 1350 Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp 1355 1360 1365 Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Ser Asp Tyr Lys Asp 1370 1375 1380 His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp 1385 1390 1395 Asp Asp Lys Ala Ala Gly Gly Gly Gly Ser Gly Arg Ala Asp Ala 1400 1405 1410 Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp 1415 1420 1425 Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe 1430 1435 1440 Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu 1445 1450 1455 Asp Met Leu 1460 <210> 111 <211> 1527 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic dCas9-NLS-3xFLAG-HP1α polypeptide <400> 111 Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val 1 5 10 15 Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 20 25 30 Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile 35 40 45 Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 50 55 60 Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 65 70 75 80 Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 85 90 95 Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 100 105 110 His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala 
Tyr 115 120 125 His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 130 135 140 Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His 145 150 155 160 Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro 165 170 175 Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 180 185 190 Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 195 200 205 Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 210 215 220 Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 225 230 235 240 Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 245 250 255 Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 260 265 270 Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp 275 280 285 Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 290 295 300 Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser 305 310 315 320 Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 325 330 335 Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 340 345 350 Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 355 360 365 Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp 370 375 380 Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg 385 390 395 400 Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu 405 410 415 Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe 420 425 430 Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435 440 445 Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 450 455 460 Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu 465 470 475 480 Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr 485 490 495 Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 500 505 510 Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 515 520 525 Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln 
530 535 540 Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr 545 550 555 560 Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565 570 575 Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 580 585 590 Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 595 600 605 Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 610 615 620 Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala 625 630 635 640 His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr 645 650 655 Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp 660 665 670 Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 675 680 685 Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 690 695 700 Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu 705 710 715 720 His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 725 730 735 Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 740 745 750 Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 755 760 765 Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 770 775 780 Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro 785 790 795 800 Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 805 810 815 Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 820 825 830 Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys 835 840 845 Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850 855 860 Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys 865 870 875 880 Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys 885 890 895 Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 900 905 910 Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 915 920 925 Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930 935 940 Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser 945 
950 955 960 Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 965 970 975 Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 980 985 990 Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe 995 1000 1005 Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala 1010 1015 1020 Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe 1025 1030 1035 Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala 1040 1045 1050 Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu 1055 1060 1065 Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val 1070 1075 1080 Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr 1085 1090 1095 Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys 1100 1105 1110 Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro 1115 1120 1125 Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val 1130 1135 1140 Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys 1145 1150 1155 Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser 1160 1165 1170 Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys 1175 1180 1185 Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu 1190 1195 1200 Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly 1205 1210 1215 Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val 1220 1225 1230 Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser 1235 1240 1245 Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys 1250 1255 1260 His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys 1265 1270 1275 Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala 1280 1285 1290 Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn 1295 1300 1305 Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala 1310 1315 1320 Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser 1325 1330 1335 Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr 1340 1345 1350 Gly Leu Tyr Glu Thr Arg Ile
Asp Leu Ser Gln Leu Gly Gly Asp 1355 1360 1365 Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Ser Asp Tyr Lys Asp 1370 1375 1380 His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp 1385 1390 1395 Asp Asp Lys Ala Ala Gly Gly Gly Gly Ser Met Lys Glu Gly Glu 1400 1405 1410 Asn Asn Lys Pro Arg Glu Lys Ser Glu Ser Asn Lys Arg Lys Ser 1415 1420 1425 Asn Phe Ser Asn Ser Ala Asp Asp Ile Lys Ser Lys Lys Lys Arg 1430 1435 1440 Glu Gln Ser Asn Asp Ile Ala Arg Gly Phe Glu Arg Gly Leu Glu 1445 1450 1455 Pro Glu Lys Ile Ile Gly Ala Thr Asp Ser Cys Gly Asp Leu Met 1460 1465 1470 Phe Leu Met Lys Trp Lys Asp Thr Asp Glu Ala Asp Leu Val Leu 1475 1480 1485 Ala Lys Glu Ala Asn Val Lys Cys Pro Gln Ile Val Ile Ala Phe 1490 1495 1500 Tyr Glu Glu Arg Leu Thr Trp His Ala Tyr Pro Glu Asp Ala Glu 1505 1510 1515 Asn Lys Glu Lys Glu Thr Ala Lys Ser 1520 1525 <210> 112 <211> 1521 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic dCas9-NLS-3xFLAG-HP1β peptide <400> 112 Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val 1 5 10 15 Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 20 25 30 Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile 35 40 45 Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 50 55 60 Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 65 70 75 80 Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 85 90 95 Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 100 105 110 His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr 115 120 125 His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 130 135 140 Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His 145 150 155 160 Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro 165 170 175 Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 180 185 190 Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 195 200 205 Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg
Arg Leu Glu Asn 210 215 220 Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 225 230 235 240 Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 245 250 255 Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 260 265 270 Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp 275 280 285 Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 290 295 300 Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser 305 310 315 320 Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 325 330 335 Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 340 345 350 Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 355 360 365 Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp 370 375 380 Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg 385 390 395 400 Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu 405 410 415 Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe 420 425 430 Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435 440 445 Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 450 455 460 Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu 465 470 475 480 Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr 485 490 495 Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 500 505 510 Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 515 520 525 Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln 530 535 540 Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr 545 550 555 560 Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565 570 575 Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 580 585 590 Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 595 600 605 Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 610 615 620 Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640 His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr 645 650 655 Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp 660 665 670 Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 675 680 685 Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 690 695 700 Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu 705 710 715 720 His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 725 730 735 Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 740 745 750 Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 755 760 765 Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 770 775 780 Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro 785 790 795 800 Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 805 810 815 Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 820 825 830 Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys 835 840 845 Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850 855 860 Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys 865 870 875 880 Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys 885 890 895 Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 900 905 910 Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 915 920 925 Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930 935 940 Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser 945 950 955 960 Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 965 970 975 Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 980 985 990 Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe 995 1000 1005 Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala 1010 1015 1020 Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe 1025 1030 1035 Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala 1040 
1045 1050 Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu 1055 1060 1065 Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val 1070 1075 1080 Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr 1085 1090 1095 Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys 1100 1105 1110 Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro 1115 1120 1125 Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val 1130 1135 1140 Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys 1145 1150 1155 Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser 1160 1165 1170 Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys 1175 1180 1185 Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu 1190 1195 1200 Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly 1205 1210 1215 Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val 1220 1225 1230 Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser 1235 1240 1245 Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys 1250 1255 1260 His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys 1265 1270 1275 Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala 1280 1285 1290 Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn 1295 1300 1305 Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala 1310 1315 1320 Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser 1325 1330 1335 Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr 1340 1345 1350 Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp 1355 1360 1365 Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Ser Asp Tyr Lys Asp 1370 1375 1380 His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp 1385 1390 1395 Asp Asp Lys Ala Ala Gly Gly Gly Gly Ser Thr Ala His Glu Thr 1400 1405 1410 Asp Lys Ser Glu Gly Gly Lys Arg Lys Ala Asp Ser Asp Ser Glu 1415 1420 1425 Asp Lys Gly Glu Glu Ser Lys Pro Lys Lys Lys Lys Glu Glu Ser 1430 1435 1440 Glu Lys Pro Arg Gly Phe Ala Arg Gly
Leu Glu Pro Glu Arg Ile 1445 1450 1455 Ile Gly Ala Thr Asp Ser Ser Gly Glu Leu Met Phe Leu Met Lys 1460 1465 1470 Trp Lys Asn Ser Asp Glu Ala Asp Leu Val Pro Ala Lys Glu Ala 1475 1480 1485 Asn Val Lys Cys Pro Gln Val Val Ile Ser Phe Tyr Glu Glu Arg 1490 1495 1500 Leu Thr Trp His Ser Tyr Pro Ser Glu Asp Asp Asp Lys Lys Asp 1505 1510 1515 Asp Lys Asn 1520 <210> 113 <211> 2126 <212> PRT <213> Artificial sequence <220> <223> Description of artificial sequences: Synthetic dCas9-3xFLAG-TET1CD polypeptide <400> 113 Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val 1 5 10 15 Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 20 25 30 Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile 35 40 45 Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 50 55 60 Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 65 70 75 80 Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 85 90 95 Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 100 105 110 His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr 115 120 125 His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 130 135 140 Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His 145 150 155 160 Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro 165 170 175 Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 180 185 190 Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 195 200 205 Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 210 215 220 Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 225 230 235 240 Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 245 250 255 Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 260 265 270 Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp 275 280 285 Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 290 295 300 Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser 305 310
315 320 Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 325 330 335 Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 340 345 350 Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 355 360 365 Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp 370 375 380 Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg 385 390 395 400 Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu 405 410 415 Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe 420 425 430 Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435 440 445 Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 450 455 460 Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu 465 470 475 480 Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr 485 490 495 Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 500 505 510 Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 515 520 525 Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln 530 535 540 Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr 545 550 555 560 Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565 570 575 Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 580 585 590 Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 595 600 605 Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 610 615 620 Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala 625 630 635 640 His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr 645 650 655 Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp 660 665 670 Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 675 680 685 Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 690 695 700 Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu 705 710 715 720 His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 725 730 
735 Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 740 745 750 Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 755 760 765 Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 770 775 780 Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro 785 790 795 800 Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 805 810 815 Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 820 825 830 Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys 835 840 845 Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850 855 860 Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys 865 870 875 880 Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys 885 890 895 Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 900 905 910 Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 915 920 925 Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930 935 940 Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser 945 950 955 960 Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 965 970 975 Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 980 985 990 Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe 995 1000 1005 Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala 1010 1015 1020 Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe 1025 1030 1035 Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala 1040 1045 1050 Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu 1055 1060 1065 Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val 1070 1075 1080 Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr 1085 1090 1095 Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys 1100 1105 1110 Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro 1115 1120 1125 Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val 1130 1135 1140 Leu Val
Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys 1145 1150 1155 Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser 1160 1165 1170 Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys 1175 1180 1185 Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu 1190 1195 1200 Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly 1205 1210 1215 Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val 1220 1225 1230 Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser 1235 1240 1245 Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys 1250 1255 1260 His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys 1265 1270 1275 Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala 1280 1285 1290 Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn 1295 1300 1305 Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala 1310 1315 1320 Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser 1325 1330 1335 Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr 1340 1345 1350 Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp 1355 1360 1365 Gly Ser Pro Lys Lys Lys Arg Lys Val Ser Ser Asp Tyr Lys Asp 1370 1375 1380 His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp 1385 1390 1395 Asp Asp Lys Ala Ala Gly Gly Gly Gly Ser Leu Pro Thr Cys Ser 1400 1405 1410 Cys Leu Asp Arg Val Ile Gln Lys Asp Lys Gly Pro Tyr Tyr Thr 1415 1420 1425 His Leu Gly Ala Gly Pro Ser Val Ala Ala Val Arg Glu Ile Met 1430 1435 1440 Glu Asn Arg Tyr Gly Gln Lys Gly Asn Ala Ile Arg Ile Glu Ile 1445 1450 1455 Val Val Tyr Thr Gly Lys Glu Gly Lys Ser Ser His Gly Cys Pro 1460 1465 1470 Ile Ala Lys Trp Val Leu Arg Arg Ser Ser Asp Glu Glu Lys Val 1475 1480 1485 Leu Cys Leu Val Arg Gln Arg Thr Gly His His Cys Pro Thr Ala 1490 1495 1500 Val Met Val Val Leu Ile Met Val Trp Asp Gly Ile Pro Leu Pro 1505 1510 1515 Met Ala Asp Arg Leu Tyr Thr Glu Leu Thr Glu Asn Leu Lys Ser 1520 1525 1530 Tyr Asn Gly His Pro Thr Asp Arg Arg Cys Thr Leu Asn Glu 
Asn 1535 1540 1545 Arg Thr Cys Thr Cys Gln Gly Ile Asp Pro Glu Thr Cys Gly Ala 1550 1555 1560 Ser Phe Ser Phe Gly Cys Ser Trp Ser Met Tyr Phe Asn Gly Cys 1565 1570 1575 Lys Phe Gly Arg Ser Pro Ser Pro Arg Arg Phe Arg Ile Asp Pro 1580 1585 1590 Ser Ser Pro Leu His Glu Lys Asn Leu Glu Asp Asn Leu Gln Ser 1595 1600 1605 Leu Ala Thr Arg Leu Ala Pro Ile Tyr Lys Gln Tyr Ala Pro Val 1610 1615 1620 Ala Tyr Gln Asn Gln Val Glu Tyr Glu Asn Val Ala Arg Glu Cys 1625 1630 1635 Arg Leu Gly Ser Lys Glu Gly Arg Pro Phe Ser Gly Val Thr Ala 1640 1645 1650 Cys Leu Asp Phe Cys Ala His Pro His Arg Asp Ile His Asn Met 1655 1660 1665 Asn Asn Gly Ser Thr Val Val Cys Thr Leu Thr Arg Glu Asp Asn 1670 1675 1680 Arg Ser Leu Gly Val Ile Pro Gln Asp Glu Gln Leu His Val Leu 1685 1690 1695 Pro Leu Tyr Lys Leu Ser Asp Thr Asp Glu Phe Gly Ser Lys Glu 1700 1705 1710 Gly Met Glu Ala Lys Ile Lys Ser Gly Ala Ile Glu Val Leu Ala 1715 1720 1725 Pro Arg Arg Lys Lys Arg Thr Cys Phe Thr Gln Pro Val Pro Arg 1730 1735 1740 Ser Gly Lys Lys Arg Ala Ala Met Met Thr Glu Val Leu Ala His 1745 1750 1755 Lys Ile Arg Ala Val Glu Lys Lys Pro Ile Pro Arg Ile Lys Arg 1760 1765 1770 Lys Asn Asn Ser Thr Thr Thr Asn Asn Ser Lys Pro Ser Ser Leu 1775 1780 1785 Pro Thr Leu Gly Ser Asn Thr Glu Thr Val Gln Pro Glu Val Lys 1790 1795 1800 Ser Glu Thr Glu Pro His Phe Ile Leu Lys Ser Ser Asp Asn Thr 1805 1810 1815 Lys Thr Tyr Ser Leu Met Pro Ser Ala Pro His Pro Val Lys Glu 1820 1825 1830 Ala Ser Pro Gly Phe Ser Trp Ser Pro Lys Thr Ala Ser Ala Thr 1835 1840 1845 Pro Ala Pro Leu Lys Asn Asp Ala Thr Ala Ser Cys Gly Phe Ser 1850 1855 1860 Glu Arg Ser Ser Thr Pro His Cys Thr Met Pro Ser Gly Arg Leu 1865 1870 1875 Ser Gly Ala Asn Ala Ala Ala Ala Asp Gly Pro Gly Ile Ser Gln 1880 1885 1890 Leu Gly Glu Val Ala Pro Leu Pro Thr Leu Ser Ala Pro Val Met 1895 1900 1905 Glu Pro Leu Ile Asn Ser Glu Pro Ser Thr Gly Val Thr Glu Pro 1910 1915 1920 Leu Thr Pro His Gln Pro Asn His Gln Pro Ser Phe Leu Thr Ser 1925 1930 1935 Pro Gln Asp Leu Ala Ser Ser 
Pro Met Glu Glu Asp Glu Gln His 1940 1945 1950 Ser Glu Ala Asp Glu Pro Pro Ser Asp Glu Pro Leu Ser Asp Asp 1955 1960 1965 Pro Leu Ser Pro Ala Glu Glu Lys Leu Pro His Ile Asp Glu Tyr 1970 1975 1980 Trp Ser Asp Ser Glu His Ile Phe Leu Asp Ala Asn Ile Gly Gly 1985 1990 1995 Val Ala Ile Ala Pro Ala His Gly Ser Val Leu Ile Glu Cys Ala 2000 2005 2010 Arg Arg Glu Leu His Ala Thr Thr Pro Val Glu His Pro Asn Arg 2015 2020 2025 Asn His Pro Thr Arg Leu Ser Leu Val Phe Tyr Gln His Lys Asn 2030 2035 2040 Leu Asn Lys Pro Gln His Gly Phe Glu Leu Asn Lys Ile Lys Phe 2045 2050 2055 Glu Ala Lys Glu Ala Lys Asn Lys Lys Met Lys Ala Ser Glu Gln 2060 2065 2070 Lys Asp Gln Ala Ala Asn Glu Gly Pro Glu Gln Ser Ser Glu Val 2075 2080 2085 Asn Glu Leu Asn Gln Ile Pro Ser His Lys Ala Leu Thr Leu Thr 2090 2095 2100 His Asp Asn Val Val Thr Val Ser Pro Tyr Ala Leu Thr His Val 2105 2110 2115 Ala Gly Pro Tyr Asn His Trp Val 2120 2125
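For readers who wish to work with the listed sequences computationally, the following is a minimal sketch, not part of the original disclosure, of converting a three-letter residue listing with interleaved position numbers into a one-letter string. The function name `listing_to_one_letter` is hypothetical, and the sketch assumes the listing contains only the twenty standard residues and integer position markers; any other token is reported as an error rather than guessed.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def listing_to_one_letter(listing: str) -> str:
    """Convert a three-letter residue listing to a one-letter string.

    Hypothetical helper: integer tokens are treated as position markers
    and skipped; unknown tokens raise ValueError.
    """
    residues = []
    for token in listing.split():
        if token in THREE_TO_ONE:
            residues.append(THREE_TO_ONE[token])
        elif re.fullmatch(r"\d+", token):
            continue  # position marker such as "5" or "1460"
        else:
            raise ValueError(f"unrecognized token in listing: {token!r}")
    return "".join(residues)


# First row shared by SEQ ID NOs: 110 to 113 in the listing above.
first_row = ("Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly "
             "Thr Asn Ser Val 1 5 10 15")
print(listing_to_one_letter(first_row))  # MDKKYSIGLAIGTNSV
```

Rejecting unknown tokens, instead of silently dropping them, makes transcription damage in a listing visible before the sequence is used further.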
Claims
1. A method for increasing the expression of a target gene in mammalian cells, the method comprising contacting the mammalian cells in vitro or ex vivo with an expression vector comprising a promoter operatively linked to a nucleic acid encoding a fusion protein comprising a non-catalytically active *Streptococcus pyogenes* CRISPR-associated 9 (Cas9) protein linked to a heterologous functional domain, and contacting the mammalian cells with one or more expression vectors comprising a promoter operatively linked to a nucleic acid sequence encoding a guide RNA for one or more promoters targeting the target gene. The Cas9 protein, which lacks catalytic activity, contains mutations at D10 and H840, and the heterologous functional domain is a transcriptional activation domain. The mammalian cells in contact therewith express the fusion protein and the one or more guide RNAs, such that the expressed Cas9 fusion protein targets the promoter of the target gene via the expressed one or more guide RNAs, and wherein the increased expression is mediated by the transcriptional activation domain. The transcriptional activation domain includes VP64, and the catalytically inactive Cas9 protein mutation is (i) D10A or D10N, and (ii) H840A, H840N, or H840Y.
2. A method for increasing the expression of a target gene in mammalian cells in vitro or ex vivo, the method comprising expressing a fusion protein in the mammalian cells, the fusion protein comprising a catalytically inactive Streptococcus pyogenes Cas9 protein linked to a heterologous functional domain, and expressing one or more guide RNAs targeting the target gene, wherein the catalytically inactive Cas9 protein contains mutations at D10 and H840 and the heterologous functional domain is a transcriptional activation domain; wherein the expressed Cas9 fusion protein is targeted to the target gene by the one or more guide RNAs, and the increased expression is mediated by the transcriptional activation domain; and wherein the transcriptional activation domain includes VP64, and the mutations of the catalytically inactive Cas9 protein are (i) D10A or D10N and (ii) H840A, H840N, or H840Y.
3. The method of claim 1 or 2, wherein the mutations of the catalytically inactive Cas9 protein are (i) D10A and (ii) H840A.
4. The method of claim 1 or 2, wherein the heterologous functional domain is linked to the N-terminus or C-terminus of the catalytically inactive Cas9 protein, with an optional intervening linker, wherein the linker does not interfere with the activity of the fusion protein.
5. The method of claim 1 or 2, wherein the fusion protein further comprises one or both of a nuclear localization sequence and one or more epitope tags, located at the N-terminus, at the C-terminus, or between the catalytically inactive Cas9 protein and the heterologous functional domain, optionally with one or more intervening linkers.
6. The method of claim 5, wherein the one or more epitope tags are selected from the group consisting of c-myc, 6His, and FLAG.
7. The method of claim 1 or 2, wherein expressing the fusion protein in the mammalian cell comprises contacting the mammalian cell with an expression vector containing a promoter operatively linked to the nucleic acid encoding the fusion protein.
8. The method of claim 1 or 2, wherein expressing the one or more guide RNAs targeting the target gene in the mammalian cell comprises contacting the mammalian cell with one or more expression vectors, the one or more expression vectors comprising a promoter operatively linked to a nucleic acid sequence encoding the one or more guide RNAs targeting the target gene.