Methods and compositions for modulating gene expression

JP2026110713A · Pending · Publication Date: 2026-07-02 · FLAGSHIP PIONEERING INNOVATIONS V INC


Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: FLAGSHIP PIONEERING INNOVATIONS V INC
Filing Date: 2026-04-22
Publication Date: 2026-07-02


IPC: C12N15/09; C12N9/78; C12N15/11; C12N15/113; C12N15/54; C12N15/55; C12N9/16; C12N15/63; C12N1/15; C12N1/19; C12N1/21; C12N5/10; C12N15/62; C12N15/87; A61K45/00; A61K31/74; A61K31/785; A61K31/7088; A61K48/00; A61P43/00; A61K47/64; A61K47/59; A61K47/52; A61P35/00; A61P31/00; A61P37/06; C12N9/10; C12N15/10



Abstract

[Problem] To provide methods and compositions for modulating gene expression. [Solution] This disclosure provides compositions for modulating gene expression and methods for modulating transcription. In particular, this disclosure provides a variety of agents, compositions, and methods for modulating gene expression and for delivery to cells (e.g., mammalian cells such as somatic cells; e.g., delivery across the cell membrane), as well as related treatment methods. This disclosure also provides site-specific agents that act to disrupt and/or modify anchor sequence-mediated connections by genetic and/or epigenetic methods.


Description

[Technical Field]

[0001] Cross-reference to related applications: This application claims priority to, and the benefit of, U.S. Provisional Application Nos. 62/384,603 (filed September 7, 2016), 62/416,501 (filed November 2, 2016), 62/439,327 (filed December 27, 2016), and 62/542,703 (filed August 8, 2017). The contents of each of these applications are incorporated herein by reference.

[Background Art]

[0002] Background: Many diseases are caused by defects in the regulation of the expression of certain genes.

[Summary of Invention]

[Means for Solving the Problem]

[0003] Overview: Among other things, this disclosure provides a variety of agents, compositions, and methods for modulating gene expression and for delivery to cells (e.g., mammalian cells such as somatic cells; e.g., delivery across the cell membrane), as well as related treatment methods. To the knowledge of the inventors, this disclosure provides the first disclosure of site-specific agents that physically disrupt and/or modify anchor sequence-mediated connections. This disclosure also provides site-specific agents that act to disrupt and/or modify anchor sequence-mediated connections, in particular by genetic and/or epigenetic means.

[0004] In some embodiments, the present disclosure provides a site-directed disruption agent comprising a DNA-binding moiety that specifically binds to one or more target anchor sequences within a cell with sufficient affinity to compete with the binding of endogenous nucleating polypeptides within the cell, and does not bind to non-target anchor sequences within the cell.

[0005] In some embodiments, the present disclosure provides a method of modulating the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, the method comprising contacting the first and/or second anchor sequence with a site-specific disruptor disclosed herein.

[0006] In some embodiments, the present disclosure provides a method of modulating the expression of a gene within 10 kb of a first anchor sequence within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, the method comprising contacting the first and/or second anchor sequence with a site-specific disruptor disclosed herein.

[0007] In some embodiments, the present disclosure provides a method of increasing the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, wherein the first and/or second anchor sequence is located within 10 kb of an external enhancing sequence, the method comprising contacting the first and/or second anchor sequence with a site-specific disruptor disclosed herein.

[0008] In some embodiments, the present disclosure provides a method comprising delivering a site-specific disruptor disclosed herein to a mammalian cell.

[0009] In some embodiments, the present disclosure provides a fusion molecule comprising (i) a site-specific targeting moiety and (ii) a deaminating agent, wherein the site-specific targeting moiety targets the fusion molecule to a target anchor sequence but not to at least one non-target anchor sequence.

[0010] In some embodiments, the present disclosure provides a composition comprising (i) a fusion polypeptide comprising an enzymatically inactive Cas polypeptide and a deaminating agent, or a nucleic acid encoding the fusion polypeptide; and (ii) a guide RNA that targets the fusion polypeptide to a target anchor sequence but not to at least one non-target anchor sequence.

[0011] In some embodiments, the present disclosure provides a method of modulating the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, the method comprising contacting the first and/or second anchor sequence with a site-specific disruptor disclosed herein.

[0012] In some embodiments, the present disclosure provides a method of modulating the expression of a gene within 10 kb of a first anchor sequence within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, the method comprising contacting the first and/or second anchor sequence with a site-specific disruptor disclosed herein.

[0013] In some embodiments, the present disclosure provides a method of decreasing the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence, a second anchor sequence, and an internal enhancing sequence, the method comprising contacting the first and/or second anchor sequence with a site-specific disruptor disclosed herein.

[0014] In some embodiments, the present disclosure provides a method comprising (a) delivering a fusion molecule or composition disclosed herein to a mammalian cell.

[0015] In some embodiments, the present disclosure provides a method comprising (a) substituting, adding, or deleting one or more nucleotides of an anchor sequence in a mammalian somatic cell.

[0016] In some embodiments, the present disclosure provides a method comprising delivering a mammalian somatic cell to a subject having a disease or condition, wherein one or more nucleotides of an anchor sequence in the mammalian somatic cell are substituted, added, or deleted.

[0017] In some embodiments, the present disclosure provides a method comprising the step of (a) administering mammalian somatic cells to a subject, wherein the mammalian somatic cells were obtained from the subject and a fusion molecule or composition disclosed herein was delivered to the mammalian somatic cells ex vivo.

[0018] In some embodiments, the disclosure provides a fusion molecule comprising (i) a site-specific targeting moiety and (ii) an epigenetic modifier, wherein the site-specific targeting moiety targets the fusion molecule to a target anchor sequence but not to at least one non-target anchor sequence.

[0019] In some embodiments, the present disclosure provides a site-specific guide RNA comprising a targeting domain that is complementary to a target nucleic acid comprising an anchor sequence.

[0020] In some embodiments, the disclosure provides a composition comprising (i) a fusion polypeptide comprising an enzymatically inactive Cas polypeptide and an epigenetic modifier, or a nucleic acid encoding the fusion polypeptide; and (ii) a guide RNA that targets the fusion polypeptide to a target anchor sequence but does not target at least one non-target anchor sequence.

[0021] In some embodiments, the disclosure provides a method for modulating the expression of a gene within an anchor sequence-mediated linkage comprising a first anchor sequence and a second anchor sequence, the method comprising the step of contacting the first and/or second anchor sequence with a fusion molecule or composition disclosed herein.

[0022] In some embodiments, the disclosure provides a method for modulating the expression of a gene within 10 kb of a first anchor sequence in an anchor sequence-mediated linkage comprising a first anchor sequence and a second anchor sequence, the method comprising the step of contacting the first and/or second anchor sequence with a fusion molecule or composition disclosed herein.

[0023] In some embodiments, the disclosure provides a method for reducing the expression of a gene within an anchor sequence-mediated linkage comprising a first anchor sequence, a second anchor sequence, and an internal enhancing sequence, the method comprising the step of contacting the first and/or second anchor sequence with a fusion molecule or composition disclosed herein.

[0024] In some embodiments, the disclosure provides a method for increasing the expression of a gene within an anchor sequence-mediated linkage comprising a first anchor sequence and a second anchor sequence, wherein the first and/or second anchor sequence is located within 10 kb of an external enhancing sequence, the method comprising the step of contacting the first and/or second anchor sequence with a fusion molecule or composition disclosed herein.

[0025] In some embodiments, the present disclosure provides a method comprising the step of (a) delivering a fusion molecule or composition disclosed herein to mammalian cells.

[0026] In some embodiments, the Disclosure provides an engineered site-specific nucleating agent comprising: an engineered DNA-binding moiety which binds specifically to one or more intracellular target sequences with sufficient affinity to compete with the binding of endogenous intracellular nucleating polypeptides, but does not bind to non-target intracellular sequences; and a nucleating polypeptide dimerization domain associated with the engineered DNA-binding moiety, wherein when the engineered DNA-binding moiety binds to the at least one target sequence, the nucleating polypeptide dimerization domain is localized thereto, each of the at least one target sequence being a target anchor sequence, and when the nucleating polypeptide dimerization domain is localized to the target anchor sequence, the at least one target anchor sequence is positioned relative to an anchor sequence to which the nucleating polypeptide binds, such that the interaction between the nucleating polypeptide dimerization domain and the nucleating polypeptide generates an anchor sequence-mediated linkage.

[0027] In one embodiment, the disclosure includes a pharmaceutical preparation comprising a composition that binds to an anchor sequence of an anchor sequence-mediated linkage and alters the formation of the anchor sequence-mediated linkage, wherein the composition modulates the transcription of a target gene associated with the anchor sequence-mediated linkage in human cells.

[0028] In one embodiment, the disclosure includes a composition comprising a targeting moiety that binds to the anchor sequence of an anchor sequence-mediated linkage and alters the formation of the anchor sequence-mediated linkage (for example, by altering the affinity of the anchor sequence for the linkage nucleating molecule by at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more).

[0029] In one embodiment, the disclosure includes a pharmaceutical preparation comprising a composition that includes a targeting moiety that binds to an anchor sequence of an anchor sequence-mediated linkage and alters the formation of the anchor sequence-mediated linkage, for example, a composition that modulates the transcription of a target gene in an expression unit associated with an anchor sequence-mediated linkage in human cells.

[0030] In various embodiments of this disclosure described herein, one or more of the various embodiments described herein may be combined.

[0031] In some embodiments, the targeting moiety comprises an effector moiety that (i) is a chemical substance, e.g., a chemical substance that modulates cytosine (C) or adenine (A) (e.g., sodium bisulfite, ammonium bisulfite); (ii) has enzymatic activity (methyltransferase, demethylase, nuclease (e.g., Cas9), deaminase); or (iii) sterically impairs the formation of the anchor sequence-mediated linkage (e.g., a membrane translocating polypeptide plus a nanoparticle).

[0032] In some embodiments, the anchor sequence-mediated linkage is associated with one or more transcriptional regulatory sequences. In one embodiment, the one or more transcriptional regulatory sequences are located inside the anchor sequence-mediated linkage, for example, a type 1 anchor sequence-mediated linkage. In another embodiment, the one or more transcriptional regulatory sequences are located outside the anchor sequence-mediated linkage, for example, a type 2 anchor sequence-mediated linkage. In yet another embodiment, the one or more transcriptional regulatory sequences are located at least partially inside (for example, an enhancing sequence) and at least partially outside (for example, a silencing sequence) the anchor sequence-mediated linkage, for example, a type 3 anchor sequence-mediated linkage. In yet another embodiment, the one or more transcriptional regulatory sequences are located at least partially inside (for example, an enhancing sequence) and at least partially outside (for example, an enhancing sequence) the anchor sequence-mediated linkage, for example, a type 4 anchor sequence-mediated linkage.
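
As an illustration only, the four linkage types above can be read as a classification by where the transcriptional regulatory sequences sit relative to the linkage. The sketch below encodes one such reading in Python. The class and function names, the "enhancing"/"silencing" labels, and the tie-breaking rule (an enhancing sequence outside the linkage takes precedence over a silencing one) are assumptions made for this example and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RegulatoryElement:
    kind: str     # "enhancing" or "silencing" (hypothetical labels)
    inside: bool  # True if the element lies inside the linkage


def classify_linkage(elements: List[RegulatoryElement]) -> Optional[int]:
    """Return 1-4 for the linkage types as read here, or None if no type fits."""
    if not elements:
        return None
    inside = [e for e in elements if e.inside]
    outside = [e for e in elements if not e.inside]
    if not outside:
        return 1  # all regulatory sequences inside the linkage
    if not inside:
        return 2  # all regulatory sequences outside the linkage
    if not any(e.kind == "enhancing" for e in inside):
        return None  # mixed layout not covered by this reading
    if any(e.kind == "enhancing" for e in outside):
        return 4  # enhancing sequences both inside and outside
    if any(e.kind == "silencing" for e in outside):
        return 3  # enhancing inside, silencing outside
    return None


layout = [RegulatoryElement("enhancing", True), RegulatoryElement("silencing", False)]
print(classify_linkage(layout))
```

Layouts that this simplified reading does not cover return None rather than being forced into a type.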

[0033] In some embodiments, the composition disrupts the formation of the anchor sequence-mediated linkage (for example, by reducing the affinity of the anchor sequence for the linkage nucleating molecule by at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more). In some embodiments, the composition promotes the formation of the anchor sequence-mediated linkage (for example, by increasing the affinity of the anchor sequence for the linkage nucleating molecule by at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more). In some embodiments, the target gene is located within the anchor sequence-mediated linkage. In some embodiments, the target gene is outside the anchor sequence-mediated linkage. In some embodiments, the target gene is both inside and outside the anchor sequence-mediated linkage. In some embodiments, the composition physically disrupts the formation of the anchor sequence-mediated linkage; for example, the composition is both a targeting moiety and an effector, for example, a membrane translocating polypeptide. In some embodiments, the composition comprises an anchor sequence-binding targeting moiety (e.g., gRNA, membrane translocating polypeptide) operably linked to an effector moiety that modulates the formation of the anchor sequence-mediated linkage. In some embodiments, the effector moiety is a chemical substance, for example, a chemical substance that modulates cytosine (C) or adenine (A) (e.g., sodium bisulfite, ammonium bisulfite). In some embodiments, the effector moiety has enzymatic activity (methyltransferase, demethylase, nuclease (e.g., Cas9), deaminase). In some embodiments, the effector moiety sterically impairs the formation of the anchor sequence-mediated linkage, for example, a membrane translocating polypeptide and/or a nanoparticle.

[0034] In some embodiments, the compositions or methods described herein further comprise at least one polypeptide comprising at least one sequence of ABXnC (wherein A is selected from a hydrophobic amino acid or an amide-containing backbone having a nucleic acid side chain, e.g., aminoethylglycine; B and C may be the same or different, and are each independently selected from arginine, asparagine, glutamine, lysine, and their analogues; X is each independently a hydrophobic amino acid, or X is each independently an amide-containing backbone having a nucleic acid side chain, e.g., aminoethylglycine; and n is an integer from 1 to 4), wherein the polypeptide hybridizes to a nucleic acid sequence in an anchor sequence-mediated linkage (e.g., an anchor sequence of an anchor sequence-mediated linkage, e.g., a CTCF-binding motif, a BORIS-binding motif, a cohesin-binding motif, a USF1-binding motif, a YY1-binding motif, a TATA box, a ZNF143-binding motif, etc.).
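
For readers who want to see the ABXnC pattern concretely, the following Python sketch scans a one-letter peptide string for stretches that fit it. It is a simplification under stated assumptions: only standard amino acids are handled (the amide-containing backbone alternatives and amino acid analogues are not modeled), and the set of residues treated as hydrophobic is a choice made for this example, not a definition taken from the disclosure. The function name `find_abxnc` is hypothetical.

```python
import re

# Assumption for this sketch: residues treated as "hydrophobic" (one-letter codes).
HYDROPHOBIC = "AVLIMFWY"
# B and C positions: arginine (R), asparagine (N), glutamine (Q), lysine (K).
B_OR_C = "RNQK"

# A, then B, then one to four X residues, then C.
ABXNC = re.compile(
    rf"[{HYDROPHOBIC}][{B_OR_C}][{HYDROPHOBIC}]{{1,4}}[{B_OR_C}]"
)


def find_abxnc(peptide: str):
    """Return (start_index, matched_sequence) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in ABXNC.finditer(peptide.upper())]


print(find_abxnc("GGLRVVKGG"))
```

Because `finditer` reports non-overlapping matches only, a peptide containing overlapping candidate motifs would need a lookahead-based pattern instead.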

[0035] The compositions and methods described in the various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0036] In one embodiment, the disclosure includes a method for modulating the expression of a target gene in an anchor sequence-mediated linkage, the method comprising targeting a sequence that is outside of, or not part of, the target gene or its associated transcriptional regulatory sequences that affect the transcription of the gene, for example, targeting an anchor sequence, thereby modulating the expression of the gene.

[0037] In one embodiment, the disclosure includes a method for modulating the transcription of a target gene, which involves targeting an anchor sequence to alter the formation of an anchor sequence-mediated linkage, or targeting a sequence that is discontinuous with the target gene or its associated transcriptional regulatory sequences that affect the transcription of the target gene.

[0038] In some embodiments, the method includes an anchor sequence-mediated linkage containing one or more associated genes, and one or more transcriptional regulatory sequences within the anchor sequence-mediated linkage. In some embodiments, the anchor sequence-mediated linkage includes one or more associated genes and one or more transcriptional regulatory sequences located outside the anchor sequence-mediated linkage. In some embodiments, the anchor sequence-mediated linkage includes one or more associated genes and one or more transcriptional regulatory sequences located at least partially inside and outside the anchor sequence-mediated linkage. For example, one or more repressive signals may be located outside the anchor sequence-mediated linkage, and one or more enhancing sequences and target genes may be located inside the anchor sequence-mediated linkage. In another example, one or more enhancing sequences may be located both inside and outside the anchor sequence-mediated linkage.

[0039] In some embodiments, the target gene is discontinuous with one or more anchor sequences. In some embodiments where the gene is discontinuous with the anchor sequence, the gene may be separated from the anchor sequence by approximately 100 bp to 500 Mb, 500 bp to 200 Mb, 1 kb to 100 Mb, 25 kb to 50 Mb, 50 kb to 1 Mb, 100 kb to 750 kb, 150 kb to 500 kb, or 175 kb to 500 kb. In some embodiments, the gene is separated from the anchor sequence by approximately 100 bp, 300 bp, 500 bp, 600 bp, 700 bp, 800 bp, 900 bp, 1 kb, 5 kb, 10 kb, 15 kb, 20 kb, 25 kb, 30 kb, 35 kb, 40 kb, 45 kb, 50 kb, 55 kb, 60 kb, 65 kb, 70 kb, 75 kb, 80 kb, 85 kb, 90 kb, 95 kb, 100 kb, 125 kb, 150 kb, 175 kb, 200 kb, 225 kb, 250 kb, 275 kb, 300 kb, 350 kb, 400 kb, 500 kb, 600 kb, 700 kb, 800 kb, 900 kb, 1 Mb, 2 Mb, 3 Mb, 4 Mb, 5 Mb, 6 Mb, 7 Mb, 8 Mb, 9 Mb, 10 Mb, 15 Mb, 20 Mb, 25 Mb, 50 Mb, 75 Mb, 100 Mb, 200 Mb, 300 Mb, 400 Mb, 500 Mb, or any size in between.
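
The separation ranges recited above mix bp, kb, and Mb units. As a small illustrative aid, the Python sketch below converts such strings to base pairs and reports which of the recited ranges contain a given gene-to-anchor separation. The helper names `to_bp` and `matching_ranges` are hypothetical, and the sketch assumes 1 kb = 1,000 bp and 1 Mb = 1,000,000 bp.

```python
import re

UNIT_BP = {"bp": 1, "kb": 1_000, "Mb": 1_000_000}

# The ranges recited in the paragraph above, as (lower, upper) strings.
RANGES = [
    ("100 bp", "500 Mb"), ("500 bp", "200 Mb"), ("1 kb", "100 Mb"),
    ("25 kb", "50 Mb"), ("50 kb", "1 Mb"), ("100 kb", "750 kb"),
    ("150 kb", "500 kb"), ("175 kb", "500 kb"),
]


def to_bp(text: str) -> int:
    """Convert a string such as '175 kb' or '5kb' to a number of base pairs."""
    m = re.fullmatch(r"\s*([\d,.]+)\s*(bp|kb|Mb)\s*", text)
    if m is None:
        raise ValueError(f"unrecognized distance: {text!r}")
    return int(float(m.group(1).replace(",", "")) * UNIT_BP[m.group(2)])


def matching_ranges(separation_bp: int):
    """Return the recited ranges (inclusive bounds) that contain the separation."""
    return [(lo, hi) for lo, hi in RANGES if to_bp(lo) <= separation_bp <= to_bp(hi)]


print(matching_ranges(to_bp("60 kb")))
```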

[0040] In some embodiments, the anchor-mediated linkage includes a target gene and is associated with one or more transcriptional regulatory sequences, e.g., silencing / repression sequences and enhancing sequences. In some embodiments, the anchor-mediated linkage includes one or more genes, e.g., 2, 3, 4, 5, or more. In some embodiments, the anchor-mediated linkage is associated with one or more transcriptional regulatory sequences, e.g., 2, 3, 4, 5, or more.

[0041] In some embodiments, the target gene is discontinuous with one or more transcriptional regulatory sequences. In some embodiments where the gene is discontinuous with the transcriptional regulatory sequences, the gene may be separated from the transcriptional regulatory sequences by approximately 100 bp to 500 Mb, 500 bp to 200 Mb, 1 kb to 100 Mb, 25 kb to 50 Mb, 50 kb to 1 Mb, 100 kb to 750 kb, 150 kb to 500 kb, or 175 kb to 500 kb. In some embodiments, the gene is separated from the transcriptional regulatory sequence by approximately 100 bp, 300 bp, 500 bp, 600 bp, 700 bp, 800 bp, 900 bp, 1 kb, 5 kb, 10 kb, 15 kb, 20 kb, 25 kb, 30 kb, 35 kb, 40 kb, 45 kb, 50 kb, 55 kb, 60 kb, 65 kb, 70 kb, 75 kb, 80 kb, 85 kb, 90 kb, 95 kb, 100 kb, 125 kb, 150 kb, 175 kb, 200 kb, 225 kb, 250 kb, 275 kb, 300 kb, 350 kb, 400 kb, 500 kb, 600 kb, 700 kb, 800 kb, 900 kb, 1 Mb, 2 Mb, 3 Mb, 4 Mb, 5 Mb, 6 Mb, 7 Mb, 8 Mb, 9 Mb, 10 Mb, 15 Mb, 20 Mb, 25 Mb, 50 Mb, 75 Mb, 100 Mb, 200 Mb, 300 Mb, 400 Mb, 500 Mb, or any size in between.

[0042] In one embodiment, the present disclosure includes a pharmaceutical composition comprising (a) a targeting moiety and (b) a DNA sequence, for example, a DNA sequence including an anchor sequence.

[0043] In one embodiment, the disclosure includes a composition comprising a targeting moiety that binds to the anchor sequence of an anchor sequence-mediated linkage and alters the formation of the anchor sequence-mediated linkage (for example, by altering the affinity of the anchor sequence for the linkage nucleating molecule by at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more).

[0044] In one embodiment, the disclosure includes a composition comprising a protein containing a domain that acts on DNA, for example, an enzyme domain (e.g., a nuclease domain, for example, a Cas9 domain, for example, a dCas9 domain; a DNA methyltransferase, a demethylase, a deaminase), in combination with at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets the protein to an anchor sequence of a target anchor sequence-mediated linkage, wherein the composition is effective in altering the target anchor sequence-mediated linkage in human cells.

[0045] In some embodiments, the enzyme domain is Cas9 or dCas9. In some embodiments, the protein comprises two enzyme domains, for example, dCas9 and a methylase or demethylase domain.

[0046] The compositions described in the various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0047] In one embodiment, the disclosure includes a composition for modulating the transcription of a nucleic acid sequence by introducing a targeted modification into an anchor sequence-mediated linkage, the composition comprising a targeted moiety that binds to an anchor sequence.

[0048] In some embodiments, the targeting moiety includes an enzyme, such as a sequence-targeting polypeptide like Cas9. In some embodiments, the targeting moiety includes a fusion of the sequence-targeting polypeptide with a linkage nucleating molecule, such as dCas9 fused to a linkage nucleating molecule. In some further embodiments, the targeting moiety further includes a guide RNA or a nucleic acid encoding the guide RNA. In some further embodiments, the targeting moiety targets one or more nucleotides of an anchor sequence in an anchor sequence-mediated linkage for substitution, addition, or deletion via CRISPR, TALEN, dCas9, recombination, transposons, etc. In some embodiments, the targeting moiety targets one or more DNA methylation sites in an anchor sequence-mediated linkage. In some further embodiments, the targeting moiety introduces at least one of: at least one exogenous anchor sequence; a modification of at least one linkage nucleating molecule binding site, such as by altering its binding affinity for the linkage nucleating molecule; a change in the orientation of at least one common nucleotide sequence, such as a CTCF binding motif, a YY1 binding motif, a ZNF143 binding motif, or other binding motifs described herein; and at least one substitution, addition, or deletion in at least one anchor sequence, such as a CTCF binding motif, a YY1 binding motif, a ZNF143 binding motif, or other binding motifs described herein.

[0049] In certain embodiments, the composition modifies the chromatin structure.

[0050] In some embodiments, the composition includes a vector comprising a targeting portion, such as a viral vector, for example, a lentiviral vector.

[0051] In certain embodiments, the targeted modification alters at least one of: a binding site for a linkage nucleating molecule (for example, by altering the binding affinity for an anchor sequence within the anchor sequence-mediated linkage), an alternative splicing site, and a binding site for a non-coding RNA.

[0052] In some embodiments, this disclosure includes pharmaceutical compositions comprising compositions described herein.

[0053] The compositions described in the various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0054] In one embodiment, the disclosure includes a composition comprising a synthetic nucleation molecule having a selected binding affinity to an anchor sequence within a target anchor sequence-mediated connection.

[0055] In some embodiments, the binding affinity may be at least 10%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more higher or lower than the affinity of the endogenous linkage nucleating molecule associated with the target anchor sequence. In some embodiments, the synthetic linkage nucleating molecule has an amino acid sequence identity of about 30–90%, about 30–85%, about 30–80%, about 30–70%, about 50–80%, or about 50–90% with respect to the endogenous linkage nucleating molecule.

[0056] In some embodiments, the connective nucleating molecule disrupts the binding of endogenous connective nucleating molecules to their binding site, such as through competitive binding. In some further embodiments, the connective nucleating molecule is manipulated to bind to a target sequence.

[0057] In some embodiments, the composition further comprises a polymer carrier or a targeted moiety, such as a liposome, peptide, aptamer, or a combination thereof.

[0058] In certain embodiments, the disclosure includes a method for preparing a conjugate nucleating molecule having a selected binding affinity.

[0059] The compositions described in the various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0060] In one embodiment, the disclosure includes a composition comprising a targeting moiety that binds to a specific anchor sequence-mediated linkage and alters the topology of the anchor sequence-mediated linkage.

[0061] In some embodiments, the targeting moiety is a nucleic acid sequence, a protein, a protein fusion, or a membrane translocating polypeptide. In some embodiments, the nucleic acid sequence is selected from the group consisting of a gRNA and a sequence that is complementary to an anchor sequence, or that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary thereto. In some embodiments, the nucleic acid sequence includes a sequence that is complementary to a binding motif or consensus sequence for a linkage nucleating molecule, or that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary thereto. In some embodiments, the protein is a linkage nucleating molecule (e.g., CTCF, cohesin, USF1, YY1, TAF3, ZNF143, or another polypeptide), a dominant-negative linkage nucleating molecule, a protein containing a DNA-binding sequence (e.g., a transcription factor), or a fusion of a sequence-targeting polypeptide and a linkage nucleating molecule. In some embodiments, the membrane translocating polypeptide comprises at least one sequence of the formula ABXnC (wherein A is selected from a hydrophobic amino acid or an amide-containing backbone having a nucleic acid side chain, e.g., aminoethylglycine; B and C may be the same or different and are each independently selected from arginine, asparagine, glutamine, lysine, and their analogues; X is each independently a hydrophobic amino acid, or X is each independently an amide-containing backbone having a nucleic acid side chain, e.g., aminoethylglycine; and n is an integer from 1 to 4). In some embodiments, the protein is selected from the group consisting of epigenetic enzymes (DNA methylases (e.g., DNMT3a, DNMT3b, DNMTL), DNA demethylases (e.g., the TET family), histone methyltransferases, histone deacetylases (e.g., HDAC1, HDAC2, HDAC3), sirtuin 1, 2, 3, 4, 5, 6, or 7, lysine-specific histone demethylase 1 (LSD1), histone-lysine N-methyltransferase (Setdb1), euchromatic histone-lysine N-methyltransferase 2 (G9a), histone-lysine N-methyltransferase (SUV39H1), enhancer of zeste homolog 2 (EZH2), viral lysine methyltransferase (vSET), histone methyltransferase (SET2), and protein-lysine N-methyltransferase (SMYD2)) and fusions of a sequence-targeting polypeptide and a linkage nucleating molecule.
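
The percent-complementarity thresholds in this paragraph (at least 80% through at least 99%) can be made concrete with a short calculation. The Python sketch below is a simplified illustration, assuming equal-length, ungapped sequences annealed in antiparallel orientation and counting only Watson-Crick pairs. The function names and the example sequence are placeholders invented for this sketch, not sequences from the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA or RNA string (U is treated as T)."""
    return seq.upper().replace("U", "T").translate(COMPLEMENT)[::-1]


def percent_complementarity(probe: str, target: str) -> float:
    """Percent of probe positions that pair with the target (antiparallel)."""
    probe = probe.upper().replace("U", "T")
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    expected = reverse_complement(target)
    matches = sum(p == e for p, e in zip(probe, expected))
    return 100.0 * matches / len(probe)


target = "ACGTTGCAAC"   # arbitrary placeholder, not a real anchor sequence
probe = "GTTGCAACGA"    # perfect reverse complement except the last base
print(percent_complementarity(probe, target) >= 80.0)
```

A real comparison would normally use an alignment tool that handles gaps and length differences; this sketch only shows how the percentage is counted.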

[0062] In some embodiments, the targeting moiety includes a sequence-targeting polypeptide, e.g., Cas9; a fusion of a sequence-targeting polypeptide, e.g., a fusion of dCas9 and a linkage nucleating molecule; or a linkage nucleating molecule. In some embodiments, the targeting moiety includes a guide RNA or a nucleic acid encoding a guide RNA. In some embodiments, the targeting moiety modulates the transcription of a gene in an anchor sequence-mediated linkage in human cells by introducing a targeted modification into the anchor sequence-mediated linkage.

[0063] In some embodiments, the targeting moiety binds to the anchor sequence of the anchor sequence-mediated linkage and introduces a targeted modification into the anchor sequence to modulate the transcription of a gene in the anchor sequence-mediated linkage in human cells. In some embodiments, the targeted modification includes at least one substitution, addition, or deletion of one or more nucleotides in the anchor sequence. In some embodiments, the targeted modification includes at least one substitution, addition, or deletion of one or more nucleotides in a binding motif for a linkage nucleating molecule within the anchor sequence, as described herein. In some embodiments, the targeted modification includes a change in the orientation of at least one common nucleotide sequence, such as a binding motif for a linkage nucleating molecule. In some embodiments, the targeted modification includes an anchor sequence that does not exist in nature and that forms or disrupts the anchor sequence-mediated linkage.

[0064] The compositions described in the various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0065] In one embodiment, the present disclosure includes a composition comprising a protein comprising a first polypeptide containing a Cas or modified Cas protein domain and a second polypeptide containing a polypeptide having DNA methyltransferase activity (or demethylase or deaminase activity), in combination with at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets an anchor sequence of a target anchor sequence-mediated linkage, wherein the composition is effective in altering the target anchor sequence-mediated linkage in a human cell.

[0066] In some embodiments, the composition is effective in altering a target anchor sequence-mediated linkage in human cells.

[0067] The compositions described in the various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0068] In one embodiment, the disclosure includes a pharmaceutical composition comprising a Cas protein and at least one guide RNA (gRNA) that targets the Cas protein to an anchor sequence of a target anchor sequence-mediated linkage, wherein the Cas protein is effective in inducing mutations in the target anchor sequence that reduce the formation of an anchor sequence-mediated linkage associated with the target anchor sequence.

[0069] In one embodiment, the disclosure includes a synthetic nucleic acid comprising a plurality of anchor sequences, gene sequences, and transcriptional regulatory sequences.

[0070] In some embodiments, the gene sequence and the transcriptional regulatory sequence are located between multiple anchor sequences. In some embodiments, the nucleic acid comprises, in order, (a) an anchor sequence, a gene sequence, a transcriptional regulatory sequence, and an anchor sequence, or (b) an anchor sequence, a transcriptional regulatory sequence, a gene sequence, and an anchor sequence.

[0071] In some embodiments, the sequences are separated by linker sequences. In some embodiments, the anchor sequences are 7-100nt, 10-100nt, 10-80nt, 10-70nt, 10-60nt, 10-50nt, or 20-80nt. In some embodiments, the nucleic acid is in the range of 3,000-50,000 bp, 3,000-40,000 bp, 3,000-30,000 bp, 3,000-20,000 bp, 3,000-15,000 bp, 3,000-12,000 bp, 3,000-10,000 bp, 3,000-8,000 bp, 5,000-30,000 bp, 5,000-20,000 bp, 5,000-15,000 bp, 5,000-12,000 bp, 5,000-10,000 bp, or any range in between.
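The element orders and size ranges above can be read as layout constraints on a synthetic construct. The following is a minimal sketch of such a check, not a method taken from the disclosure. It assumes the broadest quoted ranges (anchors of 7-100 nt, total length of 3,000-50,000 bp); the element labels, the `Element` class, and the `check_construct` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

# Broadest ranges quoted in the text (assumption: the narrower ranges are subsets).
ANCHOR_NT_RANGE = (7, 100)
CONSTRUCT_BP_RANGE = (3_000, 50_000)

# The two element orders, (a) and (b), described in the text.
ALLOWED_ORDERS = {
    ("anchor", "gene", "regulatory", "anchor"),
    ("anchor", "regulatory", "gene", "anchor"),
}


@dataclass(frozen=True)
class Element:
    kind: str    # hypothetical labels: "anchor", "gene", "regulatory", "linker"
    length: int  # length in nucleotides / base pairs


def check_construct(elements):
    """Return a list of problems; an empty list means the layout fits the ranges."""
    problems = []
    # Linkers may separate the sequences, so they are ignored when checking order.
    order = tuple(e.kind for e in elements if e.kind != "linker")
    if order not in ALLOWED_ORDERS:
        problems.append(f"unexpected element order: {order}")
    for e in elements:
        if e.kind == "anchor" and not ANCHOR_NT_RANGE[0] <= e.length <= ANCHOR_NT_RANGE[1]:
            problems.append(f"anchor length {e.length} nt outside {ANCHOR_NT_RANGE}")
    total = sum(e.length for e in elements)
    if not CONSTRUCT_BP_RANGE[0] <= total <= CONSTRUCT_BP_RANGE[1]:
        problems.append(f"total length {total} bp outside {CONSTRUCT_BP_RANGE}")
    return problems


# Illustrative lengths only; none of these values come from the disclosure.
example = [
    Element("anchor", 20), Element("linker", 50), Element("gene", 4_000),
    Element("linker", 50), Element("regulatory", 500), Element("anchor", 20),
]
print(check_construct(example))  # prints []
```

Returning a list of problems rather than a single boolean makes it easier to see which of the stated constraints a candidate layout violates.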

[0072] In some embodiments, the vector includes nucleic acids described herein.

[0073] In some embodiments, the cells contain nucleic acids as described herein.

[0074] In some embodiments, the pharmaceutical composition includes nucleic acids described herein.

[0075] In some embodiments, a method for modulating gene expression includes administering a composition that includes nucleic acids described herein.

[0076] The nucleic acids described in the various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0077] In one embodiment, the disclosure includes a kit comprising (a) a nucleic acid encoding a protein comprising a first polypeptide domain comprising Cas or a modified Cas protein and a second polypeptide domain comprising a polypeptide having DNA methyltransferase activity [or demethylating or deaminase activity]; and (b) at least one guide RNA (gRNA) for targeting the protein to an anchor sequence of a target anchor sequence-mediated linkage in a target cell.

[0078] In some embodiments, (a) and (b) are provided in the same vector, e.g., plasmid, AAV vector, AAV9 vector. In some embodiments, (a) and (b) are provided in separate vectors.

[0079] The kits described in the various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0080] In one embodiment, the disclosure includes a method for preparing a conjugate nucleating molecule having a selected binding affinity.

[0081] In one embodiment, the disclosure includes a method for altering gene expression / altering anchor sequence-mediated linkage in a mammalian subject, comprising administering (separately or in the same pharmaceutical composition) a protein comprising (i) a first polypeptide domain comprising Cas or a modified Cas protein and a second polypeptide domain comprising a polypeptide having DNA methyltransferase activity [or demethylating or deaminase activity], or (ii) a protein comprising a first polypeptide domain comprising Cas or a modified Cas protein and a second polypeptide domain comprising a polypeptide having a role in DNA methyltransferase activity [or demethylating or deaminase activity], and a nucleic acid encoding at least one guide RNA (gRNA) that targets an anchor sequence of an anchor sequence-mediated linkage.

[0082] In some embodiments, the anchor sequence is or includes a CTCF-binding motif, such as SEQ ID NO: 1 or SEQ ID NO: 2. In some embodiments, the anchor sequence is or includes a CTCF-binding motif associated with a target disease gene.

[0083] In some embodiments, the Cas protein is dCas9; dCas9 is human codon optimized. In some embodiments, the methyltransferase is a methyltransferase from the DNMT family. In some embodiments, the polypeptide is an enzyme from the TET family. In some embodiments, the protein has a linker between the first and second polypeptides.

[0084] In some embodiments, the gRNA is selected from gRNAs for different diseases.

[0085] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0086] In one embodiment, the disclosure includes a method for modifying a chromatin structure, such as a two-dimensional structure, which involves modulating the transcription of a nucleic acid sequence by altering the topology of anchor sequence-mediated connections. The alteration of the topology of anchor sequence-mediated connections, such as loops, modulates the transcription of the nucleic acid sequence.

[0087] In another embodiment, the disclosure includes a method for modifying a chromatin structure, such as a two-dimensional structure, which involves modulating the transcription of a nucleic acid sequence by altering the topology of multiple anchor sequence-mediated connections. The alteration of the topology of multiple anchor sequence-mediated connections, such as multiple loops, modulates the transcription of the nucleic acid sequence.

[0088] In another embodiment, the disclosure includes a method for modulating the transcription of a nucleic acid sequence, which involves altering anchor sequence-mediated connections, such as loops, that affect the transcription of the nucleic acid sequence. The anchor sequence-mediated connections modulate the transcription of the nucleic acid sequence.

[0089] In certain embodiments, altering anchor sequence-mediated connections modifies the chromatin structure. For example, altering the chromatin structure by substituting, adding, or deleting one or more nucleotides within the anchor sequence of an anchor sequence-mediated connection modifies the chromatin structure.

[0090] In various embodiments of the above-described aspects or any other aspects of the present disclosure described herein, the topology is altered by substituting, adding, or deleting one or more nucleotides of an anchor sequence within an anchor sequence-mediated linkage. For example, the one or more nucleotides that are substituted, added, or deleted may be within at least one anchor sequence, such as a binding motif for a linkage nucleation molecule.

[0091] In some embodiments, the topology is altered by at least one of the following: modulating DNA methylation at one or more sites within the anchor sequence-mediated linkage; altering the orientation of at least one common nucleotide sequence, such as a binding motif for a linkage nucleation molecule; altering spatial isolation within the anchor sequence-mediated linkage; altering the rotational free energy within the anchor sequence-mediated linkage; and altering the positional degrees of freedom within the anchor sequence-mediated linkage.
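One of the listed alterations, changing the orientation of a common nucleotide sequence such as a binding motif, amounts at the sequence level to replacing the motif with its reverse complement on the same strand. The sketch below illustrates only that string operation. The function names are hypothetical, and the sequence is a placeholder rather than a real CTCF-binding motif or any SEQ ID NO from the disclosure.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def invert_motif(region: str, start: int, end: int) -> str:
    """Return `region` with region[start:end] replaced by its reverse complement.

    Coordinates are 0-based and half-open, as in Python slicing.
    """
    return region[:start] + reverse_complement(region[start:end]) + region[end:]


# Placeholder sequence: "AACCGG" stands in for a motif, flanked by T runs.
region = "TTTT" + "AACCGG" + "TTTT"
print(invert_motif(region, 4, 10))  # prints TTTTCCGGTTTTTT
```

Applying `invert_motif` twice to the same coordinates restores the original sequence, which is a quick sanity check that only orientation, and not content, was changed.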

[0092] In some further embodiments, the topology is altered by one or more of the following: disrupting anchor sequence-mediated connections, forming anchor sequence-mediated connections that do not exist naturally, forming multiple anchor sequence-mediated connections that do not exist naturally, and introducing exogenous anchor sequences.

[0093] In certain embodiments, the topology is altered to produce stable modulation of transcription, such as modulation that lasts for at least about 1 hour to about 30 days, or at least about 2 hours, 6 hours, 12 hours, 18 hours, 24 hours, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 11 days, 12 days, 13 days, 14 days, 15 days, 16 days, 17 days, 18 days, 19 days, 20 days, 21 days, 22 days, 23 days, 24 days, 25 days, 26 days, 27 days, 28 days, 29 days, 30 days, or longer, or any time in between.

[0094] In certain embodiments, the topology is changed to produce, for example, transient transcription modulation, such as modulation that lasts for about 30 minutes to about 7 days or less, or about 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, 6 hours, 7 hours, 8 hours, 9 hours, 10 hours, 11 hours, 12 hours, 13 hours, 14 hours, 15 hours, 16 hours, 17 hours, 18 hours, 19 hours, 20 hours, 21 hours, 22 hours, 24 hours, 36 hours, 48 hours, 60 hours, 72 hours, 4 days, 5 days, 6 days, 7 days or less, or any time in between.

[0095] In some embodiments, the method further includes modulating a connection nucleation molecule that interacts with the anchor sequence-mediated connection, for example, by modulating its binding affinity to an anchor sequence within the anchor sequence-mediated connection.

[0096] In certain embodiments, the anchor sequence-mediated connection includes at least a first anchor sequence and a second anchor sequence. In one embodiment, the anchor sequence-mediated connection is mediated by a first connective nucleating molecule bound to the first anchor sequence, a second connective nucleating molecule bound to the second anchor sequence, and association between the first and second connective nucleating molecules. In another embodiment, the first or second connective nucleating molecule has a binding affinity to the anchor sequence that is higher or lower than a reference value, such as the binding affinity to the anchor sequence in the absence of modification.

[0097] In some embodiments, the second anchor sequence is discontinuous with respect to the first anchor sequence. In one embodiment, anchor sequence-mediated connectivity is mediated by a first connective nucleating molecule bound to the first anchor sequence, a second connective nucleating molecule bound to a discontinuous second anchor sequence, and association between the first and second connective nucleating molecules. In another embodiment, the first or second connective nucleating molecule has a binding affinity to an anchor sequence that is higher or lower than a reference value, such as the binding affinity to the anchor sequence in the absence of modification.

[0098] In some embodiments where the anchor sequences are discontinuous with respect to each other, the first anchor sequence is separated from the second anchor sequence by approximately 500 bp to 500 Mb, 750 bp to 200 Mb, 1 kb to 100 Mb, 25 kb to 50 Mb, 50 kb to 1 Mb, 100 kb to 750 kb, 150 kb to 500 kb, or 175 kb to 500 kb. In some embodiments, the first anchor sequence is separated from the second anchor sequence by approximately 500bp, 600bp, 700bp, 800bp, 900bp, 1kb, 5kb, 10kb, 15kb, 20kb, 25kb, 30kb, 35kb, 40kb, 45kb, 50kb, 55kb, 60kb, 65kb, 70kb, 75kb, 80kb, 85kb, 90kb, 95kb, 100kb, 125kb, 150kb, 175kb, 200kb, 225kb, 250kb, 275kb, 300kb, 350kb, 400kb, 500kb, 600kb, 700kb, 800kb, 900kb, 1Mb, 2Mb, 3Mb, 4Mb, 5Mb, 6Mb, 7Mb, 8Mb, 9Mb, 10Mb, 15Mb, 20Mb, 25Mb, 50Mb, 75Mb, 100Mb, 200Mb, 300Mb, 400Mb, 500Mb, or any size in between.
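As a rough illustration of how the separation ranges in this paragraph nest, the sketch below converts each quoted range to base pairs and reports which ranges contain a given anchor-to-anchor distance. It assumes both anchors lie on the same chromosome and are described by simple integer coordinates; the function name and the coordinates in the example are hypothetical.

```python
UNIT_BP = {"bp": 1, "kb": 1_000, "Mb": 1_000_000}

# Ranges quoted in the paragraph, as (low, low unit, high, high unit).
SEPARATION_RANGES = [
    (500, "bp", 500, "Mb"),
    (750, "bp", 200, "Mb"),
    (1, "kb", 100, "Mb"),
    (25, "kb", 50, "Mb"),
    (50, "kb", 1, "Mb"),
    (100, "kb", 750, "kb"),
    (150, "kb", 500, "kb"),
    (175, "kb", 500, "kb"),
]


def matching_ranges(first_anchor_end: int, second_anchor_start: int):
    """Return the quoted ranges (in bp) that contain the anchor separation."""
    separation = second_anchor_start - first_anchor_end
    if separation < 0:
        raise ValueError("second anchor must lie downstream of the first")
    hits = []
    for low, low_unit, high, high_unit in SEPARATION_RANGES:
        low_bp = low * UNIT_BP[low_unit]
        high_bp = high * UNIT_BP[high_unit]
        if low_bp <= separation <= high_bp:
            hits.append((low_bp, high_bp))
    return hits


# Hypothetical coordinates: anchors 200 kb apart fall inside all eight ranges.
print(len(matching_ranges(1_000_000, 1_200_000)))  # prints 8
```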

[0099] In certain embodiments, the first and second anchor sequences each include a common nucleotide sequence, such as a binding motif for a connective nucleation molecule, as described herein. In some embodiments, the first and second anchor sequences include different sequences, for example, the first anchor sequence includes a binding motif for a connective nucleation molecule, and the second anchor sequence includes a binding motif for another molecule, such as another connective nucleation molecule.

[0100] In some embodiments, the anchor sequence-mediated connection includes multiple anchor sequences. In one embodiment, at least one of the anchor sequences includes a CTCF binding motif.

[0101] In some embodiments, the anchor sequence-mediated connection includes a loop, such as an intrachromosomal loop. In one embodiment, the loop includes a first anchor sequence, a nucleic acid sequence, a transcriptional regulatory sequence such as an enhancing or silencing sequence, and a second anchor sequence. In another embodiment, the loop includes, in order, a first anchor sequence, a transcriptional regulatory sequence, and a second anchor sequence; or a first anchor sequence, a nucleic acid sequence, and a second anchor sequence. In yet another embodiment, either or both of the nucleic acid sequence and the transcriptional regulatory sequence are located inside or outside the loop.

[0102] In a particular embodiment, the anchor sequence-mediated connection has multiple loops. In one embodiment, the anchor sequence-mediated connection comprises multiple loops, and the anchor sequence-mediated connection includes at least one of an anchor sequence, a nucleic acid sequence, and a transcriptional regulatory sequence in one or more of the loops.

[0103] In some embodiments, the transcription of a nucleic acid sequence, such as a target nucleic acid sequence, is modulated compared to a reference value, e.g., the transcription of the target sequence in the absence of the altered anchor sequence-mediated linkage.

[0104] In some embodiments, transcription is activated by the inclusion of an activation loop. In one embodiment, the anchor sequence-mediated linkage includes a transcriptional regulatory sequence, such as an enhancing sequence, which increases the transcription of the nucleic acid sequence. In some further embodiments, transcription is activated by the exclusion of an inhibitory loop. In one embodiment, the anchor sequence-mediated linkage excludes a transcriptional regulatory sequence, such as a silencing sequence, which decreases the transcription of the nucleic acid sequence.

[0105] In some embodiments, transcription is suppressed by the inclusion of a repression loop. In one embodiment, the anchor sequence-mediated linkage includes a transcriptional regulatory sequence, such as a silencing sequence, which reduces the transcription of the nucleic acid sequence. In some further embodiments, transcription is suppressed by the exclusion of an activation loop. In one embodiment, the anchor sequence-mediated linkage excludes a transcriptional regulatory sequence, such as an enhancing sequence, which increases the transcription of the nucleic acid sequence.

[0106] In certain embodiments, anchor sequence-mediated linkage is altered in vivo in a subject, such as a human subject. In some embodiments, the methods described herein further include administering a targeted moiety to a subject selected from at least one of an exogenous connective nucleation molecule, a nucleic acid encoding a connective nucleation molecule, and a fusion of a sequence-targeting polypeptide and a connective nucleation molecule. In one embodiment, the connective nucleation molecule disrupts the binding of the endogenous connective nucleation molecule to its binding site, such as through competitive binding. In another embodiment, the targeted moiety includes an enzyme, such as a sequence-targeting polypeptide such as Cas9. In yet another embodiment, the targeted moiety further includes a connective nucleation molecule. In yet another embodiment, the targeted moiety further includes a guide RNA or a nucleic acid encoding a guide RNA.

[0107] In some embodiments, administration involves administering a vector, such as a lentiviral vector, which contains a targeted moiety, for example, a nucleic acid encoding a conjugate nucleation molecule. In some further embodiments, administration involves administering a polymer carrier, for example, a formulation formulated in liposomes.

[0108] In one embodiment, the disclosure includes engineered cells that include a targeted modification in an anchor sequence-mediated linkage.

[0109] In another embodiment, the disclosure includes an engineered nucleic acid sequence that includes an anchor sequence-mediated linkage having a targeted modification.

[0110] In various embodiments of the above-described aspects or any other aspects of the present disclosure described herein, the targeted modification includes one or more of the following: substitution, addition or deletion of one or more nucleotides in an anchor sequence within an anchor sequence-mediated linkage; substitution, addition or deletion of one or more nucleotides in at least one anchor sequence, e.g., a CTCF-binding motif; modification of one or more DNA methylation sites within an anchor sequence-mediated linkage; and at least one exogenous anchor sequence.

[0111] In some embodiments, the targeted modification alters at least one connective nucleation molecule binding site, such as by changing its binding affinity to the connective nucleation molecule. In some further embodiments, the targeted modification alters the orientation of at least one common nucleotide sequence, e.g., a CTCF binding motif; disrupts anchor sequence-mediated connections; and causes the formation of anchor sequence-mediated connections that do not exist in nature.

[0112] In certain embodiments, the anchor sequence-mediated connection includes at least a first anchor sequence and a second anchor sequence. In one embodiment, the anchor sequence-mediated connection is mediated by a first connective nucleating molecule bound to the first anchor sequence, a second connective nucleating molecule bound to the second anchor sequence, and association between the first and second connective nucleating molecules. In another embodiment, the first or second connective nucleating molecule has a binding affinity to the anchor sequence that is higher or lower than a reference value, such as the binding affinity to the anchor sequence in the absence of modification.

[0113] In some embodiments, the second anchor sequence is discontinuous with respect to the first anchor sequence. In one embodiment, anchor sequence-mediated connectivity is mediated by a first connective nucleating molecule bound to the first anchor sequence, a second connective nucleating molecule bound to a discontinuous second anchor sequence, and association between the first and second connective nucleating molecules. In another embodiment, the first or second connective nucleating molecule has a binding affinity to an anchor sequence that is higher or lower than a reference value, such as the binding affinity to the anchor sequence in the absence of modification.

[0114] In some embodiments where the anchor sequences are discontinuous with respect to each other, the first anchor sequence is separated from the second anchor sequence by approximately 500 bp to 500 Mb, 750 bp to 200 Mb, 1 kb to 100 Mb, 25 kb to 50 Mb, 50 kb to 1 Mb, 100 kb to 750 kb, 150 kb to 500 kb, or 175 kb to 500 kb. In some embodiments, the first anchor sequence is separated from the second anchor sequence by approximately 500bp, 600bp, 700bp, 800bp, 900bp, 1kb, 5kb, 10kb, 15kb, 20kb, 25kb, 30kb, 35kb, 40kb, 45kb, 50kb, 55kb, 60kb, 65kb, 70kb, 75kb, 80kb, 85kb, 90kb, 95kb, 100kb, 125kb, 150kb, 175kb, 200kb, 225kb, 250kb, 275kb, 300kb, 350kb, 400kb, 500kb, 600kb, 700kb, 800kb, 900kb, 1Mb, 2Mb, 3Mb, 4Mb, 5Mb, 6Mb, 7Mb, 8Mb, 9Mb, 10Mb, 15Mb, 20Mb, 25Mb, 50Mb, 75Mb, 100Mb, 200Mb, 300Mb, 400Mb, 500Mb, or any size in between.

[0115] In certain embodiments, the first and second anchor sequences each include a common nucleotide sequence, such as a CTCF-binding motif. In some embodiments, the first and second anchor sequences include different sequences; for example, the first anchor sequence includes a CTCF-binding motif, and the second anchor sequence includes an anchor sequence other than a CTCF-binding motif.

[0116] In some embodiments, the anchor sequence-mediated connection includes multiple anchor sequences. In one embodiment, at least one of the anchor sequences includes a CTCF binding motif.

[0117] In some further embodiments, the anchor sequence-mediated connection includes a loop, such as an intrachromosomal loop. In one embodiment, the loop includes a first anchor sequence, a nucleic acid sequence, a transcriptional regulatory sequence such as an enhancing or silencing sequence, and a second anchor sequence. In another embodiment, the loop includes, in order, a first anchor sequence, a transcriptional regulatory sequence, and a second anchor sequence; or a first anchor sequence, a nucleic acid sequence, and a second anchor sequence. In yet another embodiment, either or both of the nucleic acid sequence and the transcriptional regulatory sequence are located inside or outside the loop.

[0118] In a particular embodiment, the anchor sequence-mediated connection has multiple loops. In one embodiment, the anchor sequence-mediated connection comprises multiple loops, and the anchor sequence-mediated connection includes at least one of an anchor sequence, a nucleic acid sequence, and a transcriptional regulatory sequence in one or more of the loops.

[0119] In some embodiments, the transcription of a nucleic acid sequence, such as a target nucleic acid sequence, is modulated compared to a reference value, e.g., the transcription of the target sequence in the absence of the altered anchor sequence-mediated linkage.

[0120] In some embodiments, transcription is activated by the inclusion of an activation loop. In one embodiment, the anchor sequence-mediated linkage includes a transcriptional regulatory sequence, such as an enhancing sequence, which increases the transcription of the nucleic acid sequence. In some further embodiments, transcription is activated by the exclusion of an inhibitory loop. In one embodiment, the anchor sequence-mediated linkage excludes a transcriptional regulatory sequence, such as a silencing sequence, which decreases the transcription of the nucleic acid sequence.

[0121] In some embodiments, transcription is suppressed by the inclusion of a repression loop. In one embodiment, the anchor sequence-mediated linkage includes a transcriptional regulatory sequence, such as a silencing sequence, which reduces the transcription of the nucleic acid sequence. In some further embodiments, transcription is suppressed by the exclusion of an activation loop. In one embodiment, the anchor sequence-mediated linkage excludes a transcriptional regulatory sequence, such as an enhancing sequence, which increases the transcription of the nucleic acid sequence.

[0122] In some embodiments, this disclosure includes a pharmaceutical composition comprising the engineered cells described herein or the engineered nucleic acid sequence described herein. In some further embodiments, this disclosure includes a plurality of cells comprising the engineered cells described herein. In some further embodiments, this disclosure includes a vector comprising the engineered nucleic acid sequence described herein.

[0123] In one embodiment, the disclosure includes a method for treating a disease or condition, which involves administering a targeted portion selected from at least one of an exogenous connective nucleation molecule, a nucleic acid encoding a connective nucleation molecule, and a fusion of a sequence-targeted polypeptide and a connective nucleation molecule.

[0124] In certain embodiments, the connective nucleating molecule disrupts the binding of the endogenous connective nucleating molecule to its binding site, for example, through competitive binding.

[0125] In some embodiments, the targeting moiety comprises an enzyme, such as a sequence-targeting polypeptide such as Cas9. In some embodiments, the targeting moiety further comprises a connective nucleation molecule. In some further embodiments, the targeting moiety further comprises a guide RNA or a nucleic acid encoding the guide RNA. In some further embodiments, the targeting moiety targets one or more nucleotides of an anchor sequence in an anchor sequence-mediated linkage for substitution, addition, or deletion, for example, via CRISPR, TALEN, dCas9, recombination, transposons, etc. In some embodiments, the targeting moiety targets one or more DNA methylation sites in an anchor sequence-mediated linkage. In some further embodiments, the targeting moiety introduces at least one of the following: at least one exogenous anchor sequence; a modification of at least one connective nucleation molecule binding site, such as by altering the binding affinity to the connective nucleation molecule; a change in the orientation of at least one common nucleotide sequence, such as a CTCF binding motif; and at least one substitution, addition, or deletion in at least one anchor sequence, such as a CTCF binding motif.

[0126] In certain embodiments, administration includes administering a vector, such as a viral vector, containing a targeting moiety, for example, a nucleic acid encoding a conjugate nucleation molecule. In some further embodiments, administration includes administering a formulation, such as liposomes.

[0127] In some embodiments, the disease or condition is selected from the group consisting of cancer, trinucleotide repeat disorders (Huntington's disease, fragile X, all spinocerebellar ataxias, Friedreich's ataxia, myotonic dystrophy, etc.), autosomal dominant conditions, imprinting gene disorders (Prader-Willi syndrome, Angelman syndrome), haploinsufficiency disorders, dominant-negative mutations (severe congenital neutropenia), viral diseases (HIV, HBV, HCV, HPV, etc.), and environmentally induced transcriptional-epigenetic alterations (smoking, maternal dietary effects on gene expression).

[0128] In one aspect, this disclosure includes a pharmaceutical composition comprising at least one polypeptide, e.g., a membrane-permeable polypeptide, each comprising at least one sequence of ABXnC (wherein A is selected from a hydrophobic amino acid or an amide-containing skeleton having a nucleic acid side chain, e.g., aminoethylglycine; B and C may be the same or different, each independently selected from arginine, asparagine, glutamine, lysine, and their analogues; X is each independently a hydrophobic amino acid, or X is each independently an amide-containing skeleton having a nucleic acid side chain, e.g., aminoethylglycine; n is an integer from 1 to 4), which can hybridize to a nucleic acid sequence in an anchor sequence-mediated linkage (e.g., an anchor sequence of an anchor sequence-mediated linkage, e.g., a CTCF-binding motif, a BORIS-binding motif, a cohesin-binding motif, a USF1-binding motif, a YY1-binding motif, a TATA box, a ZNF143-binding motif, etc.).

[0129] The compositions described in the various embodiments of the above-described aspects can be used in any other embodiments described herein. In some embodiments, the targeting moiety of one or more embodiments described herein comprises a membrane-permeable polypeptide, for example, a polypeptide described herein.

[0130] In some embodiments, the hydrophobic amino acid is selected from alanine, valine, isoleucine, leucine, methionine, phenylalanine, tyrosine, tryptophan, and analogues thereof. In some embodiments, B is selected from arginine or glutamine. In some embodiments, C is arginine. In some embodiments, n is 2.
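For the all-amino-acid case, the A-B-X(n)-C pattern described above can be written as a regular expression over one-letter residue codes. The sketch below is a simplification under stated assumptions: it covers only the eight hydrophobic residues and four B/C residues named in the text, and it ignores analogues and the alternative in which A or X is an amide-containing skeleton with a nucleic acid side chain. The function name and the peptide strings are hypothetical.

```python
import re

HYDROPHOBIC = "AVILMFYW"  # alanine, valine, isoleucine, leucine, methionine,
                          # phenylalanine, tyrosine, tryptophan
B_OR_C = "RNQK"           # arginine, asparagine, glutamine, lysine

# A, then B, then X repeated n = 1..4 times, then C.
# The lookahead lets overlapping occurrences be reported as well.
ABXNC = re.compile(
    rf"(?=([{HYDROPHOBIC}][{B_OR_C}][{HYDROPHOBIC}]{{1,4}}[{B_OR_C}]))"
)


def find_abxnc(peptide: str):
    """Return (start index, matched segment) for each A-B-X(n)-C occurrence."""
    return [(m.start(), m.group(1)) for m in ABXNC.finditer(peptide.upper())]


# Hypothetical peptide: A = Ala, B = Arg, X = Leu-Leu (n = 2), C = Arg.
print(find_abxnc("GARLLRG"))  # prints [(1, 'ARLLR')]
```

Because the hydrophobic set and the B/C set do not overlap, each start position yields at most one match, so the value of n can be read directly from the matched segment length minus three.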

[0131] In some embodiments, the polypeptide has a size in the range of about 5 to about 50 amino acid units in length.

[0132] In some embodiments, the composition comprises two or more polypeptides linked together. In some embodiments, the polypeptides are linked together, for example, by linking amino acids on one polypeptide to one or more amino acids or carboxyl or amino termini on another polypeptide to form a branched polypeptide, or by linking via new peptide bonds to form a linear polypeptide. In some embodiments, the polypeptides are linked by linkers as described herein.

[0133] In some embodiments, the nucleic acid side chain is independently selected from the group consisting of purine side chains, pyrimidine side chains, and nucleic acid analog side chains. In some embodiments, the nucleic acid side chain hybridizes to a heterologous moiety containing a nucleic acid side chain, e.g., PNA, or to a nucleic acid.

[0134] In some embodiments, the composition comprises a membrane-permeable polypeptide and at least one heterologous moiety. In one embodiment, the heterologous moiety is a conjugate nucleation molecule that interacts with anchor sequence-mediated conjugates. In another embodiment, the heterologous moiety is a sequence-targeting polypeptide, such as Cas9. In yet another embodiment, the heterologous moiety is a guide RNA or a nucleic acid encoding a guide RNA.

[0135] In some embodiments, the heterologous moiety is selected from the group consisting of small molecules (e.g., drugs), peptides (e.g., ligands), and nucleic acids (e.g., siRNA, DNA, modified RNA, RNA). In other embodiments, the heterologous moiety has at least one effector activity selected from the group consisting of modulating biological activity, binding to regulatory proteins, modulating enzyme activity, modulating substrate binding, modulating receptor activation, modulating protein stability / degradation, and modulating transcript stability / degradation. In another embodiment, the heterologous moiety has at least one targeted function selected from the group consisting of modulating function, modulating molecules (e.g., enzymes, proteins, or nucleic acids), and localizing to a specific location. In yet another embodiment, the heterologous moiety is a tag or label, and is, for example, cleavable. In another embodiment, the heterologous moiety is selected from the group consisting of epigenetic modifiers, epigenetic enzymes, bicyclic peptides, transcription factors, DNA or protein modifying enzymes, DNA intercalators, efflux pump inhibitors, nuclear receptor activators or inhibitors, proteasome inhibitors, competitive inhibitors for enzymes, protein synthesis inhibitors, nucleases, protein fragments or domains, tags or markers, antigens, antibodies or antibody fragments, ligands or receptors, synthetic or analog peptides derived from naturally occurring bioactive peptides, antimicrobial peptides, pore-forming peptides, targeted or cytotoxic peptides, degradable or self-destructive peptides, CRISPR systems or their components, DNA, RNA, artificial nucleic acids, nanoparticles, oligonucleotide aptamers, peptide aptamers, and drugs with poor pharmacokinetic or pharmacodynamic (PK / PD) properties.

[0136] In some embodiments, the composition further comprises two or more heterologous moieties linked to the polypeptide, for example, via a linker or directly, on an amino terminus, a carboxyl terminus, all termini, a combination of some carboxyl termini and some amino termini of the polypeptide, one or more amino acids of the polypeptide, or any combination thereof. In some embodiments, the heterologous moieties are linked to one of the polypeptides, for example, via a linker or directly, on an amino terminus, a carboxyl terminus, both termini, or on one or more amino acids of the polypeptide.

[0137] In some embodiments, the composition further includes a linker, for example, between polypeptides or between polypeptides and heterologous moieties. The linker may be a chemical bond, for example, one or more covalent or non-covalent bonds. In some embodiments, the linker is a peptide linker (e.g., a non-ABXnC polypeptide). Such linkers may be 2 to 30 amino acids long or longer. The linkers include the flexible, rigid, or cleavable linkers described herein.

[0138] In some embodiments, the composition modulates DNA methylation at one or more sites within the anchor sequence-mediated linkage.

[0139] In some embodiments, the composition transiently modulates the transcription, resulting in modulation that lasts for, for example, about 30 minutes to about 7 days or less, or about 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, 6 hours, 7 hours, 8 hours, 9 hours, 10 hours, 11 hours, 12 hours, 13 hours, 14 hours, 15 hours, 16 hours, 17 hours, 18 hours, 19 hours, 20 hours, 21 hours, 22 hours, 24 hours, 36 hours, 48 hours, 60 hours, 72 hours, 4 days, 5 days, 6 days, 7 days or less, or any time in between.

[0140] In some embodiments, the composition provides a stable modulation of transcription, resulting in a modulation that lasts for at least about 1 hour to about 30 days, or at least about 2 hours, 6 hours, 12 hours, 18 hours, 24 hours, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 11 days, 12 days, 13 days, 14 days, 15 days, 16 days, 17 days, 18 days, 19 days, 20 days, 21 days, 22 days, 23 days, 24 days, 25 days, 26 days, 27 days, 28 days, 29 days, 30 days, or longer, or any time in between.

[0141] In some embodiments, the composition interacts with a connection nucleating molecule, for example, by modulating its binding affinity for an anchor sequence within the anchor sequence-mediated connection.

[0142] In some embodiments, the composition disrupts the binding of endogenous linkage nucleating molecules to their binding sites, for example, through competitive binding.

[0143] In one embodiment, the disclosure includes a method for modifying the expression of a target gene, which involves altering an anchor sequence-mediated connection associated with the target gene, the alteration of which modulates the transcription of the target gene.

[0144] In one embodiment, the disclosure includes a method for modifying the expression of a target gene, which involves administering a composition described herein to cells, tissues, or subjects.

[0145] In one embodiment, the present disclosure includes a method for modulating the transcription of a nucleic acid sequence, comprising administering a composition described herein to alter an anchor sequence-mediated linkage, such as a loop, wherein the alteration of the anchor sequence-mediated linkage modulates the transcription of the nucleic acid sequence.

[0146] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0147] In some embodiments, the composition modulates DNA methylation at one or more sites within the anchor sequence-mediated linkage.

[0148] In some embodiments, changes in anchor sequence-mediated connections result in transient modulation of transcription, for example, from about 30 minutes to about 7 days or less, or modulation that lasts for about 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, 6 hours, 7 hours, 8 hours, 9 hours, 10 hours, 11 hours, 12 hours, 13 hours, 14 hours, 15 hours, 16 hours, 17 hours, 18 hours, 19 hours, 20 hours, 21 hours, 22 hours, 24 hours, 36 hours, 48 hours, 60 hours, 72 hours, 4 days, 5 days, 6 days, 7 days or less, or any time in between.

[0149] In some embodiments, changes in anchor sequence-mediated connections result in stable modulation of transcription, for example, modulation that lasts for at least about 1 hour to about 30 days, or at least about 2 hours, 6 hours, 12 hours, 18 hours, 24 hours, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 11 days, 12 days, 13 days, 14 days, 15 days, 16 days, 17 days, 18 days, 19 days, 20 days, 21 days, 22 days, 23 days, 24 days, 25 days, 26 days, 27 days, 28 days, 29 days, 30 days, or longer, or any time in between.

[0150] In some embodiments, the composition interacts with a connection nucleating molecule, for example, by modulating its binding affinity for an anchor sequence within the anchor sequence-mediated connection.

[0151] In some embodiments, the composition disrupts the binding of endogenous linkage nucleating molecules to their binding sites, for example, through competitive binding.

[0152] In some embodiments, the heterologous portion is a sequence-targeted polypeptide, such as Cas9. In some embodiments, the heterologous portion is a guide RNA or a nucleic acid encoding a guide RNA.

[0153] In one embodiment, the present disclosure includes a method for modulating gene expression, which includes providing a composition described herein, for example, a heterologous moiety that is an endogenous effector, an exogenous effector, or an agonist or antagonist thereof that inhibits CpG binding.

[0154] In one embodiment, the present disclosure includes a method of delivering a therapeutic agent, comprising administering the composition described herein to a subject, wherein the heterologous portion is the therapeutic agent, and the composition increases the intracellular delivery of the therapeutic agent compared to the therapeutic agent alone.

[0155] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0156] In some embodiments, the composition is targeted to specific cells or tissues. For example, the composition is targeted to epithelial, connective tissue, muscle, or nerve tissue or cells, or a combination thereof. For example, the composition is targeted to cells or tissues of specific organ systems, such as the cardiovascular system (heart, blood vessels); the digestive system (esophagus, stomach, liver, gallbladder, pancreas, intestines, colon, rectum, and anus); the endocrine system (hypothalamus, pituitary gland, pineal body or pineal gland, thyroid gland, parathyroid gland, adrenal gland); the excretory system (kidneys, ureters, bladder); the lymphatic system (lymph, lymph nodes, lymphatic vessels, tonsils, adenoids, thymus, spleen); the cutaneous system (skin, hair, nails); the muscular system (e.g., skeletal muscle); the nervous system (brain, spinal cord, nerves); the reproductive system (ovaries, uterus, mammary glands, testes, vas deferens, seminal vesicles, prostate); the respiratory system (pharynx, larynx, trachea, bronchi, lungs, diaphragm); the skeletal system (bones, cartilage), and combinations thereof. In some embodiments, the composition crosses the blood-brain barrier, placental membrane, or blood-testis barrier.

[0157] In some embodiments, the composition is administered systemically. In some embodiments, the administration is parenteral, and the therapeutic agent is a parenteral therapeutic agent.

[0158] In some embodiments, the composition exhibits enhanced pharmacokinetics or pharmacodynamics, such as improved PK / PD, e.g., improved targeting, absorption, or transport compared to the therapeutic agent alone (e.g., improvements of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 75%, 80%, 90%, or greater). In some embodiments, the composition exhibits reduced undesirable effects, such as reduced diffusion to non-target sites, reduced off-target activity, or reduced toxic metabolism compared to the therapeutic agent alone (e.g., reductions of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 75%, 80%, 90%, or greater compared to the therapeutic agent alone). In some embodiments, the composition increases the efficacy of the therapeutic agent and / or reduces its toxicity compared to the therapeutic agent alone (e.g., by at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 75%, 80%, 90%, or more).
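The percentage comparisons in paragraph [0158] can be illustrated with a short calculation. The following is a minimal sketch only, assuming that a single scalar PK / PD measurement is available for the composition and for the therapeutic agent alone; the function names and the example values are hypothetical and do not appear in the disclosure.

```python
# Threshold percentages as recited in paragraph [0158].
THRESHOLDS = (5, 10, 15, 20, 30, 40, 50, 60, 75, 80, 90)


def percent_change(composition_value, agent_alone_value):
    """Relative change (in percent) of the composition versus the agent alone."""
    if agent_alone_value == 0:
        raise ValueError("the value for the agent alone must be non-zero")
    return 100.0 * (composition_value - agent_alone_value) / abs(agent_alone_value)


def highest_threshold_met(change_percent, thresholds=THRESHOLDS):
    """Largest recited threshold reached by the magnitude of the change, or None."""
    met = [t for t in thresholds if abs(change_percent) >= t]
    return max(met) if met else None


# Hypothetical measurements: 130 units for the composition, 100 for the agent alone.
improvement = percent_change(130, 100)
print(improvement, highest_threshold_met(improvement))
```

The magnitude is used in `highest_threshold_met` so that the same helper covers both the improvements and the reductions described in this paragraph.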

[0159] In one embodiment, the present disclosure includes a method for intracellular delivery of a therapeutic agent, comprising contacting a cell with a composition described herein, wherein the heterologous portion is the therapeutic agent, and the composition increases the intracellular delivery of the therapeutic agent compared to the therapeutic agent alone.

[0160] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0161] In some embodiments, the composition has a differential PK / PD compared to the therapeutic agent alone. For example, the composition exhibits an increase or decrease in absorption or distribution, metabolism or excretion (e.g., an increase or decrease of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 75%, 80%, 90%, or greater) compared to the therapeutic agent alone.

[0162] In some embodiments, the composition is administered in a dose sufficient to increase intracellular delivery of the therapeutic agent without significantly increasing endocytosis, for example, less than 50%, 40%, 20%, 15%, 10%, 5%, 4%, 3%, 2%, 1%, or any percentage in between. In some embodiments, the composition is administered in a dose sufficient to increase intracellular delivery of the therapeutic agent without significantly increasing calcium influx, for example, less than 50%, 40%, 20%, 15%, 10%, 5%, 4%, 3%, 2%, 1%, or any percentage in between. In some embodiments, the composition is administered in a dose sufficient to increase intracellular delivery of the therapeutic agent without significantly increasing endosomal activity, for example, less than 50%, 40%, 20%, 15%, 10%, 5%, 4%, 3%, 2%, 1%, or any percentage in between.

[0163] In one embodiment, the present disclosure includes a method for modulating the transcription of a gene in a cell, comprising contacting a cell with a composition described herein, wherein the composition targets a gene and modulates its transcription.

[0164] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0165] In some embodiments, the composition is administered in an amount and for a time sufficient to deliver the therapeutic agent intracellularly with reduced off-target transcriptional activity compared to the heterologous portion alone, for example, without significantly altering the off-target transcriptional activity.

[0166] In one embodiment, the present disclosure includes a method for modulating a membrane protein on a cell, comprising contacting the cell with a composition described herein, wherein the composition targets the cell and modulates a membrane protein on the cell, such as an ion channel, a cell surface receptor, or a synaptic receptor.

[0167] In one embodiment, the present disclosure includes a method for inducing cell death, comprising contacting cells with a composition described herein, wherein the composition targets cells and induces apoptosis.

[0168] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0169] In some embodiments, the composition targets cells carrying viral DNA sequences or mutations in genes. In one embodiment, the cells are infected with the virus. In another embodiment, the cells carry genetic mutations. In some embodiments, the composition targets cells in the early stages of necrosis, for example, cells that bind to necrotic cell markers.

[0170] In one embodiment, the present disclosure includes a method for increasing the bioavailability of a therapeutic agent, which involves administering a composition described herein in which the therapeutic agent is the heterologous portion.

[0171] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0172] In some embodiments, the composition improves at least one PK / PD parameter compared to the therapeutic agent alone, such as improved targeting, absorption, or transport (e.g., by at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 75%, 80%, 90%, or more). In some embodiments, the composition reduces at least one undesirable parameter compared to the therapeutic agent alone, such as reduced diffusion to non-target sites, off-target activity, or toxic metabolism (e.g., by at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 75%, 80%, 90%, or more). In some embodiments, the composition increases the efficacy of the therapeutic agent and / or reduces its toxicity compared to the therapeutic agent alone.

[0173] In one embodiment, the disclosure includes a method for treating an acute or chronic infection, comprising administering a composition described herein.

[0174] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0175] In some embodiments, the composition targets infected cells carrying a pathogen. In some embodiments, the infection is caused by a pathogen selected from the group consisting of viruses, bacteria, parasites, and prions. In some embodiments, the composition induces cell death in infected cells; for example, the heterologous portion is an antibacterial, antiviral, or antiparasitic therapeutic agent.

[0176] In one embodiment, the disclosure includes a method for treating cancer, which involves administering a composition described herein.

[0177] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0178] In some embodiments, the heterologous portion is a therapeutic agent that modulates the gene expression of one or more genes.

[0179] In some embodiments, the composition targets cancer cells that carry mutations in their genes. In some embodiments, the composition induces cell death in cancer cells, and the heterologous portion, for example, is a chemotherapeutic agent.

[0180] In one embodiment, the disclosure includes a method for treating a neurological disorder or condition, which involves administering a composition described herein.

[0181] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0182] In some embodiments, the composition modulates the neurotransmitter, neuropeptide, or neuroreceptor activity or activation of a neuroreceptor.

[0183] In some embodiments, the neurological disorder or condition is Dravet syndrome.

[0184] In one embodiment, the disclosure includes a method of treating a disease / disorder / condition in a subject, comprising administering a composition described herein, wherein the composition modulates transcription to treat the disease / disorder / condition.

[0185] Methods described in various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0186] In some embodiments, the disease / disorder / condition is a genetic disorder.

[0187] In one embodiment, the present disclosure includes a method for inducing immune tolerance, which includes providing compositions described herein, for example, in which the heterologous portion is an antigen.

[0188] In one embodiment, the disclosure includes a method for altering the expression of a target gene in the genome, comprising administering to a genome a pharmaceutical composition comprising a DNA sequence comprising (a) a targeting portion and (b) an anchor sequence, wherein the anchor sequence facilitates the formation of a linkage that brings a gene expression factor (enhancing sequence, silencing / repressing sequence) into an operable linkage with the target gene.

[0189] In one embodiment, the disclosure includes a system for pharmaceutical use comprising a protein comprising a first polypeptide domain containing Cas or a modified Cas protein and a second polypeptide domain containing a polypeptide having DNA methyltransferase activity [or demethylating or deaminase activity], in combination with at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets the protein to an anchor sequence of a target anchor sequence-mediated linkage, the system being effective in altering a target anchor sequence-mediated linkage in human cells.

[0190] In one embodiment, the disclosure includes a system for altering the expression of a target gene in human cells, comprising a targeting region (e.g., gRNA, LDB) that associates with an anchor sequence associated with a target gene operably linked to the targeting region, and optionally a heterologous region (e.g., an enzyme, e.g., a nuclease or a deactivated nuclease (e.g., Cas9, dCas9), methylase, demethylase, deaminase), which modulates the anchor sequence-mediated connection and is effective in altering the expression of the target gene.

[0191] The systems described in the various embodiments of the above-described aspects can be used in any other embodiments described herein.

[0192] In some embodiments, the targeting portion and the heterologous portion are linked. In some embodiments, the system includes a synthetic polypeptide comprising the targeting portion and the heterologous portion. In some embodiments, the system includes a nucleic acid vector or a plurality of vectors encoding at least one of the targeting portion and the heterologous portion.

[0193] The embodiments described herein may be used in conjunction with any one or more of the embodiments described herein.

Definitions

[0194] As used herein, the term “anchor sequence” refers to a nucleic acid sequence recognized by a linkage nucleating agent (e.g., a nucleating protein) that binds sufficiently to form an anchor sequence-mediated linkage, such as a loop. In some embodiments, the anchor sequence comprises one or more CTCF-binding motifs. In some embodiments, the anchor sequence is not located within a gene coding region. In some embodiments, the anchor sequence is located within an intergenic region. In some embodiments, the anchor sequence is not located within either an enhancer or a promoter. In some embodiments, the anchor sequence is located at least 400 bp, at least 450 bp, at least 500 bp, at least 550 bp, at least 600 bp, at least 650 bp, at least 700 bp, at least 750 bp, at least 800 bp, at least 850 bp, at least 900 bp, at least 950 bp, or at least 1 kb away from any transcription start site. In some embodiments, the anchor sequence is located within a region without genomic imprinting, monoallelic expression, and / or monoallelic epigenetic markings. In some embodiments of this disclosure, techniques are provided that enable the specific targeting of one or more specific anchor sequences without targeting other anchor sequences (e.g., sequences which may contain a conjugate nucleating agent (e.g., CTCF) binding motif in a different context); such targeted anchor sequences may be referred to as “targeted anchor sequences.” In some embodiments, the sequence and / or activity of the targeted anchor sequence is modulated, but the sequence and / or activity of one or more other anchor sequences which may be present in the same system as the targeted anchor sequence (e.g., in the same cell, and / or in some embodiments, on the same nucleic acid molecule, e.g., on the same chromosome) is not modulated.
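The distance criterion in paragraph [0194] (an anchor sequence located at least 400 bp to at least 1 kb away from any transcription start site) can be sketched as a simple coordinate filter. This is an illustrative sketch only, assuming candidate anchor positions and transcription start sites (TSS) are given as integer coordinates on the same chromosome; the function names and all coordinates are hypothetical.

```python
from bisect import bisect_left


def min_distance_to_tss(position, sorted_tss):
    """Distance in bp from a position to the nearest TSS in a sorted, non-empty list."""
    i = bisect_left(sorted_tss, position)
    distances = []
    if i < len(sorted_tss):
        distances.append(sorted_tss[i] - position)      # nearest TSS at or after position
    if i > 0:
        distances.append(position - sorted_tss[i - 1])  # nearest TSS before position
    return min(distances)


def filter_anchor_candidates(candidates, tss_positions, min_distance_bp=400):
    """Keep candidates lying at least min_distance_bp away from every TSS."""
    sorted_tss = sorted(tss_positions)
    if not sorted_tss:
        return list(candidates)  # no TSS annotated, so nothing can be excluded
    return [p for p in candidates
            if min_distance_to_tss(p, sorted_tss) >= min_distance_bp]


# Hypothetical coordinates (bp) on a single chromosome.
tss = [1000, 5000]
anchors = [1200, 1400, 3000, 4700, 600]
print(filter_anchor_candidates(anchors, tss, min_distance_bp=400))
print(filter_anchor_candidates(anchors, tss, min_distance_bp=1000))
```

The `min_distance_bp` parameter corresponds to whichever of the recited thresholds (400 bp through 1 kb) applies in a given embodiment.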

[0195] As used herein, the term “anchor-mediated linkage” refers to a DNA structure, in some cases a loop, that results from and / or is maintained by the physical interaction or linkage of at least two anchor sequences in DNA by one or more proteins, such as nucleating proteins, or by one or more proteins and / or nucleic acid entities (such as RNA or DNA), which bind to the anchor sequences in a manner that enables spatial proximity and functional linkage between the anchor sequences (see Figure 1).

[0196] As used herein, the term “associated with” means that a gene is associated with an anchor sequence-mediated linkage if the formation or disruption of the linkage causes an alteration in the expression (e.g., transcription) of the target gene. For example, the formation or disruption of an anchor sequence-mediated linkage causes an enhancing or silencing / repressing sequence to become associated with or unassociated with that gene.

[0197] As used herein, the term “anchor-mediated linkage not found in nature” refers to an anchor-mediated linkage whose formation does not occur in nature. Such a linkage may result from, without limitation, the modification, addition, or deletion of one or more anchor sequences and / or the modification of one or more linkage nucleating molecules.

[0198] As used herein, the term “common nucleotide sequence” refers to a linkage nucleation molecule binding site within an anchor sequence. Examples of common nucleotide sequences include, but are not limited to, the CTCF binding motif, USF1 binding motif, YY1 binding motif, TAF3 binding motif, and ZNF143 binding motif.
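Locating a binding motif of the kind listed in paragraph [0198] within a candidate anchor sequence can be sketched as a degenerate-motif scan over both strands. This is a minimal sketch, assuming the motif is supplied as a string of IUPAC nucleotide codes; the motif `CCGNG` and the test sequence are placeholders chosen for illustration and are not the actual CTCF, USF1, YY1, TAF3, or ZNF143 binding motifs.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Reverse complement of a sequence or motif written in IUPAC codes."""
    return motif.translate(COMPLEMENT)[::-1]


def find_motif_sites(sequence, motif):
    """Return sorted (start, strand) pairs for every motif match on either strand.

    Starts are 0-based positions on the forward strand; overlapping matches
    are reported because the pattern is wrapped in a lookahead.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    hits = set()
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        regex = re.compile("(?=" + "".join(IUPAC[base] for base in pattern) + ")")
        for match in regex.finditer(sequence):
            hits.add((match.start(), strand))
    return sorted(hits)


# Placeholder motif and sequence, for illustration only.
print(find_motif_sites("AACCGTGTTCACGGTT", "CCGNG"))
```

A position weight matrix would be a more realistic model of a binding motif than a single degenerate string; the string form is used here only to keep the sketch short.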

[0199] As used herein, the term “conjugate nucleating agent” refers to a protein that can associate directly or indirectly with an anchor sequence and interact with one or more other conjugate nucleating agents (which may themselves interact with an anchor sequence or other nucleic acids) to form a dimer (or higher-order structure) containing two or more such conjugate nucleating agents, which may or may not be identical to each other. When conjugate nucleating agents associated with different anchor sequences associate with each other, and the different anchor sequences are maintained in close physical proximity to each other, the resulting structure is an anchor-sequence-mediated conjugate. That is, the close physical proximity of one conjugate nucleating molecule bound to an anchor sequence and another conjugate nucleating molecule bound to an anchor sequence generates an anchor-sequence-mediated conjugate (e.g., a DNA loop in some cases) that begins and ends with the anchor sequence (see Figure 2). As will be readily apparent to those skilled in the art reading this specification, terms such as “nucleating polypeptide,” “nucleating molecule,” and “conjugate nucleating protein” may also be used to refer to conjugate nucleating agents. Similarly, as those skilled in the art will readily understand from reading this specification, a collection of two or more conjugate nucleating agents (which in some embodiments may include multiple copies of the same agent, and / or in some embodiments may include one or more of each of several different agents) may be referred to as a “complex,” “dimer,” “multimer,” etc.

[0200] The term “loop” refers to a type of chromatin structure that can be created by the simultaneous localization of two or more anchor sequences as anchor sequence-mediated connections. Thus, loops are formed as a result of the interaction between at least two anchor sequences in DNA and one or more proteins, such as nucleating proteins, or one or more proteins and / or nucleic acid entities (such as RNA or DNA) that bind to the anchor sequences, enabling spatial proximity and functional linkage between them. Those skilled in the art who read this specification will understand that a 2D representation of such a structure can be presented as a loop, for example, as shown in Figure 2. An “activating loop” is a structure that is open to active gene transcription, for example, a structure containing transcriptional regulatory sequences (enhancing sequences) that enhance transcription. A “repressing loop” is a structure that is closed to active gene transcription, for example, a structure containing transcriptional regulatory sequences (silencing sequences) that repress transcription.

[0201] As used herein, the term “sequence-targeted polypeptide” refers to an enzyme or protein, such as Cas9, that recognizes or specifically binds to a target sequence. In some embodiments, the sequence-targeted polypeptide is a catalytically inactive protein, such as dCas9, that lacks endonuclease activity.

[0202] As used herein, the term “subject” means an organism, e.g., a mammal (e.g., human, non-human mammal, non-human primate, primate, laboratory animal, mouse, rat, hamster, gerbil, cat, or dog). In some embodiments, a human subject is an adult, adolescent, or child subject. In some embodiments, a subject has a disease or condition. In some embodiments, a subject suffers from a disease, disorder, or condition, e.g., a disease, disorder, or condition that can be treated as provided herein. In some embodiments, a subject is susceptible to a disease, disorder, or condition; in some embodiments, a susceptible subject is predisposed to developing a disease, disorder, or condition and / or exhibits a high risk of developing one (compared to the mean risk observed in a reference subject or population). In some embodiments, a subject exhibits one or more symptoms of a disease, disorder, or condition. In some embodiments, a subject does not exhibit a specific symptom (e.g., clinical signs of the disease) or feature of a disease, disorder, or condition. In some embodiments, a subject does not exhibit any symptom or feature of a disease, disorder, or condition. In some embodiments, the subject is a patient. In some embodiments, the subject is an individual to whom a diagnosis and / or therapy is to be and / or has been administered.

[0203] As used herein, the terms “targeting moiety” or “targeting element” refer to molecules that specifically bind to sequences within or around an anchor sequence-mediated linkage. Examples of targeting moieties include, but are not limited to, enzymes, sequence-targeting polypeptides such as Cas9, fusions of sequence-targeting polypeptides with linkage nucleating molecules such as dCas9 and linkage nucleating molecules, or guide RNA or nucleic acids such as RNA, DNA, or modified RNA or DNA.

[0204] As used herein, the term “transcriptional regulatory sequence” refers to a nucleic acid sequence that increases or decreases the transcription of a gene. “Enhancing sequences” increase the likelihood of gene transcription. “Silencing or repressing sequences” decrease the likelihood of gene transcription. Enhancing and silencing sequences are approximately 50–3500 bp in length and can affect gene transcription at distances of up to 1 Mb.

In certain embodiments, for example, the following are provided:
(Item 1) A site-directed disruptor comprising a DNA-binding portion that specifically binds to one or more target anchor sequences within a cell with sufficient affinity to compete with the binding of endogenous nucleating polypeptides within the cell, and that does not bind to non-target anchor sequences within the cell.
(Item 2) The site-directed disruptor according to item 1, further comprising a negative effector portion associated with the DNA-binding portion, wherein when the DNA-binding portion binds to one or more target anchor sequences, the negative effector portion localizes thereto, and the dimerization of the endogenous nucleating polypeptide is reduced when the negative effector portion is present compared to when the negative effector portion is absent.
(Item 3) The site-directed disruptor according to item 2, wherein the negative effector portion is or comprises a variant of the dimerization domain of the endogenous nucleating polypeptide, or a dimerizing portion thereof.
(Item 4) The site-directed disruptor according to any one of the above items, wherein the DNA-binding portion is a polymer or comprises a polymer.
(Item 5) The site-directed disruptor according to item 4, wherein the polymer is a polyamide or comprises one.
(Item 6) The site-directed disruptor according to item 4, wherein the polymer is an oligonucleotide.
(Item 7) The site-directed disruptor according to item 6, wherein the oligonucleotide has a sequence containing a complement to the target anchor sequence.
(Item 8) The site-directed disruptor according to item 6 or 7, wherein the oligonucleotide comprises a chemical modification.
(Item 9) The site-directed disruptor according to item 4, wherein the polymer is a peptide nucleic acid.
(Item 10) The site-directed disruptor according to item 4, wherein the DNA-binding portion is or contains a peptide nucleic acid mixture.
(Item 11) The site-directed disruptor according to item 4, wherein the DNA-binding portion is a peptide or polypeptide, or comprises such a peptide or polypeptide.
(Item 12) The site-directed disruptor according to item 11, wherein the polypeptide is a zinc finger polypeptide.
(Item 13) The site-directed disruptor according to item 11, wherein the polypeptide is a transcription activator-like effector nuclease (TALEN) polypeptide, or comprises the same.
(Item 14) The site-directed disruptor according to any one of items 1 to 3, wherein the DNA-binding portion is a small molecule or contains one.
(Item 15) A method for modulating the expression of a gene within an anchor sequence-mediated linkage comprising a first anchor sequence and a second anchor sequence, comprising the step of contacting the first and / or second anchor sequence with the site-directed disruptor according to any one of items 1 to 14.
(Item 16) The method according to item 15, wherein the anchor sequence-mediated linkage includes at least one internal transcriptional regulatory sequence.
(Item 17) The method according to item 16, wherein the transcriptional regulatory sequence is an enhancing sequence.
(Item 18) The method according to item 16, wherein the transcriptional regulatory sequence is a silencing or repressing sequence.
(Item 19) The method according to any one of items 15 to 18, wherein the gene is separated from the internal transcriptional regulatory sequence by at least 300 base pairs.
(Item 20) The method according to any one of items 15 to 19, wherein the first and / or second anchor sequence is located within 500 kb of an external transcriptional regulatory sequence.
(Item 21) The method according to item 20, wherein the external transcriptional regulatory sequence is an enhancing sequence.
(Item 22) The method according to item 20, wherein the external transcriptional regulatory sequence is a silencing or repressing sequence.
(Item 23) A method for modulating the expression of a gene within 10 kb of a first anchor sequence in an anchor sequence-mediated linkage including the first anchor sequence and a second anchor sequence, comprising the step of contacting the first and / or second anchor sequence with the site-directed disruptor according to any one of items 1 to 14.
(Item 24) The method according to item 23, wherein the anchor sequence-mediated linkage includes at least one internal transcriptional regulatory sequence.
(Item 25) The method according to item 24, wherein the transcriptional regulatory sequence is an enhancing sequence.
(Item 26) The method according to item 24, wherein the transcriptional regulatory sequence is a silencing or repressing sequence.
(Item 27) The method according to any one of items 24 to 26, wherein the gene is separated from the internal transcriptional regulatory sequence by at least 300 base pairs.
(Item 28) The method according to any one of items 23 to 27, wherein the first and / or second anchor sequence is located within 500 kb of an external transcriptional regulatory sequence.
(Item 29) The method according to item 28, wherein the external transcriptional regulatory sequence is an enhancing sequence.
(Item 30) The method according to item 28, wherein the external transcriptional regulatory sequence is a silencing or repressing sequence.
(Item 31) A method for reducing the expression of a gene within an anchor sequence-mediated linkage comprising a first anchor sequence, a second anchor sequence, and an internal enhancing sequence, comprising the step of contacting the first and / or second anchor sequence with the site-directed disruptor according to any one of items 1 to 14.
(Item 32) The method according to item 31, wherein the first and / or second anchor sequence is located within 500 kb of an external silencing or repressing sequence.
(Item 33) The method according to item 31 or 32, wherein the gene is separated from the internal enhancing sequence by at least 300 base pairs.
(Item 34) A method for increasing the expression of a gene within an anchor sequence-mediated linkage comprising a first anchor sequence and a second anchor sequence, wherein the first and / or second anchor sequence is located within 10 kb of an external enhancing sequence, comprising the step of contacting the first and / or second anchor sequence with the site-directed disruptor according to any one of items 1 to 14.
(Item 35) The method according to item 34, wherein the anchor sequence-mediated linkage further comprises an internal enhancing sequence.
(Item 36) The method according to item 35, wherein the gene is separated from the internal enhancing sequence by at least 300 base pairs.
(Item 37) A method comprising the step of (a) delivering the site-directed disruptor according to any one of items 1 to 14 to mammalian cells.
(Item 38) The method according to item 37, wherein the mammalian cells are somatic cells.
(Item 39) The method according to item 37 or 38, wherein the mammalian cells are primary cells.
(Item 40) The method according to any one of items 37 to 39, wherein the delivery step is performed ex vivo.
(Item 41) The method according to item 40, further comprising the step of removing the mammalian cells from a subject before the delivery step.
(Item 42) The method according to item 40, further comprising the step of administering the mammalian cells after the delivery step.
(Item 43) The method according to any one of items 37 to 39, wherein the delivery step comprises administering a composition comprising the site-specific disrupting agent to a subject.
(Item 44) The method described in item 41 or 42, wherein the subject has a disease or condition.
(Item 45) The method according to any one of items 37 to 43, wherein the delivery step includes delivery across the cell membrane.
(Item 46) A fusion molecule comprising (i) a site-specific targeting moiety and (ii) a deaminating agent, wherein the site-specific targeting moiety targets the fusion molecule to a target anchor sequence but not to at least one non-target anchor sequence.
(Item 47) The fusion molecule according to item 46, wherein the target anchor sequence contains a CTCF-binding motif.
(Item 48) The fusion molecule according to item 47, wherein the at least one non-target anchor sequence also contains a CTCF-binding motif.
(Item 49) The fusion molecule according to any one of items 46 to 48, wherein the deaminating agent is a deaminase.
(Item 50) The fusion molecule according to any one of items 46 to 49, wherein the site-specific targeting moiety comprises a Cas polypeptide and a site-specific guide RNA.
(Item 51) The fusion molecule described in item 50, wherein the Cas polypeptide is enzymatically inactive.
(Item 52) The fusion molecule according to item 50 or 51, wherein the Cas polypeptide is a Cas9 polypeptide.
(Item 53) The fusion molecule according to any one of items 46 to 48, wherein the deaminating agent comprises an oligonucleotide.
(Item 54) The fusion molecule described in item 53, wherein the oligonucleotide is conjugated with sodium bisulfite.
(Item 55) The fusion molecule according to any one of items 46 to 49, wherein the site-specific targeting moiety is a polymer.
(Item 56) The fusion molecule according to any one of items 46 to 55, wherein the DNA-binding moiety is or comprises a polymer.
(Item 57) The fusion molecule described in item 56, wherein the polymer is or comprises a polyamide.
(Item 58) The fusion molecule described in item 56, wherein the polymer is an oligonucleotide.
(Item 59) The fusion molecule according to item 58, wherein the oligonucleotide has a sequence containing a complement to the target anchor sequence.
(Item 60) The fusion molecule according to item 58 or 59, wherein the oligonucleotide includes a chemical modification.
(Item 61) The fusion molecule described in item 56, wherein the polymer is a peptide nucleic acid.
(Item 62) The fusion molecule described in item 46, wherein the DNA-binding moiety is or contains a peptide nucleic acid mixture.
(Item 63) The fusion molecule according to item 56, wherein the DNA-binding moiety is or comprises a peptide or polypeptide.
(Item 64) The fusion molecule described in item 63, wherein the polypeptide is a zinc finger polypeptide.
(Item 65) The fusion molecule according to item 63, wherein the polypeptide is or comprises a transcription activator-like effector nuclease (TALEN) polypeptide.
(Item 66) The fusion molecule according to any one of items 46 to 48, wherein the DNA-binding moiety is or comprises a small molecule.
(Item 67) A composition comprising: (i) a fusion polypeptide comprising an enzymatically inactive Cas polypeptide and a deaminating agent, or a nucleic acid encoding the fusion polypeptide; and (ii) a guide RNA that targets the fusion polypeptide to a target anchor sequence but not to at least one non-target anchor sequence.
(Item 68) A method for modulating the expression of a gene in an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, comprising the step of contacting the first and / or second anchor sequence with a fusion molecule according to any one of items 46 to 66 or a composition according to item 67.
(Item 69) The method according to item 68, wherein the anchor sequence-mediated connection includes at least one internal transcriptional regulatory sequence.
(Item 70) The method according to item 69, wherein the internal transcriptional regulatory sequence is an enhancing sequence.
(Item 71) The method according to item 69, wherein the internal transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 72) The method according to any one of items 69 to 71, wherein the gene is separated from the internal transcriptional regulatory sequence by at least 300 base pairs.
(Item 73) The method according to any one of items 69 to 72, wherein the first and / or second anchor sequence is located within 500 kb of an external transcriptional regulatory sequence.
(Item 74) The method according to item 73, wherein the external transcriptional regulatory sequence is an enhancing sequence.
(Item 75) The method according to item 73, wherein the external transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 76) A method for modulating the expression of a gene within 10 kb of a first anchor sequence in an anchor sequence-mediated connection including the first anchor sequence and a second anchor sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the fusion molecule described in any one of items 46 to 66 or the composition described in item 67.
(Item 77) The method according to item 76, wherein the anchor sequence-mediated connection includes at least one internal transcriptional regulatory sequence.
(Item 78) The method according to item 77, wherein the internal transcriptional regulatory sequence is an enhancing sequence.
(Item 79) The method according to item 77, wherein the internal transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 80) The method according to any one of items 77 to 79, wherein the gene is separated from the internal transcriptional regulatory sequence by at least 300 base pairs.
(Item 81) The method according to any one of items 76 to 79, wherein the first and / or second anchor sequence is located within 500 kb of an external transcriptional regulatory sequence.
(Item 82) The method according to item 81, wherein the external transcriptional regulatory sequence is an enhancing sequence.
(Item 83) The method according to item 81, wherein the external transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 84) A method for reducing the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence, a second anchor sequence, and an internal enhancing sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the fusion molecule described in any one of items 46 to 66 or the composition described in item 67.
(Item 85) The method according to item 84, wherein the first and / or second anchor sequence is located within 500 kb of an external silencing or repressive sequence.
(Item 86) The method according to item 84 or 85, wherein the gene is separated from the internal enhancing sequence by at least 300 base pairs.
(Item 87) A method for increasing the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, wherein the first and / or second anchor sequence is located within 10 kb of an external enhancing sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the fusion molecule described in any one of items 46 to 66 or the composition described in item 67.
(Item 88) The method according to item 87, wherein the anchor sequence-mediated connection further comprises an internal enhancing sequence.
(Item 89) The method according to item 88, wherein the gene is separated from the internal enhancing sequence by at least 300 base pairs.
(Item 90) A method comprising (a) a step of delivering a fusion molecule described in any one of items 46 to 66 or a composition described in item 67 to mammalian cells.
(Item 91) The method according to item 90, wherein the mammalian cells are somatic cells.
(Item 92) The method according to item 90 or 91, wherein the mammalian cells are primary cells.
(Item 93) The method according to any one of items 90 to 92, wherein the step of delivering is performed ex vivo.
(Item 94) The method according to item 93, further comprising the step of removing the mammalian cells from a subject prior to the step of delivering.
(Item 95) The method according to item 93 or 94, further comprising (b) a step of administering the mammalian cells to the subject.
(Item 96) The method according to any one of items 90 to 92, wherein the step of delivering comprises administering to a subject a composition comprising the fusion molecule according to any one of items 46 to 66 or the composition according to item 67.
(Item 97) The method according to any one of items 90 to 93, wherein the step of delivering comprises delivery across the cell membrane.
(Item 98) A method comprising (a) a step of substituting, adding, or deleting one or more nucleotides of an anchor sequence in a mammalian somatic cell.
(Item 99) The method according to item 98, wherein the mammalian somatic cell is a primary cell.
(Item 100) The method according to item 98, wherein the step of substituting, adding, or deleting is performed in vivo.
(Item 101) The method according to item 98, wherein the step of substituting, adding, or deleting is performed ex vivo.
(Item 102) The method according to any one of items 98 to 101, wherein the mammalian somatic cell is a non-embryonic cell.
(Item 103) The method according to any one of items 98 to 102, wherein the anchor sequence is a genomic anchor sequence.
(Item 104) A method comprising the step of delivering mammalian somatic cells to a subject having a disease or condition, wherein one or more nucleotides of an anchor sequence in the mammalian somatic cells are substituted, added, or deleted.
(Item 105) A method comprising (a) a step of administering mammalian somatic cells to a subject, wherein the mammalian somatic cells are obtained from the subject, and the fusion molecule according to any one of items 46 to 66 or the composition according to item 67 is delivered ex vivo to the mammalian somatic cells.
(Item 106) The method according to any one of items 94 to 96 or 105, wherein the subject is a mammal.
(Item 107) The method according to item 106, wherein the subject has a disease or condition.
(Item 108) A fusion molecule comprising (i) a site-specific targeting moiety and (ii) an epigenetic modifier, wherein the site-specific targeting moiety targets the fusion molecule to a target anchor sequence but not to at least one non-target anchor sequence.
(Item 109) The fusion molecule according to item 108, wherein the target anchor sequence comprises a CTCF binding motif.
(Item 110) The fusion molecule according to item 109, wherein the at least one non-target anchor sequence also comprises a CTCF binding motif.
(Item 111) The fusion molecule according to any one of items 108 to 110, wherein the epigenetic modifier is selected from the group consisting of DNA methylase, DNA demethylase, histone methyltransferase, histone deacetylase, and combinations thereof.
(Item 112) The fusion molecule according to any one of items 108 to 111, wherein the site-specific targeting moiety comprises a Cas polypeptide and a site-specific guide RNA.
(Item 113) The fusion molecule described in item 112, wherein the Cas polypeptide is enzymatically inactive.
(Item 114) The fusion molecule according to item 112 or 113, wherein the Cas polypeptide is a Cas9 polypeptide.
(Item 115) The fusion molecule according to any one of items 108 to 111, wherein the site-specific targeting moiety is a polymer.
(Item 116) The fusion molecule according to item 115, wherein the polymer is or comprises a polyamide.
(Item 117) The fusion molecule described in item 115, wherein the polymer is an oligonucleotide.
(Item 118) The fusion molecule according to item 116, wherein the oligonucleotide has a sequence containing a complement to the target anchor sequence.
(Item 119) The fusion molecule according to item 116, wherein the oligonucleotide includes a chemical modification.
(Item 120) The fusion molecule described in item 115, wherein the polymer is a peptide nucleic acid.
(Item 121) The fusion molecule described in item 108, wherein the site-specific targeting moiety is or contains a peptide nucleic acid mixture.
(Item 122) The fusion molecule according to item 115, wherein the site-specific targeting moiety is or comprises a peptide or polypeptide.
(Item 123) The fusion molecule described in item 122, wherein the polypeptide is a zinc finger polypeptide.
(Item 124) The fusion molecule described in item 122, wherein the polypeptide is or comprises a transcription activator-like effector nuclease (TALEN) polypeptide.
(Item 125) The fusion molecule according to any one of items 108 to 110, wherein the site-specific targeting moiety is or comprises a small molecule.
(Item 126) A site-specific guide RNA comprising a targeting domain complementary to a target nucleic acid containing an anchor sequence.
(Item 127) The site-specific guide RNA according to item 126, wherein the targeting domain is not complementary to at least one non-target nucleic acid containing the anchor sequence.
(Item 128) The site-specific guide RNA according to item 126 or 127, wherein the anchor sequence contains a CTCF binding motif.
(Item 129) A composition comprising: (i) a fusion polypeptide comprising an enzymatically inactive Cas polypeptide and an epigenetic modifier, or a nucleic acid encoding the fusion polypeptide; and (ii) a guide RNA that targets the fusion polypeptide to a target anchor sequence but not to at least one non-target anchor sequence.
(Item 130) A method for modulating the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, the method comprising contacting the first and / or second anchor sequence with a fusion molecule according to any one of items 108 to 125, a site-specific guide RNA according to any one of items 126 to 128, or a composition according to item 129.
(Item 131) The method according to item 130, wherein the anchor sequence-mediated connection contains at least one internal transcriptional regulatory sequence.
(Item 132) The method according to item 131, wherein the internal transcriptional regulatory sequence is an enhancing sequence.
(Item 133) The method according to item 131, wherein the internal transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 134) The method according to any one of items 130 to 133, wherein the gene is separated from the internal transcriptional regulatory sequence by at least 300 base pairs.
(Item 135) The method according to any one of items 130 to 134, wherein the first and / or second anchor sequence is located within 500 kb of an external transcriptional regulatory sequence.
(Item 136) The method according to item 135, wherein the external transcriptional regulatory sequence is an enhancing sequence.
(Item 137) The method according to item 135, wherein the external transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 138) A method for modulating the expression of a gene within 10 kb of a first anchor sequence in an anchor sequence-mediated connection including the first anchor sequence and a second anchor sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the fusion molecule described in any one of items 108 to 125, the site-specific guide RNA described in any one of items 126 to 128, or the composition described in item 129.
(Item 139) The method according to item 138, wherein the anchor sequence-mediated connection includes at least one internal transcriptional regulatory sequence.
(Item 140) The method according to item 139, wherein the internal transcriptional regulatory sequence is an enhancing sequence.
(Item 141) The method according to item 139, wherein the internal transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 142) The method according to any one of items 139 to 141, wherein the gene is separated from the internal transcriptional regulatory sequence by at least 300 base pairs.
(Item 143) The method according to any one of items 138 to 142, wherein the first and / or second anchor sequence is located within 500 kb of an external transcriptional regulatory sequence.
(Item 144) The method according to item 143, wherein the external transcriptional regulatory sequence is an enhancing sequence.
(Item 145) The method according to item 143, wherein the external transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 146) A method for reducing the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence, a second anchor sequence, and an internal enhancing sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the fusion molecule described in any one of items 108 to 125, the site-specific guide RNA described in any one of items 126 to 128, or the composition described in item 129.
(Item 147) The method according to item 146, wherein the first and / or second anchor sequence is located within 500 kb of an external silencing or repressive sequence.
(Item 148) The method according to item 146 or 147, wherein the gene is separated from the internal enhancing sequence by at least 300 base pairs.
(Item 149) A method for increasing the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, wherein the first and / or second anchor sequence is located within 10 kb of an external enhancing sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the fusion molecule described in any one of items 108 to 125, the site-specific guide RNA described in any one of items 126 to 128, or the composition described in item 129.
(Item 150) The method according to item 149, wherein the anchor sequence-mediated connection further comprises an internal enhancing sequence.
(Item 151) The method according to item 150, wherein the gene is separated from the internal enhancing sequence by at least 300 base pairs.
(Item 152) A method comprising (a) a step of delivering a fusion molecule described in any one of items 108 to 125, a site-specific guide RNA described in any one of items 126 to 128, or a composition described in item 129 to mammalian cells.
(Item 153) The method according to item 152, wherein the mammalian cells are somatic cells.
(Item 154) The method according to item 152 or 153, wherein the mammalian cells are primary cells.
(Item 155) The method according to any one of items 152 to 154, wherein the delivery step is performed ex vivo.
(Item 156) The method according to item 155, further comprising the step of removing the mammalian cells from a subject before the delivery step.
(Item 157) The method according to item 155 or 156, further comprising the step of administering the mammalian cells after the delivery step.
(Item 158) The method according to any one of items 152 to 155, wherein the delivery step comprises administering to a subject a composition comprising a fusion molecule according to any one of items 108 to 125, a site-specific guide RNA according to any one of items 126 to 128, or a composition according to item 129.
(Item 159) The method described in any one of items 156, 157, or 158, wherein the subject has a disease or condition.
(Item 160) The method according to any one of items 152 to 159, wherein the delivery step includes delivery across the cell membrane.
(Item 161) An engineered site-specific nucleating agent comprising an engineered DNA-binding moiety that specifically binds to one or more target sequences within a cell with sufficient affinity to compete with the binding of endogenous nucleating polypeptides within the cell, and that does not bind to non-target sequences within the cell, wherein the engineered DNA-binding moiety includes a nucleating polypeptide dimerization domain such that, when the engineered DNA-binding moiety binds to the one or more target sequences, the nucleating polypeptide dimerization domain is localized there; wherein each of the one or more target sequences is a target anchor sequence; and wherein the one or more target anchor sequences are positioned relative to an anchor sequence to which a nucleating polypeptide binds such that, when the nucleating polypeptide dimerization domain is localized to the target anchor sequence, the interaction between the nucleating polypeptide dimerization domain and the nucleating polypeptide generates an anchor sequence-mediated connection.
(Item 162) The engineered site-specific nucleating agent according to item 161, wherein the target anchor sequence does not contain a CTCF-binding motif.
(Item 163) A method for modulating the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the engineered site-specific nucleating agent described in item 161 or 162.
(Item 164) The method according to item 163, wherein the anchor sequence-mediated connection includes at least one internal transcriptional regulatory sequence.
(Item 165) The method according to item 163, wherein the internal transcriptional regulatory sequence is an enhancing sequence.
(Item 166) The method according to item 163, wherein the internal transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 167) The method according to any one of items 163 to 166, wherein the gene is separated from the internal transcriptional regulatory sequence by at least 300 base pairs.
(Item 168) The method according to any one of items 163 to 167, wherein the first and / or second anchor sequence is located within 500 kb of an external transcriptional regulatory sequence.
(Item 169) The method according to item 168, wherein the external transcriptional regulatory sequence is an enhancing sequence.
(Item 170) The method according to item 168, wherein the external transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 171) A method for modulating the expression of a gene within 10 kb of a first anchor sequence in an anchor sequence-mediated connection including the first anchor sequence and a second anchor sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the engineered site-specific nucleating agent described in item 161 or 162.
(Item 172) The method according to item 171, wherein the anchor sequence-mediated connection comprises at least one internal transcriptional regulatory sequence.
(Item 173) The method according to item 172, wherein the internal transcriptional regulatory sequence is an enhancing sequence.
(Item 174) The method according to item 172, wherein the internal transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 175) The method according to any one of items 171 to 174, wherein the gene is separated from the internal transcriptional regulatory sequence by at least 300 base pairs.
(Item 176) The method according to any one of items 171 to 175, wherein the first and / or second anchor sequence is located within 500 kb of an external transcriptional regulatory sequence.
(Item 177) The method according to item 176, wherein the external transcriptional regulatory sequence is an enhancing sequence.
(Item 178) The method according to item 176, wherein the external transcriptional regulatory sequence is a silencing or repressive sequence.
(Item 179) A method for reducing the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence, a second anchor sequence, and an internal enhancing sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the engineered site-specific nucleating agent described in item 161 or 162.
(Item 180) The method according to item 179, wherein the first and / or second anchor sequence is located within 500 kb of an external silencing or repressive sequence.
(Item 181) The method according to item 179 or 180, wherein the gene is separated from the internal enhancing sequence by at least 300 base pairs.
(Item 182) A method for increasing the expression of a gene within an anchor sequence-mediated connection comprising a first anchor sequence and a second anchor sequence, wherein the first and / or second anchor sequence is located within 10 kb of an external enhancing sequence, the method comprising a step of bringing the first and / or second anchor sequence into contact with the engineered site-specific nucleating agent described in item 161 or 162.
(Item 183) The method according to item 182, wherein the anchor sequence-mediated connection further comprises an internal enhancing sequence.
(Item 184) The method according to item 183, wherein the gene is separated from the internal enhancing sequence by at least 300 base pairs.
(Item 185) A method comprising (a) a step of delivering the engineered site-specific nucleating agent described in item 161 or 162 to mammalian cells.
(Item 186) The method according to item 185, wherein the mammalian cells are somatic cells.
(Item 187) The method according to item 186, wherein the mammalian cells are primary cells.
(Item 188) The method according to any one of items 185 to 187, wherein the delivery step is performed ex vivo.
(Item 189) The method according to item 188, further comprising the step of removing the mammalian cells from a subject before the delivery step.
(Item 190) The method according to item 188 or 189, further comprising the step of administering the mammalian cells after the delivery step.
(Item 191) The method according to any one of items 185 to 187, wherein the delivery step comprises administering a composition comprising an engineered site-specific nucleating agent as described in item 161 or 162 to mammalian cells.
(Item 192) The method according to any one of items 189, 190, or 191, wherein the subject has a disease or condition.
(Item 193) The method according to any one of items 185 to 192, wherein the delivery step includes delivery across the cell membrane.
(Item 194) A method for modulating the expression of a target gene in an expression unit, the method comprising modulating the expression of the target gene by targeting the target gene or an associated transcriptional regulatory sequence that affects the transcription of the gene, for example, a sequence outside the anchor sequence, or by altering the formation of an anchor sequence-mediated connection using a targeting moiety that is not part of the target gene.
(Item 195) A method for modulating the transcription of a target gene in a nucleic acid sequence, for example, an expression unit, the method comprising altering the formation of an anchor sequence-mediated connection using a targeting moiety that targets the target gene or an associated transcriptional regulatory sequence that affects the transcription of the target gene.
(Item 196) A pharmaceutical preparation comprising a composition that includes a targeting moiety that binds to an anchor sequence of an anchor sequence-mediated connection and alters the formation of the anchor sequence-mediated connection, for example, a composition that modulates the transcription of a target gene associated with the anchor sequence-mediated connection in human cells.
(Item 197) A composition comprising a targeting moiety that binds to an anchor sequence of an anchor sequence-mediated connection and alters the formation of the anchor sequence-mediated connection (for example, by changing the affinity of the anchor sequence for a connection nucleating molecule by, for example, at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more).
(Item 198) The method or composition according to any one of items 194 to 197, wherein the targeting moiety comprises an effector moiety that (i) is a chemical substance, for example, a chemical substance that modulates cytosine (C) or adenine (A) (e.g., sodium bisulfite, ammonium bisulfite), (ii) has enzymatic activity (e.g., a methyltransferase, demethylase, nuclease (e.g., Cas9), or deaminase), or (iii) is an antisense oligonucleotide conjugate comprising, for example, an ssDNA oligo, a locked nucleic acid (LNA), a peptide oligonucleotide conjugate (e.g., a membrane permeable polypeptide having nucleic acid side chains), a bridged nucleic acid (BNA), a polyamide, or a DNA-binding molecule.
(Item 199) The method or composition according to any one of items 194 to 198, wherein one or more transcriptional regulatory sequences are located within the anchor sequence-mediated connection, for example, a type 1 anchor sequence-mediated connection.
(Item 200) The method or composition according to any one of items 194 to 199, wherein one or more transcriptional regulatory sequences are located outside of the anchor sequence-mediated connection, for example, a type 2 anchor sequence-mediated connection.
(Item 201) The method or composition according to any one of items 194 to 200, wherein one or more transcriptional regulatory sequences are located inside (for example, an enhancing sequence) and at least partially outside (for example, a silencing sequence) the anchor sequence-mediated connection, for example, a type 3 anchor sequence-mediated connection.
(Item 202) The method or composition according to any one of items 194 to 201, wherein one or more transcriptional regulatory sequences are located inside (for example, an enhancing sequence) and at least partially outside (for example, an enhancing sequence) the anchor sequence-mediated connection, for example, a type 4 anchor sequence-mediated connection.
(Item 203) A pharmaceutical composition comprising (a) a targeting moiety and (b) a DNA sequence including, for example, an anchor sequence.
(Item 204) A composition comprising a protein containing a domain that acts on DNA, for example, an enzyme domain (e.g., a nuclease domain, for example, a Cas9 domain, for example, a dCas9 domain; DNA methyltransferase, demethylase, deaminase), in combination with at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets the protein to an anchor sequence of a target anchor sequence-mediated connection, the composition being effective in altering the target anchor sequence-mediated connection in human cells.
(Item 205) A composition comprising a targeting moiety that binds to an anchor sequence of an anchor sequence-mediated connection and alters the topology of the anchor sequence-mediated connection.
(Item 206) A composition comprising a protein comprising a first polypeptide containing a Cas or modified Cas protein domain and a second polypeptide containing a polypeptide having DNA methyltransferase activity [or demethylating or deaminase activity], in combination with at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets the protein to an anchor sequence of a target anchor sequence-mediated connection, wherein the composition is effective in altering the target anchor sequence-mediated connection in human cells.
(Item 207) A pharmaceutical composition comprising a Cas protein and at least one guide RNA (gRNA) that targets the Cas protein to an anchor sequence of a target anchor sequence-mediated connection, wherein the Cas protein is effective in inducing mutations in the target anchor sequence that reduce the formation of an anchor sequence-mediated connection associated with the target anchor sequence.
(Item 208) A kit comprising: (a) a nucleic acid encoding a protein comprising a first polypeptide domain containing a Cas or modified Cas protein and a second polypeptide domain containing a polypeptide having DNA methyltransferase activity [or demethylating or deaminase activity]; and (b) at least one guide RNA (gRNA) or antisense DNA oligonucleotide for targeting the protein to an anchor sequence of a target anchor sequence-mediated connection in a target cell.
(Item 209) A method for [altering gene expression / altering anchor sequence-mediated connections] in a mammalian subject, the method comprising administering to the subject (separately or in the same pharmaceutical composition): a) (i) a protein comprising a first polypeptide domain containing a Cas or modified Cas protein and a second polypeptide domain containing a polypeptide having DNA methyltransferase activity [or demethylating or deaminase activity], or (ii) a nucleic acid encoding a protein comprising a first polypeptide domain containing a Cas protein and a second polypeptide domain containing a polypeptide having DNA methyltransferase activity [or demethylating or deaminase activity]; and b) at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets an anchor sequence of an anchor sequence-mediated connection.
(Item 210) A method for modifying a chromatin structure, such as a two-dimensional structure, the method comprising altering the topology of an anchor sequence-mediated connection, for example, a loop, to modulate the transcription of a nucleic acid sequence, wherein the change in the topology of the anchor sequence-mediated connection modulates the transcription of the nucleic acid sequence.
(Item 211) A method for modifying a chromatin structure, such as a two-dimensional structure, the method comprising altering the topology of multiple anchor sequence-mediated connections, for example, multiple loops, to modulate the transcription of a nucleic acid sequence, wherein the change in topology modulates the transcription of the nucleic acid sequence.
(Item 212) A method for modulating the transcription of a nucleic acid sequence, the method comprising altering an anchor sequence-mediated connection, for example, a loop, that affects the transcription of the nucleic acid sequence, wherein the change in the anchor sequence-mediated connection modulates the transcription of the nucleic acid sequence.
(Item 213) An engineered cell comprising a targeted alteration in an anchor sequence-mediated connection.
(Item 214) An engineered nucleic acid sequence comprising an anchor sequence-mediated connection having a targeted alteration.
(Item 215) A pharmaceutical composition comprising the engineered cell described in any one of the preceding items or the engineered nucleic acid sequence described in any one of the preceding items.
(Item 216) A plurality of cells comprising the engineered cell described in any one of the above items.
(Item 217) A vector comprising the engineered nucleic acid sequence described in any one of the above items.
(Item 218) A composition for modulating the transcription of a nucleic acid sequence by introducing a targeted alteration into an anchor sequence-mediated connection, the composition comprising a targeting moiety that binds to the anchor sequence.
(Item 219) A composition comprising a synthetic connection nucleating molecule having a selected binding affinity for an anchor sequence within an anchor sequence-mediated connection.
(Item 220) A synthetic nucleic acid comprising multiple anchor sequences, a gene sequence, and a transcriptional regulatory sequence.
(Item 221) A vector containing the nucleic acid described in any one of the above items.
(Item 222) A cell containing the nucleic acid described in any one of the above items.
(Item 223) A pharmaceutical composition comprising a nucleic acid as described in any one of the above items.
(Item 224) A method for modulating gene expression by administering a composition containing the nucleic acid described in any one of the above items.
(Item 225) A method for preparing a connective nucleation molecule having a selected binding affinity.
(Item 226) A method for treating a disease or condition, comprising administering a targeting portion selected from an exogenous connective nucleation molecule, a nucleic acid encoding the connective nucleation molecule, or a fusion of a sequence-targeting polypeptide with the connective nucleation molecule, wherein the targeting portion alters the anchor sequence-mediated connection.
(Item 227) The composition or method according to any one of the above items, further comprising at least one polypeptide comprising at least one sequence of ABXnC (wherein A is selected from a hydrophobic amino acid or an amide-containing backbone having a nucleic acid side chain, e.g., aminoethylglycine; B and C may be the same or different and are each independently selected from arginine, asparagine, glutamine, lysine, and their analogues; each X is independently a hydrophobic amino acid or an amide-containing backbone having a nucleic acid side chain, e.g., aminoethylglycine; and n is an integer from 1 to 4), wherein the polypeptide hybridizes to a nucleic acid sequence within an anchor sequence-mediated linkage (e.g., the anchor sequence of the anchor sequence-mediated linkage, e.g., a CTCF binding motif, a BORIS binding motif, a cohesin binding motif, a USF1 binding motif, a YY1 binding motif, a TATA box, a ZNF143 binding motif, etc.).
(Item 228) A method for modifying the expression of a target gene, comprising altering an anchor sequence-mediated connection associated with the target gene, wherein the alteration modulates the transcription of the target gene.
(Item 229) A method for modifying the expression of a target gene, comprising administering the composition described in any one of the above items to a cell, tissue, or subject.
(Item 230) A method for modulating the transcription of a nucleic acid sequence, comprising administering a composition according to any one of the preceding items to alter an anchor sequence-mediated linkage, such as a loop, that modulates the transcription of the nucleic acid sequence, wherein the alteration of the anchor sequence-mediated linkage modulates the transcription of the nucleic acid sequence.
(Item 231) A method for altering the expression of a target gene, comprising administering to a genome a pharmaceutical composition comprising (a) a targeting portion and (b) a DNA sequence including an anchor sequence, wherein the anchor sequence promotes the formation of a linkage that operably links a gene expression factor (an enhancing sequence or a silencing / repressing sequence) to the target gene.
(Item 232) A method for modulating gene expression, comprising providing a composition according to any one of the above items, wherein the targeting portion is, for example, an endogenous effector, an exogenous effector, or an agonist or antagonist thereof, comprising an effector portion that inhibits CpG binding.
(Item 233) A method for delivering a therapeutic agent, comprising administering a composition described in any one of the above items to a subject, wherein the targeting portion comprises an effector portion which is the therapeutic agent, the composition increases the intracellular delivery of the therapeutic agent compared to the therapeutic agent alone, and the composition modulates gene transcription.
(Item 234) A method for modulating a membrane protein on a cell, comprising contacting the cell with a composition described in any one of the above items, wherein the composition targets the cell and modulates the membrane protein.
(Item 235) A method for inducing cell death, comprising contacting a cell with a composition described in any one of the preceding items, wherein the composition targets the cell and induces apoptosis.
(Item 236) A method for increasing the bioavailability of a therapeutic agent, comprising administering the composition described in any one of the preceding items, wherein the therapeutic agent is a heterologous portion.
(Item 237) A method for treating a disease / disorder / condition in a subject, comprising administering a composition described in any one of the preceding items, wherein the composition modulates transcription to treat the disease / disorder / condition.
(Item 238) A method for treating an acute or chronic infection, comprising administering the composition described in any one of the preceding items.
(Item 239) A method for treating cancer, comprising administering a composition described in any one of the preceding items.
(Item 240) A method for treating a neurological disease or disorder, comprising administering the composition described in any one of the preceding items.
(Item 241) A method for inducing immune tolerance, comprising providing a composition according to any one of the preceding items, wherein, for example, the heterologous portion is an antigen.
(Item 242) A system for pharmaceutical use comprising a protein comprising a first polypeptide domain containing Cas or a modified Cas protein and a second polypeptide domain containing a polypeptide having DNA methyltransferase activity [or demethylating or deaminase activity], in combination with at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets the protein to an anchor sequence of a target anchor sequence-mediated linkage, the system being effective in altering the target anchor sequence-mediated linkage in human cells.
(Item 243) A system for altering the expression of a target gene in human cells, comprising a targeting portion (e.g., gRNA, LDB) that associates with an anchor sequence associated with the target gene and, optionally, a heterologous portion (e.g., an enzyme, e.g., a nuclease or a deactivated nuclease (e.g., Cas9, dCas9), methylase, demethylase, deaminase) operably linked to the targeting portion, wherein the system modulates the linkage mediated by the anchor sequence and is effective in altering the expression of the target gene.

[0205] The following detailed description of embodiments of this disclosure will be better understood when read in conjunction with the accompanying drawings, which are provided for illustrative purposes. However, it should be understood that this disclosure is not limited to the exact configurations and means of the embodiments shown in the drawings.
[Brief explanation of the drawings]

[0206] [Figure 1] Figure 1 illustrates the physical interaction or binding between one connection nucleation molecule-anchor sequence and another connection nucleation molecule-anchor sequence to generate an anchor sequence-mediated connection.
[Figure 2] Figure 2 illustrates a method for targeting and generating anchor sequence-mediated connections, such as loops.
[Figure 3] Figure 3 illustrates one embodiment of modulating gene expression through the generation of anchor sequence-mediated connections that do not exist in nature (loop integration).
[Figure 4] Figure 4 illustrates a method for modulating gene expression. The left side of the figure is the same illustration as shown in Figure 1. The right side of the figure shows the disruption of anchor sequence-mediated connections (loop exclusion).
[Figure 5] Figure 5 illustrates another embodiment in which the incorporation of a novel anchor sequence modulates gene expression through the generation of anchor sequence-mediated connections that do not exist in nature.
[Figure 6] Figure 6 illustrates several types of anchor sequence-mediated connections.
[Figures 7-1 and 7-2] Figure 7 illustrates the disruption of the upstream anchor sequence-mediated connection to the MYC gene, which leads to the downregulation of MYC expression levels. Panels A, B, C, and D illustrate the reduction in MYC expression, and panel E represents a map of the gRNA sequence, as further described in Examples 1 and 2.
[Figures 8-1, 8-2, and 8-3] Figure 8 illustrates the disruption of anchor sequence-mediated connections associated with the FOXJ3 gene, which leads to downregulation of FOXJ3 expression levels. As further described in Example 3, panel A represents a map of gRNA and SNA sequences, and panels B, C, D, and E illustrate the reduction in FOXJ3 levels.
[Figure 9] Figure 9 illustrates the disruption of anchor sequence-mediated connections associated with the TUSC5 gene, which leads to upregulation of TUSC5 expression levels. As further described in Example 4, panel A represents the upregulation of TUSC5 expression levels, and panel B represents a map of gRNA sequences.
[Figure 10] Figure 10 illustrates the disruption of the upstream anchor sequence-mediated connection to the DAND5 gene, which leads to upregulation of DAND5 expression levels. As further described in Example 5, panel A represents the upregulation of DAND5 expression levels, and panel B represents a map of the gRNA sequence.
[Figure 11] Figure 11 illustrates the disruption of upstream or downstream anchor sequence-mediated connections to the SHMT2 gene, which leads to downregulation of SHMT2 expression levels. Panels B and C represent gRNA sequence maps, and panels A and D represent downregulation of SHMT2 expression levels, as further described in Example 6.
[Figure 12] Figure 12 illustrates the disruption of the upstream anchor sequence-mediated connection to the TTC21B gene, which leads to the upregulation of TTC21B expression levels. As further described in Example 7, panels A and B represent the upregulation of TTC21B expression levels, and panel C represents a map of the gRNA sequence.
[Figure 13] Figure 13 illustrates the disruption of downstream anchor sequence-mediated connections to the CDK6 gene, which leads to downregulation of CDK6 expression levels. As further described in Example 13, panel A represents the downregulation of CDK6 expression levels, and panel B represents a map of gRNA sequences.
[Figure 14] Figure 14 illustrates polypeptide beta hybridizing to the CTCF region of the miR290 loop and physically interfering with the CTCF looping function (mediated by the polypeptide backbone and polynucleotide sequence).
[Figure 15] Figure 15 illustrates the multimerized polypeptide beta hybridizing to the promoter of the ELANE gene.
[Figure 16] Figure 16 illustrates polypeptide beta ligated to a double-stranded unmethylated CTCF anchor sequence with specificity to the H19-IGF2 locus, so as to mimic a nonmethylated CTCF binding motif on one of the paternal alleles and form a maternal-type loop.
[Figure 17] Figure 17 provides a summary of specific experimental data for targeted disruption of anchor sequence-mediated connections.
[Modes for carrying out the invention]

[0207] Detailed explanation
The compositions described herein modulate gene expression in a subject by altering the two-dimensional chromatin structure (for example, anchor sequence-mediated connections, which can be illustrated in two dimensions as having a higher-order structure than a linear chain, as will be understood by those skilled in the art), for example, by modifying anchor sequence-mediated connections in DNA, for example, genomic DNA.

[0208] In one embodiment, the disclosure includes a composition comprising a targeting moiety that binds to a specific anchor sequence-mediated connection and alters the topology of an anchor sequence-mediated connection, such as an anchor sequence-mediated connection having a physical interaction of two or more DNA loci linked by a nucleation molecule.

[0209] The formation of anchor-mediated connections forces gene expression regulators to interact with the target gene or spatially constrains the activity of the regulators. Alterations in anchor-mediated connections enable gene therapy, such as gene expression modulation, without altering the coding sequence of the gene being modulated.

[0210] In some embodiments, the composition modulates the transcription of genes associated with an anchor sequence-mediated linkage by physically interfering with the interaction between one or more anchor sequences and a linkage nucleating molecule. For example, DNA-binding small molecules (e.g., minor or major groove binders), peptides (e.g., zinc fingers, TALENs, novel or modified peptides), proteins (e.g., CTCF, modified CTCF with reduced CTCF-binding and / or aggregation-binding affinity), or nucleic acids (e.g., ssDNA, modified DNA or RNA, peptide oligonucleotide conjugates, locked nucleic acids, cross-linked nucleic acids, polyamides, and / or triple-helical oligonucleotides) can physically prevent linkage nucleating molecules from interacting with one or more anchor sequences, thereby modulating gene expression.

[0211] In some embodiments, the composition modulates the transcription of genes associated with an anchor sequence-mediated linkage by modifying the anchor sequence, for example, by epigenetic modification. For example, gene expression can be modulated by targeting one or more anchor sequences associated with an anchor sequence-mediated linkage containing a target gene, such as by methylation modification by a DNA methyltransferase, for example, a dCas9-methyltransferase fusion, or for example, an antisense oligonucleotide-enzyme fusion.

[0212] In some embodiments, the composition modulates the transcription of genes associated with anchor sequence-mediated linkages by modifying the anchor sequence, for example, by genome modification. For example, one or more anchor sequences associated with an anchor sequence-mediated linkage containing a target gene can be targeted with a deamination enzyme (e.g., deamination oligonucleotide (e.g., oligo-sodium bisulfite conjugate), dCas-enzyme fusion, antisense oligonucleotide-enzyme fusion, deamination antisense oligonucleotide-enzyme fusion) to modulate gene expression.

[0213] In some embodiments, the composition modulates the transcription of genes associated with anchor sequence-mediated linkages, for example, by activating or repressing transcription, or by inducing epigenetic changes in chromatin.

[0214] Anchor sequence-mediated connection
In some embodiments, the anchor sequence-mediated connection includes one or more anchor sequences, one or more genes, and one or more transcriptional regulatory sequences, such as enhancing or silencing sequences. In some embodiments, the transcriptional regulatory sequences are located inside, partially inside, or outside the anchor sequence-mediated connection.

[0215] In one embodiment, the anchor sequence-mediated connection includes loops such as intrachromosomal loops. In a particular embodiment, the anchor sequence-mediated connection has multiple loops. One or more loops may include a first anchor sequence, a nucleic acid sequence, a transcriptional regulatory sequence, and a second anchor sequence. In another embodiment, at least one loop includes, in order, a first anchor sequence, a transcriptional regulatory sequence, and a second anchor sequence; or a first anchor sequence, a nucleic acid sequence, and a second anchor sequence. In yet another embodiment, either or both of the nucleic acid sequence and the transcriptional regulatory sequence are located inside or outside the loop. In yet another embodiment, one or more of the loops include a transcriptional regulatory sequence.

[0216] In some embodiments, the anchor sequence-mediated connection includes a TATA box, a CAAT box, a GC box, or a CAP site.

[0217] In some embodiments, the anchor sequence-mediated connection comprises multiple loops, in which case the anchor sequence-mediated connection includes at least one of the following in one or more of the loops: an anchor sequence, a nucleic acid sequence, and a transcriptional regulatory sequence.

[0218] In one embodiment, the compositions described herein may include a composition for modulating the transcription of a nucleic acid sequence by introducing a targeted modification to an anchor sequence-mediated linkage, the composition having a targeting moiety that binds to an anchor sequence. In some embodiments, the anchor sequence-mediated linkage is modified by targeting one or more nucleotides within the anchor sequence-mediated linkage for substitution, addition, or deletion.

[0219] In some embodiments, transcription is activated by the inclusion of an activation loop or the exclusion of an inhibitory loop. In one such embodiment, the anchor sequence-mediated linkage includes a transcriptional regulatory sequence that increases the transcription of the nucleic acid sequence. In another such embodiment, the anchor sequence-mediated linkage excludes a transcriptional regulatory sequence that decreases the transcription of the nucleic acid sequence.

[0220] In some embodiments, transcription is suppressed by the inclusion of a repression loop or the exclusion of an activation loop. In one such embodiment, the anchor sequence-mediated linkage includes a transcriptional regulatory sequence that reduces the transcription of the nucleic acid sequence. In another such embodiment, the anchor sequence-mediated linkage excludes a transcriptional regulatory sequence that increases the transcription of the nucleic acid sequence.

[0221] Anchor sequence
Each anchor sequence-mediated linkage includes one or more anchor sequences, e.g., a plurality of anchor sequences. Anchor sequences can be manipulated or altered to disrupt naturally occurring loops or to form new loops (e.g., to form exogenous loops or non-naturally occurring loops with exogenous or altered anchor sequences; see Figures 3, 4, and 5). Such alterations modulate gene expression by altering the two-dimensional structure of DNA, for example, thereby modulating the ability of a target gene to interact with gene regulators and control factors (e.g., enhancing and silencing / repressing sequences). In some embodiments, chromatin structure is modified by substituting, adding, or deleting one or more nucleotides within the anchor sequence of an anchor sequence-mediated linkage.

[0222] The anchor sequences may be discontinuous with respect to each other. In embodiments using discontinuous anchor sequences, the first anchor sequence may be separated from the second anchor sequence by approximately 500 bp to 500 Mb, approximately 750 bp to 200 Mb, approximately 1 kb to 100 Mb, approximately 25 kb to 50 Mb, approximately 50 kb to 1 Mb, approximately 100 kb to 750 kb, approximately 150 kb to 500 kb, or approximately 175 kb to 500 kb. In some embodiments, the first anchor sequence is separated from the second anchor sequence by approximately 500 bp, 600 bp, 700 bp, 800 bp, 900 bp, 1 kb, 5 kb, 10 kb, 15 kb, 20 kb, 25 kb, 30 kb, 35 kb, 40 kb, 45 kb, 50 kb, 55 kb, 60 kb, 65 kb, 70 kb, 75 kb, 80 kb, 85 kb, 90 kb, 95 kb, 100 kb, 125 kb, 150 kb, 175 kb, 200 kb, 225 kb, 250 kb, 275 kb, 300 kb, 350 kb, 400 kb, 500 kb, 600 kb, 700 kb, 800 kb, 900 kb, 1 Mb, 2 Mb, 3 Mb, 4 Mb, 5 Mb, 6 Mb, 7 Mb, 8 Mb, 9 Mb, 10 Mb, 15 Mb, 20 Mb, 25 Mb, 50 Mb, 75 Mb, 100 Mb, 200 Mb, 300 Mb, 400 Mb, 500 Mb, or any size in between.

[0223] In one embodiment, the anchor sequence includes a common nucleotide sequence, for example, the CTCF-binding motif: N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C) (SEQ ID NO: 1) (wherein N is any nucleotide). The CTCF-binding motif may also be in the opposite orientation, for example, (G/A/C)(C/T/A)(C/G)(G/A)(C/A/T)GG(G/T)(G/A)GA(C/T/A)(C/G)(A/T/G)CC(G/A/T)N(T/C/G)N (SEQ ID NO: 2). In one embodiment, the anchor sequence includes a sequence that is at least 75%, at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to either SEQ ID NO: 1 or SEQ ID NO: 2.
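As a purely illustrative aid, and not part of the disclosed methods, the degenerate motif notation above can be converted into a regular expression for scanning a DNA string. The following is a minimal sketch in Python; the function names (parse_motif, format_motif, to_regex) are hypothetical and introduced here only for illustration. It also checks that reading the positions of SEQ ID NO: 1 in reverse order reproduces SEQ ID NO: 2 as written in the text.

```python
import re

# SEQ ID NO: 1 and SEQ ID NO: 2 as written in the text, with spaces removed.
SEQ_ID_1 = "N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C)"
SEQ_ID_2 = "(G/A/C)(C/T/A)(C/G)(G/A)(C/A/T)GG(G/T)(G/A)GA(C/T/A)(C/G)(A/T/G)CC(G/A/T)N(T/C/G)N"


def parse_motif(motif):
    """Return one string of allowed bases per motif position (N means any base)."""
    positions = []
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGTN])", motif):
        if group:                      # e.g. "T/C/G" -> "TCG"
            positions.append(group.replace("/", ""))
        elif single == "N":            # any nucleotide
            positions.append("ACGT")
        else:                          # invariant base
            positions.append(single)
    return positions


def format_motif(positions):
    """Write a position list back in the notation used in the text."""
    parts = []
    for p in positions:
        if p == "ACGT":
            parts.append("N")
        elif len(p) == 1:
            parts.append(p)
        else:
            parts.append("(" + "/".join(p) + ")")
    return "".join(parts)


def to_regex(positions):
    """Compile a position list into a regular expression (one character class per position)."""
    return re.compile("".join("[" + p + "]" for p in positions))


positions = parse_motif(SEQ_ID_1)
# Reading SEQ ID NO: 1 position by position in reverse order gives SEQ ID NO: 2 as written.
reverse_matches_seq2 = format_motif(positions[::-1]) == SEQ_ID_2
# Hypothetical example site built from the first allowed base at each position.
example_site = "".join(p[0] for p in positions)
print(len(positions), reverse_matches_seq2, bool(to_regex(positions).fullmatch(example_site)))
```

The sketch tests exact matches to the degenerate motif only; it does not implement the percent-identity thresholds recited above, which would require a separately defined alignment and scoring convention.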

[0224] In some embodiments, the anchor sequence-mediated linkage includes at least a first anchor sequence and a second anchor sequence. The first and second anchor sequences may each include a common nucleotide sequence, for example, each including a CTCF-binding motif. In some embodiments, the first and second anchor sequences include different sequences, for example, the first anchor sequence includes a CTCF-binding motif and the second anchor sequence includes an anchor sequence other than the CTCF-binding motif. In some embodiments, each anchor sequence includes a common nucleotide sequence and one or more adjacent nucleotides on one or both sides of the common nucleotide sequence.

[0225] Two CTCF-binding motifs capable of forming a link (e.g., continuous or discontinuous CTCF-binding motifs) may exist in the genome in any orientation, for example, in the same orientation (tandem), either 5'→3' (left tandem, e.g., two CTCF-binding motifs containing SEQ ID NO: 1) or 3'→5' (right tandem, e.g., two CTCF-binding motifs containing SEQ ID NO: 2), or in a convergent orientation, where one CTCF-binding motif contains SEQ ID NO: 1 and the other contains SEQ ID NO: 2. CTCFBSDB 2.0: Database For CTCF Binding Motifs And Genome Organization (http://insulatordb.uthsc.edu/) can be used to identify CTCF-binding motifs associated with a target gene.

[0226] In some embodiments, the anchor sequence includes a CTCF-binding motif associated with the target disease gene.

[0227] In some embodiments, the chromatin structure is modified by substituting, adding, or deleting one or more nucleotides within at least one anchor sequence, e.g., a connective nucleation molecule binding site. One or more nucleotides can be specifically targeted for targeted modifications, e.g., substitution, addition, or deletion within an anchor sequence, e.g., a connective nucleation molecule binding site.

[0228] In some embodiments, anchor sequence-mediated linkage is altered by changing the orientation of at least one common nucleotide sequence, for example, a linkage nucleation molecule binding site.

[0229] In some embodiments, the anchor sequence includes a connective nucleation molecule binding site, e.g., a CTCF binding motif, and the targeting portion introduces a modification of at least one connective nucleation molecule binding site, e.g., a change in binding affinity to the connective nucleation molecule.

[0230] In some embodiments, anchor sequence-mediated connections are altered by introducing exogenous anchor sequences. The addition of non-naturally occurring or exogenous anchor sequences forms or disrupts naturally occurring anchor sequence-mediated connections, for example, by inducing the formation of non-naturally occurring loops that alter the transcription of nucleic acid sequences.

[0231] Types of anchor sequence-mediated connections
In some embodiments, the anchor sequence-mediated linkage includes one or more genes, for example, two, three, four, five, or more.

[0232] In some embodiments, the disclosure includes methods for modulating the expression of a target gene in an anchor sequence-mediated linkage, which include targeting a sequence that is outside of, is not part of, or is not contained within the target gene or an associated transcriptional regulatory sequence that affects the transcription of the target gene, for example, targeting an anchor sequence.

[0233] In some embodiments, the disclosure includes methods for modulating the transcription of a target gene, which include targeting a sequence that is discontinuous with the target gene or with an associated transcriptional regulatory sequence that affects the transcription of the target gene, for example, targeting an anchor sequence.

[0234] In some embodiments, the anchor sequence-mediated linkage is associated with one or more transcriptional regulatory sequences, e.g., two, three, four, five, or more. In some embodiments, the target gene is discontinuous with one or more transcriptional regulatory sequences. In some embodiments where the gene is discontinuous with the transcriptional regulatory sequences, the gene may be about 100 bp to about 500 Mb, about 500 bp to about 200 Mb, about 1 kb to about 100 Mb, about 25 kb to about 50 Mb, about 50 kb to about 1 Mb, about 100 kb to about 750 kb, about 150 kb to about 500 kb, or about 175 kb to about 500 kb away from one or more transcriptional regulatory sequences. In some embodiments, the gene is separated from the transcriptional regulatory sequence by approximately 100 bp, 300 bp, 500 bp, 600 bp, 700 bp, 800 bp, 900 bp, 1 kb, 5 kb, 10 kb, 15 kb, 20 kb, 25 kb, 30 kb, 35 kb, 40 kb, 45 kb, 50 kb, 55 kb, 60 kb, 65 kb, 70 kb, 75 kb, 80 kb, 85 kb, 90 kb, 95 kb, 100 kb, 125 kb, 150 kb, 175 kb, 200 kb, 225 kb, 250 kb, 275 kb, 300 kb, 350 kb, 400 kb, 500 kb, 600 kb, 700 kb, 800 kb, 900 kb, 1 Mb, 2 Mb, 3 Mb, 4 Mb, 5 Mb, 6 Mb, 7 Mb, 8 Mb, 9 Mb, 10 Mb, 15 Mb, 20 Mb, 25 Mb, 50 Mb, 75 Mb, 100 Mb, 200 Mb, 300 Mb, 400 Mb, 500 Mb, or any size in between.

[0235] In some embodiments, the type of anchor-mediated linkage can help determine how to modulate gene expression by altering the anchor-mediated linkage, for example, by determining the selection of a target region. For example, some types of anchor-mediated linkages contain one or more transcriptional regulatory sequences within the linkage. Disruption of such an anchor-mediated linkage, for example by altering one or more anchor sequences, can reduce the transcription of the target gene within the linkage.

[0236] Type 1
In some embodiments, the expression of a target gene is regulated, modulated, or influenced by an anchor-mediated linkage and one or more associated transcriptional regulatory sequences. In some embodiments, the anchor-mediated linkage includes one or more associated genes and one or more transcriptional regulatory sequences. For example, the target gene and one or more transcriptional regulatory sequences are at least partially located within an anchor-mediated linkage, e.g., a type 1 anchor-mediated linkage. See Figure 6. The anchor-mediated linkage described in Figure 6 may also be referred to as a “type 1, EP subtype”.

[0237] In some embodiments, the target gene has a predetermined level of expression, for example, in its native state, for example, in a disease state. For example, the target gene may have a high level of expression. By disrupting the anchor-mediated linkage, the expression of the target gene can be reduced, for example, by a reduction in transcription due to conformational changes of DNA that were previously open to transcription within the anchor-mediated linkage, for example, by a reduction in transcription due to conformational changes of DNA that create a further distance between the target gene and the enhancing sequence. In one embodiment, both the associated gene and one or more transcriptional regulatory sequences, for example, an enhancing sequence, are located within the anchor-mediated linkage. Disruption of the anchor-mediated linkage reduces gene expression. In one embodiment, the gene associated with the anchor-mediated linkage is at least partially accessible to one or more transcriptional regulatory sequences located within the anchor-mediated linkage. Disruption of the anchor-mediated linkage reduces gene expression.

[0238] For example, a type 1 anchor-mediated linkage contains a gene encoding MYC, and disruption of this linkage reduces gene expression and MYC protein levels. In another example, a type 1 anchor-mediated linkage contains a gene encoding Foxj3, and disruption of this linkage reduces gene expression and Foxj3 protein levels.

[0239] Type 2
In some embodiments, the expression of a target gene is regulated, modulated, or influenced by one or more transcriptional regulatory sequences that are associated with an anchor-mediated linkage but are inaccessible due to the anchor-mediated linkage. For example, an anchor-mediated linkage associated with a gene disrupts the ability of one or more transcriptional regulatory sequences to regulate, modulate, or influence gene expression. The transcriptional regulatory sequences may be distant from the gene, for example, at least partially on the opposite side of the gene from the anchor-mediated linkage, e.g., internally or externally. For example, the gene is inaccessible to the transcriptional regulatory sequences due to the proximity of the anchor-mediated linkage. In some embodiments, one or more enhancing sequences are separated from the gene by an anchor-mediated linkage, e.g., a type 2 anchor-mediated linkage. See Figure 6.

[0240] In some embodiments of type 2, the gene is contained within the anchor sequence-mediated linkage, but the transcriptional regulatory sequence (e.g., an enhancing sequence) is not contained within the anchor sequence-mediated linkage. This subtype of type 2 can be called "type 2, subtype 1".

[0241] In some embodiments of type 2, the transcriptional regulatory sequence (e.g., an enhancing sequence) is contained within the anchor sequence-mediated linkage, but the gene is not contained within the anchor sequence-mediated linkage. This subtype of type 2 can be called "type 2, subtype 2".

[0242] In some embodiments, a gene is inaccessible to one or more transcriptional regulatory sequences due to anchor-mediated linkage, and disruption of the anchor-mediated linkage allows the transcriptional regulatory sequences to regulate, modulate, or influence gene expression. In one embodiment, a gene is located both inside and outside of anchor-mediated linkage and is inaccessible to one or more transcriptional regulatory sequences. Disruption of the anchor-mediated linkage increases access to the transcriptional regulatory sequences, thereby regulating, modulating, or influencing gene expression; for example, the transcriptional regulatory sequences increase gene expression. In one embodiment, a gene is located inside an anchor-mediated linkage and is inaccessible to one or more transcriptional regulatory sequences located outside the anchor-mediated linkage, at least partially. Disruption of the anchor-mediated linkage increases gene expression. In one embodiment, a gene is located outside an anchor-mediated linkage and is inaccessible to one or more transcriptional regulatory sequences located inside the anchor-mediated linkage. Disruption of the anchor-mediated linkage increases gene expression.

[0243] In some embodiments, the target gene has a predetermined level of expression, for example, in its native state, for example, in a disease state. For example, the target gene may have moderate to low levels of expression. The expression of the target gene can be modulated by disrupting the anchor sequence-mediated linkage, for example, by an increase in transcription due to conformational changes of DNA within the anchor sequence-mediated linkage that were previously closed with respect to transcription, for example, by an increase in transcription due to conformational changes of DNA that more closely associate the enhancing sequence with the target gene.

[0244] For example, a type 2 anchor-mediated linkage includes the gene encoding SCN1a, and disruption of this linkage increases gene expression and SCN1a protein levels. In another example, a type 2 anchor-mediated linkage includes the gene encoding Serpin1a, and disruption of this linkage increases gene expression and Serpin1a protein levels. In yet another example, altering the anchor-mediated linkage associated with the IL-10 gene can induce an IL-10-mediated tolerance response, for example, by increasing IL-10 expression to improve an autoimmune state. In yet another example, IL-6 expression can be increased by altering its associated anchor-mediated linkage to bring one or more enhancing sequences very close to the IL-6 gene.

Type 3

[0245] In some embodiments, the expression of a target gene is regulated, modulated, or influenced by one or more transcriptional regulatory sequences that are associated with an anchor-mediated linkage but are not necessarily located on the same side of the anchor-mediated linkage. For example, the anchor-mediated linkage is associated with one or more genes, and one or more transcriptional regulatory sequences are located at least partially inside and outside the anchor-mediated linkage. In some embodiments, one or more enhancing sequences are located inside the anchor-mediated linkage, and one or more repressive signals, such as silencing sequences, are located outside the anchor-mediated linkage, e.g., in a type 3 anchor-mediated linkage. See Figure 6.

[0246] In some embodiments, a gene is unable to access one or more transcriptional regulatory sequences due to the anchor-mediated linkage, and disruption of the anchor-mediated linkage allows the transcriptional regulatory sequences to regulate, modulate, or influence gene expression. In one embodiment, a gene is located inside an anchor-mediated linkage and cannot access one or more transcriptional regulatory sequences, such as silencing / repressor sequences, located outside the anchor-mediated linkage. Disruption of the anchor-mediated linkage reduces gene expression. In one embodiment, a gene is located both inside and outside an anchor-mediated linkage and cannot access one or more transcriptional regulatory sequences, such as silencing / repressor sequences, located outside the anchor-mediated linkage. Disruption of the anchor-mediated linkage reduces gene expression. In one embodiment, a gene is located outside an anchor-mediated linkage and cannot access one or more transcriptional regulatory sequences, such as silencing / repressor sequences, located inside the anchor-mediated linkage. Disruption of the anchor-mediated linkage reduces gene expression.

[0247] In some embodiments, the target gene has a specified level of expression, for example, in its native state or in a disease state. For example, the target gene may have a high level of expression in its native state. Disrupting the anchor sequence-mediated linkage can modulate expression of the target gene, for example, by reducing transcription through conformational changes of DNA that (i) create a greater distance between the target gene and the enhancing sequence, (ii) close DNA within the anchor sequence-mediated linkage that was previously open to transcription, or (iii) more closely associate the silencing sequence with the target gene.

Type 4

[0248] In some embodiments, the expression of a target gene is regulated, modulated, or influenced by one or more transcriptional regulatory sequences that are associated with, but not necessarily located within, an anchor-mediated linkage. For example, the anchor-mediated linkage is associated with one or more genes, and one or more transcriptional regulatory sequences are located at least partially within and outside the anchor-mediated linkage, e.g., a type 4 anchor-mediated linkage. See Figure 6.

[0249] In some embodiments, a gene is inaccessible to one or more transcriptional regulatory sequences due to anchor-mediated linkage, and disruption of the anchor-mediated linkage allows the transcriptional regulatory sequences to regulate, modulate, or influence gene expression. In one embodiment, a gene is located inside an anchor-mediated linkage and is inaccessible to one or more transcriptional regulatory sequences located outside the linkage. Disruption of the anchor-mediated linkage increases gene expression. In one embodiment, a gene is located both inside and outside an anchor-mediated linkage and is inaccessible to one or more transcriptional regulatory sequences, such as enhancing sequences, located outside the linkage. Disruption of the anchor-mediated linkage increases gene expression. In one embodiment, a gene is located outside an anchor-mediated linkage and is inaccessible to one or more transcriptional regulatory sequences, such as enhancing sequences located inside the linkage. Disruption of the anchor-mediated linkage increases gene expression.

[0250] In some embodiments, the target gene has a specified level of expression, for example, in its native state or in a disease state. For example, the target gene may have a high level of expression in its native state. Disrupting the anchor sequence-mediated linkage can modulate expression of the target gene, for example, by increasing transcription through conformational changes that open DNA within the linkage to transcription, or through conformational changes of DNA that associate one or more enhancing sequences with the target gene.
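
As an illustrative summary only, the direction of change described above for disruption of each linkage type can be tabulated in a short Python sketch. The names `Direction`, `EXPECTED_ON_DISRUPTION`, and `expected_direction` are hypothetical and are not part of this disclosure; the mapping restates the text and does not predict the behavior of any particular gene.

```python
from enum import Enum


class Direction(Enum):
    DECREASE = "decrease"
    INCREASE = "increase"


# Direction of the change in target-gene expression that the text describes
# when a linkage of the given type is disrupted (types 1 to 4).
EXPECTED_ON_DISRUPTION = {
    1: Direction.DECREASE,  # gene and enhancing sequence share the linkage
    2: Direction.INCREASE,  # enhancing sequence is separated from the gene
    3: Direction.DECREASE,  # silencing sequence is separated from the gene
    4: Direction.INCREASE,  # regulatory sequence is separated from the gene
}


def expected_direction(linkage_type: int) -> Direction:
    """Look up the described direction of change for a linkage type."""
    try:
        return EXPECTED_ON_DISRUPTION[linkage_type]
    except KeyError:
        raise ValueError(f"unknown linkage type: {linkage_type!r}") from None
```

For example, `expected_direction(2)` returns `Direction.INCREASE`, matching the type 2 examples in which disruption increases expression.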

Targeting moiety

[0251] In some embodiments, the compositions, drugs, fusion molecules, or other molecules described herein include one or more targeting moieties described herein. The targeting moieties can target an anchor sequence-mediated linkage for at least one of the following: at least one exogenous anchor sequence; modification of at least one linkage nucleating molecule binding site, such as by altering its binding affinity for the linkage nucleating molecule; a change in the orientation of at least one common nucleotide sequence, such as a CTCF binding motif; and at least one substitution, addition, or deletion in at least one anchor sequence, such as a CTCF binding motif.

[0252] Those skilled in the art, upon reading the following examples of specific types of targeting moieties, will understand that in some embodiments, the targeting moieties are site-specific. That is, in some embodiments, the targeting moiety specifically binds to one or more target anchor sequences (e.g., intracellular) and does not bind to non-target anchor sequences (e.g., within the same cell).

[0253] The targeting portion can modulate specific functions, modulate specific molecules (e.g., enzymes, proteins, or nucleic acids), and bind specifically for localization. The targeting function can act on specific molecules, e.g., molecular targets. For example, a targeted therapeutic agent can interact with a specific molecule to increase, decrease, or otherwise modulate its function.

[0254] In some embodiments, the targeting portion binds to an anchor sequence (e.g., a DNA sequence). In various parts of this disclosure, the term “DNA binding portion” may be used to refer to the targeting portion.

[0255] In some embodiments, the compositions, drugs, fusion molecules, or other molecules described herein include a targeting moiety (e.g., gRNA, antisense, oligonucleotide, peptide oligonucleotide conjugate) operably linked to an effector moiety that binds to an anchor sequence and modulates the formation of an anchor sequence-mediated linkage. The targeting moiety can bind to the anchor sequence of the anchor sequence-mediated linkage and alter the formation of the anchor sequence-mediated linkage (e.g., by altering the affinity of the anchor sequence for the linkage nucleating molecule by, for example, at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more). The targeting moiety may be any one of the pharmacokinetically weak small molecules, peptides, nucleic acids, nanoparticles, aptamers, and drugs described herein.

[0256] The targeting moiety can target one or more nucleotides in an anchor sequence-mediated linkage for substitution, addition, or deletion, such as an anchor sequence, or a common nucleotide sequence within an anchor sequence, using a gene editing system or the like. In some embodiments, the targeting moiety binds to an anchor sequence-mediated linkage, such as an anchor sequence within an anchor sequence-mediated linkage, and alters the topology of the anchor sequence-mediated linkage.

[0257] In some embodiments, the targeting moiety targets one or more nucleotides of an anchor sequence within an anchor sequence-mediated linkage for substitution, addition, or deletion, for example, by CRISPR, TALEN, dCas9, oligonucleotide pairing, recombination, or transposon. In some embodiments, the targeting moiety targets one or more DNA methylation sites within an anchor sequence-mediated linkage.

[0258] The targeted region can be altered by a gene editing system or the like by substitution, addition, or deletion of one or more nucleotides in an anchor sequence-mediated linkage, such as an anchor sequence, or a common nucleotide sequence within an anchor sequence.

[0259] In some embodiments, the targeting moiety modulates transcription of a gene in an anchor sequence-mediated linkage in human cells by introducing a targeted modification into the anchor sequence-mediated linkage. The targeted modification may include a substitution, addition, or deletion of one or more nucleotides, for example, of the anchor sequence in the anchor sequence-mediated linkage. The targeting moiety can bind to the anchor sequence of the anchor sequence-mediated linkage and modulate transcription of the gene in human cells by introducing a targeted modification into the anchor sequence. In some embodiments, the targeted modification alters at least one of: a binding site for a linkage nucleating molecule (for example, by altering its binding affinity for the anchor sequence), an alternative splicing site, and a binding site for a non-coding RNA in the anchor sequence-mediated linkage.

[0260] In some embodiments, the targeting moiety edits the anchor sequence-mediated linkage by at least one of the following: at least one exogenous anchor sequence; modification of at least one linkage nucleating molecule binding site, such as by altering its binding affinity for the linkage nucleating molecule; a change in the orientation of at least one common nucleotide sequence, such as a CTCF binding motif; and at least one substitution, addition, or deletion in at least one anchor sequence, such as a CTCF binding motif.

[0261] In some embodiments, the targeting moiety is a nucleic acid sequence, a protein, a protein fusion, or a membrane-permeable polypeptide. In some embodiments, the targeting moiety is selected from an exogenous linkage nucleating molecule, a nucleic acid encoding a linkage nucleating molecule, or a fusion of a sequence-targeting polypeptide and a linkage nucleating molecule.

[0262] As will be described in more detail herein, in some embodiments, the targeting moiety described herein may be a polymer or polymer moiety, for example, a polymer of nucleotides (such as oligonucleotides), a peptide nucleic acid, a peptide-nucleic acid mixture, a peptide or polypeptide, a polyamide, a carbohydrate, etc.

Nucleic acid sequence

[0263] In some embodiments, the targeting moiety includes a nucleic acid sequence. In some embodiments, the nucleic acid sequence encodes a gene or expression product.

[0264] As will be readily apparent to those skilled in the art reading this specification, the targeting moiety may include nucleic acid sequences that do not encode a gene or expression product. For example, in some embodiments, the targeting moiety includes an oligonucleotide that hybridizes to a target anchor sequence. For example, in some embodiments, the oligonucleotide sequence includes a complement to the target anchor sequence, or has a sequence that is at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to the complement of the target anchor sequence.
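
The identity thresholds above can be checked with a few lines of Python. The following is a minimal sketch under stated assumptions: sequences are DNA written 5' to 3', "complement" is taken to mean the reverse complement, and identity is computed position by position over ungapped sequences of equal length. The function names and the 20-nt `ANCHOR` sequence are hypothetical illustrations, not sequences from this disclosure.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_DNA_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-by-position percent identity of two sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def identity_to_complement(oligo: str, anchor: str) -> float:
    """Percent identity between an oligonucleotide and the anchor's complement."""
    return percent_identity(oligo, reverse_complement(anchor))


ANCHOR = "ACAGCCACCAGGGGGCGCCG"  # hypothetical anchor sequence, 20 nt
perfect = reverse_complement(ANCHOR)
# Change the first base to create exactly one mismatch out of 20 positions.
one_mismatch = ("A" if perfect[0] != "A" else "C") + perfect[1:]
```

Here `identity_to_complement(perfect, ANCHOR)` evaluates to 100.0 and `identity_to_complement(one_mismatch, ANCHOR)` to 95.0, so the second oligonucleotide would satisfy the "at least 95%" threshold but not the "at least 97%" threshold.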

[0265] Nucleic acid sequences may include, but are not limited to, DNA, RNA, modified oligonucleotides (e.g., chemically modified, such as modifications that alter backbone linkages, sugar molecules, and / or nucleic acid bases), and artificial nucleic acids. In some embodiments, nucleic acid sequences may include, but are not limited to, genomic DNA, cDNA, peptide nucleic acid (PNA) or peptide oligonucleotide conjugates, locked nucleic acid (LNA), bridged nucleic acid (BNA), polyamides, triple-helix-forming oligonucleotides, modified DNA, antisense DNA oligonucleotides, tRNA, mRNA, rRNA, modified RNA, miRNA, gRNA, and siRNA or other RNA or DNA molecules.

[0266] In some embodiments, the nucleic acid sequence has a length of about 2 to about 5000 nt, about 10 to about 100 nt, about 50 to about 150 nt, about 100 to about 200 nt, about 150 to about 250 nt, about 200 to about 300 nt, about 250 to about 350 nt, about 300 to about 500 nt, about 10 to about 1000 nt, about 50 to about 1000 nt, about 100 to about 1000 nt, about 1000 to about 2000 nt, about 2000 to about 3000 nt, about 3000 to about 4000 nt, about 4000 to about 5000 nt, or any range in between.

[0267] In one embodiment, the disclosure includes a synthetic nucleic acid comprising a plurality of anchor sequences, gene sequences, and transcriptional regulatory sequences. In some embodiments, the gene sequences and transcriptional regulatory sequences are located between the plurality of anchor sequences. In some embodiments, the synthetic nucleic acid comprises, in order, (a) an anchor sequence, a gene sequence, a transcriptional regulatory sequence, and an anchor sequence, or (b) an anchor sequence, a transcriptional regulatory sequence, a gene sequence, and an anchor sequence. In some embodiments, the sequences are separated by linker sequences. In some embodiments, the anchor sequences are 7 to 100 nt, 10 to 100 nt, 10 to 80 nt, 10 to 70 nt, 10 to 60 nt, 10 to 50 nt, or 20 to 80 nt in length, or any range in between. In some embodiments, the nucleic acid is 3,000-50,000 bp, 3,000-40,000 bp, 3,000-30,000 bp, 3,000-20,000 bp, 3,000-15,000 bp, 3,000-12,000 bp, 3,000-10,000 bp, 3,000-8,000 bp, 5,000-30,000 bp, 5,000-20,000 bp, 5,000-15,000 bp, 5,000-12,000 bp, or 5,000-10,000 bp in length, or any range in between.

[0268] In another embodiment, this disclosure includes vectors containing nucleic acids described herein.

[0269] In another aspect, this disclosure includes cells or tissues containing nucleic acids described herein.

[0270] In another aspect, this disclosure includes pharmaceutical compositions comprising nucleic acids described herein.

[0271] In another aspect, the disclosure includes a method for modulating gene expression by administering a composition comprising the nucleic acids described herein.

Analogs

[0272] The nucleic acid sequence may include nucleosides, such as purines or pyrimidines, for example, adenine, cytosine, guanine, thymine, and uracil. In some embodiments, the nucleic acid sequence includes one or more nucleoside analogs. Nucleoside analogs include, but are not limited to, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 4-methylbenzimidazole, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, dihydrouridine, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methyl ester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl)uracil, (acp3)w, 2,6-diaminopurine, 3-nitropyrrole, inosine, thiouridine, queuosine, wyosine, diaminopurine, isoguanine, isocytosine, diaminopyrimidine, 2,4-difluorotoluene, isoquinoline, pyrrolo[2,3-b]pyridine, and any other analog that can base-pair with a purine or pyrimidine side chain.

gRNA

[0273] In some embodiments, the targeting moiety includes a nucleic acid sequence, e.g., a guide RNA (gRNA). In some embodiments, the targeting moiety includes a guide RNA or a nucleic acid encoding a guide RNA. A short synthetic gRNA consists of a “scaffold” sequence required for Cas9 binding and a user-defined targeting sequence of approximately 20 nucleotides for genomic targeting. In practice, guide RNA sequences are generally designed to have a length of 17–24 nucleotides (e.g., 19, 20, or 21 nucleotides) and to be complementary to the targeted nucleic acid sequence. Custom gRNA generators and algorithms are commercially available for use in the design of effective guide RNAs. Gene editing has also been achieved using a chimeric “single guide RNA” (“sgRNA”), which is an engineered (synthetic) single RNA molecule that mimics the naturally occurring crRNA-tracrRNA complex and contains both a tracrRNA (for nuclease binding) and at least one crRNA (for guiding the nuclease to the sequence targeted for editing). Chemically modified sgRNAs have also been shown to be effective in genome editing; see, for example, Hendel et al. (2015), Nature Biotechnol, pp. 985-991.
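
The two design constraints stated above (a targeting sequence of 17–24 nucleotides that is complementary to the targeted sequence) can be expressed as a simple check. This is a hedged sketch, not a guide-design tool: the function names and the example target are hypothetical, the target strand is assumed to be DNA written 5' to 3', and nuclease-specific requirements that the text does not spell out (for example, any adjacent recognition motif or off-target scoring) are deliberately left out.

```python
_DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def rna_reverse_complement(dna: str) -> str:
    """RNA sequence (5' to 3') that base-pairs with a DNA strand (5' to 3')."""
    return dna.upper().translate(_DNA_TO_RNA_COMPLEMENT)[::-1]


def is_acceptable_spacer(spacer: str, target_strand: str,
                         min_len: int = 17, max_len: int = 24) -> bool:
    """True if the spacer is within the length window and fully
    complementary to the given target strand."""
    spacer = spacer.upper().replace("T", "U")
    if not min_len <= len(spacer) <= max_len:
        return False
    return spacer == rna_reverse_complement(target_strand)


TARGET = "GGCCAGCAGGGGGCGCTAGT"  # hypothetical 20-nt target strand
SPACER = rna_reverse_complement(TARGET)
```

With these definitions, `is_acceptable_spacer(SPACER, TARGET)` is true, while a 10-nt spacer fails the length window even when it is perfectly complementary to its own target.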

[0274] In some embodiments, the nucleic acid sequence includes a sequence complementary to the anchor sequence. In one embodiment, the anchor sequence includes a CTCF binding motif or consensus sequence: N(T / C / G)N(G / A / T)CC(A / T / G)(C / G)(C / T / A)AG(G / A)(G / T)GG(C / A / T)(G / A)(C / G)(C / T / A)(G / A / C)(Sequence ID 1) (wherein N is any nucleotide). The CTCF binding motif or consensus sequence may also be in the opposite orientation, for example, (G / A / C)(C / T / A)(C / G)(G / A)(C / A / T)GG(G / T)(G / A)GA(C / T / A)(C / G)(A / T / G)CC(G / A / T)N(T / C / G)N(Sequence ID 2). In some embodiments, the nucleic acid sequence includes a sequence complementary to the CTCF-binding motif or consensus sequence.
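
To make the consensus notation concrete, the following Python sketch parses the two consensus strings above into per-position sets of allowed nucleotides, builds a regular expression from them, and scans a sequence for matches. It is an illustration under stated assumptions: the consensus strings are copied from the text with spaces removed, N is expanded to any of A, C, G, or T, and the helper names (`parse_consensus`, `to_regex`, `find_sites`) and the example site are hypothetical.

```python
import re

SEQ_ID_1 = ("N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG"
            "(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C)")
SEQ_ID_2 = ("(G/A/C)(C/T/A)(C/G)(G/A)(C/A/T)GG(G/T)(G/A)GA(C/T/A)"
            "(C/G)(A/T/G)CC(G/A/T)N(T/C/G)N")

# One token is either a parenthesized choice such as (T/C/G) or a single letter.
_TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])*)\)|([ACGTN])")


def parse_consensus(consensus: str) -> list[frozenset[str]]:
    """Turn a consensus string into a list of allowed-nucleotide sets."""
    compact = consensus.replace(" ", "")
    positions, cursor = [], 0
    for match in _TOKEN.finditer(compact):
        if match.start() != cursor:
            raise ValueError(f"unexpected text at offset {cursor}")
        cursor = match.end()
        if match.group(1):
            positions.append(frozenset(match.group(1).split("/")))
        elif match.group(2) == "N":
            positions.append(frozenset("ACGT"))
        else:
            positions.append(frozenset(match.group(2)))
    if cursor != len(compact):
        raise ValueError(f"unexpected text at offset {cursor}")
    return positions


def to_regex(positions: list[frozenset[str]]) -> str:
    """Build a regular expression with one character class per position."""
    return "".join("[" + "".join(sorted(p)) + "]" for p in positions)


def find_sites(sequence: str, positions: list[frozenset[str]]) -> list[int]:
    """Return 0-based start offsets of all (possibly overlapping) matches."""
    pattern = re.compile("(?=(" + to_regex(positions) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


MOTIF = parse_consensus(SEQ_ID_1)
EXAMPLE_SITE = "ACAGCCACCAGGGGGCGCCG"  # one hypothetical sequence that fits
```

Parsed this way, both consensus sequences are 20 positions long, and Sequence ID 2 is Sequence ID 1 read in reverse position order (`list(reversed(MOTIF)) == parse_consensus(SEQ_ID_2)`). Scanning `"TT" + EXAMPLE_SITE + "TT"` with `find_sites` reports a single match at offset 2.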

[0275] In some embodiments, the nucleic acid sequence includes a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the anchor sequence. In some embodiments, the nucleic acid sequence includes a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the CTCF-binding motif or consensus sequence. In some embodiments, the nucleic acid sequence is selected from the group consisting of a gRNA and a sequence that is complementary to the anchor sequence or at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary.

[0276] In some embodiments, the epigenetic modifier is a gRNA, antisense DNA, or triple-helix-forming oligonucleotide that is used to target DNA and to provide a steric presence near the anchor sequence. The gRNA recognizes a specific DNA sequence (e.g., a sequence adjacent to an anchor sequence, or a CTCF anchor sequence). The gRNA may also contain additional sequences that interfere with the linkage nucleating molecule sequence and act as a steric blocker. In some embodiments, the gRNA is combined with one or more peptides, such as S-adenosylmethionine (SAM), which act as a steric presence that interferes with the linkage nucleating molecule.

Nucleic acids encoding proteins

[0277] In some embodiments, a vector, for example, a viral vector, includes a targeting moiety, such as a nucleic acid encoding a linkage nucleating molecule.

[0278] Nucleic acids described herein, or nucleic acids encoding proteins described herein, such as linkage nucleating molecules or epigenetic modifiers, can be incorporated into vectors. Vectors derived from retroviruses, such as lentiviruses, are preferred means for achieving long-term gene transfer, as they allow long-term, stable integration of transgenes and their propagation in daughter cells. Examples of vectors include expression vectors, replication vectors, probe-generating vectors, and sequencing vectors. Expression vectors can be provided to cells in the form of viral vectors. Viral vector technology is well known in the art and is described in various virology and molecular biology manuals. Viruses useful as vectors include, but are not limited to, retroviruses, adenoviruses, adeno-associated viruses, herpesviruses, and lentiviruses. Generally, a suitable vector contains an origin of replication functional in at least one organism, a promoter sequence, convenient restriction endonuclease sites, and one or more selectable markers.

[0279] The expression of natural or synthetic nucleic acids is typically achieved by operably linking the nucleic acid encoding the gene of interest to a promoter and incorporating the construct into an expression vector. The vector may be suitable for replication and integration in eukaryotes. A typical cloning vector contains transcriptional and translational terminators, initiation sequences, and promoters useful for the expression of the desired nucleic acid sequence.

[0280] Additional promoter elements, such as enhancing sequences, regulate the frequency of transcription initiation. Typically, these are located 30–110 bp upstream of the start site, although some promoters have recently been shown to contain functional elements downstream of the start site as well. The spacing between promoter elements is often flexible, so promoter function is preserved when elements are inverted or moved relative to one another. In the thymidine kinase (TK) promoter, the spacing between promoter elements can be increased to 50 bp before activity begins to decline. Depending on the promoter, individual elements appear to function either cooperatively or independently to activate transcription.

[0281] One example of a suitable promoter is the immediate early cytomegalovirus (CMV) promoter sequence. This is a strong constitutive promoter sequence capable of driving high levels of expression of any polynucleotide sequence operably linked to it. Another example of a suitable promoter is elongation factor-1α (EF-1α). Other constitutive promoter sequences may also be used, including, but not limited to, the simian virus 40 (SV40) early promoter, the mouse mammary tumor virus (MMTV) promoter, the long terminal repeat (LTR) promoter of human immunodeficiency virus (HIV), the MoMuLV promoter, the avian leukemia virus promoter, the Epstein-Barr virus immediate early promoter, and the Rous sarcoma virus promoter, as well as human gene promoters such as, but not limited to, the actin promoter, myosin promoter, hemoglobin promoter, and creatine kinase promoter.

[0282] Furthermore, this disclosure should not be limited to the use of constitutive promoters. Inducible promoters are also contemplated as part of this disclosure. An inducible promoter provides a molecular switch that can turn on expression of the polynucleotide sequence to which it is operably linked when such expression is desired, or turn it off when expression is not desired. Examples of inducible promoters include, but are not limited to, metallothionein promoters, glucocorticoid promoters, progesterone promoters, and tetracycline promoters.

[0283] The expression vector to be introduced may also contain a selectable marker gene or a reporter gene, or both, to facilitate the identification and selection of expressing cells from the population of cells to be transfected or infected via viral vectors. In other embodiments, the selectable marker can be carried on a separate piece of DNA and used in a co-transfection procedure. Both the selectable marker and the reporter gene can be flanked by appropriate transcriptional regulatory sequences to enable expression in host cells. Useful selectable markers include, for example, antibiotic resistance genes such as neo.

[0284] Reporter genes can be used to identify potentially transfected cells and to evaluate the function of transcriptional regulatory sequences. Generally, a reporter gene is a gene that is not present in or expressed by the recipient organism or tissue and that encodes a polypeptide whose expression is manifested by some readily detectable property, such as enzymatic activity. Reporter gene expression is assayed at a suitable time after the DNA has been introduced into the recipient cells. Suitable reporter genes may include genes encoding luciferase, beta-galactosidase, chloramphenicol acetyltransferase, secreted alkaline phosphatase, or green fluorescent protein (e.g., Ui-Tei et al., 2000, FEBS Letters 479: pp. 79-82). Suitable expression systems are well known and can be prepared using known techniques or obtained commercially. Generally, the construct with the minimal 5' flanking region that shows the highest level of reporter gene expression is identified as the promoter. Such a promoter region can be linked to a reporter gene and used to evaluate a drug's ability to modulate promoter-driven transcription.

RNAi

[0285] Certain RNA agents can inhibit gene expression through the biological process of RNA interference (RNAi). RNAi molecules include RNA or RNA-like structures typically containing 15–50 base pairs (e.g., about 18–25 base pairs) and having a nucleobase sequence identical (complementary) or nearly identical (substantially complementary) to a coding sequence in a target gene expressed in the cell. Examples of RNAi molecules include, but are not limited to, small interfering RNA (siRNA), double-stranded RNA (dsRNA), microRNA (miRNA), small hairpin RNA (shRNA), meroduplexes, and Dicer substrates (U.S. Patents 8,084,599, 8,349,809, and 8,513,207). In one embodiment, this disclosure includes compositions for inhibiting the expression of a gene encoding a polypeptide described herein, such as a linkage nucleating molecule or an epigenetic modifier.

[0286] RNAi molecules contain sequences that are substantially or completely complementary to all or a fragment of a target gene. RNAi molecules can be complementary to sequences at intron-exon boundaries, preventing the maturation of a newly generated nuclear RNA transcript of a particular gene into mRNA for translation. RNAi molecules complementary to a particular gene can hybridize with the mRNA for that gene and prevent its translation. Antisense molecules may be DNA, RNA, or derivatives or hybrids thereof. Examples of such derivative molecules include, but are not limited to, peptide nucleic acids (PNAs) and phosphorothioate-based molecules such as deoxyribonucleic guanidine (DNG) or ribonucleic guanidine (RNG).

[0287] RNAi molecules can be supplied to cells either as "ready-to-use" RNA synthesized in vitro, or as antisense genes transfected into cells that yield RNAi molecules upon transcription. Hybridization with mRNA results in degradation of the hybridized molecule by RNase H and / or inhibition of the formation of translation complexes. Both result in a failure to produce the product of the original gene.

[0288] The length of the RNAi molecule hybridizing to the target transcript should be approximately 10 nucleotides, approximately 15–30 nucleotides, or approximately 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 nucleotides, or more. The degree of identity between the antisense sequence and the targeted transcript should be at least 75%, at least 80%, at least 85%, at least 90%, or at least 95%.

[0289] RNAi molecules may also contain overhangs, i.e., unpaired, overhanging nucleotides that are not directly involved in the double helix structure typically formed by the paired core sequences of the sense and antisense strands as defined herein. RNAi molecules may contain 3' and / or 5' overhangs of approximately 1 to 5 nucleotides, independently on each of the sense and antisense strands. In one embodiment, both the sense and antisense strands contain 3' and 5' overhangs. In one embodiment, one or more 3' overhang nucleotides of one strand base-pair with one or more 5' overhang nucleotides of the other strand. In another embodiment, one or more 3' overhang nucleotides of one strand do not base-pair with one or more 5' overhang nucleotides of the other strand. The sense and antisense strands of an RNAi molecule may or may not contain the same number of nucleotide bases. The antisense and sense strands may form a duplex in which only the 5' end is blunt, only the 3' end is blunt, both the 5' and 3' ends are blunt, or neither the 5' nor the 3' end is blunt. In another embodiment, one or more nucleotides in the overhang may contain a thiophosphate, a phosphorothioate, a deoxynucleotide inverted (3'-to-3' linked) nucleotide, or a modified ribonucleotide or deoxynucleotide.

[0290] Small interfering RNA (siRNA) molecules contain a nucleotide sequence identical to approximately 15 to 25 consecutive nucleotides of the target mRNA. In some embodiments, the siRNA sequence begins with dinucleotide AA and contains approximately 30-70% (approximately 30-60%, 40-60%, or 45-55%) GC content, and does not have a high percentage of identity to any non-target nucleotide sequence in the mammalian genome into which it is introduced, as determined, for example, by a standard BLAST search.
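
The sequence criteria in this paragraph (a leading AA dinucleotide and a GC content of roughly 30-70%) lend themselves to a simple window scan. The Python sketch below is illustrative only: the function names, the default window length of 21 nt, and the example transcript are assumptions, and the specificity check that the text assigns to a BLAST search is not implemented here.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a non-empty sequence."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def candidate_sirna_targets(mrna: str, length: int = 21,
                            gc_min: float = 30.0, gc_max: float = 70.0):
    """Return (offset, window) pairs for windows of the target mRNA that
    begin with AA and whose GC content lies within [gc_min, gc_max]."""
    if length < 2:
        raise ValueError("length must be at least 2")
    mrna = mrna.upper().replace("T", "U")
    hits = []
    for start in range(len(mrna) - length + 1):
        window = mrna[start:start + length]
        if window.startswith("AA") and gc_min <= gc_percent(window) <= gc_max:
            hits.append((start, window))
    return hits


# Hypothetical 25-nt transcript fragment used only to exercise the scan.
EXAMPLE_MRNA = "GGAAGCGCAUGCAUGCAUGCAUGCC"
```

For `EXAMPLE_MRNA`, the only window that starts with AA begins at offset 2 and has 11 G or C bases out of 21 (about 52%), so `candidate_sirna_targets(EXAMPLE_MRNA)` returns that single window. Candidates found this way would still need the genome-wide specificity screen described in the text.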

[0291] siRNAs and shRNAs resemble intermediates in the processing pathway of endogenous microRNA (miRNA) genes (Bartel, Cell 116: pp. 281-297, 2004). In some embodiments, an siRNA can function as a miRNA and vice versa (Zeng et al., Mol Cell 9: pp. 1327-1333, 2002; Doench et al., Genes Dev 17: pp. 438-442, 2003). Like siRNAs, miRNAs use RISC to downregulate target genes, but unlike siRNAs, many animal miRNAs do not cleave the mRNA. Instead, miRNAs reduce protein output through translational repression or through poly(A) removal and mRNA degradation (Wu et al., Proc Natl Acad Sci USA 103: pp. 4034-4039, 2006). Known miRNA binding sites are located within mRNA 3'UTRs; miRNAs appear to target sites with near-perfect complementarity to nucleotides 2-8 from the 5' end of the miRNA (Rajewsky, Nat Genet 38 Suppl: pp. S8-13, 2006; Lim et al., Nature 433: pp. 769-773, 2005). This region is known as the seed region. Because siRNAs and miRNAs are interchangeable, an exogenous siRNA downregulates mRNAs that have seed complementarity to the siRNA (Birmingham et al., Nat Methods 3: pp. 199-204, 2006). Multiple target sites within a 3'UTR give stronger downregulation (Doench et al., Genes Dev 17: pp. 438-442, 2003).
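As a rough illustration of seed complementarity, the following Python sketch locates sites in a 3'UTR that are perfectly complementary to nucleotides 2-8 of a guide strand. The function names and the example sequences are hypothetical, and requiring a perfect 7-nucleotide match is a simplifying assumption; the text above speaks only of near-perfect complementarity.

```python
# Watson-Crick complement table for RNA letters.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(guide_rna):
    """Return the mRNA sequence (5'->3') that pairs with guide
    nucleotides 2-8 (the seed region, counted from the guide 5' end)."""
    seed = guide_rna.upper().replace("T", "U")[1:8]
    # Reverse complement: the mRNA site runs antiparallel to the guide.
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(utr, guide_rna):
    """Return 0-based start positions of perfect seed matches in a 3'UTR,
    including overlapping occurrences."""
    utr = utr.upper().replace("T", "U")
    site = seed_site(guide_rna)
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return positions


# Illustrative guide strand and a made-up 3'UTR fragment with two sites.
guide = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCGGGCUACCUCAA"
print(seed_site(guide), find_seed_matches(utr, guide))
```

Counting the matches returned for a given 3'UTR gives a crude proxy for the observation that multiple target sites provide stronger downregulation.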

[0292] Lists of known miRNA sequences can be found in databases maintained by research institutions such as the Wellcome Trust Sanger Institute, the Penn Center for Bioinformatics, the Memorial Sloan Kettering Cancer Center, and the European Molecular Biology Laboratory, among others. Known effective siRNA sequences and their cognate binding sites are also well documented in the relevant literature. RNAi molecules are readily designed and produced using techniques known in the art. Furthermore, computational tools exist that increase the chances of finding effective and specific sequence motifs (Pei et al., 2006; Reynolds et al., 2004; Khvorova et al., 2003; Schwarz et al., 2003; Ui-Tei et al., 2004; Heale et al., 2005; Chalk et al., 2004; Amarzguioui et al., 2004).

[0293] RNAi molecules modulate the expression of RNA encoded by a gene. Because multiple genes can share some degree of sequence homology with one another, in some embodiments, RNAi molecules can be designed to target a class of genes with sufficient sequence homology. In some embodiments, RNAi molecules may contain sequences that are complementary to sequences shared between different gene targets or unique to a particular gene target. In some embodiments, RNAi molecules can be designed to target conserved regions of RNA sequences that have homology between several genes and thereby target several genes in a gene family (e.g., different gene isoforms, splice variants, mutant genes, etc.). In some embodiments, RNAi molecules can be designed to target sequences that are unique to a specific RNA sequence in a single gene.

[0294] In some embodiments, the RNAi molecule targets a sequence in a linkage nucleation molecule, e.g., CTCF, cohesin, USF1, YY1, TATA box-binding protein-associated factor 3 (TAF3), ZNF143, or another polypeptide that promotes the formation of anchor sequence-mediated linkages, or in an epigenetic modifier, e.g., an enzyme involved in post-translational modifications, including, but not limited to, DNA methylases (e.g., DNMT3a, DNMT3b, DNMTL), enzymes of DNA demethylation (e.g., enzymes of the TET family, which catalyze the oxidation of 5-methylcytosine to 5-hydroxymethylcytosine and higher oxidative derivatives), histone methyltransferases, histone deacetylases (e.g., HDAC1, HDAC2, HDAC3), sirtuins 1, 2, 3, 4, 5, 6, or 7, lysine-specific histone demethylase 1 (LSD1), histone-lysine N-methyltransferase (Setdb1), euchromatic histone-lysine N-methyltransferase 2 (G9a), histone-lysine N-methyltransferase (SUV39H1), enhancer of zeste homolog 2 (EZH2), viral lysine methyltransferase (vSET), histone methyltransferase (SET2), protein-lysine N-methyltransferase (SMYD2), and others. In one embodiment, the RNAi molecule targets a deacetylase protein, e.g., sirtuins 1, 2, 3, 4, 5, 6, or 7. In one embodiment, the disclosure includes a composition comprising an RNAi molecule that targets a linkage nucleation molecule, e.g., CTCF.

[0295] Peptide or protein portion. In some embodiments, the targeting moiety includes a peptide or protein moiety, such as a DNA-binding protein, a CRISPR component protein, a connective nucleation molecule, a dominant-negative connective nucleation molecule, an epigenetic modifier, or any combination thereof.

[0296] The peptide or protein portion may include, but is not limited to, peptide ligands, antibody fragments, or targeted aptamers that bind to receptors such as extracellular receptors, neuropeptides, hormone peptides, peptide drugs, toxic peptides, viral or microbial peptides, synthetic peptides, and agonist or antagonist peptides.

[0297] The peptide or protein portion may be linear or branched. The peptide or protein portion may have a length of about 5 to about 200 amino acids, about 15 to about 150 amino acids, about 20 to about 125 amino acids, about 25 to about 100 amino acids, or any range in between.

[0298] Examples of peptide or protein moieties used in the methods and compositions described herein include, but are not limited to: ubiquitin; bicyclic peptides as ubiquitin ligase inhibitors; transcription factors; DNA and protein modifying enzymes such as topoisomerases; topoisomerase inhibitors such as topotecan; DNA methyltransferases such as the DNMT family (e.g., DNMT3a, DNMT3b, DNMTL); protein methyltransferases (e.g., viral lysine methyltransferase (vSET), protein-lysine N-methyltransferase (SMYD2)); deaminases (e.g., APOBEC, UG1); histone methyltransferases such as enhancer of zeste homolog 2 (EZH2), PRMT1, histone-lysine N-methyltransferase (Setdb1), histone methyltransferase (SET2), euchromatic histone-lysine N-methyltransferase 2 (G9a), and histone-lysine N-methyltransferase (SUV39H1); histone deacetylases (e.g., HDAC1, HDAC2, HDAC3); enzymes that play a role in DNA demethylation (e.g., enzymes of the TET family, which catalyze the oxidation of 5-methylcytosine to 5-hydroxymethylcytosine and higher oxidative derivatives); protein demethylases such as KDM1A and lysine-specific histone demethylase 1 (LSD1); helicases such as DHX9; acetyltransferases; deacetylases (e.g., sirtuin 1, 2, 3, 4, 5, 6, or 7); kinases; phosphatases; DNA-intercalating agents such as ethidium bromide, SYBR Green, and proflavin; efflux pump inhibitors such as peptidomimetics like phenylalanine arginyl β-naphthylamide or quinoline derivatives; nuclear receptor activators and inhibitors; proteasome inhibitors; competitive inhibitors of enzymes such as those involved in lysosomal storage disorders; protein synthesis inhibitors; nucleases (e.g., Cpf1, Cas9, zinc finger nucleases); one or more fusions thereof (e.g., dCas9-DNMT, dCas9-APOBEC, dCas9-UG1); and specific domains derived from proteins, such as the KRAB domain.

[0299] Some examples of peptides include, but are not limited to, fluorescent tags or markers, antigens, antibodies, antibody fragments such as single-domain antibodies, ligands and receptors such as glucagon-like peptide-1 (GLP-1), GLP-2 receptor 2, cholecystokinin B (CCKB), and somatostatin receptor, peptide therapeutics such as those that bind to specific cell surface receptors such as G protein-coupled receptors (GPCRs) or ion channels, synthetic or analog peptides derived from naturally occurring bioactive peptides, antimicrobial peptides, pore-forming peptides, tumor-targeting or cytotoxic peptides, and degradable or self-destructive peptides such as apoptosis-inducing peptide signals or photosensitized peptides.

[0300] The peptides described herein also include small antigen-binding peptides, e.g., antigen-binding antibodies or antibody-like fragments such as single-chain antibodies and nanobodies (see, e.g., Steeland et al., 2016, Nanobodies as therapeutics: big opportunities for small antibodies. Drug Discov Today 21(7): 1076-113). Such small antigen-binding peptides may bind to cytoplasmic antigens, nuclear antigens, or intraorganelle antigens.

[0301] In one embodiment, the disclosure includes a cell or tissue containing any one of the proteins described herein.

[0302] In another aspect, the disclosure includes pharmaceutical compositions comprising proteins described herein.

[0303] In another aspect, the disclosure includes a method for modulating gene expression by administering a composition comprising the protein described herein.

[0304] DNA binding domain. In some embodiments, the targeting portion includes a DNA-binding domain of a protein. DNA-binding proteins have distinct structural motifs that play a key role in binding DNA.

[0305] The helix-turn-helix motif is a common DNA recognition motif in repressor proteins. This motif contains two helices, one of which recognizes DNA (the so-called recognition helix) and whose side chains provide binding specificity. Such motifs are common in proteins that regulate developmental processes. Sometimes more than one protein competes for the same sequence or recognizes the same DNA fragment. Different proteins may differ in their affinity for the same sequence or DNA conformation through hydrogen bonding, salt bridges, and van der Waals interactions.

[0306] DNA-binding proteins having an HhH structural motif may be involved in non-sequence-specific DNA binding, which occurs through the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups.

[0307] DNA-binding proteins with the HLH structural motif are transcriptional regulatory proteins and, in principle, are involved in various developmental processes. This motif is longer than the other two motifs in terms of residues. Many of these proteins interact to form homodimers and heterodimers. This structural motif consists of two long helix regions; the N-terminal helix binds to DNA, while the loop region dimerizes the protein.

[0308] In some transcription factors, the dimer-binding site with DNA forms a leucine zipper. This motif contains two amphiphilic helices, one from each subunit interacting with the other, resulting in a left-handed coiled-coil supersecondary structure. A leucine zipper is the interlocking of regularly spaced leucine residues in one helix with leucine from an adjacent helix. Often, the helices involved in a leucine zipper exhibit a seven-residue sequence (abcdefg) where residues a and d are hydrophobic, and all others are hydrophilic. The leucine zipper motif can mediate homodimerization or heterodimerization.

[0309] Some eukaryotic transcription factors exhibit a unique motif called a Zn-finger, in which a Zn++ ion is coordinated by two Cys and two His residues. Such a transcription factor contains a trimer with the stoichiometry ββ'α. The apparent effect of Zn++ coordination is the stabilization of a small loop structure in place of hydrophobic core residues. Each Zn-finger interacts in a conformationally identical manner with successive 3-base pair segments in the major groove of the double helix. Protein-DNA interactions are determined by two factors: (i) H-bond interactions between the α-helix and the DNA segment, often between Arg residues and guanine bases, and (ii) H-bond interactions with the DNA phosphate backbone, often involving Arg and His. An alternative Zn-finger motif chelates Zn++ with 6 Cys.

[0310] DNA-binding proteins also include TATA box-binding proteins (TBPs), first identified as a component of the class II initiation factor TFIID. TBP participates in transcription by all three nuclear RNA polymerases, acting as a subunit in each case. The structure of TBP exhibits two α/β structural domains of 89-90 amino acids. The C-terminal or core region binds with high affinity to the TATA consensus sequence (TATAa/tAa/t, sequence number xx), recognizing minor groove determinants and promoting DNA bending. TBP resembles a molecular saddle. The DNA-binding side is lined by the central 8 strands of the 10-stranded antiparallel β-sheet. The upper surface contains four α-helices and binds to various components of the transcription machinery.

[0311] DNA provides base specificity in the form of nitrogenous bases. The R groups of amino acids, including basic residues such as lysine, arginine, histidine, asparagine, and glutamine, can readily interact with the adenine of an A:T base pair and the guanine of a G:C base pair, where the NH2 and X=O groups of the base pairs can form hydrogen bonds, preferably with the amino acid residues glutamine, asparagine, arginine, and lysine.

[0312] In some embodiments, the DNA-binding protein is a transcription factor. The transcription factor (TF) may be a modular protein containing a DNA-binding domain responsible for specific recognition of a base sequence and one or more effector domains that can activate or repress transcription. The TF interacts with chromatin and recruits a protein complex that acts as a coactivator or corepressor.

[0313] Gene editing systems. In some embodiments, the targeting portion (e.g., site-specific targeting portion) comprises one or more components of a gene editing system. As will be understood by those skilled in the art reading this specification, and as further described herein, components of gene editing systems can be used in a variety of contexts, including, but not limited to, gene editing. For example, such components can be used to target agents that physically, genetically, and/or epigenetically modify a target anchor sequence. In some embodiments, the targeting moiety targets one or more nucleotides of an anchor sequence-mediated linkage for substitution, addition, and/or deletion. Exemplary gene editing systems include the clustered regularly interspaced short palindromic repeats (CRISPR) system, zinc finger nucleases (ZFNs), and transcription activator-like effector nucleases (TALENs). Methods based on ZFNs, TALENs, and CRISPR are described, for example, in Gaj et al., Trends Biotechnol. 31(7): 397-405 (2013). CRISPR gene editing methods are described, for example, in Guan et al., Application of CRISPR-Cas system in gene therapy: Pre-clinical progress in animal model. DNA Repair, July 30, 2016 [electronically published before print]; and Zheng et al., Precise gene deletion and replacement using the CRISPR/Cas9 system in human cells. BioTechniques 57(3): 115-124 (September 2014).

[0314] For example, in some embodiments, the site-specific targeting moiety includes a Cas nuclease (e.g., Cas9) and a site-specific guide RNA, as further described herein. In some embodiments, the Cas nuclease is an enzymatically inactive one, such as dCas9, as further described herein.

[0315] In one embodiment, the methods and compositions described herein can be used in conjunction with CRISPR-based gene editing, in which a guide RNA (gRNA) is used in a clustered regularly interspaced short palindromic repeats (CRISPR) system for gene editing. The CRISPR system is an adaptive defense system originally discovered in bacteria and archaea. The CRISPR system uses an RNA-guided nuclease called a CRISPR-associated or "Cas" endonuclease (e.g., Cas9 or Cpf1) to cleave foreign DNA. In a typical CRISPR/Cas system, the endonuclease is directed to a target nucleotide sequence (e.g., a site in the genome to be sequence-edited) by a sequence-specific non-coding "guide RNA" that targets a single-stranded or double-stranded DNA sequence. Three classes (I-III) of CRISPR systems have been identified. Class II CRISPR systems use a single Cas endonuclease (rather than multiple Cas proteins). A class II CRISPR system includes type II Cas endonucleases such as Cas9, CRISPR RNA ("crRNA"), and trans-activating crRNA ("tracrRNA"). The crRNA contains a "guide RNA," which is typically an RNA sequence of about 20 nucleotides that corresponds to the target DNA sequence. The crRNA also contains a region that binds to the tracrRNA to form a partially double-stranded structure, which is cleaved by RNase III, resulting in a crRNA/tracrRNA hybrid. The crRNA/tracrRNA hybrid then directs the Cas9 endonuclease to recognize and cleave the target DNA sequence. The target DNA sequence must generally be adjacent to a "protospacer adjacent motif" ("PAM") that is specific to a given Cas endonuclease; however, PAM sequences appear throughout a given genome. CRISPR endonucleases identified from various prokaryotic species have unique PAM sequence requirements; examples of PAM sequences include 5'-NGG (Streptococcus pyogenes), 5'-NNAGAA (Streptococcus thermophilus CRISPR1), 5'-NGGNG (Streptococcus thermophilus CRISPR3), and 5'-NNNGATT (Neisseria meningitidis). Some endonucleases, such as Cas9 endonuclease, associate with a G-rich PAM site, e.g., 5'-NGG, and perform blunt-end cleavage of target DNA at a position 3 nucleotides upstream (5' side) from the PAM site. Another class II CRISPR system includes the type V endonuclease Cpf1, which is smaller than Cas9; examples include AsCpf1 (derived from Acidaminococcus sp.) and LbCpf1 (derived from Lachnospiraceae sp.). Cpf1-associated CRISPR arrays are processed into mature crRNA without requiring tracrRNA; in other words, the Cpf1 system requires only the Cpf1 nuclease and crRNA to cleave the target DNA sequence. The Cpf1 endonuclease associates with T-rich PAM sites, e.g., 5'-TTN. Cpf1 can also recognize 5'-CTA PAM motifs. Cpf1 cleaves target DNA by introducing offset or staggered double-strand breaks with 4- or 5-nucleotide 5' overhangs, for example, by cleaving target DNA with a 5-nucleotide offset at positions 18 nucleotides downstream (3' side) from the PAM site on the coding strand and 23 nucleotides downstream from the PAM site on the complementary strand; the resulting 5-nucleotide overhangs allow for more precise genome editing by DNA insertion through homologous recombination than insertion at blunt-ended DNA. See, for example, Zetsche et al. (2015), Cell, vol. 163: pp. 759-771.
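The PAM and cleavage-position rules described above for S. pyogenes Cas9 can be sketched in Python as follows. This is a simplified illustration: the function name, the 20-nucleotide protospacer length default, and the example sequence are assumptions made here, only the strand given as input is scanned, and no off-target assessment is attempted.

```python
import re


def find_spcas9_sites(dna, protospacer_len=20):
    """Scan one DNA strand (5'->3') for 5'-NGG PAM sites preceded by a
    full-length protospacer, and report the predicted blunt cut position
    3 nucleotides upstream (5' side) of the PAM."""
    dna = dna.upper()
    sites = []
    # A lookahead is used so that overlapping PAM candidates are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start < protospacer_len:
            continue  # not enough sequence upstream for a protospacer
        sites.append({
            "protospacer": dna[pam_start - protospacer_len:pam_start],
            "pam": match.group(1),
            "pam_start": pam_start,
            # The cut falls between index cut_index - 1 and cut_index.
            "cut_index": pam_start - 3,
        })
    return sites


# Made-up sequence: a 20-nt protospacer, a TGG PAM, and trailing bases.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for site in find_spcas9_sites(example):
    print(site)
```

A fuller treatment would also scan the reverse complement strand and would use a different PAM pattern and an offset, staggered cut model for Cpf1.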

[0316] Various CRISPR-associated (Cas) genes or proteins can be used in the methods of this disclosure, and the selection of the Cas protein will depend on the specific conditions of the method. Examples of specific Cas proteins include class II systems such as Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, Cas10, Cpf1, C2C1, or C2C3. In some embodiments, the Cas protein, e.g., the Cas9 protein, may be derived from any of various prokaryotic species. In some embodiments, a specific Cas protein, e.g., a specific Cas9 protein, is selected to recognize a specific protospacer-adjacent motif (PAM) sequence. In some embodiments, the targeting moiety includes an enzyme, e.g., a sequence-targeting polypeptide such as Cas9. In certain embodiments, the Cas protein, e.g., the Cas9 protein, may be obtained from bacteria or archaea or synthesized using known methods. In certain embodiments, the Cas protein may be derived from Gram-positive or Gram-negative bacteria. In certain embodiments, the Cas protein may be derived from Streptococcus (e.g., S. pyogenes, S. thermophilus), Cryptococcus, Corynebacterium, Haemophilus, Eubacterium, Pasteurella, Prevotella, Veillonella, or Marinobacter. In some embodiments, two or more different Cas proteins, or nucleic acids encoding two or more Cas proteins, may be introduced into cells, fertilized eggs, embryos, or animals to enable, for example, the recognition and modification of sites containing the same, similar, or different PAM motifs. In some embodiments, the Cas protein is modified to deactivate the nuclease, e.g., nuclease-deficient Cas9, and to recruit transcription activators or repressors, e.g., the ω-subunit of E. coli Pol, VP64, the activation domain of p65, KRAB, or SID4X, to induce epigenetic modifications, e.g., histone acetyltransferases, histone methyltransferases and demethylases, DNA methyltransferases, and enzymes that play a role in DNA demethylation (for example, enzymes of the TET family, which catalyze the oxidation of 5-methylcytosine to 5-hydroxymethylcytosine and higher oxidative derivatives).

[0317] For gene editing, CRISPR arrays can be designed to contain one or more guide RNA sequences corresponding to a desired target DNA sequence; see, for example, Cong et al. (2013), Science, vol. 339: pp. 819-823; Ran et al. (2013), Nature Protocols, vol. 8: pp. 2281-2308. For DNA cleavage to occur, Cas9 requires a gRNA sequence of at least approximately 16 or 17 nucleotides; for Cpf1, a gRNA sequence of at least approximately 16 nucleotides is required to achieve a detectable DNA cleavage.

[0318] Wild-type Cas9 generates double-strand breaks (DSBs) at specific DNA sequences targeted by gRNAs, but several CRISPR endonucleases with modified functionality are available. For example, “nickase” type Cas9 generates only single-strand breaks; catalytically inactive Cas9 ("dCas9") does not cleave target DNA but interferes with transcription through steric hindrance. dCas9 can be further fused with heterologous effectors to repress (CRISPRi) or activate (CRISPRa) the expression of target genes. For example, Cas9 can be fused to a transcriptional silencer (e.g., a KRAB domain) or a transcriptional activator (e.g., a dCas9-VP64 fusion). Catalytically inactive Cas9 (dCas9) fused to a FokI nuclease ("dCas9-FokI") can be used to generate DSBs at target sequences homologous to two gRNAs. See, for example, the numerous CRISPR/Cas9 plasmids disclosed in and publicly available from the Addgene repository (Addgene, 75 Sidney St., Suite 550A, Cambridge, MA 02139; addgene.org/crispr/). A “double nickase” Cas9, which introduces two separate double-strand breaks, each directed by a different guide RNA, is described by Ran et al. (2013), Cell, vol. 154: pp. 1380-1389, as achieving more precise genome editing.

[0319] CRISPR technology for editing eukaryotic genes is disclosed in U.S. Patent Applications Publication Nos. 2016 / 0138008A1 and US2015 / 0344912A1, as well as U.S. Patents Nos. 8,697,359, 8,771,945, 8,945,839, 8,999,641, 8,993,233, 8,895,308, 8,865,406, 8,889,418, 8,871,445, 8,889,356, 8,932,814, 8,795,965, and 8,906,616. The Cpf1 endonuclease, along with its corresponding guide RNA and PAM site, is disclosed in U.S. Patent Application Publication No. 2016 / 0208243A1.

[0320] In some embodiments, the desired genome modification involves homologous recombination, in which one or more double-strand DNA breaks in a target nucleotide sequence are generated by an RNA-guided nuclease and guide RNA, and the breaks are repaired using a homologous recombination mechanism ("homology-directed repair"). In such embodiments, a donor template encoding a desired nucleotide sequence to be inserted into or knocked into the double-strand break is provided to the cell or subject; examples of suitable templates include single-strand DNA templates and double-strand DNA templates (e.g., linked to polypeptides described herein). Generally, donor templates encoding nucleotide changes over a region of less than about 50 nucleotides are provided in the form of single-strand DNA; larger donor templates (e.g., more than 100 nucleotides) are often provided as double-strand DNA plasmids. In some embodiments, the donor template is provided to the cell or subject in an amount sufficient to achieve the desired homology-directed repair but does not persist in the cell or subject after a given period (e.g., after one or more cell division cycles). In some embodiments, the donor template has a core nucleotide sequence that differs from the target nucleotide sequence (e.g., the homologous endogenous genomic region) by at least 1, at least 5, at least 10, at least 20, at least 30, at least 40, at least 50, or more nucleotides. This core sequence is flanked by “homology arms,” i.e., regions exhibiting high sequence identity with the target nucleotide sequence; in embodiments, the high-identity regions contain at least 10, at least 50, at least 100, at least 150, at least 200, at least 300, at least 400, at least 500, at least 600, at least 750, or at least 1000 nucleotides on each side of the core sequence. In some embodiments where the donor template is in the form of single-stranded DNA, the core sequence is flanked by homology arms containing at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, or at least 100 nucleotides on each side of the core sequence. In embodiments where the donor template is in the form of double-stranded DNA, the core sequence is flanked by homology arms containing at least 500, at least 600, at least 700, at least 800, at least 900, or at least 1000 nucleotides on each side of the core sequence. In one embodiment, two separate double-strand breaks are introduced into the target nucleotide sequence of the cell or subject using a "double nickase" Cas9 (see Ran et al. (2013), Cell, vol. 154: pp. 1380-1389), followed by delivery of the donor template.
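The layout of a donor template, a core sequence flanked on each side by homology arms, can be illustrated with a minimal Python sketch. The function name, its parameters, and the toy sequences are hypothetical; the arm length is passed in by the caller and would in practice be chosen from the ranges given above for single-stranded or double-stranded templates.

```python
def build_donor_template(genome, cut_index, core, arm_len):
    """Assemble left arm + core sequence + right arm around a cut site.

    genome    : target strand, 5'->3'
    cut_index : position of the break (between cut_index - 1 and cut_index)
    core      : sequence to be inserted or knocked in at the break
    arm_len   : homology arm length on each side of the core
    """
    if cut_index - arm_len < 0 or cut_index + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the requested arm length")
    left_arm = genome[cut_index - arm_len:cut_index]
    right_arm = genome[cut_index:cut_index + arm_len]
    return left_arm + core + right_arm


# Toy example: 5-nt arms around a break in the middle of a 20-nt sequence.
print(build_donor_template("A" * 10 + "C" * 10, 10, "GGG", 5))
```

This sketch only concatenates sequences; it does not model PAM disruption, strand choice, or persistence of the template in the cell.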

[0321] In some embodiments, the composition comprises a gRNA and a targeted nuclease, e.g., Cas9, such as wild-type Cas9, nickase Cas9 (e.g., Cas9 D10A), dead Cas9 (dCas9), eSpCas9, Cpf1, C2C1, or C2C3, or a nucleic acid encoding such a nuclease, linked to a polypeptide described herein. The selection of the nuclease and gRNA is determined by whether the targeted mutation is a deletion, substitution, or addition of nucleotides in the targeted sequence. A fusion of a catalytically inactive endonuclease, such as dead Cas9 (dCas9, e.g., D10A; H840A), to all or part (e.g., a bioactive portion) of one or more effector domains (e.g., epigenome editors including, but not limited to, DNMT3a, DNMT3L, DNMT3b, KRAB domain, Tet1, p300, VP64, and fusions of the above) creates a chimeric protein that can be directed to a specific DNA site by one or more guiding sequences or DNA recognition elements (e.g., sgRNA, zinc finger arrays, TAL arrays, peptide nucleic acids, etc., as described herein) in order to modulate the activity and/or expression of one or more target nucleic acid sequences (e.g., to methylate or demethylate the DNA sequences).

[0322] As used herein, a “bioactive portion of an effector domain” is a portion (e.g., a “minimal” or “core” domain) that maintains the function of the effector domain (e.g., completely, partially, or minimally). In some embodiments, a fusion of dCas9 with all or part of one or more effector domains of an epigenetic modifier (e.g., enzymes having a role in DNA methylation or DNA demethylation, such as DNMT3a, DNMT3b, DNMT3L, DNMT inhibitors, combinations thereof, TET family enzymes, protein acetyltransferases or deacetylases, dCas9-DNMT3a/3L, dCas9-DNMT3a/3L/KRAB, dCas9/VP64, etc.) is linked to a polypeptide to produce a chimeric protein useful in the methods described herein. Accordingly, in some embodiments, a nucleic acid encoding a dCas9-methylase fusion is linked to a polypeptide and administered to a subject in need thereof, together with a site-specific gRNA or antisense DNA oligonucleotide that targets the fusion to an anchor sequence (such as a CTCF-binding motif), thereby reducing the affinity or ability of the anchor sequence to bind a nucleating protein. In other embodiments, a nucleic acid encoding a dCas9-enzyme fusion is linked to a polypeptide and administered to a subject in need thereof, together with a site-specific gRNA or antisense DNA oligonucleotide that targets the fusion to an anchor sequence (such as a CTCF-binding motif), thereby increasing the affinity or ability of the anchor sequence to bind a nucleating protein. In some embodiments, all or part of the effector domain of one or more methyltransferases, or of enzymes associated with demethylation, is fused with an inactive nuclease, such as dCas9, and linked to a polypeptide. Exemplary dCas9 fusion methods and compositions applicable to the methods and compositions described herein are known and are described, for example, in Kearns et al., Functional annotation of native enhancers with a Cas9-histone demethylase fusion. Nature Methods 12: pp. 401-403 (2015); and McDonald et al., Reprogrammable CRISPR/Cas9-based system for inducing site-specific DNA methylation. Biology Open 2016: doi:10.1242/bio.019067.

[0323] In other embodiments, effector domains (all or bioactive portions) of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more methyltransferases or enzymes having a role in DNA demethylation are fused with dCas9 and linked to a polypeptide. The chimeric proteins described herein may also include linkers described herein, e.g., amino acid linkers. In some embodiments, the linker includes two or more amino acids, e.g., one or more GS sequences. In some embodiments, the fusion of Cas9 (e.g., dCas9) and two or more effector domains (e.g., of DNA methylases or enzymes having a role in DNA demethylation) includes one or more intervening linkers (e.g., GS linkers) between the domains and is linked to a polypeptide. In some embodiments, dCas9 is fused with multiple effector domains (e.g., 2 to 5, e.g., 2, 3, 4, or 5) including intervening linkers, and then linked to a polypeptide.

[0324] In some embodiments, the targeting portion includes one or more components of the CRISPR system described above.

[0325] For example, in some embodiments, the targeting portion includes a gRNA having a targeting domain that hybridizes to a nucleic acid containing the target anchor sequence and/or is at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to the complement of the nucleic acid containing the target anchor sequence. In some embodiments, the gRNA is a site-specific gRNA whose targeting domain does not hybridize to at least one nucleic acid containing a non-target anchor sequence. In some embodiments, the site-specific gRNA includes a sequence of structure I:
(I) X-Y-Z
wherein X and Z are 5' and 3' site-specific targeting sequences for the target CTCF-binding motif, respectively, and Y is selected from:
(a) an RNA sequence complementary to the sequence of Sequence ID No. 1;
(b) an RNA sequence that is at least 75%, 80%, 85%, 90%, or 95% identical to the RNA sequence complementary to Sequence ID No. 1;
(c) an RNA sequence complementary to the sequence of Sequence ID No. 1, having at least 1, 2, 3, 4, or 5 but fewer than 15, 12, or 10 nucleotide additions, substitutions, or deletions;
(d) an RNA sequence complementary to the sequence of Sequence ID No. 2;
(e) an RNA sequence that is at least 75%, 80%, 85%, 90%, or 95% identical to the RNA sequence complementary to Sequence ID No. 2; and
(f) an RNA sequence complementary to the sequence of Sequence ID No. 2, having at least 1, 2, 3, 4, or 5 but fewer than 15, 12, or 10 nucleotide additions, substitutions, or deletions.

[0326] In some embodiments, X and Z are each 2 to 50 nucleotides long, for example, 2 to 20, 2 to 10, or 2 to 5 nucleotides long.

[0327] Some embodiments describe compositions or methods comprising gRNAs that specifically target CTCF-binding motifs associated with oncogenes, tumor suppressors, or diseases associated with nucleotide repeats. See, for example, CTCFBSDB 2.0: Database for CTCF Binding Motifs and Genome Organization.

[0328] In some embodiments, a pharmaceutical composition comprising the guide RNA described herein is provided.

[0329] In some embodiments, the methods described herein include a method for delivering one or more of the above-described CRISPR system components to a subject, for example, to the nucleus of a cell or tissue of the subject, by linking such components to a polypeptide described herein.

[0330] Connective nucleating molecule. In some embodiments, the targeting portion includes a connective nucleating molecule, a nucleic acid encoding the connective nucleating molecule, or a combination thereof. In some embodiments, an anchor sequence-mediated linkage is mediated by a first connective nucleating molecule bound to a first anchor sequence, a second connective nucleating molecule bound to a non-contiguous second anchor sequence, and an association between the first and second connective nucleating molecules. In some embodiments, the connective nucleating molecule can disrupt the binding of an endogenous connective nucleating molecule to its binding site, for example, by competitive binding.

[0331] The connective nucleating molecule may be, for example, CTCF, cohesin, USF1, YY1, TATA box-binding protein-associated factor 3 (TAF3), ZNF143 binding motif, or another polypeptide that promotes the formation of anchor sequence-mediated linkages. The connective nucleating molecule may be an endogenous polypeptide or other protein, for example, a transcription factor, for example, autoimmune regulator (AIRE), another factor, for example, X-inactivation specific transcript (XIST), or an engineered polypeptide that is engineered to recognize a specific DNA sequence of interest, for example, one having a zinc finger, leucine zipper, or bHLH domain for sequence recognition. The connective nucleating molecule can modulate DNA interactions within or around anchor sequence-mediated linkages. For example, the connective nucleating molecule can recruit other factors to the anchor sequence that alter the formation or disruption of the anchor sequence-mediated linkage.

[0332] The connective nucleating molecule may also have a dimerization domain for homodimerization or heterodimerization. For example, one or more endogenous and engineered connective nucleating molecules may interact to form an anchor sequence-mediated linkage. In some embodiments, the connective nucleating molecule is engineered to further include a stabilization domain, e.g., an aggregation interaction domain, for stabilizing the anchor sequence-mediated linkage. In some embodiments, the connective nucleating molecule is engineered to bind to a target sequence, e.g., its target sequence binding affinity is modulated. In some embodiments, a connective nucleating molecule is selected or engineered to have a selected binding affinity to the anchor sequence within the anchor sequence-mediated linkage.

[0333] Connective nucleating molecules and their corresponding anchor sequences can be identified by using cells that carry inactivating mutations in CTCF together with Chromosome Conformation Capture (3C)-based methods, such as Hi-C or high-throughput sequencing, to examine topologically associated domains, e.g., topological interactions between distal DNA regions or loci, in the absence of CTCF. Long-range DNA interactions can also be identified. Further analysis may include ChIA-PET analysis using a bait such as cohesin, YY1, USF1, or a ZNF143 binding motif, and mass spectrometry (MS) to identify complexes that associate with the bait.

[0334] In some embodiments, one or more connective nucleating molecules have a binding affinity to an anchor sequence that is higher or lower than a reference value, for example, the binding affinity to the anchor sequence in the absence of modification.

[0335] In some embodiments, a connective nucleating molecule has its interaction with the anchor sequence-mediated linkage altered, for example, by modulation of its binding affinity to the anchor sequence within the anchor sequence-mediated linkage.

[0336] In some embodiments,

[0337] Heterologous moieties
In some embodiments, the compositions, agents, and/or fusion molecules described herein may include one or more heterologous moieties. A heterologous moiety may be an effector (e.g., a drug or small molecule), a tag (e.g., a fluorophore or a photosensitive agent such as KillerRed), or any of the editing or targeting moieties described herein.

[0338] In some embodiments, a heterologous moiety may be linked to a membrane translocation polypeptide described herein. In some embodiments, a membrane translocation polypeptide described herein is linked to one or more heterologous moieties.

[0339] In one aspect, the disclosure includes a cell or tissue comprising any one of the heterologous moieties described herein.

[0340] In another aspect, the disclosure includes a pharmaceutical composition comprising a heterologous moiety described herein.

[0341] In another aspect, the disclosure includes a method of modulating gene expression by administering a composition comprising a heterologous moiety described herein.

[0342] In one aspect, the heterologous moiety is any of the targeting moieties that modulate the two-dimensional structure of chromatin (i.e., modulate the structure of chromatin in a manner that changes its two-dimensional representation).

[0343] In one embodiment, the heterologous moiety is a small molecule (e.g., a peptidomimetic or small organic molecule having a molecular weight less than 2000 daltons), a peptide or polypeptide (e.g., a non-ABXnC polypeptide, e.g., an antibody or an antigen-binding fragment thereof), a nucleic acid (e.g., siRNA, mRNA, RNA, DNA, modified DNA or RNA, an antisense DNA oligonucleotide, antisense RNA, ribozyme, a therapeutic mRNA encoding a protein), a nanoparticle, an aptamer, or a drug with poor PK/PD.

[0344] In some embodiments, the heterologous moiety can be cleaved from the polypeptide by specific proteolysis or enzymatic cleavage (e.g., by TEV protease, thrombin, factor Xa or enteropeptidase) (e.g., after administration).

[0345] Effector moiety
The heterologous moiety may be an effector moiety having effector activity. The effector moiety can modulate a biological activity, for example, by increasing or decreasing enzyme activity, gene expression, cell signaling, or cell or organ function. Effector activity may also include binding a regulatory protein to modulate the activity of that regulator, such as transcription or translation. Effector activity may also include activator or inhibitor (or "negative effector") functions as described herein. For example, the heterologous moiety may induce enzyme activity by increasing the affinity of an enzyme for its substrate; for example, fructose 2,6-bisphosphate activates phosphofructokinase 1 and increases the rate of glycolysis in response to insulin. In another example, the heterologous moiety may inhibit substrate binding to a receptor and inhibit its activation; for example, naltrexone and naloxone bind to opioid receptors without activating them and block the receptors' ability to bind opioids. Effector activity may also include modulating the stability/degradation of proteins and/or the stability/degradation of transcripts. For example, proteins can be marked for degradation by polypeptide cofactors such as ubiquitin. In another example, the heterologous moiety inhibits enzyme activity by blocking the active site of the enzyme. For instance, methotrexate is a structural analog of tetrahydrofolate, a coenzyme for the dihydrofolate reductase enzyme, that binds to dihydrofolate reductase 1000 times more tightly than the native substrate and inhibits nucleotide base synthesis.

[0346] In some embodiments, the composition includes a targeting moiety (e.g., gRNA, membrane translocation polypeptide) operably linked to an effector moiety that binds to an anchor sequence and modulates the formation of a linkage mediated by the anchor sequence.

[0347] In some embodiments, the effector moiety is a chemical, for example, a chemical that modifies a cytosine (C) or an adenine (A) (e.g., sodium bisulfite, ammonium bisulfite). In some embodiments, the effector moiety has enzymatic activity (e.g., a methyltransferase, demethylase, nuclease (e.g., Cas9), or deaminase). In some embodiments, the effector moiety sterically hinders the formation of anchor sequence-mediated linkages (e.g., a membrane-permeable polypeptide combined with a nanoparticle, defined here as 1-100 nm).

[0348] The effector moiety having effector activity may be any one of the small molecules, peptides, nucleic acids, nanoparticles, aptamers, and drugs with poor PK/PD described herein.

[0349] Negative effector moiety
In some embodiments, the effector is an inhibitor or a “negative effector.” In the context of a negative effector moiety that modulates the formation of an anchor sequence-mediated linkage, in some embodiments, the negative effector moiety is characterized in that dimerization of the endogenous nucleating polypeptide is reduced when the negative effector moiety is present compared to when it is absent. For example, in some embodiments, the negative effector moiety is, or includes, a variant of the dimerization domain of the endogenous nucleating polypeptide, or a dimerizing portion thereof.

[0350] Dominant-negative connective nucleating molecules
For example, in certain embodiments, anchor sequence-mediated linkage is altered (e.g., disrupted) by using a dominant-negative effector, such as a protein that recognizes and binds to an anchor sequence (e.g., a CTCF-binding motif) but has an inactive (e.g., mutated) dimerization domain, such as a dimerization domain that cannot form a functional anchor sequence-mediated linkage. For example, the zinc finger domain of CTCF can be altered so that it binds to a specific anchor sequence (by adding zinc fingers that recognize adjacent nucleic acids), while the homodimerization domain is altered to prevent interaction between the engineered CTCF and the endogenous form of CTCF. The DNA encoding this protein can then be administered to a subject in need thereof.

[0351] In some embodiments, the composition includes a synthetic connective nucleating molecule having a selected binding affinity to an anchor sequence within a target anchor sequence-mediated linkage. The binding affinity may be at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more, higher or lower than the affinity of an endogenous connective nucleating molecule that associates with the target anchor sequence. The synthetic connective nucleating molecule may have 30-90%, 30-85%, 30-80%, 30-70%, 50-80%, or 50-90% amino acid sequence identity to the endogenous connective nucleating molecule. The connective nucleating molecule can disrupt the binding of the endogenous connective nucleating molecule to its anchor sequence by competitive binding or other means. In some further embodiments, the connective nucleating molecule is engineered to bind to a novel anchor sequence within the anchor sequence-mediated linkage.

[0352] In some embodiments, the dominant-negative effector has a domain that recognizes a specific DNA sequence (e.g., an anchor sequence adjacent to a sequence-constituting sequence, a CTCF anchor sequence) and a second domain that provides steric presence near the anchor sequence. The second domain may include a dominant-negative connective nucleating molecule or a fragment thereof, a polypeptide that interferes with sequence recognition by the connective nucleating molecule (e.g., the amino acid backbone of a peptide/nucleic acid, or PNA), a nucleic acid sequence linked to a small molecule that confers steric interference, or any other combination of a DNA recognition element and a steric blocker.

[0353] Epigenetic modifiers
In some embodiments, the heterologous moiety is an epigenetic modifier. Useful epigenetic modifiers in the methods and compositions described herein include, for example, agents that affect DNA methylation, histone acetylation, and RNA-related silencing. In some embodiments, the methods described herein involve sequence-specific targeting of an epigenetic enzyme (e.g., an enzyme that generates or removes epigenetic marks, e.g., acetylation and/or methylation). Exemplary epigenetic enzymes that can be targeted to anchor sequences using the CRISPR method described herein include DNA methylases (e.g., DNMT3a, DNMT3b, DNMTL), DNA demethylases (e.g., the TET family), histone methyltransferases, histone deacetylases (e.g., HDAC1, HDAC2, HDAC3), sirtuins 1, 2, 3, 4, 5, 6, or 7, lysine-specific histone demethylase 1 (LSD1), histone-lysine-N-methyltransferase (Setdb1), euchromatic histone-lysine-N-methyltransferase 2 (G9a), histone-lysine-N-methyltransferase (SUV39H1), enhancer of zeste homolog 2 (EZH2), viral lysine methyltransferase (vSET), histone methyltransferase (SET2), and protein-lysine-N-methyltransferase (SMYD2). Examples of such epigenetic modifiers are described, for example, in de Groote et al., Nuc. Acids Res. (2012): pp. 1-18.

[0354] In some embodiments, epigenetic modifiers useful herein include the structures listed in Koferle et al., Genome Medicine, Vol. 7, No. 59 (2015): pp. 1-3 (for example, Table 1), which is incorporated herein by reference.

[0355] Tagging or monitoring moiety
The heterologous moiety may be a tag for labeling or monitoring the polypeptide or another heterologous moiety linked to the polypeptide described herein. The tagging or monitoring moiety may be removable by chemical agents or by enzymatic cleavage such as proteolysis or intein splicing. Affinity tags may be useful for purifying tagged polypeptides using affinity techniques. Some examples include chitin-binding protein (CBP), maltose-binding protein (MBP), glutathione-S-transferase (GST), and poly(His) tags. Solubilization tags may be useful for recombinant proteins expressed in chaperone-deficient species such as E. coli, to assist in the proper folding of proteins and protect them from precipitation. Some examples include thioredoxin (TRX) and poly(NANP). The tagging or monitoring moiety may include a photosensitive tag, e.g., a fluorescent tag. Fluorescent tags are useful for visualization. GFP and its variants are some examples of commonly used fluorescent tags. Protein tags may allow specific enzymatic modifications (such as biotinylation by biotin ligases) or chemical modifications (such as reaction with FlAsH-EDT2 for fluorescence imaging). Tagging or monitoring moieties are often combined in order to link a protein to multiple other components. The tagging or monitoring moiety can also be removed by specific proteolysis or enzymatic cleavage (e.g., by TEV protease, thrombin, factor Xa, or enteropeptidase).

[0356] The tagging or monitoring portion may be a small molecule, peptide, nucleic acid, nanoparticle, aptamer, or other drug.

[0357] Nucleic acids
The heterologous moiety may be a nucleic acid. The nucleic acid heterologous moiety may include, but is not limited to, DNA, RNA, and artificial nucleic acids. The nucleic acid may include, but is not limited to, genomic DNA, cDNA, modified DNA, antisense DNA oligonucleotides, tRNA, mRNA, rRNA, modified RNA, miRNA, gRNA, and siRNA or other RNAi molecules. In one embodiment, the nucleic acid is an siRNA for targeting a gene expression product. In another embodiment, the nucleic acid includes one or more nucleoside analogs described herein.

[0358] The nucleic acid may have a length of approximately 2 to 5000 nt, approximately 10 to 100 nt, approximately 50 to 150 nt, approximately 100 to 200 nt, approximately 150 to 250 nt, approximately 200 to 300 nt, approximately 250 to 350 nt, approximately 300 to 500 nt, approximately 10 to 1000 nt, approximately 50 to 1000 nt, approximately 100 to 1000 nt, approximately 1000 to 2000 nt, approximately 2000 to 3000 nt, approximately 3000 to 4000 nt, approximately 4000 to 5000 nt, or any range in between.

[0359] Some examples of nucleic acids include, but are not limited to, nucleic acids that hybridize to endogenous genes (e.g., gRNA or antisense ssDNA as described elsewhere herein), nucleic acids that hybridize to exogenous nucleic acids such as viral DNA or RNA, nucleic acids that hybridize to RNA, nucleic acids that interfere with gene transcription, nucleic acids that interfere with RNA translation, nucleic acids that stabilize or destabilize RNA by targeting degradation, nucleic acids that interfere with DNA or RNA binding factors by interfering with their expression or function, nucleic acids that are linked to intracellular proteins and modulate their function, and nucleic acids that are linked to intracellular protein complexes and modulate their function.

[0360] This disclosure contemplates the use of RNA therapeutic agents (e.g., modified RNA) as heterologous moieties in the compositions described herein. For example, a modified mRNA encoding a protein of interest can be linked to a polypeptide described herein and expressed in vivo in a subject.

[0361] In some embodiments, the modified RNA or DNA oligonucleotides linked to the polypeptides described herein have modified nucleosides or nucleotides. Such modifications are publicly known and are described, for example, in WO2012/019168. Further modifications are described, for example, in WO2015038892; WO2015089511; WO2015196130; WO2015196118; and WO2015196128A2.

[0362] In some embodiments, the modified RNA or DNA oligonucleotides linked to the polypeptides described herein have one or more terminal modifications, e.g., a 5' cap structure and/or a poly-A tail (e.g., 100-200 nucleotides long). The 5' cap structure can be selected from the group consisting of Cap0, Cap1, ARCA, inosine, N1-methyl-guanosine, 2'-fluoro-guanosine, 7-deaza-guanosine, 8-oxo-guanosine, 2-amino-guanosine, LNA-guanosine, and 2-azido-guanosine. In some cases, the modified RNA also contains a 5' UTR containing at least one Kozak sequence, and a 3' UTR. Such modifications are known and described, for example, in WO2012135805 and WO2013052523. Further terminal modifications are described, for example, in WO2014164253, WO2016011306, WO2012045075, and WO2014093924.
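
The exemplary poly-A tail length of 100-200 nucleotides can be checked in silico with a short sketch. It assumes the tail is the uninterrupted run of adenosines at the 3' end of the sequence string; the function names and the example sequence are hypothetical.

```python
def poly_a_tail_length(rna: str) -> int:
    """Length of the uninterrupted run of A residues at the 3' end."""
    seq = rna.upper()
    return len(seq) - len(seq.rstrip("A"))


def tail_in_range(rna: str, low: int = 100, high: int = 200) -> bool:
    """True if the 3' poly-A run is between `low` and `high` nt, inclusive."""
    return low <= poly_a_tail_length(rna) <= high


# Placeholder transcript: a short made-up body followed by a 120-nt poly-A tail.
example = "GCCACCAUGGCU" + "A" * 120
```

Here `tail_in_range(example)` is true, while the same body with a 50-nucleotide tail would fall outside the exemplary range.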

[0363] A chimeric enzyme for synthesizing capped RNA molecules (e.g., modified mRNA) which may contain at least one chemical modification is described in WO2014028429.

[0364] In some embodiments, modified mRNA can be cyclized or concatemerized to produce a translation-competent molecule that facilitates the interaction between poly(A) binding proteins and 5' end binding proteins. The mechanism of cyclization or concatemerization may occur through at least three different routes: 1) chemical, 2) enzymatic, and 3) ribozyme-catalyzed. The newly formed 5'-/3'- bond may be intramolecular or intermolecular. Such modifications are described, for example, in WO2013151736.

[0365] Methods for preparing and purifying modified RNA are publicly known and disclosed in the art. For example, modified RNA can be prepared using only in vitro transcription (IVT) enzymatic synthesis. Methods for preparing IVT polynucleotides are publicly known in the art and are described in WO2013151666, WO2013151668, WO2013151663, WO2013151669, WO2013151670, WO2013151664, WO2013151665, WO2013151671, WO2013151672, WO2013151667, and WO2013151736. Purification methods include: contacting a sample with a surface linked to multiple thymidines or their derivatives and/or multiple uracils or their derivatives (poly-T/U) under conditions that allow RNA transcripts to bind to the surface, and eluting the purified RNA transcript from the surface (WO2014152031); using ion (e.g., anion) exchange chromatography to enable the separation of longer RNAs, up to 10,000 nucleotides in length, in a scalable manner (WO2014144767); and purifying RNA transcripts containing poly-A tails by subjecting the modified mRNA sample to DNase treatment (WO2014152030).

[0366] Modified RNAs encoding proteins are publicly known in the fields of human diseases, antibodies, viruses, and various in vivo settings, and are disclosed, for example, in Table 6 of International Publications WO2013151666, WO2013151668, WO2013151663, WO2013151669, WO2013151670, WO2013151664, WO2013151665, and WO2013151736; Tables 6 and 7 of International Publication WO2013151672; Tables 6, 178, and 179 of International Publication WO2013151671; and Tables 6, 185, and 186 of International Publication WO2013151667. Any of the above can be synthesized as an IVT polynucleotide, a chimeric polynucleotide, or a cyclic polynucleotide and linked to the polypeptide described herein, each of which may contain one or more modified nucleotides or terminal modifications.

[0367] Peptide oligonucleotide conjugates
The heterologous moiety may be a peptide oligonucleotide conjugate. A peptide oligonucleotide conjugate includes a chimeric molecule (such as a peptide/nucleic acid mixture) containing a nucleic acid portion linked to a peptide portion. In some embodiments, the peptide portion may include any peptide or protein portion described herein. In some embodiments, the nucleic acid portion may include any nucleic acid or oligonucleotide described herein, such as DNA or RNA, or modified DNA or RNA.

[0368] In some embodiments, the peptide oligonucleotide conjugate includes a peptide antisense oligonucleotide conjugate. In some embodiments, the peptide oligonucleotide conjugate is a synthetic oligonucleotide having a chemically modified backbone. The peptide oligonucleotide conjugate can bind to both DNA and RNA targets in a sequence-specific manner to form a double-stranded structure. When bound to a double-stranded DNA (dsDNA) target, the peptide oligonucleotide conjugate can replace one of the DNA strands in the double helix by strand invasion to form a triple-stranded structure, with the displaced DNA strand potentially existing as a single-stranded D-loop.

[0369] In some embodiments, the peptide oligonucleotide conjugate may be targeted in a cell-specific and/or tissue-specific manner (e.g., it can be directly conjugated to oligonucleotides, peptides, and/or proteins).

[0370] In some embodiments, the peptide oligonucleotide conjugate includes a membrane-permeable migration polypeptide, for example, a membrane-permeable migration polypeptide described elsewhere in this specification.

[0371] Solid-phase synthesis of several peptide oligonucleotide conjugates is described, for example, by Williams et al., 2010, Curr. Protoc. Nucleic Acid Chem., Chapter 4, Unit 4.41, doi: 10.1002/0471142700.nc0441s42. The synthesis and characterization of peptide oligonucleotide conjugates, as well as the stepwise solid-phase synthesis of peptide oligonucleotide conjugates on novel solid-phase supports, are described, for example, in Bongardt et al., Innovation Perspect. Solid Phase Synth. Comb. Libr., Collect. Pap., Int. Symp., 5th, 1999, pp. 267-270; and Antopolsky et al., Helv. Chim. Acta, 1999, Vol. 82, pp. 2130-2140.

[0372] Nanoparticles
The heterologous moiety may be a nanoparticle. Nanoparticles include inorganic materials having a size of approximately 1 to approximately 1000 nanometers, approximately 1 to approximately 500 nanometers, approximately 1 to approximately 100 nm, approximately 30 nm to approximately 200 nm, approximately 50 nm to approximately 300 nm, approximately 75 nm to approximately 200 nm, approximately 100 nm to approximately 200 nm, or any range in between. Nanoparticles have a composite structure with nanoscale dimensions. In some embodiments, the nanoparticles are typically spherical, but different forms are also possible depending on the nanoparticle composition. The portion of the nanoparticle that comes into contact with the external environment is generally identified as the surface of the nanoparticle. In the nanoparticles described herein, the size limit can be restricted to two dimensions, and thus the nanoparticles include a composite structure having a diameter of approximately 1 to approximately 1000 nm, where the specific diameter depends on the nanoparticle composition and on the intended use of the nanoparticles according to the experimental design. For example, nanoparticles used in therapeutic applications typically have a size of approximately 200 nm or less.

[0373] Further desirable properties of nanoparticles, such as surface charge and steric stabilization, may vary depending on the specific application of interest. Exemplary properties that may be desirable in clinical applications, such as cancer treatment, are described in Davis et al., Nature, 2008, Vol. 7, pp. 771-782; Duncan, Nature, 2006, Vol. 6, pp. 688-701; and Allen, Nature, 2002, Vol. 2, pp. 750-763, each of which is incorporated herein by reference in its entirety. Further characteristics can be identified by those skilled in the art upon reading this disclosure. The dimensions and properties of nanoparticles can be detected by techniques known in the art. Exemplary techniques for detecting particle dimensions include, but are not limited to, dynamic light scattering (DLS) and various microscopy techniques such as transmission electron microscopy (TEM) and atomic force microscopy (AFM). Exemplary techniques for detecting particle morphology include, but are not limited to, TEM and AFM. Exemplary techniques for detecting the surface charge of nanoparticles include, but are not limited to, zeta potential methods. Further techniques suitable for detecting other chemical properties include 1H, 11B, 13C, and 19F NMR, UV/Vis and infrared/Raman spectroscopy, fluorescence spectroscopy (when nanoparticles are used in combination with fluorescent labeling), and further techniques that can be identified by those skilled in the art.

[0374] Small molecules
In one embodiment, the targeting moiety is a small molecule that alters one or more DNA methylation sites within an anchor sequence-mediated linkage, for example, by mutating a methylated cytosine to thymine. For example, a bisulfite compound, such as sodium bisulfite, ammonium bisulfite, or another bisulfite salt, can be used to alter one or more DNA methylation sites, for example, by changing a nucleotide sequence from cytosine to thymine.

[0375] The heterologous moiety may be a small molecule. Examples of small molecules include, but are not limited to, small peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, synthetic polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, and generally organic and inorganic compounds (including heteroorganic and organometallic compounds) having a molecular weight of less than approximately 5,000 grams/mol, for example, organic or inorganic compounds having a molecular weight of less than approximately 2,000 grams/mol, less than approximately 1,000 grams/mol, or less than approximately 500 grams/mol, as well as salts, esters, and other pharmaceutically acceptable forms of such compounds. Small molecules may also include, but are not limited to, neurotransmitters, hormones, drugs, toxins, viral or microbial particles, synthetic molecules, and agonists or antagonists.
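
The nested molecular weight cutoffs in this paragraph can be expressed as a simple lookup. The sketch below is illustrative only: the function name is hypothetical, and it treats each "less than approximately" limit as a strict inequality on an exact value.

```python
from typing import Optional

# Molecular weight cutoffs named in the text, in g/mol, narrowest first.
MW_CUTOFFS = (500, 1000, 2000, 5000)


def narrowest_cutoff(molecular_weight: float) -> Optional[int]:
    """Return the smallest stated cutoff the compound falls under,
    or None if it is at or above the broadest cutoff."""
    if molecular_weight <= 0:
        raise ValueError("molecular weight must be positive")
    for cutoff in MW_CUTOFFS:
        if molecular_weight < cutoff:
            return cutoff
    return None
```

A compound of about 180 g/mol falls under the narrowest 500 g/mol cutoff, whereas one of 1,500 g/mol falls only under the 2,000 and 5,000 g/mol cutoffs.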

[0376] Suitable examples of small molecules include those described in "The Pharmacological Basis of Therapeutics," Goodman and Gilman, McGraw-Hill, New York, NY (1996), 9th edition, under the sections: Drugs Acting at Synaptic and Neuroeffector Junctional Sites; Drugs Acting on the Central Nervous System; Autacoids: Drug Therapy of Inflammation; Water, Salts and Ions; Drugs Affecting Renal Function and Electrolyte Metabolism; Cardiovascular Drugs; Drugs Affecting Gastrointestinal Function; Drugs Affecting Uterine Motility; Chemotherapy of Parasitic Infections; Chemotherapy of Microbial Diseases; Chemotherapy of Neoplastic Diseases; Drugs Used for Immunosuppression; Drugs Acting on Blood-Forming Organs; Hormones and Hormone Antagonists; Vitamins; Dermatology; and Toxicology (all incorporated herein by reference). Some examples of small molecules include, but are not limited to, prion drugs such as tacrolimus, ubiquitin ligase or HECT ligase inhibitors such as heclin, histone modifiers such as sodium butyrate, enzyme inhibitors such as 5-azacytidine, anthracyclines such as doxorubicin, beta-lactams such as penicillin, antibacterial agents, chemotherapeutic agents, antiviral agents, modulators derived from other organisms such as VP64, and drugs with insufficient bioavailability, such as chemotherapeutic agents with deficient pharmacokinetics.

[0377] In some embodiments, the small molecule is an epigenetic modifier, such as those described, for example, by de Groote et al., Nuc. Acids Res. (2012): pp. 1-18. Exemplary small molecule epigenetic modifiers are described, for example, in Lu et al., J. Biomolecular Screening, Vol. 17, No. 5 (2012): pp. 555-571, e.g., Table 1 or 2, incorporated herein by reference. In some embodiments, the epigenetic modifier includes vorinostat and romidepsin. In some embodiments, the epigenetic modifier includes class I, II, III, and/or IV histone deacetylase (HDAC) inhibitors. In some embodiments, the epigenetic modifier includes SirT1 activators. In some embodiments, the epigenetic modifier includes garcinol, Lys-CoA, C646, (+)-JQ1, I-BET, BIC1, MS120, DZNep, UNC0321, EPZ004777, AZ505, AMI-1, pyrazoleamide 7b, benzo[d]imidazole 17b, acylated dapsone derivatives (e.g., PRMT1), methylstat, 4,4'-dicarboxy-2,2'-bipyridine, SID 85736331, hydroxamate analog 8, tranylcypromine, bisguanidine and biguanide polyamine analogs, UNC669, Vidaza, decitabine, sodium phenyl butyrate (SDB), lipoic acid (LA), quercetin, valproic acid, hydralazine, bactrim, green tea extract (e.g., epigallocatechin gallate (EGCG)), curcumin, sulforaphane, and/or allicin/diallyl disulfide. In some embodiments, the epigenetic modifier is an inhibitor of DNA methylation, e.g., a DNA methyltransferase inhibitor (e.g., 5-azacitidine and/or decitabine). In some embodiments, the epigenetic modifier alters histone modifications, e.g., histone acetylation, histone methylation, histone sumoylation, and/or histone phosphorylation. In some embodiments, the epigenetic modifier is a histone deacetylase inhibitor (e.g., vorinostat and/or trichostatin A).

[0378] In some embodiments, the small molecule is a pharmaceutically active agent. In one embodiment, the small molecule is an inhibitor of a metabolic activity or component. Useful classes of pharmaceutically active agents include, but are not limited to, antibiotics, anti-inflammatory agents, angiogenic or vasoactive agents, growth factors, and chemotherapeutic (anti-neoplastic) agents (e.g., tumor suppressors). One or a combination of molecules from the categories and examples described herein, or from Orme-Johnson 2007, Methods Cell Biol. 2007; Vol. 80: pp. 813-826, can be used. In one embodiment, the present disclosure includes a composition comprising an antibiotic, an anti-inflammatory agent, an angiogenic or vasoactive agent, a growth factor, or a chemotherapeutic agent.

[0379] Oligonucleotide aptamers
The heterologous moiety may be an aptamer moiety. The aptamer moiety is an oligonucleotide aptamer or a peptide aptamer. Oligonucleotide aptamers are single-stranded DNA or RNA (ssDNA or ssRNA) molecules that can bind to pre-selected targets, such as proteins and peptides, with high affinity and specificity.

[0380] Oligonucleotide aptamers are nucleic acid species that can be engineered, through repeated rounds of in vitro selection or, equivalently, SELEX (Systematic Evolution of Ligands by Exponential Enrichment), to bind to a variety of molecular targets, including small molecules, proteins, nucleic acids, and even cells, tissues, and organisms. Aptamers provide discriminative molecular recognition and can be produced by chemical synthesis. Furthermore, aptamers possess desirable storage properties and exhibit little to no immunogenicity in therapeutic applications.

[0381] Both DNA and RNA aptamers exhibit robust binding affinity to a variety of targets. For example, DNA and RNA aptamers have been selected for lysozyme, thrombin, human immunodeficiency virus trans-acting response element (HIV TAR), hemin, interferon-gamma, vascular endothelial growth factor (VEGF), prostate-specific antigen (PSA), dopamine, and the non-classical oncogene heat shock factor 1 (HSF1).

[0382] Diagnostic techniques for aptamer-based plasma protein profiling include aptamer plasma proteomics. This technique will enable future multi-biomarker protein measurements that can help in the diagnostic differentiation between diseased and healthy states.

[0383] Peptide aptamers
The heterologous moiety may be a peptide aptamer. A peptide aptamer has one (or more) short, variable peptide domains and includes a low molecular weight peptide of 12-14 kDa. Peptide aptamers can be designed to specifically bind to and interfere with intracellular protein-protein interactions.

[0384] Peptide aptamers are artificial proteins selected or engineered to bind to specific target molecules. These proteins contain one or more peptide loops of variable sequence. They are typically isolated from combinatorial libraries and subsequently improved by site-directed mutagenesis or by repeated rounds of variable region mutagenesis and selection. In vivo, peptide aptamers can bind to cellular protein targets and exert biological effects, such as interference with the normal interactions between their target molecules and other proteins. In particular, a variable peptide aptamer loop attached to a transcription factor binding domain is screened against a target protein attached to a transcription factor activating domain. In this selection strategy, the in vivo binding of a peptide aptamer to its target is detected as the expression of a downstream yeast marker gene. Such experiments identify the specific proteins to which the aptamers bind and the protein interactions that the aptamers disrupt to cause a phenotype. Furthermore, peptide aptamers derivatized with appropriate functional regions can induce specific post-translational modifications of their target proteins or alter the intracellular localization of the target.

[0385] Peptide aptamers can also recognize targets in vitro. They are useful as an alternative to antibodies in biosensors and are used to detect active isoforms of proteins from populations containing both inactive and active protein forms. Derivatives known as tadpoles, in which the "head" of the peptide aptamer is covalently linked to a unique sequence double-stranded DNA "tail," allow for the quantification of rare target molecules in a mixture by PCR (e.g., using quantitative real-time polymerase chain reaction) of its DNA tail.

[0386] Peptide aptamer selection can be performed using different systems, but currently, the most widely used is the yeast two-hybrid system. Peptide aptamers can also be selected from combinatorial peptide libraries constructed using phage display and other surface display techniques such as mRNA display, ribosome display, bacterial display, and yeast display. These experimental procedures are also known as biopanning. Among the peptides obtained from biopanning, mimotopes can be considered a type of peptide aptamer. All peptides panned from combinatorial peptide libraries are stored in a special database called MimoDB.

[0387] Drugs: In one embodiment, the heterologous moiety is a drug having undesirable pharmacokinetic or pharmacodynamic (PK/PD) parameters. By linking the heterologous moiety to a polypeptide, at least one PK/PD parameter of the heterologous moiety, such as targeting, absorption, or transport, can be improved, or at least one undesirable PK/PD parameter, such as diffusion to off-target sites or toxic metabolism, can be mitigated. For example, the specificity of a polypeptide described herein can improve targeting when the polypeptide is linked to a drug with poor targeting/transport, such as doxorubicin or a beta-lactam such as penicillin. In another example, a polypeptide described herein can improve the minimum dose when linked to a drug with poor absorption properties, such as insulin or human growth hormone. In yet another example, a polypeptide described herein can improve the maximum dose when linked to a drug with toxic metabolic properties, such as acetaminophen at high doses.

[0388] Membrane-permeable translocation polypeptides: In one embodiment, the composition comprises a polypeptide described herein having properties that enable transmembrane translocation, so that the composition is delivered intracellularly, for example, to a target site within a subject, independently of endosomes. In some embodiments, the targeting moiety comprises a membrane-permeable translocation polypeptide.

[0389] In one embodiment, the disclosure includes cells or tissues comprising any one of the membrane-permeable migration polypeptides described herein.

[0390] In another aspect, this disclosure includes a pharmaceutical composition comprising a membrane-permeable migration polypeptide as described herein.

[0391] In another aspect, the disclosure includes a method for modulating gene expression by administering a composition comprising a membrane-permeable polypeptide as described herein.

[0392] In one embodiment, the disclosure includes a method for altering gene expression or altering anchor sequence-mediated linkage using a membrane-permeable polypeptide. In some embodiments, the membrane-permeable polypeptide is the targeting moiety. In some embodiments, the membrane-permeable polypeptide is a delivery agent that assists in the delivery of the targeting moiety described herein. The target site may be intracellular, for example, in the cytoplasm or within an organelle (e.g., in the nucleus, such as a target DNA sequence or chromatin structure). The therapeutic compositions described herein may have further advantageous properties such as improved targeting, absorption, or transport, or reduced off-target activity, toxic metabolism, or toxic efflux.

[0393] In one embodiment, the composition comprises at least one membrane-permeable polypeptide, each containing at least one sequence of ABXnC (wherein A is selected from a hydrophobic amino acid or an amide-containing backbone having a nucleic acid side chain, e.g., aminoethylglycine; B and C may be the same or different and are independently selected from arginine, asparagine, glutamine, lysine, and their analogues; each X is independently a hydrophobic amino acid or an amide-containing backbone having a nucleic acid side chain, e.g., aminoethylglycine; and n is an integer from 1 to 4).

[0394] Hydrophobic amino acids include, but are not limited to, amino acids having hydrophobic side chains, such as alanine (ala, A), valine (val, V), isoleucine (ile, I), leucine (leu, L), methionine (met, M), phenylalanine (phe, F), tyrosine (tyr, Y), tryptophan (trp, W), and their analogues.

[0395] Examples of amino acid analogs include, but are not limited to, D-amino acids, amino acids lacking a hydrogen atom on the α-carbon such as dehydroalanine, metabolic intermediates such as ornithine and citrulline, non-alpha amino acids such as β-alanine, γ-aminobutyric acid, and 4-aminobenzoic acid, twin α-carbon amino acids such as cystathionine, lanthionine, djenkolic acid, and diaminopimelic acid, and any others known in the art.
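As an illustration only, the ABXnC pattern can be expressed as a regular expression over one-letter amino acid codes. The sketch below is not part of the disclosure: it assumes standard amino acids only (analogues and aminoethylglycine units with nucleic acid side chains are not modelled), and the names `HYDROPHOBIC`, `POLAR_BC`, and `find_abxnc` are hypothetical.

```python
import re

# Hydrophobic residues listed in the text (one-letter codes); analogues are not modelled.
HYDROPHOBIC = "AVILMFYW"
# Residues allowed at positions B and C: arginine, asparagine, glutamine, lysine.
POLAR_BC = "RNQK"

# A, then B, then one to four X residues, then C (A and X hydrophobic).
ABXNC = re.compile(
    f"[{HYDROPHOBIC}][{POLAR_BC}][{HYDROPHOBIC}]{{1,4}}[{POLAR_BC}]"
)


def find_abxnc(peptide: str) -> list[tuple[int, str]]:
    """Return (start index, matched text) for each non-overlapping ABXnC match."""
    return [(m.start(), m.group()) for m in ABXNC.finditer(peptide.upper())]


if __name__ == "__main__":
    # Two LRVK units separated by a single isoleucine spacer.
    print(find_abxnc("LRVKILRVK"))
```

Because the scan is non-overlapping and left to right, a hydrophobic spacer between two units is simply skipped rather than absorbed into either match.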

[0396] Nucleic acid side chains: In one embodiment, the membrane-permeable polypeptide comprises one or more nucleic acid side chains linked to an amide backbone. Each amino acid unit in the polypeptide comprises an amide bond and its corresponding side chain. One or more amino acid units in the membrane-permeable polypeptide have an amide-containing backbone, such as aminoethylglycine, that resembles a peptide backbone but carries a nucleic acid side chain in place of an amino acid side chain. Peptide nucleic acids (PNAs) are known to hybridize to complementary DNA and RNA with higher affinity than their oligonucleotide counterparts. This characteristic of PNAs not only allows the polypeptides of this disclosure to form stable hybrids through their nucleic acid side chains; at the same time, the neutral backbone and hydrophobic side chains provide hydrophobic units within the polypeptide.

[0397] Examples of nucleic acid side chains include, but are not limited to, purine or pyrimidine side chains such as adenine, cytosine, guanine, thymine, and uracil. In one embodiment, the nucleic acid side chain includes a nucleoside analog described herein.

[0398] Size: In some embodiments, the membrane-permeable polypeptide has a size in the range of about 5 to about 500 amino acid units, for example, 5 to 400, 5 to 300, 5 to 250, 5 to 200, 5 to 150, or 5 to 100 amino acid units. The polypeptide may have a length in the range of about 5 to about 50 amino acids, about 5 to about 40 amino acids, about 5 to about 30 amino acids, about 5 to about 25 amino acids, or any other range. In one embodiment, the polypeptide has a length of about 10 amino acids. In another embodiment, the polypeptide has a length of about 15 amino acids. In another embodiment, the polypeptide has a length of about 20 amino acids. In another embodiment, the polypeptide has a length of about 25 amino acids. In another embodiment, the polypeptide has a length of about 30 amino acids.

[0399] A membrane-permeable polypeptide may have more than one ABXnC sequence within its length. Each ABXnC sequence may be separated from another ABXnC sequence by one or more amino acids. In one embodiment, the polypeptide repeats the ABXnC sequence, with the sequences separated by one or more amino acid units. In another embodiment, the polypeptide contains at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more, e.g., 2-20, 2-10, 2-5) ABXnC sequences, separated by one or more amino acid units. In another embodiment, the ABXnC sequences are separated by one (or more) hydrophobic amino acids, such as isoleucine or leucine.

[0400] The composition may include a plurality of ABXnC sequences that are the same or different. In one embodiment, at least two of the plurality are identical in sequence and/or length. In one embodiment, at least two of the plurality differ in sequence and/or length. In one embodiment, the composition contains a plurality of ABXnC sequences, where at least two of the plurality are the same and at least two are different. In one embodiment, the ABXnC sequences in a membrane-permeable polypeptide are not identical in sequence, length, or any combination thereof.

[0401] Production of proteins or polypeptides: Methods for producing the therapeutic proteins or polypeptides described herein are routine in the art. See generally Smales and James (eds.), Therapeutic Proteins: Methods and Protocols (Methods in Molecular Biology), Humana Press (2005); see also Crommelin, Sindelar, and Meibohm (eds.), Pharmaceutical Biotechnology: Fundamentals and Applications, Springer (2013).

[0402] Proteins or polypeptides of a composition can be biochemically synthesized using standard solid-phase techniques. Such methods include exclusive solid-phase synthesis, partial solid-phase synthesis, fragment condensation, and classical solution synthesis. These methods can be used when the peptide is relatively short (e.g., 10 kDa) and/or cannot be produced by recombinant techniques (i.e., is not encoded by a nucleic acid sequence) and therefore involves different chemistry.

[0403] Solid-phase synthesis procedures are well known in the art and are described in John Morrow Stewart and Janis Dillaha Young, Solid Phase Peptide Synthesis, 2nd edition, Pierce Chemical Company, 1984, and further in Coin, I. et al., Nature Protocols, Vol. 2: pp. 3247-3256, 2007.

[0404] For longer peptides, recombinant methods can be used. Methods for producing recombinant therapeutic polypeptides are routine in the art. See generally Smales and James (eds.), Therapeutic Proteins: Methods and Protocols (Methods in Molecular Biology), Humana Press (2005); see also Crommelin, Sindelar, and Meibohm (eds.), Pharmaceutical Biotechnology: Fundamentals and Applications, Springer (2013).

[0405] Exemplary methods for producing therapeutic pharmaceutical proteins or polypeptides include expression in mammalian cells, but recombinant proteins can also be produced using insect cells, yeast, bacteria, or other cells under the control of an appropriate promoter. Mammalian expression vectors may include nontranscribed elements such as an origin of replication, a suitable promoter, and other 5' or 3' flanking nontranscribed sequences, as well as 5' or 3' nontranslated sequences such as necessary ribosome binding sites, polyadenylation sites, splice donor and acceptor sites, and termination sequences. DNA sequences derived from the SV40 viral genome, e.g., the SV40 origin, early promoter, splice, and polyadenylation sites, can be used to provide other genetic elements necessary for the expression of heterologous DNA sequences. Appropriate cloning and expression vectors for use with bacterial, fungal, yeast, and mammalian cell hosts are described in Green and Sambrook, Molecular Cloning: A Laboratory Manual (4th edition), Cold Spring Harbor Laboratory Press (2012).

[0406] If a large amount of protein or polypeptide is desired, it can be produced using techniques such as those described by Brian Bray, Nature Reviews Drug Discovery, Vol. 2: pp. 587-593, 2003; and Weissbach and Weissbach, 1988, Methods for Plant Molecular Biology, Academic Press, NY, Part VIII, pp. 421-463.

[0407] Recombinant proteins can be expressed and produced using various mammalian cell culture systems. Examples of mammalian expression systems include CHO cells, COS cells, HeLa, and BHK cell lines. The process of host cell culture for the production of protein therapeutics is described in Zhou and Kantardjieff (eds.), Mammalian Cell Cultures for Biologics Manufacturing (Advances in Biochemical Engineering/Biotechnology), Springer (2014). The compositions described herein may include vectors encoding recombinant proteins, such as viral vectors, e.g., lentiviral vectors. Vectors, e.g., viral vectors, contain nucleic acids encoding recombinant proteins.

[0408] The purification of protein therapeutics is described in Franks, Protein Biotechnology: Isolation, Characterization, and Stabilization, Humana Press (2013); and Cutler, Protein Purification Protocols (Methods in Molecular Biology), Humana Press (2010).

[0409] The formulation of protein therapeutic agents is described in Meyer (ed.), Therapeutic Protein Drug Products: Practical Approach to Formulation in the Laboratory, Manufacturing, and the Clinic, Woodhead Publishing Series (2012).

[0410] Linkers: The proteins or polypeptides described herein may also include a linker. In some embodiments, a protein described herein, for example one comprising a first polypeptide domain containing Cas or a modified Cas protein and a second polypeptide domain containing a polypeptide having DNA methyltransferase activity [or demethylating or deaminase activity], has a linker between the first and second polypeptide domains. In one embodiment, one or more polypeptides described herein are linked using a linker. The linker may be a chemical bond, for example, one or more covalent or non-covalent bonds. In some embodiments, the linker is a peptide linker (e.g., a non-ABXnC peptide). Such linkers may be 2 to 30 amino acids long or longer. The linkers include flexible, rigid, or cleavable linkers as described herein.

[0411] The most commonly used flexible linkers have sequences consisting primarily of stretches of Gly and Ser residues ("GS" linkers). Flexible linkers are useful for linking domains that require a certain degree of movement or interaction and may contain small, nonpolar (e.g., Gly) or polar (e.g., Ser or Thr) amino acids. Incorporation of Ser or Thr can also maintain the stability of the linker in aqueous solutions by forming hydrogen bonds with water molecules, thus reducing undesirable interactions between the linker and the protein moiety.

[0412] Rigid linkers are useful for maintaining a fixed distance between domains and preserving their independent functions. Rigid linkers can also be useful when spatial separation of domains is important for preserving the stability or bioactivity of one or more components in the fusion. Rigid linkers may have an alpha-helical structure or a Pro-rich sequence, (XP)n (wherein X represents any amino acid, preferably Ala, Lys, or Glu).

[0413] Cleavable linkers can release free functional domains in vivo. In some embodiments, linkers can be cleaved under specific conditions, such as the presence of a reducing agent or protease. In vivo cleavable linkers may utilize the reversible nature of disulfide bonds. One example involves a thrombin-sensitive sequence (e.g., PRS) between two Cys residues. In vitro thrombin treatment of CPRSC results in cleavage of the thrombin-sensitive sequence, while the reversible disulfide bond remains intact. Such linkers are known and are described, for example, in Chen et al., 2013, Fusion Protein Linkers: Property, Design and Functionality. Adv Drug Deliv Rev. 65(10): 1357-1369. Cleavage of linkers in fusions can also be carried out in vivo by proteases that are expressed under pathological conditions (e.g., cancer or inflammation), expressed in specific cells or tissues, or confined within specific cellular compartments. The specificity of many proteases provides slower cleavage of linkers within confined compartments.

[0414] Examples of linking molecules include hydrophobic linkers, such as negatively charged sulfonate groups; lipids, such as poly(-CH2-) hydrocarbon chains, e.g., polyethylene glycol (PEG) groups, their unsaturated variants, their hydroxylated variants, their amidated or otherwise N-containing variants, and non-carbon linkers; carbohydrate linkers; phosphodiester linkers; or other molecules that can covalently link two or more polypeptides. Non-covalent linkers are also included, such as hydrophobic lipid globules to which a polypeptide is linked, for example, through a hydrophobic region of the polypeptide or a hydrophobic extension of the polypeptide, such as a series of residues rich in leucine, isoleucine, valine, or perhaps also alanine, phenylalanine, or even tyrosine, methionine, glycine, or other hydrophobic residues. Polypeptides can also be linked using charge-based chemistry, such that a positively charged portion of one polypeptide is linked to a negatively charged portion of another polypeptide or nucleic acid.
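The two peptide linker families described above can be written out as short sequence builders. This is a minimal sketch for illustration: the `GGGGS` repeat unit is an assumption (the text only states that flexible linkers consist primarily of Gly and Ser), the restriction of X to Ala, Lys, or Glu follows the stated preference rather than a requirement, and the function names are hypothetical.

```python
def flexible_linker(repeats: int, unit: str = "GGGGS") -> str:
    """Build a Gly/Ser-rich flexible linker by repeating a unit.

    The default unit GGGGS is an assumed, commonly used example.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    if set(unit) - set("GST"):
        raise ValueError("flexible unit should contain only Gly, Ser, or Thr")
    return unit * repeats


def rigid_xp_linker(n: int, x: str = "A") -> str:
    """Build a Pro-rich rigid linker of the form (XP)n.

    X is limited here to the preferred residues Ala (A), Lys (K), Glu (E).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if x not in ("A", "K", "E"):
        raise ValueError("x should be one of A, K, E in this sketch")
    return (x + "P") * n


def fuse(domain_1: str, linker: str, domain_2: str) -> str:
    """Join two domain sequences N-to-C through a peptide linker."""
    return domain_1 + linker + domain_2


if __name__ == "__main__":
    print(flexible_linker(3))       # 15-residue Gly/Ser linker
    print(rigid_xp_linker(4, "E"))  # 8-residue (EP)4 linker
```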

[0415] Polypeptide polymerization: The composition may include, for example, a plurality (two or more) of membrane-permeable translocation polypeptides linked together via a linker as described herein.

[0416] The composition may contain multiple membrane-permeable migration polypeptides, some of which are the same or different. In one embodiment, at least two of the plurality are identical in sequence and / or length. In one embodiment, at least two of the plurality are different in sequence and / or length. In one embodiment, the composition contains multiple polypeptides, at least two of which are the same and at least two of which are different. In one embodiment, the polypeptides in the composition are not identical in sequence, length, or any combination thereof.

[0417] The composition comprises a membrane permeable polypeptide linked to another membrane permeable polypeptide by a linker, for example. In some embodiments, the composition comprises two or more polypeptides linked by a linker. In some embodiments, the composition comprises three or more polypeptides linked by a linker. In some embodiments, the composition comprises four or more polypeptides linked by a linker. In some embodiments, the composition comprises five or more polypeptides linked by a linker. The linker may be a chemical bond, for example, one or more covalent or non-covalent bonds, for example, a flexible, rigid or cleavable peptide linker. Such a linker may be 2 to 30 amino acids or longer. Further linkers are described in more detail elsewhere in this specification and are also applicable.

[0418] In one embodiment, two or more membrane-permeable polypeptides are linked via peptide bonds, for example, the carboxyl terminus of one polypeptide is bonded to the amino terminus of another polypeptide. In another embodiment, one or more amino acids on one polypeptide are linked to one or more amino acids on another polypeptide via disulfide bonds between cysteine side chains, for example. In yet another embodiment, one or more amino acids on one polypeptide are linked to the carboxyl or amino terminus of another polypeptide, for example, to create a branched polypeptide.

[0419] In another embodiment, one or more nucleic acid side chains on one membrane-permeable polypeptide interact with one or more amino acid side chains on another membrane-permeable polypeptide, such as arginine, which forms a pseudo-pair with guanosine. In another embodiment, one or more nucleic acid side chains on one membrane-permeable polypeptide interact with one or more nucleic acid side chains on another membrane-permeable polypeptide, such as via hydrogen bonds. In another embodiment, multiple membrane-permeable polypeptides interact so that their nucleic acid side chains align to create a specific sequence. For example, a carboxy-terminal nucleic acid side chain from one polypeptide interacts with an amino-terminal nucleic acid side chain from another polypeptide to create a pseudo-5' to pseudo-3' nucleotide sequence. In another example, polypeptides are linked to one or more polypeptides via amino acids and/or termini on each polypeptide, and their respective nucleic acid side chains align to create a pseudo-5' to pseudo-3' nucleotide sequence. The pseudo-sequence can bind to selected target sequences, such as anchor sequences for anchor sequence-mediated linkage, e.g., CTCF-binding motifs, cohesin-binding motifs, USF1-binding motifs, YY1-binding motifs, TATA boxes, ZNF143-binding motifs, or transcriptional regulatory sequences, e.g., enhancing or silencing sequences. The pseudo-sequence can interfere with factor binding and transcription by binding to target sequences. The pseudo-sequence can interfere with gene expression by hybridizing with nucleic acid sequences such as mRNA.

[0420] In one embodiment, membrane-permeable polypeptides are linked together to form a pseudo-5' to pseudo-3' nucleotide sequence that binds an anchor sequence recognized by a nucleating protein, i.e., a protein that binds the anchor sequence with sufficient avidity to form an anchor sequence-mediated linkage, e.g., a loop, in which a physical interaction or binding between one linkage nucleating molecule-anchor sequence and another linkage nucleating molecule-anchor sequence forms a two-dimensional DNA structure. Examples of anchor sequences include, but are not limited to, CTCF-binding motifs, e.g., the CTCF-binding motif or consensus sequence N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C) (SEQ ID NO: 1) (wherein N is any nucleotide). The linked polypeptides may also create a pseudo-5' to pseudo-3' nucleotide sequence that binds a CTCF-binding motif or consensus sequence in the opposite orientation, for example, (G/A/C)(C/T/A)(C/G)(G/A)(C/A/T)GG(G/T)(G/A)GA(C/T/A)(C/G)(A/T/G)CC(G/A/T)N(T/C/G)N (SEQ ID NO: 2).

[0421] The membrane-permeable translocation polypeptides described herein can be polymerized, for example, by linking two or more polypeptides, using standard ligation techniques. Such methods include general native chemical ligation strategies (Siman, P. and Brik, A. Org. Biomol. Chem. 2012, Vol. 10: pp. 5684-5697; Kent, S. B. H. Chem. Soc. Rev. 2009, Vol. 38: pp. 338-351; and Hackenberger, C. P. R. and Schwarzer, D. Angew. Chem., Int. Ed. 2008, Vol. 47: pp. 10030-10074), click modification protocols (Tasdelen, M. A.; Yagci, Y. Angew. Chem., Int. Ed. 2013, Vol. 52: pp. 5930-5938; Palomo, J. M. Org. Biomol. Chem. 2012, Vol. 10: pp. 9309-9318; Eldijk, M. B.; van Hest, J. C. M. Angew. Chem., Int. Ed. 2011, Vol. 50: pp. 8806-8827; and Lallana, E.; Riguera, R.; Fernandez-Megia, E. Angew. Chem., Int. Ed. 2011, Vol. 50: pp. 8794-8804), and bioorthogonal reactions (King, M.; Wagner, A. Bioconjugate Chem. 2014, Vol. 25: pp. 825-839; Lang, K.; Chin, J. W. Chem. Rev. 2014, Vol. 114: pp. 4764-4806; Patterson, D. M.; Nazarova, L. A.; Prescher, J. A. ACS Chem. Biol. 2014, Vol. 9: pp. 592-605; Lang, K.; Chin, J. W. ACS Chem. Biol. 2014, Vol. 9: pp. 16-20; Takaoka, Y.; Ojida, A.; Hamachi, I. Angew. Chem., Int. Ed. 2013, Vol. 52: pp. 4088-4106; Debets, M. F.; van Hest, J. C. M.; Rutjes, F. P. J. T. Org. Biomol. Chem. 2013, Vol. 11: pp. 6439-6455; and Ramil, C. P.; Lin, Q. Chem. Commun. 2013, Vol. 49: pp. 11007-11022).
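To make the two degenerate consensus sequences concrete, the sketch below parses the bracket notation into per-position sets of allowed bases, converts them to a regular expression, and checks how the two sequences relate. It is an illustration under stated assumptions: the helper names are hypothetical, and the example 20-mer is an arbitrary sequence constructed to satisfy SEQ ID NO: 1, not a sequence taken from the disclosure. Parsed this way, SEQ ID NO: 2 as written is the position-by-position reversal of SEQ ID NO: 1 (bases are not complemented).

```python
import re

SEQ_ID_1 = ("N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG"
            "(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C)")
SEQ_ID_2 = ("(G/A/C)(C/T/A)(C/G)(G/A)(C/A/T)GG(G/T)(G/A)GA"
            "(C/T/A)(C/G)(A/T/G)CC(G/A/T)N(T/C/G)N")


def parse_degenerate(motif: str) -> list[frozenset[str]]:
    """Turn notation such as N(T/C/G)CC into one set of allowed bases per position."""
    compact = motif.replace(" ", "")
    positions = []
    i = 0
    while i < len(compact):
        ch = compact[i]
        if ch == "(":
            close = compact.index(")", i)
            positions.append(frozenset(compact[i + 1:close].split("/")))
            i = close + 1
        elif ch == "N":
            positions.append(frozenset("ACGT"))  # N is any nucleotide
            i += 1
        else:
            positions.append(frozenset(ch))
            i += 1
    return positions


def to_regex(positions: list[frozenset[str]]) -> re.Pattern:
    """Compile the per-position sets into a regular expression."""
    parts = []
    for allowed in positions:
        bases = "".join(sorted(allowed))
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return re.compile("".join(parts))


if __name__ == "__main__":
    forward = parse_degenerate(SEQ_ID_1)
    reverse = parse_degenerate(SEQ_ID_2)
    example = "ATAGCCACCAGGGGGCGCCG"  # hypothetical 20-mer built to fit SEQ ID NO: 1
    print(len(forward), forward[::-1] == reverse)
    print(bool(to_regex(forward).fullmatch(example)))
    print(bool(to_regex(reverse).fullmatch(example[::-1])))
```

The same `to_regex` pattern can be used with `finditer` to scan a longer DNA string for candidate motif positions.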

[0422] In some embodiments, the ordering of membrane-permeable polypeptides in a polymer may be specific or random, for example, if the polypeptides are not identical. For example, the polypeptides described herein may be polymerized by template-driven synthesis, or the polymerization may be ordered by physical constraint or hybridization to a template, such as DNA, protein, or hybrid DNA-protein. In one embodiment, a template, such as a DNA sequence, specifically hybridizes to the polypeptides described herein. Polypeptides are linked to other polypeptides by one of the methods described herein, for example, general chemical ligation, and the selection of which polypeptides are linked may be constrained by their ability to hybridize to a template. Thus, specific polypeptide polymers can be produced by their ability to specifically hybridize to a template.

[0423] In some embodiments, the order of membrane-permeable polypeptides in the polymer is determined by the chemical ligation strategy used. In one embodiment, chemical ligation techniques such as click reactions and bioorthogonal reactions dictate which polypeptides are ligated, as the chemical ligation strategy requires specific entities to react in order for the ligation to proceed. For example, one polypeptide may be labeled with phenyl azide and another with cyclooctyne. The cyclooctyne and phenyl azide react to ligate the two polypeptides.

[0424] Hybridization: In embodiments where the membrane-permeable polypeptide includes nucleic acid side chains, it can interact with nucleic acids. In one embodiment, one or more nucleic acid side chains on the polypeptide hybridize with a nucleic acid sequence, e.g., DNA such as genomic DNA, or RNA such as siRNA or mRNA molecules. One or more nucleic acid side chains on the polypeptide specifically hybridize with one or more nucleic acid residues in a target nucleic acid sequence. In one embodiment, polypeptides are linked together, and the nucleic acid side chains hybridize with a nucleic acid sequence (e.g., a gene locus, mRNA, or an anchor sequence for anchor sequence-mediated linkage, e.g., a CTCF-binding motif, cohesin-binding motif, USF1-binding motif, YY1-binding motif, TATA box, ZNF143-binding motif, etc.).

[0425] A nucleic acid side chain or pseudo-sequence of nucleic acid side chains can hybridize to a target nucleic acid sequence that is substantially complementary, or 100%, 95%, 90%, 85%, 80%, 75%, or 70% complementary, to the nucleic acid side chain or pseudo-sequence. Hybridization of nucleic acid side chains or pseudo-sequences of nucleic acid side chains with target nucleic acid sequences can be carried out under preferred hybridization conditions routinely determined by an optimization procedure. Conditions such as temperature, component concentrations, hybridization and washing times, buffer components, and their pH and ionic strength may vary depending on factors such as the length and GC content of the nucleic acid side chain or pseudo-sequence and of the complementary target nucleic acid sequence. For example, when using relatively short nucleic acid side chains or pseudo-sequences of nucleic acid side chains, low-stringency conditions can be employed. Detailed conditions for hybridization can be found in publications such as Molecular Cloning: A Laboratory Manual, 4th Edition (Cold Spring Harbor Laboratory Press, 2012).
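The percent-complementarity figures above can be illustrated with a simple position-by-position count. The sketch below is an assumption-laden simplification, not a method from the disclosure: it treats both strands as equal-length strings written 5' to 3', pairs them antiparallel, counts only Watson-Crick pairs (U is treated as T), and ignores gaps and wobble pairs. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(probe: str, target: str) -> float:
    """Percent of positions that form Watson-Crick pairs.

    Both sequences are given 5' to 3' and are paired antiparallel,
    so the probe is compared against the reversed target.
    """
    if len(probe) != len(target) or not probe:
        raise ValueError("sequences must be non-empty and of equal length")
    probe_dna = probe.upper().replace("U", "T")
    target_rev = target.upper().replace("U", "T")[::-1]
    paired = sum(COMPLEMENT.get(p) == t for p, t in zip(probe_dna, target_rev))
    return 100.0 * paired / len(probe_dna)


def meets_threshold(probe: str, target: str, minimum: float = 70.0) -> bool:
    """True if complementarity is at least the chosen minimum (70% by default)."""
    return percent_complementarity(probe, target) >= minimum


if __name__ == "__main__":
    print(percent_complementarity("AAAAAAAAAA", "TTTTTTTTTG"))
    print(meets_threshold("AAAAAAAAAA", "TTTTTTTTTG", minimum=95.0))
```

The score says nothing about whether hybridization actually occurs under given conditions; as the text notes, that depends on length, GC content, temperature, and buffer composition.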

[0426] Heterologous moieties linked to polypeptides: The composition may include heterologous moieties as described herein, linked to a membrane-permeable polypeptide of a targeting moiety via covalent or non-covalent bonds or linkers as described herein. In one embodiment, the composition includes heterologous moieties linked to the polypeptide via peptide bonds. For example, the amino terminus of the polypeptide is linked to the heterologous moiety via a peptide bond using an optional linker. In another embodiment, the carboxyl terminus of the polypeptide is linked to the heterologous moiety.

[0427] In one embodiment, the composition comprises a membrane-permeable translocation polypeptide linked to two heterologous moieties. For example, the amino and carboxyl termini of the polypeptide are linked to heterologous moieties that may be the same or different.

[0428] In another embodiment, one or more amino acids of a membrane-permeable polypeptide are linked to a heterologous moiety via disulfide bonds between cysteine side chains, hydrogen bonds, or any other known chemical reaction. One heterologous moiety may be a biologically active effector, and the other heterologous moiety may be a ligand or antibody for targeting the composition to specific cells expressing a receptor. For example, a chemotherapeutic agent such as topotecan, a topoisomerase inhibitor, may be linked to one end of the polypeptide, and a ligand or antibody may be linked to the other end of the polypeptide to target the composition to specific cells or tissues. In another example, both heterologous moieties are biologically active effectors.

[0429] In another embodiment, multiple membrane-permeable polypeptides, which may be the same or different, are linked to a single heterologous moiety. The polypeptides may act as a coating surrounding the larger heterologous moiety, assisting its transmembrane movement. The heterologous moiety may have a molecular weight greater than about 500 g/mol (daltons), for example, an organic or inorganic compound having a molecular weight greater than about 1,000 g/mol, about 2,000 g/mol, about 3,000 g/mol, about 4,000 g/mol, or about 5,000 g/mol, including salts, esters, and other pharmaceutically acceptable forms of such compounds.

[0430] In one embodiment, the composition comprises a membrane-permeable polypeptide ligated to a heterologous moiety at one or both of its termini and another heterologous moiety ligated to another site on the polypeptide. One or both of the amino and carboxyl termini of the polypeptide are ligated to a heterologous moiety, and one or more units in the polypeptide, whether amino acid units or nucleic acid side chain units, are ligated to one or more heterologous moieties via disulfide bonds, hydrogen bonds, etc. For example, a DNA-modifying enzyme is ligated to the polypeptide, and a nucleic acid having an unmethylated CTCF-binding motif complementary to a methylated target gene is hybridized to the nucleic acid side chains of the polypeptide. Upon administration, the composition modulates transcription of the gene by targeting the genomic CTCF-binding motif. In another example, a double-stranded nucleic acid having an unmethylated CTCF-binding motif along with a gene-specific flanking sequence is ligated to the polypeptide. Upon administration, the unmethylated CTCF-binding motif acts as an alternative anchor sequence for the CTCF protein to bind. In yet another example, further heterologous moieties, such as ubiquitin and an effector, are ligated to the polypeptide. Upon administration, the composition penetrates the cell membrane, and the effector performs its function. Ubiquitin then targets the composition for degradation.

[0431] In one embodiment, the composition comprises a membrane-permeable polypeptide linked to one or more heterologous moieties via covalent bonds and another heterologous moiety linked to a nucleic acid within the polypeptide. For example, a protein synthesis inhibitor is covalently linked to the polypeptide, and siRNA or other target-specific nucleic acids are hybridized to the nucleic acid within the polypeptide. Upon administration, the siRNA targets the composition to mRNA transcripts, and the protein synthesis inhibitor and siRNA act to inhibit mRNA expression.

[0432] In some embodiments, the pharmaceutical composition comprises a membrane-permeable polypeptide linked to a gRNA comprising a sequence of structure (I): X-Y-Z, where X and Z are 5' and 3' site-specific targeting sequences, respectively, for the target CTCF-binding motif, and Y is selected from: (a) an RNA sequence complementary to SEQ ID NO: 1; (b) an RNA sequence that is at least 75%, 80%, 85%, 90%, or 95% identical to an RNA sequence complementary to SEQ ID NO: 1; (c) an RNA sequence complementary to SEQ ID NO: 1 having at least 1, 2, 3, 4, or 5 but fewer than 15, 12, or 10 nucleotide additions, substitutions, or deletions; (d) an RNA sequence complementary to SEQ ID NO: 2; (e) an RNA sequence that is at least 75%, 80%, 85%, 90%, or 95% identical to an RNA sequence complementary to SEQ ID NO: 2; and (f) an RNA sequence complementary to SEQ ID NO: 2 having at least 1, 2, 3, 4, or 5 but fewer than 15, 12, or 10 nucleotide additions, substitutions, or deletions.

[0433] In some embodiments, X and Z are each 2 to 50 nucleotides long, for example, 2 to 20, 2 to 10, or 2 to 5 nucleotides long.
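
The X-Y-Z arrangement and the flank-length limit can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the sequences are placeholders rather than SEQ ID NO: 1 or 2, and "complementary" is read here as the reverse complement written 5' to 3'.

```python
# Illustrative sketch only. All names are hypothetical and the sequences are
# placeholders; they are NOT SEQ ID NO: 1 or SEQ ID NO: 2.

# DNA base -> pairing RNA base
RNA_PAIR = {"A": "U", "C": "G", "G": "C", "T": "A"}


def rna_reverse_complement(dna: str) -> str:
    """Return the RNA strand that base-pairs with `dna`, written 5'->3'.

    Assumption: "complementary" is interpreted as the reverse complement.
    """
    return "".join(RNA_PAIR[base] for base in reversed(dna.upper()))


def assemble_xyz(x: str, y: str, z: str, min_len: int = 2, max_len: int = 50) -> str:
    """Concatenate X, Y and Z after checking that each flank is 2 to 50 nt long."""
    for name, flank in (("X", x), ("Z", z)):
        if not min_len <= len(flank) <= max_len:
            raise ValueError(
                f"{name} must be {min_len} to {max_len} nucleotides, got {len(flank)}"
            )
    return x + y + z


# Usage with placeholder sequences
y_core = rna_reverse_complement("ATGC")
guide_core = assemble_xyz("GG", y_core, "CC")
```

The narrower ranges (2 to 20, 2 to 10, 2 to 5) can be checked by passing a smaller `max_len`.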

[0434] In some embodiments, the gRNA contains a targeting sequence specific for a CTCF-binding motif associated with an oncogene, a tumor suppressor, or a disease related to nucleotide repeats.

[0435] The membrane-permeable polypeptides described herein can be linked to heterologous moieties by using standard ligation techniques, such as those described herein for polypeptide linking.

[0436] To introduce small mutations or single-point mutations, homologous recombination (HR) templates can be linked to membrane-permeable polypeptides. In one embodiment, the HR template is a single-stranded DNA (ssDNA) oligo or a plasmid. For ssDNA oligo design, an overall homology of about 100–150 bp can be used, with the mutation introduced near the center so that each homology arm is about 50–75 bp.
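
The ssDNA oligo geometry described above (a centered mutation flanked by two homology arms of 50–75 bases) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the 0-based indexing, and the repeating placeholder reference sequence are all hypothetical, and only a single-base substitution is handled.

```python
# Illustrative sketch only. The function name and the reference sequence are
# hypothetical placeholders, not taken from the disclosure.

def design_ssdna_donor(reference: str, mut_index: int, new_base: str,
                       arm_len: int = 60) -> str:
    """Build an ssDNA HR template with a single-base change at its center.

    reference : genomic sequence around the target site (5'->3')
    mut_index : 0-based position of the base to change
    new_base  : replacement base
    arm_len   : length of each homology arm (50 to 75 bases)
    """
    if not 50 <= arm_len <= 75:
        raise ValueError("arm_len should be between 50 and 75 bases")
    start = mut_index - arm_len
    end = mut_index + arm_len + 1
    if start < 0 or end > len(reference):
        raise ValueError("reference is too short for the requested homology arms")
    if reference[mut_index] == new_base:
        raise ValueError("new_base matches the reference; no mutation introduced")
    left_arm = reference[start:mut_index]
    right_arm = reference[mut_index + 1:end]
    return left_arm + new_base + right_arm


# Usage: placeholder 200-nt reference, A->G change at position 100
reference = "ACGT" * 50
donor = design_ssdna_donor(reference, mut_index=100, new_base="G", arm_len=60)
```

With arms of 50 to 75 bases, the resulting oligo is 101 to 151 bases long, in line with the roughly 100–150 bp of total homology mentioned above.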

[0437] In some embodiments, a gRNA or antisense DNA oligonucleotide that targets a target anchor sequence, for example, a CTCF-binding motif, is linked to a membrane-permeable polypeptide together with an HR template selected from: (a) a nucleotide sequence containing SEQ ID NO: 1; (b) a nucleotide sequence that is at least 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 1; (c) a nucleotide sequence containing SEQ ID NO: 1 having at least 1, 2, 3, 4, or 5 but fewer than 15, 12, or 10 nucleotide additions, substitutions, or deletions; (d) a nucleotide sequence containing SEQ ID NO: 2; (e) a nucleotide sequence that is at least 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 2; and (f) a nucleotide sequence containing SEQ ID NO: 2 having at least 1, 2, 3, 4, or 5 but fewer than 15, 12, or 10 nucleotide additions, substitutions, or deletions.
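
One way to check a candidate sequence against the percent-identity and edit-count criteria listed above is sketched below. This is a hedged illustration, not a method stated in the disclosure: the function names are hypothetical, the sequences are placeholders rather than SEQ ID NO: 1 or 2, edits are counted as the Levenshtein distance (the minimum number of additions, substitutions, or deletions), and percent identity is assumed to be computed relative to the longer sequence. Other identity definitions, such as those based on a scored alignment, would give different numbers.

```python
# Illustrative sketch only. Names are hypothetical; sequences are placeholders.

def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-nucleotide additions, substitutions or deletions
    needed to turn `a` into `b` (Levenshtein distance, dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        curr = [i]
        for j, base_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                        # deletion
                curr[j - 1] + 1,                    # addition
                prev[j - 1] + (base_a != base_b),   # substitution or match
            ))
        prev = curr
    return prev[-1]


def percent_identity(candidate: str, reference: str) -> float:
    """Assumed definition: unedited positions as a share of the longer sequence."""
    longest = max(len(candidate), len(reference))
    if longest == 0:
        raise ValueError("sequences must not both be empty")
    return 100.0 * (longest - edit_distance(candidate, reference)) / longest


def within_edit_window(candidate: str, reference: str,
                       min_edits: int = 1, max_edits_exclusive: int = 10) -> bool:
    """True if the edit count is at least `min_edits` and below the upper bound."""
    return min_edits <= edit_distance(candidate, reference) < max_edits_exclusive


# Usage with placeholder 20-nt sequences differing by one substitution
reference_seq = "ACGTACGTACGTACGTACGT"
candidate_seq = "ACGTACGTACTTACGTACGT"
identity = percent_identity(candidate_seq, reference_seq)
```

The thresholds (75% to 95% identity; at least 1 to 5 and fewer than 10 to 15 edits) are left as parameters so that any of the listed alternatives can be tested.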

[0438] The linkers described herein can be incorporated to covalently or non-covalently link a membrane-permeable polypeptide and a heterologous moiety. The linkers can be used, for example, to space the polypeptide apart from the heterologous moiety. For example, the linker can be positioned between the polypeptide and the heterologous moiety to provide molecular flexibility in the secondary and tertiary structures. In one embodiment, the linker comprises at least one of glycine, alanine, and serine to provide flexibility. In another embodiment, the linker is a hydrophobic linker, for example one that includes a negatively charged sulfonate group, a polyethylene glycol (PEG) group, or a pyrophosphate diester group. In yet another embodiment, the linker is cleavable, so that it selectively releases the heterologous moiety from the polypeptide, yet is stable enough to prevent premature cleavage.

Binding after administration

[0439] In some embodiments, the membrane-permeable polypeptides described herein are capable of forming linkages after administration, via covalent or non-covalent bonds, with other polypeptides, with heterologous moieties described herein (for example, effector molecules such as nucleic acids, proteins, peptides, or other molecules), or with other agents, such as intracellular molecules. In one embodiment, one or more amino acids of the polypeptide can link to a nucleic acid, for example through arginine forming a pseudo-pair with guanosine, through an internucleotide phosphate linkage, or through an interpolymer linkage. In some embodiments, the nucleic acid is DNA, such as genomic DNA, or RNA, such as a tRNA or mRNA molecule. In another embodiment, one or more amino acids of the polypeptide can link to a protein or peptide.

Fusion molecules

[0440] In some embodiments, the composition includes a fusion molecule, such as a fusion molecule containing a peptide or polypeptide. Those skilled in the art reading this specification will understand that the term “protein fusion” may refer to a fusion molecule containing a “protein” (or peptide or polypeptide) component. In some embodiments, the protein fusion includes one or more of the moieties described herein, for example, a nucleic acid sequence, a peptide or protein moiety, a membrane-permeable polypeptide, a targeting peptide or aptamer, or another heterologous moiety described herein.

[0441] In one embodiment, the present disclosure includes a cell or tissue containing any one of the protein fusions described herein.

[0442] In another embodiment, the present disclosure includes a pharmaceutical composition comprising a protein fusion described herein.

[0443] In another embodiment, the present disclosure includes a method for modulating gene expression by administering a composition comprising a protein fusion described herein. For example, the protein fusion may be dCas9-DNMT, dCas9-DNMT-3a-3L, dCas9-DNMT-3a-3a, dCas9-DNMT-3a-3L-3a, dCas9-DNMT-3a-3L-KRAB, dCas9-KRAB, dCas9-APOBEC, APOBEC-dCas9, dCas9-APOBEC-UGI, dCas9-UGI, UGI-dCas9-APOBEC, UGI-APOBEC-dCas9, any variant of the protein fusions described herein, or another fusion of the proteins or protein domains described herein.

[0444] Exemplary dCas9 fusion methods and compositions applicable to the methods and compositions described herein are known and are described in, for example, Kearns et al., Functional annotation of native enhancers with a Cas9-histone demethylase fusion, Nature Methods Vol. 12, pp. 401-403 (2015); and McDonald et al., Reprogrammable CRISPR/Cas9-based system for inducing site-specific DNA methylation, Biology Open 2016: doi:10.1242/bio.019067. Using methods known in the art, dCas9 can be fused to any of the various agents and/or molecules described herein; the fusion molecules thus obtained may be useful in the various disclosed methods.

[0445] In one embodiment, the disclosure includes a composition comprising a protein containing a domain that acts on DNA, for example, an enzyme domain (e.g., a nuclease domain, for example, a Cas9 domain, for example, a dCas9 domain; DNA methyltransferase, demethylase, deaminase), in combination with at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets the protein to an anchor sequence of a target anchor sequence-mediated linkage, and which is effective in altering the target anchor sequence-mediated linkage in human cells. In some embodiment...

Claims

[Claim 1] The invention as shown in the drawings.