Mutant myocilin disease models and their use

Non-human animals with humanized MYOC gene loci and CRISPR-related systems replicate glaucoma phenotypes, addressing the inadequacy of existing models by increasing MYOC expression and intraocular pressure, which facilitates evaluation of candidate treatments.

JP7883584B2 · Active · Publication Date: 2026-07-01 · REGENERON PHARMACEUTICALS INC

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Patents
Current Assignee / Owner
REGENERON PHARMACEUTICALS INC
Filing Date
2022-12-08
Publication Date
2026-07-01

AI Technical Summary

Technical Problem

Current animal models fail to accurately reproduce the glaucoma phenotype caused by MYOC toxic gain-of-function mutant proteins, which aggregate within cells and lead to trabecular meshwork stress and elevated intraocular pressure.

Method used

Development of non-human animals with humanized MYOC gene loci, including specific mutations and promoter sequences, to mimic human MYOC function and expression, integrated with CRISPR-related systems for targeted gene editing and expression enhancement.

Benefits of technology

The humanized MYOC models exhibit increased MYOC mRNA and protein expression, elevated intraocular pressure, and glaucoma-like phenotypes, providing a reliable model for evaluating glaucoma treatments and reagents.

✦ Generated by Eureka AI based on patent content.

Abstract

Provided are non-human animal genomes, non-human animal cells, and non-human animals that contain a humanized MYOC locus, as well as methods of making and using such non-human animal genomes, non-human animal cells, and non-human animals. The non-human animal cells or animals that contain a humanized MYOC locus express a human myocilin protein or a chimeric myocilin protein, a fragment of which is derived from human myocilin. Provided are methods for using such non-human animals that contain a humanized MYOC locus to evaluate the in vivo efficacy of human myocilin targeting reagents and reagents for treating glaucoma.

Description

[Technical Field]

[0001] Cross-reference to related applications: This application claims the benefit of U.S. Patent Application No. 63/287,281, filed on 8 December 2021, which is incorporated herein by reference in its entirety for all purposes.

[0002] Reference to a sequence listing submitted as a text file via EFS-Web: The sequence listing in file 588714SEQLIST.xml is 302 kilobytes in size, was created on December 1, 2022, and is incorporated herein by reference.

[Background Art]

[0003] Mutations in myocilin (MYOC) are the most common genetic cause of glaucoma, accounting for 8–10% of cases of autosomal dominant familial juvenile open-angle glaucoma and 2–3% of cases of primary open-angle glaucoma (POAG). MYOC toxic gain-of-function mutant proteins aggregate within cells, leading to trabecular meshwork (TM) stress, elevated intraocular pressure (IOP), and glaucoma. Better animal models are needed to reproduce the glaucoma phenotype caused by MYOC.

[Summary of the Invention]

[0004] The present invention provides non-human animals, non-human animal cells, and non-human animal genomes containing humanized MYOC gene loci, as well as methods for making and using such non-human animals, non-human animal cells, and non-human animal genomes. Also provided are humanized non-human animal MYOC genes, nucleic acids containing humanized non-human animal MYOC genes, targeting vectors for use in humanizing non-human animal MYOC genes, and methods for making and using such humanized MYOC genes.

[0005] In one embodiment, non-human animals, non-human animal cells, and non-human animal genomes are provided, which contain a humanized endogenous MYOC locus in their genome, wherein a region of the endogenous MYOC locus is deleted and replaced with a corresponding human MYOC sequence. In some such non-human animals, non-human animal cells, and non-human animal genomes, the region of the endogenous MYOC locus includes both MYOC exonic sequence and MYOC intronic sequence, and the corresponding human MYOC sequence includes both MYOC exonic sequence and MYOC intronic sequence.

[0006] In some such non-human animals, non-human animal cells, and non-human animal genomes, the humanized endogenous MYOC locus contains mutations associated with glaucoma. In some such non-human animals, non-human animal cells, and non-human animal genomes, the human MYOC sequence contains mutations. In some such non-human animals, non-human animal cells, and non-human animal genomes, the humanized endogenous MYOC locus contains the Y437H mutation. In some such non-human animals, non-human animal cells, and non-human animal genomes, the human MYOC sequence contains the Y437H mutation.
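
To illustrate what the Y437H designation implies at the coding-sequence level, the following minimal Python sketch converts a 1-based amino acid position into codon coordinates and applies a tyrosine-to-histidine substitution to a toy coding sequence. The function names and the toy sequence are hypothetical and are not taken from the sequence listing; the sketch relies only on the standard genetic code (TAT/TAC encode tyrosine, CAT/CAC encode histidine).

```python
# Hypothetical helper names; the toy sequence below is NOT the MYOC coding sequence.
TYR_CODONS = {"TAT", "TAC"}  # standard genetic code: tyrosine (Y)
HIS_CODONS = {"CAT", "CAC"}  # standard genetic code: histidine (H)


def codon_start(aa_position: int) -> int:
    """Return the 1-based nucleotide position of the first base of a codon."""
    if aa_position < 1:
        raise ValueError("amino acid positions are 1-based")
    return 3 * (aa_position - 1) + 1


def apply_y_to_h(cds: str, aa_position: int) -> str:
    """Change a tyrosine codon to histidine by a single T -> C substitution."""
    start = codon_start(aa_position) - 1  # convert to a 0-based string index
    codon = cds[start:start + 3].upper()
    if codon not in TYR_CODONS:
        raise ValueError(f"codon {codon!r} at position {aa_position} is not tyrosine")
    # TAT -> CAT and TAC -> CAC both change only the first base of the codon.
    return cds[:start] + "C" + cds[start + 1:]


toy_cds = "ATGGCTTACGGATAA"        # Met-Ala-Tyr-Gly-stop (toy example)
mutant = apply_y_to_h(toy_cds, 3)  # substitute the tyrosine at toy position 3
first_base_of_codon_437 = codon_start(437)
```

Under this numbering, codon 437 of a coding sequence begins at nucleotide 3 × 436 + 1 = 1309, counted from the first base of the start codon.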

[0007] In some such non-human animals, non-human animal cells, and non-human animal genomes, the humanized endogenous MYOC locus contains an endogenous MYOC promoter, and the human MYOC sequence is operably linked to the endogenous MYOC promoter. In some such non-human animals, non-human animal cells, and non-human animal genomes, at least one intron and at least one exon of the endogenous MYOC locus are deleted and replaced with the corresponding human MYOC sequence. In some such non-human animals, non-human animal cells, and non-human animal genomes, the humanized endogenous MYOC locus contains the human MYOC 3' untranslated region. In some such non-human animals, non-human animal cells, and non-human animal genomes, the endogenous MYOC 5' untranslated region is not deleted and is not replaced with the corresponding human MYOC sequence.

[0008] In some such non-human animals, non-human animal cells, and non-human animal genomes, the humanized endogenous MYOC locus encodes a human myocilin protein. In some such non-human animals, non-human animal cells, and non-human animal genomes, the human myocilin protein sequence includes the sequence set forth in SEQ ID NO: 4, and optionally, the human myocilin protein sequence is encoded by a coding sequence including the sequence set forth in SEQ ID NO: 5.

[0009] In some such non-human animals, non-human animal cells, and non-human animal genomes, the entire MYOC coding sequence of the endogenous MYOC locus has been deleted and replaced with the corresponding human MYOC sequence. In some such non-human animals, non-human animal cells, and non-human animal genomes, the region of the endogenous MYOC locus from the MYOC start codon to the MYOC stop codon has been deleted and replaced with the corresponding human MYOC sequence.

[0010] In some such non-human animals, non-human animal cells, and non-human animal genomes, the region of the endogenous MYOC locus from the MYOC start codon to the MYOC stop codon has been deleted and replaced with a human MYOC sequence containing the corresponding human MYOC sequence and the human MYOC 3' untranslated region, the human MYOC sequence containing the Y437H mutation, the endogenous MYOC 5' untranslated region not deleted and not replaced with the human MYOC sequence, the humanized endogenous MYOC locus containing the endogenous MYOC promoter, and the human MYOC sequence operably linked to the endogenous MYOC promoter.

[0011] In some such non-human animals, non-human animal cells, and non-human animal genomes, the human MYOC sequence at the humanized endogenous MYOC locus contains a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 87. In some such non-human animals, non-human animal cells, and non-human animal genomes, the humanized endogenous MYOC locus encodes a myocilin protein containing a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 4. In some such non-human animals, non-human animal cells, and non-human animal genomes, the humanized endogenous MYOC locus contains a myocilin coding sequence containing a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 5. In some such non-human animals, non-human animal cells, and non-human animal genomes, the humanized endogenous MYOC locus contains a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 88 or 89.
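
The percent-identity tiers above can be made concrete with a small Python sketch. It assumes the two sequences have already been aligned to the same length (gaps written as "-") and uses one common convention, identical columns divided by alignment length; other conventions use a different denominator, and this passage does not say which applies here. The function names and the toy sequences are hypothetical.

```python
IDENTITY_TIERS = (90, 95, 96, 97, 98, 99, 100)  # tiers recited in the paragraph above


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue; gap columns count as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("inputs must be pre-aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def highest_tier_met(identity: float):
    """Return the highest recited tier that the identity value satisfies, or None."""
    met = [tier for tier in IDENTITY_TIERS if identity >= tier]
    return max(met) if met else None


# Toy sequences (not from the sequence listing): 9 of 10 aligned columns match.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
tier = highest_tier_met(identity)
```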

[0012] In some such non-human animals, non-human animal cells, and non-human animal genomes, the humanized endogenous MYOC locus does not contain a selection cassette or reporter gene. In some such non-human animals, non-human animal cells, and non-human animal genomes, the non-human animal is homozygous for the humanized endogenous MYOC locus. In some such non-human animals, non-human animal cells, and non-human animal genomes, the non-human animal contains the humanized endogenous MYOC locus in its germline.

[0013] In some such non-human animals, non-human animal cells, and non-human animal genomes, the non-human animal is a mammal. In some such non-human animals, non-human animal cells, and non-human animal genomes, the non-human animal is a rat or a mouse. In some such non-human animals, non-human animal cells, and non-human animal genomes, the non-human animal is a mouse.

[0014] Some such non-human animals, non-human animal cells, and non-human animal genomes further include an expression cassette integrated into their genome, the expression cassette comprising (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein containing a nuclease-inactive Cas protein fused to one or more transcription-activating domains, and (b) a nucleic acid encoding a chimeric adapter protein containing an adapter protein fused to one or more transcription-activating domains. Some such non-human animals, non-human animal cells, and non-human animal genomes further include one or more guide RNAs, or one or more expression cassettes encoding one or more guide RNAs, each guide RNA containing one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, each of the one or more guide RNAs being capable of complexing with the Cas protein and guiding it to a target sequence in a target gene, and at least one of the one or more guide RNAs targeting the humanized endogenous MYOC locus. Optionally, the one or more guide RNAs, or the one or more expression cassettes encoding one or more guide RNAs, are located within the trabecular meshwork.

[0015] Some such non-human animals, non-human animal cells, and non-human animal genomes further include a second genomically integrated expression cassette encoding one or more guide RNAs, each of which contains one or more adapter-binding elements to which the chimeric adapter protein can specifically bind. Each of the one or more guide RNAs can form a complex with the Cas protein and direct it to a target sequence within a target gene, and at least one of the one or more guide RNAs targets the humanized endogenous MYOC locus.

[0016] In some such non-human animals, non-human animal cells, and non-human animal genomes, the first expression cassette is integrated into the Rosa26 locus, the Cas protein is a Cas9 protein containing mutations corresponding to D10A and N863A when optimally aligned with the Streptococcus pyogenes Cas9 protein, one or more transcriptional activator domains in the chimeric Cas protein include VP64, the adapter protein includes the MS2 coat protein or a functional fragment or variant thereof, one or more transcriptional activation domains in the chimeric adapter protein include p65 and HSF1, the non-human animal further includes one or more guide RNAs, or one or more expression cassettes encoding one or more guide RNAs, the one or more guide RNAs, or one or more expression cassettes encoding one or more guide RNAs are in the trabecular meshwork, each of the one or more guide RNAs contains two adapter binding elements to which the chimeric adapter protein can specifically bind, the two adapter binding elements include a first adapter binding element within the first loop of each of the one or more guide RNAs and a second adapter binding element within the second loop of each of the one or more guide RNAs, and the target sequence is within the region 200 base pairs upstream and 1 base pair downstream of the transcription start site.
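
The positional constraint at the end of the preceding paragraph (a target sequence within the region from 200 base pairs upstream to 1 base pair downstream of the transcription start site) can be sketched as a simple coordinate check. This is an illustrative sketch only: the coordinates are made up, the function names are hypothetical, and requiring the entire target to lie inside the window is an assumption, since the text does not say whether partial overlap would suffice.

```python
def offset_from_tss(position: int, tss: int, strand: str) -> int:
    """Signed distance from the TSS; negative values are upstream of the gene."""
    if strand == "+":
        return position - tss
    if strand == "-":
        return tss - position  # on the minus strand, upstream means larger coordinates
    raise ValueError("strand must be '+' or '-'")


def target_in_window(start: int, end: int, tss: int, strand: str,
                     upstream: int = 200, downstream: int = 1) -> bool:
    """True if both ends of the target lie within [-upstream, +downstream] of the TSS."""
    offsets = (offset_from_tss(start, tss, strand),
               offset_from_tss(end, tss, strand))
    return min(offsets) >= -upstream and max(offsets) <= downstream


# Made-up coordinates for illustration only.
inside = target_in_window(850, 869, tss=1000, strand="+")       # offsets -150 .. -131
too_far = target_in_window(700, 719, tss=1000, strand="+")      # offsets -300 .. -281
minus_ok = target_in_window(1100, 1119, tss=1000, strand="-")   # offsets -100 .. -119
```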

[0017] In some such non-human animals, non-human animal cells, and non-human animal genomes, the non-human animal is a mouse, and one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 90-95. In some such non-human animals, non-human animal cells, and non-human animal genomes, the non-human animal is a mouse, and one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 93-94. In some such non-human animals, non-human animal cells, and non-human animal genomes, the non-human animal is a mouse, and one or more guide RNAs target the guide RNA target sequence set forth in SEQ ID NO: 93.

[0018] Some such non-human animals or non-human animal cells have at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 11-fold, at least 12-fold, at least 13-fold, at least 14-fold, or at least 15-fold increased MYOC mRNA or protein expression compared to control non-human animals or non-human animal cells that do not contain one or more guide RNAs or one or more expression cassettes encoding one or more guide RNAs. Some such non-human animals have at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 11-fold, at least 12-fold, at least 13-fold, at least 14-fold, or at least 15-fold increased MYOC mRNA or protein expression in the eye, limbal ring, retina, ciliary body, or trabecular meshwork compared to control non-human animals that do not contain one or more guide RNAs or one or more expression cassettes encoding one or more guide RNAs. Some such non-human animals have MYOC mRNA or protein expression in the limbal ring or trabecular meshwork that is at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 11-fold, at least 12-fold, at least 13-fold, at least 14-fold, or at least 15-fold increased compared to a control non-human animal that does not contain one or more guide RNAs or one or more expression cassettes encoding one or more guide RNAs. Some such non-human animals have MYOC mRNA or protein expression in the trabecular meshwork that is at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 11-fold, at least 12-fold, at least 13-fold, at least 14-fold, or at least 15-fold increased compared to a control non-human animal that does not contain one or more guide RNAs or one or more expression cassettes encoding one or more guide RNAs.
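
The fold-increase tiers recited above amount to a ratio of expression in the guide-RNA-treated animal or cell to expression in the control. A minimal Python sketch follows, assuming both values have already been normalized in the same way (for example to the same reference gene or loading control); the function names and the numeric inputs are hypothetical and are not data from this document.

```python
FOLD_TIERS = tuple(range(2, 16))  # the 2-fold through 15-fold tiers recited above


def fold_change(treated: float, control: float) -> float:
    """Ratio of treated to control expression; both must share one normalization."""
    if control <= 0:
        raise ValueError("control expression must be positive")
    return treated / control


def highest_fold_tier(fc: float):
    """Return the highest recited 'at least N-fold' tier that is satisfied, or None."""
    met = [tier for tier in FOLD_TIERS if fc >= tier]
    return max(met) if met else None


# Hypothetical normalized expression values, for illustration only.
fc = fold_change(treated=9.0, control=2.0)
tier = highest_fold_tier(fc)
```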

[0019] In some such non-human animals, the non-human animals have elevated intraocular pressure compared to wild-type non-human animals or control non-human animals. Optionally, the non-human animals have an intraocular pressure of at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, or at least 22 mmHg. Optionally, the non-human animals have an intraocular pressure at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 mmHg higher than the intraocular pressure of control non-human animals.

[0020] In another embodiment, nucleic acids are provided that include a humanized non-human animal MYOC gene (i.e., the humanized non-human animal MYOC locus described above) in which a region of the endogenous MYOC gene is deleted and replaced with a corresponding human MYOC sequence. Optionally, the region of the endogenous MYOC gene includes both the MYOC exon sequence and the MYOC intron sequence, and the corresponding human MYOC sequence includes both the MYOC exon sequence and the MYOC intron sequence.

[0021] In another embodiment, a targeting vector is provided for generating a humanized endogenous MYOC locus (i.e., the humanized endogenous MYOC locus described above) in which a region of the endogenous MYOC locus is deleted and replaced with a corresponding human MYOC sequence, the targeting vector comprising an insert nucleic acid containing the corresponding human MYOC sequence flanked by a 5' homology arm targeting a 5' target sequence in the endogenous MYOC locus and a 3' homology arm targeting a 3' target sequence in the endogenous MYOC locus. Optionally, the region of the endogenous MYOC locus comprises both MYOC exon sequences and MYOC intron sequences, and the corresponding human MYOC sequence comprises both MYOC exon sequences and MYOC intron sequences.

[0022] In another embodiment, methods are provided for evaluating the activity of human MYOC targeting reagents or candidate glaucoma treatments in non-human animals and non-human animal cells as described above. Some such methods include (a) administering the human MYOC targeting reagent or candidate glaucoma treatment to any of the above non-human animals, and (b) evaluating the activity of the human MYOC targeting reagent or candidate glaucoma treatment in the non-human animals. Some such methods include (a) administering the human MYOC targeting reagent or candidate glaucoma treatment to any of the above non-human animal cells, and (b) evaluating the activity of the human MYOC targeting reagent or candidate glaucoma treatment in the non-human animal cells.

[0023] In some such methods, the non-human animal further includes an expression cassette integrated into its genome, the expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein containing a nuclease-inactive Cas protein fused to one or more transcription-activating domains; and (b) a nucleic acid encoding a chimeric adapter protein containing an adapter protein fused to one or more transcription-activating domains. Some such methods further comprise administering one or more guide RNAs, or one or more expression cassettes encoding one or more guide RNAs, prior to step (a), each guide RNA comprising one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, each of the one or more guide RNAs being capable of forming a complex with the Cas protein and guiding it to a target sequence in a target gene, and at least one of the one or more guide RNAs targeting the humanized endogenous MYOC locus.

[0024] In some such methods, the non-human animal is a mouse, and one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs. 90-95. In some such methods, the non-human animal is a mouse, and one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs. 93-94. In some such methods, the non-human animal is a mouse, and one or more guide RNAs target the guide RNA target sequence described in SEQ ID NO. 93.

[0025] In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via adenovirus-mediated delivery, lentivirus-mediated delivery, or adeno-associated virus (AAV)-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via recombinant AAV2.Y3F-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via lentivirus-mediated delivery.

[0026] In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered to a non-human animal via intravitreous injection or anterior chamber injection. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered to a non-human animal at least one week before step (a) or about one to about ten weeks before step (a).

[0027] In some such methods, step (b) includes evaluating the activity of a human MYOC targeting reagent or a candidate glaucoma treatment in the eye of a non-human animal. In some such methods, step (a) includes administering a human MYOC targeting reagent, and step (b) includes measuring the expression of MYOC messenger RNA encoded by the humanized endogenous MYOC locus. In some such methods, step (a) includes administering a human MYOC targeting reagent, and step (b) includes measuring the expression of myocilin protein encoded by the humanized endogenous MYOC locus. In some such methods, step (a) includes administering a human MYOC targeting reagent, the human MYOC targeting reagent is a genome editing agent, and step (b) includes evaluating modifications to the humanized endogenous MYOC locus. In some such methods, step (b) includes measuring the frequency of insertions or deletions within the humanized endogenous MYOC locus.
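
One way to picture "measuring the frequency of insertions or deletions" is as the fraction of sequencing reads across the locus whose alignment contains an insertion or deletion. The Python sketch below works from CIGAR strings in the SAM alignment format, where I marks an insertion and D a deletion relative to the reference. It is an assumption-laden illustration, not the assay used in this document: the function names and example reads are hypothetical, and a real analysis would typically count only indels near the intended cut site and filter out sequencing errors.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar: str) -> bool:
    """True if the alignment contains an insertion (I) or deletion (D)."""
    ops = CIGAR_OP.findall(cigar)
    if "".join(length + op for length, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return any(op in "ID" for _, op in ops)


def indel_frequency(cigars) -> float:
    """Fraction of reads whose alignment carries at least one indel."""
    cigars = list(cigars)
    if not cigars:
        raise ValueError("no reads supplied")
    return sum(has_indel(c) for c in cigars) / len(cigars)


# Hypothetical reads: two unedited, one with a 2-bp deletion, one with a 1-bp insertion.
freq = indel_frequency(["100M", "48M2D50M", "60M1I39M", "100M"])
```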

[0028] In some such methods, the human MYOC targeting reagent comprises a nuclease designed to target a region of the human MYOC gene. In some such methods, the nuclease comprises a Cas protein and a guide RNA designed to target a guide RNA target sequence of the human MYOC gene. Optionally, the Cas protein is the Cas9 protein.

[0029] In some such methods, step (a) comprises administering a human MYOC targeting reagent, the human MYOC targeting reagent comprising an exogenous donor nucleic acid, the exogenous donor nucleic acid being designed to target the human MYOC gene, and optionally, the exogenous donor nucleic acid being delivered via AAV.

[0030] In some such methods, the human MYOC targeting reagent is an RNAi agent or an antisense oligonucleotide. In some such methods, the human MYOC targeting reagent is an antigen-binding protein. In some such methods, the human MYOC targeting reagent is a small molecule.

[0031] In some such methods, evaluating the activity of human MYOC-targeting reagents or candidate glaucoma treatments in non-human animals involves assessing intraocular pressure.

[0032] In some such methods, the evaluation is carried out by comparing the animal with an untreated control non-human animal.

[0033] In some such methods, step (a) includes administering a candidate glaucoma treatment agent, and step (b) includes assessing intraocular pressure. In some such methods, the candidate glaucoma treatment agent is an inhibitor of aqueous humor formation. In some such methods, the candidate glaucoma treatment agent increases aqueous humor outflow.

[0034] In some such methods, the candidate glaucoma treatment is an ANGPTL7 targeting agent. In some such methods, the ANGPTL7 targeting agent is an RNAi agent or an antisense oligonucleotide.

[0035] In another embodiment, a method for optimizing the activity of a human MYOC-targeting reagent or candidate glaucoma treatment is provided. Some such methods include (I) performing one of the above methods for evaluating the activity of a human MYOC-targeting reagent or candidate glaucoma treatment in a non-human animal or non-human animal cell a first time in a first non-human animal or first non-human animal cell; (II) changing a variable and performing the method of step (I) a second time in a second non-human animal or second non-human animal cell with the changed variable; and (III) comparing the activity of the human MYOC-targeting reagent or candidate glaucoma treatment in step (I) with the activity of the human MYOC-targeting reagent or candidate glaucoma treatment in step (II) and selecting the method that yields the higher activity.
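
The three-step optimization described in this paragraph is, in effect, a pairwise comparison of two evaluation runs that differ in one variable. A minimal Python sketch under stated assumptions: the `Run` record, its field names, and the example variable and activity values are all hypothetical, and "activity" is treated as a single number where larger means better, which the paragraph does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One evaluation run; all fields are illustrative placeholders."""
    variable: str    # which variable was changed between the runs
    setting: str     # the value of that variable in this run
    activity: float  # measured activity, larger meaning higher activity


def select_higher_activity(first: Run, second: Run) -> Run:
    """Compare two runs and keep the one with higher activity; ties keep the first."""
    return second if second.activity > first.activity else first


# Hypothetical runs differing in a single variable.
run_1 = Run(variable="dose", setting="low", activity=0.35)
run_2 = Run(variable="dose", setting="high", activity=0.60)
best = select_higher_activity(run_1, run_2)
```

The same comparison can be repeated with the selected run as the new baseline if more than one variable is to be explored.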

[0036] In another embodiment, a method is provided for producing any of the above-mentioned non-human animals containing a humanized endogenous MYOC gene locus.

[0037] Some such methods include (a) introducing genetically modified non-human animal embryonic stem (ES) cells into a non-human animal host embryo, the ES cells containing a humanized endogenous MYOC locus in their genome, wherein a region of the endogenous MYOC locus is deleted and replaced with a corresponding human MYOC sequence, and (b) gestating the non-human animal host embryo in a surrogate mother, wherein the surrogate mother produces an F0 progeny genetically modified non-human animal containing the humanized endogenous MYOC locus. Some such methods further include modifying the genome of the non-human animal ES cells to contain the humanized endogenous MYOC locus prior to step (a).

[0038] Some such methods include (a) modifying the genome of a one-cell stage embryo of a non-human animal to include a humanized endogenous MYOC locus in its genome, wherein a region of the endogenous MYOC locus is deleted and replaced with a corresponding human MYOC sequence, thereby producing a genetically modified embryo of the non-human animal; and (b) gestating the genetically modified embryo of the non-human animal in a surrogate mother, wherein the surrogate mother produces an F0 progeny genetically modified non-human animal containing the humanized endogenous MYOC locus.

[0039] Some such methods further involve crossing the F0 progeny genetically modified non-human animal containing the humanized endogenous MYOC locus with a non-human animal containing a genomically integrated expression cassette that comprises a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein containing a nuclease-inactive Cas protein fused to one or more transcriptional activation domains, and further comprises a nucleic acid encoding a chimeric adapter protein containing an adapter protein fused to one or more transcriptional activation domains.

[0040] In another embodiment, a method is provided for producing any of the above-mentioned non-human animals comprising a humanized endogenous MYOC locus and an expression cassette integrated into the genome, the expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein comprising a nuclease-inactive Cas protein fused to one or more transcription-activating domains; and (b) a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcription-activating domains.

[0041] Some such methods include (a) introducing genetically modified non-human animal embryonic stem (ES) cells into a non-human animal host embryo, the ES cells having a humanized endogenous MYOC locus in which a region of the endogenous MYOC locus is deleted and replaced with a corresponding human MYOC sequence, and a genomically integrated expression cassette having a nucleic acid encoding a chimeric Cas protein comprising a nuclease-inactive Cas protein fused to one or more transcription-activating domains, and a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcription-activating domains, and (b) gestating the non-human animal host embryo in a surrogate mother, the surrogate mother producing an F0 progeny genetically modified non-human animal having the humanized endogenous MYOC locus and the genomically integrated expression cassette. Some such methods further include modifying the genome of the non-human animal ES cells to have the humanized endogenous MYOC locus and the genomically integrated expression cassette prior to step (a).

[0042] Some such methods include (a) providing a non-human animal one-cell stage embryo comprising, in its genome, (i) a humanized endogenous MYOC locus in which a region of the endogenous MYOC locus has been deleted and replaced with a corresponding human MYOC sequence, and (ii) an expression cassette integrated into the genome comprising a nucleic acid encoding a chimeric Cas protein containing a nuclease-inactive Cas protein fused to one or more transcription-activating domains, and a nucleic acid encoding a chimeric adapter protein containing an adapter protein fused to one or more transcription-activating domains, and (b) gestating the non-human animal one-cell stage embryo in a surrogate mother, wherein the surrogate mother produces an F0 progeny genetically modified non-human animal containing the humanized endogenous MYOC locus and the expression cassette integrated into the genome.

[0043] In some such methods, the non-human animal is a mouse or a rat.

[0044] In another embodiment, methods for increasing MYOC expression in non-human animals are provided. Some such methods involve administering one or more guide RNAs, or one or more expression cassettes encoding one or more guide RNAs, to any of the above non-human animals that include a humanized endogenous MYOC locus and an expression cassette integrated into the genome, the expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein comprising a nuclease-inactive Cas protein fused to one or more transcription-activating domains; and (b) a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcription-activating domains, wherein each guide RNA comprises one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence in a target gene, and at least one of the one or more guide RNAs targets the humanized endogenous MYOC locus.

[0045] In some such methods, the non-human animal is a mouse, and one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs. 90-95. In some such methods, the non-human animal is a mouse, and one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs. 93-94. In some such methods, the non-human animal is a mouse, and one or more guide RNAs target the guide RNA target sequence described in SEQ ID NO. 93.

[0046] In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via adenovirus-mediated delivery, lentivirus-mediated delivery, or adeno-associated virus (AAV)-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via recombinant AAV2.Y3F-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via lentivirus-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered to non-human animals via intravitreous injection or anterior chamber injection.

[0047] In some such methods, the method results in increased MYOC mRNA or protein expression in the eye, limbal ring, retina, ciliary body, or trabecular meshwork. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the eye, limbal ring, retina, ciliary body, or trabecular meshwork compared to a control non-human animal that does not contain the one or more guide RNAs or the one or more expression cassettes encoding the one or more guide RNAs. In some such methods, the method results in increased MYOC mRNA or protein expression in the limbal ring or trabecular meshwork. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the limbal ring or trabecular meshwork compared to the control non-human animal. In some such methods, the method results in increased MYOC mRNA or protein expression in the limbal ring. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the limbal ring compared to the control non-human animal. In some such methods, the method results in increased MYOC mRNA or protein expression in the trabecular meshwork. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the trabecular meshwork compared to the control non-human animal.
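As an illustration of how a fold increase in MYOC mRNA might be computed from qPCR data, the following is a minimal sketch. It assumes the widely used 2^-ΔΔCt calculation with Gapdh as the reference gene; the specification does not state which calculation was used, and the function names and Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression (treated vs. control) by the 2^-ddCt method.

    Assumption: the target (e.g., MYOC) and the reference (e.g., Gapdh)
    amplify with roughly 100% efficiency, which this method requires.
    """
    delta_treated = ct_target_treated - ct_ref_treated   # dCt, treated animal
    delta_control = ct_target_control - ct_ref_control   # dCt, control animal
    return 2.0 ** -(delta_treated - delta_control)


def highest_threshold_met(fold, thresholds=range(2, 16)):
    """Return the largest 'at least N-fold' threshold (2..15) that is met,
    or None if the increase is below 2-fold."""
    met = [t for t in thresholds if fold >= t]
    return max(met) if met else None


# Hypothetical Ct values, for illustration only.
fold = fold_change_ddct(22.0, 18.0, 25.0, 18.0)
print(fold, highest_threshold_met(fold))
```

A lower Ct means more template, so a treated-sample ΔCt that is three cycles smaller than the control ΔCt corresponds to an 8-fold increase.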

[0048] In some such methods, the method results in elevated intraocular pressure. Optionally, the method results in non-human animals having an intraocular pressure at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 mmHg higher than that of control non-human animals. Optionally, the method results in an intraocular pressure of at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, or at least 22 mmHg.
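The two optional intraocular pressure criteria above (a difference from control and an absolute value) can be checked independently. The sketch below is illustrative only: the function name, the use of group means, and the readings are assumptions, not part of the specification.

```python
from statistics import mean


def iop_criteria(treated_mmhg, control_mmhg, min_delta=1.0, min_absolute=15.0):
    """Evaluate the two optional IOP criteria on group means.

    min_delta:    required increase over control, in mmHg (1 to 6 in the text)
    min_absolute: required treated IOP, in mmHg (15 to 22 in the text)
    """
    treated_mean = mean(treated_mmhg)
    control_mean = mean(control_mmhg)
    delta = treated_mean - control_mean
    return {
        "treated_mean": treated_mean,
        "delta": delta,
        "meets_delta": delta >= min_delta,
        "meets_absolute": treated_mean >= min_absolute,
    }


# Hypothetical tonometry readings (mmHg), for illustration only.
result = iop_criteria([20.0, 22.0, 21.0], [15.0, 16.0, 14.0],
                      min_delta=6.0, min_absolute=20.0)
print(result)
```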

[0049] In another embodiment, methods for increasing intraocular pressure in non-human animals are provided. Some such methods comprise administering one or more guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs, to any of the above non-human animals that comprise a humanized endogenous MYOC locus and a genomically integrated expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein comprising a nuclease-inactive Cas protein fused to one or more transcriptional activation domains; and (b) a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcriptional activation domains. Each guide RNA comprises one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and at least one of the one or more guide RNAs targets the humanized endogenous MYOC locus.

[0050] In some such methods, the non-human animal is a mouse, and the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 90-95. In some such methods, the non-human animal is a mouse, and the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 93-94. In some such methods, the non-human animal is a mouse, and the one or more guide RNAs target the guide RNA target sequence set forth in SEQ ID NO: 93.

[0051] In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via adenovirus-mediated delivery, lentivirus-mediated delivery, or adeno-associated virus (AAV)-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via recombinant AAV2.Y3F-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via lentivirus-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered to the non-human animal by intravitreal injection or intracameral (anterior chamber) injection.

[0052] In some such methods, the method results in increased MYOC mRNA or protein expression in the eye, limbal ring, retina, ciliary body, or trabecular meshwork. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the eye, limbal ring, retina, ciliary body, or trabecular meshwork compared to a control non-human animal that does not contain the one or more guide RNAs or the one or more expression cassettes encoding the one or more guide RNAs. In some such methods, the method results in increased MYOC mRNA or protein expression in the limbal ring or trabecular meshwork. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the limbal ring or trabecular meshwork compared to the control non-human animal. In some such methods, the method results in increased MYOC mRNA or protein expression in the limbal ring. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the limbal ring compared to the control non-human animal. In some such methods, the method results in increased MYOC mRNA or protein expression in the trabecular meshwork. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the trabecular meshwork compared to the control non-human animal.

[0053] In some such methods, the method results in elevated intraocular pressure. Optionally, the method results in non-human animals having an intraocular pressure at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 mmHg higher than that of control non-human animals. Optionally, the method results in an intraocular pressure of at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, or at least 22 mmHg.

[0054] In another embodiment, methods for modeling glaucoma in non-human animals are provided. Some such methods comprise administering one or more guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs, to any of the above non-human animals that comprise a humanized endogenous MYOC locus and a genomically integrated expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein comprising a nuclease-inactive Cas protein fused to one or more transcriptional activation domains; and (b) a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcriptional activation domains. Each guide RNA comprises one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and at least one of the one or more guide RNAs targets the humanized endogenous MYOC locus.

[0055] In some such methods, the non-human animal is a mouse, and the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 90-95. In some such methods, the non-human animal is a mouse, and the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 93-94. In some such methods, the non-human animal is a mouse, and the one or more guide RNAs target the guide RNA target sequence set forth in SEQ ID NO: 93.

[0056] In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via adenovirus-mediated delivery, lentivirus-mediated delivery, or adeno-associated virus (AAV)-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via recombinant AAV2.Y3F-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered via lentivirus-mediated delivery. In some such methods, the guide RNA, or one or more expression cassettes encoding one or more guide RNAs, is administered to the non-human animal by intravitreal injection or intracameral (anterior chamber) injection.

[0057] In some such methods, the method results in increased MYOC mRNA or protein expression in the eye, limbal ring, retina, ciliary body, or trabecular meshwork. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the eye, limbal ring, retina, ciliary body, or trabecular meshwork compared to a control non-human animal that does not contain the one or more guide RNAs or the one or more expression cassettes encoding the one or more guide RNAs. In some such methods, the method results in increased MYOC mRNA or protein expression in the limbal ring or trabecular meshwork. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the limbal ring or trabecular meshwork compared to the control non-human animal. In some such methods, the method results in increased MYOC mRNA or protein expression in the limbal ring. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the limbal ring compared to the control non-human animal. In some such methods, the method results in increased MYOC mRNA or protein expression in the trabecular meshwork. Optionally, the method results in at least a 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, or 15-fold increase in MYOC mRNA or protein expression in the trabecular meshwork compared to the control non-human animal.

[0058] In some such methods, the method results in elevated intraocular pressure. Optionally, the method results in non-human animals having an intraocular pressure at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 mmHg higher than that of control non-human animals. Optionally, the method results in an intraocular pressure of at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, or at least 22 mmHg.

[Brief Description of the Drawings]

[0059]
[Figure 1] A schematic diagram (not to scale) of the targeting of the mouse Myoc locus is shown, including the wild-type mouse Myoc locus, the targeted allele of the humanized mouse MYOC locus containing the Y437H mutation and a self-deleting neomycin (SDC-Neo) selection cassette (MAID 8533), and the targeted allele of the humanized mouse MYOC locus containing the Y437H mutation and a loxP scar left by removal of the SDC-Neo selection cassette (MAID 8534).
[Figure 2] A schematic diagram (not to scale) of a strategy for screening humanized mouse MYOC loci is shown, including loss-of-allele assays (7578mTU and 7578mTD), gain-of-allele assays (7578hTU, 7578hTD, Cre903, and Hyg), and allelic discrimination assays (8533hAS2.WT and 8533hAS2.Mut).
[Figure 3] An alignment of the human and mouse myocilin proteins is shown.
[Figure 4A] The lox-stop-lox (LSL) dCas9 synergistic activation mediator (SAM) allele (LSL-SAM allele) is shown (not to scale), containing, from 5' to 3', the 3' splicing sequence, the first loxP site, the neomycin resistance gene, the polyadenylation signal, the second loxP site, the dCas9-NLS-VP64 coding sequence (NLS-dCas9-NLS-VP64), the T2A peptide coding sequence, the MCP-NLS-p65-HSF1 coding sequence, and the woodchuck hepatitis virus posttranscriptional regulatory element (WPRE).
[Figure 4B] The allele from Figure 4A after removal of the floxed neomycin resistance gene and polyadenylation signal (SAM allele) is shown (not to scale).
[Figure 5] A general schematic diagram (not to scale) shows how the LSL-SAM allele from Figure 4A is targeted into the first intron of the Rosa26 (R26) locus.
[Figure 6] A schematic diagram of a typical single guide RNA (SEQ ID NO: 68) is shown, in which the tetraloop and stem loop 2 are replaced with MS2-binding aptamers to facilitate recruitment of a chimeric MS2 coat protein (MCP) fused to transcriptional activation domains.
[Figure 7] Ad5, AAV2.Y3F, and lentivirus expression is shown in the trabecular meshwork (TM), ciliary body (CB), and corneal endothelium (CE). Rectangular boxes indicate the TM.
[Figure 8] The locations of the mouse Myoc SAM guide RNA target sequences are shown.
[Figure 9] A schematic diagram of the anatomy of the eye is shown.
[Figure 10A] Mouse Myoc expression relative to Gapdh expression, measured by qPCR, is shown in the limbal ring (trabecular meshwork (TM), iris, and ciliary body (CB); Figure 10A), retina (Figure 10B), and cornea (Figure 10C) after administration of mouse Myoc SAM guide RNA, compared to control guide RNA.
[Figure 10B] Mouse Myoc expression relative to Gapdh expression, measured by qPCR, is shown in the limbal ring (trabecular meshwork (TM), iris, and ciliary body (CB); Figure 10A), retina (Figure 10B), and cornea (Figure 10C) after administration of mouse Myoc SAM guide RNA, compared to control guide RNA.
[Figure 10C] Mouse Myoc expression relative to Gapdh expression, measured by qPCR, is shown in the limbal ring (trabecular meshwork (TM), iris, and ciliary body (CB); Figure 10A), retina (Figure 10B), and cornea (Figure 10C) after administration of mouse Myoc SAM guide RNA, compared to control guide RNA.
[Figure 11] Intraocular pressure (IOP) is shown in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with AAV2.Y3F-SAM-g4, AAV2.Y3F-SAM-g5, or the AAV2.Y3F-SAM-LacZ control, or untreated (naive).
[Figure 12] Intraocular pressure (IOP) is shown in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, either treated with AAV2.Y3F-SAM-g4 or untreated (naive). AAV2.Y3F-SAM-g4 treated mice were subsequently treated with either timolol or RHOPRESSA®.
[Figure 13] Human MYOC expression relative to Gapdh expression, measured by qPCR, is shown in SAM mice that are heterozygous or homozygous for a humanized MYOC locus containing the Y437H mutation, after administration of mouse Myoc SAM guide RNAs g4 and g5 or SAM LacZ control gRNA.
[Figure 14] Human MYOC expression, measured by RNASCOPE®, is shown in SAM mice that are heterozygous or homozygous for a humanized MYOC locus containing the Y437H mutation, after administration of mouse Myoc SAM guide RNAs g4 and g5 or SAM LacZ control gRNA.
[Figure 15A] The experimental setup is shown for testing the effect of human MYOC siRNA #1 on intraocular pressure (IOP) in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with a mixture of AAV2.Y3F-SAM-g4 and LV-SAM-g4.
[Figure 15B] Intraocular pressure (IOP) is shown in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, either treated with AAV2.Y3F-SAM-g4 or untreated (naive). AAV2.Y3F-SAM-g4 treated mice were subsequently treated with either human MYOC siRNA or control luciferase siRNA.
[Figure 16A] The experimental setup is shown for testing the effects of human MYOC siRNA #1, #2, #3, #4, and #5 on intraocular pressure (IOP) in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4.
[Figure 16B] Intraocular pressure (IOP) is shown in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS. LV-SAM-g4 treated mice were subsequently treated with human MYOC siRNA #1, #2, #3, #4, or #5.
[Figure 16C] qPCR results are shown, indicating the percentage of human MYOC mRNA expression relative to Gapdh in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS. LV-SAM-g4 treated mice were subsequently treated with human MYOC siRNA #1, #2, #3, #4, or #5.
[Figure 16D-1] RNASCOPE® analysis of human MYOC mRNA expression is shown in the eyes of SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS. LV-SAM-g4 treated mice were subsequently treated with human MYOC siRNA #2 or #3.
[Figure 16D-2] RNASCOPE® analysis of human MYOC mRNA expression is shown in the eyes of SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS. LV-SAM-g4 treated mice were subsequently treated with human MYOC siRNA #2 or #3.
[Figure 17A] The experimental setup is shown for testing the effects of human MYOC siRNA #2 and #3 on intraocular pressure (IOP) in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4.
[Figure 17B] Intraocular pressure (IOP) is shown in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS. LV-SAM-g4 treated mice were subsequently treated with human MYOC siRNA #2 or #3.
[Figure 17C] qPCR results are shown, indicating the percentage of human MYOC mRNA expression relative to Gapdh in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS. LV-SAM-g4 treated mice were subsequently treated with human MYOC siRNA #2 or #3.
[Figure 18A] qPCR results are shown, indicating the percentage of human MYOC mRNA expression relative to Gapdh in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS.
[Figure 18B] qPCR results are shown, indicating the percentage of Angptl7 mRNA expression relative to Gapdh in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS.
[Figure 18C] qPCR results are shown, indicating the percentage of human MYOC mRNA expression relative to Gapdh in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, either treated with AAV-SAM-g4 or untreated (naive).
[Figure 18D] qPCR results are shown, indicating the percentage of human Angptl7 mRNA expression relative to Gapdh in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, either treated with AAV-SAM-g4 or untreated (naive).
[Figure 19A] Intraocular pressure (IOP) is shown in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS. LV-SAM-g4 treated mice were subsequently treated with 15 μg of Angptl7 siRNA #1 or #2.
[Figure 19B] qPCR results are shown, indicating the percentage of human MYOC mRNA expression relative to Gapdh in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS. LV-SAM-g4 treated mice were subsequently treated with 15 μg of Angptl7 siRNA #1 or #2.
[Figure 19C] qPCR results are shown, indicating the percentage of human Angptl7 mRNA expression relative to Gapdh in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4 or PBS. LV-SAM-g4 treated mice were subsequently treated with 15 μg of Angptl7 siRNA #1 or #2.
[Figure 20A] The experimental setup is shown for testing the effects of Angptl7 siRNA #1 and #2 on intraocular pressure (IOP) in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4.
[Figure 20B] Intraocular pressure (IOP) is shown in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4. LV-SAM-g4 treated mice were subsequently treated with 7.5 μg of Angptl7 siRNA #1 or #2 or PBS.
[Figure 20C] qPCR results are shown, indicating the percentage of Angptl7 and human MYOC mRNA expression relative to Gapdh in SAM mice (SAM-MYOC mice, homozygous for each allele) containing the humanized MYOC locus with the Y437H mutation, treated with LV-SAM-g4. LV-SAM-g4 treated mice were subsequently treated with 7.5 μg of Angptl7 siRNA #1 or #2 or PBS.

[Modes for Carrying Out the Invention]

[0060] Definitions
The terms “protein,” “polypeptide,” and “peptide,” as used interchangeably herein, include polymeric forms of amino acids of any length, including coded and non-coded amino acids, as well as chemically or biochemically modified or derivatized amino acids. These terms also include modified polymers, such as polypeptides having a modified peptide backbone. The term “domain” refers to any portion of a protein or polypeptide having a particular function or structure.

[0061] Proteins are said to have an "N-terminus" (amino terminus) and a "C-terminus" (carboxyl terminus). The term "N-terminus" refers to the starting point of a protein or polypeptide having an amino acid with a free amine group (-NH2) at its end. The term "C-terminus" refers to the terminal end of an amino acid chain (protein or polypeptide) terminated by a free carboxyl group (-COOH).

[0062] As used interchangeably herein, the terms “nucleic acid” and “polynucleotide” include polymeric forms of nucleotides of any length, including ribonucleotides, deoxyribonucleotides, or analogues or modified versions thereof. These include single-stranded, double-stranded, and multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, and polymers containing purine bases, pyrimidine bases, or other natural, chemically modified, biochemically modified, unnatural, or derivatized nucleotide bases.

[0063] Nucleic acids are said to have a "5' end" and a "3' end" because mononucleotides are joined to form oligonucleotides in such a way that the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in one direction via a phosphodiester bond. The end of an oligonucleotide is called the "5' end" if its 5' phosphate is not linked to the 3' oxygen of another mononucleotide pentose ring. The end of an oligonucleotide is called the "3' end" if its 3' oxygen is not linked to the 5' phosphate of another mononucleotide pentose ring. A nucleic acid sequence can also be said to have a 5' and a 3' end, even if it is located inside a larger oligonucleotide. In either a linear or circular DNA molecule, discrete elements are referred to as being "upstream" (5') or "downstream" (3') of other elements.
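As a small illustration of the 5'-to-3' convention, the sketch below writes the complementary strand of a DNA sequence in its own 5'-to-3' orientation and compares element positions along one strand. The function names and the example sequence are hypothetical and are not taken from the specification.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the opposite strand, also written 5' to 3'.

    The two strands are antiparallel, so the complement must be reversed
    to keep the 5' end on the left.
    """
    return seq_5_to_3.translate(_COMPLEMENT)[::-1]


def is_upstream(position_a: int, position_b: int) -> bool:
    """True if element A lies 5' (upstream) of element B on the same strand,
    with positions counted from the 5' end."""
    return position_a < position_b


print(reverse_complement("ATGC"))   # opposite strand, 5' to 3'
print(is_upstream(10, 250))         # an element at 10 is upstream of one at 250
```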

[0064] The term "integrated into the genome" refers to nucleic acids that have been introduced into a cell so that their nucleotide sequence is integrated into the cell's genome. Any protocol can be used for the stable integration of nucleic acids into the cell's genome.

[0065] The terms “expression vector,” “expression construct,” or “expression cassette” refer to recombinant nucleic acids containing a desired coding sequence operably linked to appropriate nucleic acid sequences required for the expression of the operably linked coding sequence in a particular host cell or organism. Nucleic acid sequences required for expression in prokaryotes typically include a promoter, an operator (optional), and a ribosome binding site, as well as other sequences. Eukaryotic cells are generally known to utilize promoters, enhancers, and termination and polyadenylation signals, although some elements may be removed and others added without sacrificing the desired expression.

[0066] The term "targeting vector" refers to recombinant nucleic acids that can be introduced into a target site in the genome of a cell by homologous recombination, non-homologous-end-joining-mediated ligation, or any other means of recombination.

[0067] The term "viral vector" refers to recombinant nucleic acids that contain at least one element of viral origin and that contain elements sufficient for, or permissive of, packaging into a viral vector particle. Vectors and/or particles can be used for the purpose of introducing DNA, RNA, or other nucleic acids into cells in vitro, ex vivo, or in vivo. Many forms of viral vectors are known.

[0068] With respect to cells, tissues (e.g., liver samples), lipid droplets, proteins, and nucleic acids, the term “isolated” includes cells, tissues (e.g., liver samples), lipid droplets, proteins, and nucleic acids that are relatively purified with respect to other bacteria, viruses, cells, or other components that may normally be present in situ, up to and including substantially pure preparations of the cells, tissues (e.g., liver samples), lipid droplets, proteins, and nucleic acids. The term “isolated” also includes cells, tissues (e.g., liver samples), lipid droplets, proteins, and nucleic acids that have no naturally occurring counterpart, that have been chemically synthesized and are therefore substantially uncontaminated by other cells, tissues (e.g., liver samples), lipid droplets, proteins, and nucleic acids, or that have been separated or purified from most other components (e.g., cellular components or biological components) that naturally accompany them (e.g., other cellular proteins, nucleic acids, or cellular or extracellular components).

[0069] The term "wild-type" includes entities that possess the structure and/or activity found in a normal state or context (as opposed to mutant, diseased, modified, etc.). Wild-type genes and polypeptides often exist in multiple different forms (e.g., alleles).

[0070] The term "endogenous sequence" refers to nucleic acid sequences that are naturally present in rat cells or within rats. For example, the endogenous Myoc sequence in mice refers to the naturally occurring Myoc sequence at the Myoc locus in mice.

[0071] "Exogenous" molecules or sequences include molecules or sequences that are not normally present in a cell in that form. Normal presence includes presence with respect to the particular developmental stage and environmental conditions of the cell. An exogenous molecule or sequence may include, for example, a mutated version of a corresponding endogenous sequence within the cell, such as a humanized version of the endogenous sequence, or a sequence that corresponds to an endogenous sequence within the cell but is in a different form (i.e., not within a chromosome). In contrast, endogenous molecules or sequences include molecules or sequences that are normally present in that form in a particular cell at a particular developmental stage under particular environmental conditions.

[0072] When used in the context of nucleic acids or proteins, the term “heterologous” indicates that the nucleic acid or protein contains at least two segments that do not naturally exist together within the same molecule. For example, when the term “heterologous” is used in reference to a nucleic acid segment or a protein segment, it indicates that the nucleic acid or protein contains two or more subsequences that are not found in nature in the same relationship to each other (e.g., joined together). As an example, a “heterologous” region of a nucleic acid vector is a segment of nucleic acid located within or attached to another nucleic acid molecule with which it is not found in association in nature. For example, a heterologous region of a nucleic acid vector may contain a coding sequence flanked by sequences not found in association with that coding sequence in nature. Similarly, a “heterologous” region of a protein is a segment of amino acids located within or attached to another peptide molecule with which it is not found in association in nature (e.g., a fusion protein or tagged protein). Similarly, nucleic acids or proteins may contain a heterologous label or a heterologous secretion or localization sequence.

[0073] "Codon optimization" is a process that takes advantage of the degeneracy of codons, as shown by the multiplicity of three-base-pair codon combinations that specify an amino acid, to modify a nucleic acid sequence for enhanced expression in a particular host cell, generally by replacing at least one codon of the native sequence with a codon that is more frequently or most frequently used in the genes of the host cell while maintaining the native amino acid sequence. For example, a nucleic acid encoding a myocilin protein can be modified to substitute codons having a higher frequency of usage in a given prokaryotic or eukaryotic cell, including bacterial cells, yeast cells, human cells, non-human cells, mammalian cells, rodent cells, mouse cells, rat cells, hamster cells, or any other host cell, compared to the naturally occurring nucleic acid sequence. Codon usage tables are readily available, for example, in the "Codon Usage Database." These tables can be adapted in various ways. See Nakamura et al. (2000) Nucleic Acids Res. 28(1):292, which is incorporated herein by reference in its entirety for all purposes. Computer algorithms for codon optimization of specific sequences for expression in a particular host are also available (see, for example, Gene Forge).
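The substitution step described above can be sketched as follows. This is a minimal illustration, not the algorithm of any particular tool: the standard genetic code is built in, while the `preferred` mapping is a hypothetical, hand-picked stand-in for a real host codon usage table, and the input sequence is invented.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' = stop).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, preferred: dict) -> str:
    """Swap each codon for the host-preferred synonymous codon, if one is given.

    `preferred` maps an amino acid letter to a codon. Entries that would
    change the encoded amino acid are rejected, so the protein is unchanged.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        if CODON_TO_AA[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TO_AA[codon], codon))
    return "".join(out)


# Hypothetical preferences for illustration; not real usage frequencies.
preferred = {"L": "CTG", "P": "CCC", "K": "AAG"}
original = "ATGTTACCAAAATAA"
optimized = codon_optimize(original, preferred)
print(optimized, translate(original) == translate(optimized))
```

Real workflows typically also weigh GC content, repeats, and cryptic splice sites, which this sketch ignores.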

[0074] The term “locus” refers to a specific location of a gene (or significant sequence), DNA sequence, polypeptide coding sequence, or position on a chromosome in the genome of an organism. For example, “MYOC locus” can refer to the specific location of a MYOC gene, a MYOC DNA sequence, a myocilin coding sequence, or a MYOC position on a chromosome in the genome of an organism where such a sequence has been identified to reside. A “MYOC locus” may include regulatory elements of a MYOC gene, such as enhancers, promoters, 5' and/or 3' untranslated regions (UTRs), or combinations thereof.

[0075] The term “gene” refers to a DNA sequence within a chromosome that, if naturally occurring, may contain at least one coding region and at least one non-coding region. A DNA sequence in a chromosome that codes for a product (e.g., but not limited to, an RNA product and/or a polypeptide product) may include the coding region interrupted by non-coding introns and sequences located adjacent to the coding region on both the 5' and 3' ends, so that the gene corresponds to the full-length mRNA (including the 5' and 3' untranslated sequences). Additionally, other non-coding sequences, including regulatory sequences (e.g., but not limited to, promoters, enhancers, and transcription factor binding sites), polyadenylation signals, internal ribosome entry sites, silencers, insulator sequences, and matrix attachment regions may also be present in the gene. These sequences may be close to the coding region of the gene (e.g., but not limited to, within 10 kb) or at distant sites, and they may affect the level or rate of transcription and translation of the gene.

[0076] The term "allele" refers to a variant form of a gene. Some genes have various different forms located at the same position on a chromosome, or at a specific locus. Diploid organisms have two alleles at each locus. Each pair of alleles represents the genotype of a particular locus. A genotype is described as homozygous if there are two identical alleles at a particular locus, and heterozygous if the two alleles are different.

[0077] The "coding region" or "coding sequence" of a gene consists of the DNA or RNA portion of the gene composed of exons that code for proteins. This region begins with a start codon at the 5' end and ends with a stop codon at the 3' end.

[0078] A “promoter” is a regulatory region of DNA that typically contains a TATA box capable of directing RNA polymerase II to initiate RNA synthesis at the appropriate transcription start site for a particular polynucleotide sequence. In some cases, a promoter may additionally include other regions that influence the transcription initiation rate. The promoter sequences disclosed herein regulate the transcription of operably linked polynucleotides. Promoters may be active in one or more of the cell types disclosed herein (e.g., mouse cells, rat cells, pluripotent cells, one-cell stage embryos, differentiated cells, or a combination thereof). Promoters may be, for example, constitutively active promoters, conditional promoters, inducible promoters, temporally restricted promoters (e.g., developmentally regulated promoters), or spatially restricted promoters (e.g., cell-specific or tissue-specific promoters). Examples of promoters can be found, for example, in International Publication No. WO 2013/176772, which is incorporated herein by reference in its entirety for all purposes.

[0079] "Operable linkage" or being "operably linked" includes the juxtaposition of two or more components (e.g., a promoter and another sequence element) such that both components function normally and at least one of the components can mediate a function that is exerted on at least one of the other components. For example, a promoter is operably linked to a coding sequence if the promoter controls the level of transcription of the coding sequence in response to the presence or absence of one or more transcriptional regulatory factors. Operable linkage may include such sequences being contiguous with each other or acting in trans (e.g., a regulatory sequence may act at a distance to control transcription of the coding sequence).

[0080] The methods and compositions provided herein employ a variety of different components. Some of the components described may have active variants and fragments. The term "functional" refers to the innate ability of a protein or nucleic acid (or a fragment or variant thereof) to exhibit a biological activity or function. The biological function of a functional fragment or variant may be the same as that of the original molecule or may in fact be changed (e.g., with respect to specificity, selectivity, or efficacy), provided that the basic biological function of the molecule is retained.

[0081] The term "variant" refers to a nucleotide sequence (e.g., one nucleotide) or a protein sequence (e.g., one amino acid) that differs from the most common sequence in a population.

[0082] The term "fragment," when referring to proteins, means a protein that is shorter than or has fewer amino acids than a full-length protein. The term "fragment," when referring to nucleic acids, means a nucleic acid that is shorter than or has fewer nucleotides than a full-length nucleic acid. Protein fragments may be, for example, N-terminal fragments (i.e., removal of part of the C-terminus of a protein), C-terminal fragments (i.e., removal of part of the N-terminus of a protein), or internal fragments (i.e., removal of parts of both the N-terminus and C-terminus of a protein). Nucleic acid fragments may be, for example, 5' fragments (i.e., removal of part of the 3' end of a nucleic acid), 3' fragments (i.e., removal of part of the 5' end of a nucleic acid), or internal fragments (i.e., removal of parts of both the 5' end and 3' end of a nucleic acid).

[0083] In the context of two polynucleotide or polypeptide sequences, "sequence identity" or "identity" refers to the residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window. When percentage of sequence identity is used in reference to proteins, residue positions that are not identical often differ by conservative amino acid substitutions, in which an amino acid residue is replaced by another amino acid residue with similar chemical properties (e.g., charge or hydrophobicity) and which therefore do not change the functional properties of the molecule. When sequences differ by conservative substitutions, the percentage of sequence identity can be adjusted upward to correct for the conservative nature of the substitution. Sequences that differ by such conservative substitutions are said to have "sequence similarity" or "similarity." Means for making this adjustment are well known. Typically, this involves scoring a conservative substitution as a partial mismatch rather than a full mismatch, thereby increasing the percentage of sequence identity. Thus, for example, if an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of 0, a conservative substitution is given a score between 0 and 1. The scoring of conservative substitutions is calculated, for example, as implemented in the program PC/GENE (Intelligenetics, Mountain View, California).
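The partial-mismatch scoring described above can be sketched as follows. This is only an illustration under stated assumptions: the residue groups in `CONSERVATIVE_GROUPS` and the partial score of 0.5 are hypothetical choices made for the example and do not reproduce the scoring used by PC/GENE or any other program.

```python
# Illustrative groups of residues treated as conservative substitutions.
# These groups are an assumption for this sketch, not a complete classification.
CONSERVATIVE_GROUPS = [set("IVL"), set("KRH"), set("DE"), set("QN"), set("GS")]


def substitution_score(a, b, partial=0.5):
    """Score 1 for identity, `partial` for a conservative substitution, 0 otherwise."""
    if a == b:
        return 1.0
    if any(a in group and b in group for group in CONSERVATIVE_GROUPS):
        return partial
    return 0.0


def identity_and_similarity(seq1, seq2, partial=0.5):
    """Return (% identity, % similarity) for two gap-free sequences of equal length."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(seq1)
    identical = sum(1 for a, b in zip(seq1, seq2) if a == b)
    scored = sum(substitution_score(a, b, partial) for a, b in zip(seq1, seq2))
    return 100.0 * identical / n, 100.0 * scored / n


# Toy peptides: L/I and K/R are conservative, D and G are identical, Q/A is neither.
identity, similarity = identity_and_similarity("LKDQG", "IRDAG")
```

With these toy peptides the similarity value exceeds the identity value because the two conservative substitutions each contribute partial credit.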

[0084] The "percentage of sequence identity" includes a value determined by comparing two optimally aligned sequences (the maximum number of perfectly matching residues) across a comparison window, where the portion of the polynucleotide sequence in the comparison window may contain additions or deletions (i.e., gaps) when compared to a reference sequence (without additions or deletions) for the optimal alignment of the two sequences. The percentage is calculated by determining the number of positions where identical nucleic acid bases or amino acid residues occur in both sequences, obtaining the number of matching positions, dividing the number of matching positions by the total number of positions in the comparison window, and multiplying the result by 100 to obtain the percentage of sequence identity. Unless otherwise specified (e.g., if the shorter sequence contains a concatenated non-homologous sequence), the comparison window is the full length of the shorter of the two sequences being compared.
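The calculation described above can be sketched in Python for two sequences that have already been aligned. The function name, the use of `-` as the gap character, and the example sequences are assumptions for illustration; the alignment step itself (e.g., by GAP) is not shown, and the comparison window is taken as the full length of the shorter ungapped sequence.

```python
def percent_sequence_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal aligned length.

    Matching positions are counted where both sequences carry the same
    non-gap residue. The count is divided by the length of the shorter
    ungapped sequence (the comparison window) and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    window = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if window == 0:
        raise ValueError("comparison window is empty")
    return 100.0 * matches / window


# Toy alignment: 6 matching positions, shorter sequence has 7 residues.
pid = percent_sequence_identity("ACGTACGT", "ACGT-CGA")
```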

[0085] Unless otherwise specified, sequence identity/similarity values include values obtained using GAP Version 10 with the following parameters: % identity and % similarity for a nucleotide sequence using a GAP weight of 50 and a length weight of 3 and the nwsgapdna.cmp scoring matrix; % identity and % similarity for an amino acid sequence using a GAP weight of 8 and a length weight of 2 and the BLOSUM62 scoring matrix; or any equivalent program. "Equivalent program" includes any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percentage of sequence identity when compared to the corresponding alignment generated by GAP Version 10.

[0086] The term "conservative amino acid substitution" refers to the substitution of an amino acid that is normally present in a sequence with a different amino acid of similar size, charge, or polarity. Examples of conservative substitutions include the substitution of a nonpolar (hydrophobic) residue, such as isoleucine, valine, or leucine, with another nonpolar residue. Similarly, examples of conservative substitutions include the substitution of one polar (hydrophilic) residue with another, such as between arginine and lysine, glutamine and asparagine, or glycine and serine. Additional examples of conservative substitutions include substitutions from one basic residue to another, such as lysine, arginine, or histidine, or from one acidic residue to another, such as aspartic acid or glutamic acid. Examples of non-conservative substitutions include substitutions from a nonpolar (hydrophobic) amino acid residue, such as isoleucine, valine, leucine, alanine, or methionine, to a polar (hydrophilic) residue, such as cysteine, glutamine, glutamic acid, or lysine, and / or from a polar residue to a nonpolar residue. A typical classification of amino acids is summarized below.

[0087] [Table 1]

[0088] Examples of "homologous" sequences (e.g., nucleic acid sequences) include sequences that are identical or substantially identical to a known reference sequence, such as those that are at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to a known reference sequence. Homologous sequences may include, for example, orthologous and paralogous sequences. For example, homologous genes typically originate from a common ancestral DNA sequence through either a speciation event (orthologous gene) or a gene duplication event (paralogous gene). "Orthologous" genes include genes of various species that evolved from a common ancestral gene through speciation. Orthologs typically retain the same function during evolution. "Paralogous" genes include genes that are related by duplication within the genome. Paralogs can evolve to have new functions during evolution.

[0089] The term "in vitro" includes artificial environments and processes or reactions occurring within artificial environments (e.g., test tubes or isolated cells or cell lines). The term "in vivo" refers to natural environments (e.g., living organisms or bodies, or cells or tissues within living organisms or bodies) and processes or reactions occurring within natural environments. The term "ex vivo" includes cells removed from the body of an individual and processes or reactions occurring within such cells.

[0090] The term "reporter gene" refers to a nucleic acid having a sequence encoding a gene product (typically an enzyme) that can be readily and quantitatively assayed when a construct containing the reporter gene sequence operably linked to a heterologous promoter and/or enhancer element is introduced into cells containing (or capable of containing) the factors necessary for activation of the promoter and/or enhancer element. Examples of reporter genes include, but are not limited to, the gene encoding beta-galactosidase (lacZ), the bacterial chloramphenicol acetyltransferase (cat) gene, the firefly luciferase gene, the gene encoding beta-glucuronidase (GUS), and genes encoding fluorescent proteins. "Reporter protein" refers to the protein encoded by the reporter gene.

[0091] As used herein, the term “fluorescent reporter protein” means a reporter protein that is detectable based on fluorescence, where the fluorescence may originate directly from the reporter protein, from the activity of the reporter protein on a fluorogenic substrate, or from a protein having affinity for binding to a fluorescently tagged compound. Examples of fluorescent proteins include green fluorescent proteins (e.g., GFP, GFP-2, tagGFP, turboGFP, eGFP, Emerald, Azami Green, monomeric Azami Green, CopGFP, AceGFP, and ZsGreen1), yellow fluorescent proteins (e.g., YFP, eYFP, Citrine, Venus, YPet, PhiYFP, and ZsYellow1), blue fluorescent proteins (e.g., BFP, eBFP, eBFP2, Azurite, mKalama1, GFPuv, Sapphire, and T-sapphire), cyan fluorescent proteins (e.g., CFP, eCFP, Cerulean, CyPet, AmCyan1, and Midoriishi-Cyan), red fluorescent proteins (e.g., RFP, mKate, mKate2, mPlum, DsRed monomer, mCherry, mRFP1, DsRed-Express, DsRed2, HcRed-Tandem, HcRed1, AsRed2, eqFP611, mRaspberry, mStrawberry, and JRed), orange fluorescent proteins (e.g., mOrange, mKO, Kusabira-Orange, monomeric Kusabira-Orange, mTangerine, and tdTomato), and any other suitable fluorescent protein whose presence in cells can be detected by flow cytometry.

[0092] Repair in response to double-strand breaks (DSBs) occurs primarily through two conserved DNA repair pathways: homologous recombination (HR) and non-homologous end joining (NHEJ). See Kasparek & Humphrey (2011) Semin. Cell Dev. Biol. 22(8):886-897, which is incorporated herein by reference in its entirety for all purposes. Similarly, repair of target nucleic acids mediated by exogenous donor nucleic acids may involve any process of genetic information exchange between two polynucleotides.

[0093] The term “recombination” encompasses any process of exchanging genetic information between two polynucleotides and can occur by any mechanism. Recombination can occur via homology-directed repair (HDR) or homologous recombination (HR). HDR or HR is a form of nucleic acid repair that may require nucleotide sequence homology and that uses a “donor” molecule as a template for repairing a “target” molecule (i.e., a molecule that has experienced a double-strand break), resulting in the transfer of genetic information from the donor to the target. Without wishing to be bound by any particular theory, such transfer may involve correction of mismatches in heteroduplex DNA formed between the broken target and the donor, and/or synthesis-dependent strand annealing, in which the donor is used to resynthesize genetic information that becomes part of the target, and/or related processes. In some cases, a donor polynucleotide, a portion of a donor polynucleotide, a copy of a donor polynucleotide, or a portion of a copy of a donor polynucleotide is incorporated into the target DNA. See Wang et al. (2013) Cell 153:910-918, Mandalos et al. (2012) PLoS ONE 7:e45768:1-9, and Wang et al. (2013) Nat. Biotechnol. 31:530-532, each of which is incorporated herein by reference in its entirety for all purposes.

[0094] Non-homologous end joining (NHEJ) involves the repair of double-strand breaks in a nucleic acid by direct ligation of the broken ends to each other or to an exogenous sequence, without requiring a homologous template. Ligation of non-contiguous sequences by NHEJ can often result in deletions, insertions, or translocations near the site of the double-strand break. NHEJ can also result in targeted integration of an exogenous donor nucleic acid by direct ligation of the broken ends to the ends of the exogenous donor nucleic acid (i.e., NHEJ-based capture). Such NHEJ-mediated targeted integration may be preferred for insertion of an exogenous donor nucleic acid when homology-directed repair (HDR) pathways are not readily usable (e.g., in non-dividing cells, primary cells, and cells that perform homology-based DNA repair poorly). In addition, in contrast to homology-directed repair, knowledge of large regions of sequence identity flanking the cleavage site is not needed, which can be advantageous when attempting targeted insertion in organisms for which knowledge of the genome sequence is limited. Integration may proceed via ligation of blunt ends between the exogenous donor nucleic acid and the cleaved genomic sequence, or via ligation of sticky ends (i.e., ends having 5' or 3' overhangs) using an exogenous donor nucleic acid flanked by overhangs compatible with those generated by the nuclease agent in the cleaved genomic sequence. See, for example, U.S. Patent Application Publication 2011/020722, International Publication 2014/033644, International Publication 2014/089290, and Maresca et al. (2013) Genome Res. 23(3):539-546, each of which is incorporated herein by reference in its entirety for all purposes. When blunt ends are ligated, resection of the target and/or donor may be needed to generate the regions of microhomology required for fragment joining, which may create unwanted alterations in the target sequence.

[0095] A composition or method “comprising” or “including” one or more recited elements may include other elements not specifically recited. For example, a composition that “comprises” or “includes” a protein may contain the protein alone or in combination with other components. The transitional phrase “consisting essentially of” means that the scope of a claim is to be interpreted as encompassing the specified elements recited in the claim and those that do not materially affect the basic and novel characteristics of the claimed invention. Therefore, the term “consisting essentially of,” when used in a claim of the present invention, is not intended to be interpreted as equivalent to “comprising.”

[0096] "Optional" or "optionally" means that the event or situation described thereafter may or may not occur, and that the description includes examples of cases in which the event or situation occurs and examples in which it does not occur.

[0097] Designation of a range of values includes all integers within or defining the range, and all subranges defined by integers within the range.

[0098] Unless otherwise apparent from the context, the term "approximately" (or "about") encompasses values within ±5% of a stated value.
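As a small worked illustration of the ±5% definition above, the following sketch checks whether a measured value falls within the tolerance of a stated value. The function name and the example numbers are hypothetical.

```python
def is_approximately(value, stated, tolerance=0.05):
    """Return True if `value` lies within ±tolerance (default 5%) of `stated`."""
    return abs(value - stated) <= tolerance * abs(stated)


# For a stated value of 20, the ±5% window spans 19 to 21.
within = is_approximately(20.9, 20)    # inside the window
outside = is_approximately(21.5, 20)   # outside the window
```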

[0099] The term "and/or" refers to and encompasses any and all possible combinations of one or more of the associated listed items, as well as the lack of combinations when interpreted in the alternative ("or").

[0100] The term "or" refers to any one of the members of a particular list.

[0101] The singular articles "a," "an," and "the" include plural references unless the context explicitly specifies otherwise. For example, the terms "protein" or "at least one protein" can include multiple proteins, including mixtures thereof.

[0102] Statistically significant means p ≤ 0.05.

(Modes for carrying out the invention)

I. Overview

[0103] This specification discloses non-human animal genomes, non-human animal cells, and non-human animals containing humanized MYOC loci, as well as methods of making and using such non-human animal cells and non-human animals. In some such non-human animal genomes, non-human animal cells, and non-human animals, the humanized MYOC locus contains a glaucoma-associated mutation, such as the Y437H mutation. Some such non-human animal genomes, non-human animal cells, and non-human animals further contain CRISPR/Cas synergistic activation mediator system components. For example, this specification discloses non-human animal genomes, non-human animal cells, and non-human animals containing a clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) synergistic activation mediator (SAM) expression cassette and a humanized MYOC locus (e.g., containing the Y437H mutation), as well as methods of using such non-human animal cells and non-human animals. A MYOC locus containing the Y437H mutation refers to a MYOC locus encoding a myocilin protein that contains the Y437H mutation, or a mutation corresponding to the Y437H mutation in human myocilin when the encoded myocilin protein is optimally aligned with human myocilin (maximum number of perfectly matched residues). The amino acid position nomenclature of the Y437H mutation refers to the position of the mutation in the complete myocilin protein, including the signal peptide. This nomenclature is consistent with the nomenclature used in publications describing this mutation.

[0104] This specification also discloses humanized non-human animal MYOC genes, nucleic acids containing humanized non-human animal MYOC genes, and targeting vectors for use in humanizing non-human animal MYOC genes.

[0105] Non-human animal cells or non-human animals containing the humanized MYOC locus express a human myocilin protein or a chimeric myocilin protein containing one or more fragments of a human myocilin protein. Such non-human animal cells and non-human animals can be used to evaluate the delivery or efficacy of human myocilin targeting agents (e.g., CRISPR/Cas9 genome editing agents, RNAi agents, or ASO agents) in vitro, ex vivo, or in vivo, and can be used in methods of optimizing the delivery or efficacy of such agents in vitro, ex vivo, or in vivo. Where SAM expression cassettes are present, they can be used, for example, to upregulate transcription of a target gene, such as the humanized MYOC gene disclosed herein, in vitro, ex vivo, or in vivo, to achieve higher myocilin expression levels. Such models can be used, for example, to evaluate the delivery or efficacy of candidate therapeutic agents for glaucoma, or of candidate reagents for reducing intraocular pressure (IOP).

[0106] In some of the non-human animal cells and non-human animals disclosed herein, some, most, or all of the human MYOC genomic DNA is inserted into the corresponding non-human animal MYOC locus. In some of the non-human animal cells and non-human animals disclosed herein, some, most, or all of the non-human animal MYOC genomic DNA is replaced one-to-one with the corresponding human MYOC genomic DNA. Because conserved regulatory elements are more likely to remain intact and spliced transcripts that undergo RNA processing are more stable than cDNA, expression levels should be higher than in non-human animals with cDNA insertions, provided that the intron-exon structure and splicing machinery are maintained. Replacing non-human animal genomic sequence with the corresponding human genomic sequence is more likely to result in faithful expression of the transgene from the endogenous MYOC locus. In contrast, transgenic non-human animals with transgenic insertions of human MYOC coding sequences at random genomic loci, rather than at the endogenous non-human animal MYOC locus, do not reflect the endogenous regulation of MYOC expression as accurately. Humanized MYOC alleles, resulting from replacing most or all of the non-human animal genomic DNA one-to-one with the corresponding human genomic DNA, or from inserting human MYOC genomic sequence into the corresponding non-human MYOC locus, provide the true human target, or a close approximation of the true human target, for human MYOC targeting reagents (e.g., CRISPR/Cas9 reagents, RNAi reagents, or ASO reagents designed to target human MYOC). This enables testing of the efficacy and mode of action of such agents in living animals, as well as pharmacokinetic and pharmacodynamic studies in a setting where the humanized protein and humanized gene are the only versions of MYOC present.

[0107] The methods and compositions disclosed herein may optionally employ a non-human animal genome, non-human animal cells, and non-human animals, comprising a chimeric Cas protein expression cassette, a chimeric adapter protein expression cassette, or a synergistic activation mediator (SAM) expression cassette (e.g., a chimeric Cas protein coding sequence and a chimeric adapter protein sequence), such that the components may be constitutively available or, for example, available in a tissue-specific or time-specific manner. The cassette can be incorporated into the genome. Such genomes, non-human animal cells, and non-human animals may also comprise a guide RNA expression cassette (e.g., a MYOC guide RNA expression cassette or a MYOC guide RNA array expression cassette) and / or a recombinase expression cassette, as disclosed elsewhere herein. Alternatively, one or more components (e.g., guide RNA and / or recombinase) may be introduced into cells and non-human animals by other means to induce transcriptional activation of a target gene (e.g., a humanized MYOC gene).

[0108] Since only guide RNA needs to be introduced into a non-human animal to activate the transcription of the target gene, a non-human animal containing a SAM expression cassette simplifies the process for upregulating the expression of a target gene (e.g., a humanized MYOC gene) in vivo. If the non-human animal also contains a guide RNA expression cassette, the effects of target gene activation or upregulation can be studied without introducing any further components. In addition, the SAM expression cassette or guide RNA expression cassette may be a conditional expression cassette that can be selectively expressed in a particular tissue or developmental stage, which can, for example, reduce the risk of Cas-mediated toxicity in vivo. Alternatively, such an expression cassette may be constitutively expressed to allow testing of its activity in all kinds of cells, tissues, and organs.

[0109] Non-human animal genomes, non-human animal cells, and non-human animals are provided, each comprising a humanized MYOC locus (e.g., including the Y437H mutation described elsewhere herein), and one or more nucleic acids (any combination of such SAM-type nucleic acids) encoding a chimeric Cas protein, a chimeric adapter protein, a guide RNA, a recombinase, or any combination thereof. The non-human animal genome, non-human animal cells, or non-human animal may be male or female.

[0110] Non-human animal genomes, non-human animal cells, or non-human animals may be heterozygous or homozygous for humanized MYOC loci (e.g., including the Y437H mutation). Diploid organisms have two alleles at each locus. Each pair of alleles represents the genotype of a particular locus. A genotype is described as homozygous if there are two identical alleles at a particular locus, and heterozygous if the two alleles are different. Non-human animals containing humanized MYOC loci (e.g., including the Y437H mutation described herein) may have humanized MYOC loci in their germline.

[0111] SAM nucleic acids or expression cassettes can be stably integrated into non-human animal cells or the genome of non-human animals (i.e., into chromosomes), or located outside of chromosomes (e.g., replicating DNA outside of chromosomes). SAM nucleic acids or expression cassettes can be randomly integrated into the genome of non-human animals (i.e., transgenic) or integrated into a predetermined region of the genome of a non-human animal (e.g., a safe harbor locus) (i.e., knock-in). The target genomic locus into which the SAM nucleic acid or expression cassette is stably integrated may be heterozygous or homozygous for the nucleic acid or expression cassette. A non-human animal containing a stably integrated SAM nucleic acid or expression cassette as described herein may have the nucleic acid or expression cassette in its germline.

[0112] For example, a non-human animal genome, a non-human animal cell, or a non-human animal may include a chimeric Cas protein expression cassette, a chimeric adapter protein expression cassette, or a synergistic activation mediator (SAM) expression cassette (containing both a chimeric Cas protein coding sequence and a chimeric adapter protein sequence) as disclosed herein. In one example, a non-human animal genome, a non-human animal cell, or a non-human animal includes a SAM expression cassette containing both a chimeric Cas protein coding sequence and a chimeric adapter protein coding sequence. In one example, the SAM expression cassette (or chimeric Cas protein expression cassette or chimeric adapter protein expression cassette) is stably integrated into the genome. The stably integrated SAM expression cassette (or chimeric Cas protein expression cassette or chimeric adapter protein expression cassette) can be randomly integrated into the genome of a non-human animal (i.e., transgenic) or integrated into a predetermined region of the genome of a non-human animal (i.e., knock-in). In one example, the SAM expression cassette (or chimeric Cas protein expression cassette or chimeric adapter protein expression cassette) is stably integrated into a predetermined region of the genome, such as a safe harbor locus (e.g., Rosa26). The target genomic locus into which the SAM expression cassette (or chimeric Cas protein expression cassette or chimeric adapter protein expression cassette) is stably integrated may be heterozygous or homozygous for the SAM expression cassette (or chimeric Cas protein expression cassette or chimeric adapter protein expression cassette).

[0113] Optionally, the non-human animal genome, non-human animal cell, or non-human animal described above may further include one or more guide RNAs or guide RNA expression cassettes (e.g., guide RNA array expression cassettes). The guide RNA expression cassettes can be stably integrated into the genome of a non-human animal cell or non-human animal (i.e., into a chromosome) or located outside the chromosome (e.g., as extrachromosomally replicating DNA, or introduced into a non-human animal cell or non-human animal via AAV, LNP, or any other means disclosed herein). The guide RNA expression cassettes can be randomly integrated into the genome of a non-human animal (i.e., transgenic) or integrated into a predetermined region of the genome of a non-human animal (e.g., a safe harbor locus) (i.e., knock-in). The target genomic locus into which the guide RNA expression cassette is stably integrated may be heterozygous or homozygous for the guide RNA expression cassette. In one example, a genome, cell, or non-human animal contains both a SAM expression cassette (or a chimeric Cas protein expression cassette or a chimeric adapter protein expression cassette) and a guide RNA expression cassette. In one example, both cassettes are integrated into the genome. The guide RNA expression cassette can be integrated into a different target genomic locus than the SAM expression cassette (or the chimeric Cas protein expression cassette or chimeric adapter protein expression cassette), or into the same target locus (e.g., the Rosa26 locus, such as the first intron of the Rosa26 locus). For example, a non-human animal genome, a non-human animal cell, or a non-human animal may be heterozygous for each of the SAM expression cassette (or chimeric Cas protein expression cassette or chimeric adapter protein expression cassette) and the guide RNA expression cassette, having one allele of a target genomic locus (e.g., Rosa26) containing the SAM expression cassette (or chimeric Cas protein expression cassette or chimeric adapter protein expression cassette) and a second allele of the target genomic locus (e.g., Rosa26) containing the guide RNA expression cassette.

[0114] Optionally, any of the non-human animal genomes, non-human animal cells, or non-human animals described above may further comprise a recombinase expression cassette. The recombinase expression cassette can be stably integrated into the genome (i.e., into the chromosome) of a non-human animal cell or non-human animal, or located outside the chromosome (e.g., by replicating DNA outside the chromosome, or by being introduced into a non-human animal cell or non-human animal via AAV, LNP, HDD, or any other means disclosed herein). The recombinase expression cassette can be randomly integrated into the genome of a non-human animal (i.e., transgenic), or integrated into a predetermined region of the genome of a non-human animal (e.g., a safe harbor locus) (i.e., knock-in). The target genomic locus into which the recombinase expression cassette is stably integrated may be heterozygous or homozygous for the recombinase expression cassette. The recombinase expression cassette can be incorporated into a different target genomic locus than any of the other expression cassettes disclosed herein, or it can be incorporated into the genome into the same target locus (e.g., the Rosa26 locus, such as being incorporated into the first intron of the Rosa26 locus).

[0115] Some non-human animals or non-human animal cells described herein (e.g., non-human animals or non-human animal cells comprising a humanized MYOC locus, a SAM expression cassette, and one or more SAM guide RNAs (or one or more SAM guide RNA expression cassettes) targeting the humanized MYOC locus) have increased human MYOC mRNA or protein expression compared to control non-human animals or non-human animal cells (e.g., non-human animals or non-human animal cells having the humanized MYOC locus described herein but not administered the one or more SAM guide RNAs). In some non-human animals or non-human animal cells, the increase in expression is at least about 2-fold, at least about 3-fold, at least about 4-fold, at least about 5-fold, at least about 6-fold, at least about 7-fold, at least about 8-fold, at least about 9-fold, at least about 10-fold, at least about 11-fold, at least about 12-fold, at least about 13-fold, at least about 14-fold, or at least about 15-fold (for example, at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 11-fold, at least 12-fold, at least 13-fold, at least 14-fold, or at least 15-fold). In some non-human animals or non-human animal cells, the increase in expression is about 2-fold to about 25-fold, about 3-fold to about 25-fold, about 4-fold to about 25-fold, about 5-fold to about 25-fold, about 6-fold to about 25-fold, about 7-fold to about 25-fold, about 8-fold to about 25-fold, about 9-fold to about 25-fold, about 10-fold to about 25-fold, about 2-fold to about 20-fold, about 2-fold to about 15-fold, or about 10-fold to about 15-fold. The increased human MYOC mRNA or protein expression compared to control non-human animals may be in the eye, limbal ring, retina, ciliary body, trabecular meshwork, or cornea. In specific examples, the increased expression is in the limbal ring (trabecular meshwork (TM), iris, and ciliary body (CB)).
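As an illustration of how the fold-increase thresholds above can be read, the following minimal Python sketch computes a fold change from two expression measurements and reports the highest whole-number threshold (2-fold through 15-fold) that it satisfies. The function names and the example expression values are hypothetical and are not taken from this specification; the sketch assumes both measurements are already normalized to the same reference.

```python
def fold_change(treated: float, control: float) -> float:
    """Ratio of expression in the treated sample to expression in the control."""
    if control <= 0:
        raise ValueError("control expression must be positive")
    return treated / control


def highest_threshold_met(fc: float, thresholds=range(2, 16)):
    """Largest 'at least N-fold' threshold (N = 2..15) satisfied by fc, or None."""
    met = [t for t in thresholds if fc >= t]
    return max(met) if met else None


# Hypothetical normalized expression values (arbitrary units).
fc = fold_change(treated=30.0, control=4.0)
print(fc, highest_threshold_met(fc))
```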

[0116] Some non-human animals or non-human animal cells described herein (e.g., non-human animals or non-human animal cells comprising a humanized MYOC locus, a SAM expression cassette, and one or more SAM guide RNAs (or one or more SAM guide RNA expression cassettes) targeting the humanized MYOC locus) have one or more signs or symptoms of glaucoma. Glaucoma is a chronic optic neuropathy characterized by progressive loss of retinal ganglion cell (RGC) axons, resulting in irreversible vision loss. The primary risk factor for glaucoma is elevated intraocular pressure (IOP). Elevated IOP is caused by increased resistance to aqueous humor outflow through the trabecular meshwork (TM). Aqueous humor is produced by the ciliary body, circulates in the anterior chamber, and flows out through the TM. In most cases of glaucoma, there is increased resistance to aqueous humor outflow in the TM. Pathogenic MYOC mutant proteins aggregate within cells, leading to TM stress, elevated IOP, and glaucoma. Elevated levels of ER stress-induced proteins have been detected in human glaucomatous TM cells.

[0117] In some non-human animals, the IOP is at least about 15, at least about 16, at least about 17, at least about 18, at least about 19, at least about 20, at least about 21, or at least about 22 mmHg (e.g., at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, or at least 22 mmHg). In one example, the IOP is about 15–22, about 16–22, about 17–22, about 18–22, about 19–22, about 15–21, about 15–20, or about 16–21 mmHg (e.g., 15–22, 16–22, 17–22, 18–22, 19–22, 15–21, 15–20, or 16–21 mmHg).

[0118] In some non-human animals, IOP is increased compared to control non-human animals (e.g., non-human animals having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs, or wild-type non-human animals). In one example, IOP is at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, or at least about 6 mmHg (e.g., at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 mmHg) higher than the control baseline in the control non-human animals. In another example, IOP is about 1–7, about 2–7, about 3–7, about 4–7, about 5–7, about 1–6, about 2–6, about 3–6, about 4–6, or about 5–6 mmHg (e.g., 1–6, 2–6, 3–6, 4–6, or 5–6 mmHg) higher than the control baseline in the control non-human animals. The control baseline may be, for example, an IOP in a non-human animal having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs, or an IOP in a wild-type non-human animal, or an IOP in a non-human animal having the humanized MYOC locus described herein but before administration of one or more SAM guide RNAs targeting the humanized MYOC locus.
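The comparison against a control baseline described above can be sketched as follows. This is a minimal Python illustration that assumes the baseline is taken as the mean IOP of a control group; the function names and the readings are hypothetical and are not data from this specification.

```python
from statistics import mean


def iop_elevation(treated_mmhg, baseline_mmhg) -> float:
    """Mean IOP of the treated group minus mean IOP of the control baseline (mmHg)."""
    return mean(treated_mmhg) - mean(baseline_mmhg)


def within(delta: float, low: float, high: float) -> bool:
    """True if the elevation falls in the inclusive range [low, high] mmHg."""
    return low <= delta <= high


# Hypothetical readings in mmHg.
treated = [18.0, 19.0, 20.0]
baseline = [13.0, 14.0, 15.0]
delta = iop_elevation(treated, baseline)
print(delta, within(delta, 1, 7))
```

Which baseline is used (humanized animals without guide RNAs, wild-type animals, or the same animals before administration) changes only the second argument.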

[0119] In some non-human animals, endoplasmic reticulum (ER) stress is increased compared to control non-human animals (e.g., non-human animals having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs, or wild-type non-human animals). Some non-human animals exhibit signs and symptoms of glaucoma as measured by pattern electroretinogram (PERG). Some non-human animals show retinal ganglion cell (RGC) loss compared to control non-human animals (e.g., non-human animals having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs, or wild-type non-human animals). Some non-human animals show increased aqueous humor outflow resistance compared to control non-human animals (e.g., non-human animals having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs, or wild-type non-human animals).

[0120] A MYOC glaucoma model with transgenic overexpression of human myocilin Y437H shows only a slight increase in IOP of approximately 2–3 mmHg, and MYOC is expressed ubiquitously. In addition, the IOP phenotype is lost over successive generations of breeding. The model described herein has the advantage that expression is largely confined to the trabecular meshwork, the target tissue of the disease pathology. Furthermore, a greater increase in IOP of approximately 5–6 mmHg is observed.

II. Non-human animals containing a humanized MYOC locus

[0121] The nucleic acids, non-human animal genomes, non-human animal cells, and non-human animals disclosed herein comprise a humanized MYOC locus (e.g., comprising the Y437H mutation described herein). Cells or non-human animals comprising a humanized MYOC locus express a human myocilin protein or a partially humanized, chimeric myocilin protein in which one or more fragments of the native myocilin protein are replaced with corresponding fragments from human myocilin.

A. Myocilin (MYOC)

[0122] The non-human animal genomes, non-human animal cells, and non-human animals disclosed herein comprise a humanized MYOC locus. Myocilin (also known as trabecular meshwork-inducible glucocorticoid-responsive protein or MYOC) is encoded by the MYOC gene (also known as GLC1A or TIGR). Myocilin is a secreted glycoprotein (55-57 kDa). MYOC was first reported as a glucocorticoid-inducible gene and protein in cultured human trabecular meshwork (TM) cells. Myocilin mRNA and/or protein are expressed in ocular structures, including the retina, ciliary body, and trabecular meshwork. It may perform structural functions in the cytoplasm or possibly associate with other intracellular molecules as a molecular chaperone. Extracellularly, it may be involved in generating resistance to aqueous humor outflow by binding to other extracellular molecules or the cell membrane of TM cells.

[0123] Many MYOC mutations cause myocilin to accumulate in TM cells, inducing glaucoma. Mutations in myocilin are the most common genetic cause of primary open-angle glaucoma, the most common form of glaucoma. However, the underlying mechanisms of MYOC-associated glaucoma are not fully understood. The main risk factor for glaucoma is elevated intraocular pressure (IOP), which can be caused by increased resistance to aqueous humor outflow through the structure of the TM. Myocilin is expressed in many ocular tissues capable of secreting myocilin into the aqueous humor, including TM cells and the ciliary body. Many different myocilin mutations have been identified. The Y437H mutation is one mutation associated with elevated IOP and the development of glaucoma.

[0124] Human MYOC maps to 1q24.3 on chromosome 1 (NCBI RefSeq Gene ID 4653, Assembly GRCh38.p13 (GCF_000001405.39), location NC_000001.11 (171635417..171652688, complement)). This gene has been reported to have three exons. The wild-type human myocilin protein is assigned UniProt accession number Q99972. The sequence of the canonical wild-type isoform of myocilin (NCBI accession number NP_000252.1) is set forth in SEQ ID NO: 1. The mRNA (cDNA) encoding the canonical isoform is assigned NCBI accession number NM_000261.2 and is set forth in SEQ ID NO: 2. An exemplary coding sequence (CDS) is set forth in SEQ ID NO: 3 (CCDS ID CCDS1297.1). The full-length human myocilin protein set forth in SEQ ID NO: 1 has 504 amino acids, including a signal peptide (amino acids 1-32) and mature myocilin (amino acids 33-504), which, upon cleavage by CAPN2 following amino acid 226 in the endoplasmic reticulum, produces an N-terminal fragment (33-226) and a C-terminal fragment (227-504). The delineation of these domains is as specified in UniProt. References to human myocilin include the canonical (wild-type) form as well as all allelic forms and isoforms. Any other form of human myocilin has amino acids numbered for maximum alignment with the wild-type form, and aligned amino acids are designated by the same number. An exemplary human myocilin protein containing the Y437H mutation is set forth in SEQ ID NO: 4. An exemplary coding sequence for the human myocilin protein containing the Y437H mutation is set forth in SEQ ID NO: 5.

[0125] The mouse Myoc gene maps to 1 H2.1; 1 70.29 cM on chromosome 1 (NCBI RefSeq Gene ID 17926, Assembly GRCm39 (GCF_000001635.27), location NC_000067.7 (162466719..162477263)). This gene has been reported to have three exons. The wild-type mouse myocilin protein is assigned UniProt accession number O70624. The sequence of the canonical isoform, NCBI accession number NP_034995.3, is set forth in SEQ ID NO: 6. An exemplary mRNA (cDNA) encoding the canonical isoform is assigned NCBI accession number NM_010865.3 and is set forth in SEQ ID NO: 7. An exemplary coding sequence (CDS) (CCDS ID CCDS15422.1) is set forth in SEQ ID NO: 8. The full-length mouse myocilin protein set forth in SEQ ID NO: 6 has 490 amino acids, including a signal peptide (amino acids 1-18) and mature myocilin (amino acids 19-490), which, upon cleavage by CAPN2 following amino acid 212 in the endoplasmic reticulum, produces an N-terminal fragment (19-212) and a C-terminal fragment (213-490). The delineation of these domains is as specified in UniProt. References to mouse myocilin include the canonical (wild-type) form as well as all allelic forms and isoforms. Any other form of mouse myocilin has amino acids numbered for maximum alignment with the wild-type form, and aligned amino acids are designated by the same number.

[0126] The rat Myoc gene maps to 13q22 on chromosome 13 (NCBI RefSeq Gene ID 81523, Assembly mRatBN7.2 (GCF_015227675.2), location NC_051348.1 (74976730..74987128)). This gene has been reported to have three exons. The wild-type rat myocilin protein is assigned UniProt accession number Q9R1J4. The sequence of the canonical isoform, NCBI accession number NP_110492.1, is set forth in SEQ ID NO: 9. The mRNA (cDNA) encoding the canonical isoform is assigned NCBI accession number NM_030865.1 and is set forth in SEQ ID NO: 10. An exemplary coding sequence (CDS) encoding the canonical isoform is set forth in SEQ ID NO: 11. The sequence of another isoform, NCBI accession number NP_110492.2, is set forth in SEQ ID NO: 12. The mRNA (cDNA) encoding this isoform is assigned NCBI accession number NM_030865.2 and is set forth in SEQ ID NO: 13. An exemplary coding sequence (CDS) encoding this isoform is set forth in SEQ ID NO: 14. The canonical full-length rat myocilin protein set forth in SEQ ID NO: 9 has 502 amino acids, including a signal peptide (amino acids 1-31) and mature myocilin (amino acids 32-502), which, upon cleavage by CAPN2 following amino acid 225 in the endoplasmic reticulum, produces an N-terminal fragment (32-225) and a C-terminal fragment (226-502). The delineation of these domains is as specified in UniProt. References to rat myocilin include the canonical (wild-type) form as well as all allelic forms and isoforms. Any other form of rat myocilin has amino acids numbered for maximum alignment with the wild-type form, and aligned amino acids are designated by the same number.
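The domain coordinates given above for human, mouse, and rat myocilin can be collected in a small lookup table and checked for internal consistency. The following Python sketch is illustrative only: the data structure and function names are hypothetical, while the coordinates are the ones stated in the three preceding paragraphs. It verifies that the signal peptide and mature protein tile the full-length protein, that the N-terminal and C-terminal fragments tile the mature protein, and it reports which fragment contains a given residue.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Inclusive, 1-based amino acid range."""
    start: int
    end: int

    def length(self) -> int:
        return self.end - self.start + 1

    def contains(self, pos: int) -> bool:
        return self.start <= pos <= self.end


# Coordinates as stated in the text for the canonical isoforms.
MYOCILIN = {
    "human": {"length": 504, "signal": Span(1, 32), "mature": Span(33, 504),
              "n_term": Span(33, 226), "c_term": Span(227, 504)},
    "mouse": {"length": 490, "signal": Span(1, 18), "mature": Span(19, 490),
              "n_term": Span(19, 212), "c_term": Span(213, 490)},
    "rat": {"length": 502, "signal": Span(1, 31), "mature": Span(32, 502),
            "n_term": Span(32, 225), "c_term": Span(226, 502)},
}


def is_consistent(rec: dict) -> bool:
    """Check that the annotated regions tile the protein without gaps or overlaps."""
    sig, mat, n, c = rec["signal"], rec["mature"], rec["n_term"], rec["c_term"]
    return (sig.start == 1 and sig.end + 1 == mat.start
            and mat.end == rec["length"]
            and n.start == mat.start and n.end + 1 == c.start
            and c.end == mat.end)


def fragment_of(species: str, pos: int) -> str:
    """Name of the region (signal, n_term, or c_term) containing residue pos."""
    rec = MYOCILIN[species]
    for name in ("signal", "n_term", "c_term"):
        if rec[name].contains(pos):
            return name
    raise ValueError(f"position {pos} is outside the {species} protein")


print(all(is_consistent(r) for r in MYOCILIN.values()))
print(fragment_of("human", 437))  # residue affected by the Y437H mutation
```

With the stated coordinates, residue 437 of human myocilin falls in the C-terminal fragment (227-504).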

B. Humanized MYOC locus

[0127] This specification discloses a humanized endogenous MYOC locus in which a segment of the endogenous MYOC locus is deleted and replaced with a corresponding human MYOC sequence (e.g., a corresponding human MYOC genomic sequence), and in which a humanized myocilin protein is expressed from the humanized endogenous MYOC locus. The corresponding human MYOC sequence may include a mutation, such as a mutation associated with glaucoma (e.g., a mutation that causes glaucoma). An example of such a mutation is the Y437H mutation. Nearly 100 different pathogenic MYOC mutations have been identified, the majority of which are clustered in exon 3, which encodes the olfactomedin domain. Examples of such mutations include Y437H, G364V, and Q368X. See, e.g., Lynch et al. (2018) J. Biol. Chem. 293(52): 20137-20156 and Jain et al. (2017) Proc. Natl. Acad. Sci. USA 114(42): 11199-11204, each of which is incorporated herein by reference in its entirety for all purposes.

[0128] A humanized MYOC locus may be a MYOC locus in which the entire MYOC gene is replaced with the corresponding human MYOC sequence (e.g., the corresponding orthologous human MYOC sequence) or a codon-optimized version thereof; a MYOC locus in which only a portion of the MYOC gene is replaced with the corresponding human MYOC sequence (i.e., humanized) or a codon-optimized version thereof; a MYOC locus into which a portion of the corresponding human MYOC locus or a codon-optimized version thereof is inserted; or a MYOC locus in which a portion of the MYOC gene is deleted and a portion of the corresponding human MYOC locus or a codon-optimized version thereof is inserted. The inserted portion of the corresponding human MYOC locus may, for example, include more of the human MYOC locus than is deleted from the endogenous MYOC locus. A human MYOC sequence corresponding to a particular segment of an endogenous MYOC sequence refers to the region of human MYOC that aligns with that segment when human MYOC and the endogenous MYOC are optimally aligned (i.e., aligned to give the greatest number of perfectly matched residues). The corresponding human sequence can include, for example, complementary DNA (cDNA) or genomic DNA. Optionally, a codon-optimized version of the corresponding human MYOC sequence can be used, that is, a version modified based on codon usage in the non-human animal. The replaced or inserted (i.e., humanized) region can include coding regions such as exons, non-coding regions such as introns, untranslated regions, or regulatory regions (e.g., promoters, enhancers, or transcriptional repressor binding elements), or any combination thereof. As one example, exons corresponding to one, two, or all three exons of the human MYOC gene can be humanized. For example, exons corresponding to exons 1-3 of the human MYOC gene can be humanized. Alternatively, a region of MYOC encoding an epitope recognized by an anti-human-myocilin antigen-binding protein, or a region targeted by a human-myocilin-targeting reagent (e.g., a small molecule), can be humanized. Likewise, introns corresponding to one or both of the two introns of the human MYOC gene can be humanized or can remain endogenous. For example, introns corresponding to the introns between exons 1 and 3 of the human MYOC gene (i.e., introns 1 and 2) can be humanized.

[0129] Flanking untranslated regions containing regulatory sequences may also be humanized or remain endogenous. For example, the 5' untranslated region (UTR), the 3' UTR, or both the 5' UTR and the 3' UTR may be humanized, or the 5' UTR, the 3' UTR, or both the 5' UTR and the 3' UTR may remain endogenous. One or both of the human 5' and 3' UTRs may be inserted, and/or one or both of the endogenous 5' and 3' UTRs may be deleted. In a specific example, the human 3' UTR is inserted, and the 5' UTR remains endogenous. Depending on the extent of replacement by the corresponding human sequence, regulatory sequences such as the promoter may be endogenous or may be supplied by the replacing human sequence. For example, a humanized MYOC locus may contain the endogenous non-human animal MYOC promoter (i.e., the inserted human MYOC sequence, including the humanized MYOC coding sequence, can be operably linked to the endogenous non-human animal MYOC promoter).

[0130] One or more or all of the signal peptide, mature myocilin, N-terminal fragment, or C-terminal fragment can be humanized, or one or more of such regions may remain endogenous. Exemplary coding sequences for the mouse myocilin signal peptide, mature myocilin, N-terminal fragment, and C-terminal fragment are set forth in SEQ ID NOs: 31-34, respectively. Exemplary coding sequences for the rat myocilin signal peptide, mature myocilin, N-terminal fragment, and C-terminal fragment are set forth in SEQ ID NOs: 39-42, respectively. Exemplary coding sequences for the human myocilin signal peptide, mature myocilin, N-terminal fragment, and C-terminal fragment are set forth in SEQ ID NOs: 21-24, respectively. An exemplary coding sequence for human mature myocilin containing the Y437H mutation is set forth in SEQ ID NO: 25. An exemplary coding sequence for the human myocilin C-terminal fragment containing the Y437H mutation is set forth in SEQ ID NO: 26.

[0131] For example, all or part of the region of the MYOC locus encoding the signal peptide can be humanized, and/or all or part of the region of the MYOC locus encoding the C-terminal fragment can be humanized, and/or all or part of the region of the MYOC locus encoding the N-terminal fragment can be humanized, and/or all or part of the region of the MYOC locus encoding the mature myocilin protein can be humanized. In one example, the entire region of the MYOC locus encoding the myocilin protein (including the signal peptide and mature myocilin) is humanized. Optionally, the CDS of the human myocilin protein comprises a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof) (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof)). Optionally, the CDS of the human myocilin protein consists essentially of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof) (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof)). Optionally, the CDS of the human myocilin protein consists of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof) (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof)). The myocilin protein can be expressed and can retain the activity of native myocilin and/or human myocilin.
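A degenerate form of a coding sequence differs at the nucleotide level but encodes the same protein, because most amino acids are specified by more than one codon. The Python sketch below illustrates this with the standard genetic code. The nine-nucleotide sequences are toy examples chosen for illustration, not excerpts of SEQ ID NO: 5, and the function names are hypothetical. The last lines show how a single-nucleotide change in a tyrosine codon yields a histidine codon, which is the kind of change that underlies a Tyr-to-His substitution such as Y437H.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of three."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_degenerate_variant(cds_a: str, cds_b: str) -> bool:
    """True if the two sequences differ but encode the same amino acid sequence."""
    return cds_a != cds_b and translate(cds_a) == translate(cds_b)


# Toy sequences: both encode Met-Tyr-His using different synonymous codons.
print(is_degenerate_variant("ATGTACCAC", "ATGTATCAT"))

# A single T-to-C change in the first codon position converts Tyr to His.
print(translate("TAC"), translate("CAC"))
```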

[0132] One or more of the regions encoding the signal peptide, mature myocilin protein, N-terminal fragment, or C-terminal fragment may remain endogenous. For example, the region encoding the signal peptide may remain endogenous.

[0133] Myocilin proteins encoded by the humanized MYOC locus may contain one or more domains derived from human myocilin protein and/or one or more domains derived from endogenous (i.e., native) myocilin protein. Exemplary amino acid sequences of the mouse myocilin signal peptide, mature myocilin protein, N-terminal fragment, and C-terminal fragment are set forth in SEQ ID NOs: 27-30, respectively. Exemplary amino acid sequences of the rat myocilin signal peptide, mature myocilin protein, N-terminal fragment, and C-terminal fragment are set forth in SEQ ID NOs: 35-38, respectively. Exemplary amino acid sequences of the human myocilin signal peptide, mature myocilin protein, N-terminal fragment, and C-terminal fragment are set forth in SEQ ID NOs: 15-18, respectively. An exemplary amino acid sequence of the human mature myocilin protein containing the Y437H mutation is set forth in SEQ ID NO: 19. An exemplary amino acid sequence of the human myocilin C-terminal fragment containing the Y437H mutation is set forth in SEQ ID NO: 20.

[0134] Myocilin proteins can include one or more or all of the following: a human myocilin signal peptide, a human myocilin C-terminal fragment, a human myocilin N-terminal fragment, and a human mature myocilin protein. For example, the myocilin protein encoded by the humanized MYOC locus may be a complete human myocilin protein (i.e., the signal peptide and the mature myocilin protein).

[0135] Myocilin proteins encoded by the humanized MYOC locus may also contain one or more domains derived from the endogenous (i.e., native) non-human animal myocilin protein.

[0136] Domains in a chimeric myocilin protein or a complete human myocilin protein that are derived from human myocilin protein can be encoded by a fully humanized sequence (i.e., the entire sequence encoding the domain is replaced with the corresponding human MYOC sequence) or by a partially humanized sequence (i.e., a portion of the sequence encoding the domain is replaced with the corresponding human MYOC sequence, and the remaining endogenous (i.e., native) sequence encoding the domain encodes the same amino acids as the corresponding human MYOC sequence, so that the encoded domain is identical to that domain in human myocilin protein). Similarly, domains in a chimeric protein that are derived from endogenous myocilin protein can be encoded by a fully endogenous sequence (i.e., the entire sequence encoding the domain is the endogenous MYOC sequence) or by a partially humanized sequence (i.e., a portion of the sequence encoding the domain is replaced with the corresponding human MYOC sequence, but the corresponding human MYOC sequence encodes the same amino acids as the replaced endogenous MYOC sequence, so that the encoded domain is identical to that domain in endogenous myocilin protein).

[0137] For example, the myocilin protein encoded by the humanized MYOC locus may comprise the complete human myocilin protein (i.e., the signal peptide and the mature myocilin protein). The myocilin protein is expressed and retains the activity of native myocilin and/or human myocilin.

[0138] For example, the myocilin protein encoded by the humanized MYOC locus may comprise a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 4 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 4). As another example, the myocilin protein encoded by the humanized MYOC locus may consist essentially of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 4 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 4). As another example, the myocilin protein encoded by the humanized MYOC locus may consist of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 4 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 4). Optionally, the MYOC coding sequence (CDS) of the humanized MYOC locus may comprise a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof) (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof)). Optionally, the MYOC CDS of the humanized MYOC locus may consist essentially of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof) (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof)). Optionally, the MYOC CDS of the humanized MYOC locus may consist of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof) (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5 (or a degenerate variant thereof)). In each case, the myocilin protein can be expressed and can retain the activity of native myocilin and/or human myocilin.
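The percent-identity thresholds recited above can be illustrated with a short calculation over two sequences that have already been aligned. The Python sketch below is a simplified illustration: it assumes the alignment is supplied as two equal-length strings with "-" marking gaps, and it assumes identity is counted over all alignment columns that contain at least one residue. Other conventions for the denominator exist, and this specification may define identity differently elsewhere. The sequences and function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are pre-aligned, equal-length strings with '-' as the gap
    character. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if not (x == "-" and y == "-")]
    if not pairs:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def meets_threshold(identity: float, threshold: float) -> bool:
    """True if the identity is at least the given percentage."""
    return identity >= threshold


# Toy ten-residue sequences differing at one position.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(identity, meets_threshold(identity, 85), meets_threshold(identity, 95))
```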

[0139] Optionally, the humanized MYOC locus may contain other elements. Examples of such elements include a selection cassette, a reporter gene, a recombinase recognition site, or other elements. Alternatively, the humanized MYOC locus may lack other elements (e.g., it may lack a selection marker or selection cassette). Examples of suitable reporter genes and reporter proteins are disclosed elsewhere in this specification. Examples of suitable selection markers include neomycin phosphotransferase (neo^r), hygromycin B phosphotransferase (hyg^r), puromycin-N-acetyltransferase (puro^r), blasticidin S deaminase (bsr^r), xanthine/guanine phosphoribosyl transferase (GPT), and herpes simplex virus thymidine kinase (HSV-K). Examples of recombinases include Cre, FLP, and Dre recombinases. An example of a Cre recombinase gene is Crei, in which the two exons encoding Cre recombinase are separated by an intron to prevent expression in prokaryotic cells. Such recombinases may further comprise a nuclear localization signal to promote localization to the nucleus (e.g., NLS-Crei). Recombinase recognition sites are nucleotide sequences that are recognized by site-specific recombinases and serve as substrates for recombination events. Examples of recombinase recognition sites include FRT, FRT11, FRT71, attp, att, rox, and lox sites such as loxP, lox511, lox2272, lox66, lox71, loxM2, and lox5171.

[0140] Other elements, such as a reporter gene or selection cassette, may be self-deleting cassettes flanked by recombinase recognition sites. See, for example, U.S. Patent No. 8,697,851 and U.S. Patent Application Publication No. 2013/0312129, each of which is incorporated herein by reference in its entirety for all purposes. As an example, a self-deleting cassette may include a Crei gene (comprising two exons encoding Cre recombinase separated by an intron) operably linked to the mouse Prm1 promoter and a neomycin resistance gene operably linked to the human ubiquitin promoter. By employing the Prm1 promoter, the self-deleting cassette can be deleted specifically in the male germ cells of F0 animals. A polynucleotide encoding a selection marker may be operably linked to a promoter that is active in the target cell. Examples of promoters are described elsewhere herein. As another specific example, a self-deleting selection cassette may contain a hygromycin resistance gene coding sequence operably linked to one or more promoters (e.g., both the human ubiquitin and EM7 promoters), followed by a polyadenylation signal, followed by a Crei coding sequence operably linked to one or more promoters (e.g., the mPrm1 promoter), followed by another polyadenylation signal, with the entire cassette flanked by loxP sites.
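The element order of the self-deleting hygromycin cassette described above, and the outcome of its excision, can be modeled with a simple list. The Python sketch below is a toy model under stated assumptions: it treats the locus as an ordered list of named elements, assumes exactly two loxP sites in the same orientation, and represents Cre-mediated recombination as removal of everything between them together with one of the two sites, leaving a single loxP site behind. The element labels and function name are hypothetical shorthand for the components listed in the text.

```python
def excise_between(elements: list, site: str = "loxP") -> list:
    """Remove the segment flanked by two recognition sites, leaving one site behind.

    Toy model only: assumes exactly two sites in the same (direct) orientation.
    """
    positions = [i for i, name in enumerate(elements) if name == site]
    if len(positions) != 2:
        raise ValueError(f"expected exactly two {site} sites, found {len(positions)}")
    first, second = positions
    # Keep everything up to and including the first site, then resume after the second.
    return elements[:first + 1] + elements[second + 1:]


# Element order follows the hygromycin self-deleting cassette described in the text.
locus = [
    "human MYOC sequence",
    "loxP",
    "hUb + EM7 promoters", "hyg resistance CDS", "polyA",
    "mPrm1 promoter", "Crei CDS", "polyA",
    "loxP",
    "endogenous 3' flank",
]

print(excise_between(locus))
```

In this model the locus after excision retains the human MYOC sequence, a single loxP site, and the downstream endogenous sequence.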

[0141] The humanized MYOC locus may also be a conditional allele. For example, a conditional allele may be a multifunctional allele, as described in U.S. Patent Application Publication No. 2011/0104799, which is incorporated herein by reference in its entirety for all purposes. For example, a conditional allele may include (a) an actuating sequence in sense orientation with respect to transcription of a target gene, (b) a drug selection cassette (DSC) in sense or antisense orientation, (c) a nucleotide sequence of interest (NSI) in antisense orientation, and (d) a conditional by inversion module (COIN), which utilizes modules such as an exon-splitting intron and an invertible gene trap. See, for example, U.S. Patent Application Publication No. 2011/0104799. The conditional allele may further include recombinable units that recombine upon exposure to a first recombinase to form a conditional allele that (i) lacks the actuating sequence and the DSC, and (ii) contains the NSI in sense orientation and the COIN in antisense orientation. See, for example, U.S. Patent Application Publication No. 2011/0104799.

[0142] One exemplary humanized MYOC locus (e.g., a humanized mouse Myoc locus) is one in which the region from the start codon to the stop codon is replaced with the corresponding human sequence from the start codon to the stop codon, and which contains the human MYOC 3' UTR. See Figure 1. Exemplary sequences of the humanized MYOC locus are set forth in SEQ ID NOs: 88 and 89.

[0143] In one specific example, the human MYOC sequence at the humanized endogenous MYOC locus may comprise a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence set forth in SEQ ID NO: 87 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 87). In another specific example, the humanized MYOC locus may encode a protein comprising a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence set forth in SEQ ID NO: 4 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 4). In another specific example, the humanized MYOC locus may comprise a coding sequence comprising a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence set forth in SEQ ID NO: 5 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5). In another specific example, the humanized MYOC locus may comprise a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence set forth in SEQ ID NO: 88 or 89 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 88 or 89).

[0144] In one specific example, the human MYOC sequence at the humanized endogenous MYOC locus may consist essentially of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence set forth in SEQ ID NO: 87 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 87). In another specific example, the humanized MYOC locus may encode a protein consisting essentially of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence set forth in SEQ ID NO: 4 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 4). In another specific example, the humanized MYOC locus may comprise a coding sequence consisting essentially of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence set forth in SEQ ID NO: 5 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5). In another specific example, the humanized MYOC locus may consist essentially of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence set forth in SEQ ID NO: 88 or 89 (e.g., at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 88 or 89).

[0145] In one specific example, the human MYOC sequence at the humanized endogenous MYOC locus may consist of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence described in SEQ ID NO: 87 (for example, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 87). In another specific example, a humanized MYOC locus may encode a protein consisting of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence described in SEQ ID NO: 4 (for example, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 4). In another specific example, a humanized MYOC locus may include a coding sequence consisting of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence described in SEQ ID NO: 5 (for example, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5).
In another specific example, a humanized MYOC locus may consist of a sequence that is at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% identical to the sequence described in SEQ ID NO: 88 or 89 (for example, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 88 or 89).
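The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (with `-` marking gaps) and that identity is counted over all aligned columns; other conventions for the denominator exist, and the specification does not fix one in this passage. The function names and the example sequences are hypothetical and are not taken from any SEQ ID NO.

```python
# Thresholds recited in the text (percent identity).
IDENTITY_THRESHOLDS = (85, 90, 95, 96, 97, 98, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical residues ('-' is a gap).

    Assumption: the denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity: float) -> list[int]:
    """Return every recited threshold that the identity value satisfies."""
    return [t for t in IDENTITY_THRESHOLDS if identity >= t]


# Hypothetical 10-residue alignment with one mismatch.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, thresholds_met(identity))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the counting step.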

C. Non-human animal genomes, non-human animal cells, and non-human animals containing the humanized MYOC locus

[0146] Non-human animal genomes, non-human animal cells, and non-human animals containing the humanized MYOC locus described elsewhere in this specification are provided. The genome, cell, or non-human animal may express the humanized myocilin protein encoded by the humanized MYOC locus. The genome, cell, or non-human animal may be male or female. The genome, cell, or non-human animal may be heterozygous or homozygous with respect to the humanized MYOC locus. Diploid organisms have two alleles at each locus. Each pair of alleles represents the genotype of a particular locus. A genotype is described as homozygous if there are two identical alleles at a particular locus, and heterozygous if the two alleles are different. A non-human animal containing the humanized MYOC locus may have the humanized MYOC locus in its germline.
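The homozygous/heterozygous distinction described above can be expressed as a short sketch. The allele labels (`"humanized"`, `"wild_type"`) and the function name are illustrative assumptions, not terms defined in the specification.

```python
def zygosity(allele_1: str, allele_2: str, humanized: str = "humanized") -> str:
    """Classify a diploid genotype with respect to the humanized allele."""
    # Each comparison yields True/False; summing gives the copy number (0, 1, or 2).
    copies = (allele_1 == humanized) + (allele_2 == humanized)
    return ("absent", "heterozygous", "homozygous")[copies]


print(zygosity("humanized", "wild_type"))
print(zygosity("humanized", "humanized"))
```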

[0147] The non-human animal genomes or cells provided herein may be any non-human animal genome or cell containing, for example, a MYOC locus or a genomic locus homologous or orthologous to the human MYOC locus. The genomes may be from, or the cells may be, eukaryotic cells, including, for example, animal cells, mammalian cells, non-human mammalian cells, and human cells. The term “animal” includes any member of the animal kingdom, including, for example, mammals, fish, reptiles, amphibians, birds, and insects. Mammalian cells may include, for example, non-human mammalian cells, rodent cells, rat cells, or mouse cells. Other non-human mammals include, for example, non-human primates. The term “non-human” excludes humans.

[0148] Cells can also be in any kind of undifferentiated or differentiated state. For example, cells can be totipotent, pluripotent (e.g., human pluripotent cells, or non-human pluripotent cells such as mouse embryonic stem (ES) cells or rat ES cells), or non-pluripotent (e.g., non-ES cells). Totipotent cells include undifferentiated cells that can give rise to any cell type, and pluripotent cells include undifferentiated cells that have the ability to develop into two or more differentiated cell types. Such pluripotent and / or totipotent cells can be ES cells or ES-like cells, such as induced pluripotent stem (iPS) cells. ES cells include embryonic totipotent or pluripotent cells that can contribute to any tissue of the developing embryo upon introduction into the embryo. ES cells can originate from the inner cell mass of a blastocyst and can differentiate into cells of any of the three vertebrate germ layers (endoderm, ectoderm, and mesoderm).

[0149] The cells provided herein may also be germ cells (e.g., sperm or oocytes). Cells may be mitotically competent or mitotically inactive, and meiotically competent or meiotically inactive. Similarly, cells may also be primary somatic cells or cells that are not primary somatic cells. Somatic cells include any cell that is not a gamete, germ cell, gametocyte, or undifferentiated stem cell. For example, cells may be ocular cells such as trabecular meshwork cells.

[0150] Suitable cells provided herein also include primary cells. Primary cells include cells or cell cultures isolated directly from an organism, organ, or tissue. Primary cells include cells that have not been transformed or immortalized. They include any cells obtained from an organism, organ, or tissue that have not been previously passaged in tissue culture, or that have been previously passaged in tissue culture but cannot be passaged indefinitely in tissue culture. For example, primary cells may be ocular cells such as trabecular meshwork cells.

[0151] Other suitable cells provided herein include immortalized cells. Immortalized cells also include cells from multicellular organisms that do not normally proliferate indefinitely but, due to mutations or modifications, can evade normal cellular senescence and instead continue to divide. Such mutations or modifications may occur spontaneously or may be intentionally induced. Many types of immortalized cells are known. Immortalized cells or primary cells typically include cells used for culture or recombinant gene or protein expression.

[0152] The cells provided herein also include one-cell stage embryos (i.e., fertilized oocytes or zygotes). Such one-cell stage embryos may originate from any genetic background (e.g., BALB / c, C57BL / 6, 129, or a combination thereof in the case of mice), may be fresh or frozen, and may be derived from natural reproduction or in vitro fertilization.

[0153] The cells provided herein may be normal, healthy cells, or they may be diseased cells or cells carrying mutations.

[0154] Non-human animals containing the humanized MYOC locus described herein can be prepared by methods described elsewhere herein. The term “animal” includes any member of the animal kingdom, including, for example, mammals, fish, reptiles, amphibians, birds, and insects. In specific examples, a non-human animal is a non-human mammal. Non-human mammals include, for example, non-human primates and rodents (e.g., mice and rats). The term “non-human animal” excludes humans. Preferred non-human animals include, for example, rodents such as mice and rats.

[0155] Non-human animals may be from any genetic background. For example, preferred mice may be from the 129 strain, the C57BL / 6 strain, hybrids of 129 and C57BL / 6, the BALB / c strain, or the Swiss Webster strain. Examples of the 129 strain include 129P1, 129P2, 129P3, 129X1, 129S1 (e.g., 129S1 / SV, 129S1 / SvIm), 129S2, 129S4, 129S5, 129S9 / SvEvH, 129S6 (129 / SvEvTac), 129S7, 129S8, 129T1, and 129T2. See, for example, Festing et al. (1999) Mamm. Genome 10(8):836, which is incorporated herein by reference in its entirety for all purposes. Examples of C57BL strains include C57BL / A, C57BL / An, C57BL / GrFa, C57BL / KaLwN, C57BL / 6, C57BL / 6J, C57BL / 6ByJ, C57BL / 6NJ, C57BL / 10, C57BL / 10ScSn, C57BL / 10Cr, and C57BL / Ola. Preferred mice may also come from hybrids of the aforementioned 129 strain and the aforementioned C57BL / 6 strain (e.g., 50% 129 and 50% C57BL / 6). Similarly, preferred mice may come from a mix of the aforementioned 129 strains or a mix of the aforementioned BL / 6 strains (e.g., the 129S6 (129 / SvEvTac) strain).

[0156] Similarly, rats may be from, for example, the ACI rat strain, Dark Agouti (DA) rat strain, Wistar rat strain, LEA rat strain, Sprague Dawley (SD) rat strain, or a Fischer rat strain such as Fischer F344 or Fischer F6. Rats can also be obtained from strains derived from hybrids of two or more of the aforementioned strains. For example, a suitable rat may be from the DA strain or the ACI strain. The ACI rat strain is characterized by a black agouti coat with a white belly and feet and by an RT1av1 haplotype. Such strains are available from various sources, including Harlan Laboratories. The Dark Agouti (DA) rat strain is characterized by an agouti coat and an RT1av1 haplotype. Such rats are available from various sources, including Charles River and Harlan Laboratories. Some suitable rats may be derived from inbred rat strains. See, for example, U.S. Patent Application Publication 2014 / 0235933, which is incorporated herein by reference in its entirety for all purposes.

[0157] Some non-human animals or non-human animal cells described herein have one or more signs or symptoms of glaucoma. Glaucoma is a chronic optic neuropathy characterized by progressive loss of retinal ganglion cell (RGC) axons, resulting in irreversible vision loss. The primary risk factor for glaucoma is elevated intraocular pressure (IOP). Elevated IOP is caused by increased resistance to aqueous humor outflow through the trabecular meshwork (TM). Aqueous humor is produced by the ciliary body, circulates in the anterior chamber, and flows out through the TM. In most cases of glaucoma, there is increased resistance to aqueous humor outflow in the TM. Pathogenic MYOC mutant proteins aggregate within cells, leading to TM stress, elevated IOP, and glaucoma. Elevated levels of ER stress-inducible proteins have been detected in human glaucoma TM cells.

[0158] In some non-human animals, the IOP is at least about 15, at least about 16, at least about 17, at least about 18, at least about 19, at least about 20, at least about 21, or at least about 22 mmHg (e.g., at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, or at least 22 mmHg). In one example, the IOP is about 15–22, about 16–22, about 17–22, about 18–22, about 19–22, about 15–21, about 15–20, or about 16–21 mmHg (e.g., 15–22, 16–22, 17–22, 18–22, 19–22, 15–21, 15–20, or 16–21 mmHg).

[0159] In some non-human animals, IOP is increased compared to control non-human animals (e.g., non-human animals having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs targeting the humanized MYOC locus, or wild-type non-human animals). In one example, IOP is at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, or at least about 6 mmHg (e.g., at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 mmHg) higher than the control baseline in control non-human animals. In another example, IOP is about 1–7, about 2–7, about 3–7, about 4–7, about 5–7, about 1–6, about 2–6, about 3–6, about 4–6, or about 5–6 mmHg (e.g., 1–6, 2–6, 3–6, 4–6, or 5–6 mmHg) higher than the control baseline in control non-human animals. The control baseline may be, for example, an IOP in a non-human animal having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs, or an IOP in a wild-type non-human animal, or an IOP in a non-human animal having the humanized MYOC locus described herein before administration of one or more SAM guide RNAs.
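The comparison against a control baseline described above amounts to a difference of group means. The following sketch assumes the control baseline is the arithmetic mean of the control readings; the function names are hypothetical, and the readings are placeholder numbers invented for illustration, not data from this specification.

```python
from statistics import mean


def iop_elevation_mmhg(test_iops, control_iops):
    """Mean IOP of the test animals minus the control baseline (mean control IOP)."""
    if not test_iops or not control_iops:
        raise ValueError("both groups need at least one reading")
    return mean(test_iops) - mean(control_iops)


def meets_minimum_elevation(elevation_mmhg, minimum_mmhg):
    """True if the elevation is at least the stated minimum (in mmHg)."""
    return elevation_mmhg >= minimum_mmhg


# Placeholder readings in mmHg, invented for illustration only.
control = [14.0, 15.0, 16.0]
treated = [19.0, 20.0, 21.0]

elevation = iop_elevation_mmhg(treated, control)
print(elevation)
# Minimum elevations recited in the text: 1 to 6 mmHg.
print([m for m in (1, 2, 3, 4, 5, 6) if meets_minimum_elevation(elevation, m)])
```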

[0160] In some non-human animals, endoplasmic reticulum (ER) stress is increased compared to control non-human animals (e.g., non-human animals having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs, or wild-type non-human animals). Some non-human animals exhibit signs and symptoms of glaucoma as measured by pattern electroretinography (PERG). Some non-human animals show retinal ganglion cell (RGC) loss compared to control non-human animals (e.g., non-human animals having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs, or wild-type non-human animals). Some non-human animals show increased aqueous humor outflow resistance compared to control non-human animals (e.g., non-human animals having the humanized MYOC locus described herein but not administered with one or more SAM guide RNAs, or wild-type non-human animals).

III. Non-human animals containing synergistic activation mediator (SAM) expression cassettes

[0161] The non-human animal genomes, non-human animal cells, and non-human animals disclosed herein also include synergistic activation mediator (SAM) expression cassettes based on clustered regularly interspaced short palindromic repeats (CRISPR) / CRISPR-associated (Cas) systems, for use in methods of activating the transcription of target genes, such as the humanized MYOC gene disclosed herein, in vitro, ex vivo, or in vivo. The SAM system described herein comprises a chimeric Cas protein and a chimeric adapter protein and can be used together with guide RNA described elsewhere herein to activate the transcription of target genes, such as the humanized MYOC gene disclosed herein. The guide RNA can be encoded by an expression cassette integrated into the genome or can be provided by AAV or any other suitable means. Chimeric Cas proteins (e.g., chimeric Streptococcus pyogenes Cas9 protein, chimeric Campylobacter jejuni Cas9 protein, or chimeric Staphylococcus aureus Cas9 protein) and chimeric adapter proteins (e.g., adapter proteins that specifically bind to adapter-binding elements in guide RNA and include one or more heterologous transcriptional activation domains) are described in more detail elsewhere herein.

[0162] CRISPR / Cas systems include transcripts and other elements involved in the expression of, or directing the activity of, Cas genes. CRISPR / Cas systems may be, for example, type I, type II, type III, or type V systems (e.g., subtype V-A or subtype V-B). CRISPR / Cas systems used in the compositions and methods disclosed herein may be non-natural. A “non-natural” system includes any system that shows human involvement, such as one or more components of the system being modified or mutated from their natural state, being at least substantially free from at least one other component with which they are naturally associated, or being associated with at least one other component with which they are not naturally associated. For example, some CRISPR / Cas systems employ a non-natural CRISPR complex that includes a gRNA and a Cas protein that do not naturally occur together, or employ a non-natural Cas protein, or employ a non-natural gRNA.

[0163] The methods and compositions disclosed herein employ the CRISPR / Cas system by using or testing the ability of a CRISPR complex (containing a guide RNA (gRNA) complexed with a chimeric Cas protein and a chimeric adapter protein) to induce transcriptional activation of a target genomic locus in vivo.

[0164] The non-human animal genomes, non-human animal cells, and non-human animals disclosed herein include chimeric Cas protein expression cassettes and / or chimeric adapter protein expression cassettes. For example, the non-human animal genomes, non-human animal cells, and non-human animals disclosed herein may include synergistic activation mediator (SAM) expression cassettes containing chimeric Cas protein coding sequences and chimeric adapter protein coding sequences.

[0165] Such non-human animal genomes, non-human animal cells, or non-human animals containing SAM expression cassettes have the advantage of requiring only the delivery of guide RNA to induce transcriptional activation of target genomic loci. Some such non-human animal genomes, non-human animal cells, or non-human animals also include one or more guide RNA expression cassettes, such that all the components required for transcriptional activation of the target gene are already present. SAM systems can be used in such cells to provide increased expression of target genes in any desired manner. For example, the expression of one or more target genes may be increased in a constitutive manner or in a regulated manner (e.g., inducible, tissue-specific, temporally regulated, etc.).
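The delivery advantage described above can be sketched as a set difference between the components the SAM system needs and the components already encoded in the genome. The component labels and the function name are illustrative assumptions.

```python
# The three SAM components named in the text (labels are hypothetical).
REQUIRED_COMPONENTS = frozenset({"chimeric_cas", "chimeric_adapter", "guide_rna"})


def components_to_deliver(genomically_encoded):
    """Return the SAM components that still have to be delivered."""
    present = set(genomically_encoded)
    unknown = present - REQUIRED_COMPONENTS
    if unknown:
        raise ValueError(f"unknown component(s): {sorted(unknown)}")
    return sorted(REQUIRED_COMPONENTS - present)


# Animal carrying only the SAM expression cassette: guide RNA must be delivered.
print(components_to_deliver({"chimeric_cas", "chimeric_adapter"}))
# Animal that also carries a guide RNA expression cassette: nothing to deliver.
print(components_to_deliver(REQUIRED_COMPONENTS))
```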

A. Chimeric Cas protein

[0166] This specification provides chimeric Cas proteins that can bind to a guide RNA, as disclosed elsewhere herein, to activate the transcription of a target gene. Such a chimeric Cas protein may include (a) a DNA-binding domain that is a clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein or a functional fragment or variant thereof, capable of forming a complex with the guide RNA and binding to a target sequence, and (b) one or more transcriptional activation domains or functional fragments or variants thereof. For example, such a fusion protein may include one, two, three, four, five, or more transcriptional activation domains (e.g., two or more heterologous transcriptional activation domains or three or more heterologous transcriptional activation domains). In one example, the chimeric Cas protein may include a catalytically inactive Cas protein (e.g., dCas9) and a VP64 transcriptional activation domain or a functional fragment or variant thereof. For example, such a chimeric Cas protein can comprise, consist essentially of, or consist of an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the dCas9-VP64 chimeric Cas protein sequence described in SEQ ID NO: 43. However, chimeric Cas proteins are also provided in which the transcriptional activation domain is a different transcriptional activation domain or a functional fragment or variant thereof, and / or the Cas protein is a different Cas protein (e.g., a different catalytically inactive Cas protein). Examples of other suitable transcriptional activation domains are provided elsewhere herein.

[0167] The transcriptional activation domain can be located at the N-terminus, C-terminus, or any other location within the Cas protein. For example, the transcriptional activation domain can be attached to the Rec1 domain, Rec2 domain, HNH domain, or PI domain of the Streptococcus pyogenes Cas9 protein, or to any corresponding region of an orthologous Cas9 protein or a homologous or orthologous Cas protein when optimally aligned with the S. pyogenes Cas9 protein. For instance, the transcriptional activation domain can be attached to the Rec1 domain at position 553, the Rec1 domain at position 575, the Rec2 domain at any position within positions 175 to 306 (or replacing part or all of the region within positions 175 to 306), the HNH domain at any position within positions 715 to 901 (or replacing part or all of the region within positions 715 to 901), or the PI domain at position 1153 of the S. pyogenes Cas9 protein. See, for example, International Publication No. 2016 / 049258, which is incorporated herein by reference in its entirety for all purposes. The transcriptional activation domain may be flanked by one or more linkers on one or both sides, as described elsewhere herein.

[0168] Chimeric Cas proteins can also be operably linked or fused to additional heterologous polypeptides. The fused or linked heterologous polypeptides may be located at the N-terminus, C-terminus, or any other location within the chimeric Cas protein. For example, the chimeric Cas protein may further contain nuclear localization signals. Examples of suitable nuclear localization signals and other modifications for Cas proteins are described in further detail elsewhere in this specification.

(1) Cas protein

[0169] Cas proteins generally contain at least one RNA recognition or binding domain capable of interacting with guide RNA. Functional fragments or variants of Cas proteins are those that form a complex with guide RNA and retain the ability to bind to a target sequence of a target gene (e.g., to activate the transcription of the target gene).

[0170] In addition to the transcriptional activation domains described elsewhere in this specification, Cas proteins may also contain nuclease domains (e.g., DNase or RNase domains), DNA-binding domains, helicase domains, protein-protein interaction domains, dimerization domains, and other domains. Some such domains (e.g., DNase domains) may be derived from native Cas proteins. Modified Cas proteins can be constructed by adding other such domains. Nuclease domains have catalytic activity for nucleic acid cleavage, including the cleavage of covalent bonds in nucleic acid molecules. Cleavage can produce blunt ends or staggered ends and may be single-stranded or double-stranded. For example, wild-type Cas9 protein typically produces blunt-end cleavage products. Alternatively, wild-type Cpf1 protein (e.g., FnCpf1) may yield a cleavage product with a 5' overhang of 5 nucleotides, where the cleavage occurs after the 18th base pair from the PAM sequence on the non-target strand and after the 23rd base on the target strand. The Cas protein may have full cleavage activity and generate double-strand breaks (e.g., double-strand breaks with blunt ends) at the target genomic locus, or it may be a nickase that generates single-strand breaks at the target genomic locus.
In one example, the Cas protein portion of the chimeric Cas protein disclosed herein is modified to have reduced nuclease activity (for example, nuclease activity is reduced by at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or 100% compared to the wild-type Cas protein), or to be substantially devoid of all nuclease activity (i.e., nuclease activity is reduced by at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% compared to the wild-type Cas protein, or the protein has about 0%, about 1% or less, about 2% or less, about 3% or less, about 5% or less, or about 10% or less of the nuclease activity of the wild-type Cas protein). Nuclease-inactive Cas proteins are those that have mutations in their catalytic (i.e., nuclease) domain that are known to be inactivating mutations (e.g., mutations in the RuvC-like endonuclease domain of the Cpf1 protein, or mutations in both the HNH endonuclease domain and the RuvC-like endonuclease domain in Cas9), or Cas proteins that have nuclease activity reduced by at least about 97%, at least about 98%, at least about 99%, or 100% compared to wild-type Cas proteins. Examples of different Cas protein mutations for reducing or substantially eliminating nuclease activity are disclosed below.
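The staggered FnCpf1 cut described in this paragraph can be checked with simple arithmetic: cutting after position 18 on the non-target strand and after position 23 on the target strand leaves 23 - 18 = 5 unpaired nucleotides. The sketch below assumes both positions are counted from the PAM along the same coordinate; the function names are hypothetical, and the sketch computes only the overhang length, not its polarity.

```python
def overhang_length(non_target_cut: int, target_cut: int) -> int:
    """Number of unpaired nucleotides left by cuts at the two strand positions."""
    return abs(target_cut - non_target_cut)


def cut_type(non_target_cut: int, target_cut: int) -> str:
    """'blunt' when both strands are cut at the same position, else 'staggered'."""
    return "blunt" if overhang_length(non_target_cut, target_cut) == 0 else "staggered"


# FnCpf1 positions as stated in the text: 18 (non-target strand), 23 (target strand).
print(overhang_length(18, 23), cut_type(18, 23))
```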

[0171] Examples of Cas proteins include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas5e (CasD), Cas6, Cas6e, Cas6f, Cas7, Cas8a1, Cas8a2, Cas8b, Cas8c, Cas9 (Csn1 or Csx12), Cas10, Cas10d, CasF, CasG, CasH, Csy1, Csy2, Csy3, Cse1 (CasA), Cse2 (CasB), Cse3 (CasE), Cse4 (CasC), Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, and Cu1966, as well as their homologs or modified versions.

[0172] An exemplary Cas protein is the Cas9 protein or a protein derived from the Cas9 protein. The Cas9 protein is derived from the type II CRISPR / Cas system and typically shares four key motifs with a conserved architecture. Motifs 1, 2, and 4 are RuvC-like motifs, and motif 3 is an HNH motif. Exemplary Cas9 proteins are derived from Streptococcus pyogenes, Streptococcus thermophilus, Streptococcus sp., Staphylococcus aureus, Nocardiopsis dassonvillei, Streptomyces pristinaespiralis, Streptomyces viridochromogenes, Streptosporangium roseum, Alicyclobacillus acidocaldarius, Bacillus pseudomycoides, Bacillus selenitireducens, Exiguobacterium sibiricum, Lactobacillus delbrueckii, Lactobacillus salivarius, Microscilla marina, Burkholderiales bacterium, Polaromonas naphthalenivorans, Polaromonas sp., Crocosphaera watsonii, Cyanothece sp., Microcystis aeruginosa, Synechococcus sp., Acetohalobium arabaticum, Ammonifex degensii, Caldicellulosiruptor bescii, Candidatus Desulforudis, Clostridium botulinum, Clostridium difficile, Finegoldia magna, Natranaerobius thermophilus, Pelotomaculum thermopropionicum, Acidithiobacillus caldus, Acidithiobacillus ferrooxidans, Allochromatium vinosum, Marinobacter sp., Nitrosococcus halophilus, Nitrosococcus watsoni, Pseudoalteromonas haloplanktis, Ktedonobacter racemifer, Methanohalobium evestigatum, Anabaena variabilis, Nodularia spumigena, Nostoc sp., Arthrospira maxima, Arthrospira platensis, Arthrospira sp., Lyngbya sp., Microcoleus chthonoplastes, Oscillatoria sp., Petrotoga mobilis, Thermosipho africanus, Acaryochloris marina, Neisseria meningitidis, or Campylobacter jejuni. Additional examples of Cas9 family members are described in International Publication 2014 / 131833, which is incorporated herein by reference in its entirety for all purposes. Cas9 derived from S. pyogenes (SpCas9) (assigned SwissProt accession number Q99ZW2) is an exemplary Cas9 protein. Cas9 from S. aureus (SaCas9) (assigned UniProt accession number J7RUA5) is another exemplary Cas9 protein. Cas9 from Campylobacter jejuni (CjCas9) (assigned UniProt accession number Q0P897) is another exemplary Cas9 protein. See, for example, Kim et al. (2017) Nat. Commun. 8:14500, which is incorporated herein by reference in its entirety for all purposes. SaCas9 is smaller than SpCas9, and CjCas9 is smaller than both SaCas9 and SpCas9. Cas9 from Neisseria meningitidis (Nme2Cas9) is another exemplary Cas9 protein. See, for example, Edraki et al. (2019) Mol. Cell 73(4):714-726, which is incorporated herein by reference in its entirety for all purposes. Cas9 proteins derived from Streptococcus thermophilus (e.g., Streptococcus thermophilus LMD-9 Cas9 encoded by the CRISPR1 locus (St1Cas9) or Streptococcus thermophilus Cas9 derived from the CRISPR3 locus (St3Cas9)) are other exemplary Cas9 proteins. Cas9 derived from Francisella novicida (FnCas9) and the RHA Francisella novicida Cas9 variant (E1369R / E1449H / R1556A substitutions), which recognizes an alternative PAM, are other exemplary Cas9 proteins. These and other exemplary Cas9 proteins are outlined, for example, in Cebrian-Serrano and Davies (2017) Mamm. Genome 28(7):247-261, which is incorporated herein by reference in its entirety for all purposes.

[0173] Another example of a Cas protein is the Cpf1 (CRISPR from Prevotella and Francisella 1) protein. Cpf1 is a large protein (approximately 1300 amino acids) containing a RuvC-like nuclease domain homologous to the corresponding domain of Cas9, along with a counterpart to the characteristic arginine-rich cluster of Cas9. However, Cpf1 lacks the HNH nuclease domain present in the Cas9 protein, and the RuvC-like domain is contiguous in the Cpf1 sequence, in contrast to Cas9, where it contains a long insertion including the HNH domain. See, for example, Zetsche et al. (2015) Cell 163(3):759-771, which is incorporated herein by reference in its entirety for all purposes. Exemplary Cpf1 proteins are derived from Francisella tularensis 1, Francisella tularensis subsp. novicida, Prevotella albensis, Lachnospiraceae bacterium MC2017 1, Butyrivibrio proteoclasticus, Peregrinibacteria bacterium GW2011_GWA2_33_10, Parcubacteria bacterium GW2011_GWC2_44_17, Smithella sp. SCADC, Acidaminococcus sp. BV3L6, Lachnospiraceae bacterium MA2020, Candidatus Methanoplasma termitum, Eubacterium eligens, Moraxella bovoculi 237, Leptospira inadai, Lachnospiraceae bacterium ND2006, Porphyromonas crevioricanis 3, Prevotella disiens, and Porphyromonas macacae. Cpf1 (FnCpf1, assigned UniProt accession number A0Q7Q2) from Francisella novicida U112 is an exemplary Cpf1 protein.

[0174] The Cas protein may be a wild-type protein (i.e., one that occurs in nature), a modified Cas protein (i.e., a Cas protein variant), or a fragment of a wild-type or modified Cas protein. The Cas protein may also be an active variant or fragment with respect to the catalytic activity of a wild-type or modified Cas protein. An active variant or fragment with respect to catalytic activity may have at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or more sequence identity with a wild-type or modified Cas protein or a portion thereof, and the active variant retains the ability to cleave at a desired cleavage site and therefore retains nick-inducing or double-strand break-inducing activity. Assays for nick-inducing or double-strand break-inducing activity are known and generally measure the overall activity and specificity of the Cas protein on a DNA substrate containing a cleavage site.

[0175] An example of a modified Cas protein is the modified SpCas9-HF1 protein, which is a high-fidelity variant of Streptococcus pyogenes Cas9 with modifications (N497A / R661A / Q695A / Q926A) designed to reduce nonspecific DNA contact. See, for example, Kleinstiver et al. (2016) Nature 529(7587):490-495, the entire text of which is incorporated herein by reference for all purposes. Another example of a modified Cas protein is the modified eSpCas9 variant (K848A / K1003A / R1060A), designed to reduce off-target effects. See, for example, Slaymaker et al. (2016) Science 351(6268):84-88, the entire text of which is incorporated herein by reference for all purposes. Other SpCas9 variants include K855A and K810A / K1003A / R1060A. These and other modified Cas proteins are outlined, for example, in Cebrian-Serrano and Davies (2017) Mamm. Genome 28(7):247-261, the whole of which is incorporated herein by reference for all purposes. Another example of a modified Cas9 protein is xCas9, which is a SpCas9 variant capable of recognizing an expanded range of PAM sequences. See, for example, Hu et al. (2018) Nature 556:57-63, the whole of which is incorporated herein by reference for all purposes.

[0176] Cas proteins can be modified to increase or decrease one or more of the following: nucleic acid binding affinity, nucleic acid binding specificity, and enzymatic activity. Cas proteins can also be modified to alter other protein activities or properties, such as stability. For example, one or more nuclease domains of a Cas protein can be modified, deleted, or inactivated, or a Cas protein can be shortened to remove domains that are not essential for the protein's function, or the activity or properties of a Cas protein can be optimized (e.g., enhanced or reduced).

[0177] Cas proteins may contain at least one nuclease domain, such as a DNase domain. For example, the wild-type Cpf1 protein generally contains a RuvC-like domain that cleaves both strands of target DNA, possibly in a dimeric conformation. Cas proteins may also contain at least two nuclease domains, such as DNase domains. For example, the wild-type Cas9 protein generally contains a RuvC-like nuclease domain and an HNH-like nuclease domain. The RuvC domain and the HNH domain can each cleave a different strand of double-stranded DNA, thereby causing a double-strand break in the DNA. See, for example, Jinek et al. (2012) Science 337(6096):816-821, which is incorporated herein by reference in its entirety for all purposes.

[0178] One or more or all of the nuclease domains can be deleted or mutated so that they are no longer functional or have reduced nuclease activity. For example, if one of the nuclease domains in a Cas9 protein is deleted or mutated, the resulting Cas9 protein may be called a nickase and can produce a single-strand break in double-stranded target DNA but not a double-strand break (i.e., it can cleave the complementary strand or the non-complementary strand, but not both). If both nuclease domains are deleted or mutated, the resulting Cas protein (e.g., Cas9) will have reduced ability to cleave both strands of double-stranded DNA (e.g., a nuclease-null or nuclease-inactive Cas protein, or a catalytically dead Cas protein (dCas)). An example of a mutation that converts Cas9 into a nickase is the D10A mutation (aspartic acid to alanine at position 10 of Cas9) in the RuvC domain of Cas9 from S. pyogenes. Similarly, H839A (histidine to alanine at amino acid position 839), H840A (histidine to alanine at amino acid position 840), or N863A (asparagine to alanine at amino acid position 863) in the HNH domain of Cas9 from S. pyogenes can convert Cas9 into a nickase. Other examples of mutations that convert Cas9 into a nickase include the corresponding mutations in Cas9 from S. thermophilus. See, for example, Sapranauskas et al. (2011) Nucleic Acids Res. 39(21):9275-9282 and International Publication No. 2013 / 141680, each of which is incorporated herein by reference in its entirety for all purposes. Such mutations can be generated using methods such as site-directed mutagenesis, PCR-mediated mutagenesis, or whole-gene synthesis. Examples of other mutations that generate nickases can be found, for example, in International Publication No. 2013 / 176772 and International Publication No. 2013 / 142578, each of which is incorporated herein by reference in its entirety for all purposes. If all nuclease domains are deleted or mutated in a Cas protein (for example, both nuclease domains are deleted or mutated in a Cas9 protein), the resulting Cas protein (e.g., Cas9) has reduced ability to cleave both strands of double-stranded DNA (e.g., a nuclease-null or nuclease-inactive Cas protein). One specific example is the D10A / H840A S. pyogenes Cas9 double mutant, or the corresponding double mutant of Cas9 from another species when optimally aligned with S. pyogenes Cas9. Another specific example is the D10A / N863A S. pyogenes Cas9 double mutant, or the corresponding double mutant of Cas9 from another species when optimally aligned with S. pyogenes Cas9. An example of a catalytically inactive Cas9 protein (dCas9) comprises, consists essentially of, or consists of an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the dCas9 protein sequence described in SEQ ID NO: 44.
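The relationship between the substitutions named above and the resulting nuclease state can be illustrated with a short sketch. It assumes only the S. pyogenes Cas9 substitutions listed in this paragraph (D10A in the RuvC domain; H839A, H840A, and N863A in the HNH domain); the function and constant names are hypothetical and are not part of the disclosure.

```python
import re
from typing import Iterable, NamedTuple


class Substitution(NamedTuple):
    original: str     # one-letter code of the wild-type residue
    position: int     # 1-based residue position
    replacement: str  # one-letter code of the substituted residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'D10A' (aspartic acid at position 10 to alanine)."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return Substitution(original, int(position), replacement)


# Assumption: only the S. pyogenes Cas9 substitutions named in the text.
RUVC_INACTIVATING = {"D10A"}
HNH_INACTIVATING = {"H839A", "H840A", "N863A"}


def classify_spcas9(mutations: Iterable[str]) -> str:
    """Report the nuclease state implied by a set of substitution labels."""
    labels = {label.strip() for label in mutations}
    ruvc_inactive = bool(labels & RUVC_INACTIVATING)
    hnh_inactive = bool(labels & HNH_INACTIVATING)
    if ruvc_inactive and hnh_inactive:
        return "nuclease-inactive (dCas9)"
    if ruvc_inactive or hnh_inactive:
        return "nickase"
    return "nuclease-active"
```

For example, `classify_spcas9(["D10A", "H840A"])` reports a nuclease-inactive protein, while either substitution alone reports a nickase. Orthologs would need their own lookup sets after alignment to S. pyogenes Cas9.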

[0179] Examples of inactivating mutations in the catalytic domains of xCas9 are the same as those described above for SpCas9. Examples of inactivating mutations in the catalytic domains of Staphylococcus aureus Cas9 protein are also known. For example, the Cas9 enzyme of Staphylococcus aureus (SaCas9) can be made nuclease-inactive by a substitution at position N580 (e.g., an N580A substitution) and a substitution at position D10 (e.g., a D10A substitution). See, for example, International Publication 2016 / 106236, which is incorporated herein by reference in its entirety for all purposes. Examples of inactivating mutations in the catalytic domains of Nme2Cas9 are also known (e.g., the combination of D16A and H588A). Examples of inactivating mutations in the catalytic domains of St1Cas9 are also known (e.g., the combination of D9A, D598A, H599A, and N622A). Examples of inactivating mutations in the catalytic domains of St3Cas9 are also known (e.g., the combination of D10A and N870A). Examples of inactivating mutations in the catalytic domains of CjCas9 are also known (e.g., the combination of D8A and H559A). Examples of inactivating mutations in the catalytic domains of FnCas9 and RHA FnCas9 are also known (e.g., N995A).

[0180] Examples of inactivating mutations in the catalytic domains of Cpf1 proteins are also known. For Cpf1 proteins derived from Francisella novicida U112 (FnCpf1), Acidaminococcus sp. BV3L6 (AsCpf1), Lachnospiraceae bacterium ND2006 (LbCpf1), and Moraxella bovoculi 237 (MbCpf1), such mutations may include mutations at positions 908, 993, or 1263 in AsCpf1 or the corresponding positions in a Cpf1 ortholog, or at positions 832, 925, 947, or 1180 in LbCpf1 or the corresponding positions in a Cpf1 ortholog. Such mutations may include, for example, one or more of the mutations D908A, E993A, and D1263A of AsCpf1 or the corresponding mutations in a Cpf1 ortholog, or one or more of the mutations D832A, E925A, D947A, and D1180A of LbCpf1 or the corresponding mutations in a Cpf1 ortholog. See, for example, U.S. Patent Application Publication 2016 / 0208243, which is incorporated herein by reference in its entirety for all purposes.

[0181] Cas proteins can also be operably linked to heterologous polypeptides as fusion proteins. For example, in addition to transcriptional activation domains, Cas proteins can be fused to cleavage domains or epigenetic modification domains. See International Publication 2014 / 089290, which is incorporated herein by reference in its entirety for all purposes. Cas proteins can also be fused to heterologous polypeptides that provide increased or decreased stability. The fusion domain or heterologous polypeptide may be located at the N-terminus, at the C-terminus, or internally within the Cas protein.

[0182] As an example, a Cas protein may be fused to one or more heterologous polypeptides that provide subcellular localization. Such heterologous polypeptides may include, for example, one or more nuclear localization signals (NLS), such as a monopartite SV40 NLS and / or a bipartite alpha-importin NLS for targeting the nucleus, a mitochondrial localization signal for targeting mitochondria, an ER retention signal, and so on. See, for example, Lange et al. (2007) J. Biol. Chem. 282(8):5101-5105, which is incorporated herein by reference in its entirety for all purposes. Such subcellular localization signals may be located at the N-terminus, at the C-terminus, or anywhere within the Cas protein. The NLS may comprise a stretch of basic amino acids and may be a monopartite or bipartite sequence. Optionally, Cas proteins may contain two or more NLSs, including an NLS at the N-terminus (e.g., an alpha-importin NLS or a monopartite NLS) and an NLS at the C-terminus (e.g., an SV40 NLS or a bipartite NLS). Cas proteins may also contain two or more NLSs at the N-terminus and / or two or more NLSs at the C-terminus.

[0183] Cas proteins can also be operably linked to a cell-penetrating domain or a protein transduction domain. For example, the cell-penetrating domain may be derived from the HIV-1 TAT protein, the TLM cell-penetrating motif from human hepatitis B virus, MPG, Pep-1, VP22, a cell-penetrating peptide from herpes simplex virus, or a polyarginine peptide sequence. See, for example, International Publication 2014 / 089290 and International Publication 2013 / 176772, each of which is incorporated herein by reference in its entirety for all purposes. The cell-penetrating domain can be located at the N-terminus, at the C-terminus, or anywhere within the Cas protein.

[0184] Cas proteins can also be operably linked to heterologous polypeptides such as fluorescent proteins, purification tags, or epitope tags to facilitate tracking or purification. Examples of fluorescent proteins include green fluorescent proteins (e.g., GFP, GFP-2, tagGFP, turboGFP, eGFP, Emerald, Azami Green, Monomeric Azami Green, CopGFP, AceGFP, ZsGreen1), yellow fluorescent proteins (e.g., YFP, eYFP, Citrine, Venus, YPet, PhiYFP, ZsYellow1), blue fluorescent proteins (e.g., eBFP, eBFP2, Azurite, mKalama1, GFPuv, Sapphire, T-sapphire), cyan fluorescent proteins (e.g., eCFP, Cerulean, CyPet, AmCyan1, Midoriishi-Cyan), red fluorescent proteins (e.g., mKate, mKate2, mPlum, DsRed monomer, mCherry, mRFP1, DsRed-Express, DsRed2, DsRed-Monomer, HcRed-Tandem, HcRed1, AsRed2, eqFP611, mRaspberry, mStrawberry, Jred), orange fluorescent proteins (e.g., mOrange, mKO, Kusabira-Orange, Monomeric Kusabira-Orange, mTangerine, tdTomato), and any other suitable fluorescent protein. Examples of tags include glutathione-S-transferase (GST), chitin-binding protein (CBP), maltose-binding protein, thioredoxin (TRX), poly(NANP), tandem affinity purification (TAP) tag, myc, AcV5, AU1, AU5, E, ECS, E2, FLAG, hemagglutinin (HA), nus, Softag 1, Softag 3, Strep, SBP, Glu-Glu, HSV, KT3, S, S1, T7, V5, VSV-G, histidine (His), biotin carboxyl carrier protein (BCCP), and calmodulin.

[0185] Cas proteins can be tethered to labeled nucleic acids. Such tethering (i.e., physical linking) can be achieved through covalent or non-covalent interactions, and the tethering can be direct (e.g., through direct fusion or chemical conjugation, which can be achieved by modification of cysteine or lysine residues on the protein or by intein modification) or through one or more intervening linker or adapter molecules such as streptavidin or aptamers. See, for example, Pierce et al. (2005) Mini Rev. Med. Chem. 5(1):41-55, Duckworth et al. (2007) Angew. Chem. Int. Ed. Engl. 46(46):8819-8822, Schaeffer and Dixon (2009) Australian J. Chem. 62(10):1328-1332, Goodman et al. (2009) Chembiochem. 10(9):1551-1557, and Khatwani et al. (2012) Bioorg. Med. Chem. 20(14):4532-4539, each of which is incorporated herein by reference in its entirety for all purposes. Non-covalent strategies for synthesizing protein-nucleic acid conjugates include the biotin-streptavidin and nickel-histidine methods. Covalent protein-nucleic acid conjugates can be synthesized by linking appropriately functionalized nucleic acids and proteins using a variety of chemistries. Some of these chemistries involve direct attachment of the oligonucleotide to an amino acid residue on the protein surface (e.g., a lysine amine or a cysteine thiol), while other, more complex schemes require post-translational modification of the protein or the involvement of a catalytic or reactive protein domain. Methods for covalently attaching proteins to nucleic acids may include, for example, chemical crosslinking of oligonucleotides to protein lysine or cysteine residues, expressed protein ligation, chemoenzymatic methods, and the use of photoaptamers. Labeled nucleic acids can be tethered to the C-terminus, the N-terminus, or an internal region of a Cas protein. In one example, the labeled nucleic acid is tethered to the C-terminus or N-terminus of the Cas protein.
Similarly, the Cas protein can be anchored to the 5' end, 3' end, or internal region of the labeled nucleic acid. That is, the labeled nucleic acid can be anchored in any direction and polarity. For example, the Cas protein can be anchored to the 5' end or 3' end of the labeled nucleic acid.

[0186] (2) Transcriptional activation domain
The chimeric Cas proteins disclosed herein may contain one or more transcriptional activation domains. Transcriptional activation domains include regions of naturally occurring transcription factors that, together with a DNA-binding domain (e.g., a catalytically inactive Cas protein complexed with a guide RNA), can activate transcription from a promoter by direct contact with the transcriptional machinery or via other proteins such as coactivators. Transcriptional activation domains also include functional fragments or variants of such regions of transcription factors, as well as engineered transcriptional activation domains that are derived from naturally occurring transcriptional activation domains or that are artificially generated or synthesized to activate the transcription of target genes. A functional fragment is a fragment that can activate the transcription of a target gene when operably linked to a suitable DNA-binding domain. A functional variant is a variant that can activate the transcription of a target gene when operably linked to a suitable DNA-binding domain.

[0187] Specific transcriptional activation domains for use in the chimeric Cas proteins disclosed herein include the VP64 transcriptional activation domain or a functional fragment or variant thereof. VP64 is a tetrameric repeat of the minimal activation domain from the herpes simplex VP16 activation domain. For example, the transcriptional activation domain includes, essentially consists of, or can consist of, an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the VP64 transcriptional activation domain protein sequence described in SEQ ID NO: 45.
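The "at least 85% ... or 100% identical" thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been optimally aligned (with "-" marking gaps) and defines percent identity as identical columns divided by aligned columns; other definitions exist, and this one is an assumption for illustration. The function names are hypothetical, and the test strings are toy sequences, not SEQ ID NO: 45.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = identical columns / compared columns, where a
    column that is a gap ('-') in both sequences is not compared and a
    gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" and res_b == "-":
            continue  # column carries no information
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 85.0) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With this definition, a ten-residue toy pair differing at one position is 90% identical, so it satisfies the 85% and 90% thresholds but not the 91% threshold.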

[0188] Other examples of transcriptional activation domains include the herpes simplex virus VP16 transactivation domain, VP64 (a quadruple tandem repeat of herpes simplex virus VP16), the NF-κB p65 (NF-κB transactivation subunit p65) activation domain, the MyoD1 transactivation domain, the HSF1 transactivation domain (a transactivation domain from human heat shock factor 1), RTA (the Epstein-Barr virus R transactivation domain), the SET7 / 9 transactivation domain, p53 activation domain 1, p53 activation domain 2, the CREB (cAMP response element binding protein) activation domain, the E2A activation domain, the NFAT (nuclear factor of activated T cells) activation domain, and their functional fragments and variants. See, for example, U.S. Patent Application Publication 2016 / 0298125, U.S. Patent Application Publication 2016 / 0281072, and International Publication 2016 / 049258, each of which is incorporated herein by reference in its entirety for all purposes. Other examples of transcriptional activation domains include Gcn4, MLL, Rtg3, Gln3, Oaf1, Pip2, Pdr1, Pdr3, Pho4, Leu3, and their functional fragments and variants. See, for example, U.S. Patent Application Publication 2016 / 0298125, which is incorporated herein by reference in its entirety for all purposes. Still other examples of transcriptional activation domains include Sp1, Vax, GATA4, and their functional fragments and variants. See, for example, International Publication 2016 / 149484, which is incorporated herein by reference in its entirety for all purposes. Other examples include the activation domains from Oct1, Oct-2A, AP-2, CTF1, P300, CBP, PCAF, SRC1, PvALF, ERF-2, OsGAI, HALF-1, C1, AP1, ARF-5, ARF-6, ARF-7, ARF-8, CPRF1, CPRF4, MYC-RP / GP, and TRAB1PC4, as well as their functional fragments and variants. See, for example, U.S. Patent Application Publication 2016 / 0237456, European Patent No. 3045537, and International Publication 2011 / 146121, each of which is incorporated herein by reference in its entirety for all purposes. Other suitable transcriptional activation domains are also known. See, for example, International Publication No. 2011 / 146121, which is incorporated herein by reference in its entirety for all purposes.

[0189] B. Chimeric Adapter Protein
Chimeric adapter proteins that can bind to the guide RNAs disclosed elsewhere in this specification are also provided. The chimeric adapter proteins disclosed herein are useful in dCas-synergistic activation mediator (SAM)-like systems to increase the number and diversity of transcriptional activation domains directed to target sequences within target genes in order to activate the transcription of those target genes. Nucleic acids encoding chimeric adapter proteins can be incorporated into the genome of cells or non-human animals (e.g., cells or non-human animals containing a chimeric Cas protein expression cassette incorporated into the genome) as disclosed elsewhere in this specification, or the chimeric adapter proteins or nucleic acids can be introduced into such cells and non-human animals using methods disclosed elsewhere in this specification (e.g., LNP-mediated delivery or AAV-mediated delivery).

[0190] Such a chimeric adapter protein comprises (a) an adapter (i.e., an adapter domain or adapter protein) that specifically binds to an adapter-binding element in the guide RNA, and (b) one or more heterologous transcriptional activation domains. For example, such a fusion protein may contain one, two, three, four, five, or more transcriptional activation domains (e.g., two or more heterologous transcriptional activation domains or three or more heterologous transcriptional activation domains). In one example, such a chimeric adapter protein may contain (a) an adapter (i.e., an adapter domain or adapter protein) that specifically binds to an adapter-binding element in the guide RNA, and (b) two or more heterologous transcriptional activation domains. For example, the chimeric adapter protein may contain (a) an MS2 coat protein adapter that specifically binds to one or more MS2 aptamers in the guide RNA (e.g., two MS2 aptamers at separate positions in the guide RNA), and (b) one or more (e.g., two or more) transcriptional activation domains. For example, the two transcriptional activation domains may be the p65 and HSF1 transcriptional activation domains or functional fragments or variants thereof. However, chimeric adapter proteins are also provided in which the transcriptional activation domains include other transcriptional activation domains or functional fragments or variants thereof.

[0191] One or more transcriptional activation domains may be directly fused to the adapter. Alternatively, one or more transcriptional activation domains may be linked to the adapter via a linker or a combination of linkers, or via one or more additional domains. Similarly, if two or more transcriptional activation domains are present, they may be directly fused to each other, or linked to each other via a linker or a combination of linkers, or via one or more additional domains. Linkers that may be used in these fusion proteins may include any sequence that does not interfere with the function of the fusion protein. Exemplary linkers are short (e.g., 2-20 amino acids) and typically flexible (e.g., including highly flexible amino acids such as glycine, alanine, and serine). Some specific examples of linkers include one or more units consisting of GGGS (SEQ ID NO: 46) or GGGGS (SEQ ID NO: 47), for example, any combination containing two, three, four, or more repeats of GGGS (SEQ ID NO: 46) or GGGGS (SEQ ID NO: 47). Other linker sequences may also be used.
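The repeat-unit linkers described above can be sketched as simple string construction. The sketch assumes only the two units named in this paragraph (GGGS and GGGGS); the function names and the placeholder domain strings in the usage note are hypothetical.

```python
from typing import Sequence

# Linker units named in the text (SEQ ID NO: 46 and SEQ ID NO: 47).
LINKER_UNITS = ("GGGS", "GGGGS")


def build_linker(unit: str, repeats: int) -> str:
    """Return a linker made of `repeats` tandem copies of one unit."""
    if unit not in LINKER_UNITS:
        raise ValueError(f"unknown linker unit: {unit!r}")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


def join_with_linker(domains: Sequence[str], linker: str) -> str:
    """Concatenate domain sequences N- to C-terminus, separated by the linker."""
    return linker.join(domains)
```

For instance, `build_linker("GGGGS", 3)` gives a 15-residue linker, and `join_with_linker(["AAA", "BBB"], "GGGS")` places one GGGS unit between two placeholder domain strings. Passing an empty linker corresponds to direct fusion.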

[0192] One or more transcriptional activation domains and adapters can be in any order within the chimeric adapter protein. One option is that one or more transcriptional activation domains may be at the C-terminus of the adapter, and the adapter may be at the N-terminus of one or more transcriptional activation domains. For example, one or more transcriptional activation domains may be at the C-terminus of the chimeric adapter protein, and the adapter may be at the N-terminus of the chimeric adapter protein. However, one or more transcriptional activation domains may be at the C-terminus of the adapter even if they are not at the C-terminus of the chimeric adapter protein (e.g., if the nuclear localization signal is at the C-terminus of the chimeric adapter protein). Similarly, the adapter may be at the N-terminus of one or more transcriptional activation domains even if they are not at the N-terminus of the chimeric adapter protein (e.g., if the nuclear localization signal is at the N-terminus of the chimeric adapter protein). Another option is that one or more transcriptional activation domains may be at the N-terminus of the adapter, and the adapter may be at the C-terminus of one or more transcriptional activation domains. For example, one or more transcriptional activation domains may be at the N-terminus of the chimeric adapter protein, and the adapter may be at the C-terminus of the chimeric adapter protein. Another option is that if the chimeric adapter protein contains two or more transcriptional activation domains, these two or more domains may be adjacent to the adapter.

[0193] Chimeric adapter proteins can also be operably linked or fused to additional heterologous polypeptides. The fused or linked heterologous polypeptides can be located at the N-terminus, at the C-terminus, or at any other location within the chimeric adapter protein. For example, the chimeric adapter protein may further contain a nuclear localization signal. A specific example of such a protein comprises an MS2 coat protein (MCP) as the adapter, a p65 transcriptional activation domain linked (either directly or via an NLS) C-terminal to the MCP, and an HSF1 transcriptional activation domain C-terminal to the p65 transcriptional activation domain. Such a protein may contain, from N-terminus to C-terminus, the MCP, the nuclear localization signal, the p65 transcriptional activation domain, and the HSF1 transcriptional activation domain. For example, the chimeric adapter protein can comprise, consist essentially of, or consist of an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the MCP-p65-HSF1 chimeric adapter protein sequence described in SEQ ID NO: 48.
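The domain orders discussed above (activation domains C-terminal to the adapter, N-terminal to it, or on both sides) can be sketched as a check over an ordered list of domain names. The list representation, function name, and return strings are hypothetical and serve only to illustrate the ordering logic.

```python
from typing import Collection, Sequence


def describe_layout(domains: Sequence[str], adapter: str,
                    activation_domains: Collection[str]) -> str:
    """Describe where the activation domains sit relative to the adapter.

    `domains` lists domain names from N-terminus to C-terminus.
    """
    adapter_index = list(domains).index(adapter)  # ValueError if absent
    positions = [i for i, name in enumerate(domains)
                 if name in activation_domains]
    if not positions:
        raise ValueError("no transcriptional activation domain in layout")
    if all(i > adapter_index for i in positions):
        return "activation domains C-terminal to adapter"
    if all(i < adapter_index for i in positions):
        return "activation domains N-terminal to adapter"
    return "activation domains on both sides of adapter"


# The specific example from the text: MCP, NLS, p65, HSF1 (N- to C-terminus).
MCP_P65_HSF1 = ["MCP", "NLS", "p65", "HSF1"]
```

Calling `describe_layout(MCP_P65_HSF1, "MCP", {"p65", "HSF1"})` reports that both activation domains are C-terminal to the adapter even though an NLS sits between the adapter and p65.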

[0194] Chimeric adapter proteins can also be fused or linked to one or more heterologous polypeptides that provide subcellular localization. Such heterologous polypeptides may include, for example, one or more nuclear localization signals (NLS), such as an SV40 NLS and / or an alpha-importin NLS for targeting the nucleus, a mitochondrial localization signal for targeting mitochondria, an ER retention signal, and the like. See, for example, Lange et al. (2007) J. Biol. Chem. 282:5101-5105, which is incorporated herein by reference in its entirety for all purposes. An NLS may, for example, comprise a stretch of basic amino acids and may be a monopartite or bipartite sequence. Optionally, a chimeric adapter protein may contain two or more NLSs, including an NLS (e.g., an alpha-importin NLS) at the N-terminus and / or an NLS (e.g., an SV40 NLS) at the C-terminus.

[0195] Chimeric adapter proteins can also be operably linked to a cell-penetrating domain or a protein transduction domain. For example, the cell-penetrating domain may be derived from the HIV-1 TAT protein, the TLM cell-penetrating motif from human hepatitis B virus, MPG, Pep-1, VP22, a cell-penetrating peptide from herpes simplex virus, or a polyarginine peptide sequence. See, for example, International Publication 2014 / 089290 and International Publication 2013 / 176772, each of which is incorporated herein by reference in its entirety for all purposes. As another example, chimeric adapter proteins can be fused or linked to heterologous polypeptides that provide increased or decreased stability.

[0196] Chimeric adapter proteins can also be operably linked to heterologous polypeptides such as fluorescent proteins, purification tags, or epitope tags to facilitate tracking or purification. Examples of fluorescent proteins include green fluorescent proteins (e.g., GFP, GFP-2, tagGFP, turboGFP, eGFP, Emerald, Azami Green, Monomeric Azami Green, CopGFP, AceGFP, ZsGreen1), yellow fluorescent proteins (e.g., YFP, eYFP, Citrine, Venus, YPet, PhiYFP, ZsYellow1), blue fluorescent proteins (e.g., eBFP, eBFP2, Azurite, mKalama1, GFPuv, Sapphire, T-sapphire), cyan fluorescent proteins (e.g., eCFP, Cerulean, CyPet, AmCyan1, Midoriishi-Cyan), red fluorescent proteins (e.g., mKate, mKate2, mPlum, DsRed monomer, mCherry, mRFP1, DsRed-Express, DsRed2, DsRed-Monomer, HcRed-Tandem, HcRed1, AsRed2, eqFP611, mRaspberry, mStrawberry, Jred), orange fluorescent proteins (e.g., mOrange, mKO, Kusabira-Orange, Monomeric Kusabira-Orange, mTangerine, tdTomato), and any other suitable fluorescent protein. Examples of tags include glutathione-S-transferase (GST), chitin-binding protein (CBP), maltose-binding protein, thioredoxin (TRX), poly(NANP), tandem affinity purification (TAP) tag, myc, AcV5, AU1, AU5, E, ECS, E2, FLAG, hemagglutinin (HA), nus, Softag 1, Softag 3, Strep, SBP, Glu-Glu, HSV, KT3, S, S1, T7, V5, VSV-G, histidine (His), biotin carboxyl carrier protein (BCCP), and calmodulin.

[0197] Chimeric adapter proteins can be tethered to labeled nucleic acids. Such tethering (i.e., physical linking) can be achieved via covalent or non-covalent interactions, and the tethering can be direct (e.g., via direct fusion or chemical conjugation, which can be achieved by modification of cysteine or lysine residues on the protein or by intein modification) or via one or more intervening linker or adapter molecules such as streptavidin or aptamers. See, for example, Pierce et al. (2005) Mini Rev. Med. Chem. 5(1):41-55, Duckworth et al. (2007) Angew. Chem. Int. Ed. Engl. 46(46):8819-8822, Schaeffer and Dixon (2009) Australian J. Chem. 62(10):1328-1332, Goodman et al. (2009) Chembiochem. 10(9):1551-1557, and Khatwani et al. (2012) Bioorg. Med. Chem. 20(14):4532-4539, each of which is incorporated herein by reference in its entirety for all purposes. Non-covalent strategies for synthesizing protein-nucleic acid conjugates include the biotin-streptavidin and nickel-histidine methods. Covalent protein-nucleic acid conjugates can be synthesized by linking appropriately functionalized nucleic acids and proteins using a variety of chemistries. Some of these chemistries involve direct attachment of the oligonucleotide to an amino acid residue on the protein surface (e.g., a lysine amine or a cysteine thiol), while other, more complex schemes require post-translational modification of the protein or the involvement of a catalytic or reactive protein domain. Methods for covalently attaching proteins to nucleic acids may include, for example, chemical crosslinking of oligonucleotides to protein lysine or cysteine residues, expressed protein ligation, chemoenzymatic methods, and the use of photoaptamers. Labeled nucleic acids can be tethered to the C-terminus, the N-terminus, or an internal region of a chimeric adapter protein. Similarly, chimeric adapter proteins can be tethered to the 5' end, the 3' end, or an internal region of a labeled nucleic acid. In other words, labeled nucleic acids can be tethered in any orientation and polarity.

[0198] (1) Adapter protein or adapter domain
An adapter (i.e., an adapter domain or adapter protein) is a nucleic acid-binding domain (e.g., a DNA-binding domain and / or an RNA-binding domain) that specifically recognizes and binds to a distinct sequence (e.g., a distinct DNA and / or RNA sequence, such as an aptamer) in a sequence-specific manner. Aptamers are nucleic acids that can bind to target molecules with high affinity and specificity through their ability to adopt specific three-dimensional conformations. Such adapters can bind, for example, to specific RNA sequences and secondary structures. These sequences (i.e., adapter-binding elements) can be engineered into the guide RNA. For example, an MS2 aptamer can be engineered into the guide RNA so that the guide RNA is specifically bound by the MS2 coat protein (MCP). For example, an adapter can comprise, consist essentially of, or consist of an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the MCP sequence described in SEQ ID NO: 49.

[0199] Some specific examples of adapters and targets include the RNA-binding protein / aptamer combinations that exist within the diversity of bacteriophage coat proteins. For example, the following adapter proteins or their functional fragments or variants may be used: MS2 coat protein (MCP), PP7, Qβ, F2, GA, fr, JP501, M12, R17, BZ13, JP34, JP500, KU1, M11, MX1, TW18, VK, SP, FI, ID2, NL95, TW19, AP205, φCb5, φCb8r, φCb12r, φCb23r, 7s, and PRR1. See, for example, International Publication No. 2016 / 049258, which is incorporated herein by reference in its entirety for all purposes. Functional fragments or variants of adapter proteins retain the ability to bind to specific adapter-binding elements (e.g., the ability to bind to specific adapter-binding sequences in a sequence-specific manner). For example, a PP7 Pseudomonas bacteriophage coat protein variant can be used in which amino acids 68-69 are mutated to SG and amino acids 70-75 are deleted relative to the wild-type protein. See, for example, Wu et al. (2012) Biophys. J. 102(12):2936-2944 and Chao et al. (2007) Nat. Struct. Mol. Biol. 15(1):103-105, each of which is incorporated herein by reference in its entirety for all purposes. Similarly, MCP variants such as the N55K mutant can also be used. See, for example, Spingola and Peabody (1994) J. Biol. Chem. 269(12):9006-9010, which is incorporated herein by reference in its entirety for all purposes.

[0200] Other examples of adapter proteins that can be used include the endoribonuclease Csy4 or the lambda N protein, in whole or in part (e.g., in DNA-bound form). See, for example, U.S. Patent Application Publication 2016 / 0312198, which is incorporated herein by reference in its entirety for all purposes.

[0201] (2) Transcriptional activation domain
The chimeric adapter proteins disclosed herein comprise one or more transcriptional activation domains. Such transcriptional activation domains may be naturally occurring transcriptional activation domains, functional fragments or functional variants of naturally occurring transcriptional activation domains, or engineered or synthetic transcriptional activation domains. The transcriptional activation domains that can be used include those described elsewhere herein for use in chimeric Cas proteins.

[0202] Specific transcriptional activation domains for use in the chimeric adapter proteins disclosed herein include the p65 and / or HSF1 transcriptional activation domain or functional fragments or variants thereof. The HSF1 transcriptional activation domain may be the transcriptional activation domain of human heat shock factor 1 (HSF1). The p65 transcriptional activation domain may be the transcriptional activation domain of transcription factor p65, also known as the nuclear factor NF-kappa-B p65 subunit encoded by the RELA gene. As an example, the transcriptional activation domain may contain, essentially consist of, or be composed of an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the p65 transcriptional activation domain protein sequence described in Sequence ID No. 50. As another example, the transcriptional activation domain contains, essentially consists of, or may consist of, an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the HSF1 transcriptional activation domain protein sequence described in Sequence ID No. 51.

[0203] C. SAM guide RNAs and guide RNA arrays
Also provided are guide RNAs and guide RNA arrays that can bind to the chimeric Cas proteins and chimeric adapter proteins disclosed elsewhere herein to activate the transcription of target genes. The nucleic acids encoding the guide RNAs can be incorporated into the genome of non-human animal cells or non-human animals (e.g., SAM-compatible cells or non-human animals) as disclosed elsewhere herein, or the guide RNAs or nucleic acids can be introduced into such non-human animal cells and non-human animals using methods disclosed elsewhere herein (e.g., LNP-mediated delivery or AAV-mediated delivery). The delivery method may be selected to provide tissue-specific delivery of the recombinase disclosed elsewhere herein.

[0204] A nucleic acid encoding a guide RNA or guide RNA array may encode one or more guide RNAs (or, if the guide RNAs are introduced into non-human animal cells or non-human animals directly, one or more guide RNAs may be introduced). For example, two or more, three or more, four or more, or five or more guide RNAs may be encoded or introduced. Each guide RNA coding sequence may be operably linked to the same promoter (e.g., a U6 promoter) or to a different promoter (e.g., each guide RNA coding sequence may be operably linked to its own U6 promoter). Two or more of the guide RNAs may target different target sequences in a single target gene. For example, two or more, three or more, four or more, or five or more guide RNAs may each target different target sequences in a single target gene. Alternatively, the guide RNAs may target multiple target genes (e.g., two or more, three or more, four or more, or five or more target genes). Examples of guide RNA target sequences are disclosed elsewhere in this specification.

[0205] (1) Guide RNA

A "guide RNA" or "gRNA" is an RNA molecule that binds to a Cas protein (e.g., a Cas9 protein) and targets the Cas protein to a specific location within a target DNA. A guide RNA may contain two segments: a "DNA targeting segment" and a "protein-binding segment." A "segment" is a section or region of a molecule, such as a contiguous stretch of nucleotides in an RNA. Some gRNAs, such as those for Cas9, may contain two separate RNA molecules: an "activator RNA" (e.g., tracrRNA) and a "targeter RNA" (e.g., CRISPR RNA or crRNA). Other gRNAs are single RNA molecules (single RNA polynucleotides), which may also be called "single-molecule gRNAs," "single-guide RNAs," or "sgRNAs." See, for example, International Publication Nos. 2013/176772, 2014/065596, 2014/089290, 2014/093622, 2014/099750, 2013/142578, and 2014/131833, each of which is incorporated herein by reference in its entirety for all purposes. A guide RNA can refer to either a CRISPR RNA (crRNA) alone or a combination of a crRNA and a trans-activating CRISPR RNA (tracrRNA). The crRNA and tracrRNA can be associated as a single RNA molecule (single guide RNA or sgRNA) or as two separate RNA molecules (dual guide RNA or dgRNA). For example, in the case of Cas9, a single guide RNA may include a crRNA fused to a tracrRNA (e.g., via a linker). In the case of Cpf1, only a crRNA is required to achieve binding to the target sequence. The terms "guide RNA" and "gRNA" include both double-molecule (i.e., modular) gRNAs and single-molecule gRNAs. In some of the methods and compositions disclosed herein, the gRNA is an S. pyogenes Cas9 gRNA or an equivalent thereof.

[0206] Exemplary two-molecule gRNAs include a crRNA-like molecule ("CRISPR RNA," "targeter RNA," "crRNA," or "crRNA repeat") and a corresponding tracrRNA-like molecule ("trans-activating CRISPR RNA," "activator RNA," or "tracrRNA"). The crRNA includes both the DNA targeting segment (single-stranded) of the gRNA and a stretch of nucleotides that forms one half of the dsRNA duplex of the protein-binding segment of the gRNA. An example of a crRNA tail located downstream (3') of the DNA targeting segment comprises, consists essentially of, or consists of GUUUUAGAGCUAUGCU (SEQ ID NO: 73). Any of the DNA targeting segments disclosed herein can be joined to the 5' end of SEQ ID NO: 73 to form a crRNA.
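
As a minimal illustration of the joining step described above, the following Python sketch appends the crRNA tail of SEQ ID NO: 73 to the 3' side of a DNA targeting segment. The function name `make_crrna` and the 20-nucleotide example segment are hypothetical and are not taken from the specification.

```python
# crRNA tail from the text (SEQ ID NO: 73); it sits 3' of the DNA targeting segment.
CRRNA_TAIL = "GUUUUAGAGCUAUGCU"


def make_crrna(dna_targeting_segment: str) -> str:
    """Join a DNA targeting segment (written 5'->3') to the 5' end of the crRNA tail.

    The segment may be given with T or U; it is normalised to RNA (U).
    """
    segment = dna_targeting_segment.upper().replace("T", "U")
    if not segment or set(segment) - set("ACGU"):
        raise ValueError("targeting segment must contain only A, C, G, U (or T)")
    return segment + CRRNA_TAIL


# Hypothetical 20-nt targeting segment, for illustration only.
example_segment = "GACCUGAAGUCCAUGGACUA"
crrna = make_crrna(example_segment)
print(crrna)
```

The targeting segment occupies the 5' end of the result and the tail the 3' end, matching the orientation stated in the paragraph.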

[0207] The corresponding tracrRNA (activator RNA) contains a stretch of nucleotides that forms the other half of the dsRNA duplex of the protein-binding segment of the gRNA. A stretch of nucleotides in the crRNA is complementary to a stretch of nucleotides in the tracrRNA and hybridizes with it to form the dsRNA duplex of the protein-binding segment of the gRNA. Each crRNA can therefore be said to have a corresponding tracrRNA. Examples of tracrRNA sequences comprise, consist essentially of, or consist of AGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUU (SEQ ID NO: 74), AAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU (SEQ ID NO: 75), or GUUGGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO: 76).
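
The complementarity between the crRNA tail and the tracrRNA can be checked directly on the sequences given above. The sketch below counts the contiguous Watson-Crick pairs formed when the 3' end of the crRNA tail (SEQ ID NO: 73) is aligned antiparallel to the 5' end of the tracrRNA (SEQ ID NO: 74). It is a deliberate simplification: wobble pairs and bulges are ignored, and the function name is hypothetical.

```python
CRRNA_TAIL = "GUUUUAGAGCUAUGCU"  # SEQ ID NO: 73
TRACRRNA = (
    "AGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUU"
)  # SEQ ID NO: 74

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def terminal_stem_length(crrna: str, tracrrna: str) -> int:
    """Count contiguous Watson-Crick pairs between the 3' end of the crRNA
    and the 5' end of the tracrRNA, paired antiparallel.

    Counting stops at the first non-Watson-Crick position, so this reports
    only the uninterrupted terminal stem, not the whole duplex.
    """
    pairs = 0
    for cr_base, tracr_base in zip(reversed(crrna), tracrrna):
        if (cr_base, tracr_base) not in WATSON_CRICK:
            break
        pairs += 1
    return pairs


print(terminal_stem_length(CRRNA_TAIL, TRACRRNA))  # prints 8
```

For these two sequences the last eight nucleotides of the tail (GCUAUGCU) pair with the first eight nucleotides of the tracrRNA (AGCAUAGC) before the first unpaired position.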

[0208] In systems requiring both crRNA and tracrRNA, the crRNA and its corresponding tracrRNA hybridize to form gRNA. In systems requiring only crRNA, the crRNA may be gRNA. The crRNA additionally provides a single-stranded DNA targeting segment that hybridizes to the complementary strand of the target DNA. When used for intracellular modification, the precise sequence of a given crRNA or tracrRNA molecule can be designed to be specific to the species in which the RNA molecule is used. For example, see Mali et al. (2013) Science 339(6121):823-826, Jinek et al. (2012) Science 337(6096):816-821, Hwang et al. (2013) Nat. Biotechnol. 31(3):227-229, Jiang et al. (2013) Nat. Biotechnol. 31(3):233-239, and Cong et al. (2013) Science 339(6121):819-823, each incorporated herein by reference in its entirety for all purposes.

[0209] The DNA targeting segment (crRNA) of a given gRNA contains a nucleotide sequence complementary to a sequence on the complementary strand of the target DNA, as described in more detail below. The DNA targeting segment of the gRNA interacts with the target DNA in a sequence-specific manner via hybridization (i.e., base pairing). The nucleotide sequence of the DNA targeting segment can therefore vary, and it determines the position within the target DNA at which the gRNA and target DNA interact. The DNA targeting segment of a subject gRNA can be modified to hybridize to any desired sequence within the target DNA. Naturally occurring crRNAs vary depending on the CRISPR/Cas system and the organism, but often contain a targeting segment of 21–72 nucleotides, flanked by two direct repeats (DRs) of 21–46 nucleotides (see, for example, International Publication No. 2014/131833, which is incorporated herein by reference in its entirety for all purposes). In the case of S. pyogenes, the DRs are 36 nucleotides long, and the targeting segment is 30 nucleotides long. The DR located at the 3' end is complementary to the corresponding tracrRNA and hybridizes with it, and the tracrRNA in turn binds to the Cas protein.

[0210] DNA targeting segments may have lengths of, for example, at least about 12, at least about 15, at least about 17, at least about 18, at least about 19, at least about 20, at least about 25, at least about 30, at least about 35, or at least about 40 nucleotides. Such DNA targeting segments may have lengths of, for example, about 12 to about 100, about 12 to about 80, about 12 to about 50, about 12 to about 40, about 12 to about 30, about 12 to about 25, or about 12 to about 20 nucleotides. For example, a DNA targeting segment may be about 15 to about 25 nucleotides (e.g., about 17 to about 20 nucleotides, or about 17, about 18, about 19, or about 20 nucleotides). See, for example, U.S. Patent Application Publication No. 2016 / 0024523, which is incorporated herein by reference in its entirety for all purposes. In the case of Cas9 from S. pyogenes, the typical DNA targeting segment is 16–20 nucleotides long, or 17–20 nucleotides long. In the case of Cas9 from S. aureus, the typical DNA targeting segment is 21–23 nucleotides long. In the case of Cpf1, the typical DNA targeting segment is at least 16 nucleotides long, or at least 18 nucleotides long.
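
The typical lengths quoted at the end of the paragraph above can be captured in a small lookup, which may be convenient when screening candidate segments. This is only a sketch: the dictionary keys and the function name are hypothetical labels, and the ranges are the "typical" values from the text (the broader 16–20 range is used for S. pyogenes Cas9), not hard limits.

```python
# Typical DNA targeting segment lengths stated in the text, in nucleotides.
# An upper bound of None means the text gives only a minimum.
TYPICAL_LENGTHS = {
    "SpCas9": (16, 20),   # Cas9 from S. pyogenes
    "SaCas9": (21, 23),   # Cas9 from S. aureus
    "Cpf1": (16, None),   # at least 16 nucleotides
}


def is_typical_length(segment: str, cas: str) -> bool:
    """Return True if the segment length falls in the typical range for `cas`."""
    low, high = TYPICAL_LENGTHS[cas]
    n = len(segment)
    return n >= low and (high is None or n <= high)


print(is_typical_length("A" * 20, "SpCas9"))  # True
print(is_typical_length("A" * 20, "SaCas9"))  # False
```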

[0211] For example, a DNA targeting segment may be approximately 20 nucleotides long. However, shorter and longer sequences can also be used as targeting segments (e.g., 15–25 nucleotides long, such as 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides). The degree of identity between the DNA targeting segment and the corresponding guide RNA target sequence (or the degree of complementarity between the DNA targeting segment and the other strand of the guide RNA target sequence) may be, for example, approximately 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100%. The DNA targeting segment and the corresponding guide RNA target sequence may contain one or more mismatches. For example, the DNA targeting segment of the guide RNA and the corresponding guide RNA target sequence may contain 1-4, 1-3, 1-2, 1, 2, 3, or 4 mismatches (for example, the full length of the guide RNA target sequence is at least 17, at least 18, at least 19, or at least 20 nucleotides). For example, the DNA targeting segment of the guide RNA and the corresponding guide RNA target sequence may contain 1-4, 1-3, 1-2, 1, 2, 3, or 4 mismatches, with the full length of the guide RNA target sequence being 20 nucleotides.
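
The identity and mismatch figures in the paragraph above can be computed position by position. The following sketch compares an RNA targeting segment with a DNA guide RNA target sequence of the same length, treating U as equivalent to T. Both example sequences and the function names are hypothetical and serve only to illustrate the arithmetic.

```python
def count_mismatches(targeting_segment_rna: str, target_sequence_dna: str) -> int:
    """Count positions at which the RNA targeting segment differs from the
    DNA guide RNA target sequence (same strand, both 5'->3'), with U read as T."""
    rna_as_dna = targeting_segment_rna.upper().replace("U", "T")
    dna = target_sequence_dna.upper()
    if not dna or len(rna_as_dna) != len(dna):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a != b for a, b in zip(rna_as_dna, dna))


def percent_identity(targeting_segment_rna: str, target_sequence_dna: str) -> float:
    """Percentage of positions that are identical between the two sequences."""
    mismatches = count_mismatches(targeting_segment_rna, target_sequence_dna)
    length = len(target_sequence_dna)
    return 100.0 * (length - mismatches) / length


# Hypothetical 20-nt target and a segment that differs at the first and last positions.
target = "GACCTGAAGTCCATGGACTA"
segment = "CACCUGAAGUCCAUGGACUU"
print(count_mismatches(segment, target))   # 2
print(percent_identity(segment, target))   # 90.0
```

Two mismatches over a 20-nucleotide target correspond to 90% identity, which falls within the ranges listed in the paragraph.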

[0212] TracrRNAs can be in any form (e.g., full-length tracrRNAs or active partial tracrRNAs) and can be of various lengths. They may include primary transcripts or processed forms. For example, a tracrRNA (as part of a single guide RNA or as a separate molecule as part of a two-molecule gRNA) may comprise, consist essentially of, or consist of all or part of a wild-type tracrRNA sequence (e.g., about 20 or more, about 26 or more, about 32 or more, about 45 or more, about 48 or more, about 54 or more, about 63 or more, about 67 or more, about 85 or more, or more nucleotides of a wild-type tracrRNA sequence). Examples of wild-type tracrRNA sequences from S. pyogenes include 171-nucleotide, 89-nucleotide, 75-nucleotide, and 65-nucleotide versions. See, for example, Deltcheva et al. (2011) Nature 471(7340):602-607 and International Publication No. 2014/093661, each of which is incorporated herein by reference in its entirety for all purposes. Examples of tracrRNAs within single guide RNAs (sgRNAs) include the tracrRNA segments found within the +48, +54, +67, and +85 versions of the sgRNA, where "+n" indicates that up to the +n nucleotide of the wild-type tracrRNA is included in the sgRNA. See U.S. Patent No. 8,697,359, which is incorporated herein by reference in its entirety for all purposes.

[0213] The percent complementarity between the DNA targeting segment of the guide RNA and the complementary strand of the target DNA can be at least 60% (e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100%). The percent complementarity between the DNA targeting segment and the complementary strand of the target DNA can be at least 60% over approximately 20 consecutive nucleotides. As an example, the percent complementarity between the DNA targeting segment and the complementary strand of the target DNA can be 100% over 14 consecutive nucleotides at the 5' end of the complementary strand of the target DNA and as low as 0% over the rest. In such a case, the DNA targeting segment can be considered to be 14 nucleotides long. As another example, the percent complementarity between the DNA targeting segment and the complementary strand of the target DNA can be 100% over seven consecutive nucleotides at the 5' end of the complementary strand of the target DNA and as low as 0% over the rest. In such a case, the DNA targeting segment can be considered to be 7 nucleotides long. For some guide RNAs, at least 17 nucleotides within the DNA targeting segment are complementary to the complementary strand of the target DNA. For example, the DNA targeting segment may be 20 nucleotides long and may contain one, two, or three mismatches with the complementary strand of the target DNA. For example, the mismatches are not adjacent to the region of the complementary strand corresponding to the protospacer adjacent motif (PAM) sequence (i.e., the reverse complement of the PAM sequence) (e.g., the mismatches are at the 5' end of the DNA targeting segment of the guide RNA, or the mismatches are at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, or at least 19 base pairs away from the region of the complementary strand corresponding to the PAM sequence).
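
To make the PAM-distance criterion concrete, the sketch below reports how far each mismatch lies from the PAM-proximal end of the DNA targeting segment. It rests on two assumptions that are not stated in the paragraph: the PAM is taken to lie 3' of the target sequence (as for S. pyogenes Cas9), and a distance of 0 is assigned to the nucleotide immediately adjacent to the PAM. The example sequences and function names are hypothetical.

```python
def mismatch_distances_from_pam(targeting_segment_rna: str,
                                target_sequence_dna: str) -> list[int]:
    """Return the distance of each mismatch from the PAM-proximal (3') end.

    Assumes a 3' PAM (Cas9-style). Distance 0 means the mismatch is at the
    position directly adjacent to the PAM; for a 20-nt segment the 5'-most
    position has distance 19.
    """
    rna_as_dna = targeting_segment_rna.upper().replace("U", "T")
    dna = target_sequence_dna.upper()
    if len(rna_as_dna) != len(dna):
        raise ValueError("sequences must have equal length")
    last = len(dna) - 1
    return [last - i for i, (a, b) in enumerate(zip(rna_as_dna, dna)) if a != b]


def mismatches_clear_of_pam(targeting_segment_rna: str,
                            target_sequence_dna: str,
                            min_distance: int) -> bool:
    """True if every mismatch is at least `min_distance` positions from the PAM."""
    distances = mismatch_distances_from_pam(targeting_segment_rna, target_sequence_dna)
    return all(d >= min_distance for d in distances)


target = "GACCTGAAGTCCATGGACTA"            # hypothetical 20-nt target sequence
five_prime_mismatch = "CACCUGAAGUCCAUGGACUA"  # differs only at the 5' end
print(mismatch_distances_from_pam(five_prime_mismatch, target))  # [19]
print(mismatches_clear_of_pam(five_prime_mismatch, target, 10))  # True
```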

[0214] The protein-binding segment of a gRNA can contain two stretches of nucleotides that are complementary to one another. These complementary nucleotides hybridize to form a double-stranded RNA (dsRNA) duplex. The protein-binding segment of a subject gRNA interacts with a Cas protein, and the gRNA guides the bound Cas protein to a specific nucleotide sequence in the target DNA via its DNA targeting segment.

[0215] A single guide RNA may include a DNA targeting segment and a scaffold sequence (i.e., the protein-binding or Cas-binding sequence of the guide RNA). For example, such a guide RNA may have a 5' DNA targeting segment joined to a 3' scaffold sequence. Exemplary scaffold sequences comprise, consist essentially of, or consist of GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCU (version 1, SEQ ID NO: 77), GUUGGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (version 2, SEQ ID NO: 78), GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (version 3, SEQ ID NO: 79), GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (version 4, SEQ ID NO: 80), GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU (version 5, SEQ ID NO: 81), GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU (version 6, SEQ ID NO: 82), or GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUU (version 7, SEQ ID NO: 83). A guide RNA targeting any of the guide RNA target sequences disclosed herein (e.g., any of SEQ ID NOs: 90-95) may include, for example, a DNA targeting segment on the 5' end of the guide RNA (e.g., any of SEQ ID NOs: 96-101) fused to one of the exemplary guide RNA scaffold sequences on the 3' end of the guide RNA. That is, any of the DNA targeting segments disclosed herein can be joined to the 5' end of any one of the above scaffold sequences to form a single guide RNA (chimeric guide RNA). In a specific example, a guide RNA targeting SEQ ID NO: 93 or 94 may include, for example, a DNA targeting segment of SEQ ID NO: 99 or 100 on the 5' end of the guide RNA fused to one of the exemplary guide RNA scaffold sequences on the 3' end of the guide RNA. In another specific example, a guide RNA targeting SEQ ID NO: 93 may include, for example, a DNA targeting segment of SEQ ID NO: 99 on the 5' end of the guide RNA fused to one of the exemplary guide RNA scaffold sequences on the 3' end of the guide RNA.
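
The assembly of a single guide RNA described above amounts to a 5'-to-3' concatenation. The sketch below joins a DNA targeting segment to the version 1 scaffold (SEQ ID NO: 77). The sequences of SEQ ID NOs: 96-101 are not reproduced in this section, so the targeting segment used here is a hypothetical placeholder, as is the function name.

```python
# Version 1 scaffold from the text (SEQ ID NO: 77).
SCAFFOLD_V1 = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCU"
)


def make_sgrna(dna_targeting_segment: str, scaffold: str = SCAFFOLD_V1) -> str:
    """Build a single guide RNA: 5' DNA targeting segment + 3' scaffold."""
    segment = dna_targeting_segment.upper().replace("T", "U")
    if not segment or set(segment) - set("ACGU"):
        raise ValueError("targeting segment must contain only A, C, G, U (or T)")
    return segment + scaffold


# Hypothetical targeting segment; substitute a real one (e.g., SEQ ID NO: 99) in practice.
placeholder_segment = "GACCUGAAGUCCAUGGACUA"
sgrna = make_sgrna(placeholder_segment)
print(len(sgrna), sgrna)
```

Any of the other scaffold versions can be supplied through the `scaffold` argument without changing the function.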

[0216] Guide RNAs may include modifications or sequences that provide additional desirable features (e.g., modified or regulated stability, subcellular targeting, tracking with a fluorescent label, or a binding site for a protein or protein complex). Guide RNAs may include one or more modified nucleosides or nucleotides, or one or more non-naturally and/or naturally occurring components or configurations used in place of or in addition to the canonical A, G, C, and U residues. Examples of such modifications include 5' caps (e.g., 7-methylguanylate caps (m7G)), 3' polyadenylated tails (i.e., 3' poly(A) tails), riboswitch sequences (e.g., to allow for regulated stability and/or regulated accessibility by proteins and/or protein complexes), stability control sequences, sequences that form dsRNA duplexes (i.e., hairpins), modifications or sequences that target the RNA to a subcellular location (e.g., nucleus, mitochondria, chloroplasts, etc.), modifications or sequences that provide for tracking (e.g., direct conjugation to a fluorescent molecule, conjugation to a moiety that facilitates fluorescence detection, a sequence that allows for fluorescence detection, etc.), modifications or sequences that provide a binding site for proteins (e.g., proteins that act on DNA, such as transcriptional activators), and combinations thereof. Other examples of modifications include engineered stem-loop duplex structures, engineered bulge regions, engineered hairpins 3' of the stem-loop duplex structure, or any combination thereof. See, for example, U.S. Patent Application Publication No. 2015/0376586, which is incorporated herein by reference in its entirety for all purposes. A bulge may be an unpaired region of nucleotides within the duplex made up of the crRNA-like region and the minimum tracrRNA-like region. A bulge may include, on one side of the duplex, an unpaired 5'-XXXY-3', where X is any purine and Y may be a nucleotide capable of forming a wobble pair with a nucleotide on the opposite strand, and an unpaired nucleotide region on the other side of the duplex.

[0217] Unmodified nucleic acids may be more susceptible to degradation. Exogenous nucleic acids can also induce an innate immune response. Modifications can help introduce stability and reduce immunogenicity. Guide RNAs can include modified nucleosides and modified nucleotides, for example, one or more of the following: (1) modification or replacement of one or both non-linking phosphate oxygens and/or one or more linking phosphate oxygens in a phosphodiester backbone linkage; (2) modification or replacement of a constituent of the ribose sugar, such as modification or replacement of the 2' hydroxyl on the ribose sugar; (3) replacement of the phosphate moiety with a dephospho linker; (4) modification or replacement of a naturally occurring nucleic acid base; (5) replacement or modification of the ribose-phosphate backbone; (6) modification of the 3' end or 5' end of the oligonucleotide (e.g., removal, modification, or replacement of a terminal phosphate group, or conjugation of a moiety); and (7) modification of the sugar. Other possible guide RNA modifications include modification or replacement of uracils or polyuracil tracts. See, for example, International Publication No. 2015/048577 and U.S. Patent Application Publication No. 2016/0237455, each of which is incorporated herein by reference in its entirety for all purposes. Similar modifications can be made to Cas-encoding nucleic acids, such as Cas mRNAs. For example, Cas mRNAs can be modified by depleting uridine using synonymous codons.

[0218] By combining chemical modifications such as those described above, modified gRNA and / or mRNA can be provided that contain residues (nucleosides and nucleotides) having two, three, four, or more modifications. For example, the modified residues may have modified sugars and modified nucleic acid bases. In one example, all bases of the gRNA are modified (for example, all bases have modified phosphate groups such as phosphorothioate groups). For example, all or substantially all phosphate groups of the gRNA can be replaced with phosphorothioate groups. Alternatively or additionally, the modified gRNA may contain at least one modified residue at or near its 5' end. Alternatively or additionally, the modified gRNA may contain at least one modified residue at or near its 3' end.

[0219] Some gRNAs contain one, two, three or more modified residues. For example, at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 100% of the modified gRNA may be modified nucleosides or nucleotides.

[0220] Unmodified nucleic acids may be more susceptible to degradation. Exogenous nucleic acids can also induce an innate immune response. Modifications may help introduce stability and reduce immunogenicity. Some gRNAs described herein may contain one or more modified nucleosides or nucleotides to introduce stability against intracellular or serum-based nucleases. Some modified gRNAs described herein may exhibit a reduced innate immune response when introduced into a population of cells.

[0221] The gRNAs disclosed herein may include backbone modifications in which the phosphate group of a modified residue is modified by replacing one or more of the oxygens with a different substituent. Modifications may include wholesale replacement of an unmodified phosphate moiety with a modified phosphate group as described herein. Backbone modifications of the phosphate backbone may also include alterations that result in either an uncharged linker or a charged linker with an unsymmetrical charge distribution.

[0222] Examples of modified phosphate groups include phosphorothioates, phosphoroselenates, boranophosphates, boranophosphate esters, hydrogen phosphonates, phosphoramidates, alkyl or aryl phosphonates, and phosphotriesters. The phosphorus atom of an unmodified phosphate group is achiral. However, the phosphorus atom may become chiral if one of the non-bridging oxygen atoms is replaced with one of the above atoms or groups of atoms. The stereogenic phosphorus atom can have either the "R" configuration (Rp) or the "S" configuration (Sp). The backbone can also be modified by replacing a bridging oxygen (i.e., the oxygen that links the phosphate to the nucleoside) with nitrogen (bridged phosphoramidates), sulfur (bridged phosphorothioates), or carbon (bridged methylene phosphonates). The replacement can occur at either linking oxygen or at both linking oxygens.

[0223] In certain backbone modifications, the phosphate group can be replaced by non-phosphorus-containing connectors. In some embodiments, the charged phosphate group can be replaced by a neutral moiety. Examples of moieties that can replace the phosphate group include, but are not limited to, methyl phosphonate, hydroxylamino, siloxane, carbonate, carboxymethyl, carbamate, amide, thioether, ethylene oxide linker, sulfonate, sulfonamide, thioformacetal, formacetal, oxime, methyleneimino, methylenemethylimino, methylenehydrazo, methylenedimethylhydrazo, and methyleneoxymethylimino.

[0224] Scaffolds that can mimic nucleic acids can also be constructed in which the phosphate linker and the ribose sugar are replaced by nuclease-resistant nucleoside or nucleotide surrogates. Such modifications may include backbone and sugar modifications. In some embodiments, the nucleic acid bases may be tethered by a surrogate backbone. Examples include, but are not limited to, morpholino, cyclobutyl, pyrrolidine, and peptide nucleic acid (PNA) nucleoside surrogates.

[0225] Modified nucleosides and modified nucleotides can include one or more modifications to the sugar moiety (sugar modifications). For example, the 2'-hydroxyl group (OH) can be modified (e.g., substituted with any of several different oxy or deoxy substituents). Modification of the 2'-hydroxyl group can enhance the stability of nucleic acids because the hydroxyl cannot be deprotonated to form a 2'-alkoxide ion.

[0226] Examples of 2'-hydroxyl group modifications include alkoxy or aryloxy (OR, where "R" can be, for example, alkyl, cycloalkyl, aryl, aralkyl, heteroaryl, or a sugar); and polyethylene glycols (PEG), O(CH2CH2O)nCH2CH2OR, where R can be, for example, H or optionally substituted alkyl, and n can be an integer from 0 to 20 (e.g., 0 to 4, 0 to 8, 0 to 10, 0 to 16, 1 to 4, 1 to 8, 1 to 10, 1 to 16, 1 to 20, 2 to 4, 2 to 8, 2 to 10, 2 to 16, 2 to 20, 4 to 8, 4 to 10, 4 to 16, and 4 to 20). The 2'-hydroxyl group modification can be 2'-O-Me. Similarly, the 2'-hydroxyl group modification can be a 2'-fluoro modification, which replaces the 2'-hydroxyl group with a fluoride. The 2'-hydroxyl group modification can include locked nucleic acids (LNA), in which the 2'-hydroxyl is connected to the 4'-carbon of the same ribose sugar by a C1-6 alkylene or C1-6 heteroalkylene bridge, exemplary bridges being methylene, propylene, ether, or amino bridges; O-amino (where amino can be, for example, NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diarylamino, heteroarylamino, diheteroarylamino, ethylenediamine, or polyamino); and aminoalkoxy, O(CH2)n-amino (where amino can be, for example, NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diarylamino, heteroarylamino, diheteroarylamino, ethylenediamine, or polyamino). The 2'-hydroxyl group modification may include unlocked nucleic acids (UNA), in which the ribose ring lacks the C2'-C3' bond. The 2'-hydroxyl group modification may include the methoxyethyl group (MOE) (OCH2CH2OCH3, e.g., a PEG derivative).

[0227] Deoxy 2' modifications include hydrogen (i.e., deoxyribose sugars, for example at the overhang portions of partially dsRNA); halo (e.g., bromo, chloro, fluoro, or iodo); amino (where amino can be, for example, NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diarylamino, heteroarylamino, diheteroarylamino, or amino acid); NH(CH2CH2NH)nCH2CH2-amino (where amino is, for example, as described herein); -NHC(O)R (where R may be, for example, alkyl, cycloalkyl, aryl, aralkyl, heteroaryl, or a sugar); cyano; mercapto; alkyl-thio-alkyl; thioalkoxy; and alkyl, cycloalkyl, aryl, alkenyl, and alkynyl, which may be optionally substituted with, for example, an amino as described herein.

[0228] Sugar modifications may include sugar groups that contain one or more carbons having the opposite stereochemical configuration to that of the corresponding carbon in ribose. Modified nucleic acids may therefore contain, for example, nucleotides that have arabinose as the sugar. Modified nucleic acids may also contain abasic sugars. These abasic sugars can also be further modified at one or more of their constituent sugar atoms. Modified nucleic acids may also contain one or more sugars in the L form (e.g., L-nucleosides).

[0229] The modified nucleosides and modified nucleotides described herein, which can be incorporated into modified nucleic acids, may include modified bases, also called nucleic acid bases. Examples of nucleic acid bases include, but are not limited to, adenine (A), guanine (G), cytosine (C), and uracil (U). These nucleic acid bases may be modified or wholly replaced to provide modified residues that can be incorporated into modified nucleic acids. The nucleic acid base of a nucleotide can be independently selected from a purine, a pyrimidine, a purine analog, or a pyrimidine analog. In some embodiments, the nucleic acid bases may include, for example, naturally occurring and synthetic derivatives of a base.

[0230] In dual guide RNAs, modifications can be included in both the crRNA and tracrRNA. Such modifications may be at one or both ends of the crRNA and / or tracrRNA. In sgRNAs, one or more residues at one or both ends of the sgRNA may be chemically modified, and / or the internal nucleoside may be modified, and / or the entire sgRNA may be chemically modified. Some gRNAs include 5' end modifications. Some gRNAs include 3' end modifications.

[0231] The guide RNAs disclosed herein may include one of the modification patterns disclosed in International Publication No. 2018/107028 A1, which is incorporated herein by reference in its entirety for all purposes. The guide RNAs disclosed herein may also include one of the structure/modification patterns disclosed in U.S. Patent Application Publication No. 2017/0114334, which is incorporated herein by reference in its entirety for all purposes. The guide RNAs disclosed herein may also include one of the structure/modification patterns disclosed in International Publication No. 2017/136794, International Publication No. 2017/004279, U.S. Patent Application Publication No. 2018/0187186, or U.S. Patent Application Publication No. 2019/0048338, each of which is incorporated herein by reference in its entirety for all purposes.

[0232] As an example, the 5' or 3' terminal nucleotides of guide RNA may contain phosphorothioate bonds (for example, the base may have a modified phosphate group which is a phosphorothioate group). For example, guide RNA may contain phosphorothioate bonds between the 2, 3, or 4 terminal nucleotides at the 5' or 3' end of guide RNA. Another example is that the 5' and / or 3' terminal nucleotides of guide RNA may have 2'-O-methyl modifications. For example, guide RNA may have 2'-O-methyl modifications on the 2, 3, or 4 terminal nucleotides at the 5' and / or 3' end (e.g., the 5' end). See, for example, International Publication No. 2017 / 173054(A1) and Finn et al. (2018) Cell Rep. 22(9):2227-2235, respectively, which are incorporated herein by reference in their entirety for all purposes. Other possible modifications are described in more detail elsewhere herein. In specific examples, guide RNA may contain 2'-O-methyl analogs and 3' phosphorothioate internucleotide bonds at the first three 5' and 3' terminal RNA residues. Such chemical modifications provide, for example, better stability and protection of guide RNA from exonucleases, allowing them to persist in cells longer than unmodified guide RNA. Such chemical modifications can also protect the RNA from innate intracellular immune responses that could aggressively degrade the RNA or trigger immune cascades leading to cell death.

[0233] As an example, any of the guide RNAs described herein may include at least one modification. For example, the at least one modification may include a 2'-O-methyl (2'-O-Me) modified nucleotide, a phosphorothioate (PS) bond between nucleotides, a 2'-fluoro (2'-F) modified nucleotide, or a combination thereof. For example, the at least one modification may include a 2'-O-methyl (2'-O-Me) modified nucleotide. Alternatively or additionally, the at least one modification may include a phosphorothioate (PS) bond between nucleotides. Alternatively or additionally, the at least one modification may include a 2'-fluoro (2'-F) modified nucleotide. For example, a guide RNA described herein may include one or more 2'-O-methyl (2'-O-Me) modified nucleotides and one or more phosphorothioate (PS) bonds between nucleotides.

[0234] Modifications can occur anywhere on the guide RNA. For example, the guide RNA may include modifications on one or more of the first five nucleotides at the 5' end of the guide RNA, or modifications on one or more of the last five nucleotides at the 3' end of the guide RNA, or a combination thereof. For instance, the guide RNA may include phosphorothioate bonds between the first four nucleotides of the guide RNA, phosphorothioate bonds between the last four nucleotides of the guide RNA, or a combination thereof. Alternatively or additionally, the guide RNA may include 2'-O-Me modified nucleotides at the first three nucleotides at the 5' end of the guide RNA, or 2'-O-Me modified nucleotides at the last three nucleotides at the 3' end of the guide RNA, or a combination thereof.
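
One of the end-modification patterns described above (2'-O-Me on the first three and last three nucleotides, and phosphorothioate bonds between the first four and between the last four nucleotides) can be written out with a short script. The output notation is an assumption borrowed from common oligonucleotide ordering shorthand, not from the specification: a leading `m` marks a 2'-O-Me nucleotide and `*` marks a phosphorothioate bond. The function name and the 12-nucleotide example are hypothetical.

```python
def annotate_end_modifications(grna: str, n_ome: int = 3, n_ps_nt: int = 4) -> str:
    """Annotate a guide RNA with end modifications.

    n_ome   -- number of nucleotides at each end carrying 2'-O-Me ('m' prefix)
    n_ps_nt -- number of terminal nucleotides at each end joined by
               phosphorothioate bonds ('*'), i.e. n_ps_nt - 1 bonds per end
    """
    n = len(grna)
    if n < 2 * max(n_ome, n_ps_nt):
        raise ValueError("guide RNA too short for non-overlapping end modifications")
    parts = []
    for i, base in enumerate(grna):
        is_ome = i < n_ome or i >= n - n_ome
        parts.append("m" + base if is_ome else base)
        # Bond i joins nucleotide i to nucleotide i + 1.
        if i < n - 1 and (i < n_ps_nt - 1 or i >= n - n_ps_nt):
            parts.append("*")
    return "".join(parts)


# Short hypothetical sequence so the pattern is easy to read.
print(annotate_end_modifications("GACCUGAAGUCC"))
# mG*mA*mC*CUGAAG*mU*mC*mC
```

Changing `n_ome` or `n_ps_nt` gives the other variants mentioned in the paragraph, such as modifications within the first and last five nucleotides.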

[0235] Another chemical modification that has been shown to affect nucleotide sugar rings is halogen substitution. For example, 2'-fluoro (2'-F) substitution on a nucleotide sugar ring can increase oligonucleotide binding affinity and nuclease stability. An abasic nucleotide is a nucleotide that lacks a nitrogenous base. An inverted base is a nucleotide with a linkage that is inverted from the usual 5'-to-3' linkage (i.e., either a 5'-to-5' linkage or a 3'-to-3' linkage).

[0236] Abasic nucleotides can be attached via inverted linkages. For example, an abasic nucleotide can be attached to the terminal 5' nucleotide via a 5'-to-5' linkage, or an abasic nucleotide can be attached to the terminal 3' nucleotide via a 3'-to-3' linkage. An inverted abasic nucleotide at either the terminal 5' or 3' nucleotide may also be called an inverted abasic end cap.

[0237] In one example, one or more of the first three, four, or five nucleotides at the 5' end and one or more of the last three, four, or five nucleotides at the 3' end are modified. The modifications may be, for example, 2'-O-Me, 2'-F, inverted abasic nucleotides, phosphorothioate bonds, or other nucleotide modifications known to enhance stability and/or performance.

[0238] In another example, the first four nucleotides at the 5' end and the last four nucleotides at the 3' end can be linked by a phosphorothioate bond.

[0239] In another example, the first three nucleotides at the 5' end and the last three nucleotides at the 3' end may include 2'-O-methyl (2'-O-Me) modified nucleotides. In yet another example, the first three nucleotides at the 5' end and the last three nucleotides at the 3' end may include 2'-fluoro (2'-F) modified nucleotides. In yet another example, the first three nucleotides at the 5' end and the last three nucleotides at the 3' end may include inverted abasic nucleotides.

[0240] In some guide RNAs (e.g., single guide RNAs), at least one loop (e.g., two loops) of the guide RNA is modified by the insertion of a distinct RNA sequence that binds to one or more adapters (i.e., adapter proteins or domains). Such adapter proteins may be used to further recruit one or more heterologous functional domains, such as transcriptional activation domains. Examples of fusion proteins containing such adapter proteins (i.e., chimeric adapter proteins) are disclosed elsewhere herein. For example, the MS2-binding loop ggccAACAUGAGGAUCACCCAUGUCUGCAGggcc (SEQ ID NO: 52) can replace nucleotides +13 to +16 and nucleotides +53 to +56 of the sgRNA scaffold (backbone) described in SEQ ID NOs. 77, 79, 81, or 82, or of the sgRNA backbone of the S. pyogenes CRISPR/Cas9 system described in International Publication No. 2016/049258 and Konermann et al. (2015) Nature 517(7536):583-588, each of which is incorporated herein by reference in its entirety for all purposes. See, for example, Figure 6. The numbering of guide RNAs used herein refers to the numbering of nucleotides in the guide RNA scaffold sequence (i.e., the sequence downstream of the DNA targeting segment of the guide RNA). For example, the first nucleotide of the guide RNA scaffold is +1, the second nucleotide of the scaffold is +2, and so on. The residues corresponding to nucleotides +13 to +16 in SEQ ID NOs. 77, 79, 81, or 82 are the loop sequence in the region extending from nucleotides +9 to +21 in SEQ ID NOs. 77, 79, 81, or 82, which is referred to herein as the tetraloop. The residues corresponding to nucleotides +53 to +56 in SEQ ID NOs. 77, 79, 81, or 82 are the loop sequence in the region extending from nucleotides +48 to +61 in SEQ ID NOs. 77, 79, 81, or 82, which is referred to herein as stem loop 2. Other stem-loop sequences in SEQ ID NOs. 77, 79, 81, or 82 include stem loop 1 (nucleotides +33 to +41) and stem loop 3 (nucleotides +63 to +75). The resulting structure is an sgRNA scaffold in which each of the tetraloop and stem loop 2 sequences is replaced by an MS2-binding loop. The tetraloop and stem loop 2 protrude from the Cas9 protein, so the addition of the MS2-binding loop does not interfere with any Cas9 residues. Additionally, the proximity of the tetraloop and stem loop 2 sites to DNA suggests that localization to these sites can lead to a high degree of interaction between the DNA and any recruited protein, such as a transcription activator. Therefore, in some sgRNAs, the nucleotides corresponding to +13 to +16 and/or +53 to +56 of the guide RNA scaffold described in SEQ ID NOs. 77, 79, 81, or 82, or the corresponding residues when optimally aligned with any of these scaffolds/backbones, are replaced by a distinct RNA sequence capable of binding to one or more adapter proteins or domains. Alternatively or additionally, the adapter-binding sequence may be appended to the 5' or 3' end of the guide RNA. Exemplary guide RNA scaffolds containing MS2-binding loops in the tetraloop and stem loop 2 regions can comprise, consist essentially of, or consist of the sequences described in SEQ ID NO: 66 or 71. Exemplary general single guide RNAs containing MS2-binding loops in the tetraloop and stem loop 2 regions can comprise, consist essentially of, or consist of the sequences described in SEQ ID NO: 68 or 72.
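By way of illustration only, the replacement of scaffold positions +13 to +16 and +53 to +56 with the MS2-binding loop can be sketched as a string operation using the 1-based scaffold numbering described above. The sequences of SEQ ID NOs. 77, 79, 81, and 82 are not reproduced in this section, so the scaffold below is an assumption: it is a commonly used S. pyogenes sgRNA scaffold supplied only to make the example runnable, and the function name is hypothetical. Only the MS2-binding loop sequence (SEQ ID NO: 52) is taken from the text.

```python
# MS2-binding loop as given in the text (SEQ ID NO: 52).
MS2_LOOP = "ggccAACAUGAGGAUCACCCAUGUCUGCAGggcc"

# ASSUMPTION: a commonly used S. pyogenes sgRNA scaffold, written in blocks
# of ten nucleotides. It is NOT taken from this specification's sequence listing.
ASSUMED_SCAFFOLD = "".join([
    "GUUUUAGAGC", "UAGAAAUAGC", "AAGUUAAAAU", "AAGGCUAGUC",
    "CGUUAUCAAC", "UUGAAAAAGU", "GGCACCGAGU", "CGGUGCUUUU",
])


def replace_positions(scaffold: str, spans, insert: str) -> str:
    """Replace 1-based inclusive spans (+start to +end) of the scaffold.

    Spans are processed from the 3' end backwards so that earlier
    coordinates remain valid after each replacement.
    """
    out = scaffold
    for start, end in sorted(spans, reverse=True):
        out = out[:start - 1] + insert + out[end:]
    return out


modified = replace_positions(ASSUMED_SCAFFOLD, [(13, 16), (53, 56)], MS2_LOOP)
print(modified)
```

In the assumed scaffold, positions +13 to +16 and +53 to +56 are each the four-nucleotide loop GAAA, consistent with the tetraloop and stem loop 2 loop positions described above.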

[0241] Guide RNA can be provided in any form. For example, gRNA can be provided in RNA form as either two molecules (separate crRNA and tracrRNA) or one molecule (sgRNA), and optionally in the form of a complex with the Cas protein. gRNA can also be provided in the form of DNA encoding gRNA. DNA encoding gRNA can encode a single RNA molecule (sgRNA) or separate RNA molecules (e.g., separate crRNA and tracrRNA). In the latter case, the DNA encoding gRNA can be provided as a single DNA molecule or as separate DNA molecules encoding crRNA and tracrRNA, respectively.

[0242] If gRNA is provided in the form of DNA, it can be expressed transiently, conditionally, or constitutively within a cell. The DNA encoding the gRNA can be stably integrated into the cell's genome and operably linked to a promoter active in the cell. Alternatively, the DNA encoding the gRNA can be operably linked to a promoter in an expression construct. For example, the DNA encoding the gRNA may be in a vector containing heterologous nucleic acids. Promoters that can be used in such an expression construct include, for example, promoters active in one or more of the following: eukaryotic cells, human cells, non-human cells, mammalian cells, non-human mammalian cells, rodent cells, mouse cells, rat cells, pluripotent cells, embryonic stem (ES) cells, adult stem cells, developmentally restricted progenitor cells, induced pluripotent stem (iPS) cells, or one-cell stage embryos. Such promoters may be, for example, conditional promoters, inducible promoters, constitutive promoters, or tissue-specific promoters. Such promoters may also be, for example, bidirectional promoters. Specific examples of suitable promoters include RNA polymerase III promoters such as the human U6 promoter, the rat U6 polymerase III promoter, or the mouse U6 polymerase III promoter.

[0243] Alternatively, gRNA can be prepared by a variety of other methods. For example, gRNA can be prepared by in vitro transcription using T7 RNA polymerase (see, for example, International Publication Nos. 2014/089290 and 2014/065596, each of which is incorporated herein by reference in its entirety for all purposes). Guide RNA can also be a synthetically produced molecule prepared by chemical synthesis. For example, guide RNA can be chemically synthesized to include 2'-O-methyl analogs and 3' phosphorothioate internucleotide bonds at the first three 5' and 3' terminal RNA residues.

[0244] A guide RNA (or a nucleic acid encoding a guide RNA) may be in a composition comprising one or more guide RNAs (e.g., 1, 2, 3, 4, or more guide RNAs) and a carrier that increases the stability of the guide RNA (e.g., by extending the period during which degradation products remain below a threshold, e.g., less than 0.5% by weight of the starting nucleic acid or protein, under specified storage conditions (e.g., -20°C, 4°C, or ambient temperature), or by increasing in vivo stability). Non-limiting examples of such carriers include poly(lactic acid) (PLA) microspheres, poly(D,L-lactic-co-glycolic acid) (PLGA) microspheres, liposomes, micelles, reverse micelles, lipid cochleates, and lipid microtubules. Such a composition may further comprise a Cas protein, such as a Cas9 protein, or a nucleic acid encoding a Cas protein.

[0245] (2) Guide RNA target sequence. The target DNA of the guide RNA includes nucleic acid sequences present in DNA to which the DNA-targeting segment of the gRNA binds, provided that sufficient binding conditions are present. Suitable DNA/RNA binding conditions include physiological conditions normally present in cells. Other suitable DNA/RNA binding conditions (e.g., conditions in cell-free systems) are known in the art (see, for example, Molecular Cloning: A Laboratory Manual, 3rd Ed. (Sambrook et al., Cold Spring Harbor Laboratory Press 2001), which is incorporated herein by reference in its entirety for all purposes). The strand of target DNA that is complementary to and hybridizes with the gRNA can be called the "complementary strand," and the strand of target DNA that is complementary to the "complementary strand" (and therefore not complementary to the Cas protein or gRNA) can be called the "non-complementary strand" or "template strand."

[0246] The target DNA includes both the sequence on the complementary strand to which the guide RNA hybridizes and the corresponding sequence on the non-complementary strand (e.g., adjacent to the protospacer adjacent motif (PAM)). As used herein, the term "guide RNA target sequence" specifically refers to the sequence on the non-complementary strand that corresponds to (i.e., is the reverse complement of) the sequence to which the guide RNA hybridizes on the complementary strand. That is, the guide RNA target sequence refers to the sequence on the non-complementary strand adjacent to the PAM (e.g., upstream or 5' of the PAM in the case of Cas9). The guide RNA target sequence is equivalent to the DNA targeting segment of the guide RNA, but contains thymine instead of uracil. For example, the guide RNA target sequence for the SpCas9 enzyme may refer to the sequence upstream of the 5'-NGG-3' PAM on the non-complementary strand. A guide RNA is designed to be complementary to the complementary strand of the target DNA, and hybridization between the DNA-targeting segment of the guide RNA and the complementary strand of the target DNA promotes the formation of the CRISPR complex. Perfect complementarity is not necessarily required, as long as there is sufficient complementarity to induce hybridization and promote the formation of the CRISPR complex. When a guide RNA is referred to herein as targeting a guide RNA target sequence, it means that the guide RNA hybridizes to the sequence on the complementary strand of the target DNA that is the reverse complement of the guide RNA target sequence on the non-complementary strand.
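By way of illustration only, the relationships among the guide RNA target sequence (on the non-complementary strand), the DNA targeting segment of the guide RNA (same sequence with uracil in place of thymine), and the sequence on the complementary strand to which the guide RNA hybridizes (the reverse complement) can be sketched as follows. The function names and the 20-nucleotide example sequence are hypothetical and are not taken from the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def dna_targeting_segment(target_sequence: str) -> str:
    """RNA equivalent of the guide RNA target sequence (T replaced by U)."""
    return target_sequence.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Sequence on the complementary strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


# Hypothetical 20-nt guide RNA target sequence (non-complementary strand, 5'->3').
example_target = "GACGTTACCGGATTACAGTC"
print("DNA targeting segment:", dna_targeting_segment(example_target))
print("Hybridizes to (complementary strand):", reverse_complement(example_target))
```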

[0247] The target DNA or guide RNA target sequence may contain any polynucleotide and may be located, for example, in the nucleus or cytoplasm of a cell, or in a cellular organelle such as a mitochondrion or chloroplast. The target DNA or guide RNA target sequence may be any nucleic acid sequence that is endogenous or exogenous to the cell. The guide RNA target sequence may be a sequence that codes for a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory sequence), or may include both.

[0248] It is sometimes preferable that the target sequence be adjacent to the transcription start site of the gene. For example, the target sequence may be within 1000, 900, 800, 700, 600, 500, 400, 300, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, or 1 base pair of the transcription start site; within 1000, 900, 800, 700, 600, 500, 400, 300, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, or 1 base pair upstream of the transcription start site; or within 1000, 900, 800, 700, 600, 500, 400, 300, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, or 1 base pair downstream of the transcription start site. Optionally, the target sequence is located within the region from 200 base pairs upstream to 1 base pair downstream of the transcription start site (-200 to +1).
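By way of illustration only, a candidate position can be tested against the -200 to +1 window around a transcription start site as sketched below. The coordinate convention is an assumption made for this example (the transcription start site itself is +1, the base immediately upstream is -1, and there is no position 0), and the function names and genomic coordinates are hypothetical.

```python
def relative_to_tss(pos: int, tss: int, strand: str = "+") -> int:
    """Position relative to the transcription start site (TSS).

    Assumed convention: TSS is +1, the base just upstream is -1, no position 0.
    For a gene on the minus strand, upstream means larger genomic coordinates.
    """
    offset = (pos - tss) if strand == "+" else (tss - pos)
    return offset + 1 if offset >= 0 else offset


def in_tss_window(pos: int, tss: int, strand: str = "+",
                  upstream: int = 200, downstream: int = 1) -> bool:
    """True if pos lies within -upstream to +downstream of the TSS."""
    rel = relative_to_tss(pos, tss, strand)
    return -upstream <= rel <= downstream


# Hypothetical coordinates for illustration.
for pos in (799, 800, 999, 1000, 1001):
    print(pos, relative_to_tss(pos, 1000), in_tss_window(pos, 1000))
```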

[0249] The target sequence may be within any gene that is desired to be targeted for transcriptional activation. In some cases, the target gene may be a non-expressed gene or a weakly expressed gene (e.g., only minimally expressed above background, such as 1.1-fold, 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 1.6-fold, 1.7-fold, 1.8-fold, 1.9-fold, or 2-fold above background). The target gene may also be expressed at a low level compared to a control gene. The target gene may also be epigenetically silenced. The term "epigenetically silenced" refers to a gene that is not transcribed, or is transcribed at a reduced level compared to the transcription level of the gene in a control sample (e.g., a corresponding control cell, such as a normal cell), due to mechanisms other than genetic changes such as mutation. Epigenetic mechanisms of gene silencing are well known and include, for example, hypermethylation of CpG dinucleotides in CpG islands of the gene's 5' regulatory region, and structural changes in chromatin due, for example, to histone acetylation, such that gene transcription is reduced or inhibited.

[0250] Target genes may include genes expressed in specific organs or tissues. Target genes may also include disease-related genes. A disease-related gene is any gene that produces a transcription or translation product at an abnormal level or in an abnormal form in cells derived from disease-affected tissue compared to non-disease control tissues or cells. It may be a gene that becomes expressed at an abnormally high level, where the altered expression correlates with the onset and / or progression of the disease. A disease-related gene also refers to a gene that possesses a mutation or genetic variation that is involved in the pathogenesis of the disease. The transcription or translation product may be known or unknown, and may be at normal or abnormal levels.

[0251] One specific example of such a target gene is the Myoc gene (e.g., the humanized MYOC locus described elsewhere in this specification). Examples of guide RNA target sequences (excluding PAM) in the mouse Myoc gene are described in SEQ ID NOs. 90-95. The guide RNA DNA targeting segments corresponding to the guide RNA target sequences described in SEQ ID NOs. 90-95 are described in SEQ ID NOs. 96-101, respectively. A specific example of a guide RNA target sequence is SEQ ID NO: 93 or 94 (with the corresponding DNA targeting segment described in SEQ ID NO: 99 or 100, respectively). Another specific example of a guide RNA target sequence is SEQ ID NO: 93 (with the corresponding DNA targeting segment described in SEQ ID NO: 99).

[0252] Site-specific binding and cleavage of target DNA by a Cas protein can occur at positions determined by both (i) base pair complementarity between the guide RNA and the complementary strand of the target DNA, and (ii) a short motif called a protospacer adjacent motif (PAM) that is on the non-complementary strand of the target DNA. The PAM can be adjacent to the guide RNA target sequence. Optionally, the guide RNA target sequence can have the PAM adjacent at the 3' end (e.g., in the case of Cas9). Alternatively, the guide RNA target sequence can have the PAM adjacent at the 5' end (e.g., in the case of Cpf1). For example, the cleavage site of a Cas protein can be about 1 to about 10 or about 2 to about 5 base pairs (e.g., 3 base pairs) upstream or downstream of the PAM sequence (e.g., within the guide RNA target sequence). In the case of SpCas9, the PAM sequence (i.e., on the non-complementary strand) can be 5'-N1GG-3', where N1 is any DNA nucleotide, and the PAM is immediately adjacent to the 3' end of the guide RNA target sequence on the non-complementary strand of the target DNA. Thus, the sequence corresponding to the PAM on the complementary strand (i.e., the reverse complement) is 5'-CCN2-3', where N2 is any DNA nucleotide, and is immediately adjacent to the 5' end of the sequence where the DNA targeting segment of the guide RNA hybridizes to the complementary strand of the target DNA. In some such cases, N1 and N2 can be complementary, and the N1-N2 base pair can be any base pair (e.g., N1 = C and N2 = G, N1 = G and N2 = C, N1 = A and N2 = T, or N1 = T and N2 = A). In the case of Cas9 from Staphylococcus aureus, the PAM can be NNGRRT or NNGRR, where N can be A, G, C, or T, and R can be G or A. In the case of Cas9 from Campylobacter jejuni, the PAM can be, for example, NNNNACAC or NNNNRYAC, where N can be A, G, C, or T, R can be G or A, and Y can be C or T. In some cases (e.g., in the case of FnCpf1), the PAM sequence can be upstream of the 5' end and can have the sequence 5'-TTN-3'.
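By way of illustration only, the PAM patterns listed above can be checked against a candidate sequence by expanding the degenerate bases into a regular expression. This is a minimal sketch using standard IUPAC nucleotide codes (N = any base, R = A or G, Y = C or T); the function names and test sequences are hypothetical.

```python
import re

# Standard IUPAC codes needed for the PAMs named in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

# PAM patterns as listed in the text (written 5'->3' on the non-complementary strand).
PAMS = {
    "SpCas9": ["NGG"],
    "SaCas9": ["NNGRRT", "NNGRR"],
    "CjCas9": ["NNNNACAC", "NNNNRYAC"],
    "FnCpf1": ["TTN"],
}


def pam_regex(pam: str) -> "re.Pattern[str]":
    """Compile a degenerate PAM into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in pam.upper()))


def matches_pam(candidate: str, pam: str) -> bool:
    """True if the candidate DNA sequence matches the degenerate PAM exactly."""
    return pam_regex(pam).fullmatch(candidate.upper()) is not None


print(matches_pam("TGG", "NGG"))        # SpCas9-style PAM
print(matches_pam("ACGAGT", "NNGRRT"))  # SaCas9-style PAM
```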

[0253] An example of a guide RNA target sequence is the 20-nucleotide DNA sequence immediately preceding the NGG motif recognized by the SpCas9 protein. For example, two examples of guide RNA target sequences plus PAMs are GN19NGG (SEQ ID NO: 84) and N20NGG (SEQ ID NO: 85). See, for example, International Publication No. 2014/165825, which is incorporated herein by reference in its entirety for all purposes. The guanine at the 5' end can facilitate transcription by RNA polymerase in cells. Another example of a guide RNA target sequence plus PAM includes two guanine nucleotides at the 5' end (e.g., GGN20NGG; SEQ ID NO: 86) to facilitate efficient transcription by T7 polymerase in vitro. See, for example, International Publication No. 2014/065596, which is incorporated herein by reference in its entirety for all purposes. Other guide RNA target sequences plus PAMs may have between 4 and 22 nucleotides of SEQ ID NOs. 84-86 in length, including the 5' G or GG and the 3' GG or NGG. Still other guide RNA target sequences plus PAMs may have between 14 and 20 nucleotides of SEQ ID NOs. 84-86 in length.
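By way of illustration only, candidate SpCas9 guide RNA target sequences of the N20NGG form can be located on both strands of a DNA sequence as sketched below. The function name, the returned fields, and the 23-nucleotide input are hypothetical; reported start positions are 0-based indices into the strand that was scanned.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_spcas9_sites(dna: str, target_len: int = 20):
    """Find N{target_len}NGG sites on both strands.

    A lookahead is used so that overlapping sites are all reported.
    """
    dna = dna.upper()
    minus = dna.translate(_COMPLEMENT)[::-1]
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % target_len)
    sites = []
    for strand, seq in (("+", dna), ("-", minus)):
        for m in pattern.finditer(seq):
            target = m.group(1)
            sites.append({
                "strand": strand,
                "start": m.start(),
                "target": target,
                "pam": m.group(2),
                # A 5' G corresponds to the GN19NGG form described in the text.
                "starts_with_G": target.startswith("G"),
            })
    return sites


# Hypothetical input: a 20-nt target followed by an AGG PAM.
for site in find_spcas9_sites("GACGTTACCGGATTACAGTCAGG"):
    print(site)
```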

[0254] The formation of a CRISPR complex hybridized to a target DNA can result in cleavage of one or both strands of the target DNA within or near the region corresponding to the guide RNA target sequence (i.e., the guide RNA target sequence on the non-complementary strand of the target DNA and the reverse complement on the complementary strand to which the guide RNA hybridizes). For example, the cleavage site can be within the guide RNA target sequence (e.g., at a position defined relative to the PAM sequence). A "cleavage site" includes the position on the target DNA where the Cas protein generates a single-strand or double-strand break. The cleavage site can be on only one strand (e.g., when a nickase is used) or on both strands of double-stranded DNA. The cleavage site can be at the same position on both strands (generating blunt ends, e.g., Cas9), or at different sites on each strand (generating sticky ends (i.e., overhangs), e.g., Cpf1). Sticky ends can be generated, for example, by using two Cas proteins that each generate a single-strand break at a different cleavage site on a different strand, thereby generating a double-strand break. For example, a first nickase can generate a single-strand break on the first strand of double-stranded DNA (dsDNA), and a second nickase can generate a single-strand break on the second strand of dsDNA such that an overhang sequence is generated. In some cases, the guide RNA target sequence or cleavage site of the nickase on the first strand is separated from the guide RNA target sequence or cleavage site of the nickase on the second strand by at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 75, at least 100, at least 250, at least 500, or at least 1,000 base pairs.

[0255] D. Recombinases and Recombinase Deleter Non-Human Animals. A cell or non-human animal comprising a chimeric Cas protein expression cassette, a chimeric adapter protein expression cassette, a SAM expression cassette, a guide RNA expression cassette (e.g., one or more guide RNA expression cassettes), or a recombinase expression cassette, in which the cassette is located downstream of a polyadenylation signal or transcriptional terminator flanked by recombinase recognition sites recognized by a site-specific recombinase as disclosed herein, may further comprise a recombinase expression cassette that drives the expression of the site-specific recombinase. A nucleic acid encoding the recombinase can be incorporated into the genome, or the recombinase or nucleic acid can be introduced into such cells and non-human animals using methods disclosed elsewhere herein (e.g., LNP-mediated delivery or AAV-mediated delivery). The delivery method may be selected to provide tissue-specific delivery of the recombinase, as disclosed elsewhere herein.

[0256] Site-specific recombinases include enzymes that can promote recombination between recombinase recognition sites, where the two recombination sites are physically separated within a single nucleic acid or on separate nucleic acids. Examples of recombinases include Cre, Flp, and Dre recombinases. An example of a Cre recombinase gene is Crei, in which the two exons encoding Cre recombinase are separated by an intron, preventing expression in prokaryotic cells. Such recombinases may further include nuclear localization signals to promote nuclear localization (e.g., NLS-Crei). Recombinase recognition sites include nucleotide sequences that are recognized by site-specific recombinases and can function as substrates for recombination events. Examples of recombinase recognition sites include FRT, FRT11, FRT71, attp, att, rox, and lox sites such as loxP, lox511, lox2272, lox66, lox71, loxM2, and lox5171.

[0257] A recombinase expression cassette can be incorporated into a different target genomic locus than the other expression cassettes disclosed herein, or into the genome at the same target locus (e.g., the Rosa26 locus, such as the first intron of the Rosa26 locus). For example, a cell or non-human animal may be heterozygous for each of the SAM expression cassette (or chimeric Cas protein expression cassette or chimeric adapter protein expression cassette) and the recombinase expression cassette, having one allele of the target genomic locus containing the SAM expression cassette and a second allele of the target genomic locus containing the recombinase expression cassette. Similarly, a cell or non-human animal may be heterozygous for each of the guide RNA expression cassette (e.g., guide RNA array expression cassette) and the recombinase expression cassette, having one allele of the target genomic locus containing the guide RNA expression cassette and a second allele of the target genomic locus containing the recombinase expression cassette.

[0258] The recombinase gene in the recombinase expression cassette can be operably linked to any suitable promoter. Examples of promoters are described elsewhere herein. For example, the promoter may be a tissue-specific promoter or a developmental stage-specific promoter. Such promoters are advantageous because they can selectively activate the transcription of a target gene only in a desired tissue or at a desired developmental stage. For example, in the case of the Cas protein, this can reduce the possibility of Cas-mediated toxicity in vivo. Exemplary promoters for mouse recombinases are known and are described, for example, in U.S. Patent Application Publication No. 2019/0284572 and International Publication No. 2019/183123, each of which is incorporated herein by reference in its entirety for all purposes.

[0259] E. Nucleic acids encoding chimeric Cas proteins, chimeric adapter proteins, guide RNAs, synergistic activation mediators, or recombinases. Also provided are nucleic acids encoding chimeric Cas proteins, chimeric adapter proteins, guide RNAs, recombinases, or any combination thereof. Chimeric Cas proteins, chimeric adapter proteins, and guide RNAs are described in more detail elsewhere in this specification. For example, nucleic acids may be chimeric Cas protein expression cassettes, chimeric adapter protein expression cassettes, synergistic activation mediator (SAM) expression cassettes containing nucleic acids encoding both chimeric Cas proteins and chimeric adapter proteins, guide RNA or guide RNA array expression cassettes, recombinase expression cassettes, or any combination thereof. Such nucleic acids may be RNA (e.g., messenger RNA (mRNA)) or DNA, and may be single-stranded or double-stranded, and may be linear or circular. DNA may be part of a vector, such as an expression vector or a targeting vector. Vectors may also be viral vectors, such as adenoviruses, adeno-associated viruses, lentiviruses, and retroviral vectors. When any of the nucleic acids disclosed herein is introduced into a cell, the encoded chimeric Cas protein, chimeric adapter protein, or guide RNA may be transiently, conditionally, or constitutively expressed within the cell.

[0260] Nucleic acids can be codon-optimized selectively for efficient translation into proteins in specific cells or organisms. For example, nucleic acids can be modified to alternative codons that have a higher frequency of use in bacterial cells, yeast cells, human cells, non-human cells, mammalian cells, rodent cells, mouse cells, rat cells, or any other host cell of interest, compared to naturally occurring polynucleotide sequences.

[0261] Nucleic acids or expression cassettes can be stably integrated into the genome of a cell or non-human animal (i.e., into a chromosome), or located outside the chromosome (e.g., as extrachromosomally replicating DNA). Stably integrated expression cassettes or nucleic acids can be randomly integrated into the genome of a non-human animal (i.e., transgenic), or integrated into a predetermined region of the genome of a non-human animal (i.e., knock-in). In one example, the nucleic acid or expression cassette is stably integrated into a safe harbor locus, as described elsewhere in this specification. The target genomic locus into which the nucleic acid or expression cassette is stably integrated may be heterozygous or homozygous for the nucleic acid or expression cassette. For example, the target genomic locus, non-human animal cell, or non-human animal may be heterozygous for the SAM expression cassette and heterozygous for the guide RNA expression cassette, and the two cassettes may optionally be located on different alleles at the same target genomic locus.

[0262] The nucleic acids or expression cassettes described herein may be operably linked to any suitable promoter for in vivo expression in non-human animals, or for in vitro or ex vivo expression within cells. Non-human animals may be any suitable animals described elsewhere herein. For example, a nucleic acid or expression cassette (e.g., a chimeric Cas protein expression cassette, a chimeric adapter protein expression cassette, or a SAM cassette containing nucleic acids encoding both a chimeric Cas protein and a chimeric adapter protein) may be operably linked to an endogenous promoter at a target genomic locus, such as the Rosa26 promoter. Alternatively, a nucleic acid or expression cassette may be operably linked to an exogenous promoter, such as a constitutively active promoter (e.g., a CAG promoter or U6 promoter), a conditional promoter, an inducible promoter, a temporally restricted promoter (e.g., a developmentally regulated promoter), or a spatially restricted promoter (e.g., a cell-specific or tissue-specific promoter). Such promoters are well known and are discussed elsewhere herein. Promoters that can be used in expression constructs include, for example, promoters that are active in one or more of the following: eukaryotic cells, human cells, non-human cells, mammalian cells, non-human mammalian cells, rodent cells, mouse cells, rat cells, hamster cells, rabbit cells, pluripotent cells, embryonic stem (ES) cells, or zygotes. Such promoters may be, for example, conditional promoters, inducible promoters, constitutive promoters, or tissue-specific promoters.

[0263] For example, a nucleic acid encoding a guide RNA can be operably linked to a U6 promoter, such as a human U6 promoter or a mouse U6 promoter. Specific examples of suitable promoters (e.g., for expressing guide RNA) include RNA polymerase III promoters such as the human U6 promoter, the rat U6 polymerase III promoter, or the mouse U6 polymerase III promoter.

[0264] Optionally, the promoter may be a bidirectional promoter that drives the expression of a first gene (e.g., a gene encoding a chimeric Cas protein) in one direction and a second gene (e.g., a gene encoding a guide RNA or a chimeric adapter protein) in the other direction. Such a bidirectional promoter may consist of (1) a complete conventional unidirectional Pol III promoter comprising three external regulatory elements: a distal sequence element (DSE), a proximal sequence element (PSE), and a TATA box, and (2) a second basic Pol III promoter comprising a PSE and a TATA box fused in the reverse orientation to the 5' end of the DSE. For example, in the H1 promoter, the DSE is adjacent to the PSE and TATA box, and the promoter can be made bidirectional by creating a hybrid promoter in which transcription in the reverse direction is controlled by appending a PSE and TATA box derived from the U6 promoter. See, for example, U.S. Patent Application Publication No. 2016/0074535, which is incorporated herein by reference in its entirety for all purposes. By using a bidirectional promoter to simultaneously express two genes, a compact expression cassette can be created to facilitate delivery.

[0265] One or more of the nucleic acids may be together in a multicistronic expression construct. For example, the nucleic acid encoding a chimeric Cas protein and the nucleic acid encoding a chimeric adapter protein may be together in a bicistronic expression construct. See, for example, Figures 4A and 4B. A multicistronic expression vector simultaneously expresses two or more distinct proteins from the same mRNA (i.e., a transcript produced from the same promoter). Preferred strategies for multicistronic expression of proteins include, for example, the use of a 2A peptide and the use of an internal ribosome entry site (IRES). For example, such a construct may include (1) nucleic acids encoding one or more chimeric Cas proteins and one or more chimeric adapter proteins, (2) nucleic acids encoding two or more chimeric adapter proteins, (3) nucleic acids encoding two or more chimeric Cas proteins, (4) nucleic acids encoding two or more guide RNAs or two or more guide RNA arrays, (5) nucleic acids encoding one or more chimeric Cas proteins and one or more guide RNAs or guide RNA arrays, (6) nucleic acids encoding one or more chimeric adapter proteins and one or more guide RNAs or guide RNA arrays, or (7) nucleic acids encoding one or more chimeric Cas proteins, one or more chimeric adapter proteins, and one or more guide RNAs or guide RNA arrays. As an example, such a multicistronic vector may use one or more internal ribosome entry sites (IRESs) to enable translation initiation from an internal region of an mRNA. As another example, such a multicistronic vector may use one or more 2A peptides. These peptides are small "self-cleaving" peptides, generally 18-22 amino acids in length, that allow multiple gene products to be produced at equimolar levels from the same mRNA. Ribosomes skip the synthesis of the glycyl-prolyl peptide bond at the C-terminus of a 2A peptide, causing a "cleavage" between the 2A peptide and the peptide immediately downstream. See, for example, Kim et al. (2011) PLoS One 6(4):e18556, which is incorporated herein by reference in its entirety for all purposes. The "cleavage" occurs between the glycine and proline residues at the C-terminus, so that the upstream cistron has a few additional residues added to its end and the downstream cistron begins with the proline. As a result, the "cleaved" downstream peptide has proline at its N-terminus. 2A-mediated cleavage is a universal phenomenon in all eukaryotic cells. 2A peptides have been identified in picornaviruses, insect viruses, and type C rotaviruses. See, for example, Szymczak et al. (2005) Expert Opin. Biol. Ther. 5(5):627-638, which is incorporated herein by reference in its entirety for all purposes. Examples of usable 2A peptides include Thosea asigna virus 2A (T2A), porcine teschovirus-1 2A (P2A), equine rhinitis A virus (ERAV) 2A (E2A), and FMDV 2A (F2A). Exemplary T2A, P2A, E2A, and F2A sequences include: T2A (EGRGSLLTCGDVEENPGP, SEQ ID NO: 53), P2A (ATNFSLLKQAGDVEENPGP, SEQ ID NO: 54), E2A (QCTNYALLKLAGDVESNPGP, SEQ ID NO: 55), and F2A (VKQTLNFDLLKLAGDVESNPGP, SEQ ID NO: 56). GSG residues can be added to the N-terminus of any of these peptides to improve cleavage efficiency.
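By way of illustration only, the outcome of 2A-mediated "cleavage" between the C-terminal glycine and proline can be sketched as a string split: the upstream product retains the 2A residues up to and including the glycine, and the downstream product begins with the proline. The 2A sequences are those given in the text (SEQ ID NOs: 53-56); the function name and the short flanking protein sequences are hypothetical placeholders.

```python
# 2A peptide sequences as listed in the text.
TWO_A_PEPTIDES = {
    "T2A": "EGRGSLLTCGDVEENPGP",
    "P2A": "ATNFSLLKQAGDVEENPGP",
    "E2A": "QCTNYALLKLAGDVESNPGP",
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",
}


def split_at_2a(polyprotein: str, peptide: str):
    """Split a polyprotein at the Gly-Pro junction ending the 2A peptide.

    Returns (upstream_product, downstream_product). The upstream product
    keeps all 2A residues except the final proline; the downstream product
    starts with that proline.
    """
    idx = polyprotein.find(peptide)
    if idx < 0:
        raise ValueError("2A peptide not found in polyprotein")
    cut = idx + len(peptide) - 1  # index of the final proline
    return polyprotein[:cut], polyprotein[cut:]


# Hypothetical placeholder proteins flanking a T2A peptide.
upstream, downstream = split_at_2a(
    "MAAA" + TWO_A_PEPTIDES["T2A"] + "MKKK", TWO_A_PEPTIDES["T2A"]
)
print(upstream)
print(downstream)
```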

[0266] Any of the nucleic acids or expression cassettes may contain a polyadenylation signal or transcriptional terminator upstream of the coding sequence. The term polyadenylation signal sequence refers to any sequence that directs the termination of transcription and the addition of a poly(A) tail to the mRNA transcript. In eukaryotes, transcriptional terminators are recognized by protein factors, and termination is followed by polyadenylation, the process of adding a poly(A) tail to the mRNA transcript in the presence of poly(A) polymerase. Mammalian poly(A) signals typically consist of a core sequence about 45 nucleotides long, flanked by a variety of auxiliary sequences that help enhance the efficiency of cleavage and polyadenylation. The core sequence consists of a highly conserved upstream element (AATAAA or AAUAAA) within the mRNA, referred to as the poly-A recognition motif or poly-A recognition sequence, that is recognized by cleavage and polyadenylation specificity factor (CPSF), and a poorly defined downstream region (U-rich or GU-rich) that is bound by cleavage stimulation factor (CstF). Examples of transcriptional terminators that may be used include, for example, the human growth hormone (HGH) polyadenylation signal, the simian virus 40 (SV40) late polyadenylation signal, the rabbit beta-globin polyadenylation signal, the bovine growth hormone (BGH) polyadenylation signal, the phosphoglycerate kinase (PGK) polyadenylation signal, the AOX1 transcription termination sequence, the CYC1 transcription termination sequence, or any transcription termination sequence known to be suitable for regulating gene expression in eukaryotic cells.
For example, chimeric Cas protein expression cassettes, chimeric adapter protein expression cassettes, SAM expression cassettes, guide RNA expression cassettes, or recombinase expression cassettes may include a polyadenylation signal or transcriptional terminator upstream of the coding sequence within the expression cassette.A polyadenylation signal or transcriptional terminator may be adjacent to a recombinase recognition site recognized by a site-specific recombinase. Optionally, the recombinase recognition site may also be adjacent to a selection cassette containing, for example, the coding sequence of a drug resistance protein. Optionally, the recombinase recognition site may not be adjacent to a selection cassette. The polyadenylation signal or transcriptional terminator interferes with the transcription and expression of the protein or RNA encoded by the coding sequence (e.g., a chimeric Cas protein, a chimeric adapter protein, guide RNA, or recombinase). However, if the polyadenylation signal or transcriptional terminator is excised upon exposure to a site-specific recombinase, the protein or RNA can be expressed.

[0267] Such a configuration of an expression cassette (e.g., a chimeric Cas protein expression cassette or a SAM expression cassette) can enable tissue-specific or developmental-stage-specific expression in a non-human animal containing the expression cassette, if the polyadenylation signal or transcriptional terminator is excised in a tissue-specific or developmental-stage-specific manner. For example, in the case of a chimeric Cas protein, this can reduce the toxicity resulting from long-term expression of the chimeric Cas protein in cells or non-human animals, or from the expression of the chimeric Cas protein at an undesirable developmental stage or in an undesirable cell or tissue type within a non-human animal. See, for example, Parikh et al. (2015) PLoS One 10(1):e0116484, which is incorporated herein by reference in its entirety for all purposes. Excision of a polyadenylation signal or transcriptional terminator in a tissue-specific or developmental-stage-specific manner can be achieved if the non-human animal containing the expression cassette further includes a coding sequence for a site-specific recombinase operably linked to a tissue-specific or developmental-stage-specific promoter. The polyadenylation signal or transcriptional terminator is then excised only in those tissues or at those developmental stages, enabling tissue-specific or developmental-stage-specific expression. For example, a chimeric Cas protein, a chimeric adapter protein, a chimeric Cas protein and a chimeric adapter protein, or a guide RNA may be expressed in a liver-specific manner. Examples of promoters used to develop such "recombinase deleter" strains of non-human animals are disclosed elsewhere herein.

[0268] Any transcriptional terminator or polyadenylation signal may be used. As used herein, "transcriptional terminator" refers to a DNA sequence that causes the termination of transcription. In eukaryotes, transcriptional terminators are recognized by protein factors, and termination is followed by polyadenylation, a process that adds a poly(A) tail to the mRNA transcript in the presence of poly(A) polymerase. Mammalian poly(A) signals typically consist of a core sequence about 45 nucleotides long, flanked by a variety of auxiliary sequences that help enhance the efficiency of cleavage and polyadenylation. The core sequence, referred to as the poly(A) recognition motif or poly(A) recognition sequence, consists of a highly conserved upstream element (AATAAA in DNA, or AAUAAA in mRNA) that is recognized by the cleavage and polyadenylation specificity factor (CPSF), and a less well-defined downstream region (rich in U, or in G and U) that is bound by the cleavage stimulation factor (CstF). Examples of transcriptional terminators that may be used include the human growth hormone (HGH) polyadenylation signal, the simian virus 40 (SV40) late polyadenylation signal, the rabbit beta-globin polyadenylation signal, the bovine growth hormone (BGH) polyadenylation signal, the phosphoglycerate kinase (PGK) polyadenylation signal, the AOX1 transcription termination sequence, the CYC1 transcription termination sequence, or any transcription termination sequence known to be suitable for regulating gene expression in eukaryotic cells.

[0269] Site-specific recombinases include enzymes that can promote recombination between recombinase recognition sites, where the two recombination sites are physically separated within a single nucleic acid or located on separate nucleic acids. Examples of recombinases include Cre, Flp, and Dre recombinases. An example of a Cre recombinase gene is Crei, in which the two exons encoding Cre recombinase are separated by an intron, preventing expression in prokaryotic cells. Such recombinases may further include nuclear localization signals to promote nuclear localization (e.g., NLS-Crei). Recombinase recognition sites include nucleotide sequences that are recognized by site-specific recombinases and can function as substrates for recombination events. Examples of recombinase recognition sites include FRT, FRT11, FRT71, attp, att, rox, and lox sites such as loxP, lox511, lox2272, lox66, lox71, loxM2, and lox5171.

[0270] Expression cassettes disclosed herein may also include other components. Such expression cassettes (e.g., chimeric Cas protein expression cassettes, chimeric adapter protein expression cassettes, SAM expression cassettes, guide RNA expression cassettes, or recombinase expression cassettes) may further include a 3' splicing sequence at the 5' end of the expression cassette and/or a second polyadenylation signal following the coding sequence (e.g., the sequence encoding a chimeric Cas protein, chimeric adapter protein, guide RNA, or recombinase). The term 3' splicing sequence refers to a nucleic acid sequence at the 3' intron/exon boundary that can be recognized and bound by the splicing machinery. Expression cassettes may further include, for example, selection cassettes containing coding sequences for drug resistance proteins. Examples of suitable selection markers include neomycin phosphotransferase (neo^r), hygromycin B phosphotransferase (hyg^r), puromycin-N-acetyltransferase (puro^r), blasticidin S deaminase (bsr^r), xanthine/guanine phosphoribosyltransferase (gpt), and herpes simplex virus thymidine kinase (HSV-k). Optionally, the selection cassette may be flanked by recombinase recognition sites for a site-specific recombinase. If the expression cassette also contains recombinase recognition sites flanking a polyadenylation signal upstream of the coding sequence as described above, the selection cassette may be flanked by the same recombinase recognition sites or by a different set of recombinase recognition sites recognized by a different recombinase.

[0271] The expression cassette may also contain nucleic acids encoding one or more reporter proteins, such as fluorescent proteins (e.g., green fluorescent protein). Any suitable reporter protein may be used. For example, fluorescent reporter proteins as defined elsewhere in this specification may be used, or non-fluorescent reporter proteins may be used. Examples of fluorescent reporter proteins are provided elsewhere in this specification. Non-fluorescent reporter proteins include, for example, reporter proteins that can be used in histochemical or bioluminescent assays, such as beta-galactosidase, luciferase (e.g., Renilla luciferase, firefly luciferase, and NanoLuc luciferase), and beta-glucuronidase. The expression cassette may contain reporter proteins detectable by flow cytometry assays (e.g., fluorescent reporter proteins such as green fluorescent protein) and/or reporter proteins detectable by histochemical assays (e.g., beta-galactosidase protein). An example of such a histochemical assay is the in situ visualization of beta-galactosidase expression via hydrolysis of X-Gal (5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside), which produces a blue precipitate, or by using a fluorogenic substrate such as beta-methyl umbelliferyl galactoside (MUG) or fluorescein digalactoside (FDG).

[0272] The expression cassettes described herein may take any form. For example, an expression cassette may be contained within a vector such as a viral vector or a plasmid. An expression cassette may be operably linked to a promoter in an expression construct that can direct the expression of a protein or RNA (e.g., upon removal of an upstream polyadenylation signal). Alternatively, an expression cassette may be contained within a targeting vector. For example, a targeting vector may include homology arms flanking the expression cassette, which are used to direct recombination with a desired target genomic locus to facilitate genomic integration and/or replacement of an endogenous sequence.

[0273] The expression cassettes described herein may be in vitro, ex vivo in cells (e.g., embryonic stem cells) (e.g., integrated into the genome or extrachromosomal), or in vivo in organisms (e.g., non-human animals) (e.g., integrated into the genome or extrachromosomal). If ex vivo, the expression cassette may be in any type of cell of any organism, such as totipotent cells, including embryonic stem cells (e.g., mouse or rat embryonic stem cells) or induced pluripotent stem cells (e.g., human induced pluripotent stem cells). If in vivo, the expression cassette may be in any type of organism (e.g., non-human animals as further described elsewhere herein).

[0274] Specific examples of nucleic acids encoding catalytically inactive Cas proteins can include, consist essentially of, or consist of nucleic acids encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the dCas9 protein sequence set forth in SEQ ID NO: 44. Optionally, the nucleic acid can include, consist essentially of, or consist of a nucleic acid encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 57 (optionally, the sequence encodes a protein that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the dCas9 protein sequence set forth in SEQ ID NO: 44).

[0275] Specific examples of nucleic acids encoding chimeric Cas proteins can include, consist essentially of, or consist of nucleic acids encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the chimeric Cas protein sequence set forth in SEQ ID NO: 43. Optionally, the nucleic acid can include, consist essentially of, or consist of a nucleic acid encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 58 (optionally, the sequence encodes a protein that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the chimeric Cas protein sequence set forth in SEQ ID NO: 43).

[0276] Specific examples of nucleic acids encoding adapters can include, consist essentially of, or consist of nucleic acids encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the MCP sequence set forth in SEQ ID NO: 49. Optionally, the nucleic acid can include, consist essentially of, or consist of a nucleic acid encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 59 (optionally, the sequence encodes a protein that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the MCP sequence set forth in SEQ ID NO: 49).

[0277] Specific examples of nucleic acids encoding chimeric adapter proteins can include, consist essentially of, or consist of nucleic acids encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the chimeric adapter protein sequence set forth in SEQ ID NO: 48. Optionally, the nucleic acid can include, consist essentially of, or consist of a nucleic acid encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 60 (optionally, the sequence encodes a protein that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the chimeric adapter protein sequence set forth in SEQ ID NO: 48).

[0278] Specific examples of nucleic acids encoding transcriptional activation domains can include, consist essentially of, or consist of nucleic acids encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the VP64, p65, or HSF1 sequence set forth in SEQ ID NO: 45, 50, or 51, respectively. Optionally, the nucleic acid can include, consist essentially of, or consist of a nucleic acid encoding an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 61, 62, or 63, respectively (optionally, the sequence encodes a protein that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the VP64, p65, or HSF1 sequence set forth in SEQ ID NO: 45, 50, or 51, respectively).

[0279] One exemplary synergistic activation mediator (SAM) expression cassette includes, from 5' to 3', (a) a 3' splicing sequence, (b) a first recombinase recognition site (e.g., a loxP site), (c) a coding sequence for a drug resistance gene (e.g., a neomycin phosphotransferase (neo^r) coding sequence), (d) a polyadenylation signal, (e) a second recombinase recognition site (e.g., a loxP site), (f) a chimeric Cas protein coding sequence (e.g., a dCas9-NLS-VP64 fusion protein or an NLS-dCas9-NLS-VP64 fusion protein), (g) a 2A protein coding sequence (e.g., a P2A or T2A coding sequence), and (h) a chimeric adapter protein coding sequence (e.g., MCP-NLS-p65-HSF1). Another exemplary synergistic activation mediator (SAM) expression cassette includes, from 5' to 3', (a) a 3' splicing sequence, (b) a first recombinase recognition site (e.g., a loxP site), (c) a coding sequence for a drug resistance gene (e.g., a neomycin phosphotransferase (neo^r) coding sequence), (d) a polyadenylation signal (e.g., a PGK polyadenylation signal and/or an SV40 polyadenylation signal, such as a combination of a PGK polyadenylation signal and three SV40 polyadenylation signals), (e) a second recombinase recognition site (e.g., a loxP site), (f) a chimeric Cas protein coding sequence (e.g., a dCas9-NLS-VP64 fusion protein or an NLS-dCas9-NLS-VP64 fusion protein), (g) a 2A protein coding sequence (e.g., a P2A or T2A coding sequence), (h) a chimeric adapter protein coding sequence (e.g., MCP-NLS-p65-HSF1), (i) a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE), and (j) another polyadenylation signal (e.g., a BGH polyadenylation signal). For example, see Figure 4A and SEQ ID NO: 64 (the coding sequence set forth in SEQ ID NO: 69 and the encoded protein set forth in SEQ ID NO: 67).
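The 5'-to-3' layout of the first exemplary SAM expression cassette, and the effect of recombinase-mediated removal of the upstream polyadenylation signal, can be sketched as a toy model. This is a simplified illustration only: the `Element` class, the `kind` labels, and the `excise` and `expressed` functions are hypothetical, and the model assumes two same-orientation recognition sites with one site remaining after excision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    kind: str  # hypothetical labels: "splice", "site", "marker", "polyA", "cds"


# Element order taken from the first exemplary SAM cassette in the text.
SAM_CASSETTE = [
    Element("3' splicing sequence", "splice"),
    Element("loxP", "site"),
    Element("neo-r coding sequence", "marker"),
    Element("polyadenylation signal", "polyA"),
    Element("loxP", "site"),
    Element("dCas9-NLS-VP64", "cds"),
    Element("2A", "cds"),
    Element("MCP-NLS-p65-HSF1", "cds"),
]


def excise(cassette, site_name):
    """Delete everything between the first two copies of a recognition site.

    One copy of the site is left behind, as a simplification of
    site-specific recombination between same-orientation sites.
    """
    idx = [i for i, e in enumerate(cassette)
           if e.kind == "site" and e.name == site_name]
    if len(idx) < 2:
        return list(cassette)  # nothing to excise
    first, second = idx[0], idx[1]
    return cassette[:first + 1] + cassette[second + 1:]


def expressed(cassette):
    """List coding sequences reached before the first polyadenylation signal."""
    reached = []
    for element in cassette:
        if element.kind == "polyA":
            break
        if element.kind == "cds":
            reached.append(element.name)
    return reached


if __name__ == "__main__":
    print(expressed(SAM_CASSETTE))                  # before recombination
    print(expressed(excise(SAM_CASSETTE, "loxP")))  # after recombination
```

In this model no SAM component is reached while the polyadenylation signal is in place, and all three downstream coding sequences are reached once the region between the two loxP sites is removed.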

[0280] One exemplary general guide RNA array expression cassette includes, from 5' to 3', (a) a 3' splicing sequence, (b) a first recombinase recognition site (e.g., a rox site), (c) a coding sequence for a drug resistance gene (e.g., a puromycin-N-acetyltransferase (puro^r) coding sequence), (d) a polyadenylation signal (e.g., a PGK polyadenylation signal and/or an SV40 polyadenylation signal, such as a combination of a PGK polyadenylation signal and three SV40 polyadenylation signals), (e) a second recombinase recognition site (e.g., a rox site), and (f) a guide RNA array comprising one or more guide RNA genes (e.g., a first U6 promoter followed by a first guide RNA coding sequence and a first terminator sequence, a second U6 promoter followed by a second guide RNA coding sequence and a second terminator sequence, and a third U6 promoter followed by a third guide RNA coding sequence and a third terminator sequence). See, for example, SEQ ID NO: 65. The region of SEQ ID NO: 65 that includes the promoters and guide RNA coding sequences is set forth in SEQ ID NO: 70. The recombinase recognition sites in the guide RNA array expression cassette may be the same as or different from the recombinase recognition sites in the SAM expression cassette (for example, they may be recognized by the same recombinase or by different recombinases).

[0281] Another exemplary general guide RNA array expression cassette includes one or more guide RNA genes (e.g., a first guide RNA coding sequence following a first U6 promoter, a second guide RNA coding sequence following a second U6 promoter, and a third guide RNA coding sequence following a third U6 promoter). Such a general guide RNA array expression cassette is set forth in SEQ ID NO: 70.

[0282] F. Genomic loci for integration. The nucleic acids and expression cassettes described herein can be genomically integrated at a target genomic locus in a non-human animal cell or a non-human animal. Any target genomic locus capable of expressing a gene can be used.

[0283] Examples of target genomic loci into which the nucleic acids or cassettes described herein can be stably integrated are safe harbor loci in the genomes of non-human animal cells or non-human animals. Interactions between the integrated exogenous DNA and the host genome can limit the reliability and safety of the integration and may lead to overt phenotypic effects that are not due to the targeted genetic modification, but rather to unintended effects of the integration on surrounding endogenous genes. For example, randomly inserted transgenes are susceptible to position effects and silencing, which can make their expression unreliable and unpredictable. Similarly, the integration of exogenous DNA into a chromosomal locus can affect surrounding endogenous genes and chromatin, thereby altering the behavior and phenotype of the cell. Safe harbor loci include chromosomal loci in which a transgene or other exogenous nucleic acid insert can be stably and reliably expressed in all target tissues without overtly altering the behavior or phenotype of the cell (i.e., without any adverse effects on the host cell). For example, see Sadelain et al. (2012) Nat. Rev. Cancer 12:51–58, which is incorporated herein by reference in its entirety for all purposes. For example, a safe harbor locus may be a locus in which the expression of an inserted gene sequence is not disrupted by read-through expression from an adjacent gene. For example, a safe harbor locus may include a chromosomal locus into which exogenous DNA can be integrated and function in a predictable manner without adversely affecting the structure or expression of an endogenous gene. A safe harbor locus may include extragenic or intragenic regions, such as loci within genes that are non-essential, dispensable, or able to be disrupted without overt phenotypic consequences.

[0284] For example, the Rosa26 locus and its equivalents in humans provide open chromatin structure in all tissues and are ubiquitously expressed during embryonic development and in adulthood. See, for example, Zambrowicz et al. (1997) Proc. Natl. Acad. Sci. USA 94:3789-3794, which is incorporated herein by reference in its entirety for all purposes. In addition, the Rosa26 locus can be targeted with high efficiency, and disruption of the Rosa26 gene does not produce an overt phenotype. Other examples of safe harbor loci include CCR5, HPRT, AAVS1, and albumin. See, for example, U.S. Patents 7,888,121, 7,972,854, 7,914,796, 7,951,925, 8,110,379, 8,409,861, and 8,586,526, and U.S. Patent Publications 2003/0232410, 2005/0208489, 2005/0026157, 2006/0063231, 2008/0159996, 2010/00218264, 2012/0017290, 2011/0265198, 2013/0137104, 2013/0122591, 2013/0177983, 2013/0177960, and 2013/0122591, each of which is incorporated herein by reference in its entirety for all purposes. Since biallelic targeting of safe harbor loci such as the Rosa26 locus has no negative consequences, different genes or reporters can be targeted to the two Rosa26 alleles. In one example, the expression cassette is integrated into an intron of the Rosa26 locus, such as the first intron of the Rosa26 locus. See Figure 5, for example.

[0285] An expression cassette integrated into a target genomic locus can be operably linked to an endogenous promoter at the target locus, or to an exogenous promoter that is heterologous to the target locus. For example, a chimeric Cas protein expression cassette, a chimeric adapter protein expression cassette, or a synergistic activation mediator (SAM) expression cassette is integrated into a target genomic locus (e.g., the Rosa26 locus) and operably linked to an endogenous promoter at the target locus (e.g., the Rosa26 promoter). In another example, a guide RNA expression cassette is integrated into a target genomic locus (e.g., the Rosa26 locus) and operably linked to one or more heterologous promoters (e.g., U6 promoters, such as a different U6 promoter upstream of each guide RNA coding sequence).

[0286] IV. Methods using non-human animals containing a humanized MYOC locus. Various methods are provided for using non-human animals and non-human animal cells containing humanized MYOC loci (e.g., including the Y437H mutation) as described elsewhere in this specification. Such methods may, for example, increase MYOC expression, raise intraocular pressure, model glaucoma, or evaluate or optimize the delivery or efficacy of human MYOC-targeting reagents (e.g., therapeutic molecules or complexes) or candidate glaucoma treatments in vivo, ex vivo, or in vitro. Because the non-human animals and non-human animal cells contain humanized MYOC loci, they more accurately reflect the efficacy of human MYOC-targeting reagents. The non-human animals disclosed herein contain humanized endogenous MYOC loci, rather than transgenic insertions of human MYOC sequences at random genomic loci, and the humanized endogenous MYOC loci contain orthologous human genomic MYOC sequences from both coding and non-coding regions (e.g., from both exons and introns), rather than artificial cDNA sequences. Therefore, such non-human animals and non-human animal cells are particularly useful for testing genome editing reagents designed to target the human MYOC gene. Such non-human animals are also particularly useful for testing candidate glaucoma treatments, as the non-human animals disclosed herein exhibit a phenotype of elevated intraocular pressure that reflects the phenotype observed in glaucoma patients.

[0287] A. Methods to increase MYOC expression, raise intraocular pressure, or model glaucoma. Various methods are provided for increasing MYOC (e.g., human MYOC) expression, increasing intraocular pressure, or modeling glaucoma in vivo using non-human animals comprising a humanized MYOC locus (e.g., including the Y437H mutation) and a SAM expression cassette as described elsewhere herein. Such methods may include administering to the non-human animal one or more SAM guide RNAs as described elsewhere herein, or one or more DNAs encoding the one or more guide RNAs, thereby increasing MYOC expression. The method can increase MYOC mRNA expression and/or protein expression. Such methods may also be used to increase MYOC expression in non-human animal cells by administering the one or more guide RNAs, or the one or more DNAs encoding the one or more guide RNAs, to the non-human animal cells, as described elsewhere herein.

[0288] Each of the one or more guide RNAs may contain one or more adapter-binding elements to which a chimeric adapter protein can specifically bind, and each of the one or more guide RNAs may form a complex with a chimeric Cas protein and a chimeric adapter protein, guiding them to a target sequence within the humanized MYOC locus, thereby increasing the expression of the humanized MYOC locus.

[0289] Such a method may further include measuring the expression of MYOC messenger RNA encoded by the humanized MYOC locus, or measuring the expression of myocilin protein encoded by the humanized MYOC locus, after administering the one or more guide RNAs or the one or more DNAs encoding the one or more guide RNAs.

[0290] In some methods, human MYOC mRNA or protein expression increases by at least about 2-fold, at least about 3-fold, at least about 4-fold, at least about 5-fold, at least about 6-fold, at least about 7-fold, at least about 8-fold, at least about 9-fold, at least about 10-fold, at least about 11-fold, at least about 12-fold, at least about 13-fold, at least about 14-fold, or at least about 15-fold (for example, at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 11-fold, at least 12-fold, at least 13-fold, at least 14-fold, or at least 15-fold). In some methods, human MYOC mRNA or protein expression increases by about 2-fold to about 25-fold, about 3-fold to about 25-fold, about 4-fold to about 25-fold, about 5-fold to about 25-fold, about 6-fold to about 25-fold, about 7-fold to about 25-fold, about 8-fold to about 25-fold, about 9-fold to about 25-fold, about 10-fold to about 25-fold, about 2-fold to about 20-fold, about 2-fold to about 15-fold, or about 10-fold to about 15-fold. The increase in human MYOC mRNA or protein expression can be in the eye, limbal ring, retina, ciliary body, trabecular meshwork, or cornea. In a specific example, the increase in expression is in the limbal ring.

[0291] In some cases, non-human animals develop one or more signs or symptoms of glaucoma after administration of one or more guide RNAs or one or more DNAs encoding one or more guide RNAs. Glaucoma is a chronic optic neuropathy characterized by progressive loss of retinal ganglion cell (RGC) axons, resulting in irreversible vision loss. The primary risk factor for glaucoma is elevated intraocular pressure (IOP). Elevated IOP is caused by increased resistance to aqueous humor outflow through the trabecular meshwork (TM). Aqueous humor is produced by the ciliary body, circulates in the anterior chamber, and flows out through the TM. In most cases of glaucoma, there is increased resistance to aqueous humor outflow in the TM. Pathogenic MYOC mutant proteins aggregate intracellularly, leading to TM stress, elevated IOP, and glaucoma.

[0292] Such a method may further include measuring intraocular pressure (IOP) after administering one or more guide RNAs or one or more DNAs encoding one or more guide RNAs. In one example, the method yields an IOP of at least about 15, at least about 16, at least about 17, at least about 18, at least about 19, at least about 20, at least about 21, or at least about 22 mmHg (e.g., at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, or at least 22 mmHg). For example, this method yields an IOP of approximately 15-22, 16-22, 17-22, 18-22, 19-22, 15-21, 15-20, or 16-21 mmHg (for example, 15-22, 16-22, 17-22, 18-22, 19-22, 15-21, 15-20, or 16-21 mmHg).

[0293] In some methods, IOP increases by a certain amount compared to before administration of one or more guide RNAs or one or more DNAs encoding one or more guide RNAs, or compared to a control non-human animal (e.g., a non-human animal having the humanized MYOC locus described herein but not administered with one or more guide RNAs, or a wild-type non-human animal). In one example, IOP increases by at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, or at least about 6 mmHg (e.g., at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 mmHg). In another example, IOP increases by about 1 to about 7, about 2 to about 7, about 3 to about 7, about 4 to about 7, about 5 to about 7, about 1 to about 6, about 2 to about 6, about 3 to about 6, about 4 to about 6, or about 5 to about 6 mmHg (e.g., 1 to 6, 2 to 6, 3 to 6, 4 to 6, or 5 to 6 mmHg).
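The comparisons described above (an absolute IOP level and an IOP increase relative to a control or pre-administration baseline) can be expressed as a short calculation. This is a minimal sketch under stated assumptions: the function name `iop_summary`, the use of group means, the default thresholds, and the example readings are all hypothetical and are not data from this specification.

```python
from statistics import mean


def iop_summary(treated_mmHg, control_mmHg, min_iop=15.0, min_increase=1.0):
    """Compare mean IOP of treated animals against a control or baseline group.

    The default thresholds (15 mmHg, 1 mmHg) are the lowest values in the
    ranges recited in the text; any other recited value can be passed in.
    """
    treated_mean = mean(treated_mmHg)
    control_mean = mean(control_mmHg)
    increase = treated_mean - control_mean
    return {
        "treated_mean": treated_mean,
        "control_mean": control_mean,
        "increase": increase,
        "meets_min_iop": treated_mean >= min_iop,
        "meets_min_increase": increase >= min_increase,
    }


if __name__ == "__main__":
    # Hypothetical readings in mmHg, for illustration only.
    print(iop_summary([18.0, 20.0, 19.0], [13.0, 14.0, 12.0]))
```

The same function can be used with pre-administration readings from the same animals in place of the control group.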

[0294] The guide RNA or the DNA encoding the guide RNA can be administered in any form, by any delivery vehicle, and by any route of administration (either by introducing it into cells, or by introducing it into an animal so that the guide RNA or DNA gains access to the interior of cells in the non-human animal). For example, the one or more guide RNAs, or the one or more DNAs encoding the one or more guide RNAs, can be administered by adenovirus-mediated delivery (e.g., recombinant adenovirus type 5 (Ad5)), lentivirus-mediated delivery, adeno-associated virus (AAV)-mediated delivery, or lipid nanoparticle (LNP)-mediated delivery. In one example, the guide RNA or the DNA encoding the guide RNA is administered via LNP-mediated delivery (e.g., at a dose of about 0.1 mg/kg to about 2 mg/kg). In another example, the guide RNA or the DNA encoding the guide RNA is administered via AAV-mediated delivery (e.g., using an AAV with a serotype suited to delivery to the eye, such as recombinant AAV2.Y3F). The guide RNA can be administered as RNA or as DNA. When administered as DNA, each guide RNA coding sequence can be operably linked to, for example, a different U6 promoter. The guide RNA or the DNA encoding the guide RNA can be administered by any suitable route of administration. For example, the guide RNA or the DNA encoding the guide RNA can be administered via intravitreal injection or intracameral injection.

[0295] In some cases, the target sequence of the guide RNA may include a regulatory sequence within the humanized MYOC locus. For example, the regulatory sequence may include a promoter or an enhancer. In some cases, the target sequence of the guide RNA may be within 200 base pairs of the transcription start site of the genetically modified endogenous MYOC locus, or within a region 200 base pairs upstream and 1 base pair downstream of the transcription start site.
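The positional constraint described above (a target sequence within a region from 200 base pairs upstream to 1 base pair downstream of the transcription start site) can be checked with a short function. This is a sketch only: `in_target_window`, its parameters, and the coordinates used in the example are hypothetical, and positions are assumed to be on a single chromosome coordinate system with the gene's strand given explicitly.

```python
def in_target_window(pos, tss, strand="+", upstream=200, downstream=1):
    """Return True if `pos` lies within the window around the TSS.

    Offsets are measured in the direction of transcription, so negative
    offsets are upstream of the TSS regardless of strand.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    offset = (pos - tss) if strand == "+" else (tss - pos)
    return -upstream <= offset <= downstream


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(in_target_window(900, 1000))          # 100 bp upstream on the + strand
    print(in_target_window(1100, 1000, "-"))    # 100 bp upstream on the - strand
```

Setting `downstream=200` gives the broader alternative recited in the text, a target sequence within 200 base pairs of the transcription start site in either direction.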

[0296] In some cases, each guide RNA contains two adapter-binding elements to which a chimeric adapter protein can specifically bind. For example, the first adapter-binding element may be in the first loop of each of the one or more guide RNAs, and the second adapter-binding element may be in the second loop of each of the one or more guide RNAs. In a specific example, each guide RNA may be a single guide RNA containing a CRISPR RNA (crRNA) portion fused to a trans-activating CRISPR RNA (tracrRNA) portion, the first loop being a tetraloop corresponding to residues 13-16 of SEQ ID NO: 77, 79, 81, or 82, and the second loop being stem loop 2 corresponding to residues 53-56 of SEQ ID NO: 77, 79, 81, or 82. In another specific example, the adapter-binding element contains the sequence set forth in SEQ ID NO: 52. In yet another specific example, each of the one or more guide RNAs contains the sequence set forth in SEQ ID NO: 66, 68, 71, or 72.

[0297] For example, the guide RNA can target a sequence containing the sequence set forth in any one of SEQ ID NOs: 90-95. Similarly, the guide RNA can contain the sequence set forth in any one of SEQ ID NOs: 96-101. In another example, the guide RNA can target a sequence containing the sequence set forth in any one of SEQ ID NOs: 93-94. Similarly, the guide RNA can contain the sequence set forth in any one of SEQ ID NOs: 99-100. In yet another example, the guide RNA can target a sequence containing the sequence set forth in SEQ ID NO: 93. Similarly, the guide RNA can contain the sequence set forth in SEQ ID NO: 99.

[0298] In some methods, one or more guide RNAs include multiple guide RNAs that target the humanized MYOC locus (e.g., at least two or at least three guide RNAs that target the humanized MYOC locus).

[0299] B. Methods for testing the efficacy of human MYOC-targeting reagents or candidate glaucoma treatments

Various methods are provided for evaluating the delivery or efficacy of human MYOC-targeting reagents or candidate glaucoma treatments in vivo, ex vivo, or in vitro using non-human animals or non-human animal cells containing humanized MYOC loci (e.g., including the Y437H mutation) as described elsewhere in this specification. Such methods may include (a) introducing the human MYOC-targeting reagent or candidate glaucoma treatment into a non-human animal, and (b) evaluating the activity of the human MYOC-targeting reagent or candidate glaucoma treatment. Similarly, such methods may include (a) introducing the human MYOC-targeting reagent or candidate glaucoma treatment into non-human animal cells, and (b) evaluating the activity of the human MYOC-targeting reagent or candidate glaucoma treatment. The evaluation may be made by comparison with, for example, control non-human animals or non-human animal cells containing humanized MYOC loci that have not been administered the human MYOC-targeting reagent or candidate glaucoma treatment, or with the non-human animals or non-human animal cells before administration of the human MYOC-targeting reagent or candidate glaucoma treatment.

[0300] In methods in which the non-human animal or non-human animal cell also comprises the components of a CRISPR/Cas synergistic activation mediator (SAM) system, the method may further include, prior to step (a), administering to the non-human animal or non-human animal cell one or more SAM guide RNAs, or one or more DNAs encoding one or more SAM guide RNAs, as described elsewhere herein. Each of the one or more guide RNAs comprises one or more adapter-binding elements to which a chimeric adapter protein can specifically bind, and each forms a complex with a chimeric Cas protein and a chimeric adapter protein and guides them to a target sequence within the humanized MYOC locus, thereby increasing expression of the humanized MYOC locus. If the evaluation step is performed by comparison with a control non-human animal or non-human animal cell that has not been administered the human MYOC-targeting reagent or candidate glaucoma treatment, the method may further include administering the one or more SAM guide RNAs, or the one or more DNAs encoding the one or more SAM guide RNAs, as described elsewhere herein, to the control non-human animal or non-human animal cell.

[0301] In methods that further comprise administering one or more SAM guide RNAs, or one or more DNAs encoding one or more SAM guide RNAs, as described elsewhere herein, to a non-human animal or non-human animal cell prior to step (a), any suitable amount of time may elapse between the step of administering the one or more SAM guide RNAs or the one or more DNAs encoding the one or more SAM guide RNAs and the step of administering the human MYOC-targeting reagent or candidate glaucoma treatment. For example, the human MYOC-targeting reagent or candidate glaucoma treatment may be administered at least about 1 day, at least about 2 days, at least about 3 days, at least about 4 days, at least about 5 days, at least about 6 days, at least about 7 days, at least about 8 days, at least about 9 days, at least about 10 days, at least about 15 days, at least about 20 days, at least about 25 days, or at least about 30 days after the administration of the one or more guide RNAs or the one or more DNAs encoding the one or more guide RNAs. In another example, the human MYOC-targeting reagent or candidate glaucoma treatment is administered about 1 to 2 days, 1 to 3 days, 1 to 4 days, 1 to 5 days, 1 to 6 days, 1 to 7 days, 1 to 8 days, 1 to 9 days, 1 to 10 days, 1 to 15 days, 1 to 20 days, 1 to 25 days, or 1 to 30 days after the administration of the one or more guide RNAs or the one or more DNAs encoding the one or more guide RNAs. In another example, the human MYOC-targeting reagent or candidate glaucoma treatment is administered about 1 to 30 days, 2 to 30 days, 3 to 30 days, 4 to 30 days, 5 to 30 days, 6 to 30 days, 7 to 30 days, 8 to 30 days, 9 to 30 days, 10 to 30 days, 15 to 30 days, 20 to 30 days, or 25 to 30 days after the administration of the one or more guide RNAs or the one or more DNAs encoding the one or more guide RNAs. In another example, a human MYOC-targeting reagent or candidate glaucoma treatment may be administered at least about...

Claims

1. A non-human animal comprising in its genome a humanized endogenous MYOC locus, wherein the region of the endogenous MYOC locus from the MYOC start codon to the MYOC stop codon has been deleted and replaced with the corresponding human MYOC genomic sequence from the MYOC start codon to the MYOC stop codon, wherein the non-human animal further comprises a genomically integrated expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein comprising a nuclease-inactive Cas protein fused to one or more transcriptional activation domains; and (b) a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcriptional activation domains, wherein the non-human animal further comprises one or more administered guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs, each guide RNA comprising one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, wherein each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, wherein at least one of the one or more guide RNAs targets a target sequence in the region from 200 base pairs upstream to 1 base pair downstream of the transcription start site of the humanized endogenous MYOC locus, and wherein, optionally, the one or more guide RNAs, or the one or more expression cassettes encoding the one or more guide RNAs, are located within the trabecular meshwork.

2. The non-human animal according to claim 1, wherein the humanized endogenous MYOC gene locus contains a mutation associated with glaucoma, and optionally the human MYOC sequence contains the mutation.

3. The non-human animal according to claim 1 or 2, wherein the humanized endogenous MYOC locus contains the Y437H mutation, and the human MYOC sequence includes the Y437H mutation.

4. The non-human animal according to claim 1 or 2, wherein: the humanized endogenous MYOC locus includes an endogenous MYOC promoter, and the human MYOC sequence is operably linked to the endogenous MYOC promoter; and/or the humanized endogenous MYOC locus includes the human MYOC 3' untranslated region; and/or the endogenous MYOC 5' untranslated region is not deleted and is not replaced with the corresponding human MYOC genomic sequence.

5. The non-human animal according to claim 1 or 2, wherein the humanized endogenous MYOC locus encodes a human myocilin protein, optionally wherein the sequence of the human myocilin protein includes the sequence set forth in SEQ ID NO: 4, and optionally wherein the human myocilin protein is encoded by a coding sequence that includes the sequence set forth in SEQ ID NO: 5.

6. The non-human animal according to claim 1 or 2, wherein the region of the endogenous MYOC locus from the MYOC start codon to the MYOC stop codon is deleted and replaced with a human MYOC sequence comprising the corresponding human MYOC genomic sequence and the human MYOC 3' untranslated region, wherein the human MYOC sequence contains the Y437H mutation, wherein the endogenous MYOC 5' untranslated region is not deleted and is not replaced with the human MYOC sequence, and wherein the humanized endogenous MYOC locus includes an endogenous MYOC promoter and the human MYOC sequence is operably linked to the endogenous MYOC promoter.

7. The non-human animal according to claim 1 or 2, wherein: (i) the human MYOC sequence at the humanized endogenous MYOC locus contains a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 87; and/or (ii) the humanized endogenous MYOC locus encodes a myocilin protein that contains a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 4; and/or (iii) the humanized endogenous MYOC locus contains a myocilin coding sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 5; and/or (iv) the humanized endogenous MYOC locus contains a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 88 or 89.

8. The non-human animal according to claim 1 or 2, wherein the humanized endogenous MYOC gene locus does not contain a selection cassette or reporter gene.

9. The non-human animal according to claim 1 or 2, wherein the non-human animal is homozygous for the humanized endogenous MYOC gene locus.

10. The non-human animal according to claim 1 or 2, wherein the non-human animal includes the humanized endogenous MYOC gene locus in its germline.

11. The non-human animal according to claim 1 or 2, wherein the non-human animal is a mammal.

12. The non-human animal according to claim 11, wherein the non-human animal is a rat or a mouse.

13. The non-human animal according to claim 12, wherein the non-human animal is the mouse.

14. The non-human animal according to claim 1 or 2, further comprising a second genomically integrated expression cassette encoding one or more guide RNAs, each comprising one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, wherein each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and wherein at least one of the one or more guide RNAs targets the humanized endogenous MYOC locus.

15. The non-human animal according to claim 1 or 2, wherein: the genomically integrated expression cassette is integrated at the Rosa26 locus; the Cas protein is a Cas9 protein that contains mutations corresponding to D10A and N863A when optimally aligned with the Streptococcus pyogenes Cas9 protein; the one or more transcriptional activation domains in the chimeric Cas protein include VP64; the adapter protein comprises the MS2 coat protein or a functional fragment or variant thereof; the one or more transcriptional activation domains in the chimeric adapter protein include p65 and HSF1; the non-human animal further comprises one or more administered guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs; the one or more guide RNAs, or the one or more expression cassettes encoding the one or more guide RNAs, are located within the trabecular meshwork; each of the one or more guide RNAs comprises two adapter-binding elements to which the chimeric adapter protein can specifically bind; and the two adapter-binding elements include a first adapter-binding element in the first loop of each of the one or more guide RNAs and a second adapter-binding element in the second loop of each of the one or more guide RNAs.

16. The non-human animal according to claim 1 or 2, wherein the non-human animal is a mouse and the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 90 to 95, and optionally wherein the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 93 to 94.

17. The non-human animal according to claim 1 or 2, wherein the non-human animal is a mouse, and the one or more guide RNAs target the guide RNA target sequence set forth in SEQ ID NO: 93.

18. The non-human animal according to claim 1 or 2, wherein the non-human animal has MYOC mRNA or protein expression increased by at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 11-fold, at least 12-fold, at least 13-fold, at least 14-fold, or at least 15-fold in the eye, limbal ring, retina, ciliary body, or trabecular meshwork compared to a control non-human animal that does not contain the one or more guide RNAs or the one or more expression cassettes encoding the one or more guide RNAs.

19. The non-human animal according to claim 3, wherein the non-human animal has elevated intraocular pressure compared to a wild-type non-human animal or a control non-human animal, optionally wherein the non-human animal has an intraocular pressure of at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, or at least 22 mmHg, and/or optionally wherein the non-human animal has an intraocular pressure that is at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 mmHg higher than the intraocular pressure of the control non-human animal.

20. A non-human animal cell comprising in its genome a humanized endogenous MYOC locus, wherein the region of the endogenous MYOC locus from the MYOC start codon to the MYOC stop codon has been deleted and replaced with the corresponding human MYOC genomic sequence from the MYOC start codon to the MYOC stop codon, wherein the non-human animal cell further comprises a genomically integrated expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein comprising a nuclease-inactive Cas protein fused to one or more transcriptional activation domains; and (b) a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcriptional activation domains, wherein the non-human animal cell further comprises one or more administered guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs, each guide RNA comprising one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, wherein each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and wherein at least one of the one or more guide RNAs targets a target sequence in the region from 200 base pairs upstream to 1 base pair downstream of the transcription start site of the humanized endogenous MYOC locus.

21. The non-human animal cell according to claim 20, wherein the humanized endogenous MYOC locus contains a mutation associated with glaucoma, and optionally the human MYOC sequence contains the mutation.

22. The non-human animal cell according to claim 20 or 21, wherein the humanized endogenous MYOC locus contains the Y437H mutation, and the human MYOC sequence contains the Y437H mutation.

23. The non-human animal cell according to claim 20 or 21, wherein: the humanized endogenous MYOC locus includes an endogenous MYOC promoter, and the human MYOC sequence is operably linked to the endogenous MYOC promoter; and/or the humanized endogenous MYOC locus includes the human MYOC 3' untranslated region; and/or the endogenous MYOC 5' untranslated region is not deleted and is not replaced with the corresponding human MYOC genomic sequence.

24. The non-human animal cell according to claim 20 or 21, wherein the humanized endogenous MYOC locus encodes a human myocilin protein, optionally wherein the sequence of the human myocilin protein includes the sequence set forth in SEQ ID NO: 4, and optionally wherein the human myocilin protein is encoded by a coding sequence that includes the sequence set forth in SEQ ID NO: 5.

25. The non-human animal cell according to claim 20 or 21, wherein the region of the endogenous MYOC locus from the MYOC start codon to the MYOC stop codon is deleted and replaced with a human MYOC sequence comprising the corresponding human MYOC genomic sequence and the human MYOC 3' untranslated region, wherein the human MYOC sequence contains the Y437H mutation, wherein the endogenous MYOC 5' untranslated region is not deleted and is not replaced with the human MYOC sequence, and wherein the humanized endogenous MYOC locus includes an endogenous MYOC promoter and the human MYOC sequence is operably linked to the endogenous MYOC promoter.

26. The non-human animal cell according to claim 20 or 21, wherein: (i) the human MYOC sequence at the humanized endogenous MYOC locus contains a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 87; and/or (ii) the humanized endogenous MYOC locus encodes a myocilin protein that contains a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 4; and/or (iii) the humanized endogenous MYOC locus contains a myocilin coding sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 5; and/or (iv) the humanized endogenous MYOC locus contains a sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the sequence set forth in SEQ ID NO: 88 or 89.

27. The non-human animal cell according to claim 20 or 21, wherein the humanized endogenous MYOC gene locus does not contain a selection cassette or reporter gene.

28. The non-human animal cell according to claim 20 or 21, wherein the non-human animal cell is homozygous for the humanized endogenous MYOC gene locus.

29. The non-human animal cell according to claim 20 or 21, wherein the non-human animal cell is a mammalian cell.

30. The non-human animal cell according to claim 29, wherein the non-human animal cell is a rat cell or a mouse cell.

31. The non-human animal cell according to claim 30, wherein the non-human animal cell is the mouse cell.

32. The non-human animal cell according to claim 20 or 21, further comprising a second genomically integrated expression cassette encoding one or more guide RNAs, each comprising one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, wherein each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and wherein at least one of the one or more guide RNAs targets the humanized endogenous MYOC locus.

33. The non-human animal cell according to claim 20 or 21, wherein: the genomically integrated expression cassette is integrated at the Rosa26 locus; the Cas protein is a Cas9 protein that contains mutations corresponding to D10A and N863A when optimally aligned with the Streptococcus pyogenes Cas9 protein; the one or more transcriptional activation domains in the chimeric Cas protein include VP64; the adapter protein comprises the MS2 coat protein or a functional fragment or variant thereof; the one or more transcriptional activation domains in the chimeric adapter protein include p65 and HSF1; the non-human animal cell further comprises one or more administered guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs; the one or more guide RNAs, or the one or more expression cassettes encoding the one or more guide RNAs, are located within the trabecular meshwork; each of the one or more guide RNAs comprises two adapter-binding elements to which the chimeric adapter protein can specifically bind; and the two adapter-binding elements include a first adapter-binding element in the first loop of each of the one or more guide RNAs and a second adapter-binding element in the second loop of each of the one or more guide RNAs.

34. The non-human animal cell according to claim 20 or 21, wherein the non-human animal cell is a mouse cell and the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 90 to 95, and optionally wherein the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 93 to 94.

35. The non-human animal cell according to claim 20 or 21, wherein the non-human animal cell is a mouse cell, and the one or more guide RNAs target the guide RNA target sequence set forth in SEQ ID NO: 93.

36. A method for evaluating the activity of a human MYOC targeting reagent, comprising: (a) administering the human MYOC targeting reagent to the non-human animal according to claim 1; and (b) evaluating the activity of the human MYOC targeting reagent in the non-human animal.

37. The method according to claim 36, further comprising, before step (a), administering one or more guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs, wherein each guide RNA comprises one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, wherein each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and wherein at least one of the one or more guide RNAs targets the humanized endogenous MYOC locus.

38. The method according to claim 37, wherein the one or more guide RNAs, or the one or more expression cassettes encoding the one or more guide RNAs, are administered to the non-human animal at least one week before step (a) or about one week to about ten weeks before step (a).

39. The method according to any one of claims 36 to 38, wherein: (I) step (b) comprises evaluating the activity of the human MYOC targeting reagent in the eye of the non-human animal; or (II) step (a) comprises administering the human MYOC targeting reagent, and step (b) comprises measuring the expression of the MYOC messenger RNA encoded by the humanized endogenous MYOC locus; or (III) step (a) comprises administering the human MYOC targeting reagent, and step (b) comprises measuring the expression of the myocilin protein encoded by the humanized endogenous MYOC locus; or (IV) step (a) comprises administering the human MYOC targeting reagent, wherein the human MYOC targeting reagent is a genome editing agent, and step (b) comprises evaluating the modification of the humanized endogenous MYOC locus, optionally comprising measuring the frequency of insertions or deletions within the humanized endogenous MYOC locus; or (V) the evaluation is performed by comparison with an untreated control non-human animal.

40. The method according to any one of claims 36 to 38, wherein: (I) the human MYOC targeting reagent comprises a nuclease designed to target a region of the human MYOC gene, optionally wherein the nuclease comprises a Cas protein and a guide RNA designed to target a guide RNA target sequence in the human MYOC gene, and optionally wherein the Cas protein is a Cas9 protein; or (II) step (a) comprises administering the human MYOC targeting reagent, wherein the human MYOC targeting reagent comprises an exogenous donor nucleic acid designed to target the human MYOC gene, and optionally the exogenous donor nucleic acid is delivered via AAV; or (III) the human MYOC targeting reagent is an RNAi agent or an antisense oligonucleotide; or (IV) the human MYOC targeting reagent is an antigen-binding protein; or (V) the human MYOC targeting reagent is a small molecule.

41. A method for evaluating the activity of a candidate glaucoma treatment agent, comprising: (a) administering the candidate glaucoma treatment agent to the non-human animal according to claim 1, wherein the humanized endogenous MYOC locus in the non-human animal contains the Y437H mutation; and (b) evaluating the activity of the candidate glaucoma treatment agent in the non-human animal.

42. The method according to claim 41, further comprising administering one or more guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs, wherein each guide RNA comprises one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, wherein each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and wherein at least one of the one or more guide RNAs targets the humanized endogenous MYOC locus.

43. The method according to claim 42, wherein the one or more guide RNAs, or the one or more expression cassettes encoding the one or more guide RNAs, are administered to the non-human animal at least one week before step (a) or about one week to about ten weeks before step (a).

44. The method according to any one of claims 41 to 43, wherein: (I) step (b) comprises evaluating the activity of the candidate glaucoma treatment agent in the eye of the non-human animal; or (II) step (b) comprises measuring the expression of the MYOC messenger RNA encoded by the humanized endogenous MYOC locus; or (III) step (b) comprises measuring the expression of the myocilin protein encoded by the humanized endogenous MYOC locus; or (IV) step (b) comprises evaluating modifications of the humanized endogenous MYOC locus, optionally comprising measuring the frequency of insertions or deletions within the humanized endogenous MYOC locus; or (V) evaluating the activity of the candidate glaucoma treatment agent in the non-human animal includes evaluating intraocular pressure; or (VI) the evaluation is performed by comparison with an untreated control non-human animal; or (VII) step (a) comprises administering the candidate glaucoma treatment agent, and step (b) comprises evaluating intraocular pressure, optionally wherein (A) the candidate glaucoma treatment agent is an inhibitor of aqueous humor formation, or (B) the candidate glaucoma treatment agent increases aqueous humor outflow.

45. The method according to any one of claims 41 to 43, wherein the candidate glaucoma treatment agent is an ANGPTL7 targeting reagent, and optionally the ANGPTL7 targeting reagent is an RNAi agent or an antisense oligonucleotide.

46. A method for increasing MYOC expression in a non-human animal, wherein the non-human animal comprises in its genome a humanized endogenous MYOC locus in which the region of the endogenous MYOC locus from the MYOC start codon to the MYOC stop codon has been deleted and replaced with the corresponding human MYOC genomic sequence from the MYOC start codon to the MYOC stop codon, wherein the non-human animal further comprises a genomically integrated expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein comprising a nuclease-inactive Cas protein fused to one or more transcriptional activation domains; and (b) a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcriptional activation domains, the method comprising administering to the non-human animal one or more guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs, wherein each guide RNA comprises one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, wherein each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and wherein at least one of the one or more guide RNAs targets a target sequence in the region from 200 base pairs upstream to 1 base pair downstream of the transcription start site of the humanized endogenous MYOC locus.

47. A method for increasing intraocular pressure in a non-human animal, wherein the non-human animal comprises in its genome a humanized endogenous MYOC locus in which the region of the endogenous MYOC locus from the MYOC start codon to the MYOC stop codon has been deleted and replaced with the corresponding human MYOC genomic sequence, containing the Y437H mutation, from the MYOC start codon to the MYOC stop codon, wherein the non-human animal further comprises a genomically integrated expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein comprising a nuclease-inactive Cas protein fused to one or more transcriptional activation domains; and (b) a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcriptional activation domains, the method comprising administering to the non-human animal one or more guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs, wherein each guide RNA comprises one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, wherein each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and wherein at least one of the one or more guide RNAs targets a target sequence in the region from 200 base pairs upstream to 1 base pair downstream of the transcription start site of the humanized endogenous MYOC locus.

48. A method for modeling glaucoma in a non-human animal, wherein the non-human animal comprises in its genome a humanized endogenous MYOC locus in which the region of the endogenous MYOC locus from the MYOC start codon to the MYOC stop codon has been deleted and replaced with the corresponding human MYOC genomic sequence, containing the Y437H mutation, from the MYOC start codon to the MYOC stop codon, wherein the non-human animal further comprises a genomically integrated expression cassette comprising: (a) a nucleic acid encoding a chimeric clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein comprising a nuclease-inactive Cas protein fused to one or more transcriptional activation domains; and (b) a nucleic acid encoding a chimeric adapter protein comprising an adapter protein fused to one or more transcriptional activation domains, the method comprising administering to the non-human animal one or more guide RNAs, or one or more expression cassettes encoding the one or more guide RNAs, wherein each guide RNA comprises one or more adapter-binding elements to which the chimeric adapter protein can specifically bind, wherein each of the one or more guide RNAs is capable of forming a complex with the Cas protein and guiding it to a target sequence within a target gene, and wherein at least one of the one or more guide RNAs targets a target sequence in the region from 200 base pairs upstream to 1 base pair downstream of the transcription start site of the humanized endogenous MYOC locus.

49. The method according to any one of claims 46 to 48, wherein: (I) the method results in increased MYOC mRNA or protein expression in the eye, limbal ring, retina, ciliary body, or trabecular meshwork, and optionally, the method results in at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 11-fold, at least 12-fold, at least 13-fold, at least 14-fold, or at least 15-fold increased MYOC mRNA or protein expression in the eye, limbal ring, retina, ciliary body, or trabecular meshwork compared to a control non-human animal that has not been administered the one or more guide RNAs or the one or more expression cassettes encoding the one or more guide RNAs, or (II) the method results in elevated intraocular pressure, and optionally, the method results in the non-human animal having an intraocular pressure that is at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 mmHg higher than the intraocular pressure of the control non-human animal, or (III) the method results in an intraocular pressure of at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, or at least 22 mmHg.

50. The method according to any one of claims 36 to 38, 41 to 43, and 46 to 48, wherein the non-human animal is a mouse and the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 90 to 95, and optionally: (I) the non-human animal is a mouse and the one or more guide RNAs target one or more guide RNA target sequences selected from SEQ ID NOs: 93 to 94, or (II) the non-human animal is a mouse and the one or more guide RNAs target the guide RNA target sequence set forth in SEQ ID NO: 93.

51. The method according to any one of claims 36 to 38, 41 to 43, and 46 to 48, wherein the one or more guide RNAs, or the one or more expression cassettes encoding the one or more guide RNAs, are administered via adenovirus-mediated delivery, lentivirus-mediated delivery, or adeno-associated virus (AAV)-mediated delivery, and optionally: (i) the one or more guide RNAs, or the one or more expression cassettes encoding the one or more guide RNAs, are administered via recombinant AAV2.Y3F-mediated delivery, or (ii) the one or more guide RNAs, or the one or more expression cassettes encoding the one or more guide RNAs, are administered via lentivirus-mediated delivery.

52. The method according to any one of claims 36 to 38, 41 to 43, and 46 to 48, wherein the one or more guide RNAs, or the one or more expression cassettes encoding the one or more guide RNAs, are administered to the non-human animal by intravitreal injection or anterior chamber injection.
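
The numeric criteria recited in the claims above (the guide RNA targeting window relative to the transcription start site, the fold increase in MYOC expression, and the intraocular pressure difference versus a control animal) can be illustrated with a minimal Python sketch. This is not part of the claims. The coordinate convention (offsets relative to the transcription start site, negative values upstream), the requirement that the whole target sequence lie inside the window, and every name below (`GuideTarget`, `in_activation_window`, `fold_change`, `iop_elevation`) are assumptions introduced only for illustration, and the numbers in the usage note are made-up inputs, not data from the patent.

```python
from dataclasses import dataclass

# Window bounds taken from the claim language: 200 bp upstream to 1 bp
# downstream of the transcription start site (TSS).
UPSTREAM_BP = 200
DOWNSTREAM_BP = 1


@dataclass(frozen=True)
class GuideTarget:
    """Hypothetical record for a guide RNA target sequence.

    start and end are inclusive offsets relative to the TSS
    (assumed convention: negative = upstream, positive = downstream).
    """
    name: str
    start: int
    end: int


def in_activation_window(target: GuideTarget) -> bool:
    """Return True if the whole target lies within -200..+1 of the TSS.

    Requiring full containment is an assumption of this sketch.
    """
    if target.start > target.end:
        raise ValueError("start must not exceed end")
    return -UPSTREAM_BP <= target.start and target.end <= DOWNSTREAM_BP


def fold_change(treated: float, control: float) -> float:
    """Ratio of treated to control expression (mRNA or protein)."""
    if control <= 0:
        raise ValueError("control expression must be positive")
    return treated / control


def iop_elevation(treated_mmhg: float, control_mmhg: float) -> float:
    """Difference in intraocular pressure, treated minus control, in mmHg."""
    return treated_mmhg - control_mmhg
```

For example, with illustrative inputs, `in_activation_window(GuideTarget("g1", -150, -131))` is true because the target lies inside the window, while `fold_change(30.0, 10.0)` gives a 3-fold increase and `iop_elevation(21.0, 16.0)` gives a 5 mmHg elevation.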