CAS nucleases and polynucleotides encoding the same

Novel Cas nucleases with altered PAM specificities enhance genome editing efficiency by allowing flexible target site selection, addressing the limitations of traditional CRISPR systems.

WO2026132030A1 · PCT designated stage · Publication Date: 2026-06-25 · NOVOZYMES AS

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
NOVOZYMES AS
Filing Date
2025-12-17
Publication Date
2026-06-25

AI Technical Summary

Technical Problem

Existing CRISPR-associated (Cas) nucleases face limitations such as requiring complex engineering, high costs, and restricted target site availability due to canonical PAM sequences, which hampers efficient genome editing.

Method used

Development of novel Cas nucleases with altered PAM specificities, such as 'nRARK', allowing flexible target site selection and enhanced genome editing capabilities, including fusion polypeptides and nucleic acid constructs for precise genome modification.

Benefits of technology

The novel Cas nucleases provide increased target site flexibility and efficiency in genome editing, overcoming limitations of traditional CRISPR systems by enabling more versatile and cost-effective genome modification.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

The present invention relates to novel Cas nucleases and polynucleotides encoding the same. The invention also relates to nucleic acid constructs, vectors, and host cells comprising the polynucleotides and Cas nuclease, as well as fusion polypeptides, gene editing methods, methods of producing the Cas nuclease, formulations comprising the Cas nuclease, and use of the Cas nuclease. In particular, it relates to the Cas nuclease from Lachnospiraceae bacterium A10 (SEQ ID NO: 1) and its coding nucleic acid sequence (SEQ ID NO: 2).

Description

[0001] NOVEL CAS NUCLEASES AND POLYNUCLEOTIDES ENCODING THE SAME

[0002] Reference to a Sequence Listing

[0003] This application contains a Sequence Listing in computer readable form, which is incorporated herein by reference.

[0004] Background of the Invention

[0005] Field of the Invention

[0006] The present invention relates to novel CRISPR-associated (Cas) nucleases, variants thereof, and polynucleotides encoding the same. The invention also relates to nucleic acid constructs, vectors, and host cells comprising the polynucleotides and Cas nuclease, as well as fusion polypeptides, gene editing methods, methods of producing the Cas nuclease, formulations comprising the Cas nuclease, and use of the Cas nuclease.

[0007] Description of the Related Art

[0008] In recent years, genome editing has emerged as a pivotal tool for research and applications. Early methods required complex engineering of nucleases, such as meganucleases, zinc finger fusion proteins, or TALENs, tailored for each target sequence. This process was time-consuming and costly, posing scalability and efficiency challenges.

[0009] A transformative breakthrough occurred with RNA-guided nucleases, particularly CRISPR-associated (Cas) proteins. These RNA-guided nucleases allow specific targeting of genetic sequences using guide RNA, streamlining genome editing by eliminating the need for custom-engineered nucleases. RNA-guided nuclease systems, which rely on guide components such as CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA), offer versatile genome editing options, from introducing mutations via non-homologous end-joining (NHEJ) to precise base editing when fused with deaminases.

[0010] Programmable nucleases, core components of RNA-guided nucleases, bind and cleave nucleic acids with sequence-specificity. They exhibit activities like cis cleavage or nickase activity, guided by specialized RNA molecules. These nucleases can be engineered to reduce catalytic activity while maintaining sequence specificity, expanding their utility.

[0011] CRISPR systems in bacterial and archaeal adaptive immunity display diverse characteristics. Differences in size, PAM site, on-target activity, and cleavage pattern offer unique advantages for various applications, but can also represent limitations (e.g., low frequency of PAM sites in the target cell genome, or low expressability). Novel Cas nucleases are essential to address evolving genome engineering demands.

Summary of the Invention

[0012] In a 1st aspect, the present invention relates to Cas nucleases selected from the group consisting of:

[0013] (a) a polypeptide having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 1;

[0014] (b) a polypeptide encoded by a polynucleotide having at least 80% sequence identity to the polypeptide coding sequence of SEQ ID NO: 2;

[0015] (c) a polypeptide derived from SEQ ID NO: 1 by having 1-30 alterations (e.g., substitutions, deletions and/or insertions at one or more positions, e.g., 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15 or 16 or 17 or 18 or 19 or 20 or 21 or 22 or 23 or 24 or 25 or 26 or 27 or 28 or 29 or 30 alterations), in particular substitutions, such as conservative amino acid substitutions;

[0016] (d) a polypeptide having a TM-score of at least 0.80 compared to the three-dimensional structure of the polypeptide of SEQ ID NO: 1, wherein the three-dimensional structure is calculated using AlphaFold;

[0017] (e) a polypeptide derived from the polypeptide of (a), (b), (c), or (d), wherein the N- and/or C-terminal end has been extended by addition of one or more amino acids; and

[0018] (f) a fragment of the polypeptide of (a), (b), (c), (d), or (e).

[0019] Some aspects of the disclosure provide Cas nucleases that have different PAM specificities. Typically, Cas nucleases, such as Cas9 from S. pyogenes (spCas9), require a canonical “nGG” PAM sequence to bind a particular nucleic acid region. This may limit the ability to target desired bases within a genome. In some embodiments, the Cas nucleases provided herein may need to be placed at a precise location, for example where a target base is placed within a 4-base region (e.g., an “editing window”), which is approximately 15 bases upstream of the PAM. See Komor, A.C., et al., “Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage” Nature 533, 420-424 (2016). Accordingly, in some embodiments, any of the Cas nucleases provided herein may contain a CRISPR nuclease that is capable of binding a nucleotide sequence that does not contain a canonical (e.g., nGG) PAM sequence. In one embodiment, the Cas nuclease of the invention utilizes a “nRARK” PAM sequence, which allows a more flexible selection of target sites compared to the canonical “nGG” PAM sequence of spCas9.

[0020] In a 2nd aspect, the invention relates to a fusion polypeptide comprising the Cas nuclease of the 1st aspect, and one or more second polypeptides. In a 3rd aspect, the present invention relates to a non-naturally occurring composition comprising (i) the Cas nuclease of the 1st aspect and/or the fusion polypeptide of the 2nd aspect, or (ii) a nucleic acid molecule comprising a sequence encoding the Cas nuclease of the 1st aspect and/or the fusion polypeptide of the 2nd aspect.

[0021] In a 4th aspect, the present invention relates to a method of modifying a nucleotide sequence at a DNA target site in the genome of a cell, comprising introducing into the cell the Cas nuclease of the 1st aspect, the fusion polypeptide according to the 2nd aspect, the composition of the 3rd aspect, the polynucleotide of the 5th aspect, and/or the nucleic acid construct or expression vector of the 6th aspect.

[0022] In a 5th aspect, the present invention relates to a polynucleotide encoding the Cas nuclease of the 1st aspect, and/or the fusion polypeptide of the 2nd aspect.

[0023] In a 6th aspect, the present invention relates to a nucleic acid construct or expression vector comprising the polynucleotide of the 5th aspect, operably linked to one or more control sequences that direct the production of the polypeptide in a cell.

[0024] In a 7th aspect, the present invention relates to a cell comprising the Cas nuclease of the 1st aspect, the fusion polypeptide of the 2nd aspect, the composition of the 3rd aspect, the polynucleotide of the 5th aspect, or the nucleic acid construct or expression vector of the 6th aspect.

[0025] In an 8th aspect, the present invention relates to a cell comprising a genome modified by the Cas nuclease of the 1st aspect, the fusion polypeptide of the 2nd aspect, the composition of the 3rd aspect, the method of the 4th aspect, the polynucleotide of the 5th aspect, and/or the nucleic acid construct or expression vector of the 6th aspect.

[0026] In a 9th aspect, the present invention relates to a method of producing a Cas nuclease of the 1st aspect, or a fusion polypeptide of the 2nd aspect, comprising cultivating the host cell of the 7th aspect under conditions conducive for production of the Cas nuclease or fusion polypeptide.

[0027] In a 10th aspect, the present invention relates to the use of the Cas nuclease of the 1st aspect, the fusion polypeptide of the 2nd aspect, the composition of the 3rd aspect, the method of the 4th aspect, the polynucleotide of the 5th aspect, or the nucleic acid construct or expression vector of the 6th aspect for modifying a target sequence in a cell, e.g., a target gene.

[0028] In an 11th aspect, the present invention relates to the use of the Cas nuclease of the 1st aspect, the fusion polypeptide of the 2nd aspect, the composition of the 3rd aspect, the method of the 4th aspect, the polynucleotide of the 5th aspect, the nucleic acid construct or expression vector of the 6th aspect, the cell of the 7th aspect, or the cell of the 8th aspect for the manufacture of a medicament for modifying a target sequence in a cell, e.g., a target gene.

[0029] In a 12th aspect, the present invention relates to a formulation comprising (i) the Cas nuclease according to the 1st aspect, the fusion polypeptide according to the 2nd aspect, a composition according to the 3rd aspect, the polynucleotide according to the 5th aspect, the nucleic acid construct or expression vector according to the 6th aspect, the cell according to the 7th aspect, or the cell according to the 8th aspect, and optionally, (ii) one or more of a lipid, a liposome, a hydrogel, a microparticle, a nanoparticle, or a block copolymer micelle.

[0030] The inventors of the instant invention have identified novel Cas nucleases which, surprisingly, possess several advantages over the nucleases known from the prior art, e.g., the herein identified novel Cas nucleases are highly distinct from known Cas nucleases (which is particularly advantageous in an IP-heavy research field) and the herein disclosed nucleases are compatible with more flexible PAM sites, resulting in a higher number of possible target sites per genome.

[0031] Brief Description of the Drawings

[0032] Figure 1 shows the bioinformatic pipeline developed to identify novel CRISPR-Cas systems in genome sequences. Genome sequences from various sources are processed with three state-of-the-art sequence mining tools to identify Cas nuclease genes, CRISPR arrays, and tracrRNA sequences. The pool of potential Cas proteins is further enriched by scanning the genomes with custom HMMs and filtered for the presence of domains and residues required for endonuclease activity. The three functional elements (Cas, CRISPR array, and tracrRNA) are mapped to each other via their genomic locus as well as the complementarity of the CRISPR repeats and the tracrRNAs.

[0033] Figure 2 shows the total number of colonies and ratio of green phenotypes obtained after NZ-0452 editing in Example 4. Editing efficiency (%) is indicated at the top of each shaded bar.

[0034] Figure 3 shows one of the guide plasmids for nuclease 0452 (pKJIT001).

[0035] Figure 4 shows the nuclease 0452 expression plasmid (pKJIT016).

[0036] Figure 5 shows the ratio of white/black phenotypes obtained after nuclease 0452 transformations at 34°C in Aspergillus niger in Example 6.

[0037] Figure 6 shows the colony number and ratio of white/black phenotypes obtained after nuclease 0452 transformations in A. niger in Example 6.

[0038] Figure 7 shows the sequenced protospacer regions of white spore colonies after transformation by plasmids targeting 0452-PS1, aligned to the wild type fwnA sequence in Example 6; the sequences shown are provided in SEQ ID NOs: 114-118.

[0039] Figure 8 shows the sequenced protospacer regions of white spore colonies after transformation by plasmids targeting 0452-PS4, aligned to the wild type fwnA sequence in Example 6; the sequences shown are provided in SEQ ID NOs: 119-123.

[0040] Definitions

[0041] In accordance with this detailed description, the following definitions apply. Note that the singular forms "a," "an," and "the" include plural references unless the context clearly dictates otherwise. Unless defined otherwise or clearly indicated by context, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

[0042] Base-editing polypeptide: The term “base-editing polypeptide” means a polypeptide comprising a base editor domain capable of chemically altering the target DNA sequence without introducing a double-strand break. Non-limiting examples for a base-editing polypeptide include a deaminase, e.g., a cytidine deaminase, and an adenosine deaminase. The base-editing polypeptide may be fused to an inactivated Cas nuclease of the invention to enable single-nucleotide changes to a specific DNA target sequence. Using a base-editing polypeptide, single nucleotide polymorphisms (SNPs) can be introduced at the DNA target site without generating a double-strand break.

[0043] Catalytic domain: The term “catalytic domain” means the region of an enzyme containing the catalytic machinery of the enzyme. The catalytic domain of a Cas nuclease comprises an HNH domain, and one or more RuvC domain(s).

[0044] Catalytically dead nuclease: The term “catalytically dead nuclease”, “catalytically inactive nuclease” or “dCas” means a mutated Cas nuclease which has reduced, no, or substantially no cleavage activity. The catalytically dead nuclease has reduced, no, or substantially no cleavage activity for single- and double-strand DNA. In other words, the dCas may not cleave either strand of a target DNA.

[0045] For example, the catalytically dead nuclease may comprise an inactivated RuvC domain and an inactivated HNH domain. In some embodiments, the Cas nuclease is a catalytically dead nuclease, e.g., having substantially no nuclease activity, e.g., no more than 5 percent nuclease activity as compared with a wild-type Cas nuclease not having an inactivated RuvC domain and not having an inactivated HNH domain.

[0046] As a non-limiting example, in some cases, the dCas harbors both the D10A and the H840A mutations in S. pyogenes Cas9, or corresponding mutations in any Cas nuclease of the invention. Additional suitable nuclease-inactive dCas can be apparent to those of skill in the art based on this disclosure and knowledge in the field, and are within the scope of this disclosure. Such additional exemplary suitable dCas include, but are not limited to, D10A/H840A, D10A/D839A/H840A, and D10A/D839A/H840A/N863A mutant domains (See, e.g., Prashant et al., Nature Biotechnology. 2013; 31(9): 833-838, the entire contents of which are incorporated herein by reference).

cDNA: The term "cDNA" means a DNA molecule that can be prepared by reverse transcription from a mature, spliced, mRNA molecule obtained from a eukaryotic or prokaryotic cell. cDNA lacks intron sequences that may be present in the corresponding genomic DNA. The initial, primary RNA transcript is a precursor to mRNA that is processed through a series of steps, including splicing, before appearing as mature, spliced mRNA.

Coding sequence: The term “coding sequence” means a polynucleotide, which directly specifies the amino acid sequence of a polypeptide. The boundaries of the coding sequence are generally determined by an open reading frame, which begins with a start codon, such as ATG, GTG, or TTG, and ends with a stop codon, such as TAA, TAG, or TGA. The coding sequence may be a genomic DNA, cDNA, synthetic DNA, or a combination thereof.
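The open-reading-frame boundaries described in the coding sequence definition can be illustrated with a minimal Python sketch. It scans the three forward reading frames of a DNA string for a start codon (ATG, GTG, or TTG) followed in frame by a stop codon (TAA, TAG, or TGA). The function name `find_orfs`, the `min_codons` parameter, and the decision to scan only the given strand are assumptions made for illustration; they are not part of the disclosure.

```python
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) tuples for ORFs on the given strand.

    `end` is exclusive and the reported sequence includes the stop codon.
    Only the three forward frames are scanned (illustrative simplification).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i  # first start codon opens the reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAACCCTAAGG"))
```

A real annotation pipeline would also scan the reverse complement and apply a much larger minimum length; the sketch only shows how start and stop codons delimit a coding sequence.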

[0047] Codon-optimized gene: The term "codon-optimized gene" means a gene having its frequency of codon usage optimized to the frequency of preferred codon usage of a host cell. The nucleic acid changes made to codon-optimize a gene do not change the amino acid sequence of the encoded polypeptide of the parent gene.
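The key property stated above, that codon optimization changes the nucleic acid sequence but not the encoded amino acid sequence, can be checked with a short Python sketch. The codon table is the standard genetic code; the `preferred` mapping in the usage example is a made-up host preference used purely for illustration, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence (length must be a multiple of 3)."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds, preferred):
    """Swap each codon for the host-preferred synonymous codon, if one is given.

    `preferred` maps a one-letter amino acid to a codon (hypothetical host data).
    """
    cds = cds.upper()
    protein = translate(cds)
    out = []
    for i, aa in enumerate(protein):
        codon = preferred.get(aa, cds[3 * i:3 * i + 3])
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
        out.append(codon)
    optimized = "".join(out)
    # The encoded polypeptide must be unchanged.
    assert translate(optimized) == protein
    return optimized


if __name__ == "__main__":
    hypothetical_preference = {"K": "AAG", "G": "GGC"}
    print(codon_optimize("ATGAAAGGTTAA", hypothetical_preference))
```

Real codon optimization draws on measured codon usage tables for the host and usually weighs further constraints (for example GC content or restriction sites), none of which are modelled here.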

[0048] Control sequences: The term “control sequences” means nucleic acid sequences involved in regulation of expression of a polynucleotide in a specific organism or in vitro. Each control sequence may be native (i.e., from the same gene) or heterologous (i.e., from a different gene) to the polynucleotide encoding the polypeptide, and native or heterologous to each other. Such control sequences include, but are not limited to leader peptide, polyadenylation signal, prepeptide, propeptide, signal peptide, promoter, terminator, enhancer, and transcription or translation initiator and terminator sequences. At a minimum, the control sequences include a promoter, and transcriptional and translational stop signals. The control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the polynucleotide encoding a polypeptide.

[0049] Cas nuclease: The term “Cas nuclease” means an RNA-guided DNA endonuclease associated with CRISPR which is capable of cleaving a target DNA sequence when coupled with a guide RNA. The Cas nuclease is guided by guide RNA(s) to recognize and cleave a specific target site in double-stranded DNA in the genome of a cell. CRISPR-Cas systems are currently classified into 2 classes, 6 types, and 33 subtypes (Makarova et al., 2020, Nat Rev Microbiol 18: 67-83).

[0050] In one embodiment, the Cas nuclease is a Class 2 Type II-A CRISPR-Cas system employing the Cas nuclease of SEQ ID NO: 1 or variant thereof (including, for example, a CRISPR nickase). Typically, Cas nucleases of type II comprise two nuclease domains, an HNH nuclease domain that cleaves the complementary DNA strand, and a RuvC-like nuclease domain that cleaves the non-complementary DNA strand.

[0051] Target recognition and cleavage by the Cas nuclease requires a chimeric RNA consisting of a fusion of crRNA (comprising a guide sequence and a partial direct repeat) and tracrRNA (trans-activating crRNA) and a short, conserved sequence motif downstream of the crRNA binding region, called a protospacer adjacent motif (PAM). In one embodiment, target recognition and cleavage by the Cas nuclease takes place with separate crRNA molecule and separate tracrRNA molecule, i.e., where the crRNA is not fused to the tracrRNA. In one embodiment, the Cas nuclease (e.g., SEQ ID NO: 1) derived from the bacterium Lachnospiraceae bacterium A10 targets the target DNA immediately adjacent to a 5’-nRARK PAM sequence.

[0052] In one embodiment, the Cas nuclease (e.g., SEQ ID NO: 1) targets the target DNA immediately adjacent to a 5’-nRAGK PAM sequence.

[0053] In one embodiment, the Cas nuclease (e.g., SEQ ID NO: 1) targets the target DNA immediately adjacent to a 5’-nVDRK PAM sequence.
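The PAM motifs named above are written with IUPAC nucleotide codes (n = any base, R = A or G, K = G or T, V = A, C or G, D = A, G or T). The following Python sketch shows one way such a degenerate motif could be expanded into a regular expression and used to list candidate target sites. It assumes, for illustration only, a 20-nucleotide protospacer with the PAM immediately 3' of it on the scanned strand, and it scans only the strand given; the function names are hypothetical.

```python
import re

# IUPAC nucleotide codes needed for the PAM motifs in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "K": "[GT]", "V": "[ACG]", "D": "[AGT]", "N": "[ACGT]",
}


def pam_to_regex(pam):
    """Expand a degenerate PAM such as 'nRARK' into a regex fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_target_sites(seq, pam, protospacer_len=20):
    """Return (position, protospacer, matched PAM) for every site on `seq`.

    A zero-width lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(
        rf"(?=([ACGT]{{{protospacer_len}}})({pam_to_regex(pam)}))"
    )
    return [
        (m.start(), m.group(1), m.group(2))
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    demo = "ACGTACGTACGTACGTACGT" + "TGAGG"  # arbitrary demo sequence
    for pam in ("nGG", "nRARK", "nRAGK", "nVDRK"):
        print(pam, find_target_sites(demo, pam))
```

To survey a genome, the same scan would also be run on the reverse complement; that step is omitted to keep the sketch short.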

[0054] The RNA-guided Cas nuclease activity creates site-specific double strand breaks, which are then repaired by either non-homologous end joining (NHEJ) or homology-directed repair (HDR). It is understood that the term "Cas nuclease" encompasses variants thereof.

crRNA: CRISPR RNA (crRNA) serves as the molecular guide for Cas nucleases, providing sequence specificity for the Cas nuclease to target and/or edit and/or regulate specific DNA and/or RNA sequences. The crRNA sequence comprises spacers, which recognize a distinct DNA sequence (protospacer). The crRNA associates with the Cas nuclease and creates a ribonucleoprotein complex called the CRISPR-Cas effector complex. Due to the match of the spacer to the complementary target DNA sequence, the Cas nuclease introduces a DNA break at the target site. The crRNA can be reprogrammed, allowing precise and customizable genome editing.

[0055] DNA target site: The term “target sequence”, “target site”, or “DNA target site” means one or more DNA (e.g., genomic DNA) or RNA target sequence of interest that may be subject to a single-strand cut, or double-strand cut by the Cas nuclease, and / or induced or repressed by the Cas nuclease. Typically, the target site (protospacer sequence) is at least 15-20 nucleotides in length in order to allow its hybridization to the corresponding spacer sequence of the guide RNA. The target site can be located anywhere in the genome but will often be within a coding sequence or open reading frame. Non-limiting examples for target sites include genes, promoters, and other regulatory sequences such as enhancers, silencers, insulators, splicing sites, and untranslated regions (UTRs), including UTRs of genes and 5’-UTRs.

[0056] Preferably, the target site of interest is flanked by a functional PAM sequence for the selected Cas nuclease.

[0057] Expression: The term “expression” means any step involved in the production of a polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, folding of the translated polypeptide into a functional structure, post-translational modification, and secretion.

[0058] Expression vector: An "expression vector" refers to a linear or circular DNA construct comprising a DNA sequence encoding a polypeptide, which coding sequence is operably linked to a suitable control sequence capable of effecting expression of the DNA in a suitable host. Such control sequences may include a promoter to effect transcription, an optional operator sequence to control transcription, a sequence encoding suitable ribosome binding sites on the mRNA, enhancers and sequences which control termination of transcription and translation.

[0059] Extension: The term “extension” means an addition of one or more amino acids to the amino and/or carboxyl terminus of a polypeptide, wherein the “extended” polypeptide has nuclease activity and/or DNA-binding activity.

[0060] Fragment: The term “fragment” means a polypeptide, a catalytic domain, or a DNA-binding module having one or more amino acids absent from the amino and/or carboxyl terminus of the mature polypeptide, catalytic domain, or binding module, wherein the fragment has nuclease or DNA-binding activity.

[0061] Fusion polypeptide: The term “fusion polypeptide” is a polypeptide in which one or more polypeptides are fused at the N-terminus and/or the C-terminus of a Cas nuclease of the present invention. A fusion polypeptide is produced by fusing a polynucleotide encoding another polypeptide to a polynucleotide of the present invention, or by fusing two or more polynucleotides of the present invention together. Techniques for producing fusion polypeptides are known in the art, and include ligating the coding sequences encoding the polypeptides so that they are in frame and that expression of the fusion polypeptide is under control of the same promoter(s) and terminator. Fusion polypeptides may also be constructed using intein technology in which fusion polypeptides are created post-translationally (Cooper et al., 1993, EMBO J. 12: 2575-2583; Dawson et al., 1994, Science 266: 776-779).

[0062] Genomic DNA: As used herein, “genomic DNA” refers to linear and/or chromosomal DNA and/or to plasmid or other extrachromosomal DNA sequences present in the cell or cells of interest. In some embodiments, the cell of interest is a eukaryotic cell. In some embodiments, the cell of interest is a prokaryotic cell. In some embodiments, the methods produce double-stranded breaks (DSBs) at pre-determined target sites in a genomic DNA sequence, resulting in mutation, insertion, and/or deletion of DNA sequences at the target site(s) in a genome.

[0063] Guide sequence portion: The “guide sequence portion” or “spacer” of an RNA molecule refers to a nucleotide sequence that is capable of hybridizing to a specific target DNA sequence (protospacer), e.g., the guide sequence portion has a nucleotide sequence which is partially or fully complementary to the DNA sequence being targeted along the length of the guide sequence portion. In some embodiments, the guide sequence portion is 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49 or 50 nucleotides in length, preferably at least 18 nucleotides in length, such as at least 23 nucleotides in length, or approximately 17-50, 17-49, 17-48, 17-47, 17-46, 17-45, 17-44, 17-43, 17-42, 17-41, 17-40, 17-39, 17-38, 17-37, 17-36, 17-35, 17-34, 17-33, 17-31, 17-30, 17-29, 17-28, 17-27, 17-26, 17-25, 17-24, 17-22, 17-21, 18-25, 18-24, 18-23, 18-22, 18-21, 19-25, 19-24, 19-23, 19-22, 19-21, 19-20, 20-22, 18-20, 20-21, 21-22, or 17-20 nucleotides in length. The entire length of the guide sequence portion is fully complementary to the DNA sequence being targeted along the length of the guide sequence portion. The guide sequence portion may be part of an RNA molecule that can form a complex with a Cas nuclease with the guide sequence portion serving as the DNA targeting portion of the CRISPR complex. When the RNA molecule having the guide sequence portion is present contemporaneously with the CRISPR molecule, the RNA molecule is capable of targeting the Cas nuclease to the specific target DNA or RNA sequence. Each possibility represents a separate embodiment. An RNA molecule can be custom designed to target any desired sequence. Accordingly, a molecule comprising a “guide sequence portion” is a type of targeting molecule. Throughout this application, the terms “guide molecule,” “RNA guide molecule,” “guide RNA molecule,” and “gRNA molecule” are synonymous with a molecule comprising a guide sequence portion, and the term “spacer” is synonymous with a “guide sequence portion.”
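As a rough illustration of the complementarity and length requirements in this definition, the Python sketch below derives an RNA guide sequence portion that is fully complementary to a given DNA strand and reports the fraction of paired positions for a candidate spacer. It assumes the input is the strand the guide hybridizes to, written 5' to 3', and simple Watson-Crick pairing; the function names and the default 17-50 nucleotide bounds (taken from the lengths listed above) are illustrative choices, not a prescribed design method.

```python
# Watson-Crick partner (as RNA) for each DNA base.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def design_spacer(target_strand, min_len=17, max_len=50):
    """Return the RNA spacer (5'->3') fully complementary to `target_strand`.

    `target_strand` is the DNA strand the guide hybridizes to, written 5'->3'
    (an assumption of this sketch).
    """
    target_strand = target_strand.upper()
    if not min_len <= len(target_strand) <= max_len:
        raise ValueError(f"length must be {min_len}-{max_len} nucleotides")
    return "".join(DNA_TO_RNA_COMPLEMENT[b] for b in reversed(target_strand))


def complementarity(spacer_rna, target_strand):
    """Fraction of spacer positions that pair with the antiparallel target."""
    if len(spacer_rna) != len(target_strand):
        raise ValueError("spacer and target must have the same length")
    pairs = zip(spacer_rna.upper(), reversed(target_strand.upper()))
    matches = sum(DNA_TO_RNA_COMPLEMENT[dna] == rna for rna, dna in pairs)
    return matches / len(spacer_rna)


if __name__ == "__main__":
    target = "GATTACA" * 3  # arbitrary 21-nt demo sequence
    spacer = design_spacer(target)
    print(spacer, complementarity(spacer, target))
```

A value of 1.0 from `complementarity` corresponds to the "fully complementary" case; lower values correspond to partial complementarity.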

[0064] In embodiments of the present invention, the Cas nuclease has its greatest cleavage activity when used with an RNA molecule comprising a guide sequence portion having 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides.

[0065] A single-guide RNA (sgRNA) molecule may be used to direct a Cas nuclease to a desired target site. The single-guide RNA comprises a guide sequence portion as well as a scaffold portion. The scaffold portion interacts with a Cas nuclease and, together with a guide sequence portion, activates and targets the Cas nuclease to a desired target site. A scaffold portion may be further engineered, for example, to have a reduced size.

[0066] The gRNA in CRISPR-Cas genome editing constitutes the re-programmable part that makes the system so versatile. In most natural Cas systems, the gRNA is actually a complex of two RNA polynucleotides, a first crRNA containing about 15-30 nucleotides that determine the specificity of the Cas nuclease and the tracrRNA which hybridizes to the crRNA to form an RNA complex that interacts with Cas nuclease (see Jinek et al., 2012, A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity, Science 337: 816-821).

[0067] Since the discovery of the CRISPR-Cas system, single polynucleotide gRNAs have been developed and successfully applied just as effectively as the natural two-part gRNA complex.

[0068] The spacer may be part of a targeting guide RNA molecule that can form a complex with a Cas nuclease with the spacer sequence serving as the targeting portion of the CRISPR complex. When the molecule having the spacer sequence is present contemporaneously with the CRISPR molecule, the RNA molecule is capable of targeting the Cas nuclease to the specific target sequence. Each possibility represents a separate embodiment. A targeting RNA molecule can be custom designed to target any desired sequence.

[0069] The term “targets” as used herein, refers to preferential hybridization of a spacer sequence to a nucleic acid having a targeted nucleotide sequence (protospacer). It is understood that the term “targets” encompasses variable hybridization efficiencies, such that there is preferential targeting of the nucleic acid having the targeted nucleotide sequence, but unintentional off-target hybridization in addition to on-target hybridization might also occur. It is understood that where an RNA molecule targets a sequence, a complex of the RNA molecule and a Cas nuclease molecule targets the sequence for nuclease activity.

[0070] In the context of targeting a DNA sequence that is present in a plurality of cells, it is understood that the targeting encompasses hybridization of the guide sequence portion of the RNA molecule with the sequence in one or more of the cells, and also encompasses hybridization of the RNA molecule with the target sequence in fewer than all of the cells in the plurality of cells. Accordingly, it is understood that where an RNA molecule targets a sequence in a plurality of cells, a complex of the RNA molecule and a Cas nuclease is understood to hybridize with the target sequence in one or more of the cells, and also may hybridize with the target sequence in fewer than all of the cells. Accordingly, it is understood that the complex of the RNA molecule and the Cas nuclease introduces a double strand break in relation to hybridization with the target sequence in one or more cells and may also introduce a double strand break in relation to hybridization with the target sequence in fewer than all of the cells. As used herein, the term “modified cells” or “cell comprising a genome modified by the Cas nuclease” refers to cells in which a double strand break is effected by a complex of an RNA molecule and the Cas nuclease as a result of hybridization with the target sequence, i.e., on-target hybridization.

[0071] Heterologous: The term "heterologous" means, with respect to a host cell, that a polypeptide or nucleic acid does not naturally occur in the host cell. The term "heterologous" means, with respect to a polypeptide or nucleic acid, that a control sequence, e.g., promoter, of a polypeptide or nucleic acid is not naturally associated with the polypeptide or nucleic acid, i.e., the control sequence is from a gene other than the gene encoding the mature polypeptide.

[0072] HNH sequence: The HNH sequence comprises the HNH domain of a Cas nuclease. The name "HNH" stands for "Histidine-Asparagine-Histidine", the conserved amino acid residues that play a crucial role in the nuclease activity of this domain. The HNH domain is one of the two main types of nuclease domains found in CRISPR-associated (Cas) proteins. In the context of CRISPR systems, the HNH domain is responsible for cleaving the DNA strand that is complementary to the RNA guide strand, thereby creating a cut in the target DNA. This break is a key step in the CRISPR-Cas gene editing process, allowing for precise DNA modification.

[0073] Together with the RuvC domain, the HNH domain creates a double-strand break in the target DNA, allowing for gene editing or modification.

[0074] Host Strain or Host Cell: A "host strain" or "host cell" is an organism into which an expression vector, phage, virus, or other DNA construct, including a polynucleotide encoding a polypeptide of the present invention, has been introduced. Exemplary host strains are microorganism cells (e.g., bacteria, filamentous fungi, and yeast) capable of expressing the Cas nuclease. The term "host cell" includes protoplasts created from cells.

Introduced: The term "introduced", in the context of inserting a nucleic acid sequence into a cell, means "transfection", "transformation" or "transduction," as known in the art.

[0075] Isolated: The term “isolated” means a polypeptide, nucleic acid, cell, or other specified material or component that has been separated from at least one other material or component, including but not limited to, other proteins, nucleic acids, cells, etc. An isolated polypeptide, nucleic acid, cell or other material is thus in a form that does not occur in nature. An isolated polypeptide includes, but is not limited to, a culture broth containing the secreted polypeptide expressed in a host cell.

[0076] Mature polypeptide: The term “mature polypeptide” means a polypeptide in its mature form following N-terminal and / or C-terminal processing (e.g., removal of signal peptide).

[0077] Mature polypeptide coding sequence: The term “mature polypeptide coding sequence” means a polynucleotide that encodes a mature Cas nuclease having nuclease activity and / or DNA binding activity.

[0078] Native: The term "native" means a nucleic acid or polypeptide naturally occurring in a host cell.

[0079] Nickase: The term “Nickase”, “CRISPR nickase”, or “nCas” means a nuclease having an inactivated RuvC domain, or an inactivated HNH domain. It is understood that a Cas nuclease, rather than losing nuclease activity to cleave all DNA, may lose the ability to cleave only the target strand or only the non-target strand of a double-stranded DNA, thereby being functional as a nickase (see, Gao et al. (2016) CELL RES., 26: 901). Accordingly, in certain embodiments, a Cas nuclease is an nCas. In certain embodiments, an nCas has the activity to cleave the non-complementary strand but substantially lacks the activity to cleave the complementary strand, e.g., by a mutation in the HNH domain. For example, an nCas can have a mutation that reduces the function of the HNH domain, such as an H840A mutation in S. pyogenes Cas9 or a corresponding mutation in any Cas nuclease of the invention. In certain embodiments, an nCas has the activity to cleave the complementary strand but substantially lacks the activity to cleave the non-complementary strand, e.g., by a mutation in the RuvC domain. For example, the nCas can have a mutation that reduces the function of the RuvC domain, such as a D10A mutation in S. pyogenes Cas9 or a corresponding mutation in any Cas nuclease of the invention.
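The relationship between domain inactivation and strand cleavage described in this definition can be summarized in a short sketch. It is illustrative only: the names `DOMAIN_TO_STRAND`, `cleaved_strands`, and `classify`, as well as the string labels, are hypothetical and do not appear in the disclosure. The mapping itself follows the definitions given herein (the HNH domain cleaves the strand complementary to the guide RNA, the RuvC domain cleaves the other strand).

```python
# Illustrative sketch only; all names and labels here are hypothetical.
# Mapping taken from the definitions in this section: the HNH domain cleaves
# the strand complementary to the guide RNA, the RuvC domain the other strand.
DOMAIN_TO_STRAND = {
    "HNH": "complementary",
    "RuvC": "non-complementary",
}


def cleaved_strands(inactivated_domains):
    """Return the set of strands still cleaved after inactivating the given domains."""
    unknown = set(inactivated_domains) - set(DOMAIN_TO_STRAND)
    if unknown:
        raise ValueError(f"unknown domain(s): {sorted(unknown)}")
    return {
        strand
        for domain, strand in DOMAIN_TO_STRAND.items()
        if domain not in inactivated_domains
    }


def classify(inactivated_domains):
    """Classify an enzyme by the number of strands it can still cut."""
    labels = {
        2: "nuclease (double-strand break)",
        1: "nickase (nCas)",
        0: "no strand cleaved",
    }
    return labels[len(cleaved_strands(inactivated_domains))]


# An HNH-inactivating mutation (e.g., H840A in S. pyogenes Cas9) leaves RuvC active.
print(classify({"HNH"}), cleaved_strands({"HNH"}))
# A RuvC-inactivating mutation (e.g., D10A in S. pyogenes Cas9) leaves HNH active.
print(classify({"RuvC"}), cleaved_strands({"RuvC"}))
```

The sketch treats each domain as either fully active or fully inactive; a mutation that only reduces domain function is not modeled.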

[0080] Nuclear Localization Sequence: The terms "nuclear localization sequence" and "NLS" are used interchangeably to indicate an amino acid sequence / peptide that directs the transport of a protein with which it is associated from the cytoplasm of a cell across the nuclear envelope barrier. The term "NLS" is intended to encompass not only the nuclear localization sequence of a particular peptide, but also derivatives thereof that are capable of directing translocation of a cytoplasmic polypeptide across the nuclear envelope barrier. NLSs are capable of directing nuclear translocation of a Cas nuclease when attached to the N-terminus, the C-terminus, or both the N- and C-termini of the Cas nuclease. In addition, a polypeptide having an NLS coupled by its N- or C-terminus to amino acid side chains located randomly along the amino acid sequence of the polypeptide will be translocated. Typically, an NLS consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface, but other types of NLS are known. Non-limiting examples of NLSs include an NLS sequence derived from: the SV40 virus large T-antigen, nucleoplasmin, c-myc, the hnRNPA1 M9 NLS, the IBB domain from importin-alpha, myoma T protein, human p53, mouse c-abl IV, influenza virus NS1, Hepatitis virus delta antigen, mouse Mx1 protein, human poly(ADP-ribose) polymerase, and the steroid hormone receptors (human) glucocorticoid.

[0081] Nucleic acid: The term "nucleic acid" encompasses DNA, RNA, heteroduplexes, and synthetic molecules capable of encoding a polypeptide. Nucleic acids may be single-stranded or double-stranded, and may have chemical modifications. The terms "nucleic acid" and "polynucleotide" are used interchangeably. Because the genetic code is degenerate, more than one codon may be used to encode a particular amino acid, and the present compositions and methods encompass nucleotide sequences that encode a particular amino acid sequence. Unless otherwise indicated, nucleic acid sequences are presented in 5'-to-3' orientation.
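As a worked illustration of the degeneracy noted in this definition, the following sketch counts how many distinct nucleotide sequences encode a given amino acid sequence. It assumes the standard genetic code and ignores the stop codon; the function name `count_coding_sequences` is hypothetical and chosen for illustration only.

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the three stop codons are not included).
CODONS_PER_AMINO_ACID = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_coding_sequences(peptide):
    """Count the distinct nucleotide sequences that encode `peptide`."""
    total = 1
    for residue in peptide.upper():
        # Each residue contributes an independent choice among its codons.
        total *= CODONS_PER_AMINO_ACID[residue]
    return total


# Met (1 codon) x Lys (2 codons) x Arg (6 codons) = 12 coding sequences.
print(count_coding_sequences("MKR"))
```

Because the count grows multiplicatively with length, even a short polypeptide is encoded by a very large number of nucleotide sequences, which is why the definition covers all of them.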

[0082] Nucleic acid construct: The term "nucleic acid construct" means a nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally occurring gene or is modified to contain segments of nucleic acids in a manner that would not otherwise exist in nature or which is synthetic, and which comprises one or more control sequences operably linked to the nucleic acid sequence.

[0083] Operably linked: The term "operably linked" means that specified components are in a relationship (including but not limited to juxtaposition) permitting them to function in an intended manner. For example, a regulatory sequence is operably linked to a coding sequence such that expression of the coding sequence is under control of the regulatory sequence.

[0084] PAM: The term “PAM” or “protospacer adjacent motif” as used herein refers to a nucleotide sequence of a target DNA located in proximity to the targeted DNA sequence (protospacer) and recognized by the Cas nuclease, i.e., by the guide RNA forming a complex with the Cas nuclease and the target DNA. The PAM sequence may differ depending on the Cas nuclease identity. In some instances, a PAM is required for a complex of a Cas nuclease and a guide RNA to hybridize to and edit the target sequence. In some instances, the complex does not require a PAM to edit the target sequence.

[0085] Commonly accepted abbreviations that are used in the art as well as herein to represent ambiguity in nucleotide bases of the PAM include the following: R=G or A; Y=C or T; M=A or C; K=G or T; S=G or C; W=A or T; H=A or C or T; B=G or T or C; V=G or C or A; D=G or A or T; N=A or C or G or T. Non-limiting examples of suitable PAM sequences for the Cas nucleases of the present invention are shown in Table 1.

Purified: The term "purified" means a nucleic acid, polypeptide or cell that is substantially free from other components as determined by analytical techniques well known in the art (e.g., a purified polypeptide or nucleic acid may form a discrete band in an electrophoretic gel, chromatographic eluate, and / or a media subjected to density gradient centrifugation). A purified nucleic acid or polypeptide is at least about 50% pure, usually at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, about 99.5%, about 99.6%, about 99.7%, about 99.8% or more pure (e.g., percent by weight or on a molar basis). In a related sense, a composition is enriched for a molecule when there is a substantial increase in the concentration of the molecule after application of a purification or enrichment technique. The term "enriched" refers to a compound, polypeptide, cell, nucleic acid, amino acid, or other specified material or component that is present in a composition at a relative or absolute concentration that is higher than a starting composition.

[0086] In one aspect, the term "purified" as used herein refers to the polypeptide or cell being essentially free from components (especially insoluble components) from the production organism. In other aspects, the term "purified" refers to the Cas nuclease being essentially free of insoluble components from the native organism from which it is obtained. In one aspect, the Cas nuclease is separated from some of the soluble components of the organism and culture medium from which it is recovered. The polypeptide may be purified (i.e., separated) by one or more of the unit operations filtration, precipitation, or chromatography.

[0087] Accordingly, the Cas nuclease may be purified such that only minor amounts of other proteins, in particular, other polypeptides, are present. The term "purified" as used herein may refer to removal of other components, particularly other proteins and most particularly other enzymes present in the cell of origin of the polypeptide. The Cas nuclease may be "substantially pure", i.e., free from other components from the organism in which it is produced, e.g., a host organism for recombinantly produced polypeptide. In one aspect, the polypeptide is at least 40% pure by weight of the total polypeptide material present in the preparation. In one aspect, the polypeptide is at least 50%, 60%, 70%, 80% or 90% pure by weight of the total polypeptide material present in the preparation. As used herein a "substantially pure polypeptide" may denote a Cas nuclease preparation that contains at most 10%, preferably at most 8%, more preferably at most 6%, more preferably at most 5%, more preferably at most 4%, more preferably at most 3%, even more preferably at most 2%, most preferably at most 1%, and even most preferably at most 0.5% by weight of other polypeptide material with which the Cas nuclease is natively or recombinantly associated.

[0088] It is, therefore, preferred that the substantially pure Cas nuclease or fusion polypeptide is at least 92% pure, preferably at least 94% pure, more preferably at least 95% pure, more preferably at least 96% pure, more preferably at least 97% pure, more preferably at least 98% pure, even more preferably at least 99% pure, most preferably at least 99.5% pure by weight of the total polypeptide material present in the preparation. The polypeptide of the present invention is preferably in a substantially pure form (i.e., the preparation is essentially free of other polypeptide material with which it is natively or recombinantly associated). This can be accomplished, for example, by preparing the polypeptide by well-known recombinant methods or by classical purification methods.

[0089] Recombinant: The term "recombinant" is used in its conventional meaning to refer to the manipulation, e.g., cutting and rejoining, of nucleic acid sequences to form constellations different from those found in nature. The term recombinant refers to a cell, nucleic acid, polypeptide or vector that has been modified from its native state. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell, or express native genes at different levels or under different conditions than found in nature. The term “recombinant” is synonymous with “genetically modified” and “transgenic”.

[0090] Recover: The terms "recover" and “recovery” mean the removal of a polypeptide from at least one fermentation broth component selected from the list of a cell, a nucleic acid, or other specified material, e.g., recovery of the polypeptide from the whole fermentation broth, or from the cell-free fermentation broth, by polypeptide crystal harvest, by filtration, e.g., depth filtration (by use of filter aids or packed filter medias, cloth filtration in chamber filters, rotary-drum filtration, drum filtration, rotary vacuum-drum filters, candle filters, horizontal leaf filters or similar, using sheet or pad filtration in framed or modular setups) or membrane filtration (using sheet filtration, module filtration, candle filtration, microfiltration, ultrafiltration in either cross flow, dynamic cross flow or dead end operation), or by centrifugation (using decanter centrifuges, disc stack centrifuges, hydro cyclones or similar), or by precipitating the polypeptide and using relevant solid-liquid separation methods to harvest the polypeptide from the broth media by use of classification separation by particle sizes. Recovery encompasses isolation and / or purification of the polypeptide.

[0091] Reverse transcriptase: The term "reverse transcriptase" or “RT” describes a class of polymerases characterized as RNA-dependent DNA polymerases. All known reverse transcriptases require a primer to synthesize a DNA transcript from an RNA template. Historically, reverse transcriptase has been used primarily to transcribe mRNA into cDNA which can then be cloned into a vector for further manipulation. Avian myeloblastosis virus (AMV) reverse transcriptase was the first widely used RNA-dependent DNA polymerase (Verma, Biochim. Biophys. Acta 473:1 (1977)). The enzyme has 5'-3' RNA-directed DNA polymerase activity, 5'-3' DNA-directed DNA polymerase activity, and RNase H activity. RNase H is a processive 5' and 3' ribonuclease specific for the RNA strand of RNA-DNA hybrids (Perbal, A Practical Guide to Molecular Cloning, New York: Wiley & Sons (1984)). Errors in transcription cannot be corrected by reverse transcriptase because known viral reverse transcriptases lack the 3'-5' exonuclease activity necessary for proofreading (Saunders and Saunders, Microbial Genetics Applied to Biotechnology, London: Croom Helm (1987)). A detailed study of the activity of AMV reverse transcriptase and its associated RNase H activity has been presented by Berger et al., Biochemistry 22:2365-2372 (1983). Another reverse transcriptase which is used extensively in molecular biology is reverse transcriptase originating from Moloney murine leukemia virus (M-MLV). See, e.g., Gerard, G. R., DNA 5:271-279 (1986) and Kotewicz, M. L., et al., Gene 35:249-258 (1985). M-MLV reverse transcriptase substantially lacking in RNase H activity has also been described. See, e.g., U.S. Pat. No. 5,244,797.

[0092] Reverse transcriptase, when fused with a Cas nuclease, offers a versatile and precise approach to gene editing. It allows for the conversion of RNA targets into DNA, enhancing the specificity and accuracy of Cas-mediated gene editing while enabling simultaneous manipulation of both DNA and RNA molecules for a wide range of applications in genetics and molecular biology.

[0093] The disclosure contemplates any wild-type RT obtained from any naturally occurring organism or virus, or obtained from a commercial or non-commercial source. In addition, the reverse transcriptases usable in the fusion polypeptides of the disclosure can include any naturally occurring mutant RT, engineered mutant RT, or other variant RT, including truncated variants that retain function. The RTs may also be engineered to contain specific amino acid substitutions, such as those specifically disclosed herein.

[0094] RTs are multi-functional enzymes typically with three enzymatic activities including RNA- and DNA-dependent DNA polymerization activity, and an RNaseH activity that catalyzes the cleavage of RNA in RNA-DNA hybrids. Some mutants of RTs have disabled the RNaseH moiety to prevent unintended damage to the mRNA. These enzymes that synthesize complementary DNA (cDNA) using mRNA as a template were first identified in RNA viruses.

[0095] Exemplary enzymes for use with the herein disclosed fusion polypeptide can include, but are not limited to, M-MLV reverse transcriptase and RSV reverse transcriptase. Enzymes having RT activity are commercially available. Some exemplary reverse transcriptases that can be fused to CRISPR nucleases or provided as individual proteins according to various embodiments of this disclosure are provided below.

[0096] A person of ordinary skill in the art will recognize that wild-type RTs, including but not limited to, Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase; Human Immunodeficiency Virus (HIV) reverse transcriptase and avian Sarcoma-Leukosis Virus (ASLV) reverse transcriptase, which includes but is not limited to Rous Sarcoma Virus (RSV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, Avian Erythroblastosis Virus (AEV) Helper Virus MCAV reverse transcriptase, Avian Myelocytomatosis Virus MC29 Helper Virus MCAV reverse transcriptase, Avian Reticuloendotheliosis Virus (REV-T) Helper Virus REV-A reverse transcriptase, Avian Sarcoma Virus UR2 Helper Virus UR2AV reverse transcriptase, Avian Sarcoma Virus Y73 Helper Virus YAV reverse transcriptase, Rous Associated Virus (RAV) reverse transcriptase, and Myeloblastosis Associated Virus (MAV) reverse transcriptase may be suitably used in the subject methods and compositions described herein. In some embodiments, the RT may be any RT described in WO 2020/191248, the contents of which are herein incorporated by reference.

[0097] In some embodiments, a suitable reverse transcriptase may be any reverse transcriptase described in WO2020191233, WO2020191243, WO2020191246, WO2020191245, WO2020191234, WO2020191241, US20200085066, US20200109398, WO2020191239, and WO2020191248, the contents of each of which are incorporated herein by reference in their entirety.

[0098] RuvC sequence: In the context of CRISPR-Cas systems, the RuvC sequence comprises or consists of a "RuvC" or "RuvC-like" domain. The domain takes its name from the Escherichia coli RuvC protein, a Holliday junction resolvase, and is responsible for cleaving the DNA strand opposite to the DNA strand which is complementary to the RNA guide strand. Together with the HNH domain, the RuvC domain(s) creates a double-strand break in the target DNA, allowing for gene editing or modification. In some embodiments, the Cas nuclease comprises 3 RuvC domains, e.g., a RuvC I, a RuvC II and a RuvC III domain.

[0099] Sequence identity: The relatedness between two amino acid sequences or between two nucleotide sequences is described by the parameter “sequence identity”.

[0100] For purposes of the present invention, the sequence identity between two amino acid sequences is determined as the output of “longest identity” using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, J. Mol. Biol. 48: 443-453) as implemented in the Needle program of the EMBOSS package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., 2000, Trends Genet. 16: 276-277), preferably version 6.6.0 or later. The parameters used are a gap open penalty of 10, a gap extension penalty of 0.5, and the EBLOSUM62 (EMBOSS version of BLOSUM62) substitution matrix. In order for the Needle program to report the longest identity, the -nobrief option must be specified in the command line. The output of Needle labelled “longest identity” is calculated as follows:

[0101] (Identical Residues x 100) / (Length of Alignment - Total Number of Gaps in Alignment)

[0102] For purposes of the present invention, the sequence identity between two polynucleotide sequences is determined as the output of “longest identity” using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, supra) as implemented in the Needle program of the EMBOSS package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., 2000, supra), preferably version 6.6.0 or later. The parameters used are a gap open penalty of 10, a gap extension penalty of 0.5, and the EDNAFULL (EMBOSS version of NCBI NUC4.4) substitution matrix. In order for the Needle program to report the longest identity, the -nobrief option must be specified in the command line. The output of Needle labelled “longest identity” is calculated as follows:

[0103] (Identical Deoxyribonucleotides x 100) / (Length of Alignment - Total Number of Gaps in Alignment)
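The "longest identity" formula given for amino acid and polynucleotide sequences can be applied to an already-aligned pair of sequences as in the following minimal sketch. It does not perform the Needleman-Wunsch alignment itself; the aligned rows are assumed to come from the Needle program run with the parameters stated in the definition. The function name `longest_identity` and the use of `-` as the gap character are assumptions made for illustration.

```python
def longest_identity(aligned_a, aligned_b, gap="-"):
    """Apply: identical * 100 / (alignment length - total gaps).

    Both arguments are rows of one pairwise alignment, so they must have
    the same length; gaps are marked with the `gap` character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(aligned_a.upper(), aligned_b.upper()))
    # Columns in which either row carries a gap.
    gaps = sum(1 for a, b in columns if a == gap or b == gap)
    # Columns in which both rows carry the same residue or nucleotide.
    identical = sum(1 for a, b in columns if a == b and a != gap)
    denominator = len(columns) - gaps
    if denominator == 0:
        raise ValueError("alignment has no ungapped columns")
    return identical * 100.0 / denominator


# Hypothetical 7-column alignment: 5 identical columns, 1 gap column.
# Identity = 5 * 100 / (7 - 1).
print(round(longest_identity("MKT-LLA", "MKTALIA"), 2))
```

Because gap columns are removed from the denominator, an insertion or deletion does not lower the reported identity by itself; only mismatched columns do.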

[0104] Signal Peptide: A "signal peptide" is a sequence of amino acids attached to the N-terminal portion of a protein, which facilitates the secretion of the protein outside the cell. The mature form of an extracellular protein lacks the signal peptide, which is cleaved off during the secretion process.

[0105] Subsequence: The term “subsequence” means a polynucleotide having one or more nucleotides absent from the 5' and / or 3' end of a mature polypeptide-coding sequence, wherein the subsequence encodes a fragment having nuclease activity.

tracrRNA: trans-activating CRISPR RNA (tracrRNA) is a class of RNA molecules forming an integral component of the CRISPR-Cas system. The tracrRNA serves as a scaffold or scaffold-like molecule that facilitates the binding of Cas nucleases to the CRISPR RNA (crRNA) molecule. In this complex, tracrRNA interacts with the Cas nuclease to form a ribonucleoprotein complex that recognizes and binds to the target DNA or RNA sequence, guiding the Cas nuclease to the precise site for cleavage or editing. Non-limiting examples of tracrRNA coding sequences are listed in column 3 of Table 4.

[0106] Variant: The term “variant” means a Cas nuclease having nuclease and / or DNA-binding activity comprising a man-made mutation, i.e., a substitution, insertion (including extension), and / or deletion (e.g., truncation), at one or more positions. A substitution means replacement of the amino acid occupying a position with a different amino acid; a deletion means removal of the amino acid occupying a position; and an insertion means adding amino acids (e.g., 1-5 amino acids, 1-3 amino acids, or, in particular, 1 amino acid) adjacent to and immediately following the amino acid occupying a position.

[0107] Wild-type: The term "wild-type" in reference to an amino acid sequence or nucleic acid sequence means that the amino acid sequence or nucleic acid sequence is a native or naturally-occurring sequence. As used herein, the term "naturally-occurring" refers to anything (e.g., proteins, amino acids, or nucleic acid sequences) that is found in nature. Conversely, the term "non-naturally occurring" refers to anything that is not found in nature (e.g., compositions produced in the laboratory or during manufacturing, and / or recombinant nucleic acids and protein sequences produced in the laboratory or modification of the wild-type sequence). In embodiments of the present invention, an engineered Cas nuclease is a variant Cas nuclease comprising at least one amino acid modification (e.g., substitution, deletion, and / or insertion) compared to any of the Cas nucleases indicated in column 1 of Table 4.

Detailed Description of the Invention

[0108] Cas nucleases

[0109] In a first aspect, the invention relates to Cas nucleases selected from the group consisting of:

[0110] (a) a polypeptide having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 1;

[0111] (b) a polypeptide encoded by a polynucleotide having at least 80% sequence identity to the polypeptide coding sequence of SEQ ID NO: 2;

[0112] (c) a polypeptide derived from SEQ ID NO: 1 by having 1-30 alterations (e.g., substitutions, deletions and / or insertions at one or more positions, e.g., 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15 or 16 or 17 or 18 or 19 or 20 or 21 or 22 or 23 or 24 or 25 or 26 or 27 or 28 or 29 or 30 alterations), in particular substitutions, such as conservative amino acid substitutions;

[0113] (d) a polypeptide having a TM-score of at least 0.80 compared to the three-dimensional structure of the polypeptide of SEQ ID NO: 1, wherein the three-dimensional structure is calculated using AlphaFold;

[0114] (e) a polypeptide derived from the polypeptide of (a), (b), (c), or (d), wherein the N- and / or C-terminal end has been extended by addition of one or more amino acids; and

[0115] (f) a fragment of the polypeptide of (a), (b), (c), (d), or (e).
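For item (d) above, the following minimal sketch shows how a TM-score is commonly computed for one fixed structural superposition, using the published TM-score definition (Zhang and Skolnick). The disclosure does not state which superposition tool is used, so several points here are assumptions: the function names `tm_score` and `meets_structure_criterion` are hypothetical, the score is normalized by the length of the reference polypeptide, the distance scale is clamped at 0.5 for short chains, and the search over superpositions that a full TM-score calculation performs is omitted.

```python
def tm_score(distances, reference_length):
    """TM-score for a single, already-chosen superposition.

    distances: distances in angstroms between aligned C-alpha atom pairs
    reference_length: number of residues in the reference polypeptide
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    if len(distances) > reference_length:
        raise ValueError("more aligned pairs than reference residues")
    # Length-dependent distance scale d0; clamped at 0.5 (assumption)
    # so that very short chains do not yield a tiny or undefined scale.
    if reference_length > 15:
        d0 = max(0.5, 1.24 * (reference_length - 15) ** (1.0 / 3.0) - 1.8)
    else:
        d0 = 0.5
    # Each aligned pair contributes between 0 and 1; unaligned residues add 0.
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / reference_length


def meets_structure_criterion(distances, reference_length, threshold=0.80):
    """Check the 'TM-score of at least 0.80' criterion of item (d)."""
    return tm_score(distances, reference_length) >= threshold


# Hypothetical case: 100-residue reference, every residue aligned at 1 angstrom.
print(meets_structure_criterion([1.0] * 100, 100))
```

Since a full TM-score calculation reports the maximum over superpositions, the value from any single superposition is a lower bound on the reported score.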

[0116] In one embodiment, the nuclease comprises one or more domains selected from the group consisting of:

[0117] (a) a RuvC domain having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 3, 4, or 5;

[0118] (b) an HNH domain having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 6;

[0119] (c) a RuvC domain derived from SEQ ID NO: 3, 4, or 5 by substitution, deletion or addition of one or several amino acids of SEQ ID NO: 3, 4, or 5;

[0120] (d) an HNH domain derived from SEQ ID NO: 6 by substitution, deletion or addition of one or several amino acids of SEQ ID NO: 6; and

[0121] (e) a fragment of the catalytic domain of (a), (b), (c), or (d).

[0122] In one embodiment, the nuclease has nuclease activity.

[0123] In one embodiment, the nuclease has DNA-binding activity.

[0124] In one embodiment, the nuclease has nuclease activity and DNA-binding activity.

[0125] In one embodiment, the nuclease comprises or consists of an amino acid sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 1.

[0126] In one embodiment, the nuclease comprises, consists essentially of, or consists of SEQ ID NO: 1.

[0127] In one embodiment, the nuclease is a fragment of SEQ ID NO: 1, wherein the fragment preferably contains at least 1000 amino acid residues (e.g., amino acids 1 to 1061 of SEQ ID NO: 1).

[0128] In one embodiment, the nuclease comprises, consists essentially of, or consists of SEQ ID NO: 1.

[0129] In one embodiment, the nuclease is encoded by a polynucleotide having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the mature polypeptide coding sequence of SEQ ID NO: 2.

[0130] In one embodiment, the nuclease comprises an N-terminal extension and / or C-terminal extension of 1-10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids, preferably an extension of 1-10 amino acid residues at the N-terminus and / or 1-10 amino acids at the C-terminus, such as 1-5 amino acids.

[0131] In one embodiment, the nuclease has at most 10%, at most 9%, at most 8%, at most 7%, at most 6%, at most 5%, at most 4%, at most 3%, at most 2% or at most 1% sequence differences to the polypeptide of SEQ ID NO: 1.

[0132] In one embodiment, the nuclease differs from the polypeptide of SEQ ID NO: 1 by at most 15 amino acids, such as at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acids.

[0133] In one embodiment, the nuclease is obtained from or obtainable from a Lachnospiraceae cell.

[0134] In one embodiment, the nuclease is obtained from or obtainable from a Lachnospiraceae bacterium cell, e.g., a Lachnospiraceae bacterium A10 cell.

[0135] In one embodiment, the nuclease comprises one or more functional RuvC domains.

[0136] In one embodiment, the nuclease comprises two or more functional RuvC domains.

[0137] In one embodiment, the nuclease comprises three or more functional RuvC domains.

[0138] In one embodiment, the nuclease comprises one or more functional HNH domains.

[0139] In one embodiment, the nuclease comprises one or more domains selected from the group consisting of:

[0140] (a) a RuvC domain having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 3, 4, or 5;

(b) an HNH domain having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 6;

[0141] (c) a RuvC domain derived from SEQ ID NO: 3, 4, or 5, by substitution, deletion or addition of one or several amino acids of SEQ ID NO: 3, 4, or 5;

[0142] (d) an HNH domain derived from SEQ ID NO: 6, by substitution, deletion or addition of one or several amino acids of SEQ ID NO: 6; and

[0143] (e) a fragment of the catalytic domain of (a), (b), (c), or (d); preferably wherein the nuclease has nuclease activity, or wherein the nuclease has nickase activity.

[0144] In one embodiment, the HNH domain has at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 6.

[0145] In one embodiment, the HNH domain comprises or consists of an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 6.

[0146] In one embodiment, the HNH domain is a variant of SEQ ID NO: 6 comprising a substitution, such as a conservative amino acid substitution, a deletion, and / or an insertion at one or more positions.

[0147] In one embodiment, the HNH domain differs from SEQ ID NO: 6 by at most 15 amino acids, such as at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acids.

[0148] In one embodiment, the HNH domain is a fragment of SEQ ID NO: 6, wherein the fragment preferably contains at least 20 amino acid residues.

[0149] In one embodiment, the HNH domain comprises, consists essentially of, or consists of SEQ ID NO: 6.

[0150] In one embodiment, the RuvC domain has at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 3, 4, or 5.

[0151] In one embodiment, the RuvC domain comprises or consists of an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the amino acid sequence of SEQ ID NO: 3, 4, or 5.

In one embodiment, the RuvC domain is a variant of SEQ ID NO: 3, 4, or 5 comprising a substitution, such as a conservative amino acid substitution, a deletion, and / or an insertion at one or more positions.

[0152] In one embodiment, the RuvC domain differs from SEQ ID NO: 3, 4, or 5 by at most 15 amino acids, such as at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acids.

[0153] In one embodiment, the RuvC domain is a fragment of any one of SEQ ID NO: 3, 4, or 5, wherein the fragment preferably contains at least 10 amino acid residues (e.g., amino acids 5 to 30 of SEQ ID NO: 1).

[0154] In one embodiment, the RuvC domain comprises, consists essentially of, or consists of SEQ ID NO: 3, 4, or 5.

[0155] In one embodiment, the nuclease has double-strand break activity towards a DNA target site.

[0156] In one embodiment, sequence identity is determined by the method described in the definition section under “Sequence Identity”.

[0157] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a eukaryotic cell.

[0158] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a mammalian cell, e.g., a non-human mammalian cell.

[0159] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in an E. coli cell.

[0160] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a Bacillus cell.

[0161] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a Bacillus subtilis cell.

[0162] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a Bacillus licheniformis cell.

[0163] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a filamentous fungal cell.

[0164] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in an Aspergillus niger cell.

[0165] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in an Aspergillus oryzae cell.

[0166] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a Trichoderma reesei cell.

[0167] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a Lactobacillus cell. In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a probiotic cell.

[0168] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in a S. cerevisiae cell.

[0169] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in P. pastoris.

[0170] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in Lb. paracasei (Lacticaseibacillus paracasei or Lactobacillus paracasei).

[0171] In one embodiment, the polynucleotide encoding the nuclease is codon-optimized for expression in S. thermophilus.

[0172] In one embodiment, the nuclease is a Class 2 Cas nuclease.

[0173] In one embodiment, the nuclease is a Class 2 Type II Cas nuclease.

[0174] In one embodiment, the nuclease is a Class 2 Type II-A Cas nuclease.

[0175] In one embodiment, the nuclease utilizes a protospacer adjacent motif (PAM) sequence provided for the nuclease in Table 1.

[0176] In one embodiment, the nuclease utilizes a protospacer adjacent motif (PAM) sequence with the sequence “nnRARK”.

[0177] In one embodiment, the nuclease utilizes a protospacer adjacent motif (PAM) sequence with the sequence “nRAGK”.

[0178] In one embodiment, the nuclease utilizes a protospacer adjacent motif (PAM) sequence with the sequence “nVDRK”.
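The degenerate PAM sequences recited above can be tested against a candidate site as sketched below. The sketch assumes that the PAM letters follow the standard IUPAC nucleotide codes (e.g., n = any base, R = A or G, K = G or T, V = A, C or G, D = A, G or T); the function names are illustrative only.

```python
import re

# Standard IUPAC nucleotide codes (assumed to be the notation used for the PAMs).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pam_to_regex(pam: str) -> str:
    """Translate a degenerate PAM such as 'nRAGK' into a regular expression."""
    return "".join(f"[{IUPAC[ch]}]" for ch in pam.upper())


def matches_pam(site: str, pam: str) -> bool:
    """True if the DNA string site has the same length as pam and satisfies it."""
    return re.fullmatch(pam_to_regex(pam), site.upper()) is not None


for pam in ("nnRARK", "nRAGK", "nVDRK"):
    print(pam, pam_to_regex(pam))
```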

[0179] In one embodiment, the nuclease is non-naturally occurring, e.g., wherein the nuclease is engineered and comprises unnatural or synthetic amino acids.

[0180] In one embodiment, the nuclease is naturally occurring.

[0181] In an aspect, the Cas nuclease is isolated.

[0182] In another aspect, the Cas nuclease is purified.

[0183] Protospacer Adjacent Motif (PAM) Sequences

[0184] Cas nucleases of the present disclosure may cleave, nick, or bind to a target nucleic acid within or near a protospacer adjacent motif (PAM) sequence of the target nucleic acid. In some embodiments, the target nucleic acid is a double-stranded nucleic acid comprising a target strand and a non-target strand. In some embodiments, cleavage occurs within 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides of a 5’ or 3’ terminus of a PAM sequence. In some embodiments, effector Cas nucleases described herein recognize a PAM sequence. In some embodiments, recognizing a PAM sequence comprises interacting with a sequence adjacent to the PAM. In some embodiments, a target nucleic acid comprises a target sequence that is adjacent to a PAM sequence. In some embodiments, the Cas nuclease does not require a PAM to bind and / or cleave a target nucleic acid. Examples of identified PAM sequences are shown in Table 1.

[0185] In some embodiments, a target nucleic acid is a single-stranded target nucleic acid comprising a target sequence. Accordingly, in some embodiments, the single-stranded target nucleic acid comprises a PAM sequence described herein that is adjacent (e.g., within 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides) or directly adjacent to the target sequence. In some embodiments, a complex comprising the Cas nuclease and a guide RNA cleaves the single-stranded target nucleic acid.

[0186] In some embodiments, a target nucleic acid is a double-stranded nucleic acid comprising a target strand and a non-target strand, wherein the target strand comprises a target sequence. In some embodiments, the PAM sequence is located on the target strand. In some embodiments, the PAM sequence is located on the non-target strand. In some embodiments, the PAM sequence described herein is adjacent (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides) to the target sequence on the target strand or the non-target strand. In some embodiments, such a PAM described herein is directly adjacent to the target sequence on the target strand or the non-target strand. In some embodiments, a complex comprising the Cas nuclease and a guide RNA cleaves the target strand or the non-target strand. In some embodiments, the complex cleaves both the target strand and the non-target strand. In some embodiments, the complex recognizes the PAM sequence, and hybridizes to a target sequence of the target nucleic acid. In some embodiments, the complex cleaves the target nucleic acid, wherein the complex has recognized the PAM sequence and is hybridized to the target sequence.
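A search for candidate target sites that are directly adjacent to a PAM on either strand of a double-stranded sequence may be sketched as follows. The sketch assumes that the PAM lies immediately 3' of the protospacer on the strand being read, that the protospacer is 20 nucleotides long, and that the PAM uses standard IUPAC codes; none of these choices is prescribed by the text above, and all names are illustrative.

```python
import re

# Standard IUPAC nucleotide codes (assumed notation for the PAM).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(dna: str, pam: str, protospacer_len: int = 20):
    """Return (strand, protospacer, pam_match) tuples for both strands.

    Assumption: the PAM sits directly 3' of the protospacer when the
    strand is read 5'->3'. Strand '+' is the input as given, strand '-'
    is its reverse complement.
    """
    dna = dna.upper()
    body = "".join(f"[{IUPAC[ch]}]" for ch in pam.upper())
    pam_re = re.compile(f"(?=({body}))")  # lookahead finds overlapping PAMs
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pam_re.finditer(seq):
            pam_start = match.start()
            if pam_start < protospacer_len:
                continue  # not enough sequence upstream for a protospacer
            protospacer = seq[pam_start - protospacer_len:pam_start]
            sites.append((strand, protospacer, match.group(1)))
    return sites


# Hypothetical toy sequence: a 20-nt protospacer followed by an 'nRAGK' PAM.
for site in find_target_sites("ACGT" * 5 + "TGAGT", "nRAGK"):
    print(site)
```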

[0187] In some embodiments, a Cas nuclease described herein, or a multimeric complex thereof, recognizes a PAM on a target nucleic acid. In some embodiments, multiple Cas nucleases of the multimeric complex recognize a PAM on a target nucleic acid. In some embodiments, at least two of the multiple Cas nucleases recognize the same PAM sequence. In some embodiments, at least two of the multiple Cas nucleases recognize different PAM sequences. In some embodiments, only one Cas nuclease of the multimeric complex recognizes a PAM on a target nucleic acid.

[0188] A Cas nuclease of the present disclosure, or a multimeric complex thereof, may cleave or nick a target nucleic acid within or near a protospacer adjacent motif (PAM) sequence of the target nucleic acid. In some embodiments, cleavage occurs within 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides of a 5’ or 3’ terminus of a PAM sequence.

[0189] In some embodiments, compositions and methods described herein do not comprise a PAM sequence. In some embodiments, Cas nucleases do not recognize a PAM sequence. In some embodiments, compositions and methods described herein comprise a protospacer flanking site (PFS) sequence. A PFS sequence may be useful for the detection and / or modification of RNA.

Table 1. PAM sequences of novel Cas nucleases.

[0190] Removal or Reduction of Cas nuclease Activity

[0191] The present invention also relates to methods of producing a variant of a Cas nuclease, comprising mutating a polynucleotide encoding a Cas nuclease of the present invention, e.g., by deletion, insertion, or substitution, which results in a variant having reduced nuclease activity (e.g., only nickase activity) or no nuclease activity (e.g., a catalytically inactive Cas nuclease).

[0192] The variant may be constructed by mutation of the polynucleotide using methods well known in the art, for example, one or more nucleotide insertions, one or more nucleotide replacements, or one or more nucleotide deletions.

[0193] In one embodiment, the nuclease comprises an amino acid substitution, insertion, or deletion in the one or more RuvC domain.

[0194] In one embodiment, the nuclease comprises an amino acid substitution, insertion, or deletion in the one or more HNH domain.

[0195] In one embodiment, the nuclease has a single-stranded break activity towards a DNA target site.

[0196] In one embodiment, the nuclease is a catalytically dead nuclease, e.g., due to inactivation / mutation of at least one RuvC domain and at least one HNH domain.

[0197] In one embodiment, the catalytically dead nuclease comprises one or more inactivated RuvC domain and one or more inactivated HNH domain.

[0198] In one embodiment, the catalytically dead nuclease comprising one or more inactivated RuvC domain and one or more inactivated HNH domain is created by one or more amino acid substitution, deletion or insertion at the positions provided for the nuclease in column 3 of Table 2 or column 3 of Table 3, respectively.

[0199] In one embodiment, the RuvC domain of SEQ ID NO:1 is inactivated by mutation, e.g. deletion, insertion or substitution, of amino acid D10 corresponding to position 10 of SEQ ID NO: 1.

[0200] In one embodiment, the HNH domain of SEQ ID NO:1 is inactivated by mutation, e.g. deletion, insertion or substitution, of amino acid H856 corresponding to position 856 of SEQ ID NO: 1. In one embodiment, the HNH domain of SEQ ID NO:1 is inactivated by mutation, e.g. deletion, insertion or substitution, of amino acid H855 corresponding to position 855 of SEQ ID NO: 1.

[0201] In one embodiment, the HNH domain of SEQ ID NO:1 is inactivated by mutation, e.g. deletion, insertion or substitution, of amino acid N871 corresponding to position 871 of SEQ ID NO: 1.

[0202] In one embodiment, the HNH domain of SEQ ID NO:1 is inactivated by mutation, e.g. deletion, insertion or substitution, of amino acid N880 corresponding to position 880 of SEQ ID NO: 1.
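Introducing substitutions at named positions, such as those recited above, may be sketched as follows. The sketch uses the common notation "D10A" (wild-type residue, 1-based position, replacement residue) and verifies the wild-type residue before substituting. Replacement by alanine is only an assumption for illustration, since the text above names the positions but not the replacement residues, and the toy sequence is hypothetical and is not SEQ ID NO: 1.

```python
import re


def apply_substitution(seq: str, mutation: str) -> str:
    """Apply a substitution written as e.g. 'D10A' (1-based position)."""
    parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if parsed is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    wild_type, position, replacement = parsed.group(1), int(parsed.group(2)), parsed.group(3)
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {seq[position - 1]}"
        )
    return seq[:position - 1] + replacement + seq[position:]


def make_variant(seq: str, mutations) -> str:
    """Apply several substitutions in turn; positions refer to the parent sequence,
    which is safe because substitutions do not change the sequence length."""
    for mutation in mutations:
        seq = apply_substitution(seq, mutation)
    return seq


# Hypothetical 15-residue toy sequence with an aspartate at position 10.
toy = "MSKKYSIGLDIGTNS"
print(make_variant(toy, ["D10A"]))
```

For a full-length sequence, a call such as make_variant(sequence, ["D10A", "H856A"]) would combine a RuvC and an HNH substitution, again assuming alanine replacements.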

[0203] In one aspect, the polypeptide is derived from SEQ ID NO: 1 by substitution, deletion or addition of one or several amino acids. In some embodiments, the polypeptide is a variant of SEQ ID NO: 1 comprising a substitution, deletion, and / or insertion at one or more positions. In one aspect, the number of amino acid substitutions, deletions and / or insertions introduced into the polypeptide of SEQ ID NO: 1 is up to 15, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15. The amino acid changes may be of a minor nature, that is conservative amino acid substitutions or insertions that do not significantly affect the folding of the Cas nuclease; small deletions, typically of 1-30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue; a small linker peptide of up to 20-25 residues; or a small extension that facilitates purification by changing net charge or another function, such as a poly-histidine tract, an antigenic epitope or a binding module.

[0204] In one embodiment, the nuclease is a nickase having one or more inactivated RuvC domain created by an amino acid substitution, insertion, or deletion at a position provided for the nuclease in column 3 of Table 2.

[0205] In some embodiments, the RuvC domain is derived from an amino acid sequence provided for in column 2 of Table 2, and / or at one or more positions provided for in column 3 of Table 2, by substitution, deletion or addition of one or several amino acids. In one aspect, the number of amino acid substitutions, deletions and / or insertions introduced into the RuvC domain is up to 15, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, or 15. The amino acid changes may be of a minor nature, that is conservative amino acid substitutions or insertions that do not significantly affect the folding of the protein; small deletions, typically of 1-30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue; a small linker peptide of up to 20-25 residues; or a small extension that facilitates purification by changing net charge or another function, such as a poly-histidine tag, an antigenic epitope or a binding module.

[0206] Table 2. RuvC domains of the novel Cas nucleases.

[0207] In one embodiment, the nuclease is a nickase having one or more inactivated HNH domain created by an amino acid substitution, insertion or deletion at a position provided for the nuclease in column 3 of Table 3.

[0208] Table 3. HNH domains of the novel Cas nucleases

[0209] In some embodiments, the HNH domain is derived from an amino acid sequence provided in column 2 of Table 3, and / or at one or more positions provided in column 3 of Table 3, by substitution, deletion or addition of one or several amino acids. In one aspect, the number of amino acid substitutions, deletions and / or insertions introduced into the HNH domain is up to 15, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15. The amino acid changes may be of a minor nature, that is conservative amino acid substitutions or insertions that do not significantly affect the folding of the protein; small deletions, typically of 1-30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue; a small linker peptide of up to 20-25 residues; or a small extension that facilitates purification by changing net charge or another function, such as a poly-histidine tract, an antigenic epitope or a binding module. Essential amino acids in a polypeptide can be identified according to procedures known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham and Wells, 1989, Science 244: 1081-1085). In the latter technique, single alanine mutations are introduced at each residue in the molecule, and the resultant molecules are tested for nuclease activity to identify amino acid residues that are critical to the activity of the molecule. See also, Hilton et al., 1996, J. Biol. Chem. 271: 4699-4708. The active site of the enzyme or other biological interaction can also be determined by physical analysis of structure, as determined by such techniques as nuclear magnetic resonance, crystallography, electron diffraction, or photoaffinity labeling, in conjunction with mutation of putative contact site amino acids. See, for example, de Vos et al., 1992, Science 255: 306-312; Smith et al., 1992, J. Mol. Biol. 224: 899-904; Wlodaver et al., 1992, FEBS Lett. 309: 59-64. The identity of essential amino acids can also be inferred from an alignment with a related polypeptide, and / or be inferred from sequence homology and conserved catalytic machinery with a related polypeptide or within a polypeptide or protein family with polypeptides / proteins descending from a common ancestor, typically having similar three-dimensional structures, functions, and significant sequence similarity. Additionally, or alternatively, protein structure prediction tools can be used for protein structure modelling to identify essential amino acids and / or active sites of polypeptides. See, for example, Jumper et al., 2021, “Highly accurate protein structure prediction with AlphaFold”, Nature 596: 583-589.
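The alanine-scanning approach described above, in which single alanine mutations are introduced at each residue, can be illustrated by enumerating the variant sequences to be tested. This is a minimal sketch: skipping residues that are already alanine is a design choice made here, the function name is illustrative, and the activity assay itself is outside the scope of the code.

```python
def alanine_scan(seq: str) -> dict:
    """Map a mutation label (e.g. 'D3A') to the single-alanine variant sequence.

    Positions are 1-based. Residues that are already alanine are skipped,
    because substituting them would reproduce the parent sequence.
    """
    variants = {}
    for position, residue in enumerate(seq, start=1):
        if residue == "A":
            continue
        variants[f"{residue}{position}A"] = seq[:position - 1] + "A" + seq[position:]
    return variants


# Hypothetical four-residue toy sequence.
for label, variant in alanine_scan("MADH").items():
    print(label, variant)
```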

[0210] Single or multiple amino acid substitutions, deletions, and / or insertions can be made and tested using known methods of mutagenesis, recombination, and / or shuffling, followed by a relevant screening procedure, such as those disclosed by Reidhaar-Olson and Sauer, 1988, Science 241: 53-57; Bowie and Sauer, 1989, Proc. Natl. Acad. Sci. USA 86: 2152-2156; WO 95/17413; or WO 95/22625. Other methods that can be used include error-prone PCR, phage display (e.g., Lowman et al., 1991, Biochemistry 30: 10832-10837; US 5,223,409; WO 92/06204), and region-directed mutagenesis (Derbyshire et al., 1986, Gene 46: 145; Ner et al., 1988, DNA 7: 127).

[0211] In a 2nd aspect, the invention relates to a fusion polypeptide, comprising the Cas nuclease of the 1st aspect, and one or more second polypeptide.

[0212] In one embodiment, the one or more second polypeptide comprises a polypeptide that localizes to one or more subcellular organelles.

[0213] In one embodiment, the one or more second polypeptide is a nuclear localization sequence (NLS), a cell penetrating peptide, and / or an affinity tag.

[0214] In one embodiment, the fusion polypeptide comprises 1-10 or more NLS at or near the amino-terminus, 1-10 or more NLS at or near the carboxy-terminus, or a combination of 1-10 or more NLS at or near the amino-terminus and 1-10 or more NLS at or near the carboxy-terminus.

[0215] In one embodiment, the fusion polypeptide comprises 1-4 NLS. In one embodiment, the fusion polypeptide comprises one NLS.

[0216] In one embodiment, the one or more NLS is located within the open-reading frame (ORF) of the nuclease.

[0217] In one embodiment, the one or more NLS are in tandem repeats.

[0218] In one embodiment, the fusion polypeptide comprises a first NLS and a second NLS.

[0219] In one embodiment, the fusion polypeptide comprises a linker sequence between the first NLS and the second NLS.

[0220] In one embodiment, the linker between the first NLS and the second NLS comprises at least 1 , at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 amino acids.

[0221] In one embodiment, the one or more second polypeptide comprises a base-editing polypeptide.

[0222] In one embodiment, the base-editing polypeptide comprises a base editor domain.

[0223] In one embodiment, the fusion polypeptide comprises a linker between the Cas nuclease and the base-editing polypeptide.

[0224] In one embodiment, the base-editing polypeptide comprises a deaminase, e.g., a cytidine deaminase, such as an APOBEC3A deaminase, or an adenosine deaminase.

[0225] In one embodiment, the one or more second polypeptide comprises a reverse transcriptase, the reverse transcriptase preferably comprising a reverse transcriptase domain.

[0226] In one embodiment, the nuclease is fused to one or more NLS of sufficient strength to drive accumulation of a CRISPR complex comprising the Cas nuclease in a detectable amount in the nucleus of a eukaryotic cell.

[0227] In one embodiment, sequence identity is determined by the method described in the definition section under “Sequence Identity”.

[0228] In an aspect, the fusion polypeptide is isolated.

[0229] In another aspect, the fusion polypeptide is purified.

[0230] Sources of Cas nucleases

[0231] A Cas nuclease of the present invention may be obtained from microorganisms of any genus. For purposes of the present invention, the term “obtained from” as used herein in connection with a given source shall mean that the polypeptide encoded by a polynucleotide is produced by the source or by a strain in which the polynucleotide of the invention has been inserted. In one aspect, the polypeptide obtained from a given source is not secreted extracellularly.

[0232] In one aspect, the Cas nuclease is obtained or obtainable from bacterial cells. In one embodiment, the Cas nuclease is a polypeptide obtained or obtainable from a Lachnospiraceae cell.

[0233] In one embodiment, the Lachnospiraceae cell is a Lachnospiraceae bacterium cell.

[0234] In one embodiment, the Lachnospiraceae cell is a Lachnospiraceae bacterium A10 cell.

[0235] It will be understood that for the aforementioned species, the invention encompasses both the perfect and imperfect states, and other taxonomic equivalents, e.g., anamorphs, regardless of the species name by which they are known. Those skilled in the art will readily recognize the identity of appropriate equivalents.

[0236] The Cas nuclease may be identified and obtained from other sources including microorganisms isolated from nature (e.g., soil, composts, water, etc.) or DNA samples obtained directly from natural materials (e.g., soil, composts, water, etc.) using the above-mentioned probes. Techniques for isolating microorganisms and DNA directly from natural habitats are well known in the art. A polynucleotide encoding the Cas nuclease may then be obtained by similarly screening a genomic DNA or cDNA library of another microorganism or mixed DNA sample. Once a polynucleotide encoding a Cas nuclease has been detected with the probe(s), the polynucleotide can be isolated or cloned by utilizing techniques that are known to those of ordinary skill in the art (see, e.g., Davis et al., 2012, Basic Methods in Molecular Biology, Elsevier).

[0237] AlphaFold structure prediction

[0238] AlphaFold is a computational method for predicting the three-dimensional structure of a polypeptide from its amino acid sequence (Jumper et al., Highly accurate protein structure prediction with AlphaFold. Nature, 2021). Predicted structures for millions of polypeptides deposited in the UniProt database have been deposited in the AlphaFold Protein Structure Database, using the AlphaFold Monomer v2.0 model (Varadi et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Research, 2021). In the AlphaFold Protein Structure Database, the three-dimensional structure of a polypeptide can be obtained by searching for the UniProt accession number of the polypeptide.

[0239] In addition to the many three-dimensional structures that are already publicly available, code is available for reproducing and predicting structures of new polypeptides at source code repositories such as GitHub.com under deepmind/alphafold, using notebooks/AlphaFold.ipynb, which uses AlphaFold v2.3.1 or newer. Additionally, it can be found on GitHub.com under sokrypton/ColabFold using v1.5.2 or newer, using AlphaFold2.ipynb. For technical details, please see Jumper et al. (vide supra).

[0240] AlphaFold produces a per-residue estimate of its confidence on a scale from 0 to 100. This confidence measure is called pLDDT and corresponds to the model’s predicted score on the lDDT-Cα metric. It is stored in the B-factor fields of the mmCIF and PDB files available for download (although unlike a B-factor, higher pLDDT is better). Regions with a pLDDT score of more than 90 are expected to be modelled to high accuracy. These should be suitable for any application that benefits from high accuracy (e.g., characterization of binding sites). Regions with a pLDDT score between 70 and 90 are expected to be modelled well, corresponding to a generally good backbone prediction.
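Because the pLDDT value is stored in the B-factor field, it can be read back from a downloaded PDB file as sketched below. The sketch assumes the standard fixed-column PDB layout (atom name in columns 13-16, chain in column 22, residue number in columns 23-26, B-factor in columns 61-66) and reads one value per residue from the Cα atom. The function names are illustrative, and the label for scores below 70 is a placeholder, since the text above characterizes only the two upper bands.

```python
def read_plddt(pdb_text: str) -> dict:
    """Map (chain, residue number) -> pLDDT, taken from the B-factor
    column of each C-alpha ATOM record in PDB-format text."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            residue_number = int(line[22:26])
            scores[(chain, residue_number)] = float(line[60:66])
    return scores


def confidence_band(plddt: float) -> str:
    """Classify a pLDDT value using the thresholds given in the text."""
    if plddt > 90:
        return "high accuracy"
    if plddt >= 70:
        return "modelled well"
    return "below 70"  # placeholder label; not characterized in the text


def summarize(pdb_text: str) -> dict:
    """Count residues per confidence band."""
    counts = {}
    for value in read_plddt(pdb_text).values():
        band = confidence_band(value)
        counts[band] = counts.get(band, 0) + 1
    return counts
```

A typical use would be summarize(open("model.pdb").read()), where "model.pdb" is a hypothetical file name for a predicted structure.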

[0241] Structural Similarity

[0242] The relatedness between two amino acid sequences has conventionally been described by the parameter “sequence identity”. However, since the biological function of a polypeptide is defined by its three-dimensional structure rather than its amino acid sequence, a better way of assessing a functional relationship between polypeptides is by comparing their three-dimensional structures. Thus, for the purposes of the present invention, the relatedness between the three-dimensional structures of two polypeptides is described by the parameter “structural similarity”.

[0243] A three-dimensional structure of any polypeptide may be obtained experimentally via, e.g., X-ray crystallography or using in silico methods such as AlphaFold (vide supra). The structural similarity between three-dimensional structures may then be determined by the TM-score, which is calculated using the following general formula (Zhang & Skolnick, Proteins 57:702-710, 2004):

[0244]
$$\text{TM-score} = \operatorname{Max}\left[\frac{1}{L_N}\sum_{i=1}^{L_T}\frac{1}{1+\left(\frac{d_i}{d_0}\right)^{2}}\right]$$
where $L_N$ is the length of the native structure, $L_T$ is the length of the aligned residues to the template structure, $d_i$ is the distance between the $i$-th pair of aligned residues and $d_0$ is a scale to normalize the match difference. ‘Max’ denotes the maximum value after optimal spatial superposition.

[0245] For the purposes of the present invention, LN is always the length of the reference protein, indicating the use of a fixed reference length L to prevent artificially large TM-scores from alignment of substructures:

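[0246]
$$\text{TM-score} = \operatorname{Max}\left[\frac{1}{L}\sum_{i=1}^{L_T}\frac{1}{1+\left(\frac{d_i}{d_0}\right)^{2}}\right]$$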
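The sum inside the TM-score formula can be evaluated for one given superposition as sketched below. The sketch does not perform the maximization over superpositions, which is left to the alignment software. The default value of the scale d0 follows the length-dependent expression published by Zhang & Skolnick (2004), d0 = 1.24 (L - 15)^(1/3) - 1.8; that expression is not stated in the text above and is used here as an assumption, as are the function and parameter names.

```python
def tm_score(distances, reference_length: int, d0: float = None) -> float:
    """TM-score term for a single superposition.

    distances        : distances d_i (in Angstrom) for the aligned residue pairs
    reference_length : fixed reference length L used for normalization
    d0               : distance scale; if omitted, the Zhang & Skolnick (2004)
                       expression is used (an assumption, see lead-in)
    """
    if d0 is None:
        if reference_length <= 15:
            raise ValueError("supply d0 explicitly for reference_length <= 15")
        d0 = 1.24 * (reference_length - 15) ** (1.0 / 3.0) - 1.8
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / reference_length


# Hypothetical distances for five aligned pairs against a 100-residue reference.
print(tm_score([0.5, 1.0, 1.5, 2.0, 8.0], reference_length=100))
```

Normalizing by the fixed reference length means that a query aligned to only part of the reference cannot reach a score of 1, which is the effect described in the preceding paragraph.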

[0247] A structural alignment of the three-dimensional structures of two polypeptides is necessary before the TM-score can be calculated. This is achieved via algorithms that optimize the structural overlap, and several methods are available, such as CEalign (Shindyalov and Bourne, Protein Eng., 11, 739-747, 1998), DALI (Holm and Sander, Trends Biochem. Sci., 20, 478-480, 1995), or TM-align (Zhang and Skolnick, Nucleic Acids Res. 33:2302-2309, 2005).

[0248] For the purposes of the present invention, TM-align is applied. For convenience, TM-score is integrated in the TM-align software, which is available from the author’s website. The version of TM-align is preferably updated 2019-08-22 or later, and the TM-score between a reference and query protein is determined by running this command:

[0249] TMalign <query.pdb> <reference.pdb> -L <length of reference>

[0250] Where <query.pdb> is the name of the PDB file containing coordinates of the query polypeptide, and <reference.pdb> is the name of the PDB file containing coordinates of the reference polypeptide. The TM-score is calculated and reported in the output, along with several other parameters from the alignment.
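The command above can be wrapped in a script as sketched below. The argument order and the -L option are taken from the command line given above. The parsing step assumes that TM-align reports each score on a line beginning with "TM-score=" followed by the value and an annotation; the exact wording of that annotation may differ between versions, so the sketch returns every such line and leaves it to the user to pick the score normalized by the reference length. The executable name and all function names are assumptions.

```python
import re
import subprocess

# Assumed output layout: 'TM-score= <value> <annotation>' on one line.
TM_LINE = re.compile(r"^TM-score=[ \t]*([0-9.]+)[ \t]*(.*)$", re.MULTILINE)


def build_tmalign_command(query_pdb, reference_pdb, reference_length, executable="TMalign"):
    """Argument list mirroring: TMalign <query.pdb> <reference.pdb> -L <length>."""
    return [executable, query_pdb, reference_pdb, "-L", str(reference_length)]


def parse_tm_scores(output: str):
    """Return (score, annotation) for every 'TM-score=' line in the output."""
    return [(float(value), note.strip()) for value, note in TM_LINE.findall(output)]


def run_tmalign(query_pdb, reference_pdb, reference_length, executable="TMalign"):
    """Run TM-align (must be installed and on PATH) and parse its scores."""
    command = build_tmalign_command(query_pdb, reference_pdb, reference_length, executable)
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_tm_scores(result.stdout)
```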

[0251] Compositions

[0252] In a 3rd aspect, the present invention also relates to a non-naturally occurring composition comprising (i) the Cas nuclease of the 1st aspect, or the fusion polypeptide of the 2nd aspect, and / or (ii) a nucleic acid molecule comprising a sequence encoding the Cas nuclease of the 1st aspect or the fusion polypeptide of the 2nd aspect.

[0253] In one embodiment, the nucleic acid molecule is a chemically modified nucleic acid molecule.

[0254] In one embodiment, the nucleic acid molecule is DNA.

[0255] In one embodiment, the nucleic acid molecule is RNA.

[0256] In one embodiment, the RNA is an mRNA comprising one or more of a 5’ untranslated region (UTR), an open reading frame (ORF) encoding the Cas nuclease or fusion polypeptide, a 3’ UTR, and a poly-adenylyl (polyA) tail.

[0257] In one embodiment, the ORF consists of nucleosides selected from adenosine, a modified adenosine, uridine, a modified uridine, guanosine, a modified guanosine, cytidine, and a modified cytidine.

[0258] In one embodiment, the ORF consists of nucleosides selected from adenosine, uridine, a modified uridine, guanosine, and cytidine.

[0259] In one embodiment, the nucleic acid molecule is linear.

[0260] In one embodiment, the nucleic acid molecule is circular.

[0261] In one embodiment, the composition further comprises one or more RNA molecules, or a DNA polynucleotide encoding one or more of the one or more RNA molecules, wherein the one or more RNA molecules and the Cas nuclease or fusion polypeptide do not naturally occur together, and the one or more RNA molecules are configured to form a complex with the Cas nuclease or fusion polypeptide and / or target the complex to a target site. In one embodiment, the one or more RNA molecule comprises a guide RNA (gRNA), which gRNA comprises a CRISPR RNA (crRNA) and a trans-activating RNA (tracrRNA).

[0262] In one embodiment, the one or more RNA molecule is a single-molecule RNA (sgRNA), e.g., wherein the crRNA and the tracrRNA are part of the same RNA molecule.

[0263] In another embodiment, the one or more RNA molecule is a dual-molecule RNA, e.g., wherein the crRNA and the tracrRNA are separate RNA molecules.
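The single-molecule arrangement, in which the crRNA and the tracrRNA are part of the same RNA molecule, can be illustrated by concatenating the parts as sketched below. The sketch assumes the order 5'-guide, crRNA repeat portion, linker, tracrRNA-3' and a four-nucleotide "GAAA" linker; both are common design conventions for Cas9-type single guide RNAs and are not taken from the text above. The input sequences are hypothetical placeholders and are not SEQ ID NO: 7, 8, or 9.

```python
def to_rna(seq: str) -> str:
    """Convert a DNA coding sequence (or RNA) to upper-case RNA letters."""
    return seq.upper().replace("T", "U")


def assemble_sgrna(guide: str, crrna_repeat: str, tracrrna: str, linker: str = "GAAA") -> str:
    """Join guide, crRNA repeat portion, linker and tracrRNA into one RNA string.

    Assumed layout (5'->3'): guide + crRNA repeat + linker + tracrRNA.
    """
    return to_rna(guide) + to_rna(crrna_repeat) + to_rna(linker) + to_rna(tracrrna)


def split_dual_guide(guide: str, crrna_repeat: str, tracrrna: str):
    """Dual-molecule counterpart: crRNA and tracrRNA as separate RNA strings."""
    return to_rna(guide) + to_rna(crrna_repeat), to_rna(tracrrna)


# Placeholder sequences for illustration only.
print(assemble_sgrna("ACGTACGT", "GTTT", "AAAC"))
```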

[0264] In one embodiment, the composition further comprises a donor template for homology directed repair (HDR).

[0265] In one embodiment, the sequence encoding the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO:2.

[0266] In one embodiment, the one or more RNA molecule comprises a trans activating RNA (tracrRNA) encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO: 7.

[0267] In one embodiment, at least one of the one or more RNA molecule comprises a CRISPR RNA (crRNA) molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO:8.

[0268] In one embodiment, at least one of the one or more RNA molecule comprises or consists of a RNA molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO:9.

[0269] In one embodiment, the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any amino acid sequence of column 1 in Table 4, and the at least one RNA molecule is a RNA molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any polynucleotide sequence of column 4 in Table 4.

[0270] In one embodiment, the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the amino acid sequence of SEQ ID NO: 1 , and the at least one RNA molecule comprises a crRNA molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO: 8.

[0271] In one embodiment, the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any amino acid sequence of column 1 in Table 4, and the at least one RNA molecule comprises a crRNA molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any of the polynucleotide sequences of column 2 in Table 4.

[0272] In one embodiment, the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the amino acid sequence of SEQ ID NO: 1 , and the at least one RNA molecule comprises a tracrRNA molecule comprising a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO: 7.

[0273] In one embodiment, the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any amino acid sequence of column 1 in Table 4, and the at least one RNA molecule comprises a tracrRNA molecule comprising a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any of the polynucleotide sequences of column 3 in Table 4.

[0274] In one embodiment, the composition further comprises a base editor enzyme.

[0275] In one embodiment, the base editor enzyme is an adenosine deaminase or a cytidine deaminase.

[0276] In one embodiment, the composition further comprises a reverse transcriptase enzyme.

[0277] Table 4 discloses which crRNA and tracrRNA coding sequences are associated with each of the novel Cas nucleases. For example, the Cas nuclease 0452 with SEQ ID NO: 1 utilizes the crRNA sequence encoded by SEQ ID NO: 8, and the tracrRNA sequence encoded by SEQ ID NO: 7. Additionally, and / or alternatively, the Cas nuclease with SEQ ID NO: 1 utilizes a gRNA or sgRNA sequence encoded by SEQ ID NO: 9, comprising both the crRNA sequence encoded by SEQ ID NO: 8 and the tracrRNA sequence encoded by SEQ ID NO: 7.

[0278] Table 4. crRNA, tracrRNA, and guide RNA coding sequences of the novel Cas nucleases

[0279] Methods for modifying a DNA target site

[0280] In a 4th aspect, the present invention also relates to a method of modifying a nucleotide sequence at a DNA target site in the genome of a cell, comprising introducing into the cell the Cas nuclease of the 1st aspect or the fusion polypeptide of the 2nd aspect, a polynucleotide encoding the Cas nuclease of the 1st aspect or the fusion polypeptide of the 2nd aspect, and / or the composition of the 3rd aspect.

[0281] In one embodiment, the method comprises introducing a DNA-break at the DNA target site.

[0282] In one embodiment, the DNA-break is a single-strand break.

[0283] In one embodiment, the DNA-break is a double-strand break.

[0284] In one embodiment, the method is carried out under conditions that are permissive for both non-homologous end joining (NHEJ) and homology-directed repair (HDR).

[0285] In one embodiment, the method is carried out under conditions that are permissive for non-homologous end joining (NHEJ). In one embodiment, the method is carried out under conditions that are permissive for homology-directed repair (HDR).

[0286] In one embodiment, the Cas nuclease or fusion polypeptide effects a DNA-break in a DNA strand adjacent to a PAM sequence, e.g., adjacent to the PAM sequence “nRARK” or “nVDRV”, or adjacent to any one of the PAM sequences mentioned in Table 1.

[0287] In one embodiment, the Cas nuclease or fusion polypeptide effects a DNA-break in a DNA strand adjacent to the PAM sequence “nRAGK”.

[0288] In one embodiment, the Cas nuclease or fusion polypeptide effects a DNA-break in a DNA strand adjacent to the PAM sequence “nVDRK”.

[0289] In one embodiment, the Cas nuclease or fusion polypeptide effects a DNA-break in a DNA strand adjacent to a sequence that is complementary to the PAM sequence.
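The PAM sequences named above ("nRARK", "nVDRV", "nRAGK", "nVDRK") are written with degenerate nucleotide symbols. As a minimal sketch of how such a motif could be located on either strand of a target sequence, the following Python example expands each symbol according to the standard IUPAC nucleotide code (n = any base, R = A or G, K = G or T, V = A, C or G, D = A, G or T). The function names are hypothetical and chosen for illustration only. The sketch reports motif positions and does not model where the break occurs relative to the PAM, which this section describes only as "adjacent".

```python
# IUPAC nucleotide codes: each symbol maps to the bases it permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def matches_pam(site: str, pam: str) -> bool:
    """True if every base of `site` is permitted by the PAM symbol at that position."""
    if len(site) != len(pam):
        return False
    return all(base in IUPAC[code] for base, code in zip(site.upper(), pam.upper()))


def find_pam_sites(seq: str, pam: str):
    """Return sorted (forward-strand start, strand) pairs for PAM matches on both strands."""
    seq = seq.upper()
    k = len(pam)
    rc = reverse_complement(seq)
    hits = []
    for i in range(len(seq) - k + 1):
        if matches_pam(seq[i:i + k], pam):
            hits.append((i, "+"))
        if matches_pam(rc[i:i + k], pam):
            # Convert the reverse-strand index to a forward-strand coordinate.
            hits.append((len(seq) - i - k, "-"))
    return sorted(hits)
```

For example, `matches_pam("TGAGT", "nRARK")` evaluates to `True`, whereas `matches_pam("TGACT", "nRARK")` evaluates to `False` because C is not permitted at the R position.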

[0290] In one embodiment, the target site is within a coding region of a protein.

[0291] In one embodiment, the target site is within a non-coding region of a protein.

[0292] In one embodiment, the target site is within a regulatory region of a protein, e.g., a promoter.

[0293] In one embodiment, the cell is a eukaryotic cell.

[0294] In one embodiment, the cell is a prokaryotic cell.

[0295] In one embodiment, the cell is a eukaryotic cell, such as a mammalian cell, a human cell, or a non-human mammalian cell, e.g., a BHK cell, a CHO cell, a mouse cell, a hamster cell, or a rat cell.

[0296] In a preferred embodiment, the cell is a fungal cell, such as a filamentous fungal cell, or a yeast cell.

[0297] In one embodiment, the cell is a yeast cell, e.g., a Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia cell, such as a Kluyveromyces lactis, Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, Saccharomyces oviformis, or Yarrowia lipolytica cell.

[0298] In one embodiment, the cell is a Pichia cell, e.g., a Pichia pastoris cell.

[0299] In one embodiment, the cell is a filamentous fungal cell e.g., an Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, or Trichoderma cell, in particular, an Aspergillus awamori, Aspergillus foetidus, Aspergillus fumigatus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Bjerkandera adusta, Ceriporiopsis aneirina, Ceriporiopsis caregiea, Ceriporiopsis gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa, Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Chrysosporium inops, Chrysosporium keratinophilum, Chrysosporium lucknowense, Chrysosporium merdarium, Chrysosporium pannicola, Chrysosporium queenslandicum, Chrysosporium tropicum, Chrysosporium zonatum, Coprinus cinereus, Coriolus hirsutus, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium purpurogenum, Phanerochaete chrysosporium, Phlebia radiata, Pleurotus eryngii, Talaromyces emersonii, Thielavia terrestris, Trametes villosa, Trametes versicolor, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma viride cell.

[0300] In one embodiment, the cell is a Trichoderma cell.

[0301] In one embodiment, the cell is a Trichoderma reesei cell.

[0302] In one embodiment, the cell is an Aspergillus cell.

[0303] In one embodiment, the cell is an Aspergillus niger cell.

[0304] In one embodiment, the cell is an Aspergillus oryzae cell.

[0305] In one embodiment, the cell is a plant cell.

[0306] In one embodiment, the plant cell is one or more of a maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, switchgrass, soybean, canola, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, vegetable, or safflower cell.

[0307] In a preferred embodiment, the cell is a prokaryotic cell, e.g., a Gram-positive cell selected from the group consisting of Bacillus, Clostridium, Corynebacterium, Enterococcus, Geobacillus, Lactobacillus, Lacticaseibacillus, Lactiplantibacillus, Levilactobacillus, Ligilactobacillus, Limosilactobacillus, Lactococcus, Oceanobacillus, Staphylococcus, Streptococcus, or Streptomyces cells, or a Gram-negative cell selected from the group consisting of Campylobacter, E. coli, Flavobacterium, Fusobacterium, Helicobacter, Ilyobacter, Neisseria, Pseudomonas, Salmonella, and Ureaplasma cells, such as Lacticaseibacillus casei, Lacticaseibacillus paracasei, Lacticaseibacillus rhamnosus, Lactiplantibacillus plantarum, Levilactobacillus brevis, Ligilactobacillus salivarius, Limosilactobacillus fermentum, Limosilactobacillus reuteri, Lactobacillus acidophilus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus johnsonii, Lactobacillus helveticus, Corynebacterium glutamicum, Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus coagulans, Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus stearothermophilus, Bacillus subtilis, Bacillus thuringiensis, Streptococcus equisimilis, Streptococcus pyogenes, Streptococcus uberis, and Streptococcus equi subsp. zooepidemicus, Streptomyces achromogenes, Streptomyces avermitilis, Streptomyces coelicolor, Streptomyces griseus, and Streptomyces lividans cells.

[0308] In one embodiment, the cell is a Bacillus cell.

[0309] In one embodiment, the cell is a Bacillus subtilis cell.

[0310] In one embodiment, the cell is a Bacillus licheniformis cell.

[0311] In one embodiment, the cell is a Lacticaseibacillus paracasei cell.

[0312] In one embodiment, the cell is a Streptococcus thermophilus cell.

[0313] In one embodiment, the cell is an E. coli cell.

[0314] DNA Repair by non-homologous end joining

[0315] Upon target recognition, Cas nucleases induce double-strand breaks in the target sequence, which when repaired by non-homologous end joining (NHEJ) can result in frameshift mutations and gene knockdown. The frameshift mutation caused by error-prone NHEJ may include nucleotide insertions or deletions (indels). Alternatively, homology-directed repair (HDR) at the double-strand break site can allow insertion of the desired sequence.
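Whether an indel left by error-prone NHEJ shifts the reading frame depends on whether the net change in length is a multiple of three. The following Python sketch illustrates that arithmetic under simplifying assumptions: the edited site lies within a coding region, and only the net length difference between the reference and edited sequences is considered. The function name and example sequences are hypothetical.

```python
def classify_indel(reference: str, edited: str) -> str:
    """Classify the net length change between a reference and an edited coding sequence."""
    delta = len(edited) - len(reference)
    if delta == 0:
        return "no length change"
    kind = "insertion" if delta > 0 else "deletion"
    # A net change that is not a multiple of three shifts the reading frame.
    frame = "frameshift" if delta % 3 != 0 else "in-frame"
    return f"{abs(delta)}-nt {kind}, {frame}"
```

For example, `classify_indel("ATGGCCAAG", "ATGGCAAG")` returns `"1-nt deletion, frameshift"`, while removing a whole codon, as in `classify_indel("ATGGCCAAG", "ATGAAG")`, returns `"3-nt deletion, in-frame"`.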

[0316] DNA Repair by Homologous Recombination

[0317] The term "homology-directed repair" or "HDR" refers to a mechanism for repairing DNA damage in cells, for example, during repair of double-stranded and single-stranded breaks in DNA. HDR requires nucleotide sequence homology and uses a "nucleic acid template" (nucleic acid template or donor template used interchangeably herein) to repair the sequence where the double-stranded or single-stranded break occurred (e.g., DNA target sequence). This results in the transfer of genetic information from, for example, the nucleic acid template to the DNA target sequence. HDR may result in alteration of the DNA target sequence (e.g., insertion, deletion, mutation) if the nucleic acid template sequence differs from the DNA target sequence and part or all of the nucleic acid template polynucleotide or oligonucleotide is incorporated into the DNA target sequence. In some embodiments, an entire nucleic acid template polynucleotide, a portion of the nucleic acid template polynucleotide, or a copy of the nucleic acid template is integrated at the site of the DNA target sequence.

[0318] The terms "nucleic acid template" and "donor" refer to a nucleotide sequence that is inserted or copied into a genome. The nucleic acid template comprises a nucleotide sequence, e.g., of one or more nucleotides, that will be added to or will template a change in the target nucleic acid or may be used to modify the target sequence. A nucleic acid template sequence may be of any length, for example between 2 and 10,000 nucleotides in length (or any integer value there between or there above), preferably between about 100 and 1,000 nucleotides in length (or any integer there between), more preferably between about 200 and 500 nucleotides in length. A nucleic acid template may be a single-stranded nucleic acid or a double-stranded nucleic acid. In some embodiments, the nucleic acid template comprises a nucleotide sequence, e.g., of one or more nucleotides, that corresponds to wild type sequence of the target nucleic acid, e.g., of the target position. In some embodiments, the nucleic acid template comprises a ribonucleotide sequence, e.g., of one or more ribonucleotides, that corresponds to wild type sequence of the target nucleic acid, e.g., of the target position. In some embodiments, the nucleic acid template comprises modified ribonucleotides.

[0319] Insertion of an exogenous sequence (also called a "donor sequence," "donor template" or "donor"), for example, for correction of a mutant gene or for increased expression of a wild-type gene can also be carried out. It will be readily apparent that the donor sequence is typically not identical to the genomic sequence where it is placed. A donor sequence can contain a non-homologous sequence flanked by two regions of homology to allow for efficient HDR at the location of interest. Additionally, donor sequences can comprise a vector molecule containing sequences that are not homologous to the region of interest in cellular chromatin. A donor molecule can contain several, discontinuous regions of homology to cellular chromatin. For example, for targeted insertion of sequences not normally present in a region of interest, said sequences can be present in a donor nucleic acid molecule and flanked by regions of homology to sequence in the region of interest.
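The arrangement described above, a non-homologous sequence flanked by two regions of homology, can be sketched as simple string assembly. The following Python example is illustrative only: the function names, the toy genome, and the choice of equal-length arms taken directly on either side of the break position are assumptions, and the sketch says nothing about which arm lengths are efficient in practice.

```python
def build_donor(genome: str, cut_site: int, insert: str, arm_length: int) -> str:
    """Assemble a donor: left homology arm + non-homologous insert + right homology arm.

    `cut_site` is the 0-based index of the first base to the right of the break.
    """
    if not 0 <= cut_site <= len(genome):
        raise ValueError("cut_site must lie within the genome sequence")
    if cut_site < arm_length or len(genome) - cut_site < arm_length:
        raise ValueError("not enough flanking sequence for the requested arm length")
    left_arm = genome[cut_site - arm_length:cut_site]
    right_arm = genome[cut_site:cut_site + arm_length]
    return left_arm + insert + right_arm


def expected_edit(genome: str, cut_site: int, insert: str) -> str:
    """Sequence expected if the insert is copied in precisely at the break position."""
    return genome[:cut_site] + insert + genome[cut_site:]
```

With the toy genome `"AAAACCCCGGGGTTTT"`, a break at index 8, the insert `"ATG"` and 4-nt arms, `build_donor` returns `"CCCCATGGGGG"` and `expected_edit` returns `"AAAACCCCATGGGGGTTTT"`.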

[0320] The donor polynucleotide can be DNA or RNA, single-stranded and / or double-stranded and can be introduced into a cell in linear or circular form. See, e.g., U.S. Patent Publication Nos. 2010 / 0047805; 2011 / 0281361; 2011 / 0207221; and 2019 / 0330620. If introduced in linear form, the ends of the donor sequence can be protected (e.g., from exonucleolytic degradation) by methods known to those of skill in the art. For example, one or more dideoxynucleotide residues are added to the 3' terminus of a linear molecule and / or self-complementary oligonucleotides are ligated to one or both ends. See, for example, Chang and Wilson, Proc. Natl. Acad. Sci. USA (1987); Nehls et al., Science (1996). Additional methods for protecting exogenous polynucleotides from degradation include, but are not limited to, addition of terminal amino group(s) and the use of modified internucleotide linkages such as, for example, phosphorothioates, phosphoramidates, and O-methyl ribose or deoxyribose residues.

[0321] Accordingly, embodiments of the present invention using a donor template for repair may use a DNA or RNA, single-stranded and / or double-stranded donor template that can be introduced into a cell in linear or circular form.

[0322] In embodiments of the present invention, a gene-editing composition comprises: (1) an RNA molecule comprising a guide sequence to effect a double-strand break in a gene prior to repair and (2) a donor RNA template for repair, wherein the RNA molecule comprising the guide sequence is a first RNA molecule and the donor RNA template is a second RNA molecule. In some embodiments, the guide RNA molecule and template RNA molecule are connected as part of a single molecule. A donor sequence may also be an oligonucleotide and be used for gene correction or targeted alteration of an endogenous sequence. The oligonucleotide may be introduced to the cell on a vector, may be electroporated into the cell, or may be introduced via other methods known in the art. The oligonucleotide can be used to correct a mutated sequence in an endogenous gene (e.g., the sickle mutation in beta globin), or may be used to insert sequences with a desired purpose into an endogenous locus.

[0323] A polynucleotide can be introduced into a cell as part of a vector molecule having additional sequences such as, for example, replication origins, promoters and genes encoding antibiotic resistance. Moreover, donor polynucleotides can be introduced as naked nucleic acid, as nucleic acid complexed with an agent such as a liposome or poloxamer, or can be delivered by recombinant viruses (e.g., adenovirus, AAV, herpesvirus, retrovirus, lentivirus and integrase defective lentivirus (IDLV)).

[0324] The donor is generally inserted so that its expression is driven by the endogenous promoter at the integration site, namely the promoter that drives expression of the endogenous gene into which the donor is inserted. However, it will be apparent that the donor may comprise a promoter and / or enhancer, for example a constitutive promoter or an inducible or tissue specific promoter.

[0325] The donor molecule may be inserted into an endogenous gene such that all, some or none of the endogenous gene is expressed. For example, a transgene as described herein may be inserted into an endogenous locus such that some (N-terminal and / or C-terminal to the transgene) or none of the endogenous sequences are expressed, for example as a fusion with the transgene. In other embodiments, the transgene (e.g., with or without additional coding sequences such as for the endogenous gene) is integrated into any endogenous locus, for example a safe-harbor locus, for example a CCR5 gene, a CXCR4 gene, a PPP1R12C (also known as AAVS1) gene, an albumin gene or a Rosa gene. See, e.g., U.S. Patent Nos. 7,951,925 and 8,110,379; U.S. Publication Nos. 2008 / 0159996; 20100 / 0218264; 2010 / 0291048; 2012 / 0017290; 2011 / 0265198; 2013 / 0137104; 2013 / 0122591; 2013 / 0177983 and 2013 / 0177960 and U.S. Provisional Application No. 61 / 823,689.

[0326] When endogenous sequences (endogenous or part of the transgene) are expressed with the transgene, the endogenous sequences may be full-length sequences (wild-type or mutant) or partial sequences. Preferably the endogenous sequences are functional. Non-limiting examples of the function of these full length or partial sequences include increasing the serum half-life of the polypeptide expressed by the transgene (e.g., therapeutic gene) and / or acting as a carrier.

[0327] Furthermore, although not required for expression, exogenous sequences may also include transcriptional or translational regulatory sequences, for example, promoters, enhancers, insulators, internal ribosome entry sites, sequences encoding 2A peptides and / or polyadenylation signals. In certain embodiments, the donor molecule comprises a sequence selected from the group consisting of a gene encoding a protein (e.g., a coding sequence encoding a protein that is lacking in the cell or in the individual or an alternate version of a gene encoding a protein), a regulatory sequence and / or a sequence that encodes a structural nucleic acid such as a microRNA or siRNA.

[0328] Polynucleotides encoding Cas nucleases

[0329] In a 5th aspect, the present invention also relates to polynucleotides encoding the Cas nuclease of the 1st aspect, and / or the fusion polypeptide of the 2nd aspect.

[0330] In one embodiment, the polynucleotide comprises or consists of a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the polypeptide coding sequence of SEQ ID NO: 2.
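The identity thresholds recited above can be illustrated with a short calculation. The definition of sequence identity that governs these embodiments is not reproduced in this section, so the following Python sketch rests on a labeled assumption: the two sequences have already been aligned (with "-" marking gaps), and identity is the number of identical positions divided by the number of alignment columns that contain no gap, expressed as a percentage. The function names and example sequences are hypothetical.

```python
# Identity thresholds recited in the embodiments above, in percent.
THRESHOLDS = (60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences (assumed definition)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def highest_threshold_met(identity: float):
    """Return the highest recited threshold the identity value satisfies, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None
```

For example, two aligned 10-nucleotide sequences that differ at one position give `percent_identity("ATGCATGCAT", "ATGCATGCAA") == 90.0`, which satisfies the "at least 90%" threshold but not "at least 91%".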

[0331] In one embodiment, the polynucleotide is a chemically modified nucleic acid molecule.

[0332] In one embodiment, the polynucleotide is DNA.

[0333] In one embodiment, the polynucleotide is RNA.

[0334] In one embodiment, the RNA is an mRNA comprising one or more of a 5’ untranslated region (UTR), an open reading frame (ORF) encoding the Cas nuclease or fusion polypeptide, a 3’ UTR, and a poly-adenylyl (polyA) tail.

[0335] In one embodiment, the ORF consists of nucleosides selected from adenosine, a modified adenosine, uridine, a modified uridine, guanosine, a modified guanosine, cytidine, and a modified cytidine.

[0336] In one embodiment, the ORF consists of nucleosides selected from adenosine, uridine, a modified uridine, guanosine, and cytidine.

[0337] In one embodiment, the polynucleotide is linear.

[0338] In one embodiment, the polynucleotide is circular.

[0339] In one embodiment, the poly-A sequence comprises non-adenine nucleotides.

[0340] In one embodiment, the poly-A sequence comprises 100-400 nucleotides.
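The mRNA embodiments above (a 5’ UTR, an ORF, a 3’ UTR and a poly-A tail of 100-400 nucleotides that may contain non-adenine nucleotides) can be pictured as a simple concatenation followed by a check on the tail. The following Python sketch is illustrative only: the function names and all sequences are hypothetical placeholders, not sequences from this disclosure.

```python
def assemble_mrna(utr5: str, orf: str, utr3: str, poly_a: str) -> str:
    """Concatenate the mRNA elements in 5'-to-3' order."""
    return utr5 + orf + utr3 + poly_a


def describe_poly_a(tail: str) -> dict:
    """Report the length of a poly-A sequence and how many non-adenine bases it contains."""
    tail = tail.upper()
    non_adenine = sum(1 for base in tail if base != "A")
    return {
        "length": len(tail),
        "non_adenine": non_adenine,
        "within_100_400": 100 <= len(tail) <= 400,
    }
```

For example, a tail built as `"A" * 60 + "G" + "A" * 60` is 121 nucleotides long, contains one non-adenine nucleotide, and falls within the 100-400 nucleotide range.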

[0341] In one embodiment, the polynucleotide is operably linked to one or more heterologous control sequences.

[0342] In one embodiment, the heterologous control sequence is a heterologous promoter.

[0343] In one embodiment, the polynucleotide is isolated.

[0344] In one embodiment, the polynucleotide is purified.

[0345] The polynucleotide may be a genomic DNA, a cDNA, a synthetic DNA, a synthetic RNA, a mRNA, or a combination thereof. The polynucleotide may be cloned from a strain of Lachnospiraceae, or a related organism and thus, for example, may be a polynucleotide sequence encoding a variant of the Cas nuclease of the invention.

[0346] In one embodiment, the polynucleotide is obtained from a Lachnospiraceae bacterium cell. In one embodiment, the Lachnospiraceae bacterium cell is a Lachnospiraceae bacterium A10 cell.

[0347] In an embodiment, the polynucleotide is a subsequence encoding a fragment having Cas nuclease activity and / or DNA binding activity of the present invention.

[0348] The polynucleotide may also be mutated by introduction of nucleotide substitutions that do not result in a change in the amino acid sequence of the polypeptide, but which correspond to the codon usage of the host organism intended for production of the enzyme, or by introduction of nucleotide substitutions that may give rise to a different amino acid sequence. For a general description of nucleotide substitution, see, e.g., Ford et al., 1991, Protein Expression and Purification 2: 95-107.
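Substitutions that follow the codon usage of a host without changing the encoded amino acid sequence can be sketched as a codon-by-codon replacement with a check that the translation is unchanged. The following Python example uses the standard genetic code; the preferred-codon table, the function names, and the example sequence are hypothetical and do not reflect the codon usage of any particular host organism.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of three."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the listed preferred synonymous codon, if any."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        choice = preferred.get(amino_acid, codon)
        if CODON_TABLE[choice] != amino_acid:
            raise ValueError("preferred codon is not synonymous")
        out.append(choice)
    return "".join(out)


# Hypothetical host preferences, for illustration only.
PREFERRED = {"L": "CTG", "K": "AAA", "A": "GCG"}
```

For example, `recode("ATGTTAAAGGCTTAA", PREFERRED)` returns `"ATGCTGAAAGCGTAA"`, and both sequences translate to `"MLKA*"`, so the nucleotide sequence changes while the encoded polypeptide does not.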

[0349] Nucleic Acid Constructs

[0350] In a 6th aspect, the present invention relates to a nucleic acid construct or expression vector comprising the polynucleotide according to the 5th aspect of the invention, operably linked to one or more control sequences that direct the production of the nuclease or fusion polypeptide in a cell.

[0351] The present invention also relates to nucleic acid constructs or expression vectors comprising a polynucleotide of the present invention, wherein the polynucleotide is operably linked to one or more control sequences that direct the expression of the coding sequence in a suitable host cell under conditions compatible with the control sequences.

[0352] The polynucleotide may be manipulated in a variety of ways to provide for expression of the polypeptide. Manipulation of the polynucleotide prior to its insertion into a vector may be desirable or necessary depending on the expression vector. Techniques for modifying polynucleotides utilizing recombinant DNA methods are well known in the art.

[0353] Promoters

[0354] The control sequence may be a promoter, a polynucleotide that is recognized by a host cell for expression of a polynucleotide encoding the Cas nuclease. The promoter contains transcriptional control sequences that mediate the expression of the Cas nuclease. The promoter may be any polynucleotide that shows transcriptional activity in the host cell including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.

[0355] Examples of suitable promoters for directing transcription of the polynucleotide of the present invention in a bacterial host cell are described in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Lab., NY, Davis et al., 2012, supra, and Song et al., 2016, PLOS One 11(7): e0158447.

[0356] Examples of suitable promoters for directing transcription of the polynucleotide of the present invention in a filamentous fungal host cell are promoters obtained from Aspergillus, Fusarium, Rhizomucor and Trichoderma cells, such as the promoters described in Mukherjee et al., 2013, “Trichoderma: Biology and Applications”, and by Schmoll and Dattenböck, 2016, “Gene Expression Systems in Fungi: Advancements and Applications”, Fungal Biology.

[0357] For expression in a yeast host, examples of useful promoters are described by Smolke et al., 2018, “Synthetic Biology: Parts, Devices and Applications” (Chapter 6: Constitutive and Regulated Promoters in Yeast: How to Design and Make Use of Promoters in S. cerevisiae), and by Schmoll and Dattenböck, 2016, “Gene Expression Systems in Fungi: Advancements and Applications”, Fungal Biology.

[0358] Terminators

[0359] The control sequence may also be a transcription terminator, which is recognized by a host cell to terminate transcription. The terminator is operably linked to the 3’ terminus of the polynucleotide encoding the Cas nuclease. Any terminator that is functional in the host cell may be used in the present invention.

[0360] Preferred terminators for bacterial host cells may be obtained from the genes for Bacillus clausii alkaline protease (aprH), Bacillus licheniformis alpha-amylase (amyL), and Escherichia coli (E. coli) ribosomal RNA (rrnB).

[0361] Preferred terminators for filamentous fungal host cells may be obtained from Aspergillus or Trichoderma species, such as obtained from the genes for Aspergillus niger glucoamylase, Trichoderma reesei beta-glucosidase, Trichoderma reesei cellobiohydrolase I, and Trichoderma reesei endoglucanase I, such as the terminators described in Mukherjee et al., 2013, “Trichoderma: Biology and Applications”, and by Schmoll and Dattenböck, 2016, “Gene Expression Systems in Fungi: Advancements and Applications”, Fungal Biology.

[0362] Preferred terminators for yeast host cells may be obtained from the genes for Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), and Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful terminators for yeast host cells are described by Romanos et al., 1992, Yeast 8: 423-488.

mRNA Stabilizers

[0363] The control sequence may also be an mRNA stabilizer region downstream of a promoter and upstream of the coding sequence of a gene which increases expression of the gene encoding the Cas nuclease. Examples of suitable mRNA stabilizer regions are obtained from a Bacillus thuringiensis cryIIIA gene (WO 94 / 25612) and a Bacillus subtilis SP82 gene (Hue et al., 1995, J. Bacteriol. 177: 3465-3471).

[0364] Examples of mRNA stabilizer regions for fungal cells are described in Geisberg et al., 2014, Cell 156(4): 812-824, and in Morozov et al., 2006, Eukaryotic Cell 5(11): 1838-1846.

[0365] Leader Sequences

[0366] The control sequence may also be a leader, a non-translated region of an mRNA that is important for translation by the host cell. The leader is operably linked to the 5’ terminus of the polynucleotide encoding the Cas nuclease. Any leader that is functional in the host cell may be used.

[0367] Suitable leaders for bacterial host cells are described by Hambraeus et al., 2000, Microbiology 146(12): 3051-3059, and by Kaberdin and Blasi, 2006, FEMS Microbiol. Rev. 30(6): 967-979.

[0368] Preferred leaders for filamentous fungal host cells may be obtained from the genes for Aspergillus oryzae TAKA amylase and Aspergillus nidulans triose phosphate isomerase.

[0369] Suitable leaders for yeast host cells may be obtained from the genes for Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae 3-phosphoglycerate kinase, Saccharomyces cerevisiae alpha-factor, and Saccharomyces cerevisiae alcohol dehydrogenase / glyceraldehyde-3-phosphate dehydrogenase (ADH2 / GAP).

[0370] Polyadenylation Sequences

[0371] The control sequence may also be a polyadenylation sequence, a sequence operably linked to the 3’ terminus of the polynucleotide encoding the Cas nuclease which, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence that is functional in the host cell may be used.

[0372] Preferred polyadenylation sequences for filamentous fungal host cells are obtained from the genes for Aspergillus nidulans anthranilate synthase, Aspergillus niger glucoamylase, Aspergillus niger alpha-glucosidase, Aspergillus oryzae TAKA amylase, and Fusarium oxysporum trypsin-like protease.

[0373] Useful polyadenylation sequences for yeast host cells are described by Guo and Sherman, 1995, Mol. Cellular Biol. 15: 5983-5990.

[0374] Signal Peptides

[0375] The control sequence may also be a signal peptide coding region that encodes a signal peptide linked to the N-terminus of the Cas nuclease and directs the Cas nuclease into the cell’s secretory pathway. The 5’ end of the coding sequence of the polynucleotide may inherently contain a signal peptide coding sequence naturally linked in translation reading frame with the segment of the coding sequence that encodes the polypeptide. Alternatively, the 5’ end of the coding sequence may contain a signal peptide coding sequence that is heterologous to the coding sequence. A heterologous signal peptide coding sequence may be required where the coding sequence does not naturally contain a signal peptide coding sequence. Alternatively, a heterologous signal peptide coding sequence may simply replace the natural signal peptide coding sequence to enhance secretion of the polypeptide. Any signal peptide coding sequence that directs the expressed polypeptide into the secretory pathway of a host cell may be used.

[0376] Effective signal peptide coding sequences for bacterial host cells are the signal peptide coding sequences obtained from the genes for Bacillus NCIB 11837 maltogenic amylase, Bacillus licheniformis subtilisin, Bacillus licheniformis beta-lactamase, Bacillus stearothermophilus alpha-amylase, Bacillus stearothermophilus neutral proteases (nprT, nprS, nprM), and Bacillus subtilis prsA. Further signal peptides are described by Freudl, 2018, Microbial Cell Factories 17: 52.

[0377] Effective signal peptide coding sequences for filamentous fungal host cells are the signal peptide coding sequences obtained from the genes for Aspergillus niger neutral amylase, Aspergillus niger glucoamylase, Aspergillus oryzae TAKA amylase, Humicola insolens cellulase, Humicola insolens endoglucanase V, Humicola lanuginosa lipase, and Rhizomucor miehei aspartic proteinase, such as the signal peptide described by Xu et al., 2018, Biotechnology Letters 40: 949-955.

[0378] Useful signal peptides for yeast host cells are obtained from the genes for Saccharomyces cerevisiae alpha-factor and Saccharomyces cerevisiae invertase. Other useful signal peptide coding sequences are described by Romanos et al., 1992, supra.

[0379] Propeptides

[0380] The control sequence may also be a propeptide coding sequence that encodes a propeptide positioned at the N-terminus of the Cas nuclease. The resultant polypeptide is known as a proenzyme or propolypeptide (or a zymogen in some cases). A propolypeptide is generally inactive and can be converted to an active polypeptide by catalytic or autocatalytic cleavage of the propeptide from the propolypeptide. The propeptide coding sequence may be obtained from the genes for Bacillus subtilis alkaline protease (aprE), Bacillus subtilis neutral protease (nprT), Myceliophthora thermophila laccase (WO 95 / 33836), Rhizomucor miehei aspartic proteinase, and Saccharomyces cerevisiae alpha-factor.

[0381] Where both signal peptide and propeptide sequences are present, the propeptide sequence is positioned next to the N-terminus of a polypeptide and the signal peptide sequence is positioned next to the N-terminus of the propeptide sequence. Additionally or alternatively, when both signal peptide and propeptide sequences are present, the polypeptide may comprise only a part of the signal peptide sequence and / or only a part of the propeptide sequence. Alternatively, the final or isolated polypeptide may comprise a mixture of mature polypeptides and polypeptides which comprise, either partly or in full length, a propeptide sequence and / or a signal peptide sequence.

[0382] Regulatory Sequences

[0383] It may also be desirable to add regulatory sequences that regulate expression of the Cas nuclease relative to the growth of the host cell. Examples of regulatory sequences are those that cause expression of the gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Regulatory sequences in prokaryotic systems include the lac, tac, and trp operator systems. In yeast, the ADH2 system or GAL1 system may be used. In filamentous fungi, the Aspergillus niger glucoamylase promoter, Aspergillus oryzae TAKA alpha-amylase promoter, and Aspergillus oryzae glucoamylase promoter, Trichoderma reesei cellobiohydrolase I promoter, and Trichoderma reesei cellobiohydrolase II promoter may be used. Other examples of regulatory sequences are those that allow for gene amplification. In fungal systems, these regulatory sequences include the dihydrofolate reductase gene that is amplified in the presence of methotrexate, and the metallothionein genes that are amplified with heavy metals.

[0384] Expression Vectors

[0385] The present invention also relates to recombinant expression vectors comprising a polynucleotide of the present invention, a promoter, and transcriptional and translational stop signals. The various nucleotide and control sequences may be joined together to produce a recombinant expression vector that may include one or more convenient restriction sites to allow for insertion or substitution of the polynucleotide encoding the Cas nuclease at such sites. Alternatively, the polynucleotide may be expressed by inserting the polynucleotide or a nucleic acid construct comprising the polynucleotide into an appropriate vector for expression. In creating the expression vector, the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression.

[0386] The recombinant expression vector may be any vector (e.g., a plasmid or virus) that can be conveniently subjected to recombinant DNA procedures and can bring about expression of the polynucleotide. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vector may be a linear or closed circular plasmid.

[0387] The vector may be an autonomously replicating vector, i.e., a vector that exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector may contain any means for assuring self-replication. Alternatively, the vector may be one that, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. Furthermore, a single vector or plasmid or two or more vectors or plasmids that together contain the total DNA to be introduced into the genome of the host cell, or a transposon, may be used.

[0388] The vector preferably contains one or more selectable markers that permit easy selection of transformed, transfected, transduced, or the like cells. A selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like.

[0389] The vector preferably contains at least one element that permits integration of the vector into the host cell's genome or autonomous replication of the vector in the cell independent of the genome.

[0390] For integration into the host cell genome, the vector may rely on the polynucleotide’s sequence encoding the polypeptide or any other element of the vector for integration into the genome by homologous recombination, such as homology-directed repair (HDR), or non-homologous recombination, such as non-homologous end-joining (NHEJ).

[0391] For autonomous replication, the vector may further comprise an origin of replication enabling the vector to replicate autonomously in the host cell in question. The origin of replication may be any plasmid replicator mediating autonomous replication that functions in a cell. The term “origin of replication” or “plasmid replicator” means a polynucleotide that enables a plasmid or vector to replicate in vivo.

[0392] More than one copy of a polynucleotide of the present invention may be inserted into a host cell to increase production of a Cas nuclease. For example, 2 or 3 or 4 or 5 or more copies are inserted into a host cell. An increase in the copy number of the polynucleotide can be obtained by integrating at least one additional copy of the sequence into the host cell genome or by including an amplifiable selectable marker gene with the polynucleotide where cells containing amplified copies of the selectable marker gene, and thereby additional copies of the polynucleotide, can be selected for by cultivating the cells in the presence of the appropriate selectable agent.

[0393] Host Cells

[0394] In a 7th aspect, the invention relates to cells comprising the Cas nuclease of the 1st aspect, the fusion polypeptide of the 2nd aspect, the composition of the 3rd aspect, the polynucleotide of the 5th aspect, and / or the nucleic acid construct or expression vector of the 6th aspect.

[0395] In one embodiment, the cell is a recombinant cell.

[0396] In a preferred embodiment, the Cas nuclease is heterologous to the cell.

[0397] In one embodiment, the cell comprises at least two copies, e.g., three, four, or five or more copies of the polynucleotide of the 5th aspect or the vector or construct of the 6th aspect. In one embodiment, the genome of the cell comprises a polynucleotide encoding the Cas nuclease of the 1st aspect or fusion polypeptide of the 2nd aspect, a polynucleotide of the 5th aspect, or a nucleic acid construct or expression vector of the 6th aspect.

[0398] In one embodiment, the genome of the recombinant cell comprises at least two copies, e.g., three, four, or five or more copies of a polynucleotide encoding the Cas nuclease of the 1st aspect or fusion polypeptide of the 2nd aspect, of a polynucleotide of the 5th aspect, or of a nucleic acid construct or expression vector of the 6th aspect.

[0399] In an 8th aspect, the invention relates to cells comprising a genome which was modified by the Cas nuclease of the 1st aspect, the fusion polypeptide of the 2nd aspect, the composition of the 3rd aspect, the method of the 4th aspect, the polynucleotide of the 5th aspect, and / or the nucleic acid construct or expression vector of the 6th aspect.

[0400] In one embodiment, the cell is a recombinant cell.

[0401] In one embodiment, the cell is selected from the group consisting of: an archaeal cell, a bacterial cell, a eukaryotic cell, a eukaryotic single-cell organism, a somatic cell, a germ cell, a stem cell, a plant cell, an algal cell, an animal cell, an invertebrate cell, a vertebrate cell, a fish cell, a frog cell, a bird cell, a mammalian cell, a non-human mammalian cell, a pig cell, a cow cell, a goat cell, a sheep cell, a rodent cell, a rat cell, a mouse cell, a non-human primate cell, and a human cell.

[0402] In one embodiment, the cell is a eukaryotic cell.

[0403] In one embodiment, the cell is a prokaryotic cell.

[0404] In one embodiment, the cell is a eukaryotic cell, such as a mammalian cell, a human cell, or a non-human mammalian cell, e.g., a BHK cell, a CHO cell, a mouse cell, a hamster cell, or a rat cell.

[0405] In one embodiment, the cell is a fungal cell, such as a filamentous fungal cell or a yeast cell.

[0406] In one embodiment, the cell is a yeast cell, e.g., a Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia cell, such as a Kluyveromyces lactis, Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, Saccharomyces oviformis, or Yarrowia lipolytica cell.

[0407] In one embodiment the cell is a Pichia pastoris cell.

[0408] In one embodiment, the cell is a filamentous fungal cell, e.g., an Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, or Trichoderma cell, in particular, an Aspergillus awamori, Aspergillus foetidus, Aspergillus fumigatus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Bjerkandera adusta, Ceriporiopsis aneirina, Ceriporiopsis caregiea, Ceriporiopsis gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa, Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Chrysosporium inops, Chrysosporium keratinophilum, Chrysosporium lucknowense, Chrysosporium merdarium, Chrysosporium pannicola, Chrysosporium queenslandicum, Chrysosporium tropicum, Chrysosporium zonatum, Coprinus cinereus, Coriolus hirsutus, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium purpurogenum, Phanerochaete chrysosporium, Phlebia radiata, Pleurotus eryngii, Talaromyces emersonii, Thielavia terrestris, Trametes villosa, Trametes versicolor, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma viride cell.

[0409] In one embodiment, the cell is a Trichoderma cell.

[0410] In one embodiment, the cell is a Trichoderma reesei cell.

[0411] In one embodiment, the cell is an Aspergillus cell.

[0412] In one embodiment, the cell is an Aspergillus niger cell.

[0413] In one embodiment, the cell is an Aspergillus oryzae cell.

[0414] In one embodiment, the cell is a plant cell.

[0415] In one embodiment, the cell is one or more of a maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, switchgrass, soybean, canola, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, vegetable, or safflower cell.

[0416] In one embodiment, the cell is a prokaryotic cell, e.g., a Gram-positive cell selected from the group consisting of Bacillus, Clostridium, Corynebacterium, Enterococcus, Geobacillus, Lactobacillus, Lacticaseibacillus, Lactiplantibacillus, Levilactobacillus, Ligilactobacillus,

[0417] Limosilactobacillus, Lactococcus, Oceanobacillus, Staphylococcus, Streptococcus, or

[0418] Streptomyces cells, or a Gram-negative bacterium selected from the group consisting of Campylobacter, E. coli, Flavobacterium, Fusobacterium, Helicobacter, Ilyobacter, Neisseria, Pseudomonas, Salmonella, and Ureaplasma cells, such as Lacticaseibacillus casei, Lacticaseibacillus paracasei, Lacticaseibacillus rhamnosus, Lactiplantibacillus plantarum, Levilactobacillus brevis, Ligilactobacillus salivarius, Limosilactobacillus fermentum, Limosilactobacillus reuteri, Lactobacillus acidophilus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus johnsonii, Lactobacillus helveticus, Corynebacterium glutamicum, Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus coagulans, Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus stearothermophilus, Bacillus subtilis, Bacillus thuringiensis, Streptococcus equisimilis, Streptococcus pyogenes, Streptococcus uberis, Streptococcus equi subsp. zooepidemicus, Streptomyces achromogenes, Streptomyces avermitilis, Streptomyces coelicolor, Streptomyces griseus, and Streptomyces lividans cells.

[0419] In a preferred embodiment, the cell is a Bacillus cell.

[0420] In one embodiment, the cell is a Bacillus subtilis cell.

[0421] In one embodiment, the cell is a Bacillus licheniformis cell.

[0422] In one embodiment, the cell is a Lacticaseibacillus paracasei cell.

[0423] In one embodiment, the cell is a Streptococcus thermophilus cell.

[0424] In one embodiment, the cell is an E. coli cell.

[0425] In one embodiment, the cell is isolated.

[0426] In one embodiment, the cell is purified.

[0427] The present invention also relates to recombinant host cells, comprising a polynucleotide of the present invention operably linked to one or more control sequences that direct the production of the Cas nuclease.

[0428] A construct or vector comprising a polynucleotide is introduced into a host cell so that the construct or vector is maintained as a chromosomal integrant or as a self-replicating extrachromosomal vector as described earlier. The choice of a host cell will to a large extent depend upon the gene encoding the polypeptide and its source. The Cas nuclease can be native or heterologous to the recombinant host cell. Also, at least one of the one or more control sequences can be heterologous to the polynucleotide encoding the Cas nuclease. The recombinant host cell may comprise a single copy, or at least two copies, e.g., three, four, five, or more copies of the polynucleotide of the present invention.

[0429] For purposes of this invention, Bacillus classes / genera / species shall be defined as described in Patel and Gupta, 2020, Int. J. Syst. Evol. Microbiol. 70: 406-438.

[0430] The bacterial host cell may also be any Streptococcus cell including, but not limited to, Streptococcus equisimilis, Streptococcus pyogenes, Streptococcus uberis, and Streptococcus equi subsp. zooepidemicus cells.

[0431] The bacterial host cell may also be any Streptomyces cell including, but not limited to, Streptomyces achromogenes, Streptomyces avermitilis, Streptomyces coelicolor, Streptomyces griseus, and Streptomyces lividans cells.

[0432] Methods for introducing DNA into prokaryotic host cells are well-known in the art, and any suitable method can be used including, but not limited to, protoplast transformation, competent cell transformation, electroporation, conjugation, and transduction, with DNA introduced as a linearized or circular polynucleotide. Persons skilled in the art will be readily able to identify a suitable method for introducing DNA into a given prokaryotic cell depending, e.g., on the genus. Methods for introducing DNA into prokaryotic host cells are for example described in Heinze et al., 2018, BMC Microbiology 18: 56, Burke et al., 2001, Proc. Natl. Acad. Sci. USA 98: 6289-6294, Choi et al., 2006, J. Microbiol. Methods 64: 391-397, and Donald et al., 2013, J. Bacteriol. 195(11): 2612-2620.

[0433] The host cell may be a fungal cell. “Fungi” as used herein includes the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota as well as the Oomycota and all mitosporic fungi (as defined by Hawksworth et al., In, Ainsworth and Bisby’s Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK).

[0434] Fungal cells may be transformed by a process involving protoplast-mediated transformation, Agrobacterium-mediated transformation, electroporation, biolistic method and shock-wave-mediated transformation as reviewed by Li et al., 2017, Microbial Cell Factories 16: 168 and procedures described in EP 238023, Yelton et al., 1984, Proc. Natl. Acad. Sci. USA 81: 1470-1474, Christensen et al., 1988, Bio/Technology 6: 1419-1422, and Lubertozzi and Keasling, 2009, Biotechn. Advances 27: 53-75. However, any method known in the art for introducing DNA into a fungal host cell can be used, and the DNA can be introduced as a linearized or circular polynucleotide.

[0435] The fungal host cell may be a yeast cell. “Yeast” as used herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and yeast belonging to the Fungi Imperfecti (Blastomycetes). For purposes of this invention, yeast shall be defined as described in Biology and Activities of Yeast (Skinner, Passmore, and Davenport, editors, Soc. App. Bacteriol. Symposium Series No. 9, 1980).

[0436] In a preferred embodiment, the yeast host cell is a Pichia or Komagataella cell, e.g., a Pichia pastoris cell (Komagataella phaffii).

[0437] The fungal host cell may be a filamentous fungal cell. “Filamentous fungi” include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra). The filamentous fungi are generally characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic. In contrast, vegetative growth by yeasts such as Saccharomyces cerevisiae is by budding of a unicellular thallus and carbon catabolism may be fermentative.

[0438] The filamentous fungal host cell may be an Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, or Trichoderma cell. In a preferred embodiment, the filamentous fungal host cell is an Aspergillus, Trichoderma, or Fusarium cell.

[0439] In a further preferred embodiment, the filamentous fungal host cell is an Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, or Fusarium venenatum cell.

[0440] In an 8th aspect, the invention relates to plant cells comprising the Cas nuclease of the 1st aspect, the fusion polypeptide of the 2nd aspect, the composition of the 3rd aspect, the polynucleotide of the 5th aspect, and / or the nucleic acid construct or expression vector of the 6th aspect.

[0441] In one embodiment, the plant cell is one or more of a maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, switchgrass, soybean, canola, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, vegetable, or safflower cell.

[0442] Methods of Production

[0443] In a 9th aspect, the present invention also relates to methods of producing a Cas nuclease of the 1st aspect, or a fusion polypeptide of the 2nd aspect, comprising (a) cultivating the host cell of the 7th aspect under conditions conducive for production of the Cas nuclease or fusion polypeptide; and optionally, (b) recovering the Cas nuclease and / or the fusion polypeptide.

[0444] In one aspect, the cell is a Bacillus cell. In another aspect, the cell is a Bacillus subtilis cell. In another aspect, the cell is a Bacillus licheniformis cell.

[0445] In one embodiment, the cell is a Lacticaseibacillus paracasei cell.

[0446] In one embodiment, the cell is a Streptococcus thermophilus cell.

[0447] In one embodiment, the cell is an E. coli cell.

[0448] In one aspect, the cell is an Aspergillus cell. In another aspect, the cell is an Aspergillus niger cell. In another aspect, the cell is an Aspergillus oryzae cell.

[0449] In one aspect, the cell is a Trichoderma cell. In another aspect, the cell is a Trichoderma reesei cell.

[0450] In one embodiment the cell is a Pichia pastoris cell.

[0451] The host cell is cultivated in a nutrient medium suitable for production of the polypeptide using methods known in the art. For example, the cell may be cultivated by shake flask cultivation, or small-scale or large-scale fermentation (including continuous, batch, fed-batch, or solid-state, and / or microcarrier-based fermentations) in laboratory or industrial fermentors in a suitable medium and under conditions allowing the polypeptide to be expressed and / or isolated. Suitable media are available from commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of the American Type Culture Collection). If the polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly from the medium. If the polypeptide is not secreted, it can be recovered from cell lysates. The polypeptide may be detected using methods known in the art that are specific for the polypeptide, including, but not limited to, the use of specific antibodies, formation of an enzyme product, disappearance of an enzyme substrate, or an assay determining the relative or specific activity of the polypeptide.

[0452] The polypeptide may be recovered from the medium using methods known in the art, including, but not limited to, collection, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation. In one aspect, a whole fermentation broth comprising the polypeptide is recovered. In another aspect, a cell-free fermentation broth comprising the polypeptide is recovered.

[0453] The polypeptide may be purified by a variety of procedures known in the art to obtain substantially pure polypeptides and / or polypeptide fragments (see, e.g., Wingfield, 2015, Current Protocols in Protein Science, 80(1): 6.1.1-6.1.35; Labrou, 2014, Protein Downstream Processing, 1129: 3-10).

[0454] In an alternative aspect, the polypeptide is not recovered.

[0455] Use of the Cas nucleases

[0456] In a 10th aspect, the present invention relates to the use of the Cas nuclease of the 1st aspect, the fusion polypeptide of the 2nd aspect, the composition of the 3rd aspect, the method of the 4th aspect, the polynucleotide of the 5th aspect, or the nucleic acid construct or expression vector of the 6th aspect for modifying a target sequence in a cell, e.g., a target gene.

[0457] In an 11th aspect, the present invention relates to the use of the Cas nuclease of the 1st aspect, the fusion polypeptide of the 2nd aspect, the composition of the 3rd aspect, the method of the 4th aspect, the polynucleotide of the 5th aspect, the nucleic acid construct or expression vector of the 6th aspect, the cell of the 7th aspect, or the cell of the 8th aspect for the manufacture of a medicament for modifying a target sequence in a cell, e.g., a target gene.

[0458] In one embodiment, the targeted cell is selected from the group consisting of: an archaeal cell, a bacterial cell, a eukaryotic cell, a eukaryotic single-cell organism, a somatic cell, a germ cell, a stem cell, a plant cell, an algal cell, an animal cell, a non-human animal cell, an invertebrate cell, a vertebrate cell, a fish cell, a frog cell, a bird cell, a mammalian cell, a non-human mammalian cell, a pig cell, a cow cell, a goat cell, a sheep cell, a rodent cell, a rat cell, a mouse cell, a non-human primate cell, and a human cell.

[0459] Formulations

[0460] In a 12th aspect, the present invention also relates to a formulation comprising (i) the Cas nuclease according to the 1st aspect, the fusion polypeptide according to the 2nd aspect, a composition according to the 3rd aspect, the polynucleotide according to the 5th aspect, the nucleic acid construct or expression vector according to the 6th aspect, the cell according to the 7th aspect, or the cell according to the 8th aspect, and optionally, (ii) one or more of a lipid, a liposome, a hydrogel, a microparticle, a nanoparticle, or a block copolymer micelle.

[0461] In one embodiment, the lipid is a lipid nanoparticle.

[0462] In one embodiment, the Cas nuclease or fusion polypeptide is in a lyophilized formulation.

[0463] In one embodiment, the Cas nuclease or fusion polypeptide is in a liquid formulation.

[0464] In one embodiment, the Cas nuclease or fusion polypeptide is in a substantially endotoxin-free formulation.

[0465] Delivery

[0466] The Cas nuclease or CRISPR compositions described herein may be delivered as a protein, DNA molecules, RNA molecules, ribonucleoproteins (RNP), nucleic acid vectors, or any combination thereof. In some embodiments, the RNA molecule comprises a chemical modification. Non-limiting examples of suitable chemical modifications include 2'-O-methyl (M), 2'-O-methyl, 3'-phosphorothioate (MS) or 2'-O-methyl, 3'-thioPACE (MSP), pseudouridine, and 1-methyl-pseudouridine. Each possibility represents a separate embodiment of the present invention.

[0467] The Cas nucleases and / or polynucleotides encoding same described herein, and optionally additional proteins (e.g., ZFPs, TALENs, transcription factors, restriction enzymes) and / or nucleotide molecules such as guide RNA may be delivered to a target cell by any suitable means. The target cell may be any type of cell e.g., eukaryotic or prokaryotic, in any environment e.g., isolated or not, maintained in culture, in vitro, ex vivo, in vivo or in planta.

[0468] In some embodiments, the composition to be delivered includes mRNA of the nuclease and RNA of the guide. In some embodiments, the composition to be delivered includes mRNA of the nuclease, RNA of the guide and a donor template. In some embodiments, the composition to be delivered includes the Cas nuclease and guide RNA. In some embodiments, the composition to be delivered includes the Cas nuclease, guide RNA and a donor template for gene editing via, for example, homology directed repair (HDR). In some embodiments, the composition to be delivered includes mRNA of the nuclease, DNA-targeting RNA and the tracrRNA. In some embodiments, the composition to be delivered includes mRNA of the nuclease, DNA-targeting RNA and the tracrRNA and a donor template. In some embodiments, the composition to be delivered includes the Cas nuclease, DNA-targeting RNA and the tracrRNA. In some embodiments, the composition to be delivered includes the Cas nuclease, DNA-targeting RNA and the tracrRNA and a donor template for gene editing via, for example, homology directed repair.

[0469] For the foregoing embodiments, each embodiment disclosed herein is contemplated as being applicable to each of the other disclosed embodiments. For example, it is understood that any of the RNA molecules or compositions of the present invention may be utilized in any of the methods of the present invention.

[0470] As used herein, all headings are simply for organization and are not intended to limit the disclosure in any manner. The content of any individual section may be equally applicable to all sections.

[0471] The present invention is further described by the following examples that should not be construed as limiting the scope of the invention.

[0472] EXAMPLES

[0473] Example 1: Identification of novel Cas nucleases

[0474] The Cas nuclease with SEQ ID NO: 1 was identified by mining bacterial genomes using a bioinformatic pipeline developed by the inventors of the instant invention (Fig. 1). The pipeline includes several modules that are customized for specific tasks, including the identification of Cas nuclease genes, CRISPR arrays, and tracrRNA, as well as matching and ranking of these features. To identify Cas nuclease genes, the pipeline employs state-of-the-art tools with a large suite of Hidden Markov Models (HMMs) and a scoring scheme to predict the Cas nuclease subtype. Additionally, new HMMs were built on proprietary data. Identified Cas enzymes were filtered for the presence of conserved domains as well as essential catalytic residues. To identify CRISPR arrays, the pipeline uses a combination of searching for repetitive sequences and aligning them to known repeats. A filtering process was applied to exclude repeats that do not meet CRISPR-specific criteria. A k-mer-based machine learning approach (extreme gradient boosting trees) was applied to predict the subtype of the CRISPR arrays based on the consensus repeat. To identify tracrRNAs, the pipeline uses a combination of scanning for anti-repeat sequences, using the CRISPR repeat consensus sequences of identified CRISPR arrays as queries, and sequence-structure covariance models derived from aligning sequences to experimentally validated tracrRNA tail structures. The complementarity of the CRISPR direct repeats and the tracrRNA was evaluated via alignment using a custom scoring system.
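The k-mer-based step described above can be illustrated with a minimal sketch, assuming that each consensus repeat is converted into a fixed-length vector of normalized k-mer frequencies before being passed to a gradient-boosted tree classifier. The function name kmer_features, the choice of k = 3, and the normalization are assumptions made for illustration only; the actual features and model settings of the pipeline are not disclosed here.

```python
from itertools import product


def kmer_features(repeat, k=3):
    """Return normalized k-mer frequencies of a repeat in a fixed k-mer order.

    Hypothetical featurization; k and the normalization are illustrative.
    """
    # Enumerate all 4**k k-mers so every repeat maps to a same-length vector.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = repeat.upper().replace("U", "T")  # accept RNA or DNA input
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # windows with ambiguous bases are skipped
            counts[kmer] += 1
    total = max(sum(counts.values()), 1)  # avoid division by zero
    return [counts[km] / total for km in kmers]
```

A vector of this kind could serve as one row of a feature matrix for any tree-based classifier; the subtype labels used for training would come from repeats of known subtype.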

[0475] Table 5 provides an overview of the sequences (SEQ ID NOs.) of the identified novel Cas nuclease, nuclease domains, tracrRNA coding sequences, and crRNA coding sequences.

[0476] Table 5. Overview of sequences 1-9.

[0477] Example 2: Cas nuclease 0452 with SEQ ID NO: 1 showing editing activity in Bacillus licheniformis

[0478] An array of sgRNAs is designed, where each sgRNA coding sequence comprises a tracrRNA coding sequence (SEQ ID NO: 7), a crRNA coding sequence (SEQ ID NO: 8) and a spacer sequence directed to an RFP gene (red fluorescent protein) which is encoded in the genome of the B. licheniformis cells.

[0479] The B. licheniformis cells are double fluorescent and comprise in their genomes both (i) a gene encoding RFP, and (ii) a gene encoding GFP (green fluorescent protein). The design of the gRNA is carried out to target several target sequences in the RFP gene, each target sequence being in the proximity of a nnRH PAM sequence within the RFP gene (i.e., each sgRNA targets a different target sequence within the RFP gene). First, to transfer the plasmid DNAs encoding sgRNAs and the 0452 Cas nuclease, a B. subtilis donor strain is conjugated with B. licheniformis recipient strains. Conjugants are spread onto TY agar plates and incubated at 34°C for 1-2 days. After incubation, the colonies appear in different colors under UV light: (i) dark / orange colonies, indicating that both RFP and GFP are expressed, and (ii) bright green colonies, indicating that GFP is expressed and that RFP is no longer expressed. The bright green colonies show that the RFP coding gene is successfully disrupted by the 0452 Cas nuclease, and that the 0452 Cas nuclease shows gene editing activity in Bacillus licheniformis cells.

[0480] Example 3: Cas nuclease 0452 with SEQ ID NO: 1 showing editing activity in Aspergillus niger

[0481] An array of sgRNAs is designed, where each sgRNA coding sequence comprises a tracrRNA coding sequence (SEQ ID NO: 7), a crRNA coding sequence (SEQ ID NO: 8) and a spacer sequence directed to a polyketide synthase (wA) gene which is encoded in the genome of the Aspergillus niger cells. The wA gene is required for synthesis of a green pigment present in the walls of mature asexual spores. The knockout of the wA gene results in a black to white color change (Handbook of Industrial Mycology, 2004; Editor: Zhiqiang An; and Joergensen TR et al., Fungal Genetics and Biology, 48(5), 2011).

[0482] The design of the gRNA is carried out to target several target sequences in the wA gene, each target sequence being in the proximity of a nnRH PAM sequence within the wA gene (i.e., each sgRNA targets a different target sequence within the wA gene). A. niger cells are transformed with plasmids encoding the sgRNA and the 0452 Cas nuclease. The polynucleotide encoding the Cas nuclease is codon-optimized for expression in Aspergillus niger.
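The selection of target sequences next to a nnRH PAM can be sketched as a simple sequence scan. The sketch below assumes that the PAM lies directly 3' of the protospacer on the scanned strand and that the spacer is 20 nucleotides long; neither is stated in the text, which only requires the target to be "in the proximity of" the PAM. The function names and defaults are hypothetical. The IUPAC codes follow the standard definitions (N = any base, R = A or G, H = A, C or T).

```python
import re

# Standard IUPAC codes needed to expand a degenerate PAM into a regex.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "H": "[ACT]", "K": "[GT]",
}


def pam_to_regex(pam):
    """Translate a degenerate PAM such as 'NNRH' into a regex fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_targets(seq, pam="NNRH", spacer_len=20):
    """List (start, protospacer, pam) for forward-strand candidate sites.

    Assumes the PAM is immediately 3' of the protospacer (illustrative).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(
        r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_to_regex(pam))
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]
```

Sites on the opposite strand could be found by passing the reverse complement of the gene sequence to the same function. Candidates would still need to be checked for uniqueness in the host genome before use.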

[0483] After cultivation, the transformants appear in different colors: (i) dark / black spores indicating that the wA gene is still expressed in the majority of spores, and (ii) white spores indicating that these spores do not generate the green pigment anymore and that the wA gene is disrupted in the majority of these spores. The white spores show that the wA gene is successfully disrupted by the 0452 Cas nuclease, and that the 0452 Cas nuclease shows gene editing activity in Aspergillus niger cells.

[0484] The invention described and claimed herein is not to be limited in scope by the specific aspects herein disclosed, since these aspects are intended as illustrations of several aspects of the invention. Any equivalent aspects are intended to be within the scope of this invention. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims. In the case of conflict, the present disclosure including definitions will control.

[0485] Example 4: Testing Cas nuclease 0452 (SEQ ID NO: 1) activity in Bacillus licheniformis

[0486] To evaluate the editing activity of the Cas nuclease 0452 in B. licheniformis, a knockout experiment was designed as follows:

[0487] Materials and methods

[0488] Chemicals used as buffers and substrates were commercial products of at least reagent grade. The following media were used for bacterial growth:

[0489] LB: See EP 0 506 780.

[0490] TY: See WO 94 / 14968, p. 16.

[0491] To select for erythromycin resistance, agar and liquid media were supplemented with 5 micro-gram / ml erythromycin. To select for tetracycline resistance, agar and liquid media were supplemented with 15 micro-gram / ml tetracycline Strains

[0492] • PP3724: This strain is a B. subtilis derivative containing pLS20, wherein the methylase gene M.bli1904II (US20130177942) is expressed from a triple promoter at the amyE locus, the pBC16-derived orf beta and the B. subtilis comS gene (and a kanamycin resistance gene) are expressed from a triple promoter at the alr locus (making the strain D-alanine requiring), and the B. subtilis comS gene (and a cat gene) are expressed from a triple promoter at the pel locus.

[0493] • BT12257: This strain is a PP3724 derivative.

[0494] • SJ1904: This strain is a B. licheniformis derivative, described in WO 2008 / 066931 .

[0495] • MDT545: This strain is a SJ1904 derivative, described in WO 2021 / 183622.

[0496] The B. licheniformis strain MDT545 (WO 2021 / 183622) was used as the host because it has a DsRED expression cassette (SEQ ID NO: 10) integrated at the amyL locus of its chromosome and a GFP expression cassette integrated at the xylA locus. A DsRED knockout should abolish red fluorescence and leave cells showing only green fluorescence, so edited cells could be screened by a visible phenotypic change. The deletion cassette for the DsRED gene (SEQ ID NO: 11) was designed to delete 67 bp in the middle of the DsRED coding sequence. Protospacers were designed within the region to be deleted.

[0497] DNA construction and transformation

[0498] All of the constructions described in the examples were assembled from synthetic DNA fragments ordered from TWIST Bioscience, USA. The fragments were assembled by sequence overlap extension (SOE) as described in the examples. DNA manipulations (plasmid and genomic DNA preparation, restriction digestion, purification, ligation, DNA sequencing) were performed using standard textbook procedures with commercially available kits and reagents. Oligonucleotide primers were obtained from Macrogen, Korea. PCR amplifications were performed using standard textbook procedures, employing a commercial thermocycler and either PrimeStar GXL polymerase (TaKaRa, Japan) or KOD One polymerase (TOYOBO, Japan).

[0499] DNA was introduced into B. subtilis rendered naturally competent, either using a two-step procedure (Yasbin et al., 1975, J. Bacteriol. 121: 296-304), or a one-step procedure, in which cell material from an agar plate was resuspended in Spizizen 1 medium (WO 2014 / 052630), 12 ml was shaken at 200 rpm for approx. 4 hours at 37 °C, DNA was added to 400 microliter aliquots, and these were shaken further at 150 rpm for 1 hour at the desired temperature before plating on selective agar plates. DNA was introduced into B. licheniformis by conjugation from B. subtilis, essentially as previously described (EP2029732 B1), using a modified B. subtilis donor strain PP3724 or its derivative BT12257, containing pLS20.

[0500] Guide RNA expression plasmid construction

[0501] We designed eight protospacers with different PAM sequences (SEQ ID NO: 20-27). First, synthetic DNA fragments were prepared, each carrying an optimized single guide RNA backbone and a 21-base protospacer sequence (SEQ ID NO: 12-19) targeting the DsRED gene. These DNA fragments were assembled with the linearized backbone plasmid pMDT411 by SOE PCR, which resulted in pKJIT001-008 (Figure 3). Primers used for plasmid construction and sequence check are described in the list (SEQ ID NO: 30-45).

[0502] Nuclease 0452 expression plasmid

[0503] A maltose-inducible nuclease 0452 expression plasmid was constructed. To make a constitutive expression plasmid, the nuclease coding sequences were amplified by colony PCR with the B. subtilis clone having the above plasmids. The nuclease 0452 gene fragment was then assembled with the linearized backbone plasmid pMDT417 by SOE PCR, which resulted in pKJIT016 (Figure 4). Primers used for plasmid construction and sequence check are described in the list (SEQ ID NO:30-45).

[0504] Nuclease 0452 Activity testing

[0505] For the knockout experiments, 21-bp protospacer sequences targeting the DsRED gene were designed (SEQ ID NOs: 12-19) based on the hypothetical PAMs of nuclease 0452 (Table 7). Guide plasmids were designed to contain the deletion cassette of the DsRED gene and an optimized sgRNA scaffold (SEQ ID NO: 28). The coding sequence of nuclease 0452 was codon optimized for B. subtilis (SEQ ID NO: 29).
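The protospacer selection described above can be illustrated with a short Python sketch. It is not the procedure used in the example: it assumes the nRARK PAM motif given elsewhere in this disclosure, a 21-nt protospacer, and a PAM located immediately 3' of the protospacer, and it scans only the strand that is passed in. The function names and the toy target sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes used to expand a degenerate PAM into a regular expression.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "H": "[ACT]", "B": "[CGT]",
    "V": "[ACG]", "D": "[AGT]", "N": "[ACGT]",
}


def pam_to_regex(pam):
    """Expand a degenerate PAM such as 'NRARK' into a regular expression."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_protospacers(seq, pam="NRARK", spacer_len=21):
    """Return (start, protospacer, pam) for each site on the given strand
    where a protospacer is immediately followed on its 3' side by the PAM."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_to_regex(pam)))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical toy target: 21 nt followed by a PAM matching nRARK (T-G-A-G-T).
toy_target = "A" * 21 + "TGAGT"
hits = find_protospacers(toy_target)
```

To cover both strands, the same function could be applied to the reverse complement of the target sequence.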

[0506] The editing efficiency of nuclease 0452 was calculated as follows:

[0507] Editing Efficiency (%) = (G / N) × 100, wherein G indicates the number of colonies showing only green fluorescence on agar plates containing tetracycline and erythromycin, and N indicates the total number of colonies grown on said medium. Successful editing was verified by visual phenotypic screening and genotyping.
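As a worked illustration of the efficiency calculation, the following Python sketch computes the percentage of edited colonies from two colony counts. The function name and the example counts are hypothetical and are not taken from Table 6.

```python
def editing_efficiency(edited_colonies, total_colonies):
    """Percentage of colonies showing the edited phenotype.

    edited_colonies: colonies showing only green fluorescence (G).
    total_colonies:  all colonies grown on the selective medium (N).
    """
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    if not 0 <= edited_colonies <= total_colonies:
        raise ValueError("edited_colonies must lie between 0 and total_colonies")
    return 100.0 * edited_colonies / total_colonies


# Hypothetical counts: 7 green colonies out of 50 colonies in total.
example = editing_efficiency(7, 50)
```

A plate with no colonies at all gives an undefined efficiency, which is why the sketch raises an error instead of returning zero.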

[0508] As shown in Figure 2 and Table 6 below, the editing efficiency of nuclease 0452 ranged from 14 to 100%. The guide plasmids pKJIT007 and 008 did not give any green colonies, suggesting less efficient PAMs or protospacers.

Table 6. Summary of editing efficiency in B. licheniformis MDT545.

[0509] To verify that the green colony phenotypes resulted from targeted gene editing of the DsRED gene, colony PCR and sequencing were performed on green colonies obtained from the conjugants of nuclease 0452. The sequencing results confirmed the presence of the DsRED gene deletion cassette in all analyzed samples. This indicates that nuclease 0452 induced DNA strand breaks at the target sites, triggering homology-directed repair and successful knockout of the DsRED gene. Based on this, we concluded that nuclease 0452 is an effective tool for genome engineering in B. licheniformis hosts, with a very useful efficiency.

[0510] Table 7. Overview of sequences 10-45.

[0511] Example 5: Testing Cas nuclease 0452 (SEQ ID NO: 1) activity in Bacillus subtilis

[0512] To evaluate the editing activity of the Cas nuclease 0452 in B. subtilis a repair experiment was designed as follows:

[0513] Materials

[0514] • LB-agar plates were made using 20 mL per plate of commercially available LB-agar. Plates were made with 10 µg/mL tetracycline, 15 µg/mL kanamycin, or both.

[0515] • A B. subtilis codon optimized nuclease 0452 encoding polynucleotide was ordered at Twist Bioscience, USA. Full sequence provided in SEQ ID NO: 46.

[0516] • gBlocks (short double-stranded synthetic DNA) and primers were ordered from IDT, Integrated DNA Technologies. Sequences are provided in SEQ ID NOs: 54 to 65 (both incl.).

[0517] • Phusion Plus polymerase from Thermo Scientific used for all PCR.

[0518] Strains and plasmids

[0519] • pBC16 plasmid: In-house construct expressing MAD7 under a maltose-inducible promoter and carrying a Tet-resistance gene. The full sequence is provided in SEQ ID NO: 49.

[0520] • pJOE8999: The pJOE8999 plasmid, as described by Sachla et al. (2021) in "A simplified method for CRISPR-Cas9 engineering of Bacillus subtilis" (Microbiol Spectr 9:e00754-21), was obtained from the Bacillus Genetic Stock Center (BGSC #ECE358). This plasmid expresses the Cas9 nuclease under a mannose-inducible promoter and includes the Pvan promoter for constitutive expression of the guide sgRNA when cloned into the vector; it carries a kanamycin (KANA) resistance gene for selection purposes. The full sequence is provided in SEQ ID NO: 50.

• AN1152: This strain is a B. subtilis derivative, where the comS gene and dsRED are expressed from a triple promoter in an expression cassette inserted at the genomic pel locus, making colonies distinctly red and fluorescent.

[0521] • AN1152_dsREDtrunc: A derivative of AN1152 in which the last 131 bp of the dsRED gene have been removed and replaced by 131 bp of an in-house lipase gene. The sequence is provided in SEQ ID NO: 52.

[0522] • E. coli TOP10F'

[0523] The nuclease 0452 specific PAMs and corresponding protospacers are identified within the 131 inserted bp in AN1152_dsREDtrunc. See table 8 below.

[0524] AN1152_dsREDtrunc carrying a plasmid expressing nuclease 0452 was transformed with plasmids containing repair DNA and sgRNA.

[0525] A successful double-stranded cut followed by repair of the complete dsRED gene results in red, fluorescent colonies. The sequence of the dsRED repair DNA is provided in SEQ ID NO: 53.

[0526] DNA construction and transformation

[0527] All of the constructions described in the examples were assembled from synthetic DNA fragments ordered from TWIST Bioscience, USA, or IDT, Belgium, and from PCR fragments amplified from existing plasmids.

[0528] The fragments were assembled by prolonged overlap extension (POE) creating multimeric plasmids suitable for direct transformation in B. subtilis.

[0529] DNA manipulations (plasmid and genomic DNA preparation, restriction digestion, purification, ligation, DNA sequencing) were performed using standard textbook procedures with commercially available kits and reagents. Oligonucleotide primers were obtained from IDT, Belgium. PCR amplifications were performed using standard textbook procedures, employing a commercial thermocycler and Phusion Plus polymerase from Thermo Scientific.

[0530] DNA was introduced into B. subtilis rendered naturally competent, using a one-step procedure, in which cell material from an agar plate was resuspended in Spizizen 1 medium (WO 2014 / 052630), 10 ml was shaken at 220 rpm for approx. 5 hours at 37 °C, 3 µl of POE-PCR product was added to 200 microliter aliquots, and these were shaken further at 220 rpm for 1 hour at the desired temperature before plating the full volume on selective agar plates. The plates were then incubated for 42 hours at 34 °C.

[0531] A maltose-inducible nuclease 0452 expression plasmid was constructed by exchanging the MAD7 gene in pBC16 with a B. subtilis codon optimized nuclease 0452-encoding gene (SEQ ID NO: 46). Plasmid part PCR fragment: template pBC16 (SEQ ID NO: 49) with primers SEQ ID NO: 67 and SEQ ID NO: 68.

[0532] NZ0452 PCR fragment: Synthetic gene from TWIST, SEQ ID NO: 47 with primers SEQ ID NO: 69 and SEQ ID NO: 70.

[0533] Assembled with POE-PCR and used to transform AN1152_dsREDtrunc. Plated on LB+tet. Correct strain with plasmid pBC16_NZ0452 (SEQ ID NO: 48) was identified with Nanopore sequencing.

[0534] Repair DNA plasmid, empty (SEQ ID NO: 51):

[0535] Plasmid part: Template: pJOE8999 (SEQ ID NO: 50). Primers SEQ ID NO: 71 and SEQ ID NO: 72.

[0536] Repair-DNA: Template: AN1152 with functional dsRED-gene. Primers SEQ ID NO: 73 and SEQ ID NO: 74.

[0537] Intermediate part: gBlock from IDT. SEQ ID NO: 66.

[0538] Assembled with POE-PCR and used for transformation of E. coli TOP10F'. The transformations were plated on LB+kana and grown at 37 °C for 24 hours. The sequence was verified with Nanopore sequencing and a plasmid prep was prepared.

[0539] Plasmids with repair DNA and sgRNA:

[0540] Plasmid part: Repair DNA plasmid (SEQ ID NO: 51) with primers SEQ ID NO: 75 and SEQ ID NO: 76. sgRNAs: gBlocks SEQ ID NOs: 54-65 (both incl.).

[0541] Assembled with POE-PCR.

[0542] Nuclease 0452 Activity testing

[0543] Six protospacers were tested with two different versions of tracrRNA, see Table 9. POE fragments as described above were used to transform AN_0452, then plated on LB+TET+KANA with 1 % maltose added.

[0544] 2 µg of plasmid Repair DNA (SEQ ID NO: 51) was also used to transform AN_0452.

[0545] The editing efficiency of nuclease NZ0452 was calculated as follows:

[0546] Editing Efficiency (%) = (R / N) × 100, wherein R indicates the number of red colonies and N indicates the total number of colonies.

[0547] Successful editing was only verified by visual phenotypic screening.

[0548] See all results in Table 8.

Conclusions

[0549] No spontaneous repair was seen from plasmid with only repair-DNA.

[0550] Editing efficiencies ranging from 67 to 100% were found for all 6 protospacers with both versions of tracrRNA. Based on this, we concluded that nuclease 0452 is an effective tool for genome engineering in B. subtilis hosts, with a very high efficiency.

[0551] Table 8. Colony counting results of B. subtilis on LB+KANA+TET+maltose plates

[0552] Table 9. Spacers and sgRNAs for nuclease 0452 with corresponding PAM motif “NRAGK” (SEQ ID NO: 77), PAM sequence, spacer and sgRNA sequences

[0553]

[0554] Table 10. Overview of sequences 46-112 in this example 111 sgRNA NZ0452_opt_04

[0555] Example 6: Testing nuclease 0452 in Aspergillus niger

[0556] To confirm that the CRISPR nuclease 0452 can be used to specifically induce indels at a target region for targeted gene editing, the Aspergillus niger fwnA (wA) knockout / spore colour assay was used. Using this assay, plasmids with the nuclease 0452 and different spacer sequences targeting fwnA (wA) were transformed into Aspergillus niger Mbin118 and screened for their effectiveness in inducing white phenotype colonies, which suggests the presence of insertions / deletions in fwnA (wA).

[0557] As a result, one or more white phenotype transformants were found for the nuclease 0452, showing that the nuclease 0452 is capable of gene editing in this fungal host. Sequencing of the respective targeted protospacer regions showed evidence of indels, suggesting nuclease-induced fwnA (wA) knockout. These results also suggest that, as for Cas9, the PAMs for the nuclease 0452 are situated 3’ of the protospacer. This study confirmed that 0452 and its predicted gRNA scaffolds can be used for targeted gene editing in A. niger.

[0558] White spore assay

[0559] In A. niger, the knockout of polyketide synthase fwnA (wA) leads to a white / fawn spore colour phenotype, thought to be due to inactivation of PpfA-dependent lysine biosynthesis / siderophore biosynthesis (Jorgensen et al., 2011, The molecular and genetic basis of conidial pigmentation in Aspergillus niger. Fungal Genetics and Biology, 48(5), pp. 544-553).

[0560] We hypothesized that this spore colour assay could be used as an indicator of CRISPR nuclease activity. In the absence of repair DNA, DNA strand breaks in Eukaryotes generally cause indels due to error-prone NHEJ DNA repair. Consequently, CRISPR nuclease-induced DNA breaks of fwnA (wA) are expected to cause indels, gene knockout, and subsequently, white spore colour. Therefore, the appearance of white colonies following transformations with putative CRISPR systems targeting fwnA shows targeted nuclease activity.

[0561] Control plasmid and screening plasmid design

[0562] To ascertain that the system functions correctly, two controls were used for each nuclease screening: As a positive control (+ctrl), the S. pyogenes CRISPR / Cas9 system was used to demonstrate that fwnA (wA) knockout by indel induction leads to detectable colour change under the tested experimental conditions. As a negative control (-Ctrl), a plasmid containing the nuclease and spacerless-sgRNA was used to confirm that the system itself did not cause untargeted indels in fwnA (wA).

[0563] For the screening, 21 bp spacer sequences with expected activity targeting A. niger fwnA were designed based on the hypothetical PAMs of the nuclease. Screening plasmids were designed to contain two expression cassettes to express the sgRNA (each targeting a different region of fwnA) and nuclease. sgRNAs were derived from concatenation of the spacer, direct repeat and tracrRNA, while the nuclease gene sequences were codon optimized for A. niger. Constitutive expression was used for the expression cassette.
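The sgRNA layout described above (spacer, then direct repeat, then tracrRNA) can be sketched as a simple concatenation in Python. This is a minimal illustration only: the function name is hypothetical, the example sequences are short placeholders and not the actual repeat or tracrRNA of nuclease 0452, and any linker between repeat and tracrRNA is assumed to be part of the supplied scaffold parts.

```python
def assemble_sgrna(spacer, direct_repeat, tracr_rna, spacer_len=21):
    """Concatenate spacer + direct repeat + tracrRNA into one sgRNA (RNA alphabet)."""
    # Accept DNA or RNA input and normalise to upper-case RNA.
    parts = [p.upper().replace("T", "U") for p in (spacer, direct_repeat, tracr_rna)]
    if len(parts[0]) != spacer_len:
        raise ValueError("spacer must be %d nt long" % spacer_len)
    for part in parts:
        if not part or set(part) - set("ACGU"):
            raise ValueError("sequences must be non-empty and contain only A, C, G, U/T")
    return "".join(parts)


# Placeholder sequences for illustration only.
sgrna = assemble_sgrna("ACGT" * 5 + "A", "GTTT", "AAAC")
```

Keeping the spacer as a separate argument mirrors the screening design, in which the scaffold stays constant and only the 21-nt spacer changes between plasmids.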

[0564] Transformation of controls and screening plasmids

[0565] The control plasmids (+ctrl and -ctrl) and screening plasmids were transformed into A. niger MBin118, and up to 12 random colonies were isolated for each transformation according to the methods documented in “Protoplast generation and transformation”. The ratios of white / black colonies for each transformation were then observed.

[0566] For the screening of nuclease 0452, transformation and cultivation were conducted at 34 °C. White colonies were seen for 11 spacers tested with both wt-sgRNA and optimized-sgRNA (Figure 5).

[0567] The ratio of white colonies ranged from 16-33% with wt-sgRNA and 8-75% with optimized-sgRNA. White spore ratios were higher with optimized-sgRNA, except for spacer 0452-PS4.

[0568] Table 11. CRISPR plasmids

[0569] Results

[0570] The colony number and ratio of white / black phenotypes obtained after nuclease 0452 transformations range from about 25 to 100% as shown in Figure 6, so we concluded that nuclease 0452 was very effective in Aspergillus niger.

Spore PCR / Sanger sequencing of selected transformants

[0571] To further confirm that the white spore phenotype was a result of targeted indel formation in fwnA, spore PCR / sequencing of 2, 3, 4, 2, 2, 1, and 2 white colonies from the transformations of plharl 174-4, plharl 174-10, plharl 175-1, plharl 175-4, plharl 175-10, plharl 175-11, and plharl 175-12, respectively, was done according to the methods documented in “Spore PCR, DNA purification and Sanger sequencing”.

[0572] The sequencing results showed that indels and deletions of various lengths were present in all sequenced samples, and that these indels were all situated in close proximity to the targeted protospacer regions. Only the results for the plasmids targeting 0452-PS1 and 0452-PS4 are shown (Figures 7 and 8). This shows that DNA strand breaks were induced in the target regions by the expressed nuclease, leading to insertions / deletions and subsequent fwnA (wA) knockout. As targeted indel activity was seen for the nuclease 0452, these data also suggest that its PAMs are situated 3’ relative to its protospacers, and that the putative PAM sequences disclosed herein can be used for targeted gene editing in A. niger.

[0573] Conclusions

[0574] Using the A. niger fwnA (wA) knockout / spore colour assay, the nuclease 0452 was able to induce the white spore phenotype. Subsequent spore PCR and sequencing confirmed that the white spore colonies were a result of fwnA knockout caused by nuclease-induced indels at the target regions. Altogether, these results confirm that nuclease 0452 and its corresponding direct repeats and gRNA scaffolds can be used for targeted gene editing in Aspergillus.

[0575] Expression plasmid construction

[0576] Before constructing the expression cassette, the gene sequence encoding nuclease 0452 was codon optimized for A. niger and the nucleoplasmin and SV40 nuclear localization signals were added on to the 5’ and 3’ ends respectively. The optimized sequence is shown in SEQ ID NO: 124. The wt sgRNA is shown in SEQ ID NO:125 and the optimized sgRNA in SEQ ID NO: 126 without spacers.

[0577] Screening plasmid construction

[0578] Prior to screening plasmid construction, various 21 bp gRNA spacer oligo sequences targeting different regions of fwnA were designed (Table 12 below). Flanking 20 bp regions homologous to the plasmid backbone insert sites were added and the oligos were synthesized using commercial oligonucleotide synthesis services. Subsequently, the spacer sequences were joined to the four Asc cut intermediate plasmids using HiFi DNA Assembly (New England Biolabs, USA) to form the final screening plasmids.

Table 12. Target sequences for the various enzymes

[0579] Culture media

[0580] • COVE-N-glyX: 218 g / L Xylitol, 10 g / L glycerol, 2.02 g / L KNO3, 50ml / L COVE salt solution, 25 g / L agar BA10, pH5.3

[0581] • YPG: 4 g / L yeast extract, 1 g / L KH2PO4, 0.5 g / L MgSO4.7aq, 15 g / L glucose, pH 6.0

[0582] • COVE salt solution: 26 g KCl, 26 g MgSO4.7aq, 76 g KH2PO4, 50 ml COVE trace metals / L

[0583] • COVE trace metals: 0.04 g NaB4O7.10aq, 0.4 g CuSO4.5aq, 1.2 g FeSO4.7aq, 0.7 g MnSO4.aq, 0.8 g Na2MoO2.2aq, 10 g ZnSO4.7aq / L

[0584] • COVE-N top agar solution: 342.3 g / L Sucrose, 20ml / L COVE salt solution, 3 g / L NaNO2, 10 g / L Nippon gene agarose L Low melt agarose, 6 drops / L 5N NaOH

[0585] • STC: 0.8 M sorbitol, 50 mM Tris pH 8, 50 mM CaCl2

[0586] • STPC: 40 % PEG4000 in STC buffer.

[0587] • Tween water: 1 g / L Polyoxyethylen (20) Sorbitan Monolaurate (Tween 20)

[0588] • LB: 10 g / L Bacto tryptone, 10 g / L NaCl, 5 g / L Bacto Yeast extract, pH 7.0
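The per-litre recipes above scale linearly with batch volume. As a small worked example, the Python sketch below scales the YPG recipe to the 100 ml used per shake flask; the dictionary and function names are hypothetical, and only the g / L values are taken from the list.

```python
# YPG components in grams per litre, as listed in the culture media section.
YPG_G_PER_L = {
    "yeast extract": 4.0,
    "KH2PO4": 1.0,
    "MgSO4.7aq": 0.5,
    "glucose": 15.0,
}


def scale_recipe(recipe_g_per_l, volume_ml):
    """Return the grams of each component needed for volume_ml of medium."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {name: grams * volume_ml / 1000.0 for name, grams in recipe_g_per_l.items()}


# Amounts for one 100 ml shake-flask culture.
ypg_100_ml = scale_recipe(YPG_G_PER_L, 100)
```

pH adjustment is not covered by the sketch and would still be done on the prepared batch.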

[0589] Protoplast formation

[0590] An agar slant (COVE-N-glyX) was inoculated with spores of MBin118, and the strain was grown at 30 °C until completely sporulated. 9 ml of 0.1% Tween 20 water was added to the slant, and the spores were suspended manually. The spore suspension was transferred to baffled shake flasks (500 ml) containing 100 ml YPG medium.

[0591] The flask was incubated at 30 or 32 °C for 15-20 hrs (60-80 rpm). Mycelia were collected by filtering through Mira-cloth. Mycelia were washed 2-3 times with 0.6 M KCl or 0.7 M KCl + 10 mM CaCl2. Mycelia were resuspended in 20-30 ml of 0.6 M KCl or 0.7 M KCl + 10 mM CaCl2 with 20-48 mg / ml Glucanex and 1.2 mg / ml BSA in a 50 ml centrifuge tube.

[0592] The sample was incubated for 1-1.5 hrs at 30 or 32 °C, 80 rpm, and the protoplasting was monitored frequently by microscopy. After protoplasting was observed, the solution was filtered through Mira-cloth to 25 ml Universal container (Nunc 364211).

[0593] The solution was then centrifuged at 2000 rpm for 10 minutes with slow acceleration. The supernatant was discarded, and the pellet was washed with 5-15 ml STC buffer, then centrifuged at 2000 rpm for 10 minutes with slow acceleration.

[0594] The protoplasts were resuspended in protoplast solution (STC / STPC / DMSO = 8:2:0.1) to a concentration of approx. 2 × 10^7 protoplasts / ml. Mixing and pellet resuspension were done gently by pipetting.
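The resuspension step involves two small calculations: the volume needed to reach roughly 2 × 10^7 protoplasts per ml, and the split of that volume into STC, STPC and DMSO at 8:2:0.1. The Python sketch below works through both; the function names and the example protoplast count are hypothetical.

```python
def resuspension_volume_ml(total_protoplasts, target_per_ml=2e7):
    """Volume of protoplast solution needed to reach the target concentration."""
    if total_protoplasts <= 0 or target_per_ml <= 0:
        raise ValueError("counts must be positive")
    return total_protoplasts / target_per_ml


def protoplast_solution_volumes(total_ml, ratio=(8.0, 2.0, 0.1)):
    """Split total_ml into (STC, STPC, DMSO) volumes according to the 8:2:0.1 ratio."""
    parts = sum(ratio)
    return tuple(total_ml * r / parts for r in ratio)


# Hypothetical yield of 3 x 10^8 protoplasts after washing.
volume = resuspension_volume_ml(3e8)
stc_ml, stpc_ml, dmso_ml = protoplast_solution_volumes(volume)
```

With the hypothetical yield of 3 × 10^8 protoplasts, the pellet would be resuspended in 15 ml, of which about 11.9 ml is STC, about 3.0 ml is STPC and about 0.15 ml is DMSO.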

[0595] Transformation

[0596] For each transformation, the transforming DNA was added to 100 µl of protoplasts in a 14 ml Falcon tube, mixed gently and incubated on ice for more than 30 minutes. 1 ml STPC buffer was added and the solution was mixed gently, then incubated at 37 °C in a water bath for 20 minutes. 10-15 ml COVE-N top agar solution containing 50 µg / ml of Nourseothricin was added to the solution, mixed and poured onto transformation plates.

[0597] After the agar solidified, the plates were incubated at 34 °C until colonies were clearly visible. The transforming DNA volume was less than 10 µl (1-10 µg).

[0598] Strain isolation

[0599] Colonies were picked for each transformation and isolated to COVE-N-glyX agar. The colonies were allowed to sporulate by incubation at 34 °C for 1 week.

[0600] Example 7: Testing nuclease 0452 in Lacticaseibacillus paracasei

[0601] This experiment aimed at evaluating the activity and efficiency of the 0452 nuclease in Lacticaseibacillus paracasei. A killing assay was conducted to assess the nuclease activity. The CRISPR-Cas system facilitates precise targeting and cleavage of DNA. Bacteria typically lack the non-homologous end joining (NHEJ) system and depend on homologous recombination (HR) to repair DNA damage, specifically double strand breaks (DSBs).

[0602] Without a homologous template for repair, they cannot survive CRISPR-induced DSBs. Thus, when a homologous template for repair is not provided, Lb. paracasei cells will be unable to survive CRISPR-induced DSBs, which leads to cell death. This CRISPR-Cas mediated killing effect was used to evaluate the activity and efficiency of the 0452 nuclease.

[0603] Killing assay on upp and ccpA genes

[0604] Targeting plasmids containing both 0452 nuclease codon optimized (SEQ ID NO: 149) and corresponding gRNA cassette, either native (SEQ ID NO: 150) or optimized for B. subtilis (SEQ ID NO: 151), with a targeting or non-targeting spacer were used to transform the wild-type Lb. paracasei strain and its upp deletion mutant.

[0605] The upp gene encodes a phosphoribosyltransferase and is a non-essential gene in Lb. paracasei. The transformation plates with the non-targeting control plasmid were full of colonies for both strains, confirming the competence of the cells. The transformation plates with the upp-targeting plasmids were only full for the Δupp mutant strain. No colonies were observed on the transformation plates with the upp-targeting plasmids for the wild-type (WT) strain, indicating efficient targeted CRISPR-induced DSBs at the upp site. For all tested upp-targeting spacers, a killing effect with 100% efficiency at the upp locus was observed (plate photos not included).

[0606] This killing of colonies following transformations with upp-targeting plasmids demonstrated that the 0452 nuclease efficiently induces DSBs at the upp target region.

[0607] To further confirm that the CRISPR nuclease 0452 can effectively target and cleave the DNA of Lb. paracasei, we designed plasmids targeting a different locus within the Lb. paracasei genome, the ccpA gene, which encodes catabolite control protein A. As before, both the ccpA-targeting plasmids and the non-targeting plasmid control were introduced into the WT Lb. paracasei strain. The results showed full plates when the strain was transformed with the non-targeting plasmid and no colonies when the strain was transformed with the ccpA-targeting plasmids (plate photos not included). This demonstrates that the 0452 nuclease is also active and 100% efficient at the ccpA locus.

[0608] Table 13. Sequences of spacers.

[0609] Example 8: Testing nuclease 0452 in Streptococcus thermophilus

[0610] To confirm the activity of the CRISPR nuclease 0452 a plasmid carrying the nuclease coding sequence and two different spacer sequences targeting malP were transformed into Streptococcus thermophilus (ST) and screened for their effectiveness in inducing knock-out of the gene malP. As a result, transformants were found for both spacers showing that the nuclease is capable of gene editing in ST. Sequencing of the respective targeted protospacer regions showed the expected deletion, suggesting nuclease-induced malP knockout. This confirms that nuclease 0452 and its predicted gRNA scaffolds can be used for targeted gene editing in S. thermophilus.

[0611] Experimental setup

[0612] The activity of the nuclease was tested in a 2-plasmid setting. One plasmid carried the nuclease-encoding gene and another plasmid, ‘pSpacer’, carried the spacer-sgRNA and the homologous recombination (HR) arms serving as template for the malP deletion. In a first step, the pSpacer was transformed into the ST host and plated on selective media. Successful transformants carrying the plasmid were selected and transformed in a second step with the nuclease plasmid and plated on selective media. The colonies containing both pSpacer and the nuclease plasmid pAMB167 were screened for the malP deletion using PCR. Positive PCR clones were confirmed by sequencing.

[0613] Spacers selection

[0614] Spacers were designed within the genetic target malP in ST using the PAM sequence nRARK (SEQ ID NO: 165). Two spacers were predicted in silico and are shown in Table 14 below:

[0615] Table 14. Spacers

Transformation and screening of pSpacer plasmids

[0616] An ST strain was used for transformation of pSpacer plasmids. The strain was cultivated in M17 supplemented with 2% lactose (LM17). Selective media was LM17 supplemented with antibiotic Tetracycline at 2 ug / ml.

[0617] Competent cells were prepared by streaking from a glycerol stock onto an LM17 agar plate and cultivating anaerobically overnight at 40 °C. A single colony was then selected and regrown overnight at 40 °C in 10 ml liquid LM17. 1 ml of the regrown culture was then used to seed 100 ml of LM17 liquid media and incubated at 40 °C for 2 h until the culture reached OD 0.3 - 0.5. The cells were then washed and immediately transformed by electroporation with 1 µg of plasmid pSpacer or the positive control. Settings for the electroporation were capacitance 25 µFD, resistance 200 ohms, voltage 2.5 volts. The time constant recorded during the electroporation was between 4.0 and 4.8 ms.
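For an exponential-decay electroporation pulse, the nominal time constant is the product of resistance and capacitance. The Python sketch below works this out for the settings above, assuming the capacitance setting is 25 µF; the function name is hypothetical.

```python
def nominal_time_constant_ms(resistance_ohm, capacitance_uF):
    """Nominal RC time constant in milliseconds.

    ohm x microfarad gives microseconds; dividing by 1000 converts to milliseconds.
    """
    if resistance_ohm <= 0 or capacitance_uF <= 0:
        raise ValueError("resistance and capacitance must be positive")
    return resistance_ohm * capacitance_uF / 1000.0


# Settings from the protocol: 200 ohms and (assumed) 25 microfarads.
tau_ms = nominal_time_constant_ms(200, 25)
```

The nominal value is 5 ms. The recorded 4.0 to 4.8 ms lies slightly below it, which would be expected if the sample acts as an additional resistance in parallel with the set resistor and so lowers the effective resistance of the circuit.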

[0618] Transformants were obtained with the pSpacers as well as with the positive control. The transformants were confirmed to contain the pSpacer plasmids by PCR using primers Spacer F / R and TetM F / R (Table 15), and the PCR products were confirmed by sequencing.

[0619] Table 15. Primers for pSpacer transformants screening

[0620] Transformation and screening of plasmid pAMB167

[0621] Transformants were made electrocompetent using the same procedure as described above. The deletion plasmid and the positive control containing the nuclease 0452 were transformed into the target strains using the same protocol as above.

[0622] Transformants were obtained and screened by PCR for the presence of the expected deletion of malP using primers malP-del F / R (Table 16). All clones were shown to contain the malP deletion. The PCR products were confirmed by sequencing, showing that nuclease 0452 is effective for genetic engineering in Streptococcus.

Table 16. Primers for malP deletion screening

[0623] The invention is further defined by the following numbered paragraphs:

[0624] 1 . A Cas nuclease selected from the group consisting of:

[0625] (a) a polypeptide having at least 80%, e.g., at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the amino acid sequence of SEQ ID NO: 1 ;

[0626] (b) a polypeptide encoded by a polynucleotide having at least 80%, e.g., at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the polypeptide coding sequence of SEQ ID NO: 2;

[0627] (c) a polypeptide derived from SEQ ID NO: 1 by having 1-30 alterations (e.g., substitutions, deletions and / or insertions) at one or more positions, e.g., 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15 or 16 or 17 or 18 or 19 or 20 or 21 or 22 or 23 or 24 or 25 or 26 or 27 or 28 or 29 or 30 alterations, in particular substitutions, such as conservative amino acid substitutions;

[0628] (d) a polypeptide having a TM-score of at least 0.60, e.g., at least 0.65, at least 0.70, at least 0.75, at least 0.80, at least 0.85, at least 0.90, at least 0.91 , at least 0.92, at least 0.93, at least 0.94, at least 0.95, at least 0.96, at least 0.97, at least 0.98, at least 0.99, or even 1.0, compared to the three-dimensional structure of the polypeptide of SEQ ID NO:1 , wherein the three-dimensional structure is calculated using Alphafold;

[0629] (e) a polypeptide derived from the polypeptide of (a), (b), (c), or (d), wherein the N- and / or C-terminal end has been extended by addition of one or more amino acids; and

[0630] (f) a fragment of the polypeptide of (a), (b), (c), (d) or (e).

[0631] 2. The nuclease of paragraph 1 having nuclease activity, and / or DNA-binding activity.

[0632] 3. The nuclease according to any one of paragraphs 1-2, wherein the nuclease comprises or consists of an amino acid sequence having at least 80%, e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 1.

4. The nuclease of any one of paragraphs 1-3, comprising, consisting essentially of, or consisting of SEQ ID NO: 1.

[0633] 5. The nuclease according to any one of paragraphs 1-4, wherein the nuclease is a fragment of SEQ ID NO: 1 , wherein the fragment preferably contains at least 1000 amino acid residues (e.g., amino acids 1 to 1061 of SEQ ID NO: 1).

[0634] 6. The nuclease of any one of paragraphs 1-5, comprising, consisting essentially of, or consisting of SEQ ID NO: 1.

[0635] 7. The nuclease of any one of paragraphs 1-6, which is encoded by a polynucleotide having at least 80%, at least 85%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the mature polypeptide coding sequence of SEQ ID NO: 2.

[0636] 8. The nuclease of any one of paragraphs 1-7, comprising an N-terminal extension and / or C-terminal extension of 1-10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids, preferably an extension of 1-10 amino acid residues at the N-terminus and / or 1-10 amino acids at the C-terminus, such as 1-5 amino acids.

[0637] 9. The nuclease of any one of paragraphs 1-8, having at most 10%, at most 9%, at most 8%, at most 7%, at most 6%, at most 5%, at most 4%, at most 3%, at most 2% or at most 1% sequence differences to the polypeptide of SEQ ID NO: 1.

[0638] 10. The nuclease of any one of paragraphs 1-9, which differs from the polypeptide of SEQ ID NO: 1, by at most 15 amino acids, such as at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acids.
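As a non-limiting illustration of the percentage and count limits recited above, the following minimal sketch computes a percent identity and a number of differences from two sequences that have already been aligned (gaps written as "-"). The formula used here, identical residues divided by the alignment length minus the number of gapped columns, is an assumption of this sketch; the governing definition is the one given in the definition section under "Sequence Identity". The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    gap_columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            gap_columns += 1
        elif a == b:
            matches += 1
    compared = len(aligned_a) - gap_columns
    if compared == 0:
        raise ValueError("alignment has no ungapped columns")
    return 100.0 * matches / compared


def count_differences(aligned_a, aligned_b):
    """Number of alignment columns that differ (substitutions plus gapped columns)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)


def within_limits(aligned_a, aligned_b, min_identity=80.0, max_differences=15):
    """Check a candidate against an identity floor and a difference ceiling."""
    return (percent_identity(aligned_a, aligned_b) >= min_identity
            and count_differences(aligned_a, aligned_b) <= max_differences)
```

The two measures are kept separate because a short insertion barely changes the percentage over a polypeptide of more than 1000 residues, yet it counts fully toward an "at most 15 amino acids" limit.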

[0639] 11. The nuclease of any one of paragraphs 1-10, which is obtained from or obtainable from a Lachnospiraceae cell.

[0640] 12. The nuclease of any one of paragraphs 1-10, which is obtained from or obtainable from a Lachnospiraceae bacterium cell, e.g., a Lachnospiraceae bacterium A10 cell.

[0641] 13. The nuclease of any one of paragraphs 1-12, comprising one or more functional RuvC domain.

[0642] 14. The nuclease of any one of paragraphs 1-13, comprising one or more functional HNH domain.

[0643] 15. The nuclease of any one of paragraphs 1-14, comprising one or more domain selected from the group consisting of:

[0644] (a) a RuvC domain having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 3, 4, or 5;

[0645] (b) a HNH domain having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 6;

[0646] (c) a RuvC domain derived from SEQ ID NO: 3, 4, or 5, by substitution, deletion or addition of one or several amino acids of SEQ ID NO: 3, 4, or 5;

[0647] (d) a HNH domain derived from SEQ ID NO: 6, by substitution, deletion or addition of one or several amino acids of SEQ ID NO: 6; and

[0648] (e) a fragment of the catalytic domain of (a), (b), (c), or (d); preferably wherein the nuclease has nuclease activity, or wherein the nuclease has nickase activity.

[0649] 16. The nuclease of any one of paragraphs 1-15, wherein the HNH domain has at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO:6.

[0650] 17. The nuclease of any one of paragraphs 1-16, wherein the HNH domain comprises or consists of an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 6.

[0651] 18. The nuclease of any one of paragraphs 1-17, wherein the HNH domain is a variant of SEQ ID NO:6, comprising a substitution, such as a conservative amino acid substitution, a deletion, and / or an insertion at one or more positions.

[0652] 19. The nuclease of any one of paragraphs 1-18, wherein the HNH domain differs from SEQ ID NO: 6, by at most 15 amino acids, such as at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acids.

[0653] 20. The nuclease of any one of paragraphs 1-19, wherein the HNH domain is a fragment of SEQ ID NO:6, wherein the fragment preferably contains at least 20 amino acid residues (e.g., amino acids 20 to 40 of SEQ ID NO: 6), or at least 27 amino acid residues (e.g., amino acids 20 to 47 of SEQ ID NO: 6).

[0654] 21. The nuclease of any one of paragraphs 1-20, wherein the HNH domain comprises, consists essentially of, or consists of SEQ ID NO:6.

[0655] 22. The nuclease of any one of paragraphs 1-21, wherein the RuvC domain has at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 3, 4, or 5.

[0656] 23. The nuclease of any one of paragraphs 1-22, wherein the RuvC domain comprises or consists of an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to SEQ ID NO: 3, 4, or 5.

[0657] 24. The nuclease of any one of paragraphs 1-23, wherein the RuvC domain is a variant of SEQ ID NO: 3, 4, or 5, comprising a substitution, such as a conservative amino acid substitution, a deletion, and / or an insertion at one or more positions.

[0658] 25. The nuclease of any one of paragraphs 1-24, wherein the RuvC domain differs from SEQ ID NO: 3, 4, or 5, by at most 15 amino acids, such as at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acids.

[0659] 26. The nuclease of any one of paragraphs 1-25, wherein the RuvC domain is a fragment of SEQ ID NO: 3, 4, or 5, wherein the fragment preferably contains at least 40 amino acid residues (e.g., amino acids 1 to 40 of SEQ ID NO: 1).

[0660] 27. The nuclease of any one of paragraphs 1-26, wherein the RuvC domain comprises, consists essentially of, or consists of SEQ ID NO: 3, 4, or 5.

[0661] 28. The nuclease of any one of paragraphs 1-27, wherein the nuclease has double-strand break activity towards a DNA target site.

[0662] 29. The nuclease of any one of paragraphs 1-28, wherein the nuclease comprises an amino acid substitution, insertion, or deletion in the one or more RuvC domain.

[0663] 30. The nuclease of any one of paragraphs 1-29, wherein the nuclease comprises an amino acid substitution, insertion, or deletion in the one or more HNH domain.

[0664] 31. The nuclease of any one of paragraphs 1-30, wherein the nuclease is a nickase having one or more inactivated RuvC domain created by an amino acid substitution, insertion, or deletion at a position provided for the nuclease in column 3 of Table 2.

[0665] 32. The nuclease of any one of paragraphs 1-31, wherein the nuclease is a nickase having one or more inactivated HNH domain created by an amino acid substitution, insertion or deletion at a position provided for the nuclease in column 3 of Table 3.

[0666] 33. The nuclease of any one of paragraphs 1-32, wherein the nuclease has a single-stranded break activity towards a DNA target site.

[0667] 34. The nuclease of any one of paragraphs 1-32, wherein the nuclease is a catalytically dead nuclease.

[0668] 35. The nuclease of paragraph 34, wherein the catalytically dead nuclease comprises one or more inactivated RuvC domain and one or more inactivated HNH domain.

[0669] 36. The nuclease of any one of paragraphs 34-35, wherein the catalytically dead nuclease comprising one or more inactivated RuvC domain and one or more inactivated HNH domain is created by one or more amino acid substitution, deletion or insertion at the positions provided for the nuclease in column 3 of Table 2 or column 3 of Table 3.
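The relationship between domain inactivation and cleavage activity set out in paragraphs 28 and 31-36 may be summarized by the following minimal sketch. It treats each catalytic domain as a simple active or inactive flag, which is a simplification made for illustration; the specific positions listed in Table 2 and Table 3 are not reproduced, and the names `Activity` and `classify_nuclease` are hypothetical.

```python
from enum import Enum


class Activity(Enum):
    DOUBLE_STRAND_BREAK = "double-strand break activity"
    NICKASE = "single-strand break (nickase) activity"
    CATALYTICALLY_DEAD = "no cleavage (catalytically dead)"


def classify_nuclease(ruvc_active, hnh_active):
    """Map the state of the RuvC and HNH domains to the expected activity.

    Both domains active      -> double-strand break (paragraph 28)
    Exactly one inactivated  -> nickase (paragraphs 31-33)
    Both inactivated         -> catalytically dead (paragraphs 34-36)
    """
    if ruvc_active and hnh_active:
        return Activity.DOUBLE_STRAND_BREAK
    if ruvc_active or hnh_active:
        return Activity.NICKASE
    return Activity.CATALYTICALLY_DEAD
```

A catalytically dead variant may still retain DNA-binding activity as recited in paragraph 2, so this classification concerns cleavage only.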

[0670] 37. The nuclease of any one of paragraphs 1-36, wherein sequence identity is determined by the method described in the definition section under “Sequence Identity”.

38. The nuclease of any one of paragraphs 1-37, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a eukaryotic cell.

[0671] 39. The nuclease of any one of paragraphs 1-38, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a mammalian cell, e.g., a non-human mammalian cell.

[0672] 40. The nuclease of any one of paragraphs 1-37, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in an E. coli cell.

[0673] 41. The nuclease of any one of paragraphs 1-37, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a Bacillus cell.

[0674] 42. The nuclease of any one of paragraphs 1-37, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a Bacillus subtilis cell.

[0675] 43. The nuclease of any one of paragraphs 1-37, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a Bacillus licheniformis cell.

[0676] 44. The nuclease of any one of paragraphs 1-38, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a filamentous fungal cell.

[0677] 45. The nuclease of any one of paragraphs 1-38, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in an Aspergillus niger cell.

[0678] 46. The nuclease of any one of paragraphs 1-38, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in an Aspergillus oryzae cell.

[0679] 47. The nuclease of any one of paragraphs 1-38, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a Trichoderma reesei cell.

[0680] 48. The nuclease of any one of paragraphs 1-37, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a Lactobacillus cell.

[0681] 49. The nuclease of any one of paragraphs 1-37, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a probiotic cell.

[0682] The nuclease of any one of paragraphs 1-38, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in a S. cerevisiae cell.

[0683] 50. The nuclease of any one of the preceding paragraphs, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in P. pastoris.

[0684] 51. The nuclease of any one of the preceding paragraphs, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in Lb. paracasei (Lacticaseibacillus paracasei or Lactobacillus paracasei).

[0685] 52. The nuclease of any one of the preceding paragraphs, wherein the polynucleotide encoding the nuclease is codon-optimized for expression in S. thermophilus.
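Codon optimization as recited in paragraphs 38-52 changes the polynucleotide while leaving the encoded polypeptide unchanged. The following minimal sketch shows one way to check that property for two coding sequences. It assumes the standard genetic code (NCBI translation table 1) applies to the host concerned and that both inputs are complete, in-frame coding sequences; the function names are hypothetical, and no host-specific codon usage data are included.

```python
# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(coding_sequence):
    """Translate an in-frame DNA or RNA coding sequence; '*' marks a stop codon."""
    dna = coding_sequence.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_same_polypeptide(original, optimized):
    """True if both coding sequences translate to the same amino acid sequence."""
    return translate(original) == translate(optimized)
```

For example, `encodes_same_polypeptide("ATGGCCAAATAA", "ATGGCTAAGTGA")` compares two sequences that differ at three synonymous codons.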

[0686] 53. The nuclease of any one of paragraphs 1-52, wherein the nuclease is a Class 2 Cas nuclease.

54. The nuclease of any one of paragraphs 1-53, wherein the nuclease is a Class 2 Type II Cas nuclease.

[0687] 55. The nuclease of any one of paragraphs 1-54, wherein the nuclease is a Class 2 Type II-A Cas nuclease.

[0688] 56. The nuclease of any one of paragraphs 1-55, wherein the nuclease utilizes a protospacer adjacent motif (PAM) sequence provided for the nuclease in Table 1.

[0689] 57. The nuclease of any one of paragraphs 1-56, wherein the nuclease is non-naturally occurring, e.g., wherein the nuclease is engineered and comprises unnatural or synthetic amino acids.

[0690] 58. The nuclease of any one of paragraphs 1-57, wherein the nuclease is naturally occurring.

[0691] 59. A fusion polypeptide, comprising the Cas nuclease of any one of paragraphs 1-58, and one or more second polypeptide.

[0692] 60. The fusion polypeptide of paragraph 59, wherein the one or more second polypeptide comprises a polypeptide that localizes to one or more subcellular organelles.

[0693] 61. The fusion polypeptide according to any one of paragraphs 59-60, wherein one or more second polypeptide is a nuclear localization sequence (NLS), a cell penetrating peptide, and / or an affinity tag.

[0694] 62. The fusion polypeptide according to any one of paragraphs 59-61, wherein the fusion polypeptide comprises 1-10 or more NLS at or near the amino-terminus, 1-10 or more NLS at or near the carboxy-terminus, or a combination of 1-10 or more NLS at or near the amino-terminus and 1-10 or more NLS at or near the carboxy-terminus.

[0695] 63. The fusion polypeptide according to any one of paragraphs 59-62, wherein the fusion polypeptide comprises 1-4 NLS.

[0696] 64. The fusion polypeptide according to any one of paragraphs 59-63, wherein the fusion polypeptide comprises one NLS.

[0697] 65. The fusion polypeptide according to any one of paragraphs 59-64, wherein the one or more NLS is located within the open-reading frame (ORF) of the nuclease.

[0698] 66. The fusion polypeptide according to any one of paragraphs 59-65, wherein the one or more NLS are in tandem repeats.

[0699] 67. The fusion polypeptide according to any one of paragraphs 59-66, wherein the fusion polypeptide comprises a first NLS and a second NLS.

[0700] 68. The fusion polypeptide according to paragraph 67, wherein the fusion polypeptide comprises a linker sequence between the first NLS and the second NLS.

[0701] 69. The fusion polypeptide according to paragraph 68, wherein the linker between the first NLS and the second NLS comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 amino acids.

70. The fusion polypeptide according to any one of paragraphs 59-69, wherein the one or more second polypeptide comprises a base-editing polypeptide.

[0702] 71. The fusion polypeptide according to any one of paragraphs 59-70, wherein the base-editing polypeptide comprises a base editor domain.

[0703] 72. The fusion polypeptide according to any one of paragraphs 59-71, wherein the fusion polypeptide comprises a linker between the Cas nuclease and the base-editing polypeptide.

[0704] 73. The fusion polypeptide according to any one of paragraphs 59-72, wherein the base-editing polypeptide comprises a deaminase, e.g., a cytidine deaminase, such as an APOBEC3A deaminase, or an adenosine deaminase.

[0705] 74. The fusion polypeptide according to any one of paragraphs 59-73, wherein the one or more second polypeptide comprises a reverse transcriptase, the reverse transcriptase preferably comprising a reverse transcriptase domain.

[0706] 75. The fusion polypeptide according to any one of paragraphs 59-74, wherein the nuclease is fused to one or more NLS of sufficient strength to drive accumulation of a CRISPR complex comprising the Cas nuclease in a detectable amount in the nucleus of a eukaryotic cell.

[0707] 76. The nuclease or fusion polypeptide according to any one of paragraphs 1-75, which is isolated.

[0708] 77. The nuclease or fusion polypeptide according to any one of paragraphs 1-76, which is purified.

[0709] 78. The nuclease or fusion polypeptide according to any one of paragraphs 1-77, wherein sequence identity is determined by the method described in the definition section under “Sequence Identity”.

[0710] 79. A non-naturally occurring composition comprising (i) the Cas nuclease or fusion polypeptide of any one of paragraphs 1-78, and / or (ii) a nucleic acid molecule comprising a sequence encoding the Cas nuclease or fusion polypeptide of any one of paragraphs 1-78.

[0711] 80. The composition according to paragraph 79, wherein the nucleic acid molecule is a chemically modified nucleic acid molecule.

[0712] 81. The composition according to any one of paragraphs 79-80, wherein the nucleic acid molecule is DNA.

[0713] 82. The composition according to any one of paragraphs 79-81, wherein the nucleic acid molecule is RNA.

[0714] 83. The composition according to any one of paragraphs 79-82, wherein the RNA is an mRNA comprising one or more of a 5’ untranslated region (UTR), an open reading frame (ORF) encoding the Cas nuclease or fusion polypeptide, a 3’ UTR, and a poly-adenylyl (polyA) tail.

[0715] 84. The composition according to any one of paragraphs 79-83, wherein the ORF consists of nucleosides selected from adenosine, a modified adenosine, uridine, a modified uridine, guanosine, a modified guanosine, cytidine, and a modified cytidine.

85. The composition according to any one of paragraphs 79-84, wherein the ORF consists of nucleosides selected from adenosine, uridine, a modified uridine, guanosine, and cytidine.

[0716] 86. The composition according to any one of paragraphs 79-86, wherein the nucleic acid molecule is linear.

[0717] 87. The composition according to any one of paragraphs 79-85, wherein the nucleic acid molecule is circular.

[0718] 88. The composition according to any one of paragraphs 79-87, further comprising one or more RNA molecules, or a DNA polynucleotide encoding one or more of the one or more RNA molecules, wherein the one or more RNA molecules and the Cas nuclease or fusion polypeptide do not naturally occur together, and the one or more RNA molecules are configured to form a complex with the Cas nuclease or fusion polypeptide and / or target the complex to a target site.

[0719] 89. The composition according to any one of paragraphs 79-88, wherein the one or more RNA molecule comprises a guide RNA (gRNA), which gRNA comprises a CRISPR RNA (crRNA) and a trans-activating RNA (tracrRNA).

[0720] 90. The composition according to any one of paragraphs 79-89, wherein the one or more RNA molecule is a single-molecule RNA (sgRNA), e.g., wherein the crRNA and the tracrRNA are part of the same RNA molecule.

[0721] 91. The composition according to any one of paragraphs 79-89, wherein the one or more RNA molecule is a dual-molecule RNA, e.g., wherein the crRNA and the tracrRNA are separate RNA molecules.

[0722] 92. The composition according to any one of paragraphs 79-91, further comprising a donor template for homology directed repair (HDR).

[0723] 93. The composition according to any one of paragraphs 79-92, wherein the sequence encoding the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO: 1.

[0724] 94. The composition according to any one of paragraphs 79-93, wherein the one or more RNA molecule comprises a trans-activating RNA (tracrRNA) sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO: 7.

[0725] 95. The composition according to any one of paragraphs 79-93, wherein at least one of the one or more RNA molecule comprises a CRISPR RNA (crRNA) molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO: 8.

[0726] 96. The composition according to any one of paragraphs 79-95, wherein at least one of the one or more RNA molecule comprises or consists of an RNA molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO: 9.

[0727] 97. The composition according to any one of paragraphs 79-96, wherein the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any amino acid sequence of column 1 in Table 4, and the at least one RNA molecule is an RNA molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any polynucleotide sequence of column 4 in Table 4.

[0728] 98. The composition according to any one of paragraphs 79-97, wherein the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the amino acid sequence of SEQ ID NO: 1, and the at least one RNA molecule comprises a crRNA molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO: 8.

[0729] 99. The composition according to any one of paragraphs 79-98, wherein the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any amino acid sequence of column 1 in Table 4, and the at least one RNA molecule comprises a crRNA molecule comprising a guide sequence portion and a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any of the polynucleotide sequences of column 2 in Table 4.

[0730] 100. The composition according to any one of paragraphs 79-99, wherein the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the amino acid sequence of SEQ ID NO: 1, and the at least one RNA molecule comprises a tracrRNA molecule comprising a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to the polynucleotide sequence of SEQ ID NO: 7.

[0731] 101. The composition according to any one of paragraphs 79-100, wherein the Cas nuclease or fusion polypeptide comprises a sequence having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any amino acid sequence of column 1 in Table 4, and the at least one RNA molecule comprises a tracrRNA molecule comprising a sequence encoded by a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity to any of the polynucleotide sequences of column 3 in Table 4.

[0732] 102. The composition according to any one of paragraphs 79-101, wherein the composition further comprises a base editor enzyme.

[0733] 103. The composition according to any one of paragraphs 79-102, wherein the base editor enzyme is an adenosine deaminase or a cytidine deaminase.

[0734] 104. The composition according to any one of paragraphs 79-103, wherein the composition further comprises a reverse transcriptase enzyme.

[0735] 105. A method of modifying a nucleotide sequence at a DNA target site in the genome of a cell, comprising introducing into the cell the Cas nuclease or fusion polypeptide according to any one of paragraphs 1-78, a polynucleotide encoding the Cas nuclease or fusion polypeptide of any one of paragraphs 1-78, and / or the composition of any one of paragraphs 79-104.

[0736] 106. The method according to paragraph 105, wherein the method comprises introducing a DNA-break at the DNA target site.

[0737] 107. The method according to any one of paragraphs 105-106, wherein the DNA-break is a single-strand break.

[0738] 108. The method according to any one of paragraphs 105-106, wherein the DNA-break is a double-strand break.

109. The method according to any one of paragraphs 105-108, wherein the method is carried out under conditions that are permissive for non-homologous end joining (NHEJ), and / or homology-directed repair (HDR).

[0739] 110. The method according to any one of paragraphs 105-109, wherein the Cas nuclease or fusion polypeptide effects a DNA-break in a DNA strand adjacent to a PAM sequence, e.g., adjacent to the PAM sequence “nRARK”, “nVDRV”, “nRAGK”, or “nVDRK”, or adjacent to any one of the PAM sequences mentioned in Table 1.
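The PAM sequences quoted in paragraph 110 are written in degenerate nucleotide notation. As a minimal sketch, and assuming the letters follow the standard IUPAC nucleotide codes (n = any base, R = A or G, K = G or T, V = A, C or G, D = A, G or T), candidate PAM positions on one DNA strand could be located as follows. The function names are hypothetical, and the sketch does not model on which side of the target site the PAM lies or any other requirement of the nuclease.

```python
import re

# IUPAC degenerate nucleotide codes mapped to the bases they permit.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pam_to_regex(pam):
    """Convert a degenerate PAM such as 'nRARK' into a regular expression."""
    return "".join("[" + IUPAC[symbol] + "]" for symbol in pam.upper())


def find_pam_sites(dna, pam):
    """Return 0-based start positions of every (possibly overlapping) PAM match."""
    # The lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=(" + pam_to_regex(pam) + "))")
    return [match.start() for match in pattern.finditer(dna.upper())]
```

For instance, `find_pam_sites("TTGAAGCC", "nRARK")` reports the single window `TGAAG` starting at position 1, since G and A satisfy R and the final G satisfies K.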

[0740] 111. The method according to any one of paragraphs 105-110, wherein the Cas nuclease or fusion polypeptide effects a DNA-break in a DNA strand adjacent to a sequence that is complementary to the PAM sequence.
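Where the motif is to be recognized on the opposite strand, as contemplated in paragraph 111, the degenerate PAM can be converted to its reverse complement so that both strands are described in the same 5' to 3' orientation. The following minimal sketch assumes standard IUPAC codes and standard Watson-Crick pairing; reading the complementary sequence in reverse is a convention chosen for this sketch, and the function name is hypothetical.

```python
# Each IUPAC symbol maps to the symbol covering the complementary bases.
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence that may contain IUPAC degenerate codes."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]
```

Under these assumptions `reverse_complement("nRARK")` gives `MYTYN`, i.e., a window matching `MYTYN` on one strand corresponds to `nRARK` on the other.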

[0741] 112. The method according to any one of paragraphs 105-111, wherein the target site is within a coding region of a protein.

[0742] 113. The method according to any one of paragraphs 105-111, wherein the target site is within a non-coding region of a protein.

[0743] 114. The method according to any one of paragraphs 105-111, wherein the target site is within a regulatory region of a protein, e.g., a promoter.

[0744] 115. The method according to any one of paragraphs 105-114, wherein the cell is a eukaryotic cell.

[0745] 116. The method according to any one of paragraphs 105-114, wherein the cell is a prokaryotic cell.

[0746] 117. The method according to any one of paragraphs 105-114, wherein the cell is a eukaryotic cell, such as a mammalian cell, a human cell, or a non-human mammalian cell, e.g., a BHK cell, a CHO cell, a mouse cell, a hamster cell, or a rat cell.

[0747] 118. The method according to any one of paragraphs 105-114, wherein the cell is a fungal cell, such as a filamentous fungal cell, or a yeast cell.

[0748] 118a. The method according to paragraph 118, wherein the fungal cell is a Pichia cell, e.g., a Pichia pastoris cell.

[0749] 119. The method according to any one of paragraphs 105-114, wherein the cell is a yeast cell, e.g., a Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia cell, such as a Kluyveromyces lactis, Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, Saccharomyces oviformis, or Yarrowia lipolytica cell.

[0750] 120. The method according to any one of paragraphs 105-114, wherein the cell is a filamentous fungal cell, e.g., an Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, or Trichoderma cell, in particular, an Aspergillus awamori, Aspergillus foetidus, Aspergillus fumigatus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Bjerkandera adusta, Ceriporiopsis aneirina, Ceriporiopsis caregiea, Ceriporiopsis gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa, Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Chrysosporium inops, Chrysosporium keratinophilum, Chrysosporium lucknowense, Chrysosporium merdarium, Chrysosporium pannicola, Chrysosporium queenslandicum, Chrysosporium tropicum, Chrysosporium zonatum, Coprinus cinereus, Coriolus hirsutus, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium purpurogenum, Phanerochaete chrysosporium, Phlebia radiata, Pleurotus eryngii, Talaromyces emersonii, Thielavia terrestris, Trametes villosa, Trametes versicolor, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma viride cell.

[0751] 121. The method according to paragraph 120, wherein the cell is a Trichoderma cell.

[0752] 122. The method according to paragraph 121, wherein the cell is a Trichoderma reesei cell.

[0753] 123. The method according to paragraph 120, wherein the cell is an Aspergillus cell.

[0754] 124. The method according to paragraph 123, wherein the cell is an Aspergillus niger cell.

[0755] 125. The method according to paragraph 123, wherein the cell is an Aspergillus oryzae cell.

[0756] 126. The method according to any one of paragraphs 105-115, wherein the cell is a plant cell.

[0757] 127. The method according to paragraph 126, wherein the plant cell is one or more of a maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, switchgrass, soybean, canola, alfalfa, sunflower, cotton, tobacco, peanut, potato, tobacco, Arabidopsis, vegetable, or safflower cell.

[0758] 128. The method according to paragraph 116, wherein the cell is a prokaryotic cell, e.g., a Gram-positive cell selected from the group consisting of Bacillus, Clostridium, Corynebacterium, Enterococcus, Geobacillus, Lactobacillus, Lacticaseibacillus, Lactiplantibacillus, Levilactobacillus, Ligilactobacillus, Limosilactobacillus, Lactococcus, Oceanobacillus, Staphylococcus, Streptococcus, or Streptomyces cells, or a Gram-negative bacterium selected from the group consisting of Campylobacter, E. coli, Flavobacterium, Fusobacterium, Helicobacter, Ilyobacter, Neisseria, Pseudomonas, Salmonella, and Ureaplasma cells, such as Lacticaseibacillus casei, Lacticaseibacillus paracasei, Lacticaseibacillus rhamnosus, Lactiplantibacillus plantarum, Levilactobacillus brevis, Ligilactobacillus salivarius, Limosilactobacillus fermentum, Limosilactobacillus reuteri, Lactobacillus acidophilus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus johnsonii, Lactobacillus helveticus, Corynebacterium glutamicum, Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus coagulans, Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus stearothermophilus, Bacillus subtilis, Bacillus thuringiensis, Streptococcus equisimilis, Streptococcus pyogenes, Streptococcus uberis, and Streptococcus equi subsp. zooepidemicus, Streptomyces achromogenes, Streptomyces avermitilis, Streptomyces coelicolor, Streptomyces griseus, and Streptomyces lividans cells.

[0759] 129. The method according to paragraph 128, wherein the cell is a Bacillus cell.

[0760] 130. The method according to paragraph 128, wherein the cell is a Bacillus subtilis cell.

[0761] 131. The method according to paragraph 128, wherein the cell is a Bacillus licheniformis cell.

[0762] 131a. The method according to paragraph 128, wherein the cell is a Lacticaseibacillus paracasei cell.

[0763] 131b. The method according to paragraph 128, wherein the cell is a Streptococcus thermophilus cell.

[0764] 131c. The method according to paragraph 128, wherein the cell is an E. coli cell.

[0765] 132. A polynucleotide encoding the Cas nuclease or fusion polypeptide according to any one of paragraphs 1-78.

[0766] 133. The polynucleotide of paragraph 132, which comprises or consists of a polynucleotide having at least 60%, e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity to the polypeptide coding sequence of SEQ ID NO: 2.
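
As an illustration of how a percent-identity threshold such as those listed above might be checked, the following is a minimal sketch. It assumes the two sequences have already been globally aligned by an external tool and that identity is counted as identical positions divided by the alignment length excluding gapped columns; this section does not state that convention, so it is an assumption. The function names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: identical positions * 100 divided by
    (alignment length - number of columns containing a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    gap_columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            gap_columns += 1      # column excluded from the denominator
        elif a == b:
            matches += 1
    compared = len(aligned_a) - gap_columns
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy alignment, not taken from SEQ ID NO: 2.
    print(percent_identity("ATGC-A", "ATGGTA"))
    print(meets_threshold("ATGC-A", "ATGGTA", 60.0))
```

In practice the alignment itself would come from an established global-alignment program; only the counting step is sketched here.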

[0767] 134. The polynucleotide according to any one of paragraphs 132-133, wherein the polynucleotide is a chemically modified nucleic acid molecule.

[0768] 135. The polynucleotide according to any one of paragraphs 132-134, wherein the polynucleotide is DNA.

[0769] 136. The polynucleotide according to any one of paragraphs 132-134, wherein the polynucleotide is RNA.

[0770] 137. The polynucleotide according to paragraph 136, wherein the RNA is an mRNA comprising one or more of a 5’ untranslated region (UTR), an open reading frame (ORF) encoding the Cas nuclease or fusion polypeptide, a 3’ UTR, and a poly-adenylyl (polyA) tail.

[0771] 138. The polynucleotide according to paragraph 137, wherein the ORF consists of nucleosides selected from adenosine, a modified adenosine, uridine, a modified uridine, guanosine, a modified guanosine, cytidine, and a modified cytidine.

139. The polynucleotide according to any one of paragraphs 137-138, wherein the ORF consists of nucleosides selected from adenosine, uridine, a modified uridine, guanosine, and cytidine.

[0772] 140. The polynucleotide according to any one of paragraphs 132-139, wherein the polynucleotide is linear.

[0773] 141. The polynucleotide according to any one of paragraphs 132-139, wherein the polynucleotide is circular.

[0774] 142. The polynucleotide according to any one of paragraphs 132-141, wherein the poly-A sequence comprises non-adenine nucleotides.

[0775] 143. The polynucleotide according to any one of paragraphs 132-142, wherein the poly-A sequence comprises 100-400 nucleotides.
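
The two poly-A features recited above (a length of 100-400 nucleotides and the optional presence of non-adenine nucleotides) can be checked with a short script. This is a sketch under assumptions: the range is treated as inclusive, the tail is supplied as a plain string, and the helper name `check_poly_a` is hypothetical.

```python
def check_poly_a(tail: str, min_len: int = 100, max_len: int = 400) -> dict:
    """Summarise a candidate poly-A sequence.

    Assumptions: the 100-400 nucleotide range is inclusive, and any
    character other than 'A' counts as a non-adenine nucleotide.
    """
    tail = tail.upper()
    non_adenine = sum(1 for nt in tail if nt != "A")
    return {
        "length": len(tail),
        "in_range": min_len <= len(tail) <= max_len,
        "non_adenine": non_adenine,
    }


if __name__ == "__main__":
    # Toy tail: 60 A, one interrupting G, 60 A.
    print(check_poly_a("A" * 60 + "G" + "A" * 60))
```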

[0776] 144. The polynucleotide according to any one of paragraphs 132-143, wherein the polynucleotide is operably linked to one or more heterologous control sequences.

[0777] 145. The polynucleotide according to any one of paragraphs 132-144, wherein the heterologous control sequence is a heterologous promoter.

[0778] 146. The polynucleotide according to any one of paragraphs 132-145, which is isolated.

[0779] 147. The polynucleotide according to any one of paragraphs 132-146, which is purified.

[0780] 148. A nucleic acid construct or expression vector comprising the polynucleotide according to any one of paragraphs 132-147, operably linked to one or more control sequences that direct the production of the nuclease or fusion polypeptide in a cell.

[0781] 149. A cell comprising the Cas nuclease or fusion polypeptide of any one of paragraphs 1-78, a polynucleotide of any one of paragraphs 132-147, the nucleic acid construct or expression vector of paragraph 148, or the composition of any one of paragraphs 79-104.

[0782] 150. The cell according to paragraph 149, wherein the cell is a recombinant cell.

[0783] 151. The cell according to any one of paragraphs 149-150, wherein the Cas nuclease is heterologous to the cell.

[0784] 152. The cell according to any one of paragraphs 149-151, wherein the cell comprises at least two copies, e.g., three, four, or five or more copies of the polynucleotide or vector or construct of any one of paragraphs 132-148.

[0785] 153. The cell according to any one of paragraphs 149-152, wherein the genome of the cell comprises a polynucleotide encoding the Cas nuclease or fusion polypeptide of any one of paragraphs 1-78, a polynucleotide of any one of paragraphs 132-147, or a nucleic acid construct or expression vector of paragraph 148.

[0786] 154. The cell according to any one of paragraphs 149-153, wherein the genome of the recombinant cell comprises at least two copies, e.g., three, four, or five, or more copies of a polynucleotide encoding the Cas nuclease or fusion polypeptide of any one of paragraphs 1-78, of a polynucleotide of any one of paragraphs 132-147, or of a nucleic acid construct or expression vector of paragraph 148.

[0787] 155. A cell comprising a genome modified by the Cas nuclease or fusion polypeptide of any one of paragraphs 1-78, by a polynucleotide encoding the Cas nuclease or fusion polypeptide of any one of paragraphs 1-78, by the composition of any one of paragraphs 79-104, by the polynucleotide of any one of paragraphs 132-147, by the nucleic acid construct or expression vector of paragraph 148, and / or by the method of any one of paragraphs 105-131.

[0788] 156. The cell according to any one of paragraphs 149-155, wherein the cell is a recombinant cell.

[0789] 157. The cell of any one of paragraphs 149-156, wherein the cell is selected from the group consisting of: an archaeal cell, a bacterial cell, a eukaryotic cell, a eukaryotic single-cell organism, a somatic cell, a germ cell, a stem cell, a plant cell, an algal cell, an animal cell, an invertebrate cell, a vertebrate cell, a fish cell, a frog cell, a bird cell, a mammalian cell, a non-human mammalian cell, a pig cell, a cow cell, a goat cell, a sheep cell, a rodent cell, a rat cell, a mouse cell, a non-human primate cell, and a human cell.

[0790] 158. The cell of any one of paragraphs 149-157, wherein the cell is a eukaryotic cell.

[0791] 159. The cell of any one of paragraphs 149-157, wherein the cell is a prokaryotic cell.

[0792] 160. The cell of paragraph 158, wherein the cell is a eukaryotic cell, such as a mammalian cell, a human cell, or a non-human mammalian cell, e.g., a BHK cell, a CHO cell, a mouse cell, a hamster cell, or a rat cell.

[0793] 161. The cell of any one of paragraphs 149-160, wherein the cell is a fungal cell, such as a filamentous fungal cell, or a yeast cell.

[0794] 161a. The cell of paragraph 161, wherein the fungal cell is a Pichia pastoris cell.

[0795] 162. The cell of paragraph 161, wherein the cell is a yeast cell, e.g., a Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia cell, such as a Kluyveromyces lactis, Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, Saccharomyces oviformis, or Yarrowia lipolytica cell.

[0796] 163. The cell of paragraph 161, wherein the cell is a filamentous fungal cell, e.g., an Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, or Trichoderma cell, in particular, an Aspergillus awamori, Aspergillus foetidus, Aspergillus fumigatus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Bjerkandera adusta, Ceriporiopsis aneirina, Ceriporiopsis caregiea, Ceriporiopsis gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa, Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Chrysosporium inops, Chrysosporium keratinophilum, Chrysosporium lucknowense, Chrysosporium merdarium, Chrysosporium pannicola, Chrysosporium queenslandicum, Chrysosporium tropicum, Chrysosporium zonatum, Coprinus cinereus, Coriolus hirsutus, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium purpurogenum, Phanerochaete chrysosporium, Phlebia radiata, Pleurotus eryngii, Talaromyces emersonii, Thielavia terrestris, Trametes villosa, Trametes versicolor, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma viride cell.

[0797] 164. The cell of paragraph 163, wherein the cell is a Trichoderma cell.

[0798] 165. The cell of paragraph 163, wherein the cell is a Trichoderma reesei cell.

[0799] 166. The cell of paragraph 163, wherein the cell is an Aspergillus cell.

[0800] 167. The cell of paragraph 163, wherein the cell is an Aspergillus niger cell.

[0801] 168. The cell of paragraph 163, wherein the cell is an Aspergillus oryzae cell.

[0802] 169. The cell of paragraph 158, wherein the cell is a plant cell.

[0803] 170. The cell of paragraph 169, wherein the cell is one or more of a maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, switchgrass, soybean, canola, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, vegetable, or safflower cell.

[0804] 171. The cell according to paragraph 159, wherein the cell is a prokaryotic cell, e.g., a Gram-positive cell selected from the group consisting of Bacillus, Clostridium, Corynebacterium, Enterococcus, Geobacillus, Lactobacillus, Lacticaseibacillus, Lactiplantibacillus, Levilactobacillus, Ligilactobacillus, Limosilactobacillus, Lactococcus, Oceanobacillus, Staphylococcus, Streptococcus, or Streptomyces cells, or a Gram-negative cell selected from the group consisting of Campylobacter, E. coli, Flavobacterium, Fusobacterium, Helicobacter, Ilyobacter, Neisseria, Pseudomonas, Salmonella, and Ureaplasma cells, such as Lacticaseibacillus casei, Lacticaseibacillus paracasei, Lacticaseibacillus rhamnosus, Lactiplantibacillus plantarum, Levilactobacillus brevis, Ligilactobacillus salivarius, Limosilactobacillus fermentum, Limosilactobacillus reuteri, Lactobacillus acidophilus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus johnsonii, Lactobacillus helveticus, Corynebacterium glutamicum, Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus coagulans, Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus stearothermophilus, Bacillus subtilis, Bacillus thuringiensis, Streptococcus equisimilis, Streptococcus pyogenes, Streptococcus uberis, and Streptococcus equi subsp. zooepidemicus, Streptomyces achromogenes, Streptomyces avermitilis, Streptomyces coelicolor, Streptomyces griseus, and Streptomyces lividans cells.

[0805] 171a. The cell according to paragraph 171, wherein the cell is a Lacticaseibacillus paracasei cell.

171b. The cell according to paragraph 171, wherein the cell is a Streptococcus thermophilus cell.

171c. The cell according to paragraph 171, wherein the cell is an E. coli cell.

[0806] 172. The cell of paragraph 171, wherein the cell is a Bacillus cell.

[0807] 173. The cell of paragraph 171, wherein the cell is a Bacillus subtilis cell.

[0808] 174. The cell of paragraph 171, wherein the cell is a Bacillus licheniformis cell.

[0809] 175. The cell of any one of paragraphs 149-174, which is isolated.

[0810] 176. The cell of any one of paragraphs 149-175, which is purified.

[0811] 177. A method of producing a Cas nuclease or fusion polypeptide comprising cultivating the recombinant host cell of any one of paragraphs 149-176 under conditions conducive for production of the Cas nuclease or fusion polypeptide.

[0812] 178. The method of paragraph 177, further comprising recovering the Cas nuclease or the fusion polypeptide.

[0813] 179. Use of the Cas nuclease according to any one of paragraphs 1-58, the fusion polypeptide of any one of paragraphs 59-78, the composition according to any one of paragraphs 79-104, the polynucleotide according to any one of paragraphs 132-147, or the nucleic acid construct or expression vector according to paragraph 148, for modifying a target sequence in a targeted cell.

[0814] 180. Use of the Cas nuclease according to any one of paragraphs 1-58, the fusion polypeptide of any one of paragraphs 59-78, the composition according to any one of paragraphs 79-104, the polynucleotide according to any one of paragraphs 132-147, the nucleic acid construct or expression vector according to paragraph 148, or the cell according to any one of paragraphs 149-176, for the manufacture of a medicament for modifying a target sequence in a targeted cell.

[0815] 181. Use according to any one of paragraphs 179-180, wherein the targeted cell is selected from the group consisting of: an archaeal cell, a bacterial cell, a eukaryotic cell, a eukaryotic single-cell organism, a somatic cell, a germ cell, a stem cell, a plant cell, an algal cell, an animal cell, a non-human animal cell, an invertebrate cell, a vertebrate cell, a fish cell, a frog cell, a bird cell, a mammalian cell, a non-human mammalian cell, a pig cell, a cow cell, a goat cell, a sheep cell, a rodent cell, a rat cell, a mouse cell, a non-human primate cell, and a human cell.

[0816] 182. A formulation comprising (i) the Cas nuclease according to any one of paragraphs 1-58, the fusion polypeptide according to any one of paragraphs 59-78, a composition according to any one of paragraphs 79-104, the polynucleotide according to any one of paragraphs 132-147, the nucleic acid construct or expression vector according to paragraph 148, or the cell according to any one of paragraphs 149-176, and optionally, (ii) one or more of a lipid, a liposome, a hydrogel, a microparticle, a nanoparticle, or a block copolymer micelle.

[0817] 183. The formulation of paragraph 182, wherein the lipid is a lipid nanoparticle.

184. The formulation according to any one of paragraphs 182-183, wherein the Cas nuclease or fusion polypeptide is in a lyophilized formulation.

[0818] 185. The formulation according to any one of paragraphs 182-184, wherein the Cas nuclease or fusion polypeptide is in a liquid formulation.

186. The formulation according to any one of paragraphs 182-185, wherein the Cas nuclease or fusion polypeptide is in a substantially endotoxin-free formulation.

Claims

1. A Cas nuclease selected from the group consisting of:
(a) a polypeptide having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 1;
(b) a polypeptide encoded by a polynucleotide having at least 80% sequence identity to the polypeptide coding sequence of SEQ ID NO: 2;
(c) a polypeptide derived from SEQ ID NO: 1, by having 1-30 alterations (e.g., substitutions, deletions and / or insertions at one or more positions, e.g., 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15 or 16 or 17 or 18 or 19 or 20 or 21 or 22 or 23 or 24 or 25 or 26 or 27 or 28 or 29 or 30 alterations), in particular substitutions, such as conservative amino acid substitutions;
(d) a polypeptide having a TM-score of at least 0.80 compared to the three-dimensional structure of the polypeptide of SEQ ID NO: 1, wherein the three-dimensional structure is calculated using AlphaFold;
(e) a polypeptide derived from the polypeptide of (a), (b), (c), or (d), wherein the N- and / or C-terminal end has been extended by addition of one or more amino acids; and
(f) a fragment of the polypeptide of (a), (b), (c), (d), or (e).
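
For orientation on the TM-score threshold in item (d), the following sketch evaluates the commonly published TM-score formula for a single, already superposed alignment. It is not the procedure of this application: a full TM-score calculation also searches for the superposition that maximises the score, which is omitted here. The per-residue distances, the reference length, the 0.5 Å floor on d0, and the function name `tm_score` are all assumptions made for illustration.

```python
def tm_score(distances, ref_length: int) -> float:
    """TM-score for one fixed superposition.

    distances  : distances in angstroms between aligned residue pairs
                 (assumed to come from an external structural alignment)
    ref_length : number of residues in the reference structure

    Uses TM = (1 / L) * sum(1 / (1 + (d_i / d0)^2)) with
    d0 = 1.24 * (L - 15)^(1/3) - 1.8, floored at 0.5 (assumption).
    """
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")
    if ref_length > 15:
        d0 = 1.24 * (ref_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    d0 = max(d0, 0.5)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / ref_length


if __name__ == "__main__":
    # Toy data: 90 of 100 reference residues aligned at 1.0 angstrom each.
    score = tm_score([1.0] * 90, 100)
    print(round(score, 3), score >= 0.80)
```

Because unaligned residues contribute nothing to the sum, the score is normalised by the reference length rather than by the number of aligned pairs.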

2. The nuclease according to claim 1, wherein the nuclease comprises one or more domains selected from the group consisting of:
(a) a RuvC domain having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 3, 4, or 5;
(b) a HNH domain having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 6;
(c) a RuvC domain derived from SEQ ID NO: 3, 4, or 5 by substitution, deletion, or addition of one or several amino acids of SEQ ID NO: 3, 4, or 5;
(d) a HNH domain derived from SEQ ID NO: 6 by substitution, deletion, or addition of one or several amino acids of SEQ ID NO: 6; and
(e) a fragment of the catalytic domain of (a), (b), (c), or (d).

3. The nuclease of any one of claims 1-2, wherein the nuclease is a nickase having one or more inactivated RuvC domains created by an amino acid substitution, insertion, or deletion at a position provided for the nuclease in column 3 of Table 2.

4. The nuclease of any one of claims 1-3, wherein the nuclease is a nickase having one or more inactivated HNH domains created by an amino acid substitution, insertion or deletion at a position provided for the nuclease in column 3 of Table 3.

5. The nuclease of any one of claims 1-4, wherein the nuclease is a nickase and has a single-stranded break activity towards a DNA target site.

6. The nuclease of any one of claims 1-2, wherein the nuclease is a catalytically dead nuclease.

7. The nuclease of any one of claims 1-6, wherein the nuclease is a Class 2 Type II Cas nuclease.

8. The nuclease of any one of claims 1-7, wherein the nuclease utilizes a protospacer adjacent motif (PAM) sequence provided for the nuclease in Table 1.
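
To illustrate how a degenerate PAM can be located in a sequence, the sketch below expands IUPAC nucleotide codes into a regular expression and reports every match position. Table 1 is not reproduced in this section, so the motif 'nRARK' is taken from the summary of this publication purely as an example. The function names, the forward-strand-only search, and the toy sequence are assumptions.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def pam_to_regex(pam: str) -> str:
    """Translate a degenerate PAM such as 'nRARK' into a regex string."""
    return "".join(IUPAC[code] for code in pam.upper())


def find_pam_sites(sequence: str, pam: str = "nRARK") -> list:
    """Return 0-based start positions of all PAM matches, overlaps included.

    Only the given (forward) strand is scanned; a real search would also
    scan the reverse complement.
    """
    # A lookahead lets the regex engine report overlapping matches.
    pattern = re.compile("(?=(" + pam_to_regex(pam) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(pam_to_regex("nRARK"))
    print(find_pam_sites("TTGAAGTCC"))
```

Each reported position marks a candidate PAM; where the protospacer lies relative to it depends on the nuclease and is not modelled here.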

9. A non-naturally occurring composition comprising (i) the Cas nuclease of any one of claims 1-8, or (ii) a nucleic acid molecule comprising a sequence encoding the Cas nuclease of any one of claims 1-8.

10. The composition according to claim 9, further comprising one or more RNA molecules, or a DNA polynucleotide encoding one or more of the one or more RNA molecules, wherein the one or more RNA molecules and the Cas nuclease do not naturally occur together, and the one or more RNA molecules are configured to form a complex with the Cas nuclease and / or target the complex to a target site.

11. The composition according to any one of claims 9-10, wherein the one or more RNA molecules comprise a guide RNA (gRNA), which gRNA comprises a CRISPR RNA (crRNA) and a trans-activating RNA (tracrRNA).

12. A method of modifying a nucleotide sequence at a DNA target site in the genome of a cell, comprising introducing into the cell the Cas nuclease according to any one of claims 1-8, a polynucleotide encoding the Cas nuclease of any one of claims 1-8, and / or the composition of any one of claims 9-11.

13. A polynucleotide encoding the Cas nuclease according to any one of claims 1-8.

14. A nucleic acid construct or expression vector comprising the polynucleotide according to claim 13, operably linked to one or more control sequences that direct the production of the Cas nuclease in a cell.

15. A cell comprising the Cas nuclease of any one of claims 1-8, the polynucleotide according to claim 13, the nucleic acid construct or expression vector according to claim 14, or the composition of any one of claims 9-11.