Systems and methods for transposing cargo nucleotide sequences

By employing recombinase or transposase complexes with Cas effector systems, the method enhances the efficiency and specificity of nucleotide sequence integration and editing, addressing limitations in existing CRISPR/Cas technologies.

JP7873002B2 · Active · Publication Date: 2026-06-11 · METAGENOMI THERAPEUTICS INC

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Patents
Current Assignee / Owner
METAGENOMI THERAPEUTICS INC
Filing Date
2021-08-23
Publication Date
2026-06-11

Abstract

The present disclosure provides systems and methods for transposing a cargo nucleotide sequence to a target nucleic acid site. These systems and methods may include a first double-stranded nucleic acid comprising the cargo nucleotide sequence, where the cargo nucleotide sequence is configured to interact with a recombinase or transposase complex, a Cas effector complex comprising a Cas effector and at least one engineered guide polynucleotide configured to hybridize to the target nucleic acid site, and a recombinase or transposase complex configured to recruit the cargo nucleotide to the target nucleic acid site.

Description

【Technical Field】
【0001】 Related Applications
This application claims the benefit of U.S. Provisional Patent Application No. 63/069,703, filed August 24, 2020, entitled "SYSTEMS AND METHODS FOR TRANSPOSING CARGO NUCLEOTIDE SEQUENCES"; U.S. Provisional Patent Application No. 63/186,698, filed May 10, 2021, entitled "SYSTEMS AND METHODS FOR TRANSPOSING CARGO NUCLEOTIDE SEQUENCES"; and U.S. Provisional Patent Application No. 63/232,593, filed August 12, 2021, entitled "SYSTEMS AND METHODS FOR TRANSPOSING CARGO NUCLEOTIDE SEQUENCES", each of which is incorporated herein by reference.
【Background Art】
【0002】 Cas enzymes, together with their associated Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) guide ribonucleic acids (RNAs), are thought to be a widespread component of the prokaryotic immune system (present in about 45% of bacteria and about 84% of archaea), protecting microorganisms against non-self nucleic acids such as infecting viruses and plasmids through CRISPR-RNA-guided nucleic acid cleavage. The deoxyribonucleic acid (DNA) elements encoding CRISPR RNA elements can be relatively conserved in structure and length, whereas their CRISPR-associated (Cas) proteins are highly diverse and contain a variety of nucleic acid-interacting domains. Although CRISPR DNA elements were discovered in 1987, the programmable endonuclease cleavage ability of CRISPR/Cas complexes has been recognized only recently, and recombinant CRISPR/Cas systems have since been used in a variety of DNA manipulation and gene editing applications.
【0003】 Sequence Listing
This application includes a sequence listing, which was filed electronically in ASCII format and is incorporated herein by reference in its entirety. The ASCII copy, created on August 20, 2021, is named 55921-714_601_SL.txt and is 488,452 bytes in size.
【Summary of the Invention】
【0004】 In some embodiments, the present disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site, the system comprising: a first double-stranded nucleic acid comprising the cargo nucleotide sequence, wherein the cargo nucleotide sequence is configured to interact with a recombinase or transposase complex; a Cas effector complex comprising a class 2, type II Cas effector and at least one engineered guide polynucleotide configured to hybridize to the target nucleic acid site; and the recombinase or transposase complex, configured to recruit the cargo nucleotide sequence to the target nucleic acid site. In some embodiments, the recombinase or transposase complex is non-covalently bound to the Cas effector complex. In some embodiments, the recombinase or transposase complex is covalently bound to the Cas effector complex. In some embodiments, the recombinase or transposase complex is fused to the Cas effector complex in a single polypeptide. In some embodiments, the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence and a right-hand transposase recognition sequence. In some embodiments, the system further comprises a second double-stranded nucleic acid comprising the target nucleic acid site. In some embodiments, the system further comprises a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some embodiments, the PAM sequence is located 3' of the target nucleic acid site. In some embodiments, the recombinase or transposase complex is a Tn7-type transposase complex. In some embodiments, the engineered guide polynucleotide is configured to bind to the class 2, type II Cas effector.
In some embodiments, the class 2, type II Cas effector comprises a polypeptide comprising a sequence having at least 80% identity to SEQ ID NO: 1 or a variant thereof. In some embodiments, the recombinase or transposase complex comprises at least one, at least two, at least three, or four polypeptides, each comprising a sequence having at least 80% identity to any one of SEQ ID NOs: 2-5 or a variant thereof. In some embodiments, the engineered guide polynucleotide comprises a sequence comprising at least 60-80 consecutive nucleotides having at least 80% identity to SEQ ID NO: 12 or a variant thereof. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 80% identity to SEQ ID NO: 11 or a variant thereof. In some embodiments, the left-hand recombinase sequence comprises a sequence having at least 80% identity to any one of SEQ ID NOs: 17-18 or a variant thereof. In some embodiments, the right-hand recombinase sequence comprises a sequence having at least 80% identity to SEQ ID NO: 19 or a variant thereof. In some embodiments, the class 2, type II Cas effector and the recombinase or transposase complex are encoded by a polynucleotide sequence comprising less than about 10 kilobases.
【0005】 In some embodiments, the present disclosure provides a method for transposing a cargo nucleotide sequence to a target nucleic acid site comprising a target nucleotide sequence, the method comprising expressing any of the systems described herein in a cell, or introducing any of the systems described herein into a cell.
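The "at least 80% identity" thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length (gaps written as "-") and uses one common convention, matches divided by compared columns; the disclosure may define identity through a specific alignment algorithm and parameters elsewhere in the specification. The function names and toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: columns that are gaps in both sequences are ignored, and
    every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # skip columns that are gaps in both sequences
        columns += 1
        if a == b:
            matches += 1  # a gap never equals a residue here, so this is a true match
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when the identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy sequences (hypothetical): 8 of 10 positions match.
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQK"
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate))
```

In practice the alignment step (and whether gaps or terminal overhangs count toward the denominator) changes the reported value, so any real comparison should follow the alignment method the specification prescribes.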
【0006】 In some embodiments, the present disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site, the system comprising: a first double-stranded nucleic acid comprising a cargo nucleotide sequence configured to interact with a Tn7-type transposase complex; a Cas effector complex comprising a class 2, type V Cas effector and an engineered guide polynucleotide configured to hybridize to the target nucleotide sequence; and a Tn7-type transposase complex configured to bind to the Cas effector complex, the Tn7-type transposase complex comprising a TnsA subunit. In some embodiments, the transposase complex is non-covalently bound to the Cas effector complex. In some embodiments, the transposase complex is covalently bound to the Cas effector complex. In some embodiments, the transposase complex is fused to the Cas effector complex in a single polypeptide. In some embodiments, the class 2, type V Cas effector is not a Cas12k effector. In some embodiments, the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence and a right-hand transposase recognition sequence. In some embodiments, the system further comprises a second double-stranded nucleic acid comprising the target nucleic acid site. In some embodiments, the system further comprises a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some embodiments, the PAM sequence is located 5' of the target nucleic acid site. In some embodiments, the engineered guide polynucleotide is configured to bind to the class 2, type V Cas effector. In some embodiments, the TnsA subunit comprises a polypeptide having a sequence having at least 80% identity to SEQ ID NO: 7 or a variant thereof.
In some embodiments, the Tn7-type transposase complex comprises at least one, at least two, or three polypeptides having a sequence having at least 80% identity to any one of SEQ ID NOs: 8-10 or a variant thereof. In some embodiments, the engineered guide polynucleotide comprises a sequence comprising at least 46 to 80 consecutive nucleotides having at least 80% identity to any one of SEQ ID NOs: 13-16 or a variant thereof. In some embodiments, the left-hand recombinase sequence comprises a sequence having at least 80% identity to SEQ ID NO: 20 or a variant thereof. In some embodiments, the right-hand recombinase sequence comprises a sequence having at least 80% identity to SEQ ID NO: 21 or a variant thereof. In some embodiments, the class 2, type V Cas effector is not a Cas12k effector. In some embodiments, the class 2, type V Cas effector and the Tn7-type transposase complex are encoded by a polynucleotide sequence comprising less than about 10 kilobases.
【0007】 In some embodiments, the present disclosure provides a method for transposing a cargo nucleotide sequence to a target nucleic acid site comprising a target nucleotide sequence, the method comprising expressing any of the systems described herein in a cell, or introducing any of the systems described herein into a cell.
【0008】 In some embodiments, the present disclosure provides a method for transposing a cargo nucleotide sequence to a target nucleic acid site, the method comprising contacting a first double-stranded nucleic acid comprising the cargo nucleotide sequence with: a Cas effector complex comprising a class 2, type II Cas effector and at least one engineered guide polynucleotide configured to hybridize to the target nucleic acid site; a recombinase or transposase complex configured to recruit the cargo nucleotide sequence to the target nucleic acid site; and a second double-stranded nucleic acid comprising the target nucleic acid site.
In some embodiments, the recombinase or transposase complex is non-covalently bound to the Cas effector complex. In some embodiments, the recombinase or transposase complex is covalently bound to the Cas effector complex. In some embodiments, the recombinase or transposase complex is fused to the Cas effector complex in a single polypeptide. In some embodiments, the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence and a right-hand transposase recognition sequence. In some embodiments, the target nucleic acid further comprises a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some embodiments, the PAM sequence is located 3' of the target nucleic acid site. In some embodiments, the recombinase or transposase complex is a Tn7-type transposase complex. In some embodiments, the engineered guide polynucleotide is configured to bind to the class 2, type II Cas effector. In some embodiments, the class 2, type II Cas effector comprises a polypeptide comprising a sequence having at least 80% identity to SEQ ID NO: 1 or a variant thereof. In some embodiments, the recombinase or transposase complex comprises at least one, at least two, at least three, or four polypeptides comprising a sequence having at least 80% identity to any one of SEQ ID NOs: 2-5 or a variant thereof. In some embodiments, the engineered guide polynucleotide comprises a sequence comprising at least 60 to 80 consecutive nucleotides having at least 80% identity to SEQ ID NO: 12 or a variant thereof. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 80% identity to SEQ ID NO: 11 or a variant thereof. In some embodiments, the left-hand recombinase sequence comprises a sequence having at least 80% identity to any one of SEQ ID NOs: 17-18 or a variant thereof. In some embodiments, the right-hand recombinase sequence comprises a sequence having at least 80% identity to SEQ ID NO: 19 or a variant thereof.
In some embodiments, the class 2, type II Cas effector and the Tn7-type transposase complex are encoded by a polynucleotide sequence comprising less than about 10 kilobases.
【0009】 In some embodiments, the present disclosure provides a method for transposing a cargo nucleotide sequence to a target nucleic acid site, the method comprising contacting a first double-stranded nucleic acid comprising the cargo nucleotide sequence with: a Cas effector complex comprising a class 2, type V Cas effector and at least one engineered guide polynucleotide configured to hybridize to the target nucleotide sequence; a Tn7-type transposase complex configured to bind to the Cas effector complex, the Tn7-type transposase complex comprising a TnsA subunit; and a second double-stranded nucleic acid comprising the target nucleic acid site. In some embodiments, the transposase complex is non-covalently bound to the Cas effector complex. In some embodiments, the transposase complex is covalently bound to the Cas effector complex. In some embodiments, the transposase complex is fused to the Cas effector complex in a single polypeptide. In some embodiments, the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence and a right-hand transposase recognition sequence. In some embodiments, the target nucleic acid site further comprises a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some embodiments, the PAM sequence is located 3' of the target nucleic acid site. In some embodiments, the engineered guide polynucleotide is configured to bind to the class 2, type V Cas effector. In some embodiments, the TnsA subunit comprises a polypeptide having a sequence having at least 80% identity to SEQ ID NO: 7 or a variant thereof. In some embodiments, the Tn7-type transposase complex comprises at least one, at least two, or three polypeptides having a sequence having at least 80% identity to any one of SEQ ID NOs: 8-10 or a variant thereof.
In some embodiments, the engineered guide polynucleotide comprises a sequence having at least about 46-80 consecutive nucleotides having at least 80% identity to any one of SEQ ID NOs: 13-16 or a variant thereof. In some embodiments, the left-hand recombinase sequence comprises a sequence having at least 80% identity to SEQ ID NO: 20 or a variant thereof. In some embodiments, the right-hand recombinase sequence comprises a sequence having at least 80% identity to SEQ ID NO: 21 or a variant thereof. In some embodiments, the class 2, type V Cas effector is not a Cas12k effector. In some embodiments, the class 2, type V Cas effector and the Tn7-type transposase complex are encoded by a polynucleotide sequence comprising less than about 10 kilobases.
【0010】 In some embodiments, the present disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site, the system comprising: a first double-stranded nucleic acid comprising a cargo nucleotide sequence configured to interact with a Tn7-type transposase complex; a Cas effector complex comprising a class 1, type I-F Cas effector and an engineered guide polynucleotide configured to hybridize to the target nucleotide sequence; and a Tn7-type transposase complex configured to bind to the Cas effector complex, the Tn7-type transposase complex comprising a TnsA subunit. In some embodiments, the transposase complex is non-covalently bound to the Cas effector complex. In some embodiments, the transposase complex is covalently bound to the Cas effector complex. In some embodiments, the transposase complex is fused to the Cas effector complex in a single polypeptide. In some embodiments, the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence and a right-hand transposase recognition sequence. In some embodiments, the system further comprises a second double-stranded nucleic acid comprising the target nucleic acid site.
In some embodiments, the system further comprises a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some embodiments, the PAM sequence is located 3' of the target nucleic acid site. In some embodiments, the PAM sequence is located 5' of the target nucleic acid site. In some embodiments, the engineered guide polynucleotide is configured to bind to the class 1, type I-F Cas effector. In some embodiments, the class 1, type I-F Cas effector comprises a polypeptide comprising a sequence having at least 80% identity to any one of SEQ ID NOs: 41-43 or 48-50 or a variant thereof. In some embodiments, the Tn7-type transposase complex comprises at least one, at least two, or three polypeptides comprising a sequence having at least 80% identity to any one of SEQ ID NOs: 44-46 or 51-53 or a variant thereof.
【0011】 In some embodiments, the present disclosure provides a method for transposing a cargo nucleotide sequence to a target nucleic acid site comprising a target nucleotide sequence, the method comprising expressing any of the systems described herein in a cell, or introducing any of the systems described herein into a cell.
【0012】 In some embodiments, the present disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site, the system comprising: a first double-stranded nucleic acid comprising a cargo nucleotide sequence configured to interact with a Tn7-type transposase complex; a Cas effector complex comprising a class 2, type V Cas effector and an engineered guide polynucleotide configured to hybridize to the target nucleotide sequence; and a Tn7-type transposase complex configured to bind to the Cas effector complex, wherein the Tn7-type transposase complex comprises TnsB, TnsC, and TniQ components, wherein (a) the class 2, type V Cas effector comprises a polypeptide having a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 22, 26, 30, 34, 55-89, 104, or 147 or a variant thereof, or (b) the Tn7-type transposase complex comprises a TnsB, TnsC, or TniQ component having a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150 or a variant thereof. In some embodiments, the transposase complex is non-covalently bound to the Cas effector complex. In some embodiments, the transposase complex is covalently bound to the Cas effector complex. In some embodiments, the transposase complex is fused to the Cas effector complex in a single polypeptide. In some embodiments, the class 2, type V Cas effector comprises a polypeptide comprising a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 22, 26, 30, 34, 55-89, 104, or 147 or a variant thereof. In some embodiments, the Tn7-type transposase complex comprises a TnsB, TnsC, or TniQ component comprising a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150 or a variant thereof. In some embodiments, the class 2, type V Cas effector is a Cas12k effector.
In some embodiments, the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence and a right-hand transposase recognition sequence. In some embodiments, the system further comprises a second double-stranded nucleic acid comprising the target nucleic acid site. In some embodiments, the system further comprises a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some embodiments, the PAM sequence is located 5' of the target nucleic acid site. In some embodiments, the PAM sequence comprises 5'-nGTn-3' or 5'-nGTt-3'. In some embodiments, the engineered guide polynucleotide is configured to bind to the class 2, type V Cas effector. In some embodiments, the TnsB, TnsC, and TniQ components each comprise a polypeptide having a sequence having at least 80% identity to any one of SEQ ID NOs: 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150. In some embodiments, the engineered guide polynucleotide comprises a sequence comprising at least about 46-80 consecutive nucleotides having at least 80% identity to any one of SEQ ID NOs: 90, 91, 92, 93, 117, 151, 156-181, or 209-234. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 111-114, 201-206, 255, 262, 256, 209, 257, 263, 258, or 210 or a variant thereof. In some embodiments, the left-hand recombinase sequence comprises a sequence having at least 80% identity to any one of SEQ ID NOs: 125, 127, 123, 129, 131, 133, 153, or 134 or a variant thereof. In some embodiments, the right-hand recombinase sequence comprises a sequence having at least 80% identity to any one of SEQ ID NOs: 126, 155, 128, 124, 130, 132, or 154 or a variant thereof. In some embodiments, the class 2, type V Cas effector and the Tn7-type transposase complex are encoded by a polynucleotide sequence comprising less than about 10 kilobases.
In some embodiments, (a) the class 2, type V Cas effector comprises a sequence having at least 80% sequence identity to SEQ ID NO: 22 or a variant thereof; (b) the left-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 125 or a variant thereof; (c) the right-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 126 or 155 or a variant thereof; (d) the engineered guide polynucleotide comprises (i) a sequence having at least 80% sequence identity to at least about 46 to 60 nucleotides of SEQ ID NO: 90; or (ii) a sequence having at least 80% sequence identity to the non-degenerate nucleotides of any one of SEQ ID NOs: 94, 112, or 202; or (e) the TnsB, TnsC, and TniQ components comprise sequences having at least 80% sequence identity to SEQ ID NOs: 23 to 25 or variants thereof. In some embodiments, (a) the class 2, type V Cas effector comprises a sequence having at least 80% sequence identity to SEQ ID NO: 26 or a variant thereof; (b) the left-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 127 or a variant thereof; (c) the right-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 128 or a variant thereof; (d) the engineered guide polynucleotide comprises (i) a sequence having at least 80% sequence identity to at least about 46-60 nucleotides of any one of SEQ ID NOs: 91, 156, or 209; or (ii) a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 95, 113, or 203; or (e) the TnsB, TnsC, and TniQ components comprise sequences having at least 80% sequence identity to SEQ ID NOs: 27-29 or variants thereof.
In some embodiments, (a) the class 2, type V Cas effector comprises a sequence having at least 80% sequence identity to SEQ ID NO: 60 or a variant thereof; (b) the left-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 131 or a variant thereof; (c) the right-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 132 or a variant thereof; (d) the engineered guide polynucleotide comprises (i) a sequence having at least 80% sequence identity to at least about 46-60 nucleotides of any one of SEQ ID NOs: 117, 161, or 214; or (ii) a sequence having at least 80% sequence identity to the non-degenerate nucleotides of SEQ ID NO: 119; or (e) the TnsB, TnsC, and TniQ components comprise sequences having at least 80% sequence identity to SEQ ID NOs: 101-103 or variants thereof. In some embodiments, (a) the class 2, type V Cas effector comprises a sequence having at least 80% sequence identity to SEQ ID NO: 147 or a variant thereof; (b) the left-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 153 or a variant thereof; (c) the right-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 154 or a variant thereof; (d) the engineered guide polynucleotide comprises (i) a sequence having at least 80% sequence identity to at least about 46-60 nucleotides of any one of SEQ ID NOs: 151, 181, or 234; or (ii) a sequence having at least 80% sequence identity to the non-degenerate nucleotides of SEQ ID NO: 152 or 254; or (e) the TnsB, TnsC, and TniQ components comprise sequences having at least 80% sequence identity to SEQ ID NOs: 148-150 or variants thereof.
In some embodiments, (a) the class 2, type V Cas effector comprises a sequence having at least 80% sequence identity to SEQ ID NO: 34 or a variant thereof; (b) the left-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 129 or a variant thereof; (c) the right-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 130 or a variant thereof; (d) the engineered guide polynucleotide comprises (i) a sequence having at least 80% sequence identity to at least about 46-60 nucleotides of any one of SEQ ID NOs: 93, 157, or 210; or (ii) a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 97, 114, or 204; or (e) the TnsB, TnsC, and TniQ components comprise sequences having at least 80% sequence identity to SEQ ID NOs: 148-150 or variants thereof. In some embodiments, (a) the class 2, type V Cas effector comprises a sequence having at least 80% sequence identity to SEQ ID NO: 30 or a variant thereof; (b) the left-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 123 or a variant thereof; (c) the right-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 124 or a variant thereof; (d) the engineered guide polynucleotide comprises (i) a sequence having at least 80% sequence identity to at least about 46 to 80 nucleotides of SEQ ID NO: 92, or (ii) a sequence having at least 80% identity to the non-degenerate nucleotides of SEQ ID NO: 111 or 201; (e) the TnsB, TnsC, and TniQ components comprise polypeptides having sequences having at least 80% identity to SEQ ID NOs: 31, 32, and 33 or variants thereof; or (f) the PAM sequence comprises 5'-nGTn-3' or 5'-nGTt-3'.
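As an illustration of the PAM geometry described above (a 5'-nGTn-3' or 5'-nGTt-3' motif located 5' of the target nucleic acid site), the following Python sketch scans one DNA strand for a degenerate PAM and reports the bases immediately 3' of each hit as a candidate target. It is a simplified sketch under stated assumptions: "n" is read as any nucleotide, only the supplied strand is scanned, and the `spacer_len` parameter and function names are hypothetical; the text above does not fix a target length.

```python
import re

# IUPAC codes needed for the motifs in the text; "N" means any nucleotide.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def pam_to_regex(pam: str):
    """Compile a PAM motif into a regex; the lookahead permits overlapping hits."""
    body = "".join(IUPAC[base] for base in pam.upper())
    return re.compile(f"(?=({body}))")


def find_targets_with_5prime_pam(strand: str, pam: str, spacer_len: int):
    """Return (pam_start, pam, target) tuples where the PAM lies 5' of the target.

    `strand` is read 5'->3'; the candidate target is the `spacer_len` bases
    immediately 3' of the PAM. Hits too close to the end are skipped.
    """
    strand = strand.upper()
    hits = []
    for match in pam_to_regex(pam).finditer(strand):
        target_start = match.start() + len(pam)
        target = strand[target_start:target_start + spacer_len]
        if len(target) == spacer_len:
            hits.append((match.start(), match.group(1), target))
    return hits


# Toy strand (hypothetical sequence).
print(find_targets_with_5prime_pam("CGTTACGTACGT", "nGTt", spacer_len=4))
```

A PAM located 3' of the target site, as recited for the type II embodiments, would instead take the `spacer_len` bases immediately preceding each hit.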
【0013】 In some embodiments, the present disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site, the system comprising: a first double-stranded nucleic acid comprising a cargo nucleotide sequence configured to interact with a Tn7-type transposase complex; a Cas effector complex comprising a class 2, type V Cas effector and an engineered guide polynucleotide configured to hybridize to the target nucleotide sequence; and a Tn7-type transposase complex configured to bind to the Cas effector complex, the Tn7-type transposase complex comprising TnsB and TnsC components but not TnsA and/or TniQ components. In some embodiments, the transposase complex is non-covalently bound to the Cas effector complex. In some embodiments, the transposase complex is covalently bound to the Cas effector complex. In some embodiments, the transposase complex is fused to the Cas effector complex in a single polypeptide. In some embodiments, the Tn7-type transposase complex comprises a polypeptide comprising a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 39-40 or 109-110. In some embodiments, the TnsB component comprises a polypeptide comprising a sequence having at least 80% sequence identity to SEQ ID NO: 40 or 109. In some embodiments, the TnsC component comprises a polypeptide comprising a sequence having at least 80% sequence identity to SEQ ID NO: 39 or 110. In some embodiments, the class 2, type V Cas effector is a Cas12k effector. In some embodiments, the class 2, type V Cas effector comprises a sequence having at least 80% sequence identity to SEQ ID NO: 38 or 108. In some embodiments, the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence and a right-hand transposase recognition sequence.
In some embodiments, the system further comprises a second double-stranded nucleic acid comprising the target nucleic acid site. In some embodiments, the double-stranded nucleic acid comprises the target nucleic acid site, or the system is located inside a cell. In some embodiments, the system further comprises a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some embodiments, the PAM sequence is located 5' of the target nucleic acid site. In some embodiments, the engineered guide polynucleotide is configured to bind to the class 2, type V Cas effector. In some embodiments, the TnsB and TnsC components comprise polypeptides having sequences having at least 80% identity to SEQ ID NOs: 40 and 39, or 109 and 110, respectively. In some embodiments, the engineered guide polynucleotide comprises a sequence comprising at least about 46 to 80 consecutive nucleotides having at least 80% identity to any one of SEQ ID NOs: 118, 182, 183, 235, or 236 or a variant thereof. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 80% identity to the non-degenerate nucleotides of any one of SEQ ID NOs: 115, 116, 205, 206, 261, 235, 260, or 236 or a variant thereof. In some embodiments, the left-hand recombinase sequence comprises a sequence having at least 80% identity to SEQ ID NO: 134. In some embodiments, the right-hand recombinase sequence comprises a sequence having at least 80% identity to SEQ ID NO: 135 or a variant thereof.
In some embodiments, the class 2, type V Cas effector and the Tn7-type transposase complex are encoded by a polynucleotide sequence comprising less than about 10 kilobases. In some embodiments, (a) the class 2, type V Cas effector comprises a sequence having at least 80% sequence identity to SEQ ID NO: 38 or a variant thereof; (b) the left-hand recombinase sequence comprises a sequence having at least 80% sequence identity to SEQ ID NO: 134 or a variant thereof; (c) the right-hand recombinase sequence comprises a sequence having at least 80% identity to SEQ ID NO: 135 or a variant thereof; (d) the engineered guide polynucleotide comprises (i) a sequence having at least 80% sequence identity to at least about 46 to 80 nucleotides of SEQ ID NO: 182 or 235; or (ii) a sequence having at least 80% identity to the non-degenerate nucleotides of SEQ ID NO: 98, 115, 116, 205, or 206; or (e) the TnsB and TnsC components comprise polypeptides having sequences having at least 80% identity to SEQ ID NOs: 40 and 39 or variants thereof.
【0014】 In some embodiments, the present disclosure provides an engineered nuclease system comprising: an endonuclease comprising a RuvC domain and an HNH domain, wherein the endonuclease is derived from an uncultured microorganism and is a class 2, type II endonuclease comprising a sequence having at least 80% identity to SEQ ID NO: 1 or a variant thereof; and an engineered guide polynucleotide configured to form a complex with the endonuclease, the engineered guide polynucleotide comprising a spacer sequence configured to hybridize to a target nucleic acid sequence. In some embodiments, the engineered guide polynucleotide comprises at least 60 to 80 consecutive nucleotides having at least 80% identity to SEQ ID NO: 12 or a variant thereof. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 80% identity to SEQ ID NO: 11 or a variant thereof.
【0015】 In some embodiments, the present disclosure provides an engineered nuclease system comprising: an endonuclease comprising a RuvC domain, wherein the endonuclease is derived from an uncultured microorganism and is a class II, type V endonuclease having at least 80% identity to SEQ ID NO: 5; and an engineered guide polynucleotide configured to form a complex with the endonuclease, wherein the engineered guide polynucleotide comprises a spacer sequence configured to hybridize to a target nucleic acid sequence. In some embodiments, the engineered guide polynucleotide comprises a sequence comprising at least about 46 to 80 consecutive nucleotides having at least 80% identity to any one of SEQ ID NOs: 13-16 or a variant thereof. 【0016】 In some embodiments, the present disclosure provides an engineered nuclease system comprising: an endonuclease comprising a RuvC domain, wherein the endonuclease is derived from an uncultured microorganism and is a class II, type V-K endonuclease having at least 80% identity to any one of SEQ ID NOs: 22, 26, 30, 34, 55-89, 104, or 147 or a variant thereof; and an engineered guide polynucleotide configured to form a complex with the endonuclease, wherein the engineered guide polynucleotide comprises a spacer sequence configured to hybridize to a target nucleic acid sequence. In some embodiments, the engineered guide polynucleotide comprises a sequence comprising at least about 46 to 80 consecutive nucleotides having at least 80% sequence identity to any one of SEQ ID NOs: 90, 91, 92, 93, 117, 151, 156-181, or 209-234 or a variant thereof. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to the non-degenerate nucleotides of any one of SEQ ID NOs: 111-114, 201-206, 255, 262, 256, 209, 257, 263, 258, or 210 or a variant thereof. 
【0017】 In some embodiments, the present disclosure provides an engineered nuclease system comprising: an endonuclease comprising a RuvC domain, wherein the endonuclease is derived from an uncultured microorganism and is a class II, type V-K endonuclease having at least 80% identity to either SEQ ID NO: 38 or SEQ ID NO: 108 or a variant thereof; and an engineered guide polynucleotide configured to form a complex with the endonuclease, wherein the engineered guide polynucleotide comprises a spacer sequence configured to hybridize to a target nucleic acid sequence. In some embodiments, the engineered guide polynucleotide comprises a sequence comprising at least about 46 to 80 consecutive nucleotides having at least 80% identity to any one of SEQ ID NOs: 118, 182, 183, 235, or 236 or a variant thereof. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 80% identity to the non-degenerate nucleotides of any one of SEQ ID NOs: 111-114, 201-206, 255, 262, 256, 209, 257, 263, 258, 210, 115, 116, 205, 206, 261, 235, 260, or 236 or a variant thereof. 【0018】 In some embodiments, the present disclosure provides an engineered nuclease system comprising: a class I, type I-F Cas endonuclease comprising at least one Cas6, Cas7, or Cas8 polypeptide having at least 80% identity to any one of SEQ ID NOs: 41-43 or 48-50 or a variant thereof; and an engineered guide polynucleotide configured to form a complex with the endonuclease, wherein the engineered guide polynucleotide comprises a spacer sequence configured to hybridize to a target nucleic acid sequence. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 80% identity to the non-degenerate nucleotides of any one of SEQ ID NOs: 121, 122, 207, or 208. 
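By way of non-limiting illustration only, the recurring criterion of "at least 80% identity" over a stretch of consecutive nucleotides can be sketched in Python as follows. The sketch assumes an ungapped, position-by-position comparison, which is a simplification; in practice, sequence identity is usually determined with an alignment tool that handles gaps. The function names (`percent_identity`, `best_window_identity`, `meets_identity_threshold`) and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for an ungapped comparison")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_window_identity(query: str, reference: str, window: int) -> float:
    """Highest ungapped identity between any `window`-long stretch of `query`
    and any `window`-long stretch of `reference` (brute-force scan)."""
    if window <= 0 or window > len(query) or window > len(reference):
        raise ValueError("window must be positive and no longer than either sequence")
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(reference) - window + 1):
            best = max(best, percent_identity(q, reference[j:j + window]))
    return best


def meets_identity_threshold(query: str, reference: str, window: int,
                             threshold: float = 80.0) -> bool:
    """True if some `window` consecutive nucleotides of `query` reach the
    identity threshold against `reference` (assumed default: 80%)."""
    return best_window_identity(query, reference, window) >= threshold


if __name__ == "__main__":
    # Toy sequences, purely illustrative.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(meets_identity_threshold("TTACGTACGT", "ACGTACGTGG", window=8))
```

The brute-force scan is quadratic in sequence length, which is acceptable for guide-sized sequences of tens to hundreds of nucleotides; a gapped aligner would be the more appropriate choice for longer or more divergent sequences.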
【0019】 Further aspects and advantages of the present disclosure will become readily apparent to those skilled in the art from the following detailed description, in which only illustrative embodiments of the present disclosure are shown and described. As will be appreciated, the present disclosure is capable of other and different embodiments, and its several details are capable of modification in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative and not as limiting. 【0020】 Incorporation by Reference All publications, patents, and patent applications referenced herein are incorporated herein by reference to the same extent as if each individual publication, patent, or patent application were specifically and individually indicated to be incorporated by reference. [Brief Description of the Drawings] 【0021】 The novel features of the present invention are set forth with particularity in the appended claims. The features and advantages of the present invention will be better understood by reference to the following detailed description, which sets forth illustrative embodiments in which the principles of the invention are utilized, and to the accompanying drawings (also referred to herein as “Figure” and “FIG.”). [Figure 1] This figure depicts the typical organization of CRISPR / Cas loci of various classes and types. [Figure 2] This figure depicts the structure of a natural class II, type II crRNA / tracrRNA pair and compares it with a hybrid sgRNA formed by joining the crRNA and the tracrRNA. [Figure 3] This figure shows two pathways observed in Tn7 and Tn7-like elements. [Figure 4] This figure depicts the genomic context of type II Tn7-depleted CASTs of the MG36 family. Figure 4A shows that the MG36-5 CAST system consists of a CRISPR array (CRISPR repeats), a type II nuclease with RuvC and HNH endonuclease domains, and four predicted transposase protein open reading frames. 
The catalytic transposase TnsB is encoded as two subunits. Figure 4B shows that two transposon ends are predicted for the MG36-1 CAST system (TIR-1 and TIR-2). Figure 4C shows the alignment of the predicted left-end (LE) and right-end (RE) sequences of the type II Tn7-depleted CAST transposons, with annotated repeats indicated by arrows. The left and right ends are labeled according to their orientation. [Figure 5]It depicts the genomic context of the type V Tn7 CAST of the family MG39. A in Figure 5 shows that the MG39-1 CAST system consists of a type V nuclease, four predicted transposon proteins (TnsABC and TniQ), and a CRISPR array. The transposon ends were predicted for the MG39-1 CAST system (TIR-1). B in Figure 5 shows the alignment of the predicted left end (LE) and right end (RE) sequences of the type V Tn7 CAST transposon, with the annotated inverted repeats represented by arrows. [Figure 6] It depicts the predicted structure of the corresponding sgRNA of the CAST system described herein (e.g., predicted in Example 3). [Figure 7] It depicts the predicted structure of the corresponding sgRNA of the CAST system described herein (e.g., predicted in Example 3). [Figure 8] It depicts the genomic context of MG108-1, a system described herein. This candidate is a Cas12K CAST that is naturally lacking in TniQ. Genes in the genomic fragment are represented by arrows. [Figure 9] It depicts a phylogenetic gene tree of the Cas12k effector sequence. This tree was inferred from the multiple sequence alignment of 64 Cas12k sequences (orange and black branches) newly recovered this time and 229 reference Cas12k sequences (gray branches) from public databases. The orange branches indicate Cas12k effectors whose association with the CAST transposon components was confirmed. [Figure 10]It depicts the MG110 cascade CAST. A) Genomic context of the MG110-1 cascade CAST. 
The complete Tn7 combination (TnsA, TnsB, TnsC / TniB, TniQ) and the defective cascade combination (Cas6, Cas7, fused Cas5-Cas8) are represented by orange arrows. The TIR adjacent to the CAST transposon is represented by connected arrows. B) The repetitive secondary structure shows the stem-loop structure of the crRNA. C) The sequence alignment of CRISPR repeats from A. wodanis, Vibrio cholerae, and the MG110 family CAST shows conserved motifs indicative of the crRNA stem-loop secondary structure. [Figure 11A] It depicts the MG64-3 CRISPR locus. The tracrRNA is encoded upstream of the CRISPR array, and the transposon ends are encoded downstream (inner black frame). Sequences corresponding to partial 3’ CRISPR repeats and partial spacers are encoded within the transposon (outer frame). The self-matching spacer is encoded outside the transposon ends. [Figure 11B] It depicts the tracrRNA sequence alignment for various CASTs provided herein. The alignment of the tracrRNA sequences shows regions of conservation. In particular, the sequence "TGCTTTC" (upper frame) at sequence positions 92 - 98 is suggested to be important for the tertiary structure of the sgRNA and for the discontinuous repeat-anti-repeat pairing with the crRNA. Furthermore, the hairpin "CYCC(n6)GGRG" (lower frame) at positions 265 - 278 is suggested to be functionally important and capable of positioning downstream sequences for crRNA pairing. [Figure 11C] For example, it shows the presence of other important repeat-anti-repeat (RAR) motifs in the families of MG64-2, MG64-4, MG64-5, MG64-6, MG64-7, and MG108-1. [Figure 12A] It depicts the predicted structure of the MG64-2 sgRNA. [Figure 12B] This shows the predicted structure of MG64-4 sgRNA. [Figure 12C] This shows the predicted structure of MG64-6 sgRNA. [Figure 12D] This shows the predicted structure of MG64-7 sgRNA. [Figure 12E] This shows the predicted structure of MG108-1 sgRNA. 
[Figure 13-1] Figure 13 depicts PCR, PAM, and Sanger sequencing data demonstrating the in vitro activity of MG64-6. Effector proteins and their TnsB, TnsC, and TniQ proteins were expressed in an in vitro transcription / translation system using the protocol described for in vitro targeted integrase activity. Post-translation, target DNA, cargo DNA, and sgRNA were added to the reaction buffer. Integration was assayed by PCR across the target / donor junction. Figure 13A depicts gel images of PCRs showing transpositions with apo (without sgRNA) and 64-6 with sgRNA 64-6. PCR3 detects the RE junction distal to PAM. PCR4 detects the LE junction distal to PAM. PCR5 detects the RE junction proximal to PAM. PCR6 detects the LE junction proximal to PAM. PCRs are paired with different possible orientations (PCR3 and 6 vs. PCR4 and 5). An orientation with LE-PAM proximal and RE-PAM distal is preferred. Figure 13B depicts PAMs from an in vitro transposition assay for sequencing PCR5 and 6. [Figure 13-2]Figure 13 shows PCR, PAM, and Sanger sequencing data demonstrating the in vitro activity of MG64-6. The effector protein and its TnsB, TnsC, and TniQ proteins were expressed in an in vitro transcription / translation system using the protocol described for in vitro targeted integrase activity. Post-translation, target DNA, cargo DNA, and sgRNA were added to the reaction buffer. Integration was assayed by PCR across the target / donor junction. Figure 13C shows Sanger data indicating the junction of the transposition where excision occurs in the donor DNA. The first panel shows PCR 3 and 5 (RE). The second panel shows PCR 4 and 6 (LE). Since the Sanger sequencing reaction is for the donor-target product, the point at which the sequencing no longer matches the donor DNA is when the junction occurs (dark bar below the sequencing peak). [Figure 14] This image depicts next-generation sequencing (NGS) results of in vitro transposition products that reveal insertion site preferences. 
NGS reads were analyzed in CRISPResso2 against a reference sequence containing a transposition at position 60. The resulting indels correspond to transpositions that occur earlier or later than in this arbitrary reference sequence. [Figure 15] This image shows the results of an electrophoretic mobility shift assay (EMSA) of 64-2 TnsB and its RE DNA sequence. The EMSA results confirm binding and TnsB recognition. TnsB protein was expressed in an in vitro transcription / translation system, incubated with FAM-labeled DNA containing the RE sequence, and separated on a native 5% TBE gel. Binding is observed as an upward shift of the labeled band. Because multiple TnsB binding sites are present, multiple shifts appear in the EMSA. Lane 1: FAM-labeled DNA only. Lane 2: FAM DNA and in vitro transcription / translation system (no TnsB protein). Lane 3: FAM DNA and TnsB. The upward shift of the labeled band in Lane 3 indicates binding of the RE sequence by TnsB, suggesting that it contains an active RE transposition sequence. 【0022】 Brief Description of the Sequence Listing The sequence listing submitted with this specification provides exemplary polynucleotide and polypeptide sequences for use in the methods, compositions, and systems of this disclosure. The following is an exemplary description of some of these sequences. 【0023】 MG36 【0024】 Sequence ID 1 shows the full-length peptide sequence of the MG36 Cas effector. 【0025】 Sequence IDs 2-5 show peptide sequences of MG36 transposition proteins that may contain a recombinase or transposase complex associated with the MG36 Cas effector. The addition of -B1, -B2, -T1, and -C to the label's terminus indicates similarity to the Tn7-like TnsB1, TnsB2, TnsT1, and TniC proteins, respectively. 【0026】 Sequence ID 11 shows the nucleotide sequence of an sgRNA engineered to function with the MG36 Cas effector. 【0027】 Sequence ID 12 shows the nucleotide sequence of MG36 tracrRNA, which originates from the same gene locus as the MG36 Cas effector. 
【0028】 Sequence IDs 17-18 show the nucleotide sequences of the left-hand transposase recognition sequences associated with the MG36 system. 【0029】 Sequence ID 19 shows the nucleotide sequence of the right-hand transposase recognition sequence associated with the MG36 system. 【0030】 MG39 【0031】 Sequence ID 6 shows the full-length peptide sequence of the MG39-1 Cas effector. 【0032】 Sequence IDs 7-10 show peptide sequences of MG39-1 transposition proteins that may contain a recombinase or transposase complex associated with the MG39-1 Cas effector. 【0033】 Sequence IDs 13-16 show the nucleotide sequences of MG39 tracrRNA, which originate from the same gene locus as the MG39 Cas effector. 【0034】 Sequence ID 20 shows the nucleotide sequence of the left-hand transposase recognition sequence associated with the MG39 system. 【0035】 Sequence ID 21 shows the nucleotide sequence of the right-hand transposase recognition sequence associated with the MG39 system. 【0036】 MG64 【0037】 Sequence IDs 22, 26, 30, 34, 55-89, 104, and 147 show the full-length peptide sequences of the MG64 Cas effector. 【0038】 Sequence IDs 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, and 148-150 show peptide sequences of MG64 transposition proteins that may contain MG64 Cas effector-related recombinase or transposase complexes. The addition of -A, -B, -C, and -Q to the label's ends indicates similarity to Tn7-like TnsA, TnsB, TnsC, and TniQ proteins, respectively. 【0039】 Sequence IDs 90-93, 117, 151, 156-181, and 209-234 show the nucleotide sequences of MG64 tracrRNA derived from the same locus as the MG64 effector. 【0040】 Sequence IDs 94-97, 119, 152, and 184-200 show the nucleotide sequences of the MG64-targeted CRISPR repeat. 【0041】 Sequence IDs 237-259 show the nucleotide sequences of MG64 crRNAs. 【0042】 Sequence IDs 111-114 and 201-204 show the nucleotide sequences of single guide RNAs engineered to function with the MG64 Cas effector. 
【0043】 Sequence IDs 123, 125, 127, 129, 131, 133, and 153 show the nucleotide sequences of the left-hand transposase recognition sequences associated with the MG64 system. 【0044】 Sequence IDs 124, 126, 128, 130, 132, 154, and 155 show the nucleotide sequences of the right-hand transposase recognition sequences associated with the MG64 system. 【0045】 MG108 【0046】 Sequence IDs 38 and 108 show the full-length peptide sequences of the MG108 Cas effector. 【0047】 Sequence IDs 39-40 and 109-110 show peptide sequences of MG108 transposition proteins that may contain a recombinase or transposase complex associated with the MG108 Cas effector. The addition of -A, -B, -C, and -Q to the label's terminus indicates similarity to the Tn7-like TnsA, TnsB, TnsC, and TniQ proteins, respectively. 【0048】 Sequence IDs 98 and 120 show the nucleotide sequences of the MG108-targeted CRISPR repeat. 【0049】 Sequence IDs 260-261 show the nucleotide sequences of MG108 crRNAs. 【0050】 Sequence IDs 115-116 and 205-206 show the nucleotide sequences of single guide RNAs engineered to function with the MG108 Cas effector. 【0051】 Sequence IDs 118, 182-183, and 235-236 show the nucleotide sequences of the MG108 tracrRNA, which originates from the same locus as the MG108 effector. 【0052】 Sequence ID 134 shows the nucleotide sequence of the left-hand transposase recognition sequence associated with the MG108 system. 【0053】 Sequence ID 135 shows the nucleotide sequence of the right-hand transposase recognition sequence associated with the MG108 system. 【0054】 MG110 【0055】 Sequence IDs 41-43 and 48-50 show the full-length peptide sequences of the MG110 Cas effector. The addition of -6, -7, and -8 to the label ends indicates similarity to the class I, type IF system cas6, cas7, and cas8 proteins, respectively. 【0056】 Sequence IDs 44-47 and 51-54 show peptide sequences of MG110 transposition proteins that may contain a recombinase or transposase complex associated with the MG110 Cas effector. 
The addition of -A, -B, -C, and -Q to the label's terminus indicates similarity to the Tn7-like TnsA, TnsB, TnsC, and TniQ proteins, respectively. 【0057】 Sequence IDs 99-100 show the nucleotide sequences of the MG110-targeted CRISPR repeat. 【0058】 Sequence IDs 121-122 and 207-208 show the nucleotide sequences of MG110 crRNAs. 【0059】 Sequence IDs 136 and 138 show the nucleotide sequences of the left-hand transposase recognition sequences associated with the MG110 system. 【0060】 Sequence IDs 137 and 139 show the nucleotide sequences of the right-hand transposase recognition sequences associated with the MG110 system. 【0061】 Other sequences Sequence IDs 140-141 show the peptide sequences of nuclear localization signals. 【0062】 Sequence IDs 142-143 show the peptide sequences of linkers. 【0063】 Sequence IDs 144-146 show the peptide sequences of epitope tags. [Modes for carrying out the invention] 【0064】 While embodiments of the present invention are shown and described herein, it will be apparent to those skilled in the art that such embodiments are provided only as examples. Numerous modifications, variations, and substitutions can be made without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be utilized. 【0065】 Unless otherwise specified, the practice of some of the methods disclosed herein employs techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics, and recombinant DNA. See, for example, Sambrook and Green, Molecular Cloning: A Laboratory Manual, 4th Edition (2012); the series Current Protocols in Molecular Biology (FM Ausubel, et al. eds.); the series Methods In Enzymology (Academic Press, Inc.), PCR 2: A Practical Approach (MJ MacPherson, BD Hames and GR Taylor eds. (1995)), Harlow and Lane, eds. 
(1988) Antibodies, A Laboratory Manual, and Culture of Animal Cells: A Manual of Basic Technique and Specialized Applications, 6th Edition (RI Freshney, ed. (2010)) (which are fully incorporated herein by reference). 【0066】 As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, to the extent that the terms “including,” “includes,” “having,” “has,” “with,” or variants thereof are used in either the detailed description and / or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising.” 【0067】 The terms “about” or “approximately” mean that a particular value is within an acceptable margin of error as determined by those skilled in the art, which depends in part on how that value is measured or determined, i.e., on the limitations of the measuring system. For example, “about” can mean within 1 or more than 1 standard deviation, per the practice in the art in question. Alternatively, “about” can mean a range of up to 20%, up to 15%, up to 10%, up to 5%, or up to 1% of a given value. 【0068】 As used herein, “cell” generally refers to a biological cell. A cell can be the basic structural, functional, and / or biological unit of a living organism. A cell may originate from any organism that has one or more cells. Some non-limiting examples include prokaryotic cells, eukaryotic cells, bacterial cells, archaeal cells, cells of unicellular eukaryotic organisms, protozoan cells, cells from plants (e.g., plant crops, fruits, vegetables, grains, soybeans, corn, maize, wheat, seeds, tomatoes, rice, cassava, sugarcane, pumpkins, hay, potatoes, cotton, cannabis, tobacco, flowering plants, conifers, gymnosperms, ferns, clubmosses, hornworts, liverworts, mosses), algal cells (e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens C. Agardh), seaweeds (e.g., kelp), fungal cells (e.g., yeast cells, mushroom cells), animal cells, cells from invertebrates (e.g., fruit flies, cnidarians, echinoderms, nematodes, etc.), cells from vertebrates (e.g., fish, amphibians, reptiles, birds, mammals, etc.), and cells from mammals (e.g., pigs, cows, goats, sheep, rodents, rats, mice, non-human primates, humans, etc.). In some cases, a cell does not originate from a natural organism (e.g., a cell may be synthetically produced, sometimes called an artificial cell). 
【0069】 The term “nucleotide,” as used herein, generally refers to a base-sugar-phosphate combination. Nucleotides may include synthetic nucleotides. Nucleotides may also include synthetic nucleotide analogs. Nucleotides can be monomeric units of nucleic acid sequences (e.g., deoxyribonucleic acid (DNA) and ribonucleic acid (RNA)). The term nucleotide may include ribonucleoside triphosphates such as adenosine triphosphate (ATP), uridine triphosphate (UTP), cytidine triphosphate (CTP), and guanosine triphosphate (GTP), and deoxyribonucleoside triphosphates such as dATP, dCTP, dITP, dUTP, dGTP, dTTP, or derivatives thereof. Such derivatives may include, for example, [αS]dATP, 7-deaza-dGTP, and 7-deaza-dATP, as well as nucleotide derivatives that confer nuclease resistance on nucleic acid molecules containing them. The term nucleotide, as used herein, may also refer to dideoxyribonucleoside triphosphates (ddNTPs) and their derivatives. Illustrative examples of dideoxyribonucleoside triphosphates include, but are not limited to, ddATP, ddCTP, ddGTP, ddITP, and ddTTP. Nucleotides may be unlabeled or detectably labeled, for example, with a moiety comprising an optically detectable moiety (e.g., a fluorophore). Labeling may also be performed using quantum dots. Examples of detectable labels include radioisotopes, fluorescent labels, chemiluminescent labels, bioluminescent labels, and enzymatic labels. 
Fluorescent labels of nucleotides may include, but are not limited to, fluorescein, 5-carboxyfluorescein (FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), rhodamine, 6-carboxyrhodamine (R6G), N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), 6-carboxy-X-rhodamine (ROX), 4-(4'-dimethylaminophenylazo)benzoic acid (DABCYL), Cascade Blue, Oregon Green, Texas Red, cyanine, and 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS). Specific examples of fluorescently labeled nucleotides include [R6G]dUTP, [TAMRA]dUTP, [R110]dCTP, [R6G]dCTP, [TAMRA]dCTP, [JOE]ddATP, [R6G]ddATP, [FAM]ddCTP, [R110]ddCTP, [TAMRA]ddGTP, [ROX]ddTTP, [dR6G]ddATP, [dR110]ddCTP, [dTAMRA]ddGTP, and [dROX]ddTTP, available from Perkin Elmer (Foster City, Calif.); FluoroLink DeoxyNucleotides, FluoroLink Cy3-dCTP, FluoroLink Cy5-dCTP, FluoroLink Fluor X-dCTP, FluoroLink Cy3-dUTP, and FluoroLink Cy5-dUTP; fluorescein-15-dATP, fluorescein-12-dUTP, tetramethyl-rhodamine-6-dUTP, IR770-9-dATP, fluorescein-12-ddUTP, fluorescein-12-UTP, and fluorescein-15-2'-dATP, available from Boehringer Mannheim (Indianapolis, Ind.); and chromophore-labeled nucleotides available from Molecular Probes (Eugene, Oreg.), namely BODIPY-FL-14-UTP, BODIPY-FL-4-UTP, BODIPY-TMR-14-UTP, BODIPY-TMR-14-dUTP, BODIPY-TR-14-UTP, BODIPY-TR-14-dUTP, Cascade Blue-7-UTP, Cascade Blue-7-dUTP, fluorescein-12-UTP, fluorescein-12-dUTP, Oregon Green 488-5-dUTP, Rhodamine Green-5-UTP, Rhodamine Green-5-dUTP, tetramethylrhodamine-6-UTP, tetramethylrhodamine-6-dUTP, Texas Red-5-UTP, Texas Red-5-dUTP, and Texas Red-12-dUTP. Nucleotides may also be labeled or marked by chemical modification. A chemically modified single nucleotide can be a biotin-dNTP. Some non-limiting examples of biotinylated dNTPs include biotin-dATP (e.g., bio-N6-ddATP, biotin-14-dATP), biotin-dCTP (e.g., biotin-11-dCTP, biotin-14-dCTP), and biotin-dUTP (e.g., biotin-11-dUTP, biotin-16-dUTP, biotin-20-dUTP). 
【0070】 The terms “polynucleotide,” “oligonucleotide,” and “nucleic acid” are generally used interchangeably to refer to a polymeric form of nucleotides of any length, whether deoxyribonucleotides or ribonucleotides or analogs thereof, in single-stranded, double-stranded, or multi-stranded form. Polynucleotides may be exogenous or endogenous to cells. Polynucleotides may exist in cell-free environments. Polynucleotides may be genes or fragments thereof. Polynucleotides may be DNA. Polynucleotides may be RNA. Polynucleotides may have any three-dimensional structure and may perform any function. Polynucleotides may contain one or more analogs (e.g., an altered backbone, sugar, or nucleobase). Modifications to the nucleotide structure, if present, may be imparted before or after assembly of the polymer. Some non-limiting examples of analogs include 5-bromouracil, peptide nucleic acids, xeno nucleic acids, morpholinos, locked nucleic acids, glycol nucleic acids, threose nucleic acids, dideoxynucleotides, cordycepin, 7-deaza-GTP, fluorophores (e.g., rhodamine or fluorescein linked to the sugar), thiol-containing nucleotides, biotin-linked nucleotides, fluorescent base analogs, CpG islands, methyl-7-guanosine, methylated nucleotides, inosine, thiouridine, pseudouridine, dihydrouridine, queuosine, and wyosine. Non-limiting examples of polynucleotides include coding or non-coding regions of a gene or gene fragment, loci (a locus) defined by linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), small interfering RNA (siRNA), small hairpin RNA (shRNA), microRNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, cell-free polynucleotides including cell-free DNA (cfDNA) and cell-free RNA (cfRNA), nucleic acid probes, and primers. The sequence of nucleotides may be interrupted by non-nucleotide components. 
【0071】 The terms “transfection” or “transfected” generally refer to the introduction of nucleic acids into cells by non-viral or viral-based methods. The nucleic acid molecules may be gene sequences encoding complete proteins or functional portions thereof. See, for example, Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 18.1–18.88. 【0072】 The terms “peptide,” “polypeptide,” and “protein” are used interchangeably herein to generally refer to polymers of at least two amino acid residues linked by peptide bonds. These terms do not imply a specific length of polymer, nor are they intended to suggest or distinguish whether a peptide is produced using recombinant techniques, produced by chemical or enzymatic synthesis, or naturally occurring. The terms apply to naturally occurring amino acid polymers as well as to amino acid polymers containing at least one modified amino acid. In some cases, the polymer may be interrupted by non-amino acids. The terms include amino acid chains of any length, including full-length proteins, as well as proteins with or without secondary and / or tertiary structures (e.g., domains). The terms further encompass amino acid polymers that have been modified, for example, by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, oxidation, or any other manipulation such as conjugation with a labeling component. The terms “amino acid” and “amino acids,” as used herein, generally refer to natural and non-natural amino acids, including, but not limited to, modified amino acids and amino acid analogs. Modified amino acids can include natural and non-natural amino acids that have been chemically modified to include a group or chemical moiety that does not naturally occur on the amino acid. Amino acid analogs may refer to amino acid derivatives. The term “amino acid” includes both D-amino acids and L-amino acids. 
【0073】 As used herein, “unnatural” can generally refer to a sequence of nucleic acid or polypeptide not found in natural nucleic acids or proteins. “Unnatural” may refer to an affinity tag. “Unnatural” may refer to a fusion. “Unnatural” may refer to a sequence of naturally occurring nucleic acid or polypeptide including mutations, insertions, and / or deletions. A nonnatural sequence may exhibit and / or encode activity (e.g., enzymatic activity, methyltransferase activity, acetyltransferase activity, kinase activity, ubiquitination activity, etc.) that may also be exhibited by the nucleic acid and / or polypeptide sequence to which the nonnatural sequence is fused. A nonnatural nucleic acid or polypeptide sequence may be ligated to a naturally occurring nucleic acid or polypeptide sequence (or a variant thereof) by genetic engineering to produce a chimeric nucleic acid and / or polypeptide sequence encoding a chimeric nucleic acid and / or polypeptide. 【0074】 The term “promoter” generally refers to a regulatory DNA region that controls the transcription or expression of a gene and may be located adjacent to or overlapping with a nucleotide or region of nucleotides from which RNA transcription is initiated, as used herein. Promoters may often contain specific DNA sequences that bind to protein factors called transcription factors, facilitating the binding of RNA polymerase to the DNA that leads to gene transcription. A “basic promoter,” also called a “core promoter,” may generally refer to a promoter that contains all the fundamental elements necessary to promote the transcription and expression of a functionally linked polynucleotide. Eukaryotic basic promoters typically, though not always, contain a TATA-box and / or CAAT-box. 
【0075】 The term “expression,” as used herein, generally refers to the process by which a nucleic acid sequence or polynucleotide is transcribed from a DNA template (into mRNA or other RNA transcripts, etc.), and / or the process by which the transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. The transcript and the encoded polypeptide are sometimes collectively referred to as “gene products.” When the polynucleotide originates from genomic DNA, expression may involve the splicing of mRNA in eukaryotic cells. 【0076】 As used herein, “operably linked,” “operable linkage,” “operatively linked,” or their grammatical equivalents generally refer to the juxtaposition of genetic elements, such as promoters, enhancers, and polyadenylation sequences, that are related in a way that enables them to function in the intended manner. For example, a regulatory element, which may include a promoter and / or enhancer sequence, is operably linked to a coding region if the regulatory element helps initiate transcription of the coding sequence. Intervening residues may exist between the regulatory element and the coding region, as long as this functional relationship is maintained. 【0077】 As used herein, “vector” generally refers to a macromolecule or association of macromolecules that contains or associates with a polynucleotide and that may be used to mediate the delivery of the polynucleotide to a cell. Examples of vectors include plasmids, viral vectors, liposomes, and other gene delivery vehicles. Vectors generally include genetic elements, such as regulatory elements, that are operably linked to a gene to facilitate expression of the gene in a target. 【0078】 As used herein, “expression cassette” and “nucleic acid cassette” are generally used interchangeably to refer to a combination of nucleic acid sequences or elements that are expressed together or operably linked for expression. 
In some cases, an expression cassette refers to the combination of regulatory elements and the gene or genes to which they are operably linked for expression. 【0079】 A “functional fragment” of a DNA or protein sequence generally refers to a fragment that retains a biological activity (either functional or structural) substantially similar to that of the full-length DNA or protein sequence. The biological activity of a DNA sequence may be its ability to influence expression in a manner known to be attributable to the full-length sequence. 【0080】 As used herein, the term “engineered” generally indicates that an object has been modified by human intervention. In non-limiting examples, a nucleic acid may be modified by altering its sequence to one that does not exist in nature. A nucleic acid may also be modified by ligating it to a nucleic acid with which it does not associate in nature, so that the ligated product has a function not present in the original nucleic acid. An engineered nucleic acid may be synthesized in vitro with a sequence that does not exist in nature. A protein may be modified by altering its amino acid sequence to one that does not exist in nature. Engineered proteins may acquire new functions or properties. An “engineered” system contains at least one engineered component. 【0081】 As used herein, “synthetic” and “artificial” are used interchangeably to refer to proteins or domains that have low sequence identity (e.g., less than 50% sequence identity, less than 25% sequence identity, less than 10% sequence identity, less than 5% sequence identity, less than 1% sequence identity) relative to naturally occurring human proteins. For example, the VPR and VP64 domains are synthetic transactivation domains. 
【0082】 As used herein, the terms “tracrRNA” or “tracr sequence” can generally refer to a nucleic acid having at least about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% sequence identity and/or sequence similarity to a wild-type exemplary tracrRNA sequence (e.g., a tracrRNA from S. pyogenes, S. aureus, etc., or SEQ ID NO: *_*). A tracrRNA can refer to a nucleic acid having up to about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% sequence identity and/or sequence similarity to a wild-type exemplary tracrRNA sequence (e.g., a tracrRNA from S. pyogenes, S. aureus, etc.). A tracrRNA may also refer to a modified form of a tracrRNA that includes nucleotide changes such as deletions, insertions, substitutions, mutations, or chimeras. A tracrRNA may refer to a nucleic acid that is at least about 60% identical to a wild-type exemplary tracrRNA sequence (e.g., a tracrRNA from S. pyogenes, S. aureus, etc.) over a stretch of at least six consecutive nucleotides. For example, a tracrRNA sequence may be at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identical to a wild-type exemplary tracrRNA sequence (e.g., a tracrRNA from S. pyogenes, S. aureus, etc.) over a stretch of at least six consecutive nucleotides. Type II tracrRNA sequences can be predicted on a genomic sequence by identifying regions that are complementary to part of the repeat sequence of an adjacent CRISPR array. 【0083】 As used herein, “guide nucleic acid” can generally refer to a nucleic acid that can hybridize to another nucleic acid. The guide nucleic acid may be RNA. The guide nucleic acid may be DNA. The guide nucleic acid may be programmed to bind site-specifically to a nucleic acid sequence. 
The nucleic acid to be targeted, i.e., the target nucleic acid, may contain nucleotides. The guide nucleic acid may contain nucleotides. Part of the target nucleic acid may be complementary to part of the guide nucleic acid. The strand of a double-stranded target polynucleotide that is complementary to and hybridizes with the guide nucleic acid may be called the complementary strand. The strand of the double-stranded target polynucleotide that is complementary to the complementary strand, and therefore may not be complementary to the guide nucleic acid, may be called the non-complementary strand. The guide nucleic acid may contain one polynucleotide chain and may be called a “single guide nucleic acid.” The guide nucleic acid may contain two polynucleotide chains and may be called a “double guide nucleic acid.” Unless otherwise specified, the term “guide nucleic acid” may be inclusive, referring to both single guide nucleic acids and double guide nucleic acids. The guide nucleic acid may contain a segment that can be called a “nucleic acid targeting segment” or “nucleic acid targeting sequence.” The nucleic acid targeting segment may include a subsegment that can be called a “protein-binding segment,” “protein-binding sequence,” or “Cas protein-binding segment.” 【0084】 In the context of two or more nucleic acid or polypeptide sequences, the terms “sequence identity” or “percent identity” generally refer to two (e.g., in a pairwise alignment) or more (e.g., in a multiple sequence alignment) sequences that are identical, or that have a specified percentage of amino acid residues or nucleotides that are identical, when compared and aligned for maximum correspondence over a local or global comparison window, as measured using a sequence comparison algorithm. 
Suitable sequence comparison algorithms for polypeptide sequences include, for example, BLASTP using a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix with gap costs set at 11 for existence and 1 for extension, and using conditional compositional score matrix adjustment, for polypeptide sequences longer than 30 residues; BLASTP using a word length (W) of 2, an expectation (E) of 1,000,000, and the PAM30 scoring matrix with gap costs set at 9 to open a gap and 1 to extend a gap, for sequences shorter than 30 residues (these are the default parameters for BLASTP in the BLAST suite available at https://blast.ncbi.nlm.nih.gov); CLUSTALW; the Smith-Waterman homology search algorithm with the parameters match 2, mismatch -1, and gap -1; MUSCLE using the default parameters; MAFFT using the parameters retree 2 and maxiterations 1000; Novafold using the default parameters; and HMMER hmmalign using the default parameters. 【0085】 This disclosure includes variants of any of the enzymes described herein that have one or more conservative amino acid substitutions. Such conservative substitutions can be made in the amino acid sequence of a polypeptide without disrupting the three-dimensional structure or function of the polypeptide. Conservative substitutions can be achieved by substituting amino acids that have similar hydrophobicity, polarity, and side-chain (R-group) length. Additionally or alternatively, conservative substitutions can be identified by comparing aligned sequences of homologous proteins from different species and pinpointing amino acid residues that have mutated between species (e.g., non-conserved residues) without altering the basic function of the encoded protein. 
Such conservatively substituted variants may include variants having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with any one of the systems described herein (e.g., the MG36 or MG39 system described herein). In some embodiments, such conservatively substituted variants are functional variants. Such functional variants may include sequences with substitutions such that the activity of key active-site residues of the endonuclease is not disrupted. In some embodiments, a functional variant of any of the systems described herein has no substitution in at least one of the conserved or functional residues referred to in Figures 4 and 5. In some embodiments, a functional variant of any of the systems described herein has no substitution in any of the conserved or functional residues referred to in Figures 4 and 5. 【0086】 Tables of conservative substitutions that provide functionally similar amino acids are available from various sources (see, for example, Creighton, Proteins: Structures and Molecular Properties (WH Freeman & Co.; 2nd Edition (December 1993))). The following eight groups each contain amino acids that are conservative substitutions for one another: 1) alanine (A), glycine (G); 2) aspartic acid (D), glutamic acid (E); 3) asparagine (N), glutamine (Q); 4) arginine (R), lysine (K); 5) isoleucine (I), leucine (L), methionine (M), valine (V); 6) phenylalanine (F), tyrosine (Y), tryptophan (W); 7) serine (S), threonine (T); and 8) cysteine (C), methionine (M). 
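As an illustration only, and not as part of the disclosed methods, the percent identity of paragraph 0084 and the eight conservative-substitution groups of paragraph 0086 can be sketched in Python. The sketch assumes that the two sequences have already been aligned by one of the algorithms listed above and that gaps are written as "-". The function names, the example sequences, and the choice of the full alignment length as the denominator are assumptions of this example, not definitions taken from the disclosure.

```python
from typing import List, Tuple

# The eight conservative-substitution groups listed in paragraph 0086.
CONSERVATIVE_GROUPS = [
    set("AG"), set("DE"), set("NQ"), set("RK"),
    set("ILMV"), set("FYW"), set("ST"), set("CM"),
]


def is_conservative(a: str, b: str) -> bool:
    """Return True if a and b are different residues that share a group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Assumption of this sketch: the denominator is the full alignment
    length, so gap columns count against identity.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def classify_mismatches(
    aligned_a: str, aligned_b: str
) -> List[Tuple[int, str, str, str]]:
    """List each ungapped mismatch as (column, residue_a, residue_b, label)."""
    results = []
    for i, (x, y) in enumerate(zip(aligned_a, aligned_b)):
        if x == y or "-" in (x, y):
            continue  # skip identical columns and gap columns
        label = "conservative" if is_conservative(x, y) else "non-conservative"
        results.append((i, x, y, label))
    return results


if __name__ == "__main__":
    # Hypothetical pre-aligned peptide fragments, for illustration only.
    reference = "MKTAYIAKD"
    variant = "MKSAYLARW"
    print(percent_identity(reference, variant))
    for row in classify_mismatches(reference, variant):
        print(row)
```

A variant could then be compared against one of the identity thresholds recited above (e.g., at least about 90%) by testing the returned percentage, with the caveat that the result depends on the alignment algorithm and parameters used.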
【0087】 As used herein, the term “RuvC_III domain” generally refers to the third discontinuous segment of the RuvC endonuclease domain (the RuvC nuclease domain consists of three discontinuous segments: RuvC_I, RuvC_II, and RuvC_III). The RuvC domain or its segments can generally be identified by alignment to a known domain sequence, structural alignment to a protein with an annotated domain, or comparison with a hidden Markov model (HMM) constructed based on a known domain sequence (e.g., Pfam HMM PF18541 for RuvC_III). 【0088】 As used herein, the term “HNH domain” generally refers to an endonuclease domain having characteristic histidine and asparagine residues. HNH domains can generally be identified by alignment to a known domain sequence, structural alignment to a protein with an annotated domain, or comparison to a hidden Markov model (HMM) constructed based on a known domain sequence (e.g., Pfam HMM PF01844 for the HNH domain). 【0089】 As used herein, the term “recombinase” generally refers to a site-specific enzyme that mediates DNA recombination between recombinase-recognition sequences, resulting in the excision, integration, inversion, or exchange (e.g., translocation) of DNA fragments between the recombinase-recognition sequences. 【0090】 As used herein, the terms “recombine” or “recombination,” in the context of nucleic acid modification (e.g., genome modification), generally refer to the process by which two or more nucleic acid molecules, or two or more regions of a single nucleic acid molecule, are modified by the action of a recombinase protein. Recombination can result in, among other outcomes, the insertion, inversion, excision, or translocation of nucleic acid sequences within or between one or more nucleic acid molecules. 
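The recombination outcomes named in paragraphs 0089 and 0090 (excision, inversion, and integration) can be pictured with a toy string model in Python. This is a minimal sketch under stated assumptions, not a model of any enzyme in this disclosure: the site "TTGCAA" is a hypothetical placeholder rather than a real recombinase-recognition sequence, it was chosen to be palindromic so that site orientation can be ignored, and all function names are invented for illustration.

```python
# Hypothetical placeholder site (palindromic); not a real recognition sequence.
SITE = "TTGCAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_two_sites(seq: str, site: str = SITE) -> tuple:
    """Return the start indices of the first two non-overlapping site copies."""
    first = seq.find(site)
    if first == -1:
        raise ValueError("no recognition site found")
    second = seq.find(site, first + len(site))
    if second == -1:
        raise ValueError("only one recognition site found")
    return first, second


def excise(seq: str, site: str = SITE) -> str:
    """Delete the segment between two sites, leaving a single site copy."""
    first, second = find_two_sites(seq, site)
    return seq[:first] + seq[second:]


def invert(seq: str, site: str = SITE) -> str:
    """Reverse-complement the segment lying between two sites."""
    first, second = find_two_sites(seq, site)
    inner_start = first + len(site)
    inner = seq[inner_start:second]
    return seq[:inner_start] + reverse_complement(inner) + seq[second:]


def integrate(target: str, cargo: str, position: int) -> str:
    """Insert a cargo sequence into the target at a 0-based position."""
    if not 0 <= position <= len(target):
        raise ValueError("position outside target")
    return target[:position] + cargo + target[position:]


if __name__ == "__main__":
    substrate = "AAA" + SITE + "CCGT" + SITE + "GGG"
    print(excise(substrate))
    print(invert(substrate))
    print(integrate("AAAGGG", "TT", 3))
```

The model tracks only one strand as text and ignores biochemical requirements such as site orientation, cofactors, and target-site duplications, so it serves only to make the four outcome terms concrete.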
【0091】 As used herein, the term “transposon” generally refers to a mobile element that moves in and out of a genome and carries “cargo DNA” with it. In some cases, transposons may differ in the type of nucleic acid they transpose, the type of repeats at the ends of the transposon, the type of cargo carried, or the mode of transposition (i.e., self-repair or host repair). As used herein, “transposase” generally refers to an enzyme that binds to the ends of a transposon and catalyzes its movement to another part of the genome. In some cases, this movement may occur by a cut-and-paste mechanism or by replicative transposition. 【0092】 As used herein, “Tn7” or “Tn7-like transposase” generally refers to a family of transposases comprising three main components: a heteromeric transposase (TnsA and/or TnsB) and a regulatory protein (TnsC). In addition to the TnsABC transposition proteins, Tn7 elements can encode dedicated target-site selection proteins, TnsD and TnsE. Together with TnsABC, the sequence-specific DNA-binding protein TnsD directs transposition to a conserved site called the “Tn7 attachment site” (attTn7). TnsD is a member of a larger protein family that also includes TniQ. TniQ has been shown to target transposition to resolution sites of plasmids. 【0093】 In some cases, the CAST system described herein may include one or more Tn7 or Tn7-like transposases. In certain exemplary embodiments, the Tn7 or Tn7-like transposase includes a multimeric protein complex. In certain exemplary embodiments, the multimeric protein complex includes TnsA, TnsB, TnsC, or TniQ. In these combinations, the transposases (TnsA, TnsB, TnsC, TniQ) may form complexes or fusion proteins with each other. 
【0094】 As used herein, the term “Cas12k” (or alternatively “Class II, Type V-K”) generally refers to a subtype of Type V CRISPR systems whose effectors have been found to be defective in nuclease activity (for example, they may contain at least one defective RuvC domain lacking at least one catalytic residue important for DNA cleavage). Effectors of this subtype have generally been found in association with CAST systems. 【0095】 As used herein, the term “Type I-F” (or alternatively, “Class I, Type I-F CRISPR”) generally refers to a subtype of Class I, Type I CRISPR systems. Such systems generally contain multicomponent CRISPR effectors including Cas8, Cas7, and Cas6 proteins. In some cases, such systems are found to be associated with CAST systems. In some cases, Type I-F CRISPR systems contain crRNAs including an 8-nt 5' handle for Cas8 and/or Cas5 binding, a 32-nt spacer bound by six copies of Cas7 for target recognition, or a 20-nt 3' hairpin for Cas6 binding and pre-crRNA processing. In some cases, Type I-F systems utilize a 5'-CC PAM on the non-target strand for target binding. 【0096】 Overview 【0097】 The discovery of novel Cas enzymes with unique functionalities and structures could further transform deoxyribonucleic acid (DNA) editing technologies, offering the potential to improve speed, specificity, functionality, and ease of use. Relative to the predicted prevalence of CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) systems in microorganisms and the sheer diversity of microbial species, the literature contains relatively few functionally characterized CRISPR/Cas enzymes. This is partly because a vast number of microbial species are not easily cultured in the laboratory. Metagenomic sequencing from natural environmental niches representing many microbial species could dramatically increase the number of known new CRISPR/Cas systems and accelerate the discovery of new oligonucleotide editing functions. 
A recent example demonstrating the usefulness of such an approach is the discovery in 2016 of the CasX/CasY CRISPR systems from metagenomic analysis of natural microbial communities. 【0098】 The CRISPR/Cas system is an RNA-directed nuclease complex described as functioning as an adaptive immune system in microorganisms. In its natural context, the CRISPR/Cas system occurs at a CRISPR (clustered regularly interspaced short palindromic repeats) operon or locus, which generally consists of two parts: (i) an array of short repetitive sequences (30-40 bp) separated by equally short spacer sequences, which encodes the RNA-based targeting element, and (ii) ORFs encoding the Cas nuclease polypeptide directed by the RNA-based targeting element, together with accessory proteins/enzymes. Efficient nuclease targeting of a specific target nucleic acid sequence generally requires both (i) complementary hybridization between the first 6-8 nucleic acids of the target (the target seed) and the crRNA guide, and (ii) the presence of a protospacer-adjacent motif (PAM) sequence within a defined vicinity of the target seed (the PAM is typically a sequence not commonly represented in the host genome). Depending on the precise function and organization of the system, CRISPR-Cas systems are generally classified into two classes, five types, and sixteen subtypes based on shared functional characteristics and evolutionary similarity (see Figure 1). 【0099】 Class I CRISPR-Cas systems have large, multi-subunit effector complexes and include types I, III, and IV. 【0100】 Type I CRISPR-Cas systems are considered to be of moderate complexity in terms of components. In type I CRISPR-Cas systems, the array of RNA-targeting elements is transcribed as a long precursor crRNA (pre-crRNA), which is processed at the repeat elements to release short, mature crRNAs. 
The crRNA then directs the nuclease complex to a nucleic acid target when the target is followed by a suitable short consensus sequence called a protospacer-adjacent motif (PAM). This processing occurs via the endoribonuclease subunit (Cas6) of a large endonuclease complex called Cascade, which also contains the nuclease (Cas3) protein component of the crRNA-directed nuclease complex. Cas I nucleases primarily function as DNA nucleases. 【0101】 Type III CRISPR systems are sometimes characterized by the presence of a central nuclease known as Cas10, along with a repeat-associated mysterious protein (RAMP) containing Csm or Cmr protein subunits. As in type I systems, mature crRNA is processed from pre-crRNA using a Cas6-like enzyme. Unlike type I and type II systems, type III systems appear to target and cleave DNA-RNA duplexes (such as a DNA strand being used as a template for RNA polymerase). 【0102】 Type IV CRISPR-Cas systems have an effector complex consisting of a highly reduced large subunit nuclease (csf1), two genes for RAMP proteins of the Cas5 (csf3) and Cas7 (csf2) groups, and, in some cases, a gene for a small subunit. Such systems are generally found on endogenous plasmids. 【0103】 Class II CRISPR-Cas systems generally have single-polypeptide, multi-domain nuclease effectors and include types II, V, and VI. 【0104】 Type II CRISPR-Cas systems are considered the simplest in terms of components. In type II CRISPR-Cas systems, processing of the CRISPR array into mature crRNAs does not require a dedicated endonuclease subunit, but rather a small trans-encoded crRNA (tracrRNA) with a region complementary to the array repeat sequence. The tracrRNA interacts with both its corresponding effector nuclease (such as Cas9) and the repeat sequence to form a precursor dsRNA structure, which is then cleaved by endogenous RNase III to produce a mature effector enzyme loaded with both tracrRNA and crRNA. Cas II nucleases are known as DNA nucleases. 
Type II effectors generally exhibit a structure consisting of a RuvC-like endonuclease domain that adopts an RNase H fold, with an unrelated HNH nuclease domain inserted within the fold of the RuvC-like nuclease domain. The RuvC-like domain is responsible for cleaving the target DNA strand (e.g., the DNA strand complementary to the crRNA), while the HNH domain is responsible for cleaving the displaced DNA strand. 【0105】 Type V CRISPR-Cas systems feature a nuclease effector structure (e.g., Cas12) similar to that of type II effectors, including a RuvC-like domain. Like type II systems, most (but not all) type V CRISPR systems use tracrRNA to process pre-crRNA into mature crRNA. However, unlike type II systems, which require RNase III to cleave the pre-crRNA into multiple crRNAs, type V systems can use the effector nuclease itself to cleave pre-crRNA. Like type II CRISPR-Cas systems, type V CRISPR-Cas systems are known as DNA nucleases. Unlike type II CRISPR-Cas systems, some type V enzymes (e.g., Cas12a) appear to possess robust single-strand nonspecific deoxyribonuclease activity that is activated by the initial crRNA-directed cleavage of a double-stranded target sequence. 【0106】 Type VI CRISPR-Cas systems possess RNA-guided RNA endonucleases. Instead of a RuvC-like domain, the single-polypeptide effectors of type VI systems (e.g., Cas13) contain two HEPN ribonuclease domains. Unlike type II and type V systems, type VI systems also appear not to require tracrRNA to process pre-crRNA into crRNA. However, similar to type V systems, some type VI systems (e.g., C2C2) appear to possess robust single-strand nonspecific nuclease (ribonuclease) activity that is activated by the initial crRNA-directed cleavage of a target RNA. 【0107】 Because of their simpler architecture, Class II CRISPR-Cas systems have been the most widely adopted for engineering and development as designer nuclease/genome editing applications. 【0108】 One of the earliest adaptations of such a system for in vitro use was described by Jinek et al. (Science. 
2012 Aug 17;337(6096):816-21; fully incorporated herein by reference). The Jinek study first described a system comprising (i) recombinantly expressed, purified full-length Cas9 (e.g., a class II, type II Cas enzyme) isolated from S. pyogenes SF370, (ii) purified mature crRNA of approximately 42 nt having an approximately 20-nt 5' sequence complementary to the target DNA sequence to be cleaved, followed by a 3' tracr-binding sequence (the entire crRNA being transcribed in vitro from a synthetic DNA template carrying a T7 promoter sequence), (iii) purified tracrRNA transcribed in vitro from a synthetic DNA template carrying a T7 promoter sequence, and (iv) Mg2+. Jinek later described an improved, engineered system in which the crRNA of (ii) is joined to the 5' end of the tracrRNA of (iii) by a linker (e.g., GAAA) to form a single fused synthetic guide RNA (sgRNA) that can by itself direct Cas9 to the target (compare the upper and lower panels in Figure 2). 【0109】 Mali et al. (Science. 2013 Feb 15; 339(6121):823-826; fully incorporated herein by reference) subsequently adapted this system for use in mammalian cells by providing DNA vectors encoding: (i) an ORF encoding codon-optimized Cas9 (e.g., a class II, type II Cas enzyme) under a suitable mammalian promoter, with a C-terminal nuclear localization sequence (e.g., SV40 NLS) and a suitable polyadenylation signal (e.g., TK pA signal); and (ii) an ORF encoding an sgRNA (having a 5' sequence beginning with G, followed by a 20-nt complementary targeting nucleic acid sequence joined to a 3' tracr-binding sequence, a linker, and the tracrRNA sequence) under a suitable polymerase III promoter (e.g., the U6 promoter). 【0110】 Transposons are mobile elements that can move between positions within a genome. Such transposons have evolved to limit their adverse effects on the host. Various regulatory mechanisms are used to keep transposition at a low frequency and sometimes to coordinate transposition with various cellular processes. 
Some prokaryotic transposons can also mobilize functions that benefit the host or otherwise help maintain the element. Certain transposons may have further evolved mechanisms that tightly control target-site selection, the most notable example of which is the Tn7 family. 【0111】 Transposon Tn7 and similar elements may be reservoirs of antibiotic resistance and pathogenicity functions in clinical settings, in addition to encoding other adaptive functions in natural environments. For example, the Tn7 system has evolved mechanisms to almost completely avoid integration into important host genes, while also maximizing dispersal of the element by recognizing mobile plasmids and bacteriophages that can transfer Tn7 between host bacteria. 【0112】 Tn7 and Tn7-like elements can control where and when they insert, possessing one pathway that directs insertion into a single conserved site within the bacterial genome and a second pathway that appears to be adapted to maximize targeting to mobile plasmids capable of transporting the element between bacteria (see Figure 3). The association between Tn7-like transposons and CRISPR-Cas systems suggests that the transposons may have hijacked CRISPR effectors to generate R-loops at target sites, thereby facilitating transposon dispersal via plasmids and phages. 【0113】 MG36 series 【0114】 In one embodiment, the disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site. The system may comprise a first double-stranded nucleic acid, which may comprise a cargo nucleotide sequence configured to interact with a recombinase complex. The system may comprise a Cas effector complex, which may comprise a class II, type II Cas effector and at least one engineered guide polynucleotide configured to hybridize to the target nucleic acid site. The class II, type II Cas effector may comprise a RuvC domain and an HNH domain. 
The system may comprise a recombinase or transposase complex configured to recruit the cargo nucleotide sequence to the target nucleic acid site. 【0115】 In some cases, the cargo nucleotide sequence is adjacent to a left transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to a right transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to both the left and right transposase recognition sequences. In some cases, the system further includes a second double-stranded nucleic acid containing the target nucleic acid site. In some cases, the system further includes a PAM sequence that is compatible with the Cas effector complex and adjacent to the target nucleic acid site. In some cases, the PAM sequence is located 3' of the target nucleic acid site. In some cases, the recombinase or transposase complex is a Tn7 type transposase complex. In some cases, the engineered guide polynucleotide is configured to bind to the class II, type II Cas effector. In some cases, the class II, type II Cas effector contains a polypeptide that is at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identical to SEQ ID NO: 1. In some cases, the class II, type II Cas effector contains a polypeptide that is substantially identical to SEQ ID NO: 1. 【0116】 In some cases, the recombinase or transposase complex contains at least one polypeptide containing a sequence that is substantially identical to any one of SEQ ID NOs: 2-5 or a variant thereof. 
In some cases, the recombinase or transposase complex contains at least two polypeptides containing sequences that are substantially identical to any one of SEQ ID NOs: 2-5 or a variant thereof. In some cases, the recombinase or transposase complex contains at least three polypeptides containing sequences that are substantially identical to any one of SEQ ID NOs: 2-5 or a variant thereof. In some cases, the recombinase or transposase complex contains at least four polypeptides containing sequences that are substantially identical to any one of SEQ ID NOs: 2-5 or a variant thereof. In some cases, the recombinase or transposase complex contains a TnsB1 polypeptide having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with SEQ ID NO: 2 or a variant thereof. In some cases, the recombinase or transposase complex contains a TnsB1 polypeptide having a sequence substantially identical to SEQ ID NO: 2 or a variant thereof. In some cases, the recombinase or transposase complex contains a TnsB2 polypeptide having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with SEQ ID NO: 3 or a variant thereof. 
In some cases, the recombinase or transposase complex contains a TnsB2 polypeptide having a sequence substantially identical to SEQ ID NO: 3 or a variant thereof. In some cases, the recombinase or transposase complex contains a TnsT1 polypeptide having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with SEQ ID NO: 4 or a variant thereof. In some cases, the recombinase or transposase complex contains a TnsC polypeptide having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with SEQ ID NO: 5 or a variant thereof. 
In some cases, the recombinase or transposase complex contains a TnsC polypeptide having [...] 【0117】 In some cases, the engineered guide polynucleotide contains a sequence of at least 60 to 80 consecutive nucleotides having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with SEQ ID NO: 11 or a variant thereof. 【0118】 In some cases, the left-hand recombinase sequence contains a sequence that has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with any one of SEQ ID NOs: 17-18 or a variant thereof. 
In some cases, the right-hand recombinase sequence contains a sequence that has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with SEQ ID NO: 19 or a variant thereof. 【0119】 In some cases, the class II, type II Cas effector and the recombinase or transposase complex are encoded by a polynucleotide sequence containing less than about 20 kilobases, less than about 15 kilobases, less than about 10 kilobases, or less than about 5 kilobases. 【0120】 In one embodiment, the disclosure provides a method for transposing a cargo nucleotide sequence to a target nucleic acid site containing a target nucleotide sequence, the method comprising expressing the system described herein in a cell or introducing the system described herein into a cell. 【0121】 MG39 series In one embodiment, the disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site. The system may include a first double-stranded nucleic acid comprising a cargo nucleotide sequence. This cargo nucleotide sequence may be configured to interact with a Tn7 type transposase complex. The system may also include a Cas effector complex. The Cas effector complex may include a class II, type V Cas effector and an engineered guide polynucleotide configured to hybridize to the target nucleotide sequence. The class II, type V Cas effector may include a RuvC domain. 
The system may also include a Tn7 type transposase complex configured to bind to the Cas effector complex, the Tn7 type transposase complex comprising a TnsA subunit. 【0122】 In some cases, the cargo nucleotide sequence is adjacent to the left transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to the right transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to both the left and right transposase recognition sequences. In some cases, the system further includes a second double-stranded nucleic acid containing the target nucleic acid site. In some cases, the system further includes a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some cases, the PAM sequence is located 3' of the target nucleic acid site. 【0123】 In some cases, the engineered guide polynucleotide is configured to bind to the class II, type V Cas effector. In some cases, the class II, type V Cas effector contains a polypeptide containing a sequence that has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with respect to SEQ ID NO: 5 or its variants. In some cases, the class II, type V Cas effector contains a polypeptide containing a sequence that is substantially identical to SEQ ID NO: 5 or its variants. 
In some cases, the TnsA subunit contains a polypeptide having a sequence that is at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identical to SEQ ID NO: 7 or its variants. In some cases, the TnsA subunit contains a polypeptide having a sequence that is substantially identical to SEQ ID NO: 7 or its variants. 【0124】 In some cases, the Tn7 type transposase complex contains at least one polypeptide containing a sequence that is substantially identical to any one of SEQ ID NOs: 8-10 or its variants. In some cases, the Tn7 type transposase complex contains at least two polypeptides containing sequences that are substantially identical to any one of SEQ ID NOs: 8-10 or its variants. In some cases, the Tn7 type transposase complex contains at least three polypeptides containing sequences that are substantially identical to any one of SEQ ID NOs: 8-10 or its variants. 
【0125】 In some cases, the Tn7 type transposase complex contains a TnsA polypeptide containing a sequence that is at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identical to any one of the sequences in SEQ ID NO: 7 or its variants. In some cases, the Tn7 type transposase complex contains a TnsB polypeptide with a sequence having at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of SEQ ID NO: 8 or its variants.In some cases, the Tn7 type transposase complex contains a TnsC polypeptide with a sequence having at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least 
approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of SEQ ID NO: 9 or its variants. In some cases, the Tn7 type transposase complex contains a TniQ polypeptide containing a sequence that is at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identical to any one of the SEQ ID NO: 10 sequences or their variants. 
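As an illustration of how a percent-identity threshold of the kind recited above might be evaluated, the following is a minimal sketch in Python. It assumes that the two sequences have already been aligned (the alignment method is not specified here) and that identity is counted over aligned columns in which neither sequence has a gap; other denominator conventions exist and may give different values. The function names `percent_identity` and `meets_identity_threshold` and the example sequences are hypothetical and do not come from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs are rows of the same pairwise alignment,
    so they must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_percent: float) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from any SEQ ID NO.
    print(percent_identity("MKT-LL", "MKTALV"))
    print(meets_identity_threshold("MKT-LL", "MKTALV", 80.0))
```

Under this convention, a candidate polypeptide would be compared against a reference sequence once per threshold of interest (for example 80%, 90%, or 95%).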
【0126】 In some cases, the manipulated guide polynucleotide contains a sequence of at least 46 to 80 consecutive nucleotides that have at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with respect to any one of SEQ ID NOs. 13 to 16 or its variants. 【0127】 In some cases, the left-hand recombinase sequence contains a sequence that is at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identical to sequence number 20 or its variants. 
【0128】 In some cases, the recombinase sequence on the right contains a sequence that is at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identical to sequence number 21 or its variants. 【0129】 In some cases, class II, type V Cas effectors and Tn7 type transposase complexes are encoded by polynucleotide sequences containing less than approximately 20 kilobases, less than approximately 15 kilobases, less than approximately 10 kilobases, or less than approximately 5 kilobases. 【0130】 In one embodiment, the Disclosure provides a method for transposing a cargo nucleotide sequence to a target nucleic acid site containing a target nucleotide sequence, the method comprising the steps of expressing the system described herein in a cell or introducing the system described herein into a cell. 【0131】 In one embodiment, the Disclosure discloses a method for transposing a cargo nucleotide sequence to a target nucleic acid site, the method comprising contacting a first double-stranded nucleic acid containing a cargo nucleotide sequence with a Cas effector complex. The Cas effector complex may comprise a class II, type II Cas effector and at least one manipulated guide polynucleotide configured to hybridize to a target nucleic acid site. 
The method may comprise contacting the first double-stranded nucleic acid containing the cargo nucleotide sequence with a recombinase or transposase complex configured to recruit the cargo nucleotide sequence to the target nucleic acid site. The method may comprise contacting the first double-stranded nucleic acid containing the cargo nucleotide sequence with a second double-stranded nucleic acid containing the target nucleic acid site. 【0132】 In some cases, the cargo nucleotide sequence is adjacent to the left transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to the right transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to both the left and right transposase recognition sequences. In some cases, the Cas effector complex further includes a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some cases, the PAM sequence is located 3' of the target nucleic acid site. In some cases, the PAM sequence is located 5' of the target nucleic acid site. 【0133】 MG64 series 【0134】 In one embodiment, the present disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site. The system may include a first double-stranded nucleic acid comprising a cargo nucleotide sequence. This cargo nucleotide sequence may be configured to interact with a Tn7 type transposase complex. The system may also include a Cas effector complex. The Cas effector complex may include a class II, type V Cas effector and an engineered guide polynucleotide configured to hybridize to the target nucleotide sequence. The system may also include a Tn7 type transposase complex configured to bind to the Cas effector complex. The class II, type V Cas effector may include a RuvC domain. 【0135】 In some cases, the cargo nucleotide sequence is adjacent to the left transposase recognition sequence. 
In some cases, the cargo nucleotide sequence is adjacent to the right transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to both the left and right transposase recognition sequences. In some cases, the system further includes a second double-stranded nucleic acid containing the target nucleic acid site. In some cases, the system further includes a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some cases, the PAM sequence is located 3' of the target nucleic acid site. In some cases, the PAM sequence is located 5' of the target nucleic acid site. In some cases, the PAM sequence includes 5'-nGTn-3' or 5'-nGTt-3'. 【0136】 In some cases, the engineered guide polynucleotide is configured to bind to a class II, type V Cas effector. In some cases, the class II, type V Cas effector contains a polypeptide containing a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 22, 26, 30, 34, 55-89, 104, or 147 or its variants. In some cases, the class II, type V Cas effector contains a polypeptide having a sequence substantially identical to any one of SEQ ID NOs: 22, 26, 30, 34, 55-89, 104, or 147 or its variants. 
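The PAM motifs 5'-nGTn-3' and 5'-nGTt-3' recited above use the degenerate symbol n for any nucleotide. The following Python sketch shows one way a candidate target site could be screened for such a motif on its 5' or 3' side. It is an illustration under stated assumptions only: the sequence is read 5' to 3' on a single strand, coordinates are 0-based and half-open, and the helper names `pam_to_regex` and `has_adjacent_pam` and the example sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def pam_to_regex(pam: str) -> "re.Pattern[str]":
    """Compile a degenerate PAM such as 'nGTn' into a regular expression."""
    return re.compile("".join(IUPAC[ch] for ch in pam.upper()))


def has_adjacent_pam(seq: str, target_start: int, target_end: int,
                     pam: str, side: str) -> bool:
    """Check for a PAM immediately 5' or 3' of the target on the given strand.

    Assumptions: seq is written 5'->3'; the target occupies
    seq[target_start:target_end]; side is '5prime' or '3prime'.
    """
    pattern = pam_to_regex(pam)
    n = len(pam)
    if side == "3prime":
        window = seq[target_end:target_end + n]
    elif side == "5prime":
        if target_start < n:
            return False  # not enough sequence upstream of the target
        window = seq[target_start - n:target_start]
    else:
        raise ValueError("side must be '5prime' or '3prime'")
    return len(window) == n and pattern.fullmatch(window.upper()) is not None


if __name__ == "__main__":
    # Hypothetical 24-nt sequence: 'AGTT' followed by a 20-nt target.
    seq = "AGTT" + "ACGT" * 4 + "ACGG"
    print(has_adjacent_pam(seq, 4, 24, "nGTn", "5prime"))
    print(has_adjacent_pam(seq, 0, 20, "nGTn", "3prime"))
```

The sketch examines only the strand supplied; screening both strands would require repeating the check on the reverse complement.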
【0137】 In some cases, the Tn7 type transposase complex contains at least one polypeptide containing a sequence having at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of SEQ ID NOs. 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150 or its variants. In some cases, the recombinase or transposase complex contains at least one polypeptide having a substantially identical sequence to one of the sequence numbers 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150, or a variant thereof. In some cases, the Tn7 type transposase complex contains at least two polypeptides containing sequences that have at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of SEQ ID NOs. 
23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150 or their variants.In some cases, the Tn7 type transposase complex contains at least two polypeptides that have substantially identical sequences to one of the following sequences, or variants thereof: SEQ ID NOs. 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150. In some cases, the Tn7 type transposase complex contains at least three polypeptides containing sequences that have at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of sequence numbers 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150 or its variants. In some cases, the Tn7 type transposase complex contains at least three polypeptides that have substantially identical sequences to one of the following sequences, or variants thereof: SEQ ID NOs. 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150. 
【0138】 In some cases, the Tn7 type transposase complex contains TnsB, TnsC, and TniQ polypeptides containing sequences that have at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of SEQ ID NOs. 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150 or their variants. In some cases, Tn7 type transposase complexes contain TnsB polypeptides that have substantially identical sequences to any one of the sequence numbers 8 or its variants. In some cases, the Tn7 type transposase complex contains TnsB, TnsC, and TniQ polypeptides containing sequences that have at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of SEQ ID NOs. 23-25, 27-29, 31-33, 35-37, 101-103, 105-107, or 148-150 or their variants. 
【0139】 In some cases, the manipulated guide polynucleotide contains a sequence of at least 46 to 80 consecutive nucleotides having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with any one of SEQ ID NOs. In some cases, the manipulated guide polynucleotide contains a sequence of at least approximately 46–80 consecutive nucleotides that are substantially identical to one of sequence numbers 90, 91, 92, 93, 117, 151, 156–181, or 209–234 or its variants. 【0140】 In some cases, the manipulated guide polynucleotide contains a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with respect to one of the non-degenerate nucleotides of any 【0141】 In some cases, the left-hand recombinase sequence contains a sequence that has at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least 
approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of sequence numbers 125, 127, 123, 129, 131, 133, 153, or 134 or its variants. In some cases, the left-hand recombinase sequence contains substantially the same sequence as any one of sequence numbers 125, 127, 123, 129, 131, 133, 153, or 134, or a variant thereof. 【0142】 In some cases, the recombinase sequence on the right contains a sequence that has at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of sequence numbers 126, 155, 128, 124, 130, 132, or 154 or its variants. In some cases, the recombinase sequence on the right contains substantially the same sequence as one of sequence numbers 126, 155, 128, 124, 130, 132, or 154, or its variants. 【0143】 In some cases, class II, type V Cas effectors and Tn7 type transposase complexes are encoded by polynucleotide sequences containing less than approximately 20 kilobases, less than approximately 15 kilobases, less than approximately 10 kilobases, or less than approximately 5 kilobases. 
【0144】 MG108 series 【0145】 In one embodiment, the present disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site. The system may include a first double-stranded nucleic acid comprising a cargo nucleotide sequence. This cargo nucleotide sequence may be configured to interact with a Tn7-type transposase complex. The system may also include a Cas effector complex. The Cas effector complex may include a class II, type V Cas effector and an engineered guide polynucleotide configured to hybridize to the target nucleotide sequence. The class II, type V Cas effector may include a RuvC domain. The system may also include a Tn7-type transposase complex configured to bind to the Cas effector complex. In some cases, the Tn7-type transposase complex may include TnsB and TnsC components but not TnsA and/or TniQ components. 【0146】 In some cases, the cargo nucleotide sequence is adjacent to the left transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to the right transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to both the left and right transposase recognition sequences. In some cases, the system further includes a second double-stranded nucleic acid containing the target nucleic acid site. In some cases, the system further includes a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some cases, the PAM sequence is located 3' of the target nucleic acid site. In some cases, the PAM sequence is located 5' of the target nucleic acid site. 【0147】 In some cases, the engineered guide polynucleotide is configured to bind to the class II, type V Cas effector. 
In some cases, the class II, type V Cas effector contains a polypeptide containing a sequence that has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with respect to SEQ ID NO: 38 or SEQ ID NO: 108 or its variants. In some cases, the class II, type V Cas effector contains a polypeptide containing a sequence that is substantially identical to SEQ ID NO: 38 or SEQ ID NO: 108 or its variants. 【0148】 In some cases, a Tn7 type transposase complex contains at least one polypeptide containing a sequence that has at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with respect to one of sequence numbers 39-40 or 109-110 or its variants. In some cases, a recombinase or transposase complex contains at least one polypeptide containing a sequence that is substantially identical to one of sequence numbers 39-40 or 109-110 or its variants. 
In some cases, the Tn7 type transposase complex contains at least two polypeptides containing sequences that are substantially identical to one of SEQ ID NOs: 39-40 or 109-110 or its variants. In some cases, the Tn7 type transposase complex contains at least three polypeptides containing sequences that are substantially identical to one of SEQ ID NOs: 39-40 or 109-110 or its variants. 【0149】 In some cases, the Tn7 type transposase complex contains a TnsB component with a sequence having at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with either SEQ ID NO: 40 or SEQ ID NO: 109 or its variants. In some cases, the recombinase or transposase complex contains a TnsB component with a sequence that is substantially identical to either SEQ ID NO: 40 or SEQ ID NO: 109 or its variants. 
【0150】 In some cases, the Tn7 type transposase complex contains a TnsC component with a sequence having at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with respect to either SEQ ID NO: 39 or SEQ ID NO: 110 or its variants. In some cases, the recombinase or transposase complex contains a TnsC component with a sequence that is substantially identical to either SEQ ID NO: 39 or SEQ ID NO: 110 or its variants. 【0151】 In some cases, the Tn7 type transposase complex contains TnsB and TnsC components with sequences that have at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with respect to any one of SEQ ID NOs: 40 and 39, or 109 and 110, or their variants. 
【0152】 In some cases, the manipulated guide polynucleotide contains a sequence of at least 46 to 80 consecutive nucleotides having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with any one of SEQ ID NOs. 118, 182, 183, 235, or 236 or its variants. In some cases, the manipulated guide polynucleotide contains a sequence of at least approximately 46–80 consecutive nucleotides that are substantially identical to one of sequence numbers 118, 182, 183, 235, or 236 or its variants. 【0153】 In some cases, the manipulated guide polynucleotide contains a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with respect to any one of the non-degenerate nucleotides of SEQ ID NOs. 115, 116, 205, or 206 or its variants. In some cases, the manipulated guide polynucleotide contains a sequence of at least approximately 46–80 consecutive nucleotides substantially identical to the non-degenerate nucleotide of any one of sequence numbers 115, 116, 205, or 206 or its variants. 
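One possible reading of identity "with respect to the non-degenerate nucleotides" of a reference sequence is that positions given as degenerate symbols (such as N) in the reference are excluded from the comparison. The Python sketch below illustrates that reading only; it is an assumption, not a definition taken from the disclosure. It further assumes equal-length, ungapped sequences and treats U and T as equivalent. The function name `identity_over_non_degenerate` and the example sequences are hypothetical.

```python
def identity_over_non_degenerate(reference: str, candidate: str) -> float:
    """Percent identity counted only at non-degenerate reference positions.

    Assumptions: sequences are the same length and already in register;
    any reference symbol other than A, C, G, T/U is treated as degenerate
    and skipped; U is normalized to T before comparison.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must have equal length")
    ref = reference.upper().replace("U", "T")
    cand = candidate.upper().replace("U", "T")
    compared = 0
    matches = 0
    for r, c in zip(ref, cand):
        if r not in "ACGT":
            continue  # degenerate reference position, e.g. N in a spacer region
        compared += 1
        if r == c:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical guide scaffold with a 4-nt degenerate stretch.
    print(identity_over_non_degenerate("GUUNNNNAC", "GUUACGUAC"))
    print(identity_over_non_degenerate("GUUNNNNAC", "GUAACGUAC"))
```

This masks, for example, a variable spacer region so that only the fixed scaffold positions contribute to the identity value.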
【0154】 In some cases, the left-hand recombinase sequence contains a sequence that is at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identical to sequence number 134 or its variants. 【0155】 In some cases, the recombinase sequence on the right contains a sequence that is at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identical to sequence number 135 or its variants. 【0156】 In some cases, class II, type V Cas effectors and Tn7 type transposase complexes are encoded by polynucleotide sequences containing less than approximately 20 kilobases, less than approximately 15 kilobases, less than approximately 10 kilobases, or less than approximately 5 kilobases. 
【0157】 MG110 series 【0158】 In one embodiment, the present disclosure provides a system for transposing a cargo nucleotide sequence to a target nucleic acid site. The system may include a first double-stranded nucleic acid comprising a cargo nucleotide sequence. This cargo nucleotide sequence may be configured to interact with a Tn7 type transposase complex. The system may also include a Cas effector complex. The Cas effector complex may include a class I, type I Cas effector and an engineered guide polynucleotide configured to hybridize to the target nucleotide sequence. The system may also include a Tn7 type transposase complex configured to bind to the Cas effector complex. 【0159】 In some cases, the cargo nucleotide sequence is adjacent to the left transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to the right transposase recognition sequence. In some cases, the cargo nucleotide sequence is adjacent to both the left and right transposase recognition sequences. In some cases, the system further includes a second double-stranded nucleic acid containing the target nucleic acid site. In some cases, the system further includes a PAM sequence compatible with the Cas effector complex adjacent to the target nucleic acid site. In some cases, the PAM sequence is located 3' of the target nucleic acid site. 【0160】 In some cases, the engineered guide polynucleotide is configured to bind to a class I, type I Cas effector. 
In some cases, the class I, type I Cas effector contains a polypeptide containing a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of sequence numbers 41-43 or 48-50 or its variants. In some cases, class I, type I Cas effectors contain polypeptides that have substantially the same sequence as one of sequence numbers 41-43 or 48-50 or their variants. 【0161】 In some cases, the manipulated guide polynucleotide is configured to bind to a class I, type I Cas effector. In some cases, the class I, type I Cas effectors include Cas6, Cas7, and Cas8 effectors containing sequences having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with any one of sequence numbers 41-43 or 48-50 or its variants. In some cases, Class I, Type I Cas effectors include Cas6, Cas7, and Cas8 effectors that contain substantially identical sequences to any one of sequence numbers 41-43 or 48-50 or their variants. 【0162】 In some cases, a Tn7 type transposase complex contains at least one polypeptide containing a sequence that is substantially identical to one of SEQ ID NOs. 
44-47 or 51-54 or its variants. In some cases, the Tn7 type transposase complex contains at least two polypeptides containing sequences that are substantially identical to one of sequence numbers 44-47 or 51-54 or its variants. In some cases, the Tn7 type transposase complex contains at least three polypeptides containing sequences that are substantially identical to one of sequence numbers 44-47 or 51-54 or its variants. In some cases, the Tn7 type transposase complex contains four polypeptides with sequences that are substantially identical to any one of SEQ ID NOs. 44-47 or 51-54 or its variants. 【0163】 In some cases, the Tn7 type transposase complex contains TnsA, TnsB, TnsC, and TniQ components that have sequences with at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identity with any one of SEQ ID NOs. 44-47 or 51-54 or their variants. 
【0164】 In some cases, the manipulated guide polynucleotide contains a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity with respect to any one of the non-degenerate nucleotides of SEQ ID NOs. 121, 122, 207, or 208 or its variants. In some cases, the manipulated guide polynucleotide contains a sequence of at least approximately 46–80 consecutive nucleotides substantially identical to the non-degenerate nucleotide of any one of sequence numbers 121, 122, 207, or 208 or its variants. 【0165】 In some cases, the left-hand recombinase sequence contains a sequence that is at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identical to sequence number 136 or 138 or its variants. 
【0166】 In some cases, the right-hand recombinase sequence contains a sequence that is at least approximately 20%, at least approximately 25%, at least approximately 30%, at least approximately 35%, at least approximately 40%, at least approximately 45%, at least approximately 50%, at least approximately 55%, at least approximately 60%, at least approximately 65%, at least approximately 70%, at least approximately 75%, at least approximately 80%, at least approximately 85%, at least approximately 90%, at least approximately 91%, at least approximately 92%, at least approximately 93%, at least approximately 94%, at least approximately 95%, at least approximately 96%, at least approximately 97%, at least approximately 98%, or at least approximately 99% identical to sequence number 137 or 139 or its variants. 【0167】 In some cases, the class I, type I Cas effector and the Tn7 type transposase complex are encoded by a polynucleotide sequence containing less than approximately 20 kilobases, less than approximately 15 kilobases, less than approximately 10 kilobases, or less than approximately 5 kilobases. 【0168】 In one embodiment, the present disclosure provides a method for transposing a cargo nucleotide sequence to a target nucleic acid site containing a target nucleotide sequence, the method comprising expressing the system described herein in a cell or introducing the system described herein into a cell. 【0169】 In accordance with IUPAC conventions, the following abbreviations are used throughout the examples: A = adenine; C = cytosine; G = guanine; T = thymine; R = adenine or guanine; Y = cytosine or thymine; S = guanine or cytosine; W = adenine or thymine; K = guanine or thymine; M = adenine or cytosine; B = C, G, or T; D = A, G, or T; H = A, C, or T; V = A, C, or G. [Examples] 【0170】 Example 1 - (General Protocol) Identification / Confirmation of PAM Sequences for the Systems Described Herein The putative endonuclease was expressed in an E. 
coli lysate-based expression system (myTXTL, Arbor Biosciences). The PAM sequence was determined by sequencing plasmids containing randomly generated potential PAM sequences that could be cleaved by the putative nuclease. In this system, the E. coli codon-optimized nucleotide sequence encoding the putative nuclease was transcribed and translated in vitro from a PCR fragment under the control of the T7 promoter. A second PCR fragment, carrying the T7 promoter followed by a minimal CRISPR array consisting of a repeat-spacer-repeat sequence, was transcribed in the same reaction. Successful expression of the endonuclease and the repeat-spacer-repeat sequence in the TXTL system, followed by CRISPR array processing, yielded an active in vitro CRISPR nuclease complex. 【0171】 A library of target plasmids, each containing a spacer sequence matching the spacer of the minimal array preceded by 8N mixed bases (potential PAM sequences), was incubated with the output of the TXTL reaction. After 1–3 hours, the reaction was stopped and the DNA was recovered using a DNA cleanup kit, e.g., Zymo DCC, AMPure XP beads, QiaQuick. An adapter sequence was blunt-end ligated to DNA that contained an active PAM sequence and had been cleaved by the endonuclease, while uncleaved DNA was inaccessible for ligation. Subsequently, DNA segments containing an active PAM sequence were amplified by PCR using primers specific to the library and the adapter sequence. The PCR amplification products were resolved on a gel to identify the amplicons corresponding to cleavage events. The amplified segments from the cleavage reactions were also used as templates for NGS library preparation or as substrates for Sanger sequencing. Sequence analysis of this resulting library, a subset of the starting 8N library, revealed sequences with PAM activity compatible with the CRISPR complex. 
For PAM testing using processed RNA constructs, the same procedure was repeated, except that in vitro transcribed RNA was added along with the plasmid library, and the minimal CRISPR array template was omitted. 【0172】 For effectors that are competent to bind but lack nuclease activity, the PAM was determined by modifying the procedure described above. After expression in TXTL, the sgRNA or crRNA and the PAM library were added. When the effector bound to the spacer sequence in an sgRNA-dependent manner, the spacer sequence was protected by the effector protein. Appropriate restriction enzymes targeting the spacer sequence were added, and all unprotected plasmids in the library were cleaved. Uncleaved (effector-bound) members of the library containing the PAM were identified by PCR and subsequent NGS library preparation of the bands. 【0173】 Example 2 - In vitro targeted integrase activity Integrase activity was preferentially assayed using previously identified PAMs, but could be performed less efficiently using PAM library substrates instead. One configuration of components for in vitro assays included three plasmids other than the one containing the donor sequence, together with the guide RNA (sgRNA, or crRNA and tracrRNA): (1) an expression plasmid with an effector (or multiple effectors) under the T7 promoter; (2) an expression plasmid with the integrase gene under the T7 promoter; (3) a target plasmid containing a spacer site and the appropriate PAM; and (4) a donor plasmid containing the left-end (LE) and right-end (RE) DNA sequences necessary for transposition around a cargo gene (e.g., a selectable marker such as the Tet resistance gene). Effector and integrase genes were expressed using an in vitro transcription / translation (TXTL) system (e.g., an E. coli lysate-based or reticulocyte extract-based system). After expression, RNA, target DNA, and donor DNA were added and incubated to induce transposition. 
Transposition was detected by PCR across the integration site junction, with one primer on the target DNA and one on the donor DNA. The resulting PCR products were sequenced by NGS to determine the precise insertion topology relative to the sgRNA / crRNA target site. The primers were positioned downstream to accommodate and detect various insertion sites. Since the orientation of the integration was initially unknown, the primers were paired so that integration could be detected in either cargo orientation and on either side of the spacer. 【0174】 Integration efficiency was measured by quantitative PCR (qPCR) of the target DNA with integrated cargo in the experimental output, normalized to the amount of unmodified target DNA, also measured by qPCR. 【0175】 This assay can also be performed using purified protein components rather than lysate-based expression. In this case, the protein was expressed in a protease-deficient E. coli B strain under a T7-inducible promoter. Cells were lysed using sonication, and the His-tagged protein of interest was purified using HisTrap FF (GE Lifescience) Ni-NTA affinity chromatography on AKTA Avant FPLC (GE Lifescience). Purity was determined by SDS-PAGE and densitometry of protein bands separated on acrylamide gel (Bio-Rad) stained with InstantBlue Ultrafast (Sigma-Aldrich) Coomassie, using ImageLab software (Bio-Rad). The protein was desalted into a storage buffer consisting of 50 mM Tris-HCl, 300 mM NaCl, 1 mM TCEP, 5% glycerol, pH 7.5 (or another buffer determined to achieve maximum stability) and stored at -80°C. After purification, the effector and integrase were added to the sgRNA, target DNA, and donor DNA as described above in a reaction buffer consisting of 26 mM HEPES pH 7.5, 4.2 mM TRIS pH 8, 50 μg / mL BSA, 2 mM ATP, 2.1 mM DTT, 0.05 mM EDTA, 0.2 mM MgCl2, 28 mM NaCl, 21 mM KCl, and 1.35% glycerol (final pH 7.5) supplemented with 15 mM Mg(OAc)2. 
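The qPCR normalization in paragraph 【0174】 can be sketched as follows. The text states only that the integrated-product signal is normalized to the unmodified target signal; the use of threshold-cycle (Ct) values, the assumption that both amplicons amplify with the same per-cycle factor (2.0 meaning perfect doubling), and the name `relative_integration` are illustrative assumptions, not the method of the disclosure.

```python
def relative_integration(ct_integrated: float,
                         ct_unmodified: float,
                         amplification_factor: float = 2.0) -> float:
    """Ratio of integrated target to unmodified target from qPCR Ct values.

    Assumptions (not stated in the source): both amplicons amplify with the
    same per-cycle factor, and starting quantity is proportional to
    amplification_factor ** (-Ct).
    """
    if amplification_factor <= 1.0:
        raise ValueError("amplification_factor must be greater than 1")
    # A later Ct for the integrated junction means less integrated template.
    return amplification_factor ** (ct_unmodified - ct_integrated)
```

With hypothetical values, a junction amplicon crossing threshold five cycles later than the unmodified target corresponds to a ratio of 1/32, or about 3% under these assumptions.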
【0176】 Example 3 - Predicted RNA folding The predicted RNA folding of the active single RNA sequence was calculated at 37°C using the Andronescu 2007 method. All hairpin loop secondary structures were individually deleted from the structure and iteratively compiled into smaller single guides. In a second approach, the tracrRNA of MG64-1 was aligned to the tracrRNA of a known type V-K system, and the unique insertion region was removed from the single guide, shortening it by 57 bases. Figure 12A shows the predicted structure of MG64-2 sgRNA (SEQ ID NO: 202). Figure 12B shows the predicted structure of MG64-4 sgRNA (SEQ ID NO: 203). Figure 12C shows the predicted structure of MG64-6 sgRNA (SEQ ID NO: 201). Figure 12D shows the predicted structure of MG64-7 sgRNA (SEQ ID NO: 204). Figure 12E shows the predicted structure of MG108-1 sgRNA (SEQ ID NO: 206). The intensity of each base corresponds to the probability of base pairing for that base. 【0177】 Example 4 - Verification of transposon ends by gel shift Transposon ends were tested for TnsB binding via electrophoretic mobility shift assay (EMSA). In this case, potential LE or RE were synthesized as DNA fragments (100–500 bp) and end-labeled with FAM via PCR using FAM-labeled primers. TnsB proteins were synthesized in an in vitro transcription / translation system (e.g., PURExpress). After synthesis, 1 μL of TnsB protein was added to 50 nM labeled RE or LE in a 10 μL reaction in binding buffer (20 mM HEPES pH 7.5, 2.5 mM Tris pH 7.5, 10 mM NaCl, 0.0625 mM EDTA, 5 mM TCEP, 0.005% BSA, 1 μg / mL poly(dI-dC), and 5% glycerol). After incubation at 30°C for 40 minutes, 2 μL of 6X loading buffer (60 mM KCl, 10 mM Tris, pH 7.6, 50% glycerol) was added. The binding reaction was separated and visualized on a 5% TBE gel. A shift in LE or RE in the presence of TnsB was attributed to successful binding and indicated transposase activity. 
【0178】 Figure 15 shows an example of this experiment, in which the RE DNA sequence of MG64-2 (e.g., SEQ ID NO: 155) was end-labeled with FAM using the procedure described above and incubated with the corresponding MG64-2 TnsB-like component (e.g., SEQ ID NO: 23). The upshift of the labeled band in lane 3 indicates binding of the RE sequence by TnsB, suggesting that it contains an active RE transposition sequence. 【0179】 Example 5 - Integrase activity (predictive) in Escherichia coli Because E. coli lacks the ability to efficiently repair double-strand DNA breaks in its genome, transformation of E. coli with agents that can induce double-strand breaks in the E. coli genome leads to cell death. This phenomenon is utilized to test endonuclease or effector-assisted integrase activity in E. coli by recombinantly expressing either the endonuclease or the effector-assisted integrase along with guide RNA (determined, e.g., as in Example 3) in target strains having spacer / target and PAM sequences integrated into their genomic DNA. 【0180】 Subsequently, the engineered strain is transformed with plasmids containing a nuclease or effector with a single guide RNA, plasmids expressing integrase and accessory genes, and plasmids containing a temperature-sensitive origin of replication with selectable markers flanked by the left-end (LE) and right-end (RE) transposon motifs. Transformants induced for the expression of these genes are screened for marker introduction at genomic targets by selection at the restrictive temperature for plasmid replication, and marker integration into the genome is confirmed by PCR. 【0181】 Off-target integration is screened using an unbiased approach. In short, purified gDNA is fragmented by Tn5 transposase or shearing, and then the DNA of interest is PCR-amplified using ligated adapters and primers specific to the selectable markers. The amplicons are then prepared for NGS sequencing. 
Analysis of the resulting sequences involves trimming transposon sequences, mapping flanking sequences to the genome to determine insertion sites, and determining the off-target insertion rate. 【0182】 Example 6 - Colony PCR screening of transposase activity (predictive) To test nuclease or effector-assisted integrase activity in bacterial cells, strain MGB0032 is constructed from BL21(DE3) E. coli cells engineered to contain MG64_1-specific targets and corresponding PAM sequences. Subsequently, MGB0032 E. coli cells are transformed with pJL56 (a plasmid expressing a combination of MG64_1 effector and helper, with ampicillin resistance) and pTCM 64_1 sg, a chloramphenicol-resistant plasmid expressing, from the T7 promoter, a single guide RNA sequence directed to the engineered target of interest. 【0183】 Next, the MGB0032 culture containing both plasmids is grown to saturation, diluted at least 1:10 in growth medium containing the appropriate antibiotics, and incubated at 37°C until the OD is approximately 1. Cells from this growth stage are made electrocompetent and transformed with the streamlined 64_1 pDonor plasmid, which has a tetracycline resistance marker flanked by the left-end (LE) and right-end (RE) transposon motifs. The electroporated cells are recovered in LB medium for 2 hours with or without IPTG at a final concentration of 100 μM, then plated on LB-agar-ampicillin-chloramphenicol-tetracycline and incubated at 37°C for 4 days. Each resulting CFU is sampled using a sterile toothpick and mixed with water. To this solution, the Q5 high-fidelity PCR master mix (New England Biolabs) and primers LA155 (5'-GCTCTTCCGATCTNNNNNGATGAGCGCATTGTTAGATTTCAT-3') and oJL50 (5'-AAACCGACATCGCAGGCTTC-3') are added. These primers flank the predicted insertion junction. The predicted product size is 609 bp. The amplified PCR product is visualized on a 2% agarose gel. Sanger sequencing of the PCR product confirms the transposition event. 
【0184】 Example 7 - Intracellular Expression / In Vitro Assay (Predictive) To test the functionality of NLS constructs in a physiologically relevant environment, constructs cloned with active NLS-tagged CAST components are integrated into K562 cells using lentiviral transduction. In short, constructs cloned into lentiviral transduction plasmids are transfected into 293T cells together with envelope plasmids and packaging plasmids. After 72 hours of incubation, the virus-containing supernatant is collected from the culture medium. The virus-containing medium is then incubated with the K562 cell line with 8 μg / mL polybrene for 72 hours, and the transduced cells are selected in bulk for integration using 1 μg / mL puromycin for 4 days. The selected cell lines are collected at the end of the 4 days and lysed separately for the nuclear and cytoplasmic fractions. The resulting fractions are tested for transposition ability with complementary sets of components expressed in vitro. 【0185】 Ten million cells are centrifuged and washed once with 1x PBS pH 7.4. The wash supernatant is completely aspirated, and the cell pellet is flash-frozen at -80°C for 16 hours. After thawing on ice, the cell pellet size is measured by mass, and proteins from the cell fractions are extracted using appropriate volumes of cytoplasmic and nuclear extraction reagents (NE-PER). Briefly, the cytoplasmic extraction reagent is used in a 1:10 ratio of cell mass to extraction reagent. The cell suspension is mixed by vortexing and lysed with a nonionic detergent. Then, the cells are centrifuged at 16,000 × g at 4°C for 5 minutes. The cytoplasmic extraction supernatant is decanted and stored for in vitro testing. Next, the nuclear extraction reagent is added in a 1:2 ratio of the original cell mass to nuclear extraction reagent, and incubated on ice for 1 hour with intermittent vortexing. 
Subsequently, the nuclear suspension is centrifuged at 16,000 × g for 10 minutes at 4°C, and the supernatant nuclear extract is decanted and tested for in vitro transposition activity. Using 4 μL of each cytoplasmic and nuclear extract from each condition, the in vitro transposition reaction is performed with a complementary set of in vitro expressed protein, donor DNA, pTarget, and buffer. Evidence of transposition activity is assayed by PCR amplification of the donor-target junction. 【0186】 Example 8 - Activity in mammalian cells (predictive) To demonstrate targeting and cleavage activity in mammalian cells, nuclear localization sequences are fused to the C-terminus of the nuclease or effector proteins and of the integrase proteins, and the fusion proteins are purified. A single guide RNA targeting the genomic locus of interest is synthesized and incubated with the nuclease / effector protein to form a ribonucleoprotein complex. Cells are transfected with plasmids containing a selectable neomycin resistance marker (NeoR) or fluorescent markers flanked by the left-end (LE) and right-end (RE) motifs, allowed to recover for 4–6 hours, and then electroporated with nuclease RNP and integrase proteins. Plasmid integration into the genome is quantified by counting G418-resistant colonies or by fluorescence-activated cell sorting. Genomic DNA is extracted 72 hours after electroporation and used for NGS library preparation. Off-target frequencies are assayed by fragmenting the genome and preparing amplicons of transposon markers and adjacent DNA for NGS library preparation. To test the activity of each targeting system, at least 40 different target sites are selected. 【0187】 Example 9 - Activity of the targeted nuclease In situ expression and protein sequence analysis suggest that several RNA-guided effectors are active nucleases. 
They contain predicted endonuclease-associated domains (corresponding to RuvC and HNH_endonuclease domains) and predicted HNH and RuvC catalytic residues (see, for example, Figure 4A showing predicted catalytic residues for the MG36-5 effector). 【0188】 Candidate activity is tested with engineered single guide RNA sequences using the myTXTL system and in vitro transcribed RNA. Active proteins are identified as those that successfully cleave the library to yield a band of approximately 170 bp on agarose gel electrophoresis. 【0189】 Example 10 - Identification of transposons A transposon is predicted to be active when it contains one or more protein sequences with transposase and / or integrase function between its left and right ends. A typical Tn7 transposon generally contains the catalytic transposase TnsB, but may also contain TnsA, TnsC, TnsD, TnsE, TniQ, and / or other transposases or integrases. The transposon ends contain the predicted integrase binding sites, which include direct and / or inverted repeats 15 bp–150 bp long, flanking the integrase protein and other "cargo" genes. Protein sequence analysis shows that the integrases contain integrase domains, transposase domains, and / or transposase catalytic residues, suggesting that they are active (e.g., Figure 4A showing a diagram of the locus of an exemplary MG36-5 effector-based CAST system containing the TnsB element, and Figure 5A showing a diagram of the locus of an exemplary MG39-1 effector-based CAST system containing the TnsA, TnsB, TnsC, and TniQ elements). 【0190】 Example 11 - Identification of CRISPR-associated transposons Predicted CRISPR-associated transposons (CASTs) contain a CRISPR effector that targets DNA and / or RNA, and a protein with predicted integrase function, located near a CRISPR array. 
In some systems, the effector is predicted to have nuclease activity based on the presence of an endonuclease-associated catalytic domain and / or catalytic residues (e.g., Figure 4A shows predicted catalytic residues of the MG36-5 effector in the context of a CAST system locus containing the TnsB element). The integrase is predicted to be associated with an active nuclease when the CRISPR locus (CRISPR nuclease and array) and the predicted integrase protein are located between the left and right ends of the transposon (e.g., Figures 4B and 4C). In this case, the effector is predicted to direct DNA integration to a specific genomic location based on the guide RNA. 【0191】 In some systems, the effector is predicted to be homologous to known CRISPR effector proteins but inactive based on the absence of the endonuclease domain and / or catalytic residues (Figure 5A). The integrase is predicted to be associated with the effector when the CRISPR locus (inactive CRISPR nuclease and array) and the integrase protein are located within the left and right ends of the predicted transposon (Figures 5A and 5B). 【0192】 Example 12 - Discovery of CAST CRISPR-associated transposons (CASTs) are systems that include transposons that have evolved to interact with the CRISPR system to facilitate targeted integration of DNA cargo. 【0193】 A CAST is a genomic sequence that encodes one or more protein sequences involved in DNA transposition within the left and right ends of a transposon signature. Typical Tn7 transposons generally contain the catalytic transposase TnsB, but may also contain the catalytic transposase TnsA, the loader protein TnsC or TniB, and the target recognition proteins TnsD, TnsE, TniQ, and / or other transposon-related components. The transposon ends contain the predicted transposase binding sites, which include direct and / or inverted repeats of 15 bp to 150 bp in length, flanking the transposon machinery and other "cargo" genes. 
【0194】 In addition, a CAST further encodes CRISPR nucleases or effectors that target DNA and / or RNA in the vicinity of the CRISPR array. In some systems, the effectors are predicted to be active nucleases based on the presence of an endonuclease-associated catalytic domain and / or catalytic residues. In some systems, the effectors are predicted to be inactive based on the absence of an endonuclease domain and / or catalytic residues, although they exhibit sequence similarity to known CRISPR effector proteins. Transposons are predicted to be associated with effectors if the CRISPR locus and transposon-associated proteins are located within the left and right ends of the predicted transposon. In this case, the effectors are predicted to direct DNA integration to a specific genomic location based on the guide RNA. 【0195】 Example 13a - Cas12k CAST The Cas12k CAST system encodes nuclease-deficient CRISPR Cas12k effectors, CRISPR arrays, tracrRNA, and Tn7-like transposition proteins (see, for example, Figure 8, which shows a locus diagram of the MG108-1 CAST system, including Cas12k). Cas12k effectors are phylogenetically diverse, and several features confirming their association with CAST have been identified (see, for example, Figure 9, which shows how the MG64-1, MG64-2, MG64-3, MG64-5, MG64-6, MG64-7, MG64-13, MG64-54, MG64-56, MG108-1, and MG108-2 effectors are part of this group). One such distinctive feature was the transposon end, identified in the context of the MG64-3 CRISPR locus. The left end of the transposon was identified downstream of the MG64-3 CRISPR locus, as indicated by the terminal inverted repeat and self-matching spacer sequence (Figure 11A). Another such feature was the presence of a Cas12k CAST CRISPR repeat (crRNA) containing the conserved motif 5'-GNNGGNNTGAAAG-3' (see, e.g., MG64-2, MG64-4, MG64-5, MG64-6, MG64-7, and MG108-1, as well as Figure 11B). 
Short repeat-anti-repeats (RARs) within the crRNA motif aligned with different regions of the tracrRNA, and the RAR motif appeared to define the start and end of the tracrRNA. Figure 13C shows the presence of three RAR motifs in the families, e.g., MG64-2, MG64-4, MG64-5, MG64-6, MG64-7, and MG108-1. 【0196】 Example 13b - Class I, Type I-F CASTs Several CASTs encode nuclease-deficient CRISPR type I-F cascade effector proteins, CRISPR arrays, and Tn7-like transposition proteins (see, e.g., Figure 10A, showing an organizational diagram of the loci of the MG110-1 effector-based type I-F CAST system). The type I-F cascade CASTs are predicted to function with a single guide RNA encoded by a crRNA, which contains the conserved motif 5'-CTGCCGNNTAGGNAGC-3' thought to be involved in the formation of a stem-loop structure (see, e.g., Figures 10B-C, showing an alignment of this feature in the family crRNAs SEQ ID NOs. 207 and 208 of MG110-1 and MG110-2). Partly based on having these same features, the MG110-2 effector-containing family was also identified as a type I-F CAST system. 【0197】 Example 14 - Transposon End Prediction The transposon ends were inferred from intergenic regions adjacent to the effector and transposon machinery. For example, for Cas12k CAST, the intergenic regions directly upstream of TnsB and directly downstream of the CRISPR locus were predicted to contain the left and right ends (LE and RE) of the Tn7 transposon (see, for example, Figure 11A, which shows the analysis of LE and RE in the context of the MG64-3 family CAST locus diagram). 【0198】 Direct and inverted repeats (DR / IR) of approximately 12 bp, with up to two mismatches, were predicted on the contig. In addition, short (approximately 10-20 bp) DR / IRs adjacent to the CAST transposon were discovered using a dot plot algorithm. Matching DR / IRs located in intergenic regions adjacent to the CAST effector and the transposon genes were predicted to encode the transposase binding sites. 
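The repeat search described above (direct and inverted repeats of approximately 12 bp with up to two mismatches) can be sketched with a brute-force window comparison. This is an illustrative assumption about how such a search might be written; the source does not name the tool or algorithm used for this step, and the function names are hypothetical. The sketch assumes an unambiguous A/C/G/T contig and reports only non-overlapping window pairs.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string (A/C/G/T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_repeats(seq: str, k: int = 12, max_mismatches: int = 2):
    """Return (i, j, kind) for window pairs that form direct ('DR') or
    inverted ('IR') repeats with at most max_mismatches mismatches.

    Brute force, O(n^2 * k): adequate for intergenic regions of a few
    hundred bp, not for whole genomes.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for i in range(n - k + 1):
        first = seq[i:i + k]
        first_rc = reverse_complement(first)
        for j in range(i + k, n - k + 1):  # second window never overlaps the first
            second = seq[j:j + k]
            if hamming(first, second) <= max_mismatches:
                hits.append((i, j, "DR"))
            if hamming(first_rc, second) <= max_mismatches:
                hits.append((i, j, "IR"))
    return hits
```

Calling `find_repeats(region)` on a hypothetical intergenic `region` string would list candidate repeat pairs whose positions could then be compared between the putative LE and RE.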
The transposon end boundaries were defined by aligning the LE and RE extracted from the intergenic regions encoding the putative transposase binding sites. The LE and RE of the putative transposon were identified as a) regions located within 400 bp upstream and downstream of the first and last predicted transposon-coding genes, b) regions sharing multiple short inverted repeats, and c) regions sharing more than 65% nucleotide identity. By repeating this process, the estimated LE / RE sequences were identified for MG36-5 (SEQ ID NOs. 17-18), MG39-1 (SEQ ID NOs. 20-21), MG64-2 (SEQ ID NOs. 125-126), MG64-4 (SEQ ID NOs. 127-128), MG64-6 (SEQ ID NOs. 123-124), MG64-7 (SEQ ID NOs. 129-130), MG64-13 (SEQ ID NOs. 131-132), MG64-54 (SEQ ID NO: 133), MG108-1 (SEQ ID NOs. 134-135), MG110-1 (SEQ ID NOs. 136-137), and MG110-2 (SEQ ID NOs. 138-139). 【0199】 Example 15 - Single guide design for Class II, Type V CAST systems Analysis of the intergenic regions surrounding the Cas effector and CRISPR array of the MG64 subfamily identified potential anti-repeat sequences and a conserved "CYCC(N6)GGRG" stem-loop structure adjacent to the anti-repeat corresponding to the tracrRNA sequence (Figure 11B). TracrRNA and crRNA repeats were folded and trimmed, and a GAAA tetraloop sequence was added to maintain the stem-loop region of the crRNA-tracrRNA complementary sequence to generate sgRNA. A summary of these sequences is shown in Table 1 below. 【0200】 [Table 1-1] 【0201】 [Table 1-2] 【0202】 Example 16 - In vitro integration activity using the targeted nuclease In situ expression and protein sequence analysis suggested that several RNA-guided effectors were active nucleases. They contained predicted endonuclease-associated domains (matching RuvC and HNH_endonuclease domains) and / or predicted HNH and RuvC catalytic residues. Candidate activity was tested with engineered single guide RNA sequences using the myTXTL system and in vitro transcribed RNA. 
Active proteins were identified as those that cleaved the library to yield a band of approximately 170 bp on agarose gel electrophoresis. 【0203】 Example 17 - Programmable DNA Integration CAST activity was tested by combining five components in a single reaction: (1) the Cas effector protein expressed by myTXTL or PURExpress, (2) a target DNA fragment or plasmid containing the target sequence and the PAM corresponding to the Cas enzyme, (3) a donor DNA fragment or plasmid containing a marker or DNA fragment flanked by the predicted LE and RE of the transposase system, (4) any combination of additional transposase proteins predicted to be part of the array, expressed using myTXTL or PURExpress, and (5) an engineered in vitro transcribed single guide RNA sequence. Active systems that successfully transposed the donor fragment were assayed by PCR amplification of the donor-target junction. 【0204】 Figure 13 shows example data demonstrating the activity of the MG64-6 system, which includes the MG64-6 effector, TnsB, TnsC, and TniQ proteins (SEQ ID NOs: 30-33), using predicted LE / RE donor sequences (SEQ ID NOs: 123-124) and an in silico-designed sgRNA (SEQ ID NO: 201). After the transposition reaction combining all MG64-6 components, PCR amplification of the junction showed formation of the expected donor-target product and demonstrated that the transposition reaction is sgRNA-dependent (Figure 13A). The presence of amplification bands in PCR reactions #3 and #4 (spanning the LE / RE junction when LE / RE is inserted distal to PAM, respectively) indicates that both donor orientations relative to the target occur: the orientation in which the LE is close to the PAM and the orientation in which the RE is close to the PAM. 
While both transposition orientations occurred, donor integration at the target was preferred when the LE was closer to the PAM, as indicated by the strong bands present in reactions #4 and #5 (spanning the LE junction when inserted distally to the PAM and the RE junction when inserted proximal to the PAM, respectively). 【0205】 Sanger sequencing was performed on the preferred orientation product. Among the integrations occurring when the LE was close to the PAM, there was a clear degradation of the sequencing chromatogram signal from either the forward or reverse direction across the target / donor junction (Figure 13C). This indicated that, among the products oriented with the LE close to the PAM, integration occurred over a range of nucleotide positions, and the major LE-proximal product corresponded to integration 61 bp from the PAM (Figure 14). Sequencing from the donor across the donor-target junction defined the essential outer boundaries of the LE and RE sequences. Further investigation of the LE and RE domains will determine the internal limits of the LE and RE sequences, and therefore the minimum LE / RE required for transposition. Sequencing of the RE in the product where the LE was close to the PAM revealed a 3 bp duplication downstream of the donor RE. This is partly due to a Tn7 transposase integration event that cleaves and ligates the donor fragment at a staggered cut site. The 3 bp duplication is smaller than the 5 bp duplication expected from other Tn7 transposases. 【0206】 Sanger sequencing of PCR amplification products from an 8N library of the target plasmid revealed the PAM preference of the MG64-6 effector as nGTn / nGTt at the 5' end of the spacer. NGS analysis of the PAM library target confirmed the preference for the nGTn motif at the 5' end (Figure 13B). 
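To make the nGTn preference concrete, the following sketch scans a sequence for candidate target sites in which an IUPAC-coded PAM sits immediately 5' of the protospacer, as reported above for MG64-6. The function names, the 20 nt protospacer length, and the top-strand-only scan are illustrative assumptions and are not taken from the disclosure.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'nGTn' into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_pam_sites(seq, pam="NGTN", spacer_len=20):
    """Return (pam_start, pam, protospacer) for each 5' PAM on the top strand.

    spacer_len is an assumed protospacer length for illustration only.
    Sites without a full-length protospacer 3' of the PAM are skipped.
    """
    seq = seq.upper()
    # A lookahead with a capture group reports overlapping PAM matches.
    pattern = re.compile(f"(?=({motif_to_regex(pam)}))")
    sites = []
    for match in pattern.finditer(seq):
        proto_start = match.start() + len(pam)
        protospacer = seq[proto_start:proto_start + spacer_len]
        if len(protospacer) == spacer_len:
            sites.append((match.start(), match.group(1), protospacer))
    return sites
```

Passing `pam="NGTT"` restricts the scan to the narrower nGTt variant; scanning the bottom strand would require running the same function on the reverse complement.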
【0207】 Example 18 - Determination of the Integration Window The PCR-amplified junctions from the PAM library in Example 17 were indexed for NGS and sequenced on a MiSeq with a V2 300 kit. The reads were mapped and quantified using CRISPResso, taking as the amplicon the predicted transposition product with an integration distance of 60 bp from the PAM (guideseq = 20 bp at the 3' end of the LE or RE, window center = 0, window size = 20). The indel histogram was normalized to all detected indel reads, and the frequency was plotted relative to the 60 bp reference sequence (Figure 14). 【0208】 Both PCR reaction 5 (LE proximal to PAM, Figure 13a) and PCR 4 (RE distal to PAM, Figure 13b) were plotted for MG64-6 against the sequence and the distance from the PAM (Figure 14). Analysis of the integration window showed that 95% of the integrations that occurred at the spacer PAM site were within a 10 bp window, 58–68 nucleotides away from the PAM. The difference in integration distance between the distal and proximal frequencies reflected duplication of the integration site: a 3–5 base pair duplication resulting from staggered transposase nuclease activity during integration. 【0209】 Example 19 - Verification of transposon ends by gel shift To validate the activity of TnsB against the predicted transposon end sequences, the RE of MG64-6 was amplified using a FAM-labeled oligonucleotide. The MG64-6 TnsB protein was expressed using a cell-free transcription / translation system and incubated with the FAM-labeled RE product. After incubation for 30 minutes, binding was observed on a native 5% TBE gel (Figure 15). Multiple bands of fluorescent product in the co-incubated lane (Figure 15, lane 3) indicated at least three TnsB binding sites. 【0210】 Example 20 - Colony PCR screening of transposase activity (predictive) Transposition activity is assayed via colony PCR screening. After transformation with the pDonor plasmid, E. coli are plated onto LB-agar containing ampicillin, chloramphenicol, and tetracycline. Selected CFUs are added to a solution containing primers flanking the insertion junction and PCR reagents. 
【0211】 Example 22 - LE-RE minimization (predictive) Sequencing of the target-transposition junction helps identify the terminal inverted repeats by identifying the outermost sequence from the donor plasmid that is incorporated into the target. A 14 bp repeat analysis allowing 10% variation identifies short repeats contained within the ends, and extraneous sequence is removed by identifying the minimal truncated sequence that preserves these repeats. Prediction and cloning are repeated over several iterations, and each iteration is tested with in vitro transposition. Transposition is predicted to remain active down to a 68 bp LE region combined with a 96 bp RE region. 【0212】 Example 23 - Effect of overhangs on transposition (predictive) To test whether extra sequence outside the TnsB binding motif is necessary for transposition, oligos covering the TGTACA or TGTCGA motifs of both the LE and RE are designed and synthesized with 0, 1, 2, 3, 5, and 10 bp of extra sequence. These synthesized oligos are used to generate donor PCR fragments with overhangs, which are tested for their ability to transpose to the target site. 【0213】 Example 24 - CAST NLS Design (Predictive) Genome editing in eukaryotes for therapeutic purposes relies on the translocation of editing enzymes into the nucleus. Short polypeptide stretches within larger proteins, known as nuclear localization signals (NLS), signal cellular components to translocate the protein across the nuclear membrane. The placement of NLS tags needs to be optimized because they must provide translocation functionality while maintaining the function of the protein to which they are fused. 
To test the functional orientation of the NLS for each component of the CAST complex, constructs are synthesized in which a nucleoplasmin NLS is fused to the N-terminus and an SV40 NLS to the C-terminus of each MG CAST component. Proteins from these constructs are expressed in cell-free in vitro transcription / translation reactions and tested for in vitro transposition activity with a complementary set of untagged components. NLS-tagged constructs are evaluated for retention of activity by donor-target junction PCR using PCR 4 (to assess RE distal transposition) and congeneral transposition events, and PCR 5 (to assess LE proximal transposition). 【0214】 Example 25 - Design and testing (predictive) of a Cas12k and TniQ protein fusion construct. To simplify / minimize the expression of protein components and facilitate their delivery to cells, fusion constructs between TniQ proteins and Cas12k effectors with various linkers, linker lengths, and domain boundaries are designed, synthesized, and tested. Both orientations of TniQ fused to Cas12k are designed and synthesized: the C-terminal fusion, Cas-TniQ, and the N-terminal fusion, TniQ-Cas. 【0215】 Two other linkers are also employed to join the effector gene and the TniQ gene. The self-cleaving peptide sequence P2A is active in the Cas-NLS-P2A-NLS-TniQ construct, and the MCV Internal Ribosome Entry Sequence (IRES) mRNA-based linker enables independent translation of the two components in the cell. 【0216】 Example 26 - Intracellular expression linked to in vitro transposition study (predictive) To test the functionality of NLS constructs in a physiologically relevant environment, constructs cloned with active NLS-tagged CAST components are introduced into K562 cells by lentiviral transduction. Briefly, constructs cloned into lentiviral transduction plasmids are transfected into 293T cells together with envelope plasmids and packaging plasmids. 
After 72 hours of incubation, the virus-containing supernatant is collected from the culture medium. The virus-containing medium is then incubated with the K562 cell line with 8 μg / mL polybrene for 72 hours, and the transduced cells are selected for bulk integration using 1 μg / mL puromycin for 4 days. The selected cell lines are collected at the end of the 4 days and lysed separately into nuclear and cytoplasmic fractions. The resulting fractions are tested for transposition ability with complementary sets of components expressed in vitro. 【0217】 Both NLS-TnsB and TnsB-NLS were tested by cell fractionation and in vitro transposition, with transposition detected in both the cytoplasmic and nuclear fractions. 【0218】 Intracellular Cas12k fusions are similarly fractionated and tested for transposition. Cas-NLS-P2A-NLS-TniQ constructs are introduced into cells, fractionated, and tested in vitro for intracellular activity. Cas-NLS-P2A-NLS-TniQ can transpose in the cytoplasm when a single guide is added to the reaction. Cas-NLS-P2A-NLS-TniQ constructs in the nuclear fraction can be complemented by supplementing with holoCas protein (+sgRNA) or additional TniQ with sgRNA. 【0219】 The systems of this disclosure can be used for a variety of applications, such as nucleic acid editing (e.g., gene editing) or binding to nucleic acid molecules (e.g., sequence-specific binding). 
Such systems can be used, for example, to correct (e.g., remove or replace) genetically inherited mutations that may cause disease in a subject; to inactivate genes to confirm their function within cells; as a diagnostic tool to detect disease-causing genetic elements (e.g., via cleavage of reverse-transcribed viral RNA or amplified DNA sequences encoding disease-causing mutations); as an inactivated enzyme combined with a probe to target and detect specific nucleotide sequences (e.g., sequences encoding antibiotic resistance in bacteria); to inactivate viruses or prevent them from infecting host cells by targeting viral genomes; to engineer organisms to produce valuable small molecules, macromolecules, or secondary metabolites; to add genes or alter metabolic pathways; to establish gene drive elements for evolutionary selection; and / or as a biosensor to detect cell perturbation caused by foreign small molecules and nucleotides. 【0220】 While preferred embodiments of the present invention have been shown and described herein, it will be apparent to those skilled in the art that such embodiments are provided only as examples. The present invention is not intended to be limited by any particular example provided herein. Although the present invention is described with reference to the foregoing specification, the descriptions and examples of embodiments herein are not intended to be construed in a limiting sense. Those skilled in the art will be able to conceive of many modifications, variations, and substitutions without departing from the present invention. Furthermore, it will be understood that all aspects of the present invention are not limited to the particular descriptions, configurations, or relative proportions described herein, which depend on various conditions and variables. It should be understood that various alternatives to the embodiments of the present invention described herein may be used in carrying out the present invention. 
Therefore, it should be considered that the present invention also encompasses such alternatives, modifications, variations, or equivalents. The following claims define the scope of the present invention, and it is intended that methods and structures within the scope of these claims and their equivalents are encompassed thereby.

Claims

[Claim 1] An engineered nuclease system comprising: an endonuclease comprising a RuvC domain, wherein the endonuclease is a class II, type V endonuclease comprising a sequence having at least 90% sequence identity with SEQ ID NO: 30; and an engineered guide ribonucleic acid (RNA), wherein the engineered guide RNA is configured to form a complex with the endonuclease and includes a spacer sequence configured to hybridize to a target nucleic acid sequence, wherein the engineered guide RNA contains a sequence having at least 90% sequence identity with SEQ ID NO: 257.

[Claim 2] The engineered nuclease system according to claim 1, wherein the endonuclease includes SEQ ID NO: 30.

[Claim 3] The engineered nuclease system according to any one of claims 1 to 2, wherein the engineered guide RNA comprises a sequence having at least 46 to 80 consecutive nucleotides having at least 90% sequence identity with respect to SEQ ID NO: 92.

[Claim 4] The engineered nuclease system according to claim 1, wherein the engineered guide RNA comprises a sequence having at least 90% sequence identity with respect to SEQ ID NO: 263.

[Claim 5] The engineered nuclease system according to any one of claims 1 to 2, wherein the engineered guide RNA includes SEQ ID NO: 257.

[Claim 6] The engineered nuclease system according to any one of claims 1 to 5, wherein the engineered guide RNA comprises a sequence having at least 90% sequence identity with respect to the non-degenerate nucleotides of SEQ ID NO: 111.

[Claim 7] The engineered nuclease system according to any one of claims 1 to 6, wherein the endonuclease is configured to bind to a protospacer adjacent motif (PAM) sequence, and the PAM sequence comprises 5'-nGTn-3' or 5'-nGTt-3'.

[Claim 8] A system for rearranging a cargo nucleotide sequence to a target nucleic acid site, the system comprising: a double-stranded nucleic acid comprising the cargo nucleotide sequence, wherein the cargo nucleotide sequence is configured to interact with a Tn7 type transposase complex; a Cas effector complex comprising a class II, type V Cas effector and an engineered guide polynucleotide configured to hybridize to the target nucleic acid site; and the Tn7 type transposase complex, which is configured to bind to the Cas effector complex and comprises a TnsB component, a TnsC component, and a TniQ component, wherein (a) the class II, type V Cas effector comprises a polypeptide having at least 90% sequence identity with respect to SEQ ID NO: 30, (b) the TnsB component, the TnsC component, or the TniQ component includes a sequence having at least 90% sequence identity with any one of SEQ ID NOs: 31 to 33, and (c) the engineered guide RNA contains a sequence that has at least 90% sequence identity with SEQ ID NO: 257.

[Claim 9] The system according to claim 8, wherein the class II, type V Cas effector comprises a polypeptide including SEQ ID NO: 30.

[Claim 10] The system according to claim 8, wherein the TnsB component includes a sequence having at least 90% sequence identity with respect to SEQ ID NO: 31.

[Claim 11] The system according to claim 8, wherein the TnsC component includes a sequence having at least 90% sequence identity with respect to SEQ ID NO: 32.

[Claim 12] The system according to claim 8, wherein the TniQ component includes a sequence having at least 90% sequence identity with respect to SEQ ID NO: 33.

[Claim 13] The system according to claim 8, wherein the TnsB component includes SEQ ID NO: 31, the TnsC component includes SEQ ID NO: 32, and the TniQ component includes SEQ ID NO: 33.

[Claim 14] The system according to any one of claims 8 to 13, wherein the engineered guide polynucleotide comprises a sequence containing at least 46 to 80 consecutive nucleotides having at least 90% sequence identity with respect to SEQ ID NO: 92.

[Claim 15] The system according to any one of claims 8 to 13, wherein the engineered guide polynucleotide comprises a sequence having at least 90% sequence identity with respect to SEQ ID NO: 263.

[Claim 16] The system according to any one of claims 8 to 13, wherein the engineered guide polynucleotide comprises a sequence having at least 90% sequence identity with respect to SEQ ID NO: 257.

[Claim 17] The system according to any one of claims 8 to 13, wherein the engineered guide polynucleotide includes SEQ ID NO: 257.

[Claim 18] The system according to any one of claims 8 to 17, wherein the engineered guide polynucleotide comprises a sequence having at least 90% sequence identity with respect to the non-degenerate nucleotides of SEQ ID NO: 111.

[Claim 19] The system according to any one of claims 8 to 18, wherein the Cas effector complex is configured to bind to a protospacer adjacent motif (PAM) sequence, and the PAM sequence comprises 5'-nGTn-3' or 5'-nGTt-3'.

[Claim 20] The system according to any one of claims 8 to 19, wherein the Tn7 type transposase complex is non-covalently bonded to the Cas effector complex.

[Claim 21] The system according to any one of claims 8 to 19, wherein the Tn7 type transposase complex is covalently bonded to the Cas effector complex.

[Claim 22] The system according to any one of claims 8 to 21, wherein the cargo nucleotide sequence is flanked by a left transposase recognition sequence and a right transposase recognition sequence.

[Claim 23] The system according to claim 22, wherein the left transposase recognition sequence includes a sequence having at least 90% sequence identity with respect to SEQ ID NO: 123.

[Claim 24] The system according to claim 22 or 23, wherein the right transposase recognition sequence includes a sequence having at least 90% sequence identity with respect to SEQ ID NO: 124.

[Claim 25] A method for modifying a target nucleotide sequence, comprising the step of contacting the target nucleotide sequence in vitro with a system according to any one of claims 8 to 24.

[Claim 26] The method according to claim 25, wherein the target nucleotide sequence is located within a cell.

[Claim 27] One or more nucleic acids encoding the system according to any one of claims 8 to 24.