Method for constructing target DNA sequences and cloning vectors

A method using protelomerase and IIS-type restriction endonuclease/meganuclease enzymatic reactions in host cells addresses purity and scalability issues in target DNA sequence production, ensuring high-fidelity and GMP-compliant large-scale DNA preparation for gene therapy.

JP7880341B2 · Active · Publication Date: 2026-06-25 · NANJING GENSCRIPT BIOTECH CO LTD

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Patents
Current Assignee / Owner
NANJING GENSCRIPT BIOTECH CO LTD
Filing Date
2021-08-06
Publication Date
2026-06-25

Smart Images

  • Figure 0007880341000007
  • Figure 0007880341000008
  • Figure 0007880341000009

Abstract

The present application relates to a method for producing a target DNA sequence and to a cloning vector. The method comprises a step of amplifying a DNA construct in a host cell and extracting it, followed by a three-step isothermal enzymatic reaction catalyzed in turn by a protelomerase, a type IIS restriction endonuclease and / or meganuclease, and a DNA exonuclease. The construct replicates autonomously and contains (a) one or more type IIS restriction endonuclease and / or meganuclease recognition sequences; (b) a target DNA sequence; and (c) protelomerase recognition sequences flanking both ends of the target DNA sequence.

Description

Technical Field

[0001] This application relates to the field of biotechnology, particularly to methods for creating target DNA sequences and cloning vectors.

Background Art

[0002] Gene therapy and gene-editing-based cell therapy have matured over more than 30 years of development. In particular, accurate gene editing with nucleases (zinc finger nucleases, TAL effector nucleases, and CRISPR nucleases), combined with virus-independent delivery, has made safer and more efficient virus-free gene therapy possible. Initially, plasmids were used to deliver and express target protein genes, for example to compensate for the missing gene in genetic deficiency diseases. Plasmids or linearized plasmid fragments are also used as editing templates when large gene fragments are knocked in by homologous recombination. In clinical practice, however, the use of plasmids in gene editing has shown many drawbacks and potential risks: 1) because plasmids carry redundant vector backbone sequences that are not part of the target fragment, the double-stranded DNA causes increased cytotoxicity during in vivo delivery or delivery into immune cells; 2) the redundant backbone sequences are mainly the replication origin and antibiotic resistance genes; when used in gene therapy, these genes can contaminate the normal human microbiota, and, in addition, sequences of bacterial origin generally contain CpG islands and tend to be methylated during plasmid replication, resulting in strong immunogenicity; and 3) in homologous recombination repair, a blunt-ended double-stranded DNA repair template prepared by PCR amplification may be incorporated by non-homologous end joining, yielding an edited sequence that contains duplicated homology arm regions and deviates from the original design.

[0003] To address the above problems, attempts have been made to isolate target fragments from plasmids by double enzyme digestion. However, this approach is difficult to apply to large-scale fragment isolation and purification; only small-scale purification by agarose gel electrophoresis is possible. Furthermore, the approach relies on a DNA dye to visualize the DNA fragments, and it is difficult to verify whether the intercalated dye has been completely removed. In 2016, Slavcev et al. reported that double-stranded DNA of target fragments could be produced by fermentation using bacteria-based in vivo recombination. However, because recombination was incomplete, the product purified after cell lysis was contaminated with DNA impurities such as the bacterial genome and the original plasmid, and the high-purity final product still had to be separated by agarose gel electrophoresis. In other words, this method also suffers from the problems mentioned above: it is difficult to scale up, and it uses toxic DNA dyes in industrial production.

[0004] Another approach to obtaining target fragments directly is to prepare large quantities of double-stranded DNA by PCR amplification. However, the DNA polymerases used in PCR have lower sequence fidelity than bacterial DNA replication systems, and purchasing high-fidelity DNA polymerase in the quantities needed for large-scale preparation is very expensive. When large fragments are amplified, PCR yields decrease and contamination with non-full-length fragments can occur. The critical problem is that producing milligram quantities of pure double-stranded DNA requires large-scale PCR in which hundreds or even thousands of reaction products must be pooled, making it difficult to comply with GMP standards. To solve this problem, Touchlight Genetics in the UK developed an in vitro isothermal exponential amplification method for double-stranded DNA based on rolling circle amplification (RCA). However, when the product is used as a template for gene knock-in, in vitro replication methods carry a risk of increased DNA sequence mutations because they lack the error-correction mechanisms of bacterial DNA replication. In particular, when the target sequence contains a very high proportion of GC sequences, hairpin structures, or repeat sequences, in vitro replication based on DNA polymerase generally cannot guarantee that deletions will not occur within such high-difficulty regions. Furthermore, the overall in vitro replication process is exponential and random, so when the target sequence monomers are cleaved and released from the concatemer after the reaction is complete, the product is mixed with incomplete fragments and contains impurities that are difficult to remove.

Summary of the Invention

Problems to Be Solved by the Invention

[0005] A simple and efficient method for preparing target DNA sequences is still needed.

Means for Solving the Problems

[0006] This application provides a method for preparing target DNA sequences and cloning vectors. The method and cloning vectors place no special limitations on the target DNA sequence to be prepared; they are therefore general-purpose and suitable for a wide variety of DNA sequences. Furthermore, the method and cloning vectors can be used to prepare high-purity target DNA sequences efficiently and on a large scale. The method uses a DNA construct containing a protelomerase recognition sequence and an IIS-type restriction endonuclease and / or meganuclease recognition sequence. The DNA construct is prepared on a large scale through an intracellular vector amplification process (e.g., plasmid transformation and extraction). High-purity target DNA fragments are then obtained through a three-step isothermal enzymatic reaction using, in order, a protelomerase, an IIS-type restriction endonuclease and / or meganuclease, and a DNA exonuclease. The final product can be concentrated by alcohol (e.g., ethanol) precipitation and is easily prepared on a large scale. The overall preparation and purification process does not involve DNA dyes. Because the final product is derived from in vivo replication in host cells, its sequence is highly accurate, and the sequence errors, deletions, or mutations that high-difficulty regions within the target sequence would otherwise cause are avoided.

Cloning vector

[0007] This application provides a general cloning vector configured to produce a target DNA sequence or to construct the DNA constructs described below. The general cloning vector is an autonomously replicating vector comprising (a) one or more IIS-type restriction endonuclease and / or meganuclease recognition sequences, and (b) multiple cloning sites.

[0008] In some implementations, the cloning vector is selected from plasmids, cosmids, phages, or viruses (e.g., retroviruses, adeno-associated viruses, lentiviruses, rhabdoviruses, and adenoviruses). The cloning vector may include an origin of replication. The cloning vector may include one or more restriction endonuclease recognition sites and / or multiple cloning sites configured for inserting exogenous DNA sequences (e.g., target DNA sequences flanked by protelomerase recognition sequences), as well as selection marker genes (e.g., antibiotic resistance genes and ccdB genes) configured to identify and select cells transformed by the cloning vector.

[0009] In some implementations, the cloning vector includes two or more IIS-type restriction endonuclease and / or meganuclease recognition sequences, for example, 3, 4, 5, 6, 7, 8, 9, 10 or more. In some embodiments, the cloning vector includes three or more IIS-type restriction endonuclease and / or two or more meganuclease recognition sequences. In one specific embodiment, the cloning vector includes five IIS-type restriction endonuclease and / or two meganuclease recognition sequences.

[0010] The IIS-type restriction endonucleases used herein include, but are not limited to, AlwI, BccI, BsmAI, EarI, MlyI, PleI, BmrI, BsaI, BsmBI, FauI, HpyAV, MnlI, SapI, BbsI, BciVI, HphI, MboII, BfuAI, BspMI, SfaNI, HgaI, BbvI, EciI, FokI, BceAI, BsmFI, BtgZI, BpmI, BpuEI, BsgI, AclWI, Alw26I, Bst6I, BsrDI, BstMAI, Eam1104I, Ksp632I, PpsI, SchI, BfiI, Bso31I, BspTNI, BspQI, Eco31I, Esp3I, SmuI, BfuI, BpiI, BpuAI, BstV2I, AsuHPI, Acc36I, LweI, AarI, BseMII, TspDTI, TspGWI, BseXI, BstV1I, Eco57I, Eco57MI, GsuI, PsrI, and MmeI. In some implementations, the IIS-type restriction endonuclease is selected from one or a combination of the following: BbsI, BsaI, BsmBI, BspQI, BsrDI, EarI, HgaI, and SfaNI. Methods for preparing and using IIS-type restriction endonucleases are conventional, and numerous IIS-type restriction endonucleases are commercially available. The "IIS-type restriction endonuclease recognition sequence" is a sequence that can be recognized and cleaved by the corresponding IIS-type restriction endonuclease; it is determined by the specific IIS-type restriction endonuclease used and is known in the art. In one embodiment, the IIS-type restriction endonuclease comprises BspQI, which has the recognition sequence GCTCTTC.

[0011] As used herein, the term “meganucleases” refers to a subtype of endonuclease that recognizes rare double-stranded DNA target sequences longer than 12 bp. Meganucleases are generally dimeric enzymes, also known as homing endonucleases (HEs), and can be divided into five families according to their sequence and structural motifs: LAGLIDADG, GIY-YIG, HNH, His-Cys box, and PD-(D / E)XK. Structural data are available for at least one member of each family. In some implementations, meganucleases are selected from one or a combination of the following: I-SceI, I-CreI, I-DmoI, I-OnuI, I-LtrI, I-PanMI, I-GzeMII, I-HjeI, I-LtrWI, and I-SmaMI. Methods for preparing and using meganucleases are conventional, and numerous meganucleases are commercially available. The "meganuclease recognition sequence" is a sequence that can be recognized and cleaved by the corresponding meganuclease; it is determined by the specific meganuclease used and is known in the art. In one embodiment, the meganuclease comprises I-SceI, which has the recognition sequence TAGGGATAACAGGGTAAT.
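As an illustration of how the two recognition sequences named above might be located in practice, the following minimal Python sketch scans both strands of a sequence for BspQI and I-SceI sites. It is not part of the application: the function names (`reverse_complement`, `find_sites`) and the `circular` option are hypothetical, and the sketch only locates recognition sequences, not the offset cut positions of a type IIS enzyme.

```python
# Illustrative sketch only; names are hypothetical and not taken from the application.
BSPQI_SITE = "GCTCTTC"              # BspQI recognition sequence, as given in the text
ISCEI_SITE = "TAGGGATAACAGGGTAAT"   # I-SceI recognition sequence, as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str, circular: bool = False) -> list[tuple[int, str]]:
    """Return (0-based start on the top strand, strand) for each occurrence of `site`.

    Strand "+" means the site reads 5'->3' on the given strand; strand "-"
    means its reverse complement does (the site lies on the opposite strand).
    """
    seq = seq.upper()
    site = site.upper()
    # For a circular molecule, append the first len(site) - 1 bases so that
    # sites spanning the coordinate origin are also found.
    search = seq + seq[: len(site) - 1] if circular else seq
    hits = []
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        if strand == "-" and pattern == site:
            continue  # palindromic site: do not count it twice
        start = search.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = search.find(pattern, start + 1)
    return sorted(hits)
```

A check of this kind could be run on a candidate target DNA sequence before choosing enzymes, since the method assumes the chosen endonuclease cuts the backbone but not the target.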

[0012] In some implementations, the IIS-type restriction endonuclease recognition sequence in the cloning vector is selected from one or a combination of the following recognition sequences: BbsI, BsaI, BsmBI, BspQI, BsrDI, EarI, HgaI, and SfaNI.

[0013] In some implementations, the meganuclease recognition sequence in the cloning vector is selected from one or a combination of the following recognition sequences: I-SceI, I-CreI, I-DmoI, I-OnuI, I-LtrI, I-PanMI, I-GzeMII, I-HjeI, I-LtrWI, and I-SmaMI.

[0014] In some embodiments, the cloning vector includes a replication origin and a selection marker gene. The selection marker gene may be selected from antibiotic resistance genes such as kanamycin resistance genes, chloramphenicol resistance genes, and neomycin resistance genes, or from ccdB genes. In some implementations, the cloning vector includes a lactose operon sequence, a gene encoding β-galactosidase containing multiple cloning sites, and three or more BspQI recognition sequences and / or two or more I-SceI recognition sequences. In some embodiments, the lactose operon sequence includes a lac promoter and a lac activator gene.

[0015] The cloning vector may be a medium / high copy cloning vector. In some implementations, the cloning vector configured to construct the DNA construct of the present invention is derived from a pBR322 vector, a pUC vector, or a pET vector. In some preferred embodiments, the cloning vector configured to construct the DNA construct of the present invention is derived from a pUC vector such as a pUC57 vector. In some implementations, the pUC vector contains or is a sequence such as that shown in SEQ ID NO: 12.

[0016] As used herein, the term “derived from” means “reconstructed from,” i.e., the cloning vector is obtained by reconstructing the original vector from which it originates (such as a pBR322 vector, pUC vector, or pET vector). Reconstruction may include (i) inserting one or more IIS-type restriction endonuclease and / or meganuclease recognition sequences into the original vector; (ii) creating one or more IIS-type restriction endonuclease and / or meganuclease recognition sequences by mutating the original vector; or a combination of (i) and (ii). In some implementations, additional IIS-type restriction endonuclease recognition sequences may be introduced into the replication origin and antibiotic resistance gene sequences via synonymous codon mutations, to increase the number of IIS-type restriction endonuclease recognition sites on the DNA construct or cloning vector and to reduce the size of the backbone fragments produced by enzymatic digestion.
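The idea of creating a recognition site through synonymous codon mutations can be sketched as a simple search. The Python below is an illustrative assumption, not the applicant's procedure: it takes an in-frame coding sequence (such as a resistance gene), tries writing the site or its reverse complement at every position, and keeps the positions where the encoded protein is unchanged. The function names are hypothetical, the standard genetic code is assumed, and the sketch does not cover non-coding regions such as the replication origin.

```python
# Illustrative sketch only; names are hypothetical and the standard genetic code is assumed.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; trailing partial codons are ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def synonymous_site_candidates(cds: str, site: str = "GCTCTTC") -> list[tuple[int, str, int]]:
    """List (0-based position, pattern written, number of base changes).

    A candidate is a position where overwriting the coding sequence with the
    site (or its reverse complement) leaves the translated protein unchanged.
    """
    cds = cds.upper()
    protein = translate(cds)
    candidates = []
    for pattern in {site.upper(), reverse_complement(site)}:
        for i in range(len(cds) - len(pattern) + 1):
            window = cds[i:i + len(pattern)]
            if window == pattern:
                continue  # site already present, nothing to create
            mutated = cds[:i] + pattern + cds[i + len(pattern):]
            if translate(mutated) == protein:
                changes = sum(a != b for a, b in zip(window, pattern))
                candidates.append((i, pattern, changes))
    # Prefer candidates that need the fewest base changes.
    return sorted(candidates, key=lambda c: (c[2], c[0]))
```

For the toy sequence `ATGGCCCTGCGTTAA` (Met-Ala-Leu-Arg-stop), the search reports that changing `GCCCTGC` to `GCTCTTC` at position 3 creates a BspQI site with two base changes while keeping Ala-Leu-Arg intact.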

[0017] In some implementations, the cloning vector is constructed by reconstructing the pUC57 vector, in which (i) a BspQI recognition sequence is added after each of positions 1554 and 2539 of the pUC57 vector, and an I-SceI recognition sequence is added after each of positions 1501 and 2479; and (ii) the G at position 1397 of the pUC57 vector is mutated to C, and the AT at positions 2136 and 2137 is mutated to GC. In some implementations, the pUC vector contains or is a sequence as shown in SEQ ID NO: 12.
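The reconstruction described above combines insertions and point mutations that are all stated in the coordinates of the original vector. A minimal Python sketch of applying such edits consistently is shown below. It is a hedged illustration: `apply_edits` and the two dictionaries are hypothetical names, the orientation of the inserted sites (written as given, on the top strand) is an assumption, and the pUC57 sequence itself (SEQ ID NO: 12) is not reproduced here and would have to be supplied by the reader.

```python
# Illustrative sketch only; names and site orientation are assumptions.
BSPQI_SITE = "GCTCTTC"
ISCEI_SITE = "TAGGGATAACAGGGTAAT"

# 1-based positions in the original pUC57 vector, as listed in the text.
PUC57_INSERTIONS = {1501: ISCEI_SITE, 1554: BSPQI_SITE, 2479: ISCEI_SITE, 2539: BSPQI_SITE}
PUC57_SUBSTITUTIONS = {1397: ("G", "C"), 2136: ("AT", "GC")}


def apply_edits(seq: str, insertions: dict[int, str], substitutions: dict[int, tuple[str, str]]) -> str:
    """Apply substitutions and insertions given in original 1-based coordinates.

    insertions:    {position: sequence inserted AFTER that position}
    substitutions: {position: (expected original bases, replacement bases)}
    """
    bases = list(seq.upper())
    # Same-length substitutions do not shift coordinates, so apply them first
    # and verify that the original bases match what the edit expects.
    for pos, (old, new) in substitutions.items():
        if len(old) != len(new):
            raise ValueError(f"substitution at {pos} must preserve length")
        found = "".join(bases[pos - 1:pos - 1 + len(old)])
        if found != old:
            raise ValueError(f"expected {old!r} at position {pos}, found {found!r}")
        bases[pos - 1:pos - 1 + len(old)] = list(new)
    result = "".join(bases)
    # Insert from the highest position downward so that earlier (lower)
    # coordinates remain valid while the sequence grows.
    for pos in sorted(insertions, reverse=True):
        if not 0 <= pos <= len(seq):
            raise ValueError(f"insertion position {pos} is outside the sequence")
        result = result[:pos] + insertions[pos] + result[pos:]
    return result
```

With the original vector sequence loaded as a string, `apply_edits(puc57, PUC57_INSERTIONS, PUC57_SUBSTITUTIONS)` would apply the listed changes; the built-in check on the original bases helps catch an off-by-one or a mismatched reference sequence.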

[0018] In some implementations, the nucleotide sequence of the cloning vector contains, or is, a sequence like the one shown in SEQ ID NO: 1.

DNA constructs

[0019] The present invention further provides a DNA construct used in the method for producing target DNA sequences described below. The DNA construct replicates autonomously and comprises (a) one or more IIS-type restriction endonuclease and / or meganuclease recognition sequences; (b) a target DNA sequence; and (c) protelomerase recognition sequences flanking both ends of the target DNA sequence. The DNA construct may be constructed by a method comprising (i) providing a cloning vector comprising one or more IIS-type restriction endonuclease and / or meganuclease recognition sequences; and (ii) inserting into the cloning vector a target DNA sequence flanked at both ends by protelomerase recognition sequences. In some implementations, the DNA construct is prepared by inserting a target DNA sequence flanked at both ends by protelomerase recognition sequences into the multiple cloning sites of a cloning vector as described above.

[0020] As used herein, “DNA construct” refers to an artificially constructed DNA fragment to be introduced into a host cell or organism. The DNA constructs described herein replicate autonomously; that is, they may include sequences that support autonomous replication of the DNA construct within a prokaryotic or eukaryotic host cell, such as an origin of replication (ori).

[0021] In some implementations, the DNA construct includes two or more endonuclease recognition sequences, for example, 3, 4, 5, 6, 7, 8, 9, 10 or more.

[0022] In some implementations, the DNA construct further includes the origin of replication site and a selection marker gene.

[0023] In some implementations, the target DNA sequence is directly adjacent to the protelomerase recognition sequence at both ends; that is, there are no other sequences between the target DNA sequence and the protelomerase recognition sequence at those two ends.

[0024] In some implementations, additional sequences, such as endonuclease recognition sites and nicking enzyme recognition sites, are present between the target DNA sequence and the protelomerase recognition sequence at the two ends.

[0025] The two protelomerase recognition sequences at the two ends of the target DNA sequence may be arranged as direct repeats or as inverted repeats.

[0026] As used herein, the term "protelomerase" refers to an enzyme that can recognize and cleave a protelomerase recognition sequence in DNA containing that sequence and religate the cleaved ends to produce closed double-stranded DNA. Protelomerases are generally found in phages, for example, but not limited to, Escherichia coli (E. coli) phage N15 (i.e., protelomerase TelN), Klebsiella phage phiKO2, Yersinia phage PY54, Halomonas phage phiHAP-1, and Vibrio phage VP882, as well as protelomerases derived from Borrelia burgdorferi plasmid lpB31.16. In some embodiments, the protelomerase is selected from protelomerases derived from Escherichia coli (E. coli) phage N15, Klebsiella phage phiKO2, Yersinia phage PY54, Halomonas phage phiHAP-1, Vibrio phage VP882, and Borrelia burgdorferi plasmid lpB31.16, or their homologs or variants. In some embodiments, the protelomerase is the protelomerase (TelN) derived from Escherichia coli (E. coli) phage N15, or its homolog or variant. A homolog is generally a functional homolog of the protelomerase, and its amino acid sequence may have at least 40%, 50%, 60%, 70%, 80%, 90%, 95% or 98% identity with the native amino acid sequence of the protelomerase. A variant may include truncations, insertions, substitutions and / or deletions with respect to the native amino acid sequence of the protelomerase, for example, truncations, insertions, substitutions and / or deletions of one or more amino acids. Methods for preparing and using protelomerases are conventional, and many protelomerases are commercially available.

[0027] As used herein, the term "protelomerase recognition sequence" is a DNA sequence that can be recognized by a protelomerase; it is determined by the specific protelomerase used and is known in the art. In some embodiments, the protelomerase recognition sequence is selected from protelomerase recognition sequences derived from Escherichia coli (E. coli) phage N15, Klebsiella phage phiKO2, Yersinia phage PY54, Halomonas phage phiHAP-1, Vibrio phage VP882, and Borrelia burgdorferi lpB31.16. In some embodiments, the protelomerase recognition sequence is derived from Escherichia coli (E. coli) phage N15. In some embodiments, the protelomerase recognition sequence comprises or consists of SEQ ID NO: 2.

Method for generating a target DNA sequence

[0028] The present application provides a method for generating a target DNA sequence. The term "generate" can be used interchangeably with terms such as amplification, cloning, and replication. The method includes a step of amplifying the DNA construct of the present application in a host cell and extracting it; a three-step isothermal enzymatic reaction catalyzed in turn by a protelomerase, an IIS-type restriction endonuclease and / or meganuclease, and a DNA exonuclease (i.e., the first cleavage reaction, the second cleavage reaction, and the digestion reaction); and an optional recovery step.

[0029] In some embodiments, the method for generating a target DNA sequence includes: culturing a host cell into which a DNA construct has been introduced, where the DNA construct replicates autonomously and contains (a) one or more IIS-type restriction endonuclease and / or meganuclease recognition sequences, (b) a target DNA sequence, and (c) protelomerase recognition sequences flanking both ends of the target DNA sequence, to amplify the DNA construct, and extracting the amplified DNA construct from the host cell; bringing a protelomerase into contact with the amplified and extracted DNA construct, where the protelomerase recognizes and cleaves the protelomerase recognition sequences on the DNA construct, thereby obtaining a first cleavage reaction mixture; bringing the first cleavage reaction mixture into contact with one or more IIS-type restriction endonucleases and / or meganucleases, where the IIS-type restriction endonucleases and / or meganucleases recognize and cleave the IIS-type restriction endonuclease and / or meganuclease recognition sequences on the construct, thereby obtaining a second cleavage reaction mixture; and bringing the second cleavage reaction mixture into contact with one or more exonucleases to digest sequences other than the target DNA sequence.
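The logic of the three enzymatic steps can be followed with a small bookkeeping model that tracks only whether each fragment's ends are covalently closed. The Python below is a conceptual sketch under stated assumptions, not a description of the applicant's implementation: the `Fragment` class and step functions are hypothetical, all cut sites are assumed to lie in the backbone, and the exonuclease is modeled, as the text describes, as removing every fragment that has at least one open end.

```python
# Conceptual sketch only; class and function names are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    left_closed: bool    # True if the left end is a covalently closed hairpin
    right_closed: bool   # True if the right end is a covalently closed hairpin
    cut_sites: int = 0   # type IIS / meganuclease sites inside this fragment

    @property
    def fully_closed(self) -> bool:
        return self.left_closed and self.right_closed


def protelomerase_step(backbone_cut_sites: int) -> list[Fragment]:
    """Cleaving and rejoining at the two flanking sites of a circular construct
    gives two closed linear molecules: the target and the vector backbone."""
    return [
        Fragment("target", True, True, 0),  # assumption: no cut sites in the target
        Fragment("backbone", True, True, backbone_cut_sites),
    ]


def endonuclease_step(fragments: list[Fragment]) -> list[Fragment]:
    """Each of n internal sites is cut, giving n + 1 pieces with open ends at the cuts."""
    out = []
    for frag in fragments:
        if frag.cut_sites == 0:
            out.append(frag)
            continue
        pieces = frag.cut_sites + 1
        for i in range(pieces):
            out.append(Fragment(
                f"{frag.name}_{i + 1}",
                frag.left_closed and i == 0,            # only the first piece keeps the left hairpin
                frag.right_closed and i == pieces - 1,  # only the last piece keeps the right hairpin
            ))
    return out


def exonuclease_step(fragments: list[Fragment]) -> list[Fragment]:
    """Model assumption: any fragment with an open end is digested."""
    return [f for f in fragments if f.fully_closed]


def run_workflow(backbone_cut_sites: int) -> list[Fragment]:
    return exonuclease_step(endonuclease_step(protelomerase_step(backbone_cut_sites)))
```

In this model, `run_workflow(5)` leaves only the closed linear target, whereas `run_workflow(0)` leaves the closed backbone as well, which illustrates why the construct needs at least one endonuclease site outside the target.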

1. Step of amplifying and extracting the DNA construct

[0030] A method for producing a target DNA sequence according to the present invention comprises the steps of amplifying and extracting the DNA construct of the present invention, wherein the amplification of the DNA construct is carried out in a host cell.

[0031] In some implementations, the step of amplifying and extracting the DNA construct includes amplifying the DNA construct by culturing host cells into which the DNA construct has been introduced, and extracting the amplified DNA construct from the host cells. In some implementations, the step of amplifying and extracting the DNA construct is performed using plasmid amplification and extraction methods in host cells that are known in the art.

[0032] As used herein, the term “host cell” encompasses any cell into which a vector or construct can be introduced by transformation and which has the capacity to support replication of the vector or construct. The host cell may be a prokaryotic cell (e.g., a bacterial cell such as an Escherichia coli (E. coli) cell) or a eukaryotic cell (e.g., a yeast cell, insect cell, amphibian cell, or mammalian cell).

[0033] DNA constructs can be introduced into host cells by any method known in the art. Such methods include, but are not limited to, introducing the DNA construct into suitable competent cells by chemical transformation or electroporation. Suitable competent cells include Escherichia coli (E. coli) competent cells such as, but not limited to, Top10 chemically competent cells (Invitrogen®, catalog number: C404010), DH5α chemically competent cells (Invitrogen®, catalog number: 18265017), and DH10B chemically competent cells (Invitrogen®, catalog number: 12331013).

[0034] The amplification of DNA constructs within host cells is carried out by culturing the host cells under conditions suitable for amplification of the DNA constructs. In some embodiments, after the DNA construct has been transformed into host cells such as Escherichia coli (E. coli) cells, the host cells are cultured in a suitable liquid medium (e.g., LB medium containing the antibiotic that corresponds to the resistance gene). Alternatively, the DNA constructs are amplified by fermentation culture of the host cells.

[0035] Extraction of the amplified DNA constructs can be carried out by extraction methods known in the art, including, but not limited to, the use of alkaline lysis or commercially available plasmid extraction kits (e.g., QIAprep Spin Miniprep kit, Qiagen).

2. First cleavage reaction

[0036] A method for producing a target DNA sequence according to the present invention further comprises a first cleavage reaction step following the above steps of amplifying and extracting a DNA construct. The first cleavage reaction step may include bringing a protelomerase into contact with the amplified and extracted DNA construct, where the protelomerase recognizes and cleaves a protelomerase recognition sequence on the DNA construct to obtain a first cleavage reaction mixture.

[0037] The first cleavage reaction may be an isothermal reaction at a temperature suitable for protelomerase activity. Suitable temperatures are known in the art and may be, for example, 20 to 40°C, such as 25 to 35°C, e.g., 30°C. The time for the first cleavage reaction may be 10 minutes to 24 hours, such as 30 minutes to 12 hours, e.g., 40 minutes, 50 minutes, 1 hour, 2 hours, or 4 hours.

[0038] In some implementations, a method for producing a target DNA sequence includes a step of inactivating protelomerase after a first cleavage reaction. The inactivation step may include heating the protelomerase to a temperature higher than its inactivation temperature (e.g., higher than 60°C, higher than 70°C, and e.g., 75°C) and maintaining that temperature for a suitable time (e.g., 2 minutes to 1 hour, e.g., 5 minutes, 10 minutes, or 20 minutes).

3. Second cleavage reaction

[0039] A method for producing a target DNA sequence according to the present invention further comprises a second cleavage reaction step after the above first cleavage reaction step. The second cleavage reaction may include bringing the first cleavage reaction mixture into contact with one or more IIS-type restriction endonucleases and / or meganucleases, where the IIS-type restriction endonucleases and / or meganucleases recognize and cleave the IIS-type restriction endonuclease and / or meganuclease recognition sequences on the construct to obtain the second cleavage reaction mixture.

[0040] The second cleavage reaction may be an isothermal reaction at a temperature suitable for the IIS-type restriction endonuclease and / or meganuclease used. Suitable temperatures are known in the art and may be, for example, 20 to 55°C, such as 30 to 50°C, e.g., 37°C. The time for the second cleavage reaction may be 10 minutes to 24 hours, such as 30 minutes to 12 hours, e.g., 40 minutes, 50 minutes, 1 hour, 2 hours, or 4 hours.

[0041] In some implementations, a method for constructing a target DNA sequence includes a step of inactivating the IIS-type restriction endonuclease and / or meganuclease after a second cleavage reaction. The inactivation step may include heating the IIS-type restriction endonuclease and / or meganuclease to a temperature higher than its inactivation temperature (e.g., higher than 60°C, higher than 65°C, and e.g., 65°C or 70°C) and maintaining that temperature for a specific time (e.g., 2 minutes to 1 hour, e.g., 10 minutes, 20 minutes, or 25 minutes).

4. Digestion reaction

[0042] A method for producing a target DNA sequence according to the present invention further comprises a digestion step following the above second cleavage reaction step. The digestion step includes digesting sequences other than the target DNA sequence, and preferably all nucleotide sequences other than the target DNA sequence, by bringing the second cleavage reaction mixture into contact with one or more exonucleases.

[0043] The exonuclease may be any exonuclease known in the art, including, but not limited to, phage T5 exonuclease (phage T5 gene D15 product), phage λ exonuclease, RecE from Rac prophage, exonuclease VIII derived from Escherichia coli (E. coli), and phage T7 exonuclease (phage T7 gene 6 product). In some implementations of this disclosure, the exonuclease is either T5 exonuclease or λ exonuclease. Methods for preparing and using exonucleases are conventional, and numerous exonucleases are commercially available.

[0044] The digestion reaction may be an isothermal reaction at a temperature suitable for the exonuclease used. Suitable temperatures are known in the art and may be, for example, 20 to 55°C, such as 30 to 50°C, e.g., 37°C. The time for the digestion reaction may be 10 minutes to 24 hours, such as 30 minutes to 12 hours, e.g., 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, or 6 hours.

[0045] In some implementations, a method for producing a target DNA sequence includes a step of inactivating the exonuclease after the digestion reaction. The inactivation step may include heating the exonuclease to a temperature higher than its inactivation temperature (e.g., higher than 60°C, higher than 65°C, and e.g., 70°C, 75°C, or 80°C) and maintaining that temperature for a specific time (e.g., 2 minutes to 1 hour, e.g., 5 minutes, 10 minutes, or 20 minutes).
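The reaction and inactivation steps described above form a simple schedule of temperature and time settings. The following Python sketch encodes one possible schedule and checks it against the broadest ranges given in the text. It is illustrative only: the `Step` class and helper names are hypothetical, and the specific settings are picked from the example values in the text rather than being a recommended or validated protocol.

```python
# Illustrative sketch only; names are hypothetical and the settings are example
# values picked from the ranges stated in the text, not a validated protocol.
from dataclasses import dataclass

INF = float("inf")


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    minutes: float
    allowed_temp_c: tuple[float, float]    # broadest temperature range in the text
    allowed_minutes: tuple[float, float]   # broadest duration range in the text


SCHEDULE = [
    Step("first cleavage (protelomerase)", 30, 60, (20, 40), (10, 24 * 60)),
    Step("protelomerase inactivation", 75, 10, (60, INF), (2, 60)),
    Step("second cleavage (type IIS / meganuclease)", 37, 60, (20, 55), (10, 24 * 60)),
    Step("endonuclease inactivation", 65, 20, (60, INF), (2, 60)),
    Step("digestion (exonuclease)", 37, 120, (20, 55), (10, 24 * 60)),
    Step("exonuclease inactivation", 70, 10, (60, INF), (2, 60)),
]


def validate(schedule: list[Step]) -> list[str]:
    """Return a message for every step whose settings fall outside the stated ranges."""
    problems = []
    for step in schedule:
        lo_t, hi_t = step.allowed_temp_c
        lo_m, hi_m = step.allowed_minutes
        if not lo_t <= step.temp_c <= hi_t:
            problems.append(f"{step.name}: {step.temp_c} C outside {lo_t}-{hi_t} C")
        if not lo_m <= step.minutes <= hi_m:
            problems.append(f"{step.name}: {step.minutes} min outside {lo_m}-{hi_m} min")
    return problems


def total_minutes(schedule: list[Step]) -> float:
    """Total hands-off incubation time for the schedule."""
    return sum(step.minutes for step in schedule)
```

With these example settings the six incubations add up to 280 minutes, and `validate(SCHEDULE)` returns an empty list. The actual temperatures and times would depend on the specific enzymes chosen.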

[0046] No additional purification step is needed after the first cleavage reaction or after the second cleavage reaction. It will be understood by those skilled in the art that all enzymatic digestion reactions according to this application, including the first cleavage reaction using protelomerase and the second cleavage reaction using IIS-type restriction endonuclease and / or meganuclease, are carried out under conditions in which the target DNA sequence is not cleaved.

[0047] In some implementations, the method for producing the target DNA sequence may further include an optional target DNA sequence recovery step after the digestion reaction step described above. The recovery step may be carried out by one or a combination of the following: recovery of the product of the digestion step by phenol-chloroform extraction and a DNA adsorption spin column; isopropanol / ethanol precipitation; or removal of the enzyme proteins and salts in the reaction system by size-exclusion (molecular sieve) chromatography on a high-performance liquid chromatography system. When the obtained target DNA sequence is used in mammalian cell or animal experiments, endotoxin removal and / or sterile filtration may additionally be performed.

[0048] The product of the digestion reaction is closed linear double-stranded DNA. In some implementations, the target DNA sequence produced by the method for producing the target DNA sequence according to this application is closed linear double-stranded DNA.

[0049] A method for producing a target DNA sequence according to this application can achieve high purity of the target DNA sequence (e.g., product purity of 100%) without any purification steps. In some implementations, the method for producing a target DNA sequence does not include any steps for purifying the target DNA sequence, for example, not including a step of purifying the target DNA sequence after a digestion reaction, not including a step of purifying the product DNA sequence after a second cleavage reaction, and / or not including a step of purifying the product DNA sequence after a first cleavage reaction. In some implementations, the purity of the target DNA sequence produced by the method for producing a target DNA sequence according to this application is greater than 95%, greater than 98%, greater than 99%, or 100%.

[0050] The method for producing target DNA sequences according to this invention allows for the large-scale production of target DNA sequences. In some implementations, the fermentation culture volume of host cells containing the introduced DNA construct may be 1 L or more, for example, 5 L or more, or 10 L or more. In some implementations, the amplified DNA construct may be extracted from a fermentation culture of 1 L or more, 5 L or more, or 10 L or more. In some implementations, the amount of target DNA sequence produced may be 1 mg or more, 5 mg or more, or 10 mg or more.

[0051] 5. Several other implementation forms
In some implementations, the DNA construct of the present invention further includes an additional restriction endonuclease recognition sequence between the target DNA sequence and the protelomerase recognition sequence. In that case, the method for constructing a target DNA sequence according to the present invention further includes, after the digestion reaction step (i.e., the step of "digesting sequences other than the target DNA sequence by bringing the second cleavage reaction mixture into contact with one or more exonucleases"), a step of bringing the corresponding restriction endonuclease into contact with the digestion reaction product (i.e., the closed linear double-stranded DNA) so that it recognizes and cleaves the additional restriction endonuclease recognition sequence. This removes the two closed ends of the closed linear double-stranded DNA and yields a double-stranded target DNA fragment with two unclosed ends (blunt or sticky ends).

[0052] In some implementations, the DNA construct of the present invention further includes an additional nicking enzyme recognition sequence between the target DNA sequence and the protelomerase recognition sequence. In that case, the method for constructing a target DNA sequence according to the present invention further includes, after the digestion reaction step (i.e., the step of "digesting sequences other than the target DNA sequence by bringing the second cleavage reaction mixture into contact with one or more exonucleases"), a step of bringing the nicking enzyme into contact with the digestion reaction product (i.e., the closed linear double-stranded DNA) so that it recognizes and cleaves the additional nicking enzyme recognition sequence. This removes one of the sense and antisense strands of the double-stranded DNA and forms a DNA sequence in which the two ends are covalently closed and locally double-stranded, and the intermediate target fragment region is single-stranded DNA.

[0053] As used herein, the term “nicking enzyme” includes, but is not limited to, Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BssSI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPI. Methods for preparing and using nicking enzymes are conventional, and numerous nicking enzymes are commercially available (e.g., New England BioLabs). A “nicking enzyme recognition sequence” is a sequence that can be recognized and cleaved by the corresponding nicking enzyme, determined according to the specific nicking enzyme used, and is known in the art.

[0054] The target DNA sequence produced by the method for producing a target DNA sequence according to this application can be transferred into a cell or animal body via chemical or physical delivery to perform transient expression of an exogenous gene or to incorporate the exogenous DNA sequence into the genome.

[0055] In some implementations, the target DNA sequence produced by a method for producing a target DNA sequence according to this application may include a protein-coding sequence for use in protein expression. For example, the target DNA sequence may include a promoter, a target gene, and a poly(A) tail.

[0056] Therefore, the present invention further provides a method for expressing a target protein. The method includes: producing a target DNA sequence by the method for constructing a target DNA sequence according to the present invention, wherein the target DNA sequence includes a DNA sequence encoding the target protein; introducing the obtained target DNA sequence into prokaryotic or eukaryotic cells; and incubating the prokaryotic or eukaryotic cells under conditions suitable for protein expression.

[0057] In some implementations, target DNA sequences produced by methods for producing target DNA sequences according to the present invention can be used for gene reconstruction of target genomes, such as CRISPR-based gene reconstruction.

[0058] Therefore, the present invention further provides a method for incorporating a target DNA sequence into a target integration site of a target genome. The method includes: producing a target DNA sequence by the method for constructing a target DNA sequence according to the present invention, wherein the target DNA sequence comprises homology arm sequences for the two ends of the target integration site and an intermediate target knock-in fragment; introducing the obtained target DNA sequence, Cas9 protein, and an sgRNA designed based on the target DNA sequence together into prokaryotic or eukaryotic cells; and incubating the prokaryotic or eukaryotic cells so that the target DNA sequence is incorporated into the target genome.

[0059] The optimal sgRNA sequence is designed based on the target DNA sequence using an existing sgRNA design website, for example, https://www.genscript.com/gRNA-design-tool.html.

[0060] The homology arm sequences at the two ends of the target integration site may be sequences of approximately 300 bp, 310 bp, 320 bp, 330 bp, or 350 bp at the two ends of the target integration gene site. For example, the target integration gene may be the TRAC gene or the RAB11a gene, and the homology arm sequences at the two ends of the target integration site may be sequences of approximately 300 bp on the left and right sides of the integration site in the TRAC gene or the RAB11a gene.
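The layout of the knock-in template described above (left homology arm, intermediate knock-in fragment, right homology arm) can be sketched as follows. This is only an illustrative sketch: the class and function names are hypothetical, the 300 to 350 bp window is taken from the approximate arm lengths listed above, and the placeholder sequences are not real TRAC or RAB11a sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnockInTemplate:
    """Left homology arm + knock-in fragment + right homology arm (hypothetical helper)."""
    left_arm: str
    insert: str
    right_arm: str

    @property
    def sequence(self) -> str:
        # The target DNA sequence is the simple concatenation of the three parts.
        return self.left_arm + self.insert + self.right_arm


def build_template(left_arm: str, insert: str, right_arm: str,
                   min_arm: int = 300, max_arm: int = 350) -> KnockInTemplate:
    """Check that both arms fall in the assumed 300-350 bp window, then assemble."""
    for name, arm in (("left", left_arm), ("right", right_arm)):
        if not min_arm <= len(arm) <= max_arm:
            raise ValueError(f"{name} arm is {len(arm)} bp, expected {min_arm}-{max_arm} bp")
    return KnockInTemplate(left_arm.upper(), insert.upper(), right_arm.upper())


if __name__ == "__main__":
    # Placeholder sequences only; real arms would come from the genome of interest.
    template = build_template("A" * 300, "G" * 720, "C" * 300)
    print(len(template.sequence))
```

The length window is a parameter rather than a constant so that other arm lengths can be tried without changing the function.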

[0061] A method for incorporating a target DNA sequence into a target integration site of a target genome according to this invention makes it possible to achieve stable continuous expression of the incorporated target DNA sequence within the target genome, for example, for 7 days or more.

[0062] Kit
The present application further provides a kit for constructing a target DNA sequence, the kit being configured to carry out a method for constructing a target DNA sequence according to the present application. The kit comprises a cloning vector, a protelomerase, one or more IIS-type restriction endonucleases and / or meganucleases, and one or more exonucleases. The kit may further comprise instructions for use describing a method according to the present application.

[0063] The method for producing target DNA sequences provided herein is suitable for the large-scale industrial production of target DNA sequences. In some implementations, the method for producing target DNA sequences is carried out in a fermentation tank of 1 L or more, 5 L or more, 10 L or more, or 20 L or more.

[0064] The method according to this application has the following beneficial effects:
I. The method according to this application is simple and general for any sequence; it requires neither a sequence-specific design nor different preparation techniques for different sequences.
II. The starting material for the method according to this invention is a plasmid extracted on a large scale, and since the enzymatic reaction at each step is carried out isothermally, the method is easy to scale up.
III. The enzymes selected for the method according to this invention are highly compatible in their reaction conditions, so that the reaction system for each subsequent enzymatic step can be set up directly in the reaction system of the preceding step, without purifying the digestion products after each step. For each subsequent enzymatic step, only the enzyme that reacted in the previous step needs to be inactivated before a new enzyme is added to the reaction system.
IV. The method according to this application makes it possible to prepare large quantities of high-purity target DNA sequences without any band-based DNA size selection, such as DNA fragment separation by electrophoresis or high-performance liquid chromatography, and is also well suited to preparing gene editing templates economically and efficiently and to performing accurate editing on a large scale in a compliant manner.
V. The method according to this application is particularly suitable for GMP production because it facilitates quality control when the target DNA sequence is produced on a large industrial scale.

[0065] The present invention will be described in more detail with reference to the following drawings.
[Brief explanation of the drawings]

[0066]
[Figure 1] Shows the layout of the functional elements of a general construct and the enzymatic digestion sites of the restriction endonucleases in one embodiment of the present invention.
[Figure 2] Shows the flow of preparing a target DNA sequence in one embodiment of the present invention.
[Figure 3] Shows the results of detecting intermediate and final products of double-stranded DNA by agarose gel electrophoresis in one embodiment of the present invention. Lane M is a 3000 bp double-stranded DNA marker, lane 1 is the product of the purified prepared vector digested with TelN enzyme, lane 2 is the product of lane 1 digested with I-sceI enzyme, and lane 3 is the product of lane 2 digested with λ exonuclease.
[Figure 4] Shows the purity verification of the final product, target sequence 1, using the Agilent Bioanalyzer 2100 in one embodiment of the present invention.
[Figure 5] Shows the results of detecting intermediate and final products of double-stranded DNA by agarose gel electrophoresis in one embodiment of the present invention. Lane M is a 3000 bp double-stranded DNA marker, lane 1 is the product of the purified prepared vector digested with TelN enzyme, lane 2 is the product of lane 1 digested with BspQI enzyme, and lane 3 is the product of lane 2 digested with λ exonuclease.
[Figure 6] Shows the purity verification of the final product, target sequence 2, using the Agilent Bioanalyzer 2100 in one embodiment of the present invention.
[Figure 7] Shows the detection of cell viability after electroporation of target sequence 2, prepared according to the method of the present invention, in one embodiment of the present invention.
[Figure 8] Shows the detection of green fluorescent protein expression in cells 48 hours after electroporation of target sequence 2, prepared according to the method of the present invention, in one embodiment of the present invention.
[Figure 9] Shows the detection of cell viability after electroporation of target sequence 2, prepared according to the method of the present invention, in one embodiment of the present invention.
[Figure 10] Shows the detection of green fluorescent protein expression in cells 7 days after electroporation of target sequence 2, prepared according to the method of the present invention, in one embodiment of the present invention.
[Modes for carrying out the invention]

[0067] Unless otherwise specified, the scientific and technical terms used in this invention have the meanings generally understood by those skilled in the art to which this invention belongs.

[0068] The technical solutions in this disclosure are described in further detail with reference to the accompanying drawings and embodiments. Unless otherwise specified, the methods used in the following embodiments are conventional methods and the materials are commercially available conventional products. Those skilled in the art will understand that the methods and materials described below are illustrative and should not be considered to limit the scope of this disclosure.
[Examples]

[0069] Example 1: Design and construction of a general construct
The pUC57 kanamycin-resistant vector is selected as the vector to be modified. As shown in Figure 1, because the original vector contains a BspQI recognition site only at position 714, two BspQI recognition sites (sequence: GCTCTTC) are added between the functional elements, after positions 1554 and 2539, respectively. Two recognition sequences for the meganuclease I-sceI (sequence: TAGGGATAACAGGGTAAT), which are not present in the original vector, are inserted after positions 1501 and 2479. Furthermore, a point mutation changing G to C at position 1397 adds one BspQI recognition site in the ori sequence without changing the function of ori. Similarly, a synonymous mutation changing AT to GC at positions 2136 and 2137 adds one BspQI recognition site within the coding sequence of the kanamycin resistance gene. Both the sequence insertions and the point mutations can be carried out on the pUC57-KanR plasmid vector (Nanjing GenScript Biotech Corp., sequence information: https://www.addgene.org/vector-database/6258/, see SEQ ID NO. 12) through the site-directed mutagenesis service of Nanjing GenScript Biotech Corp., yielding the general construct shown in Figure 1. From left to right, the functional elements of the general construct include the lac promoter, the lac activator gene, and the lacZα gene encoding β-galactosidase with a multiple cloning site (MCS); the plasmid replication origin sequence ori; the kanamycin resistance gene KanR; and the existing lactose operon sequence of the pUC57-KanR vector for blue-white screening, together with the added BspQI recognition sequences (GCTCTTC) and I-sceI recognition sequences (TAGGGATAACAGGGTAAT).
The designed sequence of the general construct is shown in SEQ ID NO: 1.
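A simple way to check where the BspQI and I-sceI recognition sequences occur in a candidate vector sequence is to search both strands for each site. The sketch below is illustrative only: the function names are hypothetical, the toy sequence is not the pUC57-KanR sequence, and a circular plasmid would additionally need a check for sites spanning the point where the sequence is linearized.

```python
BSPQI_SITE = "GCTCTTC"              # recognition sequence given in the text
ISCEI_SITE = "TAGGGATAACAGGGTAAT"   # recognition sequence given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every occurrence of `site`.

    The sequence is treated as linear; "-" marks a hit of the reverse
    complement, i.e. the site lies on the opposite strand.
    """
    seq = seq.upper()
    patterns = {site.upper(): "+"}
    rc = reverse_complement(site)
    if rc not in patterns:          # avoid double counting palindromic sites
        patterns[rc] = "-"
    hits = []
    for pattern, strand in patterns.items():
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    toy = "AAAGCTCTTCAAAGAAGAGCAAA"   # toy sequence, one site on each strand
    print(find_sites(toy, BSPQI_SITE))
    print(find_sites(toy, ISCEI_SITE))
```

The same search can be run on a target sequence to confirm that it lacks the site of the enzyme chosen to fragment the backbone.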

[0070] Example 2: Target sequence 1 configured to knock the GFP gene into the RAB11a gene
2.1 Design of Target Sequence 1
The 300 bp sequences on the left and right sides of the knock-in site in the human RAB11a (Ras-related protein Rab-11A) gene, shown in SEQ ID NOs. 4 and 5, respectively, are selected as the RAB11a homology arm sequences flanking the two ends of the GFP sequence. The original sequence of the designed target sequence 1 (a stable 1356 bp double-stranded DNA) is shown in SEQ ID NO. 3, and the protelomerase TelN recognition sequence TATCAGCACACAATTGCCCATTATACGCGCGTATAATGGACTATTGTGTGCTGATA (SEQ ID NO. 2), derived from E. coli phage N15, is added to both ends of target DNA sequence 1. The sequence of target sequence 1 after the addition of the TelN recognition sequences to both ends is shown in SEQ ID NO. 6. The designed sequence is produced by full gene synthesis at Nanjing GenScript Biotech Corp. (https://www.genscript.com.cn/gene_synthesis.html).
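The flanking step described above amounts to placing one copy of the TelN recognition sequence on each side of the target sequence. The following sketch shows that assembly together with a small check of how close the recognition sequence, as written in the text, is to being its own reverse complement. The function names are hypothetical, and the placeholder target is not SEQ ID NO. 3.

```python
# TelN recognition sequence exactly as quoted in the text (SEQ ID NO. 2).
TELN_SITE = "TATCAGCACACAATTGCCCATTATACGCGCGTATAATGGACTATTGTGTGCTGATA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def flank_with_teln(target: str) -> str:
    """Add one TelN recognition sequence to each end of the target sequence."""
    return TELN_SITE + target.upper() + TELN_SITE


def palindrome_mismatches(seq: str) -> list[int]:
    """0-based positions where `seq` differs from its own reverse complement."""
    rc = reverse_complement(seq)
    return [i for i, (a, b) in enumerate(zip(seq.upper(), rc)) if a != b]


if __name__ == "__main__":
    placeholder_target = "A" * 1356          # stand-in for the 1356 bp target
    flanked = flank_with_teln(placeholder_target)
    print(len(TELN_SITE), len(flanked))
    print(palindrome_mismatches(TELN_SITE))
```

Run on the sequence as quoted, the check reports a short list of symmetric mismatch positions, so the quoted site is nearly, but not exactly, palindromic.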

[0071] 2.2 Preparation of DNA constructs and preparation of large-scale plasmids
Target sequence 1 with TelN recognition sequences added to both ends (SEQ ID NO: 6), synthesized in Step 2.1, is blunt-end ligated, using T4 ligase (Thermo Scientific®, product catalog number: EL0011), into the digestion site of the general vector pUC57-Kan-V6 (the general vector prepared in Example 1) that has been linearized by single digestion with the restriction endonuclease EcoRV (New England BioLabs, catalog number R3195L). The ligation product is transformed into competent Escherichia coli (E. coli) cells by electroporation, and the transformed E. coli are spread onto LB plate medium containing kanamycin and cultured overnight at 37°C. The following day, 10 single colonies are picked and each is cultured in 3 mL of LB liquid medium containing 50 μg/mL kanamycin for plasmid extraction. Each plasmid is verified by Sanger sequencing to identify a plasmid whose entire sequence is correct. Plasmid extraction and sequencing are completed by Nanjing GenScript Biotech Corp. (https://www.genscript.com.cn/custom-plasmid-preparation.html). E. coli strains containing plasmids with the correct sequence are streaked and stored. A seed culture is prepared by inoculating the stored E. coli strain (OD value approximately 0.8), and then a 10 L large-scale plasmid extraction is performed by inoculating the seed culture at 1% into 1 L of E. coli culture.

[0072] 2.3 Preparation of stable double-stranded DNA containing target sequence 1
The prepared plasmid vector, after large-scale extraction, is subjected to a three-step isothermal enzymatic digestion reaction to obtain closed linear double-stranded DNA containing target sequence 1.
Step I: 0.8 mg of the circular prepared vector is cleaved with TelN enzyme (New England BioLabs, catalog number M0651S, 5 U / μL enzyme, 1 μL added per 300 fmol of TelN recognition sites) into two linear double-stranded DNA fragments, each with two closed ends, one containing target sequence 1 and the other containing the vector backbone. The reaction is incubated at 30°C for 1 hour and then heated at 75°C for 10 minutes to inactivate the TelN enzyme. After the reaction is complete, agarose electrophoresis is used to check whether the reaction has gone to completion, i.e., whether the prepared vector has been completely converted from a supercoiled circular plasmid into two linear double-stranded DNA fragments. The band positions should correspond to the size of the target fragment and of the prepared vector backbone (approximately 2.6 kb), respectively.
Step II: Because target sequence 1 contains the BspQI recognition sequence GCTCTTC but does not contain the I-sceI recognition sequence, the I-sceI enzyme (New England BioLabs, catalog number R0694L) is used. Target sequence 1 remains intact, while the vector backbone sequence is cleaved by the I-sceI enzyme into multiple DNA fragments with double-strand breaks. The reaction is carried out at 37°C for 1 hour, followed by inactivation. After the reaction is complete, agarose electrophoresis is used to check whether the reaction has gone to completion.
Step III: The fragmented vector backbone DNA is completely digested with a DNA exonuclease. After the reaction in Step II, an exonuclease, specifically λ exonuclease (New England BioLabs, catalog number M0262L) or T5 exonuclease (New England BioLabs, catalog number M0663L), is added to the reaction system and reacted at 37°C for 2 hours, followed by thermal inactivation. For specific reaction conditions, please refer to Tables 1 to 3 below. After the reaction, the purity of the target product is checked by agarose gel electrophoresis, and the product shows a single band corresponding to the target (see Figure 3). The final product, i.e., the stable double-stranded target sequence 1, is subjected to purity verification using the Agilent Bioanalyzer 2100. A DNA product with 100% purity containing only the target fragment (see Figure 4) can be obtained without any further size selection or purification of DNA fragments.
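The logic of the three steps can be summarized with a small bookkeeping model: protelomerase cleavage yields two fragments with closed ends, the backbone-specific cut opens ends only on backbone fragments, and the exonuclease removes every fragment that has at least one open end. This is a simplified sketch, not a description of the enzymes' full behavior; the class and function names and the cut positions are hypothetical, while 1356 bp and 2.6 kb are the approximate sizes given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    length_bp: int
    left_closed: bool
    right_closed: bool

    @property
    def exonuclease_resistant(self) -> bool:
        # Simplifying assumption: a fragment survives only if both ends are closed.
        return self.left_closed and self.right_closed


def teln_cleave(target_bp: int, backbone_bp: int) -> list[Fragment]:
    """Step I: the circular vector becomes two closed-ended linear fragments."""
    return [Fragment("target", target_bp, True, True),
            Fragment("backbone", backbone_bp, True, True)]


def cut_backbone(fragments: list[Fragment], cut_positions: list[int]) -> list[Fragment]:
    """Step II: cut only the backbone at the given positions (bp from its left end)."""
    out = []
    for frag in fragments:
        if frag.name != "backbone":
            out.append(frag)            # the target carries no site and stays intact
            continue
        if any(p <= 0 or p >= frag.length_bp for p in cut_positions):
            raise ValueError("cut positions must lie strictly inside the backbone")
        bounds = [0] + sorted(cut_positions) + [frag.length_bp]
        n = len(bounds) - 1
        for i in range(n):
            out.append(Fragment(
                name=f"backbone_{i + 1}",
                length_bp=bounds[i + 1] - bounds[i],
                left_closed=(i == 0 and frag.left_closed),
                right_closed=(i == n - 1 and frag.right_closed),
            ))
    return out


def exonuclease_digest(fragments: list[Fragment]) -> list[Fragment]:
    """Step III: keep only fragments with no open end."""
    return [f for f in fragments if f.exonuclease_resistant]


if __name__ == "__main__":
    step1 = teln_cleave(target_bp=1356, backbone_bp=2600)
    step2 = cut_backbone(step1, cut_positions=[900, 1900])   # hypothetical positions
    step3 = exonuclease_digest(step2)
    print([f.name for f in step3])
```

In this model, skipping Step II leaves the backbone with two closed ends, which is why the backbone-specific cut is needed before exonuclease treatment.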

[0073] [Table 1]

[0074] [Table 2]

[0075] [Table 3]

[0076] As shown in Figure 3, lane M is a 3000 bp double-stranded DNA marker. Lane 1 is the product of the purified prepared vector after digestion with TelN enzyme, where the product contains a 1 kb target double-stranded DNA band and a 2 kb vector backbone double-stranded DNA band. Lane 2 is the product after digestion of the lane 1 product with I-sceI enzyme, where the product contains the 1 kb target double-stranded DNA band and the vector backbone double-stranded DNA cut into two fragments. Lane 3 is the product after digestion of the lane 2 product with λ exonuclease, where the final product contains only the 1 kb target double-stranded DNA band.

[0077] As shown in Figure 4, the final product undergoes purity detection by Agilent Bioanalyzer capillary electrophoresis with a DNA12000 detection chip. The 50 bp and 17,000 bp peaks are internal reference peaks for chip detection, and the peak of the target sequence is the only sample peak present across the entire detection range, indicating 100% purity.

[0078] 2.4 Final purification of stable double-stranded DNA containing target sequence 1
The above reaction products may be recovered by phenol-chloroform extraction (Invitrogen®, product catalog number: 15593031) and by a DNA adsorption centrifugation column made of special material (QIAGEN-tip 100, Qiagen, product catalog number: 10043). Then, DNA molecules containing only the target sequence are recovered by isopropanol (Sinopharm Chemical Reagent Co., Ltd., serial number 80109218) / ethanol (Sinopharm Chemical Reagent Corporation, serial number 10009257) precipitation. The isopropanol / ethanol precipitation method employed in the embodiment includes the following specific steps:
1) Add 0.7 volumes of isopropanol at normal temperature (for example, add 0.7 mL of isopropanol to 1 mL of the stable double-stranded DNA to be concentrated), mix uniformly, then centrifuge at 12,000–14,000 rpm at 4°C for 10 minutes, and carefully remove the supernatant without touching the precipitate;
2) Add 1 mL of 70% ethanol solution at normal temperature, gently resuspend the precipitate, and wash thoroughly;
3) Centrifuge at 12,000–14,000 rpm at 4°C for 5–10 minutes, and carefully remove the supernatant without touching the precipitate;
4) Centrifuge at 5,000–10,000 rpm at 4°C for 5–10 seconds, and carefully and completely remove the residual liquid with a 20 μL or 200 μL pipette without touching the precipitate; and
5) Once no liquid is visible to the naked eye (drying is generally complete within 1 minute after the liquid has been completely removed), add a suitable volume of solution (e.g., solution V, 10 mM Tris-Cl pH 8.5, or Milli-Q grade pure water) to dissolve the DNA.

[0079] In this embodiment, 0.8 mg of purified prepared vector containing the target sequence is used, where the length of the target sequence accounts for 35% of the total length of the prepared vector; therefore, the theoretical amount of stable double-stranded DNA product is 0.28 mg. After isopropanol precipitation and purification, the amount of stable double-stranded DNA containing only the target sequence in the obtained pure product is 0.18 mg according to OD260 UV absorption measurement (Nanodrop One, ThermoFisher), and the yield is 64.73%. By scaling up the reaction system, 4.5 mg of purified prepared vector containing the target sequence is processed at one time, ensuring that 1 mg of final product is obtained at one time. Proteases and salts in the reaction system are removed by purifying 1 mg or more of product through molecular sieve chromatography (chromatography column packing: Sepharose 6 Fast Flow, Cytiva) supported by high-performance liquid chromatography (AKTA Explorer 100, Cytiva).
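The yield arithmetic in this paragraph can be reproduced with a short calculation. The sketch below uses only the rounded figures quoted in the text; the function names are hypothetical. With these rounded inputs the recovery comes out at roughly 64.3%, slightly below the reported 64.73%, and the difference presumably reflects unrounded measured values (an assumption, since the raw values are not given).

```python
def theoretical_yield_mg(vector_mg: float, target_fraction: float) -> float:
    """Mass of target DNA contained in the vector, assuming mass scales with length."""
    return vector_mg * target_fraction


def recovery_percent(recovered_mg: float, theoretical_mg: float) -> float:
    """Recovered mass as a percentage of the theoretical mass."""
    return 100.0 * recovered_mg / theoretical_mg


if __name__ == "__main__":
    theoretical = theoretical_yield_mg(vector_mg=0.8, target_fraction=0.35)
    recovery = recovery_percent(recovered_mg=0.18, theoretical_mg=theoretical)
    print(f"theoretical product: {theoretical:.2f} mg")
    print(f"recovery from rounded inputs: {recovery:.2f} %")
```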

[0080] Example 3: Target sequence 2 configured to knock a GFP gene editing template having a CMV promoter into the TRAC gene
3.1 Design of Target Sequence 2
The 300 bp sequences on the left and right sides of the knock-in site in the human TRAC (T cell receptor α chain coding gene) gene, shown in SEQ ID NOs. 7 and 8, respectively, are selected as the left and right TRAC homology arm sequences flanking the two ends of the GFP sequence containing the CMV promoter. The original sequence of target sequence 2 (1885 bp stable double-stranded DNA) is shown in SEQ ID NO. 9, and the protelomerase TelN recognition sequence TATCAGCACACAATTGCCCATTATACGCGCGTATAATGGACTATTGTGTGCTGATA (SEQ ID NO. 2), derived from Escherichia coli (E. coli) phage N15, is added to both ends of target DNA sequence 2. The sequence of target sequence 2 after the addition of the TelN recognition sequences to both ends is shown in SEQ ID NO. 10. The designed sequence is produced by full gene synthesis at Nanjing GenScript Biotech Corp. (https://www.genscript.com.cn/gene_synthesis.html).

[0081] 3.2 Preparation of DNA constructs and preparation of large-scale plasmids
As in Example 2, target sequence 2 with TelN recognition sequences added to both ends, synthesized in 3.1, is blunt-end ligated, using T4 ligase (Thermo Scientific®, product catalog number: EL0011), into pUC57-Kan-V6 (the general vector prepared in Example 1) that has been linearized by single digestion with the restriction endonuclease EcoRV (New England BioLabs, catalog number R3195L). The ligation product is transformed into competent Escherichia coli (E. coli) cells, and the transformed E. coli are spread onto LB plate medium containing kanamycin and cultured overnight at 37°C. The following day, 10 single colonies are picked and each is cultured in 3 mL of LB liquid medium containing 50 μg/mL kanamycin for plasmid extraction. Plasmid extraction is completed by Nanjing GenScript Biotech Corp. (https://www.genscript.com.cn/custom-plasmid-preparation.html). Each plasmid is verified by sequencing to identify a plasmid whose entire sequence is correct. E. coli strains containing plasmids with the correct sequence are streaked and stored. A seed culture is prepared, and a 10 L large-scale plasmid extraction is performed by inoculating the seed culture into a 1 L E. coli culture system.

[0082] 3.3 Preparation of stable double-stranded DNA containing target sequence 2
As in the operating steps of Example 2.3, the prepared plasmid vector, after large-scale extraction, is subjected to a three-step isothermal enzymatic digestion reaction to obtain closed linear double-stranded DNA containing the target sequence.
Step I: The circular prepared vector obtained in Example 3.2 is cleaved with TelN enzyme (New England BioLabs, catalog number M0651S, 5 U / μL enzyme, 1 μL added per 300 fmol of TelN recognition sites) into two linear double-stranded DNA fragments, each with two closed ends, one containing target sequence 2 and the other containing the vector backbone. See Table 4 for specific reaction conditions. After the reaction is complete, agarose electrophoresis is used to check whether the reaction has gone to completion.
Step II: Target sequence 2 may be subjected to enzymatic digestion with I-sceI as in Example 2. However, because target sequence 2 does not contain the BspQI recognition sequence, BspQI, which has more cleavage sites on the general vector, may be used for the digestion instead. The target fragment remains intact, while the vector backbone sequence is cleaved into multiple DNA fragments with double-strand breaks. For specific reaction conditions, please refer to Table 5. After the reaction is complete, agarose electrophoresis is used to check whether the reaction has gone to completion.
Step III: The fragmented vector backbone DNA is completely digested with a DNA exonuclease. After the reaction in Step II, λ exonuclease (New England BioLabs, catalog number M0262L) or T5 exonuclease (New England BioLabs, catalog number M0663L) is added to the reaction system and reacted at 37°C for 2 hours, followed by thermal inactivation. For specific reaction conditions, please refer to Table 6. The purity of the target product is checked by agarose gel electrophoresis, and the product shows a single band corresponding to the target (see Figure 5). The final product, i.e., the stable double-stranded target sequence 2, is then verified for purity using the Agilent Bioanalyzer 2100. As shown in Figure 6, the 50 bp and 17,000 bp peaks are internal reference peaks for chip detection, and the peak of the target sequence is the only sample peak present across the entire detection range, indicating a purity of 100%. A DNA product with 100% purity containing only the target fragment can be obtained without any further size selection or purification of DNA fragments.
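The choice between I-sceI and BspQI in Step II depends on which recognition sequences are absent from the target sequence, since the target must remain uncut. A minimal sketch of that selection rule follows; the function name and the toy sequences are hypothetical, and only the two recognition sequences given in the text are considered.

```python
# Recognition sequences as given in the text.
CANDIDATE_SITES = {
    "I-sceI": "TAGGGATAACAGGGTAAT",
    "BspQI": "GCTCTTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def usable_backbone_cutters(target_seq: str) -> list[str]:
    """List the candidate enzymes whose site is absent from both strands of the target."""
    seq = target_seq.upper()
    usable = []
    for enzyme, site in CANDIDATE_SITES.items():
        if site not in seq and reverse_complement(site) not in seq:
            usable.append(enzyme)
    return usable


if __name__ == "__main__":
    # Toy sequences only: the first contains a BspQI site, the second contains neither site.
    print(usable_backbone_cutters("AAAGCTCTTCAAA"))
    print(usable_backbone_cutters("AAACCCGGGTTT"))
```

When both enzymes are usable, the text prefers the one with more cleavage sites on the general vector, which is a separate consideration not modeled here.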

[0083] [Table 4]

[0084] [Table 5]

[0085] [Table 6]

[0086] As shown in Figure 5, lane M is a 3000 bp double-stranded DNA marker. Lane 1 is the product of the purified prepared vector after TelN enzyme digestion, where the product contains a 1.8 kb target double-stranded DNA band and a 2 kb vector backbone double-stranded DNA band. Lane 2 is the product after digestion of the lane 1 product with BspQI enzyme, where the product contains the 1.8 kb target double-stranded DNA band and the vector backbone double-stranded DNA cut into five fragments. Lane 3 is the product after digestion of the lane 2 product with λ exonuclease, where the final product contains only the 1.8 kb target double-stranded DNA band.

[0087] As shown in Figure 6, the final product undergoes purity detection by Agilent Bioanalyzer capillary electrophoresis with a DNA12000 detection chip. The 50 bp and 17,000 bp peaks are internal reference peaks for chip detection, and the peak of the target sequence is the only sample peak present across the entire detection range, indicating 100% purity.

[0088] 3.4 Final purification of stable double-stranded DNA containing target sequence 2
The above reaction product may be recovered by phenol-chloroform extraction (Invitrogen®, product catalog number: 15593031) and by a DNA adsorption centrifugation column made of special material (QIAGEN-tip 100, Qiagen, product catalog number: 10043), followed by isopropanol (Sinopharm Chemical Reagent Co., Ltd., serial number 80109218) / ethanol (Sinopharm Chemical Reagent Corporation, serial number 10009257) precipitation. The isopropanol / ethanol precipitation method employed in the embodiment includes the following specific steps:
1) Add 0.7 volumes of isopropanol at normal temperature (for example, add 0.7 mL of isopropanol to 1 mL of the stable double-stranded DNA to be concentrated), mix uniformly, then centrifuge at 12,000–14,000 rpm at 4°C for 10 minutes, and carefully remove the supernatant without touching the precipitate;
2) Add 1 mL of 70% ethanol solution at normal temperature, gently resuspend the precipitate, and wash thoroughly;
3) Centrifuge at 12,000–14,000 rpm at 4°C for 5–10 minutes, and carefully remove the supernatant without touching the precipitate;
4) Centrifuge at 5,000–10,000 rpm at 4°C for 5–10 seconds, and carefully and completely remove the residual liquid with a 20 μL or 200 μL pipette without touching the precipitate; and
5) Once no liquid is visible to the naked eye (drying is generally complete within 1 minute after the liquid has been completely removed), add a suitable volume of solution (e.g., solution V, 10 mM Tris-Cl pH 8.5, or Milli-Q grade pure water) to dissolve the DNA.

[0089] In this embodiment, 1.2 mg of purified prepared vector containing the target sequence is used, where the length of the target sequence accounts for 42.5% of the total length of the prepared vector; therefore, the theoretical amount of stable double-stranded DNA product is 0.51 mg. After isopropanol precipitation and purification, the amount of stable double-stranded DNA containing only the target sequence in the obtained pure product is 0.26 mg according to OD260 UV absorption measurement (Nanodrop One, ThermoFisher), and the yield is 50.98%. By scaling up the reaction system, 4.62 mg of purified prepared vector containing the target sequence is processed at one time, ensuring that 1 mg of final product is obtained at one time. Proteases and salts in the reaction system are removed by purifying 1 mg or more of product through molecular sieve chromatography (chromatography column packing: Sepharose 6 Fast Flow, Cytiva) supported by high-performance liquid chromatography (AKTA Explorer 100, Cytiva).
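The scale-up figure in this paragraph follows from the target fraction and the measured yield: the vector mass needed for a given amount of final product is the desired mass divided by the product of the two. The sketch below reproduces that arithmetic with the values quoted in the text; the function name is hypothetical, and it assumes the yield stays the same when the reaction is scaled up.

```python
def vector_needed_mg(final_product_mg: float, target_fraction: float,
                     yield_fraction: float) -> float:
    """Vector mass required to obtain `final_product_mg` of purified target DNA.

    Assumes product mass = vector mass * target fraction * yield,
    with the yield unchanged at the larger scale.
    """
    return final_product_mg / (target_fraction * yield_fraction)


if __name__ == "__main__":
    needed = vector_needed_mg(final_product_mg=1.0,
                              target_fraction=0.425,
                              yield_fraction=0.5098)
    print(f"vector needed for 1 mg of product: {needed:.2f} mg")
```

With a 42.5% target fraction and a 50.98% yield this gives about 4.62 mg, matching the amount stated above.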

[0090] Example 4: Electroporation of target sequence 2 prepared in Example 3 into the HEK293T cell line for green fluorescent protein expression
4.1 Culture and preparation of HEK293T mammalian cell line
Take the culture medium (DMEM low glucose, Gibco®, catalog number: 11885084) out of the 4°C refrigerator and bring it to room temperature on a clean bench. Thaw the frozen vial of HEK293T cells (ATCC® CRL-3216®) at 37°C while shaking by hand. Add the 1 mL of cells in freezing medium to 5 mL of medium, centrifuge, and discard the supernatant to remove DMSO. Add 5 mL of fresh medium and culture in a 6 cm plate, then seed 1 × 10^6 cells into a 10 cm culture dish and culture for 48 hours.

[0091] 4.2 Electroporation of DNA into HEK293T cell line:
1. The cell count is 1.96 × 10^6 / mL, and the total volume is 5 mL.
2. Prepare 10 μL of electroporation solution + 1 μg / μL of target sequence encoding the green fluorescent protein gene, or 2 μL of the plasmid sample extracted according to Example 3.2.
3. Take 1 mL of the cells obtained in step 1, centrifuge at low speed, discard the supernatant, then add 10 μL of electroporation solution and resuspend.
4. Incubate the target sequence 2 or plasmid sample from step 2 and the cells from step 3, each at room temperature, for 10 minutes.
5. Mix the incubated sample and cells from step 4 uniformly, then incubate at room temperature for 10 minutes.
6. Electroporate each of the three sample groups at a potential of 420 V using the Celetrix CTX-1500 LE electroporator.
7. Culture the cells in a culture dish.
8. After culturing the cells for 48 hours, use a flow cytometer (CytoFLEX, BECKMAN COULTER) to determine the percentage of viable cells and the percentage of cells expressing green fluorescent protein.

[0092] 4.3 Test Results: As shown in Figure 7, cell viability was determined by flow cytometry, by gating on and counting the population of cells with normal morphology. The viability of cell samples obtained by electroporation of 2 μg of stable double-stranded DNA target sequence 2 was similar to that obtained by electroporation of 2 μg of plasmid and to that of the blank control group without any DNA electroporation.

[0093] As shown in Figure 8, to detect the expression level of green fluorescent protein, cells positive for green fluorescent protein expression were selected from the population of living cells identified via flow cytometry, and their percentage was determined. The mean GFP expression rate in cell samples obtained by electroporation of 2 μg of stable double-stranded DNA was 18.21%, and the mean GFP expression rate after electroporation of 2 μg of plasmid was 35.70%, while the mean GFP expression rate in the blank control group without any DNA electroporation was 0.36%. This indicates that the introduced stable double-stranded DNA can be expressed within the cell. The GFP expression rate in HEK293T cells obtained by electroporation of 2 μg of plasmid was approximately twice that of the same cells obtained by electroporation of 2 μg of stable double-stranded DNA, possibly because the supercoiled structure of the plasmid allows it to enter the cytoplasm more easily and undergo transient expression.
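
The "approximately twice" comparison in paragraph [0093] can be made explicit as follows. This Python sketch is illustrative only: the variable names are hypothetical, and subtracting the blank control rate as background is an assumption of this sketch, not a step described in the text.

```python
# Illustrative comparison of the mean GFP expression rates in paragraph [0093].
# Names are hypothetical; background subtraction is an assumption of this sketch.

rates_percent = {
    "stable_dsdna": 18.21,  # 2 ug of stable double-stranded DNA
    "plasmid": 35.70,       # 2 ug of plasmid
    "blank": 0.36,          # no DNA electroporated
}

# Ratio of the raw expression rates, as compared in the text.
fold_raw = rates_percent["plasmid"] / rates_percent["stable_dsdna"]

# Ratio after subtracting the blank control rate from both groups.
blank = rates_percent["blank"]
fold_corrected = (rates_percent["plasmid"] - blank) / (rates_percent["stable_dsdna"] - blank)

print(f"plasmid / dsDNA (raw): {fold_raw:.2f}")
print(f"plasmid / dsDNA (blank-subtracted): {fold_corrected:.2f}")
```

Both ratios are close to 2 (about 1.96 raw and 1.98 after blank subtraction), consistent with the statement in the text.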

[0094] Example 5: Co-electroporation of target sequence 2 prepared in Example 3 and CRISPR-Cas9 RNP complex into HEK293T cell line for site-specific genomic knock-in of the GFP expression gene. 5.1 Culture and preparation of the HEK293T mammalian cell line: HEK293T cells are revived by the same method as in Example 4.1.

[0095] 5.2 Electroporation of DNA into HEK293T cell line:
1. The cell count is 1.96 × 10^6 / mL, and the total volume is 5 mL.
2. Prepare 10 μL of electroporation solution + 1 μg / μL of stable double-stranded DNA target sequence encoding the green fluorescent protein gene, or 2 μL of the plasmid sample extracted in Example 2 or 3.2. For the gene knock-in group, add 0.5 μL 50 picomoles of Cas9 protein (GenCrispr Cas9-N-NLS nuclease, GenScript, catalog number Z03388) + 1 μL 200 picomoles of sgRNA (sequence number 11, designed using the target DNA sequence design website https://www.genscript.com/gRNA-design-tool.html: AGAGUCUCUCAGCUGGUACAguuuuagagcuaGAAAuagcaaguuaaaauaaggcuaguccguuaucaacuugaaaaaguggcaccgagucggugcuuuu, synthesized by Nanjing GenScript) and incubate at room temperature for 10 minutes. For the blank control group, Cas9 protein and sgRNA are not added.
3. Take 1 mL of the cells obtained in step 1, centrifuge at low speed, discard the supernatant, then add 10 μL of electroporation solution to resuspend the cells, and incubate at room temperature for 10 minutes.
4. Mix the samples from steps 2 and 3 uniformly, incubate at room temperature for 10 minutes, and then transfer the mixture to an electroporation cuvette.
5. Electroporate each of the three sample groups at a potential of 420 V using the Celetrix CTX-1500 LE electroporator.
6. Culture the cells in a culture dish.
7. After culturing the cells for 7 days, use a flow cytometer to determine the percentage of viable cells and the percentage of cells expressing green fluorescent protein.

[0096] 5.3 Test Results: As shown in Figure 9, cell viability was determined by flow cytometry, by gating on and counting the population of cells with normal morphology. Regardless of whether the Cas9 protein and sgRNA complex (RNP, ribonucleoprotein) was added or not, the viability of cell samples obtained by electroporation of 2 μg of stable double-stranded DNA was similar to that obtained by electroporation of 2 μg of plasmid and to that of the blank control group without any DNA electroporation.

[0097] To detect the expression level of green fluorescent protein, cells positive for green fluorescent protein expression were selected from the population of living cells identified via flow cytometry, and their percentage was determined. As shown in Figure 10, the mean GFP expression rate in the blank control group without any DNA electroporation was nearly 0, regardless of whether the Cas9 protein and sgRNA complex (RNP) was added or not. The mean GFP expression rates after electroporation of 2 μg of plasmid, with and without RNP, were very close, at 5.51% and 5.91%, indicating that the experimental group with the Cas9 and sgRNA RNP complex added did not undergo large-scale knock-in of the GFP gene fragment into the HEK293T cell genome and instead showed background GFP expression due to residual plasmid template. The mean GFP expression rates in cell samples after electroporation of 2 μg of stable double-stranded DNA were 13.35% and 4.92%, respectively. That is, 7 days after electroporation of the HEK293T cell line, GFP expression from stable double-stranded DNA was still 13.35% in the group with RNP, compared with 4.92% in the control group without RNP. This indicates that gene editing-based knock-in of the GFP encoding gene occurs, and that stable double-stranded DNA is more favorable for CRISPR-mediated site-specific knock-in, exhibiting lower template retention than plasmids. (While the green fluorescent protein expression background is slightly lower than with plasmids, CRISPR-mediated site-specific knock-in can be significantly improved.)
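
The comparison in paragraph [0097] can be summarized as the change in GFP-positive cells attributable to adding RNP. This Python sketch is illustrative only and its names are hypothetical. It assumes, as the text implies, that 13.35% is the stable double-stranded DNA group with RNP and 4.92% the group without RNP; the text does not state which plasmid value corresponds to which condition, so only the absolute difference is computed for the plasmid.

```python
# Illustrative summary of the day-7 GFP-positive rates in paragraph [0097].
# Names are hypothetical. Assumption: for stable dsDNA, 13.35% is the group
# with RNP and 4.92% the group without RNP, as the text implies.

dsdna_with_rnp = 13.35
dsdna_without_rnp = 4.92
plasmid_pair = (5.51, 5.91)  # order with/without RNP is not stated in the text

# Increase in GFP-positive cells when RNP is added to stable dsDNA.
dsdna_rnp_effect = dsdna_with_rnp - dsdna_without_rnp

# Absolute difference between the two plasmid groups.
plasmid_abs_diff = abs(plasmid_pair[0] - plasmid_pair[1])

print(f"stable dsDNA, RNP-dependent increase: {dsdna_rnp_effect:.2f} percentage points")
print(f"plasmid, absolute difference: {plasmid_abs_diff:.2f} percentage points")
```

Under these assumptions, the RNP-dependent increase is 8.43 percentage points for stable double-stranded DNA, against a 0.40 percentage point difference between the two plasmid groups.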

[0098] The implementation of the present invention is not limited to the embodiments described above. Those skilled in the art can make various modifications and improvements to the invention in form and detail, without departing from the spirit and scope of the invention, and these will be considered to fall within the scope of the invention.

Claims

1. A method for creating a target DNA sequence, comprising: a step of amplifying and extracting a DNA construct, comprising amplifying the introduced DNA construct by culturing host cells containing the DNA construct, which autonomously replicates and comprises (a) a cloning vector containing one or more IIS-type restriction endonuclease and meganuclease recognition sequences; (b) the target DNA sequence; and (c) protelomerase recognition sequences located at the two ends of the target DNA sequence, and extracting the amplified DNA construct from the host cells, wherein the IIS-type restriction endonuclease is BspQI, the meganuclease is I-sceI, and the cloning vector is constructed by performing the following reconstructions on the pUC57 vector: (i) the BspQI recognition sequence is added after the 1554th and 2539th bases of the pUC57 vector, and the I-sceI recognition sequence is added after the 1501st and 2479th bases; and (ii) the G at the 1397th base of the pUC57 vector is mutated to C, and the ATs at the 2136th and 2137th bases are mutated to GC; a first cleavage reaction step comprising bringing protelomerase into contact with the amplified and extracted DNA construct, wherein the protelomerase recognizes and cleaves the protelomerase recognition sequences on the DNA construct, thereby obtaining a first cleavage reaction product; a second cleavage reaction step comprising bringing the first cleavage reaction product into contact with one or more IIS-type restriction endonucleases and / or meganucleases, wherein the IIS-type restriction endonucleases and / or meganucleases recognize and cleave the IIS-type restriction endonuclease and / or meganuclease recognition sequences on the construct, thereby obtaining a second cleavage reaction product; and a digestion reaction step comprising bringing the second cleavage reaction product into contact with one or more exonucleases, which are T5 exonuclease or λ exonuclease, thereby digesting sequences other than the target DNA sequence.

2. The method according to claim 1, wherein the protelomerase is selected from protelomerases derived from Escherichia coli (E. coli) phage N15, Klebsiella phage phi K02, Yersinia phage Py54, Halomonas phage phi HAP, Vibrio phage VP882, and Borrelia burgdorferi plasmid lpB31.16.

3. The method according to claim 1 or 2, wherein the DNA construct comprises two or more IIS-type restriction endonuclease and meganuclease recognition sequences.

4. The method according to claim 3, wherein the DNA construct comprises 3, 4, 5, 6, 7, 8, 9, 10 or more IIS-type restriction endonuclease and meganuclease recognition sequences.

5. The method according to any one of claims 1 to 4, wherein the DNA construct includes a replication origin and a selection marker gene.

6. The method according to any one of claims 1 to 5, wherein after the first cleavage reaction and the second cleavage reaction are completed, no further purification step is included.

7. The method according to any one of claims 1 to 6, wherein the first cleavage reaction step is carried out isothermally at a temperature suitable for protelomerase activity, the second cleavage reaction step is carried out isothermally at a temperature suitable for the IIS-type restriction endonuclease and meganuclease, and / or the digestion reaction step is carried out isothermally at a temperature suitable for the exonuclease.

8. The method according to any one of claims 1 to 7, further comprising the steps of inactivating the protelomerase after the first cleavage reaction, inactivating the IIS-type restriction endonuclease and meganuclease after the second cleavage reaction, and / or inactivating the exonuclease after the digestion reaction.

9. The method according to any one of claims 1 to 8, wherein the DNA construct is constructed by a method comprising: (i) providing a cloning vector comprising one or more IIS-type restriction endonuclease and meganuclease recognition sequences; and (ii) inserting the target DNA sequence having two ends linked to the protelomerase recognition sequence into the cloning vector.

10. The method according to any one of claims 1 to 9, further comprising, after the digestion reaction step, a step of bringing a restriction endonuclease into contact with the digestion reaction product so that it recognizes and cleaves a further restriction endonuclease recognition sequence, thereby preparing a double-stranded target DNA fragment having blunt or cohesive ends.

11. The method according to any one of claims 1 to 9, further comprising, after the digestion reaction step, a step of bringing a nicking enzyme into contact with the digestion reaction product so that it recognizes and cleaves a further nicking enzyme recognition sequence, thereby preparing a DNA sequence whose two ends are closed-ended double-stranded DNA and whose intermediate portion is single-stranded DNA.

12. A method for expressing a target protein, comprising: obtaining a target DNA sequence by performing the method for producing a target DNA sequence according to any one of claims 1 to 11, wherein the target DNA sequence includes a DNA sequence encoding the target protein; introducing the obtained target DNA sequence into prokaryotic or eukaryotic cells; and incubating the prokaryotic or eukaryotic cells under conditions suitable for protein expression.

13. A method for integrating a target DNA sequence into a target integration site of a target genome, comprising: obtaining the target DNA sequence by performing the method for producing a target DNA sequence according to any one of claims 1 to 11, wherein the target DNA sequence comprises homology arm sequences of the target integration site at its two ends and a target knock-in fragment between them; introducing the obtained target DNA sequence, Cas9 protein, and sgRNA designed based on the target DNA sequence together into prokaryotic or eukaryotic cells; and incubating the prokaryotic or eukaryotic cells to incorporate the target DNA sequence into the target genome.

14. A cloning vector that is an autonomously replicating vector comprising (a) one or more IIS-type restriction endonuclease and meganuclease recognition sequences, and (b) a plurality of cloning sites, wherein the IIS-type restriction endonuclease recognition sequence is a BspQI recognition sequence, the meganuclease recognition sequence is an I-sceI recognition sequence, and the cloning vector is obtained by performing the following reconstruction on the pUC57 vector: (i) the BspQI recognition sequence is added after the 1554th and 2539th bases of the pUC57 vector, and the I-sceI recognition sequence is added after the 1501st and 2479th bases; and (ii) the G at the 1397th base of the pUC57 vector is mutated to C, and the ATs at the 2136th and 2137th bases are mutated to GC.

15. The cloning vector according to claim 14, comprising two or more IIS-type restriction endonuclease and meganuclease recognition sequences.

16. The cloning vector according to claim 15, comprising 3, 4, 5, 6, 7, 8, 9, 10 or more IIS-type restriction endonuclease and meganuclease recognition sequences.

17. A cloning vector according to any one of claims 14 to 16, comprising a replication origin and a selection marker gene.

18. A cloning vector according to any one of claims 14 to 17, comprising a lactose operon sequence, a gene sequence encoding a β-galactosidase including the plurality of cloning sites, and three or more BspQI recognition sequences and / or two or more I-sceI recognition sequences.

19. A cloning vector according to any one of claims 14 to 18, wherein its own nucleotide sequence includes a sequence such as that shown in Sequence ID No. 1.

20. A kit for preparing a target DNA sequence, comprising a cloning vector according to any one of claims 14 to 19, a protelomerase, one or more IIS-type restriction endonucleases and meganucleases, and one or more exonucleases, wherein the IIS-type restriction endonuclease is BspQI, the meganuclease is I-sceI, and the exonuclease is T5 exonuclease or λ exonuclease.