DNA ligase and use thereof

WO2026129256A1 · PCT designated stage · Publication Date: 2026-06-25 · BGI RESEARCH HANGZHOU +2


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: BGI RESEARCH HANGZHOU
Filing Date: 2024-12-19
Publication Date: 2026-06-25

Application Information

IPC: C12N9/00; C12N15/10; C12N15/33; C12N15/52; C12N15/63




Abstract

The present invention provides a novel DNA ligase comprising: an amino acid sequence as shown in SEQ ID NO: 2 or SEQ ID NO: 4; or an amino acid sequence that, compared with the amino acid sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4, has one or more amino acid residues substituted, deleted, and/or added and retains DNA ligation activity; or an amino acid sequence that has at least 70% identity with the amino acid sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4 and retains DNA ligation activity. The novel DNA ligase of the present invention has good ligation activity and can be applied in a variety of scenarios.


Description

DNA ligase and its applications

Technical Field

[0001] This invention belongs to the field of biotechnology and, more specifically, relates to DNA ligases and their applications.

Background Technology

[0002] DNA ligases primarily function to join broken ends in DNA molecules to form a complete DNA strand, and they play a crucial role in DNA replication, repair, and recombination. DNA ligases recognize the terminal sequences of DNA strands and, using adenosine triphosphate (ATP) or nicotinamide adenine dinucleotide (NAD+) as a cofactor, join the 5'-P end of one sequence to the 3'-OH end of another sequence through a phosphodiester bond, thereby forming a continuous DNA chain.

[0003] Common DNA ligases include T4 DNA ligase, T7 DNA ligase, and DNA ligase I. DNA ligases have a wide range of applications in molecular biology, such as ligating DNA fragments into cloning vectors and repairing damaged DNA.

[0004] Currently, the most widely used DNA ligase is T4 DNA ligase. Since its discovery in 1962, T4 DNA ligase has had a development history of some 60 years. With continuous advances in scientific research, higher demands have been placed on the ligation specificity, ligation efficiency, and substrate specificity of ligases. For example, some novel sequencing technologies, such as third-generation sequencing and single-molecule sequencing, place even higher requirements on the ligation reaction and call for more efficient and precise DNA ligases. However, relying solely on traditional directed evolution methods to modify the existing T4 DNA ligase offers limited room for improvement and cannot meet these ever-increasing demands.

Summary of the Invention

[0005] In view of the problems existing in the prior art, the present invention aims to provide a novel DNA ligase that should exhibit better activity and application potential, and has greater potential for commercialization and modification.

[0006] Therefore, in a first aspect, the present invention provides a DNA ligase comprising: a) an amino acid sequence as shown in SEQ ID NO: 2 or SEQ ID NO: 4; b) an amino acid sequence that, compared to the amino acid sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4, has one or more amino acid residues substituted, deleted, and/or added and retains DNA ligation activity; or c) an amino acid sequence that has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% sequence identity to the amino acid sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4 and retains DNA ligation activity.

[0007] In a second aspect, the present invention provides an isolated nucleic acid comprising a nucleotide sequence encoding a DNA ligase of the first aspect.

[0008] In a third aspect, the present invention provides an expression vector comprising the nucleic acid of the second aspect.

[0009] In a fourth aspect, the present invention provides a recombinant cell comprising the nucleic acid of the second aspect or the expression vector of the third aspect.

[0010] In a fifth aspect, the present invention provides a method for ligating DNA, the method comprising the steps of ligating a first DNA fragment to a second DNA fragment or ligating the 5' end and 3' end of a third DNA fragment together using a DNA ligase of the first aspect.

[0011] In a sixth aspect, the present invention provides a method for constructing a sequencing library, the method comprising the steps of using the DNA ligase of the first aspect to ligate a first DNA fragment to a second DNA fragment or to ligate the 5' end and 3' end of a third DNA fragment together.

[0012] In a seventh aspect, the present invention provides a kit comprising the DNA ligase of the first aspect, a reaction buffer, and instructions for use.

[0013] The beneficial effects of the present invention are at least in one or more of the following aspects:

[0014] The novel DNA ligase provided by this invention possesses both TA ligation activity and sticky end ligation activity, and can improve the circularization efficiency in nucleic acid single-strand circularization. This novel DNA ligase can be directly applied to a variety of different scenarios, and after sequence optimization it has significant potential for improvement and even greater application possibilities.

Attached Figure Description

[0015] Figure 1 shows the results of structural alignment between the predicted structural models of L27 DNA ligase and L32 DNA ligase and T4 DNA ligase; in Figure 1A, L27 DNA ligase (blue) and T4 DNA ligase (gray) are shown, and in Figure 1B, L32 DNA ligase (green) and T4 DNA ligase (gray) are shown.

[0016] Figure 2 shows the purification results of L27 DNA ligase and L32 DNA ligase. The three lanes from left to right are whole bacterial culture, supernatant, and crude purified protein after Ni column elution.

[0017] Figure 3 is a schematic diagram of the determination of DNA ligase single-strand circularization efficiency.

Detailed Implementation

[0018] The present invention will be described in detail below. It should be understood that the following description is merely illustrative and is not intended to limit the scope of the invention; the scope of protection of the invention is defined by the appended claims. Furthermore, those skilled in the art will understand that modifications can be made to the technical solutions of the present invention without departing from its spirit and intent. Unless otherwise specified, the technical means used in the embodiments are conventional means well known to those skilled in the art.

[0019] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the subject matter pertains. Before a detailed description of the invention, the following definitions are provided to better understand it.

[0020] In the context of this invention, many embodiments use the expressions "comprising," "including," or "basically / mainly composed of...". The expressions "comprising," "including," or "basically / mainly composed of..." are generally understood as open-ended expressions, indicating that they include not only the elements, components, parts, or method steps specifically listed after the expression, but also other elements, components, parts, or method steps. Additionally, in this document, the expressions "comprising," "including," or "basically / mainly composed of..." can also be understood as closed-ended expressions in certain circumstances, indicating that they include only the elements, components, parts, or method steps specifically listed after the expression, and exclude any other elements, components, parts, or method steps. Furthermore, in the context of this invention, many embodiments use the expression "composed of...", which should be understood as a closed-ended expression, indicating that it includes only the elements, components, parts, or method steps specifically listed after the expression, and excludes any other elements, components, parts, or method steps.

[0021] As used herein and in the appended claims, unless the context clearly specifies otherwise, the indefinite articles (“a”, “an”) and definite articles (“the”) in the singular form include the referent in the plural form. Similarly, the terms indefinite article (“a”, “an”), “one or more”, and “at least one” are used interchangeably herein.

[0022] In cases where a numerical range is provided, such as a concentration range, a percentage range, or a ratio range, it should be understood that, unless the context explicitly states otherwise, all intermediate values separated by one-tenth of the lower limit unit between the upper and lower limits of the range, as well as any other values or intermediate values within the range, are included in the subject matter.

[0023] In order to better understand the invention and without limiting its scope, unless otherwise indicated, all figures and other numerical values used in the specification and claims to express quantities, percentages or proportions should in all cases be understood to be modified by the term "about".

[0024] In this document, a reference to a nucleic acid sequence includes not only the nucleic acid sequence itself, but also its reverse complementary sequence and the complementary double-stranded sequence formed by the two. Those skilled in the art will understand how to derive the reverse complementary sequence from a nucleic acid sequence. A function attributed to a sequence herein covers both the case in which the sequence itself possesses that function and the case in which its reverse complementary sequence possesses it.

[0025] In this document, unless it contradicts common knowledge in the field, a reference to a nucleic acid sequence is equivalent to a reference to any one or more of the corresponding DNA, RNA, DNA double strand, RNA double strand, and DNA-RNA double strand.
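As an illustration of the derivation mentioned in this paragraph, the following minimal sketch computes the reverse (inverse) complementary sequence of a DNA sequence written 5' to 3'. The function name is hypothetical and is not taken from the application; only the four standard bases are handled.

```python
# Minimal sketch: reverse complement of a DNA sequence written 5'->3'.
# The helper name is illustrative; only A, C, G, T (either case) are handled.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the strand direction."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("CTGACT"))  # AGTCAG
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation.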

[0026] As used herein, unless otherwise specified, “G”, “C”, “A”, “T” and “U” in a nucleotide sequence typically represent nucleotides containing guanine, cytosine, adenine, thymine and uracil as bases, respectively.

[0027] As used herein, unless otherwise specified, amino acids are generally represented by single-letter or three-letter abbreviations known in the art. For example, alanine can be represented by Ala or A, glycine by Gly or G, valine by Val or V, leucine by Leu or L, isoleucine by Ile or I, proline by Pro or P, phenylalanine by Phe or F, tyrosine by Tyr or Y, tryptophan by Trp or W, serine by Ser or S, threonine by Thr or T, cysteine by Cys or C, methionine by Met or M, asparagine by Asn or N, glutamine by Gln or Q, aspartic acid by Asp or D, glutamic acid by Glu or E, lysine by Lys or K, arginine by Arg or R, and histidine by His or H.

[0028] In this document, the term "identity" refers to the degree of sequence matching between two polypeptides or two nucleic acids. When a position in both compared sequences is occupied by the same base or amino acid monomer subunit (e.g., a position in each of two DNA molecules is occupied by adenine, or a position in each of two polypeptides is occupied by lysine), the sequences are identical at that position. The "percentage identity" between two sequences is the number of matching positions shared by the two sequences divided by the number of positions compared, multiplied by 100. For example, if six out of ten positions in two sequences match, the two sequences have 60% identity. Likewise, the DNA sequences CTGACT and CAGGTT have 50% identity (three out of six positions match). Typically, two sequences are aligned so as to produce the maximum identity. Such an alignment can be obtained, for example, by the method of Needleman et al. (1970) J. Mol. Biol. 48: 443-453, which can be conveniently performed with a computer program such as the Align program (DNAstar, Inc.). The algorithm of E. Meyers and W. Miller (Comput. Appl. Biosci., 4:11-17 (1988)), which has been integrated into the ALIGN program (version 2.0), can also be used to determine the percentage identity between two amino acid sequences, using the PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4.
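The percentage identity defined above can be sketched as follows for two sequences that have already been aligned to the same length. This is an illustrative helper only: it does not perform the alignment itself (Needleman-Wunsch or ALIGN would do that), and the treatment of gap characters ("-" counted as a compared position that never matches) is an assumption, since the paragraph does not specify it.

```python
# Minimal sketch: percentage identity of two pre-aligned sequences.
# Assumption: "-" marks a gap, counts as a compared position, and never matches.
def percent_identity(a: str, b: str) -> float:
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


print(percent_identity("CTGACT", "CAGGTT"))  # 50.0
```

The printed value reproduces the worked example in the paragraph: positions 1, 3, and 6 match, giving three of six.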

[0029] In this document, the term "DNA ligation activity" refers to the activity of being able to ligate two strands of DNA with sticky ends together, the activity of being able to ligate a DNA fragment with an adenine (A) deoxynucleotide at its 3' end to a DNA fragment with a thymine (T) deoxynucleotide at its 3' end, and/or the activity of being able to ligate the 5' end and 3' end of a single strand of DNA together.

[0030] In this document, the terms “first,” “second,” “third,” or “fourth,” etc., are used only to distinguish the various elements, components, parts, and method steps involved in this invention, and do not imply the order or priority of these elements, components, parts, and method steps, unless the context clearly describes or can be determined from the context.

[0031] As used herein, the term “degenerate sequence” refers to the phenomenon that the same amino acid is encoded by two or more codons, which includes all possible sequences of different base sequences that encode a single amino acid.
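To make the notion of degenerate sequences concrete, the sketch below enumerates every nucleotide sequence that encodes a given short peptide under the standard genetic code. The function and variable names are hypothetical; the codon table is built from the conventional TCAG ordering of the standard code, and the approach is only practical for short peptides because the number of sequences grows multiplicatively.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (TCAG order).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (single-letter code, "*" = stop) to its codons.
CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(codon))


def degenerate_sequences(peptide: str) -> list:
    """Return all DNA sequences encoding the peptide (illustrative helper)."""
    choices = [CODONS[aa] for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]


print(degenerate_sequences("MK"))  # ['ATGAAA', 'ATGAAG']
```

Methionine and tryptophan each have a single codon, whereas leucine, serine, and arginine each have six, so the count of degenerate sequences depends strongly on composition.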

[0032] As described in the background section of this application, there is an urgent need in the art for a more efficient and precise DNA ligase. However, the space for modifying existing T4 DNA ligases by simply relying on traditional directed evolution methods is limited and cannot meet the growing demand.

[0033] Therefore, to address the problems existing in the prior art, the inventors collected deep-sea sediment samples from the western Black Sea, extracted nucleic acids, and performed metagenomic sequencing, sequence alignment, and structural model alignment. This work identified two proteins, L27 (SEQ ID NO: 2) and L32 (SEQ ID NO: 4), which were subsequently verified to be novel DNA ligases. These ligases possess TA ligation activity and sticky end ligation activity, and can improve circularization efficiency in nucleic acid single-strand circularization. They can be directly applied to various scenarios and have significant potential for improvement and further application after sequence optimization. Thus, this invention was completed.

[0034] Therefore, in a first aspect, the present invention provides a DNA ligase comprising: a) an amino acid sequence as shown in SEQ ID NO: 2 or SEQ ID NO: 4; b) an amino acid sequence having substitutions, deletions, and/or additions of one or more amino acid residues compared to the amino acid sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4, and retaining DNA ligation activity; or c) an amino acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% sequence identity compared to the amino acid sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4, and retaining DNA ligation activity.

[0035] Sequence alignment revealed that the DNA ligases with the amino acid sequences shown in SEQ ID NO: 2 or SEQ ID NO: 4 share less than 35% sequence identity with the T4 DNA ligase. However, structural model comparison showed that these two DNA ligases have highly homologous three-dimensional structures with the T4 DNA ligase. Experiments showed that these two DNA ligases have similar functions to the T4 DNA ligase, capable of joining two DNA fragments together and joining single-stranded DNA ends together to form a circular DNA chain.

[0036] In some embodiments, the DNA ligase further comprises a purification tag, an epitope tag, and/or a solubilization tag.

[0037] In some embodiments, the purification tag, epitope tag, and/or solubilization tag are attached to the N-terminus or C-terminus of the DNA ligase.

[0038] In some embodiments, the purification tag is selected from at least one of Poly-Arg, Poly-His, Strep-TagⅡ, S-tag, FLAG, and GFP.

[0039] In some embodiments, the epitope tag is selected from at least one of C-myc, HA, V5, and VSV-G.

[0040] In some embodiments, the solubilizing tag is selected from at least one of Trx, SUMO, GST, MBP, and NusA.

[0041] Those skilled in the art will understand that additional tag sequences are added to the N-terminus or C-terminus of the amino acid sequence of the DNA ligase in order to improve the expression level, solubility, or stability of the exogenous protein, or to facilitate purification after protein expression. These additional tag sequences do not affect the functional activity of the DNA ligase itself.

[0042] In a second aspect, the present invention provides an isolated nucleic acid comprising a nucleotide sequence encoding a DNA ligase of the first aspect.

[0043] In some embodiments, the nucleic acid comprises a nucleotide sequence or a degenerate sequence thereof as shown in SEQ ID NO: 1 or SEQ ID NO: 3.

[0044] In a third aspect, the present invention provides an expression vector comprising the nucleic acid of the second aspect.

[0045] In some embodiments, the expression vector is a plasmid, such as pET28A, pET30A, pGEX-4T-1, or pET SUMO.

[0046] In a fourth aspect, the present invention provides a recombinant cell comprising the nucleic acid of the second aspect or the expression vector of the third aspect.

[0047] In some implementations, the recombinant cells are eukaryotic cells such as plant cells or animal cells, or prokaryotic cells such as Escherichia coli.

[0048] In some preferred embodiments, the recombinant cells are Escherichia coli BL21(DE3) strain.

[0049] Through genetic engineering, nucleic acids encoding DNA ligase L27 or L32, or expression vectors containing such nucleic acids, can be transferred into suitable cells. These cells can express the DNA ligases in large quantities under suitable conditions and can be used as bioreactors.

[0050] In a fifth aspect, the present invention provides a method for ligating DNA, the method comprising the steps of ligating a first DNA fragment to a second DNA fragment or ligating the 5' end and 3' end of a third DNA fragment together using a DNA ligase of the first aspect.

[0051] In a sixth aspect, the present invention provides a method for constructing a sequencing library, the method comprising the steps of using the DNA ligase of the first aspect to ligate a first DNA fragment to a second DNA fragment or to ligate the 5' end and 3' end of a third DNA fragment together.

[0052] The following description applies to the fifth and sixth aspects of the present invention.

[0053] In some embodiments, the first DNA fragment comprises a complementary first strand and a second strand, wherein the first strand protrudes 1-6 (e.g., 1, 2, 3, 4, 5, or 6) deoxynucleotides at its 3' end relative to the second strand, and the second DNA fragment comprises a complementary third strand and a fourth strand, wherein the third strand protrudes 1-6 (e.g., 1, 2, 3, 4, 5, or 6) deoxynucleotides at its 3' end relative to the fourth strand, and the 1-6 deoxynucleotides protruding at the 3' end of the first strand are complementary to the 1-6 deoxynucleotides protruding at the 3' end of the third strand, such that, under the action of the DNA ligase, the first strand is ligated to the fourth strand, and the second strand is ligated to the third strand.

[0054] Those skilled in the art will understand that in order to join two DNA fragments together, the ends used for joining need to have free groups for carrying out the joining. For example, the 5' end of the DNA strand used for joining needs to have a free phosphate group, and the 3' end needs to have a free hydroxyl group.

[0055] Taking the above-described implementation as an example, the last deoxyribonucleotide at the 3' end of the first strand and the last deoxyribonucleotide at the 3' end of the third strand should have a free hydroxyl group, and the first deoxyribonucleotide at the 5' end of the second strand and the first deoxyribonucleotide at the 5' end of the fourth strand should have a free phosphate group. Thus, DNA ligase can catalyze the formation of a phosphodiester bond between the phosphate group and the hydroxyl group, thereby linking the first strand and the fourth strand together, and linking the second strand and the third strand together.
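The complementarity requirement on the two 3' overhangs described above can be checked with a small sketch. It assumes both overhangs are written 5' to 3' and treats them as compatible when one is the reverse complement of the other; the function names are hypothetical, and the 1 to 6 nucleotide length limit is taken from the embodiment described above.

```python
# Minimal sketch: do two 3' overhangs (each written 5'->3') anneal to each other?
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def overhangs_compatible(first_overhang: str, third_overhang: str) -> bool:
    """True when both overhangs are 1-6 nt long and antiparallel-complementary."""
    a, b = first_overhang.upper(), third_overhang.upper()
    if not (1 <= len(a) <= 6 and len(a) == len(b)):
        return False
    return a == reverse_complement(b)


print(overhangs_compatible("A", "T"))    # True  (single-base TA pairing)
print(overhangs_compatible("AC", "GT"))  # True
print(overhangs_compatible("AC", "TG"))  # False
```

The check covers base pairing only. As noted in the preceding paragraphs, ligation additionally requires a free 5' phosphate and a free 3' hydroxyl at each junction.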

[0056] In one implementation, the first DNA fragment is a target nucleic acid fragment, and the second DNA fragment is a sequencing adapter. A sequencing library can be constructed by ligating the two DNA fragments together using the DNA ligase, followed by subsequent sequencing on a suitable sequencing platform.

[0057] The method of the present invention can be a method for preparing TA clones. Therefore, in some preferred embodiments, the first strand protrudes an adenine (A) deoxynucleotide at its 3' end relative to the second strand, the second strand protrudes an adenine (A) deoxynucleotide at its 3' end relative to the first strand, the third strand protrudes a thymine (T) deoxynucleotide at its 3' end relative to the fourth strand, and the fourth strand protrudes a thymine (T) deoxynucleotide at its 3' end relative to the third strand.

[0058] In one specific implementation, the first DNA fragment may be a target nucleic acid fragment, and the second DNA fragment may be a TA cloning vector.

[0059] Similarly, in order to join these two DNA fragments together, the ends of the two DNA fragments need to have groups for carrying out the ligation. Specifically, the first and second strands of the first DNA fragment have free phosphate groups at their 5' ends and free hydroxyl groups at their 3' ends, and the third and fourth strands of the second DNA fragment have free phosphate groups at their 5' ends and free hydroxyl groups at their 3' ends. Thus, DNA ligase can catalyze the formation of phosphodiester bonds between the phosphate and hydroxyl groups, thereby joining the first and fourth strands end-to-end, and joining the second and third strands end-to-end, to form a double-stranded circular DNA, or TA clone.

[0060] In some embodiments, the third DNA fragment is a single-stranded DNA fragment with a first adapter and a second adapter attached to its 5' end and 3' end, respectively, such that the DNA ligase ligates the 5' end and 3' end of the third DNA fragment together in the presence of splint oligonucleotides, wherein the splint oligonucleotides are complementary to the first adapter and the second adapter.

[0061] Similarly, the third DNA fragment needs to have a group at its end for ligation. Specifically, the first adapter has a free phosphate group at its 5' end, and the second adapter has a free hydroxyl group at its 3' end. Thus, under the action of DNA ligase, the phosphate group at the 5' end of the first adapter and the hydroxyl group at the 3' end of the second adapter will form a phosphodiester bond, or the phosphate group at the 5' end of the first adapter and the hydroxyl group at the 3' end of the second adapter can be linked by adding additional dNTPs or oligonucleotide fragments, thereby ligating the third DNA fragment end-to-end.

[0062] Furthermore, to bring the 5' and 3' ends of the third DNA fragment close together for ligation, a splint oligonucleotide can be introduced into the reaction system. This splint oligo comprises at least two parts: a first part complementary to the first adapter and a second part complementary to the second adapter. Thus, under the action of the splint oligonucleotide, the first and second adapters on the third DNA fragment can spatially approach each other and ligate together under the action of the DNA ligase, ultimately forming a circular third DNA fragment.
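The two-part splint described above can be sketched in code. The sketch assumes the first adapter sits at the 5' end of the single strand and the second adapter at its 3' end, both written 5' to 3', so that the circularized junction reads (tail of second adapter)(head of first adapter). The adapter sequences and the arm length below are made-up placeholders, not sequences from the application.

```python
# Minimal sketch: derive a splint oligo that bridges the 3' end of the second
# adapter and the 5' end of the first adapter. All sequences are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def design_splint(first_adapter: str, second_adapter: str, arm: int = 12) -> str:
    """Return a splint (5'->3') pairing with `arm` bases on each side of the junction."""
    if arm < 1 or arm > min(len(first_adapter), len(second_adapter)):
        raise ValueError("arm length must fit inside both adapters")
    # Junction as read 5'->3' on the circularized strand.
    junction = second_adapter[-arm:] + first_adapter[:arm]
    return reverse_complement(junction)


print(design_splint("AAGTCGGA", "CTGATAAG", arm=4))  # ACTTCTTA
```

In the output, the first four bases pair with the head of the first adapter and the last four pair with the tail of the second adapter, which holds the 5' phosphate and 3' hydroxyl side by side for the ligase.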

[0063] In one embodiment, the ligation product can be amplified after the first DNA fragment and the second DNA fragment are ligated together, or after the 5' end and 3' end of the third DNA fragment are ligated together. For amplification of the ligation product, conventional methods for amplifying DNA in the art, such as PCR amplification, can be used to obtain a sufficient quantity of ligation product for subsequent operations such as library construction and sequencing.

[0064] In some embodiments, the concentration of the DNA ligase used is 0.1 mg/mL to 1 mg/mL, for example 0.1 mg/mL, 0.2 mg/mL, 0.3 mg/mL, 0.4 mg/mL, 0.5 mg/mL, 0.6 mg/mL, 0.7 mg/mL, 0.8 mg/mL, 0.9 mg/mL, or 1 mg/mL.

[0065] In a preferred embodiment, the concentration of the DNA ligase used is 0.3 mg/mL or 0.4 mg/mL.

[0066] In some embodiments, the first DNA fragment and the second DNA fragment, or the 5' end and the 3' end of the third DNA fragment, are ligated at a reaction temperature of 4°C to 42°C (e.g., 10°C, 15°C, 20°C, 25°C, 30°C, 35°C, 36°C, 37°C, 38°C, 39°C, 40°C, 41°C, or 42°C).

[0067] In a preferred embodiment, the reaction temperature is 25°C or 37°C.

[0068] In some embodiments, the ligation reaction is carried out for 30 minutes to 2 hours, for example 30 minutes, 40 minutes, 1 hour, 1.5 hours, or 2 hours.
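The enzyme concentration, temperature, and reaction time ranges given in the preceding embodiments can be collected into a simple check, for example when planning a reaction setup. The class and field names are hypothetical; only the numeric ranges come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LigationConditions:
    """Illustrative container; ranges below are those stated in the embodiments."""
    ligase_mg_per_ml: float
    temperature_c: float
    minutes: float

    def problems(self) -> List[str]:
        issues = []
        if not 0.1 <= self.ligase_mg_per_ml <= 1.0:
            issues.append("ligase concentration outside 0.1-1 mg/mL")
        if not 4 <= self.temperature_c <= 42:
            issues.append("temperature outside 4-42 degrees C")
        if not 30 <= self.minutes <= 120:
            issues.append("reaction time outside 30 min-2 h")
        return issues


print(LigationConditions(0.3, 25, 60).problems())  # []
```

An empty list means the setup falls within all three stated ranges; the preferred values (0.3 or 0.4 mg/mL, 25°C or 37°C) pass the check.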

[0069] In a seventh aspect, the present invention provides a kit comprising the DNA ligase of the first aspect, a reaction buffer, and instructions for use.

[0070] In some embodiments, the reaction buffer contains Tris-HCl, Mg2+, ATP, and dithiothreitol.

[0071] In some embodiments, the reaction buffer may be a reaction buffer suitable for other DNA ligases, such as T4 DNA ligase.

[0072] Example

[0073] The embodiments of the present invention will be described in detail below with reference to examples. Those skilled in the art will understand that the following examples are merely illustrative and should not be considered as limiting the scope of the present invention. Where specific techniques or conditions are not specified in the examples, they are performed according to the techniques or conditions described in the literature in the art or according to the product instructions. Reagents or instruments whose manufacturers are not specified are all conventional products that can be obtained commercially.

[0074] Example 1 - Sequence Alignment

[0075] Using the Clustal Omega online sequence alignment website (http://www.clustal.org), proteins L27 and L32 were each aligned with T4 DNA ligase. The alignment results are as follows:

[0076] The above comparison results show that the sequence similarity between proteins L27 and L32 and T4 DNA ligase is 29.91% and 33.33%, respectively, and the sequence similarity between proteins L27 and L32 is 35.14%. It is evident that the sequence similarity between proteins L27 and L32, as well as between these two proteins and T4 DNA ligase, is very low.

[0077] Example 2 - Structural Model Comparison

[0078] The inventors further predicted the protein structure of proteins L27 and L32 using an AlphaFold3-based protein structure model, and then compared the predicted structure models with the T4 DNA ligase structure. The results are shown in Figure 1. The figure shows that the TM-scores of proteins L27 and L32 with the T4 DNA ligase are 0.87 and 0.68, respectively. Combined with the sequence alignment results from Example 1, it can be seen that although the sequence similarity of proteins L27 and L32 with the T4 DNA ligase is low, their three-dimensional structures are highly homologous. Therefore, it is speculated that they likely have the same or similar functions as the T4 DNA ligase.
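For readers unfamiliar with the TM-score reported above, the sketch below evaluates the standard TM-score expression for one fixed superposition, given the distances between aligned residue pairs. The application does not state which tool produced its values, so this is a generic illustration: the function name is hypothetical, the 0.5 Å lower bound on the scale d0 is an assumption added as a guard for short chains, and a full TM-score calculation would additionally search over superpositions for the maximum.

```python
# Minimal sketch: TM-score for a fixed superposition.
#   TM = (1 / L_target) * sum_i 1 / (1 + (d_i / d0)^2)
#   d0 = 1.24 * (L_target - 15)^(1/3) - 1.8   (distances in angstroms)
def tm_score(distances, target_length: int) -> float:
    if target_length <= 15:
        raise ValueError("target_length must exceed 15 for the d0 formula")
    # Assumed lower bound of 0.5 on d0, used here as a guard for short chains.
    d0 = max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


print(tm_score([0.0] * 100, 100))  # 1.0 (every residue aligned perfectly)
print(tm_score([0.0] * 50, 100))   # 0.5 (only half of the residues aligned)
```

Because each term is bounded by 1 and the sum is normalized by the target length, the score lies between 0 and 1, with higher values indicating closer structural agreement.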

[0079] Example 3 - Construction, expression, and purification of recombinant plasmids

[0080] We commissioned Changzhou Xinyi Biotechnology Co., Ltd. to synthesize the gene sequences of proteins L27 and L32, and cloned their genes into the pET28A expression vector at the Nde I and Xho I cloning sites.

[0081] The specific protein expression and purification steps include:

[0082] 1) The above recombinant plasmid was transformed into Escherichia coli BL21(DE3) competent cells and incubated overnight at 37°C on a plate.

[0083] 2) The next day, pick a single colony and inoculate it into 12 mL of LB medium, and incubate at 37°C until OD600 = 0.8 to 1, then add isopropyl-β-D-thiogalactoside (IPTG) to a final concentration of 0.5 mM and induce overnight at 16 °C.

[0084] 3) The next day, the cells were centrifuged at 4000×g for 10 min to collect the bacteria. The cells were then resuspended in 10 mL of lysis buffer (50 mM Tris, 500 mM NaCl, 10 mM imidazole, 5% glycerol, pH 7.8), and then sonicated for 30 min using an eight-probe ultrasonic disruptor to obtain a whole bacterial culture. The whole bacterial culture was further centrifuged to obtain the supernatant.

[0085] 4) Add the supernatant to 0.5 mL of Ni packing material, incubate at 4 °C for 30 min, then centrifuge at 500 rpm for 3 min. Take 1 mL of the solution containing Ni packing material from the bottom and place it in a 96-well filter plate. Wash 3 times with 1 mL Ni-A buffer and 3 times with 1 mL Ni-C buffer. Then elute with 0.5 mL Ni-B buffer. Perform ultrafiltration four times with dialysis buffer. Concentrate the crude purified protein in a 50 kDa concentrator tube, add an equal volume of glycerol, and store at -20 °C for functional activity verification.
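As a small worked example for the induction in step 2), the volume of IPTG stock needed to reach 0.5 mM in a 12 mL culture follows from C1·V1 = C2·V2. The stock concentration is not given in the text; the 1 M (1000 mM) stock used below is an assumption, and the function name is hypothetical.

```python
# Minimal sketch: volume of stock solution needed for a target final concentration.
# Ignores the small volume contributed by the added stock itself.
def stock_volume_ul(final_mm: float, culture_ml: float, stock_mm: float) -> float:
    if stock_mm <= 0:
        raise ValueError("stock concentration must be positive")
    return final_mm * culture_ml * 1000.0 / stock_mm


# Assumed 1 M (1000 mM) IPTG stock; 0.5 mM final in 12 mL of culture.
print(stock_volume_ul(0.5, 12, 1000))  # 6.0 (microliters)
```

With a different stock concentration the result scales inversely; for example, a 100 mM stock would require ten times the volume.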

[0086] The specific formulation of the buffer solution used in the above purification steps is as follows:

[0087] Ni-A buffer: 50mM Tris, 500mM NaCl, 10mM imidazole, 5% glycerol, pH 7.8;

[0088] Ni-B buffer: 50mM Tris, 500mM NaCl, 500mM imidazole, 5% glycerol, pH 7.8;

[0089] Ni-C buffer: 50mM Tris, 500mM NaCl, 50mM imidazole, 5% glycerol, pH 7.8;

[0090] Dialysis buffer: 40 mM Tris, 200 mM KCl, 2 mM dithiothreitol (DTT), 0.2 mM disodium ethylenediaminetetraacetate (EDTA·2Na), 5% glycerol, pH 8.0.

[0091] Polyacrylamide gel electrophoresis was used to detect the ultrasonically disrupted whole bacterial culture, the supernatant after centrifugation of the whole bacterial culture, and the crude purified protein after Ni column purification. The results are shown in Figure 2. As can be seen from the figure, most of the protein is expressed in a soluble form, and the molecular weights of proteins L27 and L32 are approximately 53 kDa and 54 kDa, respectively.

[0092] Example 4 - TA Ligation Activity Assay

[0093] Proteins L27 and L32 were each added to the TA ligation system shown in Table 1 below. The reaction was performed in a PCR instrument at 25℃ for 2 hours with the heated lid at 35℃. A negative control (NC) was set up in which water replaced the test protein, with all other conditions kept the same. After the reaction, the reaction solution was transformed into DH5α competent cells, and the ligase activity of the test proteins was assessed by the number of colonies.

[0094] Table 1: TA Ligation System

[0095] The results are shown in Table 2 below. It can be seen that proteins L27 and L32 both possess TA ligation activity. Therefore, it can be preliminarily concluded that proteins L27 and L32 can function as DNA ligases.

[0096] Table 2: Number of TA-ligated transformed colonies

[0097] Example 5 - Single-strand circularization activity assay

[0098] DNA single-strand circularization involves denaturing adapter-flanked double-stranded DNA (dsDNA) at high temperature to form single-stranded DNA (ssDNA). A splint oligo then base-pairs with both ends of the ssDNA, and a ligase joins the two ends to form a single-stranded circular DNA molecule, which can be used to construct single-stranded circular DNA libraries for MGI high-throughput sequencers. Sample processing and single-strand circularization followed the standard procedure in the "MGIEasy Circularization Kit Instruction Manual 7.0" (https://www.mgitech.cn/Uploads/Temp/file/20240724/66a0786c23a4a.pdf). The entire process includes substrate preparation, the single-strand circularization reaction, single-strand circularization product labeling, and calculation of single-strand circularization efficiency. A schematic diagram of single-strand circularization using DNA ligase is shown in Figure 3.
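
The role of the splint oligo can be sketched in code. The snippet below is an illustration only: the sequences, the 6-nt arm length, and the function names are hypothetical and are not taken from the source or from the MGIEasy kit. It builds a splint that is complementary to the junction formed when the 3' end of an ssDNA meets its own 5' end, then checks whether a given splint bridges both ends.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_splint(ssdna, arm):
    """Build a splint (5'->3') that pairs with `arm` nt at each end of the ssDNA.

    In the circle, the last `arm` bases (3' end) are followed directly by the
    first `arm` bases (5' end); the splint is the reverse complement of that junction.
    """
    if arm <= 0 or 2 * arm > len(ssdna):
        raise ValueError("arm must be positive and at most half the ssDNA length")
    junction = ssdna[-arm:] + ssdna[:arm]
    return reverse_complement(junction)


def splint_bridges_ends(ssdna, splint, arm):
    """True if the splint base-pairs with both the 3' and 5' ends of the ssDNA."""
    return splint == design_splint(ssdna, arm)


if __name__ == "__main__":
    # Hypothetical adapter-flanked single strand (5'->3'); not a real adapter sequence.
    ssdna = "AAGTCGGA" + "TTTTCCCCGGGG" + "CAACTCCT"
    splint = design_splint(ssdna, arm=6)
    print(splint)
    print(splint_bridges_ends(ssdna, splint, arm=6))
```

Only exact Watson-Crick pairing is checked here; a real splint design would also consider arm length, melting temperature, and the 5' phosphate required for ligation.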

[0099] The specific process is as follows:

[0100] 5.1 Preparation of Circularization Substrates

[0101] Prepare the PCR amplification system as shown in Table 3, using the primer sequences shown in Table 4. Perform amplification in the PCR instrument using the program shown in Table 5, with the heated lid at 105℃.

[0102] Table 3: Amplification system used for preparing circularization substrates

[0103] Table 4: Primer Sequences

[0104] Table 5: Amplification program for preparing circularization substrates

[0105] PCR amplification products were extracted using a gel extraction kit (OMEGA, Code No. D2500-02), and the extracted products were dissolved in TE buffer. The concentration was then quantified using the Qubit dsDNA HS Assay Kit (Invitrogen, Code No. Q32854). This is the substrate used for subsequent single-strand circularization.

[0106] Transfer 150 ng of circularization substrate into a 0.2 mL PCR tube and dilute it to 23.1 μL with T4 DNA ligase buffer. Preheat the PCR instrument to 95 °C with the heated lid at 105 °C, place the tube in the instrument, incubate at 95 °C for 3 min, and then immediately place it on ice.

[0107] 5.2 Single-strand circularization

[0108] Prepare the single-strand circularization mixture in advance according to Table 6.

[0109] Table 6: Single-strand circularization mixture

[0110] A negative control (NC) group was set up in which water replaced the ligase, with all other conditions kept the same. 6.8 μL of the single-strand circularization mixture was added to the denatured single-stranded product and mixed well, and the ligation reaction was performed in a PCR instrument at 37℃ for 30 min with the heated lid at 50℃.

[0111] 5.3 Enzymatic digestion

[0112] Prepare the enzyme digestion mixture in advance according to Table 7.

[0113] Table 7: Enzyme digestion mixture

[0114] Add 1.825 μL of the above enzyme digestion mixture to the reaction product, mix well, and then perform the enzyme digestion reaction in a PCR instrument at 37℃ for 30 min with the heated lid at 50℃.

[0115] Reaction termination and quantification: After the reaction was complete, 3.3 μL of 0.5 M EDTA was added directly to the reaction tube, vortexed 6 times for 3 seconds each time, and then briefly centrifuged to collect the reaction solution to the bottom of the tube. The concentration of single-stranded products in each reaction system was then determined using the Qubit ssDNA Assay Kit (Invitrogen, Code No. Q10212).

[0116] 5.4 Calculation of Single-Strand Circularization Efficiency

[0117] The product obtained after enzyme digestion and purification was quantified with the Qubit ssDNA Assay Kit (Invitrogen, Code No. Q10212) according to the kit instructions. The theoretical ssDNA concentration (ng/μL) was calculated as: amount of circularization substrate / 2 / total reaction volume = 150 / 2 / 35.025 = 2.14 ng/μL (in this example, each system contained 150 ng of circularization substrate, theoretically only one of the two strands participates in circularization, and the total reaction volume was 35.025 μL). Single-strand circularization efficiency was calculated as ΔssDNA concentration / theoretical ssDNA concentration. The results are shown in Table 8 below. Compared with T4 DNA ligase, the L27 and L32 ligases show higher single-strand circularization efficiency.
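
The arithmetic above can be reproduced with a short Python sketch. The four volumes are those stated in the procedure (23.1 μL denatured substrate, 6.8 μL circularization mixture, 1.825 μL digestion mixture, 3.3 μL EDTA), which sum to the 35.025 μL total reaction volume. The function names are illustrative, and the ΔssDNA value in the usage lines is a hypothetical input, not a result from Table 8; the text does not spell out how ΔssDNA is derived, so it is simply taken as an input here.

```python
# Volumes added to each reaction, in microliters, as given in the text.
DENATURED_SUBSTRATE_UL = 23.1   # substrate diluted in T4 DNA ligase buffer
CIRC_MIX_UL = 6.8               # single-strand circularization mixture
DIGESTION_MIX_UL = 1.825        # enzyme digestion mixture
EDTA_UL = 3.3                   # 0.5 M EDTA stop solution


def theoretical_ssdna_conc(substrate_ng, total_volume_ul):
    """Theoretical ssDNA concentration (ng/uL), assuming one of the two strands circularizes."""
    return substrate_ng / 2 / total_volume_ul


def circularization_efficiency(delta_ssdna_ng_per_ul, theoretical_ng_per_ul):
    """Efficiency = delta ssDNA concentration / theoretical ssDNA concentration."""
    return delta_ssdna_ng_per_ul / theoretical_ng_per_ul


if __name__ == "__main__":
    total_ul = DENATURED_SUBSTRATE_UL + CIRC_MIX_UL + DIGESTION_MIX_UL + EDTA_UL
    theoretical = theoretical_ssdna_conc(150.0, total_ul)
    print(f"total volume: {total_ul:.3f} uL")
    print(f"theoretical ssDNA: {theoretical:.2f} ng/uL")
    # Hypothetical measured delta ssDNA value, for illustration only.
    delta = 1.0
    print(f"efficiency: {circularization_efficiency(delta, theoretical):.1%}")
```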

[0118] Table 8: ΔssDNA concentration and single-strand circularization efficiency

[0119] Sequence List:

Claims

1. A DNA ligase comprising: a) an amino acid sequence as shown in SEQ ID NO: 2 or SEQ ID NO: 4; b) an amino acid sequence having one or more amino acid residues substituted, deleted, and/or added compared with the amino acid sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4, and retaining DNA ligation activity; or c) an amino acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% sequence identity with the amino acid sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4, and retaining DNA ligation activity.

2. The DNA ligase according to claim 1, further comprising a purification tag, an epitope tag, and/or a solubilization tag; preferably, the purification tag, epitope tag, and/or solubilization tag is linked to the N-terminus or C-terminus of the DNA ligase; preferably, the purification tag is selected from at least one of Poly-Arg, Poly-His, Strep-Tag II, S-tag, FLAG, and GFP; preferably, the epitope tag is selected from at least one of C-myc, HA, V5, and VSV-G; preferably, the solubilization tag is selected from at least one of Trx, SUMO, GST, MBP, and NusA.

3. An isolated nucleic acid comprising a nucleotide sequence encoding the DNA ligase of claim 1 or 2.

4. The nucleic acid according to claim 3, wherein the nucleic acid comprises a nucleotide sequence as shown in SEQ ID NO: 1 or SEQ ID NO: 3, or a degenerate sequence thereof.

5. An expression vector comprising the nucleic acid of claim 3 or 4.

6. The expression vector according to claim 5, wherein the expression vector is a plasmid; preferably, the plasmid is selected from pET28A, pET30A, pGEX-4T-1, or pET SUMO.

7. A recombinant cell comprising the nucleic acid of claim 3 or 4 or the expression vector of claim 5 or 6.

8. The recombinant cell according to claim 7, wherein the recombinant cell is a eukaryotic cell such as a plant cell or an animal cell, or a prokaryotic cell such as Escherichia coli; preferably, the recombinant cell is of the Escherichia coli BL21(DE3) strain.

9. A method for ligating DNA, the method comprising the steps of ligating a first DNA fragment to a second DNA fragment or ligating the 5' end and 3' end of a third DNA fragment together using the DNA ligase of claim 1 or 2.

10. A method for constructing a sequencing library, the method comprising the steps of ligating a first DNA fragment to a second DNA fragment or ligating the 5' end and 3' end of a third DNA fragment together using the DNA ligase of claim 1 or 2.

11. The method according to claim 9 or 10, wherein the first DNA fragment comprises a complementary first strand and a second strand, and the first strand protrudes 1-6 deoxynucleotides at its 3' end relative to the second strand, the second DNA fragment comprises a complementary third strand and a fourth strand, and the third strand protrudes 1-6 deoxynucleotides at its 3' end relative to the fourth strand, and the 1-6 deoxynucleotides protruding at the 3' end of the first strand are complementary to the 1-6 deoxynucleotides protruding at the 3' end of the third strand, such that, under the action of the DNA ligase, the first strand is ligated to the fourth strand, and the second strand is ligated to the third strand.

12. The method according to any one of claims 9-11, wherein the first DNA fragment is a target nucleic acid fragment and the second DNA fragment is a sequencing adapter.

13. The method of claim 11, wherein the first strand protrudes an adenine (A) deoxynucleotide at its 3' end relative to the second strand, the second strand protrudes an adenine (A) deoxynucleotide at its 3' end relative to the first strand, the third strand protrudes a thymine (T) deoxynucleotide at its 3' end relative to the fourth strand, and the fourth strand protrudes a thymine (T) deoxynucleotide at its 3' end relative to the third strand.

14. The method according to claim 13, wherein the first DNA fragment is a target nucleic acid fragment and the second DNA fragment is a TA cloning vector.

15. The method according to claim 9 or 10, wherein the third DNA fragment is a single-stranded DNA fragment having a first adapter and a second adapter attached to its 5' end and 3' end, respectively, such that, in the presence of a splint oligonucleotide, the DNA ligase ligates the 5' end and 3' end of the third DNA fragment together, wherein the splint oligonucleotide is complementary to the first adapter and the second adapter.

16. The method according to any one of claims 9-15, wherein, after the first DNA fragment and the second DNA fragment are ligated together, or after the 5' end and the 3' end of the third DNA fragment are ligated together, the resulting ligation product is amplified.

17. The method according to any one of claims 9-16, wherein the concentration of the DNA ligase used is 0.1 mg/mL to 1 mg/mL; preferably, the concentration of the DNA ligase used is 0.3-0.4 mg/mL.

18. The method according to any one of claims 9-17, wherein the first DNA fragment and the second DNA fragment, or the 5' end and the 3' end of the third DNA fragment, are joined at a reaction temperature of 4℃-42℃; preferably, the reaction temperature is 25℃-37℃.

19. A kit comprising the DNA ligase of claim 1 or 2, a reaction buffer, and instructions for use.

20. The kit of claim 19, wherein the reaction buffer comprises Tris-HCl, Mg²⁺, ATP, and dithiothreitol.