A Sanger sequencing kit and method for high GC-content DNA fragments
By adding DNA helicase, betaine, and nucleic acid denaturant to the Sanger sequencing kit, combined with a gradient cooling cyclic sequencing program, the problem of sequencing signal interruption in high GC-content DNA fragments was solved, achieving high accuracy and long sequencing length.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- AOKE (WUHAN) BIOTECHNOLOGY CO LTD
- Filing Date
- 2024-12-31
- Publication Date
- 2026-06-30
Abstract
Description
Technical Field
[0001] This invention belongs to the field of molecular biology technology, specifically relating to a Sanger sequencing kit and method for high GC content DNA fragments.
Background Technology
[0002] DNA fragments with high GC content play a crucial role in molecular biology research, but their sequencing presents significant challenges. GC content refers to the proportion of guanine (G) and cytosine (C) bases in a DNA sequence; high GC content means a greater number of G-C base pairs. Since a G-C pair forms three hydrogen bonds, compared with the two hydrogen bonds of an A-T pair, the DNA double helix in high-GC regions is more stable. This makes these regions difficult to separate and replicate effectively during PCR amplification and sequencing. Furthermore, high-GC-content DNA fragments tend to form complex secondary structures, further increasing the difficulty of sequencing. Therefore, sequencing technology for high-GC-content DNA fragments has always been a key focus in the field of genomics research.
[0003] Sanger sequencing is a classic DNA sequencing technology with wide applications in gene sequencing. However, traditional Sanger sequencing faces numerous challenges with high-GC-content DNA fragments. High-GC regions are prone to forming complex secondary structures, such as hairpin structures and G-quadruplexes. These structures can hinder DNA polymerase elongation, leading to sequencing signal interruptions and irregular peak patterns, severely affecting the accuracy and integrity of sequencing and thus limiting the application of Sanger sequencing in the analysis of high-GC-content DNA. Existing high-GC-content Sanger sequencing kits typically contain only DNA polymerase, dNTPs, and Mg²⁺. They do not contain DNA helicase, formamide, or betaine, and their sequencing results are unstable, often exhibiting overlapping peaks, duplicate peaks, and signal interruptions.
Summary of the Invention
[0004] The purpose of this invention is to solve the problem that high GC content DNA fragments are difficult to sequence accurately using conventional Sanger sequencing technology in the prior art. It provides a Sanger sequencing kit and method for high GC content DNA fragments, which can improve the accuracy and success rate of sequencing. This is achieved through the following technical solutions:
[0005] In a first aspect, the present invention provides a Sanger sequencing kit for high GC-content DNA fragments, comprising reagent A and reagent B. Reagent A includes DNA polymerase, ddNTPs, and dNTPs, and further includes DNA helicase and betaine.
[0006] The final concentration of the DNA helicase in reagent A is 0.1-0.3 U / μL;
[0007] The final concentration of the dNTPs in reagent A is 1.5~2.0 mM;
[0008] The final concentration of betaine in reagent A is 1~2 M;
[0009] Reagent B includes a nucleic acid denaturant, which is formamide, and its final concentration in the mixture of reagent A and reagent B is 0.4-5 mmol / μL.
[0010] The above kit is an improvement on the conventional Sanger sequencing reaction system, which increases the concentration of dNTPs. Combined with DNA helicase, betaine, and nucleic acid denaturing agents, it can further promote the dissociation and polymerization reactions in high GC regions, effectively improving sequencing accuracy.
[0011] In conventional sequencing, the concentration of dNTPs is 40-60 μM, and the improved final concentration is 1.5-2.0 mM. If the concentration of dNTPs is lower than 1.5 mM, the reaction rate will decrease, resulting in a reduction in the yield of PCR products. This is because there is insufficient raw material for DNA synthesis, which cannot meet the needs of DNA polymerase, thus affecting the amplification efficiency.
[0012] DNA helicase at a final concentration of 0.1-0.3 U / μL can further accelerate the unwinding process in high-GC templates, breaking the double helix of DNA into two single strands, thus increasing amplification efficiency and facilitating sequencing of complex templates.
[0013] Betaine at a final concentration of 1–2 M enables DNA to unwind more effectively at lower temperatures, thereby improving amplification efficiency. This is especially true for template DNA rich in GC base pairs, where the three hydrogen bonds between GC base pairs make them more difficult to unwind than AT base pairs. Betaine's effect is more significant in overcoming amplification difficulties caused by high GC content.
[0014] Nucleic acid denaturants with a final concentration of 0.4-5 mmol / μL can interact with DNA molecules, disrupting base stacking forces and hydrogen bonds, and causing the DNA double helix to unwind into single strands without inhibiting key components of the PCR reaction such as DNA polymerase, thus not affecting the amplification reaction.
[0015] Furthermore, the final concentration of the DNA helicase in reagent A is 0.2 U / μL.
[0016] Furthermore, the final concentration of the dNTPs in reagent A is 1.8 mM.
[0017] Furthermore, the final concentration of betaine in reagent A is 1.5 M.
[0018] Furthermore, the final concentration of the nucleic acid denaturing agent in the mixture of reagent A and reagent B is 2 mmol / μL.
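The concentration ranges and preferred values above can be checked mechanically when planning a reaction mix. The following is a minimal Python sketch, not part of the invention; the names `KIT_RANGES` and `out_of_range` are hypothetical, and the units are those stated in the text.

```python
# Final-concentration ranges as stated in paragraphs [0006]-[0009].
# Dictionary keys and function names are illustrative only.
KIT_RANGES = {
    "helicase_U_per_uL": (0.1, 0.3),
    "dNTPs_mM": (1.5, 2.0),
    "betaine_M": (1.0, 2.0),
    "formamide_mmol_per_uL": (0.4, 5.0),  # in the mixture of reagents A and B
}


def out_of_range(mix):
    """Return the names of components that are missing or outside the range."""
    problems = []
    for name, (low, high) in KIT_RANGES.items():
        value = mix.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return problems


# Preferred values from paragraphs [0015]-[0018].
preferred = {
    "helicase_U_per_uL": 0.2,
    "dNTPs_mM": 1.8,
    "betaine_M": 1.5,
    "formamide_mmol_per_uL": 2.0,
}
print(out_of_range(preferred))  # prints [] because every value is in range
```

A mix using the conventional dNTP concentration of 40-60 μM (0.04-0.06 mM) would be reported under `dNTPs_mM`.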
[0019] When using the above kit, it can be diluted with sequencing reaction buffer and ddH2O to ensure that the content of each component is within the final concentration range. Sequencing reaction buffer (e.g., PCR buffer) provides a suitable pH and ionic environment to ensure the activity and stability of DNA polymerase.
[0020] The high GC content DNA fragment described in this invention has a GC content of 65% or higher.
[0021] Furthermore, the DNA helicase is a RecQ helicase.
[0022] In a second aspect, the present invention provides a sample pretreatment method based on the above-described Sanger sequencing kit, wherein the target DNA fragment and primers are added to a mixture of DNA polymerase, ddNTPs, dNTPs, DNA helicase, and betaine for pre-denaturation; the nucleic acid denaturing agent is then added, and the mixture is incubated and cooled to obtain a pretreated product.
[0023] The pre-denaturation temperature is 95~98℃, and the time is 5~10 min;
[0024] The incubation temperature is 95~98℃, and the time is 3~10 min.
[0025] Furthermore, the cooling temperature is 4°C.
[0026] The above sample pretreatment involves using high-temperature denaturation combined with nucleic acid denaturing agents to perform special denaturation treatment on samples containing DNA fragments with high GC content, in order to reduce the possibility of secondary structures forming in high GC regions.
[0027] This invention adjusts the dosages of DNA helicase, betaine, nucleic acid denaturant, and dNTPs, and adds sample pretreatment. When used synergistically, these measures can resolve hairpin structures formed in high-GC regions, significantly improving sequencing efficiency and achieving sequencing lengths of 800-1200 bp. Adjusting only the dosages of DNA helicase, betaine, nucleic acid denaturant, and dNTPs, or applying only the sample pretreatment, does not achieve the same effect as this invention; in those cases the sequencing length of complex samples does not meet the basic requirement of first-generation sequencing (i.e., sequencing length greater than 800 bp).
[0028] A third aspect of the present invention provides a Sanger sequencing method for high GC content DNA fragments, comprising the above-described sample pretreatment method, and further comprising:
[0029] The pretreated product was subjected to a PCR reaction using a gradient cooling cyclic sequencing program to obtain the reaction product;
[0030] The reaction product was purified to obtain a purified product;
[0031] The purified product was sequenced.
[0032] Furthermore, the gradient cooling cyclic sequencing program is as follows: pre-denaturation at 96℃ for 30 s; followed by 30-35 cycles, each cycle including denaturation at 96℃ for 10 s, annealing at 65-55℃ for 15 s, a decrease of 0.2℃ per cycle, extension at 60℃ for 120 s, and a final extension at 60℃ for 5 min.
[0033] To sequence high-GC-content DNA fragments more accurately, a gradient-cooling cyclic sequencing program was used.
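The annealing step of the gradient cooling program can be illustrated by listing the temperature used in each cycle. This is a minimal Python sketch under stated assumptions: the function name `annealing_schedule` is hypothetical, the first cycle is assumed to anneal at 65℃, and 55℃ is treated as a lower bound; the text itself does not specify either point.

```python
def annealing_schedule(cycles=30, start_c=65.0, step_c=0.2, floor_c=55.0):
    """Annealing temperature (deg C) for each cycle of the program.

    Assumes cycle 1 anneals at start_c and each later cycle is step_c lower,
    never going below floor_c.
    """
    temps = []
    for cycle in range(cycles):
        temp = max(start_c - step_c * cycle, floor_c)
        temps.append(round(temp, 1))
    return temps


schedule = annealing_schedule(cycles=30)
print(schedule[0], schedule[-1])  # 65.0 59.2
```

Under this reading, 30 cycles end at 59.2℃ and 35 cycles at 58.2℃, so the 55℃ bound is not reached within the stated cycle numbers; the floor is kept only as a guard.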
[0034] Compared with the prior art, the beneficial effects of the present invention are as follows: the sequencing peaks obtained by the sequencing method of the present invention are clear and continuous, without obvious signal interruption and abnormal peaks, and the sequencing accuracy is high, while the sequencing peaks of traditional methods are disordered, with multiple signal loss points and low accuracy.
Attached Figure Description
[0035] Figure 1 shows the sequencing results of Example 1;
[0036] Figure 2 shows the sequencing results of Example 2;
[0037] Figure 3 shows the sequencing results of Comparative Example 1;
[0038] Figure 4 shows the sequencing results of Comparative Example 2;
[0039] Figure 5 shows the sequencing results of Comparative Example 3;
[0040] Figure 6 shows the sequencing results of Comparative Example 4.
Detailed Implementation
[0041] The technical solution of the present invention will be clearly and completely described below. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0042] The high GC content DNA fragment described in this invention has a GC content of 65% or higher.
[0043] In this invention, the concentrations of the target DNA fragment, primers, DNA polymerase, and ddNTPs are the same as those used in conventional Sanger sequencing methods.
[0044] In an embodiment of the present invention, the final concentration of the target DNA fragment in reagent A is 10~30 ng / μL;
[0045] The final concentration of the primer in reagent A is 1~3 μM;
[0046] The final concentration of the DNA polymerase in reagent A is 0.1~0.2 U / μL;
[0047] The final concentration of the ddNTPs in reagent A is 0.1 mM;
[0048] The final concentration of the helicase in reagent A is 0.1-0.3 U / μL;
[0049] The final concentration of the dNTPs in reagent A is 1.5~2.0 mM;
[0050] The final concentration of betaine in reagent A is 1~2 M;
[0051] The nucleic acid denaturing agent is formamide, and its final concentration in the mixture of reagent A and reagent B is 0.4-5 mmol / μL.
[0052] In each embodiment of the present invention, the final concentrations of betaine, DNA helicase, and dNTPs in reagent A, and the final concentration of formamide in the mixture of reagent A and reagent B, are shown in Table 1 below.
[0053] Table 1 Final concentrations in each example
[0054]
[0055] The reagents of this invention can be diluted with sequencing buffer during use to ensure that the content of each component is within the final concentration range.
[0056] When sequencing a target DNA fragment, specific primers are designed based on the nucleotide sequence of the target DNA fragment.
[0057] First-generation sequencing typically employs the Sanger sequencing principle, also known as the dideoxy chain-termination method. In this process, ddNTPs (dideoxynucleoside triphosphates) are incorporated into the DNA strand being synthesized. Because a ddNTP lacks the 3'-OH group, the 5'-phosphate group of the next dNTP cannot form a phosphodiester bond with it, thus terminating the elongation. Because ddNTPs are incorporated at different positions, a series of new DNA fragments of varying lengths is generated. After these fragments are separated by electrophoresis, the signals are read using a laser scanner to generate a sequencing peak map. The sequence analysis program calculates the QV value based on the peak shape.
[0058] In first-generation sequencing, the QV value refers to the quality value of a base, used to quantitatively assess the reliability of bases in the sequencing results. The QV value is calculated by the sequence analysis program based on the peak shape of the bases; a higher QV value indicates a lower error rate for that base. It is defined as $Q = -10 \log_{10} P$, where Q represents the quality score and P represents the error probability. For example, a Phred quality score of 10 corresponds to an error probability of 10% and an accuracy of 90%. Sanger sequencing has an overall accuracy of approximately 99.999%, or a total error of 0.001%. QV20 indicates a 99% confidence level that each base in the sequencing result was correctly identified. This is a metric for sequencing accuracy; a higher QV value indicates higher accuracy. With read lengths typically around 1000 bp, these two aspects of Sanger sequencing (accuracy and read length) are considered the gold standard for sequencing.
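The relationship between quality value and error probability, Q = -10·log10(P), can be made concrete with a short conversion in both directions. This is a minimal Python sketch; the function names `phred_to_error` and `error_to_phred` are hypothetical and are not taken from any sequence analysis program.

```python
import math


def phred_to_error(q):
    """Error probability P for a quality value Q, from Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Quality value Q for an error probability P (0 < P <= 1)."""
    return -10 * math.log10(p)


# Error probability for a few common quality values.
for q in (10, 20, 30):
    print(q, phred_to_error(q))
```

With these definitions, a quality value of 20 corresponds to an error probability of about 1%, which is the 99% confidence level referred to as QV20.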
[0059] In the embodiments and comparative examples of this invention, DNA polymerase and ddNTPs were purchased from Thermo Fisher Scientific, and the BigDye Terminator v3.1 Cycle Sequencing Kit was used. RecQ helicase was purchased from KEMOBio, catalog number KMA0223352R.
[0060] Unless otherwise specified, all reagents and kits used in the following examples are commercially available, and all techniques are conventional molecular biology techniques in the industry.
[0061] Example 1
[0062] The method for Sanger sequencing of high GC content (70%) DNA fragments in this embodiment is as follows:
[0063] 1. Sample pretreatment
[0064] Prepare 5 μL of reagents as shown in Table 2 below:
[0065] Table 2. Reagents used in Example 1
[0066]
[0067] Add the above reagent to the bottom of the tube, pre-denature at 98°C for 5 min, then quickly add 1 μL of formamide at a concentration of 12 mmol / μL, incubate at 98°C for 5 min, and quickly place in a pre-conditioned ice-water bath and cool at 4°C for 1 min to obtain the pretreated product.
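The final formamide concentration in the mixture follows from a simple dilution of the stock into the reagent volume. This is a minimal Python sketch; the function name `final_concentration` is hypothetical, and volumes are assumed to be additive.

```python
def final_concentration(stock_conc, stock_vol_ul, base_vol_ul):
    """Concentration after adding stock_vol_ul of stock to base_vol_ul of mix.

    The result has the same units as stock_conc; volumes are assumed additive.
    """
    return stock_conc * stock_vol_ul / (base_vol_ul + stock_vol_ul)


# Example 1: 1 uL of formamide at 12 mmol/uL added to 5 uL of reagent mix.
print(final_concentration(12, 1, 5))  # 2.0
```

Applying the same calculation to the 2.4 and 30 mmol / μL stocks used in Examples 3 and 4, and assuming the same 5 μL reagent volume, gives approximately 0.4 and 5 mmol / μL, the two ends of the stated range.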
[0068] The nucleotide sequence of the primer is shown in SEQ ID NO. 1, specifically: 5'-gcatatacgatacaaggctg-3'.
[0069] The nucleotide sequence of the target DNA fragment is shown in SEQ ID NO. 2, specifically as follows:
[0070]
[0071] 2. Perform PCR reaction on the pretreated product.
[0072] The PCR reaction program was as follows: pre-denaturation at 96℃ for 30 s; 30 cycles, each consisting of denaturation at 96℃ for 10 s, annealing at 65~55℃ for 15 s (with the annealing temperature decreasing by 0.2℃ per cycle), and extension at 60℃ for 2 min; final extension at 60℃ for 5 min; and storage at 4℃.
[0073] 3. Purify the PCR reaction products using magnetic beads.
[0074] (1) Add 5 μL of carboxyl magnetic beads and 30 μL of 80% ethanol to the bottom of each tube, and incubate on a plate mixer for 3 min;
[0075] (2) Centrifuge at 1000 rpm for 30 s, place the 96-well plate on a magnetic rack, immediately invert the plate, and centrifuge at 500 rpm for 1 min;
[0076] (3) Add 100 μL of 80% ethanol to each tube, shake, centrifuge at 1000 rpm for 30 s, place the 96-well plate on a magnetic rack, immediately invert the plate, and centrifuge at 500 rpm for 1 min;
[0077] (4) Allow the residual ethanol to evaporate at room temperature for 5 min.
[0078] 4. Sequencing
[0079] Add 20 μL ddH2O to each tube of purified product, shake thoroughly, centrifuge briefly, collect the product at the bottom of the tube, and perform capillary electrophoresis on the obtained product (ABI3730XL sequencer).
[0080] The sequencing results are shown in Figure 1.
[0081] Example 2
[0082]
[0083] The sequencing results are shown in Figure 2.
[0084] Example 3
[0085] The reagents and methods used in this embodiment for Sanger sequencing of DNA fragments with high GC content (70%) are basically the same as those in Example 1, except that:
[0086] The concentrations of each reagent in this embodiment are shown in Table 3. The final concentration of betaine is 2 M, the final concentration of dNTPs is 1.5 mM, and the final concentration of RecQ helicase is 0.1 U / μL.
[0087] The sample pretreatment method is as follows: add the reagent to the bottom of the tube, pre-denature at 95℃ for 10 min, then quickly add 1 μL of formamide with a concentration of 2.4 mmol / μL, incubate at 95℃ for 10 min, and quickly place it in a pre-conditioned ice-water bath and cool at 4℃ for 1 min to obtain the pretreated product.
[0088] Table 3. Reagents in Example 3
[0089]
[0090] Example 4
[0091] The method used in this embodiment for Sanger sequencing of high GC content (70%) DNA fragments is basically the same as that in Example 1, except that:
[0092] The reagents in this embodiment are shown in Table 4 below. The final concentration of betaine is 1 M, the final concentration of dNTPs is 2 mM, and the final concentration of RecQ helicase is 0.3 U / μL.
[0093] The sample pretreatment method is as follows: add the reagent to the bottom of the tube, pre-denature at 96℃ for 7 min, then quickly add 1 μL of formamide with a concentration of 30 mmol / μL, incubate at 96℃ for 3 min, and quickly place it in a pre-conditioned ice-water bath and cool at 4℃ for 1 min to obtain the pretreated product.
[0094] Table 4. Reagents in Example 4
[0095]
[0096] Comparative Example 1
[0097] The high GC content (70%) DNA fragment in this comparative example 1 was sequenced using the conventional Sanger sequencing method, as shown below:
[0098] 1. Prepare the reagents as shown in Table 5 below:
[0099] Table 5 Common Reagents
[0100]
[0101] The above reagents were subjected to the PCR reaction, magnetic bead purification of the PCR products, and sequencing as described in steps 2-4 of Example 1.
[0102] The sequencing results are shown in Figure 3.
[0103] Comparative Example 2
[0104] The Sanger sequencing method used for the high GC content (75%) DNA fragment in Comparative Example 2 was the conventional Sanger sequencing method, as shown below:
[0105] 1. Prepare the reagents as shown in Table 5:
[0106] The above reagents were subjected to the PCR reaction, magnetic bead purification of the PCR products, and sequencing as described in steps 2-4 of Example 2.
[0107] The sequencing results are shown in Figure 4.
[0108] Comparative Example 3
[0109] This comparative example performed Sanger sequencing on the high GC content (70%) DNA fragment from Example 1, as shown below:
[0110] 1. Prepare the reagents as shown in Table 2:
[0111] The above reagents were subjected to the PCR reaction, magnetic bead purification of the PCR products, and sequencing as described in steps 2-4 of Example 1.
[0112] The sequencing results are shown in Figure 5.
[0113] Comparative Example 4
[0114] This comparative example performed Sanger sequencing on the high GC content (70%) DNA fragment from Example 1, as shown below:
[0115] 1. Prepare the reagents as shown in Table 5:
[0116] The above reagents were subjected to sample pretreatment, the PCR reaction, magnetic bead purification of the PCR products, and sequencing as described in steps 1-4 of Example 1.
[0117] The sequencing results are shown in Figure 6.
[0118] Test Example
[0119] The sequencing results of Example 1 were compared with those of Comparative Examples 1, 3, and 4, and the results of Example 2 with those of Comparative Examples 2, 3, and 4. The sequencing length of Example 1 is 1100 bp, with 1005 bases above QV20. The sequencing length of Comparative Example 1 is 180 bp, with 173 bases above QV20. The sequencing length of Comparative Example 3 is 180 bp, with 173 bases above QV20. The sequencing length of Comparative Example 4 is 180 bp, with 171 bases above QV20. The sequencing length of Example 2 is 800 bp, with 790 bases above QV20, and the sequencing length of Comparative Example 2 is 290 bp. The sequencing length of Example 3 is 791 bp, with 780 bases above QV20. The sequencing length of Example 4 is 782 bp, with 777 bases above QV20. The sequencing peaks obtained by the sequencing method of the present invention are clear and continuous, with no obvious signal interruption or abnormal peaks, and the sequencing accuracy is high. In contrast, the sequencing peaks of traditional methods are disordered, with multiple signal loss points and low accuracy.
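The reported read lengths and QV20 counts can be put on a common scale by computing the fraction of bases above QV20 for each run. This is a minimal Python sketch using only the figures reported in paragraph [0119]; the names `RESULTS` and `qv20_fraction` are hypothetical.

```python
# (read length in bp, bases above QV20) as reported in paragraph [0119].
# Comparative Example 2 is omitted because no QV20 count is given for it.
RESULTS = {
    "Example 1": (1100, 1005),
    "Example 2": (800, 790),
    "Example 3": (791, 780),
    "Example 4": (782, 777),
    "Comparative Example 1": (180, 173),
    "Comparative Example 3": (180, 173),
    "Comparative Example 4": (180, 171),
}


def qv20_fraction(read_length, bases_above_qv20):
    """Fraction of the read whose bases are above QV20."""
    return bases_above_qv20 / read_length


for name, (length, good) in RESULTS.items():
    print(f"{name}: {length} bp, {qv20_fraction(length, good):.1%} above QV20")
```

The comparative examples reach a similar fraction of bases above QV20 but over a read of only 180 bp, so the difference between the methods lies mainly in the usable read length.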
[0120] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
Claims
1. A Sanger sequencing kit for high GC content DNA fragments, consisting of reagent A and reagent B, reagent A comprising a DNA polymerase, ddNTPs and dNTPs, characterized in that reagent A also includes DNA helicase and betaine; the final concentration of the DNA helicase in reagent A is 0.1-0.3 U / μL; the final concentration of the dNTPs in reagent A is 1.5~2.0 mM; the final concentration of betaine in reagent A is 1~2 M; and reagent B includes a nucleic acid denaturant, which is formamide, and its final concentration in the mixture of reagent A and reagent B is 0.4-5 mmol / μL.
2. The Sanger sequencing kit for high GC content DNA fragments according to claim 1, characterized in that the final concentration of the DNA helicase in reagent A is 0.2 U / μL.
3. The Sanger sequencing kit for high GC content DNA fragments according to claim 1, characterized in that the final concentration of the dNTPs in reagent A is 1.8 mM.
4. The Sanger sequencing kit for high GC content DNA fragments according to claim 1, characterized in that the final concentration of betaine in reagent A is 1.5 M.
5. The Sanger sequencing kit for high GC content DNA fragments according to claim 1, characterized in that the final concentration of the nucleic acid denaturant in the mixture of reagent A and reagent B is 2 mmol / μL.
6. The Sanger sequencing kit for high GC content DNA fragments according to claim 1, characterized in that the DNA helicase is RecQ helicase.
7. A sample pretreatment method based on the Sanger sequencing kit of claim 1, characterized in that the target DNA fragment and primers are added to a mixture of DNA polymerase, ddNTPs, dNTPs, DNA helicase, and betaine for pre-denaturation; the nucleic acid denaturing agent is then added, and the mixture is incubated and cooled to obtain the pretreated product; the pre-denaturation temperature is 95~98℃, and the time is 5~10 min; the incubation temperature is 95~98℃, and the time is 3~10 min.
8. The sample pretreatment method according to claim 7, characterized in that the cooling temperature is 4°C.
9. A Sanger sequencing method for high GC content DNA fragments, characterized in that it comprises the sample pretreatment method according to claim 7, and further comprises: subjecting the pretreated product to a PCR reaction using a gradient cooling cyclic sequencing program to obtain the reaction product; purifying the reaction product to obtain a purified product; and sequencing the purified product.
10. The Sanger sequencing method for high GC content DNA fragments according to claim 9, characterized in that the gradient cooling cyclic sequencing program is as follows: pre-denaturation at 96℃ for 30 s; followed by 30-35 cycles, each cycle including denaturation at 96℃ for 10 s, annealing at 65-55℃ for 15 s with a decrease of 0.2℃ per cycle, and extension at 60℃ for 120 s; and a final extension at 60℃ for 5 min.