Method for constructing nucleic acid library and nanopore sequencing method
By processing the ends of double-stranded target nucleic acids and ligating adapters, asymmetric nucleic acid libraries are constructed, solving the problem of low efficiency in existing nanopore sequencing and achieving efficient and accurate double-stranded sequencing.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- BGI HANGZHOU CYCLONESEQ TECHNOLOGY CO LTD
- Filing Date
- 2024-12-23
- Publication Date
- 2026-07-02
AI Technical Summary
Existing nanopore sequencing technologies have low effective proportions and efficiency of double-stranded sequencing libraries, and the sequencing process is highly complex, making it difficult to achieve accurate double-stranded sequencing.
By processing the ends of double-stranded target nucleic acids to introduce asymmetric end dangling sequences and connecting them with different types of adapters, nucleic acid libraries are constructed. The asymmetric structure is used to improve the library ratio and efficiency of double-stranded sequencing. Combined with specific amplification and helicase treatment, the accuracy of the sequencing process is ensured.
The method improves the efficiency of library construction and the proportion of libraries capable of double-stranded sequencing, enhances the stability and accuracy of sequencing results, reduces the complexity of the sequencing process, and improves the consistency of sequencing signal characteristics.
Description
Methods for constructing nucleic acid libraries and nanopore sequencing methods
Technical Field
[0001] This invention relates to the field of nucleic acid library construction, and more specifically, to a method for constructing a nucleic acid library and a nanopore sequencing method.
Background Technology
[0002] Single-molecule sequencing technology enables direct sequencing of single nucleic acid molecules, providing more comprehensive and accurate sequence information. Nanopore sequencing is a typical single-molecule detection technology, offering advantages such as high sequencing speed, long read lengths, direct sequencing, high throughput, low cost, small size, and portability. In nanopore sequencing, a single nanopore is embedded in an insulating, impermeable membrane, forming a stable ion current channel. Under an applied voltage, single-stranded nucleic acid molecules pass through the nanopore and reduce the ion current. Because the bases on the single-stranded nucleic acid molecule differ in molecular structure and size, the current through the nanopore varies in a way that corresponds to the base sequence. By analyzing the current signal with algorithms, the sequence of the single-stranded nucleic acid passing through the pore can be read in real time. However, existing nanopore sequencing technologies typically sequence only a single strand of the target nucleic acid, which limits sequencing accuracy and severely restricts the application scope of nanopore sequencing technology. Double-stranded sequencing of the target nucleic acid can significantly improve sequencing accuracy, making it an important tool in genomics and biological research and opening up more possibilities for scientific research and medical applications.
[0003] Oxford Nanopore Technologies (ONT) has developed a double-stranded sequencing method that includes: (1) connecting the first and second strands of the target nucleic acid at or near one end via a bridging portion (usually a hairpin adapter), and connecting the other end to a sequencing adapter complex (usually a Y-type sequencing adapter complex) to form a sequencing library; (2) performing single-molecule nanopore sequencing on the sequencing library. Since the two strands of the target nucleic acid are connected at one end by the bridging portion, the first and second strands pass through the nanopore sequentially, achieving double-stranded sequencing of the target nucleic acid, as shown in patent CN103827320B. Building on this, different anchors (usually restraint sequences modified with cholesterol or fatty acyl chains, which can couple the library to the membrane) are bound to the hairpin adapters and to the Y-type sequencing adapters of the sequencing library, respectively, as in patent CN106460061B. Because the anchors bound to the hairpin adapters couple to the membrane more strongly than the anchors bound to the Y-type sequencing adapters, the capture probability of libraries capable of double-stranded sequencing is increased.
[0004] However, the aforementioned patents CN103827320B and CN106460061B have the following problems: (1) When the first and second strands are connected at or near one end of the target nucleic acid through a bridging portion and the other end is connected to the sequencing adapter complex to form a sequencing library, byproducts that cannot be sequenced on both strands are also generated, namely, products in which both ends of the target nucleic acid carry the bridging portion and products in which both ends carry the sequencing adapter complex. This results in a low proportion of the target library and low efficiency in obtaining it. (2) Two different anchors (usually restraint sequences modified with cholesterol or fatty acyl chains that can couple the library to the membrane) are required during sequencing, which adds components and complexity to the sequencing system, while the capture probability of libraries capable of double-stranded sequencing is only slightly improved (e.g., in Example 6 of CN106460061B, the proportion of double-stranded sequencing is only 46%).
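Problem (1) can be illustrated with a simple calculation. The following Python sketch is not taken from the cited patents; it assumes, purely for illustration, that both ends of the target nucleic acid are equivalent and that each end independently receives the bridging (hairpin) adapter with probability `p_hairpin` and the Y-type sequencing adapter otherwise. The function name and the probability model are hypothetical.

```python
from itertools import product


def product_fractions(p_hairpin: float) -> dict:
    """Fractions of ligation products when the two ends are indistinguishable.

    Assumption (not from the source): each end independently receives the
    hairpin adapter with probability p_hairpin, otherwise the Y-type adapter.
    """
    p = {"hairpin": p_hairpin, "Y": 1.0 - p_hairpin}
    fractions = {"hairpin+Y": 0.0, "hairpin+hairpin": 0.0, "Y+Y": 0.0}
    # Enumerate the adapter received by the left end and by the right end.
    for left, right in product(p, repeat=2):
        key = "hairpin+Y" if left != right else f"{left}+{right}"
        fractions[key] += p[left] * p[right]
    return fractions


fractions = product_fractions(0.5)
for name, value in fractions.items():
    print(f"{name}: {value:.2f}")
```

Under this assumption the usable hairpin+Y fraction equals 2p(1-p), which cannot exceed one half for any value of p; the remainder consists of the two byproduct types described above.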
[0005] Furthermore, ONT developed another method for double-stranded sequencing, including: (1) treating the target nucleic acid with a cross-linking agent to form interstrand cross-links between the first and second strands of the target nucleic acid, forming at least one cross-linked double-stranded nucleic acid construct; (2) sequencing the cross-linked construct using single-molecule nanopore sequencing technology to achieve double-stranded sequencing, as shown in patent application CN114945679A. However, patent application CN114945679A still has certain problems, such as: the cross-linking sites that occur when the cross-linking agent treats the target nucleic acid are random and uncontrolled, and some cross-linking agents may also damage the double strands of nucleic acid and affect sequencing; after the cross-linking agent treats and forms the double-stranded nucleic acid construct, it is also necessary to break the double-stranded nucleic acid at or near the interstrand cross-linking site, which is also a random and uncontrolled process. These uncontrolled random processes lead to complex operation, low proportion of effective double-stranded sequencing libraries obtained, and low efficiency.
Summary of the Invention
[0006] The main objective of this invention is to provide a method for constructing nucleic acid libraries and a nanopore sequencing method, in order to solve the problems of the prior art that double-stranded sequencing libraries yield a low proportion of effective double-stranded sequencing reads and are constructed with low efficiency.
[0007] To achieve the above objective, according to a first aspect of the present invention, a method for constructing a nucleic acid library is provided, the method comprising: processing the ends of a double-stranded target nucleic acid to obtain a terminal processing product, the terminal processing product containing a first end and a second end, the first end and the second end each containing a terminal dangling sequence; and directionally ligating a first adapter and a second adapter to the first end and the second end of the terminal processing product, respectively, to obtain a nucleic acid library; wherein the first adapter is different from the second adapter.
[0008] Further, the terminal dangling sequence includes an oligonucleotide sequence; preferably, the length of the oligonucleotide sequence is 2-20 nt; preferably, the oligonucleotide sequence is a homopolyoligonucleotide sequence; more preferably, it is poly A, poly G, poly C or poly T.
[0009] Further, the processing of the ends of the double-stranded target nucleic acid includes: i) performing suspension (overhang-forming) treatment on the ends of the double-stranded target nucleic acid to obtain an intermediate product sequence containing 3' end suspension sequences at both ends; ii) performing a first amplification on the intermediate product sequence to obtain an end-processed product, which contains a first end and a second end, the sequence of the first end being different from the sequence of the second end; iii) directionally ligating the first adapter and the second adapter to the first end and the second end of the end-processed product, respectively, to obtain a nucleic acid library.
[0010] Further, the method includes: a) using a terminal transferase to suspend the ends of a double-stranded target nucleic acid to obtain an intermediate product sequence containing a first end and a second end, each containing a 3' end suspension sequence; b) using a first primer to perform a first amplification of the intermediate product sequence to obtain a first amplification product; the first amplification is linear amplification, and the 3' end of the first primer is at least partially complementary to the 3' end suspension sequence of the intermediate product sequence; the first amplification product comprises a template strand and an amplification strand; the 5' end of the template strand and the 3' end of the amplification strand constitute the first end of the first amplification product. The 3' end of the amplification strand protrudes one or more nucleotides compared to the 5' end of the template strand; the 3' end of the template strand and the 5' end of the amplification strand form the second end of the first amplification product; the 3' end of the template strand contains a 3' end dangling sequence, the 5' end of the amplification strand contains the sequence of the first primer, and the 5' end of the amplification strand protrudes beyond the 3' end of the template strand; the sequence of the first end is different from the sequence of the second end; c) the first adapter is ligated to the 3' end protruding nucleotide of the amplification strand at the first end of the first amplification product, and the second adapter is ligated to the 5' end protruding structure of the amplification strand at the second end of the first amplification product to obtain a nucleic acid library.
[0011] Furthermore, the method further includes: performing a second amplification on the nucleic acid library to obtain an amplified nucleic acid library.
[0012] Further, c) includes: ligating a first adapter to a 3' protruding nucleotide of the amplified strand at the first end of the first amplification product to obtain a first adapter ligation product; performing a third amplification on the first adapter ligation product to obtain an amplified first adapter ligation product, wherein the end of the first adapter ligation product contains a protruding structure; ligating a second adapter to the protruding structure at the end of the first adapter ligation product to obtain a nucleic acid library; preferably, the first adapter is a hairpin adapter and the second adapter is a Y-type adapter.
[0013] Further, b) includes: the 3' end of the first primer binds to the 3' end dangling sequence of the double-stranded target nucleic acid; under the action of polymerase, the DNA single strand containing the 3' end dangling sequence is used as a template and the first primer is used as a primer for the first amplification to obtain the first amplification product; preferably, the first primer contains a first blocking nucleotide, the sequence on the 5' side of the first blocking nucleotide can bind to the second adapter, and the sequence on the 3' side of the first blocking nucleotide can bind to the 3' end dangling sequence; preferably, the first blocking nucleotide includes an abasic nucleotide, a spacer, PNA, RNA, a morpholino nucleotide, iso-dC, or iso-dG.
[0014] Furthermore, the ligation product of the second adapter is amplified a second or third time using a second primer set, the second primer set comprising: a second primer set sequence 1 capable of specifically binding to the first adapter, and a second primer set sequence 2 capable of being at least partially complementary to the 3' end dangling sequence.
[0015] Furthermore, the Y-type adapter contains a first strand and a second strand. The 3' end of the first strand and the middle segment of the second strand are complementary. The 5' end of the second strand also contains a 5' dangling segment, which can bind complementary to the sticky end of the 5' end of the first primer. The 5' end of the first strand contains a helicase binding site. The 5' end of the first strand and the 3' end of the second strand are not complementary.
[0016] Furthermore, the helicase binding site is bound to a helicase; preferably, the helicase is selected from any one or more of the following: Dda helicase, Pif1 helicase, XPD helicase, T7 Gp41 helicase or DnaB helicase.
[0017] Further, the suspension treatment includes: using a first enzyme to add a tail to the 3' ends of the double-stranded target nucleic acid, forming an intermediate product sequence with 3' suspension sequences at both ends; or using a second enzyme to catalyze the ligation of a structure containing a 3' suspension to the ends of the double-stranded target nucleic acid, forming an intermediate product sequence with 3' suspension sequences at both ends; or amplifying the double-stranded target nucleic acid using primers containing modified nucleotides and a DNA polymerase that can tolerate the modified nucleotides, and then cleaving at the modified nucleotides with the corresponding nuclease to form a nick, so that the cleaved short fragment dissociates and an intermediate product sequence with 3' suspension sequences at both ends is formed; or using a 5'-3' exonuclease to digest the double-stranded target nucleic acid, forming an intermediate product sequence with 3' suspension sequences at both ends; preferably, the first enzyme includes any one or more of terminal transferase, polymerase, or reverse transcriptase; more preferably, the terminal transferase includes a terminal transferase capable of adding a certain length of polyA, polyT, polyC, or polyG to the 3' end of DNA; more preferably, the polymerase includes E. coli Poly(A) polymerase and / or E. coli Poly(U) polymerase; more preferably, the reverse transcriptase includes Template Switching reverse transcriptase and / or M-MuLV reverse transcriptase; preferably, the second enzyme includes a ligase or transposase; more preferably, the ligase includes one or more of T4 DNA ligase, T3 DNA ligase, SplintR ligase, T4 RNA ligase 1, T4 RNA ligase 2, RtcB ligase, or Thermostable 5′App DNA / RNA ligase; more preferably, the transposase includes one or more of MuA transposase, Tn5 transposase, or Tn7 transposase; preferably, the modified nucleotide includes deoxyuridine dU and / or deoxyinosine dI, and the DNA polymerase capable of tolerating the modified nucleotide includes any one or more of the following: Q5 High-Fidelity DNA polymerase, Taq DNA polymerase, Bst DNA polymerase, phi29 DNA polymerase, Bsu Large Fragment DNA polymerase, or DNA polymerase I; the nuclease includes any one or more of the following: USER enzyme, USER II enzyme, USER... Exonuclease III or endonuclease V; preferably, the exonuclease includes any one or more of the following: T7 Exonuclease, Exonuclease VIII (truncated), Lambda Exonuclease or T5 Exonuclease.
[0018] Further, the polymerase includes DNA polymerase and / or RNA polymerase; preferably, the polymerase includes any one or more of the following: Q5 High-Fidelity DNA polymerase, Taq DNA polymerase, Bst DNA polymerase, SD DNA polymerase, phi29 DNA polymerase, Bsu Large Fragment DNA polymerase, Klenow fragment DNA polymerase, T4 DNA polymerase, T7 DNA polymerase, DNA Polymerase I, T3 RNA polymerase, T7 RNA polymerase, SP6 RNA polymerase, or E. coli RNA polymerase.
[0019] To achieve the above objectives, according to a second aspect of the present invention, a nanopore sequencing method is provided, the nanopore sequencing method comprising: obtaining a nucleic acid library using the above-described method for constructing a nucleic acid library, and passing the nucleic acid library through a nanopore under the action of an electric field to recognize double-stranded target nucleic acids.
[0020] Furthermore, the nanopores are located within a membrane; preferably, the membrane is bound to a restraint sequence, and the restraint sequence is bound to the membrane via a cholesterol modification at its end; the restraint sequence is at least partially complementary to the 3' end sequence of the second strand of the Y-type adapter described above.
[0021] Further, the nanopores are transmembrane protein pores or solid-state pores; preferably, the transmembrane protein in the transmembrane protein pores is selected from any one or more of the following: hemolysin, MspA, MspB, MspC, MspD, FraC, ClyA, PA63, CsgG, CsgD, XcpQ, SP1, phi29 connector protein, InvG or GspD.
[0022] Furthermore, the membrane includes an amphiphilic membrane; preferably, the membrane includes a phospholipid bilayer, a diblock copolymer, or a triblock copolymer.
[0023] By applying the technical solution of this invention, in the above-described method for constructing nucleic acid libraries, the target nucleic acid is subjected to suspension treatment to obtain intermediate product sequences containing 3' end suspension sequences at both ends. These intermediate product sequences are then amplified to form asymmetric double-stranded target nucleic acid products. Utilizing the asymmetric structure, different adapters are introduced at the two ends of the double-stranded target nucleic acid products to obtain the nucleic acid library. This method not only achieves high library construction efficiency but also increases the proportion of double-stranded sequencing libraries during nanopore sequencing, resulting in more effective sequencing data.
Attached Figure Description
[0024] The accompanying drawings, which form part of this application, are used to provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an undue limitation of the invention. In the drawings:
[0025] Figure 1 illustrates a method for constructing a nucleic acid library and a schematic diagram of nanopore sequencing according to an embodiment of the present invention.
[0026] Figure 2 illustrates a method for constructing a nucleic acid library and a schematic diagram of nanopore sequencing according to an embodiment of the present invention.
[0027] Figure 3 illustrates a method for constructing a nucleic acid library and a schematic diagram of nanopore sequencing according to an embodiment of the present invention.
[0028] Figure 4 illustrates a method for constructing a nucleic acid library and a schematic diagram of nanopore sequencing according to Embodiment 1 of the present invention.
[0029] Figure 5 shows a schematic diagram of the construction of a Y-type sequencing adapter complex according to Embodiment 1 of the present invention.
[0030] Figure 6 shows the electrophoresis results of the Y-type sequencing adapter complex according to Embodiment 1 of the present invention.
[0031] Figure 7 shows the result of the sequencing current signal according to Embodiment 1 of the present invention.
[0032] Figure 8 illustrates a method for constructing a nucleic acid library and a schematic diagram of nanopore sequencing according to Embodiment 2 of the present invention.
[0033] Figure 9 shows the results of the sequencing current signal according to Embodiment 2 of the present invention.
[0034] Figure 10 illustrates a method for constructing a nucleic acid library and a schematic diagram of nanopore sequencing according to Embodiment 3 of the present invention.
[0035] Figure 11 shows the result of the sequencing current signal according to Embodiment 3 of the present invention.
[0036] Figure 12 illustrates a method for constructing a nucleic acid library and a schematic diagram of nanopore sequencing according to Embodiment 4 of the present invention.
[0037] Figure 13 shows a schematic diagram of the construction of a Y-type sequencing adapter complex according to Embodiment 4 of the present invention.
[0038] Figure 14 shows the electrophoresis results of the Y-type sequencing adapter complex according to Embodiment 4 of the present invention.
[0039] Figure 15 shows the result of the sequencing current signal according to Embodiment 4 of the present invention.
Detailed Implementation
[0040] It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other. The present invention will now be described in detail with reference to the embodiments.
[0041] As mentioned in the background section, in existing technologies, sequencing libraries typically have only one strand that can be sequenced in nanopore sequencing, which affects the accuracy of sequencing results. Furthermore, existing technologies CN103827320B, CN106460061B, and patent application CN114945679A share common problems. When these methods are used for double-strand sequencing, the motor protein controls the sequencing of the first strand while acting on double-stranded nucleic acid, whereas it controls the sequencing of the second strand while acting on single-stranded nucleic acid. This has two consequences: (1) The kinetics of the same motor protein differ between the double-stranded and single-stranded cases, leading to differences between the sequencing electrical signals of the first and second strands, which increases the difficulty of base identification during analysis of the sequencing results. (2) After the first strand has been sequenced, the second strand exists in single-stranded form and easily forms secondary structures, which affects sequencing. Both problems reduce sequencing accuracy. Therefore, in this application, the inventors sought to develop a new method for constructing nucleic acid libraries, further developed a nanopore sequencing method based on this method, and propose a series of protection schemes in this application.
[0042] In a first typical embodiment of this application, a method for constructing a nucleic acid library is provided. The method includes: processing the ends of a double-stranded target nucleic acid to obtain a terminal processing product, the terminal processing product containing a first end and a second end, the first end and the second end each containing a terminal dangling sequence; and directionally ligating a first adapter and a second adapter to the first end and the second end of the terminal processing product, respectively, to obtain a nucleic acid library; wherein the first adapter is different from the second adapter.
[0043] In the above method, by using different first and second ends (i.e., the first and second ends have different dangling sequences), they are specifically linked to the first and second adapters, respectively, to obtain double-stranded fragments with different adapters at both ends. Based on this, a nucleic acid library capable of double-stranded sequencing can be obtained. Utilizing this asymmetric end can improve the specificity of different types of adapter ligation, thereby increasing the proportion of libraries capable of double-stranded sequencing and improving construction efficiency.
[0044] Preferably, starting from a double-stranded target nucleic acid, the above-described method for constructing a nucleic acid library yields a nucleic acid library containing the genetic information of the target nucleic acid. In this library, the genetic information located on both the sense and antisense strands of the target is contained on a single strand of the nucleic acid library. In the subsequent sequencing process, sequencing this single strand provides the genetic information of both strands of the double-stranded target nucleic acid, thus completing double-stranded sequencing.
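The idea that one library strand carries both the sense and antisense information can be sketched in Python as follows. This is a minimal illustration, not the applicant's analysis pipeline: the hairpin sequence `HAIRPIN`, the example target, and the helper names are hypothetical, the Y-type adapter sequence is omitted, and the hairpin is located by exact string matching, whereas real reads would require error-tolerant alignment.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def split_duplex_read(read: str, hairpin: str):
    """Split a read at the hairpin into two passes over the same target.

    Returns (first_pass, second_pass) with the second pass reverse
    complemented so both are in the same orientation, or None if the
    hairpin sequence is not found (exact match only, an assumption).
    """
    idx = read.find(hairpin)
    if idx < 0:
        return None
    first_pass = read[:idx]
    second_pass = revcomp(read[idx + len(hairpin):])
    return first_pass, second_pass


def agreement(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length passes."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


HAIRPIN = "CCCCGGGG"   # hypothetical hairpin adapter sequence
sense = "ACGTTGCAAT"   # hypothetical target (sense strand)
# Single library strand: sense strand, hairpin, then the antisense strand.
read = sense + HAIRPIN + revcomp(sense)
first_pass, second_pass = split_duplex_read(read, HAIRPIN)
print(first_pass, second_pass, agreement(first_pass, second_pass))
```

Positions where the two passes disagree mark candidate errors, which is why reading both strands from a single molecule can improve accuracy.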
[0045] In a preferred embodiment, the terminal dangling sequence comprises an oligonucleotide sequence; preferably, the length of the oligonucleotide sequence is 2-20 nt; preferably, the oligonucleotide sequence is a homopolyoligonucleotide sequence; more preferably, it is poly A, poly G, poly C or poly T.
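The preferred features of the terminal dangling sequence stated above (2-20 nt, homopolymeric, poly A, poly G, poly C or poly T) can be expressed as a small check. The function name is hypothetical and the sketch only restates the preferred ranges; it is not a requirement of the method.

```python
def is_preferred_dangling_sequence(seq: str, min_len: int = 2, max_len: int = 20) -> bool:
    """Return True if seq is a homopolymer of A, C, G or T within the length range."""
    seq = seq.upper()
    return (
        min_len <= len(seq) <= max_len   # preferred length of 2-20 nt
        and len(set(seq)) == 1           # a single repeated base (homopolymer)
        and seq[0] in "ACGT"             # poly A, poly C, poly G or poly T
    )


print(is_preferred_dangling_sequence("AAAAAAAAAA"))
print(is_preferred_dangling_sequence("AAGA"))
```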
[0046] In a preferred embodiment, processing the ends of the double-stranded target nucleic acid includes: i) performing suspension (overhang-forming) treatment on the ends of the double-stranded target nucleic acid to obtain an intermediate product sequence containing 3' end suspension sequences at both ends; ii) performing a first amplification on the intermediate product sequence to obtain an end-processed product containing a first end and a second end, wherein the sequence of the first end is different from the sequence of the second end; iii) directionally ligating a first adapter and a second adapter to the first end and the second end of the end-processed product, respectively, to obtain a nucleic acid library.
[0047] The above method yields end-processed products that have a first end and a second end, and the two ends differ from each other at the time they are ligated to the first adapter and / or the second adapter. In this application, the first and second ends refer to the upstream and downstream ends of the double-stranded target nucleic acid, not the 5' or 3' end of a single strand in the double-stranded structure. In the method of this application, the first and second ends may be the same at certain stages, but after further processing and before ligation to the adapters, the first and second ends differ, thereby achieving specific ligation between each adapter and its end.
[0048] In a preferred embodiment, the method includes: a) performing suspension treatment on the ends of a double-stranded target nucleic acid using a terminal transferase to obtain an intermediate product sequence, the intermediate product sequence containing a first end and a second end, the first end and the second end each containing a 3' end suspension sequence; b) performing a first amplification on the intermediate product sequence using a first primer to obtain a first amplification product; the first amplification is linear amplification, and the 3' end of the first primer is at least partially complementary to the 3' end suspension sequence of the intermediate product sequence; the first amplification product comprises a template strand and an amplification strand; the 5' end of the template strand and the 3' end of the amplification strand constitute the first end of the first amplification product, in which the 3' end of the amplification strand protrudes by one or more nucleotides beyond the 5' end of the template strand; the 3' end of the template strand and the 5' end of the amplification strand form the second end of the first amplification product; the 3' end of the template strand contains a 3' end dangling sequence, the 5' end of the amplification strand contains the first primer sequence, and the 5' end of the amplification strand protrudes beyond the 3' end of the template strand; the sequence of the first end is different from the sequence of the second end; c) the first adapter is ligated to the 3' end protruding nucleotide of the amplification strand at the first end of the first amplification product, and the second adapter is ligated to the 5' end protruding structure of the amplification strand at the second end of the first amplification product to obtain a nucleic acid library.
[0049] In the above method, first, in step a), the ends of the double-stranded target nucleic acid undergo suspension treatment (i.e., 3' dangling sequences are formed) to obtain an intermediate product containing a first end and a second end, each containing a 3' dangling sequence. At this stage, the dangling sequences on the first end and the second end may be the same or different.
[0050] Further, in step b), the intermediate product sequence is linearly amplified using the first primer (synthesis of the double strand, rather than multiple cycles of PCR amplification) to obtain the first amplification product. In the first amplification product, the first end differs from the second end due to the introduction of the first primer. The first amplification product obtained in this step has a double-stranded structure, including a template strand and an amplification strand. The template strand is the single-stranded DNA from the intermediate product obtained in step a), and the amplification strand is the single-stranded DNA obtained by amplification using the template strand as a template and the first primer as a primer. The template strand and the amplification strand together form the double-stranded first amplification product. Further, by performing operations such as 3' end extension at the 3' end of the amplification strand, the first end of the first amplification product can be obtained.
[0051] In the above method, due to the introduction of the first primer and the linear amplification, the first and second ends of the first amplification product are different from the first and second ends of the intermediate product sequence. The first and second ends are the sticky end structures contained at both ends of a specific double-stranded DNA fragment.
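Steps a) and b) can be sketched at the level of sequence strings. The following Python model is an illustration under stated assumptions, not the applicant's protocol: the target sequence, the 4-nt poly-A tail, the 5' portion of the first primer (`primer_overhang`) and the single protruding 3' A are hypothetical choices, and only one template strand is followed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def add_3prime_tail(strand: str, base: str = "A", length: int = 4) -> str:
    """Step a): append a homopolymer dangling sequence to the 3' end."""
    return strand + base * length


def linear_extension(template: str, primer_5p_overhang: str, tail: str,
                     extra_3p: str = "A") -> str:
    """Step b): one round of primer extension on the tailed template.

    The primer's 3' part is complementary to the tail; its 5' part stays
    single-stranded. extra_3p models the protruding 3' nucleotide(s) at
    the first end (a single A is an assumption for illustration).
    """
    if not template.endswith(tail):
        raise ValueError("template does not carry the expected 3' tail")
    primer = primer_5p_overhang + revcomp(tail)
    copied = revcomp(template[: len(template) - len(tail)])
    return primer + copied + extra_3p


target_top = "ATGGCCTA"      # hypothetical target strand, 5'->3'
tail = "A" * 4               # hypothetical poly-A dangling sequence
primer_overhang = "GACT"     # hypothetical 5' part of the first primer

template = add_3prime_tail(target_top, "A", 4)
amplification = linear_extension(template, primer_overhang, tail)

# First end: 3' protrusion of the amplification strand.
first_end_protrusion = amplification[-1:]
# Second end: 5' protrusion of the amplification strand (primer-derived).
second_end_protrusion = amplification[: len(primer_overhang)]
print(template, amplification, first_end_protrusion, second_end_protrusion)
```

In this model the two ends of the first amplification product carry different protrusions, which is the asymmetry that allows each adapter to be ligated to one specific end.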
[0052] It should be noted that the terms "first," "second," and "third" in this application are used only to distinguish different contents and do not involve any limitation on order, size, priority, etc.
[0053] In a preferred embodiment, the method further includes: performing a second amplification on the nucleic acid library to obtain an amplified nucleic acid library. This second amplification ensures that all sequencing strands pass through the nanopore in double-stranded form during sequencing, thereby improving the stability and accuracy of the sequencing results.
[0054] In a preferred embodiment, c) includes: ligating a first adapter to a 3' protruding nucleotide of the amplified strand at the first end of a first amplification product to obtain a first adapter ligation product; performing a third amplification on the first adapter ligation product to obtain an amplified first adapter ligation product, wherein the end of the first adapter ligation product contains a protruding structure; ligating a second adapter to the protruding structure at the end of the first adapter ligation product to obtain a nucleic acid library; preferably, the first adapter is a hairpin adapter and the second adapter is a Y-type adapter.
[0055] Preferably, performing a second amplification on this nucleic acid library containing hairpin adapters ensures that the sequencing strands all pass through the nanopore in double-stranded form, which can improve the stability and accuracy of the results in subsequent sequencing.
[0056] In this second amplification, the complementary double-stranded structures (template strand - amplification strand) in the hairpin sequencing structure unwind, and the unwound single-stranded DNA fragments all obtain corresponding complementary fragments through the second amplification.
[0057] Preferably, the hairpin adapter is connected by T/A complementary pairing to the end of the first amplification product that does not contain the first primer. Preferably, the hairpin adapter contains a characteristic sequence, which allows the sequencing direction, the sequenced strand, and so on to be distinguished during subsequent sequencing, facilitating the analysis and correction of sequencing data. The characteristic sequence includes, but is not limited to, non-natural nucleotides, UMI sequences, etc., as long as it can achieve this distinguishing function.
[0058] In a preferred embodiment, b) includes: the 3' end of the first primer binds to the 3' end dangling sequence of the double-stranded target nucleic acid; under the action of polymerase, the DNA single strand containing the 3' end dangling sequence is used as a template and the first primer is used as a primer for a first amplification to obtain a first amplification product; preferably, the first primer contains a first blocking nucleotide, the sequence on the 5' side of the first blocking nucleotide can bind to the second adapter, and the sequence on its 3' side can bind to the 3' end dangling sequence; preferably, the first blocking nucleotide includes an abasic nucleotide, a spacer, PNA, RNA, morpholino base, iso-dC, or iso-dG.
[0059] Preferably, the first primer contains a first blocking nucleotide; the sequence on its 5' side can specifically bind to the top strand of the Y-type adapter, and the sequence on its 3' side is at least partially complementary to the 3' end dangling sequence; preferably, the first blocking nucleotide includes, but is not limited to, an abasic nucleotide (apurinic / apyrimidinic site, AP site), or a spacer (including but not limited to iSp18, iSp9, iSpC3, iSpC6 or iSpC12), or PNA (peptide nucleic acid), or RNA, or morpholino base, or iso-dC (isocytosine), or iso-dG (isoguanine).
[0060] The 3' end of the first primer is at least partially complementary to the 3' dangling end, and the 5' end of the first primer is in a free state, forming a sticky end, providing a convenient connection method and location for subsequent ligation of other linkers. Preferably, the first primer contains a first blocking nucleotide. This non-natural nucleotide or ribonucleotide can block the movement of some polymerase along the complementary strand of the first primer from the 5' end to the 3' end during amplification. When the polymerase reaches the position of the first blocking nucleotide, amplification stops, thus preventing amplification of the 5' end of the first primer and preserving this sticky end.
[0061] The Y-type adapter mentioned above is a sequencing adapter, which can play a role in assisting sequencing in nucleic acid libraries, including but not limited to serving as a binding site for helicases and / or a binding sequence for restraint sequences in nanopore sequencing, helping nucleic acid libraries pass through nanopores to complete nanopore sequencing.
[0062] In a preferred embodiment, the ligation product of the second adapter is amplified a second or third time using a second primer set, the second primer set comprising: a second primer set sequence 1 capable of specifically binding to the first adapter, and a second primer set sequence 2 capable of being at least partially complementary to the 3' end dangling sequence.
[0063] In a preferred embodiment, the Y-connector comprises a first strand and a second strand, the 3' end of the first strand and the middle segment of the second strand being complementary, the 5' end of the second strand also comprising a 5' overhang, the 5' overhang being complementary to the sticky end of the 5' end of the first primer; the 5' end of the first strand contains a helicase binding site; the 5' end of the first strand and the 3' end of the second strand are not complementary.
[0064] The aforementioned Y-type adapter comprises a first strand and a second strand. The 5' end of the second strand is in a free state, forming a 5' overhang; the 3' end of the second strand is also in a free state and does not pair complementarily with the first strand; the middle segment of the second strand pairs complementarily with the 3' end of the first strand. Similarly, the 5' end of the first strand is in a free state, forming a "Y-shaped" structure together with the 3' end of the second strand. In this application, the terms "first strand" and "second strand" are relative, used only to distinguish the names of the different strands in a double-stranded structure, and are not affected by the specific upper, lower, left, or right positions shown in the accompanying drawings or other illustrations.
[0065] The aforementioned second primer set includes two primers (second primer set sequence 1 and second primer set sequence 2) for amplification using hairpin sequencing structures as templates. The second primer set includes, but is not limited to: second primer set sequence 1, which can specifically bind to the hairpin adapter by complementary pairing, so that the nucleic acid library is constructed through two amplification reactions; and second primer set sequence 2, which is at least partially complementary to the 3' end dangling sequence. First, amplification is performed using second primer set sequence 1. When the first primer becomes the template strand, the 3' end dangling sequence is re-exposed in a free state. Then, second primer set sequence 2 specifically binds to the 3' end dangling sequence and primes amplification.
[0066] In a preferred embodiment, the helicase binding site is bound to a helicase; preferably, the helicase is selected from any one or more of the following: Dda helicase, Pif1 helicase, XPD helicase, T7 Gp41 helicase, or DnaB helicase.
[0067] Helicases with unwinding directions of 5'-3' or 3'-5' are suitable for the above method. In actual use, the direction of the Y-connector can be flexibly set.
[0068] The Y-connector contains a helicase binding site. The helicase can bind to this binding site. By flexibly selecting a specific unwinding direction based on the location of the binding site, the helicase can unwind the double-stranded nucleic acid library. This allows the nucleic acids entering the nanopore for sequencing to form single strands under the action of the helicase and enter the nanopore in single-stranded form during subsequent nanopore sequencing. Meanwhile, the nucleic acid strands to be sequenced that have not yet entered the nanopore remain in double-stranded form, preventing them from folding into complex secondary structures that would affect the subsequent sequencing process and the accuracy of the sequencing.
[0069] In a preferred embodiment, the suspension treatment includes: adding a tail to the 3' end of a double-stranded target nucleic acid using a first enzyme to form an intermediate product sequence with 3' suspension sequences at both ends; or using a second enzyme to catalyze the attachment of a structure containing a 3' suspension to the end of the double-stranded target nucleic acid to form an intermediate product sequence with 3' suspension sequences at both ends; or using primers containing modified nucleotides and a DNA polymerase capable of tolerating the modified nucleotides to amplify the double-stranded target nucleic acid, and then using the corresponding nuclease to cleave the modified nucleotides to form a nick, whereupon the cleaved short fragment dissociates to form an intermediate product sequence with 3' suspension sequences at both ends; or using a 5'-3' exonuclease to digest the double-stranded target nucleic acid to form an intermediate product sequence with 3' suspension sequences at both ends; preferably, the first enzyme includes any one or more of terminal transferase, polymerase, or reverse transcriptase; more preferably, the terminal transferase includes a terminal transferase capable of adding a certain length of polyA, polyT, polyC, or polyG to the 3' end of DNA; more preferably, the polymerase includes E. coli Poly(A) polymerase and / or E. coli Poly(U) polymerase; more preferably, the reverse transcriptase includes Template Switching reverse transcriptase and / or M-MuLV reverse transcriptase; preferably, the second enzyme includes a ligase or transposase; more preferably, the ligase includes one or more of T4 DNA ligase, T3 DNA ligase, SplintR ligase, T4 RNA ligase 1, T4 RNA ligase 2, RtcB ligase, or Thermostable 5′App DNA / RNA ligase; more preferably, the transposase includes one or more of MuA transposase, Tn5 transposase, or Tn7 transposase; preferably, the modified nucleotide includes deoxyuridine dU and / or deoxyhypoxanthine dI, and the DNA polymerase capable of tolerating the modified nucleotide includes any one or more of the following: Q5 ultra-fidelity DNA polymerase, Taq DNA polymerase, Bst DNA polymerase, phi29 DNA polymerase, Bsu Large Fragment DNA polymerase, or DNA polymerase I; the nuclease includes any one or more of the following: USER enzyme, USER II enzyme, USER... Exonuclease III or endonuclease V; preferably, the exonuclease includes any one or more of the following: T7 Exonuclease, Exonuclease VIII (truncated), Lambda Exonuclease or T5 Exonuclease.
[0070] In the methods described above, various techniques can be used to form 3' suspensions at both ends of the double-stranded target nucleic acid. These include using specially modified primers (which function as primers and can take various forms) and polymerases to amplify the target nucleic acid containing the 3' suspension, forming asymmetric ends. The two asymmetric ends of the double-stranded target nucleic acid are connected to different adapters, one of which is a hairpin adapter, achieving double-strand ligation. New double-stranded nucleic acid libraries are obtained through primer and polymerase amplification, and these new double-stranded nucleic acids can be further constructed into new double-stranded nucleic acid libraries. Nucleic acid libraries obtained using the above methods can be used for single-molecule sequencing, such as nanopore sequencing.
[0071] The above method further includes exonuclease digestion after the target nucleic acid is ligated to the hairpin adapter, to remove target nucleic acid that was not successfully ligated to the hairpin adapter and thereby further increase the proportion of the library that can be used for double-stranded sequencing.
[0072] The nucleic acid libraries obtained by the above methods also support double-stranded sequencing in the following modes: one strand is the original (modified) target nucleic acid strand and the other is the unmodified polymerase-amplified strand, so that comparing the sequencing signals of the two strands further verifies the modification information; or one strand is the original target nucleic acid strand and the other is a polymerase-amplified strand synthesized with modified dNTPs, which can further amplify or optimize the sequencing signal, improve sequencing accuracy, and yield more feature information about the target nucleic acid.
[0073] The advantages of the above method are: it allows the target nucleic acid to be sequenced twice, improving sequencing accuracy; it increases the proportion and construction efficiency of libraries suitable for double-stranded sequencing, thus increasing the proportion of double-stranded sequencing; and the sequencing of both strands is motor protein-controlled, resulting in consistent signal characteristics. Furthermore, in the construction of nucleic acid libraries, since the template used for amplification is either the sense or the antisense strand, the first amplification product contains two amplification products. Theoretically, these two amplification products contain the same genetic information, differing only in the adapter sequences at the 5' or 3' ends. Therefore, after sequencing, the two nucleic acid libraries can be cross-checked, improving sequencing accuracy.
[0074] In a preferred embodiment, the polymerase includes DNA polymerase and / or RNA polymerase; preferably, the polymerase includes any one or more of the following: Q5 ultrafidelity DNA polymerase, Taq DNA polymerase, Bst DNA polymerase, SD DNA polymerase, phi29 DNA polymerase, Bsu large fragment DNA polymerase, Klenow fragment DNA polymerase, T4 DNA polymerase, T7 DNA polymerase, DNA Polymerase I, T3 RNA polymerase, T7 RNA polymerase, SP6 RNA polymerase, or E. coli RNA polymerase.
[0075] In a preferred embodiment, the polymerase used for amplification in the method includes DNA polymerase and / or RNA polymerase; preferably, the polymerase includes any one or more of the following: Q5 ultra-fidelity DNA polymerase, Taq DNA polymerase, Bst DNA polymerase, SD DNA polymerase, phi29 DNA polymerase, Bsu large fragment DNA polymerase, Klenow fragment DNA polymerase, T4 DNA polymerase, T7 DNA polymerase, DNA Polymerase I, T3 RNA polymerase, T7 RNA polymerase, SP6 RNA polymerase, or E. coli RNA polymerase; preferably, the amplification time in the method is 1-120 minutes, including but not limited to 1, 2, 5, 10, 20, 30, 60, 80, 100, or 120 minutes. Those skilled in the art can flexibly adjust the parameters used for amplification according to factors such as the length of amplification and the type of polymerase.
[0076] In a preferred embodiment, the amplification in the above method is performed in an amplification buffer containing a pH buffer system; preferably, the pH buffer system includes any one or more of the following: dihydrogen phosphate-hydrogen phosphate buffer system, carbonate-sodium bicarbonate buffer system, Tris-HCl buffer system, HEPES buffer system, MOPS buffer system; preferably, the amplification buffer contains one or more of NTP, dNTP, or ddNTP; preferably, the amplification buffer contains K⁺ and / or Na⁺; preferably, the amplification buffer also contains any one or more of the following metal ions: Mg²⁺, Mo²⁺, Cu²⁺, Fe²⁺, Zn²⁺, Ca²⁺, Pb²⁺, and Cd²⁺; preferably, the amplification buffer further contains additives that can enhance the first amplification and / or the second amplification; more preferably, the additives include any one or more of the following: dimethyl sulfoxide, glycerol, formamide, bovine serum albumin, ammonium sulfate, polyethylene glycol, gelatin, nonionic detergent, N,N,N-trimethylglycine, single-stranded nucleic acid binding protein, dithiothreitol or ethylenediaminetetraacetic acid.
[0077] The methods described above include, but are not limited to, those shown in Figures 1, 2, 3, and 4. The nucleic acid library can be constructed while flexibly adjusting the order and timing at which the first adapter and the second adapter are ligated.
[0078] In a second typical embodiment of this application, a nanopore sequencing method is provided, which includes: obtaining a nucleic acid library using the above-described method for constructing a nucleic acid library, and passing the nucleic acid library through a nanopore under the action of an electric field to recognize double-stranded target nucleic acids.
[0079] The Y-linker used in this application is a double-stranded structure formed by complementary pairing of the top and bottom strands, and the helicase binds to the free fragment within the Y-linker. During nanopore sequencing, the helicase continuously unwinds the double-stranded nucleic acid library, and the resulting single-stranded DNA enters the nanopore for sequencing. Sequencing of these nucleic acid libraries is a motor protein-controlled double-stranded sequencing process, resulting in relatively consistent signal characteristics. Nucleic acid libraries located away from the nanopore sequencing area and not yet sequenced by the nanopore retain their double-stranded structure to prevent the formation of secondary structures that could affect sequencing.
[0080] In a preferred embodiment, the nanopores are located in the membrane; preferably, the membrane is bound with a restraint sequence, and the restraint sequence is bound to the membrane by a cholesterol modification at its end; the restraint sequence is at least partially complementary to the 3' end sequence of the second strand described above.
[0081] Preferably, the second strand of the Y-type adapter can pair complementarily with the restraint sequence, which binds to the membrane material through the cholesterol modification at its end, thereby holding the nucleic acid library near the nanopore and increasing the rate at which the library passes through the pore.
[0082] Taking Figure 1 as an example, the restraint sequence (5' end cholesterol modification, represented by black circles) binds to the second strand of the Y-type adapter, binding the nucleic acid library to the membrane material near the nanopore (gray squares represent the membrane, white channels represent nanopores). After applying a sequencing voltage, under the action of an electric field (indicated by black arrows), the guide sequence of the sequencing adapter is captured by the nanopore, achieving double-stranded sequencing of the double-stranded target nucleic acid under the control of motor proteins.
[0083] In a preferred embodiment, the nanopore is a transmembrane protein pore or a solid-state pore; preferably, the transmembrane protein in the transmembrane protein pore is selected from any one or more of the following: hemolysin, MspA, MspB, MspC, MspD, FraC, ClyA, PA63, CsgG, CsgD, XcpQ, SP1, phi29 connector protein, InvG or GspD.
[0084] In a preferred embodiment, the membrane comprises an amphiphilic membrane; preferably, the membrane comprises a phospholipid bilayer, a diblock copolymer, or a triblock copolymer.
[0085] In a preferred embodiment, nanopore sequencing is performed in a sequencing buffer; preferably, the sequencing buffer contains a pH buffer system; preferably, the pH buffer system includes any one or more of the following: dihydrogen phosphate-hydrogen phosphate buffer system, carbonate-sodium bicarbonate buffer system, Tris-HCl buffer system, HEPES buffer system, MOPS buffer system; preferably, the sequencing buffer contains one or more of NTPs, dNTPs, or ddNTPs; preferably, the sequencing buffer contains K⁺ and / or Na⁺; preferably, the sequencing buffer also contains any one or more of the following metal ions: Mg²⁺, Mo²⁺, Cu²⁺, Fe²⁺, Zn²⁺, Ca²⁺, Pb²⁺, and Cd²⁺.
[0086] The beneficial effects of this application will be explained in more detail below with reference to specific embodiments.
[0087] Example 1
[0088] Figure 4 shows a schematic diagram of the target nucleic acid double-stranded sequencing method of Example 1.
[0089] 1. The 3' dangling sequence can be formed on the target nucleic acid by various known methods. In this example, terminal transferase was used to add polyA to the 3' ends of the double-stranded target nucleic acid. Following the manufacturer's instructions, polyA was added to the 3' ends of the target nucleic acid (SEQ ID NO: 1) carrying the characteristic sequence, using terminal transferase (NEB, M0315) and dATP (NEB, N0440S). The reaction mixture consisted of: 300 ng target nucleic acid, 1 μL terminal transferase (20,000 U / mL), 10 μL terminal transferase 10× Buffer, 10 μL CoCl2 (2.5 mM), 10 μL dATP (1 mM), and water to a total volume of 100 μL. The reaction conditions were: incubation at 37°C for 30 minutes, followed by incubation at 70°C for 10 minutes.
[0090] The underlined part indicates the feature sequence.
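The final concentrations implied by the step 1 reaction mixture follow from simple dilution. The sketch below is only an illustration of that arithmetic; the dictionary layout and the function name `final_concentration` are assumptions introduced here, and the stock values are taken from the mixture listed above.

```python
# Final concentrations in the 100 uL polyA-tailing reaction of step 1.
# Each entry: (stock concentration, volume added in uL), as listed in the text.
TOTAL_UL = 100.0

components = {
    "terminal transferase (U/mL)": (20_000.0, 1.0),
    "reaction buffer (x)": (10.0, 10.0),
    "CoCl2 (mM)": (2.5, 10.0),
    "dATP (mM)": (1.0, 10.0),
}


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


final = {name: final_concentration(s, v) for name, (s, v) in components.items()}

# Enzyme units added: 20,000 U/mL * 0.001 mL.
enzyme_units = 20_000.0 * (1.0 / 1000.0)

for name, value in final.items():
    print(f"{name}: {value:g}")
print(f"terminal transferase added: {enzyme_units:g} U")
```

Under these numbers the buffer ends up at 1×, CoCl2 at 0.25 mM, and dATP at 0.1 mM.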
[0091] 2. Purify the polyA-added product using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions.
[0092] 3. Dissolve primer 1 in TE buffer (pH=8) according to the manufacturer's instructions. Mix primer 1, polyA product, phi29 DNA polymerase, and phi29 DNA polymerase reaction buffer (NEB, M0269S), and dNTPs thoroughly for amplification. The reaction system is as follows: 200 ng of polyA product, 2 μL of phi29 polymerase (10,000 U / mL), 10 μL of phi29 polymerase 10× Buffer, 10 μL of dNTPs (10 mM), 10 μL of primer 1 (10 μM), and add water to a total volume of 100 μL. Amplification conditions: incubate at 30°C for 30 minutes. After amplification, inactivate phi29 DNA polymerase by treating at 65°C for 10 minutes.
[0093] The sequence of primer 1 is 5'Phosphorylation-TGCT-iSpC3-SEQ ID NO: 2 3'.
[0094] SEQ ID NO: 2: TGGTGCTGATATTGCTTTTTTTTTTTTTTTT;
[0095] Wherein, 5'Phosphorylation is a 5' phosphate modification.
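To make the role of the blocking nucleotide in primer 1 concrete, the following minimal sketch models the primer as an ordered list of segments. It assumes, as a simplification, that a polymerase copying the primer strand stops completely at the iSpC3 spacer. The segment representation and the function name `split_at_blocker` are illustrative assumptions, not part of the described method.

```python
# Primer 1 (5'->3') as given above: 5'Phosphorylation-TGCT-iSpC3-SEQ ID NO: 2 3'
SEQ_ID_NO_2 = "TGGTGCTGATATTGCTTTTTTTTTTTTTTTT"
PRIMER_1 = [("TGCT", "dna"), ("iSpC3", "blocker"), (SEQ_ID_NO_2, "dna")]


def split_at_blocker(segments):
    """Return (uncopied 5' part, copied 3' part) of a primer.

    Assumption: a polymerase using the primer as template reads it from its
    3' side and cannot pass the blocker closest to the 3' end.
    """
    blockers = [i for i, (_, kind) in enumerate(segments) if kind == "blocker"]
    if not blockers:
        return "", "".join(seq for seq, _ in segments)
    last = blockers[-1]  # blocker nearest the 3' end is met first
    five_prime = "".join(s for s, kind in segments[:last] if kind == "dna")
    three_prime = "".join(s for s, kind in segments[last + 1:] if kind == "dna")
    return five_prime, three_prime


sticky_end, copied_region = split_at_blocker(PRIMER_1)
print("5' part left single-stranded (sticky end):", sticky_end)
print("part copied into the complementary strand:", copied_region)
```

In this model the four nucleotides TGCT remain single-stranded, which corresponds to the sticky end described for the first primer.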
[0096] 4. Add 2 μL of Taq DNA polymerase (Beyotime, D7205-1) directly to the product from step 3 to add an A to the 3' end. The A-addition condition is incubation at 72°C for 10 minutes.
[0097] 5. Purify the amplification product using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain target nucleic acid with asymmetric ends at both ends, one end being a 3'A end and the other end being a sticky end.
[0098] 6. Construct a Y-type sequencing adapter complex with sticky ends. The construction process is shown in Figure 5, in which He denotes the helicase. Sequence A and Sequence B (SEQ ID NO: 5) were dissolved separately in TE buffer (pH=8) according to the manufacturer's instructions. Sequence A and Sequence B were annealed at a 1:1 ratio to form sequencing adapters. The annealing process involved incubation at 95°C for 5 minutes, followed by cooling to 25°C at a rate of 0.1°C / s, and then incubation for another 30 minutes. Prokaryotic expression of the helicase He (T4Dda-(ΔM1)G1 / E94C / C109A / C136A / K194L / A360C, SEQ ID NO: 6) was performed in *E. coli*, and the target protein was obtained after multi-step purification. The helicase and sequencing adapters were mixed at a molar ratio of 9:1, with a final reaction buffer concentration of 25 mM HEPES, 50 mM KCl, 0.5 mM EDTA, 2.5 mM MgCl2, pH=8.0, and incubated at room temperature for 30 minutes. The incubation product was incubated with 0.25 volumes of 5 mM ATP at room temperature for 30 minutes. The sequencing adapter complex was purified using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions. The purified sequencing adapter complex was then analyzed by electrophoresis, and the results are shown in Figure 6, indicating a large quantity of 1:1 sequencing adapter complex.
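Two quantities in step 6 follow from simple arithmetic: the duration of the cooling ramp and the ATP concentration after adding 0.25 volumes of 5 mM ATP. The sketch below assumes that volumes are additive and that "0.25 volumes" is relative to the volume of the incubation product; the function names are hypothetical.

```python
def ramp_seconds(start_c, end_c, rate_c_per_s):
    """Time needed to move between two temperatures at a constant rate."""
    return abs(start_c - end_c) / rate_c_per_s


def diluted_concentration(stock, added_volumes, sample_volumes=1.0):
    """Concentration of an additive after mixing, assuming additive volumes."""
    return stock * added_volumes / (sample_volumes + added_volumes)


# Annealing program: 95 C for 5 min, cool to 25 C at 0.1 C/s, hold 30 min.
cooling_s = ramp_seconds(95.0, 25.0, 0.1)
anneal_total_min = 5 + cooling_s / 60 + 30

# 0.25 volumes of 5 mM ATP added to 1 volume of incubation product.
atp_final_mm = diluted_concentration(5.0, 0.25)

print(f"cooling ramp: {cooling_s:.0f} s")
print(f"total annealing program: {anneal_total_min:.1f} min")
print(f"ATP after mixing: {atp_final_mm:.2f} mM")
```

Under these assumptions the ramp takes about 700 seconds and the ATP concentration after mixing is about 1 mM.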
[0099] The sequence of sequence A is: 5'(iSpC3)30-SEQ ID NO:3-(iSp18)4-SEQ ID NO:4 3'.
[0100] SEQ ID NO: 3: TTTTTTTTTT.
[0101] SEQ ID NO: 4: GGTTGTTTCTGTTGGTGCTGATATTGCT.
[0102] The sequence of sequence B is: SEQ ID NO: 5:
[0103] 5'AGCAAGCAATATCAGCACCAACAGAAACAACCTTTGAGGCGAGCGGTCAA 3'.
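The pairing relationships described for the Y-type adapter can be checked directly on the listed sequences. The following sketch is an illustrative check only: it locates the reverse complement of SEQ ID NO: 4 inside SEQ ID NO: 5 and splits sequence B into a 5' overhang, a duplex segment, and a free 3' tail. The helper `revcomp` and the variable names are assumptions introduced here.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


SEQ_ID_NO_4 = "GGTTGTTTCTGTTGGTGCTGATATTGCT"  # 3' part of sequence A
SEQ_ID_NO_5 = "AGCAAGCAATATCAGCACCAACAGAAACAACCTTTGAGGCGAGCGGTCAA"  # sequence B
PRIMER_1_STICKY_END = "TGCT"  # 5' end of primer 1, upstream of iSpC3

# Segment of sequence B expected to pair with the 3' end of sequence A.
duplex = revcomp(SEQ_ID_NO_4)
start = SEQ_ID_NO_5.find(duplex)
if start < 0:
    raise ValueError("SEQ ID NO: 4 has no complementary segment in SEQ ID NO: 5")

overhang_5p = SEQ_ID_NO_5[:start]            # free 5' overhang of sequence B
tail_3p = SEQ_ID_NO_5[start + len(duplex):]  # free 3' arm of sequence B
pairs_with_primer = revcomp(overhang_5p) == PRIMER_1_STICKY_END

print("5' overhang:", overhang_5p)
print("duplex segment length:", len(duplex))
print("free 3' tail:", tail_3p)
print("overhang complementary to primer 1 sticky end:", pairs_with_primer)
```

With the sequences as listed, the 5' overhang of sequence B is AGCA, which is complementary to the TGCT sticky end of primer 1.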
[0104] 7. Construct hairpin adapters with 5'T ends. Dissolve the single-stranded hairpin sequence A in TE buffer (pH=8) according to the manufacturer's instructions and anneal to form hairpin adapter A. The annealing process is as follows: incubate at 95°C for 5 minutes, cool to 25°C at a rate of 0.1°C / s, and continue incubating for 30 minutes.
[0105] The sequence of hairpin sequence A is 5'Phosphorylation-SEQ ID NO:7-iSp18-SEQ ID NO:8 3'.
[0106] SEQ ID NO: 7: TCTCTCTCTCTCTCTTTTTTCCTCCTCCTTTTTT.
[0107] SEQ ID NO: 8:TTTTTGAGAGAGAGAGAT.
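As a rough check of how hairpin sequence A can fold, the sketch below counts the contiguous base pairs formed between the 5' end of SEQ ID NO: 7 and the 3' end of SEQ ID NO: 8. It assumes that the last 3' nucleotide (T) stays unpaired as a single-base overhang for TA ligation and that pairing is strictly Watson-Crick with no bulges. Both are simplifying assumptions, and the function `stem_length` is hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


SEQ_ID_NO_7 = "TCTCTCTCTCTCTCTTTTTTCCTCCTCCTTTTTT"  # 5' arm of hairpin A
SEQ_ID_NO_8 = "TTTTTGAGAGAGAGAGAT"                  # 3' arm of hairpin A


def stem_length(arm_5p, arm_3p, overhang_3p=1):
    """Contiguous pairs between the 5' end of arm_5p and the 3' end of arm_3p.

    overhang_3p nucleotides at the 3' end are assumed to stay unpaired.
    """
    partner = revcomp(arm_3p[:len(arm_3p) - overhang_3p])
    length = 0
    for a, b in zip(arm_5p, partner):
        if a != b:
            break
        length += 1
    return length


stem = stem_length(SEQ_ID_NO_7, SEQ_ID_NO_8)
overhang = SEQ_ID_NO_8[-1]
print("stem length (bp):", stem)
print("assumed 3' overhang:", overhang)
```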
[0108] 8. Following the manufacturer's instructions, perform the ligation reaction using the NEBNext Quick Ligation Module (NEB, E6056) to ligate the target nucleic acid with asymmetric ends, the Y-type sequencing adapter complex, and hairpin adapter A. The reaction mixture consisted of: 200 ng of target nucleic acid with asymmetric ends, 2.5 μL of the Y-type sequencing adapter complex, 2.5 μL of hairpin adapter A, 20 μL of 5× ligation buffer, 10 μL of T4 DNA ligase, and water to a total volume of 100 μL. The reaction was incubated at 25°C for 60 minutes.
[0109] 9. Purify the ligation product using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain a double-stranded nucleic acid library.
[0110] 10. Perform nanopore sequencing. A single-channel nanopore detection system was built using a patch clamp and a signal amplifier, and a single pore protein was embedded in the membrane. The double-stranded nucleic acid library and the restraint sequence were mixed and added to the single-channel system, and the current signal was recorded at 180 mV. The sequencing buffer consisted of 470 mM KCl, 25 mM HEPES, 10 mM MgCl2, and 30 mM ATP, pH 8.10; the sequencing temperature was 30 °C.
[0111] The restraint sequence is 5'Cholesterol-(iSp18)4-SEQ ID NO: 9 3'.
[0112] SEQ ID NO: 9: TTGACCGCTCGC. Wherein, 5'Cholesterol is a 5' cholesterol modification.
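The statement that the restraint sequence is at least partially complementary to the 3' end of the second strand of the Y-type adapter can be checked on the sequences of this example. The sketch below is illustrative only; `revcomp` and the variable names are assumptions introduced here.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


SEQ_ID_NO_5 = "AGCAAGCAATATCAGCACCAACAGAAACAACCTTTGAGGCGAGCGGTCAA"  # sequence B
SEQ_ID_NO_9 = "TTGACCGCTCGC"  # DNA part of the restraint sequence

# Site on sequence B that the restraint sequence would pair with.
binding_site = revcomp(SEQ_ID_NO_9)
site_start = SEQ_ID_NO_5.rfind(binding_site)
binds_3p_end = SEQ_ID_NO_5.endswith(binding_site)

print("binding site on sequence B:", binding_site)
print("site start index (0-based):", site_start)
print("site is at the 3' end of sequence B:", binds_3p_end)
```

With the sequences as listed, the 12 nucleotides of SEQ ID NO: 9 pair with the last 12 nucleotides of sequence B.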
[0113] 11. Sequencing yielded current signals for double-stranded nanopore sequencing of the target nucleic acid. A representative current signal is shown in Figure 7. The special sequence on the hairpin adapter produces a clear characteristic signal, and the current signals from sequencing the two strands are denoted 1D and 2D, respectively. Of all sequencing signals, the proportion showing double-stranded sequencing was 41 / 62 = 66%.
[0114] Example 2
[0115] Figure 8 shows a schematic diagram of the target nucleic acid double-stranded sequencing method in Example 2.
[0116] 1. The nucleic acid library from Example 1 was amplified to obtain a new double-stranded nucleic acid library. Primers 2 (SEQ ID NO: 10) and 3 (SEQ ID NO: 11) were dissolved in TE buffer (pH=8) according to the manufacturer's instructions. Primers 2, 3, the nucleic acid library from Example 1, phi29 DNA polymerase and phi29 DNA polymerase reaction buffer (NEB, M0269S), and dNTPs were mixed and amplified. The reaction system consisted of: 200 ng nucleic acid library, 2 μL phi29 polymerase (10,000 U / mL), 10 μL phi29 polymerase 10× Buffer, 10 μL dNTPs (10 mM), 10 μL primer 2 (10 μM), 10 μL primer 3 (10 μM), and water added to a total volume of 100 μL. The amplification conditions were incubation at 30°C for 30 minutes.
[0117] SEQ ID NO: 10:AAAAAAGGAGGAGGA.
[0118] SEQ ID NO: 11: TGGTGCTGATATTGC.
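How primers 2 and 3 relate to the adapter and primer sequences of Example 1 can be inspected with a short script. The sketch below only compares the listed sequences: it checks whether primer 2 has a complementary site in the hairpin sequence (SEQ ID NO: 7) and whether primer 3 matches the 5' part of SEQ ID NO: 2, in which case it would anneal to a strand copied from primer 1. The interpretation in the comments is an assumption drawn from these comparisons.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


SEQ_ID_NO_2 = "TGGTGCTGATATTGCTTTTTTTTTTTTTTTT"      # DNA part of primer 1
SEQ_ID_NO_7 = "TCTCTCTCTCTCTCTTTTTTCCTCCTCCTTTTTT"   # 5' arm of hairpin A
SEQ_ID_NO_10 = "AAAAAAGGAGGAGGA"                     # primer 2
SEQ_ID_NO_11 = "TGGTGCTGATATTGC"                     # primer 3

# Primer 2 anneals wherever its reverse complement occurs in the hairpin arm.
primer2_site = revcomp(SEQ_ID_NO_10)
primer2_binds_hairpin = primer2_site in SEQ_ID_NO_7

# Primer 3 is compared with the 5' part of SEQ ID NO: 2; identity means it
# would pair with the complement of primer 1, not with primer 1 itself.
primer3_matches_primer1 = SEQ_ID_NO_2.startswith(SEQ_ID_NO_11)

print("primer 2 site in hairpin:", primer2_site, primer2_binds_hairpin)
print("primer 3 identical to 5' part of SEQ ID NO: 2:", primer3_matches_primer1)
```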
[0119] 2. Purify the amplification products using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain a new double-stranded nucleic acid library.
[0120] 3. Perform nanopore sequencing as described in Example 1. The sequencing yielded the current signal of the double-stranded target nucleotide nanopore sequencing. A representative current signal is shown in Figure 9. The special sequence on the hairpin connector can form a clear characteristic signal. The current signals of the two strands are represented by 1D and 2D, respectively.
[0121] Example 3
[0122] Figure 10 shows a schematic diagram of the target nucleic acid double-stranded sequencing method in Example 3.
[0123] 1. Construct hairpin adapters with 5'T ends. Dissolve hairpin sequence B in TE buffer (pH=8) according to the manufacturer's instructions and anneal to form hairpin adapter B. The annealing process is as follows: incubate at 95°C for 5 minutes, cool down to 25°C at a rate of 0.1°C / s, and continue incubating for 30 minutes.
[0124] The sequence of hairpin sequence B is 5'Phosphorylation-SEQ ID NO:12-iSp18-SEQ ID NO:13 3'.
[0125] SEQ ID NO: 12: TCTCTCTCAAAAAAATTTTCCTCCTCCTTTTTT.
[0126] SEQ ID NO: 13: TTTTTTTTTTTTTGAGAGAGAT.
[0127] 2. Following the manufacturer's instructions, the target nucleic acid with asymmetric ends (Example 1) and hairpin adapter B were ligated using the NEBNext Quick Ligation Module (NEB, E6056). The reaction mixture consisted of 200 ng of target nucleic acid with asymmetric ends, 2.5 μL of hairpin adapter B, 20 μL of 5× ligation buffer, 10 μL of T4 DNA ligase, and water to a total volume of 100 μL. The reaction was incubated at 25°C for 60 minutes.
[0128] 3. Purify the ligation product using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain the target nucleic acid with one end connected to a hairpin connector.
[0129] 4. Amplification using polymerase to obtain new double-stranded nucleic acids. Primer 2 (SEQ ID NO: 10) and primer 3 (SEQ ID NO: 11) were dissolved in TE buffer (pH=8) according to the manufacturer's instructions. Primer 2, primer 3, the target nucleic acid with a hairpin adapter at one end, phi29 DNA polymerase, phi29 DNA polymerase reaction buffer (NEB, M0269S), and dNTPs were mixed and amplified. The reaction system consisted of: 100 ng of target nucleic acid with a hairpin adapter at one end, 1 μL of phi29 polymerase (10,000 U / mL), 5 μL of phi29 polymerase 10× Buffer, 5 μL of dNTPs (10 mM), 5 μL of primer 2 (10 μM), 5 μL of primer 3 (10 μM), and water added to a total volume of 50 μL. Amplification was performed at 30°C for 30 minutes.
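The 50 μL reaction in this step uses the same proportions as the 100 μL phi29 amplification in Example 2, scaled by one half. The sketch below illustrates that scaling; the dictionary keys and the function `scale_recipe` are hypothetical names, and the volume of the template itself is not stated in the text, so only the combined volume left for template and water is computed.

```python
# 100 uL phi29 amplification from Example 2, step 1 (amounts as listed there).
EXAMPLE_2_RECIPE = {
    "template_ng": 200,
    "phi29_polymerase_uL": 2,
    "buffer_10x_uL": 10,
    "dNTPs_10mM_uL": 10,
    "primer_2_10uM_uL": 10,
    "primer_3_10uM_uL": 10,
    "total_uL": 100,
}


def scale_recipe(recipe, factor):
    """Scale every amount in a recipe by the same factor."""
    return {name: amount * factor for name, amount in recipe.items()}


half_scale = scale_recipe(EXAMPLE_2_RECIPE, 0.5)

# Volume left for template plus water once the listed reagents are added.
reagent_volume = sum(
    amount
    for name, amount in half_scale.items()
    if name.endswith("_uL") and name != "total_uL"
)
template_plus_water_uL = half_scale["total_uL"] - reagent_volume

for name, amount in half_scale.items():
    print(f"{name}: {amount:g}")
print(f"template + water: {template_plus_water_uL:g} uL")
```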
[0130] 5. Purify the amplification product using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain new double-stranded nucleic acids.
[0131] 6. Following the manufacturer's instructions, the novel double-stranded nucleic acid and the Y-type sequencing adapter complex (Example 1) were ligated using the NEBNext Quick Ligation Module (NEB, E6056). The reaction mixture consisted of 100 ng of the novel double-stranded nucleic acid, 1.25 μL of the Y-type sequencing adapter complex, 10 μL of 5× ligation buffer, 5 μL of T4 DNA ligase, and water to a total volume of 50 μL. The reaction was incubated at 25°C for 60 minutes.
[0132] 7. Purify the ligation product using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain a double-stranded nucleic acid library.
[0133] 8. Perform nanopore sequencing as described in Example 1. The sequencing yielded the current signal of the double-stranded nanopore sequencing of the target nucleotide. A representative current signal is shown in Figure 11. The special sequence on the hairpin connector can form a clear characteristic signal. The current signals of the two strands are represented by 1D and 2D, respectively.
[0134] Example 4
[0135] Figure 12 shows a schematic diagram of the target nucleic acid double-stranded sequencing method in Example 4.
[0136] 1. A specially designed sequencing adapter complex was constructed, as shown in Figure 13. Sequences C and D were dissolved in TE buffer (pH=8) according to the manufacturer's instructions. Sequences C and D were annealed at a 1:1 ratio to form the sequencing adapter. The annealing process involved incubation at 95°C for 5 minutes, followed by cooling to 25°C at a rate of 0.1°C / s, and then incubation for another 30 minutes. Prokaryotic expression of the helicase He (T4Dda-(ΔM1)G1 / E94C / C109A / C136A / K194L / A360C, SEQ ID NO: 6) was performed in *E. coli*, and the target protein was obtained after multi-step purification. The helicase and sequencing adapter were mixed at a molar ratio of 9:1, and the final concentration of the reaction buffer was 25 mM HEPES, 50 mM KCl, 0.5 mM EDTA, 2.5 mM MgCl2, pH=8.0. The mixture was incubated at room temperature for 30 minutes. The incubation product was incubated with 0.25 volumes of 5 mM ATP at room temperature for 30 minutes. The sequencing adapter complex was purified using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions. The purified sequencing adapter complex was then analyzed by electrophoresis, and the results are shown in Figure 14, revealing a large quantity of 1:1 sequencing adapter complex.
[0137] The sequence of sequence C is: 5'(iSpC3)30-SEQ ID NO:14-(iSp18)4-SEQ ID NO:15 3'.
[0138] SEQ ID NO: 14:TTTTTTTTTT.
[0139] SEQ ID NO: 15: GGTTGTTTCTGTTGGTGCTGATATTGCTTTTTTTTTTTTTTTT.
[0140] The sequence of sequence D is SEQ ID NO: 16: 5'(LNA_A)(LNA_A)(LNA_T)(LNA_A)TCAGCACCAACAGAAACAACCTTTGAGGCGAGCGGTCAA 3'. The LNA mentioned above is a locked nucleic acid.
[0141] 2. The specially designed sequencing adapter complex, polyA product (Example 1), phi29 DNA polymerase, and phi29 DNA polymerase reaction buffer (NEB, M0269S) were mixed and amplified. The reaction system consisted of: 300 ng of polyA product, 2 μL of phi29 polymerase (10,000 U / mL), 10 μL of phi29 polymerase 10× Buffer, 10 μL of dNTPs (10 mM), and 1 μL of sequencing adapter complex, with water added to a total volume of 100 μL. Amplification was performed at 30°C for 30 minutes.
[0142] 3. Purify the amplification product using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain the amplification product with blunt ends.
[0143] 4. Following the manufacturer's instructions, add an A to the 3' ends of the amplification product using Klenow Fragment (3'→5' exo-) (NEB, M0212). The reaction mixture consisted of: 300 ng of amplification product, 5 μL of 10× NEBuffer 2, 0.5 μL of dATP (10 mM), 3 μL of Klenow Fragment (3'→5' exo-), and water to a total volume of 50 μL. The A-addition was performed at 37°C for 10 minutes.
[0144] 5. Purify the A-added product using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain an amplified product with 3'A ends.
[0145] 6. Ligate the A-tailed amplification product to the hairpin adapter (Example 1) using the NEBNext Quick Ligation Module (NEB, E6056) according to the manufacturer's instructions. The reaction mixture consisted of 200 ng of amplification product, 2.5 μL of hairpin adapter, 20 μL of 5× ligation buffer, 10 μL of T4 DNA ligase, and water to a total volume of 100 μL. The reaction was incubated at 25°C for 60 minutes.
[0146] 7. Purify the ligation product using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain a double-stranded nucleic acid library.
[0147] 8. Amplify the double-stranded nucleic acid library to obtain a new double-stranded nucleic acid library. Dissolve primer 2 (SEQ ID NO: 10) and primer 4 (SEQ ID NO: 17) in TE buffer (pH=8) according to the manufacturer's instructions. Mix primer 2, primer 4, double-stranded nucleic acid library, phi29 DNA polymerase and phi29 DNA polymerase reaction buffer (NEB, M0269S), and dNTPs thoroughly for amplification. The reaction system is as follows: 100 ng double-stranded nucleic acid library, 1 μL phi29 polymerase (10,000 U / mL), 5 μL phi29 polymerase 10× Buffer, 5 μL dNTPs (10 mM), 5 μL primer 2 (10 μM), 5 μL primer 4 (10 μM), and add water to a total volume of 50 μL. Amplification conditions: incubation at 30°C for 30 minutes.
[0148] SEQ ID NO: 17: TTTTTTTTTTTTTTT.
[0149] 9. Purify the amplification products using AMPure XP beads (Beckman Coulter, A63882) according to the manufacturer's instructions to obtain a new double-stranded nucleic acid library.
[0150] 10. Perform nanopore sequencing as described in Example 1. Sequencing yielded current signals for nanopore sequencing of the double-stranded target nucleic acid; a representative current signal is shown in Figure 15. The special sequence in the hairpin adapter produces a clear characteristic signal, and the current signals of the two strands are labeled 1D and 2D, respectively.
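The reaction set-ups in steps 2, 4, 6 and 8 can be sanity-checked with a short Python sketch. It is an illustration only: the function names are hypothetical, and because the concentrations of the DNA samples are not stated, the sketch reports the volume left for DNA plus water rather than a water volume.

```python
def remaining_volume_ul(total_ul, reagent_ul):
    """Volume left for the DNA sample plus water after the listed reagents."""
    used = sum(reagent_ul.values())
    if used > total_ul:
        raise ValueError("listed reagents exceed the total reaction volume")
    return total_ul - used


def final_concentration(stock, volume_ul, total_ul):
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * volume_ul / total_ul


# Reagent volumes in microlitres, copied from steps 2, 4, 6 and 8.
reactions = {
    "step 2": (100, {"phi29 polymerase": 2, "10x buffer": 10,
                     "dNTPs": 10, "adapter complex": 1}),
    "step 4": (50, {"10x NEBuffer 2": 5, "dATP": 0.5, "Klenow exo-": 3}),
    "step 6": (100, {"hairpin adapter": 2.5, "5x ligation buffer": 20,
                     "T4 DNA ligase": 10}),
    "step 8": (50, {"phi29 polymerase": 1, "10x buffer": 5, "dNTPs": 5,
                    "primer 2": 5, "primer 4": 5}),
}

for name, (total_ul, reagents) in reactions.items():
    left = remaining_volume_ul(total_ul, reagents)
    print(f"{name}: {left} uL left for DNA sample plus water")

# Final concentrations implied by the stated stocks and volumes.
print("dNTPs, step 2 (mM):", final_concentration(10, 10, 100))
print("dATP, step 4 (mM):", final_concentration(10, 0.5, 50))
print("each primer, step 8 (uM):", final_concentration(10, 5, 50))
print("phi29 units, step 2 (U):", 10_000 * 2 / 1000)  # U/mL x uL -> U
```

Under the stated volumes, the dNTPs are at 1 mM in both amplifications, dATP is at 0.1 mM during A-tailing, each primer is at 1 μM in step 8, and every buffer is diluted to 1×.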
[0151] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method of constructing a nucleic acid library, characterized in that, The method includes: The ends of a double-stranded target nucleic acid are processed to obtain an end-processed product, wherein the end-processed product contains a first end and a second end, and the first end and the second end each contain a terminal dangling sequence; The first adapter and the second adapter are directionally ligated to the first end and the second end of the end-processed product, respectively, to obtain the nucleic acid library; the first adapter is different from the second adapter.
2. The method of claim 1, wherein, The terminal dangling sequence includes an oligonucleotide sequence; Preferably, the length of the oligonucleotide sequence is 2-20 nt; Preferably, the oligonucleotide sequence is a homopolymeric oligonucleotide sequence; more preferably, it is poly A, poly G, poly C, or poly T.
3. The method according to claim 1 or 2, characterized in that, The processing of the ends of the double-stranded target nucleic acid includes: i) The ends of the double-stranded target nucleic acid are subjected to dangling treatment to obtain intermediate product sequences containing 3' dangling sequences at both ends; ii) A first amplification is performed on the intermediate product sequence to obtain the end-processed product, the end-processed product containing the first end and the second end, wherein the sequence of the first end is different from the sequence of the second end; iii) The first adapter and the second adapter are directionally ligated to the first end and the second end of the end-processed product, respectively, to obtain the nucleic acid library.
4. The method of claim 3, wherein, The method includes: a) The ends of the double-stranded target nucleic acid are subjected to dangling treatment using terminal transferase to obtain an intermediate product sequence, wherein the intermediate product sequence contains a first end and a second end, and the first end and the second end each contain a 3' dangling sequence; b) The intermediate product sequence is amplified using the first primer to obtain a first amplification product; the first amplification is a linear amplification, and the 3' end of the first primer is at least partially complementary to the 3' end dangling sequence of the intermediate product sequence. The first amplification product comprises a template strand and an amplification strand; The 5' end of the template strand and the 3' end of the amplification strand form the first end of the first amplification product; the 3' end of the amplification strand protrudes by one or more nucleotides beyond the 5' end of the template strand. The 3' end of the template strand and the 5' end of the amplification strand form the second end of the first amplification product; the 3' end of the template strand contains the 3' end dangling sequence, the 5' end of the amplification strand contains the sequence of the first primer, and the 5' end of the amplification strand protrudes beyond the 3' end of the template strand. The sequence at the first end is different from the sequence at the second end; c) The first adapter is ligated to the 3' protruding nucleotide of the amplification strand at the first end of the first amplification product, and the second adapter is ligated to the 5' protruding structure of the amplification strand at the second end of the first amplification product to obtain the nucleic acid library.
5. The method according to claim 1 or 4, characterized in that, The method further includes: The nucleic acid library is then subjected to a second amplification to obtain the amplified nucleic acid library.
6. The method of claim 4, wherein, c) includes: The first adapter is ligated to the 3' protruding nucleotide of the amplification strand at the first end of the first amplification product to obtain a first adapter ligation product. The first adapter ligation product is subjected to a third amplification to obtain an amplified first adapter ligation product, the end of which has a protruding structure. The second adapter is ligated to the protruding structure at the end of the amplified first adapter ligation product to obtain the nucleic acid library; Preferably, the first adapter is a hairpin adapter and the second adapter is a Y-type adapter.
7. The method of claim 4, wherein, b) includes: The 3' end of the first primer binds to the 3' end dangling sequence of the double-stranded target nucleic acid. Under the action of a polymerase, the first amplification is performed using the single strand of DNA containing the 3' end dangling sequence as a template and the first primer as a primer to obtain the first amplification product. Preferably, the first primer contains a first blocking nucleotide, the 5' end sequence of the first primer can bind to the second adapter, and the 3' end sequence of the first primer can bind to the 3' end dangling sequence. Preferably, the first blocking nucleotide includes an abasic nucleotide, a spacer, PNA, RNA, morpholino, iso-dC, or iso-dG.
8. The method according to claim 5 or 6, characterized in that, The ligation product of the second adapter is amplified using a second primer set, wherein the second primer set comprises: a second primer sequence 1 capable of specifically binding to the first adapter, and a second primer sequence 2 that is at least partially complementary to the 3' end dangling sequence.
9. The method of claim 6, wherein, The Y-type adapter includes a first chain and a second chain. The 3' end of the first primer and the middle segment of the second primer are complementarily paired. The 5' end of the second primer also contains a 5' overhang, which can complementarily bind to the sticky end at the 5' end of the first primer. The 5' end of the first chain contains a helicase binding site; The 5' end of the first chain and the 3' end of the second chain are not complementary.
10. The method of claim 9, wherein, A helicase is bound to the helicase binding site; Preferably, the helicase is selected from any one or more of the following: Dda helicase, Pif1 helicase, XPD helicase, T7 Gp41 helicase, or DnaB helicase.
11. The method of claim 3, wherein, The dangling treatment includes: A first enzyme is used to add a tail to the 3' end of the double-stranded target nucleic acid, forming an intermediate product sequence with the 3' end dangling sequence at each end; or Under catalysis by a second enzyme, a structure containing a 3' dangling sequence is attached to the end of the double-stranded target nucleic acid, forming an intermediate product sequence with the 3' end dangling sequence at each end; or The double-stranded target nucleic acid is amplified using primers containing modified nucleotides and a DNA polymerase capable of tolerating the modified nucleotides. A corresponding nuclease then cleaves at the modified nucleotides to create a nick. The resulting short fragment dissociates to form an intermediate product sequence containing the 3' end dangling sequences at both ends; or The double-stranded target nucleic acid is digested using a 5'-3' exonuclease to form an intermediate product sequence containing the 3' end dangling sequence at each end; Preferably, the first enzyme includes any one or more of terminal transferase, polymerase, or reverse transcriptase; More preferably, the terminal transferase includes a terminal transferase capable of adding a certain length of polyA, polyT, polyC, or polyG to the 3' end of DNA; More preferably, the polymerase includes E. coli Poly(A) polymerase and/or E. coli Poly(U) polymerase; More preferably, the reverse transcriptase includes Template Switching reverse transcriptase and/or M-MuLV reverse transcriptase; Preferably, the second enzyme includes a ligase or a transposase; More preferably, the ligase includes one or more of T4 DNA ligase, T3 DNA ligase, SplintR ligase, T4 RNA ligase 1, T4 RNA ligase 2, RtcB ligase, or Thermostable 5′ App DNA/RNA ligase; More preferably, the transposase includes one or more of MuA transposase, Tn5 transposase, or Tn7 transposase; Preferably, the modified nucleotides include deoxyuridine (dU) and/or deoxyinosine (dI). The DNA polymerase capable of tolerating modified nucleotides includes any one or more of the following: Q5 High-Fidelity DNA polymerase, Taq DNA polymerase, Bst DNA polymerase, phi29 DNA polymerase, Bsu Large Fragment DNA polymerase, or DNA polymerase I. The nuclease includes any one or more of the following: USER enzyme, USER II enzyme, USER III enzyme, or endonuclease V. Preferably, the exonuclease includes any one or more of the following: T7 Exonuclease, Exonuclease VIII (truncated), Lambda Exonuclease, or T5 Exonuclease.
12. The method of claim 7, wherein, The polymerase includes DNA polymerase and/or RNA polymerase; Preferably, the polymerase comprises any one or more of the following: Q5 High-Fidelity DNA polymerase, Taq DNA polymerase, Bst DNA polymerase, SD DNA polymerase, phi29 DNA polymerase, Bsu Large Fragment DNA polymerase, Klenow fragment DNA polymerase, T4 DNA polymerase, T7 DNA polymerase, DNA Polymerase I, T3 RNA polymerase, T7 RNA polymerase, SP6 RNA polymerase, or E. coli RNA polymerase.
13. A method of nanopore sequencing, characterized in that, The nanopore sequencing method includes: obtaining a nucleic acid library using the method for constructing a nucleic acid library according to any one of claims 1-12, and passing the nucleic acid library through a nanopore under the action of an electric field to recognize the double-stranded target nucleic acid.
14. The nanopore sequencing method of claim 13, wherein, The nanopores are located within the membrane; Preferably, the membrane is bound to a restraint sequence, and the restraint sequence is bound to the membrane via a terminal cholesterol modification; The restraint sequence is at least partially complementary to the 3' end sequence of the second strand in the method for constructing a nucleic acid library according to claim 9.
15. The nanopore sequencing method of claim 13, wherein, The nanopores are transmembrane protein pores or solid-state pores; Preferably, the transmembrane protein in the transmembrane protein pore is selected from any one or more of the following: hemolysin, MspA, MspB, MspC, MspD, FraC, ClyA, PA63, CsgG, CsgD, XcpQ, SP1, phi29 connector protein, InvG, or GspD.
16. The nanopore sequencing method of claim 14, wherein, The membrane includes an amphiphilic membrane; Preferably, the membrane comprises a phospholipid bilayer, a diblock copolymer, or a triblock copolymer.
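The asymmetric ends recited in claim 4 can be visualized with a toy string model in Python. This is a sketch under stated assumptions, not the claimed method itself: the insert sequence, the primer tag, and the function names are hypothetical placeholders, the primer's 3' portion is taken to be exactly complementary to the whole tail, and the protruding 3' nucleotide is passed in as a parameter rather than derived from polymerase behavior.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def first_amplification(template, tail_len, primer_tag, extra_3prime="A"):
    """Toy model of one round of linear amplification.

    template     : 5'->3' strand whose last `tail_len` bases are the
                   3' dangling sequence (for example a poly A tail)
    primer_tag   : hypothetical 5' portion of the first primer that has
                   no counterpart on the template
    extra_3prime : nucleotide(s) protruding at the 3' end of the new strand
    """
    tail = template[-tail_len:]
    primer = primer_tag + reverse_complement(tail)
    copied = reverse_complement(template[:-tail_len])
    return {
        "amplification_strand": primer + copied + extra_3prime,
        # First end: 3' protrusion of the amplification strand.
        "first_end_overhang": extra_3prime,
        # Second end: 5' protrusion carrying the primer tag.
        "second_end_overhang": primer_tag,
    }


# Hypothetical example: a 6-nt insert with a 5-nt poly A tail.
product = first_amplification("ACGGTC" + "AAAAA", tail_len=5, primer_tag="CCTAG")
print(product["amplification_strand"])
print(product["first_end_overhang"], product["second_end_overhang"])
```

In this model the two ends of the duplex carry different single-stranded protrusions, which is what allows a different adapter to be directed to each end.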