Circularization and amplification on a support for generating immobilized nucleic acid concatemer molecules
By using splint-capture primers and pinning primers for circularization and rolling circle amplification, nucleic acid tandem template molecules are generated and immobilized on a support, addressing the inadequacy of existing nucleic acid library preparation methods and improving sequencing speed and data output.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- ELEMENT BIOSCIENCES INC
- Filing Date
- 2024-07-23
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies are insufficient for effectively preparing nucleic acid libraries suitable for next-generation sequencing, resulting in insufficient sequencing speed and data output.
Linear library molecules are circularized and immobilized on a support by using splint-capture primers and pinning primers. Rolling circle amplification with a strand-displacing polymerase and a nucleotide mixture then generates nucleic acid tandem template molecules, which compaction oligonucleotides condense into DNA nanospheres.
It improves the efficiency of nucleic acid library preparation and sequencing speed, is suitable for batch sequencing and repeated sequencing, and enhances the output of sequencing data.
Figure CN122249550A_ABST
Abstract
Description
[0001] Cross-references to related applications
[0002] This application claims priority to and the benefit of U.S. Provisional Application No. 63/515,328, filed July 24, 2023, the contents of which are incorporated herein by reference in their entirety.
[0003] Incorporation by reference of the sequence listing
[0004] The contents of the electronic sequence listing (ELEM-022_001WO_SeqListing_ST26.xml; size: 158,025 bytes; creation date: July 22, 2024) are incorporated herein by reference in their entirety.
Technical Field
[0005] This disclosure relates to compositions and methods for preparing nucleic acid libraries for next-generation sequencing, and to methods for sequencing libraries prepared using the techniques disclosed herein. The nucleic acid libraries contain nucleic acid tandem template molecules, which can be generated by hybridizing linear library molecules with a plurality of immobilized splint-capture primers and performing rolling circle amplification. The resulting libraries can be used in downstream sequencing workflows, including batch sequencing and repeated sequencing.
Background Technology
[0006] Improvements in next-generation sequencing (NGS) technology have significantly increased sequencing speed and data output, resulting in high sample throughput on current sequencing platforms. Efficient preparation of sequencing libraries suitable for NGS applications is crucial for downstream amplification and sequencing workflows. Therefore, alternative methods for generating and sequencing nucleic acid libraries are needed. This disclosure provides compositions, methods, and kits to meet this need. The compositions and methods disclosed herein can be used to generate a plurality of nucleic acid tandem template molecules immobilized to a support and compatible with various downstream sequencing methods. The plurality of nucleic acid tandem template molecules can be generated using a plurality of linear library molecules and a plurality of immobilized splint-capture primers. This disclosure also provides methods for seeding and reseeding the support. The immobilized nucleic acid tandem template molecules can also be used in downstream sequencing workflows, including batch sequencing and repeated sequencing workflows.
Summary of the Invention
[0007] This disclosure provides a method for generating a plurality of nucleic acid tandem template molecules immobilized to a support, the method comprising:
(a) providing a support having a plurality of splint-capture primers (200) and a plurality of pinning primers (500) immobilized thereon, wherein each of the plurality of splint-capture primers (200) comprises a first portion (210) that binds a first universal binding site in a linear library molecule (100) and a second portion (220) that binds a second universal binding site in the same linear library molecule (100), wherein the density of the splint-capture primers (200) on the support is 10⁵/mm² to 10¹⁵/mm², wherein each of the plurality of pinning primers (500) binds at least a portion of a tandem template molecule, and wherein each pinning primer (500) includes a non-extendable terminal 3' end;
(b) providing a plurality of linear library molecules (100), wherein each of the plurality of linear library molecules includes a target sequence and any one adaptor sequence, or any combination of two or more adaptor sequences, in any order, wherein the adaptor sequences include: (i) a first universal binding site (120) for the first portion of a splint-capture primer, or its complementary sequence; (ii) a universal binding site (123) for a first non-splint capture primer, or its complementary sequence; (iii) at least one sample index sequence comprising a left sample index sequence (160) and/or a right sample index sequence (170), wherein the left sample index sequence (160) and/or the right sample index sequence (170) distinguish target sequences obtained from different sample sources in multiplex assays; (iv) at least one universal binding site (140) for a forward sequencing primer, or its complementary sequence; (v) at least one universal binding site (150) for a reverse sequencing primer, or its complementary sequence; (vi) at least one universal binding site for a compaction oligonucleotide, or its complementary sequence; (vii) at least one unique molecular index (UMI) sequence comprising a left unique molecular index sequence (180) and/or a right unique molecular index sequence (190), which can be used to uniquely identify the linear library molecule (100) to which the unique molecular index sequence is appended; (viii) at least one universal binding site for a pinning primer, or its complementary sequence; (ix) at least one batch-specific barcode sequence; (x) a universal binding site (133) for a second non-splint capture primer, or its complementary sequence; (xi) at least one short random sequence that is about 3 to 20 nucleotides in length and provides nucleotide sequence diversity; and/or (xii) a second universal binding site (130) for the second portion of a splint-capture primer, or its complementary sequence;
(c) contacting the plurality of splint-capture primers (200) with the plurality of linear library molecules (100) under conditions suitable for hybridizing individual linear library molecules with individual splint-capture primers to form open circular library molecules (300), each having a nick or gap between its 5' and 3' ends, wherein each linear library molecule contains a first universal binding site (120) that hybridizes with the first portion (210) of a splint-capture primer, and wherein the same linear library molecule (100) contains a second universal binding site (130) that hybridizes with the second portion (220) of the same splint-capture primer, thereby generating a plurality of open circular library molecules (300) containing nicks or gaps;
(d) enzymatically closing the nicks or gaps in the plurality of open circular library molecules to generate a plurality of covalently closed circular library molecules (400), wherein each covalently closed circular library molecule is hybridized with a splint-capture primer (200);
(e) contacting the plurality of covalently closed circular library molecules (400) with a rolling circle amplification reaction mixture and performing a rolling circle amplification reaction to generate a plurality of tandem template molecules, wherein the plurality of tandem template molecules are immobilized on the support, wherein the density of the tandem template molecules on the support is 10⁵/mm² to 10¹⁵/mm², wherein the rolling circle amplification reaction mixture contains a strand-displacing polymerase and a nucleotide mixture containing dATP, dGTP, dCTP, dTTP and dUTP, wherein the rolling circle amplification reaction mixture contains a plurality of single-stranded nucleic acid compaction oligonucleotides, wherein the 5' and 3' regions of each single-stranded nucleic acid compaction oligonucleotide hybridize with universal binding sites on a tandem template molecule, thereby pulling distal portions of the tandem template molecule together and compacting the tandem template molecule to form a DNA nanosphere, wherein the terminal 3' end of each single-stranded nucleic acid compaction oligonucleotide is non-extendable, and wherein at least one portion of each tandem template molecule hybridizes with a pinning primer immobilized on the support; and
(f) performing at least one sequencing reaction to determine the sequence of at least a portion of the plurality of tandem template molecules.
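The splint hybridization, nick closure, and rolling circle amplification steps can be pictured with a small string model. The following is only an illustrative sketch, not the disclosed method: it assumes error-free base pairing, equal-length splint portions, and one binding site at each end of the library molecule, and the function names (`design_splint`, `leaves_nick`, `rolling_circle`) and the example sequences are hypothetical.

```python
# Toy string model of splint-guided circularization followed by rolling
# circle amplification. All names and sequences here are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_splint(five_prime_site: str, three_prime_site: str) -> str:
    """Splint that pairs with both ends of a linear library molecule.

    In the closed circle the 3'-terminal site runs directly into the
    5'-terminal site, so the splint is the reverse complement of that junction.
    """
    return revcomp(three_prime_site + five_prime_site)


def leaves_nick(library: str, splint: str) -> bool:
    """True if both library ends pair with the splint and abut (nick, no gap).

    Assumes the two splint portions have equal length.
    """
    arm = len(splint) // 2
    junction = library[-arm:] + library[:arm]
    return len(splint) == 2 * arm and revcomp(splint) == junction


def rolling_circle(circle: str, copies: int) -> str:
    """Tandem product: repeated copies of the circle's reverse complement.

    The starting phase set by the primer position is ignored in this sketch.
    """
    return revcomp(circle) * copies


library = "ACGTAC" + "GGGTTTCCCAAA" + "TTGCAG"   # 5' site + target + 3' site
splint = design_splint(library[:6], library[-6:])
print(leaves_nick(library, splint))              # True
print(len(rolling_circle(library, 3)))           # 72
```

In this model a ligatable nick exists only when the two library ends sit side by side on the splint; a library molecule with an extra 5' base would fail the check, which loosely corresponds to the gap or flap cases that need additional enzymatic processing before closure.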
[0008] This disclosure provides a method for generating a plurality of nucleic acid tandem template molecules immobilized to a support, the method comprising:
(a) providing a support having a plurality of splint-capture primers (200) and a plurality of pinning primers (500) immobilized thereon, wherein each of the plurality of splint-capture primers (200) comprises a first portion (210) that binds a first universal binding site in a linear library molecule (100) and a second portion (220) that binds a second universal binding site in the same linear library molecule (100), wherein the density of the splint-capture primers (200) on the support is 10⁵/mm² to 10¹⁵/mm², wherein each of the plurality of pinning primers (500) binds at least a portion of a tandem template molecule, and wherein each pinning primer (500) includes a non-extendable terminal 3' end;
(b) providing a plurality of linear library molecules (100), wherein each of the plurality of linear library molecules includes a target sequence and any one adaptor sequence, or any combination of two or more adaptor sequences, in any order, wherein the adaptor sequences include: (i) a first universal binding site (120) for the first portion of a splint-capture primer, or its complementary sequence; (ii) a universal binding site (123) for a first non-splint capture primer, or its complementary sequence; (iii) at least one sample index sequence comprising a left sample index sequence (160) and/or a right sample index sequence (170) that distinguishes target sequences obtained from different sample sources in multiplex assays; (iv) at least one universal binding site (140) for a forward sequencing primer, or its complementary sequence; (v) at least one universal binding site (150) for a reverse sequencing primer, or its complementary sequence; (vi) at least one universal binding site for a compaction oligonucleotide, or its complementary sequence; (vii) at least one unique molecular index (UMI) sequence comprising a left unique molecular index sequence (180) and/or a right unique molecular index sequence (190), which can be used to uniquely identify the linear library molecule (100) to which the unique molecular index sequence is appended; (viii) at least one universal binding site for a pinning primer, or its complementary sequence; (ix) at least one batch-specific barcode sequence; (x) a universal binding site (133) for a second non-splint capture primer, or its complementary sequence; (xi) at least one short random sequence (132) that is about 3 to 20 nucleotides in length and provides nucleotide sequence diversity; and/or (xii) a second universal binding site (130) for the second portion of a splint-capture primer, or its complementary sequence;
(c) (i) contacting the plurality of splint-capture primers (200) with the plurality of linear library molecules (100) under conditions suitable for hybridizing individual linear library molecules (100) with individual splint-capture primers (200) to form open circular library molecules (300), each containing a 5' overhang flap structure, and (ii) contacting the 5' overhang flap structures with a flap-cleaving reagent under conditions suitable for cleaving the 5' overhang flap structures to generate a plurality of open circular library molecules, each having a newly cleaved 5' end and an uncleaved 3' end and containing a nick between the 5' end and the 3' end, wherein each linear library molecule contains a universal binding site that hybridizes with the first portion (210) of the splint-capture primer;
(d) enzymatically closing the nicks in the plurality of open circular library molecules (300) to generate a plurality of covalently closed circular library molecules (400), wherein each covalently closed circular library molecule is hybridized with a splint-capture primer (200);
(e) contacting the plurality of covalently closed circular library molecules (400) with a rolling circle amplification reaction mixture and performing a rolling circle amplification reaction to generate a plurality of tandem template molecules immobilized on the support, wherein the density of the tandem template molecules on the support is 10⁵/mm² to 10¹⁵/mm², wherein the rolling circle amplification reaction mixture contains a strand-displacing polymerase and a nucleotide mixture containing dATP, dGTP, dCTP, dTTP and dUTP, wherein the rolling circle amplification reaction mixture contains a plurality of single-stranded nucleic acid compaction oligonucleotides, wherein the 5' and 3' regions of each single-stranded nucleic acid compaction oligonucleotide hybridize with universal binding sites on a tandem template molecule to pull distal portions of the tandem template molecule together, thereby compacting the tandem template molecule to form a DNA nanosphere, wherein the terminal 3' end of each single-stranded nucleic acid compaction oligonucleotide is non-extendable, and wherein at least one portion of each tandem template molecule hybridizes with a pinning primer (500); and
(f) performing at least one sequencing reaction to determine the sequence of at least a portion of the plurality of tandem template molecules.
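The roles of the sample index and the unique molecular index in the adaptor list can be illustrated with a short grouping routine. This is a hedged sketch rather than part of the disclosure: it assumes each read has already been parsed into a `(sample_index, umi, target)` tuple, and the function name `demultiplex` and the example sequences are hypothetical.

```python
# Group reads by sample index and collapse repeats that share a UMI.
# The tuple layout and all sequences are hypothetical illustrations.
from collections import defaultdict


def demultiplex(reads):
    """Return {sample_index: sorted unique (umi, target) pairs}.

    The sample index assigns a read to its source sample; the UMI lets
    repeated observations of one library molecule be counted once.
    """
    by_sample = defaultdict(set)
    for sample_index, umi, target in reads:
        by_sample[sample_index].add((umi, target))
    return {sample: sorted(pairs) for sample, pairs in by_sample.items()}


reads = [
    ("AAC", "CAT", "ACGT"),   # sample AAC, molecule CAT
    ("AAC", "CAT", "ACGT"),   # same molecule observed again
    ("GGT", "CAT", "ACGT"),   # same UMI but a different sample
    ("AAC", "TTG", "ACGT"),   # second molecule from sample AAC
]
print(demultiplex(reads))
```

Keying on the sample index first and the UMI second keeps molecules from different samples apart even when they happen to carry the same UMI.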
[0009] In some embodiments, the plurality of splint-capture primers (200) in step (a) are located at random and non-predetermined positions on the support. In some embodiments, the plurality of splint-capture primers (200) in step (a) comprises a plurality of nearest-neighbor splint-capture primers that contact and/or overlap one another when the support is viewed from any angle, including from above, from below, or from the side. In some embodiments, the plurality of splint-capture primers in step (a) comprises at least a first subgroup of splint-capture primers having a first sequence and a second subgroup of splint-capture primers having a second sequence different from the first sequence.
[0010] In some embodiments, the plurality of linear library molecules (100) in step (b) comprises at least a first subgroup and a second subgroup of linear library molecules. In some embodiments, the linear library molecules (100) in the first subgroup comprise a mixture of target sequences, and the linear library molecules (100) in the second subgroup comprise a mixture of target sequences. In some embodiments, the first subgroup of linear library molecules comprises a universal binding site (140-1) or its complementary sequence for a first-batch specific forward sequencing primer, a universal binding site (150-1) or its complementary sequence for a first-batch specific reverse sequencing primer, and a first-batch specific barcode sequence (142); and the second subgroup of linear library molecules comprises a universal binding site (140-2) or its complementary sequence for a second-batch specific forward sequencing primer, a universal binding site (150-2) or its complementary sequence for a second-batch specific reverse sequencing primer, and a second-batch specific barcode sequence (152).
[0011] In some embodiments, the plurality of tandem template molecules in step (e) comprises at least a first subgroup and a second subgroup of tandem template molecules. In some embodiments, the first and second subgroups of tandem template molecules are located at random and non-predetermined positions on the support, and the tandem template molecules in the first and second subgroups include nearest-neighbor tandem template molecules that contact or overlap one another when the support is viewed from any angle, including from above, from below, or from the side.
[0012] In some embodiments, the sequencing in step (f) includes performing first-batch repeated sequencing. In some embodiments, the first-batch repeated sequencing includes: (a) hybridizing the first subgroup of tandem template molecules with a plurality of first-batch-specific forward sequencing primers and performing a plurality of sequencing reactions, thereby generating a plurality of first-batch sequencing reads, wherein the length of each first-batch sequencing read does not exceed 50 bases; (b) stopping or blocking the first-batch sequencing in step (a) to inhibit further sequencing reactions; (c) removing the plurality of first-batch sequencing reads from the first subgroup of tandem template molecules while retaining the first subgroup of tandem template molecules; and (d) re-sequencing the first subgroup of tandem template molecules by repeating steps (a)-(c) at least once.
[0013] In some embodiments, the sequencing in step (f) further includes performing second-batch repeated sequencing. In some embodiments, the second-batch repeated sequencing includes: (a) hybridizing the second subgroup of tandem template molecules with a plurality of second-batch-specific forward sequencing primers and performing a plurality of sequencing reactions, thereby generating a plurality of second-batch sequencing reads, wherein the length of each second-batch sequencing read does not exceed 50 bases; (b) stopping or blocking the second-batch sequencing in step (a) to inhibit further sequencing reactions; (c) removing the plurality of second-batch sequencing reads from the second subgroup of tandem template molecules while retaining the second subgroup of tandem template molecules; and (d) re-sequencing the second subgroup of tandem template molecules by repeating steps (a)-(c) at least once.
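The batch-specific repeated sequencing loop can be summarized as a small simulation. This is a sketch under simplifying assumptions (idealized error-free reads, with a read modeled as the first bases of the region next to the primer site); the `TandemTemplate` class, its fields, and `resequence` are hypothetical names introduced only for illustration.

```python
# Sketch of batch-specific repeated sequencing: only templates carrying the
# chosen batch's primer site are read, reads are capped at 50 bases, and the
# templates are retained so the same subgroup can be read again.
from dataclasses import dataclass

MAX_READ_LEN = 50  # upper bound on batch read length stated in the text


@dataclass(frozen=True)
class TandemTemplate:
    batch: int    # which batch-specific sequencing primer site it carries
    region: str   # sequence adjacent to that primer site (hypothetical field)


def resequence(templates, batch, rounds, read_len=MAX_READ_LEN):
    """Return one list of reads per round for the selected batch."""
    if not 1 <= read_len <= MAX_READ_LEN:
        raise ValueError("read length must be between 1 and 50 bases")
    reads_per_round = []
    for _ in range(rounds):
        # (a) batch-specific primers hybridize only to matching templates
        reads = [t.region[:read_len] for t in templates if t.batch == batch]
        # (b)-(c) sequencing stops and reads are stripped; templates stay put
        reads_per_round.append(reads)
    return reads_per_round


templates = [
    TandemTemplate(1, "A" * 80),
    TandemTemplate(2, "C" * 30),
    TandemTemplate(1, "G" * 10),
]
first_batch = resequence(templates, batch=1, rounds=2)
print([len(r) for r in first_batch[0]])   # [50, 10]
```

Because the templates are never consumed, every round on a given batch returns the same reads in this idealized model; in practice the repeated rounds would be compared to improve base-calling confidence.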
[0014] In some embodiments, (i) step (c) includes distributing a first subgroup of linear library molecules onto the support under conditions suitable for hybridizing individual linear library molecules from the first subgroup with individual splint-capture primers (200) to generate a first subgroup of open circular library molecules, each having a nick or gap, wherein the support contains an excess of immobilized splint-capture primers relative to the first subgroup of linear library molecules; (ii) step (d) includes enzymatically closing the nicks or gaps to generate a first subgroup of covalently closed circular library molecules, wherein each covalently closed circular library molecule is hybridized with a splint-capture primer; (iii) step (e) includes performing a rolling circle amplification reaction to generate a first subgroup of tandem template molecules; (iv) step (f) includes purifying the first subgroup of tandem template molecules... (v) the method further includes stopping the sequencing of the first subgroup of tandem template molecules; (vi) distributing a second subgroup of linear library molecules onto the same support under conditions suitable for hybridizing individual linear library molecules from the second subgroup with individual splint-capture primers (200) to generate a second subgroup of open circular library molecules with nicks or gaps, enzymatically closing the nicks or gaps to generate a second subgroup of covalently closed circular library molecules, wherein each covalently closed circular library molecule is hybridized with a splint-capture primer, and performing rolling circle amplification to generate a second subgroup of tandem template molecules; and (vii) continuing to sequence at least a portion of the first subgroup of tandem template molecules or at least a portion of the second subgroup of tandem template molecules.
[0015] In some embodiments, the sequencing in step (f) includes pairwise sequencing. In some embodiments, pairwise sequencing includes: (a) generating a plurality of extended forward sequencing primer strands by contacting the plurality of tandem template molecules with a plurality of forward sequencing primers under conditions suitable for hybridizing at least one forward sequencing primer with at least one of the universal binding sites (140) for the forward sequencing primers in the tandem template molecules, and performing a forward sequencing reaction using the hybridized forward sequencing primers, a plurality of sequencing polymerases, and a plurality of nucleotide reagents; (b) retaining the plurality of tandem template molecules immobilized on the support and replacing the plurality of extended forward sequencing primer strands with a plurality of forward extension strands, which are obtained by using tandem template molecules... (c) generating abasic sites at the uridine nucleotides in the tandem template molecules hybridized with the forward extension strands and creating gaps at the abasic sites, thereby generating a plurality of gap-containing tandem template molecules while retaining the plurality of forward extension strands and the plurality of immobilized splint-capture primers and pinning primers; and (d) sequencing the plurality of forward extension strands by contacting them with a plurality of soluble reverse sequencing primers, a plurality of sequencing polymerases, and a plurality of nucleotide reagents, and performing a reverse sequencing reaction, thereby generating a plurality of extended reverse sequencing primer strands.
[0016] In some embodiments, the sequencing in step (f) includes chain termination sequencing. In some embodiments, chain termination sequencing includes: (a) contacting a plurality of tandem template molecules with a plurality of sequencing polymerases and a plurality of nucleic acid sequencing primers, wherein the contact is performed under conditions suitable for forming a plurality of sequencing polymerase complexes comprising a sequencing polymerase that binds to a nucleic acid duplex, wherein the nucleic acid duplex comprises a portion of the tandem template molecule that hybridizes to the nucleic acid sequencing primer; (b) contacting the plurality of sequencing polymerase complexes with a plurality of nucleotides comprising a detectable label and a blocking portion at a 2' sugar position or a 3' sugar position, wherein the contact is performed under conditions suitable for binding at least one nucleotide to one of the sequencing polymerase complexes and conditions suitable for promoting polymerase-catalyzed nucleotide incorporation; (c) incorporating the nucleotide into the 3' end of a nucleic acid sequencing primer of at least one sequencing polymerase complex, thereby generating a sequencing polymerase complex comprising the incorporated nucleotide; (d) detecting the incorporated nucleotide; (e) removing the blocking portion from the incorporated nucleotide; and (f) repeating steps (b)-(e) at least once.
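The incorporate, detect, and deblock cycle of chain termination sequencing can be traced with a minimal model. This sketch assumes perfect incorporation and deblocking efficiency and takes the template in the order the polymerase encounters it (3' to 5'); the function name `reversible_terminator_run` and the dictionary fields are hypothetical.

```python
# Minimal model of sequencing with labeled, reversibly blocked nucleotides:
# one base is added per cycle because the blocking moiety prevents a second
# incorporation until it is removed.
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reversible_terminator_run(template: str, cycles: int) -> str:
    """Return the base calls made over `cycles` cycles.

    `template` lists template bases in the order the polymerase reads them.
    """
    calls = []
    for base in template[:cycles]:
        # (b)-(c) a complementary labeled, blocked nucleotide is incorporated
        nucleotide = {"base": PAIR[base], "blocked": True}
        # (d) the label on the incorporated nucleotide is detected
        calls.append(nucleotide["base"])
        # (e) the blocking moiety is removed so the next cycle can extend
        nucleotide["blocked"] = False
    return "".join(calls)


print(reversible_terminator_run("GATTACA", 4))   # CTAA
```

The number of cycles, not the template length, sets the read length, which is why repeating steps (b) through (e) "at least once" extends the read one base at a time.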
[0017] In some embodiments, the sequencing in step (f) includes: (a) contacting a plurality of tandem template molecules with a plurality of sequencing polymerases and a plurality of nucleic acid sequencing primers, wherein the contact is performed under conditions suitable for forming a plurality of sequencing polymerase complexes, each comprising a sequencing polymerase bound to a nucleic acid duplex, wherein the nucleic acid duplex comprises a portion of the tandem template molecule hybridized to the nucleic acid sequencing primer; (b) contacting the plurality of sequencing polymerase complexes with a plurality of nucleotides, each comprising a phosphate chain and a detectable label attached to a phosphate group of the chain, wherein the contact is performed under conditions suitable for binding at least one nucleotide to one of the sequencing polymerase complexes and conditions suitable for promoting polymerase-catalyzed nucleotide incorporation; (c) incorporating the nucleotide into the 3' end of the sequencing primer of at least one sequencing polymerase complex, thereby generating a sequencing polymerase complex comprising the incorporated nucleotide; (d) detecting the incorporated nucleotide; and (e) repeating steps (b)-(d) at least once.
[0018] In some embodiments, the sequencing in step (f) includes: (a) contacting a plurality of tandem template molecules with a plurality of first sequencing polymerases and a plurality of nucleic acid sequencing primers, wherein the contact is performed under conditions suitable for binding the plurality of first sequencing polymerases with the plurality of tandem template molecules and the plurality of nucleic acid sequencing primers, thereby forming a plurality of first polymerase complexes, each comprising a first sequencing polymerase bound to a nucleic acid duplex, wherein the nucleic acid duplex comprises a tandem template molecule hybridized to a nucleic acid sequencing primer; (b) contacting the plurality of first polymerase complexes with a plurality of detectably labeled multivalent molecules, wherein each of the multivalent molecules contains a core attached to a plurality of nucleotide arms, and each nucleotide arm is attached to a nucleotide moiety, wherein the contact is performed under conditions suitable for binding complementary nucleotide moieties of the multivalent molecules to at least two of the plurality of first polymerase complexes, thereby forming a plurality of multivalent-binding polymerase complexes, and under conditions suitable for inhibiting the incorporation of the complementary nucleotide moieties into the nucleic acid sequencing primers of the plurality of multivalent-binding polymerase complexes; (c) detecting the plurality of multivalent-binding polymerase complexes; and (d) identifying the bases of the nucleotide moieties in the plurality of multivalent-binding polymerase complexes, thereby determining the sequence of the tandem template molecule.
In some embodiments, the method further includes (e) dissociating the plurality of multivalent-binding polymerase complexes by removing the plurality of first sequencing polymerases and the bound multivalent molecules while retaining the nucleic acid duplexes, thereby generating a plurality of retained nucleic acid duplexes; (f) contacting the plurality of retained nucleic acid duplexes from step (e) with a plurality of second sequencing polymerases under conditions suitable for binding the plurality of second sequencing polymerases with the plurality of retained nucleic acid duplexes, thereby forming a plurality of second polymerase complexes, each comprising a second sequencing polymerase bound to a nucleic acid duplex; and (g) contacting the plurality of second polymerase complexes with a plurality of nucleotides, wherein the contact is performed under conditions suitable for binding complementary nucleotides from the plurality of nucleotides with at least two of the second polymerase complexes, thereby forming a plurality of nucleotide-polymerase complexes, and under conditions suitable for promoting the incorporation of the bound complementary nucleotides into the primers of the nucleotide-polymerase complexes. In some embodiments, the nucleotides in the plurality of nucleotides contain detectable labels, and the method includes: (h) detecting the complementary nucleotides incorporated into the nucleic acid sequencing primers of the nucleotide-polymerase complexes. In some embodiments, the method further includes (h) detecting the complementary nucleotides incorporated into the nucleic acid sequencing primers of the nucleotide-polymerase complexes; and (i) identifying the bases of the complementary nucleotides incorporated into the nucleic acid sequencing primers of the nucleotide-polymerase complexes.
In some embodiments, the plurality of nucleotides in step (g) comprises a plurality of unlabeled nucleotides, and the detection of nucleotide incorporation is omitted. In some embodiments, step (b) of contacting the plurality of first polymerase complexes with the plurality of multivalent molecules is carried out in the presence of a non-catalytic divalent cation that inhibits polymerase-catalyzed nucleotide incorporation, wherein the non-catalytic divalent cation comprises strontium, barium, or calcium. In some embodiments, step (g) of contacting the plurality of second polymerase complexes with the plurality of nucleotides is carried out in the presence of a catalytic divalent cation that promotes polymerase-catalyzed nucleotide incorporation, wherein the catalytic divalent cation comprises magnesium or manganese.
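The split between a binding and imaging phase (multivalent molecules, non-catalytic cation) and a stepping phase (free nucleotides, catalytic cation) can be sketched as follows. This is an illustrative model only: it assumes each stepping phase advances the primer by exactly one base, which in practice would rely on a chain-terminating nucleotide, and the function names and cation labels are hypothetical encodings of the ions named in the text.

```python
# Two-phase cycle: call the base while incorporation is inhibited, then
# extend the primer by one base once a catalytic cation is supplied.
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}
NON_CATALYTIC = {"Sr2+", "Ba2+", "Ca2+"}   # inhibit incorporation
CATALYTIC = {"Mg2+", "Mn2+"}               # promote incorporation


def permits_incorporation(cation: str) -> bool:
    """True for catalytic cations, False for non-catalytic ones."""
    if cation in CATALYTIC:
        return True
    if cation in NON_CATALYTIC:
        return False
    raise ValueError(f"cation not named in the text: {cation}")


def bind_image_step(template: str, cycles: int,
                    binding_cation: str = "Ca2+",
                    stepping_cation: str = "Mg2+") -> str:
    """Return base calls; `template` is in the order the polymerase reads it."""
    if permits_incorporation(binding_cation):
        raise ValueError("binding phase must use a non-catalytic cation")
    if not permits_incorporation(stepping_cation):
        raise ValueError("stepping phase must use a catalytic cation")
    calls = []
    position = 0   # number of bases already added to the primer
    for _ in range(min(cycles, len(template))):
        # binding phase: the complementary nucleotide moiety binds and is
        # imaged, but the primer is not extended
        calls.append(PAIR[template[position]])
        # stepping phase: first polymerases and multivalent molecules are
        # removed, then the primer is extended by one base
        position += 1
    return "".join(calls)


print(bind_image_step("GATC", 3))   # CTA
```

Separating the two phases means the detectable label never has to be incorporated, which is why the stepping nucleotides can be unlabeled in some embodiments.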
[0019] In some embodiments, each of the plurality of multivalent molecules comprises: (a) a core; and (b) a plurality of nucleotide arms, each comprising: (i) a core attachment moiety, (ii) a spacer, (iii) a linker, and (iv) a nucleotide moiety, wherein the core is attached to the plurality of nucleotide arms via their core attachment moieties, wherein the spacer is attached to the linker, and wherein the linker is attached to the nucleotide moiety. In some embodiments, the linker comprises an aliphatic chain having 2 to 6 subunits or an oligoethylene glycol chain having 2 to 6 subunits. In some embodiments, the plurality of nucleotide arms attached to a given core have the same type of nucleotide moiety, wherein the types of nucleotide moiety include dATP, dGTP, dCTP, dTTP, or dUTP. In some embodiments, the plurality of multivalent molecules comprises one type of multivalent molecule, wherein each of the plurality of multivalent molecules has the same type of nucleotide moiety selected from the group consisting of dATP, dGTP, dCTP, dTTP, and dUTP. In some embodiments, the plurality of multivalent molecules comprises a mixture of any combination of two or more types of multivalent molecules, each type having a nucleotide moiety selected from the group consisting of dATP, dGTP, dCTP, dTTP, and/or dUTP.
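The composition of a multivalent molecule, a central core (nucleus) carrying several arms that each end in a nucleotide, maps naturally onto a small data structure. The sketch below is illustrative only: the class and field names are hypothetical, the 2 to 6 subunit linker range and the same-nucleotide rule are taken from the "some embodiments" statements above, and the example values ("streptavidin", "biotin", "PEG") are placeholders not taken from the text.

```python
# Data-structure sketch of a multivalent molecule: one core, several
# nucleotide arms, each arm = core attachment + spacer + linker + nucleotide.
from dataclasses import dataclass
from typing import Tuple

NUCLEOTIDE_TYPES = {"dATP", "dGTP", "dCTP", "dTTP", "dUTP"}


@dataclass(frozen=True)
class NucleotideArm:
    core_attachment: str    # moiety joining the arm to the core
    spacer: str             # spacer attached to the linker
    linker_subunits: int    # aliphatic or oligoethylene glycol subunits
    nucleotide: str         # nucleotide moiety at the end of the arm

    def __post_init__(self):
        if not 2 <= self.linker_subunits <= 6:
            raise ValueError("linker must have 2 to 6 subunits")
        if self.nucleotide not in NUCLEOTIDE_TYPES:
            raise ValueError(f"unknown nucleotide type: {self.nucleotide}")


@dataclass(frozen=True)
class MultivalentMolecule:
    core: str
    arms: Tuple[NucleotideArm, ...]

    def __post_init__(self):
        if len(self.arms) < 2:
            raise ValueError("a plurality of nucleotide arms is required")
        if len({arm.nucleotide for arm in self.arms}) != 1:
            raise ValueError("all arms on one core carry the same nucleotide")

    @property
    def nucleotide(self) -> str:
        """Nucleotide type shared by every arm of this molecule."""
        return self.arms[0].nucleotide


arm = NucleotideArm("biotin", "PEG", 4, "dATP")        # placeholder values
molecule = MultivalentMolecule("streptavidin", (arm, arm, arm))
print(molecule.nucleotide)   # dATP
```

A mixture of multivalent molecule types would then simply be a collection of such objects with different `nucleotide` values, one type per base to be interrogated.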
[0020] In some embodiments, the individual nucleotides in the plurality of nucleotides in step (g) comprise an aromatic base, a pentose sugar, and 1-10 phosphate ester groups. In some embodiments, the plurality of nucleotides in step (g) comprise one type of nucleotide selected from the group consisting of dATP, dGTP, dCTP, dTTP, and dUTP, or a mixture comprising any combination of two or more types of nucleotides selected from the group consisting of dATP, dGTP, dCTP, dTTP, and / or dUTP. In some embodiments, at least one nucleotide in the plurality of nucleotides in step (g) is labeled with a fluorophore. In some embodiments, the plurality of nucleotides in step (g) lack fluorophore labeling. In some embodiments, at least one of the nucleotides in step (g) comprises a removable chain-terminating portion attached to the 3' carbon position of a sugar group, optionally wherein the removable chain-terminating portion comprises an alkyl group, an alkenyl group, an alkynyl group, an allyl group, an aryl group, a benzyl group, an azide group, an azido group, an O-azidomethyl group, an amine group, an amide group, a ketone group, an isocyanate group, a phosphate group, a thiol group, a disulfide group, a carbonate group, a urea group, or a silyl group, and optionally wherein the removable chain-terminating portion can be cleaved with a chemical compound to generate an extendable 3'OH moiety on the sugar group.
[0021] In some embodiments, the method includes forming a plurality of binding complexes, comprising the steps of: (a) binding a first nucleic acid sequencing primer, a first sequencing polymerase, and a first multivalent molecule to a first portion of a tandem template molecule, thereby forming a first binding complex, wherein a first nucleotide portion of the first multivalent molecule is bound to the first polymerase; and (b) binding a second nucleic acid sequencing primer, a second sequencing polymerase, and the first multivalent molecule to a second portion of the same nucleic acid tandem template molecule, thereby forming a second binding complex, wherein a second nucleotide portion of the first multivalent molecule is bound to the second polymerase, and wherein the first and second binding complexes comprise the same multivalent molecule, thereby forming an affinity complex.
[0022] In some embodiments, the method includes (a) contacting a plurality of first sequencing polymerases and a plurality of nucleic acid sequencing primers with different portions of respective nucleic acid tandem template molecules to form at least a first polymerase complex and a second polymerase complex on the same nucleic acid tandem template molecule; (b) contacting a plurality of multivalent molecules containing a detectable label with at least the first polymerase complex and the second polymerase complex under conditions suitable for binding a single multivalent molecule from the plurality of multivalent molecules to the first polymerase complex and the second polymerase complex, wherein at least a first nucleotide portion of the single multivalent molecule binds to the first polymerase complex, the first polymerase complex comprising a first nucleic acid sequencing primer that hybridizes to a first portion of the tandem template molecule, thereby forming a first binding complex, and wherein at least a second nucleotide portion of the single multivalent molecule binds to the second polymerase complex, the second polymerase complex comprising a second nucleic acid sequencing primer that hybridizes to a second portion of the tandem template molecule, thereby forming a second binding complex, wherein the contacting is performed under conditions suitable for inhibiting the polymerase-catalyzed incorporation of the bound first and second nucleotide portions in the first and second binding complexes, and wherein the first and second binding complexes bound to the same multivalent molecule form an affinity complex; (c) detecting the first and second binding complexes on the same tandem template molecule; and (d) identifying the first nucleotide portion in the first binding complex, thereby determining the sequence of the first portion of the tandem template molecule, and identifying the second nucleotide portion in the second binding complex, thereby determining the sequence of the second portion of the tandem template molecule.
[0023] In some embodiments, the plurality of sequencing polymerases in steps (a) and (d) comprise a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146. In some embodiments, the plurality of sequencing polymerases in step (a) comprise a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146. In some embodiments, the plurality of sequencing polymerases comprises a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146. In some embodiments, the plurality of first sequencing polymerases comprises a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146. In some embodiments, the plurality of second sequencing polymerases include a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146.
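As an illustrative, non-limiting sketch, the percent sequence identity thresholds recited above could be evaluated as follows. The sketch assumes that the two amino acid sequences have already been aligned (gaps written as "-") and that identity is counted over the aligned columns; this disclosure does not fix a particular alignment tool or identity convention, so both are assumptions. The function names and the example sequences are hypothetical and are not taken from SEQ ID NO: 128-146.

```python
from typing import Optional

# Thresholds recited in the paragraph above (percent identity).
IDENTITY_THRESHOLDS = [70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: columns in which both sequences carry a gap are ignored, and
    every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # skip gap-only columns
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / columns


def highest_threshold_met(identity: float) -> Optional[int]:
    """Return the largest recited threshold that the identity reaches, if any."""
    met = [t for t in IDENTITY_THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical five-residue fragments, for illustration only.
    identity = percent_identity("MKVLA", "MKVIA")
    print(identity, highest_threshold_met(identity))
```

Other identity conventions (for example, dividing by the length of the shorter sequence) would give different values for gapped alignments, so the convention should be stated whenever a threshold is applied.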
[0024] In some embodiments, the flap cutting reagent includes at least one 5' flap endonuclease derived from a eukaryotic or archaeal organism. In some embodiments, the flap cutting reagent comprises at least one archaeal 5' flap endonuclease selected from the group consisting of: *Archaeoglobus fulgidus* (Afu FEN1), *Methanobacterium thermoautotrophicum* (Mth FEN1), *Pyrococcus furiosus* (Pfu FEN1), *Methanococcus jannaschii* (Mja FEN1), *Pyrococcus woesei* (Pwo FEN1), *Pyrococcus horikoshii* (Pho FEN1), *Archaeoglobus veneficus* (Ave FEN1), *Thermococcus kodakarensis* (Tko FEN1), *Desulfurococcus amylolyticus* (Dam FEN1), *Aeropyrum pernix* (Ape FEN1), and *Sulfolobus solfataricus* (Sso FEN1). In some embodiments, the flap cutting reagent comprises a 5' flap endonuclease from *Thermococcus* 9°N (9°N FEN1). In some embodiments, the flap cutting reagent comprises a 5' flap endonuclease from mouse, yeast, or human.
[0025] In some embodiments, enzymatically closing the nick includes contacting the plurality of open-loop library molecules with a DNA ligase, including a T3 ligase, a T4 ligase, a T7 ligase, a Tfu ligase, or a ligase derived from *Thermococcus nautili*.
[0026] In some embodiments, the flap cutting reagent comprises a DNA ligase. In some embodiments, the DNA ligase comprises a T3 ligase, a T4 ligase, a T7 ligase, a Tfu ligase, or a ligase derived from *Thermococcus nautili*.
Brief Description of the Drawings
[0027] The features of the invention are specifically set forth in the appended claims. The features and advantages of this disclosure will be better understood by referring to the following detailed description of exemplary embodiments that utilize the principles of this disclosure, and to the accompanying drawings, in which:
[0028] Figure 1 shows schematic diagrams of various exemplary configurations of multivalent molecules. Left (Class I): schematic diagram of multivalent molecules with a "starburst" or "helter-skelter" configuration. Middle (Class II): schematic diagram of multivalent molecules with a dendrimer configuration. Right (Class III): schematic diagram of multiple multivalent molecules formed by the reaction of streptavidin with a 4-arm or 8-arm PEG-NHS containing biotin and dNTPs. The nucleotide moiety is designated as 'N', biotin is designated as 'B', and streptavidin is designated as 'SA'.
[0029] Figure 2 is a schematic diagram of an exemplary multivalent molecule comprising a universal nucleus attached to multiple nucleotide arms.
[0030] Figure 3 is a schematic diagram of an exemplary multivalent molecule comprising a dendritic nucleus attached to multiple nucleotide arms.
[0031] Figure 4 is a schematic diagram of an exemplary multivalent molecule containing a nucleus attached to multiple nucleotide arms, wherein the nucleotide arms include biotin, spacers, linkers, and nucleotide moieties.
[0032] Figure 5 is a schematic diagram of an exemplary nucleotide arm that includes a nuclear attachment portion, a spacer, a linker, and a nucleotide portion.
[0033] Figure 6 shows the chemical structures of exemplary spacers (top) and various exemplary linkers, including an 11-atom linker, a 16-atom linker, a 23-atom linker, and an N3 linker (bottom).
[0034] Figure 7 shows the chemical structures of various exemplary linkers (including linkers 1 to 9).
[0035] Figure 8 shows the chemical structures of various exemplary linkers to which nucleotide moieties are attached.
[0036] Figure 9 shows the chemical structures of various exemplary linkers attached to nucleotide moieties.
[0037] Figure 10 shows the chemical structures of various exemplary linkers attached to nucleotide moieties.
[0038] Figure 11 shows the chemical structure of an exemplary biotinylated nucleotide arm. In this example, the nucleotide portion is attached to the linker via a propargylamine attachment at the 5-position of the pyrimidine base or the 7-position of the purine base.
[0039] Figure 12 is a schematic diagram of a guanine tetrad (e.g., a G-tetrad).
[0040] Figure 13 is a schematic diagram of an exemplary intramolecular G-quadruplex structure.
[0041] Figure 14 is a schematic diagram of an exemplary low-binding support, comprising a glass substrate and alternating hydrophilic coatings covalently or non-covalently adhered to the glass, and further comprising chemically reactive functional groups that act as attachment sites for oligonucleotide primers (e.g., clip-on capture primers). Alternatively, the support may be made of any material, such as glass, plastic, or polymeric materials.
[0042] Figure 15A is a schematic diagram of an exemplary support having multiple clip-on capture primers (200) arranged in a non-predetermined and random manner on the support. Circular dots represent clip-on capture primers attached to the support. Multiple clip-on capture primers may have the same sequence. Clip-on capture primers may be attached to the support such that, when viewed from any angle of the support (including above, below, and/or from the side of the support), some of the nearest clip-on capture primers are in contact with and/or overlap each other. The dashed line surrounding four clip-on capture primers indicates nearest clip-on capture primers that are in contact with each other.
[0043] Figure 15B is a schematic diagram of the same support shown in Figure 15A, wherein each clip-on capture primer (200) is attached to one of four different batch sequences (e.g., batch-specific sequencing primer binding sites and/or batch-specific barcode sequences, which are common to a specific batch or subgroup of nucleic acid tandem template molecules in a plurality of nucleic acid tandem template molecules). The different batch sequences of the tandem template molecules are represented by horizontal stripes, vertical dashed lines, brick shapes, or solid black lines. The tandem template molecules may be attached to the support (e.g., via attachment to clip-on capture primers) such that, when viewed from any angle of the support (including above, below, or to the side of the support), some of the nearest tandem template molecules are in contact with and/or overlap each other. The dashed line surrounding four tandem template molecules indicates nearest tandem template molecules that are in contact with each other.
[0044] Figure 16A is a schematic diagram of an exemplary support having multiple tandem template molecules immobilized to the support (e.g., via attachment to a clip-on capture primer (200)), wherein the tandem template molecules are arranged on the support in a predetermined manner. Circular dots represent tandem template molecules immobilized to the support. Each tandem template molecule contains one of four different batch sequences (e.g., batch-specific sequencing primer binding sites and/or batch-specific barcode sequences). Different batch sequences of the tandem template molecules are represented by horizontal stripes, vertical dashed lines, brick shapes, or solid black dots. For example, the tandem template molecules may be immobilized to the support to form dots arranged in rows and columns.
[0045] Figure 16B is a schematic diagram of an exemplary support having multiple tandem template molecules immobilized to the support (e.g., via attachment to a clip-on capture primer (200)), wherein the tandem template molecules are arranged on the support in a predetermined manner. The tandem template molecules may comprise one of four different batch sequences (e.g., batch-specific sequencing primer binding sites and/or batch-specific barcode sequences). The different batch sequences of the tandem template molecules are represented by horizontal stripes, vertical dashed lines, brick shapes, or solid black lines. For example, multiple tandem template molecules may be immobilized to the support and arranged to form stripes.
[0046] Figure 17A is a schematic diagram illustrating a support having an exemplary clip-on capture primer (200) fixed thereon, wherein the clip-on capture primer can be used to perform a ligation reaction on the support. The clip-on capture primer comprises a first portion (210) and a second portion (220). The schematic diagram also illustrates an exemplary open-loop library molecule formed from a linear library molecule, the linear library molecule comprising a first universal binding site (120) for binding the first portion of the clip-on capture primer and a second universal binding site (130) for binding the second portion of the same clip-on capture primer. In some embodiments, the linear library molecule further includes a target sequence and at least one adaptor sequence. The linear library molecule hybridizes with the clip-on capture primer to form an open-loop library molecule (300) having a nick or gap, wherein the nick or gap is asymmetrically positioned on the clip-on capture primer.
[0047] Figure 17B is a schematic diagram illustrating an exemplary covalently closed circular library molecule (400) generated by covalently closing the gap or nick in the open-loop library molecule of Figure 17A. The covalently closed circular library molecule of Figure 17B can be used in a workflow that includes rolling circle amplification and sequencing, such as the workflow illustrated in Figures 29 to 36.
[0048] Figure 18A is a schematic diagram illustrating a support having an exemplary clip-on capture primer (200) fixed thereon, wherein the clip-on capture primer can be used for a ligation reaction on the support. The clip-on capture primer comprises a first portion (210) and a second portion (220). The schematic diagram also illustrates an exemplary open-loop library molecule (300) formed from a linear library molecule, the linear library molecule comprising a first universal binding site (120) for binding the first portion of the clip-on capture primer and a second universal binding site (130) for binding the second portion of the same clip-on capture primer. The linear library molecule may also include a target sequence and at least one adaptor sequence. The linear library molecule hybridizes with the clip-on capture primer to form an open-loop library molecule (300) with a nick or gap, wherein the nick or gap is asymmetrically positioned on the clip-on capture primer.
[0049] Figure 18B is a schematic diagram illustrating an exemplary covalently closed circular library molecule (400) generated by covalently closing the gap or nick in the open-loop library molecule (300) of Figure 18A. The covalently closed circular library molecule of Figure 18B can be used in a workflow that includes rolling circle amplification and sequencing, such as the workflow illustrated in Figures 29 to 36.
[0050] Figure 19A is a schematic diagram illustrating a support having an exemplary clip-on capture primer (200) fixed thereon, wherein the clip-on capture primer can be used for a ligation reaction on the support. The clip-on capture primer comprises a first portion (210) and a second portion (220). The schematic diagram also illustrates an exemplary open-loop library molecule (300) generated from a linear library molecule, the linear library molecule comprising a first universal binding site (120) for binding the first portion of the clip-on capture primer and a second universal binding site (130) for binding the second portion of the same clip-on capture primer. The linear library molecule may also include a target sequence and at least one adaptor sequence. The linear library molecule hybridizes with the clip-on capture primer to form an open-loop library molecule (300) having a nick or gap, wherein the nick or gap is symmetrically positioned on the clip-on capture primer.
[0051] Figure 19B is a schematic diagram illustrating an exemplary covalently closed circular library molecule (400) generated by covalently closing the gap or nick in the open-loop library molecule (300) of Figure 19A. The covalently closed circular library molecule of Figure 19B can be used in a workflow that includes rolling circle amplification and sequencing, such as the workflow illustrated in Figures 29 to 36.
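As an illustrative, non-limiting sketch, the hybridization of the two ends of a linear library molecule to the first and second portions of one capture primer, and the resulting nick position, could be modeled as follows. The sketch makes several assumptions that are not stated in this disclosure: the first portion (210) is taken to be the 5' part of the primer, the first universal binding site (120) is placed at the 5' end of the library molecule and the second universal binding site (130) at its 3' end, both ends hybridize flush so that a nick (rather than a gap) results, and all sequences are hypothetical.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def nick_position(primer: str, first_len: int, library: str) -> Optional[int]:
    """Return the nick position on the primer (nucleotides from its 5' end).

    The primer is split into a first portion (primer[:first_len]) and a second
    portion (the remainder). Returns None when the library ends are not exactly
    complementary to the two portions, i.e. no flush nick is formed.
    """
    first, second = primer[:first_len], primer[first_len:]
    site_120 = revcomp(first)    # assumed to sit at the library 5' end
    site_130 = revcomp(second)   # assumed to sit at the library 3' end
    if library.startswith(site_120) and library.endswith(site_130):
        return first_len
    return None


def nick_is_symmetric(primer: str, nick: int) -> bool:
    """A nick is treated as symmetric when it falls at the primer midpoint."""
    return 2 * nick == len(primer)


if __name__ == "__main__":
    primer = "AAAACCCCCCCC"                        # hypothetical 12-nt primer
    library = "TTTT" + "ACGTACGT" + "GGGGGGGG"     # site 120 + insert + site 130
    nick = nick_position(primer, 4, library)
    print(nick, nick_is_symmetric(primer, nick))   # asymmetric placement
```

In this model, the junction formed by the library 3' end followed by its 5' end equals the reverse complement of the whole primer, which is the condition under which a ligase could close the nick.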
[0052] Figure 20 is a schematic diagram illustrating various embodiments (e.g., (A)-(D)) of a linear library molecule (100), comprising (i) a target sequence (110), also referred to herein as an "insert," and any adaptor sequence or any combination of two or more adaptor sequences, which may include (ii) a first universal binding site (120) (or its complementary sequence) for a first portion of a clip-on capture primer, (iii) at least one sample index sequence (e.g., a left sample index sequence (160) and/or a right sample index sequence (170)) that can be used to distinguish the target sequences obtained from different sample sources in multiplex assays, (iv) a universal binding site (140) for forward sequencing primers or its complementary sequence, (v) a universal binding site (150) for reverse sequencing primers or its complementary sequence, (vi) at least one unique molecular index sequence (UMI) (e.g., a left unique molecular index sequence (180) and/or a right unique molecular index sequence (190)) that can be used to uniquely identify the nucleic acid molecule (e.g., having a target sequence) to which the unique molecular index sequence is attached, and (vii) a second universal binding site (130) for a second portion of a fixed clip-on capture primer or its complementary sequence. In some embodiments, the universal binding site (140) for forward sequencing primers includes a batch-specific forward sequencing primer binding site that can be used for batch sequencing. In some embodiments, the universal binding site (150) for reverse sequencing primers includes a batch-specific reverse sequencing primer binding site that can be used for batch sequencing. In some embodiments, at least one sample index sequence (e.g., (160) and/or (170)) comprises a sample index sequence joined to an optional short random sequence (e.g., NNN), wherein the short random sequence provides nucleotide sequence diversity and is about 3 to 20 nucleotides in length. In some embodiments, the target sequence (110) and any of the adaptor sequences may be arranged in any order.
[0053] Figure 21 is a schematic diagram illustrating various embodiments (e.g., (A)-(F)) of a linear library molecule (100) comprising (i) a target sequence (110), also referred to herein as an "insert," and any adaptor sequence or any combination of two or more adaptor sequences, wherein the adaptor sequence comprises: (ii) a first universal binding site (120) (or its complementary sequence) for a first portion of a clip-on capture primer, (iii) at least one sample index sequence (e.g., (160) and/or (170)) which can be used to distinguish the target sequences obtained from different sample sources in multiplex assays, (iv) a universal binding site (140) (or its complementary sequence) for a forward sequencing primer, (v) a universal binding site (150) (or its complementary sequence) for a reverse sequencing primer, and (vi) a second universal binding site (130) (or its complementary sequence) for a second portion of a clip-on capture primer. In some embodiments, the universal binding site (140) for the forward sequencing primer comprises a batch-specific forward sequencing primer binding site that can be used for batch sequencing. In some embodiments, the universal binding site (150) for the reverse sequencing primer includes a batch-specific reverse sequencing primer binding site that can be used for batch sequencing. In some embodiments, at least one sample index sequence (e.g., (160) and/or (170)) includes a sample index sequence joined to a short random sequence (e.g., NNN), wherein the short random sequence provides nucleotide sequence diversity and is about 3 to 20 nucleotides in length. In some embodiments, the target sequence (110) and any of the adaptor sequences can be arranged in any order.
[0054] Figure 22 is a schematic diagram illustrating various embodiments (e.g., (A)-(C)) of a linear library molecule (100), comprising (i) a target sequence, also referred to as an insert (110), and any adaptor sequence or any combination of two or more adaptor sequences, wherein the adaptor sequence comprises: (ii) a first universal binding site (120) (or its complementary sequence) for a first portion of a clip-on capture primer, (iii) a universal binding site (123) (or its complementary sequence) for a first non-clip-on capture primer, (iv) at least one sample index sequence (e.g., (160) and/or (170)) which can be used to distinguish the target sequences obtained from different sample sources in multiplex assays, (v) at least one universal binding site (140) for a forward sequencing primer, (vi) at least one universal binding site (150) (or its complementary sequence) for a reverse sequencing primer, (vii) at least one unique molecular index sequence (UMI) (e.g., (180) and/or (190)) which can be used to uniquely identify the nucleic acid molecule (e.g., having a target sequence) to which the unique molecular index sequence is attached, (viii) a universal binding site (133) (or its complementary sequence) for a second non-clip-on capture primer, (ix) at least one optional short random sequence (e.g., NNNN) (132) which provides nucleotide sequence diversity and is about 3 to 20 nucleotides in length, and (x) a second universal binding site (130) (or its complementary sequence) for a second portion of a fixed clip-on capture primer. The universal binding site (140) for a forward sequencing primer may contain a batch-specific forward sequencing primer binding site that can be used for batch sequencing. The universal binding site (150) for a reverse sequencing primer may contain a batch-specific reverse sequencing primer binding site that can be used for batch sequencing. In some embodiments, at least one sample index sequence (e.g., (160) and/or (170)) comprises a sample index sequence joined to an optional short random sequence (e.g., NNN) (not shown), wherein the short random sequence provides nucleotide sequence diversity and is about 3 to 20 nucleotides in length. In some embodiments, the target sequence (110) and any of the adaptor sequences may be arranged in any order.
[0055] Figure 23 is a schematic diagram illustrating various embodiments (e.g., (A)-(F)) of a linear library molecule (100), comprising (i) a target sequence and any adaptor sequence or any combination of two or more adaptor sequences, wherein the adaptor sequence comprises: (ii) a first universal binding site (120) (or its complementary sequence) for a first portion of a clip-on capture primer, (iii) a universal binding site (123) (or its complementary sequence) for a first non-clip-on capture primer, (iv) at least one sample index sequence (e.g., (160) and/or (170)) which can be used to distinguish the target sequences obtained from different sample sources in multiplex assays, (v) at least one universal binding site (140) (or its complementary sequence) for a forward sequencing primer, (vi) at least one universal binding site (150) (or its complementary sequence) for a reverse sequencing primer, (vii) a universal binding site (133) (or its complementary sequence) for a second non-clip-on capture primer, and (viii) a second universal binding site (130) (or its complementary sequence) for a second portion of a fixed clip-on capture primer. The universal binding site (140) for forward sequencing primers may include a batch-specific forward sequencing primer binding site that can be used for batch sequencing. The universal binding site (150) for reverse sequencing primers may include a batch-specific reverse sequencing primer binding site that can be used for batch sequencing. At least one sample index sequence (e.g., (160) and/or (170)) may include a sample index sequence joined to an optional short random sequence (e.g., NNN) (not shown), wherein the short random sequence provides nucleotide sequence diversity and is about 3 to 20 nucleotides in length. In some embodiments, the target sequence (110) and any of the adaptor sequences may be arranged in any order.
[0056] Figure 24 is a schematic diagram illustrating various embodiments (e.g., (A)-(B)) of a linear library molecule (100) containing at least one junction adaptor sequence located between any of the universal adaptor sequences described herein.
[0057] Figure 25A is a schematic diagram illustrating an exemplary clip-on capture primer (200) fixed to a support, wherein the clip-on capture primer hybridizes with an exemplary open-loop library molecule (300) having a nick or gap, generated from a linear library molecule.
[0058] Figure 25B is a schematic diagram illustrating an exemplary covalently closed circular library molecule (400) generated by covalently closing the gap or nick in the open-loop library molecule (300) of Figure 25A. The covalently closed circular library molecule of Figure 25B can be used in a workflow that includes rolling circle amplification and sequencing, such as the workflow illustrated in Figures 29 to 36.
[0059] Figure 26A is a schematic diagram illustrating an exemplary clip-on capture primer (200) fixed to a support, wherein the clip-on capture primer hybridizes with an open-loop library molecule (300) having a nick or gap, generated from a linear library molecule.
[0060] Figure 26B is a schematic diagram illustrating an exemplary covalently closed circular library molecule (400) generated by covalently closing the gap or nick in the open-loop library molecule (300) of Figure 26A. The covalently closed circular library molecule of Figure 26B can be used in a workflow that includes rolling circle amplification and sequencing, such as the workflow illustrated in Figures 29 to 36.
[0061] Figure 27A is a schematic diagram illustrating a mixture of exemplary clip-on capture primers (200-A) and (200-B) having different sequences and immobilized to the same support, wherein each clip-on capture primer can hybridize with its cognate linear library molecule to form respective open-loop library molecules (300-A) and (300-B) with nicks or gaps. In this example, the first clip-on capture primer (200-A) can bind to the first linear library molecule, while the second clip-on capture primer (200-B) can bind to the second linear library molecule.
[0062] Figure 27B is a schematic diagram illustrating exemplary covalently closed circular library molecules (400-A) and (400-B) generated by covalently closing the gaps or nicks in the open-loop library molecules of Figure 27A. The covalently closed circular library molecules of Figure 27B can be used in a workflow that includes rolling circle amplification and sequencing, such as the workflow illustrated in Figures 29 to 36.
[0063] Figure 28A is a schematic diagram illustrating various embodiments of an open-loop library molecule hybridized with a clip-on capture primer (200), wherein the 5' end of the open-loop library molecule forms a 5' flap structure. Figure 28A(i) is a schematic diagram showing an open-loop library molecule (300) comprising a 5' overhang flap structure of 2 to 10 nucleotides in length and a 3' overhang flap structure of 1 nucleotide in length, wherein the 5' overhang flap structure can be cleaved by a nuclease (e.g., 5' flap endonuclease 1 or FEN1). Figure 28A(ii) is a schematic diagram showing an open-loop library molecule (300) containing a 5' overhang flap structure of 2 to 10 nucleotides in length and lacking a 3' overhang flap structure, wherein the 5' overhang flap structure can be cleaved by a nuclease (e.g., 5' flap endonuclease 1 or FEN1). Figure 28A(iii) is a schematic diagram showing an open-loop library molecule (300) comprising a 5' overhang flap structure of 2 to 10 nucleotides in length and a 3' overhang flap structure of 2 to 10 nucleotides in length, wherein the 5' overhang flap structure cannot be cleaved by a nuclease (e.g., 5' flap endonuclease 1 or FEN1).
[0064] Figure 28B is a schematic diagram illustrating exemplary open-loop library molecules hybridized with a clip-on capture primer (200-A or 200-B), wherein the 5' end of the open-loop library molecule forms a 5' flap structure. The left panel shows the open-loop library molecule of Figure 28A (left), in which the 5' end of the open-loop library molecule (300-A) forms a 5' overhang flap structure of 2 to 10 nucleotides in length, and the 3' end of the open-loop library molecule (300-A) forms a 3' overhang flap structure of 1 nucleotide in length. In this example, the 5' overhang flap structure can be cleaved by a nuclease (e.g., 5' flap endonuclease 1 or FEN1). The right panel shows the open-loop library molecule of Figure 27A (right), which has a 5' overhang flap structure of 2 to 10 nucleotides in length at its 5' end and a 3' overhang flap structure of 1 nucleotide in length at its 3' end, wherein the 5' overhang flap structure can be cleaved by a nuclease (e.g., 5' flap endonuclease 1 or FEN1).
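As an illustrative, non-limiting sketch, the three cases described for Figure 28A(i)-(iii) can be summarized as a simple rule on the lengths of the 5' and 3' overhangs. The function name is hypothetical, and the sketch deliberately returns no answer for overhang lengths outside the ranges described above, since those combinations are not addressed here.

```python
from typing import Optional


def five_prime_overhang_cleavable(five_len: int, three_len: int) -> Optional[bool]:
    """Summarize the cases of Figure 28A(i)-(iii).

    five_len / three_len are the 5' and 3' overhang lengths in nucleotides.
    Returns True or False for the described cases, and None for combinations
    that the description does not cover.
    """
    if not 2 <= five_len <= 10:
        return None          # only 5' overhangs of 2 to 10 nt are described
    if three_len in (0, 1):
        return True          # cases (i) and (ii): cleavable by the nuclease
    if 2 <= three_len <= 10:
        return False         # case (iii): not cleavable
    return None              # longer 3' overhangs are not described


if __name__ == "__main__":
    for five_len, three_len in [(5, 1), (5, 0), (5, 4)]:
        print(five_len, three_len,
              five_prime_overhang_cleavable(five_len, three_len))
```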
[0065] Figure 29 is a schematic diagram illustrating an exemplary rolling circle amplification reaction on a support using a covalently closed circular library molecule (400) and a mixture of nucleotides that includes a nucleotide having a cleavable portion that can be cleaved to generate an abasic site. In some embodiments, the 3' end of a fixed clip-on capture primer can be used to initiate the rolling circle amplification reaction. The rolling circle amplification reaction generates a fixed single-stranded tandem molecule having at least one nucleotide with a cleavable portion that can be cleaved to generate an abasic site in the fixed tandem molecule. Any of the linear library molecules shown in Figures 20 to 24 can be used to generate the covalently closed circular library molecule. As shown in Figure 29, the covalently closed circular library molecule hybridizes with a fixed clip-on capture primer to initiate rolling circle amplification on the support.
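As an illustrative, non-limiting sketch, the way rolling circle amplification turns one circular template into a tandem (concatemer) product can be modeled by walking repeatedly around the circle and writing the complement of each template base. The sketch ignores the cleavable nucleotides, the primer sequence, and polymerase kinetics; the function name, the start position, and the four-base template are hypothetical and chosen only for readability.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def rolling_circle_product(circle: str, n_bases: int, start: int) -> str:
    """Return the first n_bases of the strand synthesized around a circle.

    `circle` is the circular template written 5'->3'. The new strand grows
    5'->3' while the template is read 3'->5', so the template index moves
    downward and wraps around the circle without ever terminating.
    """
    product = []
    pos = start
    for _ in range(n_bases):
        product.append(_COMPLEMENT[circle[pos]])
        pos = (pos - 1) % len(circle)   # step 3'->5' along the template, wrapping
    return "".join(product)


if __name__ == "__main__":
    circle = "ATGC"                      # hypothetical 4-nt circular template
    tandem = rolling_circle_product(circle, n_bases=12, start=len(circle) - 1)
    # Each trip around the circle adds one copy of its reverse complement.
    print(tandem, tandem.count("GCAT"))
```

Because the template has no end, the product length is set only by how long the polymerase keeps extending, which is why a single circular library molecule can yield a tandem molecule with many copies of the same sequence.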
[0066] Figure 30 is a schematic diagram illustrating a fixed single-stranded tandem template molecule having at least one nucleotide with a cleavable portion that can be cleaved to generate an abasic site in the fixed tandem template molecule.
[0067] Figure 31 is a schematic diagram illustrating an exemplary forward sequencing reaction performed on the fixed tandem template molecule shown in Figure 30. The forward sequencing reaction can be performed using multiple soluble forward sequencing primers and generates multiple extended forward sequencing primer strands. The fixed tandem template molecule may have two or more extended forward sequencing primer strands hybridized thereon.
[0068] Figure 32 is a schematic diagram illustrating an exemplary method for replacing extended forward sequencing primer strands by performing a primer extension reaction with a strand-displacing polymerase in the absence of additional soluble primers.
[0069] Figure 33 is a schematic diagram illustrating an exemplary method for replacing extended forward sequencing primer strands by generating forward extension strands through a primer extension reaction using soluble forward sequencing primers.
[0070] Figure 34 This is a schematic diagram illustrating an exemplary method for generating debase sites at nucleotides with easily cleavable portions in a fixed single-stranded tandem template molecule, and creating gaps at the debase sites to generate multiple tandem template molecules containing gaps, while retaining the multiple forward extension strands and the multiple fixed splint-capture primers. The forward extension strands can be... Figure 32 or Figure 33 The method described in the document is used to generate it.
[0071] Figure 35 It is shown that in removing such Figure 34 A schematic diagram of an exemplary retained forward extension chain following a tandem template molecule containing gaps.
[0072] Figure 36 It is shown in Figure 35 This diagram illustrates an exemplary reverse sequencing reaction performed on the retained forward extension strand. The reverse sequencing reaction can be performed using multiple soluble reverse sequencing primers. The retained forward extension strand may have two or more extended reverse sequencing primer strands hybridized thereon. The extended reverse sequencing primer strands do not hybridize with or covalently bind to the clip-on capture primers. Therefore, the extended reverse sequencing primer strands are not fixed to the support.
[0073] Figure 37This is a schematic diagram illustrating an exemplary support having a clamp-on trapping primer (200) and a pinning primer (500) fixed thereon. The clamp-on trapping primer binds to the tandem template molecule. For example, the fixed tandem can be... Figures 29 to 36 The workflow shown is generated. A fixed tandem template molecule captures two or more copies of a universal binding sequence against a fixed pinned primer. A portion of the fixed tandem template molecule containing the universal binding sequence against the pinned primer can hybridize with the pinned primer.
[0074] Figure 38A is a schematic diagram illustrating an exemplary batch sequencing workflow. A first covalently closed circular molecule (left) can be generated by hybridizing individual linear library molecules (not shown) from a first subgroup with a splint-capture primer (200) immobilized to a support. The hybridized first linear library molecule forms a first open-circle library molecule (not shown) with a nick or gap, which can be enzymatically closed to form a first covalently closed circular library molecule that hybridizes with the first splint-capture primer. The first covalently closed library molecule contains a first insert sequence (110-1), a first-batch barcode sequence (142; BC-1), and a universal binding site (140-1) for the first-batch specific forward sequencing primer, which selectively hybridizes with the first-batch forward sequencing primer. The universal binding site (140-1) for the first-batch specific forward sequencing primer corresponds to the first insert sequence (110-1). A second covalently closed circular molecule (right) can be generated by hybridizing individual linear library molecules (not shown) from a second subgroup with a splint-capture primer immobilized to the same support. The hybridized second linear library molecule forms a second open-circle library molecule (not shown) with a nick or gap, which can be enzymatically closed to form a second covalently closed circular library molecule that hybridizes with the second splint-capture primer. The second covalently closed library molecule contains a second insert sequence (110-2) different from the first insert sequence (110-1), a second-batch barcode sequence (143; BC-2) different from the first-batch barcode sequence (142; BC-1), and a universal binding site (140-2) for the second-batch specific forward sequencing primer, which selectively hybridizes with the second-batch forward sequencing primer. 
The universal binding site (140-2) for the second-batch specific forward sequencing primer corresponds to the second insert sequence (110-2). Rolling circle amplification (RCA) is performed on the first and second covalently closed circular library molecules to generate first-batch and second-batch tandem template molecules immobilized to the same support. A first-batch sequencing workflow (e.g., first-batch repeated sequencing) is performed on the first and second tandem template molecules at least once. A second-batch sequencing workflow (e.g., second-batch repeated sequencing) is performed on the first and second tandem template molecules at least once.
[0075] Figure 38B is a schematic diagram illustrating an exemplary first-batch repeated sequencing workflow and a second-batch repeated sequencing workflow. The first-batch sequencing workflow uses first-batch specific sequencing primers (solid arrows), sequencing polymerase, and multiple nucleotide reagents to perform first-batch sequencing on the first and second tandem template molecules to generate multiple first sequencing reads (dashed arrows), where each first sequencing read includes a first-batch barcode sequence (142; BC-1) and a portion of the first insert sequence (110-1). In this example, the first tandem template molecule undergoes first-batch repeated sequencing comprising no more than 200 sequencing cycles, but the second tandem template molecule does not undergo first-batch sequencing because the first-batch specific sequencing primers do not hybridize to the universal binding site (140-2) for the second-batch specific forward sequencing primers.
[0076] Figure 38C is a schematic diagram illustrating a continuation of the exemplary first-batch repeated sequencing workflow and second-batch repeated sequencing workflow shown in Figure 38B. A second-batch sequencing workflow is performed on the first and second tandem template molecules using second-batch specific sequencing primers (solid arrows), sequencing polymerase, and multiple nucleotide reagents to generate multiple second sequencing reads (dashed arrows), wherein each second sequencing read includes a second-batch barcode sequence (143; BC-2) and a portion of the second insert sequence (110-2). In some embodiments, the second tandem template molecule undergoes second-batch repeated sequencing comprising no more than 200 sequencing cycles, but the first tandem template molecule does not undergo second-batch sequencing because the second-batch specific sequencing primers do not hybridize to the universal binding site (140-1) for the first-batch specific forward sequencing primers.
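The primer-selective behavior described for Figures 38B and 38C can be sketched as a small model. This is an illustrative sketch only: the class name `TandemTemplate`, the function `run_batch`, and the use of string labels for binding sites are hypothetical and are not part of the disclosure; the 200-cycle figure is the upper bound mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TandemTemplate:
    """Hypothetical stand-in for an immobilized tandem template molecule."""
    name: str
    primer_site: str  # universal binding site label, e.g. "140-1"
    barcode: str      # batch barcode label, e.g. "BC-1"
    insert: str       # insert sequence label, e.g. "110-1"


def run_batch(templates, primer_site, max_cycles=200):
    """Return reads only for templates whose binding site matches the primer.

    Templates carrying a different binding site are skipped, mirroring the
    statement that batch-specific primers do not hybridize to them.
    """
    return [
        (t.name, t.barcode, t.insert, max_cycles)
        for t in templates
        if t.primer_site == primer_site
    ]


templates = [
    TandemTemplate("first", "140-1", "BC-1", "110-1"),
    TandemTemplate("second", "140-2", "BC-2", "110-2"),
]
first_batch = run_batch(templates, "140-1")
second_batch = run_batch(templates, "140-2")
```

In this model both templates sit on the same support, yet each batch returns reads from only one of them, which is the separation the two batch-specific primers are described as providing.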
[0077] Figure 39A is a schematic diagram illustrating an exemplary batch sequencing workflow. A first covalently closed circular molecule (left) can be generated by hybridizing individual linear library molecules (not shown) from a first subgroup with a splint-capture primer (200) immobilized to a support. The hybridized first linear library molecule forms a first open-circle library molecule (not shown) with a nick or gap, which can be enzymatically closed to form a first covalently closed circular library molecule that hybridizes with the first splint-capture primer. The first covalently closed library molecule (left) contains a first insert sequence (110-1), a first-batch barcode sequence (142; BC-1), and a universal binding site (140-1) for the first-batch specific forward sequencing primer, which selectively hybridizes with the first-batch forward sequencing primer. The universal binding site (140-1) for the first-batch specific forward sequencing primer corresponds to the first insert sequence (110-1). A second covalently closed circular molecule (right) can be generated by hybridizing individual linear library molecules (not shown) from a second subgroup with a splint-capture primer (200) immobilized to the same support. The hybridized second linear library molecule forms a second open-circle library molecule (not shown) with a nick or gap, which can be enzymatically closed to form a second covalently closed circular library molecule that hybridizes with the second splint-capture primer. The second covalently closed library molecule (right) contains a second insert sequence (110-2) different from the first insert sequence (110-1), a second-batch barcode sequence (143; BC-2) different from the first-batch barcode sequence (142; BC-1), and a universal binding site (140-1) for the first-batch specific forward sequencing primer, which selectively hybridizes with the first-batch forward sequencing primer. 
The universal binding site (140-1) for the first-batch specific forward sequencing primer corresponds to the second insert sequence (110-2). Rolling circle amplification is performed on the first covalently closed circular library molecule and the second covalently closed circular library molecule to generate a first-batch tandem template molecule and a second-batch tandem template molecule immobilized to the same support. The first-batch tandem template molecule and the second-batch tandem template molecule are subjected to a batch sequencing workflow that is repeated at least once (e.g., batch repeated sequencing).
[0078] Figure 39B is a schematic diagram illustrating an exemplary batch sequencing workflow in which one type of sequencing primer is used to sequence two different tandem template molecules, each carrying the same universal binding site (140-1) for a first-batch specific forward sequencing primer. A batch sequencing workflow can be performed on the first and second tandem template molecules using first-batch specific sequencing primers (solid arrows), sequencing polymerase, and multiple nucleotide reagents to generate multiple first sequencing reads (dashed arrows), where each first sequencing read includes a first-batch barcode sequence (142; BC-1) and a portion of the first insert sequence (110-1). In this example, the first tandem template molecule undergoes repeated sequencing involving no more than 200 sequencing cycles. The batch sequencing workflow also generates multiple second sequencing reads (dashed arrows), where each second sequencing read includes a second-batch barcode sequence (143; BC-2) and a portion of the second insert sequence (110-2). In some embodiments, the second tandem template molecule undergoes repeated sequencing involving no more than 200 sequencing cycles.
[0079] Figure 40A is a graph showing the nucleotide base diversity of the right sample index sequence (170), which includes the trimeric random sequence (NNN). The graph shows that, for the trimeric random sequence (NNN), the nucleotide diversity for A and T base calls is approximately 30%, and the nucleotide diversity for C and G base calls is approximately 20%.
[0080] Figure 40B is a graph showing the nucleotide base diversity of the left sample index sequence (160), which lacks the trimeric random sequence (NNN). The graph shows that the nucleotide diversity for A and T base calls is approximately 40%, for C base calls is approximately 15%, and for G base calls is approximately 5%.
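The kind of per-position base diversity plotted in Figures 40A and 40B can be computed from index reads along the following lines. This is a minimal sketch: the function name `base_composition` and the ten example reads are hypothetical illustrations, not data from the disclosure.

```python
from collections import Counter


def base_composition(reads, position):
    """Fraction of A, C, G and T calls observed at one read position."""
    counts = Counter(read[position] for read in reads if len(read) > position)
    total = sum(counts.values())
    return {base: (counts.get(base, 0) / total if total else 0.0) for base in "ACGT"}


# Hypothetical index reads, for illustration only.
reads = ["ACG", "TCA", "AGT", "TTC", "GAA", "CAT", "ATG", "TAC", "AGT", "TCA"]
comp0 = base_composition(reads, 0)
```

A position where one or two bases dominate, as in the made-up first position here, corresponds to the lower-diversity case described for the index lacking the random trimer.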
[0081] Figure 41 is the amino acid sequence of a wild-type DNA polymerase having a backbone sequence from RLI 89578.1 (SEQ ID NO: 128).
[0082] Figure 42 is the amino acid sequence of a wild-type DNA polymerase having a backbone sequence from KUO 42443.1 (SEQ ID NO: 129).
[0083] Figure 43 is the amino acid sequence of a wild-type DNA polymerase having a backbone sequence from MBC 7218772.1 (SEQ ID NO: 130).
[0084] Figure 44 is the amino acid sequence of a wild-type DNA polymerase having a backbone sequence from NOZ 58130.1 (SEQ ID NO: 131).
[0085] Figure 45 is the amino acid sequence of a wild-type DNA polymerase having a backbone sequence from RMF 90817.1 (SEQ ID NO: 132).
[0086] Figure 46 is the amino acid sequence of a wild-type DNA polymerase having a backbone sequence from NOZ 77387.1 (SEQ ID NO: 133).
[0087] Figure 47 is the amino acid sequence of a wild-type DNA polymerase having a backbone sequence from RLF 89458.1 (SEQ ID NO: 134).
[0088] Figure 48 is the amino acid sequence of a wild-type DNA polymerase having a backbone sequence from RLF 78286.1 (SEQ ID NO: 135).
[0089] Figure 49 is the amino acid sequence of a wild-type DNA polymerase having a backbone sequence from WP 175059460.1 (SEQ ID NO: 136).
[0090] Figure 50 is the amino acid sequence of Phi29 polymerase (SEQ ID NO: 137).
[0091] Figure 51 is the amino acid sequence of a wild-type DNA polymerase (Bst polymerase) having a backbone sequence from Geobacillus stearothermophilus (SEQ ID NO: 138).
[0092] Figure 52 is the amino acid sequence of 9°N polymerase (SEQ ID NO: 139).
[0093] Figure 53 is the amino acid sequence of 9°N polymerase UniProt Q56366 (SEQ ID NO: 140).
[0094] Figure 54 is the amino acid sequence of VENT® polymerase UniProt P30317 (SEQ ID NO: 141).
[0095] Figure 55 is the amino acid sequence of DEEP VENT® polymerase UniProt Q51334 (SEQ ID NO: 142).
[0096] Figure 56 is the amino acid sequence of THERMINATOR™ polymerase (SEQ ID NO: 143).
[0097] Figure 57 is the amino acid sequence of Pfu polymerase UniProt P61875 (SEQ ID NO: 144).
[0098] Figure 58 is the amino acid sequence of Pyrococcus abyssi polymerase UniProt P0CL77 (SEQ ID NO: 145).
[0099] Figure 59 is the amino acid sequence of RB69 polymerase (SEQ ID NO: 146).
[0100] Figure 60 is the amino acid sequence of phage T3 DNA ligase (SEQ ID NO: 147).
[0101] Figure 61 is the amino acid sequence of phage T4 DNA ligase (SEQ ID NO: 148).
[0102] Figure 62 is the amino acid sequence of phage T7 DNA ligase (SEQ ID NO: 149).
[0103] Figure 63 is the amino acid sequence of Tfu DNA ligase (SEQ ID NO: 150).
[0104] Figure 64 is the amino acid sequence of a thermostable DNA ligase from Thermococcus nautilus (SEQ ID NO: 151).
Detailed Description
[0105] Introduction
[0106] This disclosure provides compositions and methods for generating multiple nucleic acid tandem template molecules immobilized to a support. The nucleic acid tandem template molecules can be generated by an on-support ligation and circularization reaction using multiple linear library molecules and multiple immobilized splint-capture primers. This disclosure provides methods for seeding and optionally reseeding the support to increase the density of immobilized nucleic acid tandem template molecules that can be sequenced. In some embodiments, the immobilized tandem template molecules can be used for downstream sequencing workflows, including batch sequencing and repeated sequencing workflows. This disclosure also provides methods for interrupting an ongoing sequencing run to reseed the support to generate additional tandem template molecules that can be sequenced.
[0107] The headings provided herein are not intended to limit the various aspects of this disclosure, which can be understood by referring to the specification as a whole.
[0108] Unless otherwise defined, the technical and scientific terms used herein have the meanings commonly understood by one of ordinary skill in the art. Generally, terms related to the techniques of molecular biology, nucleic acid chemistry, protein chemistry, genetics, microbiology, transgenic cell production, and hybridization described herein are those well-known and commonly used in the art. The techniques and procedures described herein are generally performed according to conventional methods well-known in the art and as described in the various general and more specific references cited and discussed throughout this specification. For example, see Sambrook et al., Molecular Cloning: A Laboratory Manual (3rd edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 2000). See also Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992). The nomenclature used in conjunction with the laboratory procedures and techniques described herein is that which is well-known and commonly used in the art.
[0109] Definitions
[0110] Unless the context otherwise requires, singular terms shall include plural forms, and plural terms shall include singular forms. Unless explicitly and unambiguously limited to one referent, the singular forms “a,” “an,” and “the,” as well as the singular use of any word, include plural referents.
[0111] It should be understood that the use of alternative terms (e.g., "or") is considered to refer to one or both of the alternatives or any combination thereof.
[0112] As used herein, the term “and / or” should be considered to refer to a specific disclosure of each of the specified features or components, whether or not they are together with another. For example, the term “and / or” as used in phrases such as “A and / or B” herein is intended to include: “A and B”; “A or B”; “A” (A alone); and “B” (B alone). Similarly, the term “and / or” as used herein, such as “A, B and / or C”, is intended to cover each of the following aspects: “A, B and C”; “A, B or C”; “A or C”; “A or B”; “B or C”; “A and B”; “B and C”; “A and C”; “A” (A alone); “B” (B alone); and “C” (C alone).
[0113] As used herein and in the appended claims, the terms “comprising,” “including,” “having,” and “containing,” and their grammatical variations, are intended to be non-limiting, such that one or more items in a list do not exclude other items that may be substituted for or added to the listed items. It should be understood that whenever an aspect is described herein with the language “comprising,” other similar aspects described as “consisting of” and / or “consisting essentially of” are also provided.
[0114] As used herein, the terms “about” and “approximately” mean a value or composition within an acceptable error range for a particular value or composition, as determined by one of ordinary skill in the art, which will depend in part on how the value or composition is measured or determined, i.e., the limitations of the measurement system. For example, according to practice in the art, “about” or “approximately” may mean within one or more standard deviations. Alternatively, “about” or “approximately” may mean a range of up to 10% (i.e., ±10%) or greater, depending on the limitations of the measurement system. For example, about 5 mg may include any number between 4.5 mg and 5.5 mg. Furthermore, specifically with respect to biological systems or processes, the term may mean up to an order of magnitude or up to 5 times the value. When a particular value or composition is provided in this disclosure, unless otherwise stated, the meaning of “about” or “approximately” should be assumed to be within an acceptable error range for that particular value or composition. Furthermore, in the case of providing ranges and / or subranges of values, the range and / or subrange may include the endpoints of the range and / or subrange.
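The ±10% reading of “about” and the 5 mg example above can be checked with a short calculation. This is a sketch under stated assumptions: the helper names `about_range` and `is_about` are hypothetical, the 10% tolerance is only one of the meanings the paragraph allows, and endpoints are treated as included, consistent with the statement that ranges may include their endpoints.

```python
def about_range(value, tolerance=0.10):
    """Return the (low, high) interval for 'about value' at a fractional tolerance."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def is_about(candidate, value, tolerance=0.10):
    """True if candidate lies within the tolerance interval, endpoints included."""
    low, high = about_range(value, tolerance)
    return low <= candidate <= high


low, high = about_range(5.0)  # interval for "about 5 mg" at the assumed 10% tolerance
```

Under these assumptions, `is_about(4.5, 5.0)` holds and `is_about(5.6, 5.0)` does not.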
[0115] The term "biological sample" refers to a single cell, multiple cells, a tissue, an organ, an organism, or a section of any of these biological samples. Biological samples can be extracted from an organism (e.g., by biopsy) or obtained from cell cultures grown in liquid or in petri dishes. Biological samples include fresh, frozen, fresh-frozen, or archived samples (e.g., formalin-fixed paraffin-embedded; FFPE). Biological samples can be embedded in wax, resin, epoxy resin, or agar. For example, biological samples can be fixed in any one or any combination of two or more of the following: acetone, ethanol, methanol, formaldehyde, paraformaldehyde-triton, or glutaraldehyde. Biological samples can be sectioned or unsectioned. Biological samples can be stained, destained, or unstained.
[0116] Desired nucleic acids can be extracted from biological samples using any of a variety of techniques known to those skilled in the art. For example, a typical DNA extraction procedure includes: (i) collecting a cell or tissue sample from which DNA is to be extracted; (ii) disrupting the cell membrane (i.e., cell lysis) to release DNA and other cytoplasmic components; (iii) treating the lysed sample with a concentrated salt solution to precipitate proteins, lipids, and RNA, followed by centrifugation to separate the precipitated proteins, lipids, and RNA; and (iv) purifying the DNA from the supernatant to remove detergents, proteins, salts, or other reagents used during cell membrane lysis. A variety of suitable commercial nucleic acid extraction and purification kits are consistent with the disclosure herein. Examples include, but are not limited to: the QIAamp kit (for isolating genomic DNA from human samples) and the DNeasy kit (for isolating genomic DNA from animal or plant samples) from Qiagen (Germantown, MD), or the Maxwell® and ReliaPrep™ series kits from Promega (Madison, WI).
[0117] The terms “nucleic acid,” “polynucleotide,” and “oligonucleotide,” as used herein, and other related terms, are used interchangeably and refer to polymers of nucleotides and are not limited to any particular length. Nucleic acids include recombinant and chemically synthesized forms. Nucleic acids can be isolated. Nucleic acids include DNA molecules (e.g., cDNA or genomic DNA), RNA molecules (e.g., mRNA), DNA or RNA analogs generated using nucleotide analogs (e.g., peptide nucleic acids (PNAs) and non-naturally occurring nucleotide analogs), and chimeric forms containing DNA and RNA. Nucleic acids can be single-stranded or double-stranded. Nucleic acids comprise polymers of nucleotides, wherein the nucleotides comprise natural or non-natural bases and / or sugars. Nucleic acids can comprise naturally occurring internucleoside linkages, such as phosphodiester linkages. Nucleic acids may lack a phosphate group. Nucleic acids can comprise non-natural internucleoside linkages, including phosphorothioate, phosphorothiolate, or peptide nucleic acid (PNA) linkages. Nucleic acids can comprise combinations of natural and non-natural internucleoside linkages. In some embodiments, nucleic acids comprise one type of polynucleotide or a mixture of two or more different types of polynucleotides.
[0118] The terms "universal sequence," "universal adaptor sequence," and related terms refer to a sequence in a nucleic acid molecule that is common to two or more polynucleotide molecules. For example, adaptors with the same universal sequence can be joined to multiple polynucleotides, resulting in a population of co-joined molecules carrying the same universal adaptor sequence. Examples of universal adaptor sequences include amplification primer sequences, sequencing primer sequences, splint-capture primer sequences, pinning primer sequences, or non-splint primer sequences, or sequences complementary to them.
[0119] As used herein, the terms “operably linked” and “operably joined,” or related terms, refer to the juxtaposition of components such that the activity of one component affects that of the other. The juxtaposed components can be covalently linked together. For example, two nucleic acid components can be enzymatically linked together, wherein the linkage joining the two components includes a phosphodiester bond. A first nucleic acid component and a second nucleic acid component can be linked together, wherein the first nucleic acid component confers a function on the second nucleic acid component. For example, a linkage between a primer-binding sequence and a target sequence forms a nucleic acid library molecule having a portion that can bind to a primer. In another example, a transgene (e.g., a nucleic acid encoding a polypeptide or a target nucleic acid sequence) can be linked to a vector, wherein the linkage allows the transgene sequence contained in the vector to be expressed or to function. In some embodiments, the transgene is operably linked to a host cell regulatory sequence (e.g., a promoter sequence) that affects transgene expression. In some embodiments, the vector includes at least one host cell regulatory sequence, which includes a promoter sequence, an enhancer, a transcription and / or translation initiation sequence, a transcription and / or translation termination sequence, a polypeptide secretion signaling sequence, and so on. In some embodiments, host cell regulatory sequences control the level, timing, and / or location of transgene expression. Those skilled in the art will understand that components can be operatively linked without direct or indirect physical linkage.
[0120] The terms “linked,” “connected,” “attached,” “appended,” and variations thereof encompass any type of fusion, binding, attachment, or association between any combination of compounds or molecules that possesses sufficient stability to withstand use in a particular procedure. The procedure may include, but is not limited to: nucleotide binding; nucleotide incorporation; deblocking (e.g., removal of chain-terminating moieties); washing; removal; flow; detection; imaging and / or identification. Such linkages can include, for example, covalent bonds, ionic bonds, hydrogen bonds, dipole-dipole bonds, hydrophilic bonds, hydrophobic bonds, or affinity bonds, bonds or associations involving van der Waals forces, mechanical bonds, and so on. In some embodiments, such linkages occur intramolecularly, for example, linking the ends of single-stranded or double-stranded linear nucleic acid molecules together to form a circular molecule. In some embodiments, such linkages can occur between combinations of different molecules or between molecules and non-molecules, including, but not limited to: linkages between nucleic acid molecules and solid surfaces; linkages between proteins and detectable reporter moieties; linkages between nucleotides and detectable reporter moieties; and so on. Some examples of linkages can be found in Hermanson, G., “Bioconjugate Techniques”, 2nd edition (2008); and Aslam, M., Dent, A., “Bioconjugation: Protein Coupling Techniques for the Biomedical Sciences”, London: Macmillan (1998).
[0121] The term "adaptor" and related terms refer to an oligonucleotide that can be operably linked (appended) to a target polynucleotide, wherein the adaptor confers a function on the co-joined adaptor-target molecule. Adaptors comprise DNA, RNA, chimeric DNA / RNA, or the like. Adaptors may include at least one ribonucleoside residue. Adaptors may be single-stranded, double-stranded, or have single-stranded and / or double-stranded portions. Adaptors may be configured in linear, stem-loop, hairpin, or Y-shaped forms. Adaptors can be of any length, including 4 to 100 nucleotides or longer. Adaptors may have blunt ends, overhanging ends, or a combination of both. Overhanging ends include 5' overhangs and 3' overhangs. The 5' end of a single-stranded adaptor or one strand of a double-stranded adaptor may have a 5' phosphate group or lack a 5' phosphate group. Adaptors may include a 5' tail that does not hybridize to the target polynucleotide (e.g., a tailed adaptor), or the adaptor may be tailless. The adaptor may include a sequence complementary to at least a portion of a primer (such as an amplification primer, sequencing primer, splint-capture primer, pinning primer, or non-splint primer). The adaptor may include a random or degenerate sequence. The adaptor may include at least one inosine residue. The adaptor may include at least one phosphorothioate, phosphorothiolate, and / or phosphoramidate linkage. The adaptor may contain a barcode sequence that can be used to distinguish polynucleotides (e.g., insert sequences) from different sample sources in multiplex assays. The adaptor may contain a unique identification sequence (e.g., a unique molecular index, UMI; or a unique molecular tag) that can be used to uniquely identify the nucleic acid molecule to which the adaptor is appended. 
In some embodiments, the unique identification sequence can be used to improve error correction and accuracy, reduce the rate of false-positive variant calls, and / or increase the sensitivity of variant detection. The adaptor may include at least one restriction enzyme recognition sequence, including any one or any combination of two or more selected from the following groups: type I, type II, type III, type IV, type IIs, or type IIB.
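One common way a unique identification sequence supports error correction is by grouping reads that share a UMI and taking a per-position majority vote. The following is an illustrative sketch only, not a method taken from this disclosure; the function name `umi_consensus` and the example reads are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(tagged_reads):
    """Collapse (umi, sequence) pairs into one majority-vote consensus per UMI."""
    groups = defaultdict(list)
    for umi, seq in tagged_reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # Only compare positions present in every read of the group.
        length = min(len(s) for s in seqs)
        consensus[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0] for i in range(length)
        )
    return consensus


# Hypothetical reads: the third read carries a single-base error (G -> C).
tagged_reads = [("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACCT"), ("GGC", "TTTT")]
collapsed = umi_consensus(tagged_reads)
```

Because the erroneous base is outvoted by the two other reads carrying the same UMI, it does not appear in the consensus, which is the sense in which a UMI can reduce false-positive variant calls.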
[0122] The terms “nucleic acid template,” “template polynucleotide,” “nucleic acid target,” “target polynucleotide,” “template strand,” and other variations refer to a nucleic acid strand that serves as the basis nucleic acid molecule for any of the analytical methods described herein (e.g., primer extension, amplification, and / or sequencing). The template nucleic acid can be single-stranded or double-stranded, or may have single-stranded or double-stranded portions. The template nucleic acid can be obtained from naturally occurring sources, can be in recombinant form, or can be chemically synthesized to include any type of nucleic acid analogue. The template nucleic acid can be linear, circular, or in other forms. The template nucleic acid may include an insert region having an insert sequence, also referred to as the target sequence. The template nucleic acid may also include at least one adaptor sequence. The template nucleic acid can be a tandem molecule having two or more copies of the target sequence and at least one adaptor sequence. The insert region can be isolated in any form, including from fresh-frozen paraffin-embedded tissue, needle biopsy, circulating tumor cells, cell-free circulating DNA, or any type of nucleic acid library. This includes recombinant molecules of chromosomes, genomes, organelles (e.g., mitochondria, chloroplasts, or ribosomes), cloned or amplified cDNA, RNA (such as precursor mRNA or mRNA), oligonucleotides, and whole-genome DNA. The insert region can be isolated from any source, including from organisms such as prokaryotes, eukaryotes (e.g., humans, plants, and animals), fungi, viruses, cells, tissues, normal or diseased cells or tissues, body fluids including blood, urine, serum, lymph, tumors, saliva, anal and vaginal secretions, amnion samples, sweat, semen, environmental samples, culture samples, or synthetic nucleic acid molecules prepared using recombinant molecular biology or chemical synthesis methods. 
The insert region can be isolated from any organ, including the head, neck, brain, breast, ovary, cervix, colon, rectum, endometrium, gallbladder, intestine, bladder, prostate, testis, liver, lung, kidney, esophagus, pancreas, thyroid gland, pituitary gland, thymus, skin, heart, larynx, or other organs. The template nucleic acid can undergo nucleic acid analysis (including sequencing and compositional analysis).
[0123] As used herein, the term "polymerase" and its variants include enzymes comprising a domain that binds a nucleotide (or nucleoside), wherein the polymerase can form a complex having a template nucleic acid and a complementary nucleotide. A polymerase may have one or more activities, including but not limited to: base analog detection activity, DNA polymerization activity, reverse transcriptase activity, DNA binding, strand substitution activity, and nucleotide binding and recognition. A polymerase can be any enzyme capable of catalyzing the polymerization of nucleotides (including their analogs) into a nucleic acid chain. Typically, but not necessarily, such nucleotide polymerization can occur in a template-dependent manner. Typically, a polymerase includes one or more active sites at which nucleotide binding and / or nucleotide polymerization catalysis can occur. In some embodiments, the polymerase includes other enzymatic activities, such as 3' to 5' exonuclease activity or 5' to 3' exonuclease activity. In some embodiments, the polymerase has strand substitution activity. Polymerases may include, but are not limited to: naturally occurring polymerases and any subunits and truncated forms thereof, mutant polymerases, variant polymerases, recombinant, fusion, or otherwise engineered polymerases, chemically modified polymerases, synthetic molecules or assemblies, and any analogs, derivatives, or fragments thereof (e.g., catalytically active fragments) that retain the ability to catalyze nucleotide polymerization. Polymerases include catalytically inactive polymerases, catalytically active polymerases, reverse transcriptases, and other enzymes containing nucleotide-binding domains. In some embodiments, polymerases may be isolated from cells or produced using recombinant DNA technology or chemical synthesis methods. In some embodiments, polymerases may be expressed in prokaryotes, eukaryotes, viruses, or bacteriophages. 
In some embodiments, polymerases may be post-translationally modified proteins or fragments thereof. Polymerases may be derived from prokaryotes, eukaryotes, viruses, or bacteriophages. Polymerases include DNA-directed DNA polymerases and RNA-directed DNA polymerases. Suitable polymerases are known in the art, and the sequences of exemplary suitable polymerases are described in […] (provided in Figures 41 to 59).
[0124] The term "strand displacement" refers to the ability of a polymerase to locally separate the strands of a double-stranded nucleic acid and synthesize a new strand in a template-dependent manner. Strand displacement polymerases displace the complementary strand from the template strand and catalyze the synthesis of the new strand. Strand displacement polymerases include mesophilic and thermophilic polymerases. Strand displacement polymerases include wild-type enzymes and variants (including exonuclease-deficient mutants, mutant versions, chimeric enzymes, and truncated enzymes). Examples of strand displacement polymerases include phi29 DNA polymerase, the large fragment of Bst DNA polymerase, the large fragment of Bsu DNA polymerase (exo-), Bca DNA polymerase (exo-), the Klenow fragment of E. coli DNA polymerase, T5 polymerase, M-MuLV reverse transcriptase, HIV reverse transcriptase, Deep Vent DNA polymerase, and KOD DNA polymerase. The phi29 DNA polymerase can be wild-type phi29 DNA polymerase (e.g., MagniPhi™ from Expedeon), variant EquiPhi29™ DNA polymerase (e.g., from Thermo Fisher Scientific, catalog number A39390), or chimeric QualiPhi™ DNA polymerase (e.g., from 4basebio, catalog number 510025).
[0125] As used herein, the term "fidelity" refers to the accuracy of DNA polymerization performed by a template-dependent DNA polymerase. DNA polymerase fidelity is typically measured by the error rate (the frequency of incorporation of inaccurate nucleotides, i.e., nucleotides that are not complementary to the template nucleotide). The accuracy or fidelity of DNA polymerization is maintained by the polymerase activity and 3'-5' exonuclease activity of the DNA polymerase.
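As a rough illustration of the error-rate measure described above, the following minimal Python sketch counts the positions at which a newly synthesized strand is not complementary to its template. The function name `error_rate`, the restriction to the four canonical bases, and the assumption that both strands have equal length and are written 5' to 3' are illustrative choices and are not taken from this disclosure.

```python
# Watson-Crick partners for the four canonical DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def error_rate(template: str, product: str) -> float:
    """Fraction of product positions not complementary to the template.

    Both strands are written 5' to 3' and assumed equal in length, so the
    product is compared against the template read in reverse (antiparallel).
    """
    if len(template) != len(product):
        raise ValueError("strands must have equal length in this sketch")
    if not template:
        raise ValueError("strands must be non-empty")
    errors = sum(
        1
        for t_base, p_base in zip(reversed(template.upper()), product.upper())
        if COMPLEMENT[t_base] != p_base
    )
    return errors / len(template)
```

In practice, error rates are estimated over very many incorporated nucleotides; the per-strand comparison here only shows how the ratio is formed.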
[0126] As used herein, the term "binding complex" refers to a complex formed by binding together a nucleic acid duplex, a polymerase, and a free nucleotide or a nucleotide moiety of a multivalent molecule, wherein the nucleic acid duplex includes a nucleic acid template molecule hybridized to a nucleic acid primer. In a binding complex, the free nucleotide or nucleotide moiety may or may not bind to the 3' end of the nucleic acid primer at a position opposite to a complementary nucleotide in the nucleic acid template molecule. A "ternary complex" is an example of a binding complex formed by binding together a nucleic acid duplex, a polymerase, and a free nucleotide or a nucleotide moiety of a multivalent molecule, wherein the free nucleotide or nucleotide moiety binds to the 3' end of the nucleic acid primer (as part of the nucleic acid duplex) at a position opposite to a complementary nucleotide in the nucleic acid template molecule. An "affinity complex" refers to a complex in which the arms of a single multivalent molecule, each arm carrying a nucleotide moiety, participate in different ternary complexes.
[0127] The term "retention time" and related terms refer to the length of time a binding complex remains stable without any dissociation of its components, wherein the components of the binding complex include a nucleic acid template and primer, a polymerase, a nucleotide moiety of a multivalent molecule, or a free (e.g., unconjugated) nucleotide. The nucleotide moiety or free nucleotide may be complementary or non-complementary to nucleotide residues in the template molecule. The nucleotide moiety or free nucleotide may bind to the 3' end of the nucleic acid primer at a position opposite to the complementary nucleotide residue in the nucleic acid template molecule. Retention time indicates the stability of the binding complex and the strength of the binding interaction. Retention time can be measured by observing the onset and / or duration of the binding complex (e.g., by observing a signal from a labeled component of the binding complex). For example, a labeled nucleotide or a labeled reagent containing one or more nucleotides may be present in the binding complex, thus allowing a signal from the label to be detected during the retention time of the binding complex. An exemplary label is a fluorescent label. The binding complex (e.g., a ternary complex) remains stable before undergoing conditions that cause dissociation between the polymerase, template molecule, primer, and / or nucleotide moiety or any of the nucleotides. For example, dissociation conditions include contacting the bound complex with any or any combination of detergent, EDTA, and / or water.
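The measurement of retention time from the onset and duration of a label signal can be sketched as follows. This is a minimal Python illustration under stated assumptions: the signal is sampled at discrete time points, the complex is treated as present whenever the signal is at or above a user-chosen threshold, and the function name `retention_interval` and its parameters are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def retention_interval(
    times: Sequence[float],
    signal: Sequence[float],
    threshold: float,
) -> Optional[Tuple[float, float]]:
    """Return (onset, duration) of the first contiguous run of samples
    whose signal is at or above the threshold, or None if there is none.

    The duration is measured between the first and last samples of the run,
    so it is limited by the sampling interval.
    """
    onset = None
    last_above = None
    for t, s in zip(times, signal):
        if s >= threshold:
            if onset is None:
                onset = t  # first sample at which the label is detected
            last_above = t
        elif onset is not None:
            break  # signal dropped: the complex is treated as dissociated
    if onset is None:
        return None
    return onset, last_above - onset
```

For example, with samples taken once per second, a signal that is above threshold at 1 s, 2 s, and 3 s yields an onset of 1 s and a duration of 2 s.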
[0128] As used herein, the term "primer" and related terms refer to oligonucleotides capable of hybridizing with DNA and / or RNA polynucleotide templates to form double-stranded molecules. Primers can include natural nucleotides and / or nucleotide analogs. Primers can be recombinant nucleic acid molecules. Primers can have any length, but typically range from 4 to 50 nucleotides. Typical primers include a 5' end and a 3' end. The 3' end of a primer may contain a 3' OH moiety that acts as the initiation site for nucleotide polymerization in a polymerase-catalyzed primer extension reaction. Alternatively, the 3' end of a primer may lack a 3' OH moiety or may contain a terminal 3' blocking group that inhibits nucleotide polymerization in a polymerase-catalyzed reaction. Any one or more nucleotides along the length of the primer may be labeled with a detectable reporter moiety. Primers may be in solution (e.g., soluble primers) or may be immobilized to a support (e.g., splint-capture primers or pinning primers).
[0129] When used to refer to nucleic acid molecules, the terms "hybridize," "hybridizing," "hybridization," or other related terms refer to the hydrogen bonding between two different nucleic acids to form a double-stranded nucleic acid. Hybridization also includes the hydrogen bonding between two different regions of a single nucleic acid molecule to form a self-hybridized molecule with double-stranded regions. Hybridization can include Watson-Crick or Hoogsteen binding to form double-stranded nucleic acids or double-stranded regions within a nucleic acid molecule. The two different nucleic acids, or the two different regions of a single nucleic acid, can be completely complementary or partially complementary. Complementary nucleic acid strands do not need to hybridize to each other across their entire length. Complementary base pairing can be standard A-T or C-G base pairing or can be other forms of base pairing interactions. Double-stranded nucleic acids can contain mismatched base-paired nucleotides.
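To make the idea of partial complementarity concrete, the following minimal Python sketch finds the longest contiguous stretch over which two strands could pair in antiparallel orientation. It considers only standard A-T and C-G pairs and ignores Hoogsteen pairing, mismatches within a duplex, and thermodynamics; the function names `reverse_complement` and `longest_duplex` are hypothetical.

```python
# Watson-Crick partners for the four canonical DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def longest_duplex(strand_a: str, strand_b: str) -> int:
    """Length of the longest contiguous run of Watson-Crick pairs that the
    two strands (both written 5' to 3') can form when aligned antiparallel.

    This is the longest common substring of strand_a and the reverse
    complement of strand_b, computed by dynamic programming.
    """
    a = strand_a.upper()
    target = reverse_complement(strand_b)
    best = 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if a[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1  # extend the run ending here
                best = max(best, curr[j])
        prev = curr
    return best
```

Passing the same sequence for both arguments gives a crude indication of whether a single molecule contains regions that could self-hybridize.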
[0130] When used to refer to nucleic acids, the terms "extend," "extending," "extension," and other variations refer to the incorporation of one or more nucleotides into a nucleic acid molecule. Nucleotide incorporation involves the polymerization of one or more nucleotides into the terminal 3' OH end of a nucleic acid chain (e.g., a nucleic acid primer), resulting in the extension of the nucleic acid chain (e.g., an extended primer). Nucleotide incorporation can be performed using native nucleotides and / or nucleotide analogs. Typically, but not always, nucleotide incorporation occurs in a template-dependent manner. Any suitable method for extending nucleic acid molecules can be used, which involves primer extension catalyzed by DNA polymerase or RNA polymerase.
[0131] The term "nucleotide" and related terms refer to a molecule comprising an aromatic base, a pentose sugar (e.g., ribose or deoxyribose), and at least one phosphate group. Both canonical and non-canonical nucleotides fall within this term. In some embodiments, the phosphate includes a monophosphate, diphosphate, or triphosphate, or a corresponding phosphate analog. The term "nucleoside" refers to a molecule comprising an aromatic base and a sugar. Nucleotides and nucleosides may be unlabeled or labeled with a detectable reporter moiety.
[0132] Nucleotides (and nucleosides) typically contain heterocyclic bases, including substituted or unsubstituted nitrogen-containing parent heteroaromatic rings commonly found in nucleic acids, including naturally occurring, substituted, modified, or engineered variants or analogues thereof. The bases of nucleotides (or nucleosides) are capable of forming Watson-Crick and / or Hoogsteen hydrogen bonds with suitable complementary bases. Exemplary bases include, but are not limited to, purines and pyrimidines, such as: 2-aminopurine, 2,6-diaminopurine, adenine (A), ethenoadenine, N⁶-Δ²-isopentenyladenine (6iA), N⁶-Δ²-isopentenyl-2-methylthioadenine (2ms6iA), N⁶-methyladenine, guanine (G), isoguanine, N²-dimethylguanine (dmG), 7-methylguanine (7mG), 2-thiopyrimidine, 6-thioguanine (6sG), hypoxanthine, and O⁶-methylguanine; 7-deaza-purines, such as 7-deazaadenine (7-deaza-A) and 7-deazaguanine (7-deaza-G); pyrimidines, such as cytosine (C), 5-propynylcytosine, isocytosine, thymine (T), 4-thiothymine (4sT), 5,6-dihydrothymine, O⁴-methylthymine, uracil (U), 4-thiouracil (4sU), and 5,6-dihydrouracil (dihydrouracil; D); indoles, such as nitroindole and 4-methylindole; pyrroles, such as nitropyrrole; nebularine; inosine; hydroxymethylcytosine; 5-methylcytosine; base (Y); and methylated, glycosylated, and acylated base moieties; etc. Additional exemplary bases can be found in Fasman, 1989, in "Practical Handbook of Biochemistry and Molecular Biology", pp. 385–394, CRC Press, Boca Raton, Fla.
[0133] Nucleotides (and nucleosides) typically contain a sugar moiety, such as a carbocyclic moiety (Ferrero and Gotor 2000 Chem. Rev. 100: 4319-48), an acyclic moiety (Martinez et al., 1999 Nucleic Acids Research 27: 1271-1274; Martinez et al., 1997 Bioorganic & Medicinal Chemistry Letters Vol. 7: 3013-3016), or another sugar moiety (Jeong et al., 1993 J. Med. Chem. 36: 2627-2638; Kim et al., 1993 J. Med. Chem. 36: 30-7; Eschenmoser 1999 Science 284: 2118-2124; and US Patent No. 5,558,991). The sugar moiety may include ribosyl; 2'-deoxyribosyl; 3'-deoxyribosyl; 2',3'-dideoxyribosyl; 2',3'-didehydrodideoxyribosyl; 2'-alkoxyribosyl; 2'-azidoribosyl; 2'-aminoribosyl; 2'-fluororibosyl; 2'-mercaptoribosyl; 2'-alkylthioribosyl; 3'-alkoxyribosyl; 3'-azidoribosyl; 3'-aminoribosyl; 3'-fluororibosyl; 3'-mercaptoribosyl; 3'-alkylthioribosyl; carbocyclic; acyclic; or other modified sugars.
[0134] In some embodiments, the nucleotide comprises a chain of one, two, or three phosphorus atoms, wherein the chain is typically attached to the 5' carbon of the sugar moiety via an ester or phosphoramide bond. In some embodiments, the nucleotide is an analogue having a phosphorus chain, wherein the phosphorus atoms are linked together by an intermediate O, S, NH, methylene, or ethylene group. In some embodiments, the phosphorus atom in the chain comprises a substituted side group (including O, S, or BH3). In some embodiments, the chain comprises a phosphate ester group substituted with an analogue, including phosphoramide, thiophosphate, dithiophosphate, and O-methylphosphoramide groups.
[0135] The terms "reporter moiety," "reporter moieties," or related terms refer to compounds that produce or cause the production of a detectable signal. A reporter moiety is sometimes referred to as a "tag." Any suitable reporter moiety can be used, including luminescence, photoluminescence, electroluminescence, bioluminescence, chemiluminescence, fluorescence, phosphorescence, chromophores, radioisotopes, electrochemical, mass spectrometry, Raman spectroscopy, haptens, affinity tags, atoms, or enzymes. A reporter moiety produces a detectable signal caused by a chemical or physical change, such as heat, light, electricity, pH, salt concentration, enzymatic activity, or proximity events. Proximity events include two reporter moieties coming close to each other, associating with each other, or binding to each other. Those skilled in the art will appreciate how to select reporter moieties such that each reporter moiety absorbs excitation radiation and / or emits fluorescence at a wavelength distinguishable from other reporter moieties, allowing the presence of different reporter moieties to be monitored in the same or different reactions. Two or more different reporter moieties with spectrally distinct emission profiles, or with minimally overlapping spectral emission profiles, can be selected. The reporter moiety may be operably linked to nucleotides, nucleosides, nucleic acids, enzymes (e.g., polymerases or reverse transcriptases), or supports (e.g., surfaces).
[0136] The reporter moiety (or label) may include a fluorescent label or fluorophore. Exemplary fluorescent moieties that can be used as fluorescent labels or fluorophores include, but are not limited to: fluoresceins and fluorescein derivatives such as carboxyfluorescein, tetrachlorofluorescein, hexachlorofluorescein, carboxynaphthol fluorescein, isothiocyanate fluorescein, NHS-fluorescein, iodoacetamide-fluorescein, maleimide fluorescein, SAMSA-fluorescein, aminothiourea fluorescein, hydrazine methylthioacetamide fluorescein; rhodamine and rhodamine derivatives such as TRITC, TMR, lissamine rhodamine, Texas red, rhodamine B, rhodamine 6G, rhodamine 10, NHS-rhodamine, TMR-iodoacetamide, lissamine rhodamine B sulfonyl chloride, lissamine rhodamine B sulfonyl hydrazine, Texas red sulfonyl chloride, Texas red hydrazine; coumarins and coumarin derivatives such as AMCA, AMCA-NHS, AMCA-sulfo-NHS, AMCA-HPDP, DCIA, AMCE-hydrazine; BODIPY® and derivatives such as BODIPY FL C3-SE, BODIPY 530 / 550 C3, BODIPY 530 / 550 C3-SE, BODIPY 530 / 550 C3 hydrazide, BODIPY 493 / 503 C3 hydrazide, BODIPY FL C3 hydrazide, BODIPY FL IA, BODIPY 530 / 551 IA, Br-BODIPY 493 / 503; Cascade® Blue and its derivatives, such as Cascade Blue acetyltriazine, Cascade Blue cadaverine, Cascade Blue ethylenediamine, Cascade Blue hydrazide; Lucifer Yellow and its derivatives, such as Lucifer Yellow iodoacetamide, Lucifer Yellow CH; cyanines and their derivatives, such as indolium-based cyanine dyes, benzoindolium-based cyanine dyes, pyridinium-based cyanine dyes, thiazolium-based cyanine dyes, quinolinium-based cyanine dyes, imidazolium-based cyanine dyes, Cy3, Cy5; lanthanide chelates and their derivatives, such as BCPDA, TBP, TMT, BHHCT, BCOT, europium chelates, terbium chelates; Alexa Fluor® dyes; DyLight dyes; Atto dyes; LightCycler® Red dyes; CAL Fluor dyes; JOE and its derivatives; Oregon Green dyes; WellRED dyes; IRD dyes; phycoerythrin and phycobilin dyes; malachite green; stilbene; DEG dyes; NR dyes; near-infrared dyes; and other dyes known in the art, such as those described in Haugland, Molecular Probes Handbook (Eugene, Oreg.), 6th Edition; Lakowicz, Principles of Fluorescence Spectroscopy, 2nd Edition, Plenum Press, New York (1999); or Hermanson, Bioconjugate Techniques, 2nd Edition; or derivatives thereof, or any combination thereof. Cyanine dyes can exist in sulfonated or non-sulfonated forms and consist of two indolenine, benzoindolium, pyridinium, thiazolium, and / or quinolinium groups separated by a polymethine bridge between two nitrogen atoms. Commercially available cyanine fluorophores include, for example, Cy3 (which may contain 1-[6-(2,5-dioxopyrrolidone-1-yloxy)-6-oxohexyl]-2-(3-{1-[6-(2,5-dioxopyrrolidone-1-yloxy)-6-oxohexyl]-3,3-dimethyl-1,3-dihydro-2H-indol-2-yl]prop-1-en-1-yl)-3,3-dimethyl-3H-indolium or 1-[6-(2,5-dioxopyrrolidone-1-yloxy)-6-oxohexyl]-2-(3-{1-[6-(2,5-dioxopyrrolidone-1-yloxy)-6-oxohexyl]-3,3-dimethyl-5-sulfon-1,3-dihydro-2H-indol-2-ylidene}prop-1-en-1-yl)-3,3-dimethyl-3H-indolon-5-sulfonate), Cy5 (which may contain 1-(6-((2,5-dioxopyrrolidone-1-yl)oxy)-6-oxohexyl)-2-((1E,3E)-5-((E)- ...-6-oxohexyl)-2-((1E,3E)-5-((E)-1-(6-((2,5-dioxopyrrolidone-1-yl)-6- -(2,5-dioxopyrrolidone-1-yl)-6-oxohexyl)-3,3-dimethyl-5-indole-2-yl)pent-1,3-dien-1-yl)-3,3-dimethyl-3H-indole-1-onyl or 1-(6-((2,5-dioxopyrrolidone-1-yl)oxy)-6-oxohexyl)-2-((1E,3E)-5-((E)-1-(6-((2,5-dioxopyrrolidone-1-yl)oxy)-6-oxohexyl)-3,3-dimethyl-5-sulfonylindole-2-yl)pent-1,3-dien- 1-(5-carboxypentyl)-2-[(1E,3E,5E,7Z)-7-(1-ethyl-1,3-dihydro-2H-indole-2-ylidene)hepta-1,3,5-trien-1-yl]-3H-indoleonium or 1-(5-carboxypentyl)-2-[(1E,3E,5E,7Z)-7-(1-ethyl-5-sulfonyl-1,3-dihydro-2H-indole-2-ylidene)hepta-1,3,[5-trien-1-yl]-3H-indolium-5-sulfonate), where "Cy" stands for "cyanine", and the first number identifies the number of carbon atoms between the two indolenine groups. Cy2 is an oxazole derivative rather than an indolenine, and the benzo-derived Cy3.5, Cy5.5, and Cy7.5 are exceptions to this rule.
[0137] In some embodiments, the reporter moiety may be a fluorescence resonance energy transfer (FRET) pair, allowing multiple classifications to be performed within a single excitation and imaging step. As used herein, FRET may include excitation exchange (Förster) transfer or electron exchange (Dexter) transfer.
[0138] This disclosure provides supports for use in the sequencing methods described herein.
[0139] As used herein, the term "support" refers to a substrate designed for the deposition of biological molecules or biological samples for determination and / or analysis. Examples of biological molecules to be deposited onto a support include nucleic acids (e.g., DNA, RNA), peptides, carbohydrates, lipids, single cells, or multiple cells. Examples of biological samples include, but are not limited to, saliva, sputum, mucus, blood, plasma, serum, urine, feces, sweat, tears, and fluids from tissues or organs.
[0140] In some embodiments, the support is solid, semi-solid, or a combination of both. In some embodiments, the support is porous, semi-porous, non-porous, or any combination thereof. In some embodiments, the support can be substantially planar, concave, convex, or any combination thereof. In some embodiments, the support can be cylindrical, for example, including a capillary or the inner surface of a capillary.
[0141] In some embodiments, the surface of the support may be substantially smooth. In some embodiments, the support may have a regular or irregular texture, including bumps, etched surfaces, holes, three-dimensional supports, or any combination thereof.
[0142] In some embodiments, the support comprises beads of any shape, including spherical, hemispherical, cylindrical, barrel-shaped, ring-shaped, disc-shaped, rod-shaped, conical, triangular, cubic, polygonal, tubular, or linear.
[0143] The support can be made of any material, including but not limited to: glass, fused silica, silicon, polymers (e.g., polystyrene (PS), macroporous polystyrene (MPPS), polymethyl methacrylate (PMMA), polycarbonate (PC), polypropylene (PP), polyethylene (PE), high-density polyethylene (HDPE), cyclic olefin polymers (COP), cyclic olefin copolymers (COC), polyethylene terephthalate (PET)) or any combination thereof. Various combinations of both glass and plastic substrates are contemplated.
[0144] The support may have multiple (e.g., two or more) nucleic acid tandem template molecules immobilized thereon. These multiple immobilized nucleic acid tandem template molecules may have the same sequence or different sequences. In some embodiments, individual nucleic acid tandem template molecules from the multiple nucleic acid templates are immobilized to different sites on the support. In some embodiments, two or more individual nucleic acid tandem template molecules from the multiple nucleic acid templates are immobilized to sites on the support.
[0145] The term "array" refers to a support that includes a plurality of sites located at predetermined positions on the support to form an array of sites (e.g., Figure 16). The sites can be discrete and separated by gaps. In some embodiments, the predetermined sites on the support can be arranged in rows or columns in one dimension, or in rows and columns in two dimensions. In some embodiments, a plurality of predetermined sites are arranged in an organized manner on the support. In some embodiments, the plurality of predetermined sites are arranged in any organized pattern, including rectilinear patterns, hexagonal patterns, grid patterns, patterns with reflective symmetry, or patterns with rotational symmetry, etc. The spacing between different pairs of sites can be the same or can be different. In some embodiments, the support includes at least 10² sites, at least 10³ sites, at least 10⁴ sites, at least 10⁵ sites, at least 10⁶ sites, at least 10⁷ sites, at least 10⁸ sites, at least 10⁹ sites, at least 10¹⁰ sites, at least 10¹¹ sites, at least 10¹² sites, at least 10¹³ sites, at least 10¹⁴ sites, at least 10¹⁵ sites, or more, wherein the sites are located at predetermined positions on the support. In some embodiments, nucleic acid templates are immobilized at a plurality of the predetermined sites on the support (e.g., 10² to 10¹⁵ sites or more) to form a nucleic acid template array. In some embodiments, the nucleic acid templates are immobilized at the plurality of predetermined sites by hybridization with immobilized capture primers, or the nucleic acid templates are covalently attached to the capture primers. In some embodiments, the nucleic acid templates are immobilized at the plurality of predetermined sites, for example, at 10² to 10¹⁵ sites or more. In some embodiments, the immobilized nucleic acid templates are clonally amplified to generate immobilized nucleic acid polonies at the plurality of predetermined sites. In some embodiments, each immobilized nucleic acid polony comprises a single-stranded or double-stranded tandem template molecule.
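As an illustration of how predetermined sites might be laid out in an organized pattern, the following minimal Python sketch generates site coordinates for a square grid and for a hexagonal pattern. The function names, the `pitch` parameter (center-to-center spacing in arbitrary length units), and the specific row and column counts are hypothetical and are not specified by this disclosure.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def grid_sites(rows: int, cols: int, pitch: float) -> List[Point]:
    """Sites in rows and columns with equal spacing in both dimensions."""
    return [(c * pitch, r * pitch) for r in range(rows) for c in range(cols)]


def hexagonal_sites(rows: int, cols: int, pitch: float) -> List[Point]:
    """Sites in a hexagonal pattern.

    Odd rows are shifted by half a pitch and rows are spaced by
    pitch * sqrt(3) / 2, so every nearest-neighbor distance equals pitch.
    """
    row_height = pitch * math.sqrt(3) / 2
    return [
        ((c + 0.5 * (r % 2)) * pitch, r * row_height)
        for r in range(rows)
        for c in range(cols)
    ]
```

For a given pitch, the hexagonal pattern packs more sites per unit area than the square grid, which is one reason such a pattern may be chosen when site density matters.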
[0146] In some embodiments, a support comprising multiple sites located at random positions on the support is referred to herein as a support having sites randomly positioned thereon (e.g., Figure 15). The locations of the randomly positioned sites on the support are not predetermined. These multiple randomly positioned sites are arranged on the support in a disordered and / or unpredictable manner. In some embodiments, the support comprises at least 10² sites, at least 10³ sites, at least 10⁴ sites, at least 10⁵ sites, at least 10⁶ sites, at least 10⁷ sites, at least 10⁸ sites, at least 10⁹ sites, at least 10¹⁰ sites, at least 10¹¹ sites, at least 10¹² sites, at least 10¹³ sites, at least 10¹⁴ sites, at least 10¹⁵ sites, or more, wherein the sites are randomly located on the support. In some embodiments, nucleic acid templates are immobilized at a plurality of the randomly located sites on the support (e.g., 10² to 10¹⁵ sites or more) to form a support having nucleic acid templates immobilized thereon. In some embodiments, the nucleic acid templates are immobilized at the plurality of randomly located sites by hybridization with immobilized capture primers, or the nucleic acid templates are covalently attached to the capture primers. In some embodiments, the nucleic acid templates are immobilized at the plurality of randomly located sites, for example, at 10² to 10¹⁵ sites or more. In some embodiments, the immobilized nucleic acid templates are clonally amplified to generate immobilized nucleic acid colonies at the plurality of randomly located sites. In some embodiments, each immobilized nucleic acid colony comprises a single-stranded or double-stranded tandem template molecule.
[0147] When used to refer to low-binding surface coatings, one or more layers of a multilayer surface coating may contain branched polymers or may be linear. Examples of suitable branched polymers include, but are not limited to: branched PEG, branched poly(vinyl alcohol) (branched PVA), branched poly(vinylpyridine), branched poly(vinylpyrrolidone) (branched PVP), branched poly(acrylic acid) (branched PAA), branched polyacrylamide, branched poly(N-isopropylacrylamide) (branched PNIPAM), branched poly(methyl methacrylate) (branched PMA), branched poly(2-hydroxyethyl methacrylate) (branched PHEMA), branched poly(oligomeric (ethylene glycol) methyl ether methacrylate) (branched POEGMA), branched polyglutamic acid (branched PGA), branched polylysine, branched polyglucosides, and dextran.
[0148] In some embodiments, the branched polymer used to produce one or more layers of any of the multilayer surfaces disclosed herein may comprise: at least 4 branches, at least 5 branches, at least 6 branches, at least 7 branches, at least 8 branches, at least 9 branches, at least 10 branches, at least 12 branches, at least 14 branches, at least 16 branches, at least 18 branches, at least 20 branches, at least 22 branches, at least 24 branches, at least 26 branches, at least 28 branches, at least 30 branches, at least 32 branches, at least 34 branches, at least 36 branches, at least 38 branches, or at least 40 branches.
[0149] The linear, branched, or multibranched polymers used to form one or more layers of any of the multilayer surfaces disclosed herein may have a molecular weight of at least 500, at least 1,000, at least 2,000, at least 3,000, at least 4,000, at least 5,000, at least 10,000, at least 15,000, at least 20,000, at least 25,000, at least 30,000, at least 35,000, at least 40,000, at least 45,000, or at least 50,000 Daltons.
[0150] In some embodiments, for example, where at least one layer of the multilayer surface comprises a branched polymer, the number of covalent bonds between the branched polymer molecules of the layer being deposited and the molecules of the previous layer can range from about one covalent bond per molecule to about 32 covalent bonds per molecule. In some embodiments, the number of covalent bonds between the branched polymer molecules of the new layer and the molecules of the previous layer can be at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 22, at least 24, at least 26, at least 28, at least 30, or at least 32 covalent bonds per molecule.
[0151] Any reactive functional groups remaining after the material layer is coupled to the surface can optionally be blocked by coupling small inert molecules using high-yield coupling chemistry. For example, in the case of using amine coupling chemistry to connect a new material layer to a previous material layer, any residual amine groups can subsequently be acetylated or deactivated by coupling with small amino acids such as glycine.
[0152] The number of layers of low nonspecific binding material (e.g., hydrophilic polymeric material) deposited on the surface can range from 1 to about 10. In some embodiments, the number of layers is at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10. In some embodiments, the number of layers can be up to 10, up to 9, up to 8, up to 7, up to 6, up to 5, up to 4, up to 3, up to 2, or up to 1. Any of the lower and upper limits described in this paragraph can be combined to form the ranges included in this disclosure; for example, in some embodiments, the number of layers can range from about 2 to about 4. In some embodiments, all layers may contain the same material. In some embodiments, each layer may contain a different material. In some embodiments, multiple layers may contain multiple materials. In some embodiments, at least one layer may contain a branched polymer. In some embodiments, all layers may contain a branched polymer.
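The statement that any lower limit can be combined with any upper limit can be made explicit by enumeration. The following minimal Python sketch, in which the name `layer_ranges` is hypothetical, lists every (lower, upper) pair drawn from the limits recited above, keeping only pairs in which the lower limit does not exceed the upper limit.

```python
from itertools import product
from typing import List, Tuple

# Lower limits ("at least N") and upper limits ("up to N") recited for
# the number of layers.
LOWER_LIMITS = range(1, 11)
UPPER_LIMITS = range(1, 11)


def layer_ranges() -> List[Tuple[int, int]]:
    """All (lower, upper) layer-count ranges with lower <= upper."""
    return [
        (lo, hi)
        for lo, hi in product(LOWER_LIMITS, UPPER_LIMITS)
        if lo <= hi
    ]
```

The range of about 2 to about 4 layers given as an example corresponds to the pair (2, 4) in this enumeration.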
[0153] In some cases, polar protic solvents, polar aprotic solvents, nonpolar solvents, or any combination thereof can be used to deposit one or more layers of low nonspecific binding material onto the substrate surface and / or to conjugate the layers to the substrate surface. In some embodiments, the solvent used for layer deposition and / or conjugation may comprise alcohols (e.g., methanol, ethanol, propanol, etc.), another organic solvent (e.g., acetonitrile, dimethyl sulfoxide (DMSO), dimethylformamide (DMF), etc.), water, aqueous buffer solutions (e.g., phosphate buffer, phosphate-buffered saline, 3-(N-morpholino)propanesulfonic acid (MOPS), etc.), or any combination thereof. In some embodiments, the organic component of the solvent mixture used may constitute at least 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the total, with the balance consisting of water or an aqueous buffer solution. In some embodiments, the aqueous component of the solvent mixture used may constitute at least 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the total, with the balance consisting of an organic solvent. The pH of the solvent mixture used may be less than 6, about 6, 6.5, 7, 7.5, 8, 8.5, 9, or greater than 9.
[0154] The term "branched polymer" and related terms refer to polymers having multiple functional groups that facilitate the conjugation of bioactive molecules, such as nucleotides. These functional groups may be attached to the side chains of the polymer or directly to the central core or main chain of the polymer. Branched polymers may have a linear main chain with one or more functional groups pendant from the main chain for conjugation. Branched polymers may also be polymers with one or more side chains having sites suitable for conjugation. Examples of functional groups include, but are not limited to: hydroxyl, ester, amine, carbonate, acetal, aldehyde, aldehyde hydrate, alkenyl, acrylate, methacrylate, acrylamide, active sulfone, hydrazine, thiol, alkanoic acid, acyl halide, isocyanate, isothiocyanate, maleimide, vinyl sulfone, dithiopyridine, vinylpyridine, iodoacetamide, epoxide, glyoxal, diketone, methanesulfonate, toluenesulfonate, and tresylate.
[0155] When used to refer to immobilized nucleic acids, the term "immobilized" and related terms refer to nucleic acid molecules attached to a support, or to a coating on a support, or embedded in a matrix formed by a coating on a support via covalent or non-covalent interactions, wherein the nucleic acid molecules include splint-capture primers (200), pinning primers (500), nucleic acid tandem template molecules, and extension products of the capture primers. Extension products of the capture primers include nucleic acid tandem template molecules capable of forming nucleic acid colonies.
[0156] In some embodiments, one or more nucleic acid tandem template molecules are immobilized on a support, for example, at sites on the support. In some embodiments, the one or more nucleic acid templates are clonally amplified. In some embodiments, one or more nucleic acid tandem template molecules are clonally amplified off the support (e.g., in solution), then deposited onto the support and immobilized. In some embodiments, the clonal amplification reaction of one or more nucleic acid tandem template molecules is performed on the support, resulting in immobilization. In some embodiments, the clonal amplification of one or more nucleic acid tandem template molecules using nucleic acid amplification reactions (e.g., in solution or on a support) includes any one or any combination of the following: polymerase chain reaction (PCR), multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, bridge amplification, isothermal bridge amplification, rolling circle amplification (RCA), circle-to-circle amplification, helicase-dependent amplification, recombinase-dependent amplification, and / or single-strand-binding (SSB) protein-dependent amplification.
[0157] The terms "capture primer," "splint-capture primer," "immobilized splint-capture primer," and related terms refer to single-stranded oligonucleotides immobilized to a support and containing a sequence capable of hybridizing with at least a portion of a nucleic acid library molecule. Splint-capture primers can be used to immobilize linear library molecules to a support via hybridization. Splint-capture primers can be immobilized to the support in a manner resistant to primer removal during flow, washing, aspiration, and changes in temperature, pH, salt, chemical, and / or enzymatic conditions. Typically, but not necessarily, the 5' end of the capture primer can be immobilized to the support. Alternatively, an internal portion or the 3' end of the capture primer can be immobilized to the support.
[0158] The sequence of the splint-capture primer may be fully or partially complementary to at least a portion of the linear library molecule along its length. The support may include multiple immobilized splint-capture primers having the same sequence or having two or more different sequences. The splint-capture primer may be of any length, such as 4 to 50 nucleotides, or 50 to 100 nucleotides, or 100 to 150 nucleotides, or longer. The splint-capture primer may include a terminal 3' nucleotide having a sugar 3'OH moiety that can extend for nucleotide polymerization (e.g., polymerase-catalyzed polymerization). The splint-capture primer may include a terminal 3' nucleotide having a portion that blocks polymerase-catalyzed extension. The splint-capture primer may include a terminal 3' nucleotide having a 3' sugar position linked to a chain-terminating portion that inhibits nucleotide polymerization. The 3' chain-terminating portion may be removed (e.g., deblocked) using a deblocking agent to convert the 3' end into an extendable 3' OH end.
[0159] The terms "pinning primer," "immobilized pinning primer," and related terms refer to single-stranded oligonucleotides that are immobilized to a support and contain a sequence capable of hybridizing with at least a portion of a nucleic acid tandem template molecule. Pinning primers can be used to immobilize tandem template molecules to a support via hybridization. Pinning primers can be immobilized to the support in a manner that resists primer removal during flow, washing, aspiration, and changes in temperature, pH, salt, chemical, and / or enzymatic conditions. Typically, but not necessarily, the 5' end of the pinning primer is immobilized to the support. Alternatively, an internal portion or the 3' end of the pinning primer can be immobilized to the support.
[0160] The sequence of the pinning primer (500) may be fully or partially complementary to at least a portion of the tandem template molecule along its length. The support may include multiple pinned primers having the same sequence or having two or more different sequences. The pinning primer may be of any length, such as 4 to 50 nucleotides, or 50 to 100 nucleotides, or 100 to 150 nucleotides or longer. The pinning primer may include a terminal 3' nucleotide having a sugar 3' OH portion that can extend for nucleotide polymerization (e.g., polymerase-catalyzed polymerization). The pinning primer may include a terminal 3' phosphate portion that blocks polymerase-catalyzed extension. The pinning primer may include a terminal 3' nucleotide having a portion that blocks polymerase-catalyzed extension. The pinning primer may include a terminal 3' nucleotide having a 3' sugar position linked to a chain termination portion that inhibits nucleotide polymerization. The 3' chain termination portion may be removed (e.g., deblocked) using a deblocking agent to convert the 3' end to an extendable 3' OH end.
[0161] The clip-capture primer (200) and pinning primer (500) may contain DNA, RNA, or analogues thereof. The capture primer and pinning primer may include a combination of DNA and RNA.
[0162] The 3' end of the clip-capture primer (200) or pinning primer (500) may include a chain-terminating portion. Examples of chain-terminating portions include alkyl groups, alkenyl groups, alkynyl groups, allyl groups, aryl groups, benzyl groups, azide groups, amine groups, amide groups, ketone groups, isocyanate groups, phosphate groups, thio groups, disulfide groups, carbonate groups, urea groups, or silyl groups. Azide-type chain-terminating portions include azide, azido, and azidomethyl groups. Examples of deblocking agents for azide, azido, and azidomethyl chain-terminating groups include phosphine compounds, such as tris(2-carboxyethyl)phosphine (TCEP) and bis-sulfo triphenylphosphine (BS-TPP). Examples of deblocking agents for alkyl, alkenyl, alkynyl, and allyl chain-terminating groups include tetrakis(triphenylphosphine)palladium(0) (Pd(PPh3)4) with piperidine, or with 2,3-dichloro-5,6-dicyano-1,4-benzoquinone (DDQ). Examples of deblocking agents for aryl and benzyl chain-terminating groups include Pd / C. Examples of deblocking agents for amine, amide, ketone, isocyanate, phosphate, thio, and disulfide chain-terminating groups include phosphine, β-mercaptoethanol, or dithiothreitol (DTT). Examples of deblocking agents for carbonate chain-terminating groups include potassium carbonate (K2CO3) in MeOH, triethylamine in pyridine, and Zn in acetic acid (AcOH). Examples of deblocking agents for urea and silyl chain-terminating groups include tetrabutylammonium fluoride, pyridine-HF, ammonium fluoride, and triethylamine trihydrofluoride.
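The pairings of chain-terminating groups and deblocking agents listed in this paragraph can be summarized as a lookup table. The following is a minimal Python sketch that only restates those pairings; the names `DEBLOCKING_AGENTS` and `deblocking_agents_for` are hypothetical and are introduced here for illustration only.

```python
# Hypothetical lookup table restating the pairings listed in the paragraph above.
# Keys are tuples of chain-terminating groups; values are example deblocking agents.
DEBLOCKING_AGENTS = {
    ("azide", "azido", "azidomethyl"): ["TCEP", "BS-TPP"],
    ("alkyl", "alkenyl", "alkynyl", "allyl"): [
        "Pd(PPh3)4 with piperidine",
        "Pd(PPh3)4 with DDQ",
    ],
    ("aryl", "benzyl"): ["Pd/C"],
    ("amine", "amide", "ketone", "isocyanate", "phosphate", "thio", "disulfide"): [
        "phosphine",
        "beta-mercaptoethanol",
        "DTT",
    ],
    ("carbonate",): ["K2CO3 in MeOH", "triethylamine in pyridine", "Zn in AcOH"],
    ("urea", "silyl"): [
        "tetrabutylammonium fluoride",
        "pyridine-HF",
        "ammonium fluoride",
        "triethylamine trihydrofluoride",
    ],
}


def deblocking_agents_for(group: str) -> list[str]:
    """Return the example deblocking agents listed for a chain-terminating group."""
    key = group.strip().lower()
    for groups, agents in DEBLOCKING_AGENTS.items():
        if key in groups:
            return list(agents)  # copy so callers cannot mutate the table
    raise KeyError(f"no deblocking agent listed for group: {group!r}")


print(deblocking_agents_for("azidomethyl"))
```

The table is keyed by group family so that one entry covers every group that shares the same example agents, which mirrors how the paragraph groups them.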
[0163] In some embodiments, a plurality of immobilized capture primers (200) and pinning primers (500) on a support are fluidly connected to each other to allow solutions of reagents (e.g., linear library molecules, or covalently closed circular library molecules, soluble primers, enzymes, nucleotides, divalent cations, buffers, reagents, etc.) to flow onto the support, such that the plurality of immobilized capture primers and pinning primers on the support can react with the reagents substantially simultaneously in a large-scale parallel manner. In some embodiments, the fluid connectivity of the plurality of immobilized capture primers and pinning primers can be used to perform nucleic acid amplification reactions (e.g., RCA, MDA, PCR, and bridge amplification) substantially simultaneously on the plurality of immobilized capture primers and pinning primers.
[0164] In some embodiments, multiple tandem template molecules of nucleic acids immobilized on a support are fluidly connected to each other to allow solutions of reagents (e.g., soluble primers, enzymes, nucleotides, divalent cations, buffers, reagents, etc.) to flow onto the support, such that the multiple tandem template molecules on the support can react with the reagents substantially simultaneously in a large-scale parallel manner. In some embodiments, the fluid connectivity of the multiple tandem template molecules can be used to perform nucleotide binding assays and / or nucleotide polymerization reactions (e.g., primer extension or sequencing) substantially simultaneously on the multiple tandem template molecules, and optionally to perform detection and imaging of large-scale parallel sequencing.
[0165] When used in reference to nucleic acids, the terms "amplify," "amplifying," "amplification," and other related terms include generating multiple copies of an original polynucleotide template molecule, wherein the copies contain sequences complementary to the template sequence or sequences identical to the template sequence. In some embodiments, the copies contain sequences substantially identical to the template sequence or substantially identical to a sequence complementary to the template sequence.
[0166] This disclosure provides various pH buffers. The full names of the pH buffers are listed herein. The term "Tris" refers to the pH buffer tris(hydroxymethyl)-aminomethane. The term "Tris-HCl" refers to the pH buffer tris(hydroxymethyl)-aminomethane hydrochloride. The term "Tricine" refers to the pH buffer N-[tris(hydroxymethyl)methyl]glycine. The term "Bicine" refers to the pH buffer N,N-bis(2-hydroxyethyl)glycine. The term "Bis-Tris propane" refers to the pH buffer 1,3-bis[tris(hydroxymethyl)methylamino]propane. The term "HEPES" refers to the pH buffer 4-(2-hydroxyethyl)-1-piperazine ethanesulfonic acid. The term "MES" refers to the pH buffer 2-(N-morpholino)ethanesulfonic acid. The term "MOPS" refers to the pH buffer 3-(N-morpholino)propanesulfonic acid. The term "MOPSO" refers to the pH buffer 3-(N-morpholino)-2-hydroxypropanesulfonic acid. The term "BES" refers to the pH buffer N,N-bis(2-hydroxyethyl)-2-aminoethanesulfonic acid. The term "TES" refers to the pH buffer 2-[(2-hydroxy-1,1-bis(hydroxymethyl)ethyl)amino]ethanesulfonic acid. The term "CAPS" refers to the pH buffer 3-(cyclohexylamino)-1-propanesulfonic acid. The term "TAPS" refers to the pH buffer N-[tris(hydroxymethyl)methyl]-3-aminopropanesulfonic acid. The term "TAPSO" refers to the pH buffer N-[tris(hydroxymethyl)methyl]-3-amino-2-hydroxypropanesulfonic acid. The term "ACES" refers to the pH buffer N-(2-acetamido)-2-aminoethanesulfonic acid. The term "PIPES" refers to the pH buffer piperazine-1,4-bis(2-ethanesulfonic acid).
[0167] Throughout this application, various publications, patents, and / or patent applications have been cited. The disclosures of these publications, patents, and / or patent applications are hereby incorporated, in their entirety, by reference in order to provide a more comprehensive description of the current state of the art to which this disclosure pertains.
[0168] On-support circularization and RCA
[0169] This disclosure provides a method for generating a plurality of nucleic acid tandem template molecules immobilized to a support, the method comprising the step (a): providing a support having a plurality of clip-capture primers (200) immobilized thereon. In some embodiments, the support further includes a plurality of pinning primers (500) immobilized thereon.
[0170] In some embodiments, in step (a), a plurality of immobilized clip-capture primers (200) hybridize with a portion of a linear library molecule (100) and act as initiation sites for rolling circle amplification to generate a plurality of immobilized nucleic acid tandem template molecules (e.g., see Figures 17A to 17B, 18A to 18B, 19A to 19B, 25A to 25B, 26A to 26B, and 27A to 27B). In some embodiments, a plurality of pinning primers (500) hybridize with a portion of the tandem template molecule and pin the tandem template molecule to the support (e.g., see Figure 37). The clip-capture primers and the pinning primers together generate an immobilized tandem template molecule with a compacted shape and size. In some embodiments, the 3' end of the pinning primer (500) is non-extendable. In some embodiments, the pinning primer (500) includes a 3' blocking group that makes it non-extendable. In some embodiments, the plurality of clip-capture primers (200) and pinning primers (500) can be used for batch-specific sequencing (described below) or for non-batch-specific sequencing.
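As an illustration of why a rolling circle amplification product offers many binding sites for a pinning primer, the following toy Python sketch models the circular template as a string and the product as tandem repeats of its reverse complement. This is a simplified string model under stated assumptions, not the disclosed method: the function names are hypothetical, the primer start position on the circle is ignored, and the sequences are arbitrary.

```python
# Toy string model of rolling circle amplification (RCA); all names are illustrative.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def rolling_circle_product(circle: str, n_copies: int) -> str:
    """Model a tandem template molecule as n_copies repeats of the circle's complement.

    Assumption: the position on the circle where extension starts is ignored,
    so every repeat unit is simply the reverse complement of `circle`.
    """
    if n_copies < 1:
        raise ValueError("n_copies must be at least 1")
    return reverse_complement(circle) * n_copies


def count_binding_sites(tandem: str, primer: str) -> int:
    """Count non-overlapping sites in `tandem` that are fully complementary to `primer`."""
    return tandem.count(reverse_complement(primer))


tandem = rolling_circle_product("AACG", 3)
print(tandem, count_binding_sites(tandem, "AAC"))
```

In this model a primer complementary to one repeat unit finds one site per repeat, which is consistent with the idea that multiple pinning primers can hold a single tandem template molecule to the support.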
[0171] In some embodiments, in step (a), the plurality of clip-on capture primers (200) contain the same sequence. In some embodiments, in step (a), the plurality of clip-on capture primers (200) contain different sequences. In some embodiments, each clip-on capture primer contains a sequence along its length that is completely or partially complementary to at least a portion of a nucleic acid library molecule (e.g., a linear or circular library molecule). In some embodiments, in step (a), each clip-on capture primer contains a sequence complementary to at least a portion of a universal adaptor sequence in the nucleic acid library molecule.
[0172] In some embodiments, in step (a), each of the plurality of clip-on capture primers (200) includes a first portion (210) that binds to a first universal binding site in a linear library molecule, and each clip-on capture primer (200) includes a second portion (220) that binds to a second universal binding site in the same linear library molecule (e.g., Figures 17A to 17B, 18A to 18B, 19A to 19B, 25A to 25B, 26A to 26B, and 27A to 27B). In some embodiments, the first and second portions of the clip-capture primer (e.g., (210) and (220)) have the same or different lengths. The first portion (210) of the clip-capture primer may be about 4 to 50 nucleotides, or 50 to 100 nucleotides, or 100 to 150 nucleotides, or longer. The second portion (220) of the clip-capture primer may be about 4 to 50 nucleotides, or 50 to 100 nucleotides, or 100 to 150 nucleotides, or longer. In some embodiments, the first and second portions of the clip-capture primer (e.g., (210) and (220)) have the same sequence. In some embodiments, the first and second portions of the immobilized clip-capture primer (e.g., (210) and (220)) have different sequences.
[0173] In some embodiments, in step (a), the plurality of clip-on capture primers (200) immobilized to the support comprise one type of clip-on capture primer having the same sequence. For example, each immobilized clip-on capture primer (200) comprises a first portion (210) and a second portion (220), wherein the first portion (210) of the clip-on capture primer binds to a first universal binding site (120) in the linear library molecule and the second portion (220) of the clip-on capture primer binds to a second universal binding site (130) in the same linear library molecule (e.g., Figures 25A to 25B and 26A to 26B).
[0174] In some embodiments, in step (a), the plurality of clip-on capture primers (200) immobilized to the support comprise a mixture of different types of clip-on capture primers, comprising at least a first subgroup and a second subgroup of clip-on capture primers with different sequences, wherein the different types of clip-on capture primers bind to different types of linear library molecules (e.g., Figures 27A to 27B). In some embodiments, each of the immobilized clip-on capture primers in the first subgroup (200-A) comprises a first portion (210-A) and a second portion (220-A), wherein the first portion (210-A) of the clip-on capture primer binds to a first universal binding site (120-A) in the first linear library molecule and the second portion (220-A) of the clip-on capture primer binds to a second universal binding site (130-A) in the same linear library molecule (e.g., Figure 27A, left). In some embodiments, each of the immobilized clip-on capture primers in the second subgroup (200-B) comprises a first portion (210-B) and a second portion (220-B), wherein the first portion (210-B) of the clip-on capture primer binds to a first universal binding site (120-B) in the second linear library molecule, and the second portion (220-B) of the clip-on capture primer binds to a second universal binding site (130-B) in the same linear library molecule (e.g., Figure 27A, right).
[0175] In some embodiments of step (a), the immobilized clip-on capture primer comprises a single-stranded oligonucleotide, which includes DNA, RNA, or a combination of DNA and RNA. The immobilized clip-on capture primer can be of any length, such as 4 to 50 nucleotides, or 50 to 100 nucleotides, or 100 to 150 nucleotides or longer.
[0176] In some embodiments of step (a), each clip-capture primer includes a terminal 3' extendable end. In some embodiments, each clip-capture primer includes a terminal 3' nucleotide having a sugar 3' OH portion that is extendable for nucleotide polymerization (e.g., polymerase-catalyzed nucleotide polymerization). In some embodiments, each clip-capture primer includes a 3' non-extendable end having a blocking portion that can be removed to generate a 3' OH portion.
[0177] In some embodiments of step (a), the individual clip-capture primers lack nucleotides having a cleavable portion. In some embodiments, the individual clip-capture primers lack nucleotides having a cleavable portion that can be cleaved to generate an abasic site in the clip-capture primer. For example, the clip-capture primers lack uridine, 8-oxo-7,8-dihydroguanine (e.g., 8oxoG), and deoxyinosine.
[0178] In some embodiments of step (a), each clip-capture primer includes at least one nucleotide having a cleavable portion that can be cleaved to generate an abasic site in the clip-capture primer. In some embodiments, the at least one nucleotide having a cleavable portion comprises uridine, 8-oxo-7,8-dihydroguanine (e.g., 8oxoG), or deoxyinosine. In some embodiments, the at least one nucleotide having a cleavable portion comprises a uracil base.
[0179] In some embodiments of step (a), the individual clip-capture primers lack inosine. In some embodiments of step (a), the individual clip-capture primers include at least one inosine at any position. In some embodiments, the inosine bases in the clip-capture primers may hybridize with adenine, cytosine, or uracil in the linear library molecule.
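The pairing rule stated above for inosine can be sketched as a small base-pairing check. The following Python sketch is illustrative only: the table `PAIRS` and the function `hybridizes` are hypothetical names, inosine is written as `I`, and the inosine partners are limited to the three bases named in this paragraph (adenine, cytosine, and uracil).

```python
# Hypothetical pairing table. Standard Watson-Crick pairs plus the inosine
# partners named in the paragraph above (A, C, or U); "I" denotes inosine.
PAIRS = {
    "A": {"T", "U"},
    "T": {"A"},
    "U": {"A"},
    "G": {"C"},
    "C": {"G"},
    "I": {"A", "C", "U"},
}


def hybridizes(primer: str, target_site: str) -> bool:
    """Return True if every primer base can pair with the antiparallel target base.

    Both sequences are written 5'->3', so the primer is compared against the
    reversed target site. Only full-length, gap-free pairing is modeled.
    """
    if len(primer) != len(target_site):
        return False
    return all(
        t in PAIRS.get(p, set())
        for p, t in zip(primer.upper(), reversed(target_site.upper()))
    )


print(hybridizes("GIT", "ACC"))  # inosine opposite cytosine
print(hybridizes("GIT", "AGC"))  # inosine opposite guanine, not in the table
```

A single inosine-containing primer can therefore match several target variants at the inosine position, which is the practical reason for placing inosine in a capture primer.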
[0180] In some embodiments of step (a), each clip-capture primer includes at least one nucleotide having a cleavable portion containing a restriction enzyme recognition sequence that can be cleaved by a restriction enzyme, including, for example, type I, type II, type IIs, type IIB, type III and / or type IV restriction enzymes.
[0181] In some embodiments, the cleavable portion may be located in the first portion (210) of the clip-capture primer. In some embodiments, the cleavable portion may be located in the second portion (220) of the clip-capture primer.
[0182] In any of the embodiments of step (a), a plurality of clip-capture primers may be immobilized to a support or to a coating on the support. The immobilized clip-capture primers may be embedded in and attached (coupled) to a coating on the support. In some embodiments, the 5' end of the clip-capture primer is immobilized to a support or to a coating on the support. Alternatively, an internal portion or the 3' end of the clip-capture primer may be immobilized to a support or to a coating on the support.
[0183] In any of the embodiments of step (a), each clip-capture primer contains at least one phosphorothioate diester bond at its 5' end, which makes the clip-capture primer resistant to exonuclease degradation. In some embodiments, each clip-capture primer contains 2 to 5 or more consecutive phosphorothioate diester bonds at its 5' end. In some embodiments, each clip-capture primer contains at least one ribonucleotide and / or at least one 2'-O-methyl or 2'-O-methoxyethyl (MOE) nucleotide, which makes the clip-capture primer resistant to exonuclease degradation.
[0184] In any of the embodiments of step (a), each clip-capture primer contains at least one locked nucleic acid (LNA) comprising a methylene bridge between the 2' oxygen and 4' carbon of the pentose ring. In some embodiments, up to five nucleotides at or near the terminal 5' end contain the locked nucleic acid (LNA). Immobilized clip-capture primers containing at least one LNA are resistant to nuclease digestion and can exhibit increased melting temperatures when hybridized with the forward extension strand.
[0185] In some embodiments, in step (a), the plurality of pinning primers (500) immobilized to the support comprise one type of pinning primer having the same sequence. For example, each immobilized pinning primer (500) binds to at least a portion of a nucleic acid tandem template molecule (e.g., Figure 37).
[0186] In some embodiments, in step (a), the plurality of pinning primers (500) immobilized to the support comprises a mixture of different types of pinning primers, including at least a first subgroup and a second subgroup of pinning primers with different sequences, wherein the different types of pinning primers bind to different types of nucleic acid tandem template molecules generated from different types of linear library molecules.
[0187] In some embodiments, in step (a), each of the fixed pinning primers (500) in the first subgroup (500-A) binds to at least a portion of the universal adaptor sequence in the first nucleic acid tandem template molecule.
[0188] In some embodiments, in step (a), each of the fixed pinning primers (500) in the second subgroup (500-B) binds to at least a portion of the universal adaptor sequence in the second nucleic acid tandem template molecule.
[0189] In some embodiments of step (a), the plurality of immobilized pinning primers (500) comprise single-stranded oligonucleotides, which include DNA, RNA, or a combination of DNA and RNA. The immobilized pinning primers can be of any length, such as 4 to 50 nucleotides, or 50 to 100 nucleotides, or 100 to 150 nucleotides or longer.
[0190] In any of the embodiments of step (a), the plurality of immobilized pinning primers (500) comprise sequences that are fully or partially complementary, along their length, to at least a portion of the nucleic acid tandem template molecule (e.g., Figure 37). In some embodiments, the pinning primer comprises a sequence complementary to at least a portion of a universal adaptor sequence in the nucleic acid tandem template molecule. In some embodiments, the sequence of the pinning primer (500) differs from the sequence of the clip-capture primer (200).
[0191] In some embodiments of step (a), each pinning primer (500) includes a non-extendable 3' end. In some embodiments, each pinning primer includes a 3' blocking portion that inhibits polymerase-catalyzed nucleotide polymerization. In some embodiments, the 3' blocking portion includes a phosphate group, a dideoxycytidine group, an inverted dT group, or an amino group. In some embodiments, the pinning primer is non-extendable in the primer extension reaction. In some embodiments, the 3' end of the pinning primer includes an extendable OH portion.
[0192] In some embodiments of step (a), each pinning primer (500) lacks a nucleotide having a cleavable portion that can be cleaved to generate an abasic site in the pinning primer. In some embodiments, each pinning primer lacks a nucleotide having a cleavable portion. For example, the pinning primer lacks uridine, 8-oxo-7,8-dihydroguanine (e.g., 8oxoG), and deoxyinosine. In some embodiments, each pinning primer includes a nucleotide having a cleavable portion that can be cleaved to generate an abasic site in the pinning primer. In some embodiments, each pinning primer includes a restriction enzyme recognition sequence that can be cleaved by a restriction enzyme, including, for example, type I, type II, type IIs, type IIB, type III, and / or type IV restriction enzymes.
[0193] In some embodiments of step (a), a plurality of pinning primers (500) may be immobilized to a support or to a coating on the support. The immobilized pinning primers may be embedded in and attached (coupled) to a coating on the support. In some embodiments, the 5' end of the pinning primer is immobilized to a support or to a coating on the support. Alternatively, an internal portion or the 3' end of the pinning primer may be immobilized to a support or to a coating on the support.
[0194] In some embodiments of step (a), each pinning primer (500) contains at least one phosphorothioate diester bond at its 5' end, which makes the pinning primer resistant to exonuclease degradation. In some embodiments, each pinning primer contains 2 to 5 or more consecutive phosphorothioate diester bonds at its 5' end. In some embodiments, each pinning primer contains at least one ribonucleotide and / or at least one 2'-O-methyl or 2'-O-methoxyethyl (MOE) nucleotide, which makes the pinning primer resistant to exonuclease degradation.
[0195] In some embodiments of step (a), each pinning primer (500) comprises at least one locked nucleic acid (LNA) containing a methylene bridge between the 2' oxygen and 4' carbon of the pentose ring. Pinning primers comprising at least one LNA are resistant to nuclease digestion and can exhibit increased melting temperatures when hybridized with the tandem template molecule.
[0196] In some embodiments of step (a), the support comprises approximately 10² to 10¹⁵ immobilized clip-capture primers (200) per mm². In some embodiments, the support comprises approximately 10² to 10¹⁵ immobilized pinning primers (500) per mm². In some embodiments, the support comprises approximately 10² to 10¹⁵ immobilized clip-capture primers and immobilized pinning primers per mm².
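To give a feel for the primer density range stated above (approximately 10² to 10¹⁵ per mm²), the arithmetic can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical, the 2.5 mm² area is an arbitrary example value, and the spacing estimate assumes primers laid out on a uniform square grid, which the disclosure does not state.

```python
import math

# Density range stated in the paragraph above, in primers per mm^2.
MIN_DENSITY = 1e2
MAX_DENSITY = 1e15


def primers_on_area(density_per_mm2: float, area_mm2: float) -> float:
    """Number of immobilized primers on a region of the given area."""
    if not MIN_DENSITY <= density_per_mm2 <= MAX_DENSITY:
        raise ValueError("density outside the stated range")
    return density_per_mm2 * area_mm2


def mean_spacing_um(density_per_mm2: float) -> float:
    """Centre-to-centre spacing in micrometres, assuming a uniform square grid."""
    if not MIN_DENSITY <= density_per_mm2 <= MAX_DENSITY:
        raise ValueError("density outside the stated range")
    # 1 mm = 1000 um; on a square grid each primer occupies 1/density mm^2.
    return 1000.0 / math.sqrt(density_per_mm2)


print(primers_on_area(1e4, 2.5))   # primers on a hypothetical 2.5 mm^2 region
print(mean_spacing_um(1e2))        # spacing at the low end of the range
```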
[0197] In some embodiments of step (a), the fixed clip-capture primers (200) and pinning primers (500) are fluidly connected to each other to allow various solutions of linear or circular nucleic acid template molecules, soluble primers, enzymes, nucleotides, divalent cations, buffers, reagents, etc., to flow onto the support, such that multiple fixed clip-capture primers and pinning primers (and any primer extension products generated from the fixed clip-capture primers) react with the solution in a large-scale parallel manner.
[0198] In some embodiments, the method for generating a plurality of nucleic acid tandem template molecules immobilized to a support further includes step (b): providing a plurality of linear nucleic acid library molecules, comprising at least a first subgroup and a second subgroup of linear library molecules, wherein each linear library molecule in the plurality contains a 5' end and a 3' end, and wherein each linear library molecule in the plurality contains, in any order, a target sequence and any one adaptor sequence or any combination of two or more adaptor sequences, wherein the adaptor sequences comprise: (i) a first universal binding site (120) (or its complementary sequence) for a first portion of a clip-capture primer; (ii) a universal binding site (123) (or its complementary sequence) for a first non-clip-capture primer; (iii) at least one sample index sequence (e.g., (160) and / or (170)), which can be used to distinguish target sequences obtained from different sample sources in multiplex assays; (iv) at least one universal binding site (140) (or its complementary sequence) for a forward sequencing primer; (v) at least one universal binding site (150) (or its complementary sequence) for a reverse sequencing primer; (vi) at least one universal binding site (or its complementary sequence) for a compaction oligonucleotide; (vii) at least one unique molecular index sequence (UMI) (e.g., a left unique molecular index sequence (180) and / or a right unique molecular index sequence (190)), which can be used to uniquely identify the nucleic acid molecule (e.g., having a target sequence) to which the unique molecular index sequence is attached; (viii) at least one universal binding site (or its complementary sequence) for a pinning primer; (ix) at least one batch-specific barcode sequence; (x) a universal binding site (133) (or its complementary sequence) for a second non-clip-capture primer; (xi) at least one short random sequence (132) (e.g., NNNN), which provides nucleotide sequence diversity and is about 3 to 20 nucleotides in length; and / or (xii) a second universal binding site (130) (or its complementary sequence) for a second portion of the immobilized clip-capture primer (e.g., see Figures 17A to 17B, 18A to 18B, 19A to 19B, 25A to 25B, 26A to 26B, and 27A to 27B). In some embodiments, the plurality of linear library molecules comprises single-stranded linear library molecules. In some embodiments, the plurality of linear library molecules comprises double-stranded linear library molecules. In some embodiments, the plurality of linear library molecules comprises a mixture of single-stranded and double-stranded linear library molecules.
[0199] In some embodiments, in step (b), each linear library molecule in the first subgroup contains a target sequence. In some embodiments, the target sequences in the linear library molecules of the first subgroup have the same sequence. In some embodiments, the target sequences in the linear library molecules of the first subgroup have different sequences.
[0200] In some embodiments, in step (b), each linear library molecule in the second subgroup contains a target sequence. In some embodiments, the target sequences in the linear library molecules of the second subgroup have the same sequence. In some embodiments, the target sequences in the linear library molecules of the second subgroup have different sequences. In some embodiments, the linear library molecules of the first subgroup and the second subgroup have the same target sequence. In some embodiments, the linear library molecules of the first subgroup and the second subgroup have different target sequences.
[0201] In some embodiments, in step (b), each linear library molecule in the first and second subgroups contains a universal binding site (140) for forward sequencing primers, which includes a batch-specific forward sequencing primer binding site that can be used for forward batch sequencing.
[0202] In some embodiments, in step (b), each linear library molecule in the first and second subgroups contains a universal binding site (150) for reverse sequencing primers, which includes batch-specific reverse sequencing primer binding sites that can be used for reverse batch sequencing.
[0203] In some embodiments, in step (b), each linear library molecule in the first and second subgroups contains at least one sample index sequence (e.g., a left sample index sequence (160) and / or a right sample index sequence (170)). In some embodiments, at least one sample index sequence is conjugated with an optional short random sequence (e.g., NNN), wherein the short random sequence provides nucleotide sequence diversity. In some embodiments, the short random sequence is about 3 to 20 nucleotides in length.
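The contribution of a short random sequence to nucleotide diversity can be illustrated numerically: a run of n random positions has 4ⁿ possible sequences. The following Python sketch is a hedged illustration; the function names are hypothetical, the length check uses the "about 3 to 20 nucleotides" range from the paragraph above as a hard limit for simplicity, and uniform base frequencies are an assumption.

```python
import random

BASES = "ACGT"


def random_diversity_sequence(length: int, rng: random.Random) -> str:
    """Draw a short random sequence (e.g., NNN) with uniform base frequencies."""
    if not 3 <= length <= 20:
        raise ValueError("length should be about 3 to 20 nucleotides")
    return "".join(rng.choice(BASES) for _ in range(length))


def possible_sequences(length: int) -> int:
    """Number of distinct sequences of the given length over four bases."""
    return 4 ** length


rng = random.Random(0)  # seeded only so the sketch is repeatable
print(random_diversity_sequence(4, rng), possible_sequences(4))
```

Passing the random number generator in as an argument keeps the function deterministic under test while leaving the caller free to use an unseeded generator.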
[0204] In some embodiments, the linear library molecules provided in step (b) contain a non-phosphorylated or phosphorylated 5' end. In some embodiments, the non-phosphorylated 5' end can be treated with a polynucleotide kinase (e.g., T4 PNK) to generate a linkable 5' phosphorylated end.
[0205] In some embodiments, the plurality of nucleic acid linear library molecules provided in step (b) contain Figures 20 to 24 Any of the linear library molecules shown. Those skilled in the art will recognize that linear library molecules with connective sequences constructed in other arrangements are possible.
[0206] In some embodiments, in step (b), each linear library molecule in the first and second subgroups contains a universal binding site (or its complementary sequence) of the same type for binding a first portion of the clip-capture primer, and a universal binding site (or its complementary sequence) of the same type for binding a second portion of the clip-capture primer. In some embodiments, each linear library molecule in the first and second subgroups contains a first universal binding site (120) (or its complementary sequence) for binding a first portion (210) of the clip-capture primer (210) and a second universal binding site (130) (or its complementary sequence) for binding a second portion (220) of the immobilized clip-capture primer. In some embodiments, the linear library molecules in the first and second subgroups contain the same first universal binding site (120) and the same second universal binding site (130) (e.g., Figures 25-26 ).
[0207] In some embodiments, in step (b), each linear library molecule in the first and second subgroups includes different types of universal binding sites (or their complementary sequences) for binding a first portion of the clip-capture primer, and different types of universal binding sites (or their complementary sequences) for binding a second portion of the clip-capture primer. In some embodiments, each linear library molecule in the first subgroup of the linear library includes a universal binding site (120-A) (or its complementary sequence) for binding the first portion (210-A) of the clip-capture primer and a universal binding site (130-A) (or its complementary sequence) for binding the second portion (220-A) of the immobilized clip-capture primer. In some embodiments, each linear library molecule in the second subgroup of the linear library includes a universal binding site (120-B) (or its complementary sequence) for binding the first portion (210-B) of the clip-capture primer and a universal binding site (130-B) (or its complementary sequence) for binding the second portion (220-B) of the immobilized clip-capture primer. In some embodiments, the universal binding sites (120-A) and (120-B) have different sequences. In some embodiments, the universal binding sites (130-A) and (130-B) have different sequences (e.g., Figure 27A (Left and right). In some embodiments, the clip-capture primers (220-A and 220-B) that hybridize the first and second subgroups of the linear library molecules contain different sequences.
[0208] In some embodiments, the method for generating a plurality of nucleic acid tandem molecules immobilized to a support further includes step (c): contacting a plurality of clip-capture primers (200) immobilized on the support with a plurality of linear library molecules (100), wherein the contact is performed under conditions suitable for hybridizing each linear library molecule with each immobilized clip-capture primer (200) to form each open-loop library molecule (300), each open-loop library molecule having at least a portion of a first terminal region of the respective linear library molecule that hybridizes with a first portion (210) of the clip-capture primer and having at least a portion of a second terminal region of the same linear library molecule that hybridizes with a second portion (220) of the same clip-capture primer, wherein each open-loop library molecule has a gap or cut between the 5' end and the 3' end of the open-loop library molecule (e.g., Figure 17 (A, 18A, 19A, 25A, 26A, 27A left and 27A right). In some embodiments, a first universal binding site (120) of a first portion of a clip-capture primer for a given linear library molecule hybridizes with a first portion (210) of a clip-capture primer, and a second universal binding site (130) of a second portion of a fixed clip-capture primer for the same given library molecule hybridizes with a second portion (220) of the same clip-capture primer. In some embodiments, the contact of step (c) includes dispensing a plurality of single-stranded nucleic acid linear library molecules onto a support having a plurality of fixed clip-capture primers (200) and pinning primers (500). In some embodiments, the contact of step (c) includes dispensing one type of single-stranded nucleic acid linear library molecule onto a support having a plurality of fixed clip-capture primers (200) and pinning primers (500). 
In some embodiments, the contacting of step (c) includes dispensing a mixture of at least two different types of single-stranded nucleic acid linear library molecules onto a support having a plurality of immobilized clip-capture primers and pinning primers, wherein the at least two types comprise at least a first subgroup and a second subgroup of linear library molecules, and wherein the support comprises the first and second subgroups of immobilized clip-capture primers. In some embodiments, a first universal binding site (120) in the linear library molecules, which targets a first portion of the immobilized clip-capture primers, may hybridize with the first portion (210) of the immobilized clip-capture primers. In some embodiments, a second universal binding site (130) in the linear library molecules, which targets a second portion of the immobilized clip-capture primers, may hybridize with the second portion (220) of the immobilized clip-capture primers. In some embodiments, the immobilized clip-capture primers comprise a first portion (210) and a second portion (220) that hybridize to adaptor sequences (e.g., (120) and (130)) in a linear library molecule (100), and the clip-capture primers act as nucleic acid clip molecules, for example for circularizing the linear library molecule (e.g., Figures 17A, 18A, 19A, 25A, 26A, 27A (left), and 27A (right)).
[0209] In some embodiments, step (c) includes: contacting a first subgroup (200-A) of immobilized clip-capture primers with a first subgroup (100-A) of linear library molecules, wherein the contacting is performed under conditions suitable for hybridizing individual linear library molecules in the first subgroup with individual immobilized clip-capture primers in the first subgroup to form open-loop library molecules, each open-loop library molecule having a first terminal region of a given linear library molecule hybridized with a first portion (210-A) of a clip-capture primer and a second terminal region of the same linear library molecule hybridized with a second portion (220-A) of the same clip-capture primer, wherein each open-loop library molecule in the first subgroup has a gap or nick between its 5' end and its 3' end (e.g., Figure 27A, left). In some embodiments, the contacting of step (c) includes dispensing a first subgroup of single-stranded nucleic acid linear library molecules onto a support having a mixture of first and second subgroups of immobilized clip-capture primers. In some embodiments, the contacting of step (c) includes dispensing a first subgroup of single-stranded nucleic acid linear library molecules onto a support having a plurality of pinning primers (500). In some embodiments, the immobilized clip-capture primer (200-A) comprises a first portion (210-A) and a second portion (220-A) that hybridize with the adaptor sequences (120-A) and (130-A) in the linear library molecules of the first subgroup, and the clip-capture primer (200-A) acts as a nucleic acid clip molecule for circularizing the linear library molecules of the first subgroup (e.g., Figure 27A, left). In some embodiments, each linear library molecule of the first subgroup includes a universal binding sequence (120-A) capable of hybridizing with the first portion (210-A) of each immobilized clip-capture primer in the first subgroup.
In some embodiments, each linear library molecule of the first subgroup includes a universal binding sequence (130-A) capable of hybridizing with the second portion (220-A) of each immobilized clip-capture primer in the first subgroup.
[0210] In some embodiments, step (c) includes: contacting a second subgroup (200-B) of immobilized clip-capture primers with a second subgroup (100-B) of linear library molecules, wherein the contacting is performed under conditions suitable for hybridizing individual linear library molecules in the second subgroup with individual immobilized clip-capture primers in the second subgroup to form open-loop library molecules, each open-loop library molecule having a first terminal region of a given linear library molecule hybridized with a first portion (210-B) of a clip-capture primer and a second terminal region of the same linear library molecule hybridized with a second portion (220-B) of the same clip-capture primer, wherein each open-loop library molecule in the second subgroup has a gap or nick between its 5' end and its 3' end (e.g., Figure 27A, right). In some embodiments, the contacting of step (c) includes dispensing a second subgroup of single-stranded nucleic acid linear library molecules onto a support having a mixture of first and second subgroups of immobilized clip-capture primers. In some embodiments, the contacting of step (c) includes dispensing a second subgroup of single-stranded nucleic acid linear library molecules onto a support having a plurality of pinning primers (500). In some embodiments, the immobilized clip-capture primer (200-B) comprises a first portion (210-B) and a second portion (220-B) that hybridize with the adaptor sequences (120-B) and (130-B) in the linear library molecules of the second subgroup, and the clip-capture primer (200-B) acts as a nucleic acid clip molecule for circularizing the linear library molecules of the second subgroup (e.g., Figure 27A, right). In some embodiments, each linear library molecule of the second subgroup includes a universal binding sequence (120-B) capable of hybridizing with the first portion (210-B) of each immobilized clip-capture primer in the second subgroup.
In some embodiments, each linear library molecule of the second subgroup includes a universal binding sequence (130-B) capable of hybridizing with the second portion (220-B) of each immobilized clip-capture primer in the second subgroup.
[0211] In some embodiments of step (c), the location of the gap or nick in the open-loop library molecule can be asymmetric or symmetric relative to the duplex formed by hybridizing the 5' and 3' ends of the linear library molecule with the immobilized clip-capture primer (200). For example, Figure 17A shows an asymmetrically positioned nick or gap. Figure 18A shows an asymmetrically positioned nick or gap. Figure 19A shows a symmetrically positioned nick or gap. Asymmetrically or symmetrically positioned gaps or nicks can be generated by adjusting the lengths of the first portion (210) and the second portion (220) of the clip-capture primer. In some embodiments, the length of the first portion (210) can be increased and the length of the second portion (220) can be decreased to increase the percentage of linear library molecules that hybridize with the clip-capture primer (200). In some embodiments, the length of the first portion (210) can be decreased and the length of the second portion (220) can be increased to increase the percentage of linear library molecules that hybridize with the clip-capture primer (200).
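The relationship between the portion lengths and the position of the nick can be illustrated with a minimal sketch. It assumes, for illustration only, that the two terminal regions hybridize over the full lengths of the first portion (210) and the second portion (220), so that the nick falls at the junction between the two portions. The function names are hypothetical and are not part of the disclosure.

```python
def nick_position(first_portion_len: int, second_portion_len: int) -> str:
    """Classify the nick as symmetric or asymmetric within the duplex.

    Assumption: the nick sits at the junction between the first portion (210)
    and the second portion (220) of the clip-capture primer.
    """
    if first_portion_len <= 0 or second_portion_len <= 0:
        raise ValueError("portion lengths must be positive")
    if first_portion_len == second_portion_len:
        return "symmetric"
    return "asymmetric"


def first_portion_fraction(first_portion_len: int, second_portion_len: int) -> float:
    """Fraction of the duplex length contributed by the first portion (210)."""
    if first_portion_len <= 0 or second_portion_len <= 0:
        raise ValueError("portion lengths must be positive")
    return first_portion_len / (first_portion_len + second_portion_len)


if __name__ == "__main__":
    # Hypothetical lengths in nucleotides, chosen only for illustration.
    print(nick_position(20, 20))
    print(nick_position(30, 10), first_portion_fraction(30, 10))
```

Lengthening one portion while shortening the other keeps the overall duplex length constant in this sketch and only shifts where the nick falls.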
[0212] In some embodiments of step (c), when the plurality of linear library molecules comprises double-stranded linear library molecules, the hybridization conditions of step (c) are suitable for denaturing the double-stranded linear library molecules into single-stranded linear library molecules that can hybridize with the clip-capture primer (200). For example, hybridization can be performed at temperatures of about 35°C to 40°C, or about 40°C to 45°C, or about 45°C to 50°C, or about 50°C to 55°C. In some embodiments, the hybridization in step (c) can be performed using a hybridization reagent comprising 3X SSC, formamide, and / or a chaotropic agent. In some embodiments, the chaotropic agent can disrupt non-covalent bonds, such as hydrogen bonds or van der Waals forces. In some embodiments, the chaotropic agent comprises SDS (sodium dodecyl sulfate), urea, thiourea, guanidinium chloride, guanidine hydrochloride, guanidine thiocyanate, guanidine isothiocyanate, potassium thiocyanate, lithium chloride, sodium iodide, or sodium perchlorate.
[0213] In some embodiments of step (c), the amount of the plurality of linear library molecules (100) contacted with the plurality of immobilized clip-capture primers (200) can be adjusted to achieve a density of immobilized tandem template molecules of about 10² / mm² to 10¹⁵ / mm², wherein the immobilized tandem template molecules can be generated in step (e) (described below) by performing a rolling circle amplification reaction. In some embodiments, the amount of the plurality of linear library molecules (100) contacted with the plurality of immobilized clip-capture primers (200) may be about 0.1 to 1 pM, or about 1 to 5 pM, or about 5 to 10 pM, or about 10 to 20 pM, or about 20 to 30 pM, or about 30 to 40 pM, or about 40 to 50 pM.
[0214] In some embodiments, the method for generating a plurality of nucleic acid tandem molecules immobilized to a support further includes step (d): enzymatically closing nicks or gaps in a plurality of open-loop library molecules, thereby generating a plurality of covalently closed circular library molecules (400), wherein each single-stranded covalently closed circular library molecule hybridizes with an immobilized clip-capture primer (e.g., Figures 17B, 18B, 19B, 25B, 26B, 27B (left), and 27B (right)).
[0215] In some embodiments, step (d) includes: enzymatically closing nicks or gaps in a plurality of open-loop library molecules of the first subgroup, thereby generating a first subgroup of covalently closed circular library molecules, wherein each single-stranded covalently closed circular library molecule in the first subgroup hybridizes with an immobilized clip-capture primer of the first subgroup (e.g., Figure 27B, left).
[0216] In some embodiments, step (d) includes: enzymatically closing nicks or gaps in a plurality of open-loop library molecules of the second subgroup, thereby generating a second subgroup of covalently closed circular library molecules, wherein each single-stranded covalently closed circular library molecule in the second subgroup hybridizes with an immobilized clip-capture primer of the second subgroup (e.g., Figure 27B, right).
[0217] In any of the embodiments of step (d), the nicks in the individual open-loop library molecules can be closed by performing a ligase-catalyzed ligation reaction to form single-stranded covalently closed circular molecules, wherein each covalently closed circular molecule hybridizes with an immobilized clip-capture primer. In some embodiments, the ligation reaction can be performed using a ligase. In some embodiments, the ligase includes phage DNA ligases, including T3 DNA ligase (e.g., Figure 60), T4 DNA ligase (e.g., Figure 61), or T7 DNA ligase (e.g., Figure 62). In some embodiments, the ligase comprises a thermostable DNA ligase, including Taq DNA ligase, Tfu DNA ligase (e.g., Figure 63), or DNA ligases from Thermococcus nautilus (e.g., Figure 64). In some embodiments, the ligase comprises a recombinant thermostable T4 DNA ligase (e.g., Hi-T4 DNA ligase from New England Biolabs, catalog number M2622S).
[0218] In some embodiments of step (d), the gaps in the individual open-loop library molecules can be closed by performing a polymerase-catalyzed gap-filling reaction that uses the extendable 3' end of the library molecule as the initiation site and the immobilized clip-capture primer as the template molecule, thereby forming circular molecules that each retain a nick. The nicks can then be closed by performing an enzymatic ligation reaction to form single-stranded covalently closed circular library molecules, wherein each covalently closed circular library molecule hybridizes with an immobilized clip-capture primer. In some embodiments, the gap-filling reaction can be performed using a plurality of nucleotides and a polymerase with 5' to 3' strand displacement activity. The polymerase includes: E. coli DNA polymerase I, a Klenow fragment of E. coli DNA polymerase I, T7 DNA polymerase, or T4 DNA polymerase. In some embodiments, after the gap-filling reaction, the nicks can be closed by performing a ligation reaction using a ligase. In some embodiments, the ligase comprises a bacteriophage DNA ligase, including T3, T4, or T7 DNA ligases. In some embodiments, the ligase comprises a thermostable DNA ligase, including Taq DNA ligase, Tfu DNA ligase, or a DNA ligase derived from Thermococcus nautilus. In some embodiments, the ligase comprises a recombinant thermostable T4 DNA ligase (e.g., Hi-T4 DNA ligase from New England Biolabs, catalog number M2622S).
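The two closing routes of step (d), direct ligation of a nick and gap filling followed by ligation, can be summarized in a short sketch. It is an illustrative model only: the function name, the step labels, and the convention that a gap length of zero denotes a nick are assumptions made for this example.

```python
from typing import List


def closing_steps(gap_length_nt: int) -> List[str]:
    """Return the ordered enzymatic steps needed to close an open-loop molecule.

    Assumption: gap_length_nt is the number of missing nucleotides between the
    3' end and the 5' end of the open-loop library molecule; 0 means a nick.
    """
    if gap_length_nt < 0:
        raise ValueError("gap length cannot be negative")
    if gap_length_nt == 0:
        # A nick has adjacent 3' and 5' ends and can be sealed directly.
        return ["ligase-catalyzed ligation"]
    # A gap is first filled from the extendable 3' end using the clip-capture
    # primer as the template, which leaves a nick that is then ligated.
    return ["polymerase-catalyzed gap filling", "ligase-catalyzed ligation"]


if __name__ == "__main__":
    print(closing_steps(0))
    print(closing_steps(5))
```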
[0219] In some embodiments of step (d), the ligation reaction can be carried out by contacting, at least once, a plurality of open-loop library molecules (e.g., with nicks) with a ligation reaction mixture comprising one or more DNA ligases, a pH buffer, and ATP. In some embodiments, the ligation reaction mixture comprises one or more DNA ligases, a pH buffer, ATP, and multiple nucleotides, and the ligation reaction mixture either lacks or includes a strand displacement polymerase. In any of the embodiments of step (d), the ligation reaction can be carried out using a ligation reaction mixture comprising at least one DNA ligase and a strand displacement polymerase. In some embodiments, the ligation reaction mixture further comprises any combination of magnesium ions, a reducing agent, a detergent, a crowding agent, an amino acid, a phosphine compound, ammonium ions, a salt, a viscosity agent, multiple nucleotides, and / or a strand displacement polymerase. In some embodiments, a plurality of open-loop library molecules (e.g., with nicks) can be contacted with the ligation reaction mixture at least two, at least three, at least four, or up to ten times.
[0220] In some embodiments of step (d), the ligation reaction can be carried out at temperatures where the ligase exhibits activity (e.g., at about 15°C to 20°C, or about 20°C to 30°C, or about 30°C to 40°C, or about 40°C to 50°C).
[0221] In some embodiments of step (d), the ligase in the ligation reaction mixture comprises a bacteriophage DNA ligase, such as a T3, T4, or T7 DNA ligase. In some embodiments, the ligase in the ligation reaction mixture comprises a thermostable DNA ligase, including Taq DNA ligase, Tfu DNA ligase, or a DNA ligase derived from Thermococcus nautilus. In some embodiments, the ligase in the ligation reaction mixture comprises a recombinant thermostable T4 DNA ligase (e.g., Hi-T4 DNA ligase from New England Biolabs, catalog number M2622S).
[0222] In some embodiments of step (d), the ligation reaction may be performed using a ligation reaction mixture comprising T3 phage DNA ligase (e.g., NCBI No. 523305.1), T4 phage DNA ligase (e.g., NCBI No. 049813.1), T7 phage DNA ligase (e.g., NCBI No. 041963.1), thermostable Taq DNA ligase (e.g., from New England Biolabs, catalog No. M0208S), thermostable Tfu DNA ligase from *Thermococcus fumicolans* (e.g., UniProtKB / Swiss-Prot No. Q9HH07.1), thermostable DNA ligase from *Thermococcus nautilus* (e.g., NCBI No. WP_042693257.1), and / or engineered thermostable T4 DNA ligase Hi-T4 ligase (from New England Biolabs, catalog No. M2622S).
[0223] In some embodiments of step (d), the pH buffer in the ligation reaction mixture includes Tris (e.g., tris(hydroxymethyl)aminomethane), Tris-HCl (e.g., tris(hydroxymethyl)aminomethane hydrochloride), HEPES (e.g., 4-(2-hydroxyethyl)-1-piperazine ethane sulfonic acid), or MOPS (e.g., 3-(N-morpholino)propane sulfonic acid). In some embodiments, the pH buffer in the ligation reaction mixture is within the pH range in which the strand displacement polymerase is inactive. For example, the pH buffer in the ligation reaction mixture may be in the pH range of about 4 to 9, or it may be in the pH range of about 5 to 8.5, about 5.5 to 8, about 6 to 7.9, about 6.5 to 7.8, about 7 to 7.9, or about 7 to 7.5.
[0224] In some embodiments of step (d), the magnesium ions in the ligation reaction mixture include MgCl2 or MgSO4.
[0225] In some embodiments of step (d), the reducing agent in the ligation reaction mixture includes DTT (e.g., dithiothreitol), DTE (dithioerythritol), betaine, and / or glucuronic acid.
[0226] In some embodiments of step (d), the detergent in the ligation reaction mixture includes Tween-20, Tween-80, Triton X-100, Nonidet P-40, CHAPS (e.g., 3-[(3-cholamidopropyl)dimethylammonium]-1-propanesulfonate) or DetX (e.g., N-dodecyl-N,N-dimethyl-3-ammonium-1-propane sulfate).
[0227] In some embodiments of step (d), the crowding agent in the ligation reaction mixture includes PEG (e.g., polyethylene glycol, for example, molecular weight 1-50K), dextran, dextran sulfate, hydroxypropyl methylcellulose (HPMC), hydroxyethyl methylcellulose (HEMC), hydroxybutyl methylcellulose, hydroxypropyl cellulose, methylcellulose, or hydroxymethylcellulose.
[0228] In some embodiments of step (d), the amino acid in the ligation reaction mixture includes β-alanine or β-valine.
[0229] In some embodiments of step (d), the phosphine compound in the ligation reaction mixture comprises a phosphine having a derived trialkylphosphine moiety or a derived triarylphosphine moiety. In some embodiments, the phosphine compound comprises TCEP (e.g., tris(2-carboxyethyl)phosphine), BS-TPP (e.g., bissulfotriphenylphosphine), THPP (e.g., tris(hydroxypropyl)phosphine), or THMP (e.g., tris(hydroxymethyl)phosphine).
[0230] In some embodiments of step (d), the ammonium ions in the ligation reaction mixture include ammonium sulfate (e.g., (NH4)2SO4) or ammonium acetate.
[0231] In some embodiments of step (d), the salt in the ligation reaction mixture includes NaCl, KCl, or potassium glutamate.
[0232] In some embodiments of step (d), the viscosity agent in the ligation reaction mixture includes trehalose, sucrose, cellulose, xylitol, mannitol, sorbitol, D-maltose, or inositol. In some embodiments, the viscosity agent includes glycerol or a glycol compound such as ethylene glycol or propylene glycol (e.g., propanediol). In some embodiments, the viscosity agent in the ligation reaction mixture comprises 50% Brix sucrose, which contains 50 grams of sucrose per 100 grams of total solution.
[0233] In some embodiments of step (d), the multiple nucleotides in the ligation reaction mixture include any combination of dATP, dGTP, dCTP, dTTP, and / or dUTP.
[0234] In some embodiments of step (d), the strand displacement polymerase in the ligation reaction mixture comprises a polymerase capable of locally separating one strand of a double-stranded nucleic acid and synthesizing a new strand in a template-based manner. The strand displacement polymerase displaces the complementary strand from the template strand and catalyzes the synthesis of the new strand. In some embodiments, the strand displacement polymerase comprises a thermophilic or mesophilic polymerase. In some embodiments, the strand displacement polymerase comprises a wild-type enzyme or a variant enzyme (including exonuclease-minus mutants, mutant versions, chimeric enzymes, and truncated enzymes). In some embodiments, the strand displacement polymerase comprises phi29 DNA polymerase, a large fragment of Bst DNA polymerase, a large fragment of Bsu DNA polymerase (exo-), Bca DNA polymerase (exo-), a Klenow fragment of E. coli DNA polymerase, T5 polymerase, M-MuLV reverse transcriptase, HIV reverse transcriptase, Deep Vent DNA polymerase, or KOD DNA polymerase. In some embodiments, the phi29 DNA polymerase may be a wild-type phi29 DNA polymerase (e.g., MagniPhi from Expedeon), a variant EquiPhi29 DNA polymerase (e.g., from Thermo Fisher Scientific, catalog number A39390), or a chimeric QualiPhi DNA polymerase (e.g., from 4basebio, catalog number 510025).
[0235] In some embodiments of step (d), the ligation reaction mixture may include at least one DNA ligase, a strand displacement polymerase, and a pH buffer at a pH at which the strand displacement polymerase is inactive. For example, the pH range of the pH buffer in the ligation reaction mixture may be about 6.5 to 7.8. Therefore, the strand displacement polymerase in the ligation reaction mixture does not catalyze rolling circle amplification. In some embodiments, the DNA ligase in the ligation reaction mixture may close nicks in open-loop library molecules to generate a plurality of covalently closed circular library molecules, each covalently closed circular library molecule hybridizing with an immobilized clip-capture primer (200), thereby forming a nucleic acid duplex in which the clip-capture primer has a 3' end. In some embodiments, the strand displacement polymerase in the ligation reaction mixture may bind to the 3' end of each clip-capture primer that has formed a nucleic acid duplex (e.g., a pre-loaded strand displacement polymerase), but the strand displacement polymerase does not initiate primer extension reactions (e.g., rolling circle amplification) because the strand displacement polymerase is inactive at the pH of the ligation reaction mixture. In some embodiments, the strand displacement polymerase can be preloaded onto the 3' end of the clip-capture primer during the ligation reaction in step (d), and then in step (e), rolling circle amplification reactions can be initiated substantially simultaneously on multiple immobilized covalently closed circular library molecules by changing the pH to a range in which the preloaded strand displacement polymerase is active.
[0236] In some embodiments, the method for generating a plurality of nucleic acid tandem molecules immobilized to a support further includes step (e): contacting a rolling circle amplification reaction mixture with a plurality of covalently closed circular library molecules immobilized to a support and performing a plurality of rolling circle amplification reactions, thereby generating a plurality of immobilized single-stranded nucleic acid tandem template molecules (e.g., Figures 29 to 36). In some embodiments, a single covalently closed circular library molecule can generate a single tandem template molecule. In some embodiments, the single tandem template molecule serves as a template molecule for downstream sequencing workflows.
[0237] In some embodiments, step (e) includes contacting a rolling circle amplification reaction mixture with a first and a second subgroup of covalently closed circular library molecules immobilized to a support and performing a rolling circle amplification reaction, thereby generating a first and a second subgroup of immobilized single-stranded nucleic acid tandem template molecules. In some embodiments, a single covalently closed circular library molecule can generate a single tandem template molecule. In some embodiments, the single tandem template molecule serves as a template molecule for downstream sequencing workflows.
[0238] In some embodiments of step (e), the rolling circle amplification reaction is carried out in the presence of a plurality of compaction oligonucleotides. In some embodiments, the plurality of compaction oligonucleotides comprises a plurality of compaction oligonucleotides having the same sequence. In some embodiments, the plurality of compaction oligonucleotides comprises a mixture of compaction oligonucleotides having two or more different sequences. In some embodiments, the rolling circle amplification reaction can be initiated substantially simultaneously on a plurality of covalently closed circular library molecules immobilized to a support. In some embodiments, the plurality of covalently closed circular library molecules can be contacted with the rolling circle amplification reaction mixture at least once, at least twice, at least three times, at least four times, or at most ten times.
[0239] In some embodiments of step (e), the rolling circle amplification reaction can be performed on the support to generate multiple immobilized tandem template molecules, wherein each tandem template molecule is covalently bound to an immobilized clip-capture primer (e.g., Figures 29 to 36). In some embodiments, each tandem template molecule comprises two or more tandem repeat units, wherein each unit comprises a sequence complementary to the given covalently closed circular library molecule generated in step (d) above. In some embodiments of step (e), at least a portion of each tandem template molecule may hybridize with immobilized pinning primers (e.g., Figure 37). In some embodiments of step (e), the pinning primers include a 3' end with a non-extendable moiety. Therefore, the multiple pinning primers include a 3' end that does not initiate the rolling circle amplification reaction in step (e).
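The structure of a tandem template molecule, two or more repeat units each complementary to the circular library molecule, can be illustrated in silico. The sketch below is a simplified model and not a description of the enzymatic reaction: the function names are hypothetical, the circular molecule is written as a linear 5' to 3' string opened at an arbitrary position, and the rotation of the repeat unit that depends on where the primer anneals is ignored.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, and T")
    return seq.translate(_COMPLEMENT)[::-1]


def tandem_template(circular_library_seq: str, repeat_units: int) -> str:
    """Model a tandem template as repeated complementary copies of the circle.

    Assumption: the circle is linearized at an arbitrary position, so the
    result is correct up to a rotation of the repeat unit.
    """
    if repeat_units < 2:
        raise ValueError("a tandem template has two or more repeat units")
    return reverse_complement(circular_library_seq) * repeat_units


if __name__ == "__main__":
    # Toy four-base "circle" used only to make the repeat structure visible.
    print(tandem_template("AACG", 3))
```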
[0240] In some embodiments of step (e), the rolling circle amplification reaction mixture comprises magnesium ions, a reducing agent, a detergent, a crowding agent, an amino acid, a phosphine compound, ammonium ions, a salt, a viscosity agent, multiple nucleotides, and / or multiple compaction oligonucleotides. In some embodiments of step (e), the rolling circle amplification reaction mixture lacks a strand displacement polymerase. In some embodiments of step (e), the rolling circle amplification reaction mixture includes a strand displacement polymerase. In some embodiments, when the rolling circle amplification reaction mixture lacks a strand displacement polymerase, the rolling circle amplification reaction is catalyzed by a strand displacement polymerase present in the ligation reaction mixture of step (d).
[0241] In some embodiments of step (e), the rolling circle amplification reaction mixture comprises a pH buffer, magnesium ions, a reducing agent, a detergent, a crowding agent, an amino acid, a phosphine compound, ammonium ions, a salt, a viscosity agent, multiple nucleotides, or combinations thereof. In some embodiments of step (e), the rolling circle amplification reaction mixture comprises a strand displacement polymerase.
[0242] In some embodiments of step (e), the pH buffer in the rolling circle amplification reaction mixture includes Tris (e.g., tris(hydroxymethyl)-aminomethane), Tris-HCl (e.g., tris(hydroxymethyl)-aminomethane hydrochloride), HEPES (e.g., 4-(2-hydroxyethyl)-1-piperazine ethane sulfonic acid), or MOPS (e.g., 3-(N-morpholino)propane sulfonic acid). In some embodiments of step (e), the pH buffer in the rolling circle amplification reaction mixture is within the pH range in which the strand displacement polymerase is active. For example, the pH buffer in the rolling circle amplification reaction mixture may be in the pH range of about 7 to 9, about 7.5 to 9, about 8 to 9, about 8.1 to 8.9, about 8.2 to 8.8, about 8.3 to 8.7, or about 8.4 to 8.6.
[0243] In some embodiments of step (e), the magnesium ions in the rolling circle amplification reaction mixture include MgCl2 or MgSO4.
[0244] In some embodiments of step (e), the reducing agent in the rolling circle amplification reaction mixture includes DTT (e.g., dithiothreitol) and / or betaine.
[0245] In some embodiments of step (e), the detergent in the rolling circle amplification reaction mixture includes Tween-20, Tween-80, Triton X-100, Nonidet P-40, CHAPS (e.g., 3-[(3-cholamidopropyl)dimethylammonium]-1-propanesulfonate) or DetX (e.g., N-dodecyl-N,N-dimethyl-3-ammonium-1-propane sulfate).
[0246] In some embodiments of step (e), the crowding agent in the rolling circle amplification reaction mixture includes PEG (e.g., polyethylene glycol, for example, 1-50K molecular weight), dextran, dextran sulfate, hydroxypropyl methylcellulose (HPMC), hydroxyethyl methylcellulose (HEMC), hydroxybutyl methylcellulose, hydroxypropyl cellulose, methylcellulose, or hydroxymethylcellulose.
[0247] In some embodiments of step (e), the amino acids in the rolling circle amplification reaction mixture include β-alanine or β-valine.
[0248] In some embodiments of step (e), the phosphine compound in the rolling circle amplification reaction mixture comprises a phosphine having a derived trialkylphosphine moiety or a derived triarylphosphine moiety. In any of the embodiments of step (e), the phosphine compound comprises TCEP (e.g., tris(2-carboxyethyl)phosphine), BS-TPP (e.g., bissulfotriphenylphosphine), THPP (e.g., tris(hydroxypropyl)phosphine), or THMP (e.g., tris(hydroxymethyl)phosphine).
[0249] In some embodiments of step (e), the ammonium ions in the rolling circle amplification reaction mixture include ammonium sulfate (e.g., (NH4)2SO4) or ammonium acetate.
[0250] In some embodiments of step (e), the salt in the rolling circle amplification reaction mixture includes NaCl, KCl, or potassium glutamate.
[0251] In some embodiments of step (e), the viscosity agent in the rolling circle amplification reaction mixture includes trehalose, sucrose, cellulose, xylitol, mannitol, sorbitol, D-maltose, or inositol. In some embodiments, the viscosity agent includes glycerol or a glycol compound such as ethylene glycol or propylene glycol (e.g., propanediol).
[0252] In some embodiments of step (e), the multiple nucleotides in the rolling circle amplification reaction mixture include any combination of dATP, dGTP, dCTP, dTTP, and / or dUTP.
[0253] In some embodiments of step (e), the strand displacement polymerase, if present in the rolling circle amplification reaction mixture, comprises a polymerase capable of locally separating one strand of a double-stranded nucleic acid and synthesizing a new strand in a template-based manner. The strand displacement polymerase displaces the complementary strand from the template strand and catalyzes the synthesis of the new strand. In some embodiments, the strand displacement polymerase comprises a thermophilic or mesophilic polymerase. In some embodiments, the strand displacement polymerase comprises a wild-type enzyme or a variant enzyme (including exonuclease-minus mutants, mutant versions, chimeric enzymes, and truncated enzymes). In some embodiments, the strand displacement polymerase comprises phi29 DNA polymerase, a large fragment of Bst DNA polymerase, a large fragment of Bsu DNA polymerase (exo-), Bca DNA polymerase (exo-), a Klenow fragment of E. coli DNA polymerase, T5 polymerase, M-MuLV reverse transcriptase, HIV reverse transcriptase, Deep Vent DNA polymerase, or KOD DNA polymerase. In any of the embodiments of step (e), the phi29 DNA polymerase may be a wild-type phi29 DNA polymerase (e.g., MagniPhi from Expedeon), a variant EquiPhi29 DNA polymerase (e.g., from Thermo Fisher Scientific, catalog number A39390), a chimeric QualiPhi DNA polymerase (e.g., from 4basebio, catalog number 510025), or any suitable polymerase described herein.
[0254] In some embodiments of step (e), the rolling circle amplification reaction mixture contains a pH buffer and lacks a strand displacement polymerase. In some embodiments, the pH buffer in the rolling circle amplification reaction mixture is within the pH range in which the strand displacement polymerase is active. For example, the pH buffer in the rolling circle amplification reaction mixture may be about pH 8 to 9. Thus, contacting the covalently closed circular molecules with the rolling circle amplification reaction mixture (e.g., lacking a strand displacement polymerase) initiates the rolling circle amplification reaction, because the change in pH activates the strand displacement polymerase that was preloaded during the ligation reaction of step (d).
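The pH-gated start of rolling circle amplification described for steps (d) and (e) can be sketched as a simple state check. The sketch uses only the example ranges stated above (about pH 6.5 to 7.8 for the ligation reaction mixture and about pH 8 to 9 for the rolling circle amplification reaction mixture). Other disclosed ranges overlap, so the hard thresholds, like the function and constant names, are illustrative assumptions rather than measured activity limits.

```python
# Example pH windows taken from the description; treated here as hard limits
# purely for illustration.
LIGATION_PH_RANGE = (6.5, 7.8)   # strand displacement polymerase inactive
RCA_PH_RANGE = (8.0, 9.0)        # strand displacement polymerase active


def polymerase_state(ph: float) -> str:
    """Return the assumed activity state of the polymerase at a given pH."""
    if LIGATION_PH_RANGE[0] <= ph <= LIGATION_PH_RANGE[1]:
        return "inactive"
    if RCA_PH_RANGE[0] <= ph <= RCA_PH_RANGE[1]:
        return "active"
    return "unspecified"


def rca_initiates(polymerase_preloaded: bool, mixture_has_polymerase: bool, ph: float) -> bool:
    """Amplification starts only if a polymerase is available and the pH permits activity."""
    polymerase_available = polymerase_preloaded or mixture_has_polymerase
    return polymerase_available and polymerase_state(ph) == "active"


if __name__ == "__main__":
    # Ligation mixture at pH 7.5: polymerase is bound to the primer but idle.
    print(rca_initiates(True, False, 7.5))
    # Polymerase-free amplification mixture at pH 8.5: preloaded enzyme starts.
    print(rca_initiates(True, False, 8.5))
```

Because every preloaded polymerase sees the pH change at the same time, this gating is what allows the reactions to begin substantially simultaneously across the support.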
[0255] In some embodiments of step (e), the rolling circle amplification reaction can be carried out under isothermal amplification conditions at a constant temperature (such as, for example, about 20°C, about 25°C, about 30°C, about 35°C, about 37°C, about 40°C, about 42°C, about 50°C, about 60°C, about 65°C, about 70°C, about 75°C) or at a higher temperature or within a temperature range defined by any two of the foregoing temperatures.
[0256] In some embodiments of step (e), the rolling circle amplification reaction mixture comprises multiple nucleotides, including dATP, dCTP, dGTP, and dTTP. In some embodiments, the multiple nucleotides further comprise nucleotides having cleavable moieties, wherein the rolling circle amplification reaction generates multiple immobilized single-stranded nucleic acid tandem template molecules, each single-stranded nucleic acid tandem template molecule having at least one nucleotide with a cleavable moiety (e.g., Figure 29). In some embodiments, a tandem template molecule having at least one incorporated nucleotide with a cleavable moiety can be cleaved at the cleavable moiety to generate an abasic site in the tandem template molecule. In some embodiments, among the plurality of nucleotides, the nucleotides having the cleavable moiety include deoxyuridine, 8-oxo-7,8-dihydroguanine (e.g., 8oxoG), deoxyinosine, thymine glycol, 3-methyladenine, 7-methylguanine, deoxyxanthosine, 5-hydroxyuridine, 5-hydroxymethyluridine, 5-formyluridine, cyclobutane pyrimidine dimer, 5-methylcytosine, 5-hydroxymethylcytosine, 5-formylcytosine, 5-carboxycytosine, or N6-methyladenine. In some embodiments, the multiple nucleotides in the rolling circle amplification reaction lack nucleotides with cleavable moieties.
[0257] In some embodiments of step (e), the multiple nucleotides in the rolling circle amplification mixture may contain an amount of dUTP such that a target percentage of thymidine in the resulting tandem template molecule is replaced by deoxyuridine. For example, when 30% of the dTTP in the tandem template molecule is to be replaced by dUTP (e.g., 30% is the target percentage), the nucleotide mixture may contain 7.5% dUTP (e.g., 30% of the 25% dTTP share, or 30 / 4 = 7.5%), 17.5% dTTP, and 25% each of dATP, dCTP, and dGTP. The target percentage of dTTP in the immobilized tandem template molecule to be replaced by dUTP, as a nucleotide with a cleavable moiety, may be about 0.1% to 1%, or about 1% to 5%, or about 5% to 10%, or about 10% to 20%, or about 20% to 30%, or about 30% to 45%, or about 45% to 50%, or higher.
[0258] In some embodiments of step (e), the multiple nucleotides in the rolling circle amplification reaction mixture may include an amount of deoxyinosine such that a target percentage of guanosine in the resulting tandem template molecule is replaced by deoxyinosine. For example, when 30% of the dGTP in the tandem template molecule is to be replaced by deoxyinosine (e.g., 30% is the target percentage), the nucleotide mixture may contain 7.5% deoxyinosine (e.g., 30% of the 25% dGTP share, or 30 / 4 = 7.5%), 17.5% dGTP, and 25% each of dATP, dCTP, and dTTP. The target percentage of dGTP in the immobilized tandem template molecule to be replaced by deoxyinosine, as a nucleotide with a cleavable moiety, may be about 0.1% to 1%, or about 1% to 5%, or about 5% to 10%, or about 10% to 20%, or about 20% to 30%, or about 30% to 45%, or about 45% to 50%, or higher.
[0259] In some embodiments of step (e), the multiple nucleotides in the rolling circle amplification mixture may include an amount of 8oxoG such that a target percentage of guanosine in the resulting tandem template molecule is replaced by 8oxoG. For example, when 30% of the dGTP in the tandem template molecule is to be replaced by 8oxoG (e.g., 30% is the target percentage), the nucleotide mixture may contain 7.5% 8oxoG (e.g., 30% of the 25% dGTP share, or 30 / 4 = 7.5%), 17.5% dGTP, and 25% each of dATP, dCTP, and dTTP. The target percentage of dGTP in the immobilized tandem template molecule to be replaced by 8oxoG, as a nucleotide with a cleavable moiety, may be about 0.1% to 1%, or about 1% to 5%, or about 5% to 10%, or about 10% to 20%, or about 20% to 30%, or about 30% to 45%, or about 45% to 50%, or higher.
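The mixture arithmetic used in the dUTP, deoxyinosine, and 8oxoG examples above can be sketched as follows. This is an illustrative calculation only. It assumes an equimolar starting pool (25% per base) and assumes that the polymerase incorporates the analog and the nucleotide it replaces with equal efficiency, which the disclosure does not state. The function name `nucleotide_mix` and its parameters are hypothetical.

```python
def nucleotide_mix(target_percent, replaced="dTTP", analog="dUTP"):
    """Return the percentage of each nucleotide in the reaction pool.

    Assumptions (not from the disclosure): an equimolar four-base pool and
    equal incorporation efficiency for the analog and the replaced base.
    """
    canonical = ["dATP", "dCTP", "dGTP", "dTTP"]
    if replaced not in canonical:
        raise ValueError(f"unknown canonical nucleotide: {replaced}")
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")

    share = 100.0 / len(canonical)                  # 25% per base
    analog_percent = share * target_percent / 100.0
    mix = {name: share for name in canonical}
    mix[replaced] = share - analog_percent          # remainder of the replaced base
    mix[analog] = analog_percent
    return mix


# 30% target replacement of dTTP by dUTP, as in the example above.
mix = nucleotide_mix(30, replaced="dTTP", analog="dUTP")
print(mix)
```

With a 30% target, the sketch reproduces the figures stated above: 7.5% analog, 17.5% of the replaced nucleotide, and 25% of each remaining nucleotide.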
[0260] In some embodiments of step (e), the rolling circle amplification reaction generates immobilized tandem template molecules with incorporated nucleotides having cleavable moieties, the incorporated nucleotides being distributed at random positions along the individual immobilized tandem template molecules (e.g., Figures 29 to 30). In some embodiments, the nucleotides having cleavable moieties are located at different positions in different immobilized tandem template molecules.
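The random placement described above can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not part of the disclosure: each thymine position is replaced independently with a fixed probability, and the sequence and the function name `incorporate_analog` are hypothetical.

```python
import random


def incorporate_analog(sequence, fraction, base="T", analog="U", rng=None):
    """Replace each occurrence of `base` by `analog` with probability `fraction`.

    Assumption: replacement at each position is independent, which is a
    simplification of polymerase behavior.
    """
    if rng is None:
        rng = random.Random()
    return "".join(
        analog if (b == base and rng.random() < fraction) else b
        for b in sequence
    )


unit = "ATTGCTTACGTTAGT"  # hypothetical repeat unit
# Two molecules copied from the same circle generally carry the analog
# at different positions.
molecule_1 = incorporate_analog(unit, 0.3, rng=random.Random(1))
molecule_2 = incorporate_analog(unit, 0.3, rng=random.Random(2))
print(molecule_1)
print(molecule_2)
```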
[0261] In some embodiments of step (e), the individual immobilized tandem template molecules generated by rolling circle amplification contain two or more tandem repeat units, wherein each unit contains a complementary sequence of a covalently closed circular library molecule (e.g., Figure 29). In some embodiments, the repeating units of each tandem template molecule comprise any one or any combination of the following, arranged in any order: (i) the target sequence; (ii) a first universal binding site (120) (or its complementary sequence) for a first portion of a splint-capture primer; (iii) a universal binding site (123) (or its complementary sequence) for a first non-splint capture primer; (iv) at least one sample index sequence (e.g., (160) and / or (170)), which can be used to distinguish target sequences obtained from different sample sources in multiplex assays; (v) at least one universal binding site (140) (or its complementary sequence) for a forward sequencing primer; (vi) at least one universal binding site (150) (or its complementary sequence) for a reverse sequencing primer; (vii) at least one universal binding site (or its complementary sequence) for a compacted oligonucleotide; (viii) at least one unique molecular index sequence (UMI) (e.g., (180) and / or (190)), which can be used to uniquely identify the nucleic acid molecule (e.g., having a target sequence) to which the unique molecular index sequence is attached; (ix) at least one universal binding site (or its complementary sequence) for a pinning primer; (xi) at least one batch-specific barcode sequence; (xii) a universal binding site (133) (or its complementary sequence) for a second non-splint capture primer; (xiii) at least one short random sequence (e.g., NNNN) (132), which provides nucleotide sequence diversity and is about 3 to 20 nucleotides in length; and / or (xiv) a second universal binding site (130) (or its complementary sequence) for a second portion of the splint-capture primer.
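The relationship between a circular library molecule and the tandem repeat units copied from it can be sketched as follows. The example is a simplification: the circle is written as a linear string opened at an arbitrary position, the sequence is hypothetical, and the function names are illustrative rather than taken from the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def tandem_template(circle_seq, copies):
    """Model a tandem template as repeated complementary copies of the circle.

    `circle_seq` is the circular molecule written 5' to 3' from an arbitrary
    start point (an assumption made for illustration).
    """
    if copies < 2:
        raise ValueError("a tandem template has two or more repeat units")
    return reverse_complement(circle_seq) * copies


print(tandem_template("AAGC", 2))  # GCTTGCTT
```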
[0262] In some embodiments of step (e), the universal binding site (140) for forward sequencing primers includes a batch-specific forward sequencing primer binding site that can be used for forward batch sequencing.
[0263] In some embodiments of step (e), the universal binding site (150) for reverse sequencing primers includes a batch-specific reverse sequencing primer binding site that can be used for reverse batch sequencing.
[0264] In some embodiments of step (e), at least one sample index sequence (e.g., (160) and / or (170)) comprises a sample index sequence conjugated with an optional short random sequence (e.g., NNN), wherein the short random sequence provides nucleotide sequence diversity and is about 3 to 20 nucleotides in length.
[0265] In some embodiments of step (e), the rolling circle amplification reaction can be carried out with or without multiple compacted oligonucleotides.
[0266] Tandem template molecules can self-collapse to form DNA nanospheres, sometimes referred to as communities. The shape and size of DNA nanospheres can be further compacted by including a pair of inverted repeat sequences in a covalently closed circular library molecule, or by performing rolling circle amplification in the presence of one or more compacted oligonucleotides.
[0267] In some embodiments of step (e), rolling circle amplification (RCA) can be performed using compacted oligonucleotides to generate single-stranded tandem template molecules having multiple copies of repeating units arranged in tandem, wherein each repeating unit contains a target sequence and at least one binding site for the compacted oligonucleotide. Each immobilized tandem template molecule can hybridize with at least one compacted oligonucleotide, which can collapse the individual tandem template molecules into DNA nanospheres with compacted shape and size compared to tandem template molecules that have not hybridized with the compacted oligonucleotide.
[0268] In some embodiments of step (e), the compacted oligonucleotide comprises a single-stranded nucleic acid oligonucleotide comprising DNA, RNA, or a combination of DNA and RNA. The compacted oligonucleotide can be of any length, including 20-150 nucleotides, 30-100 nucleotides, or 40-80 nucleotides. The compacted oligonucleotide may include a 5' region, an optional internal region (intermediate region), and a 3' region. The 5' and 3' regions of the compacted oligonucleotide can hybridize with binding sites in a tandem strand to pull the distal portions of the tandem strand together, thereby compacting the tandem strand to form DNA nanospheres (e.g., communities). For example, the 5' region of the compacted oligonucleotide is designed to hybridize with a first portion of the tandem strand template molecule (e.g., a universal binding site for the compacted oligonucleotide), and the 3' region of the compacted oligonucleotide is designed to hybridize with a second portion of the same tandem strand template molecule (e.g., a universal binding site for the compacted oligonucleotide). The 5' and 3' regions of the compacted oligonucleotide can hybridize with regions of the tandem strand template molecule having universal sites for binding the compacted oligonucleotide. The 5' and 3' regions of the compacted oligonucleotide can hybridize with regions in the tandem template molecule that overlap with any of the following: a splint-capture primer binding site, a non-splint primer binding site, a pinned primer binding site, a forward sequencing primer binding site, and / or a reverse sequencing primer binding site. The intermediate region can be of any length, for example, about 2-20 nucleotides. The intermediate region can include a homopolymer region having consecutive identical bases (e.g., AAA, GGG, CCC, TTT, or UUU). In some embodiments, the intermediate region contains a heteropolymer sequence.
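The three-region layout described above (5' region, optional intermediate region, 3' region) can be sketched as a simple sequence construction. The binding-site sequences, the `TTT` spacer choice, and the function names are hypothetical and serve only to illustrate how the 5' and 3' regions are made complementary to two sites in the same tandem template molecule.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def compaction_oligo(site_a, site_b, spacer="TTT"):
    """Build an oligonucleotide whose 5' region hybridizes to `site_a` and
    whose 3' region hybridizes to `site_b` of the same tandem template.

    Both sites are given 5' to 3' as they appear in the template strand.
    `spacer` models the optional intermediate region (here a homopolymer).
    """
    return reverse_complement(site_a) + spacer + reverse_complement(site_b)


# Hypothetical 20-nt binding sites, giving a 43-nt oligonucleotide.
oligo = compaction_oligo("ACGTTGCAAGGCTTAACCGT", "TTGACCGGTAACGTCAGGTA")
print(oligo, len(oligo))
```

The resulting length can then be checked against the ranges given above, such as 40-80 nucleotides.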
[0269] Including compacting oligonucleotides during RCA promotes the formation of DNA nanospheres with a more compact size and shape compared to tandem structures generated in the absence of compacting oligonucleotides. These DNA nanospheres are stable and maintain their compacted size and shape over multiple reagent flows, such as during multiple sequencing cycles. This stability improves sequencing accuracy by increasing signal intensity across multiple sequencing cycles.
[0270] DNA nanospheres can be imaged, and FWHM (full width at half maximum) measurements can be obtained to determine the shape / size of the nanospheres. A spot image of a DNA nanosphere can be represented as a Gaussian spot, and its size can be measured as the FWHM. A smaller spot size, as indicated by a smaller FWHM, is typically associated with an improved spot image. In some embodiments, the FWHM of the nanosphere spots can be about 10 μm or smaller. In some embodiments, the spot images of the DNA nanospheres remain discrete across multiple sequencing cycles.
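For a spot modeled as a Gaussian, the FWHM follows directly from the standard deviation of the fitted profile: FWHM = 2·sqrt(2·ln 2)·σ, or about 2.355·σ. The sketch below shows the conversion. The fitting step that produces σ is not shown, and the example value of σ is hypothetical.

```python
import math


def fwhm_from_sigma(sigma):
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


def gaussian(x, sigma):
    """Unit-amplitude Gaussian profile centered at zero."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


sigma_um = 0.4  # hypothetical fitted standard deviation, in micrometers
width = fwhm_from_sigma(sigma_um)
print(f"FWHM = {width:.3f} um")
# At half the FWHM from the center, the profile is at half its maximum.
print(gaussian(width / 2.0, sigma_um))
```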
[0271] In some embodiments of step (e), after the rolling circle amplification reaction, the covalently closed circular library molecules may optionally be removed from the tandem template molecules by at least one washing step performed under conditions suitable for retaining the tandem template molecules immobilized to the support, wherein each tandem template molecule is operatively bound to an immobilized splint-capture primer (200).
[0272] In some embodiments, the method for generating a plurality of nucleic acid tandem template molecules immobilized to a support further includes step (f): performing at least one sequencing reaction to determine the sequence of at least a portion of the tandem template molecules. In some embodiments, the tandem template molecules serve as the nucleic acid template molecules to be sequenced. In some embodiments, any sequencing method can be used to sequence the tandem template molecules. For example, the sequencing method may employ a plurality of sequencing primers, a plurality of sequencing polymerases, and at least one nucleotide reagent (e.g., Figures 31 and 36).
[0273] In some embodiments, the plurality of sequencing polymerases in step (f) include engineered polymerases comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity to any one of the sequences described in SEQ ID NO: 128-146 (e.g., Figures 41 to 59, respectively).
[0274] In some embodiments, the nucleotide reagent comprises any one or any combination of nucleotides and / or multivalent molecules. In some embodiments, the nucleotide reagent comprises a canonical nucleotide. In some embodiments, the nucleotide reagent comprises a nucleotide analog. In some embodiments, the nucleotide analog comprises a detectably labeled nucleotide. For example, the detectably labeled nucleotide may be labeled at the nucleobase and / or the phosphate chain. In some embodiments, the nucleotide reagent comprises a nucleotide carrying a removable or non-removable chain-terminating moiety. In some embodiments, the nucleotide reagent comprises multivalent molecules, each comprising a central core attached to a plurality of polymer arms, each polymer arm having a nucleotide moiety at its end (e.g., Figures 1 to 4).
[0275] In some embodiments, the sequencing reaction employs binding of unlabeled nucleotides without incorporation. In some embodiments, the sequencing reaction employs incorporation of unlabeled nucleotide analogs. In some embodiments, the sequencing reaction employs incorporation of detectably labeled nucleotides having removable chain-terminating moieties. In some embodiments, the sequencing reaction employs a two-stage sequencing reaction, including binding of detectably labeled multivalent molecules without incorporation, and incorporation of nucleotides or nucleotide analogs. In some embodiments, the sequencing reaction employs incorporation of nucleotide moieties from the arms of multivalent molecules. Exemplary nucleotide arms are shown in Figure 5, and exemplary multivalent molecules are shown in Figures 1 to 4. In some embodiments, any of the detectably labeled nucleotide reagents contains at least one fluorophore.
[0276] In some embodiments, the sequencing in step (f) includes generating multiple extended forward sequencing primer chains by contacting multiple immobilized tandem template molecules with multiple soluble forward sequencing primers under conditions suitable for hybridizing at least one forward sequencing primer with at least one of the universal forward sequencing primer binding sites of the immobilized tandem template molecule, and performing a forward sequencing reaction using a hybridized first forward sequencing primer, one or more types of sequencing polymerases, and nucleotide reagents (e.g., Figure 31). In some embodiments, the soluble forward sequencing primer includes a 3' OH extendable end. In some embodiments, the soluble forward sequencing primer includes a 3' blocking moiety that can be removed to produce a 3' OH extendable end. In some embodiments, the soluble forward sequencing primer lacks a nucleotide with a cleavable moiety. The forward sequencing reaction can generate multiple extended forward sequencing primer chains. In some embodiments, each immobilized tandem template molecule has multiple copies of a universal forward sequencing primer binding site, wherein each forward sequencing primer binding site is capable of hybridizing with a first forward sequencing primer. Individual forward sequencing primer binding sites in a given immobilized tandem template molecule can hybridize with a forward sequencing primer and undergo a sequencing reaction. Each immobilized tandem template molecule can undergo two or more sequencing reactions, wherein each sequencing reaction is initiated by a first forward sequencing primer that hybridizes with a universal forward sequencing primer binding site (e.g., Figure 31).
[0277] In some embodiments, the sequencing method further includes step (g): retaining the plurality of immobilized tandem template molecules and replacing the plurality of extended forward sequencing primer strands with a plurality of forward extension strands that hybridize with the retained immobilized single-stranded nucleic acid tandem template molecules. By performing a primer extension reaction, the plurality of extended forward sequencing primer strands can be removed and replaced with a plurality of forward extension strands (e.g., Figures 32 and 33). In some embodiments, multiple forward extension strands can be generated through different workflows described in steps (g1), (g2), and (g3) below.
[0278] In some embodiments, step (g1) of the method for replacing multiple extended forward sequencing primer chains with multiple forward extension chains includes: contacting at least one extended forward sequencing primer chain with multiple strand displacement polymerases and multiple nucleotides under conditions suitable for initiating a strand displacement primer extension reaction using at least one extended forward sequencing primer chain and in the absence of additional soluble amplification primers, thereby generating a forward extension chain covalently bound to the extended forward sequencing primer chain, wherein the forward extension chain hybridizes with an immobilized tandem template molecule (Figure 32). For example, one of the extended forward sequencing primer chains can act as a primer for the strand displacement polymerase. The strand displacement polymerase extends the extended forward sequencing primer chain and displaces the downstream extended forward sequencing primer chain, while simultaneously synthesizing an extension strand that replaces the downstream extended forward sequencing primer chain. The newly synthesized extension strand is covalently joined to the extended forward sequencing primer chain. The immobilized tandem template molecule is retained (Figure 32). The primer extension reaction in step (g1) may optionally include multiple compacted oligonucleotides and / or hexamine (e.g., hexamine cobalt(III)) to generate a forward extension chain. The individual forward extension chain may collapse into a nanosphere having a more compact size and / or shape compared to nanospheres generated from a primer extension reaction performed without compacted oligonucleotides and / or hexamine (e.g., hexamine cobalt(III)). Including compacted oligonucleotides and / or hexamine (e.g., hexamine cobalt(III)) in the primer extension reaction improves the FWHM (full width at half maximum) of the nanosphere's spot image. The spot image may be represented as a Gaussian spot, and its size may be measured as the FWHM. A smaller spot size, as indicated by a smaller FWHM, is generally associated with an improved spot image. In some embodiments, the FWHM of the nanosphere spots may be about 10 μm or smaller. In some embodiments, the multiple compacted oligonucleotides in steps (e) and (g1) may have the same or different sequences.
[0279] Examples of strand displacement polymerases include phi29 DNA polymerase, large fragment of Bst DNA polymerase, large fragment of Bsu DNA polymerase (exo-), Bca DNA polymerase (exo-), Klenow fragment of E. coli DNA polymerase I, T5 polymerase, M-MuLV reverse transcriptase, HIV reverse transcriptase, Deep Vent DNA polymerase, and KOD DNA polymerase. The phi29 DNA polymerase can be wild-type phi29 DNA polymerase (e.g., MagniPhi from Expedeon), or the variant EquiPhi29 DNA polymerase (e.g., from Thermo Fisher Scientific, catalog number A39390), or the chimeric QualiPhi DNA polymerase (e.g., from 4basebio, catalog number 510025).
[0280] In some embodiments, step (g2) of the method for replacing multiple extended forward sequencing primer chains with multiple forward extension chains includes: (i) removing the multiple extended forward sequencing primer chains while retaining the immobilized tandem template molecules; and (ii) contacting the multiple retained immobilized tandem template molecules with multiple soluble forward sequencing primers (e.g., a second plurality of soluble forward sequencing primers), multiple nucleotides (e.g., a second plurality of nucleotides), and multiple primer extension polymerases under conditions suitable for hybridizing the multiple soluble forward sequencing primers with the multiple retained immobilized tandem template molecules and suitable for performing a polymerase-catalyzed primer extension reaction, thereby generating multiple forward extension chains, wherein the soluble sequencing primers hybridize with the forward sequencing primer binding sequences in the retained immobilized tandem template molecules (e.g., Figure 33). The primer extension reaction in step (g2) may optionally include multiple compacted oligonucleotides and / or hexamine (e.g., hexamine cobalt(III)) to generate a forward extension chain. The individual forward extension chain may collapse into a nanosphere having a more compact size and / or shape compared to nanospheres generated from a primer extension reaction performed without compacted oligonucleotides and / or hexamine (e.g., hexamine cobalt(III)). Including compacted oligonucleotides and / or hexamine (e.g., hexamine cobalt(III)) in the primer extension reaction improves the FWHM (full width at half maximum) of the nanosphere's spot image. The spot image may be represented as a Gaussian spot, and its size may be measured as the FWHM. A smaller spot size, as indicated by a smaller FWHM, is generally associated with an improved spot image. In some embodiments, the FWHM of the nanosphere spots may be about 10 μm or smaller. In some embodiments, the multiple compacted oligonucleotides in steps (e) and (g2) may have the same or different sequences.
[0281] In some embodiments, in step (g2), the conditions suitable for hybridizing a plurality of soluble forward sequencing primers with a plurality of retained immobilized single-stranded nucleic acid tandem template molecules include hybridizing the retained immobilized tandem template molecules with the soluble primers in the presence of primer extension polymerase, a plurality of nucleotides, and a high-efficiency hybridization buffer. In some embodiments, the high-efficiency hybridization buffer comprises: (i) a first polar aprotic solvent having a dielectric constant not greater than 40 and a polarity index of 4-9; (ii) a second polar aprotic solvent having a dielectric constant not greater than 115 and present in the hybridization buffer formulation in an amount capable of effectively denaturing double-stranded nucleic acids; (iii) a pH buffer system that maintains the pH of the hybridization buffer formulation in the range of about 4-8; and (iv) a crowding agent in an amount sufficient to enhance or promote molecular crowding. In some embodiments, the high-efficiency hybridization buffer comprises: (i) a first polar aprotic solvent comprising 25-50% by volume of acetonitrile; (ii) a second polar aprotic solvent comprising 5-10% by volume of formamide; (iii) a pH buffer system comprising 2-(N-morpholino)ethanesulfonic acid (MES) at a pH of 5-6.5; and (iv) a crowding agent comprising 5-35% by volume of polyethylene glycol (PEG) in the hybridization buffer. In some embodiments, the high-efficiency hybridization buffer further comprises betaine.
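The specific buffer composition above (acetonitrile, formamide, MES pH, and PEG ranges) lends itself to a simple range check when recording a formulation. The sketch below is illustrative only: the class, field names, and example values are hypothetical, and only the numeric ranges are taken from the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class HybridizationBuffer:
    acetonitrile_pct: float  # percent by volume
    formamide_pct: float     # percent by volume
    mes_ph: float            # pH of the MES buffer system
    peg_pct: float           # percent by volume


# Ranges stated in the paragraph above.
RANGES = {
    "acetonitrile_pct": (25.0, 50.0),
    "formamide_pct": (5.0, 10.0),
    "mes_ph": (5.0, 6.5),
    "peg_pct": (5.0, 35.0),
}


def out_of_range(buffer):
    """Return the names of the fields that fall outside the stated ranges."""
    problems = []
    for field, (low, high) in RANGES.items():
        value = getattr(buffer, field)
        if not low <= value <= high:
            problems.append(field)
    return problems


print(out_of_range(HybridizationBuffer(30.0, 7.5, 6.0, 20.0)))  # []
print(out_of_range(HybridizationBuffer(60.0, 7.5, 7.0, 20.0)))  # ['acetonitrile_pct', 'mes_ph']
```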
[0282] In some embodiments, step (g3) of the method for replacing multiple extended forward sequencing primer chains with multiple forward extension chains includes: (i) removing the multiple extended forward sequencing primer chains while retaining the immobilized tandem template molecules; and (ii) contacting the multiple retained immobilized tandem template molecules with multiple soluble amplification primers, multiple nucleotides (e.g., a second plurality of nucleotides), and multiple primer extension polymerases under conditions suitable for hybridizing the multiple soluble amplification primers with the multiple retained immobilized tandem template molecules and suitable for performing a polymerase-catalyzed primer extension reaction, thereby generating multiple forward extension chains, wherein the soluble amplification primers hybridize with soluble amplification primer binding sequences in the retained immobilized tandem template molecules. The primer extension reaction in step (g3) may optionally include multiple compacted oligonucleotides and / or hexamine (e.g., hexamine cobalt(III)) to generate forward extension chains. The individual forward extension chain may collapse into a nanosphere having a more compact size and / or shape compared to nanospheres generated from a primer extension reaction performed without compacted oligonucleotides and / or hexamine (e.g., hexamine cobalt(III)). Including compacted oligonucleotides and / or hexamine (e.g., hexamine cobalt(III)) in the primer extension reaction improves the FWHM (full width at half maximum) of the nanosphere's spot image. The spot image may be represented as a Gaussian spot, and its size may be measured as the FWHM. A smaller spot size, as indicated by a smaller FWHM, is generally associated with an improved spot image. In some embodiments, the FWHM of the nanosphere spots may be about 10 μm or smaller. In some embodiments, the multiple compacted oligonucleotides in steps (e) and (g3) may have the same or different sequences.
[0283] In some embodiments, in step (g3), the conditions suitable for hybridizing a plurality of soluble amplification primers with a plurality of retained immobilized single-stranded nucleic acid tandem template molecules include hybridizing the retained immobilized tandem template molecules with the soluble primers in the presence of primer extension polymerase, a plurality of nucleotides, and a high-efficiency hybridization buffer. In some embodiments, the high-efficiency hybridization buffer comprises: (i) a first polar aprotic solvent having a dielectric constant not greater than 40 and a polarity index of 4-9; (ii) a second polar aprotic solvent having a dielectric constant not greater than 115 and present in the hybridization buffer formulation in an amount capable of effectively denaturing double-stranded nucleic acids; (iii) a pH buffer system that maintains the pH of the hybridization buffer formulation in the range of about 4-8; and (iv) a crowding agent in an amount sufficient to enhance or promote molecular crowding. In some embodiments, the high-efficiency hybridization buffer comprises: (i) a first polar aprotic solvent comprising 25-50% by volume of acetonitrile; (ii) a second polar aprotic solvent comprising 5-10% by volume of formamide; (iii) a pH buffer system comprising 2-(N-morpholino)ethanesulfonic acid (MES) at a pH of 5-6.5; and (iv) a crowding agent comprising 5-35% by volume of polyethylene glycol (PEG) in the hybridization buffer. In some embodiments, the high-efficiency hybridization buffer further comprises betaine.
[0284] In some embodiments, in steps (g2) and / or (g3), enzymes or chemical reagents may be used to remove the extended forward sequencing primer chains. For example, 5' to 3' double-stranded DNA exonucleases (including T7 exonucleases (e.g., from New England Biolabs, catalog number M0263S)) may be used to enzymatically degrade the extended forward sequencing primer chains. In some embodiments, temperatures favorable for nucleic acid denaturation may be used to remove the extended forward sequencing primer chains.
[0285] In some embodiments, in steps (g2) and / or (g3), a denaturing agent may be used to remove the plurality of extended forward sequencing primer chains, wherein the denaturing agent comprises any one or any combination of compounds such as formamide, acetonitrile, guanidine hydrochloride and / or buffers (e.g., Tris-HCl, MES, HEPES, etc.).
[0286] In some embodiments, in steps (g2) and / or (g3), elevated temperatures (e.g., heat) may be used to remove the plurality of extended forward sequencing primer strands, with or without nucleic acid denaturing reagents. The plurality of extended forward sequencing primer strands may be subjected to temperatures of approximately 45°C to 50°C, or approximately 50°C to 60°C, or approximately 60°C to 70°C, or approximately 70°C to 80°C, or approximately 80°C to 90°C, or approximately 90°C to 95°C or higher.
[0287] In some embodiments, in steps (g2) and / or (g3), 100% formamide may be used at a temperature of about 65°C for about 3 minutes, and the extended forward sequencing primer chains may be removed by washing with a reagent containing about 50 mM NaCl or equivalent ionic strength and having a pH of about 6.5 to 8.5.
[0288] In some embodiments, the primer extension polymerase in either or both of steps (g2) and (g3) comprises a high-fidelity polymerase. In some embodiments, the primer extension polymerase comprises a DNA polymerase (e.g., a uracil-tolerant polymerase) capable of catalyzing a primer extension reaction using a template molecule containing uracil. Exemplary polymerases include, but are not limited to: Q5U Hot Start high-fidelity DNA polymerase (e.g., catalog number M0515S from New England Biolabs), Taq DNA polymerase, One Taq DNA polymerase (e.g., a mixture of Taq and Deep Vent DNA polymerases, catalog number M0480S from New England Biolabs), LongAmp Taq DNA polymerase (e.g., catalog number M0323S from New England Biolabs), Epimark Hot Start Taq DNA polymerase (e.g., catalog number M0490S from New England Biolabs), Bst DNA polymerase (e.g., large fragment, catalog number M0275S from New England Biolabs), Bsu DNA polymerase (e.g., large fragment, catalog number M0330S from New England Biolabs), Phi29 DNA polymerase (e.g., catalog number M0269S from New England Biolabs), E. coli DNA polymerase (e.g., catalog number M0209S from New England Biolabs), Therminator DNA polymerase (e.g., catalog number M0261S from New England Biolabs), Vent DNA polymerase, and Deep Vent DNA polymerase.
[0289] The sequencing method described herein can provide increased accuracy in downstream sequencing reactions because each of steps (g1), (g2), and (g3) replaces the extended forward sequencing primers generated in step (f) with forward extension strands having reduced base errors. The extended forward sequencing primers are generated in step (f) and may or may not contain misincorporated nucleotides resulting from polymerase-catalyzed base mismatches. When step (g1), (g2), or (g3) is performed with a high-fidelity DNA polymerase, the resulting forward extension strands can have reduced base errors compared to the extended forward sequencing primers. The forward extension strands can be used as nucleic acid templates for downstream sequencing steps (e.g., see step (i) below). Therefore, steps (g1), (g2), and (g3) can increase the sequencing accuracy of downstream step (i) and thus increase the overall sequencing accuracy of the sequencing workflow.
[0290] In some embodiments, the sequencing method further includes step (h): removing the retained immobilized tandem template molecules by generating abasic sites at the nucleotides with cleavable moieties in the immobilized single-stranded tandem template molecules, and generating gaps at the abasic sites to produce a plurality of single-stranded nucleic acid tandem template molecules containing gaps, while retaining the plurality of forward extension strands, the plurality of immobilized splint-capture primers (200), and the pinning primers (500), as shown, for example, in Figure 34.
[0291] Abasic sites are generated in a retained tandem template molecule containing nucleotides with cleavable moieties. In some embodiments, the nucleotides with cleavable moieties in the retained tandem template molecule include deoxyuridine, 8-oxo-7,8-dihydroguanine (e.g., 8oxoG), or deoxyinosine. The abasic sites can be removed to generate multiple single-stranded nucleic acid template molecules with gaps, while retaining the multiple forward extension strands. Abasic sites can be generated by contacting an immobilized tandem template molecule with an enzyme that removes the nucleobase from nucleotides with cleavable moieties. Uracil DNA glycosylase (UDG) can be used to convert uracil in the retained tandem template strand to an abasic site. FPG glycosylase can be used to convert 8oxoG in the retained tandem template strand to an abasic site. AlkA glycosylase can be used to convert deoxyinosine in the retained tandem template strand to an abasic site.
[0292] In some embodiments, in step (h), the gap can be generated by contacting the abasic site in the immobilized tandem template molecule with an enzyme or mixture of enzymes having lyase activity that breaks the phosphodiester backbone on the 5' and 3' sides of the abasic site, to release the base-free deoxyribose and generate the gap (Figure 34). The abasic sites can be removed using AP lyase, Endo IV endonuclease, FPG glycosylase / AP lyase, or Endo VIII glycosylase / AP lyase. In some embodiments, a mixture of uracil DNA glycosylase and the DNA glycosylase-lyase endonuclease VIII (e.g., USER (uracil-specific excision reagent) enzyme from New England Biolabs, catalog number M5509, or thermostable USER, also from New England Biolabs, catalog number M5508) can be used to generate and remove the abasic sites to create gaps.
[0293] In some embodiments, in step (h), the tandem template molecules carrying at least one cleavable nucleotide can be reacted with at least one enzyme that converts the cleavable nucleotide into an abasic site. In some embodiments, uracil DNA glycosylase (UDG) can be used to convert deoxyuridine into an abasic site, FPG glycosylase can be used to convert 8oxoG into an abasic site, and AlkA glycosylase can be used to convert deoxyinosine into an abasic site. Other exemplary enzymes that can convert cleavable nucleotides in a tandem template molecule into abasic sites include single-strand selective monofunctional uracil DNA glycosylase 1 (SMUG1), methyl-binding domain glycosylase 4 (MBD4), thymine DNA glycosylase (TDG), MutY homolog DNA glycosylase (MYH), alkylpurine glycosylase C (AlkC), alkylpurine glycosylase D (AlkD), 8-oxoguanine glycosylase 1 (OGG1) lacking abasic site lyase activity, endonuclease III-like 1 (NTHL1) lacking abasic site lyase activity, endonuclease VIII-like glycosylase 1 (NEIL1) lacking abasic site lyase activity, endonuclease VIII-like glycosylase 2 (NEIL2) lacking abasic site lyase activity, endonuclease VIII-like glycosylase 3 (NEIL3) lacking abasic site lyase activity, and enzymatically active fragments thereof.
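As a purely illustrative sketch (not part of the disclosed method), the three pairings of cleavable nucleotide and glycosylase stated above can be written as a lookup table. The names `GLYCOSYLASE_BY_CLEAVABLE_NUCLEOTIDE`, `glycosylase_for`, and `abasic_site_positions` are hypothetical, and representing a template strand as a text string with `U` marking a cleavable nucleotide is an assumption made only for this example.

```python
# Illustrative lookup only: pairs each cleavable nucleotide named in the text
# with the glycosylase stated to convert it into an abasic site.
GLYCOSYLASE_BY_CLEAVABLE_NUCLEOTIDE = {
    "deoxyuridine": "UDG",
    "8oxoG": "FPG",
    "deoxyinosine": "AlkA",
}


def glycosylase_for(cleavable_nucleotide: str) -> str:
    """Return the glycosylase paired with a cleavable nucleotide (hypothetical helper)."""
    try:
        return GLYCOSYLASE_BY_CLEAVABLE_NUCLEOTIDE[cleavable_nucleotide]
    except KeyError:
        raise ValueError(
            f"no glycosylase listed for {cleavable_nucleotide!r}"
        ) from None


def abasic_site_positions(template: str, cleavable_symbol: str = "U") -> list[int]:
    """Return 0-based positions in a toy template string where the cleavable
    symbol occurs, i.e. where an abasic site would be generated."""
    return [i for i, base in enumerate(template) if base == cleavable_symbol]
```

For example, `abasic_site_positions("ACGUTTUA")` reports the two positions holding `U`; a gap would then be generated at each of those positions by the lyase step described in paragraph [0292].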
[0294] In some embodiments, in step (h), enzymes, chemical compounds, and / or heat can be used to remove the plurality of gap-containing tandem template molecules. After the gap-containing template molecules are removed, the plurality of retained forward extension strands remain hybridized with the retained immobilized clip-on capture primers, as shown, for example, in Figure 35.
[0295] For example, a 5' to 3' double-stranded DNA exonuclease, such as T7 exonuclease (e.g., from New England Biolabs, catalog number M0263S), can be used to enzymatically degrade the plurality of gap-containing tandem template molecules. When a 5' to 3' double-stranded DNA exonuclease is used to remove the gap-containing tandem template molecules, the plurality of soluble amplification primers in step (g3) may contain at least one phosphorothioate diester bond at their 5' ends, which renders the soluble amplification primers resistant to exonuclease degradation. In some embodiments, the plurality of soluble amplification primers in step (g3) contain 2 to 5 or more consecutive phosphorothioate diester bonds at their 5' ends. In some embodiments, the plurality of soluble amplification primers in step (g3) contain at least one ribonucleotide and / or at least one 2'-O-methyl or 2'-O-methoxyethyl (MOE) nucleotide, which renders the forward sequencing primers resistant to exonuclease degradation.
[0296] In some embodiments, in step (h), a chemical reagent that promotes nucleic acid denaturation may be used to remove the plurality of gap-containing tandem template molecules. The denaturing reagent may comprise any one or any combination of compounds such as formamide, acetonitrile, guanidine hydrochloride, and / or buffers (e.g., Tris-HCl, MES, HEPES, etc.).
[0297] In some embodiments, in step (h), elevated temperature (e.g., heat) may be used to remove the plurality of gap-containing tandem template molecules, with or without a nucleic acid denaturing reagent. The gap-containing template molecules may be subjected to temperatures of about 45°C to 50°C, or about 50°C to 60°C, or about 60°C to 70°C, or about 70°C to 80°C, or about 80°C to 90°C, or about 90°C to 95°C or higher.
[0298] In some embodiments, in step (h), the plurality of gap-containing tandem template molecules can be removed by treatment with 100% formamide at a temperature of about 65°C for about 3 minutes, followed by washing with a reagent containing about 50 mM NaCl or equivalent ionic strength and having a pH of about 6.5 to 8.5.
[0299] In some embodiments, the sequencing method further includes step (i): sequencing the plurality of retained forward extension strands to generate a plurality of extended reverse sequencing primer strands. In some embodiments, the sequencing in step (i) includes: contacting the plurality of retained forward extension strands with a plurality of soluble reverse sequencing primers under conditions suitable for hybridizing the reverse sequencing primers with the reverse sequencing primer binding sites of the retained forward extension strands, and performing a sequencing reaction using the hybridized reverse sequencing primers, wherein the sequencing reaction generates a plurality of extended reverse sequencing primer strands (e.g., Figure 36). The extended reverse sequencing primer strands can be hybridized with the retained forward extension strands. The retained forward extension strands can be hybridized with the clip-on capture primers and are thereby immobilized to the support. The extended reverse sequencing primer strands are neither hybridized with nor covalently bound to the clip-on capture primers.
[0300] In some embodiments, in step (i), the immobilized retained forward extension strand serves as the nucleic acid template molecule to be sequenced (e.g., Figure 36). In some embodiments, the retained forward extension strand can be sequenced using any sequencing method. For example, the sequencing method may employ a plurality of sequencing primers, a plurality of sequencing polymerases, and at least one nucleotide reagent.
[0301] In some embodiments, the plurality of sequencing polymerases in step (i) include engineered polymerases comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity to any of SEQ ID NOs: 128-146 (e.g., Figures 41 to 59, respectively), or any suitable polymerase described herein.
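To illustrate what a sequence identity threshold of this kind means in practice, the following minimal sketch computes percent identity between two sequences that have already been aligned to equal length, with `-` as the gap character. The disclosure does not specify how identity is calculated, so the convention used here (matches divided by aligned columns, skipping columns where both sequences have a gap) is an assumption, and the names `percent_identity` and `meets_identity_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumed convention: columns where both sequences have a gap are ignored;
    a residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # no information in a double-gap column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, two aligned 10-residue sequences that differ at three positions are 70% identical and would satisfy the lowest threshold listed above but none of the higher ones.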
[0302] In some embodiments, in step (i), the nucleotide reagent comprises any one or any combination of nucleotides and / or multivalent molecules. In some embodiments, the nucleotide reagent comprises a canonical nucleotide. In some embodiments, the nucleotide reagent comprises a nucleotide analog. In some embodiments, the nucleotide analog comprises a detectably labeled nucleotide. For example, the detectably labeled nucleotide may be labeled at the nucleobase and / or the phosphate chain. In some embodiments, the nucleotide reagent comprises a nucleotide carrying a removable or non-removable chain-terminating moiety. In some embodiments, the nucleotide reagent comprises multivalent molecules, each comprising a central core attached to a plurality of polymer arms, each polymer arm having a nucleotide moiety at its end (e.g., Figures 1 to 4).
[0303] In some embodiments, in step (i), the sequencing reaction involves binding an unlabeled nucleotide without incorporation. In some embodiments, the sequencing reaction involves incorporating an unlabeled nucleotide analog. In some embodiments, the sequencing reaction involves incorporating a detectably labeled nucleotide having a removable chain-terminating moiety. In some embodiments, the sequencing reaction involves a two-stage sequencing reaction, including binding a detectably labeled multivalent molecule without incorporation, and incorporating a nucleotide or nucleotide analog. In some embodiments, the sequencing reaction involves incorporating a nucleotide moiety from an arm of a multivalent molecule. An exemplary nucleotide arm is shown in Figure 5, and exemplary multivalent molecules are shown in Figures 1 to 4. In some embodiments, any of the detectably labeled nucleotide reagents contains at least one fluorophore.
[0304] In some embodiments, in step (i), the conditions suitable for hybridizing the reverse sequencing primers with the reverse sequencing primer binding sequence of the retained forward extension strand include contacting the plurality of soluble reverse sequencing primers and the retained forward extension strand with a high-efficiency hybridization buffer. In some embodiments, the high-efficiency hybridization buffer comprises: (i) a first polar aprotic solvent having a dielectric constant not greater than 40 and a polarity index of 4-9; (ii) a second polar aprotic solvent having a dielectric constant not greater than 115 and present in the hybridization buffer formulation in an amount effective to denature double-stranded nucleic acids; (iii) a pH buffer system that maintains the pH of the hybridization buffer formulation in the range of about 4-8; and (iv) a crowding agent in an amount sufficient to enhance or promote molecular crowding. In some embodiments, the high-efficiency hybridization buffer comprises: (i) a first polar aprotic solvent comprising 25-50% by volume of acetonitrile; (ii) a second polar aprotic solvent comprising 5-10% by volume of formamide; (iii) a pH buffer system comprising 2-(N-morpholino)ethanesulfonic acid (MES) at a pH of 5-6.5; and (iv) a crowding agent comprising 5-35% by volume of polyethylene glycol (PEG) in the hybridization buffer. In some embodiments, the high-efficiency hybridization buffer further comprises betaine.
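The second buffer formulation above can be read as a set of numeric ranges. The following sketch, offered only as an illustration, checks a candidate formulation against those stated ranges (acetonitrile 25-50% by volume, formamide 5-10% by volume, PEG 5-35% by volume, MES at pH 5-6.5). The class `HybridizationBuffer`, its field names, and the function `out_of_range` are hypothetical, and treating the range endpoints as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationBuffer:
    """Hypothetical record of one candidate buffer formulation."""
    acetonitrile_pct: float  # percent by volume
    formamide_pct: float     # percent by volume
    peg_pct: float           # percent by volume
    mes_ph: float            # pH of the MES buffer system


# Ranges copied from the formulation described in the text; endpoints are
# treated as inclusive (an assumption).
RANGES = {
    "acetonitrile_pct": (25.0, 50.0),
    "formamide_pct": (5.0, 10.0),
    "peg_pct": (5.0, 35.0),
    "mes_ph": (5.0, 6.5),
}


def out_of_range(buffer: HybridizationBuffer) -> list[str]:
    """Return the names of the fields that fall outside the stated ranges."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= getattr(buffer, name) <= high
    ]
```

An empty result means every listed component sits within its stated range; the check says nothing about betaine or about the dielectric constant and polarity index criteria of the first formulation.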
[0305] In an alternative embodiment, the sequencing in step (i) includes: using the immobilized clip-on capture primers (e.g., (200)) as sequencing primers and performing a sequencing reaction to generate a plurality of reverse sequencing strands.
[0306] In some embodiments, the reverse sequencing reaction in step (i) includes: contacting the plurality of soluble reverse sequencing primers with the reverse sequencing primer binding sequences of the retained forward extension strands, one or more types of sequencing polymerase, and a plurality of nucleotides and / or a plurality of multivalent molecules (e.g., Figure 36). In some embodiments, the soluble reverse sequencing primer includes an extendable 3' OH end. In some embodiments, the soluble reverse sequencing primer includes a 3' blocking moiety that can be removed to produce an extendable 3' OH end. In some embodiments, the soluble reverse sequencing primer lacks a nucleotide having a cleavable moiety. Sequencing reactions employing nucleotides and / or multivalent molecules are described in more detail below. Reverse sequencing reactions can produce a plurality of extended reverse sequencing primer strands. In some embodiments, a single retained forward extension strand has multiple copies of the reverse sequencing primer binding sequence / site, wherein each reverse sequencing primer binding site is capable of hybridizing with a reverse sequencing primer. Individual reverse sequencing primer binding sites in a given retained forward extension strand can hybridize with a reverse sequencing primer and can undergo a sequencing reaction. Thus, a single retained forward extension strand can undergo two or more sequencing reactions, wherein each sequencing reaction is initiated from a reverse sequencing primer that is hybridized with a reverse sequencing primer binding site. In some embodiments, the sequencing reaction includes a plurality of nucleotides (or analogs thereof) labeled with a detectable reporter moiety. In some embodiments, the sequencing reaction includes a plurality of multivalent molecules having nucleotide moieties, wherein the multivalent molecules are labeled with a detectable reporter moiety. In some embodiments, the detectable reporter moiety includes a fluorophore.
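The point that a single retained forward extension strand carries multiple copies of the primer binding site can be illustrated with a toy model in which the strand is a repeated unit sequence and each copy of the binding site is a possible start for a separate sequencing reaction. This is a sketch under stated assumptions: the sequences are invented, the function name `binding_site_positions` is hypothetical, and exact string matching stands in for hybridization.

```python
def binding_site_positions(strand: str, site: str) -> list[int]:
    """Return every 0-based start position of `site` within `strand`.

    Exact string matching is used as a stand-in for primer hybridization.
    """
    if not site:
        raise ValueError("binding site must be non-empty")
    positions = []
    start = strand.find(site)
    while start != -1:
        positions.append(start)
        start = strand.find(site, start + 1)  # keep searching past this hit
    return positions


# Toy tandem strand: three copies of a unit made of an insert plus a
# binding site (both sequences are invented for illustration).
INSERT = "ACGTTGCA"
SITE = "GGATCC"
TANDEM = (INSERT + SITE) * 3
```

Calling `binding_site_positions(TANDEM, SITE)` finds one site per repeat unit, so a strand with three unit copies offers three initiation points.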
[0307] In some embodiments, at least one washing step may be performed after any of the sequencing steps (a) to (i). The washing step may be performed using a washing buffer comprising a pH buffer, a metal chelating agent, a salt, and a detergent.
[0308] In some embodiments, the pH buffering compound in the wash buffer comprises any one or any combination of two or more of the following: Tris, Tris-HCl, Tricine, Bicine, Bis-Tris propane, HEPES, MES, MOPS, MOPSO, BES, TES, CAPS, TAPS, TAPSO, ACES, PIPES, ethanolamine (also known as 2-aminoethanol; MEA), citrate compounds, citrate mixtures, NaOH and / or KOH. In some embodiments, the pH buffer may be present in the wash buffer at a concentration of about 1 mM to 100 mM, or about 10 mM to 50 mM, or about 10 mM to 25 mM. In some embodiments, the pH of the pH buffer present in any of the reagents described herein may be adjusted to a pH of about 4 to 9, or about 5 to 9, or about 5 to 8.
[0309] In some embodiments, the metal chelating agent in the wash buffer includes: EDTA (ethylenediaminetetraacetic acid), EGTA (ethylene glycol tetraacetic acid), HEDTA (hydroxyethylethylenediaminetriacetic acid), DTPA (diethylenetriaminepentaacetic acid), NTA (N,N-bis(carboxymethyl)glycine), anhydrous citrate, sodium citrate, calcium citrate, ammonium citrate, diammonium citrate, citric acid, potassium citrate, or magnesium citrate. In some embodiments, the wash buffer contains a chelating agent at a concentration of about 0.01 mM to 50 mM, or about 0.1 mM to 20 mM, or about 0.2 mM to 10 mM.
[0310] In some embodiments, the salt in the wash buffer includes NaCl, KCl, (NH₄)₂SO₄, or potassium glutamate. In some embodiments, the detergent includes an ionic detergent, such as SDS (sodium dodecyl sulfate). The wash buffer may contain a monovalent salt at a concentration of about 25 mM to 500 mM, or about 50 mM to 250 mM, or about 100 mM to 200 mM.
[0311] In some embodiments, the detergent in the wash buffer includes a nonionic detergent, such as Triton X-100, Tween 20, Tween 80, or Nonidet P-40. In some embodiments, the detergent includes a zwitterionic detergent, such as CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate) or N-dodecyl-N,N-dimethyl-3-ammonio-1-propanesulfonate (DetX). In some embodiments, the detergent includes: LDS (lithium dodecyl sulfate), sodium taurodeoxycholate, sodium taurocholate, sodium glycocholate, sodium deoxycholate, or sodium cholate. In some embodiments, the detergent is included in the wash buffer at a concentration of about 0.01% to 0.05%, or about 0.05% to 0.1%, or about 0.1% to 0.15%, or about 0.15% to 0.2%, or about 0.2% to 0.25%.
[0312] Formation of immobilized open-loop library molecules with 5' flaps
[0313] In some embodiments, in the method described above for generating a plurality of nucleic acid tandem template molecules immobilized to a support, step (c) includes: (i) forming a plurality of open-loop library molecules, each having a 5' overhang flap, by contacting a plurality of immobilized clip-on capture primers (200) with a plurality of linear library molecules (100), wherein the contacting is performed under conditions suitable for hybridizing individual linear library molecules with individual immobilized clip-on capture primers to form individual open-loop library molecules, each open-loop library molecule having at least a portion of a first terminal region of a given linear library molecule hybridized with a first portion (210) of a clip-on capture primer and at least a portion of a second terminal region of the same linear library molecule hybridized with a second portion (220) of the same clip-on capture primer, wherein the 5' end of each open-loop library molecule forms a 5' overhang flap structure that can be cleaved by a structure-specific 5' flap endonuclease (e.g., Figures 28A(i), 28A(ii), 28B left, and 28B right); and (ii) cleaving the 5' overhang flap structures by contacting the plurality of open-loop library molecules with a flap-cleaving reagent under conditions suitable for cleaving the 5' overhang flap structure, thereby forming a plurality of cleavage products. In some embodiments, each cleavage product comprises an open-loop library molecule having a newly cleaved 5' end and an uncleaved 3' end, wherein the newly cleaved 5' end and the uncleaved 3' end of the same open-loop library molecule form a nick when hybridized with the first portion (210) and the second portion (220) of the same clip-on capture primer, wherein the nick can be enzymatically ligated. In some embodiments, the flap-cleaving reagent cleaves both the 5' and 3' flaps, thereby generating a plurality of cleavage products, wherein each cleavage product comprises an open-loop library molecule having a newly cleaved 5' end and a newly cleaved 3' end, wherein the newly cleaved 5' and 3' ends of the same open-loop library molecule form a nick when hybridized with the first portion (210) and the second portion (220) of the same clip-on capture primer, wherein the nick can be enzymatically ligated. In some embodiments, the flap-cleaving reagent cleaves both the 5' and 3' flaps, thereby generating a plurality of cleavage products, wherein each cleavage product comprises an open-loop library molecule having a newly cleaved 5' end and a newly cleaved 3' end, wherein the newly cleaved 5' and 3' ends of the same open-loop library molecule form a gap when hybridized with the first portion (210) and the second portion (220) of the same clip-on capture primer. In some embodiments, the gap may undergo a polymerase-catalyzed fill-in reaction to generate a nick, wherein the nick can be enzymatically ligated. In some embodiments, the gaps in individual open-loop library molecules can be closed by performing a polymerase-catalyzed gap-filling reaction that uses the newly cleaved 3' end of the library molecule as the initiation site and the immobilized clip-on capture primer as the template, thereby forming open-loop library molecules having nicks. The nicks can be closed by performing an enzymatic ligation reaction to form single-stranded covalently closed circular library molecules, wherein each covalently closed circular library molecule is hybridized with an immobilized clip-on capture primer.
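The alternatives described in this paragraph (cleavage of the 5' overhang, optional polymerase fill-in of a gap, then ligation of the nick) can be summarized as a small decision sketch. This is an illustration only: the function name `circularization_steps` and its two integer inputs are hypothetical, and the model deliberately ignores the 3' overhang and all reaction conditions.

```python
def circularization_steps(five_prime_overhang_nt: int, gap_nt: int) -> list[str]:
    """List the enzymatic steps needed to close an open-loop library molecule.

    five_prime_overhang_nt: length of the 5' overhang still present (0 if none).
    gap_nt: number of unpaired capture-primer nucleotides between the
            hybridized 3' and 5' ends after cleavage (0 means a nick).
    """
    if five_prime_overhang_nt < 0 or gap_nt < 0:
        raise ValueError("lengths must be non-negative")
    steps = []
    if five_prime_overhang_nt > 0:
        steps.append("cleave 5' overhang")
    if gap_nt > 0:
        # The 3' end primes the fill-in; the capture primer is the template.
        steps.append("fill gap by polymerase extension")
    steps.append("ligate nick")
    return steps
```

In this toy model every path ends in ligation, which matches the text: a gap is first converted into a nick, and the nick is then closed to give the covalently closed circular molecule.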
[0314] In some embodiments, the method for generating a plurality of nucleic acid tandem template molecules immobilized to a support further includes step (d): enzymatically closing nicks in a plurality of open-loop library molecules, thereby generating a plurality of covalently closed circular library molecules (400), wherein each covalently closed circular library molecule hybridizes with an immobilized splint-capture primer (200).
[0315] In some embodiments, the method for generating a plurality of nucleic acid tandem template molecules immobilized to a support further includes step (e): contacting a plurality of covalently closed circular library molecules with a rolling circle amplification reaction mixture and performing a plurality of rolling circle amplification reactions, thereby generating a plurality of immobilized nucleic acid tandem template molecules, wherein the density of the immobilized tandem template molecules is 10.5/mm² up to 10¹⁵/mm².
[0316] In some embodiments, the method for generating a plurality of nucleic acid tandem template molecules immobilized to a support further includes step (f): performing at least one sequencing reaction to determine the sequence of at least a portion of the plurality of immobilized tandem template molecules.
[0317] In some embodiments, the contacting in step (c) includes dispensing a plurality of single-stranded nucleic acid linear library molecules onto a support having a plurality of immobilized clip-on capture primers (200) and pinning primers (500). In some embodiments, the contacting in step (c) includes dispensing one type of single-stranded nucleic acid linear library molecules onto a support having a plurality of immobilized clip-on capture primers and pinning primers. In some embodiments, the contacting in step (c) includes dispensing a mixture of at least two different types of single-stranded nucleic acid linear library molecules onto a support having a plurality of immobilized clip-on capture primers and pinning primers, wherein the at least two types comprise at least a first subgroup and a second subgroup of linear library molecules, and wherein the support comprises a first subgroup and a second subgroup of immobilized clip-on capture primers (200), as shown, for example, in Figure 28B (left) and Figure 28B (right). In some embodiments, a universal binding sequence in the linear library molecule (100) for a first portion of an immobilized clip-on capture primer may hybridize with the first portion (210) of the immobilized clip-on capture primer. In some embodiments, a second universal binding site (130) in the linear library molecule (100) for a second portion of an immobilized clip-on capture primer may hybridize with the second portion (220) of the immobilized clip-on capture primer. In some embodiments, the immobilized clip-on capture primer comprises a first portion (210) and a second portion (220) that hybridize with adaptor sequences (e.g., (120) and (130)) in the linear library molecule, and the clip-on capture primer acts as a nucleic acid splint molecule for circularizing the linear library molecule (e.g., Figures 28A(i), 28A(ii), 28A(iii), 28B left, and 28B right).
[0318] In some embodiments, step (c) includes: (i) forming a first subgroup of open-loop library molecules, each having a 5' overhang flap, by contacting a first subgroup (200-A) of immobilized clip-on capture primers with a first subgroup of nucleic acid linear library molecules, wherein the contacting is performed under conditions suitable for hybridizing individual linear library molecules of the first subgroup with individual immobilized clip-on capture primers to form individual open-loop library molecules (300-A), each open-loop library molecule having at least a portion of a first terminal region of a given linear library molecule hybridized with a first portion (210-A) of a clip-on capture primer and at least a portion of a second terminal region of the same linear library molecule hybridized with a second portion (220-A) of the same clip-on capture primer, wherein the 5' end of each open-loop library molecule forms a 5' overhang flap structure that can be cleaved by a structure-specific 5' flap endonuclease (e.g., Figure 28B); and (ii) cleaving the 5' overhang flap structures by contacting the first subgroup of open-loop library molecules with a flap-cleaving reagent under conditions suitable for cleaving the 5' overhang flap structure, thereby forming a first subgroup of cleavage products, wherein each cleavage product comprises an open-loop library molecule having a newly cleaved 5' end and an uncleaved 3' end, wherein the newly cleaved 5' end and the uncleaved 3' end of the same open-loop library molecule form a nick when hybridized with the first portion (210-A) and the second portion (220-A) of the same clip-on capture primer. In some embodiments, the contacting in step (c) comprises dispensing the first subgroup of linear library molecules onto a support having a mixture of the first subgroup and the second subgroup of clip-on capture primers. In some embodiments, the contacting in step (c) comprises dispensing the first subgroup of linear library molecules onto a support having a plurality of pinning primers (500). In some embodiments, the immobilized clip-on capture primer (200-A) comprises a first portion (210-A) and a second portion (220-A) that hybridize with the adaptor sequences (120-A) and (130-A) in the linear library molecules of the first subgroup, and the clip-on capture primer (200-A) acts as a nucleic acid splint molecule for circularizing the linear library molecule (e.g., Figure 28B (left)). In some embodiments, each linear library molecule of the first subgroup includes a universal binding sequence (120-A) capable of hybridizing with the first portion (210-A) of each immobilized clip-on capture primer in the first subgroup. In some embodiments, each linear library molecule of the first subgroup includes a universal binding sequence (130-A) capable of hybridizing with the second portion (220-A) of each immobilized clip-on capture primer in the first subgroup. In some embodiments, the linear library molecules in the first subgroup are single-stranded.
[0319] In some embodiments, step (c) includes: (i) forming a second subgroup of open-loop library molecules, each having a 5' overhang flap, by contacting a second subgroup (200-B) of immobilized clip-on capture primers with a second subgroup of nucleic acid linear library molecules, wherein the contacting is performed under conditions suitable for hybridizing individual linear library molecules of the second subgroup with individual immobilized clip-on capture primers to form individual open-loop library molecules (300-B), each open-loop library molecule having at least a portion of a first terminal region of a given linear library molecule hybridized with a first portion (210-B) of a clip-on capture primer and at least a portion of a second terminal region of the same linear library molecule hybridized with a second portion (220-B) of the same clip-on capture primer, wherein the 5' end of each open-loop library molecule forms a 5' overhang flap structure that can be cleaved by a structure-specific 5' flap endonuclease (e.g., Figure 28B); and (ii) cleaving the 5' overhang flap structures by contacting the second subgroup of open-loop library molecules with a flap-cleaving reagent under conditions suitable for cleaving the 5' overhang flap structure, thereby forming a second subgroup of cleavage products, wherein each cleavage product comprises an open-loop library molecule having a newly cleaved 5' end and an uncleaved 3' end, wherein the newly cleaved 5' end and the uncleaved 3' end of the same open-loop library molecule form a nick when hybridized with the first portion (210-B) and the second portion (220-B) of the same clip-on capture primer. In some embodiments, the contacting in step (c) comprises dispensing the second subgroup of linear library molecules onto a support having a mixture of the first subgroup and the second subgroup of clip-on capture primers. In some embodiments, the contacting in step (c) comprises dispensing the second subgroup of linear library molecules onto a support having a plurality of pinning primers (500). In some embodiments, the immobilized clip-on capture primer (200-B) comprises a first portion (210-B) and a second portion (220-B) that hybridize with the adaptor sequences (120-B) and (130-B) in the linear library molecules of the second subgroup, and the clip-on capture primer (200-B) acts as a nucleic acid splint molecule for circularizing the linear library molecule (e.g., Figure 28B (right)). In some embodiments, each linear library molecule of the second subgroup includes a universal binding sequence (120-B) capable of hybridizing with the first portion (210-B) of each immobilized clip-on capture primer in the second subgroup. In some embodiments, each linear library molecule of the second subgroup includes a universal binding sequence (130-B) capable of hybridizing with the second portion (220-B) of each immobilized clip-on capture primer in the second subgroup. In some embodiments, the linear library molecules in the second subgroup are single-stranded.
[0320] In some embodiments, contacting the first subgroup (200-A) of the immobilized clip-on capture primers with the first subgroup of the nucleic acid linear library molecules and contacting the second subgroup (200-B) of the immobilized clip-on capture primers with the second subgroup of the nucleic acid linear library molecules occur simultaneously. In some embodiments, contacting the first subgroup (200-A) of the immobilized clip-on capture primers with the first subgroup of the nucleic acid linear library molecules and contacting the second subgroup (200-B) of the immobilized clip-on capture primers with the second subgroup of the nucleic acid linear library molecules occur sequentially.
[0321] In some embodiments, step (c) includes contacting a support containing a plurality of first subgroups (200-A) of fixed clip-capture primers and a plurality of second subgroups (200-B) of fixed clip-capture primers with a first subgroup or a second subgroup of a nucleic acid linear library molecule.
[0322] In some embodiments, step (c) includes contacting a support containing a plurality of first subgroups (200-A) of fixed clip-capture primers and a plurality of second subgroups (200-B) of fixed clip-capture primers with the first subgroups and the second subgroups of the nucleic acid linear library molecules substantially simultaneously or in any order.
[0323] In some embodiments of step (c), the 5' flap endonuclease includes a structure-specific 5' flap endonuclease that can cleave the 5' flap structure of single-stranded DNA or RNA. The structure-specific 5' flap endonuclease does not cleave at a specific sequence, but rather cleaves the 5' overhang flap structure. The structure-specific 5' flap endonuclease catalyzes hydrolytic cleavage of the phosphodiester bond at the junction of single-stranded and double-stranded DNA, thereby releasing the 5' overhang flap.
[0324] In some embodiments of step (c), the 5' overhang flap structure contains a nucleic acid sequence that is not complementary to the first portion (210) of the clip-on capture primer.
[0325] In some embodiments of step (c), the 5' overhang flap structure is at least 2 nucleotides in length. In any embodiment of step (c), the 5' overhang flap structure is 2 to 10 nucleotides in length.
[0326] In some embodiments of step (c), cleavage at any location on the 5' flap structure generates a cleavage product. The length of the cleavage product can be 2 to 10 nucleotides.
[0327] In some embodiments of step (c), the structure-specific 5' flap endonuclease includes flap endonuclease 1 (FEN1), RAD2 endonuclease, or XPG endonuclease.
[0328] In some embodiments of step (c), the individual open-loop library molecules lack a 3' overhang flap structure. In some embodiments, the individual open-loop library molecules further include a 3' overhang flap structure. In some embodiments, the 3' overhang flap structure includes a nucleic acid sequence complementary to the second portion (220) of the clip-on capture primer. In some embodiments, the 3' overhang flap structure is 1 nucleotide in length. In some embodiments, the 3' overhang flap structure is 2 to 10 nucleotides in length. In some embodiments, the 5' flap endonuclease does not cleave the 3' overhang flap structure.
[0329] In some embodiments of step (c), each open-loop library molecule comprises a 5' overhang flap structure of 2 to 10 nucleotides in length and a 3' overhang flap structure of 1 nucleotide in length, wherein the 5' overhang flap structure can be cleaved by a 5' flap endonuclease (e.g., Figure 28A(i)). In some embodiments, the 5' flap endonuclease can cleave both the 5' and 3' overhang flaps. In some embodiments, the 5' flap endonuclease comprises FEN1.
[0330] In some embodiments of step (c), each open-loop library molecule comprises a 5' overhang flap structure of 2 to 10 nucleotides in length and lacks a 3' overhang flap structure, wherein the 5' overhang flap structure can be cleaved by a 5' flap endonuclease (e.g., Figure 28A(ii)). In some embodiments, the 5' flap endonuclease comprises FEN1.
[0331] In some embodiments of step (c), each open-loop library molecule comprises a 5' overhang flap structure of 2 to 10 nucleotides in length and a 3' overhang flap structure of 2 to 10 nucleotides in length, wherein the 5' overhang flap structure cannot be cleaved by a 5' flap endonuclease (e.g., Figure 28A(iii)).
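The three overhang configurations in paragraphs [0329] to [0331] differ only in the length of the 3' overhang, which lends itself to a compact classification sketch. The function name `classify_overhangs` and the tuple it returns are hypothetical, and length combinations not described in those paragraphs are reported as `None` rather than guessed.

```python
from typing import Optional, Tuple


def classify_overhangs(five_prime_nt: int,
                       three_prime_nt: int) -> Optional[Tuple[str, bool]]:
    """Map 5'/3' overhang lengths (in nucleotides) to the described case.

    Returns (figure panel, whether the 5' overhang is cleavable by a
    5' endonuclease), or None for combinations the text does not describe.
    """
    if five_prime_nt < 0 or three_prime_nt < 0:
        raise ValueError("overhang lengths must be non-negative")
    if not 2 <= five_prime_nt <= 10:
        return None  # all three described cases use a 5' overhang of 2 to 10 nt
    if three_prime_nt == 1:
        return ("28A(i)", True)
    if three_prime_nt == 0:
        return ("28A(ii)", True)
    if 2 <= three_prime_nt <= 10:
        return ("28A(iii)", False)
    return None
```

Read this way, the configurations suggest that a 3' overhang of two or more nucleotides is what blocks cleavage of the 5' overhang, while a 3' overhang of zero or one nucleotide leaves it cleavable.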
[0332] In some embodiments of step (c), the flap-cleaving reagent comprises at least one 5' flap endonuclease derived from a thermophilic, eukaryotic, or archaeal organism. In some embodiments, the 5' flap endonuclease comprises a thermostable enzyme. In some embodiments, the 5' flap endonuclease comprises FEN-1.
[0333] In some embodiments of step (c), the flap-cleaving reagent comprises at least one 5' flap endonuclease derived from an archaeal species, including but not limited to *Afu FEN1* (Chapados et al., 2004 Cell 116:39–50; Hosfield et al., 1998 J. Biol. Chem. 273:27154–27161; Hosfield 1998 Cell 95:135–146; Allawi 2003 J. Mol. Biol. 328:537–554), *Mth FEN1* (*Methanobacterium thermoautotrophicum*), *Pfu FEN1* (Kaiser et al., 1999 J. Biol. Chem. 274:21387–21394), *Mja FEN1* (Hosfield et al., 1998 J. Biol. Chem. 273:27154–27161; Hwang 1998 Nature Struct. Biol. 5:707–713; Rao 1998 J. Bacteriol. 180:5406–5412; Bae 1999 Mol. Cells 9:45–48), *Pho* FEN1 (Matsui et al., 1999 J. Biol. Chem. 274:18297–18309; Matsui 2002 J. Biol. Chem. 277:37840–37847; Matsui 2014 Extremophiles 18:415–427), *Ave* FEN1, *Tko* FEN1 (Burkhart 2017 J. Bacteriol. 199:e00141-17; Muzzamal 2020 Folia Microbiol (Praha) 62:407–415), *Dam FEN1* (Mase 2011 Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun. 67:209–213), *Ape FEN1* (Collins 2004 Acta Crystallogr. D Biol. Crystallogr. 60:1674–1678), *Sulfolobus tokodaii* (Sto FEN1; Horie 2007 Biosci. Biotechnol. Biochem. 71:855–865), or *Sulfolobus solfataricus* (Sso FEN1; Beattie and Bell 2012 EMBO J. 31:1556–1567). The contents of these references are hereby expressly incorporated by reference in their entirety.
[0334] In some embodiments of step (c), the flap cleavage reagent includes a 5' flap endonuclease (9°N FEN-1) from *Thermococcus* species 9°North (e.g., from New England Biolabs, catalog number M0645S).
[0335] In some embodiments of step (c), the flap cleavage reagent comprises at least one 5' flap endonuclease derived from eukaryotes, including but not limited to mouse FEN-1 (Harrington and Lieber 1994 EMBO J. 13:1235-1246), yeast FEN1 (Harrington and Lieber 1994 Genes Dev. 8:1344-1355), and human FEN1 (Hiraoka et al., 1995 Genomics 25:220-225). The contents of these references are hereby explicitly incorporated herein by reference in their entirety.
[0336] In some embodiments of step (c), the flap cleavage reagent includes at least one family A DNA polymerase that exhibits 5' flap endonuclease activity, such as DNA polymerase I from Escherichia coli, Taq DNA polymerase, and / or Bst DNA polymerase.
[0337] In some embodiments of step (c), the flap cleavage reagent comprises one type of 5' flap endonuclease, for example, selected from any of the above-described 5' flap endonucleases.
[0338] In some embodiments of step (c), the flap cleavage reagent comprises a mixture of two or more different types of 5' flap endonucleases, for example, selected from any of the above-described 5' flap endonucleases. In some embodiments of step (c), the flap cleavage reagent comprises a mixture of a 5' flap endonuclease, for example, selected from any of the above-described 5' flap endonucleases, and a DNA polymerase exhibiting 5' flap endonuclease activity.
[0339] In some embodiments of step (c), the flap cleavage reagent includes at least one fusion enzyme, which contains a portion of at least one 5' flap endonuclease selected from any of the above-described 5' flap endonucleases.
[0340] In some embodiments of step (c), the flap cleavage reagent comprises a structure-specific 5' flap endonuclease and a solvent. In some embodiments, the solvent comprises any combination of one or more of the following: pH buffers, viscosity compounds, ammonium ions, salts, magnesium ions, detergents, reducing compounds, and / or nucleotides.
[0341] In some embodiments of step (c), the flap cleavage reagent further comprises a ligase. In some embodiments, the ligase comprises a bacteriophage DNA ligase, including T3, T4, or T7 DNA ligases. In some embodiments, the ligase comprises a thermostable DNA ligase, including Taq DNA ligase, Tfu DNA ligase, or a DNA ligase from Thermococcus nautilus. In some embodiments, the ligase comprises a recombinant thermostable T4 DNA ligase (e.g., Hi-T4 DNA ligase from New England Biolabs, catalog number M2622S).
[0342] In some embodiments of step (c), the flap cleavage reaction can be carried out at a temperature of about 45°C to 50°C, or about 50°C to 55°C, or about 55°C to 60°C, or about 60°C to 65°C, or about 65°C to 70°C.
[0343] In some embodiments of step (c), the flap cleavage reaction may be carried out at a pH of about 6.5 to 7, or about 7 to 7.5, or about 7.5 to 8, or about 8 to 8.5, or about 8.5 to 9.
[0344] In some embodiments, the flap cleavage reagent comprises a solvent containing water or an aqueous buffer solution.
[0345] In some embodiments, the flap cleavage reagent includes at least one pH buffer, including Tris (e.g., tris(hydroxymethyl)-aminomethane), Tris-HCl (e.g., tris(hydroxymethyl)-aminomethane hydrochloride), HEPES (e.g., 4-(2-hydroxyethyl)-1-piperazine ethane sulfonic acid), or MOPS (e.g., 3-(N-morpholino)propane sulfonic acid).
[0346] In some embodiments, the flap cleavage reagent comprises at least one viscosity compound, including trehalose, sucrose, cellulose, xylitol, mannitol, sorbitol, D-maltose, or inositol. In some embodiments, the viscosity compound comprises glycerol or a glycol compound such as ethylene glycol or propylene glycol (e.g., propanediol).
[0347] In some embodiments, the flap cleavage reagent includes at least one ammonium ion source, including ammonium sulfate (e.g., (NH4)2SO4) and / or ammonium acetate.
[0348] In some embodiments, the flap cleavage reagent comprises at least one salt, including NaCl, KCl, or potassium glutamate.
[0349] In some embodiments, the flap cleavage reagent includes at least one magnesium ion source, which includes MgCl2 and / or MgSO4.
[0350] In some embodiments, the flap cleavage reagent includes at least one detergent, including Tween-20, Tween-80, Triton X-100, Nonidet P-40, CHAPS (e.g., 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate) and / or DetX (e.g., N-dodecyl-N,N-dimethyl-3-ammonium-1-propane sulfate).
[0351] In some embodiments, the flap cleavage reagent comprises at least one reducing compound, including DTT (dithiothreitol), 2-β-mercaptoethanol, TCEP (tris(2-carboxyethyl)phosphine), formamide, DMSO (dimethyl sulfoxide), sodium dithionite (Na2S2O4), glutathione, methionine, betaine, tris(3-hydroxypropyl)phosphine (THPP), and / or N-acetylcysteine.
[0352] In some embodiments, the flap cleavage reagent comprises at least one nucleotide. In some embodiments, the at least one nucleotide comprises ATP.
[0353] In some embodiments, the flap cleavage reagent includes at least one ligase. In some embodiments, the ligase includes a phage DNA ligase, which includes T3 DNA ligase (e.g., SEQ ID NO:147, Figure 60), T4 DNA ligase (e.g., SEQ ID NO:148, Figure 61), or T7 DNA ligase (e.g., SEQ ID NO:149, Figure 62). In some embodiments, the ligase comprises a thermostable DNA ligase, including Taq DNA ligase, Tfu DNA ligase (e.g., SEQ ID NO:150, Figure 63), or a DNA ligase from Thermococcus nautilus (e.g., SEQ ID NO:151, Figure 64). In some embodiments, the ligase comprises recombinant thermostable T4 DNA ligase (e.g., Hi-T4 DNA ligase from New England Biolabs, catalog number M2622S).
[0354] Seeding immobilized clip-on capture primers with fresh linear library molecules
[0355] In some embodiments, at step (c) of the method for generating a plurality of nucleic acid tandem molecules immobilized to a support as described above, the support may be seeded at least once. In some embodiments, the support may be seeded multiple times with a plurality of fresh linear library molecules to generate a support having a plurality of immobilized tandem template molecules. In some embodiments, multiple seedings of the support may generate a plurality of immobilized open-loop library molecules at a density of about 10^2 / mm^2 up to 10^15 / mm^2.
[0356] In some embodiments, at step (c) of the method for generating a plurality of nucleic acid tandem molecules immobilized to a support as described above, the method includes contacting a plurality of immobilized clip-on capture primers (200) with a first reagent stream containing a first plurality of linear library molecules, wherein the contact with the first reagent stream is performed under conditions suitable for hybridizing each linear library molecule in the first plurality of linear library molecules with a respective immobilized clip-on capture primer to form a first plurality of open-loop library molecules, wherein each open-loop library molecule has a first terminal region of a linear library molecule that hybridizes with a first portion (210) of the clip-on capture primer and a second terminal region of the same linear library molecule that hybridizes with a second portion (220) of the same clip-on capture primer, and wherein each open-loop library molecule has a gap or nick between the 5' end and the 3' end of the open-loop library molecule. In some embodiments, after contact with the first reagent stream, some of the immobilized clip-on capture primers are seeded because they hybridize with the open-loop library molecules. In some embodiments, some of the immobilized clip-on capture primers are unseeded because they do not hybridize with the open-loop library molecules. In some embodiments, the linear library molecules are single-stranded. In some embodiments, it is desirable to increase the percentage of immobilized clip-on capture primers that are seeded and hybridized with open-loop library molecules by performing an additional reagent stream.
[0357] In some embodiments, the method further includes step (c1): contacting the plurality of immobilized clip-on capture primers (200) with a second reagent stream containing a second plurality of linear library molecules, wherein the contact with the second reagent stream is performed under conditions suitable for hybridizing each linear library molecule in the second plurality of linear library molecules with a respective free immobilized clip-on capture primer to form a second plurality of open-loop library molecules. In some embodiments, after the second reagent stream contacts the clip-on capture primers, some of the immobilized clip-on capture primers are seeded because they hybridize with the open-loop library molecules. In some embodiments, some of the immobilized clip-on capture primers are unseeded because they do not hybridize with the open-loop library molecules. In some embodiments, it is desirable to increase the percentage of immobilized clip-on capture primers that are seeded and hybridized with open-loop library molecules by performing another reagent stream. In some embodiments, at least a third reagent stream, at least a fourth reagent stream, or at least a fifth reagent stream is performed. In some embodiments, each reagent stream contains a plurality of linear library molecules. In some embodiments, up to ten reagent streams may be performed to increase the percentage of immobilized clip-on capture primers that are seeded and hybridized with open-loop library molecules. In some embodiments, the linear library molecules are single-stranded. In some embodiments, the linear library molecules are recycled, i.e., they have already been contacted with the capture primers in a previous reagent stream. In some embodiments, two or more seeding streams may be performed to generate a plurality of open-loop library molecules (e.g., hybridized with immobilized clip-on capture primers) at a density of about 10^2 / mm^2 up to 10^15 / mm^2.
[0358] In some embodiments, the method further includes step (d), which includes enzymatically closing the nicks and / or gaps in the immobilized open-loop library molecules using a ligation reaction mixture. Examples of step (d) are as described above.
[0359] In some embodiments, the method further includes step (e), which includes performing a rolling circle amplification reaction using a rolling circle amplification reaction mixture to generate a plurality of immobilized tandem template molecules. Examples of step (e) are as described above.
[0360] In some embodiments, the method further includes step (f), which includes sequencing the immobilized tandem template molecules. Examples of step (f) are as described above.
[0361] In some embodiments, the method further includes step (g), which includes replacing the extended forward sequencing primer strand with a forward extension strand. Examples of step (g) are as described above.
[0362] In some embodiments, the method further includes step (h), which includes removing the immobilized tandem template molecules by generating abasic sites at nucleotides having cleavable moieties in the immobilized single-stranded tandem template molecules and creating gaps at the abasic sites to generate a plurality of gap-containing single-stranded nucleic acid tandem template molecules, while retaining the plurality of immobilized forward extension strands. Examples of step (h) are as described above.
[0363] In some embodiments, the method further includes performing step (i), which includes sequencing the forward extension strands. Examples of step (i) are as described above.
[0364] Seeding immobilized clip-on capture primers with recycled linear library molecules
[0365] In some embodiments, at step (c) of the method for generating a plurality of nucleic acid tandem template molecules immobilized to a support as described above, the support may be seeded at least once with a plurality of fresh linear library molecules, and then seeded again with a plurality of recycled linear library molecules to generate a support having a plurality of immobilized tandem template molecules. In some embodiments, multiple seedings of the support may generate a plurality of immobilized open-loop library molecules at a density of about 10^2 / mm^2 up to 10^15 / mm^2.
[0366] In some embodiments, at step (c1) of the method for generating a plurality of nucleic acid tandem molecules immobilized to a support, the method includes contacting a plurality of immobilized clip-on capture primers (200) with a first reagent stream containing a first plurality of linear library molecules, wherein the contact with the first reagent stream is performed under conditions suitable for hybridizing each linear library molecule in the first plurality of linear library molecules with a respective immobilized clip-on capture primer to form a first plurality of open-loop library molecules, wherein each open-loop library molecule has a first terminal region of the respective linear library molecule that hybridizes with a first portion (210) of the clip-on capture primer, and a second terminal region of the same respective linear library molecule that hybridizes with a second portion (220) of the same clip-on capture primer, and wherein each open-loop library molecule has a gap or nick between the 5' end and the 3' end of the open-loop library molecule. In some embodiments, the linear library molecules are single-stranded.
[0367] In some embodiments, a first subgroup of the first plurality of linear library molecules hybridizes with immobilized clip-on capture primers, and a second subgroup of the first plurality of single-stranded linear library molecules does not hybridize with immobilized clip-on capture primers. In some embodiments, the second subgroup of the first plurality of single-stranded linear library molecules (e.g., unhybridized linear library molecules) may be collected and re-flowed onto the immobilized clip-on capture primers (recycled linear library molecules). In some embodiments, after contact with the first reagent stream, some of the immobilized clip-on capture primers are seeded because they hybridize with open-loop library molecules. In some embodiments, some of the immobilized clip-on capture primers are unseeded because they do not hybridize with open-loop library molecules. In some embodiments, it is desirable to increase the percentage of immobilized clip-on capture primers that are seeded and hybridized with open-loop library molecules by performing another stream. In some embodiments, the linear library molecules are single-stranded.
[0368] In some embodiments, the method further includes step (c2): performing a recycling stream by contacting the plurality of immobilized clip-on capture primers (200) with a second stream containing unhybridized linear library molecules (i.e., recycled linear library molecules) from the first plurality of linear library molecules, wherein the contact with the second stream is performed under conditions suitable for hybridizing each linear library molecule in the second stream with a respective free immobilized clip-on capture primer (i.e., a clip-on capture primer that has not yet hybridized with a linear library molecule) to form a second plurality of open-loop library molecules. In some embodiments, after contact with the second stream, some of the immobilized clip-on capture primers are seeded because they hybridize with the open-loop library molecules. In some embodiments, some of the immobilized clip-on capture primers are unseeded because they do not hybridize with the open-loop library molecules. It may be desirable to increase the percentage of immobilized clip-on capture primers that are seeded and hybridized with open-loop library molecules by performing another recycling stream. In some embodiments, at least a third recycling stream, at least a fourth recycling stream, or at least a fifth recycling stream is performed. In some embodiments, up to ten recycling streams can be performed to increase the percentage of immobilized clip-on capture primers that are seeded and hybridized with open-loop library molecules.
[0369] In some embodiments, the percentage of immobilized clip-on capture primers (200) that hybridize with linear library molecules (e.g., seeded clip-on capture primers) can be increased by performing any number of reagent streams with fresh linear library molecules and / or any number of recycling streams with recycled linear library molecules. In some embodiments, streams with fresh linear library molecules and / or recycling streams with recycled linear library molecules can be performed in any order and in any combination.
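The effect of combining fresh and recycling streams in any order can be illustrated with a minimal sketch. The model below is an assumption made purely for illustration and is not taken from the disclosure: each stream is assumed to hybridize a fixed fraction (`capture_eff`, a hypothetical parameter) of whichever is limiting, the free capture primers or the molecules in the stream, and the unhybridized molecules of a stream are assumed to be fully collected for the next recycling stream. The function name `run_streams` and all numeric values are likewise hypothetical.

```python
def run_streams(n_primers, fresh_pool_size, schedule, capture_eff=0.3):
    """Return the seeded fraction of capture primers after each stream.

    Hypothetical toy model: a stream hybridizes capture_eff * min(free
    primers, molecules in the stream); unhybridized molecules are collected
    and become the input of the next "recycle" stream.
    """
    if not 0.0 <= capture_eff <= 1.0:
        raise ValueError("capture_eff must be between 0 and 1")
    free = n_primers   # capture primers not yet hybridized
    recycled = 0       # unhybridized molecules collected from the last stream
    history = []
    for kind in schedule:
        if kind == "fresh":
            in_stream = fresh_pool_size
        elif kind == "recycle":
            in_stream = recycled
        else:
            raise ValueError(f"unknown stream type: {kind!r}")
        hybridized = round(capture_eff * min(free, in_stream))
        free -= hybridized
        recycled = in_stream - hybridized
        history.append(1 - free / n_primers)
    return history


if __name__ == "__main__":
    # One fresh stream followed by two recycling streams (illustrative values).
    for i, frac in enumerate(run_streams(10_000, 10_000,
                                         ["fresh", "recycle", "recycle"]), 1):
        print(f"after stream {i}: {frac:.1%} of capture primers seeded")
```

Under these assumptions the seeded fraction can only stay flat or rise with each additional stream, which is the qualitative behavior the paragraphs above rely on; the actual per-stream efficiency would have to be measured.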
[0370] In some embodiments, the method further includes step (d), which includes enzymatically closing the nicks and / or gaps in the immobilized open-loop library molecules using a ligation reaction mixture. Examples of step (d) are as described above.
[0371] In some embodiments, the method further includes step (e), which includes performing a rolling circle amplification reaction using a rolling circle amplification reaction mixture to generate a plurality of immobilized tandem template molecules. Examples of step (e) are as described above.
[0372] In some embodiments, the method further includes step (f), which includes sequencing the immobilized tandem template molecules. Examples of step (f) are as described above.
[0373] In some embodiments, the method further includes step (g), which includes replacing the extended forward sequencing primer strand with a forward extension strand. Examples of step (g) are as described above.
[0374] In some embodiments, the method further includes step (h), which includes removing the immobilized tandem template molecules by generating abasic sites at nucleotides having cleavable moieties in the immobilized single-stranded tandem template molecules and creating gaps at the abasic sites to generate a plurality of gap-containing single-stranded nucleic acid tandem template molecules, while retaining the plurality of immobilized forward extension strands. Examples of step (h) are as described above.
[0375] In some embodiments, the method further includes performing step (i), which includes sequencing the forward extension strands. Examples of step (i) are as described above.
[0376] Batch sequencing and re-inoculation with sequencing interruption
[0377] For massively parallel nucleic acid sequencing, limitations in optical resolution hinder the ability to perform highly multiplexed sequencing. Batch-specific sequencing enables the sequencing of desired subsets (e.g., batches) of template molecules immobilized in the same flow cell using selected batch-specific sequencing primers, reducing overcrowded signals and images. The use of batch-specific sequencing primers can generate clear and resolvable optical images. The batch-specific sequencing methods described herein have many applications. For example, the number of spots that are imaged and associated with sequencing can be counted. The counted spots can be used as a measure of the level of target nucleic acids in a sample.
[0378] The term "batch-specific sequencing primer binding site" and other related terms refer to a predetermined sequencing primer binding site linked to an insert region (e.g., target sequence) in a library molecule. Alternatively, the batch-specific sequencing primer binding site can be linked to a batch-specific barcode sequence adjacent to the insert region. The library molecule can undergo rolling circle amplification to generate a tandem template molecule carrying a complementary sequence of the library molecule. The tandem template molecule can serve as the nucleic acid template molecule to be sequenced. In a mixture of different tandem template molecules, the batch-specific sequencing primer binding site facilitates the sequencing of subgroups of the tandem template molecules. For example, a mixture of different subgroups of tandem template molecules immobilized to the same support can be sequenced separately at different times using different batch-specific sequencing primers that hybridize with their homologous batch-specific sequencing primer binding sites. In some embodiments, the mixture of tandem template molecules contains at least a first subgroup and a second subgroup of tandem template molecules.
[0379] The tandem template molecules of the first subgroup can share the same first-batch specific sequencing primer binding sequence. The first-batch specific sequencing primer binding site can selectively hybridize with its homologous first-batch sequencing primer to sequence the first subgroup of tandem template molecules carrying the first-batch specific sequencing primer binding site. In some embodiments, the first-batch sequencing primer can be used to sequence only the insert region. In some embodiments, the first-batch sequencing primer can be used to sequence only the batch-specific barcode sequence. In some embodiments, the first-batch sequencing primer can be used to sequence both the batch-specific barcode sequence and the insert region.
[0380] The tandem template molecules of the second subgroup can share the same second batch-specific sequencing primer binding sequence. The second batch-specific sequencing primer binding site can selectively hybridize with its homologous second batch sequencing primer to sequence the second subgroup of the tandem template molecules carrying the second batch-specific sequencing primer binding site. In some embodiments, the second batch sequencing primer can be used to sequence only the insert region. In some embodiments, the second batch sequencing primer can be used to sequence only the batch-specific barcode sequence. In some embodiments, the second batch sequencing primer can be used to sequence both the batch-specific barcode sequence and the insert region. This disclosure provides methods for generating multiple nucleic acid tandem molecules immobilized to a support and performing separate batch sequencing on the support. In some embodiments, any massively parallel sequencing technology can be used to sequence the separate batches. In some embodiments, multiple subgroups of the tandem template molecules are immobilized to a support, including at least a first subgroup and a second subgroup. In some embodiments, the first subgroup of the tandem template molecules undergoes a first sequencing reaction (e.g., a first batch sequencing) and the support is imaged to detect the first sequencing reaction, wherein the second subgroup of the template molecules does not undergo a sequencing reaction. In some embodiments, the second subgroup of the tandem template molecules undergoes a second sequencing reaction (e.g., a second batch sequencing) and the same region of the support is imaged to detect the second sequencing reaction, wherein the first subgroup of the template molecules does not undergo a sequencing reaction. Therefore, both the first and second subgroups of the tandem template molecules undergo batch sequencing. In some embodiments, the first and second subgroups of the tandem template molecules are located in the same region of the support, and this region is imaged in both the first and second batch sequencing.
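The selection logic of batch-specific sequencing can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the disclosed workflow: the `Template` record, the primer-site labels `"SP1"` and `"SP2"`, and the idea of modeling a sequencing run as reading the first `cycles` bases of the insert are all hypothetical stand-ins. The point shown is only that each batch primer reads the subgroup carrying its matching binding site while the other subgroup on the same support stays unread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    spot_id: int       # position of the immobilized template on the support
    primer_site: str   # batch-specific sequencing primer binding site (label)
    insert: str        # insert region (target sequence)


def batch_for_primer(templates, primer_site):
    """Select the subgroup whose binding site matches the batch primer."""
    return [t for t in templates if t.primer_site == primer_site]


def run_batches(templates, primer_order, cycles):
    """Sequence one subgroup per batch; other subgroups stay unread."""
    reads = {}
    for primer in primer_order:
        reads[primer] = {
            t.spot_id: t.insert[:cycles]   # toy stand-in for a sequencing read
            for t in batch_for_primer(templates, primer)
        }
    return reads


if __name__ == "__main__":
    support = [
        Template(1, "SP1", "ACGTACGT"),
        Template(2, "SP2", "TTTTCCCC"),
        Template(3, "SP1", "GGGGAAAA"),
    ]
    print(run_batches(support, ["SP1", "SP2"], cycles=4))
```

Because every spot is read in exactly one batch, neighboring spots that belong to different subgroups never emit signal in the same imaging round, which is the sense in which batching reduces crowding.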
[0381] In some embodiments, multiple subgroups of tandem template molecules are immobilized to a support at high density, wherein at least some of the immobilized tandem template molecules in the first and second subgroups are nearest-neighbor tandem template molecules that are in contact with and / or overlap with each other when viewed from any angle of the support (including above, below, or to the side of the support).
[0382] In some embodiments, the support comprises a plurality of tandem template molecules immobilized at predetermined locations on the support (e.g., a patterned support). In some embodiments, the support comprises a plurality of tandem template molecules immobilized at random and non-predetermined locations on the support. In some embodiments, the support comprises a mixture of at least two subgroups of tandem template molecules immobilized at random and non-predetermined locations on the support. In some embodiments, the support lacks any contours (e.g., pores, protrusions, etc.) arranged in a predetermined pattern. In some embodiments, the support lacks contours that include features serving as sites for attaching nucleic acid tandem template molecules. In some embodiments, the support lacks gap regions arranged in a predetermined pattern, wherein the gap regions are sites designed not to have attached tandem template molecules. In some embodiments, the support lacks features that can be fabricated using photochemistry, photolithography, or micron- or nano-scale printing.
[0383] In some embodiments, each tandem template molecule in a given subgroup of tandem template molecules comprises a target sequence and a batch-specific sequencing primer binding site sequence corresponding to the target sequence or corresponding to a tandem template molecule in the given subgroup. In some embodiments, each tandem template molecule in a given subgroup of tandem template molecules further comprises a batch barcode sequence corresponding to the target sequence or corresponding to a tandem template molecule in the given subgroup. In some embodiments, a predetermined batch sequencing primer binding site sequence may be linked to a given target sequence, so that the predetermined batch sequencing primer binding site sequence corresponds to the given target sequence. In some embodiments, a predetermined batch barcode sequence may be linked to a given target sequence, so that the predetermined batch barcode sequence corresponds to the given target sequence. In some embodiments, tandem template molecules within a given subgroup have the same target sequence. In some embodiments, tandem template molecules within a given subgroup have different target sequences. In some embodiments, tandem template molecules within a given subgroup have the same batch barcode sequence. In some embodiments, tandem template molecules within a given subgroup have the same sequencing primer binding site sequence. Therefore, different subgroups of tandem template molecules can be batch-sequenced using batch-specific sequencing primers.
[0384] In some embodiments, the target sequence is sequenced. In other embodiments, the target sequence does not need to be sequenced. Instead, the target barcode can be sequenced by performing a small number of sequencing cycles to reveal the target barcode corresponding to its target sequence.
[0385] In some embodiments, each tandem template molecule in a given subgroup of tandem template molecules further includes a sample index sequence, which can be used to distinguish the target sequence obtained from different sample sources in multiple determinations. In some embodiments, tandem template molecules within a given subgroup have the same or different sample index sequences.
[0386] In some embodiments, the target sequence does not need to be sequenced. Instead, the target barcode and / or sample index can be sequenced by performing a small number of sequencing cycles to reveal the target barcode corresponding to its target sequence and the sample index corresponding to the sample source of the target sequence. In some embodiments, the tandem template molecule lacks a sample index and the target barcode can serve as the sample index. In some embodiments, batch-specific sequencing includes performing no more than 200 sequencing cycles, no more than 150 sequencing cycles, no more than 100 sequencing cycles, no more than 50 sequencing cycles, no more than 25 sequencing cycles, or no more than 10 sequencing cycles.
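As an illustration of reading only the target barcode with a small number of cycles and then counting spots per target, a minimal sketch follows. The barcode sequences, target names, and the function `count_targets` are hypothetical and chosen only for the example; the sketch also assumes error-free reads and exact barcode matching, which a real analysis would not.

```python
from collections import Counter


def count_targets(spot_reads, barcode_to_target, barcode_len):
    """Count imaged spots per target from short barcode reads.

    spot_reads: iterable of read strings, one per imaged spot.
    barcode_to_target: mapping from barcode sequence to target name
    (hypothetical lookup table). Reads whose first barcode_len bases
    are not in the table are ignored.
    """
    counts = Counter()
    for read in spot_reads:
        target = barcode_to_target.get(read[:barcode_len])
        if target is not None:
            counts[target] += 1
    return counts


if __name__ == "__main__":
    table = {"ACGT": "target_A", "TTAG": "target_B"}   # illustrative barcodes
    reads = ["ACGTGG", "TTAGCA", "ACGTTT", "GGGGGG"]   # only 4 cycles are used
    print(count_targets(reads, table, barcode_len=4))
```

Here the spot count per target serves as the readout, so the number of sequencing cycles only needs to cover the barcode length rather than the full insert.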
[0387] In some embodiments, identical portions of the various tandem template molecules can be re-sequenced (e.g., repeatedly sequenced) from the same starting position to generate overlapping sequencing reads that can be aligned with a reference sequence. For example, identical portions of the various tandem template molecules can be sequenced at least two, three, four, five, or up to 50 times. The starting sequencing site can be any location within the tandem template molecule and is determined by sequencing primers designed to anneal to a selected location within the tandem template molecule. In some embodiments, target barcodes (or target barcodes and sample indexes) can be repeatedly sequenced by performing a small number of sequencing cycles on the target barcode region (or target barcode and sample index region) of a given tandem template molecule. Repeated sequencing reads increase the redundancy of sequencing information for individual bases in the tandem template molecule. Repeated sequencing of one strand of the tandem template molecule provides sufficient base coverage, thus eliminating the need to sequence the complementary strand.
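One way to see how repeated reads from the same starting position add redundancy per base is a simple per-position majority vote. The sketch below is an assumed illustration only; the disclosure does not specify how repeated reads are combined, and the function name `consensus` is hypothetical. Ties are resolved in favor of the base seen first at that position.

```python
from collections import Counter


def consensus(reads):
    """Per-position majority vote over repeated reads of the same region.

    All reads are assumed to start at the same position; the consensus is
    truncated to the shortest read.
    """
    if not reads:
        return ""
    length = min(len(r) for r in reads)
    called = []
    for i in range(length):
        base, _count = Counter(r[i] for r in reads).most_common(1)[0]
        called.append(base)
    return "".join(called)


if __name__ == "__main__":
    repeated = ["ACGT", "ACGA", "ACGT"]   # third base call differs in one read
    print(consensus(repeated))
```

With three or more reads, an isolated miscall in one read is outvoted by the others, which is the redundancy effect described above.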
[0388] In some embodiments, after sequencing the first and / or second subgroups of the tandem template molecules, the support can be re-inoculated at least once with an additional subgroup (e.g., a third subgroup) of linear library molecules. This can be used to generate a third batch of tandem template molecules that undergo additional batch sequencing. In some embodiments, an ongoing batch sequencing run can be stopped (e.g., interrupted) before completion to allow re-inoculation of the support with an additional subgroup (e.g., a third subgroup) of linear library molecules or tandem template molecules, and the interrupted batch sequencing can then be resumed. Therefore, the support can be re-inoculated at any time and / or before the completion of the previous sequencing batch.
[0389] In some embodiments, the support comprises a plurality of tandem template molecules initially fixed at a low density, wherein most of the nearest neighboring tandem template molecules do not contact each other and / or do not overlap each other. In some embodiments, the initial low-density support comprises a plurality of fixed tandem template molecules having gap spaces between the fixed template molecules.
[0390] In some embodiments, the same support may undergo a first re-inoculation with additional linear library molecules, wherein the re-inoculated linear library molecules undergo amplification to generate additional tandem template molecules, such that the first re-inoculation density has some nearest-neighbor tandem template molecules that are in contact with and / or overlap with each other (e.g., 10%-30% of the first immobilized re-inoculated template molecules). In some embodiments, the resulting first re-inoculated support comprises a plurality of immobilized tandem template molecules that have a reduced number of gap spaces (and / or a reduced gap space size) between the immobilized tandem template molecules compared to the initial low-density support.
[0391] In some embodiments, the same support may undergo a second re-inoculation with additional linear library molecules, which undergo amplification to generate more tandem template molecules, such that the second re-inoculation density has an increased number of nearest-neighbor tandem template molecules that are in contact with and / or overlap with each other (e.g., 25%-50% or more of the first immobilized re-inoculated template molecules). In some embodiments, the resulting second re-inoculated support comprises a plurality of immobilized tandem template molecules that have a further reduced number of gap spaces (and / or a further reduced gap space size) between the immobilized tandem template molecules compared to the first re-inoculated density support. In some embodiments, the support may undergo multiple re-inoculation workflows to increase the number of nearest-neighbor tandem template molecules that are in contact with and / or overlap with each other.
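How the share of contacting or overlapping nearest neighbors might grow with each re-inoculation on a randomly seeded support can be explored with a small simulation. Everything in this sketch is an assumption for illustration: templates are modeled as equal-sized discs of hypothetical radius `radius` placed uniformly at random in a unit square, and two templates are treated as touching when their centers are within two radii. The numbers it prints are not measurements and should not be read against the percentage ranges given above.

```python
import math
import random


def random_spots(n, seed=0):
    """Place n template centers uniformly at random in a unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


def overlap_fraction(spots, radius):
    """Fraction of spots that touch or overlap at least one other spot."""
    if len(spots) < 2:
        return 0.0
    limit = 2 * radius   # discs touch when centers are within two radii
    touching = 0
    for i, (xi, yi) in enumerate(spots):
        for j, (xj, yj) in enumerate(spots):
            if i != j and math.hypot(xi - xj, yi - yj) <= limit:
                touching += 1
                break
    return touching / len(spots)


if __name__ == "__main__":
    # Initial seeding followed by two re-inoculations (illustrative counts).
    for n in (50, 200, 800):
        frac = overlap_fraction(random_spots(n), radius=0.01)
        print(f"{n} templates: {frac:.0%} touch or overlap a neighbor")
```

The pairwise comparison is quadratic in the number of spots, which is acceptable for a sketch of this size; a spatial index would be the usual choice for realistic densities.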
[0392] The methods described herein employ batch sequencing of high-density immobilized tandem template molecules, which offers the advantage of maximizing the use of space on the support (e.g., flow cell). Furthermore, the same seeded support can be reused by re-inoculating the support to generate additional immobilized tandem template molecules and performing additional sequencing reactions on the re-inoculated tandem template molecules.
[0393] Tandem template molecules arranged in a predetermined manner on a support (e.g., a patterned support) can be used for batch sequencing. Alternatively, tandem template molecules arranged randomly on a support can be used for batch sequencing, which avoids the need to fabricate supports with organized and predetermined characteristics for attaching tandem template molecules (e.g., no need to fabricate via photolithography).
[0394] By using short sequencing reads of the target barcode region of the tandem template molecule, batch sequencing also significantly reduces sequencing run time, reagent usage, and reagent costs.
[0395] When short sequencing reads of the target barcode region are obtained in an iterative manner, there is no need to assemble the sequencing reads or obtain the full-length target sequence, which reduces the need for long assembly computations. Moreover, the redundant sequencing information obtained from the short sequencing reads avoids the need to sequence the complementary strand of the tandem template molecule, thus eliminating the need for paired sequencing.
[0396] Batch sequencing also offers the flexibility to re-inoculate the support at any time between different sequencing batches, or to interrupt an ongoing sequencing batch to allow for re-inoculation and then resume that batch. The ability to re-inoculate the support at any time can increase throughput and efficiency.
[0397] Using immobilized tandem template molecules for batch sequencing offers advantages over single-copy template molecules (e.g., single-copy template molecules generated via bridge amplification). For example, tandem template molecules carry multiple sequencing primer binding sites along the same tandem template molecule. Multiple sequencing primer binding sites can be used to generate multiple sequencing reads to increase sequencing depth. In summary, repeatedly sequencing one strand of a tandem template increases sequencing base coverage and sequencing depth compared to sequencing single-copy template molecules.
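One way to see why repeated short reads of the same strand add depth is a per-position majority vote across the reads. The sketch below is illustrative only: the `consensus` helper, the equal-length assumption and the example reads are hypothetical and are not taken from the disclosure.

```python
from collections import Counter

def consensus(reads):
    """Return the majority base at each position across equal-length reads."""
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must have equal length")
    # zip(*reads) yields one column of bases per read position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

# Hypothetical repeated reads of one barcode region; two reads carry a single error each.
repeated_reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAA"]
print(consensus(repeated_reads))
```

With four reads covering every position, an isolated error in one read is outvoted by the other three, which is the sense in which redundant reads of one strand can stand in for reading the complementary strand.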
[0398] Batch sequencing has many uses, including but not limited to detecting specific target nucleic acids, mutant nucleic acid sequences, splice variants and their abundance levels.
[0399] Batch sequencing
[0400] In some embodiments, in the method for generating a plurality of nucleic acid tandem template molecules immobilized to a support as described above, step (a) includes providing a support having a plurality of clip-capture primers (200) and a plurality of pinning primers (500) immobilized thereon. In some embodiments, the plurality of immobilized clip-capture primers and pinning primers can be used for batch sequencing as described below.
[0401] In some embodiments, in step (a), a plurality of clip-on capture primers (200) contain the same sequence. In some embodiments, a plurality of clip-on capture primers having the same sequence can hybridize / capture different linear library molecules carrying the same clip-on capture primer binding site sequence.
[0402] In some embodiments, in step (a), the plurality of clip-capture primers (200) comprise a plurality of clip-capture primer subgroups, which include at least a first subgroup and a second subgroup of clip-capture primers. In some embodiments, the clip-capture primers in at least the first subgroup and the second subgroup have different sequences. In some embodiments, the clip-capture primers in at least the first subgroup and the second subgroup can hybridize / capture different linear library molecules carrying different clip-capture primer binding site sequences.
[0403] In some embodiments, in step (a), a plurality of clip-capture primers (200) fixed to a support comprise different types of clip-capture primers, including at least a first subgroup and a second subgroup of clip-capture primers with different sequences, wherein the different types of clip-capture primers bind to different types of linear library molecules.
[0404] In some embodiments, each of the fixed clip-on capture primers in the first subgroup (200-A) comprises a first portion (210-A) and a second portion (220-A), wherein the first portion (210-A) of the clip-on capture primer binds to a first universal binding site (120-A) in the first linear library molecule and the second portion (220-A) of the clip-on capture primer binds to a second universal binding site (130-A) in the same linear library molecule.
[0405] In some embodiments, each of the fixed clip-on capture primers in the second subgroup (200-B) comprises a first portion (210-B) and a second portion (220-B), wherein the first portion (210-B) of the clip-on capture primer binds to a first universal binding site (120-B) in the second linear library molecule, and the second portion (220-B) of the clip-on capture primer binds to a second universal binding site (130-B) in the same second linear library molecule.
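The subgroup-specific capture described above can be sketched as a sequence-matching check. This is a simplified illustration under explicit assumptions: all sequences are invented, exact reverse-complement matching stands in for hybridization, and the first universal binding site is assumed to sit at the 5' end of the linear library molecule with the second at the 3' end. None of these details, nor the helper names, come from the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def captures(first_portion, second_portion, molecule):
    """True if both portions of a capture primer can pair with the two ends
    of the same linear molecule (assumed: first site at the 5' end,
    second site at the 3' end)."""
    site_5p = reverse_complement(first_portion)
    site_3p = reverse_complement(second_portion)
    return molecule.startswith(site_5p) and molecule.endswith(site_3p)

# Hypothetical primer subgroups: name -> (first portion, second portion).
PRIMERS = {
    "200-A": ("AAGGTC", "CTTGCA"),
    "200-B": ("GGATTC", "TACGGA"),
}

def build_molecule(first_portion, second_portion, insert):
    """Build a toy linear library molecule whose ends match a given primer."""
    return reverse_complement(first_portion) + insert + reverse_complement(second_portion)

def matching_subgroups(molecule):
    """Names of the primer subgroups that could capture the molecule."""
    return [name for name, (p1, p2) in PRIMERS.items() if captures(p1, p2, molecule)]

mol_a = build_molecule(*PRIMERS["200-A"], insert="ACGTACGT")
mol_b = build_molecule(*PRIMERS["200-B"], insert="TTGACCAA")
print(matching_subgroups(mol_a), matching_subgroups(mol_b))
```

Because both portions of one primer must pair with the same molecule, a molecule carrying the binding sites for subgroup A is not captured by subgroup B primers in this model, and vice versa.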
[0406] In some embodiments, in step (a), the support further includes a plurality of features located at random, non-predetermined positions on the support (e.g., Figure 15A), wherein the features serve as sites for attaching a plurality of splint-capture primers (200) and a plurality of pinning primers (500).
[0407] In some embodiments, in step (a), the support further includes features located at predetermined positions on the support (e.g., Figures 16A and 16B), wherein the features serve as sites for attaching a plurality of splint-capture primers and a plurality of pinning primers.
[0408] In some embodiments, in step (a), the support is passivated with at least one polymer coating comprising a plurality of clamp-capture primers (200) and pinning primers (500) covalently tethered to at least one polymer layer.
[0409] In some embodiments, in step (a), a plurality of clip-capture primers (200) and a plurality of pinning primers (500) are randomly distributed throughout at least one polymer layer and embedded within the at least one polymer layer (e.g., Figures 14 and 15A).
[0410] In some embodiments, in step (a), the support lacks any contours (e.g., holes, protrusions, etc.) arranged in a predetermined pattern, wherein the contours have features as sites for attaching a plurality of clamp-capture primers (200) and a plurality of pinning primers (500). In some embodiments, the support lacks gap regions arranged in a predetermined pattern, wherein the gap regions are sites designed not to have attached clamp-capture primers or pinning primers.
[0411] In some embodiments, in step (a), a plurality of clamp-capturing primers (200) and a plurality of pinning primers (500) are located at predetermined positions within at least one polymer layer (e.g., Figures 14, 16A and 16B).
[0412] In some embodiments, in step (a), the support further includes contours (e.g., holes, protrusions, etc.) arranged in a predetermined pattern, wherein the contours have features as sites for attaching a plurality of splint-capturing primers (200) and a plurality of pinning primers (500). In some embodiments, the support further includes...
Claims
1. A method for generating a plurality of nucleic acid tandem template molecules immobilized to a support, the method comprising:
a) providing a support having a plurality of clip-on capture primers (200) and a plurality of pinning primers (500) immobilized thereon, wherein each of the plurality of clip-on capture primers (200) comprises a first portion (210) that binds to a first universal binding site in a linear library molecule (100) and a second portion (220) that binds to a second universal binding site in the same linear library molecule (100), wherein the density of the clip-on capture primers (200) on the support is between 10⁵/mm² and 10¹⁵/mm², and wherein each of the plurality of pinning primers (500) binds to at least a portion of a respective tandem template molecule, and each pinning primer (500) includes a non-extendable terminal 3' end;
b) providing a plurality of linear library molecules (100), wherein each of the plurality of linear library molecules contains a target sequence and any one adaptor sequence, or any combination of two or more adaptor sequences, in any order, wherein the adaptor sequences comprise: (i) a first universal binding site (120), or its complementary sequence, for the first portion of the clip-on capture primer; (ii) a universal binding site (123), or its complementary sequence, for a first non-splice capture primer; (iii) at least one sample index sequence comprising a left sample index sequence (160) and / or a right sample index sequence (170), wherein the left sample index sequence (160) and / or the right sample index sequence (170) distinguish target sequences obtained from different sample sources in multiplex assays; (iv) at least one universal binding site (140), or its complementary sequence, for a forward sequencing primer; (v) at least one universal binding site (150), or its complementary sequence, for a reverse sequencing primer; (vi) at least one universal binding site, or its complementary sequence, for a compacting oligonucleotide; (vii) at least one unique molecular index (UMI) sequence comprising a left unique molecular index sequence (180) and / or a right unique molecular index sequence (190) that can be used to uniquely identify the linear library molecule (100) to which the unique molecular index sequence is attached; (viii) at least one universal binding site, or its complementary sequence, for the pinning primer; (ix) at least one batch-specific barcode sequence; (x) a universal binding site (133), or its complementary sequence, for a second non-splice capture primer; (xi) at least one short random sequence that is about 3 to 20 nucleotides in length and provides nucleotide sequence diversity; and / or (xii) a second universal binding site (130), or its complementary sequence, for the second portion of the immobilized clip-on capture primer;
c) contacting the plurality of clip-on capture primers (200) with the plurality of linear library molecules (100), wherein the contacting is performed under conditions suitable for hybridizing each linear library molecule to a respective clip-on capture primer to form a respective open circular library molecule (300) having a nick or gap between the 5' and 3' ends of the open circular library molecule (300), wherein each linear library molecule contains a first universal binding site (120) that hybridizes to the first portion (210) of the clip-on capture primer, and the same linear library molecule (100) contains a second universal binding site (130) that hybridizes to the second portion (220) of the same clip-on capture primer, thereby generating a plurality of open circular library molecules (300) containing nicks or gaps;
d) enzymatically closing the nicks or gaps in the plurality of open circular library molecules, thereby generating a plurality of covalently closed circular library molecules (400), wherein each covalently closed circular library molecule is hybridized to a clip-on capture primer (200);
e) contacting the plurality of covalently closed circular library molecules (400) with a rolling circle amplification reaction mixture and performing a rolling circle amplification reaction, thereby generating the plurality of tandem template molecules, wherein the plurality of tandem template molecules are immobilized on the support, wherein the density of the tandem template molecules on the support is between 10⁵/mm² and 10¹⁵/mm², wherein the rolling circle amplification reaction mixture contains a strand displacement polymerase and a mixture of nucleotides comprising dATP, dGTP, dCTP, dTTP, and dUTP, wherein the rolling circle amplification reaction mixture contains a plurality of single-stranded nucleic acid compacting oligonucleotides, wherein the 5' and 3' regions of each single-stranded nucleic acid compacting oligonucleotide hybridize to universal binding sites on a respective tandem template molecule, thereby pulling distal portions of the tandem template molecule together and causing the tandem template molecule to compact to form a DNA nanosphere, wherein the 3' end of each single-stranded nucleic acid compacting oligonucleotide is non-extendable, and wherein at least one portion of each tandem template molecule hybridizes to a pinning primer immobilized on the support; and
f) performing at least one sequencing reaction to determine the sequence of at least a portion of the plurality of tandem template molecules.
2. A method for generating a plurality of nucleic acid tandem template molecules immobilized to a support, the method comprising:
a) providing a support having a plurality of clip-on capture primers (200) and a plurality of pinning primers (500) immobilized thereon, wherein each of the plurality of clip-on capture primers (200) comprises a first portion (210) that binds to a first universal binding site in a linear library molecule (100) and a second portion (220) that binds to a second universal binding site in the same linear library molecule (100), wherein the density of the clip-on capture primers (200) on the support is between 10⁵/mm² and 10¹⁵/mm², and wherein each of the plurality of pinning primers (500) binds to at least a portion of a respective tandem template molecule, and each pinning primer (500) includes a non-extendable terminal 3' end;
b) providing a plurality of linear library molecules (100), wherein each of the plurality of linear library molecules contains a target sequence and any one adaptor sequence, or any combination of two or more adaptor sequences, in any order, wherein the adaptor sequences comprise: (i) a first universal binding site (120), or its complementary sequence, for the first portion of the clip-on capture primer; (ii) a universal binding site (123), or its complementary sequence, for a first non-splice capture primer; (iii) at least one sample index sequence comprising a left sample index sequence (160) and / or a right sample index sequence (170) that distinguishes target sequences obtained from different sample sources in multiplex assays; (iv) at least one universal binding site (140), or its complementary sequence, for a forward sequencing primer; (v) at least one universal binding site (150), or its complementary sequence, for a reverse sequencing primer; (vi) at least one universal binding site, or its complementary sequence, for a compacting oligonucleotide; (vii) at least one unique molecular index (UMI) sequence comprising a left unique molecular index sequence (180) and / or a right unique molecular index sequence (190) that can be used to uniquely identify the linear library molecule (100) to which the unique molecular index sequence is attached; (viii) at least one universal binding site, or its complementary sequence, for the pinning primer; (ix) at least one batch-specific barcode sequence; (x) a universal binding site (133), or its complementary sequence, for a second non-splice capture primer; (xi) at least one short random sequence (132) that is about 3 to 20 nucleotides in length and provides nucleotide sequence diversity; and / or (xii) a second universal binding site (130), or its complementary sequence, for the second portion of the immobilized clip-on capture primer;
c) (i) contacting the plurality of clip-on capture primers (200) with the plurality of linear library molecules (100), wherein the contacting is performed under conditions suitable for hybridizing each linear library molecule (100) to a respective clip-on capture primer (200) to form a respective open circular library molecule (300) containing a 5' overhanging flap structure, and (ii) contacting the 5' overhanging flap structure with a flap cleavage reagent under conditions suitable for cleaving the 5' overhanging flap structure to generate a plurality of open circular library molecules, each open circular library molecule having a newly cleaved 5' end and an uncleaved 3' end and containing a nick between the 5' end and the 3' end of the open circular library molecule, wherein each linear library molecule contains at least a portion of the first universal binding site (120) that hybridizes to the first portion (210) of the clip-on capture primer, and the same linear library molecule contains at least a portion of the second universal binding site (130) that hybridizes to the second portion (220) of the same clip-on capture primer;
d) enzymatically closing the nicks in the plurality of open circular library molecules (300) to generate a plurality of covalently closed circular library molecules (400), wherein each covalently closed circular library molecule is hybridized to a clip-on capture primer (200);
e) contacting the plurality of covalently closed circular library molecules (400) with a rolling circle amplification reaction mixture and performing a rolling circle amplification reaction, thereby generating a plurality of tandem template molecules immobilized on the support, wherein the density of the tandem template molecules on the support is between 10⁵/mm² and 10¹⁵/mm², wherein the rolling circle amplification reaction mixture contains a strand displacement polymerase and a mixture of nucleotides comprising dATP, dGTP, dCTP, dTTP, and dUTP, wherein the rolling circle amplification reaction mixture contains a plurality of single-stranded nucleic acid compacting oligonucleotides, wherein the 5' and 3' regions of each single-stranded nucleic acid compacting oligonucleotide hybridize to universal binding sites on a respective tandem template molecule to pull distal portions of the tandem template molecule together, thereby causing the tandem template molecule to compact and form a DNA nanosphere, wherein the 3' end of each single-stranded nucleic acid compacting oligonucleotide is non-extendable, and wherein at least one portion of each tandem template molecule hybridizes to a pinning primer (500); and
f) performing at least one sequencing reaction to determine the sequence of at least a portion of the plurality of tandem template molecules.
3. The method according to claim 1 or 2, wherein the plurality of clamp-capture primers (200) in step (a) are located at random and non-predetermined positions on the support.
4. The method according to any one of claims 1 to 3, wherein the plurality of clip-capture primers (200) in step (a) comprises a plurality of nearest neighbor clip-capture primers that are in contact with and / or overlap each other when the support is viewed from any angle including above, below or from the side.
5. The method according to any one of claims 1 to 4, wherein the plurality of clip-on capture primers in step (a) comprises at least a first subgroup of clip-on capture primers having a first sequence and a second subgroup of clip-on capture primers having a second sequence different from the first sequence.
6. The method according to any one of claims 1 to 5, wherein the plurality of linear library molecules (100) in step (b) comprises at least a first subgroup of linear library molecules and a second subgroup of linear library molecules.
7. The method according to claim 6, wherein the linear library molecule (100) in the first subgroup comprises a mixture of target sequences, and the linear library molecule (100) in the second subgroup comprises a mixture of target sequences.
8. The method according to claim 7, wherein the linear library molecules of the first subgroup contain a universal binding site (140-1), or its complementary sequence, for a first batch-specific forward sequencing primer, a universal binding site (150-1), or its complementary sequence, for a first batch-specific reverse sequencing primer, and a first batch-specific barcode sequence (142); and the linear library molecules of the second subgroup contain a universal binding site (140-2), or its complementary sequence, for a second batch-specific forward sequencing primer, a universal binding site (150-2), or its complementary sequence, for a second batch-specific reverse sequencing primer, and a second batch-specific barcode sequence (152).
9. The method according to any one of claims 1 to 8, wherein the plurality of tandem template molecules in step (e) comprises at least a first subgroup of tandem template molecules and a second subgroup of tandem template molecules.
10. The method of claim 9, wherein the first and second subgroups of tandem template molecules are located at random and non-predetermined positions on the support, and wherein the tandem template molecules in the first and second subgroups include nearest-neighbor tandem template molecules that contact or overlap each other when the support is viewed from any angle including above, below, or from the side.
11. The method of claim 10, wherein the sequencing in step (f) comprises performing a first batch of repeated sequencing, the first batch of repeated sequencing comprising: a) hybridizing the first subgroup of tandem template molecules with a plurality of first batch-specific forward sequencing primers and performing a plurality of sequencing reactions to generate a plurality of first-batch sequencing reads, wherein the length of the first-batch sequencing reads does not exceed 50 bases; b) stopping or blocking the first batch of repeated sequencing of step (a) to inhibit further sequencing reactions; c) removing the plurality of first-batch sequencing reads from the first subgroup of tandem template molecules while retaining the first subgroup of tandem template molecules; and d) repeating steps (a)-(c) at least once to re-sequence the first subgroup of tandem template molecules.
12. The method of claim 11, wherein the sequencing in step (f) further comprises performing a second batch of repeated sequencing, the second batch of repeated sequencing comprising: a) hybridizing the second subgroup of tandem template molecules with a plurality of second batch-specific forward sequencing primers and performing a plurality of sequencing reactions to generate a plurality of second-batch sequencing reads, wherein the length of the second-batch sequencing reads does not exceed 50 bases; b) stopping or blocking the second batch of repeated sequencing of step (a) to inhibit further sequencing reactions; c) removing the plurality of second-batch sequencing reads from the second subgroup of tandem template molecules while retaining the second subgroup of tandem template molecules; and d) repeating steps (a)-(c) at least once to re-sequence the second subgroup of tandem template molecules.
13. The method according to any one of claims 1 to 12, wherein (i) step (c) includes distributing a first subgroup of linear library molecules onto the support under conditions suitable for hybridizing each linear library molecule from the first subgroup to a respective clip-capture primer (200) to generate a first subgroup of open circular library molecules each having a nick or gap, wherein the support contains an excess of clip-capture primers immobilized thereon compared to the first subgroup of linear library molecules; (ii) step (d) includes enzymatically closing the nick or gap to generate a first subgroup of covalently closed circular library molecules, wherein each covalently closed circular library molecule is hybridized to a clip-capture primer; (iii) step (e) includes performing a rolling circle amplification reaction to generate a first subgroup of tandem template molecules; (iv) step (f) includes sequencing at least a portion of the first subgroup of tandem template molecules; (v) the method further includes stopping the sequencing of the first subgroup of tandem template molecules; (vi) the method further includes distributing a second subgroup of linear library molecules onto the same support under conditions suitable for hybridizing each linear library molecule from the second subgroup to a respective clip-capture primer (200) to generate a second subgroup of open circular library molecules each having a nick or gap, enzymatically closing the nick or gap to generate a second subgroup of covalently closed circular library molecules, wherein each covalently closed circular library molecule is hybridized to a clip-capture primer, and performing rolling circle amplification to generate a second subgroup of tandem template molecules; and (vii) the method further includes continuing to sequence at least a portion of the first subgroup of tandem template molecules or at least a portion of the second subgroup of tandem template molecules.
14. The method according to any one of claims 1 to 13, wherein the sequencing in step (f) comprises pairwise sequencing, the pairwise sequencing comprising: a) contacting the plurality of tandem template molecules with a plurality of forward sequencing primers under conditions suitable for hybridizing at least one forward sequencing primer to at least one of the universal binding sites (140) for the forward sequencing primer in the tandem template molecules, and performing a forward sequencing reaction using the hybridized forward sequencing primers, a plurality of sequencing polymerases and a plurality of nucleotide reagents, thereby generating a plurality of extended forward sequencing primer strands; b) retaining the plurality of tandem template molecules immobilized on the support and replacing the plurality of extended forward sequencing primer strands with a plurality of forward extension strands hybridized to the tandem template molecules, by performing a primer extension reaction that uses the tandem template molecules as templates; c) removing the tandem template molecules by generating abasic sites at the uridine nucleotides in the tandem template molecules and generating gaps at the abasic sites, thereby generating a plurality of gap-containing tandem template molecules, while retaining the plurality of forward extension strands and the plurality of immobilized splint-capture primers and pinning primers; and d) sequencing the plurality of forward extension strands by contacting them with a plurality of soluble reverse sequencing primers, a plurality of sequencing polymerases and a plurality of nucleotide reagents and performing a reverse sequencing reaction, thereby generating a plurality of extended reverse sequencing primer strands.
15. The method according to any one of claims 1 to 13, wherein the sequencing in step (f) comprises chain termination sequencing, the chain termination sequencing comprising: a) contacting the plurality of tandem template molecules with a plurality of sequencing polymerases and a plurality of nucleic acid sequencing primers, wherein the contacting is performed under conditions suitable for forming a plurality of sequencing polymerase complexes, each comprising a sequencing polymerase bound to a nucleic acid duplex, wherein the nucleic acid duplex comprises a portion of a tandem template molecule hybridized to a nucleic acid sequencing primer; b) contacting the plurality of sequencing polymerase complexes with a plurality of nucleotides comprising a detectable label and a blocking moiety at the 2' sugar position or the 3' sugar position, wherein the contacting is performed under conditions suitable for binding at least one nucleotide to one of the sequencing polymerase complexes and suitable for promoting polymerase-catalyzed nucleotide incorporation; c) incorporating a nucleotide into the 3' end of the nucleic acid sequencing primer of at least one sequencing polymerase complex, thereby generating a sequencing polymerase complex containing the incorporated nucleotide; d) detecting the incorporated nucleotide; e) removing the blocking moiety from the incorporated nucleotide; and f) repeating steps (b)-(e) at least once.
16. The method according to any one of claims 1 to 13, wherein the sequencing in step (f) comprises: a) contacting the plurality of tandem template molecules with a plurality of sequencing polymerases and a plurality of nucleic acid sequencing primers, wherein the contacting is performed under conditions suitable for forming a plurality of sequencing polymerase complexes, each comprising a sequencing polymerase bound to a nucleic acid duplex, wherein the nucleic acid duplex comprises a portion of a tandem template molecule hybridized to a nucleic acid sequencing primer; b) contacting the plurality of sequencing polymerase complexes with a plurality of nucleotides comprising a detectable label attached to a phosphate moiety of the phosphate chain, wherein the contacting is performed under conditions suitable for binding at least one nucleotide to one of the sequencing polymerase complexes and suitable for promoting polymerase-catalyzed nucleotide incorporation; c) incorporating a nucleotide into the 3' end of the nucleic acid sequencing primer of at least one sequencing polymerase complex, thereby generating a sequencing polymerase complex containing the incorporated nucleotide; d) detecting the incorporated nucleotide; and e) repeating steps (b)-(d) at least once.
17. The method according to any one of claims 1 to 13, wherein the sequencing in step (f) comprises: a) contacting the plurality of tandem template molecules with a plurality of first sequencing polymerases and a plurality of nucleic acid sequencing primers, wherein the contacting is performed under conditions suitable for binding the plurality of first sequencing polymerases to the plurality of tandem template molecules and the plurality of nucleic acid sequencing primers, thereby forming a plurality of first polymerase complexes, each comprising a first sequencing polymerase bound to a nucleic acid duplex, wherein the nucleic acid duplex comprises a tandem template molecule hybridized to a nucleic acid sequencing primer; b) contacting the plurality of first polymerase complexes with a plurality of multivalent molecules, wherein the multivalent molecules are detectably labeled, wherein each of the plurality of multivalent molecules comprises a nucleus attached to a plurality of nucleotide arms, and each nucleotide arm is attached to a nucleotide moiety, wherein the contacting is performed under conditions suitable for binding complementary nucleotide moieties of the multivalent molecules to at least two of the plurality of first polymerase complexes, thereby forming a plurality of multivalent binding polymerase complexes, and wherein the conditions are suitable for inhibiting the incorporation of the complementary nucleotide moieties into the nucleic acid sequencing primers of the plurality of multivalent binding polymerase complexes; c) detecting the plurality of multivalent binding polymerase complexes; and d) identifying the bases of the nucleotide moieties in the plurality of multivalent binding polymerase complexes, thereby determining the sequence of the tandem template molecule.
18. The method of claim 17, further comprising: e) dissociating the plurality of multivalent binding polymerase complexes by removing the plurality of first sequencing polymerases and the bound multivalent molecules while retaining the nucleic acid duplexes, thereby generating a plurality of retained nucleic acid duplexes; f) contacting the plurality of retained nucleic acid duplexes from step (e) with a plurality of second sequencing polymerases under conditions suitable for binding the plurality of second sequencing polymerases to the plurality of retained nucleic acid duplexes, thereby forming a plurality of second polymerase complexes, each comprising a second sequencing polymerase bound to a nucleic acid duplex; and g) contacting the plurality of second polymerase complexes with a plurality of nucleotides, wherein the contacting is performed under conditions suitable for binding complementary nucleotides from the plurality of nucleotides to at least two of the second polymerase complexes, thereby forming a plurality of nucleotide-polymerase complexes, and wherein the conditions are suitable for promoting the incorporation of the bound complementary nucleotides into the nucleic acid sequencing primers of the nucleotide-polymerase complexes.
19. The method of claim 18, wherein the nucleotides of the plurality of nucleotides comprise a detectable label, and the method further comprises: h) detecting the complementary nucleotides incorporated into the nucleic acid sequencing primers of the nucleotide-polymerase complexes.
20. The method of claim 18, further comprising: h) detecting the complementary nucleotides incorporated into the nucleic acid sequencing primers of the nucleotide-polymerase complexes; and i) identifying the bases of the complementary nucleotides incorporated into the nucleic acid sequencing primers of the nucleotide-polymerase complexes.
21. The method of claim 18, wherein the plurality of nucleotides in step (g) comprises a plurality of unlabeled nucleotides and wherein detection of nucleotide incorporation is omitted.
22. The method of claim 17, wherein step (b) of contacting the plurality of first polymerase complexes with the plurality of multivalent molecules is carried out in the presence of a non-catalytic divalent cation that inhibits polymerase-catalyzed nucleotide incorporation, and wherein the non-catalytic divalent cation comprises strontium, barium, or calcium.
23. The method of claim 18, wherein step (g) of contacting the plurality of second polymerase complexes with the plurality of nucleotides is carried out in the presence of a catalytic divalent cation that promotes polymerase-catalyzed nucleotide incorporation, and wherein the catalytic divalent cation comprises magnesium or manganese.
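For illustration only, the two-stage cycle recited in claims 17, 18, 22 and 23 can be sketched as a toy model in Python: a binding and detection stage under a non-catalytic cation, followed by an incorporation stage under a catalytic cation. The function name, the one-base-per-cycle advance (which assumes a chain-terminating nucleotide such as that of claim 33) and the simple complement lookup are assumptions made for this sketch; they are not part of the claims and do not model real reaction chemistry.

```python
# Illustrative toy model only; not part of the claims.
NON_CATALYTIC = {"Sr", "Ba", "Ca"}  # claim 22: binding without incorporation
CATALYTIC = {"Mg", "Mn"}            # claim 23: incorporation is promoted

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def read_by_bind_then_step(template, cycles, bind_cation="Sr", step_cation="Mg"):
    """Return the bases called over `cycles` bind/detect/step rounds.

    `template` is a hypothetical string of template bases read from the
    primer's 3' end onward.
    """
    if bind_cation not in NON_CATALYTIC:
        raise ValueError("binding stage expects a non-catalytic cation")
    if step_cation not in CATALYTIC:
        raise ValueError("stepping stage expects a catalytic cation")

    position = 0  # number of bases already incorporated into the primer
    calls = []
    for _ in range(min(cycles, len(template))):
        # Steps (b) to (d): a labeled multivalent molecule binds, is detected,
        # and the base is identified; the primer is not extended.
        calls.append(COMPLEMENT[template[position]])
        # Steps (e) to (g): complexes are dissociated, then one nucleotide is
        # incorporated so the primer advances by a single base (assumption).
        position += 1
    return "".join(calls)
```

Calling `read_by_bind_then_step("ACGT", 4)` walks the four template positions in order; passing a catalytic cation such as `"Mg"` for the binding stage raises `ValueError`, mirroring the requirement that incorporation be inhibited during detection.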
24. The method of claim 17, wherein each of the plurality of multivalent molecules comprises: (a) a core; and (b) a plurality of nucleotide arms, each comprising: (i) a core attachment moiety, (ii) a spacer, (iii) a linker, and (iv) a nucleotide moiety, wherein the core is attached to the plurality of nucleotide arms via the core attachment moieties of the nucleotide arms, wherein the spacer is attached to the linker, and wherein the linker is attached to the nucleotide moiety.
25. The method of claim 24, wherein the linker comprises an aliphatic chain having 2 to 6 subunits or an oligomeric glycol chain having 2 to 6 subunits.
26. The method of claim 24 or 25, wherein the plurality of nucleotide arms attached to a given core have the same type of nucleotide moiety, and wherein the type of nucleotide moiety includes dATP, dGTP, dCTP, dTTP, or dUTP.
27. The method of claim 24 or 25, wherein the plurality of multivalent molecules comprises one type of multivalent molecule, wherein each of the plurality of multivalent molecules has the same type of nucleotide moiety selected from the group consisting of: dATP, dGTP, dCTP, dTTP, and dUTP.
28. The method of claim 24 or 25, wherein the plurality of multivalent molecules comprises a mixture of any combination of two or more types of multivalent molecules, each type having a nucleotide moiety selected from the group consisting of dATP, dGTP, dCTP, dTTP and/or dUTP.
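As a reading aid, the structure of a multivalent molecule described in claims 24 to 26 can be sketched as a small Python data structure. The class names, field names and the placeholder values `"biotin"` and `"PEG"` are hypothetical and chosen only for this sketch; the claims do not name specific attachment or spacer chemistries. The checks encode the 2 to 6 linker subunits of claim 25 and the single nucleotide type per molecule of claim 26.

```python
from dataclasses import dataclass

ALLOWED_NUCLEOTIDES = {"dATP", "dGTP", "dCTP", "dTTP", "dUTP"}


@dataclass(frozen=True)
class NucleotideArm:
    attachment: str       # moiety joining the arm to the central body
    spacer: str           # spacer, attached to the linker
    linker_subunits: int  # claim 25: chain of 2 to 6 subunits
    nucleotide: str       # nucleotide moiety at the end of the arm

    def __post_init__(self):
        if not 2 <= self.linker_subunits <= 6:
            raise ValueError("linker must have 2 to 6 subunits")
        if self.nucleotide not in ALLOWED_NUCLEOTIDES:
            raise ValueError(f"unsupported nucleotide: {self.nucleotide}")


@dataclass(frozen=True)
class MultivalentMolecule:
    arms: tuple  # tuple of NucleotideArm attached to one central body

    def __post_init__(self):
        if len(self.arms) < 2:
            raise ValueError("a plurality of nucleotide arms is required")
        # Claim 26: all arms on a given molecule carry the same nucleotide type.
        if len({arm.nucleotide for arm in self.arms}) != 1:
            raise ValueError("all arms must carry the same nucleotide type")

    @property
    def nucleotide_type(self):
        return self.arms[0].nucleotide
```

For example, `MultivalentMolecule(tuple(NucleotideArm("biotin", "PEG", 4, "dATP") for _ in range(4)))` builds a four-arm dATP molecule, while mixing dATP and dGTP arms on one molecule raises `ValueError`. A mixture as in claim 28 would then be a collection of such molecules with different `nucleotide_type` values.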
29. The method of claim 18, wherein each nucleotide in the plurality of nucleotides in step (g) comprises an aromatic base, a pentose sugar, and 1 to 10 phosphate groups.
30. The method of claim 18, wherein the plurality of nucleotides in step (g) comprises one type of nucleotide selected from the group consisting of dATP, dGTP, dCTP, dTTP and dUTP, or a mixture comprising any combination of two or more types of nucleotides selected from the group consisting of dATP, dGTP, dCTP, dTTP and/or dUTP.
31. The method of claim 18, wherein at least one of the nucleotides in step (g) is labeled with a fluorophore.
32. The method of claim 18, wherein the plurality of nucleotides in step (g) lack fluorescent labeling.
33. The method of claim 18, wherein at least one of the nucleotides in step (g) comprises a removable chain-terminating portion attached to the 3' carbon position of a sugar group, optionally wherein the removable chain-terminating portion comprises an alkyl group, an alkenyl group, an alkynyl group, an allyl group, an aryl group, a benzyl group, an azide group, an azido group, an O-azidomethyl group, an amine group, an amide group, a ketone group, an isocyanate group, a phosphate group, a thiol group, a disulfide group, a carbonate group, a urea group, or a silyl group, and optionally wherein the removable chain-terminating portion can be cleaved with a chemical compound to generate an extendable 3'OH moiety on the sugar group.
34. The method of claim 24, the method comprising forming a plurality of binding complexes by: a) binding a first nucleic acid sequencing primer, a first sequencing polymerase and a first multivalent molecule to a first portion of a tandem template molecule, thereby forming a first binding complex, wherein a first nucleotide moiety of the first multivalent molecule binds to the first sequencing polymerase; and b) binding a second nucleic acid sequencing primer, a second sequencing polymerase, and the first multivalent molecule to a second portion of the same tandem template molecule, thereby forming a second binding complex, wherein a second nucleotide moiety of the first multivalent molecule binds to the second sequencing polymerase, wherein the first binding complex and the second binding complex comprise the same multivalent molecule, thereby forming an avidity complex.
35. The method of claim 24, further comprising: a) contacting the plurality of first sequencing polymerases and the plurality of nucleic acid sequencing primers with different portions of each tandem template molecule to form at least a first polymerase complex and a second polymerase complex on the same tandem template molecule; b) contacting a plurality of multivalent molecules comprising a detectable label with at least the first polymerase complex and the second polymerase complex under conditions suitable for binding a single multivalent molecule from the plurality of multivalent molecules to the first polymerase complex and the second polymerase complex, wherein at least a first nucleotide moiety of the single multivalent molecule binds to the first polymerase complex, the first polymerase complex comprising a first nucleic acid sequencing primer hybridized to a first portion of the tandem template molecule, thereby forming a first binding complex, and wherein at least a second nucleotide moiety of the single multivalent molecule binds to the second polymerase complex, the second polymerase complex comprising a second nucleic acid sequencing primer hybridized to a second portion of the tandem template molecule, thereby forming a second binding complex, wherein the contacting is carried out under conditions suitable for inhibiting polymerase-catalyzed incorporation of the bound first and second nucleotide moieties in the first and second binding complexes, and wherein the first binding complex and the second binding complex, which are bound to the same multivalent molecule, form an avidity complex; c) detecting the first binding complex and the second binding complex on the same tandem template molecule; and d) identifying the first nucleotide moiety in the first binding complex, thereby determining the sequence of the first portion of the tandem template molecule, and identifying the second nucleotide moiety in the second binding complex, thereby determining the sequence of the second portion of the tandem template molecule.
36. The method of claim 14, wherein the plurality of sequencing polymerases in steps (a) and (d) comprise a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146.
37. The method of claim 15, wherein the plurality of sequencing polymerases in step (a) comprises a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146.
38. The method of claim 16, wherein the plurality of sequencing polymerases comprises a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146.
39. The method of claim 17, wherein the plurality of first sequencing polymerases comprises a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146.
40. The method of claim 18, wherein the plurality of second sequencing polymerases comprises a plurality of engineered polymerases having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater than 99% sequence identity with any of SEQ ID NO: 128-146.
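The sequence identity thresholds in claims 36 to 40 can be illustrated with a minimal Python sketch. The claims do not specify an alignment method, so this example assumes the two sequences have already been aligned to equal length with `-` as the gap character, and that identity is counted over all aligned columns; other conventions (for example, dividing by the shorter sequence length) give different values. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumes both strings have the same length and use '-' for gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=70.0):
    """True if the alignment reaches `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With this convention, two ten-residue aligned sequences that differ at one position score 90% identity and would satisfy the 70%, 75%, 80%, 85% and 90% thresholds but not the 91% threshold.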
41. The method according to any one of claims 2 to 40, wherein the flap cutting reagent comprises at least one 5' flap endonuclease derived from a eukaryotic or archaeal organism.
42. The method according to any one of claims 2 to 40, wherein the flap cutting reagent comprises at least one archaeal 5' flap endonuclease selected from the group consisting of: Archaeoglobus fulgidus (Afu FEN1), Methanobacterium thermoautotrophicum (Mth FEN1), Pyrococcus furiosus (Pfu FEN1), Methanococcus jannaschii (Mja FEN1), Pyrococcus woesei (Pwo FEN1), Pyrococcus horikoshii (Pho FEN1), Archaeoglobus veneficus (Ave FEN1), Thermococcus kodakarensis (Tko FEN1), Desulfurococcus amylolyticus (Dam FEN1), Aeropyrum pernix (Ape FEN1), and Sulfolobus solfataricus (Sso FEN1).
43. The method of claim 41, wherein the flap cutting reagent comprises a 5' flap endonuclease from Thermococcus sp. 9°N (9°N FEN1).
44. The method of claim 41, wherein the flap cutting reagent comprises a 5' flap endonuclease derived from mouse, yeast, or human.
45. The method according to any one of claims 1 to 44, wherein enzymatic closure of the nick comprises contacting the plurality of open-loop library molecules with a DNA ligase, the DNA ligase comprising T3 ligase, T4 ligase, T7 ligase, Tfu ligase, or a ligase derived from Thermococcus nautili.
46. The method of claim 41, wherein the flap cutting reagent comprises DNA ligase.
47. The method of claim 46, wherein the DNA ligase comprises T3 ligase, T4 ligase, T7 ligase, Tfu ligase, or a ligase derived from Thermococcus nautili.