Multi-dye attached polyphosphodiester scaffolds for use in sequencing
Multi-dye labeled polyphosphodiester scaffolds address the limitations of current sequencing technologies by enhancing fluorescence intensity and reducing dye-dye quenching, enabling cost-effective and scalable nucleic acid sequencing.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- ILLUMINA INC
- Filing Date
- 2025-12-11
- Publication Date
- 2026-06-25
Abstract
Description
IP-2725-PCT PATENT
MULTI-DYE ATTACHED POLYPHOSPHODIESTER SCAFFOLDS FOR USE IN SEQUENCING
Field
[0001] The present disclosure generally relates to compositions, kits, methods and systems for nucleic acid sequencing applications.
BACKGROUND
[0002] Nucleic acid sequencing methodology has evolved significantly from the chemical degradation methods used by Maxam and Gilbert and the strand elongation methods used by Sanger. Today several sequencing methodologies are in use which allow for the parallel processing of thousands of nucleic acids all in a single sequencing run. The instrumentation that performs such methods is typically large and expensive since the current methods typically rely on large amounts of expensive reagents and multiple sets of optic filters to record nucleic acid incorporation into sequencing reactions.
[0003] It has become clear that high-throughput, smaller, less expensive DNA sequencing technologies will be beneficial for reaping the rewards of genome sequencing. Personalized healthcare is moving toward the forefront and will benefit from such technologies. The sequencing of an individual’s genome to identify potential mutations and abnormalities will be crucial in identifying if a person has a particular disease, followed by subsequent therapies tailored to that individual. To accommodate such an endeavor, sequencing technologies should not only have high throughput capabilities, but also have scalability. As such, there exists a need for new sequencing methods that improve speed, reduce read errors, and are also cost effective.
[0004] Scaffolding fluorophores using oligomeric or polymeric materials for enhanced brightness is an active area of study in fluorescence imaging. Various strategies have been explored to improve the brightness, photostability and other optical properties of fluorophores. Polymeric materials like linear polymers, nanoparticles and nanogels can be used to scaffold fluorophores. Different scaffolding strategies provide different drawbacks and benefits. The use of nanogels enables dyes to be scaffolded with fixed positions, whereas scaffolding on linear flexible polymers can allow dye-dye contacts, likely increasing the occurrence of dye-dye quenching. Different materials also offer varying levels of protection of fluorophores from solvent, a feature that is beneficial for further reducing quenching events. Linear polymers do not provide fluorophores any solvent protection due to their simple structure compared to other nanomaterials, such as nanogels that have ‘cores’.
[0005] Scaffolding fluorophores affords brightness multiplication for the labels concerned. However, dye-dye self-quenching is a phenomenon of interest when bringing dyes close to each other on a polymer. Effectively, a balance needs to be found between adding as many fluorophores as possible for maximal brightness enhancement, and ensuring there is significant space between neighboring dyes to limit the occurrence of dye self-quenching. As such, designing and synthesizing new scaffold-based labeling reagents with enhanced fluorescence intensity as nucleic acid labels for sequencing applications remains challenging.
SUMMARY
[0006] One aspect of the present disclosure relates to a multi-dye labeled polyphosphodiester having the structure of Formula (I) or (II):
[Formula (I) and Formula (II): structure drawings not recoverable from the source; surviving labels: Fl, X, Y, q]
or a salt thereof, wherein:
R1 is -OH, -SH, -O⁻, -S⁻, -O⁻M⁺, -S⁻M⁺, -O(unsubstituted or substituted C1-C6 alkyl), or -O(unsubstituted or substituted C2-C6 alkenyl);
R2 is O or S;
L1 is an unsubstituted or substituted C2-C6 alkylene, an unsubstituted or substituted C2-C6 alkenylene, an unsubstituted or substituted C2-C6 alkynylene, a 2 to 8 membered heteroalkylene, an unsubstituted or substituted C3-C10 membered carbocyclylene, an unsubstituted or substituted C6-C10 arylene, an unsubstituted or substituted 5 to 10 membered heteroarylene, or an unsubstituted or substituted 3 to 10 membered heterocyclylene, or the C2-C6 alkylene, C2-C6 alkenylene, C2-C6 alkynylene, or 2 to 8 membered heteroalkylene each independently interrupted by an unsubstituted or substituted C3-C10 membered carbocyclylene, an unsubstituted or substituted C6-C10 arylene, an unsubstituted or substituted 5 to 10 membered heteroarylene, or an unsubstituted or substituted 3 to 10 membered heterocyclylene;
L2 is a bond, an unsubstituted or substituted C1-C6 alkylene, an unsubstituted or substituted C2-C6 alkenylene, an unsubstituted or substituted C2-C6 alkynylene, a 2 to 8 membered heteroalkylene, an unsubstituted or substituted C3-C10 membered carbocyclylene, an unsubstituted or substituted C6-C10 arylene, an unsubstituted or substituted 5 to 10 membered heteroarylene, or an unsubstituted or substituted 3 to 10 membered heterocyclylene;
each of L3, L4 and L5 is independently a bond, an unsubstituted or substituted Ci-Gs alkylene or a 2 to 8 membered heteroalkylene;
M⁺ is a cation;
Fl is a fluorescent dye moiety;
each X is independently a functional group of the fluorescent dye moiety Fl;
each Y is independently a functional group of the polyphosphodiester, wherein X-Y refers to a reaction product between X and Y forming covalent bonding;
each RA and RB is independently H, -OH, -SH, -OP(=Ra)(Rb)Rc, unsubstituted or substituted C1-C6 alkyl, unsubstituted or substituted C6-C10 aryl, unsubstituted or substituted 5 or 6 membered heteroaryl, or a functional group capable of binding to a biological molecule of interest either via a bioorthogonal reaction selected from the group consisting of a [3+2] dipolar cycloaddition, a Diels-Alder cycloaddition, a [4+1] cycloaddition, a phosphine ligation, and condensation with 2-acylphenyl boronic acid, or via noncovalent interaction with the biological molecule of interest;
Ra is O or S;
each of Rb and Rc is independently -OH, -SH, -O⁻, -S⁻, -O⁻M⁺, -S⁻M⁺, -O(unsubstituted or substituted C1-C6 alkyl), or -O(unsubstituted or substituted C2-C6 alkenyl);
each m is independently an integer of 5 or more;
each m1 and m2 is independently an integer of 3 or more;
n is an integer of 1 or more;
q is an integer of 2 or more;
k is 0 or 1, provided that when k is 0, L3 and L4 are directly attached to L2; and
p is 0 or 1, provided that when p is 0, RA is directly attached to L2, and n is at least 2.
In some embodiments, each m is independently an integer of 7 or more.
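The integer limits recited for m, m1, m2, n, q, k and p can be read as a small set of checkable constraints. The following is a minimal sketch of such a check, not part of the disclosure: the `ScaffoldIndices` container and the `violations` function are hypothetical names, the sketch does not model which index belongs to which formula, and it assumes the "n is at least 2" proviso applies only when p is 0.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScaffoldIndices:
    """Hypothetical container for the integer indices recited in paragraph [0006]."""
    m: List[int]   # each m: integer of 5 or more
    m1: List[int]  # each m1: integer of 3 or more
    m2: List[int]  # each m2: integer of 3 or more
    n: int         # integer of 1 or more
    q: int         # integer of 2 or more
    k: int         # 0 or 1
    p: int         # 0 or 1


def violations(s: ScaffoldIndices) -> List[str]:
    """Return the recited limits that the given indices fail to satisfy."""
    problems = []
    if any(v < 5 for v in s.m):
        problems.append("each m must be 5 or more")
    if any(v < 3 for v in s.m1 + s.m2):
        problems.append("each m1 and m2 must be 3 or more")
    if s.n < 1:
        problems.append("n must be 1 or more")
    if s.q < 2:
        problems.append("q must be 2 or more")
    if s.k not in (0, 1):
        problems.append("k must be 0 or 1")
    if s.p not in (0, 1):
        problems.append("p must be 0 or 1")
    # Assumption: the "n is at least 2" proviso is conditional on p being 0.
    if s.p == 0 and s.n < 2:
        problems.append("when p is 0, n must be at least 2")
    return problems
```

An empty list means the indices are consistent with the recited ranges under these assumptions; the narrower "7 or more" embodiment for m could be checked by changing the first threshold.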
[0007] One aspect of the present disclosure relates to a biological molecule attached to the multi-dye labeled polyphosphodiester as described herein via either noncovalent interaction or the bioorthogonal reaction with at least one of RA and RB, wherein the biological molecule is a nucleotide, an oligonucleotide, or a polynucleotide.
[0008] One aspect of the present disclosure relates to a method of determining the sequences of a plurality of different target polynucleotides, comprising:
(a) contacting a solid support with a solution comprising sequencing primers under hybridization conditions, wherein the solid support comprises a plurality of different target polynucleotides immobilized thereon; and the sequencing primers are complementary to at least a portion of the target polynucleotides;
(b) contacting the solid support with an incorporation mixture comprising DNA polymerase and one or more of four different types of nucleotides under conditions suitable for DNA polymerase-mediated primer extension, and incorporating one type of nucleotides into the sequencing primers to produce extended copy polynucleotides; wherein
each of the four types of nucleotides comprises a 3' blocking group; and
the incorporation mixture comprises a first type of unlabeled nucleotide having a first functional moiety covalently attached to the first type of unlabeled nucleotide;
(c) contacting the extended copy polynucleotides with the multi-dye labeled polyphosphodiester as described herein, wherein one of RA and RB of the multi-dye labeled polyphosphodiester undergoes the bioorthogonal reaction with the first functional moiety of the first type of unlabeled nucleotide to form covalent bonding, or at lea;
(d) imaging the solid support and performing one or more fluorescent measurements of the extended copy polynucleotides; and
(e) removing the 3' blocking group of the incorporated nucleotides.
[0009] A further aspect of the present disclosure relates to a method of determining the sequences of a plurality of different target polynucleotides, comprising:
(a) contacting a solid support with a solution comprising sequencing primers under hybridization conditions, wherein the solid support comprises a plurality of different target polynucleotides immobilized thereon; and the sequencing primers are complementary to at least a portion of the target polynucleotides;
(b) contacting the solid support with an incorporation mixture comprising DNA polymerase and one or more of four different types of nucleotides under conditions suitable for DNA polymerase-mediated primer extension, and incorporating one type of nucleotides into the sequencing primers to produce extended copy polynucleotides; wherein
each of the four types of nucleotides comprises a 3' blocking group; and
the incorporation mixture comprises a first type of unlabeled nucleotide having a first functional moiety covalently attached to the first type of unlabeled nucleotide;
(c) contacting the extended copy polynucleotides with an antibody comprising the multi-dye labeled polyphosphodiester as described herein, wherein the antibody binds to the first type of unlabeled nucleotides via noncovalent interaction with the first functional moiety of the first type of unlabeled nucleotides;
(d) imaging the solid support and performing one or more fluorescent measurements of the extended copy polynucleotides; and
(e) removing the 3' blocking group of the incorporated nucleotides.
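The ordering of steps (b) through (e) in the methods above can be illustrated with a toy simulation. This is a sketch under stated assumptions rather than the disclosed protocol: `run_cycles`, the template strings, and the choice of "C" as the tagged nucleotide type are hypothetical; primers are assumed to be already hybridized as in step (a); the signal is idealized as 0 or 1; and the label is assumed not to carry over into the next cycle, which the steps above do not state.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def run_cycles(templates, n_cycles, tagged_base="C"):
    """Toy walk-through of steps (b)-(e) for primers hybridized in step (a).

    templates: template sequences, read one position per cycle (hypothetical input).
    tagged_base: the one nucleotide type carrying the functional moiety (assumption).
    Returns the extended copy strands and one idealized image per cycle.
    """
    extended = ["" for _ in templates]
    images = []
    for cycle in range(n_cycles):
        # (b) incorporate one 3'-blocked nucleotide per strand; the block
        #     limits extension to a single base per cycle.
        added = [COMPLEMENT[t[cycle]] for t in templates]
        # (c) post-incorporation labeling: only strands that received the
        #     tagged nucleotide type bind the multi-dye scaffold.
        labeled = [base == tagged_base for base in added]
        # (d) image the support; 1 marks a strand that fluoresces.
        images.append([1 if on else 0 for on in labeled])
        # (e) remove the 3' block so the next cycle can extend the strand.
        #     Assumption: the label does not persist into the next image.
        extended = [strand + base for strand, base in zip(extended, added)]
    return extended, images


strands, signal = run_cycles(["GAC", "TGG"], n_cycles=3)
```

With a single tagged type, each image only distinguishes that type from the other three, so a full base call would need additional labels or channels that this sketch leaves out.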
[0010] One aspect of the present disclosure relates to a kit for sequencing application, comprising:
an incorporation mixture comprising one or more of four different types of nucleotides each comprising a 3' blocking group, wherein a first type of unlabeled nucleotide having a first functional moiety covalently attached to the first type of unlabeled nucleotide; and
the multi-dye labeled polyphosphodiester as described herein, wherein at least one of RA and RB of the multi-dye labeled polyphosphodiester is capable of undergoing the bioorthogonal reaction specifically with the first functional moiety of the first type of unlabeled nucleotides.
[0011] A further aspect of the present disclosure relates to a kit for sequencing application, comprising:
an incorporation mixture comprising one or more of four different types of nucleotides each comprising a 3' blocking group, wherein a first type of unlabeled nucleotide having a first functional moiety covalently attached to the first type of unlabeled nucleotide; and
an antibody comprising the multi-dye labeled polyphosphodiester as described herein, wherein the antibody is capable of binding to the first type of unlabeled nucleotides via noncovalent interaction with the first functional moiety of the first type of unlabeled nucleotides.
[0012] Another aspect of the present disclosure relates to a system for nucleic acid sequencing, comprising a plurality of chambers, wherein one or more of the chambers contains the kit in accordance with the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] FIG. 1 illustrates the brightness per scaffold of labeled polyphosphodiester scaffolds in water and universal scan mixture (USM), normalized to the fluorescence intensity of the respective free dye-azides in the same solvent.
[0014] FIG. 2 is a MiSeq® scatterplot at cycle 8, where ffC is labeled with ATTO532.
[0015] FIG. 3A is a MiSeq® scatterplot at cycle 11, using ffC functionalized with a TCO moiety and a post labeling reagent including a polyphosphodiester scaffold labeled with three ATTO532 dyes, where the post labeling reagent was flushed through for about 2.5 seconds.
[0016] FIG. 3B is a MiSeq® scatterplot at cycle 13, using ffC functionalized with a TCO moiety and a post labeling reagent including a polyphosphodiester scaffold labeled with three ATTO532 dyes, where the post labeling reagent was incubated for 30 seconds to facilitate binding.
DETAILED DESCRIPTION
[0017] The present disclosure provides next-generation sequencing compositions, methods, kits, and systems. Certain aspects of the disclosure relate to oligomers or polymers containing polyphosphodiester scaffold functionalities for multi-dye attachment. The scaffolds can reduce or prevent undesired intramolecular dye-dye quenching, achieve better fluorescent intensity with multi-dye construction, and enhance overall scaffold brightness. The labeled polyphosphodiester scaffolds can be used in post-incorporation labeling (PIL) sequencing by synthesis methods as described herein. For example, the usability of the multi-dye attached polymer scaffold has been demonstrated in two-channel nucleic acid sequencing applications using blue and green light excitations (e.g., lasers at 450-460 nm and 520-540 nm). The repeating phosphodiester units offer several benefits as spacer structures: they are inherently water soluble, eliminating the need for additional solubilizing monomers or moieties; they are biocompatible, analogous to the phosphodiester backbone of DNA; and the degree of elongation can be tuned by changing the solution in which the scaffold is dissolved. By functionalizing the polyphosphodiester scaffolds with one functional group of an orthogonal or bioorthogonal coupling pair, the scaffolds may be easily labeled by dyes incorporating the partnering moiety to create a library of multi-dye scaffolds, rendering the multi-dye labeled polyphosphodiester scaffolds a desirable choice as labeling reagents.
Definitions
[0018] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art. The use of the term “including” as well as other forms, such as “include”, “includes,” and “included,” is not limiting. The use of the term “having” as well as other forms, such as “have”, “has,” and “had,” is not limiting. As used in this specification, whether in a transitional phrase or in the body of the claim, the terms “comprise(s)” and “comprising” are to be interpreted as having an open-ended meaning. That is, the above terms are to be interpreted synonymously with the phrases “having at least” or “including at least.” For example, when used in the context of a process, the term “comprising” means that the process includes at least the recited steps but may include additional steps. When used in the context of a compound, composition, or device, the term “comprising” means that the compound, composition, or device includes at least the recited features or components, but may also include additional features or components.
[0019] As used herein, common organic abbreviations are defined as follows:
AzAPA    N-(5-azidoacetamidylpentyl)acrylamide
°C    Temperature in degrees Centigrade
DMT    4,4’-dimethoxytrityl
dATP    Deoxyadenosine triphosphate
dCTP    Deoxycytidine triphosphate
dGTP    Deoxyguanosine triphosphate
dTTP    Deoxythymidine triphosphate
ddNTP    Dideoxynucleotide triphosphate
ffA    Fully functionalized A nucleotide
ffC    Fully functionalized C nucleotide
ffG    Fully functionalized G nucleotide
ffN    Fully functionalized nucleotide
ffT    Fully functionalized T nucleotide
LED    Light emitting diode
NSB    Non-specific binding
PIL    Post-incorporation labeling
SBS    Sequencing by synthesis
[0020] As used herein, the term “array” refers to a population of different probe molecules that are attached to one or more substrates such that the different probe molecules can be differentiated from each other according to relative location. An array can include different probe molecules that are each located at a different addressable location on a substrate. Alternatively, or additionally, an array can include separate substrates each bearing a different probe molecule, wherein the different probe molecules can be identified according to the locations of the substrates on a surface to which the substrates are attached or according to the locations of the substrates in a liquid. Exemplary arrays in which separate substrates are located on a surface include, without limitation, those including beads in wells as described, for example, in U.S. Patent No. 6,355,431 B1, US 2002/0102578 and PCT Publication No. WO 00/63437. Exemplary formats that can be used in the invention to distinguish beads in a liquid array, for example, using a microfluidic device, such as a fluorescent activated cell sorter (FACS), are described, for example, in US Pat. No. 6,524,793. Further examples of arrays that can be used in the invention include, without limitation, those described in U.S. Pat. Nos. 5,429,807; 5,436,327; 5,561,071; 5,583,211; 5,658,734; 5,837,858; 5,874,219; 5,919,523; 6,136,269; 6,287,768; 6,287,776; 6,288,220; 6,297,006; 6,291,193; 6,346,413; 6,416,949; 6,482,591; 6,514,751 and 6,610,482; and WO 93/17126; WO 95/11995; WO 95/35505; EP 742 287; and EP 799 897.
[0021] As used herein, the term “covalently attached” or “covalently bonded” refers to the forming of a chemical bonding that is characterized by the sharing of pairs of electrons between atoms. For example, a covalently attached polymer coating refers to a polymer coating that forms chemical bonds with a functionalized surface of a substrate, as compared to attachment to the surface via other means, for example, adhesion or electrostatic interaction. It will be appreciated that polymers that are attached covalently to a surface can also be bonded via means in addition to covalent attachment.
[0022] As used herein, the term “non-covalent interaction” refers to an interaction that differs from a covalent bond in that it does not involve the sharing of electrons, but rather involves more dispersed variations of electromagnetic interactions between molecules or within a molecule. Non-covalent interactions can be generally classified into four categories: electrostatic, π-effects, van der Waals forces, and hydrophobic effects. Non-limiting examples of electrostatic interactions include ionic interactions, hydrogen bonding (a specific type of dipole-dipole interaction), halogen bonding, etc. Van der Waals forces are a subset of electrostatic interaction involving permanent or induced dipoles or multipoles. π-effects can be broken down into numerous categories, including (but not limited to) π-π interactions, cation-π and anion-π interactions, and polar-π interactions. In general, π-effects are associated with the interactions of molecules with the π-orbitals of a molecular system, such as benzene. The hydrophobic effect is the tendency of nonpolar substances to aggregate in aqueous solution and exclude water molecules. Non-covalent interactions can be both intermolecular and intramolecular.
[0023] It is to be understood that certain radical naming conventions can include either a mono-radical or a di-radical, depending on the context. For example, where a substituent requires two points of attachment to the rest of the molecule, it is understood that the substituent is a di-radical. For example, a substituent identified as alkyl that requires two points of attachment includes di-radicals such as -CH2-, -CH2CH2-, -CH2CH(CH3)CH2-, and the like. Other radical naming conventions clearly indicate that the radical is a di-radical such as “alkylene” or “alkenylene.”
[0024] The term “halogen” or “halo,” as used herein, means any one of the radio-stable atoms of column 7 of the Periodic Table of the Elements, e.g., fluorine, chlorine, bromine, or iodine, with fluorine and chlorine being preferred.
[0025] As used herein, “Ca to Cb,” “Ca-Cb,” or “Ca-b” in which “a” and “b” are integers refer to the number of carbon atoms in an alkyl, alkenyl or alkynyl group, or the number of ring atoms of a cycloalkyl or aryl group. That is, the alkyl, the alkenyl, the alkynyl, the ring of the cycloalkyl, and ring of the aryl can contain from “a” to “b,” inclusive, carbon atoms. For example, a “C1 to C4 alkyl” group refers to all alkyl groups having from 1 to 4 carbons, that is, CH3-, CH3CH2-, CH3CH2CH2-, (CH3)2CH-, CH3CH2CH2CH2-, CH3CH2CH(CH3)- and (CH3)3C-; a C3 to C4 cycloalkyl group refers to all cycloalkyl groups having from 3 to 4 carbon atoms, that is, cyclopropyl and cyclobutyl. Similarly, a “4 to 6 membered heterocyclyl” group refers to all heterocyclyl groups with 4 to 6 total ring atoms, for example, azetidine, oxetane, oxazoline, pyrrolidine, piperidine, piperazine, morpholine, and the like. If no “a” and “b” are designated with regard to an alkyl, alkenyl, alkynyl, cycloalkyl, or aryl group, the broadest range described in these definitions is to be assumed. As used herein, the term “C1-C6” includes C1, C2, C3, C4, C5 and C6, and a range defined by any of the two numbers. For example, C1-C6 alkyl includes C1, C2, C3, C4, C5 and C6 alkyl, C2-C6 alkyl, C1-C3 alkyl, etc. Similarly, C2-C6 alkenyl includes C2, C3, C4, C5 and C6 alkenyl, C2-C5 alkenyl, C3-C4 alkenyl, etc.; and C2-C6 alkynyl includes C2, C3, C4, C5 and C6 alkynyl, C2-C5 alkynyl, C3-C4 alkynyl, etc. C3-C8 cycloalkyl includes hydrocarbon rings containing 3, 4, 5, 6, 7 and 8 carbon atoms, or a range defined by any of the two numbers, such as C3-C7 cycloalkyl or C5-C6 cycloalkyl.
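The statement that a “Ca-Cb” term includes every individual carbon count and every range defined by any two of the numbers can be made concrete by enumeration. The sketch below is illustrative only; `carbon_subranges` is a hypothetical helper, not terminology from the disclosure.

```python
def carbon_subranges(a, b):
    """List what a "Ca-Cb" designation covers under the definition above.

    Returns the individual carbon counts Ca..Cb and every sub-range Ci-Cj
    with a <= i < j <= b (the full range Ca-Cb is the last-but-widest entry
    for i == a).
    """
    singles = [f"C{i}" for i in range(a, b + 1)]
    ranges = [
        f"C{i}-C{j}"
        for i in range(a, b + 1)
        for j in range(i + 1, b + 1)
    ]
    return singles, ranges


singles, ranges = carbon_subranges(1, 6)
# singles holds C1 through C6; ranges holds sub-ranges such as C2-C6 and C1-C3
```

For a = 1 and b = 6 this yields six individual values and fifteen sub-ranges, which include the C2-C6 and C1-C3 examples given in the paragraph.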
[0026] As used herein, “alkyl” refers to a straight or branched hydrocarbon chain that is fully saturated (i.e., contains no double or triple bonds). The alkyl group may have 1 to 20 carbon atoms (whenever it appears herein, a numerical range such as “1 to 20” refers to each integer in the given range; e.g., “1 to 20 carbon atoms” means that the alkyl group may consist of 1 carbon atom, 2 carbon atoms, 3 carbon atoms, etc., up to and including 20 carbon atoms, although the present definition also covers the occurrence of the term “alkyl” where no numerical range is designated). The alkyl group may also be a medium size alkyl having 1 to 9 carbon atoms. The alkyl group could also be a lower alkyl having 1 to 6 carbon atoms. The alkyl group may be designated as “C1-C4 alkyl” or similar designations. By way of example only, “C1-C6 alkyl” indicates that there are one to six carbon atoms in the alkyl chain, i.e., the alkyl chain is selected from the group consisting of methyl, ethyl, propyl, iso-propyl, n-butyl, iso-butyl, sec-butyl, and t-butyl. Typical alkyl groups include, but are in no way limited to, methyl, ethyl, propyl, isopropyl, butyl, isobutyl, tertiary butyl, pentyl, hexyl, and the like.
[0027] As used herein, “alkoxy” refers to the formula -OR wherein R is an alkyl as is defined above, such as “C1-C9 alkoxy,” including but not limited to methoxy, ethoxy, n-propoxy, 1-methylethoxy (isopropoxy), n-butoxy, iso-butoxy, sec-butoxy, and tert-butoxy, and the like.
[0028] As used herein, “alkenyl” refers to a straight or branched hydrocarbon chain containing one or more double bonds. The alkenyl group may have 2 to 20 carbon atoms, although the present definition also covers the occurrence of the term “alkenyl” where no numerical range is designated. The alkenyl group may also be a medium size alkenyl having 2 to 9 carbon atoms. The alkenyl group could also be a lower alkenyl having 2 to 6 carbon atoms. The alkenyl group may be designated as “C2-C6 alkenyl” or similar designations. By way of example only, “C2-C6 alkenyl” indicates that there are two to six carbon atoms in the alkenyl chain, i.e., the alkenyl chain is selected from the group consisting of ethenyl, propen-1-yl, propen-2-yl, propen-3-yl, buten-1-yl, buten-2-yl, buten-3-yl, buten-4-yl, 1-methyl-propen-1-yl, 2-methyl-propen-1-yl, 1-ethyl-ethen-1-yl, 2-methyl-propen-3-yl, buta-1,3-dienyl, buta-1,2-dienyl, and buta-1,2-dien-4-yl. Typical alkenyl groups include, but are in no way limited to, ethenyl, propenyl, butenyl, pentenyl, and hexenyl, and the like.
[0029] The term “aromatic” refers to a ring or ring system having a conjugated pi electron system and includes both carbocyclic aromatic (e.g., phenyl) and heterocyclic aromatic groups (e.g., pyridine). The term includes monocyclic or fused-ring polycyclic (i.e., rings which share adjacent pairs of atoms) groups provided that the entire ring system is aromatic.
[0030] As used herein, “aryl” refers to an aromatic ring or ring system (i.e., two or more fused rings that share two adjacent carbon atoms) containing only carbon in the ring backbone. When the aryl is a ring system, every ring in the system is aromatic. The aryl group may have 6 to 18 carbon atoms, although the present definition also covers the occurrence of the term “aryl” where no numerical range is designated. In some embodiments, the aryl group has 6 to 10 carbon atoms. The aryl group may be designated as “C6-C10 aryl,” “C6 or C10 aryl,” or similar designations. Examples of aryl groups include, but are not limited to, phenyl, naphthyl, azulenyl, and anthracenyl.
[0031] An “aralkyl” or “arylalkyl” is an aryl group connected, as a substituent, via an alkylene group, such as “C7-14 aralkyl” and the like, including but not limited to benzyl, 2-phenylethyl, 3-phenylpropyl, and naphthylalkyl. In some cases, the alkylene group is a lower alkylene group (i.e., a C1-C6 alkylene group).
[0032] As used herein, “aryloxy” refers to RO- in which R is an aryl, as defined above, such as but not limited to phenyl.
[0033] As used herein, “heteroaryl” refers to an aromatic ring or ring system (i.e., two or more fused rings that share two adjacent atoms) that contain(s) one or more heteroatoms, that is, an element other than carbon, including but not limited to, nitrogen, oxygen and sulfur, in the ring backbone. When the heteroaryl is a ring system, every ring in the system is aromatic. The heteroaryl group may have 5-18 ring members (i.e., the number of atoms making up the ring backbone, including carbon atoms and heteroatoms), although the present definition also covers the occurrence of the term “heteroaryl” where no numerical range is designated. In some embodiments, the heteroaryl group has 5 to 10 ring members or 5 to 7 ring members. The heteroaryl group may be designated as “5-7 membered heteroaryl,” “5-10 membered heteroaryl,” or similar designations. Examples of heteroaryl rings include, but are not limited to, furyl, thienyl, phthalazinyl, pyrrolyl, oxazolyl, thiazolyl, imidazolyl, pyrazolyl, isoxazolyl, isothiazolyl, triazolyl, thiadiazolyl, pyridinyl, pyridazinyl, pyrimidinyl, pyrazinyl, triazinyl, quinolinyl, isoquinolinyl, benzoimidazolyl, benzoxazolyl, benzothiazolyl, indolyl, isoindolyl, and benzothienyl.
[0034] As used herein, “carbocyclyl” means a non-aromatic cyclic ring or ring system containing only carbon atoms in the ring system backbone. When the carbocyclyl is a ring system, two or more rings may be joined together in a fused, bridged or spiro-connected fashion. Carbocyclyls may have any degree of saturation provided that at least one ring in a ring system is not aromatic. Thus, carbocyclyls include cycloalkyls, cycloalkenyls, and cycloalkynyls. The carbocyclyl group may have 3 to 20 carbon atoms, although the present definition also covers the occurrence of the term “carbocyclyl” where no numerical range is designated. The carbocyclyl group may also be a medium size carbocyclyl having 3 to 10 carbon atoms. The carbocyclyl group could also be a carbocyclyl having 3 to 6 carbon atoms. The carbocyclyl group may be designated as “C3-C6 carbocyclyl” or similar designations. Examples of carbocyclyl rings include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, cyclohexenyl, 2,3-dihydro-indene, bicyclo[2.2.2]octanyl, adamantyl, and spiro[4.4]nonanyl.
[0035] As used herein, “cycloalkyl” means a fully saturated carbocyclyl ring or ring system. Examples include cyclopropyl, cyclobutyl, cyclopentyl, and cyclohexyl.
[0036] As used herein, “heterocyclyl” means a non-aromatic cyclic ring or ring system containing at least one heteroatom in the ring backbone. Heterocyclyls may be joined together in a fused, bridged or spiro-connected fashion. Heterocyclyls may have any degree of saturation provided that at least one ring in the ring system is not aromatic. The heteroatom(s) may be present in either a non-aromatic or aromatic ring in the ring system. The heterocyclyl group may have 3 to 20 ring members (i.e., the number of atoms making up the ring backbone, including carbon atoms and heteroatoms), although the present definition also covers the occurrence of the term “heterocyclyl” where no numerical range is designated. The heterocyclyl group may also be a medium size heterocyclyl having 3 to 10 ring members. The heterocyclyl group could also be a heterocyclyl having 3 to 6 ring members. The heterocyclyl group may be designated as “3-6 membered heterocyclyl” or similar designations. In preferred six membered monocyclic heterocyclyls, the heteroatom(s) are selected from one up to three of O, N or S, and in preferred five membered monocyclic heterocyclyls, the heteroatom(s) are selected from one or two heteroatoms selected from O, N, or S. Examples of heterocyclyl rings include, but are not limited to, azepinyl, acridinyl, carbazolyl, cinnolinyl, dioxolanyl, imidazolinyl, imidazolidinyl, morpholinyl, oxiranyl, oxepanyl, thiepanyl, piperidinyl, piperazinyl, dioxopiperazinyl, pyrrolidinyl, pyrrolidonyl, pyrrolidionyl, 4-piperidonyl, pyrazolinyl, pyrazolidinyl, 1,3-dioxinyl, 1,3-dioxanyl, 1,4-dioxinyl, 1,4-dioxanyl, 1,3-oxathianyl, 1,4-oxathiinyl, 1,4-oxathianyl, 2H-1,2-oxazinyl, trioxanyl, hexahydro-1,3,5-triazinyl, 1,3-dioxolyl, 1,3-dioxolanyl, 1,3-dithiolyl, 1,3-dithiolanyl, isoxazolinyl, isoxazolidinyl, oxazolinyl, oxazolidinyl, oxazolidinonyl, thiazolinyl, thiazolidinyl, 1,3-oxathiolanyl, indolinyl, isoindolinyl, tetrahydrofuranyl, tetrahydropyranyl, tetrahydrothiophenyl, tetrahydrothiopyranyl, tetrahydro-1,4-thiazinyl, thiamorpholinyl, dihydrobenzofuranyl, benzimidazolidinyl, and tetrahydroquinoline.
[0037] As used herein, “alkylene” refers to a branched, or straight chain fully saturated di-radical chemical group containing only carbon and hydrogen that is attached to the rest of the molecule via two points of attachment. By way of example only, “Ci-Cio alkylene” indicates that there are one to ten carbon atoms in the alkylene chain. Non-limiting examples include ethylene and propylene. When an alkylene is interrupted by a ring or ring system described herein, it means the ring or ring system is either inserted into a single covalent bond between two carbon atoms in the alkylene, or the ring or ring system is added to one terminal of the alkylene. For example, when a C2 alkylene is interrupted by a phenylene, it can encompass the following structures:- CH2-Ph-CH2-or-Ph-CIl2CIl2-.
[0038] As used herein, “alkenylene” refers to a straight or branched chain di-radical chemical group containing only carbon and hydrogen and containing at least one carbon-carbon double bond that is attached to the rest of the molecule via two points of attachment. The alkenylene group may be designated as “C2-C6 alkenylene” or similar designations. By way of example only, “C2-C6 alkenylene” indicates that there are two, three, four, five, or six carbon atoms in the alkenylene chain. When an alkenylene is interrupted by a ring or ring system described herein, it means the ring or ring system is either inserted into a single covalent bond (between two carbon atoms) in the alkenylene, or the ring or ring system is added to one terminal of the alkenylene.
[0039] As used herein, “alkynylene” refers to a straight or branched chain di-radical chemical group containing only carbon and hydrogen and containing at least one carbon-carbon triple bond that is attached to the rest of the molecule via two points of attachment. The alkynylene group may be designated as “C2-C6 alkynylene” or similar designations. By way of example only, “C2-C6 alkynylene” indicates that there are two, three, four, five, or six carbon atoms in the alkynylene chain. When an alkynylene is interrupted by a ring or ring system described herein, it means the ring or ring system is either inserted into a single covalent bond (between two carbon atoms) in the alkynylene, or the ring or ring system is added to one terminal of the alkynylene.
[0040] As used herein, “heteroalkylene” refers to an alkylene group, as defined herein, containing one or more heteroatoms in the carbon backbone (i.e., an alkylene group in which one or more carbon atoms is replaced with a heteroatom, for example, a nitrogen atom (N), oxygen atom (O) or sulfur atom (S)). For example, a -CH2- may be replaced with -O-, -S- or -NH-. Heteroalkylene groups include, but are not limited to, ether, thioether, amino-alkylene, and alkylene-amino-alkylene moieties. In some embodiments, the heteroalkylene may include one, two, three, four, or five -CH2CH2O- unit(s). Alternatively and/or additionally, one or more carbon atoms can also be substituted with an oxo (=O) to become a carbonyl. For example, a -CH2- may be replaced with -C(=O)-. It is understood that when a carbon atom is replaced with a carbonyl group, it refers to the replacement of -CH2- with -C(=O)-. When a carbon atom is replaced with a nitrogen atom, it refers to the replacement of -CH- with -N-. When a carbon atom is replaced with an oxygen or sulfur atom, it refers to the replacement of -CH2- with -O- or -S-. When a heteroalkylene is interrupted by a ring or ring system described herein, it means the ring or ring system is either inserted into a single covalent bond (e.g., between two carbon atoms, or between a carbon atom and a heteroatom) in the heteroalkylene, or the ring or ring system is added to one terminal of the heteroalkylene. For example, when a -CH2-O-CH2- is interrupted by a phenylene, it can encompass the following structures: -CH2-Ph-OCH2- or -Ph-CH2OCH2-.
[0041] As used herein, “(aryl)alkyl” refers to an aryl group, as defined above, connected, as a substituent, via an alkylene group, as described above. The alkylene and aryl group of an aralkyl may be substituted or unsubstituted. Examples include but are not limited to benzyl, 2-phenylalkyl, 3-phenylalkyl, and naphthylalkyl. In some embodiments, the alkylene is an unsubstituted straight chain containing 1, 2, 3, 4, 5, or 6 methylene unit(s).
[0042] As used herein, “(heteroaryl)alkyl” refers to a heteroaryl group, as defined above, connected, as a substituent, via an alkylene group, as defined above. The alkylene and heteroaryl group of a heteroaralkyl may be substituted or unsubstituted. Examples include but are not limited to 2-thienylalkyl, 3-thienylalkyl, furylalkyl, thienylalkyl, pyrrolylalkyl, pyridylalkyl, isoxazolylalkyl, and imidazolylalkyl, and their benzo-fused analogs. In some embodiments, the alkylene is an unsubstituted straight chain containing 1, 2, 3, 4, 5, or 6 methylene unit(s).
[0043] As used herein, “(heterocyclyl)alkyl” refers to a heterocyclic or a heterocyclyl group, as defined above, connected, as a substituent, via an alkylene group, as defined above. The alkylene and heterocyclyl groups of a (heterocyclyl)alkyl may be substituted or unsubstituted. Examples include but are not limited to (tetrahydro-2H-pyran-4-yl)methyl, (piperidin-4-yl)ethyl, (piperidin-4-yl)propyl, (tetrahydro-2H-thiopyran-4-yl)methyl, and (1,3-thiazinan-4-yl)methyl. In some embodiments, the alkylene is an unsubstituted straight chain containing 1, 2, 3, 4, 5, or 6 methylene unit(s).
[0044] As used herein, “(carbocyclyl)alkyl” refers to a carbocyclyl group (as defined herein) connected, as a substituent, via an alkylene group. Examples include but are not limited to cyclopropylmethyl, cyclobutylmethyl, cyclopentylethyl, and cyclohexylpropyl. In some embodiments, the alkylene is an unsubstituted straight chain containing 1, 2, 3, 4, 5, or 6 methylene unit(s).
[0045] As used herein, “alkoxyalkyl” or “(alkoxy)alkyl” refers to an alkoxy group connected via an alkylene group, such as C1-C8 alkoxyalkyl, or (C1-C8 alkoxy)C1-C6 alkyl, for example, -(CH2)1-3-OCH3.
[0046] As used herein, “-O-alkoxyalkyl” or “-O-(alkoxy)alkyl” refers to an alkoxy group connected via an -O-(alkylene) group, such as -O-(C1-C6 alkoxy)C1-C6 alkyl, for example, -O-(CH2)1-3-OCH3.
[0047] As used herein, “haloalkyl” refers to an alkyl group in which one or more of the hydrogen atoms are replaced by a halogen (e.g., mono-haloalkyl, di-haloalkyl, and tri-haloalkyl). Such groups include but are not limited to chloromethyl, fluoromethyl, difluoromethyl, trifluoromethyl, 1-chloro-2-fluoromethyl, and 2-fluoroisobutyl. A haloalkyl may be substituted or unsubstituted.
[0048] As used herein, “haloalkoxy” refers to an alkoxy group in which one or more of the hydrogen atoms are replaced by a halogen (e.g., mono-haloalkoxy, di-haloalkoxy, and tri-haloalkoxy). Such groups include but are not limited to chloromethoxy, fluoromethoxy, difluoromethoxy, trifluoromethoxy, 1-chloro-2-fluoromethoxy, and 2-fluoroisobutoxy. A haloalkoxy may be substituted or unsubstituted.
[0049] An “amino” group refers to a -NH2 group. The term “mono-substituted amino group” as used herein refers to an amino (-NH2) group where one of the hydrogen atoms is replaced by a substituent. The term “di-substituted amino group” as used herein refers to an amino (-NH2) group where each of the two hydrogen atoms is replaced by a substituent. The term “optionally substituted amino,” as used herein, refers to a -NRARB group where RA and RB are independently hydrogen, alkyl, cycloalkyl, aryl, heteroaryl, heterocyclyl, aralkyl, or heterocyclyl(alkyl), as defined herein.
[0050] An “O-carboxy” group refers to a “-OC(=O)R” group in which R is selected from hydrogen, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 carbocyclyl, C6-C10 aryl, 5-10 membered heteroaryl, and 3-10 membered heterocyclyl, as defined herein.
[0051] A “C-carboxy” group refers to a “-C(=O)OR” group in which R is selected from the group consisting of hydrogen, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 carbocyclyl, C6-C10 aryl, 5-10 membered heteroaryl, and 3-10 membered heterocyclyl, as defined herein. A non-limiting example includes carboxyl (i.e., -C(=O)OH).
[0052] A “sulfonyl” group refers to an “-SO2R” group in which R is selected from hydrogen, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 carbocyclyl, C6-C10 aryl, 5-10 membered heteroaryl, and 3-10 membered heterocyclyl, as defined herein.
[0053] A “sulfino” group refers to a “-S(=O)OH” group.
[0054] A “sulfo” group refers to a “-S(=O)2OH” or “-SO3H” group.
[0055] A “sulfonate” group refers to a “-SO3⁻” group.
[0056] A “sulfate” group refers to a “-SO4⁻” group.
[0057] An “S-sulfonamido” group refers to a “-SO2NRARB” group in which RA and RB are each independently selected from hydrogen, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 carbocyclyl, C6-C10 aryl, 5-10 membered heteroaryl, and 3-10 membered heterocyclyl, as defined herein.
[0058] An “N-sulfonamido” group refers to a “-N(RA)SO2RB” group in which RA and RB are each independently selected from hydrogen, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 carbocyclyl, C6-C10 aryl, 5-10 membered heteroaryl, and 3-10 membered heterocyclyl, as defined herein.
[0059] A “C-amido” group refers to a “-C(=O)NRARB” group in which RA and RB are each independently selected from hydrogen, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 carbocyclyl, C6-C10 aryl, 5-10 membered heteroaryl, and 3-10 membered heterocyclyl, as defined herein.
[0060] An “N-amido” group refers to a “-N(RA)C(=O)RB” group in which RA and RB are each independently selected from hydrogen, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 carbocyclyl, C6-C10 aryl, 5-10 membered heteroaryl, and 3-10 membered heterocyclyl, as defined herein.
[0061] An “O-carbamyl” group refers to a “-OC(=O)N(RARB)” group in which RA and RB can be the same as defined with respect to S-sulfonamido. An O-carbamyl may be substituted or unsubstituted.
[0062] An “N-carbamyl” group refers to an “ROC(=O)N(RA)-” group in which R and RA can be the same as defined with respect to N-sulfonamido. An N-carbamyl may be substituted or unsubstituted.
[0063] An “O-thiocarbamyl” group refers to a “-OC(=S)-N(RARB)” group in which RA and RB can be the same as defined with respect to S-sulfonamido. An O-thiocarbamyl may be substituted or unsubstituted.
[0064] An “N-thiocarbamyl” group refers to an “ROC(=S)N(RA)-” group in which R and RA can be the same as defined with respect to N-sulfonamido. An N-thiocarbamyl may be substituted or unsubstituted.
[0065] The term “alkylamino” or “(alkyl)amino” refers to an amino group wherein one or both hydrogen atoms are replaced by an alkyl group.
[0066] An “(alkoxy)alkyl” group refers to an alkoxy group connected via an alkylene group, such as a “(C1-C6 alkoxy)C1-C6 alkyl” and the like.
[0067] The term “hydroxy” as used herein refers to a -OH group.
[0068] The term “cyano” group as used herein refers to a “-CN” group.
[0069] The term “azido” as used herein refers to a -N3 group.
[0070] The term “isonitrile” as used herein refers to a “-N⁺≡C⁻” group.
[0071] Wherever a group or substituent is depicted as a di-radical (i.e., has two points of attachment to the rest of the molecule), it is to be understood that the substituent can be attached in any directional configuration unless otherwise indicated. Thus, for example, a substituent depicted as -AE- or [chemical structure] includes the substituent being oriented such that the A is attached at the leftmost attachment point of the molecule as well as the case in which A is attached at the rightmost attachment point of the molecule. In addition, if a group or substituent is depicted as [chemical structure], and when L is defined as a bond or absent, such group or substituent is equivalent to [chemical structure].
[0072] When a group is described as “optionally substituted” it may be either unsubstituted or substituted. Likewise, when a group is described as being “substituted,” the substituent may be selected from one or more of the indicated substituents. As used herein, a substituted group is derived from the unsubstituted parent group in which there has been an exchange of one or more hydrogen atoms for another atom or group. Unless otherwise indicated, when a group is deemed to be “substituted,” it is meant that the group is substituted with one or more substituents independently selected from C1-C6 alkyl, C1-C6 alkenyl, C1-C6 alkynyl, C1-C6 heteroalkyl, C3-C7 carbocyclyl (optionally substituted with halo, C1-C6 alkyl, C1-C6 alkoxy, C1-C6 haloalkyl, and C1-C6 haloalkoxy), C3-C7 carbocyclyl-C1-C6-alkyl (optionally substituted with halo, C1-C6 alkyl, C1-C6 alkoxy, C1-C6 haloalkyl, and C1-C6 haloalkoxy), 3-10 membered heterocyclyl (optionally substituted with halo, C1-C6 alkyl, C1-C6 alkoxy, C1-C6 haloalkyl, and C1-C6 haloalkoxy), 3-10 membered heterocyclyl-C1-C6-alkyl (optionally substituted with halo, C1-C6 alkyl, C1-C6 alkoxy, C1-C6 haloalkyl, and C1-C6 haloalkoxy), aryl (optionally substituted with halo, C1-C6 alkyl, C1-C6 alkoxy, C1-C6 haloalkyl, and C1-C6 haloalkoxy), (aryl)C1-C6 alkyl (optionally substituted with halo, C1-C6 alkyl, C1-C6 alkoxy, C1-C6 haloalkyl, and C1-C6 haloalkoxy), 5-10 membered heteroaryl (optionally substituted with halo, C1-C6 alkyl, C1-C6 alkoxy, C1-C6 haloalkyl, and C1-C6 haloalkoxy), (5-10 membered heteroaryl)C1-C6 alkyl (optionally substituted with halo, C1-C6 alkyl, C1-C6 alkoxy, C1-C6 haloalkyl, and C1-C6 haloalkoxy), halo, -CN, hydroxy, C1-C6 alkoxy, (C1-C6 alkoxy)C1-C6 alkyl, -O(C1-C6 alkoxy)C1-C6 alkyl; (C1-C6 haloalkoxy)C1-C6 alkyl; -O(C1-C6 haloalkoxy)C1-C6 alkyl; aryloxy, sulfhydryl (mercapto), halo(C1-C6)alkyl (e.g., -CF3), halo(C1-C6)alkoxy (e.g., -OCF3), C1-C6 alkylthio, arylthio, amino, amino(C1-C6)alkyl, nitro, O-carbamyl, N-carbamyl, O-thiocarbamyl, N-thiocarbamyl, C-amido, N-amido, S-sulfonamido, N-sulfonamido, C-carboxy, O-carboxy, acyl, cyanato, isocyanato, thiocyanate, isothiocyanate, sulfinyl, sulfonyl, -SO3H, sulfonate, sulfate, sulfino, -OSO3 C1-4alkyl, monophosphate, diphosphate, triphosphate, oxo (=O), or thioxo (=S). Wherever a group is described as “optionally substituted,” that group can be substituted with the above substituents.
[0073] In each instance where a single mesomeric form of a compound described herein is shown, the alternative mesomeric forms are equally contemplated.
[0074] As used herein, a “nucleotide” includes a nitrogen-containing heterocyclic base, a sugar, and one or more phosphate groups. Nucleotides are the monomeric units of a nucleic acid sequence. In RNA, the sugar is a ribose, and in DNA a deoxyribose, i.e., a sugar lacking a hydroxyl group that is present in ribose. The nitrogen-containing heterocyclic base can be a purine, deazapurine, or pyrimidine base. Purine bases include adenine (A) and guanine (G), and modified derivatives or analogs thereof, such as 7-deaza adenine or 7-deaza guanine. Pyrimidine bases include cytosine (C), thymine (T), and uracil (U), and modified derivatives or analogs thereof. The C-1 atom of deoxyribose is bonded to N-1 of a pyrimidine or N-9 of a purine.
[0075] As used herein, a “nucleotide conjugate” generally refers to a nucleotide labeled with a fluorescent moiety, optionally through a cleavable linker as described herein. In some embodiments, when a nucleotide conjugate is described as an unlabeled nucleotide, such nucleotide does not include a fluorescent moiety. In some further embodiments, an unlabeled nucleotide conjugate also does not have a cleavable linker.
[0076] As used herein, a “nucleoside” is structurally similar to a nucleotide but is missing the phosphate moieties. An example of a nucleoside analogue would be one in which the label is linked to the base and there is no phosphate group attached to the sugar molecule. The term “nucleoside” is used herein in its ordinary sense as understood by those skilled in the art. Examples include, but are not limited to, a ribonucleoside comprising a ribose moiety and a deoxyribonucleoside comprising a deoxyribose moiety. A modified pentose moiety is a pentose moiety in which an oxygen atom has been replaced with a carbon and/or a carbon has been replaced with a sulfur or an oxygen atom. A “nucleoside” is a monomer that can have a substituted base and/or sugar moiety. Additionally, a nucleoside can be incorporated into larger DNA and/or RNA polymers and oligomers.
[0077] The term “purine base” is used herein in its ordinary sense as understood by those skilled in the art and includes its tautomers. Similarly, the term “pyrimidine base” is used herein in its ordinary sense as understood by those skilled in the art and includes its tautomers. A non-limiting list of optionally substituted purine bases includes purine, adenine, guanine, deazapurine, 7-deaza adenine, 7-deaza guanine, hypoxanthine, xanthine, alloxanthine, 7-alkylguanine (e.g., 7-methylguanine), theobromine, caffeine, uric acid and isoguanine. Examples of pyrimidine bases include, but are not limited to, cytosine, thymine, uracil, 5,6-dihydrouracil and 5-alkylcytosine (e.g., 5-methylcytosine).
[0078] As used herein, when an oligonucleotide or polynucleotide is described as “comprising” a nucleoside or nucleotide described herein, it means that the nucleoside or nucleotide described herein forms a covalent bond with the oligonucleotide or polynucleotide. Similarly, when a nucleoside or nucleotide is described as part of an oligonucleotide or polynucleotide, such as “incorporated into” an oligonucleotide or polynucleotide, it means that the nucleoside or nucleotide described herein forms a covalent bond with the oligonucleotide or polynucleotide. In some such embodiments, the covalent bond is formed between a 3' hydroxy group of the oligonucleotide or polynucleotide with the 5' phosphate group of a nucleotide described herein as a phosphodiester bond between the 3' carbon atom of the oligonucleotide or polynucleotide and the 5' carbon atom of the nucleotide.
[0079] As used herein, the term “cleavable linker” is not meant to imply that the whole linker is required to be removed. The cleavage site can be located at a position on the linker that ensures that part of the linker remains attached to the detectable label and / or nucleoside or nucleotide moiety after cleavage.
[0080] As used herein, “derivative” or “analog” means a synthetic nucleotide or nucleoside derivative having modified base moieties and/or modified sugar moieties. Such derivatives and analogs are discussed in, e.g., Scheit, Nucleotide Analogs (John Wiley & Sons, 1980) and Uhlmann et al., Chemical Reviews 90:543-584, 1990. Nucleotide analogs can also comprise modified phosphodiester linkages, including phosphorothioate, phosphorodithioate, alkyl-phosphonate, and phosphoramidate linkages. “Derivative,” “analog” and “modified,” as used herein, may be used interchangeably, and are encompassed by the terms “nucleotide” and “nucleoside” defined herein.
[0081] As used herein, a biotin moiety-containing molecule or an analog thereof comprises the biotin moiety of structure [chemical structure] or [chemical structure]. In some cases, the biotin moiety is attached to the remaining portion of the molecule via a linker, such as [chemical structure]. The analog of the biotin moiety-containing molecule may include a substituted biotin moiety.
[0082] As used herein, an alkyl chloride-containing molecule or an analog thereof comprises the structure [chemical structure]. The analog of the alkyl chloride moiety-containing molecule may include a substituted alkyl chloride moiety.
[0083] As used herein, a dinitrophenyl (DNP) moiety-containing molecule or an analog thereof comprises the structure [chemical structure]. In some cases, the DNP moiety is attached to the remaining portion of the molecule via a linker, such as [chemical structure]. The analog of the DNP moiety-containing molecule may include a substituted DNP moiety.
[0084] As used herein, a digoxigenin (DIG) moiety-containing molecule or an analog thereof comprises the structure [chemical structure] or diastereomers thereof such as [chemical structure]. In some cases, the DIG moiety is attached to the remaining portion of the molecule via a linker, such as [chemical structure]. The analog of the DIG moiety-containing molecule may include a substituted DIG moiety.
[0085] As used herein, a β-N-acetylglucosamine (O-GlcNAc) moiety-containing molecule or an analog thereof comprises the structure [chemical structure] or [chemical structure]. The analog of the O-GlcNAc moiety-containing molecule may include a substituted O-GlcNAc moiety.
[0086] As used herein, an alkyl guanine moiety-containing molecule or an analog thereof comprises the structure [chemical structure]. The analog of the alkyl guanine moiety-containing molecule may include a substituted alkyl guanine moiety.
[0087] As used herein, a 3-nitrotyrosine moiety-containing molecule or an analog thereof comprises the structure [chemical structure]. The analog of the 3-nitrotyrosine moiety-containing molecule may include a substituted 3-nitrotyrosine moiety.
[0088] As used herein, anti-DNP antibody refers to an antibody capable of specific binding to a DNP moiety described herein.
[0089] As used herein, anti-DIG antibody refers to an antibody capable of specific binding to a DIG moiety described herein.
[0090] As used herein, wheat germ agglutinin (WGA) refers to a lectin capable of binding to an O-GlcNAc moiety described herein.
[0091] As used herein, SNAP-Tag® refers to a commercially available protein tag. SNAP-Tag® is capable of specific binding to an alkyl guanine moiety described herein.
[0092] As used herein, HaloTag® refers to a commercially available protein tag. HaloTag® is capable of specific binding to an alkyl chloride moiety described herein.
[0093] As used herein, anti-nitrotyrosine antibody refers to an antibody capable of specific binding to a 3-nitrotyrosine moiety described herein.
[0094] As used herein, the term “phosphate” is used in its ordinary sense as understood by those skilled in the art, and includes its protonated forms (for example, [chemical structure] and [chemical structure]). As used herein, the terms “monophosphate,” “diphosphate,” and “triphosphate” are used in their ordinary sense as understood by those skilled in the art and include protonated forms.
[0095] As understood by one of ordinary skill in the art, a compound such as a nucleotide conjugate described herein may exist in ionized form, e.g., containing a -CO2⁻, -SO3⁻ or -O⁻. If a compound contains a positively or negatively charged substituent group, it may also contain a negatively or positively charged counterion such that the compound as a whole is neutral. In other aspects, the compound may exist in a salt form, where the counterion is provided by a conjugate acid or base.
[0096] As used herein, the term “orthogonal” or “bioorthogonal,” in the context of a chemical reaction, refers to the situation in which there are two pairs of substances and each substance can interact with its respective partner but does not interact with either substance of the other pair. In the context of functional groups, such as the first or second functional group of the unlabeled nucleotides, it means that the first functional groups will selectively react with the multi-dye labeled polyphosphodiester scaffold described herein, or an antibody comprising the multi-dye labeled polyphosphodiester scaffold described herein, while the second functional groups will have little or no reactivity towards the same chemical entities that are reactive to the first functional groups.
[0097] As used herein, the term “phasing” refers to a phenomenon in SBS that is caused by incomplete removal of the 3' terminators and fluorophores, and/or failure to complete the incorporation of a portion of DNA strands within clusters by polymerases at a given sequencing cycle. Prephasing is caused by the incorporation of nucleotides without effective 3' terminators, wherein the incorporation event runs one cycle ahead due to a termination failure. Phasing and prephasing cause the measured signal intensities for a specific cycle to consist of the signal from the current cycle as well as noise from the preceding and following cycles. As the number of cycles increases, the fraction of sequences per cluster affected by phasing and prephasing increases, hampering the identification of the correct base. Prephasing can be caused by the presence of a trace amount of unprotected or unblocked 3'-OH nucleotides during sequencing by synthesis (SBS). The unprotected 3'-OH nucleotides could be generated during the manufacturing processes or possibly during the storage and reagent handling processes.
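The cumulative effect of phasing and prephasing described above can be sketched with a simple toy model. The model is an assumption made for illustration and is not part of the disclosure: in each cycle every strand independently falls one position behind with probability `phasing`, runs one position ahead with probability `prephasing`, and otherwise stays in step. The function name and the default rates are hypothetical.

```python
def simulate_phasing(cycles: int, phasing: float = 0.002, prephasing: float = 0.001) -> list[float]:
    """Return the in-phase fraction of strands in a cluster after each cycle.

    Toy model: per-cycle rates are constant and independent of sequence.
    """
    # Maps offset (strand position minus expected position) to strand fraction.
    distribution = {0: 1.0}
    in_phase = []
    for _ in range(cycles):
        updated: dict[int, float] = {}
        for offset, fraction in distribution.items():
            # Phasing: the strand is not extended and lags by one position.
            updated[offset - 1] = updated.get(offset - 1, 0.0) + fraction * phasing
            # Prephasing: the strand is extended twice and leads by one position.
            updated[offset + 1] = updated.get(offset + 1, 0.0) + fraction * prephasing
            # Normal case: the strand is extended exactly once.
            stay = 1.0 - phasing - prephasing
            updated[offset] = updated.get(offset, 0.0) + fraction * stay
        distribution = updated
        in_phase.append(distribution.get(0, 0.0))
    return in_phase


if __name__ == "__main__":
    fractions = simulate_phasing(150)
    print(f"in phase after cycle 1: {fractions[0]:.4f}")
    print(f"in phase after cycle 150: {fractions[-1]:.4f}")
```

Under these assumptions the in-phase fraction shrinks as cycles accumulate, which matches the qualitative statement that the affected fraction of sequences per cluster grows with the number of cycles.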
[0098] As used herein, the term “sequence context effect” or “sequence specific effect” refers to the effect that the intensity of a base as shown in a cloud scatterplot may be impacted by the preceding sequence context during sequencing by synthesis (SBS), in particular two-channel SBS. This intensity modulation adds noise to the system and can cause miscalls when the sequence-specific intensity modulation shifts a given cluster’s intensity towards a decision boundary. The resulting errors are also known as sequence specific errors (SSE). Without being bound by a particular theory, one reason for the sequence-specific intensity shifts is differential incorporation of fully functionalized nucleotides (ffNs) labeled with different dyes (e.g., green ffA and blue ffA). Another reason for the fluorescent signal intensity shift may be that the preceding bases have different effects on the incorporated labeled nucleotides (e.g., quenching or enhancing dye signal intensity). As described in the present disclosure, using a single fully functionalized nucleotide (ffN) in connection with affinity reagent(s) that can produce color in two channels can reduce or eliminate sequence-specific intensity shifts and thereby improve base calling accuracy.
Multi-Dye Labeled Polyphosphodiester Scaffolds
[0099] Certain embodiments of the present disclosure relate to a multi-dye labeled phosphodiester oligomer or polymer (polyphosphodiester) having the structure of Formula (I) or (II):
[chemical structures of Formula (I) and Formula (II)]
or a salt thereof, wherein:
R1 is -OH, -SH, -O⁻, -S⁻, -O⁻M⁺, -S⁻M⁺, -O(unsubstituted or substituted C1-C6 alkyl), or -O(unsubstituted or substituted C2-C6 alkenyl);
R2 is O or S;
L1 is an unsubstituted or substituted C2-C6 alkylene, an unsubstituted or substituted C2-C6 alkenylene, an unsubstituted or substituted C2-C6 alkynylene, a 2 to 8 membered heteroalkylene, an unsubstituted or substituted C3-C10 membered carbocyclylene, an unsubstituted or substituted C6-C10 arylene, an unsubstituted or substituted 5 to 10 membered heteroarylene, or an unsubstituted or substituted 3 to 10 membered heterocyclylene, or the C2-C6 alkylene, C2-C6 alkenylene, C2-C6 alkynylene, or 2 to 8 membered heteroalkylene each independently interrupted by an unsubstituted or substituted C3-C10 membered carbocyclylene, an unsubstituted or substituted C6-C10 arylene, an unsubstituted or substituted 5 to 10 membered heteroarylene, or an unsubstituted or substituted 3 to 10 membered heterocyclylene;
L2 is a bond, an unsubstituted or substituted C1-C6 alkylene, an unsubstituted or substituted C2-C6 alkenylene, an unsubstituted or substituted C2-C6 alkynylene, a 2 to 8 membered heteroalkylene, an unsubstituted or substituted C3-C10 membered carbocyclylene, an unsubstituted or substituted C6-C10 arylene, an unsubstituted or substituted 5 to 10 membered heteroarylene, or an unsubstituted or substituted 3 to 10 membered heterocyclylene;
each of L3, L4 and L5 is independently a bond, an unsubstituted or substituted C1-C6 alkylene or a 2 to 8 membered heteroalkylene;
M⁺ is a cation;
Fl is a fluorescent dye moiety;
each X is independently a functional group of the fluorescent dye moiety Fl;
each Y is independently a functional group of the polyphosphodiester, wherein X-Y refers to a reaction product between X and Y forming covalent bonding;
each RA and RB is independently H, -OH, -SH, -OP(=Ra)(Rb)Rc, unsubstituted or substituted C1-C6 alkyl, unsubstituted or substituted C6-C10 aryl, unsubstituted or substituted 5 or 6 membered heteroaryl, or a functional group capable of binding to a biological molecule of interest either via a bioorthogonal reaction selected from the group consisting of a [3+2] dipolar cycloaddition, a Diels-Alder cycloaddition, a [4+1] cycloaddition, a phosphine ligation, and condensation with 2-acylphenyl boronic acid, or via noncovalent interaction with the biological molecule of interest;
Ra is O or S;
each of Rb and Rc is independently -OH, -SH, -O⁻, -S⁻, -O⁻M⁺, -S⁻M⁺, -O(unsubstituted or substituted C1-C6 alkyl), or -O(unsubstituted or substituted C2-C6 alkenyl);
m is an integer of 5 or more;
each m1 and m2 is independently an integer of 3 or more;
n is an integer of 1 or more;
q is an integer of 2 or more;
k is 0 or 1, provided that when k is 0, L3 and L5 are directly attached to L2; and
p is 0 or 1, provided that when p is 0, RA is directly attached to L2, and n is at least 2.
In some embodiments, m is an integer of 7 or more.
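The integer limits recited for Formula (I) and (II) can be collected into a small consistency check. This is only a sketch: the function name `check_indices` is hypothetical, only the indices that are supplied are checked, and the rule tying n to p reflects one reading of the proviso (n is at least 2 when p is 0).

```python
def check_indices(m=None, m1=None, m2=None, n=None, q=None, k=None, p=None) -> list[str]:
    """Return a list of violated index limits; an empty list means none were found."""
    problems = []
    # Lower bounds as recited: m >= 5, m1 >= 3, m2 >= 3, n >= 1, q >= 2.
    minimums = {"m": (m, 5), "m1": (m1, 3), "m2": (m2, 3), "n": (n, 1), "q": (q, 2)}
    for name, (value, lowest) in minimums.items():
        if value is not None and value < lowest:
            problems.append(f"{name} must be at least {lowest}")
    # k and p are each limited to 0 or 1.
    for name, value in (("k", k), ("p", p)):
        if value is not None and value not in (0, 1):
            problems.append(f"{name} must be 0 or 1")
    # Assumed reading of the proviso: when p is 0, n is at least 2.
    if p == 0 and n is not None and n < 2:
        problems.append("n must be at least 2 when p is 0")
    return problems


if __name__ == "__main__":
    print(check_indices(m=7, n=1, p=1))
    print(check_indices(m=4, n=1, p=0))
```

Returning a list of messages rather than raising on the first failure lets a caller see every limit that a candidate set of indices violates at once.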
[0100] In some embodiments of the polyphosphodiester of Formula (I) or (II), R1 is -OH. In some other embodiments, R1 is -O⁻M⁺, and M⁺ is Na⁺ or K⁺. In some embodiments, R1 is -O(unsubstituted C1-C6 alkyl), such as -OMe or -OEt. In some other embodiments, R1 is -O(unsubstituted C2-C6 alkenyl), such as -O-allyl. In some embodiments, R2 is O. In other embodiments, R2 is S. In some embodiments, L1 is -CH2-CH2-, -CH=CH- or -CH2-CH=CH-CH2-, wherein -CH2-CH2-, -CH=CH- or -CH2-CH=CH-CH2- is optionally substituted. In one embodiment, L1 is -CH2-CH2-. In some other embodiments, L1 is cyclohexylene or C2-C6 alkylene interrupted by cyclohexylene. In some embodiments, L2 is C1-C3 alkylene (e.g., -CH2-). In some such embodiments, k is 1 and L4 is a bond or C1-C3 alkylene (e.g., -CH2-). In some other embodiments, L2 is -CH2-CH2-, -CH=CH- or -CH2-CH=CH-CH2-, wherein -CH2-CH2-, -CH=CH- or -CH2-CH=CH-CH2- is optionally substituted. In some other embodiments, L2 is cyclohexylene. In some such embodiments, k is 0, and L4 is a bond or C1-C3 alkylene (e.g., -CH2-). In some embodiments, L3 is C1-C3 alkylene (e.g., -CH2-). In other embodiments, L3 is a bond.
[0101] In some embodiments of the polyphosphodiester of Formula (I), m is 7, 8, 9, or 10. In some such embodiments, n is 1, 2, 3, 4, or 5. In some such embodiments, p is 1. In other embodiments, p is 0. In some such embodiments, m is 7, n is 1 or 2, and p is 1. In some such embodiments, the oligomer of Formula (I) can include the following structure: [chemical structure]. In other embodiments, the ethylene moiety (L1) of the phosphodiester repeating unit can be optionally substituted, or the ethylene unit can be replaced by -CH=CH-, -CH2-CH=CH-CH2- or cyclohexylene, each independently unsubstituted or substituted. In some embodiments, L2 is C1-C3 alkylene (e.g., -CH2-). In other embodiments, L2 is a 2 to 8 membered heteroalkylene containing one or more heteroatoms selected from N, O and =O. In some embodiments, L3 is C1-C3 alkylene (e.g., -CH2-). In other embodiments, L3 is a 2 to 8 membered heteroalkylene containing one or more heteroatoms selected from N, O and =O. In some embodiments, RA is OH. In other embodiments, RA is an optionally substituted phenyl or 5-6 membered heteroaryl. In some embodiments, L4 is a bond. In other embodiments, L4 is a 2 to 8 membered heteroalkylene containing one or more heteroatoms selected from N, O and =O. In one embodiment, L4 is -NHC(=O)(CH2)2-4C(=O)-. In some embodiments, Y is DBCO and X is azido. As such, X-Y forms a triazole.
[0102] In some embodiments of the polyphosphodiester of Formula (II), each m1 and m2 is independently 3, 4, 5, 6, 7 or 8. In some such embodiments, q is 2, 3, 4, or 5. [chemical structures, ending with Formula (II-c)], or a salt thereof.
[0103] In some embodiments of the polyphosphodiester of Formula (I), (I-a), (I-b), (II), (II-a), (II-b) or (II-c), one of X and Y is amino, an unsubstituted or substituted dibenzocyclooctyne (DBCO) moiety, or an unsubstituted or substituted trans-cyclooctene (TCO) moiety, and the other one of X and Y is carboxyl, azido, or an unsubstituted or substituted tetrazine moiety. For example, X is -NH2 and Y is -C(O)OH, and X-Y forms an amide bond; X is DBCO and Y is azido; or X is TCO and Y is an unsubstituted or substituted tetrazine moiety (for example, phenyl tetrazine (e.g., [chemical structure] or [chemical structure], wherein the phenyl ring may be optionally substituted); pyrimidyl tetrazine (e.g., [chemical structure] or [chemical structure], wherein the pyrimidyl ring is optionally substituted); methyl tetrazine ([chemical structure]); pyridyl tetrazine (e.g., [chemical structure], where the pyridyl ring is optionally substituted); or t-butyl tetrazine (e.g., [chemical structure])).
[0104] In any embodiments of the polyphosphodiester of Formula (I), (I-a), (I-b), (II), (II-a), (II-b) or (II-c), the biological molecule of interest is a nucleotide, an oligonucleotide, a polynucleotide or an antibody (e.g., avidin such as streptavidin or neutravidin, anti-DIG, anti-DNP, HaloTag®, SNAP-Tag®, WGA (lectin), anti-nitrotyrosine antibody, His-Tag, or oligo-aspartate protein).
[0105] In some embodiments, the biological molecule is covalently attached to the multi-dye labeled polyphosphodiester via the bioorthogonal reaction with one of RA and RB. In some embodiments, at least one of RA and RB comprises or is a functional group selected from azido, amino, unsubstituted or substituted vinyl, unsubstituted or substituted tetrazine, unsubstituted or substituted triazine, unsubstituted or substituted cyclopenta-1,3-dienyl, -S(O)2CH=CH2, -O-(CH2)2-SCH=CH2, sydnone, imino sydnone, nitrone, cyclopropenone, cyclopropenium ion, 1,3-dithiolium-4-olate (DTO), chloro-oxime, amino hydrazide, or [structure not reproduced]. In some such embodiments, the biological molecule of interest comprises -O-NH2, -SH, -NH-NH2, unsubstituted or substituted alkynyl, unsubstituted or substituted C5-C16 cycloalkynyl (e.g., unsubstituted or substituted bicyclo[6.1.0]nonyne (BCN) moiety), unsubstituted or substituted 5 to 16 membered heterocycloalkynyl (e.g., unsubstituted or substituted DBCO), or [structure not reproduced], unsubstituted or substituted C5-C16 cycloalkenyl (e.g., unsubstituted or substituted norbornene or TCO), unsubstituted or substituted 5 to 16 membered heterocycloalkenyl, substituted vinyl, isonitrile, substituted boronic acid moiety, substituted phosphines, or [structures not reproduced]. In other embodiments, the biological molecule of interest comprises a functional group selected from azido, amino, unsubstituted or substituted vinyl, unsubstituted or substituted tetrazine, unsubstituted or substituted triazine, unsubstituted or substituted cyclopenta-1,3-dienyl, -S(O)2CH=CH2, -O-(CH2)2-SCH=CH2, sydnone, imino sydnone, nitrone, cyclopropenone, cyclopropenium ion, 1,3-dithiolium-4-olate (DTO), chloro-oxime, amino hydrazide, or [structure not reproduced].
In some such embodiments, at least one of RA and RB comprises or is a functional group selected from -O-NH2, -SH, -NH-NH2, unsubstituted or substituted alkynyl, unsubstituted or substituted C5-C16 cycloalkynyl (e.g., unsubstituted or substituted bicyclo[6.1.0]nonyne (BCN) moiety), unsubstituted or substituted 5 to 16 membered heterocycloalkynyl (e.g., unsubstituted or substituted DBCO), or [structure not reproduced], unsubstituted or substituted C5-C16 cycloalkenyl (e.g., unsubstituted or substituted norbornene or TCO), unsubstituted or substituted 5 to 16 membered heterocycloalkenyl, substituted vinyl, isonitrile, substituted boronic acid moiety, substituted phosphines, or [structures not reproduced]. In some further embodiments, RA is H, OH, -OP(=O)(OH)O⁻, or -OP(=O)(OH)O⁻M⁺, one of RB and the biological molecule comprises alkynyl, unsubstituted or substituted DBCO, substituted phosphines (such as phosphine methyl ester, phosphine alcohol, phosphine amine, or phosphine thiol; example structures not reproduced), unsubstituted or substituted tetrazine (e.g., phenyl tetrazine, pyrimidyl tetrazine, methyl tetrazine, pyridyl tetrazine, t-butyl tetrazine), triazine, unsubstituted or substituted BCN moiety, unsubstituted or substituted norbornene moiety, vinyl boronic acid, or 2-acylphenyl boronic acid (structures not reproduced), and the other one of RB and the biological molecule comprises azido, phenyl azido, unsubstituted or substituted TCO moiety, primary isonitrile, tertiary isonitrile, or amino hydrazide (example structures not reproduced). In some further embodiments, one of RB and the biological molecule comprises alkynyl, norbornene, DBCO, or BCN, and the other one of RB and the biological molecule comprises azido. In some further embodiments, one of RB and the biological molecule comprises an optionally substituted triphenylphosphine moiety, and the other one of RB and the biological molecule comprises phenyl azido.
In some further embodiments, one of RB and the biological molecule comprises TCO, and the other one of RB and the biological molecule comprises a substituted tetrazine. In still further embodiments, one of RB and the biological molecule comprises an amino hydrazide moiety, and the other one of RB and the biological molecule comprises a 2-acylphenyl boronic acid moiety.
[0106] In some embodiments, the biological molecule is an oligonucleotide or polynucleotide, wherein the oligonucleotide or polynucleotide is at least partially complementary to, and hybridized to, a target polynucleotide immobilized on a surface of a solid support.
[0107] In other embodiments, the biological molecule is an antibody, and it binds to the multi-dye labeled polyphosphodiester as described herein via noncovalent interaction. Non-limiting examples of such binding partner pairing are summarized in the table below:
One of RA / RB | Antibody binding partner
Biotin | Avidin (e.g., streptavidin or neutravidin)
Alkyl chloride | HaloTag®
DNP | Anti-DNP antibody
DIG | Anti-DIG antibody
β-N-acetyl glucosamine (O-GlcNAc) | WGA (lectin)
Alkyl guanine | SNAP-Tag®
3-nitrotyrosine | Anti-nitrotyrosine antibody
Nickel or cobalt complex such as Ni-nitrilotriacetic acid (NTA) | His-Tag
Zinc complex | Oligo-aspartate protein
[0108] In any embodiments of the multi-dye labeled polyphosphodiester scaffolds described herein, such scaffold may be water soluble. In some embodiments, the scaffold may reduce dye-dye quenching and / or boost brightness of the fluorophore by about or at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 100%, 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200%, 225%, 250%, 275%, or 300%, as measured by fluorescence intensity based on a single fluorophore.
Synthesis of Polyphosphodiester Scaffolds
[0109] Polyphosphodiester scaffolds may be synthesized by two different procedures. In the first, a DNA synthesizer may be used to build the scaffolds in a stepwise manner via phosphoramidite chemistry, as illustrated in Scheme 1 below. This method allows a high degree of control over the final structure of the polyphosphodiester, as a different monomer may be used for each step of the process, and it readily allows for block polymers as required for scaffolds with more than two dyes. Two amidites are designed for the phosphodiester elongation. Amidite A contains a functional group X which allows further modification to attach dyes or other desired functions. Amidite B is for standard elongation as a spacer between dyes.
Scheme 1. Synthesis of polyphosphodiester scaffold using phosphoramidite chemistry
[Reaction scheme not reproduced; the synthesis is carried out on a DNA synthesizer], wherein L4, X, Y and m are described herein, and * refers to the fluorescent dye moiety. In one embodiment, L4 is -NHC(=O)(CH2)2-4C(=O)-. X / Y pairing may include, but is not limited to, -NH2 / -COOH; DBCO / -N3; TCO / Tz (e.g., methyl tetrazine). Furthermore, at least one terminal hydroxy group can further be modified to install a terminal alkyne moiety. In addition to the terminal alkyne moiety, additional options such as a terminal tetrazine moiety, triphenylphosphine, or 2-acylphenylboronic acid moiety can also be used. In some embodiments, Y is DBCO and X is azido. As such, X-Y forms a triazole.
[0110] The second approach is via Ring-Opening Polymerization (ROP) of cyclic phosphate monomers (Scheme 2), reported by Muller, et al. RSC Advances 2015, 5 (53) 42881-42888. Controlled ROP allows polyphosphodiester chains to be synthesized with a specific molecular weight, determined only by the ratio of initiator to monomer in the reaction mixture. The initiator, monomer, base, solvent and temperature all play a role in ensuring the rate of reaction is slow enough for polymerization to be controlled. ROP of phospholane ester and amidate monomers results in non-water soluble polyphosphotriesters and polyphosphoamidates respectively, followed by a final deprotection step to yield the desired scaffold.
Scheme 2. General reaction scheme for the ROP of a phospholane monomer to form polyphosphodiester scaffold
[Reaction scheme not reproduced: initiator RA-OH (1 eq) and phospholane monomer (m eq); base, solvent, dry conditions, -78 °C to RT, minutes to hours; followed by deprotection and capping], wherein Z is NH or O; R1 is an unsubstituted or substituted C1-C6 alkyl or C2-C6 alkenyl as defined.
[0111] Careful consideration of the initiator and terminator will dictate the end groups of the polymer, which may comprise whole dyes or click pairs for further functionalization or attachment to an unlabeled nucleotide in the post-incorporation labeling method described herein. A specific example of functionalization of the phosphodiester scaffold with an activated ester group is described in Scheme 3, reported by Nifant’ev, et al. Polymers 2021; 13(6):868.
Scheme 3. End capping polyphosphodiester to form an activated ester
[Reaction scheme not reproduced], wherein R and R’ may be RA and RB as defined herein, respectively.
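The statement in paragraph [0110] that chain length is set by the ratio of monomer to initiator can be illustrated with a short calculation. The following Python sketch is only an idealized estimate, assuming a living polymerization in which every initiator molecule starts exactly one chain. The function names, the conversion parameter, and the repeat-unit and end-group masses passed in by the caller are hypothetical and are not taken from the present disclosure.

```python
def degree_of_polymerization(monomer_eq, initiator_eq=1.0, conversion=1.0):
    """Idealized number-average chain length for a controlled ROP.

    Assumes one chain per initiator molecule (an assumption of this sketch).
    """
    if monomer_eq <= 0 or initiator_eq <= 0:
        raise ValueError("equivalents must be positive")
    if not 0.0 <= conversion <= 1.0:
        raise ValueError("conversion must lie between 0 and 1")
    return conversion * monomer_eq / initiator_eq


def number_average_mass(dp, repeat_unit_mass, end_group_mass=0.0):
    """Estimate Mn from chain length; the masses are caller-supplied placeholders."""
    return dp * repeat_unit_mass + end_group_mass


if __name__ == "__main__":
    # 1 eq initiator and 20 eq monomer at full conversion: about 20 repeat units.
    print(degree_of_polymerization(monomer_eq=20, initiator_eq=1))
```

Under these assumptions, changing only the monomer-to-initiator ratio changes the expected chain length proportionally, which is the sense in which the ratio determines the molecular weight.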
[0112] Phospholane amidates may be used as the monomer for preparing the polyphosphodiester scaffold described herein, as the resulting chain can be easily deprotected in 0.1 M HCl (Scheme 4). Phospholane amidate monomers may be synthesized from 2-chloro-1,3,2-dioxaphospholane 2-oxide and the corresponding amine, or from a phosphoramidate and a diol, the latter method also allowing side groups on the polyphosphodiester scaffold to be introduced (Schemes 5A-5C), all of which were reported by Zhang, S. et al., Macromolecules 2013, 46 (13), 5141-5149.
Scheme 4. [Reaction scheme not reproduced; conditions shown include TBD, DCM, 0 °C, 10 min.]
Schemes 5A-5C. [Reaction schemes not reproduced; conditions shown include NEt3, dry THF, 4 Å molecular sieves, 0 °C, N2 atmosphere.]
[0113] Using the synthetic method described above, copolymers may be synthesized which allow regularly repeating functionalization along the polyphosphodiester chain, and therefore allow multiple dyes to be spaced apart on the scaffold. Scheme 6 illustrates a reaction scheme for the copolymerization of two phospholane monomers to form a polyphosphodiester scaffold with X groups spaced along the scaffold, assuming a similar reactivity of each monomer. X is a functional group which is capable of participating in an orthogonal / bioorthogonal reaction (such as a click reaction) to allow dye labelling of the scaffold.
Scheme 6. [Reaction scheme not reproduced: 1 eq, n eq of monomer A and m eq of monomer B; catalyst, DCM], wherein the squiggly line between X and the carbon to which it attaches means there may be an additional linker present (such as L4 described in Formula (I)). With n equivalents of monomer A and m equivalents of monomer B, a random copolymer will form, containing monomer B spaced out, on average, by (n/m) units of monomer A.
Methods of Sequencing with Post Incorporation Labeling
[0114] Post-incorporation labeling methods and kits have been described in U.S. Publication No. 2023/0383342 A1 and U.S. Appl. No. 18/820008, each of which is incorporated by reference in its entirety.
[0115] Certain embodiments of the present disclosure relate to a method of determining the sequences of a plurality of different target polynucleotides (Method A), comprising:
(a) contacting a solid support with a solution comprising sequencing primers under hybridization conditions, wherein the solid support comprises a plurality of different target polynucleotides immobilized thereon; and the sequencing primers are complementary to at least a portion of the target polynucleotides;
(b) contacting the solid support with an incorporation mixture comprising DNA polymerase and one or more of four different types of nucleotides under conditions suitable for DNA polymerase-mediated primer extension, and incorporating one type of nucleotides into the sequencing primers to produce extended copy polynucleotides; wherein each of the four types of nucleotides comprises a 3' blocking group; and the incorporation mixture comprises a first type of unlabeled nucleotide having a first functional moiety covalently attached to the first type of unlabeled nucleotide;
(c) contacting the extended copy polynucleotides with the multi-dye labeled polyphosphodiester or polymer as described herein (i.e., first labeling reagent), wherein one of RA and RB of the multi-dye labeled polyphosphodiester undergoes the bioorthogonal reaction with the first functional moiety of the first type of unlabeled nucleotide to form covalent bonding;
(d) imaging the solid support and performing one or more fluorescent measurements of the extended copy polynucleotides; and
(e) removing the 3' blocking group of the incorporated nucleotides.
[0116] Additional embodiments of the present disclosure relate to a method of determining the sequences of a plurality of different target polynucleotides (Method B), comprising:
(a) contacting a solid support with a solution comprising sequencing primers under hybridization conditions, wherein the solid support comprises a plurality of different target polynucleotides immobilized thereon; and the sequencing primers are complementary to at least a portion of the target polynucleotides;
(b) contacting the solid support with an incorporation mixture comprising DNA polymerase and one or more of four different types of nucleotides under conditions suitable for DNA polymerase-mediated primer extension, and incorporating one type of nucleotides into the sequencing primers to produce extended copy polynucleotides; wherein each of the four types of nucleotides comprises a 3' blocking group; and the incorporation mixture comprises a first type of unlabeled nucleotide having a first functional moiety covalently attached to the first type of unlabeled nucleotide;
(c) contacting the extended copy polynucleotides with an antibody comprising the multi-dye labeled polyphosphodiester as described herein (i.e., first labeling reagent), wherein the antibody binds to the first type of unlabeled nucleotides via noncovalent interaction with the first functional moiety of the first type of unlabeled nucleotides;
(d) imaging the solid support and performing one or more fluorescent measurements of the extended copy polynucleotides; and
(e) removing the 3' blocking group of the incorporated nucleotides.
[0117] In some embodiments of the methods of sequencing described herein, the first functional moiety is covalently attached to the nucleobase of the first type of unlabeled nucleotide via a cleavable linker. In other embodiments, the first functional moiety is covalently attached to the 3’ blocking group of the first type of unlabeled nucleotide via a cleavable linker.
[0118] In some embodiments of Method A, RA is H, OH, -OP(=O)(OH)O⁻, or -OP(=O)(OH)O⁻M⁺, one of RB and the biological molecule comprises alkynyl, unsubstituted or substituted DBCO, substituted phosphines (such as phosphine methyl ester, phosphine alcohol, phosphine amine, or phosphine thiol; example structures not reproduced), unsubstituted or substituted tetrazine (e.g., phenyl tetrazine, pyrimidyl tetrazine, methyl tetrazine, pyridyl tetrazine, t-butyl tetrazine), triazine, unsubstituted or substituted BCN moiety, unsubstituted or substituted norbornene moiety, vinyl boronic acid, or 2-acylphenyl boronic acid (structures not reproduced), and the other one of RB and the biological molecule comprises azido, phenyl azido, unsubstituted or substituted TCO moiety, primary isonitrile, tertiary isonitrile, or amino hydrazide (example structures not reproduced). In some further embodiments, one of RB and the biological molecule comprises alkynyl, norbornene, DBCO, or BCN, and the other one of RB and the biological molecule comprises azido. In some further embodiments, one of RB and the biological molecule comprises an optionally substituted triphenylphosphine moiety, and the other one of RB and the biological molecule comprises phenyl azido. In some further embodiments, one of RB and the biological molecule comprises TCO, and the other one of RB and the biological molecule comprises a substituted tetrazine. In still further embodiments, one of RB and the biological molecule comprises an amino hydrazide moiety, and the other one of RB and the biological molecule comprises a 2-acylphenyl boronic acid moiety.
[0119] In some embodiments of Method B, RA is H, OH, -OP(=O)(OH)O⁻, or -OP(=O)(OH)O⁻M⁺, RB comprises or is a biotin moiety, and the multi-dye labeled antibody is an avidin (e.g., streptavidin or neutravidin). In another embodiment, RB is or comprises a DNP moiety, and the multi-dye labeled antibody is an anti-DNP antibody. In another embodiment, RB is or comprises a DIG moiety, and the multi-dye labeled antibody is an anti-DIG antibody.
[0120] In some embodiments of the methods of sequencing described herein, the incorporation mixture comprises a second type of labeled nucleotide, and a third type of labeled nucleotide.
[0121] In other embodiments of the methods of sequencing described herein, the incorporation mixture comprises a second type of unlabeled nucleotide having a second functional moiety covalently attached to the second type of unlabeled nucleotide, and a third type of labeled nucleotide. In still other embodiments, the incorporation mixture comprises a second type of unlabeled nucleotide having a second functional moiety covalently attached to the second type of unlabeled nucleotide, and a mixture of a third type of unlabeled nucleotide having a first functional moiety covalently attached to the third type of unlabeled nucleotide and a third type of unlabeled nucleotide having a second functional moiety covalently attached to the third type of unlabeled nucleotide. In some such embodiments, step (c) further comprises contacting the extended copy polynucleotides with a second labeling reagent that binds (e.g., either reacts to form covalent bonding or binds through noncovalent interaction) specifically with the second functional moiety.
[0122] In still other embodiments of the methods of sequencing described herein, the incorporation mixture comprises a second type of unlabeled nucleotide having a second functional moiety covalently attached to the second type of unlabeled nucleotide, and a third type of unlabeled nucleotide having a third functional moiety covalently attached to the third type of unlabeled nucleotide. In some such embodiments, step (c) further comprises contacting the extended copy polynucleotides with a second labeling reagent that binds (e.g., either reacts to form covalent bonding or binds through noncovalent interaction) specifically with the second functional moiety of the second type of unlabeled nucleotides, and a third labeling reagent that binds (e.g., reacts to form covalent bonding or binds through noncovalent interaction) specifically with the third functional moiety of the third type of unlabeled nucleotides.
[0123] In any embodiments of the methods of sequencing described herein, the incorporation mixture comprises a fourth type of unlabeled nucleotide, wherein the fourth type of unlabeled nucleotide is not capable of binding (e.g., either reacts to form covalent bonding or binds through noncovalent interaction) with any of the labeling reagents. In some embodiments, step (e) also removes the detectable labels of the incorporated nucleotides. In some such embodiments, the detectable labels and the 3' blocking groups of the incorporated nucleotides are removed in a single chemical reaction.
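One way to read the labeling scheme of paragraphs [0121] and [0123] is as a two-signal code: the first type of nucleotide is marked only by the first labeling reagent, the second type only by the second labeling reagent, the third type (supplied as a mixture carrying either functional moiety) by both, and the fourth type by neither. The Python sketch below illustrates that reading only. It assumes the two labeling reagents give separately detectable signals that have already been thresholded to present or absent; the names are hypothetical, and the disclosure does not state which base corresponds to which type.

```python
from typing import Dict, Iterable, List, Tuple

# (signal from first labeling reagent, signal from second labeling reagent)
SIGNAL_CODE: Dict[Tuple[bool, bool], str] = {
    (True, False): "first type",
    (False, True): "second type",
    (True, True): "third type",    # mixture carrying first and second moieties
    (False, False): "fourth type",  # binds none of the labeling reagents
}


def call_nucleotide_type(signal_1: bool, signal_2: bool) -> str:
    """Map one cycle's pair of thresholded signals to a nucleotide type."""
    return SIGNAL_CODE[(signal_1, signal_2)]


def call_read(signals: Iterable[Tuple[bool, bool]]) -> List[str]:
    """Decode a series of per-cycle signal pairs for one cluster."""
    return [call_nucleotide_type(s1, s2) for s1, s2 in signals]


if __name__ == "__main__":
    print(call_read([(True, False), (True, True), (False, False)]))
```

Because the fourth type is identified by the absence of signal, this reading needs only two distinguishable labels to tell four nucleotide types apart.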
[0124] In any embodiments of the methods of sequencing described herein, the method further comprises (f) washing the solid support with an aqueous wash solution. In some embodiments, steps (b) to (f) are repeated for at least 50, 100, 150, 200, 250, 300, 350, 400, 450, or 500 cycles to determine the target polynucleotide sequences. In any embodiments, the four types of nucleotides comprise dATP, dCTP, dGTP and dTTP or dUTP, or non-natural nucleotide analogs thereof.
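The order of operations in Method A or Method B, with steps (b) to (f) repeated once per cycle, can be laid out as a simple schedule. The Python sketch below is purely illustrative: the function name and the abbreviated step descriptions are hypothetical shorthand for the steps recited above, and no instrument control is implied.

```python
from typing import List

PER_CYCLE_STEPS = (
    "(b) incorporate a 3'-blocked nucleotide",
    "(c) contact with labeling reagent(s)",
    "(d) image the solid support",
    "(e) remove the 3' blocking group",
    "(f) wash with aqueous wash solution",
)


def cycle_schedule(n_cycles: int) -> List[str]:
    """List step (a) once, then steps (b) to (f) for each sequencing cycle."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    steps = ["(a) hybridize sequencing primers"]
    for cycle in range(1, n_cycles + 1):
        steps.extend(f"cycle {cycle}: {step}" for step in PER_CYCLE_STEPS)
    return steps


if __name__ == "__main__":
    for line in cycle_schedule(2):
        print(line)
```

Each cycle adds one incorporated nucleotide per extended copy polynucleotide, so the number of cycles sets the number of positions read.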
[0125] In some embodiments, one or more of the first labeling reagent, the second labeling reagent, and / or the third labeling reagent are in an aqueous post incorporation labeling mixture. The mixture may contain additional inorganic salt(s) and buffering agents, including but not limited to NaCl, KCl, and citrate. The post incorporation labeling mixture may have a pH from about 7.0 to about 8.5, or from about 7.2 to about 8.0, or about 7.5.
[0126] In some embodiments, the incorporation mixture composition further comprises a DNA polymerase, such as a mutant of 9°N polymerase disclosed in WO 2005/024010, U.S. Publication Nos. 2020/0131484 A1, 2020/0181587 A1, and 2024/0141427, each of which is incorporated by reference in its entirety.
[0127] In some embodiments of the methods of sequencing described herein, the method is performed on an automated sequencing instrument comprising two light sources operating at two different wavelengths. For example, one light source has a wavelength of about 450 nm to about 460 nm, and the other light source has a wavelength of about 520 nm to about 540 nm. In other embodiments, the method is performed on an automated sequencing instrument comprising a single light source. For example, the single light source has a wavelength of about 450 nm to about 460 nm, or about 520 nm to about 540 nm.
Kits
[0128] Certain embodiments of the present disclosure relate to a kit for sequencing application, comprising:
an incorporation mixture comprising one or more of four different types of nucleotides each comprising a 3' blocking group, wherein a first type of unlabeled nucleotide has a first functional moiety covalently attached to the first type of unlabeled nucleotide; and
the multi-dye labeled polyphosphodiester as described herein, wherein at least one of RA and RB of the multi-dye labeled polyphosphodiester is capable of undergoing the bioorthogonal reaction specifically with the first functional moiety of the first type of unlabeled nucleotides.
[0129] A further aspect of the present disclosure relates to a kit for sequencing application, comprising:
an incorporation mixture comprising one or more of four different types of nucleotides each comprising a 3' blocking group, wherein a first type of unlabeled nucleotide has a first functional moiety covalently attached to the first type of unlabeled nucleotide; and
an antibody comprising the multi-dye labeled polyphosphodiester as described herein, wherein the antibody is capable of binding to the first type of unlabeled nucleotides via noncovalent interaction with the first functional moiety of the first type of unlabeled nucleotides.
[0130] In some embodiments, the kit comprises a second type of labeled nucleotide, and a third type of labeled nucleotide. In some embodiments, the kit comprises a second type of unlabeled nucleotide having a second functional moiety covalently attached to the second type of unlabeled nucleotide, and a third type of labeled nucleotide. In some embodiments, the kit comprises a second type of unlabeled nucleotide having a second functional moiety covalently attached to the second type of unlabeled nucleotide, and a mixture of a third type of unlabeled nucleotide having a first functional moiety covalently attached to the third type of unlabeled nucleotide and a third type of unlabeled nucleotide having a second functional moiety covalently attached to the third type of unlabeled nucleotide. In some further embodiments, the kit further comprises a second labeling reagent, and the second labeling reagent is capable of binding (e.g., either reacts to form covalent bonding or binds through noncovalent interaction) specifically with the second functional moiety.
[0131] In some embodiments, the kit comprises a second type of unlabeled nucleotide having a second functional moiety covalently attached to the second type of unlabeled nucleotide, and a third type of unlabeled nucleotide having a third functional moiety covalently attached to the third type of unlabeled nucleotide. In further embodiments, the kit further comprises a second labeling reagent that is capable of binding (e.g., either reacts to form covalent bonding or binds through noncovalent interaction) specifically with the second functional moiety of the second type of unlabeled nucleotide, and a third labeling reagent that is capable of binding (e.g., either reacts to form covalent bonding or binds through noncovalent interaction) specifically with the third functional moiety of the third type of unlabeled nucleotide. In some embodiments, the kit comprises a fourth type of unlabeled nucleotide, wherein the fourth type of unlabeled nucleotide is not capable of binding with any labeling reagents. In some embodiments, the incorporation mixture composition further comprises a DNA polymerase. In some embodiments, the four different types of nucleotides are distinguishable using a single light source. In some embodiments, the four different types of nucleotides are distinguishable using two light sources operating at two different wavelengths. In further embodiments, one light source has a wavelength of about 450 nm to about 460 nm, and the other light source has a wavelength of about 520 nm to about 540 nm.
[0132] Additional embodiments of the present disclosure relate to a system for nucleic acid sequencing, comprising a plurality of chambers, wherein one or more of the chambers contains a kit in accordance with the present disclosure.
Fluorescent Dyes
[0133] Various fluorescent dyes may be used in the present disclosure for attaching to the phosphodiester oligomer or polymer scaffolds as described herein, or as labeling reagent(s) or detectable labels for the post incorporation sequencing methods as described herein, in particular those dyes that may be excited by a blue light or a green light. These dyes may also be referred to as “blue dyes” and “green dyes” respectively. Examples of various types of blue dyes, including but not limited to coumarin dyes, chromenoquinoline dyes, naphthalimide dyes, and bisboron containing heterocycles, are disclosed in U.S. Publication Nos. 2018/0094140, 2018/0201981, 2020/0277529, 2020/0277670, 2021/0188832, 2022/0033900, 2022/0195517, 2022/0380389, 2023/0313292, 2023/0416279, and 2024/0327910 A1, each of which is incorporated by reference in its entirety. Non-limiting examples of the blue dyes include: [structures not reproduced], and salts, mesomeric forms, and optionally substituted analogs thereof, for example, analogs with -SO3H substitution on the alkyl group(s).
[0134] Examples of green dyes include cyanine or polymethine dyes disclosed in International Publication Nos. WO2013/041117, WO2014/135221, WO2016/189287, WO2017/051201 and WO2018/060482A1, each of which is incorporated by reference in its entirety. Non-limiting examples of the green dyes include: [structures not reproduced], and salts, mesomeric forms, and optionally substituted analogs thereof.
[0135] In some embodiments, the fluorescent dyes described herein may be further modified to introduce one or more substituents (such as -SO3H, -OH, -C(O)OH, -C(O)OR, where R is unsubstituted or substituted C1-C6 alkyl) to improve the hydrophilicity of the dyes while maintaining the signal intensity of the dye. In some such embodiments, coumarin dye A [structure not reproduced] may be further modified to improve the hydrophilicity of the compound as [structure not reproduced] (coumarin dye A1) or a salt thereof (where -SO3H is in ionized form -SO3⁻).
3' Blocking Groups
[0136] The nucleotide described herein may also have a 3' blocking group covalently attached to the deoxyribose sugar of the nucleotide. Various 3' blocking groups are disclosed in WO2002/029003, WO2004/018497 and WO2014/139596. For example, the blocking group may be azidomethyl (-CH2N3) or substituted azidomethyl (e.g., -CH(CHF2)N3 or -CH(CH2F)N3), or allyl, each connecting to the 3'-oxygen atom of the deoxyribose moiety. In some embodiments, the 3' blocking group is azidomethyl, forming 3'-OCH2N3 with the 3' carbon of the ribose or deoxyribose.
[0137] Additional 3' blocking groups are disclosed in U.S. Publication No. 2020/0216891 A1, which is incorporated by reference in its entirety. Non-limiting examples of [structures not reproduced] covalently attached to the 3' carbon of the deoxyribose.
Deprotection of the 3' Blocking Groups
[0138] In some embodiments, the 3' hydroxy protecting group such as azidomethyl may be removed or deprotected by using a water-soluble phosphine reagent to generate a free 3'-OH. Non-limiting examples include tris(hydroxymethyl)phosphine (THMP), tris(hydroxyethyl)phosphine (THEP) or tris(hydroxypropyl)phosphine (THP or THPP). 3'-acetal blocking groups described herein may be removed or cleaved under various chemical conditions. For 3' blocking groups that contain an allyl moiety, non-limiting cleavage conditions include a Pd(II) complex, such as Pd(OAc)2 or allylPd(II) chloride dimer, in the presence of a phosphine ligand, for example tris(hydroxymethyl)phosphine (THMP), or tris(hydroxypropyl)phosphine (THP or THPP). Blocking groups containing an alkynyl group (e.g., an ethynyl) may also be removed by a Pd(II) complex (e.g., Pd(OAc)2 or allyl Pd(II) chloride dimer) in the presence of a phosphine ligand (e.g., THP or THMP).
Palladium Cleavage Reagents
[0139] In some other embodiments, the 3' blocking group such as allyl or AOM as described herein may be cleaved by a palladium catalyst. In some such embodiments, the Pd catalyst is water soluble. In some such embodiments, the Pd catalyst is a Pd(0) complex (e.g., tris[3,3',3''-phosphinidynetris(benzenesulfonato)]palladium(0) nonasodium salt nonahydrate). In some instances, the Pd(0) complex may be generated in situ from reduction of a Pd(II) complex by reagents such as alkenes, alcohols, amines, phosphines, or metal hydrides. Suitable palladium sources include Na2PdCl4, Li2PdCl4, Pd(CH3CN)2Cl2, (PdCl(C3H5))2, [Pd(C3H5)(THP)]Cl, [Pd(C3H5)(THP)2]Cl, Pd(OAc)2, Pd(PPh3)4, Pd(dba)2, Pd(Acac)2, PdCl2(COD), Pd(TFA)2, Na2PdBr4, K2PdBr4, PdCl2, PdBr2, and Pd(NO3)2. In one such embodiment, the Pd(0) complex is generated in situ from Na2PdCl4 or K2PdCl4. In another embodiment, the palladium source is allyl palladium(II) chloride dimer [(PdCl(C3H5))2]. In some embodiments, the Pd(0) complex is generated in an aqueous solution by mixing a Pd(II) complex with a phosphine. Suitable phosphines include water soluble phosphines, such as tris(hydroxypropyl)phosphine (THP), tris(hydroxymethyl)phosphine (THMP), 1,3,5-triaza-7-phosphaadamantane (PTA), bis(p-sulfonatophenyl)phenylphosphine dihydrate potassium salt, tris(carboxyethyl)phosphine (TCEP), and triphenylphosphine-3,3',3''-trisulfonic acid trisodium salt.
[0140] In some embodiments, the palladium catalyst is prepared by mixing [(Allyl)PdCl]2 with THP in situ. The molar ratio of [(Allyl)PdCl]2 to THP may be about 1:1, 1:1.5, 1:2, 1:2.5, 1:3, 1:3.5, 1:4, 1:4.5, 1:5, 1:5.5, 1:6, 1:6.5, 1:7, 1:7.5, 1:8, 1:8.5, 1:9, 1:9.5 or 1:10. In one embodiment, the molar ratio of [(Allyl)PdCl]2 to THP is 1:10. In some other embodiments, the palladium catalyst is prepared by mixing a water soluble Pd reagent such as Na2PdCl4 or K2PdCl4 with THP in situ. The molar ratio of Na2PdCl4 or K2PdCl4 to THP may be about 1:1, 1:1.5, 1:2, 1:2.5, 1:3, 1:3.5, 1:4, 1:4.5, 1:5, 1:5.5, 1:6, 1:6.5, 1:7, 1:7.5, 1:8, 1:8.5, 1:9, 1:9.5 or 1:10. In one embodiment, the molar ratio of Na2PdCl4 or K2PdCl4 to THP is about 1:3. In another embodiment, the molar ratio of Na2PdCl4 or K2PdCl4 to THP is about 1:3.5. In yet another embodiment, the molar ratio of Na2PdCl4 or K2PdCl4 to THP is about 1:2.5. In some further embodiments, one or more reducing agents may be added, such as ascorbic acid or a salt thereof (e.g., sodium ascorbate). In some embodiments, the cleavage mixture may contain additional buffer reagents, such as a primary amine, a secondary amine, a tertiary amine, a carbonate salt, a phosphate salt, or a borate salt, or combinations thereof. In some further embodiments, the buffer reagent comprises ethanolamine (EA), tris(hydroxymethyl)aminomethane (Tris), glycine, sodium carbonate, sodium phosphate, sodium borate, 2-dimethylethanolamine (DMEA), 2-diethylethanolamine (DEEA), N,N,N',N'-tetramethylethylenediamine (TEMED), N,N,N',N'-tetraethylethylenediamine (TEEDA), or 2-piperidine ethanol (also known as (2-hydroxyethyl)piperidine, having the structure [structure not reproduced]), or combinations thereof. In one embodiment, the buffer reagent comprises or is DEEA. In another embodiment, the buffer reagent comprises or is (2-hydroxyethyl)piperidine.
In another embodiment, the buffer reagent contains one or more inorganic salts such as a carbonate salt, a phosphate salt, or a borate salt, or combinations thereof. In one embodiment, the inorganic salt is a sodium salt.
Palladium (Pd) Scavengers
[0141] Pd has the capacity to stick to DNA, mostly in its inactive Pd(II) form, which may interfere with the binding between DNA and polymerase, causing increased phasing. A post-cleavage wash composition that includes a Pd scavenger compound may be used following the deblocking step. For example, PCT Publication No. WO 2020/126593 discloses that Pd scavengers such as 3,3'-dithiodipropionic acid (DDPA) and lipoic acid (LA) may be included in the scan composition and / or the post-cleavage wash composition. The use of these scavengers in the post-cleavage wash solution has the purpose of scavenging Pd(0), converting Pd(0) to the inactive Pd(II) form, thereby improving the prephasing value and sequencing metrics, reducing signal decay, and extending sequencing read length. Pd scavengers include both Pd(0) and Pd(II) scavengers, which are described in U.S. Publication No. 2022/0396832 A1, which is incorporated by reference in its entirety.
Cleavable Linkers
[0142] In some embodiments, the first / second / third functional moiety of the nucleotide described herein is covalently attached to the nucleobase of the nucleotide via a cleavable linker. Use of the term “cleavable linker” is not meant to imply that the whole linker is required to be removed. The cleavage site can be located at a position on the linker that ensures that part of the linker remains attached to the dye and / or substrate moiety after cleavage. Cleavable linkers may be, by way of non-limiting example, electrophilically cleavable linkers, nucleophilically cleavable linkers, photocleavable linkers, cleavable under reductive conditions (for example disulfide or azide containing linkers), oxidative conditions, cleavable via use of safety-catch linkers and cleavable by elimination mechanisms. The use of a cleavable linker to attach the dye compound to a substrate moiety ensures that the label can, if required, be removed after detection, avoiding any interfering signal in downstream steps.
[0143] Useful linker groups may be found in PCT Publication No. WO2004 / 018493 (herein incorporated by reference), examples of which include linkers that may be cleaved using water-soluble phosphines or water-soluble transition metal catalysts formed from a transition metal and at least partially water-soluble ligands. In aqueous solution the latter form at least partially water-soluble transition metal complexes. Such cleavable linkers can be used to connect bases of nucleotides to labels such as the dyes set forth herein.
[0144] Particular linkers include those disclosed in PCT Publication No. WO2004 / 018493 (herein incorporated by reference) such as those that include moieties of the formulae: [structures not reproduced] (wherein X is selected from the group comprising O, S, NH and NQ wherein Q is a C1-10 substituted or unsubstituted alkyl group, Y is selected from the group comprising O, S, NH and N(allyl), T is hydrogen or a C1-C10 substituted or unsubstituted alkyl group and * indicates where the moiety is connected to the remainder of the nucleotide or nucleoside). In some aspects, the linkers connect the bases of nucleotides to labels such as, for example, the dye compounds described herein.
[0145] Additional examples of linkers include those disclosed in U. S. Publication No. 2016 / 0040225 (herein incorporated by reference), such as those that include moieties of the formulae: [structures not reproduced] (wherein * indicates where the moiety is connected to the remainder of the nucleotide or nucleoside). The linker moieties illustrated herein may comprise the whole or partial linker structure between the nucleotides / nucleosides and the labels.
[0146] Additional examples of linkers are disclosed in U. S. Publication No. 2020 / 0216891 A1, which is incorporated by reference in its entirety: [structure not reproduced] 2, 3, 4, 5; k is 1; Z is -N3 (azido), -O-C1-C6 alkyl, -O-C2-C6 alkenyl, or -O-C2-C6 alkynyl; and R comprises the first, the second, or the third functional moiety described herein, which may contain additional linker and / or spacer structure. One of ordinary skill in the art understands that the first, the second, or the third functional moiety described herein is covalently bound to the labeling reagent described herein by reacting with the one or more second functional groups of the water-soluble polymer scaffold. In one embodiment, the cleavable linker comprises [structure not reproduced] (“AOL” linker moiety) where Z is -O-allyl. For the purpose of the present disclosure, the nucleotide may contain multiple cleavable linker repeating units (e.g., k is 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10).
[0147] The first, second or third functional moiety may be attached to any position on the nucleotide base, for example, through a linker. In particular embodiments, Watson-Crick base pairing can still be carried out for the resulting analog. Particular nucleobase labeling sites include the C5 position of a pyrimidine base or the C7 position of a 7-deaza purine base. As described above, a linker group may be used to covalently attach a dye to the nucleotide.
[0148] In particular embodiments, the unlabeled nucleotide may be enzymatically incorporable and enzymatically extendable. Accordingly, a linker moiety may be of sufficient length to connect the nucleotide to the compound such that the compound does not significantly interfere with the overall binding and recognition of the nucleotide by a nucleic acid replication enzyme. Thus, the linker can also comprise a spacer unit, such as one or more PEG unit(s) (-OCH2CH2-)n, where n is an integer of 1-20, for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15. The spacer distances, for example, the nucleotide base from a cleavage site or label.
[0149] An unlabeled nucleotide described herein may have the formula: [structure not reproduced]
[0150] where R is the first, second or third functional moiety described herein; B is a nucleobase, such as, for example uracil, thymine, cytosine, adenine, 7-deaza adenine, guanine, 7-deaza guanine, and the like; L is a linker; -OR' is monophosphate, diphosphate, triphosphate, thiophosphate, a phosphate ester analog, -O- attached to a reactive phosphorus-containing group, or -O- protected by a blocking group; R" is H or OH; and R'" is H, a 3' hydroxy blocking group described herein, or -OR'" forms a phosphoramidite. Where -OR'" is phosphoramidite, R' is an acid-cleavable hydroxy protecting group which allows subsequent monomer coupling under automated synthesis conditions. In some further embodiments, B comprises [structures not reproduced], derivatives and analogs thereof. In some further embodiments, the nucleobase comprises the [structure not reproduced].
[0151] In yet another alternative embodiment, there is no blocking group on the 3' carbon of the pentose sugar, and the labeled avidin attached to the base via a linker, for example, can be of a size or structure sufficient to act as a block to the incorporation of a further nucleotide. Thus, the block can be due to steric hindrance or can be due to a combination of size, charge and structure, whether or not the dye is attached to the 3' position of the sugar.
[0152] The use of a blocking group allows polymerization to be controlled, such as by stopping extension when an unlabeled nucleotide is incorporated. If the blocking effect is reversible, for example, by changing chemical conditions or by removal of a chemical block, extension can be stopped at certain points and then allowed to continue.
[0153] In a particular embodiment, the linker and blocking group are both present and are separate moieties. In particular embodiments, the linker and blocking group are both cleavable under the same or substantially similar conditions. Thus, deprotection and deblocking processes may be more efficient because only a single treatment will be required to remove both the dye compound and the blocking group. However, in some embodiments a linker and blocking group need not be cleavable under similar conditions, instead being individually cleavable under distinct conditions.
[0154] Non-limiting exemplary unlabeled nucleotides as described herein include: [structures not reproduced] wherein L represents a linker, including a cleavable linker described herein; Rx represents a ribose or deoxyribose moiety as described above, or a ribose or deoxyribose moiety with the 5' position substituted with mono-, di- or tri-phosphates; R represents the first, second or third functional moiety described herein.
[0155] In some embodiments, non-limiting exemplary unlabeled nucleotides containing a functional moiety covalently attached via a cleavable linker are shown below: [structures not reproduced] wherein PG stands for the 3' blocking groups described herein; p is an integer of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10; and m is 0, 1, 2, 3, 4, or 5. In one embodiment, -O-PG is AOM. In another embodiment, -O-PG is -O-azidomethyl. In one embodiment, m is 5. In another embodiment, m is 0. In another embodiment, m is 2. In some further embodiments, p is 1, 2, 3, 4 or 5. (CH2)mR refers to the connection point of the first / second / third functional moiety with the cleavable linker as a result of a reaction between an amino group of the linker moiety and the carboxyl group of the first / second / third functional moiety. In further embodiments, the nucleotide may be attached to the functional moiety via more than one of the same cleavable linkers (such as LN3-LN3, sPA-sPA, AOL-AOL). In other embodiments, the functional moiety may be attached to the nucleotide via two or more different cleavable linkers (such as sPA-LN3, sPA-sPA-LN3, sPA-LN3-LN3, etc.). In addition, the linker may further include additional PEG spacers as described herein, for example, between R and -(CH2)m-. In any embodiment of the nucleotide described herein, the nucleotide is a nucleotide triphosphate. In further embodiments, the nucleotide has a 2' deoxyribose.
General description of sequencing by synthesis
[0156] In a specific embodiment, a synthetic step is carried out and may optionally comprise incubating a template or target polynucleotide strand with a reaction mixture comprising fluorescently labeled nucleotides of the disclosure. A polymerase can also be provided under conditions which permit formation of a phosphodiester linkage between a free 3' hydroxy group on a polynucleotide strand annealed to the template or target polynucleotide strand and a 5’ phosphate group on the labeled nucleotide. Thus, a synthetic step can include formation of a polynucleotide strand as directed by complementary base pairing of nucleotides to a template / target strand.
[0157] In all embodiments of the methods, the detection step may be carried out while the polynucleotide strand into which the labeled nucleotides are incorporated is annealed to a template / target strand, or after a denaturation step in which the two strands are separated. Further steps, for example chemical or enzymatic reaction steps or purification steps, may be included between the synthetic step and the detection step. In particular, the polynucleotide strand incorporating the labeled nucleotide(s) may be isolated or purified and then processed further or used in a subsequent analysis. By way of example, polynucleotide strand incorporating the labeled nucleotide(s) as described herein in a synthetic step may be subsequently used as labeled probes or primers. In other embodiments, the product of the synthetic step set forth herein may be subject to further reaction steps and, if desired, the product of these subsequent steps purified or isolated.
[0158] Suitable conditions for the synthetic step will be well known to those familiar with standard molecular biology techniques. In one embodiment, a synthetic step may be analogous to a standard primer extension reaction using nucleotide precursors, including the labeled nucleotides as described herein, to form an extended polynucleotide strand (primer polynucleotide strand) complementary to the template / target strand in the presence of a suitable polymerase enzyme. In other embodiments, the synthetic step may itself form part of an amplification reaction producing a labeled double stranded amplification product comprised of annealed complementary strands derived from copying of the primer and template polynucleotide strands. Other exemplary synthetic steps include nick translation, strand displacement polymerization, random primed DNA labeling, etc. A particularly useful polymerase enzyme for a synthetic step is one that is capable of catalyzing the incorporation of the labeled nucleotides as set forth herein. A variety of naturally occurring or mutant / modified polymerases can be used. By way of example, a thermostable polymerase can be used for a synthetic reaction that is carried out using thermocycling conditions, whereas a thermostable polymerase may not be desired for isothermal primer extension reactions. Suitable thermostable polymerases which are capable of incorporating the labeled nucleotides according to the disclosure include those described in WO 2005 / 024010 or WO06120433, each of which is incorporated herein by reference. In synthetic reactions which are carried out at lower temperatures such as 37 °C, polymerase enzymes need not necessarily be thermostable polymerases; therefore, the choice of polymerase will depend on a number of factors such as reaction temperature, pH, strand-displacing activity and the like.
[0159] In specific non-limiting embodiments, the disclosure encompasses methods of nucleic acid sequencing, re-sequencing, whole genome sequencing, single nucleotide polymorphism scoring, and any other application involving the detection of the modified nucleotide or nucleoside labeled with dyes set forth herein when incorporated into a polynucleotide.
[0160] SBS generally involves sequential addition of one or more nucleotides or oligonucleotides to a growing polynucleotide chain in the 5' to 3' direction using a polymerase or ligase in order to form an extended polynucleotide chain complementary to the template / target nucleic acid to be sequenced. The identity of the base present in one or more of the added nucleotide(s) can be determined in a detection or “imaging” step. The identity of the added base may be determined after each nucleotide incorporation step. The sequence of the template may then be inferred using conventional Watson-Crick base-pairing rules. The use of the nucleotides labeled with dyes set forth herein for determination of the identity of a single base may be useful, for example, in the scoring of single nucleotide polymorphisms, and such single base extension reactions are within the scope of this disclosure.
[0161] In an embodiment of the present disclosure, the sequence of a template / target polynucleotide is determined by detecting the incorporation of one or more nucleotides into a nascent strand complementary to the template polynucleotide to be sequenced through the detection of fluorescent label(s) attached to the incorporated nucleotide(s). Sequencing of the template polynucleotide can be primed with a suitable primer (or prepared as a hairpin construct which will contain the primer as part of the hairpin), and the nascent chain is extended in a stepwise manner by addition of nucleotides to the 3' end of the primer in a polymerase-catalyzed reaction.
[0162] In particular embodiments, each of the different nucleotide triphosphates (A, T, G and C) may be labeled with a unique fluorophore and may also comprise a blocking group at the 3' position to prevent uncontrolled polymerization. Alternatively, one of the four nucleotides may be unlabeled (dark). The polymerase enzyme incorporates a nucleotide into the nascent chain complementary to the template / target polynucleotide, and the blocking group prevents further incorporation of nucleotides. Any unincorporated nucleotides can be washed away and the fluorescent signal from each incorporated nucleotide can be “read” optically by suitable means, such as a charge-coupled device using light source excitation and suitable emission filters. The 3'-blocking group and fluorescent dye compounds can then be removed (deprotected) (simultaneously or sequentially) to expose the nascent chain for further nucleotide incorporation.
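The cycle described above (incorporate one 3'-blocked nucleotide, image, then deblock) can be illustrated with a minimal, idealized sketch. The function name `sequence_by_synthesis` and the error-free model (no phasing, no signal decay) are assumptions made for illustration only and are not part of the disclosure.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sequence_by_synthesis(template_3to5, cycles):
    """Idealized SBS: one blocked, labeled nucleotide is added per cycle.

    template_3to5 is the template read in the 3'->5' direction, so the
    returned read is the nascent strand in the 5'->3' direction.
    """
    calls = []
    for position in range(min(cycles, len(template_3to5))):
        # 1. Incorporation: the polymerase adds the complementary nucleotide;
        #    its 3' blocking group prevents any further extension this cycle.
        base = COMPLEMENT[template_3to5[position]]
        # 2. Wash and image: the label identifies the incorporated base.
        calls.append(base)
        # 3. Deblock: label and 3' block are cleaved so the next cycle can
        #    proceed (there is no state to update in this idealized model).
    return "".join(calls)


read = sequence_by_synthesis("TACG", cycles=4)
```

Because exactly one base is called per cycle, the read length equals the number of cycles run, up to the template length.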
[0163] The method, as exemplified above, utilizes the incorporation of fluorescently labeled, 3' blocked nucleotides A, G, C, and T into a growing strand complementary to the immobilized polynucleotide, in the presence of DNA polymerase. The polymerase incorporates a base complementary to the target polynucleotide but is prevented from further addition by the 3' blocking group. The label of the incorporated nucleotide can then be determined, and the blocking group removed by chemical cleavage to allow further polymerization to occur. The nucleic acid template to be sequenced in an SBS reaction may be any polynucleotide that it is desired to sequence. The nucleic acid template for a sequencing reaction will typically comprise a double stranded region having a free 3' hydroxy group that serves as a primer or initiation point for the addition of further nucleotides in the sequencing reaction. The region of the template to be sequenced will overhang this free 3' hydroxy group on the complementary strand. The overhanging region of the template to be sequenced may be single stranded but can be double-stranded, provided that a “nick” is present on the strand complementary to the template strand to be sequenced to provide a free 3' OH group for initiation of the sequencing reaction. In such embodiments, sequencing may proceed by strand displacement. In certain embodiments, a primer bearing the free 3' hydroxy group may be added as a separate component (e.g., a short oligonucleotide) that hybridizes to a single-stranded region of the template to be sequenced. Alternatively, the primer and the template strand to be sequenced may each form part of a partially self-complementary nucleic acid strand capable of forming an intra-molecular duplex, such as for example a hairpin loop structure. Hairpin polynucleotides and methods by which they may be attached to solid supports are disclosed in PCT Publication Nos. WO0157248 and WO2005 / 047301, each of which is incorporated herein by reference. Nucleotides can be added successively to a growing primer, resulting in synthesis of a polynucleotide chain in the 5' to 3' direction. The nature of the base which has been added may be determined, particularly but not necessarily after each nucleotide addition, thus providing sequence information for the nucleic acid template. Thus, a nucleotide is incorporated into a nucleic acid strand (or polynucleotide) by joining of the nucleotide to the free 3' hydroxy group of the nucleic acid strand via formation of a phosphodiester linkage with the 5' phosphate group of the nucleotide.
[0164] The nucleic acid template to be sequenced may be DNA or RNA, or even a hybrid molecule comprised of deoxynucleotides and ribonucleotides. The nucleic acid template may comprise naturally occurring and / or non-naturally occurring nucleotides and natural or non-natural backbone linkages, provided that these do not prevent copying of the template in the sequencing reaction.
[0165] In certain embodiments, the nucleic acid template to be sequenced may be attached to a solid support via any suitable linkage method known in the art, for example via covalent attachment. In certain embodiments template polynucleotides may be attached directly to a solid support (e.g., a silica-based support). However, in other embodiments of the disclosure the surface of the solid support may be modified in some way so as to allow either direct covalent attachment of template polynucleotides, or to immobilize the template polynucleotides through a hydrogel or polyelectrolyte multilayer, which may itself be non-covalently attached to the solid support.
[0166] Arrays may be used in which polynucleotides have been directly attached to a support, for example, silica-based supports such as those disclosed in WO00 / 06770 (incorporated herein by reference), wherein polynucleotides are immobilized on a glass support by reaction between a pendant epoxide group on the glass and an internal amino group on the polynucleotide. In addition, polynucleotides can be attached to a solid support by reaction of a sulfur-based nucleophile with the solid support, for example, as described in WO2005 / 047301 (incorporated herein by reference). A still further example of solid-supported template polynucleotides is where the template polynucleotides are attached to hydrogel supported upon silica-based or other solid supports, for example, as described in WO00 / 31148, WO01 / 01143, WO02 / 12566, WO03 / 014392, U. S. Pat. No. 6,465,178 and WO00 / 53812, each of which is incorporated herein by reference.
[0167] A particular surface to which template polynucleotides may be immobilized is a polyacrylamide hydrogel. Polyacrylamide hydrogels are described in the references cited above and in WO2005 / 065814, which is incorporated herein by reference. Specific hydrogels that may be used include those described in WO2005 / 065814 and U. S. Pub. No. 2014 / 0079923. In one embodiment, the hydrogel is PAZAM (poly(N-(5-azidoacetamidylpentyl) acrylamide-co-acrylamide)).
[0168] DNA template molecules can be attached to beads or microparticles, for example, as described in U. S. Pat. No. 6,172,218 (which is incorporated herein by reference). Attachment to beads or microparticles can be useful for sequencing applications. Bead libraries can be prepared where each bead contains different DNA sequences. Exemplary libraries and methods for their creation are described in Nature, 437, 376-380 (2005); Science, 309, 5741, 1728-1732 (2005), each of which is incorporated herein by reference. Sequencing of arrays of such beads using nucleotides set forth herein is within the scope of the disclosure.
[0169] Template(s) that are to be sequenced may form part of an “array” on a solid support, in which case the array may take any convenient form. Thus, the method of the disclosure is applicable to all types of high-density arrays, including single-molecule arrays, clustered arrays, and bead arrays. Nucleotides labeled with dye compounds of the present disclosure may be used for sequencing templates on essentially any type of array, including but not limited to those formed by immobilization of nucleic acid molecules on a solid support.
[0170] However, nucleotides labeled with dye compounds of the disclosure are particularly advantageous in the context of sequencing of clustered arrays. In clustered arrays, distinct regions on the array (often referred to as sites, or features) comprise multiple polynucleotide template molecules. Generally, the multiple polynucleotide molecules are not individually resolvable by optical means and are instead detected as an ensemble. Depending on how the array is formed, each site on the array may comprise multiple copies of one individual polynucleotide molecule (e.g., the site is homogenous for a particular single- or double-stranded nucleic acid species) or even multiple copies of a small number of different polynucleotide molecules (e.g., multiple copies of two different nucleic acid species). Clustered arrays of nucleic acid molecules may be produced using techniques generally known in the art. By way of example, WO 98 / 44151 and WO00 / 18957, each of which is incorporated herein, describe methods of amplification of nucleic acids wherein both the template and amplification products remain immobilized on a solid support in order to form arrays comprised of clusters or “colonies” of immobilized nucleic acid molecules. The nucleic acid molecules present on the clustered arrays prepared according to these methods are suitable templates for sequencing using nucleotides labeled with dye compounds of the disclosure.
[0171] Nucleotides labeled with dye compounds of the present disclosure are also useful in sequencing of templates on single molecule arrays. The term “single molecule array” or “SMA” as used herein refers to a population of polynucleotide molecules, distributed (or arrayed) over a solid support, wherein the spacing of any individual polynucleotide from all others of the population is such that it is possible to individually resolve the individual polynucleotide molecules. The target nucleic acid molecules immobilized onto the surface of the solid support can thus be capable of being resolved by optical means in some embodiments. This means that one or more distinct signals, each representing one polynucleotide, will occur within the resolvable area of the particular imaging device used.
[0172] Single molecule detection may be achieved wherein the spacing between adjacent polynucleotide molecules on an array is at least 100 nm, more particularly at least 250 nm, still more particularly at least 300 nm, even more particularly at least 350 nm. Thus, each molecule is individually resolvable and detectable as a single molecule fluorescent point, and fluorescence from said single molecule fluorescent point also exhibits single step photobleaching.
[0173] The terms “individually resolved” and “individual resolution” are used herein to specify that, when visualized, it is possible to distinguish one molecule on the array from its neighboring molecules. Separation between individual molecules on the array will be determined, in part, by the particular technique used to resolve the individual molecules. The general features of single molecule arrays will be understood by reference to published applications WO00 / 06770 and WO 01 / 57248, each of which is incorporated herein by reference. Although one use of the labeled nucleotides of the disclosure is in SBS reactions, the utility of such nucleotides is not limited to such methods. In fact, the labeled nucleotides described herein may be used advantageously in any sequencing methodology which requires detection of fluorescent labels attached to nucleotides incorporated into a polynucleotide.
[0174] In particular, nucleotides labeled with dye compounds of the disclosure may be used in automated fluorescent sequencing protocols, particularly fluorescent dye-terminator cycle sequencing based on the chain termination sequencing method of Sanger and co-workers. Such methods generally use enzymes and cycle sequencing to incorporate fluorescently labeled dideoxynucleotides in a primer extension sequencing reaction. So-called Sanger sequencing methods, and related protocols (Sanger-type), utilize randomized chain termination with labeled dideoxynucleotides.
EXAMPLES
[0175] Additional embodiments are disclosed in further detail in the following examples, which are not in any way intended to limit the scope of the claims.
Example 1. Synthesis of Multi-Dye Labeled Polyphosphodiesters
[0176] In this example, several polyphosphodiester scaffolds were prepared using the phosphoramidite chemistry described in Scheme 1. These scaffolds comprised 1, 2, or 3 blocks of polyphosphodiester chains of 7 or 12 monomers in length to separate 2, 3 or 4 DBCO moieties respectively. These scaffolds were then labeled with four different azide-functionalized dyes (coumarin dye A, coumarin dye A1, NR550C4, ATTO532) in a copper-free click manner and purified by semi-prep HPLC to create a library of multi-dye scaffolds.
[Scheme 1: structures of monomers A and B and of the scaffolds ABmA, ABmABmA and ABmABmABmA not reproduced], wherein Y is DBCO; L4 is -NHC(=O)(CH2)4C(=O)-; m is 7 or 12.
[0177] The following five scaffolds were prepared: T-AB7A-T, T-AB7AB7A-T, T-AB7AB7AB7A-T, T-AB12A-T, T-AB12AB12AB12A-T, wherein T refers to a T nucleoside that forms a phosphodiester bond [structure not reproduced] with the terminal -OH groups. Due to the nature of the phosphoramidite synthesis, the T at one end of the chain will be attached at the 3' position on the ribose and the other will be attached at the 5' position.
[0178] The polyphosphodiester scaffold was further labeled with an azido functionalized dye using the following reaction conditions: T-AB7AB7A-T (146 nmol, 1.0 eq) and NR550C4-C3-N3 (1747 nmol, 12.0 eq) were diluted in water (1 ml) and stirred at RT for 3 hrs. The labelled scaffold was purified by semi-prep HPLC (20-45% acetonitrile in water), m / z 1525 (M4+), 1220 (M5+).
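The quantities in this paragraph can be cross-checked with a short calculation. The sketch below is illustrative only: the helper names are hypothetical, and the mass estimate simply multiplies m / z by the charge, ignoring the mass of the charge carriers (the ionization mode and adducts are not stated in the text).

```python
def equivalents(reagent_nmol, scaffold_nmol):
    """Molar equivalents of reagent relative to the scaffold."""
    return reagent_nmol / scaffold_nmol


def approx_mass_from_mz(mz, charge):
    """Rough neutral mass from one charge state.

    Assumption: the mass of the charge carriers is neglected, so this is
    only a consistency check between charge states, not an exact mass.
    """
    return mz * charge


dye_eq = equivalents(1747, 146)            # dye-azide per scaffold, about 12 eq
mass_from_4 = approx_mass_from_mz(1525, 4)  # from the M4+ peak
mass_from_5 = approx_mass_from_mz(1220, 5)  # from the M5+ peak
```

Both charge states point to the same approximate mass, which is the kind of agreement expected when the two peaks belong to one species.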
[0179] A dilute solution of dye / scaffold in universal scan mixture (USM) was prepared, and the absorbance and fluorescent intensity of the sample were measured by UV spectrophotometer and fluorimeter respectively (excitation wavelength 460 nm for coumarin dyes A and A1, 530 nm for ATTO532, 550 nm for NR550C4). The concentration of the sample was determined from its absorbance and extinction coefficient. The extinction coefficient of each scaffold was determined as n multiplied by the extinction coefficient of a single dye, where n is the number of dyes on the scaffold. The brightness was calculated by dividing the fluorescent intensity either by the concentration of dye (giving brightness per dye, see Table 1) or by the concentration of the scaffold (giving total brightness, see FIG. 1).
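The brightness calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: concentration is taken from absorbance via the Beer-Lambert relation with a 1 cm path length (the text only says it was determined from absorbance and extinction coefficient), and the function names and the numeric inputs are hypothetical.

```python
import math


def scaffold_extinction(eps_single_dye, n_dyes):
    """Scaffold extinction coefficient, taken as n times the single-dye value."""
    return n_dyes * eps_single_dye


def concentration(absorbance, eps, path_cm=1.0):
    """Beer-Lambert relation A = eps * c * l, solved for c (assumed model)."""
    return absorbance / (eps * path_cm)


def brightness(fl_intensity, absorbance, eps_single_dye, n_dyes, path_cm=1.0):
    """Return brightness per dye and total brightness per scaffold."""
    eps_scaffold = scaffold_extinction(eps_single_dye, n_dyes)
    c_scaffold = concentration(absorbance, eps_scaffold, path_cm)
    c_dye = n_dyes * c_scaffold  # each scaffold carries n dyes
    return {
        "per_dye": fl_intensity / c_dye,
        "total": fl_intensity / c_scaffold,
    }


# Hypothetical inputs chosen only to exercise the functions.
result = brightness(fl_intensity=1000.0, absorbance=0.2,
                    eps_single_dye=100_000.0, n_dyes=4)
```

With these definitions the total brightness is always n times the brightness per dye, which is why a scaffold can be brighter overall even when each dye is partially quenched.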
[0180] Preliminary data in solution suggest that NR550C4 labeled polyphosphodiester scaffolds show dye enhancement, as the brightness of each dye on the scaffold is greater than that of a single free dye in solution. This leads to a much brighter scaffold overall. In contrast, coumarin dye A1 labeled polyphosphodiester scaffolds show strong dye quenching in solution, leading to an overall dimmer scaffold than a single free dye. The data are summarized in Table 1. The results suggest that the brightness of the polyphosphodiester scaffolds in solution may be highly dependent on the structure of the dye and on the number of dyes in the scaffold, as both of these factors will affect the preferred 3D conformation of the polyphosphodiester chain.
[0181] Table 1 shows the brightness per dye of labeled polyphosphodiester scaffolds in water and universal scan mix (USM), normalized to the fluorescent intensity of the respective free dye-azides in the same solvent.
Table 1.
Oligo | TABxAT | T(ABx)3AT | TABxAT | TABxAT | T(ABx)3AT
Dye | NR550C4 | NR550C4 | Coumarin dye A1 | ATTO532 | ATTO532
No. of dyes | 2 | 4 | 2 | 2 | 4
x = 7, dye FI intensity (normalized to free dye in water) | 1.95 | 1.50 | 0.17 | 0.82 | 0.74
x = 7, dye FI intensity (normalized to free dye in USM) | 1.27 | 0.48 | 0.17 | 0.80 | 0.46
x = 12, dye FI intensity (normalized to free dye in water) | 2.83 | 3.85 | 0.24 | 0.85 | 0.81
x = 12, dye FI intensity (normalized to free dye in USM) | 1.27 | 1.03 | 0.16 | 0.81 | -
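Since total brightness is the per-dye brightness multiplied by the number of dyes (it follows from dividing the same fluorescent intensity by the scaffold concentration rather than the dye concentration), the Table 1 values can be converted to whole-scaffold brightness. The sketch below uses only the x = 7, water row; the dictionary layout and variable names are illustrative choices.

```python
# Per-dye brightness from Table 1 (x = 7, in water), normalized to free dye.
# Keys are (dye, number of dyes on the scaffold).
per_dye_brightness = {
    ("NR550C4", 2): 1.95,
    ("NR550C4", 4): 1.50,
    ("ATTO532", 2): 0.82,
    ("ATTO532", 4): 0.74,
}

# Total scaffold brightness relative to one free dye = n * per-dye brightness.
total_brightness = {
    key: round(key[1] * value, 2)
    for key, value in per_dye_brightness.items()
}

# A 4-dye scaffold is brighter overall when its total exceeds the 2-dye total.
atto_four_brighter = (
    total_brightness[("ATTO532", 4)] > total_brightness[("ATTO532", 2)]
)
```

For ATTO532 this gives roughly 1.64 for two dyes and 2.96 for four dyes, so the slight increase in per-dye quenching is outweighed by the larger number of dyes.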
[0182] ATTO532 labeled scaffolds show slight dye quenching, which increases as the number of dyes increases. Because the quenching is low, however, the scaffolds with 4 dyes are still brighter than those with 2 dyes, as well as the free single dye. The scaffolds are more quenched in universal scan mix (USM), and this was found to be caused by the high salt concentration shielding the phosphate anions, preventing repulsion along the chain and so allowing proximity-dependent dye-dye quenching to occur (FIG. 1).
Example 2. Sequencing by synthesis using post-incorporation labeling method
[0183] In order to assess the brightness of labeled PPE scaffolds on a sequencing platform, T-AB12AB12AB12A-T was first reacted with one equivalent of a tetrazine-azide, Tz-PEG3-N3, and the remaining three free DBCO moieties were labeled with ATTO532-C3-azide. In particular, Tz-PEG3-N3 was dissolved in a 0.5% acetonitrile-water mixture. T-AB12AB12AB12A-T in water was reacted with 1 eq of Tz-PEG3-N3 at room temperature for 30 min. The reaction was monitored using a UPLC. Then 3 eq. of Atto532-C3-N3 was added to the above reaction mixture and reacted at room temperature for 4 hours in water. The reaction mixture was used in sequencing experiments without further purification. The tetrazine moiety may then be clicked to an incorporated TCO-ffN during SBS, leveraging the post-incorporation labeling (PIL) method described herein.
[0184] In this experiment, SBS runs were conducted on an Illumina MiSeq® instrument with blue and green LEDs operating at 460 nm and 532 nm. Images were taken simultaneously through blue (472-520 nm) and green (540-640 nm) collection channels. The standard incorporation mixture included: (1) a set of nucleotides comprising dark ffG, ffC-DB-AOL-Atto532, ffA-DB-AOL-BL-NR550SO, ffA-DB-AOL-BL-coumarin dye A, ffT-DB-AOL-coumarin dye A, each comprising a 3' AOM blocking group; (2) the DNA polymerase disclosed as SEQ ID NO:5 in U. S. Publication No. 2024 / 0141427; and (3) a glycine buffer. The SBS scatterplot at cycle 8 is shown in FIG. 2. It was observed that the -1 cycle sequence context effect was more pronounced, meaning that the interaction between ffC-AOL-Atto532 and the previous base A quenched the signal.
Coumarin dye A [structure not reproduced]
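The nucleotide set above implies a two-channel readout, which can be sketched as a simple lookup. The mapping is an inference from the listed dyes and is not stated in the text: it assumes that coumarin dye A is detected in the blue channel, that Atto532 and NR550SO are detected in the green channel, and that A (supplied with both dyes) therefore appears in both. The function name and the fixed threshold are hypothetical; real base calling works on the scatterplot clouds rather than a single cutoff.

```python
# Assumed (blue_on, green_on) signature for each base, inferred from the
# dye set: dark G, green-only C, blue-only T, and A in both channels.
SIGNATURE_TO_BASE = {
    (False, False): "G",
    (False, True): "C",
    (True, False): "T",
    (True, True): "A",
}


def call_base(blue, green, threshold=0.5):
    """Call a base from normalized blue and green channel intensities.

    The threshold is a hypothetical cutoff used only for this illustration.
    """
    return SIGNATURE_TO_BASE[(blue > threshold, green > threshold)]


calls = "".join(
    call_base(blue, green)
    for blue, green in [(0.9, 0.1), (0.1, 0.8), (0.05, 0.02), (0.7, 0.9)]
)
```

Under this scheme, quenching of the green signal for C (as in the sequence context effect described above) moves the C cloud toward the dark G cloud, which is one reason a brighter label on ffC helps separation.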
[0185] In comparison, the incorporation mix used in the post incorporation labeling method included dark ffG, ffC-DB-AOL-AOL-TCO, ffA-DB-AOL-BL-NR550SO, ffA-DB-AOL-BL-coumarin dye A, and ffT-DB-AOL-coumarin dye A, each comprising a 3' AOM blocking group. The post incorporation labeling mixture contained PPE-TZ(Atto532)3 in 10 mM pH 8 Tris buffer. The post incorporation labeling (PIL) mixture was introduced to the flowcell surface containing the incorporated ffNs and allowed to bind at 60 °C, either by flushing through the flow cell without incubation (about 2.5 s contact time) or with a longer 30 s incubation contact time. The PIL mixture comprised 10pM ATTO532 labeled T-A**B12A*B12A*B12A*-T in 25mM Tris buffer pH 7 (* refers to the ATTO532 label and ** refers to one DBCO moiety that was further derivatized with the tetrazine moiety through reaction of the DBCO moiety with TZ-PEG3-N3). The SBS scatterplot at cycle 11 for the flush-through treatment is shown in FIG. 3A and the scatterplot at cycle 13 for the incubation treatment is shown in FIG. 3B. No sequence context effect was observed in the PIL method, because the dye attached by the PIL method was further away from the DNA, and the scatterplot cloud was much tighter when the PPE was present on ffC. The increase in cluster brightness achieved using a PPE strategy is of particular importance in the effort to improve SNR and sequencing quality, especially in the context of the high density flowcell roadmap whereby fewer DNA strands comprise each cluster, necessitating a greater signal to allow the signal to be consistently distinguished from noise.
Tz-PEG3-N3 [structure not reproduced]
[0186] In summary, the scatterplots resulting from the PIL sequencing method described herein using a polyphosphodiester (PPE) scaffold labeled with a maximum of 3x ATTO532 dyes indicate that 2X brightness can be achieved compared to the standard ATTO532 ffC on the MiSeq. This is the same as the brightness of a similar scaffold in solution, indicating a correlation between solution and sequencing data for ATTO532. In the standard SBS method utilizing ffC-AOL-ATTO532, the interaction between the ffC and previous base A showed quenching of the signal. In contrast, no sequence context effect was observed in the PIL method.
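The summary figures admit a simple arithmetic check: a scaffold carrying three ATTO532 dyes that is twice as bright as the single-dye ffC delivers about two thirds of the single-dye brightness per attached dye. The Python sketch below works through that ratio. It assumes that all three positions are occupied (the text states only a maximum of three) and that brightness is compared on a per-molecule basis; the function name is hypothetical.

```python
def per_dye_relative_brightness(fold_brightness: float, n_dyes: int) -> float:
    """Brightness per attached dye, relative to a single-dye reference.

    fold_brightness: scaffold brightness divided by the single-dye reference.
    n_dyes: number of dyes assumed to be attached to the scaffold.
    """
    if n_dyes < 1:
        raise ValueError("n_dyes must be at least 1")
    return fold_brightness / n_dyes


if __name__ == "__main__":
    # Values from the summary: 2X brightness with up to 3 dyes (assumed fully labeled).
    ratio = per_dye_relative_brightness(2.0, 3)
    print(f"relative brightness per dye: {ratio:.2f}")
```

A per-dye value below 1 under these assumptions would be consistent with some residual dye-dye quenching or incomplete labeling; the data given do not distinguish between the two.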
Claims
WHAT IS CLAIMED IS:
1. A multi-dye labeled polyphosphodiester having the structure of Formula (I) or (II):
[Structures of Formula (I) and Formula (II)]
or a salt thereof, wherein:
R1 is —OH, —SH, —O⁻, —S⁻, —O⁻M+, —S⁻M+, —O(unsubstituted or substituted C1-C6 alkyl), or —O(unsubstituted or substituted C2-C6 alkenyl);
R2 is O or S;
L1 is an unsubstituted or substituted C2-C6 alkylene, an unsubstituted or substituted C2-C6 alkenylene, an unsubstituted or substituted C2-C6 alkynylene, a 2 to 8 membered heteroalkylene, an unsubstituted or substituted C3-C10 membered carbocyclylene, an unsubstituted or substituted C6-C10 arylene, an unsubstituted or substituted 5 to 10 membered heteroarylene, or an unsubstituted or substituted 3 to 10 membered heterocyclylene, or the C2-C6 alkylene, C2-C6 alkenylene, C2-C6 alkynylene, or 2 to 8 membered heteroalkylene each independently interrupted by an unsubstituted or substituted C3-C10 membered carbocyclylene, an unsubstituted or substituted C6-C10 arylene, an unsubstituted or substituted 5 to 10 membered heteroarylene, or an unsubstituted or substituted 3 to 10 membered heterocyclylene;
L2 is a bond, an unsubstituted or substituted C1-C6 alkylene, an unsubstituted or substituted C2-C6 alkenylene, an unsubstituted or substituted C2-C6 alkynylene, a 2 to 8 membered heteroalkylene, an unsubstituted or substituted C3-C10 membered carbocyclylene, an unsubstituted or substituted C6-C10 arylene, an unsubstituted or substituted 5 to 10 membered heteroarylene, or an unsubstituted or substituted 3 to 10 membered heterocyclylene;
each of L3, L4 and L5 is independently a bond, an unsubstituted or substituted C1-C6 alkylene, or a 2 to 8 membered heteroalkylene;
M+ is a cation;
Fl is a fluorescent dye moiety;
each X is independently a functional group of the fluorescent dye moiety Fl;
each Y is independently a functional group of the polyphosphodiester, wherein X—Y refers to a reaction product between X and Y forming covalent bonding;
each RA and RB is independently H, —OH, —SH, —OP(=Ra)(Rb)Rc, unsubstituted or substituted C1-C6 alkyl, unsubstituted or substituted C6-C10 aryl, unsubstituted or substituted 5 or 6 membered heteroaryl, or a functional group capable of binding to a biological molecule of interest either via a bioorthogonal reaction selected from the group consisting of a [3+2] dipolar cycloaddition, a Diels-Alder cycloaddition, a [4+1] cycloaddition, a phosphine ligation, and condensation with 2-acylphenyl boronic acid, or via noncovalent interaction with the biological molecule of interest;
Ra is O or S;
each of Rb and Rc is independently —OH, —SH, —O⁻, —S⁻, —O⁻M+, —S⁻M+, —O(unsubstituted or substituted C1-C6 alkyl), or —O(unsubstituted or substituted C2-C6 alkenyl);
each m is independently an integer of 5 or more;
each m1 and m2 is independently an integer of 3 or more;
n is an integer of 1 or more;
q is an integer of 2 or more;
k is 0 or 1, provided that when k is 0, L3 and L4 are directly attached to L2; and
p is 0 or 1, provided that when p is 0, RA is directly attached to L2, and n is at least 2.
2. The multi-dye labeled polyphosphodiester of claim 1, wherein R1 is —OH.
3. The multi-dye labeled polyphosphodiester of claim 1, wherein R1 is —O⁻M+, and M+ is Na+ or K+.
4. The multi-dye labeled polyphosphodiester of claim 1, wherein R2 is O.
5. The multi-dye labeled polyphosphodiester of any one of claims 1 to 4, wherein L1 is —CH2-CH2—, -CH2=CH2-, or -CH2-CH2=CH2-CH2-.
6. The multi-dye labeled polyphosphodiester of any one of claims 1 to 4, wherein L1 is cyclohexylene or C2-C6 alkylene interrupted by cyclohexylene.
7. The multi-dye labeled polyphosphodiester of any one of claims 1 to 6, wherein L2 is —CH2—.
8. The multi-dye labeled polyphosphodiester of claim 7, wherein k is 1 and L4 is a bond or —CH2—.
9. The multi-dye labeled polyphosphodiester of any one of claims 1 to 6, wherein L2 is -CH2=CH2-, -CH2-CH2=CH2-CH2- or cyclohexylene.
10. The multi-dye labeled polyphosphodiester of claim 9, wherein k is 0, and L4 is a bond or —CH2—.
11. The multi-dye labeled polyphosphodiester of any one of claims 1 to 10, wherein L5 is a bond or —CH2—.
12. The multi-dye labeled polyphosphodiester of any one of claims 1 to 11, wherein one of X and Y is amino, unsubstituted or substituted dibenzocyclooctyne moiety (DBCO), or unsubstituted or substituted transcyclooctene (TCO) moiety, and the other one of X and Y is carboxyl, azido, or unsubstituted or substituted tetrazine moiety.
13. The multi-dye labeled polyphosphodiester of claim 12, wherein one of X and Y is amino and the other one of X and Y is carboxyl, and X—Y forms an amide bond.
14. The multi-dye labeled polyphosphodiester of any one of claims 1 to 13, wherein m is 7, n is 1 or 2, and p is 1.
15. The multi-dye labeled polyphosphodiester of any one of claims 1 to 14, wherein the biological molecule of interest is a nucleotide, an oligonucleotide, a polynucleotide or an antibody.
16. A biological molecule attached to the multi-dye labeled polyphosphodiester of any one of claims 1 to 15 via either the bioorthogonal reaction or noncovalent interaction with at least one of RA and RB, wherein the biological molecule is a nucleotide, an oligonucleotide, a polynucleotide, or an antibody.
17. The biological molecule of claim 16, wherein the biological molecule is covalently attached to the multi-dye labeled polyphosphodiester via the bioorthogonal reaction with one of RA and RB.
18. The biological molecule of claim 16 or 17, wherein RA is H, OH, -OP(=O)(OH)O⁻, or -OP(=O)(OH)O⁻M+, one of RB and the biological molecule comprises alkynyl, unsubstituted or substituted dibenzocyclooctyne moiety (DBCO), substituted phosphines, unsubstituted or substituted tetrazine (e.g., phenyl tetrazine, pyrimidyl tetrazine, methyl tetrazine, pyridyl tetrazine, t-butyl tetrazine), triazine, unsubstituted or substituted bicyclo[6.1.0]nonyne (BCN) moiety, unsubstituted or substituted norbornene moiety, vinyl boronic acid, or 2-acylphenyl boronic acid, and the other one of RB and the biological molecule comprises azido, phenyl azido, unsubstituted or substituted transcyclooctene (TCO) moiety, primary isonitrile, tertiary isonitrile, or amino hydrazide.
19. The biological molecule of claim 18, wherein one of RB and the biological molecule comprises alkynyl, norbornene, DBCO, or BCN, and the other one of RB and the biological molecule comprises azido.
20. The biological molecule of claim 18, wherein one of RB and the biological molecule comprises an optionally substituted triphenylphosphine moiety, and the other one of RB and the biological molecule comprises phenyl azido.
21. The biological molecule of claim 18, wherein one of RB and the biological molecule comprises TCO, and the other one of RB and the biological molecule comprises a substituted tetrazine.
22. The biological molecule of claim 18, wherein one of RB and the biological molecule comprises an amino hydrazide moiety, and the other one of RB and the biological molecule comprises a 2-acylphenyl boronic acid moiety.
23. The biological molecule of any one of claims 16 to 22, wherein the biological molecule is an oligonucleotide or polynucleotide, wherein the oligonucleotide or polynucleotide is at least partially complementary and hybridized to a target polynucleotide immobilized on a surface of a solid support.
24. A method of determining the sequences of a plurality of different target polynucleotides, comprising:
(a) contacting a solid support with a solution comprising sequencing primers under hybridization conditions, wherein the solid support comprises a plurality of different target polynucleotides immobilized thereon; and the sequencing primers are complementary to at least a portion of the target polynucleotides;
(b) contacting the solid support with an incorporation mixture comprising DNA polymerase and one or more of four different types of nucleotides under conditions suitable for DNA polymerase-mediated primer extension, and incorporating one type of nucleotides into the sequencing primers to produce extended copy polynucleotides; wherein each of the four types of nucleotides comprises a 3' blocking group; and the incorporation mixture comprises a first type of unlabeled nucleotide having a first functional moiety covalently attached to the first type of unlabeled nucleotide;
(c) contacting the extended copy polynucleotides with the multi-dye labeled polyphosphodiester of any one of claims 1 to 14, wherein one of RA and RB of the multi-dye labeled polyphosphodiester undergoes the bioorthogonal reaction with the first functional moiety of the first type of unlabeled nucleotide to form covalent bonding;
(d) imaging the solid support and performing one or more fluorescent measurements of the extended copy polynucleotides; and
(e) removing the 3' blocking group of the incorporated nucleotides.
25. The method of claim 24, wherein the first functional moiety is covalently attached to the nucleobase of the first type of unlabeled nucleotide via a cleavable linker.
26. The method of claim 24, wherein the first functional moiety is covalently attached to the 3’ blocking group of the first type of unlabeled nucleotide via a cleavable linker.
27. The method of any one of claims 24 to 26, wherein RA is H, OH, -OP(=O)(OH)O⁻, or -OP(=O)(OH)O⁻M+, one of RB and the first functional moiety comprises alkynyl, unsubstituted or substituted dibenzocyclooctyne moiety (DBCO), substituted phosphines, unsubstituted or substituted tetrazine (e.g., phenyl tetrazine, pyrimidyl tetrazine, methyl tetrazine, pyridyl tetrazine, t-butyl tetrazine), triazine, unsubstituted or substituted bicyclo[6.1.0]nonyne (BCN) moiety, unsubstituted or substituted norbornene moiety, vinyl boronic acid, or 2-acylphenyl boronic acid, and the other one of RB and the first functional moiety comprises azido, phenyl azido, unsubstituted or substituted transcyclooctene (TCO) moiety, primary isonitrile, tertiary isonitrile, or amino hydrazide.
28. The method of claim 27, wherein one of RB and the first functional moiety comprises alkynyl, norbornene, DBCO, or BCN, and the other one of RB and the first functional moiety comprises azido.
29. The method of claim 27, wherein one of RB and the first functional moiety comprises an optionally substituted triphenylphosphine moiety, and the other one of RB and the first functional moiety comprises phenyl azido.
30. The method of claim 27, wherein one of RB and the first functional moiety comprises TCO, and the other one of RB and the first functional moiety comprises a substituted tetrazine.
31. The method of claim 27, wherein one of RB and the first functional moiety comprises an amino hydrazide moiety, and the other one of RB and the first functional moiety comprises a 2-acylphenyl boronic acid moiety.
32. The method of any one of claims 24 to 31, wherein the incorporation mixture comprises a second type of labeled nucleotide, and a third type of labeled nucleotide.
33. The method of any one of claims 24 to 31, wherein the incorporation mixture comprises a second type of unlabeled nucleotide having a second functional moiety covalently attached to the second type of unlabeled nucleotide, and a third type of labeled nucleotide.
34. The method of any one of claims 24 to 31, wherein the incorporation mixture comprises a second type of unlabeled nucleotide having a second functional moiety covalently attached to the second type of unlabeled nucleotide, and a mixture of a third type of unlabeled nucleotide having a first functional moiety covalently attached to the third type of unlabeled nucleotide and a third type of unlabeled nucleotide having a second functional moiety covalently attached to the third type of unlabeled nucleotide.
35. The method of claim 33 or 34, wherein step (c) further comprises contacting the extended copy polynucleotides with a second labeling reagent that reacts specifically with the second functional moiety.
36. The method of any one of claims 24 to 31, wherein the incorporation mixture comprises a second type of unlabeled nucleotide having a second functional moiety covalently attached to the second type of unlabeled nucleotide, and a third type of unlabeled nucleotide having a third functional moiety covalently attached to the third type of unlabeled nucleotide.
37. The method of claim 36, wherein step (c) further comprises contacting the extended copy polynucleotides with a second labeling reagent that reacts specifically with the second functional moiety of the second type of unlabeled nucleotides, and a third labeling reagent that reacts specifically with the third functional moiety of the third type of unlabeled nucleotides.
38. The method of any one of claims 27 to 37, wherein the incorporation mixture comprises a fourth type of unlabeled nucleotide, wherein the fourth type of unlabeled nucleotide is not capable of reacting with any of the labeling reagents.
39. The method of any one of claims 24 to 38, wherein step (e) also removes the detectable labels of the incorporated nucleotides.
40. The method of claim 39, wherein the detectable labels and the 3' blocking groups of the incorporated nucleotides are removed in a single chemical reaction.
41. The method of any one of claims 24 to 40, further comprising (f) washing the solid support with an aqueous wash solution.
42. The method of claim 41, wherein steps (b) to (f) are repeated at least 50, 100, 150, 200, 250, 300, 350, 400, 450, or 500 cycles to determine the target polynucleotide sequences.
43. The method of any one of claims 24 to 42, wherein the four types of nucleotides comprise dATP, dCTP, dGTP and dTTP or dUTP, or non-natural nucleotide analogs thereof.
44. The method of any one of claims 24 to 43, wherein the method is performed on an automated sequencing instrument comprising two light sources operating at two different wavelengths.
45. The method of claim 44, wherein one light source has a wavelength of about 450 nm to about 460 nm, and the other light source has a wavelength of about 520 nm to about 540 nm.
46. The method of any one of claims 24 to 43, wherein the method is performed on an automated sequencing instrument comprising a single light source.
47. The method of claim 46, wherein the single light source has a wavelength of about 450 nm to about 460 nm, or about 520 nm to about 540 nm.
48. A kit for sequencing application, comprising:
an incorporation mixture comprising one or more of four different types of nucleotides each comprising a 3' blocking group, wherein a first type of unlabeled nucleotide having a first functional moiety covalently attached to the first type of unlabeled nucleotide; and
the multi-dye labeled polyphosphodiester of any one of claims 1 to 15, wherein at least one of RA and RB of the multi-dye labeled polyphosphodiester is capable of undergoing the bioorthogonal reaction specifically with the first functional moiety of the first type of unlabeled nucleotides.
49. A kit for sequencing application, comprising:
an incorporation mixture comprising one or more of four different types of nucleotides each comprising a 3' blocking group, wherein a first type of unlabeled nucleotide having a first functional moiety covalently attached to the first type of unlabeled nucleotide; and
an antibody comprising the multi-dye labeled polyphosphodiester of claim 16, wherein the antibody is capable of binding to the first type of unlabeled nucleotides via noncovalent interaction with the first functional moiety of the first type of unlabeled nucleotides.
50. The kit of claim 48 or 49, wherein the kit comprises a fourth type of unlabeled nucleotide, wherein the fourth type of unlabeled nucleotide is not capable of reacting with any labeling reagent.
51. The kit of any one of claims 48 to 50, wherein the four different types of nucleotides are distinguishable using a single light source.
52. The kit of any one of claims 48 to 50, wherein the four different types of nucleotides are distinguishable using two light sources operating at two different wavelengths.
53. A system for nucleic acid sequencing, comprising a plurality of chambers, wherein one of the chambers contains the kit of any one of claims 48 to 52.