Methods of preparing RNA sequencing libraries
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- REALSEQ BIOSCIENCES INC
- Filing Date
- 2025-10-30
- Publication Date
- 2026-06-11
Abstract
Description
Attorney Docket No. 57767-714601
METHODS OF PREPARING RNA SEQUENCING LIBRARIES
CROSS-REFERENCE
[0001] This application claims the benefit of U.S. Provisional Application No. 63/714,496, filed October 31, 2024, which application is incorporated herein by reference in its entirety.
INCORPORATION BY REFERENCE OF SEQUENCE LISTING
[0002] The present application is being filed along with a Sequence Listing in electronic format. The Sequence Listing is provided as a file entitled 57767-714_601_SL.xml, created October 28, 2025, which is 28,641 kilobytes in size. The information in the electronic format of the Sequence Listing is incorporated by reference in its entirety.
GOVERNMENT SUPPORT
[0003] This invention was made with government support under Small Business Innovation Research grant 1R43HG013284-01 awarded by the National Institutes of Health. The government has certain rights in the invention.
SUMMARY
[0004] In certain aspects, described herein is a method of preparing a sequencing library from a sample comprising a plurality of RNA molecules, wherein the plurality of RNA molecules comprise a combination of ligatable or unligatable 5’ and 3’ ends (end-types), and wherein the plurality of RNA molecules comprise at least one RNA molecule comprising 5’ and 3’ ligatable ends and at least one or more RNA molecules comprising one or two 5’ or 3’ unligatable ends, the method comprising: a) preparing a sequencing library for all pluralities of RNA molecules comprising ligatable end-types before and/or after a plurality of pretreatments, wherein preparing the sequencing library comprises ligation to sequencing adaptors; and b) applying a plurality of pretreatments to the plurality of RNA molecules of the sample, wherein the plurality of pretreatments comprises: (i) converting a first plurality of RNA molecules comprising unligatable end-types to ligatable end-types, and/or (ii) depleting a first plurality of RNA molecules comprising a ligatable 5’ end or 3’ end using end-specific exonucleases, and/or (iii) circularizing a second plurality of RNA molecules comprising ligatable end-types to obtain a second plurality of circularized RNA molecules that have no ligatable ends; and/or (iv) converting a third plurality of RNA molecules comprising unligatable end-types into a third plurality of RNA molecules comprising ligatable end-types; and/or (v) circularizing a fourth plurality of RNA molecules comprising ligatable end-types to obtain a fourth plurality of circularized RNA molecules; and/or (vi) converting a fifth plurality of RNA molecules comprising unligatable end-types into a fifth plurality of RNA molecules comprising ligatable end-types; and/or (vii) circularizing a sixth plurality of RNA molecules comprising ligatable end-types to obtain a sixth plurality of circularized RNA molecules; and/or (viii) converting a seventh plurality of RNA molecules comprising unligatable end-types into a seventh plurality of ligatable end-types; and/or (ix) optionally, repeating (v) through (vii) for a fifth or more pluralities of RNA molecules comprising unligatable end-types. In some embodiments, the method comprises sequencing only the RNA molecules incorporated into the sequencing library. In some embodiments, the RNA molecules comprise small RNAs (sRNA) or RNA fragments (RFs). In some embodiments, said sRNAs or RFs are 150 nucleotides or less in length. In some embodiments, said sRNAs or RFs are 50 nucleotides or less in length. In some embodiments, the 5’ ends comprise 5’-hydroxyl (5’-OH), 5’-Phosphate (5’-P), 5’-triphosphate (5’-ppp) or 5’-cap (e.g., 5’-methylGppp); and the 3’ ends comprise 3’-Phosphate (3’-P), 2’-phosphate (2’-P), 2’,3’-cyclic phosphate (2’,3’>P), or 2’-O-methyl (2’-OMe). In some embodiments, the ligatable ends are selected from 5’-P, 3’-OH, 5’-OH, 3’-P or 2’,3’>P, or any combination thereof. In some embodiments, the RNA end-types are defined as RNA Types comprising the following ends: 5’-P and 3’-OH (Type 1); 5’-OH and 3’-OH (Type 2); 5’-OH and 3’-P or 2’,3’>P (Type 3); 5’-P and 3’-P or 2’,3’>P (Type 4).
In some embodiments, the depleting of RNA molecules comprising a ligatable 5’-P end and/or 3’-OH end is performed with end-specific exonucleases selected from: Terminator 5’-P-dependent exonuclease, XRN-1 (5’-P end specific) or exonuclease and Exonuclease T (3’-OH end specific). In some embodiments, the circularizing is performed with at least one 3’-ligase (ligating 3’-OH with 5’-P ends) selected from: T4 RNA ligase, T4 RNA ligase 1 (Rnl1), T4 RNA ligase 2 (Rnl2), Mth RNA Ligase, CircLigase™ ssDNA ligase, CircLigase™ II ssDNA ligase, CircLigase™ RNA Ligase, Thermostable 5' AppDNA/RNA ligase, or a combination thereof. In some embodiments, the circularizing is performed with at least one 5’-ligase (ligating 5’-OH with 3’-P or 2’,3’>P ends) selected from: RNA-splicing ligase (RtcB), A. thaliana tRNA ligase (AtRNL), tRNA ligase enzyme (Tril), tRNA ligase (Rigl+), or a combination thereof. In some embodiments, the circularizing of RNA molecules prevents ligation of the circularized RNA molecules with sequencing adapter(s), thereby preventing incorporation of the circularized RNA molecules into the sequencing library. In some embodiments, the sequencing library preparation includes only Type 1 or a combination of Type 1 and Type 2 RNA molecules.
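The four RNA Types defined above are a lookup on the pair of 5’ and 3’ ends. The following is a minimal Python sketch of that lookup, not part of the described method. The function name `rna_type` and the ASCII end labels (for example `"5'-P"`) are illustrative assumptions. End combinations outside the four Types, such as a 5’-ppp end, return `None`.

```python
from typing import Optional

# 3' end labels, written with ASCII apostrophes (illustrative convention).
THREE_PRIME_HYDROXYL = {"3'-OH"}
THREE_PRIME_PHOSPHATE = {"3'-P", "2',3'>P"}


def rna_type(five_end: str, three_end: str) -> Optional[int]:
    """Map a (5' end, 3' end) pair to Type 1-4 as defined in the text.

    Returns None for end combinations that fall outside the four Types.
    """
    if three_end in THREE_PRIME_HYDROXYL:
        if five_end == "5'-P":
            return 1  # 5'-P and 3'-OH
        if five_end == "5'-OH":
            return 2  # 5'-OH and 3'-OH
    elif three_end in THREE_PRIME_PHOSPHATE:
        if five_end == "5'-OH":
            return 3  # 5'-OH and 3'-P or 2',3'>P
        if five_end == "5'-P":
            return 4  # 5'-P and 3'-P or 2',3'>P
    return None


if __name__ == "__main__":
    for ends in [("5'-P", "3'-OH"), ("5'-OH", "2',3'>P"), ("5'-ppp", "3'-OH")]:
        print(ends, "->", rna_type(*ends))
```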
[0005] In certain aspects, described herein is a method of preparing a sequencing library from a sample comprising a plurality of RNA molecules; wherein the plurality of RNA molecules comprises a first end comprising a 5’-Phosphate (5’-P) end or a 5’-hydroxyl (5’-OH) end and a second end comprising a 3’-Phosphate (3’-P) end, a 2’,3’-cyclic (2’,3’>P) end, or a 3’-hydroxyl (3’-OH) end; the method comprising: separating a composition comprising the plurality of RNA molecules into at least a first partition and a second partition; performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are not the same, wherein the first pretreatment and the second pretreatment are independently selected from: circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends; converting a plurality of RNA molecules comprising 3’-P or 2’,3’>P ends to 3’-OH ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and converting a plurality of RNA molecules comprising 5’-OH and 3’-P or 2’,3’>P ends to 5’-P and 3’-P or 2’,3’>P ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends, and then converting 5’-P and 3’-P or 2’,3’>P ends to 5’-P ends and 3’-OH ends; circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P ends, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P ends, then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends; degrading a plurality of RNA molecules comprising 5’-P ends, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends, then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends; or no pretreatment is performed; and preparing a first sequencing library from the first partition and a second sequencing library from the second partition. In some embodiments, (b)(i) comprises contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP. In some embodiments, (b)(ii) comprises contacting the plurality of RNA molecules with T4 polynucleotide kinase (PNK) in the absence of ATP. In some embodiments, (b)(iii) comprises contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 or T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP. In some embodiments, (b)(iv) comprises contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP. In some embodiments, (b)(v) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP. In some embodiments, (b)(vi) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP.
In some embodiments, (b)(vii) comprises contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP. In some embodiments, contacting the plurality of RNA molecules with the PNK is performed in buffer solutions comprising 2-(N-morpholino)ethanesulfonic acid (MES) or Imidazole at pH 5.5-6.5, or tris(hydroxymethyl)aminomethane (TRIS) at pH 7.0-7.5. In some embodiments, the method comprises separating the composition comprising the plurality of RNA molecules into a third, fourth, fifth, sixth, and/or seventh partition, wherein a third, fourth, fifth, sixth, and/or seventh pretreatment is performed on the third, fourth, fifth, sixth, and/or seventh partition, wherein the third, fourth, fifth, sixth, and/or seventh pretreatments are different from each other and from the first and second pretreatments. In some embodiments, preparing the sequencing library comprises ligating a single adaptor to the 5’ or to the 3’ end of the plurality of RNA molecules. In some embodiments, preparing the sequencing library comprises ligating two adapters, wherein the first adaptor is ligated to a first end and the second adapter is ligated to a second end of the plurality of RNA molecules. In some embodiments, preparing the sequencing library further comprises circularizing the plurality of ligation products comprising the single adaptor-RNA molecules. In some embodiments, preparing the sequencing library further comprises reverse transcription (RT) of the circularized products followed by PCR amplification of the cDNA products of the RT. In some embodiments, preparing the sequencing library further comprises direct RT-PCR amplification of the circularized products.
In some embodiments, the method comprises sequencing the first sequencing library and the second sequencing library.
[0006] In certain aspects, described herein is a method of preparing a sequencing library from a sample comprising a plurality of RNA molecules, wherein the plurality of RNA molecules have ends of Type 1 comprising a combination of 5’-Phosphate (5’-P) and 3’-hydroxyl (3’-OH) ends, Type 2 comprising 5’-hydroxyl (5’-OH) and 3’-OH ends, Type 3 comprising 5’-OH and 3’-Phosphate (3’-P) or 2’,3’-cyclic phosphate (2’,3’>P) ends, and Type 4 comprising 5’-P and 3’-P or 2’,3’>P ends; the method comprising: separating a composition comprising the plurality of said RNA molecules into at least a first partition and a second partition; performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are not the same, wherein the first pretreatment and the second pretreatment are independently selected from: circularizing Type 1 RNA molecules; converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules; converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules, then circularizing Type 1 RNA molecules; converting Type 2 RNA molecules to Type 1 RNA molecules and converting Type 3 RNA molecules to Type 4 RNA molecules, then circularizing Type 1 molecules, and then converting Type 4 RNA molecules to Type 1 RNA molecules; circularizing Type 3 molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules; circularizing Type 3 molecules, then converting Type 4 molecules to Type 1 molecules; degrading Type 1 and Type 4 molecules, then converting Type 2 molecules to Type 1 molecules and circularizing Type 1 molecules, then converting Type 3 molecules to Type 2 molecules; or no pretreatment; ligating a plurality of adaptors to the plurality of RNA molecules to produce a plurality of adaptor-RNA molecules. In some embodiments, the method comprises separating the composition comprising the plurality of RNA molecules into a third, fourth, fifth, sixth, and/or seventh partition, wherein a third, fourth, fifth, sixth, and/or seventh pretreatment is performed on each corresponding separate partition, wherein each pretreatment is different. In some embodiments, (b)(i) comprises contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP. In some embodiments, (b)(ii) comprises contacting the plurality of RNA molecules with PNK in the absence of ATP. In some embodiments, (b)(iii) comprises contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP. In some embodiments, (b)(iv) comprises contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP. In some embodiments, (b)(v) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP. In some embodiments, (b)(vi) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP. In some embodiments, (b)(vii) comprises contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP. In some embodiments, contacting the plurality of RNA molecules with the PNK is performed in a buffer solution comprising MES or Imidazole at pH 5.5-6.5. In some embodiments, the method further comprises sequencing the first sequencing library and the second sequencing library to identify and quantify at least one Type of RNA molecules or their combinations. In some embodiments, the method comprises comparing the relative quantities of the same Type or different Types of RNA molecules in the first sequencing library and the second sequencing library. In some embodiments, 5’ ends comprise 5’-OH, 5’-P, 5’-triphosphate (5’-ppp), or 5’-cap (e.g., 5’-mGppp). In some embodiments, 3’ ends comprise 3’-P, 2’-phosphate (2’-P) or 2’,3’>P. In some embodiments, 3’ ends comprise 3’-OH and 2’-hydroxyl (2’-OH) or 3’-OH and 2’-O-Methyl (2’-OMe).
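The Type-level pretreatments listed above can be followed as a small state machine, in which each step relabels, circularizes, or degrades molecules of a given Type. The sketch below is an illustrative Python model only, under three assumptions: the labels `i` to `vii` follow the order of the list above; every step is treated as 100% efficient; and a molecule counts as capturable when it ends as Type 1 or Type 2, based on the statement that library preparation includes Type 1 or a combination of Type 1 and Type 2. All function and variable names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = str  # "T1".."T4", "circular", or "degraded"
Step = Callable[[State], State]

TYPES: Tuple[State, ...] = ("T1", "T2", "T3", "T4")


def convert(mapping: Dict[State, State]) -> Step:
    """Step that relabels states per `mapping`; all entries apply at once."""
    return lambda state: mapping.get(state, state)


def circularize(rna_type: State) -> Step:
    """Step that removes one Type from the ligatable pool by circularization."""
    return convert({rna_type: "circular"})


def degrade(*rna_types: State) -> Step:
    """Step that removes the given Types by exonuclease digestion."""
    return convert({t: "degraded" for t in rna_types})


# Assumed labels: i..vii in the order the pretreatments appear in the text.
PRETREATMENTS: Dict[str, List[Step]] = {
    "i": [circularize("T1")],
    "ii": [convert({"T3": "T2", "T4": "T1"})],
    "iii": [convert({"T3": "T2", "T4": "T1"}), circularize("T1")],
    "iv": [convert({"T2": "T1", "T3": "T4"}), circularize("T1"),
           convert({"T4": "T1"})],
    "v": [circularize("T3"), convert({"T2": "T1"}), circularize("T1")],
    "vi": [circularize("T3"), convert({"T4": "T1"})],
    "vii": [degrade("T1", "T4"), convert({"T2": "T1"}), circularize("T1"),
            convert({"T3": "T2"})],
    "none": [],
}


def final_states(steps: List[Step]) -> Dict[State, State]:
    """Follow each starting Type through every step of one pretreatment."""
    result: Dict[State, State] = {}
    for start in TYPES:
        state = start
        for step in steps:
            state = step(state)
        result[start] = state
    return result


def capturable(steps: List[Step],
               ligatable: Tuple[State, ...] = ("T1", "T2")) -> List[State]:
    """Starting Types that end in a ligatable state (assumed: Type 1 or 2)."""
    return [s for s, end in final_states(steps).items() if end in ligatable]


if __name__ == "__main__":
    for label, steps in PRETREATMENTS.items():
        print(label, final_states(steps), capturable(steps))
```

Under these assumptions, pretreatment (iii) leaves only molecules that started as Type 2 or Type 3 in a ligatable state, and pretreatment (iv) leaves only those that started as Type 3 or Type 4. The enzyme-level description of (b)(v) ends with an additional PNK step in the absence of ATP; in this model that would be one more `convert({"T4": "T1"})` step.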
[0007] In certain aspects, described herein is a method of preparing a sequencing library from a sample comprising a plurality of RNA molecules, the method comprising: a) separating a composition comprising the plurality of RNA molecules into at least a first partition and a second partition; b) performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are independently selected from: contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; contacting the plurality of RNA molecules with PNK in the absence of ATP; contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 (Rnl1) and T4 RNA ligase 2 (Rnl2) in the presence of ATP; contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP, then contacting the plurality of RNA molecules with PNK in the absence of ATP; contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP, then contacting the plurality of RNA molecules with PNK in the absence of ATP; contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP; contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP; or no pretreatment(s); c) ligating a plurality of adaptors to the plurality of RNA molecules in the first partition and to the plurality of RNA molecules in the second partition to produce a plurality of adaptor-RNA molecules. In some embodiments, contacting the plurality of RNA molecules with the PNK is performed in solutions comprising a salt of MES or Imidazole buffer at a pH value between 5.5 and 6.5. In some embodiments, the method comprises separating the composition comprising the plurality of RNA molecules into a third, fourth, fifth, sixth, and/or seventh partition, wherein a third, fourth, fifth, sixth, and/or seventh pretreatment is performed on the corresponding separate partition, wherein each pretreatment is different. In certain aspects, described herein is a method of pretreating a plurality of RNA molecules comprising all possible combinations of a 5’-P end, a 5’-OH end, a 3’-OH end, a 3’-P end and a 2’,3’>P end, the method comprising: converting the 3’-P ends and the 2’,3’>P ends to 3’-OH ends by contacting the plurality of RNA molecules with a PNK in a buffer solution at a pH between 5.5 and 6.5; and converting the 5’-OH ends to 5’-P ends by PNK in the presence of ATP and a buffer at a pH value between 7.0 and 7.5. In some embodiments, the buffer solution comprises a MES buffer at pH 6.0. In some embodiments, the PNK is heat-inactivated at 65°C-85°C in the presence of citric acid at pH 6, wherein both chelation of Mg2+ cations by citrate anions and pH 6 protect RNA from degradation at the indicated temperatures. In some embodiments, a sequencing adaptor is ligated to each of the 3’ ends of the plurality of RNA molecules after step (a) and before step (b). In some embodiments, a sequencing adaptor is ligated to each of the 3’ ends of the plurality of RNA molecules after step (b).
In some embodiments, the method allows identification of one or more RNA Types for any RNA class of interest. In some embodiments, the RNA class is selected from: microRNAs (miRNA), endogenous small interfering RNAs (esiRNA), Piwi-interacting RNAs (piRNA), small nuclear RNAs (snRNA), small nucleolar RNAs (snoRNA), molecules derived from mRNA transcripts (smRNA, scRNA, sutRNA, sinRNA) and other small genome-encoded RNAs (sgmRNA), long non-coding RNAs (lncRNA), transfer RNA (tRNA), ribosomal RNA (rRNA) and Y RNA, or a combination thereof. In some embodiments, deep sequencing of the plurality of sequencing libraries comprising Type 1, Type 2, Type 3, or Type 4 RNA simultaneously allows identification of specific RNA classes as biomarker candidates. In some embodiments, the method determines whether an RNA molecule is a Type 1, Type 2, Type 3, or Type 4 RNA molecule. In some embodiments, sequencing libraries prepared for different RNA Types allow identification of the specific RNA Type(s) and RNA class(es) providing the most sensitive and specific detection of RNA biomarkers. In some embodiments, the lengths of the identified RNA sequences are within a range of 15- to 150-nucleotide sequencing reads.
[0008] In certain aspects, described herein is a kit for preparing sequencing libraries comprising sequences of all RNA Types and/or specific RNA Types or different combinations of the specific RNA Types from a plurality of RNA molecules from a sample, the kit comprising: a universal or RNA Type-specific sequencing adapter or adapters; and a sequencing library preparation kit. In some embodiments, the kit comprises a pool of control (spike-in) RNA molecules, the RNA molecules comprising: a plurality of 5’ and 3’ end combinations, the end combinations comprising Type 1 with 5’-P and 3’-OH ends; Type 2 with 5’-OH and 3’-OH ends; Type 3 with 5’-OH and 3’-P or 2’,3’>P ends; and Type 4 with 5’-P and 3’-P or 2’,3’>P ends; internal bar-code nucleotide sequences corresponding to and distinguishing between Type 1, Type 2, Type 3 and Type 4; a randomized nucleotide sequence at the first end; and a randomized nucleotide sequence at the second end. In some embodiments, the kit comprises one or more of: a PNK (wild type and 3’ phosphatase minus mutant), a T4 RNA ligase 1, a T4 RNA ligase 2, an RtcB ligase, or a Terminator 5’-Phosphate-Dependent exonuclease. In some embodiments, the kit comprises stock solutions of one or more of: ATP, standard ligase and PNK buffers, MES, and citric acid.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:
[0010] Figure 1 depicts RNA Types having different combinations of free ends, including 5’-hydroxyl (5’-OH) and 5’-phosphate (5’-P) at the 5’ end; 2’-hydroxyl (2’-OH)/3’-hydroxyl (3’-OH), 2’-OH/3’-Phosphate (3’-P) and 2’,3’-cyclic phosphate (2’,3’>P or cP) at the 3’ end; and circularized RNA, which has no free ends. These RNA ends are typically present in cell-free RNA molecules that comprise small RNAs and RNA fragments (also known as RFs) found in human plasma and other biofluids.
[0011] Figures 2A-2B depict a schematic representation of compositions of the RNA Type-specific sequencing adapters and methods for using these adapters for preparation of small RNA sequencing libraries. FIG. 2A depicts a proposed workflow using a combo adaptor (CAD) of type 6. The modifications are related to differences between the termini composition of CAD types 6, 7 and 8 that require different ligation, blocking and circularization steps (see Table 1). Step 1: A selected CAD variant having appropriate (RNA Type-specific) ends is ligated to either the 3’ or 5’ end of small RNA (sRNA) to produce RNA-CAD ligation products (RCAD). Step 2: Optional blocking of CAD by ligation of its 5’ end with a blocking oligonucleotide. Since CAD is used in excess over RNA input, some variants of unused CAD (unligated in Step 1) will need to be blocked to avoid formation of a circular CAD template for production of adapter dimers in downstream steps. However, some other CAD variants cannot circularize under the proposed reaction conditions and, therefore, do not need the blocking step. Step 3: Optional conversion of blocked RCAD ends (if such ends are present) to their ligatable (5’-P and 3’-OH) forms by polynucleotide kinase (PNK), which has 5’-OH phosphorylase and 3’-phosphatase activities, or by PNK, 3’-phosphatase free (PNK, 3’-minus), which has only the 5’-phosphorylase/kinase activity. Step 4: Circularization of the RCAD by ligating their ends. Step 5: Reverse transcription (RT) of the circular RCAD to produce cDNAs containing antisense sequences for sRNA flanked by sequencing adapters. The presence of a non-template linker within the CAD (blue bar) stops RT after one round, preventing rolling-circle amplification (RCA). Step 6: PCR amplification of the RT product with the extended, bar-coded PCR sequencing primers, yielding RNA sequencing libraries. FIG. 2B depicts different versions of combo adapters (CADs) containing segments of standard 3’- and 5’-adapters (shown in blue and green, respectively) compatible with next-generation sequencing. The CAD segments comprise DNA and/or RNA nucleotides with the end variations shown and are connected by a non-template linker. All ends are abbreviated as in Fig. 1, except App, which is the 5' adenylation modification (5’,5’-adenyl pyrophosphoryl).
[0012] Figures 3A-3B depict the workflow for preparation of sequencing libraries by a single protocol compatible with different RNA Type-specific pretreatments (from Table 2). FIG. 3A depicts the core workflow based on a protocol for preparation of libraries from biofluids, which was modified to efficiently capture RFs of both Types 1 and 2, and to reduce adapter dimers. Step 1: Combo adapter (CAD), which encodes two standard sequencing adapters used for small RNA-seq library preparations, is ligated with the RF’s 3’ end to produce RF-CAD ligation products (RCAD). Step 2: Steps 2a and 2b are run simultaneously. Step 2a: Blocking non-ligated CAD (see Fig. IB) used in excess over RF input to reduce amounts of circular CAD (in Step 3) encoding adapter dimers. Step 2b: Conversion of the RCAD non-ligatable 3’-p end and 5’-OH end (only present in RF Type 2) to their ligatable (3’-OH and 5’-p) forms by T4 polynucleotide kinase (PNK) in the presence of ATP to allow RCAD circularization. Step 3: Circularization of the RCAD by ligating their ends by T4 RNA ligase 1 in the presence of ATP. Since the blocking of CAD in Step 2a is not 100% efficient, the unblocked CAD is also circularized (cCAD) (and multimerized) in Step 3. Step 4: Removal of cCAD after its hybridization with a biotinylated cCAD-specific depleting oligo (DO) and capture of the hybrids simultaneously with the products of Step 2a using streptavidin-coated magnetic beads. The DO is complementary to the junction between the 5’ and 3’ adapter ends present only in circular and multimeric forms of CAD. Step 5: Reverse transcription (RT) of the circular RCAD to produce cDNAs containing RF antisense sequences flanked by the standard sequencing adapters. The abasic linker in CAD (small black bar) stops RT and prevents rolling-circle amplification (RCA). Step 6: PCR amplification of the RT product with the extended, UDI/barcoded PCR sequencing primers, yielding RF sequencing libraries.
Depending on the pretreatment of RF samples before the library preparation (Table 2), this core protocol allows specific capture of different RNA Types. FIG. 3B depicts a scheme of blocking non-ligated CAD by splint-dependent ligation of CAD with biotinylated BO using T4 DNA ligase in Step 2a.
[0013] Figures 4A-4C depict results of simultaneous detection of all RNA Types after PNK pretreatments. All results were obtained by sequencing samples of human brain RNA spiked with control RNAs representing a pool of synthetic RNAs of variable length having internal Type-specific barcodes (stsRNA) (Table 4). FIG. 4A depicts a comparison of the “no pretreatment control” with the effect of the PNK pretreatments under different conditions (as indicated) on the percentage of sequencing reads corresponding to different RNA classes. FIG. 4B depicts percentages of different stsRNA Types (see Fig. 1) detected before (left panel) and after the selected (middle panel) and the standard (right panel) PNK pretreatments. FIG. 4C depicts sequencing profiles after the selected PNK pretreatment (PP code 04 or Protocol C) for all indicated RNA classes (left panel) or only for RFs derived from mRNA transcripts (right panel).
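The per-Type percentages of spike-in reads described for FIG. 4B can be tallied by matching each read against the internal Type-specific barcodes. The Python sketch below illustrates that bookkeeping only. The barcode sequences are invented for illustration and are not the stsRNA barcodes of Table 4; the function name `type_percentages` is likewise hypothetical, and exact substring matching is a simplifying assumption.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical barcodes, invented for this example only.
TYPE_BARCODES: Dict[str, str] = {
    "ACGTAC": "Type 1",
    "TGCATG": "Type 2",
    "GATCGA": "Type 3",
    "CTAGCT": "Type 4",
}


def type_percentages(reads: Iterable[str]) -> Dict[str, float]:
    """Percentage of barcode-bearing reads assigned to each RNA Type.

    Reads without any known barcode are ignored (not counted in the total).
    """
    counts: Counter = Counter()
    for read in reads:
        for barcode, rna_type in TYPE_BARCODES.items():
            if barcode in read:  # exact match; no mismatch tolerance
                counts[rna_type] += 1
                break
    total = sum(counts.values())
    if total == 0:
        return {}
    return {rna_type: 100.0 * n / total for rna_type, n in counts.items()}


if __name__ == "__main__":
    demo_reads = ["GGACGTACTT", "TTACGTACAA", "AATGCATGCC", "CCTGCATGAA",
                  "CCCCCCCC"]
    print(type_percentages(demo_reads))
```

Comparing such percentages between an untreated library and a pretreated one is one way to check whether a pretreatment shifts the detected Types as intended.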
[0014] Figures 5A-5B depict the yields of circularization of Type 1 RFs by T4 RNA ligases as a method of excluding them from sequencing libraries. All results were obtained by sequencing samples of human brain RNA spiked with an stsRNA pool of control RNAs (Table 4). FIG. 5A depicts the stsRNA length profiles for Types 1 and 2 (left panel), and Types 3 and 4 (right panel) after sequencing samples pretreated by Rnl1 for 2 hours. The pretreatment results in depletion of Type 1 (compare with Fig. 4B, left panel) and appearance of Type 3 in sequencing reads shortened by 1 nt at the 3’ ends. FIG. 5B, left panel: Sequencing profiles for all RNA classes before (Protocol A) or after the selected pretreatment (Protocol B) with an equimolar mix of Rnl1 and Rnl2. Protocol B was selected for providing efficient circularization of Type 1 while minimizing the side, reverse (exonuclease) reaction. Right panel: Detection-by-Exclusion. Subtracting the sequencing profiles for Protocol B (Type 2) from those for Protocol A (Types 1+2) allows determination of the profile of miRNAs or other Type 1 RFs.
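The Detection-by-Exclusion subtraction can be written out per read length. The Python sketch below is illustrative only. It assumes both profiles are already normalized to a common scale (for example against spike-in controls), and clamping negative differences to zero is a choice of this example rather than something stated in the text. The function name `subtract_profiles` is hypothetical.

```python
from typing import Dict


def subtract_profiles(profile_a: Dict[int, int],
                      profile_b: Dict[int, int]) -> Dict[int, int]:
    """Estimate a Type 1 length profile as Protocol A minus Protocol B.

    Keys are read lengths in nucleotides; values are normalized read counts.
    Lengths missing from a profile count as zero, and negative differences
    are clamped to zero (an assumption of this sketch).
    """
    lengths = sorted(set(profile_a) | set(profile_b))
    return {
        length: max(profile_a.get(length, 0) - profile_b.get(length, 0), 0)
        for length in lengths
    }


if __name__ == "__main__":
    protocol_a = {21: 100, 22: 300, 30: 50}  # Types 1+2 (made-up counts)
    protocol_b = {21: 10, 30: 60, 35: 5}     # Type 2 only (made-up counts)
    print(subtract_profiles(protocol_a, protocol_b))
```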
[0015] Figure 6 depicts sequencing profiles for Protocols D through H (from Table 2) for all RNA classes discovered in total human brain RNA samples. Types of RNA ends (RNA Types) shown in Fig. 1 are indicated along with the specific pretreatment Protocols (Table 2).
[0016] Figure 7 depicts sequencing profiles for selected Types of RF ends (Fig. 1) derived from indicated RNA classes in total human brain RNA samples. Each profile shows the specific Protocols (Table 2) and the selected RNA classes: rRNA (top panel), tRNA (middle panel) and snoRNA (bottom panel). The RNA Types and RNA classes were selected based on the maximum differences in lengths of dominant (based on read percentages) peaks and fractions.
[0017] Figures 8A-8C depict a comparison of RF sequencing profiles in plasma samples by the RealSeq-RF and benchmark Phospho-Seq methods. Plasma samples were from 3 healthy donors (H samples) and 3 patients diagnosed with breast cancer (D samples). FIG. 8A, left panel: Percentages of sequencing reads aligned to the indicated RNA classes using RealSeq-RF pretreatment protocols (Table 2) providing detection of RFs with specific end Types (Fig. 1) and the NEBNext® kit with upfront standard pretreatments by PNK in the presence of ATP (PNK+ATP) or by PNK in the absence of ATP (PNK). Right panel: Percentages of reads for snRNAs. The reads corresponding to stsRNAs and unannotated RNA classes are not shown. FIG. 8B: Overlapping profiles for lengths of RFs derived from snRNAs for all D and H samples, detected by sequencing libraries prepared using either the NEBNext kit with “PNK+ATP” pretreatment providing detection of RFs of all RNA Types 1+2+3+4 (left panels) or RealSeq-RF with Protocol B providing RNA Type 2 RFs (right panels). FIG. 8C: Pileup of reads mapping to the U2 snRNA using either the NEBNext kit with “PNK+ATP” pretreatment (left panels) or RealSeq-RF Protocol B (right panels).
DETAILED DESCRIPTION
[0018] Cell-free nucleic acids (cfDNA and cfRNA) found in blood and other biofluids are promising biomarkers with diagnostic potential for cancer and other diverse pathologies. However, current methods exploiting cfDNA analytes are not sensitive enough to avoid false positive or negative results in diagnosing cancer at early stages (when it is more treatable) and in monitoring minimal residual disease for recurrence, when tumor-associated cfDNA is several orders of magnitude less abundant than background cfDNA originating from non-cancerous cells (Pons-Belda et al. 2021. Diagnostics 11: 2171; Song et al. 2023. Nat. Biomed. Eng. 6: 232-245). Meanwhile, cell-free RNA (cfRNA) is emerging as an important class of biomarkers for cancer that can provide higher sensitivity and specificity than cfDNA (Cabus et al. 2022. Biomarker Res. 10: 62). The fact that RNA is transcribed in multiple copies from genomic and intergenic DNA templates (including ones that are normally silent) contributes to a higher abundance of tumor-associated RNA than of tumor-derived DNA, both in cells and in circulation (Vibert et al. 2022. Mol. Cell. 82: 2458-2471). Furthermore, cfRNA species contain information relating to biological phenotypes that could provide information about the tissue of cancer origin and cancer subtype specificity (Chen et al. 2022. Elife 11: e75181). In addition, dramatic changes in the RNA expression profile in tumors and dysregulated RNA post-transcriptional events, including alternative splicing and formation of chimeric RNAs that are detectable only in the transcriptome (but not in the genome), contribute to the higher complexity of the cfRNA landscape (Ning et al. 2023. EBioMedicine 93: 104645). In addition to cancer, RNAs are also considered potential biomarkers for other pathologies, including but not limited to microbial (viral, bacterial and fungal) infections and genetic disorders.
[0019] The majority (about 95%) of total cfRNA consists of small RNAs (sRNAs) and RNA fragments, which are shorter than 42 nucleotides (nt) in length (Akat et al. 2018. JCI Insight 5: e127317) and are referred to as RFs. RFs comprise products of cleavage of larger RNAs by intracellular and / or circulating ribonucleases, yielding a variety of RNA Types that differ in specific combinations of RNA ends. FIG. 1 shows different RNA end-types that can be generated by ribonucleases, which have diagnostic potential. Protective RNA secondary structures, RNA-protein complexes, or encapsulation into lipid EVs help the RFs released into circulation to survive further degradation (Shi et al. 2022. Nat. Cell Biol. 24: 415-423). The protected RFs can be detected and analyzed by next-generation sequencing (NGS). Besides well-studied microRNAs (miRNAs), the entire RNA fragmentome also includes RFs cut from precursor and mature mRNAs, lncRNAs, tRNAs, rRNAs, snRNAs and other known RNA classes. The RFs derived from specific regions of these RNAs (rather than products of random RNA fragmentation) represent the highest potential as biomarkers. Until recently, the analysis of cfRNAs has been primarily focused on miRNAs. However, only a limited number of individual miRNAs are expressed tissue-specifically, and they represent only a small part of the variety of the transcriptome, whereas some other RNA classes (e.g., mRNA and lncRNA) have much greater diversity than miRNA. As a result, the potential to obtain biomarkers that reliably assess the state of a disease using RFs derived from these more diverse and abundant RNAs is much higher (Giraldez et al. 2019. EMBO J. 38: e101695).
[0020] Sequencing analysis of the entire RNA fragmentome (also known as the cfRNA transcriptome) would improve understanding of the roles of diverse RFs in cancer development and treatment resistance, enable discovery of novel RNA biomarkers, and offer higher sensitivity for cancer diagnostics. However, most of the standard commercially available methods of NGS library preparation for sRNAs (sRNA-Seq) only detect RNA Type 1 (e.g., miRNAs, piRNAs and a few RFs derived from other RNA classes), which accounts for about 10% of the entire RNA fragmentome, while the other 90% of RFs, representing other RNA Types (FIG. 1), are hidden from detection by sRNA-Seq (Shigematsu and Kirino. 2022. Biomolecules 12: 611). Currently there are two main groups of methods for detecting the “hidden” RNA Types. The first group uses T4 polynucleotide kinase (PNK) in the presence of ATP to erase differences between the phosphorylation states of both RNA ends by converting all of them to Type 1 RNAs, followed by the standard methods of sRNA-Seq library preparation (Solaguren-Beascoa et al. 2023. Int. J. Mol. Sci. 24: 11653). Although this approach allows simultaneous analysis of sequences derived from a larger variety of RNA classes than without enzymatic treatment, the rRNA fragments that emerge dominate the sequencing reads and significantly overshadow other RNA species, limiting the sensitivity of determining their abundances in a sample. The second group focuses exclusively on detection of RFs with specific 5’ or 3’ RNA ends but does not distinguish between phosphorylation states at the opposite 3’ or 5’ ends (Crocker et al. 2022. Curr. Protoc. 2: e495; Shi et al. 2022. Nat. Cell Biol. 24: 415-423). Moreover, all the second-group methods require purification of intermediate reaction products, e.g., by gel electrophoresis, RNA extraction and ethanol precipitation, which limits the throughput, sensitivity and reproducibility of quantification of RFs by sequencing and, therefore, their application to diagnostic assays. The most important problem is that none of these methods can both detect all RFs simultaneously and distinguish each individual RNA Type specifically.
[0021] To address these shortcomings, described herein is a preparation technology comprising methods, kits, and compositions that enable previously impossible, comprehensive analysis of the entire RNA fragmentome, including: (i) the full spectrum of RFs, (ii) RFs derived from selected RNA classes, and (iii) specific RF Type(s) within these RNA classes (including the RF length profiles), which maximizes the sensitivity of detection for rare tumor-associated RFs by eliminating background noise from other, irrelevant RF sequences.
METHODS
[0022] The methods described herein are useful for determining RNA Types present in any sample comprising a plurality of RNA Types (e.g., different 3’- and / or 5’-end types). The different RNA Types detectable by the methods described herein are shown in FIGS. 1 and 2B. These RNA Types are generally present in biological samples derived from biological fluids, tissues, or cells of an individual, primary cells cultured ex vivo, or immortalized cell lines cultured in tissue culture. These RNA Types may further be generated in vitro or synthesized. RNAs can be extracted from various biofluids or cells using standard methods or commercially available extraction kits.
[0023] A first step of determining RNA Types from a sample is the application of one or more enzyme pretreatment steps to convert unligatable ends to ends that are ligatable to a combo adaptor (CAD) to obtain a pretreated sample as shown in Table 1. A sample can be divided into one, two, three, four, five, six, seven, eight or more partitions, and the individual partitions can be subjected to pretreatments to determine different RNA Types. The sample need only be divided into as many partitions as are necessary to determine the particular RNA Types desired. The pretreatments may be carried out in parallel or successively, and may comprise a non-pretreated sample or partition.
[0024] As described herein, an RNA molecule analyzed or detected by the described methods can have any combination of RNA end-types. In certain embodiments, the 5’ and 3’ ends of an RNA molecule are ligatable. In certain embodiments, the 5’ and 3’ ends of an RNA molecule are unligatable. In certain embodiments, the 5’ end of an RNA molecule is ligatable, and the 3’ end of the RNA molecule is unligatable. In certain embodiments, the 5’ end of an RNA molecule is unligatable, and the 3’ end of the RNA molecule is ligatable. In some embodiments, the methods for preparation of the sequencing libraries comprise at least one of the methods listed in Table 1 or Table 2.
[0025] As shown in Table 1 and FIGS. 2A and 3A, after pretreatment different steps may be applied such as ligation to a CAD, or not, depending upon the RNA Type to be detected. Further, the CAD may differ as shown in FIG. 2B.
[0026] Optional steps in the process comprise removing or blocking unused CAD, which can be achieved, for example, by using a process as shown in FIG. 3B.
RNA molecule sources
[0027] The methods described herein can be used to determine RNA molecule types present in any type of sample. A sample can be provided or obtained from a biological sample. The biological sample can be obtained from an individual, from cultured cells, from an in vitro reaction that generates specific RNA end-types either singly or in combination (e.g., which may be used as a control), or from a sample comprising chemically synthesized RNAs. In certain embodiments, the biological sample is selected from plasma or serum. In certain embodiments, the biological sample is plasma. In certain embodiments, the biological sample is a tissue biopsy. In certain embodiments, the sample comprises cfRNA. In certain embodiments, the sample comprises at least two of Type 1, Type 2, Type 3 and Type 4 RNA. In certain embodiments, the sample comprises at least three of Type 1, Type 2, Type 3 and Type 4 RNA. In certain embodiments, the sample comprises Type 1, Type 2, Type 3 and Type 4 RNA. In certain embodiments, the biological sample comprises cfRNA. In certain embodiments, the biological sample comprises at least two of Type 1, Type 2, Type 3 and Type 4 RNA. In certain embodiments, the biological sample comprises at least three of Type 1, Type 2, Type 3 and Type 4 RNA. In certain embodiments, the biological sample comprises Type 1, Type 2, Type 3 and Type 4 RNA.
Types of RNA ends
[0028] As described herein, an RNA molecule analyzed or detected by the described methods can have any combination of RNA end-types. In certain embodiments, the 5’ and 3’ ends of an RNA molecule are ligatable. In certain embodiments, the 5’ and 3’ ends of an RNA molecule are unligatable. In certain embodiments, the 5’ end of an RNA molecule is ligatable, and the 3’ end of the RNA molecule is unligatable. In certain embodiments, the 5’ end of an RNA molecule is unligatable, and the 3’ end of the RNA molecule is ligatable.
[0029] The methods described herein include methods of preparing a sequencing library or libraries from a sample comprising a plurality of RNA molecules comprising a plurality of RNA end types as shown in FIG. 1. Types of RNA molecules comprising selected combinations of RNA ends are referred to herein as RNA Types (or Types of RNA). As used herein, Type 1 RNA molecules comprise 5’-Phosphate (P) and 3’-hydroxyl (OH) ends. Type 2 RNA molecules comprise 5’-OH and 3’-OH ends. Type 3 RNA molecules comprise 5’-OH and 3’-P or 2’,3’>P ends. Type 4 RNA molecules comprise 5’-P and 3’-P or 2’,3’>P ends. Only the Types of RNA molecules included in libraries are sequenced, and therefore detected by the sequencing. In some embodiments, the Types of RNA molecules included in the library depend on (or are determined by) the methods of sequencing library preparation. These methods use different compositions of sequencing adapters and / or different enzymatic treatment steps in sequencing library preparation. Examples of these embodiments are shown in Figs. 2A-2B and Table 1. In some embodiments, the RNA Types are included in the libraries depending on the pretreatments of RNA molecules before sequencing library preparation. In some embodiments, the method may enable simultaneous detection of RNA molecules comprising all combinations of Type 1, Type 2, Type 3, and Type 4 ends. In some other embodiments, the method may enable Type-specific detection of a selected single RNA Type or a combination of two or more RNA Types. In some embodiments, the method may enable simultaneous and / or RNA Type-specific detection of RNA molecules using a single sequencing library preparation protocol along with one or more (sequential) pretreatments of RNA molecules. In some embodiments, the method of said library preparation protocol can specifically detect only Type 1 or a combination of Type 1 and Type 2 of RNA molecules. Examples of the latter embodiments are shown in Figs. 3A-3B and Tables 1-2.
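The Type definitions above amount to a lookup on the two end chemistries. The following is a minimal illustrative sketch of that lookup in Python; the names `FivePrime`, `ThreePrime` and `rna_type` are hypothetical and are not part of the described methods.

```python
from enum import Enum


class FivePrime(Enum):
    """Possible 5' end chemistries (hypothetical helper type)."""
    P = "5'-P"
    OH = "5'-OH"


class ThreePrime(Enum):
    """Possible 3' end chemistries (hypothetical helper type)."""
    OH = "3'-OH"
    P = "3'-P"
    CYCLIC_P = "2',3'>P"


def rna_type(five: FivePrime, three: ThreePrime) -> int:
    """Return the RNA Type (1-4) for a combination of end chemistries.

    Type 1: 5'-P  and 3'-OH
    Type 2: 5'-OH and 3'-OH
    Type 3: 5'-OH and 3'-P or 2',3'>P
    Type 4: 5'-P  and 3'-P or 2',3'>P
    """
    three_is_hydroxyl = three is ThreePrime.OH
    if five is FivePrime.P:
        return 1 if three_is_hydroxyl else 4
    return 2 if three_is_hydroxyl else 3


if __name__ == "__main__":
    # Enumerate every combination of end chemistries and its Type.
    for five in FivePrime:
        for three in ThreePrime:
            print(f"{five.value} / {three.value} -> Type {rna_type(five, three)}")
```

Both 3’-P and 2’,3’>P map to the same Type in this sketch because the definitions above group them together.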
[0030] Described herein are methods of detecting a plurality of RNA molecules that comprise a plurality of RNA end-types. In certain embodiments, the methods detect at least two of Type 1, Type 2, Type 3 and Type 4 of RNA molecules. In certain embodiments, the methods detect at least three of Type 1, Type 2, Type 3 and Type 4 of RNA molecules. In certain embodiments, the methods detect Type 1, Type 2, Type 3 and Type 4 of RNA molecules. In certain embodiments, the different end-types are processed simultaneously. In certain embodiments, the different end-types are detected simultaneously. In certain embodiments, the different end-types are processed and detected simultaneously.
RNA pretreatments for conversion of RNA ends
[0031] Pretreatments that allow simultaneous (Type-independent) or Type-specific detection of RNA molecules can exploit the differences in phosphorylation states or forms of RNA termini. Each RNA molecule comprises a 5’ end and a 3’ end, each present in one of two forms: one that can be ligated (ligatable end) or one that cannot be ligated (unligatable end), either to the other end of the same RNA molecule intramolecularly or to an end of a (sequencing) adapter intermolecularly, where ligation to an adapter (or adapters) represents a first step of sequencing library preparation. In some embodiments, the detection of all (four) RNA Types simultaneously is provided by converting all unligatable end(s) into ligatable ends before the sequencing library preparation. In other embodiments, the detection of specific RNA Types or a combination of two (or three) specific RNA Types comprises a direct circularization (intramolecular ligation) of an RNA Type comprising a combination of ligatable ends. The circularized RNA molecules no longer have ends and cannot be ligated to the adapters, which excludes (or depletes) them from the sequencing libraries. The termini of RNA Types comprising one or two unligatable ends can further be selectively converted either into RNA Types that can be circularized (to deplete another, selected RNA Type) or into RNA Types that can be ligated to the adapter (or adapters) and incorporated into the sequencing libraries.
[0032] In some embodiments, the combinations of the ligatable ends are selected from a 5’-Phosphate (5’-P) end with a 3’-hydroxyl (3’-OH) end; or a 5’-hydroxyl (5’-OH) end with a 3’-Phosphate (3’-P) or 2’,3’-cyclic phosphate (2’,3’>P) end. In some embodiments, the selected Types of RNA molecules (RNA Types) can be detected individually or as possible combinations of the individual Types of RNA molecules. In some embodiments, the circularizing prevents ligation of circularized RNA molecules with sequencing adapter(s) and prevents incorporation of the circularized RNA molecules into the sequencing library. In some embodiments, the core library preparation protocol specifically detects only Type 1 or a combination of Type 1 and Type 2 of RNA molecules.
[0033] In some embodiments, the methods comprise separating a sample or composition comprising the plurality of RNA molecules into at least a first partition and a second partition. The methods may further comprise performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are different. The first and second pretreatment may be individually selected from circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; converting a plurality of RNA molecules comprising 3’-P end or 2’,3’>P end, to 3’-OH ends; then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; then converting a plurality of RNA molecules comprising 3’-P end or a 2’,3’>P end to 3’-OH ends; circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P end, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends; degrading a plurality of RNA molecules comprising 5’-P ends, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P end, to 3’-OH ends; or no pretreatment is performed.
The first and second pretreatment may be independently selected from: (a) circularizing Type 1 RNA molecules; (b) converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules; (c) converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules, then circularizing Type 1 RNA molecules; (d) converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules, then converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules; (e) circularizing Type 3 RNA molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules; (f) circularizing Type 3 RNA molecules, then converting Type 4 RNA molecules to Type 1 RNA molecules; (g) degrading Type 1 and Type 4 RNA molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules, then converting Type 3 RNA molecules to Type 2 RNA molecules; or (h) no pretreatment. The first and second pretreatment may be individually selected from: (a) contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; (b) contacting the plurality of RNA molecules with PNK in the absence of ATP; (c) contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; (d) contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP; (e) contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP; (f) contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP; (g) contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent Exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP, then contacting the plurality of RNA molecules with PNK in the absence of ATP; or (h) no pretreatment.
[0034] In some embodiments, the method comprises at least one of: (a) circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; (b) converting a plurality of RNA molecules comprising 3’-P end or 2’,3’>P end, to 3’-OH ends; (c) converting a plurality of RNA molecules comprising 3’-P or 2’,3’>P ends to 3’-OH ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; (d) converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and converting a plurality of RNA molecules comprising 5’-OH and 3’-P or 2’,3’>P ends to 5’-P and 3’-P or 2’,3’>P ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends, and then converting 5’-P and 3’-P or 2’,3’>P ends to 5’-P ends and 3’-OH ends; (e) circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P end, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; (f) circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends; (g) degrading a plurality of RNA molecules comprising 5’-P ends, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P end, to 3’-OH ends; or (h) no pretreatment is performed.
[0035] In some embodiments, the method comprises at least one of: (a) circularizing Type 1 RNA molecules; (b) converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules; (c) converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules, then circularizing Type 1 RNA molecules; (d) converting Type 2 RNA molecules to Type 1 RNA molecules and converting Type 3 RNA molecules to Type 4 RNA molecules, then circularizing Type 1 molecules, and then converting Type 4 RNA molecules to Type 1 RNA molecules; (e) circularizing Type 3 molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules; (f) circularizing Type 3 molecules, then converting Type 4 molecules to Type 1 molecules; (g) degrading Type 1 and Type 4 molecules, then converting Type 2 molecules to Type 1 molecules and circularizing Type 1 molecules; then converting Type 3 molecules to Type 2 molecules; or (h) no pretreatment.
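The Type-level operations listed above can be traced with a small state model. The following Python sketch is a toy illustration only, under loudly stated assumptions that are not results from this disclosure: every reaction is assumed to go to completion with no side reactions, and the core library preparation is assumed to capture molecules that are currently Type 1 or Type 2 (one of the embodiments described herein). Only options (a)-(d), (f) and (h) are modeled, and all names (`convert`, `remove`, `detectable`, `PRETREATMENTS`) are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Optional

# Maps each ORIGINAL Type (1-4) to its CURRENT Type, or None once the
# molecule has been circularized and can no longer be ligated to adapters.
State = Dict[int, Optional[int]]
Step = Callable[[State], State]


def convert(mapping: Dict[int, int]) -> Step:
    """Build a step that converts current Types according to `mapping`."""
    def step(state: State) -> State:
        return {
            orig: (mapping.get(cur, cur) if cur is not None else None)
            for orig, cur in state.items()
        }
    return step


def remove(types: FrozenSet[int]) -> Step:
    """Build a step that removes (circularizes) the given current Types."""
    def step(state: State) -> State:
        return {
            orig: (None if cur in types else cur)
            for orig, cur in state.items()
        }
    return step


# Idealized operations named in the text (assumed to run to completion).
DEPHOSPHORYLATE_3P = convert({3: 2, 4: 1})   # 3'-P or 2',3'>P -> 3'-OH
PHOSPHORYLATE_5P = convert({2: 1, 3: 4})     # 5'-OH -> 5'-P
CIRCULARIZE_TYPE1 = remove(frozenset({1}))
CIRCULARIZE_TYPE3 = remove(frozenset({3}))

# Subset of the listed options, written as ordered step lists.
PRETREATMENTS: Dict[str, List[Step]] = {
    "a": [CIRCULARIZE_TYPE1],
    "b": [DEPHOSPHORYLATE_3P],
    "c": [DEPHOSPHORYLATE_3P, CIRCULARIZE_TYPE1],
    "d": [PHOSPHORYLATE_5P, CIRCULARIZE_TYPE1, DEPHOSPHORYLATE_3P],
    "f": [CIRCULARIZE_TYPE3, DEPHOSPHORYLATE_3P],
    "h": [],  # no pretreatment
}


def detectable(steps: List[Step],
               captured: FrozenSet[int] = frozenset({1, 2})) -> List[int]:
    """Return the original Types still captured after the given steps."""
    state: State = {t: t for t in (1, 2, 3, 4)}
    for step in steps:
        state = step(state)
    return sorted(orig for orig, cur in state.items() if cur in captured)


if __name__ == "__main__":
    for label, steps in PRETREATMENTS.items():
        print(f"option ({label}): original Types detected = {detectable(steps)}")
```

In this idealized model, option (a) leaves only original Type 2 detectable, which is consistent with the description of FIG. 5B, where ligase pretreatment yields a Type 2 profile. Real reactions are incomplete and may have side reactions, so the model indicates intent rather than measured outcomes.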
[0036] In some embodiments, the method comprises at least one of: (a) contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; (b) contacting the plurality of RNA molecules with PNK in the absence of ATP; (c) contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 (Rnl1) and T4 RNA ligase 2 (Rnl2) in the presence of ATP; (d) contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP; (e) contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP; (f) contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP; (g) contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent Exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP; or (h) no pretreatment(s).
[0037] In some embodiments, a pretreatment step may comprise circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends. In some embodiments, the pretreatment step comprises circularizing Type 1 RNA molecules. The pretreatment step may comprise contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
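Because circularization removes Type 1 molecules from the library, a Type 1 profile can be estimated by the Detection-by-Exclusion subtraction described for FIG. 5B (profile without pretreatment minus profile after circularization). The Python sketch below illustrates that arithmetic only. It assumes, as a labeled assumption, that both profiles have already been normalized to a common scale (for example against spike-in controls); the function name, the clamping of negative differences to zero, and the example numbers are hypothetical and are not data from this disclosure.

```python
from typing import Dict, Mapping


def detection_by_exclusion(
    profile_without_pretreatment: Mapping[int, float],
    profile_after_circularization: Mapping[int, float],
) -> Dict[int, float]:
    """Estimate the profile of the excluded (circularized) RNA Type.

    Keys are RF lengths in nt; values are normalized read abundances.
    Both inputs are ASSUMED to be on a common normalized scale.
    Negative differences (noise) are clamped to zero, which is a design
    choice of this sketch rather than part of the described method.
    """
    lengths = sorted(
        set(profile_without_pretreatment) | set(profile_after_circularization)
    )
    return {
        length: max(
            profile_without_pretreatment.get(length, 0.0)
            - profile_after_circularization.get(length, 0.0),
            0.0,
        )
        for length in lengths
    }


if __name__ == "__main__":
    # Made-up illustrative values only.
    types_1_plus_2 = {21: 100.0, 22: 80.0}
    type_2_only = {21: 20.0, 22: 90.0, 30: 5.0}
    print(detection_by_exclusion(types_1_plus_2, type_2_only))
```

Without a shared normalization, depleting one Type inflates the relative share of the remaining reads, so subtracting raw read percentages would not be meaningful in this sketch.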
[0038] In some embodiments, a pretreatment step may comprise converting a plurality of RNA molecules comprising 3’-P end or 2’,3’>P end, to 3’-OH ends by PNK in the absence of ATP. In some embodiments, the pretreatment step may comprise converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules. The pretreatment step may comprise contacting the plurality of RNA molecules with PNK in the absence of ATP.
[0039] In some embodiments, a pretreatment step may comprise converting a plurality of RNA molecules comprising 3’-P or 2’,3’>P ends to 3’-OH ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends. In some embodiments, the pretreatment step may comprise converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules, then circularizing Type 1 RNA molecules. The pretreatment step may comprise contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
[0040] In some embodiments, a pretreatment step may comprise converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; then converting a plurality of RNA molecules comprising 3’-P end or a 2’,3’>P end to 3’-OH ends. In some embodiments, the pretreatment step may comprise converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules, then converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules. The pretreatment step may comprise contacting the plurality of RNA molecules with a polynucleotide kinase that catalyzes the removal of 3’-phosphoryl groups from 3’-phosphoryl polynucleotides (e.g., a T4 PNK (3’ phosphatase minus) (PNK, 3’ minus)), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP.
[0041] In some embodiments, a pretreatment step may comprise circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P end, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends. In some embodiments, the pretreatment step may comprise circularizing Type 3 molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules. The pretreatment step may comprise contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP.
[0042] In some embodiments, a pretreatment step may comprise circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends. In some embodiments, the pretreatment step may comprise circularizing Type 3 molecules, then converting Type 4 molecules to Type 1 molecules. The pretreatment step may comprise contacting the plurality of RNA molecules with a ligase that joins single-stranded RNA with a 3’-phosphate or 2’,3’-cyclic phosphate to another RNA with a 5’-hydroxyl (e.g., RtcB ligase), then contacting the plurality of RNA molecules with PNK in the absence of ATP.
[0043] In some embodiments, a pretreatment step may comprise degrading a plurality of RNA molecules comprising 5’-P ends, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P end, to 3’-OH ends. In some embodiments, the pretreatment step may comprise degrading Type 1 and Type 4 RNA molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules; then converting Type 3 RNA molecules to Type 2 RNA molecules. The pretreatment step may comprise contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent Exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP.
[0044] In some embodiments, a pretreatment step may comprise converting a plurality of RNA molecules comprising 3’-P or 2’,3’>P ends to 3’-OH ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends. In some embodiments, the pretreatment step comprises converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules, then circularizing Type 1 RNA molecules. In some embodiments, the pretreatment step comprises contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 (Rnl1) and T4 RNA ligase 2 (Rnl2) in the presence of ATP.
[0045] In some embodiments, a pretreatment step may comprise converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and converting a plurality of RNA molecules comprising 5’-OH and 3’-P or 2’,3’>P ends to 5’-P and 3’-P or 2’,3’>P ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends, and then converting 5’-P and 3’-P or 2’,3’>P ends to 5’-P ends and 3’-OH ends. In some embodiments, the pretreatment step comprises converting Type 2 RNA molecules to Type 1 RNA molecules and converting Type 3 RNA molecules to Type 4 RNA molecules, then circularizing Type 1 molecules, and then converting Type 4 RNA molecules to Type 1 RNA molecules. In some embodiments, the pretreatment step comprises contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP.
[0046] In some embodiments, a pretreatment step may comprise circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P end, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends. In some embodiments, the pretreatment step comprises circularizing Type 3 molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules. In some embodiments, the pretreatment step comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP.
[0047] In some embodiments, a pretreatment step may comprise degrading a plurality of RNA molecules comprising 5’-P ends, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; then converting a plurality of RNA molecules comprising 3’-P or 2’,3’>P ends to 3’-OH ends. In some embodiments, the pretreatment step comprises degrading Type 1 and Type 4 RNA molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules; then converting Type 3 RNA molecules to Type 2 RNA molecules. In some embodiments, the pretreatment step comprises contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent Exonuclease; then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP.
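The end-type bookkeeping of paragraphs [0044] to [0046] can be traced with a small simulation. The following is a minimal sketch for illustration only and is not part of the described methods. It assumes the Type definitions used in this description (Type 1: 5’-P and 3’-OH; Type 2: 5’-OH and 3’-OH; Type 3: 5’-OH and 3’-P or 2’,3’>P; Type 4: 5’-P and 3’-P or 2’,3’>P), treats every enzymatic step as going to completion, and uses hypothetical function and variable names.

```python
# Each RNA is tracked as (five_prime_end, three_prime_end).
# "P" at the 3' end stands for either 3'-P or 2',3'>P.
TYPES = {1: ("P", "OH"), 2: ("OH", "OH"), 3: ("OH", "P"), 4: ("P", "P")}
TYPE_OF = {ends: rna_type for rna_type, ends in TYPES.items()}


def pnk_no_atp(ends):
    """PNK without ATP: 3' phosphatase only (3'-P or 2',3'>P -> 3'-OH)."""
    five, _ = ends
    return (five, "OH")


def pnk_3p_minus_with_atp(ends):
    """PNK (3' phosphatase minus) with ATP: kinase only (5'-OH -> 5'-P)."""
    _, three = ends
    return ("P", three)


def t4_ligases_with_atp(ends):
    """T4 RNA ligase 1 and 2 with ATP: circularize 5'-P / 3'-OH molecules."""
    return "circular" if ends == ("P", "OH") else ends


def rtcb(ends):
    """RtcB ligase: circularize 5'-OH / 3'-P (or 2',3'>P) molecules."""
    return "circular" if ends == ("OH", "P") else ends


# Step order taken from the enzyme sequences listed in each paragraph.
PROTOCOLS = {
    "0044": [pnk_no_atp, t4_ligases_with_atp],
    "0045": [pnk_3p_minus_with_atp, t4_ligases_with_atp, pnk_no_atp],
    "0046": [rtcb, pnk_3p_minus_with_atp, t4_ligases_with_atp, pnk_no_atp],
}


def run(steps, start_type):
    """Apply the steps in order; a circular molecule has no free ends left."""
    state = TYPES[start_type]
    for step in steps:
        if state == "circular":
            break
        state = step(state)
    return state if state == "circular" else f"Type {TYPE_OF[state]}"


def outcome(paragraph):
    """Map each starting RNA Type to its state after the whole pretreatment."""
    return {t: run(PROTOCOLS[paragraph], t) for t in TYPES}


if __name__ == "__main__":
    for paragraph in PROTOCOLS:
        print(paragraph, outcome(paragraph))
```

Under these idealized assumptions, the molecules that remain linear after each pretreatment are those that started as Types 2 and 3 for [0044], Types 3 and 4 for [0045], and Type 4 only for [0046]; everything else is circularized and therefore has no ligatable ends.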
[0048] In some embodiments, the pretreatments comprise contacting the RNA molecules with PNK in the absence of ATP. In some embodiments, the pretreatment with PNK is performed in buffer solutions comprising a buffer selected from: ACES [N-(2-acetamido)-2-aminoethanesulfonic acid]; Acetate buffer; ADA [N-(2-acetamido)iminodiacetic acid]; BES [N,N-Bis(2-hydroxyethyl)-2-aminoethanesulfonic acid]; TRIS [tris(hydroxymethyl)aminomethane]; BIS-TRIS [Bis-(2-hydroxy-ethyl)-amino-tris(hydroxymethyl)-methane]; BIS-TRIS-Propane [2,2'-(Propane-1,3-diyldiimino)bis[2-(hydroxymethyl)propane-1,3-diol]]; Cacodylic acid [Dimethylarsinic acid]; Carbonate buffer; Citric acid [2-hydroxypropane-1,2,3-tricarboxylic acid]; Imidazole buffer; MES [2-(N-morpholino)ethanesulfonic acid]; MOPS [3-morpholinopropane-1-sulfonic acid]; MOPSO [2-Hydroxy-3-morpholinopropanesulfonic acid]; PIPES [piperazine-N,N'-bis(2-ethanesulfonic acid)]; or combinations thereof.
[0049] In certain aspects, described herein is a method of preparing sequencing libraries comprising RNA molecules with all possible combinations of a 5’-P end, a 5’-OH end, a 3’-OH end, a 3’-P end, or a 2’,3’>P end, the method comprising converting the 3’-P ends or the 2’,3’>P ends to 3’-OH ends and converting the 5’-OH ends to 5’-P ends so that all RNA molecules become RNA Type 1. The standard practice in the field is simultaneous conversion of all these ends using a single treatment of RNA molecules by PNK in Tris-HCl buffer (pH 7.6) in the presence of 1 mM ATP (Solaguren-Beascoa et al. 2023. Int. J. Mol. Sci. 24: 11653). PNK contains both kinase and phosphatase domains, where the kinase activity provides 5’-end phosphorylation (conversion of 5’-OH to 5’-P) and the phosphatase activity can dephosphorylate 3’-P or 2’,3’>P ends (converting them to 3’-OH). However, the optimal reaction conditions for the PNK phosphatase and kinase activities are different (Zhuang et al. J. Nucleic Acids. 2012: 360358).
[0050] The methods described herein accomplish this task in multiple steps rather than simultaneously. In the first step, the 3’-P ends or the 2’,3’>P ends are converted to 3’-OH ends by contacting the plurality of RNA molecules with a PNK in the absence of ATP at pH 6±0.5. In the second step, the 5’-OH ends are converted to 5’-P ends by contacting the plurality of RNA molecules with a PNK in the presence of ATP at pH 8±0.5. In some embodiments, these two steps can be done as sequential pretreatments before the sequencing library preparation to improve the rate (and yield) of the conversion. In other embodiments, the first step is done as a pretreatment (or is one among other types of pretreatments) before the sequencing library preparation, and the second step is included in the library preparation protocol. In some embodiments, the 3’-end dephosphorylation is performed in MES buffer solution at pH 6.0. In some embodiments, the PNK is heat-inactivated at 65°C-85°C in the presence of citric acid at pH 6.0, wherein chelation of Mg2+ cations by citrate anions, together with pH 6, prevents RNA degradation at the high temperatures.
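As an illustration only, the two-step scheme of paragraph [0050] can be written down as a small data structure. The sketch below assumes that the stated pH values denote 6 ± 0.5 and 8 ± 0.5; the class, field, and constant names are hypothetical and are not part of the described methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PnkStep:
    """One PNK treatment: what it converts, whether ATP is present, and its pH window."""
    conversion: str
    atp: bool
    target_ph: float
    tolerance: float = 0.5  # assumed reading of the stated pH ranges

    def accepts(self, measured_ph: float) -> bool:
        """Return True if the measured pH lies within target_ph +/- tolerance."""
        return abs(measured_ph - self.target_ph) <= self.tolerance


# Step 1: phosphatase activity only (no ATP), acidic buffer such as MES pH 6.0.
DEPHOSPHORYLATION = PnkStep("3'-P or 2',3'>P -> 3'-OH", atp=False, target_ph=6.0)
# Step 2: kinase activity (ATP present), mildly basic buffer.
PHOSPHORYLATION = PnkStep("5'-OH -> 5'-P", atp=True, target_ph=8.0)

TWO_STEP_CONVERSION = (DEPHOSPHORYLATION, PHOSPHORYLATION)

if __name__ == "__main__":
    for step in TWO_STEP_CONVERSION:
        print(step.conversion, "ATP" if step.atp else "no ATP", step.target_ph)
```

Under these assumptions, the single-step condition cited in paragraph [0049] (Tris-HCl, pH 7.6) falls inside the kinase window but outside the phosphatase window, which is in line with the stated motivation for separating the two conversions.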
[0051] The methods may further comprise protocols for one or more sequential pretreatments of the plurality of RNA molecules before preparing a sequencing library. Depending on the number of the different pretreatment protocols, the method may further comprise separating the composition comprising the plurality of RNA molecules into a corresponding number of partitions, where different pretreatment protocols will be applied.
[0052] The methods may further comprise inactivation or removal of enzymes and solutes present in the pretreatment reaction solutions that are incompatible with the downstream pretreatment steps and / or library preparation protocols. In some embodiments, the enzymes and solutes are removed using spin-columns or beads-based protocols. In some embodiments, the enzymes and solutes are removed using RNA Clean and Concentrator kit columns (Zymo Research). In other embodiments, the method comprises heat inactivation of the enzymes. In some embodiments, the heat inactivation is performed in reaction buffers in the presence of a chelating agent which binds and inactivates divalent metal ions capable of degrading RNA molecules at elevated temperatures. In some embodiments, the chelating agent is selected from a salt of: EDTA, citric acid, and ADA [N-(2-acetamido)iminodiacetic acid], or a combination thereof.
Combo adaptor ligation
[0053] For use with the methods described herein are combo adapters (CAD) comprising: a) nucleic acid residues, and, optionally, at least one modified residue; b) a 5’-proximal segment and a 3’-proximal segment, wherein each proximal segment comprises at least one sequencing adapter, primer binding site, sequencing bar-code, detection sequence, or a combination thereof; c) a 5’ end and a 3’ end that allow: i) intermolecular ligation of said combo adapter to a sample polynucleotide to produce an adapter-polynucleotide ligation product (also referred to as a polynucleotide-CAD ligation product); and ii) circularization of the adapter-polynucleotide ligation product to produce a circularized adapter-polynucleotide ligation product; and d) a template-deficient segment for primer extension by a polymerase, wherein the template-deficient segment or at least one modified residue (or moiety) restricts rolling-circle amplification. The modified residue may be between the 5’-proximal segment and the 3’-proximal segment of the combo adapter. The template-deficient segment may be between the 5’-proximal segment and the 3’-proximal segment. The modified residue(s) or moieties may be located in the template. The sequencing adapters may comprise a sequence required by sequencing methods selected from: standard Sanger sequencing; next-generation sequencing; and single-molecule sequencing. The combo adapter may comprise at least one sequence selected from: a sequencing adapter, a primer binding site, a detection sequence, a probe hybridization sequence, a capture oligonucleotide binding site, a polymerase binding site, an endonuclease restriction site, a sequencing bar-code, an indexing sequence, a Zip-code, one or more random nucleotides, a unique molecular identifier (UMI), sequencing flow-cell binding sites and combinations thereof. The 5’-proximal segment or the 3’-proximal segment of said combo adapter may comprise at least one sequencing adapter.
The 5’-proximal segment and the 3’-proximal segment of said combo adapter may each comprise at least one sequencing adapter. The sequencing adapters may enable sequencing of the adapter-polynucleotide ligation product or complement thereof. The combo adapter may contain at least one ribonucleotide (RNA), deoxyribonucleotide (DNA), or modified nucleic acid residue. Non-limiting examples of modified residues include a deoxyuridine (dU), an inosine (I), a deoxyinosine (dI), an Unlocked Nucleic Acid (UNA), a Locked Nucleic Acid (LNA) comprising a sugar modification, a Peptide Nucleic Acid (PNA), an abasic site, and a nucleic acid residue with a modification selected from: a 5-nitroindole base modification, a 2’-phosphate (2’-p), a 2’-NH2, a 2’-NHR, a 2'-OMe, a 2’-O-alkyl, a 2'-F, a 2’-halo, a phosphorothioate (PS), and a disulfide (S-S) internucleotide bond modification.
[0054] In some embodiments, the 5’- and / or 3’-end groups of the combo adapter may contain a reversible blocking group that requires chemical, photochemical or enzymatic conversion to unblock, repair or activate the end group, converting said blocking group to an active group prior to the circularization step. Non-limiting examples of reversible blocking groups are 3’-phosphate (3’-p), 2’-phosphate (2’-p), 2’,3’-cyclic phosphate (2’,3’>p), 3’-O-(α-methoxyethyl)ether, 3’-O-isovaleryl ester, 5’-ppp, 5’-p and 5’-OH. Non-limiting examples of active groups are 2’-OH / 3’-OH. A chemical group may be an active group or a reversible blocking group depending on the ligase used. For example, 3’-OH may be an active group for 3’-OH ligase and a blocking group for 5’-OH ligase; 3’-p may be an active group for 5’-OH ligase and a blocking group for 3’-OH ligase; 5’-OH may be an active group for 5’-OH ligase and a blocking group for 3’-OH ligase; and 5’-p or 5’-App may be an active group for 3’-OH ligase and a blocking group for 5’-OH ligase.
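The ligase-dependent roles listed in the examples of paragraph [0054] amount to a lookup table. The sketch below is illustrative only: it encodes just the four example end groups named in that paragraph, and the dictionary layout and function names are hypothetical.

```python
# End groups that are "active" for each ligase class, as given in the examples
# of paragraph [0054]. Any listed group that is not active for a ligase is
# treated as a blocking group for that ligase.
ACTIVE_GROUPS = {
    "3'-OH ligase": {"3'-OH", "5'-p", "5'-App"},
    "5'-OH ligase": {"3'-p", "5'-OH"},
}
KNOWN_GROUPS = set().union(*ACTIVE_GROUPS.values())


def role(end_group: str, ligase: str) -> str:
    """Return 'active' or 'blocking' for an end group with the given ligase."""
    if end_group not in KNOWN_GROUPS:
        raise ValueError(f"end group not covered by this sketch: {end_group}")
    return "active" if end_group in ACTIVE_GROUPS[ligase] else "blocking"


def ligatable(five_end: str, three_end: str, ligase: str) -> bool:
    """A 5' end and a 3' end can be joined only if both are active for the ligase."""
    return role(five_end, ligase) == "active" and role(three_end, ligase) == "active"


if __name__ == "__main__":
    print(role("3'-p", "3'-OH ligase"))
    print(ligatable("5'-OH", "3'-p", "5'-OH ligase"))
```

In this representation, unblocking an adapter end before circularization corresponds to replacing a group that is blocking for the chosen ligase with one that is active for it.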
[0055] The methods described herein comprise ligating an RNA to a combo adapter (CAD). Ligating the RNA to the CAD to produce the polynucleotide-CAD ligation product (which is also referred to as an adapter-polynucleotide ligation product) may occur before circularizing.
[0056] Ligating the sample polynucleotide to the CAD may be performed with at least one ligase. The ligase may be selected from: T4 RNA ligase, T4 RNA ligase 1 (Rnl1), T4 RNA ligase 2 (Rnl2), T4 RNA Ligase 2 truncated, T4 RNA Ligase 2 truncated K227Q, T4 RNA Ligase 2 truncated KQ, Thermostable 5' AppDNA / RNA Ligase, Mth RNA Ligase, CircLigase™ ssDNA ligase, CircLigase™ II ssDNA ligase, CircLigase™ RNA Ligase, Thermostable RNA ligase, ThermoPhage DNA ligase, T3 DNA ligase, T4 DNA ligase, and SplintR® Ligase.
[0057] In some embodiments, the CAD is ligated to the 5’ end of the RNA. In some embodiments, the CAD is ligated to the 3’ end of the RNA.
[0058] In some embodiments, the ligating comprises splint-independent ligation of the CAD to the sample polynucleotide. In some embodiments, the ligating comprises splint-assisted ligation of the CAD to the RNA.
[0059] In some embodiments, the CAD and / or the sample polynucleotide is contacted with an enzyme that modifies the CAD and / or the sample polynucleotide after circularizing. The enzyme, by way of non-limiting example, may be a nucleic acid cleaving enzyme.
Preparing a sequencing library
[0060] Described herein are methods of preparing sequencing libraries from samples comprising a plurality of RNA molecules, the plurality of RNA molecules comprising a plurality of RNA Types. RNA sequencing library preparation protocols requiring ligation of both ends of RNA molecules to an adapter or adapters can be used with the upfront pretreatments described herein. In contrast, library preparation methods which are insensitive to the phosphorylation status of one or both ends of RNA cannot discriminate between all RNA Types. Preparing a sequencing library comprises ligating a plurality of adapters as described herein to the plurality of RNA molecules to produce a plurality of adapter-RNA molecules. In some embodiments, the RNA molecules are ligated to two adapters, where the first adapter is ligated to the first RNA end and the second adapter is ligated to the second RNA end to produce 5’-adapter-RNA-adapter-3’ ligation products. In other embodiments, the RNA molecules are ligated to a single adapter, where the adapter is ligated to either the 5’ or the 3’ RNA end to produce 5’-adapter-RNA-3’ or 3’-adapter-RNA-5’ ligation products. In preferred embodiments, preparing a sequencing library further comprises circularizing the ligation products of RNA with the single adapter by ligating their 5' ends to their 3' ends to produce a plurality of circularized adapter-RNA ligation products. In some embodiments, the ligating and / or circularizing comprise contacting the adapter and / or the plurality of adapter-RNA ligation products with a ligase.
[0061] The circularization of RNA molecules in the pretreatment steps, as well as the ligation of adapters to RNA molecules and the circularization of adapter-RNA ligation products during the sequencing library preparation, each comprise a ligation. The ligation may be splint independent. The ligation may be splint dependent. In some embodiments, the ligation is performed with at least one 3’-ligase (ligating 5’-P and 3’-OH ends). In certain embodiments, the 3’-ligase is selected from: T4 RNA ligase, T4 RNA ligase 1 (Rnl1), T4 RNA ligase 2 (Rnl2), Mth RNA Ligase, CircLigase™ ssDNA ligase, CircLigase™ II ssDNA ligase, CircLigase™ RNA Ligase, Thermostable 5' AppDNA / RNA ligase, or a combination thereof. In some embodiments, the ligation is performed with at least one 5’-ligase (ligating 5’-OH and 3’-P or 2’,3’>P ends) selected from: RNA-splicing ligase (RtcB), A. thaliana tRNA ligase (AtRNL), tRNA ligase enzyme (Trl1), tRNA ligase (Rigl+), or a combination thereof.
[0062] The methods for preparing a sequencing library described herein may comprise converting the 5'-end and / or 3'-end groups of the adapter-RNA ligation product before the circularizing. In some embodiments, converting comprises an enzymatic conversion selected from: a) 5'-OH to 5’-P; b) 3’-P to 3'-OH; c) 2'-P to 2'-OH; and d) 2',3'>p to 2'-OH and 3'-OH, or combinations thereof. In some embodiments, the enzymatic conversion comprises a treatment by PNK or PNK (3’ phosphatase minus) in the presence of ATP. In some embodiments, the enzymatic conversion comprises a treatment by PNK in the absence of ATP. In some embodiments, the treatment by PNK in the absence of ATP is performed at pH 6±0.5.
[0063] In some embodiments, preparing a sequencing library further comprises rolling circle amplification (RCA) of the circularized products. In some other embodiments, preparing a sequencing library further comprises reverse transcription (RT) of circularized products followed by PCR amplification of cDNA products of the RT. In certain embodiments, preparing a sequencing library further comprises direct PCR amplification of the circularized products. In some embodiments, the amplicons produced by RCA, PCR or RT-PCR comprise the sequencing library. In some other embodiments, the RT step is performed using wild type or modified versions of reverse transcriptases. In certain embodiments, the reverse transcriptase (an RNA-dependent DNA polymerase) is selected from: SuperScript® III and IV, ThermoScript™, Maxima™ and RevertAid™ (Thermo Fisher); AMV, M-MuLV, ProtoScript® II and Induro RT (NEB); MarathonRT (Kerafast); StellarScript (Watchmaker); TGIX-RTase (SBS Genentech). In some other embodiments, the RT step is performed using a DNA polymerase also having reverse transcriptase activity (a DNA-dependent and RNA-dependent DNA polymerase) selected from: DNA polymerase I Large (Klenow) Fragment, Taq, Bst 2.0 / II and 3.0 / 111 (NEB); and Tth DNA Polymerase (Roche).
[0064] The methods may further comprise sequencing the plurality of sequencing libraries using any current next-generation sequencing (NGS) or single-molecule sequencing platform. In some embodiments, the NGS platform is selected from: Illumina, Singular Genomics, Element Biosciences, Ultima Genomics or Complete Genomics / BGI platforms. In some embodiments, the single-molecule sequencing platform is selected from Oxford Nanopore or Pacific Biosciences platforms. In some embodiments, the sequencing comprises sequencing of the entire RNA pool or RNA fragmentome (total RNA or whole transcriptome sequencing) that allows detection of known and previously unknown RNA sequences. In some other embodiments, the sequencing comprises sequencing of selected fraction(s) of the RNA pool and / or selected fraction(s) of sequencing libraries comprising RNA molecules or their sequences of selected sizes or lengths. In some other embodiments, the sequencing comprises sequencing of a pool of RNA molecules comprising known or selected sequences of interest (targeted sequencing). In some other embodiments, the targeted sequencing comprises enrichment or selection of the targeted sequences by hybridization with target-specific oligonucleotides.
[0065] The methods may further comprise bioinformatic analysis of the sequencing data and profiles, including (but not limited to) statistical analysis of absolute and / or relative abundances of sequences selected from: all RNA Types, all RNA classes, selected RNA classes, selected RNA Types, selected lengths or ranges of the lengths, or a combination thereof. In some embodiments, profiling of the plurality of sequencing libraries simultaneously by deep sequencing allows identification of sequences derived from Type 1, Type 2, Type 3, or Type 4 RNA classes as biomarker candidates. In other embodiments, the method specifically detects RNA molecules of individual Types selected from Type 1, Type 2, Type 3, and Type 4 or combinations thereof. In some embodiments, finding the specific RNA Type(s) selected for specific RNA classes that eliminate background sequencing reads (which otherwise would be provided by irrelevant RNAs) can significantly increase the detection sensitivity for low-abundance RNA biomarkers for multiple pathologies including cancer. In some embodiments, a fragmentation sequencing profile for each RNA Type of each RNA class is used to identify the most sensitive and specific biomarkers.
[0066] In certain embodiments, sequencing profiles are compared between different RNA samples to determine differential expression (DE) between these samples. In some embodiments, a length of analyzed RNA sequencing reads is selected from within a range of: 15-200 nucleotides; 20-200 nucleotides; 30-200 nucleotides; 15-150 nucleotides; 20-150 nucleotides; 30-150 nucleotides; 15-100 nucleotides; 20-100 nucleotides; 30-100 nucleotides; 15-75 nucleotides; 20-75 nucleotides; 30-75 nucleotides; 15-50 nucleotides; 20-50 nucleotides; and 30-50 nucleotides.
[0067] In certain embodiments, the DE analysis is performed to compare RNA molecule expression levels in healthy controls and samples taken from individuals with certain pathologies or diseases. In certain embodiments, the disease is cancer. In certain embodiments, the disease is an infectious disease.
[0068] In certain embodiments, the DE analysis is performed to compare RNA molecule expression levels in samples comprising a combination of host and pathogen RNAs, wherein RNA sequences from both species will be analyzed.
[0069] In certain embodiments, sequencing is used to compare different RNA Types present in a sample to determine differences in RNA Types within the sample. Ratios or percentages of different RNA Types may reveal an RNA Type signature of a sample. Such signatures could be useful for identifying samples or diagnosing a disease in an individual from which the sample is taken.
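As a simple illustration of the RNA Type signature mentioned in paragraph [0069], the percentages can be computed from per-Type read counts. This is a minimal sketch only; the function name and the example counts are hypothetical and do not represent data from the described methods.

```python
def type_signature(counts):
    """Convert per-Type read counts into percentages of all Type-assigned reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any RNA Type")
    return {rna_type: 100.0 * n / total for rna_type, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical read counts for one sample, for illustration only.
    example_counts = {"Type 1": 600, "Type 2": 100, "Type 3": 250, "Type 4": 50}
    for rna_type, percent in type_signature(example_counts).items():
        print(f"{rna_type}: {percent:.1f}%")
```

Signatures computed this way for different samples could then be compared directly, since each is expressed relative to that sample's own total of Type-assigned reads.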
[0070] In certain embodiments, sequencing of RNA samples detects an absolute number of, or sequencing counts associated with, specific RNA Types in a sample. In certain embodiments, the methods detect at least two of Type 1, Type 2, Type 3 and Type 4 RNA molecules. In certain embodiments, the methods detect at least three of Type 1, Type 2, Type 3 and Type 4 RNA molecules. In certain embodiments, the methods detect Type 1, Type 2, Type 3 and Type 4 RNA molecules.
[0071] Determining different RNA Types may be combined with determining the sequence of an RNA associated with a particular RNA Type. Such information can shed light on tissue-based expression of certain genes, on whether certain genes are associated with a given pathology, or on whether an individual can be diagnosed with a certain pathology such as cancer, inflammatory or autoimmune conditions, metabolic diseases, or aging. RNAs sequenced may, in addition to being derived from the host, be derived from a non-host source, such as, without limitation, a bacterium, virus, fungus, or parasite. The non-host source may be a pathogenic organism or a beneficial organism. The methods disclosed herein may determine if the non-host source is resistant to antibiotics, antifungals, anti-parasitics, or antivirals. The methods disclosed herein may determine if the non-host source is a chronic infection or an acute infection.
[0072] The results of the RNA sequencing obtained by the methods described herein may indicate a further course of treatment to be administered to an individual. The results of the RNA sequencing may also indicate a suitability for a course of treatment to be administered to an individual.
Adapters
[0073] In some embodiments, the adapter or adapters ligated to the ends of RNA molecules during the sequencing library preparation is / are single-stranded nucleic acids comprising at least one RNA residue, one DNA residue, modified nucleic acid residue, non-nucleotide residue, or a combination thereof. In some embodiments, the adapter or adapters comprise 5'-end and 3'-end groups that allow ligation with the ends of RNA molecules either directly or after conversion of one or both adapter ends to a ligatable 5'-end and / or a ligatable 3'-end. In some embodiments, the adapter or adapters comprise a template-deficient segment (TDS) that restricts primer extension by a polymerase over the TDS. In some embodiments, the adapter comprises at least one universal priming sequence that allows library preparation, amplification, or sequencing. In certain embodiments, the universal priming sequence comprises a sequence selected from: one or two sequencing adapter(s) compatible with current NGS (e.g., Illumina, Singular Genomics, Element Biosciences or Complete Genomics / BGI) and single-molecule sequencing platforms (e.g., Oxford Nanopore or Pacific Biosciences); an RT and / or PCR primer binding site(s); the RNA Type-specific barcode; a sequencing bar-code or index; one or more random nucleotides at any position of the adapter relative to its ends; a random or semirandom unique molecular identifier (UMI); a sequencing flow-cell binding site(s); a promoter and enhancer for a polymerase; or combinations thereof. In some embodiments, the 5’ end or 3’ end of the adapter(s) comprises a blocking group preventing formation of adapter dimers and / or circularization or concatamerization of the adapters. In some embodiments, the length of the RNA Type-specific barcode is 5 to 12 nucleotides. In some embodiments, the length of the sequencing bar-code or index is 6 to 10 nucleotides. In some embodiments, the length of the UMI is 6 to 15 nucleotides.
In some embodiments, a polymerase is a bacteriophage RNA polymerase selected from: T7, T3 or SP6 RNA polymerase.
[0074] In some embodiments, the 5'-end or 3'-end of a single adapter ligated to an RNA molecule comprises a reversible blocking group that can be activated by enzymatic or chemical (including photochemical) conversion to a ligatable end group prior to the circularization of the adapter-RNA ligation product. In some embodiments, the reversible blocking group is a 3'-end-blocking group selected from: 3’-P; 2',3'>P; 3'-O-(α-methoxyethyl)ether, and 3'-O-isovaleryl ester. In some embodiments, the adapter reversible blocking group is a 5'-end-blocking group selected from: 5’-OH and 5’-ppp. In some embodiments, the adapter is a 3’-adapter to be ligated to the 3’ end of the RNA molecules. In some embodiments, the 5’ end group of the 3’-adapter is selected from: 5’-adenylated or 5’-App (5’,5’-adenyl pyrophosphoryl), 5’-P or 5’-OH. In some embodiments, the adapter is a 5’-adapter to be ligated to the 5’ end of the RNA molecules. In some embodiments, the 3’ end group of the 5’-adapter is selected from: 3’-OH, 3’-P or 2’,3’>P.
[0075] In some embodiments, the ends of an adapter or adapters are RNA Type-specific or RNA ends-specific and can be ligated only to selected RNA ends without a need for their conversion by the pretreatments, where different Type-specific adapters are used to detect selected RNA Type(s). In some other embodiments, the single adapter specific to a certain RNA Type (e.g., Type 1) or RNA Types (e.g., Types 1 and 2) is used to also detect other RNA Types individually or in combination, where the other RNA Types are converted (as a result of the pretreatment or pretreatments) to the RNA Type or RNA Types which can be specifically ligated to this adapter. In some embodiments, the Type-specific adapters having the same ends and sequences except for different Type-specific barcodes are ligated to corresponding Type-specifically pretreated RNA molecules in separate singleplex reactions, where the RNA-adapter ligation products are then pooled, and the next steps of the library preparations are run in multiplex.
Control (spike-in) RNAs
[0076] In the methods described herein, a pool of control synthetic RNAs (also referred to interchangeably as spike-in RNAs) may be used when preparing a sequencing library. In some embodiments, the control RNAs are not homologous to the human transcriptome or RNA fragmentome. In other embodiments, the control RNAs are modified versions of RNA molecule(s) of interest present in the human transcriptome or RNA fragmentome. In some embodiments, the methods comprise adding a pool of control RNAs, wherein the pool of control RNAs is combined (or mixed) with, or spiked into, the sample.
[0077] The pool of control RNAs may perform one or more of the following functions, or a combination thereof. The pool of control RNAs may be used to monitor efficiency of reactions resulting in conversion(s) between the RNA ends before or during said library preparation. The pool of control RNAs may be used to monitor efficiency of incorporation of RNAs of the different sizes of the plurality of RNA molecules and end-group Types (RNA Type 1, Type 2, Type 3, Type 4) from the samples into sequencing libraries by different methods of library preparations. The pool of control RNAs may be used to normalize sequencing reads to account for technical variations between different sequencing libraries. In some embodiments, the pool of control RNAs may be used for quantification of naturally occurring RNA molecules in a sample.
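One simple way to use spike-in read counts for normalization between libraries is global scaling. The sketch below is an assumption made for illustration and is not the normalization procedure of the described methods: it scales each library so that its total spike-in read count matches the mean spike-in total across libraries. All names and numbers are hypothetical.

```python
def spike_in_scale_factors(spike_totals):
    """Return one multiplicative factor per library from its total spike-in reads."""
    if not spike_totals or any(total <= 0 for total in spike_totals.values()):
        raise ValueError("every library needs a positive spike-in read total")
    mean_total = sum(spike_totals.values()) / len(spike_totals)
    return {library: mean_total / total for library, total in spike_totals.items()}


def normalize(counts, factor):
    """Scale the raw read counts of one library by its spike-in factor."""
    return {sequence: n * factor for sequence, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical spike-in totals for two libraries.
    factors = spike_in_scale_factors({"lib_A": 1000, "lib_B": 2000})
    print(factors)
    print(normalize({"rna_x": 200}, factors["lib_B"]))
```

The same per-Type spike-in counts could also be compared before and after a pretreatment to estimate how completely a given end conversion proceeded, although that calculation is not shown here.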
[0078] The pool of control RNAs comprises one or more RNA Type-specific groups of synthetic RNAs. The groups may differ from one another in RNA length, while each group comprises RNAs of the same length. In some embodiments, each individual RNA Type-specific group comprises RNAs having a first end comprising either a 5’-P end or a 5’-OH end and a second end comprising either a 3’-P end, a 2’,3’>P end, or a 3’-OH end. Each individual Type-specific RNA may comprise an internal bar-code nucleotide sequence corresponding to and distinguishing between RNA Type 1 (comprising 5’-P and 3’-OH ends), RNA Type 2 (comprising 5’-OH and 3’-OH ends), RNA Type 3 (comprising 5’-OH and 3’-P or 2’,3’>P ends) and RNA Type 4 (comprising 5’-P and 3’-P or 2’,3’>P ends). Each RNA may comprise a randomized nucleotide sequence at the first end. Each RNA may comprise a randomized nucleotide sequence at the second end.
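The layout of a control RNA described in paragraph [0078] (randomized ends flanking an internal Type-specific bar-code) can be sketched as follows. This is an illustration under stated assumptions, not a disclosed design: the bar-code sequences, the constant flanking sequences, the fixed bar-code position, and the function names are all hypothetical.

```python
import random

# Hypothetical 6-nt internal bar-codes, one per RNA Type.
TYPE_BARCODES = {1: "ACGUAC", 2: "CAUGGA", 3: "GUACCU", 4: "UGCAAG"}
BARCODE_TO_TYPE = {barcode: t for t, barcode in TYPE_BARCODES.items()}
BARCODE_LEN = 6
# Hypothetical constant segments on either side of the bar-code.
LEFT_FLANK, RIGHT_FLANK = "GGAGAG", "CUCUCC"


def make_control(rna_type, n_random=4, rng=None):
    """Build one control RNA: random end + flank + bar-code + flank + random end."""
    rng = rng or random.Random()

    def random_end():
        return "".join(rng.choice("ACGU") for _ in range(n_random))

    return random_end() + LEFT_FLANK + TYPE_BARCODES[rna_type] + RIGHT_FLANK + random_end()


def assign_type(sequence, n_random=4):
    """Read the bar-code at its fixed internal position; None if unrecognized."""
    start = n_random + len(LEFT_FLANK)
    return BARCODE_TO_TYPE.get(sequence[start:start + BARCODE_LEN])


if __name__ == "__main__":
    control = make_control(3, rng=random.Random(0))
    print(control, assign_type(control))
```

Reading the bar-code at a fixed offset, rather than searching for it anywhere in the sequence, avoids false matches that could arise where a bar-code overlaps an adjacent segment.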
[0079] The internal RNA Type-specific bar-code nucleotide sequences in adapters and control RNAs may comprise unique nucleotide sequences. The internal bar-code nucleotide sequences may vary in length from 5 nucleotides to 12 nucleotides. The internal bar-code nucleotide sequences may vary in length from 5 nucleotides to 6 nucleotides, 5 nucleotides to 7 nucleotides, 5 nucleotides to 8 nucleotides, 5 nucleotides to 9 nucleotides, 5 nucleotides to 10 nucleotides, 5 nucleotides to 11 nucleotides, 5 nucleotides to 12 nucleotides, 6 nucleotides to 7 nucleotides, 6 nucleotides to 8 nucleotides, 6 nucleotides to 9 nucleotides, 6 nucleotides to 10 nucleotides, 6 nucleotides to 11 nucleotides, 6 nucleotides to 12 nucleotides, 7 nucleotides to 8 nucleotides, 7 nucleotides to 9 nucleotides, 7 nucleotides to 10 nucleotides, 7 nucleotides to 11 nucleotides, 7 nucleotides to 12 nucleotides, 8 nucleotides to 9 nucleotides, 8 nucleotides to 10 nucleotides, 8 nucleotides to 11 nucleotides, 8 nucleotides to 12 nucleotides, 9 nucleotides to 10 nucleotides, 9 nucleotides to 11 nucleotides, 9 nucleotides to 12 nucleotides, 10 nucleotides to 11 nucleotides, 10 nucleotides to 12 nucleotides, or 11 nucleotides to 12 nucleotides. The internal bar-code nucleotide sequences may vary in length from 5 nucleotides, 6 nucleotides, 7 nucleotides, 8 nucleotides, 9 nucleotides, 10 nucleotides, 11 nucleotides, or 12 nucleotides. The internal bar-code nucleotide sequences may vary in length from at least 5 nucleotides, 6 nucleotides, 7 nucleotides, 8 nucleotides, 9 nucleotides, 10 nucleotides, or 11 nucleotides. The internal bar-code nucleotide sequences may vary in length from at most 6 nucleotides, 7 nucleotides, 8 nucleotides, 9 nucleotides, 10 nucleotides, 11 nucleotides, or 12 nucleotides.
[0080] In some embodiments, the length of the randomized nucleotide sequence at the first end and the length of the randomized nucleotide sequence at the second end is in a range from 1 nucleotide to 6 nucleotides. In some embodiments, the length of the randomized nucleotide sequence at the first end and the length of the randomized nucleotide sequence at the second end is in a range from 1 nucleotide to 2 nucleotides, 1 nucleotide to 3 nucleotides, 1 nucleotide to 4 nucleotides, 1 nucleotide to 5 nucleotides, 1 nucleotide to 6 nucleotides, 2 nucleotides to 3 nucleotides, 2 nucleotides to 4 nucleotides, 2 nucleotides to 5 nucleotides, 2 nucleotides to 6 nucleotides, 3 nucleotides to 4 nucleotides, 3 nucleotides to 5 nucleotides, 3 nucleotides to 6 nucleotides, 4 nucleotides to 5 nucleotides, 4 nucleotides to 6 nucleotides, or 5 nucleotides to 6 nucleotides.
[0081] In some embodiments, the length of the randomized nucleotide sequence at the first end and the length of the randomized nucleotide sequence at the second end is 1 nucleotide, 2 nucleotides, 3 nucleotides, 4 nucleotides, 5 nucleotides, or 6 nucleotides. In some embodiments, the length of the randomized nucleotide sequence at the first end and the length of the randomized nucleotide sequence at the second end is at least 1 nucleotide, 2 nucleotides, 3 nucleotides, 4 nucleotides, or 5 nucleotides. In some embodiments, the length of the randomized nucleotide sequence at the first end and the length of the randomized nucleotide sequence at the second end is at most 2 nucleotides, 3 nucleotides, 4 nucleotides, 5 nucleotides, or 6 nucleotides.
[0082] In some embodiments, the pool of control RNAs comprises polynucleotides with a length of 15 nucleotides to 150 nucleotides. In some embodiments, the pool of control RNAs comprises RNAs with a length of 15 nucleotides to 30 nucleotides, 15 nucleotides to 45 nucleotides, 15 nucleotides to 60 nucleotides, 15 nucleotides to 75 nucleotides, 15 nucleotides to 90 nucleotides, 15 nucleotides to 100 nucleotides, 15 nucleotides to 110 nucleotides, 15 nucleotides to 120 nucleotides, 15 nucleotides to 130 nucleotides, 15 nucleotides to 140 nucleotides, 15 nucleotides to 150 nucleotides, 30 nucleotides to 45 nucleotides, 30 nucleotides to 60 nucleotides, 30 nucleotides to 75 nucleotides, 30 nucleotides to 90 nucleotides, 30 nucleotides to 100 nucleotides, 30 nucleotides to 110 nucleotides, 30 nucleotides to 120 nucleotides, 30 nucleotides to 130 nucleotides, 30 nucleotides to 140 nucleotides, 30 nucleotides to 150 nucleotides, 45 nucleotides to 60 nucleotides, 45 nucleotides to 75 nucleotides, 45 nucleotides to 90 nucleotides, 45 nucleotides to 100 nucleotides, 45 nucleotides to 110 nucleotides, 45 nucleotides to 120 nucleotides, 45 nucleotides to 130 nucleotides, 45 nucleotides to 140 nucleotides, 45 nucleotides to 150 nucleotides, 60 nucleotides to 75 nucleotides, 60 nucleotides to 90 nucleotides, 60 nucleotides to 100 nucleotides, 60 nucleotides to 110 nucleotides, 60 nucleotides to 120 nucleotides, 60 nucleotides to 130 nucleotides, 60 nucleotides to 140 nucleotides, 60 nucleotides to 150 nucleotides, 75 nucleotides to 90 nucleotides, 75 nucleotides to 100 nucleotides, 75 nucleotides to 110 nucleotides, 75 nucleotides to 120 nucleotides, 75 nucleotides to 130 nucleotides, 75 nucleotides to 140 nucleotides, 75 nucleotides to 150 nucleotides, 90 nucleotides to 100 nucleotides, 90 nucleotides to 110 nucleotides, 90 nucleotides to 120 nucleotides, 90 nucleotides to 130 nucleotides, 90 nucleotides to 140 nucleotides, 90 nucleotides to 150 nucleotides, 100 nucleotides to 110 nucleotides, 100 nucleotides to 120 nucleotides, 100 nucleotides to 130 nucleotides, 100 nucleotides to 140 nucleotides, 100 nucleotides to 150 nucleotides, 110 nucleotides to 120 nucleotides, 110 nucleotides to 130 nucleotides, 110 nucleotides to 140 nucleotides, 110 nucleotides to 150 nucleotides, 120 nucleotides to 130 nucleotides, 120 nucleotides to 140 nucleotides, 120 nucleotides to 150 nucleotides, 130 nucleotides to 140 nucleotides, 130 nucleotides to 150 nucleotides, or 140 nucleotides to 150 nucleotides. In some embodiments, the pool of control RNAs comprises RNAs with a length of 15 nucleotides, 30 nucleotides, 45 nucleotides, 60 nucleotides, 75 nucleotides, 90 nucleotides, 100 nucleotides, 110 nucleotides, 120 nucleotides, 130 nucleotides, 140 nucleotides, or 150 nucleotides. In some embodiments, the pool of control RNAs comprises RNAs with a length of at least 15 nucleotides, 30 nucleotides, 45 nucleotides, 60 nucleotides, 75 nucleotides, 90 nucleotides, 100 nucleotides, 110 nucleotides, 120 nucleotides, 130 nucleotides, or 140 nucleotides. In some embodiments, the pool of control RNAs comprises RNAs with a length of at most 30 nucleotides, 45 nucleotides, 60 nucleotides, 75 nucleotides, 90 nucleotides, 100 nucleotides, 110 nucleotides, 120 nucleotides, 130 nucleotides, 140 nucleotides, or 150 nucleotides.
RNA molecules
[0083] The methods described herein include methods of preparing a sequencing library or libraries from a sample comprising a plurality of RNA molecules comprising a plurality of RNA ends. In some embodiments, the plurality of RNA molecules comprises a pool of RNA molecules selected from: naturally occurring RNAs, shorter versions or fragments of naturally occurring RNAs, in vitro RNA transcripts, and synthetic RNAs, or combinations thereof.
[0084] In some embodiments, the method identifies two or more naturally occurring RNA classes selected from the list consisting of: a microRNA or portion thereof. In some instances, the nucleic acid sequence comprises an RNA molecule or a fragmented RNA molecule (RNA fragments) selected from: a microRNA (miRNA), an endogenous small interfering RNA (esiRNA), a piwi-interacting RNA (piRNA), a pre-miRNA, a pri-miRNA, an mRNA, a fragment derived from mRNA transcripts or pre-mRNA (scdRNA, sutRNA, sinRNA), a circular RNA (circRNA), a ribosomal RNA (rRNA), a Y RNA, a transfer RNA (tRNA), a pre-tRNA, a long non-coding RNA (lncRNA), a small nuclear RNA (snRNA) and a small nucleolar RNA (snoRNA). In some embodiments, the nucleic acid sequence is not a ribosomal RNA.
[0085] In some embodiments, the method simultaneously identifies RNA molecules with all types of naturally occurring RNA 5’ and 3’ ends. In some other embodiments, the method identifies RNA molecules with specific RNA ends and specific combinations of RNA ends, referred to herein as RNA Types. In certain embodiments, the method identifies at least two of Type 1, Type 2, Type 3 and Type 4 RNA molecules. In certain embodiments, the method identifies at least three of Type 1, Type 2, Type 3 and Type 4 RNA molecules. In certain embodiments, the method identifies Type 1, Type 2, Type 3 and Type 4 RNA molecules.
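As an illustration only, the four RNA Types and the effect of the pretreatments can be sketched in Python. The Type definitions and pretreatments (i) through (viii) follow the numbered embodiments of this document. The function names, the integer Type labels, and the "circular" and "degraded" state strings are assumptions introduced for this sketch, and only the conversions stated in the embodiments are modeled (no other enzyme side activities).

```python
# Type definitions follow the numbered embodiments (5' end, 3' end).
THREE_PRIME_PHOSPHATES = {"3'-P", "2',3'>P"}

def rna_type(five_prime, three_prime):
    """Return 1, 2, 3 or 4 for a 5'/3' end pair, or None if outside Types 1-4."""
    if three_prime == "3'-OH":
        return {"5'-P": 1, "5'-OH": 2}.get(five_prime)
    if three_prime in THREE_PRIME_PHOSPHATES:
        return {"5'-OH": 3, "5'-P": 4}.get(five_prime)
    return None  # e.g. capped 5' ends or 2'-O-methyl 3' ends are not covered here

# Hypothetical encoding of pretreatments (i)-(viii) as ordered steps.
# "convert" retypes molecules; "circularize" and "degrade" remove them
# from the pool of linear molecules.
PRETREATMENTS = {
    "i": [("circularize", {1})],
    "ii": [("convert", {3: 2, 4: 1})],
    "iii": [("convert", {3: 2, 4: 1}), ("circularize", {1})],
    "iv": [("convert", {2: 1}), ("circularize", {1}), ("convert", {3: 2, 4: 1})],
    "v": [("circularize", {3}), ("convert", {2: 1}), ("circularize", {1})],
    "vi": [("circularize", {3}), ("convert", {4: 1})],
    "vii": [("degrade", {1, 4}), ("convert", {2: 1}), ("circularize", {1}),
            ("convert", {3: 2})],
    "viii": [],  # no pretreatment
}

def apply_pretreatment(name):
    """Map each starting Type (1-4) to its final Type or to 'circular'/'degraded'."""
    state = {t: t for t in (1, 2, 3, 4)}
    for action, arg in PRETREATMENTS[name]:
        for start, current in list(state.items()):
            if not isinstance(current, int):
                continue  # already circularized or degraded
            if action == "convert":
                state[start] = arg.get(current, current)
            elif current in arg:
                state[start] = "circular" if action == "circularize" else "degraded"
    return state

if __name__ == "__main__":
    for name in PRETREATMENTS:
        print(name, apply_pretreatment(name))
```

Under this simplified model, `apply_pretreatment("iii")` leaves only the molecules that started as Type 2 or Type 3 in the linear pool, both now as Type 2, while the molecules that started as Type 1 or Type 4 are circularized.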
[0086] In some embodiments, the identified RNA molecules comprise small RNAs or RNA fragments (also referred to herein as RFs) of 200 nt or less in length. In some other embodiments, the identified RNA molecules comprise RFs of 100 nt or less in length. In yet other embodiments, the identified RNA molecules comprise RFs of 50 nt or less in length. In some embodiments, the identified RNA molecules comprise RFs of 15 to 40 nt in length. In some embodiments, the identified RNA molecules comprise RFs of 8 to 30 nt in length.
[0087] In some embodiments, the RNA molecule is no more than about 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 nucleotides in size. In some embodiments, the nucleic acid is no more than about 100 nucleotides in size. In some embodiments, the nucleic acid is no more than about 90 nucleotides in size. In some embodiments, the nucleic acid is no more than about 80 nucleotides in size. In some embodiments, the nucleic acid is no more than about 70 nucleotides in size. In some embodiments, the nucleic acid is no more than about 60 nucleotides in size. In some embodiments, the nucleic acid is no more than about 50 nucleotides in size. In some embodiments, the nucleic acid is no more than about 40 nucleotides in size. In some embodiments, the nucleic acid is no more than about 30 nucleotides in size. In some embodiments, the nucleic acid is no more than about 20 nucleotides in size. In some embodiments, the nucleic acid is no more than about 10 nucleotides in size. In some embodiments, the nucleic acid is between 10 and 20 nucleotides, 10 and 30 nucleotides, 10 and 40 nucleotides, 10 and 50 nucleotides, 10 and 60 nucleotides, 10 and 70 nucleotides, 10 and 80 nucleotides, 10 and 90 nucleotides or 10 and 100 nucleotides in size.
KITS AND COMPOSITIONS
[0088] Also described herein is a kit for preparing a sequencing library from a sample comprising a plurality of RNA molecules. The kit may comprise one or more of an adaptor or adaptors as described herein, a plurality of control RNAs as described herein, and at least one enzyme useful for treating and converting RNA ends. In some embodiments, the pool of spike-in RNAs comprises a first end comprising either a 5’-P end or a 5’-OH end and a second end comprising either a 3’-P end, a 2’,3’>P end, or a 3’-OH end; internal bar-code nucleotide sequences corresponding to and distinguishing between Type 1, Type 2, Type 3 and Type 4; a randomized nucleotide sequence at the first end; and a randomized nucleotide sequence at the second end. In some embodiments, the enzyme comprises a RtcB ligase, a T4 RNA ligase 1, a T4 RNA ligase 2, a Terminator 5’-Phosphate-Dependent Exonuclease, a PNK, a PNK (3’ phosphatase minus), or a combination thereof. In some embodiments, the kit further comprises stock solutions of ATP and stock solutions of buffers comprising standard ligase and PNK buffers, MES buffers and, optionally, citric acid buffers.
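The layout of a spike-in control RNA described above (randomized ends, an internal bar-code per Type, defined end chemistry) can be sketched in Python as follows. This is a minimal illustration, not a design from this document: the four bar-code sequences, the core sequence, the placement of the bar-code at the midpoint of the core, and all names are hypothetical. The limits of 1 to 6 randomized nucleotides per end and 15 to 150 nucleotides overall are taken from the control RNA embodiments described herein.

```python
import random

# Hypothetical 4-nt bar-codes; the document does not list bar-code sequences.
TYPE_BARCODES = {1: "ACGU", 2: "CAUG", 3: "GUAC", 4: "UGCA"}
# End chemistry per Type; Types 3 and 4 may instead carry a 2',3'>P 3' end.
TYPE_ENDS = {1: ("5'-P", "3'-OH"), 2: ("5'-OH", "3'-OH"),
             3: ("5'-OH", "3'-P"), 4: ("5'-P", "3'-P")}

def make_spike_in(rna_type, core, n_random=4, rng=None):
    """Describe one control RNA: random end + core with internal bar-code + random end."""
    rng = rng or random.Random()
    if not 1 <= n_random <= 6:
        raise ValueError("randomized ends are 1 to 6 nt in the described embodiments")
    half = len(core) // 2
    # Assumption: the bar-code is inserted at the midpoint of the core sequence.
    body = core[:half] + TYPE_BARCODES[rna_type] + core[half:]
    left = "".join(rng.choice("ACGU") for _ in range(n_random))
    right = "".join(rng.choice("ACGU") for _ in range(n_random))
    sequence = left + body + right
    if not 15 <= len(sequence) <= 150:
        raise ValueError("control RNAs are 15 to 150 nt in the described embodiments")
    five_prime, three_prime = TYPE_ENDS[rna_type]
    return {"type": rna_type, "five_prime": five_prime,
            "three_prime": three_prime, "sequence": sequence}

if __name__ == "__main__":
    # One control RNA per Type, sharing a hypothetical 16-nt core.
    for t in TYPE_ENDS:
        print(make_spike_in(t, "GGAUCCGAUCGGAUCC", rng=random.Random(t)))
```

Because the bar-code is internal, a sequencing read of such a control can be assigned to its starting Type regardless of which pretreatment changed its ends.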
[0089] Instructions for use may be included in the kit. Optionally, the kit also contains other useful components, such as diluents, buffers, pharmaceutically acceptable carriers, syringes, catheters, applicators, pipetting or measuring tools, bandaging materials or other useful paraphernalia. The materials or components assembled in the kit can be provided to the practitioner stored in any convenient and suitable ways that preserve their operability and utility. For example, the components can be in dissolved, dehydrated, or lyophilized form; they can be provided at room, refrigerated or frozen temperatures. The components are typically contained in suitable packaging material(s). As employed herein, the phrase “packaging material” refers to one or more physical structures used to house the contents of the kit, such as compositions and the like. The packaging material is constructed by well-known methods, preferably to provide a sterile, contaminant-free environment. The packaging materials employed in the kit are those customarily utilized in gene expression assays and in the administration of pretreatments. As used herein, the term “package” refers to a suitable solid matrix or material such as glass, plastic, paper, foil, and the like, capable of holding the individual kit components. Thus, for example, a package can be a glass vial or prefilled syringes used to contain suitable quantities of the pharmaceutical composition. The packaging material has an external label which indicates the contents and / or purpose of the kit and its components.
NUMBERED EMBODIMENTS
[0090] Also disclosed herein are the following embodiments:
1. A method of preparing a sequencing library from a sample comprising a plurality of RNA molecules, wherein the plurality of RNA molecules comprise a combination of ligatable or unligatable 5’ and 3’ ends, and wherein the plurality of RNA molecules comprise at least one RNA molecule comprising both 5’ and 3’ ligatable ends and at least two or more RNA molecules comprising one or two 5’ or 3’ unligatable ends, the method comprising:
a) applying a plurality of pretreatments to the plurality of RNA molecules of the sample, wherein the plurality of pretreatments comprises:
(i) converting a first plurality of RNA molecules comprising unligatable end-types to ligatable end-types, or
(ii) circularizing a first plurality of RNA molecules comprising ligatable end-types to obtain a first plurality of circularized RNA molecules; and
(iii) converting a second plurality of RNA molecules comprising unligatable end-types into a second plurality of RNA molecules comprising ligatable end-types; and / or
(iv) circularizing a second plurality of RNA molecules comprising ligatable end-types to obtain a second plurality of circularized RNA molecules; and
(v) converting a third plurality of RNA molecules comprising unligatable end-types into a third plurality of RNA molecules comprising ligatable end-types; and / or
(vi) circularizing a third plurality of RNA molecules comprising ligatable end-types to obtain a third plurality of circularized RNA molecules; and
(vii) converting a fourth plurality of RNA molecules comprising unligatable end-types into a fourth plurality of ligatable end-types; and / or
(viii) optionally, repeating (v) through (vii) for a fourth or more pluralities of RNA molecules comprising unligatable end-types;
b) preparing a sequencing library for all pluralities of RNA molecules comprising ligatable end-types, wherein preparing the sequencing library comprises ligation to sequencing adaptors.
2. The method of embodiment 1, further comprising performing a sequencing reaction on the sequencing library.
3. The method of embodiment 1 or 2, wherein the RNA molecules comprise small RNAs (sRNA) or RNA fragments (RFs).
4. The method of embodiment 3, wherein said sRNAs or RFs are 150 nucleotides or less in length.
5. The method of embodiment 3, wherein said sRNAs or RFs are 50 nucleotides or less in length.
6. The method of any one of embodiments 1 to 5, wherein the 5’ ends comprise 5’-hydroxyl (5’-OH), 5’-Phosphate (5’-P), 5’-triphosphate (5’-ppp) or 5’-cap (e.g., 5’-methylGppp); and wherein the 3’ ends comprise 3’-Phosphate (3’-P), 2’-phosphate (2’-P), 2’,3’-cyclic phosphate (2’,3’>P), or 2’-O-methyl (2’-OMe).
7. The method of any one of embodiments 1 to 5, wherein the ligatable ends are selected from 5’-P, 3’-OH, 5’-OH, 3’-P or 2’,3’>P, or any combination thereof.
8. The method of any one of embodiments 1 to 7, wherein the circularizing is performed with at least one 3’-ligase (ligating 3’-OH with 5’-P ends) selected from: T4 RNA ligase, T4 RNA ligase 1 (Rnl1), T4 RNA ligase 2 (Rnl2), Mth RNA Ligase, CircLigase™ ssDNA ligase, CircLigase™ II ssDNA ligase, CircLigase™ RNA Ligase, Thermostable 5’ AppDNA / RNA ligase, or a combination thereof.
9. The method of any one of embodiments 1 to 8, wherein the circularizing is performed with at least one 5’-ligase (ligating 5’-OH with 3’-P or 2’,3’>P ends) selected from: RNA-splicing ligase (RtcB), A. thaliana tRNA ligase (AtRNL), tRNA ligase enzyme (Tril), tRNA ligase (Rigl+), or a combination thereof.
10. The method of any one of embodiments 1 to 9, wherein the circularizing of RNA molecules prevents ligation of circularized RNA molecules with sequencing adapter(s), thereby preventing incorporation of the circularized RNA molecules into the sequencing library.
11. The method of embodiment 5, wherein the sequencing detects Type 1 or a combination of Type 1 and Type 2 RNA molecules.
12. A method of preparing a sequencing library from a sample comprising a plurality of RNA molecules; wherein the plurality of RNA molecules comprises a first end comprising a 5’-Phosphate (5’-P) end or a 5’-hydroxyl (5’-OH) end and a second end comprising a 3’-Phosphate (3’-P) end, a 2’,3’-cyclic phosphate (2’,3’>P) end, or a 3’-hydroxyl (3’-OH) end; the method comprising:
a) separating a composition comprising the plurality of RNA molecules into at least a first partition and a second partition;
b) performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are not the same, wherein the first pretreatment and the second pretreatment are independently selected from:
(i) circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends;
(ii) converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends;
(iii) converting a plurality of RNA molecules comprising 3’-P or 2’,3’>P ends to 3’-OH ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends;
(iv) converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends;
(v) circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P ends, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends;
(vi) circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends;
(vii) degrading a plurality of RNA molecules comprising 5’-P ends, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends; or
(viii) no pretreatment is performed; and
c) preparing a first sequencing library from the first partition and a second sequencing library from the second partition.
13. The method of embodiment 12, wherein (b)(i) comprises contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
14. The method of embodiment 12 or 13, wherein (b)(ii) comprises contacting the plurality of RNA molecules with T4 polynucleotide kinase (PNK) in the absence of ATP.
15. The method of any one of embodiments 12 to 14, wherein (b)(iii) comprises contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
16. The method of any one of embodiments 12 to 15, wherein (b)(iv) comprises contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
17. The method of any one of embodiments 12 to 16, wherein (b)(v) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP.
18. The method of any one of embodiments 12 to 17, wherein (b)(vi) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP.
19. The method of any one of embodiments 12 to 18, wherein (b)(vii) comprises contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent Exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP.
20. The method of any one of embodiments 12 to 19, wherein contacting the plurality of RNA molecules with the PNK is performed in solutions comprising a salt of MES, TRIS, or Imidazole at pH 5.5-6.5.
21. The method of any one of embodiments 12 to 20, further comprising separating the composition comprising the plurality of RNA molecules into a third, fourth, fifth, sixth, and / or seventh partition, wherein a third, fourth, fifth, sixth, and / or seventh pretreatment is performed on the third, fourth, fifth, sixth, and / or seventh partition, wherein the third, fourth, fifth, sixth, and / or seventh pretreatments are different from each other and the first and second pretreatment.
22. The method of any one of embodiments 12 to 21, wherein preparing the sequencing library comprises ligating a single adaptor to the 5’ or to the 3’ end of the plurality of RNA molecules.
23. The method of any one of embodiments 12 to 22, wherein preparing the sequencing library comprises ligating two adaptors, wherein the first adaptor is ligated to a first end and the second adaptor is ligated to a second end of the plurality of RNA molecules.
24. The method of embodiment 23, wherein preparing a sequencing library further comprises circularizing the plurality of ligation products comprising the single adaptor-RNA molecules.
25. The method of embodiment 23, wherein preparing a sequencing library further comprises reverse transcription (RT) of circularized products followed by PCR amplification of cDNA products of the RT.
26. The method of embodiment 23, wherein preparing a sequencing library further comprises direct PCR amplification of the circularized products.
27. The method of any one of embodiments 12 to 26, further comprising sequencing the first sequencing library and the second sequencing library.
28. A method of preparing a sequencing library from a sample comprising a plurality of RNA molecules, wherein the plurality of RNA molecules have ends of Type 1 comprising a combination of 5’-Phosphate (5’-P) and 3’-hydroxyl (3’-OH) ends, Type 2 comprising 5’-hydroxyl (5’-OH) and 3’-OH ends, Type 3 comprising 5’-OH and 3’-Phosphate (3’-P) or 2’,3’-cyclic phosphate (2’,3’>P) ends, and Type 4 comprising 5’-P and 3’-P or 2’,3’>P ends; the method comprising:
a) separating a composition comprising the plurality of said RNA molecules into at least a first partition and a second partition;
b) performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are not the same, wherein the first pretreatment and the second pretreatment are independently selected from:
(i) circularizing Type 1 RNA molecules;
(ii) converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules;
(iii) converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules, then circularizing Type 1 RNA molecules;
(iv) converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules, then converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules;
(v) circularizing Type 3 molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules;
(vi) circularizing Type 3 molecules, then converting Type 4 molecules to Type 1 molecules;
(vii) degrading Type 1 and Type 4 molecules, then converting Type 2 molecules to Type 1 molecules and circularizing Type 1 molecules; then converting Type 3 molecules to Type 2 molecules; or
(viii) no pretreatment;
c) ligating a plurality of adaptors to the plurality of RNA molecules to produce a plurality of adaptor-RNA molecules.
29. The method of embodiment 28, further comprising separating the composition comprising the plurality of RNA molecules into a third, fourth, fifth, sixth, and / or seventh partition, wherein a third, fourth, fifth, sixth, and / or seventh pretreatment is performed on each corresponding separate partition, wherein each pretreatment is different.
30. The method of embodiment 28 or 29, wherein (b)(i) comprises contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
31. The method of any one of embodiments 28 to 30, wherein (b)(ii) comprises contacting the plurality of RNA molecules with PNK in the absence of ATP.
32. The method of any one of embodiments 28 to 31, wherein (b)(iii) comprises contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
33. The method of any one of embodiments 28 to 32, wherein (b)(iv) comprises contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP.
34. The method of any one of embodiments 28 to 33, wherein (b)(v) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP.
35. The method of any one of embodiments 28 to 34, wherein (b)(vi) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP.
36. The method of any one of embodiments 28 to 35, wherein (b)(vii) comprises contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent Exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP.
37. The method of embodiment 36, wherein contacting the plurality of RNA molecules with the PNK is performed in a solution comprising a salt of MES, TRIS, or Imidazole at pH 5.5-6.5.
38. The method of any one of embodiments 28 to 37, wherein the method further comprises sequencing the first sequencing library and the second sequencing library to identify and quantify at least one Type of RNA molecule.
39. The method of embodiment 38, further comprising comparing the relative quantities of the same Type or different Types of RNA molecules in the first sequencing library and the second sequencing library.
40. The method of any one of embodiments 28 to 39, wherein the 5’ ends comprise 5’-hydroxyl (5’-OH), 5’-Phosphate (5’-P), 5’-triphosphate (5’-ppp), or 5’-cap (e.g., 5’-mGppp).
41. The method of any one of embodiments 28 to 40, wherein the 3’ ends comprise 3’-Phosphate (3’-P), 2’-phosphate (2’-P) or 2’,3’-cyclic phosphate (2’,3’>P).
42. The method of any one of embodiments 28 to 41, wherein the 5’ ends comprise 5’-hydroxyl (5’-OH).
43. The method of any one of embodiments 28 to 42, wherein the 3’ ends comprise 3’-hydroxyl (3’-OH) or 2’-O-Methyl (2’-OMe).
44. A method of preparing a sequencing library from a sample comprising a plurality of RNA molecules, the method comprising:
a) separating a composition comprising the plurality of RNA molecules into at least a first partition and a second partition;
b) performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are independently selected from:
(i) contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP;
(ii) contacting the plurality of RNA molecules with PNK in the absence of ATP;
(iii) contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP;
(iv) contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP;
(v) contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP;
(vi) contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP;
(vii) contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent Exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus), T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP; or
(viii) no pretreatment;
c) ligating a plurality of adaptors to the plurality of RNA molecules in the first partition and to the plurality of RNA molecules in the second partition to produce a plurality of adaptor-RNA molecules.
45. The method of embodiment 44, wherein contacting the plurality of RNA molecules with the PNK is performed in solutions comprising a salt of MES, TRIS, or Imidazole at pH 5.5-6.5.
46. The method of embodiment 44 or 45, further comprising separating the composition comprising the plurality of RNA molecules into a third, fourth, fifth, sixth, and / or seventh partition, wherein a third, fourth, fifth, sixth, and / or seventh pretreatment is performed on the corresponding separate partition, wherein each pretreatment is different.
47. A method of pretreating a plurality of RNA molecules comprising all possible combinations of a 5’-P end, a 5’-OH end, a 3’-P end, a 3’-OH end, and a 2’,3’>P end, the method comprising:
a) converting the 3’-P ends and the 2’,3’>P ends to 3’-OH ends by contacting the plurality of RNA molecules with a PNK in a buffer solution at pH between 5.5 and 6.5;
b) converting the 5’-OH ends to 5’-P ends by PNK in the presence of ATP and a buffer at pH between 7.5 and 8.5.
48. The method of embodiment 47, wherein the buffer solution comprises a buffer at pH 6.
49. The method of embodiment 47, wherein the PNK is heat-inactivated at 65°C-85°C in the presence of citric acid at pH 6, wherein both chelation of Mg2+ cations by citrate anions and pH 6 prevent RNA degradation at the elevated temperatures.
50. The method of embodiment 47, wherein a sequencing adaptor is ligated to each of the 3’ ends of the plurality of RNA molecules after step (a) and before step (b).
51. The method of embodiment 47, wherein a sequencing adaptor is ligated to each of the 3’ ends of the plurality of RNA molecules after step (b).
52. The method of any one of embodiments 1 to 51, wherein the method allows identification of one or more RNA Types for any RNA class of interest.
53. The method of embodiment 52, wherein the RNA class is selected from: microRNAs (miRNA), endogenous small interfering RNAs (esiRNA), Piwi-interacting RNAs (piRNA), small nuclear RNA (snRNA), small nucleolar RNAs (snoRNAs), molecules derived from mRNA transcripts (smRNA, scRNA, sutRNA, sinRNA) and other small genome-encoded RNA (sgmRNA), long non-coding RNAs (lncRNA), transfer RNA (tRNA), ribosomal RNA (rRNA) and Y RNA, or a combination thereof.
54. The method of any one of embodiments 1 to 53, wherein deep sequencing of the plurality of sequencing libraries comprising sequences of Type 1, Type 2, Type 3, or Type 4 RNA molecules simultaneously allows identification of specific RNA classes as biomarker candidates.
55. The method of any one of embodiments 1 to 54, wherein the method determines whether an RNA molecule is a Type 1, Type 2, Type 3, or Type 4 RNA molecule.
56. The method of any one of embodiments 1 to 55, wherein sequencing libraries prepared for different RNA Types allow identification of the specific RNA Type(s) and RNA class(es) providing the most sensitive and specific detection of RNA biomarkers.
57. The method of any one of embodiments 1 to 56, wherein a length of an identified RNA molecule is within a range of 15 to 150 nucleotides in the sequencing reads.
58. A kit for all-Types or specific pretreatments and for preparing a sequencing library from a sample comprising a plurality of RNA molecules, the kit comprising:
a) a library preparation kit comprising a universal (RNA Type-independent) or Type-specific sequencing adapter;
b) a pool of control (spike-in) RNA molecules, the RNA molecules comprising:
(i) a plurality of 5’ and 3’ end combinations, the end combinations comprising Type 1 with 5’-P and 3’-OH ends; Type 2 with 5’-OH and 3’-OH ends; Type 3 with 5’-OH and 3’-P or 2’,3’>P ends; and Type 4 with 5’-P and 3’-P or 2’,3’>P ends;
(ii) internal bar-code nucleotide sequences corresponding to and distinguishing between Type 1, Type 2, Type 3 and Type 4;
(iii) a randomized nucleotide sequence at the first end; and
(iv) a randomized nucleotide sequence at the second end.
59. The kit of embodiment 58, further comprising one or more enzymes, the one or more enzymes comprising one or more of: a PNK (regular and 3’-end phosphatase minus mutant), a T4 RNA ligase 1, a T4 RNA ligase 2, a RtcB ligase, or a Terminator 5’-Phosphate-Dependent Exonuclease.
DEFINITIONS
[0091] Unless defined otherwise, all terms of art, notations and other technical and scientific terms or terminology used herein are intended to have the same meaning as is commonly understood by one of ordinary skill in the art to which the claimed subject matter pertains. In some cases, terms with commonly understood meanings are defined herein for clarity and / or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a substantial difference over what is generally understood in the art.
[0092] Throughout this application, various embodiments may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the disclosure. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
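The range convention above can be made concrete with a short Python sketch. It assumes integer endpoints only, which is a simplification for illustration, and the function name is hypothetical.

```python
from itertools import combinations

def disclosed_subranges(low, high):
    """List integer subranges (a, b) with low <= a < b <= high, and the individual values."""
    values = list(range(low, high + 1))
    # Every ordered pair of distinct values is one subrange, including (low, high) itself.
    subranges = list(combinations(values, 2))
    return subranges, values

if __name__ == "__main__":
    subranges, values = disclosed_subranges(1, 6)
    print(len(subranges), "subranges, e.g.", subranges[:3])
    print("individual values:", values)
```

For the example range of 1 to 6, the subranges named in the paragraph above, such as 1 to 3, 2 to 4 and 3 to 6, all appear in the returned list.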
[0093] As used in the specification and claims, the singular forms “a”, “an” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a sample” includes a plurality of samples, including mixtures thereof.
[0094] The terms “determining,” “measuring,” “evaluating,” “assessing,” “assaying,” and “analyzing” are often used interchangeably herein to refer to forms of measurement. The terms include determining if an element is present or not (for example, detection). These terms can include quantitative, qualitative or quantitative and qualitative determinations. Assessing can be relative or absolute. “Detecting the presence of” can include determining the amount of something present in addition to determining whether it is present or absent depending on the context.
[0095] The terms “subject,” “individual,” or “patient” are used interchangeably herein. A “subject” can be a biological entity containing expressed genetic materials. The biological entity can be a plant, animal, or microorganism, including, for example, bacteria, viruses, fungi, and protozoa. The subject can be tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro. The subject can be a mammal. The mammal can be a human. The subject may be diagnosed or suspected of being at high risk for a disease. In some cases, the subject is not necessarily diagnosed or suspected of being at high risk for the disease.
[0096] Non-limiting examples of a “biological sample” include any animal cells, tissues, or fluids from which nucleic acids and / or proteins can be obtained. As non-limiting examples, this includes whole blood, peripheral blood, plasma, serum, saliva, mucus, urine, semen, lymph, fecal extract, cheek swab, cells or other bodily fluid or tissue, including but not limited to tissue obtained through surgical biopsy or surgical resection. Alternatively, a sample can be obtained through primary patient derived cell lines, or archived patient samples in the form of preserved samples, or fresh frozen samples.
[0097] The term “in vitro” is used to describe an event that takes place in a container for holding laboratory reagents such that it is separated from the biological source from which the material is obtained. In vitro assays can encompass cell-based assays in which living or dead cells are employed. In vitro assays can also encompass a cell-free assay in which no intact cells are employed.
[0098] As used herein, the term “about” a number refers to that number plus or minus 10% of that number. The term “about” a range refers to that range minus 10% of its lowest value and plus 10% of its greatest value.
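As a worked illustration of this definition, the following minimal Python sketch computes the interval implied by “about” for a single number and for a range. The function names `about` and `about_range` are hypothetical and are introduced only for this illustration.

```python
def about(value):
    """Return the (low, high) interval for "about value": value plus or minus 10%."""
    delta = abs(value) / 10
    return (value - delta, value + delta)


def about_range(lowest, greatest):
    """Return the interval for "about lowest to greatest".

    The lower bound is reduced by 10% of the lowest value and the upper
    bound is increased by 10% of the greatest value.
    """
    return (lowest - abs(lowest) / 10, greatest + abs(greatest) / 10)


print(about(200))           # interval for "about 200"
print(about_range(8, 200))  # interval for "about 8 to 200"
```

For example, under this reading “about 200 nucleotides” spans 180 to 220 nucleotides.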
[0099] The term “small RNA,” as used herein, may refer in general to an RNA molecule or RNA fragment with potential biological functions that is at least about 8 nucleotides to at least about 200 nucleotides in length. The term “small RNAs” usually refers to mature RNAs comprising defined sequences derived from larger precursor RNAs by regulated intracellular RNA processing machinery. The term “RNA fragments” usually refers to fragments of larger mature RNAs produced by non-random or semi-random cleavage by cellular and (cell-free) circulating RNases, rather than to products of random degradation of RNA.
[0100] The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.
[0101] The term “template-deficient segment” or “non-template linker” refers to a segment of the combo adapter comprising at least one nucleotide, modified nucleotide, non-nucleotide residue or combination thereof, wherein the template-deficient segment is capable of restricting primer extension by a polymerase using the combo adapter sequence as a template. Placing the template-deficient segment between the combo adapter 5’-proximal and 3’-proximal segments, or within the 5’-proximal segment, can stop the primer extension and, therefore, can prevent more than one round of primer extension (RCA) on the circular template comprising sequences of the adapter-polynucleotide ligation product or the combo adapter alone. The template-deficient segment may comprise at least one template-deficient nucleotide or template-deficient non-nucleotide residue. The template-deficient nucleotide or non-nucleotide residue may be at or near the 5’ end of the template-deficient segment. The template-deficient nucleotide or non-nucleotide residue may be at or near the 3’ end of the template-deficient segment. The template-deficient nucleotide may be a modified nucleotide, a derivatized nucleotide, a nucleotide analog, a DNA residue or an RNA residue. Said non-nucleotide residue is not chemically classified as a nucleic acid residue, but can be synthetically inserted (serve as a linker) between nucleic acid residues. The template-deficient segment cannot serve as a template for nucleic acid synthesis. The template-deficient segment may prevent the synthesis of a nucleic acid strand complementary to a nucleic acid strand containing at least one template-deficient nucleotide or a template-deficient non-nucleotide residue at or beyond the site of the template-deficient nucleotide or template-deficient non-nucleotide residue. The template-deficient nucleotide or non-nucleotide residue, which cannot be copied by a polymerase, may comprise at least one feature selected from: a) absence of a nucleic acid base (e.g., an abasic site or a non-nucleotide linker); b) a modified nucleic acid base lacking complementarity to the nucleotides accepted by a polymerase; c) a nucleotide or modified nucleotide that cannot be recognized by a polymerase (e.g., a ribonucleotide is not recognized by a DNA-dependent DNA polymerase, or a deoxyribonucleotide by an RNA-dependent DNA polymerase); and d) a modified nucleotide residue inhibiting activity of a polymerase by forming a chemical bond with said polymerase.
[0102] The term “non-nucleotide residue” refers to a residue that is not chemically classified as a nucleic acid residue. The non-nucleotide residue may be synthetically inserted (serve as a linker or a spacer) between nucleic acid residues or be attached to nucleic acid ends (terminal groups). Examples of non-nucleotide residues include (but are not limited to): disulfide (S-S), 3’ Thiol Modifier C3 S-S, a propanediol (C3 Spacer), a hexanediol (six carbon glycol spacer), a triethylene glycol (Spacer 9) and hexa-ethyleneglycol (Spacer 18).
EXAMPLES
[0103] The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.
Example 1. Sequencing library protocols for detecting and identifying RNA Types using Type-specific adapters and protocols.
[0104] The general sequencing library preparation protocol is described in Fig. 2A. The compositions of the Type-specific adapters (CADs) are shown schematically in Fig. 2B. Experimental details, including the compositions of the enzymatic reactions, are indicated in Table 1. The compositions of the CADs are the same as the previously described CAD type 6 (Barberan-Soler et al. 2018. Genome Biol. 19: 105), except for the end modifications in CAD types 7 and 8 indicated in Fig. 2B.
Table 1: Workflows of protocols for preparation of sequencing libraries specific for different types of RNA ends
Supplementary Information (for Table 1): T4 Polynucleotide Kinase (PNK) is used to dephosphorylate 3’ ends of RNA molecules; T4 Polynucleotide Kinase (3’ phosphatase minus) (PNK, 3’-minus) is used to phosphorylate 5’ ends of RNA molecules without dephosphorylation of their 3’ ends. T4 RNA ligase 1 (Rnl1) is used for circularization of RNA molecules comprising 5’-P and 3’-OH ends. RtcB ligase is used for circularization of RNA molecules comprising 5’-OH and 3’-P or 2’,3’>P (2’,3’-cyclic phosphate) ends.
[0105] This approach allows detection either of all RNA Types (Fig. 1) simultaneously (protocol A2) or of certain individual RNA Types specifically, including Type 1 (protocol A1), Type 2 (protocols A3 and A4) and Type 4 (protocols A5 and A7). The other protocols allow simultaneous detection of combinations of the individual RNA Types, including Types 1+2 (protocol A) and Types 3+4 (protocol A6). Protocol A is as described in Barberan-Soler, S. et al. “Decreasing miRNA sequencing bias using a single adapter and circularization approach.” Genome Biol. 19, 105 (2018).
[0106] All other protocols shown in Table 1 are modified versions of protocol A. Protocols A2, A3, A4, A6 and A7 include RNA pretreatments before library preparation. In Protocols A3, A4 and A7, the indicated RNA Types are converted into circular forms (RNA Type 5), which blocks the RNA ends and prevents incorporation of the circularized RNA molecules into sequencing libraries.
Example 2. Simultaneous detection and discrimination between different RNA Types of RFs using the RNA Type-specific pretreatments before sequencing library preparation.
[0107] Core protocol for sequencing library preparation. In Example 1, we described several library preparation protocols featuring different types of CADs, which require different ligases to work with different Types of RNA molecules. In Example 2, we describe an alternative approach (named RealSeq-RF) using a single, core Protocol A, which in combination with different pretreatment protocols (Table 2) allows detection of different RNA Types either simultaneously or specifically. The workflow of Protocol A is presented in Figs. 3A-B (and the corresponding legend to this figure) with experimental details described herein (Barberan-Soler et al. 2018. Genome Biol. 19: 105). The CAD composition used in this version of Protocol A is: 5’-rApp(TGGAATTCTCGGGTGCCAAGG)-idSp / idSp-r(GUUCAGAGUUCUACAGUCCGACGAUC)p-3’ [SEQ ID NO. 1], where idSp is an abasic 1’,2’-dideoxyribose (D-spacer).
Table 2: Examples of pretreatments for preparation of sequencing libraries containing various Types of RNA ends (RNA Types).
Supplementary Information (for Table 2): T4 Polynucleotide Kinase (PNK) is used to dephosphorylate 3’ ends of RNA molecules; T4 Polynucleotide Kinase (3’ phosphatase minus) (PNK, 3’-minus) is used to phosphorylate 5’ ends of RNA molecules without dephosphorylation of their 3’ ends. The combination of T4 RNA ligase 1 (Rnl1) and T4 RNA ligase 2 (Rnl2) is used for circularization of RNA molecules comprising 5’-P and 3’-OH ends. MES is a buffer (2-N-morpholinoethanesulfonic acid, sodium salt). RtcB ligase is used for circularization of RNA molecules comprising 5’-OH and 3’-P or 2’,3’>P (2’,3’-cyclic phosphate) ends. Terminator 5’-Phosphate-Dependent Exonuclease specifically digests RNA molecules with 5’-P ends (but preserves RNA molecules with 5’-OH ends).
[0108] In contrast to library preparation protocol A (Table 2), which specifically captures RNA molecules of Types 1 and 2, most currently available small RNA-Seq library preparation kits capture only Type 1 RNA molecules. The RealSeq-RF approach can also be adapted for use with any RNA Type 1-specific library preparation protocol, designated below as protocol A8 (Table 3).
Table 3: Examples of alternative pretreatments for preparation of sequencing libraries containing various RNA Types.
Example 3. Synthetic spike-in RNAs comprising RNA Type-specific barcodes.
Model synthetic small RNAs of 20, 30, 40, 50 and 60 nt (stsRNAs), comprising sequences that are not homologous to the human transcriptome, with 6-nt RNA Type-specific barcodes (CGTGAT, ACATCG, GCCTAA, and TGGTCA corresponding to Types 1, 2, 3 and 4, respectively) and 4-nt random sequences at each end, are shown in Table 4.
Table 4: Examples of Type-specific (spike-in) control RNA
[0109] A pool of these stsRNAs (control RNAs) comprising equimolar amounts of each stsRNA was used along with RNA molecules (or RFs) isolated from samples before the start of the pretreatment protocols. This allows measurement of the efficiency of enzymatic conversion between different RNA Types, as well as monitoring of the efficiency of incorporation of RFs of different sizes and RNA Types. stsRNAs of the same size but of different RNA Types were distinguished by their RNA Type-specific barcode sequences.
Example 4. Pretreatment(s) for detecting all RNA Types both simultaneously and Type-specifically.
[0110] To test and select the most effective pretreatments, total human brain RNA samples were used along with the stsRNA pool described above. A custom protocol was developed for the clean-up performed after each pretreatment to remove enzymes (without heat inactivation) and buffers with minimal loss of RNA (<10%). For this purpose, RNA Clean and Concentrator kit columns (Zymo Research) were used with increased amounts of ethanol (in comparison to the Zymo protocol) added to the reaction solutions before loading onto the columns. Following the pretreatments, Protocol A was used to prepare triplicate sequencing libraries, which were then sequenced on an Illumina NextSeq 550 or a Singular Genomics G4.
[0111] Sequencing data were analyzed using a customized workflow for fragmented RNAs. Briefly, adapter sequences were trimmed using Cutadapt and reads were initially aligned to a custom reference of high confidence tRNAs using Bowtie2. Unaligned reads were then mapped to the human genome (hg38) sequentially using Bowtie2 followed by Hisat2 to more accurately map mRNA-derived sequences and other RNA classes. Multimapping reads were assigned the highest-scoring alignment; one was randomly assigned for reads with multiple high-scoring alignments. A custom Python script was then used to count mapped reads and perform downstream analyses.
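The rule for multimapping reads stated above can be sketched in Python as follows. This is only a minimal illustration and not the custom script referred to in the text; the function name `assign_alignment` and the `(locus, score)` tuple format are assumptions made for this example.

```python
import random


def assign_alignment(alignments, rng=random):
    """Pick one alignment for a multimapping read.

    `alignments` is a list of (locus, score) tuples (hypothetical format).
    The highest-scoring alignment is chosen; if several alignments share
    the highest score, one of them is chosen at random.
    """
    if not alignments:
        return None
    best_score = max(score for _, score in alignments)
    best = [aln for aln in alignments if aln[1] == best_score]
    return best[0] if len(best) == 1 else rng.choice(best)


rng = random.Random(0)  # fixed seed so the illustration is reproducible
unique_best = [("chr1:1000", 42), ("chr5:200", 30)]
tied_best = [("chr1:1000", 42), ("chr7:900", 42), ("chr5:200", 30)]

print(assign_alignment(unique_best, rng))  # the single top-scoring alignment
print(assign_alignment(tied_best, rng))    # one of the two tied alignments
```

Passing a seeded `random.Random` instance keeps the random tie-breaking reproducible between runs, which may be useful when comparing triplicate libraries.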
[0112] Selection of the most efficient pretreatment for simultaneous detection of all RNA Types. The standard treatment of cfRNA by PNK, which has both 5’-kinase and 3’-phosphatase activities, in Tris-HCl buffer (pH 7.6) in the presence of 1 mM ATP is suboptimal for 3’-end dephosphorylation. To find the reaction conditions providing the most efficient conversion of 2’,3’>P and 3’-P ends into 3’-OH ends, the effects of different PNK pretreatment conditions originally described were experimentally compared by sequencing samples of human brain RNA spiked with the stsRNA pool. Among the 9 different PNK reaction conditions tested, the PNK pretreatment in MES (a.k.a. MES-NaOH) buffer at pH 6.0 in the absence of ATP was selected. The selected pretreatment produced the largest increase in the percentage of mRNA- and tRNA-associated sequencing reads while minimizing the amount of rRNA fragments (Fig. 4A). Analysis of sequencing data for the stsRNA pool confirmed the RNA Types present in the sequencing libraries: only RNA Types 1+2 in the no-pretreatment control (PP code 01, which is Protocol A), and all RNA Types 1+2+3+4 after the selected pretreatment (PP code 04, which became Protocol C) and after the standard pretreatment (PP code 10), as shown in Fig. 4B. The RF sequencing profiles (percent of sequencing reads vs. length of sequencing reads) for all indicated RNA classes with the PNK pretreatment (Protocol C) and without it (Protocol A) are shown in Fig. 4C (left panel) and Fig. 5B (top left panel), respectively. The Protocol C sequencing profile of RFs related to the mRNA transcriptome (Fig. 4C, right panel) comprises RFs derived from mRNA transcripts, including small RNA from protein coding regions (scdRNA), small untranslated mRNA region RNA (sutRNA), and small intronic RNA (sinRNA).
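The stsRNA-based check of which RNA Types entered a library can be sketched as a simple barcode count, using the Type-specific barcodes given in Example 3. This is a hedged illustration only: the function names are hypothetical, the reads are made up, and searching for the barcode anywhere in the read is an assumption, since the barcode position within each stsRNA is not restated here.

```python
from collections import Counter

# Type-specific barcodes as listed in Example 3.
BARCODE_TO_TYPE = {
    "CGTGAT": 1,
    "ACATCG": 2,
    "GCCTAA": 3,
    "TGGTCA": 4,
}


def barcode_type(read):
    """Return the RNA Type whose barcode occurs in the read, else None.

    Assumption: a read is assigned only if exactly one barcode is found.
    """
    hits = {t for bc, t in BARCODE_TO_TYPE.items() if bc in read}
    return hits.pop() if len(hits) == 1 else None


def count_types(reads):
    """Count spike-in reads per RNA Type, ignoring unassigned reads."""
    counts = Counter()
    for read in reads:
        rna_type = barcode_type(read.upper())
        if rna_type is not None:
            counts[rna_type] += 1
    return counts


# Made-up reads for illustration only; these are not sequences from Table 4.
example_reads = [
    "ACGTCGTGATTTGACCAGTA",  # carries the Type 1 barcode
    "GGCAACATCGTTAGGCTTCA",  # carries the Type 2 barcode
    "TTAGGCCTAAGCATTGCAGT",  # carries the Type 3 barcode
    "AAAAAAAAAAAAAAAAAAAA",  # no barcode, left unassigned
]
print(count_types(example_reads))
```

Comparing such per-Type counts between a pretreated library and the no-pretreatment control gives a rough estimate of conversion or exclusion efficiency for each RNA Type.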
[0113] Exclusion of specific RNA Types by circularization. We employed circularization of RNA Type 1 RFs (and RNA Type 4 RFs after their conversion to RNA Type 1) by T4 RNA ligase(s) to exclude these RFs from sequencing libraries. To this end, we tested T4 RNA ligase 1 (Rnl1) and 2 (Rnl2) pretreatments of our stsRNA pool spiked into total brain RNA samples. Although the efficiencies of RNA Type 1 exclusion-by-circularization were similar (~85%) under all tested conditions (Fig. 5A, left panel), we found that an equimolar mixture of Rnl1+Rnl2 produced the fewest products of a side “reverse ligation” reaction (Krug and Uhlenbeck. 1982. Biochemistry 21: 1858-64), which, although not commonly known, is a result of the exonuclease activity of Rnl1 at high concentration and / or long reaction time (> 1 h). Using stsRNAs, we confirmed by sequencing that this undesirable reaction removes a pNp mononucleotide from the 3’ ends of RNA Types 3 and 4, converting them to RNA Types 2 and 1, respectively, and allows their incorporation into libraries (Fig. 5A, compare left and right panels). Simultaneously, the converted Type 4-to-Type 1 stsRNAs were circularized by Rnl1, reducing their inclusion in the library relative to the converted RNA Type 3-to-Type 2 stsRNAs (see Fig. 5A, right panel).
[0114] Subtraction (or exclusion) of sequencing profiles for Protocol B (RNA Type 2) (Fig. 5B, top left panel) from the profile for Protocol A (Types 1+2) (Fig. 5B, bottom left panel) allowed for the identification of RNA Type 1 RFs. For example, nearly all miRNAs were classified as RNA Type 1 (Fig. 5B, right panel) using this new approach, which we called “detection-by-exclusion”. This subtraction approach was made possible by using the core protocol for the sequencing library preparations.
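The “detection-by-exclusion” subtraction can be sketched as a per-length difference between two profiles. The sketch below is a minimal illustration under stated assumptions: profiles are represented as dictionaries mapping read length to a normalized read count, negative differences are clipped to zero, and the numbers are made up rather than taken from Fig. 5B. The function name `subtract_profiles` is hypothetical.

```python
def subtract_profiles(profile_a, profile_b):
    """Estimate a Type 1 profile as (Types 1+2) minus (Type 2).

    Profiles map read length (nt) to a normalized read count.
    Assumption: negative differences are clipped to zero.
    """
    lengths = sorted(set(profile_a) | set(profile_b))
    return {
        n: max(profile_a.get(n, 0) - profile_b.get(n, 0), 0)
        for n in lengths
    }


# Made-up numbers for illustration only; not data from Fig. 5B.
protocol_a = {20: 50, 22: 400, 30: 80}  # Types 1+2 (no pretreatment)
protocol_b = {20: 45, 22: 10, 30: 90}   # Type 2 (after Type 1 exclusion)
print(subtract_profiles(protocol_a, protocol_b))
```

A subtraction of this kind is only meaningful when both libraries are prepared with the same core protocol and normalized in the same way, which is the point made in the paragraph above.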
[0115] RF profiles generated from total human brain RNA using the other five pretreatment Protocols (from Table 2) are shown in Fig. 6. These protocols include two (D, E, F and G) or three (H) pretreatments, one (D, E, G and H) or two (F) of which are exclusion-by-circularization steps. The ends of non-circularized RNA Types are converted into ligatable forms (RNA Types 1 or 2), and these RFs are then incorporated into sequencing libraries.
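The logic of combining conversion and exclusion steps can be illustrated with a small, idealized Python model that tracks the 5’ and 3’ ends of each RNA Type through a sequence of pretreatments. The enzyme activities follow the Supplementary Information for Table 2; everything else is an assumption made for this sketch: the function names are hypothetical, each step is treated as 100% efficient, and the step orders shown are illustrative combinations rather than a restatement of Table 2.

```python
# Idealized model: every enzymatic step is assumed to be 100% efficient.
# End chemistry of RNA Types 1 to 4 as (5' end, 3' end); "P" at the 3' end
# stands for either 3'-P or 2',3'>P.
RNA_TYPES = {1: ("P", "OH"), 2: ("OH", "OH"), 3: ("OH", "P"), 4: ("P", "P")}


def pnk_no_atp(five, three):
    """PNK without ATP: dephosphorylate the 3' end."""
    return (five, "OH")


def pnk_3minus_atp(five, three):
    """PNK (3' phosphatase minus) with ATP: phosphorylate the 5' end."""
    return ("P", three)


def rnl(five, three):
    """T4 RNA ligase(s): circularize 5'-P / 3'-OH molecules (excluded)."""
    return None if (five, three) == ("P", "OH") else (five, three)


def rtcb(five, three):
    """RtcB ligase: circularize 5'-OH / 3'-P molecules (excluded)."""
    return None if (five, three) == ("OH", "P") else (five, three)


def terminator(five, three):
    """Terminator exonuclease: digest molecules with a 5'-P end."""
    return None if five == "P" else (five, three)


def detected_types(steps):
    """Return the original RNA Types captured after the given pretreatments.

    Capture rule (assumption based on Protocol A): a linear molecule is
    captured if it ends up as Type 1 or Type 2, i.e. with a 3'-OH end.
    """
    captured = set()
    for rna_type, ends in RNA_TYPES.items():
        for step in steps:
            if ends is None:  # circularized or degraded earlier
                break
            ends = step(*ends)
        if ends is not None and ends[1] == "OH":
            captured.add(rna_type)
    return captured


print(detected_types([]))            # no pretreatment
print(detected_types([rnl]))         # Type 1 exclusion by circularization
print(detected_types([pnk_no_atp]))  # 3' dephosphorylation only
print(detected_types([terminator, pnk_3minus_atp, rnl, pnk_no_atp]))
```

In this model, no pretreatment captures Types 1+2, circularization alone leaves Type 2, and PNK without ATP makes all four Types capturable, in line with Protocols A, B and C as described above. The last sequence leaves only molecules that started as Type 3.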
[0116] Analyzing individual RNA Types for selected RNA classes. Although some RF sequencing profiles (Fig. 5B, left panel and Fig. 6) could look similar at first sight, the striking differences between the profiles are revealed when analyzing individual RNA Types (or their combinations) along with specific RNA classes as shown in Fig. 7. In the latter figure, a few examples of such sequencing profiles for different end Types of rRNA, tRNA and snoRNA are depicted. The snoRNA profile (bottom left) revealed a significant abundance of (Type 1+2) RFs > 50 nt in length found in cells in contrast to Type 3+4 RFs (bottom right).
[0117] Analyzing mRNA-derived RFs. The number of unique individual mRNA transcripts represented by RFs of different RNA Types was compared. The RNA Type 3-specific protocol (Protocol H) detected ~3850 transcripts, whereas the RNA Types 1+2+3+4 protocol (Protocol C) and the RNA Types 1+2 protocol (Protocol A) detected only ~2300 (1.7 times fewer) and ~550 (7 times fewer) mRNA transcripts, respectively. This implies the importance of RNA Type-specific analysis for increasing the sensitivity of detection of the corresponding “parent” RNA transcript classes.
Example 5. Analyzing sequencing profiles of RFs isolated from plasma samples.
[0118] Plasma samples (from Innovative Research) were collected in EDTA-stabilized tubes from 3 healthy donors (H samples) and 3 female patients diagnosed with stage II breast invasive carcinoma before pretreatment (D samples). Total plasma RNA was isolated using the Quick-cfRNA™ Serum & Plasma kit (Zymo Research). Using these plasma RNA samples, we prepared and sequenced RealSeq-RF libraries for all pretreatment protocols (from A to H) described in Table 2, providing specific detection of distinct RNA Types (Fig. 8A). As a benchmark, we also sequenced “Phospho-Seq libraries” prepared with the NEBNext® Small RNA Library Prep kit in combination with the most commonly used pretreatments of plasma RNA samples: PNK in the presence of ATP (“PNK+ATP”) (Giraldez et al. 2019. EMBO J 38: e101695) or PNK in the absence of ATP (“PNK”) (Lecanda et al. 2016. Methods 107: 89-97), which allow analysis of all RNA Types 1+2+3+4 or only Types 1+4, respectively.
[0119] A comparison of these two sequencing data outputs is shown in Fig. 8A, left panel. Neither protocol showed significant differences between H and D samples when considering all RNA classes. However, the analysis of sequencing read profiles for each individual RNA class revealed snRNAs as one of the most differentially abundant RNA classes (largest difference in percentage of reads) between the healthy (H) and disease (D) samples (Fig. 8A, right panel). Although both “PNK+ATP” with NEBNext and RealSeq-RF Protocol C showed similar (but not significant) differences between H and D samples when considering all RNA Types simultaneously, further analysis of RealSeq-RF read profiles for each individual RNA Type allowed the identification of Type 2 snRNA (Protocol B) as providing the best discrimination between the D and H samples (Fig. 8A, right panel).
[0120] Comparison of the snRNA length profiles between D and H samples (Fig. 8B) further enhanced the discrimination between D and H samples. To assess patient-to-patient sample variability, snRNA RF profiles were overlaid. While little variation existed within the H group, differences were observed within the D group. The lack of differences between the D3 sample and the tested H samples could be related to a breast cancer subtype different from that of the D1 and D2 samples, which may point to the specificity of this approach; however, this would require further validation.
[0121] For Phospho-Seq libraries, most RFs were found to be very short (< 20 nt), with very few or no sequences of > 30 nt detected (Fig. 8B, left panels), leading to difficulties in specifically mapping these to different RNA classes. Also, no clear distinction was observed in the distribution of snRNA RF lengths between D and H samples. This makes unambiguous discrimination between D and H samples problematic using the standard Phospho-Seq / NEBNext protocol. In contrast, snRNA RF length profiles for RealSeq-RF Protocol B libraries provided clear discrimination between D and H samples, with the specific presence of long RFs of > 30 nt only in D1 and D2 samples (framed in Fig. 8B, right panels) but not in H samples. Although this was less pronounced in the D3 length profile (Fig. 8A and Fig. 8B, right panels), this could be related to a different breast cancer subtype in the D3 sample (in comparison to D1 and D2). Furthermore, significant differences in the length profiles for 19-24 nt short snRNA fragments between all D samples (including D3) and H samples allowed their clear discrimination with the RealSeq-RF Type 2-specific protocol (Fig. 6B, left panels) but not with the Phospho-Seq method combining all RNA Types (Fig. 6B, right panels). However, especially important was the potential finding of cancer-specific RFs of > 50 nt. Another indication of the cancer-specific origin of snRNA RFs is provided by their localization in the 3’ half of the U2 snRNA sequence (Fig. 8C, right panels), mainly in D1 and D2 samples, exclusively detected by RealSeq-RF Protocol B. U2 snRNA was selected for this report because it provided the largest discrimination between D and H samples within the snRNA class. Also, RFs derived from U2 small nuclear RNA were previously implicated as multi-cancer diagnostic and prognostic biomarkers (Kohler et al. J. Cancer Res. Clin. Oncol. 142: 795-805).
[0122] While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.
Claims
What is claimed is:
1. A method of preparing a sequencing library from a sample comprising a plurality of RNA molecules, wherein the plurality of RNA molecules comprise a combination of ligatable or unligatable 5’ and 3’ ends (end-types), and wherein the plurality of RNA molecules comprise at least one RNA molecule comprising 5’ and 3’ ligatable ends and at least one or more RNA molecules comprising one or two 5’ or 3’ unligatable ends, the method comprising:
a) preparing a sequencing library for all pluralities of RNA molecules comprising ligatable end-types before and / or after a plurality of pretreatments, wherein preparing the sequencing library comprises ligation to sequencing adaptors; and
b) applying a plurality of pretreatments to the plurality of RNA molecules of the sample, wherein the plurality of pretreatments comprises:
(i) converting a first plurality of RNA molecules comprising unligatable end-types to ligatable end-types, and / or
(ii) depleting a first plurality of RNA molecules comprising ligatable 5’ end or 3’ end using the end-specific exonucleases, and / or
(iii) circularizing a second plurality of RNA molecules comprising ligatable end-types to obtain a second plurality of circularized RNA molecules that have no ligatable ends; and / or
(iv) converting a third plurality of RNA molecules comprising unligatable end-types into a third plurality of RNA molecules comprising ligatable end-types; and / or
(v) circularizing a fourth plurality of RNA molecules comprising ligatable end-types to obtain a fourth plurality of circularized RNA molecules; and / or
(vi) converting a fifth plurality of RNA molecules comprising unligatable end-types into a fifth plurality of RNA molecules comprising ligatable end-types; and / or
(vii) circularizing a sixth plurality of RNA molecules comprising ligatable end-types to obtain a sixth plurality of circularized RNA molecules; and / or
(viii) converting a seventh plurality of RNA molecules comprising unligatable end-types into a seventh plurality of ligatable end types; and / or
(ix) optionally, repeating (v) through (vii) for a fifth or more pluralities of RNA molecules comprising unligatable end-types.
2. The method of claim 1, further comprising sequencing only the RNA molecules incorporated into the sequencing library.
3. The method of claim 1 or 2, wherein the RNA molecules comprise small RNAs (sRNA) or RNA fragments (RFs).
4. The method of claim 3, wherein said sRNAs or RFs are 150 nucleotides or less in length.
5. The method of claim 3, wherein said sRNAs or RFs are 50 nucleotides or less in length.
6. The method of any one of claims 1 to 5, wherein 5’ ends comprise 5’-hydroxyl (5’-OH), 5’-Phosphate (5’-P), 5’-triphosphate (5’-ppp) or 5’-cap (e.g., 5’-methylGppp); and wherein the 3’ ends comprise 3’-Phosphate (3’-P), 2’-phosphate (2’-P), 2’,3’-cyclic phosphate (2’,3’>P), 2’-O-methyl (2’-OMe).
7. The method of any one of claims 1 to 5, wherein the ligatable ends are selected from 5’-P, 3’-OH, 5’-OH, 3’-P or 2’,3’>P, or any combination thereof.
8. The method of claim 7, wherein the RNA end-types are defined as RNA Types comprising the following ends: 5’-P and 3’-OH (Type 1); 5’-OH and 3’-OH (Type 2); 5’-OH and 3’-P or 2’,3’>P (Type 3); 5’-P and 3’-P or 2’,3’>P (Type 4).
9. The method of any one of claims 1 to 8, wherein the depleting RNA molecules comprising ligatable 5’-P end and / or 3’-OH end is performed with the end-specific exonucleases selected from: Terminator 5’p-dependent exonuclease, XRN-1 (5’-P end specific) or exonuclease and Exonuclease T (3’-OH end specific).
10. The method of any one of claims 1 to 9, wherein the circularizing is performed with at least one 3’-ligase (ligating 3’-OH with 5’-P ends) selected from: T4 RNA ligase, T4 RNA ligase 1 (Rnl1), T4 RNA ligase 2 (Rnl2), Mth RNA Ligase, CircLigase™ ssDNA ligase, CircLigase™ II ssDNA ligase, CircLigase™ RNA Ligase, Thermostable 5’ AppDNA / RNA ligase, or a combination thereof.
11. The method of any one of claims 1 to 10, wherein the circularizing is performed with at least one 5’-ligase (ligating 5’-OH with 3’-P or 2’,3’>P ends) selected from: RNA-splicing ligase (RtcB), A. thaliana tRNA ligase (AtRNL), tRNA ligase enzyme (Tril), tRNA ligase (Rigl+), or a combination thereof.
12. The method of any one of claims 1 to 11, wherein the circularizing of RNA molecules prevents ligation of circularized RNA molecules with sequencing adapter(s) thereby preventing incorporation of the circularized RNA molecules into the sequencing library.
13. The method of any one of claims 1 to 12, wherein the sequencing library preparation includes only Type 1 or a combination of Type 1 and Type 2 RNA molecules.
14. A method of preparing a sequencing library from a sample comprising a plurality of RNA molecules; wherein the plurality of RNA molecules comprises a first end comprising a 5’-Phosphate (5’-P) end or a 5’-hydroxyl (5’-OH) end and a second end comprising a 3’-Phosphate (3’-P) end, a 2’,3’-cyclic (2’,3’>P) end, or a 3’-hydroxyl (3’-OH) end; the method comprising:
a) separating a composition comprising the plurality of RNA molecules into at least a first partition and a second partition;
b) performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are not the same, wherein the first pretreatment and the second pretreatment are independently selected from:
i. circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends;
ii. converting a plurality of RNA molecules comprising 3’-P end or 2’,3’>P end, to 3’-OH ends;
iii. converting a plurality of RNA molecules comprising 3’-P or 2’,3’>P ends to 3’-OH ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends;
iv. converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and converting a plurality of RNA molecules comprising 5’-OH and 3’-P or 2’,3’>P ends to 5’-P and 3’-P or 2’,3’>P ends, then circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends, and then converting 5’-P and 3’-P or 2’,3’>P ends to 5’-P ends and 3’-OH ends;
v. circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P end, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends;
vi. circularizing a plurality of RNA molecules comprising 5’-OH ends and 3’-P ends or 2’,3’>P ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P ends to 3’-OH ends;
vii. degrading a plurality of RNA molecules comprising 5’-P ends, then converting a plurality of RNA molecules comprising 5’-OH ends to 5’-P ends and circularizing a plurality of RNA molecules comprising 5’-P ends and 3’-OH ends; then converting a plurality of RNA molecules comprising 3’-P ends or 2’,3’>P end, to 3’-OH ends; or
viii. no pretreatment is performed; and
c) preparing a first sequencing library from the first partition and a second sequencing library from the second partition.
15. The method of claim 14, wherein (b)(i) comprises contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
16. The method of claim 14 or 15, wherein (b)(ii) comprises contacting the plurality of RNA molecules with T4 polynucleotide kinase (PNK) in the absence of ATP.
17. The method of any one of claims 14 to 16, wherein (b)(iii) comprises contacting the plurality of RNA molecules with PNK in the absence of ATP then contacting the plurality of RNA molecules with T4 RNA ligase 1 or T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
18. The method of any one of claims 14 to 17, wherein (b)(iv) comprises contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP.
19. The method of any one of claims 14 to 18, wherein (b)(v) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP.
20. The method of any one of claims 14 to 19, wherein (b)(vi) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP.
21. The method of any one of claims 14 to 20, wherein (b)(vii) comprises contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP.
22. The method of any one of claims 14 to 21, wherein contacting the plurality of RNA molecules with the PNK is performed in buffer solutions comprising 2-(N-morpholino)ethanesulfonic acid (MES) or imidazole at pH 5.5-6.5, or tris(hydroxymethyl)aminomethane (TRIS) at pH 7.0-7.5.
23. The method of any one of claims 14 to 22, further comprising separating the composition comprising the plurality of RNA molecules into a third, fourth, fifth, sixth, and / or seventh partition, wherein a third, fourth, fifth, sixth, and / or seventh pretreatment is performed on the third, fourth, fifth, sixth, and / or seventh partition, wherein the third, fourth, fifth, sixth, and / or seventh pretreatments are different from each other and from the first and second pretreatments.
24. The method of any one of claims 14 to 23, wherein preparing the sequencing library comprises ligating a single adaptor to the 5’ or the 3’ end of the plurality of RNA molecules.
25. The method of any one of claims 14 to 24, wherein preparing the sequencing library comprises ligating two adapters, wherein the first adaptor is ligated to a first end and the second adapter is ligated to a second end of the plurality of RNA molecules.
26. The method of claim 24 or 25, wherein preparing a sequence library further comprises circularizing the plurality of ligation products comprising the single adaptor-RNA molecules.
27. The method of any one of claims 24 to 26, wherein preparing a sequence library further comprises reverse transcription of circularized products (RT) followed by PCR amplification of cDNA products of the RT.
28. The method of any one of claims 24 to 27, wherein preparing a sequence library further comprises direct RT-PCR amplification of the circularized products.
29. The method of any one of claims 14 to 28, further comprising sequencing the first sequencing library and the second sequencing library.
30. A method of preparing a sequencing library from a sample comprising a plurality of RNA molecules, wherein the plurality of RNA molecules have ends of Type 1 comprising a combination of 5’-phosphate (5’-P) and 3’-hydroxyl (3’-OH) ends, Type 2 comprising 5’-hydroxyl (5’-OH) and 3’-OH ends, Type 3 comprising 5’-OH and 3’-phosphate (3’-P) or 2’,3’ cyclic phosphate (2’,3’>P) ends, and Type 4 comprising 5’-P and 3’-P or 2’,3’>P ends; the method comprising:
a) separating a composition comprising the plurality of said RNA molecules into at least a first partition and a second partition;
b) performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are not the same, wherein the first pretreatment and the second pretreatment are independently selected from:
i. circularizing Type 1 RNA molecules;
ii. converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules;
iii. converting Type 3 RNA molecules to Type 2 RNA molecules and converting Type 4 RNA molecules to Type 1 RNA molecules, then circularizing Type 1 RNA molecules;
iv. converting Type 2 RNA molecules to Type 1 RNA molecules and converting Type 3 RNA molecules to Type 4 RNA molecules, then circularizing Type 1 RNA molecules, and then converting Type 4 RNA molecules to Type 1 RNA molecules;
v. circularizing Type 3 RNA molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules;
vi. circularizing Type 3 RNA molecules, then converting Type 4 RNA molecules to Type 1 RNA molecules;
vii. degrading Type 1 and Type 4 RNA molecules, then converting Type 2 RNA molecules to Type 1 RNA molecules and circularizing Type 1 RNA molecules; then converting Type 3 RNA molecules to Type 2 RNA molecules; or
viii. no pretreatment; and
c) ligating a plurality of adaptors to the plurality of RNA molecules to produce a plurality of adaptor-RNA molecules.
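The pretreatment options (i) to (viii) of claim 30 can be traced as a small state machine. The following Python sketch is illustrative only and is not part of the claims: it assumes every conversion, circularization, and degradation step goes to completion, and the names `convert`, `circularize`, `degrade`, `PRETREATMENTS`, and `apply_pretreatment` are hypothetical. Each step is transcribed from the wording of claim 30 rather than from an enzymatic model.

```python
# Illustrative sketch only: idealized, complete reactions are assumed.
CIRCULAR = "circular"   # molecule no longer has free ends
DEGRADED = "degraded"   # molecule removed from the pool


def convert(mapping):
    """Convert end Types simultaneously, e.g. {3: 2, 4: 1}."""
    return lambda state: mapping.get(state, state)


def circularize(*types):
    """Circularize molecules currently of the given Types."""
    return lambda state: CIRCULAR if state in types else state


def degrade(*types):
    """Degrade molecules currently of the given Types."""
    return lambda state: DEGRADED if state in types else state


# Steps follow the order recited in claim 30, options i to viii.
PRETREATMENTS = {
    "i": [circularize(1)],
    "ii": [convert({3: 2, 4: 1})],
    "iii": [convert({3: 2, 4: 1}), circularize(1)],
    "iv": [convert({2: 1, 3: 4}), circularize(1), convert({4: 1})],
    "v": [circularize(3), convert({2: 1}), circularize(1)],
    "vi": [circularize(3), convert({4: 1})],
    "vii": [degrade(1, 4), convert({2: 1}), circularize(1), convert({3: 2})],
    "viii": [],
}


def apply_pretreatment(name):
    """Return {original Type: final state} for one pretreatment option."""
    result = {}
    for original in (1, 2, 3, 4):
        state = original
        for step in PRETREATMENTS[name]:
            state = step(state)
        result[original] = state
    return result


if __name__ == "__main__":
    for name in PRETREATMENTS:
        print(name, apply_pretreatment(name))
```

Under these assumptions, option iv leaves only molecules that started as Type 3 or Type 4 in a Type 1 state, while molecules that started as Type 1 or Type 2 are circularized.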
31. The method of claim 30, further comprising separating the composition comprising the plurality of RNA molecules into a third, fourth, fifth, sixth, and / or seventh partition, wherein a third, fourth, fifth, sixth, and / or seventh pretreatment is performed on each corresponding separate partition, wherein each pretreatment is different.
32. The method of any one of claims 30 and 31, wherein (b)(i) comprises contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
33. The method of any one of claims 30 to 32, wherein (b)(ii) comprises contacting the plurality of RNA molecules with PNK in the absence of ATP.
34. The method of any one of claims 30 to 33 wherein (b)(iii) comprises contacting the plurality of RNA molecules with PNK in the absence of ATP then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP.
35. The method of any one of claims 30 to 34, wherein (b)(iv) comprises contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP.
36. The method of any one of claims 30 to 35, wherein (b)(v) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1, and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP.
37. The method of any one of claims 30 to 36, wherein (b)(vi) comprises contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP.
38. The method of any one of claims 30 to 37, wherein (b)(vii) comprises contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP.
39. The method of any one of claims 30 to 38, wherein contacting the plurality of RNA molecules with the PNK is performed in a buffer solution comprising MES or imidazole at pH 5.5-6.5.
40. The method of any one of claims 30 to 39, wherein the method further comprises sequencing the first sequencing library and the second sequencing library to identify and quantify at least one Type of RNA molecules or combinations thereof.
41. The method of claim 40, further comprising comparing the relative quantities of the same Type or different Types of RNA molecules in the first sequencing library and the second sequencing library.
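The comparison recited in claim 41 can be sketched as follows. The claims do not specify a normalization or a comparison statistic; this Python sketch assumes counts-per-million scaling and a log2 ratio with a pseudocount, and the function names and example counts are hypothetical.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    return {name: 1e6 * n / total for name, n in counts.items()}


def log2_ratios(lib_a, lib_b, pseudocount=1.0):
    """log2(CPM in library A / CPM in library B) for every RNA seen in either.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for RNAs that are absent from one library.
    """
    cpm_a = counts_per_million(lib_a)
    cpm_b = counts_per_million(lib_b)
    names = sorted(set(cpm_a) | set(cpm_b))
    return {
        name: math.log2(
            (cpm_a.get(name, 0.0) + pseudocount)
            / (cpm_b.get(name, 0.0) + pseudocount)
        )
        for name in names
    }


if __name__ == "__main__":
    # Hypothetical read counts from two differently pretreated partitions.
    first_library = {"rna_x": 900, "rna_y": 100}
    second_library = {"rna_x": 100, "rna_y": 900}
    for name, ratio in log2_ratios(first_library, second_library).items():
        print(f"{name}: {ratio:+.2f}")
```

A positive ratio indicates relative enrichment in the first library and a negative ratio indicates relative enrichment in the second.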
42. The method of any one of claims 30 to 41, wherein 5’ ends comprise 5’-OH, 5’-P, 5’-triphosphate (5’-ppp), or 5’-cap (e.g., 5’-mGppp).
43. The method of any one of claims 30 to 42, wherein 3’ ends comprise 3’-P, 2’-phosphate (2’-P) or 2’,3’>P.
44. The method of any one of claims 30 to 43, wherein 3’ ends comprise 3’-OH and 2’-hydroxyl (2’-OH) or 3’-OH and 2’-O-methyl (2’-OMe).
45. A method of preparing a sequencing library from a sample comprising a plurality of RNA molecules, the method comprising:
a) separating a composition comprising the plurality of RNA molecules into at least a first partition and a second partition;
b) performing a first pretreatment on the first partition and a second pretreatment on the second partition, wherein the first pretreatment and the second pretreatment are independently selected from:
(i) contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP;
(ii) contacting the plurality of RNA molecules with PNK in the absence of ATP;
(iii) contacting the plurality of RNA molecules with PNK in the absence of ATP, then contacting the plurality of RNA molecules with T4 RNA ligase 1 (Rnl1) and T4 RNA ligase 2 (Rnl2) in the presence of ATP;
(iv) contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP;
(v) contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with PNK in the absence of ATP;
(vi) contacting the plurality of RNA molecules with RtcB ligase, then contacting the plurality of RNA molecules with PNK in the absence of ATP;
(vii) contacting the plurality of RNA molecules with Terminator 5’-Phosphate-Dependent exonuclease, then contacting the plurality of RNA molecules with PNK (3’ phosphatase minus) in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase 1 and T4 RNA ligase 2 in the presence of ATP; then contacting the plurality of RNA molecules with T4 RNA ligase in the absence of ATP; or
(viii) no pretreatment(s); and
c) ligating a plurality of adaptors to the plurality of RNA molecules in the first partition and to the plurality of RNA molecules in the second partition to produce a plurality of adaptor-RNA molecules.
46. The method of claim 45, wherein contacting the plurality of RNA molecules with the PNK is performed in solutions comprising a salt of MES or Imidazole buffer at pH value between 5.5 and 6.5.
47. The method of claim 45 or 46, further comprising separating the composition comprising the plurality of RNA molecules into a third, fourth, fifth, sixth, and / or seventh partition, wherein a third, fourth, fifth, sixth, and / or seventh pretreatment is performed on the corresponding separate partition, wherein each pretreatment is different.
48. A method of pretreating a plurality of RNA molecules comprising all possible combinations of a 5’-P end, a 5’-OH end, a 3’-OH end, a 3’-P end, and a 2’,3’>P end, the method comprising: a) converting the 3’-P ends and the 2’,3’>P ends to 3’-OH ends by contacting the plurality of RNA molecules with a PNK in a buffer solution at a pH between 5.5 and 6.5; and b) converting the 5’-OH ends to 5’-P ends by contacting the plurality of RNA molecules with PNK in the presence of ATP and a buffer at a pH between 7.0 and 7.5.
49. The method of claim 48, wherein the buffer solution comprises a MES buffer at pH 6.0.
50. The method of claim 48, wherein the PNK is heat-inactivated at 65°C-85°C in the presence of citric acid at pH 6, wherein both the chelation of Mg2+ cations by citrate anions and the pH of 6 protect the RNA from degradation at the indicated temperatures.
51. The method of claim 48, wherein a sequencing adaptor is ligated to each of 3’ ends of the plurality of RNA molecules after step (a) and before step (b).
52. The method of claim 48, wherein a sequencing adaptor is ligated to each of 3’ ends of the plurality of RNA molecules after step (b).
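The two-step conversion of claim 48 can be checked over every end combination with a short enumeration. This Python sketch is illustrative only: it assumes both reactions go to completion, and the function names are hypothetical.

```python
from itertools import product

FIVE_PRIME_ENDS = ("5'-P", "5'-OH")
THREE_PRIME_ENDS = ("3'-OH", "3'-P", "2',3'>P")


def step_a(five, three):
    """Step (a): 3'-P and 2',3'>P ends become 3'-OH (PNK, pH 5.5-6.5)."""
    if three in ("3'-P", "2',3'>P"):
        three = "3'-OH"
    return five, three


def step_b(five, three):
    """Step (b): 5'-OH ends become 5'-P (PNK with ATP, pH 7.0-7.5)."""
    if five == "5'-OH":
        five = "5'-P"
    return five, three


def pretreat(five, three):
    """Apply step (a) and then step (b), assuming complete conversion."""
    return step_b(*step_a(five, three))


if __name__ == "__main__":
    for five, three in product(FIVE_PRIME_ENDS, THREE_PRIME_ENDS):
        print((five, three), "->", pretreat(five, three))
```

Under the idealized assumption above, all six starting combinations end as 5’-P and 3’-OH.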
53. The method of any one of claims 1 to 52, wherein the method allows identification of one or more RNA Types for any RNA class of interest.
54. The method of claim 53, wherein the RNA class is selected from: microRNAs (miRNA), endogenous small interfering RNAs (esiRNA), Piwi-interacting RNAs (piRNA), small nuclear RNAs (snRNA), small nucleolar RNAs (snoRNA), molecules derived from mRNA transcripts (smRNA, scRNA, sutRNA, sinRNA) and other small genome-encoded RNAs (sgmRNA), long non-coding RNAs (lncRNA), transfer RNA (tRNA), ribosomal RNA (rRNA) and Y RNA, or a combination thereof.
55. The method of any one of claims 1 to 54, wherein deep sequencing of the plurality of sequencing libraries comprising Type 1, Type 2, Type 3, or Type 4 RNA simultaneously allows identification of specific RNA classes as biomarker candidates.
56. The method of any one of claims 1 to 55, wherein the method determines if an RNA molecule is a Type 1, Type 2, Type 3, or Type 4 RNA molecule.
57. The method of any one of claims 1 to 56, wherein the sequencing libraries prepared for different RNA Types allow identification of the specific RNA Type(s) and RNA class(es) providing the most sensitive and specific detection of RNA biomarkers.
58. The method of any one of claims 1 to 57, wherein the length of an identified RNA sequence is within a range of 15 to 150 nucleotides in the sequencing reads.
59. A kit for preparing sequencing libraries comprising sequences of all RNA Types and / or specific RNA Types or different combinations of the specific RNA Types from a plurality of RNA molecules from a sample, the kit comprising: a) a universal or RNA Type-specific sequencing adapter or adapters; and b) a sequencing library preparation kit.
60. The kit of claim 59, further comprising a pool of control (spike-in) RNA molecules, the RNA molecules comprising: (i) a plurality of 5’ and 3’ end combinations, the end combinations comprising Type 1 with 5’-P and 3’-OH ends; Type 2 with 5’-OH and 3’-OH ends; Type 3 with 5’-OH and 3’-P or 2’,3’>P ends; and Type 4 with 5’-P and 3’-P or 2’,3’>P ends; (ii) internal bar-code nucleotide sequences corresponding to and distinguishing between Type 1, Type 2, Type 3 and Type 4; (iii) a randomized nucleotide sequence at the first end; and (iv) a randomized nucleotide sequence at the second end.
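The spike-in layout of claim 60 (randomized end, internal bar-code, randomized end) can be sketched as below. This Python sketch is illustrative only: the bar-code sequences, the random-region length, and all function names are hypothetical placeholders and are not taken from the Sequence Listing.

```python
import random

# Hypothetical bar-codes, one per end Type; not sequences from the application.
BARCODES = {1: "ACGUAC", 2: "CGAUGU", 3: "GUCAUG", 4: "UAGCCA"}

# End chemistry per Type, as recited in claim 60(i).
END_CHEMISTRY = {
    1: ("5'-P", "3'-OH"),
    2: ("5'-OH", "3'-OH"),
    3: ("5'-OH", "3'-P or 2',3'>P"),
    4: ("5'-P", "3'-P or 2',3'>P"),
}


def random_rna(length, rng):
    """Return a random RNA string of the given length."""
    return "".join(rng.choice("ACGU") for _ in range(length))


def make_spike_in(rna_type, n_random=4, rng=None):
    """Build one control RNA: random end + Type bar-code + random end."""
    rng = rng or random.Random()
    five, three = END_CHEMISTRY[rna_type]
    sequence = (
        random_rna(n_random, rng) + BARCODES[rna_type] + random_rna(n_random, rng)
    )
    return {
        "type": rna_type,
        "five_prime": five,
        "three_prime": three,
        "sequence": sequence,
    }


def classify_by_barcode(sequence, n_random=4):
    """Recover the Type of a spike-in read from its internal bar-code."""
    core = sequence[n_random:len(sequence) - n_random]
    for rna_type, barcode in BARCODES.items():
        if core == barcode:
            return rna_type
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    for rna_type in sorted(BARCODES):
        spike = make_spike_in(rna_type, rng=rng)
        print(spike, "->", classify_by_barcode(spike["sequence"]))
```

Because the end chemistry is lost once a molecule is converted to cDNA, the internal bar-code is what lets a read be traced back to the Type of the control it came from.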
61. The kit of claim 60, further comprising one or more of: a PNK (wild type and 3’ phosphatase minus mutant), a T4 RNA ligase 1, a T4 RNA ligase 2, an RtcB ligase, or a Terminator 5’-Phosphate-Dependent exonuclease.
62. The kit of any one of claims 59 to 61, further comprising stock solutions of one or more of: ATP; standard ligase and PNK buffers, MES, and citric acid.