Disrupted homopolymer capture probes and uses thereof

Capture probes with non-sequential nucleotide sequences improve the efficiency of target nucleic acid capture and conversion in spatial transcriptomic workflows, facilitating high-resolution spatial transcriptomic analysis.

WO2026142970A1 · PCT designated stage · Publication Date: 2026-07-02 · ILLUMINA INC

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: ILLUMINA INC
Filing Date: 2025-12-19
Publication Date: 2026-07-02

Application Information

IPC: C12Q1/68; C12Q1/6806; C12Q1/6869; C12Q1/6837; C12Q1/6858; C12N15/10

Abstract

The disclosure provides methods, compositions, and kits for capturing, amplifying, and sequencing target nucleic acids from tissue samples, e.g., frozen or FFPE tissue samples. Provided herein are capture probes comprising at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid.

Description

DISRUPTED HOMOPOLYMER CAPTURE PROBES AND USES THEREOF

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority benefit of U.S. Provisional Application No. 63/738,990, filed December 26, 2024, which is hereby incorporated by reference in its entirety.

REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY

[0002] The content of the electronically submitted sequence listing (Name: 4213_016PC01_SequenceListing_ST26.xml; Size: 15,256 bytes; and Date of Creation: December 16, 2025), filed with the application, is incorporated herein by reference in its entirety.

BACKGROUND

[0003] Next generation sequencing technology provides increasingly high sequencing speed, allowing greater sequencing depth. Spatial transcriptomics enables highly multiplexed, spatially located gene expression analysis from fresh frozen and formalin-fixed paraffin-embedded (FFPE) tissue samples. In order to generate spatial sequencing libraries, an on-surface library preparation method must be used to spatially capture and barcode transcripts from a tissue sample. Sequencing libraries must also include other sequences, such as unique molecular identifiers (UMIs), spatial barcode sequences, and/or sample indices, while maintaining an optimal length for sequencing. Current spatial workflows require fragmentation to generate libraries of optimal fragment size for sequencing and contain UMI and spatial information on a barcoded surface. Current on-market spatial workflows capture and convert <1% of the mRNA within a tissue section. Accordingly, there is a need for compositions and methods with improved capture and conversion efficiency of target nucleic acids for spatial transcriptomic workflows.

SUMMARY

[0004] The present disclosure is directed to capture probes, solid supports, kits, and methods for capturing, amplifying, and sequencing target nucleic acids from tissue samples, e.g., frozen or FFPE tissue samples.

[0005] In one aspect, the present disclosure provides a capture probe comprising a first primer binding sequence and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid.

[0006] In another aspect, the present disclosure provides a capture probe comprising a first primer binding sequence, a spatial barcode, and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid.

[0007] In some aspects, the capture region comprises less than ten non-sequential nucleotide sequences.

[0008] In some aspects, the capture region comprises two to six non-sequential nucleotide sequences.

[0009] In some aspects, each of the non-sequential nucleotide sequences is separated by an intervening nucleotide or intervening nucleotide sequence. In some aspects, the intervening nucleotide and/or intervening nucleotide sequence comprises at least one nucleotide that is not complementary to the homopolymer sequence.

[0010] In some aspects, each of the at least two non-sequential nucleotide sequences is between 2 and 10 bases in length.

[0011] In some aspects, each of the at least two non-sequential nucleotide sequences is between 2 and 8 bases in length.

[0012] In some aspects, each of the at least two non-sequential nucleotide sequences is between 3 and 5 bases in length.

[0013] In some aspects, the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid.

[0014] In some aspects, the 3’ end of the capture region comprises at least four nucleotides complementary to the homopolymer sequence of the target nucleic acid.
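The capture region layout described in the preceding paragraphs can be illustrated with a minimal Python sketch. It is not taken from the application: the function names, the block lengths, and the intervening bases are hypothetical, and the checks encode only some of the "in some aspects" options above (fewer than ten non-sequential sequences, each 2 to 10 bases long, at least one non-complementary intervening nucleotide, and at least four complementary nucleotides at the 3’ end), assuming a poly-A target so that the complementary base is T.

```python
def build_capture_region(block_lengths, interveners, base="T"):
    """Join runs of `base` with intervening sequences, written 5' to 3'.

    block_lengths and interveners are hypothetical illustration inputs.
    """
    if len(interveners) != len(block_lengths) - 1:
        raise ValueError("need exactly one intervening sequence between adjacent blocks")
    parts = []
    for i, n in enumerate(block_lengths):
        parts.append(base * n)
        if i < len(interveners):
            parts.append(interveners[i])
    return "".join(parts)


def check_capture_region(block_lengths, interveners, base="T"):
    """Return the list of violated example constraints (empty if none)."""
    problems = []
    if len(block_lengths) < 2:
        problems.append("fewer than two non-sequential sequences")
    if len(block_lengths) >= 10:
        problems.append("ten or more non-sequential sequences")
    if any(not 2 <= n <= 10 for n in block_lengths):
        problems.append("block length outside 2 to 10 bases")
    # An intervening sequence must contain at least one base other than `base`.
    if any(set(s) <= {base} for s in interveners):
        problems.append("intervening sequence lacks a non-complementary nucleotide")
    if block_lengths and block_lengths[-1] < 4:
        problems.append("3' end has fewer than four complementary nucleotides")
    return problems


region = build_capture_region([4, 4, 4, 4, 4], ["G", "C", "G", "C"])
print(region)
print(check_capture_region([4, 4, 4, 4, 4], ["G", "C", "G", "C"]))
```

In this sketch a design with five 4-base T blocks and four single-base interveners passes every check, whereas shortening the last block to two bases would be flagged for the 3’ end.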

[0015] In some aspects, the capture region comprises locked nucleic acids (LNAs), bis-locked nucleic acids (bisLNAs), twisted intercalating nucleic acids (TINAs), bridged nucleic acids (BNAs), 2’-O-methyl RNA:DNA chimeric nucleic acids, minor groove binder (MGB) nucleic acids, morpholino nucleic acids, C5-modified pyrimidine nucleic acids, peptide nucleic acids (PNAs), phosphorothioate nucleic acids, or combinations thereof. In some aspects, the capture region comprises locked nucleic acids (LNAs) or 2’-O-methyl RNA:DNA chimeric nucleic acids.

[0016] In some aspects, the intervening nucleotide and/or intervening nucleotide sequence comprises a natural base. In some aspects, the natural base is a deoxythymidine, deoxyadenosine, deoxyguanosine, deoxycytidine, or deoxyuridine.

[0017] In some aspects, the intervening nucleotide and/or intervening nucleotide sequence comprises an unnatural base. In some aspects, the unnatural base is a 2’-deoxyinosine, isoguanine, 3-nitropyrrole, 5-nitroindole, or isocytosine.

[0018] In some aspects, the capture probe further comprises an index sequence.

[0019] In some aspects, the first primer binding sequence is a first sequencing primer binding sequence or a first decoding primer binding sequence.

[0020] In some aspects, the target nucleic acid is a messenger RNA (mRNA).

[0021] In some aspects, the homopolymer sequence is a poly-A sequence. In some aspects, the poly-A sequence is incorporated into the target nucleic acid using poly(A) polymerase or terminal deoxynucleotidyl transferase (TdT).

[0022] In some aspects, the homopolymer sequence is a poly-I sequence. In some aspects, the poly-I sequence is incorporated into the target nucleic acid using a polymerase.

[0023] In some aspects, the homopolymer sequence is between 3 and 50 bases in length.

[0024] In some aspects, the homopolymer sequence is greater than 50 bases in length.

[0025] In some aspects, each of the non-sequential nucleotide sequences comprises a plurality of deoxythymidines.

[0026] In some aspects, each of the non-sequential nucleotide sequences comprises a plurality of deoxyadenosines, deoxycytidines, deoxyuridines, or a combination thereof.

[0027] In some aspects, the intervening nucleotide and/or intervening nucleotide sequence does not comprise a deoxythymidine.
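To make the pairing behavior concrete, the sketch below marks which positions of a capture region would pair with a homopolymer target and which would form mismatches. It is an illustration only: it assumes simple Watson-Crick pairing against an unmodified homopolymer (so it does not model poly-I targets, unnatural bases, or modified backbones), and the function names and the example sequence are hypothetical.

```python
# Watson-Crick partner of each target base (assumption: natural DNA bases only).
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pairing_profile(capture_region, homopolymer_base="A"):
    """Return '|' for positions that pair with the homopolymer and 'x' for mismatches."""
    partner = WATSON_CRICK[homopolymer_base]
    return "".join("|" if b == partner else "x" for b in capture_region)


def matched_runs(capture_region, homopolymer_base="A"):
    """Lengths of the uninterrupted complementary runs, in 5' to 3' order."""
    profile = pairing_profile(capture_region, homopolymer_base)
    return [len(run) for run in profile.split("x") if run]


example = "TTTTGTTTTCTTTT"  # hypothetical capture region against a poly-A tail
print(pairing_profile(example))
print(matched_runs(example))
```

For a poly-A target, every intervening base other than T appears as an `x`, so the run lengths returned by `matched_runs` correspond to the lengths of the non-sequential nucleotide sequences.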

[0028] In some aspects, the target nucleic acid is DNA. In some aspects, the homopolymer sequence is a poly-A, poly-T, poly-G, or poly-C sequence. In some aspects, the homopolymer sequence is incorporated into the DNA using TdT.

[0029] In some aspects, the capture probe comprises, from 5’ to 3’, the first primer binding sequence and the capture region. In some aspects, the first primer binding sequence is a first sequencing primer binding sequence.

[0030] In some aspects, the capture probe comprises, from 5’ to 3’, the spatial barcode, the first primer binding sequence, and the capture region. In some aspects, the first primer binding sequence is a decoding primer binding sequence.
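The two 5’ to 3’ element orders described above can be sketched as a small data structure. This is an assumption-laden illustration rather than the applicant's design: the class name, field names, and placeholder sequences are hypothetical, and optional elements mentioned elsewhere in the disclosure (index sequence, UMI, cleavable site) are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CaptureProbe:
    """Hypothetical container for capture probe elements, written 5' to 3'."""

    primer_binding_sequence: str
    capture_region: str
    spatial_barcode: Optional[str] = None  # present only in the barcoded layout

    def elements(self) -> List[Tuple[str, str]]:
        """Ordered (name, sequence) pairs from the 5' end to the 3' end."""
        parts = []
        if self.spatial_barcode is not None:
            parts.append(("spatial_barcode", self.spatial_barcode))
        parts.append(("primer_binding_sequence", self.primer_binding_sequence))
        parts.append(("capture_region", self.capture_region))
        return parts

    def sequence(self) -> str:
        """Full probe sequence, 5' to 3'."""
        return "".join(seq for _, seq in self.elements())


# Placeholder sequences, not taken from the sequence listing.
plain = CaptureProbe("ACGTACGT", "TTTTGTTTT")
barcoded = CaptureProbe("ACGTACGT", "TTTTGTTTT", spatial_barcode="GGCC")
print(plain.sequence())
print(barcoded.sequence())
```

Keeping the capture region as the last element reflects the statement that the 3’ end of the probe is what hybridizes to the homopolymer and is extended.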

[0031] In some aspects, the capture probe further comprises a cleavable site. In some aspects, the cleavable site comprises a chemically cleavable moiety or an enzymatically cleavable moiety. In some aspects, the enzymatically cleavable moiety comprises a restriction endonuclease recognition site.

[0032] In some aspects, the capture probe further comprises a spatial barcode.

[0033] In some aspects, the capture probe further comprises a unique molecular identifier (UMI).

[0034] In another aspect, provided herein is a solid support comprising a plurality of immobilized capture probes, wherein each capture probe of the plurality comprises a capture probe described herein. In some aspects, each capture probe is immobilized to the solid support at a 5’ end.

[0035] In some aspects, the solid support further comprises a plurality of immobilized spatial probes. In some aspects, each spatial probe of the plurality of immobilized spatial probes comprises a second primer binding sequence, a spatial barcode, and a probe sequence. In some aspects, each spatial probe is immobilized to the solid support at a 5’ end. In some aspects, each spatial probe is immobilized to the solid support at a 3’ end.

[0036] In some aspects, each spatial probe further comprises an index sequence, a molecular identifier, or a combination thereof. In some aspects, the molecular identifier is a unique molecular identifier.

[0037] In some aspects, the probe sequence is identical to a portion of the capture region of each immobilized capture probe. In some aspects, the first primer binding sequence of the capture probe is a first sequencing primer binding sequence, and wherein the portion of the capture region that is identical to the probe sequence is adjacent to the first sequencing primer binding sequence of the capture probe. In some aspects, the portion of the capture region that is identical to the probe sequence is hybridized to a blocking element.

[0038] In some aspects, the probe sequence is hybridized to a blocking element.

[0039] In some aspects, the probe sequence is complementary to the reverse complement of a template switch oligo sequence.

[0040] In some aspects, each spatial probe of the plurality of immobilized spatial probes comprises, from 5’ to 3’, the second primer binding sequence, the spatial barcode, and the probe sequence. In some aspects, the second primer binding sequence is a second decoding primer binding sequence. In some aspects, each spatial probe of the plurality of immobilized spatial probes comprises, from 5’ to 3’, the probe sequence, the spatial barcode, and the second primer binding sequence. In some aspects, the second primer binding sequence is a second decoding primer binding sequence.

[0041] In some aspects, the solid support is a bead array, a spotted array, a flow cell, clustered particles arranged on a surface of a chip, a film, or a plate.

[0042] In another aspect, provided herein is a solid support comprising a plurality of capture probes described herein and a plurality of immobilized spatial probes described herein, wherein each capture probe comprises, from 5’ to 3’, the first primer binding sequence and the capture region, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid, wherein each immobilized spatial probe comprises, from 5’ to 3’, the second primer binding sequence, the spatial barcode, and the probe sequence, and wherein each capture probe is attached to the solid support at a 5’ end. In some aspects, the capture probes comprise a first primer binding sequence and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid. In some aspects, each spatial probe is immobilized to the solid support at a 5’ end.

[0043] In another aspect, provided herein is a solid support comprising a plurality of capture probes described herein, and a plurality of immobilized spatial probes described herein, wherein each capture probe comprises, from 5’ to 3’, the first primer binding sequence and the capture region, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid, wherein each immobilized spatial probe comprises, from 5’ to 3’, the probe sequence, the spatial barcode, and the second primer binding sequence, and wherein each capture probe is attached to the solid support at a 5’ end. In some aspects, the capture probes comprise a first primer binding sequence and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid. In some aspects, each spatial probe is immobilized to the solid support at a 3’ end.

[0044] In another aspect, provided herein is a solid support comprising a plurality of capture probes described herein, wherein each capture probe comprises, from 5’ to 3’, the spatial barcode, the first primer binding sequence, and the capture region, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid, and wherein each capture probe is attached to the solid support at a 5’ end. In some aspects, the capture probes comprise a first primer binding sequence, a spatial barcode, and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid.

[0045] In another aspect, provided herein is a kit comprising a solid support described herein. In some aspects, the kit further comprises a template switch oligo (TSO). In some aspects, the kit further comprises a splint oligo. In some aspects, the kit further comprises a blocking oligo.

[0046] In another aspect, provided herein is a method of generating an immobilized complement of a target nucleic acid in a biological sample, the method comprising: a. contacting a solid support described herein with the biological sample comprising a plurality of target nucleic acids; b. hybridizing the capture region of each capture probe to a homopolymeric sequence of a target nucleic acid from the plurality; and c. extending each capture region with a polymerase, thereby generating an immobilized complement of each target nucleic acid.
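The hybridize-and-extend steps of this method can be mimicked with a toy string model. The sketch below is a deliberate simplification and not the disclosed protocol: it assumes the 3’ end of the capture probe anneals exactly at the junction between the transcript body and its poly-A tail, ignores non-templated nucleotide addition, and uses hypothetical function names and sequences.

```python
# DNA base incorporated opposite each RNA template base during reverse transcription.
DNA_COMPLEMENT_OF_RNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the cDNA (5' to 3') complementary to an RNA template given 5' to 3'."""
    return "".join(DNA_COMPLEMENT_OF_RNA[b] for b in reversed(rna))


def first_strand_product(probe, mrna_body):
    """Immobilized complement: the probe extended at its 3' end across the transcript body.

    Assumes the probe's 3' end sits at the body/poly-A junction (toy simplification).
    """
    return probe + reverse_transcribe(mrna_body)


probe = "ACGTTTTTGTTTT"   # hypothetical primer binding sequence + capture region
mrna_body = "AUGGCC"      # hypothetical transcript body upstream of the poly-A tail
print(first_strand_product(probe, mrna_body))
```

Because the probe is attached to the support at its 5’ end, the extension product in this model stays immobilized, which is what allows the target nucleic acid to be removed in later steps without losing the copied sequence.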

[0047] In another aspect, provided herein is a method of generating a plurality of second strand extension products of target nucleic acids of a biological sample, the method comprising:
a. providing a solid support comprising a plurality of immobilized capture probes, wherein each capture probe of the immobilized plurality of capture probes comprises a first primer binding sequence, a spatial barcode, and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, and wherein each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid;
b. contacting the solid support with a biological sample comprising a plurality of target nucleic acids;
c. hybridizing the capture region of each immobilized capture probe to a homopolymeric sequence of a target nucleic acid from the plurality, thereby forming a plurality of hybridized capture probes;
d. extending the capture region of each hybridized capture probe with a polymerase, thereby generating a plurality of immobilized first strand extension products, wherein the extending comprises addition of a plurality of non-templated nucleotides to the end of the immobilized first strand extension products;
e. removing the plurality of target nucleic acids from the solid support;
f. hybridizing a template switch oligonucleotide (TSO) to each immobilized first strand extension product, wherein the TSO is complementary to a plurality of the non-templated nucleotides, and wherein the TSO comprises a second sequencing primer binding sequence, thereby forming a plurality of hybridized TSOs;
g. generating a plurality of second strand extension products using the TSO; and
h. removing the plurality of second strand extension products.
In some aspects, each capture probe of the immobilized plurality of capture probes comprises a second primer binding sequence. In some aspects, the first primer binding sequence is a first sequencing primer binding sequence or a first decoding primer binding sequence, and wherein the second primer binding sequence is a second sequencing primer binding sequence or a second decoding primer binding sequence.

[0048] In another aspect, provided herein is a method of generating a plurality of second strand extension products of a target nucleic acid in a biological sample, the method comprising:
a. providing a solid support comprising a plurality of immobilized capture probes and a plurality of immobilized spatial probes, wherein each capture probe comprises a sequencing primer binding sequence and a capture region, wherein each spatial probe comprises a primer binding sequence, a spatial barcode, and a probe sequence, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, and wherein each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid;
b. contacting the solid support with a biological sample comprising a plurality of target nucleic acids;
c. hybridizing the capture region of each immobilized capture probe to a homopolymeric sequence of a target nucleic acid from the plurality, thereby forming a plurality of hybridized capture probes;
d. extending the capture region of each hybridized capture probe with a polymerase, thereby generating a plurality of immobilized first strand extension products, wherein the extending comprises addition of a plurality of non-templated nucleotides to the end of the immobilized first strand extension products;
e. removing the plurality of target nucleic acids from the solid support;
f. providing a plurality of splint oligonucleotides to the solid support and hybridizing the splint oligonucleotides to each immobilized first strand extension product and immobilized spatial probe to form a splinted complex, wherein the splint oligonucleotide comprises a first region complementary to the plurality of non-templated nucleotides of the first strand extension product and a second region complementary to the probe sequence of the spatial probe, thereby bringing the immobilized first strand extension product and the immobilized spatial probe of the splinted complex into ligatable proximity;
g. ligating the immobilized first strand extension product and the immobilized spatial probe of each splinted complex by enzymatic or chemical ligation, thereby forming a plurality of ligated first strand extension products;
h. hybridizing a primer to each ligated first strand extension product and extending the hybridized primers, thereby generating a plurality of second strand extension products; and
i. removing the plurality of second strand extension products.

[0049] In another aspect, provided herein is a method of generating a plurality of second strand extension products of a target nucleic acid in a biological sample, the method comprising:
a. providing a solid support comprising a plurality of immobilized capture probes and a plurality of immobilized spatial probes, wherein each capture probe comprises a sequencing primer binding sequence and a capture region, wherein each spatial probe comprises a primer binding sequence, a spatial barcode, and a probe sequence, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, and wherein each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid;
b. contacting the solid support with a biological sample comprising a plurality of target nucleic acids;
c. hybridizing the capture region of each immobilized capture probe to a homopolymeric sequence of a target nucleic acid from the plurality, thereby forming a plurality of hybridized capture probes;
d. extending the capture region of each hybridized capture probe with a polymerase, thereby generating a plurality of immobilized first strand extension products, wherein the extending comprises addition of a plurality of non-templated nucleotides to the end of the immobilized first strand extension products;
e. removing the plurality of target nucleic acids from the solid support;
f. hybridizing a template switch oligonucleotide (TSO) to each immobilized first strand extension product, wherein the TSO is complementary to a plurality of the non-templated nucleotides, and wherein the TSO comprises a bait sequence at a 3’ end, thereby forming a plurality of hybridized TSOs;
g. incorporating the complement of the TSO into the 3’ end of the immobilized first strand extension product by template switching, thereby adding a bait sequence complement to the 3’ end of each immobilized first strand extension product;
h. hybridizing the bait sequence complement of each immobilized first strand extension product to the probe sequence of the immobilized spatial probes, and extending the 3’ end of the hybridized first strand extension product, thereby incorporating a complement of the spatial barcode and a primer binding sequence complement into the 3’ end of each immobilized first strand extension product;
i. denaturing the hybridized first strand extension products and spatial probes;
j. hybridizing a primer to the primer binding sequence complement of each immobilized first strand extension product and extending the hybridized primers, thereby generating a plurality of second strand extension products; and
k. removing the plurality of second strand extension products.
In some aspects, after step (g), the method further comprises hybridizing a blocking element to the capture region of the capture probe. In some aspects, the blocking element is complementary to a 5’ portion of the capture region of the capture probe.

[0050] In some aspects, the step of removing the plurality of second strand extension products comprises chemical or enzymatic removal of the second strand extension products. In some aspects, the chemical removal comprises contacting the plurality of second strand extension products with an alkaline solution. In some aspects, the enzymatic removal comprises enzymatic cleavage of a cleavage site, wherein the plurality of second strand extension products comprise the cleavage site at a 5’ end. In some aspects, the cleavage site comprises a restriction enzyme site, a uracil, an 8-oxoguanine, or a combination thereof.

[0051] In some aspects, the bait sequence is identical to a portion of the capture region of the capture probe. In some embodiments, the bait sequence comprises at least 6 nucleotides. In some embodiments, the bait sequence comprises a GC content of about 20% to about 80%.

[0052] A bait sequence can be, or be about, 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or a number or a range between any two of these values, nucleotides in length. A bait sequence can be at least, or be at most, 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 100, 200, or 300 nucleotides in length. GC content of a bait sequence (e.g., a first bait sequence, a second bait sequence) can vary. For example, the GC content of the bait sequence can be, or be about, 0.0%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%, or a number or a range between any two of these values.
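As a worked illustration of the bait sequence length and GC content ranges, the sketch below computes GC content as a percentage and checks a candidate against the example embodiment of at least 6 nucleotides and about 20% to about 80% GC. The function names are hypothetical, the candidate sequences are placeholders, and treating "about 20% to about 80%" as hard inclusive bounds is an assumption made only for the example.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    s = seq.upper()
    return 100.0 * sum(b in "GC" for b in s) / len(s)


def bait_meets_example_criteria(seq, min_len=6, gc_min=20.0, gc_max=80.0):
    """True if the sequence satisfies the example length and GC bounds (assumed inclusive)."""
    return len(seq) >= min_len and gc_min <= gc_content(seq) <= gc_max


for candidate in ["ACGTAC", "AAAAAA", "ACGT"]:  # placeholder sequences
    print(candidate, bait_meets_example_criteria(candidate))
```

The default thresholds are parameters so that any of the other lengths or GC values listed above can be substituted without changing the functions.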

[0053] In some aspects, each spatial probe further comprises an index sequence, a molecular identifier, or a combination thereof. In some aspects, each capture probe further comprises an index sequence, a molecular identifier, a spatial barcode, or a combination thereof. In some aspects, the molecular identifier is a unique molecular identifier. In some aspects, the spatial barcode sequence of the spatial probe and the spatial barcode sequence of the capture probe are different. In some aspects, the spatial barcode sequence of the spatial probe and the spatial barcode sequence of the capture probe are the same.

[0054] In some aspects, the method further comprises amplifying the plurality of second strand extension products, thereby generating a library. In some aspects, generating the library comprises tagmentation or ligation of adapters to the second strand extension products. In some aspects, the method further comprises sequencing the library. In some aspects, sequencing comprises sequencing-by-synthesis, sequencing-by-ligation, or sequencing-by-binding.

[0055] In some aspects, the biological sample comprises a tissue sample. In some aspects, the tissue sample comprises a fresh frozen tissue sample or a formalin-fixed paraffin embedded (FFPE) sample.

[0056] In some aspects, step b) further comprises contacting the sample with a lysis buffer, a permeabilization buffer, and/or a reagent to deparaffinize an FFPE sample.

[0057] In some aspects, the polymerase is a reverse transcriptase. In some aspects, the reverse transcriptase is a highly processive reverse transcriptase.

BRIEF DESCRIPTION OF THE DRAWINGS

[0058] Figures 1A-1B illustrate a standard capture probe (FIG. 1A) and an exemplary capture probe of the present disclosure (FIG. 1B) (also referred to herein as a disrupted homopolymer capture probe). In FIG. 1A, the poly-T sequence of the capture probe is complementary to a poly-A homopolymer sequence in a target mRNA molecule of a biological sample. PBS, primer binding sequence; note the capture probes are not shown in their entirety, as denoted by the double curved lines. As shown in FIG. 1B, the exemplary disrupted homopolymer (DHP) capture probe includes a DHP capture region comprising 4 intervening nucleotides or intervening nucleotide sequences, which are not complementary to the mRNA poly-A region and lead to the formation of mismatches.

[0059] Figures 2A-2F show an exemplary workflow using the capture probes of the present disclosure. FIG. 2A shows a solid support comprising an immobilized capture probe (left) and an immobilized spatial probe (right). The immobilized capture probe comprises, from 5’ to 3’, a primer binding sequence (e.g., sbs12) and a disrupted homopolymer capture sequence (DHP). The immobilized spatial probe comprises, from 5’ to 3’, a first primer binding sequence (e.g., P5), a spatial barcode, a second primer binding sequence (e.g., sbs3) and a probe sequence. The probe sequence of the immobilized spatial probe shares sequence identity with the DHP sequence of the immobilized capture probe. FIG. 2B shows the step of hybridizing a target nucleic acid (e.g., an mRNA molecule from a biological sample), wherein the poly-A tail of the mRNA molecule hybridizes to the DHP sequence of the capture probe. Mismatches in the poly-A tail and DHP sequence duplex lead to disruptions in base pairing, as indicated by the arrows. Polymerase extension (e.g., extension with a reverse transcriptase) of the capture probe is then performed to generate a first cDNA extension product of the captured mRNA. FIGs. 2C and 2D show the process of template switching with a template switch oligo (TSO) including a bait sequence, thereby generating an extended capture probe including a bait sequence complement. Following template switching, a blocking element is hybridized to the DHP sequence of the capture probe, as shown in FIG. 2E, to inhibit hybridization of the probe sequence of the immobilized spatial probe to the DHP sequence of the capture probe. The bait sequence complement then hybridizes to the probe sequence and is extended by a polymerase, as shown in FIG. 2F, to incorporate the spatial barcode and additional primer binding sequences (e.g., P5’ and sbs3’). The extended capture probe may then be removed from the solid support (e.g., by chemical or enzymatic cleavage) and processed for downstream applications, such as sequencing.

[0060] Figure 3 is a diagram showing the generation of libraries with capture probes comprising 20T homopolymer, DHP0, or DHPnaive capture sequences.

[0061] Figure 4 is a graph showing the quality of the libraries created from capture probes comprising 20T homopolymer, DHP0, or DHPnaive capture sequences as assessed by TapeStation.

[0062] Figures 5A-5B are enhanced volcano graphs showing expression of genes from libraries that were generated with capture probes comprising 20T homopolymer, DHP0, or DHPnaive capture sequences. FIG. 5A shows the comparison of gene expression from libraries generated with capture probes comprising a 20T homopolymer capture sequence compared to capture probes comprising a DHP0 capture sequence. FIG. 5B shows the comparison of gene expression from libraries generated with capture probes comprising a 20T homopolymer capture sequence compared to capture probes comprising a DHPnaive capture sequence.

DETAILED DESCRIPTION

[0063] Spatial analysis methodologies and compositions described herein can provide a vast amount of analyte and/or expression data for a variety of analytes within a biological sample at high spatial resolution, while retaining native spatial context. Spatial analysis methods and compositions can include, e.g., the use of a capture probe including a spatial barcode (e.g., a nucleic acid sequence that provides information as to the location or position of an analyte within a cell or a tissue sample (e.g., a mammalian cell or a mammalian tissue sample)) and a capture domain that is capable of binding to an analyte (e.g., a protein and/or a nucleic acid) produced by and/or present in a cell. Spatial analysis methods and compositions can also include the use of a capture probe having a capture domain that captures an intermediate agent for indirect detection of an analyte. For example, the intermediate agent can include a nucleic acid sequence (e.g., a barcode) associated with the intermediate agent. Detection of the intermediate agent is therefore indicative of the analyte in the cell or tissue sample.

I. Definitions

[0064] All publications mentioned herein are incorporated herein by reference in full for the purpose of describing and disclosing the methodologies, which might be used in connection with the description herein. Moreover, with respect to any term that is presented in one or more publications that is similar to, or identical with, a term that has been expressly defined in this disclosure, the definition of the term as expressly provided in this disclosure will control in all respects.

[0065] The practice of the technology described herein will employ, unless indicated specifically to the contrary, conventional methods of chemistry, biochemistry, organic chemistry, molecular biology, bioinformatics, microbiology, recombinant DNA techniques, genetics, immunology, and cell biology that are within the skill of the art, many of which are described below for the purpose of illustration. Examples of such techniques are available in the literature. See, e.g., Singleton et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); and Sambrook and Green, Molecular Cloning: A Laboratory Manual, 4th Edition (2012). Methods, devices and materials similar or equivalent to those described herein can be used in the practice of this invention.

[0066] Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Various scientific dictionaries that include the terms included herein are well known and available to those in the art. Although any methods and materials similar or equivalent to those described herein find use in the practice or testing of the disclosure, some preferred methods and materials are described. Accordingly, the terms defined immediately below are more fully described by reference to the specification as a whole. It is to be understood that this disclosure is not limited to the particular methodology, protocols, and reagents described, as these may vary, depending upon the context in which they are used by those of skill in the art. The following definitions are provided to facilitate understanding of certain terms used frequently herein and are not meant to limit the scope of the present disclosure.

[0067] As used herein, the singular forms “a,” “an” and “the” include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to “a protein” includes a mixture of two or more proteins, and the like.

[0068] Throughout this specification, unless the context requires otherwise, the words “comprise”, “comprises” and “comprising” will be understood to imply the inclusion of a stated step or element or group of steps or elements but not the exclusion of any other step or element or group of steps or elements. By “consisting of” is meant including, and limited to, whatever follows the phrase “consisting of.” Thus, the phrase “consisting of” indicates that the listed elements are required or mandatory, and that no other elements may be present. By “consisting essentially of” is meant including any elements listed after the phrase, and limited to other elements that do not interfere with or contribute to the activity or action specified in the disclosure for the listed elements. Thus, the phrase “consisting essentially of” indicates that the listed elements are required or mandatory, but that other elements are optional and may or may not be present depending upon whether or not they affect the activity or action of the listed elements. As used herein, the terms “includes,” “including,” “contains,” “containing,” “have,” “having,” and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, product-by-process, or composition of matter that includes or contains an element or list of elements does not include only those elements but can include other elements not expressly listed or inherent to such process, method, product-by-process, or composition of matter. Similarly, “comprise,” “comprises,” “comprising,” “include,” “includes,” and “including” are interchangeable and not intended to be limiting.

[0069] Other than in the operating examples, or where otherwise indicated, all numbers expressing quantities of ingredients or reaction conditions used herein should be understood as modified in all instances by the term “about.” The term “about,” when used to describe aspects of the disclosure in connection with percentages, means ±1%, ±2%, ±3%, ±4%, or ±5%. The term “about,” as used herein, can mean within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which can depend in part on how the value is measured or determined, e.g., the limitations of the measurement system. Alternatively, “about” can mean a range of plus or minus 20%, plus or minus 10%, plus or minus 5%, or plus or minus 1% of a given value. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, within 5-fold, or within 2-fold, of a value. Where particular values are described in the application and claims, unless otherwise stated the term “about” meaning within an acceptable error range for the particular value can be assumed. Also, where ranges and/or subranges of values are provided, the ranges and/or subranges can include the endpoints of the ranges and/or subranges. In some cases, variations can include an amount or concentration of 20%, 10%, 5%, 1%, 0.5%, or even 0.1% of the specified amount.

[0070] For the recitation of numeric ranges herein, each intervening number therebetween with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, or 6 to 9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
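By way of illustration only, the range convention above can be sketched in Python. The function name `contemplated_values` and the choice to pass endpoints as strings (so that the stated precision is preserved) are assumptions made for this sketch and are not part of the disclosure.

```python
from decimal import Decimal


def contemplated_values(low, high):
    """List every value from low to high at the precision in which the endpoints are written.

    Endpoints are given as strings (e.g. "6.0") so the number of decimal
    places stated in the range is not lost.
    """
    lo, hi = Decimal(low), Decimal(high)
    # One unit in the last stated decimal place, e.g. 1 for "6" and 0.1 for "6.0".
    exponent = min(lo.as_tuple().exponent, hi.as_tuple().exponent)
    step = Decimal(1).scaleb(exponent)

    values = []
    current = lo
    while current <= hi:
        values.append(str(current.quantize(step)))
        current += step
    return values


print(contemplated_values("6", "9"))
print(contemplated_values("6.0", "7.0"))
```

Decimal arithmetic is used here rather than binary floating point so that repeated addition of 0.1 does not drift away from the stated precision.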

[0071] As used herein, the term “complementary” when used in reference to a polynucleotide is intended to mean a polynucleotide that includes a nucleotide sequence capable of selectively annealing to an identifying region of a target polynucleotide under certain conditions, e.g., a first oligonucleotide sequence can form a double-stranded structure by matching base-pairs with a second oligonucleotide sequence or portion thereof. In various embodiments, “complementary” oligonucleotides are 100% complementary to each other, while in other embodiments, a first oligonucleotide sequence is at least (meaning greater than or equal to) about 95% complementary to a second oligonucleotide sequence over the length of the first oligonucleotide, at least about 90%, at least about 85%, at least about 80%, at least about 75%, at least about 70%, at least about 65%, at least about 60%, at least about 55%, or at least about 50% complementary to the second oligonucleotide over the length of the first oligonucleotide to the extent that the oligonucleotides are able to hybridize to each other under the conditions being utilized. The percent complementarity is determined over the length of the oligonucleotide. For example, given a first oligonucleotide in which 18 of 20 nucleotides of the first oligonucleotide are complementary to a 20-nucleotide region in a second oligonucleotide of 100 nucleotides total length, the oligonucleotides would be 90 percent complementary. In this example, the remaining noncomplementary nucleotides may be clustered or interspersed with complementary nucleobases and need not be contiguous to each other or to complementary nucleotides. As used herein, the term “substantially complementary” and grammatical equivalents is intended to mean a polynucleotide that includes a nucleotide sequence capable of specifically annealing to an identifying region of a target polynucleotide under certain conditions. 
Annealing refers to the nucleotide base-pairing interaction of one nucleic acid with another nucleic acid that results in the formation of a duplex, triplex, or other higher-ordered structure. The primary interaction is typically nucleotide base-specific, e.g., A:T, A:U, and G:C, by Watson-Crick and Hoogsteen-type hydrogen bonding. In certain embodiments, base-stacking and hydrophobic interactions can also contribute to duplex stability. Conditions under which a polynucleotide anneals to complementary or substantially complementary regions of target nucleic acids are well known in the art, e.g., as described in Nucleic Acid Hybridization, A Practical Approach, Hames and Higgins, eds., IRL Press, Washington, D.C. (1985) and Wetmur and Davidson, Mol. Biol. 31:349 (1968). Annealing conditions will depend upon the particular application and can be routinely determined by persons skilled in the art, without undue experimentation.
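As a non-limiting illustration of the percent complementarity calculation described above, the following Python sketch counts Watson-Crick pairs between a first oligonucleotide and an equal-length region of a second oligonucleotide, aligned antiparallel. The function name `percent_complementarity`, the restriction to DNA bases, and the example sequences are assumptions made for this sketch only.

```python
# Watson-Crick partners for DNA bases (assumption: unmodified A, C, G, T only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(first, target_region):
    """Percent of positions in `first` that pair with `target_region`.

    Both sequences are written 5' to 3'. The strands are aligned antiparallel,
    so position i of `first` is compared with position i of the reversed region.
    The percentage is taken over the length of the first oligonucleotide.
    """
    if len(first) != len(target_region):
        raise ValueError("sequences must be the same length for this sketch")
    paired = sum(
        1
        for base, partner in zip(first.upper(), reversed(target_region.upper()))
        if COMPLEMENT.get(base) == partner
    )
    return 100.0 * paired / len(first)


# A 20-nucleotide poly(T) oligonucleotide against a 20-nucleotide region that
# contains 18 adenines and 2 guanines: 18 of 20 positions pair.
print(percent_complementarity("T" * 20, "A" * 18 + "GG"))  # 90.0
```

In this sketch the two mismatched positions are adjacent, but, consistent with the paragraph above, the result would be the same if they were interspersed among the paired positions.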

[0072] As used herein, the term “dNTP” refers to deoxynucleoside triphosphates. NTP refers to ribonucleotide triphosphates. The purine bases (Pu) include adenine (A), guanine (G) and derivatives and analogs thereof. The pyrimidine bases (Py) include cytosine (C), thymine (T), uracil (U) and derivatives and analogs thereof. Examples of such derivatives or analogs, by way of illustration and not limitation, are those which are modified with a reporter group, biotinylated, amine modified, radiolabeled, alkylated, and the like and also include phosphorothioate, phosphite, ring atom modified derivatives, and the like. The reporter group can be a fluorescent group such as fluorescein, a chemiluminescent group such as luminol, a terbium chelator such as N-(hydroxyethyl)ethylenediaminetriacetic acid that is capable of detection by delayed fluorescence, and the like.

[0073] “Hybridize” shall mean the annealing of a nucleic acid sequence to another nucleic acid sequence (e.g., one single-stranded nucleic acid (such as a primer) to another nucleic acid) based on the well-understood principle of sequence complementarity. In an aspect, the other nucleic acid is a single-stranded nucleic acid. In some aspects, one portion of a nucleic acid hybridizes to itself, such as in the formation of a hairpin structure. The propensity for hybridization between nucleic acids depends on the temperature and ionic strength of their milieu, the length of the nucleic acids and the degree of complementarity. The effect of these parameters on hybridization is described in, for example, Sambrook J., Fritsch E. F., Maniatis T., Molecular cloning: a laboratory manual, Cold Spring Harbor Laboratory Press, New York (1989). As used herein, hybridization of a primer, or of a DNA extension product, respectively, is extendable by creation of a phosphodiester bond with an available nucleotide or nucleotide analogue capable of forming a phosphodiester bond, therewith. For example, hybridization can be performed at a temperature ranging from 15° C. to 95° C. In some aspects, the hybridization is performed at a temperature of about 20° C., about 25° C., about 30° C., about 35° C., about 40° C., about 45° C., about 50° C., about 55° C., about 60° C., about 65° C., about 70° C., about 75° C., about 80° C., about 85° C., about 90° C., or about 95° C. In other aspects, the stringency of the hybridization can be further altered by the addition or removal of components of the buffered solution.

[0074] As used herein, the term "each," when used in reference to a collection of items, is intended to identify an individual item in the collection but does not necessarily refer to every item in the collection unless the context clearly dictates otherwise.

[0075] The terms “upstream” and “5'-of” with reference to positions in a nucleic acid sequence are used interchangeably to refer to a relative position in the nucleic acid sequence that is further towards the 5' end of the sequence.

[0076] The terms “downstream” and “3'-of” with reference to positions in a nucleic acid sequence are used interchangeably to refer to a relative position in the nucleic acid sequence that is further towards the 3' end of the sequence.

[0077] As used herein, the terms “ligation,” “ligating,” and grammatical equivalents thereof are intended to mean to form a covalent bond or linkage between the termini of two or more nucleic acids, e.g., oligonucleotides and/or polynucleotides, typically in a template-driven reaction. The nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically or chemically. As used herein, ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5’ carbon terminal nucleotide of one oligonucleotide with a 3’ carbon of another nucleotide. Template-driven ligation reactions are described in the following references: U.S. Patent Nos. 4,883,750; 5,476,930; 5,593,826; and 5,871,921, incorporated herein by reference in their entireties. The term “ligation” also encompasses non-enzymatic formation of phosphodiester bonds, as well as the formation of non-phosphodiester covalent bonds between the ends of oligonucleotides, such as phosphorothioate bonds, disulfide bonds, and the like.

[0078] As used herein, “specifically hybridizes” refers to preferential hybridization under hybridization conditions where two nucleic acids, or portions thereof, that are substantially complementary, hybridize to each other and not to other nucleic acids that are not substantially complementary to either of the two nucleic acids. For example, specific hybridization includes the hybridization of a primer or capture nucleic acid to a portion of a target nucleic acid (e.g., a template, or adapter portion of a template) that is substantially complementary to the primer or capture nucleic acid. In some aspects nucleic acids, or portions thereof, that are configured to specifically hybridize are often about 80% or more, 81% or more, 82% or more, 83% or more, 84% or more, 85% or more, 86% or more, 87% or more, 88% or more, 89% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, 99% or more or 100% complementary to each other over a contiguous portion of nucleic acid sequence. A specific hybridization discriminates over non-specific hybridization interactions (e.g., two nucleic acids that are not configured to specifically hybridize, e.g., two nucleic acids that are 80% or less, 70% or less, 60% or less or 50% or less complementary) by about 2-fold or more, often about 10-fold or more, and sometimes about 100-fold or more, 1000-fold or more, 10,000-fold or more, 100,000-fold or more, or 1,000,000-fold or more. Two nucleic acid strands that are hybridized to each other can form a duplex which comprises a double stranded portion of nucleic acid.

[0079] As may be used herein, the terms “nucleic acid,” “nucleic acid molecule,” “nucleic acid sequence,” “strand,” “nucleic acid fragment” and “polynucleotide” are used interchangeably and are intended to include, but are not limited to, a polymeric form of nucleotides covalently linked together that may have various lengths, either deoxyribonucleotides or ribonucleotides, or analogs, derivatives or modifications thereof. Different polynucleotides may have different three-dimensional structures, and may perform various functions, known or unknown. Non-limiting examples of polynucleotides include a gene, a gene fragment, an exon, an intron, intergenic DNA (including, without limitation, heterochromatic DNA), messenger RNA (mRNA), transfer RNA, ribosomal RNA, a ribozyme, cDNA, a recombinant polynucleotide, a branched polynucleotide, a plasmid, a vector, isolated DNA of a sequence, isolated RNA of a sequence, a nucleic acid probe, and a primer. Polynucleotides useful in the methods of the disclosure may comprise natural nucleic acid sequences and variants thereof, artificial nucleic acid sequences, or a combination of such sequences. As may be used herein, the terms “nucleic acid oligomer” and “oligonucleotide” are used interchangeably and are intended to include, but are not limited to, nucleic acids having a length of 200 nucleotides or less. In some aspects, an oligonucleotide is a nucleic acid having a length of 2 to 200 nucleotides, 2 to 150 nucleotides, 5 to 150 nucleotides or 5 to 100 nucleotides. The terms “polynucleotide,” “oligonucleotide,” “oligo” or the like refer, in the usual and customary sense, to a linear sequence of nucleotides. Oligonucleotides of the disclosure may be of any length and include, in various embodiments, DNA oligonucleotides, RNA oligonucleotides, analogs thereof, or a combination thereof. 
In any aspects or embodiments described herein, an oligonucleotide is single-stranded, double-stranded, or partially double-stranded. Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up to about 100 nucleotides in length. In some aspects, an oligonucleotide is a primer configured for extension by a polymerase when the primer is annealed completely or partially to a complementary nucleic acid template. A primer is often a single stranded nucleic acid. In certain aspects, a primer, or portion thereof, is substantially complementary to a portion of an adapter. In some aspects, a primer has a length of 200 nucleotides or less. In certain aspects, a primer has a length of 10 to 150 nucleotides, 15 to 150 nucleotides, 5 to 100 nucleotides, 5 to 50 nucleotides or 10 to 50 nucleotides. In some aspects, an oligonucleotide may be immobilized to a solid support.

[0080] As used herein, the term “adapter” refers generally to any linear nucleic acid molecule that can be ligated to an oligonucleotide of the disclosure. In some embodiments, adapters include two reverse complementary oligonucleotides forming a double-stranded structure. In some embodiments, an adapter includes two oligonucleotides that are complementary at one portion and mismatched at another portion, forming a Y-shape or fork-shaped adapter that is double stranded at the complementary portion and has two floppy overhangs at the mismatched portion. In some embodiments, adapters are copied onto the library molecules using templated polymerase synthesis (e.g., second strand cDNA synthesis as described herein). In some embodiments, adapters are ligated to a first complementary strand of the disclosure. In some embodiments, an adapter comprises two oligonucleotides that are double-stranded at one portion and single-stranded at another portion, forming an adapter with an overhang. In some embodiments, an oligonucleotide primer comprises an adapter nucleotide sequence (e.g., a B15 nucleotide sequence). In some embodiments, an adapter comprises a sequence that is complementary to a primer. In further embodiments, an adapter comprises a sequence that is complementary to a P5 primer or a P5’ primer. In some embodiments, an adapter comprises a sequence complementary to a P7 primer or a P7’ primer. In some embodiments, an adapter comprises a sequence complementary to a B15 primer or a B15’ primer. The terms “P5”, “P7”, “B15”, “P5’” (P5 prime), “P7’” (P7 prime), “B15’” (B15 prime), “P15”, “P17” and “A14” may be used when referring to examples of oligonucleotide sequences of primers, e.g., clustering primers, and/or oligonucleotide sequences that are complementary to primers. The terms “P5’” (P5 prime), “P7’” (P7 prime), “B15’” (B15 prime) and “A14’” (A14 prime) refer to the complement of P5, P7, B15 and A14, respectively. 
It will be understood that any suitable primer can be used in the methods presented herein, and that the use of P5, P5’, P7, P7’, P15, P17, B15, B15’, A14 and A14’ are exemplary embodiments only. Uses of primers such as P5, P5’, P7, P7’, P15, P17, B15, B15’, A14 and A14’ or their complements on flow cells are known in the art, as exemplified by the disclosures of WO 2019/222264, WO 2007/010251, WO 2006/064199, WO 2005/065814, WO 2015/106941, WO 1998/044151, and WO 2000/018957, each of which is incorporated herein by reference in its entirety.

[0081] The terms “P5” and “P7” may be used when referring to examples of adapters. The terms “P5’” (P5 prime) and “P7’” (P7 prime) refer to the complement of P5 and P7, respectively. It will be understood that any suitable adapter can be used in the methods presented herein, and that the use of P5 and P7 are exemplary embodiments only. Uses of adapters such as P5 and P7 or their complements on flowcells are known in the art, as exemplified by the disclosures of WO 2007/010251, WO 2006/064199, WO 2005/065814, WO 2015/106941, WO 1998/044151, and WO 2000/018957, each of which is incorporated herein by reference in its entirety. For example, any suitable forward amplification primer, whether immobilized or in solution, can be useful in the methods presented herein for hybridization to a complementary sequence and amplification of a sequence. Similarly, any suitable reverse amplification primer, whether immobilized or in solution, can be useful in the methods presented herein for hybridization to a complementary sequence and amplification of a sequence. One of skill in the art will understand how to design and use primer sequences that are suitable for capture and/or amplification of nucleic acids as presented herein.

[0082] As used herein, the terms “polynucleotide primer” and “primer” refer to any polynucleotide molecule that may hybridize to a polynucleotide template, be bound by a polymerase, and be extended in a template-directed process for nucleic acid synthesis (e.g., amplification and/or sequencing). The primer may be a separate polynucleotide from the polynucleotide template, or both may be portions of the same polynucleotide (e.g., as in a hairpin structure having a 3' end that is extended along another portion of the polynucleotide to extend a double-stranded portion of the hairpin). Primers (e.g., forward or reverse primers) may be attached to a solid support. A primer can be of any length depending on the particular technique it will be used for. For example, PCR primers are generally between 10 and 40 nucleotides in length. The length and complexity of the nucleic acid fixed onto the nucleic acid template may vary. In some aspects, a primer has a length of 200 nucleotides or less. In certain aspects, a primer has a length of 10 to 150 nucleotides, 15 to 150 nucleotides, 5 to 100 nucleotides, 5 to 50 nucleotides or 10 to 50 nucleotides. One of skill can adjust these factors to provide optimum hybridization and signal production for a given hybridization procedure. The primer permits the addition of a nucleotide residue thereto, or oligonucleotide or polynucleotide synthesis therefrom, under suitable conditions. In an aspect the primer is a DNA primer, i.e., a primer consisting of, or largely consisting of, deoxyribonucleotide residues. The primers are designed to have a sequence that is the complement of a region of template/target DNA to which the primer hybridizes. The addition of a nucleotide residue to the 3’ end of a primer by formation of a phosphodiester bond results in a DNA extension product. The addition of a nucleotide residue to the 3’ end of the DNA extension product by formation of a phosphodiester bond results in a further DNA extension product. 
In another aspect, the primer is an RNA primer. In aspects, a primer is hybridized to a target polynucleotide. A “primer” is complementary to a polynucleotide template, and complexes by hydrogen bonding or hybridization with the template to give a primer / template complex for initiation of synthesis by a polymerase, which is extended by the addition of covalently bonded bases linked at its 3' end complementary to the template in the process of DNA synthesis.

[0083] As used herein, the term “primer binding sequence” refers to a polynucleotide sequence that is complementary to at least a portion of a primer (e.g., a sequencing primer or an amplification primer). Primer binding sequences can be of any suitable length. In aspects, a primer binding sequence is about or at least about 10, 15, 20, 25, 30, or more nucleotides in length. In aspects, a primer binding sequence is 10-50, 15-30, or 20-25 nucleotides in length. The primer binding sequence may be selected such that the primer (e.g., sequencing primer) has the preferred characteristics to minimize secondary structure formation or minimize non-specific amplification, for example having a length of about 20-30 nucleotides, approximately 50% GC content, and a Tm of about 55°C to about 65°C.
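For illustration only, two of the primer characteristics mentioned above (length and GC content) can be checked with a short Python sketch. The function name `summarize_primer` and the returned field names are assumptions made for this sketch. The melting temperature is deliberately not computed here, because a Tm estimate depends on the model and buffer conditions chosen, which this paragraph does not specify. The example input is the 20-nucleotide P5 sequence recited in this disclosure as SEQ ID NO: 1.

```python
def summarize_primer(seq):
    """Report length and GC content for a primer written 5' to 3'.

    Spaces used to group the sequence for readability are ignored.
    """
    cleaned = seq.replace(" ", "").upper()
    gc_count = cleaned.count("G") + cleaned.count("C")
    return {
        "length": len(cleaned),
        "gc_percent": 100.0 * gc_count / len(cleaned),
        # Exemplary window from the paragraph above (about 20-30 nucleotides).
        "length_20_to_30": 20 <= len(cleaned) <= 30,
    }


# P5 (SEQ ID NO: 1): 20 nucleotides, 11 of which are G or C.
print(summarize_primer("AAT GAT ACG GCG ACC ACC GA"))
```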

[0084] As used herein, a “sequencing primer binding sequence” refers to a sequence for facilitating sequencing of a ssDNA fragment to which the sequencing primer binding sequence is joined (e.g., to provide a priming site for sequencing by synthesis, or to provide annealing sites for sequencing by ligation, or to provide annealing sites for sequencing by hybridization). For example, in some aspects, the sequencing primer binding sequence provides a site for priming DNA synthesis of said ssDNA fragment or the complement of said ssDNA fragment. In some aspects, the sequencing primer binding sequence includes an SBS3, SBS8', SBS12', or SBS491' sequence.

[0085] As used herein, a “decoding primer binding sequence” refers to a sequence for facilitating the decoding/sequencing of a ssDNA fragment to which the decoding primer binding sequence is joined (e.g., to provide a priming site for sequencing by synthesis, or to provide annealing sites for sequencing by ligation, or to provide annealing sites for sequencing by hybridization), and to which a spatial barcode is joined, wherein the decoding/sequencing will identify the spatial barcode.

[0086] As used herein, an “amplification primer binding sequence” means a sequence for the purpose of facilitating amplification of a nucleic acid to which said sequence is appended. For example, in some implementations, the amplification primer binding sequence provides a priming site for a nucleic acid amplification reaction using a DNA polymerase (e.g., a PCR amplification reaction or a strand-displacement amplification reaction, or a rolling circle amplification reaction), or a ligation template for ligation of probes using a template-dependent ligase in a nucleic acid amplification reaction (e.g., a ligation chain reaction).

[0087] As used herein, a “DNA fragment” means a portion or piece or segment of a target DNA that is cleaved from or released or broken from a longer DNA molecule such that it is no longer attached to the parent molecule. A DNA fragment can be double-stranded (a “dsDNA fragment”) or single-stranded (a “ssDNA fragment”), and the process of generating DNA fragments from the target DNA is referred to as “fragmenting” the target DNA. In some aspects, the method is used to generate a “DNA fragment library” including a collection or population of tagged DNA fragments.

[0088] For example, any suitable forward amplification primer, whether immobilized or in solution, can be useful in the methods presented herein for hybridization to a complementary sequence and amplification of a sequence. Similarly, any suitable reverse amplification primer, whether immobilized or in solution, can be useful in the methods presented herein for hybridization to a complementary sequence and amplification of a sequence. One of skill in the art will understand how to design and use primer sequences that are suitable for capture and/or amplification of nucleic acids as presented herein. In some embodiments, a “first clustering primer” as described herein is a P5 primer. In some embodiments, a “first clustering primer” as described herein is a P7 primer. In some embodiments, a “first clustering primer” as described herein is a P5' primer. In some embodiments, a “first clustering primer” as described herein is a P7' primer. In some embodiments, a “second clustering primer” as described herein is a P5 primer. In some embodiments, a “second clustering primer” as described herein is a P7 primer. In some embodiments, a “second clustering primer” as described herein is a P5' primer. In some embodiments, a “second clustering primer” as described herein is a P7' primer. In some embodiments, P5 comprises or consists of the polynucleotide sequence 5’ AAT GAT ACG GCG ACC ACC GA 3’ (SEQ ID NO: 1), or a variant thereof. In some embodiments, P5 comprises or consists of the polynucleotide sequence 5’ AAT GAT ACG GCG ACC ACC GAG ATC TAC AC 3’ (SEQ ID NO: 2), or a variant thereof. In some embodiments, P7 comprises or consists of the polynucleotide sequence 5’ CAA GCA GAA GAC GGC ATA CG 3’ (SEQ ID NO: 3), or a variant thereof. In some embodiments, P7 comprises or consists of the polynucleotide sequence 5’ CAA GCA GAA GAC GGC ATA CGA GAT 3’ (SEQ ID NO: 4), or a variant thereof. 
In some embodiments, P5' comprises or consists of the polynucleotide sequence 5’ TCG GTG GTC GCC GTA TCA TT 3’ (SEQ ID NO: 5), or a variant thereof. In some embodiments, P5' comprises or consists of the polynucleotide sequence 5’ GTG TAG ATC TCG GTG GTC GCC GTA TCA TT 3’ (SEQ ID NO: 6), or a variant thereof. In some embodiments, P7' comprises the polynucleotide sequence 5’ CGT ATG CCG TCT TCT GCT TG 3’ (SEQ ID NO: 7), or a variant thereof. In some embodiments, P7' comprises or consists of the polynucleotide sequence 5’ ATC TCG TAT GCC GTC TTC TGC TTG 3’ (SEQ ID NO: 8), or a variant thereof. In some embodiments, B15 comprises or consists of the polynucleotide sequence 5’ GTCTCGTGGGCTCGG 3’ (SEQ ID NO: 9), or a variant thereof. In some embodiments, B15’ comprises or consists of the polynucleotide sequence 5’ CCGAGCCCACGAGAC 3’ (SEQ ID NO: 10), or a variant thereof. In some embodiments, P15 comprises or consists of the polynucleotide sequence 5’ TTTTTTAATG ATACGGCGAC CACCGAGANC TAC AC 3’ (SEQ ID NO: 11), or a variant thereof. In some embodiments, P17 comprises or consists of the polynucleotide sequence 5’ TTTTTTNNNC AAGCAGAAGA CGGCATACGA GAT 3’ (SEQ ID NO: 12), or a variant thereof. The term “variant” as used herein with reference to any of the sequences recited herein refers to a variant nucleic acid that is substantially identical, i.e., has only some nucleotide sequence variations, for example to the non-variant sequence. In some embodiments, a variant has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% overall nucleotide sequence identity to the non-variant nucleic acid sequence. It will be understood that reference to P5 and P7 herein could refer to different primer sequences. Any suitable primer sequence combinations are encompassed by the present disclosure.
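By way of a non-limiting sketch, the relationship between the primer sequences recited above and their primed counterparts can be checked in Python by computing reverse complements. The helper name `reverse_complement` and the dictionary layout are assumptions made for this sketch. The sequences are copied from SEQ ID NOs: 1, 3, 5, 7, 9 and 10 as recited above.

```python
# Translation table for DNA bases; N (any base) is left as N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'.

    Spaces used to group the sequence for readability are removed first.
    """
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(_COMPLEMENT)[::-1]


# (sequence, primed sequence) pairs as recited in this disclosure.
PRIMER_PAIRS = {
    "P5/P5'": ("AAT GAT ACG GCG ACC ACC GA", "TCG GTG GTC GCC GTA TCA TT"),
    "P7/P7'": ("CAA GCA GAA GAC GGC ATA CG", "CGT ATG CCG TCT TCT GCT TG"),
    "B15/B15'": ("GTCTCGTGGGCTCGG", "CCGAGCCCACGAGAC"),
}

for name, (primer, primed) in PRIMER_PAIRS.items():
    matches = reverse_complement(primer) == primed.replace(" ", "")
    print(name, "reverse complement matches:", matches)
```

Each of the three pairs checked in this sketch is an exact reverse complement, which is consistent with the statement that the primed terms refer to the complement of the corresponding unprimed sequences.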

[0089] As used herein a “splint oligonucleotide” refers to an oligonucleotide comprising a sequence complementary to a region on a surface probe and another sequence complementary to a capture oligonucleotide, e.g., attached to a substrate. Splint oligonucleotides are typically 10 nucleotides or more in length. Splint oligonucleotides may be 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 60, 75, or 80 nucleotides.

[0090] As used herein an “anchor” refers to a moiety that attaches a nano-scaffold to a substrate. An anchor includes a chemical moiety, peptide, or oligonucleotide. A polynucleotide anchor may be between 4-20 nucleotides.

[0091] As used herein a “surface oligonucleotide” refers to an oligonucleotide comprising an anchor sequence for attaching the oligo to the surface of a substrate, a spatial barcode sequence and a sequence that hybridizes with a splint oligonucleotide. Surface oligonucleotides are typically 20 nucleotides or more in length. Surface oligonucleotides may be 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 60, 75, or 80 nucleotides or more.

[0092] Nucleotides may include naturally occurring nucleotides and functional analogs thereof. Examples of functional analogs are those that are capable of hybridizing to a nucleic acid in a sequence specific fashion or capable of being used as a template for replication of a particular nucleotide sequence. Naturally occurring nucleotides generally have a backbone containing phosphodiester bonds. An analog structure can have an alternate backbone linkage including any of a variety known in the art. Naturally occurring nucleotides generally have a deoxyribose sugar (e.g., found in DNA) or a ribose sugar (e.g., found in RNA). An analog structure can have an alternate sugar moiety including any of a variety known in the art. Nucleotides can include native or non-native bases. A native DNA can include one or more of adenine, thymine, cytosine and/or guanine, and a native RNA can include one or more of adenine, uracil, cytosine and/or guanine. Any non-native base may be used, such as a locked nucleic acid (LNA) and a bridged nucleic acid (BNA). Example modified nucleotides include inosine, xanthine, hypoxanthine, isocytosine, isoguanine, 2-aminopurine, 5-methylcytosine, 5-hydroxymethyl cytosine, 2-aminoadenine, 6-methyl adenine, 6-methyl guanine, 2-propyl guanine, 2-propyl adenine, 2-thiouracil, 2-thiothymine, 2-thiocytosine, 15-halouracil, 15-halocytosine, 5-propynyl uracil, 5-propynyl cytosine, 6-azo uracil, 6-azo cytosine, 6-azo thymine, 5-uracil, 4-thiouracil, 8-halo adenine or guanine, 8-amino adenine or guanine, 8-thiol adenine or guanine, 8-thioalkyl adenine or guanine, 8-hydroxyl adenine or guanine, 5-halo substituted uracil or cytosine, 7-methylguanine, 7-methyladenine, 8-azaguanine, 8-azaadenine, 7-deazaguanine, 7-deazaadenine, 3-deazaguanine, 3-deazaadenine or the like. As is known in the art, certain nucleotide analogues cannot become incorporated into a polynucleotide, for example, nucleotide analogues such as adenosine 5'-phosphosulfate. 
Nucleotides may include any suitable number of phosphates, e.g., three, four, five, six, or more than six phosphates.

[0093] As used herein, the term “nucleotide analogs” refers to synthetic analogs having modified nucleotide base portions, modified pentose portions, and/or modified phosphate portions, and, in the case of polynucleotides, modified internucleotide linkages, as generally described elsewhere (e.g., Scheit, Nucleotide Analogs, John Wiley, New York, 1980; Englisch, Angew. Chem. Int. Ed. Engl. 30:613-29, 1991; Agarwal, Protocols for Polynucleotides and Analogs, Humana Press, 1994; and S. Verma and F. Eckstein, Ann. Rev. Biochem. 67:99-134, 1998). Exemplary phosphate analogs include but are not limited to phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, phosphoramidate, boronophosphates, including associated counterions, e.g., H+, NH4+, Na+, if such counterions are present. Exemplary modified nucleotide base portions include but are not limited to 5-methylcytosine (5mC); C-5-propynyl analogs, including but not limited to, C-5 propynyl-C and C-5 propynyl-U; 2,6-diaminopurine (also known as 2-amino adenine or 2-amino-dA); hypoxanthine, pseudouridine, 2-thiopyrimidine, isocytosine (isoC), 5-methyl isoC, and isoguanine (isoG; see, e.g., U.S. Pat. No. 5,432,272). Exemplary modified pentose portions include, but are not limited to, locked nucleic acid (LNA) analogs including without limitation Bz-A-LNA, 5-Me-Bz-C-LNA, dmf-G-LNA, and T-LNA (see, e.g., The Glen Report, 16(2):5, 2003; Koshkin et al., Tetrahedron 54:3607-30, 1998), and 2’- or 3’-modifications where the 2’- or 3’-position is hydrogen, hydroxy, alkoxy (e.g., methoxy, ethoxy, allyloxy, isopropoxy, butoxy, isobutoxy and phenoxy), azido, amino, alkylamino, fluoro, chloro, or bromo. Modified internucleotide linkages include phosphate analogs, analogs having achiral and uncharged intersubunit linkages (e.g., Sterchak, E. P. et al., Organic Chem., 52:4202, 1987), and uncharged morpholino-based polymers having achiral intersubunit linkages (see, e.g., U.S. Pat. No. 5,034,506). Some internucleotide linkage analogs include morpholidate, acetal, and polyamide-linked heterocycles.

[0094] In the context of “polynucleotides,” the terms “variant” and “derivative” as used herein refer to a polynucleotide that comprises a nucleotide sequence of a polynucleotide or a fragment of a polynucleotide, which has been altered by the introduction of nucleotide substitutions, deletions or additions. A variant or a derivative of a polynucleotide can be a fusion polynucleotide which contains part of the nucleotide sequence of a polynucleotide. The term “variant” or “derivative” as used herein also refers to a polynucleotide or a fragment thereof, which has been chemically modified, e.g., by the covalent attachment of any type of molecule to the polynucleotide. For example, but not by way of limitation, a polynucleotide or a fragment thereof can be chemically modified, e.g., by acetylation, phosphorylation, methylation, etc. The variants or derivatives are modified in a manner that is different from the naturally occurring or starting nucleotide or polynucleotide, either in the type or location of the molecules attached. Variants or derivatives further include deletion of one or more chemical groups which are naturally present on the nucleotide or polynucleotide. A variant or a derivative of a polynucleotide or a fragment of a polynucleotide can be chemically modified using techniques known to those of skill in the art, including, but not limited to, specific chemical cleavage, acetylation, formylation, etc. Further, a variant or a derivative of a polynucleotide or a fragment of a polynucleotide can contain one or more dNTPs or nucleotide analogs. A polynucleotide variant or derivative may possess a similar or identical function as a polynucleotide or a fragment of a polynucleotide described herein. A polynucleotide variant or derivative may possess an additional or different function compared with a polynucleotide or a fragment of a polynucleotide described herein.

[0095] As used herein, the term “double-stranded,” when used in reference to a nucleic acid molecule, means that substantially all of the nucleotides in the nucleic acid molecule are hydrogen bonded to a complementary nucleotide. A partially double stranded nucleic acid can have at least 10%, 25%, 50%, 60%, 70%, 80%, 90% or 95% of its nucleotides hydrogen bonded to a complementary nucleotide.

[0096] As used herein, the term “single-stranded,” when used in reference to a nucleic acid molecule, means that essentially none of the nucleotides in the nucleic acid molecule is hydrogen bonded to a complementary nucleotide.

[0097] As used herein, the term “amplicon,” when used in reference to a nucleic acid, means the product of copying the nucleic acid, wherein the product has a nucleotide sequence that is the same as or complementary to at least a portion of the nucleotide sequence of the nucleic acid. An amplicon can be produced by any of a variety of amplification methods that use the nucleic acid, or an amplicon thereof, as a template including, for example, polymerase extension, polymerase chain reaction (PCR), rolling circle amplification (RCA), ligation extension, or ligation chain reaction. An amplicon can be a nucleic acid molecule having a single copy of a particular nucleotide sequence (e.g., a PCR product) or multiple copies of the nucleotide sequence (e.g., a concatemeric product of RCA). A first amplicon of a target nucleic acid can be a complementary copy. Subsequent amplicons are copies that are created, after generation of the first amplicon, from the target nucleic acid or from the first amplicon. A subsequent amplicon can have a sequence that is substantially complementary to the target nucleic acid or substantially identical to the target nucleic acid.
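
As an informal illustration of the "same as or complementary to at least a portion of" relationship, the following Python sketch checks whether a candidate product sequence occurs in a template either directly or as its reverse complement. The function names are hypothetical and are not part of the disclosure, and the check uses exact matching over unambiguous DNA bases only, which is a simplifying assumption.

```python
# Illustrative sketch only: helper names are assumptions, not terms from the disclosure.
# Exact string matching is used; real amplicons may contain mismatches or ambiguity codes.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_copy_or_complement(product: str, template: str) -> bool:
    """True if the product matches a portion of the template or of its complement strand."""
    product, template = product.upper(), template.upper()
    if not product:
        return False  # an empty product is not treated as a copy of anything
    return product in template or reverse_complement(product) in template


template = "TTATGCAA"
same_strand = is_copy_or_complement("ATGC", template)       # identical to a portion
complementary = is_copy_or_complement("GCAT", template)     # complementary to a portion
unrelated = is_copy_or_complement("GGGG", template)         # neither
```

Under this sketch, a first amplicon would satisfy the complementary branch and a subsequent amplicon could satisfy either branch.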

[0098] A “blocking element” refers to an agent (e.g., polynucleotide, protein, nucleotide) that reduces and/or inhibits nucleotide incorporation (i.e., extension of a primer) relative to the absence of the blocking element. In embodiments, the blocking element is a non-extendable oligomer (e.g., a 3'-blocked oligo). A blocking element on a nucleotide can be reversible, whereby the blocking moiety can be removed or modified to allow the 3' hydroxyl to form a covalent bond with the 5' phosphate of another nucleotide. For example, a reversible terminator may refer to a blocking moiety located, for example, at the 3' position of the nucleotide and may be a chemically cleavable moiety such as an allyl group, an azidomethyl group or a methoxymethyl group. In aspects, the blocking moiety is not reversible (e.g., the blocking element including a blocking moiety irreversibly prevents extension). In aspects, the blocking element includes an oligo having a 3' dideoxynucleotide or similar modification to prevent extension by a polymerase and is used in conjunction with a non-strand displacing polymerase. In another example implementation, the blocking element includes one or more modified nucleotides including a cleavable linker (e.g., linked to the 5', 3', or the nucleobase) containing PEG, thereby blocking the extension. In another example implementation, the blocking element includes one or more modified nucleotides linked to biotin, to which a protein (e.g., streptavidin) can be bound, thereby blocking polymerase extension. In another example implementation, the blocking element includes a modified nucleotide, such as iso dGTP or iso dCTP, which are complementary to each other.

[0099] A nucleic acid can be amplified by a thermocycling method or by an isothermal amplification method. In some aspects, a rolling circle amplification method is used. In some aspects, amplification takes place on a solid support (e.g., within a flow cell) where a nucleic acid, nucleic acid library or portion thereof is immobilized. In certain sequencing methods, a nucleic acid library is added to a flow cell and immobilized by hybridization to anchors under suitable conditions. This type of nucleic acid amplification is often referred to as solid phase amplification. In some aspects of solid phase amplification, all or a portion of the amplified products are synthesized by an extension initiating from an immobilized primer. Solid phase amplification reactions are analogous to standard solution phase amplifications except that at least one of the amplification oligonucleotides (e.g., primers) is immobilized on a solid support.

[0100] In some aspects, solid phase amplification comprises a nucleic acid amplification reaction comprising only one species of oligonucleotide primer immobilized to a surface or substrate. In certain aspects solid phase amplification comprises a plurality of different immobilized oligonucleotide primer species. In some aspects, solid phase amplification may comprise a nucleic acid amplification reaction comprising one species of oligonucleotide primer immobilized on a solid surface and a second different oligonucleotide primer species in solution. Multiple different species of immobilized or solution based primers can be used. Non-limiting examples of solid phase nucleic acid amplification reactions include interfacial amplification, bridge PCR amplification, emulsion PCR, WildFire amplification (e.g., US patent publication US20130012399), the like or combinations thereof.

[0101] The number of template copies or amplicons that can be produced can be modulated by appropriate modification of the amplification reaction including, for example, varying the number of amplification cycles run, using polymerases of varying processivity in the amplification reaction and / or varying the length of time that the amplification reaction is run, as well as modification of other conditions known in the art to influence amplification yield. The number of copies of a nucleic acid template can be at least 1, 10, 100, 200, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000 and 10,000 copies, or a range that includes or is between any two of the foregoing numbers, and can be varied depending on the particular application.
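
To make the dependence of copy number on cycle count concrete, the sketch below uses the textbook idealized exponential model for thermocycled amplification, copies = N0 x (1 + E)^n, where E is the per-cycle efficiency. This model and the function names are assumptions introduced for illustration; the disclosure does not specify a yield model, and real reactions plateau and vary with polymerase and conditions.

```python
import math

# Illustrative sketch: idealized exponential amplification, not a model from the disclosure.


def expected_copies(n0: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles` cycles at a fixed per-cycle efficiency (0 < E <= 1)."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return n0 * (1.0 + efficiency) ** cycles


def cycles_needed(n0: float, target: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles for which the idealized model reaches `target` copies."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    if target <= n0:
        return 0
    return math.ceil(math.log(target / n0) / math.log(1.0 + efficiency))


copies_after_10 = expected_copies(1, 10)            # perfect doubling each cycle
cycles_for_10k = cycles_needed(1, 10_000)           # perfect doubling
cycles_for_10k_at_90 = cycles_needed(1, 10_000, efficiency=0.9)
```

Lower efficiency raises the number of cycles needed for the same target copy number, which is one way the yield can be tuned alongside reaction time and polymerase choice.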

[0102] The terms "identical" or percent "identity," in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to be "substantially identical." This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.
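
For readers who want a concrete sense of the arithmetic, the following sketch computes percent identity over a pair of sequences that have already been aligned (for example by hand, as in the "manual alignment" case). It is not BLAST and does not perform alignment; the function name, the gap character, and the choice to count every non-empty alignment column in the denominator are all assumptions made for illustration.

```python
# Illustrative sketch: percent identity of two pre-aligned sequences.
# This does not reproduce BLAST scoring; it only counts matching columns.


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # skip columns that are gaps in both sequences
        columns += 1
        if x == y:
            matches += 1  # x cannot be a gap here, since both-gap columns were skipped
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


one_substitution = percent_identity("ACGTACGT", "ACGTTCGT")  # 7 of 8 columns match
one_deletion = percent_identity("ACG-T", "ACGTT")            # 4 of 5 columns match
```

Other conventions divide by the shorter sequence length or exclude gapped columns, so a reported percentage is only comparable when the convention is stated.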

[0103] As used herein, the term “molecular identifier,” “single molecule identifier,” or “SMI” refers to sequences of nucleotides applied to or identified in nucleic acid molecules that may be used to distinguish individual or groups of nucleic acid molecules from one another. When incorporated into a nucleic acid, a SMI can be used to correct for subsequent amplification bias by directly counting single molecular identifiers (SMIs) that are sequenced after amplification. A SMI (e.g., a UMI) can be attached to similar nucleic acids, e.g., adapters, making each nucleic acid unique. SMIs (e.g., UMIs) may also be used to uniquely tag individual molecules (e.g., individual mRNA molecules) in a sample (e.g., individual mRNA molecules in a tissue sample, cell sample, or sample library). In some embodiments, a UMI is a random nucleotide sequence (e.g., N9).
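
The idea of correcting amplification bias by counting identifiers rather than reads can be sketched as follows. The data layout (a list of target/UMI pairs), the function name, and the example target labels are hypothetical; the sketch also assumes error-free UMI reads, whereas practical pipelines usually merge UMIs that differ by sequencing errors.

```python
from collections import defaultdict

# Illustrative sketch: count distinct UMIs per target instead of counting reads.
# Assumes UMIs are read without error; names and data are hypothetical.


def count_molecules(reads):
    """Map each target to the number of distinct UMIs observed for it."""
    umis_per_target = defaultdict(set)
    for target, umi in reads:
        umis_per_target[target].add(umi)
    return {target: len(umis) for target, umis in umis_per_target.items()}


example_reads = [
    ("geneA", "ACGTACGTA"),  # three reads from one amplified molecule
    ("geneA", "ACGTACGTA"),
    ("geneA", "ACGTACGTA"),
    ("geneA", "TTGCAAGCT"),  # a second geneA molecule
    ("geneB", "GGATCCATG"),
]
molecule_counts = count_molecules(example_reads)
```

Here geneA has four reads but only two distinct UMIs, so the PCR duplicates do not inflate its estimated molecule count.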

[0104] As used herein, the term “unique molecular identifier” or “UMI” refers to sequences of nucleotides applied to or identified in nucleic acid molecules that may be used to distinguish individual nucleic acid molecules from one another. UMIs may be sequenced along with the nucleic acid molecules with which they are associated to determine whether the read sequences are those of one source nucleic acid molecule or another. The term “UMI” may be used herein to refer to both the sequence information of a polynucleotide and the physical polynucleotide per se. A unique molecular index, unique molecular identifier or UMI, when used in reference to a capture probe or other nucleic acid, is intended to refer to a portion of a probe useful as a molecular barcode to uniquely tag each molecule in a sample library. A UMI may be denoted as “NNNN...” in a string of nucleic acids to designate that portion of the oligonucleotide as the UMI. A UMI may be from 6 to 20 nucleotides or more in length. UMIs are similar to barcodes, which are commonly used to distinguish reads of one sample from reads of other samples, but UMIs are instead used to distinguish nucleic acid template fragments from one another when many fragments from an individual sample are sequenced together. UMIs may be defined in many ways, such as described in WO 2019/108972, WO 2018/136248, and WO 2016/176091, each of which is incorporated herein by reference. In some aspects, the UMI comprises a spatial barcode. In some aspects, the UMI comprises a virtual UMI.

[0105] UMIs may be applied to or identified in individual DNA molecules. In some aspects, the UMIs may be applied to the DNA molecules by methods that physically link or bond the UMIs to the DNA molecules, e.g., by ligation or transposition through polymerase, endonuclease, transposases, etc. These "applied" UMIs are therefore also referred to as physical UMIs. In some aspects, they may also be referred to as exogenous UMIs. The UMIs identified within source DNA molecules are referred to as virtual UMIs. In some aspects, virtual UMIs may also be referred to as endogenous UMI. Physical UMIs may be defined in many ways. For example, they may be random, pseudo-random or partially random, or non-random nucleotide sequences that are inserted in adapters or otherwise incorporated in source DNA molecules to be sequenced.

[0106] A "virtual unique molecular index" or "virtual UMI" is a unique subsequence in a source DNA molecule. In some aspects, virtual UMIs are located at or near the ends of the source DNA molecule. One or more such unique end positions may alone or in conjunction with other information uniquely identify a source DNA molecule. Depending on the number of distinct source DNA molecules and the number of nucleotides in the virtual UMI, one or more virtual UMIs can uniquely identify source DNA molecules in a sample. In some aspects, a combination of two virtual unique molecular identifiers is required to identify a source DNA molecule. Such combinations may be extremely rare, possibly found only once in a sample. In some aspects, one or more virtual UMIs in combination with one or more physical UMIs may together uniquely identify a source DNA molecule. In some aspects, the virtual UMI is derived from sequences at the 3’ end of a first-strand cDNA product.
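
One way to picture a combined physical and virtual UMI is to build a per-molecule key from the physical UMI plus the first and last few bases of the fragment. The sketch below does this; the key layout, the default end length of four bases, and the function names are assumptions for illustration and are not prescribed by the disclosure.

```python
# Illustrative sketch: combine a physical UMI with end-derived virtual UMIs.
# The end length k and the key layout are assumptions.


def virtual_umi(fragment: str, k: int = 4):
    """Return the first and last k bases of a fragment as a pair of virtual UMIs."""
    if len(fragment) < 2 * k:
        raise ValueError("fragment shorter than two virtual UMIs")
    return fragment[:k], fragment[-k:]


def molecule_key(physical_umi: str, fragment: str, k: int = 4):
    """Key that identifies a source molecule by physical UMI plus both virtual UMIs."""
    return (physical_umi,) + virtual_umi(fragment, k)


def count_source_molecules(reads, k: int = 4) -> int:
    """Number of distinct keys among (physical_umi, fragment) reads."""
    return len({molecule_key(umi, fragment, k) for umi, fragment in reads})


example_reads = [
    ("AAAA", "ACGTTTTTGGCA"),
    ("AAAA", "ACGTTTTTGGCA"),  # duplicate of the first read
    ("AAAA", "TTGATTTTGGCA"),  # same physical UMI, different 5' end
    ("CCCC", "ACGTTTTTGGCA"),  # same ends, different physical UMI
]
n_molecules = count_source_molecules(example_reads)
```

The third and fourth reads show why the combination helps: either identifier alone would collapse two distinct molecules into one.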

[0107] As used herein, the terms "address," "tag," “barcode” or "index," when used in reference to a nucleotide sequence, are intended to mean a unique nucleotide sequence that is distinguishable from other indices as well as from other nucleotide sequences within polynucleotides contained within a sample. A nucleotide "address," "tag," “barcode” or "index" can be a random or a specifically designed nucleotide sequence. An "address," "tag," “barcode” or "index" can be of any desired sequence length so long as it is of sufficient length to be a unique nucleotide sequence within a plurality of indices in a population and/or within a plurality of polynucleotides that are being analyzed or interrogated. A nucleotide "address," "tag," “barcode” or "index" of the disclosure is useful, for example, to be attached to a target polynucleotide to tag or mark a particular species for identifying all members of the tagged species within a population. Accordingly, an index is useful as a barcode where different members of the same molecular species can contain the same index and where different species within a population of different polynucleotides can have different indices.

[0108] As used herein, the term “barcode” is also intended to mean a series of nucleotides in an oligonucleotide that can be used to provide barcode information including one or more of identification of the oligonucleotide, a spatial address on a surface, a characteristic of the oligonucleotide, or a manipulation that has been carried out on the oligonucleotide. The barcode can be a naturally occurring nucleotide sequence or a nucleotide sequence that does not occur naturally in the organism from which the barcoded nucleic acid was obtained. For example, each nucleic acid capture probe in a population on a substrate for spatial capture of nucleic acids in a biological sample, e.g., a permeabilized tissue sample or a cell suspension, can include different barcode sequences from all other nucleic acid capture probes in the population. Alternatively, each nucleic acid probe in a population can include different barcode sequences from some or most other nucleic acid capture probes in a population. For example, each capture probe in a population can have a barcode that is present for several different capture probes in the population even though the capture probes with the common barcode differ from each other at other sequence regions along their length. In various embodiments, one or more barcode sequences that are used with a biological tissue are not present in the genome, transcriptome or other nucleic acids of the biological specimen. For example, barcode sequences can have less than 80%, 70%, 60%, 50% or 40% sequence identity to the nucleic acid sequences in a particular biological tissue.

[0109] A tag/index/barcode sequence can be unique to a single nucleic acid species in a population or can be shared by several different nucleic acid species in a population. For example, each nucleic acid probe in a population can include different tag/index/barcode sequences from all other nucleic acid probes in the population. Alternatively, each nucleic acid probe in a population can include different tag/index/barcode sequences from some or most other nucleic acid probes in a population. For example, each probe in a population can have a tag/index/barcode that is present for several different probes in the population even though the probes with the common tag/index/barcode differ from each other at other sequence regions along their length. In particular embodiments, one or more tag/index/barcode sequences that are used with a biological specimen are not present in the genome, transcriptome or other nucleic acids of the biological specimen. For example, tag/index/barcode sequences can have less than 80%, 70%, 60%, 50% or 40% sequence identity to the nucleic acid sequences in a particular biological specimen.

[0110] As used herein, a "spatial address," "spatial tag", “spatial barcode”, “spatial barcode sequence” or "spatial index," when used in reference to a nucleotide sequence, means an address, tag, barcode, or index encoding spatial information related to the region or location of origin of an addressed, tagged, barcoded, or indexed nucleic acid in a tissue sample. The sequence can be a naturally occurring sequence or a sequence that does not occur naturally in the organism from which the barcoded nucleic acid was obtained.

[0111] As used herein, a “template switch oligo” or “TSO” refers to an oligonucleotide useful in a method of DNA sequencing in which the oligonucleotide hybridizes to untemplated cytosine (C) nucleotides added to the end of a target RNA or DNA template by a reverse transcriptase during reverse transcription. For example, the TSO comprises a poly G sequence that binds the poly C sequence added to the target template. In some aspects, the TSO comprises 2-5 guanosines that hybridize to the untemplated cytosine nucleotides. In some embodiments, the 2-5 guanosines are riboguanosines, or modified or locked nucleic acids. In some aspects, the TSO comprises rGrGrG.

[0112] As used herein, the term “universal sequence” refers to a series of nucleotides that is common to two or more nucleic acid molecules even if the molecules also have regions of sequence that differ from each other. A universal sequence that is present in different members of a collection of molecules can allow capture of multiple different nucleic acids using a population of universal capture nucleic acids that are complementary to the universal sequence. Similarly, a universal sequence present in different members of a collection of molecules can allow the replication or amplification of multiple different nucleic acids using a population of universal primers that are complementary to the universal sequence. Thus, a universal capture nucleic acid or a universal primer includes a sequence that can hybridize specifically to a universal sequence. Target nucleic acid molecules may be modified to attach universal adapters, for example, at one or both ends of the different target sequences. Universal capture oligonucleotides are applicable for interrogating a plurality of different oligonucleotides without necessarily distinguishing the different species, whereas target-specific capture sequences are applicable for distinguishing the different species. A nonlimiting example of a universal sequence is a polyT nucleotide sequence.

[0113] As used herein, the terms “DNA polymerase” and “nucleic acid polymerase” are used in accordance with their plain ordinary meanings and refer to enzymes capable of synthesizing nucleic acid molecules from nucleotides (e.g., deoxyribonucleotides). Exemplary types of polymerases that may be used in the compositions and methods of the present disclosure include the nucleic acid polymerases such as DNA polymerase, DNA- or RNA-dependent RNA polymerase, and reverse transcriptase. In some cases, the DNA polymerase is 9°N polymerase or a variant thereof, E. coli DNA polymerase I, Bacteriophage T4 DNA polymerase, Sequenase, Taq DNA polymerase, DNA polymerase from Bacillus stearothermophilus, Bst 2.0 DNA polymerase, 9°N polymerase (exo-) A485L/Y409V, Phi29 DNA Polymerase (φ29 DNA Polymerase), T7 DNA polymerase, DNA polymerase II, DNA polymerase III holoenzyme, DNA polymerase IV, DNA polymerase V, VentR DNA polymerase, Therminator™ II DNA Polymerase, Therminator™ III DNA Polymerase, or Therminator™ IX DNA Polymerase. In aspects, the polymerase is a protein polymerase. Typically, a DNA polymerase adds nucleotides to the 3' end of a DNA strand, one nucleotide at a time. In aspects, the DNA polymerase is a Pol I DNA polymerase, Pol II DNA polymerase, Pol III DNA polymerase, Pol IV DNA polymerase, Pol V DNA polymerase, Pol β DNA polymerase, Pol μ DNA polymerase, Pol λ DNA polymerase, Pol σ DNA polymerase, Pol α DNA polymerase, Pol δ DNA polymerase, Pol ε DNA polymerase, Pol η DNA polymerase, Pol ι DNA polymerase, Pol κ DNA polymerase, Pol ζ DNA polymerase, Pol γ DNA polymerase, Pol θ DNA polymerase, Pol ν DNA polymerase, or a thermophilic nucleic acid polymerase (e.g., Therminator γ, 9°N polymerase (exo-), Therminator II, Therminator III, or Therminator IX). In aspects, the DNA polymerase is a modified archaeal DNA polymerase. In aspects, the polymerase is a reverse transcriptase. For example, a polymerase catalyzes the addition of a next correct nucleotide to the 3'-OH group of the primer via a phosphodiester bond, thereby chemically incorporating the nucleotide into the primer.

[0114] As used herein, the term “template polynucleotide” or “template nucleic acid” refers to any polynucleotide molecule that may be bound by a polymerase and utilized as a template for nucleic acid synthesis. A template polynucleotide may be a target polynucleotide. In general, the term “target polynucleotide” refers to a nucleic acid molecule or polynucleotide in a starting population of nucleic acid molecules having a target sequence whose presence, amount, and/or nucleotide sequence, or changes in one or more of these, are desired to be determined. In general, the term “target sequence” refers to a nucleic acid sequence on a single strand of nucleic acid. The target sequence may be a portion of a gene, a regulatory sequence, genomic DNA, cDNA, RNA including mRNA, miRNA, rRNA, or others. The target sequence may be a target sequence from a sample or a secondary target such as a product of an amplification reaction. A target polynucleotide is not necessarily any single molecule or sequence. For example, a target polynucleotide may be any one of a plurality of target polynucleotides in a reaction, or all polynucleotides in a given reaction, depending on the reaction conditions. For example, in a nucleic acid amplification reaction with random primers, all polynucleotides in a reaction may be amplified. As a further example, a collection of targets may be simultaneously assayed using polynucleotide primers directed to a plurality of targets in a single reaction. As yet another example, all or a subset of polynucleotides in a sample may be modified by the addition of a primer-binding sequence (such as by the ligation of adapters containing the primer binding sequence), rendering each modified polynucleotide a target polynucleotide in a reaction with the corresponding primer polynucleotide(s). In the context of selective sequencing, “target polynucleotide(s)” refers to the subset of polynucleotide(s) to be sequenced from within a starting population of polynucleotides.

[0115] As used herein, the term "extend," when used in reference to a nucleic acid, is intended to mean addition of at least one nucleotide or oligonucleotide to the nucleic acid. In particular embodiments, one or more nucleotides can be added to the 3' end of a nucleic acid, for example, via polymerase catalysis (e.g., DNA polymerase, RNA polymerase or reverse transcriptase). Chemical or enzymatic methods can be used to add one or more nucleotides to the 3' or 5' end of a nucleic acid. One or more oligonucleotides can be added to the 3' or 5' end of a nucleic acid, for example, via chemical or enzymatic (e.g., ligase catalysis) methods. An extension reaction, in which nucleotides are added to the 3' end of an oligonucleotide (e.g., a primer), is performed in the presence of a polymerase, such as a DNA or RNA polymerase. In some embodiments, the polymerase is a non-thermostable isothermal strand displacement polymerase. Suitable non-thermostable strand displacement polymerases according to the present disclosure can be found, for example, through New England BioLabs, Inc. and include phi29, Bsu, Klenow, DNA Polymerase I (E. coli), and Therminator. In some embodiments, the extension reaction is carried out by recombinase polymerase amplification (RPA). RPA comprises three core enzymes - a recombinase, a single-stranded DNA binding protein (SSB) and a strand-displacing polymerase - as described in Daher et al. (Rana K Daher, Gale Stewart, Maurice Boissinot, Michel G Bergeron, Recombinase Polymerase Amplification for Diagnostic Applications, Clinical Chemistry, Volume 62, Issue 7, 1 July 2016). A nucleic acid can be extended in a template directed manner, whereby the product of extension is complementary to a template nucleic acid that is hybridized to the nucleic acid that is extended.

[0116] As used herein, the term “adjacent,” when used in reference to two nucleotide sequences in a nucleic acid, can refer to nucleotide sequences separated by 0 to about 20 nucleotides, more specifically, in a range of about 1 to about 10 nucleotides, or to sequences that directly abut one another. As those of skill in the art appreciate, two nucleotide sequences that are to be ligated together will generally directly abut one another.

[0117] As used herein, the term “poly T,” “poly A,” “poly C,” “poly G,” or “poly I,” when used in reference to a nucleic acid sequence (e.g., a capture nucleotide sequence), is intended to mean a series of two or more thymine (T), adenine (A), cytosine (C), guanine (G), or inosine (I) bases, respectively. A poly T, poly A, poly C, poly G, or poly I can include at least about 2, 5, 8, 10, 12, 15, 18, 20, 22, 25, 28, 30, 32, 35, 38, 40, or more of the T, A, C, G, or I bases, respectively. Alternatively or additionally, a poly T, poly A, poly C, poly G, or poly I can include at most about 40, 38, 35, 32, 30, 28, 25, 22, 20, 18, 15, 12, 10, 8, 5, or 2 of the T, A, C, G, or I bases, respectively. In some embodiments, the disclosure contemplates use of a "polyTVN" sequence, wherein “T” is a capture nucleotide sequence, “V” is adenine (A), cytosine (C), or guanine (G), and “N” is adenine (A), cytosine (C), guanine (G), or thymine (T). The polyTVN sequence is used, in some embodiments, to bias reverse transcription to the base of the poly A tail on the mRNA molecule, e.g., in template switching.
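
The "V" and "N" positions are degenerate, so a polyTVN design of a given poly T length corresponds to twelve concrete sequences (three choices for V times four for N). The sketch below enumerates them and tests whether a sequence fits the pattern; the function names are hypothetical, and the minimum run of two T bases simply follows the "two or more" wording above.

```python
import re
from itertools import product

# Illustrative sketch: expand and recognize polyTVN sequences.
# V = A, C, or G; N = A, C, G, or T. Function names are assumptions.

_POLY_TVN = re.compile(r"T{2,}[ACG][ACGT]")


def poly_tvn_variants(n_t: int):
    """List the 12 concrete sequences for a poly T run of length n_t followed by V and N."""
    if n_t < 2:
        raise ValueError("a poly T run has two or more T bases")
    return ["T" * n_t + v + n for v, n in product("ACG", "ACGT")]


def is_poly_tvn(seq: str) -> bool:
    """True if the whole sequence is a T run followed by one V base and one N base."""
    return _POLY_TVN.fullmatch(seq.upper()) is not None


variants = poly_tvn_variants(20)
all_valid = all(is_poly_tvn(s) for s in variants)
plain_poly_t = is_poly_tvn("T" * 22)  # lacks the non-T anchor base
```

Because the V position excludes T, a probe of this form cannot pair entirely within the poly A tail, which is the anchoring behavior the paragraph describes.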

[0118] As used herein, the term “tagmentation,” “tagment,” or “tagmenting” refers to transforming a nucleic acid, e.g., a DNA, into adaptor-modified templates in solution ready for cluster formation and sequencing by the use of transposase mediated fragmentation and tagging. This process often involves the modification of the nucleic acid by a transposome complex comprising transposase enzyme complexed with adaptors comprising transposon end sequence. Tagmentation results in the simultaneous fragmentation of the nucleic acid and ligation of the adaptors to the 5' ends of both strands of duplex fragments. Following a purification step to remove the transposase enzyme, additional sequences are added to the ends of the adapted fragments by PCR.

[0119] A “transposase” refers to an enzyme that is capable of forming a functional complex with a transposon end-containing composition (e.g., transposons, transposon ends, transposon end compositions) and catalyzing insertion or transposition of the transposon end-containing composition into the double-stranded target nucleic acid with which it is incubated, for example, in an in vitro transposition reaction. A transposase as presented herein can also include integrases from retrotransposons and retroviruses. Transposases, transposomes and transposome complexes are generally known to those of skill in the art, as exemplified by the disclosure of US Pat. Publ. No. 2010/0120098, the content of which is incorporated herein by reference in its entirety. Although many embodiments described herein refer to Tn5 transposase and/or hyperactive Tn5 transposase, it will be appreciated that any transposition system that is capable of inserting a transposon end with sufficient efficiency to 5'-tag and fragment a target nucleic acid for its intended purpose can be used in the present invention. In particular embodiments, a preferred transposition system is capable of inserting the transposon end in a random or in an almost random manner to 5'-tag and fragment the target nucleic acid.

[0120] As used herein, the term “transposition reaction” refers to a reaction wherein one or more transposons are inserted into target nucleic acids, e.g., at random sites or almost random sites. Essential components in a transposition reaction are a transposase and DNA oligonucleotides that exhibit the nucleotide sequences of a transposon, including the transferred transposon sequence and its complement (the non-transferred transposon end sequence) as well as other components needed to form a functional transposition or transposome complex. The DNA oligonucleotides can further comprise additional sequences (e.g., adaptor or primer sequences) as needed or desired. In some embodiments, the method provided herein is exemplified by employing a transposition complex formed by a hyperactive Tn5 transposase and a Tn5-type transposon end (Goryshin and Reznikoff, 1998, J. Biol. Chem., 273: 7367) or by a MuA transposase and a Mu transposon end comprising R1 and R2 end sequences (Mizuuchi, 1983, Cell, 35: 785; Savilahti et al., 1995, EMBO J., 14: 4893). However, any transposition system that is capable of inserting a transposon end in a random or in an almost random manner with sufficient efficiency to 5'-tag and fragment a target DNA for its intended purpose can be used in the present invention. Examples of transposition systems known in the art which can be used for the present methods include but are not limited to Staphylococcus aureus Tn552 (Colegio et al., 2001, J Bacteriol., 183: 2384-8; Kirby et al., 2002, Mol Microbiol, 43: 173-86), Ty1 (Devine and Boeke, 1994, Nucleic Acids Res., 22: 3765-72 and International Patent Application No. WO 95/23875), Transposon Tn7 (Craig, 1996, Science, 271: 1512; Craig, 1996, Review in: Curr Top Microbiol Immunol, 204: 27-48), Tn10 and IS10 (Kleckner et al., 1996, Curr Top Microbiol Immunol, 204: 49-82), Mariner transposase (Lampe et al., 1996, EMBO J., 15: 5470-9), Tc1 (Plasterk, 1996, Curr Top Microbiol Immunol, 204: 125-43), P Element (Gloor, 2004, Methods Mol Biol, 260: 97-114), Tn3 (Ichikawa and Ohtsubo, 1990, J Biol Chem., 265: 18829-32), bacterial insertion sequences (Ohtsubo and Sekine, 1996, Curr. Top. Microbiol. Immunol., 204: 1-26), retroviruses (Brown et al., 1989, Proc Natl Acad Sci USA, 86: 2525-9), and retrotransposon of yeast (Boeke and Corces, 1989, Annu Rev Microbiol., 43: 403-34). The method for inserting a transposon end into a target sequence can be carried out in vitro using any suitable transposon system for which a suitable in vitro transposition system is available or that can be developed based on knowledge in the art. In general, a suitable in vitro transposition system for use in the methods provided herein requires, at a minimum, a transposase enzyme of sufficient purity, sufficient concentration, and sufficient in vitro transposition activity and a transposon end with which the respective transposase forms a functional complex that is capable of catalyzing the transposition reaction. Suitable transposase transposon end sequences that can be used in the invention include but are not limited to wild-type, derivative or mutant transposon end sequences that form a complex with a transposase chosen from among a wild-type, derivative or mutant form of the transposase. As used herein, the term “transposome complex” refers to a transposase enzyme non-covalently bound to a double-stranded nucleic acid. For example, the complex can be a transposase enzyme preincubated with double-stranded transposon DNA under conditions that support non-covalent complex formation. Double-stranded transposon DNA can include, without limitation, Tn5 DNA, a portion of Tn5 DNA, a transposon end composition, a mixture of transposon end compositions or other double-stranded DNAs capable of interacting with a transposase such as the hyperactive Tn5 transposase.

[0121] As used herein, the term "random" can be used to refer to the spatial arrangement or composition of locations on a surface. For example, there are at least two types of order for an array described herein, the first relating to the spacing and relative location of features (also called "sites") and the second relating to identity or predetermined knowledge of the particular species of molecule that is present at a particular feature. Accordingly, features of an array can be randomly spaced such that nearest neighbor features have variable spacing between each other. Alternatively, the spacing between features can be ordered, for example, forming a regular pattern such as a rectilinear grid or hexagonal grid. In another respect, features of an array can be random with respect to the identity or predetermined knowledge of the gene of interest (e.g., nucleic acid of a particular sequence) that occupies each feature independent of whether spacing produces a random pattern or ordered pattern. An array set forth herein can be ordered in one respect and random in another. For example, in some embodiments set forth herein a surface is contacted with a population of nucleic acids under conditions where the nucleic acids attach at sites that are ordered with respect to their relative locations but 'randomly located' with respect to knowledge of the sequence for the nucleic acid species present at any particular site. Reference to "randomly distributing" nucleic acids at locations on a surface is intended to refer to the absence of knowledge or absence of predetermination regarding which nucleic acid will be captured at which location (regardless of whether the locations are arranged in an ordered pattern or not).
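
By way of illustration only, the two senses of order described above can be sketched in Python. The sketch assumes a hypothetical rectilinear grid of sites and a hypothetical pool of nucleic acid species; the function names, the pitch value, and the sequences are illustrative assumptions and are not taken from the disclosure.

```python
import random


def rectilinear_sites(rows, cols, pitch_nm):
    """Return (x, y) coordinates for sites on an ordered, rectilinear grid."""
    return [(c * pitch_nm, r * pitch_nm) for r in range(rows) for c in range(cols)]


def seed_randomly(sites, species, rng):
    """Assign a species to each site with no predetermination of identity."""
    return {site: rng.choice(species) for site in sites}


# Hypothetical values chosen only for the example.
species = ["ACGT", "TTAG", "GGCA"]
rng = random.Random(0)  # fixed seed only so the example is repeatable
sites = rectilinear_sites(rows=2, cols=3, pitch_nm=700)
occupancy = seed_randomly(sites, species, rng)
```

Here the site coordinates are ordered (they are known in advance from the grid), while the species occupying any given site is known only after the mapping has been inspected, which corresponds to an array that is ordered in one respect and random in another.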

[0122] As used herein, the terms “sequencing”, “sequence determination”, “determining a nucleotide sequence”, and the like include determination of partial or complete sequence information (e.g., a sequence) of a polynucleotide being sequenced, and particularly physical processes for generating such sequence information. That is, the term includes sequence comparisons, consensus sequence determination, contig assembly, fingerprinting, and like levels of information about a target polynucleotide, as well as the express identification and ordering of nucleotides in a target polynucleotide. The term also includes the determination of the identification, ordering, and locations of one, two, or three of the four types of nucleotides within a target polynucleotide. In some aspects, a sequencing process described herein comprises contacting a template and an annealed primer with a suitable polymerase under conditions suitable for polymerase extension and/or sequencing. In aspects, sequencing generates one or more sequencing reads. The sequencing methods are preferably carried out with the target polynucleotide arrayed on a solid substrate. Multiple target polynucleotides can be immobilized on the solid support through linker molecules, or can be attached to particles, e.g., microspheres, which can also be attached to a solid substrate. In aspects, the solid substrate is in the form of a chip, a bead, a well, a capillary tube, a slide, a wafer, a filter, a fiber, a porous media, or a column. In aspects, the solid substrate is gold, quartz, silica, plastic, diamond, silver, metal, or polypropylene. In aspects, the solid substrate is porous.

[0123] The term “Next Generation Sequencing (NGS)” herein refers to sequencing methods that allow for massively parallel sequencing of clonally amplified molecules and of single nucleic acid molecules. Non-limiting examples of NGS include sequencing-by-synthesis using reversible dye terminators, sequencing-by-ligation, and sequencing-by-binding.

[0124] As used herein, the term “sequencing read” is used in accordance with its plain and ordinary meaning and refers to an inferred sequence of nucleotide bases (or nucleotide base probabilities) corresponding to all or part of a single polynucleotide fragment. A sequencing read may include 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, or more nucleotide bases. In aspects, a sequencing read includes reading a barcode and a template nucleotide sequence. In aspects, a sequencing read includes reading a template nucleotide sequence. In aspects, a sequencing read includes reading a barcode and not a template nucleotide sequence. In aspects, a sequencing read includes a computationally derived string corresponding to the detected label. The sequence reads are optionally stored in an appropriate data structure for further evaluation. In aspects, a first sequencing reaction can generate a first sequencing read. The first sequencing read can provide the sequence of a first region of the polynucleotide fragment. In aspects, a second sequencing primer can initiate sequencing at a second location on the nucleic acid template. The second location can be distinct from the first location. In some cases, a 3’ terminal nucleotide of the second primer can hybridize to a location that is more than 5 nucleotides away from a binding site of a 3' terminal nucleotide of the first primer. The second sequencing reaction can generate a second sequencing read. The second sequencing read can provide the sequence of a second region of the nucleic acid template which is distinct from the first region of the nucleic acid template. In some aspects, the nucleic acid template is optionally subjected to one or more additional rounds of sequencing using additional sequencing primers, thereby generating additional sequencing reads.

[0125] The term “paired end reads” refers to reads obtained from paired end sequencing that obtains one read from each end of a nucleic acid fragment. Paired end sequencing involves fragmenting DNA into sequences called inserts. In some protocols such as some used by Illumina, the reads from shorter inserts (e.g., on the order of tens to hundreds of bp) are referred to as short-insert paired end reads or simply paired end reads. In contrast, the reads from longer inserts (e.g., on the order of several thousands of bp) are referred to as mate pair reads. In this disclosure, short-insert paired end reads and long-insert mate pair reads may both be used and are not differentiated with regard to the process for determining sequences of DNA fragments. In some aspects, paired end reads include reads of about 20 bp to 1000 bp. In some aspects, paired end reads include reads of about 50 bp to 500 bp, about 80 bp to 150 bp, or about 100 bp.
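
As a non-limiting illustration, the relationship between an insert and its two paired end reads can be sketched in Python. The sketch assumes an idealized, error-free model in which the first read is taken from the 5' end of one strand and the second read from the 5' end of the complementary strand; the function names, the example insert, and the read length are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def paired_end_reads(insert, read_length):
    """Return (read1, read2), one idealized read from each end of the insert."""
    read1 = insert[:read_length]                      # from the 5' end of the given strand
    read2 = reverse_complement(insert)[:read_length]  # from the 5' end of the opposite strand
    return read1, read2


insert = "AACCGGTTACGTAC"  # hypothetical insert
read1, read2 = paired_end_reads(insert, read_length=4)
```

When the insert is longer than twice the read length, the two reads do not overlap and the middle of the insert is left unsequenced, which is one way of picturing the difference between short-insert paired end reads and long-insert mate pair reads.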

[0126] “Synthetic” agents refer to non-naturally occurring agents, such as enzymes or nucleotides derived or constructed using human-made techniques. For example, synthetic DNA polymerases refer to non-naturally occurring DNA polymerases such as those constructed by synthetic methods, mutated parent DNA polymerases such as truncated DNA polymerases and fusion DNA polymerases. Synthetic oligonucleotides, such as adapter sequences or primers, include a human-designed sequence, typically configured to maximize yield and minimize off-target products, without introducing any biases. Examples of synthetic oligonucleotide sequences include P5, P7, or complementary sequences thereof (i.e., P5' or P7'). The P5 and P7 primers are used on the surface of commercial flow cells for sequencing on various Illumina platforms, as described in U.S. Patent Publication No. 2011/0059865 A1.

[0127] The term “library” merely refers to a collection or plurality of template nucleic acid molecules which share common sequences at their 5' ends (e.g., the first end) and common sequences at their 3' ends (e.g., the second end). In aspects, a population of template nucleic acid molecules form a library.
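
The shared-end property in this definition can be expressed, purely as an illustrative sketch, as a simple check in Python. The end sequences below are hypothetical placeholders and are not the actual P5 or P7 sequences; the function name is likewise an assumption made for the example.

```python
def shares_common_ends(templates, five_prime_end, three_prime_end):
    """Return True if every template starts and ends with the given common sequences."""
    return bool(templates) and all(
        t.startswith(five_prime_end) and t.endswith(three_prime_end)
        for t in templates
    )


# Hypothetical placeholder end sequences, not real adapter sequences.
FIVE_PRIME = "AAGG"
THREE_PRIME = "CCTT"

library = ["AAGGTTTACGCCTT", "AAGGGATTACACCTT"]
not_a_library = ["AAGGTTTACGCCTT", "TTGGGATTACACCTT"]
```

In this sketch, `shares_common_ends(library, FIVE_PRIME, THREE_PRIME)` is true because both templates carry the same sequence at each end, while the second collection fails because one member has a different 5' end.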

[0128] The terms “solid surface,” “solid support” and other grammatical equivalents herein refer to any material that is appropriate for or can be modified to be appropriate for the attachment of the capture oligonucleotides. As will be appreciated by those in the art, the number of possible substrates is very large. Possible substrates include, but are not limited to, glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon™, etc.), polysaccharides, nylon or nitrocellulose, ceramics, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses, plastics, optical fiber bundles, and a variety of other polymers. Particularly useful solid supports and solid surfaces for some embodiments are located within a flowcell apparatus. Additional non-limiting examples of solid supports and solid surfaces include a bead array, a spotted array, clustered particles arranged on a surface of a chip, and a multiwell plate.

[0129] As used herein, the term “substrate” is intended to mean a solid support. The term includes any material that can serve as a solid or semi-solid foundation for creation of features such as wells for the deposition of biopolymers, including nucleic acids, polypeptides and/or other polymers. A substrate as provided herein is modified, for example, or can be modified to accommodate attachment of biopolymers by a variety of methods well known to those skilled in the art. Exemplary types of substrate materials include glass, modified glass, functionalized glass, inorganic glasses, microspheres, including inert and/or magnetic particles, plastics, polysaccharides, nylon, nitrocellulose, ceramics, resins, silica, silica-based materials, carbon, metals, an optical fiber or optical fiber bundles, a variety of polymers other than those exemplified above (e.g., cyclic olefin copolymers, polyacrylamide, cyclic olefin polymers, etc.), and multiwell microtiter plates. Specific types of exemplary plastics include acrylics, polystyrene, copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes and Teflon™. Specific types of exemplary silica-based materials include silicon and various forms of modified silicon. In a particular aspect, a “substrate” as used herein includes, but is not limited to, beads, a microarray, a plate, a multiwell plate, or a flowcell (e.g., a nonpatterned flowcell, or a patterned flowcell). The substrate can comprise a planar surface, or comprise a non-planar (e.g., convex or concave) surface. Those skilled in the art will know or understand that the composition and geometry of a substrate as provided herein can vary depending on the intended use and preferences of the user. In some aspects, the substrate may be patterned. For example, the substrate may be patterned with nanowells. Therefore, although planar substrates such as slides, chips or wafers are exemplified herein in reference to microarrays for illustration, given the teachings and guidance provided herein, those skilled in the art will understand that a wide variety of other substrates exemplified herein or well known in the art also can be used in the methods and/or compositions herein.

[0130] In a certain aspect, a substrate disclosed herein may further comprise islands or clusters of immobilized capture agents or capture oligos. The islands or clusters can be generated on the surface of a substrate (e.g., a flowcell) by using bridge amplification. In such a case, the substrate comprises a plurality of immobilized capture oligos on the surface of the substrate, which bind with complementary adapter regions present on nearby primers or oligos to form bridge-like structures; these bridge-like structures are then extended using a polymerase enzyme, generating a double-stranded molecule that is then denatured to leave a single-stranded capture oligo anchored to the substrate. After multiple iterations of the foregoing process, islands or clusters of immobilized capture oligos are created. An example of the foregoing process that can be used with the methods and compositions disclosed herein can be found in WO 2022/015913 A1, which is incorporated herein by reference in full. In a particular aspect, the nearby primers or oligos are attached to the substrate (e.g., a flowcell) by a selectively cleavable linker. Each island or cluster may be roughly circular or oval in shape. Each island or cluster may have an average diameter of 200 nm, 250 nm, 300 nm, 350 nm, 400 nm, 450 nm, 500 nm, 550 nm, 600 nm, 650 nm, 700 nm, 750 nm, 800 nm, 850 nm, 900 nm, 950 nm, 1000 nm, 1050 nm, 1100 nm, 1200 nm, or a range that includes or is in between any two of the foregoing diameters. In a further aspect, the surface of the substrate (e.g., a flowcell) comprises per 1 mm² of surface area 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, or 2.5 million clusters, or a range including or between any two of the foregoing numbers. In a particular aspect, a “substrate” as disclosed herein comprises islands or clusters of immobilized capture oligos comprising adapter sequence(s), a spatial address sequence, an optional sequence primer site, and a capture moiety for a targeted analyte. In yet a further aspect, each cluster or island on the substrate (e.g., a flowcell) comprises capture oligos that have a unique spatial address sequence, so the x,y location of each cluster or island can be identified. In such a case, the x,y location of each cluster or island can be determined by decoding the spatial address sequence. Methods to decode the spatial address sequence include, but are not limited to, the decoding-by-hybridization or the decoding-by-sequencing methods disclosed herein.
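
For illustration only, the use of a decoded spatial address sequence to assign captured sequences to x,y locations can be sketched in Python. The sketch assumes that a lookup table from spatial address sequence to x,y location has already been obtained by a separate decoding step, and that each read begins with the spatial address followed by the captured sequence; the read layout, the address length, the sequences, and the function name are all hypothetical and are not specified by the disclosure.

```python
def group_by_location(reads, address_to_xy, address_length):
    """Group captured sequences by the (x, y) location of their spatial address."""
    by_location = {}
    for read in reads:
        address = read[:address_length]
        captured = read[address_length:]
        xy = address_to_xy.get(address)
        if xy is None:
            continue  # address not in the decoded table: skipped in this sketch
        by_location.setdefault(xy, []).append(captured)
    return by_location


# Hypothetical decoded table: spatial address sequence -> (x, y) location.
address_to_xy = {"AAAC": (0, 0), "GGTT": (0, 1)}
reads = ["AAACTTTGCA", "GGTTCCCATG", "AAACGGGTAC", "TTTTAAAAAA"]
located = group_by_location(reads, address_to_xy, address_length=4)
```

In this example two captured sequences are assigned to location (0, 0), one to location (0, 1), and the read whose address is absent from the table is not assigned to any location.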

[0131] In some aspects, the substrate is an ordered substrate. An “ordered substrate” refers to an arrangement of different regions in or on an exposed layer of a substrate, where each region comprises features (e.g., nanowells) that have an assigned x,y spatial address, or an x,y spatial address that can be readily determined. An “ordered substrate” may have a specific pattern of features. In some aspects, the pattern can be a repeating arrangement of features and/or interstitial regions. In a certain aspect, the surface(s) of an “ordered substrate” can be patterned with spatial address sequences. Exemplary patterned substrates that can be used in the methods and compositions set forth herein are described in US Ser. No. 13/661,524 or US Pat. App. Publ. No. 2012/0316086 A1, each of which is incorporated herein by reference. In a particular aspect, the features of an ordered substrate can comprise immobilized oligos, or islands or clusters of immobilized oligos. In such an aspect, the location of the islands or clusters of immobilized capture oligos can readily be determined without having to decode the spatial address sequence of immobilized oligos. Accordingly, immobilized oligos having a unique spatial address sequence are optional for an “ordered substrate.” Examples of “ordered substrates” include, but are not limited to, patterned flowcells, beadchip arrays, and microarrays.

[0132] As used herein, the term “interstitial region” refers to an area in a substrate or on a surface that separates other areas of the substrate or surface. For example, an interstitial region can separate one feature of an array from another feature of the array. The two regions that are separated from each other can be discrete, lacking contact with each other. In another example, an interstitial region can separate a first portion of a feature from a second portion of a feature. The separation provided by an interstitial region can be partial or full separation. Interstitial regions will typically have a surface material that differs from the surface material of the features on the surface. For example, features of an array can have an amount or concentration of capture agents or capture oligos that exceeds the amount or concentration present at the interstitial regions. In some aspects, the capture agents or primers may not be present at the interstitial regions.

[0133] In some aspects, the substrate includes an array of wells or depressions in a surface. This may be fabricated as is generally known in the art using a variety of techniques, including, but not limited to, photolithography, stamping techniques, molding techniques and micro-etching techniques. As will be appreciated by those in the art, the technique used will depend on the composition and shape of the array substrate.

[0134] The features of a patterned substrate or an ordered substrate can be wells in an array of wells (e.g., microwells or nanowells) on glass, silicon, plastic or other suitable solid supports with patterned, covalently-linked gel such as poly(N-(5-azidoacetamidylpentyl)acrylamide-co-acrylamide) (PAZAM, see, for example, U.S. Prov. Pat. App. Ser. No. 61/753,833, which is incorporated herein by reference). The process creates gel pads used for sequencing that can be stable over sequencing runs with a large number of cycles. The covalent linking of the polymer to the wells is helpful for maintaining the gel in the structured features throughout the lifetime of the structured substrate during a variety of uses. However, in many aspects, the gel need not be covalently linked to the wells. For example, in some conditions silane free acrylamide (SFA, see, for example, U.S. Pat. App. Pub. No. 2011/0059865 A1, which is incorporated herein by reference), which is not covalently attached to any part of the structured substrate, can be used as the gel material.

[0135] In particular aspects, a patterned substrate or ordered substrate can be made by patterning a solid support material with wells (e.g., microwells or nanowells), coating the patterned support with a gel material (e.g., PAZAM, SFA or chemically modified variants thereof, such as the azidolyzed version of SFA (azido-SFA)) and polishing the gel coated support, for example via chemical or mechanical polishing, thereby retaining gel in the wells but removing or inactivating substantially all of the gel from the interstitial regions on the surface of the structured substrate between the wells. Primer nucleic acids can be attached to gel material. A solution of target nucleic acids (e.g., a fragmented human genome) can then be contacted with the polished substrate such that individual target nucleic acids will seed individual wells via interactions with primers attached to the gel material; however, the target nucleic acids will not occupy the interstitial regions due to absence or inactivity of the gel material. Amplification of the target nucleic acids will be confined to the wells since absence or inactivity of gel in the interstitial regions prevents outward migration of the growing nucleic acid colony. The process is conveniently manufacturable, being scalable and utilizing conventional micro- or nano-fabrication methods. A patterned substrate or ordered substrate can include, for example, wells etched into a slide or chip.

[0136] The pattern of the etchings and geometry of the wells can take on a variety of different shapes and sizes so long as such features are physically or functionally separable from each other. Particularly useful substrates having such structural features are patterned substrates that can select the size of solid support particles such as microspheres. An exemplary patterned substrate having these characteristics is the etched substrate used in connection with BeadArray technology (Illumina, Inc., San Diego, Calif.). Further examples are described in U.S. Pat. No. 6,770,441, which is incorporated herein by reference.

[0137] In some aspects, a substrate disclosed herein is a flowcell. The term “flowcell” as used herein refers to a chamber comprising a solid surface across which one or more fluid reagents can be flowed. Examples of flowcells and related fluidic systems and detection platforms that can be readily used in the methods of the present disclosure are described, for example, in Bentley et al., Nature 456:53-59 (2008), WO 04/018497; US 7,057,026; WO 91/06678; WO 07/123744; US 7,329,492; US 7,211,414; US 7,315,019; US 7,405,281, and US 2008/0108082, each of which is incorporated herein by reference. A flowcell can be “a nonpatterned flowcell”, where the surface(s) of the flowcell comprises randomly or semi-randomly arranged features (e.g., areas comprising clusters or islands of oligos). Alternatively, the flowcell can be a “patterned flowcell,” where the flowcell comprises features (e.g., nanowells) at fixed locations across the surface(s) of the flowcell. The features of a “patterned flowcell” can further comprise immobilized oligos, or clusters or islands of immobilized oligos. A “patterned flowcell” can be an “ordered substrate” in that the features of the patterned flowcell have an assigned x,y spatial address, or an x,y spatial address that can be readily determined.

[0138] In the methods, capture oligonucleotides are immobilized on a substrate via one or more polynucleotides, such as a polynucleotide. When referring to immobilization of molecules (e.g., nucleic acids) to a solid support, the terms “immobilized” and “attached” are used interchangeably herein and both terms are intended to encompass direct or indirect, covalent or non-covalent attachment, unless indicated otherwise, either explicitly or by context. In some embodiments, covalent attachment may be used, but generally all that is required is that the molecules (e.g., nucleic acids) remain immobilized or attached to the support under the conditions in which it is intended to use the support, for example in applications requiring nucleic acid amplification and/or sequencing. Oligonucleotides to be used as capture primers or amplification primers can be immobilized such that a 3'-end is available for enzymatic extension and at least a portion of the sequence is capable of hybridizing to a complementary sequence.

[0139] Immobilization can occur via hybridization to a surface-attached oligonucleotide, in which case the immobilized oligonucleotide or polynucleotide can be in the 3'-5' orientation. Alternatively, immobilization can occur by means other than base-pairing hybridization, such as the covalent attachment set forth above.

[0140] As used herein, the term “immobilized” refers to the state of two things being joined, fastened, adhered, attached, connected, or bound to each other. For example, an analyte, such as a nucleic acid, can be immobilized on a material, such as a bead, gel, or surface, by a covalent or non-covalent bond. A covalent bond is characterized by the sharing of pairs of electrons between atoms. A non-covalent bond is a chemical bond that does not involve the sharing of pairs of electrons and can include, for example, hydrogen bonds, ionic bonds, van der Waals forces, hydrophilic interactions and hydrophobic interactions. In various embodiments, covalent attachment can be used, but all that is required is that the oligonucleotides remain stationary or attached to a surface under conditions in which it is intended to use the surface, for example, in applications requiring nucleic acid capture, amplification, and/or sequencing.

[0141] Exemplary covalent linkages include, for example, those that result from the use of click chemistry techniques. Exemplary non-covalent linkages include, but are not limited to, non-specific interactions (e.g., hydrogen bonding, ionic bonding, van der Waals interactions etc.) or specific interactions (e.g., affinity interactions, receptor-ligand interactions, antibody-epitope interactions, avidin-biotin interactions, streptavidin-biotin interactions, lectin-carbohydrate interactions, etc.). Exemplary linkages are set forth in U.S. Pat. Nos. 6,737,236; 7,259,258; 7,375,234 and 7,427,678; and US Pat. Pub. No. 2011/0059865 A1, each of which is incorporated herein by reference.

[0142] Certain aspects may make use of an inert substrate or matrix (e.g., glass slides, polymer beads etc.) that has been functionalized, for example by application of a layer or coating of an intermediate material comprising reactive groups which permit covalent attachment to biomolecules, such as polynucleotides. Examples of such substrates include, but are not limited to, polyacrylamide hydrogels supported on an inert substrate such as glass, particularly polyacrylamide hydrogels as described in WO 2005/065814 and US 2008/0280773, the contents of which are incorporated herein in their entirety by reference. In such aspects, the biomolecules (e.g., polynucleotides) may be directly covalently attached to the intermediate material (e.g., the hydrogel) but the intermediate material may itself be non-covalently attached to the substrate or matrix (e.g., the glass substrate). The term “covalent attachment to a substrate” is to be interpreted accordingly as encompassing this type of arrangement.

[0143] As used herein, the term “array” refers to a population of sites that can be differentiated from each other according to relative location. Different molecules that are at different sites of an array can be differentiated from each other according to the locations of the sites in the array. An individual site of an array can include one or more molecules of a particular type. For example, a site can include a single target nucleic acid molecule having a particular sequence or a site can include several nucleic acid molecules having the same sequence (and/or complementary sequence thereof). The sites of an array can be different features located on the same substrate. Exemplary features include, without limitation, wells in a substrate, beads (or other particles) in or on a substrate, projections from a substrate, ridges on a substrate or channels in a substrate. The sites of an array can be separate substrates each bearing a different molecule. Different molecules attached to separate substrates can be identified according to the locations of the substrates on a surface to which the substrates are associated or according to the locations of the substrates in a liquid or gel. Exemplary arrays in which separate substrates are located on a surface include, without limitation, those having beads in wells.

[0144] As used herein, the term “plurality” is intended to mean a population of two or more different members. Pluralities can range in size from small, medium, large, to very large. The size of a small plurality can range, for example, from a few members to tens of members. Medium sized pluralities can range, for example, from tens of members to about 100 members or hundreds of members. Large pluralities can range, for example, from about hundreds of members to about 1000 members, to thousands of members and up to tens of thousands of members. Very large pluralities can range, for example, from tens of thousands of members to about hundreds of thousands, a million, millions, tens of millions and up to or greater than hundreds of millions of members. Therefore, a plurality can range in size from two to well over one hundred million members as well as all sizes, as measured by the number of members, in between and greater than the above exemplary ranges. An exemplary number of features within a microarray includes a plurality of about 500,000 or more discrete features within 1.28 cm². Exemplary nucleic acid pluralities include, for example, populations of about 1×10⁵, 5×10⁵ and 1×10⁶ or more different nucleic acid species. Accordingly, the definition of the term is intended to include all integer values greater than two. An upper limit of a plurality can be set, for example, by the theoretical diversity of nucleotide sequences in a nucleic acid sample.

[0145] As used herein, the term “determine” can be used to refer to the act of ascertaining, establishing or estimating. A determination can be probabilistic. For example, a determination can have an apparent likelihood of at least 50%, 75%, 90%, 95%, 98%, 99%, 99.9% or higher. In some cases, a determination can have an apparent likelihood of 100%. An exemplary determination is a maximum likelihood analysis or report. As used herein, the term “identify,” when used in reference to a thing, can be used to refer to recognition of the thing, distinction of the thing from at least one other thing or categorization of the thing with at least one other thing. The recognition, distinction or categorization can be probabilistic. For example, a thing can be identified with an apparent likelihood of at least 50%, 75%, 90%, 95%, 98%, 99%, 99.9% or higher. A thing can be identified based on a result of a maximum likelihood analysis. In some cases, a thing can be identified with an apparent likelihood of 100%.
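
As a minimal sketch of a probabilistic determination, the following Python example selects the candidate with the greatest support and reports it only if its apparent likelihood meets a chosen threshold. The sketch assumes that the apparent likelihood is the candidate's share of the total support across all candidates; that normalization, the function name, and the example counts are assumptions made for illustration and are not prescribed by the disclosure.

```python
def determine(support, min_likelihood=0.5):
    """Return (call, apparent_likelihood) for the best-supported candidate.

    `support` maps each candidate to a non-negative weight. The call is None
    when the best candidate falls below `min_likelihood`.
    """
    total = sum(support.values())
    if total <= 0:
        return None, 0.0
    best = max(support, key=support.get)
    apparent = support[best] / total
    return (best if apparent >= min_likelihood else None), apparent


# Hypothetical support counts for the four nucleotide types at one position.
confident_call, confident_p = determine({"A": 90, "C": 5, "G": 3, "T": 2}, min_likelihood=0.75)
weak_call, weak_p = determine({"A": 40, "C": 35, "G": 25}, min_likelihood=0.5)
```

In the first case the best candidate accounts for 90% of the support and is reported; in the second case the best candidate accounts for only 40% and no determination is made at the 50% threshold.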

[0146] As used herein, a "biological sample" may include one or more biological or chemical substances, such as nucleic acids, oligonucleotides, proteins, cells, tissues, organisms, and/or biologically active chemical compound(s), such as analogs or mimetics of the aforementioned species.

[0147] As used herein, the term “tissue” is intended to mean an aggregation of cells, and, optionally, intercellular matter. Typically the cells in a tissue are not free floating in solution and instead are attached to each other to form a multicellular structure. Exemplary tissue types include muscle, nerve, epidermal and connective tissues. In some instances, the biological sample may include whole blood, lymphatic fluid, serum, plasma, sweat, tear, saliva, sputum, cerebrospinal fluid, amniotic fluid, seminal fluid, vaginal excretion, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cystic fluid, bile, urine, gastric fluid, intestinal fluid, fecal samples, liquids containing single or multiple cells, liquids containing organelles, fluidized tissues, fluidized organisms, viruses including viral pathogens, liquids containing multi-celled organisms, biological swabs and biological washes. In further examples, the sample can be derived from an organ, including for example, an organ of the musculoskeletal system such as muscle, bone, tendon or ligament; an organ of the digestive system such as salivary gland, pharynx, esophagus, stomach, small intestine, large intestine, liver, gallbladder or pancreas; an organ of the respiratory system such as larynx, trachea, bronchi, lungs or diaphragm; an organ of the urinary system such as kidney, ureter, bladder or urethra; a reproductive organ such as ovary, fallopian tube, uterus, vagina, placenta, testicle, epididymis, vas deferens, seminal vesicle, prostate, penis or scrotum; an organ of the endocrine system such as pituitary gland, pineal gland, thyroid gland, parathyroid gland, or adrenal gland; an organ of the circulatory system such as heart, artery, vein or capillary; an organ of the lymphatic system such as lymphatic vessel, lymph node, bone marrow, thymus or spleen; an organ of the central nervous system such as brain, brainstem, cerebellum, spinal cord, cranial 
nerve, or spinal nerve; a sensory organ such as eye, ear, nose, or tongue; or an organ of the integument such as skin, subcutaneous tissue or mammary gland. In various embodiments, the tissue can be derived from a multicellular organism. In some embodiments, a tissue section can be contacted with a surface, for example, by laying the tissue on the surface. The tissue can be freshly excised from an organism, or it may have been previously preserved, for example by freezing (e.g., fresh frozen tissue), embedding in a material such as paraffin (e.g., formalin fixed paraffin embedded (FFPE) samples), formalin fixation, infiltration, dehydration or the like. Optionally, a tissue section can be attached to a surface, for example, using techniques and compositions described in, for example, U.S. Patent No. 11,390,912, incorporated by reference herein in its entirety. In some embodiments, a tissue can be permeabilized and the cells of the tissue lysed when the tissue is in contact with a surface. Any of a variety of treatments can be used such as those set forth above in regard to lysing cells. Target proteins and / or nucleic acids that are released from a tissue that is permeabilized can be captured by capture oligonucleotides on the surface. Thus, in various embodiments, the biological sample is a tissue sample. The thickness of a tissue sample or other biological sample that is contacted with a surface in a method set forth herein can be any suitable thickness desired. In representative embodiments, the thickness will be at least 0.1 µm, 0.25 µm, 0.5 µm, 0.75 µm, 1 µm, 5 µm, 10 µm, 50 µm, 100 µm or thicker. Alternatively or additionally, the thickness of a biological sample that is contacted with a surface will be no more than 100 µm, 50 µm, 10 µm, 5 µm, 1 µm, 0.5 µm, 0.25 µm, 0.1 µm or thinner.

[0148] As used herein, the term "tissue sample" refers to a piece of tissue that has been obtained from a subject, optionally fixed, sectioned, and mounted on a planar surface, e.g., a microscope slide. The tissue sample can be a formalin-fixed paraffin-embedded (FFPE) tissue sample or a fresh tissue sample or a frozen tissue sample, etc. The methods disclosed herein may be performed before or after staining the tissue sample. For example, following hematoxylin and eosin staining, a tissue sample may be spatially analyzed in accordance with the methods as provided herein. A method may include analyzing the histology of the sample (e.g., using hematoxylin and eosin staining) and then spatially analyzing the tissue. In various embodiments, the tissue is removed from the sample by enzymatic degradation. In various embodiments, the tissue removal is carried out before the RNA is removed from the tissue. In various embodiments, the tissue is removed via degradation with proteinase K, e.g., at 37°C for 40 minutes.

[0149] As used herein, the term "formalin-fixed paraffin embedded (FFPE) tissue section" refers to a piece of tissue, e.g., a biopsy that has been obtained from a subject, fixed in formaldehyde (e.g., 3%-5% formaldehyde in phosphate buffered saline) or Bouin solution, embedded in wax, cut into thin sections, and then mounted on a planar surface, e.g., a microscope slide.

[0150] As used herein, the term “subject” encompasses mammals and non-mammals. Examples of mammals include, but are not limited to, any member of the mammalian class: humans, non-human primates such as chimpanzees, and other apes and monkey species, cattle, horses, sheep, goats, swine, rabbits, dogs, cats, rodents, rats, mice, guinea pigs, and the like. Examples of non-mammals include, but are not limited to, birds, fish, and the like. The term does not denote a particular age or gender. A subject can be any living or nonliving organism, including but not limited to a human, non-human animal, plant, bacterium, fungus, virus or protist. A subject may be any age (e.g., an embryo, a fetus, infant, child, adult). A subject can be of any sex (e.g., male, female, or combination thereof). A subject may be pregnant. In some aspects, a subject is a mammal. In some aspects, a subject is a human subject. A subject can be a patient (e.g., a human patient). In some aspects, a subject is suspected of having a genetic variation or a disease or condition associated with a genetic variation.

[0151] Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly indicates otherwise, between the upper and lower limit of that range, and any other stated or unstated intervening value in, or smaller range of values within, that stated range is encompassed within the invention. The upper and lower limits of any such smaller range (within a more broadly recited range) may independently be included in the smaller ranges, or as particular values themselves, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.

[0152] The methods and kits of the present disclosure may be applied, mutatis mutandis, to the sequencing of RNA, or to determining the identity of a ribonucleotide.

[0153] As used herein, the term “kit” refers to any delivery system for delivering materials. In the context of reaction assays, such delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (e.g., oligonucleotides, enzymes, etc. in the appropriate containers) and / or supporting materials (e.g., packaging, buffers, written instructions for performing a method, etc.) from one location to another. For example, kits include one or more enclosures (e.g., boxes) containing the relevant reaction reagents and / or supporting materials. As used herein, the term “fragmented kit” refers to a delivery system comprising two or more separate containers that each contain a subportion of the total kit components. The containers may be delivered to the intended recipient together or separately. For example, a first container may contain an enzyme for use in an assay, while a second container contains oligonucleotides. In contrast, a “combined kit” refers to a delivery system containing all of the components of a reaction assay in a single container (e.g., in a single box housing each of the desired components). The term “kit” includes both fragmented and combined kits.

[0154] The recitation of a listing of elements in any definition of a variable herein includes definitions of that variable as any single element or combination (or subcombination) of listed elements. The recitation of an aspect herein includes that aspect as any single aspect or in combination with any other aspects or portions thereof.

[0155] All patents and publications mentioned in this specification are herein incorporated by reference to the same extent as if each independent patent and publication was specifically and individually indicated to be incorporated by reference.

II. Compositions and Kits

[0156] Provided herein are capture probes (also referred to herein as capture oligonucleotides or capture oligos) for capturing target nucleic acids from biological samples for spatial transcriptomic analysis.

[0157] In one aspect, the present disclosure provides a capture probe comprising a first primer binding sequence and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid. Capture probes comprising at least two non-sequential nucleotides or non-sequential nucleotide sequences are also referred to herein as “disrupted homopolymer capture probes” or “DHP capture probes”.

[0158] The capture probes of the present invention provide advantages over traditional homopolymer capture probes. Typically, as shown in FIG. 1A, capture probes used in spatial transcriptomic workflows for targeting mRNA bind to the poly-A tail of the mRNA molecule. The standard design for these capture probes comprises a poly-T homopolymer sequence. However, it is known in the art that homopolymer sequences of the length typically used to capture mRNA sequences (e.g., greater than about 10 to about 15 bases in length) have a negative effect on polymerase (e.g., reverse transcriptase) activity. As shown in FIG. 1B, the capture probes of the present disclosure incorporate deliberate mismatches in order to disrupt the extended homopolymer stretches (e.g., poly-T homopolymer stretches) of the capture probe. By breaking apart the homopolymer stretches into shorter stretches (i.e., less than about 10 bases in length), the capture probes of the present disclosure are less inhibitory to polymerase activity compared to traditional homopolymer capture probes. Accordingly, the capture probes described herein provide improved target nucleic acid capture efficiency and immobilization (e.g., improved capture of mRNA and subsequent production of first-strand cDNA on a solid support).
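
To illustrate the design principle of paragraph [0158], the following minimal Python sketch measures the longest uninterrupted run of a single base in a capture sequence. The function name `longest_homopolymer_run` and both example sequences are hypothetical and chosen only for illustration; they are not sequences of the disclosure.

```python
from itertools import groupby


def longest_homopolymer_run(seq: str) -> int:
    """Return the length of the longest stretch of one repeated base."""
    # groupby() yields one group per maximal run of identical characters.
    return max((len(list(group)) for _, group in groupby(seq.upper())), default=0)


# Hypothetical sequences, for illustration only.
traditional = "T" * 20             # uninterrupted poly-T capture sequence
disrupted = "TTTTG" * 3 + "TTTTT"  # same length, interrupted by G mismatches

print(longest_homopolymer_run(traditional))  # 20
print(longest_homopolymer_run(disrupted))    # 5
```

In this sketch the disrupted sequence keeps every homopolymer stretch below the roughly 10-base length that the paragraph above associates with polymerase inhibition.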

[0159] Furthermore, in the context of traditional capture probes (i.e. capture probes with homopolymer capture sequences greater than about 10 to about 15 bases in length), processing of the captured nucleic acids leads to the introduction of long stretches of homopolymer sequences into the final library, which may lead to detrimental downstream PCR and clustering performance due to enzyme inhibition. In contrast, the capture probes of the present disclosure minimize the length of the homopolymer stretches by breaking up the homopolymer regions with intervening nucleotides or intervening nucleotide sequences that do not efficiently base pair with the target nucleic acid. The present capture probe design therefore minimizes such downstream inhibitory effects typically seen with standard homopolymer capture probes.

[0160] In another aspect, the present disclosure provides a capture probe comprising a first primer binding sequence, a spatial barcode, and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid.

[0161] In some aspects, the capture region comprises less than ten non-sequential nucleotide sequences. In some aspects, the capture region comprises less than nine non-sequential nucleotide sequences. In some aspects, the capture region comprises less than eight non-sequential nucleotide sequences. In some aspects, the capture region comprises less than seven non-sequential nucleotide sequences. In some aspects, the capture region comprises less than six non-sequential nucleotide sequences. In some aspects, the capture region comprises less than five non-sequential nucleotide sequences. In some aspects, the capture region comprises less than four non-sequential nucleotide sequences. In some aspects, the capture region comprises less than three non-sequential nucleotide sequences.

[0162] In some aspects, the capture region comprises two to six non-sequential nucleotide sequences. In some aspects, the capture region comprises three to seven non-sequential nucleotide sequences. In some aspects, the capture region comprises four to eight non-sequential nucleotide sequences. In some aspects, the capture region comprises two non-sequential nucleotide sequences. In some aspects, the capture region comprises three non-sequential nucleotide sequences. In some aspects, the capture region comprises four non-sequential nucleotide sequences. In some aspects, the capture region comprises five non-sequential nucleotide sequences. In some aspects, the capture region comprises six non-sequential nucleotide sequences. In some aspects, the capture region comprises seven non-sequential nucleotide sequences. In some aspects, the capture region comprises eight non-sequential nucleotide sequences.

[0163] In some aspects, each of the non-sequential nucleotide sequences is separated by an intervening nucleotide or intervening nucleotide sequence. In some aspects, the intervening nucleotide and / or intervening nucleotide sequence comprises at least one nucleotide that is not complementary to the homopolymer sequence. In some aspects, the intervening nucleotide and / or intervening nucleotide sequence comprises at least two nucleotides that are not complementary to the homopolymer sequence. In some aspects, the intervening nucleotide and / or intervening nucleotide sequence comprises at least three nucleotides that are not complementary to the homopolymer sequence. In some aspects, the intervening nucleotide and / or intervening nucleotide sequence comprises at least four nucleotides that are not complementary to the homopolymer sequence. In some aspects, the intervening nucleotide and / or intervening nucleotide sequence comprises at least five nucleotides that are not complementary to the homopolymer sequence.

[0164] In some aspects, each of the at least two non-sequential nucleotide sequences is between 2 to 10 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is between 2 to 8 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is between 3 to 5 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is 2 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is 3 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is 4 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is 5 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is 6 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is 7 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is 8 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is 9 bases in length. In some aspects, each of the at least two non-sequential nucleotide sequences is 10 bases in length.
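
The relationship between non-sequential nucleotide sequences and intervening nucleotides described in paragraphs [0161] to [0164] can be sketched as a simple parsing step. The Python sketch below is illustrative only: it assumes a poly-A target (so the complementary base is T), treats every maximal run of that base as one non-sequential nucleotide sequence, and ignores modified bases. The function name `split_capture_region` is hypothetical. The example input is the capture region listed elsewhere in this section as SEQ ID NO: 13.

```python
import re


def split_capture_region(capture_region: str, complementary_base: str = "T"):
    """Split a 5'-to-3' capture region into complementary runs and the
    intervening nucleotides that separate consecutive runs."""
    seq = capture_region.upper()
    pattern = re.escape(complementary_base.upper()) + "+"
    matches = list(re.finditer(pattern, seq))
    segments = [m.group() for m in matches]
    # Only bases lying between two runs count as intervening here;
    # bases before the first run or after the last run are not reported.
    intervening = [seq[a.end():b.start()] for a, b in zip(matches, matches[1:])]
    return segments, intervening


segments, intervening = split_capture_region("GCTTATTTGTTTTCTT")
print(segments)     # ['TT', 'TTT', 'TTTT', 'TT']
print(intervening)  # ['A', 'G', 'C']
```

Under these assumptions the example has four non-sequential nucleotide sequences of 2 to 4 bases, each pair separated by a single intervening nucleotide.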

[0165] In some aspects, the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid.

[0166] In some aspects, the 3’ end of the capture region comprises at least two nucleotides complementary to the homopolymer sequence of the target nucleic acid. In some aspects, the 3’ end of the capture region comprises at least three nucleotides complementary to the homopolymer sequence of the target nucleic acid. In some aspects, the 3’ end of the capture region comprises at least four nucleotides complementary to the homopolymer sequence of the target nucleic acid. In some aspects, the 3’ end of the capture region comprises at least five nucleotides complementary to the homopolymer sequence of the target nucleic acid.
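
As a minimal sketch of the 3’ end condition in paragraphs [0165] and [0166], the following Python function counts how many consecutive bases at the 3’ end of a capture region are complementary to the homopolymer. It assumes the sequence is written 5’ to 3’ and that the target homopolymer is poly-A, so the complementary base is T. The function name is hypothetical.

```python
def three_prime_complementary_length(capture_region: str,
                                     complementary_base: str = "T") -> int:
    """Count consecutive complementary bases at the 3' end (right-hand end)."""
    seq = capture_region.upper()
    # rstrip() removes only the trailing run of the given base.
    return len(seq) - len(seq.rstrip(complementary_base.upper()))


# SEQ ID NO: 13 from this section ends in "...CTT".
print(three_prime_complementary_length("GCTTATTTGTTTTCTT"))  # 2
```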

[0167] In some aspects, the capture region comprises locked nucleic acids (LNAs), Bis-locked nucleic acids (bisLNAs), twisted intercalating nucleic acids (TINAs), bridged nucleic acids (BNAs), 2’-O-methyl RNA:DNA chimeric nucleic acids, minor groove binder (MGB) nucleic acids, morpholino nucleic acids, C5-modified pyrimidine nucleic acids, peptide nucleic acids (PNAs), phosphorothioate nucleic acids, or combinations thereof. In some aspects, the capture region comprises locked nucleic acids (LNAs) or 2’-O-methyl RNA:DNA chimeric nucleic acids. In some aspects, the capture region comprises locked nucleic acids (LNAs). In some aspects, the capture region comprises 2’-O-methyl RNA:DNA chimeric nucleic acids.

[0168] In some aspects, the intervening nucleotide and / or intervening nucleotide sequence comprises a natural base. In some aspects, the natural base is a deoxythymidine, deoxyadenosine, deoxyguanosine, deoxycytidine, or deoxyuridine. In some aspects, the natural base is a deoxythymidine. In some aspects, the natural base is a deoxyadenosine. In some aspects, the natural base is a deoxyguanosine. In some aspects, the natural base is a deoxycytidine. In some aspects, the natural base is a deoxyuridine.

[0169] In some aspects, the intervening nucleotide and / or intervening nucleotide sequence comprises an unnatural base. In some aspects, the unnatural base is a 2’-deoxyinosine, isoguanine, 3-nitropyrrole, 5-nitroindole, or isocytosine. In some aspects, the unnatural base is a 2’-deoxyinosine. In some aspects, the unnatural base is an isoguanine. In some aspects, the unnatural base is a 3-nitropyrrole. In some aspects, the unnatural base is a 5-nitroindole. In some aspects, the unnatural base is an isocytosine.

[0170] In some aspects, the capture probe further comprises an index sequence.

[0171] In some aspects, the first primer binding sequence is a first sequencing primer binding sequence or a first decoding primer binding sequence. In some aspects, the first primer binding sequence is a first sequencing primer binding sequence. In some aspects, the first primer binding sequence is a first decoding primer binding sequence.

[0172] In some aspects, the target nucleic acid is a messenger RNA (mRNA).

[0173] In some aspects, the homopolymer sequence is a poly-A sequence. In some aspects, the poly-A sequence is incorporated into the target nucleic acid using poly(A) polymerase or terminal deoxynucleotidyl transferase (TdT).

[0174] In some aspects, the homopolymer sequence is a poly-I sequence. In some aspects, the poly-I sequence is incorporated into the target nucleic acid using a polymerase. In some aspects, the polymerase is a DNA polymerase. In some aspects, the DNA polymerase is a terminal deoxynucleotidyl transferase (TdT), Bst DNA polymerase, a Deep Vent (Exo-) DNA polymerase, a Therminator I® DNA polymerase, or a Therminator IX® DNA polymerase.

[0175] In some aspects, the homopolymer sequence is between 3 and 50 bases in length. In some aspects, the homopolymer sequence is between 10 and 45 bases in length. In some aspects, the homopolymer sequence is between 20 and 40 bases in length. In some aspects, the homopolymer sequence is between 25 and 35 bases in length. In some aspects, the homopolymer sequence is about 3, 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 bases in length.

[0176] In some aspects, the homopolymer sequence is between 5 and 25 bases in length. In some aspects, the homopolymer sequence is between 25 and 50 bases in length. In some aspects, the homopolymer sequence is between 50 and 75 bases in length. In some aspects, the homopolymer sequence is between 75 and 100 bases in length. In some aspects, the homopolymer sequence is between 100 and 125 bases in length. In some aspects, the homopolymer sequence is between 125 and 150 bases in length. In some aspects, the homopolymer sequence is between 150 and 175 bases in length. In some aspects, the homopolymer sequence is between 175 and 200 bases in length.

[0177] In some aspects, the homopolymer sequence is greater than 50 bases in length. In some aspects, the homopolymer sequence is greater than 75 bases in length. In some aspects, the homopolymer sequence is greater than 100 bases in length. In some aspects, the homopolymer sequence is greater than 125 bases in length. In some aspects, the homopolymer sequence is greater than 150 bases in length. In some aspects, the homopolymer sequence is greater than 175 bases in length. In some aspects, the homopolymer sequence is greater than 200 bases in length.

[0178] In some aspects, each of the non-sequential nucleotide sequences comprises a plurality of deoxythymidines.

[0179] In some aspects, each of the non-sequential nucleotide sequences comprises a plurality of deoxyadenosines, deoxycytidines, deoxyuridines, or a combination thereof. In some aspects, each of the non-sequential nucleotide sequences comprises a plurality of deoxyadenosines. In some aspects, each of the non-sequential nucleotide sequences comprises a plurality of deoxycytidines. In some aspects, each of the non-sequential nucleotide sequences comprises a plurality of deoxyuridines.

[0180] In some aspects, the intervening nucleotide and / or intervening nucleotide sequence does not comprise a deoxythymidine.

[0181] In some aspects, the capture region comprises the nucleotide sequence: GCTTATTTGTTTTCTT (SEQ ID NO: 13). In some aspects, the capture region comprises the nucleotide sequence: GCTTATTTGTTTTCTTTTTTTTGTTTTTTTTTTTT (SEQ ID NO: 14). In some aspects, the capture region comprises the nucleotide sequence: TTTTTTTTTTTTTTTTTTTT (SEQ ID NO: 15). In some aspects, the capture region comprises the nucleotide sequence: TTTTSTTTTSTTTTSTTTTSTTTTSTTTTT (SEQ ID NO: 16).
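
SEQ ID NO: 16 contains the symbol S. The following Python sketch assumes, without confirmation from this section, that S is used in its standard IUPAC sense of a degenerate position that is either C or G. Under that assumption, the sketch counts and lazily enumerates the concrete sequences that such a degenerate capture region encodes. The function name `expand_degenerate` and the lookup table are hypothetical.

```python
from itertools import islice, product

# Assumption: "S" follows the IUPAC ambiguity code (C or G).
IUPAC_S = {"S": "CG"}


def expand_degenerate(seq: str, codes: dict = IUPAC_S):
    """Lazily yield every concrete sequence encoded by a degenerate sequence."""
    choices = [codes.get(base, base) for base in seq.upper()]
    for combo in product(*choices):
        yield "".join(combo)


seq_id_16 = "TTTTSTTTTSTTTTSTTTTSTTTTSTTTTT"
n_variants = 2 ** seq_id_16.count("S")  # two options per S position
print(n_variants)
for variant in islice(expand_degenerate(seq_id_16), 3):
    print(variant)
```

Because each S position is independent, the number of concrete sequences doubles with every S, which is why the sketch enumerates them lazily instead of building the full list.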

[0182] In some aspects, the target nucleic acid is DNA. In some aspects, the homopolymer sequence is a poly-A, poly-T, poly-G, or poly-C sequence. In some aspects, the homopolymer sequence is a poly-A sequence. In some aspects, the homopolymer sequence is a poly-T sequence. In some aspects, the homopolymer sequence is a poly-G sequence. In some aspects, the homopolymer sequence is a poly-C sequence. In some aspects, the homopolymer sequence is incorporated into the DNA using TdT.

[0183] In some aspects, the capture probe comprises, from 5’ to 3’, the first primer binding sequence and the capture region. In some aspects, the first primer binding sequence is a first sequencing primer binding sequence.

[0184] In some aspects, the capture probe comprises, from 5’ to 3’, the spatial barcode, the first primer binding sequence, and the capture region. In some aspects, the first primer binding sequence is a decoding primer binding sequence.
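
The two 5’-to-3’ arrangements described in paragraphs [0183] and [0184] can be sketched as a small data structure. In the Python sketch below, the class name `CaptureProbe`, the primer binding sequence, and the spatial barcode are all placeholders invented for illustration; only the capture region (SEQ ID NO: 13) comes from this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureProbe:
    """Capture probe elements, each written 5' to 3'."""
    primer_binding: str        # first primer binding sequence
    capture_region: str        # e.g., a disrupted homopolymer capture region
    spatial_barcode: str = ""  # optional; empty when the probe has no barcode

    def sequence(self) -> str:
        """Concatenate the elements in 5'-to-3' order."""
        return self.spatial_barcode + self.primer_binding + self.capture_region


# Placeholder sequences (assumptions), except the capture region.
PRIMER = "ACGTACGTAC"
BARCODE = "GATTACAG"
CAPTURE = "GCTTATTTGTTTTCTT"  # SEQ ID NO: 13

without_barcode = CaptureProbe(PRIMER, CAPTURE)
with_barcode = CaptureProbe(PRIMER, CAPTURE, spatial_barcode=BARCODE)
print(without_barcode.sequence())
print(with_barcode.sequence())
```

In both arrangements the capture region sits at the 3’ end of the probe, consistent with probes that are immobilized at the 5’ end.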

[0185] In some aspects, the capture probe further comprises a cleavable site. In some aspects, the cleavable site comprises a restriction endonuclease recognition site, a uracil, an 8-oxoguanine, or a combination thereof. In some aspects, the cleavable site comprises a chemically cleavable moiety or an enzymatically cleavable moiety. In some aspects, the enzymatically cleavable moiety comprises a restriction endonuclease recognition site.

[0186] In some aspects, an oligonucleotide of the disclosure, or a modified form thereof, is generally about 5 nucleotides to about 150 nucleotides in length. In further embodiments, an oligonucleotide of the disclosure is about 5 to about 125 nucleotides in length, about 5 to about 100 nucleotides in length, about 5 to about 90 nucleotides in length, about 5 to about 50 nucleotides in length, about 5 to about 45 nucleotides in length, about 5 to about 40 nucleotides in length, about 5 to about 35 nucleotides in length, about 5 to about 30 nucleotides in length, about 5 to about 25 nucleotides in length, about 5 to about 20 nucleotides in length, about 5 to about 15 nucleotides in length, about 5 to about 10 nucleotides in length, about 10 to about 150 nucleotides in length, about 10 to about 125 nucleotides in length, about 10 to about 100 nucleotides in length, about 10 to about 90 nucleotides in length, about 10 to about 50 nucleotides in length, about 10 to about 45 nucleotides in length, about 10 to about 40 nucleotides in length, about 10 to about 35 nucleotides in length, about 10 to about 30 nucleotides in length, about 10 to about 25 nucleotides in length, about 10 to about 20 nucleotides in length, about 10 to about 15 nucleotides in length, and all oligonucleotides intermediate in length of the sizes specifically disclosed to the extent that the oligonucleotide is able to achieve the desired result.

[0187] In some aspects, an oligonucleotide of the disclosure is or is at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150 or more nucleotides in length. In some aspects, an oligonucleotide of the disclosure is less than 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, or more nucleotides in length.

[0188] In some aspects, the length of an oligonucleotide (such as a primer) of the disclosure is between about 5 base pairs (bp) and 40 bp, or between about 5 bp and 35 bp, or between about 5 bp and 30 bp, or between about 10 bp and 35 bp, or between about 10 bp and 30 bp, or between about 20 bp and 40 bp, or between about 20 bp and 35 bp, or between about 20 bp and 30 bp, or between about 9 and 20 bp or between about 5 and 15 bp, or between about 9 and 15 bp in length. In some aspects, the length of an oligonucleotide (such as a primer) of the disclosure is about 10 bp, 13 bp, 15 bp, 20 bp, 25 bp, 30 bp, 35 bp, or 40 bp. As described herein, in various aspects the oligonucleotide may be a P5 primer, a P5’ primer, a P7 primer, or a P7’ primer.

[0189] In some aspects, the length of a capture probe of the disclosure is between about 35 bp and 100 bp, or between about 40 bp and 90 bp, or between about 40 bp and 80 bp, or between about 50 bp and 70 bp, or between about 50 bp and 60 bp in length. In some aspects, the length of a capture probe of the disclosure is between about 60 bp and 150 bp, or between about 60 bp and 140 bp, or between about 60 bp and 130 bp, or between about 60 bp and 120 bp, or between about 70 bp and 110 bp, or between about 80 bp and 100 bp in length.

[0190] In another aspect, provided herein is a solid support comprising a plurality of immobilized capture probes, wherein each capture probe of the plurality comprises a capture probe described herein. In some aspects, each capture probe is immobilized to the solid support at a 5’ end.

[0191] In some aspects, the solid support further comprises a plurality of immobilized spatial probes. In some aspects, each spatial probe of the plurality of immobilized spatial probes comprises a second primer binding sequence, a spatial barcode, and a probe sequence. In some aspects, each spatial probe is immobilized to the solid support at a 5’ end. In some aspects, each spatial probe is immobilized to the solid support at a 3’ end.

[0192] In some aspects, each spatial probe further comprises an index sequence, a molecular identifier, or a combination thereof. In some aspects, the molecular identifier is a unique molecular identifier (UMI). In some aspects, the UMI is a physical UMI.

[0193] In some aspects, the capture probe comprises a UMI. In some aspects, the capture probe comprises a spatial barcode. In some aspects, the capture probe comprises both a UMI and a spatial barcode. In some aspects, the UMI is a physical UMI.

[0194] In some aspects, the probe sequence is identical to a portion of the capture region of each immobilized capture probe. In some aspects, the probe sequence is identical to a portion of the capture region of each immobilized capture probe that is between 5 to 45 nucleotides in length. In some aspects, the probe sequence is identical to a portion of the capture region of each immobilized capture probe that is between 10 to 40 nucleotides in length. In some aspects, the probe sequence is identical to a portion of the capture region of each immobilized capture probe that is between 15 to 35 nucleotides in length.

[0195] In some aspects, the first primer binding sequence of the capture probe is a first sequencing primer binding sequence, and wherein the portion of the capture region that is identical to the probe sequence is adjacent to the first sequencing primer binding sequence of the capture probe. In some aspects, the portion of the capture region that is identical to the probe sequence is hybridized to a blocking element.

[0196] In some aspects, the probe sequence is hybridized to a blocking element. In some aspects, the blocking element hybridizes to a portion of the probe sequence. In some aspects, the length of the blocking element is 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, or less than 50% the length of the probe sequence.

[0197] In some aspects, the blocking element is substantially complementary to the probe sequence. In some aspects, the blocking element comprises locked nucleic acids (LNAs), Bis-locked nucleic acids (bisLNAs), twisted intercalating nucleic acids (TINAs), bridged nucleic acids (BNAs), 2’-O-methyl RNA:DNA chimeric nucleic acids, minor groove binder (MGB) nucleic acids, morpholino nucleic acids, C5-modified pyrimidine nucleic acids, peptide nucleic acids (PNAs), phosphorothioate nucleic acids, or combinations thereof. In some aspects, the capture region comprises locked nucleic acids (LNAs) or 2’-O-methyl RNA:DNA chimeric nucleic acids.

[0198] In some aspects, the probe sequence is complementary to the reverse complement of a template switch oligo sequence.

[0199] In some aspects, each spatial probe of the plurality of immobilized spatial probes comprises, from 5’ to 3’, the second primer binding sequence, the spatial barcode, and the probe sequence. In some aspects, the second primer binding sequence is a second decoding primer binding sequence. In some aspects, each spatial probe of the plurality of immobilized spatial probes comprises, from 5’ to 3’, the probe sequence, the spatial barcode, and the second primer binding sequence. In some aspects, the second primer binding sequence is a second decoding primer binding sequence.

[0200] In some aspects, the solid support is a bead array, a spotted array, a flow cell, clustered particles arranged on a surface of a chip, a film, or a plate.

[0201] In some aspects, only one capture probe in a set of capture probes comprises a capture region. In some aspects, two or more capture probes in a set of capture probes comprise a capture region.

[0202] In some aspects, only one probe in a set of capture probes comprises a spatial address region, e.g., a complete spatial address region describing the position of a capture site on a capture array. In some aspects, two or more probes in a set of capture probes can comprise a spatial address region, e.g., two or more probes can each comprise a partial spatial address region (i.e., combinatorial address region), wherein each partial address region describes the position of a capture site on a capture array, e.g., along the x-axis or the y-axis.
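
The combinatorial addressing described in paragraph [0202] can be illustrated with a minimal Python sketch in which one partial address gives the position along the x-axis and another gives the position along the y-axis. All address sequences, the lookup tables, and the function name `decode_site` are hypothetical; real partial address regions would be designed differently and would typically be longer.

```python
from itertools import product

# Hypothetical partial address sequences, for illustration only.
X_ADDRESSES = {"AACC": 0, "GGTT": 1, "CAGT": 2}  # column index along the x-axis
Y_ADDRESSES = {"ACAC": 0, "TGTG": 1}             # row index along the y-axis


def decode_site(x_partial: str, y_partial: str) -> tuple[int, int]:
    """Combine two partial addresses into one (x, y) capture site position."""
    return X_ADDRESSES[x_partial], Y_ADDRESSES[y_partial]


all_sites = {decode_site(x, y) for x, y in product(X_ADDRESSES, Y_ADDRESSES)}
print(decode_site("GGTT", "TGTG"))  # (1, 1)
print(len(all_sites))               # 6
```

In this sketch, three x-axis addresses and two y-axis addresses (five sequences in total) are enough to distinguish six capture sites, which is the practical appeal of a combinatorial scheme.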

[0203] In some aspects, a set of capture probes (e.g., an RNA and surface capture probe) can comprise at least one capture probe comprising a capture region and a spatial address region (e.g., a complete or a partial spatial address region). In some aspects, no capture probe in a set of capture probes comprises both a capture region and a spatial address region.

[0204] In some aspects, the capture site on the substrate is a plurality of capture sites. In some embodiments, the plurality of capture sites is 2 or more, 10 or more, 30 or more, 100 or more, 300 or more, 1,000 or more, 3,000 or more, 10,000 or more, 30,000 or more, 100,000 or more, 300,000 or more, 1,000,000 or more, 3,000,000 or more, 10,000,000 or more, or 1,000,000,000 or more capture sites.

[0205] In various aspects, the capture array or substrate comprises a capture site density of 1 or more, 2 or more, 10 or more, 30 or more, 100 or more, 300 or more, 1,000 or more, 3,000 or more, 10,000 or more, 100,000 or more, or 1,000,000 or more capture sites per square centimeter (cm²).

[0206] In some embodiments, the plurality of capture probes is 2 or more, 10 or more, 30 or more, 100 or more, 300 or more, 1,000 or more, 3,000 or more, 10,000 or more, 30,000 or more, 100,000 or more, 300,000 or more, 1,000,000 or more, 3,000,000 or more, 10,000,000 or more, 100,000,000 or more, or 1,000,000,000 or more capture probes.

[0207] In some embodiments, each capture probe in the plurality of capture probes within the same capture site comprises the same spatial address sequence. In some embodiments, each capture probe in the plurality of capture probes in different capture sites comprises a different spatial address sequence.

[0208] In another aspect, provided herein is a solid support comprising a plurality of capture probes described herein and a plurality of immobilized spatial probes described herein, wherein each capture probe comprises, from 5’ to 3’, the first primer binding sequence and the capture region, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid, wherein each immobilized spatial probe comprises, from 5’ to 3’, the second primer binding sequence, the spatial barcode, and the probe sequence, and wherein each capture probe is attached to the solid support at a 5’ end. In some aspects, the capture probes comprise a first primer binding sequence and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid. In some aspects, each spatial probe is immobilized to the solid support at a 5’ end.

[0209] In another aspect, provided herein is a solid support comprising a plurality of capture probes described herein, and a plurality of immobilized spatial probes described herein, wherein each capture probe comprises, from 5’ to 3’, the first primer binding sequence and the capture region, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid, wherein each immobilized spatial probe comprises, from 5’ to 3’, the probe sequence, the spatial barcode, and the second primer binding sequence, and wherein each capture probe is attached to the solid support at a 5’ end. In some aspects, the capture probes comprise a first primer binding sequence and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid. In some aspects, each spatial probe is immobilized to the solid support at a 3’ end.

[0210] In another aspect, provided herein is a solid support comprising a plurality of capture probes described herein, wherein each capture probe comprises, from 5’ to 3’, the spatial barcode, the first primer binding sequence, and the capture region, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid, and wherein each capture probe is attached to the solid support at a 5’ end. In some aspects, the capture probes comprise a first primer binding sequence, a spatial barcode, and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid.
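To illustrate what a capture region with at least two non-sequential, homopolymer-complementary nucleotide sequences might look like, the following Python sketch locates each run of a given base in a capture region. It assumes, purely for illustration, that the target homopolymer is a poly(A) sequence (so the complementary base is T) and that the interrupting nucleotides are single non-T bases. The example sequence and the function name are hypothetical.

```python
import re


def homopolymer_complement_segments(capture_region, base="T"):
    """Return (start, end, segment) for each run of `base` in the capture region.

    Runs are separated by at least one other nucleotide, so they are
    non-sequential with respect to one another. `end` is exclusive.
    """
    pattern = re.compile(re.escape(base) + "+")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(capture_region)]


# Hypothetical capture region: T runs interrupted by single non-T nucleotides.
example = "TTTTTGTTTTTCTTTTT"
segments = homopolymer_complement_segments(example)
print(segments)  # [(0, 5, 'TTTTT'), (6, 11, 'TTTTT'), (12, 17, 'TTTTT')]
print(len(segments) >= 2)  # True: at least two non-sequential segments
```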

[0211] In some aspects, provided herein are adapters, wherein each adapter comprises a double-stranded region. In some aspects, adapter molecules can be “Y”-shaped, “U”-shaped, “hairpin” shaped, have a bubble (e.g., a portion of sequence that is non-complementary), or other features. In other aspects, adapter molecules can comprise a “Y” shape, a “U” shape, a “hairpin” shape, or a bubble. Certain adapters may comprise modified or non-standard nucleotides, restriction sites, or other features for manipulation of structure or function in vitro. Adapter molecules may ligate to a variety of nucleic acid material having a terminal end. For example, adapter molecules can be suited to ligate to a T-overhang, an A-overhang, a CG-overhang, a multiple nucleotide overhang, a dehydroxylated base, a blunt end of a nucleic acid material, and the end of a molecule where the 5' end of the target is dephosphorylated or otherwise blocked from traditional ligation. In other aspects, the adapter molecule can contain a dephosphorylated or otherwise ligation-preventing modification on the 5' strand at the ligation site. In the latter two aspects, such strategies may be useful for preventing dimerization of library fragments or adapter molecules.

[0212] An adapter sequence can mean a single-strand sequence, a double-strand sequence, a complementary sequence, a non-complementary sequence, a partially complementary sequence, an asymmetric sequence, a primer binding sequence, a flow-cell sequence, a ligation sequence, or other sequence provided by an adapter molecule. In particular aspects, an adapter sequence can mean a sequence used for amplification by way of complementarity to an oligonucleotide.

[0213] In some aspects, provided methods and compositions include at least one adapter sequence (e.g., two adapter sequences, one on each of the 5' and 3' ends of a nucleic acid material). In some aspects, provided methods and compositions may comprise 2 or more adapter sequences (e.g., 3, 4, 5, 6, 7, 8, 9, 10 or more). In some aspects, at least two of the adapter sequences differ from one another (e.g., by sequence). In some aspects, each adapter sequence differs from each other adapter sequence (e.g., by sequence). In some aspects, at least one adapter sequence is at least partially non-complementary to at least a portion of at least one other adapter sequence (e.g., is non-complementary by at least one nucleotide).

[0214] In some aspects, an adapter sequence comprises at least one non-standard nucleotide. In some aspects, a non-standard nucleotide is selected from an abasic site, a uracil, tetrahydrofuran, 8-oxo-7,8-dihydro-2'-deoxyadenosine (8-oxo-A), 8-oxo-7,8-dihydro-2'-deoxyguanosine (8-oxo-G), deoxyinosine, 5-nitroindole, 5-hydroxymethyl-2'-deoxycytidine, iso-cytosine, 5'-methyl-isocytosine, isoguanosine, a methylated nucleotide, an RNA nucleotide, a ribose nucleotide, an 8-oxo-guanine, a photocleavable linker, a biotinylated nucleotide, a desthiobiotin nucleotide, a thiol modified nucleotide, an acrydite modified nucleotide, an iso-dC, an iso-dG, a 2'-O-methyl nucleotide, an inosine nucleotide, a locked nucleic acid, a peptide nucleic acid, a 5-methyl dC, a 5-bromo deoxyuridine, a 2,6-diaminopurine, a 2-aminopurine nucleotide, an abasic nucleotide, a 5-nitroindole nucleotide, an adenylated nucleotide, an azide nucleotide, a digoxigenin nucleotide, an I-linker, a 5' hexynyl modified nucleotide, a 5-octadiynyl dU, a photocleavable spacer, a non-photocleavable spacer, a click chemistry compatible modified nucleotide, and any combination thereof.

[0215] In some aspects, an adapter sequence comprises a moiety having a magnetic property (i.e., a magnetic moiety). In some aspects this magnetic property is paramagnetic. In some aspects where an adapter sequence comprises a magnetic moiety (e.g., a nucleic acid material ligated to an adapter sequence comprising a magnetic moiety), when a magnetic field is applied, an adapter sequence comprising a magnetic moiety is substantially separated from adapter sequences that do not comprise a magnetic moiety (e.g., a nucleic acid material ligated to an adapter sequence that does not comprise a magnetic moiety).

[0216] In some aspects, the adapter is partially double-stranded and is formed by annealing two oligonucleotides corresponding to the two strands. The two strands have a number of complementary base pairs (e.g., 12-17 bp) that allow the two oligonucleotides to anneal at the end to be ligated with a dsDNA fragment. A dsDNA fragment to be ligated on both ends for paired-end reads is also referred to as an insert. Other bases are not complementary on the two strands, resulting in a fork-shaped adapter having two floppy overhangs.

[0217] In some aspects, each adapter further comprises a random sequence. In some aspects, the random sequence is between 2 to 8 nucleotides in length. In some aspects, the random sequence is 2, 3, 4, 5, 6, 7, or 8 nucleotides in length.
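The number of distinct random sequences available at each stated length follows from simple counting. The short Python sketch below assumes each position is drawn from the four standard nucleotides (A, C, G, T), which is an assumption made for illustration; the function name is hypothetical.

```python
def random_sequence_diversity(length, alphabet_size=4):
    """Number of distinct sequences of `length` over an alphabet of `alphabet_size`."""
    return alphabet_size ** length


# Diversity for each random-sequence length from 2 to 8 nucleotides.
for n in range(2, 9):
    print(n, random_sequence_diversity(n))
```

Under that assumption, a 2-nucleotide random sequence offers 16 possibilities and an 8-nucleotide random sequence offers 65,536.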

[0218] In some aspects, the adapter is between 25 and 80 nucleotides in length. In some aspects, the adapter is greater than 80 nucleotides in length. In some aspects, the adapter is about 25, about 40, about 50, about 60, about 70, or about 80 nucleotides in length.

[0219] In some aspects, the adapter comprises a 3’ overhang. In some aspects, the adapter is blunt-ended.

[0220] In another aspect, provided herein is a kit comprising a solid support described herein. In some aspects, the kit further comprises a template switch oligo (TSO). In some aspects, the kit further comprises a splint oligo. In some aspects, the kit further comprises a blocking oligo.

[0221] Generally, the kit includes one or more containers providing a composition and one or more additional reagents (e.g., a buffer suitable for polynucleotide extension). The kit may also include a template nucleic acid (DNA and / or RNA), one or more primers, one or more adapters, nucleoside triphosphates (including, e.g., deoxyribonucleotides, ribonucleotides, labeled nucleotides, and / or modified nucleotides), buffers, salts, and / or labels (e.g., fluorophores).

[0222] In aspects, the kit includes a reverse transcriptase, a sequencing polymerase, and one or more amplification polymerases. In aspects, the sequencing polymerase is capable of incorporating modified nucleotides. In aspects, the polymerase is a DNA polymerase. In aspects, the DNA polymerase is a Pol I DNA polymerase, Pol II DNA polymerase, Pol III DNA polymerase, Pol IV DNA polymerase, Pol V DNA polymerase, Pol β DNA polymerase, Pol μ DNA polymerase, Pol λ DNA polymerase, Pol σ DNA polymerase, Pol α DNA polymerase, Pol δ DNA polymerase, Pol ε DNA polymerase, Pol η DNA polymerase, Pol ι DNA polymerase, Pol κ DNA polymerase, Pol ζ DNA polymerase, Pol γ DNA polymerase, Pol θ DNA polymerase, Pol υ DNA polymerase, or a thermophilic nucleic acid polymerase (e.g., Therminator γ, 9°N polymerase (exo-), Therminator II, Therminator III, or Therminator IX). In aspects, the DNA polymerase is a thermophilic nucleic acid polymerase. In aspects, the DNA polymerase is a modified archaeal DNA polymerase. In aspects, the polymerase is a reverse transcriptase. In aspects, the kit includes a strand-displacing polymerase. In aspects, the kit includes a strand-displacing polymerase, such as a phi29 polymerase, phi29 mutant polymerase or a thermostable phi29 mutant polymerase.

[0223] In aspects, the kit includes a buffered solution. Typically, the buffered solutions contemplated herein are made from a weak acid and its conjugate base or a weak base and its conjugate acid. For example, sodium acetate and acetic acid are buffer agents that can be used to form an acetate buffer. Other examples of buffer agents that can be used to make buffered solutions include, but are not limited to, Tris, bicine, tricine, HEPES, TES, MOPS, MOPSO and PIPES. Additionally, other buffer agents that can be used in enzyme reactions, hybridization reactions, and detection reactions are known in the art. In aspects, the buffered solution can include Tris. With respect to the aspects described herein, the pH of the buffered solution can be modulated to permit any of the described reactions. In some aspects, the buffered solution can have a pH greater than pH 7.0, greater than pH 7.5, greater than pH 8.0, greater than pH 8.5, greater than pH 9.0, greater than pH 9.5, greater than pH 10, greater than pH 10.5, greater than pH 11.0, or greater than pH 11.5. In other aspects, the buffered solution can have a pH ranging, for example, from about pH 6 to about pH 9, from about pH 8 to about pH 10, or from about pH 7 to about pH 9. In aspects, the buffered solution can include one or more divalent cations. Examples of divalent cations can include, but are not limited to, Mg2+, Mn2+, Zn2+, and Ca2+. In aspects, the buffered solution can contain one or more divalent cations at a concentration sufficient to permit hybridization of a nucleic acid. The kit may also include a flow cell. In aspects, the kit includes the solid support and a flow cell carrier (e.g., a flow cell carrier as described in US 2021/0190668, which is incorporated herein by reference for all purposes).
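For a buffer made from a weak acid and its conjugate base, such as the acetate buffer mentioned above, the pH can be estimated with the Henderson-Hasselbalch relation, pH = pKa + log10([conjugate base] / [weak acid]). The Python sketch below is illustrative only: it assumes an approximate pKa of 4.76 for acetic acid at 25 °C and ideal behavior (no activity corrections), and the function name is hypothetical.

```python
import math

ACETIC_ACID_PKA = 4.76  # approximate pKa of acetic acid at 25 °C (assumed value)


def buffer_ph(pka, conjugate_base_molar, weak_acid_molar):
    """Estimate buffer pH with the Henderson-Hasselbalch relation."""
    if conjugate_base_molar <= 0 or weak_acid_molar <= 0:
        raise ValueError("concentrations must be positive")
    return pka + math.log10(conjugate_base_molar / weak_acid_molar)


# Equal parts sodium acetate and acetic acid give a pH equal to the pKa.
print(round(buffer_ph(ACETIC_ACID_PKA, 0.1, 0.1), 2))  # 4.76
# A tenfold excess of conjugate base raises the pH by one unit.
print(round(buffer_ph(ACETIC_ACID_PKA, 0.1, 0.01), 2))  # 5.76
```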

[0224] In aspects, the kit includes, without limitation, nucleic acid primers, probes, adapters, enzymes, and the like, each of which is packaged in a container, such as, without limitation, a vial, tube or bottle, in a package suitable for commercial distribution, such as, without limitation, a box, a sealed pouch, a blister pack and a carton. The package typically contains a label or packaging insert indicating the uses of the packaged materials. As used herein, “packaging materials” includes any article used in the packaging for distribution of reagents in a kit, including without limitation containers, vials, tubes, bottles, pouches, blister packaging, labels, tags, instruction sheets and package inserts.

[0225] Adapters and / or primers may be supplied in the kits ready for use, as concentrates requiring dilution before use, or in a lyophilized or dried form requiring reconstitution prior to use. If required, the kits may further include a supply of a suitable diluent for dilution or reconstitution of the primers and / or adapters. Optionally, the kits may further include supplies of reagents, buffers, enzymes, and dNTPs for use in carrying out nucleic acid amplification and / or sequencing. Further components which may optionally be supplied in the kit include sequencing primers suitable for sequencing templates prepared using the methods described herein.

[0226] In addition to the above components, the subject kits may further include instructions for practicing the subject methods. These instructions may be present in the subject kits in a variety of forms, one or more of which may be present in the kit. One form in which these instructions may be present is as printed information on a suitable medium or substrate, e.g., a piece or pieces of paper on which the information is printed, in the packaging of the kit, in a package insert, etc. Yet another means would be a computer readable medium, e.g., diskette, CD, digital storage medium, etc., on which the information has been recorded. Yet another means that may be present is a website address which may be used via the Internet to access the information at a removed site. Any convenient means may be present in the kits.

III. Methods

[0227] In another aspect, provided herein is a method of generating an immobilized complement of a target nucleic acid in a biological sample. In an aspect, the method comprises: a. contacting a solid support described herein with the biological sample comprising a plurality of target nucleic acids; b. hybridizing the capture region of each capture probe to a homopolymeric sequence of a target nucleic acid from the plurality; and c. extending each capture region with a polymerase, thereby generating an immobilized complement of each target nucleic acid.
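The extension step (c) yields a strand that is the complement of the target. As a simplified illustration, the Python sketch below computes the DNA complement of an RNA template, written 5' to 3'. It assumes the target is an RNA with a 3' poly(A) sequence, and it ignores the capture probe's own sequence, any interrupting nucleotides in the capture region, and any non-templated additions. The toy template and function name are hypothetical.

```python
# Base pairing used when a DNA strand is synthesized on an RNA template.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_complement(rna_5_to_3):
    """Return the DNA complement of an RNA template, written 5' to 3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna_5_to_3))


# Toy target: a short transcript body followed by a poly(A) sequence.
template = "GGCUAAAAAA"
print(first_strand_complement(template))  # TTTTTTAGCC
```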

[0228] In another aspect, provided herein is a method of generating a plurality of second strand extension products of target nucleic acids of a biological sample, the method comprising: a. providing a solid support comprising a plurality of immobilized capture probes, wherein each capture probe of the immobilized plurality of capture probes comprises a first primer binding sequence, a spatial barcode, and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, and wherein each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid; b. contacting the solid support with a biological sample comprising a plurality of target nucleic acids; c. hybridizing the capture region of each immobilized capture probe to a homopolymeric sequence of a target nucleic acid from the plurality, thereby forming a plurality of hybridized capture probes; d. extending the capture region of each hybridized capture probe with a polymerase, thereby generating a plurality of immobilized first strand extension products, wherein the extending comprises addition of a plurality of non-templated nucleotides to the end of the immobilized first strand extension products; e. removing the plurality of target nucleic acids from the solid support; f. hybridizing a template switch oligonucleotide (TSO) to each immobilized first strand extension product, wherein the TSO is complementary to a plurality of the non-templated nucleotides, and wherein the TSO comprises a second sequencing primer binding sequence, thereby forming a plurality of hybridized TSOs; g. generating a plurality of second strand extension products using the TSO; and h. removing the plurality of second strand extension products. In some aspects, each capture probe of the immobilized plurality of capture probes comprises a second primer binding sequence.
In some aspects, the first primer binding sequence is a first sequencing primer binding sequence or a first decoding primer binding sequence, and wherein the second primer binding sequence is a second sequencing primer binding sequence or a second decoding primer binding sequence.

[0229] In another aspect, provided herein is a method of generating a plurality of second strand extension products of a target nucleic acid in a biological sample, the method comprising: a. providing a solid support comprising a plurality of immobilized capture probes and a plurality of immobilized spatial probes, wherein each capture probe comprises a sequencing primer binding sequence and a capture region, wherein each spatial probe comprises a primer binding sequence, a spatial barcode, and a probe sequence, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, and wherein each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid; b. contacting the solid support with a biological sample comprising a plurality of target nucleic acids; c. hybridizing the capture region of each immobilized capture probe to a homopolymeric sequence of a target nucleic acid from the plurality, thereby forming a plurality of hybridized capture probes; d. extending the capture region of each hybridized capture probe with a polymerase, thereby generating a plurality of immobilized first strand extension products, wherein the extending comprises addition of a plurality of non-templated nucleotides to the end of the immobilized first strand extension products; e. removing the plurality of target nucleic acids from the solid support; f. providing a plurality of splint oligonucleotides to the solid support and hybridizing the splint oligonucleotides to each immobilized first strand extension product and immobilized spatial probe to form a splinted complex, wherein the splint oligonucleotide comprises a first region complementary to the plurality of non-templated nucleotides of the first strand extension product and a second region complementary to the probe sequence of the spatial probe, thereby bringing the immobilized first strand extension product and the immobilized spatial probe of the splinted complex into ligatable proximity; g. ligating the immobilized first strand extension product and the immobilized spatial probe of each splinted complex by enzymatic or chemical ligation, thereby forming a plurality of ligated first strand extension products; h. hybridizing a primer to each ligated first strand extension product and extending the hybridized primers, thereby generating a plurality of second strand extension products; and i. removing the plurality of second strand extension products.

[0230] In another aspect, provided herein is a method of generating a plurality of second strand extension products of a target nucleic acid in a biological sample, the method comprising: a. providing a solid support comprising a plurality of immobilized capture probes and a plurality of immobilized spatial probes, wherein each capture probe comprises a sequencing primer binding sequence and a capture region, wherein each spatial probe comprises a primer binding sequence, a spatial barcode, and a probe sequence, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, and wherein each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid; b. contacting the solid support with a biological sample comprising a plurality of target nucleic acids; c. hybridizing the capture region of each immobilized capture probe to a homopolymeric sequence of a target nucleic acid from the plurality, thereby forming a plurality of hybridized capture probes; d. extending the capture region of each hybridized capture probe with a polymerase, thereby generating a plurality of immobilized first strand extension products, wherein the extending comprises addition of a plurality of non-templated nucleotides to the end of the immobilized first strand extension products; e. removing the plurality of target nucleic acids from the solid support; f. hybridizing a template switch oligonucleotide (TSO) to each immobilized first strand extension product, wherein the TSO is complementary to a plurality of the non-templated nucleotides, and wherein the TSO comprises a bait sequence at a 3’ end, thereby forming a plurality of hybridized TSOs; g. incorporating the complement of the TSO into the 3’ end of the immobilized first strand extension product by template switching, thereby adding a bait sequence complement to the 3’ end of each immobilized first strand extension product; h. hybridizing the bait sequence complement of each immobilized first strand extension product to the probe sequence of the immobilized spatial probes, and extending the 3’ end of the hybridized first strand extension product, thereby incorporating a complement of the spatial barcode and a primer binding sequence complement into the 3’ end of each immobilized first strand extension product; i. denaturing the hybridized first strand extension products and spatial probes; j. hybridizing a primer to the primer binding sequence complement of each immobilized first strand extension product and extending the hybridized primers, thereby generating a plurality of second strand extension products; and k. removing the plurality of second strand extension products. In some aspects, after step (g), the method further comprises hybridizing a blocking element to the capture region of the capture probe. In some aspects, the blocking element is complementary to a 5’ portion of the capture region of the capture probe.

[0231] In some aspects, the step of removing the plurality of second strand extension products comprises chemical or enzymatic removal of the second strand extension products. In some aspects, the chemical removal comprises contacting the plurality of second strand extension products with an alkaline solution. In some aspects, the enzymatic removal comprises enzymatic cleavage of a cleavage site, wherein the plurality of second strand extension products comprise the cleavage site at a 5’ end. In some aspects, the cleavage site comprises a restriction enzyme site, a uracil, an 8-oxoguanine, or a combination thereof.

[0232] In various aspects, the cleavage site is cleaved after the capture probe has been extended and the first complementary strands have been formed. In some aspects, the method further comprises contacting the surface with a uracil-DNA glycosylase (UDG). In some aspects, the surface is contacted with the UDG after the first complementary strands are generated and before the second complementary strands are generated. In some aspects, the method further comprises contacting the surface with a universal cleavage mix (UCM) (see, e.g., International Application Publication Number WO 2019/222264, incorporated by reference herein in its entirety, for discussion of cleavage mixes). In some aspects, contacting the surface with a universal cleavage mix (UCM) occurs before the plurality of oligonucleotide primers is hybridized to the first complementary strands. In some aspects, the second complementary strands are generated off of the surface (e.g., in solution).

[0233] In some aspects, the bait sequence is identical to a portion of the capture region of the capture probe.

[0234] In some aspects, each spatial probe further comprises an index sequence, a molecular identifier, or a combination thereof. In some aspects, the molecular identifier is a unique molecular identifier (UMI). In some aspects, the UMI comprises a physical UMI. In some aspects, the UMI comprises a virtual UMI. In some aspects, the UMI comprises both a physical UMI and a virtual UMI.

[0235] In some aspects, each capture probe further comprises a UMI. In some aspects, each capture probe further comprises a spatial barcode. In some aspects, each capture probe further comprises both a UMI and a spatial barcode. In some aspects, the UMI comprises a physical UMI. In some aspects, the UMI comprises a virtual UMI. In some aspects, the UMI comprises both a physical UMI and a virtual UMI.

[0236] In some aspects, the virtual UMI is derived from sequences at the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 6 to about 24 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 6 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 8 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 10 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 12 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 14 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 16 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 18 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 20 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 22 nucleotides in length from the 3’ end of a first-strand cDNA product. In some aspects, the virtual UMI is derived from a sequence about 24 nucleotides in length from the 3’ end of a first-strand cDNA product.
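A minimal Python sketch of deriving a virtual UMI from the 3’ end of a first-strand cDNA product is given below. It assumes the virtual UMI is simply the terminal substring of the stated length, with the sequence written 5’ to 3’; the disclosure says only that the virtual UMI is "derived from" that sequence, so the exact derivation, the strict 6 to 24 bounds, and the function name are assumptions made for illustration.

```python
def virtual_umi(first_strand_cdna, length=12):
    """Return the last `length` nucleotides at the 3' end of a cDNA sequence.

    The sequence is assumed to be written 5' to 3'. Lengths outside the
    6 to 24 nucleotide range are rejected in this sketch.
    """
    if not 6 <= length <= 24:
        raise ValueError("length must be between 6 and 24 nucleotides")
    if len(first_strand_cdna) < length:
        raise ValueError("cDNA is shorter than the requested virtual UMI")
    return first_strand_cdna[-length:]


# Toy cDNA sequence (hypothetical).
cdna = "ACGTACGTACGTTTGACCA"
print(virtual_umi(cdna, 8))  # TTTGACCA
```

Because such an identifier is read from the molecule itself rather than synthesized into the probe, two cDNA products with different 3’ ends receive different virtual UMIs without any added barcode sequence.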

[0237] In some aspects, the method further comprises amplifying the plurality of second strand extension products, thereby generating a library. In some aspects, generating the library comprises tagmentation or ligation of adapters to the second strand extension products. In some aspects, the method further comprises sequencing the library. In some aspects, sequencing comprises sequencing-by-synthesis, sequencing-by-ligation, or sequencing-by-binding.

[0238] In some aspects, the biological sample comprises a tissue sample. In some aspects, the tissue sample comprises a fresh frozen tissue sample or a formalin-fixed paraffin embedded (FFPE) sample.

[0239] In some aspects, step b) further comprises contacting the sample with a lysis buffer, a permeabilization buffer and / or a reagent to deparaffinize an FFPE sample.

[0240] In some aspects, the polymerase is a reverse transcriptase. In some aspects, the reverse transcriptase is a highly processive reverse transcriptase.

[0241] Regardless of the specific sequencing platform and protocol, at least a portion of the nucleic acids contained in the sample are sequenced to generate tens of thousands, hundreds of thousands, or millions of sequence reads, e.g., 100 bp reads. In some aspects, the sequence reads comprise about 20 bp, about 25 bp, about 30 bp, about 35 bp, about 36 bp, about 40 bp, about 45 bp, about 50 bp, about 55 bp, about 60 bp, about 65 bp, about 70 bp, about 75 bp, about 80 bp, about 85 bp, about 90 bp, about 95 bp, about 100 bp, about 110 bp, about 120 bp, about 130 bp, about 140 bp, about 150 bp, about 200 bp, about 250 bp, about 300 bp, about 350 bp, about 400 bp, about 450 bp, about 500 bp, about 800 bp, about 1000 bp, or about 2000 bp.

[0242] In some aspects, reads are aligned to a reference genome, e.g., hg19. In other aspects, reads are aligned to a portion of a reference genome, e.g., a chromosome or a chromosome segment. The reads that are uniquely mapped to the reference genome are known as sequence tags. In one aspect, at least about 3×10⁶ qualified sequence tags, at least about 5×10⁶ qualified sequence tags, at least about 8×10⁶ qualified sequence tags, at least about 10×10⁶ qualified sequence tags, at least about 15×10⁶ qualified sequence tags, at least about 20×10⁶ qualified sequence tags, at least about 30×10⁶ qualified sequence tags, at least about 40×10⁶ qualified sequence tags, or at least about 50×10⁶ qualified sequence tags are obtained from reads that map uniquely to a reference genome.
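The notion of a sequence tag as a uniquely mapped read can be sketched in Python as follows. The data structure (a dictionary from read identifier to a list of alignment hits) and the function name are hypothetical and chosen only for illustration; real aligner output would need to be parsed into this form first.

```python
def count_sequence_tags(alignments):
    """Count reads with exactly one alignment hit (uniquely mapped reads).

    `alignments` maps a read identifier to a list of (chromosome, position) hits.
    """
    return sum(1 for hits in alignments.values() if len(hits) == 1)


# Toy alignment results (hypothetical).
toy_alignments = {
    "read1": [("chr1", 1000)],               # unique: counts as a sequence tag
    "read2": [("chr2", 50), ("chr7", 900)],  # multi-mapped: not counted
    "read3": [],                             # unmapped: not counted
    "read4": [("chrX", 12)],                 # unique: counts as a sequence tag
}
print(count_sequence_tags(toy_alignments))  # 2
```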

[0243] Samples that are used for determining DNA fragment sequence can include samples taken from any cell, fluid, tissue, or organ including nucleic acids in which sequences of interest are to be determined. In some aspects involving diagnosis of cancers, circulating tumor DNA may be obtained from a subject's bodily fluid, e.g., blood or plasma. In some aspects involving diagnosis of a fetus, it is advantageous to obtain cell-free nucleic acids, e.g., cell-free DNA (cfDNA), from maternal body fluid. Cell-free nucleic acids, including cell-free DNA, can be obtained by various methods known in the art from biological samples including but not limited to plasma, serum, and urine (see, e.g., Fan et al., Proc Natl Acad Sci 105:16266-16271 [2008]; Koide et al., Prenatal Diagnosis 25:604-607 [2005]; Chen et al., Nature Med. 2:1033-1035 [1996]; Lo et al., Lancet 350:485-487 [1997]; Botezatu et al., Clin Chem. 46:1078-1084, 2000; and Su et al., J Mol. Diagn. 6:101-107 [2004]).

[0244] In various aspects the nucleic acids (e.g., DNA or RNA) present in the sample can be enriched specifically or non-specifically prior to use (e.g., prior to preparing a sequencing library). Non-specific enrichment of sample DNA refers to the whole genome amplification of the genomic DNA fragments of the sample that can be used to increase the level of the sample DNA prior to preparing a cfDNA sequencing library. Methods for whole genome amplification are known in the art. Degenerate oligonucleotide-primed PCR (DOP), primer extension PCR technique (PEP) and multiple displacement amplification (MDA) are examples of whole genome amplification methods. In some aspects, the sample is un-enriched for DNA.

[0245] The sample including the nucleic acids to which the methods described herein are applied typically includes a biological sample (“test sample”) as described above. In some aspects, the nucleic acids to be sequenced are purified or isolated by any of a number of well-known methods.

[0246] Accordingly, in certain aspects, the sample includes or consists essentially of a purified or isolated polynucleotide, or it can include samples such as a tissue sample, a biological fluid sample, a cell sample, and the like. Suitable biological fluid samples include, but are not limited to, blood, plasma, serum, sweat, tears, sputum, urine, ear flow, lymph, saliva, cerebrospinal fluid, lavages, bone marrow suspension, vaginal flow, trans-cervical lavage, brain fluid, ascites, milk, secretions of the respiratory, intestinal and genitourinary tracts, amniotic fluid, and leukophoresis samples. In some aspects, the sample is a sample that is easily obtainable by non-invasive procedures, e.g., blood, plasma, serum, sweat, tears, sputum, urine, stool, ear flow, saliva or feces. In certain aspects, the sample is a peripheral blood sample, or the plasma and/or serum fractions of a peripheral blood sample. In other aspects, the biological sample is a swab or smear, a biopsy specimen, or a cell culture. In another aspect, the sample is a mixture of two or more biological samples, e.g., a biological sample can include two or more of a biological fluid sample, a tissue sample, and a cell culture sample.

[0247] In certain aspects, samples can be obtained from sources, including, but not limited to, samples from different individuals, samples from different developmental stages of the same or different individuals, samples from different diseased individuals (e.g., individuals suspected of having a genetic disorder), normal individuals, samples obtained at different stages of a disease in an individual, samples obtained from an individual subjected to different treatments for a disease, samples from individuals subjected to different environmental factors, samples from individuals with predisposition to a pathology, samples from individuals with exposure to an infectious disease agent, and the like.

[0248] In one illustrative, but non-limiting example, the sample is a maternal sample that is obtained from a pregnant female, for example a pregnant woman. In this instance, the sample can be analyzed using the methods described herein to provide a prenatal diagnosis of potential chromosomal abnormalities in the fetus. The maternal sample can be a tissue sample, a biological fluid sample, or a cell sample. A biological fluid includes, as non-limiting examples, blood, plasma, serum, sweat, tears, sputum, urine, ear flow, lymph, saliva, cerebrospinal fluid, lavages, bone marrow suspension, vaginal flow, transcervical lavage, brain fluid, ascites, milk, secretions of the respiratory, intestinal and genitourinary tracts, and leukophoresis samples.

[0249] In certain aspects, samples can also be obtained from in vitro cultured tissues, cells, or other polynucleotide-containing sources. The cultured samples can be taken from sources including, but not limited to, cultures (e.g., tissue or cells) maintained in different media and conditions (e.g., pH, pressure, or temperature), cultures (e.g., tissue or cells) maintained for different lengths of time, cultures (e.g., tissue or cells) treated with different factors or reagents (e.g., a drug candidate, or a modulator), or cultures of different types of tissue and/or cells.

[0250] Methods of isolating nucleic acids from biological sources are well known and will differ depending upon the nature of the source. One of skill in the art can readily isolate nucleic acids from a source as needed for the method described herein. In some instances, it can be advantageous to fragment the nucleic acid molecules in the nucleic acid sample. Fragmentation can be random, or it can be specific, as achieved, for example, using restriction endonuclease digestion. Methods for random fragmentation are well known in the art, and include, for example, limited DNase digestion, alkali treatment and physical shearing.

[0251] In some aspects, nucleic acids in a tissue sample are transferred to and captured onto an array. For example, a tissue section is placed in contact with an array and nucleic acid is captured onto the array and tagged with a spatial address. The spatially-tagged DNA molecules are released from the array and analyzed, for example, by high throughput next generation sequencing (NGS), such as sequencing-by-synthesis (SBS). In some aspects, a nucleic acid in a tissue section (e.g., a formalin-fixed paraffin-embedded (FFPE) tissue section) is transferred to an array and captured onto the array by hybridization to a capture probe or capture oligonucleotide. In some aspects, a capture oligonucleotide can be a universal capture probe hybridizing, e.g., to an adaptor region in a nucleic acid sequencing library, and/or to the poly-A tail of an mRNA. In some aspects, the capture probe can be a gene-specific capture probe hybridizing, e.g., to a specifically targeted mRNA or cDNA in a sample, such as a TruSeq™ Custom Amplicon (TSCA) oligonucleotide probe (Illumina, Inc.). A capture oligonucleotide can be a plurality of capture oligonucleotides, e.g., a plurality of the same or of different capture oligonucleotides.

[0252] In some aspects, a combinatorial indexing (addressing) system is used to provide spatial information for analysis of nucleic acids in a tissue sample. The combinatorial indexing system can involve the use of two or more spatial address sequences (e.g., two, three, four, five or more spatial address sequences).

[0253] In some aspects, two spatial address sequences are incorporated into a nucleic acid during preparation of a sequencing library. A first spatial address sequence can be used to define a certain position (i.e., capture site) in the X dimension on a capture array and a second spatial address sequence can be used to define a position (i.e., a capture site) in the Y dimension on the capture array. During library sequencing, both X and Y spatial address sequences can be determined and the sequence information can be analyzed to define the specific position on the capture array.
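The X/Y lookup described above can be sketched in Python as follows. The address sequences, the table names, and the `capture_site` function are hypothetical and chosen only for illustration; the disclosure does not prescribe particular sequences or a particular data structure.

```python
from typing import Dict, Optional, Tuple

# Hypothetical address tables: address sequence -> coordinate index.
X_ADDRESSES: Dict[str, int] = {"ACGTAC": 0, "TGCATG": 1, "GATCGA": 2}
Y_ADDRESSES: Dict[str, int] = {"CCATGG": 0, "AAGCTT": 1}


def capture_site(x_seq: str, y_seq: str) -> Optional[Tuple[int, int]]:
    """Map a pair of address sequences to an (x, y) capture site.

    Returns None if either address sequence is not in its table.
    """
    x = X_ADDRESSES.get(x_seq)
    y = Y_ADDRESSES.get(y_seq)
    if x is None or y is None:
        return None
    return (x, y)
```

With this toy design, three X addresses and two Y addresses (five sequences in total) distinguish 3 × 2 = 6 capture sites, which is the practical appeal of a combinatorial scheme.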

[0254] In some aspects, three spatial address sequences are incorporated into a nucleic acid during preparation of a sequencing library. A first spatial address sequence can be used to define a certain position (i.e., capture site) in the X dimension on a capture array, a second spatial address sequence can be used to define a position (i.e., a capture site) in the Y dimension on the capture array, and a third spatial address sequence can be used to define a position of a two-dimensional sample section (e.g., the position of a slice of a tissue sample) in a sample (e.g., a tissue biopsy) to provide positional spatial information in the third dimension (Z dimension) of a sample. During library sequencing, X, Y, and Z spatial address sequences can be determined and the sequence information can be analyzed to define the specific position on the capture array.

[0255] In some aspects, a temporal address sequence (T) is optionally incorporated into a nucleic acid during preparation of a sequencing library. In some aspects, the temporal address sequence can be combined with two or three spatial address sequences. The temporal address sequence can, for example, be used in the context of a time-course experiment for determining time-dependent changes in gene-expression in a tissue sample. Time-dependent changes in gene-expression can occur in a tissue sample, for example, in response to a chemical, biological or physical stimulus (e.g., a toxin, a drug, or heat). Nucleic acid samples obtained at different timepoints from comparable tissue samples (e.g., proximal slices of a tissue sample) can be pooled and sequenced in bulk. An optional first spatial address can be used to define a certain position (i.e., capture site) in the X dimension on a capture array, a second optional spatial address sequence can be used to define a position (i.e., a capture site) in the Y dimension on the capture array, and a third optional spatial address sequence can be used to define a position of a two-dimensional sample section (e.g., the position of a slice of a tissue sample) in a sample (e.g., a tissue biopsy) to provide positional spatial information in the third dimension (Z dimension) of the sample. During library sequencing, T, X, Y, and Z address sequences are determined and the sequence information is analyzed to define the specific X, Y (and optionally Z) position on the capture array for each timepoint (T).

[0256] The address sequences X, Y, and, optionally, Z and/or T, can be consecutive nucleic acid sequences or the address sequences can be separated by one or more nucleic acids (e.g., 2 or more, 3 or more, 10 or more, 30 or more, 100 or more, 300 or more, or 1,000 or more). In some aspects, the X, Y, and optionally Z and/or T address sequences can each individually and independently be combinatorial nucleic acid sequences.
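A minimal Python sketch of how consecutive or separated address sequences might be sliced out of a read is shown below. The field order, field lengths, spacer lengths, and the example read are all assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field name, length in nucleotides).
# Fields named "spacer" stand for nucleotides separating address sequences;
# Y and Z are placed consecutively, with no spacer between them.
LAYOUT: List[Tuple[str, int]] = [
    ("T", 4), ("spacer", 2), ("X", 6), ("spacer", 3), ("Y", 6), ("Z", 4),
]


def split_addresses(read: str, layout: List[Tuple[str, int]] = LAYOUT) -> Dict[str, str]:
    """Slice the address fields out of the start of a read, skipping spacers."""
    total = sum(length for _name, length in layout)
    if len(read) < total:
        raise ValueError("read is shorter than the address layout")
    fields: Dict[str, str] = {}
    offset = 0
    for name, length in layout:
        if name != "spacer":
            fields[name] = read[offset:offset + length]
        offset += length
    return fields


# T + spacer + X + spacer + Y + Z, followed by the captured sequence.
example_read = "AAAC" + "GG" + "ACGTAC" + "TTT" + "CCATGG" + "GATC" + "ACGTACGT"
```

For the example read, the function returns the T, X, Y, and Z fields and ignores both the spacers and the trailing captured sequence.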

[0257] In some embodiments, the length of the address sequences (e.g., X, Y, Z, or T) can each individually and independently be 100 nucleic acids or less, 90 nucleic acids or less, 80 nucleic acids or less, 70 nucleic acids or less, 60 nucleic acids or less, 50 nucleic acids or less, 40 nucleic acids or less, 30 nucleic acids or less, 20 nucleic acids or less, 15 nucleic acids or less, 10 nucleic acids or less, 8 nucleic acids or less, 6 nucleic acids or less, or 4 nucleic acids or less. The length of two or more address sequences in a nucleic acid can be the same or different. For example, if the length of address sequence X is 10 nucleic acids, the length of address sequence Y can be, e.g., 8 nucleic acids, 10 nucleic acids, or 12 nucleic acids.

[0258] Address sequences, e.g., spatial address sequences such as X or Y, can be either partially or fully degenerate sequences.
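To make the notion of a partially or fully degenerate address concrete, the Python sketch below expands an address written with standard IUPAC ambiguity codes into the concrete sequences it represents and tests whether an observed sequence is compatible with it. Writing degenerate addresses in IUPAC notation is an assumption of this example, as are the function names.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes. Note that the code "Y" here
# means pyrimidine (C or T) and is unrelated to the Y address sequence.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(address: str) -> List[str]:
    """List every concrete sequence represented by a degenerate address."""
    return ["".join(bases) for bases in product(*(IUPAC[code] for code in address))]


def matches(address: str, observed: str) -> bool:
    """Return True if an observed sequence is compatible with the address."""
    return len(address) == len(observed) and all(
        base in IUPAC[code] for code, base in zip(address, observed)
    )
```

A fully degenerate four-position address (`"NNNN"`) expands to 4⁴ = 256 sequences, whereas a partially degenerate address such as `"ACNT"` expands to four.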

[0259] In some aspects, spatially addressed capture probes on an array can be released from the array onto a tissue section for generation of a spatially addressed sequencing library. In some aspects, a capture probe comprises a random primer sequence for in situ synthesis of spatially-tagged cDNA from RNA in the tissue section. In some aspects, a capture probe is a TruSeq™ Custom Amplicon (TSCA) oligonucleotide probe (Illumina, Inc.) for capturing and spatially tagging genomic DNA in the tissue section. The spatially-tagged nucleic acid molecules (e.g., cDNA or genomic DNA) are recovered from the tissue section and processed in single tube reactions to generate a spatially-tagged amplicon library.

[0260] In some aspects, magnetic nanoparticles can be used to capture nucleic acid (e.g., in situ synthesized cDNA) in a tissue sample for generation of a spatially addressed library.

[0261] In some aspects, spatial detection and analysis of nucleic acid in a tissue sample can be performed on a droplet actuator.

[0262] Described herein are improved methods and compositions for spatial-omics applications that preserve spatial information related to the origin of RNA or DNA in the tissue. Examples of spatial-omics applications include, but are not limited to, spatial genomic applications; spatial proteomic applications; spatial transcriptomic applications; spatial agrigenomic applications; spatial epigenomic applications; spatial phenomic applications; spatial ligandomic applications; and spatial multi-omic applications (e.g., transcriptomic and genomic applications).

[0263] In some aspects, the total RNA is released from the tissue sample. Release includes lysis of tissue or permeabilization of the tissue. In various embodiments, one or more samples that have been contacted with a solid support can be lysed to release target nucleic acids. Lysis can be carried out using known techniques, such as those that employ one or more of chemical treatment, enzymatic treatment, electroporation, heat, hypotonic treatment, sonication or the like. It is contemplated that the tissue sample is permeabilized prior to contacting the tissue sample with a plurality of capture oligonucleotides in the methods. In various aspects, the tissue sample is treated with one or more blocking reagents prior to contacting the tissue sample with a plurality of capture oligonucleotides in the methods. In various aspects, the tissue sample is permeabilized and treated with one or more blocking reagents prior to contacting the tissue sample with a plurality of capture oligonucleotides in the methods.

[0264] In some embodiments, a tissue sample will be treated to remove embedding material (e.g., to remove paraffin or formalin) from the sample prior to release, capture or modification of nucleic acids. This can be achieved by contacting the sample with an appropriate solvent (e.g., xylene and ethanol washes). Treatment can occur prior to contacting the tissue sample with a solid support set forth herein or the treatment can occur while the tissue sample is on the solid support. It is also contemplated that the tissue is removed from the sample by enzymatic degradation. In various aspects, the tissue removal is carried out before the RNA is removed from the tissue. In various embodiments, the tissue is removed via degradation with proteinase K, e.g., at 37°C for 40 minutes. Exemplary methods for manipulating tissues for use with solid supports to which nucleic acids are attached are set forth in US Pat. App. Publ. No. 2014/0066318, which is incorporated herein by reference.

[0265] A formalin-fixed tissue sample may also be decrosslinked using known techniques. In various aspects, decrosslinking is carried out using Tris-EDTA (TE) buffer, e.g., at pH 8, pH 9, or another appropriate buffer at an appropriate pH. Decrosslinking may also be carried out at high heat, e.g., 70°C.

[0266] The methods above are also useful for improving capture efficiency of RNA transcripts for in situ RNA transcript library preparation, and / or for improving the nucleotide length of polynucleotides used in generating an in situ transcriptome library (e.g., improving the polynucleotide size of cDNA transcribed from mRNA isolated from a sample and used in generating an in situ transcriptome library).

[0267] According to the methods described herein, spatial detection and analysis of nucleic acids in a tissue sample can be performed using sets of two or more capture probes (e.g., 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, or 10 or more capture probes). Typically, at least a first capture probe in a set of capture probes is immobilized on a capture array or a nanostructure. In some aspects, a second capture probe can be immobilized on the same capture array as the first capture probe, e.g., in proximity to the first capture probe, e.g., in the same capture site. In some aspects, a second capture probe can be immobilized on a nanostructure or a particle, such as a magnetic particle or a magnetic nanoparticle. In some aspects, a second capture probe can be in solution, e.g., to be used to perform in situ reactions with a nucleic acid in a tissue sample. The capture probes in the capture probe sets individually and independently can have a variety of different regions, e.g., a capture region (e.g., a first universal or gene-specific capture region or first clustering region), a primer binding region (e.g., a SBS primer region, such as a SBS3 or SBS12 region), or a second universal region / clustering sequence, such as a P5 or P7 region, a spatial address region (e.g., a partial or combinatorial spatial address region), or a cleavable region.

[0268] Generation of second complementary strands may be performed “on surface” or “off surface”. In some aspects, for on surface generation of second complementary strands the first complementary strands remain immobilized on the surface while second complementary strands are extended using the first complementary strands as template. In some aspects, the second complementary strands are removed (eluted) from the surface, after which the second complementary strands are subjected to indexed PCR for amplification. In some aspects, for on surface generation of second complementary strands an Exclusion Amplification (ExAmp) mix comprising an adapter-index oligonucleotide is contacted with the first complementary strands on the surface, thereby generating second complementary strands via strand invasion and isothermal amplification. In various aspects, the ExAmp mix further comprises a recombinase, a single-strand DNA binding protein (e.g., gp32 ssDNA binding protein), and a polymerase. As with any of the methods of generating second complementary strands described herein, generation of the second complementary strands may subsequently be followed by amplification of the second complementary strands (e.g., by indexed PCR), during which a second clustering primer sequence (e.g., P5) may be added to one or more of the second complementary strands. In some aspects, the amplifying comprises index PCR during which a first primer hybridizes to the first clustering primer sequence and a second primer hybridizes to the adapter nucleotide sequence, wherein the second primer comprises the second clustering primer sequence. In various aspects, an indexing sequence (e.g., i5) is also added to one or more of the second complementary strands during amplification. In some aspects, the second primer further comprises the indexing sequence. 
Addition of the second clustering primer sequence and optionally the indexing sequence to the one or more of the second complementary strands occurs, in various embodiments, off of the surface (e.g., in solution). The amplification of the second complementary strands may subsequently be followed by sequencing. The sequencing information may subsequently be correlated with a spatial location of the target nucleic acids in the biological sample.

[0269] In some aspects, for off surface generation of second complementary strands formation of the first complementary strands is followed by cleavage of the first complementary strands from the surface, and second complementary strand synthesis is performed off of the surface (e.g., in solution). In some aspects, the second complementary strands are then amplified (e.g., via indexed PCR), during which a second clustering primer sequence (e.g., P5) may be added to one or more of the second complementary strands. In various aspects, an indexing sequence (e.g., i5) is also added to one or more of the second complementary strands during amplification. Addition of the second clustering primer sequence and optionally the indexing sequence to the one or more of the second complementary strands occurs, in various aspects, off of the surface (e.g., in solution). The amplification of the second complementary strands may subsequently be followed by sequencing. The sequencing information may subsequently be correlated with a spatial location of the target nucleic acids in the biological sample.

[0270] Methods of the disclosure further provide, in various embodiments, that the biological sample / tissue sample is digested. The digestion of the biological sample can occur, in various aspects, after generation of the first complementary strands. In some aspects, digestion of the biological sample occurs after generation of the first complementary strands but prior to generation of second complementary strands. The disclosure also provides methods in which the target nucleic acids (e.g., RNA) are removed from the surface. Removal of target nucleic acids from the surface can occur, in various aspects, after generation of the first complementary strands. In some aspects, removal of the target nucleic acids occurs after generation of the first complementary strands but prior to generation of second complementary strands. Removal of the target nucleic acids from the surface is achieved, in various aspects, by changing a condition. In further aspects, the condition is temperature, pH, formamide concentration, or a combination thereof.

[0271] Aspects of the disclosure include those in which a plurality of capture oligonucleotides are immobilized on a surface. The capture oligonucleotides, in various aspects, hybridize to target nucleic acids of a biological sample. In some aspects, each of the plurality of capture oligonucleotides comprises the same capture nucleotide sequence. In further aspects, the plurality of capture oligonucleotides comprises multiple, different capture nucleotide sequences. In still further aspects, the multiple, different capture nucleotide sequences comprise one or more gene-specific capture sequences, one or more universal capture sequences, or a combination thereof. In various aspects, the capture nucleotide sequence is a poly-T sequence, a poly-A sequence, a gene-specific capture sequence, or a universal capture sequence. In further aspects, the universal capture sequence is a random nucleotide sequence or a non-self complementary semi-random sequence. Aspects of the disclosure also include those in which capture nucleotide sequences are extended following hybridization of the capture oligonucleotide to the target nucleic acid. In some aspects, the extending of the capture nucleotide sequence is carried out using a reverse transcriptase. In some implementations of a method of the disclosure, the target nucleic acids are polyadenylated prior to hybridization of the target nucleic acids to the capture nucleotide sequences. In some aspects, the target nucleic acids are polyadenylated using a poly(A) polymerase. In further embodiments, the target nucleic acids are polyadenylated using chemical ligation or enzymatic ligation.

[0272] In various aspects, sequencing may be performed on various sequencing platforms that require preparation of a sequencing library. The preparation typically involves fragmenting the DNA (sonication, nebulization or shearing), followed by DNA repair and end polishing (blunt end or A overhang), and platform-specific adapter ligation. In one aspect, the methods described herein can utilize next generation sequencing technologies (NGS), that allow multiple samples to be sequenced individually as genomic molecules (i.e., singleplex sequencing) or as pooled samples comprising indexed genomic molecules (e.g., multiplex sequencing) on a single sequencing run. These methods can generate up to several billion reads of DNA sequences. In various aspects, the sequences of genomic nucleic acids, and/or of indexed genomic nucleic acids can be determined using, for example, the Next Generation Sequencing Technologies (NGS) described herein. In various aspects, analysis of the massive amount of sequence data obtained using NGS can be performed using one or more processors as described herein.

[0273] In some aspects, the sequencing methods contemplated herein involve the preparation of sequencing libraries. In one illustrative approach, sequencing library preparation involves the production of a random collection of adapter-modified DNA fragments (e.g., polynucleotides) that are ready to be sequenced. Sequencing libraries of polynucleotides can be prepared from DNA or RNA, including equivalents, analogs of either DNA or cDNA, for example, DNA or cDNA that is complementary or copy DNA produced from an RNA template, by the action of reverse transcriptase. The polynucleotides may originate in double-stranded form (e.g., dsDNA such as genomic DNA fragments, cDNA, PCR amplification products, and the like) or, in certain aspects, the polynucleotides may originate in single-stranded form (e.g., ssDNA, RNA, etc.) and have been converted to dsDNA form. By way of illustration, in certain aspects, single stranded mRNA molecules may be copied into double-stranded cDNAs suitable for use in preparing a sequencing library. The precise sequence of the primary polynucleotide molecules is generally not material to the method of library preparation, and may be known or unknown. In one aspect, the polynucleotide molecules are DNA molecules. More particularly, in certain aspects, the polynucleotide molecules represent the entire genetic complement of an organism or substantially the entire genetic complement of an organism, and are genomic DNA molecules (e.g., cellular DNA, cell free DNA (cfDNA), etc.), that typically include both intron sequence and exon sequence (coding sequence), as well as noncoding regulatory sequences such as promoter and enhancer sequences. In certain aspects, the primary polynucleotide molecules comprise human genomic DNA molecules, e.g., cfDNA molecules present in peripheral blood of a pregnant subject.

[0274] Preparation of sequencing libraries for some NGS sequencing platforms is facilitated by the use of polynucleotides comprising a specific range of fragment sizes. Preparation of such libraries typically involves the fragmentation of large polynucleotides (e.g. cellular genomic DNA) to obtain polynucleotides in the desired size range.

[0275] Paired end reads may be used for the sequencing methods and systems disclosed herein. The fragment or insert length is longer than the read length, and sometimes longer than the sum of the lengths of the two reads.
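The relationship between insert length and read length can be made concrete with a short Python sketch. The function name and the example lengths are illustrative assumptions, not values taken from the disclosure.

```python
def unsequenced_gap(insert_length: int, read1_length: int, read2_length: int) -> int:
    """Nucleotides of the insert lying between the ends of the two reads.

    A positive value means the middle of the insert is not sequenced;
    zero means the reads abut; a negative value means the reads overlap
    by that many nucleotides.
    """
    if min(insert_length, read1_length, read2_length) <= 0:
        raise ValueError("lengths must be positive")
    return insert_length - (read1_length + read2_length)
```

For example, a 400-nucleotide insert sequenced with two 150-nucleotide reads leaves 100 nucleotides unread in the middle, whereas a 250-nucleotide insert sequenced the same way yields reads that overlap by 50 nucleotides.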

[0276] The methods and apparatus described herein may employ next generation sequencing technology (NGS), which allows massively parallel sequencing. In certain aspects, clonally amplified DNA templates or single DNA molecules are sequenced in a massively parallel fashion within a flow cell (e.g., as described in Volkerding et al. Clin Chem 55:641-658 [2009]; Metzker M Nature Rev 11:31-46 [2010]). The sequencing technologies of NGS include but are not limited to pyrosequencing, sequencing-by-synthesis with reversible dye terminators, sequencing by oligonucleotide probe ligation, and ion semiconductor sequencing. DNA from individual samples can be sequenced individually (i.e., singleplex sequencing) or DNA from multiple samples can be pooled and sequenced as indexed genomic molecules (i.e., multiplex sequencing) on a single sequencing run, to generate up to several hundred million reads of DNA sequences. Examples of sequencing technologies that can be used to obtain the sequence information according to the present method are further described here.

[0277] Some sequencing technologies are available commercially, such as the sequencing-by-hybridization platform from Affymetrix Inc. (Sunnyvale, Calif.) and the sequencing-by-synthesis platforms from 454 Life Sciences (Bradford, Conn.), Illumina / Solexa (Hayward, Calif.) and Helicos Biosciences (Cambridge, Mass.), and the sequencing-by-ligation platform from Applied Biosystems (Foster City, Calif.), as described below. In addition to the single molecule sequencing performed using sequencing-by-synthesis of Helicos Biosciences, other single molecule sequencing technologies include, but are not limited to, the SMRT™ technology of Pacific Biosciences, the ION TORRENT™ technology, and nanopore sequencing developed, for example, by Oxford Nanopore Technologies.

[0278] While the automated Sanger method is considered as a ‘first generation’ technology, Sanger sequencing including the automated Sanger sequencing, can also be employed in the methods described herein. Additional suitable sequencing methods include, but are not limited to nucleic acid imaging technologies, e.g., atomic force microscopy (AFM) or transmission electron microscopy (TEM). Illustrative sequencing technologies are described in greater detail below.

[0279] In some aspects, the disclosed methods involve obtaining sequence information for the nucleic acids in the test sample by massively parallel sequencing of millions of DNA fragments using Illumina's sequencing-by-synthesis and reversible terminator-based sequencing chemistry (e.g., as described in Bentley et al., Nature 6:53-59 [2009]). Template DNA can be genomic DNA, e.g., cellular DNA or cfDNA. In some aspects, genomic DNA from isolated cells is used as the template, and it is fragmented into lengths of several hundred base pairs. In other aspects, cfDNA or circulating tumor DNA (ctDNA) is used as the template, and fragmentation is not required as cfDNA or ctDNA exists as short fragments. For example, fetal cfDNA circulates in the bloodstream as fragments approximately 170 base pairs (bp) in length (Fan et al., Clin Chem 56:1279-1286 [2010]), and no fragmentation of the DNA is required prior to sequencing. Illumina's sequencing technology relies on the attachment of fragmented genomic DNA to a planar, optically transparent surface on which oligonucleotide anchors are bound. Template DNA is end-repaired to generate 5'-phosphorylated blunt ends, and the polymerase activity of Klenow fragment is used to add a single A base to the 3' end of the blunt phosphorylated DNA fragments. This addition prepares the DNA fragments for ligation to oligonucleotide adapters, which have an overhang of a single T base at their 3' end to increase ligation efficiency. The adapter oligonucleotides are complementary to the flow-cell anchor oligos. Under limiting-dilution conditions, adapter-modified, single-stranded template DNA is added to the flow cell and immobilized by hybridization to the anchor oligos. Attached DNA fragments are extended and bridge amplified to create an ultra-high density sequencing flow cell with hundreds of millions of clusters, each containing about 1,000 copies of the same template. In one aspect, the randomly fragmented genomic DNA is amplified using PCR before it is subjected to cluster amplification. Alternatively, an amplification-free genomic library preparation is used, and the randomly fragmented genomic DNA is enriched using the cluster amplification alone (Kozarewa et al., Nature Methods 6:291-295 [2009]). In some applications, the templates are sequenced using a robust four-color DNA sequencing-by-synthesis technology that employs reversible terminators with removable fluorescent dyes. High-sensitivity fluorescence detection is achieved using laser excitation and total internal reflection optics. Short sequence reads of about tens to a few hundred base pairs are aligned against a reference genome, and unique mappings of the short sequence reads to the reference genome are identified using specially developed data analysis pipeline software. After completion of the first read, the templates can be regenerated in situ to enable a second read from the opposite end of the fragments. Thus, either single-end or paired end sequencing of the DNA fragments can be used.

[0280] Various aspects of the disclosure may use sequencing by synthesis that allows paired end sequencing. In some aspects, the sequencing by synthesis platform by Illumina involves clustering fragments. Clustering is a process in which each fragment molecule is isothermally amplified. In some aspects, as the example described here, the fragment has two different adapters attached to the two ends of the fragment, the adapters allowing the fragment to hybridize with the two different oligos on the surface of a flow cell lane. The fragment further includes or is connected to two index sequences at two ends of the fragment, which index sequences provide labels to identify different samples in multiplex sequencing. In some sequencing platforms, a fragment to be sequenced from both ends is also referred to as an insert.
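The use of two index sequences to identify samples in multiplex sequencing can be sketched in Python as follows. The sample sheet contents, the index sequences, the `"undetermined"` bin, and the function name are hypothetical choices made for this example; exact matching on both indexes is assumed for simplicity.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical sample sheet: (index 1, index 2) -> sample name.
SAMPLE_SHEET: Dict[Tuple[str, str], str] = {
    ("ATCACG", "TTAGGC"): "sample_A",
    ("CGATGT", "TGACCA"): "sample_B",
}


def demultiplex(reads: Iterable[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Group insert sequences by sample using both index reads.

    Each read is given as (index1, index2, insert). Index pairs that are
    not in the sample sheet are collected under "undetermined".
    """
    bins: Dict[str, List[str]] = defaultdict(list)
    for index1, index2, insert in reads:
        sample = SAMPLE_SHEET.get((index1, index2), "undetermined")
        bins[sample].append(insert)
    return dict(bins)


example_reads = [
    ("ATCACG", "TTAGGC", "ACGTTGCA"),
    ("CGATGT", "TGACCA", "GGGTTTAA"),
    ("ATCACG", "TGACCA", "CCCCAAAA"),  # index pair not in the sample sheet
    ("ATCACG", "TTAGGC", "TTTTGGGG"),
]
```

Requiring both indexes to match a known pair means that a read carrying a mismatched combination is set aside rather than assigned to the wrong sample.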

[0281] In some aspects, identifying different samples comprises using a plurality of sequencing reads and a plurality of nonrandom UMIs to determine sequences of the double-stranded DNA fragments in the sample comprising identifying reads sharing a common nonrandom UMI and a common virtual UMI, wherein the common virtual UMI is found in a DNA fragment in the sample; and using the identified reads to determine a sequence of the DNA fragment in the sample. In some aspects, using the plurality of reads and the plurality of nonrandom UMIs to determine sequences of the double-stranded DNA fragments in the sample comprises identifying reads sharing a common nonrandom UMI, a common read position, and a common virtual UMI, wherein the common virtual UMI is found in a DNA fragment in the sample; and using the identified reads to determine a sequence of the DNA fragment in the sample.
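A minimal Python sketch of this grouping step is given below. Reads are grouped by nonrandom UMI, read position, and virtual UMI, and each group is collapsed into one sequence. The `Read` record, the function names, and in particular the per-position majority vote used to collapse a group are assumptions of this example; the text above does not specify how the grouped reads are combined.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Read(NamedTuple):
    umi: str          # nonrandom UMI read from the adapter
    position: int     # alignment start position of the read
    virtual_umi: str  # subsequence of the fragment itself, used as a UMI
    sequence: str


Key = Tuple[str, int, str]


def group_reads(reads: Iterable[Read]) -> Dict[Key, List[str]]:
    """Group read sequences sharing UMI, read position, and virtual UMI."""
    groups: Dict[Key, List[str]] = defaultdict(list)
    for read in reads:
        groups[(read.umi, read.position, read.virtual_umi)].append(read.sequence)
    return dict(groups)


def consensus(sequences: List[str]) -> str:
    """Most common base at each position (assumes equal-length sequences)."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


example_reads = [
    Read("AAC", 100, "GT", "GTACGT"),
    Read("AAC", 100, "GT", "GTACGA"),  # last base differs from its group
    Read("AAC", 100, "GT", "GTACGT"),
    Read("CCG", 100, "GT", "GTTCGT"),  # different UMI, so a separate group
]
```

In the example, the three reads carrying UMI `AAC` form one group whose consensus outvotes the single discordant base, while the read carrying UMI `CCG` is treated as a separate fragment.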

[0282] In some aspects, a flow cell for clustering in the Illumina platform is a glass slide with lanes. Each lane is a glass channel coated with a lawn of two types of oligos (e.g., P5 and P7' oligos). Hybridization is enabled by the first of the two types of oligos on the surface. This oligo is complementary to a first adapter on one end of the fragment. A polymerase creates a complementary strand of the hybridized fragment. The double-stranded molecule is denatured, and the original template strand is washed away. The remaining strand, in parallel with many other remaining strands, is clonally amplified through bridge amplification.

[0283] In bridge amplification and other sequencing methods involving clustering, a strand folds over, and a second adapter region on a second end of the strand hybridizes with the second type of oligos on the flow cell surface. A polymerase generates a complementary strand, forming a double-stranded bridge molecule. This double-stranded molecule is denatured resulting in two single-stranded molecules tethered to the flow cell through two different oligos. The process is then repeated over and over, and occurs simultaneously for millions of clusters resulting in clonal amplification of all the fragments. After bridge amplification, the reverse strands are cleaved and washed off, leaving only the forward strands. The 3' ends are blocked to prevent unwanted priming.

[0284] After clustering, sequencing starts with extending a first sequencing primer to generate the first read. With each cycle, fluorescently tagged nucleotides compete for addition to the growing chain. Only one is incorporated based on the sequence of the template. After the addition of each nucleotide, the cluster is excited by a light source, and a characteristic fluorescent signal is emitted. The number of cycles determines the length of the read. The emission wavelength and the signal intensity determine the base call. For a given cluster all identical strands are read simultaneously. Hundreds of millions of clusters are sequenced in a massively parallel manner. At the completion of the first read, the read product is washed away.

[0285] In the next step of protocols involving two index primers, an index 1 primer is introduced and hybridized to an index 1 region on the template. Index regions provide identification of fragments, which is useful for de-multiplexing samples in a multiplex sequencing process. The index 1 read is generated similar to the first read. After completion of the index 1 read, the read product is washed away and the 3' end of the strand is deprotected. The template strand then folds over and binds to a second oligo on the flow cell. An index 2 sequence is read in the same manner as index 1. Then an index 2 read product is washed off at the completion of the step.

[0286] After reading two indices, read 2 initiates by using polymerases to extend the second flow cell oligos, forming a double-stranded bridge. This double-stranded DNA is denatured, and the 3' end is blocked. The original forward strand is cleaved off and washed away, leaving the reverse strand. Read 2 begins with the introduction of a read 2 sequencing primer. As with read 1, the sequencing steps are repeated until the desired length is achieved. The read 2 product is washed away. This entire process generates millions of reads, representing all the fragments. Sequences from pooled sample libraries are separated based on the unique indices introduced during sample preparation. For each sample, reads of similar stretches of base calls are locally clustered. Forward and reverse reads are paired, creating contiguous sequences. These contiguous sequences are aligned to the reference genome for variant identification.
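The separation of pooled reads by index can be illustrated with a minimal sketch. The tuple layout of each read, the sample names, and the "undetermined" bin are assumptions made for illustration only and are not part of the disclosed workflow or of any Illumina software.

```python
from collections import defaultdict

def demultiplex(reads, sample_by_index):
    """Group read sequences by sample using their (index 1, index 2) pair.

    reads: iterable of (index1, index2, sequence) tuples (hypothetical layout).
    sample_by_index: dict mapping (index1, index2) -> sample name.
    Reads with an unknown index pair are placed in "undetermined".
    """
    by_sample = defaultdict(list)
    for index1, index2, sequence in reads:
        sample = sample_by_index.get((index1, index2), "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)

# Toy data: two samples and one read with an unrecognized index 1.
reads = [
    ("ACGT", "TTAG", "GATTACA"),
    ("ACGT", "TTAG", "CCGGAAT"),
    ("GGCA", "CATG", "TTTTGGA"),
    ("NNNN", "CATG", "AAAACCC"),
]
samples = {("ACGT", "TTAG"): "sample_A", ("GGCA", "CATG"): "sample_B"}
result = demultiplex(reads, samples)
```

Here exact matching of both indices is required; tolerant matching of indices with sequencing errors is a separate design choice that this sketch does not attempt.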

[0287] The sequencing by synthesis example described above involves paired end reads, which are used in many of the aspects of the disclosed methods. Paired end sequencing involves 2 reads from the two ends of a fragment. Paired end reads are used to resolve ambiguous alignments. Paired-end sequencing allows users to choose the length of the insert (or the fragment to be sequenced) and sequence either end of the insert, generating high-quality, alignable sequence data. Because the distance between each paired read is known, alignment algorithms can use this information to map reads over repetitive regions more precisely. This results in better alignment of the reads, especially across difficult-to-sequence, repetitive regions of the genome. Paired-end sequencing can detect rearrangements, including insertions and deletions (indels) and inversions.

[0288] Some aspects disclosed herein provide duplex sequencing methods that effectively suppress errors in situations when signals of valid sequences of interest are low, such as samples with low allele frequencies. The methods use virtual unique molecular indices (UMIs) in conjunction with short physical unique molecular indices placed on one arm or both arms of sequencing adapters, such as the Illumina TruSeq® adapter. These implementations are based on the strategy of using physical UMIs on adapter sequences and virtual UMIs on sample DNA fragment sequences. In some aspects, alignment positions of reads are also used to suppress errors. For example, when multiple reads (or pairs of reads) share a physical UMI and align within the same interval (constrained range of positions) on the reference, the reads are expected to originate from a single DNA fragment. Physical UMIs, virtual UMIs, and alignment positions associated with reads provide "indices" that are, alone or in combination, uniquely associated with a specific double stranded DNA fragment from a sample. Using these indices, one can identify multiple reads derived from a single DNA fragment (a single molecule), which may be just one of many fragments from the same genomic site. Using the multiple reads from a single DNA molecule, error correction can be performed effectively. For example, the sequencing methodology may obtain a consensus nucleotide sequence (also referred to as "a consensus sequence") from the multiple reads derived from the same DNA fragment, which correction does not discard valid sequence information of this DNA fragment.
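The grouping of reads that share a physical UMI and align within the same interval can be sketched as follows. The read layout, the function name, and the `window` parameter (the width of the constrained range of positions) are hypothetical choices for illustration; the disclosure does not specify how the interval is defined.

```python
from collections import defaultdict

def group_by_umi_and_position(reads, window=3):
    """Group reads into families by physical UMI and alignment interval.

    reads: iterable of (physical_umi, alignment_start, read_id) tuples.
    window: assumed maximum distance from a family's first position.
    Returns a list of (umi, first_position, [read_ids]) families.
    """
    by_umi = defaultdict(list)
    for umi, pos, read_id in reads:
        by_umi[umi].append((pos, read_id))

    families = []
    for umi, items in by_umi.items():
        items.sort()  # order by alignment start
        current, start = [], None
        for pos, read_id in items:
            # Open a new family when the read falls outside the interval.
            if start is None or pos - start > window:
                if current:
                    families.append((umi, start, current))
                current, start = [], pos
            current.append(read_id)
        families.append((umi, start, current))
    return families

reads = [
    ("AAC", 100, "r1"),
    ("AAC", 101, "r2"),
    ("AAC", 500, "r3"),  # same UMI, distant position: different fragment
    ("GTT", 100, "r4"),  # same position, different UMI: different fragment
]
families = group_by_umi_and_position(reads)
```

Reads r1 and r2 fall into one family, while r3 and r4 each form their own, which mirrors the expectation that reads sharing a UMI and an interval originate from a single fragment.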

[0289] Adapter designs can provide physical UMIs that allow one to determine which strand of the DNA fragment the reads are derived from. Some aspects take advantage of this to determine a first consensus sequence for reads derived from one strand of the DNA fragment, and a second consensus sequence for the complementary strand. In some aspects, a consensus sequence includes the base pairs detected in all or a majority of reads while excluding base pairs appearing in few of the reads. Different criteria of consensus may be implemented. The process of combining reads based on UMIs or alignment locations to obtain a consensus sequence is also referred to as "collapsing" the reads. Using physical UMIs, virtual UMIs, and / or alignment locations, one can determine that reads for the first and second consensus sequences are derived from the same double stranded fragment. Therefore, in some aspects, a third consensus sequence is determined using the first and second consensus sequences obtained for the same DNA molecule / fragment, with the third consensus sequence including base pairs common for the first and second consensus sequences while excluding those inconsistent between the two. In other aspects, only one consensus sequence may be directly obtained by collapsing all reads derived from both strands of the same fragment, instead of by comparing the two consensus sequences obtained from the two strands. Finally, the sequence of the fragment may be determined from the third consensus sequence or from the single consensus sequence, which includes base pairs that are consistent across reads derived from both strands of the fragment.
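The two-step duplex consensus can be sketched as below. This is an illustration under stated assumptions, not the disclosed implementation: reads are assumed to be of equal length and already oriented to the same strand, a strict majority is used as the per-strand criterion, and excluded positions are marked with "N" (a representation chosen here for convenience).

```python
from collections import Counter

def strand_consensus(reads, min_fraction=0.5):
    """Per-position majority call; 'N' where no base exceeds min_fraction."""
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) > min_fraction else "N")
    return "".join(consensus)

def duplex_consensus(first, second):
    """Keep bases common to both strand consensuses; mask disagreements."""
    return "".join(a if a == b else "N" for a, b in zip(first, second))

# Reads from one strand, and reads from the complementary strand that have
# already been reverse-complemented (assumption) so positions line up.
strand_one = ["ACGTAC", "ACGTAC", "ACCTAC"]
strand_two = ["ACGTAC", "ACGTTC"]

first = strand_consensus(strand_one)    # the minority 'C' at position 2 is outvoted
second = strand_consensus(strand_two)   # position 4 is a 1:1 tie, so it is masked
third = duplex_consensus(first, second)
```

The third consensus keeps only positions on which both strands agree, so an error confined to one strand does not propagate into the final sequence.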

[0290] Paired end reads may use inserts of different lengths (i.e., different fragment sizes to be sequenced). As the default meaning in this disclosure, paired end reads are used to refer to reads obtained from various insert lengths. In some instances, to distinguish short-insert paired end reads from long-insert paired end reads, the latter are specifically referred to as mate pair reads. In some aspects involving mate pair reads, two biotin junction adapters first are attached to two ends of a relatively long insert (e.g., several kb). The biotin junction adapters then link the two ends of the insert to form a circularized molecule. A sub-fragment encompassing the biotin junction adapters can then be obtained by further fragmenting the circularized molecule. The sub-fragment including the two ends of the original fragment in opposite sequence order can then be sequenced by the same procedure as for short-insert paired end sequencing described above. Further details of mate pair sequencing using an Illumina platform are shown in an online publication at the following address, which is incorporated by reference in its entirety: res.illumina.com/documents/products/technotes/technote_nextera_matepair_data_processing.pdf

[0291] After sequencing of DNA fragments, sequence reads of predetermined length, e.g., 100 bp, are localized by mapping (alignment) to a known reference genome. The mapped reads and their corresponding locations on the reference sequence are also referred to as tags. In another aspect of the procedure, localization is realized by k-mer sharing and read-read alignment. The analyses of many aspects disclosed herein make use of reads that are either poorly aligned or cannot be aligned, as well as aligned reads (tags). In one aspect, the reference genome sequence is the NCBI36/hg18 sequence, which is available on the World Wide Web at genome.ucsc.edu/cgi-bin/hgGateway?org=Human&db=hg18&hgsid=166260105. Alternatively, the reference genome sequence is the GRCh37/hg19 or GRCh38 sequence, which is available on the World Wide Web at genome.ucsc.edu/cgi-bin/hgGateway. Other sources of public sequence information include GenBank, dbEST, dbSTS, EMBL (the European Molecular Biology Laboratory), and the DDBJ (the DNA Databank of Japan). A number of computer algorithms are available for aligning sequences, including without limitation BLAST (Altschul et al., 1990), BLITZ (MPsrch) (Sturrock & Collins, 1993), FASTA (Pearson & Lipman, 1988), BOWTIE (Langmead et al., Genome Biology 10:R25.1-R25.10 [2009]), or ELAND (Illumina, Inc., San Diego, Calif., USA). In one aspect, one end of the clonally expanded copies of the plasma cfDNA molecules is sequenced and processed by bioinformatics alignment analysis for the Illumina Genome Analyzer, which uses the Efficient Large-Scale Alignment of Nucleotide Databases (ELAND) software.

[0292] Other sequencing methods may also be used to obtain sequence reads and alignments thereof. Additional suitable methods are described in U.S. patent application Ser. No. 15/130,668 filed on Apr. 15, 2016, which is incorporated by reference in its entirety.

[0293] In some aspects of the methods described herein, the sequence reads are about 20 bp, about 25 bp, about 30 bp, about 35 bp, about 40 bp, about 45 bp, about 50 bp, about 55 bp, about 60 bp, about 65 bp, about 70 bp, about 75 bp, about 80 bp, about 85 bp, about 90 bp, about 95 bp, about 100 bp, about 110 bp, about 120 bp, about 130 bp, about 140 bp, about 150 bp, about 200 bp, about 250 bp, about 300 bp, about 350 bp, about 400 bp, about 450 bp, or about 500 bp. It is expected that technological advances will enable single-end reads of greater than 500 bp, enabling reads of greater than about 1000 bp when paired end reads are generated. In some aspects, paired end reads are used to determine sequences of interest, which comprise sequence reads that are about 20 bp to 1000 bp, about 50 bp to 500 bp, or 80 bp to 150 bp. In various aspects, the paired end reads are used to evaluate a sequence of interest. The sequence of interest is longer than the reads. In some aspects, the sequence of interest is longer than about 100 bp, 500 bp, 1000 bp, or 4000 bp. Mapping of the sequence reads is achieved by comparing the sequence of the reads with the sequence of the reference to determine the chromosomal origin of the sequenced nucleic acid molecule, and specific genetic sequence information is not needed. A small degree of mismatch (0-2 mismatches per read) may be allowed to account for minor polymorphisms that may exist between the reference genome and the genomes in the mixed sample. In some aspects, reads that are aligned to the reference sequence are used as anchor reads, and reads that are paired to anchor reads but cannot align or poorly align to the reference are used as anchored reads. In some aspects, poorly aligned reads may have a relatively large number or percentage of mismatches per read, e.g., at least about 5%, at least about 10%, at least about 15%, or at least about 20% mismatches per read.
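The mismatch allowances described above can be illustrated with a small sketch. It assumes an ungapped comparison of a read against the reference at a known position, which is a simplification; real aligners such as those named in this disclosure also handle gaps and search for the position. The function names and the toy reference are hypothetical.

```python
def count_mismatches(read, reference, position):
    """Count mismatching bases between a read and the reference window."""
    window = reference[position:position + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    return sum(1 for r, g in zip(read, window) if r != g)

def is_accepted(read, reference, position, max_mismatches=2):
    """True when the read has no more than the allowed mismatches (0-2)."""
    return count_mismatches(read, reference, position) <= max_mismatches

def is_poorly_aligned(read, reference, position, min_fraction=0.05):
    """True when the mismatch fraction reaches the chosen threshold (e.g., 5%)."""
    return count_mismatches(read, reference, position) / len(read) >= min_fraction

reference = "ACGTACGTACGTACGTACGT"  # toy reference, not a real genome
good_read = "ACGAACGTAC"            # one mismatch at offset 3
poor_read = "TTTTACGTAC"            # three mismatches at offsets 0-2
```

With reads this short the two thresholds overlap, so the predicates are kept independent rather than combined into a single classification.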

[0294] A plurality of sequence tags (i.e., reads aligned to a reference sequence) are typically obtained per sample. In some aspects, at least about 3×10^6 sequence tags, at least about 5×10^6 sequence tags, at least about 8×10^6 sequence tags, at least about 10×10^6 sequence tags, at least about 15×10^6 sequence tags, at least about 20×10^6 sequence tags, at least about 30×10^6 sequence tags, at least about 40×10^6 sequence tags, or at least about 50×10^6 sequence tags of, e.g., 100 bp, are obtained from mapping the reads to the reference genome per sample. In some aspects, all the sequence reads are mapped to all regions of the reference genome, providing genome-wide reads. In other aspects, reads are mapped to a sequence of interest.

[0295] In some aspects, the plurality of reads each includes a UMI. In some aspects, the UMI comprises a physical UMI and / or a virtual UMI. In some implementations, the plurality of reads each either includes a nonrandom UMI or is associated with a nonrandom UMI through a paired-end read. In some implementations, the plurality of amplified polynucleotides each has a nonrandom UMI on one end or has a first nonrandom UMI on a first end and a second nonrandom UMI on a second end.

[0296] In some aspects, it can be advantageous to employ relatively short physical UMIs because short physical UMIs are easier to incorporate into adapters. Furthermore, shorter physical UMIs are faster and easier to sequence in the amplified fragments. However, as physical UMIs become very short, the total number of different physical UMIs can become less than the number of adapter molecules required for sample processing. In order to provide enough adapters, the same UMI would have to be repeated in two or more adapter molecules. In such a scenario, adapters having the same physical UMIs may be ligated to multiple source DNA molecules. However, these short physical UMIs may provide enough information, when combined with other information such as virtual UMIs and / or alignment locations of reads, to uniquely identify reads as being derived from a particular source polynucleotide or DNA fragment in a sample. This is so because even though the same physical UMI may be ligated to two different fragments, it is unlikely the two different fragments would also happen to have the same alignment locations, or matching subsequences serving as virtual UMIs. So if two reads have the same short physical UMI and the same alignment location (or the same virtual UMI), the two reads are likely derived from the same DNA fragment.
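The trade-off between UMI length and uniqueness can be made concrete with a short calculation. Over four bases there are at most 4^n distinct UMIs of length n; a nonrandom UMI set is a subset of these, so 4^n is an upper bound. The collision estimate below additionally assumes that UMIs and alignment start positions are uniformly and independently distributed, which is a simplifying assumption made here and not a statement from the disclosure.

```python
def max_distinct_umis(length):
    """Upper bound on the number of distinct UMIs of a given length."""
    return 4 ** length

def collision_probability(umi_length, n_positions):
    """Chance that two independent fragments share both the physical UMI
    and the alignment start, under the uniform-independence assumption."""
    return 1.0 / (max_distinct_umis(umi_length) * n_positions)

umi_counts = {n: max_distinct_umis(n) for n in (4, 6, 8)}
p_two_base_umi = collision_probability(2, 100)  # 16 UMIs x 100 positions
```

Even a very short UMI becomes discriminating once it is combined with a second, independent index such as the alignment location.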

[0297] Virtual UMIs that are defined at, or with respect to, the end positions of source DNA molecules can uniquely or nearly uniquely define individual source DNA molecules when the locations of the end positions are generally random as with some fragmentation procedures and with naturally occurring cfDNA. When the sample contains relatively few source DNA molecules, the virtual UMIs can themselves uniquely identify individual source DNA molecules. Using a combination of two virtual UMIs, each associated with a different end of a source DNA molecule, increases the likelihood that virtual UMIs alone can uniquely identify source DNA molecules. Of course, even in situations where one or two virtual UMIs cannot alone uniquely identify source DNA molecules, the combination of such virtual UMIs with one or more physical UMIs may succeed.

[0298] If two reads are derived from the same DNA fragment, two subsequences having the same base pairs will also have the same relative location in the reads. On the contrary, if two reads are derived from two different DNA fragments, it is unlikely that two subsequences having the same base pairs have the exact same relative location in the reads. Therefore, if two or more subsequences from two or more reads have the same base pairs and the same relative location on the two or more reads, it can be inferred that the two or more reads are derived from the same fragment.

[0299] In some aspects, subsequences at or near the ends of a DNA fragment are used as virtual UMIs. This design choice has some practical advantages. First, the relative locations of these subsequences on the reads are easily ascertained, as they are at or near the beginning of the reads and the system need not use an offset to find the virtual UMI. Furthermore, since the base pairs at the ends of the fragments are first sequenced, those base pairs are available even if the reads are relatively short. Moreover, base pairs determined earlier in a long read have lower sequencing error rate than those determined later. In other aspects, however, subsequences located away from the ends of the reads can be used as virtual UMIs, but their relative positions on the reads may need to be ascertained to infer that the reads are obtained from the same fragment.

[0300] One or more subsequences in a read may be used as virtual UMIs. In some aspects, two subsequences, each tracked from a different end of the source DNA molecule, are used as virtual UMIs. In various aspects, virtual UMIs are about 24 base pairs or shorter, about 20 base pairs or shorter, about 15 base pairs or shorter, about 10 base pairs or shorter, about 9 base pairs or shorter, about 8 base pairs or shorter, about 7 base pairs or shorter, or about 6 base pairs or shorter. In some aspects, virtual UMIs are about 6 to 10 base pairs. In other aspects, virtual UMIs are about 6 to 24 base pairs.
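Taking subsequences at the read starts as virtual UMIs can be sketched as follows. The function names and the default length of 6 (within the about 6 to 24 base pair range given above) are illustrative assumptions, and adapter or physical UMI bases are assumed to have been trimmed already.

```python
def virtual_umis(read1, read2, length=6):
    """Return the first `length` bases of each paired end read.

    The two subsequences are tracked from opposite ends of the source
    fragment, so no offset is needed to locate them in the reads.
    """
    if len(read1) < length or len(read2) < length:
        raise ValueError("reads are shorter than the virtual UMI length")
    return read1[:length], read2[:length]

def may_share_fragment(pair_a, pair_b, length=6):
    """True when two read pairs carry identical virtual UMIs at both ends."""
    return virtual_umis(*pair_a, length=length) == virtual_umis(*pair_b, length=length)

pair_1 = ("ACGTTGCAAT", "GGCATTACGA")
pair_2 = ("ACGTTGCCAT", "GGCATTACTT")  # differs only after the first 6 bases
pair_3 = ("TCGTTGCAAT", "GGCATTACGA")  # differs in the first base of read 1
```

Matching virtual UMIs only make a shared origin likely; in practice they would be combined with physical UMIs and alignment locations as described above.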

[0301] In various aspects using UMIs, multiple sequence reads having the same UMI(s) are collapsed to obtain one or more consensus sequences, which are then used to determine the sequence of a source DNA molecule. Multiple distinct reads may be generated from distinct instances of the same source DNA molecule, and these reads may be compared to produce a consensus sequence as described herein. The instances may be generated by amplifying a source DNA molecule prior to sequencing, such that distinct sequencing operations are performed on distinct amplification products, each sharing the source DNA molecule's sequence. Of course, amplification may introduce errors such that the sequences of the distinct amplification products have differences. In the context of some sequencing technologies such as Illumina's sequencing-by-synthesis, a source DNA molecule or an amplification product thereof forms a cluster of DNA molecules linked to a region of a flow cell. The molecules of the cluster collectively provide a read. Typically, at least two reads are required to provide a consensus sequence. Sequencing depths of 100, 1000, and 10,000 are examples of sequencing depths useful in the disclosed embodiments for creating consensus reads for low allele frequencies (e.g., about 1% or less).
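Collapsing reads that share a UMI can be sketched with an adjustable consensus criterion, expressed here as the minimum fraction of reads that must agree at a position. The names, the use of "N" for positions that fail the criterion, and the equal-length reads are assumptions made for illustration.

```python
from collections import Counter

def collapse(reads, criterion=0.9):
    """Collapse equal-length reads into one consensus sequence.

    A base is kept when it appears in at least `criterion` of the reads
    at that position; otherwise the position is reported as 'N'.
    """
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= criterion else "N")
    return "".join(consensus)

def collapse_families(families, criterion=0.9, min_reads=2):
    """Collapse each UMI family that has at least `min_reads` reads."""
    return {
        umi: collapse(reads, criterion)
        for umi, reads in families.items()
        if len(reads) >= min_reads
    }

families = {
    "AAC": ["ACGT"] * 9 + ["ACGA"],  # one read in ten carries an error
    "GTT": ["TTTT"],                 # single read: no consensus is formed
}
relaxed = collapse_families(families, criterion=0.9)
strict = collapse_families(families, criterion=1.0)
```

A lower criterion tolerates isolated amplification or sequencing errors, whereas a 100% criterion masks any position on which the reads disagree.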

[0302] In some aspects, nucleotides that are consistent across 100% of the reads sharing a UMI or combination of UMIs are included in the consensus sequence. In other aspects, the consensus criterion can be lower than 100%. For instance, a 90% consensus criterion may be used, which means that base pairs that exist in 90% or more of the reads in the group are included in the consensus sequence. In various aspects, the consensus criterion may be set at about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, or about 100%.

EXAMPLES

Example 1: Spatial Transcriptomics using Disrupted Homopolymer Capture Probes

[0303] Spatial transcriptomics enables highly multiplexed in situ gene expression profiling within complex tissues. A key challenge in achieving single-cell resolution is assay sensitivity, in which tissue RNAs are converted into sequence-ready libraries. Described herein is a method which combines on-surface template-switching, single-stranded enzymatic ligation (TSO-LIG) and isothermal amplification to efficiently convert captured tissue RNAs, e.g., mRNAs, into spatially barcoded libraries.

[0304] The capture probes and related compositions, kits, and methods of the present disclosure may be applied to a broad range of spatial transcriptomic workflows known in the art (e.g., workflows including a capture oligonucleotide, for example a capture oligonucleotide that specifically hybridizes to a homopolymer sequence in a target nucleic acid), including, but not limited to, those described in International Application Nos. PCT/US2019/053868, PCT/US2021/035270, PCT/US2023/085743, and U.S. Pat. No. 10,913,975, each of which is incorporated herein by reference in its entirety.

[0305] Described herein is an exemplary spatial transcriptomics workflow using the disrupted homopolymer capture probes of the present disclosure, using a solid support comprising an immobilized capture probe and an immobilized spatial probe. FIG. 2A shows a solid support comprising an immobilized capture probe (left) and an immobilized spatial probe (right). The immobilized capture probe comprises, from 5’ to 3’, a primer binding sequence (e.g., sbs12) and a disrupted homopolymer capture sequence (DHP). The immobilized spatial probe comprises, from 5’ to 3’, a first primer binding sequence (e.g., P5), a spatial barcode, a second primer binding sequence (e.g., sbs3) and a probe sequence. The probe sequence of the immobilized spatial probe shares sequence identity with the DHP sequence of the immobilized capture probe.

[0306] FIG. 2B shows the step of hybridizing a target nucleic acid (e.g., an mRNA molecule from a biological sample), wherein the poly-A tail of the mRNA molecule hybridizes to the DHP sequence of the capture probe. Mismatches in the poly-A tail and DHP sequence duplex lead to disruptions in base pairing, as indicated by the arrows. Polymerase extension (e.g., extension with a reverse transcriptase) of the capture probe is then performed to generate a first cDNA extension product of the captured mRNA. FIGs. 2C and 2D show the process of template switching with a template switch oligo (TSO) including a bait sequence, thereby generating an extended capture probe including a bait sequence complement. Following template switching, a blocking element is hybridized to the DHP sequence of the capture probe, as shown in FIG. 2E, to inhibit hybridization of the probe sequence of the immobilized spatial probe to the DHP sequence of the capture probe.
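The positions at which a disrupted homopolymer sequence fails to pair with a poly-A tail can be located with a short sketch. The 20-nucleotide sequence below is a made-up illustration and does not correspond to any SEQ ID NO of this disclosure; the function names are likewise hypothetical. Because the target is a homopolymer, every probe position faces an A, so a mismatch occurs wherever the probe base is not T.

```python
import re

def complementary_runs(capture_seq, complement_base="T"):
    """Return (start, length) for each run of bases that can pair with
    the homopolymer (T runs when the target is a poly-A tail)."""
    return [
        (m.start(), len(m.group()))
        for m in re.finditer(f"{complement_base}+", capture_seq)
    ]

def disruption_positions(capture_seq, complement_base="T"):
    """Return the 0-based positions of intervening, non-pairing bases."""
    return [i for i, base in enumerate(capture_seq) if base != complement_base]

# Hypothetical capture sequence written 5' to 3': runs of T interrupted
# by single non-T bases, ending in a T run at the 3' end.
hypothetical_dhp = "TTTTGTTTTCTTTTATTTTT"

runs = complementary_runs(hypothetical_dhp)
mismatches = disruption_positions(hypothetical_dhp)
```

In this toy sequence the 3' end is a run of five T bases, so the end that is extended by the polymerase remains fully paired with the poly-A tail.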

[0307] The bait sequence complement then hybridizes to the probe sequence and is extended by a polymerase, as shown in FIG. 2F, to incorporate the spatial barcode and additional primer binding sequences (e.g., P5’ and sbs3’). The extended capture probe may then be removed from the solid support (e.g., by chemical or enzymatic cleavage) and processed for downstream applications, such as sequencing.

Example 2: Library Generation with Capture Probes Comprising Disrupted Homopolymers

[0308] Capture probes comprising an adapter (adapter 1), a barcode comprising 10 random nucleotides (N10), and a capture sequence (20T homopolymer (SEQ ID NO: 15), DHP0 (SEQ ID NO: 14), or DHPnaive (SEQ ID NO: 16)) were created and used to generate a library (FIG. 3). The capture probes were immobilized on a substrate and 1 pg mouse kidney total RNA (comprising an adapter-TSO oligo at the 5’ end) was contacted with the substrate so that the polyA tail of the mRNA molecules hybridized to the capture sequence of the capture probes. The capture probe was extended to generate first strand cDNA, using the target nucleic acid as a template and incubating the samples in reverse transcriptase enzyme mix comprising an adapter-template switch oligo (adapter 2) at 42°C, overnight. Next, the RNA was removed from the sample by denaturation. Second strand synthesis was performed with polymerase and primers for adapter 2. The strand was hybridized with adapter 1 and the strands were cleaved from the substrate. The cleavage eluate was amplified with dual-index (adapter 1, adapter 2) primer sets and sequenced.

[0309] The amount of mRNA captured by capture probes comprising either the 20T homopolymer, DHPO, or DHPnaive capture sequences was quantified by qPCR with primers for adapter 1 and adapter 2. Capture probes comprising a DHPO capture sequence showed similar results as capture probes comprising a 20T homopolymer capture sequence (Table 1). Capture probes comprising the 20T homopolymer capture sequence and capture probes comprising the DHPO capture sequence both generated ~4-fold more mRNA compared to capture probes comprising a DHPnaive capture sequence. The results are an average of 3 replicates.

Table 1. qPCR results

[0310] The quality of the library that was generated from capture probes comprising either a 20T homopolymer, DHPO, or DHPnaive capture sequence was assessed by TapeStation. While all three libraries showed similar size distributions and no significant byproduct peaks, the 20T and DHPO libraries had significantly higher amounts of DNA for sequencing (FIG. 4). The libraries generated by capture probes comprising the 20T homopolymer capture sequence or DHPO capture sequence performed similarly.

[0311] Next, the expression level of over 23,000 individual genes from each of the libraries was quantified. There were 77 genes (~0.003%) with significantly increased expression and 83 genes (~0.003%) with significantly decreased expression in the libraries that were generated with capture probes comprising a DHPO capture sequence compared to the libraries that were generated with capture probes comprising a 20T homopolymer capture sequence, suggesting that the DHPO capture sequence performed similarly to the 20T homopolymer capture sequence (FIG. 5A).

[0312] There were 4,312 genes (~0.18%) with significantly increased expression and 257 genes (0.01%) with significantly decreased expression in the libraries that were generated with capture probes comprising a DHPnaive capture sequence compared to the libraries that were generated with capture probes comprising a 20T homopolymer capture sequence (FIG. 5B). Thus, libraries generated from capture probes comprising a DHPnaive capture sequence exhibited higher variation in gene expression compared to libraries generated from capture probes comprising a DHPO capture sequence.

[0313] All patents, patent applications, government publications, government regulations, and literature references cited in this specification are hereby incorporated herein by reference in their entirety. In the case of conflict, the present description, including definitions, will control.

[0314] The preceding merely illustrates the principles of the invention. It will be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles and aspects of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. The scope of the present invention, therefore, is not intended to be limited to the exemplary aspects shown and described herein. Rather, the scope and spirit of the present invention is embodied by the appended claims.

Claims

WHAT IS CLAIMED:

1. A capture probe comprising a first primer binding sequence and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid.

2. A capture probe comprising a first primer binding sequence, a spatial barcode, and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, where each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid.

3. The capture probe of claim 1 or 2, wherein the capture region comprises less than ten non-sequential nucleotide sequences.

4. The capture probe of any one of claims 1 to 3, wherein the capture region comprises two to six non-sequential nucleotide sequences.

5. The capture probe of any one of claims 1 to 4, wherein each of the non-sequential nucleotide sequences is separated by an intervening nucleotide or intervening nucleotide sequence.

6. The capture probe of claim 5, wherein the intervening nucleotide and / or intervening nucleotide sequence comprises at least one nucleotide that is not complementary to the homopolymer sequence.

7. The capture probe of any one of claims 1 to 6, wherein each of the at least two non-sequential nucleotide sequences is between 2 to 10 bases in length.

8. The capture probe of any one of claims 1 to 6, wherein each of the at least two non-sequential nucleotide sequences is between 2 to 8 bases in length.

9. The capture probe of any one of claims 1 to 6, wherein each of the at least two non-sequential nucleotide sequences is between 3 to 5 bases in length.

10. The capture probe of any one of claims 1 to 9, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid.

11. The capture probe of any one of claims 1 to 10, wherein the 3’ end of the capture region comprises at least four nucleotides complementary to the homopolymer sequence of the target nucleic acid.

12. The capture probe of any one of claims 1 to 11, wherein the capture region comprises locked nucleic acids (LNAs), Bis-locked nucleic acids (bisLNAs), twisted intercalating nucleic acids (TINAs), bridged nucleic acids (BNAs), 2’-O-methyl RNA:DNA chimeric nucleic acids, minor groove binder (MGB) nucleic acids, morpholino nucleic acids, C5-modified pyrimidine nucleic acids, peptide nucleic acids (PNAs), phosphorothioate nucleic acids, or combinations thereof.

13. The capture probe of claim 12, wherein the capture region comprises locked nucleic acids (LNAs) or 2’-O-methyl RNA:DNA chimeric nucleic acids.

14. The capture probe of any one of claims 5 to 13, wherein the intervening nucleotide and / or intervening nucleotide sequence comprises a natural base.

15. The capture probe of claim 14, wherein the natural base is a deoxythymidine, deoxyadenosine, deoxyguanosine, deoxycytidine, or deoxyuridine.

16. The capture probe of any one of claims 5 to 13, wherein the intervening nucleotide and/or intervening nucleotide sequence comprises an unnatural base.

17. The capture probe of claim 16, wherein the unnatural base is a 2’-deoxyinosine, isoguanine, 3-nitropyrrole, 5-nitroindole, or isocytosine.

18. The capture probe of any one of claims 1 to 17, further comprising an index sequence.

19. The capture probe of any one of claims 1 to 18, wherein the first primer binding sequence is a first sequencing primer binding sequence or a first decoding primer binding sequence.

20. The capture probe of any one of claims 1 to 19, wherein the target nucleic acid is a messenger RNA (mRNA).

21. The capture probe of any one of claims 1 to 20, wherein the homopolymer sequence is a poly-A sequence.

22. The capture probe of claim 21, wherein the poly-A sequence is incorporated into the target nucleic acid using poly(A) polymerase or terminal deoxynucleotidyl transferase (TdT).

23. The capture probe of any one of claims 1 to 20, wherein the homopolymer sequence is a poly-I sequence.

24. The capture probe of claim 23, wherein the poly-I sequence is incorporated into the target nucleic acid using a polymerase.

25. The capture probe of any one of claims 21 to 24, wherein the homopolymer sequence is between 3 and 50 bases in length.

26. The capture probe of any one of claims 21 to 24, wherein the homopolymer sequence is greater than 50 bases in length.

27. The capture probe of any one of claims 21 to 26, wherein each of the non-sequential nucleotide sequences comprises a plurality of deoxythymidines.

28. The capture probe of any one of claims 22 to 26, wherein each of the non-sequential nucleotide sequences comprises a plurality of deoxyadenosines, deoxycytidines, deoxyuridines, or a combination thereof.

29. The capture probe of any one of claims 21 to 28, wherein the intervening nucleotide and/or intervening nucleotide sequence does not comprise a deoxythymidine.

30. The capture probe of any one of claims 1 to 19, wherein the target nucleic acid is DNA.

31. The capture probe of claim 30, wherein the homopolymer sequence is a poly-A, poly-T, poly-G, or poly-C sequence.

32. The capture probe of claim 31, wherein the homopolymer sequence is incorporated into the DNA using TdT.

33. The capture probe of any one of claims 1 to 32, comprising, from 5’ to 3’, the first primer binding sequence and the capture region.

34. The capture probe of claim 33, wherein the first primer binding sequence is a first sequencing primer binding sequence.

35. The capture probe of any one of claims 2 to 32, comprising, from 5’ to 3’, the spatial barcode, the first primer binding sequence, and the capture region.

36. The capture probe of claim 35, wherein the first primer binding sequence is a decoding primer binding sequence.

37. The capture probe of any one of claims 1 to 36, further comprising a cleavable site.

38. The capture probe of claim 37, wherein the cleavable site comprises a chemically cleavable moiety or an enzymatically cleavable moiety.

39. The capture probe of claim 38, wherein the enzymatically cleavable moiety comprises a restriction endonuclease recognition site.

40. The capture probe of any one of claims 1 or 3 to 39, wherein the capture probe further comprises a spatial barcode.

41. The capture probe of any one of claims 1 to 40, wherein the capture probe further comprises a unique molecular identifier (UMI).

42. A solid support comprising a plurality of immobilized capture probes, wherein each capture probe of the plurality comprises the capture probe of any one of claims 1 to 41.

43. The solid support of claim 42, wherein each capture probe is immobilized to the solid support at a 5’ end.

44. The solid support of claim 42 or 43, further comprising a plurality of immobilized spatial probes.

45. The solid support of claim 44, wherein each spatial probe of the plurality of immobilized spatial probes comprises a second primer binding sequence, a spatial barcode, and a probe sequence.

46. The solid support of claim 45, wherein each spatial probe is immobilized to the solid support at a 5’ end.

47. The solid support of claim 45, wherein each spatial probe is immobilized to the solid support at a 3’ end.

48. The solid support of any one of claims 44 to 47, wherein each spatial probe further comprises an index sequence, a molecular identifier, or a combination thereof.

49. The solid support of claim 48, wherein the molecular identifier is a unique molecular identifier.

50. The solid support of any one of claims 45 to 49, wherein the probe sequence is identical to a portion of the capture region of each immobilized capture probe.

51. The solid support of claim 50, wherein the first primer binding sequence of the capture probe is a first sequencing primer binding sequence, and wherein the portion of the capture region that is identical to the probe sequence is adjacent to the first sequencing primer binding sequence of the capture probe.

52. The solid support of claim 50 or 51, wherein the portion of the capture region that is identical to the probe sequence is hybridized to a blocking element.

53. The solid support of any one of claims 45 to 52, wherein the probe sequence is hybridized to a blocking element.

54. The solid support of any one of claims 45, 46, or 48 to 53, wherein the probe sequence is complementary to the reverse complement of a template switch oligo sequence.

55. The solid support of any one of claims 45, 46, or 48 to 54, wherein each spatial probe of the plurality of immobilized spatial probes comprises, from 5’ to 3’, the second primer binding sequence, the spatial barcode, and the probe sequence.

56. The solid support of claim 55, wherein the second primer binding sequence is a second decoding primer binding sequence.

57. The solid support of any one of claims 45 or 47 to 54, wherein each spatial probe of the plurality of immobilized spatial probes comprises, from 5’ to 3’, the probe sequence, the spatial barcode, and the second primer binding sequence.

58. The solid support of claim 57, wherein the second primer binding sequence is a second decoding primer binding sequence.

59. The solid support of any one of claims 42 to 58, wherein the solid support is a bead array, a spotted array, a flow cell, clustered particles arranged on a surface of a chip, a film, or a plate.

60. A solid support comprising a plurality of the capture probes of claim 1 and a plurality of the immobilized spatial probes of claim 46, wherein each capture probe comprises, from 5’ to 3’, the first primer binding sequence and the capture region, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid, wherein each immobilized spatial probe comprises, from 5’ to 3’, the second primer binding sequence, the spatial barcode, and the probe sequence, and wherein each capture probe is attached to the solid support at a 5’ end.

61. A solid support comprising a plurality of the capture probe of claim 1, and a plurality of the immobilized spatial probes of claim 47, wherein each capture probe comprises, from 5’ to 3’, the first primer binding sequence and the capture region, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid, wherein each immobilized spatial probe comprises, from 5’ to 3’, the probe sequence, the spatial barcode, and the second primer binding sequence, and wherein each capture probe is attached to the solid support at a 5’ end.

62. A solid support comprising a plurality of the capture probes of claim 2, wherein each capture probe comprises, from 5’ to 3’, the spatial barcode, the first primer binding sequence, and the capture region, wherein the 3’ end of the capture region is complementary to the homopolymer sequence of the target nucleic acid, and wherein each capture probe is attached to the solid support at a 5’ end.

63. A kit comprising the solid support of any one of claims 42 to 62.

64. The kit of claim 63, further comprising a template switch oligo.

65. The kit of claim 63 or 64, further comprising a splint oligo.

66. The kit of any one of claims 63 to 65, further comprising a blocking oligo.

67. A method of generating an immobilized complement of a target nucleic acid in a biological sample, the method comprising:
a. contacting the solid support of any one of claims 42 to 62 with the biological sample comprising a plurality of target nucleic acids;
b. hybridizing the capture region of each capture probe to a homopolymeric sequence of a target nucleic acid from the plurality; and
c. extending each capture region with a polymerase, thereby generating an immobilized complement of each target nucleic acid.

68. A method of generating a plurality of second strand extension products of target nucleic acids of a biological sample, the method comprising:
a. providing a solid support comprising a plurality of immobilized capture probes, wherein each capture probe of the immobilized plurality of capture probes comprises a first primer binding sequence, a spatial barcode, and a capture region, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, and wherein each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid;
b. contacting the solid support with a biological sample comprising a plurality of target nucleic acids;
c. hybridizing the capture region of each immobilized capture probe to a homopolymeric sequence of a target nucleic acid from the plurality, thereby forming a plurality of hybridized capture probes;
d. extending the capture region of each hybridized capture probe with a polymerase, thereby generating a plurality of immobilized first strand extension products, wherein the extending comprises addition of a plurality of non-templated nucleotides to the end of the immobilized first strand extension products;
e. removing the plurality of target nucleic acids from the solid support;
f. hybridizing a template switch oligonucleotide (TSO) to each immobilized first strand extension product, wherein the TSO is complementary to a plurality of the non-templated nucleotides, and wherein the TSO comprises a second sequencing primer binding sequence, thereby forming a plurality of hybridized TSOs;
g. generating a plurality of second strand extension products using the TSO; and
h. removing the plurality of second strand extension products.

69. The method of claim 68, wherein each capture probe of the immobilized plurality of capture probes comprises a second primer binding sequence.

70. The method of claim 68 or 69, wherein the first primer binding sequence is a first sequencing primer binding sequence or a first decoding primer binding sequence, and wherein the second primer binding sequence is a second sequencing primer binding sequence or a second decoding primer binding sequence.

71. A method of generating a plurality of second strand extension products of a target nucleic acid in a biological sample, the method comprising:
a. providing a solid support comprising a plurality of immobilized capture probes and a plurality of immobilized spatial probes, wherein each capture probe comprises a sequencing primer binding sequence and a capture region, wherein each spatial probe comprises a primer binding sequence, a spatial barcode, and a probe sequence, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, and wherein each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid;
b. contacting the solid support with a biological sample comprising a plurality of target nucleic acids;
c. hybridizing the capture region of each immobilized capture probe to a homopolymeric sequence of a target nucleic acid from the plurality, thereby forming a plurality of hybridized capture probes;
d. extending the capture region of each hybridized capture probe with a polymerase, thereby generating a plurality of immobilized first strand extension products, wherein the extending comprises addition of a plurality of non-templated nucleotides to the end of the immobilized first strand extension products;
e. removing the plurality of target nucleic acids from the solid support;
f. providing a plurality of splint oligonucleotides to the solid support and hybridizing the splint oligonucleotides to each immobilized first strand extension product and immobilized spatial probe to form a splinted complex, wherein the splint oligonucleotide comprises a first region complementary to the plurality of non-templated nucleotides of the first strand extension product and a second region complementary to the probe sequence of the spatial probe, thereby bringing the immobilized first strand extension product and the immobilized spatial probe of the splinted complex into ligatable proximity;
g. ligating the immobilized first strand extension product and the immobilized spatial probe of each splinted complex by enzymatic or chemical ligation, thereby forming a plurality of ligated first strand extension products;
h. hybridizing a primer to each ligated first strand extension product and extending the hybridized primers, thereby generating a plurality of second strand extension products; and
i. removing the plurality of second strand extension products.

72. A method of generating a plurality of second strand extension products of a target nucleic acid in a biological sample, the method comprising:
a. providing a solid support comprising a plurality of immobilized capture probes and a plurality of immobilized spatial probes, wherein each capture probe comprises a sequencing primer binding sequence and a capture region, wherein each spatial probe comprises a primer binding sequence, a spatial barcode, and a probe sequence, wherein the capture region comprises at least two non-sequential nucleotides or non-sequential nucleotide sequences, and wherein each of the non-sequential nucleotides or non-sequential nucleotide sequences is complementary to a portion of a homopolymer sequence of a target nucleic acid;
b. contacting the solid support with a biological sample comprising a plurality of target nucleic acids;
c. hybridizing the capture region of each immobilized capture probe to a homopolymeric sequence of a target nucleic acid from the plurality, thereby forming a plurality of hybridized capture probes;
d. extending the capture region of each hybridized capture probe with a polymerase, thereby generating a plurality of immobilized first strand extension products, wherein the extending comprises addition of a plurality of non-templated nucleotides to the end of the immobilized first strand extension products;
e. removing the plurality of target nucleic acids from the solid support;
f. hybridizing a template switch oligonucleotide (TSO) to each immobilized first strand extension product, wherein the TSO is complementary to a plurality of the non-templated nucleotides, and wherein the TSO comprises a bait sequence at a 3’ end, thereby forming a plurality of hybridized TSOs;
g. incorporating the complement of the TSO into the 3’ end of the immobilized first strand extension product by template switching, thereby adding a bait sequence complement to the 3’ end of each immobilized first strand extension product;
h. hybridizing the bait sequence complement of each immobilized first strand extension product to the probe sequence of the immobilized spatial probes, and extending the 3’ end of the hybridized first strand extension product, thereby incorporating a complement of the spatial barcode and a primer binding sequence complement into the 3’ end of each immobilized first strand extension product;
i. denaturing the hybridized first strand extension products and spatial probes;
j. hybridizing a primer to the primer binding sequence complement of each immobilized first strand extension product and extending the hybridized primers, thereby generating a plurality of second strand extension products; and
k. removing the plurality of second strand extension products.

73. The method of claim 72, wherein after step (g), the method further comprises hybridizing a blocking element to the capture region of the capture probe.

74. The method of claim 73, wherein the blocking element is complementary to a 5’ portion of the capture region of the capture probe.

75. The method of any one of claims 71 to 74, wherein the step of removing the plurality of second strand extension products comprises chemical or enzymatic removal of the second strand extension products.

76. The method of claim 75, wherein the chemical removal comprises contacting the plurality of second strand extension products with an alkaline solution.

77. The method of claim 75, wherein the enzymatic removal comprises enzymatic cleavage of a cleavage site, wherein the plurality of second strand extension products comprise the cleavage site at a 5’ end.

78. The method of claim 77, wherein the cleavage site comprises a restriction enzyme site, a uracil, an 8-oxoguanine, or a combination thereof.

79. The method of any one of claims 72 to 78, wherein the bait sequence is identical to a portion of the capture region of the capture probe.

80. The method of any one of claims 71 to 79, wherein each spatial probe further comprises an index sequence, a molecular identifier, or a combination thereof.

81. The method of any one of claims 71 to 80, wherein each capture probe further comprises an index sequence, a molecular identifier, a spatial barcode, or a combination thereof.

82. The method of claim 80 or 81, wherein the molecular identifier is a unique molecular identifier.

83. The method of claim 80 or 81, wherein the spatial barcode sequence of the spatial probe and the spatial barcode sequence of the capture probe are different.

84. The method of claim 80 or 81, wherein the spatial barcode sequence of the spatial probe and the spatial barcode sequence of the capture probe are the same.

85. The method of any one of claims 68 to 82, further comprising amplifying the plurality of second strand extension products, thereby generating a library.

86. The method of claim 85, wherein generating the library comprises tagmentation or ligation of adapters to the second strand extension products.

87. The method of claim 85 or 86, further comprising sequencing the library.

88. The method of claim 87, wherein sequencing comprises sequencing-by-synthesis, sequencing-by-ligation, or sequencing-by-binding.

89. The method of any one of claims 67 to 88, wherein the biological sample comprises a tissue sample.

90. The method of claim 89, wherein the tissue sample comprises a fresh frozen tissue sample or a formalin-fixed paraffin embedded (FFPE) sample.

91. The method of any one of claims 67 to 90, wherein step b) further comprises contacting the sample with a lysis buffer, a permeabilization buffer, and/or a reagent to deparaffinize a FFPE sample.

92. The method of any one of claims 67 to 91, wherein the polymerase is a reverse transcriptase.

93. The method of claim 92, wherein the reverse transcriptase is a highly processive reverse transcriptase.
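
Illustrative sketch (not part of the claims). The capture region recited above, for a poly-A target, can be pictured as short runs of deoxythymidines separated by intervening nucleotides that are not deoxythymidine, with the 3’ end terminating in a run complementary to the homopolymer. The Python below is a minimal sketch under that reading only. The function names, the default block length and block count, the choice of "G" as the intervening nucleotide, and the barcode and primer-site strings are all hypothetical placeholders and are not taken from the disclosure.

```python
def build_capture_region(block_len=4, n_blocks=3, intervening="G", base="T"):
    """Return a 5'->3' capture region made of n_blocks runs of `base`,
    each block_len long, separated by the `intervening` nucleotide(s).

    Hypothetical helper: the defaults are illustrative, not values from the source.
    """
    if n_blocks < 2:
        raise ValueError("need at least two non-sequential blocks")
    if block_len < 1:
        raise ValueError("block_len must be positive")
    if not intervening or base in intervening:
        raise ValueError("intervening sequence must be non-empty and must not contain the block base")
    # Joining the blocks places an intervening sequence between each pair of
    # blocks, so the region both starts and ends with a run of `base`.
    return intervening.join([base * block_len] * n_blocks)


def longest_run(seq, base):
    """Length of the longest uninterrupted run of `base` in `seq`."""
    longest = current = 0
    for nt in seq:
        current = current + 1 if nt == base else 0
        longest = max(longest, current)
    return longest


def assemble_probe(spatial_barcode, primer_site, capture_region):
    """Concatenate, 5'->3': spatial barcode, primer binding sequence, capture region."""
    return spatial_barcode + primer_site + capture_region


if __name__ == "__main__":
    region = build_capture_region(block_len=4, n_blocks=3, intervening="G")
    print(region)                      # TTTTGTTTTGTTTT
    print(longest_run(region, "T"))    # 4
    # Placeholder barcode and primer-site strings, for illustration only.
    print(assemble_probe("ACGTACGT", "CCGGCCGG", region))
```

With these placeholder settings the region ends in four deoxythymidines at its 3’ end, and no uninterrupted run of deoxythymidines is longer than the chosen block length.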