Methods and products for local or spatial detection of nucleic acids in tissue samples

Combining array technology with high-throughput DNA sequencing addresses the difficulty of performing high-resolution global transcriptome analysis of tissue samples with existing technologies. The approach determines the transcriptome information of each cell while preserving its spatial location, providing a global spatial expression pattern.

CN115896252B · Active · Publication Date: 2026-06-23 · SHICHENG GENE TECH SWEDISH CO

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHICHENG GENE TECH SWEDISH CO
Filing Date
2012-04-13
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing technologies struggle to perform global transcriptomic analysis of tissue samples at high resolution, resulting in the inability to simultaneously measure the transcriptomic information of each cell in the sample and the loss of location information.

Method used

By combining array technology with high-throughput DNA sequencing technology, nucleic acid molecules in tissue samples are located and labeled on the array using reverse transcription primers. cDNA is synthesized and sequenced to obtain spatial expression information for each cell, and the data is visualized by combining it with tissue sample images.

Benefits of technology

It achieves high-resolution global gene expression analysis, can simultaneously measure the transcriptome information of each cell, and preserve its spatial location in the tissue, providing a global spatial expression pattern.

✦ Generated by Eureka AI based on patent content.


Abstract

A method and product for local or spatial detection of nucleic acids in a tissue sample, the method comprising (a) providing an array comprising a substrate having immobilized thereon a plurality of capture probes, each of the probes being in a different position in the array and oriented to have a 3' free end, wherein each of the capture probes comprises a nucleic acid molecule; (b) contacting the array with the tissue sample to allow the capture probes on the array to associate with a position on the tissue sample and to allow nucleic acids on the tissue sample to hybridize to the capture domain of the capture probes; (c) generating DNA molecules from the captured nucleic acid molecules using the capture probes as extension or ligation primers, wherein the extended or ligated DNA molecules are labeled with a localization domain; (e) releasing at least some of the labeled DNA molecules and / or their complementary strands or amplicons from the array surface; and (f) directly or indirectly analyzing the sequence of the released DNA molecules.

Description

Technical Field

[0001] This invention primarily relates to the local or spatial detection of nucleic acids in tissue samples. The nucleic acids can be RNA or DNA. Therefore, the methods provided by this invention can be used to detect and / or analyze RNA (e.g., RNA transcripts) or genomic DNA to obtain spatial information about gene location, distribution, or expression in tissue samples (e.g., individual cells), or more precisely, spatial information about the location or distribution of any genomic variation (not necessarily a variation within a gene). Thus, this invention enables spatial genomics and spatial transcriptomics research.

[0002] A quantitative and / or qualitative method is provided to analyze the distribution, location, or expression of genomic sequences in tissue samples, wherein the tissue samples retain spatial expression, distribution, or localization patterns. Therefore, this new method provides a workflow for conducting "spatial transcriptomics" or "spatial genomics" studies, allowing users to simultaneously determine the expression patterns or localization / distribution patterns of expressed genes, genes, or genomic loci contained in tissue samples.

[0003] Specifically, this invention utilizes a combination of array technology and high-throughput DNA sequencing to capture and label nucleic acid molecules (e.g., RNA or DNA molecules), particularly mRNA or DNA, in tissue samples using localization markers. DNA molecules are then synthesized and sequenced and analyzed to determine which genes are expressed in any part of the tissue sample. Preferably, the individual, separate, and specific transcriptome of each cell in the tissue sample can be obtained simultaneously. Therefore, the method of this invention provides a highly parallel, comprehensive transcriptome signature of individual cells in a tissue sample without losing spatial information in the tested tissue sample. This invention also provides an array for implementing the method of this invention and a method for fabricating the array.

Background Technology

[0004] The human body contains over 100 trillion cells, which make up more than 250 different organs and tissues. We are still far from fully understanding the development and organization of complex organs such as the brain; therefore, it is necessary to use quantitative methods to analyze the expression patterns of genes in these tissues and identify the genes that control their development and function. These organs themselves are mixtures of differentiated cells that coordinate and maintain various bodily functions such as nutrient transport and defense. Therefore, cellular function depends on the cell's location within a specific tissue structure and its direct and indirect interactions with other cells in that tissue. Thus, it is essential to understand how these interactions affect each cell in the tissue at the transcriptional level.

[0005] Recent findings from deep RNA sequencing have revealed that most transcripts can be detected in a human cell line, and that the majority (75%) of human protein-coding genes are expressed in most tissues. Similarly, a detailed study of 1% of the human genome showed that chromosomes are pervasively transcribed, with the majority of all bases contained in primary transcripts. Therefore, the transcription machinery can be described as promiscuous at the global level.

[0006] It is well known that transcripts are only an approximate indicator of protein abundance, as the amount of protein generated from any given transcript is affected by factors such as RNA translation and degradation rates. However, a recent immunological analysis of human organs and tissues suggests that tissue specificity is achieved through precise temporal and spatial control of protein levels, and that different tissues acquire their unique characteristics not by controlling the types of proteins expressed, but by controlling the quantity of each protein expressed.

[0007] However, subsequent global studies comparing the correlation between transcriptomics and proteomics demonstrated that most genes in the entire genome were expressed. Interestingly, the studies found a high correlation between changes in RNA levels and protein levels of individual gene products, indicating that studying the transcriptome of individual cells is biologically significant for understanding the functional roles of proteins.

[0008] Indeed, the analysis of cellular organization and expression patterns in biological tissues is a milestone in biomedical research and diagnostics. Cellular histology, using various staining techniques, first identified the basic structural mechanisms of healthy organs and common pathological changes more than a century ago. Advances in this field have made it possible to study protein distribution through immunohistochemistry and gene expression through in situ hybridization.

[0009] However, the parallel development of increasingly advanced histology and gene expression technologies has led to the separation of imaging technology from transcriptome analysis. Therefore, prior to the method provided in this invention, there was no feasible way to perform whole-transcriptome analysis with spatial resolution.

[0010] As an alternative or complement to in situ techniques, in vitro analysis methods for proteins and nucleic acids have also been developed, namely: extracting molecules from intact tissue samples, single cell types, or even single cells, and quantifying specific molecules in the extracts using methods such as ELISA and qPCR.

[0011] Recent advances in gene expression analysis have made it possible to assess the complete transcriptome of a tissue using microarrays or RNA sequencing, which contributes to our understanding of biological processes and diagnostic research. However, typical transcriptome analysis is performed by extracting mRNA from a whole tissue (or even an entire organism), while collecting smaller tissue regions or individual cells for transcriptome analysis is generally very laborious, expensive, and inaccurate.

[0012] Therefore, most gene expression studies based on microarrays or next-generation RNA sequencing use representative samples containing many cells, so the results represent the average expression level of the tested genes. In some cases, phenotypically distinct cells have been isolated and analyzed in combination with global gene expression platforms (Tang F et al., Nat Protoc. 2010; 5:516-35; Wang D & Bodovitz S., Trends Biotechnol. 2010; 28:281-90), yielding very precise information about intercellular differences. However, to date, there is no high-throughput method for studying transcriptional activity in intact tissues at high resolution.

Summary of the Invention

[0013] Therefore, existing techniques for gene expression pattern analysis can only provide transcriptional information for one or a few genes at a time, or provide information on all genes in the same sample while losing their positional information. Thus, there is a clear need for a method that can simultaneously, separately, and specifically measure the transcriptome of each cell in a sample; that is, a method for global gene expression analysis of tissue samples that can provide transcriptional information with spatial resolution. This invention fulfills this need.

[0014] The innovative approach of the method and products described in this invention utilizes well-established array and sequencing technologies to obtain the transcriptional information of all genes in a sample, while simultaneously preserving the positional information of each transcript. Those skilled in the art will recognize that this represents a milestone in the life sciences. This new technology opens up a new field known as "spatial transcriptomics" and may have profound implications for our understanding of tissue development and of tissue and cellular function in all multicellular organisms. Such a technology will be particularly useful in advancing our understanding of the causes and development of disease states, such as cancer, and in developing effective treatments for these diseases. The method of this invention can also be applied to the diagnosis of many diseases.

[0015] While the initial purpose of this invention is for transcriptome research, as detailed below, the principles and methods of this invention can also be applied to DNA analysis, and thus to genome analysis (“spatial genomics”). Therefore, in the broadest sense, this invention is primarily applicable to the detection and / or analysis of nucleic acids.

[0016] Array technology, especially microarrays, originated from research at Stanford University, which successfully attached small amounts of DNA oligonucleotides to a glass surface in an ordered arrangement, known as an "array," and used it to monitor the transcription of 45 genes (Schena M et al, Science. 1995; 270:368-9,371).

[0017] Since then, scientists worldwide have published over 30,000 papers using microarray technology. Various microarrays have been developed to suit diverse applications, such as detecting single nucleotide polymorphisms (SNPs) or genotypes, or resequencing mutated genomes. A key application of microarray technology is gene expression analysis. While gene expression microarrays were indeed invented as a method for analyzing the levels of expressed gene material in a specific sample, their true success lies in enabling the simultaneous comparison of the expression levels of many genes. Several commercially available microarray platforms are available for such experiments, but custom-designed gene expression arrays can also be fabricated.

[0018] While the use of microarrays in gene expression research is now widespread, it is clear that more advanced and comprehensive so-called "next-generation DNA sequencing" (NGS) technologies are beginning to replace DNA microarrays in many applications, such as deep transcriptome analysis.

[0019] The development of NGS technology for ultrafast genome sequencing is a milestone in life sciences (Petterson E et al, Genomics. 2009; 93:105-11). These new technologies have dramatically reduced the cost of DNA sequencing and enabled the sequencing of genomes in higher organisms, including those of specific individuals, at unprecedented speeds (Wade CM et al, Science. 2009; 326:865-7; Rubin J et al, Nature. 2010; 464:587-91). New advances in high-throughput genomics have revolutionized the landscape of biological research, enabling not only complete genome identification but also the digital quantitative study of the entire transcriptome. In recent years, bioinformatics tools for visualizing and integrating such comprehensive datasets have also seen significant advancements.

[0020] However, it has been surprisingly discovered that for tissue samples characterized by two-dimensional spatial resolution, a unique combination of cell histology, microarray, and NGS technologies can yield comprehensive transcriptional or genomic information from multiple cells within the sample. Therefore, from one extreme perspective, the method of this invention can be used to analyze the expression of a single gene in a single cell of a sample while preserving the spatial information of that cell within the tissue sample; from the other extreme, and in a preferred aspect of the invention, the method can be used to simultaneously determine the expression of every gene in every cell, or substantially all cells, of a sample—that is, the global spatial expression pattern within a tissue sample. Clearly, the method of this invention can also be used for intermediate analyses.

[0021] The following is a brief description of the simplest form of the invention. The invention requires reverse transcription (RT) primers that also carry unique positioning markers (domains); these primers are arranged on a target substrate, such as a glass slide, to create an "array". Each unique positioning marker corresponds to the position of its RT primer on the array (the array feature). A tissue slice is placed on the array, and a reverse transcription reaction is performed on the slide. The RT primers bind (hybridize) to RNA in the tissue sample and are extended using the bound RNA as a template, yielding cDNA that is attached to the array surface. Because each RT primer carries a unique positioning marker, each cDNA sequence carries the positional information of its template RNA in the tissue slice. Before or after the cDNA synthesis step, the tissue slice can be visualized or imaged, for example by staining and photography, so that the positioning marker in each cDNA molecule can be associated with a specific location in the tissue sample. The cDNA is then sequenced to obtain transcriptome results containing accurate positional information. A schematic diagram of this process is shown in Figure 1. The sequencing data can then be matched to specific locations in the tissue sample, allowing the sequencing data and the tissue section to be visualized together, for example using a computer to show the expression pattern of any gene of interest throughout the tissue (Figure 2). Similarly, different regions of a tissue slice can be marked on a computer screen, any region of interest can be selected, and information on genes differentially expressed between the selected regions can be obtained. The data obtained by the method of this invention therefore stands in stark contrast to data obtained using existing methods for studying mRNA populations. For example, in situ hybridization-based methods can only provide relative information on single mRNA transcripts. Therefore, the method of this invention has significant advantages over existing in situ techniques. The global gene expression information obtained by the method of this invention can also include co-expression information and quantitative estimates of transcript abundance. This method is a widely applicable strategy that can be applied to the analysis of any tissue in any species, such as animals, plants, and fungi.
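The data flow described in [0021] can be illustrated with a minimal sketch. It assumes that each sequenced read has already been split into its positioning-marker sequence and a gene label; the marker sequences, feature coordinates, gene names, and the function name `spatial_counts` are all hypothetical and are not taken from the patent.

```python
from collections import Counter, defaultdict

# Hypothetical lookup table: positioning-marker sequence -> (column, row)
# of the array feature that carries it.
MARKER_TO_FEATURE = {
    "ACGTAC": (0, 0),
    "TGCATG": (1, 0),
    "GATCGA": (0, 1),
}


def spatial_counts(reads):
    """Count reads per gene at each array feature.

    `reads` is an iterable of (marker, gene) pairs. Reads whose marker is
    not present on the array are skipped because they cannot be placed.
    """
    counts = defaultdict(Counter)
    for marker, gene in reads:
        feature = MARKER_TO_FEATURE.get(marker)
        if feature is None:
            continue  # unknown marker: no position on the tissue
        counts[feature][gene] += 1
    return counts


reads = [
    ("ACGTAC", "GeneA"),
    ("ACGTAC", "GeneA"),
    ("TGCATG", "GeneB"),
    ("NNNNNN", "GeneA"),  # marker not on the array, ignored
]
counts = spatial_counts(reads)
print(counts[(0, 0)]["GeneA"])  # 2
```

The resulting per-feature counts are the kind of table that could then be drawn over the tissue image; real data would additionally need read alignment and error handling for the marker sequences, which this sketch leaves out.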

[0022] As described above and discussed in detail below, this basic method can readily be extended to genomic DNA analysis, for example to identify cells within a tissue sample that carry one or more specific mutations. For instance, genomic DNA can be fragmented and hybridized with primers (equivalent to the RT primers described above) that capture the DNA fragments and prime the synthesis of the complementary strand of the captured molecule. The fragments can be made capturable, for example, by ligating to them an adaptor whose sequence is complementary to the primer, or by enzymatically extending them with additional nucleotides, such as a poly-A tail, to generate a sequence complementary to the primer. The remaining steps of the analysis can be performed as described above. Therefore, the specific embodiments of the invention described below in the context of transcriptomic analysis can also be used in suitable methods for genomic DNA analysis.

[0023] As can be seen from the above description, transcriptomic or genomic information combined with location information has immense value. For example, it enables high-resolution global gene expression mapping, and this technology can be applied in many areas, including cancer research and diagnostics.

[0024] Furthermore, it is evident that the method described in this invention differs significantly from the aforementioned methods for whole transcriptome analysis of tissue samples, and these differences offer numerous advantages. This invention is based on a surprising discovery: the use of tissue slices does not interfere with the synthesis of DNA (e.g., cDNA) guided by primers (e.g., reverse transcription primers) attached to the array surface.

[0025] Therefore, in its primary and broadest sense, the present invention provides a method for local detection of nucleic acids in tissue samples, comprising:

[0026] (a) Providing an array comprising a substrate on which multiple types of capture probes are directly or indirectly immobilized, each type of probe occupying a different position in the array and oriented with a 3' free end so that the probe can serve as a primer for an extension reaction or a ligation reaction, wherein each type of capture probe comprises a nucleic acid molecule comprising, in the 5' to 3' direction:

[0027] (i) The localization domain, corresponding to the position of the capture probes on the array, and

[0028] (ii) Capture domain;

[0029] (b) The array is brought into contact with a tissue sample such that the position of the capture probe on the array can be correlated with the position on the tissue sample, and the nucleic acid on the tissue sample is allowed to hybridize with the capture domain of the capture probe;

[0030] (c) Using the capture probe as an extension primer or a ligation primer, a DNA molecule is generated from the captured nucleic acid molecule, wherein the extended or ligated DNA molecule is labeled with a localization domain.

[0031] (d) Optionally, generate a complementary strand of the labeled DNA and / or optionally, amplify the labeled DNA;

[0032] (e) Releasing at least a portion of the labeled DNA molecules and / or their complementary strands or amplicons from the array surface, wherein the portion includes a localization domain or its complementary strand;

[0033] (f) Analyze the sequence of the released DNA molecules directly or indirectly.
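The probe layout in step (a) can be illustrated with a short sketch that assigns each array feature a probe written 5' to 3' as localization domain followed by capture domain. The localization-domain length, the use of an oligo-dT stretch as the capture domain, and the function name `make_capture_probes` are assumptions made for illustration; the text above does not fix any of them.

```python
from itertools import product


def make_capture_probes(n_cols, n_rows, tag_len=4, capture_domain="T" * 20):
    """Return {(col, row): probe_sequence}, each sequence written 5' -> 3'.

    Every probe is localization domain + capture domain, so the capture
    domain sits at the free 3' end where it can prime extension.
    The oligo-dT capture domain is an assumption (it would pair with
    poly-A tails); other capture domains are equally possible.
    """
    n_features = n_cols * n_rows
    if n_features > 4 ** tag_len:
        raise ValueError("tag too short to give every feature a unique sequence")
    # Enumerate tags in lexicographic order; a real design would also filter
    # out homopolymers and near-identical tags, which is omitted here.
    tags = ("".join(p) for p in product("ACGT", repeat=tag_len))
    positions = ((c, r) for r in range(n_rows) for c in range(n_cols))
    return {pos: tag + capture_domain for pos, tag in zip(positions, tags)}


probes = make_capture_probes(3, 2)
for position, sequence in probes.items():
    print(position, sequence)
```

Because each feature receives a distinct localization domain, any DNA molecule extended from a probe can later be traced back to the feature it came from.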

[0034] The method of this invention has significant advantages over other spatial transcriptomics methods in the art. For example, the method described in this invention can obtain the global and spatial distribution of all transcripts in a tissue sample; furthermore, it can quantify the expression of each gene at each location or feature on the array, thus enabling multiple analyses based on data from a single assay. Therefore, the method of this invention makes it possible to detect and / or quantify the spatial expression of all genes in a single tissue sample. Furthermore, because transcript abundance is not read out directly as a visual signal (for example, by fluorescence, as on a standard microarray), gene expression in a single sample can be measured even if the transcripts in that sample are present at vastly different concentrations.

[0035] Accordingly, in another more specific aspect, the present invention can be viewed as providing a method for determining and / or analyzing the transcriptome of a tissue sample, comprising:

[0036] (a) Providing an array comprising a substrate on which multiple types of capture probes are directly or indirectly immobilized, each type of probe occupying a different position in the array and oriented to have a 3' free end so that the probe can be used as a reverse transcription (RT) primer, wherein each type of capture probe comprises a nucleic acid molecule comprising, in the 5' to 3' direction:

[0037] (i) The localization domain, corresponding to the position of the capture probes on the array, and

[0038] (ii) Capture domain;

[0039] (b) Contact the tissue sample with the array so that the position of the capture probe on the array is associated with the position on the tissue sample, and allow the RNA on the tissue sample to hybridize with the capture domain on the capture probe;

[0040] (c) Using the capture probe as an RT primer, a cDNA molecule is generated from the captured RNA molecule, and optionally, the cDNA molecule is amplified;

[0041] (d) Releasing at least a portion of the cDNA molecule and / or optionally its amplicon from the array surface, wherein the released molecule may be a first and / or second strand of cDNA or its amplicon, and wherein the portion includes a localization domain or its complementary strand;

[0042] (e) Analyze the sequence of the released molecules directly or indirectly.

[0043] As described below, any nucleic acid analysis method can be used in the analytical step. Typically, this step includes sequencing, but sequencing is not strictly necessary. For example, sequence-specific analysis methods can be used instead. For instance, a sequence-specific amplification reaction can be performed using primers that are specific to the localization domain and / or to a specific target sequence, for example a specific target DNA to be tested (i.e., corresponding to a specific cDNA / RNA or gene, etc.). A typical analytical method is a sequence-specific PCR reaction.
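The idea of a sequence-specific readout in [0043] can be approximated in software. The following is only a simplified analogue that uses exact substring matching on a single strand; it does not model PCR chemistry, primer orientation, or mismatches. All sequences and the function name `detects` are hypothetical.

```python
def detects(molecule, localization_domain, target_sequence):
    """Return True if the molecule carries both the localization domain
    and, downstream (3') of it, the target sequence.

    This mirrors a primer pair in which one primer is specific to the
    localization domain and the other to the target, using exact matching.
    """
    tag_at = molecule.find(localization_domain)
    if tag_at < 0:
        return False
    search_from = tag_at + len(localization_domain)
    return molecule.find(target_sequence, search_from) >= 0


# Hypothetical released molecules: localization domain, linker, then cDNA.
molecules = [
    "ACGTAC" + "TTTT" + "GGCATGCC",  # feature tag ACGTAC, contains target
    "TGCATG" + "TTTT" + "GGCATGCC",  # different feature, contains target
    "ACGTAC" + "TTTT" + "AAGGTTCC",  # feature tag ACGTAC, no target
]
hits = [m for m in molecules if detects(m, "ACGTAC", "CATG")]
print(len(hits))  # 1
```

Only the first molecule is reported, because it is the only one that combines the chosen localization domain with the chosen target sequence.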

[0044] The sequence analysis information obtained according to step (e) can be used to obtain spatial information about RNA in the sample. In other words, the sequence analysis information can provide location information about RNA in the sample. This spatial information can be inferred from the nature of the measured sequence analysis information, for example, it can reveal the presence of a specific RNA that may itself have spatial significance in the context of the tissue sample used, and / or the spatial information (e.g., spatial localization) can be inferred by combining the location of the tissue sample on the array with sequencing information. Therefore, the method may simply include the step of associating the sequence analysis information with a certain location on the tissue sample, for example, by means of the localization marker and its association with a certain location on the tissue sample. However, as described above, as a preferred embodiment of the invention, spatial information can be conveniently obtained by associating sequence analysis data with tissue sample images. Accordingly, in a preferred embodiment, the method may further include a step, namely:

[0045] (f) Associate the sequence analysis information with an image of the tissue sample, wherein the tissue sample was imaged before or after step (c).
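Step (f) can be illustrated with a sketch that converts array feature indices into pixel coordinates of the tissue image. It assumes the image is axis-aligned with the array and that two calibration values are known: the pixel position of feature (0, 0) and the feature spacing in pixels. These values, the example counts, and the function names are hypothetical; a real registration might need rotation and scaling as well.

```python
def feature_to_pixel(col, row, origin_px=(100.0, 50.0), pitch_px=20.0):
    """Map an array feature index to (x, y) pixel coordinates in the image.

    `origin_px` is the assumed pixel position of feature (0, 0) and
    `pitch_px` the assumed centre-to-centre feature spacing in pixels.
    """
    x0, y0 = origin_px
    return (x0 + col * pitch_px, y0 + row * pitch_px)


def overlay_points(feature_counts, gene):
    """Return [(x_px, y_px, count)] for one gene, ready to draw on the image."""
    points = []
    for (col, row), genes in sorted(feature_counts.items()):
        n = genes.get(gene, 0)
        if n:
            x, y = feature_to_pixel(col, row)
            points.append((x, y, n))
    return points


# Hypothetical per-feature read counts.
feature_counts = {
    (0, 0): {"GeneA": 2},
    (1, 0): {"GeneB": 1},
    (2, 3): {"GeneA": 5},
}
points = overlay_points(feature_counts, "GeneA")
print(points)  # [(100.0, 50.0, 2), (140.0, 110.0, 5)]
```

Each returned triple gives a position in the image and the number of reads for the chosen gene at that position, which is enough to plot an expression map over the tissue picture.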

[0046] In its broadest sense, the method of the present invention can be used for the local detection of a specific nucleic acid in a tissue sample. Therefore, in one embodiment, the method of the present invention can be used to determine and / or analyze the entire transcriptome or genome in a tissue sample, for example, the whole transcriptome of the tissue sample. However, the method is not limited to this, but includes determining and / or analyzing all or part of the transcriptome or genome. Therefore, the method may include determining and / or analyzing a portion or subset of the transcriptome or genome, such as the transcriptome corresponding to a subset of genes, or a specific subset of genes, such as a subset of genes associated with a specific disease or symptom, tissue type, etc.

[0047] On the other hand, the steps of this method as described above can be seen as providing a method for obtaining a spatially defined transcriptome or genome, particularly a global transcriptome or genome of a spatially defined tissue sample.

[0048] In another view, the method of the present invention can be considered as a method for local or spatial detection of nucleic acids (i.e., DNA or RNA) in tissue samples, or a method for local or spatial determination and / or analysis of nucleic acids (DNA or RNA) in tissue samples. Specifically, this method can be used for the local or spatial detection, determination, and / or analysis of gene expression or genomic variation in tissue samples. This local / spatial detection / determination / analysis means that the natural location or site of RNA or DNA in the cells or tissues of the tissue sample can be localized. Thus, for example, RNA or DNA can be localized to a cell, cell population, or cell type on the sample, or a specific region in the tissue sample. The natural site or location of said RNA or DNA (or in other words, the site or location of said RNA or DNA in the tissue sample), for example, an expressed gene or genomic site, can be determined.

[0049] This invention can also be viewed as providing an array for use in the method of this invention, comprising a substrate on which a variety of capture probes are directly or indirectly immobilized, each type of probe occupying a different position in the array and oriented to have a 3' free end so that the probe can serve as a reverse transcription (RT) primer, wherein each type of capture probe comprises a nucleic acid molecule containing, in the 5' to 3' direction:

[0050] (i) The localization domain, corresponding to the position of the capture probes on the array, and

[0051] (ii) A capture domain for capturing RNA from tissue samples in contact with the array.

[0052] In a related aspect, the invention also provides the use of an array comprising a substrate on which a plurality of capture probes are directly or indirectly immobilized, each type of probe occupying a different position in the array and oriented to have a 3' free end to enable the probe to function as a reverse transcription (RT) primer, wherein each type of capture probe comprises a nucleic acid molecule containing, from the 5' end to the 3' end:

[0053] (i) The localization domain, corresponding to the position of the capture probes on the array, and

[0054] (ii) Capture domain;

[0055] for capturing RNA from a tissue sample in contact with the array.

[0056] Preferably, the use is for determining and / or analyzing the transcriptome of a tissue sample, particularly the whole transcriptome, and further includes the following steps:

[0057] (a) Using the capture probe as an RT primer, a cDNA molecule is generated from the captured nucleic acid molecule, and optionally, the cDNA molecule is amplified;

[0058] (b) Releasing at least a portion of the cDNA molecule and / or optionally its amplicon from the array surface, wherein the released molecule may be the first and / or second strand of the cDNA molecule or its amplicon, and wherein the portion includes a localization domain or its complementary strand;

[0059] (c) Analyze the sequence of the released molecules directly or indirectly; and optionally

[0060] (d) Associate the sequence analysis information with an image of the tissue sample, wherein the tissue sample was imaged before or after step (a).

[0061] Therefore, it is understood that the array of the present invention can be used to capture RNA, such as mRNA, from tissue samples in contact with the array. The array can also be used to determine and / or analyze the whole or partial transcriptome of a tissue sample, or to obtain a spatially defined partial or global transcriptome from a tissue sample. The method of the present invention can therefore be viewed as a method for quantifying the spatial expression of one or more genes in a tissue sample. In other words, the method of the present invention can be used to detect the spatial expression of one or more genes in a tissue sample. Put another way, the method of the present invention can be used to simultaneously determine the expression of one or more genes at one or more sites in a tissue sample. Furthermore, the method of the present invention can be viewed as a method for performing partial or global transcriptome analysis of a tissue sample with two-dimensional spatial resolution.
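Once expression has been determined at several sites, two user-selected regions can be compared. The sketch below sums read counts per gene over the features in each region and reports a log2 ratio with a pseudocount. The counts, region definitions, pseudocount, and function names are hypothetical, and no normalization for region size or sequencing depth is applied, so this is an illustration rather than a statistical test.

```python
import math
from collections import Counter


def region_totals(feature_counts, region):
    """Sum per-gene read counts over all features in a region."""
    totals = Counter()
    for feature in region:
        totals.update(feature_counts.get(feature, {}))
    return totals


def log2_fold_changes(feature_counts, region_a, region_b, pseudocount=1.0):
    """Return {gene: log2((a + pc) / (b + pc))} for genes seen in either region."""
    a = region_totals(feature_counts, region_a)
    b = region_totals(feature_counts, region_b)
    genes = set(a) | set(b)
    return {
        g: math.log2((a[g] + pseudocount) / (b[g] + pseudocount)) for g in genes
    }


# Hypothetical per-feature counts and two single-feature regions.
feature_counts = {
    (0, 0): {"GeneA": 7, "GeneB": 1},
    (1, 0): {"GeneA": 1, "GeneB": 3},
}
changes = log2_fold_changes(feature_counts, [(0, 0)], [(1, 0)])
print(changes["GeneA"], changes["GeneB"])  # 2.0 -1.0
```

A positive value indicates more reads for that gene in the first region, a negative value more in the second.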

[0062] The RNA can be any RNA molecule present in the cell. Therefore, it can be mRNA, tRNA, rRNA, viral RNA, small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), microRNA (miRNA), small interfering RNA (siRNA), piwi-interacting RNA (piRNA), ribozymal RNA, antisense RNA, or non-coding RNA. However, mRNA is preferred.

[0063] Step (c) in the above method (corresponding to step (a) in the preferred form described above), which generates cDNA using the captured RNA as a template, can be considered as involving cDNA synthesis. It includes a reverse transcription step in which the capture probe, acting as an RT primer, is extended using the captured RNA as a template. This step produces the so-called first strand of cDNA. As will be described in detail below, synthesis of the second strand of cDNA can optionally be performed on the array, or in a separate step after the first strand of cDNA has been released from the array. As also described in detail below, in some embodiments the second strand can be synthesized in the first step of amplifying the released first-strand cDNA molecule.

[0064] The following section discusses the general applications of arrays in nucleic acid analysis and their specific applications in DNA analysis. The specific details and embodiments of arrays and capture probes described herein in the context of RNA also apply, where appropriate, to all such arrays, including those used for DNA.

[0065] In this invention, the term "multiple" means two or more, or at least two, for example, 3, 5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 400, 500, 1000, 2000, 5000, 10000, or more, etc. Therefore, for example, the number of capture probes can be any integer between any two of the above numbers. However, it is understood that, by design, conventional arrays containing hundreds, thousands, tens of thousands, hundreds of thousands, or even millions of capture probes can also be used.

[0066] Therefore, the method outlined herein employs a high-density nucleic acid array containing “capture probes” to capture and label transcripts of all individual cells in a tissue sample, such as a tissue sample slice, or “section.” The tissue sample or section used for analysis is prepared in a highly parallel manner to preserve spatial information within the slice. The captured RNA (preferably mRNA) molecules, or “transcriptome,” in each cell are transcribed into cDNA and analyzed, for example, by high-throughput sequencing. The resulting data can be correlated with images of the original tissue sample, such as the section, by means of so-called recognition code sequences (or ID tags, defined herein as localization domains) integrated onto the nucleic acid probes arranged in the array.

[0067] High-density nucleic acid arrays, or microarrays, are the core component of the spatial transcriptome labeling methods described herein. Microarrays are a multi-faceted technique used in molecular biology. A typical microarray consists of a series of oligonucleotide microdots arranged in an array (tens of thousands of microdots can typically be integrated into an array). The distinct position of each nucleic acid (oligonucleotide) microdot (each type of oligonucleotide / nucleic acid molecule) is called a "feature" (therefore, each type of capture probe in the above methods can be considered a specific feature of the array; each feature occupies a different position on the array), and typically each individual feature contains on the order of picomoles (10^-12 moles) of a specific DNA sequence (a "type"), which is called a "probe" (or "reporter"). Typically, the probe can be a short segment of a gene or other nucleic acid component capable of hybridizing with a cDNA or cRNA sample (i.e., the "target") under highly stringent hybridization conditions. However, as described below, the probes in this invention differ from those in standard microarrays.

[0068] In gene expression microarrays, probe-target hybridization is usually detected and quantified by the detection of a visual signal, such as a fluorophore, silver ion, or chemiluminescent label attached to all targets. The intensity of this visual signal is related to the relative abundance of each target nucleic acid in the sample. Since an array can contain tens of thousands of probes, a single microarray experiment can perform parallel testing of many genes.

[0069] In standard microarrays, probes are covalently bonded to a chemical matrix such as epoxy silane, amino silane, lysine, or polyacrylamide and attached to a solid surface or substrate. Typical substrates are glass, plastic, silicon chips, or wafers, but other known microarray platforms exist, such as microbeads.

[0070] The probes can be attached to the array of the present invention by any suitable method. In a preferred embodiment, the probes are fixed to the substrate by a chemical fixation method, which can be an interaction between the substrate (support material) and the probe based on a chemical reaction. Such a chemical reaction generally does not depend on the energy input of heat or light, but can be promoted by applying heat (e.g., providing an ideal temperature for the chemical reaction) or light of a certain wavelength. For example, a chemical fixation reaction can occur between functional groups on the substrate and corresponding functional elements on the probe. Such corresponding functional elements of the probe can be chemical groups inherent to the probe, such as hydroxyl groups, or additionally introduced groups. An example of such functional groups is an amino group. Typically, the probe to be fixed contains a functional amino group, or a functional amino group is introduced through chemical modification. Methods and approaches for such chemical modifications are well known.

[0071] The positioning of the functional groups in the probe to be immobilized can be used to control and shape the binding behavior and / or orientation of the probe; for example, the functional groups can be placed at the 5' end, 3' end, or within the probe sequence. A typical substrate for the probe to be immobilized contains groups capable of binding to the probe, such as groups capable of binding to amino-functionalized nucleic acids. Examples of such substrates include carboxyl, aldehyde, or epoxy-based substrates. Such materials are known to those skilled in the art. Functional groups that mediate a linking reaction between array substrates and probes made chemically reactive by the introduction of an amino group are also known to those skilled in the art.

[0072] Optional substrates for probe immobilization may require chemical activation, such as activation of functional groups on the array substrate. The term "activated substrate" refers to a material containing interacting or reactive chemical groups, which have been established or activated through chemical modification steps known to those skilled in the art. For example, a substrate containing carboxyl groups must be activated before use. Furthermore, there are readily available substrates containing functional groups that can react with specific groups already present in nucleic acid probes.

[0073] Alternatively, the probe can be synthesized directly on a substrate. Suitable steps for such methods are known to those skilled in the art. Examples are the manufacturing techniques developed by Agilent Inc., Affymetrix Inc., Roche Nimblegen Inc., or Flexgen BV. Typically, lasers and an array of mirrors are used to specifically activate the microdots where nucleotides are to be added. Such methods can provide spot sizes (i.e., features) of 30 μm or larger.

[0074] Therefore, the substrate can be any suitable substrate known to those skilled in the art. The substrate can take any appropriate form or format; for example, it can be flat or curved, such as raised or recessed relative to the area where the tissue sample and the substrate interact. Particularly preferred is that the substrate is planar, i.e., a two-dimensional chip or wafer.

[0075] Typically, the substrate is a solid-phase support, thus allowing probes on the substrate to achieve accurate and traceable positioning. An example of a substrate is a solid-phase material or substrate containing functional chemical groups, such as amine groups or amine-functionalized groups. The substrate contemplated in this invention is a non-porous substrate. Preferred non-porous substrates include glass, silicon, polylysine-coated materials, nitrocellulose, polystyrene, cyclic olefin copolymers (COCs), cyclic olefin polymers (COPs), polypropylene, polyethylene, and polycarbonate.

[0076] Any suitable material known to those skilled in the art can be used. Typically, glass or polystyrene is used. Polystyrene is a hydrophobic material, suitable for binding negatively charged macromolecules because it generally contains almost no hydrophilic groups. For nucleic acids immobilized on glass slides, it is further known that increasing the hydrophobicity of the glass surface can improve nucleic acid immobilization. This improvement allows for relatively denser arrays. In addition to polylysine coatings or surface treatments, the substrate, especially the glass substrate, can also be silanized, for example, treated with epoxy silane or amino silane, alkylated, or treated with polyacrylamide.

[0077] Several readily available standard feature arrays exist on the market, with varying numbers and sizes of features. In this invention, the arrangement of features can be varied according to the cell size and / or density of different tissues or organs. For example, the cross-section of typical animal cells is approximately 1-100 μm, while the cross-section of typical plant cells may vary in the range of 1-10,000 μm. Therefore, for animal or fungal tissue samples, it is preferable to use arrays with up to 2.1 million or 4.2 million features, and a feature size of 13 micrometers. For plant tissue samples, other formats, such as 8x130k feature arrays, are sufficient. For sequence analysis, especially NGS, there are also readily available commercial arrays, or commercial arrays known to be suitable for this field. Such arrays can also be used as array surfaces within the scope of this invention, such as Illumina bead arrays. In addition to readily available commercial arrays that can be customized, custom or non-standard "homemade" arrays can be fabricated, and methods for fabricating these arrays are well-established. Whether standard or non-standard, any array containing the probes defined below can be used in the methods of this invention.

[0078] Preferably, depending on the chemical matrix of the array, the probes on the microarray are fixed to the array via their 5' or 3' ends, i.e., attached or bonded to the array. Typically, for commercially available arrays, the probes are attached via connections at their 3' ends, leaving the 5' ends as free ends. However, there are also arrays where attachment is achieved via connections at the 5' ends, leaving the 3' ends as free probe ends, and as described in other parts of this invention, such arrays can be synthesized using standard techniques known in the art.

[0079] The covalent bonds connecting nucleic acid probes to the array substrate can be considered either direct or indirect. While probe attachment is achieved through "direct" covalent bonds, there may be chemical components or linkers that separate the "first" nucleotide of the nucleic acid probe from the substrate, such as glass or silicon; this constitutes an indirect connection. For the purposes of this invention, probes fixed to the substrate using covalent bonds and / or chemical linkers are generally considered to be directly fixed or attached to the substrate.

[0080] The capture probes of the present invention can be directly or indirectly attached to or interact with an array, as will be described in more detail below. Therefore, the capture probes do not need to be directly bound to the array, but can interact indirectly, for example, by binding to a molecule directly or indirectly bound to the array (e.g., the capture probe can interact with a capture probe binding partner (e.g., bind or hybridize with it), which is a surface probe directly or indirectly bound to the array). However, in general, the capture probes will be directly or indirectly bound to or attached to the array (through one or more intermediate media).

[0081] The uses, methods, and arrays of this invention may include probes fixed via a 5' or 3' end. However, when the capture probe is directly fixed to the array substrate, the fixing method must ensure that the 3' end of the capture probe can be freely extended; for example, the probe can be fixed via a 5' end. The capture probe can be indirectly fixed to allow it to have a free 3' end, i.e., an extendable 3' end.

[0082] An extended or extendable 3' end means that more nucleotides can be added to the 3' terminal nucleotide of a nucleic acid molecule (such as a capture probe) to extend the length of the nucleic acid molecule. That is, to extend the nucleic acid molecule using standard polymerization reactions, such as polymerase-catalyzed template polymerization reactions.

[0083] Therefore, in one embodiment, the array contains probes directly fixed at their 3' ends, namely, so-called surface probes as defined below. Each type of surface probe contains a region complementary to a type of capture probe, enabling the capture probe to hybridize with the surface probe to form a capture probe with a freely elongable 3' end. In a preferred aspect of the invention, if the array contains surface probes, the capture probes are synthesized in situ on the array.

[0084] The array probe can be composed of ribonucleotides and / or deoxyribonucleotides, and synthetic nucleotide residues capable of participating in Watson-Crick or similar base-pairing interactions. Therefore, the nucleic acid can be DNA or RNA or any modified product thereof, such as PNA or other derivatives containing a non-nucleotide backbone. However, in a transcriptome analysis setting, the capture domain of the capture probe must be able to prime a reverse transcription reaction to generate cDNA complementary to the captured RNA molecule. As will be described in further detail below, in a genome analysis setting, the capture domain of the capture probe must be able to bind to a DNA fragment, which can include binding to a binding domain added to the DNA fragment. In some embodiments, the capture domain of the capture probe can prime a DNA extension (polymerase) reaction to generate DNA complementary to the captured DNA molecule. In other embodiments, the capture domain can serve as a template in the ligation reaction between the captured DNA molecule and a surface probe directly or indirectly immobilized on a substrate. In other embodiments, the capture domain can be ligated to one strand of the captured DNA molecule.

[0085] In a preferred embodiment of the invention, the capture probe contains, or contains only, deoxyribonucleotides (dNTPs) at least in its capture domain. In a particular preferred embodiment, the entire capture probe contains, or contains only, deoxyribonucleotides.

[0086] In a preferred embodiment of the present invention, the capture probe is directly fixed on the array substrate, that is, it is fixed by its 5' end and its 3' end is free and can be extended.

[0087] The capture probe of this invention contains at least two domains: a capture domain and a localization domain (or a signature tag or signature domain; optionally, the localization domain may be defined as an ID domain or tag, or a localization tag). The capture probe may further contain a universal domain as defined below. When the capture probe is indirectly attached to an array surface by hybridization with a surface probe, the capture probe must have a sequence (e.g., a portion or domain) complementary to the surface probe. Such a complementary sequence may be complementary to the localization / identification domain and / or universal domain of the surface probe. In other words, the localization domain and / or universal domain may constitute a region or portion on the probe that is complementary to the surface probe. However, the capture probe may also contain additional domains (or regions, portions, or sequences) complementary to the surface probe. As described below, for ease of synthesis, such regions complementary to the surface probe may be portions of the capture domain or extensions of the capture domain (such portions or extensions are not used or cannot bind to target nucleic acids such as RNA).

[0088] A typical capture domain is located at the 3' end of the capture probe and contains a free 3' end that can be extended, for example, through a templated polymerization reaction. The capture domain contains a nucleotide sequence capable of hybridizing with nucleic acids, such as RNA (preferably mRNA), in a tissue sample that comes into contact with the array.

[0089] Preferably, the capture domain can be selected or designed to selectively or specifically bind (or more broadly, to bind) specific nucleic acids, such as RNA to be detected or analyzed. For example, the capture domain can be selected or designed to selectively capture mRNA, as is well known in the art, based on hybridization with the poly-A tail of mRNA. Thus, in a preferred embodiment, the capture domain contains a poly-T DNA oligonucleotide, i.e., a series of consecutive deoxythymidine residues linked by phosphodiester bonds, capable of hybridizing with the poly-A tail of mRNA. Optionally, the capture domain can contain nucleotides that are functionally or structurally similar to poly-T, i.e., nucleotides capable of selectively binding to poly-A, such as poly-U oligonucleotides, or oligonucleotides composed of deoxythymidine analogs, wherein the oligonucleotides retain the functional property of binding to poly-A. In a particular preferred embodiment, the capture domain, or more specifically, the poly-T element of the capture domain, contains at least 10 nucleotides, preferably at least 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides. In a further embodiment, the capture domain, or more specifically, the poly-T element of the capture domain, contains at least 25, 30, or 35 nucleotides.

[0090] As is known in the art, nucleic acid capture can also be achieved using random sequences, such as random hexamers or similar sequences, and thus, such random sequences can be used to form all or part of the capture domain. For example, random sequences can be used in conjunction with poly-T (or poly-T analogs, etc.) sequences. Therefore, a capture domain containing poly-T (or "poly-T analogs") oligonucleotides can also contain random oligonucleotide sequences. For example, the random oligonucleotide sequence can be located at the 5' or 3' end of the poly-T sequence, for example, at the 3' end of the capture probe, but the position of such random sequences is not critical. Such a structure is advantageous for capturing the start of the poly-A tail of mRNA. Alternatively, the capture domain can be entirely a random sequence. Furthermore, as is known in the art, degenerate capture domains can also be used.
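The composition of a capture domain described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `make_capture_domain`, its parameters, and the default lengths are assumptions introduced here, not part of the disclosed method. It builds a capture domain, written 5' to 3', from a poly-T stretch and an optional random stretch placed on either side of it; setting the poly-T length to zero gives an entirely random capture domain.

```python
import random

def make_capture_domain(poly_t_length=20, random_length=0,
                        random_at_3prime=True, rng=None):
    """Return a hypothetical capture-domain sequence written 5'->3'.

    poly_t_length    -- number of consecutive deoxythymidine residues
    random_length    -- number of random bases (e.g. 6 for a random hexamer)
    random_at_3prime -- place the random stretch 3' (True) or 5' (False) of poly-T
    """
    rng = rng or random.Random()
    poly_t = "T" * poly_t_length
    random_part = "".join(rng.choice("ACGT") for _ in range(random_length))
    if random_at_3prime:
        return poly_t + random_part
    return random_part + poly_t

# Example: 20-nt poly-T followed by a random hexamer at the 3' end.
example = make_capture_domain(20, 6, rng=random.Random(0))
```

Because the random part differs from molecule to molecule, probes built this way share a function (binding poly-A tails) without sharing an identical sequence.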

[0091] The capture domain can selectively bind to a desired nucleic acid subtype or class, such as RNA, for example, a specific type of RNA, such as the aforementioned mRNA or rRNA; or, it can bind to a specific subset of a specified RNA type, for example, a specific mRNA, such as the mRNA corresponding to a specific gene or gene group. Such capture probes can be selected or designed based on the sequence of the RNA to be captured, and are therefore sequence-specific capture probes targeting a specific RNA target or population (target group, etc.). Thus, according to principles well known in the art, the capture probe can be based on a specific gene sequence or a specific sequence motif or a shared / conserved sequence, etc.

[0092] In some embodiments, the capture probe is indirectly immobilized on the array substrate, for example, by hybridization with a surface probe, and its capture domain may further contain an upstream sequence (located at the 5' end of the sequence that hybridizes with nucleic acids from tissue samples such as RNA) capable of hybridizing with the 5' end of the surface probe. Individually, the capture domain of the capture probe can be considered as a capture domain oligonucleotide, which, in embodiments where the capture probe is indirectly immobilized on the array, can be used in the synthesis of the capture probe.

[0093] The localization domain (identification domain or identification tag) of the capture probe is located directly or indirectly upstream of the capture domain, i.e., closer to the 5' end of the capture probe nucleic acid molecule. Preferably, the localization domain is directly adjacent to the capture domain, i.e., there is no spacer sequence between the capture domain and the localization domain. In some embodiments, the localization domain constitutes the 5' end of the capture probe and can be directly or indirectly immobilized on the array substrate.

[0094] As discussed above, each feature (i.e., different location) of the array contains a microdot of a nucleic acid probe, and the localization domain present in each feature is unique. Therefore, the probe "type" is defined by the localization domain it possesses; capture probes of the same type have the same localization domain. However, it is not required that every capture probe within a type has an identical sequence. In particular, since the capture domain can be or may contain random or degenerate sequences, the capture domains of individual probes within a type can vary widely. Accordingly, in some embodiments, the capture domains of the capture probes are identical, and a single probe sequence exists in each feature; however, in other embodiments, the capture probes are different, and members within a probe type will not contain identical sequences, although the localization domain sequences of each member within that type are the same. The requirement is that each feature or location of the array has capture probes of a single type (specifically, the capture probes carried by each feature or location have the same localization tag, i.e., a single localization domain on each feature or location). Each type of probe has a different localization domain to distinguish the types. However, each member of a type may, in some cases as described in further detail herein, have a different capture domain because the capture domain can be random or degenerate, or contain random or degenerate components. This means that the capture domains of the probes at a given feature or location can differ.

[0095] Therefore, in some, but not all, embodiments, any probe molecule immobilized on a specific feature has the same nucleotide sequence as other probe molecules immobilized on the same feature, but the nucleotide sequences of the probes on each feature are different, distinct, or distinguishable from those on every other feature. Preferably, each feature contains a different type of probe. However, in some embodiments, it may be advantageous for a group of features to contain the same type of probe, i.e., to effectively create a feature that covers a region of the array larger than a single feature, for example, to reduce the array resolution. In other embodiments of the array, any probe molecule immobilized on a specific feature may have the same localization domain nucleotide sequence as other probe molecules immobilized on the same feature, but their capture domains may differ. However, the capture domains are typically designed to capture the same type of molecules, such as mRNA.

[0096] The localization domain (or tag) of a capture probe contains a sequence specific to each feature, acting as a localization or spatial marker (identification tag). In this way, by linking a nucleic acid such as RNA (e.g., a transcript) from a cell to the unique localization domain sequence of the capture probes in the array, each region or domain of a tissue sample, such as each cell in the tissue, can be identified using the full array spatial resolution. Through the localization domain, the capture probes in the array can be associated with a specific location in the tissue sample, for example, with a specific cell in the sample. Therefore, the localization domain of a capture probe can be viewed as a nucleic acid tag (identification tag).
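How the localization domain links a sequenced molecule back to a position on the array can be illustrated with a short sketch. Everything named here is hypothetical: the tag sequences, the `FEATURE_POSITIONS` lookup table, the fixed tag length, and the assumption that each read begins with the localization domain (any universal domain already trimmed) are choices made for illustration only.

```python
from collections import Counter

# Hypothetical layout: localization-domain sequence -> (column, row) of its feature.
FEATURE_POSITIONS = {
    "ACGTACGTAC": (0, 0),
    "TGCATGCATG": (1, 0),
    "GGATCCAATT": (0, 1),
}
TAG_LENGTH = 10  # assumed fixed length of the localization domain

def assign_reads_to_features(reads, positions=FEATURE_POSITIONS,
                             tag_length=TAG_LENGTH):
    """Count reads per array feature using the leading tag of each read."""
    counts = Counter()
    for read in reads:
        tag = read[:tag_length]
        position = positions.get(tag)
        if position is not None:  # reads carrying an unknown tag are skipped
            counts[position] += 1
    return counts

reads = [
    "ACGTACGTAC" + "TTTTTTTTGGCA",
    "ACGTACGTAC" + "TTTTTTTTCCAT",
    "GGATCCAATT" + "TTTTTTTTAGGC",
    "NNNNNNNNNN" + "TTTTTTTTAGGC",  # unreadable tag, ignored
]
per_feature = assign_reads_to_features(reads)
```

The resulting per-feature counts can then be overlaid on an image of the tissue section, since each feature coordinate corresponds to a known location under the sample.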

[0097] Any suitable sequence can be used as the localization domain of the capture probe of the present invention. A suitable sequence means that, as a localization domain, it will not interfere with (i.e., inhibit or distort) the reaction between the RNA of the tissue sample and the capture domain of the capture probe. For example, the localization domain should be designed not to have specificity for hybridization with nucleic acid molecules in the tissue sample. Preferably, the sequence identity between the nucleic acid sequence of the capture probe localization domain and the nucleic acid sequence of the tissue sample is less than 80%. Preferably, the sequence identity between the capture probe localization domain and the majority of nucleic acid molecules in the tissue sample is less than 70%, 60%, 50%, or 40%. Sequence identity can be determined by appropriate methods known in the art, for example, using the BLAST alignment algorithm.
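The identity threshold above can be made concrete with a simplified screen. The sketch below is not BLAST: it is an assumed stand-in that slides a candidate localization domain along each sample sequence without gaps and reports the best fraction of matching bases. The function names and the default threshold of 0.80 (mirroring the 80% figure above) are illustrative; a real screen would use a proper alignment tool and would also check the complementary strand.

```python
def max_ungapped_identity(tag, target):
    """Best fraction of matching bases over all ungapped placements of tag on target."""
    best = 0.0
    for start in range(len(target) - len(tag) + 1):
        window = target[start:start + len(tag)]
        matches = sum(a == b for a, b in zip(tag, window))
        best = max(best, matches / len(tag))
    return best

def passes_identity_screen(tag, sample_sequences, threshold=0.80):
    """True if the tag stays below the identity threshold for every sample sequence."""
    return all(max_ungapped_identity(tag, seq) < threshold
               for seq in sample_sequences)
```

For example, a tag that occurs verbatim inside a sample sequence scores 1.0 and is rejected, while a tag sharing only scattered bases with every sample sequence is accepted.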

[0098] In a preferred embodiment, each capture probe's localization domain contains a unique identification code sequence. This identification code sequence can be generated using random sequences. The randomly generated sequences are then rigorously screened by mapping them against all types of common reference genomes and by applying predefined Tm intervals, GC content, and a defined distance from the other identification code sequences. This ensures that the identification code sequences do not interfere with the nucleic acid capture process, such as the capture of RNA from tissue samples, and that the identification code sequences can be easily distinguished from each other.
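A minimal sketch of such a screening procedure follows. The numeric settings (18-nt codes, 40-60% GC, a 50-58 °C window, a minimum Hamming distance of 4) are assumptions chosen for illustration and are not taken from the text. The melting temperature uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough estimate for short oligonucleotides, and the reference-genome mapping step is omitted entirely.

```python
import random

def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rough melting temperature in degrees C (Wallace rule, short oligos only)."""
    return 2 * (seq.count("A") + seq.count("T")) + 4 * (seq.count("G") + seq.count("C"))

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def generate_id_codes(n, length=18, gc_range=(0.40, 0.60), tm_range=(50, 58),
                      min_distance=4, seed=0, max_attempts=100_000):
    """Draw random codes and keep those passing the GC, Tm and distance filters."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_attempts):
        if len(accepted) == n:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if not gc_range[0] <= gc_fraction(candidate) <= gc_range[1]:
            continue
        if not tm_range[0] <= wallace_tm(candidate) <= tm_range[1]:
            continue
        if any(hamming(candidate, code) < min_distance for code in accepted):
            continue
        accepted.append(candidate)
    return accepted

codes = generate_id_codes(20)
```

Requiring a minimum distance between codes means that a small number of sequencing errors in a tag will not turn one valid code into another.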

[0099] As described above, in a preferred embodiment, the capture probe further includes a universal domain (or a linker domain or linker tag). The universal domain of the capture probe is located directly or indirectly upstream of the localization domain, i.e., close to the 5' end of the capture probe nucleic acid molecule. Preferably, the universal domain is directly adjacent to the localization domain, i.e., there is no spacer sequence between the localization domain and the universal domain. In some embodiments, the capture probe contains a universal domain that forms the 5' end of the capture probe for direct or indirect fixation onto the array substrate.
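The 5' to 3' order of the domains described so far (universal domain, then localization domain, then capture domain) can be summarized in a small data structure. The class name `CaptureProbe`, the helper `probes_for_array`, and all sequences used are hypothetical and serve only to illustrate the layout, assuming directly adjacent domains with no spacer sequences.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CaptureProbe:
    universal: str     # 5'-most domain (e.g. cleavage and/or amplification domain)
    localization: str  # feature-specific ID tag
    capture: str       # 3'-most domain with the free, extendable 3' end

    def sequence(self) -> str:
        """Full probe sequence written 5'->3', with no spacers between domains."""
        return self.universal + self.localization + self.capture

def probes_for_array(universal, capture, localization_tags):
    """One probe type per feature: only the localization domain differs."""
    return [CaptureProbe(universal, tag, capture) for tag in localization_tags]

probes = probes_for_array("GACTGCCA", "T" * 20, ["ACGTACGTAC", "TGCATGCATG"])
```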

[0100] In the methods and uses of this invention, the universal domain can have various applications. For example, the method of this invention includes the step of releasing (e.g., removing) at least a portion of a synthesized (i.e., elongated or ligated) nucleic acid (e.g., cDNA) molecule from an array surface. As described in other parts of this invention, this step can be implemented using various methods, one of which involves cleaving the nucleic acid (e.g., cDNA) molecule from the array surface. Thus, the universal domain itself may contain a cleavage domain, i.e., a sequence that can be specifically cleaved by chemical methods, or preferably by enzymatic methods.

[0101] Therefore, the cleavage domain may contain a sequence that can be recognized by one or more enzymes with nucleic acid cleavage capabilities (i.e., the ability to break phosphodiester bonds between two or more nucleotides). For example, the cleavage domain may contain a restriction endonuclease (restriction enzyme) recognition sequence. Restriction enzymes cleave double-stranded or single-stranded DNA at a specific recognition nucleic acid sequence called a "restriction site," and suitable enzymes are well known in the art. For example, it is particularly preferable to use rare-cutting restriction enzymes, i.e., enzymes with long restriction sites (at least 8 base pairs in length), to reduce the possibility of accidental cleavage at other sites on nucleic acid molecules such as cDNA. In this regard, it is understood that removing or releasing at least a portion of a nucleic acid molecule such as cDNA means releasing the portion of the nucleic acid molecule that contains the localization domain and the entire sequence downstream of it, i.e., the entire sequence 3' of the localization domain. Therefore, cleavage of nucleic acid molecules such as cDNA should occur on the 5' side of the localization domain.
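The preference for long recognition sites can be illustrated with a simple estimate: in a random sequence with equal base frequencies, a given n-base site is expected roughly once every 4^n bases, so an 8-base site occurs about once per 65,536 bases compared with about once per 4,096 bases for a 6-base site. The sketch below computes this estimate and scans a sequence for unintended sites. The function names are illustrative, and the equal-frequency assumption is a simplification; the NotI recognition sequence GCGGCCGC is used as an example of an 8-base site.

```python
def expected_site_spacing(site_length):
    """Expected bases between occurrences of one site in a uniform random sequence."""
    return 4 ** site_length

def find_sites(sequence, site):
    """Return the 0-based start positions of every occurrence of site in sequence."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions

NOTI_SITE = "GCGGCCGC"  # 8-base recognition sequence of NotI
spacing_8 = expected_site_spacing(len(NOTI_SITE))
internal_sites = find_sites("AAGCGGCCGCTTGCGGCCGC", NOTI_SITE)
```

A cDNA whose scan returns an empty list contains no internal copy of the site, so cleavage at the site in the cleavage domain would release it intact.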

[0102] For example, the cleavage domain may contain a poly-U sequence, which can be cleaved using a mixture of uracil DNA glycosylase (UDG) and the DNA glycosylase-lyase endonuclease VIII, commercially available as the USER™ enzyme.

[0103] Further examples of the use of the cleavage domain can be seen in embodiments where the capture probe is indirectly (i.e., via a surface probe) immobilized on an array substrate. In this embodiment, the cleavage domain may contain one or more mismatched nucleotides, meaning that the complementary portion of the surface probe and the capture probe are not 100% complementary. Such mismatches are recognized by, for example, MutY and T7 endonuclease I, resulting in the cleavage of the nucleic acid molecule at the mismatch site.

[0104] In some embodiments of the present invention, the localization domain of the capture probe includes a cleavage domain, wherein the cleavage domain is located at the 5' end of the localization domain.

[0105] The universal domain may also contain an amplification domain, in addition to or as an alternative to the cleavage domain. In some embodiments of the invention, as described in other parts of the invention, nucleic acid molecules (such as cDNA) can preferably be amplified after being released (e.g., removed or excised) from the array substrate. However, it is understood that the initial amplification cycle, or any and all subsequent amplification cycles, can also be performed in situ on the array. The amplification domain contains a unique sequence capable of hybridizing with the amplification primers. Preferably, the amplification domain of the universal domain is the same for each type of capture probe. Therefore, a single amplification reaction will be sufficient to amplify all nucleic acid molecules (such as cDNA, etc.), regardless of whether the nucleic acid molecules are released from the array substrate prior to the amplification reaction.

[0106] Any suitable sequence can be used as the amplification domain of the capture probe of this invention. "Suitable sequence" means that, as an amplification domain, it will not interfere with (i.e., inhibit or distort) the interaction between the tissue sample nucleic acid (such as RNA) and the capture domain of the capture probe. Furthermore, the amplification domain should contain a sequence that is different from or substantially different from any sequence of the tissue sample nucleic acid (such as RNA) to ensure that the primers for the amplification reaction can hybridize only with the amplification domain under the amplification conditions of the reaction.

[0107] For example, the amplification domain should be designed so that it or its complementary sequence does not specifically hybridize with nucleic acid molecules in the tissue sample. Preferably, the nucleic acid sequence of the capture probe amplification domain and its complementary strand have less than 80% sequence identity with the nucleic acid sequence of the tissue sample. Preferably, the amplification domain of the capture probe has less than 70%, 60%, or 40% sequence identity with most nucleic acid molecules in the tissue sample. Sequence identity can be determined by appropriate methods known in the art, for example, using the BLAST alignment algorithm.

[0108] Therefore, when viewed in isolation, the universal domain of the capture probe can be considered as a universal domain oligonucleotide. In embodiments where the capture probe is indirectly immobilized on an array, the universal domain oligonucleotide can be used for the synthesis of the capture probe.

[0109] In a representative embodiment of the invention, only the localization domain of each capture probe is unique. Therefore, on any particular array within the same embodiment, the capture domain and universal domain (if applicable) of each capture probe are identical to ensure array-wide consistency in the capture of nucleic acids (e.g., RNA) from tissue samples. However, as discussed above, in some embodiments, some capture domains may differ due to the inclusion of random or degenerate sequences.

[0110] In some embodiments, the capture probes are indirectly fixed (e.g., by hybridization with surface probes) on the array substrate, and such capture probes can be synthesized on the array as described below.

[0111] The surface probes are fixed to the array substrate directly by or at their 3' ends. Each type of surface probe is unique to one feature (different location) on the array and is complementary to a portion of the capture probe defined above.

[0112] Therefore, the 5' end of the surface probe contains a domain (complementary capture domain) that is complementary to the portion of the capture domain that does not bind to nucleic acids (such as RNA) in the tissue sample. In other words, the complementary capture domain can at least partially hybridize with the capture domain oligonucleotide. The surface probe further contains a domain (complementary localization domain or complementary signature domain) that is complementary to the localization domain of the capture probe. This complementary localization domain is located directly or indirectly downstream of the complementary capture domain (i.e., in the direction of its 3' end), meaning that the complementary localization domain and the complementary capture domain can be separated by a spacer sequence or a linker sequence. In embodiments where the capture probe is synthesized on the array surface, the surface probe always contains, at its 3' end, i.e., directly or indirectly downstream of the complementary localization domain, a domain (complementary universal domain) that is complementary to the universal domain of the capture probe. In other words, it contains a domain that can at least partially hybridize with the universal domain oligonucleotide.

[0113] In some embodiments of the invention, the sequence of the surface probe has 100% complementarity or sequence identity with the localization domain and the universal domain, as well as with the portion of the capture domain that does not bind to nucleic acids (such as RNA) in the tissue sample. In other embodiments, the sequence identity between the surface probe sequence and these domains of the capture probe may be less than 100%, for example, less than 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, or 90%. In a particular preferred embodiment of the invention, the sequence identity between the complementary universal domain and the universal domain of the capture probe is less than 100%.

[0114] In one embodiment of the invention, the capture probe is synthesized or generated on the array substrate. In a representative embodiment (see Figure 3), the array contains surface probes as defined above. Oligonucleotides corresponding to the capture domain and the universal domain of the capture probe are brought into contact with the array and hybridize with the complementary domains of the surface probes. Excess oligonucleotides can be removed by washing under standard hybridization conditions. The resulting array contains partially single-stranded probes, in which the 5' and 3' ends of the surface probes are double-stranded while the complementary localization domain is single-stranded. The array can be treated with a polymerase to extend the 3' end of the universal domain oligonucleotide in a template-dependent manner, thereby synthesizing the localization domain of the capture probe. The 3' end of the synthesized localization domain can then be ligated to the 5' end of the capture domain oligonucleotide using a ligase to generate the capture probe. It will be understood that, for the ligation reaction to occur, the 5' end of the capture domain oligonucleotide is phosphorylated. Since each type of surface probe contains a unique complementary localization domain, each type of capture probe will contain a unique localization domain.
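
The extension and ligation steps described above can be illustrated with a toy string model. This is only a sketch under stated assumptions: the sequences, the function names (reverse_complement, synthesize_capture_probe) and the 4-base anchor length are hypothetical and do not come from the patent, and the enzymatic steps are reduced to string operations on sequences written 5' to 3'.

```python
# Toy model of on-array capture probe synthesis (polymerase extension, then ligation).
# All sequences and names are hypothetical; every string is written 5'->3'.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs antiparallel with seq, written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def synthesize_capture_probe(surface_probe: str, universal_oligo: str,
                             capture_oligo: str, anchor_len: int) -> str:
    """Build a capture probe on a surface-probe template.

    anchor_len is the number of 5' bases of the capture domain oligonucleotide
    that pair with the surface probe (the part that does not bind sample RNA).
    """
    # Strand fully complementary to the surface probe, read 5'->3'.
    full_copy = reverse_complement(surface_probe)
    if not full_copy.startswith(universal_oligo):
        raise ValueError("universal oligo does not pair with the 3' end of the surface probe")
    if not full_copy.endswith(capture_oligo[:anchor_len]):
        raise ValueError("capture oligo does not pair with the 5' end of the surface probe")
    # Polymerase step: fill the single-stranded gap, starting at the 3' end of
    # the universal oligo and stopping at the hybridized capture oligo.
    localization = full_copy[len(universal_oligo):len(full_copy) - anchor_len]
    # Ligase step: join the new 3' end to the phosphorylated 5' end of the capture oligo.
    return universal_oligo + localization + capture_oligo


UNIVERSAL = "ACGTACGT"
LOCALIZATION = "GATTACAGG"            # unique to one feature
CAPTURE_OLIGO = "CAGC" + "T" * 20     # 4-base anchor followed by a poly-T stretch
surface = reverse_complement(UNIVERSAL + LOCALIZATION + CAPTURE_OLIGO[:4])
probe = synthesize_capture_probe(surface, UNIVERSAL, CAPTURE_OLIGO, 4)
```

In this model the localization domain is never supplied as an oligonucleotide; it is read off the surface probe, which is why each feature ends up with its own tag.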

[0115] As used herein, the term "hybridization reaction" or "hybridization" refers to the formation of a duplex between nucleotide sequences that are sufficiently complementary to bind together via Watson-Crick base pairing. Two nucleotide sequences are "complementary" to each other when they share base-pairing homology. "Complementary" nucleotide sequences specifically bind under appropriate hybridization conditions to form a stable double helix. For example, if a portion of a first sequence binds to a portion of a second sequence in an antiparallel manner, with the 3' end of one sequence binding to the 5' end of the other, and each A, T(U), G, and C of one sequence binding to the T(U), A, C, and G of the other sequence, respectively, then the two sequences are complementary. RNA sequences may also contain complementary G=U or U=G base pairs. Therefore, two "complementary" sequences in this invention need not be perfectly homologous. Generally, two sequences are considered sufficiently complementary if at least 90% (preferably at least about 95%) of their nucleotides along a specified length of their molecules have the same base pairing structure. Therefore, the domains of the capture probe and the surface probe contain regions that are complementary to each other. Furthermore, the capture domain of the capture probe contains a region that is complementary to nucleic acids in the tissue sample, such as RNA (preferably mRNA).
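
The definition of complementarity above can be expressed as a small check. The sketch below is illustrative only and makes simplifying assumptions that are not in the patent: the two sequences have equal length, they are aligned end to end in antiparallel orientation without gaps, and the function names and the default 90% threshold parameter are hypothetical.

```python
# Antiparallel base-pairing check; sequences are written 5'->3'.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}  # pairs that may additionally occur with RNA


def fraction_paired(first: str, second: str, allow_wobble: bool = False) -> float:
    """Fraction of positions that pair when second is aligned antiparallel to first."""
    if not first or len(first) != len(second):
        raise ValueError("this sketch compares non-empty sequences of equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    # The 5' end of one strand faces the 3' end of the other, so second is reversed.
    paired = sum((a, b) in allowed for a, b in zip(first, reversed(second)))
    return paired / len(first)


def sufficiently_complementary(first: str, second: str, threshold: float = 0.90,
                               allow_wobble: bool = False) -> bool:
    """Apply the 'at least 90% of nucleotides pair' criterion from the text."""
    return fraction_paired(first, second, allow_wobble) >= threshold
```

For example, a 10-base sequence and its exact reverse complement give a paired fraction of 1.0, and a single mismatch lowers it to 0.9, which still meets the default threshold.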

[0116] Capture probes can also be synthesized on an array substrate using polymerase extension reactions (similar to those described above) and terminal transferase tailing reactions. The added tails can form capture domains, as further described in Example 7 below. The technique of adding nucleotide sequences to the ends of oligonucleotides using terminal transferases, for example to introduce a homopolymer tail such as a poly-T tail, is known in the art. Accordingly, in such synthetic reactions, the oligonucleotide corresponding to the universal domain of the capture probe can be brought into contact with the array and hybridize with the complementary domain of the surface probe. Excess oligonucleotides can be removed by washing under standard hybridization conditions. The resulting array contains partially single-stranded probes, in which the 3' ends of the surface probes are double-stranded and the complementary localization domain is single-stranded. The array can be treated with a polymerase to extend the 3' end of the universal domain oligonucleotide in a template-dependent manner, thereby synthesizing the localization domain of the capture probe. Then, to introduce a capture domain, such as a capture domain containing a poly-T sequence, a poly-T tail can be added using a terminal transferase to generate the capture probe.

[0117] The typical arrays of the present invention, and the arrays used in the methods of the present invention, may contain multiple microdots, or "features." A feature can be defined as a region or different location on the array substrate where a single type of capture probe is fixed. Therefore, each feature will contain multiple probe molecules of the same type. It is understood that, under this setup, although each capture probe of the same type can have the same sequence, this is not a necessary condition. Each type of capture probe will have the same localization domain (i.e., each member of the same type, and even each probe in the same feature, has the same "marker"), but the sequences of individual probes within the same feature (type) can be different because the sequences of the capture domains can differ. As mentioned above, random or degenerate capture domains can be used. Therefore, capture probes in the same feature can contain different random or degenerate sequences. The number and density of features on the array will determine the resolution of the array, i.e., the level of detail used for analyzing the transcriptome or genome of the tissue sample. Therefore, generally, as the density of features increases, the resolution of the array also increases.

[0118] As described above, the size and number of array features in this invention depend on the nature of the tissue sample and the required resolution. Therefore, if the goal is only to determine the transcriptome or genome of large groups of cells within a tissue sample (or if the sample contains large cells), the number and/or density of array features can be reduced (i.e., kept below the maximum possible number of features) and/or the size of the features can be increased (i.e., the area of each feature can be larger than the minimum possible feature), for example, an array containing a small number of large features. Alternatively, if the goal is to determine the transcriptome or genome of individual cells in the sample, it may be necessary to use the maximum possible number of features at the minimum possible size, for example, an array containing many small features.

[0119] While single-cell resolution may be a preferred feature of the invention, it is not an essential objective; cell population-level resolution is also one of the objectives of the invention, for example, to detect or distinguish specific cell types or tissue regions, such as normal cells vs. tumor cells.

[0120] In representative embodiments of the present invention, the array may contain at least 2, 5, 10, 50, 100, 500, 750, 1000, 1500, 3000, 5000, 10000, 20000, 40000, 50000, 75000, 100000, 150000, 200000, 300000, 400000, 500000, 750000, 800000, 1000000, 1200000, 1500000, 1750000, 2000000, 2100000, 3000000, 3500000, 4000000, or 4200000 features. While 4,200,000 represents the maximum number of features on currently available commercial arrays, arrays with more features than this can be fabricated according to the concept of the invention, and such arrays are one of the objectives of the invention. As mentioned above, the feature size can be reduced so that more features fit in the same or a similar area. For example, these features can be accommodated in an area of no more than approximately 20 cm², 10 cm², 5 cm², 1 cm², 1 mm², or 100 μm².

[0121] Therefore, in some embodiments of the present invention, the area of each feature may be approximately 1 μm², 2 μm², 3 μm², 4 μm², 5 μm², 10 μm², 12 μm², 15 μm², 20 μm², 50 μm², 75 μm², 100 μm², 150 μm², 200 μm², 250 μm², 300 μm², 400 μm², or 500 μm².
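
The relationship between feature area, feature count and array area can be checked with simple arithmetic. The sketch below is an illustration only; it assumes square features that tile the array without gaps, which gives an upper bound rather than a realistic layout, and the function names are hypothetical.

```python
import math

UM2_PER_CM2 = 100_000_000  # 1 cm = 10,000 um, so 1 cm^2 = 1e8 um^2


def max_features(array_area_cm2: float, feature_area_um2: float) -> int:
    """Upper bound on the feature count if features tile the array with no gaps."""
    return math.floor(array_area_cm2 * UM2_PER_CM2 / feature_area_um2)


def tile_side_um(array_area_cm2: float, n_features: int) -> float:
    """Side length (um) of the square tile available to each feature."""
    return math.sqrt(array_area_cm2 * UM2_PER_CM2 / n_features)


# A 1 cm^2 array of 100 um^2 features holds at most 1,000,000 features,
# which corresponds to one feature per 10 um x 10 um tile.
bound = max_features(1, 100)
side = tile_side_um(1, 1_000_000)
```

Under these assumptions, halving the side of each feature quadruples the number of features that fit in the same area, which is the sense in which density sets the resolution.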

[0122] It is understood that the method of the present invention can be used on tissue samples of any organism, such as plants, animals, or fungi. The array of the present invention can be used to capture any nucleic acid in cells capable of transcription and / or translation, such as mRNA molecules. The array and method of the present invention are particularly suitable for isolating and analyzing transcriptomes or genomes of cells in a sample, and when spatial resolution of the transcriptome or genome is required, for example, when cells are interconnected or in direct contact with neighboring cells. However, it will be apparent to those skilled in the art that the method of the present invention can also be used for transcriptome or genome analysis of different cells or different cell types in a sample, even if the cells do not interact directly, for example, in blood samples. In other words, the cells to which the array is applicable do not need to be tissue cells, but can also be single cells (e.g., cells isolated from unfixed tissue). The single cell does not need to be a cell fixed in a certain location in the tissue, but can still be placed in a certain location in the array and can be identified individually. Therefore, when analyzing cells that do not interact directly or non-tissue cells, the spatial properties of the method can be used to acquire or retrieve unique or independent transcriptome or genome information of individual cells.

[0123] Therefore, the sample can be a dead or living tissue sample, or it can be a cultured sample. Representative examples include clinical samples such as whole blood or blood products, blood cells, tissues, living or cultured tissues, cells, etc., including their cell suspensions. For example, artificial samples can be prepared using cell suspensions (e.g., containing blood cells). Cells can be immobilized in a matrix (e.g., a gel matrix such as agar, agarose, etc.) and then sectioned using conventional methods. Such operations are known in immunohistochemistry in this field (e.g., see Andersson et al 2006, J. Histochem. Cytochem. 54(12):1413-23. Epub 2006 Sep 6).

[0124] The form of tissue preparation and the processing method of the obtained samples may affect the transcriptomic or genomic analysis of the present invention. Furthermore, various tissue samples may have different physical characteristics, and those skilled in the art can perform the necessary operations to obtain the tissue samples used in the methods of the present invention. However, it is apparent from this disclosure that any suitable sample preparation method can be used to obtain tissue samples in the present invention. For example, any cell layer with a thickness of about one cell or less can be used in the methods of the present invention. In one embodiment, the thickness of the tissue sample may be less than 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, or 0.1 times the cross-section (diameter) of a cell. However, as stated above, the present invention is not limited to single-cell resolution, and therefore, it is not required that the thickness of the tissue sample be less than the diameter of a single cell; thicker tissue samples can be used if necessary. For example, frozen sections can be used, with a thickness of, for example, 10-20 μm.

[0125] Tissue samples can be prepared in any manner convenient or necessary, and the present invention is not limited to any particular type of tissue preparation. Fresh, frozen, fixed, and unfixed tissues can all be used. Tissue samples can be fixed or embedded using any method known or described in the art, as needed and convenient. Therefore, any known fixative or embedding material can be used.

[0126] As a first representative example of the tissue sample used in this invention, the tissue can be prepared by deep freezing at a temperature suitable for maintaining or preserving the structural integrity (i.e., physical characteristics) of the tissue, for example, below -20°C, preferably below -25, -30, -40, -50, -60, -70, or -80°C. The frozen tissue sample can be sectioned, i.e., cut into thin sections by any suitable method, and placed on an array surface. For example, the tissue sample can be sectioned using a cryostat (a chilled microtome) set at a temperature suitable for maintaining both the structural integrity of the tissue sample and the chemical properties of the nucleic acids in the sample, for example, below -15°C, preferably below -20 or -25°C. The sample should therefore be processed so as to minimize the degradation or deterioration of nucleic acids (e.g., RNA) in the tissue. Such operating conditions are well known in the art, and any degree of degradation can be monitored by nucleic acid extraction, for example, by total RNA extraction and quality analysis at various stages of tissue sample preparation.

[0127] In a second representative example, tissue can be prepared using the standard method of formalin fixation and paraffin embedding (FFPE), a method well-established in the art. After fixation and embedding with paraffin or resin, the tissue sample can be sectioned, i.e., cut into thin sections and placed on an array. As mentioned above, other fixatives and / or embedding materials can also be used.

[0128] It will be apparent that, before the method of the present invention is carried out, tissue sample sections need to be processed to remove the embedding material from the sample, for example, by dewaxing, i.e., removing paraffin or resin. This step can be accomplished by any suitable method, and methods for removing paraffin, resin or other materials from samples are well established in the art, for example, incubating the sample (on the array surface) in a suitable solvent, such as xylene, for example, twice for 10 minutes each time, followed by rinsing with ethanol, for example, 99.5% ethanol for 2 minutes, 96% ethanol for 2 minutes, and 70% ethanol for 2 minutes.
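
The example dewaxing schedule above can be written down as a small data structure, which makes the order and total duration of the washes easy to check. This is a sketch for illustration only; the WashStep class and the helper function are hypothetical, and only the reagents and times are taken from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WashStep:
    reagent: str
    minutes: float
    repeats: int = 1


# Example schedule from the text: xylene twice for 10 minutes, then an ethanol series.
DEWAXING_EXAMPLE = [
    WashStep("xylene", 10, repeats=2),
    WashStep("99.5% ethanol", 2),
    WashStep("96% ethanol", 2),
    WashStep("70% ethanol", 2),
]


def total_minutes(steps) -> float:
    """Total hands-on incubation time, ignoring transfers between baths."""
    return sum(step.minutes * step.repeats for step in steps)
```

For the example schedule, total_minutes(DEWAXING_EXAMPLE) evaluates to 26.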

[0129] It will be apparent to those skilled in the art that RNA in tissue sections prepared using FFPE or other fixation and embedding methods is more likely to be partially degraded than RNA in frozen tissue. However, without wishing to be bound by any particular theory, it is believed that the use of such tissue may be advantageous in the methods of the present invention. For example, if the RNA in a sample is partially degraded, its RNA polynucleotides have a shorter average length and are more randomized than those of an undegraded sample. Therefore, it can be presumed that partially degraded RNA introduces fewer errors in the various processing steps described in other parts of the present invention, such as the ligation of conjugates (amplification domains), amplification of cDNA molecules, and sequencing.

[0130] Therefore, in one embodiment of the invention, tissue samples, i.e., slices of tissue samples in contact with the array, are prepared using FFPE or other fixation and embedding methods. In other words, the samples can be fixed, for example, fixed and embedded. In an alternative embodiment of the invention, tissue samples are prepared by deep freezing. In yet another embodiment, tissue imprints can be used, the procedure of which is known in the art. In still other embodiments, unfixed samples can be used.

[0131] The thickness of the tissue sample used in the method of the present invention can be determined according to the sample preparation method and the physical properties of the tissue. Therefore, the method of the present invention can use sections of any suitable thickness. In a representative embodiment of the present invention, the thickness of the tissue sample section is at least 0.1 μm, more preferably at least 0.2, 0.3, 0.4, 0.5, 0.7, 1.0, 1.5, 2, 3, 4, 5, 6, 7, 8, 9, or 10 μm. In other embodiments, the thickness of the tissue sample section is at least 10, 12, 13, 14, 15, 20, 30, 40, or 50 μm. However, thickness is not a critical factor, and the values listed above are only representative. Thicker samples, for example, 70 or 100 μm or more, can be used based on need and convenience. Typically, the thickness of tissue sample sections is between 1-100 μm, 1-50 μm, 1-30 μm, 1-25 μm, 1-20 μm, 1-15 μm, 1-10 μm, 2-8 μm, 3-7 μm, or 4-6 μm, but as mentioned above, thicker samples can also be used.

[0132] When tissue sample sections come into contact with the array, for example after a step such as dewaxing to remove embedding material, nucleic acid molecules (such as RNA) in the tissue sample will bind to capture probes immobilized on the array. In some embodiments, it is preferable to promote the binding of nucleic acid molecules (such as RNA) to the capture probes. Typically, promoting hybridization involves optimizing the conditions under which hybridization occurs. Key conditions that can be optimized include the time and temperature for which the tissue sections are incubated on the array prior to the reverse transcription step, as described in other parts of the invention.

[0133] For example, once the tissue sample section is in contact with the array, the array can be incubated for at least 1 hour to allow nucleic acids (such as RNA) to hybridize with the capture probes. Preferably, the array can be incubated for 2, 3, 5, 10, 12, 15, 20, 22, or 24 hours, or until the tissue sample section is dry. The incubation time is not critical; any convenient or desired time can be used. Typical incubation times can be up to 72 hours. Incubation can be performed at any suitable temperature, such as room temperature, but in a preferred embodiment, the tissue sample section is incubated on the array at a temperature of at least 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, or 37°C. Incubation temperatures of up to 55°C are commonly used in the art. In a particular preferred embodiment, the tissue sample section is dried on the array at 37°C for 24 hours. Once the tissue sample section is dry, the array can be stored at room temperature until the reverse transcription step is performed. It will be understood that if tissue sample sections are allowed to dry on the array surface, they must be rehydrated before further processing of the captured nucleic acids, such as before reverse transcription of the captured RNA.

[0134] Therefore, the method of the present invention may include a further step of rehydrating the tissue sample after the sample has been contacted with the array.

[0135] In some embodiments, the capture probe is preferably blocked (e.g., masked or modified) before the array is contacted with the tissue sample, especially when the nucleic acids in the tissue sample are subjected to modification before being captured on the array. Specifically, the free 3' end of the capture probe is preferably blocked or modified. In a particular embodiment, the nucleic acids in the tissue sample can be modified to allow them to be captured by the capture probe. For example, as will be described in further detail below, a conjugate (adaptor) sequence containing a binding domain that can bind to the capture domain of the capture probe can be added to the end of the nucleic acid (e.g., a genomic DNA fragment). This step can be accomplished, for example, by a conjugate ligation reaction or a nucleic acid extension reaction, for example, by using an enzyme to append nucleotides, such as a poly-A tail, to the end of the sequence. The capture probe, especially its free 3' end, must be blocked or modified before the tissue sample comes into contact with the array to prevent the capture probe itself from being modified, for example, to prevent a poly-A tail from being added to the free 3' end of the capture probe. Preferably, the blocking domain can be added to the capture probe during its synthesis. However, the blocking domain can also be added after the capture probe has been synthesized.

[0136] In some embodiments, the capture probe can be blocked using any suitable and reversible method that prevents modification of the capture domain while the nucleic acids of the tissue sample are being modified, which takes place after the tissue sample has been contacted with the array. In other words, the capture probe can be reversibly masked or modified so that the capture domain of the capture probe does not have a free 3' end, that is, the 3' end is removed, modified, or made inaccessible so that the capture probe is not affected by the process used to modify the nucleic acids of the tissue sample (for example, ligation or extension reactions), or so that any additional nucleotides can be removed to expose and/or restore the 3' end of the capture domain of the capture probe.

[0137] For example, a blocking probe can be hybridized with a capture probe to mask the free 3' end of the capture domain. The blocking probe may be, for example, a hairpin probe or a partially double-stranded probe, suitable examples of which are known in the art. Alternatively, the free 3' end of the capture domain can be blocked by chemical modification, for example, by adding an azidomethyl group as a reversible chemical cap, so that the capture probe does not have a free 3' end. Other suitable chemical caps are also known in the art; for example, the terminal nucleotide of the capture domain can be a reversible terminator nucleotide, added to the probe during or after capture probe synthesis.

[0138] Alternatively or additionally, the capture domain of the capture probe can be treated to remove any modifications acquired while the nucleic acid molecules in the tissue sample were being modified, such as additional nucleotides. For example, the capture probe may contain an additional sequence, called a blocking domain, downstream of the capture domain, i.e., at the 3' end of the capture domain. This can be, for example, a restriction endonuclease recognition sequence or a nucleotide sequence cleavable by a specific enzyme activity, for example one containing uracil. Following the nucleic acid modification steps of the tissue sample, the capture probe can be enzymatically cleaved to remove the blocking domain and any additional nucleotides added to the 3' end of the capture probe during modification. Removal of the blocking domain can expose and/or restore the free 3' end of the capture domain of the capture probe. The blocking domain can be synthesized as part of the capture probe or added to the capture probe via an in situ reaction (i.e., as a modification of an existing array), for example, through ligation to the capture domain.
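
The effect of a cleavable blocking domain can be shown with a toy string model. This sketch is illustrative and rests on assumptions not stated in the patent: the capture probe itself contains no uracil, the blocking domain begins with a single uracil, cleavage removes that uracil and everything 3' of it, and the end chemistry left by a real enzyme is ignored. All names and sequences are hypothetical.

```python
# Toy model: a uracil-containing blocking domain protects the 3' end of the capture domain.
# Sequences are written 5'->3'.


def add_blocking_domain(capture_probe: str, blocking_domain: str = "UACGT") -> str:
    """Append a blocking domain downstream (3') of the capture domain."""
    return capture_probe + blocking_domain


def tail(seq: str, n: int, base: str = "A") -> str:
    """Mimic unintended tailing of the probe while sample nucleic acids are modified."""
    return seq + base * n


def cleave_at_uracil(seq: str) -> str:
    """Remove the first uracil and everything 3' of it; return seq unchanged if none."""
    position = seq.find("U")
    return seq if position == -1 else seq[:position]


capture_probe = "ACGTACGT" + "GATTACAGG" + "T" * 20
blocked = add_blocking_domain(capture_probe)
tailed = tail(blocked, 15)            # poly-A added to the blocked 3' end
restored = cleave_at_uracil(tailed)   # blocking domain and tail removed together
```

In this model the unwanted tail attaches to the blocking domain rather than to the capture domain, so a single cleavage step restores the original 3' end.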

[0139] The capture probe can be blocked using any combination of the above blocking mechanisms.

[0140] Once the nucleic acids in the tissue sample, such as genomic DNA fragments, have been modified so that they can hybridize with the capture domain of the capture probe, the capture probe must be unblocked, for example, by dissociating the blocking oligonucleotide or by removing the chemical cap and/or blocking domain.

[0141] To correlate the sequence analysis, transcriptomic information, or genomic information obtained from each feature of the array with a region (i.e., an area or cell) of the tissue sample, the tissue sample is oriented with reference to the array features. In other words, the tissue sample is placed on the array such that the position of each capture probe on the array can be associated with a position in the tissue sample. Thus, the position in the tissue sample that corresponds to each type of capture probe (or each feature of the array) can be identified. This can be accomplished using positional markers on the array, as described below. Conveniently, though not necessarily, the tissue sample can be imaged after it has been contacted with the array. Imaging can be performed before or after the nucleic acid processing steps of the tissue sample, for example, before or after the cDNA generation step of this method, particularly before or after the step of generating the first strand of cDNA by reverse transcription. In a preferred embodiment, the imaging step of the tissue sample precedes the step of releasing the captured and synthesized (i.e., extended or ligated) DNA (such as cDNA) from the array. In a particular preferred embodiment, the tissue imaging step occurs after the nucleic acid processing step of the tissue sample, for example, after the reverse transcription step, and all residual tissue on the array is removed (e.g., washed off) before molecules (such as cDNA) are released from the array. In some embodiments, residual tissue on the array surface may be removed during the processing step of the captured nucleic acid (e.g., the reverse transcription step), for example, when the tissue used was prepared by deep freezing. In this case, the imaging of the tissue sample may occur before the processing step, for example, before the cDNA synthesis step. Generally, the imaging step can be performed at any time after the tissue sample comes into contact with the array surface, but should be completed before any step that degrades or removes the tissue sample. As mentioned above, this depends on the tissue sample.

[0142] Preferably, the array may contain markers that facilitate the orientation of the tissue sample or its image relative to the array features. Any suitable means of marking the array can be used, as long as the markers can be detected when the tissue sample is imaged. For example, molecules capable of generating signals, preferably visual signals, such as fluorescent molecules, can be directly fixed to the array surface. Preferably, at least two types of markers are present at different locations on the array surface; more preferably, at least three, four, five, six, seven, eight, nine, ten, twelve, fifteen, twenty, thirty, forty, fifty, sixty, seventy, eighty, ninety, or one hundred types of markers. Hundreds or even thousands of markers can be conveniently used. The markers may form a pattern, for example, the outer border of the array, such as the entire outer frame of the array features, or other informative patterns, such as cross-sections of the array, to facilitate alignment of the tissue sample image with the array or, more generally, the association between the array features and the tissue sample. Thus, the markers may be immobilized molecules that can interact with signal-emitting molecules to generate signals. In a representative example, the array may contain marker features, such as nucleic acid probes immobilized on the array substrate to which labeled nucleic acids can hybridize. For example, the labeled nucleic acid molecule may be linked to or bound to a chemical moiety that fluoresces (i.e., is excited) under light of a specific wavelength (or wavelength range). Such labeled nucleic acid molecules can be brought into contact with the array before, during, or after staining of the tissue sample for visualization or imaging. However, the marker must be detectable when the tissue sample is imaged. Therefore, in a preferred embodiment, the marker can be detected using the same imaging conditions as those used to visualize the tissue sample.
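
One way to use such markers is to fit a coordinate transform from array positions to image pixels. The sketch below is only an illustration under stated assumptions: it uses exactly two markers, it allows rotation, uniform scaling and translation but no mirroring or shear, and the function name fit_similarity and the coordinates are hypothetical. With more markers, a least-squares fit would be a natural alternative.

```python
# Map array coordinates to image coordinates from two markers seen in both frames.
# A 2D similarity transform is written with complex numbers: w = a * z + b.


def fit_similarity(array_markers, image_markers):
    """Return a function mapping an (x, y) array position to an (x, y) image position."""
    z1, z2 = (complex(*point) for point in array_markers)
    w1, w2 = (complex(*point) for point in image_markers)
    if z1 == z2:
        raise ValueError("the two markers must be at different array positions")
    a = (w2 - w1) / (z2 - z1)   # rotation and uniform scale
    b = w1 - a * z1             # translation

    def to_image(xy):
        w = a * complex(*xy) + b
        return (w.real, w.imag)

    return to_image


# Markers at array positions (0, 0) and (100, 0) appear at image pixels
# (50, 20) and (50, 220): a 90 degree rotation with a scale factor of 2.
to_image = fit_similarity([(0, 0), (100, 0)], [(50, 20), (50, 220)])
feature_pixel = to_image((10, 5))
```

Once the mapping is known, every feature position can be projected onto the tissue image, so the sequence data from a feature can be overlaid on the region of tissue that covered it.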

[0143] In a particular preferred embodiment of the invention, the array contains a marker feature that hybridizes with a labeled, preferably fluorescently labeled, nucleic acid molecule (e.g., an oligonucleotide).

[0144] Tissue imaging steps can be performed using any convenient histological method in the art, such as light, bright-field, dark-field, phase-contrast, fluorescence, reflection, interference, or confocal microscopy, or combinations thereof. Typically, tissue samples are stained before visualization to create contrast between different regions of the tissue sample (e.g., between cells). The type of stain used depends on the tissue type and the cellular regions to be stained. Such staining procedures are known in the art. In some embodiments, multiple stains are used to visualize (image) different aspects of the tissue sample, such as different regions of the tissue sample, specific cellular structures (e.g., organelles), or different cell types. In other embodiments, tissue samples can be visualized or imaged without staining, for example, when the tissue sample already contains pigment that provides sufficient contrast, or when a special type of microscope is used.

[0145] In a preferred embodiment, a fluorescence microscope is used to visualize or image the tissue sample.

[0146] The tissue sample, i.e., any residual tissue still in contact with the array substrate after the reverse transcription step and, optionally, the imaging step (if imaging is desired and was not performed before reverse transcription), is preferably removed before the step of releasing cDNA molecules from the array. Therefore, the method of the present invention may include a step of washing the array. Residual tissue can be removed using any suitable method, depending on the tissue sample. In the simplest embodiment, the array can be washed with water, which may contain various additives, such as surfactants (e.g., detergents), enzymes, etc., to help remove tissue. In some embodiments, a solution containing a protease, such as proteinase K, and a suitable buffer is used to wash the array. In other embodiments, the solution may additionally or alternatively contain cellulase, hemicellulase, or chitinase, for example, when the tissue sample is derived from a plant or fungus. In further embodiments, the temperature of the solution for washing the array may be, for example, at least 30°C, preferably at least 35, 40, 45, 50, or 55°C. Clearly, damage to the immobilized nucleic acid molecules by the washing solution should be minimized. For example, in some embodiments, nucleic acid molecules can be indirectly immobilized on the array substrate, for instance, through hybridization between capture probes and RNA and/or between capture probes and surface probes. Therefore, the washing step should not interfere with the interactions between molecules already immobilized on the array, i.e., it should not cause denaturation of the nucleic acid molecules.

[0147] Following the step of contacting the array with the tissue sample under conditions suitable for the nucleic acid molecules of the tissue sample, such as RNA (preferably mRNA), to hybridize with the capture probes, the hybridized nucleic acid molecules are immobilized (acquired). Immobilizing or acquiring the hybridized nucleic acid involves covalently attaching the complementary strand of the hybridized nucleic acid to the capture probe (i.e., via a nucleotide bond, specifically a phosphodiester bond formed between the juxtaposed 3'-hydroxyl and 5'-phosphate termini of two adjacent nucleotides), thereby tagging or labeling the captured nucleic acid with the localization domain specific to the feature on which it was captured.

[0148] In some embodiments, immobilizing hybridized nucleic acids (e.g., single-stranded nucleic acids) may include extending the capture probe to generate a copy of the captured nucleic acid, for example, generating cDNA from captured (hybridized) RNA. This step refers to the synthesis of the complementary strand of the hybridized nucleic acid, for example, generating cDNA based on a captured RNA template (i.e., RNA hybridized to the capture domain of the capture probe). Therefore, in the initial step of the capture probe extension reaction (e.g., cDNA generation), the captured (hybridized) nucleic acid (e.g., RNA) becomes the template for the extension step (e.g., reverse transcription). In other embodiments, as described below, immobilizing hybridized nucleic acids (e.g., partially double-stranded DNA) may include covalently binding the hybridized nucleic acid (e.g., a DNA fragment) to the capture probe, for example, by a ligation reaction to ligate the capture probe to the complementary strand of the nucleic acid hybridized with the capture probe.

[0149] Reverse transcription involves the step of synthesizing cDNA (a complementary DNA strand or a copy of DNA) from RNA, preferably mRNA (messenger RNA), using reverse transcriptase. Therefore, cDNA can be viewed as a copy of the RNA present in the cells at the time the tissue sample is collected; that is, the cDNA represents all or part of the genes expressed by the cells at the time of isolation.

[0150] Capture probes, especially their capture domains, act as primers, such as reverse transcription primers, in generating complementary strands for nucleic acids hybridized to them. Therefore, nucleic acid molecules (such as cDNA) generated by elongation reactions (e.g., reverse transcription) contain the capture probe sequence; that is, elongation reactions (e.g., reverse transcription) can be viewed as a method of indirectly labeling nucleic acids (e.g., transcripts) on a tissue sample that come into contact with each feature of the array. Thus, all nucleic acid molecules synthesized on a particular feature, such as cDNA, will contain the same nucleic acid "tag."
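The tagging effect described above can be illustrated with a minimal sketch. Every sequence and the three-part probe layout used here (amplification domain, localization domain, oligo-dT capture domain) are hypothetical values chosen for illustration, and the model assumes the capture domain anneals to the 3'-most bases of the poly-A tail.

```python
# Minimal sketch: extending a capture probe along a hybridized mRNA.
# All sequences below are hypothetical; none come from the patent text.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")  # RNA or DNA base -> DNA complement


def reverse_complement_dna(seq: str) -> str:
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def first_strand_cdna(probe: str, oligo_dt_len: int, mrna: str) -> str:
    """Return the extended capture probe (first-strand cDNA), written 5'->3'.

    Assumption: the oligo-dT capture domain at the probe's 3' end anneals
    to the last `oligo_dt_len` bases of the mRNA poly-A tail, and the
    polymerase copies everything upstream of that region.
    """
    if not mrna.endswith("A" * oligo_dt_len):
        raise ValueError("mRNA lacks a poly-A stretch long enough to hybridize")
    template = mrna[:-oligo_dt_len]
    return probe + reverse_complement_dna(template)


amplification_domain = "GGATCC"          # hypothetical
capture_domain = "TTTTTT"                # hypothetical oligo-dT
feature_tags = {"feature_1": "CATTGCAA", "feature_2": "TGGACTCA"}  # hypothetical

mrna = "AUGGCCUUU" + "AAAAAA"
cdnas = {
    name: first_strand_cdna(amplification_domain + tag + capture_domain, 6, mrna)
    for name, tag in feature_tags.items()
}
for name, cdna in cdnas.items():
    print(name, cdna)
```

The same transcript captured on two features yields two cDNA molecules that differ only in the localization domain, which is what later allows each sequence to be traced back to its feature.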

[0151] The nucleic acid molecule (such as cDNA) synthesized on each feature of the array can represent the genome or expressed gene in a region or area of a tissue sample in contact with that feature, such as a tissue, cell type, cell population, or subpopulation of cells, and can further represent a gene expressed under specific conditions, such as at a specific time, in a specific environment, at a specific developmental stage, or in response to a specific stimulus. Therefore, the cDNA on any single feature may represent a gene expressed in a single cell, or, if the feature is in contact with a cell junction of the sample, its cDNA may represent genes in multiple cells. Similarly, if a single cell is in contact with multiple features, then each feature may represent a portion of the genes expressed in said cells. Similarly, in some embodiments, the captured nucleic acid is DNA, and any single feature may represent the genome of a single or multiple cells. Optionally, the genome of a single cell may be represented by multiple features.

[0152] The extension step of the capture probe, such as reverse transcription, can be carried out using any of the many suitable enzymes and procedures in the art, as described below. However, it is clear that there is no need to provide primers for the synthesis of the first strand of nucleic acid (such as cDNA), because the capture domain of the capture probe itself acts as the primer, for example, a reverse transcription primer.

[0153] Preferably, within the scope of this invention, the bound nucleic acid (i.e., the nucleic acid covalently bound to the capture probe), e.g., cDNA, is treated so that it comprises double-stranded DNA. However, in some embodiments, the captured DNA may already comprise double-stranded DNA, for example, where a partially double-stranded DNA fragment is ligated to the capture probe. Double-stranded DNA can be generated from the captured nucleic acid either in a single reaction that generates only the second strand of DNA (e.g., cDNA), i.e., one that yields double-stranded DNA molecules without increasing their number, or in an amplification reaction that generates multiple copies of the second strand, which can be single-stranded DNA (e.g., linear amplification) or double-stranded DNA, e.g., cDNA (e.g., exponential amplification).

[0154] The synthesis of second-strand DNA (e.g., cDNA) can be performed in situ on the array, either as a standalone second-strand synthesis step, such as using random primers as detailed below, or as an initial step in the amplification reaction. Alternatively, the first strand of DNA (e.g., cDNA) containing or bound to a capture probe can be released from the array before second-strand synthesis, for example, by reacting in solution, either as a standalone step or as part of the amplification reaction.

[0155] When second-strand synthesis is performed on the array (i.e., in situ), the method may include an optional step of removing the captured nucleic acid, such as RNA, prior to second-strand synthesis, for example using an RNA-digesting enzyme (RNase), such as RNase H. Procedures for this step are well known and documented in the art. However, this step is generally unnecessary, and in most cases the RNA will degrade naturally. The step of removing the tissue sample from the array also typically removes RNA from the array. RNase H can be used, if desired, to make RNA removal more robust.

[0156] For example, in tissue samples containing a large amount of RNA, the step of generating double-stranded cDNA can produce sufficient cDNA to be used directly for sequencing (after release from the array). In this case, second-strand cDNA synthesis can be performed by any method known in the art, as described below. The second-strand synthesis reaction can be carried out directly on the array, i.e., while the cDNA is immobilized on the array, or preferably after the cDNA has been released from the array substrate, as described below.

[0157] In other embodiments, it is necessary to increase the quantity of (i.e., amplify) the bound nucleic acid (e.g., synthesized cDNA) to produce enough material for DNA sequencing. In this embodiment, the first strand of the bound nucleic acid (e.g., cDNA molecule), which also contains the capture probe of the array feature, acts as a template in the amplification reaction, for example, a polymerase chain reaction. The first product of this amplification reaction is the second strand of DNA (e.g., cDNA), which itself can serve as a template for further cycles of the amplification reaction.

[0158] In any of the embodiments described above, the second strand of the DNA (e.g., cDNA) contains the complementary strand of the capture probe. If the capture probe contains a universal domain, particularly an amplification domain within that universal domain, it can be used in subsequent amplification reactions of the DNA (e.g., cDNA). This amplification reaction may include primers containing the same sequence as the amplification domain, i.e., primers complementary to (i.e., hybridizing) the complementary strand of the amplification domain. On the capture probe, the amplification domain is upstream of the localization domain (in the already bound nucleic acid, such as the first strand of cDNA), and from this perspective, the complementary strand of the localization domain will be incorporated into the second strand of the DNA (e.g., the cDNA molecule).

[0159] In some embodiments, the second strand of DNA (e.g., cDNA) is generated in a single reaction, and second-strand synthesis can be achieved by any suitable method. For example, the first strand of cDNA, preferably but not necessarily released from the array substrate, can be incubated with random primers (e.g., hexamer primers) and a DNA polymerase, preferably a strand displacement polymerase (e.g., Klenow (exo-)), under conditions suitable for templated DNA synthesis. This process produces double-stranded cDNA molecules of various lengths and is unlikely to produce full-length cDNA molecules, i.e., cDNA molecules corresponding to the entire mRNA that served as the template for synthesis. The random primers hybridize to random sites on the first strand of the cDNA molecule (i.e., within the sequence rather than at its ends).

[0160] To generate a full-length DNA molecule (e.g., cDNA), that is, a molecule corresponding to the entire captured nucleic acid (e.g., RNA molecule), the bound nucleic acid molecule (e.g., the first strand of cDNA) can be modified. (If the nucleic acid, such as RNA, is partially degraded in the tissue sample, the captured nucleic acid molecule will not be a "full-length" transcript, nor will it be the same length as the initial fragment of genomic DNA.) For example, a linker or adaptor can be ligated to the 3' end of the cDNA molecule. This step can be accomplished using a single-stranded ligation enzyme, such as T4 RNA ligase or Circligase™ (Epicentre Biotechnologies).

[0161] Alternatively, a double-stranded ligase, such as T4 DNA ligase, can be used to ligate a helper probe (a partially double-stranded DNA molecule capable of hybridizing to the 3' end of the first strand of a cDNA molecule) to the 3' end of the bound nucleic acid molecule (such as the first strand of cDNA). Other enzymes suitable for this ligation step are known in the art, including, for example, Tth DNA ligase, Taq DNA ligase, Thermococcus sp. (strain 9°N) DNA ligase (9°N™ DNA ligase, New England Biolabs), and Ampligase™ (Epicentre Biotechnologies). The helper probe also contains a specific sequence from which synthesis of the second strand of the DNA (e.g., cDNA) molecule can be primed, using a primer complementary to the portion of the helper probe that is ligated to the bound nucleic acid (e.g., the first strand of cDNA). A further alternative is to use an enzyme with terminal transferase activity to add a polynucleotide tail (e.g., a poly-A tail) to the 3' end of the bound nucleic acid molecule (e.g., the first strand of cDNA). Second-strand synthesis can then be primed using a poly-T primer, which may also contain a specific amplification domain for further amplification. Other methods for generating "full-length" double-stranded DNA (e.g., cDNA) molecules (or for synthesizing the longest possible second strand) are well known in the art.

[0162] In some embodiments, second-strand synthesis can employ a template switching method, for example, using... SMART™ technology. SMART (Switching Mechanism at 5' End of RNA Template) technology is well established in the art. It is based on the discovery that reverse transcriptases, such as... (Invitrogen), can add several nucleotides to the 3' end of an extended cDNA molecule, producing a DNA / RNA hybrid with a single-stranded DNA overhang at the 3' end. The DNA overhang provides a target sequence to which an oligonucleotide probe can hybridize, providing an additional template for further extension of the cDNA molecule. Preferably, the oligonucleotide probe that hybridizes to the cDNA overhang contains an amplification domain sequence, the complementary strand of which is incorporated into the first-strand cDNA synthesis product. Primers containing the amplification domain sequence hybridize to the complementary amplification domain sequence incorporated into the first strand of the cDNA; these primers can be added to the reaction mixture to prime second-strand synthesis, using a suitable polymerase and the first strand of cDNA as a template. This method avoids the need to ligate an adaptor to the 3' end of the first strand of cDNA. Although template switching was initially developed for full-length mRNAs with a 5' cap, the method has proven equally effective for truncated mRNAs without a cap. Template switching can therefore be used in the methods of this invention to generate full-length and / or partial or truncated cDNA molecules. Accordingly, in a preferred embodiment of the invention, second-strand synthesis is performed using, or achieved by, template switching. In a particularly preferred embodiment, the template switching reaction, i.e., the reaction that further extends the first strand of cDNA to incorporate the complementary amplification domain, is carried out in situ (while the capture probe is still directly or indirectly immobilized on the array). Preferably, the second-strand synthesis reaction is also carried out in situ.

[0163] In some embodiments, it may be necessary or preferable to increase, enrich, or amplify the DNA (e.g., cDNA) molecules, in which case amplification domains may be incorporated into the DNA (e.g., cDNA) molecules. As described above, when the capture probe has a universal domain containing an amplification domain, the first amplification domain is incorporated into the first strand of the bound nucleic acid molecule, such as cDNA. In these embodiments, second-strand synthesis may incorporate a second amplification domain. For example, primers used to generate the second strand of cDNA, such as random hexamer primers, poly-T primers, or primers complementary to the helper probe, may contain an amplification domain at the 5' end, i.e., a nucleotide sequence to which an amplification primer can hybridize. Thus, the resulting double-stranded DNA may contain an amplification domain at or near each of the two 5' ends of the double-stranded DNA (e.g., cDNA) molecule. These amplification domains can be used as targets for primers in amplification reactions (e.g., PCR). Optionally, a linker or adaptor attached to the 3' end of the bound nucleic acid molecule (e.g., the first strand of cDNA) may contain a second universal domain containing the second amplification domain. Similarly, the second amplification domain can be incorporated into the first strand of the cDNA molecule via template switching.

[0164] In some embodiments, where the capture probe does not contain a universal domain, particularly a universal domain containing an amplification domain, the second strand of the cDNA molecule can be synthesized as described above. The resulting double-stranded DNA molecule can be modified to incorporate an amplification domain at the 5' end of the first strand of the DNA (e.g., cDNA) molecule, and to incorporate a second amplification domain at the 5' end of the second strand of the DNA (e.g., cDNA), if the latter was not done in the second-strand DNA (e.g., cDNA) synthesis step. Such amplification domains can be incorporated, for example, by ligating double-stranded adaptors to the ends of the DNA (e.g., cDNA) molecule. Enzymes suitable for this ligation step are known in the art, including, for example, Tth DNA ligase, Taq DNA ligase, Thermococcus sp. (strain 9°N) DNA ligase (9°N™ DNA ligase, New England Biolabs), Ampligase™ (Epicentre Biotechnologies), and T4 DNA ligase. In a preferred embodiment, the first and second amplification domains contain different sequences.

[0165] As is evident from the above, a universal domain, which may contain an amplification domain, can be added to a bound (i.e., extended or ligated) DNA molecule or its complementary strand (e.g., the second strand) using various methods, techniques, and combinations of techniques known in the art, for example, by using primers containing this domain, ligation of adaptors, the use of terminal transferases, and / or template switching methods. Such domains can be added before or after the DNA molecule is released from the array.

[0166] As can be seen from the above description, it is obvious that all DNA (e.g., cDNA) molecules synthesized from a single array according to the method of the present invention can all contain the same first and second amplification domains. Thus, a single amplification reaction, such as PCR, is sufficient to amplify all DNA (e.g., cDNA) molecules. Therefore, in a preferred embodiment, the method of the present invention may include an amplification step for a single DNA (e.g., cDNA) molecule. In one embodiment, the amplification step is performed after the DNA (e.g., cDNA) molecule has been released from the array substrate. In other embodiments, the amplification reaction can be performed on the array (i.e., in situ on the array). It is known in the art that amplification reactions can be performed on arrays, and on-chip thermal cyclers capable of performing such reactions already exist. Therefore, in one embodiment, arrays known in the art as sequencing platforms or for any form of sequence analysis (e.g., in next-generation sequencing technologies) can be used as the basis for the array of the present invention (e.g., Illumina microbead arrays, etc.).

[0167] For the synthesis of the second strand of DNA (e.g., cDNA), if the cDNA released from the array substrate contains partially double-stranded nucleic acid molecules, a strand displacement polymerase (e.g., Φ29 DNA polymerase, Bst (exo-) DNA polymerase, or Klenow (exo-) DNA polymerase) is preferably used. For example, in some embodiments, the capture probe is indirectly immobilized on the array substrate via a surface probe, and the step of releasing DNA (e.g., cDNA) molecules includes a cleavage step, in which case the released nucleic acid is at least partially double-stranded (e.g., DNA:DNA, DNA:RNA, or DNA:DNA / RNA hybrid). A strand displacement polymerase is necessary to ensure that the complementary strand of the localization domain (signature domain) is incorporated into the second strand of DNA (e.g., cDNA) in the second-strand synthesis reaction.

[0168] Clearly, the step of releasing at least a portion of the DNA (e.g., cDNA) molecules or their amplicons from the array surface or substrate can be accomplished by many methods. The primary objective of this release step is to obtain molecules that incorporate (or contain) the localization domain of the capture probe (or its complementary strand), so that the DNA (e.g., cDNA) molecules or their amplicons are "tagged" according to their feature (or location) on the array. The release step therefore removes from the array DNA (e.g., cDNA) molecules or amplicons that contain the localization domain or its complementary strand. The localization domain is incorporated into the bound nucleic acid (e.g., the first strand of cDNA) by, for example, extension of the capture probe; it is optionally copied into the second strand of DNA if second-strand synthesis is performed on the array, or into the amplicons if amplification is performed on the array. The presence of the localization domain of the capture probe (or its complementary strand) in the released molecules is thus essential for generating sequence analysis data that can be associated with the various regions of the tissue sample.

[0169] Since the released molecule can be either the first or second strand of a DNA molecule (e.g., cDNA) or an amplicon, and since the capture probe can be indirectly immobilized on the array, it is understood that although the release step may include a step of cleaving the DNA (e.g., cDNA) molecule from the array, the release step does not necessarily require nucleic acid cleavage. The DNA (e.g., cDNA) molecule or its amplicon can be released simply by denaturing a double-stranded molecule, for example, releasing the second strand of cDNA from the first strand, releasing the amplicon from the template, or releasing the first strand of the cDNA molecule (i.e., the extended capture probe) from the surface probe. Accordingly, the DNA (e.g., cDNA) molecule can be released from the array by nucleic acid cleavage and / or denaturation (e.g., denaturing the double-stranded molecule by heating). If the amplification reaction is performed in situ on the array, denaturation to release the amplicons will of course be part of the cycling reaction.

[0170] In some embodiments, a DNA molecule (e.g., cDNA) is released by enzymatic cleavage at a cleavage domain, which may be located within the universal or localization domain of the capture probe. As mentioned above, the cleavage domain must be located upstream (5' end) of the localization domain so that the released DNA molecule (e.g., cDNA) contains the localization (marking) domain. Suitable enzymes for nucleic acid cleavage include restriction endonucleases, such as RsaI. Other enzymes, such as a mixture of uracil DNA glycosylase (UDG) and the DNA glycosylase-lyase endonuclease VIII (USER™ enzyme), or a combination of MutY and T7 endonuclease I, are preferred embodiments of the method of the present invention.
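The requirement that the cleavage domain lie upstream of the localization domain can be checked with a small sketch. The probe sequences and domain order below are hypothetical, and the model is deliberately simplified to a single strand cut at the first occurrence of the recognition site; the GTAC site with a cut after the second base mirrors the RsaI example named above.

```python
# Minimal sketch: why the cleavage domain must sit 5' of the localization domain.
# Sequences are hypothetical; cleavage is modelled on a single strand for simplicity.

def release_by_cleavage(molecule: str, site: str = "GTAC", cut_offset: int = 2) -> str:
    """Return the 3' fragment released when `molecule` is cut within `site`.

    The 5' fragment is assumed to stay attached to the array surface.
    """
    index = molecule.find(site)
    if index == -1:
        raise ValueError("no cleavage site found in molecule")
    return molecule[index + cut_offset:]


tag = "CATTGCAA"            # hypothetical localization domain
extension = "AAAGGCCAT"     # hypothetical cDNA copied from the captured RNA

# Cleavage domain upstream of the tag: the released fragment keeps the tag.
good = "GGCC" + "GTAC" + tag + "TTTTTT" + extension
# Cleavage domain downstream of the tag: the tag stays behind on the array.
bad = "GGCC" + tag + "GTAC" + "TTTTTT" + extension

print(tag in release_by_cleavage(good))
print(tag in release_by_cleavage(bad))
```

Only the first layout yields a released molecule that can still be assigned to its feature.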

[0171] In an alternative embodiment, DNA (e.g., cDNA) molecules can be released from the surface or substrate of the array by physical methods. For example, in some embodiments, capture probes are indirectly (e.g., by hybridization with surface probes) immobilized on the array substrate, where disrupting the interactions between nucleic acid molecules is sufficient. Methods for disrupting interactions between nucleic acid molecules, such as denaturing double-stranded nucleic acid molecules, are well known in the art. A direct method for releasing DNA (e.g., cDNA) molecules (i.e., peeling synthetic DNA (e.g., cDNA) molecules from the array) is to use a solution capable of interfering with the hydrogen bonds of the double-stranded molecules. In a preferred embodiment of the invention, DNA (e.g., cDNA) molecules can be released by hot water treatment, for example, using water or a buffer solution at a temperature of at least 85°C, preferably at least 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99°C. In addition to temperatures sufficient to disrupt hydrogen bonds, as an alternative or supplementary method, the solution may also contain salts, surfactants, etc., to further disrupt the interactions between nucleic acid molecules, thereby achieving the release of DNA (e.g., cDNA) molecules.

[0172] It is known that using a high-temperature solution, such as water at 90-99°C, may be sufficient to break the covalent bonds used to immobilize the capture probes or surface probes on the array substrate. Therefore, in a preferred embodiment, DNA (e.g., cDNA) molecules can be released by breaking the covalently immobilized capture probes or surface probes by applying hot water to the array.

[0173] It goes without saying that the released DNA (e.g., cDNA) molecules (i.e., a solution containing the released DNA (e.g., cDNA) molecules) will be collected for further processing, such as second-strand synthesis and / or amplification. Nevertheless, the method of the present invention may include a step of collecting or recovering the released DNA (e.g., cDNA) molecules. As described above, in the case of in situ amplification, the released molecules may include amplicons of a bound nucleic acid (e.g., cDNA).

[0174] In some embodiments of the invention, it may be necessary to remove all unextended or unligated capture probes. For example, this step may occur after the step of releasing DNA molecules from the array. Such removal can be performed by any desired or convenient method, including, for example, using an enzyme to degrade the unextended or unligated probes, such as an exonuclease.

[0175] After the DNA (e.g., cDNA) molecules or their amplicons have been released from the array, and optionally modified as described above, they are analyzed, e.g., by sequence determination (although, as mentioned above, actual sequencing is not required, and any sequence analysis method can be used). Therefore, any nucleic acid analysis method can be employed. The sequence analysis step can identify the localization domain, thereby assigning the analyte molecule to a specific site in the tissue sample. Similarly, the nature or identity of the analyte molecule can be determined. In this way, a nucleic acid, such as RNA, can be determined at a specific location on the array and hence in the tissue sample. Thus, the analysis step can include or use any method that identifies the analyte molecule (and hence the "target" molecule) and its localization domain. Generally, such methods are sequence-specific methods. For example, the method can use sequence-specific primers or probes, especially primers or probes specific to the localization domain and / or to the specific nucleic acid molecule to be detected or analyzed, such as DNA molecules corresponding to the nucleic acid molecule to be tested (e.g., RNA or cDNA). Typically, in such methods, sequence-specific amplification primers, such as PCR primers, can be used.

[0176] In some embodiments, it may be necessary to analyze a subset or family of related target molecules, for example, all sequences encoding a particular group of proteins (e.g., a family of receptor proteins) that share sequence similarity and / or conserved domains. Therefore, the amplification and / or analysis methods described herein can employ degenerate or gene-family-specific primers or probes that hybridize to the captured nucleic acid or to nucleic acid derived from it (e.g., an amplicon). In a particularly preferred embodiment, the amplification and / or analysis methods can simultaneously employ universal primers (i.e., primers common to all captured sequences) and degenerate or gene-family-specific primers specific to a subset of the target molecules.

[0177] Therefore, in one embodiment, a sequence analysis method based on an amplification reaction, particularly PCR, is employed.

[0178] However, the steps of modifying and / or amplifying the released DNA (e.g., cDNA) molecules may introduce other components into the sample, such as enzymes, primers, nucleotides, etc. Therefore, the method of the present invention may further include a purification step of the sample containing the released DNA (e.g., cDNA) molecules or amplicon before sequence analysis, for example, removing oligonucleotide primers, nucleotides, salts, etc., that may interfere with the sequencing reaction. Any suitable DNA (e.g., cDNA) molecule purification method can be used.

[0179] As described above, the released DNA can be subjected to direct or indirect sequence analysis. Therefore, the sequence analysis substrate (which can be viewed as the molecule that undergoes the sequence analysis step or process) can be a molecule released directly from the array or a derivative thereof. Thus, for example, in a sequence analysis step involving a sequencing reaction, the sequencing template can be a molecule released from the array or a molecule derived therefrom. For example, the first and / or second strands of the DNA (e.g., cDNA) molecule released from the array can be directly subjected to sequence analysis (e.g., sequencing), meaning they can directly participate in the sequence analysis reaction or process (e.g., a sequencing reaction or process, either as the molecule that is sequenced or as a molecule identified in other ways). In the case of in situ amplification, the released molecule can be an amplicon. Optionally, the released molecule can undergo a second-strand synthesis or amplification step before sequence analysis (e.g., sequencing or other identification methods). Therefore, the sequence analysis substrate (e.g., a template) can be an amplicon or the second strand of a molecule released directly from the array.

[0180] Both strands of a double-stranded molecule can be subjected to sequence analysis (e.g., sequenced), but the invention is not limited to this; single-stranded molecules (e.g., cDNA) can also be analyzed (e.g., sequenced). For example, various technologies can be used for single-molecule sequencing, such as Helicos or PacBio technologies, or nanopore sequencing technologies under development. Therefore, in one embodiment, the first strand of DNA (e.g., cDNA) can be sequenced. The first strand of the DNA (e.g., cDNA) molecule may require 3' end modification before single-molecule sequencing. This step can be accomplished using a method similar to that used for processing the second strand of the DNA (e.g., cDNA) molecule. Such operations are known in the art.

[0181] In a preferred aspect of the invention, sequence analysis identifies or reveals a partial sequence of the captured nucleic acid (e.g., RNA) and the sequence of the localization domain. The sequence of the localization domain (or tag) identifies the feature at which the nucleic acid molecule, such as mRNA, was captured. The sequence of the captured nucleic acid molecule (e.g., RNA) can be compared with a sequence database of the organism from which the sample originated to determine its corresponding gene. By determining which region of the tissue sample (e.g., a cell) was in contact with the feature, it is possible to determine which region of the tissue sample is expressing (or contains) the gene, e.g., in spatial genomics studies. This analysis can be applied to all DNA (e.g., cDNA) molecules produced in the method of the invention to obtain the spatial transcriptome or genome of the tissue sample.
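The association between localization domains and tissue regions amounts to a lookup followed by a count. The following is a minimal sketch under stated assumptions: the tag sequences, array coordinates, and gene names are hypothetical, and each read is assumed to have already been reduced to a (tag, gene) pair.

```python
# Minimal sketch: building a spatial expression map from (tag, gene) pairs.
# Tags, coordinates, and gene names are hypothetical illustration values.
from collections import Counter


def spatial_counts(tagged_reads, tag_to_position):
    """Count reads per (gene, array position); reads with unknown tags are skipped."""
    counts = Counter()
    for tag, gene in tagged_reads:
        position = tag_to_position.get(tag)
        if position is not None:
            counts[(gene, position)] += 1
    return counts


def expression_map(counts, gene):
    """Return {array position: read count} for one gene."""
    return {pos: n for (g, pos), n in counts.items() if g == gene}


tag_to_position = {"CATTGCAA": (0, 0), "TGGACTCA": (0, 1)}  # feature coordinates
tagged_reads = [
    ("CATTGCAA", "geneA"),
    ("CATTGCAA", "geneA"),
    ("TGGACTCA", "geneA"),
    ("TGGACTCA", "geneB"),
    ("NNNNNNNN", "geneA"),  # unreadable tag, ignored
]

counts = spatial_counts(tagged_reads, tag_to_position)
print(expression_map(counts, "geneA"))
```

Overlaying such a per-gene map on an image of the tissue section aligned to the array coordinates gives the spatial expression pattern the paragraph describes.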

[0182] In a representative example, sequencing data can be analyzed to sort the sequences into specific categories, such as by the localization domain of the capture probe. For instance, this step can be achieved by using the FASTQ barcode splitter tool of the FASTX-Toolkit to group sequences into separate files based on their capture probe localization domains (tags). The sequences of each category (i.e., from each feature) can be analyzed to determine the identity of the transcripts. For example, software such as blastn can be used to identify sequences by comparing them to one or more genomic databases, preferably databases of the organism from which the tissue sample is derived. The identity of the database sequence with the highest similarity to a sequence obtained by the method of this invention is assigned to that sequence. Generally, only hits with a certainty of at least 1e-6, preferably 1e-7, 1e-8, or 1e-9, will be considered a successful identification.
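The two steps of this example, grouping reads by tag and accepting only sufficiently certain database hits, can be sketched in a few lines. This is not the FASTX-Toolkit or BLAST implementation: the function names, the fixed tag position at the start of each read, and the reading of the certainty threshold as a maximum E-value are all assumptions made for illustration.

```python
# Minimal sketch: tag-based read grouping and E-value filtering of database hits.
# Function names and the read layout (tag at a fixed offset) are hypothetical.
from collections import defaultdict


def split_by_tag(reads, known_tags, tag_start=0):
    """Group reads by exact match to a known tag; return the sequence after the tag."""
    if not known_tags:
        return {}
    tag_len = len(next(iter(known_tags)))  # assumes all tags share one length
    groups = defaultdict(list)
    for read in reads:
        tag = read[tag_start:tag_start + tag_len]
        if tag in known_tags:
            groups[tag].append(read[tag_start + tag_len:])
    return dict(groups)


def assign_identity(hits, max_evalue=1e-6):
    """Pick the best hit among (database_id, evalue) pairs, or None if none passes.

    Assumption: "certainty of at least 1e-6" is read as an E-value <= 1e-6.
    """
    passing = [hit for hit in hits if hit[1] <= max_evalue]
    if not passing:
        return None
    return min(passing, key=lambda hit: hit[1])[0]


reads = ["AAAACCGGTT", "CCCCTTGGAA", "AAAAGGGTTT", "GGGGACGTAC"]
groups = split_by_tag(reads, {"AAAA", "CCCC"})
print(groups)
print(assign_identity([("geneA", 1e-10), ("geneB", 1e-7)]))
print(assign_identity([("geneC", 1e-3)]))
```

Exact tag matching is the simplest choice; a tolerant matcher that allows a mismatch would recover more reads at the cost of occasional misassignment.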

[0183] Obviously, any nucleic acid sequencing method can be used in the method of this invention. However, so-called "next-generation sequencing" technology is particularly suitable for this invention. High-throughput sequencing is particularly useful in the method of this invention because it performs partial sequencing of a large number of nucleic acids in a very short time. Considering the recent explosive growth in the number of genomes that have been sequenced in whole or in part, it is not necessary to sequence the entire DNA (e.g., cDNA) molecule to determine the gene corresponding to each molecule. For example, the first 100 nucleotides from each end of the DNA (e.g., cDNA) molecule should be sufficient to identify the feature at which the nucleic acid (e.g., mRNA) was captured and the gene expressed. Sequencing reactions from the "capture probe end" of the DNA (e.g., cDNA) molecule can yield the localization domain sequence and at least about 20 bases, preferably 30 or 40 bases, of transcript-specific sequence data. Sequencing reactions from the "non-capture probe end" can yield at least about 70 bases, preferably 80, 90, or 100 bases, of transcript-specific sequence data.

[0184] In a representative example, the sequencing reaction could be based on reversible dye terminators, such as those used in Illumina™ technology. In this technique, for example, DNA molecules are first attached to primers on, for instance, a glass slide or silicon wafer and amplified, forming local clonal colonies (bridge amplification). Four types of ddNTPs are added, and unincorporated nucleotides are washed away. Unlike pyrosequencing, the DNA can be extended by only one nucleotide at a time. An image of the fluorescently labeled nucleotides is captured, and then the dye and the 3' terminal blocker are chemically removed from the DNA to prepare for the next cycle. This process can be repeated until the desired sequence data is obtained. Using this technique, thousands of nucleic acids can be sequenced simultaneously on a single slide.

[0185] Other high-throughput sequencing technologies can also be applied to the method of this invention, such as pyrosequencing. In this method, the DNA is amplified inside water droplets in an oil solution (emulsion PCR), each droplet containing a single DNA template attached to a single primer-coated microbead, which then forms a clonal colony. The sequencing instrument contains many picoliter-volume wells, each containing a single microbead and sequencing enzymes. Pyrosequencing uses luciferase to generate light for detecting the individual nucleotides added to the nascent DNA, and the combined data are used to generate sequence reads.

[0186] One example of a technology under development is based on detecting the hydrogen ions released during DNA polymerization. A microwell containing the template DNA strand to be sequenced is flooded with a single type of nucleotide. If the introduced nucleotide is complementary to the leading template nucleotide, it is incorporated into the growing complementary strand; the resulting release of a hydrogen ion triggers a highly sensitive ion sensor, indicating that a reaction has occurred. If homopolymer repeats are present in the template sequence, multiple nucleotides are incorporated in a single cycle, releasing a corresponding number of hydrogen ions and producing a proportionally higher electrical signal.
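The proportional signal for homopolymer runs can be made concrete with a small simulation. This is only an idealized sketch: the flow order "TACG", the function name, and the assumption of exactly one hydrogen ion per incorporated nucleotide with no noise are illustrative choices, not features of any particular instrument.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_flows(template, flow_order="TACG", max_flows=40):
    """Return a list of (nucleotide, incorporations) for each nucleotide flow.

    `template` is given in the order in which the polymerase reads it.
    Each flow offers one nucleotide type; every consecutive template base
    complementary to it is extended in that flow, and each incorporation
    is assumed to release one hydrogen ion (idealized signal).
    """
    signals = []
    pos = 0
    for i in range(max_flows):
        if pos >= len(template):
            break  # template fully copied
        nuc = flow_order[i % len(flow_order)]
        count = 0
        while pos < len(template) and COMPLEMENT[template[pos]] == nuc:
            count += 1
            pos += 1
        signals.append((nuc, count))
    return signals


# A run of three A's in the template gives a threefold signal in the T flow.
flows = simulate_flows("AAAGC")
```

In this idealized model a flow with no complementary template base simply produces a zero signal, which is how the read-out distinguishes the order of bases.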

[0187] It is therefore clear that new sequencing formats are gradually becoming available, with shorter run times as one of their main features, and that other sequencing technologies will also be useful in the methods of this invention in the future.

[0188] An essential feature of this invention, as described above, is the step of securing the complementary strand of the captured nucleic acid molecule to the capture probe, for example, by reverse transcription of the captured RNA molecule. Reverse transcription reactions are well known in the art; in a typical reaction, the mixture includes a reverse transcriptase, dNTPs, and a suitable buffer, and may contain other components, such as RNase inhibitors. The primer and the template are the capture domain of the capture probe and the captured RNA molecule, respectively, as described above. In the subject methods, each dNTP is typically present at between about 10 and 5000 μM, usually from about 20 to 1000 μM. An equivalent reaction using an enzyme with DNA polymerase activity can be used to generate the complementary strand of a captured DNA molecule. Such reactions are well known in the art and are described in further detail below.

[0189] The required reverse transcriptase activity can be provided by one or more different enzymes, suitable examples of which include M-MLV, MuLV, AMV, HIV, ArrayScript™, MultiScribe™, ThermoScript™, and SuperScript® I, II, and III enzymes.

[0190] The reverse transcriptase reaction can be carried out at any suitable temperature, depending on the properties of the enzyme. Typically, reverse transcriptase reactions are performed between 37 and 55°C, but temperatures outside this range may also be suitable. Reaction times can be as short as 1, 2, 3, 4, or 5 minutes, or as long as 48 hours. Typically, reaction times are between 5 and 120 minutes, preferably 5-60, 5-45, or 5-30 minutes, or 1-10 or 1-5 minutes, as chosen. Reaction time is not critical, and any desired reaction time can be used.

[0191] As described above, some embodiments of this method include an amplification step to increase the copy number of the prepared DNA (e.g., cDNA) molecules, for example, to enrich the sample so that it better represents the nucleic acids (e.g., transcripts) captured from the tissue sample. The amplification reaction can be linear or exponential, as required, and representative amplification procedures include, but are not limited to, polymerase chain reaction (PCR), isothermal amplification, and the like.

[0192] Polymerase chain reaction (PCR) is well known in the art, as described in U.S. patents 4,683,202; 4,683,195; 4,800,159; 4,965,188 and 5,512,462, which are incorporated herein by reference. In a typical PCR amplification reaction, the reaction mixture contains the DNA (e.g., cDNA) molecules released from the array as described above, combined with one or more primers for the primer extension reaction, such as PCR primers that hybridize to the first and / or second amplification domains (e.g., forward and reverse primers for geometric (or exponential) amplification, or a single primer for linear amplification). The oligonucleotide primers contacted with the released DNA (e.g., cDNA) molecules (hereinafter referred to as template DNA for convenience) are of sufficient length to hybridize to the complementary template DNA under annealing conditions (described in more detail below). The primer length depends on the length of the amplification domain, but is generally at least 10 bp, usually at least 15 bp, more commonly at least 16 bp, and can be 30 bp or longer; primer lengths generally range from 18 to 50 bp, usually from about 20 to 35 bp. The template DNA can be contacted with a single primer or a pair of primers (forward and reverse), depending on whether primer extension, linear amplification, or exponential amplification of the template DNA is required.
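The practical difference between using a single primer and a primer pair can be seen from the idealized copy-number arithmetic. The sketch below is a textbook approximation rather than part of the described method; the function name and the per-cycle `efficiency` parameter are assumptions introduced for illustration.

```python
def expected_copies(initial_templates, cycles, primers="pair", efficiency=1.0):
    """Idealized number of template-derived strands after thermal cycling.

    primers="pair":   forward and reverse primers, so products also serve as
                      templates and growth is geometric: N0 * (1 + E) ** n.
    primers="single": one primer, so only the original templates are copied
                      in each cycle and growth is linear: N0 * (1 + E * n).
    `efficiency` (E) is the assumed fraction of templates copied per cycle.
    """
    if primers == "pair":
        return initial_templates * (1.0 + efficiency) ** cycles
    if primers == "single":
        return initial_templates * (1.0 + efficiency * cycles)
    raise ValueError("primers must be 'pair' or 'single'")


exponential = expected_copies(100, 10, primers="pair")
linear = expected_copies(100, 10, primers="single")
```

Real reactions plateau and rarely reach full efficiency, so these figures are upper bounds that only illustrate why a primer pair is chosen when strong enrichment is needed.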

[0193] In addition to the components described above, the typical reaction mixture prepared in this method also includes a polymerase and deoxyribonucleoside triphosphates (dNTPs). The desired polymerase activity can be provided by one or more different polymerases. In many embodiments, the reaction mixture includes at least a Family A polymerase; suitable Family A polymerases include, but are not limited to: Thermus aquaticus polymerases, including the natural polymerase (Taq) and its derivatives and homologues, such as Klentaq (as described in Barnes et al., Proc. Natl. Acad. Sci. USA (1994) 91:2216-2220); Thermus thermophilus polymerases, including the natural polymerase (Tth) and its derivatives and homologues; and the like. In some embodiments, the amplification reaction is a high-fidelity reaction, and the reaction mixture may further include a polymerase having 3'-5' exonuclease activity, which may be provided, for example, by a Family B polymerase. Suitable Family B polymerases include, but are not limited to: Thermococcus litoralis DNA polymerase (Vent), as described by Perler et al., Proc. Natl. Acad. Sci. USA (1992) 89:5577-5581; Pyrococcus species GB-D (Deep Vent); Pyrococcus furiosus DNA polymerase (Pfu), as described by Lundberg et al., Gene (1991) 108:1-6; Pyrococcus woesei (Pwo); and the like. If the reaction mixture contains both Family A and Family B polymerases, the Family A polymerase may be present in a greater amount than the Family B polymerase, with the difference in activity typically at least 10-fold, and more commonly at least about 100-fold. Generally, the reaction mixture includes four different types of dNTPs, corresponding to the four naturally occurring bases: dATP, dTTP, dCTP, and dGTP. In this method, the concentration of each dNTP typically ranges from about 10 to 5000 μM, usually between about 20 and 1000 μM.

[0194] The reaction mixture prepared in the reverse transcriptase and / or amplification steps of this method may further comprise an aqueous buffer medium containing a source of monovalent ions, a source of divalent cations, and a buffering agent. Any convenient source of monovalent ions can be used, such as KCl, potassium acetate, ammonium acetate, potassium glutamate, NH4Cl, ammonium sulfate, and the like. The divalent cation can be magnesium, manganese, zinc, and the like, and is generally magnesium. Any convenient source of magnesium ions can be used, including MgCl2, magnesium acetate, and the like. The concentration of Mg2+ in the buffer can range from 0.5 to 10 mM, preferably from 3 to 6 mM, and ideally about 5 mM. Representative buffering agents or salts that may be present in the buffer include tris(hydroxymethyl)aminomethane (Tris), N-tris(hydroxymethyl)methylglycine (Tricine), 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES), 3-(N-morpholino)propanesulfonic acid (MOPS), and the like, typically at about 5-150 mM, usually about 10-100 mM, more commonly about 20-50 mM; in some preferred embodiments, the buffering agent is present in an amount sufficient to provide a pH of about 6.0-9.5, most preferably pH 7.3 at 72°C. Other agents that may be present in the buffer medium include chelating agents such as EDTA, EGTA, and the like.

[0195] In preparing the reverse transcriptase, DNA extension, or amplification reaction mixtures required for the steps of this method, the various components can be combined in any convenient order. For example, in the amplification reaction, the buffer can be combined first with the primers, then the polymerase, and then the template DNA, or all the components can be combined at the same time to prepare the reaction mixture.

[0196] As described above, in a preferred embodiment of the present invention, DNA (e.g., cDNA) molecules can be modified by adding amplification domains to the ends of the nucleic acid molecules, which may involve a ligation reaction. A ligation reaction is also required for in-situ synthesis of the capture probe on the array when the capture probe is indirectly immobilized on the array surface.

[0197] As is known in the art, ligases catalyze the formation of a phosphodiester bond between the juxtaposed 3'-hydroxyl and 5'-phosphate ends of two immediately adjacent nucleic acids. Any convenient ligase can be used; representative suitable ligases include, but are not limited to, temperature-sensitive and thermostable ligases. Temperature-sensitive ligases include, but are not limited to, bacteriophage T4 DNA ligase, bacteriophage T7 ligase, and E. coli ligase. Thermostable ligases include, but are not limited to, Taq ligase, Tth ligase, and Pfu ligase. Thermostable ligases can be obtained from thermophilic or hyperthermophilic organisms, including, but not limited to, prokaryotic, eukaryotic, or archaeal organisms. Certain RNA ligases can also be used in the method of the present invention.

[0198] In this ligation step, a suitable ligase and any reagents that are necessary and / or desirable are combined with the reaction mixture and maintained under conditions sufficient for ligation of the relevant oligonucleotides to occur. Ligation reaction conditions are well known to those skilled in the art. In some embodiments, the reaction mixture may be maintained during ligation at about 4°C to about 50°C, for example, at about 20°C to about 37°C, for about 5 seconds to about 16 hours, for example, from about 1 minute to about 1 hour. In other embodiments, the reaction mixture may be maintained at about 35°C to about 45°C, for example, from about 37°C to about 42°C, such as at (about) 38°C, 39°C, 40°C, or 41°C, for about 5 seconds to about 16 hours, for example, from about 1 minute to about 1 hour, including from about 2 minutes to about 8 hours. In one representative embodiment, the ligation reaction mixture comprises 50 mM Tris (pH 7.5), 10 mM MgCl2, 10 mM DTT, 1 mM ATP, 25 mg / ml BSA, 0.25 units / ml RNase inhibitor, and 0.125 units / ml T4 DNA ligase. In yet another representative embodiment, 2.125 mM magnesium ions, 0.2 units / ml RNase inhibitor, and 0.125 units / ml DNA ligase are used. The amount of adaptor used in the reaction depends on the concentration of DNA (e.g., cDNA) molecules in the sample, and is typically between 10 and 100 times the molar amount of DNA (e.g., cDNA).
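The 10- to 100-fold molar excess stated above can be converted into an amount of oligonucleotide to add once the molar quantity of sample DNA is estimated. The sketch below uses the common approximation of about 660 g/mol per base pair of double-stranded DNA (roughly 330 g/mol per nucleotide for single-stranded DNA); the function names and the example mass and length are hypothetical.

```python
def dna_pmol(mass_ng, length, g_per_mol_per_unit=660.0):
    """Approximate picomoles of DNA from its mass (ng) and length.

    Uses about 660 g/mol per base pair for double-stranded DNA; pass
    g_per_mol_per_unit=330.0 for single-stranded DNA (length in nucleotides).
    """
    if mass_ng < 0 or length <= 0:
        raise ValueError("mass must be non-negative and length positive")
    return mass_ng * 1000.0 / (length * g_per_mol_per_unit)


def molar_excess_range_pmol(mass_ng, length, low_fold=10, high_fold=100):
    """Picomoles of ligation oligonucleotide for a 10x to 100x molar excess."""
    pmol = dna_pmol(mass_ng, length)
    return pmol * low_fold, pmol * high_fold


# Example: 66 ng of double-stranded cDNA with an average length of 1000 bp.
low_pmol, high_pmol = molar_excess_range_pmol(66, 1000)
```

The average fragment length is usually itself an estimate, so the result is best treated as a starting point within the stated 10- to 100-fold window.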

[0199] In a representative example, the method of the present invention may include the following steps:

[0200] (a) The array is brought into contact with a tissue sample, wherein the array includes a substrate on which multiple capture probes are directly or indirectly immobilized, each type occupying a different position in the array, and the probes are oriented with a 3' free end so that the probes can serve as reverse transcriptase (RT) primers, wherein each type of capture probe includes a nucleic acid molecule, the nucleic acid molecule comprising, in the 5' to 3' direction:

[0201] (i) The localization domain corresponding to the position of the capture probe on the array, and

[0202] (ii) Capture domain;

[0203] This allows the RNA from the tissue sample to hybridize with the capture probe;

[0204] (b) Imaging tissue samples on the array;

[0205] (c) The captured mRNA molecules are reverse transcribed to generate cDNA molecules;

[0206] (d) Wash the array to remove residual tissue;

[0207] (e) Release at least a portion of the cDNA molecules from the array surface;

[0208] (f) Perform second-strand cDNA synthesis on the released cDNA molecules;

[0209] and

[0210] (g) Analyze the sequence of the cDNA molecules (e.g., by sequencing).

[0211] In an alternative representative example, the method of the present invention may include the following steps:

[0212] (a) The array is brought into contact with a tissue sample, wherein the array includes a substrate on which at least two types of capture probes are directly or indirectly immobilized, each type occupying a different position in the array, and the probes are oriented with a 3' free end so that the probes can serve as reverse transcriptase (RT) primers, wherein each type of capture probe includes a nucleic acid molecule comprising, in the 5' to 3' direction:

[0213] (i) The localization domain corresponding to the position of the capture probe on the array, and

[0214] (ii) Capture domain;

[0215] This allows the RNA from the tissue sample to hybridize with the capture probe;

[0216] (b) Optionally, the tissue sample may be rehydrated;

[0217] (c) The captured mRNA molecules are reverse transcribed to generate first-strand cDNA molecules and, optionally, second-strand cDNA is synthesized;

[0218] (d) Imaging tissue samples on the array;

[0219] (e) Wash the array to remove residual tissue;

[0220] (f) Release at least a portion of the cDNA molecules from the array surface;

[0221] (g) Perform second-strand cDNA synthesis on the released cDNA molecules;

[0222] and

[0223] (h) Analyze the sequence of the cDNA molecules (e.g., by sequencing).

[0224] In a further representative example, the method of the present invention may include the following steps:

[0225] (a) The array is brought into contact with a tissue sample, wherein the array includes a substrate on which a variety of capture probes are directly or indirectly immobilized, each type occupying a different position in the array, and the probes are oriented with a 3' free end so that the probes can act as reverse transcriptase (RT) primers, wherein each type of capture probe includes a nucleic acid molecule comprising, in the 5' to 3' direction:

[0226] (i) The localization domain corresponding to the position of the capture probe on the array, and

[0227] (ii) Capture domain;

[0228] This allows the RNA from the tissue sample to hybridize with the capture probe;

[0229] (b) Optionally, image tissue samples on the array;

[0230] (c) The captured mRNA molecules are reverse transcribed to generate cDNA molecules;

[0231] (d) Optionally, if the imaging step of step (b) is not performed, the tissue sample on the array is imaged;

[0232] (e) Wash the array to remove residual tissue;

[0233] (f) Release at least a portion of the cDNA molecules from the array surface;

[0234] (g) Perform second-strand cDNA synthesis on the released cDNA molecules;

[0235] (h) Amplify double-stranded cDNA molecules;

[0236] (i) Optionally, the cDNA molecule is purified to remove components that may interfere with the sequencing reaction;

[0237] and

[0238] (j) Analyze the sequence of the cDNA molecules (e.g., by sequencing).

[0239] This invention includes any suitable combination of the steps in the above methods. It is understood that this invention also includes variations of these methods, such as those in which amplification is performed in situ on the array, and methods that omit the imaging step.

[0240] This invention can also be viewed as including a method for fabricating or producing an array (i) for capturing mRNA from a tissue sample in contact with it; or (ii) for determining and / or analyzing the (e.g., partial or global) transcriptome of a tissue sample, the method comprising directly or indirectly immobilizing a variety of capture probes on an array substrate, wherein each type of capture probe comprises a nucleic acid molecule comprising, in the 5' to 3' direction:

[0241] (i) The localization domain corresponding to the position of the capture probe on the array, and

[0242] (ii) Capture domain;

[0243] The method of the present invention for producing arrays can be further defined such that each type of capture probe is immobilized on the array as a feature.

[0244] The capture probes can be immobilized on the array by any suitable method as described above. When the capture probes are indirectly immobilized on the array, they can be synthesized on the array. The method may include one or more of the following steps:

[0245] (a) Immobilizing a plurality of surface probes directly or indirectly on an array substrate, wherein the surface probes include:

[0246] (i) a domain capable of hybridizing with part of the capture domain oligonucleotide (a part not involved in capturing nucleic acids such as RNA);

[0247] (ii) a complementary localization domain; and

[0248] (iii) a complementary universal domain;

[0249] (b) Hybridize capture domain oligonucleotides and universal domain oligonucleotides with surface probes immobilized on the array;

[0250] (c) Extending universal domain oligonucleotides through templated polymerization to generate the localization domain of the capture probe; and

[0251] (d) Ligate the localization domain to the capture domain oligonucleotide to generate the capture oligonucleotide.

[0252] The ligation in step (d) can take place simultaneously with the extension in step (c); it therefore need not be carried out as a separate step, although a separate step is of course also possible if desired.
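The templated synthesis in steps (a) to (d) can be sketched at the sequence level. The sketch assumes, for illustration only, that the surface probe written 5' to 3' consists of the complement of the hybridizing part of the capture domain oligonucleotide, then the complementary localization domain, then the complementary universal domain; the function names, this layout, and the example sequences are hypothetical.

```python
_COMP = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMP)[::-1]


def synthesize_capture_probe(surface_probe, universal_oligo, capture_oligo, overlap_len):
    """Derive the capture probe (5'->3') produced on one surface probe.

    Assumed surface-probe layout (5'->3'):
      complement of capture_oligo[:overlap_len] | complementary localization
      domain | complementary universal domain.
    """
    comp_capture_part = reverse_complement(capture_oligo[:overlap_len])
    comp_universal = reverse_complement(universal_oligo)
    # Step (b): both oligonucleotides must be able to hybridize to the surface probe.
    if not surface_probe.startswith(comp_capture_part):
        raise ValueError("capture domain oligonucleotide does not hybridize")
    if not surface_probe.endswith(comp_universal):
        raise ValueError("universal domain oligonucleotide does not hybridize")
    comp_localization = surface_probe[
        len(comp_capture_part): len(surface_probe) - len(comp_universal)
    ]
    # Step (c): extending the universal oligonucleotide copies the template,
    # producing the localization domain.
    localization = reverse_complement(comp_localization)
    # Step (d): ligation joins the extended strand to the capture domain oligonucleotide.
    return universal_oligo + localization + capture_oligo
```

Because the localization domain is copied from the surface probe, the same universal and capture domain oligonucleotides can be applied to every feature, and only the surface probes need to differ between positions.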

[0253] The array manufactured according to the array production method of the present invention can be further characterized as described above.

[0254] While the present invention has been described above in relation to the detection or analysis of RNA and transcriptomes, it is understood that the principles described can be applied analogously to the detection or analysis of cellular DNA and to genomic research. Therefore, from a broader perspective, the present invention can be viewed generally as a method for detecting nucleic acids and, in a more specific aspect, as providing a method for the analysis or detection of DNA. Spatial information can be valuable for genomics-related research, i.e., the spatially resolved detection and / or analysis of DNA molecules. According to the present invention, this can be achieved through genomic tagging. Such local or spatial detection methods can be useful, for example, in studying genomic variation in different cells or regions of a tissue, such as comparing normal and diseased cells or tissues (e.g., normal vs. tumor cells or tissues), or studying genomic changes during disease progression. For example, tumor tissue can contain a heterogeneous population of cells that differ in the genomic variants they carry (e.g., mutations and / or other genetic aberrations, such as chromosomal rearrangements, chromosomal amplifications / deletions / insertions, etc.). Local detection of genomic variation or of different genomic loci in different cells is useful in such cases, for example, for studying the spatial distribution of genomic variation. A major application of this method is in tumor analysis. For example, within the scope of this invention, an array can be prepared that is designed to capture the genome of an entire cell on a single feature. Thus, different cells in a tissue sample can be compared. This invention is not limited to this design, however, and other variations are possible in which DNA is detected locally and the location of DNA capture on the array is associated with a specific location or site in the tissue sample.

[0255] Accordingly, in a broader aspect, the present invention can be viewed as providing a method for local detection of nucleic acids in tissue samples, comprising:

[0256] (a) An array is provided comprising a substrate on which a plurality of capture probes are directly or indirectly immobilized, each type of probe occupying a different position in the array and oriented to have a 3' free end, such that the probes can act as primers for extension reactions or primer ligation reactions, wherein each type of capture probe comprises a nucleic acid molecule comprising, in the 5' to 3' direction:

[0257] (i) the localization domain corresponding to the position of the probe on the array, and

[0258] (ii) Capture domain;

[0259] (b) The array is brought into contact with a tissue sample such that the position of the capture probe on the array can be associated with a position on the tissue sample, and the nucleic acid on the tissue sample is allowed to hybridize with the capture domain of the capture probe;

[0260] (c) Using the capture probe as an extension primer or a ligation primer, a DNA molecule is generated from the captured nucleic acid molecule, wherein the extended or ligated DNA molecule is labeled by a localization domain;

[0261] (d) Optionally, a complementary strand of the labeled DNA is generated, and / or optionally, the labeled DNA is amplified;

[0262] (e) Releasing at least a portion of the labeled DNA molecules and / or their complementary strands or amplicons from the array surface, wherein the portion includes a localization domain or its complementary strand;

[0263] (f) Analyze the sequence of the released DNA molecules directly or indirectly (e.g., sequencing).

[0264] As described in more detail above, any nucleic acid analysis method can be used for the analytical steps; typically, sequencing may be included, but sequence determination is not necessary. For example, sequence-specific analytical methods can be employed. Sequence-specific amplification can be performed, such as using domain-specific primers and / or primers targeting a specific target sequence, for example, the specific target DNA to be tested (i.e., DNA corresponding to a specific cDNA / RNA, gene, gene variant, genomic locus, or genomic variant). One exemplary analytical method is sequence-specific PCR.

[0265] The sequence analysis (e.g., sequencing) information obtained in step (f) can be used to obtain spatial information about the nucleic acids in the sample. In other words, the sequence analysis information can provide information on the location of the nucleic acids in the sample. This spatial information can be derived from the nature of the sequence identified or determined, for example, it may reveal the presence of a specific nucleic acid molecule that is itself spatially informative in the context of the tissue sample used, and / or the spatial information (e.g., spatial localization) can be inferred from the position of the tissue sample on the array combined with the sequence analysis information. However, as mentioned above, spatial information can be conveniently obtained by associating sequence analysis data with an image of the tissue sample, which represents a preferred embodiment of the invention.

[0266] Accordingly, in a preferred embodiment, the method may further include the following steps:

[0267] (g) Associate the sequence analysis information with the tissue sample image, wherein the tissue sample is imaged before or after step (c).
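The association between sequence data and array position can be sketched as a simple lookup: each read's localization domain is mapped to the coordinates of the feature that carries it, and transcript identities are counted per feature. The function name, the input format of (localization domain, gene) pairs, and the example data are hypothetical; gene assignment by alignment and registration of array coordinates to image pixels are outside this sketch.

```python
from collections import Counter, defaultdict


def build_spatial_counts(reads, barcode_to_xy):
    """Count genes per array position.

    reads:          iterable of (localization_domain, gene) pairs, e.g. obtained
                    by decoding the capture-probe-end read and aligning the
                    transcript-specific sequence (both assumed done upstream).
    barcode_to_xy:  dict mapping each localization domain sequence to the
                    (x, y) position of its feature on the array.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        xy = barcode_to_xy.get(barcode)
        if xy is None:
            continue  # localization domain not present on this array
        counts[xy][gene] += 1
    return {xy: dict(gene_counts) for xy, gene_counts in counts.items()}


layout = {"AAAA": (0, 0), "CCCC": (0, 1)}
reads = [("AAAA", "GeneA"), ("AAAA", "GeneA"), ("CCCC", "GeneB"), ("GGGG", "GeneA")]
spatial_counts = build_spatial_counts(reads, layout)
```

Once the array coordinates have been registered to the tissue image, the per-position counts can be drawn over the image to show where each transcript was captured.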

[0268] The primer extension reaction referred to in step (a) can be defined as a polymerase-catalyzed extension reaction whose purpose is to obtain a complementary strand of the captured nucleic acid molecule that is covalently linked to the capture probe, that is, to synthesize the complementary strand using the capture probe as a primer and the captured nucleic acid as a template. In other words, the extension reaction can be any primer extension reaction carried out by any polymerase. The nucleic acid can be RNA or DNA. Accordingly, the polymerase can be any polymerase, such as a reverse transcriptase or a DNA polymerase. The ligation reaction can be carried out by any ligase, and its purpose is to secure the complementary strand of the captured nucleic acid molecule to the capture probe; that is, the captured nucleic acid molecule (hybridized to the capture probe) is partially double-stranded, and its complementary strand is ligated to the capture probe.

[0269] A preferred embodiment of this method is the method described above for the determination and / or analysis of the transcriptome or for RNA assay. In an alternative preferred embodiment, the nucleic acid molecule to be tested is DNA. In this embodiment, the present invention provides a method for local detection of DNA in a tissue sample, comprising:

[0270] (a) Providing an array comprising a substrate on which a plurality of capture probes are directly or indirectly immobilized, each type occupying a different position in the array and oriented to have a 3' free end, such that the probes can act as primers for primer extension or ligation reactions, wherein each type of capture probe comprises a nucleic acid molecule comprising, in the 5' to 3' direction:

[0271] (i) the localization domain corresponding to the position of the probe on the array, and

[0272] (ii) Capture domain;

[0273] (b) Contact the array with a tissue sample such that the position of the capture probe on the array can be correlated with the position on the tissue sample, and allow the DNA on the tissue sample to hybridize with the capture domain of the capture probe;

[0274] (c) Fragmenting the DNA in the tissue sample, wherein the fragmentation is performed before, during or after step (b) in which the array contacts the tissue sample;

[0275] (d) Using the captured DNA fragment as a template, the capture probe is extended by primer extension reaction or the captured DNA fragment is ligated to the capture probe by ligation reaction to generate DNA molecules, wherein the extended or ligated DNA molecules are labeled by a localization domain.

[0276] (e) Optionally, a complementary strand of the labeled DNA is generated and / or the labeled DNA is amplified;

[0277] (f) Releasing at least a portion of the labeled DNA molecules and / or their complementary strands or amplicons from the array surface, wherein the portion includes the localization domain or its complementary strand;

[0278] (g) Analyze the sequence of the released DNA molecules directly or indirectly.

[0279] The method may further include the following steps:

[0280] (h) Associate the sequence analysis information with an image of the tissue sample, wherein the tissue sample is imaged before or after step (d).

[0281] In spatial genomics research, where the target nucleic acid is DNA, imaging and image association steps are preferred in certain cases.

[0282] In embodiments that capture DNA, the DNA can be any DNA molecule present in the cell. It can therefore be genomic (i.e., nuclear) DNA, mitochondrial DNA, or plastid DNA, such as chloroplast DNA. In a preferred embodiment, the DNA is genomic DNA.

[0283] It is understood that when fragmentation is performed after the contacting of step (b) (i.e., after the tissue sample is placed on the array), the DNA is fragmented before it hybridizes with the capture domain. In other words, it is the DNA fragments that hybridize (or, more specifically, are allowed to hybridize) with the capture domain of the capture probe.

[0284] In a particular embodiment of this invention, preferably but not necessarily, a binding domain can be introduced into the DNA fragments of the tissue sample to allow or facilitate capture of the fragments by the capture probes on the array. Accordingly, the binding domain is capable of hybridizing with the capture domain of the capture probe. The binding domain can therefore be regarded as the complement of the capture domain (i.e., as a complementary capture domain), although absolute complementarity between the capture and binding domains is not required; it is sufficient that the binding domain is complementary enough to allow effective hybridization, i.e., to enable the DNA fragments of the tissue sample to hybridize with the capture domain of the capture probe. Introducing such a binding domain ensures that the sample DNA does not bind to the capture probes until after the fragmentation step. The binding domain can be provided to the DNA fragments using procedures well known in the art, for example, by ligating adaptor or linker sequences that contain the binding domain. For example, a linker sequence with a protruding end can be used. The binding domain can be present in the single-stranded portion of this linker, so that after the linker is ligated to the DNA fragment, the single-stranded portion containing the binding domain is available for hybridization with the capture domain of the capture probe. Alternatively, in a preferred embodiment, the binding domain can be introduced by using a terminal transferase to add a polynucleotide tail (e.g., a homopolymeric tail, such as a poly-A domain). This step can be performed using a procedure similar to that described above for introducing a universal domain in the RNA method. Therefore, in a preferred embodiment, a common binding domain can be introduced. In other words, a binding domain common to all DNA fragments can be used to achieve fragment capture on the array.

[0285] When a tailing reaction is performed to introduce a (common) binding domain, the capture probes on the array can be protected from tailing; that is, the capture probes can be blocked or masked as described above. This can be achieved, for example, by hybridizing a blocking oligonucleotide to the capture probe, such as to the protruding end (e.g., the single-stranded portion) of the capture probe. For example, when the capture domain contains a poly-T sequence, the blocking oligonucleotide can be a poly-A oligonucleotide. The blocking oligonucleotide may have a blocked 3' end (i.e., an end that cannot be extended or tailed). As mentioned above, the capture probe can also be protected, i.e., blocked, by chemical and / or enzymatic modification.

[0286] When the binding domain is introduced by ligating a linker as described above, it is understood that, as an alternative to extending the capture probe to generate a complementary copy of the captured DNA fragment that carries the localization tag of the capture probe primer, the DNA fragment can instead be ligated to the 3' end of the capture probe. As mentioned above, ligation requires that the 5' end to be ligated is phosphorylated. Accordingly, in one embodiment, the 5' end of the added linker, i.e., the end to be ligated to the capture probe (the non-protruding end of the linker added to the DNA fragment), is phosphorylated. In this ligation embodiment, it is understood that a linker having a single-stranded protruding 3' end containing the binding domain can be ligated to the double-stranded DNA fragment. Upon contact with the array, the protruding end hybridizes with the capture domain of the capture probe. This hybridization juxtaposes the 3' end of the capture probe with the 5' end (non-protruding end) of the added linker for ligation. The capture probe, and hence the localization domain, is thus incorporated into the captured DNA fragment through this ligation reaction. The principle of this embodiment is shown in Figure 21.

[0287] Therefore, in a more specific embodiment, the method of the present invention in this respect may include:

[0288] (a) Providing an array comprising a substrate on which a plurality of capture probes are directly or indirectly immobilized, each type occupying a different position in the array and oriented to have a 3' free end, such that the probes can act as primers for primer extension or ligation reactions, wherein each type of capture probe comprises a nucleic acid molecule comprising, in the 5' to 3' direction:

[0289] (i) The localization domain corresponding to the probe position on the array, and

[0290] (ii) Capture domain;

[0291] (b) Contacting the array with a tissue sample such that the position of a capture probe on the array can be correlated with a position in the tissue sample;

[0292] (c) Fragmenting the DNA in the tissue sample, wherein the fragmentation is performed before, during or after step (b) in which the array is contacted with the tissue sample;

[0293] (d) Introducing a binding domain into the DNA fragments, the binding domain being capable of hybridizing with the capture domain;

[0294] (e) Allowing the DNA fragments to hybridize with the capture domain of the capture probes;

[0295] (f) Using the captured DNA fragment as a template, extending the capture probe in a primer extension reaction, or ligating the captured DNA fragment to the capture probe in a ligation reaction, to generate a DNA molecule, wherein the extended or ligated DNA molecule is labeled by the localization domain;

[0296] (g) Optionally, generating a complementary strand of the labeled DNA and / or amplifying the labeled DNA;

[0297] (h) Releasing at least a portion of the labeled DNA molecules and / or their complementary strands or amplicons from the array surface, wherein the portion includes the localization domain or its complementary strand;

[0298] (i) Analyzing the sequence of the released DNA molecules directly or indirectly.

[0299] Optionally, the method may further include the following step:

[0300] (j) Associating the sequence analysis information with an image of the tissue sample, wherein the tissue sample is imaged before or after step (f).
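By way of non-limiting illustration, the way the localization domain ties a sequenced molecule back to an array position in steps (h) to (j) can be sketched as a lookup. Everything specific here is an assumption made for the example: the 6-nucleotide tag length, the three tag sequences, their positions, the tag sitting at the start of each read, and exact matching without error correction.

```python
from collections import defaultdict

TAG_LENGTH = 6  # assumed length, for illustration only

# Hypothetical lookup table: localization domain -> (column, row) on the array
TAG_TO_POSITION = {
    "ACGTAC": (0, 0),
    "TGCATG": (0, 1),
    "GGATCC": (1, 0),
}


def assign_reads(reads):
    """Group reads by array position using the tag at the start of each read.

    Reads whose tag is not in the lookup table are collected under None.
    Only exact tag matches are recognized in this sketch.
    """
    by_position = defaultdict(list)
    for read in reads:
        tag, insert = read[:TAG_LENGTH], read[TAG_LENGTH:]
        by_position[TAG_TO_POSITION.get(tag)].append(insert)
    return dict(by_position)


reads = ["ACGTACTTTTGGGA", "GGATCCTTTTCCCA", "ACGTACTTTTAAAC", "NNNNNNTTTTGGGA"]
for position, inserts in assign_reads(reads).items():
    print(position, len(inserts))
```

The per-position read counts produced in this way are what would then be overlaid on the tissue image in step (j).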

[0301] In the nucleic acid or DNA detection methods described above, the optional step of generating a complementary copy of the labeled nucleic acid / DNA or amplifying the labeled DNA may include using a strand displacement polymerase, following the principles described in the preceding RNA / transcriptome analysis / detection methods. A suitable strand displacement polymerase is described above. The purpose of this step is to ensure that the localization domain is replicated and introduced into the complementary copy or amplicon, especially when the capture probe is immobilized on the array by hybridization with a surface probe.

[0302] However, the use of strand displacement polymerase is not necessary in this step. For example, a non-strand displacement polymerase and a ligation reaction of oligonucleotides hybridizing to the localization domain can be used simultaneously. This procedure is similar to the synthesis of the capture probe on the array described above.

[0303] In one embodiment, the method of the present invention can be used to determine and / or analyze the whole genome of a tissue sample. However, the method is not limited to this and covers determining and / or analyzing all or part of the genome. Therefore, the method may include determining and / or analyzing a portion or subset of the genome, e.g., a portion corresponding to a subset or group of genes or chromosomes, such as a specific set of genes or chromosomes, or a specific region or part of the genome, e.g., one relating to a specific disease, condition, tissue type, etc. The method can thus be used to detect or analyze the genomic sequences or genomic loci of tumor tissue compared to normal tissue, or even of different cell types in a tissue sample. It can also detect the presence, absence, distribution, or location of different genomic variants or loci in different cells, cell populations, tissues, parts of tissues, or tissue types.

[0304] From another perspective, the above steps of this method can be seen as providing a method for obtaining spatial information about nucleic acids, such as the genomic sequence, variants, or sites of tissue samples. In other words, the method of this invention can be used to label (or identify) genomes, especially individual or spatially distributed genomes.

[0305] From another perspective, the method of the present invention can be viewed as a spatial detection method for DNA in tissue samples, or a method for detecting DNA with spatial resolution, or a method for local or spatial determination and / or analysis of DNA in tissue samples. In particular, this method can be used for the local or spatial detection, determination, and / or analysis of genes or genomic sequences or genomic variants or sites (e.g., the distribution of genomic variants or sites) in tissue samples. Local / spatial detection / determination / analysis means that the native location or site of DNA in a tissue sample within a cell or tissue can be located. Thus, for example, DNA can be located within a cell, cell population, or cell type in the sample, or a specific region of the tissue sample. The native site or location of DNA (or in other words, the site or location of DNA in the tissue sample), such as a genomic variant or site, can be determined.

[0306] Therefore, it is understood that the array of the present invention can be used to capture nucleic acids, such as the DNA of a tissue sample in contact with the array. The array can also be used to determine and / or analyze part or all of the genome of a tissue sample, or to obtain part or all of the genome of a tissue sample together with spatially defined information. The method of the present invention can therefore be viewed as a method for quantifying the spatial distribution of one or more genomic sequences (or variants, or loci) in a tissue sample. In other words, the method can be used to detect one or more genomic sequences, genomic variants, or genomic loci in a tissue sample. Put another way, the method can be used to simultaneously determine the location or distribution of one or more genomic sequences, genomic variants, or genomic loci in a tissue sample. Furthermore, the method can be viewed as a method for performing partial or global analysis of nucleic acids (such as DNA) in a tissue sample with spatial resolution, such as two-dimensional spatial resolution.

[0307] This invention can also be viewed as providing an array for use in the method of this invention, the array comprising a substrate on which multiple types of capture probes are directly or indirectly immobilized, each type occupying a different position on the array and oriented to have a free 3' end, so that the probes can act as extension or ligation primers, wherein each type of capture probe comprises a nucleic acid molecule having, in the 5' to 3' direction:

[0308] (i) A localization domain corresponding to the position of the capture probe on the array, and

[0309] (ii) A capture domain for capturing nucleic acids from tissue samples in contact with the array.
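By way of non-limiting illustration, the two-domain probe layout and the requirement that each position carries its own localization domain can be sketched as a small data structure. The class and function names, the row-by-row grid layout, and the example sequences are all hypothetical; optional universal domains are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureProbe:
    position: tuple            # (column, row) of the feature on the array
    localization_domain: str   # unique to the feature
    capture_domain: str        # e.g. poly-T when targets carry poly-A

    def sequence(self) -> str:
        """Probe written 5'->3': localization domain, then capture domain."""
        return self.localization_domain + self.capture_domain


def build_array(tags, capture_domain, columns):
    """Assign each tag to its own feature, filling the grid row by row."""
    if len(set(tags)) != len(tags):
        raise ValueError("localization domains must be unique per feature")
    return [
        CaptureProbe((i % columns, i // columns), tag, capture_domain)
        for i, tag in enumerate(tags)
    ]


# Hypothetical three-feature array with a shared poly-T capture domain
for probe in build_array(["ACGTAC", "TGCATG", "GGATCC"], "T" * 20, columns=2):
    print(probe.position, probe.sequence())
```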

[0310] In one aspect, the captured nucleic acid molecule is DNA. The capture domain can be specific to a particular type or group of DNA to be detected, for example, by specifically hybridizing the capture domain with a specific motif sequence of the target DNA, such as a conserved sequence, following a similar method to RNA detection described above. Optionally, a binding domain can be introduced into the DNA to be captured, for example, a common binding domain as described above, which can be recognized by the capture domain of the capture probe. Therefore, as mentioned above, the binding domain can be, for example, a homopolymer sequence, such as poly-A. Similarly, the binding domain can be obtained according to principles and methods similar to those described above for designing RNA / transcriptomics analysis or detection. In this case, the capture domain can be complementary to the binding domain of the DNA molecule introduced into the tissue sample.

[0311] As mentioned in the previous section on RNA, the capture domain can be a random or degenerate sequence. Therefore, DNA can be captured by binding to a random or degenerate capture domain, or by binding to a capture domain containing at least a partially random or degenerate sequence.

[0312] In a related aspect, the invention also provides the use of an array comprising a substrate on which a plurality of capture probes are directly or indirectly immobilized, each type of probe being located at a different position on the array and oriented to have a 3' free end, so that the probes can act as primers for extension or ligation reactions, wherein each type of probe comprises a nucleic acid molecule having, in the 5' to 3' direction:

[0313] (i) A localization domain corresponding to the position of the capture probe on the array, and

[0314] (ii) Capture domain;

[0315] The capture domain is used to capture nucleic acids, such as DNA or RNA, from tissue samples that are in contact with the array.

[0316] Preferably, the use is for the local detection of nucleic acids in tissue samples, and further includes the following steps:

[0317] (a) Using the capture probe as an extension primer or ligation primer, a DNA molecule is generated from the captured nucleic acid molecule, wherein the extended or ligated molecule is labeled by a localization domain;

[0318] (b) Optionally, a complementary strand of the labeled nucleic acid is generated and / or the labeled nucleic acid is amplified;

[0319] (c) Releasing at least a portion of the labeled DNA molecule and / or its complementary strand or amplicon from the array surface, wherein the portion includes a localization domain or its complementary strand;

[0320] (d) Analyze the sequence of the released DNA molecules directly or indirectly; and optionally

[0321] (e) Associate the sequence analysis information with an image of the tissue sample, wherein the tissue sample is imaged before or after step (a).

[0322] The DNA fragmentation step of a tissue sample can be performed as desired by any method known in the art. Therefore, physical methods, such as acoustic degradation or sonication, can be used for fragmentation. Related chemical methods are also known. Enzymatic methods can also be used to achieve fragmentation, for example, using endonucleases, such as restriction enzymes. Again, the methods and enzymes for this step are well known in the art. Fragmentation can be performed before, during, or after the step of preparing the tissue sample to be used in the array (e.g., the step of preparing tissue sections). Conveniently, the tissue fixation step can cause fragmentation. Therefore, formalin fixation, for example, can cause DNA fragmentation. Other fixatives can produce similar results.
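By way of non-limiting illustration, enzymatic fragmentation with a restriction enzyme can be sketched as cutting a sequence at every occurrence of a recognition site. The default site GTAC with a cut after the second base corresponds to RsaI, an enzyme named elsewhere in this document; its use for fragmenting tissue DNA is an assumption of the example. The sketch treats a single linear strand only.

```python
def digest(sequence: str, site: str = "GTAC", cut_offset: int = 2):
    """Cut a linear sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site that stay on the
    left-hand fragment (2 gives a blunt cut in the middle of GTAC).
    Returns the fragments in 5'->3' order.
    """
    fragments, start = [], 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


fragments = digest("AAGTACCCGTACTT")
print(fragments, [len(f) for f in fragments])
```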

[0323] As will be apparent from the details regarding the preparation and use of arrays in these aspects of the invention, the descriptions and details given in the RNA methods described above can also be applied analogously to the more extensive nucleic acid and DNA detection methods described herein. Therefore, all the aspects and details described above are equally applicable. For example, the discussion of reverse transcriptase primers and reactions, etc., can be equally applied to any aspect of the extended primers, polymerase reactions, etc., described above. Similarly, the reference methods for the synthesis of the first and second strands of cDNA can also be equally applied to the labeled DNA molecules and their complementary strands. Sequence analysis can be performed using the methods described above.

[0324] For example, the capture probes may use a capture domain as described above. Capture domains that contain or consist of poly-T can be used, for example, where a binding domain containing a poly-A sequence is introduced into the DNA fragments.

[0325] The aforementioned universal domains can be introduced into the capture probes / labeled DNA molecules (i.e., labeled and extended or ligated molecules), for example for amplification and / or cleavage.

Description of the Drawings

[0326] The invention will now be further described with reference to non-limiting embodiments and the accompanying drawings.

[0327] Figure 1 illustrates the overall concept of using an array of "identification code" oligodeoxythymidine probes to capture mRNA from tissue sections for transcriptome analysis.

[0328] Figure 2 illustrates the principle of visualizing transcript abundance in the corresponding tissue sections.

[0329] Figure 3 shows the configuration of the 3' to 5' surface probes and the synthesis of the 5' to 3' oriented capture probes that are indirectly immobilized on the array surface.

[0330] Figure 4 shows the enzymatic (USER or RsaI) cleavage efficiency on the in-house arrays and the cleavage efficiency of 99°C hot water on the Agilent arrays, determined by hybridizing fluorescently labeled probes to the array surface after probe release.

[0331] Figure 5 is a fluorescence image of a commercial Agilent array after release of the DNA surface probes mediated by hot water at 99°C. The fluorescent detection probes were hybridized after the hot water treatment. The top array is the untreated control.

[0332] Figure 6 shows a section of mouse brain tissue fixed on a transcriptome capture array after cDNA synthesis, treated with a cytoplasmic stain (top, labeled NeuN) and a nucleic acid stain (middle, labeled Nuclei). Both stains are shown in the merged image (bottom, labeled merge).

[0333] Figure 7 is a table listing the reads, sorted by site of origin, on the low-density in-house DNA capture array illustrated in the diagram.

[0334] Figure 8 shows FFPE mouse brain tissue stained with a nucleic acid stain and a Map2-specific stain on a microarray of identification codes.

[0335] Figure 9 shows the olfactory bulb of an FFPE mouse brain stained with a nucleic acid stain (white), with its morphology visible.

[0336] Figure 10 shows the olfactory bulb of an FFPE mouse brain (approximately 2 x 2 mm) stained with a nucleic acid stain (white), overlaid with the theoretical spot pattern of a low-resolution array.

[0337] Figure 11 shows the olfactory bulb of an FFPE mouse brain (approximately 2 x 2 mm) stained with a nucleic acid stain (white), overlaid with the theoretical spot pattern of a medium-resolution array.

[0338] Figure 12 is a magnified view of the olfactory bulb region of an FFPE mouse brain (i.e., the upper right part of Figure 9).

[0339] Figure 13 shows the product of USER release amplified using a random hexamer primer (R6) coupled to the B_handle (B_R6); the product is shown as a bioanalyzer trace.

[0340] Figure 14 shows the product of USER release amplified using a random octamer primer (R8) coupled to the B_handle (B_R8); the product is shown as a bioanalyzer trace.

[0341] Figure 15 shows the results of an experiment performed on FFPE brain tissue covering the entire array. ID5 (left) and ID20 (right) were amplified using ID-specific and gene-specific primers (B2M exon 4) after synthesis of cDNA on the surface and its release; both ID5 and ID20 were amplified.

[0342] Figure 16 illustrates the principle of the method described in Example 4, specifically the use of a microarray on which DNA oligonucleotides (capture probes) carrying spatial marker sequences (localization domains) are immobilized. The oligonucleotides of each feature in the microarray carry 1) a unique marker (localization domain) and 2) a capture sequence (capture domain).

[0343] Figure 17 shows results from Example 5, in which pre-fragmented genomic DNA with an average size of 200 bp was used in the described spatial genomics procedure. Internal products were amplified from DNA labeled and synthesized on the array. The measured peak size was as expected.

[0344] Figure 18 shows results from Example 5, in which pre-fragmented genomic DNA with an average size of 700 bp was used in the described spatial genomics procedure. Internal products were amplified from DNA labeled and synthesized on the array. The measured peak size was as expected.

[0345] Figure 19 shows results from experiments performed using pre-fragmented genomic DNA with an average size of 200 bp, following the spatial genomics procedure described in Example 5. The products were amplified, from DNA labeled and synthesized on the array, using an internal primer and a universal sequence contained in the surface oligonucleotides. Because random fragmentation and terminal transferase labeling of the genomic DNA produce a highly diverse sample pool, the products are expected to appear as a smear.

[0346] Figure 20 shows results from an experiment performed using pre-fragmented genomic DNA with an average size of 700 bp, following the spatial genomics procedure described in Example 5. The products were amplified, from DNA labeled and synthesized on the array, using an internal primer and a universal sequence contained in the surface oligonucleotides. Because random fragmentation and terminal transferase labeling of the genomic DNA produce a highly diverse sample pool, the products are expected to appear as a smear.

[0347] Figure 21 illustrates the ligation reaction in which a linker is attached to a DNA fragment. This reaction introduces a binding domain that hybridizes to the poly-T capture domain, allowing subsequent ligation to the capture probe.

[0348] Figure 22 shows the configuration of the 5' to 3' oriented capture probes used in a high-density capture array.

[0349] Figure 23 shows the frame of a high-density array used for orienting tissue samples, visualized by hybridization of fluorescently labeled probes.

[0350] Figure 24 shows cleaved and uncleaved capture probes on a high-density array, where the frame probes are not cleaved because they contain no uracil bases. The capture probes are labeled with fluorescein linked to poly-A oligonucleotides.

[0351] Figure 25 is a bioanalyzer trace of a library prepared from transcripts captured from the mouse olfactory bulb.

[0352] Figure 26 shows a Matlab visualization of transcripts captured from total RNA extracted from the mouse olfactory bulb.

[0353] Figure 27 shows Olfr (olfactory receptor) transcripts captured from mouse olfactory bulb tissue across the full capture array, visualized using a Matlab visualization program.

[0354] Figure 28 shows the spotting pattern of the in-house 41-ID-tag microarray.

[0355] Figure 29 shows a spatial genomics library generated from an A431-specific translocation after capture of poly-A-tailed genomic fragments on the capture array.

[0356] Figure 30 shows the detection of an A431-specific translocation after poly-A-tailed A431 genomic fragments, spiked into poly-A-tailed U2OS genomic fragments, were captured on the capture array.

[0357] Figure 31 shows the ID-tagged transcripts captured from mouse olfactory bulb tissue on an in-house 41-ID-tag array, overlaid on the tissue image and visualized in Matlab. For clarity, the specific features in which particular genes were identified have been outlined.

Detailed Description

[0358] Example 1

[0359] Array fabrication

[0360] The following experiments demonstrate how to attach oligonucleotide probes to an array substrate via their 5' or 3' ends to obtain an array containing capture probes that can hybridize with mRNA.

[0361] Fabrication of in-house spotted microarrays with 5' to 3' oriented probes

[0362] Twenty labeled RNA capture oligonucleotides (labels 1-20, Table 1) were spotted onto glass slides as capture probes. Each probe was synthesized with a 5'-terminal amino linker and a C6 spacer. All probes were synthesized by Sigma-Aldrich (St. Louis, MO, USA). The RNA capture probes were suspended at 20 μM in 150 mM sodium phosphate, pH 8.5, and spotted with a Nanoplotter NP2.1 / E (Gesim, Grosserkmannsdorf, Germany) onto CodeLink™ activated microarray slides (7.5 cm x 2.5 cm; Surmodics, Eden Prairie, MN, USA). After spotting, surface blocking was performed according to the manufacturer's instructions. The probes were spotted as 16 identical arrays on each slide, each array containing a predefined spotting pattern. During hybridization, the 16 subarrays were separated by a 16-well mask (ChipClip™; Schleicher & Schuell BioScience, Keene, NH, USA).

[0363]

[0364]

[0365] Fabrication of a self-made spray microarray containing probes with 3' to 5' end orientation and synthesis of probes with 5' to 3' end orientation

[0366] The surface probe oligonucleotides were spotted in the same way as the 5' to 3' oriented probes described above, and each oligonucleotide carried an amino-C7 linker at its 3' end, as shown in Table 1.

[0367] To hybridize the primers for capture probe synthesis, a hybridization solution containing 4x SSC with 0.1% SDS, 2 μM extension primer (universal domain oligonucleotide), and 2 μM linker primer (capture domain oligonucleotide) was incubated at 50 °C for 4 minutes. Meanwhile, an in-house array was mounted in a ChipClip (Whatman). Then, 50 μL of hybridization solution was added to each well of the array, and the array was incubated at 50 °C and 300 rpm for 30 minutes.

[0368] After incubation, the array was removed from the ChipClip and washed in the following three steps: 1) 2x SSC containing 0.1% SDS at 50°C and 300 rpm for 6 minutes; 2) 0.2x SSC at 300 rpm for 1 minute; and 3) 0.1x SSC at 300 rpm for 1 minute. The array was then spin-dried and placed back into the ChipClip.

[0369] To perform the extension and ligation reactions (which generate the localization domains of the capture probes), 50 μL of an enzyme mixture containing 10x Ampligase buffer, 2.5 U of AmpliTaq DNA Polymerase Stoffel Fragment (Applied Biosystems), 10 U of Ampligase (Epicentre Biotechnologies), 2 mM of each dNTP (Fermentas), and water was added to each well. The array was then incubated at 55°C for 30 minutes. After incubation, the array was washed according to the array washing method described above, except that the duration of step 1) was 10 minutes instead of 6 minutes.

[0370] The method is illustrated in Figure 3.

[0371] Tissue preparation

[0372] The following experiments demonstrate how to prepare tissue sample slices for use in the method of this invention.

[0373] Preparation of fresh frozen tissue and sectioning onto capture probe arrays

[0374] Fresh, unfixed mouse brain tissue was harvested, trimmed as needed, and frozen at -40°C in isopentane. The tissue was then cut into 10 μm sections in a cryostat. One tissue section was placed on each capture probe array to be used.

[0375] Preparation of formalin-fixed paraffin-embedded (FFPE) tissues

[0376] Mouse brain tissue was fixed in 4% formalin at 4°C for 24 hours and then incubated as follows: 3 times for 1 hour in 70% ethanol; once for 1 hour in 80% ethanol; once for 1 hour in 96% ethanol; 3 times for 1 hour in 100% ethanol; and twice for 1 hour in xylene at room temperature.

[0377] The dehydrated samples were then incubated in low-melting-point liquid paraffin at 52-54 °C for up to 3 hours, with the paraffin replaced once during this period to wash away residual xylene. The finished tissue blocks were then stored at room temperature. The paraffin-embedded tissue was then cut into 4 μm sections using a microtome, and a section was placed onto each capture probe array to be used.

[0378] The sections were dried on the array slides at 37°C for 24 hours and stored at room temperature.

[0379] Dewaxing of FFPE tissue

[0380] Formalin-fixed paraffin-embedded 10μm sections of mouse brain tissue attached to CodeLink slides were dewaxed twice in xylene for 10 minutes each time; then in 99.5% ethanol for 2 minutes; in 96% ethanol for 2 minutes; and in 70% ethanol for 2 minutes; and then air-dried.

[0381] cDNA synthesis

[0382] The following experiment demonstrates that mRNA captured on the array from tissue sample sections can be used as a template for cDNA synthesis.

[0383] cDNA Synthesis on Chip

[0384] A 16-well light-shielding frame and a ChipClip slide holder (Whatman) were attached to the CodeLink slide. cDNA synthesis was performed using the SuperScript™ III One-Step RT-PCR System with Platinum Taq DNA Polymerase (Invitrogen). For each reaction, 25 μl of the 2x reaction mix (supplied with the SuperScript™ III One-Step RT-PCR System), 22.5 μl of water and 0.5 μl of 100x BSA were mixed and heated to 50 °C. For each reaction, 2 μl of SuperScript III / Platinum Taq enzyme mix was added to the reaction mixture, and 50 μl of the reaction mixture was added to each well on the chip. The chip was incubated at 50 °C for 30 minutes (Thermomixer Comfort, Eppendorf).
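By way of non-limiting illustration, the per-reaction volumes given above (25 μl + 22.5 μl + 0.5 μl + 2 μl = 50 μl) can be scaled to a full 16-well slide as follows. The 10% pipetting overage and the function name are assumptions of the example and are not taken from the protocol.

```python
# Per-reaction volumes in microlitres, as stated in the protocol text
PER_REACTION_UL = {
    "2x reaction mix": 25.0,
    "water": 22.5,
    "100x BSA": 0.5,
    "SuperScript III / Platinum Taq enzyme mix": 2.0,
}


def master_mix(n_wells: int, overage: float = 0.10):
    """Scale the per-well recipe to n_wells plus a pipetting overage.

    The 10% default overage is an assumed convenience value.
    """
    factor = n_wells * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# The four components add up to the 50 ul dispensed into each well
assert sum(PER_REACTION_UL.values()) == 50.0

for name, volume in master_mix(16).items():
    print(f"{name}: {volume} ul")
```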

[0385] Remove the reaction mixture from the wells and wash the substrate with the following reagents: 2xSSC, 0.1% SDS: 50°C for 10 minutes; 0.2xSSC: room temperature for 1 minute; and 0.1xSSC: room temperature for 1 minute. Then spin-dry the chip.

[0386] For FFPE tissue sections, the sections can be stained and visualized before tissue removal, as described in the visualization section below.

[0387] Visualization

[0388] Fluorescently labeled probe hybridization before staining

[0389] Before the tissue sections are placed, fluorescently labeled probes are hybridized to features containing labeled oligonucleotides spotted on the capture probe array. After tissue visualization, the fluorescently labeled probes aid in orienting the resulting image, so that the image can be merged with the expression profiles of the individual capture probe "labels" (localization domains) obtained from sequencing. To hybridize the fluorescent probes, a hybridization solution containing 4x SSC with 0.1% SDS and 2 μM detection probe (P) is incubated at 50°C for 4 minutes. Meanwhile, an in-house array is mounted in a ChipClip (Whatman). Then, 50 μL of hybridization solution is added to each well of the array, and the array is incubated at 50°C and 300 rpm for 30 minutes.
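By way of non-limiting illustration, orienting the tissue image with fluorescent marker features can be sketched as fitting a transform from array coordinates to image pixel coordinates. The sketch assumes that two marker features are identified in the image and that the image differs from the array layout only by rotation, uniform scaling and translation (no mirroring or shear). The function names and all coordinates are hypothetical.

```python
def fit_similarity(array_points, image_points):
    """Fit image = a * array + b from two fiducial pairs.

    Points are treated as complex numbers x + iy, so the complex factor
    `a` encodes rotation and uniform scale, and `b` the translation.
    """
    z1, z2 = [complex(x, y) for x, y in array_points]
    w1, w2 = [complex(x, y) for x, y in image_points]
    if z1 == z2:
        raise ValueError("the two fiducial features must be distinct")
    a = (w2 - w1) / (z2 - z1)
    b = w1 - a * z1
    return a, b


def array_to_image(point, transform):
    """Map an array coordinate to image pixel coordinates."""
    a, b = transform
    w = a * complex(*point) + b
    return (w.real, w.imag)


# Hypothetical fiducials: array features (0, 0) and (10, 0) were found in
# the image at pixels (100, 50) and (100, 250).
transform = fit_similarity([(0, 0), (10, 0)], [(100, 50), (100, 250)])
print(array_to_image((5, 5), transform))
```

With such a transform, the position associated with each localization domain can be drawn at the matching pixel location on the tissue image.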

[0390] After incubation, remove the array from the ChipClip and wash it in the following steps: 1) 2x SSC, 0.1% SDS: 50°C and 300 rpm for 6 minutes; 2) 0.2x SSC: 300 rpm for 1 minute; and 3) 0.1x SSC: 300 rpm for 1 minute. Then, spin-dry the array.

[0391] General histological staining of FFPE tissue sections before or after cDNA synthesis

[0392] For FFPE tissue sections mounted on capture probe arrays, wash and rehydrate as described above after dewaxing and before cDNA synthesis, or wash as described above after cDNA synthesis. Then, process the tissue sections as follows: incubate in hematoxylin for 3 minutes; rinse with deionized water; incubate in tap water for 5 minutes; dip quickly 8-12 times in acid ethanol; wash twice with tap water for 1 minute each; wash with deionized water for 2 minutes; incubate in eosin for 30 seconds; wash three times with 95% ethanol for 5 minutes each; wash three times with 100% ethanol for 5 minutes each; rinse three times with xylene for 10 minutes each (may be left overnight); place a coverslip on the slide and mount with DPX; allow the slide to dry overnight in a fume hood.

[0393] General immunohistochemical staining of target proteins in FFPE tissue sections before or after cDNA synthesis

[0394] For FFPE tissue sections mounted on capture probe arrays, wash and rehydrate as described above after dewaxing and before cDNA synthesis, or wash as described above after cDNA synthesis. Then, process the tissue sections as follows, keeping them moist throughout the staining process: incubate the sections overnight at room temperature in a humidified chamber with primary antibody (diluted in blocking buffer containing 1x Tris-buffered saline (50 mM Tris, 150 mM NaCl, pH 7.6), 4% donkey serum, and 0.1% Triton-X); rinse three times with 1x TBS; incubate the sections at room temperature for 1 hour in a humidified chamber with the matching secondary antibody conjugated to a fluorochrome (FITC, Cy3, or Cy5); rinse three times with 1x TBS and remove as much TBS as possible; mount the sections with ProLong Gold + DAPI (phenylindole) reagent (Invitrogen); and then analyze them using a fluorescence microscope and a matched set of filters.

[0395] Remove residual tissue

[0396] Frozen tissue

[0397] For freshly frozen mouse brain tissue, the washing step after cDNA synthesis is sufficient to completely remove the tissue.

[0398] FFPE tissue

[0399] Take a slide of formalin-fixed paraffin-embedded mouse brain tissue and attach it to a ChipClip clip and a 16-well light-shielding frame (Whatman). Take Proteinase K Digest Buffer from the RNeasy FFPE kit (Qiagen) and add 10 μl of Proteinase K Solution (Qiagen) to every 150 μl. Add 50 μl of the final mixture to each well and incubate the slide at 56°C for 30 minutes.

[0400] Release of the capture probe (cDNA)

[0401] The capture probe (covalently attached probe) is released using a PCR buffer containing a mixture of USER enzymes capable of cleaving uracil.

[0402] Mount the 16-well light-shielding frame and the CodeLink slide in the ChipClip (Whatman). Heat 50 μl of a mixture containing 1x FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl2 (Roche), 200 μM dNTPs (New England Biolabs), and 0.1 U / μl USER Enzyme (New England Biolabs) to 37°C, add it to each well, and incubate at 37°C for 30 minutes with mixing (3 seconds of shaking at 300 rpm followed by 6 seconds at rest) (Thermomixer comfort; Eppendorf). Then, use a pipette to recover the reaction mixture containing the released cDNA and probes from the wells.
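By way of non-limiting illustration, the final concentrations stated above can be converted into pipetting volumes for one 50 μl well using C1 x V1 = C2 x V2. The stock concentrations (10x buffer, 10 mM dNTPs, 1 U/μl USER enzyme) are assumptions made for the example and should be replaced by the values on the reagent labels actually used.

```python
WELL_VOLUME_UL = 50.0

# name: (final concentration from the protocol, ASSUMED stock concentration)
# Each pair uses the same unit for final and stock.
COMPONENTS = {
    "FastStart reaction buffer (x)": (1.0, 10.0),   # 10x stock assumed
    "dNTPs (uM)": (200.0, 10000.0),                 # 10 mM stock assumed
    "USER enzyme (U/ul)": (0.1, 1.0),               # 1 U/ul stock assumed
}


def stock_volume(final_conc, stock_conc, final_volume_ul=WELL_VOLUME_UL):
    """Solve C1 * V1 = C2 * V2 for the stock volume V1 in microlitres."""
    if stock_conc < final_conc:
        raise ValueError("stock must be at least as concentrated as the target")
    return final_conc * final_volume_ul / stock_conc


def release_mix():
    """Volumes per well, with water making up the remainder."""
    volumes = {name: stock_volume(f, s) for name, (f, s) in COMPONENTS.items()}
    volumes["water"] = WELL_VOLUME_UL - sum(volumes.values())
    return volumes


for name, volume in release_mix().items():
    print(f"{name}: {volume:.1f} ul")
```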

[0403] The capture probe (covalently attached probe) is released in a TdT (terminal transferase) buffer containing a mixture of USER enzymes capable of cleaving uracil.

[0404] Heat 50 μl of a mixture containing 1x TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM potassium acetate and 10 mM magnesium acetate) (New England Biolabs, www.neb.com), 0.1 μg / μl BSA (New England Biolabs), and 0.1 U / μl USER enzyme (New England Biolabs) to 37°C, add it to each well, and incubate at 37°C for 30 minutes with mixing (3 seconds of shaking at 300 rpm followed by 6 seconds at rest) (Thermomixer comfort; Eppendorf). Then, use a pipette to recover the reaction mixture containing the released cDNA and probes from the wells.

[0405] The capture probe (covalently attached probe) is released using boiling hot water.

[0406] Mount the 16-well light-shielding frame and CodeLink slides onto the ChipClip clip (Whatman). Pipette 50 μl of 99°C hot water into each well and allow to react for 30 minutes. Then, use a pipette to recover the reaction mixture containing the released cDNA and probe from the wells.

[0407] The capture probe (synthetic capture probe for in situ hybridization, i.e., hybridization with surface probe) is released using heated PCR buffer. (capture probe)

[0408] Take 50 μl of a mixture containing 1x TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM potassium acetate and 10 mM magnesium acetate) (New England Biolabs, www.neb.com), 0.1 μg / μl BSA (New England Biolabs), and 0.1 U / μl USER enzyme (New England Biolabs), preheat to 95°C, add to each well, incubate at 95°C for 5 minutes and mix (300 rpm shaking for 3 seconds, then stand for 6 seconds) (Thermomixercomfort; Eppendorf). Then, recover the reaction mixture containing the released probe from the well.

[0409] The capture probe (synthetic capture probe hybridized in situ, i.e., a capture probe hybridized with the surface probe) is released using heated TdT (terminal transferase) buffer.

[0410] Take 50 μl of a mixture containing 1x TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM potassium acetate and 10 mM magnesium acetate) (New England Biolabs, www.neb.com), 0.1 μg / μl BSA (New England Biolabs), and 0.1 U / μl USER enzyme (New England Biolabs), preheat to 95°C, add to each well, and incubate at 95°C for 5 minutes with mixing (shake at 300 rpm for 3 seconds, then rest for 6 seconds) (Thermomixer comfort; Eppendorf). Then, recover the reaction mixture containing the released probe from the well.

[0411] The efficiency of treating the array with USER enzyme and with water heated to 99°C is shown in Figure 4. The in-house arrays described above were digested using the USER and RsaI enzymes (Figure 4). Release of DNA surface probes using hot water was performed on commercially available arrays manufactured by Agilent (see Figure 5).

[0412] Probe collection and linker introduction

[0413] The following experiments show that the first strand of cDNA released from the array surface can be modified to produce double-stranded DNA and amplify it.

[0414] Whole transcriptome amplification using the Picoplex whole genome amplification kit (including the capture probe localization domain (tag) sequences). The probe sequence is not retained at the end of the resulting dsDNA.

[0415] The capture probe is released using either a PCR buffer containing a mixture of USER enzymes capable of cleaving uracil (for covalently attached probes) or a heated PCR buffer (for synthetic capture probes that hybridize in situ, i.e. capture probes that hybridize with surface probes).

[0416] According to the manufacturer's instructions, the released cDNA was amplified using the Picoplex (Rubicon Genomics) random primer whole genome amplification method.

[0417] Whole transcriptome amplification using terminal transferase (TdT) dA tailing (including the capture probe localization domain (tag) sequences). The probe sequence is retained at the end of the resulting dsDNA.

[0418] The capture probe is released by using a TdT (terminal transferase) buffer containing a mixture of USER enzymes that can cleave uracil (for covalently attached probes) or heated PCR buffer (for synthetic capture probes that hybridize in situ, i.e. capture probes that hybridize with surface probes).

[0419] Transfer 38 μl of the cleavage mixture to a clean 0.2 ml PCR tube. The mixture contains: 1x TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM potassium acetate and 10 mM magnesium acetate) (New England Biolabs, www.neb.com), 0.1 μg / μl BSA (New England Biolabs), 0.1 U / μl USER enzyme (New England Biolabs) (omitted for heat release), the released cDNA (extended from the surface probe) and the released surface probe. Add 0.5 μl RNase H (5 U / μl, final concentration 0.06 U / μl), 1 μl TdT (20 U / μl, final concentration 0.5 U / μl) and 0.5 μl dATP (100 mM, final concentration 1.25 mM) to the PCR tube. For dA tailing, incubate the PCR tube in an Applied Biosystems thermal cycler at 37°C for 15 minutes, followed by 70°C for 10 minutes to inactivate the TdT. After dA tailing, prepare a PCR premix containing: 1x Faststart HiFi PCR Buffer (Roche) with 1.8 mM MgCl2, 0.2 mM of each dNTP (Fermentas), 0.2 μM each of primers A (complementary to the amplification domain of the capture probe) and B_(dT)24 (Eurofins MWG Operon) (complementary to the poly-A tail added to the 3' end of the first cDNA strand), and 0.1 U / μl Faststart HiFi DNA polymerase (Roche). Dispense 23 μl of PCR premix into each of nine clean 0.2 ml PCR tubes. Add 2 μl of the dA tailing mixture to eight of the tubes, and add 2 μl of water (RNase-free / DNase-free) to the last tube (negative control). Perform PCR amplification with the following program: hot start at 95°C for 2 minutes; second-strand synthesis at 50°C for 2 minutes and 72°C for 3 minutes; 30 amplification cycles of 95°C for 30 seconds, 65°C for 1 minute and 72°C for 3 minutes; and a final extension at 72°C for 10 minutes.
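The final concentrations quoted for the dA-tailing step follow from the usual dilution rule (stock concentration times added volume, divided by total volume). The following is a minimal sketch that re-derives them, assuming the 38 μl of mixture and the three additions are the only contributions to the reaction volume; the function and variable names are illustrative only.

```python
def final_concentration(stock, added_volume_ul, total_volume_ul):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock * added_volume_ul / total_volume_ul


# Volumes in microlitres, taken from the dA-tailing step described above.
mixture_ul = 38.0
rnase_h_ul, tdt_ul, datp_ul = 0.5, 1.0, 0.5
total_ul = mixture_ul + rnase_h_ul + tdt_ul + datp_ul

rnase_h = final_concentration(5.0, rnase_h_ul, total_ul)   # U/ul
tdt = final_concentration(20.0, tdt_ul, total_ul)          # U/ul
datp = final_concentration(100.0, datp_ul, total_ul)       # mM

print(f"total volume: {total_ul:.1f} ul")
print(f"RNase H: {rnase_h:.4f} U/ul")
print(f"TdT:     {tdt:.2f} U/ul")
print(f"dATP:    {datp:.2f} mM")
```

Under these assumptions the total volume is 40 μl, and the computed values agree with the stated final concentrations after rounding.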

[0420] Post-reaction cleanup and analysis

[0421] Four amplification products were pooled and purified on a Qiaquick PCR purification column (Qiagen), followed by elution with 30 μl of EB (10 mM Tris-Cl, pH 8.5). The products were analyzed using an Agilent Bioanalyzer with the DNA 1000 kit according to the manufacturer's instructions.

[0422] Sequencing

[0423] Illumina sequencing

[0424] Following the manufacturer's instructions, a dsDNA library for Illumina sequencing was constructed using a sample index. Sequencing was performed using the HiSeq2000 platform (Illumina).

[0425] Bioinformatics research

[0426] Obtaining digital transcriptome information from sequencing data of whole transcriptome libraries amplified using the terminal transferase dA-tailing method

[0427] Using the FASTQ barcode splitter tool in the FastX toolkit, the sequencing data were sorted into separate files according to their respective capture probe localization domain (tag) sequences. The sequencing data for each tag were then mapped to the mouse genome using the Tophat mapping tool. Transcript counts were obtained from the resulting SAM files using HTseq-count software.
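The sorting of reads by tag can be illustrated with a small sketch. It is not the FastX toolkit implementation: it assumes the tag sits at the very start of each read and requires an exact match, and the tag names and sequences in the example are hypothetical.

```python
from collections import defaultdict
from itertools import islice


def read_fastq(lines):
    """Yield (header, sequence, quality) tuples from an iterable of FASTQ lines."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if len(record) < 4:
            return
        header, seq, _plus, qual = record
        yield header, seq, qual


def split_by_tag(records, tags):
    """Group reads by the tag found at the start of the read (exact match only)."""
    bins = defaultdict(list)
    for header, seq, qual in records:
        for name, tag in tags.items():
            if seq.startswith(tag):
                # Trim the tag so that only the captured sequence is mapped later.
                bins[name].append((header, seq[len(tag):], qual[len(tag):]))
                break
        else:
            bins["unmatched"].append((header, seq, qual))
    return dict(bins)


# Hypothetical tags and reads, for illustration only.
tags = {"ID-A": "ACGTAC", "ID-B": "TTGACA"}
fastq = [
    "@read1", "ACGTACGGGTTTAAA", "+", "IIIIIIIIIIIIIII",
    "@read2", "TTGACACCCGGG", "+", "IIIIIIIIIIII",
    "@read3", "GGGGGGGG", "+", "IIIIIIII",
]
bins = split_by_tag(read_fastq(fastq), tags)
for name, reads in bins.items():
    print(name, [seq for _, seq, _ in reads])
```

Each per-tag bin would then be written to its own file and mapped separately, as described above.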

[0428] Obtaining digital transcriptome information from sequencing data of whole transcriptome libraries amplified using the Picoplex whole genome amplification kit: the sequencing data were converted from FASTQ format to FASTA format using the FASTQ-to-FASTA conversion tool in the FastX toolkit. The sequencing reads were aligned to the localization domain (tag) sequences of the capture probes using Blastn, and reads with a hit better than 1e-6 to one of the tag sequences were sorted into separate files for each tag sequence. The tag-sequence read files were then aligned to the mouse transcriptome using Blastn, and the matching hits were collected.
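The read-to-tag assignment step can be sketched as a filter over Blastn tabular output. This sketch assumes the standard 12-column tabular format (query ID in column 1, subject ID in column 2, E-value in column 11) and an E-value cutoff of 1e-6; the read and tag identifiers in the example are hypothetical.

```python
from collections import defaultdict


def assign_reads_to_tags(blast_lines, max_evalue=1e-6):
    """Assign each read to the tag giving its lowest E-value below the cutoff."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        read_id, tag_id, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= max_evalue:
            continue  # hit is not better than the cutoff
        if read_id not in best or evalue < best[read_id][1]:
            best[read_id] = (tag_id, evalue)
    by_tag = defaultdict(list)
    for read_id, (tag_id, _) in best.items():
        by_tag[tag_id].append(read_id)
    return dict(by_tag)


# Hypothetical tabular hits, for illustration only.
hits = [
    "read1\tID1\t100.0\t18\t0\t0\t1\t18\t1\t18\t2e-08\t36.2",
    "read1\tID5\t94.4\t18\t1\t0\t1\t18\t1\t18\t4e-07\t30.1",
    "read2\tID5\t100.0\t18\t0\t0\t1\t18\t1\t18\t2e-08\t36.2",
    "read3\tID1\t88.9\t18\t2\t0\t1\t18\t1\t18\t0.003\t22.0",
]
reads_by_tag = assign_reads_to_tags(hits)
print(reads_by_tag)
```

Keeping only the best hit per read is a design choice of this sketch; the text does not say how reads matching more than one tag were handled.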

[0429] Combining visualization data and expression profiles

[0430] The expression profile obtained for the localization domain (tag) sequence of each capture probe is combined with the spatial information obtained from the tissue section by staining. In this way, transcriptomic information from the cellular compartments of the tissue section can be analyzed by direct comparison, and the distinct expression characteristics of different cell subtypes can be distinguished within a given structural context.
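Combining per-tag expression profiles with positions amounts to a lookup from tag to array coordinate. The following is a minimal sketch, assuming a known tag-to-coordinate layout and per-tag gene counts; the layout, gene names and counts are hypothetical.

```python
def spatial_expression(tag_positions, counts_by_tag, gene):
    """Return {(x, y): count} for one gene, using each tag's array coordinates."""
    return {
        tag_positions[tag]: gene_counts.get(gene, 0)
        for tag, gene_counts in counts_by_tag.items()
        if tag in tag_positions
    }


# Hypothetical array layout and counts, for illustration only.
tag_positions = {"ID-1": (0, 0), "ID-5": (1, 0), "ID-20": (0, 1)}
counts_by_tag = {
    "ID-1": {"B2M": 12},
    "ID-5": {"B2M": 3, "Actb": 7},
    "ID-20": {},
}

b2m_map = spatial_expression(tag_positions, counts_by_tag, "B2M")
for (x, y), count in sorted(b2m_map.items()):
    print(f"feature ({x}, {y}): B2M count = {count}")
```

The resulting coordinate map can be overlaid on the stained tissue image once the image and the array are brought into the same orientation.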

[0431] Example 2

[0432] Figures 8 to 12 show successful visualization of stained FFPE mouse brain tissue (olfactory bulb) sections placed on a transcriptome capture array carrying ID-tagged probes, following the general procedure described in the examples. Compared with the experiments using fresh frozen tissue in Example 1, the FFPE tissue shown in Figure 8 has better morphology. Figures 9 and 10 show how tissue can be positioned on arrays of different probe densities.

[0433] Example 3

[0434] Whole transcriptome amplification (including capture of marker sequences) was performed using random primer second-strand synthesis and universal end amplification reactions. The probe sequence is retained at the end of the resulting dsDNA.

[0435] After releasing the capture probe (covalently attached probe) in a PCR buffer containing a mixture of USER enzymes capable of cleaving uracil.

[0436] or

[0437] After releasing the capture probe in heated PCR buffer (synthetic capture probe for in situ hybridization)

[0438] Prepare two tubes, each containing 40 μl of 1x Faststart HiFi PCR Buffer with 1.8 mM MgCl2 (pH 8.3) (Roche, www.roche-applied-science.com) with 0.2 mM dNTPs (Fermentas, www.fermentas.com), 0.1 μg / μl BSA (New England Biolabs, www.neb.com), 0.1 U / μl USER enzyme (New England Biolabs), the released cDNA (extended from the surface probe), and the released surface probe. Add 1 μl of RNase H (5 U / μl) to each tube. Incubate both tubes in a thermal cycler (Applied Biosystems, www.appliedbiosystems.com) at 37°C for 30 minutes, followed by 70°C for 20 minutes. Add 1 μl of Klenow fragment (3' to 5' exo minus) (Illumina, www.illumina.com) and 1 μl of random primer (10 μM) (Eurofins MWG Operon, www.eurofinsdna.com) to each tube (B_R8 (octamer) to one tube and B_R6 (hexamer) to the other), giving a final primer concentration of 0.23 μM. Incubate both tubes in an Applied Biosystems thermal cycler sequentially at 15°C for 15 minutes, 25°C for 15 minutes, 37°C for 15 minutes, and finally 75°C for 20 minutes. After incubation, add 1 μl each of primers A_P and B (10 μM) (Eurofins MWG Operon) to each tube, giving a final concentration of 0.22 μM each. Then add 1 μl of Faststart HiFi DNA polymerase (5 U / μl) (Roche) to each tube, giving a final concentration of 0.11 U / μl. Perform PCR amplification in an Applied Biosystems thermal cycler with the following program: hot start at 94°C for 2 minutes, followed by 50 cycles of 94°C for 15 seconds, 55°C for 30 seconds and 68°C for 1 minute, and a final extension at 68°C for 5 minutes. After amplification, take 40 μl from each tube, purify on a Qiaquick PCR purification column (Qiagen, www.qiagen.com), and elute with 30 μl of EB (10 mM Tris-Cl, pH 8.5). The purified products were analyzed using a Bioanalyzer (Agilent, www.home.agilent.com) with the DNA 7500 kit. The results are shown in Figures 13 and 14.
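Because the reagents in this protocol are added one after another, each stated final concentration refers to the tube volume at the moment of that addition. The sketch below tracks the running volume, assuming the starting volume is 40 μl in total and that each addition contributes exactly its nominal volume; the variable names are illustrative only.

```python
volume_ul = 40.0   # assumed starting volume of buffer mix with released cDNA

volume_ul += 1.0   # RNase H
volume_ul += 1.0   # Klenow fragment
volume_ul += 1.0   # random primer (10 uM stock)
random_primer_um = 10.0 * 1.0 / volume_ul

volume_ul += 2.0   # primers A_P and B, 1 ul each (10 uM stocks)
pcr_primer_um = 10.0 * 1.0 / volume_ul

volume_ul += 1.0   # Faststart HiFi DNA polymerase (5 U/ul stock)
polymerase_u_per_ul = 5.0 * 1.0 / volume_ul

print(f"random primer: {random_primer_um:.2f} uM")
print(f"A_P and B:     {pcr_primer_um:.2f} uM each")
print(f"polymerase:    {polymerase_u_per_ul:.2f} U/ul")
print(f"final volume:  {volume_ul:.0f} ul")
```

Under these assumptions the rounded values match the concentrations given in the protocol.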

[0439] This example demonstrates that second-strand synthesis primed with random hexamers or random octamers, followed by an amplification reaction, can be used to generate a population of amplified molecules from the released cDNA.

[0440] Example 4

[0441] Amplification of ID-specific and gene-specific products following cDNA synthesis and probe collection

[0442] After releasing the capture probe (covalently attached probe) in a PCR buffer containing a mixture of USER enzymes capable of cleaving uracil:

[0443] The cleaved cDNA was amplified in a final reaction volume of 10 μl. Cleaved template, 1 μl of ID-specific forward primer (2 μM), 1 μl of gene-specific reverse primer (2 μM), and 1 μl of FastStart High Fidelity Enzyme Blend in 1.4x FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl2 were combined to a total of 10 μl, giving a final reaction with 1x FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl2 and 1 U of FastStart High Fidelity Enzyme Blend. PCR amplification was performed in an Applied Biosystems thermal cycler with the following program: hot start at 94°C for 2 minutes, followed by 50 cycles of 94°C for 15 seconds, 55°C for 30 seconds and 68°C for 1 minute, and a final extension at 68°C for 5 minutes.

[0444] Primer sequences yielded a product of approximately 250 bp.

[0445] Beta-2 microglobulin (B2M) primer

[0446] 5'-TGGGGGTGAGAATTGCTAAG-3'(SEQ ID NO:43)

[0447] ID-1 primers

[0448] 5'-CCTTTCCTTCTCCTTCACC-3'(SEQ ID NO:44)

[0449] ID-5 primers

[0450] 5'-GTCCTCTATTCCGTCACCAT-3'(SEQ ID NO:45)

[0451] ID-20 primer

[0452] 5'-CTGCTTCTTCCTGGAACTCA-3'(SEQ ID NO:46)
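As a quick consistency check on the four primers listed above, the following sketch computes their length, GC fraction, and a rough melting temperature using the Wallace rule (2°C per A/T and 4°C per G/C). The Wallace rule is only a coarse approximation for short oligonucleotides and is not stated in the text as the method used to design these primers.

```python
PRIMERS = {
    "B2M (SEQ ID NO:43)": "TGGGGGTGAGAATTGCTAAG",
    "ID-1 (SEQ ID NO:44)": "CCTTTCCTTCTCCTTCACC",
    "ID-5 (SEQ ID NO:45)": "GTCCTCTATTCCGTCACCAT",
    "ID-20 (SEQ ID NO:46)": "CTGCTTCTTCCTGGAACTCA",
}


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Tm ~{wallace_tm(seq)} C")
```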

[0453] As shown in Figure 15, ID-specific and gene-specific products were successfully amplified using two different ID primers (i.e., primers specific to ID tags located at different positions on the microarray) and the same gene-specific primer, with brain tissue covering all the probes. Accordingly, this experiment demonstrates that products can be identified using amplification reactions specific to the ID tag or to the target nucleic acid, and further demonstrates that different ID tags can be distinguished. In a second experiment, in which tissue covered only half of the ID probes (i.e., capture probes) on the array, positive results were obtained for the tissue-covered spots.

[0454] Example 5

[0455] Spatial genomics

[0456] Background. The purpose of this method is to capture DNA molecules from a tissue sample while retaining spatial information, so that the part of the tissue from which a particular DNA fragment originates can be determined.

[0457] Method. The principle of this method is to use a microarray of immobilized DNA oligonucleotides (capture probes) carrying spatially labeled tag sequences (localization domains). Each oligonucleotide feature in the microarray carries 1) a unique tag (localization domain) and 2) a capture sequence (capture domain). By keeping track of where on the array surface a particular tag is located, two-dimensional positional information can be extracted from each tag. Fragmented genomic DNA is added to the microarray, for example by applying thin sections of FFPE-treated tissue. The genomic DNA in such tissue sections has already been fragmented by the fixation treatment.

[0458] After the tissue sections have been placed on the array, a universal tailing reaction is performed using terminal transferase. This tailing reaction adds a polydA tail to the protruding 3' ends of the genomic DNA fragments in the tissue. The surface oligonucleotides are protected from tailing by terminal transferase by a hybridized, 3'-blocked polydA probe.

[0459] Following the terminal transferase tailing reaction, the genomic DNA fragments can hybridize to nearby spatially labeled oligonucleotides through contact between the polydA tail and the polydT capture sequence of the surface oligonucleotide. After hybridization, a strand-displacing polymerase such as Klenow exo- can use the surface oligonucleotide as a primer to produce a new DNA strand complementary to the hybridized genomic DNA fragment. This new DNA strand also carries the positional information contained in the tag of the surface oligonucleotide.
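The capture and extension steps can be modelled as simple string operations. The sketch below assumes the polydT capture sequence anneals to the 3'-most bases of the polydA tail and that extension runs to the 5' end of the fragment; the tag, fragment and lengths are hypothetical and much shorter than real sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def da_tail(fragment, tail_len):
    """Model terminal transferase adding a polydA tail to the fragment's 3' end."""
    return fragment + "A" * tail_len


def extend_surface_oligo(tag, capture_len, tailed_fragment):
    """Model primer extension from a surface oligo (5'-tag-polydT-3')."""
    if not tailed_fragment.endswith("A" * capture_len):
        raise ValueError("polydT capture sequence cannot anneal to this fragment")
    # Template region lying 5' of the annealed stretch of the polydA tail.
    template = tailed_fragment[:-capture_len]
    return tag + "T" * capture_len + reverse_complement(template)


# Hypothetical tag and genomic fragment, for illustration only.
tag = "ACGTAC"
fragment = "GATTACAGGC"
product = extend_surface_oligo(tag, 5, da_tail(fragment, 5))
print(product)
```

The extended strand begins with the tag, so any read derived from it can be traced back to the array position of that tag.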

[0460] As a final step, the newly synthesized and labeled DNA strands are cleaved from the surface by enzymatic, denaturing, or physical means, collected, and subjected to downstream processing: amplification by introduction of universal ends, amplicon-specific amplification, and / or sequencing.

[0461] Figure 16 is a schematic diagram illustrating the principle of the process.

[0462] Materials and methods

[0463] Fabrication of an in-house printed microarray using probes in the 5' to 3' direction

[0464] Twenty DNA capture oligonucleotides, each with a tag sequence (Table 1), were spotted onto a glass slide to serve as capture probes. These probes were synthesized with a 5'-terminal amino linker and a C6 spacer. All probes were synthesized by Sigma-Aldrich (St. Louis, MO, USA). The DNA capture probes were suspended at 20 μM in 150 mM sodium phosphate, pH 8.5, and spotted onto CodeLink™ activated microarray slides (7.5 cm x 2.5 cm; Surmodics, Eden Prairie, MN, USA) using a Nanoplotter NP2.1 / E spotting system (Gesim, Grosserkmannsdorf, Germany). After spotting, surface blocking was performed according to the manufacturer's instructions. The probes were spotted as 16 identical arrays on the slide, each array containing a predefined spotting pattern. During the hybridization reaction, the 16 subarrays were separated by a 16-well mask (ChipClip™, Schleicher & Schuell BioScience, Keene, NH, USA).

[0465] Fabrication of an in-house printed microarray using probes in the 3' to 5' direction, and synthesis of capture probes in the 5' to 3' direction

[0466] The oligonucleotides were printed in the same way as described above for probes in the 5' to 3' direction.

[0467] To prepare primers for the synthesis of the capture probe, a hybridization solution containing 4xSSC with 0.1% SDS, 2 μM extension primer (A_primer), and 2 μM ligation primer (p_poly_dT) was incubated at 50 °C for 4 minutes. Simultaneously, a homemade array was attached to a ChipClip (Whatman) substrate clip. Then, 50 μL of hybridization solution was added to each well of the array, and the array was incubated at 50 °C and 300 rpm for 30 minutes.

[0468] After incubation, remove the array from the ChipClip substrate holder and wash it according to the following three steps: 1) wash with 2xSSC solution containing 0.1% SDS at 50°C and 300 rpm for 6 minutes; 2) wash with 0.2xSSC solution at 300 rpm for 1 minute; and 3) wash with 0.1xSSC solution at 300 rpm for 1 minute. Then, spin-dry the array and place it back into the ChipClip substrate holder.

[0469] To perform the extension and ligation reactions, 50 μL of an enzyme mixture containing 10x Ampligase buffer, 2.5 U of AmpliTaq DNA Polymerase Stoffel Fragment (Applied Biosystems), 10 U of Ampligase (Epicentre Biotechnologies), 2 mM each of dNTPs (Fermentas), and water was added to each well. The array was then incubated at 55°C for 30 minutes, followed by washing as described above, except that the washing time in the first step was extended from 6 minutes to 10 minutes.

[0470] Hybridization of polydA probes to protect the surface oligonucleotides from dA tailing

[0471] To protect the capture sequence of the surface oligonucleotides, a 3'-biotin-blocked polydA probe is hybridized to it. A hybridization solution containing 4xSSC with 0.1% SDS and 2 μM 3'bio-polydA is incubated at 50°C for 4 minutes. Meanwhile, an in-house array is attached to a ChipClip clip (Whatman). Then, 50 μL of hybridization solution is added to each well of the array, and the array is incubated at 50°C and 300 rpm for 30 minutes.

[0472] After incubation, remove the array from the ChipClip substrate holder and wash it according to the following three steps: 1) wash with 2xSSC solution containing 0.1% SDS at 50°C and 300 rpm for 6 minutes; 2) wash with 0.2xSSC solution at 300 rpm for 1 minute; and 3) wash with 0.1xSSC solution at 300 rpm for 1 minute. Then, spin-dry the array and place it back into the ChipClip substrate holder.

[0473] Preparation of formalin-fixed paraffin-embedded (FFPE) tissues

[0474] Mouse brain tissue was fixed with 4% formalin at 4°C for 24 hours. It was then incubated according to the following steps: three times in 70% ethanol for 1 hour each, once in 80% ethanol for 1 hour, once in 96% ethanol for 1 hour, once in 100% ethanol for 1 hour, and twice in xylene at room temperature for 1 hour each.

[0475] The dehydrated samples were then incubated in liquid low-melting-point paraffin at 52-54°C for up to 3 hours, with the paraffin replaced once during this period to wash away residual xylene. The finished tissue blocks were stored at room temperature and, when needed, were cut into 4 μm sections using a microtome and placed onto each capture probe array to be used.

[0476] The sections were dried at 37°C for 24 hours on an array slide and stored at room temperature.

[0477] Dewaxing of FFPE tissue

[0478] Formalin-fixed paraffin-embedded 10 μm sections of mouse brain tissue attached to CodeLink slides were dewaxed twice in xylene for 10 minutes each time; then in 99.5% ethanol for 2 minutes; in 96% ethanol for 2 minutes; and in 70% ethanol for 2 minutes; and then air-dried.

[0479] Universal tailing reaction of genomic DNA

[0480] To perform dA tailing, a 50 μl reaction mixture was prepared containing 1x TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM potassium acetate and 10 mM magnesium acetate) (New England Biolabs, www.neb.com), 0.1 μg / μl BSA (New England Biolabs), 1 μl TdT (20 U / μl), and 0.5 μl dATPs (100 mM). This mixture was added to the array surface and incubated in a thermal cycler at 37 °C for 15 min, followed by incubation at 70 °C for 10 min to inactivate TdT. The temperature was then lowered again to 50 °C to allow the dA-tailed genomic fragment to hybridize with the surface oligonucleotide capture sequence.

[0481] After incubation, remove the array from the ChipClip substrate clip and wash it according to the following three steps: 1) wash with 2xSSC solution containing 0.1% SDS at 50°C and 300 rpm for 6 minutes; 2) wash with 0.2xSSC solution at 300 rpm for 1 minute; and 3) wash with 0.1xSSC solution at 300 rpm for 1 minute. Then, spin-dry the array.

[0482] Extension of labeled DNA

[0483] Take 50 μl of the reaction mixture containing 1x Klenow buffer, 200 μM dNTPs (New England Biolabs) and 1 μl Klenow fragment (3' to 5' exo minus), heat to 37°C, add to each well, and incubate at 37°C for 30 minutes and mix (shake at 300 rpm for 3 seconds, let stand for 6 seconds) (Thermomixer comfort; Eppendorf).

[0484] After incubation, remove the array from the ChipClip substrate clip and wash it according to the following three steps: 1) wash with 2xSSC solution containing 0.1% SDS at 50°C and 300 rpm for 6 minutes; 2) wash with 0.2xSSC solution at 300 rpm for 1 minute; and 3) wash with 0.1xSSC solution at 300 rpm for 1 minute. Then, spin-dry the array.

[0485] Remove residual tissue

[0486] Take a slide of formalin-fixed paraffin-embedded mouse brain tissue and attach it to a ChipClip clip and a 16-well light-shielding frame (Whatman). Add 10 μl of Proteinase K solution (Qiagen) to each 150 μl of Proteinase K Digest Buffer from the RNeasy FFPE kit (Qiagen). Add 50 μl of the final mixture to each well and incubate the slide at 56°C for 30 minutes.

[0487] The capture probe (covalently attached probe) is released using a PCR buffer containing a mixture of USER enzymes capable of cleaving uracil.

[0488] Mount the 16-well light-shielding frame and CodeLink slides onto the ChipClip clip (Whatman). Take 50 μl of a mixture containing: 1x FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl2 (Roche), 200 μM dNTPs (New England Biolabs), and 0.1 U / 1 μl USER enzyme (New England Biolabs), heat to 37°C, add to each well, and incubate at 37°C for 30 minutes with mixing (shake at 300 rpm for 3 seconds, then rest for 6 seconds) (Thermomixer comfort; Eppendorf). Then, use a pipette to recover the reaction mixture containing the released cDNA and probe from the wells.

[0489] Synthesis of labeled DNA and amplification of ID-specific and gene-specific products after probe collection

[0490] After releasing the capture probe (covalently attached probe) in a PCR buffer containing a mixture of USER enzymes capable of cleaving uracil:

[0491] The cleaved cDNA was amplified in a final reaction volume of 10 μl. 7 μl of cleaved template, 1 μl of ID-specific forward primer (2 μM), 1 μl of gene-specific reverse primer (2 μM), and 1 μl of FastStart High Fidelity Enzyme Blend in 1.4x FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl2 were combined to a total of 10 μl, giving a final reaction with 1x FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl2 and 1 U of FastStart High Fidelity Enzyme Blend.

[0492] The PCR amplification reaction was performed in an Applied Biosystems thermal cycler according to the following procedure: a 2-minute hot start at 94°C, followed by 50 cycles of the following sequence: 15 seconds at 94°C, 30 seconds at 55°C, 1 minute at 68°C, and a final extension at 68°C for 5 minutes.
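The cycling program above can be written down as data, which makes it easy to estimate the programmed hold time. This is a minimal sketch that ignores temperature ramping between steps, so the real run time on any instrument will be longer; the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


def program_seconds(hot_start, cycle, n_cycles, final_extension):
    """Total programmed hold time in seconds, excluding ramp times."""
    per_cycle = sum(step.seconds for step in cycle)
    return hot_start.seconds + n_cycles * per_cycle + final_extension.seconds


hot_start = Step(94, 120)
cycle = [Step(94, 15), Step(55, 30), Step(68, 60)]
final_extension = Step(68, 300)

total_s = program_seconds(hot_start, cycle, 50, final_extension)
print(f"programmed hold time: {total_s} s ({total_s / 60:.1f} min)")
```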

[0493] Whole genome amplification (including capture of marker sequences) was performed using random primer second-strand synthesis and universal end amplification reactions. The probe sequence is retained at the end of the resulting dsDNA.

[0494] After releasing the capture probe (covalently attached probe) in a PCR buffer containing a mixture of USER enzymes capable of cleaving uracil.

[0495] The reaction mixture consisted of 40 μl of 1x Faststart HiFi PCR Buffer (pH 8.3) with 1.8 mM MgCl2 (Roche, www.roche-applied-science.com) containing 0.2 mM of each dNTP (Fermentas, www.fermentas.com), 0.1 μg / μl BSA (New England Biolabs, www.neb.com), 0.1 U / μl USER enzyme (New England Biolabs), the released DNA (extended from the surface probe), and the released surface probe. The PCR tube was incubated in a thermal cycler (Applied Biosystems, www.appliedbiosystems.com) at 37°C for 30 minutes, followed by 70°C for 20 minutes. Then 1 μl of Klenow fragment (3' to 5' exo minus) (Illumina, www.illumina.com) and 1 μl of random primer (10 μM) for introducing the universal ends (Eurofins MWG Operon, www.eurofinsdna.com) were added to the tube. The PCR tube was then incubated in a thermal cycler (Applied Biosystems, www.appliedbiosystems.com) at 15°C for 15 minutes, 25°C for 15 minutes, 37°C for 15 minutes, and finally 75°C for 20 minutes. After incubation, 1 μl each of primers A_P and B (10 μM) (Eurofins MWG Operon) and 1 μl of Faststart HiFi DNA polymerase (5 U / μl) (Roche) were added to the tube. PCR amplification was performed in an Applied Biosystems thermal cycler with the following program: hot start at 94°C for 2 minutes, followed by 50 cycles of 94°C for 15 seconds, 55°C for 30 seconds and 68°C for 1 minute, and a final extension at 68°C for 5 minutes. After amplification, 40 μl from each tube was purified on a Qiaquick PCR purification column (Qiagen, www.qiagen.com) and eluted with 30 μl of EB (10 mM Tris-Cl, pH 8.5). The purified products were analyzed using a Bioanalyzer (Agilent, www.home.agilent.com) with the DNA 7500 kit.

[0496] Visualization

[0497] Fluorescently labeled probe hybridization before staining

[0498] Before the tissue sections are applied, fluorescently labeled marker probes are hybridized to designated marker sequences printed on the capture probe array. After tissue visualization, the fluorescent marker probes help to orient the resulting image, so that the image can be combined with the expression profiles obtained after sequencing for the individual capture probe tag sequences. To hybridize the fluorescent probes, a hybridization solution containing 4xSSC with 0.1% SDS and 2 μM detection probe (P) is incubated at 50°C for 4 minutes. Meanwhile, the in-house array is attached to a ChipClip clip (Whatman). Then, 50 μL of hybridization solution is added to each well of the array, and the array is incubated at 50°C and 300 rpm for 30 minutes.

[0499] After incubation, remove the array from the ChipClip substrate clip and wash it according to the following steps: 1) 2xSSC, 0.1% SDS: 50°C and 300 rpm for 6 minutes; 2) 0.2xSSC: 300 rpm for 1 minute; and 3) 0.1xSSC: 300 rpm for 1 minute. Then, spin-dry the array.

[0500] General histological staining of FFPE tissue sections before or after labeled DNA synthesis

[0501] FFPE tissue sections immobilized on capture probe arrays are washed and rehydrated as described above after dewaxing and before labeled DNA synthesis, or washed as described above after labeled DNA synthesis. The tissue sections are then processed as follows: incubate in hematoxylin for 3 minutes; rinse with deionized water; incubate in tap water for 5 minutes; dip rapidly in acidified ethanol 8-12 times; wash twice with tap water for 1 minute each; wash with deionized water for 2 minutes; incubate in eosin for 30 seconds; wash three times with 95% ethanol for 5 minutes each; wash three times with 100% ethanol for 5 minutes each; rinse three times with xylene for 10 minutes each (can be overnight); mount with DPX and cover the slide with a coverslip; allow the slide to dry overnight in a fume hood.

[0502] General immunohistochemical staining of target proteins on FFPE tissue sections before or after labeled DNA synthesis

[0503] FFPE tissue sections immobilized on capture probe arrays are washed and rehydrated as described above after dewaxing and before labeled DNA synthesis, or washed as described above after labeled DNA synthesis. The tissue sections are then processed as follows, keeping them moist throughout the staining process: dilute the primary antibody in blocking buffer (1x Tris-buffered saline (50 mM Tris, 150 mM NaCl, pH 7.6), 4% donkey serum, and 0.1% Triton-X) and incubate the sections with the primary antibody overnight at room temperature in a humidified chamber; rinse three times with 1x TBS; incubate the sections at room temperature for 1 hour with a matching secondary antibody conjugated to a fluorochrome (FITC, Cy3, or Cy5). Rinse three times with 1x TBS, remove as much TBS as possible, mount the sections with ProLong Gold + DAPI (4',6-diamidino-2-phenylindole) reagent (Invitrogen), and analyze using a fluorescence microscope with a matching filter set.

[0504] Example 6

[0505] This experiment was conducted according to the principle of Example 5, but fragmented genomic DNA, rather than tissue, was applied to the array. The genomic DNA was pre-fragmented to average sizes of 200 bp and 700 bp, respectively. This experiment demonstrates that the principle works. Fragmented genomic DNA is very similar to the DNA in FFPE tissue.

[0506] Amplification of internal gene-specific products after labeled DNA synthesis and probe collection

[0507] The capture probe (covalently attached probe) was released in a PCR buffer containing a mixture of USER enzymes capable of cleaving uracil. The buffer contained 1x FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl2 (Roche), 200 μM dNTPs (New England Biolabs), and 0.1 U / 1 μl USER Enzyme (New England Biolabs).

[0508] The cleaved DNA was amplified to a final reaction volume of 50 μl. 1 μl of ID-specific forward primer (10 μM), 1 μl of gene-specific reverse primer (10 μM), and 1 μl of FastStart High Fidelity Enzyme Blend were added to 47 μl of the cleaved template. PCR amplification was performed in an Applied Biosystems thermal cycler according to the following program: a 2-minute hot start at 94°C, followed by 50 cycles consisting of 15 seconds at 94°C, 30 seconds at 55°C, 1 minute at 68°C, and a final extension at 68°C for 5 minutes.

[0509] Amplification of label-specific and gene-specific products after label DNA synthesis and probe collection.

[0510] The capture probe (covalently attached probe) was released using a PCR buffer containing a mixture of USER enzymes capable of cleaving uracil. This buffer contained 1x FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl2 (Roche), 200 μM dNTPs (New England Biolabs), and 0.1 U / 1 μl USER Enzyme (New England Biolabs). Then:

[0511] The cleaved DNA was amplified to a final reaction volume of 50 μl. 1 μl of label-specific forward primer (10 μM), 1 μl of gene-specific reverse primer (10 μM), and 1 μl of FastStart High Fidelity Enzyme Blend were added to 47 μl of the cleaved template. PCR amplification was performed in an Applied Biosystems thermal cycler according to the following program: a 2-minute hot start at 94°C, followed by 50 cycles consisting of 15 seconds at 94°C, 30 seconds at 55°C, 1 minute at 68°C, and a final extension at 68°C for 5 minutes.

[0512] Forward human genomic DNA primer

[0513] 5'-GACTGCTCTTTCACCCATC-3'(SEQ ID NO:47)

[0514] Reverse human genomic DNA primer

[0515] 5'-GGAGCTGCTGGTGCAGGG-3'(SEQ ID NO:48)

[0516] P-label-specific primer

[0517] 5'-ATCTCGACTGCCACTCTGAA-3'(SEQ ID NO:49)

[0518] The experimental results are shown in Figures 17 to 20. The figures show internal products amplified on the array. The peaks detected in Figures 17 and 18 are of the expected size, which indicates that genomic DNA can be captured and amplified. In Figures 19 and 20, the expected product appears as a smear, since random fragmentation of the genomic DNA and terminal transferase tailing generate a highly diverse sample pool.

[0519] Example 7

[0520] Alternative synthesis of capture probes in the 5' to 3' direction using a polymerase extension reaction and a terminal transferase tailing reaction

[0521] To prepare primers for the capture probe synthesis, a hybridization solution containing 4xSSC, 0.1% SDS, and 2 μM of extension primer (A-primer) was incubated at 50°C for 4 minutes. Simultaneously, a homemade array (see Example 1) was attached to a ChipClip (Whatman) clip. Then, 50 μL of hybridization solution was added to each well of the array, and the array was incubated at 50°C and 300 rpm for 30 minutes.

[0522] After incubation, the array was removed from the ChipClip substrate holder and washed in the following three steps: 1) with a 2xSSC solution containing 0.1% SDS at 50°C and 300 rpm for 6 minutes; 2) with a 0.2xSSC solution at 300 rpm for 1 minute; and 3) with a 0.1xSSC solution at 300 rpm for 1 minute. Then, the array was spun dry and returned to the ChipClip substrate holder.

[0523] Take 1 μl of Klenow Fragment (3' to 5' exo minus) (Illumina, www.illumina.com), 10x Klenow buffer, 2 mM each of dNTPs (Fermentas), and water, mix to form 50 μl of reaction mixture, and inject into each well using a pipette.

[0524] The array was cultured in an Eppendorf Thermomixer thermal cycler: 15°C for 15 minutes, 25°C for 15 minutes, 37°C for 15 minutes, and finally 75°C for 20 minutes.

[0525] After cultivation, remove the array from the ChipClip substrate holder and wash it according to the following three steps: 1) wash at 50°C with a 2xSSC solution containing 0.1% SDS at 300 rpm for 6 minutes; 2) wash with a 0.2xSSC solution at 300 rpm for 1 minute; and 3) wash with a 0.1xSSC solution at 300 rpm for 1 minute. Then, spin-dry the array and place it back into the ChipClip substrate holder.

[0526] For dT tailing, a 50 μl reaction mixture was prepared containing 1x TdT buffer (20 mM Tris-acetate (pH 7.9), 50 mM potassium acetate, and 10 mM magnesium acetate) (New England Biolabs, www.neb.com), 0.1 μg / μl BSA (New England Biolabs), 0.5 μl RNase H (5 U / μl), 1 μl TdT (20 U / μl), and 0.5 μl dTTP (100 mM). The mixture was added to the array surface, and the array was incubated in an Applied Biosystems thermal cycler at 37°C for 15 minutes, followed by 70°C for 10 minutes to inactivate TdT.

[0527] Example 8

[0528] Spatial transcriptomics using a 5' to 3' high probe density array and formalin-fixed frozen (FF-frozen) tissue, with USER system cleavage and amplification via terminal transferase

[0529] Fabrication of arrays

[0530] Pre-fabricated high-density microarray chips were ordered from Roche-Nimblegen (Madison, WI, USA). Each capture probe array contains 135,000 features, of which 132,460 carry capture probes containing a unique ID-tag sequence (localization domain) and a capture region (capture domain). Each feature is 13 x 13 μm in size. In the 5' to 3' direction, each capture probe consists of a universal domain containing 5 dUTP bases (cleavage domain) and a general amplification domain, an ID tag (localization domain), and a capture region (capture domain) (Figure 22 and Table 2). Each array is also fitted with a frame of probes (Figure 23) carrying a 30 bp universal sequence (Table 2) that allows hybridization of fluorescent probes, which helps with orientation during array visualization.
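The probe layout described above can be pictured as a simple record whose fields are concatenated in the 5' to 3' direction. The Python sketch below is only an illustration: the class name, the placement of the five U bases, the oligo-dT capture region, and all sequences are hypothetical placeholders, not the oligonucleotides of Table 2. Only the feature counts come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureProbe:
    """Domains of one capture probe, listed in 5' to 3' order."""
    universal_domain: str  # cleavage domain (5 dUTP bases) plus amplification domain
    id_tag: str            # localization domain, unique to one feature
    capture_domain: str    # capture region that hybridizes to the target

    def sequence(self) -> str:
        # Concatenate the domains in the 5' to 3' order given in the text.
        return self.universal_domain + self.id_tag + self.capture_domain


# Hypothetical placeholder sequences, NOT the Table 2 oligonucleotides.
probe = CaptureProbe(
    universal_domain="UUUUU" + "ACGTACGTACGTACGTACGT",
    id_tag="GATTACAGATTACAGATT",
    capture_domain="T" * 20,  # assumed oligo-dT capture region
)

# Feature counts stated for the array.
features_total = 135_000
features_with_probes = 132_460

print(probe.sequence())
print(f"features carrying capture probes: {features_with_probes / features_total:.1%}")
```

Because the ID tag sits between the universal domain and the capture domain, any cDNA extended from the capture domain carries the tag of the feature it was captured on.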

[0531] Tissue preparation—Preparation of formalin-fixed frozen tissues

[0532] The animal (mouse) was perfused with 50 ml of PBS and 100 ml of 4% formalin solution. After excision, the olfactory bulb was placed in a 4% formalin bath for 24 hours for post-fixation. The tissue was then treated with 30% sucrose dissolved in PBS for 24 hours to stabilize the morphology and remove excess formalin. The tissue was frozen to -40°C at a controlled rate and stored at -20°C in the laboratory. Parallel specimens were prepared in the same manner, but with a post-fixation time of 3 hours or with the post-fixation step omitted. Successful preparation was also achieved by perfusion with 2% formalin without post-fixation. Similarly, the sucrose treatment step can be omitted. The tissue was mounted in a cryostat and cut into 10 μm sections. One tissue section was placed on each capture probe array to be used. Optionally, for better tissue adhesion, the array chip could be incubated at 50°C for 15 minutes.

[0533] Optional control - total RNA preparation from tissue sections

[0534] Total RNA was extracted from a single tissue section (10 μm) using the RNeasy FFPE kit (Qiagen) according to the manufacturer's instructions. This total RNA served as a control for comparison with experiments in which RNA was captured onto the array directly from the tissue section. Accordingly, when total RNA was applied to the array, the staining, visualization, and tissue degradation steps were omitted.

[0535] On-chip reactions

[0536] Hybridization of the labeled probe to the framework probe, reverse transcription, nuclear staining, tissue digestion, and probe cleavage were all performed in a 16-well silicone gasket (ArrayIt, Sunnyvale, CA, USA) at a reaction volume of 50 μl per well. To prevent evaporation, the cassettes were covered with plate sealers (In Vitro AB, Stockholm, Sweden).

[0537] Optional - Tissue permeabilization before cDNA synthesis

[0538] Permeabilization was performed using Proteinase K (Qiagen, Hilden, Germany), which was diluted to 1 μg / ml with PBS. The solution was added to the wells, and the slides were incubated at room temperature for 5 minutes, followed by a gradual increase in temperature to 80°C over 10 minutes. The slides were briefly washed with PBS before the reverse transcription reaction.

[0539] Alternatively, after tissue attachment, permeabilization can be performed using a microwave. The slide is placed at the bottom of a glass jar containing 50 ml of 0.2xSSC (Sigma-Aldrich) and heated in a microwave oven at 800 W for 1 minute. Immediately after microwave treatment, the slide is placed on a paper towel and dried for 30 minutes in a chamber protected from unnecessary air exposure. After drying, the slide is briefly dipped in water (RNase-free / DNase-free) and finally spun dry in a centrifuge before cDNA synthesis is started.

[0540] cDNA synthesis

[0541] Reverse transcription was performed using the SuperScript III One-Step RT-PCR System with Platinum Taq (LifeTechnologies / Invitrogen, Carlsbad, CA, USA). The final volume of the reverse transcription reaction was 50 μl, containing 1x reaction mixture, 1x BSA (New England Biolabs, Ipswich, MA, USA), and 2 μl of SuperScript III RT / Platinum Taq mix. The solution was first heated to 50°C and then applied to tissue sections, where it was reacted at 50°C for 30 minutes. Subsequently, the reverse transcription solution was removed from the wells, and the slides were allowed to air dry for 2 hours.

[0542] Tissue visualization

[0543] After cDNA synthesis, nuclear staining and hybridization of the labeled probe with the framework probe (a probe attached to the array substrate for orientation of tissue samples on the array) were performed simultaneously. A PBS solution containing 300 nM DAPI and 170 nM labeled probe was prepared. This solution was added to the wells, and the slides were incubated at room temperature for 5 minutes, then briefly washed with PBS and dried.

[0544] Optionally, the labeled probe can be hybridized with the framework probe before placing the tissue on the array. The labeled probe is then diluted to 170 nM with hybridization buffer (4xSSC, 0.1% SDS). This solution is heated to 50°C, placed on the chip, and hybridized at 50°C and 300 rpm for 30 minutes. After hybridization, the slides are washed: first with 2xSSC, 0.1% SDS at 50°C and 300 rpm for 10 minutes, then with 0.2xSSC at 300 rpm for 1 minute, and finally with 0.1xSSC at 300 rpm for 1 minute. In this case, the staining solution after cDNA synthesis contains only DAPI nuclear staining agent diluted to 300 nM with PBS. The solution is added to the wells, the slides are incubated at room temperature for 5 minutes, then briefly washed with PBS and centrifuged.

[0545] The slides were examined using a Zeiss Axio Imager Z2 microscope and processed using MetaSystems.

[0546] Tissue removal

[0547] To digest the tissue sections, Proteinase K was diluted to 1.25 μg / μl in PKD buffer from the RNeasy FFPE kit (both from Qiagen). The tissue sections were incubated in this solution at 56°C for 30 minutes with interval mixing (300 rpm for 3 seconds followed by a 6-second pause). The slides were then washed: first with 2xSSC and 0.1% SDS at 50°C and 300 rpm for 10 minutes, then with 0.2xSSC at 300 rpm for 1 minute, and finally with 0.1xSSC at 300 rpm for 1 minute.

[0548] Probe release

[0549] A 16-well hybridization cassette (ArrayIt) with a silicone gasket was preheated to 37°C and attached to the Nimblegen slide. 50 μl of a cleavage mixture containing lysis buffer of unknown concentration (Takara), 0.1 U / μl USER enzyme (NEB), and 0.1 μg / μl BSA was preheated to 37°C and added to each well containing surface-immobilized cDNA. After removal of air bubbles, the slide was sealed and incubated at 37°C for 30 minutes in a Thermomixer Comfort with interval mixing (300 rpm for 3 seconds followed by a 6-second pause). After incubation, 45 μl of the cleavage mixture was collected from each used well and transferred to a 0.2 ml PCR tube (Figure 24).

[0550] Preparing a library

[0551] Exonuclease treatment

[0552] After the solutions had been cooled on ice for 2 minutes, Exonuclease I (NEB) was added to remove unextended cDNA probes, giving a final reaction volume of 46.2 μl and a final concentration of 0.52 U / μl. The PCR tubes were incubated at 37°C for 30 minutes in an Applied Biosystems thermal cycler, followed by 80°C for 25 minutes to inactivate the exonuclease.

[0553] dA-tailing with terminal transferase

[0554] Following the exonuclease treatment, a poly-A tailing reaction mixture containing TdT buffer (Takara), 3 mM dATP (Takara), and the manufacturer's TdT enzyme mixture (TdT and RNase H) (Takara) was prepared according to the manufacturer's instructions, and 45 μl of this mixture was added to each sample. The samples were incubated in a thermal cycler at 37°C for 15 minutes, followed by 70°C for 10 minutes to inactivate TdT.

[0555] Second-strand synthesis and PCR amplification

[0556] After the dA tailing reaction, four new 0.2 ml PCR tubes were prepared for each sample, and 23 μl of PCR premix was added to each tube, along with 2 μl of sample as a template. The final PCR reaction solution contained 1x Ex Taq buffer (Takara), 200 μM each of dNTPs (Takara), 600 nM A_primer (MWG), 600 nM B_dT20VN_primer (MWG), and 0.025 U / μl Ex Taq polymerase (Takara) (Table 2). The following cycle was performed in a thermal cycler to create the second strand of cDNA: 95°C for 3 minutes, 50°C for 2 minutes, and 72°C for 3 minutes. Then, 20 (library preparation) or 30 (cDNA confirmation) cycles were performed to amplify the samples: 95°C for 30 seconds, 67°C for 1 minute, 72°C for 3 minutes, followed by a final extension at 72°C for 10 minutes.
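The cycling program above can be written down as plain data to compare the programmed run time of the 20-cycle and 30-cycle variants. This Python sketch is illustrative: the function name is an assumption, and instrument ramp times between temperatures are ignored, so real run times will be longer.

```python
def program_minutes(initial_steps, cycle_steps, n_cycles, final_steps):
    """Sum the programmed hold times in minutes; ramp times are ignored."""
    return sum(initial_steps) + n_cycles * sum(cycle_steps) + sum(final_steps)


# Hold times in minutes, taken from the program described above.
second_strand = [3.0, 2.0, 3.0]  # 95 C, 50 C, 72 C (performed once)
cycle = [0.5, 1.0, 3.0]          # 95 C, 67 C, 72 C (repeated)
final_extension = [10.0]         # 72 C

for n_cycles, purpose in ((20, "library preparation"), (30, "cDNA confirmation")):
    minutes = program_minutes(second_strand, cycle, n_cycles, final_extension)
    print(f"{n_cycles} cycles ({purpose}): {minutes:.0f} min of programmed holds")
```

With these hold times, the 20-cycle program amounts to 108 minutes and the 30-cycle program to 153 minutes, excluding ramping.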

[0557] Library cleanup

[0558] After amplification, the four PCR reactions (100 μl in total) were mixed with 500 μl of binding buffer (Qiagen), loaded onto a Qiaquick PCR purification column (Qiagen), and centrifuged at 17,900 x g for 1 minute to bind the amplified cDNA to the membrane. The membrane was then washed with washing buffer (Qiagen) containing ethanol, and the cDNA was finally eluted with 50 μl of 10 mM Tris-Cl (pH 8.5).

[0559] The samples were further purified and concentrated by CA-purification (purification with superparamagnetic microbeads conjugated to carboxylic acid) using an MBS robot (Magnetic Biosolutions). A 10% PEG solution was used to remove fragments shorter than 150–200 bp. The amplified cDNA was allowed to bind to the CA-microbeads (Invitrogen) for 10 minutes and was then eluted with 15 μl of 10 mM Tris-Cl (pH 8.5).

[0560] Library quality analysis

[0561] The samples amplified after 30 cycles were analyzed using an Agilent bioanalyzer. Depending on the amount of sample material, either a DNA High Sensitivity Kit or a DNA 1000 Kit was selected to confirm the presence of the amplified cDNA library.

[0562] Preparing sequencing libraries

[0563] Create a library index

[0564] Sequencing libraries were prepared using samples amplified for 20 cycles. An index PCR premix was prepared for each sample. 23 μl of this premix was added to six 0.2 ml PCR tubes, and 2 μl of amplified and purified cDNA was added to each tube as a template, resulting in a PCR reaction solution containing 1x Phusion premix (Fermentas), 500 nM InPE1.0 (Illumina), 500 nM Index 1-12 (Illumina), and 0.4 nM InPE2.0 (Illumina). The samples were amplified in a thermal cycler for 18 cycles according to the following program: 98°C for 30 seconds, 65°C for 30 seconds, 72°C for 1 minute, followed by a final extension at 72°C for 5 minutes.

[0565] Sequencing library cleanup

[0566] After amplification, the six PCR reactions (150 μl in total) were mixed with 750 μl of binding buffer (Qiagen), loaded onto a Qiaquick PCR purification column (Qiagen), and centrifuged at 17,900 x g for 1 minute to bind the amplified cDNA to the membrane. Because of the large sample volume (900 μl), the sample was divided into two aliquots of 450 μl each and bound in two separate steps. The membrane was then washed with washing buffer (Qiagen) containing ethanol, and the cDNA was finally eluted with 50 μl of 10 mM Tris-Cl (pH 8.5).

[0567] The samples were further purified and concentrated by CA-purification using an MBS robot (Magnetic Biosolutions). A 7.8% PEG solution was used to remove fragments smaller than 300-350 bp. The amplified cDNA was allowed to bind to the CA beads (Invitrogen) for 10 minutes and was then eluted with 15 μl of 10 mM Tris-Cl (pH 8.5). The samples were analyzed using an Agilent bioanalyzer according to the manufacturer's instructions, with either a DNA High Sensitivity kit or a DNA 1000 kit selected depending on the amount of sample material, to confirm the presence and size of the finished amplified library.

[0568] Sequencing

[0569] The libraries were sequenced on an Illumina HiSeq 2000 or MiSeq, depending on the required data throughput, according to the manufacturer's instructions. Optionally, for Read 2, the custom sequencing primer B_r2 was used to avoid sequencing through the homopolymeric stretch of 20 T.

[0570] Data Analysis

[0571] The 42 bases at the 5' end of Read 1 and the 25 bases at the 5' end of Read 2 were trimmed (trimming of Read 2 is unnecessary if the custom primer was used). The reads were then mapped with bowtie to the repeat-masked Mus musculus 9 genome assembly, and the output was formatted as a SAM file. Mapped reads were extracted and annotated with UCSC refGene gene annotations. Indexes were retrieved using "indexFinder" (custom software for index retrieval). A MongoDB database was then built containing all captured transcripts and their respective index positions on the chip.
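The trimming and tallying steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the "indexFinder" software or the bowtie and MongoDB pipeline: the function names are hypothetical, and the (feature ID, gene) records are assumed to have already been produced by mapping and index retrieval. Only the trim lengths come from the text.

```python
from collections import Counter

READ1_TRIM = 42  # bases removed from the 5' end of Read 1
READ2_TRIM = 25  # bases removed from the 5' end of Read 2


def trim_pair(read1, read2, custom_read2_primer=False):
    """Trim the 5' ends; Read 2 is left intact when the custom primer was used."""
    trimmed1 = read1[READ1_TRIM:]
    trimmed2 = read2 if custom_read2_primer else read2[READ2_TRIM:]
    return trimmed1, trimmed2


def count_by_feature(records):
    """Tally (feature_id, gene) pairs, i.e. transcripts per array position."""
    return Counter(records)


# Hypothetical example data.
r1, r2 = trim_pair("A" * 42 + "GGGCCC", "C" * 25 + "TTTAAA")
counts = count_by_feature([("id_001", "geneA"), ("id_001", "geneA"), ("id_002", "geneA")])
print(r1, r2)
print(counts[("id_001", "geneA")])
```

A table of this kind, keyed by feature ID and gene, is the sort of structure a database of captured transcripts and their index positions would hold.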

[0572] Connecting MATLAB software to the database allows spatial visualization and analysis of the data (Figure 26).
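The idea behind the spatial visualization can be sketched as placing per-feature transcript counts onto a grid that mirrors the array layout. The example below is written in Python rather than MATLAB and does not reproduce the tool used in the experiments. The function name, the feature IDs, and the row and column coordinates are hypothetical.

```python
def expression_grid(counts, positions, n_rows, n_cols):
    """Place per-feature counts on a grid that mirrors the array layout.

    counts:    {feature_id: number of reads for the gene of interest}
    positions: {feature_id: (row, col) of that feature on the array}
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for feature_id, n_reads in counts.items():
        row, col = positions[feature_id]
        grid[row][col] += n_reads
    return grid


# Hypothetical 2 x 2 array with two features that captured reads.
positions = {"id_001": (0, 1), "id_002": (1, 0)}
counts = {"id_001": 3, "id_002": 1}

for row in expression_grid(counts, positions, n_rows=2, n_cols=2):
    print(row)
```

A grid of this kind can then be rendered as a heat map and overlaid on the tissue image.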

[0573] Optionally, fluorescently labeled frame probes can be used to overlay the visualized data with the microscopic images to achieve accurate alignment and enable the extraction of spatial transcriptome data.

[0574] Example 9

[0575] Spatial transcriptomics using a 3' to 5' high probe density array and FFPE tissue, with MutY system cleavage and TdT-based amplification

[0576] Fabrication of arrays

[0577] Prefabricated high-density microarray chips were ordered from Roche-Nimblegen (Madison, WI, USA). Each capture probe array contains 72,000 features, of which 66,022 carry a unique complementary ID-tag sequence. Each feature is 16 x 16 μm in size. The capture probes are configured in the 3' to 5' direction in the same way as the probes used for the in-house printed 3' to 5' arrays, except that three additional bases are added to the upstream universal handle (P') of the probe, making it a longer version of P', namely LP' (Table 2). Each array is also fitted with a frame of probes carrying a 30 bp universal sequence that allows hybridization of fluorescent probes and aids orientation during array visualization.

[0578] Synthesis of 5' to 3' oriented capture probes

[0579] The capture probes in the 5' to 3' direction were synthesized on the high-density array in the same way as on the in-house printed arrays, except that the extension and ligation steps were performed at 55°C for 15 minutes followed by 72°C for 15 minutes. The A-handle probe (Table 2) contains an A / G mismatch to allow subsequent probe release via the MutY enzymatic system, as described below. The P-probe was replaced with the longer LP version to match the longer probes on the surface.

[0580] Preparation and dewaxing of formalin-fixed paraffin-embedded tissues

[0581] This step was performed as in the protocol for the in-house printed arrays described above.

[0582] cDNA synthesis and staining

[0583] cDNA synthesis and staining were performed as in the protocol for the 5' to 3' oriented high-density Nimblegen arrays (Example 8), except that biotin-labeled dCTP and dATP were added to the cDNA synthesis together with the four conventional dNTPs (each present at a 25-fold excess over the biotin-labeled nucleotides).

[0584] Tissue removal

[0585] Tissue removal was performed as in the protocol for the 5' to 3' oriented high-density Nimblegen arrays described in Example 8.

[0586] Probe cleavage by MutY

[0587] A 16-well hybridization cassette (ArrayIt) with a silicone gasket was preheated to 37°C and attached to the Codelink slide. 50 μl of a cleavage mixture containing 1x Endonuclease VIII Buffer (NEB), 10 U / μl MutY (Trevigen), 10 U / μl Endonuclease VIII (NEB), and 0.1 μg / μl BSA was preheated to 37°C and added to each well containing surface-immobilized cDNA. After removal of air bubbles, the slide was sealed and incubated at 37°C for 30 minutes in a Thermomixer Comfort with interval mixing (300 rpm for 3 seconds followed by a 6-second pause). After incubation, the seal was removed, and 40 μl of the cleavage mixture was collected from each used well and transferred to a PCR tube.

[0588] Preparing a library

[0589] Biotin-streptavidin-mediated library cleanup

[0590] To remove unextended cDNA probes and change the buffer, the sample was purified by binding biotin-labeled cDNA to streptavidin-coated C1 microbeads (Invitrogen) and washing the microbeads with 0.1M NaOH (freshly prepared). Purification was performed using an MBS robot (Magnetic Biosolutions). The biotin-labeled cDNA was allowed to bind to the C1 microbeads (Invitrogen) for 10 minutes, followed by elution with 20 μl of water. Specifically, the microbead-water solution was heated to 80°C to disrupt the biotin-streptavidin link.

[0591] dA-tailing with terminal transferase

[0592] Following the purification steps, a poly-A tailing reaction mixture containing lysis buffer (Takara, Cellamp Whole Transcriptome Amplification Kit), TdT buffer (Takara), 1.5 mM dATP (Takara), and a TdT enzyme mixture (TdT and RNase H) (Takara) was prepared according to the manufacturer's instructions. 18 μl of each sample was transferred to a new 0.2 ml PCR tube and mixed with 22 μl of the poly-A tailing premix to give a 40 μl reaction mixture. The mixture was incubated in a thermal cycler at 37°C for 15 minutes, followed by 70°C for 10 minutes to inactivate TdT.

[0593] Second-strand synthesis and PCR amplification

[0594] After the dA tailing reaction, four new 0.2 ml PCR tubes were prepared for each sample. 23 μl of PCR premix was added to each tube, and 2 μl of sample was added as a template. The final PCR reaction solution contained 1x Ex Taq buffer (Takara), 200 μM each of dNTPs (Takara), 600 nM A_primer (MWG), 600 nM B_dT20VN_primer (MWG), and 0.025 U / μl Ex Taq polymerase (Takara). To create the second strand of cDNA, the following cycle was performed once in a thermal cycler: 95°C for 3 minutes, 50°C for 2 minutes, and 72°C for 3 minutes. Then, the samples were amplified by performing 20 (library preparation) or 30 (cDNA confirmation) cycles as follows: 95°C for 30 seconds, 67°C for 1 minute, 72°C for 3 minutes, followed by a final extension at 72°C for 10 minutes.

[0595] Library cleanup

[0596] After amplification, the four PCR reactions (100 μl in total) were mixed with 500 μl of binding buffer (Qiagen), loaded onto a Qiaquick PCR purification column (Qiagen), and centrifuged at 17,900 x g for 1 minute to bind the amplified cDNA to the membrane. The membrane was then washed with washing buffer (Qiagen) containing ethanol, and the cDNA was finally eluted with 50 μl of 10 mM Tris-Cl (pH 8.5).

[0597] The samples were further purified and concentrated by CA-purification (purification with superparamagnetic microbeads conjugated to carboxylic acid) using an MBS robot (Magnetic Biosolutions). A 10% PEG solution was used to remove fragments shorter than 150-200 bp. The amplified cDNA was allowed to bind to the CA-microbeads (Invitrogen) for 10 minutes and was then eluted with 15 μl of 10 mM Tris-Cl (pH 8.5).

[0598] Second PCR amplification

[0599] The final PCR reaction solution contained 1x Ex Taq buffer (Takara), 200 μM each of dNTPs (Takara), 600 nM A-primer (MWG), 600 nM B-primer (MWG), and 0.025 U / μl Ex Taq polymerase (Takara). The sample was heated to 95°C and held for 3 minutes, then subjected to 30 cycles of the following procedure: 95°C for 30 seconds, 65°C for 1 minute, 72°C for 3 minutes, followed by a final extension at 72°C for 10 minutes.

[0600] Second Library Cleanup

[0601] After amplification, 500 μl of binding buffer (Qiagen) was added to each of the four PCR reaction tubes (100 μl), and the tubes were placed in a Qiaquick PCR purification column (Qiagen) and centrifuged at 17,900 x g for 1 minute to bind the amplified cDNA to the membrane. The membrane was then washed with washing buffer (Qiagen) containing ethanol, and finally eluted with 50 μl of 10 mM Tris-Cl (pH 8.5).

[0602] The samples were further purified and concentrated by CA-purification (purification with superparamagnetic microbeads conjugated to carboxylic acid) using an MBS robot (Magnetic Biosolutions). A 10% PEG solution was used to remove fragments shorter than 150–200 bp. The amplified cDNA was allowed to bind to the CA-microbeads (Invitrogen) for 10 minutes and was then eluted with 15 μl of 10 mM Tris-Cl (pH 8.5).

[0603] Preparing sequencing libraries

[0604] Create a library index

[0605] Sequencing libraries were prepared using samples amplified for 20 cycles. An index PCR premix was prepared for each sample; 23 μl was added to six 0.2 ml PCR tubes, and 2 μl of amplified and purified cDNA was added to each tube as a template, resulting in a PCR reaction solution containing 1x Phusion premix (Fermentas), 500 nM InPE1.0 (Illumina), 500 nM Index1-12 (Illumina), and 0.4 nM InPE2.0 (Illumina). The samples were amplified in a thermal cycler for 18 cycles according to the following program: 98°C for 30 seconds, 65°C for 30 seconds, 72°C for 1 minute, followed by a final extension at 72°C for 5 minutes.

[0606] Sequencing library cleanup

[0607] Following amplification, the samples were purified and concentrated by CA-purification using an MBS robot (Magnetic Biosolutions). A 7.8% PEG solution was used to remove fragments shorter than 300-350 bp. The amplified cDNA was allowed to bind to the CA-microbeads (Invitrogen) for 10 minutes and was then eluted with 15 μl of 10 mM Tris-Cl (pH 8.5).

[0608] Take 10 μl of the amplified and purified sample, place it on a Caliper XT chip, and excise fragments between 480 bp and 720 bp using Caliper XT (Caliper). Analyze the sample using an Agilent bioanalyzer with a DNA High Sensitivity Kit to confirm the presence and size of the completed amplified library.

[0609] Sequencing and data analysis

[0610] Sequencing and bioinformatics analysis were performed as in the protocol for the 5' to 3' oriented high-density Nimblegen arrays described in Example 8. However, in the data analysis, Read 1 was not used for transcript mapping. Specific Olfr transcripts could be selected using the Matlab visualization tool (Figure 27).

[0611] Example 10

[0612] Spatial transcriptomics using an in-house printed 41-tag microarray with 5' to 3' oriented probes and formalin-fixed frozen (FF-frozen) tissue, with permeabilization by Proteinase K or microwave treatment, USER system cleavage, and TdT-based amplification

[0613] Array fabrication

[0614] The in-house arrays were printed as described above, except that the pattern contained 41 probes with unique ID tags, with the same probe configuration as the 5' to 3' oriented high-density array in Example 8.

[0615] All other steps are implemented in the same manner as described in Example 8.

[0616] Example 11

[0617] Alternative methods for performing the cDNA synthesis step

[0618] The on-chip cDNA synthesis described above can also be combined with template switching, in which a template-switching primer is added to the cDNA synthesis reaction to create the second strand (Table 2). The terminal bases that the reverse transcriptase adds to the 3' end of the first cDNA strand anneal to the template-switching primer, which introduces a second amplification domain and directs synthesis of the second strand. Once the double-stranded product is released from the array surface, it can readily be amplified directly into a library.

[0619] Example 12

[0620] Spatial genomics using an in-house printed 41-tag microarray with 5' to 3' oriented probes, poly-A tailed gDNA fragments, USER system cleavage, and amplification directed by TdT-tailing primers or translocation-specific primers

[0621] Array fabrication

[0622] As described above, a self-made dot array was created using Codelink substrates (Surmodics), but the pattern used contained 41 probes, each with a unique ID tag. The configuration of the probes was the same as the high-density array in the 5' to 3' direction in Example 8.

[0623] Total DNA preparation of cells

[0624] DNA fragmentation

[0625] Genomic DNA (gDNA) was extracted from A431 and U2OS cell lines using the DNeasy kit (Qiagen) according to the manufacturer's instructions. The DNA was then fragmented into 500bp fragments using a Covaris ultrasonic sample fragmentation device (Covaris) according to the manufacturer's instructions.

[0626] The samples were purified and concentrated by CA-purification (purification with superparamagnetic microbeads conjugated to carboxylic acid) using an MBS robot (Magnetic Biosolutions). A 10% PEG solution was used to remove fragments shorter than 150–200 bp. The fragmented DNA was allowed to bind to the CA-microbeads (Invitrogen) for 10 minutes and was then eluted with 15 μl of 10 mM Tris-Cl (pH 8.5).

[0627] Optional control - spike-in of different cell lines

[0628] By spiking A431 DNA into U2OS DNA, different levels of capture sensitivity can be measured, for example with spike-ins of 1%, 10%, or 50% A431 DNA.
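The amounts needed for each spike-in level follow from simple proportion. The Python sketch below assumes that the percentages refer to mass fractions of the total DNA and uses a hypothetical total of 100 ng; neither assumption is stated in the text, and the function name is illustrative.

```python
def spike_in_masses(total_ng, spike_fraction):
    """Split a total DNA mass into spiked-in (A431) and background (U2OS) parts."""
    if not 0.0 <= spike_fraction <= 1.0:
        raise ValueError("spike_fraction must be between 0 and 1")
    spiked_ng = total_ng * spike_fraction
    return spiked_ng, total_ng - spiked_ng


# Hypothetical total of 100 ng; percentages are assumed to be by mass.
for fraction in (0.01, 0.10, 0.50):
    a431_ng, u2os_ng = spike_in_masses(100.0, fraction)
    print(f"{fraction:.0%} spike-in: {a431_ng:.1f} ng A431 + {u2os_ng:.1f} ng U2OS")
```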

[0629] dA-tailing with terminal transferase

[0630] A poly-A tailing reaction mixture containing TdT Buffer (Takara), 3 mM dATP (Takara), and a TdT enzyme mixture (TdT and RNase H) (Takara) was prepared according to the manufacturer's instructions, and 45 μl of this mixture was mixed with 0.5 μg of fragmented DNA. The mixture was incubated in a thermal cycler at 37°C for 30 minutes, followed by 80°C for 20 minutes to inactivate TdT. The dA-tailed fragments were then cleaned up on a Qiaquick column (Qiagen) according to the manufacturer's instructions, and the concentration was measured with the Qubit system (Invitrogen) according to the manufacturer's instructions.

[0631] On-chip experiment

[0632] Hybridization, second-strand synthesis, and cleavage reactions were all performed on chip in a 16-well silicone gasket (ArrayIt, Sunnyvale, CA, USA). The cassettes were covered with plate sealers to prevent evaporation.

[0633] Hybridization

[0634] 117 ng of DNA was added to a well of a preheated array (50°C) in a total volume of 45 μl containing 1x NEB buffer (New England Biolabs) and 1x BSA. The mixture was incubated for 30 minutes at 37°C and 300 rpm in a Thermomixer Comfort (Eppendorf) fitted with an MTP block.

[0635] Second-strand synthesis

[0636] Without removing the hybridization mixture, add 15 μl of Klenow extension reaction mixture containing 1x NEB buffer, 1.5 μl of Klenow polymerase, and 3.75 μl of dNTPs (2 mM each) to the wells. Incubate the mixture at 37°C for 30 minutes in a Thermomixer Comfort (Eppendorf) without shaking.

[0637] Then, the slides were washed: 0.1% SDS in 2xSSC solution, 50°C, 300 rpm for 10 minutes; 0.2xSSC solution, 300 rpm for 1 minute; and 0.1xSSC solution, 300 rpm for 1 minute.

[0638] Probe release

[0639] 50 μl of a mixture containing 1x FastStart High Fidelity Reaction Buffer with 1.8 mM MgCl2 (Roche), 200 μM dNTPs (New England Biolabs), and 0.1 U / μl USER Enzyme (New England Biolabs) was heated to 37°C, added to each well, and incubated at 37°C for 30 minutes with interval mixing (300 rpm for 3 seconds followed by a 6-second pause) in a Thermomixer Comfort (Eppendorf). The reaction mixture containing the released DNA was then recovered from the wells with a pipette.

[0640] Preparing a library

[0641] Amplification reaction

[0642] The amplification reaction was performed using 10 μl of reaction solution containing 7.5 μl of released sample, 1 μl of each primer, and 0.5 μl of enzyme (Roche, FastStartHiFi PCR system). The reaction was performed according to the following program: 94℃ for 2 minutes; followed by one cycle: 94℃ for 15 seconds, 55℃ for 2 minutes, 72℃ for 2 minutes; followed by 30 cycles: 94℃ for 15 seconds, 65℃ for 30 seconds, 72℃ for 90 seconds, and a final extension at 72℃ for 5 minutes.

[0643] In the preparation of sequencing libraries, two types of primers are used, including a surface probe A-handle and a specific translocation primer (A431) or a specific SNP primer linked to the B-handle (Table 2).

[0644] The samples were further purified and concentrated by CA-purification (purification with superparamagnetic microbeads conjugated to carboxylic acid) using an MBS robot (Magnetic Biosolutions). A 10% PEG solution was used to remove fragments shorter than 150–200 bp. The amplified DNA was allowed to bind to the CA-microbeads (Invitrogen) for 10 minutes and was then eluted with 15 μl of 10 mM Tris-Cl (pH 8.5).

[0645] Library quality analysis

[0646] The samples were analyzed using an Agilent bioanalyzer. Depending on the amount of sample material, either a DNA High Sensitivity kit or a DNA 1000 kit was selected to confirm the presence of the amplified cDNA library.

[0647] Create a library index

[0648] Sequencing libraries were prepared using samples amplified for 20 cycles. An index PCR premix was prepared for each sample, with 23 μl added to six 0.2 ml PCR tubes. 2 μl of amplified and purified cDNA was added to each tube as a template. The resulting PCR reaction solution contained 1x Phusion premix (Fermentas), 500 nM InPE1.0 (Illumina), 500 nM Index 1-12 (Illumina), and 0.4 nM InPE2.0 (Illumina). The samples were amplified in a thermal cycler according to the following program for 18 cycles: 98°C for 30 seconds, 65°C for 30 seconds, 72°C for 1 minute; followed by a final extension at 72°C for 5 minutes.

[0649] Sequencing library cleanup

[0650] The samples were further purified and concentrated by CA-purification using an MBS robot (Magnetic Biosolutions). A 7.8% PEG solution was used to remove fragments smaller than 300-350 bp. The amplified DNA was allowed to bind to the CA-beads (Invitrogen) for 10 minutes and was then eluted with 15 μl of 10 mM Tris-Cl (pH 8.5). The samples were analyzed using an Agilent bioanalyzer according to the manufacturer's instructions, with either the DNA High Sensitivity Kit or the DNA 1000 Kit selected depending on the amount of sample material, to confirm the presence and size of the amplified library (Figure 29).

[0651] Sequencing

[0652] Sequencing was performed according to the operating procedure for the high-density Nimblegen array in the 5' to 3' orientation as described in Example 8.

[0653] Data Analysis

[0654] Data analysis was used to determine the capture sensitivity of the ID-capture probes in the array. Read pairs were first sorted and classified according to the translocation primers or SNP primers contained in Read 2, and then further sorted according to the IDs contained in the corresponding Read 1.
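The two-level sorting described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline actually used: the primer sequences in `PRIMERS`, the ID length `ID_LENGTH`, the placement of the ID at the start of Read 1 and of the primer at the start of Read 2, and the function names are all hypothetical. Exact matching is used for simplicity, whereas real data would normally require tolerance for sequencing errors.

```python
from collections import defaultdict

# Hypothetical primer sequences; the real ones are those listed in Table 2.
PRIMERS = {
    "translocation_fw": "ACGTAC",
    "snp_1": "TTGCAA",
}
ID_LENGTH = 4  # hypothetical length of the ID at the start of Read 1


def classify_read2(read2, primers=PRIMERS):
    """Return the name of the primer that Read 2 starts with, or None."""
    for name, sequence in primers.items():
        if read2.startswith(sequence):
            return name
    return None


def sort_reads(pairs, primers=PRIMERS, id_length=ID_LENGTH):
    """Group read pairs by Read 2 primer, then count them per Read 1 ID."""
    grouped = defaultdict(lambda: defaultdict(int))
    for read1, read2 in pairs:
        primer = classify_read2(read2, primers)
        if primer is None:
            continue  # Read 2 carries none of the known primers
        grouped[primer][read1[:id_length]] += 1
    return {primer: dict(ids) for primer, ids in grouped.items()}


if __name__ == "__main__":
    example_pairs = [
        ("AAAATTTT", "ACGTACGGG"),
        ("AAAACCCC", "ACGTACTTT"),
        ("CCCCGGGG", "TTGCAAGGA"),
        ("GGGGAAAA", "NNNNNNNNN"),  # no known primer, skipped
    ]
    print(sort_reads(example_pairs))
    # {'translocation_fw': {'AAAA': 2}, 'snp_1': {'CCCC': 1}}
```

The per-ID counts for each primer class give a simple measure of how many reads each ID-capture probe recovered.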

[0655] Optional control - direct amplification of cell line-specific translocations

[0656] This step uses PCR to directly determine the capture sensitivity for the cell lines with peak values. Using the forward and reverse primers for the A431 translocation (Table 2), an attempt was made to detect the translocation in the replicated and released second-strand product (Figure 30).

[0657] Table 2. Oligonucleotides used in spatial transcriptomics and spatial genomics research

[0658]

[0659]

[0660]

Claims

1. A non-diagnostic method for local detection of RNA in a tissue section, the method comprising: (a) providing an array comprising a plurality of features on a substrate, wherein each feature occupies a different position on the array, wherein a feature among the plurality of features comprises a plurality of capture probes fixed thereon, and wherein the plurality of capture probes comprises nucleic acid molecules having, in the 5' to 3' direction: (i) a first sequence comprising a positioning domain containing a nucleotide sequence specific to the feature; and (ii) a second sequence comprising a capture domain containing a nucleotide sequence complementary to the RNA to be detected; (b) contacting the array with the tissue section and allowing at least one RNA from the tissue section to hybridize with the capture domain of the capture probe; (c) generating a cDNA molecule by extending the capture probe using the hybridized RNA as an extension template, such that the generated cDNA molecule contains the nucleotide sequence of the positioning domain; (d) releasing the generated cDNA molecule or a portion thereof, or a second strand complementary to the generated cDNA molecule or a portion thereof, from the array; (e) identifying the nucleotide sequence of the positioning domain in the released cDNA molecule or its complementary strand; and (f) associating the nucleotide sequence of the positioning domain with different positions on the array to detect the RNA in the tissue section.

2. The method according to claim 1, characterized in that the method further comprises, between steps (c) and (d), generating a second strand complementary to the generated cDNA molecule or a portion thereof.

3. The method according to claim 1, characterized in that step (d) comprises releasing the generated cDNA molecule or a portion thereof, or a second strand complementary to the generated cDNA molecule or a portion thereof, from the capture domain of the capture probe by denaturation.

4. The method according to claim 1, characterized in that the method further comprises a step of amplifying the generated cDNA or a portion thereof released from the array, and a second strand complementary to the generated cDNA or a portion thereof.

5. The method according to claim 1, characterized in that the capture probe further comprises a cleavage domain located on the 5' side of the positioning domain.

6. The method according to claim 5, characterized in that the method further comprises cleaving the cleavage domain of the capture probe, wherein the cleaving comprises applying a cleaving enzyme that recognizes the nucleotide sequence in the cleavage domain and cleaves the generated cDNA molecule at a location on the 5' side of the positioning domain, thereby releasing the generated cDNA molecule or a portion thereof, or the second strand or a portion thereof, from the features on the array.

7. The method according to claim 1, characterized in that the RNA is selected from mRNA, tRNA, rRNA, viral RNA, small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), microRNA (miRNA), small interfering RNA (siRNA), piwi-interacting RNA (piRNA), ribozyme RNA, and antisense RNA.

8. The method according to claim 7, characterized in that the RNA is mRNA.

9. The method according to claim 8, characterized in that the capture domain of the capture probe hybridizes with the poly-A tail of the mRNA.

10. The method according to claim 9, characterized in that the capture domain comprises a poly-T sequence.

11. The method according to claim 1, characterized in that the method further comprises a step of staining the tissue section prior to step (c).

12. The method of claim 1, further comprising associating the nucleotide sequence obtained in step (f) with imaging of the tissue section, wherein the method includes imaging the tissue section prior to step (c).

13. The method according to claim 1, characterized in that a capture probe among the plurality of capture probes has a free 3' end.

14. The method according to claim 1, characterized in that the capture probe is fixed to the substrate via a linker.

15. The method according to claim 1, characterized in that the array is a bead array and the capture probe is fixed to the beads of the bead array.

16. The method according to claim 1, characterized in that the array comprises at least 1,000 features.

17. The method according to claim 1, characterized in that the features of the array have an average diameter of less than 100 micrometers.

18. The method according to claim 1, characterized in that the substrate comprises one or more arrays.

19. The method according to claim 18, characterized in that the one or more arrays contain the same positioning domain.

20. The method according to claim 18, characterized in that the one or more arrays contain different positioning domains.

21. The method according to claim 1, characterized in that the capture probe further comprises a primer binding site.

22. The method according to claim 1, characterized in that the tissue section is a fixed or fresh-frozen tissue section.

23. The method according to claim 22, characterized in that the fixed tissue section is a formalin-fixed paraffin-embedded tissue section.

24. The method according to claim 1, characterized in that the plurality of features comprises features deposited by printing or photolithography.

25. The method according to claim 1, characterized in that identifying the nucleotide sequence of the positioning domain comprises sequencing.
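
For illustration only, and without limiting the claims, the association of positioning-domain sequences with array positions recited in steps (e) and (f) of claim 1 can be sketched in Python. The lookup table `FEATURE_POSITIONS`, its sequences and coordinates, and the function name `locate_reads` are hypothetical assumptions; in practice the table would be defined by the array design, and exact matching is used here for simplicity.

```python
from collections import Counter

# Hypothetical lookup: positioning-domain sequence -> (row, column) of the
# feature on the array.
FEATURE_POSITIONS = {
    "AAAA": (0, 0),
    "CCCC": (0, 1),
    "GGGG": (1, 0),
}


def locate_reads(positioning_sequences, feature_positions=FEATURE_POSITIONS):
    """Count identified positioning-domain sequences per array position.

    Sequences that match no feature in the lookup table are ignored.
    """
    counts = Counter()
    for sequence in positioning_sequences:
        position = feature_positions.get(sequence)
        if position is not None:
            counts[position] += 1
    return dict(counts)


if __name__ == "__main__":
    identified = ["AAAA", "AAAA", "GGGG", "TTTT"]
    print(locate_reads(identified))
    # {(0, 0): 2, (1, 0): 1}
```

The resulting counts per position can then be overlaid on an image of the tissue section, as in claim 12.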