Host transcriptome fecal biomarkers for assessment of colorectal cancer
Fecal RNA biomarkers provide a non-invasive and accurate diagnostic tool for colorectal cancer, addressing the limitations of existing methods by enhancing sensitivity and specificity in detecting and monitoring the disease.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- YEDA RES & DEV CO LTD
- Filing Date
- 2025-12-09
- Publication Date
- 2026-06-18
Abstract
Description
[0001] HOST TRANSCRIPTOME FECAL BIOMARKERS FOR ASSESSMENT OF COLORECTAL CANCER
[0002] RELATED APPLICATIONS
[0003] This application claims the benefit of priority of U.S. Provisional Patent Application No. 63/730001, filed December 10, 2024, and of U.S. Provisional Patent Application No. 63/879738, filed September 11, 2025, the contents of which are incorporated herein by reference in their entirety.
[0004] FIELD OF THE INVENTION
[0005] The invention relates to diagnostic tools and methods for analyzing and detecting colorectal cancer and its progression.
[0006] BACKGROUND OF THE INVENTION
[0007] The current state of the art for diagnostics of gastrointestinal diseases, including inflammatory bowel diseases (IBD), celiac disease and colorectal cancer (CRC), is the use of endoscopy procedures. These are invasive, costly methods that carry risks and involve lengthy preparation. Screening colonoscopies are currently recommended for all adults over the age of 45 for the detection and removal of malignant and pre-malignant lesions (Carethers, J. M. Front. Oncol. 12, 966998 (2022)). However, compliance is generally low (50-60%; Shaukat, A. & Levin, T. R. Nat. Rev. Gastroenterol. Hepatol. 19, 521-531 (2022)). The fecal immunochemical test (FIT) is a widely used alternative non-invasive approach, yet it suffers from limited sensitivity and specificity, particularly at the early stages of advanced adenomas (Cao, L.-J. et al. BMC Med. 19, 250 (2021)). Stool-based molecular assays for CRC diagnostics include measurements of microRNAs, DNA mutations and methylations and microbiome composition (Coleman, D. & Kuwada, S. Genes 15, 338 (2024); Imperiale, T. F. et al. N. Engl. J. Med. 390, 984-993 (2024); Porcaro, F., et al. Oncol. Rev. 18 (2024); Chen, G. et al. Front. Immunol. 15 (2024)). While informative, such approaches do not recapitulate the range of molecular changes associated with the disease.
[0008] The gut sheds massive amounts of cells every day. These include epithelial cells, immune cells and stromal cells. Some of the present inventors have recently shown that fecal wash host transcriptomics provides robust signals for characterizing the molecular changes in the intestines of IBD patients (Ungar et al. Gut 2022;71:1988-1997). Another recent publication to the inventors and coworkers relates to the ability of distal fecal wash host transcriptomics to identify inflammation throughout the colon and terminal ileum (Dan et al. Cell Mol Gastroenterol Hepatol 2023;16:1-15). WO 2023/002491, to some of the present inventors and coworkers, relates to methods of diagnosing gastric diseases and more particularly inflammatory bowel diseases, comprising analyzing the RNA expression level of human genes in a fecal RNA sample of the subject. However, these studies were performed on patients who had undergone lengthy preparation for colonoscopies, which massively reduced the amounts of bacteria in their guts. While stool host transcriptomics provided valuable information on the gut’s molecular state in neonates, where microbial load is minimal, the abundance of bacteria in the adult colon prohibited the simultaneous measurement of more than a few dozen genes. Barnell et al. (JAMA. 2023;330(18):1760-1768) discuss the evaluation of a multitarget stool RNA (mt-sRNA) test for CRC screening.
[0009] A recent publication to the present inventors discloses that stool shed cell transcriptomics mirrors tumor biology and enables colorectal cancer diagnosis (Bahar Halpern et al., Sci Rep. 2025;15(1):34413).
[0010] There remains a long-felt need for additional diagnostic tools for providing early, accurate and minimally invasive means for diagnosing and monitoring gastrointestinal diseases and specifically colorectal cancer (CRC).
[0011] SUMMARY OF THE INVENTION
[0012] The invention provides diagnostic tools for analyzing fecal samples. Embodiments of the invention relate to assays and methods for diagnosing and prognosing patients afflicted with, or suspected of having a cancer selected from colon cancer, rectal cancer and colorectal cancer. More specifically, provided in embodiments of the invention are host transcriptome markers and classifiers capable of assessing, localizing and monitoring colorectal cancer.
[0013] The invention is based, in part, on the discovery of unique gene signatures based on measurements of RNA biomarkers in fecal samples, determined to be unexpectedly effective for detecting colorectal cancer and evaluating its stage.
[0014] Accordingly, disclosed herein are diagnostic assays and methods, useful for early diagnosis, prognosis, monitoring and management of colorectal cancer. In some embodiments, non-invasive and minimally-invasive assays and methods are provided. In other embodiments, diagnostic kits for use with the methods of the invention are provided. In various aspects and embodiments, the invention relates to a method of analyzing a fecal (e.g. stool) RNA sample, the method comprising determining the levels of a plurality of gene products selected from Table 1, 2 and / or 3 in the sample. In another embodiment, the plurality of gene products is selected from one or more of Tables 4-6 and 7. In another embodiment, the plurality of gene products is as disclosed herein, for example gene products selected from one or more of Groups A-J as disclosed herein. Each possibility represents a separate embodiment of the invention.
[0015] In one aspect, the invention provides a method of diagnosing colorectal cancer, the method comprising (i) determining, in a fecal RNA sample of a subject, the levels of gene products selected from Table 1 or 6; and (ii) comparing the level of the gene products in the fecal RNA sample to their respective levels in a control, wherein a difference in the level of the gene products in the fecal RNA sample as compared to the control is indicative of the presence of colorectal cancer in the subject.
[0016] In one embodiment, the gene products are selected from Table 1. In another embodiment the gene products comprise (or consist essentially of) the gene products of Table 1. In another embodiment, the gene products are as set forth in Table 1. In one embodiment, the gene products are selected from Table 6. In another embodiment the gene products comprise (or consist essentially of) the gene products of Table 6. In another embodiment, the gene products are as set forth in Table 6.
[0017] In another embodiment, the method comprises determining the levels of a plurality of gene products presented in Table 1 or 6 in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products, and comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm, wherein the outcome of the comparison is indicative of the presence of colorectal cancer in said subject. In another embodiment, said plurality of gene products is selected from Table 1. In another embodiment said plurality of gene products comprises (or consists essentially of) the gene products of Table 1. In another embodiment, said plurality of gene products is as set forth in Table 1. In one embodiment, said plurality of gene products is selected from Table 6. In another embodiment said plurality of gene products comprises (or consists essentially of) the gene products of Table 6. In another embodiment, said plurality of gene products is as set forth in Table 6. Each possibility represents a separate embodiment of the invention. In various other embodiments, the gene products (or plurality of gene products) correspond to a subset of gene products as identified in Table 1 or 6, wherein each possibility represents a separate embodiment of the invention.
[0018] In another embodiment, said plurality of gene products is selected from Table 4 or 5. In another embodiment, said plurality of gene products comprises, consists or consists essentially of the gene products set forth in Table 4 and / or 5. Each possibility represents a separate embodiment of the invention.
[0019] In another embodiment a transcriptomic signature characterized by: (i) an increase in the level of gene products of Group A in Table 1 in the fecal RNA sample of the subject in comparison to their corresponding levels in a healthy control; (ii) a decrease in the level of gene products of Group B in Table 1 in the fecal RNA sample of said subject in comparison to their corresponding levels in a healthy control; or (iii) both (i) and (ii), is indicative of the presence of colorectal cancer in said subject.
[0020] In another embodiment a transcriptomic signature characterized by: (i) an increase in the level of gene products of Group G, I and / or K, as set forth in Tables 4, 5, and 6, respectively, in the fecal RNA sample of the subject in comparison to their corresponding levels in a healthy control; (ii) a decrease in the level of gene products of Group H, J, and / or L, as set forth in Tables 4, 5, and 6, respectively, in the fecal RNA sample of said subject in comparison to their corresponding levels in a healthy control; or (iii) both (i) and (ii), is indicative of the presence of colorectal cancer in said subject.
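By way of non-limiting illustration, the directional signature described above (up-regulated groups increased and down-regulated groups decreased relative to a healthy control) can be summarized as a single score. The following is a minimal sketch only. The gene names (UP1, DOWN1 and so on), the pseudocount value and the function names are hypothetical placeholders and are not taken from the Tables; sum-normalization of counts is assumed here as the normalization step.

```python
import math


def sum_normalize(counts):
    """Convert raw per-gene counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no counts")
    return {gene: value / total for gene, value in counts.items()}


def signature_score(sample_counts, control_counts, up_genes, down_genes,
                    pseudocount=1e-6):
    """Mean log2(sample/control) of the 'up' genes minus that of the 'down' genes.

    A positive score means the 'up' group is increased and/or the 'down'
    group is decreased relative to the control. The pseudocount (an assumed
    value) avoids taking the logarithm of zero for undetected genes.
    """
    sample = sum_normalize(sample_counts)
    control = sum_normalize(control_counts)

    def mean_log_ratio(genes):
        ratios = [
            math.log2((sample.get(g, 0.0) + pseudocount)
                      / (control.get(g, 0.0) + pseudocount))
            for g in genes
        ]
        return sum(ratios) / len(ratios)

    return mean_log_ratio(up_genes) - mean_log_ratio(down_genes)


# Hypothetical toy counts, for illustration only.
up_group = ["UP1", "UP2"]
down_group = ["DOWN1", "DOWN2"]
control = {"UP1": 10, "UP2": 10, "DOWN1": 20, "DOWN2": 20, "OTHER": 40}
sample = {"UP1": 30, "UP2": 20, "DOWN1": 5, "DOWN2": 5, "OTHER": 40}

score = signature_score(sample, control, up_group, down_group)
print("signature score:", round(score, 2))
```

In this sketch a sample identical to the control scores zero, and the sign of the score reflects the direction of the combined change. How such a score would be thresholded is left to the supervised classification algorithm.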
[0021] In another embodiment of the methods disclosed herein, the subject is suspected of having colorectal cancer. In another embodiment, said subject is diagnosed with, or suspected of having, gastrointestinal (GI) inflammation, and said plurality of gene products is selected from Table 4. In another embodiment, said subject is diagnosed with, or suspected of having, inflammatory bowel disease (IBD), and said plurality of gene products is selected from Table 5.
[0022] In another embodiment, the methods disclosed herein further comprise predicting the location of the colorectal cancer, by a method comprising determining in said fecal RNA sample the levels of additional gene products selected from gene products presented in Table 2 or 3 and comparing the determined levels to their respective levels in a control, wherein a difference in the level of the additional gene products in the fecal RNA sample and in the control is predictive of the location of the colorectal cancer. In another aspect, the invention provides a method for differential diagnosis of colorectal cancer in a subject in need thereof, comprising (i) determining, in a fecal RNA sample of the subject, the levels of gene products selected from Table 4 and / or 5; and (ii) comparing the level of the gene products from the fecal RNA sample to their respective levels in a control, wherein a difference in the level of the gene products in the fecal RNA sample as compared to the control is indicative of the presence of the colorectal cancer in said subject.
[0023] In another embodiment said subject is diagnosed with, or suspected of having, GI inflammation, and said plurality of gene products is selected from Table 4. In another embodiment said subject is diagnosed with, or suspected of having, IBD, and said plurality of gene products is selected from Table 5.
[0024] In another embodiment, the invention encompasses a method of assigning a medical intervention, comprising diagnosing colorectal cancer according to the method as disclosed herein, and assigning the medical intervention based on the obtained results. In another embodiment, the invention provides a method of treating colorectal cancer in a subject in need thereof, the method comprising diagnosing colorectal cancer according to the method as disclosed herein, and treating the cancer. In some embodiments, treating said cancer comprises administering to said subject a colorectal cancer treatment or intervention, the colorectal cancer treatment or intervention selected from the group consisting of surgical resection, chemotherapy, biological therapy, irradiation and / or immunotherapy. Each possibility represents a separate embodiment of the invention.
[0025] In another aspect there is provided a method of analyzing a stool RNA sample, the method comprising determining the levels of a plurality of gene products selected from one or more of Tables 1 to 6 in the sample.
[0026] In another aspect, there is provided a method for monitoring the efficacy of a treatment of colorectal cancer in a subject in need thereof, the method comprising (i) determining, in a fecal RNA sample of the subject, the levels of gene products selected from Table 1 or 6; and (ii) comparing the level of the gene products from the fecal RNA sample to their corresponding levels in a fecal RNA sample obtained from said subject at an earlier sampling and / or to the level of the one or more gene products in a control. In another embodiment, the result of the comparison is indicative of the efficacy of the cancer treatment in said subject. In a particular embodiment a change in the level of the gene products in two consecutive measurements is indicative of the efficacy of the cancer treatment in said subject. In another embodiment the method comprises determining the levels of a plurality of gene products presented in Table 1 or 6 in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products, and comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm.
[0027] In one embodiment, the gene products are selected from Table 1. In another embodiment the gene products comprise (or consist essentially of) the gene products of Table 1. In another embodiment, the gene products are as set forth in Table 1. In one embodiment, the gene products are selected from Table 6. In another embodiment the gene products comprise (or consist essentially of) the gene products of Table 6. In another embodiment, the gene products are as set forth in Table 6. In other embodiments the gene products correspond to a subset of gene products as identified in Table 1 or 6, wherein each possibility represents a separate embodiment of the invention.
[0028] In another embodiment, the methods as disclosed herein further comprise subjecting the fecal RNA sample to selective depletion of microbial ribosomal RNA (rRNA). In another embodiment the sample is a stool sample. In another embodiment, the supervised classification algorithm employed in the methods of the invention is a linear classifier.
[0029] In another aspect, there is provided a kit comprising means for specifically determining and quantifying the levels of a plurality of gene products in a fecal RNA sample, and instructions for diagnosing colorectal cancer, wherein the plurality of gene products is selected from a list of biomarkers as presented in one or more of Tables 1-6.
[0030] Other objects, features and advantages of the present invention will become clear from the following description and drawings.
[0031] BRIEF DESCRIPTION OF THE DRAWINGS
[0032] Fig. 1A shows the effect of ribosomal depletion ("microbial depletion"), which increases the number of detected human genes 6.1-fold on average (paired comparison). The median number of detected host genes increases from 392±541 with no ribosomal depletion to 1745±1050 with ribosomal depletion (P = 8.3e-5).
[0033] Fig. 1B shows a comparison between 1 and 5 days of storage of the sample ("1 Day freezer" and "5 Day freezer", respectively). Extending -20°C storage from 1 day to 5 days does not change the number of detected human genes (median of 1674±1073 for 1-day storage versus 1745±834 for 5-day storage; P = 0.8). Fig. 1C shows hierarchical clustering of stool and tissues from colorectal cancer (CRC) and control patients. Representative genes are shown on the right sorted by the group with highest expression.
[0034] Fig. 1D shows a principal component analysis (PCA) of transcriptomic profiles. Numbers in parentheses show percent of explained variance by each PC. Healthy adjacent biopsy in black, colorectal cancer (CRC) biopsy in dark grey, control stool in light grey, CRC stool in open circles. Included are 172 highly expressed (5×10⁻⁵), highly variable and differentially expressed genes between colorectal cancer and control from single cell data, based on 323 genes with a maximal expression across patients above 5×10⁻⁴.
[0035] Figs. 2A-2C - differentially expressed genes. Fig. 2A shows differentially expressed genes between the CRC tissue ("Tumor") and the adjacent healthy tissue ("Adjacent Normal"). Fig. 2B shows differentially expressed genes between the CRC stool ("CRC") and healthy stool ("CT"). Fig. 2C shows scatter plot of log2 gene expression ratio between CRC single cells and healthy single cells (x axis) to log2 gene expression ratio between CRC and healthy tissue (y axis). Each dot is a gene. Spearman correlation 0.83, P < 10⁻³²⁴.
[0036] Figs. 3A-3D - expression in stool and corresponding tissue samples. Fig. 3A shows a scatter plot of log2 gene expression ratio between CRC and healthy adjacent tissue in biopsy (x axis) or stool (y axis). Each dot is a gene. Spearman correlation 0.29, P = 4.64×10⁻¹¹². Fig. 3B shows a scatter plot of log2 gene expression ratio between CRC and healthy stool (x axis) or patient matched pre-op and post-op stool (y axis). Each dot is a gene. Fig. 3C shows expression of the gene CA2 - a healthy tissue marker: on Visium HD spatial transcriptomics slide (on the left, in which adjacent non-tumor is marked by a solid line) and in tissue and stool (top and bottom right, respectively). Fig. 3D shows expression of the gene IFITM2 - a CRC tissue marker: on Visium HD spatial transcriptomics slide (on the left, in which the CRC tumor is marked by a dashed line) and in tissue and stool (top and bottom right, respectively).
[0037] Fig. 3E shows a classifier trained to differentiate between colorectal cancer and control tissue (dotted line) or stool samples (dashed and dotted line). The classifier was evaluated using a 50% test set in each iteration, with results averaged across 100 independent iterations. The receiver operating characteristic (ROC) curve and the resulting false positive rate, true positive rate and area under the curve (AUC) were recorded for each test set to examine the classification of colorectal cancer samples and control samples based on ‘cancer score’ (methods). The overall performance is presented with mean values and standard error of the mean.
[0038] Figs. 4A-4C show differentially expressed genes between the tumor epithelial cells and the adjacent healthy epithelial cells in spatial transcriptomics (ST) that exhibited similar trends when comparing tissue (in Fig. 4A) or stool samples of CRC patients and controls (in Fig. 4B) of 3 different ST images (P1 - top, P2 - middle, P5 - bottom). Fig. 4C shows healthy cluster or CRC cluster, each cluster was used for the DGE in Fig. 4A and 4B (P1 - top, P2 - middle, P5 - bottom; healthy tissue marked by a solid line, and CRC tissue marked by a dashed line).
[0039] Fig. 5 shows a support vector machine classifier trained to differentiate between colorectal cancer and control tissue (dotted line) or stool samples (dashed and dotted line). The classifier was evaluated using a 50% test set in each iteration, with results averaged across 100 independent iterations. The receiver operating characteristic (ROC) curve and the resulting false positive rate, true positive rate and area under the curve (AUC) were recorded for each test set to examine the classification of colorectal cancer samples and control samples based on ‘cancer score’. The overall performance is presented with mean values and standard error of the mean.
[0040] Figs. 6A-6B present a linear classifier for CRC diagnosis using stool biomarkers. Fig. 6A - Classifier trained to differentiate between colorectal cancer patients and healthy patients based on the human shed cell transcriptomics of tissues (dotted line) and of stool samples (dashed and dotted line) including myeloid genes. The classifier was evaluated using a 50% test set in each iteration. Receiver operating characteristic (ROC) curve is an average of 100 iterations. Area under the curve values are the mean values and standard errors of the mean over the 100 iterations. Fig. 6B - Estimated fractions of myeloid cells in stool and tissue samples inferred by computational deconvolution (** denotes p<1e-3).
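The evaluation protocol recited for the classifier figures (a 50% test set in each iteration, 100 iterations, AUC reported as mean and standard error of the mean) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the inventors' implementation: the split is assumed to be stratified by class, the scorer is a simple centroid-difference linear direction chosen for brevity, and all function names and the toy data are hypothetical.

```python
import random
import statistics


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def centroid(rows):
    """Per-feature mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_direction(pos_rows, neg_rows):
    """Linear direction w = mean(positives) - mean(negatives)."""
    return [p - n for p, n in zip(centroid(pos_rows), centroid(neg_rows))]


def project(w, row):
    """Score of one sample along the direction w."""
    return sum(wi * xi for wi, xi in zip(w, row))


def repeated_holdout_auc(pos_rows, neg_rows, iterations=100, seed=0):
    """Train on a random half, test on the other half, repeat, summarize."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(iterations):
        pos, neg = pos_rows[:], neg_rows[:]
        rng.shuffle(pos)
        rng.shuffle(neg)
        half_p, half_n = len(pos) // 2, len(neg) // 2
        w = fit_direction(pos[:half_p], neg[:half_n])
        test_pos = [project(w, x) for x in pos[half_p:]]
        test_neg = [project(w, x) for x in neg[half_n:]]
        aucs.append(auc(test_pos, test_neg))
    mean_auc = statistics.mean(aucs)
    sem_auc = statistics.stdev(aucs) / len(aucs) ** 0.5
    return mean_auc, sem_auc


# Hypothetical, perfectly separated toy data (two features per sample).
crc_rows = [[2.0 + 0.1 * i, 0.0] for i in range(10)]
control_rows = [[-2.0 - 0.1 * i, 0.0] for i in range(10)]
mean_auc, sem_auc = repeated_holdout_auc(crc_rows, control_rows)
print(mean_auc, sem_auc)
```

Each class needs at least two samples so that both the training half and the test half are non-empty. The fixed seed only makes the sketch reproducible.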
[0041] Figs. 7A-7C demonstrate that stool RNA profiles return to a healthy molecular state after tumor resection. Fig. 7A - Principal component analysis (PCA) of transcriptomic profiles. Numbers in parentheses show percent of explained variance by each PC. Fig. 7B - Heat-map of the 49 classifier based up-regulated genes in CRC stool plotted across pooled control stool, CRC stool and post-op stool. Expression values are log10-scaled. Fig. 7C - Per-patient change in the stool-cancer score before and after surgical resection (n = 8 paired patients). Open circles = individual trajectories; full square = group median, paired Wilcoxon signed-rank test p = 0.0039. Expression levels are log10 of the sum-normalized UMI counts.
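As a worked check on the paired statistic of Fig. 7C: with n = 8 pairs, the smallest exact p-value a signed-rank test can attain is 1/2^8 = 0.0039 one-sided (0.0078 two-sided). The reported value would therefore be consistent with all eight patients changing in the same direction under a one-sided exact test; this is an inference from the arithmetic and is not stated in the source. The sketch below computes the exact one-sided p-value by enumerating sign patterns. The scores are invented for illustration, the function name is hypothetical, and the code assumes no zero differences and no tied absolute differences.

```python
from itertools import product


def signed_rank_exact_p(before, after):
    """Exact one-sided Wilcoxon signed-rank p-value for 'after < before'.

    Assumes no zero differences and no ties among absolute differences,
    so that ranks 1..n are unambiguous.
    """
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    abs_diffs = [abs(d) for d in diffs]
    if 0 in abs_diffs or len(set(abs_diffs)) != n:
        raise ValueError("zero or tied differences are not handled here")

    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs_diffs[i])
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank

    # W+ is the sum of ranks belonging to positive differences (increases).
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Under the null hypothesis every rank is positive with probability 1/2,
    # so all 2**n sign patterns are equally likely.
    as_extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        if w <= w_plus:
            as_extreme += 1
    return as_extreme / 2 ** n


# Hypothetical pre-operative and post-operative scores for 8 patients,
# all decreasing after surgery.
pre_op = [50, 42, 61, 39, 48, 55, 63, 44]
post_op = [45, 35, 50, 36, 39, 42, 48, 42]
p_value = signed_rank_exact_p(pre_op, post_op)
print(p_value)  # 0.00390625, i.e. 1/256
```

Enumeration is practical only for small n; for larger cohorts a statistics library would normally be used instead.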
[0042] Figs. 8A-8F show that stool cancer score is not affected by cancer histological and pathologic categories. Figs. 8A-8E are violin plots that display the distribution of the stool cancer score for each clinical category; individual samples are overlaid as jittered dots. Across all comparisons, no significant differences were observed, indicating that the stool-based cancer score is independent of (Fig. 8A) tumor size, (Fig. 8B) tumor location, (Fig. 8C) disease stage, (Fig. 8D) tumor invasion score (T score), (Fig. 8E) tumor histological grade. Fig. 8F - Number of genes detected in stool does not change with disease stage. Presented data excludes post-operative stools. Figs. 9A-9B show that the stool-derived classifier maintains its accuracy even after elimination of inflammatory genes. Fig. 9A - Classifier trained to differentiate between colorectal cancer patients and healthy patients based on the host transcriptomics of tissues (dotted line) and of stool samples (dashed and dotted line) after omitting myeloid specific genes. The classifier was evaluated using a 50% test set in each iteration. Receiver operating characteristic (ROC) curve is an average of 100 iterations. Area under the curve values are the mean values and standard errors of the mean over the 100 iterations. Fig. 9B - Classifier trained to differentiate between colorectal cancer patients and healthy patients based on the host transcriptomics of tissues (dotted line) and of stool samples (dashed and dotted line) after omitting IBD related genes. The classifier was evaluated using a 50% test set in each iteration. Receiver operating characteristic (ROC) curve is an average of 100 iterations. Area under the curve values are the mean values and standard errors of the mean over the 100 iterations.
[0043] DETAILED DESCRIPTION OF THE INVENTION
[0044] The present invention provides diagnostic means and methods for colorectal cancer. Specifically, methods and assays in accordance with the invention involve determining levels of specific biomarkers in fecal samples and comparing them to a control.
[0045] According to embodiments of the invention, provided are methods including providing a fecal sample, in particular a fecal RNA sample, and determining the level of at least one human gene product in the fecal RNA sample. Typically, the methods of the invention comprise determining the levels in the sample of a plurality of human gene products as disclosed herein, thereby obtaining the transcriptomic signature of the sample with respect to the plurality of gene products. In some embodiments, the methods of the invention further comprise a step of comparing the level of the gene products in the fecal RNA sample to their respective levels in a control, for example comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm. In various embodiments, the gene products are selected from Table 1, 2 and / or 3. Additionally or alternatively, the gene products may be selected from one or more of Tables 4-7. Each possibility represents a separate embodiment of the invention.
[0046] In one aspect, there is provided a method of diagnosing colorectal cancer, the method comprising (i) determining, in a fecal RNA sample of a subject, the levels of gene products selected from Table 1 or 6; and (ii) comparing the level of the gene products in the fecal RNA sample to their respective levels in a control, wherein a difference in the level of the gene products in the fecal RNA sample as compared to the control is indicative of the presence of colorectal cancer in the subject.
[0047] In another aspect, the invention provides a method of predicting the location of a colorectal cancer in a subject having colorectal cancer, the method comprising: (i) determining, in a fecal RNA sample of the subject, the levels of a plurality of gene products selected from gene products presented in Table 2 or 3, thereby obtaining the transcriptomic signature of the sample with respect to the plurality of gene products, and (ii) comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm, wherein the outcome of the comparison is predictive of the location of the colorectal cancer in said subject.
[0048] In another aspect, there is provided a method for differential diagnosis of colorectal cancer in a subject in need thereof, comprising (i) determining, in a fecal RNA sample of the subject, the levels of gene products selected from Table 4 and / or 5; and (ii) comparing the level of the gene products from the fecal RNA sample to their respective levels in a control, wherein a difference in the level of the gene products in the fecal RNA sample as compared to the control is indicative of the presence of the colorectal cancer in said subject.
[0049] In another aspect, there is provided a method of assigning a medical intervention, comprising diagnosing a colorectal cancer according to a method as disclosed herein, and assigning the medical intervention based on the obtained results.
[0050] In another aspect, the invention provides a method of treating colorectal cancer in a subject in need thereof, the method comprising diagnosing colorectal cancer according to a method as disclosed herein, and treating the cancer.
[0051] In another aspect, the invention relates to a method of analyzing a stool RNA sample, the method comprising determining the levels of a plurality of gene products selected from one or more of Tables 1 to 6 in the sample.
[0052] In yet another aspect, the invention relates to a method for monitoring the efficacy of a treatment of colorectal cancer in a subject in need thereof, the method comprising (i) determining, in a fecal RNA sample of the subject, the levels of gene products selected from Table 1 or 6; and (ii) comparing the level of the gene products from the fecal RNA sample to their corresponding levels in a fecal RNA sample obtained from said subject at an earlier sampling and / or to the level of the one or more gene products in a control. In another embodiment, a change in the level of the gene products in two consecutive measurements is indicative of the efficacy of the cancer treatment in said subject. In a further aspect, there is provided a kit comprising means for specifically determining and quantifying the levels of a plurality of gene products in a fecal RNA sample, and instructions for diagnosing colorectal cancer, wherein the plurality of gene products is selected from a list of biomarkers as presented in one or more of Tables 1-6.
[0053] These and other aspects and embodiments are further described and exemplified below.
[0054] Diagnostic and analytical methods
[0055] Disclosed herein in embodiments of the invention are methods for diagnosing and evaluating colorectal cancer (CRC). In various embodiments, the methods include determining, in a fecal RNA sample of a subject, the levels of gene products selected from one or more of Tables 1-7 as disclosed herein, and comparing the level of the gene products in the fecal RNA sample to their respective levels in a control.
[0056] The terms “diagnosing” and “diagnosis” as used herein refer to methods by which the skilled artisan can estimate and even determine whether or not a subject is suffering from a given disease or condition. In the context of CRC, these terms refer in particular to assessing the presence or absence of CRC in the subject (e.g. of at least one colon or rectal tumor as disclosed herein). As used herein, a subject "diagnosed with" a condition is a subject in which the presence of the condition is indicated, e.g. using clinical features and / or biomarkers accepted in the art as diagnostic criteria (in the case of gastrointestinal inflammation or IBD) or by the methods and assays of the invention (in case of CRC). It is noted that while the invention provides methods for diagnosing CRC, the use of these methods does not preclude the performance of additional diagnostic procedures to confirm or further characterize a diagnosis made based upon a method of the invention, such as colonoscopy and / or biopsy.
[0057] In some embodiments, the present invention provides a method of diagnosing a colorectal cancer, the method comprising (i) determining a level of at least one biomarker in a fecal sample; and (ii) comparing the level of the at least one biomarker from the fecal sample to its level in a control, wherein a difference in the level of the biomarker in the fecal sample and in the control is indicative of the presence of a colorectal cancer in a subject, wherein the biomarker is selected from a list of biomarkers presented in Table 1. In another embodiment the biomarker is selected from a list of biomarkers presented in Table 6. In another embodiment said biomarker is selected from a list of biomarkers presented in Table 4 or 5. In another embodiment said biomarker is selected from a list of biomarkers presented in Table 4. In another embodiment said biomarker is selected from a list of biomarkers presented in Table 5. In other embodiments, there is provided a method of diagnosing colorectal cancer, the method comprising (i) determining, in a fecal RNA sample of a subject, the levels of gene products selected from Table 1 or 6; and (ii) comparing the level of the gene products in the fecal RNA sample to their respective levels in a control, wherein a difference in the level of the gene products in the fecal RNA sample as compared to the control is indicative of the presence of colorectal cancer in the subject. In one embodiment, the gene products are selected from Table 1. In another embodiment the gene products are selected from Table 6. In another embodiment the gene products are selected from Tables 1 and 6. In another embodiment the gene products are selected from Table 4 or 5. In another embodiment the gene products are selected from Table 4. In another embodiment the gene products are selected from Table 5. In another embodiment the gene products are selected from Table 4 and 5.
[0058] In other embodiments, the method comprises determining the levels of a plurality of gene products presented in Table 1 or 6 in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products, and comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm, wherein the outcome of the comparison is indicative of the presence of colorectal cancer in said subject. In one embodiment, the plurality of gene products is selected from Table 1. In another embodiment the plurality of gene products is selected from Table 6. In another embodiment the plurality of gene products is selected from Tables 1 and 6. In another embodiment said plurality of gene products is selected from Table 4 or 5. In another embodiment said plurality of gene products is selected from Table 4. In another embodiment said plurality of gene products is selected from Table 5. In another embodiment said plurality of gene products is selected from Table 4 and 5. The term "colorectal cancer" or "CRC" includes both colon cancer and rectal cancer. In some embodiments, the cancer is microsatellite-stable (MS) CRC. In other embodiments, the cancer is microsatellite-instable (MSI) CRC.
[0059] The term “control” as used herein refers to biological samples from a particular predefined group of subjects such as healthy subjects. Therefore, the term “level in the control” with respect to a biomarker has the meaning of the level, e.g. amount or concentration, of the biomarker in biological samples of (or corresponding to) a predefined group of subjects, e.g. healthy subjects.
[0060] In some embodiments relating to determining or predicting the presence of CRC, a healthy control is conveniently used (e.g. a stool sample from a healthy individual). In some embodiments relating to determining or predicting the location of a colorectal cancer, a control corresponding to a cancer of a particular location is conveniently used (for example, a stool sample from a subject with a tumor known to be located in the rectum, or from a subject with a tumor known to be located in the colon). In yet other embodiments relating to differential diagnosis, a control corresponding to an individual with gastrointestinal (GI) inflammation or IBD (e.g. a stool sample from said individual) may also be conveniently used. According to additional embodiments relating to differential diagnosis (e.g. when gene products selected from Tables 4 and / or 5 are used), the control corresponds to the general population, including both healthy control samples and samples obtained from IBD patients and subjects having other forms of (non-malignant) GI inflammation, but excluding samples from CRC patients.
[0061] In some embodiments, the methods of the invention involve comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm. In some embodiments (for example, when the plurality of gene products is selected from Table 1 or 6), the outcome of the comparison is indicative of the presence of colorectal cancer in said subject. Thus, according to these embodiments, the presence of CRC may be determined for a transcriptomic signature significantly different from the control (e.g. an increase in the levels of gene products of Group A, G, I and / or K, and / or a decrease in the levels of gene products of Group B, H, J, and / or L), and the absence of CRC may be determined for a transcriptomic signature not significantly different from the control, as determined by the supervised classification algorithm.
[0062] In other embodiments (for example, when the plurality of gene products is selected from Table 2 or 3, as described below), the outcome of the comparison is predictive of the location of the colorectal cancer in said subject. According to these embodiments, a transcriptomic signature significantly different from (e.g. characterized by an increase in the level of gene products of Group C or E as compared to) a rectal cancer control indicates that the subject has colon cancer, and a transcriptomic signature significantly different from (e.g. characterized by an increase in the level of gene products of Group D or F as compared to) a colon cancer control indicates that the subject has rectal cancer.
[0063] According to some embodiments, an increase in the level of biomarkers (gene products) of Group A in Table 1 in the fecal sample in comparison to their level in the control is indicative of the presence of a colorectal cancer in a subject. According to some embodiments, a decrease in the level of biomarkers of Group B in Table 1 in the fecal sample in comparison to their level in the control is indicative of the presence of a colorectal cancer in a subject. According to some embodiments, an increase in the level of at least one biomarker of Group A in Table 1 in the fecal sample in comparison to its level in the control and a decrease in the level of at least one biomarker of Group B in Table 1 in the fecal sample in comparison to its level in the control are indicative of the presence of a colorectal cancer in a subject. In various other embodiments, lack of an increase in the level of biomarkers of Group A in Table 1 and / or lack of a decrease in the level of biomarkers of Group B in Table 1 in the fecal sample indicates the absence of CRC in said subject. In another embodiment, a transcriptomic signature characterized by increased levels of biomarkers of Group A in Table 1 and / or decreased levels of at least one biomarker of Group B in Table 1 in the fecal sample in comparison to the respective levels in the control (e.g., as evaluated using a supervised classification algorithm) is indicative of the presence of a colorectal cancer in a subject. Each possibility represents a separate embodiment of the invention. In some embodiments, the presence of CRC is further indicative that a subject is amenable to CRC treatment or intervention.
[0064] In other embodiments of the methods of the invention (for example, when the control is a sample obtained from the subject at an earlier time point), an increase in the level of biomarkers of Group A in Table 1 in the fecal sample in comparison to their level in the control is indicative of progression of a colorectal cancer in a subject, or, in other embodiments, of failure or inadequacy of a treatment in a subject receiving the treatment. According to some embodiments, a decrease in the level of biomarkers of Group B in Table 1 in the fecal sample in comparison to their levels in the control is indicative of progression of a colorectal cancer in a subject, or, in other embodiments, of failure or inadequacy of a treatment in a subject receiving the treatment. According to some embodiments, an increase in the level of at least one biomarker of Group A in Table 1 in the fecal sample in comparison to its level in the control and a decrease in the level of at least one biomarker of Group B in Table 1 in the fecal sample in comparison to its level in the control are indicative of progression of a colorectal cancer in a subject, or, in other embodiments, of failure or inadequacy of a treatment in a subject receiving the treatment. In another embodiment, a transcriptomic signature characterized by increased levels of biomarkers of Group A in Table 1 and / or decreased levels of at least one biomarker of Group B in Table 1 in the fecal sample in comparison to the respective levels in the control (e.g., as evaluated using a supervised classification algorithm) is indicative of progression of a colorectal cancer in a subject, or, in other embodiments, of failure or inadequacy of a treatment in a subject receiving the treatment.
In various other embodiments, lack of an increase in the level of biomarkers of Group A in Table 1 and / or lack of a decrease in the level of biomarkers of Group B in Table 1 in the fecal sample indicates the absence of CRC progression in said subject or, in other embodiments, success or adequacy of a treatment in a subject receiving the treatment. Each possibility represents a separate embodiment of the invention. According to some embodiments, the method comprises determining and comparing the level of a plurality of biomarkers. According to some embodiments, the method comprises determining and comparing the level of 4 or more biomarkers. According to some embodiments, the method comprises determining and comparing the level of 5 or more biomarkers. According to some embodiments, the method comprises determining and comparing the level of 10 or more biomarkers. According to some embodiments, the method comprises determining and comparing the level of from 3 to 211 or more biomarkers, e.g. 10-100, 20-50, 50-70 or 30-40 biomarkers, including any integer in between. In some embodiments, the collective levels of a plurality of the biomarkers are referred to as a transcriptomic signature. Therefore, according to some embodiments, the method comprises comparing the transcriptomic signature of a subject in the fecal sample to a transcriptomic signature of a control (e.g. using a supervised classification algorithm as disclosed herein).
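The direction-aware comparison described above (Group A gene products increased, Group B gene products decreased relative to the control) can be illustrated with a minimal Python sketch. The gene names, expression levels, pseudocount and scoring rule below are hypothetical and chosen only for illustration; they are not taken from Table 1 and are not the classifier of the invention.

```python
import math

PSEUDOCOUNT = 1.0  # assumed value; avoids log of zero for undetected gene products


def log2_fold_change(sample_level, control_level):
    """log2 ratio of a sample level to the corresponding control level."""
    return math.log2((sample_level + PSEUDOCOUNT) / (control_level + PSEUDOCOUNT))


def signature_score(sample, control, up_genes, down_genes):
    """Mean signed log2 fold change; positive values indicate a CRC-like shift.

    Genes expected to increase contribute their fold change as-is, and genes
    expected to decrease contribute the negated fold change.
    """
    signed = [log2_fold_change(sample[g], control[g]) for g in up_genes]
    signed += [-log2_fold_change(sample[g], control[g]) for g in down_genes]
    return sum(signed) / len(signed)


# Hypothetical gene names and levels, for illustration only.
up_genes = ["UP_GENE_1", "UP_GENE_2"]      # stand-ins for Group A gene products
down_genes = ["DOWN_GENE_1"]               # stand-in for a Group B gene product
control = {"UP_GENE_1": 7.0, "UP_GENE_2": 15.0, "DOWN_GENE_1": 31.0}
sample = {"UP_GENE_1": 31.0, "UP_GENE_2": 63.0, "DOWN_GENE_1": 7.0}

score = signature_score(sample, control, up_genes, down_genes)
print(f"signature score: {score:.2f}")
```

A score near zero corresponds to a signature that does not differ from the control in the expected directions; any decision threshold would have to be established on labeled samples.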
[0065] In another embodiment, the biomarkers (gene products) are selected from the group consisting of: ALOX5AP, CEBPB, CMTM2, CXCL8, GADD45B, GCA, HCAR3, RGS2, SLC2A3, SOCS3, SOD2, SRGN, TSC22D3, VIM, and combinations thereof, wherein each possibility represents a separate embodiment of the invention. According to some embodiments, it is provided that the markers (gene products) do not include the following markers AC007192.1, ACSL1, ADAM8, AQP9, BASP1, BCL6, C5AR1, CCL3, CD44, CKLF, CSF3R, CXCR2, FCAR, FCGR2A, FFAR2, FPR1, FPR2, FYB1, G0S2, GBP1, HCAR2, HIF1A, ICAM1, IGSF6, IL1B, IL1R2, IL1RN, ITGAX, KLF2, LCP1, LCP2, MYADM, MYO1F, NAMPT, NFKBIA, OSM, PDE4B, PFKFB3, PHACTR1, PLAU, PLAUR, PLEK, PROK2, RHOH, RIPOR2, S100A12, S100A4, SAT1, SLC11A1, SOCS3, TLR4, TNFAIP3, TNFAIP6, TNFRSF1B, TREM1, VNN2. In other embodiments, the gene products do not contain the following: CCL4L2, IFITM1, IL1B, PNRC1, PROK2, PTGS2, FBXW5, PDLIM5, and STARD10. According to some embodiments, the method of comparing transcriptomic signature may be done using a classification algorithm. Thus, in some embodiments, methods in accordance with the invention include: i. providing a fecal RNA sample, ii. determining the levels in the sample of a plurality of human gene products, thereby obtaining the transcriptomic signature of the sample with respect to the plurality of gene products and iii. comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm, wherein a significant difference between the transcriptomic signature of said sample and of the control is indicative of the presence of the colorectal cancer.
[0066] In other embodiments, the invention is directed to a method for monitoring the efficacy of a treatment of a colorectal cancer in a subject in need thereof. Thus, embodiments of the invention provide for assessing whether a treatment or intervention that is being (or has been) administered to the subject for the purpose of treating CRC, is (or was) effective in treating CRC in said subject. To this end, a fecal sample is collected from the subject (also referred to herein as a first fecal sample), and its RNA levels of gene products as disclosed herein (e.g. selected from Table 1 or 6) are compared to their respective levels corresponding to a fecal sample collected from said subject at an earlier time point (also referred to herein as "an earlier sampling" or "an earlier sampling control"). Additionally or alternatively, the levels of one or more of the gene products in said sample are compared to the (respective) levels of the one or more gene products in a control (such as a control corresponding to a subject afflicted with CRC prior to treatment, a subject in remission or a healthy control). Thus, in some embodiments, the levels of the gene products in the sample are compared to their respective levels in the control. According to embodiments of the invention, the outcome of the comparison is recorded, and these steps are repeated for a second fecal sample, collected from said subject at a time point subsequent to the time of collection of the first fecal sample. In some embodiments, a change in the level of the gene products (biomarkers) as compared to the control and / or earlier sampling, is indicative of the efficacy of the cancer treatment in said subject. 
In other embodiments, a change in the level of the gene products as compared to the control and / or earlier sampling, recorded for both the first and the second fecal samples (collectively referred to herein as "a change in two consecutive measurements"), is indicative of the efficacy of the cancer treatment in said subject. For example, a change characterized by reduction in the levels of gene products up-regulated in CRC patients (also referred to herein as CRC-related genes) and / or enhancement in the levels of gene products down-regulated in CRC patients as compared to healthy controls (also referred to as control-related genes) is indicative of adequate efficacy of the treatment, particularly if recorded in two consecutive measurements.
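The "change in two consecutive measurements" criterion can be sketched in Python as follows. This is a simplified illustration under stated assumptions: both follow-up samples are compared to the same earlier sampling, a change is counted only when every CRC-related gene falls and every control-related gene rises (a stricter rule than the "and / or" wording above), and no test of statistical significance is applied. All gene names and levels are hypothetical.

```python
def favorable_change(sample, baseline, crc_related, control_related):
    """True if all CRC-related genes fell and all control-related genes rose."""
    crc_reduced = all(sample[g] < baseline[g] for g in crc_related)
    control_enhanced = all(sample[g] > baseline[g] for g in control_related)
    return crc_reduced and control_enhanced


def change_in_two_consecutive_measurements(baseline, first, second,
                                           crc_related, control_related):
    """True if the favorable change is recorded for both follow-up samples."""
    return all(
        favorable_change(s, baseline, crc_related, control_related)
        for s in (first, second)
    )


# Hypothetical gene names and levels, for illustration only.
crc_related = ["CRC_GENE_1", "CRC_GENE_2"]
control_related = ["CONTROL_GENE_1"]
baseline = {"CRC_GENE_1": 120.0, "CRC_GENE_2": 80.0, "CONTROL_GENE_1": 10.0}
first = {"CRC_GENE_1": 90.0, "CRC_GENE_2": 60.0, "CONTROL_GENE_1": 18.0}
second = {"CRC_GENE_1": 40.0, "CRC_GENE_2": 25.0, "CONTROL_GENE_1": 30.0}

effective = change_in_two_consecutive_measurements(
    baseline, first, second, crc_related, control_related
)
print("change recorded in two consecutive measurements:", effective)
```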
[0067] In another embodiment, each comparison (e.g. for the first and second samples collected from said subject) is conveniently performed by a method comprising determining the levels of a plurality of gene products (e.g. presented in Table 1 or 6) in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products, and comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm. According to yet another aspect, the present invention provides a method for monitoring the efficacy of a treatment of a colorectal cancer, the method comprising (i) determining a level of at least one biomarker in a fecal sample, wherein the at least one biomarker is selected from a list of biomarkers presented in Table 1; and (ii) comparing the level of the at least one biomarker from the fecal sample to the level of the biomarker in a fecal sample obtained from the same subject at an earlier sampling and / or to the level of the biomarker in the control, wherein a change in the level of the biomarkers in two consecutive measurements is indicative of the efficacy of the cancer treatment. All terms, embodiments and definitions disclosed in any one of the above aspects apply and are encompassed herein as well. According to some embodiments, the method comprises determining the transcriptomic signature obtained in step (i) and comparing it to the transcriptomic signature obtained from a previous fecal sample.
[0068] In some embodiments, the methods of the invention comprise predicting the location of a colorectal cancer (predicting whether the tumor is located in the colon or the rectum).
[0069] In another embodiment, the method is used for predicting the location of a colorectal tumor or for differentiating between colon cancer and rectal cancer, and comprises determining in said fecal RNA sample the levels of gene products selected from gene products presented in Table 2 or 3 as set forth hereinbelow. In various embodiments, the gene products comprise a plurality of gene products selected from Group C in Table 2, Group D in Table 2, Group E in Table 3 and / or Group F in Table 3. In other embodiments, the method is used for differentially diagnosing a cancer selected from colon cancer and rectal cancer.
[0070] For example, a transcriptomic signature characterized by an increase in the level of gene products of Group C in Table 2 or of Group E in Table 3 in the fecal RNA sample of the subject in comparison to their corresponding levels in a rectal cancer control (e.g. stool sample of known rectal cancer located tumor) indicates that the subject has colon cancer. In another example, a transcriptomic signature characterized by an increase in the level of gene products of Group D in Table 2 or Group F in Table 3 in the fecal RNA sample of said subject in comparison to their corresponding levels in a colon cancer control (e.g. stool sample of known colon cancer located tumor) indicates that the subject has rectal cancer. Each possibility represents a separate embodiment of the invention.
[0071] In another aspect there is provided a method for differential diagnosis of colorectal cancer in a subject in need thereof, comprising (i) determining, in a fecal RNA sample of a subject, the levels of gene products selected from Table 4 and / or 5; and (ii) comparing the level of the gene products from the fecal RNA sample to their respective levels in a control sample, wherein a difference in the level of the gene products in the fecal RNA sample and in the control is indicative of the presence of the colorectal cancer in the subject.
[0072] As used herein, the term “differential diagnosis” relates to determining the presence or absence of a disease or condition in a subject who may be afflicted with alternative or additional conditions, and refers in particular to scenarios in which the subject is concurrently afflicted with, or is suspected of having, a second disease or condition exhibiting similar or overlapping symptoms or clinical features. For example, in the case of colorectal cancer (CRC), differential diagnosis methods of the invention enable distinguishing between CRC and other disorders and pathologies afflicting the GI tract. In particular embodiments, differential diagnosis of CRC comprises differentiation between CRC and conditions involving inflammation of the GI tract, including, but not limited to, IBD and related disorders. In other embodiments, differential diagnosis of CRC comprises determining the presence of CRC in a subject with a GI inflammatory co-morbidity (e.g. a patient with IBD).
[0073] A subject may be determined as suspected of having a disorder by the treating physician based on e.g. known risk factors, familial history, clinical signs or molecular findings.
[0074] In another embodiment, the subject is diagnosed with, or suspected of having, gastrointestinal (GI) inflammation, and the plurality of gene products is selected from Table 4. In another embodiment, the control corresponds to a subject with GI inflammation.
[0075] The biological response of body tissues to injury, infection or irritation is typically characterized by inflammation, an immune reaction in which a cascade of cellular and microvascular events serves to eradicate the infection, remove damaged tissue and generate new tissue. During this process, elevated permeability in microvessels allows neutrophils and mononuclear cells to leave the intravascular compartment, and perform various anti-microbial activities to eradicate the injury. As used herein, the term "Gastrointestinal inflammation" (or GI inflammation) refers to inflammation of a mucosal layer of the gastrointestinal tract, such as, for example, the upper gastrointestinal tract (e.g., esophagus, stomach, and / or duodenum), or the lower gastrointestinal tract (e.g., bowel such as small and / or large intestines). GI inflammation may be chronic or acute. Acute inflammation is generally characterized by a short time of onset and infiltration or influx of neutrophils. Chronic inflammation is generally characterized by a relatively longer period of onset and infiltration or influx of mononuclear cells. Chronic inflammation can also be characterized by periods of spontaneous remission and spontaneous occurrence (also referred to as "flares"). GI inflammation may be involved in the etiology and / or pathology of various GI conditions (herein referred to as "GI inflammatory conditions"), including, but not limited to, IBD and specific forms thereof as discussed herein, as well as other immune or inflammatory conditions such as various forms of colitis (e.g., ulcerative, granulomatous, ischemic, radiation-induced, infectious), ileitis and gastritis.
[0076] In another embodiment, the subject is diagnosed with, or suspected of having, inflammatory bowel disease (IBD), and the plurality of gene products is selected from Table 5. In another embodiment, the control corresponds to a subject with IBD.
[0077] Inflammatory bowel disease (IBD), which includes Crohn disease (CD) and ulcerative colitis (UC), is a relapsing and remitting condition characterized by chronic inflammation at various sites in the gastrointestinal tract, which results in diarrhea and abdominal pain.
[0078] According to exemplary embodiments, supervised classification algorithms may include support vector machines (SVM), decision trees, gradient boosted trees, random forest, regularized regression, multiple linear regression (MLR), principal component regression (PCR), partial least squares (PLS), discriminant function analysis (DFA) including linear discriminant analysis (LDA), nearest neighbor, artificial neural networks, multi-layer perceptrons (MLP), generalized regression neural network (GRNN), and combinations thereof. In a particular embodiment, the supervised classification algorithm is a linear classifier.
[0079] Supervised classifiers (also referred to herein as supervised classification algorithms) are prediction tools based on learning from examples of labeled data. A supervised classification algorithm is a form of learning and pattern recognition algorithm, in which labeled data, consisting of input (typically vector) -output (correct classification) pairs, is used to train the classifier. Through the training process, a classification function is inferred from labeled training data. The classification function can then be used for classifying new examples, thereby correctly determining the class labels for unseen instances. Exemplary supervised classifiers including, but not limited to,
[0080] As used herein, the term “linear classifier” relates to a supervised classification algorithm that predicts a class label based on a linear combination of input features. The term further encompasses a supervised classifier that makes classification decisions based on the value of a linear combination of an object's characteristics and / or feature values of a feature vector. Exemplary linear classifiers include, but are not limited to, those in which the weighted sum is calculated and used to classify two test groups, such as Linear Discriminant Analysis (LDA). In another aspect, there is provided a method for analyzing a stool sample, comprising determining the level of a plurality of gene products selected from those presented in Table 1 in a stool sample obtained from the subject. In another aspect, there is provided a method for analyzing a stool sample, comprising determining the level of a plurality of gene products selected from those presented in one or more of Tables 1-7 in a stool sample obtained from the subject. In various embodiments, the method is used for diagnosing, assessing the presence, prognosing, monitoring, determining or predicting the progression, assessing or predicting treatment response, identifying CRC remission, and / or determining the location of CRC.
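To illustrate the linear classifier defined above, the following Python sketch trains a weighted-sum decision rule from labeled examples and applies it to new feature vectors. It is a deliberately simplified stand-in: the weights are the difference between class centroids, which corresponds to LDA only under the assumption of an identity covariance matrix, and it is not presented as the classifier used by the inventors. All feature values are hypothetical.

```python
def centroid(rows):
    """Per-feature mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def train_linear_classifier(crc_rows, control_rows):
    """Return (weights, bias) of a centroid-difference linear decision rule."""
    mu_crc = centroid(crc_rows)
    mu_control = centroid(control_rows)
    weights = [a - b for a, b in zip(mu_crc, mu_control)]
    midpoint = [(a + b) / 2 for a, b in zip(mu_crc, mu_control)]
    # The decision boundary passes through the midpoint between the centroids.
    bias = -sum(w * m for w, m in zip(weights, midpoint))
    return weights, bias


def classify(features, weights, bias):
    """Classify by the sign of the weighted sum of feature values."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "CRC" if score > 0 else "control"


# Hypothetical labeled training data: one row per sample, one column per
# gene product (normalized levels).
crc_rows = [[5.0, 4.0, 1.0], [6.0, 5.0, 0.5], [5.0, 4.5, 1.5]]
control_rows = [[2.0, 1.0, 4.0], [1.5, 2.0, 5.0], [2.5, 1.5, 4.5]]

weights, bias = train_linear_classifier(crc_rows, control_rows)
label_a = classify([5.2, 4.1, 1.2], weights, bias)
label_b = classify([2.1, 1.4, 4.6], weights, bias)
print(label_a, label_b)
```

In practice a library implementation with cross-validation on an independent cohort would be preferable; the sketch only shows how a class label follows from a linear combination of input features.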
[0081] According to some embodiments, it is provided that the method is not for diagnosis of precancerous polyps.
[0082] Sample management and processing
[0083] In some embodiments, the sample to be used in connection with methods of the invention is a fecal wash sample (e.g. a distal (left-sided) fecal wash). In other embodiments, the sample is a solid fecal sample (stool sample).
[0084] The term "fecal sample" refers to a sample of feces as well as to a fecal wash sample and to any sample of the content of a colon. The term excludes, however, biopsy and / or tissue sampling. Typically and conveniently, the fecal sample is a stool sample.
[0085] As used herein, the term "fecal RNA sample" refers to a sample comprising or corresponding to at least a portion of the RNA transcriptome obtained from a fecal sample. A fecal RNA sample to be used in accordance with the invention is typically obtainable by a process comprising recovering RNA from a frozen stool sample (that had been advantageously stored at a temperature of -20°C or lower within 1 hour of sample collection), typically by thawing the frozen sample and isolating RNA from the thawed sample, and subjecting said fecal RNA sample to selective depletion of microbial rRNA as disclosed herein. In some embodiments, the fecal RNA sample is further processed prior to analysis, for example by performing reverse transcription of the resulting depleted RNA, and generating a library of gene products corresponding to said resulting depleted RNA.
[0086] In another embodiment the stool sample had been stored at a temperature not higher than -20°C within 1 hour of sample collection (defecation). In various other embodiments, said sample had been stored at a temperature not higher than -20°C for up to about one month, two weeks, one week, or in other embodiments 6, 5, 4, 3, 2 or 1 day prior to further processing (e.g. obtaining fecal RNA, selective microbial rRNA depletion and determining the levels of biomarkers in the resulting fecal RNA sample). In another embodiment, the stool sample had been stored at a temperature not higher than -20°C within 1 hour of sample collection and up to about 5-7 days. In another embodiment, step (i) of providing the fecal RNA sample comprises obtaining a stool sample from a subject, wherein the stool sample had been stored at a temperature not higher than -20°C within 1 hour of sample collection and up to about 5-7 days, extracting RNA from the thawed sample, and processing said RNA sample by selective depletion of microbial rRNA.
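The storage conditions of one of the embodiments above (freezing at -20°C or lower within 1 hour of collection, and processing within about 7 days) can be expressed as a simple acceptance check. The function name, parameter names and the choice of the 7-day limit are assumptions made for this Python sketch; other embodiments allow storage of up to about one month.

```python
from datetime import datetime, timedelta

# Limits taken from one embodiment in the text; other embodiments differ.
MAX_TIME_TO_FREEZE = timedelta(hours=1)
MAX_STORAGE = timedelta(days=7)
MAX_STORAGE_TEMP_C = -20.0


def sample_meets_storage_criteria(collected_at, frozen_at, processed_at,
                                  storage_temp_c):
    """Check freezing delay, storage temperature and storage duration."""
    time_to_freeze = frozen_at - collected_at
    storage_duration = processed_at - frozen_at
    return (
        timedelta(0) <= time_to_freeze <= MAX_TIME_TO_FREEZE
        and storage_temp_c <= MAX_STORAGE_TEMP_C
        and timedelta(0) <= storage_duration <= MAX_STORAGE
    )


# Hypothetical sample record, for illustration only.
collected = datetime(2025, 1, 1, 8, 0)
ok = sample_meets_storage_criteria(
    collected_at=collected,
    frozen_at=collected + timedelta(minutes=40),
    processed_at=collected + timedelta(days=5),
    storage_temp_c=-20.0,
)
print("sample accepted:", ok)
```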
[0087] According to some embodiments, the fecal sample is processed prior to determining the level(s) of the biomarkers. Processing of the sample may involve one or more of: filtration, distillation, centrifugation, extraction, concentration, dilution, purification, inactivation of interfering components, addition of reagents, and the like.
[0088] In some embodiments, the biomarker is RNA that is isolated, extracted or derived by a suitable method. Isolating RNA from a biological sample generally includes treating a biological sample in such a manner that the RNA present in the sample is extracted and made available for analysis. For example, phenol-based extraction methods are single-step RNA isolation methods based on guanidine isothiocyanate (GITC) / phenol / chloroform extraction that require much less time than traditional methods (e.g. CsCl ultracentrifugation). Many commercial reagents (e.g. Trizol, RNAzol, RNAWIZ) are based on this principle. The entire procedure can be completed within an hour to produce high yields of total RNA.
[0089] According to some embodiments, the processing comprises selective depletion of microbial ribosomal RNA (rRNA). In particular, fecal RNA samples obtained from solid fecal samples (stool samples) are advantageously subjected to selective depletion of microbial rRNA. Thus, the percentage of a type of rRNA (or one or more particular sub-types thereof) in the sample is reduced with respect to the total nucleic acid in said sample. In the context of the invention, the fraction of human exonic reads in the sequenced samples may be increased. In particular embodiments, the depleted sample contains no more than 20% microbial rRNA and typically less than 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, or 1%, including 0.5%, 0.1%, 0.01% or less. In another embodiment, the microbial rRNA is not detectable in the treated sample by conventional methods such as PCR.
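As a minimal illustration of the depletion criterion above, the following Python sketch computes the fraction of microbial rRNA among categorized sequencing reads and compares it to the 20% upper limit. Measuring the fraction on read counts rather than on nucleic acid mass, as well as the category names and the counts themselves, are assumptions made for this example.

```python
MAX_MICROBIAL_RRNA_FRACTION = 0.20  # upper limit from the particular embodiments


def microbial_rrna_fraction(read_counts):
    """Fraction of reads assigned to microbial rRNA among all counted reads."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads were counted")
    return read_counts.get("microbial_rrna", 0) / total


def is_sufficiently_depleted(read_counts, limit=MAX_MICROBIAL_RRNA_FRACTION):
    """True if the microbial rRNA fraction does not exceed the limit."""
    return microbial_rrna_fraction(read_counts) <= limit


# Hypothetical read categories and counts, for illustration only.
read_counts = {"microbial_rrna": 50, "human_exonic": 300, "other": 650}

fraction = microbial_rrna_fraction(read_counts)
print(f"microbial rRNA fraction: {fraction:.1%}")
print("depletion criterion met:", is_sufficiently_depleted(read_counts))
```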
[0090] In particular embodiments, selective microbial rRNA depletion may be performed by RNase H-based RNA depletion. In other particular embodiments, said depletion may include selective depletion of 5S, 16S and / or 23S rRNA. In another particular embodiment, selective microbial rRNA depletion comprises RNase H-based RNA depletion of 5S, 16S and 23S rRNA of gram-positive and gram-negative bacteria. Bacterial ribosomes contain three distinct RNA molecules referred to as 5S, 16S and 23S rRNAs. These names historically are related to the size of the RNA molecules, as determined by their sedimentation rate (e.g. in E. coli). While ribosomal RNA molecules vary substantially in size between organisms, 5S, 16S, and 23S rRNA are commonly used as generic names for the homologous RNA molecules in any bacterium, and this convention is referred to herein. These homologous regions may conveniently be targeted (e.g. by suitable nucleic acid probes) for selective depletion.
[0091] For example, DNA probes may be synthesized to be reverse-complementary to the bacterial or fungal transcripts. Next, RNase H enzyme, which digests the RNA strand of RNA-DNA hybrids, may be used. This leads to the selective digestion of only RNA molecules targeted by the DNA probes. Lastly, endonucleases such as DNase I enzyme may be used to remove the leftover DNA probes and other DNA residues left in the sample after RNA extraction. Another method for depleting particular RNAs is by using nucleic acid probes (which are attached to an affinity tag) that specifically hybridize to the RNAs. Exemplary affinity tags include, but are not limited to, hemagglutinin (HA), AviTag™, V5, Myc, T7, FLAG, HSV, VSV-G, His, biotin, or streptavidin.
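The design of DNA probes that are the reverse complement of a target transcript, as described for RNase H-based depletion, can be sketched in Python as follows. The target sequence, probe length and tiling step are hypothetical values chosen for illustration; real probe design would also consider melting temperature, specificity and coverage of the full rRNA.

```python
# Base pairing between an RNA template and an antisense DNA probe.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna_probe(rna_segment):
    """Return the DNA reverse complement (5'->3') of an RNA segment (5'->3')."""
    return "".join(
        RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna_segment.upper())
    )


def tile_probes(rna, probe_len, step):
    """Tile antisense probes of fixed length along the target RNA."""
    return [
        antisense_dna_probe(rna[i:i + probe_len])
        for i in range(0, len(rna) - probe_len + 1, step)
    ]


# Hypothetical short target sequence, for illustration only.
target_rna = "AUGGCAUU"
probes = tile_probes(target_rna, probe_len=4, step=2)
print(probes)
```

Each probe hybridizes to its segment of the target RNA, forming the RNA-DNA hybrid that RNase H then digests.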
[0092] In some embodiments, negative genomic selection of abundant microbial transcripts such as bacterial and / or fungal rRNA may be performed prior to the analysis. Examples of particular additional RNA transcripts that may be depleted include, but are not limited to, transcripts of Eubacterium rectale, Faecalibacterium prausnitzii, Bifidobacterium adolescentis, Ruminococcus sp. 5_1_39BFAA, Bifidobacterium longum, Subdoligranulum, Ruminococcus gnavus, Escherichia coli, Ruminococcus torques, Akkermansia muciniphila, Ruminococcus bromii, Dialister invisus, Collinsella aerofaciens, Bacteroides uniformis, Bacteroides vulgatus, Eubacterium hallii, Dorea longicatena, Prevotella copri, Alistipes putredinis and Bifidobacterium bifidum. In some embodiments, rRNA of at least one, at least two, at least three, at least four, at least five or all of the above identified bacteria is depleted.
[0093] After obtaining the RNA sample, cDNA may be generated therefrom. For synthesis of cDNA, template mRNA may be obtained directly from lysed cells or may be purified from a total RNA or mRNA sample. The total RNA sample may be subjected to a force to encourage shearing of the RNA molecules such that the average size of each of the RNA molecules is between 100-300 nucleotides, e.g. about 200 nucleotides. To separate the heterogeneous population of mRNA from the majority of the RNA found in the cell, various technologies may be used which are based on the use of oligo(dT) oligonucleotides attached to a solid support. Examples of such oligo(dT) oligonucleotides include: oligo(dT) cellulose / spin columns, oligo(dT) / magnetic beads, and oligo(dT) oligonucleotide coated plates.
[0094] According to another embodiment, long-read transcriptome sequencing is carried out, wherein the full length RNA molecule is sequenced (i.e. from the 3’ polyA tail to the 5’ cap).
[0095] Generation of single stranded DNA from RNA requires synthesis of an intermediate RNA-DNA hybrid. For this, a primer is required that hybridizes to the 3’ end of the RNA. Annealing temperature and timing are determined both by the efficiency with which the primer is expected to anneal to a template and the degree of mismatch that is to be tolerated.
[0096] The annealing temperature is usually chosen to provide optimal efficiency and specificity, and generally ranges from about 50 °C to about 80°C, usually from about 55 °C to about 70 °C, and more usually from about 60 °C to about 68 °C. Annealing conditions are generally maintained for a period of time ranging from about 15 seconds to about 30 minutes, usually from about 30 seconds to about 5 minutes.
[0097] According to a specific embodiment, the primer comprises a polydT oligonucleotide sequence. Preferably the polydT sequence comprises at least 5 nucleotides. According to another embodiment, the polydT sequence is between about 5 to 50 nucleotides, more preferably between about 5-25 nucleotides, and even more preferably between about 12 to 14 nucleotides.
[0098] Following annealing of the primer (e.g. polydT primer) to the RNA sample, an RNA-DNA hybrid is synthesized by reverse transcription using an RNA-dependent DNA polymerase. Suitable RNA-dependent DNA polymerases for use in the methods and compositions of the invention include reverse transcriptases (RTs). Examples of RTs include, but are not limited to, Moloney murine leukemia virus (M-MLV) reverse transcriptase, human immunodeficiency virus (HIV) reverse transcriptase, rous sarcoma virus (RSV) reverse transcriptase, avian myeloblastosis virus (AMV) reverse transcriptase, rous associated virus (RAV) reverse transcriptase, and myeloblastosis associated virus (MAV) reverse transcriptase or other avian sarcoma-leukosis virus (ASLV) reverse transcriptases, and modified RTs derived therefrom. See e.g. U.S. Patent No. 7,056,716. Many reverse transcriptases, such as those from avian myeloblastosis virus (AMV- RT), and Moloney murine leukemia virus (MMLV-RT) comprise more than one activity (for example, polymerase activity and ribonuclease activity) and can function in the formation of the double stranded cDNA molecules.
[0099] Additional components required in a reverse transcription reaction include dNTPs (dATP, dCTP, dGTP and dTTP) and optionally a reducing agent such as dithiothreitol (DTT) and MnCl2. Following cDNA synthesis, the present inventors contemplate amplifying the cDNA (e.g. using a polymerase chain reaction - PCR, details of which are known in the art).
[0100] Methods of analyzing the amount of RNA are known in the art and include e.g. Northern Blot analysis, RT-PCR analysis, RNA in situ hybridization stain, DNA microarray, DNA chips, oligonucleotide microarray, RNA sequencing and deep sequencing. In another embodiment, the method includes quantitative PCR (qPCR). In another embodiment determining the level of at least one human gene product in the fecal RNA sample comprises RNA barcoding. In another embodiment determining the level of at least one human gene product in the fecal RNA sample comprises RNA sequencing. In another embodiment determining the level of at least one human gene product in the fecal RNA sample comprises RNA barcoding and sequencing.
[0101] The term “barcode” or “barcoding” when used as a verb with reference to a reaction, indicates a reaction performed to covalently attach a barcode in the sense of the disclosure to the reference item, in a configuration allowing detection of the barcode. Accordingly, barcoding in the sense of the disclosure refers to coupling a unique set of tags or identifiers in order to mark molecules for downstream detection and identification. As used herein, “unique” means different from any other. The term “barcode” as used herein refers in particular to a short sequence of nucleotides (for example, DNA or RNA) that is used as an identifier for an associated molecule, such as a target molecule and / or target nucleic acid, or as an identifier of the source of an associated molecule, such as a cell-of-origin. Barcodes can allow for identification and / or quantification of individual sequencing-reads. In some embodiments, a barcode can be obtained by sequential direct covalent linkage of a tag with another tag until formation of a barcode comprising a series of two or more tags directly attached one to another through covalent linkage.
[0102] In a particular embodiment, unique molecular identifiers (UMI) are used. Sequencing linker or a subtype of nucleic acid barcode may be used in a method that uses molecular tags to detect and quantify unique products (e.g., individual transcripts). A UMI may be used to distinguish effects through a single clone from multiple clones. The term “clone” as used herein may refer to a single mRNA or target nucleic acid to be sequenced. In one example embodiment, a random sequence of between 4 and 20 base pairs may be used, which may be designed such that assignment to the original can take place despite up to 4-7 errors during amplification or sequencing.
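As a hedged illustration of how UMIs can be used to count individual transcripts despite amplification or sequencing errors, the following minimal Python sketch collapses reads that share a gene and a near-identical UMI. The input layout (pairs of gene symbol and UMI), the function names, the greedy merging rule and the `max_mismatches` tolerance are all assumptions made for this example and are not taken from the disclosure; the UMI sequences are invented.

```python
from collections import Counter, defaultdict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_molecules(reads, max_mismatches=1):
    """Count distinct UMIs per gene, merging UMIs that look like errors.

    reads: iterable of (gene, umi) pairs, one pair per sequencing read.
    max_mismatches: hypothetical tolerance; a UMI within this Hamming
    distance of a more abundant UMI of the same gene is merged into it.
    """
    umis_by_gene = defaultdict(Counter)
    for gene, umi in reads:
        umis_by_gene[gene][umi] += 1

    molecule_counts = {}
    for gene, umi_counts in umis_by_gene.items():
        kept = []  # representative UMIs, most abundant first
        for umi, _ in umi_counts.most_common():
            is_error_copy = any(
                len(umi) == len(rep) and hamming(umi, rep) <= max_mismatches
                for rep in kept
            )
            if not is_error_copy:
                kept.append(umi)
        molecule_counts[gene] = len(kept)
    return molecule_counts


# Invented reads: (gene symbol, UMI sequence).
reads = [
    ("VIM", "AACGTTGA"), ("VIM", "AACGTTGA"), ("VIM", "AACGTTGA"),
    ("VIM", "AACGTTGT"),  # one mismatch from the UMI above
    ("VIM", "CCGTAAGT"),
    ("SOD2", "GGATCCAA"),
]
print(count_molecules(reads))
```

With the default tolerance, the single-mismatch UMI is merged into its more abundant neighbor, so six reads collapse to three molecules overall. Setting `max_mismatches=0` counts every distinct UMI sequence separately.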
[0103] Unique molecular identifiers can be used, for example, to normalize samples for variable amplification efficiency. For example, in various embodiments, featuring a solid or semisolid support (for example a bead), to which nucleic acid barcodes (for example a plurality of barcodes sharing the same sequence) are attached, each of the barcodes may be further coupled to a unique molecular identifier, such that every barcode on the particular solid or semisolid support receives a distinct unique molecular identifier. A unique molecular identifier can then be, for example, transferred to a target molecule with the associated barcode, such that the target molecule receives not only a nucleic acid barcode, but also an identifier unique among the identifiers originating from that solid or semisolid support.
qPCR generally refers to the PCR technique known as real-time quantitative polymerase chain reaction, quantitative polymerase chain reaction or kinetic polymerase chain reaction. This technique can simultaneously amplify and quantify target nucleic acids using PCR, wherein the quantification is typically by virtue of an intercalating fluorescent dye or sequence-specific probes which contain fluorescent reporter molecules that are only detectable once hybridized to a target nucleic acid. Fluorescence can be monitored on each PCR cycle, providing an amplification plot that allows a user to follow the reaction in real time. The amount of product detected at a certain point of the run is directly related to the initial amount of target in the sample.
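The relationship between the amount of product detected during a qPCR run and the initial amount of target can be sketched numerically. The example below assumes an idealized model that is not stated in the disclosure: each cycle multiplies the product by (1 + efficiency), and two reactions are compared at the cycle where each crosses the same fluorescence threshold (commonly called the threshold cycle, Ct). The function name and the Ct values are hypothetical.

```python
def relative_quantity(ct_target: float, ct_reference: float,
                      efficiency: float = 1.0) -> float:
    """Starting amount of target relative to a reference reaction.

    Assumes both reactions amplify by a factor of (1 + efficiency) per
    cycle and are read at the same fluorescence threshold, so that
    N0_target / N0_reference = (1 + efficiency) ** (ct_reference - ct_target).
    """
    return (1.0 + efficiency) ** (ct_reference - ct_target)


# Invented Ct values: the target crosses the threshold 3 cycles earlier.
print(relative_quantity(ct_target=22.0, ct_reference=25.0))
```

Under perfect doubling (efficiency of 1.0), a reaction that crosses the threshold three cycles earlier started with eight times more template.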
[0104] Therapeutic applications
[0105] According to some embodiments of the invention, the method further comprises informing the subject of the diagnosis. As used herein the phrase “informing the subject” refers to advising the subject that based on the diagnosis the subject should seek a suitable treatment regimen.
[0106] According to another aspect, the present invention provides a method of assigning or recommending a medical treatment comprising diagnosing a colorectal cancer as described in any one of the above aspects and embodiments, and recommending the medical intervention based on the obtained results. All terms, embodiments and definitions disclosed in any one of the above aspects apply and are encompassed herein as well. According to some embodiments, the treatment comprises surgical intervention. According to other embodiments, the treatment includes chemotherapy, biological therapy or any other known method to treat colorectal cancer.
[0107] In other embodiments, the invention relates to a method of assigning a medical intervention (also referred to herein as a medical treatment). According to these embodiments, the intervention is assigned based on the results obtained in diagnostic methods of the invention, meaning that if a subject is determined to be afflicted with a colorectal tumor or a subtype thereof, a treatment corresponding to the identified tumor is selected for the subject.
[0108] Accordingly, the invention enables in embodiments thereof to determine whether the subject is in need of treatment for CRC (e.g. a surgical intervention / resection or a chemotherapy, biological therapy, irradiation and / or immunotherapy). Thus, it is envisioned that a subject in which the presence of CRC is indicated in accordance with the methods of the invention (e.g. an increase in the levels of gene products of Group A, G, I and / or K, and / or a decrease in the levels of gene products of Group B, H, J, and / or L, as compared to a healthy control) will be provided with a medical intervention for CRC treatment. In another example, it is envisaged that a subject predicted or indicated to have colon cancer in accordance with the methods of the invention (e.g. an increase in the level of gene products of Group C or E as compared to a rectal cancer control) will be provided with an intervention for colon cancer, and a subject predicted or indicated to have rectal cancer in accordance with the methods of the invention (e.g. an increase in the level of gene products of Group D or F as compared to a colon cancer control) will be provided with an intervention for rectal cancer.
[0109] According to another aspect, the present invention provides a method of treating a colorectal cancer in a subject in need thereof, the method comprising diagnosing a colorectal cancer as described in any one of the above aspects and embodiments and treating said subject. According to some embodiments, the treatment comprises surgical intervention and / or administering anticancer treatment. In another embodiment, the method comprises diagnosing a CRC in the subject, namely, determining that said subject is afflicted with CRC, and treating said subject in which the presence of CRC is indicated by administering to said subject a treatment or intervention for CRC as disclosed herein.
[0110] The term “treating” a condition or patient refers to taking steps to obtain beneficial or desired results, including clinical results. Beneficial or desired clinical results include, but are not limited to, ameliorating, abrogating, substantially inhibiting, slowing or reversing the progression of a disease, condition or disorder, substantially ameliorating or alleviating clinical or esthetical symptoms of a condition, substantially preventing the appearance of clinical or esthetical symptoms of a disease, condition, or disorder, and protecting from harmful or annoying symptoms. Treating further refers to accomplishing one or more of the following: (a) reducing the severity of the disorder; (b) limiting the development of symptoms characteristic of the disorder(s) being treated; (c) limiting worsening of symptoms characteristic of the disorder(s) being treated; (d) limiting recurrence of the disorder(s) in patients that have previously had the disorder(s); and / or (e) limiting recurrence of symptoms in patients that were previously asymptomatic for the disorder(s).
[0111] The term “treating cancer” as used herein should be understood to e.g. encompass treatment resulting in a decrease in tumor size; a decrease in the rate of tumor growth; stasis of tumor size; a decrease in the number of metastases; a decrease in the number of additional metastases; a decrease in the invasiveness of the cancer; a decrease in the rate of progression of the tumor from one stage to the next; inhibition of tumor growth in a tissue of a mammal having a malignant cancer; control of establishment of metastases; inhibition of tumor metastases formation; regression of established tumors as well as a decrease in the angiogenesis induced by the cancer, inhibition of growth and proliferation of cancer cells and so forth. The term “treating cancer” as used herein should also be understood to encompass prophylaxis such as prevention of cancer recurrence after previous treatment (including surgical removal) and prevention of cancer in an individual prone (genetically, due to lifestyle, chronic inflammation and so forth) to develop cancer. As used herein, “prevention of cancer” is thus to be understood to include prevention of metastases, for example after surgical procedures or after chemotherapy.
[0112] For example, surgical resection is the mainstay for curative-intent treatment of CRC. Resection consists of removal of the anatomic segment of the large intestine harboring the tumor along with its regional lymphatic drainage. In general, a wide, 5-cm margin is planned, but a negative margin of any distance is acceptable. Resection is typically followed by reconnection of the bowel segments to restore enteral continuity (anastomosis).
[0113] For rectal cancer, sphincter-sparing surgical resection (low anterior resection) can be done in patients with low tumors near, but not involving, the anal sphincter complex without significant risk of local recurrence or decreased long-term survival. Sphincter-sparing procedures necessitate a low anastomosis, which often is followed by functional issues postoperatively (e.g., fecal leakage, incontinence). If there is local recurrence or poorly tolerated bowel function after a sphincter-sparing procedure, an abdominoperineal resection (APR) with permanent colostomy is usually recommended.
[0114] In addition, in colon cancer, postoperative chemotherapy is indicated for patients with stage III disease (lymph node-positive) or patients with high-risk stage II disease (lymph node-negative but high-risk features seen on pathology such as lymphovascular invasion).
[0115] For rectal cancer, several recent studies have further suggested using total neoadjuvant therapy (delivery of all chemotherapy and radiation before surgery). In general, patients who are stage T3 or T4 or who are suspected of having nodal disease will receive both chemotherapy and chemoradiation in conjunction with surgical resection.
Kits
[0116] In other embodiments, the invention is directed to articles of manufacture (such as diagnostic kits) which may be used to facilitate the methods of the invention. Thus, in another aspect, there is provided a diagnostic kit, comprising means for specifically determining and quantifying the levels of a plurality of human gene products as disclosed herein in a fecal RNA sample.
[0117] In other embodiments, diagnostic kits in accordance with the invention comprise means for specifically determining and quantifying the levels of a plurality of human gene products as disclosed herein in a fecal RNA sample. In various embodiments, the plurality of gene products is selected from a list of biomarkers as present in Table 1, 2 and / or 3, wherein each possibility represents a separate embodiment of the invention. In various embodiments, the plurality of gene products is selected from a list of biomarkers as present in Table 3, 5, 6 and / or 7, wherein each possibility represents a separate embodiment of the invention. In another embodiment the plurality of gene products is selected from a list of biomarkers as set forth in Group A, B, C, D, E and / or F as disclosed herein, wherein each possibility represents a separate embodiment of the invention. In another embodiment the plurality of gene products is selected from a list of biomarkers as set forth in Group G, H, I, J and / or K, wherein each possibility represents a separate embodiment of the invention.
[0118] In another embodiment there is provided a kit comprising means for specifically determining and quantifying the levels of a plurality of gene products in a fecal RNA sample, and instructions for diagnosing colorectal cancer, wherein the plurality of gene products is selected from a list of biomarkers as presented in one or more of Tables 1-6.
[0119] As used herein, means for specifically determining and quantifying the levels of a plurality of gene products denotes structure-specific reagents (such as primers, probes or antibodies) which facilitate the assessment of the amount of the particular gene products in question in a sample. Reagents facilitating the evaluation of the amounts of DNA or RNA in a non-specific manner (that does not allow an evaluation of the level of a particular gene product as compared to other gene products) are not considered to facilitate "specific" determination and / or quantification. Accordingly, reagents such as buffers or enzymes are not considered to allow specific determination and quantification of the levels of gene products unless accompanied by sequence-specific reagents such as primers or probes.
[0120] In some embodiments, the means for specifically determining and quantifying the levels of a plurality of gene products may comprise primers or probes for facilitating specific quantification of the gene products. In some embodiments, said primers or probes may be attached to a surface so as to form a diagnostic article of manufacture, e.g. a gene expression array or chip. In an exemplary embodiment the means comprise quantitative polymerase chain reaction (qPCR) primers directed to said plurality of human gene products. In some embodiments, the means for specifically determining and quantifying the levels of a plurality of gene products comprise primers or probes to be used in an assay disclosed herein for determining and quantifying the levels of gene products, including, but not limited to droplet digital PCR (ddPCR) and targeted sequencing (also referred to herein as targeted transcriptomics).
[0121] According to another aspect, the present invention provides a kit comprising means for determining levels of a plurality of biomarkers in a fecal sample, and instructions for diagnosing colorectal cancer, wherein the plurality of biomarkers is selected from a list of biomarkers as presented in Table 1.
[0122] All terms, embodiments and definitions disclosed in any one of the above aspects apply and are encompassed herein as well.
[0123] In some embodiments, the kit comprises means for providing a fecal RNA sample and / or means for comparing the levels of said plurality of human gene products in the sample to their levels in a control fecal RNA sample. Each possibility represents a separate embodiment of the invention.
[0124] In some embodiments, the kit contains means for isolation, extraction or derivation of RNA by a suitable method. Isolating RNA from a biological sample generally includes treating a biological sample in such a manner that the RNA present in the sample is extracted and made available for analysis. For example, phenol-based extraction methods are single-step RNA isolation methods based on guanidine isothiocyanate (GITC) / phenol / chloroform extraction, and require much less time than traditional methods (e.g. CsCl ultracentrifugation). Many commercial reagents (e.g. Trizol, RNAzol, RNAWIZ) are based on this principle. The entire procedure can be completed within an hour to produce high yields of total RNA.
[0125] Silica gel-based purification methods: RNeasy is a purification kit marketed by Qiagen. It uses a silica gel-based membrane in a spin-column to selectively bind RNA larger than 200 bases. The method is quick and does not involve the use of phenol.
[0126] Oligo-dT based affinity purification of mRNA: Due to the low abundance of mRNA in the total pool of cellular RNA, reducing the amount of rRNA and tRNA in a total RNA preparation greatly increases the relative amount of mRNA. The use of oligo-dT affinity chromatography to selectively enrich poly (A)+ RNA has been practiced for over 20 years. The result of the preparation is an enriched mRNA population that has minimal rRNA or other small RNA contamination. mRNA enrichment is essential for construction of cDNA libraries and other applications where intact mRNA is highly desirable. The original method utilized oligo-dT conjugated resin column chromatography and can be time consuming. Recently more convenient formats such as spin-column and magnetic bead based reagent kits have become available.
[0127] In some embodiments, the diagnostic kits of the invention further comprise qPCR reagents. For example, the kits may include one or more of a Taq enzyme premix system (Taq enzyme, buffer, and dNTP), primers and probes of target genes for qPCR, a reagent for quality control of an activity of a Taq enzyme, reagents for plotting a reference curve and instructions for parameter set and procedure.
[0128] Gene products
[0129] Listed below are groups of gene products that may be used in connection with methods and kits of the invention, as described herein. In particular, the products of genes presented in Tables 1-7 below may be used in embodiments of the invention as disclosed herein. As disclosed herein, advantageous biomarkers to be used in embodiments of the invention are human gene products, more specifically RNA transcripts.
[0130] Table 1. A list of gene products upregulated or downregulated in CRC stool
[0131] Table 2. Gene products up-regulated specifically in colon or rectal tumors
[0132] Table 3 - Gene products highly up-regulated specifically in colon or rectal tumors
[0133]
[0134] Table 4 - Gene products included in classifier excluding myeloid cell genes
[0135] Table 5 - Gene products included in classifier excluding IBD related genes
[0136] Table 6 - Gene products for CRC diagnosis
[0137] In Tables 4-6, the gene products indicated as "CRC related genes" (Groups G, I, K) represent gene products found to be up-regulated in fecal samples of CRC patients as compared to controls, whereas "control related genes" (Groups H, J, L) represent gene products found to be down-regulated in fecal samples of CRC patients as compared to controls.
Additional embodiments
[0138] In one aspect, there is provided a method of diagnosing a colorectal cancer, the method comprising (i) determining, in a fecal RNA sample of a subject, the levels of gene products selected from Table 1; and (ii) comparing the level of the gene products from the fecal RNA sample to their respective levels in a control (corresponding to a control sample), wherein a difference in the level of the gene products in the fecal RNA sample and in the control (i.e. as compared to the control) is indicative of the presence of the colorectal cancer in the subject.
[0139] In one embodiment, the method comprises determining the levels of a plurality of gene products presented in Table 1 in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products, and comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm, wherein the outcome of the comparison is indicative of the presence of colorectal cancer in said subject.
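As one possible illustration of comparing a sample's transcriptomic signature with control signatures by a supervised classifier, the sketch below uses a nearest-centroid rule in plain Python. The choice of nearest-centroid, the function names (`centroid`, `train`, `classify`), the three-gene signature length and all numeric values are assumptions made for illustration only; the text does not specify a particular algorithm or training data at this point.

```python
from math import dist


def centroid(samples):
    """Mean expression vector of a list of equal-length signatures."""
    n = len(samples)
    return [sum(values) / n for values in zip(*samples)]


def train(crc_signatures, control_signatures):
    """Return one centroid per class from labeled training signatures."""
    return {
        "CRC": centroid(crc_signatures),
        "control": centroid(control_signatures),
    }


def classify(signature, centroids):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(signature, centroids[label]))


# Invented, normalized levels for three hypothetical gene products.
crc_training = [[8.0, 7.5, 1.0], [7.0, 8.5, 2.0]]
control_training = [[2.0, 1.5, 6.0], [1.0, 2.5, 7.0]]
model = train(crc_training, control_training)
print(classify([6.5, 7.0, 2.5], model))
```

With two classes, the nearest-centroid rule separates samples by a hyperplane, so it is one simple instance of a linear decision rule. A real application would need many more samples, proper normalization and held-out validation.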
[0140] In another embodiment, a transcriptomic signature characterized by: (i) an increase in the level of gene products of Group A in Table 1 in the fecal RNA sample of the subject in comparison to their corresponding levels in a healthy control; (ii) a decrease in the level of gene products of Group B in Table 1 in the fecal RNA sample of said subject in comparison to their corresponding levels in a healthy control; or (iii) both (i) and (ii), is indicative of the presence of a colorectal cancer in said subject, wherein each possibility represents a separate embodiment of the invention.
[0141] In another embodiment, said gene products are selected from the group consisting of: ALOX5AP, CEBPB, CMTM2, CXCL8, GADD45B, GCA, HCAR3, RGS2, SLC2A3, SOCS3, SOD2, SRGN, TSC22D3, and VIM. In another embodiment said gene products comprise ALOX5AP, CEBPB, CMTM2, CXCL8, GADD45B, GCA, HCAR3, RGS2, SLC2A3, SOCS3, SOD2, SRGN, TSC22D3, and VIM.
[0142] In another aspect, there is provided a method of diagnosing a colorectal cancer, the method comprising (i) determining, in a fecal RNA sample of a subject, the levels of gene products selected from any one of Tables 4-6 and 7; and (ii) comparing the level of the gene products from the fecal RNA sample to their respective levels in a control sample, wherein a difference in the level of the gene products in the fecal RNA sample and in the control is indicative of the presence of the colorectal cancer in the subject. Each possibility represents a separate embodiment of the invention.
[0143] In another embodiment, the method comprises determining the levels of a plurality of gene products, comprising gene products presented in Table 1 and further comprising at least one additional gene product selected from Table 6, to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products. In another embodiment the at least one additional gene product selected from Table 6 is selected from the group consisting of ANKRD12, MYL6B, PSMB9, UBR4 (CRC-associated), and control biomarkers (gene products up regulated in non-CRC samples) ARRDC4, CRIP1, LYPD8, PEX26, and SFN. In another embodiment the at least one additional gene product comprises ANKRD12, MYL6B, PSMB9, UBR4, and control biomarkers ARRDC4, CRIP1, LYPD8, PEX26, and SFN.
[0144] In one embodiment, the method comprises determining the levels of a plurality of gene products presented in any one of Tables 4-6 and 7 in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products, and comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm, wherein the outcome of the comparison is indicative of the presence of colorectal cancer in said subject. Each possibility represents a separate embodiment of the invention.
[0145] In another embodiment, a transcriptomic signature characterized by: (i) an increase in the level of gene products of Group G, I and / or K in the fecal RNA sample of the subject in comparison to their corresponding levels in a healthy control; (ii) a decrease in the level of gene products of Group H, J, and / or L in the fecal RNA sample of said subject in comparison to their corresponding levels in a healthy control; or (iii) both (i) and (ii), is indicative of the presence of a colorectal cancer in said subject, wherein each possibility represents a separate embodiment of the invention.
[0146] In another embodiment, the method further comprises subjecting the fecal RNA sample to selective depletion of microbial ribosomal RNA (rRNA).
[0147] In another embodiment, the method further comprises predicting the location of the colorectal cancer, comprising determining in said fecal RNA sample the levels of additional gene products selected from gene products presented in Table 2 or 3 and comparing the determined levels to their respective levels in a control sample, wherein a difference in the level of the additional gene products in the fecal RNA sample and in the control is predictive of the location of the colorectal cancer.
[0148] In another aspect, there is provided a method of predicting the location of a colorectal cancer in a subject having colorectal cancer, the method comprising: (i) determining, in a fecal RNA sample of a subject, the levels of a plurality of gene products selected from gene products presented in Table 2 or 3, thereby obtaining the transcriptomic signature of the sample with respect to the plurality of gene products, and (ii) comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm, wherein the outcome of the comparison is predictive of the location of the colorectal cancer.
[0149] In various embodiments, a transcriptomic signature characterized by an increase in the level of gene products of Group C in Table 2 or of Group E in Table 3 in the fecal RNA sample of the subject in comparison to their corresponding levels in a rectal cancer control indicates that the subject has colon cancer, and a transcriptomic signature characterized by an increase in the level of gene products of Group D in Table 2 or Group F in Table 3 in the fecal RNA sample of said subject in comparison to their corresponding levels in a colon cancer control indicates that the subject has rectal cancer. In another aspect, there is provided a method of assigning a medical intervention comprising diagnosing a colorectal cancer according to a method as disclosed herein, and assigning the medical intervention based on the obtained results. In another aspect, there is provided a method of treating a colorectal cancer in a subject in need thereof, the method comprising diagnosing a colorectal cancer according to a method as disclosed herein, and treating the cancer.
[0150] In another aspect, there is provided a method for monitoring the efficacy of a treatment of a colorectal cancer, the method comprising (i) determining, in a fecal RNA sample of a subject, the levels of gene products selected from Table 1 or any one of Tables 4-6 and 7; and (ii) comparing the level of the gene products from the fecal sample to their corresponding levels in a fecal sample obtained from the same subject at an earlier sampling and / or to the level of the one or more gene products in the control, wherein a change in the level of the biomarkers in two consecutive measurements is indicative of the efficacy of the cancer treatment.
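The comparison of two consecutive measurements from the same subject can be sketched as a per-gene log2 fold change. This is a minimal example under stated assumptions: the function name, the dictionary input format, the pseudocount and all expression values are hypothetical, and the gene symbols are borrowed from the lists in this description without implying any particular direction of change.

```python
from math import log2


def log2_fold_changes(earlier, later, pseudocount=1.0):
    """Per-gene log2(later / earlier) for genes measured at both time points.

    earlier, later: dicts mapping gene symbol -> normalized level.
    pseudocount: hypothetical constant added to both levels so that a
    level of zero does not cause a division by zero.
    """
    shared = earlier.keys() & later.keys()
    return {
        gene: log2((later[gene] + pseudocount) / (earlier[gene] + pseudocount))
        for gene in sorted(shared)
    }


# Invented levels before and after a treatment.
pre_treatment = {"CXCL8": 63.0, "SOCS3": 31.0, "VIM": 15.0}
post_treatment = {"CXCL8": 7.0, "SOCS3": 3.0, "VIM": 15.0}
changes = log2_fold_changes(pre_treatment, post_treatment)
for gene, change in changes.items():
    print(f"{gene}: {change:+.2f}")
```

Negative values indicate a lower level at the later sampling. How large a change counts as meaningful is not defined here and would have to be established empirically.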
[0151] In another embodiment, the method comprises determining the levels of a plurality of gene products presented in Table 1 or any one of Tables 4-6 and 7 in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products, and comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm.
[0152] In another embodiment, methods of the invention further comprise subjecting the fecal RNA sample to selective depletion of microbial rRNA.
[0153] In another aspect, there is provided a kit comprising means for determining levels of a plurality of gene products in a fecal sample, and instructions for diagnosing colorectal cancer, wherein the plurality of gene products is selected from a list of biomarkers as presented in any one of Tables 1-6 and 7 wherein each possibility represents a separate embodiment of the invention. In another embodiment the kit further comprises means for specifically determining and quantifying the levels of the plurality of gene products in a fecal RNA sample.
[0154] In another aspect there is provided a method for differential diagnosis of colorectal cancer in a subject in need thereof, comprising (i) determining, in a fecal RNA sample of a subject, the levels of gene products selected from Table 4 and / or 5; and (ii) comparing the level of the gene products from the fecal RNA sample to their respective levels in a control sample, wherein a difference in the level of the gene products in the fecal RNA sample and in the control is indicative of the presence of the colorectal cancer in the subject. In another embodiment, the subject is diagnosed with, or suspected of having, gastrointestinal (GI) inflammation, and the plurality of gene products is selected from Table 4. In another embodiment, the control corresponds to a subject with GI inflammation.
[0155] In another embodiment, the subject is diagnosed with, or suspected of having, inflammatory bowel disease (IBD), and the plurality of gene products is selected from Table 5. In another embodiment, the control corresponds to a subject with IBD.
[0156] In an exemplary embodiment of the methods disclosed herein, the supervised classification algorithm is a linear classifier. In another exemplary embodiment, determining the levels of said plurality of gene products in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products comprises calculating a CRC predictive score for the sample, as follows: (sum of normalized CRC markers) / (sum of normalized CRC markers + sum of normalized control markers), wherein "CRC markers" represents gene products upregulated in CRC fecal samples as compared to controls (e.g. those listed in Groups A, G, I, and / or K) and "control markers" represents gene products found to be down-regulated in fecal samples of CRC patients as compared to controls (e.g. those listed in Groups B, H, J, and / or L).
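The CRC predictive score defined above can be written as a short function. The sketch below is illustrative: the function name, the dictionary input, the handling of missing genes and the expression values are assumptions, while the marker symbols are the CRC-associated and control biomarkers named earlier in this description for Table 6.

```python
def crc_predictive_score(levels, crc_markers, control_markers):
    """(sum of CRC markers) / (sum of CRC markers + sum of control markers).

    levels: dict mapping gene symbol -> normalized level in one sample.
    Genes missing from `levels` are counted as 0 (an assumption of this
    sketch, not a rule from the text).
    """
    crc_sum = sum(levels.get(gene, 0.0) for gene in crc_markers)
    control_sum = sum(levels.get(gene, 0.0) for gene in control_markers)
    total = crc_sum + control_sum
    if total == 0:
        raise ValueError("no marker signal in sample")
    return crc_sum / total


CRC_MARKERS = ["ANKRD12", "MYL6B", "PSMB9", "UBR4"]
CONTROL_MARKERS = ["ARRDC4", "CRIP1", "LYPD8", "PEX26", "SFN"]

# Invented normalized levels for a single fecal RNA sample.
sample = {
    "ANKRD12": 30.0, "MYL6B": 20.0, "PSMB9": 25.0, "UBR4": 15.0,
    "ARRDC4": 2.0, "CRIP1": 3.0, "LYPD8": 1.0, "PEX26": 2.0, "SFN": 2.0,
}
score = crc_predictive_score(sample, CRC_MARKERS, CONTROL_MARKERS)
print(f"CRC predictive score: {score:.2f}")
```

By construction the score lies between 0 and 1, and it rises as CRC markers dominate the combined signal. Any decision threshold applied to it would need to be calibrated on labeled samples.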
[0157] The following examples are presented in order to more fully illustrate some embodiments of the invention. They should, in no way be construed, however, as limiting the broad scope of the invention.
[0158] EXAMPLES
[0159] The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying current knowledge, readily modify and / or adapt for various applications such specific embodiments without undue experimentation and without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. The means, materials, and steps for carrying out various disclosed functions may take a variety of alternative forms without departing from the invention.
Methods
[0160] Patient population
[0161] Human samples and clinical data were obtained from patients undergoing colectomy surgery due to presence of malignancy. Control stool samples were obtained from healthy individuals. Informed consent was obtained from all patients and all experiments followed all the Helsinki committee guidelines and regulations.
[0162] Sample collection
[0163] Pre-operational stool samples were collected either at home or at the hospital 1 to 7 days prior to tumor resection; post-operational samples were collected at home at least 1 month after tumor resection. Stools were collected directly into RNA Later (AM7021, Invitrogen), placed at -20°C for up to 5 days and transferred to the lab for long-term storage at -80°C. Biopsies were collected in the operating room as soon as the specimen was excised. Biopsies were snap frozen directly on dry ice.
[0164] RNA extraction
[0165] For tissue samples: snap frozen tissues were thawed in 300 µl Tri-reagent and mechanically homogenized with bead beating, followed by a short centrifugation step to pull down beads and any tissue left-overs. Ethanol was then added at a ratio of 1:1 to the supernatant from the previous step, and extraction continued according to the manufacturer's instructions for the Direct-zol mini prep kit (ZYMO research, R2052). For stool samples: stools in RNA Later were thawed on ice for 30 minutes until the buffer was completely thawed. The stool sample was removed from the buffer and wiped on Kim-wipe paper. A pea-sized stool sample was placed in lysis buffer containing 1 ml RLT buffer with 40 mM DTT and about 50 µl of beads. Stool was mechanically homogenized with bead beating for 3 minutes, followed by a 3-minute centrifugation step at 5000 rpm to pull down beads and any left-over stool solids. Ethanol was then added at a ratio of 1:1 to the supernatant from the previous step, and extraction continued according to the manufacturer's instructions for the Qiagen RNeasy Micro Kit (74004, Qiagen). Stool RNA was eluted in 30 µl RNase-free water.
[0166] RNA sequencing
[0167] Stool sample RNA was depleted of bacterial ribosomal RNA using the NEBNext rRNA Depletion Kit (Bacteria, E7850X) according to the manufacturer's instructions. Briefly, one third of the total isolated stool RNA was used as input. Probes specific to abundant bacterial ribosomal RNA were hybridized to rRNA molecules in the total stool RNA sample by a temperature ramp-down from 95°C to 22°C over the course of 30 minutes. The rRNA-probe complexes were removed using RNase H1 enzyme. Next, DNase I was used to clean up the remaining sample by eliminating any residual DNA probes or genomic DNA. RNA was eluted in 9 µl nuclease-free water and used directly in the RT reaction. RNA was processed by the mcSCRBseq protocol (Bagnoli, J. W. et al. Nat. Commun. 9, 2937 (2018)) with minor modifications. For tissue RNA, the RT reaction was applied to 20 ng of total RNA (2 µl of 10 ng/µl RNA, 6.4 µl nuclease-free water). For stool RNA, 8.4 µl of bacterial-rRNA-depleted RNA was used. The final RT reaction volume was 20 µl (1x Maxima H Buffer, 1 mM dNTPs, 2 µM TSO* E5V6NEXT, 7.5% PEG8000, 20U Maxima H enzyme, 2 µl barcoded RT primer, 8.4 µl RNA sample). Subsequent steps were applied as described in the protocol. Library preparation was done using the Nextera XT kit (Illumina) on 0.6 ng amplified cDNA from tissue or 1 ng from stool. Libraries at a final concentration of 2 nM were loaded on a NovaX (Illumina) sequencing machine, aiming at 20 M reads per sample, with the following settings: Read1 - 26 bp, Index1 - 10 bp, Index2 - 10 bp, Read2 - 66 bp.
[0168] Bioinformatics and computational analysis
[0169] Illumina output files were demultiplexed and aligned to the human HG38 genome with UTAP (Kohen, R. et al. BMC Bioinformatics 20, 154 (2019)), using Cutadapt with the default parameters.
[0170] Statistical analyses were performed with MATLAB R2023a. Mitochondrial and ribosomal genes as well as non-protein coding genes were removed from the analysis. Protein coding genes were extracted using the annotation in the Ensembl database (BioMart) for reference genome GRCm38 version 91. Gene expression for each sample was consequently normalized by the sum of the UMIs of the remaining genes that individually take up less than 5% of the sample sum. Samples with less than 1500 genes over the remaining genes were removed from the analyses.
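By way of non-limiting illustration only, the per-sample normalization and sample filtering described above may be sketched as follows. This is not the MATLAB code used in the analyses; the function names `normalize_sample` and `passes_gene_filter` are hypothetical, and the sketch assumes that a gene counts as detected when it has at least one UMI.

```python
def normalize_sample(umis, max_fraction=0.05):
    """Scale one sample's UMI counts by the summed UMIs of non-dominant genes.

    `umis` maps gene name -> UMI count, after mitochondrial, ribosomal and
    non-protein-coding genes have been removed. Genes that individually take
    up `max_fraction` or more of the sample sum are excluded from the
    denominator, but are still reported in the output.
    """
    total = sum(umis.values())
    if total == 0:
        return {gene: 0.0 for gene in umis}
    denominator = sum(c for c in umis.values() if c / total < max_fraction)
    if denominator == 0:
        return {gene: 0.0 for gene in umis}
    return {gene: count / denominator for gene, count in umis.items()}


def passes_gene_filter(umis, min_genes=1500):
    """Keep a sample only if at least `min_genes` genes are detected.

    Assumption: "detected" means a UMI count greater than zero.
    """
    detected = sum(1 for count in umis.values() if count > 0)
    return detected >= min_genes
```

In this sketch a gene that dominates a sample (for example one taking up 90% of its UMIs) does not inflate the denominator, so the remaining genes stay comparable across samples.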
[0171] Differential gene expression on the Pelka et al. dataset was performed on 2000 cells from each cell type at the ClusterMidway level.
[0172] Clustering was performed with the MATLAB function clustergram over the Z-score-transformed expression matrix, using Spearman distances, and included genes with maximal expression above 5x10^-5 of summed UMIs. Differential gene expression was performed using Wilcoxon rank-sum tests and Benjamini-Hochberg FDR corrections.
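For illustration, the Benjamini-Hochberg correction applied to the per-gene p-values may be sketched as follows. This is a minimal pure-Python sketch of the standard step-up procedure, not the MATLAB routine used in the analyses; the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values, in the input order.

    For p-values sorted in ascending order, the raw adjustment is
    p_(k) * m / k; a running minimum taken from the largest p-value
    downwards keeps the q-values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues
```

A gene would then be retained when its q-value falls below the chosen threshold.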
[0173] Analysis of single cell data
[0174] Clustering was performed with the MATLAB function clustergram over the Z-score-transformed expression matrix, using Spearman distances and including highly variable genes (log-log regression residuals > 0 of noise (std / mean) vs. mean) with maximal expression above 5e-5 of normalized summed UMIs. These genes were crossed with differentially expressed genes of cancer and control single cells extracted as follows. From each of the clusters at the ClusterMidway level at the Pelka UMAP, 1000 cells were randomly selected and divided into 2 groups based on the specimen type (N for normal, T for Tumor). Differential gene expression was performed using Wilcoxon rank-sum tests and Benjamini-Hochberg FDR corrections, and a list of 2185 genes (FDR q-value less than 0.05 and fold change over 1.5) was extracted.
[0175] For the comparison of gene expression ratio between CRC single cells and healthy single cells to gene expression ratio between CRC and healthy tissue (Fig. 2C), data from Pelka et al. (Cell 184, 4734-4752.e20 (2021)) were used. From each of the Epi and EpiT clusters, 5000 cells were randomly selected for the analysis. The mean expression for each gene was calculated for each group and used for the ratio of CRC to control single cells. All p-values were computed using Kruskal-Wallis tests.
[0176] Analysis of spatial transcriptomics data
[0177] Data were taken from Oliveira et al. (https://doi.org/10.1101/2024.06.04.597233 (2024)). Three samples of colorectal cancer tissues were analyzed on Visium HD slides. For each slide, K-means clustering was used with K=5. Next, the clusters that represent the tumor region or the adjacent non-tumor region were selected. Differential gene expression between the clusters for each slide was performed using the Loupe Browser. Genes were divided into 2 groups, tumor-related or adjacent non-tumor-related, and plotted against the tissue or stool fold change between the CRC and control samples.
[0178] Linear Classifier for Cancer Score Prediction
[0179] Samples used to build the classifier were stool or tissue samples from the CRC or control cohort. Classification was based on a parameter that represented the fractional summed expression of CRC-related genes, extracted as follows. For 100 iterations, the data set was split into a training set (50%) and a test set (50%). Differential gene expression between the CRC and control training set samples was performed. Genes with a maximal normalized expression higher than 5e-4 over all the included samples that were expressed in more than 5% of the samples were further considered. The Kruskal-Wallis test followed by the Benjamini-Hochberg FDR correction was performed. Genes whose mean expression in CRC samples was more than 2-fold higher than the mean in control samples, with FDR q-value less than 0.25, were selected as CRC markers. Inversely, genes whose mean expression in CRC samples was more than 2-fold lower than the mean in control samples, with FDR q-value less than 0.25, were chosen as control markers. Those genes were identified based on the training set only.
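As a non-limiting sketch, the marker selection step on a single training set may be expressed as follows. The function name `select_markers` and its arguments are hypothetical; the sketch assumes that the expression filters and the q-values have already been computed, and that mean expression values are non-negative.

```python
def select_markers(mean_crc, mean_control, qvalues, fold=2.0, q_max=0.25):
    """Split genes into CRC markers and control markers.

    mean_crc / mean_control: gene -> mean normalized expression in the
    CRC and control training samples. qvalues: gene -> FDR q-value.
    A gene is a CRC marker if its CRC mean is more than `fold` times its
    control mean, and a control marker in the inverse case; both require
    a q-value below `q_max`.
    """
    crc_markers, control_markers = [], []
    for gene, q in qvalues.items():
        if q >= q_max:
            continue
        crc, control = mean_crc[gene], mean_control[gene]
        if crc > fold * control:
            crc_markers.append(gene)
        elif control > fold * crc:
            control_markers.append(gene)
    return crc_markers, control_markers
```

Because the selection uses training samples only, the resulting gene lists can be applied to the held-out test set without information leakage.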
[0180] Next, for the test set samples, the sums of CRC markers and control markers were calculated, after the gene expression levels for each gene were normalized internally by their maximal values across the test set. Cancer score was calculated for each test set sample, as follows: (sum of normalized CRC markers) / (sum of normalized CRC markers + sum of normalized control markers). The receiver operating characteristic curve and subsequent false-positive rate, true-positive rate, and AUC were recorded for each test set to examine the classification of samples of CRC patients and control patients based on the cancer score.
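The cancer score defined above may be illustrated by the following minimal sketch. The function name `cancer_scores` and the data layout (a mapping from sample name to per-gene normalized expression) are assumptions made for illustration only; returning NaN when no marker is expressed is likewise a choice of this sketch.

```python
def cancer_scores(expression, crc_markers, control_markers):
    """Compute the cancer score of every sample in a test set.

    expression: sample name -> {gene -> normalized expression}.
    Each marker gene is first divided by its maximal value across the
    test set; the score is then
        sum(CRC markers) / (sum(CRC markers) + sum(control markers)).
    """
    genes = list(crc_markers) + list(control_markers)
    gene_max = {
        g: max(sample.get(g, 0.0) for sample in expression.values())
        for g in genes
    }

    def scaled(sample, gene):
        top = gene_max[gene]
        return sample.get(gene, 0.0) / top if top > 0 else 0.0

    scores = {}
    for name, sample in expression.items():
        up = sum(scaled(sample, g) for g in crc_markers)
        down = sum(scaled(sample, g) for g in control_markers)
        # Undefined when no marker is expressed in the sample.
        scores[name] = up / (up + down) if (up + down) > 0 else float("nan")
    return scores
```

The score lies between 0 and 1, with higher values indicating a larger share of CRC-marker expression.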
[0181] In a classifier excluding myeloid cell genes, genes were omitted if they showed expression levels in the myeloid cell type that were at least 1.5 times greater than their maximum expression in any other cell type. In a classifier excluding IBD-related genes, genes were omitted if they were differentially expressed in fecal washes between IBD and control samples (AUC wash>0.85).
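The myeloid exclusion rule may be sketched as follows, under the assumption that a per-gene table of mean expression by cell type is available. The names `is_myeloid_enriched` and `drop_myeloid_genes`, the cell type label "Myeloid", and the gene names and values in the usage example are hypothetical placeholders, not data from the study.

```python
def is_myeloid_enriched(mean_by_cell_type, myeloid="Myeloid", ratio=1.5):
    """True if myeloid expression is at least `ratio` times the maximum
    expression observed in any other cell type."""
    others = [v for k, v in mean_by_cell_type.items() if k != myeloid]
    return mean_by_cell_type[myeloid] >= ratio * max(others)


def drop_myeloid_genes(profiles, myeloid="Myeloid", ratio=1.5):
    """Return the genes that remain after removing myeloid-enriched genes.

    profiles: gene -> {cell type -> mean expression}.
    """
    return [
        gene for gene, profile in profiles.items()
        if not is_myeloid_enriched(profile, myeloid, ratio)
    ]


# Usage with made-up values for illustration only.
example_profiles = {
    "GENE_A": {"Myeloid": 9.0, "Epithelial": 1.0, "T cell": 2.0},
    "GENE_B": {"Myeloid": 0.5, "Epithelial": 8.0, "T cell": 0.1},
    "GENE_C": {"Myeloid": 3.0, "Epithelial": 2.5, "T cell": 1.0},
}
kept_genes = drop_myeloid_genes(example_profiles)
```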
[0182] Example 1. Colorectal Cancer (CRC) patients stool transcriptomics correspond to tissue states in CRC
[0183] Here, presented is Gut Exfoliated Transcriptomics using Ribosomal Depletion Sequencing (GET-RIDseq), an approach for deep transcriptomic profiling of host mRNAs from stool samples via genomic depletion of abundant bacterial RNA. In this approach, RNA was isolated from stool samples obtained at home and preserved in RNAlater as described above. The RNA was incubated with DNA probes designed to be complementary to the ribosomal RNA sequences of abundant microbial species. Next, RNase H1, an enzyme that degrades DNA-RNA hybrid molecules, was added, resulting in the efficient elimination of these abundant microbial RNA molecules, thus enriching for the human RNAs. Sensitive polyA-capture UMI-based RNA sequencing was used to quantify host gene expression in the resulting sample. It was found that genomic depletion increases the number of host genes by an average of 6.1-fold (Fig. 1A), raising the median gene count from 392 ± 541 without depletion to 1,745 ± 1,050 with depletion (p = 8.3e-5). The yield of host genes was remarkably stable over 5 days of preservation of the stool sample in a home freezer (Fig. 1B). GET-RIDseq therefore opens the way for non-invasive deep transcriptomic profiling of the intestinal molecular states in diverse pathophysiological conditions.
[0184] A cohort of 54 patients with colorectal cancer, and 24 healthy controls, was recruited. Patients provided stool samples before undergoing surgery for removal of the tumor as described above. For each patient, tissue samples from the tumor and from an adjacent non-malignant colonic tissue, 3-5cm from the tumor margin, were acquired. For a subset of 12 patients, also acquired were stool samples at least 1 month after surgery, to assess whether the molecular signals resemble those of healthy controls.
[0185] The analyses identified large sets of differentially expressed genes between the tumor tissues and the adjacent healthy tissues (Fig. 2A) and between the CRC patients’ and control patients’ stool samples (Fig. 2B). Tissue gene expression changes correlated with those previously measured in a scRNAseq study (Pelka et al., Fig. 2C). Both tissue samples and stool samples strongly clustered according to the disease states (Fig. 1C, 1D). Notably, gene expression in the combined dataset revealed modules of genes that were either higher in the tissue samples compared to stool samples (PIGR, CEACAM5 and EPCAM, Fig. 1C), or specific to the disease state, regardless of the sample type (tissues or stool). For example, the genes CCL20, LYZ, IFITM2 and CD44 were more highly expressed in both CRC tissue samples and stools compared to adjacent healthy tissue and control stools, whereas the genes FABP1, CA1, KRT20, GUCA2A, AQP8 and MUC2 were more highly expressed in the healthy tissues and in the control stools (Fig. 1C).
[0186] Example 2. CRC-associated stool transcriptomics reflects cellular states within the tumors and restoration of healthy molecular states following tumor resection
[0187] The relation between the gene expression changes in the tissues and the stool samples was examined (Fig. 3A). The analysis found that the ratios of expression between tumor and adjacent healthy tissue significantly correlated with the ratios of expression between CRC and healthy control stools (R=0.29, p<5x10^-112). The ratios between CRC and healthy control stools also strongly correlated with the ratios of expression in stools of CRC patients before and after tumor resection (Fig. 3B, r=0.42, p<5.7e-75). This indicates that the intestinal state reverts to a healthy one after the removal of the tumor.
[0188] Figs. 7A-7C further demonstrate that stool RNA profiles return to a healthy molecular state after tumor resection. Principal component analysis (PCA) showed that post-surgery stool samples cluster within the control stool samples (Fig. 7A). Analysis was based on highly expressed (normalized expression > 5e-5), highly variable genes previously shown to be differentially expressed between colorectal cancer and control based on single cell data. Fig. 7B shows that the tumor signature present in CRC stool is fully reversed in post-operative samples, matching the control pattern and supporting molecular normalization of the gut lumen after resection. Fig. 7C shows that the gene expression signature of post-surgery stool resembled the one observed in control stool. Stool shed cell transcriptomics therefore reflect a return to healthy molecular states after the removal of the tumor.
[0189] Thus, the data demonstrate that stool transcriptomics may be used to assess not only the presence or absence of a CRC tumor in newly diagnosed patients, but also for monitoring the efficacy of a treatment of a colorectal cancer in a subject in need thereof e.g. following an intervention such as surgery and / or chemotherapy.
[0190] To examine the relevance of the differentially expressed genes identified in patients’ stool samples to the cellular states within the tumors, a dataset of high-resolution spatial transcriptomics (ST) of colorectal cancer tissue samples was analyzed. The analyses found that differentially expressed genes between the tumor epithelial cells and the adjacent healthy epithelial cells exhibited similar trends when comparing stool samples of CRC patients and controls (Figs. 4A-4C). For example, the gene CA2, a classic marker of healthy colonocytes, was elevated in expression in healthy patients’ stool samples, and reduced in the stools of CRC patients, yet reverted to higher levels in stool samples after resection (Fig. 3C). Similarly, ZG16 and AQP8, associated with healthy colonocytes, were enriched in non-tumor tissue regions, elevated in stool samples from healthy controls, and decreased in CRC patient stool samples, and these expression patterns reverted toward healthy levels in post-resection stool samples. Conversely, the gene IFITM2, which was elevated in CRC cells, showed the opposite trend (Fig. 3D), and the additional tumor-enriched genes ARPC1B and S100A11 were also elevated in both tumor tissue and pre-surgery CRC stool samples but declined after tumor removal. Stool host gene expression therefore recapitulates cellular expression changes in the colonic tissue upon malignant transformation.
[0191] Example 3. Classifiers for diagnosing CRC and assessing tumor location
[0192] To examine the ability to diagnose CRC tumors based on stool host transcriptomics, the data were divided into training and test sets and a linear classifier was trained, obtaining an area under the receiver operating characteristic curve (AUC) of 0.85, approaching the power of classification based on complete transcriptomics of the tissue (AUC=0.96, Fig. 3E). In particular, Fig. 3E shows a classifier trained to differentiate between colorectal cancer and control tissue (dotted line) or stool samples (dashed and dotted line). The classifier was evaluated using a 50% test set in each iteration, with results averaged across 100 independent iterations. The receiver operating characteristic (ROC) curve and subsequent false positive rate, true positive rate and area under the curve (AUC) were recorded for each test set to examine the classification of colorectal cancer samples and control samples based on the ‘cancer score’. The overall performance is presented with mean values and standard error of the mean.
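For illustration only, the per-iteration AUC and its summary across iterations may be computed as in the following sketch. The AUC is obtained here through its rank-based (Mann-Whitney) formulation, which is an implementation choice of this sketch and not necessarily the routine used in the analyses; the function names `roc_auc` and `mean_and_sem` are hypothetical.

```python
import math
import statistics


def roc_auc(case_scores, control_scores):
    """Area under the ROC curve for cancer scores of CRC and control samples.

    Equals the probability that a randomly chosen CRC sample scores higher
    than a randomly chosen control sample, counting ties as one half.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def mean_and_sem(values):
    """Mean and standard error of the mean over repeated iterations."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem
```

In use, `roc_auc` would be called once per train/test split and the resulting list of AUC values passed to `mean_and_sem`.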
[0193] Similar results were obtained with a support vector machine classifier (Fig. 5). In particular, the inventors further compared the tissue and stool expression signatures of rectal tumors and colon tumors. The analyses found that stool samples included differentially expressed genes, the expression of which correlated with the differences measured in rectal vs. colon cancer tissue samples (Fig. 5). The stool host transcriptomic data therefore contains information not only on disease existence but also location.
[0194] Using the classifier (depicted in Fig. 3E), a list of genes upregulated or downregulated in CRC stool was obtained, as set forth in Table 1 hereinabove. In particular, Group A gene products in Table 1 are specifically up-regulated in stool samples of CRC patients as compared to healthy controls, and Group B gene products in Table 1 are specifically down-regulated in stool samples of CRC patients as compared to healthy controls. Table 2 lists genes found to be differentially expressed in colon cancer or rectal cancer (calculated as fold change (FC) of at least 1.5 over the level in samples corresponding to the other group of tumors, using both stool and tissue samples). Specifically, Table 2 presents a list of gene products up-regulated specifically in colon tumors (Group C), and a list of gene products up-regulated specifically in rectal tumors (Group D). Table 3 lists the most highly differentially expressed genes (exhibiting FC of at least 2 over the other group of tumors, e.g. colon tumors served as a control for rectal tumors and vice versa). Specifically, Table 3 presents a list of gene products highly up-regulated specifically in colon tumors (Group E), and a list of gene products highly up-regulated specifically in rectal tumors (Group F).
[0195] Example 4. Linear classifiers for CRC diagnosis and differential diagnosis
[0196] A linear classifier was trained on human gene expression from stool-derived shed cells, using a training and test set approach. The classifier achieved an area under the receiver operating characteristic curve (AUC) of 0.86, approaching the performance of tissue-based classification (AUC = 0.96, Fig. 6A). This untargeted cancer score matches the sensitivity reported for the multifactor mt-sRNA test, which combines 8 stool-derived eukaryotic RNA (seRNA) biomarkers, patient demographic information (smoking status), and a fecal immunochemical test (FIT / iFOBT) (mt-sRNA; 94%), and for the three-marker multitarget FIT assay (mtFIT; 91%), and exceeds that of the single-analyte FIT test (~75%). Next, the most informative stool biomarkers contributing to CRC classification were identified, and are presented in Table 6 below. Using these biomarkers, a cancer score was calculated for each of the stool samples (methods). Patient-specific comparisons of cancer score pre- and post-surgery showed a consistent decrease (Fig. 7C, paired Wilcoxon p = 0.0039). Stool-based cancer score was independent of disease stage, tumor invasion score (T score), tumor histological grade, tumor size and tumor location (Figs. 8A-8F). Stool biomarkers included several myeloid-associated genes such as IL1B, CXCL8, and S100A8, the latter encoding calprotectin, a marker of gut inflammation. Consistent with this, computational deconvolution revealed a significant increase in shed myeloid cells in CRC stool compared to controls (Fig. 6B).
[0197] To assess whether stool classification relies solely on inflammation, the analysis was repeated after excluding myeloid cell-associated genes or IBD-related genes, as described above. Remarkably, classification performance remained robust for both myeloid exclusion (stool AUC = 0.84; tissue AUC = 0.96; Fig. 9A, Table 4) and IBD exclusion (stool AUC = 0.82; tissue AUC = 0.96; Fig. 9C, Table 5), indicating that stool transcriptomes encode CRC-specific signatures that extend beyond inflammatory cell signals. The genes identified in these classifiers are presented in Tables 4 and 5, for classifiers excluding myeloid genes and IBD-related genes, respectively.
[0198] Table 4 shows stool biomarkers extracted using the linear classifier based on non-myeloid genes (normalized expression > 5e-5). Accordingly, Table 4 shows genes that are particularly useful for differentially diagnosing CRC in a patient diagnosed with, or suspected of having, gastrointestinal inflammation.
[0199] Table 5 presents stool biomarkers extracted using the linear classifier based on non-IBD genes (normalized expression > 5e-5). Accordingly, Table 5 shows genes that are particularly useful for differentially diagnosing CRC in a patient diagnosed with, or suspected of having, IBD.
[0200] Table 6 includes gene products identified in the linear classifier for CRC diagnosis (Figure 6). Shown are genes that were differentially expressed in >40% of training set iterations.
[0201] Discussion
[0202] Stool samples contain a mixture of cells shed from both healthy colonic epithelium and tumor tissue. The findings presented herein unexpectedly demonstrate that localized colorectal tumors shed sufficient cells into the lumen to generate a distinct and detectable transcriptomic signal, despite the background of normal colonocyte-derived RNA. Without wishing to be bound by a specific theory or mechanism of action, this may reflect the heightened proliferative activity of tumor cells, resulting in increased turnover and shedding, or a greater post-shedding RNA stability of malignant cells within the gut environment.
[0203] By combining microbial ribosomal RNA depletion with sensitive RNA sequencing, methods presented herein enable comprehensive profiling of human gene expression from stool. Stage-stratified performance shows that the cancer score remains discriminative even in early disease (Stage 1; Fig. 8C), indicating that detection is not confined to advanced tumor burden. Empirically, a stable classification and biological interpretability were retained with as few as ~1,500 detected human genes (~6,000 UMIs), a read depth readily achievable from a single routine stool aliquot. The high AUC (e.g. Fig. 6B) indicates strong diagnostic efficacy, suggesting that transcriptome profiling offers a competitive alternative to existing modalities and may facilitate more comprehensive, molecularly informed CRC screening. This not only supports accurate CRC detection, but also allows inference of biological pathways, tumor subtypes, and individualized treatment responses. Such molecular resolution has relevance for non-invasive monitoring of rectal cancer, especially in patients undergoing non-operative management or surveillance following curative therapy.
Additional differentially expressed genes
[0204] Table 7 below includes additional differential gene expression data of CRC and control samples. These gene products include the gene products listed in Tables 1-6 presented hereinabove and additional gene products which may conveniently be used in embodiments of the invention in accordance with their corresponding data (fold change, p value, q value) as presented in Table 7.
[0205] In some embodiments, gene products to be used in embodiments of the invention include a plurality of gene products selected from one or more of Tables 1-6, and one or more additional gene products selected from Table 7.
[0206] Table 7 presents differential gene expression (DGE) for CRC and control stool samples in which Exp>1e-5, log2(Ratio)>4, qVal<0.05.
[0207] Table 7 - Additional differentially expressed genes in CRC stool as compared to control
[0208] Although the present invention has been described herein above by way of preferred embodiments thereof, it can be modified, without departing from the spirit and nature of the subject invention as defined in the appended claims.
Claims
1. A method of diagnosing colorectal cancer, the method comprising (i) determining, in a fecal RNA sample of a subject, the levels of gene products selected from Table 1 or 6; and (ii) comparing the level of the gene products in the fecal RNA sample to their respective levels in a control, wherein a difference in the level of the gene products in the fecal RNA sample as compared to the control is indicative of the presence of colorectal cancer in the subject.
2. The method of claim 1, comprising determining the levels of a plurality of gene products presented in Table 1 or 6 in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products, and comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm, wherein the outcome of the comparison is indicative of the presence of colorectal cancer in said subject.
3. The method of claim 2, wherein said plurality of gene products is selected from Table 1.
4. The method of claim 2, wherein said plurality of gene products comprises the gene products of Table 6.
5. The method of claim 2, wherein said plurality of gene products is selected from Table 4 or 5.
6. The method according to claim 1, wherein a transcriptomic signature characterized by: (i) an increase in the level of gene products of Group A in Table 1 in the fecal RNA sample of the subject in comparison to their corresponding levels in a healthy control; (ii) a decrease in the level of gene products of Group B in Table 1 in the fecal RNA sample of said subject in comparison to their corresponding levels in a healthy control; or (iii) both (i) and (ii), is indicative of the presence of colorectal cancer in said subject.
7. The method of claim 1, wherein a transcriptomic signature characterized by: (i) an increase in the level of gene products of Group G, I and / or K, as set forth in Tables 4, 5, and 6, respectively, in the fecal RNA sample of the subject in comparison to their corresponding levels in a healthy control; (ii) a decrease in the level of gene products of Group H, J, and / or L, as set forth in Tables 4, 5, and 6, respectively, in the fecal RNA sample of said subject in comparison to their corresponding levels in a healthy control; or (iii) both (i) and (ii), is indicative of the presence of colorectal cancer in said subject.
8. The method of any one of the preceding claims, wherein the subject is suspected of having colorectal cancer.
9. The method of claim 8, wherein said subject is diagnosed with, or suspected of having, gastrointestinal (GI) inflammation, and said plurality of gene products is selected from Table 4.
10. The method of claim 8, wherein said subject is diagnosed with, or suspected of having, inflammatory bowel disease (IBD), and said plurality of gene products is selected from Table 5.
11. The method of any one of the preceding claims, further comprising predicting the location of the colorectal cancer, comprising determining in said fecal RNA sample the levels of additional gene products selected from gene products presented in Table 2 or 3 and comparing the determined levels to their respective levels in a control, wherein a difference in the level of the additional gene products in the fecal RNA sample and in the control is predictive of the location of the colorectal cancer.
12. A method for differential diagnosis of colorectal cancer in a subject in need thereof, comprising (i) determining, in a fecal RNA sample of the subject, the levels of gene products selected from Table 4 and / or 5; and (ii) comparing the level of the gene products from the fecal RNA sample to their respective levels in a control, wherein a difference in the level of the gene products in the fecal RNA sample as compared to the control is indicative of the presence of the colorectal cancer in said subject.
13. The method of claim 12, wherein said subject is diagnosed with, or suspected of having, GI inflammation, and said plurality of gene products is selected from Table 4.
14. The method of claim 12, wherein said subject is diagnosed with, or suspected of having, IBD, and said plurality of gene products is selected from Table 5.
15. A method of assigning a medical intervention, comprising diagnosing colorectal cancer according to the method of any one of the preceding claims, and assigning the medical intervention based on the obtained results.
16. A method of treating colorectal cancer in a subject in need thereof, the method comprising diagnosing colorectal cancer according to the method of any one of claims 1 to 11, and treating the cancer.
17. The method of claim 16, wherein treating said cancer comprises administering to said subject a colorectal cancer treatment or intervention, the colorectal cancer treatment or intervention selected from the group consisting of surgical resection, chemotherapy, biological therapy, irradiation and / or immunotherapy.
18. A method of analyzing a stool RNA sample, the method comprising determining the levels of a plurality of gene products selected from one or more of Tables 1 to 6 in the sample.
19. A method for monitoring the efficacy of a treatment of colorectal cancer in a subject in need thereof, the method comprising (i) determining, in a fecal RNA sample of the subject, the levels of gene products selected from Table 1 or 6; and (ii) comparing the level of the gene products from the fecal RNA sample to their corresponding levels in a fecal RNA sample obtained from said subject at an earlier sampling and / or to the level of the one or more gene products in a control.
20. The method of claim 19, wherein a change in the level of the gene products in two consecutive measurements is indicative of the efficacy of the cancer treatment in said subject.
21. The method of claim 19 or 20, comprising determining the levels of a plurality of gene products presented in Table 1 or 6 in said fecal RNA sample to thereby obtain the transcriptomic signature of said sample with respect to the plurality of gene products, and comparing the transcriptomic signature of said sample to a control transcriptomic signature by a supervised classification algorithm.
22. The method of any one of the preceding claims, further comprising subjecting the fecal RNA sample to selective depletion of microbial ribosomal RNA (rRNA).
23. The method of any one of the preceding claims, wherein the sample is a stool sample.
24. The method of any one of the preceding claims, wherein the supervised classification algorithm is a linear classifier.
25. A kit comprising means for specifically determining and quantifying the levels of a plurality of gene products in a fecal RNA sample, and instructions for diagnosing colorectal cancer, wherein the plurality of gene products is selected from a list of biomarkers as presented in one or more of Tables 1-6.