Cancer staging prediction system based on genome analysis

A genome-analysis-based prediction system, applied in the field of cancer staging prediction, that addresses problems such as excessive network complexity.

Pending Publication Date: 2020-05-19
SHANDONG UNIV

AI Technical Summary

Problems solved by technology

However, this patent has the following defect: it uses a gene-network feature selection method, and because there are tens of thousands of gene features, the constructed network becomes very complicated.

Examples

Embodiment 1

[0083] A cancer staging prediction system based on genomic analysis, including a raw data acquisition unit, a combined feature preprocessing unit, a joint gene selection unit, a classification model creation unit, and a prediction unit connected in sequence;

[0084] The raw data acquisition unit is used to obtain the RNAseq expression data and clinical information of the cancer subtype samples from The Cancer Genome Atlas (TCGA) project and to extract the RSEM values of gene expression from them. Samples annotated as stage I or stage II are treated as early cancers, and the remaining samples, annotated as stage III or IV, are treated as advanced cancers;
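
The early/advanced labeling rule described in [0084] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `stage_to_label` is hypothetical, and the assumed annotation format ("Stage IIA", "Stage IV", and so on) is a guess at how the clinical stage field is written.

```python
import re

# Assumed annotation format: the word "stage" followed by a Roman numeral,
# optionally with a substage letter (e.g. "Stage IIA"). This is an assumption.
_STAGE = re.compile(r"^\s*stage\s+(iv|iii|ii|i)", re.IGNORECASE)


def stage_to_label(annotation):
    """Return 'early' for stage I/II, 'advanced' for stage III/IV, else None."""
    match = _STAGE.match(annotation)
    if match is None:
        return None  # missing or unrecognized stage annotation
    numeral = match.group(1).upper()
    return "early" if numeral in ("I", "II") else "advanced"
```

Samples for which the function returns `None` have no usable stage annotation and would need to be handled separately.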

[0085] The combined feature preprocessing unit is used to discretize the genetic features, that is, the RNAseq expression data, through ChiMerge binning and WOE encoding, which improves the stability of the data and the robustness of the classification model.

[0086] ChiMerge binning: Through ...
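
As a rough illustration of the WOE (weight of evidence) encoding step, the sketch below replaces each bin of an already-binned feature with the log ratio of its share of advanced samples to its share of early samples. Several things here are assumptions rather than details from the source: the names `woe_table` and `woe_encode`, the label convention (1 = advanced, 0 = early), the sign convention of the ratio, and the additive smoothing of 0.5 used to avoid division by zero. The binning itself (ChiMerge in the source) is not implemented here.

```python
import math
from collections import Counter


def woe_table(bins, labels, smoothing=0.5):
    """Map each bin id to its weight of evidence.

    bins   : list of bin ids, one per sample (output of a prior binning step)
    labels : list of 0/1 labels (assumed here: 1 = advanced, 0 = early)
    """
    pos = Counter(b for b, y in zip(bins, labels) if y == 1)
    neg = Counter(b for b, y in zip(bins, labels) if y == 0)
    all_bins = set(bins)
    # Smoothed class totals so that every bin gets a finite WOE value.
    pos_total = sum(pos.values()) + smoothing * len(all_bins)
    neg_total = sum(neg.values()) + smoothing * len(all_bins)
    table = {}
    for b in all_bins:
        pos_share = (pos[b] + smoothing) / pos_total
        neg_share = (neg[b] + smoothing) / neg_total
        table[b] = math.log(pos_share / neg_share)
    return table


def woe_encode(bins, table):
    """Replace each bin id with its WOE value."""
    return [table[b] for b in bins]
```

With this sign convention, a positive WOE marks a bin in which advanced samples are over-represented, and a negative WOE marks a bin dominated by early samples.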

Embodiment 2

[0105] In the cancer staging prediction system based on genomic analysis described in Embodiment 1, transforming the RSEM value with log2 and standardizing the log2-transformed RSEM value proceeds as follows:

[0106] Transform the RSEM value with log2 by formula (I):

[0107] $x = \log_2(\mathrm{RSEM} + 1)$ (I)

[0108] Standardize the log2-transformed RSEM value by formula (II) to obtain z:

[0109] $z = \dfrac{x - \bar{x}}{s}$ (II)

[0110] In formula (II), x is the log2-transformed RSEM value, $\bar{x}$ is the mean of x, and s is the standard deviation of x.
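
The transformation of formula (I) followed by the standardization of formula (II) can be sketched for a single gene across samples as shown below. The function names are hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the text does not say which estimator is meant.

```python
import math
import statistics


def log2_transform(rsem_values):
    """Formula (I): x = log2(RSEM + 1), applied element-wise."""
    return [math.log2(v + 1.0) for v in rsem_values]


def standardize(xs):
    """Formula (II): subtract the mean of x and divide by its standard deviation."""
    mean = statistics.fmean(xs)
    s = statistics.stdev(xs)  # sample standard deviation (an assumption)
    return [(x - mean) / s for x in xs]


# Usage: one gene's RSEM values across four samples.
x = log2_transform([0, 1, 3, 7])
z = standardize(x)
```

Adding 1 before taking the logarithm keeps genes with an RSEM value of zero well defined, since log2(0 + 1) = 0.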

[0111] FCBF search is performed on the original training data, which refers to RSEM values, including:

[0112] (1) Use random sampling to select 80% of the original training data as the training data set. In each of the ten repetitions of ten-fold cross-validation, the training data are randomly divided into ten folds and the FCBF search is performed on the training data set; each FCBF search over the ten-fold cross-sampling yields 10 sub-fe...
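
The resampling scheme in step (1) can be sketched as follows: draw 80% of the samples at random, then repeat a random ten-way partition ten times. This covers only the data splitting; the FCBF search itself is not implemented. The function name `repeated_tenfold`, the fixed seed, and the round-robin way of cutting folds are illustrative assumptions.

```python
import random


def repeated_tenfold(sample_ids, train_fraction=0.8, repeats=10, folds=10, seed=0):
    """Return the sampled training ids and `repeats` random `folds`-way partitions."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    ids = list(sample_ids)
    n_train = int(round(train_fraction * len(ids)))
    train = rng.sample(ids, n_train)  # random 80% subset, without replacement
    partitions = []
    for _ in range(repeats):
        shuffled = train[:]
        rng.shuffle(shuffled)
        # Round-robin slicing gives folds whose sizes differ by at most one.
        partitions.append([shuffled[i::folds] for i in range(folds)])
    return train, partitions


# Usage: 100 hypothetical sample ids give 80 training ids and 10 partitions of 10 folds.
train_ids, partitions = repeated_tenfold(range(100))
```

A feature search would then be run once per fold of each partition, and the resulting feature subsets combined in whatever way the later steps of the method specify.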

Abstract

The invention relates to a cancer staging prediction system based on genome analysis. The cancer staging prediction system comprises an original data acquisition unit, a combined feature preprocessing unit, a combined gene selection unit, a classification model creation unit and a prediction unit which are connected in sequence, wherein the original data acquisition unit is used for acquiring RNAseq expression data and clinical information of a cancer subtype sample corresponding to a cancer genome atlas TCGA project and acquiring an RSEM value of gene expression in the RNAseq expression data and the clinical information; the combined feature preprocessing unit is used for discretizing genetic features, or, after 1.0 is added, converting the RSEM value by using log2, and standardizing the RSEM value after log2 conversion; the combined gene selection unit is used for sequentially performing FCBF search, joint statistical feature extraction and logistic regression model feature selection; and the classification model creation unit is used for generating a classification model and optimizing the performance of the classification model. The system is more stable and accurate in prediction performance.

Description

Technical field

[0001] The present invention relates to the technical fields of bioinformatics and machine learning, in particular to a cancer staging prediction system based on genome analysis.

Background art

[0002] Cancer is closely related to genes. When tumors are detected at an advanced stage, survival rates are very low, whereas early detection and effective treatment can improve survival rates. Therefore, developing effective strategies to stratify patients according to cancer stage and the intrinsic mechanisms driving cancer development and progression is crucial for early prevention and treatment of cancer. Cancer is often asymptomatic in its early stages, and many patients have metastases at the time of diagnosis. Patients who have undergone resection are at high risk of metastatic recurrence, and early detection helps in early cancer prevention and treatment. In addition, understanding the key genetic drivers of disease progression can aid in the development ...

Application Information

IPC(8): G16B20/00, G16B40/00
CPC: G16B20/00, G16B40/00, Y02A90/10
Inventors: 张海霞, 李芳君, 袁东风
Owner: SHANDONG UNIV