Construction and application of a complex disease state evaluation method based on high-throughput sequencing data and clinical phenotypes

A complex disease state evaluation technology, applied in the fields of genetic testing and bioinformatics, that addresses problems such as limited guiding significance, reliance on a single omics level, and limited functionality.

Pending Publication Date: 2020-10-30
上海朴岱生物科技合伙企业(有限合伙)
0 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

Gene chip technology determines nucleic acid sequences by hybridization with a set of nucleic acid probes of known sequence and achieves high-throughput parallelization, but its repeatability and sensitivity need improvement and its analysis range is not wide enough.
Next-generation sequencing (NGS), unlike first-generation sequencing, achieves high-throughput parallel sequencing through in vitro fragment amplification and sequencing by synthesis. Its main disadvantages include read length.
2) Detection and evaluation content is relatively narrow, with limited functions
At present, because of limits on gene collection and screening capability and on sequencing cost, a single marker detection scheme covers relatively few genes. In practice, single-site or small-fragment mutations serve as the main evaluation index. In recent years, evaluation schemes that use gene expression levels and the overall mutation level of all genes in the panel as markers have attracted increasing attention. Functionally, current schemes mainly predict the effect of site- or gene-related targeted drugs and offer limited guidance for the broader range of surgery, chemotherapy, radiotherapy, and immunotherapy.
3) Marker design and supporting data analysis tools do not make full use of multivariate information
At present, most design schemes draw only on drug guidelines, drug labels, and a limited literature collection. The technical route focuses on a single omics level, and comprehensive analyses based on large-scale sequencing results, public databases, and text mining are rare. Integrated analysis of multivariate data covering multiple molecular omics and clinical phenotype information is seriously insufficient.


Examples

Embodiment 1

[0250] Example 1: The present invention is applied to the whole process of constructing a colorectal tumor status evaluation model and designing a panel. The present invention is described in further detail below with reference to specific examples. It should be understood that the following examples only illustrate the present invention and do not limit its scope. The specific steps are as follows:

[0251] S1.1 Acquisition and arrangement of colorectal tumor sequencing data and clinical phenotype information

[0252] The mRNA data and clinical data of TCGA-CRC were downloaded from the UCSC Xena database. 380 in situ tumor samples and 51 paracancerous samples were selected. mRNA expression levels were quantified in TPM. A TPM value of less than 1 is treated as a missing value. A gene is removed if its number of missing values exceeds 20% of the sample size. The remaining missi...
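The filtering rule in [0252] can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function names, the dictionary layout (gene name mapped to a list of per-sample TPM values), and the toy gene names are assumptions made for the example. Only the two thresholds (TPM below 1 treated as missing, genes dropped when more than 20% of samples are missing) come from the text. How the remaining missing values are handled is cut off in the source, so the sketch stops before that step.

```python
from typing import Dict, List, Optional

TPM_FLOOR = 1.0        # TPM below this is treated as missing (from the text)
MAX_MISSING_PCT = 20   # genes with more than 20% missing values are removed


def mask_low_tpm(values: List[float]) -> List[Optional[float]]:
    """Replace TPM values below the floor with None (missing)."""
    return [v if v >= TPM_FLOOR else None for v in values]


def filter_genes(expr: Dict[str, List[float]]) -> Dict[str, List[Optional[float]]]:
    """Mask low TPM values, then drop genes whose missing fraction exceeds 20%."""
    kept: Dict[str, List[Optional[float]]] = {}
    for gene, values in expr.items():
        masked = mask_low_tpm(values)
        n_missing = sum(v is None for v in masked)
        # Integer comparison avoids floating-point rounding at the boundary.
        if n_missing * 100 <= MAX_MISSING_PCT * len(masked):
            kept[gene] = masked
    return kept


# Hypothetical toy matrix: two genes measured in five samples.
expr = {
    "GENE_A": [5.2, 0.4, 12.0, 8.1, 3.3],  # 1 of 5 missing (20%), kept
    "GENE_B": [0.1, 0.0, 7.5, 0.9, 2.0],   # 3 of 5 missing (60%), removed
}
filtered = filter_genes(expr)
```

A gene sitting exactly at 20% missing is kept here, because the text removes a gene only when the count is greater than 20% of the sample size.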

Embodiment 2

[0284] Example 2: The present invention is applied to the whole process of constructing a pancreatic ductal carcinoma status evaluation model and designing a panel. The present invention is described in further detail below with reference to specific examples. It should be understood that the following examples only illustrate the present invention and do not limit its scope. The specific steps are as follows:

[0285] S2.1 Acquisition and arrangement of pancreatic ductal carcinoma sequencing data and clinical phenotype information

[0286] S2.1.1 Sequencing data (exome sequencing and RNA-Seq) and clinical phenotype information (including age, gender, pathological grade, surgical status R0-R2, PDX modeling status, and survival including OS and DFS) were independently obtained for 71 clinical cases of pancreatic ductal carcinoma; PDX models were successfully established in 39 of them, and on this basis, the standard drug efficacy data of two ...

Embodiment 3

[0310] The present invention is applied to the mining of pan-tumor prognostic markers. The present invention will be further described in detail in conjunction with specific examples. It should be understood that the following examples are only used to illustrate the present invention and not to limit the scope of the present invention. Specific steps are as follows:

[0311] S3.1 Pan-tumor sequencing and collection of clinical phenotype datasets

[0312] The mRNA data and clinical data of TCGA pan-cancer were downloaded from UCSC Xena. The mRNA data come from the TOIL RNA-seq analysis pipeline, and gene expression levels are quantified in TPM. For each cancer type, in situ tumor samples and paracancerous samples were selected. Cancer types with at least 20 paired in situ tumor and paracancerous samples were selected for abnormal regulation analysis, which yielded 14 cancer types. For the m...
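The cancer-type selection in [0312] can be sketched as below. This is an illustration under stated assumptions, not the patent's code: it assumes that "paired" means one patient contributing both an in situ tumor sample and a paracancerous sample, and the record layout `(cancer_type, patient_id, sample_kind)`, the function names, and the toy labels `TYPE_X` and `TYPE_Y` are hypothetical. Only the threshold of 20 pairs comes from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

MIN_PAIRS = 20  # threshold stated in the text

# One record per sample: (cancer_type, patient_id, sample_kind),
# where sample_kind is "tumor" or "normal" (hypothetical layout).
Sample = Tuple[str, str, str]


def count_pairs(samples: Iterable[Sample]) -> Dict[str, int]:
    """Count patients per cancer type that have both a tumor and a normal sample."""
    kinds: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for cancer, patient, kind in samples:
        kinds[(cancer, patient)].add(kind)
    pairs: Dict[str, int] = defaultdict(int)
    for (cancer, _patient), seen in kinds.items():
        if {"tumor", "normal"} <= seen:
            pairs[cancer] += 1
    return dict(pairs)


def select_cancer_types(samples: Iterable[Sample], min_pairs: int = MIN_PAIRS) -> List[str]:
    """Return cancer types with at least min_pairs paired patients, sorted by name."""
    return sorted(c for c, n in count_pairs(samples).items() if n >= min_pairs)


# Toy data: TYPE_X has 20 paired patients; TYPE_Y has 19 paired plus 5 tumor-only.
samples: List[Sample] = []
for i in range(20):
    samples += [("TYPE_X", f"P{i}", "tumor"), ("TYPE_X", f"P{i}", "normal")]
for i in range(19):
    samples += [("TYPE_Y", f"Q{i}", "tumor"), ("TYPE_Y", f"Q{i}", "normal")]
for i in range(19, 24):
    samples.append(("TYPE_Y", f"Q{i}", "tumor"))

selected = select_cancer_types(samples)
```

Tumor-only patients do not count toward the threshold, so `TYPE_Y` falls one pair short and is excluded.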

Abstract

The invention relates to the field of gene detection and bioinformatics, and discloses a method for mining complex disease markers based on transcriptome data, exome/genome data, and clinical phenotypes. The invention designs a calculation method for constructing a complex disease state evaluation model by integrating high-throughput sequencing data and clinical phenotypes. The calculation method is applied to colorectal cancer, pancreatic ductal carcinoma, and targeted medication for pancreatic ductal carcinoma; disease-related biomarkers are screened for each, and the corresponding disease state evaluation models are formed. Markers constructed by this method balance accuracy with mechanistic interpretability and can be used for prognosis evaluation of complex diseases, treatment effect prediction, assisted decision making on treatment schemes, and the like.

Description

Technical field

[0001] The present invention relates to the technical fields of gene detection and bioinformatics, in particular to a complex disease state assessment method based on high-throughput sequencing data and clinical phenotypes, and to related detection panel design and implementation cases.

Background

[0002] First-generation sequencing technology obtains the base information at a specific position of the sequence through the dideoxy chain-termination method or the chemical cleavage method, and reads the nucleic acid sequence by electrophoresis and visualization. Gene chip technology determines nucleic acid sequences by hybridization with a set of nucleic acid probes of known sequence and achieves high-throughput parallelization. Its disadvantages are that repeatability and sensitivity need improvement and the analysis range is not wide enough. Next-generation sequencing technology (...

Application Information

IPC(8): G16B40/20, G16H50/20
CPC: G16B40/20, G16H50/20
Inventors: 李园园, 戴文韬, 刘伟
Owner: 上海朴岱生物科技合伙企业(有限合伙)