A method for mining key RNA functions based on high-throughput experimental data mining

The method applies high-throughput experimental data in the field of bioinformatics. It addresses problems such as access permissions that cannot be obtained, data sets that are not fully open, and the lack of databases that help reveal functional mechanisms, and it achieves the effect of increasing the sample size.

Active Publication Date: 2022-03-11
广州万德基因医学科技有限公司

AI Technical Summary

Problems solved by technology

[0003] Clinical tumor research faces several difficulties:
1) It is difficult to collect enough large-scale clinical samples, which hinders statistics and modeling.
2) Many existing methods are based on TCGA and other data sets, but these data sets are not completely open. Downloading the original data requires many permissions that ordinary researchers cannot apply for, so they can only download level-3 data (processed and corrected data, not the original data), which are not suitable for combined analysis with clinical data from outside TCGA.
3) Current large-scale analyses of cancer lncRNA expression profiles have found differences in transcription levels between tumor types, showing that lncRNAs have great mining potential in disease research. lncRNAs can be regarded as the "dark matter" of the transcription process in tumor tissues, but very few lncRNA functions are known and no comprehensive database exists to help reveal their functional mechanisms. As a result, researchers often find lncRNAs with obvious expression differences but do not know how to continue the research.



Examples


Embodiment 1

[0124] 1) Sample collection: 100 patients with lung adenocarcinoma who had a clinically confirmed pathological diagnosis and available paraffin-embedded tissue were selected as the study subjects. The tissue samples (cancerous and paracancerous, 200 samples in total) underwent high-throughput transcriptome sequencing (RNA-seq) and bioinformatics analysis. The study subjects were all selected from Table 1 of Thoracic Cancer 9 (2018) 1680-1686, and the sample screening requirements were as follows: lung adenocarcinoma, confirmed diagnosis, and complete clinical follow-up information.

[0125] 2) RNA sequencing data preprocessing: use the fastx_clipper tool in fastx_toolkit to remove sequencing adapters, use the fastq_quality_filter tool in fastx_toolkit to remove low-quality sequencing reads, and then use TopHat to align the reads. The reference genome is human hg19. In this way, the original sequencing reads of each sam...
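The three preprocessing steps named in [0125] can be sketched as a small command builder. This is only an illustration: the source does not state the adapter sequence, the quality thresholds, the minimum read length, the thread count, or the index location, so `ADAPTER`, `BOWTIE_INDEX`, the numeric values, and the file-naming scheme below are all assumptions. The function only builds the command lines and does not run them.

```python
from typing import List

# Assumed placeholders: none of these values appear in the source text.
ADAPTER = "AGATCGGAAGAGC"   # hypothetical adapter sequence to clip
BOWTIE_INDEX = "hg19"       # hypothetical Bowtie index basename for human hg19


def preprocessing_commands(sample: str, fastq: str, threads: int = 4) -> List[List[str]]:
    """Build the adapter-clipping, quality-filtering and alignment commands
    for one sample, in the order they would be executed."""
    clipped = f"{sample}.clipped.fastq"
    filtered = f"{sample}.filtered.fastq"
    return [
        # 1) Remove sequencing adapters; discard reads shorter than 20 nt.
        #    -Q33 assumes Phred+33 quality encoding.
        ["fastx_clipper", "-Q33", "-a", ADAPTER, "-l", "20",
         "-i", fastq, "-o", clipped],
        # 2) Keep reads in which at least 80% of bases have quality >= 20.
        ["fastq_quality_filter", "-Q33", "-q", "20", "-p", "80",
         "-i", clipped, "-o", filtered],
        # 3) Align the filtered reads to the reference index with TopHat.
        ["tophat", "-p", str(threads), "-o", f"{sample}_tophat",
         BOWTIE_INDEX, filtered],
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the tools and a matching index are installed; keeping command construction separate from execution makes the parameters easy to review per sample.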

Embodiment 2

[0131] 1) Sample collection: 10 patients with small cell lung cancer who had a clinically confirmed pathological diagnosis and available paraffin-embedded tissue were selected as the study subjects. Tissue samples (lung cancer tissue and paracancerous tissue, 20 samples in total) were tested by high-throughput RNA sequencing. The sample screening requirements were as follows: small cell lung cancer, all patient specimens confirmed by the pathology department, postoperative survival time of more than 3 months, and complete clinical follow-up information.

[0132] 2) Selection of accompanying data sets: Using public database resources, RNA-Seq raw data were obtained. The data set accession is GSE60052 and the download link is https://www.ncbi.nlm.nih.gov/sra?linkname=bioproject_sra_all&from_uid=257389. The clinical information of these samples is consistent with the clinical samples of this analysis, so the two can be combined for analysis.
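As a rough illustration of what combining the clinical samples with an accompanying data set might look like once both have been processed to per-gene read counts, the sketch below merges two count tables on their shared genes and records which data set each sample came from. The table layout, the function name `combine_datasets`, and the `clinical:` / `accompanying:` prefixes are assumptions made for this example; the source does not describe the data structures it uses.

```python
from typing import Dict, Tuple

# Assumed layout: gene identifier -> {sample identifier -> read count}
Counts = Dict[str, Dict[str, int]]


def combine_datasets(clinical: Counts, accompanying: Counts) -> Tuple[Counts, Dict[str, str]]:
    """Merge two count tables on the genes present in both, and return the
    merged table together with a sample -> source label mapping."""
    shared_genes = sorted(set(clinical) & set(accompanying))
    combined: Counts = {}
    source: Dict[str, str] = {}
    for gene in shared_genes:
        row: Dict[str, int] = {}
        for label, table in (("clinical", clinical), ("accompanying", accompanying)):
            for sample, count in table[gene].items():
                key = f"{label}:{sample}"   # prefix avoids sample-name clashes
                row[key] = count
                source[key] = label
        combined[gene] = row
    return combined, source
```

Keeping the source label next to each sample lets a later step treat the data set of origin as a covariate, which is a common precaution when samples from different studies are pooled.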

[0133] 3) Data pre...


Abstract

The invention discloses a method for mining key RNA functions based on high-throughput experimental data mining. By integrating a variety of data to obtain accompanying data sets and combining them with clinical data sets, the genes most relevant to the research project can be found among a large number of known RNAs. At the same time, the functions of unknown RNAs can be predicted, so that the roles they play in life activities can be determined more reliably. This provides an important basis for subsequent research on disease mechanisms, drug targets, and disease diagnosis.

Description

Technical field

[0001] The invention relates to bioinformatics, in particular to a method for mining key RNA functions based on high-throughput experimental data mining.

Background

[0002] About 93% of the nucleotide sequence of human genomic DNA can be transcribed into RNA, of which only 2% of the transcripts are translated into protein; the remaining 98% are non-coding RNA (ncRNA). Progress in microRNA research has revealed that ncRNA plays a very important role in post-transcriptional regulation of human genes and in cell growth, differentiation, and proliferation. The most actively studied ncRNAs are microRNA, circRNA, and lncRNA. In tumor research, mRNA and ncRNA are equally important. In recent years, new bioinformatics programs have appeared continually, and co-expression relationships and protein interaction networks are increasingly used to study the functions of mRNA and ncRNA.

[0003] Some difficult...
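The background mentions co-expression relationships as a common way to study mRNA and ncRNA functions. As a minimal sketch of that general idea, and not of the specific procedure claimed in this patent, the code below ranks genes by the Pearson correlation of their expression with a target RNA across samples. The function names, the `min_abs_r` threshold of 0.8, and the input layout are assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def coexpressed_partners(
    target: str,
    expression: Dict[str, List[float]],   # gene -> expression value per sample
    min_abs_r: float = 0.8,                # assumed cutoff, not from the source
) -> List[Tuple[str, float]]:
    """Return (gene, r) pairs whose |r| with the target meets the cutoff,
    ordered from strongest to weakest correlation."""
    reference = expression[target]
    hits = [
        (gene, pearson(reference, values))
        for gene, values in expression.items()
        if gene != target
    ]
    hits = [(gene, r) for gene, r in hits if abs(r) >= min_abs_r]
    return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)
```

Genes of known function that correlate strongly with an uncharacterized RNA are often used as hints about its possible role; a correlation alone does not establish function, so the output is best read as a list of candidates for follow-up.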


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B30/10, G16B35/20, G16B40/00, G16B50/10
CPC: G16B30/10, G16B35/20, G16B40/00, G16B50/10
Inventor: 张洁霞, 陈梦麟, 黄凯铃, 刘艳卉, 骆颖筠, 张楠
Owner: 广州万德基因医学科技有限公司