
CircRNA function prediction method based on cascade decision system

A prediction method and decision technology, applied in genomics, instrumentation, proteomics, and related fields. It addresses the high time and equipment cost of experimental methods, which hinders large-scale identification of circRNA functions and further exploration of circRNA, and it achieves improved algorithm stability.

Pending Publication Date: 2020-10-09
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

However, these highly reliable experimental methods consume too much time and equipment cost, which hinders large-scale identification of circRNA functions.
It also makes it impractical to further explore the important role that the special function of a given circRNA plays in clinical medicine.



Embodiment Construction

[0021] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described below in conjunction with the embodiments and accompanying drawings.

[0022] Refer to Figure 1, the flowchart of the circRNA function prediction method based on the cascade decision system and LightGBM in this embodiment. The main steps of the technical solution adopted by the present invention are:

[0023] S1. Import the circRNAs of a large data sample as a (.bed) file, which contains the chromosome number, sequence start site, and positive/negative strand markers.

[0024] S2. Map the circRNA (.bed) file to the whole human genome (hg19 version) according to the relevant information, such as the start site, to obtain the specific circRNA sequence information as a (.fasta) file.

[0025] S3. A feature extraction and fusion method is proposed to extract different features when circRNA expresses specific functio...
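Steps S1 and S2 can be illustrated with a minimal Python sketch. It is not the patent's implementation: it assumes a standard six-column BED layout (chrom, start, end, name, score, strand), takes the reference genome as an in-memory dictionary rather than an hg19 file, and extracts the contiguous genomic span without any exon splicing. The names `BedRecord`, `parse_bed`, `extract_sequence`, and `to_fasta` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class BedRecord:
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive
    name: str
    strand: str  # "+" or "-"


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def parse_bed(lines: Iterable[str]) -> List[BedRecord]:
    """Parse tab-separated BED6 lines, skipping blanks and header lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 BED columns, got {len(fields)}")
        chrom, start, end, name, _score, strand = fields[:6]
        records.append(BedRecord(chrom, int(start), int(end), name, strand))
    return records


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def extract_sequence(rec: BedRecord, genome: Dict[str, str]) -> str:
    """Slice the genomic span; reverse-complement it for minus-strand records."""
    seq = genome[rec.chrom][rec.start:rec.end]
    return reverse_complement(seq) if rec.strand == "-" else seq


def to_fasta(records: Iterable[BedRecord], genome: Dict[str, str]) -> str:
    """Render one FASTA entry per record, using the BED name as the header."""
    return "".join(f">{r.name}\n{extract_sequence(r, genome)}\n" for r in records)


# Toy genome and records, for illustration only.
toy_genome = {"chr1": "ACGTACGTAC"}
toy_bed = ["chr1\t1\t5\tcircA\t0\t+", "chr1\t1\t5\tcircB\t0\t-"]
toy_fasta = to_fasta(parse_bed(toy_bed), toy_genome)
```

For a real hg19 reference, the dictionary would be replaced by an indexed FASTA reader, since a whole genome is too large to slice conveniently as plain strings.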


Abstract

In order to overcome the defects in the prior art, the invention aims to predict the function of circRNA by using the provided cascade decision system in combination with the multi-classification model of the LightGBM method. The technical scheme adopted by the invention mainly comprises the following steps: (1) inputting the circRNA of a big data sample in the form of a (.bed) file; (2) mapping the circRNA (.bed) file according to related information, such as the start site, to obtain a circRNA sequence information (.fasta) file; (3) proposing a feature extraction and fusion method, and extracting circRNA features; (4) proposing an A-type judgment system, and performing function prediction on the coded circRNA; (5) predicting the other circRNAs by using the LightGBM algorithm; (6) according to the multi-classification model of the LightGBM algorithm, performing instance sampling and feature sampling of the sample data by using the core algorithms GOSS and EFB, and mapping continuous features into discrete buckets by using a histogram-based algorithm, thereby discretizing the continuous variables; and (7) obtaining the optimal parameters of the model by adjusting the maximum depth of the tree, the minimum number of records per leaf, the proportion of data used in each iteration, and other parameters.
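The two ideas named in step (6), histogram-based discretization and gradient-based one-side sampling (GOSS), can be sketched in plain NumPy. This is a simplified illustration, not LightGBM's internal code and not the patent's implementation: the bucket edges are assumed to be quantiles, the function names `histogram_bins` and `goss_sample` are hypothetical, and the rates are example values. GOSS keeps the instances with the largest absolute gradients, randomly samples a fraction of the rest, and up-weights the sampled small-gradient instances by (1 - a) / b, where a is the top rate and b is the rate for the rest.

```python
import numpy as np


def histogram_bins(values, max_bins=255):
    """Map a continuous feature to integer bucket indices.

    Assumption: bucket edges are the interior quantiles of the data.
    """
    quantiles = np.linspace(0.0, 1.0, max_bins + 1)[1:-1]
    edges = np.unique(np.quantile(values, quantiles))
    bins = np.searchsorted(edges, values, side="right")
    return bins, edges


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, rng=None):
    """Return selected instance indices and their weights under GOSS."""
    rng = np.random.default_rng() if rng is None else rng
    n = gradients.shape[0]
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)

    # Sort instances by descending absolute gradient.
    order = np.argsort(-np.abs(gradients))
    top_idx = order[:n_top]
    rest = order[n_top:]

    # Randomly sample from the small-gradient remainder.
    other_idx = rng.choice(rest, size=min(n_other, rest.size), replace=False)

    idx = np.concatenate([top_idx, other_idx])
    weights = np.ones(idx.size)
    # Compensate for under-sampling the small-gradient instances.
    weights[n_top:] = (1.0 - top_rate) / other_rate
    return idx, weights


# Toy data, for illustration only.
feature = np.array([0.1, 0.2, 0.3, 0.4])
bins, edges = histogram_bins(feature, max_bins=2)

grads = np.array([0.1, -5.0, 0.2, 3.0, 0.05, -0.3, 0.01, 0.4, -0.02, 0.15])
idx, weights = goss_sample(grads, rng=np.random.default_rng(0))
```

In the toy run, the two largest-gradient instances (indices 1 and 3) are always retained, and the single sampled instance from the remainder carries a weight of about 8, so the gradient statistics stay roughly unbiased despite the smaller sample.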

Description

Technical field

[0001] The invention relates to the technical field of bioinformatics, in particular to the field of circRNA function prediction.

Background art

[0002] CircRNAs have multiple functions in biology: they are rich in miRNA binding sites and act as miRNA sponges in cells; they regulate protein activity by binding to proteins; and some circRNAs can even be translated into proteins. CircRNA has therefore become an important potential biomarker in recent years. To determine the specific functions that newly discovered circRNAs express in vivo, a large number of experiments are required to identify the functions of circRNAs one by one and obtain the final functional results. However, these highly reliable experimental methods consume too much time and equipment cost, which hinders large-scale identification of circRNA functions. It is also impossible to continue to explore the important role of the special funct...


Application Information

IPC(8): G16B20/30, G16B20/00
CPC: G16B20/30, G16B20/00
Inventors: 邓怡云, 朱勉春, 戴宪华
Owner SUN YAT SEN UNIV