Deep learning based long-chain non-coding RNA subcellular position prediction algorithm

A deep learning technique applied to long non-coding RNA (lncRNA) subcellular location prediction. It addresses classification bias towards the main category and the associated poor performance, and helps avoid over-fitting.

Active Publication Date: 2018-01-12
SHANGHAI JIAO TONG UNIV
3 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

The classification results of most machine learning methods are biased towards the main category ...


Examples

Example Embodiment

[0069] The following is a detailed description of the embodiments of the present invention. This embodiment is carried out based on the technical solution of the present invention, and provides detailed implementation methods and specific operation processes.

[0070] The present invention takes the imbalance of the data set into account. The numbers of samples located in the cytoplasm, nucleus, cytosol, ribosomes and exosomes are 304, 152, 96, 47 and 26 respectively, so every category except the first is upsampled twofold. In the three-layer stacked autoencoder, the encoding and decoding layers all use the sigmoid activation function, the optimizer is Adam, and the loss function is the squared error between the reconstructed output and the original input. Both Batch_size and nb_epoch are 100, and the numbers of neurons in the three layers are set to 256, 128 and 64 respectively ...
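The class counts and the twofold upsampling in [0070] can be illustrated with a minimal sketch. The excerpt does not spell out how the extra samples are produced, so this example assumes plain random duplication with replacement; the function name `double_minority_classes` and the placeholder sample IDs are hypothetical and not taken from the patent.

```python
import random
from collections import Counter

# Class counts quoted in paragraph [0070].
CLASS_COUNTS = {
    "cytoplasm": 304,
    "nucleus": 152,
    "cytosol": 96,
    "ribosome": 47,
    "exosome": 26,
}


def double_minority_classes(samples, labels, majority_label, seed=0):
    """Return (samples, labels) with every non-majority class doubled.

    Assumption: "twofold upsampling" is modelled as drawing, with
    replacement, as many extra samples as the class already contains.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    out_samples, out_labels = list(samples), list(labels)
    for label, members in by_class.items():
        if label == majority_label:
            continue  # the first (largest) category is left untouched
        extra = [rng.choice(members) for _ in members]
        out_samples.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_samples, out_labels


# Placeholder sample IDs stand in for real lncRNA feature vectors.
labels = [name for name, n in CLASS_COUNTS.items() for _ in range(n)]
samples = [f"{name}_{i}" for name, n in CLASS_COUNTS.items() for i in range(n)]

new_samples, new_labels = double_minority_classes(samples, labels, "cytoplasm")
upsampled_counts = Counter(new_labels)
```

Under this assumption the nucleus class grows from 152 to 304 samples and matches the cytoplasm class, while the three smallest classes remain minorities at twice their original size.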

Abstract

The invention relates to the field of RNA biology, in particular to a deep learning based algorithm for predicting the subcellular location of lncRNA (long non-coding RNA). To address class imbalance in the training samples of the multi-class problem, a novel up-sampling method is provided for preprocessing the training samples. A stacked autoencoder is adopted to extract features from the raw sequence features. A deep learning based fusion algorithm is adopted to integrate the predictions of multiple classifiers. The up-sampling method greatly reduces the influence of data set imbalance on classifier performance. Highly discriminative high-level features are effectively extracted from the raw features. Integrating the predictions of all classifiers with the deep learning based fusion algorithm improves robustness and adapts well to the diversity and complexity of subcellular locations.
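The stacked autoencoder mentioned in the abstract can be sketched as a forward pass only, using the sigmoid activations, squared-error reconstruction loss and layer sizes (256, 128, 64) given in the example embodiment. This is a hedged illustration, not the patent's implementation: the weights are random and untrained, the input dimension `INPUT_DIM` is a hypothetical value, the greedy layer-by-layer arrangement (each layer reconstructing its own input) is an assumption, and training with Adam is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense_sigmoid(inputs, weights, biases):
    """Fully connected layer followed by a sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained placeholder)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def squared_error(original, reconstructed):
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed))


rng = random.Random(0)
INPUT_DIM = 40                 # hypothetical raw feature dimension
HIDDEN_SIZES = [256, 128, 64]  # layer sizes from the example embodiment

# One raw feature vector, assumed scaled to [0, 1] so that a sigmoid
# decoder is able to reconstruct it.
x = [rng.random() for _ in range(INPUT_DIM)]

layer_input = x
losses = []
for n_hidden in HIDDEN_SIZES:
    n_in = len(layer_input)
    enc_w, enc_b = init_layer(n_in, n_hidden, rng)
    dec_w, dec_b = init_layer(n_hidden, n_in, rng)
    code = dense_sigmoid(layer_input, enc_w, enc_b)
    reconstruction = dense_sigmoid(code, dec_w, dec_b)
    # Each layer would be trained to minimise this reconstruction error.
    losses.append(squared_error(layer_input, reconstruction))
    layer_input = code  # the code of one layer feeds the next layer

features = layer_input  # 64-dimensional high-level representation
```

After training, only the encoder halves would be kept, and `features` would be passed on to the downstream classifiers.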

Description

Technical field

[0001] The invention relates to the field of RNA biology, in particular to a deep learning-based algorithm for predicting the subcellular location of long noncoding RNA (lncRNA).

Background technique

[0002] Noncoding RNA (ncRNA) has been proven to be an important regulatory factor. microRNA (miRNA) and lncRNA are the two main types of ncRNA. In recent years, lncRNA has received great attention in the field of RNA biology. Related studies have shown that the location of an lncRNA is of great help in understanding its complex biological functions. Furthermore, lncRNAs have been shown to be markers of certain diseases. Therefore, understanding the cellular functions of lncRNAs has become a central task in the post-genomic era.

[0003] As with proteins, the function of an lncRNA depends on the cellular region in which it is located. Therefore, localization information can provide important references for revealing its ...


Application Information

IPC(8): G06F19/24, G06N3/08
Inventor: 曹真, 杨旸, 沈红斌
Owner SHANGHAI JIAO TONG UNIV