Long-chain non-coding RNA subcellular localization method based on multi-feature information fusion

A technique for predicting the subcellular localization of long-chain non-coding RNAs, which addresses the problem of inaccurate subcellular location prediction.

Pending Publication Date: 2019-07-23
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a method for subcellular localization of long-chain non-coding RNA based on multi-feature in



Examples

Embodiment Construction

[0048] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0049] Referring to Figure 1, the present invention comprises five main parts. (i) Construct the benchmark data set. Screening the data in the RNALocate database yields 643 long non-coding RNA sequences located in different subcellular locations. (ii) Construct feature vectors. The k-mer composition of each long-chain non-coding RNA is fused with its triplet structure-sequence elements to form a feature vector, so that both the sequence and the structure information of the RNA are used more comprehensively. Since the 8-mer component has a unique evolutionary mechanism, the parameter k is set to 8, so that each long non-coding RNA sequence can be expressed as a (4^8 + 32)-dimensional feature vector, i.e. 65,568 dimensions. (iii) Feature selection. Analysis of variance is used to select the optimal feature subset. (iv) Apply machine learning algorithms. Choose a support vector machi...
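The feature construction in step (ii) can be illustrated with a minimal Python sketch. It rests on several assumptions that the text above does not confirm: that k-mer counts are normalized to frequencies, that the 32 structure features are the commonly used triplet elements (the paired/unpaired state of three adjacent positions combined with the middle nucleotide, 8 x 4 = 32), and that a dot-bracket secondary structure string is already available from an external folding tool. The function names `kmer_composition`, `triplet_features` and `fused_vector` are hypothetical.

```python
from itertools import product

BASES = "ACGU"


def kmer_composition(seq, k):
    """Return normalized k-mer frequencies, ordered lexicographically over ACGU."""
    seq = seq.upper().replace("T", "U")
    index = {"".join(p): i for i, p in enumerate(product(BASES, repeat=k))}
    counts = [0] * len(index)
    total = 0
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows containing ambiguous bases such as N
            counts[j] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * len(counts)


def triplet_features(seq, structure):
    """Return 32 frequencies: 4 middle bases x 8 local paired/unpaired patterns.

    Assumption: `structure` is a dot-bracket string of the same length as `seq`.
    """
    seq = seq.upper().replace("T", "U")
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    paired = structure.replace(")", "(")  # both bracket directions mean "paired"
    patterns = ["".join(p) for p in product("(.", repeat=3)]
    index = {key: i for i, key in enumerate(product(BASES, patterns))}
    counts = [0] * len(index)
    total = 0
    for i in range(1, len(seq) - 1):
        j = index.get((seq[i], paired[i - 1:i + 2]))
        if j is not None:
            counts[j] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * len(counts)


def fused_vector(seq, structure, k=8):
    """Concatenate both feature groups into one (4**k + 32)-dimensional vector."""
    return kmer_composition(seq, k) + triplet_features(seq, structure)


# Toy hairpin; a real lncRNA is longer than 200 nucleotides.
vec = fused_vector("GGGAAAUCCC", "(((....)))")
print(len(vec))
```

With k = 8 the k-mer part alone has 4^8 = 65,536 entries, most of which are zero for any single sequence, which is one reason a feature selection step follows.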



Abstract

The invention discloses a long-chain non-coding RNA subcellular localization method based on multi-feature information fusion, and introduces a novel method for predicting the subcellular location of long-chain non-coding RNAs. The method fuses the k-mer composition with triplet structure-sequence elements to express each long-chain non-coding RNA sequence as a vector, so that the sequence and structure information of the RNA is used more comprehensively. To obtain an optimal feature subset, feature selection is carried out based on analysis of variance. In a leave-one-out cross-validation experiment, the accuracy of the method reaches 92.38%, which is higher than that of comparable algorithms.
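The variance-analysis feature selection and the leave-one-out evaluation mentioned in the abstract might be sketched as below. This is an illustration under stated assumptions, not the patented procedure: features are ranked by the one-way ANOVA F-statistic, the ranking is recomputed inside every fold to avoid leaking the held-out sample, and a 1-nearest-neighbour rule stands in for the support vector machine named in the description so that the example needs only the standard library. All function names and the toy data are hypothetical.

```python
def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        return 0.0  # the statistic is undefined in these cases
    grand = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    between = sum(len(g) * (means[y] - grand) ** 2 for y, g in groups.items())
    within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def select_top_features(X, y, m):
    """Return the indices of the m columns with the largest F-statistic."""
    scores = [anova_f([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:m]


def nearest_neighbor_predict(train_X, train_y, x):
    """Stand-in classifier (NOT the SVM from the source): label of the closest sample."""
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(row, x))
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i]))
    return train_y[best]


def leave_one_out_accuracy(X, y, m, predict=nearest_neighbor_predict):
    """Hold out each sample once, select features on the rest, then predict it."""
    correct = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        keep = select_top_features(train_X, train_y, m)
        reduced = [[row[j] for j in keep] for row in train_X]
        query = [X[i][j] for j in keep]
        correct += predict(reduced, train_y, query) == y[i]
    return correct / len(X)


# Toy data: column 0 separates the classes, column 1 is noise.
X = [[0.10, 5.0], [0.20, 1.0], [0.15, 3.0], [0.90, 4.0], [1.00, 2.0], [0.95, 6.0]]
y = ["a", "a", "a", "b", "b", "b"]
print(select_top_features(X, y, 1), leave_one_out_accuracy(X, y, 1))
```

Passing a different `predict` callable, for example a wrapper around a trained support vector machine, would bring the sketch closer to the pipeline the description outlines.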

Description

Technical field

[0001] The invention relates to the field of bioinformatics, in particular to a novel method for the subcellular localization of long-chain non-coding RNA.

Background

[0002] Long non-coding RNA (lncRNA) is a transcript longer than 200 nucleotides. At first, lncRNAs were thought to be "noise" of genome transcription, without biological function. In recent years, however, researchers have discovered that long non-coding RNAs play important roles in a variety of cellular and biological processes, such as cell differentiation, intracellular transport, chromatin modification, mRNA splicing, and transcriptional and post-transcriptional regulation. In addition, dysregulation of long non-coding RNAs has been associated with various human diseases, such as cardiovascular diseases, neurodegenerative diseases, obesity, and cancer. Increasing evidence indicates that the subcellular location of long non-coding RNAs has a strong influence on their biological functions. For e...


Application Information

IPC(8): G06N20/10, G16B15/00, G06K9/62
CPC: G06N20/10, G16B15/00, G06F18/24
Inventors: 杜朴风, 杨晓飞
Owner: TIANJIN UNIV