Mature miRNA full-site recognition method based on SVM-AdaBoost

A recognition method for mature miRNA, applied in the field of bioinformatics, that addresses problems such as class imbalance and low precision. It achieves high classification performance, improves recognition precision, and reduces the average number of nucleotide offsets.

Inactive Publication Date: 2019-02-26
QIQIHAR UNIVERSITY

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to solve the low precision and class imbalance problems of existing single-classifier methods for identifying mature miRNA, and to propose a mature miRNA full-site recognition method based on SVM-AdaBoost.



Examples


Specific Embodiment 1

[0027] The SVM-AdaBoost-based mature miRNA full-site recognition method of this embodiment is realized through the following steps:

[0028] Step 1. Select pre-miRNA sequences from the miRBase database, and build a training data set and a test set from the selected sequences;

[0029] Step 2. Extract the biological features of mature miRNA cleavage sites based on the structured sequence:

[0030] Step 2.1. Based on biological feature analysis, define the biological features of the mature miRNA cleavage site;

[0031] Step 2.2. Define the mature miRNA duplex and the sites corresponding to the mature miRNA duplex;

[0032] Step 2.3. Construct a sequence on the defined mature miRNA duplex for feature extraction;

[0033] Step 2.4. Predict the secondary structure and free energy of the constructed sequence;

[0034] Step 2.5. Extract a feature set from the constructed sequence;

[0035] Step 3. Obtain a new feature se...
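As an illustration of how Step 2 and Step 3 fit together, the following is a minimal Python sketch, not the patented procedure. It assumes the secondary structure is already available as a dot-bracket string from an external predictor, and the three features (`site_base`, `paired_majority`, `gc_rich`) as well as all function names are hypothetical placeholders chosen for this example. The ranking uses information gain, which the abstract names as the feature selection algorithm.

```python
import math
from collections import Counter


def window_features(seq, structure, site, flank=3):
    """Hypothetical features for one candidate cleavage site.

    seq is the nucleotide string; structure is a dot-bracket string of the
    same length (assumed to come from an external structure predictor).
    The window covers positions [site - flank, site + flank).
    """
    lo, hi = max(0, site - flank), min(len(seq), site + flank)
    win_seq, win_struct = seq[lo:hi], structure[lo:hi]
    paired = sum(ch in "()" for ch in win_struct)
    gc = sum(ch in "GC" for ch in win_seq)
    return {
        "site_base": seq[site],
        "paired_majority": paired * 2 >= len(win_struct),
        "gc_rich": gc * 2 >= len(win_seq),
    }


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the samples on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Feature names sorted from highest to lowest information gain."""
    return sorted(rows[0].keys(),
                  key=lambda f: information_gain(rows, labels, f),
                  reverse=True)
```

A reduced feature set could then be formed by keeping the top-ranked names; how many features the original method keeps is not stated in this excerpt.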

Specific Embodiment 2

[0039] This embodiment differs from Specific Embodiment 1 in that, in the SVM-AdaBoost-based mature miRNA full-site recognition method, the procedure in Step 1 for selecting pre-miRNA sequences from the miRBase database and building the training data set and the test set is as follows:

[0040] Select pre-miRNA sequences from the miRBase database, remove redundant sequences and multi-branched sequences, and from the remaining sequences build a training set and a test set for the 3' end and a training set and a test set for the 5' end. Here, pre-miRNA means precursor miRNA.
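The data preparation described above can be sketched roughly as follows. This is an illustrative sketch under several assumptions that are not in the source: records are `(name, sequence, structure)` tuples with a dot-bracket structure, redundancy is reduced to exact duplicate sequences, any structure with more than one hairpin loop is treated as multi-branched, and the 5' and 3' sets are drawn as independent random splits with a hypothetical `test_fraction`.

```python
import random
import re


def is_multi_branched(structure):
    """Simplifying assumption: more than one hairpin loop in the
    dot-bracket string marks the sequence as multi-branched."""
    return len(re.findall(r"\(\.*\)", structure)) > 1


def split(items, test_fraction, rng):
    """Shuffle a copy of items and cut it into train and test parts."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return {"train": shuffled[n_test:], "test": shuffled[:n_test]}


def build_datasets(records, test_fraction=0.2, seed=0):
    """Filter the records, then build one train/test split per arm end."""
    seen, kept = set(), []
    for name, seq, structure in records:
        # Skip exact duplicates (a stand-in for redundancy removal)
        # and sequences flagged as multi-branched.
        if seq in seen or is_multi_branched(structure):
            continue
        seen.add(seq)
        kept.append((name, seq, structure))
    rng = random.Random(seed)
    return {end: split(kept, test_fraction, rng) for end in ("5p", "3p")}
```

In practice redundancy removal is usually based on sequence similarity rather than exact matches; the exact criterion used by the original method is not given in this excerpt.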

Specific Embodiment 3

[0041] This embodiment differs from Specific Embodiment 1 or 2 in that the pre-miRNA sequences selected from the miRBase database in Step 1 are human pre-miRNA sequences.


Abstract

A mature miRNA full-site recognition method based on SVM-AdaBoost, belonging to the field of bioinformatics. Existing single-classifier methods suffer from low accuracy and class imbalance when recognizing mature miRNAs. The method comprises the steps of: selecting pre-miRNA sequences from the miRBase database and building a training data set and a test set from the selected sequences; extracting biological features of mature miRNA cleavage sites based on the structured sequence; obtaining a new feature set with an information gain feature selection algorithm; constructing a probability-based SVM classifier model with adjustable parameters; constructing an ensemble classifier model based on the AdaBoost algorithm; and training a full-site classifier for miRNA cleavage sites. The method improves recognition accuracy and reduces the average nucleotide offset. A comparison of several mature miRNA recognition methods on the same test set shows that the proposed method achieves higher classification performance.
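To make the ensemble step more concrete, here is a minimal pure-Python sketch of AdaBoost. It is not the classifier described in the abstract: single-feature decision stumps are swapped in for the probability-based SVM base learner so that the example stays self-contained, and the class-balanced initial weights are only one assumed way of handling class imbalance. Labels are assumed to be +1 and -1 with both classes present, and all function names are hypothetical.

```python
import math


def train_stump(X, y, w):
    """Find the single-feature threshold rule with the lowest weighted error.

    Returns (error, feature_index, threshold, sign).
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign


def adaboost(X, y, rounds=10):
    """Train a weighted ensemble of stumps; y must contain +1 and -1."""
    pos = sum(1 for yi in y if yi == 1)
    neg = len(y) - pos
    # Assumed imbalance handling: each class starts with half the weight mass.
    w = [0.5 / pos if yi == 1 else 0.5 / neg for yi in y]
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)  # avoid division by zero for perfect stumps
        if err >= 0.5:
            break  # the weak learner is no better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1
```

Replacing the stump with an SVM that outputs class probabilities would keep the same outer loop: train the base learner under the current sample weights, compute its weighted error, and reweight the samples.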

Description

Technical field

[0001] The invention relates to the field of bioinformatics, in particular to a miRNA full-site recognition method.

Background

[0002] miRNA is a class of highly conserved, endogenous small RNA molecules about 20-24 nt in length that regulate gene expression at the post-transcriptional level. miRNA inhibits protein synthesis and controls gene expression by binding to mRNA. It is estimated that miRNAs regulate 60% of human transcription processes. miRNAs participate in a variety of biological processes through sequence-specific RNA gene silencing. Existing studies have found that miRNAs are involved in cell proliferation and development, tissue differentiation, the cell cycle, and apoptosis. For example, miRNA is closely related to the development of plant germ and leaves, human and mouse cell development, the growth and development of nerve cells, and the transformation of neural stem cells into nerve cells; miRNA is closely rela...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G16B20/30
Inventors: 王颖, 汝吉东
Owner: QIQIHAR UNIVERSITY