MicroRNA-disease association prediction method based on a multimodal stacked autoencoder

An autoencoder-based prediction method in the fields of machine learning and bioinformatics. It addresses the problem that existing methods ignore the relationships between miRNAs, diseases and other biomolecules, with the aims of improving prediction accuracy and reducing model complexity and time complexity.

Active Publication Date: 2021-05-25
XINJIANG TECHN INST OF PHYSICS & CHEM CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

The human body is a unified whole: the many kinds of biomolecules in human cells act in coordination to maintain life activities, and their interactions are interconnected. Most current computational methods consider only a single type of known miRNA-disease association information and pay little attention to associations between miRNAs, diseases and other biomolecules.

Examples

Embodiment

[0085] The microRNA-disease association prediction method based on a multimodal stacked autoencoder according to the present invention is carried out in the following steps:

[0086] a. Selection and establishment of data sets: known human microRNA-disease association data are obtained from the Human MicroRNA Disease Database (HMDD) v3.0; microRNA sequence information is obtained from the miRBase database; disease subject terms are obtained from the Medical Subject Headings (MeSH) database; known microRNA-protein and microRNA-mRNA association data are obtained from the miRTarBase database; known protein-disease and mRNA-disease association data are obtained from the DisGeNET database; known microRNA-lncRNA association data are obtained from the lncRNASNP2 database; and known lncRNA-disease association data are obtained from the lncRNASNP2 and LncRNADisease databases;
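The association tables gathered in step a are later combined into three-layer networks such as the microRNA-protein-disease network named in the abstract. The following is a minimal sketch of that combination, assuming each association table has already been loaded as a list of (node, node) pairs. The function name `build_network` and the identifiers `miR-A`, `protein-1` and `disease-X` are hypothetical placeholders, not records from the databases listed above.

```python
from collections import defaultdict


def build_network(mirna_mid_pairs, mid_disease_pairs):
    """Merge two association lists into one undirected adjacency map.

    mirna_mid_pairs:   (microRNA, intermediate molecule) pairs
    mid_disease_pairs: (intermediate molecule, disease) pairs
    The intermediate molecule is a protein, mRNA or lncRNA.
    """
    adjacency = defaultdict(set)
    for left, right in list(mirna_mid_pairs) + list(mid_disease_pairs):
        adjacency[left].add(right)
        adjacency[right].add(left)
    return adjacency


# Toy, made-up identifiers used only to show the data shape.
mirna_protein = [("miR-A", "protein-1"), ("miR-B", "protein-1")]
protein_disease = [("protein-1", "disease-X")]

network = build_network(mirna_protein, protein_disease)
print(sorted(network["protein-1"]))
```

An adjacency map of this kind is one possible input format for a network embedding step; the patent text available here does not state how the networks are stored.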

[0087] b. Generation of microRNA sequence features: the nucleotides based on microRNA are uracil, cyt...

Embodiment 2

[0136] To better illustrate the effect of the prediction method of the present invention, it is compared with the currently popular random forest model. Table 1 lists the results obtained by this embodiment and by the random forest model on the HMDD v3.0 data set under five-fold cross-validation:

[0137] Table 1. Comparison of the results of the present invention and the random forest model on the HMDD v3.0 data set under five-fold cross-validation

[0138]

[0139] Figures 3 and 4 show the ROC curves generated by the present invention and by the random forest model, respectively. The comparison shows that this embodiment achieves better sensitivity, specificity, precision, Matthews correlation coefficient and AUC than the random forest method, which shows that the comprehensive performance of the present invention is bet...
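For reference, the threshold-based metrics named in this comparison can be computed from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; the function name `binary_metrics` and the counts passed to it are illustrative assumptions and are not results from Table 1.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, precision and MCC from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
example = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```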

Embodiment 3

[0141] To further demonstrate the effect of the prediction method of the present invention, it is compared with recent computational models. Figure 5 shows a histogram comparing the mean AUC of the different models and of the present invention on the same HMDD data set under five-fold cross-validation. The present invention attains a higher AUC value, and its overall performance is better than that of the other models.
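The mean AUC under five-fold cross-validation can be illustrated with a short sketch. It assumes that every sample already carries a score produced by a model trained on the other four folds; the training step itself is omitted. The helper names `auc_score`, `stratified_five_folds` and `mean_auc`, the stratified split, and the toy labels and scores are all assumptions made for illustration, since the source does not say how the folds were drawn.

```python
import random


def auc_score(labels, scores):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_five_folds(labels, seed=0):
    """Deal the indices of each class round-robin into five folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(5)]
    for cls in (0, 1):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for k, i in enumerate(members):
            folds[k % 5].append(i)
    return folds


def mean_auc(labels, scores, folds):
    """Average the per-fold AUC values over all folds."""
    per_fold = [
        auc_score([labels[i] for i in fold], [scores[i] for i in fold])
        for fold in folds
    ]
    return sum(per_fold) / len(per_fold)


# Toy data: 10 positives scored 0.9 and 10 negatives scored 0.1.
toy_labels = [1] * 10 + [0] * 10
toy_scores = [0.9] * 10 + [0.1] * 10
toy_folds = stratified_five_folds(toy_labels)
print(mean_auc(toy_labels, toy_scores, toy_folds))
```

The pairwise AUC used here is quadratic in the fold size, which is fine for a sketch; a rank-based computation would be the usual choice on a full data set.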

Abstract

The invention discloses a microRNA-disease association prediction method based on a multimodal stacked autoencoder. The method comprises: forming microRNA sequence features and disease semantic similarity features; constructing a microRNA-protein-disease network, a microRNA-mRNA-disease network and a microRNA-lncRNA-disease network, and obtaining the network neighborhood features of microRNAs and diseases in each of these networks with the LINE network embedding method; using a multimodal stacked autoencoder to mine high-level abstract features from the four feature types of microRNAs and diseases (intrinsic attribute features, protein network neighborhood features, mRNA network neighborhood features and lncRNA network neighborhood features), thereby reducing the time complexity of the model and improving its prediction accuracy; and training and predicting on the processed features with a CatBoost classifier, taking the average of the prediction scores of the four feature types as the final prediction score. The method avoids the high time and cost of traditional biological experiments, achieves a better classification effect, and predicts potential associations between microRNAs and diseases with higher accuracy.
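The final scoring step described in the abstract, averaging the prediction scores obtained from the four feature types, can be sketched as follows. The sketch assumes each feature type has already yielded one score per candidate microRNA-disease pair; the classifier itself is not shown. The function name `average_scores`, the dictionary keys and the numbers are hypothetical.

```python
def average_scores(score_views):
    """Average per-pair scores across feature types.

    score_views maps a feature-type name to a list of scores, one per
    candidate microRNA-disease pair, all lists in the same pair order.
    """
    views = list(score_views.values())
    if not views:
        raise ValueError("at least one feature type is required")
    n_pairs = len(views[0])
    if any(len(view) != n_pairs for view in views):
        raise ValueError("all feature types must score the same pairs")
    return [sum(view[i] for view in views) / len(views) for i in range(n_pairs)]


# Made-up scores for two candidate pairs.
views = {
    "attribute": [0.9, 0.2],
    "protein_network": [0.8, 0.1],
    "mrna_network": [0.7, 0.3],
    "lncrna_network": [0.6, 0.2],
}
final_scores = average_scores(views)
print([round(score, 3) for score in final_scores])
```

An unweighted mean gives each feature type equal influence on the final score, which matches the "average value" wording of the abstract.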

Description

Technical field

[0001] The invention relates to the fields of machine learning and bioinformatics, in particular to a microRNA-disease association prediction method based on a multimodal stacked autoencoder.

Background

[0002] MicroRNA (miRNA) is a small non-coding RNA molecule (~22 nt) that plays an important role in cells. It is estimated that 1-4% of the genes in the human genome are miRNAs and that a single miRNA regulates up to 200 mRNAs. miRNAs usually bind to the 3' untranslated regions (UTRs) of target mRNAs through sequence-specific base pairing and repress the expression of those target mRNAs, thereby participating in a series of important life processes. Identifying potential associations between miRNAs and human diseases has been a key goal of many bioinformatics research projects, because it facilitates the treatment and prevention of human diseases, molecular tool design and personalized diagnosis.

[0003] Traditional biological experiments are expensi...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B40/20, G06K9/62, G06N3/04, G06N3/08, G06N20/00
CPC: G16B40/20, G06N3/04, G06N3/08, G06N20/00, G06F18/214
Inventors: 姬博亚, 尤著宏, 胡伦, 王磊, 周喜, 蒋同海, 黄历广
Owner XINJIANG TECHN INST OF PHYSICS & CHEM CHINESE ACAD OF SCI