Key protein identification method based on capsule neural network and ensemble learning

A capsule neural network and ensemble learning technique, applied in the field of systems biology, that addresses the problems of too few features and the absence of deep-level features in earlier methods, and thereby improves the accuracy of key protein identification.

Active Publication Date: 2020-08-25
KUNMING UNIV OF SCI & TECH
Cites: 9; Cited by: 3

AI Technical Summary

Problems solved by technology

Although the above method incorporates some biological characteristics of proteins and uses an ensemble learning algorithm to identify key proteins, it still uses too few features and does not mine any deep-level features.



Examples


Embodiment 1

[0018] Embodiment 1: As shown in Figures 1-2, a key protein identification method based on a capsule neural network and ensemble learning comprises the following steps:

[0019] Step 1: Use the Cytoscape tool to extract eight biological characteristics of the proteins in the protein interaction network. The proteins are divided into two categories: key proteins and non-key proteins.

[0020] Key proteins tend to be the core nodes of protein interaction networks (PINs), because removing them would cause the network to collapse. The biological characteristics of proteins in a PIN mainly include betweenness centrality (BC), closeness centrality (CC), degree centrality (DC), eigenvector centrality (EC), information centrality (IC), and others. The software Cytoscape was used to extract eight biological characteristics of each protein: BC, CC, DC, EC, IC, NC, SC and ION.
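To make two of the features above concrete, the following is a minimal sketch of degree centrality (DC) and closeness centrality (CC) computed on a tiny undirected graph. It uses plain Python rather than Cytoscape, and the toy network, the protein labels and the function names are hypothetical illustrations, not part of the patent.

```python
from collections import deque


def degree_centrality(adj):
    """DC: fraction of the other nodes that each node interacts with directly."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """CC: number of reachable nodes divided by the sum of shortest-path distances."""
    result = {}
    for source in adj:
        # Breadth-first search gives shortest-path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


# Hypothetical toy network: protein "A" is a hub; "C" and "D" also interact.
toy_pin = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"A", "C"},
}

dc = degree_centrality(toy_pin)
cc = closeness_centrality(toy_pin)
print(dc["A"], cc["A"])
```

In this toy graph the hub "A" scores highest on both measures, which matches the intuition that core nodes of the network are more likely to be key proteins.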

[0...

Embodiment 2

[0040] (1) To test the effectiveness of the proposed method, we applied it to the Saccharomyces cerevisiae dataset and the Escherichia coli dataset, because these species are well studied and have the most complete and reliable key protein and protein interaction network data of all species. The key proteins of Saccharomyces cerevisiae were compiled from the MIPS, SGD, DEG and SGDP databases. The Saccharomyces cerevisiae protein interaction network data were obtained from the DIP database. The yeast protein interaction network consists of 5093 proteins and 24743 edges. Of the 5093 yeast proteins, 1167 are key proteins and 3926 are non-key proteins, giving a ratio of key to non-key proteins of 1:3.36. The key proteins of Escherichia coli were obtained from the DEG database. The Escherichia coli protein interaction network data were downloaded from the DIP database and include 2727 proteins and 11803 edges. Among the 2727 ...
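The class balance quoted for the yeast dataset can be checked with a few lines of arithmetic. This sketch uses only the counts stated above; the variable names are illustrative.

```python
# Counts stated for the Saccharomyces cerevisiae dataset.
key_proteins = 1167
non_key_proteins = 3926

total = key_proteins + non_key_proteins
non_key_per_key = non_key_proteins / key_proteins  # the "3.36" in 1:3.36
key_fraction = key_proteins / total                # share of positive samples

print(total, round(non_key_per_key, 2), round(key_fraction, 3))
```

The counts sum to the stated 5093 proteins, and key proteins make up a bit under a quarter of the samples, so the classification task is moderately imbalanced.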


Abstract

The invention discloses a key protein identification method based on a capsule neural network and ensemble learning. The method comprises the following steps: 1, extracting eight biological characteristics of the proteins in a protein interaction network using the Cytoscape tool; 2, extracting deep enhanced features from the eight biological characteristics using a capsule neural network; 3, concatenating the biological characteristics with the enhanced features; 4, feeding the concatenated features obtained in step 3 into the ensemble model Multi-enseable, training the model, and predicting new key proteins with the trained ensemble model; 5, outputting the result. Compared with the initial biological characteristics alone, the enhanced features extracted by the capsule neural network can improve the accuracy with which some machine learning models predict key proteins, and fusing the initial biological characteristics with the enhanced features improves that accuracy further.
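Steps 3 and 4 can be sketched as follows. This is a rough illustration only: the abstract does not say how the ensemble model combines its base learners, so simple majority voting is assumed here, the threshold classifiers are hypothetical stand-ins for trained models, and all feature values are made-up placeholders.

```python
def fuse_features(biological, enhanced):
    """Step 3: concatenate the original features with the enhanced features."""
    return list(biological) + list(enhanced)


def majority_vote(votes):
    """Assumed combination rule: label 1 (key) if more than half of the votes are 1."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def make_threshold_classifier(index, threshold):
    """Hypothetical base learner: predicts 1 when one feature exceeds a threshold."""
    def classify(features):
        return 1 if features[index] > threshold else 0
    return classify


# Placeholder values: eight biological features and four enhanced features.
biological = [0.8, 0.6, 0.9, 0.7, 0.5, 0.4, 0.3, 0.2]
enhanced = [0.1, 0.9, 0.4, 0.7]

fused = fuse_features(biological, enhanced)

# Three stand-in base learners, each looking at a different fused feature.
classifiers = [
    make_threshold_classifier(0, 0.5),
    make_threshold_classifier(2, 0.5),
    make_threshold_classifier(8, 0.5),
]
votes = [clf(fused) for clf in classifiers]
label = majority_vote(votes)
print(len(fused), votes, label)
```

Because the base learners see the fused vector, each one can draw on either the original or the enhanced features, which is the point of the concatenation in step 3.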

Description

Technical field

[0001] The invention relates to a key protein identification method based on a capsule neural network and ensemble learning, and belongs to the field of systems biology.

Background technique

[0002] The life activities of organisms often depend heavily on proteins. A key protein is one whose removal by knockout mutation results in loss of function of the associated protein complex and leads to cell death. Key proteins are an essential part of cellular life. Therefore, how to accurately predict key proteins has become a research focus in the field of proteomics.

[0003] In early studies of key proteins, biologists mainly used biological experiments to observe the effect on an organism when certain proteins were lost, and from this judged whether a protein was a key protein. Although good results were achieved, this approach is time-consuming and expensive. To this end, some researchers use computer t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B40/20, G16B25/10, G06N3/04, G06N3/08, G06N20/20
CPC: G16B40/20, G16B25/10, G06N3/08, G06N20/20, G06N3/045
Inventor: 彭玮, 李霞, 戴伟
Owner: KUNMING UNIV OF SCI & TECH