Key protein identification method based on Markov random walk

A Markov random walk identification method applied in the field of bioinformatics. It addresses false positives in protein interaction data, neglect of the biological characteristics of proteins, and noise in PPI networks, with the aim of overcoming high data noise, improving efficiency, and widening the method's range of application and practicability.

Active Publication Date: 2018-11-13
YANGZHOU UNIV
2 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

[0003] Before the present invention, key proteins were identified mainly from the topological characteristics of the network, using measures such as degree centrality (DC), betweenness centrality (BC), and local average connectivity (LAC). Li et al. fused PPI and gene expression data and proposed the centrality measure PeC; Zhang et al. fused PPI network topology features with gene co-expression information and proposed the CoEWC method. These methods have two disadvantages when used to identify key proteins: (1) they consider only the topological characteristics of the network and ignore the inherent biological characteristics of the proteins themselves; (2) the PPI network obtained through biological experiments contains noise, so the protein interaction data include false positives.



Examples

Embodiment

[0076] To verify the performance of the proposed algorithm EPM, we compared the number of key proteins it identifies with that of five other methods (DC, BC, LAC, PeC and CoEWC). For each method, we take the top 100, 200, 300, 400, 500 and 600 ranked proteins as candidate sets, then intersect each candidate set with the standard key protein set to obtain the number of true key proteins in the candidate set. The experimental results are shown in Figure 2.
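The top-k evaluation described in [0076] can be sketched as follows. This is only an illustration under stated assumptions: the function name `true_positives_at_k` and the toy protein IDs are hypothetical and do not come from the patent, and the cutoffs are shortened so the toy example stays readable.

```python
def true_positives_at_k(ranked_proteins, essential_set,
                        cutoffs=(100, 200, 300, 400, 500, 600)):
    """For each cutoff k, count how many of the top-k ranked proteins
    also appear in the reference (standard) key protein set."""
    counts = {}
    for k in cutoffs:
        candidates = set(ranked_proteins[:k])        # candidate set: top-k proteins
        counts[k] = len(candidates & essential_set)  # intersect with the standard set
    return counts


# Toy data (hypothetical IDs, not taken from the patent).
ranking = ["P1", "P2", "P3", "P4", "P5", "P6"]   # proteins sorted by descending score
essential = {"P1", "P3", "P6", "P9"}             # stand-in for the standard key protein set

result = true_positives_at_k(ranking, essential, cutoffs=(2, 4, 6))
print(result)
```

Running the same function on the ranking produced by each method (DC, BC, LAC, PeC, CoEWC, EPM) would give the per-cutoff counts that a comparison such as Figure 2 plots.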

[0077] Figures 2a, 2b, 2c, 2d, 2e and 2f show that, in the yeast PPI network, the proposed algorithm EPM identifies key proteins better than the other methods. When the top 500 and top 600 proteins are used as candidate sets, the number of key proteins identified by EPM is significantly higher than that of the other methods. Among them, compared with the PeC method, when extracting the top100, top20...


Abstract

The invention provides a key protein identification method based on Markov random walk and belongs to the technical field of biological information. The method comprises the following steps: using the idea of a Markov random walk, each vertex is assigned a score that expresses its degree of importance; the score is given an initial value and then walks randomly through the network according to a certain probability, with a correction applied during each transfer, the scores of all vertices forming an n-dimensional column vector; finally, the scores are sorted in descending order and the k proteins with the highest scores are output as the final result. By combining biological attributes with topological characteristics, the method improves the accuracy of key protein identification and also improves prediction efficiency.
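The steps in the abstract can be illustrated with a generic random walk with restart. This is a minimal sketch, not the patented EPM update rule: the excerpt gives neither the transition probabilities nor the form of the correction applied during transfer. The sketch assumes a uniform transition to neighbours, a damping factor `alpha`, and a restart vector built from a per-protein prior that stands in for the biological attributes. All names (`random_walk_scores`, `top_k`, `prior`) and the toy network are hypothetical.

```python
def random_walk_scores(adjacency, prior, alpha=0.85, tol=1e-10, max_iter=1000):
    """Iterate s <- alpha * P^T s + (1 - alpha) * r until the scores settle.

    adjacency: dict mapping each protein to a list of its neighbours
    prior:     dict of non-negative weights (assumed stand-in for biological attributes)
    """
    nodes = sorted(adjacency)
    total = sum(prior[v] for v in nodes)
    restart = {v: prior[v] / total for v in nodes}   # normalised restart vector r
    scores = dict(restart)                           # initial value of the score vector
    for _ in range(max_iter):
        new_scores = {v: (1 - alpha) * restart[v] for v in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated vertex: hand its score back through the restart vector.
                for v in nodes:
                    new_scores[v] += alpha * scores[u] * restart[v]
                continue
            share = alpha * scores[u] / len(neighbours)  # uniform transition (assumption)
            for v in neighbours:
                new_scores[v] += share
        delta = sum(abs(new_scores[v] - scores[v]) for v in nodes)
        scores = new_scores
        if delta < tol:
            break
    return scores


def top_k(scores, k):
    """Return the k proteins with the highest scores, in descending order."""
    return sorted(scores, key=lambda v: (-scores[v], v))[:k]


# Toy undirected PPI network (hypothetical): A interacts with B, C, D; B interacts with C.
network = {"A": ["B", "C", "D"], "B": ["A", "C"], "C": ["A", "B"], "D": ["A"]}
uniform_prior = {v: 1.0 for v in network}

scores = random_walk_scores(network, uniform_prior)
print(top_k(scores, 2))
```

With a uniform prior this reduces to a PageRank-style ranking driven by topology alone; a non-uniform `prior` is one way biological information could bias the walk, though the patent's own correction step may differ.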

Description

Technical field

[0001] The invention belongs to the technical field of biological information, and mainly relates to a technology for identifying key proteins in a protein interaction network through a Markov random walk algorithm, in particular to a method for identifying key proteins in a PPI network based on network topology information and protein biological attributes.

Background technique

[0002] Protein is an indispensable substance in life activities. It participates in almost all cycles of life activities, and key proteins play an irreplaceable role in this process. The absence of key proteins may cause life organisms to fail to survive. Therefore, the identification of key proteins in the PPI network not only helps to understand the process of cell growth regulation, but also helps to study the mechanism of biological evolution. In addition, in the field of biomedicine, the identification of key proteins is of great significance in the treatment of diseases and th...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/16
Inventor: 刘维, 马良玉, 陈昕
Owner: YANGZHOU UNIV