Calculation method for predicting key proteins by combining multiple data features

A computational method, applied at the intersection of mathematics and biology, that combines multiple data sources to predict key proteins. It addresses the limited accuracy and efficiency of existing prediction methods and reduces the cost and time needed to identify key proteins.

Active Publication Date: 2019-01-08
EAST CHINA JIAOTONG UNIVERSITY
5 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

[0006] Although the above-mentioned methods that integrate multiple data sources have improved the accuracy of predicting key proteins, both the accuracy and the efficiency of prediction still need to be improved.

Examples

Embodiment Construction

[0020] The beneficial effects of the present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments, aiming at helping readers better understand the essence of the present invention, but not limiting the implementation and protection scope of the present invention.

[0021] Because the protein interaction data and key protein data of yeast are currently the most complete among all species, yeast data is used to verify the effectiveness of the present invention. The yeast protein interaction data used for testing comes from the October 2010 release of the DIP database. After removing duplicate interactions and self-interactions, the resulting protein interaction network contains 5093 proteins and 24743 interactions.
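The cleaning step described in [0021] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `clean_interactions` and the toy protein identifiers are hypothetical, and the input is assumed to be a list of undirected protein pairs already parsed from the database file.

```python
def clean_interactions(pairs):
    """Drop self-interactions and duplicate undirected pairs (hypothetical helper)."""
    seen = set()
    cleaned = []
    for a, b in pairs:
        if a == b:
            continue  # self-interaction: a protein paired with itself
        key = (a, b) if a <= b else (b, a)  # treat (a, b) and (b, a) as the same link
        if key in seen:
            continue  # duplicate link
        seen.add(key)
        cleaned.append(key)
    return cleaned


# Toy data only; these identifiers are placeholders, not DIP entries.
raw = [("A", "B"), ("B", "A"), ("A", "A"), ("B", "C")]
edges = clean_interactions(raw)
proteins = {p for edge in edges for p in edge}
print(len(proteins), "proteins,", len(edges), "interactions")
```

Normalizing each pair to a sorted tuple lets a single set detect duplicates regardless of the order in which the two proteins were listed.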

[0022] Download yeast gene expression data (GSE3431) from the ...

Abstract

The invention discloses a calculation method for predicting key proteins by combining multiple data features. The method analyzes four characteristics of key proteins (aggregation, co-expression, functional similarity and positional consistency) and integrates four corresponding measures: the edge clustering coefficient of the protein interaction network, the Pearson correlation coefficient of gene expression values, the semantic similarity of gene ontology terms, and statistical features of protein subcellular localization. The method is simple and easy to use. It takes four kinds of data as input: protein interaction data, gene expression profile data, gene ontology term data and protein subcellular localization data. Test results indicate that, compared with existing methods, the method significantly improves the accuracy and efficiency of predicting key proteins in the protein interaction network.
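Two of the measures named in the abstract can be illustrated with a short sketch. The text shown here does not give the patent's exact formulas or how the four measures are combined, so the code below assumes one common definition of the edge clustering coefficient (the number of triangles containing an edge, divided by the largest number that edge could belong to) and the standard Pearson correlation coefficient. All function names and the toy data are hypothetical.

```python
from math import sqrt


def build_adjacency(edges):
    """Map each protein to the set of its neighbours in an undirected network."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: triangles through (u, v) / min(deg(u) - 1, deg(v) - 1)."""
    max_triangles = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if max_triangles <= 0:
        return 0.0  # an endpoint of degree 1 cannot be part of any triangle
    return len(adj[u] & adj[v]) / max_triangles


def pearson(x, y):
    """Standard Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # constant vector: correlation undefined, reported as 0 here
    return cov / (norm_x * norm_y)


# Toy network and expression values, for illustration only.
adj = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
expression = {"A": [1.0, 2.0, 3.0], "B": [2.0, 4.0, 6.0]}
ecc_ab = edge_clustering_coefficient(adj, "A", "B")
pcc_ab = pearson(expression["A"], expression["B"])
print(ecc_ab, pcc_ab)
```

Returning 0 for degenerate cases (a degree-1 endpoint or a constant expression vector) is a design choice of this sketch, made so that every edge receives a finite score.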

Description

Technical field

[0001] The invention relates to the interdisciplinary field of mathematics and biology, in particular to a calculation method for identifying key proteins in a protein interaction network based on multiple biological data sources.

Background

[0002] Proteins are the basic organic substances that make up cells. They are the main participants in life activities and play a very important role in maintaining normal physiological activities. A key protein is a protein that is necessary to maintain the normal life activities of an organism. When such a protein is abnormal, the normal physiological activities of the organism are disrupted, which can even cause disease. Studies have shown that key proteins are closely linked to disease-causing genes, drug target design, and personalized medical treatment. Effective identification of key proteins therefore helps the study of disease pathogenesis and drug molecular targets.

[0003] Traditional ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/30
Inventors: 张伟, 徐佳
Owner: EAST CHINA JIAOTONG UNIVERSITY