Overlapping protein complex identification method and system based on fuzzy clustering and gene ontology semantic similarity

The technology applies protein complex and Gene Ontology analysis to the identification of overlapping protein complexes. It addresses the insufficient accuracy of existing identification methods and their failure to detect some protein complexes, and improves identification accuracy.

Active Publication Date: 2021-10-01
XINJIANG TECHN INST OF PHYSICS & CHEM CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

However, protein complex identification methods based only on specific topological structures ignore the large amount of Gene Ontology semantic information associated with the proteins in a protein interaction network. Gene Ontology describes protein function from three aspects: molecular function, cellular component, and biological process.
Neglecting Gene Ontology information prevents the discovery of more biologically significant protein complexes.
In addition, some known protein

Examples

Embodiment

[0029] The overlapping protein complex identification method based on fuzzy clustering and gene ontology semantic similarity described in the present invention is carried out according to the following steps:

[0030] a. Construct an attributed protein interaction network from the protein interaction data and the Gene Ontology semantic information of the proteins: abstract each protein as a node in the network and, where two proteins interact, add an edge between the corresponding nodes. On this basis, collect the Gene Ontology information related to each protein, obtaining the set of all nodes, the set of edges between nodes, and the set of all Gene Ontology information associated with the nodes;

[0031] b. Compute the adjacency matrix of the network from the topology information of the network constructed in step a;

[0032] c. Apply an integrated Gene Ontology semantic similarity measurement method to calculate...
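Steps a and b can be illustrated with a minimal Python sketch. It assumes the interaction data is a list of protein-identifier pairs and the annotations are a mapping from protein to a set of GO term identifiers. The function names, input layout, and example identifiers are hypothetical and are not taken from the patent text. Step c is not sketched because its description is truncated in this listing.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_attributed_network(
    interactions: Iterable[Tuple[str, str]],
    go_annotations: Dict[str, Set[str]],
) -> Tuple[List[str], Set[Tuple[str, str]], Set[str]]:
    """Step a (sketch): return the node set, edge set, and GO term set."""
    interactions = list(interactions)
    # Every protein that appears in an interaction or an annotation is a node.
    nodes = sorted({p for pair in interactions for p in pair} | set(go_annotations))
    # Undirected edges: store each pair in sorted order and drop self-loops.
    edges = {tuple(sorted(pair)) for pair in interactions if pair[0] != pair[1]}
    # All GO terms associated with at least one node.
    go_terms = set().union(*go_annotations.values())
    return nodes, edges, go_terms


def adjacency_matrix(nodes: List[str], edges: Set[Tuple[str, str]]) -> List[List[int]]:
    """Step b (sketch): symmetric 0/1 adjacency matrix in the order of `nodes`."""
    index = {p: i for i, p in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        i, j = index[u], index[v]
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


# Hypothetical toy data: three proteins, two interactions.
interactions = [("P1", "P2"), ("P3", "P2")]
go_annotations = {
    "P1": {"GO:0003674"},
    "P2": {"GO:0003674", "GO:0008150"},
    "P3": {"GO:0005575"},
}
nodes, edges, go_terms = build_attributed_network(interactions, go_annotations)
A = adjacency_matrix(nodes, edges)
```

Storing each edge as a sorted pair keeps the network undirected, so the resulting matrix is symmetric regardless of the order in which the interaction data lists the two proteins.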

Abstract

The invention provides an overlapping protein complex identification method and system based on fuzzy clustering and gene ontology semantic similarity. The system comprises a network construction module, a data preprocessing module, a parameter definition module, a model construction module, a model solving module, a protein complex identification module and a result display module. Protein complex identification is achieved by jointly considering the interactions between proteins in the protein interaction network and the gene ontology semantic similarity between them. The method operates directly on the protein interaction network, can identify overlapping protein complexes in the network, achieves high accuracy, and effectively addresses the protein complex identification problem in protein interaction networks.
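How fuzzy clustering yields overlapping complexes can be illustrated with a small Python sketch. It assumes a fuzzy membership matrix is already available, with one row per protein and one column per cluster, and assigns a protein to every cluster whose membership degree reaches a threshold. The threshold rule, the parameter values, and the function name are illustrative assumptions. The abstract does not state how the patented model derives or interprets its memberships.

```python
from typing import List, Sequence, Set


def overlapping_complexes(
    membership: Sequence[Sequence[float]],
    proteins: Sequence[str],
    threshold: float = 0.3,  # assumed cut-off, not from the patent
    min_size: int = 2,       # assumed minimum complex size
) -> List[Set[str]]:
    """Turn fuzzy memberships into possibly overlapping protein sets."""
    n_clusters = len(membership[0]) if membership else 0
    complexes = []
    for k in range(n_clusters):
        # A protein joins cluster k when its membership degree is high enough,
        # so one protein can belong to several clusters at once.
        members = {p for p, row in zip(proteins, membership) if row[k] >= threshold}
        if len(members) >= min_size:
            complexes.append(members)
    return complexes


# Hypothetical memberships: P2 belongs equally to both clusters.
proteins = ["P1", "P2", "P3", "P4"]
membership = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.1, 0.9],
    [0.2, 0.8],
]
complexes = overlapping_complexes(membership, proteins)
```

In this toy example P2 appears in both resulting sets, which is the sense in which the identified complexes overlap. A hard clustering would force P2 into exactly one of them.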

Description

Technical field

[0001] The invention relates to the technical field of computer data processing and the field of computational biology, in particular to a method and system for identifying overlapping protein complexes based on fuzzy clustering and semantic similarity of gene ontology.

Background art

[0002] At present, existing protein complex identification methods rely mainly on the topological structure of the protein interaction network: according to whether proteins in the network interact, they identify clusters with certain specific structures as protein complexes. Typical structures describing the topological properties of protein complexes include dense structures, k-cliques, and core-attached structures. However, protein complex identification methods based only on specific topological structures ignore a large amount of Gene Ontology semantic information associated with pro...
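As an illustration of the purely topological criteria mentioned in the background, the following Python sketch computes the edge density of a candidate cluster and checks whether it is a clique. The density formula 2|E| / (n(n-1)) is the standard graph-theoretic definition. The function names and the toy network are hypothetical and do not come from the patent.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple


def cluster_density(cluster: Iterable[str], edges: Set[Tuple[str, str]]) -> float:
    """Fraction of possible protein pairs in `cluster` that interact."""
    members = sorted(set(cluster))
    n = len(members)
    if n < 2:
        return 0.0
    # Count member pairs joined by an edge, in either stored orientation.
    present = sum(
        1 for u, v in combinations(members, 2) if (u, v) in edges or (v, u) in edges
    )
    return 2 * present / (n * (n - 1))


def is_clique(cluster: Iterable[str], edges: Set[Tuple[str, str]]) -> bool:
    """A cluster is a clique when every pair of its members interacts."""
    return cluster_density(cluster, edges) == 1.0


# Hypothetical network: a triangle P1-P2-P3 with P4 attached to P3.
edges = {("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")}
triangle_density = cluster_density({"P1", "P2", "P3"}, edges)
whole_density = cluster_density({"P1", "P2", "P3", "P4"}, edges)
```

Both measures depend only on which proteins interact. Two clusters with the same wiring score identically even if their proteins have unrelated functions.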

Application Information

IPC(8): G16B15/20, G16B40/00
CPC: G16B15/20, G16B40/00, Y02A90/10
Inventors: 胡伦, 潘翔宇, 周喜, 蒋同海, 苏小芮
Owner: XINJIANG TECHN INST OF PHYSICS & CHEM CHINESE ACAD OF SCI