PPI knowledge graph representation learning method based on protein family clustering

A knowledge graph representation learning technique, applied in neural learning methods, text database clustering/classification, bioinformatics, etc. It targets problems such as many-to-many and multi-type relations that earlier models could not handle, and aims to ensure training effectiveness.

Pending Publication Date: 2020-12-25
刘容恺
0 cites; cited by 4

AI Technical Summary

Problems solved by technology

However, TransE cannot handle many-to-many relations or multiple relation types. These limitations were later addressed by models such as TransH and TransR, and higher-level semantic information such as relation paths was incorporated to produce models such as PTransE.
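
To see why many-to-many relations are a problem for TransE, consider the short derivation below. It assumes the standard TransE condition that a true triple satisfies h + r ≈ t, which this summary does not spell out; the symbols t_1 and t_2 are illustrative names for two different tail entities.

```latex
% Assumption: TransE models a true triple (h, r, t) by requiring h + r \approx t.
\[
(h, r, t_1) \text{ and } (h, r, t_2) \text{ both true}
\;\Longrightarrow\;
t_1 \approx h + r \approx t_2 ,
\]
% so two distinct tail entities are pushed toward the same vector.
```

TransH and TransR relax this constraint by projecting entities onto a relation-specific hyperplane or space before applying the translation.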



Examples


Embodiment Construction

[0027] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0028] As shown in Figure 2, vectorized representations of each entity in the protein interaction knowledge graph are obtained by preliminary representation learning with GCN and PBG respectively. Regardless of the approach, the loss function used during training is critical to the effectiveness of representation learning. However, the loss functions of existing models often consider only the distance between h+r and t for each triple across the entire graph, or perform representation learning on a single semantic plane, while ignoring relationships that the graph itself carries at the level of the semantic ontology. The most typical and most important of these is the parent-child relationship. Similar entities should be trained so that their vector representations lie close to each other in the vector space. As a subclass...
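
The "distance between h+r and t" mentioned above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the L2 norm, and the margin-based ranking loss are assumptions borrowed from the usual TransE formulation, and the vectors are toy values.

```python
import numpy as np

def transe_distance(h, r, t):
    """L2 distance between h + r and t; smaller means a more plausible triple."""
    return float(np.linalg.norm(h + r - t))

def margin_loss(pos_dist, neg_dist, margin=1.0):
    """Assumed margin ranking loss: penalize a true triple that is not
    at least `margin` closer than a corrupted one."""
    return max(0.0, margin + pos_dist - neg_dist)

# Toy 2-D embeddings (illustrative values only).
h = np.array([1.0, 0.0])
r = np.array([0.0, 1.0])
t_true = np.array([1.0, 1.0])    # h + r lands exactly on this tail
t_false = np.array([4.0, 5.0])   # corrupted tail, far from h + r

pos = transe_distance(h, r, t_true)
neg = transe_distance(h, r, t_false)
loss = margin_loss(pos, neg)
```

A loss of this form looks only at individual triples, which is the limitation the paragraph points out: nothing in it pulls a child entity toward its parent class.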



Abstract

A PPI knowledge graph representation learning method based on protein family clustering is realized through the following steps: a) entity classification in the PPI knowledge graph; b) pre-representation learning; c) construction of a child-parent class loss function: c1) calculation of the sum of squares of the distance errors; c2) normalization of the sum of squares of the distance errors; d) overall training: d1) training of a general model; d2) training based on the child-parent class loss function; and e) repeating step d) for multiple rounds of training. In this method, a representation learning model based on a child-parent class loss function is applied to a protein interaction knowledge graph, and representation learning is trained using the child-parent class relationship between homologous proteins and their protein family. The invention offers better accuracy, reliability and interpretability in reasoning about protein function and interaction.
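
Steps c1) and c2) might look roughly like the sketch below. The abstract does not give the formula, so everything specific here is an assumption: the function name `child_parent_loss`, the use of squared Euclidean distance between each child (protein) vector and its parent (family) vector, and normalization by the number of children.

```python
import numpy as np

def child_parent_loss(child_vecs, parent_vec):
    """Hypothetical child-parent class loss for one protein family.

    child_vecs: array of shape (n_children, dim), one row per member protein.
    parent_vec: array of shape (dim,), the family (parent class) vector.
    """
    diffs = child_vecs - parent_vec       # per-child offset from the parent
    sse = float(np.sum(diffs ** 2))       # c1) sum of squared distance errors
    return sse / len(child_vecs)          # c2) normalization (assumed: by child count)

# Toy example: three member proteins and their family vector.
children = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])
parent = np.array([0.0, 0.0])
family_loss = child_parent_loss(children, parent)  # (1 + 1 + 2) / 3
```

Under these assumptions the term shrinks as members of a family cluster around the family vector, which matches the stated goal of placing similar entities close together; how it is weighted against the general model's loss in step d2) is not specified in the abstract.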

Description

Technical field

[0001] The present invention relates to a PPI knowledge graph representation learning method, and more specifically to a PPI knowledge graph representation learning method based on protein family clustering.

Background

[0002] Protein-protein interaction (PPI) is the basic component of the biomolecular network, the main manifestation of biological activities, and the ultimate executor of cell activity and function, and it is important in research and development. Proteins directly determine the composition and repair of organisms, the regulation of physiological functions, carrier transport and energy regulation, and participate in almost all biological activities such as heredity, development, reproduction, metabolism, and stress response. In-depth research on protein structure and function, revealing the specific functions and mechanisms of thousands of proteins in organisms, has always been the core content of protein ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/36, G06F16/35, G06F40/30, G06K9/62, G06N3/08, G16B50/10, G16B50/30
CPC: G06F16/367, G06F16/355, G06N3/08, G06F16/35, G16B50/10, G16B50/30, G06F40/30, G06F18/214
Inventor: 刘容恺
Owner: 刘容恺