Protein-ligand binding site prediction algorithm based on deep learning

A ligand binding site prediction technology, applied in the fields of protein biology and pattern recognition, that addresses the low accuracy of existing prediction algorithms and achieves higher prediction accuracy.

Active Publication Date: 2020-01-14
SHANGHAI JIAO TONG UNIV

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide a deep-learning-based protein-ligand binding site prediction algorithm that addresses the low accuracy of the prediction algorithms in the prior art.

Examples

Embodiment

[0092] A data set of protein-Mn2+ binding sites is used as the training set and the test set. The training set contains 440 proteins, with 1931 binding residues and 150229 non-binding residues; the test set contains 144 proteins, with 612 binding residues and 50838 non-binding residues.
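The residue counts in [0092] imply a strong class imbalance, which the short sketch below makes explicit. It only restates the arithmetic on the quoted numbers; the dictionary layout and the `imbalance` helper are illustrative names, not part of the patent.

```python
# Residue counts quoted in paragraph [0092].
counts = {
    "training": {"proteins": 440, "binding": 1931, "non_binding": 150229},
    "test": {"proteins": 144, "binding": 612, "non_binding": 50838},
}


def imbalance(binding, non_binding):
    """Return (fraction of binding residues, non-binding residues per binding residue)."""
    total = binding + non_binding
    return binding / total, non_binding / binding


for name, c in counts.items():
    fraction, ratio = imbalance(c["binding"], c["non_binding"])
    print(f"{name}: {fraction:.2%} binding, about {ratio:.0f} non-binding per binding residue")
```

In both splits binding residues make up a little over 1% of all residues, so accuracy alone would be a weak measure of a predictor trained on this data.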

[0093] First, the PSI-BLAST, HHblits, SCRATCH and S-SITE algorithms were used to extract the evolutionary information, secondary structure information, relative solvent accessibility and binding probability, respectively, of all proteins in the training set and test set, and the evolutionary information (including PSSM and HHM) was normalized. Second, the Euclidean distance between each residue pair was calculated from the three-dimensional coordinates of each residue of every protein in the training set and test set to construct a distance matrix, and the number of columns of the matrix was scaled to 400. Finally, the feature tensor is inter...
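The distance-matrix step in [0093] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text does not say how the columns are scaled to 400, so linear interpolation along each row is an assumption here, and `distance_matrix` and `resize_row` are hypothetical helper names. The sketch uses only the Python standard library.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (x, y, z)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def resize_row(row, width):
    """Resample one row to `width` values by linear interpolation.

    ASSUMPTION: the source only says the column count is scaled to 400;
    linear interpolation is one plausible way to do that.
    """
    n = len(row)
    if n == 1 or width == 1:
        return [row[0]] * width
    out = []
    for j in range(width):
        pos = j * (n - 1) / (width - 1)  # fractional index into the original row
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        t = pos - lo
        out.append(row[lo] * (1 - t) + row[hi] * t)
    return out


# Toy protein with three residues (made-up coordinates).
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
dist = distance_matrix(coords)
scaled = [resize_row(row, 400) for row in dist]  # one 400-column row per residue
```

Each residue keeps its own row, so a protein with L residues yields an L x 400 matrix whatever its length.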

Abstract

The invention discloses a protein-ligand binding site prediction algorithm based on deep learning. For a protein to be predicted, the algorithm first extracts sequence features and a distance matrix; assigns the sequence features to each residue using a sliding window; feeds the features corresponding to each residue, one residue at a time, into a residual neural network and a hybrid neural network; and passes the outputs of the two networks to a logistic regression classifier to obtain the final result, namely the binding probability of each residue in the protein. The method fuses a classic bidirectional long short-term memory network with a residual neural network; the fused network can process heterogeneous protein sequence and structural data at the same time and exploits the complementarity of sequence and structural features. Compared with existing methods, the algorithm has higher prediction precision and generalizes well to data sets of different ligands.
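Two steps of the abstract lend themselves to a small sketch: assigning sequence features to each residue with a sliding window, and combining the two network outputs with a logistic regression classifier. The code below is a hedged illustration using only the standard library. The window half-width, the zero padding at the sequence ends, and the weights passed to `fuse` are all assumptions; in the described method the classifier's weights would be learned, and the two input probabilities would come from the residual and hybrid networks, which are not modeled here.

```python
import math


def sliding_windows(features, half_width):
    """Return, for every residue, the feature vectors of its neighbourhood.

    ASSUMPTION: positions outside the sequence are padded with zero vectors.
    """
    dim = len(features[0])
    pad = [0.0] * dim
    windows = []
    for i in range(len(features)):
        window = []
        for j in range(i - half_width, i + half_width + 1):
            window.append(features[j] if 0 <= j < len(features) else pad)
        windows.append(window)
    return windows


def fuse(p_residual, p_hybrid, w_residual, w_hybrid, bias):
    """Logistic-regression combination of two network outputs for one residue.

    The weights and bias are hypothetical placeholders for learned parameters.
    """
    z = w_residual * p_residual + w_hybrid * p_hybrid + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy sequence of 5 residues with 2 features each (made-up numbers).
seq = [[0.1, 0.9], [0.4, 0.6], [0.8, 0.2], [0.5, 0.5], [0.3, 0.7]]
wins = sliding_windows(seq, half_width=1)  # 5 windows of 3 feature vectors each

# Made-up network outputs for one residue, combined into a single probability.
binding_probability = fuse(0.7, 0.6, w_residual=2.0, w_hybrid=2.0, bias=-2.0)
```

Every residue gets a window of the same size, so the downstream networks see a fixed-shape input regardless of where the residue sits in the chain.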

Description

Technical field

[0001] The present invention relates to the fields of protein biology and pattern recognition, in particular to a protein-ligand binding site prediction algorithm based on deep learning.

Background art

[0002] Protein-ligand interactions play important roles in biological processes such as signal transduction, post-translational modification, and antigen-antibody interactions. In addition, drug discovery and design rely heavily on the mechanistic analysis of protein-ligand interactions. Identifying the binding site is a critical step in exploring the mechanism behind a protein-ligand interaction. With the emergence of protein design technology, more new proteins with unexplored properties and functions will appear, so the need for rapid and accurate binding site identification tools becomes more urgent. Currently there is a method for identifying protein binding sites through wet-lab experiments, but its disadvan...

Application Information

IPC(8): G16B15/30, G16B20/30, G16B40/00
CPC: G16B15/30, G16B20/30, G16B40/00
Inventors: 夏春秋, 杨旸, 沈红斌
Owner: SHANGHAI JIAO TONG UNIV