Query-driven protein-ligand binding site prediction method

A query-driven binding site prediction method, applied in the field of bioinformatics (protein-ligand interaction). It addresses the overfitting / over-optimization, low scalability, and low usability of existing methods, with the effect of preventing over-optimization and overfitting and improving prediction accuracy.

Inactive Publication Date: 2014-03-05
NANJING UNIV OF SCI & TECH
Cites: 2 | Cited by: 10

AI Technical Summary

Problems solved by technology

[0012] In view of the defects and deficiencies in the prior art, the present invention aims to provide a query-driven dynamic protein-ligand binding site prediction method that solves the problems of existing protein-ligand binding site prediction methods: low scalability, overfitting / over-optimization, and low usability.



Examples


Embodiment Construction

[0025] As shown in Figure 1, according to a preferred embodiment of the present invention, the query-driven protein-ligand binding site prediction method processes a protein sequence to be predicted (hereinafter referred to as the given query input q) in two stages: a dynamic model construction stage and a prediction stage. The implementation of these two stages is described in detail below with reference to Figure 1.

[0026] (1) Dynamic model construction stage

[0027] Step 1: Using the PSI-BLAST software tool, search the available data set D (the protein-ligand database in Figure 1) for protein sequences with high homology to the given query input q (the query sequence in Figure 1). These sequences constitute a small-scale, query-driven training data set D_q-specific. The query-driven training data set is thus obtained dynamically, expressed as:

[0028] $D_{q\text{-specific}} \leftarrow \text{PSI-BLAST}(q, D)$.
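The homology search in this step could be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes the NCBI BLAST+ `psiblast` program with tabular output (`-outfmt 6`), and the function names, the E-value cutoff of 1e-3, and the iteration count of 3 are hypothetical choices made only for the example. The command is assembled but not executed; the second function selects homologous sequence IDs from already-produced tabular output.

```python
def build_psiblast_command(query_fasta, db_name, out_tsv, evalue=1e-3, iterations=3):
    """Assemble a BLAST+ psiblast command line (returned as a list, not executed).

    The cutoff and iteration count are illustrative assumptions, not values
    taken from the patent text.
    """
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", db_name,
        "-num_iterations", str(iterations),
        "-evalue", str(evalue),
        "-outfmt", "6",  # tabular: qseqid sseqid pident ... evalue bitscore
        "-out", out_tsv,
    ]


def select_homologs(tabular_text, max_evalue=1e-3, max_hits=None):
    """Return subject IDs from outfmt-6 text, best E-value first, deduplicated."""
    best = {}
    for line in tabular_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip blank lines and non-tabular status lines
        subject_id, evalue = fields[1], float(fields[10])
        # Keep the smallest E-value seen for each subject across iterations.
        if evalue <= max_evalue and evalue < best.get(subject_id, float("inf")):
            best[subject_id] = evalue
    ranked = sorted(best, key=best.get)
    return ranked if max_hits is None else ranked[:max_hits]
```

The IDs returned by `select_homologs` would then be used to pull the matching annotated sequences out of D, giving the small query-specific training set.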

[0029] Such as figure ...


Abstract

The invention provides a query-driven protein-ligand binding site prediction method. The method includes the following steps: first, for a given query input q, searching for protein sequences with high homology to form a query-driven training data set; second, extracting all binding residues in the training data set as the positive sample set and all non-binding residues as the negative sample set; third, extracting the feature vector of each sample from the perspectives of evolutionary information and secondary structure to obtain the feature vector sets of the positive and negative samples; fourth, training with a standard support vector machine (SVM) algorithm to obtain an SVM prediction model specific to the query input q; fifth, for the query input, extracting the feature vector of each residue with the same feature extraction method, feeding it into the SVM prediction model, and making the prediction by threshold segmentation. The method can improve prediction accuracy and prevent the over-optimization and overfitting that may occur on a fixed training data set.
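The sample-set construction and threshold segmentation steps described above might look like the following sketch. Several things here are assumptions made for illustration and are not stated in the text: each residue is described by a row of numeric features (for example, evolutionary-profile and secondary-structure values), feature vectors are built with a zero-padded sliding window, and the threshold is 0.5. The SVM itself is not implemented; `score_fn` is a hypothetical stand-in for the trained model's per-residue output.

```python
def window_features(profile, center, half_width=2):
    """Concatenate per-residue feature rows in a window around `center`.

    Positions outside the sequence are zero-padded. The sliding window is an
    assumption of this sketch, not a detail given in the abstract.
    """
    width = len(profile[0])
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        row = profile[pos] if 0 <= pos < len(profile) else [0.0] * width
        features.extend(row)
    return features


def build_sample_sets(training_proteins, half_width=2):
    """Split all residues into positive (binding) and negative feature-vector sets.

    `training_proteins` is a list of (profile, labels) pairs, where labels[i]
    is truthy when residue i is a binding residue.
    """
    positives, negatives = [], []
    for profile, labels in training_proteins:
        for i, is_binding in enumerate(labels):
            vector = window_features(profile, i, half_width)
            (positives if is_binding else negatives).append(vector)
    return positives, negatives


def predict_binding_residues(profile, score_fn, threshold=0.5, half_width=2):
    """Score every residue of the query and apply threshold segmentation."""
    scores = [score_fn(window_features(profile, i, half_width))
              for i in range(len(profile))]
    return [score >= threshold for score in scores]
```

In a fuller pipeline, the positive and negative sets would be passed to an SVM trainer, and the resulting model's decision value or probability would take the place of `score_fn`.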

Description

Technical field

[0001] The invention relates to the field of bioinformatics protein-ligand interaction, in particular to a query-driven dynamic protein-ligand binding site prediction method.

Background art

[0002] Protein-ligand interactions are ubiquitous and indispensable in life activities. Determining the binding sites between proteins and ligands through biological experiments is time-consuming and labor-intensive. With the rapid development of sequencing technology and the progress of human structural genomics, a large number of protein sequences without annotated binding sites have accumulated. There is therefore an urgent need for intelligent methods that can predict protein-ligand binding sites directly from protein sequences. In recent years, several sequence-based protein-ligand binding site prediction methods have emerged, for example: (1) Chen, K., Mizianty, M.J. and Kurgan, L. (2011) ATPsite: sequence-based prediction of ATP-binding...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G16B40/00
Inventor: 於东军, 胡俊, 何雪, 李阳, 沈红斌, 唐振民, 杨静宇
Owner: NANJING UNIV OF SCI & TECH