Protein binding site prediction method, device and equipment and storage medium

A protein binding site prediction technology, applied in the field of bioinformatics, that addresses the low accuracy and limited general applicability of existing prediction methods.

Status: Inactive; Publication Date: 2018-01-09
SHENZHEN UNIV +1

AI Technical Summary

Problems solved by technology

[0005] The object of the present invention is to provide a method, device, computing device and storage medium for predicting protein binding sites, with the aim of solving the low accuracy and limited general applicability of protein binding site prediction in the prior art.

Examples

Embodiment 1

[0026] Figure 1 shows the implementation flow of the method for predicting protein binding sites provided by Embodiment 1 of the present invention. For ease of illustration, only the parts relevant to this embodiment are shown, as detailed below:

[0027] In step S101, the protein sequence to be predicted is received, and the protein sequence is segmented using a preset sliding window and sliding step to obtain a plurality of amino acid subsequences constituting the protein sequence to be predicted.

[0028] The embodiment of the present invention is applicable to a protein binding site prediction system. To reflect the aggregation characteristics of protein-protein binding sites, once the protein sequence to be predicted is received, a sliding window is applied, and the protein sequence is divided by adjusting the size of the sliding window and the sliding step length...
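The segmentation in step S101 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_sequence`, the example sequence, and the choice to drop a trailing fragment shorter than the window are all assumptions, since the source does not say how the sequence ends are handled.

```python
def split_sequence(sequence: str, window: int, step: int) -> list[str]:
    """Cut a protein sequence into amino acid subsequences with a sliding window.

    Assumption: only full-length windows are kept; a trailing fragment
    shorter than `window` is discarded.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Move the window start by `step` until a full window no longer fits.
    last_start = len(sequence) - window
    return [sequence[i:i + window] for i in range(0, last_start + 1, step)]


# Hypothetical 8-residue sequence in one-letter amino acid codes.
subsequences = split_sequence("MKTAYIAK", window=3, step=1)
```

With a step of 1, neighbouring subsequences overlap by `window - 1` residues; a larger step reduces the overlap and the number of subsequences.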

Embodiment 2

[0041] Figure 2 shows the structure of the protein binding site prediction device provided by Embodiment 2 of the present invention. For ease of description, only the parts relevant to this embodiment are shown, including:

[0042] The sequence division unit 21 is configured to receive the protein sequence to be predicted, and use a preset sliding window and sliding step to perform sequence division on the protein sequence, to obtain multiple amino acid subsequences constituting the protein sequence to be predicted.

[0043] The first vector construction unit 22 is configured to construct a word vector of the protein sequence from the obtained amino acid subsequences, where each word element of the word vector represents one amino acid subsequence, to extract document features from the word elements, and to construct a document feature vector of the protein sequence from the extracted document features.
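One way to picture the word vector and document feature vector is sketched below. The source excerpt does not state which document features are extracted, so relative term frequency is used here purely as an assumed stand-in; the function names and the example word list are hypothetical as well.

```python
from collections import Counter


def build_word_vector(subsequences):
    """Treat every amino acid subsequence as one word element."""
    return list(subsequences)


def term_frequency_vector(words, vocabulary):
    """Relative frequency of each vocabulary word among the word elements.

    Assumption: term frequency stands in for the unspecified
    document features of the source.
    """
    counts = Counter(words)
    total = len(words)
    return [counts[word] / total if total else 0.0 for word in vocabulary]


# Hypothetical word elements, e.g. windows of size 2 over "MKTMKT".
words = build_word_vector(["MK", "KT", "TM", "MK", "KT"])
vocabulary = sorted(set(words))  # fixed feature order: ["KT", "MK", "TM"]
document_features = term_frequency_vector(words, vocabulary)
```

Sorting the vocabulary fixes the order of the vector components, so feature vectors built from different sequences stay comparable as long as they share the same vocabulary.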

[0044] The second vect...

Embodiment 3

[0049] Figure 3 shows the structure of the protein binding site prediction device provided by Embodiment 3 of the present invention. For ease of description, only the parts relevant to this embodiment are shown, including:

[0050] The training sequence division unit 31 is configured to perform sequence division on the training protein sequence in the preset training set by using a preset sliding window and a sliding step to obtain a plurality of training amino acid subsequences constituting the training protein sequence.

[0051] The first feature processing unit 32 is configured to construct a training word vector of the training protein sequence from the obtained training amino acid subsequences, where each training word element of the training word vector represents one training amino acid subsequence, to perform document feature extraction on the training word elements, and to construct a document feature training vector for training pr...

Abstract

The invention belongs to the technical field of bioinformatics and provides a protein binding site prediction method, device, equipment and storage medium. The method comprises the steps of: receiving a protein sequence to be predicted; dividing the protein sequence using a preset sliding window and sliding step size to obtain a plurality of amino acid subsequences; constructing a word vector of the protein sequence from the subsequences; performing document feature extraction on the word elements and constructing document feature vectors of the protein sequence from the extracted document features; performing protein chain biological feature extraction on the amino acid subsequences and constructing biological feature vectors of the protein sequence from the extracted biological features; and using a preset amino acid residue group classification model to classify the amino acid subsequences represented by the document feature vectors and the biological feature vectors, thereby obtaining the amino acid residue group type of the protein sequence. The accuracy and applicability of protein binding site prediction are thereby improved.
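The final classification step, in which document and biological feature vectors jointly represent each subsequence, can be illustrated with a small sketch. Several parts are assumptions rather than the patent's method: mean Kyte-Doolittle hydropathy is used as an example biological feature, a nearest-centroid rule replaces the unspecified preset classification model, and the training rows and the labels `group_a` and `group_b` are made up for illustration.

```python
import math

# Kyte-Doolittle hydropathy index per amino acid (one-letter codes).
# Assumption: used only as an example of a biological feature.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def biological_features(subsequence):
    """Return [mean hydropathy] of a subsequence; unknown residues are skipped."""
    values = [HYDROPATHY[aa] for aa in subsequence if aa in HYDROPATHY]
    return [sum(values) / len(values)] if values else [0.0]


def combine(document_features, bio_features):
    """Concatenate document and biological features into one vector."""
    return list(document_features) + list(bio_features)


class NearestCentroidModel:
    """Hypothetical stand-in for the preset residue group classification model."""

    def fit(self, vectors, labels):
        grouped = {}
        for vector, label in zip(vectors, labels):
            grouped.setdefault(label, []).append(vector)
        # One centroid per label: the component-wise mean of its vectors.
        self.centroids = {
            label: [sum(column) / len(column) for column in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict(self, vector):
        # Pick the label whose centroid is closest in Euclidean distance.
        return min(self.centroids,
                   key=lambda label: math.dist(vector, self.centroids[label]))


# Made-up training rows: (subsequence, document features, label).
TOY_TRAINING = [
    ("IVL", [0.4], "group_a"),
    ("LLV", [0.2], "group_a"),
    ("KRD", [0.4], "group_b"),
    ("EEK", [0.2], "group_b"),
]
vectors = [combine(doc, biological_features(sub)) for sub, doc, _ in TOY_TRAINING]
labels = [label for _, _, label in TOY_TRAINING]
model = NearestCentroidModel().fit(vectors, labels)
predicted = model.predict(combine([0.3], biological_features("VIL")))
```

Concatenation keeps the two feature sources separable by position in the vector. In practice the components would likely need scaling before a distance-based rule is applied, since their numeric ranges differ.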

Description

Technical field

[0001] The invention belongs to the technical field of bioinformatics, and in particular relates to a method, device, equipment and storage medium for predicting protein binding sites.

Background

[0002] In recent years, bioinformatics has received widespread attention, and a growing number of researchers from different fields have devoted themselves to it. Bioinformatics is an interdisciplinary subject that studies information content and information flow in biological and biology-related systems. Its knowledge system draws on biology (genetics, biochemistry, etc.), mathematics (probability theory and mathematical statistics, algorithms, etc.), computer science (machine learning, computational theory, etc.), physical chemistry (molecular modeling, thermodynamics, etc.), and many other disciplines.

[0003] Protein is the embodiment of life activities and the most important basic unit for all living things to expre...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/18, G06F19/12
CPC: G16B20/30, G06N20/00, G06N5/04, G16B30/00, G16B40/00
Inventor: 张勇, 何威, 徐勇, 赵东宁
Owner: SHENZHEN UNIV