
Protein-nucleotide binding site prediction method based on supervised upsampling learning

A binding site prediction technology based on a learning method, applied to the field of protein-nucleotide interaction in bioinformatics, which addresses the problems of poor interpretability and the large gap between prediction accuracy and practical application.

Active Publication Date: 2018-01-05
NANJING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0006] To address the lack of positive sample information in the above-mentioned imbalanced sample space, which leads to poor interpretability and a large gap between prediction accuracy and practical application, the present invention proposes a protein-nucleotide binding site prediction method based on supervised upsampling learning that compensates for the missing positive sample information and achieves high prediction accuracy.


Examples


Embodiment Construction

[0051] To make the technical content of the present invention easier to understand, specific embodiments are described below with reference to the accompanying drawings.

[0052] Figure 1 is a schematic diagram of the principle of a protein-nucleotide binding site prediction method based on supervised upsampling learning according to an embodiment of the present invention. The method is implemented through the following steps:

[0053] Step 1: Based on the input protein sequence information, perform multi-view feature extraction and feature combination. That is, use the PSI-BLAST algorithm to extract the evolutionary information of the protein sequence and the PSIPRED algorithm to extract its secondary structure information; then use the sliding window method and the serial feature combination method, each amino acid residue in the p...
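
The sliding window and serial combination in Step 1 can be illustrated with a minimal sketch. It assumes the PSI-BLAST and PSIPRED outputs have already been parsed into one numeric row per residue; the function name `window_features`, the window size of 3, the zero padding at the sequence ends, and the toy matrices are illustrative assumptions, not values taken from the patent.

```python
from typing import List

Matrix = List[List[float]]


def window_features(pssm: Matrix, ss: Matrix, window: int = 3) -> Matrix:
    """Build one feature vector per residue (hypothetical helper).

    For each residue, the rows of each view that fall inside a window
    centred on that residue are flattened, and the two flattened views are
    then concatenated (serial combination). Positions beyond the sequence
    ends are zero-padded, which is an assumption made for this sketch.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centred on a residue")
    if len(pssm) != len(ss):
        raise ValueError("both views need one row per residue")

    n = len(pssm)
    half = window // 2
    pssm_pad = [0.0] * len(pssm[0])
    ss_pad = [0.0] * len(ss[0])

    features: Matrix = []
    for i in range(n):
        pssm_part: List[float] = []
        ss_part: List[float] = []
        for j in range(i - half, i + half + 1):
            inside = 0 <= j < n
            pssm_part.extend(pssm[j] if inside else pssm_pad)
            ss_part.extend(ss[j] if inside else ss_pad)
        # Serial combination: evolutionary view first, then structural view.
        features.append(pssm_part + ss_part)
    return features


# Toy input: 4 residues, a 2-column stand-in for PSSM scores and
# 3-column secondary structure probabilities (coil, helix, strand).
toy_pssm = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
toy_ss = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
feats = window_features(toy_pssm, toy_ss, window=3)
print(len(feats), len(feats[0]))
```

With real data each PSSM row would typically have 20 columns and each PSIPRED row 3, so the vector length is the window size times 23.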



Abstract

The present invention provides a protein-nucleotide binding site prediction method based on supervised upsampling learning, comprising the following steps: based on the protein sequence information in the training set, extracting the feature vector of each amino acid residue from the perspectives of evolutionary information and secondary structure to obtain positive and negative sample sets, where all amino acid residues bound to nucleotides form the positive sample set and all amino acid residues not bound to nucleotides form the negative sample set; using the supervised upsampling learning method to supplement the missing positive sample information in the positive and negative sample sets; using the standard support vector machine (SVM) algorithm to train the protein-nucleotide binding site SVM prediction model; and, for the protein sequence to be predicted, extracting the feature vector of each amino acid residue by the same method, inputting it into the prediction model, and then making the prediction by threshold segmentation. The invention can improve prediction accuracy and prevent possible loss of sample information on imbalanced data sets.
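
The data flow around the classifier can be sketched as follows. This excerpt does not describe how the supervised upsampling step generates new positive samples, so the sketch swaps in a plain interpolation-based oversampler (similar in spirit to SMOTE) purely as a stand-in. The SVM training step is omitted, and the helper names, the toy data, the scores, and the 0.5 threshold are all assumptions for illustration.

```python
import random
from typing import List, Tuple

Vector = List[float]


def split_by_label(features: List[Vector], labels: List[int]) -> Tuple[List[Vector], List[Vector]]:
    """Separate binding (label 1) residues from non-binding (label 0) ones."""
    pos = [f for f, y in zip(features, labels) if y == 1]
    neg = [f for f, y in zip(features, labels) if y == 0]
    return pos, neg


def oversample_positives(pos: List[Vector], target: int, rng: random.Random) -> List[Vector]:
    """Stand-in upsampler, NOT the patent's supervised method.

    New points are drawn on the segment between two randomly chosen
    positive samples until the positive set reaches the target size.
    """
    grown = list(pos)
    while len(grown) < target:
        a, b = rng.choice(pos), rng.choice(pos)
        t = rng.random()
        grown.append([x + t * (y - x) for x, y in zip(a, b)])
    return grown


def threshold_predict(scores: List[float], threshold: float) -> List[int]:
    """Threshold segmentation: call a residue binding if its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Toy imbalanced set: 2 binding residues, 4 non-binding residues.
toy_features = [[1.0, 2.0], [0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [3.0, 6.0]]
toy_labels = [1, 0, 0, 0, 0, 1]

pos, neg = split_by_label(toy_features, toy_labels)
balanced_pos = oversample_positives(pos, target=len(neg), rng=random.Random(0))

# Scores would come from the trained SVM; these values are made up.
predictions = threshold_predict([0.9, 0.2, 0.5], threshold=0.5)
print(len(balanced_pos), len(neg), predictions)
```

In a full pipeline, the balanced positive set and the negative set would be passed to an SVM trainer, and the threshold would be tuned on held-out data rather than fixed in advance.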

Description

Technical field

[0001] The invention relates to the field of protein-nucleotide interaction in bioinformatics, in particular to a protein-nucleotide binding site prediction method based on a supervised upsampling learning method.

Background

[0002] Nucleotides, including adenosine triphosphate (ATP), adenosine diphosphate (ADP), and adenosine monophosphate (AMP), are an important class of biological molecules that are of great significance for membrane transport, muscle contraction, cellular transport, signal transmission, DNA replication and transcription, and other life activities. In realizing these nucleotide functions, the interaction between proteins and nucleotides plays a crucial role; this interaction is ubiquitous and indispensable in life activities.

[0003] Determining the binding sites between proteins and nucleotides through biological experiments takes a great deal of time and money, and the efficiency is low. With the ra...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F19/16
Inventor: 於东军, 胡俊, 何雪, 李阳, 沈红斌, 杨静宇
Owner: NANJING UNIV OF SCI & TECH