DNA-binding protein identification method of interactive fusion characteristic representations and selective integrations

A fused-feature technology for identifying DNA-binding proteins, applied in the interdisciplinary field of biology and informatics, that addresses the problem of recognition features being drawn from a single source of information

Active Publication Date: 2017-12-12
FUQING BRANCH OF FUJIAN NORMAL UNIV

AI Technical Summary

Problems solved by technology

If a feature representation method draws on only a single kind of information, such as amino acid composition or the protein frequency spectrum, the resulting identification features are too narrow




Embodiment Construction

[0010] In the practical application of machine learning, it is generally believed that "data and features determine the upper limit of machine learning, and models and algorithms can approach this upper limit." The present invention therefore works on both aspects at once: 1) effectively fusing multiple kinds of biological information to generate features with strong discriminative ability; 2) selecting and integrating multiple classifiers to produce a classifier with strong generalization ability. Figure 1 shows the framework of the predictive model, which comprises the interactively fused feature representation (left dashed box) and the selective ensemble classifier (right dashed box).
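The second aspect, selecting and integrating classifiers, can be illustrated with a minimal sketch. The excerpt here does not state the patent's actual selection rule, so the code assumes a common stand-in: greedy forward selection over candidate classifiers, combined by averaging their predicted probabilities (soft voting) and scored on a held-out validation set. The function names and the classifier names `svm`, `knn`, and `nb` are hypothetical, and the scores are made-up toy values.

```python
from typing import Dict, List, Tuple

def soft_vote(score_lists: List[List[float]]) -> List[int]:
    """Average P(DNA-binding) across classifiers, then threshold at 0.5."""
    return [1 if sum(col) / len(col) >= 0.5 else 0 for col in zip(*score_lists)]

def accuracy(predicted: List[int], actual: List[int]) -> float:
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def select_ensemble(
    candidate_scores: Dict[str, List[float]], labels: List[int]
) -> Tuple[List[str], float]:
    """Greedy forward selection (an assumed rule, not taken from the patent):
    keep adding the classifier that most improves validation accuracy of the
    soft-voted ensemble, and stop as soon as no candidate improves it."""
    selected: List[str] = []
    best_score = 0.0
    remaining = set(candidate_scores)
    while remaining:
        trials = []
        for name in sorted(remaining):  # sorted for deterministic tie-breaking
            members = [candidate_scores[n] for n in selected + [name]]
            trials.append((accuracy(soft_vote(members), labels), name))
        score, name = max(trials, key=lambda t: t[0])
        if score <= best_score:
            break
        selected.append(name)
        remaining.remove(name)
        best_score = score
    return selected, best_score

# Toy validation data: 1 = DNA-binding, 0 = non-binding (illustrative only).
labels = [1, 0, 1, 0]
candidate_scores = {
    "svm": [0.9, 0.2, 0.4, 0.1],
    "knn": [0.6, 0.4, 0.8, 0.6],
    "nb":  [0.3, 0.7, 0.6, 0.4],
}
chosen, score = select_ensemble(candidate_scores, labels)
print(chosen, score)
```

On this toy data the procedure keeps `knn` and `svm`, whose errors fall on different samples, and leaves out `nb`, which adds nothing to validation accuracy. That is the general motivation for selecting a subset rather than integrating every classifier.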

[0011] 1) Interactive Fusion Feature Representation

[0012] Feature representation converts a character sequence into a fixed-dimensional numeric feature vector according to the mathematical...
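As an illustration of turning a variable-length character sequence into a fixed-dimensional vector, the sketch below computes amino acid composition, which this document names as one example of a single-information representation. It is not the interactive fusion scheme itself, whose details are cut off here. The function name `amino_acid_composition` and the choice to ignore non-standard residue letters are assumptions made for the example.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids in a fixed order, so every vector shares one layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_composition(sequence: str) -> List[float]:
    """Map a protein sequence of any length to a 20-dimensional frequency vector.

    Letters outside the 20 standard residues (for example 'X') are ignored;
    that handling is an assumption of this sketch.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Sequences of different lengths yield vectors of the same dimension.
short_vec = amino_acid_composition("ACCA")
long_vec = amino_acid_composition("MKRLLAVDEGHHKRSTW")
print(len(short_vec), len(long_vec))
```

Because every sequence maps to the same 20 positions, the output can be fed directly to a standard classifier. A vector like this carries only one kind of information, which is the limitation the invention sets out to address by fusing several representations.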


Abstract

The present invention relates to a DNA-binding protein identification method based on interactively fused feature representations and selective ensemble. The method performs better than the prior art, which indirectly shows that the interactively fused feature representation generates features carrying strong discriminative information, and that selective ensemble further improves the generalization of the overall learner, thereby ensuring accurate prediction of DNA-binding proteins.

Description

Technical field

[0001] The invention relates to the cross field of biology and informatics, in particular to a method for predicting DNA-binding proteins by using machine learning.

Background technique

[0002] DNA-binding proteins play extremely important roles in various cellular processes, and identifying DNA-binding proteins is a very important task in understanding and interpreting protein functions. Starting from the protein sequence (primary structure), using machine learning methods to predict the structure and function of proteins is a hot topic in bioinformatics research and an important research method.

[0003] There are two categories of machine learning-based prediction methods for DNA-binding proteins: protein structure-based prediction and protein sequence-based prediction. Prediction of DNA-binding proteins based on protein structure can achieve a high recognition rate. However, due to insufficient protein structure information, such methods cannot be widel...


Application Information

IPC(8): G06F19/22, G06F19/24
CPC: G16B30/00, G16B40/00
Inventors: 游文杰, 陈芳, 甘胜进
Owner FUQING BRANCH OF FUJIAN NORMAL UNIV