DNA binding protein identification and function annotation deep learning method based on self-attention mechanism

A deep learning method for DNA-binding proteins that addresses two limitations of existing approaches: the inability to complete functional annotation, and the inability to identify the functional domains through which a protein binds DNA.

Active Publication Date: 2020-09-22
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

[0005] At present, deep learning methods can achieve higher prediction accuracy than machine learning methods, but bo...

Examples

Embodiment Construction

[0019] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0020] The present invention provides a deep learning model based on a self-attention mechanism. From the primary sequence of a protein, the model predicts whether the protein can bind to DNA and locates the region responsible for binding.
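
To make the idea of locating a binding region more concrete, here is a minimal sketch in Python. It assumes, hypothetically, that the trained model yields one attention weight per residue and that a candidate region is the fixed-width window with the largest total weight. The visible text of the patent does not state how the region is extracted, so the function `top_window`, the window width, and the example weights are all illustrative assumptions rather than the inventors' procedure.

```python
from typing import List, Tuple


def top_window(weights: List[float], width: int) -> Tuple[int, int, float]:
    """Return (start, end_exclusive, total_weight) of the highest-weight window.

    `weights` holds one hypothetical attention weight per residue.
    Ties are resolved in favour of the leftmost window.
    """
    if width <= 0 or width > len(weights):
        raise ValueError("width must be between 1 and len(weights)")
    current = sum(weights[:width])
    best_start, best_sum = 0, current
    # Slide the window one residue at a time, updating the running sum.
    for start in range(1, len(weights) - width + 1):
        current += weights[start + width - 1] - weights[start - 1]
        if current > best_sum:
            best_start, best_sum = start, current
    return best_start, best_start + width, best_sum


if __name__ == "__main__":
    # Made-up sequence and weights, for illustration only.
    sequence = "MKRRSTAGLLE"
    weights = [0.02, 0.20, 0.25, 0.22, 0.05, 0.04, 0.03, 0.06, 0.05, 0.04, 0.04]
    start, end, score = top_window(weights, width=3)
    print(sequence[start:end], start, end, round(score, 2))
```

A fixed-width window is the simplest choice; thresholding the weights and merging adjacent high-weight residues would be another reasonable option under the same assumption.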

[0021] As shown in Figure 1, the deep learning method of the present invention for DNA-binding protein identification and functional annotation based on the self-attention mechanism comprises the following steps:

[0022] 1. Data set selection

[0023] The data sets used in the present invention comprise a large-scale balanced data set (Table 1), a large-scale unbalanced data set (Table 2), a small-scale balanced data set (tabl...

Abstract

The invention discloses a deep learning method for DNA-binding protein identification and functional annotation based on a self-attention mechanism. The method comprises the following steps: selecting a data set from a protein database, training and testing a constructed deep learning model on the selected data set, and using the trained deep learning model to predict whether a protein can bind to DNA. The deep learning model comprises a coding layer, an embedding layer, a long short-term memory neural network layer (LSTM), a convolutional neural network layer (CNN) and a self-attention layer (Self-Attention). With this self-attention-based model, whether a protein can bind to DNA is predicted from the protein's primary sequence, and the region responsible for binding is identified.
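
The first and last layers named in the abstract can be illustrated with a short, dependency-free Python sketch. It shows one plausible form of the coding layer (amino-acid letters mapped to integer indices with padding) and a plain scaled dot-product self-attention step. The index assignment, the `PAD` and `UNK` values, and the use of unprojected inputs as queries, keys and values are assumptions made for illustration. The embedding, LSTM and CNN layers are omitted, and the patent's actual layer configuration is not given in the visible text.

```python
import math
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD, UNK = 0, 21  # assumed indices for padding and unknown residues
INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence: str, max_len: int) -> List[int]:
    """Coding-layer sketch: letters -> integer ids, truncated or padded to max_len."""
    ids = [INDEX.get(aa, UNK) for aa in sequence.upper()[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: List[List[float]]) -> Tuple[List[List[float]], List[List[float]]]:
    """Scaled dot-product self-attention with Q = K = V = x (no learned projections).

    Returns (outputs, weights); each row of `weights` sums to 1.
    """
    d = len(x[0])
    weights = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


if __name__ == "__main__":
    print(encode("MKR", 5))
    # Toy 2-dimensional feature vectors standing in for upstream layer outputs.
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    _, attn = self_attention(features)
    for row in attn:
        print([round(w, 3) for w in row])
```

In a trained model the queries, keys and values would normally come from learned projections of the upstream layer outputs; they are left out here to keep the arithmetic easy to follow.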

Description

Technical field

[0001] The invention relates to the technical field of deep learning, and in particular to a deep learning method for DNA-binding protein identification and functional annotation based on a self-attention mechanism.

Background technique

[0002] DNA-binding proteins (DBPs) are proteins that contain functional regions capable of binding to DNA, and they are widely present in various organisms. They play a vital role in biological regulatory mechanisms, including DNA replication, transcription, repair and recombination. Therefore, effectively determining the functional region through which a protein binds DNA is crucial not only for the analysis of protein function but also for the development of new drugs. A protein sequence capable of binding to DNA contains one or several DNA-binding domains, which are composed of several amino acid residues, are capable of interacting with DNA, and are general...

Application Information

IPC(8): G16B20/30, G16B15/20, G16B40/00, G06K9/62, G06N3/04, G06N3/08
CPC: G16B20/30, G16B15/20, G16B40/00, G06N3/049, G06N3/084, G06N3/045, G06N3/048, G06F18/24
Inventors: 宫秀军 (Gong Xiujun), 杨超莹 (Yang Chaoying)
Owner: TIANJIN UNIV