A deep learning approach for DNA-binding protein identification and functional annotation based on self-attention mechanism

A deep learning technology for DNA-binding proteins, applied in the field of deep learning. It addresses two shortcomings of existing methods: they cannot identify the functional domains through which proteins bind DNA, and they cannot perform functional annotation.

Active Publication Date: 2022-07-12
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

[0005] At present, deep learning methods achieve higher prediction accuracy than traditional machine learning methods, but neither type of method can perform functional annotation or identify the functional domains through which proteins bind DNA.




Embodiment Construction

[0019] The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

[0020] The invention provides a deep learning model based on a self-attention mechanism. From the primary sequence of a protein, the model predicts whether the protein can bind DNA and locates the regions of the sequence responsible for binding.
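The text above does not say how the binding region is read off the model. As a minimal sketch, assume the self-attention layer yields one non-negative score per residue and that a binding region shows up as a contiguous stretch of high scores. The function name `top_binding_window`, the fixed window length, and the toy scores are all hypothetical and are not taken from the patent.

```python
def top_binding_window(residue_scores, window):
    """Return (start, end, mean_score) for the contiguous window of the given
    length with the highest mean score. `end` is exclusive.

    Assumption: residue_scores holds one attention-derived score per residue.
    """
    n = len(residue_scores)
    if window <= 0 or window > n:
        raise ValueError("window must be between 1 and the sequence length")

    # Sliding-window sum: update the running total instead of re-summing.
    current = sum(residue_scores[:window])
    best_sum, best_start = current, 0
    for start in range(1, n - window + 1):
        current += residue_scores[start + window - 1] - residue_scores[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start
    return best_start, best_start + window, best_sum / window


# Toy per-residue scores (hypothetical values, not model output).
scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1]
start, end, mean_score = top_binding_window(scores, window=3)
print(f"candidate region: residues {start}..{end - 1}, mean score {mean_score:.2f}")
```

A fixed window is the simplest choice; thresholding the scores and merging adjacent residues would be an alternative when the region length is unknown.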

[0021] As shown in Figure 1, the deep learning method of the present invention for DNA-binding protein identification and functional annotation based on a self-attention mechanism includes the following steps:

[0022] First, the selection of data sets.

[0023] The data sets used in the present invention include large-scale balanced data sets (Table 1), large-scale unbalanced data sets (Table 2), small-scale balanced data sets ...


Abstract

The invention discloses a deep learning method for DNA-binding protein identification and functional annotation based on a self-attention mechanism. A data set is selected from a protein database, the constructed deep learning model is trained and tested on the selected data set, and the trained model is then used to predict whether a protein can bind to DNA. The deep learning model includes a coding layer, an embedding layer, a long short-term memory neural network layer (LSTM), a convolutional neural network layer (CNN) and a self-attention layer (Self-Attention). Using this model, the present invention can predict from the primary sequence of a protein whether the protein can bind DNA, and can locate the regions responsible for binding.
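Two of the layers named in the abstract can be illustrated without a deep learning framework. The sketch below shows a coding step that maps residues to integer indices, followed by plain scaled dot-product self-attention. It rests on several assumptions: the alphabet ordering, the use of index 0 for padding and unknown residues, and the omission of learned query, key and value projections are illustrative choices, and the embedding, LSTM and CNN layers are left out. It is not the patented model.

```python
import math

# The 20 standard amino acids. Index 0 is reserved for padding and unknown
# residues (an assumption for this sketch, not stated in the source).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode_sequence(seq, max_len):
    """Coding step: map residues to integers, then truncate or zero-pad to max_len."""
    indices = [AA_TO_INDEX.get(aa, 0) for aa in seq.upper()[:max_len]]
    return indices + [0] * (max_len - len(indices))


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with Q = K = V = features.

    Returns (output, weights); weights[i][j] is how much position i attends
    to position j. Learned projections are omitted for brevity.
    """
    dim = len(features[0])
    weights = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights.append(softmax(scores))
    output = [
        [sum(w * value[j] for w, value in zip(row, features)) for j in range(dim)]
        for row in weights
    ]
    return output, weights


# Usage with a toy sequence and toy per-position feature vectors.
encoded = encode_sequence("MKRA", max_len=6)
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention_weights = self_attention(toy_features)
print(encoded)
print(attention_weights)
```

In a full model the feature vectors would come from the preceding embedding, LSTM and CNN layers rather than being written by hand.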

Description

Technical field

[0001] The invention relates to the technical field of deep learning, in particular to a deep learning method for DNA-binding protein identification and functional annotation based on a self-attention mechanism.

Background

[0002] DNA-binding proteins (DBPs) are proteins that contain functional domains that can bind to DNA, and they are widely present in various organisms. They play a vital role in biological regulation mechanisms, including DNA replication, DNA transcription, DNA repair, DNA recombination and other biological functions. Effectively determining the functional regions through which proteins bind DNA is therefore crucial both for protein functional analysis and for the development of new drugs. A protein sequence that can bind DNA contains one or more regions, each composed of several amino acid residues, that interact with DNA. Such a region is generally called a DNA-binding domain. ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B20/30, G16B15/20, G16B40/00, G06K9/62, G06N3/04, G06N3/08
CPC: G16B20/30, G16B15/20, G16B40/00, G06N3/049, G06N3/084, G06N3/045, G06N3/048, G06F18/24
Inventor: 宫秀军, 杨超莹
Owner: TIANJIN UNIV