Protein structure prediction method and device based on multi-task time domain convolutional neural network

A convolutional neural network and protein structure technology, applied to protein structure prediction based on a multi-task time-domain convolutional neural network. It addresses the low accuracy and poor robustness of existing methods, improving the fitting degree and generalization ability while reducing model complexity.

Active Publication Date: 2021-01-29
WUHAN GENECREATE BIOLOGICAL ENG CO LTD

AI Technical Summary

Problems solved by technology

[0006] To address the low accuracy and poor robustness of existing protein structure prediction, the first aspect of the present invention provides a protein structure prediction method based on a multi-task time-domain convolutional neural network, comprising the following steps: obtaining a target gene sequence and a protein database; establishing, according to the genetic code table and the protein database, a DNA-RNA-amino acid triple sequence data set corresponding to each protein; establishing a multiple regression equation according to the residue depth and physicochemical properties of the amino acids that make up each protein in the protein database, to obtain the statistical depth features of each protein; clustering the triple sequence data set by gene homology information and evolution rate and mapping it into a multidimensional feature vector; using the multidimensional feature vector and the statistical depth features of the protein as the input of the multi-task time-domain convolutional neural network, and training the network until its output error falls below a threshold and stabilizes, to obtain the trained network; inputting the target gene sequence into the trained network to obtain the target amino acid sequence and the statistical depth features of its corresponding protein; and predicting the protein structure from the amino acid sequence and the statistical depth features of its corresponding protein, using existing protein morphological features and the rolling-ball method.
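The stopping rule described above (train until the output error is below a threshold and has stabilized) could be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `should_stop`, `train_until_stable`, `window` and `tolerance` are hypothetical, and "stable" is assumed here to mean that the error varies by less than `tolerance` over the last `window` epochs.

```python
def should_stop(errors, threshold, window=5, tolerance=1e-3):
    """Return True when the last `window` errors are all below `threshold`
    and differ from each other by less than `tolerance` (assumed meaning
    of "tends to be stable")."""
    if len(errors) < window:
        return False
    recent = errors[-window:]
    below_threshold = max(recent) < threshold
    stable = max(recent) - min(recent) < tolerance
    return below_threshold and stable


def train_until_stable(step_fn, threshold, max_epochs=1000,
                       window=5, tolerance=1e-3):
    """Call `step_fn(epoch)` (hypothetical: runs one training epoch and
    returns the output error) until the stopping rule fires or
    `max_epochs` is reached. Returns the list of recorded errors."""
    errors = []
    for epoch in range(max_epochs):
        errors.append(step_fn(epoch))
        if should_stop(errors, threshold, window, tolerance):
            break
    return errors
```

In practice `step_fn` would wrap one pass of network training; checking a window of epochs rather than a single value avoids stopping on a momentary dip in the error.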



Examples


Embodiment Construction

[0028] The principles and features of the present invention are described below in conjunction with the accompanying drawings, and the examples given are only used to explain the present invention, and are not intended to limit the scope of the present invention.

[0029] Referring to Figure 1 to Figure 3, the first aspect of the present invention provides a protein structure prediction method based on a multi-task time-domain convolutional neural network, comprising the following steps: S101. Obtaining the target gene sequence and a protein database; S102. Establishing, according to the genetic code table and the protein database, a DNA-RNA-amino acid triple sequence data set corresponding to each protein, and establishing a multiple regression equation according to the residue depth and physicochemical properties of the amino acids that make up each protein in the protein database, to obtain the statistical depth features of each protein; S103. Clustering the triple...
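The first part of step S102, building a DNA-RNA-amino acid triple from the genetic code table, could be sketched as below. The sketch assumes the standard genetic code, an input that is the coding strand, and a reading frame starting at the first base; the function names (`transcribe`, `translate`, `build_triple`) are hypothetical and are not taken from the patent.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Map each RNA codon (U instead of T) to a one-letter amino acid; "*" is stop.
CODON_TABLE = {
    (b1 + b2 + b3).replace("T", "U"): AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_dna):
    """Coding-strand DNA to mRNA: replace T with U."""
    return coding_dna.upper().replace("T", "U")


def translate(rna):
    """Translate codon by codon from position 0, stopping at a stop codon.
    Trailing bases that do not fill a codon are ignored; any character
    outside A, C, G, U raises KeyError."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def build_triple(coding_dna):
    """Return the (DNA, RNA, amino acid) triple for one sequence."""
    rna = transcribe(coding_dna)
    return coding_dna.upper(), rna, translate(rna)
```

For example, `build_triple("ATGGCCTAA")` returns `("ATGGCCTAA", "AUGGCCUAA", "MA")`. A data set in the sense of S102 would hold one such triple per protein in the database.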


Abstract

The invention relates to a protein structure prediction method and device based on a multi-task time-domain convolutional neural network. The method comprises the steps of: obtaining a target gene sequence and a protein database; establishing a DNA-RNA-amino acid ternary sequence data set corresponding to each protein according to the genetic code table and the protein database; establishing a multiple regression equation according to the residue depth and physicochemical properties of the amino acids in the protein database to obtain the statistical depth features of each protein; clustering the ternary sequence data set and mapping it into a multidimensional feature vector; taking the multidimensional feature vector and the statistical depth features of the protein as the input of a multi-task time-domain convolutional neural network, and training the network; and predicting the protein structure by using the statistical depth features of the protein. By combining the statistical depth features of the protein with the multi-task time-domain convolutional neural network, the invention reduces the complexity of the model and improves its generalization and fitting degree.
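The multiple regression step could be sketched as an ordinary least-squares fit. The text does not say which variable is the response or how the equation is fitted, so this sketch assumes residue depth is regressed on physicochemical properties and solves the normal equations directly; `fit_multiple_regression` and `predict` are hypothetical names.

```python
def fit_multiple_regression(X, y):
    """Ordinary least squares via the normal equations.
    X: list of feature rows (assumed: physicochemical properties per residue).
    y: list of responses (assumed: residue depth).
    Returns [intercept, b1, ..., bp]."""
    rows = [[1.0] + [float(v) for v in x] for x in X]  # intercept column
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def predict(beta, x):
    """Evaluate the fitted regression equation for one feature row."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


# Synthetic illustration only (not measured data): two made-up features
# and a response generated exactly as depth = 1 + 2*f1 - 0.5*f2.
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 3)]
depths = [1 + 2 * f1 - 0.5 * f2 for f1, f2 in features]
beta = fit_multiple_regression(features, depths)
```

With real data the fitted coefficients, or the predictions derived from them, would play the role of the per-protein statistical depth features; which of the two the patent intends is not stated in this excerpt.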

Description

Technical field

[0001] The invention relates to the fields of bioinformatics and deep learning, in particular to a protein structure prediction method and device based on a multi-task time-domain convolutional neural network.

Background art

[0002] It is currently accepted in biology that the biological function of a protein is determined by its three-dimensional structure, that the three-dimensional structure of a protein is determined by its primary structure, and that proteins with similar functions are also similar in structure.

[0003] Studies have found that although the primary structure of proteins varies enormously, that is, the amino acids in a polypeptide chain can be arranged and combined in many ways, the types of secondary structure are limited, mainly the α-helix, β-sheet, β-turn and random coil, where the two protein secondary structures, the α-helix and the β-sheet, only depend on the backbone o...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B15/00, G16B40/00, G06N3/04
CPC: G16B15/00, G16B40/00, G06N3/045
Inventor: 华权高, 赵海义, 舒芹
Owner: WUHAN GENECREATE BIOLOGICAL ENG CO LTD