DNA binding protein recognition method based on deep sparse representation network

A DNA-binding protein and sparse representation technology, applied in the field of DNA-binding protein recognition, that addresses problems such as poor final results, errors that affect training, and error accumulation, and that improves prediction accuracy and generalization ability while keeping errors small.

Pending Publication Date: 2022-08-05
SUZHOU UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0004] The training objectives of the individual steps of this method are inconsistent with one another and deviate from the overall objective, so the trained model struggles to reach the optimal result. Each step also introduces its own error, and the error of one step affects the training of the next, so errors accumulate and degrade the final result.



Examples


Detailed description of the embodiments

[0014] Exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be understood that the embodiments shown and described in the accompanying drawings are exemplary only, and are intended to illustrate the principles and spirit of the present invention, and not to limit the scope of the present invention.

[0015] The present invention provides a DNA-binding protein identification method based on a deep sparse representation network, as shown in the figures, comprising the following steps:

[0016] In step S1, a DNA-binding protein sequence data set is obtained, and the DNA-binding protein sequence data set is divided into a training set and a test set.

[0017] In this example, the DNA-binding protein sequence datasets are the PDB186, PDB1075, PDB2272 and PDB14189 datasets downloaded from the Protein Data Bank. PDB186 contains 93 DNA-binding proteins and 93 non-DNA-binding proteins;...
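Step S1 can be illustrated with a minimal Python sketch. The text above does not state how the data are divided, so the per-class (stratified) random split, the `test_fraction` and `seed` parameters, the function name `stratified_split`, and the placeholder sequences are all assumptions made for illustration, not the procedure of the invention.

```python
import random


def stratified_split(sequences, labels, test_fraction=0.2, seed=0):
    """Split (sequence, label) pairs into train and test sets, class by class.

    Hypothetical helper: the split ratio and the stratification are
    assumptions, not details taken from the patent text.
    """
    if len(sequences) != len(labels):
        raise ValueError("sequences and labels must have the same length")

    rng = random.Random(seed)

    # Group sequences by class label (1 = DNA-binding, 0 = non-binding).
    by_class = {}
    for seq, label in zip(sequences, labels):
        by_class.setdefault(label, []).append(seq)

    train, test = [], []
    for label in sorted(by_class):
        seqs = list(by_class[label])
        rng.shuffle(seqs)
        n_test = round(len(seqs) * test_fraction)
        test.extend((s, label) for s in seqs[:n_test])
        train.extend((s, label) for s in seqs[n_test:])

    # Shuffle again so the classes are interleaved within each set.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Placeholder identifiers standing in for real protein sequences.
sequences = [f"SEQ{i}" for i in range(10)]
labels = [1] * 5 + [0] * 5
train, test = stratified_split(sequences, labels, test_fraction=0.2)
```

Splitting within each class keeps the ratio of binding to non-binding proteins the same in both sets, which matters for a balanced set such as PDB186.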


Abstract

A DNA-binding protein recognition method based on a deep sparse representation network comprises the following steps: acquiring a DNA-binding protein sequence data set and dividing it into a training set and a test set; computing the position-specific scoring matrix (PSSM) of every sequence in the data set with PSI-BLAST; padding or cropping each PSSM into a new PSSM of the same size; and constructing and training a DNA-binding protein recognition classifier model with a deep sparse representation network, then inputting the new PSSMs into the model to recognize DNA-binding protein sequences. The constructed and trained classifier model is an end-to-end network, which markedly improves prediction accuracy and keeps errors small. A convolutional autoencoder robustly learns the latent features of the PSSM, and classification through a sparse representation layer improves the generalization ability of the model.
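The padding-or-cropping step can be sketched as follows. This is a minimal illustration that assumes each PSSM is an array of shape (sequence length, 20), that padding uses zeros appended at the end, and that cropping keeps the leading rows; the abstract does not specify the target size, the padding value, or which end is cut, and the name `pad_or_crop_pssm` is hypothetical.

```python
import numpy as np


def pad_or_crop_pssm(pssm, target_len, pad_value=0.0):
    """Return a (target_len, n_columns) copy of a PSSM.

    Assumptions (not from the patent text): rows beyond target_len are
    dropped from the end, and short matrices are padded at the end
    with pad_value.
    """
    pssm = np.asarray(pssm, dtype=np.float32)
    if pssm.ndim != 2:
        raise ValueError("expected a 2-D matrix of shape (length, columns)")

    length, width = pssm.shape
    if length >= target_len:
        # Crop: keep the first target_len residues.
        return pssm[:target_len].copy()

    # Pad: copy the matrix into a constant-filled array of the target size.
    out = np.full((target_len, width), pad_value, dtype=np.float32)
    out[:length] = pssm
    return out


# Toy matrices standing in for PSI-BLAST output (20 amino-acid columns).
short = np.ones((3, 20))
long_ = np.arange(8 * 20).reshape(8, 20)

padded = pad_or_crop_pssm(short, target_len=5)
cropped = pad_or_crop_pssm(long_, target_len=5)
```

Bringing every matrix to one shape lets the PSSMs be stacked into a single batch, which a convolutional autoencoder with a fixed input size requires.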

Description

Technical field

[0001] The invention relates to a DNA-binding protein identification method based on a deep sparse representation network.

Background

[0002] DNA-binding proteins are specialized proteins capable of binding to and interacting with DNA. They are involved in many biological processes, such as the identification of specific nucleotides, the regulation of transcription, and the regulation of gene expression. DNA-binding proteins are also important components of anticancer drugs, antibiotics and steroids, and play an important role in anticancer drug research and the treatment of genetic diseases. Early DNA-binding protein identification methods were generally biological experiments, such as filter binding, genetic analysis, chromatin immunoprecipitation and X-ray crystallography. Such experimental methods are time-consuming and labor-intensive, and cannot meet the needs of large-scale protein sequence detection.

[0003] C...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/00
CPC: G16B20/00
Inventor: 钱昱磬, 丁漪杰, 吴宏杰
Owner: SUZHOU UNIV OF SCI & TECH