Protein coding method and protein post-translational modification site prediction method and system

The technology relates to protein coding and the prediction of protein post-translational modification sites. It addresses two shortcomings of prior methods: they cannot make predictions from multi-feature data, and they cannot predict protein post-translational modifications with high precision.

Active Publication Date: 2019-07-19
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] The invention solves the problem that prior-art methods for predicting protein post-translational modification sites cannot make predictions from multi-feature data and cannot predict protein post-translational modifications in different species with high precision.

Examples

Embodiment 1

[0083] Taking protein lysine succinylation as an example, we used the method of the present invention to construct a prediction model named HybridSucc; its flow chart is shown in Figure 1. The specific steps are:

[0084] 1. We collected and integrated 21,770 succinylation sites of 7,415 proteins from the scientific literature and downloaded the primary sequences of these proteins from the UniProt database. Identified lysine succinylation sites were treated as positive data, while the remaining lysine sites in the same proteins were treated as negative data. The sites were classified by the species of the protein they belong to, giving 13 species including human, mouse, yeast, rice, rat, and E. coli. Each protein sequence was then cut into peptides of length 21 centered on the site, with 10 amino acids upstream and 10 amino acids downstream.
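The windowing and labeling in step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `lysine_windows` and the `*` padding used for sites closer than 10 residues to a terminus are assumptions, since the text does not say how sequence ends are handled.

```python
from typing import Iterable, List, Tuple

FLANK = 10   # residues kept on each side of the site (from the text)
PAD = "*"    # ASSUMPTION: placeholder for positions beyond the termini


def lysine_windows(
    sequence: str, positive_sites: Iterable[int]
) -> List[Tuple[int, str, int]]:
    """Return (1-based position, 21-residue peptide, label) for every lysine.

    Label 1 marks a reported succinylation site (positive data);
    label 0 marks any other lysine in the same protein (negative data).
    """
    positives = set(positive_sites)
    # Pad both ends so every window has the full length of 2 * FLANK + 1.
    padded = PAD * FLANK + sequence + PAD * FLANK
    windows = []
    for i, residue in enumerate(sequence):
        if residue != "K":
            continue
        # sequence[i] sits at padded[i + FLANK], so this slice is centered on it.
        peptide = padded[i : i + 2 * FLANK + 1]
        label = 1 if (i + 1) in positives else 0
        windows.append((i + 1, peptide, label))
    return windows
```

Calling `lysine_windows(seq, sites)` once per protein, with `sites` holding the 1-based positions of its reported succinylation sites, yields the positive and negative peptides that the later encoding step would consume.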

[0085] 2. Encode features from the protein sequences. Based on the data set, encode the positive and negative data ...

Abstract

The invention discloses a protein coding method and a protein post-translational modification site prediction method and system, belonging to the field of bioinformatics. The protein coding method includes collection of modification site information, position weight training, and coding of the peptide fragment to be encoded. The protein post-translational modification site prediction method includes collection of modification site information, feature coding, model training, and protein post-translational modification site prediction. The methods and system construct prediction models for the different types of numeric vector features of positive and negative sites, using a deep neural network and penalized logistic regression respectively, so as to obtain a plurality of prediction models. The prediction results of each model are then taken as new features, and penalized logistic regression is used to construct the final model. This approach can capture more protein information, which is conducive to improving prediction accuracy, and can quickly identify protein modification sites on a large scale.
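The final step described in the abstract, in which each base model's prediction becomes a new feature for a penalized logistic regression, can be sketched as below. This is a hedged illustration only: the L2 penalty, the plain gradient-descent fit, the hyperparameter values, and the names `fit_penalized_logreg` and `predict_proba` are all assumptions, because the source does not specify the penalty type or the training procedure. The base-model scores are taken as given inputs.

```python
import math
from typing import List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_penalized_logreg(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    l2: float = 0.01,     # ASSUMPTION: L2 penalty strength
    lr: float = 0.5,      # ASSUMPTION: learning rate
    epochs: int = 2000,   # ASSUMPTION: number of full-batch updates
) -> Model:
    """Minimize mean log-loss + (l2 / 2) * ||w||^2 by gradient descent.

    Each row of X holds the scores that the base models (one per feature
    encoding) assigned to one peptide; y holds the 0/1 site labels.
    The bias term is left unpenalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model: Model, x: Sequence[float]) -> float:
    """Final stacked score for one peptide given its base-model scores."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
```

To use it, collect the base models' scores into rows (one row per peptide, one column per base model), fit with `fit_penalized_logreg(rows, labels)`, and score new peptides with `predict_proba`. In practice the base-model scores fed to the final model would normally come from held-out predictions so that the final model is not fitted on scores the base models produced for their own training data; the source does not say how this is handled.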

Description

Technical field

[0001] The invention relates to the field of bioinformatics, and more specifically to a protein coding method and a method and system for predicting protein post-translational modification sites.

Background

[0002] Protein post-translational modification is one of the most important mechanisms in eukaryotes and prokaryotes and involves the attachment of chemical groups to the amino acid side chains of proteins. Various protein post-translational modifications (PTMs) play crucial roles in a variety of cellular processes that regulate protein function, physicochemical properties, conformation, stability, and molecular interactions in response to developmental signals or environmental stimuli. For example, protein phosphorylation, the most ubiquitous PTM, induces signal transduction and apoptosis; lysine succinylation plays a crucial role in metabolic pathways; protein acetylation and methylation Lysine ubi...

Application Information

IPC(8): G16B15/20
CPC: G16B15/20
Inventor: 薛宇, 宁万山, 许浩东, 邓万锟, 郭亚萍
Owner: HUAZHONG UNIV OF SCI & TECH