Protein lysine malonylation site prediction method based on deep learning

A lysine malonylation site prediction technology, applied in the field of biological information. It addresses problems such as reduced prediction performance and the influence of unused features on prediction results, and it improves the evaluation indexes and promotes application of the method.

Pending Publication Date: 2020-04-28
QINGDAO UNIV OF SCI & TECH
2 Cites, 15 Cited by

AI Technical Summary

Problems solved by technology

[0005] i) Existing methods use only a limited set of features, although other potential features also affect the prediction results.



Examples


Embodiment

[0053] A method for predicting protein lysine malonylation sites based on deep learning, as shown in Figure 1, comprising the following steps:

[0054] 1) Data collection: Collect experimentally validated lysine malonylation site data from protein databases and related literature.

[0055] The experimentally verified lysine malonylation data set used in the present invention comes mainly from the literature (Zhang YJ, Xie RP, Wang JW, et al. Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework. Brief Bioinform 2018:1-15). This data set includes 1746 Kmal sites from 595 E. coli proteins, 3435 Kmal sites from 1174 M. musculus proteins and 4579 Kmal sites from 1660 H. sapiens proteins.

[0056] After random selection, the final E. coli training set contains 1453 positive samples and 1453 negative samples, the M. musculus training set contains 2606 positive samples and 2606 negative samples, and H. sapiens con...
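The balanced training sets described above (equal numbers of positive and negative samples) can be illustrated with a minimal sketch. It assumes the balance is obtained by randomly drawing as many negatives as there are positives; the source only says "random selection", so the exact procedure, the function name `balance_by_undersampling` and the toy identifiers are hypothetical.

```python
import random


def balance_by_undersampling(positives, negatives, seed=0):
    """Randomly draw as many negatives as there are positives.

    Assumption: the balanced set is built by undersampling negatives;
    the source text does not spell out the procedure.
    """
    if len(negatives) < len(positives):
        raise ValueError("need at least as many negatives as positives")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sampled = rng.sample(negatives, len(positives))  # without replacement
    return list(positives), sampled


# Toy placeholders, not real malonylation sites.
pos = [f"pos_{i}" for i in range(5)]
neg = [f"neg_{i}" for i in range(20)]
bal_pos, bal_neg = balance_by_undersampling(pos, neg)
print(len(bal_pos), len(bal_neg))  # 5 5
```

Sampling without replacement keeps every negative sample distinct, and the fixed seed makes the split repeatable between runs.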



Abstract

The invention discloses a protein lysine malonylation site prediction method based on deep learning and relates to the technical field of biological information. The method comprises the following steps: converting the character information of a protein sequence into numerical vectors using five feature extraction algorithms (enhanced amino acid composition, enhanced grouped amino acid composition, dipeptide deviation from expected mean, K-nearest-neighbor score and the BLOSUM62 matrix); fusing the vectors into a feature space, so that the influence of each potential feature on the prediction result is fully considered; computing malonylation-site-specific features with a linear convolutional neural network; selecting relevant features and reducing the feature dimension through a max-pooling layer; classifying malonylation and non-malonylation sites with a multi-layer deep neural network, thereby constructing the protein malonylation site prediction model DeepMal; and evaluating prediction performance using 10-fold cross-validation and an independent test data set. The DeepMal model shows markedly improved evaluation indexes and helps promote the application of deep learning in protein function prediction.
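As an illustration of how one of these encodings turns sequence characters into numbers, the following is a minimal sketch of enhanced amino acid composition as it is commonly defined: the frequency of each of the 20 standard amino acids inside a sliding window along the peptide. The window size of 5, the function name `eaac` and the example peptide are assumptions and are not taken from the patent text.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def eaac(peptide, window=5):
    """Sliding-window amino acid frequencies (illustrative sketch).

    For every window position, emit 20 values: the count of each
    residue in the window divided by the window size. Characters
    outside the 20-letter alphabet contribute to no feature.
    """
    features = []
    for start in range(len(peptide) - window + 1):
        segment = peptide[start:start + window]
        for aa in AMINO_ACIDS:
            features.append(segment.count(aa) / window)
    return features


# Hypothetical 7-residue peptide: 3 windows x 20 residues = 60 features.
vec = eaac("AKKLMKA", window=5)
print(len(vec))  # 60
```

Vectors from the other encodings could then be concatenated with this one to form the fused feature space that the abstract describes.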

Description

Technical field

[0001] The invention relates to the field of biological information technology, in particular to a method for predicting protein lysine malonylation sites based on deep learning.

Background technique

[0002] Malonylation, first discovered in 2011, is an important and evolutionarily conserved type of protein post-translational modification that occurs on lysine. It depends on malonyl-CoA, which adds a malonyl group to lysine and changes its charge from +1 to -1. This change has the potential to disrupt the electrostatic interactions of lysine with other amino acids and alter the protein structure, and may even affect its binding to target proteins. Malonylation has been shown to be involved in a variety of metabolic pathways, such as glucose and fatty acid metabolism and fatty acid synthesis and oxidation, and in impaired mitochondrial function; it is also related to muscle contraction, myocardial ischemia and hypothalamic regulation of...


Application Information

IPC(8): G16B15/00, G16B20/00
CPC: G16B15/00, G16B20/00
Inventor: 于彬, 崔晓文, 王明辉, 王磊
Owner: QINGDAO UNIV OF SCI & TECH