Prediction method for protein post-translational modification methylation loci

A method for predicting methylation sites in protein post-translational modification, applied in the field of bioinformatics. It addresses the narrow feature sets, overly simple models, and lack of detailed classification in earlier approaches, and aims at high-throughput, high-accuracy prediction.

Inactive Publication Date: 2016-08-24
NANCHANG UNIV
Cites: 1; Cited by: 18

AI Technical Summary

Problems solved by technology

Existing prediction models have several shortcomings: they were built from relatively few training samples, they used a narrow set of features for encoding, and they were too simple, with no detailed classification.
As modern technology develops rapidly, more and more methylation sites are being identified, and existing models and methods cannot meet the demand for multi-type, high-precision prediction.



Examples


Embodiment 1

[0036] Protein methylation data are collected from protein databases such as UniProt and PhosphoSite. Positive samples are experimentally verified, annotated methylation sites; negative samples are arginine (R) and lysine (K) residues that are randomly selected from the same proteins as the positive samples and are not annotated as methylated. The collected protein sequences are filtered with the cd-hit tool at a 30% homology threshold to remove redundancy, and then uniformly cut into windows of length 19, each centered on an R or K residue with 9 amino acids upstream and 9 amino acids downstream. After this cutting and preprocessing, the sequence information, evolutionary information and physicochemical properties of the positive and negative sample sequences are encoded according to the following steps:
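The window cutting described in [0036] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `extract_windows` is hypothetical, and padding the sequence ends with "X" is an assumption, since the text does not say how sites closer than 9 residues to a terminus are handled.

```python
def extract_windows(sequence, flank=9, targets=("R", "K"), pad="X"):
    """Return (position, residue, window) for every target residue.

    Positions are 1-based. Each window has length 2 * flank + 1 (19 when
    flank is 9) and is centered on the target residue.
    ASSUMPTION: windows that run past either terminus are padded with `pad`;
    the source text does not specify this case.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue in targets:
            # In padded coordinates the residue sits at index i + flank,
            # so the slice [i, i + 2 * flank + 1) is centered on it.
            windows.append((i + 1, residue, padded[i:i + 2 * flank + 1]))
    return windows


if __name__ == "__main__":
    # First residues of the B4DEH8 sequence quoted in Embodiment 2.
    for pos, res, win in extract_windows("MEEEAEKLKELQNEVEKQMNMSPPPGNAGPV"):
        print(pos, res, win)
```

Labeling each window as positive or negative would then depend on the database annotation for the central residue, which is outside this sketch.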

[0037] (1) Sequence information encoding of the sample sequence: the sequence information includes amino acid occurrence frequency, binary encoding and k-spaced amino acid pairs. Convert each amino acid in the sequence into a 20-dimen...
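The three sequence encodings named in [0037] might look like the sketch below. The paragraph is truncated in the source, so every detail here is an assumption: the alphabetical ordering of the 20 amino acids, the use of raw pair counts, the treatment of non-standard residues as all-zero vectors, and the function names are illustrative only.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 residues


def frequency_encoding(window):
    """20 values: the fraction of the window occupied by each amino acid."""
    n = len(window)
    return [window.count(a) / n for a in AMINO_ACIDS]


def binary_encoding(window):
    """One 20-dimensional 0/1 vector per residue, concatenated.

    ASSUMPTION: residues outside the 20 standard amino acids (for example a
    padding character) become an all-zero vector.
    """
    vec = []
    for res in window:
        vec.extend(1 if res == a else 0 for a in AMINO_ACIDS)
    return vec


def k_spaced_pairs(window, k):
    """400 counts of residue pairs separated by exactly k other residues."""
    counts = {a + b: 0 for a in AMINO_ACIDS for b in AMINO_ACIDS}
    for i in range(len(window) - k - 1):
        pair = window[i] + window[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    return [counts[p] for p in sorted(counts)]


if __name__ == "__main__":
    w = "MEEEAEKLKRLQNEVEKQM"  # a made-up 19-residue window
    print(len(frequency_encoding(w)), len(binary_encoding(w)), len(k_spaced_pairs(w, 1)))
```

For a 19-residue window this gives 20, 380 and 400 values respectively; which values of k are used is not recoverable from the truncated text.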

Embodiment 2

[0057] To make protein methylation site prediction easier to apply, an online prediction platform (http://bioinfo.ncu.edu.cn/PSSMe.aspx) was developed based on PSSMe, using a combination of MATLAB and C# programming. To predict the possible methylation sites of a protein, enter its UniProt protein name or its FASTA-format sequence in the designated area of the website. For example, to predict the methylation sites of the protein named "B4DEH8", enter "B4DEH8" in the protein name field and click the "Load" button; the PSSMe tool automatically downloads the protein sequence from the UniProt database and imports it into the designated area. The B4DEH8 protein sequence information is as follows:

[0058] >tr|B4DEH8|B4DEH8_HUMAN

[0059] MEEEAEKLKELQNEVEKQMNMSPPPGNAGPVIMSIEEKMEADARSIYVGNVDYGATAEELEAHFHGCGSVNRVTILCDKFSGHPKGFAYIEFSDKESVRTSLALDESLFRGRQIKVIP...
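To illustrate the input side of this workflow, the sketch below reads a FASTA record with a UniProt-style header such as the one above and lists the R and K residues that a predictor would need to score. It does not reproduce the PSSMe platform, which the text says was written in MATLAB and C#; the function names are hypothetical, and the example uses only the first residues of the sequence because the source truncates it.

```python
def parse_fasta_record(text):
    """Return (accession, sequence) from a single FASTA record.

    For headers of the form '>db|ACCESSION|ENTRY_NAME' the accession is the
    middle field; otherwise the first header token is returned unchanged.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    tokens = lines[0][1:].split()
    fields = tokens[0].split("|") if tokens else [""]
    accession = fields[1] if len(fields) >= 3 else fields[0]
    sequence = "".join(lines[1:]).upper()
    return accession, sequence


def candidate_sites(sequence, targets=("R", "K")):
    """1-based positions of residues that could carry a methylation site."""
    return [(i + 1, res) for i, res in enumerate(sequence) if res in targets]


if __name__ == "__main__":
    record = ">tr|B4DEH8|B4DEH8_HUMAN\nMEEEAEKLKELQNEVEKQMNMSPPPGNAGPV\n"
    accession, seq = parse_fasta_record(record)
    print(accession, candidate_sites(seq))
```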


Abstract

The invention discloses a method for predicting methylation sites in protein post-translational modification, and belongs to the field of bioinformatics. Protein methylation participates in cell functions and in many life activities of the cell, so identifying protein methylation sites is very important for understanding those activities. The method combines sequence information, evolutionary information and physicochemical properties to encode features of protein methylation sequences, optimizes the features by information gain, and builds a prediction model with a support vector machine. Independent test results show that the method predicts protein methylation sites well. A web prediction platform has also been developed for online prediction of protein methylation sites.
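The information gain step mentioned in the abstract can be illustrated with a small ranking routine. This is a sketch under stated assumptions rather than the patented procedure: features are treated as discrete values (continuous features such as evolutionary scores would first need binning, which the excerpt does not describe), the function names are hypothetical, and the support vector machine that would be trained on the top-ranked features is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (gain, feature_index) pairs sorted from most to least informative."""
    n_features = len(rows[0])
    gains = [(information_gain([r[j] for r in rows], labels), j) for j in range(n_features)]
    return sorted(gains, key=lambda t: (-t[0], t[1]))


if __name__ == "__main__":
    # Toy data: feature 0 determines the label, feature 1 is uninformative.
    rows = [[1, 0], [1, 1], [0, 0], [0, 1]]
    labels = [1, 1, 0, 0]
    print(rank_features(rows, labels))
```

Keeping only the highest-ranked columns before training is one common way to use such a ranking; the cutoff used in the patent is not stated in this excerpt.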

Description

Technical field

[0001] The invention belongs to the field of bioinformatics, and in particular relates to a method for predicting methylation sites of protein post-translational modifications.

Background art

[0002] Protein post-translational modifications (PTMs) play an important role in the regulatory mechanisms of cells and affect various properties of proteins, including folding, activity and biological function. In-depth study of PTMs is therefore important for understanding the pathogenesis of human diseases. Protein methylation is one of the most common post-translational modifications. Catalyzed by methyltransferases, a methyl group is transferred from S-adenosylmethionine to the corresponding protein. Protein methylation not only plays an important role in the genetic modification of eukaryotic chromatin, but also plays a very important role in cell differentiation, development, gene expression, ge...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/18
CPC: G16B20/00
Inventor: 邱建丁, 温平平, 施绍萍, 梁汝萍
Owner: NANCHANG UNIV