Prediction method for signal peptide and cleavage site thereof on the basis of layered mixture model

A signal peptide and cleavage site prediction technology based on a layered mixture model, applied in informatics, sequence analysis, and bioinformatics. It addresses the high false-positive rate of signal peptide prediction, which arises because classifiers have difficulty distinguishing signal peptides from transmembrane helices. The effects are fewer false positives, improved prediction performance, and improved sensitivity.

Active Publication Date: 2017-07-14
SHANGHAI JIAO TONG UNIV

AI Technical Summary

Problems solved by technology

It is difficult for a classifier to correctly distinguish signal peptides from transmembrane helices based on amino acid residue features alone, so the false-positive rate of signal peptide prediction is too high.


Examples

Embodiment

[0043] There is an input sequence with the following data:

[0044] >QuerySequence|SIGNAL 1 25

[0045] MIKSNRITACALAALFAGASFSASAWWGGPGYGNGLWDNMGDMFGDGYGDFNMSM

[0046] GGGGRGYGRGYGRGNGYGYGAPYGYGAPYGYGAPYGYGAPYGYGAPYGAMPYGA

[0047] MPPQMPAAPAQPQAAPSR
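The input record above can be read programmatically. The following is a minimal sketch, assuming the header follows the pattern `>name|SIGNAL start end` with 1-based, inclusive coordinates. That reading of the header is an assumption drawn from this single example, and the function name `parse_record` is hypothetical, not part of the patented software.

```python
# Sketch only: parse a FASTA-like record whose header is assumed to be
# ">name|SIGNAL start end" (1-based, inclusive). Hypothetical helper.

RECORD = """\
>QuerySequence|SIGNAL 1 25
MIKSNRITACALAALFAGASFSASAWWGGPGYGNGLWDNMGDMFGDGYGDFNMSM
GGGGRGYGRGYGRGNGYGYGAPYGYGAPYGYGAPYGYGAPYGYGAPYGAMPYGA
MPPQMPAAPAQPQAAPSR
"""


def parse_record(text):
    """Return (name, annotated_span_or_None, sequence) for one record."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0]
    if not header.startswith(">"):
        raise ValueError("record must start with a '>' header line")
    # Split "QuerySequence|SIGNAL 1 25" into the name and the annotation.
    name, _, annotation = header[1:].partition("|")
    fields = annotation.split()
    span = None
    if len(fields) == 3 and fields[0] == "SIGNAL":
        span = (int(fields[1]), int(fields[2]))
    # Sequence lines are concatenated without separators.
    sequence = "".join(lines[1:])
    return name, span, sequence


name, span, sequence = parse_record(RECORD)
start, end = span
# Convert the 1-based inclusive span to a Python slice.
signal_peptide = sequence[start - 1:end]
print(name, span, signal_peptide)
```

The slice `sequence[start - 1:end]` gives the annotated signal peptide, which can then be compared against the predicted range.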

[0048] This is the sequence to be tested. The software output obtained with the method of the present invention is shown in Figure 2:

[0049] According to Signal-3L 2.0 engine, the predicted signal peptide is: 1-25

[0050] MIKSNRITACALAALFAGASFSASAWWGGPGYGNGLWDNMGDMFGDGYGDFNMSM

[0051] GGGGR GYGRGYGRGN

[0052] The potential cleavage sites and the credit scores

[0053] It can be seen from the results that the method accurately and intuitively predicts the signal peptide and its cleavage site.


Abstract

The invention discloses a method for predicting a signal peptide and its cleavage site on the basis of a layered mixture model. The method comprises the following steps: first, in a first layer, an SVM (Support Vector Machine) classifier based on amino acid residue features is applied to identify whether a protein sequence contains an N-terminal hydrophobic fragment; then, in a second layer, Naive Bayes and SVM classifiers based on amino acid residue features and functional domain features are applied to identify whether the hydrophobic fragment is a signal peptide or an N-terminal transmembrane helix; finally, in a third layer, candidate cleavage sites are screened according to statistical learning rules and a statistical credit score is calculated, a similarity score for the signal peptide sequence is then calculated with the Needleman-Wunsch sequence alignment algorithm, and the predicted signal peptide cleavage site is determined by integrating the statistical credit score and the sequence similarity score.
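The sequence similarity step in the third layer relies on Needleman-Wunsch global alignment. The following is a minimal sketch of the alignment score only, assuming a simple match/mismatch/linear-gap scheme. The scoring values and the function name `needleman_wunsch_score` are illustrative assumptions; the substitution scores and gap penalties actually used by the method are not stated in this text.

```python
# Sketch only: Needleman-Wunsch global alignment score with a simple
# match/mismatch/linear-gap scheme. The scoring values are assumptions.

def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


# Example: compare a query N-terminal segment with a hypothetical
# reference signal peptide (the reference string is made up).
query = "MIKSNRITACALAALFAGASFSASA"
reference = "MKKTAIAIAVALAGFATVAQA"
print(needleman_wunsch_score(query, reference))
```

Keeping only two rows of the dynamic programming table is enough when just the score is needed; recovering the alignment itself would require the full table or a traceback structure.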

Description

Technical field

[0001] The present invention relates to a method for predicting a signal peptide and its cleavage site based on a layered mixture model. The method uses a known protein sequence to predict whether the protein contains an N-terminal signal peptide and to predict its cleavage site. In particular, it is an algorithm that fuses amino acid residue features and functional domain features, combines statistical confidence scores and sequence similarity scores, and predicts signal peptides and their cleavage sites hierarchically from top to bottom.

Background technique

[0002] In 1979, G. Blobel and D. Sabatini proposed the signal hypothesis for the first time based on experimental observations. They held that an amino acid fragment at the N-terminus of a secreted protein sequence acts as a signal guide, directing the transfer of proteins across membranes and transporting proteins to their destinations. They called this segment of amino acid that guides the s...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/22, G06F19/24
CPC: G16B30/00, G16B40/00
Inventor: 沈红斌, 张以泽
Owner: SHANGHAI JIAO TONG UNIV