Optimizing evidence theory based K nearest-neighbor alpha-helix prediction method

An evidence-theory-based K-nearest-neighbor technique, applied in the fields of bioinformatics and pattern recognition, that addresses the structural complexity of membrane proteins and the inability of existing methods to predict certain sequences.

Inactive Publication Date: 2015-04-22
SHANGHAI JIAO TONG UNIV
Cites: 0; Cited by: 3

AI Technical Summary

Problems solved by technology

[0004] Most current methods can only predict alpha helices of regular length in an amino acid sequence. For example, the TOP-PRED method (Claros MG, von Heijne G. TopPred II: An improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.) can only predict alpha-helix structures about 21 residues long; another method based on the hidden Markov model, TMHMM (Krogh A, Larsson B, von Heijne G, Sonnhammer EL (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305:567–580.), cannot predict the sequence lengt

Examples


Embodiment 1

[0041] As shown in Figure 1, this embodiment includes the following steps:

[0042] 1) Search the SWISS-PROT protein database with the protein sequence to obtain the target amino acid sequence, as shown in Seq ID No. 1;

[0043] 2) Obtain the position-specific scoring matrix (PSSM) as the amino acid feature using the PSI-BLAST sequence alignment tool;

[0044] 3) Extract feature vectors with sliding windows of size 13 and 15, respectively, and then fuse them for optimization;

[0045] 4) Classify the extracted feature vectors with the optimized evidence-theoretic K-nearest-neighbor algorithm to obtain a curve of prediction probability along the amino acid sequence;

[0046] 5) Smooth the probability curve with median filtering to remove noise and reduce spikes;

[0047] 6) Use dynamic threshold segmentation to decide whether each amino acid in the target sequence belongs to the alpha helical transmembrane structu...
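
Steps 3, 5 and 6 above can be illustrated with a minimal Python sketch. It is not the patented implementation: the zero padding at the sequence ends, the truncated median window, and the use of the curve mean as the "dynamic" threshold are all assumptions made for illustration, and the function names are hypothetical. The fusion of the two window sizes and the classifier of step 4 are not specified in enough detail here and are not shown; the probability curve is a hand-made toy.

```python
from statistics import mean, median

def window_features(pssm, size):
    """Concatenate the PSSM rows inside a centred window of `size` residues.

    Positions that fall off either end are zero-padded (an assumption).
    """
    half = size // 2
    pad = [0.0] * len(pssm[0])
    feats = []
    for i in range(len(pssm)):
        vec = []
        for j in range(i - half, i + half + 1):
            vec.extend(pssm[j] if 0 <= j < len(pssm) else pad)
        feats.append(vec)
    return feats

def median_smooth(probs, size=5):
    """Median filter; the window is simply truncated at the sequence ends."""
    half = size // 2
    return [median(probs[max(0, i - half):i + half + 1])
            for i in range(len(probs))]

def segment(probs):
    """Mark residues at or above a per-sequence threshold.

    The threshold is the mean of the curve, a hypothetical stand-in for
    the dynamic threshold rule, which the text does not define.
    """
    threshold = mean(probs)
    return [p >= threshold for p in probs]

# Toy PSSM: 6 residues x 20 amino-acid columns.
pssm = [[float(i)] * 20 for i in range(6)]
f13 = window_features(pssm, 13)   # 13 * 20 = 260 values per residue
f15 = window_features(pssm, 15)   # 15 * 20 = 300 values per residue

# Toy per-residue helix probabilities with one spike and one dip.
curve = [0.1, 0.1, 0.9, 0.1, 0.8, 0.9, 0.9, 0.8, 0.1, 0.1]
smooth = median_smooth(curve, size=3)
labels = segment(smooth)
```

In this toy run the isolated spike at the third residue is removed and the dip inside the high-probability stretch is filled, so the thresholded labels form one contiguous segment.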


Abstract

The invention discloses a K-nearest-neighbor alpha-helix prediction method based on optimized evidence theory, and relates to pattern recognition algorithms and computational biology. The method accurately predicts the alpha-helix structure of membrane proteins when protein samples with known high-resolution structures are lacking. Using computational biology methods, including protein multiple sequence alignment, together with the OETPKNN algorithm, feature vectors extracted with multiple sliding windows are fused for optimization, noise is smoothed by median filtering, the prediction result is then segmented with a dynamic threshold, and the membrane protein alpha-helix structure is finally obtained. The method improves alpha-helix prediction accuracy by more than 20%, can predict the ends of an alpha helix, and works well on irregular alpha helices shorter than 15 amino acids.
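
The OETPKNN algorithm itself is not spelled out in this text. As a rough illustration of the family it belongs to, the following Python sketch implements a basic evidence-theoretic K-nearest-neighbor rule: each neighbor contributes a Dempster-Shafer mass for its own class that decays with distance, and the masses are pooled with Dempster's rule of combination. The parameters `alpha` and `gamma`, the class names, and all function names are hypothetical, and the parameter optimization implied by "optimized" is not shown.

```python
import math

def neighbor_mass(label, dist, classes, alpha=0.95, gamma=1.0):
    """Mass function induced by one neighbor: support for its own class
    decays with squared distance; the remainder goes to ignorance (Omega)."""
    support = alpha * math.exp(-gamma * dist ** 2)
    mass = {c: 0.0 for c in classes}
    mass[label] = support
    mass["Omega"] = 1.0 - support
    return mass

def dempster(m1, m2, classes):
    """Dempster's rule for masses on singleton classes plus Omega."""
    out = {c: m1[c] * m2[c] + m1[c] * m2["Omega"] + m1["Omega"] * m2[c]
           for c in classes}
    out["Omega"] = m1["Omega"] * m2["Omega"]
    total = sum(out.values())          # equals 1 minus the conflict
    return {key: value / total for key, value in out.items()}

def evidential_knn(train, query, k, classes):
    """Pool the evidence of the k nearest training samples."""
    nearest = sorted((math.dist(x, query), y) for x, y in train)[:k]
    mass = {c: 0.0 for c in classes}
    mass["Omega"] = 1.0                # start from total ignorance
    for dist, label in nearest:
        mass = dempster(mass, neighbor_mass(label, dist, classes), classes)
    return mass

def pignistic(mass, classes):
    """Spread the Omega mass evenly to get a probability per class."""
    return {c: mass[c] + mass["Omega"] / len(classes) for c in classes}

classes = ["helix", "coil"]
train = [((0.0, 0.0), "helix"), ((0.1, 0.0), "helix"),
         ((1.0, 1.0), "coil"), ((1.1, 1.0), "coil")]
mass = evidential_knn(train, (0.05, 0.0), k=3, classes=classes)
prob = pignistic(mass, classes)
```

Applied residue by residue, a probability such as `prob["helix"]` would give the kind of prediction curve that the median filter and threshold then post-process.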

Description

Technical field

[0001] The invention relates to a technology in the fields of bioinformatics and pattern recognition, specifically a K-nearest-neighbor algorithm based on optimized evidence theory, which is suitable for predicting alpha-helix protein structure.

Background

[0002] In recent years, with the rapid development of bioinformatics, large amounts of biological data have been analyzed with the tools of computer science, revealing more of the life-science mechanisms encoded in that data. Proteins are an important part of this biological information. The ever-expanding protein databases greatly facilitate research on protein structure and function. The function of a protein is often determined by its specific structure, so the study of protein structure plays a pivotal role. In current protein databases, most protein structures have been solved experimentally, but membrane proteins have strong hydrophobicity, are no...


Application Information

IPC(8): G06F19/16, G06F17/30, G06K9/66
Inventors: 沈红斌 (Shen Hongbin), 殷曦 (Yin Xi)
Owner: SHANGHAI JIAO TONG UNIV