Protein structure class prediction method based on weighted composition position vectors and support vector machine

A protein structure class prediction method that combines weighted composition position vectors with a support vector machine, applied in the field of protein structure prediction. It addresses the loss of sequence information between amino acid residues, the resulting inability to distinguish protein sequences, and the limit this places on the performance of prediction methods. The stated effects are richer representational information, a simple prediction method, and good value for popularization and application.

Inactive Publication Date: 2015-12-30
SYSU CMU SHUNDE INT JOINT RES INST +1

AI Technical Summary

Benefits of technology

The method improves prediction performance by representing each protein sequence with both its amino acid composition and the position of each amino acid residue in the sequence. Weighting factors are introduced into the composition position vectors, and adjusting them can noticeably improve prediction accuracy. The method is described as simple, rapid and sensitive, and is expected to be applicable to other protein prediction fields.

Problems solved by technology

Characterizing a protein sequence by amino acid composition alone discards the sequence information between amino acid residues, so protein sequences with the same composition cannot be told apart. This missing information limits the performance of composition-based methods for predicting protein structure classes.


Examples


Embodiment 1

[0037] 1. Select the standard data set of protein structure

[0038] In this embodiment, two standard data sets of protein structures, Z277 and Z498, are selected as examples for protein structure class prediction.

[0039] Information on the two standard data sets, Z277 and Z498:

[0040] The Z277 data set contains a total of 277 protein sequences, including 70 all-α proteins, 61 all-β proteins, 81 α/β-type proteins and 65 α+β-type proteins.

[0041] The Z498 data set contains a total of 498 protein sequences, including 107 all-α proteins, 126 all-β proteins, 136 α/β-type proteins and 129 α+β-type proteins.
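The class counts above can be tallied to confirm that they add up to the stated totals and to see how balanced the four classes are. The following is a minimal sketch; the dictionary layout, the ASCII class labels and the function name `class_fractions` are illustrative assumptions, while the counts are taken from the text.

```python
# Class counts as stated in the text for the two standard data sets.
# The ASCII labels are illustrative stand-ins for all-α, all-β, α/β and α+β.
DATASETS = {
    "Z277": {"all-alpha": 70, "all-beta": 61, "alpha/beta": 81, "alpha+beta": 65},
    "Z498": {"all-alpha": 107, "all-beta": 126, "alpha/beta": 136, "alpha+beta": 129},
}


def class_fractions(counts):
    """Return the total number of sequences and each class's share of it."""
    total = sum(counts.values())
    return total, {name: n / total for name, n in counts.items()}


for name, counts in DATASETS.items():
    total, fractions = class_fractions(counts)
    summary = ", ".join(f"{cls}: {frac:.1%}" for cls, frac in fractions.items())
    print(f"{name}: {total} sequences ({summary})")
```

Both totals match the data set names (277 and 498), and no class dominates either set.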

[0042] 2. Analysis of amino acid composition

[0043] When amino acid composition is used to characterize protein sequences, the sequence information between amino acid residues is lost, and composition-based characterization cannot distinguish protein sequences with the same amino acid composition but differ...
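The limitation described in [0043] can be illustrated with a short sketch. The two toy sequences and the function name `amino_acid_composition` are assumptions made for illustration and do not come from the patent; the sequences contain the same residues in a different order.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20-dimensional vector of residue frequencies."""
    counts = Counter(sequence)
    length = len(sequence)
    return [counts[aa] / length for aa in AMINO_ACIDS]


# Hypothetical toy sequences: same residues, different order.
seq_a = "AAGGLLKK"
seq_b = "KLGAKLGA"

# The two sequences differ, yet their composition vectors are identical.
print(seq_a != seq_b)
print(amino_acid_composition(seq_a) == amino_acid_composition(seq_b))
```

Both lines print `True`: the sequences are different, but a composition-only representation maps them to the same vector.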



Abstract

The invention discloses a protein structure class prediction method based on weighted composition position vectors and a support vector machine. First, standard data sets of protein structure classes are selected or established. Weighting factors are then introduced into the composition position vectors, and the protein sequences to be predicted are represented by the resulting weighted composition position vectors. Finally, the weighted composition position vectors are combined with the support vector machine, and the prediction method is established through a direct multiclass classification strategy. The representation contains the amino acid composition as well as the position information of each amino acid residue in the protein sequence, and a one-to-one functional correspondence is formed between the position information and the protein sequence. Because weighting factors are introduced into the composition position vector representation, prediction accuracy can be markedly improved by adjusting them. The method is simple, rapid and sensitive, and is expected to be applicable to other protein prediction fields.
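The idea of pairing composition with weighted position information can be sketched as follows. This excerpt does not give the patent's formula for the weighted composition position vector, so the definition below (residue frequency plus a weighted mean normalized position per amino acid) is an assumption chosen only for illustration, as are the function name, the `weight` parameter and the toy sequences. The support vector machine step is omitted.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def weighted_composition_position_vector(sequence, weight=1.0):
    """Hypothetical 40-dimensional representation (not the patent's formula).

    The first 20 entries are residue frequencies. The last 20 entries are
    the mean 1-based position of each residue type, normalized by sequence
    length and scaled by `weight`; absent residue types get 0.0.
    """
    length = len(sequence)
    composition = []
    position = []
    for aa in AMINO_ACIDS:
        indices = [i + 1 for i, residue in enumerate(sequence) if residue == aa]
        composition.append(len(indices) / length)
        mean_position = sum(indices) / (len(indices) * length) if indices else 0.0
        position.append(weight * mean_position)
    return composition + position


# Hypothetical toy sequences with identical composition but different order.
seq_a = "AAGGLLKK"
seq_b = "KLGAKLGA"

vec_a = weighted_composition_position_vector(seq_a, weight=0.5)
vec_b = weighted_composition_position_vector(seq_b, weight=0.5)

print(vec_a[:20] == vec_b[:20])  # composition part is the same
print(vec_a == vec_b)            # position part separates the sequences
```

In this sketch the weight controls how strongly the position terms contribute relative to the composition terms; setting it to zero reduces the representation to composition alone. In the method described above, vectors of this kind would be passed to a support vector machine trained with a direct multiclass strategy, with the weighting factors tuned for accuracy.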


Application Information

Owner SYSU CMU SHUNDE INT JOINT RES INST