A method for predicting the transmembrane region of membrane protein beta-barrel based on sparse coding and chain learning

A technique combining sparse coding and transmembrane-region prediction, applied in special data processing applications, instruments, electrical digital data processing, and related fields. It addresses factors that reduce prediction accuracy, with the stated effects of improved prediction accuracy, high efficiency, and reduced error and mutation rates.

Active Publication Date: 2017-07-18
SHANGHAI JIAOTONG UNIV
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

At present, methods for obtaining protein structures include those based on statistical information and the physicochemical properties of membrane proteins (Freeman, T. and Wimley, W. (2010) A highly accurate statistical approach for the prediction of transmembrane beta-barrels. Bioinformatics). Such methods are limited to a small number of structurally simple protein types, such as membrane proteins with few beta-strands. With the rapid development of machine learning, methods such as those based on hidden Markov models (Singh, N. et al. (2011) Tmbhmm: a frequency profile based HMM for predicting the topology of transmembrane beta barrel proteins and the exposure status of transmembrane residues. Biochim. Biophys. Acta BBA Proteins Proteomics, 1814, 664–670) have improved prediction accuracy. However, for strands of unusual length, such as shorter strands, the false positive rate is too high, and the influence of system noise and other factors that reduce prediction accuracy during feature extraction still needs to be resolved.



Examples


Embodiment 1

[0025] As shown in Figure 1, this embodiment includes the following steps:

[0026] 1) Obtain the features: the position-specific scoring matrix (PSSM), which contains evolutionary information, and the Z coordinate value, which represents amino acid distance information. The PSSM generated by the PSI-BLAST sequence alignment tool and the residue distance information (Z-score) calculated by the Z-pred software are each normalized. The PSSM represents the evolutionary information of the amino acid sequence and reflects amino acid conservation, and it has been shown to be an effective protein feature. The Z-score represents amino acid distance information: the distance of each residue relative to the membrane center is calculated, which also serves as an effective protein feature for the transmembrane structures studied.

[0027] The normalization formula for the feature vectors is:

[0028] ...
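The normalization formula itself is elided in this excerpt, so the following is not the patent's formula. As a minimal sketch only, it assumes two common choices: logistic squashing for the PSSM scores and min-max rescaling for the Z values. The function names and the toy inputs are hypothetical.

```python
import math


def logistic_normalize(matrix):
    """Map every raw score x to 1 / (1 + exp(-x)), giving values in (0, 1).

    Assumption: logistic scaling is a stand-in, not the patent's formula.
    """
    return [[1.0 / (1.0 + math.exp(-x)) for x in row] for row in matrix]


def min_max_normalize(values):
    """Linearly rescale a list of values to [0, 1].

    Assumption: min-max scaling is a stand-in, not the patent's formula.
    A constant input maps to all zeros to avoid dividing by zero.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy inputs (made up): a 3-residue PSSM with 4 columns instead of the
# usual 20, and one Z value per residue.
pssm = [[-3, 0, 2, 5], [1, -1, 0, 4], [0, 0, -2, 3]]
z_values = [-12.0, 0.0, 12.0]

pssm_norm = logistic_normalize(pssm)
z_norm = min_max_normalize(z_values)
```

Normalizing the two feature types separately, as paragraph [0026] describes, keeps either one from dominating the other once they are combined into a single feature vector.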

Embodiment 2

[0051] This embodiment includes the following steps:

[0052] [1] Protein data set: select the data set from the protein database and divide it into a training set and a test set;

[0053] [2] Feature selection: select the position-specific scoring matrix (PSSM), representing protein evolutionary information, and the Z-score, representing the distance of amino acids relative to the membrane;

[0054] [3] Feature extraction: extract feature vectors through normalization and sliding window methods;

[0055] [4] Sparse coding: apply sparse coding to the extracted features and use the resulting sparse coefficients and basis vectors to represent the original data. Based on experimental statistics, the number of basis vectors and the number of iterations are set to 128 and 1000, respectively;

[0056] [5] Support vector machine (SVM): determine the optimal parameters c and g through 5-fold cross-validation and grid search, sele...
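The sliding-window extraction in step [3] can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the window half-width, the zero padding at the sequence ends, and the function name `sliding_windows` are all hypothetical, since the excerpt does not specify them.

```python
def sliding_windows(features, half_width):
    """Build one flattened feature vector per residue.

    features: list of per-residue feature vectors of equal length
        (for example, normalized PSSM columns plus a Z value).
    half_width: number of neighbors taken on each side (hypothetical).
    Positions that fall outside the sequence are zero-padded, which is
    an assumption made for this sketch.
    """
    n = len(features)
    dim = len(features[0]) if n else 0
    pad = [0.0] * dim
    windows = []
    for i in range(n):
        window = []
        for j in range(i - half_width, i + half_width + 1):
            window.extend(features[j] if 0 <= j < n else pad)
        windows.append(window)
    return windows


# Toy input (made up): 4 residues with 2 features each.
residues = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
windows = sliding_windows(residues, half_width=1)
```

Each row of `windows` has `(2 * half_width + 1) * dim` entries and describes one residue together with its sequence neighbors. In the pipeline above, rows of this kind would be the input to the sparse coding in step [4].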


Abstract

The invention provides a method for predicting membrane protein beta-barrel transmembrane regions based on sparse coding and chain learning, involving sparse coding, a chain learning algorithm, and a support vector machine (SVM). The structure of the beta-barrel transmembrane region is predicted computationally, providing important information for research on protein structure and function. The method borrows a concept from digital image processing: sparse coding is applied to the protein feature matrix to achieve feature dimensionality reduction and denoising. A membrane protein beta-barrel data set is assembled from the protein database PDB, and a position-specific scoring matrix and a Z-score are extracted as features; the position-specific scoring matrix represents amino acid evolutionary information, and the Z-score represents the position information of amino acid residues. Feature vectors are extracted through a sliding window to achieve multi-feature fusion, and a chain learning algorithm based on an SVM classifier is used to train the model. The prediction performance is markedly improved, and jackknife cross-validation shows that the precision can reach 92.5%.

Description

Technical field

[0001] The invention relates to the field of membrane protein structure prediction and computational intelligence, specifically to a method for predicting the transmembrane region of membrane protein beta-barrels based on sparse coding and chain learning.

Background

[0002] At present, with the rapid development of proteome databases, the number of proteins with known structures is increasing, which plays an important role in the study of protein functions. Membrane proteins are embedded in the biomembrane and span the phospholipid bilayer. They are strongly hydrophobic and difficult to crystallize. Experimental methods for solving protein structures are both expensive and time-consuming. Therefore, using calculation methods to predict protein structure is a ... However, traditional machine learning methods still have some problems to be solved in the field of protein prediction, such as feature selection and...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F19/18
Inventors: 沈红斌 (Shen Hongbin), 殷曦 (Yin Xi)
Owner: SHANGHAI JIAOTONG UNIV