Method for predicting outer membrane protein of germs on basis of machine learning technique

A machine learning prediction technique, applied in proteomics, genomics, instrumentation, and related fields, that speeds up the identification of bacterial outer membrane proteins

Inactive Publication Date: 2018-05-08
上海韦翰斯生物医药科技有限公司

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a method for predicting bacterial outer membrane proteins based on machine learning technology, aiming to solve the problem that outer membrane proteins in newly sequenced bacterial genomes are identified mainly through experiments.



Examples


Embodiment 1

[0039] The present invention is realized as follows: a method for predicting bacterial outer membrane proteins based on machine learning technology, which predicts outer membrane proteins at the whole-genome level of bacteria as described below.

[0040] Using the PSI-BLAST algorithm, the protein sequence is aligned against a non-redundant protein sequence database and a position-specific scoring matrix (PSSM) is calculated. A composition function and an autocorrelation function are then applied to the PSSM to obtain the PSSM composition features (amino acid residue composition / PSSM_AAC) and the PSSM autocorrelation features (autocorrelated amino acid position-specific composition / PSSM_AC). A classifier based on a support vector machine is established to classify outer membrane proteins and non-outer membrane proteins. A local computer program accepts a protein sequence input by the user and predicts whether that sequence is an outer membrane protein.
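
The excerpt above does not give the exact composition and autocorrelation functions, so the following Python sketch uses one commonly seen formulation as an assumption: PSSM_AAC as the column-wise mean of the PSSM, and PSSM_AC as the lagged auto-covariance of each column. The function names, the `max_lag` parameter, and the toy matrix are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence

# A PSSM is treated here as L rows (residues) by N columns (normally N = 20).
Matrix = Sequence[Sequence[float]]


def pssm_aac(pssm: Matrix) -> List[float]:
    """Composition features: the mean score of each PSSM column."""
    n_rows = len(pssm)
    n_cols = len(pssm[0])
    return [sum(row[j] for row in pssm) / n_rows for j in range(n_cols)]


def pssm_ac(pssm: Matrix, max_lag: int) -> List[float]:
    """Autocorrelation features (assumed definition): for each lag g and
    column j, the average of (P[i][j] - mean_j) * (P[i+g][j] - mean_j)
    over all positions i where i + g is still inside the sequence."""
    n_rows = len(pssm)
    if not 1 <= max_lag < n_rows:
        raise ValueError("max_lag must be at least 1 and shorter than the sequence")
    means = pssm_aac(pssm)
    features: List[float] = []
    for lag in range(1, max_lag + 1):
        for j, mean_j in enumerate(means):
            total = sum(
                (pssm[i][j] - mean_j) * (pssm[i + lag][j] - mean_j)
                for i in range(n_rows - lag)
            )
            features.append(total / (n_rows - lag))
    return features


def feature_vector(pssm: Matrix, max_lag: int = 2) -> List[float]:
    """Concatenate both feature groups into one vector for a classifier."""
    return pssm_aac(pssm) + pssm_ac(pssm, max_lag)


# Toy 4-residue, 2-column matrix, used only to illustrate the arithmetic.
toy = [[1, 0], [3, 0], [2, 0], [6, 0]]
print(feature_vector(toy, max_lag=2))
```

With a real 20-column PSSM this yields 20 composition values plus 20 values per lag, so the vector length is fixed regardless of protein length, which is what lets a support vector machine consume proteins of different sizes.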

Embodiment 2

[0042] A method for predicting bacterial outer membrane proteins based on machine learning technology, which predicts outer membrane proteins at the bacterial genome level through the following steps:

[0043] Step 1, the user inputs the protein sequence to be predicted, in FASTA format, into the local computer program;

[0044] Step 2, the computer program uses the PSI-BLAST program to align the protein sequence against the non-redundant protein sequence database;

[0045] Step 3, the computer program calls Matlab to run the core prediction program, and calculates the PSSM composition features and PSSM autocorrelation features of the protein;

[0046] Step 4, the Matlab program performs feature selection and combination of multiple types of features according to a preset method to generate a protein feature vector;

[0047] Step 5, the Matlab program calls the libSVM program, and uses the pre-trained model to predict the likelihood that the protein is an outer membrane protein;

[0048...
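
As a rough illustration of Steps 1 and 2, the Python sketch below reads FASTA input, assembles a `psiblast` command line, and parses the ASCII PSSM file that PSI-BLAST writes. The patent's own program calls Matlab and libSVM, so this is not the author's implementation. The database name `nr`, the iteration count, the inclusion threshold, and all function names are assumptions, and the command is only built here, not executed.

```python
from typing import Dict, List


def read_fasta(text: str) -> Dict[str, str]:
    """Map each record identifier (first word after '>') to its sequence."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def psiblast_command(query_fasta: str, db: str, pssm_out: str,
                     iterations: int = 3, inclusion: float = 0.001) -> List[str]:
    """Build (but do not run) a BLAST+ psiblast call that saves an ASCII PSSM.
    The iteration count and inclusion threshold are assumed values."""
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", db,
        "-num_iterations", str(iterations),
        "-inclusion_ethresh", str(inclusion),
        "-out_ascii_pssm", pssm_out,
    ]


def parse_ascii_pssm(text: str) -> List[List[int]]:
    """Extract the 20 log-odds scores per residue from an ASCII PSSM file.
    Assumes whitespace-separated columns: position, residue, 20 scores, ..."""
    rows: List[List[int]] = []
    for line in text.splitlines():
        parts = line.split()
        if (len(parts) >= 22 and parts[0].isdigit()
                and len(parts[1]) == 1 and parts[1].isalpha()):
            rows.append([int(value) for value in parts[2:22]])
    return rows


sequences = read_fasta(">p1 example protein\nMKK\nLLA\n>p2\nAAA\n")
command = psiblast_command("p1.fasta", "nr", "p1.pssm")
print(sequences)
print(" ".join(command))
```

The command list could be passed to `subprocess.run` on a machine where BLAST+ and a local copy of the database are installed; the matrix returned by `parse_ascii_pssm` is then the input to the feature calculation in Step 3.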


Abstract

The invention discloses a method that uses machine learning to predict outer membrane proteins encoded in bacterial genomes. The method includes the steps of using the PSI-BLAST algorithm to calculate a position-specific feature vector of the protein, applying an autocorrelation function for feature transformation, building a classifier based on a support vector machine, classifying outer membrane proteins and non-outer membrane proteins, receiving a protein sequence input by a user through a local computer program, and predicting whether the protein sequence is an outer membrane protein. With this method, computational prediction can be performed on the protein sequences encoded by a whole bacterial genome; the sensitivity is high and the calculation is fast, providing an effective tool for rapid identification and screening of inner and outer membrane proteins of bacterial genomes. The method is an accurate and effective screening method for outer membrane proteins and can be widely applied to identifying outer membrane proteins in newly sequenced bacterial genomes.

Description

Technical field

[0001] The invention belongs to the technical field of predicting bacterial outer membrane proteins, and in particular relates to a method for predicting bacterial outer membrane proteins based on machine learning technology.

Background technique

[0002] A large number of beta-barrel transmembrane proteins are distributed on the outer membrane of Gram-negative bacteria. Some of these proteins enable bacteria to invade cells and mediate the occurrence of various diseases; they are also the target proteins that the host immune system recognizes in order to clear bacteria, and so they activate the body's immune mechanisms against bacterial infection.

[0003] Currently, the identification of outer membrane proteins within novel bacterial genomes is mostly done experimentally. However, using experimental methods to identify outer membrane proteins requires a lot of manpower and material resources, with high cost and low efficiency. A new bacterial...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/24, G06F19/18, G06F19/28, G06K9/62
CPC: G16B20/00, G16B40/00, G16B50/00, G06F18/2411
Inventor: 陈抗
Owner: 上海韦翰斯生物医药科技有限公司