Prediction method for protein subcellular site formed based on improved-period pseudo amino acid

A prediction method based on pseudo amino acid technology, applied in the field of bioinformatics, that addresses problems such as poor prediction performance and information redundancy. It achieves predictions that are less prone to bias, a reliable prediction method, and balanced protein data.

Inactive Publication Date: 2012-12-12
THE SECOND AFFILIATED HOSPITAL ARMY MEDICAL UNIV
Cites: 0; Cited by: 9

AI Technical Summary

Benefits of technology

This technology builds an integrated classifier from KNN and SVM using a one-versus-one (pairwise binary) strategy, so that each component classifier only has to separate two subcellular sites at a time. Combined with high-scoring features selected from GO, amino acid composition, amino acid pair composition, and amino acid hydrophilicity and hydrophobicity, this aims to improve the prediction accuracy for protein subcellular sites and thereby accelerate the study of protein function.

Problems solved by technology

The technical problem addressed is the accurate prediction of protein subcellular sites. Existing prediction approaches suffer from problems such as poor prediction performance and information redundancy. It is therefore desirable to have a computational method that predicts protein subcellular sites reliably while using only the features most closely related to the site.



Examples


Embodiment Construction

[0032] The construction process of the present invention is described in detail below in conjunction with an embodiment:

[0033] 1. Construction of protein datasets: ① eukaryotic protein dataset Euk7579, ② prokaryotic protein dataset Gneg1456, and ③ viral protein dataset Virus252 were respectively obtained through the following addresses:

[0034] ① http://web.kuicr.kyoto-u.ac.jp/~park/Seqdata/;

[0035] ② http://www.csbio.sjtu.edu.cn/bioinf/Gneg-multi/;

[0036] ③ http://www.csbio.sjtu.edu.cn/bioinf/virus-multi/.

[0037] Transfer the above-mentioned eukaryotic, prokaryotic, and viral protein datasets into a Word file, use Word's search-and-replace functions to delete redundant information, and store the protein number, subcellular site number, and amino acid sequence in the newly created files A.xls and A2.xls.
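The manual clean-up described in [0037] could also be scripted. The following is a minimal sketch, not the procedure used in the patent. It assumes, hypothetically, that each record is FASTA-like with a header of the form `>ID|site`; the actual layout of the downloaded datasets is not stated in the text. The function names `parse_records` and `to_csv` are illustrative, and the output is CSV text rather than an .xls file.

```python
import csv
import io


def _finish(header, chunks):
    """Split a hypothetical '>ID|site' header and join the sequence lines."""
    protein_id, _, site = header.partition("|")
    return protein_id.strip(), site.strip(), "".join(chunks)


def parse_records(text):
    """Return a list of (protein_id, site, sequence) tuples.

    ASSUMPTION: records are FASTA-like and the header is '>ID|site'.
    Adjust the header parsing to match the real dataset files.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if header is not None:
                records.append(_finish(header, chunks))
            header, chunks = line[1:], []
        else:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append(_finish(header, chunks))
    return records


def to_csv(records):
    """Serialize the records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["protein_id", "site", "sequence"])
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    # Toy input, invented for illustration only.
    sample = ">P1|nucleus\nMKV\nLLA\n>P2|cytoplasm\nMAA\n"
    print(to_csv(parse_records(sample)))
```

Keeping the three fields in a plain table makes it straightforward to feed the sequences into later feature-extraction steps.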

[0038] 2. GO feature extraction and vector construction: Download the GO dataset file local_gene_association.goa_uniprot.sorted from ftp://ftp.ebi.ac.uk/pub/databases...


Abstract

The invention relates to a method for predicting protein subcellular sites based on improved-period pseudo amino acid features. Its strategy is to construct an integrated classifier that combines a KNN (k-nearest neighbor) method and an SVM (support vector machine) method under a one-versus-one scheme. The method aims to predict protein subcellular sites and thereby accelerate the study of protein function; it belongs to the field of bioinformatics. The integrated classifier uses KNN with the Euclidean distance and SVM with an RBF (radial basis function) kernel. The protein feature information consists of improved-period pseudo amino acid features, obtained by using fselect.py to extract high-scoring features closely related to the protein subcellular site from GO (Gene Ontology), AAC (amino acid composition), AAP (amino acid pair composition), and the hydrophilicity and hydrophobicity of amino acids. Combining the two prediction methods, KNN and SVM, with these high-scoring features is intended to improve the prediction accuracy for protein subcellular sites. In the implementation, the method is evaluated with a jackknife test on indexes such as overall prediction accuracy, per-site prediction accuracy, and MCC (Matthews correlation coefficient). The method is suitable for predicting the subcellular sites of proteins from different species.
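Several of the building blocks named in the abstract can be illustrated in a few lines. The sketch below covers only part of the pipeline: AAC and AAP features, a KNN vote using the Euclidean distance, a jackknife (leave-one-out) test, and the per-site MCC. It omits the GO and hydrophilicity/hydrophobicity features, the fselect.py selection step, and the SVM half of the integrated classifier. All function names and the toy sequences are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aac(seq):
    """Amino acid composition: frequency of each of the 20 residues."""
    n = max(len(seq), 1)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def aap(seq):
    """Amino acid pair composition: frequency of each adjacent pair (400)."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = max(len(seq) - 1, 1)
    return [counts[p] / total for p in PAIRS]


def features(seq):
    """Concatenate AAC and AAP into one 420-dimensional vector."""
    return aac(seq) + aap(seq)


def knn_predict(train_x, train_y, x, k=1):
    """Majority vote among the k nearest neighbors (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda t: math.dist(t[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def jackknife(xs, ys, k=1):
    """Leave-one-out test: predict each sample from all the others."""
    preds = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        preds.append(knn_predict(rest_x, rest_y, xs[i], k))
    return preds


def mcc(ys, preds, site):
    """Matthews correlation coefficient for one site versus the rest."""
    tp = sum(y == site and p == site for y, p in zip(ys, preds))
    tn = sum(y != site and p != site for y, p in zip(ys, preds))
    fp = sum(y != site and p == site for y, p in zip(ys, preds))
    fn = sum(y == site and p != site for y, p in zip(ys, preds))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy sequences and labels, invented for illustration only.
    seqs = ["AAAAKKAA", "AAKAAAAK", "WWYWWYYW", "YWWYWWWY"]
    sites = ["site1", "site1", "site2", "site2"]
    xs = [features(s) for s in seqs]
    preds = jackknife(xs, sites, k=1)
    accuracy = sum(p == y for p, y in zip(preds, sites)) / len(sites)
    print("overall accuracy:", accuracy)
    print("MCC for site1:", mcc(sites, preds, "site1"))
```

The jackknife loop retrains on every sample but one, which is why it pairs naturally with KNN: there is no fitting step to repeat, only a distance ranking over the remaining samples.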


Application Information

Owner THE SECOND AFFILIATED HOSPITAL ARMY MEDICAL UNIV