Protein subcellular localization and prediction method realized by using nearest-neighbor retrieval

A protein subcellular localization prediction method based on nearest-neighbor retrieval, which achieves strong model adaptability, effective acquisition, and high overall accuracy.

Status: Inactive. Publication Date: 2015-11-11
NANJING AGRICULTURAL UNIVERSITY
Cites: 2. Cited by: 6.
AI Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to address the problem of protein subcellular localization by proposing a prediction method based on nearest-neighbor search.

Examples

Embodiment Construction

[0043] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0044] 1 Selection of test data set

[0045] Take as an example a data set of 317 apoptosis protein sequences obtained from the SWISS-PROT database. The 317 sequences are distributed over 6 subcellular locations: 112 cytoplasmic proteins, 55 membrane proteins, 34 mitochondrial proteins, 17 secreted proteins, 52 nuclear proteins, and 47 endoplasmic reticulum proteins.
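As a quick sanity check on the class counts listed above, the following minimal sketch tallies the six location classes and computes the accuracy of always guessing the largest class. The dictionary keys and variable names are illustrative labels chosen here, not identifiers from the patent.

```python
# Class counts as stated in paragraph [0045]; key names are illustrative.
class_counts = {
    "cytoplasmic": 112,
    "membrane": 55,
    "mitochondrial": 34,
    "secreted": 17,
    "nuclear": 52,
    "endoplasmic_reticulum": 47,
}

# The per-class counts should add up to the stated data set size.
total = sum(class_counts.values())

# Majority-class baseline: the accuracy of always predicting the largest class.
majority_class = max(class_counts, key=class_counts.get)
baseline_accuracy = class_counts[majority_class] / total

print(total, majority_class, round(baseline_accuracy, 3))
```

The data set is imbalanced (the smallest class has 17 sequences), so per-class accuracy is worth reporting alongside the overall figure.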

[0046] 2 Experimental evaluation methods and indicators

[0047] There are three common methods for evaluating predictions: the resubstitution (self-consistency) test, K-fold cross-validation, and the jackknife test. In the resubstitution test, the training set contains the sequence to be predicted, so the success rate of the method described here can be expected to be 100%. Compared with K-fold cross-validation, the jackknife test uses a one-to-many prediction model, w...
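The difference between the resubstitution and jackknife tests can be illustrated with a minimal sketch. It assumes the usual leave-one-out reading of the jackknife test and uses a plain 1-nearest-neighbor classifier on toy 2-D points as a stand-in for the patent's method; the function names and the toy data are hypothetical.

```python
import math

def nearest_label(query, training):
    """Label of the training vector closest to `query` (Euclidean distance)."""
    return min(training, key=lambda item: math.dist(query, item[0]))[1]

def resubstitution_accuracy(samples, classify=nearest_label):
    """Each sample is predicted while it is still part of the training set."""
    hits = sum(classify(vec, samples) == label for vec, label in samples)
    return hits / len(samples)

def jackknife_accuracy(samples, classify=nearest_label):
    """Leave-one-out: each sample is predicted from all the other samples."""
    hits = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if classify(vec, rest) == label:
            hits += 1
    return hits / len(samples)

# Toy data (hypothetical): two clusters plus one point near the wrong cluster.
toy = [
    ((0.0, 0.0), "a"), ((0.1, 0.0), "a"),
    ((1.0, 1.0), "b"), ((1.1, 1.0), "b"),
    ((0.5, 0.45), "b"),
]
print(resubstitution_accuracy(toy), jackknife_accuracy(toy))
```

With a nearest-neighbor method every sample finds itself at distance zero in the resubstitution test, which is why that test reports a perfect score and says little about generalization; the jackknife figure is the more informative one.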

Abstract

A protein subcellular localization prediction method realized by nearest-neighbor retrieval comprises the following steps: (1) taking the AAC feature vector as the feature of each protein sequence, and storing the AAC feature vector of every protein sequence in the training set in a plurality of hash tables using the LSH (Locality Sensitive Hashing) method; (2) during prediction, calculating with the LSH method the hash value of the target sequence's AAC feature vector in each hash table, and thereby obtaining a vector set of similar sequences; and (3) selecting from the vector set of similar sequences the Q vectors with the smallest Euclidean distance to the AAC feature vector of the target sequence, calculating the expected protein sequence distance between the target sequence and each of the Q candidates with a global alignment dynamic programming method, and taking as the predicted location the subcellular location of the protein whose sequence, among the Q candidates, has the longest expected distance from the target sequence.
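The three steps of the abstract can be sketched as follows. This is a minimal illustration under several assumptions, not the patented implementation: AAC is taken to be the standard 20-dimensional amino acid composition; the LSH family is assumed to be Gaussian random projections with bucketing (the abstract only says "LSH"); and because this excerpt does not define the "expected protein sequence distance", the raw Needleman-Wunsch global alignment score is used as a placeholder, choosing the largest value to mirror the abstract's wording. All names and parameters (`LSHIndex`, `n_tables`, `n_proj`, `width`, `q`, the scoring values) are hypothetical.

```python
import math
import random
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aac_vector(seq):
    """Amino acid composition: fraction of each of the 20 standard residues."""
    seq = seq.upper()
    n = sum(seq.count(a) for a in AMINO_ACIDS) or 1
    return [seq.count(a) / n for a in AMINO_ACIDS]

class LSHIndex:
    """Several hash tables, each keyed by a few bucketed random projections."""

    def __init__(self, dim=20, n_tables=8, n_proj=4, width=0.05, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.items = []    # (sequence, AAC vector, location label)
        self.tables = []   # (projections, buckets) per hash table
        for _ in range(n_tables):
            projs = [([rng.gauss(0, 1) for _ in range(dim)],
                      rng.uniform(0, width)) for _ in range(n_proj)]
            self.tables.append((projs, defaultdict(list)))

    def _key(self, projs, vec):
        return tuple(
            math.floor((sum(a * x for a, x in zip(direction, vec)) + offset)
                       / self.width)
            for direction, offset in projs)

    def add(self, seq, label):
        """Step 1: store the AAC vector of a training sequence in every table."""
        idx = len(self.items)
        vec = aac_vector(seq)
        self.items.append((seq, vec, label))
        for projs, buckets in self.tables:
            buckets[self._key(projs, vec)].append(idx)

    def candidates(self, vec):
        """Step 2: union of the buckets the query vector hashes to."""
        found = set()
        for projs, buckets in self.tables:
            found.update(buckets.get(self._key(projs, vec), []))
        return found

def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score with a linear gap penalty (two-row DP)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def predict(index, target_seq, q=5):
    """Step 3: keep the Q nearest candidates, then re-rank them by alignment."""
    vec = aac_vector(target_seq)
    cand = index.candidates(vec)
    if not cand:
        return None  # no bucket collision in any table
    nearest = sorted(cand, key=lambda i: math.dist(vec, index.items[i][1]))[:q]
    # Placeholder for the "expected distance": the largest alignment score wins.
    best = max(nearest,
               key=lambda i: global_alignment_score(target_seq,
                                                    index.items[i][0]))
    return index.items[best][2]
```

The design point is that the cheap steps (hashing, then Euclidean distance on 20-dimensional vectors) shrink the candidate set before the quadratic-time alignment runs, so alignment is computed for at most Q sequences per query rather than for the whole training set.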

Description

Technical field

[0001] The invention belongs to the field of bioinformatics and relates to predicting the subcellular location of proteins using machine learning, specifically to a method that predicts protein subcellular location by nearest-neighbor retrieval.

Background

[0002] Protein subcellular localization refers to the specific location of a protein or gene expression product within a cell; the prediction task is to infer that location from a given protein sequence. The subcellular localization of proteins is closely related to their biological functions. Knowledge of a protein's subcellular location plays a vital role in biology, cell biology, pharmacology, and medicine. Although the subcellular localization of proteins can be determined experimentally, doing so is time-consuming and expensive. As the volume of sequenced genomic data increases, methods for predicting the subcellular localization of proteins become increasingly important, requiring a...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F19/18
Inventors: 薛卫, 王雄飞, 赵南, 任守纲
Owner: NANJING AGRICULTURAL UNIVERSITY