A Protein Subcellular Localization Prediction Method Using Nearest Neighbor Retrieval

A subcellular localization prediction method, applied in the field of protein subcellular localization prediction by nearest neighbor retrieval, that aims for strong model adaptability, effective acquisition, and high overall accuracy.

Inactive Publication Date: 2018-02-23
NANJING AGRICULTURAL UNIVERSITY
Cites: 2; Cited by: 0

AI Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to provide a protein subcellular localization prediction method, implemented by nearest neighbor search, to address the problem of protein subcellular localization.


Examples

Embodiment Construction

[0044] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0045] 1. Selection of the test dataset

[0046] Take as an example a dataset of 317 apoptosis protein sequences obtained from the SWISS-PROT database. The 317 sequences are distributed across 6 subcellular locations: 112 cytoplasmic proteins, 55 membrane proteins, 34 mitochondrial proteins, 17 secreted proteins, 52 nuclear proteins, and 47 endoplasmic reticulum proteins.
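As a quick consistency check on the class distribution above, the per-class counts can be tallied in a few lines. The majority-class baseline computed here is not from the source; it is only an illustrative reference point for judging accuracy figures on an imbalanced dataset.

```python
# Class counts as stated in paragraph [0046].
class_counts = {
    "Cytoplasmic": 112,
    "Membrane": 55,
    "Mitochondrial": 34,
    "Secreted": 17,
    "Nuclear": 52,
    "Endoplasmic reticulum": 47,
}

total = sum(class_counts.values())

# Assumption (not in the source): accuracy of always predicting the
# largest class, shown only as a reference point.
majority_baseline = max(class_counts.values()) / total

print(f"total sequences: {total}")
print(f"majority-class baseline: {majority_baseline:.3f}")
```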

[0047] 2. Experimental evaluation methods and indicators

[0048] There are three common evaluation methods for prediction: resubstitution (the self-consistency test), K-fold cross-validation, and the jackknife test. In the resubstitution test, the training set contains the sequence to be predicted, so the success rate of the method described here can be expected to be 100%. Compared with K-fold cross-validation, the jackknife test uses a one-to-many predi...
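The difference between resubstitution and the jackknife test can be sketched as follows. This is a minimal illustration, not the patent's evaluation code: the function names, the plain 1-nearest-neighbor classifier, and the toy 2-D points are all assumptions chosen to keep the example small.

```python
import math

def nearest_neighbor(train, x):
    """Hypothetical stand-in classifier: label of the closest training point."""
    return min(train, key=lambda s: math.dist(s[0], x))[1]

def resubstitution_accuracy(samples, classify):
    """Each sample is predicted with itself still in the training set."""
    hits = sum(classify(samples, x) == y for x, y in samples)
    return hits / len(samples)

def jackknife_accuracy(samples, classify):
    """Each sample is predicted from all the other samples (leave-one-out)."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if classify(rest, x) == y:
            hits += 1
    return hits / len(samples)

# Toy data (made up for illustration): (feature vector, class label).
samples = [
    ((0.0, 0.0), "A"),
    ((0.1, 0.0), "A"),
    ((1.0, 1.0), "B"),
    ((1.1, 1.0), "B"),
    ((0.5, 0.4), "B"),
]

resub = resubstitution_accuracy(samples, nearest_neighbor)
jack = jackknife_accuracy(samples, nearest_neighbor)
print(resub, jack)
```

With a nearest-neighbor predictor, resubstitution always finds the query itself at distance zero, which is why it says little about generalization; the jackknife removes that match before predicting.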

Abstract

A protein subcellular localization prediction method realized by nearest neighbor retrieval, comprising the following steps: (1) using the amino acid composition (AAC) feature vector as the feature of each protein sequence, store the AAC feature vector of every protein sequence in the training set in multiple hash tables using locality-sensitive hashing (LSH); (2) at prediction time, use LSH to compute the hash value of the target sequence's AAC feature vector in each hash table and obtain a set of similar sequence vectors; (3) from that set, select the Q vectors closest in Euclidean distance to the target sequence's AAC feature vector, use global-alignment dynamic programming to compute the protein sequence expected distance between the target sequence and the sequence corresponding to each of the Q vectors, and take the subcellular location of the sequence with the highest expected distance from the target sequence as the predicted location.
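The three steps of the abstract can be sketched end to end as below. This is a rough illustration under several assumptions, not the patented implementation: the LSH scheme (Gaussian random projections with bucket width `width`), the table and hash counts, the match/mismatch/gap scores, and the toy sequences are all hypothetical, and a plain global alignment score stands in for the "expected distance", which the excerpt does not define.

```python
import math
import random
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aac(seq):
    """Amino acid composition: fraction of each of the 20 standard residues."""
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts) or 1
    return [c / total for c in counts]

class EuclideanLSH:
    """Assumed LSH scheme: Gaussian projections quantized into buckets."""
    def __init__(self, dim=20, n_tables=4, n_hashes=2, width=0.5, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.params = [
            [([rng.gauss(0, 1) for _ in range(dim)], rng.uniform(0, width))
             for _ in range(n_hashes)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    def _key(self, t, vec):
        return tuple(
            math.floor((sum(a * x for a, x in zip(proj, vec)) + b) / self.width)
            for proj, b in self.params[t]
        )

    def add(self, idx, vec):
        # Step (1): store the vector's index in every hash table.
        for t, table in enumerate(self.tables):
            table[self._key(t, vec)].append(idx)

    def candidates(self, vec):
        # Step (2): union of the matching buckets across all tables.
        found = set()
        for t, table in enumerate(self.tables):
            found.update(table.get(self._key(t, vec), []))
        return found

def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score with assumed linear gap penalties."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def predict(target, train, lsh, q=3):
    """Step (3): keep the Q nearest candidates, then pick the best alignment."""
    tv = aac(target)
    cand = lsh.candidates(tv)
    if not cand:
        return None  # no bucket matched in any table
    nearest = sorted(cand, key=lambda i: math.dist(tv, aac(train[i][0])))[:q]
    best = max(nearest, key=lambda i: global_alignment_score(target, train[i][0]))
    return train[best][1]

# Toy training set (made-up sequences and labels, for illustration only).
train = [
    ("MKKLLPTAAAGLLLLAAQPAMA", "Secreted"),
    ("MSDNEEDKRRGGSDEEKKRR", "Nuclear"),
    ("MLLAVLYCLLWSFQTSAGHF", "Membrane"),
]
lsh = EuclideanLSH()
for i, (seq, _) in enumerate(train):
    lsh.add(i, aac(seq))

print(predict(train[1][0], train, lsh))
```

Using several hash tables raises the chance that a truly similar vector shares at least one bucket with the query, while the Euclidean filter keeps the comparatively expensive alignment step limited to Q sequences.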

Description

Technical field

[0001] The invention belongs to the field of bioinformatics and relates to methods for predicting protein subcellular location using machine learning, specifically a method for predicting protein subcellular location using nearest neighbor retrieval.

Background

[0002] Protein subcellular localization refers to the specific location of a protein or gene expression product within a cell; the prediction task is to infer that location from a given protein sequence. The subcellular localization of a protein is closely related to its biological function. Knowledge of the subcellular location of proteins plays a vital role in biology, cell biology, pharmacology, and medicine. Although the subcellular localization of proteins can be determined experimentally, doing so is time-consuming and expensive. As the volume of sequenced genomic data increases, methods for predicting the subcellular localization of proteins become increasingly important, requiring a...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F19/18
Inventors: 薛卫, 王雄飞, 赵南, 任守纲
Owner: NANJING AGRICULTURAL UNIVERSITY