Database searching method and database searching system for open type protein identification

A protein identification and database search technology in the field of bioinformatics, with the stated effects of improving resolution, facilitating retrieval, and increasing scale.

Active Publication Date: 2014-05-21
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a database search method and system for open protein identification that lets users either leave the enzyme digestion and modification types unspecified or specify any of them, thereby solving the problem of identifying peptides with arbitrary types of enzyme digestion and modification.

Embodiment Construction

[0062] The present invention is described in detail below with reference to the accompanying drawings and specific embodiments, which are not intended to limit the present invention.

[0063] Figure 1 shows a flow chart of the database search method for open protein identification of the present invention. The specific steps of the process are as follows:

[0064] Step 101: set the necessary search parameters.

[0065] Step 102: input the protein sequences, digest each protein sequence in silico according to the specified enzyme cleavage type, sort all generated subsequences by mass, and generate a peptide sequence data table. The index file is built from this table.
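The digestion and mass-sorting part of step 102 can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the excerpt does not name an enzyme, so a trypsin-like rule (cleave after K or R unless followed by P) stands in for "the specified enzyme cleavage type", and the function names `digest`, `peptide_mass`, and `build_peptide_table` are hypothetical. The patent's actual index file format is not described in this excerpt, so the sketch stops at the sorted table.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the terminal H and OH


def digest(protein):
    """Split a protein with an ASSUMED trypsin-like rule:
    cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def build_peptide_table(proteins):
    """Return (mass, peptide, protein_id) rows sorted by mass."""
    rows = []
    for protein_id, sequence in proteins.items():
        for peptide in digest(sequence):
            rows.append((peptide_mass(peptide), peptide, protein_id))
    rows.sort()
    return rows


# Hypothetical toy input: two short made-up sequences.
table = build_peptide_table({"P1": "AKRPGK", "P2": "GGK"})
masses = [row[0] for row in table]
```

Keeping the table sorted by mass is what makes later range lookups cheap: a mass window maps to a contiguous slice of the table.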

[0066] Step 103: input the mass spectra, extract a certain number of spectral peaks from each mass spectrum to generate a query set, and then query the index file built in step 102 to obtain the query result. The query result is a sequence fragment, that is, a rel...
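As a rough illustration of the peak extraction and index lookup in step 103, the sketch below keeps the most intense peaks of a spectrum and looks up a mass window in a mass-sorted list by binary search. This is a sketch under stated assumptions, not the patented procedure: the excerpt does not say how peaks are selected or how a query value is matched against the index, so intensity ranking, the symmetric tolerance window, and the names `top_peaks` and `query_index` are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def top_peaks(spectrum, n):
    """Keep the n most intense (mz, intensity) peaks, sorted by m/z.
    ASSUMPTION: intensity ranking stands in for the unspecified
    peak selection rule."""
    strongest = sorted(spectrum, key=lambda peak: peak[1], reverse=True)[:n]
    return sorted(strongest)


def query_index(masses, entries, query_mass, tol):
    """Return all entries whose mass lies in [query_mass - tol, query_mass + tol].
    `masses` must be sorted ascending and aligned with `entries`."""
    lo = bisect_left(masses, query_mass - tol)
    hi = bisect_right(masses, query_mass + tol)
    return entries[lo:hi]


# Hypothetical toy data: a three-peak spectrum and a four-entry index.
spectrum = [(100.0, 5.0), (150.0, 50.0), (200.0, 20.0)]
query_set = top_peaks(spectrum, 2)

masses = [100.0, 200.0, 200.005, 300.0]
entries = ["A", "B", "C", "D"]
hits = query_index(masses, entries, 200.0, 0.01)
```

Because the index is sorted, each lookup costs two binary searches regardless of how many entries fall outside the window.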

Abstract

The invention relates to a database search method and a database search system for open protein identification. The method comprises: step 1, inputting protein sequences, digesting each protein sequence in silico, sorting all generated subsequences by mass, generating a peptide sequence data table, and building an index file from that table; step 2, inputting mass spectra, generating a query set by extracting spectral peaks from each mass spectrum, querying the index file, and obtaining a sequence set; step 3, generating candidate peptides from each mass spectrum and its corresponding sequence set, in combination with modifications, and scoring the candidate peptides; step 4, integrating the scoring results and inferring proteins from the peptides to obtain the identification result. The method lets users either leave the digestion and modification types unspecified or specify any of them, and thus solves the problem of identifying peptides with arbitrary types of digestion and modification.
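Steps 3 and 4 of the abstract can be pictured with a toy sketch: explain the gap between an observed precursor mass and an unmodified peptide mass with a known modification, then group accepted peptides by protein. This is only an illustration under assumptions. The abstract does not describe how modifications are combined, how candidates are scored, or how proteins are inferred, so the single-modification lookup, the score threshold, and the names `explain_delta` and `infer_proteins` are hypothetical. The three modification masses are standard monoisotopic values.

```python
# Monoisotopic mass shifts of three common modifications, in daltons.
MODIFICATIONS = {
    "Oxidation": 15.994915,
    "Carbamidomethyl": 57.021464,
    "Phospho": 79.966331,
}


def explain_delta(peptide_mass, precursor_mass, tol=0.01):
    """List modifications whose mass shift explains the difference
    between the observed precursor mass and the unmodified peptide mass.
    ASSUMPTION: at most one modification per peptide."""
    delta = precursor_mass - peptide_mass
    if abs(delta) <= tol:
        return ["unmodified"]
    return [name for name, shift in MODIFICATIONS.items()
            if abs(delta - shift) <= tol]


def infer_proteins(scored_hits, min_score):
    """Group peptides that pass a score threshold by their protein.
    scored_hits: iterable of (peptide, protein_id, score) tuples."""
    proteins = {}
    for peptide, protein_id, score in scored_hits:
        if score >= min_score:
            proteins.setdefault(protein_id, set()).add(peptide)
    return proteins


# Hypothetical toy hits with made-up scores.
hits = [("AK", "P1", 0.9), ("GR", "P1", 0.2), ("LK", "P2", 0.8)]
identified = infer_proteins(hits, min_score=0.5)
```

In an open search the modification type is not fixed in advance, which is why the sketch derives it from the mass difference instead of taking it as an input.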

Description

Technical field

[0001] The invention relates to the field of bioinformatics, in particular to a database search method and system for open protein identification.

Background art

[0002] Proteomics is the large-scale study of protein characteristics, including protein expression levels, post-translational modifications, and protein interactions. As a key technology in this field, biological mass spectrometry has developed rapidly in recent years. The qualitative and quantitative analysis of proteins using mass spectrometry data has become one of the core topics of proteomics research. Among the available approaches, database searching is the main method for analyzing mass spectrometry data in proteomics.

[0003] The process by which proteomic data is produced is relatively complex. Proteins are cleaved by enzymes, and the resulting sub-fragments are called peptides or peptide segments. In order to be detected by the mass spectrometer, lon...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G16B20/00
Inventor: 迟浩, 孙瑞祥, 王乐珩, 张文力, 贺思敏
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI