Method based on support vector machine for on-line prediction of interaction of protein and nucleic acid

Support vector machine and protein technology, applied in the field of bioinformatics, that addresses problems such as inability to use and achieves high accuracy, short prediction time and low cost
CN101630346A · Inactive · Publication Date: 2010-01-20 · SHANGHAI UNIV

Patent Information

Authority / Receiving Office: CN · China
Current Assignee / Owner: SHANGHAI UNIV
Publication Date: 2010-01-20
Estimated Expiration: Not applicable · inactive patent


Abstract

The invention discloses a support vector machine-based method for online prediction of protein-nucleic acid interactions. The method comprises the following steps: 1, establishing a training sample set from a protein sequence dataset; 2, converting the protein sequence dataset into a protein feature dataset; 3, training the support vector machine on the generated protein feature dataset; and 4, reading the protein sequence to be predicted, converting its data, and predicting online the classification type of its interaction with nucleic acid. The invention can predict whether a protein interacts with nucleic acid when that interaction has not been detected. Verification results show that the 10-fold cross-validation prediction accuracies for proteins that interact with rRNA, RNA and DNA reach 93.75 percent, 83.41 percent and 81.85 percent, respectively, and that the accuracies of the models on an external test set are 93.8 percent, 84.52 percent and 81.9 percent, respectively. During online prediction, the user only needs to submit the protein sequence on the prediction web page; the sequence data are converted, the support vector machine training and target-type prediction are carried out, and the prediction result is output.
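Steps 2 and the 10-fold cross-validation can be illustrated with a minimal Python sketch. The excerpt does not say how sequences are converted into features, so amino acid composition is used here purely as an assumed stand-in. The function names `amino_acid_composition` and `k_fold_indices` and the toy sequence are hypothetical. The resulting feature vectors could then be passed to any support vector machine implementation, which is not shown.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Assumption: composition features stand in for the sequence-to-feature
    conversion, which the excerpt does not specify.
    """
    # Ignore characters that are not standard residues (e.g. X, gaps).
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k (train, test) index pairs.

    Each sample index appears in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Assign index i to fold i % k so fold sizes differ by at most one.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    features = amino_acid_composition("MKRKAAKR")  # hypothetical toy sequence
    print(len(features), round(sum(features), 6))
    print(len(k_fold_indices(25, k=10)))
```

In a 10-fold run, a model would be trained on each `train` index list and scored on the matching `test` list, with the reported accuracy being the average over the ten folds.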

Description

technical field

[0001] The invention relates to a method, based on a support vector machine, for online prediction of the classification type of protein-nucleic acid (DNA, RNA, rRNA) interactions, and belongs to the field of bioinformatics.

background technique

[0002] Proteins that interact with nucleic acids play extremely important roles in many aspects of gene function. Proteins that interact with DNA play key roles in processes such as transcription, packaging, rearrangement, and repair. Proteins that interact with RNA regulate protein synthesis through their interactions with various RNAs. These proteins have therefore attracted extensive interest over the past three decades. Since the Human Genome Project, the number of determined protein sequences has grown steadily, and protein data resources have expanded rapidly. Determining protein-nucleic acid interactions experimentally would be time...
