Protein structural domain division method based on contact graph and fuzzy C-means clustering

A protein structural domain division technique using fuzzy C-means clustering, applied in sequence analysis and in character and pattern recognition. It addresses three problems of existing methods: a database search cannot guarantee a good template structure, the resulting domain division is not guaranteed to be optimal, and the search is time-consuming. The stated effect is improved division efficiency and precision.

Active Publication Date: 2019-08-23
ZHEJIANG UNIV OF TECH

AI Technical Summary

Problems solved by technology

Because ThreaDomEx must search an existing database, it cannot guarantee that every search returns a good template structure, and the search itself takes a long time. The domain division it produces is therefore not guaranteed to be optimal, and its division efficiency needs further improvement.
[0004] In summary, existing protein domain division methods still fall well short of the requirements of practical applications in both computational cost and division accuracy, and urgently need to be improved.



Examples


Embodiment Construction

[0028] The present invention will be further described below in conjunction with the accompanying drawings.

[0029] Referring to Figure 1 and Figure 2, a protein domain division method based on a contact map and fuzzy C-means clustering comprises the following steps:

[0030] 1) Input the sequence of the protein whose structural domains are to be divided, denoted as S;

[0031] 2) Use the RaptorX-Contact server (http://raptorx.uchicago.edu/ContactMap/) to predict the contact map of the protein sequence S, and record the predicted contact map as $M = (m_{i,j})_{L \times L}$, where $L$ is the number of residues in S and $m_{i,j} \in \{0, 1\}$ denotes the contact state between the i-th residue $R_i$ and the j-th residue $R_j$ of S: $m_{i,j} = 1$ means the two residues are in contact, and $m_{i,j} = 0$ means they are not;

[0032] 3) For each element $m_{i,j}$ of $M$, use a weight matrix $W$ with $2k+1$ rows and $2k+1$ columns:

[0033]

[0034] Perform the following processing to get

[003...
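The weighting in step 3 can be illustrated with a short sketch. The entries of W and the exact processing formula are not reproduced in this text, so the code below only assumes the general shape of such an operation: each element of M is replaced by the W-weighted sum of its (2k+1) x (2k+1) neighbourhood. The function name `weighted_contact_map`, the zero padding at the map border, and the uniform weights in the usage example are illustrative assumptions, not the patent's actual definitions.

```python
import numpy as np

def weighted_contact_map(M, W):
    """Replace each m[i, j] by the W-weighted sum of its (2k+1) x (2k+1) neighbourhood.

    Hypothetical helper: the patent's actual W and formula are not shown
    in the source text.
    """
    M = np.asarray(M, dtype=float)
    W = np.asarray(W, dtype=float)
    k = W.shape[0] // 2
    if W.shape != (2 * k + 1, 2 * k + 1):
        raise ValueError("W must be square with odd side length 2k+1")
    rows, cols = M.shape
    # Assumption: positions outside the contact map count as "no contact" (0).
    padded = np.pad(M, k, mode="constant")
    out = np.zeros_like(M)
    for di in range(2 * k + 1):
        for dj in range(2 * k + 1):
            # padded[i + di, j + dj] corresponds to M[i + di - k, j + dj - k].
            out += W[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out

# Usage with placeholder uniform weights (k = 1), on a toy 4-residue map.
M_toy = np.eye(4)
W_uniform = np.full((3, 3), 1.0 / 9.0)
M_weighted = weighted_contact_map(M_toy, W_uniform)
```

Smoothing of this kind turns the binary map into real-valued features, which is one plausible reason to apply it before clustering.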


Abstract

A protein structural domain division method based on a contact map and fuzzy C-means clustering comprises the following steps: first, predicting the contact map of the protein with the RaptorX-Contact server from the input sequence of the protein whose structural domains are to be divided; second, applying a weighting process to the contact map; third, clustering the weighted contact map with the fuzzy C-means clustering algorithm; fourth, dividing the protein into structural domains according to the clustering result; and finally, predicting the three-dimensional structure of each structural domain with the I-TASSER server. The method provided by the invention has a low computational cost and a high division precision.
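The clustering and division steps can be sketched as follows. This is a generic textbook fuzzy C-means, not the patent's exact procedure: the choice of each residue's row of the contact map as its feature vector, the fuzzifier m = 2, the fixed number of clusters c, and the helper names `fuzzy_c_means` and `segments` are all assumptions made for illustration.

```python
import numpy as np
from itertools import groupby

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns memberships U (n x c) and cluster centers."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)          # each row of memberships sums to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)         # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

def segments(labels):
    """Group consecutive residues with the same label into (start, end, label)."""
    out, start = [], 0
    for label, run in groupby(labels):
        n = len(list(run))
        out.append((start, start + n - 1, label))
        start += n
    return out

# Toy contact map with two blocks of three residues each (illustrative data only).
M_toy = np.kron(np.eye(2), np.ones((3, 3)))
U, _ = fuzzy_c_means(M_toy, c=2)
labels = U.argmax(axis=1).tolist()             # hard assignment by highest membership
domains = segments(labels)
```

Assigning each residue to its highest-membership cluster and merging consecutive residues is only one simple way to turn memberships into domain boundaries; the source text does not show which rule the patent uses.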

Description

Technical field

[0001] The invention relates to the fields of bioinformatics, pattern recognition and computer applications, and in particular to a protein structural domain division method based on a contact map and fuzzy C-means clustering.

Background art

[0002] In living organisms, proteins often exist as multiple domains in order to carry out complex biological functions. Each protein domain can perform a specific biological function independently of the rest of the protein. During the evolution of protein molecules, protein domains can recombine in different arrangements, producing proteins with different functions. Accurate division of protein domains therefore helps the study of protein function and the design of drug target proteins, and is of considerable guiding significance.

[0003] At present, the methods specially used for protein domain division are: FIEFDom (Bondugula R, et al. FIEFDom: a transparent domain boundary recognitio...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G16B25/10, G16B30/20
CPC: G16B25/10, G16B30/20, G06F18/2321
Inventor: 胡俊, 饶亮, 刘俊, 周晓根, 陈伟锋, 张贵军
Owner: ZHEJIANG UNIV OF TECH