Residue contact information self-learning-based protein structure prediction method

A technique relating to protein structure and residue contact information, applied in bioinformatics for analyzing two-dimensional or three-dimensional molecular structures. It addresses problems such as reduced prediction accuracy, oversampling, and the inability to effectively capture long-range interactions between residues, with the aim of improving prediction accuracy.

Active Publication Date: 2019-01-15
ZHEJIANG UNIV OF TECH

AI Technical Summary

Problems solved by technology

However, the knowledge-based energy function of this method cannot effectively capture long-range interactions between residues, and when predicting a target protein with a long sequence, switching between stages under a fixed cost is likely to ...



Examples


Embodiment Construction

[0042] The present invention will be further described below in conjunction with the accompanying drawings.

[0043] Referring to Figures 1 to 3, a protein structure prediction method based on residue contact information self-learning includes the following steps:

[0044] 1) Given the input sequence information, use the Robetta server to obtain the fragment library of the sequence;

[0045] 2) Use RaptorX-Contact to predict the contact map of the sequence and obtain contact predictions for N residue pairs. Here, a contact means that the Cα-Cα Euclidean distance of the pair is less than a given threshold. The contact probability of the k-th residue pair in the contact map is denoted P_k, k ∈ {1,...,N};

[0046] 3) Initialization: set the population size NP, the information entropy threshold α, and the maximum numbers of iterations G1 and G2 for the first and second stages of the population, respectively; then, according to the input sequence, execute the first and second phases of the Rosetta Abinitio...
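The contact definition in step 2 can be illustrated with a minimal Python sketch. It is not the patented procedure: the function names, the `cutoff` parameter (the distance threshold is not stated in this text), and the probability-weighted satisfaction measure are all assumptions made for illustration. It only shows how a conformation's Cα coordinates could be checked against predicted contacts, each carrying its probability P_k.

```python
import math


def ca_distance(a, b):
    """Euclidean distance between two C-alpha coordinates given as (x, y, z)."""
    return math.dist(a, b)


def in_contact(a, b, cutoff):
    """True if the C-alpha/C-alpha distance is below `cutoff`.

    `cutoff` is a hypothetical parameter: the threshold value is not
    given in the text above, so the caller must supply it.
    """
    return ca_distance(a, b) < cutoff


def contact_satisfaction(coords, predicted, cutoff):
    """Probability-weighted fraction of predicted contacts that a conformation satisfies.

    coords    -- list of C-alpha coordinates, indexed by residue
    predicted -- list of (i, j, p_k) tuples: residue indices and contact probability
    This measure is an illustrative assumption, not a formula from the source.
    """
    total = sum(p for _, _, p in predicted)
    if total == 0:
        return 0.0
    satisfied = sum(
        p for i, j, p in predicted if in_contact(coords[i], coords[j], cutoff)
    )
    return satisfied / total
```

Keeping the threshold as an explicit argument avoids hard-coding a value that the text does not supply.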


Abstract

The invention relates to a protein structure prediction method based on residue contact information self-learning. A fragment library and a contact map are obtained through Robetta and RaptorX-Contact. In the first stage of population evolution, the distance distribution of residue pairs is learned, and an information entropy index is established to reflect the degree of convergence of the population, so that learning proceeds automatically. In the second stage of population evolution, a scoring function is built from the learned residue pair distance distribution to assist the energy function in searching the conformation space. The final prediction result is obtained through clustering. The method learns residue pair distance information automatically to assist the energy function in conformation space optimization, and the information entropy index enables dynamic switching between the two stages.
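The two ideas in the abstract, an entropy index for population convergence and a score built from a learned distance distribution, can be sketched roughly as follows. This is a hedged illustration and not the patent's formulas: the binning scheme (`bin_width`, `n_bins`), the use of mean Shannon entropy over residue pairs, the switching rule (entropy below α, or G1 iterations reached), and the negative-log-probability score with a `floor` value are all assumptions introduced here.

```python
import math


def distance_histogram(distances, bin_width, n_bins):
    """Normalized histogram of one residue pair's distances across the population.

    Distances beyond the last bin are clamped into it.
    """
    if not distances:
        raise ValueError("need at least one distance")
    counts = [0] * n_bins
    for d in distances:
        idx = min(int(d // bin_width), n_bins - 1)
        counts[idx] += 1
    return [c / len(distances) for c in counts]


def shannon_entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def population_entropy(pair_distances, bin_width, n_bins):
    """Mean entropy over residue pairs; low values suggest a converged population.

    pair_distances -- one list of distances per residue pair, collected
                      over all individuals in the population
    """
    hists = [distance_histogram(d, bin_width, n_bins) for d in pair_distances]
    return sum(shannon_entropy(h) for h in hists) / len(hists)


def should_switch(entropy, alpha, generation, g1):
    """Assumed rule: leave stage one when entropy drops below alpha or G1 is reached."""
    return entropy < alpha or generation >= g1


def distance_score(distance, hist, bin_width, floor=1e-6):
    """Negative log-probability of a distance under a learned histogram.

    Lower is better; `floor` keeps empty bins from producing log(0).
    """
    idx = min(int(distance // bin_width), len(hist) - 1)
    return -math.log(max(hist[idx], floor))
```

In this sketch, a population whose sampled distances for every residue pair fall into a single bin has entropy 0, which would trigger the switch for any positive α.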

Description

Technical field

[0001] The invention relates to the fields of bioinformatics, intelligent optimization, and computer applications, and in particular to a protein structure prediction method based on residue contact information self-learning.

Background

[0002] A protein is a biological macromolecule with a specific spatial structure. It is formed when a polypeptide chain, built from amino acids joined by dehydration condensation, folds, and this structure allows it to perform a specific function in an organism. The three-dimensional structure of proteins is of great importance in drug design, protein engineering, and biotechnology. At present, millions of protein sequences have been determined, but most of the corresponding structures are unknown. Therefore, protein structure prediction is an important research problem.

[0003] The gap between protein sequence and structure is mainly due to the rapid development of sequencing technology and relatively slow progress ...


Application Information

IPC(8): G16B15/20
Inventor: 张贵军, 谢腾宇, 马来发, 周晓根, 王柳静, 郝小虎
Owner: ZHEJIANG UNIV OF TECH