Method for semi-supervised learning prediction of protein secondary structure

A semi-supervised learning technique for protein secondary structure prediction, applied in neural learning methods, analysis of two-dimensional or three-dimensional molecular structures, informatics, etc.

Pending Publication Date: 2020-02-28
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

Labeling protein secondary structure requires considerable manpower, financial resources, and time.



Embodiment Construction

[0017] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0018] The invention provides a method for semi-supervised learning and prediction of protein secondary structure, as shown in Figure 1. The details are as follows:

[0019] 1. Obtain the protein dataset

[0020] First, a dataset must be obtained. This example uses the CullPDB dataset, which consists of 6133 proteins, each with 39900 features. The 6133 proteins × 39900 features can be reshaped into 6133 proteins × 700 amino acid positions × 57 features per position (700 × 57 = 39900). The following table lists the protein secondary structure classes and the frequency of each class:

[0021] Description of protein secondary structure classes and class frequencies in datase...
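
The reshape described in [0020] can be sketched as follows. This is a minimal illustration, not the patent's code: it assumes each flat row stores the 57 features of one amino acid position contiguously (row-major order), and it uses synthetic integers in place of real CullPDB values. The function name `reshape_protein` is hypothetical.

```python
N_RESIDUES = 700   # amino acid positions per protein, from the text
N_FEATURES = 57    # features per position, from the text


def reshape_protein(flat_row, n_residues=N_RESIDUES, n_features=N_FEATURES):
    """Split one flat feature row into n_residues rows of n_features values.

    Assumes row-major layout: values 0..56 belong to position 0,
    values 57..113 to position 1, and so on.
    """
    expected = n_residues * n_features
    if len(flat_row) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat_row)}")
    return [flat_row[i * n_features:(i + 1) * n_features]
            for i in range(n_residues)]


# Synthetic stand-in for one protein row (NOT real CullPDB data).
flat = list(range(N_RESIDUES * N_FEATURES))
protein = reshape_protein(flat)
print(len(flat), len(protein), len(protein[0]))  # 39900 700 57
```

Plain lists keep the sketch dependency-free. With NumPy, the same row-major regrouping of the whole dataset would be `data.reshape(-1, 700, 57)`.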


Abstract

The invention discloses a method for semi-supervised learning prediction of protein secondary structure. The method comprises the following steps: (1) acquiring a protein sequence data set; (2) performing data cleaning and feature extraction on the acquired data set; (3) establishing a Simi-GAN neural network model; (4) training the Simi-GAN neural network model; (5) adjusting the parameters of the Simi-GAN neural network model; (6) evaluating the Simi-GAN neural network model. With this method, a semi-supervised prediction model can be established for protein secondary structure even when a large number of labels are missing, thereby saving a large amount of manpower and financial resources.
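
One generic way to train on data with missing labels is to compute the supervised loss only over positions that have a label. The sketch below illustrates that idea with a masked cross-entropy. It is an assumption for illustration only and is not the Simi-GAN model or its loss, which this text does not specify. The function name `masked_cross_entropy` and the use of `None` to mark a missing label are hypothetical.

```python
import math


def masked_cross_entropy(probs, labels):
    """Mean negative log-likelihood over labeled positions only.

    probs:  list of per-position class probability lists
    labels: list of class indices, with None where the label is missing
    """
    losses = [-math.log(p[y]) for p, y in zip(probs, labels) if y is not None]
    if not losses:
        return 0.0  # no labeled positions, so nothing contributes
    return sum(losses) / len(losses)


# Three positions, three classes; the middle position has no label.
probs = [[0.5, 0.25, 0.25], [0.1, 0.8, 0.1], [0.25, 0.25, 0.5]]
labels = [0, None, 2]
loss = masked_cross_entropy(probs, labels)
print(round(loss, 4))  # 0.6931
```

The unlabeled position is skipped entirely, so its prediction does not affect the supervised loss. A semi-supervised model would normally add a separate unsupervised term for such positions.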

Description

Technical field

[0001] The present invention relates to the fields of bioinformatics and deep learning. It uses a deep learning model to perform semi-supervised learning and prediction of protein secondary structure, a key research problem in bioinformatics prediction. Specifically, it relates to a method for semi-supervised learning and prediction of protein secondary structure that uses protein data sets with missing labels to train a deep learning classification model to predict protein secondary structure.

Background art

[0002] Protein secondary structure prediction is the inference of the secondary structure of protein fragments from their amino acid sequences. In bioinformatics and theoretical chemistry, protein secondary structure prediction is important for medicine and biotechnology, for example in drug design and the design of novel enzymes. Since secondary structures can be used to find long-range relationships of proteins with non-alignabl...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B15/20, G16B40/20, G06N3/04, G06N3/08
CPC: G16B15/20, G16B40/20, G06N3/08, G06N3/047, G06N3/048, G06N3/045
Inventor: 宫秀军, 赵兴海
Owner: TIANJIN UNIV