
Method for predicting protein solubility by using convolutional neural network

A convolutional neural network and protein technology, applied in the field of protein solubility prediction, that addresses the difficulty of applying SVM to large data sets and achieves the effects of avoiding overfitting and improving prediction accuracy

Active Publication Date: 2019-12-03
HENAN NORMAL UNIV

AI Technical Summary

Problems solved by technology

Among existing computational methods for predicting protein solubility, the support vector machine (SVM) is the most widely used. It performs well on small data sets but becomes difficult to apply as the amount of data grows. The convolutional neural network (CNN), a classic deep learning algorithm, can address this problem.


Examples


Embodiment Construction

[0039] The following embodiments will describe the present invention in detail with reference to the drawings. In the drawings or descriptions, similar or identical parts use the same symbols, and in practical applications, the shape, thickness or height of each component can be enlarged or reduced. The various embodiments listed in the present invention are only used to illustrate the present invention, and are not intended to limit the scope of the present invention. Any obvious modifications or changes made to the present invention do not depart from the spirit and scope of the present invention.

[0040] A method for predicting protein solubility by a convolutional neural network, comprising the following steps:

[0041] S1: Screen the protein data to exclude protein sequences containing residues other than the 20 standard amino acids;

[0042] S2: Use the CD-HIT tool to reduce data set redundancy;

[0043] S3: Calculate the 2-mer frequency of each protein sequence;

...
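Steps S1 and S3 can be sketched in a few lines of Python. This is an illustrative sketch only, not the patent's implementation: the names `AMINO_ACIDS`, `is_standard`, and `two_mer_frequency` are hypothetical, and the choices to count overlapping 2-mers and to normalize counts by the number of 2-mers in the sequence are assumptions, since the excerpt does not specify them.

```python
# Assumption: the 20 standard amino acids in one-letter code, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def is_standard(seq):
    """S1 (sketch): keep a sequence only if every residue is a standard amino acid."""
    return len(seq) > 0 and all(ch in INDEX for ch in seq)


def two_mer_frequency(seq):
    """S3 (sketch): 20x20 matrix of overlapping 2-mer frequencies.

    Row index = first residue of the pair, column index = second residue.
    Normalizing by the number of 2-mers is an assumption of this sketch.
    """
    matrix = [[0.0] * 20 for _ in range(20)]
    total = len(seq) - 1  # number of overlapping 2-mers
    if total < 1:
        return matrix
    for first, second in zip(seq, seq[1:]):
        matrix[INDEX[first]][INDEX[second]] += 1.0
    return [[count / total for count in row] for row in matrix]


# Made-up toy sequences; "MKXLA" contains the non-standard symbol X.
sequences = ["MKVLAA", "MKXLA", "ACAC"]
kept = [s for s in sequences if is_standard(s)]
features = [two_mer_frequency(s) for s in kept]
```

Each entry of `features` is a 20x20 matrix, the input shape that the abstract describes for the convolution step.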



Abstract

The invention discloses a method for predicting protein solubility with a convolutional neural network, and relates to the fields of bioinformatics and deep learning. The method comprises the following steps: computing the 2-mer frequencies of the data to obtain a 20×20 matrix, applying a convolution operation to the matrix to extract its hidden features, and feeding the obtained features into a deep neural network for training or prediction. The invention applies the convolutional neural network from deep learning to protein solubility prediction, so that solubility can be predicted from the primary structure of a protein. The method effectively avoids the overfitting that occurs when traditional machine learning is trained on a large data set, and a secondary classifier further improves prediction precision. The method thus improves the precision of protein solubility prediction while also simplifying feature extraction.
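The convolution step on the 20×20 matrix can be illustrated with a small, dependency-free Python sketch. Everything specific here is an assumption made for illustration, because the abstract gives no architecture details: the 3×3 kernel size, the four random kernels, the ReLU activation, and the single sigmoid output unit that stands in for the deep neural network. The function names `conv2d_valid` and `predict_soluble_probability` are hypothetical.

```python
import math
import random


def conv2d_valid(matrix, kernel):
    """Slide the kernel over the matrix (stride 1, no padding) and apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(matrix[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(max(0.0, s))  # ReLU is an assumption of this sketch
        out.append(row)
    return out


def predict_soluble_probability(matrix, kernels, weights, bias):
    """Flatten all feature maps and pass them through one sigmoid unit.

    The single unit is a stand-in for the deep network named in the abstract.
    """
    feats = [v for k in kernels for row in conv2d_valid(matrix, k) for v in row]
    z = sum(w * x for w, x in zip(weights, feats)) + bias
    return 1.0 / (1.0 + math.exp(-z))


rng = random.Random(0)  # untrained, random parameters for illustration only
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(4)]
n_features = 4 * 18 * 18  # 4 kernels, each giving an 18x18 map from a 20x20 input
weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]

# Placeholder input: a uniform 20x20 frequency matrix (not real protein data).
matrix = [[1.0 / 400] * 20 for _ in range(20)]
p = predict_soluble_probability(matrix, kernels, weights, 0.0)
```

In practice the kernels and weights would be learned from labeled data with a deep learning framework; the sketch only shows how a 20×20 input turns into convolutional features and a solubility score between 0 and 1.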

Description

Technical field

[0001] The invention relates to the fields of bioinformatics and deep learning, and in particular to a method for predicting protein solubility.

Background technique

[0002] Protein is the material basis of life: an organic macromolecule, the basic organic matter that constitutes a cell, and the main bearer of life activities. In human cells, protein accounts for about 80% of the intracellular substances other than water. Forming the components of body tissues and organs is therefore the most important physiological function of proteins. Proteins have many properties, such as solubility, hydrolysis, salting out, and denaturation, and solubility is one of the most important.

[0003] Computational methods for predicting protein solubility have been developed. Among them, the support vector machine (SVM) method is mostly used. This method has a good performance on small data sets, but as the amount of data increases, SVM i...


Application Information

IPC(8): G16B40/00, G06N3/04, G06N3/08
CPC: G16B40/00, G06N3/08, G06N3/045
Inventor: 王鲜芳, 杜志勇, 郜鹏, 刘依锋, 李鸿飞, 陆凡
Owner HENAN NORMAL UNIV