Protein classification method

A protein classification method, applied in the field of protein data classification. It addresses problems such as uncertainty about which architectures and data representations perform well and the difficulty of calculating the volume, and it achieves high-speed, high-precision classification.

Pending Publication Date: 2020-05-29
QINGDAO NAT LAB FOR MARINE SCI & TECH DEV CENT

AI Technical Summary

Problems solved by technology

However, CNN-based approaches to protein data classification currently face two main challenges: 1) while extending the basic CNN approach to volumetric data is conceptually straightforward, it is unclear which architectures and data representations will yield good performance;
[0006] 2) the volume is difficult to calculate.


Examples

Embodiment Construction

[0024] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

[0025] The protein classification method proposed by the present invention, as shown in Figure 1, includes the following steps:

[0026] Step S1: Place the three-dimensional protein model into a voxel occupancy grid of size N*N*N.
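As a rough illustration of Step S1, the sketch below maps a set of 3D points into an N*N*N binary occupancy grid. It is a minimal sketch under stated assumptions, not the patented procedure: the function name `voxelize`, the representation of the model as a list of (x, y, z) points, and the uniform bounding-box scaling are all hypothetical choices made for the example.

```python
def voxelize(coords, n):
    """Map 3D points into an n*n*n binary occupancy grid (nested lists).

    Assumption: the model is given as a list of (x, y, z) points and is
    scaled uniformly so that its bounding box fits inside the grid.
    """
    if not coords:
        raise ValueError("coords must not be empty")
    mins = [min(p[i] for p in coords) for i in range(3)]
    maxs = [max(p[i] for p in coords) for i in range(3)]
    # One uniform scale for all axes keeps the shape undistorted.
    extent = max(maxs[i] - mins[i] for i in range(3)) or 1.0
    grid = [[[0] * n for _ in range(n)] for _ in range(n)]
    for p in coords:
        idx = []
        for i in range(3):
            t = (p[i] - mins[i]) / extent       # normalized to [0, 1]
            idx.append(min(int(t * n), n - 1))  # clamp the upper edge
        grid[idx[0]][idx[1]][idx[2]] = 1
    return grid


# Three illustrative points placed into a 4*4*4 grid.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 3.0, 3.0)]
grid = voxelize(atoms, 4)
occupied = sum(v for plane in grid for row in plane for v in row)
```

A single scale factor is used for all three axes so that the protein's proportions are preserved; per-axis scaling would fill the grid more fully but distort the shape.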

[0027] In the embodiment of the present invention, the three-dimensional protein model data consist of 2267 independent protein structures, distributed across 107 classes, selected from the protein structure reference database. Of these classes, 18 contain only a single protein, and the largest class contains 110 proteins. The average class size is 21.18 proteins.

[0028] The proteins in the dataset are in PDB format. All generated PDB files are cleaned and prepared to produce the final PDB-format dataset: water molecules are removed, and missing atoms are added.
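The water-removal part of this cleaning step could look roughly like the following. This is a hedged sketch, not the tooling used in the embodiment: `strip_waters` and `WATER_NAMES` are hypothetical names, the set of water residue names is an assumption, and adding missing atoms is not attempted here because it needs a dedicated structure-preparation tool. The only format detail relied on is that, in PDB ATOM/HETATM records, the residue name occupies columns 18-20.

```python
# Residue names commonly used for water; this set is an assumption.
WATER_NAMES = {"HOH", "WAT", "H2O"}


def strip_waters(pdb_text):
    """Return the PDB text with water ATOM/HETATM records removed."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        # Residue name occupies columns 18-20 (0-based slice 17:20).
        is_atom = record in ("ATOM", "HETATM")
        if is_atom and line[17:20].strip() in WATER_NAMES:
            continue
        kept.append(line)
    return "\n".join(kept)


# Illustrative records only; coordinates are made up.
sample = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614",
    "HETATM    2  O   HOH A 101      30.000  20.000   5.000",
    "END",
])
cleaned = strip_waters(sample)
```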

[0029] In the embodi...

Abstract

The invention discloses a protein classification method. The method comprises the following steps: placing a three-dimensional protein model into a voxel occupancy grid of size N*N*N; projecting N*N rays into the voxel occupancy grid, with each ray passing through the centers of N voxels; constructing an occupancy model based on how the rays intersect the three-dimensional protein model; and classifying the proteins from the occupancy model using a three-dimensional convolutional neural network. The voxel occupancy grid provides a volumetric representation of the protein structure, machine learning is used to build a fast and accurate classifier directly from the raw protein volume data, and high-speed, high-precision classification is achieved on a protein dataset.
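The ray-projection step can be sketched as follows. The abstract does not say how ray intersections are turned into occupancy values, so the rule used here is an assumption: every voxel between the first and last hit on a ray is marked occupied. The function name `fill_along_rays`, the choice of the z axis as the ray direction, and the input format (a binary grid marking the voxels the model touches) are likewise hypothetical.

```python
def fill_along_rays(surface, n):
    """Cast n*n rays along the z axis, one per (x, y) column.

    Each ray visits the n voxel centers of its column. Assumed rule (not
    stated in the source): voxels between the first and last hit on a
    ray are treated as occupied.
    """
    occ = [[[0] * n for _ in range(n)] for _ in range(n)]
    for x in range(n):
        for y in range(n):
            hits = [z for z in range(n) if surface[x][y][z]]
            if hits:
                for z in range(hits[0], hits[-1] + 1):
                    occ[x][y][z] = 1
    return occ


# Toy input: one column hit at both ends, one column hit once.
n = 4
surface = [[[0] * n for _ in range(n)] for _ in range(n)]
surface[1][1][0] = 1
surface[1][1][3] = 1
surface[2][2][2] = 1
model = fill_along_rays(surface, n)
filled = sum(v for plane in model for row in plane for v in row)
```

The resulting nested list has the N*N*N shape that a three-dimensional convolutional network would take as input; the network itself is outside the scope of this sketch.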

Description

Technical field

[0001] The invention belongs to the technical field of protein data classification, and in particular relates to a protein classification method.

Background

[0002] The goals of structural biology include a comprehensive understanding of the molecular shapes and forms adopted by biological macromolecules, and extending this to understand how most biological processes are carried out using different molecular structures. Among these macromolecules, proteins are key effectors involved in most processes, with dynamically complex surfaces. They can consist of tens of thousands of atoms, and local (residue side chain) or global (loop or domain) structural changes at the atomic scale greatly affect their global and local shapes.

[0003] Since the structure of proteins is linked to their function, and disruption of their interactions can lead to disease states, characterizing their shape is important to help identify potential binders, such as othe...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/00, G16B40/00, G16B15/00, G06K9/62, G06N3/04
CPC: G16B20/00, G16B40/00, G16B15/00, G06N3/045, G06F18/241
Inventor: 魏志强, 聂婕, 聂为之, 刘安安, 苏育挺
Owner: QINGDAO NAT LAB FOR MARINE SCI & TECH DEV CENT