Molecular recognition feature function prediction method based on ensemble learning

An ensemble learning technique for predicting the function of molecular recognition features, applied in the field of bioinformatics. It addresses the complex experimental procedures, high resource consumption and high cost of existing approaches, and offers a simpler experimental procedure, lower resource consumption and lower cost.

Pending Publication Date: 2022-01-14
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide an ensemble learning based method for predicting the function of protein molecular recognition features, to solve the problems of complex experimental procedures, high resource consumption and high cost in the prior art.



Examples


Embodiment Construction

[0025] The specific embodiments and effects of the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0026] Referring to Figure 1, the implementation steps of this example are as follows:

[0027] Step 1. Obtain data on intrinsically disordered proteins and their functional annotations, and preliminarily screen the protein sequences based on the functional annotations.

[0028] 1.1) Download the 2020_12 release of the DisProt database from its public website. It contains 1590 intrinsically disordered protein sequences and 7 corresponding functional annotations: entropic chain, biological condensation, molecular recognition assembler, molecular recognition chaperone, molecular recognition display site, molecular recognition effector and molecular recognition scavenger, in which molecular recognition assembler, molecular recognition chaperone, molecular recognition display site, molecular recog...
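The screening in Step 1 can be sketched roughly as follows. This is a minimal illustration, not the patent's procedure: the record layout (`id`, `sequence`, `regions`), the helper name `screen_by_function` and the toy data are all hypothetical, and because the source text is truncated, the set of annotations to retain is passed in as a parameter rather than fixed.

```python
def screen_by_function(records, wanted_functions):
    """Keep records that have at least one region with a wanted annotation.

    `records` is a hypothetical in-memory layout, not the DisProt schema:
    each record is a dict with "id", "sequence" and a list of "regions",
    where every region has 1-based "start"/"end" and a "function" label.
    """
    wanted = {name.lower() for name in wanted_functions}
    kept = []
    for record in records:
        # Drop regions whose annotation is not of interest.
        regions = [r for r in record["regions"] if r["function"].lower() in wanted]
        if regions:
            kept.append({**record, "regions": regions})
    return kept


# Toy records, invented purely for illustration.
records = [
    {
        "id": "toy1",
        "sequence": "M" * 30,
        "regions": [
            {"start": 1, "end": 15, "function": "molecular recognition effector"},
            {"start": 20, "end": 30, "function": "other annotation"},
        ],
    },
    {
        "id": "toy2",
        "sequence": "A" * 20,
        "regions": [{"start": 1, "end": 20, "function": "other annotation"}],
    },
]

# Assumed subset of labels to retain; the full list is elided in the source.
wanted = [
    "molecular recognition assembler",
    "molecular recognition effector",
    "molecular recognition scavenger",
]

kept = screen_by_function(records, wanted)
print([record["id"] for record in kept])
```

Passing the retained annotations as an argument keeps the sketch usable whichever subset the full specification actually selects.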


Abstract

The invention discloses a molecular recognition feature function prediction method based on ensemble learning. It mainly solves the problem that existing molecular recognition feature predictors cannot further subdivide the functions of molecular recognition features. The method comprises the following steps: downloading intrinsically disordered protein data and their functional annotations; dividing the data into training data and test data; representing the protein sequences as features and designing residue labels for the protein sequences; selecting machine learning models that take a single input and follow a binary relevance strategy; training the different machine learning models on the training data; integrating the training results of the different machine learning models with an ensemble strategy to obtain a prediction model; and inputting the protein sequence data under study into the prediction model, which outputs the predicted molecular recognition feature functions of the protein. The method has a simple experimental procedure, low resource consumption, low cost and highly reliable prediction results. It can be used to predict molecular recognition features in protein sequences and provides a reference for the sites at which drug targets act.
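Two of the steps in the abstract, residue labels under a binary relevance strategy and integration of several models, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not say how the labels are encoded, which models are trained or which integration strategy is used. The sketch encodes each residue as one 0/1 flag per function and combines models by soft voting (averaging probabilities, then thresholding); the names `residue_tags`, `soft_vote` and `FUNCTIONS` are hypothetical.

```python
# Illustrative subset of function labels (assumption, not the patent's list).
FUNCTIONS = ["assembler", "effector", "scavenger"]


def residue_tags(length, regions, functions=FUNCTIONS):
    """Binary relevance labels: one 0/1 flag per residue and per function.

    `regions` holds (start, end, function) tuples, 1-based and inclusive.
    Each column can then be learned as an independent binary problem.
    """
    tags = [[0] * len(functions) for _ in range(length)]
    for start, end, function in regions:
        col = functions.index(function)
        for pos in range(start - 1, min(end, length)):
            tags[pos][col] = 1
    return tags


def soft_vote(model_probs, threshold=0.5):
    """Average per-residue, per-function probabilities over all models.

    `model_probs` has shape [model][residue][function]. Soft voting is one
    possible integration strategy, chosen here only as an example.
    """
    n_models = len(model_probs)
    n_residues = len(model_probs[0])
    n_functions = len(model_probs[0][0])
    averaged = [
        [sum(m[i][j] for m in model_probs) / n_models for j in range(n_functions)]
        for i in range(n_residues)
    ]
    predicted = [[1 if p >= threshold else 0 for p in row] for row in averaged]
    return averaged, predicted


# A 5-residue toy sequence with two overlapping annotated regions.
tags = residue_tags(5, [(2, 4, "effector"), (4, 5, "scavenger")])

# Made-up probabilities from two hypothetical models for two residues.
model_a = [[0.9, 0.2, 0.1], [0.4, 0.7, 0.3]]
model_b = [[0.7, 0.4, 0.1], [0.2, 0.5, 0.1]]
averaged, predicted = soft_vote([model_a, model_b])

print(tags)
print(predicted)
```

Because a residue may carry several functions at once, as residue 4 does in the toy example, treating each function as its own binary column avoids forcing a single class per residue.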

Description

[0001] The invention belongs to the technical field of bioinformatics, and in particular relates to a method for predicting the function of molecular recognition features. It can be used to predict the function of molecular recognition features in protein sequences and to provide a reference for the sites at which drug targets act.

Background technology

[0002] Molecular recognition features are regions of 10 to 70 residues within the intrinsically disordered regions of proteins that change from disordered to ordered after binding to their partners. The partners include carbohydrates, ions, lipids, nucleic acids, proteins and small molecules. The functions of molecular recognition features include molecular recognition assembler, molecular recognition scavenger, molecular recognition effector, molecular recognition display sites (molecular recognition dis...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16C20/20, G06K9/62, G06N3/08, G06N3/12, G06N20/10
CPC: G16C20/20, G06N3/126, G06N3/084, G06N20/10, G06F18/24323
Inventor: 鱼亮, 李浩铮
Owner: XIDIAN UNIV