Integrated learning method for recognizing ECM (extracellular matrix) protein

An integrated learning method, applied to the identification of extracellular matrix proteins, that addresses problems such as data set imbalance, mitigates the curse of dimensionality, and improves classifier performance.

Inactive Publication Date: 2015-02-04
SHANDONG UNIV

AI Technical Summary

Problems solved by technology

[0004] In order to solve the deficiencies in the prior art, the present invention discloses an integrated learning method for identifying extracellular matrix proteins, the purpose of which is to solve the probl



Examples


Embodiment Construction

[0042] The present invention is described in detail below with reference to the accompanying drawings:

[0043] To establish computational methods for identifying the functional properties of proteins, protein sequences must first be represented as numerical feature vectors. Figure 1 shows the feature construction strategy of the present invention. Based on sequence composition, physicochemical properties, evolutionary information and structural information, the present invention adopts 10 feature construction methods to map protein sequences into numerical feature vectors of dimension 315. Each feature construction strategy is explained in turn below.
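As a minimal illustration of what mapping a protein sequence into a numerical feature vector can look like, the sketch below computes plain amino acid composition (the frequency of each of the 20 standard residues). This is an assumption chosen for illustration only: the source does not state that amino acid composition is one of its 10 feature construction methods, and the function name `amino_acid_composition` is hypothetical.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return a 20-dimensional vector of residue frequencies.

    Illustrative only: not necessarily one of the patent's 10 feature types.
    Non-standard characters (for example 'X') are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical example sequence, not taken from the source.
vector = amino_acid_composition("MKVLAAGIVGLLLA")
print(len(vector))
```

Several such per-sequence vectors, each produced by a different strategy, could then be concatenated into a single feature vector of the kind described above.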

[0044] 1. Feature construction strategies based on sequence composition

[0045] (I) Frequency of functional groups

[0046] The side chains of amino acids play an important role in the structural folding and stabilization of proteins. Based on the chemical groups of the side chains, the present invention d...


Abstract

The invention discloses an integrated learning (ensemble learning) method for recognizing ECM (extracellular matrix) proteins. The method proceeds as follows. First, the data sets are built: a training sample set and an independent test sample set of ECM protein sequences. Each protein sequence in the training sample set is mapped into a numeric feature vector. A relatively effective feature subset is then selected by an information gain ratio-incremental feature selection method, and an integrated classifier model is built by an integrated learning method, which addresses the problem of data set imbalance. The independent test sample set is likewise mapped into numeric feature vectors, the category of each test sample is obtained by majority voting over the prediction results of the integrated learning method, and the performance of the prediction system is finally evaluated using the prediction results for the test samples. The invention also discloses a network server system for recognizing ECM proteins. Users do not need to understand the concrete execution process of ECM protein recognition; they obtain the prediction result simply by inputting the protein sequence to be predicted.
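The combination of an ensemble built on an imbalanced data set with majority voting can be sketched as follows. The abstract does not say how the balanced training subsets are formed or which base classifier is used, so this sketch assumes random undersampling of the majority class and a simple nearest-centroid base classifier; all function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def train_centroid(samples, labels):
    """Fit a nearest-centroid base classifier: one mean vector per class."""
    sums, counts = {}, Counter(labels)
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_centroid(model, x):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


def train_balanced_ensemble(pos, neg, n_models=5, seed=0):
    """Train one base classifier per balanced subset.

    Assumption (not from the source): the majority class is randomly
    undersampled to the size of the minority class for each ensemble member.
    """
    rng = random.Random(seed)
    if len(pos) <= len(neg):
        minority, majority, min_label, maj_label = pos, neg, 1, 0
    else:
        minority, majority, min_label, maj_label = neg, pos, 0, 1
    models = []
    for _ in range(n_models):
        subset = rng.sample(majority, len(minority))
        samples = minority + subset
        labels = [min_label] * len(minority) + [maj_label] * len(subset)
        models.append(train_centroid(samples, labels))
    return models


def predict_majority(models, x):
    """Combine the base classifiers' predictions by majority vote."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy, made-up data: 3 positive (ECM-like) and 12 negative feature vectors.
pos = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
neg = [[0.05 * i, -0.05 * i] for i in range(12)]
models = train_balanced_ensemble(pos, neg, n_models=5)
print(predict_majority(models, [0.95, 1.0]))
```

Using an odd number of ensemble members avoids tied votes in the two-class case. In practice the base classifier would be trained on the selected feature subset rather than on raw two-dimensional toy vectors.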

Description

Technical field

[0001] The invention relates to the field of protein functional attribute recognition, in particular to an integrated learning method for recognizing extracellular matrix proteins.

Background art

[0002] The extracellular matrix (ECM) is the microenvironment in which cells and tissues survive, and it plays an important role in regulating cell behavior and tissue properties. The powerful biological functions of the ECM are attributed to the diversity of ECM proteins. The composition and dynamic changes of ECM proteins broadly affect cell proliferation, differentiation and migration, tissue morphogenesis and differentiation, and other life phenomena. Meanwhile, dysfunction of ECM proteins can lead to numerous diseases. Proteoglycans and collagens are the main components of ECM proteins. Among them, proteoglycans regulate physiological activities such as tissue repair, tumor growth, cell adhesion, proliferation, and migration; c...


Application Information

IPC(8): G06F19/24, G06F19/18
Inventors: 张承进, 杨润涛, 高瑞, 张丽娜
Owner: SHANDONG UNIV