Method for building a two-stage fitting quantitative structure-activity relationship (QSAR) model for predicting compound activity

A method for constructing a fitted model, applied in the field of biomedical information. It addresses problems such as degraded model performance, modeling failure, and non-convergence, and it uses fewer independent variables, prevents modeling failure, and is easy to interpret.

Inactive Publication Date: 2013-02-13
SOUTH CHINA AGRI UNIV

AI Technical Summary

Problems solved by technology

If the information gain contained in the independent variable is insufficient, it is difficult for the built model to have a good predictive ability. However, although increasing the num


Examples


Embodiment 1

[0039] As shown in Figure 1, the linear regression-neural network two-stage fitting QSAR model of this embodiment is constructed in the following steps:

[0040] 1) Collation of biological activity data

[0041] To ensure statistical validity, 35 pyrazole compounds with known p38 kinase inhibition rates were taken as the training set S1, and the inhibition rate α of each compound was converted into logarithmic form: $Y_1 = \mathrm{LgBio} = -\lg(\alpha^{-1} - 1)$. Y1 = LgBio is the dependent variable used in subsequent modeling. Sybyl software was used to check the two-dimensional structures of the compounds, and three-dimensional structures were generated for the compounds that passed the check.
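The logarithmic conversion above can be sketched in a few lines of Python. This is a minimal illustration, assuming α is expressed as a fraction strictly between 0 and 1 (a percentage would first be divided by 100); the function names are hypothetical and do not come from the patent or from Sybyl.

```python
import math


def inhibition_to_lgbio(alpha: float) -> float:
    """Convert an inhibition rate alpha (fraction in (0, 1)) to LgBio = -lg(1/alpha - 1)."""
    if not 0.0 < alpha < 1.0:
        # The transform is undefined at 0 and 1 (division by zero or log of zero).
        raise ValueError("alpha must lie strictly between 0 and 1")
    return -math.log10(1.0 / alpha - 1.0)


def lgbio_to_inhibition(lgbio: float) -> float:
    """Inverse transform: recover the inhibition rate from LgBio."""
    return 1.0 / (1.0 + 10.0 ** (-lgbio))


# Illustrative values only (not data from the patent).
for alpha in (0.2, 0.5, 0.9):
    print(alpha, inhibition_to_lgbio(alpha))
```

The transform is equivalent to lg(α / (1 - α)), so a 50% inhibition rate maps to LgBio = 0, and rates above or below 50% map to positive or negative values respectively.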

[0042] 2) Construction of the front-stage fitting model

[0043] The training set compounds S1 are imported into the Sybyl molecular table S1.tbl, and in the Topomer CoMFA module the compounds of training set S1 are split into substituents. On the one hand, the division of substituent...

Embodiment 2

[0047] This embodiment measures goodness of fit, comparing the M1-M2 two-stage model with the M1 single-stage model. The specific steps are as follows:

[0048] 1) Variable naming

[0049] The activity calculated by model M1 for the compounds of training set S1 is named Y2.

[0050] The activity calculated by model M2 for the compounds of training set S1 is named Y3.

[0051] 2) Export spreadsheet file

[0052] The two columns LgBio and Pre_LgBio of the Sybyl molecular table S1.tbl are exported as the file S1_M1.csv, which is then converted to S1_M1.xls. Here LgBio is Y1 and Pre_LgBio is Y2.

[0053] In the same way, the activity calculated by M2 for the training set compounds S1 is exported from the SPSS Clementine software and saved as S1_M2.xls; this file contains the variables Y1 and Y3.

[0054] 3) Calculate the square of the correlation coefficient an...
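As a rough illustration of this kind of goodness-of-fit comparison, the squared Pearson correlation coefficient between the measured activity Y1 and a calculated activity (Y2 or Y3) could be computed as follows. The numeric values are invented for the example and are not the patent's data, and the function name is hypothetical.

```python
import math


def r_squared(observed, calculated):
    """Squared Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    sxy = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((c - mean_c) ** 2 for c in calculated)
    r = sxy / math.sqrt(sxx * syy)
    return r * r


# Invented example values: y1 = measured LgBio, y2 = M1 output, y3 = M1-M2 output.
y1 = [0.10, 0.45, 0.80, 1.20, 1.55]
y2 = [0.30, 0.40, 0.95, 1.00, 1.40]
y3 = [0.12, 0.50, 0.78, 1.15, 1.60]

print("r^2 (Y1 vs Y2):", r_squared(y1, y2))
print("r^2 (Y1 vs Y3):", r_squared(y1, y3))
```

A value closer to 1 indicates that the calculated activities track the measured ones more closely on the training set.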

Embodiment 3

[0059] This embodiment measures predictive performance, comparing the M1-M2 two-stage model built in Embodiment 1 with the M1 single-stage model. The specific steps are as follows:

[0060] 1) Collation of p38 kinase inhibitory activity

[0061] 35 pyrazole compounds that are not members of training set S1 form the test set S2, and their p38 kinase inhibitory activity is denoted Y4. The 35 pyrazole compounds of test set S2 are entered into the Sybyl molecular table S2.tbl, with Y4 specified as the dependent variable (shown as LgBio in the S2.tbl molecular table).

[0062] 2) Determination of the predictive performance of the single-level model M1

[0063] In the Topomer CoMFA module of the Sybyl software, the p38 kinase inhibitory activity of the compounds in molecular table S2.tbl is predicted, and the result is denoted Y5 (shown as Pre_LgBio in the S2.tbl molecular table). During the prediction process, the local physiological effects...

Abstract

The invention discloses a method for building a two-stage fitting quantitative structure-activity relationship (QSAR) model for predicting compound activity. The method comprises the following steps: 1) a plurality of compounds sharing the same scaffold are used as a training set, and the training set compounds are split into substituents, which are then superimposed; 2) linear regression is used to calculate the local physiological effect produced by each substituent, giving a front-stage fitting model; 3) a neural network takes the local physiological effects calculated in step 2 as input and calculates the overall biological activity, giving a back-stage fitting model; and 4) the front-stage and back-stage fitting models are combined to form the two-stage QSAR model. Because the method combines linear regression with a neural network, and the neural network has good fitting performance, the resulting model can predict the biological activity of compounds more accurately than a traditional linear model.
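The four steps in the abstract can be sketched as a toy pipeline. This is only an illustration under several assumptions that are not in the patent: each compound has two substituents, each described by a single hypothetical numeric descriptor; ordinary least squares stands in for the linear regression performed inside Sybyl; the local effect of a substituent is taken to be its regression term; and a tiny hand-written network stands in for the SPSS Clementine neural network. All data are invented.

```python
import math
import random


def fit_front_stage(x1, x2, y):
    """Least-squares fit y ~ b0 + w1*x1 + w2*x2 (assumed stand-in for model M1)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))
    det = s11 * s22 - s12 * s12  # nonzero when x1 and x2 are not collinear
    w1 = (s1y * s22 - s2y * s12) / det
    w2 = (s2y * s11 - s1y * s12) / det
    return my - w1 * m1 - w2 * m2, w1, w2


class TinyNet:
    """One hidden tanh layer and a linear output (assumed stand-in for model M2)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b_h = [0.0] * n_hidden
        self.w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_o = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * v for w, v in zip(ws, x)) + b)
                for ws, b in zip(self.w_h, self.b_h)]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w_o, self._hidden(x))) + self.b_o

    def mse(self, xs, ys):
        return sum((self.predict(x) - t) ** 2 for x, t in zip(xs, ys)) / len(ys)

    def train(self, xs, ys, lr=0.05, epochs=2000):
        """Full-batch gradient descent on the mean squared error."""
        n, n_in = len(ys), len(xs[0])
        for _ in range(epochs):
            g_wh = [[0.0] * n_in for _ in self.w_h]
            g_bh = [0.0] * len(self.b_h)
            g_wo = [0.0] * len(self.w_o)
            g_bo = 0.0
            for x, t in zip(xs, ys):
                h = self._hidden(x)
                pred = sum(w * v for w, v in zip(self.w_o, h)) + self.b_o
                err = 2.0 * (pred - t) / n  # d(MSE)/d(prediction)
                g_bo += err
                for j, hj in enumerate(h):
                    g_wo[j] += err * hj
                    d = err * self.w_o[j] * (1.0 - hj * hj)  # back through tanh
                    g_bh[j] += d
                    for i, xi in enumerate(x):
                        g_wh[j][i] += d * xi
            self.b_o -= lr * g_bo
            for j in range(len(self.w_o)):
                self.w_o[j] -= lr * g_wo[j]
                self.b_h[j] -= lr * g_bh[j]
                for i in range(n_in):
                    self.w_h[j][i] -= lr * g_wh[j][i]


# Step 1: toy training set, one invented descriptor per substituent.
x1 = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5]
x2 = [1.0, 0.2, 0.8, 0.3, 0.9, 0.1]
y = [math.tanh(a) + 0.5 * b for a, b in zip(x1, x2)]  # invented activities

# Step 2: front stage, linear regression gives a local effect per substituent.
b0, w1, w2 = fit_front_stage(x1, x2, y)
effects = [(w1 * a, w2 * b) for a, b in zip(x1, x2)]

# Step 3: back stage, a neural network maps local effects to overall activity.
net = TinyNet(n_in=2, n_hidden=3)
loss_before = net.mse(effects, y)
net.train(effects, y)
loss_after = net.mse(effects, y)

# Step 4: the combined two-stage model is "descriptors -> effects -> network".
print("MSE before training:", loss_before, "after:", loss_after)
```

The point of the split is that the back stage sees only a handful of inputs (one local effect per substituent) rather than every raw descriptor, which is consistent with the summary's claim of fewer independent variables.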

Description

Technical field

[0001] The invention relates to a method for constructing a QSAR model, in particular to a method for constructing a two-stage fitting QSAR model for predicting compound activity, and belongs to the field of biomedical information technology.

Background

[0002] Quantitative Structure-Activity Relationship (QSAR) is a technique for quantitatively predicting the activity of compounds with the help of mathematical models. Because the results of 3D QSAR studies offer clear guidance, the technique has been widely adopted by researchers. However, the 3D QSAR modeling process is executed inside the black box of commercial software, where it is difficult for the user to intervene, and this increases the difficulty of optimizing the model. It is therefore of great significance to establish a convenient and fast 3D QSAR modeling method.

[0003] At present, in the mo...


Application Information

IPC(8): G06F17/50, G06N3/02
Inventors: 刘雅红, 贺利民, 梁智斌, 方炳虎, 陈建新, 汤有志, 陈良柱
Owner: SOUTH CHINA AGRI UNIV