Variable selection method for modeling quantitative structure-activity relationships of organic pollutants
A technology for modeling quantitative structure-activity relationships of organic pollutants, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the problems that existing methods cannot screen large-scale variable sets, that the result of a variable screening method cannot be verified to be optimal, and that identical results cannot be guaranteed.
Examples
Embodiment 1
[0036] The so-called "standard" test set, the Selwood data set, was selected for testing. This data set was first published in the literature (Selwood, D.L.; Livingstone, D.J.; Comley, J.C.W.; O'Dowd, A.B.; Hudson, A.T.; Jackson, P.; Jandu, K.S.; Rose, V.S.; Stables, J.N., Structure-activity relationships of antifilarial antimycin analogs: a multivariate pattern recognition study. J. Med. Chem. 1990, 33(1), 136-142.). The data set contains 31 samples and 53 descriptors. The parameters set during the screening process are as follows: the number of retained models Ns = 100; the threshold for the correlation coefficient between variables r_int = 0.9; and the initial critical value of the correlation coefficient that determines whether the leave-one-out cross-validation (LOOCV) or leave-many-out cross-validation (LMOCV) calculation is performed, r_cri = 0.1 (this value should be adjusted accordingly as the number of variables increases). After calculation, the results shown in the table below are obtained. This data set has never seen a model with a number o...
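The role of the three screening parameters can be illustrated with a minimal Python sketch. It is not the VSMVI procedure itself, which this section does not describe: exhaustive enumeration of fixed-size subsets stands in for the search strategy, the models are assumed to be ordinary least-squares regressions, and r_cri is assumed to be compared against the fitted correlation coefficient r so that poorly fitting subsets skip cross-validation. The function names (`max_intercorrelation`, `ols_residuals`, `loo_q2`, `screen_subsets`) are hypothetical.

```python
import itertools
import numpy as np

N_S = 100     # Ns: number of retained models
R_INT = 0.9   # r_int: limit on the correlation between two variables
R_CRI = 0.1   # r_cri: initial gate for running cross-validation (assumed to apply to r)


def max_intercorrelation(X):
    """Largest absolute pairwise correlation among the columns of X."""
    if X.shape[1] < 2:
        return 0.0
    r = np.corrcoef(X, rowvar=False)
    off_diagonal = r[~np.eye(r.shape[0], dtype=bool)]
    return float(np.max(np.abs(off_diagonal)))


def ols_residuals(X, y):
    """Fit y ~ intercept + X by least squares; return residuals and leverages."""
    A = np.column_stack([np.ones(len(y)), X])
    H = A @ np.linalg.pinv(A)  # hat matrix
    return y - H @ y, np.diag(H)


def loo_q2(resid, leverage, ss_tot):
    """Leave-one-out q2 from the closed form e_i / (1 - h_ii)."""
    loo_resid = resid / (1.0 - leverage)
    return float(1.0 - loo_resid @ loo_resid / ss_tot)


def screen_subsets(X, y, size, n_keep=N_S, r_int=R_INT, r_cri=R_CRI):
    """Score every subset of `size` columns; return the n_keep best as (q2, columns)."""
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    kept = []
    for cols in itertools.combinations(range(X.shape[1]), size):
        Xs = X[:, list(cols)]
        if max_intercorrelation(Xs) > r_int:
            continue  # variables in this subset are too collinear
        resid, leverage = ols_residuals(Xs, y)
        r2 = 1.0 - resid @ resid / ss_tot
        if np.sqrt(max(r2, 0.0)) < r_cri:
            continue  # fit too poor: cross-validation is skipped
        kept.append((loo_q2(resid, leverage, ss_tot), cols))
    kept.sort(reverse=True)
    return kept[:n_keep]
```

Calling `screen_subsets(X, y, size=2)` on a descriptor matrix `X` (samples in rows) and an activity vector `y` returns at most Ns subsets ordered by decreasing q2. Exhaustive enumeration is only practical for small subset sizes; with 53 descriptors the number of combinations grows quickly, which is why a dedicated search strategy is needed in practice.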
Embodiment 2
[0039] The structures and biological activities of 58 PPAR-γ agonists were taken from the literature (Yi Xiang, Guo Zongru, Three-dimensional quantitative structure-activity relationship study of thiazolidinedione and aryl keto acid PPAR-γ agonists. Acta Pharmaceutica Sinica 2001, 36 (4), 262-268.). The E-Dragon software provided by the Virtual Computational Chemistry Laboratory (VCCLAB) was used to calculate 1664 molecular structure descriptors, of which 814 remained after pre-screening. The VSMVI method was then used for screening, with the same screening parameters as in Embodiment 1. Finally, the results shown in the table below are obtained.
[0040]
[0041]
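This embodiment reduces 1664 descriptors to 814 by pre-screening but does not say how. The Python sketch below shows one common way such a step is done and is an assumption, not the patent's procedure: descriptors with missing or constant values are dropped, then any descriptor that is almost perfectly correlated with one already kept is dropped. The function name `prescreen` and the thresholds `min_std` and `r_max` are hypothetical.

```python
import numpy as np


def prescreen(X, min_std=1e-8, r_max=0.95):
    """Return the indices of descriptor columns that survive two simple filters.

    1. Drop columns with non-finite values or (near-)zero standard deviation.
    2. Scan the remaining columns in order and drop any column whose absolute
       correlation with an already kept column exceeds r_max.
    """
    X = np.asarray(X, dtype=float)
    candidates = [
        j for j in range(X.shape[1])
        if np.all(np.isfinite(X[:, j])) and X[:, j].std() > min_std
    ]
    if len(candidates) < 2:
        return candidates

    # Absolute correlation matrix of the candidate columns only.
    R = np.abs(np.corrcoef(X[:, candidates], rowvar=False))
    kept_positions = []
    for a in range(len(candidates)):
        if all(R[a, b] <= r_max for b in kept_positions):
            kept_positions.append(a)
    return [candidates[a] for a in kept_positions]
```

The returned indices can be used to slice the descriptor matrix, for example `X[:, prescreen(X)]`, before the actual variable screening is run. Because the scan keeps the first column of each correlated group, the result depends on column order.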
Embodiment 3
[0043] The training set of the "Environmental Toxicity Prediction Challenge" provided by Dr. Igor V. Tetko was used for the variable screening test. The training set includes 644 organic compounds whose structures are represented by 1664 descriptors calculated by the E-Dragon software of the Virtual Computational Chemistry Laboratory (VCCLAB); the data are available at http://www.cadaster.eu/node/65. After variable pre-screening, 827 descriptors were retained, and the VSMVI parameters were the same as those in Embodiment 1. Finally, the following results are obtained.
[0044]