Method for predicting compound carcinogenic toxicity based on complex sampling and improved decision forest algorithm

A toxicity prediction technology based on a decision forest algorithm, applied in computing, special data processing applications, instruments, etc. It addresses the problems of time-consuming and costly testing and of random prediction, and achieves balanced prediction capability, rapid prediction, and good application prospects.

Status: Inactive. Publication date: 2009-11-25
SHANGHAI INST OF MATERIA MEDICA CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

Rodent biological testing, the main carcinogenic toxicity test method currently in use, suffers from the following problems: (1) high testing cost (the average test costs more than two million dollars); (2) long duration (3 to 5 years); (3) ethical considerations and public pressure to reduce or eliminate the use of animals in R&D and testing.
[0007] In addition to performance limitations, current computational toxicity prediction methods often suffer from random prediction: even a small change in the composition of the training data set may produce diametrically opposite predictions.
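
The random prediction problem described above can be made concrete with a small experiment. The following is a minimal sketch, not the patent's method: it assumes a 1-nearest-neighbour classifier as a stand-in model, two made-up descriptor values per compound, and a hypothetical helper `disagreement_rate` that retrains on randomly thinned training sets and reports how often the prediction for one query compound departs from the majority answer.

```python
import random

def nearest_label(train, query):
    """Stand-in classifier (assumption): label of the closest training point."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], query))
    return min(train, key=sq_dist)[1]

def disagreement_rate(train, query, n_rounds=200, drop_fraction=0.2, seed=0):
    """Share of perturbed training sets whose prediction differs from the majority."""
    rng = random.Random(seed)
    keep = max(1, round(len(train) * (1 - drop_fraction)))
    # Each round removes a random 20% of the training compounds and re-predicts.
    votes = [nearest_label(rng.sample(train, keep), query) for _ in range(n_rounds)]
    majority = max(set(votes), key=votes.count)
    return 1 - votes.count(majority) / n_rounds

# Toy data (invented for illustration): ((descriptor_1, descriptor_2), label),
# where label 1 stands for "carcinogenic" and 0 for "non-carcinogenic".
train = [
    ((0.00, 0.00), 0), ((0.10, 0.00), 0), ((0.00, 0.10), 0), ((0.10, 0.10), 0),
    ((0.45, 0.45), 0),
    ((1.00, 1.00), 1), ((0.90, 1.00), 1), ((1.00, 0.90), 1), ((0.90, 0.95), 1),
    ((0.60, 0.60), 1),
]

stable = disagreement_rate(train, (0.05, 0.05))      # deep inside the class-0 cluster
borderline = disagreement_rate(train, (0.50, 0.50))  # between the two classes
print(stable, borderline)
```

In this toy setting the query inside a cluster never changes its predicted class, while the borderline query flips whenever its single nearest neighbour happens to be dropped. A nonzero rate of this kind is the instability that the resampling-based approach is meant to reduce.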

Embodiment Construction

[0027] Hereinafter, the present invention will be described in detail.

[0028] Establishing the compound carcinogenic toxicity prediction model based on complex sampling and the improved decision forest algorithm involves five main steps:

[0029] 1) Compute the atom descriptors for each atom type in the molecule:

[0030] Carcinogenic toxicity may be related to intrinsic properties of a compound such as its size, shape, electronic information, and polarizability, and the electronegativity and covalent radius of its heteroatoms. We therefore first characterize the compounds with about 49 descriptors that capture these properties, grouped into the following five categories: a) electronic descriptors: sum of atomic polarizabilities (Apol), dipole moment (Dipole), highest occupied molecular orbital (HOMO), lowest unoccupied molecular orbital (LUMO), and superdelocalizability (Sr); b) space descrip...

Abstract

The invention relates to a method for predicting the carcinogenic toxicity of compounds based on complex sampling and an improved decision forest algorithm. The method is suitable for computational carcinogenic toxicity evaluation and virtual screening of compounds from the structure of small organic molecules, and comprises the following steps: first, optimizing the molecular structures and calculating charges with a suitable force field, performing complex sampling on the compounds in the original training set to generate training subsets, determining the descriptor composition from the result of the complex sampling algorithm, and calculating the relevant descriptors of the molecules; second, optimizing the descriptor pool by a method based on correlation matrix analysis and factor analysis; and finally, mining the carcinogenic toxicity data of the training set molecules and their corresponding chemical characteristics with the improved decision forest method to obtain a classification prediction confidence interval, a carcinogenic toxicity prediction model, and decision rules. The method has good application prospects in high-throughput virtual screening and computational carcinogenic toxicity evaluation.
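
The abstract's overall flow (prune the descriptor pool, resample the training set, let an ensemble vote with a confidence value) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented algorithm: the source does not specify the complex sampling scheme or the decision forest improvements, so plain bootstrap resampling and single-threshold decision stumps are swapped in. The descriptor names `desc_a`, `desc_b`, `desc_c`, their values, the 0.9 correlation threshold, and all function names are hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def prune_correlated(columns, threshold=0.9):
    """Keep a descriptor only if |r| with every already-kept one is <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

def fit_stump(rows, labels, features):
    """Best one-threshold rule (feature, cut, label_if_above) by training accuracy."""
    best = None
    for f in features:
        for cut in sorted({r[f] for r in rows}):
            for above in (0, 1):
                hits = sum((above if r[f] > cut else 1 - above) == y
                           for r, y in zip(rows, labels))
                if best is None or hits > best[0]:
                    best = (hits, f, cut, above)
    return best[1:]

def predict_stump(stump, row):
    f, cut, above = stump
    return above if row[f] > cut else 1 - above

def fit_forest(rows, labels, features, n_models=25, seed=0):
    """Train one stump per bootstrap resample (stand-in for complex sampling)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_models):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], features))
    return forest

def predict_with_confidence(forest, row):
    """Majority vote plus the fraction of models agreeing with it."""
    votes = sum(predict_stump(s, row) for s in forest)
    label = 1 if 2 * votes >= len(forest) else 0
    return label, max(votes, len(forest) - votes) / len(forest)

# Toy data, invented for illustration; desc_b is a linear copy of desc_a.
a_vals = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
c_vals = [5, 1, 4, 2, 3, 5, 1, 4]
rows = [{"desc_a": a, "desc_b": 2 * a + 1, "desc_c": c}
        for a, c in zip(a_vals, c_vals)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

columns = {name: [r[name] for r in rows] for name in rows[0]}
kept = prune_correlated(columns)
forest = fit_forest(rows, labels, kept)
label, confidence = predict_with_confidence(forest, {"desc_a": 0.85, "desc_c": 3})
print(kept, label, confidence)
```

Reporting the vote fraction alongside the label is one simple way to attach a confidence to each prediction; the patent's own confidence interval and judgment rules may be derived differently.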

Description

Technical field

[0001] The invention relates to a computational method for predicting the carcinogenic toxicity of compounds based on complex sampling and an improved decision forest algorithm. It is suitable for virtual carcinogenic toxicity evaluation and screening of compounds from the molecular structure information of organic compounds.

Background technique

[0002] Toxicity is an important factor in the failure of late-stage drug development. The carcinogenic toxicity of a compound is its long-term effect of inducing the growth of malignant or benign tumors in the human body. Rodent biological testing is the main carcinogenic toxicity test method currently in use. However, this approach suffers from the following problems: (1) high testing cost (the average test costs more than two million dollars); (2) long duration (3 to 5 years); (3) ethical considerations and public pressure to reduce or eliminate the use of animals in R&D and testing. Due...

Application Information

IPC(8): G06F19/00
Inventor: 蒋华良, 罗小民, 张振山, 朱维良, 郑明月, 沈建华, 陈凯先, 薛春霞
Owner: SHANGHAI INST OF MATERIA MEDICA CHINESE ACAD OF SCI