
Molecular Generation Method Based on Data Mining

A data mining-based molecular generation technology, applied in chemical data mining, instruments, chemical machine learning, and related fields. It addresses the limited effectiveness and efficiency of existing approaches, the generation of illegal (chemically invalid) molecules, and the heavy demand for computing resources, and it offers a simple solution that improves efficiency and effectiveness while avoiding computationally expensive models.

Active Publication Date: 2022-07-15
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

Deep learning models usually require a large amount of labeled data, substantial computing resources, and very complex models to train, which makes them difficult to apply in practice.
Other approaches treat molecular generation as a black-box optimization problem and solve it with evolutionary computation, but their effectiveness and efficiency still need improvement. For example, the generation process can produce illegal (chemically invalid) molecules.



Examples


Embodiment Construction

[0019] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.

[0020] Embodiments of the present invention provide a method for generating molecules based on data mining. As shown in Figures 1 and 2, the method mainly includes:

[0021] Step 1. Obtain the dataset and select N_p molecules from it.

[0022] As a subspace of the entire chemical space S, the dataset contains multiple molecules. Each molecule is represented by an undirected graph comprising two sets, an atom (vertex) set and ε, where the former is the set of atoms, ea...
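The graph representation above can be illustrated with a minimal sketch. It assumes that ε holds the bonds (edges) and that a molecule is encoded as a symmetric bond-order matrix; the source text is truncated before it defines either, so the function name `encode_molecule` and the matrix layout are hypothetical, not the patent's actual encoding.

```python
def encode_molecule(atoms, bonds):
    """Encode an undirected molecular graph as a symmetric bond-order matrix.

    atoms: list of element symbols, one per vertex (the atom set).
    bonds: dict mapping an index pair (i, j) to a bond order (assumed edge set).
    """
    n = len(atoms)
    matrix = [[0] * n for _ in range(n)]
    for (i, j), order in bonds.items():
        if i == j:
            raise ValueError("an atom cannot be bonded to itself")
        # The graph is undirected, so each bond is written in both directions.
        matrix[i][j] = order
        matrix[j][i] = order
    return matrix


# Heavy-atom skeleton of ethanol, C-C-O (hydrogens left implicit).
atoms = ["C", "C", "O"]
matrix = encode_molecule(atoms, {(0, 1): 1, (1, 2): 1})
for row in matrix:
    print(row)
```

Because the graph is undirected, the matrix equals its own transpose, and the row sum for an atom gives the number of valence units it already uses.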



Abstract

The invention discloses a method for generating molecules based on data mining, comprising: acquiring a dataset and selecting N_p molecules from it; encoding each molecule to obtain the corresponding molecular matrix, taking the N_p molecular matrices as the initial population, and applying a chemical restriction mask operation to each molecular matrix, based on chemical prior knowledge, to indicate whether each atom in each molecule is saturated; performing crossover and mutation operations within the population, based on the molecular matrices and the results of the chemical restriction mask operation, to generate multiple new molecular matrices, and using a fitness function to select N_p molecular matrices from the existing and new molecular matrices as the next-generation population; and repeating these operations on the next-generation population until the population converges or the number of iterations exceeds a preset upper limit. The method avoids the low efficiency and high computational cost of deep learning models, greatly improves efficiency and effectiveness, and produces legal molecules.
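The loop described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the valence table, the graft-style `crossover`, the atom-adding `mutate`, the `toy_fitness` function, and the patience-based convergence test are all hypothetical stand-ins, and each molecule is assumed to be a pair of an atom list and a symmetric bond-order matrix.

```python
import random

# Assumed chemical prior knowledge: maximum heavy-atom valence per element.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2}


def saturation_mask(atoms, matrix):
    """True where an atom already uses its full valence and cannot bond again."""
    return [sum(row) >= MAX_VALENCE[a] for a, row in zip(atoms, matrix)]


def is_valid(mol):
    """A molecule is legal here if no atom exceeds its maximum valence."""
    atoms, matrix = mol
    return all(sum(row) <= MAX_VALENCE[a] for a, row in zip(atoms, matrix))


def open_sites(mol):
    """Indices of unsaturated atoms, i.e. the positions the mask leaves open."""
    return [i for i, sat in enumerate(saturation_mask(*mol)) if not sat]


def mutate(mol, rng, max_atoms=12):
    """Hypothetical mutation: attach one new atom to an unsaturated atom."""
    atoms, matrix = mol
    sites = open_sites(mol)
    if not sites or len(atoms) >= max_atoms:
        return mol
    i = rng.choice(sites)
    new_atoms = atoms + [rng.choice(sorted(MAX_VALENCE))]
    new_matrix = [row + [0] for row in matrix] + [[0] * (len(atoms) + 1)]
    new_matrix[i][-1] = new_matrix[-1][i] = 1
    return new_atoms, new_matrix


def crossover(a, b, rng, max_atoms=12):
    """Hypothetical crossover: graft b onto a via one bond between open sites."""
    (atoms_a, m_a), (atoms_b, m_b) = a, b
    sites_a, sites_b = open_sites(a), open_sites(b)
    n_a, n_b = len(atoms_a), len(atoms_b)
    if not sites_a or not sites_b or n_a + n_b > max_atoms:
        return a
    i, j = rng.choice(sites_a), n_a + rng.choice(sites_b)
    matrix = [row + [0] * n_b for row in m_a] + [[0] * n_a + row for row in m_b]
    matrix[i][j] = matrix[j][i] = 1
    return atoms_a + atoms_b, matrix


def toy_fitness(mol):
    """Placeholder objective: favor oxygen atoms and a size near six atoms."""
    atoms, _ = mol
    return atoms.count("O") - abs(len(atoms) - 6)


def evolve(population, fitness, max_iters=50, patience=5, seed=0):
    """Keep the best N_p of parents plus children until stalled or out of iterations."""
    rng = random.Random(seed)
    n_p = len(population)
    best, stale = max(map(fitness, population)), 0
    for _ in range(max_iters):
        children = []
        for _child in range(n_p):
            a, b = rng.sample(population, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = sorted(population + children, key=fitness, reverse=True)[:n_p]
        top = fitness(population[0])
        stale = 0 if top > best else stale + 1
        best = max(best, top)
        if stale >= patience:  # assumed convergence criterion
            break
    return population


seed_pop = [(["C"], [[0]]), (["O"], [[0]]), (["C", "C"], [[0, 1], [1, 0]]), (["N"], [[0]])]
final = evolve(seed_pop, toy_fitness)
print(all(is_valid(m) for m in final), toy_fitness(final[0]))
```

In this sketch both operators create new bonds only at positions the saturation mask leaves open, so every child satisfies the valence limits by construction and no post hoc filtering of illegal molecules is needed. Selection is elitist, so the best fitness in the population never decreases from one generation to the next.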

Description

Technical field

[0001] The invention relates to the field of automatic molecule generation, and in particular to a method for generating molecules based on data mining.

Background

[0002] Molecular generation is a very important task in pharmaceutical chemistry. Historically, bringing a drug from discovery to actual production took billions of dollars and more than ten years. In recent years, the use of computers to accelerate drug development has received more and more attention, and many articles in top computer science and chemistry journals have begun to focus on this line of research, which shows its importance.

[0003] The traditional pharmaceutical paradigm starts with an existing chemical material concept, then synthesizes candidate materials based on this concept, then tests the material in the entire system, and finally evaluates the material's properties. However, this paradi...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16C10/00, G16C20/70
CPC: G16C10/00, G16C20/70
Inventors: 刘淇, 陈恩红, 朱健甫, 郝中楷, 阴钰, 陆承镪
Owner: UNIV OF SCI & TECH OF CHINA