Molecule generation method based on data mining

A molecule generation technique based on data mining, applied in chemical data mining, instruments, chemical machine learning, and related fields. It addresses the limited effectiveness and efficiency of existing methods, the generation of illegal molecules, and heavy computing-resource requirements, offering a simple solution that improves efficiency and effectiveness and avoids expensive computation.

Active Publication Date: 2021-02-26
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

Deep learning models usually require a large amount of labeled data, substantial computing resources, and a very complex training process, which makes practical application difficult.
Other approaches treat molecule generation as a black-box optimization problem and solve it with evolutionary computation, but their effectiveness and efficiency need to be improved. For example, the generation process can produce illegal molecules.

Examples

Embodiment Construction

[0019] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0020] Embodiments of the present invention provide a method for generating molecules based on data mining. As shown in Figures 1 and 2, it mainly includes:

[0021] Step 1. Obtain the data set and select Np molecules from it.

[0022] As a subspace of the entire chemical space S, the data set contains multiple molecules. Each molecule is represented by an undirected graph consisting of two sets, a vertex set and an edge set ε, where...
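
The undirected-graph representation can be illustrated with a small sketch. The excerpt does not give the exact encoding, so the following assumes that vertices are heavy atoms, edges are bonds, and the "molecular matrix" is a symmetric matrix of bond orders. The function name `molecule_to_matrix` and the input format are hypothetical.

```python
def molecule_to_matrix(atoms, bonds):
    """Encode an undirected molecular graph as a symmetric bond-order matrix.

    atoms: list of element symbols (the vertex set), e.g. ["C", "C", "O"]
    bonds: list of (i, j, order) tuples (the edge set); i and j index into atoms
    This layout is an assumption for illustration, not the patent's encoding.
    """
    n = len(atoms)
    matrix = [[0] * n for _ in range(n)]
    for i, j, order in bonds:
        if i == j:
            raise ValueError("an atom cannot be bonded to itself")
        # The graph is undirected, so the matrix is filled symmetrically.
        matrix[i][j] = order
        matrix[j][i] = order
    return matrix


# Ethanol with implicit hydrogens: C-C-O
ethanol_atoms = ["C", "C", "O"]
ethanol_bonds = [(0, 1, 1), (1, 2, 1)]
ethanol_matrix = molecule_to_matrix(ethanol_atoms, ethanol_bonds)
```

Under this assumed layout, the sum of a row gives the number of bonding electrons pairs an atom already uses, which is the quantity a saturation check would compare against the atom's valence.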

Abstract

The invention discloses a molecule generation method based on data mining, which comprises the following steps: acquiring a data set and selecting Np molecules from it; encoding all molecules to obtain the corresponding molecular matrices, taking the Np molecular matrices as an initial population, and performing a chemical limitation mask operation on each molecular matrix, using chemical prior knowledge, to indicate whether each atom in each molecule is saturated; performing crossover and mutation operations within the population, based on the molecular matrices and the results of the chemical limitation mask operation, to generate a plurality of new molecular matrices, and using a fitness function to select Np molecular matrices from the existing and new molecular matrices as the next-generation population; and repeating these operations on the next-generation population until the population converges or the number of iterations exceeds a preset upper limit. The method overcomes the low efficiency and high computational cost of deep learning models, greatly improves efficiency and effectiveness, and ensures that the generated molecules are legal molecules.
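
The loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the valence table, the bond-order matrix encoding, the shared atom list across the population, the specific crossover and mutation rules, and the toy fitness function `total_bond_order` are all hypothetical stand-ins, since the excerpt does not specify them.

```python
import random

# Assumed maximum valences (hydrogens implicit); not taken from the source.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2}
MAX_BOND_ORDER = 3

def saturation_mask(atoms, matrix):
    """True for each atom whose bond orders already sum to its maximum valence."""
    return [sum(row) >= MAX_VALENCE[a] for a, row in zip(atoms, matrix)]

def is_valid(atoms, matrix):
    """Check symmetry, empty diagonal, bond-order range, and valence limits."""
    n = len(atoms)
    for i in range(n):
        if matrix[i][i] != 0 or sum(matrix[i]) > MAX_VALENCE[atoms[i]]:
            return False
        for j in range(n):
            if matrix[i][j] != matrix[j][i] or not 0 <= matrix[i][j] <= MAX_BOND_ORDER:
                return False
    return True

def crossover(atoms, parent_a, parent_b, rng):
    """Copy parent_a, then adopt bonds from parent_b only where the mask allows it."""
    child = [row[:] for row in parent_a]
    n = len(atoms)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)
             if parent_b[i][j] > child[i][j]]
    rng.shuffle(pairs)
    for i, j in pairs:
        mask = saturation_mask(atoms, child)  # recompute after every change
        if not mask[i] and not mask[j]:
            child[i][j] += 1
            child[j][i] += 1
    return child

def mutate(atoms, matrix, rng):
    """Raise one bond order by 1 between two unsaturated atoms, if possible."""
    mask = saturation_mask(atoms, matrix)
    n = len(atoms)
    candidates = [(i, j) for i in range(n) for j in range(i + 1, n)
                  if not mask[i] and not mask[j] and matrix[i][j] < MAX_BOND_ORDER]
    child = [row[:] for row in matrix]
    if candidates:
        i, j = rng.choice(candidates)
        child[i][j] += 1
        child[j][i] += 1
    return child

def evolve(atoms, population, fitness, max_generations, rng):
    """Keep the best Np matrices each generation until convergence or the cap."""
    n_p = len(population)
    for _ in range(max_generations):
        if all(m == population[0] for m in population):
            break  # population has converged to a single matrix
        children = []
        for _ in range(n_p):
            a, b = rng.sample(population, 2)
            children.append(mutate(atoms, crossover(atoms, a, b, rng), rng))
        population = sorted(population + children, key=fitness, reverse=True)[:n_p]
    return population

def total_bond_order(matrix):
    """Toy fitness: total bond order in the molecule (a placeholder objective)."""
    return sum(sum(row) for row in matrix) // 2

ATOMS = ["C", "C", "O", "N"]
empty = [[0] * 4 for _ in range(4)]
cc_bond = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
co_bond = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
initial_population = [empty, cc_bond, co_bond]  # Np = 3

final_population = evolve(ATOMS, initial_population, total_bond_order, 20, random.Random(0))
```

Because both operators consult the saturation mask before adding a bond, every child stays within the assumed valence limits, which is the sense in which this sketch only produces legal molecules. A real fitness function would score a chemical property of interest rather than the bond count used here.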

Description

Technical field

[0001] The invention relates to the field of automatic molecule generation, in particular to a method for generating molecules based on data mining.

Background art

[0002] Molecule generation is a task of great importance to the chemical and pharmaceutical field. Historically, in the pharmaceutical industry, it took billions of dollars and more than ten years to get from the discovery of a drug to its actual production. In recent years, the use of computers to accelerate drug development at the molecular level has received increasing attention. Many articles in top journals of computer science and chemistry have begun to focus on research in this area, which shows its importance.

[0003] The traditional pharmaceutical paradigm starts with an existing chemical material concept, then synthesizes a candidate material based on this concept, tests the material in an entire system, and finally evaluates the properties of the material...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16C10/00, G16C20/70
CPC: G16C10/00, G16C20/70
Inventor: 刘淇, 陈恩红, 朱健甫, 郝中楷, 阴钰, 陆承镪
Owner UNIV OF SCI & TECH OF CHINA