Drug small molecule activity prediction method based on bidirectional long-short memory model

A bidirectional long-short memory model technique for predicting the activity of small drug molecules, with applications in cheminformatics and bioinformatics

Pending Publication Date: 2020-09-08
MINDRANK AI LTD

AI Technical Summary

Problems solved by technology

However, no studies have directly fed SMILES into sequen




Embodiment Construction

[0055] In the following description, the present invention will be described with reference to various embodiments. However, those skilled in the art will recognize that the various embodiments can be implemented without one or more specific details or with other alternative and / or additional methods, materials or components. In other cases, well-known structures, materials, or operations are not shown or described in detail so as not to obscure aspects of various embodiments of the present invention. Similarly, for the purpose of explanation, specific quantities, materials, and configurations are set forth in order to provide a thorough understanding of the embodiments of the present invention. However, the present invention can be implemented without specific details. In addition, it should be understood that the various embodiments shown in the drawings are illustrative representations and are not necessarily drawn to scale.

[0056] In this specification, reference to "one ...



Abstract

The invention discloses a method for predicting the activity of small drug molecules based on a bidirectional long-short memory model (that is, a bidirectional long short-term memory network, BiLSTM). The method comprises the steps of: obtaining a data set; preprocessing the data set by expressing every compound molecule in it as a SMILES string, standardizing the SMILES expressions of all molecules so that atoms, bonds, and connection relationships are encoded in a uniform way and order, and removing duplicates using each molecule's InChIKey; one-hot encoding the preprocessed data set, treating each single element, single digit, single symbol, and entire square-bracket expression of a SMILES sequence as one sequence token, where each token has its own chemical significance and directivity and any combination of tokens complies with chemical rules; constructing a bidirectional long-short memory core fragment identification model; inputting the encoded data into that model to obtain a hidden state moment; and evaluating the model.
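The tokenization and one-hot encoding step described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the regular expression, the function names (`tokenize`, `build_vocab`, `one_hot`), and the choice to treat `Cl` and `Br` as single element tokens are all hypothetical. SMILES standardization and InChIKey deduplication are assumed to have been done beforehand with a cheminformatics toolkit.

```python
import re

# Hypothetical token pattern (an assumption, not taken from the patent):
#   1. an entire square-bracket expression, e.g. [C@H] or [NH4+]
#   2. the two-letter elements Cl and Br
#   3. any other single letter (one element symbol)
#   4. a single digit (ring-closure number)
#   5. any other single non-space symbol: bonds, branches, and so on
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|[A-Za-z]|\d|[^\sA-Za-z\d]")


def tokenize(smiles):
    """Split a SMILES string into a list of sequence tokens."""
    tokens = TOKEN_RE.findall(smiles)
    # Reject input containing characters the pattern skips (e.g. whitespace).
    if "".join(tokens) != smiles:
        raise ValueError(f"cannot tokenize SMILES: {smiles!r}")
    return tokens


def build_vocab(smiles_list):
    """Map every distinct token in the data set to an integer index."""
    vocab = sorted({tok for s in smiles_list for tok in tokenize(s)})
    return {tok: i for i, tok in enumerate(vocab)}


def one_hot(smiles, vocab):
    """Encode a SMILES string as one one-hot row per token."""
    rows = []
    for tok in tokenize(smiles):
        row = [0] * len(vocab)
        row[vocab[tok]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    data = ["CCO", "C[C@H](N)C(=O)O", "ClCCBr"]
    vocab = build_vocab(data)
    for s in data:
        print(s, tokenize(s), len(one_hot(s, vocab)))
```

The resulting matrix has one row per token and one column per vocabulary entry, which is the shape a sequence model such as a BiLSTM would consume after padding to a common length.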

Description

Technical field

[0001] The invention relates to the fields of chemoinformatics and bioinformatics. Specifically, the present invention relates to a method and system for predicting the activity of small drug molecules based on a bidirectional long-short memory model.

Background

[0002] Clarifying the relationship between molecular structure and biological activity has always been an important topic in medicinal chemistry. However, with the explosive growth of experimental data, it is increasingly difficult to clarify this relationship through empirical measurement and heuristic rules alone.

[0003] Chemoinformatics is an active research field that uses high-performance computers and machine learning methods to predict biological activity from molecular structures. In recent decades, with the emergence of deep learning methods, machine learning has attracted increasing attention from the scientific community. Data-driven analysis has become a routine procedur...


Application Information

IPC(8): G16C20/30, G16C20/70, G16B15/30, G16B40/00, G06F40/284, G06F40/30
CPC: G16C20/30, G16C20/70, G16B15/30, G16B40/00, G06F40/284, G06F40/30
Inventors: 牛张明 (Niu Zhangming), 韦德·门佩斯-史密斯 (Wade Menpes-Smith)
Owner: MINDRANK AI LTD