Method for screening massive ligand molecule features in drug design

A technology for screening ligand molecule features, applied in molecular design, computation, and chemical statistics. It addresses the problem of high time consumption and improves interpretability and learning efficiency.

Active Publication Date: 2017-05-31
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

The feature dimension of a ligand molecule is likely to be very large.

Method used

The massive ECFP molecular fingerprint features are screened using the LASSO method based on the EDPP rule, and the relevant features of the ligand molecules are then obtained with a robust selection method.

Examples

Example Embodiment

[0018] The present invention will be further described in detail below with reference to the accompanying drawings.

[0019] Figure 1 is a framework diagram of the system of the present invention. Based on this framework, the invention provides a LASSO method, based on the EDPP rule, for screening massive ligand features. The specific implementation steps are as follows:

[0020] Step 1: ECFP fingerprint feature generation for ligand molecules. Given an initial dataset in which each sample consists of the atomic connection graph of a molecule and its label $Y_i$, the initial dataset is processed to obtain ECFP features describing each sample, yielding the dataset $D_t = \{(X_i, Y_i) \mid X_i \in \mathbb{R}^{1 \times m},\ 1 \le i \le n\}$.
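The idea behind Step 1 can be illustrated with a toy, dependency-free sketch of a circular (ECFP-style) fingerprint. This is not the ECFP implementation of any cheminformatics toolkit: real ECFP also encodes bond orders, charges, and other atom invariants, and removes duplicate substructures. The function name `ecfp_like_bits` and the parameters `atoms`, `bonds`, `radius`, and `n_bits` are hypothetical and chosen only for this example.

```python
import zlib


def ecfp_like_bits(atoms, bonds, radius=2, n_bits=1024):
    """Toy circular fingerprint over an atomic connection graph.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) atom-index pairs (bond order is ignored here)
    Returns a list of n_bits values in {0, 1}; each set bit stands for
    one hashed atom-centred substructure.
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def stable_hash(text):
        # crc32 is deterministic across runs, unlike the built-in hash() on str
        return zlib.crc32(text.encode("utf-8"))

    # Radius 0: each atom is described by its element and its degree.
    ids = [stable_hash(f"{sym}|{len(neighbors[i])}") for i, sym in enumerate(atoms)]
    seen = set(ids)

    # Each iteration grows every atom's environment by one bond.
    for _ in range(radius):
        new_ids = []
        for i in range(len(atoms)):
            around = ",".join(str(v) for v in sorted(ids[j] for j in neighbors[i]))
            new_ids.append(stable_hash(f"{ids[i]}|{around}"))
        ids = new_ids
        seen.update(ids)

    # Fold the substructure identifiers into a fixed-length bit vector X_i.
    bits = [0] * n_bits
    for ident in seen:
        bits[ident % n_bits] = 1
    return bits
```

Calling `ecfp_like_bits(["C", "C", "O"], [(0, 1), (1, 2)])` gives one row of the feature matrix for an ethanol-like heavy-atom graph. Because neighbor identifiers are sorted before hashing, the result does not depend on how the atoms are numbered.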

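[0021] Step 2: Ligand molecular feature screening based on the EDPP LASSO method. For the dataset $D_t$, the EDPP rule is applied over a decreasing sequence of regularization values $\lambda = \{\lambda_i \mid 0 \le i < \ldots,\ \lambda_i > \lambda_{i+1}\}$, each satisfying $\lambda \in (0, \lambda_0]$, to obtain the feature screening result for each $\lambda$ value, $T = \{...$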
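As a rough illustration of the screening in Step 2, the sketch below applies the published EDPP rule for the LASSO problem min over β of ½‖y − Xβ‖² + λ‖β‖₁, in the simplest case where the reference point is the largest useful penalty, λ_max = max_j |x_jᵀy|. It is an assumption that this matches the patent's procedure, which also covers a whole sequence of λ values. The function name `edpp_screen_from_lambda_max` and its arguments are hypothetical, and plain Python lists are used to keep the example self-contained.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def edpp_screen_from_lambda_max(columns, y, lam):
    """Return (keep, lam_max) for one penalty value lam.

    columns: list of feature columns x_j, each a list of n numbers
    y: list of n labels
    keep[j] is False when the rule guarantees the LASSO coefficient of
    feature j is zero at lam, so the feature can be dropped before fitting.
    """
    corr = [dot(x, y) for x in columns]
    j_star = max(range(len(columns)), key=lambda j: abs(corr[j]))
    lam_max = abs(corr[j_star])
    if lam_max == 0.0 or lam >= lam_max:
        # The LASSO solution is all zeros, so nothing needs to be kept.
        return [False] * len(columns), lam_max

    # Dual optimum at lam_max, and a normal direction of the dual set there.
    theta0 = [yi / lam_max for yi in y]
    sign = 1.0 if corr[j_star] >= 0 else -1.0
    v1 = [sign * a for a in columns[j_star]]
    v2 = [yi / lam - t for yi, t in zip(y, theta0)]

    # Component of v2 orthogonal to v1.
    scale = dot(v1, v2) / dot(v1, v1)
    v2_perp = [b - scale * a for a, b in zip(v1, v2)]

    # The dual optimum at lam lies in a ball with this center and radius.
    center = [t + 0.5 * p for t, p in zip(theta0, v2_perp)]
    radius = 0.5 * math.sqrt(dot(v2_perp, v2_perp))

    keep = []
    for x in columns:
        bound = 1.0 - radius * math.sqrt(dot(x, x))
        keep.append(abs(dot(x, center)) >= bound)
    return keep, lam_max
```

Running the function once per value in a decreasing λ sequence yields one keep/discard mask per λ. The sequential form of the rule, which reuses the solution at the previous λ as the reference point, screens more aggressively but needs a LASSO solver and is omitted here.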

Abstract

The invention discloses a method for screening massive ligand molecule features in drug design. In ligand-based virtual screening of drug molecules, the ligand molecular fingerprint features generated by ECFP, currently the most popular method, are massive in dimension (each dimension represents a substructure), reaching even tens of thousands of dimensions because of the large number of ligand molecules, so the approach runs into the "curse of dimensionality" in practical tasks. The method screens the massive ECFP molecular fingerprint features using the LASSO method based on the EDPP rule and obtains the relevant features of the ligand molecules using a robust selection method. The activity of a ligand molecule is often related to only a small number of substructures; the method can quickly remove a large proportion of irrelevant features, select robust relevant features, solve the "curse of dimensionality" problem, identify the substructures related to ligand activity, and promote wider application of the ECFP method in drug design.

Description

Technical field

[0001] The invention relates to a machine-learning-based method for screening ligand molecular features and belongs to the technical field of computer-aided drug design.

Background

[0002] In recent years, improving the effectiveness of virtual drug screening has become an urgent problem for pharmaceutical companies. Because a large number of biochemical experiments have produced ample data, machine learning methods can use these data to help solve this problem.

[0003] Virtual drug screening can be divided into two types: target-structure-based methods and ligand-based methods. Target-structure-based virtual screening, such as molecular docking, simulates the physical interaction between the compound and the target to determine whether a drug effect is possible. Ligand-based methods mainly use existing data to predict the activity of compounds when the target structure is unknown. The key to this type of method is...

Application Information

IPC(8): G06F19/00
CPC: G16C20/50, G16C20/70
Inventors: 吴建盛, 张邱鸣, 胡海峰
Owner: NANJING UNIV OF POSTS & TELECOMM