Massive Feature Screening Method for Ligand Molecules in Drug Design

A feature screening technology for ligand molecules, applied in the field of computer-aided drug design. It addresses the high time cost of feature selection and improves both interpretability and learning efficiency.

Active Publication Date: 2019-06-04
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

The feature dimension of a ligand molecule is often very large, so the traditional LASSO method incurs a high time cost and struggles to solve the problem well.
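For context, LASSO is taken here to mean the usual L1-penalized least-squares problem, $\tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$. The source does not spell out its objective, so this form is an assumption. The sketch below is a plain coordinate-descent solver in pure Python; the function names `soft_threshold` and `lasso_cd` are illustrative and do not come from the patent. Each sweep visits every feature once, so the work per sweep grows linearly with the number of features, which is why discarding features before solving can pay off.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used by L1-penalized solvers."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of n rows with m features each; y is a list of n targets.
    The objective scaling is an assumption, not taken from the source.
    """
    n, m = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(m)]
    sq_norms = [sum(v * v for v in col) for col in cols]
    beta = [0.0] * m
    resid = list(y)  # residual y - X beta, with beta starting at zero
    for _ in range(n_iter):
        for j in range(m):
            if sq_norms[j] == 0.0:
                continue  # an all-zero feature can never be selected
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(c * r for c, r in zip(cols[j], resid)) + sq_norms[j] * beta[j]
            new_value = soft_threshold(rho, lam) / sq_norms[j]
            delta = new_value - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, cols[j])]
                beta[j] = new_value
    return beta


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [3.0, 1.0]
    print(lasso_cd(X, y, lam=2.0))
```

With this orthonormal toy design, the second feature's correlation with `y` is below `lam`, so its coefficient is driven exactly to zero.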

Embodiment Construction

[0018] The present invention will be further described in detail below in conjunction with the accompanying drawings of the specification.

[0019] Figure 1 is a framework diagram of the system of the present invention. Based on this framework, the present invention provides a method for screening massive features of ligand molecules using LASSO with the EDPP criterion. The specific implementation steps are as follows:

[0020] Step 1: Generation of the ECFP features of the ligand molecules. The initial data set consists of n samples, each pairing the atom connectivity graph of a molecule with its label $Y_i$. Processing the initial data set yields the ECFP features describing each sample, that is, the data set $D_t = \{(X_i, Y_i) \mid X_i \in \mathbb{R}^{1 \times m},\ 1 \le i \le n\}$.
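To make the mapping from an atom connectivity graph to a fixed-length row $X_i$ concrete, here is a deliberately simplified circular-fingerprint sketch in pure Python. It is not the real ECFP algorithm: it ignores bond orders, the standard atom invariants, and duplicate-environment removal, and the names `stable_hash` and `toy_circular_fingerprint` are hypothetical. In practice a cheminformatics toolkit such as RDKit would typically generate these fingerprints.

```python
import hashlib


def stable_hash(text):
    """Deterministic integer hash (the built-in hash() of str is salted per process)."""
    return int(hashlib.sha256(text.encode("utf-8")).hexdigest()[:16], 16)


def toy_circular_fingerprint(atoms, bonds, m=64, radius=2):
    """Fold hashed atom environments into an m-bit vector.

    atoms:  list of element symbols, e.g. ["C", "C", "O"]
    bonds:  list of (i, j) atom index pairs
    Simplified illustration only; real ECFP uses richer atom invariants.
    """
    neighbors = [[] for _ in atoms]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Radius 0: each atom is identified by its element symbol alone.
    ids = [stable_hash(symbol) for symbol in atoms]
    bits = [0] * m
    for ident in ids:
        bits[ident % m] = 1

    # Each round widens every atom's environment by one bond.
    for _ in range(radius):
        ids = [
            stable_hash(f"{ids[i]}|{sorted(ids[j] for j in neighbors[i])}")
            for i in range(len(atoms))
        ]
        for ident in ids:
            bits[ident % m] = 1
    return bits


if __name__ == "__main__":
    # Heavy-atom skeleton of ethanol: C-C-O.
    x_i = toy_circular_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
    print(len(x_i), sum(x_i))
```

Because neighbor identifiers are sorted before hashing, the result does not depend on the order in which the atoms are listed. Stacking one such row per molecule gives an n-by-m feature matrix of the kind described by $D_t$.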

[0021] Step 2: Screening of ligand molecule features with the EDPP-based LASSO method. For the data set $D_t$, apply the EDPP criterion over a decreasing sequence of regularization parameters $\{\lambda_i \mid 0 \le i,\ \lambda_i > \lambda_{i+1}\}$, each satisfying $\lambda_i \in (0, \lambda_0]$, to get the feature screening result for each $\lambda$ ...
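The screening test itself can be sketched as follows. This is a minimal illustration that assumes the LASSO objective $\tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$ and uses the basic form of the EDPP rule as published in the screening literature, evaluated from $\lambda_{\max} = \|X^\top y\|_\infty$, where the solution is known to be all zeros. The patent's own procedure walks a decreasing $\lambda$ sequence and reuses the previous solution, which is not reproduced here. The function name `edpp_screen_from_lambda_max` is hypothetical.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def edpp_screen_from_lambda_max(X, y, lam):
    """Return (keep_mask, lam_max) using the basic EDPP test.

    keep_mask[j] is False when feature j is certified to have a zero
    LASSO coefficient at this lam. Assumes lam > 0 and the objective
    0.5 * ||y - X b||^2 + lam * ||b||_1 (an assumption, see lead-in).
    """
    n, m = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(m)]
    corr = [dot(col, y) for col in cols]
    j_star = max(range(m), key=lambda j: abs(corr[j]))
    lam_max = abs(corr[j_star])
    if lam >= lam_max:
        return [False] * m, lam_max  # the LASSO solution is identically zero

    theta0 = [v / lam_max for v in y]  # dual optimum at lam_max
    sign = 1.0 if corr[j_star] >= 0 else -1.0
    v1 = [sign * v for v in cols[j_star]]
    v2 = [v / lam - t for v, t in zip(y, theta0)]
    # Component of v2 orthogonal to v1.
    scale = dot(v1, v2) / dot(v1, v1)
    v2_perp = [b - scale * a for a, b in zip(v1, v2)]

    center = [t + 0.5 * p for t, p in zip(theta0, v2_perp)]
    radius = 0.5 * math.sqrt(dot(v2_perp, v2_perp))
    keep = [
        abs(dot(col, center)) >= 1.0 - radius * math.sqrt(dot(col, col))
        for col in cols
    ]
    return keep, lam_max


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [3.0, 1.0]
    print(edpp_screen_from_lambda_max(X, y, lam=2.0))
    print(edpp_screen_from_lambda_max(X, y, lam=0.5))
```

On this toy orthonormal design the second feature is discarded at `lam=2.0` and retained at `lam=0.5`, which matches the exact solution obtained by soft-thresholding the correlations (3, 1). Features flagged `False` can be dropped before running any LASSO solver on the remaining columns.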

Abstract

The invention discloses a method for screening massive features of ligand molecules in drug design. In ligand-based virtual screening of drug molecules, the molecular fingerprint features generated by ECFP, currently the most popular method, have a very high dimensionality (each dimension represents a substructure), reaching tens of thousands of dimensions, because the number of ligand molecules is large. In practical tasks the approach therefore runs into the "curse of dimensionality". The method screens the massive ECFP molecular fingerprint features using LASSO with the EDPP rule and obtains the relevant features of the ligand molecules with a robust selection method. Because the activity of a ligand molecule is often related to only a few substructures, the method can quickly remove a large share of irrelevant features, select robust relevant features, address the curse of dimensionality, identify the substructures related to ligand activity, and promote wider application of the ECFP method in drug design.

Description

Technical field

[0001] The invention relates to a machine-learning-based method for screening the features of ligand molecules and belongs to the technical field of computer-aided drug design.

Background technique

[0002] In recent years, improving the effectiveness of virtual drug screening has become an urgent problem for pharmaceutical companies. Because a large number of biochemical experiments provide sufficient data, machine learning methods can use these data to help solve it.

[0003] Virtual drug screening is divided into two types: target-structure-based and ligand-based methods. Target-structure-based virtual screening simulates the physical interaction between the compound and the target to determine whether a drug effect is possible, as in molecular docking methods. Ligand-based methods mainly use existing data to predict the activity of a compound when the target structure is unknown. The most important thing for this kind of m...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16C20/50, G16C20/70
CPC: G16C20/50, G16C20/70
Inventor: 吴建盛, 张邱鸣, 胡海峰
Owner: NANJING UNIV OF POSTS & TELECOMM