Hierarchical storage optimization method for super-large-scale drug data

A hierarchical storage technology for drug data, applied in chemical informatics data warehouses, chemical information database systems, and chemical informatics programming languages. It improves I/O performance and enables heterogeneous storage and platform development and utilization of the data.

Active Publication Date: 2020-05-29
OCEAN UNIV OF CHINA

AI Technical Summary

Problems solved by technology

[0004] The invention provides a hierarchical storage optimization method for the super-large-scale drug data involved in the computer-aided drug design process. It addresses the I/O problem of super-large-scale drug data in existing supercomputing environments.



Examples


Embodiment 1

[0025] The present invention provides a hierarchical storage optimization method for the ultra-large-scale drug data involved in the computer-aided drug design process. The process is shown in Figure 1 and includes the following steps:

[0026] 1) To address the heterogeneity of the supercomputing cluster environment, build a cluster resource management system based on a distributed multi-level storage structure, and allocate specific cluster resources to specific users, user groups, or jobs. Cluster storage resources include storage clusters and computing clusters. The underlying storage structure has four levels: the computing cluster's main memory (internal memory), which has fast I/O speed, small capacity, and high cost; the computer cluster's auxiliary memory (external memory), dominated by HDD+SSD, which has slightly slower I/O speed and larger capacity than main memory; the distributed big data server cluster HDD+SSD built o...
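The allocation idea in step 1 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the class names `Tier` and `ResourceManager`, the principal labels, and all capacity figures are hypothetical. Only the two tiers that the text above describes in full are modelled, since the description of the remaining levels is truncated in this listing.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """One storage level; level 1 is the fastest (all values are illustrative)."""
    name: str
    level: int
    capacity_gb: float
    allocated_gb: float = 0.0

    def free_gb(self) -> float:
        return self.capacity_gb - self.allocated_gb


@dataclass
class ResourceManager:
    """Hypothetical registry that grants tier capacity to users, groups, or jobs."""
    tiers: list
    grants: dict = field(default_factory=dict)  # (principal, tier name) -> GB

    def allocate(self, principal: str, tier_name: str, size_gb: float) -> bool:
        tier = next(t for t in self.tiers if t.name == tier_name)
        if size_gb > tier.free_gb():
            return False  # not enough room left on this tier
        tier.allocated_gb += size_gb
        key = (principal, tier_name)
        self.grants[key] = self.grants.get(key, 0.0) + size_gb
        return True


# Only the two fully described tiers; capacities are made-up numbers.
tiers = [
    Tier("main_memory", level=1, capacity_gb=512),
    Tier("auxiliary_hdd_ssd", level=2, capacity_gb=20_000),
]
manager = ResourceManager(tiers)
ok = manager.allocate("job:docking-001", "main_memory", 128)
denied = manager.allocate("group:screening", "main_memory", 1024)
```

Here the first request fits in the fast tier and is granted, while the second exceeds the remaining capacity and is refused. A real system would presumably fall back to a slower tier instead of refusing outright.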



Abstract

The invention belongs to the field of super-large-scale data storage management and relates to a hierarchical storage optimization method for super-large-scale drug data. The method comprises the following steps: step one, constructing a cluster storage resource management system based on a distributed multistage storage structure, and allocating specific cluster storage resources to specific users, user groups, or jobs; step two, performing characterization processing on the jobs, dividing them into job categories, and intelligently scheduling each job to the servers that hold the data blocks it requires; step three, designing a data classification model, applying the model to map and store the massive result data generated in the computer-aided drug design process, and segmenting the generated data into data blocks that are stored on servers at the corresponding storage levels; and step four, designing a corresponding I/O method for each level of the storage structure according to its characteristic attributes, dynamically scheduling I/O requests, and optimizing the I/O scheduling strategy of each level. The method improves I/O performance in the supercomputing environment and enables heterogeneous storage and platform development and utilization of super-large-scale drug data.
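Steps two and three can be sketched in Python as follows. The abstract does not specify the classification model or the scheduling rule, so this sketch substitutes two simple stand-ins and labels them as assumptions: access frequency as the classification feature, and "most required blocks already present" as the data-locality rule. All function names, thresholds, and server labels are hypothetical.

```python
def split_into_blocks(data: bytes, block_size: int) -> list:
    """Segment result data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def classify(access_count: int, hot: int = 100, warm: int = 10) -> int:
    """Map a block to a tier level (1 = fastest).

    Assumption: access frequency stands in for the unspecified
    classification model; the thresholds are arbitrary.
    """
    if access_count >= hot:
        return 1
    if access_count >= warm:
        return 2
    return 3


def pick_server(required_blocks: set, block_locations: dict) -> str:
    """Choose the server that already holds the most of a job's blocks."""
    counts = {}
    for block_id in required_blocks:
        for server in block_locations.get(block_id, ()):
            counts[server] = counts.get(server, 0) + 1
    if not counts:
        raise ValueError("no server holds any required block")
    # Sorting first makes ties resolve alphabetically and deterministically.
    return max(sorted(counts), key=counts.get)


blocks = split_into_blocks(b"x" * 10, block_size=4)
levels = [classify(n) for n in (250, 42, 3)]
locations = {"b0": ["s1", "s2"], "b1": ["s2"], "b2": ["s3"]}
chosen = pick_server({"b0", "b1"}, locations)
```

In this example the ten bytes split into blocks of 4, 4, and 2 bytes, the three access counts map to levels 1, 2, and 3, and the job is sent to `s2` because it holds both required blocks.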

Description

Technical field

[0001] The invention belongs to the technical field of ultra-large-scale data storage management, and in particular relates to a hierarchical storage optimization method for the ultra-large-scale, multi-source, heterogeneous drug data generated in the process of computer-aided drug design.

Background technique

[0002] The whole process of computer-aided drug design includes virtual drug screening, lead optimization, target prediction, kinetic simulation, etc. The process involves drug data, intermediate result data, and result data that are large in scale and diverse in structure, and the data across stages are time-correlated (the output of the previous stage is the input of the next stage). According to the characteristics of the above-mentioned process drug data, a multi-level storage resource management system is designed, and a series of characteristic operations, data classification models, and I/O scheduling strategy optimization are u...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16C20/90
CPC: G16C20/90
Inventor: 刘昊, 杨雁博, 魏志强
Owner: OCEAN UNIV OF CHINA