High-performance computing-oriented distributed data organization method

A distributed data organization technology for high-performance computing, applied in computing, digital data processing, structured data retrieval, etc. It addresses problems such as restrictive data consistency policies and complex metadata operation semantics.

Active Publication Date: 2017-05-24
JIANGNAN INST OF COMPUTING TECH
5 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

[0003] (1) The hierarchical directory structure is inefficient for organizing massive numbers of files, the metadata operation semantics are complex, and there is a metadata performance bottleneck.
[0004] (2) Data is accessed in the same way as in a stand-alone file system. In a high-performance computing environment, the traditional file system interface is redundant and its implementation is tightly coupled to the kernel VFS. It imposes many restrictions on interface description, data block size, cache strategy, data consistency, and other aspects, and is not flexible enough.
Current object storage services differ greatly from high-performance computing in both interface and application scenarios, so they cannot be used directly to build a mass storage system for high-performance computers.
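
As a rough illustration of problem (1), the sketch below counts name lookups when resolving a path in a hierarchical namespace versus a single lookup in a flat, key-addressed namespace. The classes `DirectoryTree` and `FlatNamespace` and the example path are hypothetical and not taken from the patent; a real file system also adds caching, permission checks, and locking, none of which is modeled here.

```python
class DirectoryTree:
    """Toy hierarchical namespace: one lookup per path component (assumption)."""

    def __init__(self):
        self.root = {}
        self.lookups = 0

    def create(self, path, inode):
        parts = [p for p in path.split("/") if p]
        node = self.root
        for name in parts[:-1]:
            node = node.setdefault(name, {})  # create intermediate directories
        node[parts[-1]] = inode

    def resolve(self, path):
        node = self.root
        for name in [p for p in path.split("/") if p]:
            self.lookups += 1  # each component costs one directory lookup
            node = node[name]
        return node


class FlatNamespace:
    """Toy flat namespace: the whole object name is a single key (assumption)."""

    def __init__(self):
        self.table = {}
        self.lookups = 0

    def create(self, key, inode):
        self.table[key] = inode

    def resolve(self, key):
        self.lookups += 1  # one hash-table lookup regardless of name length
        return self.table[key]


tree = DirectoryTree()
tree.create("/proj/run42/step7/out.dat", 1001)
flat = FlatNamespace()
flat.create("proj/run42/step7/out.dat", 1001)

tree.resolve("/proj/run42/step7/out.dat")
flat.resolve("proj/run42/step7/out.dat")
print(tree.lookups, flat.lookups)
```

In this toy model the hierarchical lookup cost grows with path depth, while the flat lookup cost stays constant.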




Embodiment Construction

[0024] In order to make the content of the present invention clearer and easier to understand, the content of the present invention will be described in detail below in conjunction with specific embodiments and accompanying drawings.

[0025] High-performance computers are about to enter the exascale (E-class) era. Computing subsystems continue to grow, with core counts exceeding tens of millions, while the scale and complexity of the supported applications are also increasing. This places higher demands on the storage subsystem's I/O bandwidth, metadata performance, scalability, data manageability, and other aspects. Traditional storage systems centered on distributed shared file systems have many limitations in data organization and access interfaces, which adversely affect the scalability and I/O performance of the entire high-performance computer storage system.

[0026] The invention applies the emerging object storage technology in the Inter...


Abstract

The invention discloses a distributed data organization method oriented to high-performance computing. Data access and data management are separated at the interface layer: on the compute-node side, the client exposes only a simplified data access interface to applications, while still supporting POSIX-style file access that is fully transparent to the application. The method addresses the data storage needs of high-performance computing applications in three ways. First, it simplifies POSIX file system access semantics and decouples data access from data management, giving applications an efficient, lightweight programming interface. Second, it organizes data with distributed object storage, which removes the data organization limits of conventional file systems, enables a more efficient data access protocol, and further improves system scalability. Third, it provides memory-based metadata management and introduces, for the first time in a high-performance computer system, a high-performance non-relational key-value database, so that an efficient and scalable metadata service can be offered by exploiting the strengths of the high-performance computer system.
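
The separation described in the abstract can be pictured with a minimal sketch, assuming a flat object namespace and a plain in-memory dictionary standing in for the key-value metadata database. Every name here (`ObjectMeta`, `MetadataService`, `ObjectStore`, `DataAccessClient`, `DataManagementClient`) is hypothetical; the source does not specify an API, a database product, or a wire protocol.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ObjectMeta:
    size: int
    created: float
    attrs: dict = field(default_factory=dict)


class MetadataService:
    """In-memory key-value metadata; a dict stands in for the database (assumption)."""

    def __init__(self):
        self._kv = {}

    def put(self, key, meta):
        self._kv[key] = meta

    def get(self, key):
        return self._kv[key]

    def keys(self, prefix=""):
        return sorted(k for k in self._kv if k.startswith(prefix))


class ObjectStore:
    """Flat object store keyed by object name (assumption)."""

    def __init__(self):
        self._blobs = {}

    def write(self, key, data):
        self._blobs[key] = bytes(data)

    def read(self, key):
        return self._blobs[key]


class DataAccessClient:
    """Compute-node side: exposes only read and write to the application."""

    def __init__(self, store, meta):
        self._store = store
        self._meta = meta

    def write(self, key, data):
        self._store.write(key, data)
        self._meta.put(key, ObjectMeta(size=len(data), created=time.time()))

    def read(self, key):
        return self._store.read(key)


class DataManagementClient:
    """Management side: listing and attributes, kept off the data path."""

    def __init__(self, meta):
        self._meta = meta

    def stat(self, key):
        return self._meta.get(key)

    def list(self, prefix=""):
        return self._meta.keys(prefix)

    def set_attr(self, key, name, value):
        self._meta.get(key).attrs[name] = value


store, meta = ObjectStore(), MetadataService()
app = DataAccessClient(store, meta)
admin = DataManagementClient(meta)

app.write("run42/out.dat", b"abc")
admin.set_attr("run42/out.dat", "owner", "sim-team")
print(app.read("run42/out.dat"), admin.stat("run42/out.dat").size, admin.list("run42/"))
```

The application only ever sees `DataAccessClient`, so management operations can evolve independently of the data path; how the patent actually divides these responsibilities may differ from this sketch.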

Description

Technical field

[0001] The present invention relates to the field of high-performance computing and the field of distributed storage systems; more specifically, it relates to a distributed data organization method oriented to high-performance computing.

Background

[0002] Distributed file systems are commonly used on high-performance computer systems to build a centralized, shared storage environment that meets applications' data storage and access needs. Typical systems such as Lustre, GPFS, PVFS, and Panasas are general-purpose distributed shared file systems that follow POSIX storage semantics for storage space organization and data access. These systems are functionally complete but complex in design, and they have significant limitations in high-concurrency data access, mainly in the following aspects:

[0003] (1) The hierarchical directory structure is inefficient when organizing massive files, the metadata operation semantics are complex, and the...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/27
Inventor: 陈曦, 朱建涛, 尉红梅, 何王全, 何晓斌, 漆锋滨
Owner: JIANGNAN INST OF COMPUTING TECH