A data distribution method for a decentralized distributed heterogeneous storage system

A data distribution technology for heterogeneous storage systems, applied in the fields of digital data processing, input/output arrangements for data processing, and character and pattern recognition. It addresses the problems that existing methods do not consider the heterogeneous characteristics of the storage system, are not scalable, and are not suitable for ultra-large-scale data applications.

Active Publication Date: 2020-12-08
CHONGQING UNIV

AI Technical Summary

Problems solved by technology

Existing data distribution methods do not take the heterogeneous characteristics of the storage system into account, which results in intensive write operations to the solid-state drives (SSDs).
Other technologies use solid-state drives to improve the performance of centralized storage. Such a centralized data distribution strategy makes the system non-scalable and unsuitable for ultra-large-scale data applications.

Examples

Embodiment Construction

[0030] The present invention is further described below with reference to the accompanying drawings and embodiments:

[0031] I. The first method of the present invention comprises the following steps:

[0032] Step 1. While the program is running, count the number of times each data object is read and written, convert the read and write counts into weights, and use these weights as the data access pattern. Classify the data objects by access pattern, for example as read-intensive, write-intensive, and mixed. The classification can use the common K-Means clustering algorithm. Each class of data objects has an attribute value that represents the average number of writes for objects of that class.
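Step 1 can be illustrated with a minimal sketch. The source does not say how counts are converted into weights, so the example assumes each object's weight is its normalized (read share, write share) pair. The function names, the sample counts, and the deterministic farthest-first initialization are also assumptions of this sketch, not part of the patent text.

```python
def access_weights(reads, writes):
    """Assumed weighting: turn raw counts into (read_share, write_share)."""
    total = reads + writes
    if total == 0:
        return (0.0, 0.0)
    return (reads / total, writes / total)


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iterations=50):
    """Plain Lloyd's K-Means on 2-D points; returns one label per point."""
    # Farthest-first initialization keeps the sketch deterministic (assumption).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


def classify_objects(counts, k=3):
    """Cluster objects by access pattern; also return average writes per class."""
    names = list(counts)
    points = [access_weights(*counts[n]) for n in names]
    classes = dict(zip(names, kmeans(points, k)))
    avg_writes = {}
    for c in set(classes.values()):
        writes = [counts[n][1] for n in names if classes[n] == c]
        avg_writes[c] = sum(writes) / len(writes)
    return classes, avg_writes


# Hypothetical (reads, writes) counters collected while the program runs.
counts = {
    "a": (900, 10), "b": (950, 20),    # mostly read
    "c": (15, 800), "d": (5, 700),     # mostly written
    "e": (400, 420), "f": (500, 480),  # mixed
}
classes, avg_writes = classify_objects(counts)
```

The per-class average write count in `avg_writes` corresponds to the attribute value the step describes; the cluster labels themselves are arbitrary integers and would still need to be mapped to names such as "read-intensive".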

[0033] Step 2. Classify the storage devices according to their capacity and read/write performance, for example as high-speed solid-state drives, low-speed solid-state drives, high-speed mechanical hard drives, and low-speed mechanical hard drives. Each storage device has its own ...
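As a rough sketch of the device classification in Step 2, the example below labels each device with one of the four classes named above. The `Device` fields, the throughput cut-offs, and the device names are hypothetical; the source does not give concrete criteria, and capacity is recorded here but not used to choose the label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str               # hypothetical device identifier
    media: str              # "ssd" or "hdd"
    capacity_gb: int        # recorded, but not used for the label in this sketch
    throughput_mb_s: float  # measured read/write throughput


# Hypothetical cut-offs separating "high-speed" from "low-speed" per media type.
SPEED_CUTOFF_MB_S = {"ssd": 1000.0, "hdd": 150.0}


def device_class(dev):
    """Return one of the four device classes named in Step 2."""
    fast = dev.throughput_mb_s >= SPEED_CUTOFF_MB_S[dev.media]
    speed = "high-speed" if fast else "low-speed"
    return f"{speed} {dev.media.upper()}"


def group_devices(devices):
    """Group device names by class, one group per device type."""
    groups = {}
    for dev in devices:
        groups.setdefault(device_class(dev), []).append(dev.name)
    return groups


devices = [
    Device("dev0", "ssd", 960, 3200.0),
    Device("dev1", "ssd", 480, 520.0),
    Device("dev2", "hdd", 4000, 210.0),
    Device("dev3", "hdd", 8000, 110.0),
]
groups = group_devices(devices)
```

Each resulting group would then correspond to one type of "placement group cluster" in the later steps.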


Abstract

The invention discloses a data distribution method for a decentralized distributed heterogeneous storage system, comprising the following steps: 1. classify the data objects; 2. classify the storage devices; 3. divide the stored data into different "placement groups", where each type of storage device corresponds to one type of "placement group cluster"; 4. calculate the proportion of each data object to be stored in each type of "placement group cluster"; 5. use a hash algorithm to determine which "placement group" in the "placement group cluster" a data object to be stored belongs to; 6. use the data distribution algorithm of the storage system to store the data objects in each "placement group" on multiple corresponding storage devices; 7. while the system is running, calculate a migration threshold from the access characteristics of the data objects and migrate data objects dynamically. The invention maintains the performance, load balance, and scalability of the storage system while reducing the number of write operations to the solid-state drives.
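Step 5 of the abstract can be sketched as follows. The source does not name the hash algorithm, so SHA-256 stands in for it here. The cluster names, the placement group counts, and the `placement_group` function are hypothetical, and the target cluster is assumed to have been chosen already by Step 4.

```python
import hashlib


def placement_group(object_id, cluster, pgs_per_cluster):
    """Map an object to a placement group index inside an already-chosen cluster."""
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer, then reduce
    # it modulo the number of placement groups in the cluster.
    index = int.from_bytes(digest[:8], "big") % pgs_per_cluster[cluster]
    return (cluster, index)


# Hypothetical number of placement groups in each placement group cluster.
pgs_per_cluster = {"high-speed SSD": 64, "low-speed HDD": 256}

pg = placement_group("object-0001", "high-speed SSD", pgs_per_cluster)
```

Because the mapping depends only on the object identifier and the cluster size, any node can recompute it without consulting a central directory, which is the property a decentralized layout relies on.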

Description

Technical field

[0001] The invention belongs to the technical field of distributed computer storage, and in particular relates to a data distribution method for a decentralized distributed heterogeneous storage system.

Background

[0002] In big data applications, scientific computing, and cloud computing platforms, reliable and scalable storage systems play a vital role in system performance. As the amount of data grows to the petabyte (PB) level, the data distribution strategy of the storage system must ensure both performance and scalability. Decentralized data distribution strategies, such as the one used by Ceph, use the processing power of the storage devices themselves to provide a reliable object storage system. Solid-state drives (SSDs) offer better read and write performance than traditional mechanical hard disk drives (HDDs) and are increasingly used in storage systems, forming large-scale distributed heterogeneous storage systems. In addition, the new archive hard disk (Archive HDD) is al...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F3/06
CPC: G06F3/0607, G06F3/0655, G06F3/0685, G06F18/23213, G06F3/06
Inventor: 沙行勉, 诸葛晴凤, 吴林
Owner: CHONGQING UNIV