Data processing method and apparatus

A data processing technology, applied in the field of network communication, that addresses the storage capacity bottleneck caused by small bucket nodes and improves the theoretical storage utilization, and thereby the actual storage utilization, of a Ceph cluster.

Active Publication Date: 2018-02-16
XINHUASAN INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] The present invention provides a data processing method and device to solve the following problem in existing Ceph clusters: when the number of copies specified in the data redundancy policy equals the number of bucket nodes designated as fault domains, and the storage capacities of those bucket nodes differ greatly, the bucket nodes with small storage capacity quickly become a storage capacity bottleneck.



Embodiment Construction

[0019] To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, some terms and concepts involved in the present invention are briefly described below.

[0020] 1. Pool (storage pool): A pool is a collection of PGs. When a pool is created, a data redundancy policy must be configured and bucket nodes must be designated as its fault domain; the data redundancy policy specifies the number of copies for the pool. After the pool is created, if the number of copies specified by its data redundancy policy equals the number of bucket nodes designated as its fault domain, the PGs added to the pool are mapped evenly to each of those bucket nodes;

[0021] It should be noted that the number of copies specified by the data redundancy policy of the pool may not be equal to the number of bucket nodes specified as fault domains. I...


Abstract

The invention provides a data processing method and apparatus. The method comprises: for any storage pool of a Ceph cluster, when the copy number of the pool equals the number of bucket nodes designated as its fault domain, adding the pool to a first-type pool group corresponding to those bucket nodes; and, for any first-type pool group, when a first theoretical storage usage rate of the pool group is smaller than a preset usage rate threshold, triggering storage unit migration between the bucket nodes corresponding to the pool group, so that a second theoretical storage usage rate of the pool group after migration is greater than the first theoretical storage usage rate, and updating the CRUSH (Controlled Replication Under Scalable Hashing) map corresponding to the pool group after migration. Applying the method and apparatus increases the theoretical storage usage rate of the first-type pool group, and therefore the storage usage rate of the Ceph cluster.
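The threshold check and migration step in the abstract can be illustrated with a small Python sketch. The abstract does not define the theoretical storage usage rate, so the formula here is an assumption: when every bucket node holds one copy of each PG, the pool group is full once its smallest bucket is full, which gives a rate of (number of buckets × smallest bucket capacity) / total capacity. The `Bucket` class, the function names, and the greedy "largest to smallest" move are hypothetical and are not the patent's procedure; the CRUSH map update is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """A bucket node used as a fault domain (hypothetical model)."""
    name: str
    osds: list[float] = field(default_factory=list)  # capacity of each storage unit

    @property
    def capacity(self) -> float:
        return sum(self.osds)


def usage_rate(capacities: list[float]) -> float:
    """Assumed definition: usable share of raw capacity when each bucket
    stores one replica of every PG, so the smallest bucket fills first."""
    total = sum(capacities)
    return len(capacities) * min(capacities) / total if total else 0.0


def theoretical_usage(buckets: list[Bucket]) -> float:
    return usage_rate([b.capacity for b in buckets])


def rebalance(buckets: list[Bucket], threshold: float) -> list[tuple[str, str, float]]:
    """Greedy sketch: while the rate is below the threshold, move one storage
    unit from the largest bucket to the smallest, but only if the rate rises.
    It can stop early (for example when two buckets tie for smallest)."""
    moves = []
    while theoretical_usage(buckets) < threshold:
        before = theoretical_usage(buckets)
        src = max(buckets, key=lambda b: b.capacity)
        dst = min(buckets, key=lambda b: b.capacity)
        if src is dst:
            break
        best = None  # (index of unit in src, resulting rate)
        for i, unit in enumerate(src.osds):
            caps = [
                b.capacity - unit if b is src
                else b.capacity + unit if b is dst
                else b.capacity
                for b in buckets
            ]
            rate = usage_rate(caps)
            if rate > before and (best is None or rate > best[1]):
                best = (i, rate)
        if best is None:
            break  # no single move improves the rate
        unit = src.osds.pop(best[0])
        dst.osds.append(unit)
        moves.append((src.name, dst.name, unit))
    return moves
```

For three buckets with capacities 16, 8 and 12 (units of 4 each), the assumed rate starts at 3 × 8 / 36 ≈ 0.67; moving one unit from the largest bucket to the smallest equalizes them at 12 and raises the rate to 1.0.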

Description

Technical field

[0001] The present invention relates to the technical field of network communication, in particular to a data processing method and device.

Background

[0002] Ceph is a distributed storage system with excellent performance, high reliability and high scalability, which is widely used in large, medium and small storage environments. Addressing in a Ceph system is completed through three mappings: between File and object, between object and PG (Placement Group), and between PG and OSD (Object Storage Device). The mapping between object and PG is implemented through a hash algorithm, and the mapping between PG and OSD is completed through the CRUSH (Controlled Replication Under Scalable Hashing) algorithm.

[0003] The fault domain is another important concept in the Ceph cluster. Through the introduction of the fault domain, combined with the redundancy strategy, the clu...
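The three mappings described in paragraph [0002] (File to object, object to PG, PG to OSD) can be sketched as follows. This is an illustration only: the object naming scheme, the 4 MiB object size, the SHA-256 hash, and the per-bucket "highest score wins" selection are all assumptions standing in for Ceph's real hashing and the CRUSH algorithm, which are not reproduced here.

```python
import hashlib


def _h(*parts) -> int:
    """Deterministic 64-bit hash of the given parts (stand-in hash function)."""
    data = "/".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def file_to_objects(file_id: str, size: int, object_size: int = 4 * 2**20) -> list[str]:
    """Mapping 1: split a file into fixed-size objects (names are hypothetical)."""
    count = max(1, -(-size // object_size))  # ceiling division
    return [f"{file_id}.{i:08x}" for i in range(count)]


def object_to_pg(object_name: str, pool_id: int, pg_num: int) -> tuple[int, int]:
    """Mapping 2: hash the object name into one of the pool's PGs."""
    return (pool_id, _h(object_name) % pg_num)


def pg_to_osds(pg: tuple[int, int], buckets: dict[str, list[int]]) -> list[int]:
    """Mapping 3: NOT CRUSH. Picks one OSD per fault-domain bucket by highest
    hash score, so each replica lands in a different bucket."""
    return [
        max(osds, key=lambda osd: _h(pg, name, osd))
        for name, osds in buckets.items()
    ]
```

With three buckets designated as the fault domain and three copies, `pg_to_osds` returns exactly one OSD from each bucket, which mirrors the even PG-to-bucket mapping described for pools whose copy number equals the number of fault-domain bucket nodes.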


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/06, G06F11/14, H04L29/08
Inventor: 杨潇顾雷雷
Owner: XINHUASAN INFORMATION TECH CO LTD