Data reconstruction method and device based on distributed storage Ceph

A distributed storage and data reconstruction technology, applied in electrical digital data processing, input/output processes for data processing, instruments, etc. It addresses the difficulty of choosing fixed parameters that are reasonable for every stage of reconstruction, of achieving high reconstruction bandwidth, and of keeping the average impact on upper-layer services low.

Active Publication Date: 2020-11-10
FENGHUO COMM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

However, Ceph data reconstruction is a dynamic process. With a fixed parameter configuration, it is difficult to ensure that the configured values are reasonable for every stage of the reconstruction process, so it is difficult to achieve high reconstruction bandwidth while also keeping the impact on normal business low. In other words, it is difficult to achieve both the optimal average reconstruction bandwidth and the optimal average impact on upper-layer business over the entire data reconstruction process.



Examples


Embodiment 1

[0047] To give a more intuitive understanding of the data reconstruction method provided by the present invention, this embodiment first introduces the design principle and overall idea of the method.

[0048] Distributed storage Ceph generally controls its data reconstruction process through multiple parameters, such as osd_max_backfills, osd_recovery_sleep, osd_recovery_op_priority, and osd_client_op_priority. Through extensive testing and research, the present invention finds that two parameters play the major role in the data reconstruction process, namely osd_max_backfills (the first parameter referred to in the text) and osd_recovery_sleep (the second parameter referred to in the text), while the other parameters have less effect. Therefore, the present invention mainly controls the data reconstruction process effectively by dynamically adjusting the first param...
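As a rough illustration of the two parameters named in [0048], the sketch below builds the argument list for Ceph's `ceph config set` command, which changes an option in the cluster's configuration database. The text shown here does not say which mechanism the invention uses to apply new values, so the choice of command, the helper name `build_set_command`, and the sample values are all assumptions.

```python
from typing import List, Union

# The two options that paragraph [0048] identifies as dominant.
FIRST_PARAMETER = "osd_max_backfills"
SECOND_PARAMETER = "osd_recovery_sleep"


def build_set_command(option: str, value: Union[int, float]) -> List[str]:
    """Return argv for `ceph config set osd <option> <value>`.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if option not in (FIRST_PARAMETER, SECOND_PARAMETER):
        raise ValueError(f"unsupported option: {option}")
    return ["ceph", "config", "set", "osd", option, str(value)]


if __name__ == "__main__":
    # Sample values are illustrative only, not taken from the patent.
    print(" ".join(build_set_command(FIRST_PARAMETER, 2)))
    print(" ".join(build_set_command(SECOND_PARAMETER, 0.1)))
```

Returning the argument list instead of executing it keeps the sketch safe to run on a machine without a Ceph cluster; a caller could pass the list to `subprocess.run` on an admin node.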

Embodiment 2

[0058] Based on the design principles and ideas in the preceding embodiment, this embodiment provides a data reconstruction method based on distributed storage Ceph. The method dynamically adjusts parameters according to the dynamic characteristics of the Ceph data reconstruction process, so as to balance the reconstruction bandwidth against the degree of influence on upper-layer business.

[0059] This method can be guided by user expectations. It first receives the expected parameters set by the user in the form of a configuration file, mainly the average reconstruction bandwidth that the user expects to achieve. It then monitors in real time, through the underlying instructions provided by Ceph, whether data reconstruction occurs in the Ceph cluster. If data reconstruction occurs, it initializes the relevant parameters required for this data reconstruction, such as the start time of this data reconstruction, the total amount of reconstructed data, the h...
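The monitoring step in [0059] could be approximated as follows. This is a minimal sketch, assuming the cluster status is obtained as JSON (for example from `ceph status --format json`) and that placement-group states appear under `pgmap.pgs_by_state` as entries with `state_name` and `count`. The text does not name the instructions it uses, so the function name, the JSON layout relied on, and the state-name markers are assumptions.

```python
from typing import Any, Dict

# Substrings of PG state names treated as "reconstruction in progress".
# This choice is an assumption, not something stated in the patent text.
RECONSTRUCTION_MARKERS = ("backfill", "recover")


def reconstruction_in_progress(status: Dict[str, Any]) -> bool:
    """Return True if any placement group is in a backfill/recovery state.

    `status` is the parsed JSON status of the cluster (assumed layout).
    """
    pg_states = status.get("pgmap", {}).get("pgs_by_state", [])
    for entry in pg_states:
        name = entry.get("state_name", "")
        count = entry.get("count", 0)
        if count > 0 and any(marker in name for marker in RECONSTRUCTION_MARKERS):
            return True
    return False


if __name__ == "__main__":
    sample = {
        "pgmap": {
            "pgs_by_state": [
                {"state_name": "active+clean", "count": 120},
                {"state_name": "active+remapped+backfilling", "count": 8},
            ]
        }
    }
    print(reconstruction_in_progress(sample))
```

A polling loop would call this function at a fixed interval and, on the first `True`, record the start time and total reconstruction volume mentioned in the paragraph.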

Embodiment 3

[0080] Building on the data reconstruction method based on distributed storage Ceph provided in Embodiment 2 above, this embodiment further introduces the complete algorithm flow corresponding to the method. As shown in Figure 2, it specifically includes the following steps:

[0081] Step 201: Receive the expected parameters set by the user in the form of a configuration file, including one or more of the following: the average reconstruction bandwidth BW that the user expects to achieve, the perception granularity S of the reconstructed data volume, the perception granularity N1 of changes in the reconstructed data volume, the data unit counting method UNIT, and the control granularity N2 of the second parameter.

[0082] The value range of the average reconstruction bandwidth BW expected by the user is (0, ∞), which is configured by the user according to the hardware environment and service requirements. The default value in the embodiment of the prese...
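The expected parameters of Step 201 can be pictured as a small validated record. The sketch below is illustrative only: the class name `ExpectedParameters`, the helper `from_mapping`, and the numeric types chosen for S, N1, and N2 are assumptions, and no default values are filled in because the defaults are not visible in the text above. The only constraint enforced is the stated range (0, ∞) for BW.

```python
import math
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ExpectedParameters:
    """User expectations from Step 201; every item is optional."""
    bw: Optional[float] = None   # BW: expected average reconstruction bandwidth
    s: Optional[float] = None    # S: perception granularity of reconstructed data volume
    n1: Optional[float] = None   # N1: perception granularity of volume change
    unit: Optional[str] = None   # UNIT: data unit counting method
    n2: Optional[float] = None   # N2: control granularity of the second parameter

    def __post_init__(self) -> None:
        # Paragraph [0082] gives the range of BW as the open interval (0, inf).
        if self.bw is not None and not (0 < self.bw < math.inf):
            raise ValueError("BW must lie in the open interval (0, inf)")


def from_mapping(raw: Mapping[str, Any]) -> ExpectedParameters:
    """Build the record from already-parsed configuration keys (hypothetical)."""
    def number(key: str) -> Optional[float]:
        return float(raw[key]) if key in raw else None

    unit = str(raw["UNIT"]) if "UNIT" in raw else None
    return ExpectedParameters(
        bw=number("BW"), s=number("S"), n1=number("N1"), unit=unit, n2=number("N2")
    )


if __name__ == "__main__":
    print(from_mapping({"BW": "100", "UNIT": "MB"}))
```

Leaving absent items as `None` mirrors the "one or more" wording of Step 201; the caller would substitute the patent's defaults where they are known.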



Abstract

The invention discloses a data reconstruction method and device based on distributed storage Ceph. The method comprises the following steps: adjusting a first parameter according to the task load of the current data reconstruction in a Ceph cluster, so that every piece of data awaiting reconstruction is actively being reconstructed; adjusting a second parameter according to the average reconstruction bandwidth that the user expects to reach, so that the real-time reconstruction bandwidth of the Ceph cluster meets the user's requirements; and, during data reconstruction, dynamically adjusting the second parameter according to the proportion of the remaining reconstruction data volume to the total reconstruction data volume until the reconstruction process is finished. Because the data reconstruction process is controlled by dynamic parameter adjustment that adapts to the dynamic characteristics of Ceph data reconstruction, a balance can be maintained between the reconstruction bandwidth and the degree of influence on upper-layer services. Because the user's expectation guides the dynamic adjustment, that expectation is more easily met.
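To make the feedback idea in the abstract concrete, the sketch below uses a simple fixed-step rule for the second parameter (osd_recovery_sleep): shorten the sleep when the measured bandwidth is below the user's target, lengthen it when above. This is not the patent's adjustment rule, which is not reproduced in the text above. The step size, the clamping bounds, and both function names are assumptions chosen for illustration.

```python
def remaining_ratio(remaining: float, total: float) -> float:
    """Proportion of reconstruction data still outstanding (0.0 to 1.0)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return min(1.0, max(0.0, remaining / total))


def next_recovery_sleep(
    current_sleep: float,
    measured_bw: float,
    target_bw: float,
    step: float = 0.01,       # assumed adjustment step, in seconds
    min_sleep: float = 0.0,   # assumed lower bound
    max_sleep: float = 1.0,   # assumed upper bound
) -> float:
    """Fixed-step controller: a longer sleep throttles reconstruction."""
    if measured_bw < target_bw:
        candidate = current_sleep - step   # too slow: sleep less
    elif measured_bw > target_bw:
        candidate = current_sleep + step   # too fast: sleep more
    else:
        candidate = current_sleep
    return min(max_sleep, max(min_sleep, candidate))


if __name__ == "__main__":
    sleep = 0.10
    for measured in (50.0, 60.0, 130.0):
        sleep = next_recovery_sleep(sleep, measured, target_bw=100.0)
        print(f"measured={measured} -> osd_recovery_sleep={sleep:.2f}")
    print(f"remaining ratio: {remaining_ratio(25.0, 100.0):.2f}")
```

The remaining ratio is computed separately because the abstract ties later adjustments to it; how that ratio changes the step or the target is left open here.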

Description

Technical field

[0001] The invention belongs to the technical field of distributed storage, and more particularly relates to a method and device for data reconstruction based on distributed storage Ceph.

Background

[0002] Ceph is a unified open-source distributed storage system. Because of its unified design, it provides block storage, file storage, and object storage services. It also offers high availability, high scalability, and high performance, and is therefore widely used in many fields.

[0003] In the distributed storage system Ceph, hard disk failure is a common problem. To ensure data security, when a hard disk fails, the distributed storage system reconstructs the data on the failed disk and migrates it according to its algorithm. The longer the data reconstruction process takes, the higher the probability of secondary failure or multiple failures in the distributed ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/08, G06F3/06
CPC: H04L67/1097, H04L67/1044, G06F3/0614, G06F3/067
Inventor: 王筱橦, 张书东, 蓝海, 李庆林
Owner FENGHUO COMM SCI & TECH CO LTD