A method and device for data reconstruction based on distributed storage Ceph

A distributed storage and data reconstruction technology, applied in the fields of electrical digital data processing, data processing input/output processes, instruments, etc. It addresses the difficulty of choosing fixed parameters that are reasonable for every stage of reconstruction, and the resulting difficulty of achieving both high reconstruction bandwidth and a low average impact on upper-layer business.

Active Publication Date: 2022-02-18
FENGHUO COMM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

However, since Ceph data reconstruction is a dynamic process, a fixed parameter configuration can hardly be reasonable for every stage of the reconstruction process. It is therefore difficult to achieve high reconstruction bandwidth while also guaranteeing low impact on normal business. In other words, it is difficult to optimize both the average reconstruction bandwidth and the average impact on upper-layer business over the entire data reconstruction process.



Examples


Embodiment 1

[0047] To give a more intuitive understanding of the data reconstruction method provided by the present invention, this embodiment first introduces the design principle and overall idea of the method.

[0048] Distributed storage Ceph generally controls its data reconstruction process through multiple parameters, such as osd_max_backfills, osd_recovery_sleep, osd_recovery_op_priority, and osd_client_op_priority. Through repeated tests and research, the present invention finds that two parameters play the major role in the data reconstruction process, namely osd_max_backfills (the first parameter mentioned in the text) and osd_recovery_sleep (the second parameter mentioned in the text), while the other parameters have less effect. Therefore, the present invention mainly controls the data reconstruction process effectively by dynamically adjusting the first param...
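
As a rough illustration of what changing these two parameters at run time could look like, the sketch below only assembles `ceph config set` command lines for them and does not execute anything. The helper name `build_config_set` and the example values are hypothetical; the text above does not say which Ceph interface the invention uses to apply parameter changes.

```python
from typing import List, Union

# The two parameters the text singles out as dominant.
FIRST_PARAM = "osd_max_backfills"
SECOND_PARAM = "osd_recovery_sleep"


def build_config_set(param: str, value: Union[int, float]) -> List[str]:
    """Return the argv for `ceph config set osd <param> <value>`.

    Hypothetical helper: it only assembles the command, it never runs it.
    """
    if param not in (FIRST_PARAM, SECOND_PARAM):
        raise ValueError(f"unsupported parameter: {param}")
    if value < 0:
        raise ValueError("value must be non-negative")
    return ["ceph", "config", "set", "osd", param, str(value)]


if __name__ == "__main__":
    # Example values are arbitrary and chosen only for illustration.
    print(" ".join(build_config_set(FIRST_PARAM, 2)))
    print(" ".join(build_config_set(SECOND_PARAM, 0.1)))
```

Keeping command construction separate from execution makes the adjustment logic easy to test without a running cluster.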

Embodiment 2

[0058] Based on the design principles and ideas in Embodiment 1, this embodiment of the present invention provides a data reconstruction method based on distributed storage Ceph. The method dynamically adjusts parameters according to the dynamic characteristics of the Ceph data reconstruction process, so as to balance the reconstruction bandwidth against the degree of impact on upper-layer business.

[0059] This method can be guided by user expectations. It therefore first receives the expected parameters set by the user in the form of a configuration file, mainly the average reconstruction bandwidth that the user expects to achieve. It then monitors in real time, through the underlying commands provided by Ceph, whether data reconstruction occurs in the Ceph cluster. If data reconstruction occurs, it initializes the relevant parameters required for this data reconstruction, such as the start time of this data reconstruction, the total amount of reconstructed data, the h...
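
One possible way to carry out the monitoring step is to inspect placement-group states in the JSON printed by `ceph status --format json`. The sketch below works on an already-parsed dictionary. The field names `pgmap`, `pgs_by_state`, `state_name`, and `count` are assumptions about the shape of that output and should be checked against the Ceph release in use; the function name is hypothetical, and the text above does not specify which command the invention relies on.

```python
from typing import Any, Dict

# PG state fragments that indicate recovery or backfill activity.
RECONSTRUCTION_STATES = ("recovering", "recovery_wait", "backfilling", "backfill_wait")


def reconstruction_in_progress(status: Dict[str, Any]) -> bool:
    """Return True if any placement group reports a recovery/backfill state.

    `status` is assumed to be the parsed JSON of `ceph status --format json`.
    """
    for entry in status.get("pgmap", {}).get("pgs_by_state", []):
        # A state name is a '+'-joined list such as "active+recovering".
        states = entry.get("state_name", "").split("+")
        if entry.get("count", 0) > 0 and any(s in RECONSTRUCTION_STATES for s in states):
            return True
    return False


if __name__ == "__main__":
    # Hand-written sample input, not real cluster output.
    sample = {"pgmap": {"pgs_by_state": [
        {"state_name": "active+clean", "count": 120},
        {"state_name": "active+undersized+degraded+remapped+backfilling", "count": 8},
    ]}}
    print(reconstruction_in_progress(sample))
```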

Embodiment 3

[0080] Building on the data reconstruction method based on distributed storage Ceph provided in Embodiment 2 above, this embodiment of the present invention further introduces the complete algorithm flow corresponding to the method. As shown in Figure 2, it specifically includes the following steps:

[0081] Step 201: Receive the expected parameters set by the user in the form of a configuration file, including one or more of the following: the average reconstruction bandwidth BW that the user expects to achieve, the perceived granularity S of the reconstructed data volume, the perceived granularity N1 of the change in reconstructed data volume, the data unit counting method UNIT, and the control granularity N2 of the second parameter.

[0082] The value range of the average reconstruction bandwidth BW expected by the user is (0, ∞), which is configured by the user according to the hardware environment and service requirements. The default value in the embodiment of the prese...
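
The expected parameters of Step 201 can be pictured as a small record with a range check on BW. The sketch below is only an assumption about how such a configuration might be held in memory: the class name `ExpectedParams`, the helper `from_mapping`, and the choice to represent unset items as `None` are hypothetical, and no default values are filled in because the text does not give them here.

```python
import math
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ExpectedParams:
    BW: Optional[float] = None   # expected average reconstruction bandwidth
    S: Optional[float] = None    # perceived granularity of reconstructed data volume
    N1: Optional[float] = None   # perceived granularity of data volume change
    UNIT: Optional[str] = None   # data unit counting method
    N2: Optional[float] = None   # control granularity of the second parameter

    def validate(self) -> None:
        # BW, when given, must lie in the open interval (0, inf).
        if self.BW is not None and not (0 < self.BW < math.inf):
            raise ValueError("BW must lie in (0, inf)")


def from_mapping(raw: Dict[str, Any]) -> ExpectedParams:
    """Build ExpectedParams from already-parsed config values, ignoring unknown keys."""
    known = {k: raw[k] for k in ("BW", "S", "N1", "UNIT", "N2") if k in raw}
    params = ExpectedParams(**known)
    params.validate()
    return params


if __name__ == "__main__":
    # Arbitrary illustrative values.
    print(from_mapping({"BW": 100.0, "N2": 0.01}))
```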



Abstract

The invention discloses a data reconstruction method and device based on distributed storage Ceph. The method includes: adjusting the first parameter according to the current data reconstruction workload in the Ceph cluster, so that each piece of data to be reconstructed is in the "reconstructing" state; adjusting the second parameter according to the average reconstruction bandwidth expected by the user, so that the real-time reconstruction bandwidth of the Ceph cluster meets the user's needs; and dynamically adjusting the second parameter according to the proportion of the data amount until the reconstruction process ends. Controlling the data reconstruction process with a dynamic parameter adjustment method adapted to the dynamic characteristics of Ceph data reconstruction ensures a balance between the reconstruction bandwidth and the impact on upper-layer business. Because the dynamic adjustment of parameters is guided by user expectations, those expectations are also easier to meet.
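
To make the idea of steering the second parameter toward the expected bandwidth concrete, here is a minimal sketch of one simple step rule. It is an illustration only and not the adjustment algorithm claimed by the invention: the function name, the fixed-step rule, and the use of a step size as the adjustment granularity are all assumptions. It relies only on the fact that a larger osd_recovery_sleep slows recovery and a smaller one speeds it up.

```python
def next_recovery_sleep(current_sleep: float, measured_bw: float,
                        expected_bw: float, step: float) -> float:
    """Return a new osd_recovery_sleep value (seconds) using a fixed-step rule.

    Hypothetical rule: sleep longer when reconstruction is faster than the
    expected bandwidth, sleep less when it is slower, never go below zero.
    """
    if expected_bw <= 0 or step <= 0:
        raise ValueError("expected_bw and step must be positive")
    if measured_bw > expected_bw:
        return current_sleep + step
    if measured_bw < expected_bw:
        return max(0.0, current_sleep - step)
    return current_sleep


if __name__ == "__main__":
    # Arbitrary illustrative numbers; bandwidth units only need to match.
    print(next_recovery_sleep(0.10, measured_bw=200.0, expected_bw=100.0, step=0.05))
    print(next_recovery_sleep(0.02, measured_bw=50.0, expected_bw=100.0, step=0.05))
```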

Description

Technical field

[0001] The invention belongs to the technical field of distributed storage, and more particularly relates to a method and device for data reconstruction based on distributed storage Ceph.

Background technique

[0002] Ceph is a unified open source distributed storage system. Because its storage is unified, it provides block storage, file storage, and object storage services externally. It also offers high availability, high scalability, and high performance, and is therefore widely used in various fields.

[0003] In the distributed storage system Ceph, hard disk failure is a common problem. In order to ensure data security, when a hard disk failure occurs, the distributed storage system will reconstruct and dump the data on the failed hard disk according to the algorithm. The longer the data reconstruction process takes, the higher the probability of secondary failure or multiple failures in the distributed ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L67/1097, H04L67/1042, G06F3/06
CPC: H04L67/1097, H04L67/1044, G06F3/0614, G06F3/067
Inventor: 王筱橦, 张书东, 蓝海, 李庆林
Owner: FENGHUO COMM SCI & TECH CO LTD