Fault tolerance method and device for disk arrays

Applied in the storage field, this disk array fault tolerance technology addresses problems such as a disk array stopping service, failing to return to a redundant state, and losing data, and achieves the effect of restoring redundancy.

Active Publication Date: 2011-09-14
NEW H3C TECH CO LTD

AI Technical Summary

Problems solved by technology

However, if a rebuilding read error (a read error caused by rebuild I/O during the rebuilding process) occurs while the disk array is being rebuilt, the rebuilding stops. The disk array is then stuck in the degraded state and cannot return to the redundant state.
If another disk in the disk array then fails, the entire disk array fails, that is, the I/O channel is closed. This not only stops the disk array from providing services but also causes the data stored in the disk array to be lost.
[0004] In addition, when a service (business) read is performed on a disk array in a degraded state and a service read error occurs (a read error caused by service I/O on a disk during service reads and writes), the disk array fails, that is, the I/O channel is closed. The disk array stops providing services and the previously stored data is lost.

Examples

Embodiment 1

[0032] In Embodiment 1, when a disk such as disk 3 in the disk array shown in Figure 1 fails, a hot spare disk is added to the disk array to replace the failed disk 3, as shown in Figure 2.

[0033] Afterwards, the disk array with the hot spare disk added, that is, the disk array shown in Figure 2, is rebuilt stripe by stripe.

[0034] During the rebuilding of the disk array shown in Figure 2, if a rebuilding read error occurs in the stripe currently being rebuilt, the identifier of the current stripe is recorded in non-volatile memory, the current stripe is skipped, and rebuilding continues from the next stripe until the rebuilding of the disk array is completed. For details, see Figure 3. In Figure 3, the stripes marked with slashes, that is, the stripes with sequence numbers 1, 3, 5, and 6, have rebuilding read errors, that is, they are not successfully rebuilt, and are recorded in the n...
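The skip-and-record behaviour described in [0034] can be illustrated with a minimal Python sketch. It is not the patented implementation: `StripeReadError`, `rebuild_array`, `demo_rebuild_stripe`, the in-memory set standing in for non-volatile memory, and the total of 8 stripes are all assumptions made for illustration. Only the failing stripe numbers 1, 3, 5, and 6 come from the Figure 3 example.

```python
from typing import Callable, List, Set


class StripeReadError(Exception):
    """Hypothetical error raised when rebuild I/O cannot read a stripe."""


def rebuild_array(
    stripe_count: int,
    rebuild_stripe: Callable[[int], None],
    nvram_bad_stripes: Set[int],
) -> List[int]:
    """Rebuild stripes 0..stripe_count-1, skipping stripes that hit a read error.

    Returns the identifiers of the stripes that were rebuilt successfully.
    """
    rebuilt: List[int] = []
    for stripe_id in range(stripe_count):
        try:
            rebuild_stripe(stripe_id)
        except StripeReadError:
            # Record the identifier and carry on with the next stripe
            # instead of stopping the whole rebuild.
            nvram_bad_stripes.add(stripe_id)
            continue
        rebuilt.append(stripe_id)
    return rebuilt


def demo_rebuild_stripe(stripe_id: int) -> None:
    # Stand-in for real rebuild I/O: stripes 1, 3, 5 and 6 fail,
    # mirroring the Figure 3 example.
    if stripe_id in {1, 3, 5, 6}:
        raise StripeReadError(stripe_id)


nvram: Set[int] = set()  # assumed stand-in for the non-volatile memory record
done = rebuild_array(8, demo_rebuild_stripe, nvram)  # 8 stripes is an assumed size
```

In this sketch the rebuild always runs to the last stripe, so the array ends up redundant everywhere except the stripes left recorded in `nvram`.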

Embodiment 2

[0047] Embodiment 2 differs from Embodiment 1: Embodiment 1 addresses rebuilding read errors, whereas Embodiment 2 addresses service reads performed on a disk array in a degraded state.

[0048] The disk array in the degraded state in Embodiment 2 may be any disk array that has lost redundancy, for example a disk array before or during rebuilding, or a disk array whose rebuilding stopped because of a rebuilding read error; the embodiments of the present invention place no limit on this. During a service (business) read of a stripe in the degraded disk array, if a service read error occurs in the stripe currently being read, the identifier of the current stripe is recorded in non-volatile memory and an error is returned, while the disk array is controlled to continue providing service reads and writes and to remain in the degraded state. For details, see Figure 4...
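As a rough illustration of the service read handling in [0048], the sketch below models a degraded array that records a failing stripe and returns an error for that one read, rather than closing the I/O channel. The class `DegradedArray`, its attributes, and the use of `None` as the error result are hypothetical choices for this example, not part of the source.

```python
from typing import Dict, Iterable, Optional, Set


class DegradedArray:
    """Toy model of a disk array that stays in service after a read error."""

    def __init__(self, stripes: Dict[int, bytes], unreadable: Iterable[int]) -> None:
        self.stripes = stripes                   # stripe identifier -> stripe data
        self.unreadable = set(unreadable)        # stripes whose reads fail (assumed)
        self.state = "degraded"
        self.nvram_bad_stripes: Set[int] = set() # stand-in for non-volatile memory

    def _read_stripe(self, stripe_id: int) -> bytes:
        if stripe_id in self.unreadable:
            raise OSError(f"read error on stripe {stripe_id}")
        return self.stripes[stripe_id]

    def service_read(self, stripe_id: int) -> Optional[bytes]:
        """Return the stripe data, or None to signal an error for this read only."""
        try:
            return self._read_stripe(stripe_id)
        except OSError:
            # Record the stripe and report the error; the array state is
            # left untouched, so later reads and writes are still served.
            self.nvram_bad_stripes.add(stripe_id)
            return None


array = DegradedArray({0: b"a", 1: b"b", 2: b"c"}, unreadable=[1])
first = array.service_read(1)   # fails, stripe 1 is recorded
second = array.service_read(2)  # still served afterwards
```

The point of the design is that one unreadable stripe costs one failed request, not the whole array.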

Device corresponding to Embodiment 1

[0058] Referring to Figure 5, Figure 5 is a structural diagram of the device provided by the embodiment of the present invention. The device corresponds to Embodiment 1 and includes a replacement unit and a rebuilding unit;

[0059] Wherein, the replacement unit is used for adding a hot spare disk in the disk array to replace the failed disk when a disk in the disk array fails;

[0060] The rebuilding unit is used to rebuild the disk array with the hot spare disk added in units of stripes;

[0061] As shown in Figure 5, the device further includes:

[0062] a recording unit, configured to record the identifier of the current stripe in non-volatile memory when a rebuilding read error occurs in the stripe currently being rebuilt by the rebuilding unit, and to trigger the rebuilding unit to skip the current stripe and continue rebuilding from the next stripe until the rebuilding of the disk array is complete;

[0063] The repair unit is used for repairing the reconstruction r...

Abstract

The invention provides a fault tolerance method and device for disk arrays. The method comprises the following steps: when a disk in a disk array fails, adding a hot spare disk to the disk array to replace the failed disk, and rebuilding the disk array with the hot spare disk added stripe by stripe; when a rebuilding read error occurs in the stripe currently being rebuilt, recording the identifier of the current stripe in a non-volatile random access memory (NVRAM), skipping the current stripe, and continuing the rebuilding from the next stripe until the rebuilding of the disk array is completed; and, for each stripe identifier recorded in the NVRAM, repairing the rebuilding read error of the corresponding stripe by writing, and deleting the stripe identifier from the NVRAM after the repair is completed. With the method and device provided by the invention, the problems caused by rebuilding read errors or service read errors in a disk array in a degraded state can be avoided.
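The final step of the abstract, repairing recorded stripes by writing and then deleting their identifiers, might look roughly like the sketch below. This excerpt does not say where the written data comes from, so the sketch assumes it is supplied by later writes, passed in as `new_data`; `repair_by_write` and `write_stripe` are hypothetical names.

```python
from typing import Callable, Dict, List, Set


def repair_by_write(
    nvram_bad_stripes: Set[int],
    write_stripe: Callable[[int, bytes], None],
    new_data: Dict[int, bytes],
) -> List[int]:
    """Rewrite each recorded stripe that has data available, then drop its record.

    Returns the identifiers of the stripes repaired in this pass.
    """
    repaired: List[int] = []
    # Iterate over a sorted copy so the set can be modified inside the loop.
    for stripe_id in sorted(nvram_bad_stripes):
        if stripe_id not in new_data:
            continue  # nothing to write yet; the record stays in place
        write_stripe(stripe_id, new_data[stripe_id])
        nvram_bad_stripes.discard(stripe_id)  # delete the identifier after repair
        repaired.append(stripe_id)
    return repaired


nvram = {1, 3, 5, 6}            # identifiers left over from an earlier rebuild
disk: Dict[int, bytes] = {}     # stand-in for the array's stripes
repaired = repair_by_write(nvram, disk.__setitem__, {1: b"x", 5: b"y"})
```

Stripes without data to write remain recorded, so a later pass can pick them up.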

Description

Technical field

[0001] The invention relates to the field of storage, in particular to a fault tolerance method and device for a disk array.

Background

[0002] A Redundant Array of Independent Disks (RAID), referred to as a disk array, combines multiple independent disks into an array to provide good redundancy and higher storage performance than a single disk. In the field of storage, data is stored directly or indirectly on multiple individual disks through the redundancy of the disk array itself, so that data is not lost when one or more disks fail, that is, data fault tolerance is realized.

[0003] When the redundancy of the disk array is lost for some reason, such as the failure of a disk in the disk array, the disk array is in a degraded state. Taking a disk failure in a disk array to cause the disk array to lose its redundancy and make the disk array in a degraded state as an example, in the prior art, i...
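To make the notion of redundancy in the background concrete, the following Python sketch rebuilds one lost block from the surviving blocks of a stripe. It assumes single XOR parity of the kind used in RAID 5; the source does not name a RAID level, and `xor_blocks` and the sample bytes are illustrative only.

```python
from functools import reduce
from typing import List, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block (assumed layout).
data: List[bytes] = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_blocks(data)

# The disk holding block 1 fails; rebuild it from the survivors and the parity.
lost = 1
survivors = [block for i, block in enumerate(data) if i != lost] + [parity]
recovered = xor_blocks(survivors)
```

Under this assumption, rebuilding a stripe needs every surviving block to be readable, which is why a single read error during a rebuild leaves that stripe unrecoverable.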

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/16, G06F3/06
Inventor: 郑辉, 曹庭华
Owner: NEW H3C TECH CO LTD