A disk failure detection method using multi-model prediction

A multi-model fault detection technique, applied to static memory, instruments, and similar fields. It addresses the insufficient extraction of time-series characteristics from SMART indicators, low accuracy, and the inability to predict disk failures in advance. Its effects are reduced overfitting, improved efficiency, and a lower feature dimension.

Active Publication Date: 2018-12-11
南京群顶科技股份有限公司

AI Technical Summary

Problems solved by technology

However, threshold-based failure prediction is too simple: its accuracy is low, and it cannot predict disk failures in advance.
[0005] Existing fault detection methods based on machine learning algorithms do not fully extract the time-series characteristics of SMART indicators.
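
The gap described above can be pictured with a small sketch. It is an illustration only, not the feature set of the patent: the function names `window_features` and `threshold_alarm`, the chosen statistics, the limit of 50, and the sample values are all assumptions made for this example. It contrasts a static threshold check, which looks only at the latest reading, with a few time-series features computed over a window of daily readings for one SMART indicator.

```python
from statistics import mean, pstdev

def window_features(values):
    """Summarise one window of daily readings for a single SMART indicator.

    Assumes at least two readings, otherwise the slope is undefined.
    """
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    # Least-squares slope of value against day index (trend over the window).
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values)) / denom
    return {
        "last": values[-1],
        "mean": v_mean,
        "std": pstdev(values),
        "slope": slope,
        "delta": values[-1] - values[0],
    }

def threshold_alarm(values, limit):
    """Static threshold check: only the most recent reading is considered."""
    return values[-1] >= limit

# Hypothetical reallocated-sector counts over 7 days (illustrative numbers only).
window = [0, 0, 2, 4, 6, 8, 10]
alarm = threshold_alarm(window, limit=50)
features = window_features(window)
print(alarm)
print(features)
```

In this made-up window the threshold check stays silent because the last reading is still below the limit, while the positive slope and delta show that the indicator is rising steadily.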


Examples

Embodiment Construction

[0038] In order to facilitate the understanding of those skilled in the art, the present invention will be further described below in conjunction with the embodiments and accompanying drawings.

[0039] The disk failure prediction method of this embodiment extracts multiple features of disk SMART indicators through time-series data processing, and uses machine learning algorithms and related theory to build a binary classification model that predicts disk status. The flow of the disk failure prediction algorithm in this embodiment is shown in Figure 1 and includes the following key steps:

[0040] Step 1: Data Collection

[0041] The data set provided by Backblaze is preferred. It contains monitoring data for more than 30,000 disks over more than 17 consecutive months. When a disk stops working, does not respond to commands, or the RAID system reports that it cannot be read or written, it is marked as a positive sample (i.e....
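
As a rough sketch of the data collection step, the snippet below groups daily snapshot rows by disk and derives one label per disk. The column names (`date`, `serial_number`, `model`, `failure`, `smart_5_raw`, `smart_187_raw`) follow the layout of the public Backblaze daily CSV files as commonly published; the rows themselves, the serial numbers, and the helper name `load_disk_series` are made up for illustration and are not taken from the patent.

```python
import csv
import io

# Hypothetical excerpt in the Backblaze daily-snapshot layout: one row per disk
# per day, with failure = 1 on the last day a disk reported before it failed.
SAMPLE_CSV = """date,serial_number,model,failure,smart_5_raw,smart_187_raw
2016-01-01,DISK_A,MODEL_X,0,0,0
2016-01-02,DISK_A,MODEL_X,0,8,2
2016-01-03,DISK_A,MODEL_X,1,64,11
2016-01-01,DISK_B,MODEL_X,0,0,0
2016-01-02,DISK_B,MODEL_X,0,0,0
2016-01-03,DISK_B,MODEL_X,0,0,0
"""

def load_disk_series(csv_text):
    """Group daily rows by serial number and derive one label per disk."""
    series = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        disk = series.setdefault(row["serial_number"], {"rows": [], "label": 0})
        disk["rows"].append(row)
        if row["failure"] == "1":
            disk["label"] = 1  # positive sample: the disk failed
    for disk in series.values():
        # ISO-formatted dates sort chronologically as plain strings.
        disk["rows"].sort(key=lambda r: r["date"])
    return series

disks = load_disk_series(SAMPLE_CSV)
labels = {serial: d["label"] for serial, d in disks.items()}
print(labels)
```

Keeping the rows per disk in date order leaves each SMART indicator available as a time series for later processing.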

Abstract

The invention discloses a disk fault detection method using multi-model prediction, which extracts multiple features of disk SMART indicators by means of time-series data processing and establishes a classification model to predict disk state.
Step 1, data input: acquire a data set composed of monitoring data of a plurality of disks over a period of time.
Step 2, SMART screening: select SMART indicators by means of change-point (mutation point) detection.
Step 3, feature engineering: use the SMART indicators as the input of a user-defined feature extraction module to extract their features, then extract the corresponding parameter configuration and pass it to the feature extraction module as a parameter, so as to extract the feature sets of the training set and the test set.
Step 4, data set balancing: under-sample the negative samples, which make up the large majority, by means of dimensionality reduction and clustering.
Step 5, algorithm selection and modeling: on the basis of step 4, train the classification model and test whether the current disk is in the normal state or in a fault state that requires replacement.
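
The abstract does not say which change-point statistic is used for SMART screening, so the following is only a minimal sketch under stated assumptions. It scores each indicator by the absolute correlation between its series and a single step function (a value between 0 and 1) and keeps indicators whose best score passes a cutoff. The function names, the cutoff of 0.8, the minimum segment length, and the sample values are all hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

def change_point_score(values, min_seg=2):
    """Return (best_index, score) for the single split that best fits a step.

    The score is |mean(right) - mean(left)| * sqrt(p * (1 - p)) / std, with
    p = index / n. This equals the absolute correlation between the series and
    a step function at that index, so it lies between 0 and 1.
    """
    n = len(values)
    spread = pstdev(values)
    if spread == 0:
        return None, 0.0  # a constant series has no change point
    best_index, best_score = None, 0.0
    for i in range(min_seg, n - min_seg + 1):
        shift = abs(mean(values[i:]) - mean(values[:i]))
        score = shift * sqrt(i * (n - i)) / n / spread
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score

def screen_indicators(history, cutoff=0.8):
    """Keep the indicators whose series shows a pronounced level shift."""
    return [name for name, values in history.items()
            if change_point_score(values)[1] >= cutoff]

# Hypothetical daily raw values for one failed disk (illustrative numbers only).
failed_disk_history = {
    "smart_5_raw":   [0, 0, 0, 0, 0, 0, 12, 14, 15, 16],
    "smart_194_raw": [30, 31, 30, 29, 30, 31, 30, 29, 30, 31],
    "smart_12_raw":  [7, 7, 7, 7, 7, 7, 7, 7, 7, 7],
}
selected = screen_indicators(failed_disk_history)
best_index, best_score = change_point_score(failed_disk_history["smart_5_raw"])
print(selected, best_index, round(best_score, 3))
```

One limitation of this particular statistic is that a steadily increasing counter also correlates strongly with a step, so indicators that simply accumulate over time would need separate handling.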

Description

Technical field

[0001] The invention relates to the field of data mining, in particular to a disk failure prediction algorithm.

Background

[0002] In recent years, with the development of emerging technologies such as cloud storage, mass data storage technology has developed ever faster, and data centers have grown increasingly large. Among the downtime costs of a data center, the failure of network equipment is an important factor, and the disk, as the final storage place of data, is one of the most important pieces of network equipment and the one that fails most often. As the importance of data continues to increase, the impact of accidents caused by failures of disk-based storage devices keeps growing, and the cost of data recovery keeps rising.

[0003] Disk failures are generally divided into two types: predictable and unpredictable. Unpredictable faults, such as transient faults such as sudden chip failure, have a process such as motor beari...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G11C29/10
CPC: G11C29/10
Inventor: 周镶玉, 徐磊, 张永磊, 张琳琳
Owner: 南京群顶科技股份有限公司