Hard disk failure prediction method for cloud computing platform

A hard disk failure prediction technology for cloud computing platforms, applied to the detection of faulty computer hardware, hardware monitoring, and related areas. It can address problems of existing methods, such as the need to compromise between combined classifiers and a tendency toward over-learning (overfitting), and it achieves a high fault recall rate and good overall performance.

Inactive Publication Date: 2015-04-08
NANJING UNIV
Cites: 5; Cited by: 69

AI Technical Summary

Problems solved by technology

Cost-sensitive learning adjusts the penalty parameters to the situation. In imbalanced classification, setting a larger penalty for misclassifying the positive class can improve the classifier's performance on that class, but the effect of this type of method depends on the chosen parameters.

Compared with other classification methods, the support vector machine is less sensitive to data imbalance. In document 8 (Japkowicz N, Stephen S. The class imbalance problem: A systematic study [J]. Intelligent Data Analysis, 2002, 6(5): 429-449), Japkowicz et al. compared the impact of data imbalance on different classification methods, including the C4.5 decision tree, the BP neural network, and the support vector machine. The results showed that the support vector machine is relatively insensitive to data imbalance, so many methods based on support vector machines have been proposed for this problem.

Combination (ensemble) methods combine several classifiers to improve the classification result. Such methods must reconcile the differences and biases of the individual classifiers, and they are prone to over-learning (overfitting).
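The cost-sensitive idea described in this passage can be illustrated with a class-weighted hinge loss. This is only a minimal sketch, not the formulation used in the patent: the function name `weighted_hinge_loss`, the penalty values `c_pos` and `c_neg`, and the toy data are all assumptions chosen for illustration.

```python
def weighted_hinge_loss(weights, bias, samples, labels, c_pos=10.0, c_neg=1.0):
    """Sum of hinge losses with a class-dependent penalty.

    Labels are +1 for the positive (rare) class and -1 for the negative class.
    c_pos and c_neg are illustrative values, not taken from the source.
    """
    total = 0.0
    for x, y in zip(samples, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        hinge = max(0.0, 1.0 - y * score)
        total += (c_pos if y == 1 else c_neg) * hinge
    return total


# A toy linear model that scores every sample as negative.
weights, bias = [0.0, 0.0], -1.0
samples = [[1.0, 2.0], [0.5, 0.1]]
labels = [1, -1]  # the first sample is a misclassified positive

equal_cost = weighted_hinge_loss(weights, bias, samples, labels, c_pos=1.0, c_neg=1.0)
skewed_cost = weighted_hinge_loss(weights, bias, samples, labels, c_pos=10.0, c_neg=1.0)
print(equal_cost, skewed_cost)
```

With the larger positive-class penalty, the same misclassified positive sample costs ten times as much, which pushes training toward recalling the rare class. As the passage notes, the outcome then depends on how the penalty values are chosen.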

Examples

Embodiment Construction

[0040] The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments:

[0041] The invention discloses a hard disk failure prediction method for a cloud computing platform. First, according to the hard disk maintenance records within the prediction time window, the SMART log data of each hard disk is labeled as a normal hard disk sample or a faulty hard disk sample. The K-means clustering algorithm is then used to divide the denoised normal hard disk samples into k disjoint subsets. Each subset is combined with the faulty hard disk samples, k balanced training sets are generated with the SMOTE oversampling algorithm, and k support vector machine classifiers are trained on them for faulty hard disk prediction. In the prediction stage, the DBSCAN clustering algorithm is first used to cluster the test set, and the samples that fall in clusters are predicted as normal...
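The balancing step in the paragraph above can be sketched as follows. This is an illustrative SMOTE-style interpolation written with the Python standard library only; it is not the patent's implementation. The function names `smote_oversample` and `balance`, the `k_neighbors` default, and the two-feature toy vectors standing in for SMART attributes are assumptions.

```python
import math
import random


def smote_oversample(minority, n_new, k_neighbors=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_index = rng.randrange(len(minority))
        base = minority[base_index]
        # Rank the other minority samples by distance to the base sample.
        others = [p for i, p in enumerate(minority) if i != base_index]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k_neighbors])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(normal_subset, faulty, seed=0):
    """Pair one normal subset with the faulty samples and oversample the
    faulty class until both classes have the same size (0 = normal, 1 = faulty)."""
    shortfall = max(0, len(normal_subset) - len(faulty))
    extra = smote_oversample(faulty, shortfall, seed=seed) if shortfall else []
    features = list(normal_subset) + list(faulty) + extra
    labels = [0] * len(normal_subset) + [1] * (len(faulty) + len(extra))
    return features, labels


# Hypothetical two-feature samples; one of the k normal subsets and the faulty set.
normal_subset = [[1.0, 0.0], [1.2, 0.1], [0.9, 0.2], [1.1, 0.3], [1.0, 0.4], [0.8, 0.1]]
faulty = [[5.0, 9.0], [6.0, 8.0], [5.5, 9.5]]

features, labels = balance(normal_subset, faulty)
synthetic = features[len(normal_subset) + len(faulty):]
print(labels.count(0), labels.count(1))
```

In the described method this balancing would be repeated once per normal subset, giving k training sets and therefore k classifiers. Because every synthetic point lies on a segment between two real faulty samples, the oversampled class stays inside the region the faulty samples already occupy.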

Abstract

The invention discloses a hard disk failure prediction method for a cloud computing platform. The method comprises the following steps: labeling the SMART log data of each hard disk as a normal hard disk sample or a faulty hard disk sample according to the hard disk maintenance records within a prediction time window; dividing the denoised normal hard disk samples into k disjoint subsets with the K-means clustering algorithm; combining each of the k subsets with the faulty hard disk samples; and generating k balanced training sets with SMOTE (Synthetic Minority Oversampling Technique), from which k support vector machine classifiers are trained to predict faulty hard disks. In the prediction stage, the test set is clustered with DBSCAN (Density-Based Spatial Clustering of Applications with Noise). A sample that falls in a cluster is predicted as a normal hard disk sample, while a noise sample is predicted by each of the trained classifiers, and the final prediction is obtained by voting. The disclosed method performs hard disk failure prediction from the SMART data of the hard disk and can obtain a relatively high fault recall rate and good overall performance.
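The prediction stage described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the DBSCAN here is a small standard-library version, the `eps` and `min_pts` values are arbitrary, the three threshold rules are hypothetical stand-ins for the k trained support vector machines, and resolving a tied vote as "normal" is an assumption the source does not specify.

```python
import math
from collections import Counter

NOISE = -1


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, NOISE (-1) for outliers."""
    labels = [None] * len(points)

    def region(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)
    return labels


def predict(test_points, classifiers, eps, min_pts):
    """Clustered samples are predicted normal (0); noise samples are decided
    by a majority vote of the classifiers (1 = faulty)."""
    predictions = []
    for point, label in zip(test_points, dbscan(test_points, eps, min_pts)):
        if label != NOISE:
            predictions.append(0)
            continue
        votes = Counter(clf(point) for clf in classifiers)
        # Assumption: a tie is resolved as normal.
        predictions.append(1 if votes[1] > votes[0] else 0)
    return predictions


# Hypothetical stand-ins for the k trained classifiers.
classifiers = [
    lambda p: 1 if p[0] > 2 else 0,
    lambda p: 1 if p[1] > 2 else 0,
    lambda p: 1 if p[0] + p[1] > 6 else 0,
]
test_points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0], [0.0, 3.0]]

cluster_labels = dbscan(test_points, eps=0.5, min_pts=3)
predictions = predict(test_points, classifiers, eps=0.5, min_pts=3)
print(cluster_labels, predictions)
```

Only the two isolated points reach the classifiers. The first is voted faulty by all three rules, while the second gets a single faulty vote and is predicted normal. Filtering the dense clusters first means the classifiers vote only on the samples that look unusual.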

Description

Technical field

[0001] The invention relates to a hard disk failure prediction method for a cloud computing platform. It belongs to the field of computer data mining and specifically concerns a hard disk failure prediction algorithm.

Background

[0002] Hard disk failure prediction can help ensure data security, improve operation and maintenance efficiency, and control storage costs. It draws on many fields, including cloud computing, data mining, hard disk SMART technology, fault prediction, and the classification of extremely imbalanced data. Hard disk failure prediction mainly refers to predicting failures from hard disk SMART data. However, as reported in document 1 (PINHEIRO E, WEBER W D, BARROSO L A. Failure trends in a large disk drive population [EB/OL]. [2012-10-10]. http://research.google.com/archive/disk_failures.pdf), with statistical methods the causes of 36% of failures cannot be estimated.

[0003] At ...

Application Information

IPC(8): G06F11/22, G06F11/34
Inventor 周嵩, 王景峰, 柏文阳, 宋云华
Owner NANJING UNIV