Slow disk detection method and system

A slow disk detection method and system in the field of hard disk technology. It addresses the high misjudgment rate of existing approaches, improving detection accuracy and reducing both the missed-judgment rate and the misjudgment rate.

Pending Publication Date: 2020-04-21
SANGFOR TECH INC

AI-Extracted Technical Summary

Problems solved by technology

[0004] However, in the prior art, a normal hard disk is often ju...

Method used

In this embodiment, the generation process of the trusted IO information table for the preset partitions of the hard disk is described in detail. The trusted IO information table of each hard disk can also be updated periodically or in real time, which improves the accuracy of the table and, in turn, the accuracy of slow disk judgment.
[0121] Note that as a hard disk ages it may wear to some degree, so each hard disk can also update its own trusted IO information table periodically, for example every 2 days, once a week or once a month, or in real time, so that the table stays accurate. Each update cycle has a corresponding temporary trusted IO information table.
[0154] In this embodiment of the application, the IO performance index of the hard disk is collected by the collection unit 501, where the index includes at least the random IO response time of the hard disk and the IO response time of each preset partition. In other words, both the random IO response time and the IO response times of the preset partitions serve as slow-disk metrics. The first judging unit 502 judges whether the IO performance index is abnormal and, when an abnormality occurs, record the duration of the abnormality and/or the duratio...

Abstract

The embodiments of the invention disclose a slow disk detection method and a slow disk detection system for improving the accuracy of slow disk detection and reducing its missed-judgment rate and misjudgment rate. The method comprises the following steps: collecting the IO performance index of a hard disk, wherein the IO performance index comprises at least the random input/output (IO) response time of the hard disk and the IO response time of each preset partition of the hard disk; judging whether the IO performance index of the hard disk is abnormal; if so, recording the duration of the abnormality and/or the number of times the abnormality persists within a preset time period; judging whether the duration and/or the count is greater than the corresponding preset threshold; and if so, determining that the hard disk is a slow disk.

Application Domain

Functional testing; Faulty hardware testing methods

Technology Topic

Engineering; Input/output

Examples

Example Embodiment

[0073] The embodiments of the present invention provide a slow disk detection method and system that detect slow-disk behavior in a hard disk by combining random detection with partition detection, and that determine the hard disk to be a slow disk only when the slow-disk behavior persists beyond a preset threshold. This improves the accuracy of slow disk detection and reduces both its missed-judgment rate and its misjudgment rate.
[0074] To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort shall fall within the protection scope of the present invention.
[0075] The terms "first", "second", "third", "fourth" and the like in the description, the claims and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It is to be understood that terms so used are interchangeable under appropriate circumstances, so that the embodiments described herein can be practiced in sequences other than those illustrated or described. Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion: a process, method, system, product or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
[0076] For ease of understanding, the slow disk detection method of this application is described below. Referring to Figure 1, Figure 1 is a schematic diagram of an embodiment of the slow disk detection method of this application:
[0077] 101. Collect the IO performance index of the hard disk, where the IO performance index includes at least the random input/output (IO) response time of the hard disk and the IO response time of each preset partition of the hard disk;
[0078] As a data storage device, a hard disk generally leaves the factory with specified performance figures, including maximum IO throughput (IOPS), average latency, and maximum bandwidth (MB/s). With use, however, the disk can become progressively slower: the average latency may increase significantly, the maximum IOPS may fall, or the maximum bandwidth may drop. In each case the performance of upper-layer services degrades, latency increases, and service capability deteriorates. All of these are manifestations of a slow disk, as is an extremely long stall on a single access.
[0079] The IO performance index of a hard disk is an important measure of its data read and write speed. It includes the IO response time (IO latency), the maximum IO throughput, the maximum bandwidth, and so on. The IO response time is the sum of the time the hard disk spends processing a single IO internally and the time that IO spends waiting in the IO queue. The maximum IO throughput is the amount of data that flows through the hard disk system bus in actual use; its value is the product of the number of IO operations per second (IOPS) performed by the IO system and the data size of a single IO operation. The maximum bandwidth of the hard disk is generally a fixed value, determined by the type of interface between the hard disk and the southbridge.
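As a small worked example of the throughput relationship described in [0079], the sketch below multiplies IOPS by the size of a single IO. The function name, the KiB and MiB units, and the sample figures are illustrative assumptions, not values from this application.

```python
def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Throughput as the product of IO operations per second and the size of one IO."""
    return iops * io_size_kib / 1024.0  # convert KiB/s to MiB/s


# Hypothetical figures: 200 IOPS at 64 KiB per IO.
print(throughput_mib_per_s(200, 64))  # 12.5
```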
[0080] A slow disk shows up mainly as a long response time for a single IO, so the single-IO response time is generally used as the indicator of whether a hard disk is slow. This application differs from the prior art in that this embodiment uses not only the response time of a single random IO as a slow-disk metric, but also divides the hard disk into multiple preset partitions and uses the IO response time of each preset partition as a slow-disk metric. The partition size can be 32K, 64K, 128K or another size; it is not specifically limited here.
[0081] 102. Determine whether the IO performance index of the hard disk is abnormal; if yes, execute step 103; if not, execute step 106;
[0082] After the random IO response time and the IO response times of the multiple preset partitions have been obtained in step 101, determine whether the random IO response time of the whole hard disk and/or the IO response times of the preset partitions are abnormal. If so, execute step 103; if not, execute step 106.
[0083] 103. Record the duration of the abnormality and/or the number of times the abnormality persists within a preset time period;
[0084] To avoid misjudging the hard disk as a slow disk because of an incidental abnormality, when the IO performance index is found to be abnormal in step 102, record how long the abnormality lasts, or record the number of times the abnormality persists within a preset time period, and then execute step 104 based on the recorded duration and/or count.
[0085] 104. Determine whether the duration and/or the count is greater than the corresponding preset threshold; if yes, execute step 105; if not, execute step 106;
[0086] To avoid misjudging the hard disk as a slow disk, after the duration of the abnormality and/or the number of times the abnormality persists within the preset time period has been recorded in step 103, each recorded value is compared with its own preset threshold. If the threshold is exceeded, execute step 105; if not, execute step 106.
[0087] For example, suppose the threshold for the abnormal duration is set to 15 minutes and the interval for collecting and judging the hard disk IO performance index is set to 10 s. The index is then collected and judged 15 × 60 / 10 = 90 times within 15 minutes, so the threshold for the number of times the abnormality persists within the preset time period (for example, 15 minutes) can be set to 45, 60, 80 or another value; it is not specifically limited here.
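The arithmetic in [0087] can be checked with a short sketch. The function names are hypothetical, and the count threshold of 60 is just one of the example values the paragraph mentions.

```python
def checks_per_window(window_s: int, interval_s: int) -> int:
    """Number of collect-and-judge cycles that fit into one preset time period."""
    return window_s // interval_s


def count_exceeds_threshold(abnormal_flags, count_threshold: int) -> bool:
    """True when the number of abnormal checks in the window is above the threshold."""
    return sum(1 for flag in abnormal_flags if flag) > count_threshold


total_checks = checks_per_window(15 * 60, 10)  # 90 checks in 15 minutes
# Hypothetical window in which 61 of the 90 checks were abnormal.
flags = [True] * 61 + [False] * (total_checks - 61)
print(total_checks, count_exceeds_threshold(flags, 60))  # 90 True
```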
[0088] Note that in this embodiment the preset time period used for counting abnormal occurrences and the abnormal duration threshold can be the same, that is, both 15 minutes, or they can differ: for example, the abnormal duration threshold may be 15 minutes while the preset time period used for counting may be 30 minutes or 1 hour. They can be set according to the specific application scenario and are not specifically limited here.
[0089] 105. Determine that the hard disk is a slow disk;
[0090] If step 104 finds that the duration of the abnormality and/or the number of times the abnormality persists within the preset time period is greater than the corresponding threshold, the hard disk is determined to be a slow disk; otherwise, step 106 is executed.
[0091] 106. Determine that the hard disk is not a slow disk.
[0092] If the IO performance index of the hard disk is abnormal but the duration of the abnormality and/or the number of times it persists is not greater than the corresponding threshold, the abnormality is judged to be an incidental one, possibly caused by an operation, rather than a substantive fault of the hard disk, and the hard disk is determined not to be a slow disk.
[0093] In this embodiment of the application, the IO performance index of the hard disk is collected, where the index includes at least the random IO response time of the hard disk and the IO response time of each preset partition. In other words, both the random IO response time and the IO response times of the preset partitions serve as slow-disk metrics. The method judges whether the IO performance index is abnormal, records the duration of the abnormality and/or the number of times it persists within the preset time period, and determines that the hard disk is a slow disk only when the recorded duration and/or count is greater than the preset threshold. This improves the accuracy of slow disk detection and reduces its missed-judgment rate and misjudgment rate.
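Steps 103 to 106 can be sketched as follows, assuming the abnormality verdict of step 102 is already available for each check. This is a minimal illustration, not the application's implementation: the `SlowDiskPolicy` class, its default values, and the choice to combine the two criteria with "or" are assumptions (the text allows "and/or").

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class SlowDiskPolicy:
    duration_threshold_s: float = 15 * 60  # example value from the text
    count_window_s: float = 15 * 60        # preset time period for counting
    count_threshold: int = 60              # one of the example values


def is_slow_disk(samples: Iterable[Tuple[float, bool]], policy: SlowDiskPolicy) -> bool:
    """samples: (timestamp_s, abnormal) pairs sorted by timestamp."""
    samples = list(samples)
    if not samples:
        return False
    # Step 103: longest uninterrupted abnormal stretch.
    longest, run_start = 0.0, None
    for ts, abnormal in samples:
        if abnormal:
            run_start = ts if run_start is None else run_start
            longest = max(longest, ts - run_start)
        else:
            run_start = None
    # Step 103: abnormal checks inside the most recent counting window.
    latest = samples[-1][0]
    count = sum(1 for ts, abnormal in samples
                if abnormal and latest - ts < policy.count_window_s)
    # Steps 104 to 106: slow only if a threshold is exceeded.
    return longest > policy.duration_threshold_s or count > policy.count_threshold


# One check every 10 s; a single incidental abnormality is not a slow disk.
incidental = [(t * 10.0, t == 5) for t in range(100)]
persistent = [(t * 10.0, True) for t in range(100)]
print(is_slow_disk(incidental, SlowDiskPolicy()), is_slow_disk(persistent, SlowDiskPolicy()))
```

Requiring persistence before flagging the disk is what keeps a one-off latency spike from being reported as a slow disk.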
[0094] Based on the embodiment of Figure 1, step 102 of that embodiment is described in detail below. Referring to Figure 2, Figure 2 shows the detailed sub-steps of step 102 in Figure 1:
[0095] 1021. Collect the random IO response time of the hard disk;
[0096] For a hard disk, the random IO response time includes the response time of random write IO operations or of random read IO operations. A random IO operation is one whose sector address differs greatly from the sector address of the previous IO, so the magnetic head must move a relatively long distance between the two operations before it can start reading or writing again. If instead the sector address of the current IO is the same as, or close to, the sector address at which the previous IO ended, the head can start the current IO quickly; a series of such operations is called sequential (continuous) IO. For this reason, the random IO response time is generally used as the measure of hard disk IO performance.
[0097] Therefore, in this embodiment, the random IO response time of the hard disk is collected, and step 1022 is executed once it has been collected.
[0098] 1022. Determine whether the random IO response time is greater than a first time threshold; if yes, execute step 1026; if not, execute step 1023;
[0099] The random IO response time collected from the hard disk is compared with the first time threshold (generally set to 2000 ms). If the random IO response time is greater than the first time threshold, step 1026 is executed directly. If it is not, the hard disk cannot yet be ruled out as a slow disk, and step 1023 is executed to further determine whether the IO response time of each preset partition exceeds the corresponding trusted IO response time.
[0100] 1023. Read the trusted IO information table of the preset partitions of the hard disk, where the trusted IO information table includes at least the trusted IO response time of each preset partition of the hard disk;
[0101] When the random IO response time of the hard disk is not greater than the first time threshold, the trusted IO information table of the preset partitions of the hard disk is read. The table includes at least the trusted IO response time of each preset partition; it may also include other items, such as the maximum data throughput of each preset partition, which are not specifically limited here.
[0102] 1024. Collect the IO response time of the preset partitions of the hard disk;
[0103] After the trusted IO response time of each preset partition has been obtained, the IO response time of a preset partition (for example, any one of the multiple partitions of the hard disk) can be collected and compared with the trusted IO response time of the same partition to determine whether the IO response time of that preset partition is abnormal.
[0104] 1025. Determine whether the time difference between the IO response time of the preset partition and the trusted IO response time of the corresponding partition is greater than a second time threshold; if yes, execute step 1026; if not, execute step 1027;
[0105] After the IO response time and the trusted IO response time of each preset partition have been obtained, it can be determined whether the time difference between the IO response time of a preset partition (for example, any one of the multiple partitions of the hard disk) and the trusted IO response time of the corresponding partition is greater than the second time threshold. If yes, execute step 1026; if not, execute step 1027.
[0106] 1026. Determine that the IO performance index is abnormal;
[0107] If the random IO response time of the hard disk is greater than the first time threshold, or the time difference between the IO response time of any preset partition and the trusted IO response time of the corresponding partition is greater than the second time threshold, the IO performance index of the hard disk is determined to be abnormal.
[0108] 1027. Determine that the IO performance index is normal.
[0109] If the random IO response time of the hard disk is not greater than the first time threshold, and the time difference between the IO response time of each preset partition and the trusted IO response time of the corresponding partition is not greater than the second time threshold, the IO performance index of the hard disk is determined to be normal.
[0110] This embodiment describes in detail how an abnormality in the hard disk IO performance index is judged. The judgment covers not only the random IO response time of the whole hard disk but also the IO response time of each preset partition, which improves the accuracy of slow disk detection and reduces the missed-judgment rate and the misjudgment rate.
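The decision in steps 1022 to 1027 can be sketched as a single function. The 2000 ms first threshold comes from [0099]; the 50 ms second threshold, the dictionary layout, and the reading of "time difference" as observed time minus trusted time are assumptions made for illustration.

```python
from typing import Dict

FIRST_TIME_THRESHOLD_MS = 2000.0  # value given in the text
SECOND_TIME_THRESHOLD_MS = 50.0   # hypothetical value, not from the text


def io_index_abnormal(random_io_ms: float,
                      partition_io_ms: Dict[str, float],
                      trusted_io_ms: Dict[str, float],
                      first_threshold_ms: float = FIRST_TIME_THRESHOLD_MS,
                      second_threshold_ms: float = SECOND_TIME_THRESHOLD_MS) -> bool:
    """Return True when the IO performance index is abnormal (step 1026)."""
    # Step 1022: the whole-disk random IO check comes first.
    if random_io_ms > first_threshold_ms:
        return True
    # Steps 1023 to 1025: compare each preset partition with its trusted value.
    for partition, observed_ms in partition_io_ms.items():
        if observed_ms - trusted_io_ms[partition] > second_threshold_ms:
            return True
    # Step 1027: neither check fired.
    return False


trusted = {"A": 8.0, "B": 9.0, "C": 8.5, "D": 10.0}
observed = {"A": 8.4, "B": 75.0, "C": 9.0, "D": 10.2}
print(io_index_abnormal(120.0, observed, trusted))  # True: partition B is far slower than trusted
```

The partition check is what catches a disk whose overall random IO time still looks acceptable while one region has degraded.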
[0111] Based on the embodiment of Figure 2, the following steps are also performed before step 1023. Referring to Figure 3, Figure 3 shows an embodiment of the generation process of the trusted IO information table for the preset partitions of the hard disk:
[0112] 1028. Determine whether a trusted IO information table exists for the preset partitions of the hard disk; if so, execute step 1023 of the embodiment of Figure 2; if not, execute step 1029;
[0113] Before step 1023 of the embodiment of Figure 2, that is, before the trusted IO information table of the preset partitions is read, step 1028 must be executed to determine whether such a table exists. If it does, step 1023 of the embodiment of Figure 2 is executed and the table is read; if not, step 1029 is executed.
[0114] 1029. Count the number of times the IO performance index of each preset partition of the hard disk has been collected, and record the IO response time from each collection;
[0115] When no trusted IO information table exists for the preset partitions of the hard disk, it is necessary to count the number of times the IO performance index of each preset partition has been collected and to record the IO response time obtained for each preset partition on each collection.
[0116] Specifically, assume a 1G hard disk with four preset partitions (A, B, C and D), each 128K in size. While no trusted IO information table exists for the preset partitions, the number of IO performance index collections is counted for each partition and the IO response time from each collection is recorded. The IO response time collected each time for each preset partition is shown in Table 1.
[0117]
[0118] 1030. When the number of collections is greater than a first threshold, determine the trusted IO response time of each preset partition from the multiple IO response times corresponding to those collections according to a first preset algorithm, so as to generate the trusted IO information table.
[0119] When the number of collections for each partition is greater than the first threshold (for example, 500), the trusted IO response time of each preset partition is determined, according to the first preset algorithm, from the multiple IO response times recorded for that partition, and the trusted IO information table is generated.
[0120] Specifically, the trusted IO response time of each preset partition can be derived from the multiple IO response times recorded for that partition using an averaging algorithm or a weighted-average algorithm, thereby generating the trusted IO information table for the preset partitions of the hard disk. It is easy to see that each hard disk has its own trusted IO information table for its preset partitions.
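Steps 1029 and 1030 can be sketched as below. The first threshold of 500 is the example value from [0119]; the function names, the dictionary layout, and the linear recency weighting used for the weighted-average variant are assumptions, since the text does not specify how weights are chosen.

```python
from statistics import fmean
from typing import Callable, Dict, Optional, Sequence


def recency_weighted_mean(samples: Sequence[float]) -> float:
    """Weighted average in which later samples get linearly larger weights (assumed scheme)."""
    weights = range(1, len(samples) + 1)
    return sum(w * s for w, s in zip(weights, samples)) / sum(weights)


def build_trusted_table(samples_by_partition: Dict[str, Sequence[float]],
                        first_threshold: int = 500,
                        algorithm: Callable[[Sequence[float]], float] = fmean
                        ) -> Optional[Dict[str, float]]:
    """Return the trusted IO response time per partition, or None if any
    partition has not yet been collected more than first_threshold times."""
    if any(len(s) <= first_threshold for s in samples_by_partition.values()):
        return None
    return {p: algorithm(s) for p, s in samples_by_partition.items()}


# Hypothetical response times in milliseconds for two partitions.
history = {"A": [10.0] * 501, "B": [20.0] * 501}
print(build_trusted_table(history))  # {'A': 10.0, 'B': 20.0}
```

Passing `algorithm=recency_weighted_mean` switches from the plain average to the weighted variant without changing the rest of the flow.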
[0121] Note that as a hard disk ages it may wear to some degree, so each hard disk can also update its own trusted IO information table periodically, for example every 2 days, once a week or once a month, or in real time, so that the table stays accurate. Each update cycle has a corresponding temporary trusted IO information table.
[0122] This embodiment describes in detail how the trusted IO information table for the preset partitions of the hard disk is generated. The trusted IO information table of each hard disk can also be updated periodically or in real time, which improves the accuracy of the table and, in turn, the accuracy of slow disk judgment.
[0123] Figure 3 describes the generation of the trusted IO information table in a storage system with a single hard disk. For a storage array with multiple hard disks, this embodiment further describes the generation of a same-type trusted IO information table shared by multiple hard disks. Referring to Figure 4, Figure 4 shows an embodiment of the generation process of the same-type trusted IO information table in a storage array:
[0124] 401. When there are multiple hard disks of the same type, determine whether a same-type trusted IO information table exists. The same-type trusted IO information table stores the trusted IO performance indexes of the corresponding partitions of hard disks of the same type, and the trusted IO performance indexes include at least the trusted IO response time. If the table does not exist, execute step 402; if it does, execute step 404;
[0125] It is easy to see that as storage capacity grows, a storage system may include multiple hard disks of the same type or of different types. For example, a storage system may include one or more magnetic disks of the same type and may also include one or more SSDs of the same type.
[0126] When the storage system contains multiple hard disks of the same type, in order to make the trusted IO information for the preset partitions more accurate, it can be determined whether a same-type trusted IO information table exists. The same-type trusted IO information table stores the trusted IO performance indexes of the corresponding partitions of hard disks of the same type, and these indexes include at least the trusted IO response time.
[0127] Specifically, suppose the storage system contains two hard disks, the first with partitions A, B, C and D and the second with partitions A1, B1, C1 and D1, where A and A1, B and B1, C and C1, and D and D1 occupy corresponding physical addresses on the two disks and store the same or similar data. The same-type trusted IO information table is then the table of trusted values for the corresponding partitions, determined from the trusted IO information table of the preset partitions of the first hard disk and that of the second hard disk.
[0128] 402. Read the trusted IO information table of the preset partitions of each hard disk, and count the number of times the trusted IO information table of each hard disk has been written;
[0129] When no same-type trusted IO information table exists in the storage system, read the trusted IO information table of the preset partitions of each hard disk and count the number of times each table has been written. Step 403 is executed when the write count of the trusted IO information table of each hard disk exceeds a second threshold.
[0130] 403. When the write count of the trusted IO information table of the preset partitions of each hard disk is greater than the second threshold, determine the same-type trusted IO information table from the partition trusted IO information tables of the multiple hard disks according to a second preset algorithm;
[0131] As noted for the embodiment of Figure 3, a hard disk wears as it ages, so each hard disk periodically or in real time updates the trusted IO information table of its preset partitions. Step 402 therefore not only reads the trusted IO information table of the preset partitions of each hard disk but also counts the number of times the table has been written (that is, updated). When the write count of the trusted IO information table of each hard disk exceeds the second threshold, the same-type trusted IO information table is determined, according to a preset algorithm, from the trusted IO information tables of the corresponding partitions of the multiple hard disks. It is easy to see that the same-type trusted IO information table serves mainly as a reference against which hard disks of the same type are compared.
[0132] Specifically, the preset algorithm for determining the same-type trusted IO information table from the partition trusted IO information tables of the multiple hard disks can be an averaging algorithm or a weighted-average algorithm. For example, different hard disks can be assigned different weighting coefficients when computing the trusted IO response time in the same-type trusted IO information table. The specific preset algorithm is not limited here.
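A weighted average across disks, as described in [0132], might look like the following sketch. The function name, the per-disk weights, and the convention that corresponding partitions share one key (A and A1 both stored under "A") are assumptions made for illustration.

```python
from typing import Dict, Optional


def build_same_type_table(per_disk_tables: Dict[str, Dict[str, float]],
                          weights: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Combine per-disk trusted IO response times into one same-type table.

    per_disk_tables maps a disk name to {partition key: trusted response time}.
    With no weights given, every disk counts equally (plain average).
    """
    disks = list(per_disk_tables)
    if weights is None:
        weights = {disk: 1.0 for disk in disks}
    total_weight = sum(weights[disk] for disk in disks)
    partitions = per_disk_tables[disks[0]].keys()
    return {
        p: sum(weights[d] * per_disk_tables[d][p] for d in disks) / total_weight
        for p in partitions
    }


# Hypothetical trusted response times in milliseconds for two disks of the same type.
tables = {"disk1": {"A": 10.0, "B": 20.0}, "disk2": {"A": 14.0, "B": 22.0}}
print(build_same_type_table(tables))                               # {'A': 12.0, 'B': 21.0}
print(build_same_type_table(tables, {"disk1": 3.0, "disk2": 1.0}))  # {'A': 11.0, 'B': 20.5}
```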
[0133] Table 2 illustrates the trusted IO information tables of the preset partitions of multiple hard disks:
[0134] Table 2
[0135]
[0136] As in the embodiment of Figure 3, a hard disk may wear to some degree as it ages, so each hard disk can update its trusted IO information table periodically, for example every 2 days, once a week or once a month, or in real time, so that its table stays accurate. Accordingly, when there are multiple hard disks of the same type and the trusted IO information table of any one of them is updated, the same-type trusted IO information table of those hard disks also needs to be updated periodically or in real time.
[0137] 404. Read the same-type trusted IO information table;
[0138] When a same-type trusted IO information table exists in the storage system, it is read directly, and step 405 is executed based on it.
[0139] 405. Determine whether the time difference between the first trusted IO response time, taken from the trusted IO information table of the preset partitions of each hard disk, and the second trusted IO response time, taken from the same-type trusted IO information table, is greater than a third time threshold; if so, execute step 406; otherwise, execute step 407;
[0140] After the same-type trusted IO information table has been obtained, it can be determined whether the time difference between the first trusted IO response time in the trusted IO information table of the preset partitions of each hard disk and the second trusted IO response time in the same-type trusted IO information table is greater than the third time threshold. If yes, execute step 406; if not, execute step 407.
[0141] 406. Determine that the IO performance index of the hard disk is abnormal;
[0142] When the time difference between the first trusted IO response time in the trusted IO information table of the preset partitions of a hard disk and the second trusted IO response time in the same-type trusted IO information table is greater than the third time threshold, the IO performance index of that hard disk is abnormal.
[0143] Note that, unlike the prior art, this embodiment not only uses the trusted IO information table of the preset partitions of each hard disk to judge that disk's IO performance index, but also, when multiple hard disks of the same type are present, uses the same-type trusted IO information table to judge the IO performance index of each hard disk. This further improves the accuracy of the IO performance index judgment and reduces the misjudgment rate.
[0144] 407. Determine that the IO performance index of the hard disk is normal.
[0145] If the time difference between the first trusted IO response time in the trusted IO information table of the preset partitions of each hard disk and the second trusted IO response time in the same-type trusted IO information table is not greater than the third time threshold, the IO performance index of the hard disk is determined to be normal.
[0146] In this embodiment, when the storage system contains multiple hard disks of the same type, the IO performance index of each hard disk is judged not only against the trusted IO information table of its own preset partitions but also against the same-type trusted IO information table. This further improves the accuracy of the IO performance index judgment and reduces the misjudgment rate.
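Steps 405 to 407 can be sketched as a comparison between one disk's own trusted table and the same-type table. The function name, the 5 ms third threshold, and the reading of "time difference" as the disk's value minus the same-type value are assumptions, not details given in the text.

```python
from typing import Dict

THIRD_TIME_THRESHOLD_MS = 5.0  # hypothetical value, not from the text


def abnormal_against_same_type(disk_table: Dict[str, float],
                               same_type_table: Dict[str, float],
                               third_threshold_ms: float = THIRD_TIME_THRESHOLD_MS) -> bool:
    """True when any partition's trusted response time on this disk exceeds the
    same-type trusted response time by more than the third threshold (step 406)."""
    return any(
        disk_table[p] - same_type_table[p] > third_threshold_ms
        for p in same_type_table
    )


same_type = {"A": 12.0, "B": 21.0}
print(abnormal_against_same_type({"A": 12.5, "B": 21.3}, same_type))  # False: step 407
print(abnormal_against_same_type({"A": 12.5, "B": 30.0}, same_type))  # True: step 406
```

Comparing against peers of the same type catches a disk whose own trusted baseline has drifted slowly enough that a self-comparison alone would not flag it.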
[0147] The slow disk detection method in the embodiment of the present application has been described in detail above, and the slow disk detection system in the embodiment of the present application will be described below, please refer to Figure 5 , an embodiment of the slow disk detection system in the embodiment of the present application, including:
[0148] The collection unit 501 is configured to collect the IO performance index of the hard disk, the IO performance index including at least the random input/output (IO) response time of the hard disk and the IO response time of each preset partition of the hard disk;
[0149] The first judging unit 502 is configured to judge whether the IO performance index of the hard disk is abnormal;
[0150] A recording unit 503, configured to record, when the IO performance index is abnormal, the duration of the abnormality and/or the number of times the abnormality occurs within a preset time;
[0151] The second judging unit 504 is configured to judge whether the duration and/or the number of occurrences is greater than its corresponding preset threshold;
[0152] The first determining unit 505 is configured to determine that the hard disk is a slow disk when the duration and/or the number of occurrences is greater than its corresponding preset threshold.
[0153] It should be noted that the functions of the units in this embodiment are similar to those described in the embodiment of Figure 1 and are not repeated here.
[0154] In the embodiment of the present application, the collection unit 501 collects the IO performance index of the hard disk, which includes at least the random IO response time of the hard disk and the IO response time of each preset partition. That is, this embodiment uses both the random IO response time of the hard disk and the IO response time of its preset partitions as measurement indexes for a slow disk. The first judging unit 502 judges whether the IO performance index is abnormal. When an abnormality occurs, the duration of the abnormality and/or the number of abnormalities within the preset time period is recorded, and when the duration and/or the number is greater than the corresponding preset threshold, the hard disk is determined to be a slow disk. This improves the accuracy of slow disk detection and reduces its missed-judgment and misjudgment rates.
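The duration-and-count decision made by units 503 to 505 can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the class name `SlowDiskJudge`, its parameter names, the sliding-window bookkeeping, and the choice to treat "and/or" as a logical OR are all assumptions made for the example.

```python
from collections import deque


class SlowDiskJudge:
    """Tracks abnormal samples for one disk and applies the two thresholds.

    All names and the sliding-window bookkeeping are hypothetical.
    """

    def __init__(self, max_duration_s, max_count, window_s):
        self.max_duration_s = max_duration_s  # duration threshold (assumed unit: seconds)
        self.max_count = max_count            # threshold on abnormal samples per window
        self.window_s = window_s              # the "preset time" window
        self.abnormal_since = None            # start time of the current abnormal run
        self.events = deque()                 # timestamps of abnormal samples in the window

    def observe(self, now, abnormal):
        """Record one sample; return True if the disk is judged to be slow."""
        if abnormal:
            if self.abnormal_since is None:
                self.abnormal_since = now
            self.events.append(now)
        else:
            self.abnormal_since = None
        # Forget abnormal samples that fell out of the preset time window.
        while self.events and now - self.events[0] > self.window_s:
            self.events.popleft()
        duration = 0 if self.abnormal_since is None else now - self.abnormal_since
        # Assumption: either threshold being exceeded is sufficient.
        return duration > self.max_duration_s or len(self.events) > self.max_count
```

Requiring a sustained or repeated abnormality, rather than a single slow sample, is what keeps a brief latency spike on a healthy disk from being reported as a slow disk.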
[0155] Based on the slow disk detection system of Figure 5, Figure 6 is a detailed diagram of the functional modules of the first judging unit in Figure 5, wherein the first judging unit 502 specifically includes:
[0156] An acquisition module 5021, configured to acquire the random IO response time of the hard disk;
[0157] A first judging module 5022, configured to judge whether the random IO response time is greater than a first time threshold;
[0158] The reading module 5023 is configured to read the trusted IO information table of the hard disk's preset partitions when the random IO response time is not greater than the first time threshold, the trusted IO information table including at least the trusted IO response time corresponding to each preset partition of the hard disk;
[0159] A collection module 5024, configured to collect the IO response time of the hard disk preset partition;
[0160] The second judging module 5025 is configured to judge whether the time difference between the IO response time of the preset partition and the trusted IO response time of the corresponding partition is greater than a second time threshold;
[0161] The determination module 5026 is configured to determine that the IO performance index of the hard disk is abnormal when the time difference between the IO response time of the preset partition and the trusted IO response time of the corresponding partition is greater than the second time threshold.
[0162] It should be noted that the functions of the modules in this embodiment are similar to those described in the embodiment of Figure 2 and are not repeated here.
[0163] In this embodiment, the functional modules of the first judging unit are described in detail. The first judging unit not only judges the random IO response time of the entire hard disk through the first judging module 5022, but also judges the IO response time of each preset partition of the hard disk through the second judging module 5025, which improves the accuracy of slow disk detection and reduces the missed-judgment and misjudgment rates.
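The two-stage check performed by modules 5021 to 5026 can be illustrated with a short sketch. The function name, argument names, and millisecond units are hypothetical. Two further points are assumptions rather than statements from this section: that a random IO response time above the first threshold is itself treated as abnormal, and that the "time difference" is the signed value measured minus trusted (so only slower-than-trusted partitions count).

```python
def is_io_abnormal(random_rt_ms, partition_rt_ms, trusted_rt_ms,
                   first_threshold_ms, second_threshold_ms):
    """Return True if the disk's IO performance index looks abnormal.

    partition_rt_ms and trusted_rt_ms map a partition name to a response time.
    """
    # Stage 1: whole-disk random IO response time against the first threshold.
    # Assumption: exceeding it is already abnormal.
    if random_rt_ms > first_threshold_ms:
        return True
    # Stage 2: per-partition comparison against the trusted IO information table.
    for partition, measured in partition_rt_ms.items():
        trusted = trusted_rt_ms.get(partition)
        if trusted is None:
            continue  # no trusted entry for this partition yet
        if measured - trusted > second_threshold_ms:
            return True
    return False
```

The second stage is what lets the check catch a disk whose overall random IO latency still looks acceptable while one region has degraded.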
[0164] Based on the slow disk detection system of Figure 6, Figure 7 is another detailed diagram of the functional modules of the first judging unit in Figure 6, wherein the first judging unit 502 may also include:
[0165] The third judging module 5027 is used to judge whether the trusted IO information table exists;
[0166] A statistical module 5028, configured to count, when the trusted IO information table does not exist, the number of times the IO performance index of each preset partition of the hard disk has been collected and the IO response time in each collected IO performance index;
[0167] The generation module 5029 is configured to determine, when the number of collections is greater than the first threshold, the trusted IO response time corresponding to each preset partition from the multiple collected IO response times according to the first preset algorithm, so as to generate the trusted IO information table.
[0168] In this embodiment, the generation process of the trusted IO information table of the hard disk's preset partitions is described in detail. The trusted IO information table of each hard disk can also be updated regularly or in real time, which improves the accuracy of the trusted IO information table and further improves the accuracy of the slow disk judgment.
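As a rough illustration of the generation step, the sketch below builds a trusted IO information table once enough samples have been collected. The section does not specify the "first preset algorithm", so the per-partition median is used here purely as a stand-in. The function name, the dictionary layout, and the decision to return `None` until every partition has enough samples are likewise assumptions.

```python
from statistics import median


def build_trusted_table(samples, first_threshold):
    """Build a trusted IO table from collected per-partition response times.

    samples: dict mapping a partition name to its list of collected response times.
    Returns a dict of trusted response times, or None if any partition does not
    yet have more than first_threshold samples.
    """
    table = {}
    for partition, times in samples.items():
        # The table is generated only when the number of collections
        # is greater than the first threshold.
        if len(times) <= first_threshold:
            return None
        # Stand-in for the unspecified "first preset algorithm".
        table[partition] = median(times)
    return table
```

A median is a convenient placeholder because a single outlier sample (for example, one very slow request during collection) does not shift the trusted value much. Any other robust estimator could be substituted.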
[0169] Building on the slow disk detection system described with reference to Figures 5 to 7, the system is described in further detail below. Referring to Figure 8, another embodiment of the slow disk detection system includes:
[0170] The collection unit 801 is configured to collect the IO performance index of the hard disk, the IO performance index including at least the random input/output (IO) response time of the hard disk and the IO response time of each preset partition of the hard disk;
[0171] The first judging unit 802 is configured to judge whether the IO performance index of the hard disk is abnormal;
[0172] A recording unit 803, configured to record, when the IO performance index is abnormal, the duration of the abnormality and/or the number of times the abnormality occurs within a preset time;
[0173] The second judging unit 804 is configured to judge whether the duration and/or the number of occurrences is greater than its corresponding preset threshold;
[0174] The first determining unit 805 is configured to determine that the hard disk is a slow disk when the duration and/or the number of occurrences is greater than its corresponding preset threshold.
[0175] Preferably, the slow disk detection system also includes:
[0176] The first updating unit 806 is configured to update the trusted IO information table of each hard disk.
[0177] Preferably, the system also includes:
[0178] The third judging unit 807 is configured to judge, when there are multiple hard disks of the same type, whether a same-type trusted IO information table exists, wherein the same-type trusted IO information table contains trusted IO performance indicators, and the trusted IO performance indicators include at least the trusted IO response time;
[0179] The reading statistics unit 808 is configured to read, when no same-type trusted IO information table exists, the trusted IO information table of each hard disk's preset partitions, and to count the number of writes to the trusted IO information table of each hard disk's preset partitions;
[0180] The second determining unit 809 is configured to determine, when the number of writes is greater than the second threshold, the same-type trusted IO information table from the trusted IO information tables of the corresponding partitions of the multiple hard disks according to a second preset algorithm;
[0181] The fourth judging unit 810 is configured to judge whether the time difference between the first trusted IO response time in the trusted IO information table of the corresponding partition of each hard disk and the second trusted IO response time in the same-type trusted IO information table is greater than the third time threshold;
[0182] The third determining unit 811 is configured to determine that the IO performance index of the hard disk is abnormal when the time difference between the first trusted IO response time and the second trusted IO response time is greater than the third time threshold.
[0183] Preferably, the system also includes:
[0184] The second update unit 812 is configured to update the same-type trusted IO information table of the hard disks.
[0185] In the embodiment of this application, the collection unit 801 collects the IO performance index of the hard disk, which includes at least the random IO response time of the hard disk and the IO response time of each preset partition. That is, this embodiment uses both the random IO response time of the hard disk and the IO response time of its preset partitions as measurement indexes for a slow disk. The first judging unit 802 judges whether the IO performance index is abnormal. When an abnormality occurs, the duration of the abnormality and/or the number of abnormalities within the preset time period is recorded, and when the duration and/or the number is greater than the corresponding preset threshold, the hard disk is determined to be a slow disk. This improves the accuracy of slow disk detection and reduces its missed-judgment and misjudgment rates.
[0186] Secondly, when there are multiple hard disks of the same type in the storage system, the first judging unit judges the IO performance index of each hard disk using the trusted IO information table of that hard disk's preset partitions, and the fourth judging unit 810 additionally judges the IO performance index of each hard disk using the same-type trusted IO information table, which further improves the accuracy of the judgment and reduces the misjudgment rate.
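The same-type comparison carried out by units 807 to 811 can be sketched in the same spirit. The "second preset algorithm" is not specified in this section, so a per-partition median across disks is used as a placeholder. The function names, the `write_counts` bookkeeping, and the signed difference (per-disk trusted value minus same-type trusted value) are assumptions made for illustration only.

```python
from statistics import median


def build_same_type_table(disk_tables, write_counts, second_threshold):
    """Derive a same-type trusted table from the per-disk trusted tables.

    disk_tables: dict disk name -> {partition name: trusted response time}
    write_counts: dict disk name -> number of writes to that disk's table
    Returns None until every disk's table has more than second_threshold writes.
    """
    if not disk_tables:
        return None
    if any(write_counts[d] <= second_threshold for d in disk_tables):
        return None
    # Only partitions present on every disk can be compared.
    partitions = set.intersection(*(set(t) for t in disk_tables.values()))
    # Placeholder for the unspecified "second preset algorithm".
    return {p: median(t[p] for t in disk_tables.values()) for p in partitions}


def abnormal_disks(disk_tables, same_type_table, third_threshold):
    """Return the sorted names of disks whose trusted time on any partition
    exceeds the same-type trusted time by more than third_threshold."""
    return sorted(
        d for d, t in disk_tables.items()
        if any(t[p] - same_type_table[p] > third_threshold for p in same_type_table)
    )
```

Comparing against peers of the same type covers a case the per-disk table cannot: a disk that has been slow since its trusted table was first generated would otherwise look normal relative to its own history.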
[0187] The slow disk detection system in the embodiment of the present invention has been described above from the perspective of modular functional entities, and the slow disk detection system in the embodiment of the present invention is described below from the perspective of hardware processing:
[0188] An embodiment of the slow disk detection system in the embodiment of the present invention includes:
[0189] a processor and a memory;
[0190] The memory is used to store computer programs, and when the processor is used to execute the computer programs stored in the memory, the following steps can be implemented:
[0191] Collecting the IO performance index of the hard disk, the IO performance index including at least the random input/output (IO) response time of the hard disk and the IO response time of each preset partition of the hard disk;
[0192] Judging whether the IO performance index of the hard disk is abnormal;
[0193] If so, recording the duration of the abnormality and/or the number of times the abnormality occurs within a preset time;
[0194] Judging whether the duration and/or the number of occurrences is greater than its corresponding preset threshold;
[0195] If so, determining that the hard disk is a slow disk.
[0196] In some embodiments of the present invention, the processor may also be used to implement the following steps:
[0197] Obtain the random IO response time of the hard disk;
[0198] Judging whether the random IO response time is greater than a first time threshold;
[0199] If not, then read the trusted IO information table of the hard disk preset partition, the trusted IO information table at least includes the trusted IO response time corresponding to each preset partition of the hard disk;
[0200] Collecting the IO response time of the preset partition of the hard disk;
[0201] Judging whether the time difference between the IO response time of the preset partition and the trusted IO response time of the corresponding partition is greater than a second time threshold;
[0202] If yes, it is determined that the IO performance index is abnormal.
[0203] In some embodiments of the present invention, the processor may also be used to implement the following steps:
[0204] Judging whether the trusted IO information table exists;
[0205] If it does not exist, counting the number of times the IO performance index of each preset partition of the hard disk has been collected, and the IO response time in each collected IO performance index;
[0206] When the number of collections is greater than the first threshold, determining the trusted IO response time corresponding to each preset partition from the multiple collected IO response times according to the first preset algorithm, so as to generate the trusted IO information table.
[0207] In some embodiments of the present invention, the processor may also be used to implement the following steps:
[0208] Update the trusted IO information table of each hard disk.
[0209] In some embodiments of the present invention, the processor may also be used to implement the following steps:
[0210] When there are multiple hard disks of the same type, judging whether a same-type trusted IO information table exists, wherein the same-type trusted IO information table contains trusted IO performance indicators, and the trusted IO performance indicators include at least the trusted IO response time;
[0211] If it does not exist, reading the trusted IO information table of each hard disk's preset partitions, and counting the number of writes to the trusted IO information table of each hard disk's preset partitions;
[0212] When the number of writes is greater than the second threshold, determining the same-type trusted IO information table from the trusted IO information tables of the corresponding partitions of the multiple hard disks according to the second preset algorithm;
[0213] Judging whether the time difference between the first trusted IO response time in the trusted IO information table of the corresponding partition of each hard disk and the second trusted IO response time in the same-type trusted IO information table is greater than the third time threshold;
[0214] If yes, it is determined that the IO performance index of the hard disk is abnormal.
[0215] In some embodiments of the present invention, the processor may also be used to implement the following steps:
[0216] Update the trusted IO information table of the same type of hard disk.
[0217] It can be understood that when the processor in the slow disk detection system described above executes the computer program, it can also realize the functions of the units in the above corresponding device embodiments, which will not be repeated here. Exemplarily, the computer program may be divided into one or more modules/units, and the one or more modules/units are stored in the memory and executed by the processor to complete the present invention. The one or more modules/units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the slow disk detection system. For example, the computer program can be divided into units in the above-mentioned slow disk detection system, and each unit can realize the specific functions described in the corresponding slow disk detection system above.
[0218] The computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device may include, but is not limited to, a processor and a memory. Those skilled in the art can understand that the processor and memory are merely examples of the components of a computer device and do not constitute a limitation on it. A computer device may include more or fewer components, combine certain components, or use different components. For example, the computer device may also include input and output devices, network access devices, buses, and so on.
[0219] The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the computer device and connects the various parts of the entire computer device through various interfaces and lines.
[0220] The memory can be used to store the computer programs and/or modules, and the processor realizes the various functions of the computer device by running or executing the computer programs and/or modules stored in the memory and calling the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function, etc.; the data storage area may store data created according to the use of the terminal, and the like. In addition, the memory may include a high-speed random access memory, and may also include a non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
[0221] The present invention also provides a computer-readable storage medium, which is used to realize the functions of the slow disk detection system and on which a computer program is stored. When the computer program is executed by a processor, the processor can be used to perform the following steps:
[0222] Collecting the IO performance index of the hard disk, the IO performance index including at least the random input/output (IO) response time of the hard disk and the IO response time of each preset partition of the hard disk;
[0223] Judging whether the IO performance index of the hard disk is abnormal;
[0224] If so, recording the duration of the abnormality and/or the number of times the abnormality occurs within a preset time;
[0225] Judging whether the duration and/or the number of occurrences is greater than its corresponding preset threshold;
[0226] If so, determining that the hard disk is a slow disk.
[0227] In some embodiments of the present invention, when the computer program stored in the computer-readable storage medium is executed by the processor, the processor may be specifically configured to perform the following steps:
[0228] Obtain the random IO response time of the hard disk;
[0229] Judging whether the random IO response time is greater than a first time threshold;
[0230] If not, then read the trusted IO information table of the hard disk preset partition, the trusted IO information table at least includes the trusted IO response time corresponding to each preset partition of the hard disk;
[0231] Collecting the IO response time of the preset partition of the hard disk;
[0232] Judging whether the time difference between the IO response time of the preset partition and the trusted IO response time of the corresponding partition is greater than a second time threshold;
[0233] If yes, it is determined that the IO performance index is abnormal.
[0234] In some embodiments of the present invention, when the computer program stored in the computer-readable storage medium is executed by the processor, the processor may be specifically configured to perform the following steps:
[0235] Judging whether the trusted IO information table exists;
[0236] If it does not exist, counting the number of times the IO performance index of each preset partition of the hard disk has been collected, and the IO response time in each collected IO performance index;
[0237] When the number of collections is greater than the first threshold, determining the trusted IO response time corresponding to each preset partition from the multiple collected IO response times according to the first preset algorithm, so as to generate the trusted IO information table.
[0238] In some embodiments of the present invention, when the computer program stored in the computer-readable storage medium is executed by the processor, the processor may be specifically configured to perform the following steps:
[0239] Update the trusted IO information table of each hard disk.
[0240] In some embodiments of the present invention, when the computer program stored in the computer-readable storage medium is executed by the processor, the processor may be specifically configured to perform the following steps:
[0241] When there are multiple hard disks of the same type, judging whether a same-type trusted IO information table exists, wherein the same-type trusted IO information table contains trusted IO performance indicators, and the trusted IO performance indicators include at least the trusted IO response time;
[0242] If it does not exist, reading the trusted IO information table of each hard disk's preset partitions, and counting the number of writes to the trusted IO information table of each hard disk's preset partitions;
[0243] When the number of writes is greater than the second threshold, determining the same-type trusted IO information table from the trusted IO information tables of the corresponding partitions of the multiple hard disks according to the second preset algorithm;
[0244] Judging whether the time difference between the first trusted IO response time in the trusted IO information table of the corresponding partition of each hard disk and the second trusted IO response time in the same-type trusted IO information table is greater than the third time threshold;
[0245] If yes, it is determined that the IO performance index of the hard disk is abnormal.
[0246] In some embodiments of the present invention, when the computer program stored in the computer-readable storage medium is executed by the processor, the processor may be specifically configured to perform the following steps:
[0247] Update the trusted IO information table of the same type of hard disk.
[0248] It can be understood that, if the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a corresponding computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present invention may also be completed by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when it is executed by the processor, it can realize the steps of the above method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, etc. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction. For example, in some jurisdictions, computer-readable media exclude electrical carrier signals and telecommunication signals.
[0249] Those skilled in the art can clearly understand that for the convenience and brevity of the description, the specific working process of the above-described system, device and unit can refer to the corresponding process in the foregoing method embodiment, which will not be repeated here.
[0250] In the several embodiments provided in this application, it should be understood that the disclosed system, device and method can be implemented in other ways. For example, the device embodiments described above are only illustrative. The division of the units is only a logical function division, and there may be other division methods in actual implementation: multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
[0251] The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
[0252] In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
[0253] The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements for some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

