Method for estimating data storage capacity and electronic device
By acquiring and analyzing historical data on the data storage volume of the target device, detecting abnormal change events and generating abnormal change data, the problem of difficulty in predicting rapid changes in data storage volume in existing technologies is solved. This enables accurate prediction of future data storage volume and the formulation of emergency plans, thereby improving system stability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- LENOVO (BEIJING) LTD
- Filing Date
- 2022-09-30
- Publication Date
- 2026-06-30
AI Technical Summary
In storage systems, existing technologies struggle to effectively predict and respond to rapid changes in data storage volume caused by historical anomalies, potentially triggering alarm thresholds or limits and leading to system risks or failures.
By acquiring historical data on the data storage volume of the target device, detecting abnormal change events, generating abnormal change data, and making predictions based on this data, including building an abnormal dataset and matching operational items, the future changes in data storage volume can be predicted.
It improves the accuracy of data storage estimation for target devices under abnormal change events, helps users develop contingency plans, and improves system stability and robustness.
Abstract
Description
Technical Field
[0001] This application relates to a method for estimating data storage capacity and an electronic device.
Background Technology
[0002] In typical forecasting processes, future trends are predicted based on initial data. However, in many actual storage system capacity operation and maintenance activities, it is necessary to pay attention not only to the natural growth and changes in system data, but also to the potential impact of a recurrence of a historical anomaly on the system. For example, if an anomaly that occurred within a certain historical time window occurs again, it may quickly trigger alarm thresholds or storage capacity limits, leading to system risks or failures.
Summary of the Invention
[0003] This application provides a method for estimating data storage capacity and an electronic device. The technical solution adopted in the embodiments of this application is as follows:
[0004] A method for estimating data storage capacity includes:
[0005] Acquire first data of the target device, the first data including the amount of data stored by the target device at each moment in the first time interval;
[0006] Based on the first data, detect abnormal change events in the data storage volume of the target device, and obtain abnormal change data, which can characterize the way in which the data storage volume of the target device changes during the duration of the abnormal change event.
[0007] Based on the first data and the abnormal change data, the data storage volume of the target device is estimated when the abnormal change event occurs at the target time.
[0008] In some embodiments, based on the first data and the abnormal change data, an estimate is made of the data storage volume of the target device in the event of the abnormal change event at a target time, including:
[0009] Retrieve an abnormal dataset and second data of the target device; wherein the abnormal dataset includes multiple abnormal change data and, for each, the first operation item that triggered the corresponding abnormal change event, and the second data includes one or more second operation items executed by the target device at a target time;
[0010] Match the second operation item in the second data with the first operation item in the abnormal dataset to select abnormal change data from the abnormal dataset;
[0011] Based on the first data and the selected abnormal change data, the data storage volume of the target device is estimated in the event that the abnormal change event occurs at the target time.
[0012] In some embodiments, based on the first data and the abnormal change data, an estimate is made of the data storage volume of the target device in the event of the abnormal change event at a target time, including:
[0013] Based on the first data, the first data storage volume of the target device at the target time is estimated; wherein, the first data storage volume is the data storage volume of the target device under the condition that no abnormal change event occurs at the target time;
[0014] Based on the first data storage volume and the abnormal change data, the second data storage volume of the target device is estimated in the case that the abnormal change event occurs at the target time.
[0015] In some embodiments, detecting abnormal changes in the data storage capacity of the target device based on the first data and obtaining abnormal change data includes:
[0016] Based on the first data, a data sequence is generated consisting of multiple sub-data arranged in chronological order; wherein, the sub-data is formed by splitting the first data according to a preset time interval;
[0017] The anomalous change data is generated based on the outliers in the data sequence.
[0018] In some embodiments, generating the anomalous change data based on outliers in the data sequence includes:
[0019] Identify outliers in the data sequence;
[0020] Based on the outliers, target data in the data sequence is determined; wherein the target data includes multiple consecutive sub-data and at least the outliers;
[0021] The abnormal change data is generated based on the target data.
[0022] In some embodiments, determining outliers in the data sequence includes:
[0023] The sub-data in the data sequence is preprocessed to obtain the variation parameters of the sub-data; the variation parameters can characterize the degree of change in the data storage volume of the target device in the corresponding time period.
[0024] Based on the changing parameters, outliers in the data sequence are identified.
[0025] In some embodiments, determining outliers in the data sequence includes:
[0026] The sub-data in the data sequence is preprocessed to obtain different types of change parameters for the sub-data; different types of change parameters can characterize the degree of change in the data storage volume of the target device in the corresponding time period from different dimensions.
[0027] Based on each of the aforementioned changing parameters, outliers in the data sequence are determined.
[0028] If at least one of the plurality of the changing parameters characterizes the sub-data as an outlier, then the sub-data is identified as an outlier.
[0029] In some embodiments, determining the target data in the data sequence based on the outlier includes:
[0030] The target data is determined by a series of consecutive outliers, a first number of sub-data points preceding the series of outliers, and / or a second number of sub-data points following the series of outliers.
[0031] In some embodiments, determining the target data in the data sequence based on the outlier includes:
[0032] Based on the changing trend of the outliers, the sub-data in the data sequence is smoothed.
[0033] Based on the smoothed data sequence, the target data is determined; wherein the target data includes the outlier and sub-data that are close to the outlier and have the same trend of change as the outlier.
[0034] An electronic device, comprising:
[0035] The acquisition module is used to acquire first data of the target device, the first data including the data storage amount of the target device at each time in the first time interval;
[0036] The detection module is used to detect abnormal change events in the data storage volume of the target device based on the first data, and to obtain abnormal change data, wherein the abnormal change data can characterize the change pattern of the data storage volume of the target device during the duration of the abnormal change.
[0037] The estimation module is used to estimate the amount of data stored by the target device when the abnormal change event occurs at a target time, based on the first data and the abnormal change data.
[0038] The data storage capacity estimation method of this application embodiment obtains first data of the target device, detects abnormal change events in the data storage capacity of the target device based on the first data, and obtains abnormal change data. The abnormal change data can characterize the change pattern of the data storage capacity of the target device during the duration of the abnormal change event. Based on the first data and the abnormal change data, the data storage capacity of the target device under the condition of an abnormal change event at a target time is estimated. This method can estimate the data storage capacity of the target device under the condition of an abnormal change event at a target time, so that users can formulate contingency plans or take targeted avoidance measures, which is beneficial to improving the stability and robustness of the system.
Attached Figure Description
[0039] Figure 1 is a flowchart illustrating the data storage estimation method according to an embodiment of this application;
[0040] Figure 2 is a flowchart of one embodiment of step S130 in the data storage estimation method of this application;
[0041] Figure 3 is a graph showing the data storage volume of the target device;
[0042] Figure 4 is a flowchart of another embodiment of step S130 in the data storage estimation method of this application;
[0043] Figure 5 is a structural block diagram of one embodiment of the electronic device described in this application;
[0044] Figure 6 is a structural block diagram of another embodiment of the electronic device described in this application.
Detailed Implementation
[0045] Various embodiments and features of this application are described herein with reference to the accompanying drawings.
[0046] It should be understood that various modifications can be made to the embodiments described herein. Therefore, the above description should not be considered as limiting, but merely as an example of embodiments. Other modifications within the scope and spirit of this application will be apparent to those skilled in the art.
[0047] The accompanying drawings, which are included in and form part of this specification, illustrate embodiments of the present application and, together with the general description of the present application given above and the detailed description of the embodiments given below, serve to explain the principles of the present application.
[0048] These and other features of this application will become apparent from the following description of preferred forms of embodiments given as non-limiting examples, with reference to the accompanying drawings.
[0049] It should also be understood that although this application has been described with reference to some specific examples, those skilled in the art can certainly implement many other equivalent forms of this application, which have the features described in the claims and are therefore all within the scope of protection defined herein.
[0050] The above and other aspects, features and advantages of this application will become more apparent when taken in conjunction with the accompanying drawings and in view of the following detailed description.
[0051] Specific embodiments of this application are described hereinafter with reference to the accompanying drawings; however, it should be understood that the disclosed embodiments are merely examples of this application, which can be implemented in various ways. Well-known and/or repeated functions and structures are not described in detail, to avoid obscuring the application with unnecessary or redundant detail. Therefore, the specific structural and functional details disclosed herein are not intended to be limiting, but merely serve as a basis for the claims and as a representative basis for teaching those skilled in the art to employ this application in substantially any suitably detailed structure.
[0052] This specification may use the phrases “in one embodiment,” “in another embodiment,” “in yet another embodiment,” or “in other embodiments,” all of which may refer to one or more of the same or different embodiments according to this application.
[0053] This application provides a method for estimating data storage volume. Figure 1 is a flowchart of the data storage estimation method of this application embodiment. As shown in Figure 1, the method may specifically include the following steps.
[0054] S110, acquire the first data of the target device, the first data including the data storage amount of the target device at each time in the first time interval.
[0055] Optionally, the target device may include various electronic devices with data storage capabilities. The target device may be a single electronic device, for example, a server. The target device may also include multiple electronic devices, for example, a database composed of multiple electronic devices.
[0056] Optionally, this data storage capacity estimation method can be applied to the target device itself, that is, it can be applied in self-monitoring application scenarios to estimate the data storage capacity of the target device. It can also be applied to other electronic devices to estimate the data storage capacity of the target device. For example, this data storage capacity estimation method can be applied to a central control system in a database, where the target device can be a data storage unit of the entire database, or it can be a sub-unit within a data storage unit. The central control system can estimate the data storage capacity of the data storage unit based on this method, or it can estimate the data storage capacity of a sub-unit within the data storage unit based on this method.
[0057] The first data includes the amount of data stored by the target device at each moment within a first time interval. Optionally, the first time interval can be a historical time interval, in which case the first data can be the historical data of the target device, which may record the amount of data stored by the target device at each moment within the historical time interval. Optionally, the first time interval can also be a time interval with a specific duration, but not an actual occurrence; for example, the first time interval can be a future time interval. In this case, the first data may not be the actual data recorded by the target device, but may be, for example, predicted data.
[0058] S120, based on the first data, detect abnormal change events in the data storage volume of the target device, and obtain abnormal change data, wherein the abnormal change data can characterize the way in which the data storage volume of the target device changes during the duration of the abnormal change event.
[0059] Optionally, the abnormal change event includes events where the data storage volume of the target device abnormally increases, decreases, or fluctuates within a short period of time. For example, the data storage volume of the target device rapidly increases and then rapidly decreases within a short period of time.
[0060] Optionally, once the first data is obtained, abnormal changes in the data storage capacity of the target device can be identified based on the first data. In specific implementations, various methods can be used to detect abnormal changes. For example, abnormal changes can be detected based on specific methodological steps, or by utilizing, for example, machine learning models.
[0061] Optionally, the abnormal change data can describe how the data storage volume of the target device changes during the duration of the abnormal change event in various ways. For example, the time period during which the abnormal change event occurs can be defined as the abnormal time period. Based on this, the abnormal change data can include the relative change in the data storage volume of the target device during the abnormal time period. Alternatively, based on the relative change in the data storage volume of the target device during the abnormal time period, the trend and rate of change of the data storage volume of the target device can be determined. Another example is that the abnormal change data can include a curve showing the change in the data storage volume of the target device during the abnormal time period. Or, based on the curve showing the change in the data storage volume of the target device during the abnormal time period, the curve can be smoothed, and the smoothed curve can be used as the abnormal change data.
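One of the representations described above, the relative change over the abnormal time period together with a trend and rate derived from it, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the hourly sampling step, and the sample readings are hypothetical and do not come from the application.

```python
def relative_change(segment):
    """Express each reading relative to the first reading of the abnormal period."""
    if not segment:
        raise ValueError("segment must contain at least one sample")
    base = segment[0]
    return [value - base for value in segment]


def average_rate(segment, step_hours=1.0):
    """Average change per hour over the abnormal period; the sign gives the trend."""
    if len(segment) < 2:
        return 0.0
    elapsed_hours = (len(segment) - 1) * step_hours
    return (segment[-1] - segment[0]) / elapsed_hours


# Hypothetical readings (GB), one per hour, taken during an abnormal change event.
abnormal_period = [500.0, 540.0, 610.0, 650.0, 560.0]
curve = relative_change(abnormal_period)   # relative change curve
rate = average_rate(abnormal_period)       # net growth per hour over the event
```

Storing the curve relative to its first sample makes it independent of the absolute storage level at which the event originally occurred, so it can later be overlaid on a different baseline.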
[0062] It should be noted that the above abnormal change data is only illustrative and should not be construed as being limited to the data content or format in the above example. In specific implementation, various abnormal change data can be used to describe the changes in the amount of data stored on the target device during an abnormal change event.
[0063] S130, based on the first data and the abnormal change data, estimate the amount of data stored by the target device when the abnormal change event occurs at the target time.
[0064] During the normal operation of the target device, its data storage volume is also dynamically changing. For example, as the target device continues to operate, its data storage volume will dynamically increase even without any abnormal events. Therefore, the data storage volume of the target device exhibits a normal change pattern, that is, a normal change pattern under the condition that no abnormal events occur. This normal change pattern can be related to various factors such as the target device's application scenario, application objects, data storage methods, and customer groups.
[0065] Based on this, as shown in Figure 2, step S130, estimating the amount of data stored by the target device when the abnormal change event occurs at the target time based on the first data and the abnormal change data, may include the following steps.
[0066] S131, based on the first data, estimate the first data storage volume of the target device at the target time; wherein, the first data storage volume is the data storage volume of the target device under the condition that no abnormal change event occurs at the target time.
[0067] Optionally, normal change data can be obtained based on the first data. This normal change data can be used to describe the normal change pattern of the target device's data storage volume when no abnormal change events occur on the target device. Then, based on this normal change data, the data storage volume of the target device under the condition that no abnormal change events occur at the target time is predicted, thus obtaining storage volume data. Of course, this storage volume data represents the data storage volume of the target device under the condition that no abnormal change events occur before the target time.
[0068] Optionally, the first data storage volume can be the data storage volume of the target device under the condition that no abnormal change events occur between the current time and the target time, or it can be the data storage volume of the target device under the condition that one or more abnormal change events occur between the current time and the target time, but no abnormal change events occur at the target time.
[0069] Understandably, in practice, the first data storage volume can be predicted in various ways. For example, the first data can be used as input to a machine learning model, which then outputs the first data storage volume based on the first data.
[0070] S132, based on the first data storage volume and the abnormal change data, estimate the second data storage volume of the target device in the case of the abnormal change event occurring at the target time.
[0071] Optionally, the target time can be a target point in time or a target time period. For example, the data storage volume of the target device can be estimated for a specific duration after the current time, or the data storage volume of the target device can be estimated for a specific time point in the future.
[0072] Optionally, based on the already determined first data storage volume, the change in data storage volume caused by the abnormal change event can be determined based on the abnormal change data. Adding this change amount to the first data storage volume yields the second data storage volume.
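The addition described in this step can be sketched in a few lines. The sketch assumes the abnormal change data is held as a curve of changes relative to the start of the event, and that the first data storage volume is a single number; both are simplifying assumptions, and the function names and values are hypothetical.

```python
def second_storage_volume(first_volume, abnormal_curve):
    """Overlay a relative abnormal change curve on the no-anomaly estimate."""
    return [first_volume + delta for delta in abnormal_curve]


def peak_volume(first_volume, abnormal_curve):
    """Largest storage volume reached while the abnormal change event lasts."""
    return max(second_storage_volume(first_volume, abnormal_curve))


# Hypothetical values: 800 GB expected at the target time without an anomaly,
# and a previously observed relative change curve (GB).
estimate = second_storage_volume(800.0, [0.0, 40.0, 110.0, 150.0, 60.0])
peak = peak_volume(800.0, [0.0, 40.0, 110.0, 150.0, 60.0])
```

Tracking the peak rather than only the final value matters when the volume rises and then falls, because the alarm threshold may be crossed in the middle of the event.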
[0073] For example, the first data may be a first change curve of the data storage volume of the target device, as shown in part A of Figure 3. To improve the accuracy of the estimate, the first change curve can be trend-processed to form a second change curve, as shown in part B of Figure 3. A first growth slope can be generated based on the second change curve, as shown in part C of Figure 3. The first growth slope is extended to form a second growth slope, as shown in part D of Figure 3; the value on the second growth slope is the first data storage volume. Abnormal change events in the data storage volume of the target device are detected based on the first change curve, and the abnormal change curve shown in segment E of the first change curve can be used as the abnormal change data. By shifting the abnormal change curve to the target time and aligning its starting point with the first change curve, or by placing the starting point of the abnormal change curve on the second growth slope, the growth curve of the data storage volume of the target device when an abnormal change event occurs at the target time can be obtained, as shown in part F of Figure 3. Part G of Figure 3 shows the preset alarm threshold or upper limit of the data storage volume of the target device. From Figure 3, it can be intuitively determined whether the data storage volume of the target device reaches the alarm threshold or upper limit when an abnormal change event occurs at the target time.
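The Figure 3 walk-through can be approximated with the sketch below. It assumes, purely for illustration, that the growth slope is an ordinary least-squares line fitted to evenly spaced history samples; the application does not specify how the trend processing is done. All names, sample values, and the threshold are hypothetical.

```python
def fit_line(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project_with_anomaly(history, abnormal_curve, target_index):
    """Place the start of the relative abnormal curve on the extended growth line."""
    slope, intercept = fit_line(history)
    baseline = slope * target_index + intercept  # first data storage volume
    return [baseline + delta for delta in abnormal_curve]


history = [100.0, 110.0, 120.0, 130.0]   # hypothetical past readings (GB)
projected = project_with_anomaly(history, [0.0, 30.0, 50.0], target_index=10)
alarm_threshold = 240.0                  # hypothetical alarm value (GB)
alarm_reached = max(projected) >= alarm_threshold
```

Here the extended line gives the no-anomaly volume at the target time, the shifted curve corresponds to part F, and the comparison against `alarm_threshold` plays the role of part G.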
[0074] The data storage capacity estimation method of this application embodiment obtains first data of the target device, detects abnormal change events in the data storage capacity of the target device based on the first data, and obtains abnormal change data. The abnormal change data can characterize the change pattern of the data storage capacity of the target device during the continuous process of the abnormal change event. Based on the first data and the abnormal change data, the data storage capacity of the target device under the condition of an abnormal change event at a target time is estimated. This can predict whether the data storage capacity of the target device will reach the alarm threshold or upper limit under the condition of an abnormal change event at a target time, so that users can formulate emergency plans in a targeted manner or take purposeful avoidance, which is beneficial to improving the stability and robustness of the system.
[0075] As shown in Figure 4, in some embodiments, step S130, which estimates the amount of data stored by the target device when the abnormal change event occurs at a target time based on the first data and the abnormal change data, may include the following steps.
[0076] S133, retrieve the abnormal dataset and the second data of the target device; wherein the abnormal dataset includes multiple abnormal change data and, for each, the first operation item that triggered the corresponding abnormal change event, and the second data includes one or more second operation items executed by the target device at the target time.
[0077] Optionally, detecting abnormal changes in the data storage of the target device may include identifying the abnormal change event, determining how the data storage of the target device changes during the duration of the abnormal change event, or determining the cause of the abnormal change event.
[0078] Optionally, one or more first operational items that triggered the abnormal change event are identified. If abnormal change data is obtained and one or more associated first operational items are identified, an abnormal dataset can be constructed based on the abnormal change data and the associated one or more first operational items. The abnormal change event is then recorded using the abnormal dataset. Optionally, the first operational items include, but are not limited to, system upgrades, system maintenance, system rollbacks, hardware upgrades, etc. For example, if an abnormal change event is determined to be triggered by a system rollback, the first operational item of system rollback can be recorded using a tag, and this tag can be associated with the abnormal change data and stored in the abnormal dataset.
[0079] Optionally, the second data can be any data that includes one or more second operation items planned to be performed by the target device after the current time. For example, the second data could be the target device's operation and maintenance plan. Of course, the second data is not limited to operation and maintenance plans and can also include other data. Optionally, the second operation items may include, but are not limited to, system upgrades, system maintenance, system rollbacks, hardware upgrades, etc.
[0080] S134, match the second operation item in the second data with the first operation item in the abnormal dataset to select abnormal change data from the abnormal dataset.
[0081] Optionally, when it is necessary to estimate the data storage volume of the target device in the case that the abnormal change event occurs at the target time, the abnormal dataset and the second data can be retrieved to determine one or more second operation items that the target device may perform at the target time. These second operation items are then matched with the first operation items in the abnormal dataset, and the abnormal change data is selected from the abnormal dataset. For example, if it is determined based on the second data that the target device may need to perform a system rollback operation at the target time, the change curves associated with the rollback operation can be queried from the abnormal dataset.
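The matching step can be sketched as a simple tag lookup. The record layout (a dictionary with an `operation` tag and a relative change `curve`), the case-insensitive string comparison, and the sample entries are assumptions made for illustration; the application does not prescribe a storage format or a matching rule.

```python
# Hypothetical abnormal dataset: each record pairs a triggering operation tag
# with the relative change curve observed during that abnormal change event.
abnormal_dataset = [
    {"operation": "system rollback", "curve": [0.0, 80.0, 120.0, 40.0]},
    {"operation": "system upgrade", "curve": [0.0, 30.0, 60.0, 60.0]},
    {"operation": "system rollback", "curve": [0.0, 60.0, 90.0, 20.0]},
]

# Hypothetical second data: operations planned for the target time.
planned_operations = ["System Rollback", "hardware upgrade"]


def select_abnormal_data(dataset, planned):
    """Return records whose triggering operation matches a planned operation."""
    wanted = {op.strip().lower() for op in planned}
    return [rec for rec in dataset if rec["operation"].strip().lower() in wanted]


selected = select_abnormal_data(abnormal_dataset, planned_operations)
```

When several historical curves match the same operation, as with the two rollback records here, a caller would still need to decide how to combine them, for example by taking the worst case; that choice is outside what the text specifies.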
[0082] S135, based on the first data and the selected abnormal change data, estimate the amount of data storage required by the target device in the event of the abnormal change at the target time.
[0083] Optionally, the first data storage volume of the target device at the target time can be predicted based on the first data; the first data storage volume can be the data storage volume of the target device under the condition that no abnormal change event occurs at the target time.
[0084] Optionally, the first data storage volume can be the data storage volume of the target device under the condition that no abnormal change events occur between the current time and the target time, or it can be the data storage volume of the target device under the condition that one or more abnormal change events occur between the current time and the target time, but no abnormal change events occur at the target time.
[0085] Optionally, the growth rate of the data storage volume of the target device under normal conditions can be determined based on the first data, and the data storage volume of the target device under normal conditions can be determined based on the growth rate when no abnormal change events occur between the current time and the target time, and when no abnormal change events occur at the target time.
[0086] Optionally, the target device may experience multiple abnormal change events simultaneously at the target time, or it may experience multiple abnormal change events sequentially. For example, based on second data, it can be determined that the target device may experience a first abnormal change event at a first time point, and a second abnormal change event may occur at a second time point after the first time point. First abnormal data for the first abnormal change event and second abnormal data for the second abnormal change event can be obtained. Then, based on the growth rate of the target device's data storage volume under normal conditions, the data storage volume of the target device under the condition that no abnormal change event occurred at the first time point can be determined. Based on the first abnormal change data and the data storage volume of the target device under the condition that no abnormal change event occurred at the first time point, the data storage volume of the target device after the first abnormal change event can be determined. Based on the second abnormal change data and the data storage volume of the target device after the first abnormal change event, the data storage volume of the target device during and after the second abnormal change event can be determined. Based on one or more second operation items that the target device may perform at the target time, the abnormal change events that the target device may experience at the target time can be estimated relatively accurately, and consequently, the data storage volume of the target device under the condition that an abnormal change event occurs at the target time can be estimated relatively accurately.
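The sequential case described above can be sketched as follows, assuming a constant normal growth per time step and events that do not overlap. The function name, the step-based time axis, and all numbers are hypothetical.

```python
def chain_events(current_volume, growth_per_step, events):
    """Apply abnormal change events one after another.

    events: list of (start_step, relative_curve) tuples, sorted by start_step
    and non-overlapping. Returns one projected trajectory per event.
    """
    volume = current_volume
    step = 0
    trajectories = []
    for start, curve in events:
        if not curve:
            raise ValueError("each event needs a non-empty curve")
        if start < step:
            raise ValueError("events must be sorted and must not overlap")
        # Normal growth from the end of the previous event up to this event.
        volume += growth_per_step * (start - step)
        trajectory = [volume + delta for delta in curve]
        trajectories.append(trajectory)
        # The next event starts from where this one left the storage volume.
        volume = trajectory[-1]
        step = start + len(curve) - 1
    return trajectories


# Hypothetical plan: one event at step 5 and a second one at step 10.
result = chain_events(
    current_volume=100.0,
    growth_per_step=2.0,
    events=[(5, [0.0, 20.0, 10.0]), (10, [0.0, -5.0, 15.0])],
)
```

The second trajectory starts from the volume left by the first event plus the normal growth in between, which mirrors the order of the determinations in the paragraph above.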
[0087] In some embodiments, step S120, based on the first data, detects abnormal change events in the data storage capacity of the target device and obtains abnormal change data, may include the following steps.
[0088] S121, Based on the first data, a data sequence is generated consisting of multiple sub-data arranged in chronological order; wherein the sub-data is formed by splitting the first data according to a preset time interval.
[0089] Taking the case where the first data is historical data as an example, historical data for a specific historical interval can be obtained. For instance, a monitoring system may store historical data for the 180 days prior to the current time, but when detecting abnormal change events, only the most recent 90 days of historical data may be retrieved, so that recent abnormal change events are analyzed.
[0090] Optionally, after obtaining the first data, a time window can be used to divide the first data into multiple sub-data, and the multiple sub-data can be arranged into a data sequence in chronological order.
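A minimal sketch of the windowing step, assuming the first data is a list of evenly spaced readings already in chronological order and that the windows do not overlap. The window size and the sample readings are hypothetical.

```python
def split_into_windows(samples, window_size):
    """Split chronological samples into consecutive, non-overlapping windows."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [samples[i:i + window_size] for i in range(0, len(samples), window_size)]


# Hypothetical readings; with window_size=4 the final window is shorter.
readings = [100, 101, 103, 104, 104, 150, 190, 191, 192, 193]
data_sequence = split_into_windows(readings, window_size=4)
```

The last window may hold fewer samples than the others; whether to keep, pad, or drop it is a design choice the text leaves open.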
[0091] S122, Based on the outliers in the data sequence, generate the abnormal change data.
[0092] Optionally, given a data sequence, outlier analysis can be performed on the data sequence using a preset method to identify outliers and generate anomalous change data based on these outliers. For example, outliers can be directly used as anomalous change data. Alternatively, outliers can be further processed to generate anomalous change data. Since outliers are far from the general level of the data sequence, outlier analysis can detect anomalous change events relatively accurately.
[0093] In some embodiments, step S122, generating the anomalous change data based on outliers in the data sequence, may include:
[0094] Identify outliers in the data sequence;
[0095] Based on the outliers, target data in the data sequence is determined; wherein the target data includes multiple consecutive sub-data and at least the outliers;
[0096] The abnormal change data is generated based on the target data.
[0097] In practice, anomaly events may not immediately cause the target device's data storage volume to exceed normal levels, nor may they cause it to consistently exceed normal levels by a significant margin. Data points that do not significantly exceed normal levels are not identified as outliers. However, these data points within the same time window are still affected by the anomaly event. Missing this subset of data would prevent the anomaly data from fully describing how the target device's data storage volume changes during the anomaly event. When outliers are identified in the data sequence, target data is determined based on these outliers. This target data includes multiple consecutive sub-data points, including the outliers. In other words, expanding the outliers into a set of consecutive sub-data points helps to improve the anomaly data, enabling it to accurately describe how the target device's data storage volume changes during the anomaly event.
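Expanding outliers into a run of consecutive sub-data can be sketched as below. The sketch assumes outliers have already been flagged per sub-data and pads each run of consecutive outliers with a fixed number of neighbours before and after it. The padding counts and the flag list are hypothetical.

```python
def expand_outliers(outlier_flags, before=1, after=1):
    """Return inclusive (start, end) index ranges covering each run of
    consecutive outliers plus `before` and `after` neighbouring sub-data."""
    ranges = []
    n = len(outlier_flags)
    i = 0
    while i < n:
        if outlier_flags[i]:
            j = i
            # Extend j to the last index of this run of consecutive outliers.
            while j + 1 < n and outlier_flags[j + 1]:
                j += 1
            ranges.append((max(0, i - before), min(n - 1, j + after)))
            i = j + 1
        else:
            i += 1
    return ranges


flags = [False, False, True, True, False, False, False, True, False]
target_ranges = expand_outliers(flags, before=1, after=2)
```

Ranges from nearby runs are not merged in this sketch, so a caller that uses generous padding may want to merge overlapping ranges before extracting the target data.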
[0098] In some embodiments, determining outliers in the data sequence may include:
[0099] The sub-data in the data sequence is preprocessed to obtain the variation parameters of the sub-data; the variation parameters can characterize the degree of change in the data storage volume of the target device in the corresponding time period.
[0100] Based on the variation parameters, outliers in the data sequence are identified.
[0101] Optionally, the subdata actually consists of continuous data storage amounts within a time window. The maximum and minimum data storage amounts in the subdata can be determined, and the absolute value of the difference between the maximum and minimum data storage amounts is calculated as a variation parameter. Based on the absolute value of the difference between the maximum and minimum values, outliers in the data sequence are determined.
[0102] Optionally, the difference between the last and first values in the sub-data can be determined, and this difference can be used as a variation parameter to identify outliers in the data sequence.
[0103] Optionally, the absolute value of the difference between the first and last values in a sub-data set can be determined as a variation parameter. Outliers in the data sequence are then identified based on this absolute value of the difference between the first and last data values.
[0104] Optionally, the standard deviation of each sub-data set can be determined and used as a variation parameter to identify outliers in the data sequence.
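The four optional variation parameters listed above can be computed per sub-data as in the sketch below. The function name and dictionary keys are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or sample form is intended.

```python
from statistics import pstdev
from typing import Dict, Sequence


def variation_parameters(sub_data: Sequence[float]) -> Dict[str, float]:
    """Compute candidate variation parameters for one sub-data (one time window)."""
    if not sub_data:
        raise ValueError("sub_data must not be empty")
    net = sub_data[-1] - sub_data[0]
    return {
        "range": abs(max(sub_data) - min(sub_data)),  # |max - min|
        "net_change": net,                            # last - first (signed)
        "abs_net_change": abs(net),                   # |last - first|
        "std": pstdev(sub_data),                      # population standard deviation
    }


params = variation_parameters([10, 14, 12, 8])
```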
[0105] In some embodiments, determining outliers in the data sequence includes:
[0106] The sub-data in the data sequence is preprocessed to obtain different types of variation parameters for the sub-data; different types of variation parameters can characterize the degree of change in the data storage volume of the target device in the corresponding time period from different dimensions.
[0107] Based on each of the variation parameters, outliers in the data sequence are determined.
[0108] If at least one of the multiple variation parameters characterizes the sub-data as an outlier, then the sub-data is identified as an outlier.
[0109] Because different types of variation parameters characterize, from different dimensions, the degree of change in the data storage volume of the target device within the corresponding time period, outliers in the data sequence are identified based on each variation parameter separately. If at least one of the multiple variation parameters characterizes a sub-data as an outlier, that sub-data is identified as an outlier. In this way, whether a sub-data is an outlier can be judged from different dimensions, enabling more comprehensive outlier detection.
[0110] Optionally, the maximum and minimum data storage amounts in the sub-data can be determined, and the absolute value of their difference, $|v_{max} - v_{min}|$, can be calculated as the first variation parameter. The absolute value of the difference between the last and first values in the sub-data, $|v_{end} - v_{start}|$, can be determined as the second variation parameter. The standard deviation $s_m$ of each sub-data can also be determined and used as the third variation parameter. Outliers in the data sequence are determined based on the first, second, and third variation parameters, respectively. If one of the three variation parameters characterizes a sub-data as an outlier, then that sub-data is identified as an outlier. Alternatively, if two of the three variation parameters characterize the sub-data as an outlier, then the sub-data is identified as an outlier. Or, if all three variation parameters characterize the sub-data as an outlier, then the sub-data is identified as an outlier.
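The combination of three variation parameters can be illustrated with a simple voting sketch. The text leaves the per-parameter outlier rule as a preset method, so the rule used here (a value is flagged when it exceeds `factor` times the median of that parameter across the sequence) is purely an assumption for illustration. The names `flag_high`, `detect_outliers`, `factor`, and `min_votes` are hypothetical.

```python
from statistics import median, pstdev
from typing import List, Sequence


def flag_high(values: Sequence[float], factor: float = 3.0) -> List[bool]:
    """Assumed rule: flag values greater than factor * median of the series."""
    threshold = factor * median(values)
    return [v > threshold for v in values]


def detect_outliers(sequence: Sequence[Sequence[float]],
                    min_votes: int = 1,
                    factor: float = 3.0) -> List[int]:
    """Return indices of sub-data flagged by at least min_votes of three parameters."""
    if not sequence:
        return []
    ranges = [abs(max(s) - min(s)) for s in sequence]  # first parameter
    net = [abs(s[-1] - s[0]) for s in sequence]        # second parameter
    stds = [pstdev(s) for s in sequence]               # third parameter
    flags = [flag_high(p, factor) for p in (ranges, net, stds)]
    return [i for i in range(len(sequence))
            if sum(f[i] for f in flags) >= min_votes]


seq = [[10, 11, 12], [12, 13, 14], [14, 40, 16], [16, 17, 18],
       [18, 28, 38], [38, 39, 40], [40, 41, 42]]
any_vote = detect_outliers(seq, min_votes=1)
all_votes = detect_outliers(seq, min_votes=3)
```

Setting `min_votes` to 1, 2, or 3 corresponds to the three alternatives described above. In this sample, the spike in the third sub-data is caught by the range and the standard deviation but not by the first-to-last difference, so it is reported only under the looser settings.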
[0111] In some embodiments, determining the target data in the data sequence based on the outlier includes:
[0112] Based on the changing trend of the outliers, the sub-data in the data sequence is smoothed.
[0113] Based on the smoothed data sequence, the target data is determined; wherein the target data includes the outlier and sub-data that are close to the outlier and have the same changing trend as the outlier.
[0114] Optionally, the difference between the last and first values of the sub-data, $v_{end} - v_{start}$, can be determined and denoted as $e$. If $e > 0$, the sub-data has an increasing trend; if $e < 0$, the sub-data has a decreasing trend. Subsequently, the sub-data in the data sequence can be smoothed based on the changing trends of the sub-data. If the sub-data adjacent to or close to the outlier in the smoothed data sequence has the same changing trend as the outlier, then multiple consecutive sub-data, including the outlier and the sub-data with the same changing trend as the outlier, can be identified as target data. This ensures that the abnormal change data can completely describe how the data storage volume of the target device changes during the abnormal change event.
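One possible reading of the trend-based expansion is sketched below. The text does not specify the smoothing method, so a centered moving average over the per-sub-data differences $e$ is assumed here. The names `moving_average`, `expand_by_trend`, and `radius` are hypothetical.

```python
from typing import List, Sequence, Tuple


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


def moving_average(values: Sequence[float], radius: int) -> List[float]:
    """Centered moving average, truncated at the edges (assumed smoothing method)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def expand_by_trend(sequence: Sequence[Sequence[float]],
                    outlier_idx: int,
                    radius: int = 1) -> Tuple[int, int]:
    """Return the inclusive index range of target data around one outlier."""
    e = [s[-1] - s[0] for s in sequence]   # trend value of each sub-data
    smoothed = moving_average(e, radius)
    trend = sign(e[outlier_idx])           # trend of the outlier itself
    lo = hi = outlier_idx
    # Grow the range while neighbouring sub-data share the outlier's trend.
    while lo > 0 and sign(smoothed[lo - 1]) == trend:
        lo -= 1
    while hi < len(sequence) - 1 and sign(smoothed[hi + 1]) == trend:
        hi += 1
    return lo, hi


seq = [[50, 48], [48, 46], [46, 50], [50, 88], [88, 93], [93, 90], [90, 87]]
target_range = expand_by_trend(seq, outlier_idx=3)
```

Under steady natural growth every sub-data has a positive trend, so this simple sign comparison would expand across the whole sequence. A practical variant would likely compare against a baseline trend first.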
[0115] In some embodiments, determining the target data in the data sequence based on the outlier includes:
[0116] The target data is determined by a series of consecutive outliers, a first number of sub-data points preceding the series of outliers, and / or a second number of sub-data points following the series of outliers.
[0117] In other words, once multiple consecutive outliers have been identified, these consecutive outliers, a first number of sub-data preceding the series of outliers, and / or a second number of sub-data following the series of outliers can be collectively defined as target data. This ensures that the abnormal change data can fully describe how the target device's data storage volume changes during an abnormal change event. Optionally, the first and second numbers can be the same or different.
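The padding of a run of consecutive outliers can be sketched as below. The function name `expand_runs` and the parameters `n_before` and `n_after` (standing for the first and second numbers) are hypothetical. Clamping to the sequence bounds and leaving overlapping ranges unmerged are choices made for this example only.

```python
from typing import List, Sequence, Tuple


def expand_runs(outlier_idx: Sequence[int], length: int,
                n_before: int, n_after: int) -> List[Tuple[int, int]]:
    """Group consecutive outlier indices into runs and pad each run on both sides."""
    runs: List[List[int]] = []
    for i in sorted(set(outlier_idx)):
        if runs and i == runs[-1][1] + 1:
            runs[-1][1] = i        # extend the current run
        else:
            runs.append([i, i])    # start a new run
    # Pad each run and clamp to valid indices; ranges are inclusive.
    return [(max(0, a - n_before), min(length - 1, b + n_after)) for a, b in runs]


ranges = expand_runs([4, 5, 6, 12], length=14, n_before=2, n_after=1)
```

Each returned pair marks the first and last sub-data index of one piece of target data.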
[0118] As shown in Figure 5, this application also provides an electronic device, including:
[0119] The acquisition module 201 is used to acquire first data of the target device, the first data including the data storage amount of the target device at each time in the first time interval;
[0120] The detection module 202 is used to detect abnormal change events in the data storage volume of the target device based on the first data, and to obtain abnormal change data, wherein the abnormal change data can characterize the change pattern of the data storage volume of the target device during the duration of the abnormal change event;
[0121] The estimation module 203 is used to estimate the amount of data stored by the target device when the abnormal change event occurs at a target time, based on the first data and the abnormal change data.
[0122] In some embodiments, the estimation module 203 is specifically used for:
[0123] Retrieve an abnormal dataset and second data from the target device; wherein the abnormal dataset includes multiple abnormal change data and first operation items that trigger the corresponding abnormal change events, and the second data includes one or more second operation items executed by the target device at a target time;
[0124] Match the second operation item in the second data with the first operation item in the abnormal dataset to select abnormal change data from the abnormal dataset;
[0125] Based on the first data and the selected abnormal change data, the data storage volume of the target device is estimated in the event that the abnormal change event occurs at the target time.
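The matching of operation items can be illustrated with the sketch below. The record layout (`AnomalyRecord` with an `operation` label and per-step `deltas`) and the case-insensitive exact-match rule are assumptions for illustration. The text does not prescribe how operation items are represented or compared.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class AnomalyRecord:
    operation: str              # first operation item that triggered the event
    deltas: Tuple[float, ...]   # abnormal change data: volume change per step


def select_anomalies(dataset: Iterable[AnomalyRecord],
                     planned_operations: Iterable[str]) -> List[AnomalyRecord]:
    """Select abnormal change data whose triggering operation is planned at the target time."""
    # Assumed matching rule: exact match after trimming and lower-casing.
    planned = {op.strip().lower() for op in planned_operations}
    return [r for r in dataset if r.operation.strip().lower() in planned]


dataset = [
    AnomalyRecord("full backup", (30.0, 20.0)),
    AnomalyRecord("log rotation", (-5.0,)),
    AnomalyRecord("Full Backup", (40.0,)),
]
picked = select_anomalies(dataset, ["Full backup ", "index rebuild"])
```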
[0126] In some embodiments, the estimation module 203 is specifically used for:
[0127] Based on the first data, the first data storage volume of the target device at the target time is estimated; wherein, the first data storage volume is the data storage volume of the target device under the condition that no abnormal change event occurs at the target time;
[0128] Based on the first data storage volume and the abnormal change data, the second data storage volume of the target device is estimated in the case that the abnormal change event occurs at the target time.
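The two-stage estimate can be sketched as follows. Both stages rest on assumptions not stated in the text: the first data storage volume is extrapolated with an ordinary least-squares line, and the abnormal change data is modelled as per-step volume changes whose peak cumulative increase is added to that baseline. The function names are hypothetical.

```python
from typing import Sequence


def estimate_baseline(times: Sequence[float], volumes: Sequence[float],
                      target_time: float) -> float:
    """First data storage volume: linear extrapolation with no abnormal event (assumed model)."""
    n = len(times)
    if n < 2 or n != len(volumes):
        raise ValueError("need at least two (time, volume) pairs of equal length")
    mt = sum(times) / n
    mv = sum(volumes) / n
    sxx = sum((t - mt) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be equal")
    slope = sum((t - mt) * (v - mv) for t, v in zip(times, volumes)) / sxx
    return mv + slope * (target_time - mt)


def estimate_with_anomaly(baseline: float, deltas: Sequence[float]) -> float:
    """Second data storage volume: baseline plus the peak cumulative increase of the event."""
    running = 0.0
    peak = 0.0
    for d in deltas:
        running += d
        peak = max(peak, running)
    return baseline + peak


first_volume = estimate_baseline([0, 1, 2, 3], [100, 110, 120, 130], target_time=5)
second_volume = estimate_with_anomaly(first_volume, [30.0, 20.0, -10.0])
```

Taking the peak rather than the final cumulative change reflects the worst case for capacity alarms. For an event that only decreases the stored volume, this sketch returns the baseline unchanged.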
[0129] In some embodiments, the detection module 202 is specifically used for:
[0130] Based on the first data, a data sequence is generated consisting of multiple sub-data arranged in chronological order; wherein, the sub-data is formed by splitting the first data according to a preset time interval;
[0131] The anomalous change data is generated based on the outliers in the data sequence.
[0132] In some embodiments, the detection module 202 is specifically used for:
[0133] Identify outliers in the data sequence;
[0134] Based on the outliers, target data in the data sequence is determined; wherein the target data includes multiple consecutive sub-data and at least the outliers;
[0135] The abnormal change data is generated based on the target data.
[0136] In some embodiments, the detection module 202 is specifically used for:
[0137] The sub-data in the data sequence is preprocessed to obtain the variation parameters of the sub-data; the variation parameters can characterize the degree of change in the data storage volume of the target device in the corresponding time period.
[0138] Based on the variation parameters, outliers in the data sequence are identified.
[0139] In some embodiments, the detection module 202 is specifically used for:
[0140] The sub-data in the data sequence is preprocessed to obtain different types of variation parameters for the sub-data; different types of variation parameters can characterize the degree of change in the data storage volume of the target device in the corresponding time period from different dimensions.
[0141] Based on each of the variation parameters, outliers in the data sequence are determined.
[0142] If at least one of the multiple variation parameters characterizes the sub-data as an outlier, then the sub-data is identified as an outlier.
[0143] In some embodiments, the detection module 202 is specifically used for:
[0144] The target data is determined by a series of consecutive outliers, a first number of sub-data points preceding the series of outliers, and / or a second number of sub-data points following the series of outliers.
[0145] In some embodiments, the detection module 202 is specifically used for:
[0146] Based on the changing trend of the outliers, the sub-data in the data sequence is smoothed.
[0147] Based on the smoothed data sequence, the target data is determined; wherein the target data includes the outlier and sub-data that are close to the outlier and have the same changing trend as the outlier.
[0148] As shown in Figure 6, the fifth embodiment of this application also provides an electronic device, including at least a memory 301 and a processor 302. The memory 301 stores a program, and the processor 302 implements the method described in any of the above embodiments when executing the program stored in the memory 301.
[0149] The sixth embodiment of this application also provides a computer-readable storage medium storing computer-executable instructions, wherein executing the computer-executable instructions in the computer-readable storage medium implements the method described in any of the above embodiments.
[0150] Those skilled in the art will understand that embodiments of this application can be provided as methods, electronic devices, computer-readable storage media, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware. Furthermore, this application can take the form of a computer program product implemented on one or more computer-readable storage media containing computer-readable program code. When implemented in software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
[0151] The aforementioned processor can be a general-purpose processor, a digital signal processor, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The aforementioned PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof. The general-purpose processor can be a microprocessor or any conventional processor, etc.
[0152] The aforementioned memory may include volatile memory in computer-readable media, such as random access memory (RAM), and / or non-volatile memory, such as read-only memory (ROM) or flash memory. Memory is an example of computer-readable media.
[0153] The aforementioned readable storage medium may be a magnetic disk, optical disk, DVD, USB flash drive, read-only memory (ROM), random access memory (RAM), etc. This application does not limit the specific form of the storage medium.
[0154] The above embodiments are merely exemplary embodiments of this application and are not intended to limit this application. The scope of protection of this application is defined by the claims. Those skilled in the art can make various modifications or equivalent substitutions to this application within its substance and scope of protection, and such modifications or equivalent substitutions should also be considered to fall within the scope of protection of this application.
Claims
1. A method for estimating data storage capacity, comprising: acquiring first data of a target device, the first data including the data storage volume of the target device at each moment in a first time interval; detecting, based on the first data, abnormal change events in the data storage volume of the target device, and obtaining abnormal change data, wherein the abnormal change data can characterize the way in which the data storage volume of the target device changes during the duration of the abnormal change event, and the abnormal change event includes events in which the data storage volume of the target device abnormally increases, decreases, or fluctuates within a short period of time; and estimating, based on the first data and the abnormal change data, the data storage volume of the target device in the event that the abnormal change event occurs at a target time, including: retrieving an abnormal dataset and second data from the target device, wherein the abnormal dataset includes multiple abnormal change data and first operation items that trigger the corresponding abnormal change events, and the second data includes one or more second operation items executed by the target device at the target time; matching the second operation items in the second data with the first operation items in the abnormal dataset to select abnormal change data from the abnormal dataset; and estimating, based on the first data and the selected abnormal change data, the data storage volume of the target device in the event that the abnormal change event occurs at the target time.
2. The method of claim 1, wherein, Based on the first data and the abnormal change data, an estimate is made of the data storage volume of the target device in the event of the abnormal change event at a target time, including: Based on the first data, the first data storage volume of the target device at the target time is estimated; wherein, the first data storage volume is the data storage volume of the target device under the condition that no abnormal change event occurs at the target time; Based on the first data storage volume and the abnormal change data, the second data storage volume of the target device is estimated in the case that the abnormal change event occurs at the target time.
3. The method of claim 1, wherein, The step of detecting abnormal changes in the data storage volume of the target device based on the first data and obtaining abnormal change data includes: Based on the first data, a data sequence is generated consisting of multiple sub-data arranged in chronological order; wherein, the sub-data is formed by splitting the first data according to a preset time interval; The anomalous change data is generated based on the outliers in the data sequence.
4. The method of claim 3, wherein, The process of generating the anomalous change data based on outliers in the data sequence includes: Identify outliers in the data sequence; Based on the outliers, target data in the data sequence is determined; wherein the target data includes multiple consecutive sub-data and at least the outliers; The abnormal change data is generated based on the target data.
5. The method of claim 4, wherein, The step of determining outliers in the data sequence includes: The sub-data in the data sequence is preprocessed to obtain the variation parameters of the sub-data; the variation parameters can characterize the degree of change in the data storage volume of the target device in the corresponding time period. Based on the variation parameters, outliers in the data sequence are identified.
6. The method of claim 4, wherein, The step of determining outliers in the data sequence includes: The sub-data in the data sequence is preprocessed to obtain different types of variation parameters for the sub-data; different types of variation parameters can characterize the degree of change in the data storage volume of the target device in the corresponding time period from different dimensions. Based on each of the variation parameters, outliers in the data sequence are determined. If at least one of the multiple variation parameters characterizes the sub-data as an outlier, then the sub-data is identified as an outlier.
7. The method of claim 4, wherein, The step of determining the target data in the data sequence based on the outlier points includes: The target data is determined by a series of consecutive outliers, a first number of sub-data points preceding the series of outliers, and / or a second number of sub-data points following the series of outliers.
8. The method of claim 4, wherein, The step of determining the target data in the data sequence based on the outliers includes: Based on the changing trend of the outliers, the sub-data in the data sequence is smoothed. Based on the smoothed data sequence, the target data is determined; wherein the target data includes the outlier and sub-data that are close to the outlier and have the same changing trend as the outlier.
9. An electronic device, comprising: an acquisition module, used to acquire first data of a target device, the first data including the data storage volume of the target device at each time in a first time interval; a detection module, used to detect abnormal change events in the data storage volume of the target device based on the first data, and to obtain abnormal change data, wherein the abnormal change data can characterize the change pattern of the data storage volume of the target device during the duration of the abnormal change event, and the abnormal change events include events in which the data storage volume of the target device abnormally increases, decreases, or fluctuates within a short period of time; and an estimation module, used to estimate, based on the first data and the abnormal change data, the data storage volume of the target device in the event that the abnormal change event occurs at a target time; wherein the estimation module is also used to: retrieve an abnormal dataset and second data of the target device, wherein the abnormal dataset includes multiple abnormal change data and first operation items that trigger the corresponding abnormal change events, and the second data includes one or more second operation items executed by the target device at the target time; match the second operation items in the second data with the first operation items in the abnormal dataset to select abnormal change data from the abnormal dataset; and estimate, based on the first data and the selected abnormal change data, the data storage volume of the target device in the event that the abnormal change event occurs at the target time.