A data table generation method, device and equipment

By using metric intervals and time intervals as index keys in the target time-series data table of the time-series database, the problem of low storage and query efficiency in the existing technology is solved, realizing efficient storage and query of time-series data and expanding the application of time-series databases in the IoT field.

CN118689871B · Active · Publication Date: 2026-06-23 · HANGZHOU HIKVISION DIGITAL TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HANGZHOU HIKVISION DIGITAL TECHNOLOGY CO LTD
Filing Date
2023-03-22
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing time-series databases store and query time-series data inefficiently. Because they use the label values of time-series data as index keys, queries by metric value and time are slow.

Method used

By generating a target time series data table that uses the measurement interval and time interval as index keys, and by storing the correspondence between object label values and their measurement values and time values, efficient storage and querying of time series data can be achieved.

Benefits of technology

It improves the storage and query efficiency of time series data, shortens query time, and expands the storage and query capabilities of time series databases in the IoT field.



Abstract

The application provides a data table generation method, device and equipment, the method comprising: obtaining an initial time sequence data table; generating a plurality of measurement intervals based on all measurement values in the initial time sequence data table, and generating a plurality of time intervals based on all time values in the initial time sequence data table; generating a target time sequence data table based on the initial time sequence data table, the target time sequence data table comprising a plurality of target time sequence data table entries, the index key of the target time sequence data table entry being the measurement interval and the time interval, the target time sequence data table entry comprising a column field corresponding to the measurement interval and the time interval, the column field comprising an object label value, the measurement value of the object label value in the initial time sequence data table being in the measurement interval, and the time value of the object label value in the initial time sequence data table being in the time interval. Through the technical solution of the application, efficient time sequence data storage and query can be performed, the query efficiency is relatively high, and the query time is relatively short.

Description

Technical Field

[0001] This application relates to the field of Internet technology, and in particular to a method, apparatus and device for generating data tables.

Background Technology

[0002] A time series database stores time-tagged data, that is, data that changes in chronological order (time series data). Time series data is used in a wide range of application scenarios. Its characteristics are: high generation frequency (e.g., multiple time series data can be generated within one second), dependence on the collection time (each time series data corresponds to a unique time), numerous collection points, and large information volume (e.g., tens of thousands of collection points, each generating time series data every second, resulting in tens of gigabytes of data per day).

[0003] When a time series database stores time series data, the label value of the time series data is used as the index key. With this approach, time series data can be neither stored nor queried efficiently, so query efficiency is low and query time is long.

Summary of the Invention

[0004] This application provides a method for generating a data table, the method comprising:

[0005] Obtain an initial time series data table, which includes multiple initial time series data table entries. The index key of each initial time series data table entry is an object label value, and each initial time series data table entry includes a metric value and a time value corresponding to the object label value.

[0006] Multiple metric intervals are generated based on all metric values in the initial time series data table, and multiple time intervals are generated based on all time values in the initial time series data table;

[0007] A target time series data table is generated based on the initial time series data table. The target time series data table includes multiple target time series data table entries. The index key of each target time series data table entry is a metric interval and a time interval. Each target time series data table entry includes column fields corresponding to the metric interval and the time interval. The column fields include object label values. The metric value corresponding to the object label value in the initial time series data table is within the metric interval, and the time value corresponding to the object label value in the initial time series data table is within the time interval. The number of column fields may be the same or different for different target time series data table entries.

[0008] This application provides a data table generation apparatus, the apparatus comprising:

[0009] The acquisition module is used to acquire an initial time series data table, which includes multiple initial time series data table entries. The index key of each initial time series data table entry is an object label value, and each initial time series data table entry includes a metric value and a time value corresponding to the object label value.

[0010] The generation module is used to generate multiple metric intervals based on all metric values in the initial time series data table, and to generate multiple time intervals based on all time values in the initial time series data table.

[0011] A processing module is used to generate a target time series data table based on the initial time series data table. The target time series data table includes multiple target time series data table entries. The index key of each target time series data table entry is a metric interval and a time interval. Each target time series data table entry includes at least one column field corresponding to the metric interval and the time interval. Each column field includes an object label value. The metric value corresponding to the object label value in the initial time series data table is within the metric interval, and the time value corresponding to the object label value in the initial time series data table is within the time interval. Different target time series data table entries may have the same or different number of column fields.

[0012] This application provides an electronic device, including: a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions that can be executed by the processor; the processor is used to execute the machine-executable instructions to implement the data table generation method of the example above.

[0013] As can be seen from the above technical solutions, in the embodiments of this application a target time-series data table can be generated based on the initial time-series data table. The target time-series data table includes multiple target time-series data table entries. The index key of each target time-series data table entry is a measurement interval and a time interval, and each entry includes column fields corresponding to that measurement interval and time interval, the column fields including object label values. The target time-series data table thus stores the correspondence between measurement intervals, time intervals, and object label values, while the initial time-series data table stores the correspondence between object label values and their measurement values and time values. Together, the two tables allow time-series data to be stored and queried efficiently, with high query efficiency and short query time. Efficient storage and querying by measurement interval and time interval is achieved on top of the time-series database, which enables efficient measurement queries and expands the storage and query capabilities of time-series databases in the IoT (Internet of Things) field.

Description of the Drawings

[0014] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments of this application or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in this application. For those skilled in the art, other drawings can be obtained based on these drawings of the embodiments of this application.

[0015] Figure 1 is a flowchart illustrating a data table generation method according to one embodiment of this application;

[0016] Figure 2 is a flowchart illustrating a data table generation method according to one embodiment of this application;

[0017] Figure 3 is a flowchart illustrating a data query method in one embodiment of this application;

[0018] Figure 4 is a schematic diagram of the structure of a data table generation device according to one embodiment of this application;

[0019] Figure 5 is a hardware structure diagram of an electronic device according to one embodiment of this application.

Detailed Implementation

[0020] The terminology used in the embodiments of this application is for the purpose of describing particular embodiments only and is not intended to limit the application. The singular forms “a,” “said,” and “the” as used in this application and claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term “and/or” as used herein refers to any and all possible combinations comprising one or more of the associated listed items.

[0021] It should be understood that although the terms first, second, third, etc., may be used to describe various information in embodiments of this application, such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" may also be interpreted as "when," "upon," or "in response to a determination."

[0022] The embodiments of this application propose a data table generation method. Figure 1 is a flowchart of this data table generation method, which may include the following steps:

[0023] Step 101: Obtain the initial time series data table. The initial time series data table may include multiple initial time series data table entries. The index key of the initial time series data table entry may be the object label value. The initial time series data table entry may also include the measurement value, time value and application data corresponding to the object label value.

[0024] Step 102: Generate multiple metric intervals based on all metric values in the initial time series data table, and generate multiple time intervals based on all time values in the initial time series data table.

[0025] For example, generating multiple metric intervals based on all metrics in the initial time-series data table may include, but is not limited to: sorting all metrics in the initial time-series data table according to their natural order; for example, sorting all metrics in ascending order or descending order; and dividing all metrics into multiple metric intervals based on the sorting result. Alternatively, one metric interval may be generated for each metric in the initial time-series data table.

[0026] For example, if the metric values in the initial time series data table are of numeric type, the natural order of the metric values can be their numerical order; or, if the metric values in the initial time series data table are of string type, the natural order of the metric values can be their ASCII code order.

[0027] Step 103: Generate a target time series data table based on the initial time series data table. The target time series data table may include multiple target time series data table entries. The index key of the target time series data table entry may be a measurement interval and a time interval. The target time series data table entry may also include column fields corresponding to the measurement interval and the time interval. The column fields include object label values. The measurement value corresponding to the object label value in the initial time series data table is within the measurement interval, and the time value corresponding to the object label value in the initial time series data table is within the time interval. The number of column fields of different target time series data table entries may be the same or different.

[0028] Saying that the number of column fields may be the same or different means that target time series data table entries are not required to have the same number of column fields. How many column fields each entry has depends on the content of the initial time series data table. For instance, target time series data table entry 1 may have 8 column fields, entry 2 may have 8, entry 3 may have 6, entry 4 may have 10, entry 5 may have 6, and so on.

[0029] For example, generating a target time-series data table based on the initial time-series data table may include, but is not limited to, the following. Multiple index combinations are generated from the multiple metric intervals and the multiple time intervals, each index combination including one metric interval and one time interval. For each index combination, the initial time-series data table entries corresponding to the index combination are selected from the initial time-series data table, that is, the entries whose metric value is within the metric interval and whose time value is within the time interval. The number K of initial time-series data table entries corresponding to the index combination is determined, where K may be a positive integer. A target time-series data table entry corresponding to the index combination is then generated: its index key is the metric interval and the time interval, and it includes K column fields holding the object label values from the K initial time-series data table entries corresponding to the index combination.
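The generation procedure in this paragraph can be sketched in Python as follows. This is only an illustrative sketch: the in-memory list and dictionary, the closed `[low, high]` interval representation, the time values expressed as minutes since midnight, and all names (`initial_table`, `build_target_table`, and so on) are assumptions made for the example and are not taken from the application.

```python
from itertools import product

# Hypothetical initial table entries: (object label value, metric value, time value).
# Time values are minutes since midnight to keep the sketch short.
initial_table = [
    ("label_1", "abcdef", 600),
    ("label_2", "badefg", 600),
    ("label_1", "abcdeh", 605),
    ("label_2", "abcdeh", 605),
]

# Assumed closed intervals [low, high], compared in natural order.
metric_intervals = [("abcdef", "abcdeh"), ("badefa", "badefz")]
time_intervals = [(600, 604), (605, 609)]


def build_target_table(entries, metric_ivs, time_ivs):
    """Return {(metric interval, time interval): [object label values]}."""
    target = {}
    # Each index combination pairs one metric interval with one time interval.
    for m_iv, t_iv in product(metric_ivs, time_ivs):
        labels = [
            label
            for label, metric, time in entries
            if m_iv[0] <= metric <= m_iv[1] and t_iv[0] <= time <= t_iv[1]
        ]
        k = len(labels)  # number of matching initial entries
        if k > 0:  # only combinations with K >= 1 produce a target entry
            target[(m_iv, t_iv)] = labels  # K column fields
    return target


target_table = build_target_table(initial_table, metric_intervals, time_intervals)
```

In this sketch the entries of `target_table` hold one, two and one object label values respectively, which mirrors the statement that different target entries may have different numbers of column fields.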

[0030] In one possible implementation, the metric value corresponding to the object tag value can be a value that changes over time; wherein, the metric value corresponding to the object tag value may include, but is not limited to: the location value corresponding to the object tag value, and the location value is a value after encoding the location; or, the cell number value corresponding to the object tag value; or, the temperature value corresponding to the object tag value.

[0031] In one possible implementation, after generating the target time-series data table based on the initial time-series data table, upon receiving a data query instruction, if the data query instruction includes time index information and metric index information, the target time-series data table entry corresponding to the time index information and the metric index information can be queried from the target time-series data table. The time index information can be within a time interval in the target time-series data table entry, and the metric index information can be within a metric interval in the target time-series data table entry. Based on this, the initial time-series data table can be queried using at least one object label value in the target time-series data table entry to obtain the initial time-series data table entry corresponding to that object label value.
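The two-stage query described in this paragraph might look like the following Python sketch. The dictionaries, the closed-interval representation, the minutes-since-midnight time values and the function name `query_by_metric_and_time` are all hypothetical; a real time-series database would locate the target entry through its index key rather than by the linear scan used here for brevity.

```python
# Hypothetical target table: (metric interval, time interval) -> object label values.
target_table = {
    (("abcdef", "abcdeh"), (600, 604)): ["label_1"],
    (("abcdef", "abcdeh"), (605, 609)): ["label_1", "label_2"],
}

# Hypothetical initial table: object label value -> list of (metric value, time value).
initial_table = {
    "label_1": [("abcdef", 600), ("abcdeh", 605)],
    "label_2": [("badefg", 600), ("abcdeh", 605)],
}


def query_by_metric_and_time(metric, time):
    """Find the target entry covering (metric, time), then fetch initial entries."""
    for (m_iv, t_iv), labels in target_table.items():
        if m_iv[0] <= metric <= m_iv[1] and t_iv[0] <= time <= t_iv[1]:
            # Second stage: look up the initial table by object label value.
            return {label: initial_table[label] for label in labels}
    return {}  # no target entry covers the requested metric and time


result = query_by_metric_and_time("abcdeh", 606)
```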

[0032] In one possible implementation, after receiving a data query instruction, if the data query instruction includes tag index information, the initial time series data table entry corresponding to the tag index information is queried from the initial time series data table, and the tag index information matches the object tag value in the initial time series data table entry.

[0033] As can be seen from the above technical solutions, in the embodiments of this application a target time-series data table can be generated based on the initial time-series data table. The target time-series data table includes multiple target time-series data table entries. The index key of each target time-series data table entry is a measurement interval and a time interval, and each entry includes column fields corresponding to that measurement interval and time interval, the column fields including object label values. The target time-series data table thus stores the correspondence between measurement intervals, time intervals, and object label values, while the initial time-series data table stores the correspondence between object label values and their measurement values and time values. Together, the two tables allow time-series data to be stored and queried efficiently, with high query efficiency and short query time. Efficient storage and querying by measurement interval and time interval is achieved on top of the time-series database, which enables efficient measurement queries and expands the storage and query capabilities of time-series databases in the IoT (Internet of Things) field.

[0034] The technical solutions described above in the embodiments of this application will be explained below in conjunction with specific application scenarios.

[0035] When a time series database stores time series data with the label value of the time series data as the index key, the data can be neither stored nor queried efficiently, so query efficiency is low and query time is long.

[0036] In response to the above finding, this application proposes a data table generation method in which a target time series data table is generated based on the initial time series data table. The target time series data table uses the measurement interval and the time interval as index keys, which allows time series data to be stored and queried efficiently, with high query efficiency and short query time.

[0037] Figure 2 illustrates a process for generating a data table, which may include:

[0038] Step 201: Obtain the initial time series data table. The initial time series data table may include multiple initial time series data table entries. The index key of the initial time series data table entry may be the object label value. The initial time series data table entry may also include the measurement value, time value and application data corresponding to the object label value.

[0039] For example, the initial time-series data table can be a data table in a time-series database. Each table entry (i.e., each record, such as each row of the initial time-series data table as a table entry) in the initial time-series data table can be called an initial time-series data table entry. The initial time-series data table can be a data table based on HBase (Hadoop Database, a distributed database), or it can be a data table based on other databases; there are no restrictions on this.

[0040] For example, for each initial time series data table entry in the initial time series data table, the index key of the entry is the object tag value. The object tag value is a unique identifier for the time series data, and the time series data can be retrieved through its object tag value. For instance, a given object (such as a user object, device object, or vehicle object) can correspond to at least one tag value (e.g., a device object corresponds to at least one of a mobile phone number, an IMEI number, and a SIM card number). Encoding the object's tag values yields the object tag value of the time series data. For example, all of the object's tag values can be combined and dictionary-encoded to obtain the object tag value.
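As an illustration of combining tag values and dictionary-encoding them, the following Python sketch assigns a sequential integer code to each distinct combination. The application does not specify the dictionary encoding scheme, so the sequential codes, the class name `DictionaryEncoder` and its `encode` method are assumptions for this example only.

```python
class DictionaryEncoder:
    """Assigns a compact integer code to each distinct combination of tag values."""

    def __init__(self):
        self._codes = {}  # combined tag values -> code

    def encode(self, *tag_values):
        key = tuple(tag_values)  # combine all tag values of the object
        if key not in self._codes:
            # Assumption: codes are handed out sequentially, starting at 1.
            self._codes[key] = len(self._codes) + 1
        return self._codes[key]


encoder = DictionaryEncoder()
object_tag_1 = encoder.encode("11111111111", "cccccccc", "ssssssss")
object_tag_2 = encoder.encode("12222222222", "ddddddd", "hhhhhhh")
# Encoding the same combination again returns the same object tag value.
object_tag_1_again = encoder.encode("11111111111", "cccccccc", "ssssssss")
```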

[0041] For example, the initial time series data entry may also include a time value in the time series data, which is used to indicate the time when the time series data was generated. For instance, if the time series data was generated at time A, then the time value in the time series data can be time A.

[0042] For example, the initial time series data table entry may also include the metric value in the time series data. A metric value is a value that changes over time, that is, a value that describes how an object changes over time. For instance, the metric value in time series data may include, but is not limited to: a location value in the time series data, where the location value is an encoded value of a location (such as a location area); or a cell number value in the time series data; or a temperature value in the time series data. Of course, these are just a few examples of metric values, and the type of metric value is not limited.

[0043] When the metric is a location value, since an object may be in different locations at different times, its location in different time series data changes over time. Therefore, location values (such as latitude and longitude) can be used as metrics. A location value can be used as the metric directly, or it can be encoded first and the encoded value used as the metric. For example, GeoHash encoding can be applied to the location value and the encoded value used as the metric. GeoHash encoding is an address encoding method that encodes two-dimensional latitude and longitude data into a string.
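For readers unfamiliar with GeoHash, the following Python sketch shows the standard public GeoHash algorithm: longitude and latitude ranges are bisected alternately, and every 5 resulting bits are mapped to one base-32 character. It is not taken from the application; the function name `geohash_encode` and the sample coordinates are chosen for illustration, and the sample strings in this document (such as abcdef) are placeholders rather than real GeoHash output.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # GeoHash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair into a GeoHash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


code = geohash_encode(57.64911, 10.40744, precision=5)
```

Because each additional character only refines the cell selected by the previous characters, a shorter GeoHash is always a prefix of a longer one for the same point, which is why nearby locations tend to share a prefix.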

[0044] When the metric is a cell number, an object may be in different cells at different times (a cell being a base station or part of the coverage area of a base station, such as one sector antenna), so the cell number in different time series data changes over time and can be used as the metric.

[0045] When the metric is a temperature value, since an object may correspond to different temperatures at different times, the temperature in different time series data changes over time, so the temperature value is used as the metric.

[0046] For example, the initial time series data table entry may also include application data from the time series data, including data other than label values, time values, and measure values, without restriction. Of course, the time series data may also only include label values, time values, and measure values, i.e., the application data is empty.

[0047] In one possible implementation, the device object (such as a mobile device) can report its location area every 5 minutes. An example of time-series data can be found in Table 1.

[0048] Table 1

[0049]

Phone number | IMEI     | Card number | Reporting time     | Location area
11111111111  | cccccccc | ssssssss    | 2022.3.31 10:00:00 | abcdef
12222222222  | ddddddd  | hhhhhhh     | 2022.3.31 10:00:00 | badefg
11111111111  | cccccccc | ssssssss    | 2022.3.31 10:05:00 | abcdeh
12222222222  | ddddddd  | hhhhhhh     | 2022.3.31 10:05:00 | abcdeh
…            | …        | …           | …                  | …

[0050] In Table 1, the mobile phone number 11111111111, the IMEI number cccccccc, and the card number ssssssss are the tag values in the time series data. Combining these three tag values and dictionary-encoding the result yields the object tag value corresponding to this time series data. The reporting time 2022.3.31 10:00:00 is the time value in the time series data and indicates when the time series data was generated. The location area abcdef is the metric value in the time series data (i.e., the metric value can be a location value). It is obtained by encoding the location (such as the latitude and longitude) of the device object, for example with GeoHash encoding.

[0051] Based on the time series data shown in Table 1, an initial time series data table as shown in Table 2 can be constructed. Of course, Table 2 is just an example of an initial time series data table, and there are no restrictions on this initial time series data table.

[0052] Table 2

[0053]

[0054]

[0055] In Table 2, object label value 1 is obtained by dictionary encoding of the mobile phone number 11111111111, the IMEI number cccccccc, and the card number ssssssss. Object label value 2 is obtained by dictionary encoding of the mobile phone number 12222222222, the IMEI number ddddddd, and the card number hhhhhhh. Object label values 1 and 2 are the index keys of the initial time series data table entries. The first initial time series data table entry also includes the metric value abcdef and the time value 2022.3.31 10:00:00 corresponding to object label value 1, the second initial time series data table entry also includes the metric value badefg and the time value 2022.3.31 10:00:00 corresponding to object label value 2, and so on.

[0056] Suppose that, based on the initial time series data table shown in Table 2, it is necessary to query the device objects that appeared in a certain area within a certain period of time, for example area abcdef from 9:00 AM to 11:00 AM on March 31, 2022. Because the index key of the initial time series data table is the object label value, the whole table shown in Table 2 has to be traversed. This traversal is what makes the query inefficient and the query time long.
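The cost of such a traversal can be illustrated with a small Python sketch. The list layout, the label names and the time values (minutes since midnight) are hypothetical; the point is only that every entry is examined regardless of how few match.

```python
# Hypothetical initial table entries: (object label value, metric value, time value).
initial_table = [
    ("label_1", "abcdef", 600),
    ("label_2", "badefg", 600),
    ("label_1", "abcdeh", 605),
    ("label_2", "abcdeh", 605),
]


def scan_for_area(area, t_start, t_end):
    """Full traversal: the label-keyed table offers no shortcut by area or time."""
    hits = set()
    examined = 0
    for label, metric, time in initial_table:
        examined += 1  # every entry is read, matching or not
        if metric == area and t_start <= time <= t_end:
            hits.add(label)
    return hits, examined


# Area abcdef between 9:00 (540) and 11:00 (660).
hits, examined = scan_for_area("abcdef", 540, 660)
```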

[0057] In response to the above findings, in this embodiment, after obtaining the initial time series data table, a target time series data table can be generated based on the initial time series data table. The generation process of the target time series data table is described in the following process.

[0058] Step 202: Generate multiple metric intervals based on all metric values in the initial time series data table.

[0059] For example, metric values can be divided into discrete and non-discrete metric values. Discrete metric values take a finite or countable number of distinct values; for instance, a random variable X that can take only a finite or countable number of distinct values is called a discrete random variable. Non-discrete metric values take an uncountable number of values; for instance, a random variable X that can take an uncountable number of values is called a non-discrete random variable. Height, for example, can take any value from 0 cm to 200 cm, which is an uncountable number of values, so height is a non-discrete value. In this embodiment, for efficiency, non-discrete metric values can be approximated by rounding to a fixed number of significant digits, which converts them into discrete metric values. All metric values can therefore be treated as discrete metric values.
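One simple way to approximate a non-discrete metric value with a fixed number of significant digits is sketched below in Python. The helper name `round_significant` and the use of scientific-notation formatting are choices made for this example, not part of the application.

```python
def round_significant(value, digits=2):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Scientific notation keeps exactly `digits` significant digits.
    return float(f"{value:.{digits - 1}e}")


height_cm = round_significant(172.48, 3)
temperature = round_significant(36.6789, 2)
small_value = round_significant(0.012345, 2)
```

After rounding, only a finite set of values can occur within any bounded range, so the metric can be handled as a discrete one.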

[0060] In one possible implementation, for all metrics (such as discrete metrics) in the initial time series data table, a metric interval can be generated for each metric in the initial time series data table, that is, each metric is treated as a separate metric interval. For example, assuming that the initial time series data table includes metric A1, metric A2, metric A3, and so on, metric intervals A1, A2, A3, and so on can be generated, meaning that each metric is treated as a separate metric interval.

[0061] In another possible implementation, the range of the metric values may be relatively large, and using the metric values directly as the index keys of the target time series data table would then make index key lookup inefficient. For this case, a metric value segmentation method is proposed: all metric values in the initial time series data table are sorted according to their natural order, and based on the sorting result, all metric values are divided into multiple metric intervals.

[0062] For example, if the metric values in the initial time series data table are numerical values, their natural order can be their numerical order. All metric values in the initial time series data table can then be sorted by numerical order, either from smallest to largest or from largest to smallest.

[0063] For example, if the metric values in the initial time series data table are numerical values such as 1, 2, 3, 4, etc., all metric values can be sorted based on their numerical order.

[0064] For example, if the metric values in the initial time-series data table are of string type, their natural order can be their ASCII (American Standard Code for Information Interchange) order. All metric values in the initial time-series data table can then be sorted by ASCII order, either ascending or descending.

[0065] For example, if the metric values in the initial time series data table are string-type values such as abcdef, bafg, abcdeh, abcdeh, etc., each of these strings can be converted to its ASCII codes. The order of the ASCII codes of the metric values can then be determined, and all metric values can be sorted based on that order.
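The two natural orderings described above can be illustrated with Python's built-in sorting. The sample lists are hypothetical; encoding each string to ASCII bytes makes the ASCII-code comparison explicit, although plain string comparison gives the same order for ASCII-only strings.

```python
numeric_metrics = [4, 1, 3, 2]
string_metrics = ["abcdef", "bafg", "abcdeh", "abcdeh"]

# Numeric type: natural order is numerical order, ascending or descending.
numeric_ascending = sorted(numeric_metrics)
numeric_descending = sorted(numeric_metrics, reverse=True)

# String type: compare by ASCII codes, byte by byte.
ascii_codes = list("abcdef".encode("ascii"))
string_ascending = sorted(string_metrics, key=lambda s: s.encode("ascii"))
string_descending = sorted(
    string_metrics, key=lambda s: s.encode("ascii"), reverse=True
)
```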

[0066] For example, after all metric values in the initial time-series data table have been sorted, they can be divided into multiple metric intervals based on the sorting result. This division involves a trade-off between index accuracy and index storage space. The finer the segmentation (i.e., the more metric intervals), the higher the index accuracy, but the larger the required index storage space. Conversely, the coarser the segmentation (i.e., the fewer metric intervals), the lower the index accuracy, but the smaller the required index storage space. The number of metric intervals M can be chosen based on the actual situation.

[0067] For example, to divide all metrics into M metric intervals, the minimum and maximum metric values can be determined from the sorting result, and the range between them can then be divided into M equal parts, with each part corresponding to one metric interval, thus obtaining M metric intervals.
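A minimal sketch of this equal-width division, assuming numeric metric values. The function name `equal_width_intervals` and the representation of an interval as a `(low, high)` pair are assumptions made for illustration.

```python
def equal_width_intervals(values, m):
    """Split [min(values), max(values)] into m equal-width (low, high) pairs."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / m
    intervals = []
    for i in range(m):
        low = lo + i * width
        # Pin the last upper bound to the true maximum to avoid float drift.
        high = hi if i == m - 1 else lo + (i + 1) * width
        intervals.append((low, high))
    return intervals


intervals = equal_width_intervals([1, 2, 3, 4, 9], m=4)
```

Equal-width intervals are simple to compute, but if the values are skewed some intervals may hold many more entries than others.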

[0068] For example, to divide all metrics into M metric intervals, the quotient of the total number of metrics and M can be rounded down or up to obtain N. Based on the sorting result, all metrics are then divided into M groups: each of the first M-1 groups contains N metrics, and the last group contains the remaining metrics. For each group, the minimum and maximum metric values among its metrics are determined from the sorting result, and these two values bound the corresponding metric interval, thus obtaining M metric intervals.
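The count-based division can be sketched as below. This is an illustrative sketch that uses the rounding-down variant and assumes there are at least M values; the function name `equal_count_intervals` is hypothetical.

```python
def equal_count_intervals(values, m):
    """Group sorted values into m groups and return each group's (min, max)."""
    ordered = sorted(values)
    n = len(ordered) // m  # rounding down; assumes len(values) >= m
    if n == 0:
        raise ValueError("need at least m values")
    intervals = []
    for i in range(m):
        start = i * n
        # The first m-1 groups hold n values; the last holds the remainder.
        group = ordered[start:start + n] if i < m - 1 else ordered[start:]
        intervals.append((group[0], group[-1]))
    return intervals


count_intervals = equal_count_intervals([5, 1, 9, 3, 7, 2, 8], m=3)
```

Compared with equal-width division, this keeps the number of entries per interval roughly balanced, at the cost of intervals with uneven widths.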

[0069] Of course, the above is just an example of dividing all the metrics into M metric intervals, and there are no restrictions on this.

[0070] In summary, this embodiment proposes a segmentation method for measurement values that yields multiple measurement intervals. The method segments ranges by natural order, mainly because a measurement value query is typically a range query, such as a query over an area range or a temperature range. Take string and numeric types as examples. A string-type measurement value can consist of multiple characters (e.g., 6 characters); if each string-type measurement value represents a location area, the segmentation algorithm can use the first few characters as the segment, because adjacent areas share the same prefix. A numeric-type measurement value, such as a temperature value, can consist of multiple digits; it can be kept to 2 significant digits and the range can be segmented by integer.
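One possible reading of the two segmentation examples above is sketched here. The prefix length of 4 and the unit-width integer segments are assumptions chosen for illustration, since the text does not fix either value; the function names are hypothetical.

```python
import math


def string_segment(area_code, prefix_len=4):
    """Assumed rule: the segment of a location code is its leading prefix."""
    return area_code[:prefix_len]


def numeric_segment(temperature):
    """Assumed rule: a value falls in the integer range [floor, floor + 1)."""
    low = math.floor(temperature)
    return (low, low + 1)


same_segment = string_segment("abcdef") == string_segment("abcdeh")
temperature_segment = numeric_segment(36.57)
```

Under these assumptions, the adjacent areas abcdef and abcdeh share the segment abcd, while bafg falls in a different segment.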

[0071] Step 203: Generate multiple time intervals based on all time values in the initial time series data table.

[0072] For example, all time values in the initial time series data table can be divided into multiple time intervals. This division involves a trade-off between index accuracy and index storage space. The finer the segmentation (i.e., the more time intervals), the higher the index accuracy, but the larger the required index storage space. Conversely, the coarser the segmentation (i.e., the fewer time intervals), the lower the index accuracy, but the smaller the required index storage space. The number of time intervals P can be decided according to the actual situation.

[0073] For example, to divide all time values into P time intervals, the range between the minimum and maximum time values can be divided into P equal parts, each part corresponding to one time interval. Alternatively, a time interval length can be pre-configured, such as 5 minutes or 8 minutes, and the range between the minimum and maximum time values can be divided into P time intervals of that length. The lengths of different time intervals can be the same or different; there is no restriction on this. Of course, the above is just an example of dividing all time values into P time intervals, and this approach is not limiting.
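The pre-configured-length variant can be sketched as follows. The half-open convention `[start, start + width)` and the function name `fixed_width_time_intervals` are assumptions introduced for the example; the timestamps are illustrative.

```python
from datetime import datetime, timedelta


def fixed_width_time_intervals(times, width):
    """Cover [min(times), max(times)] with half-open intervals of one width."""
    start, end = min(times), max(times)
    intervals = []
    cursor = start
    while cursor <= end:
        intervals.append((cursor, cursor + width))
        cursor += width
    return intervals


sample_times = [
    datetime(2022, 3, 31, 9, 0),
    datetime(2022, 3, 31, 9, 3),
    datetime(2022, 3, 31, 9, 12),
]
time_slots = fixed_width_time_intervals(sample_times, timedelta(minutes=5))
```

Here P is not chosen in advance; it follows from the configured length and the span of the data.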

[0074] Step 204: Generate multiple index combinations based on multiple metric intervals and multiple time intervals; for each index combination, the index combination may include a metric interval and a time interval.

[0075] For example, assuming multiple metric intervals include metric interval A1 and metric interval A2, and multiple time intervals include time interval B1 and time interval B2, then index combinations C1, C2, C3, and C4 can be generated. Specifically, index combination C1 includes metric interval A1 and time interval B1, index combination C2 includes metric interval A1 and time interval B2, index combination C3 includes metric interval A2 and time interval B1, and index combination C4 includes metric interval A2 and time interval B2.
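The combinations C1 to C4 above are the Cartesian product of the metric intervals and the time intervals. A minimal sketch, using the interval names from the example as placeholder strings:

```python
from itertools import product

metric_intervals = ["A1", "A2"]
time_intervals = ["B1", "B2"]

# Each index combination pairs one metric interval with one time interval.
index_combinations = list(product(metric_intervals, time_intervals))
```

With M metric intervals and P time intervals, this yields M × P index combinations.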

[0076] Step 205: For each index combination, select the initial time series data table entry corresponding to the index combination from the initial time series data table. The metric value in the selected entry is within the metric interval of the index combination, and the time value in the selected entry is within the time interval of the index combination.

[0077] For example, since each initial time-series data table entry includes the metric value and time value corresponding to an object label value, the entries corresponding to each index combination can be selected from all initial time-series data table entries based on those metric and time values. For instance, the initial time-series data table entries corresponding to index combination C1 (there can be more than one) are those whose metric value is in metric interval A1 and whose time value is in time interval B1. Similarly, the entries corresponding to index combination C2 are those whose metric value is in metric interval A1 and whose time value is in time interval B2, and so on.
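The selection in step 205 can be sketched as a filter over the initial table. This sketch assumes the initial table is a dictionary from object label value to a `(metric, time)` pair, that values are plain numbers, and that intervals are half-open `(low, high)` pairs; all names are hypothetical.

```python
def select_entries(initial_table, metric_interval, time_interval):
    """Return the entries whose metric and time fall in the given intervals."""
    m_lo, m_hi = metric_interval
    t_lo, t_hi = time_interval
    return {
        label: (metric, t)
        for label, (metric, t) in initial_table.items()
        if m_lo <= metric < m_hi and t_lo <= t < t_hi
    }


# Assumed sample data: label -> (metric value, time value).
sample_table = {"obj1": (12, 100), "obj2": (15, 130), "obj3": (27, 110)}
selected = select_entries(sample_table, (10, 20), (100, 200))
```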

[0078] Step 206: For each index combination, generate a target time-series data table entry corresponding to that index combination based on the initial time-series data table entry (which can be at least one). After obtaining the target time-series data table entry for each index combination, all target time-series data table entries are combined to form the target time-series data table.

[0079] For example, for each index combination, the index combination may include a metric interval and a time interval. Based on this, the index key of the target time series data table entry corresponding to the index combination can be the metric interval and the time interval. For instance, for the target time series data table entry corresponding to index combination C1, the index key of the corresponding target time series data table entry is the metric interval A1 and the time interval B1, and so on.

[0080] For example, when generating a target time-series data table entry based on the initial time-series data table entry corresponding to the index combination, if the number of initial time-series data table entries corresponding to the index combination is K (where K can be a positive integer), then the target time-series data table entry includes K column fields. Clearly, since the number of column fields in the target time-series data table entry is the same as the number of initial time-series data table entries corresponding to the index combination, different target time-series data table entries can have the same or different number of column fields.

[0081] For example, if index combination C1 corresponds to 10 initial time-series data table entries, then the target time-series data table entry corresponding to index combination C1 can include 10 column fields. If index combination C2 corresponds to 8 initial time-series data table entries, then the target time-series data table entry corresponding to index combination C2 can include 8 column fields. If index combination C3 corresponds to 10 initial time-series data table entries, then the target time-series data table entry corresponding to index combination C3 can include 10 column fields. If index combination C4 corresponds to 6 initial time-series data table entries, then the target time-series data table entry corresponding to index combination C4 can include 6 column fields.

[0082] For example, the K column fields of the target time-series data table entry can include the object label values from the K initial time-series data table entries corresponding to the index combination. That is, each column field records one object label value, and different column fields record different object label values. For instance, if index combination C1 corresponds to initial time-series data table entries s1-s10, then the first column field of the target time-series data table entry corresponding to C1 records the object label value in entry s1, the second column field records the object label value in entry s2, and so on.

[0083] In summary, this embodiment generates a target time-series data table based on an initial time-series data table. The target time-series data table can include multiple target time-series data table entries. The index key of each target time-series data table entry can be a metric interval and a time interval. Each entry can also include column fields (which can be K column fields) corresponding to the metric interval and the time interval, and these column fields include object label values (i.e., the object label values in the K initial time-series data table entries corresponding to the index combination). The metric value corresponding to each such object label value in the initial time-series data table is within the metric interval, and the corresponding time value is within the time interval.
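Steps 204 to 206 can be put together in one short sketch. It assumes a dictionary-based initial table (label to `(metric, time)`), numeric values, half-open `(low, high)` intervals, and column names Q1, Q2, and so on; the function name `build_target_table` is hypothetical, and combinations with no matching entry are simply skipped here.

```python
from itertools import product


def build_target_table(initial_table, metric_intervals, time_intervals):
    """Map each (metric interval, time interval) key to its label columns."""
    target = {}
    for m_iv, t_iv in product(metric_intervals, time_intervals):
        labels = [
            label
            for label, (metric, t) in initial_table.items()
            if m_iv[0] <= metric < m_iv[1] and t_iv[0] <= t < t_iv[1]
        ]
        if labels:  # K >= 1: one column field per matching label
            target[(m_iv, t_iv)] = {
                f"Q{i}": label for i, label in enumerate(labels, start=1)
            }
    return target


initial = {"obj1": (12, 100), "obj2": (15, 130), "obj3": (27, 110), "obj4": (18, 260)}
target = build_target_table(initial, [(10, 20), (20, 30)], [(100, 200), (200, 300)])
```

Different keys end up with different numbers of columns, which mirrors the variable number of column fields K per target entry.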

[0084] See Table 3 for an example of a target time series data table. There are no restrictions on this target time series data table.

[0085] Table 3

[0086]

[0087] In Table 3, Hour_Term 1 represents a measurement interval and a time interval, such as measurement interval A1 and time interval B1; Hour_Term 2 represents a measurement interval and a time interval, such as measurement interval A1 and time interval B2; Hour_Term 3 represents a measurement interval and a time interval, such as measurement interval A2 and time interval B1; and Hour_Term 4 represents a measurement interval and a time interval, such as measurement interval A2 and time interval B2.

[0088] The object label value is the index key of the initial time series data table entry. For the object label values corresponding to Hour_Term 1, such as object label value 1, object label value 2, and object label value 3, the corresponding metric values in the initial time series data table are in metric interval A1, and the corresponding time values in the initial time series data table are in time interval B1, and so on.

[0089] Q1, Q2, Q3, etc., are column fields. The names of the column fields can be arbitrary, and the number of column fields for different target time series data table entries can be the same or different. For example, Hour_Term 1 corresponds to 10 column fields, i.e., Q1-Q10; Hour_Term 2 corresponds to 8 column fields, i.e., Q1-Q8; Hour_Term 3 corresponds to 10 column fields, i.e., Q1-Q10; and Hour_Term 4 corresponds to 6 column fields, i.e., Q1-Q6.

[0090] For example, based on the target time series data table shown in Table 3, if we need to query the device objects that appear in a certain area within a certain period of time, such as from 9:00 AM to 11:00 AM on March 31, 2022, and the area is abcdef, then we can directly query based on the rowKey (index key) without traversing the target time series data table shown in Table 3. Since no traversal operation is required, time series data can be stored efficiently, time series data can be queried efficiently, the query efficiency is relatively high, and the query time is relatively short.

[0091] In one possible implementation, the target time series data table can be an inverted index data table, and the target time series data table can be stored on disk. The metric interval and time interval serve as the term (index key) of the target time series data table, thereby creating an inverted index for the metric value. Taking advantage of the characteristic that the number of column fields can be different (the number of column fields of different target time series data table items may be the same or different), the structure of the inverted index is designed.

[0092] A metric index (where the metric value serves as the index key) is a type of inverted index. The aforementioned metric interval is a component of the inverted index's term; that is, the metric interval and the time interval together form the inverted index's term. The structure of the inverted index is shown in Table 3. RowKey is the index key, composed of Hour and Term. Hour represents the time interval, such as 2021090101, and Term represents the metric interval. Qualify Family is the column family, which can be any string. The Qualify value serves as the column name, such as Q1, Q2, Q3, etc. From the perspective of the inverted index, docId is the ID, serving as the document ID in the inverted index. Term consists of Hour (time interval) and MeasureSegment (metric interval), serving as the key value of the inverted index. Posting_list is an array composed of all column names under the column family, used to store all document IDs that match a given Term.
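The row layout described above can be sketched with a plain dictionary standing in for the on-disk table. The underscore separator in the row key, the sample term `abcd`, the device identifiers, and the helper names are all assumptions for illustration; the column family level is omitted for brevity.

```python
def make_row_key(hour, term):
    """Assumed layout: the row key is the Hour and the Term joined by '_'."""
    return f"{hour}_{term}"


# One row per term; each column (Q1, Q2, ...) holds one document ID.
inverted_index = {
    make_row_key("2021090101", "abcd"): {"Q1": "device-001", "Q2": "device-002"},
}


def posting_list(index, hour, term):
    """Collect the document IDs stored in the row for the given term."""
    row = index.get(make_row_key(hour, term), {})
    return list(row.values())


hits = posting_list(inverted_index, "2021090101", "abcd")
```

Because the lookup goes straight to one row key, no scan over other rows is needed.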

[0093] In one possible implementation, based on the initial time-series data table and the target time-series data table, this application also proposes a data query method. As shown in Figure 3, the method includes:

[0094] Step 301: Receive a data query instruction. This data query instruction may include time index information and metric index information, or it may include tag index information.

[0095] Step 302: If the data query instruction includes time index information and metric index information, then query the target time series data table entries corresponding to the time index information and the metric index information. When querying the target time series data table entry, the time index information must be within the time interval of the target time series data table entry, and the metric index information must be within the metric interval of the target time series data table entry.

[0096] Step 303: Query the initial time series data table by using the object label value (which can be at least one) in the target time series data table entry to obtain the initial time series data table entry corresponding to the object label value.

[0097] For example, a target time series data table entry may include at least one object label value, such as K object label values (i.e., K object label values stored in K column fields). For each object label value, the initial time series data table is queried by that object label value to obtain the corresponding initial time series data table entry, thereby obtaining K initial time series data table entries corresponding to the K object label values.

[0098] Step 304: If the data query instruction includes tag index information, then query the initial time series data table entry corresponding to the tag index information from the initial time series data table. The tag index information matches the object tag value in the initial time series data table entry, that is, the tag index information is the same as the object tag value.

[0099] For example, when querying the initial time-series data table by object tag value or tag index information, since the index key of the initial time-series data table is the object tag value, the corresponding initial time-series data table item can be retrieved efficiently from the initial time-series data table. After retrieving the initial time-series data table item, the required data can be obtained from it.
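Steps 301 to 304 can be sketched as a two-stage lookup. This is an illustrative sketch that assumes dictionary-based tables, numeric values, sorted non-overlapping half-open intervals, and a query instruction represented as a dictionary with either a `label` key or `metric` and `time` keys; all of these names are hypothetical.

```python
from bisect import bisect_right


def find_interval(intervals, value):
    """Return the half-open (low, high) interval containing value, or None."""
    lows = [low for low, _ in intervals]
    i = bisect_right(lows, value) - 1
    if i >= 0 and value < intervals[i][1]:
        return intervals[i]
    return None


def run_query(instruction, initial_table, target_table, metric_ivs, time_ivs):
    # Step 304: tag index information is matched against the label key.
    if "label" in instruction:
        label = instruction["label"]
        return {label: initial_table[label]} if label in initial_table else {}
    # Step 302: locate the target entry whose intervals contain the query.
    key = (
        find_interval(metric_ivs, instruction["metric"]),
        find_interval(time_ivs, instruction["time"]),
    )
    columns = target_table.get(key, {})
    # Step 303: use each stored label to fetch the initial table entry.
    return {label: initial_table[label] for label in columns.values()}


metric_ivs = [(10, 20), (20, 30)]
time_ivs = [(100, 200), (200, 300)]
initial_rows = {"obj1": (12, 100), "obj2": (15, 130), "obj3": (27, 110)}
target_rows = {
    ((10, 20), (100, 200)): {"Q1": "obj1", "Q2": "obj2"},
    ((20, 30), (100, 200)): {"Q1": "obj3"},
}
by_range = run_query({"metric": 13, "time": 120}, initial_rows, target_rows, metric_ivs, time_ivs)
by_label = run_query({"label": "obj3"}, initial_rows, target_rows, metric_ivs, time_ivs)
```

Both paths end with keyed lookups in the initial table, so neither requires traversing all entries.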

[0100] As can be seen from the above technical solutions, in this embodiment, a target time-series data table can be generated based on the initial time-series data table. The target time-series data table includes multiple target time-series data table entries, and the index key of each entry is a measurement interval and a time interval. Each entry includes column fields corresponding to the measurement interval and time interval, and the column fields include object label values. The target time-series data table stores the correspondence between the measurement interval, time interval, and object label values, while the initial time-series data table stores the correspondence between the object label values and the measurement values and time values. This enables efficient storage and querying of time-series data, with high query efficiency and short query time. Queries by measurement interval and time interval are thus supported efficiently on top of the time-series database. For example, data for a specific location and time range can be queried efficiently, which expands the storage and querying capabilities of the time-series database in the IoT field (where the measurement data received by the time-series database contains geographical location information).

[0101] Based on the same concept as the above method, this application proposes a data table generation device. Figure 4 is a structural schematic of the data table generation device, which may include:

[0102] The acquisition module 41 is used to acquire an initial time series data table, which includes multiple initial time series data table entries. The index key of each initial time series data table entry is an object label value, and each initial time series data table entry includes a metric value and a time value corresponding to the object label value.

[0103] The generation module 42 is used to generate multiple measurement intervals based on all measurement values in the initial time series data table, and to generate multiple time intervals based on all time values in the initial time series data table.

[0104] Processing module 43 is used to generate a target time series data table based on the initial time series data table; wherein, the target time series data table includes multiple target time series data table entries, the index key of each target time series data table entry is a metric interval and a time interval, each target time series data table entry includes at least one column field corresponding to the metric interval and the time interval, each column field includes an object label value, the metric value corresponding to the object label value in the initial time series data table is within the metric interval, and the time value corresponding to the object label value in the initial time series data table is within the time interval; wherein, different target time series data table entries may have the same or different number of column fields.

[0105] For example, when the generation module 42 generates multiple metric intervals based on all metric values in the initial time series data table, it is specifically used to: sort all metric values in the initial time series data table according to the natural order of each metric value, wherein all metric values are sorted from smallest to largest or from largest to smallest, and divide all metric values into multiple metric intervals based on the sorting result; or, generate a metric interval for each metric value in the initial time series data table.

[0106] For example, if the metric value in the initial time series data table is a numerical metric value, then the natural order of each metric value in the initial time series data table is the numerical order of each metric value; or, if the metric value in the initial time series data table is a string metric value, then the natural order of each metric value in the initial time series data table is the ASCII code order of each metric value.

[0107] For example, when the processing module 43 generates the target time-series data table based on the initial time-series data table, it is specifically used to: generate multiple index combinations based on the multiple metric intervals and the multiple time intervals, wherein each index combination includes a metric interval and a time interval; for each index combination, select the initial time-series data table entry corresponding to the index combination from the initial time-series data table, wherein the metric value in the initial time-series data table entry is within the metric interval, and the time value in the initial time-series data table entry is within the time interval; determine the number K of initial time-series data table entries corresponding to the index combination; and generate a target time-series data table entry corresponding to the index combination, wherein the index key of the target time-series data table entry is the metric interval and the time interval, and the target time-series data table entry includes K column fields, wherein the K column fields include the object label values from the K initial time-series data table entries corresponding to the index combination.

[0108] For example, the metric corresponding to the object label value is a value that changes over time.

[0109] For example, the measurement value corresponding to the object label value includes: the location value corresponding to the object label value, and the location value is a value after encoding the location; or, the cell number value corresponding to the object label value; or, the temperature value corresponding to the object label value.

[0110] For example, the device further includes a query module, configured to, upon receiving a data query instruction, if the data query instruction includes time index information and metric index information, query the target time series data table for the target time series data table entry corresponding to the time index information and the metric index information, wherein the time index information is within the time interval of the target time series data table entry and the metric index information is within the metric interval of the target time series data table entry ...

[0111] Based on the same concept as the above method, this application proposes an electronic device. As shown in Figure 5, the electronic device includes a processor 51 and a machine-readable storage medium 52, the machine-readable storage medium 52 storing machine-executable instructions that can be executed by the processor 51; the processor 51 is used to execute the machine-executable instructions to implement the data table generation method disclosed in the above example of this application.

[0112] Based on the same concept as the above method, this application also provides a machine-readable storage medium storing a plurality of computer instructions, which, when executed by a processor, can implement the data table generation method disclosed in the above example of this application.

[0113] The aforementioned machine-readable storage medium can be any electronic, magnetic, optical, or other physical storage device that can contain or store information, such as executable instructions, data, etc. For example, machine-readable storage media can be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, storage drives (such as hard disk drives), solid-state drives, any type of storage disk (such as optical discs, DVDs, etc.), or similar storage media, or combinations thereof.

[0114] The systems, devices, modules, or units described in the above embodiments can be implemented by a computer entity or by a product with a certain function. A typical implementation device is a computer, which can be a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email sending and receiving device, game console, tablet computer, wearable device, or any combination of these devices.

[0115] For ease of description, the above devices are described separately by function as various units. Of course, in implementing this application, the functions of each unit can be implemented in one or more software and / or hardware.

[0116] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, embodiments of this application can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0117] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each process and / or block of the flowchart illustrations and / or block diagrams, and combinations of processes and / or blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0118] Furthermore, these computer program instructions can also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including an instruction device, which implements the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0119] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0120] The above description is merely an embodiment of this application and is not intended to limit the scope of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of the claims of this application.

Claims

1. A method for generating a data table, characterized in that the method includes: obtaining an initial time series data table, which includes multiple initial time series data table entries, wherein the index key of each initial time series data table entry is an object label value, and each initial time series data table entry includes a metric value and a time value corresponding to the object label value; generating multiple metric intervals based on all metric values in the initial time series data table, and generating multiple time intervals based on all time values in the initial time series data table; and generating a target time series data table based on the initial time series data table, wherein the target time series data table includes multiple target time series data table entries, the index key of each target time series data table entry is a metric interval and a time interval, each target time series data table entry includes column fields corresponding to the metric interval and the time interval, the column fields include object label values, the metric value corresponding to the object label value in the initial time series data table is within the metric interval, and the time value corresponding to the object label value in the initial time series data table is within the time interval; wherein the number of column fields may be the same or different for different target time series data table entries.

2. The method according to claim 1, characterized in that generating multiple metric intervals based on all metric values in the initial time-series data table includes: sorting all metric values in the initial time series data table based on the natural order of each metric value, wherein all metric values are sorted in ascending order or in descending order, and dividing all metric values into multiple metric intervals based on the sorting result; or, generating a metric interval for each metric value in the initial time series data table.

3. The method according to claim 2, characterized in that, if the metric values in the initial time series data table are numerical values, the natural order of each metric value in the initial time series data table is the numerical order of each metric value; or, if the metric values in the initial time series data table are string-type metric values, the natural order of each metric value in the initial time series data table is the ASCII code order of each metric value.

4. The method according to claim 1, characterized in that generating a target time series data table based on the initial time series data table includes: generating multiple index combinations based on the multiple metric intervals and the multiple time intervals, wherein each index combination includes a metric interval and a time interval; for each index combination, selecting the initial time series data table entry corresponding to the index combination from the initial time series data table, wherein the metric value in the initial time series data table entry is in the metric interval and the time value in the initial time series data table entry is in the time interval; determining the number K of initial time series data table entries corresponding to the index combination; and generating the target time series data table entry corresponding to the index combination, wherein the index key of the target time series data table entry is the metric interval and the time interval, the target time series data table entry includes K column fields, and the K column fields include the object label values in the K initial time series data table entries corresponding to the index combination.

5. The method according to any one of claims 1-4, characterized in that, The metric corresponding to the object's label value is a value that changes over time; The measurement value corresponding to the object tag value includes: the location value corresponding to the object tag value, wherein the location value is a value after encoding the location; or, the cell number value corresponding to the object tag value; or, the temperature value corresponding to the object tag value.

6. The method according to any one of claims 1-4, characterized in that, after generating the target time series data table based on the initial time series data table, the method further includes: upon receiving a data query instruction, if the data query instruction includes time index information and metric index information, querying the target time series data table for the target time series data table entry corresponding to the time index information and the metric index information, wherein the time index information is within the time interval of the target time series data table entry and the metric index information is within the metric interval of the target time series data table entry; and querying the initial time series data table using at least one object label value in the target time series data table entry to obtain the initial time series data table entry corresponding to the object label value.
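
The two-step query of claim 6 can be sketched as follows. This is an illustration under assumptions rather than the claimed implementation: both tables are modeled as plain dictionaries, the names `contains` and `query_by_metric_and_time` are hypothetical, and the linear scan over index keys merely stands in for whatever index lookup the time-series database provides.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]                   # closed interval [low, high]
InitialTable = Dict[str, Tuple[float, float]]    # object label -> (metric value, time value)
TargetTable = Dict[Tuple[Interval, Interval], List[str]]


def contains(interval: Interval, value: float) -> bool:
    """Return True if value lies inside the closed interval."""
    low, high = interval
    return low <= value <= high


def query_by_metric_and_time(
    target: TargetTable,
    initial: InitialTable,
    metric_index: float,
    time_index: float,
) -> Dict[str, Tuple[float, float]]:
    """Resolve a (metric, time) query to initial-table entries.

    Step 1: find the target entry whose metric interval and time interval
            contain the query values.
    Step 2: use the object labels stored in that entry as keys into the
            initial table.
    """
    for (metric_iv, time_iv), labels in target.items():
        if contains(metric_iv, metric_index) and contains(time_iv, time_index):
            return {label: initial[label] for label in labels}
    return {}  # no target entry covers the queried metric and time
```

A query that carries only an object label would bypass the first step and read the initial table directly, since that table is already keyed by object label value.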

7. The method according to claim 6, characterized in that the method further includes: upon receiving a data query instruction, if the data query instruction includes label index information, querying the initial time series data table for the initial time series data table entry corresponding to the label index information, wherein the label index information matches the object label value in the initial time series data table entry.

8. A data table generation device, characterized in that the device includes: an acquisition module, used to acquire an initial time series data table, wherein the initial time series data table includes multiple initial time series data table entries, the index key of each initial time series data table entry is an object label value, and each initial time series data table entry includes a metric value and a time value corresponding to the object label value; a generation module, used to generate multiple metric intervals based on all metric values in the initial time series data table, and to generate multiple time intervals based on all time values in the initial time series data table; and a processing module, used to generate a target time series data table based on the initial time series data table, wherein the target time series data table includes multiple target time series data table entries, the index key of each target time series data table entry is a metric interval and a time interval, each target time series data table entry includes at least one column field corresponding to the metric interval and the time interval, each column field includes an object label value, the metric value corresponding to the object label value in the initial time series data table is within the metric interval, and the time value corresponding to the object label value in the initial time series data table is within the time interval; wherein different target time series data table entries may have the same or different numbers of column fields.

9. The device according to claim 8, characterized in that, when generating multiple metric intervals based on all metric values in the initial time series data table, the generation module is specifically used to: sort all metric values in the initial time series data table according to their natural order, in either ascending or descending order, and divide all metric values into multiple metric intervals based on the sorting result; or generate one metric interval for each metric value in the initial time series data table; wherein, if the metric values in the initial time series data table are numerical values, the natural order of the metric values in the initial time series data table is their numerical order; or, if the metric values in the initial time series data table are string values, the natural order of the metric values in the initial time series data table is their ASCII code order.
When generating the target time series data table based on the initial time series data table, the processing module is specifically used to: generate multiple index combinations based on the multiple metric intervals and the multiple time intervals, each index combination including one metric interval and one time interval; for each index combination, select from the initial time series data table the initial time series data table entries corresponding to the index combination, wherein the metric value in each selected initial time series data table entry is within the metric interval and the time value in each selected initial time series data table entry is within the time interval; determine the number K of initial time series data table entries corresponding to the index combination; and generate the target time series data table entry corresponding to the index combination, wherein the index key of the target time series data table entry is the metric interval and the time interval, the target time series data table entry includes K column fields, and the K column fields include the object label values in the K initial time series data table entries corresponding to the index combination. The metric value corresponding to the object label value is a value that changes over time, and the metric value corresponding to the object label value includes: a location value corresponding to the object label value, wherein the location value is a value obtained by encoding a location; or a cell number value corresponding to the object label value; or a temperature value corresponding to the object label value.
The device further includes a query module, used to: upon receiving a data query instruction, if the data query instruction includes time index information and metric index information, query the target time series data table for the target time series data table entry corresponding to the time index information and the metric index information, wherein the time index information is within the time interval of the target time series data table entry and the metric index information is within the metric interval of the target time series data table entry; and query the initial time series data table using at least one object label value in the target time series data table entry to obtain the initial time series data table entry corresponding to the object label value. The query module is further used to: upon receiving a data query instruction, if the data query instruction includes label index information, query the initial time series data table for the initial time series data table entry corresponding to the label index information, wherein the label index information matches the object label value in the initial time series data table entry.

10. An electronic device, characterized by including: a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions that can be executed by the processor; wherein the processor is used to execute the machine-executable instructions to implement the method of any one of claims 1-7.