Method and device for fast querying of time-series data with small-range disorder
By limiting the time window for out-of-order data and using a binary search algorithm, the problem of poor performance of time-series databases caused by out-of-order data is solved, achieving fast querying and resource conservation.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: 山东浪潮数据库技术有限公司
- Filing Date: 2023-11-23
- Publication Date: 2026-06-19
AI Technical Summary
Out-of-order data leads to poor query performance in time-series databases, requiring the traversal of all data, increasing disk I/O and memory resource consumption, and the data rearrangement is time-consuming.
The degree of disorder is limited so that binary search plus local traversal can be used to query time-series data. Out-of-order writes are accepted only within a small window, 15 minutes in the described embodiment. Binary search locates the start of the traversal and a timestamp bound determines where it ends, which reduces the range of data traversed.
It improves database query performance, reduces disk I/O and memory resource consumption, and enhances user experience.
Figure CN117555945B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of databases, and specifically provides a method and apparatus for fast querying of time-series data with small-range disorder.
Background Technology
[0002] In most scenarios, time-series data is stored in ascending order of time. However, due to network latency, equipment failure, and other reasons, the collected data may not arrive in the correct order. We conventionally refer to this as out-of-order data. Out-of-order data is a common phenomenon, and handling out-of-order data is a scenario that time-series databases must support.
[0003] Because time-series databases are write-heavy and query-light, they have very high requirements for data write performance. Therefore, time-series databases typically store data in the order it was inserted, meaning the data stored on disk is also out of order. This necessitates that time-series databases consider out-of-order data scenarios when performing queries.
[0004] When there is out-of-order data in a time-series database, since the data is no longer in order, binary search cannot be used for quick location, and the only way is to traverse all the data. For range queries, traversing all the data is also commonly used, but sometimes all the data is sorted first and then binary search is used for range queries.
[0005] Out-of-order data can significantly impact the performance of time-series database queries, primarily due to the following issues:
[0006] 1. It requires traversing all the data, which makes the query algorithm very inefficient;
[0007] 2. Traversing all the data requires loading all the data, which increases disk I/O;
[0008] 3. Rearranging all the data takes a long time and consumes a lot of memory resources.
Summary of the Invention
[0009] This invention addresses the shortcomings of the prior art by providing a practical method for fast querying of small-scale out-of-order time-series data.
[0010] A further technical objective of this invention is to provide a reasonably designed, safe, and applicable device for rapid querying of small-scale out-of-order time-series data.
[0011] The technical solution adopted by this invention to solve its technical problem is:
[0012] A fast query method for time series data with limited disorder is proposed. The degree of disorder is limited so that the time series data is sequential in terms of overall trend. This allows the use of binary search and local traversal to query records at a certain time point or to filter out all records within a certain time period.
[0013] Furthermore, the time range of out-of-order data is restricted: only an out-of-order time window on the order of minutes is accepted, meaning only out-of-order data within a certain time period can be written.
[0014] Furthermore, assuming the time-series data are inserted in the correct order, and the corresponding timestamps are s1, s2, s3…sn; assuming the out-of-order time window is T, then the timestamps si of all records satisfy the following formula:
[0015]
[0016] Furthermore, the algorithm for querying a record with a specific timestamp also requires traversing the data, but this is not a traversal of all data; instead, it selects records within a storage range for traversal.
[0017] Then, the algorithm for querying records within a certain time range is given, for example, [sbegin, send].
[0018] Furthermore, the algorithm for querying a record with a specific timestamp includes the following steps:
[0019] (1) Query the starting position;
[0020] If you want to query a record with a timestamp of si, then the starting record for the query should be set to sk, k<=i;
[0021] sk+T≤si
[0022] sk≤si-T;
[0023] The small-range disorder rule ensures that every record with a timestamp less than or equal to si-T-1 is stored before the record with timestamp si that is being queried. Therefore, binary search is used to locate the timestamp si-T-1, and the traversal starts from that position.
[0024] Binary search guarantees that the record found has a timestamp less than or equal to si-T-1 and is therefore stored before the record with timestamp si that is being queried.
[0025] (2) Query the end position;
[0026] The end position must be after the si record; set the end record for the query to be se, e>i.
[0027] se-T≥si
[0028] se≥si+T
[0029] Therefore, the traversal can stop at the first record whose timestamp is greater than or equal to si+T+1.
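As a worked illustration of the two bounds, take assumed values: a 15-minute window and integer timestamps with one-second resolution (the text does not state a resolution). For a query on $s_i$ = 12:00:00 with $T = 900\,\text{s}$:

$$
s_i - T - 1 = 12{:}00{:}00 - 900\,\text{s} - 1\,\text{s} = 11{:}44{:}59,
\qquad
s_i + T + 1 = 12{:}00{:}00 + 900\,\text{s} + 1\,\text{s} = 12{:}15{:}01
$$

Under these assumptions, binary search targets 11:44:59 as the starting position, and the traversal stops at the first record stamped 12:15:01 or later. The offsets of 1 only make sense for integer timestamps, where they turn the strict bounds $s_k < s_i - T$ and $s_e > s_i + T$ into non-strict ones.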
[0030] Furthermore, the algorithm for querying records within a certain time range first determines the starting position of the query. If the query is for a record with a timestamp of sbegin, then the starting record to be queried should be set to sk,k<=begin.
[0031] sk+T≤sbegin
[0032] sk≤sbegin-T
[0033] Binary search is used to find the record corresponding to the timestamp sbegin-T-1, which serves as the starting position for the traversal.
[0034] Furthermore, after the starting position is determined, the ending position is determined. The ending position must be after the send record; set the ending record for the query to se, e>end.
[0035] se-T≥send
[0036] se≥send+T
[0037] Therefore, the iteration can stop when a record with a timestamp greater than or equal to send+T+1 is found.
[0038] A device for fast querying time-series data with small-scale out-of-order conditions includes: at least one memory and at least one processor;
[0039] The at least one memory is used to store a machine-readable program;
[0040] The at least one processor is used to call the machine-readable program to execute a fast query method for time-series data with a small range of out-of-order data.
[0041] Compared with the prior art, the fast query method and apparatus for small-range out-of-order time-series data of the present invention have the following outstanding advantages:
[0042] This invention limits the disorder of time-series data and uses binary search plus local traversal, so that records at a certain point in time or within a certain time period can be queried without traversing all records. This improves the overall query performance of the database, reduces disk I/O, and enhances the user experience.
Attached Figure Description
[0043] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the accompanying drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the accompanying drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0044] Figure 1 is a flowchart illustrating a method for fast querying of time-series data with small-range disorder.
Detailed Implementation
[0045] To enable those skilled in the art to better understand the present invention, the present invention will be further described in detail below with reference to specific embodiments. Obviously, the described embodiments are merely some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0046] The following is a preferred embodiment:
[0047] As shown in Figure 1, this embodiment provides a method for fast querying of time-series data with small-range disorder. It limits the degree of disorder, so that the time-series data is sequential in terms of overall trend. This allows the use of binary search and local traversal to query records at a certain time point or to filter out all records within a certain time period.
[0048] The time range for out-of-order data is limited to a 15-minute out-of-order time window (this parameter can be adjusted), meaning only out-of-order data within 15 minutes can be written.
[0049] For example, if the current time is 12:00, time-series data with timestamps before 11:45 is discarded on insertion, while data with timestamps after 11:45 is inserted successfully. In this way, the disorder stays within a small range, and the overall trend remains sequential.
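The write-side rule in this example can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `accept_write` and the constant `OUT_OF_ORDER_WINDOW` are hypothetical, and accepting a timestamp that falls exactly on the 11:45 boundary is an assumption, since the text only covers "before" and "after".

```python
from datetime import datetime, timedelta

# Adjustable out-of-order window; 15 minutes follows the embodiment.
OUT_OF_ORDER_WINDOW = timedelta(minutes=15)


def accept_write(record_ts: datetime, now: datetime,
                 window: timedelta = OUT_OF_ORDER_WINDOW) -> bool:
    """Return True if the record may be written, False if it is discarded."""
    # Assumption: a timestamp exactly at `now - window` is accepted.
    return record_ts >= now - window


now = datetime(2024, 1, 1, 12, 0)
print(accept_write(datetime(2024, 1, 1, 11, 50), now))  # True
print(accept_write(datetime(2024, 1, 1, 11, 30), now))  # False
```

Rejecting late records at write time is what keeps the stored sequence nearly sorted, which the query algorithms rely on.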
[0050] Assuming the time-series data is inserted in the correct order, the timestamps corresponding to the time-series data are s1, s2, s3…sn. Assuming the out-of-order time window is T, then the following formula applies to the timestamps si of all records:
[0051]
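The formula at this point is not reproduced in the text. One constraint that is consistent with the write rule in [0048] and with the derivations that follow is given here as an assumption, not as the patent's own formula. It further assumes that no record carries a timestamp later than its arrival time. Under those assumptions, a record stored later can trail any earlier-stored record by at most T:

$$
s_j \ge s_i - T \qquad \text{for all } 1 \le i < j \le n
$$

Read this way, a record whose timestamp is below $s_i - T$ cannot be stored after record $i$, and a record stored after one whose timestamp exceeds $s_i + T$ cannot have timestamp $s_i$. These two facts are what the start and end bounds of the query algorithms use.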
[0052] I. Algorithm for querying a record with a specific timestamp
[0053] Querying a record also requires traversing the data, but this is not a traversal of all data, but rather a traversal of records within a specific storage range.
[0054] (1) Query the starting position;
[0055] Therefore, if you want to query a record with a timestamp of si, you need to set the starting record for the query to sk, k<=i.
[0056] sk+T≤si
[0057] sk≤si-T
[0058] The small-range disorder rule ensures that every record with a timestamp less than or equal to si-T-1 is stored before the record with timestamp si that is being queried. Therefore, binary search is used to locate the timestamp si-T-1.
[0059] The traversal then only needs to start from that position. Binary search may not pinpoint an exact record, but the record it finds is guaranteed to have a timestamp less than or equal to si-T-1, and such records can only be stored before the record with timestamp si that is being queried.
[0060] (2) Query the end position;
[0061] The end position must be after the si record; set the end record for the query to be se, e>i.
[0062] se-T≥si
[0063] se≥si+T
[0064] Therefore, the traversal can stop at the first record whose timestamp is greater than or equal to si+T+1.
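The two steps above can be sketched as follows. This is an illustrative reading rather than the patented implementation: it assumes integer timestamps held in a list in storage (insertion) order, and it assumes that a later-stored record never trails an earlier-stored one by more than T. The name `point_query` is hypothetical. The comparison `ts < target - window` is the integer equivalent of "less than or equal to si-T-1".

```python
from typing import List, Sequence


def point_query(timestamps: Sequence[int], target: int, window: int) -> List[int]:
    """Return storage indices of all records whose timestamp equals `target`."""
    # Step 1: binary search for a safe starting position. Whenever `lo` moves
    # past `mid`, the record at `mid` is older than target - window, so under
    # the assumed disorder bound no match can sit at or before `mid`.
    lo, hi = 0, len(timestamps)
    while lo < hi:
        mid = (lo + hi) // 2
        if timestamps[mid] < target - window:
            lo = mid + 1
        else:
            hi = mid

    # Step 2: local traversal, stopping at the first record newer than
    # target + window (that is, >= target + window + 1 for integers).
    hits = []
    for idx in range(lo, len(timestamps)):
        ts = timestamps[idx]
        if ts > target + window:
            break
        if ts == target:
            hits.append(idx)
    return hits


# Timestamps in storage order, disordered within a window of 5.
stored = [10, 12, 8, 15, 11, 20, 17, 25, 22, 30, 40]
print(point_query(stored, 11, 5))  # [4]
print(point_query(stored, 22, 5))  # [8]
```

The loop is written out by hand instead of calling a library bisection routine because such routines are specified for fully sorted input, whereas the data here is only nearly sorted.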
[0065] II. Algorithm for querying records within a specific time range
[0066] Assume the time range for the query is [sbegin, send];
[0067] (1) Query the starting position;
[0068] Therefore, if you want to query a record with a timestamp of sbegin, you need to set the starting record for the query to sk, k<=begin;
[0069] sk+T≤sbegin
[0070] sk≤sbegin-T
[0071] Binary search is used to find the record corresponding to the timestamp sbegin-T-1, which serves as the starting position for the traversal.
[0072] (2) Query the end position;
[0073] The end position must be after the send record; set the end record for the query to se, e>end.
[0074] se-T≥send
[0075] se≥send+T
[0076] Therefore, the traversal can stop at the first record whose timestamp is greater than or equal to send+T+1.
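A range query over [sbegin, send] might be sketched as below. The same caveats apply: integer timestamps in storage order and a disorder bound of T between any earlier-stored and later-stored record are assumptions, and `range_query` is a hypothetical name. A query for a single timestamp is the special case where `begin == end`.

```python
from typing import List, Sequence


def range_query(timestamps: Sequence[int], begin: int, end: int,
                window: int) -> List[int]:
    """Return storage indices of records with begin <= timestamp <= end."""
    # Start position: binary search against begin - window. A record older
    # than that bound cannot be stored after any record inside the range.
    lo, hi = 0, len(timestamps)
    while lo < hi:
        mid = (lo + hi) // 2
        if timestamps[mid] < begin - window:
            lo = mid + 1
        else:
            hi = mid

    # End position: stop at the first record newer than end + window,
    # since nothing stored after it can still fall inside the range.
    hits = []
    for idx in range(lo, len(timestamps)):
        ts = timestamps[idx]
        if ts > end + window:
            break
        if begin <= ts <= end:
            hits.append(idx)
    return hits


# Timestamps in storage order, disordered within a window of 5.
stored = [10, 12, 8, 15, 11, 20, 17, 25, 22, 30, 40]
print(range_query(stored, 11, 17, 5))  # [1, 3, 4, 6]
```

The results come back in storage order, not timestamp order. A caller that needs them sorted only has to sort the matched subset, not the whole data set.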
[0077] This embodiment of a device for fast querying time-series data with small-scale out-of-order conditions includes: at least one memory and at least one processor;
[0078] The at least one memory is used to store a machine-readable program;
[0079] The at least one processor is used to call the machine-readable program to execute a fast query method for time-series data with a small range of out-of-order data.
[0080] The specific embodiments described above are merely specific examples of the present invention. The patent protection scope of the present invention includes, but is not limited to, the specific embodiments described above. Any technical solution that conforms to the technical claims of the present invention and any appropriate changes or substitutions made by a person skilled in the art should fall within the patent protection scope of the present invention.
[0081] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
Claims
1. A method for fast querying of time-series data with small-range disorder, characterized in that:
By limiting the degree of disorder, the time-series data is kept ordered in terms of overall trend, so that binary search and local traversal can be used to query records at a certain time point or to filter out all records within a certain time period;
The time range of out-of-order data is restricted, and only an out-of-order time window on the order of minutes is accepted, meaning that only out-of-order data within a certain time period can be written;
Assuming the time-series data are inserted in the correct order with corresponding timestamps s1, s2, s3…sn, and assuming the out-of-order time window is T, the timestamps si of all records satisfy the following formula: [formula not reproduced];
The algorithm for querying a record with a specific timestamp also requires traversing the data, but it does not traverse all the data; instead, it selects records within a storage range for traversal. The algorithm for querying records within a certain time range is then given, assuming the time range is [sbegin, send];
The algorithm for querying a record with a specific timestamp includes the following steps:
(1) Query the starting position; to query a record with a timestamp of si, the starting record for the query is set to sk, k<=i; sk+T≤si; sk≤si-T; the small-range disorder rule ensures that every record with a timestamp less than or equal to si-T-1 is stored before the record with timestamp si that is being queried, so binary search is used to locate the timestamp si-T-1 and the traversal starts from that position; binary search guarantees that the record found has a timestamp less than or equal to si-T-1 and is stored before the record with timestamp si that is being queried;
(2) Query the end position; the end position must be after the si record; the end record for the query is set to se, e>i; se-T≥si; se≥si+T; therefore, the traversal can stop at the first record whose timestamp is greater than or equal to si+T+1;
The algorithm for querying records within a certain time range first determines the starting position of the query; to query a record with a timestamp of sbegin, the starting record for the query is set to sk, k<=begin; sk+T≤sbegin; sk≤sbegin-T; binary search is used to find the record corresponding to the timestamp sbegin-T-1 as the starting position for the traversal;
After the starting position is determined, the ending position is determined; the ending position must be after the send record; the ending record is set to se, e>end; se-T≥send; se≥send+T; therefore, the traversal can stop at the first record whose timestamp is greater than or equal to send+T+1.
2. A device for fast querying of time-series data with small-range disorder, characterized in that it includes: at least one memory and at least one processor; the at least one memory is used to store a machine-readable program; the at least one processor is configured to invoke the machine-readable program to execute the method of claim 1.