Real-time time series data stream computing processing method, system, device and medium
By introducing a stream computing service and a caching mechanism into the time-series database, the invention addresses the efficiency and flexibility problems of real-time data processing, achieving high-throughput, low-latency real-time data processing and improving the processing capability of the time-series database.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- 上海沄熹科技有限公司
- Filing Date
- 2023-08-21
- Publication Date
- 2026-06-12
Figure CN117009395B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of time-series database technology, specifically to a method, system, device, and medium for real-time time-series data stream computing processing.
Background Technology
[0002] A time-series database is a database system used for storing and querying time-series data. By providing efficient storage structures and indexing mechanisms, it enables fast storage and querying of time-series data. This technology is typically used for storing and analyzing large-scale real-time data, offering high performance, high scalability, and flexible query capabilities.
[0003] Stream computing is a computational model for real-time processing of data streams, widely used in many fields. It focuses on real-time computation and processing of continuously generated data streams to obtain real-time results and insights, employing techniques such as windowing, real-time aggregation, and pattern detection to meet the requirements of real-time performance, low latency, and high throughput.
[0004] Existing technologies face challenges in terms of real-time data processing efficiency and flexibility. Therefore, improving the efficiency and accuracy of real-time data processing in time-series databases, and achieving higher processing throughput and lower processing latency, are urgent technical problems to be solved.
Summary of the Invention
[0005] The technical objective of this invention is to provide a method, system, device, and medium for real-time time-series data stream computing processing that improve the efficiency and accuracy of real-time data processing in time-series databases and achieve higher processing throughput and lower processing latency.
[0006] The technical objective of this invention is achieved as follows: a method for real-time time-series data stream computing processing, the specific method of which is as follows:
[0007] Start the time-series database and start the stream computing service;
[0008] Create a stream computation: create a stream computation on any time-series table; this table is the original table;
[0009] Insert time-series data: after the time-series data is written to disk, determine whether a stream computing task has been created on the time-series table into which the data is inserted, i.e., on the original table;
[0010] If a stream computing task has been created, decide, based on the start time of the stream computing task, whether to send the time-series data to the stream computing task;
[0011] Execute the stream computing task.
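The insert path in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `StreamTask`, `InsertRouter`, `disk_log`, and `received` are hypothetical names, disk persistence is simulated with an in-memory list, and the gate "forward only when the current time lies between the task's start time and end time" is one assumed reading of how the start time decides whether data is sent.

```python
from dataclasses import dataclass, field


@dataclass
class StreamTask:
    """Hypothetical stream computing task bound to one original table."""
    source_table: str
    start_time: float  # seconds since epoch
    end_time: float
    received: list = field(default_factory=list)

    def is_active(self, now: float) -> bool:
        # Assumption: a task accepts data from its start time up to, but not
        # including, its end time.
        return self.start_time <= now < self.end_time


class InsertRouter:
    """Writes rows to (simulated) disk first, then forwards them to a task."""

    def __init__(self) -> None:
        self.disk_log: list = []  # stands in for durable storage
        self.tasks: dict = {}     # original table name -> StreamTask

    def register(self, task: StreamTask) -> None:
        self.tasks[task.source_table] = task

    def insert(self, table: str, timestamp: float, value: float, now: float) -> bool:
        """Return True if the row was forwarded to a stream computing task."""
        self.disk_log.append((table, timestamp, value))  # step 1: persist
        task = self.tasks.get(table)                     # step 2: task exists?
        if task is None or not task.is_active(now):      # step 3: time gate
            return False
        task.received.append((timestamp, value))
        return True


router = InsertRouter()
router.register(StreamTask("sensor", start_time=100.0, end_time=200.0))
before = router.insert("sensor", 90.0, 1.0, now=90.0)    # task not started yet
during = router.insert("sensor", 150.0, 2.0, now=150.0)  # forwarded to the task
other = router.insert("meter", 150.0, 3.0, now=150.0)    # no task on this table
```

In this sketch every row reaches the disk log regardless of whether a task exists, which mirrors the ordering in the text: persistence comes first, and forwarding is a separate decision made afterwards.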
[0012] As a preferred method, the specific steps for setting up scheduled stream computing tasks in the stream computing service are as follows:
[0013] ① When the database service starts, a stream computing service is created as a background service. The stream computing service reads the stream computing tasks in the stream computing task table and sets up scheduled tasks according to the start time of each stream computing task.
[0014] ② When stream computing is created, a scheduled task is set up for the stream computing task sent to the stream computing service.
[0015] When the database service ends, both the stream computing task and the stream computing service will stop operating.
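The two scheduling paths above (tasks restored from the task table at startup, and tasks created while the service is running) can be sketched with Python's standard `threading.Timer`. The class `StreamService`, its `task_table` argument, and the rule that an overdue task starts immediately are assumptions made for illustration; the source does not say how overdue tasks are handled.

```python
import threading


class StreamService:
    """Hypothetical background service that starts tasks at their start time."""

    def __init__(self, task_table: dict, now: float) -> None:
        # task_table maps task name -> start time (seconds); it stands in for
        # the persisted stream computing task table read at database startup.
        self.started: list = []
        self.timers: dict = {}
        self._lock = threading.Lock()
        for name, start_time in task_table.items():
            self.schedule(name, start_time, now)

    def schedule(self, name: str, start_time: float, now: float) -> float:
        """Arm a timer for one task and return the delay that was used."""
        delay = max(0.0, start_time - now)  # assumption: overdue tasks start at once
        timer = threading.Timer(delay, self._start_task, args=(name,))
        timer.daemon = True
        self.timers[name] = timer
        timer.start()
        return delay

    def _start_task(self, name: str) -> None:
        with self._lock:
            self.started.append(name)

    def stop(self) -> None:
        """Cancel pending timers, mirroring shutdown of the database service."""
        for timer in self.timers.values():
            timer.cancel()


# Path 1: tasks restored from the task table when the service starts.
service = StreamService({"overdue": 50.0, "future": 3600.0}, now=100.0)
# Path 2: a task created while the service is already running.
delay = service.schedule("new_task", start_time=4000.0, now=100.0)
service.timers["overdue"].join(timeout=5)  # the zero-delay timer fires at once
service.stop()
```

Calling `stop()` cancels the timers that have not fired yet, so neither `future` nor `new_task` is started once the service ends.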
[0016] More specifically, stream computing can be created as follows:
[0017] The parameters for stream computing creation include start time, end time, aggregation rules, and time interval. After parsing the parameters for stream computing creation, a time series table with the same name as the stream computing is created according to the result type of the aggregation calculation. The time series table is used to store the results of the stream computing task, which is the result table.
[0018] Write the stream computing task to the stream computing task table, persist it to disk, and send it to the background stream computing service.
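As a rough sketch of the creation step, the code below parses the four creation parameters, validates them, and appends the task to a task table persisted on disk. The record layout `StreamSpec`, the JSON-lines file format, and the validation rules are assumptions for illustration only; creating the result table and handing the task to the background service are omitted.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class StreamSpec:
    """Hypothetical record stored in the stream computing task table."""
    name: str          # also used as the result table name
    source_table: str  # the original table
    start_time: int    # epoch seconds
    end_time: int
    aggregation: str   # e.g. "avg", "sum", "count"
    interval: int      # window length in seconds


def create_stream(params: dict, task_table_path: str) -> StreamSpec:
    """Validate creation parameters and append the task to the task table."""
    spec = StreamSpec(**params)
    if spec.end_time <= spec.start_time:
        raise ValueError("end_time must be later than start_time")
    if spec.interval <= 0:
        raise ValueError("interval must be positive")
    with open(task_table_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(spec)) + "\n")
        f.flush()
        os.fsync(f.fileno())  # force the record to disk before returning
    return spec


def load_tasks(task_table_path: str) -> list:
    """Read back every persisted task, e.g. when the service restarts."""
    with open(task_table_path, encoding="utf-8") as f:
        return [StreamSpec(**json.loads(line)) for line in f if line.strip()]


path = os.path.join(tempfile.mkdtemp(), "stream_tasks.jsonl")
spec = create_stream(
    {"name": "temp_avg_1m", "source_table": "temperature",
     "start_time": 1_700_000_000, "end_time": 1_700_003_600,
     "aggregation": "avg", "interval": 60},
    path,
)
restored = load_tasks(path)
```

Persisting the task before it is handed to the service is what allows the service to rebuild its scheduled tasks from the task table after a restart.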
[0019] More preferably, the execution of the stream computation task is as follows:
[0020] If a scheduled task created by the stream computing service is triggered, the stream computing task will be started, and the stream computing task will create a cache based on the columns of the result table.
[0021] When time-series data is sent to a stream computing task, the time window to which the timestamp of the time-series data belongs is determined from the start time and the time interval:
[0022] If the timestamp belongs to the same time window as the time-series data previously written to the cache, the time-series data is written to the cache.
[0023] If the timestamp does not belong to the same time window as the time-series data previously written to the cache, the data in the cache is aggregated according to the aggregation rules set when the stream computation was created, the result is written to the result table, and the cache is cleared. The new data is then written to the cache. This continues until the end time of the stream computing task is reached and the stream computing task ends.
[0024] More preferably, the background stream computing service creates a scheduled task based on the start time of the stream computing task created by the user, and starts the stream computing task when the scheduled time is reached.
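The cache-and-flush behavior of a running task can be sketched as below. The class `WindowedTask`, the set of aggregation rules, the left-closed window boundaries, and the choice to emit the last window when the first record at or after the end time arrives are all assumptions; the sketch also assumes records arrive in timestamp order, since the text only compares a record against the window of the previously cached data.

```python
from statistics import fmean


class WindowedTask:
    """Hypothetical stream computing task with a single-window cache."""

    AGGREGATORS = {"avg": fmean, "sum": sum, "max": max, "min": min, "count": len}

    def __init__(self, start, end, interval, rule):
        self.start, self.end, self.interval = start, end, interval
        self.aggregate = self.AGGREGATORS[rule]
        self.cache = []             # values of the window being filled
        self.current_window = None  # index of that window
        self.results = []           # stands in for the result table

    def window_of(self, ts):
        # Windows are aligned to the start time and have length `interval`.
        return (ts - self.start) // self.interval

    def flush(self):
        """Aggregate the cached window, write the result, clear the cache."""
        if self.cache:
            window_start = self.start + self.current_window * self.interval
            self.results.append((window_start, self.aggregate(self.cache)))
            self.cache = []

    def on_data(self, ts, value):
        if ts < self.start:
            return        # assumption: ignore data before the start time
        if ts >= self.end:
            self.flush()  # assumption: emit the last window at the end time
            return
        window = self.window_of(ts)
        if window != self.current_window:
            self.flush()  # a new window begins: aggregate the previous one
            self.current_window = window
        self.cache.append(value)


task = WindowedTask(start=0, end=30, interval=10, rule="avg")
for ts, value in [(1, 2.0), (5, 4.0), (12, 10.0), (25, 7.0), (31, 99.0)]:
    task.on_data(ts, value)
```

Only one window is held in memory at a time, so the cache stays small no matter how long the task runs.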
[0025] A real-time time-series data stream computing and processing system, the system comprising:
[0026] The startup unit is used to start the time-series database and the stream computing service;
[0027] The creation unit is used to create a stream computation on any time-series table, which is the original table;
[0028] The insertion unit is used to determine, after time-series data has been written to disk, whether a stream computing task has been created on the time-series table into which the data is inserted, i.e., on the original table;
[0029] If a stream computing task has been created, whether to send the time-series data to the stream computing task is decided based on the start time of the stream computing task;
[0030] The execution unit is used to execute stream computing tasks.
[0031] As a preferred method, the specific steps for setting up scheduled stream computing tasks in the stream computing service are as follows:
[0032] ① When the database service starts, a stream computing service is created as a background service. The stream computing service reads the stream computing tasks in the stream computing task table and sets up scheduled tasks according to the start time of each stream computing task.
[0033] ② When stream computing is created, a scheduled task is set up for the stream computing task sent to the stream computing service.
[0034] When the database service ends, both the stream computing task and the stream computing service will stop operating.
[0035] As a preferred option, the working process of this system is as follows:
[0036] (1) When the database service starts, a stream computing service is created as a background service. The stream computing service reads the stream computing task table and creates stream computing tasks.
[0037] (2) The user creates a stream computation on any time series table, which is the original table, and creates a new time series table to store the stream computation results, i.e., the result table; after the stream computation is successfully created, the stream computation task is sent to the background stream computation service.
[0038] (3) The background stream computing service creates a scheduled task based on the start time of the stream computing created by the user, and starts the stream computing task when the scheduled time is reached.
[0039] (4) When inserting time-series data, first write it to disk. After successful disk writing, check whether stream computing has been created on the original table to which the data was inserted:
[0040] If so, data is sent to the stream computing task. If the data received by the stream computing task is within the time window specified by the stream computing task, the data is written to the cache; otherwise, the data in the cache is aggregated and written to the result table. After clearing the cache, the new data is written to the cache.
[0041] (5) When the end time of the stream computation is reached, the stream computing task stops; from then on, data inserted into the original table of the stream computation is still written to disk, but is no longer sent to the stream computing task after the disk write succeeds.
[0042] An electronic device includes: a memory and at least one processor;
[0043] The memory contains computer programs;
[0044] The at least one processor executes the computer program stored in the memory, causing the at least one processor to perform the real-time time-series data streaming processing method as described above.
[0045] A computer-readable storage medium storing a computer program that can be executed by a processor to implement the real-time time-series data streaming processing method described above.
[0046] The real-time time-series data streaming computation processing method, system, device, and medium of the present invention have the following advantages:
[0047] (i) This invention processes time-series data quickly and in real time. It does not need to reprocess time-series data that has already been written to disk, and reading and writing through a cache is much faster than reading and writing on disk. In addition, it does not need to partition the time-series data first and aggregate it afterwards; instead, it aggregates the data in real time by checking the timestamp of each piece of time-series data. This makes data collection and analysis for important high-frequency equipment faster and more effective.
[0048] (ii) This invention solves the challenges of efficiency and flexibility in real-time data processing in the prior art. By introducing an innovative stream computing processing method, it improves the efficiency and accuracy of real-time data processing, and achieves higher processing throughput and lower processing latency.
[0049] (iii) The present invention aims to overcome the limitations of traditional stream computing methods in handling complex event patterns, so as to provide more flexible and powerful processing capabilities.
[0050] (iv) This invention combines time-series databases with stream computing, improving the efficiency and flexibility of real-time data processing; wherein, the time-series database, as one of the data sources for stream computing, provides efficient data storage and query support for stream computing; stream computing then uses the time-series data stored in the time-series database to process data streams in real time and generate real-time results.
[0051] (v) After a user creates a stream computing task, this invention writes the stream computing task into the task status table and performs aggregation calculations on inserted data in real time, providing more flexible and powerful processing capabilities for creating and executing stream computing tasks after the time-series database service is started.
Attached Figure Description
[0052] The invention will be further described below with reference to the accompanying drawings.
[0053] Figure 1 is a flowchart of creating a stream computation;
[0054] Figure 2 is a flowchart of executing a stream computing task.
Detailed Implementation
[0055] The real-time time-series data stream computing processing method, system, device, and medium of the present invention are described in detail below with reference to the accompanying drawings and specific embodiments.
[0056] Example 1:
[0057] This embodiment provides a method for stream computing processing of real-time time-series data, as detailed below:
[0058] S1. Start the time series database and start the stream computing service;
[0059] S2. Create Stream Computation: Create a stream computation on any time series table, which is the original table;
[0060] S3. Inserting Time Series Data: After the time series data is written to disk, determine whether a stream computing task has been created on the time series table to which the data is inserted, i.e., whether a stream computing task has been created on the original table to which the data is inserted:
[0061] If a stream computing task has been created, decide, based on the start time of the stream computing task, whether to send the time-series data to the stream computing task.
[0062] S4. Execute the stream computing task.
[0063] The specific method for setting up the stream computing service's scheduled task in step S1 of this embodiment is as follows:
[0064] ① When the database service starts, a stream computing service is created as a background service. The stream computing service reads the stream computing tasks in the stream computing task table and sets up scheduled tasks according to the start time of each stream computing task.
[0065] ② When stream computing is created, a scheduled task is set up for the stream computing task sent to the stream computing service.
[0066] When the database service ends, both the stream computing task and the stream computing service will stop operating.
[0067] As shown in Figure 1, the specific steps for creating the stream computation in step S2 of this embodiment are as follows:
[0068] S201. The parameters for creating a stream computation include start time, end time, aggregation rules, and time interval. After the creation parameters are parsed, a time-series table with the same name as the stream computation is created according to the result type of the aggregation calculation. This time-series table stores the results of the stream computing task and is the result table.
[0069] S202. Write the stream computing task into the stream computing task table, write it to disk, and send it to the background stream computing service.
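Step S201 creates the result table "according to the result type of the aggregation calculation" without spelling out the mapping. One plausible reading is sketched below: each aggregation rule determines the type of its result column. The type names (`BIGINT`, `DOUBLE`, `TIMESTAMP`), the column naming scheme, the `window_start` column, and the mapping itself are hypothetical and do not describe the actual database's type system.

```python
def result_column_type(rule: str, source_type: str) -> str:
    """Pick the result column type for one aggregation rule (assumed mapping)."""
    if rule == "count":
        return "BIGINT"      # a count is an integer whatever the source type
    if rule == "avg":
        return "DOUBLE"      # an average is fractional in general
    if rule in ("sum", "min", "max"):
        return source_type   # assumed to keep the source column's type
    raise ValueError(f"unsupported aggregation rule: {rule}")


def result_table_schema(stream_name: str, aggregations: list) -> dict:
    """Build a schema for the result table, named after the stream computation.

    aggregations is a list of (rule, source_column, source_type) tuples.
    """
    columns = [("window_start", "TIMESTAMP")]  # hypothetical window key column
    for rule, column, source_type in aggregations:
        columns.append((f"{rule}_{column}", result_column_type(rule, source_type)))
    return {"table": stream_name, "columns": columns}


schema = result_table_schema(
    "temp_stats_1m",
    [("avg", "temperature", "INT"),
     ("count", "temperature", "INT"),
     ("max", "temperature", "INT")],
)
```

Because the task later builds its cache from the columns of the result table, fixing the schema at creation time also fixes the shape of the cache.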
[0070] As shown in Figure 2, the stream computing task in step S4 of this embodiment is executed as follows:
[0071] If a scheduled task created by the stream computing service is triggered, the stream computing task will be started, and the stream computing task will create a cache based on the columns of the result table.
[0072] When time-series data is sent to a stream computing task, the time window to which the timestamp of the time-series data belongs is determined from the start time and the time interval:
[0073] If the timestamp belongs to the same time window as the time-series data previously written to the cache, the time-series data is written to the cache.
[0074] If the timestamp does not belong to the same time window as the time-series data previously written to the cache, the data in the cache is aggregated according to the aggregation rules set when the stream computation was created, the result is written to the result table, and the cache is cleared. The new data is then written to the cache. This continues until the end time of the stream computing task is reached and the stream computing task ends.
[0075] In this embodiment, the background stream computing service establishes a scheduled task based on the start time of the stream computing created by the user, and starts the stream computing task when the scheduled time is reached.
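The same-window test used in step S4 can be stated compactly. The symbols are introduced here for illustration: $t_s$ is the start time, $\Delta$ the time interval, and $t$ the timestamp of a piece of time-series data. The text does not state the boundary convention, so left-closed, right-open windows aligned to the start time are assumed:

```latex
k(t) = \left\lfloor \frac{t - t_s}{\Delta} \right\rfloor,
\qquad
t \in \bigl[\, t_s + k\Delta,\; t_s + (k+1)\Delta \,\bigr)
```

Two pieces of data fall in the same window exactly when $k(t_1) = k(t_2)$. For example, with $t_s$ = 10:00:00 and $\Delta$ = 60 s, data stamped 10:00:59 gives $k = \lfloor 59/60 \rfloor = 0$ and data stamped 10:01:00 gives $k = \lfloor 60/60 \rfloor = 1$, so the arrival of the second piece triggers aggregation of the first window.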
[0076] Example 2:
[0077] This embodiment provides a real-time time-series data stream computing and processing system, the system comprising:
[0078] The startup unit is used to start the time-series database and the stream computing service;
[0079] The creation unit is used to create a stream computation on any time-series table, which is the original table;
[0080] The insertion unit is used to determine, after time-series data has been written to disk, whether a stream computing task has been created on the time-series table into which the data is inserted, i.e., on the original table;
[0081] If a stream computing task has been created, whether to send the time-series data to the stream computing task is decided based on the start time of the stream computing task;
[0082] The execution unit is used to execute stream computing tasks.
[0083] In this embodiment, the specific method for setting up scheduled stream computing tasks in the stream computing service is as follows:
[0084] ① When the database service starts, a stream computing service is created as a background service. The stream computing service reads the stream computing tasks in the stream computing task table and sets up scheduled tasks according to the start time of each stream computing task.
[0085] ② When stream computing is created, a scheduled task is set up for the stream computing task sent to the stream computing service.
[0086] When the database service ends, both the stream computing task and the stream computing service will stop operating.
[0087] The working process of this system is as follows:
[0088] (1) When the database service starts, a stream computing service is created as a background service. The stream computing service reads the stream computing task table and creates stream computing tasks.
[0089] (2) The user creates a stream computation on any time series table, which is the original table, and creates a new time series table to store the stream computation results, i.e., the result table; after the stream computation is successfully created, the stream computation task is sent to the background stream computation service.
[0090] (3) The background stream computing service creates a scheduled task based on the start time of the stream computing created by the user, and starts the stream computing task when the scheduled time is reached.
[0091] (4) When inserting time-series data, first write it to disk. After successful disk writing, check whether stream computing has been created on the original table to which the data was inserted:
[0092] If so, data is sent to the stream computing task. If the data received by the stream computing task is within the time window specified by the stream computing task, the data is written to the cache; otherwise, the data in the cache is aggregated and written to the result table. After clearing the cache, the new data is written to the cache.
[0093] (5) When the end time of the stream computation is reached, the stream computing task stops; from then on, data inserted into the original table of the stream computation is still written to disk, but is no longer sent to the stream computing task after the disk write succeeds.
[0094] Example 3:
[0095] This invention also provides an electronic device, including: a memory and a processor;
[0096] The memory stores the instructions executed by the computer.
[0097] The processor executes computer execution instructions stored in the memory, causing the processor to perform the real-time time-series data streaming processing method in any embodiment of the present invention.
[0098] The processor can be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor can be a microprocessor or any conventional processor.
[0099] Memory is used to store computer programs and/or modules. The processor implements various functions of the electronic device by running or executing the computer programs and/or modules stored in the memory, and by accessing data stored in the memory. Memory can mainly include a program storage area and a data storage area. The program storage area can store the operating system, at least one application program required for a function, etc.; the data storage area can store data created based on the use of the terminal, etc. In addition, memory can also include high-speed random access memory, and can also include non-volatile memory, such as hard disks, RAM, plug-in hard disks, Smart Media Cards (SMC), Secure Digital (SD) cards, flash memory cards, at least one disk storage device, flash memory devices, or other volatile solid-state storage devices.
[0100] Example 4:
[0101] This invention also provides a computer-readable storage medium storing multiple instructions, which are loaded by a processor to cause the processor to execute the real-time time-series data streaming processing method according to any embodiment of this invention. Specifically, a system or apparatus equipped with a storage medium may be provided, on which software program code implementing the functions of any of the above embodiments is stored, and the computer (or CPU or MPU) of the system or apparatus may read and execute the program code stored in the storage medium.
[0102] In this case, the program code read from the storage medium can itself implement the function of any of the above embodiments, and therefore the program code and the storage medium storing the program code constitute part of the present invention.
[0103] Embodiments of storage media for providing program code include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), magnetic tapes, non-volatile memory cards, and ROMs. Alternatively, program code can be downloaded from a server computer via a communication network.
[0104] Furthermore, it should be clear that not only can the program code read by the computer be executed, but also the operating system or other components operating on the computer can be instructed based on the program code to perform some or all of the actual operations, thereby realizing the function of any of the embodiments described above.
[0105] Furthermore, it is understood that the program code read from the storage medium may be written to a memory provided on an expansion board inserted into the computer or to a memory provided in an expansion unit connected to the computer. Then, based on the instructions of the program code, the CPU or other components installed on the expansion board or expansion unit execute some or all of the actual operations, thereby realizing the function of any of the embodiments described above.
[0106] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. A method for real-time time-series data stream computing processing, characterized in that the method is as follows:
Start the time-series database and start the stream computing service;
Create a stream computation: create a stream computation on any time-series table, which is the original table;
Insert time-series data: after the time-series data is written to disk, determine whether a stream computing task has been created on the time-series table into which the data is inserted, i.e., whether a stream computing task has been created on the original table into which the data is inserted;
If a stream computing task has been created, decide, based on the start time of the stream computing task, whether to send the time-series data to the stream computing task;
Execute the stream computing task;
The scheduled tasks of the stream computing service are set up as follows:
① When the database service starts, a stream computing service is created as a background service. The stream computing service reads the stream computing tasks in the stream computing task table and sets up scheduled tasks according to the start time of each stream computing task;
② When a stream computation is created, a scheduled task is set up for the stream computing task sent to the stream computing service;
When the database service ends, both the stream computing task and the stream computing service stop;
The stream computation is created as follows:
The parameters for creating a stream computation include start time, end time, aggregation rules, and time interval. After the creation parameters are parsed, a time-series table with the same name as the stream computation is created according to the result type of the aggregation calculation. This time-series table stores the results of the stream computing task and is the result table;
Write the stream computing task to the stream computing task table, write it to disk, and send it to the background stream computing service;
The stream computing task is executed as follows:
If a scheduled task created by the stream computing service is triggered, the stream computing task is started, and the stream computing task creates a cache based on the columns of the result table;
When time-series data is sent to a stream computing task, the time window to which the timestamp of the time-series data belongs is determined from the start time and the time interval:
If the timestamp belongs to the same time window as the time-series data previously written to the cache, the time-series data is written to the cache;
If the timestamp does not belong to the same time window as the time-series data previously written to the cache, the data in the cache is aggregated according to the aggregation rules set when the stream computation was created, the result is written to the result table, the cache is cleared, and the new data is written to the cache, until the end time of the stream computing task is reached and the stream computing task ends;
The background stream computing service creates a scheduled task based on the start time of the stream computing task created by the user, and starts the stream computing task when the scheduled time is reached.
2. A real-time time-series data stream computing and processing system, characterized in that the system includes:
The startup unit, used to start the time-series database and the stream computing service;
The creation unit, used to create a stream computation on any time-series table, which is the original table;
The insertion unit, used to determine, after time-series data has been written to disk, whether a stream computing task has been created on the time-series table into which the data is inserted, i.e., whether a stream computing task has been created on the original table into which the data is inserted;
If a stream computing task has been created, whether to send the time-series data to the stream computing task is decided based on the start time of the stream computing task;
The execution unit, used to execute stream computing tasks;
The scheduled tasks of the stream computing service are set up as follows:
① When the database service starts, a stream computing service is created as a background service. The stream computing service reads the stream computing tasks in the stream computing task table and sets up scheduled tasks according to the start time of each stream computing task;
② When a stream computation is created, a scheduled task is set up for the stream computing task sent to the stream computing service;
When the database service ends, both the stream computing task and the stream computing service stop;
The working process of this system is as follows:
(1) When the database service starts, a stream computing service is created as a background service. The stream computing service reads the stream computing task table and creates stream computing tasks;
(2) The user creates a stream computation on any time-series table, which is the original table, and creates a new time-series table to store the stream computation results, i.e., the result table; after the stream computation is successfully created, the stream computing task is sent to the background stream computing service;
(3) The background stream computing service creates a scheduled task based on the start time of the stream computation created by the user, and starts the stream computing task when the scheduled time is reached;
(4) When inserting time-series data, first write it to disk. After the disk write succeeds, check whether a stream computation has been created on the original table into which the data was inserted:
If so, the data is sent to the stream computing task. If the data received by the stream computing task is within the time window specified by the stream computing task, the data is written to the cache; otherwise, the data in the cache is aggregated and written to the result table, the cache is cleared, and the new data is written to the cache;
(5) When the end time of the stream computation is reached, the stream computing task stops; from then on, data inserted into the original table of the stream computation is still written to disk, but is no longer sent to the stream computing task after the disk write succeeds.
3. An electronic device, characterized in that it includes: a memory and at least one processor; the memory stores a computer program; the at least one processor executes the computer program stored in the memory, causing the at least one processor to perform the real-time time-series data stream computing processing method as described in claim 1.
4. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that can be executed by a processor to implement the real-time time-series data stream computing processing method as described in claim 1.