A data storage method and system based on data playback
By storing code execution dependencies in working memory during data replay and creating a simulated database operation layer, the problem of excessively long database storage time is solved, achieving more efficient data storage and system stability verification.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ADVANCED NEW TECHNOLOGIES CO LTD
- Filing Date
- 2018-12-11
- Publication Date
- 2026-06-30
AI Technical Summary
In existing technologies, the database storage method during data playback takes too long, resulting in low efficiency in system stability verification.
By configuring code execution dependency data into a predefined dataset and storing it in working memory during data replay, a simulated database operation layer is created to simulate data calls, thus avoiding dependence on third-party databases.
It improves data storage efficiency, shortens data storage time, reduces system complexity, and enhances the overall efficiency of data playback.
Figure CN117349261B_ABST
Abstract
Description
Technical Field
[0001] This specification relates to the field of information storage technology, and in particular to a data storage method and system based on data playback.
Background Technology
[0002] With the continuous development of network technology, various online services are constantly being presented to users. The integration of numerous online services under the same business system makes these systems extremely complex. Furthermore, most services require execution through core business systems. Therefore, regardless of whether the business system is complex or core, system stability is a crucial capability. Because these systems handle a large volume of business, system stability verification is extremely time-consuming. The system stability verification process primarily involves replaying online business data, which relies on database storage. Research and analysis have revealed that database storage accounts for over 90% of the total time spent on system stability verification.
[0003] In existing technologies, data playback relies on a database for data retrieval and storage. Compared with conventional databases, in-memory databases reduce the time overhead of data storage. However, because in-memory databases still follow the usage conventions of conventional databases, their storage method remains inherently time-consuming. A faster data storage solution is therefore needed.
Summary of the Invention
[0004] This specification provides a data storage method and system based on data playback to solve the problem that existing technologies require more data storage time due to database storage methods in data playback.
[0005] To solve the above-mentioned technical problems, the embodiments in this specification are implemented as follows:
[0006] This specification provides an embodiment of a data storage method based on data playback, comprising:
[0007] configuring code execution dependency data into a predetermined data set, wherein the code execution dependency data is the data required for data replay;
[0008] inserting the data set into the working memory for data playback;
[0009] creating a simulated database operation layer, which is used to call the code execution dependency data during the data replay process; and
[0010] storing, in the data set, the data generated by calling the code execution dependency data during the data replay process.
[0011] Additionally, in the method, configuring the code execution dependency data into a predetermined data set includes: extracting the code execution dependency data from the simulation database and storing the code execution dependency data into the data set.
[0012] Additionally, in the method, inserting the data set into the working memory of data playback includes: after the data playback thread starts, inserting the data set into the working memory corresponding to the data playback thread.
[0013] In addition, in the method, creating a simulated database operation layer includes: obtaining a simulated database operation layer by mapping the database operation layer of the offline database.
[0014] Additionally, in the method, the simulated database operation layer includes: a data access interface for a simulated data set, and a call service that simulates calls to the data dependent on the code execution.
[0015] In addition, the simulated database operation layer in the method further includes: simulating online databases, historical databases, and elastic databases; simulating routing rules that distinguish different data sets of the online databases, historical databases, and elastic databases; and simulating locks on the data sets.
[0016] Additionally, in the method, the code execution dependency data includes: data on which historical data depends and data on which environment data depends.
[0017] This specification provides an embodiment of a data storage system based on data playback, comprising:
[0018] The configuration module is used to configure code execution dependency data into a predetermined data set, wherein the code execution dependency data is the data required for data replay;
[0019] An insertion module is used to insert the data set into the working memory for data playback.
[0020] A creation module is used to create a simulated database operation layer, which is used to call the code execution dependency data during data replay.
[0021] The storage module is used to store the data generated during the data playback process by calling the code execution dependency data into the data set.
[0022] In addition, in the system, the configuration module is specifically used to: extract code execution dependency data from the simulation database and store the code execution dependency data in a data set.
[0023] In addition, in the system, the insertion module is specifically used to: insert the data set into the working memory corresponding to the data playback thread after the data playback thread starts.
[0024] In addition, in the system, the creation module is specifically used to: obtain a simulated database operation layer by mapping the database operation layer of the offline database.
[0025] In addition, in the system, the creation module is further used to: simulate the data access interface of the data set, and simulate the invocation service for calling the data that the code execution depends on.
[0026] In addition, in the system, the creation module is further used to: simulate online databases, historical databases, and elastic databases; simulate routing rules that distinguish different data sets of the online databases, historical databases, and elastic databases; and simulate locks on the data sets.
[0027] In addition, in the system, the code execution dependency data includes: data on which historical data depends and data on which environment data depends.
[0028] The above-described at least one technical solution adopted in the embodiments of this specification can achieve the following beneficial effects:
[0029] By storing code execution dependencies in a predefined data set, and then inserting this data set into the working memory for data replay, and by creating a simulated database operation layer to invoke the code execution dependencies in the data set during data replay, data generated during replay can be stored in the data set. Based on this solution, data can be stored in the working memory of the thread corresponding to data replay, effectively improving data storage efficiency and reducing data storage time.
Attached Figure Description
[0030] To more clearly illustrate the technical solutions in the embodiments or prior art of this specification, the drawings used in the description of the embodiments or prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in this specification. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0031] Figure 1 is a schematic diagram of business data playback in a simulation system, a practical application scenario of the solution described in this specification;
[0032] Figure 2 is a flowchart of a data storage method based on data playback provided in an embodiment of this specification;
[0033] Figure 3 is a schematic diagram of the overall data playback method provided in the embodiments of this specification;
[0034] Figure 4 is a schematic diagram of the simulated database operation layer and data set calls provided in the embodiments of this specification;
[0035] Figure 5 is a schematic diagram of a data storage system based on data playback provided in an embodiment of this specification.
Detailed Implementation
[0036] To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this specification, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of this application.
[0037] Data replay, as the foundation of a comprehensive system evaluation framework, is widely used in processes such as system stability verification, code maturity verification, and architecture maturity verification, for example verifying a system's financial security and performance stability. Data from the real environment is imported into a simulation environment and replayed there. Because the business logic executed during replay does not produce any real business results, system stability can be verified without real business consequences, thus achieving the purpose of simulated testing of system business.
[0038] Figure 1 is a schematic diagram of business data replay in a simulation system, a practical application scenario of the solution described in this specification. The simulation system initiates the replay of real online business data. The simulation database stores all data from the real online business. Before the replay begins, the data required for the replay, i.e., the code execution dependency data, is extracted from the simulation database and configured into a predetermined data set. When the replay officially begins, the data set is inserted into the working memory of the thread corresponding to the replay. The simulation server calls the data within the data set and stores the data generated during the replay process back into the data set. While this embodiment describes the replay of real online business data, this application is not limited to this; it can also include replays of data related to financial security, code maturity, and architecture maturity. Since working memory offers faster storage than an in-memory database, using the working memory occupied by the replay thread for data storage both shortens storage time and eliminates reliance on third-party databases, reducing system complexity.
[0039] Based on the above scenario, the solution in this specification will be described in detail below.
[0040] Figure 2 is a flowchart of a data storage method based on data playback provided in an embodiment of this specification. The method may specifically include the following steps:
[0041] First, in step S210, code execution dependency data is configured into a predetermined data set, wherein the code execution dependency data is the data required for data playback.
[0042] In one or more embodiments of this specification, before the simulation system initiates data replay of real online business, all data of the real online business needs to be imported into the simulation database. Before the data replay begins, the data required for this business replay is configured into a predetermined data set. The data required for this business replay refers to the code execution dependency data needed for this business replay. The configuration of the data set is completed by extracting the code execution dependency data from the simulation database and storing the code execution dependency data in the data set.
[0043] The relationship between the above data replay and business replay is that when verifying the stability of the system, it is necessary to complete the replay of online business in the offline simulation system. However, the offline simulation replay process is actually a data replay of the simulation data of the online business. That is, business replay is a replay of business data.
[0044] It should be noted that all data from real online business operations includes not only data in the real online database but also data such as business requests. The data in the real online database is accumulated from real historical business operations. Code execution dependency data refers to the data required for this business replay. Code execution dependency data includes at least the data that the historical data of the business replay depends on and the data that the environment data depends on. In practical applications, code execution dependency data can also be called business execution dependency data. This difference in name does not limit the scope of protection of the solution in this specification.
[0045] In a specific embodiment, for example, the online business is as follows: user "Zhang San" initiates a transfer of 10,000 yuan to "Li Si". When performing an offline simulation playback of this online business, it is necessary to extract the code execution dependency data corresponding to this online business from the simulation database and store it in a dataset. For example, the code execution dependency data corresponding to this online business includes: data dependent on historical data, such as user "Zhang San's" account balance; and data dependent on environmental data, such as some configuration information for this transfer business.
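Step S210 for the transfer example can be sketched as follows. This is a minimal illustration only: the dictionary standing in for the simulation database, the key layout, the function name `configure_data_set`, and every balance and limit value are assumptions made for the example and do not come from this specification.

```python
import copy

# Hypothetical stand-in for the simulation database; all values are illustrative.
SIMULATION_DB = {
    ("account", "Zhang San"): {"balance": 50000},
    ("account", "Li Si"): {"balance": 1200},
    ("account", "Wang Wu"): {"balance": 300},
    ("config", "single_transfer_limit"): 20000,
}


def configure_data_set(simulation_db, required_keys):
    """Step S210 sketch: copy only the entries this replay depends on."""
    return {key: copy.deepcopy(simulation_db[key]) for key in required_keys}


required_keys = [
    ("account", "Zhang San"),             # history-dependent data (account balance)
    ("account", "Li Si"),
    ("config", "single_transfer_limit"),  # environment-dependent data (configuration)
]
data_set = configure_data_set(SIMULATION_DB, required_keys)

# Changes made during replay stay in the data set and never reach the simulation database.
data_set[("account", "Zhang San")]["balance"] -= 10000
```

Copying the entries, rather than referencing them, keeps the simulation database unchanged however the replay modifies the data set.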
[0046] Next, in step S220, the data set is inserted into the working memory for data playback.
[0047] In one or more embodiments of this specification, this step is executed after data playback begins. The simulation system initiates data playback of the online business, and the simulation server creates a data playback thread or selects a data playback thread from a thread pool. The data playback thread can be single-threaded, i.e., the data playback program itself. Once the data playback thread is created or selected, the corresponding working memory for the data playback thread is generated. After the data playback officially begins, the pre-configured data set can be inserted into the working memory of the data playback thread.
[0048] It should be noted that a data set refers to any data structure (such as a map object) that supports indexing and can manipulate in-memory objects. Inserting a data set into working memory in effect uses the memory's own data storage capability.
[0049] In one embodiment, step S210 first stores the code execution dependency data in a data set. Then, step S220, after the data playback execution begins, inserts the pre-configured data set into the working memory of the data playback thread. Steps S210 and S220 complete the configuration of the data set in the working memory, ensuring that the working memory has the data dependency conditions required for data playback, thus laying the foundation for calling the working memory during data playback. In this embodiment, the data set configuration process for the working memory is simple and has no complex dependencies.
[0050] As described in the foregoing embodiments, since the lifecycle of the working memory of the data replay thread is the same as the thread's lifecycle, the lifecycle of the data stored in the data set is also the same as the thread's lifecycle. When the data replay thread ends, the working memory is released, and the data stored in the data set disappears. Since the data replay process does not require persistent data storage, the working memory of the data replay thread is used instead of the actual offline database to implement database access and storage during data replay. Because this working memory is initialized each time the data replay thread starts, no data cleanup operation is required, which shortens the data replay time. Figure 3 is a schematic diagram of the overall data playback method provided in the embodiments of this specification. As shown in Figure 3, the overall approach to data replay consists of data preparation and business execution. Furthermore, this embodiment implements the calls through the working memory of the data replay thread itself; therefore, the data replay process does not require RPC (Remote Procedure Call) calls to a third-party database, avoiding the time cost of remote calls.
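The thread-scoped lifecycle described above can be illustrated with a small sketch. It assumes that Python's `threading.local` is an acceptable stand-in for the "working memory corresponding to the data playback thread"; the specification does not name a concrete mechanism, and the key names and values are illustrative.

```python
import threading

# Per-thread storage, used here as a stand-in for the replay thread's working memory.
working_memory = threading.local()

observed = {}


def replay_thread(preconfigured):
    # Step S220 sketch: the data set is inserted after the thread has started.
    working_memory.data_set = dict(preconfigured)
    # Data generated during replay is written to the same data set.
    working_memory.data_set["replay_result"] = "success"
    observed["keys_inside_thread"] = sorted(working_memory.data_set)


thread = threading.Thread(
    target=replay_thread, args=({"balance:Zhang San": 50000},)
)
thread.start()
thread.join()

# The data set existed only for the replay thread; the main thread never sees it,
# so there is nothing to clean up once the replay thread has ended.
visible_in_main_thread = hasattr(working_memory, "data_set")
```

Each new replay thread starts with empty thread-local storage, which mirrors the statement that the working memory is initialized each time the thread starts.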
[0051] In step S230, a simulated database operation layer is created. This simulated database operation layer is used to call the code execution dependency data during the data playback process.
[0052] In one or more embodiments of this specification, the simulated database operation layer may be created before the simulation system initiates data playback of the online real business. The simulated database operation layer may be created by mapping the database operation layer of the offline database. The offline database is the real database in the simulation system, and its database operation layer includes data access and storage operations. By simulating the database operation layer of the offline database, a simulated database operation layer adapted to the data set in the working memory is obtained, containing all the functions of the original database operation layer.
[0053] Figure 4 is a schematic diagram of the simulated database operation layer and data set calls provided in the embodiments of this specification. As shown in Figure 4, the simulated database operation layer may include: a data access interface for the simulated data set, and a call service that simulates calls to the code execution dependency data. The call service generally needs to simulate all the calls the simulation server makes to the data set. For example, it may simulate `put` (insert / add operation) calls, `get` (query operation) calls, and `remove` (delete / modify operation) calls to the code execution dependency data in the data set.
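As a rough illustration of such a layer, the sketch below exposes `put`, `get`, and `remove` over an in-memory map. The class name `SimulatedOperationLayer` and the key format are hypothetical; a real layer would mirror whatever interface the offline database's operation layer actually exposes.

```python
class SimulatedOperationLayer:
    """Hypothetical data access interface backed by the in-memory data set."""

    def __init__(self, data_set):
        self._data_set = data_set

    def put(self, key, value):
        # Insert / add: store the value under the key.
        self._data_set[key] = value

    def get(self, key):
        # Query: return the stored value, or None if the key is absent.
        return self._data_set.get(key)

    def remove(self, key):
        # Delete: drop the key and return what was stored, or None.
        return self._data_set.pop(key, None)


data_set = {"balance:Zhang San": 50000}
layer = SimulatedOperationLayer(data_set)

layer.put("balance:Li Si", 1200)
balance = layer.get("balance:Zhang San")
removed = layer.remove("balance:Li Si")
missing = layer.get("balance:Li Si")
```

Because the business code only sees the layer's methods, it can run unchanged whether the calls land in a database or in the replay thread's data set.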
[0054] In one or more embodiments of this specification, the simulated database operation layer may further include:
[0055] Simulate online databases, historical databases, and elastic databases; simulate routing rules that distinguish different data sets in online databases, historical databases, and elastic databases; simulate locks on data sets.
[0056] Furthermore, by simulating online databases, historical databases, and elastic databases, we can simulate data source scenarios for different databases. When the corresponding data sources in the real offline databases are not interconnected, we need to simulate the data isolation of these different data sources in the offline databases. Then, we can simulate the routing rules that distinguish between different data sets in the online database, historical database, and elastic database.
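One possible shape for this part of the layer is sketched below: three isolated data sets, a routing rule that picks one of them, and a lock per data set. The routing criteria (the `archived` and `overflow` flags) are invented purely for illustration, since the specification does not state the actual rules, and `threading.Lock` is an assumed stand-in for the simulated locks.

```python
import threading


class SimulatedDataSources:
    """Sketch of isolated online / historical / elastic data sets with routing and locks."""

    def __init__(self):
        self._sets = {"online": {}, "historical": {}, "elastic": {}}
        # One lock per data set simulates locking at the data set level.
        self._locks = {name: threading.Lock() for name in self._sets}

    def route(self, record):
        # Hypothetical routing rule; the real rules are not given in the specification.
        if record.get("archived"):
            return "historical"
        if record.get("overflow"):
            return "elastic"
        return "online"

    def put(self, key, record):
        source = self.route(record)
        with self._locks[source]:
            self._sets[source][key] = record
        return source

    def get(self, source, key):
        with self._locks[source]:
            return self._sets[source].get(key)


sources = SimulatedDataSources()
first = sources.put("t1", {"amount": 10000})
second = sources.put("t0", {"amount": 5, "archived": True})
third = sources.put("t2", {"amount": 7, "overflow": True})
```

Keeping the three maps separate reproduces the data isolation of the offline databases: a record routed to the historical set is not visible through the online set.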
[0057] In one or more embodiments of this specification, the simulation described above can be implemented with mocking, i.e., simulating service calls based on Mockito. A DAO (Data Access Object) is an object-oriented database interface; therefore, simulating the data access interface of a data set can be understood as a mock operation on the DAO, i.e., a DAO mock. Simulating the calls to the code execution dependency data can be understood as a mock operation on all the call operations (e.g., insert / modify operations, query operations, delete operations) required by the data set. Simulating the online database, historical database, and elastic database can be understood as a mock operation on those databases. Simulating the routing rules that distinguish different data sets of the online database, historical database, and elastic database can be understood as a mock operation on the routing rules of the different data sources.
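The DAO-mock idea can be sketched as follows. The specification names Mockito; this example instead uses Python's standard `unittest.mock` as a stand-in, and the `AccountDao` interface with its `insert`, `query`, and `delete` methods is a hypothetical example rather than an interface taken from this specification.

```python
from unittest.mock import create_autospec


class AccountDao:
    """Hypothetical DAO interface of the offline database."""

    def insert(self, key, row):
        raise NotImplementedError

    def query(self, key):
        raise NotImplementedError

    def delete(self, key):
        raise NotImplementedError


def make_dao_mock(data_set):
    """Build a mock with AccountDao's signatures whose calls hit the in-memory data set."""
    dao = create_autospec(AccountDao, instance=True)
    dao.insert.side_effect = lambda key, row: data_set.__setitem__(key, row)
    dao.query.side_effect = lambda key: data_set.get(key)
    dao.delete.side_effect = lambda key: data_set.pop(key, None)
    return dao


data_set = {}
dao = make_dao_mock(data_set)

dao.insert("Zhang San", {"balance": 50000})
row = dao.query("Zhang San")
deleted = dao.delete("Zhang San")
```

Because the mock is built from the DAO's own signatures, a call with the wrong number of arguments raises `TypeError`, which helps keep the simulated layer in step with the interface it replaces.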
[0058] It should be noted that the step numbers mentioned in the embodiments of this specification are not a restriction on the order of execution of the steps. For example, step S230 can be executed before step S210.
[0059] Finally, in step S240, the data generated during the data playback process by calling the code execution dependency data is stored in the data set.
[0060] In one or more embodiments of this specification, data replay refers to the replay of business data from real online business operations. Before the data replay begins, all data from the real online business is imported into the simulation database. After the simulation system initiates the data replay of the online business, the code execution dependency data required for the business to be replayed is extracted from the simulation database and configured into a data set. Once the data replay officially begins, the configured data set is inserted into the working memory. During the data replay process, the data set in the working memory is called, and the data generated by calling the data set in the working memory during the data replay process is stored in the data set. Calling the data set in the working memory is actually calling the code execution dependency data in the data set. In the embodiments of this specification, by storing the data generated by the data replay into a data set, which is part of the working memory, the data storage operation is realized using the memory of the data replay thread. The data storage speed in memory is much faster than the storage speed of an offline memory-based database, thereby reducing data storage time and improving data storage efficiency.
[0061] Specifically, the data replayed refers to all data during the business replay process, including but not limited to code execution dependency data, user business requests, results returned to the user, and process data of the business replay.
[0062] In a specific embodiment, for example, a replay transaction is as follows: User "Zhang San" initiates a transfer of 10,000 yuan to "Li Si". In this transfer transaction scenario, the replay data includes: code execution dependency data, such as user "Zhang San's" account balance and the configuration information of the transfer transaction; user business request, such as the transfer request initiated by user "Zhang San"; the result returned to the user, such as the success / failure result returned to user "Zhang San"; and process data of the business replay, such as the process data of user "Zhang San" transferring money to "Li Si". By calling data from the data set, the business data is replayed, and the data generated in the data replay is finally stored in the data set. The data generated by the data replay in this transfer transaction is, for example, user "Zhang San" successfully transferred 10,000 yuan to "Li Si".
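The transfer replay above, including step S240, might look like the following sketch. The balances, the transfer limit, the key layout, and the function `replay_transfer` are all assumptions introduced for the example; only the 10,000 yuan transfer from "Zhang San" to "Li Si" comes from the text.

```python
def replay_transfer(data_set, request):
    """Replay one transfer against the data set and store what the replay generates."""
    payer = data_set[("account", request["from"])]
    payee = data_set[("account", request["to"])]
    limit = data_set[("config", "single_transfer_limit")]
    amount = request["amount"]

    succeeded = amount <= limit and payer["balance"] >= amount
    if succeeded:
        payer["balance"] -= amount
        payee["balance"] += amount
    result = "success" if succeeded else "failure"

    # Step S240 sketch: the request, the result, and process data go back into the data set.
    data_set[("replay", "request")] = dict(request)
    data_set[("replay", "result")] = result
    data_set[("replay", "process")] = {"transferred": amount if succeeded else 0}
    return result


# Illustrative code execution dependency data (values are assumptions).
data_set = {
    ("account", "Zhang San"): {"balance": 50000},
    ("account", "Li Si"): {"balance": 1200},
    ("config", "single_transfer_limit"): 20000,
}
request = {"from": "Zhang San", "to": "Li Si", "amount": 10000}
result = replay_transfer(data_set, request)
```

Everything the replay reads and writes lives in one map, so no database call is made at any point during the replay.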
[0063] Following the same line of thought, embodiments of this specification also provide a data storage system based on data playback. Figure 5 shows a data storage system based on data playback provided as an embodiment of this specification, which mainly includes:
[0064] Configuration module 501 is used to configure code execution dependency data into a predetermined data set, wherein the code execution dependency data is the data required for data replay;
[0065] Insertion module 502 is used to insert the data set into the working memory of data playback;
[0066] Creation module 503 is used to create a simulated database operation layer, which is used to call the code execution dependency data during data playback.
[0067] The storage module 504 is used to store the data generated during the data playback process by calling the code execution dependency data into the data set.
[0068] According to an embodiment of this application, the configuration module 501 is specifically used to: extract code execution dependency data from the simulation database and store the code execution dependency data in a data set.
[0069] According to an embodiment of this application, the insertion module 502 is specifically used to: insert the data set into the working memory corresponding to the data playback thread after the data playback thread starts.
[0070] According to an embodiment of this application, the creation module 503 is specifically used to: obtain a simulated database operation layer by mapping the database operation layer of the offline database.
[0071] According to an embodiment of this application, the creation module 503 is further configured to: simulate a data access interface for a data set, and simulate a call service for calling the code execution dependent data.
[0072] According to an embodiment of this application, the creation module 503 is further configured to: simulate an online database, a historical database, and an elastic database; simulate routing rules that distinguish different data sets of the online database, the historical database, and the elastic database; and simulate locks on the data sets.
[0073] According to embodiments of this application, the code execution dependency data includes: historical data dependency data and environmental data dependency data.
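How modules 501 to 504 might cooperate is sketched below. The class and method names are hypothetical, `threading.local` is an assumed stand-in for the replay thread's working memory, and the simulation database is reduced to a plain dictionary with illustrative values.

```python
import threading


class ConfigurationModule:  # corresponds to module 501
    def configure(self, simulation_db, required_keys):
        return {key: simulation_db[key] for key in required_keys}


class InsertionModule:  # corresponds to module 502
    def __init__(self):
        self.working_memory = threading.local()

    def insert(self, data_set):
        self.working_memory.data_set = data_set


class CreationModule:  # corresponds to module 503
    def create(self, working_memory):
        class OperationLayer:
            def get(self, key):
                return working_memory.data_set.get(key)

            def put(self, key, value):
                working_memory.data_set[key] = value

        return OperationLayer()


class StorageModule:  # corresponds to module 504
    def store(self, layer, key, generated):
        layer.put(key, generated)


def run_replay(simulation_db, required_keys):
    snapshot = {}

    def worker():
        inserter = InsertionModule()
        inserter.insert(ConfigurationModule().configure(simulation_db, required_keys))
        layer = CreationModule().create(inserter.working_memory)
        balance = layer.get("balance:Zhang San")
        outcome = "success" if balance >= 10000 else "failure"
        StorageModule().store(layer, "result", outcome)
        snapshot.update(inserter.working_memory.data_set)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return snapshot


final_state = run_replay(
    {"balance:Zhang San": 50000, "balance:Wang Wu": 300},
    ["balance:Zhang San"],
)
```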
[0074] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired result. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired result. In some embodiments, multitasking and parallel processing are possible or may be advantageous.
[0075] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to interchangeably. Each embodiment focuses on describing the differences from other embodiments. In particular, the embodiments for apparatus, electronic devices, and non-volatile computer storage media are basically similar to the method embodiments, so the descriptions are relatively simple; relevant parts can be referred to the descriptions of the method embodiments.
[0076] This specification is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this specification. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.
[0077] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM), and / or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
[0078] Computer-readable media includes both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.
[0079] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0080] This specification can be described in the general context of computer-executable instructions that are executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a specific task or implement a specific abstract data type. This specification can also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside on local and remote computer storage media, including storage devices.
[0081] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions in the method embodiments.
[0082] The above description is merely an embodiment of this specification and is not intended to limit this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principle of this application should be included within the scope of the claims of this application.
Claims
1. A data storage method based on data playback, comprising: configuring code execution dependency data into a predetermined data set, wherein the code execution dependency data is the data required for data replay; inserting the data set into the working memory for data playback; creating a simulated database operation layer, which is used to call the code execution dependency data during data replay, wherein the simulated database operation layer is adapted to the data set in the working memory and includes the functionality of the database operation layer of the offline database, wherein the simulated database operation layer includes a data access interface for the simulated data set and a call service that simulates calls to the code execution dependency data, and wherein the simulated database operation layer simulates online databases, historical databases, and elastic databases, simulates routing rules that distinguish between different data sets of the online database, historical database, and elastic database, and simulates locks on the data sets; and storing, in the data set, the data generated by calling the code execution dependency data during the data replay process.
2. The method of claim 1, wherein configuring the code execution dependency data into the predetermined data set comprises: extracting the code execution dependency data from a simulation database and storing the code execution dependency data in the data set.
3. The method of claim 1, wherein inserting the data set into the working memory used for data playback comprises: after a data playback thread starts, inserting the data set into the working memory corresponding to the data playback thread.
4. The method of claim 3, wherein the lifecycle of the data stored in the data set is the same as the lifecycle of the data playback thread.
5. The method of claim 1, wherein creating the simulated database operation layer comprises: obtaining the simulated database operation layer by mapping the database operation layer of the offline database.
6. The method of claim 1, wherein the offline database is a real database in a simulation system, and the database operation layer of the offline database includes data retrieval and data storage operations on the offline database.
7. The method of claim 1, wherein the simulated database operation layer adapted to the data set in the working memory is obtained by simulating the database operation layer of the offline database.
8. The method of claim 1, wherein the code execution dependency data includes: historical dependency data and environmental dependency data.
9. The method of claim 1, wherein the replayed data is the data in a business replay process and includes, but is not limited to, the code execution dependency data, user business requests, results returned to the user, and process data of the business replay.
10. A data storage system based on data playback, comprising: a configuration module used to configure code execution dependency data into a predetermined data set, wherein the code execution dependency data is the data required for data playback; an insertion module used to insert the data set into the working memory used for data playback; a creation module used to create a simulated database operation layer, the simulated database operation layer being used to call the code execution dependency data during data playback, wherein the simulated database operation layer is adapted to the data set in the working memory and includes the functions of the database operation layer of an offline database; the creation module being further used to simulate a data access interface of the data set and a call service that calls the code execution dependency data; the creation module being further used to simulate an online database, a historical database, and an elastic database, to simulate routing rules that distinguish the different data sets of the online database, the historical database, and the elastic database, and to simulate locks on the data sets; and a storage module used to store, in the data set, the data generated by calling the code execution dependency data during data playback.
11. The system of claim 10, wherein the configuration module is specifically used to: extract the code execution dependency data from a simulation database and store the code execution dependency data in the data set.
12. The system of claim 10, wherein the insertion module is specifically used to: after a data playback thread starts, insert the data set into the working memory corresponding to the data playback thread.
13. The system of claim 10, wherein the creation module is specifically used to: obtain the simulated database operation layer by mapping the database operation layer of the offline database.
14. The system of claim 10, wherein the code execution dependency data includes: historical dependency data and environmental dependency data.