Method and device for determining hotspot data, electronic equipment and storage medium

By writing data to a time window table based on service instance identifiers in a distributed environment and combining this with load balancing strategies to collect hot data, the problem of mixed data writing and inaccurate statistics in multi-instance concurrent scenarios is solved, achieving real-time, global, and accurate identification of hot data.

CN122309845A · Pending · Publication Date: 2026-06-30 · BEIJING BAIDU NETCOM SCI & TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD
Filing Date: 2026-03-31
Publication Date: 2026-06-30

Abstract

This disclosure provides a method, apparatus, electronic device, and storage medium for determining hotspot data, relating to Internet technology, and particularly to the field of intelligent recommendation technology. The method includes: receiving request message data sent by a client, wherein the request message data includes business data corresponding to multiple service instances; writing the business data into a time window data table corresponding to the service instance based on the service instance's identification information; determining candidate hotspot data based on the access frequency corresponding to the data in the time window data table; statistically analyzing the total access frequency of the candidate hotspot data within the time window of the distributed cluster according to a preset load balancing strategy; and determining the target hotspot data based on the statistical results. This disclosure improves the real-time performance, comprehensiveness, and accuracy of hotspot identification.

Description

Technical Field

[0001] This disclosure relates to Internet technology, and more particularly to the field of intelligent recommendation technology, specifically to a method, apparatus, electronic device, and storage medium for determining hot data.

Background Technology

[0002] With the rapid development of internet content ecosystems and recommendation system technologies, users' demands for real-time, novel, and personalized content continue to rise. Real-time trending topic generation has become a core technological element in information recommendation, content distribution, and social media dissemination. Automating the generation of highly popular, relevant, and high-quality topics effectively enhances the recommendation system's ability to perceive trends across the internet, improves content distribution efficiency, and enhances user interaction experience. Therefore, this technology is widely used in mainstream recommendation systems.

Summary of the Invention

[0003] This disclosure presents a method, apparatus, electronic device, and storage medium for determining hotspot data.

[0004] According to a first aspect of this disclosure, a method for determining hot data is provided, comprising: receiving request message data sent by a client, wherein the request message data includes business data corresponding to multiple service instances; writing the business data into a time window data table corresponding to the service instance according to the identification information of the service instance; determining candidate hot data according to the access frequency corresponding to the data in the time window data table; statistically analyzing the total access frequency of the candidate hot data in the time window of the distributed cluster according to a preset load balancing strategy, and determining target hot data based on the statistical results.

[0005] According to a second aspect of this disclosure, a method for determining hotspot data is provided, comprising: loading a business configuration file, wherein the business configuration file includes: identification information of service instances, business scenarios included in each service instance, sampling rate, and reporting period; writing business data into a front-end buffer through a preset interface according to the sampling rate; switching the front-end buffer and the back-end buffer according to the reporting period, and for the switched back-end buffer, statistically analyzing the access frequency of data in the back-end buffer according to the business scenario; generating request message data based on the statistical results, and reporting the request message data to the server.

[0006] According to a third aspect of this disclosure, a device for determining hotspot data is provided, comprising: a receiving module configured to receive request message data sent by a client, wherein the request message data includes business data corresponding to multiple service instances; a first writing module configured to write the business data into a time window data table corresponding to the service instance according to the identification information of the service instance; a first determining module configured to determine candidate hotspot data according to the access frequency corresponding to the data in the time window data table; and a second determining module configured to statistically analyze the total access frequency of the candidate hotspot data in the time window of the distributed cluster according to a preset load balancing strategy, and determine the target hotspot data based on the statistical results.

[0007] According to a fourth aspect of this disclosure, a device for determining hotspot data is provided, comprising: a loading module configured to load a business configuration file, wherein the business configuration file includes: identification information of service instances, business scenarios included in each service instance, sampling rate, and reporting period; a second writing module configured to write business data to a front-end buffer through a preset interface according to the sampling rate; a statistics module configured to switch between a front-end buffer and a back-end buffer according to the reporting period, and for the switched back-end buffer, to perform statistics on the access frequency of data in the back-end buffer according to the business scenario; and a reporting module configured to generate request message data based on the statistical results and report the request message data to the server.

[0008] According to a fifth aspect of this disclosure, an electronic device is provided, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform a method as described in any implementation of the first or second aspect.

[0009] According to a sixth aspect of this disclosure, a non-transitory computer-readable storage medium is provided storing computer instructions for causing a computer to perform a method as described in any implementation of the first or second aspect.

[0010] According to a seventh aspect of this disclosure, a computer program product is provided, including a computer program that, when executed by a processor, implements the method as described in any of the implementations of the first or second aspect.

[0011] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.

Description of the Drawings

[0012] The accompanying drawings are provided for a better understanding of this solution and do not constitute a limitation of this disclosure. In the drawings:
Figure 1 is an exemplary system architecture diagram to which this disclosure can be applied;
Figure 2 is a flowchart of one embodiment of the method for determining hotspot data according to this disclosure;
Figure 3 is a flowchart of another embodiment of the method for determining hotspot data according to this disclosure;
Figure 4 is a flowchart of an embodiment of step 303 in Figure 3;
Figure 5 is a flowchart of yet another embodiment of the method for determining hotspot data according to this disclosure;
Figure 6 is a flowchart of an embodiment of step 204 in Figure 2;
Figure 7 is a flowchart of yet another embodiment of the method for determining hotspot data according to this disclosure;
Figure 8 is a flowchart of one embodiment of the method for determining hotspot data according to this disclosure;
Figure 9 is an application scenario diagram of the method for determining hotspot data according to this disclosure;
Figure 10 is a schematic structural diagram of one embodiment of the hotspot data determination device according to this disclosure;
Figure 11 is a schematic structural diagram of one embodiment of the hotspot data determination device according to this disclosure;
Figure 12 is a block diagram of an electronic device used to implement the hotspot data determination method of the embodiments of this disclosure.

Detailed Implementation

[0013] The exemplary embodiments of this disclosure are described below with reference to the accompanying drawings, including various details of the embodiments to aid understanding, and should be considered merely exemplary. Therefore, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of this disclosure. Similarly, for clarity and brevity, descriptions of well-known functions and structures are omitted in the following description.

[0014] It should be noted that, unless otherwise specified, the embodiments and features described in this disclosure can be combined with each other. This disclosure will now be described in detail with reference to the accompanying drawings and embodiments.

[0015] Figure 1 shows an exemplary system architecture 100 to which an embodiment of the hotspot data determination method or hotspot data determination apparatus of this disclosure can be applied.

[0016] As shown in Figure 1, system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. Network 104 serves as the medium for providing communication links between terminal devices 101, 102, and 103 and server 105. Network 104 may include various connection types, such as wired communication links, wireless communication links, or fiber optic cables.

[0017] Users can use terminal devices 101, 102, and 103 to interact with server 105 via network 104 to receive or send messages. Various applications that enable information communication between terminal devices 101, 102, and 103 and server 105 can be installed on the terminal devices, such as cloud storage applications and instant messaging applications.

[0018] Terminal devices 101, 102, and 103 and server 105 can be either hardware or software. When terminal devices 101, 102, and 103 are hardware, they can be various electronic devices with displays, including but not limited to smartphones, tablets, laptops, and desktop computers. When terminal devices 101, 102, and 103 are software, they can be installed in the aforementioned electronic devices, and can be implemented as multiple software programs or software modules, or as a single software program or software module; no specific limitation is made here. When server 105 is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server. When server 105 is software, it can be implemented as multiple software programs or software modules, or as a single software program or software module; no specific limitation is made here.

[0019] Server 105 can provide various services through its built-in applications. For example, users can operate the server through applications on terminal devices 101, 102, and 103 and send statistical requests for hot data to server 105. Server 105 can receive and process these hot data statistical requests, performing the following processing: receiving request message data sent by the client, wherein the request message data includes business data corresponding to multiple service instances; writing the business data into the time window data table corresponding to the service instance based on the service instance's identification information; determining candidate hot data based on the access frequency corresponding to the data in the time window data table; statistically analyzing the total access frequency of the candidate hot data in the time window of the distributed cluster according to a preset load balancing strategy, and determining the target hot data based on the statistical results.

[0020] It should be noted that the hotspot data determination method provided in this embodiment is generally executed by server 105, and correspondingly, the hotspot data determination device is generally set in server 105.

[0021] It should be understood that the numbers of terminal devices, networks, and servers shown in Figure 1 are merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.

[0022] Continuing to refer to Figure 2, a flow 200 of an embodiment of the method for determining hotspot data according to this disclosure is shown. The method for determining hotspot data includes the following steps:

Step 201: Receive request message data sent by the client.

[0023] In this embodiment, the execution entity of the hotspot data determination method (e.g., server 105 shown in Figure 1) receives request message data sent by the client. This request message data includes business data corresponding to multiple service instances. Specifically, the client first generates request message data based on the business configuration file. This request message data typically includes business data corresponding to multiple service instances, and the business data includes access data together with mapping data that gives the access frequency of that access data. After generating the request message data, the client reports it to the server, and the execution entity receives it.

[0024] It should be noted that the ReportRequest message is a standardized data carrier for clients to submit access statistics requests for hot data to the aggregation server. It is the core transmission message for distributed multi-scenario hot data statistics, and its content contains key information for server-side data aggregation and hotspot calculation. The ReportRequest message generally includes: the service instance identifier, which is the unique identity of the service instance to which the client belongs; and multi-scenario key-count mapping data, which is the core data body of the ReportRequest message. It contains statistical data on the access frequency of hot data (key) under all configured scenarios within a reporting period. It is generally divided by scenario name, and each scenario corresponds to a set of key-count key-value pairs, where key is the key of business access data, and count is the access frequency of the key in this reporting period.
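As an illustration of the message layout described above, the following is a minimal Python sketch of a ReportRequest held in memory. The class layout, the field names `instance_id` and `scenes`, the helper `add`, and the sample identifiers are assumptions made for this example; the disclosure only states that the message carries a service instance identifier and per-scenario key-count mappings.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReportRequest:
    """Hypothetical in-memory form of the ReportRequest message."""

    # Unique identifier of the service instance the client belongs to.
    instance_id: str
    # Scenario name -> {key -> access count within this reporting period}.
    scenes: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def add(self, scene: str, key: str, count: int = 1) -> None:
        """Accumulate `count` accesses of `key` under `scene`."""
        bucket = self.scenes.setdefault(scene, {})
        bucket[key] = bucket.get(key, 0) + count


# Example: one instance reporting two scenarios for a single period.
req = ReportRequest(instance_id="feed-service-01")
req.add("homepage", "topic:42", 3)
req.add("homepage", "topic:42", 2)
req.add("search", "topic:7")
```

Grouping counts by scenario first keeps each scenario's key space separate, which matches the per-scenario division the paragraph describes.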

[0025] Step 202: Write the business data into the time window data table corresponding to the service instance based on the service instance's identification information.

[0026] In this embodiment, the execution entity writes the business data into the time window data table corresponding to the service instance based on the service instance's identification information. That is, the execution entity first determines the identification information of each service instance in the ReportRequest message, then determines the time window data table corresponding to that service instance based on the identification information, and finally writes the business data corresponding to that service instance into the time window data table.

[0027] It should be noted that a service instance refers to a running copy of a program, and a single service instance can include multiple business scenarios. A time window data table, also known as a time window data map, is a concurrent map with a key-value pair storage structure allocated by the system for each time window. It is used for non-blocking writing, fast aggregation, and temporary storage of business data such as hotspot signals, topic statistics, and access counts within that time window.

[0028] Specifically, the aforementioned execution entity determines the sliding time window data table corresponding to the service instance based on the service instance's unique identifier. The sliding time window data table is a set of data tables divided by a fixed duration, with each instance corresponding to an independent sliding table, thus avoiding data mixing. The sliding window duration can generally be set to 5 minutes, and the window scrolls automatically based on timestamps.

[0029] Next, the execution entity retrieves the generation timestamp of the scene data and, based on this timestamp, determines the current time window data table from the sliding time window data table. Specifically, it identifies the time window containing the generation timestamp within the time range covered by the sliding time window and retrieves the data table for that time window, i.e., the current time window data table. Finally, the scene data is written into the current time window data table.
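The write path of step 202 can be sketched as follows. This is a simplified, single-threaded illustration rather than the disclosed implementation: the nested-dictionary layout, the names `tables`, `window_start`, and `write`, and the alignment of windows to multiples of the window length are assumptions. Only the 5-minute window length comes from the text.

```python
from collections import defaultdict
from typing import Dict

WINDOW_SECONDS = 300  # 5-minute windows, the duration suggested in the text

# instance id -> window start (epoch seconds) -> {key: access count}
tables: Dict[str, Dict[int, Dict[str, int]]] = defaultdict(dict)


def window_start(ts: float) -> int:
    """Map a generation timestamp to the start of the window containing it."""
    return int(ts // WINDOW_SECONDS) * WINDOW_SECONDS


def write(instance_id: str, ts: float, scene_data: Dict[str, int]) -> int:
    """Write key counts into the instance's table for the window containing ts."""
    start = window_start(ts)
    table = tables[instance_id].setdefault(start, {})
    for key, count in scene_data.items():
        table[key] = table.get(key, 0) + count
    return start


# Two instances report the same key; each lands in its own table.
write("instance-a", 1000.0, {"topic:1": 2})
write("instance-a", 1199.0, {"topic:1": 3})   # same window as ts=1000
write("instance-a", 1200.0, {"topic:1": 1})   # next window
write("instance-b", 1000.0, {"topic:1": 7})
```

Keying the outer dictionary by instance identifier is what keeps data from different instances from being mixed in this sketch.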

[0030] Step 203: Determine candidate hotspot data based on the access frequency corresponding to the data in the time window data table.

[0031] In this embodiment, the execution entity determines candidate hot data based on the access frequency corresponding to the data in the time window data table. Specifically, for all data in the time window data table, the execution entity calculates the access frequency for each piece of data and periodically discards data within the expired window. The calculated access frequencies are then sorted in descending order, and candidate hot data is determined based on the sorted access frequency list and a preset frequency threshold. For example, if the preset frequency threshold is 120, the execution entity will identify access frequencies greater than the preset threshold and use the data corresponding to these access frequencies as candidate hot data.
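The threshold filter in step 203 can be illustrated with a short Python sketch. The function name `candidate_hot_keys` and the sample keys are hypothetical. The threshold of 120 and the strict "greater than" comparison follow the example in the text.

```python
from typing import Dict, List, Tuple


def candidate_hot_keys(table: Dict[str, int], threshold: int) -> List[Tuple[str, int]]:
    """Sort keys by access frequency (descending) and keep those above threshold."""
    ranked = sorted(table.items(), key=lambda kv: kv[1], reverse=True)
    return [(key, count) for key, count in ranked if count > threshold]


window_table = {"topic:1": 300, "topic:2": 120, "topic:3": 121, "topic:4": 15}
candidates = candidate_hot_keys(window_table, threshold=120)
```

In this example `topic:2` is excluded because its frequency equals the threshold rather than exceeding it.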

[0032] Step 204: Statistically analyze the total access frequency of candidate hot data in the time window of the distributed cluster according to the preset load balancing strategy, and determine the target hot data based on the statistical results.

[0033] In this embodiment, the execution entity calculates the total access frequency of candidate hot data within the time window of the distributed cluster according to a preset load balancing strategy, and determines the target hot data based on the statistical results. Specifically, the execution entity determines a temporary master node from all nodes in the distributed cluster according to the load balancing strategy. For example, the node health and load values of all nodes in the distributed cluster can be calculated using the load balancing strategy, and then a temporary master node is determined from all nodes based on the calculated node health and load values. Next, the temporary master node pulls the sliding time window data table of the business scenario corresponding to the candidate hot data from other slave nodes in the distributed cluster. Then, the total access frequency of the candidate hot data in all pulled sliding time window data tables is calculated, and the calculated total access frequencies are sorted in descending order. Finally, the target hot data is selected from the sorted list according to the hot data size value configured in the request message data.

[0034] For example, if the hot data size configured in the request message data is 120, meaning that 120 hot data items need to be filtered out, then for the total access frequency list sorted in descending order, the above execution entity will directly select the access data corresponding to the top 120 total access frequencies as the target hot data.
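Step 204 can be sketched in Python under simplifying assumptions. The `Node` class, the rule "pick the healthy node with the lowest load", and the in-process "pull" of tables are all hypothetical stand-ins: the disclosure does not specify how health and load are scored or how tables are transferred between nodes.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    name: str
    healthy: bool
    load: float  # assumption: lower is better, e.g. utilisation in [0, 1]
    # This node's sliding-window counts for one scenario: {key: count}.
    table: Dict[str, int] = field(default_factory=dict)


def pick_temporary_master(nodes: List[Node]) -> Node:
    """Assumed policy: the healthy node with the lowest load becomes master."""
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        raise RuntimeError("no healthy node available")
    return min(healthy, key=lambda n: (n.load, n.name))


def global_top(nodes: List[Node], candidates: List[str], size: int) -> List[Tuple[str, int]]:
    """Sum each candidate's count over all node tables and keep the top `size`."""
    totals: Counter = Counter()
    for node in nodes:
        for key in candidates:
            totals[key] += node.table.get(key, 0)
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:size]


nodes = [
    Node("n1", True, 0.7, {"a": 100, "b": 50}),
    Node("n2", True, 0.2, {"a": 30, "c": 90}),
    Node("n3", False, 0.1, {"b": 60}),
]
master = pick_temporary_master(nodes)
top2 = global_top(nodes, ["a", "b", "c"], size=2)
```

Here `size` plays the role of the hot data size value from the request message. Ties are broken by key name only to make the ordering deterministic in the sketch.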

[0035] The hot data determination method provided in this disclosure first receives request message data sent by the client; then, based on the service instance's identification information, it writes business data into the time window data table corresponding to the service instance; next, it determines candidate hot data based on the access frequency corresponding to the data in the time window data table; finally, it statistically analyzes the total access frequency of the candidate hot data in the time window of the distributed cluster according to a preset load balancing strategy, and determines the target hot data based on the statistical results. This method writes business data from different service instances into their respective independent time window tables, ensuring no interference, no mixed writing, and no disorder, facilitating problem location, instance-specific monitoring, and load analysis. It achieves orderly access, isolated storage, and real-time statistics of business data in a distributed environment, solving problems such as mixed writing across multiple instances, time-series disorder, and statistical bias. Furthermore, based on time windows and load balancing, it achieves global mining of hot data, improving the real-time performance, comprehensiveness, and accuracy of hot data identification.

[0036] Furthermore, the collection, storage, use, processing, transmission, provision, and disclosure of any type of information, such as user personal information, involved in the technical solutions disclosed herein comply with the provisions of relevant laws and regulations and do not violate public order and good morals.

[0037] Continuing to refer to Figure 3, a flow 300 of another embodiment of the method for determining hotspot data according to this disclosure is shown. The method for determining hotspot data includes the following steps:

Step 301: Receive request message data sent by the client.

[0038] Step 301 is basically the same as step 201 in the aforementioned embodiment. For the specific implementation method, please refer to the aforementioned description of step 201, which will not be repeated here.

[0039] Step 302: Parse the request message data to determine the service instance identification information and the corresponding business data of the service instance included in the request message data.

[0040] In this embodiment, the execution entity of the hotspot data determination method (e.g., server 105 shown in Figure 1) parses the request message data (the ReportRequest message), and based on the parsing result determines the identification information of each service instance included in the ReportRequest message and the business data corresponding to each service instance.
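The parsing in step 302 might look like the following Python sketch. The wire format is not specified in the disclosure, so the JSON encoding and the field names `instances`, `instance_id`, and `scenes` are assumptions chosen purely for illustration.

```python
import json
from typing import Dict, Iterator, Tuple

SceneCounts = Dict[str, Dict[str, int]]


def parse_report(raw: str) -> Iterator[Tuple[str, SceneCounts]]:
    """Yield (instance id, per-scenario key counts) for each instance in the message."""
    payload = json.loads(raw)
    for entry in payload["instances"]:
        scenes = {
            scene: {key: int(count) for key, count in counts.items()}
            for scene, counts in entry.get("scenes", {}).items()
        }
        yield entry["instance_id"], scenes


raw = json.dumps({
    "instances": [
        {"instance_id": "feed-01", "scenes": {"homepage": {"topic:42": 5}}},
        {"instance_id": "search-01", "scenes": {"query": {"topic:7": 2, "topic:9": 1}}},
    ]
})
parsed = dict(parse_report(raw))
```

Each yielded pair gives the server what it needs for the next step: the identifier used to look up the time window data table and the business data to write into it.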

[0041] Step 303: For the service instance included in the request message data, determine the time window data table corresponding to the service instance based on the identification information of the service instance, and write the business data corresponding to the service instance into the time window data table corresponding to the service instance.

[0042] In this embodiment, for each service instance included in the ReportRequest message, the aforementioned execution entity will determine the time window data table corresponding to the service instance based on the unique identifier information of the service instance, and write the business data corresponding to the service instance into the time window data table corresponding to the service instance.

[0043] Step 304: Determine candidate hotspot data based on the access frequency corresponding to the data in the time window data table.

[0044] Step 305: Statistically analyze the total access frequency of candidate hot data in the time window of the distributed cluster according to the preset load balancing strategy, and determine the target hot data based on the statistical results.

[0045] Steps 304-305 are basically the same as steps 203-204 in the aforementioned embodiments. For specific implementation methods, please refer to the aforementioned description of steps 203-204, which will not be repeated here.

[0046] As can be seen from Figure 3, compared with the embodiment corresponding to Figure 2, the method for determining hotspot data in this embodiment highlights the step of writing business data into the time window data table corresponding to the service instance. This method accurately obtains the service instance identifier and corresponding business data through standardized parsing of request message data, and writes the business data into the dedicated time window data table according to the instance identifier. This achieves traceability of business data sources, isolation between instances, time-series partitioning, and high statistical efficiency in a distributed environment. It solves the problems of mixed data writing, unclear attribution, and inaccurate statistics in multi-service instance concurrent scenarios, and improves the stability and accuracy of hotspot data collection, storage, and statistics.

[0047] In some optional implementations of this embodiment, the method for determining the hot data further includes: determining the business scenarios included in the service instance, and determining the scenario data corresponding to the business scenarios from the business data.

[0048] In this implementation, since each service instance may include one or more business scenarios, the execution entity also determines the business scenarios included in the service instance and, based on the name or identifier of each business scenario, determines the corresponding scenario data from the business data of the service instance.

[0049] This allows for further segmentation of service instances based on scenarios and determination of scenario data corresponding to those scenarios, thereby achieving accurate segmentation of business data and improving the accuracy of subsequent statistical results for hot data.
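A minimal Python sketch of this scenario split is shown below. It assumes, hypothetically, that an instance's business data arrives as flat `(scenario, key, count)` records and that the set of configured scenarios is known. Dropping records for unknown scenarios is a choice made for the example, not something the disclosure states.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, int]  # (scenario name, data key, access count)


def split_by_scene(records: Iterable[Record], scenes: List[str]) -> Dict[str, Dict[str, int]]:
    """Group an instance's business data into per-scenario key counts."""
    out: Dict[str, Dict[str, int]] = {scene: {} for scene in scenes}
    for scene, key, count in records:
        if scene not in out:
            continue  # unknown scenario: ignored in this sketch
        out[scene][key] = out[scene].get(key, 0) + count
    return out


records = [
    ("homepage", "topic:1", 3),
    ("search", "topic:1", 2),
    ("homepage", "topic:1", 4),
    ("homepage", "topic:2", 1),
    ("unknown", "topic:9", 5),
]
scene_data = split_by_scene(records, ["homepage", "search"])
```

Note that the same key (`topic:1`) is counted separately for each scenario, which is the point of splitting before aggregation.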

[0050] Continuing to refer to Figure 4, a flow 400 of an embodiment of step 303 in Figure 3 is shown, which includes the following steps:

Step 401: Determine the sliding time window data table corresponding to the service instance based on the service instance's identification information.

[0051] The aforementioned execution entity determines the sliding time window data table corresponding to each service instance based on its unique identifier. The sliding time window data table is a collection of data tables partitioned by a fixed duration. Each instance corresponds to an independent sliding table, thus avoiding data mixing. The sliding window duration is typically set to 5 minutes, and the window scrolls automatically based on timestamps.

[0052] Step 402: Determine the current time window data table from the sliding time window data table based on the generation timestamp of the scene data.

[0053] The aforementioned execution entity obtains the generation timestamp of the scene data and, based on the generation timestamp, determines the current time window data table from the sliding time window data table. That is, it determines which time window within the time range covered by the sliding time window contains the generation timestamp, and obtains the data table of that time window, i.e., the current time window data table.

[0054] Step 403: Write the scene data into the current time window data table.

[0055] The aforementioned execution entity will write the scene data into the current time window map.

[0056] In some optional implementations of this embodiment, step 403 includes:

Step 4031: In response to determining that the data key of the scene data already exists in the current time window data table, update the access frequency of the scene data.

[0057] During the data writing process, if the aforementioned execution entity determines that the data key of the scene data already exists in the data table of the current time window, the aforementioned execution entity will directly update the access frequency of the scene data, that is, increment the access frequency of the scene data by 1.

[0058] Step 4032: In response to determining that the data key of the scene data does not exist in the current time window data table, the data key of the scene data is written into the current time window data table, and the access frequency of the scene data is initialized.

[0059] If the aforementioned execution entity determines that the data key for scene data does not exist in the current time window data table, it will write the data key for scene data into the current time window data table and initialize the access frequency of scene data. Thus, by determining whether the data key for scene data exists in the current time window data table, updating the access frequency for existing data, and initializing the writing and counting for non-existent data, automatic deduplication, accurate counting, real-time updates, and complete coverage of hotspot data are achieved, improving the accuracy of hotspot statistics and system processing efficiency.
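Steps 4031 and 4032 amount to an update-or-initialize operation on the current window table. The Python sketch below uses a lock-protected dictionary to keep concurrent writes correct; this is a simplification, since the disclosure describes a concurrent map with non-blocking writes and does not prescribe a locking scheme. The class name `WindowTable`, its methods, and the initial count of 1 are assumptions.

```python
import threading
from typing import Dict


class WindowTable:
    """Hypothetical current-time-window table: data key -> access frequency."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}
        self._lock = threading.Lock()

    def record(self, key: str) -> int:
        """Increment an existing key by 1, or insert it with an initial count of 1."""
        with self._lock:
            if key in self._counts:
                self._counts[key] += 1   # step 4031: key exists, update
            else:
                self._counts[key] = 1    # step 4032: key is new, initialize
            return self._counts[key]

    def snapshot(self) -> Dict[str, int]:
        """Return a copy of the counts for statistics."""
        with self._lock:
            return dict(self._counts)


table = WindowTable()


def worker() -> None:
    for _ in range(1000):
        table.record("topic:1")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
table.record("topic:2")
counts = table.snapshot()
```

Because the existence check and the update happen under the same lock, no increments are lost when several threads report the same key.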

[0060] By splitting business data according to service instances and scenarios, and writing the split data into the current time window map, the efficiency and accuracy of data writing are improved.

[0061] In some optional implementations of this embodiment, the method for determining the hot data further includes: discarding the scene data in response to determining that the generation timestamp of the scene data exceeds the time range of all windows in the sliding time window.

[0062] In this implementation, if the execution entity determines that the generation timestamp of the scene data exceeds the time range of all windows in the sliding time window, the scene data will be discarded. Thus, by discarding scene data whose generation timestamps exceed the entire time range of the sliding time window, automatic filtering of expired data is achieved, improving the utilization rate of system resources.
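The range check behind this discard rule can be sketched in Python as follows. The number of retained windows (`window_count=3`), the alignment of windows to multiples of the window length, and the function name `in_sliding_range` are assumptions. The disclosure only says that data whose generation timestamp falls outside all windows is dropped.

```python
from typing import List, Tuple


def in_sliding_range(ts: float, now: float,
                     window_seconds: int = 300, window_count: int = 3) -> bool:
    """True if ts falls inside any window currently held by the sliding table."""
    newest_start = int(now // window_seconds) * window_seconds
    oldest_start = newest_start - (window_count - 1) * window_seconds
    return oldest_start <= ts < newest_start + window_seconds


now = 1000.0  # windows held: [300, 600), [600, 900), [900, 1200)
items: List[Tuple[float, str]] = [
    (299.0, "too-old"),
    (300.0, "a"),
    (1199.0, "b"),
    (1200.0, "too-new"),
]
kept = [key for ts, key in items if in_sliding_range(ts, now)]
```

Timestamps just outside either edge of the retained range are rejected, so late or clock-skewed reports never create tables for windows the sliding table no longer holds.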

[0063] Continuing to refer to Figure 5, a flow 500 of yet another embodiment of the method for determining hotspot data according to this disclosure is shown. The method for determining hotspot data includes the following steps:

Step 501: Receive request message data sent by the client.

[0064] Step 502: Parse the request message data to determine the service instance identification information and the corresponding business data of the service instance included in the request message data.

[0065] Steps 501-502 are basically the same as steps 301-302 in the aforementioned embodiments. For specific implementation methods, please refer to the aforementioned description of steps 301-302, which will not be repeated here.

[0066] Step 503: Determine the sliding time window data table corresponding to the service instance based on the service instance's identification information.

[0067] Step 504: Determine the current time window data table from the sliding time window data table based on the generation timestamp of the scene data.

[0068] Step 505: Write the scene data into the current time window data table.

[0069] Steps 503-505 are basically the same as steps 401-403 in the aforementioned embodiments. For specific implementation methods, please refer to the aforementioned description of steps 401-403, which will not be repeated here.

[0070] Step 506: Calculate the access frequency corresponding to the data in the current time window data table.

[0071] In this embodiment, the execution entity of the hotspot data determination method (e.g., the server 105 shown in Figure 1) counts the access frequency corresponding to the data in the current time window data table. Specifically, the execution entity traverses the current time window data table and, for each piece of data in the table, aggregates the access frequency of that data, thereby obtaining the counted access frequencies.

[0072] Step 507: Determine candidate hotspot data based on access frequency and preset frequency threshold.

[0073] In this embodiment, the execution entity determines candidate hotspot data by comparing the counted access frequencies with a preset frequency threshold. Specifically, the execution entity sorts the counted access frequencies in descending order, and then determines candidate hotspot data based on the sorted access frequency list and the preset frequency threshold. For example, if the preset frequency threshold is 120, the execution entity identifies the access frequencies greater than 120 and uses the data corresponding to these access frequencies as candidate hotspot data.
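Steps 506-507 can be illustrated with a short Python sketch. It assumes the current time window data table is a mapping from data key to access count; the function name `candidate_hot_data` and the sample keys are hypothetical.

```python
from typing import Dict, List, Tuple


def candidate_hot_data(
    window_table: Dict[str, int], threshold: int
) -> List[Tuple[str, int]]:
    """Sort keys by access frequency (descending) and keep those above threshold."""
    ranked = sorted(window_table.items(), key=lambda kv: kv[1], reverse=True)
    return [(key, count) for key, count in ranked if count > threshold]


# With the threshold of 120 used in the example above, only keys accessed
# more than 120 times in the current window become candidates.
table = {"item:1": 300, "item:2": 120, "item:3": 121, "item:4": 5}
candidates = candidate_hot_data(table, 120)
```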

[0074] Step 508: Statistically analyze the total access frequency of candidate hot data within the time window of the distributed cluster according to the preset load balancing strategy, and determine the target hot data based on the statistical results.

[0075] Step 508 is basically the same as step 204 in the aforementioned embodiment. For the specific implementation method, please refer to the aforementioned description of step 204, which will not be repeated here.

[0076] As can be seen from Figure 5, compared with the embodiments corresponding to Figures 3 and 4, the hotspot data determination method in this embodiment emphasizes the step of determining candidate hotspot data. By counting the access frequency of data in the current time window data table and combining it with a preset frequency threshold to filter candidate hotspot data, a quantitative hotspot judgment is achieved. This method is highly real-time, statistically accurate, and computationally efficient, providing a reliable basis for subsequent distributed global aggregation and target hotspot determination.

[0077] With continued reference to Figure 6, a flow 600 of an embodiment of step 204 in Figure 2 is shown, which includes: Step 601: Determine a temporary master node from the nodes of the distributed cluster according to the preset load balancing strategy.

[0078] The aforementioned execution entity first collects real-time load information of each node in the distributed cluster, sorts each node according to the preset load balancing strategy and the real-time load information of the nodes, that is, sorts them from low to high according to the load score, and finally determines the temporary master node based on the sorting result, and determines the other nodes in the distributed cluster except the temporary master node as slave nodes.

[0079] In some optional implementations of this embodiment, step 601 includes: Step 6011: For nodes in the distributed cluster, calculate the node health and load value using a load balancing strategy.

[0080] For each node in the distributed cluster, the aforementioned execution entity collects real-time load information from each node and calculates the node's health and load value accordingly. The load balancing strategy here refers to prioritizing nodes with the lowest load.

[0081] Step 6012: Determine a temporary master node from all nodes based on node health and load values.

[0082] The aforementioned execution entity will determine a temporary master node from all nodes based on the calculated node health and load values. That is, it will generate a comprehensive score for the node based on the node health and load values, and determine the temporary master node from all nodes based on the comprehensive score.

[0083] Based on the preset lowest load priority load balancing strategy, the nodes are sorted from low to high load, and the node with the lowest load is determined as the temporary master node, while the remaining nodes are slave nodes, thus improving the efficiency of master node selection.
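The selection of the temporary master node in step 601 can be sketched as below. The disclosure does not give the formula for the comprehensive score, so the rule used here (load plus a penalty for poor health, lower is better), the value ranges, and the names `NodeStatus`, `composite_score`, and `elect_temporary_master` are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NodeStatus:
    node_id: str
    health: float  # assumed range: 0.0 (unhealthy) to 1.0 (healthy)
    load: float    # assumed range: 0.0 (idle) to 1.0 (saturated)


def composite_score(node: NodeStatus) -> float:
    # Assumed scoring rule: lower is better, and poor health counts like load.
    return node.load + (1.0 - node.health)


def elect_temporary_master(
    nodes: List[NodeStatus],
) -> Tuple[NodeStatus, List[NodeStatus]]:
    """Sort nodes from low to high score; the first is master, the rest are slaves."""
    if not nodes:
        raise ValueError("cluster has no nodes")
    ranked = sorted(nodes, key=composite_score)
    return ranked[0], ranked[1:]
```

Because the sort is ascending, the node with the lowest score is chosen, which matches the lowest-load-first strategy when all nodes are equally healthy.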

[0084] Step 602: The temporary master node pulls the sliding time window data table of the business scenario corresponding to the candidate hot data from other slave nodes.

[0085] The temporary master node pulls the sliding time window data table corresponding to the business scenarios of the candidate hotspot data from other slave nodes.

[0086] Step 603: Statistically analyze the total access frequency of candidate hot data in all retrieved sliding time window data tables, and determine the target hot data based on the statistical results.

[0087] The aforementioned execution entity performs a secondary aggregation of the total access frequency of candidate hot data across all retrieved sliding time window data tables, thereby determining the target hot data based on the secondary aggregation result. The number of target hot data points is determined by the hot data size value configured in the ReportRequest message. The execution entity outputs a statistical list of hot data keys, where each hot data key is its key-value pair.

[0088] In some optional implementations of this embodiment, step 603, determining the target hotspot data based on statistical results, includes: sorting the total access frequency of all candidate hotspot data in descending order based on the statistical results; and selecting a preset number of candidate hotspot data as target hotspot data from the total access frequency list after descending order.

[0089] In this implementation, the aforementioned execution entity sorts the total access frequency values corresponding to all candidate hotspot data in descending order based on the statistical results, and selects a preset number of candidate hotspot data as target hotspot data from the descendingly sorted total access frequency list. This preset number is determined based on the hotspot data size value configured in the ReportRequest message. This improves the efficiency of determining target hotspot data.
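The secondary aggregation and Top-N selection of step 603 can be sketched as follows, assuming each pulled sliding time window data table has already been flattened to a mapping from data key to access count. The function name `target_hot_data` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def target_hot_data(
    candidates: Iterable[str],
    pulled_tables: Iterable[Dict[str, int]],
    top_n: int,
) -> List[Tuple[str, int]]:
    """Sum candidate counts across all pulled tables and keep the top_n totals."""
    candidate_set = set(candidates)
    totals: Counter = Counter()
    for table in pulled_tables:
        for key, count in table.items():
            if key in candidate_set:  # non-candidate keys are ignored
                totals[key] += count
    return totals.most_common(top_n)  # sorted by total, descending


tables = [
    {"a": 100, "b": 50, "x": 999},  # table held by the temporary master
    {"a": 30, "b": 90, "c": 10},    # table pulled from a slave node
]
top = target_hot_data(["a", "b", "c"], tables, top_n=2)
```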

[0090] Based on the preset minimum load priority load balancing strategy, the nodes are sorted from low to high load, and the node with the lowest load is determined as the temporary master node, while the remaining nodes are slave nodes. The temporary master node uniformly performs the aggregation and statistics of the time window data of each slave node and the hot spot filtering, thus realizing the balanced scheduling and efficient execution of distributed statistical tasks.

[0091] With continued reference to Figure 7, a flow 700 of another embodiment of the method for determining hotspot data according to the present disclosure is shown. The method for determining hotspot data includes the following steps: Step 701: Receive request message data sent by the client.

[0092] Step 702: Write the business data into the time window data table corresponding to the service instance based on the service instance's identification information.

[0093] Step 703: Determine candidate hotspot data based on the access frequency corresponding to the data in the time window data table.

[0094] Step 704: Statistically analyze the total access frequency of candidate hot data in the time window of the distributed cluster according to the preset load balancing strategy, and determine the target hot data based on the statistical results.

[0095] Steps 701-704 are basically the same as steps 201-204 in the aforementioned embodiments. For specific implementation methods, please refer to the aforementioned description of steps 201-204, which will not be repeated here.

[0096] Step 705: Use the name of the business scenario as a storage index to store the target hot data and obtain a global hot data storage table.

[0097] In this embodiment, the execution entity of the hotspot data determination method (e.g., the server 105 shown in Figure 1) uses the name of the business scenario as a storage index to store the identified target hot data, thereby obtaining a global hot data storage table, and persists it to a distributed KV storage (key-value pair storage), such as Redis or dedicated storage.

[0098] Step 706: The temporary master node synchronizes the global hotspot data storage table to other slave nodes.

[0099] After generating the global hot data storage table, the temporary master node will synchronize it to other slave nodes, thereby achieving global consistency of hot data, real-time synchronization of distributed nodes, efficient querying, and easy reuse.

[0100] Step 707: In response to receiving a hot data query request, determine whether a global hot data storage table is stored locally.

[0101] If the aforementioned execution entity receives a hot data query request, which is generally a business cold start or a query request under a specific scenario, the aforementioned execution entity will first determine whether a global hot data storage table is stored locally, that is, it will prioritize querying the currently maintained statistical hot list locally.

[0102] Step 708: In response to determining that a global hot data storage table exists in the local storage, the hot data query result is determined from the global hot data storage table based on the name of the business scenario and the quantity of hot data in the hot data query request.

[0103] If it is determined that a global hot data storage table exists in the local storage, the aforementioned execution entity will determine the hot data query results from the global hot data storage table based on the name of the business scenario and the quantity of hot data in the hot data query request, thereby obtaining a list of hot keys for use by the business to warm up the local cache or dynamically adjust the caching strategy.

[0104] Step 709: In response to determining that the global hot data storage table is not stored locally, a remote hot data query request is generated.

[0105] If it is determined that the global hot data storage table is not stored locally, a remote hot data query request will be generated.

[0106] Step 710: Send the remote hot data query request to the distributed cluster through a load balancing strategy, so that the distributed cluster can determine the hot data query result from the global hot data storage table stored in the distributed cluster based on the name of the business scenario and the quantity of hot data in the remote hot data query request.

[0107] The aforementioned execution entity will send remote hot data query requests to the distributed cluster through a load balancing strategy. This allows the distributed cluster to determine the hot data query results from the global hot data storage table stored in the distributed cluster based on the name of the business scenario and the quantity of hot data in the remote hot data query request. This results in a list of hot keys, which can be used for business preheating of local caching or dynamic adjustment of caching strategies.
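The local-first query flow of steps 707-710 can be sketched as below. The table layout (scenario name mapped to a list of hot keys, hottest first), the function name `query_hot_keys`, and the `remote_query` callback standing in for the load-balanced cluster request are assumptions made for this illustration.

```python
from typing import Callable, Dict, List, Optional

# Assumed layout: business scenario name -> hot keys, hottest first.
GlobalHotTable = Dict[str, List[str]]


def query_hot_keys(
    scene: str,
    count: int,
    local_table: Optional[GlobalHotTable],
    remote_query: Callable[[str, int], List[str]],
) -> List[str]:
    """Answer from the local global hot data table if present, else query remotely."""
    if local_table is not None:
        return local_table.get(scene, [])[:count]
    # No local table: fall back to the cluster (load balancing is hidden
    # behind the hypothetical remote_query callback).
    return remote_query(scene, count)


def fake_remote(scene: str, count: int) -> List[str]:
    # Stand-in for the distributed cluster, used only for this example.
    return ["r1", "r2", "r3"][:count]


local = {"cube": ["k1", "k2", "k3"]}
from_local = query_hot_keys("cube", 2, local, fake_remote)
from_remote = query_hot_keys("cube", 2, None, fake_remote)
```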

[0108] As can be seen from Figure 7, compared with the embodiment corresponding to Figure 2, the method for determining hot data in this embodiment emphasizes the steps of storing target hot data and querying hot data. By prioritizing local queries of the global hot data storage table and initiating remote cluster queries through load balancing when no table is available locally, this method achieves low latency for hot data reading, reduced local resource burden, balanced cluster load, and global data consistency, thereby improving the stability and efficiency of hot data queries.

[0109] With continued reference to Figure 8, a flow 800 of an embodiment of a method for determining hotspot data according to the present disclosure is shown. The method for determining hotspot data includes the following steps: Step 801: Load the business configuration file.

[0110] In this embodiment, the execution entity of the hotspot data determination method (e.g., the clients 101, 102, and 103 shown in Figure 1) loads a business configuration file. The configuration file includes: service instance identification information, the business scenarios included in each service instance, the sampling rate, and the reporting cycle. Specifically, after system startup, the client loads a pre-defined business configuration file, stored in JSON or YAML format, containing key configuration information such as service instance identification information, the business scenarios for each instance, the data sampling rate, and the data reporting cycle.
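A JSON configuration of the kind described above might be loaded as in the following sketch. The field names (`service_instance_id`, `scenes`, `sampling_rate`, `report_period_seconds`) and the sample values are hypothetical; the disclosure names the four categories of information but not a concrete schema.

```python
import json
from dataclasses import dataclass
from typing import List

# Hypothetical configuration file content (JSON variant).
EXAMPLE_CONFIG = """
{
  "service_instance_id": "recommend-01",
  "scenes": ["cube"],
  "sampling_rate": 0.1,
  "report_period_seconds": 30
}
"""


@dataclass
class BusinessConfig:
    service_instance_id: str
    scenes: List[str]
    sampling_rate: float
    report_period_seconds: int


def load_config(text: str) -> BusinessConfig:
    """Parse the configuration text and validate the sampling rate."""
    raw = json.loads(text)
    config = BusinessConfig(
        service_instance_id=raw["service_instance_id"],
        scenes=list(raw["scenes"]),
        sampling_rate=float(raw["sampling_rate"]),
        report_period_seconds=int(raw["report_period_seconds"]),
    )
    if not 0.0 < config.sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be in (0, 1]")
    return config


config = load_config(EXAMPLE_CONFIG)
```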

[0111] Step 802: Write the business data into the front-end buffer through a preset interface according to the sampling rate.

[0112] The aforementioned execution entity will sample and filter the real-time collected business data according to the sampling rate parameter corresponding to the current service instance. The valid business data that hits the sampling rules will be asynchronously written to the front-end buffer through a preset standard interface for subsequent time window statistics and hotspot calculation.

[0113] Step 803: Switch the front-end buffer and the back-end buffer according to the reporting cycle. For the back-end buffer after the switch, count the access frequency of the data in the back-end buffer according to the business scenario.

[0114] The aforementioned execution entity determines the current reporting cycle, for example, a reporting cycle of 30 seconds. When the reporting cycle trigger time is reached, the execution entity switches the front-end buffer and the back-end buffer: the original front-end buffer becomes the back-end buffer, and the original back-end buffer becomes the new front-end buffer, thus ensuring uninterrupted business data writing. The data in the back-end buffer is used for subsequent hotspot calculations and reporting, while the new front-end buffer continues to receive real-time business data writes, achieving seamless integration of data writing, cycle switching, and scenario statistics.
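Sampled writing and buffer switching (steps 802-803) can be sketched as follows. This is a simplified illustration: the class name `DoubleBuffer` and its methods are hypothetical, it assumes a single reporting thread calls `swap`, and it guards the buffers with a lock for brevity rather than using a lock-free structure.

```python
import random
import threading
from collections import Counter
from typing import Dict, Tuple


class DoubleBuffer:
    """Front-end buffer takes writes; swap() hands the filled buffer to reporting."""

    def __init__(self, sampling_rate: float = 1.0) -> None:
        self._front: Counter = Counter()
        self._back: Counter = Counter()
        self._sampling_rate = sampling_rate
        self._lock = threading.Lock()

    def record(self, scene: str, key: str) -> None:
        # Keep roughly sampling_rate of all accesses; drop the rest.
        if random.random() >= self._sampling_rate:
            return
        with self._lock:
            self._front[(scene, key)] += 1

    def swap(self) -> Dict[Tuple[str, str], int]:
        """Switch buffers and return per-(scene, key) counts from the old front."""
        with self._lock:
            self._front, self._back = self._back, self._front
        snapshot = dict(self._back)
        self._back.clear()  # ready to become the front-end buffer next cycle
        return snapshot


buffer = DoubleBuffer(sampling_rate=1.0)
buffer.record("cube", "k1")
buffer.record("cube", "k1")
buffer.record("cube", "k2")
first_cycle = buffer.swap()
```

Writers only ever touch the front-end buffer, so the counts returned by `swap` can be aggregated and reported without blocking new writes for longer than the pointer exchange.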

[0115] Step 804: Generate request message data based on the statistical results and report the request message data to the server.

[0116] The aforementioned execution entities complete the data access frequency statistics of the backend buffer after the switch according to the preset reporting cycle, obtain the access frequency statistics of hot data in each business scenario, and generate request message data containing service instance identifier, business scenario, time window information, hot data key and corresponding access frequency according to the statistical results and server interface protocol. The request message data is then reported to the server through a preset network channel so that the server can perform global hotspot aggregation, candidate hotspot screening and evolutionary fine-tuning of topic generation model.
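Assembling the request message data from the back-end buffer statistics might look like the sketch below. The message layout and field names are assumptions, and JSON is used only as a stand-in because the disclosure does not specify the serialization format of the ReportRequest message.

```python
import json
from typing import Dict, Tuple


def build_report_request(
    instance_id: str,
    window_start: int,
    window_seconds: int,
    counts: Dict[Tuple[str, str], int],
) -> str:
    """Group (scene, key) counts by scene and serialize one report message."""
    scenes: Dict[str, Dict[str, int]] = {}
    for (scene, key), count in counts.items():
        scenes.setdefault(scene, {})[key] = count
    message = {
        "service_instance_id": instance_id,                      # who reports
        "window": {"start": window_start, "seconds": window_seconds},
        "scenes": scenes,                                         # scene -> key -> frequency
    }
    return json.dumps(message, sort_keys=True)


payload = build_report_request(
    "recommend-01", 1200, 30, {("cube", "k1"): 2, ("cube", "k2"): 1}
)
```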

[0117] The hotspot data determination method provided in this disclosure first loads a business configuration file; then, business data is written to a front-end buffer through a preset interface according to a sampling rate; subsequently, the front-end buffer and back-end buffer are switched according to a reporting cycle; for the switched back-end buffer, the access frequency of data in the back-end buffer is statistically analyzed according to the business scenario; finally, request message data is generated based on the statistical results and reported to the server. This method, through configuration loading, sampling and writing, switching between front-end and back-end buffers, scenario statistics, and message reporting, achieves lightweight data collection and non-blocking writing, improving the real-time performance and accuracy of request message generation.

[0118] The hot data determination method provided in this disclosure is applicable to high-concurrency multi-key query scenarios, including but not limited to online recommendation services (Cube, forward index access), search engines, inventory queries, interface rate limiting, and other distributed caching optimization fields.

[0119] The specific implementation process includes: embedding a lightweight generic SDK (Software Development Kit) into business services, defining independent scenarios through configuration files (such as the "cube" scenario, with a key type of uint64 and a TopN size of 10 for hot keys). Business code calls the record interface to report when multiple keys are accessed. During service cold starts, the get_hot_keys interface is called to query the list of hot keys and preload it into the local cache; the cache capacity or strategy can be dynamically adjusted at runtime.

[0120] In deduplication service cold start scenarios, preheating hotspot-related forward-ranked data prevents the initial wave of requests from penetrating downstream databases. Extended scenarios may include: application to interface rate limiting (statistics of hot users / hot interfaces) or anti-spam mechanisms. The hotspot data determination method provided by this disclosure is highly versatile and can be flexibly extended to any distributed system requiring hotspot detection.

[0121] With continued reference to Figure 9, an application scenario of the method for determining hotspot data according to this disclosure is shown, in which a business cold start or runtime query request (for a specified scenario) is made. The system first queries the currently cached list of statistical hotspots; if not found, it falls back to querying the persistent hotspots remotely, for use by the business to warm up the local cache or dynamically adjust the caching strategy.

[0122] The client-side generic SDK receives the client's business data through the record or record_batch interface, performs lock-free statistics in a double buffer to build the ReportRequest message, and periodically serializes and reports the ReportRequest message to the aggregation server cluster through load balancing.

[0123] The server cluster receives ReportRequest messages through data receiving nodes, distributes these messages by service instance and scenario, and writes the distributed data into a Map for the current time window. Expired windows are periodically evicted, and the total access frequency within the sliding time window is maintained. Next, a master node is selected from all nodes in the distributed cluster based on a load balancing strategy. The master node then pulls the corresponding window data from other slave nodes for distributed secondary aggregation. Finally, the Top N hotspots are calculated based on the secondary aggregation results, generating and outputting a list of hotspot keys. This global hotspot key list is persisted to a distributed key-value store (such as Redis or dedicated storage) and can also be used for subsequent hotspot data queries.
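The server-side bookkeeping described above (distribution by service instance and scenario, writing into the map of the current window, eviction of expired windows, and totals over the sliding window) can be sketched as follows. The class name `SlidingWindowTable`, its methods, and the nested-dictionary layout are assumptions for illustration.

```python
from typing import Dict, Tuple

WindowMap = Dict[int, Dict[str, int]]  # window start -> data key -> count


class SlidingWindowTable:
    """Counters per (service instance, scene), bucketed by window start time."""

    def __init__(self, window_seconds: int, window_count: int) -> None:
        self.window_seconds = window_seconds
        self.window_count = window_count
        self._tables: Dict[Tuple[str, str], WindowMap] = {}

    def write(self, instance_id: str, scene: str, key: str,
              count: int, generated_at: int) -> None:
        start = generated_at - generated_at % self.window_seconds
        windows = self._tables.setdefault((instance_id, scene), {})
        window = windows.setdefault(start, {})
        # Update an existing key, or initialize a new one.
        window[key] = window.get(key, 0) + count

    def evict(self, now: int) -> None:
        """Drop windows that have slid out of the retained range."""
        newest = now - now % self.window_seconds
        oldest = newest - (self.window_count - 1) * self.window_seconds
        for windows in self._tables.values():
            for start in [s for s in windows if s < oldest]:
                del windows[start]

    def totals(self, instance_id: str, scene: str) -> Dict[str, int]:
        """Total access frequency per key across all retained windows."""
        result: Dict[str, int] = {}
        for window in self._tables.get((instance_id, scene), {}).values():
            for key, count in window.items():
                result[key] = result.get(key, 0) + count
        return result
```

Keeping one table per service instance and scenario means that reports from different instances never share a counter, which is the property the method relies on to avoid mixed writes.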

[0124] With further reference to Figure 10, as an implementation of the methods shown in the above figures, this disclosure provides an embodiment of a device for determining hotspot data. This device embodiment corresponds to the method embodiment shown in Figure 2, and the device can be specifically applied to various electronic devices.

[0125] As shown in Figure 10, the hotspot data determination device 1000 of this embodiment includes: a receiving module 1001, a first writing module 1002, a first determining module 1003, and a second determining module 1004. The receiving module 1001 is configured to receive request message data sent by a client, wherein the request message data includes business data corresponding to multiple service instances; the first writing module 1002 is configured to write the business data into a time window data table corresponding to the service instance based on the service instance's identification information; the first determining module 1003 is configured to determine candidate hotspot data based on the access frequency corresponding to the data in the time window data table; and the second determining module 1004 is configured to statistically analyze the total access frequency of the candidate hotspot data within the time window of the distributed cluster according to a preset load balancing strategy, and determine the target hotspot data based on the statistical results.

[0126] In this embodiment, for the specific processing of the receiving module 1001, the first writing module 1002, the first determining module 1003, and the second determining module 1004 in the hotspot data determination device 1000, and the resulting technical effects, reference may be made to the relevant descriptions of steps 201-204 in the embodiment corresponding to Figure 2, which will not be repeated here.

[0127] In some optional implementations of this embodiment, the first writing module 1002 includes: a parsing submodule, configured to parse the request message data to determine the identification information of the service instance included in the request message data and the business data corresponding to the service instance; and a writing submodule, configured to, for the service instance included in the request message data, determine the time window data table corresponding to the service instance based on the identification information of the service instance, and write the business data corresponding to the service instance into the time window data table corresponding to the service instance.

[0128] In some optional implementations of this embodiment, the hotspot data determination device 1000 further includes: a scenario determination module, configured to determine the business scenarios included in the service instance and determine the scenario data corresponding to the business scenarios from the business data; and a writing submodule including: a window determination unit, configured to determine the sliding time window data table corresponding to the service instance based on the identification information of the service instance; a data table determination unit, configured to determine the current time window data table from the sliding time window data table based on the generation timestamp of the scenario data; and a writing unit, configured to write the scenario data into the current time window data table.

[0129] In some optional implementations of this embodiment, the writing unit is further configured to: update the access frequency of scene data in response to determining that a data key of scene data already exists in the current time window data table; and write the data key of scene data into the current time window data table and initialize the access frequency of scene data in response to determining that a data key of scene data does not exist in the current time window data table.

[0130] In some optional implementations of this embodiment, the hotspot data determination device 1000 further includes a discard module, configured to discard scene data in response to determining that the generation timestamp of the scene data exceeds the time range of all windows in the sliding time window.

[0131] In some optional implementations of this embodiment, the first determining module 1003 is further configured to: count the access frequency corresponding to the data in the current time window data table; and determine candidate hot data based on the access frequency and a preset frequency threshold.

[0132] In some optional implementations of this embodiment, the second determining module 1004 includes: a node determining submodule, configured to determine a temporary master node from the nodes of the distributed cluster according to a preset load balancing strategy; a pulling submodule, configured to have the temporary master node pull sliding time window data tables of the business scenarios corresponding to the candidate hot data from other slave nodes; and a statistics submodule, configured to count the total access frequency of the candidate hot data in all the pulled sliding time window data tables.

[0133] In some optional implementations of this embodiment, the node determination submodule is further configured to: calculate the node health and load value of a node in a distributed cluster using a load balancing strategy; and determine a temporary master node from all nodes based on the node health and load value.

[0134] In some optional implementations of this embodiment, the second determining module 1004 is configured to: sort the total access frequency of all candidate hotspot data in descending order according to the statistical results; and select a preset number of candidate hotspot data as target hotspot data from the total access frequency list after descending order.

[0135] In some optional implementations of this embodiment, the hot data determination device 1000 further includes: a storage module configured to use the name of the business scenario as a storage index to store the target hot data and obtain a global hot data storage table; and a synchronization module configured to have the global hot data storage table synchronized to other slave nodes by a temporary master node.

[0136] In some optional implementations of this embodiment, the hot data determination device 1000 further includes: a request receiving module, configured to determine whether a global hot data storage table is stored locally in response to receiving a hot data query request; and a local query module, configured to determine the hot data query result from the global hot data storage table based on the name of the business scenario and the quantity of hot data in the hot data query request in response to determining that a global hot data storage table is stored locally.

[0137] In some optional implementations of this embodiment, the hot data determination device 1000 further includes: a request generation module configured to generate a remote hot data query request in response to determining that a global hot data storage table is not stored locally; and a remote query module configured to send the remote hot data query request to a distributed cluster through a load balancing strategy, so that the distributed cluster determines the hot data query result from the global hot data storage table stored in the distributed cluster based on the name of the business scenario and the quantity of hot data in the remote hot data query request.

[0138] With further reference to Figure 11, as an implementation of the methods shown in the above figures, this disclosure provides an embodiment of a device for determining hotspot data. This device embodiment corresponds to the method embodiment shown in Figure 8, and the device can be specifically applied to various electronic devices.

[0139] As shown in Figure 11, the hotspot data determination device 1100 of this embodiment includes: a loading module 1101, a second writing module 1102, a statistics module 1103, and a reporting module 1104. The loading module 1101 is configured to load a business configuration file, which includes: service instance identification information, the business scenarios included in each service instance, the sampling rate, and the reporting period. The second writing module 1102 is configured to write business data into a front-end buffer through a preset interface according to the sampling rate. The statistics module 1103 is configured to switch between the front-end buffer and the back-end buffer according to the reporting period, and for the switched back-end buffer, to statistically analyze the access frequency of data in the back-end buffer according to the business scenario. The reporting module 1104 is configured to generate request message data based on the statistical results and report the request message data to the server.

[0140] In this embodiment, for the specific processing of the loading module 1101, the second writing module 1102, the statistics module 1103, and the reporting module 1104 in the hotspot data determination device 1100, and the resulting technical effects, reference may be made to the relevant descriptions of steps 801-804 in the embodiment corresponding to Figure 8, which will not be repeated here.

[0141] According to embodiments of this disclosure, this disclosure also provides an electronic device, a readable storage medium, and a computer program product.

[0142] Figure 12 shows a schematic block diagram of an example electronic device 1200 that can be used to implement embodiments of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the present disclosure described and / or claimed herein.

[0144] As shown in Figure 12, device 1200 includes a computing unit 1201, which can perform various appropriate actions and processes according to a computer program stored in read-only memory (ROM) 1202 or a computer program loaded from storage unit 1208 into random access memory (RAM) 1203. The RAM 1203 may also store various programs and data required for the operation of device 1200. The computing unit 1201, ROM 1202, and RAM 1203 are interconnected via bus 1204. Input / output (I / O) interface 1205 is also connected to bus 1204.

[0145] Multiple components in device 1200 are connected to I / O interface 1205, including: input unit 1206, such as keyboard, mouse, etc.; output unit 1207, such as various types of monitors, speakers, etc.; storage unit 1208, such as disk, optical disk, etc.; and communication unit 1209, such as network card, modem, wireless transceiver, etc. Communication unit 1209 allows device 1200 to exchange information / data with other devices through computer networks such as the Internet and / or various telecommunications networks.

[0146] The computing unit 1201 can be a variety of general-purpose and / or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 1201 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1201 performs the various methods and processes described above, such as the method for determining hot data. For example, in some embodiments, the method for determining hot data may be implemented as a computer software program tangibly contained in a machine-readable medium, such as storage unit 1208. In some embodiments, part or all of the computer program may be loaded and / or installed on device 1200 via ROM 1202 and / or communication unit 1209. When the computer program is loaded into RAM 1203 and executed by the computing unit 1201, one or more steps of the method for determining hot data described above may be performed. Alternatively, in other embodiments, the computing unit 1201 may be configured to perform a method for determining hotspot data by any other suitable means (e.g., by means of firmware).

[0147] Various embodiments of the systems and techniques described herein can be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-a-chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementations in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transmitting data and instructions to the storage system, the at least one input device, and the at least one output device.

[0148] The program code used to implement the methods of this disclosure may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that when executed by the processor or controller, the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may be executed entirely on a machine, partially on a machine, as a standalone software package partially on a machine and partially on a remote machine, or entirely on a remote machine or server.

[0149] In the context of this disclosure, a machine-readable medium can be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can be, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

[0150] To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having: a display device for displaying information to the user (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor); and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the computer. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).

[0151] The systems and technologies described herein can be implemented in computing systems that include backend components (e.g., as a data server), or computing systems that include middleware components (e.g., an application server), or computing systems that include frontend components (e.g., a user computer with a graphical user interface or web browser through which a user can interact with implementations of the systems and technologies described herein), or any combination of such backend, middleware, or frontend components. The components of the system can be interconnected via digital data communication of any form or medium (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.

[0152] Cloud computing refers to a technological system that enables access to elastic and scalable shared physical or virtual resources via a network. These resources can include servers, operating systems, networks, software, and storage devices, and can be deployed and managed in an on-demand, self-service manner. Cloud computing technology can provide efficient and powerful data processing capabilities for applications such as artificial intelligence and blockchain, as well as for model training.

[0153] Computer systems can include clients and servers. Clients and servers are generally remote from each other and typically interact through communication networks. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server can be a cloud server, also known as a cloud computing server or cloud host, which is a hosting product within the cloud computing service system that addresses the shortcomings of traditional physical hosts and Virtual Private Server (VPS) services, such as difficult management and weak business scalability.

[0154] It should be understood that steps can be reordered, added, or deleted using the various forms of processes shown above. For example, the steps described in this disclosure can be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution of this disclosure can be achieved; this is not limited herein.

[0155] The specific embodiments described above do not constitute a limitation on the scope of protection of this disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this disclosure should be included within the scope of protection of this disclosure.

Claims

1. A method for determining hotspot data, comprising: receiving request message data sent by a client, wherein the request message data includes business data corresponding to multiple service instances; writing the business data into a time window data table corresponding to the service instance based on identification information of the service instance; determining candidate hotspot data based on an access frequency corresponding to the data in the time window data table; and statistically analyzing a total access frequency of the candidate hotspot data within the time window of a distributed cluster according to a preset load balancing strategy, and determining target hotspot data based on the statistical results.

2. The method according to claim 1, wherein writing the business data into the time window data table corresponding to the service instance based on the identification information of the service instance comprises: parsing the request message data to determine the identification information of the service instance and the business data corresponding to the service instance included in the request message data; and for the service instance included in the request message data, determining the time window data table corresponding to the service instance according to the identification information of the service instance, and writing the business data corresponding to the service instance into the time window data table corresponding to the service instance.

3. The method according to claim 2, further comprising: determining the business scenarios included in the service instance, and determining scenario data corresponding to the business scenarios from the business data; wherein determining the time window data table corresponding to the service instance according to the identification information of the service instance, and writing the business data corresponding to the service instance into the time window data table corresponding to the service instance, comprises: determining a sliding time window data table corresponding to the service instance based on the identification information of the service instance; determining a current time window data table from the sliding time window data table based on a generation timestamp of the scenario data; and writing the scenario data into the current time window data table.

4. The method according to claim 3, wherein writing the scenario data into the current time window data table comprises: in response to determining that a data key of the scenario data already exists in the current time window data table, updating the access frequency of the scenario data; and in response to determining that the data key of the scenario data does not exist in the current time window data table, writing the data key of the scenario data into the current time window data table and initializing the access frequency of the scenario data.

5. The method according to claim 3, further comprising: in response to determining that the generation timestamp of the scenario data falls outside the time range of all windows in the sliding time window, discarding the scenario data.
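As a non-limiting illustration of the write path recited in claims 3 to 5, the following Python sketch keeps one sliding time window data table per service instance. The window width, the number of windows, the dictionary layout, and all names (`SlidingWindowTable`, `write`, `tables`) are assumptions made for this example and are not taken from the disclosure.

```python
from collections import defaultdict

WINDOW_SECONDS = 10  # assumed width of a single time window
WINDOW_COUNT = 6     # assumed number of windows kept in the sliding window


class SlidingWindowTable:
    """Sliding time window data table of one service instance (hypothetical layout)."""

    def __init__(self):
        # window start time -> {data key -> access frequency}
        self.windows = {}

    def write(self, data_key, generated_at, now, count=1):
        """Write one scenario data record; return False if it is discarded."""
        newest_start = int(now // WINDOW_SECONDS) * WINDOW_SECONDS
        oldest_start = newest_start - (WINDOW_COUNT - 1) * WINDOW_SECONDS
        start = int(generated_at // WINDOW_SECONDS) * WINDOW_SECONDS
        if start < oldest_start or start > newest_start:
            return False  # timestamp outside every window: discard the record
        # Drop windows that have slid out of the retained range.
        for expired in [s for s in self.windows if s < oldest_start]:
            del self.windows[expired]
        table = self.windows.setdefault(start, {})
        if data_key in table:
            table[data_key] += count  # key already present: update the frequency
        else:
            table[data_key] = count   # key absent: write it and initialize the frequency
        return True


# service instance identifier -> sliding time window data table
tables = defaultdict(SlidingWindowTable)
```

For instance, `tables["svc-a"].write("item:1", generated_at=100, now=105)` selects the window starting at 100 and initializes the frequency of `item:1`, whereas a record generated at time 10 would be discarded under the assumed 60-second span.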

6. The method according to any one of claims 3-5, wherein determining candidate hotspot data based on the access frequency corresponding to the data in the time window data table comprises: calculating the access frequency corresponding to the data in the current time window data table; and determining the candidate hotspot data based on the access frequency and a preset frequency threshold.
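By way of example only, the threshold comparison of claim 6 might be sketched as follows. The function name and the choice of an inclusive comparison (`>=`) are assumptions; the claim does not state whether the threshold itself qualifies.

```python
def candidate_hotspots(current_window, frequency_threshold):
    """Return the entries whose access frequency reaches the preset threshold.

    current_window maps a data key to its access frequency in the current
    time window data table (assumed layout).
    """
    return {
        key: freq
        for key, freq in current_window.items()
        if freq >= frequency_threshold  # inclusive comparison is an assumption
    }
```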

7. The method according to claim 3, wherein statistically analyzing the total access frequency of the candidate hotspot data within the time window of the distributed cluster according to the preset load balancing strategy comprises: determining a temporary master node from the nodes of the distributed cluster according to the preset load balancing strategy; pulling, by the temporary master node, the sliding time window data table of the business scenario corresponding to the candidate hotspot data from other slave nodes; and statistically analyzing the total access frequency of the candidate hotspot data in all the pulled sliding time window data tables.

8. The method according to claim 7, wherein determining the temporary master node from the nodes of the distributed cluster according to the preset load balancing strategy comprises: calculating, for each node in the distributed cluster, a node health and a load value using the load balancing strategy; and determining the temporary master node from all the nodes based on the node health and the load value.
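One possible reading of claim 8 is sketched below. The health signal (a heartbeat flag), the load formula (an equal-weight mix of CPU and memory utilization), and the rule of choosing the healthy node with the lowest load are all assumptions for illustration; the claim leaves the concrete load balancing strategy open.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    heartbeat_ok: bool  # assumed health signal
    cpu_load: float     # assumed utilization in [0, 1]
    mem_load: float     # assumed utilization in [0, 1]


def node_health(node):
    """Hypothetical health score: 1.0 if the node answers heartbeats, else 0.0."""
    return 1.0 if node.heartbeat_ok else 0.0


def load_value(node):
    """Hypothetical load value: equal-weight mix of CPU and memory utilization."""
    return 0.5 * node.cpu_load + 0.5 * node.mem_load


def pick_temporary_master(nodes):
    """Choose the healthy node with the lowest load; ties are broken by name."""
    healthy = [n for n in nodes if node_health(n) > 0]
    if not healthy:
        raise RuntimeError("no healthy node available")
    return min(healthy, key=lambda n: (load_value(n), n.name))
```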

9. The method according to claim 7, wherein determining the target hotspot data based on the statistical results comprises: sorting all candidate hotspot data in descending order of total access frequency based on the statistical results; and selecting a preset number of candidate hotspot data from the list sorted in descending order as the target hotspot data.
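The aggregation of claim 7 and the ranking of claim 9 can be illustrated with the following sketch. It assumes that each pulled sliding time window data table is a dictionary mapping a window start time to a dictionary of per-key frequencies; this layout and the function names are hypothetical.

```python
from collections import Counter


def total_frequencies(candidates, pulled_tables):
    """Sum each candidate's frequency over every window of every pulled table."""
    totals = Counter()
    for table in pulled_tables:        # one sliding window table per node
        for window in table.values():  # one {data key -> frequency} dict per window
            for key in candidates:
                totals[key] += window.get(key, 0)
    return totals


def target_hotspots(totals, preset_number):
    """Sort by total frequency in descending order and keep the first entries."""
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:preset_number]
```

Keys that are not candidate hotspot data are ignored even if some node reports a high frequency for them, which keeps the amount of data the temporary master node has to aggregate proportional to the candidate set.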

10. The method according to any one of claims 7-9, further comprising: storing the target hotspot data using the name of the business scenario as a storage index to obtain a global hotspot data storage table; and synchronizing, by the temporary master node, the global hotspot data storage table to the other slave nodes.

11. The method of claim 10, further comprising: in response to receiving a hotspot data query request, determining whether the global hotspot data storage table is stored locally; and in response to determining that the global hotspot data storage table is stored locally, determining a hotspot data query result from the global hotspot data storage table based on the name of the business scenario and the quantity of hotspot data in the hotspot data query request.

12. The method of claim 11, further comprising: in response to determining that the global hotspot data storage table is not stored locally, generating a remote hotspot data query request; and sending the remote hotspot data query request to the distributed cluster using the load balancing strategy, so that the distributed cluster determines the hotspot data query result from the global hotspot data storage table stored in the distributed cluster based on the name of the business scenario and the quantity of hotspot data in the remote hotspot data query request.
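The local-first query of claims 11 and 12 could look roughly like the sketch below. The global hotspot data storage table is assumed to be a dictionary indexed by business scenario name, and `remote_query` is a hypothetical callable standing in for whatever transport sends the request to the cluster under the load balancing strategy.

```python
def query_hotspots(scenario_name, quantity, local_table, remote_query):
    """Answer a hotspot data query locally if possible, otherwise remotely.

    local_table: {scenario name -> ranked list of hotspot data}, or None when
    the global table has not been synchronized to this node (assumed layout).
    remote_query: hypothetical callable that forwards the request to the cluster.
    """
    if local_table is not None:
        # Global table is stored locally: slice the requested quantity.
        return local_table.get(scenario_name, [])[:quantity]
    # Global table is not stored locally: build and send a remote request.
    request = {"scenario": scenario_name, "quantity": quantity}
    return remote_query(request)
```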

13. A method for determining hotspot data, comprising: loading a business configuration file, wherein the business configuration file includes: identification information of a service instance, the business scenarios included in each service instance, a sampling rate, and a reporting cycle; writing business data into a foreground buffer through a preset interface according to the sampling rate; switching the foreground buffer and a background buffer according to the reporting cycle; for the background buffer after the switch, counting the access frequency of the data in the background buffer by business scenario; and generating request message data based on the statistical results, and reporting the request message data to a server.
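A minimal client-side sketch of the double-buffer reporting in claim 13 follows. The class name, the message layout, the injected `send` callable, and the injectable random source are assumptions for the example; a timer that calls `report()` once per reporting cycle is assumed but not shown.

```python
import random
import threading
from collections import Counter


class DoubleBufferReporter:
    """Samples business data into a foreground buffer and reports per cycle."""

    def __init__(self, instance_id, sampling_rate, send, rng=random.random):
        self.instance_id = instance_id
        self.sampling_rate = sampling_rate  # fraction of records kept, in [0, 1]
        self.send = send                    # hypothetical transport to the server
        self.rng = rng
        self.foreground = []
        self.background = []
        self.lock = threading.Lock()

    def record(self, scenario, data_key):
        """Preset write interface: keep the record with probability sampling_rate."""
        if self.rng() < self.sampling_rate:
            with self.lock:
                self.foreground.append((scenario, data_key))

    def report(self):
        """Switch buffers, count frequencies by scenario, and send the message."""
        with self.lock:
            # The filled buffer becomes the background; writers get a fresh one.
            self.foreground, self.background = [], self.foreground
        counts = {}
        for scenario, data_key in self.background:
            counts.setdefault(scenario, Counter())[data_key] += 1
        message = {
            "instance_id": self.instance_id,
            "scenarios": {s: dict(c) for s, c in counts.items()},
        }
        self.send(message)
        return message
```

Switching buffers under a short lock lets writers continue into the new foreground buffer while the background buffer is counted without further locking.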

14. A device for determining hotspot data, comprising: a receiving module configured to receive request message data sent by a client, wherein the request message data includes business data corresponding to multiple service instances; a first writing module configured to write the business data into a time window data table corresponding to the service instance based on identification information of the service instance; a first determining module configured to determine candidate hotspot data based on an access frequency corresponding to the data in the time window data table; and a second determining module configured to statistically analyze a total access frequency of the candidate hotspot data within the time window of a distributed cluster according to a preset load balancing strategy, and to determine target hotspot data based on the statistical results.

15. A device for determining hotspot data, comprising: a loading module configured to load a business configuration file, wherein the business configuration file includes: identification information of a service instance, the business scenarios included in each service instance, a sampling rate, and a reporting cycle; a second writing module configured to write business data into a foreground buffer through a preset interface according to the sampling rate; a statistics module configured to switch the foreground buffer and a background buffer according to the reporting cycle, and to count, after the switch, the access frequency of the data in the background buffer by business scenario; and a reporting module configured to generate request message data based on the statistical results and to report the request message data to a server.

16. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-12 or 13.

17. A non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-12 or 13.

18. A computer program product comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1-12 or 13.