An AI optimization-based terminal data transmission method

By fusing multi-source information through a hierarchical constraint reinforcement learning model, a data transmission strategy for electricity meter terminals is generated, which solves the problem of insufficient dynamic adaptation capability in existing technologies and realizes intelligent data transmission decision-making and stable execution.

CN121967341B · Active · Publication Date: 2026-06-26 · HANGZHOU HUALONG ELECTRONIC TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HANGZHOU HUALONG ELECTRONIC TECH CO LTD
Filing Date
2026-03-26
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing data transmission strategies for electricity meter terminals are insufficient to achieve collaborative perception of multi-source information and hierarchical adaptive decision-making, especially in scenarios with multiple concurrent services, fluctuating network conditions, and limited terminal resources, where the dynamic adaptability of the transmission strategy is inadequate.

Method used

An AI-optimized terminal data transmission method is adopted. By integrating terminal data status, communication network status and time context information through a hierarchical constraint reinforcement learning model, scheduling strategies and transmission parameter strategies are generated and their feasibility is verified to achieve intelligent scheduling and stable execution.

Benefits of technology

It improves the adaptability and feasibility of data transmission from electricity meter terminals in complex operating scenarios, and enhances the intelligence and stability of transmission decisions.

✦ Generated by Eureka AI based on patent content.

Figure CN121967341B_ABST

Abstract

The application discloses an AI-optimized terminal data transmission method in the technical field of intelligent communication for electric energy meters. The method comprises: combining a labeled to-be-transmitted queue, a communication network state, a terminal resource state, and time context information into a preliminary transmission state feature vector, and generating a transmission state feature vector through preprocessing; inputting the transmission state feature vector into a hierarchical constraint reinforcement learning model, outputting a scheduling-type strategy through a high-level decision layer, outputting a parameter-type strategy through a low-level execution layer under the upper-layer scheduling constraint, and having a constraint checker determine the feasibility of the two types of strategies and output a feasible transmission strategy vector, which comprises a feasible scheduling-type strategy and a feasible transmission-type strategy. The application realizes intelligent scheduling and stable execution of terminal data transmission for electric energy meters in complex operating scenarios, and improves the scenario adaptability and executability of transmission decisions.

Description

Technical Field

[0001] This invention relates to the field of smart communication technology for electricity meters, and in particular to a terminal data transmission method based on AI optimization.

Background Technology

[0002] With the continuous advancement of new power systems and smart grids, electricity meters have evolved from traditional single metering devices into intelligent terminals integrating metering, communication, status monitoring, and data analysis. Existing electricity meter terminals generally upload metering data, event data, and operational status data to the main station system periodically or on demand via carrier communication, public wireless networks, or dedicated communication networks. To adapt to the application requirements of large-scale terminal access and multi-service concurrency, related technologies have introduced queued caching, time-sharing scheduling, and adaptive transmission mechanisms based on network status, enabling terminals to complete data uploads to a certain extent based on communication conditions and service priorities. Simultaneously, with the development of edge computing and artificial intelligence technologies, some existing solutions are beginning to utilize simple rule models or statistical learning methods to optimize the configuration of communication time slots, transmission frequencies, and data compression strategies to improve overall transmission efficiency and system stability.

[0003] In existing technological systems, data transmission decisions at electricity meter terminals are mostly based on static rules or low-dimensional state parameters. They typically adjust only for the communication link status or the priority of a single service, and fail to fully integrate multi-source information such as terminal resource status, historical communication behavior, and temporal context. This leaves room for improvement in the dynamic adaptability of transmission strategies. Especially in scenarios involving multiple concurrent services, fluctuating network conditions, and limited terminal resources, existing methods often struggle to establish effective hierarchical coordination between scheduling decisions and transmission parameter configuration, and the correlation between data selection, packetization methods, and transmission timing needs strengthening. These shortcomings are not due to limitations in the technological approach; rather, they reflect an objective requirement for higher-level intelligent decision-making mechanisms as application complexity increases.

Summary of the Invention

[0004] In view of the aforementioned existing problems, the present invention is proposed.

[0005] Therefore, this invention provides an AI-optimized terminal data transmission method to solve the problem that data transmission strategies for electricity meter terminals struggle to achieve collaborative perception of multi-source information and hierarchical adaptive decision-making.

[0006] To solve the above-mentioned technical problems, the present invention provides the following technical solution:

[0007] This invention provides an AI-optimized terminal data transmission method, comprising: the terminal collecting metering service data to be transmitted, writing it into a transmission queue and attaching service labeling information to form a labeled transmission queue; simultaneously obtaining the communication network status based on historical transmission records and extracting the terminal resource status from the terminal's operating status information; combining the labeled transmission queue, communication network status, terminal resource status, and time context information into a preliminary transmission status feature vector, and generating a transmission status feature vector through preprocessing; inputting the transmission status feature vector into a hierarchical constraint reinforcement learning model, outputting scheduling strategies through a high-level decision layer, and outputting parameter-based strategies through a low-level execution layer under upper-level scheduling constraints; and having a constraint checker determine the feasibility of the two types of strategies and output a feasible transmission strategy vector; the feasible transmission strategy vector includes feasible scheduling strategies and feasible transmission strategies; selecting data to be transmitted from the labeled transmission queue according to the feasible scheduling strategies, and performing data processing and packetization on the data to be transmitted according to the feasible transmission strategies, and obtaining transmission feedback records.

[0008] As a preferred embodiment of the AI-optimized terminal data transmission method of the present invention, the terminal collects metering service data to be transmitted, writes it into a transmission queue and attaches service labeling information to form a labeled transmission queue. The specific steps are as follows.

[0009] Collect metering business data, perform integrity verification on the metering business data, and obtain the metering business data to be transmitted;

[0010] The metering service data to be transmitted is written into the transmission queue, and a unique data identifier is assigned to each metering service data and the enqueue time is recorded. Service labeling information is added to each metering service data in the transmission queue to generate a labeled transmission queue.

[0011] As a preferred embodiment of the AI-optimized terminal data transmission method of the present invention, the steps of simultaneously obtaining the communication network status based on historical transmission records and extracting the terminal resource status from the terminal operating status information are as follows:

[0012] Read historical transmission records within a preset time window, perform statistical analysis on communication transmission based on historical transmission records, calculate communication network status parameters, and obtain communication network status;

[0013] Extract remaining battery power, processor utilization, and available storage space from the terminal's operating status information, and output the terminal resource status.

[0014] As a preferred embodiment of the AI-optimized terminal data transmission method of the present invention, the steps of combining the labeled queue of data to be transmitted, the communication network status, the terminal resource status, and the time context information into a preliminary transmission status feature vector, and generating a transmission status feature vector through preprocessing, are as follows:

[0015] Extract queue status feature vectors from the labeled queues to be transmitted according to priority and timeliness;

[0016] The communication network state parameters and terminal resource states are vectorized to generate network feature vectors and resource feature vectors.

[0017] Discrete encoding and periodic sine and cosine encoding are performed on the temporal context information, and it is mapped to a temporal feature vector;

[0018] The queue state feature vector, network feature vector, resource feature vector and time feature vector are concatenated to generate a preliminary transmission state feature vector;

[0019] Perform missing value processing, outlier processing, and normalization on the preliminary transmission state feature vector, and output the transmission state feature vector.

[0020] As a preferred embodiment of the AI-optimized terminal data transmission method of the present invention, the time context information is obtained by reading the current clock information of the terminal to determine the current time, the time period to which it belongs, and its position in the business statistics cycle.

[0021] As a preferred embodiment of the AI-optimized terminal data transmission method of the present invention, the steps of inputting the transmission state feature vector into a hierarchical constraint reinforcement learning model, outputting scheduling strategies through a high-level decision layer, and outputting parameter-based strategies through a low-level execution layer under the upper-level scheduling constraints are as follows:

[0022] The transmission state feature vector is loaded into the hierarchical constraint reinforcement learning model, and the high-level decision layer is called to perform a forward inference process according to the decision mapping relationship, and the candidate scheduling class policy is output.

[0023] Based on the candidate scheduling strategies, the scheduling object, the sending time, and the sending batch size are selected one by one to form the scheduling strategy.

[0024] The scheduling behavior corresponding to the scheduling strategy is compared with the running state information in the transmission state feature vector to determine the scheduling feasibility boundary and form the upper-level scheduling constraint.

[0025] Under the upper-level scheduling constraints, the transmission state feature vector is submitted as input to the lower-level execution layer, and forward inference processing is performed according to the parameter decision mapping relationship to output the parameter class policy.

[0026] As a preferred embodiment of the AI-optimized terminal data transmission method of the present invention, the step of using a constraint checker to determine the feasibility of two types of strategies and outputting a feasible transmission strategy vector includes the following specific steps.

[0027] The scheduling policy and its corresponding upper-level scheduling constraints are input into the constraint checker. The constraint checker performs consistency verification on the scheduling policy. When the upper-level scheduling constraints are violated, a fixed order of policy correction is executed, and a feasible scheduling policy is output.

[0028] The parameter-type strategy is submitted to the constraint checker, which performs a feasibility check on the parameter-type strategy based on the upper-level scheduling constraints corresponding to the feasible scheduling strategy. If the upper-level scheduling constraints are not met, a deterministic correction is performed, and a feasible transmission strategy is output.

[0029] The feasible scheduling strategy and the feasible transmission strategy are uniformly vectorized and sequentially concatenated to output the feasible transmission strategy vector.
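As an illustrative sketch only, and not the patented implementation, the fixed-order correction and concatenation described in the constraint checker steps above might look like the following. The strategy keys, the constraint fields, and the specific correction order are all assumptions introduced for this example; the source does not enumerate them.

```python
# Hypothetical strategy layouts; the source does not list the actual fields.
SCHEDULE_KEYS = ("target_priority", "send_delay_s", "batch_size")
PARAM_KEYS = ("packet_bytes", "interval_s", "max_retries")


def correct_schedule(schedule, constraint):
    """Clamp a scheduling strategy into the feasible region.

    Corrections are applied in a fixed order (batch size, then send delay)
    so that the same input always yields the same corrected output.
    """
    fixed = dict(schedule)
    fixed["batch_size"] = min(fixed["batch_size"], constraint["max_batch"])
    fixed["send_delay_s"] = max(fixed["send_delay_s"], constraint["min_delay_s"])
    return fixed


def correct_params(params, constraint):
    """Deterministically clamp a parameter strategy under the same constraint."""
    fixed = dict(params)
    fixed["packet_bytes"] = min(fixed["packet_bytes"], constraint["max_packet_bytes"])
    fixed["max_retries"] = min(fixed["max_retries"], constraint["max_retries"])
    return fixed


def feasible_strategy_vector(schedule, params, constraint):
    """Correct both strategies, then concatenate them in a fixed key order."""
    s = correct_schedule(schedule, constraint)
    p = correct_params(params, constraint)
    return [s[k] for k in SCHEDULE_KEYS] + [p[k] for k in PARAM_KEYS]
```

Clamping is used here because it is deterministic and idempotent: a strategy that is already feasible passes through unchanged.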

[0030] As a preferred embodiment of the AI-optimized terminal data transmission method of the present invention, the construction process of the hierarchical constraint reinforcement learning model includes the following specific steps:

[0031] Based on the reinforcement learning decision-making framework, a hierarchical structure is constructed, consisting of an input layer, a hierarchical policy decision layer, a constraint judgment layer, and an output layer; the hierarchical policy decision layer includes a high-level decision layer and a low-level execution layer.

[0032] The input layer receives the transmission state feature vector and inputs it to the hierarchical policy decision layer. The higher-level decision layer makes global scheduling decisions, and the lower-level execution layer makes transmission parameter decisions.

[0033] By establishing a constraint judgment layer that runs through both the high-level decision layer and the low-level execution layer, and by performing feasibility verification and deterministic correction, a hierarchical constraint reinforcement learning model is formed.

[0034] As a preferred embodiment of the AI-optimized terminal data transmission method of the present invention, the high-level decision layer takes the transmission state feature vector as input, is trained offline using a value function-based reinforcement learning method to learn the scheduling decision rules under different operating states, and runs in forward inference mode on the terminal side.

[0035] The lower-level execution layer is trained using a policy network-based reinforcement learning method under the scheduling constraints of the upper layer, and makes detailed decisions on packet segmentation, transmission rhythm and transmission control parameters.

[0036] As a preferred embodiment of the AI-optimized terminal data transmission method of the present invention, the specific steps for obtaining the transmission feedback record are as follows:

[0037] Based on feasible scheduling strategies, construct an upper-level scheduling constraint view, and within the scope of the upper-level scheduling constraint view, select data to be sent from the marked queues to be transmitted to form a set of data to be sent.

[0038] Based on feasible transmission strategies, perform data organization and packet processing on the data set to be sent, and send the corresponding data packets; collect the data packet sending results and establish an association record with the corresponding data to be sent to generate a transmission feedback record.

[0039] The beneficial effects of this invention are as follows: by integrating the data status to be transmitted, communication network status, terminal resource status and time context information of the electricity meter terminal, a unified transmission status feature is constructed. Combined with a hierarchical constraint reinforcement learning mechanism, scheduling strategies and transmission parameter strategies are generated and their feasibility is verified, thereby realizing intelligent scheduling and stable execution of electricity meter terminal data transmission in complex operating scenarios, and improving the scenario adaptability and executability of transmission decisions.

Attached Figure Description

[0040] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the following description of the embodiments will be briefly introduced. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0041] Figure 1 is a flowchart of an AI-optimized terminal data transmission method.

[0042] Figure 2 is a flowchart for generating a labeled queue to be transmitted.

[0043] Figure 3 is a flowchart for generating transmission state feature vectors.

[0044] Figure 4 is a flowchart of policy decision-making for a hierarchical constraint reinforcement learning model.

[0045] Figure 5 is a comparison chart showing how the weights of network packet loss rate and transmission interval change over time.

[0046] Figure 6 is a diagram illustrating how recognition accuracy changes over time.

Detailed Implementation

[0047] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0048] Many specific details are set forth in the following description in order to provide a full understanding of the invention. However, the invention may also be practiced in other ways different from those described herein, and those skilled in the art can make similar extensions without departing from the spirit of the invention. Therefore, the invention is not limited to the specific embodiments disclosed below.

[0049] In addition, the term "one embodiment" or "embodiment" as used herein refers to a specific feature, structure, or characteristic that may be included in at least one implementation of the present invention. The phrase "in one embodiment" appearing in different places in this specification does not necessarily refer to the same embodiment, nor does it refer to separate or alternative embodiments that are mutually exclusive with other embodiments.

[0050] Referring to Figures 1-6, one embodiment of the present invention provides an AI-optimized terminal data transmission method, including the following steps:

[0051] S1. The terminal collects the metering service data to be transmitted, writes it into the queue to be transmitted and adds service labeling information to form a labeled queue to be transmitted. At the same time, it obtains the communication network status based on historical transmission records and extracts the terminal resource status from the terminal operation status information.

[0052] S1.1 Collect metering business data, perform integrity verification on the metering business data, and obtain the metering business data to be transmitted.

[0053] Furthermore, the terminal reads metering business data from the metering business data source and sequentially performs field integrity verification, data format and value range verification, and key field consistency verification on each metering business data. When the metering business data meets the preset integrity verification rules, it is identified as the metering business data to be transmitted and output. Metering business data that fails the verification does not enter the subsequent processing flow.

[0054] It should be noted that the integrity verification rules are a set of verification conditions used to determine whether metering business data meets the transmission requirements. They include at least the existence verification of required fields, the verification of field data format and value range, and the consistency verification between key fields.

[0055] The integrity verification rules are determined based on the business specifications, data model definitions, and identifiable business constraints of the metering business data. They are obtained by analyzing the field structure, business meaning, and upstream and downstream processing requirements of the metering business data to clarify the key fields that must participate in transmission and processing and the requirements for legal values.
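As an illustrative sketch only, the three-stage verification of S1.1 (required fields, format and value range, key-field consistency) could be expressed as below. The field names and the value range are assumptions chosen for the example; the source leaves the concrete rules to the business specification.

```python
from typing import Any

# Hypothetical required fields; real rules come from the business data model.
REQUIRED_FIELDS = ("meter_id", "timestamp", "energy_kwh", "period_start", "period_end")


def check_required(record: dict[str, Any]) -> bool:
    """Existence check: every required field is present and not None."""
    return all(record.get(name) is not None for name in REQUIRED_FIELDS)


def check_format_and_range(record: dict[str, Any]) -> bool:
    """Format and value-range check (the 0..1e6 kWh range is an assumed example)."""
    return (
        isinstance(record["meter_id"], str)
        and isinstance(record["energy_kwh"], (int, float))
        and 0.0 <= record["energy_kwh"] <= 1e6
    )


def check_consistency(record: dict[str, Any]) -> bool:
    """Key-field consistency: the metering period must precede the record time."""
    return record["period_start"] <= record["period_end"] <= record["timestamp"]


# Checks run in this order; all() stops at the first failure, so later
# checks can rely on the fields validated by earlier ones.
CHECKS = (check_required, check_format_and_range, check_consistency)


def filter_transmittable(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return only the records that pass every integrity check."""
    return [r for r in records if all(check(r) for check in CHECKS)]
```

Records that fail any check are simply dropped from the result, matching the statement that failed data does not enter the subsequent processing flow.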

[0056] S1.2 Write the metering service data to be transmitted into the transmission queue, and assign a unique data identifier to each metering service data and record the enqueue time; add service labeling information to each metering service data in the transmission queue to generate a labeled transmission queue.

[0057] Furthermore, the terminal sequentially writes the identified metering service data to be transmitted into the transmission queue. As each piece of metering service data is written into the transmission queue, a unique data identifier is generated and written for it, and the corresponding enqueue time is recorded based on the terminal's current clock information. After the unique data identifier and the enqueue time have been recorded, service labeling information is added to the metering service data to be transmitted, and the labeled metering service data is stored in the transmission queue, thus forming a labeled transmission queue containing a unique data identifier, enqueue time, and service labeling information.

[0058] It should be noted that the business labeling information includes the business type identifier, business priority identifier, and business timeliness identifier corresponding to the metering business data.
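A minimal sketch of the labeled queue in S1.2 follows, assuming an in-memory list and UUID-based identifiers. The class and field names are hypothetical, and the encodings of priority and timeliness are assumptions; the source only states which three labels are attached.

```python
import time
import uuid
from dataclasses import dataclass


@dataclass
class QueueEntry:
    payload: dict
    data_id: str         # unique data identifier
    enqueue_time: float  # taken from the terminal clock
    service_type: str    # business type identifier
    priority: int        # assumption: larger value means higher priority
    timeliness: str      # assumption: "urgent" or "general"


class LabeledQueue:
    """Transmission queue whose entries carry an id, enqueue time and labels."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so the terminal clock can be substituted
        self.entries: list[QueueEntry] = []

    def enqueue(self, payload, service_type, priority, timeliness):
        entry = QueueEntry(
            payload=payload,
            data_id=uuid.uuid4().hex,
            enqueue_time=self._clock(),
            service_type=service_type,
            priority=priority,
            timeliness=timeliness,
        )
        self.entries.append(entry)
        return entry
```

Passing the clock in as a parameter keeps the enqueue time tied to whatever time source the terminal actually uses.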

[0059] S1.3 Read historical transmission records within a preset time window, perform statistical analysis on communication transmission based on historical transmission records, calculate communication network status parameters, and obtain communication network status.

[0060] Furthermore, the terminal determines the start and end times of the preset time window and reads the historical transmission records that fall within the preset time window from the historical transmission record storage area; the terminal classifies, summarizes and statistically calculates the read historical transmission records according to the transmission success or failure identifier, retransmission count, acknowledgment delay and timeout event records to obtain communication network status parameters; the terminal encapsulates and outputs the communication network status parameters to form the communication network status.

[0061] It should be noted that the preset time window is set according to the real-time requirements of communication network status assessment. By configuring a fixed duration or a sliding time range, the historical transmission records can reflect the recent communication transmission status of the terminal.
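The windowed statistics of S1.3 can be sketched as follows. This is an assumption-laden example: the record fields are hypothetical, and a simple mean is used for the acknowledgment delay statistic, whereas the source does not say which statistic is computed.

```python
from dataclasses import dataclass


@dataclass
class TxRecord:
    timestamp: float      # seconds
    success: bool         # transmission success/failure identifier
    retransmissions: int  # retransmission count
    ack_delay_ms: float   # acknowledgment delay
    timed_out: bool       # timeout event flag


def network_status(records, now, window_s):
    """Summarize records inside the sliding window [now - window_s, now]."""
    recent = [r for r in records if now - window_s <= r.timestamp <= now]
    n = len(recent)
    if n == 0:
        return None  # no recent history; the caller decides on a fallback
    successes = sum(r.success for r in recent)
    return {
        "success_rate": successes / n,
        "failure_rate": (n - successes) / n,
        "avg_retransmissions": sum(r.retransmissions for r in recent) / n,
        "avg_ack_delay_ms": sum(r.ack_delay_ms for r in recent) / n,
        "timeout_rate": sum(r.timed_out for r in recent) / n,
    }
```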

[0062] S1.4 Extract the remaining power, processor utilization, and available storage space from the terminal's operating status information, and output the terminal resource status.

[0063] Furthermore, the remaining battery power field, processor utilization rate field, and available storage space field are located in the terminal's operating status information. These fields are parsed and their units are made consistent to form the remaining battery power value, processor utilization rate value, and available storage space value. These values are then encapsulated and output in a preset field order to obtain the terminal resource status.

[0064] It should be noted that the terminal's operating status information is provided by the terminal's internal operating status information source and is obtained by calling the terminal's local operating status acquisition interface.

[0065] The field order is set according to the fixed-dimensional mapping requirements of the terminal resource status in subsequent feature vector construction and model input.
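The parsing, unit unification, and fixed-order output of S1.4 might be sketched as below. The raw field names and the target units (fractions for battery and processor load, KiB for storage) are assumptions for illustration, not values given in the source.

```python
# Fixed output order, so each position maps to a fixed model input dimension.
RESOURCE_FIELD_ORDER = ("battery_frac", "cpu_frac", "free_storage_kib")


def parse_resource_status(raw: dict) -> tuple:
    """Convert hypothetical raw status fields to unified units, in fixed order."""
    unified = {
        "battery_frac": raw["battery_percent"] / 100,          # percent -> 0..1
        "cpu_frac": raw["cpu_percent"] / 100,                  # percent -> 0..1
        "free_storage_kib": raw["free_storage_bytes"] / 1024,  # bytes -> KiB
    }
    return tuple(unified[name] for name in RESOURCE_FIELD_ORDER)
```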

[0066] S2. Combine the labeled queue to be transmitted, communication network status, terminal resource status and time context information into a preliminary transmission status feature vector, and generate a transmission status feature vector through preprocessing.

[0067] S2.1 Extract the queue status feature vector from the labeled queue to be transmitted according to priority and timeliness.

[0068] Furthermore, the system reads the business priority identifier, business timeliness identifier, and enqueue time corresponding to each metering business data in the labeled queue to be transmitted. It then sorts the labeled queue to be transmitted according to the business priority identifier from high to low, the business timeliness identifier from urgent to general within the same business priority identifier, and the enqueue time from early to late within the same business timeliness identifier. Based on the sorting results, it counts the number of data entries for each business priority identifier, the number of data entries for each business timeliness identifier, the queuing wait time corresponding to the earliest enqueue time, and the number of data entries nearing expiration, forming a queue status feature vector.

[0069] It should be noted that the number of data entries nearing expiration refers to the number of metering business data entries whose remaining valid time is less than the expiration threshold. The remaining valid time is the maximum allowable waiting time corresponding to the business timeliness identifier of the data minus the time the data has already waited in the queue since its enqueue time.

[0070] The expiration threshold is derived from the maximum allowable transmission time limit specified by the service timeliness identifier of the metering service data, using either a preset ratio or a fixed lead time.
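The sorting and counting in S2.1 can be illustrated with the sketch below. The priority levels, the maximum waiting times, and the 20% ratio used for the expiration threshold are assumed example values; entries are plain dictionaries with hypothetical keys.

```python
PRIORITIES = (3, 2, 1)               # assumed levels, high to low
TIMELINESS = ("urgent", "general")   # assumed labels, urgent first
MAX_WAIT_S = {"urgent": 60.0, "general": 600.0}  # assumed time limits
EXPIRY_RATIO = 0.2  # assumed: threshold = ratio * maximum allowable wait


def queue_features(entries, now):
    """Sort the queue and derive the queue status feature vector."""
    ordered = sorted(
        entries,
        key=lambda e: (-e["priority"], TIMELINESS.index(e["timeliness"]), e["enqueue_time"]),
    )
    per_priority = [sum(e["priority"] == p for e in ordered) for p in PRIORITIES]
    per_timeliness = [sum(e["timeliness"] == t for e in ordered) for t in TIMELINESS]
    # Waiting time of the entry with the earliest enqueue time.
    oldest_wait = max((now - e["enqueue_time"] for e in ordered), default=0.0)
    near_expiry = 0
    for e in ordered:
        max_wait = MAX_WAIT_S[e["timeliness"]]
        remaining = max_wait - (now - e["enqueue_time"])
        if remaining < EXPIRY_RATIO * max_wait:
            near_expiry += 1
    return ordered, [*per_priority, *per_timeliness, oldest_wait, near_expiry]
```

With these assumed values, an urgent entry that has waited 50 s has 10 s left, which is below the 12 s threshold, so it counts as nearing expiration.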

[0071] S2.2. Vectorize the communication network state parameters and terminal resource state to generate network feature vectors and resource feature vectors.

[0072] Furthermore, the system sequentially reads the transmission success rate, transmission failure rate, average retransmission count, acknowledgment delay statistics, and timeout event rate from the communication network status. After performing missing value completion and numerical range truncation on each communication network status parameter, the system arranges the parameters in order to generate a network feature vector. Similarly, it sequentially reads the remaining battery power, processor utilization rate, and available storage space from the terminal resource status in order of field. After performing missing value completion and numerical range truncation on each terminal resource status field, the system arranges the parameters in order to generate a resource feature vector.

[0073] S2.3. Perform discrete encoding and periodic sine and cosine encoding on the time context information and map it into a time feature vector.

[0074] Furthermore, the system reads the terminal's current clock information and determines the time period identifier and the position value of the current time within the business statistics cycle. It then performs discrete encoding on the time period identifier and the date type identifier to generate discrete time encoding features, and calculates periodic sine and cosine encoding on the position value within the business statistics cycle to generate periodic sine and cosine encoding features. Finally, it concatenates the discrete time encoding features with the periodic sine and cosine encoding features to obtain a time feature vector.

[0075] It should be noted that the time context information is obtained by reading the terminal's current clock information to determine the current time, the time period, and its position in the business statistics cycle.
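The time encoding of S2.3 is sketched below under stated assumptions: four six-hour periods, a workday/weekend date type, and a 24-hour business statistics cycle. None of these choices is specified in the source. The sine and cosine pair places the cycle position on a unit circle, so the end of one cycle sits next to the start of the following one.

```python
import math
from datetime import datetime

NUM_PERIODS = 4        # assumption: four 6-hour periods per day
CYCLE_SECONDS = 86400  # assumption: the business statistics cycle is one day


def time_features(now: datetime) -> list[float]:
    """One-hot period + one-hot date type + sin/cos of the cycle position."""
    period = now.hour // 6
    period_one_hot = [1.0 if i == period else 0.0 for i in range(NUM_PERIODS)]
    is_weekend = now.weekday() >= 5
    date_type_one_hot = [0.0, 1.0] if is_weekend else [1.0, 0.0]
    position = now.hour * 3600 + now.minute * 60 + now.second
    angle = 2 * math.pi * position / CYCLE_SECONDS
    return period_one_hot + date_type_one_hot + [math.sin(angle), math.cos(angle)]
```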

[0076] S2.4. Concatenate the queue state feature vector, network feature vector, resource feature vector, and time feature vector to generate a preliminary transmission state feature vector.

[0077] It should be noted that the dimensional lengths of the queue state feature vector, network feature vector, resource feature vector, and time feature vector are checked for consistency with the preset dimensional lengths. After the consistency check passes, the terminal sequentially concatenates the queue state feature vector, network feature vector, resource feature vector, and time feature vector according to the concatenation order to form a preliminary transmission state feature vector.

[0078] It should be noted that the preset dimension length is set based on the number of feature terms contained in various feature vectors and the fixed input dimension requirements.
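The dimension check and ordered concatenation of S2.4 can be sketched as follows. The preset dimension lengths shown are hypothetical values; in practice they follow from the number of feature terms in each vector.

```python
# Hypothetical preset dimension lengths, listed in concatenation order.
EXPECTED_DIMS = (("queue", 7), ("network", 5), ("resource", 3), ("time", 8))


def concat_features(parts: dict[str, list[float]]) -> list[float]:
    """Verify each vector's length, then concatenate in the preset order."""
    out: list[float] = []
    for name, dim in EXPECTED_DIMS:
        vec = parts[name]
        if len(vec) != dim:
            raise ValueError(f"{name} vector has length {len(vec)}, expected {dim}")
        out.extend(vec)
    return out
```

Failing loudly on a length mismatch keeps every feature at a fixed position in the model input.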

[0079] S2.5 Perform missing value processing, outlier processing, and normalization processing on the preliminary transmission state feature vector, and output the transmission state feature vector.

[0080] Furthermore, each feature dimension of the preliminary transmission state feature vector is checked one by one. When a feature component is found to be missing, it is replaced with a valid historical value of the same type to ensure feature integrity. Subsequently, based on the normal value range of each feature component obtained statistically during historical operation, abnormal values that exceed the reasonable range are truncated or pulled back into the corresponding reasonable range. After the missing and abnormal values have been processed, linear normalization or standardization is used to scale each feature component, so that feature components with different units and value ranges are mapped to a unified numerical scale, thereby obtaining a transmission state feature vector that can be used directly as model input.

[0081] It should be noted that the normal value range is determined from the actual value distribution of each feature component during historical operation; statistical analysis yields the numerical range that reflects the normal operating level.
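A minimal sketch of the S2.5 preprocessing follows, assuming min-max (linear) normalization onto [0, 1] and clipping as the outlier treatment. The function name and argument layout are hypothetical.

```python
import math


def preprocess(vector, last_valid, ranges):
    """Fill missing values, clip outliers, and min-max normalize each component.

    vector:     raw components; None or NaN marks a missing value
    last_valid: most recent valid historical value for each component
    ranges:     (low, high) normal range per component from historical statistics
    """
    out = []
    for value, fallback, (low, high) in zip(vector, last_valid, ranges):
        if value is None or math.isnan(value):
            value = fallback                    # missing value completion
        value = min(max(value, low), high)      # pull outliers back into range
        # Linear normalization; a degenerate range maps to 0.0.
        out.append((value - low) / (high - low) if high > low else 0.0)
    return out
```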

[0082] S3. Input the transmission state feature vector into the hierarchical constraint reinforcement learning model. The higher-level decision layer outputs scheduling-type policies, and the lower-level execution layer outputs parameter-type policies under the upper-level scheduling constraints. The constraint checker then determines the feasibility of the two types of policies and outputs a feasible transmission policy vector.

[0083] It should be noted that the specific process of constructing a hierarchical constraint reinforcement learning model is as follows:

[0084] Based on the reinforcement learning decision-making framework, a hierarchical structure is constructed, consisting of an input layer, a hierarchical policy decision layer, a constraint judgment layer, and an output layer.

[0085] Furthermore, within the reinforcement learning decision framework, the following layers are arranged along the data flow direction: an input layer that receives the transmission state feature vector; a hierarchical policy decision layer, comprising a high-level decision layer and a low-level execution layer, that makes the global scheduling decisions and the transmission parameter decisions respectively; a constraint decision layer (the constraint checker) that performs feasibility verification and deterministic correction of the scheduling-type and parameter-type policies; and an output layer that uniformly outputs the feasible transmission policy vector. This forms a hierarchical structure in which each layer has clearly defined responsibilities and the decision-making process is constrained level by level.

[0086] It should also be noted that the high-level decision layer takes the transmission state feature vector as input. It is trained offline with a value function-based reinforcement learning method to learn the scheduling decision rules under different operating states, and it runs on the terminal side in forward inference mode.

[0087] The lower-level execution layer is trained using a policy-based reinforcement learning method under the scheduling constraints of the upper layer, and makes detailed decisions on packet segmentation, transmission rhythm and transmission control parameters.

[0088] The input layer receives the transmission state feature vector and inputs it to the hierarchical policy decision layer. The higher-level decision layer makes global scheduling decisions, and the lower-level execution layer makes transmission parameter decisions.

[0089] Furthermore, the terminal loads the transmission state feature vector and inputs it into the input layer. The input layer then passes the transmission state feature vector to the hierarchical policy decision layer in dimensional order. The hierarchical policy decision layer first calls the higher-level decision layer to perform forward inference on the transmission state feature vector, outputting candidate scheduling strategies corresponding to the global scheduling decision, and converting the candidate scheduling strategies into scheduling strategies for scheduling objects, sending timing, and sending batch size. Under the upper-level scheduling constraints formed by the scheduling strategies, the hierarchical policy decision layer then calls the lower-level execution layer to perform forward inference on the transmission state feature vector, outputting parameter strategies such as packet segmentation, sending rhythm, and transmission control parameters that match the scheduling strategies, thereby completing the linkage output of global scheduling decisions and transmission parameter decisions.

[0090] By establishing a constraint decision layer that runs through both the high-level decision-making layer and the low-level execution layer, and by performing feasibility verification and execution deterministic correction, a hierarchical constraint reinforcement learning model is formed.

[0091] Furthermore, a constraint decision layer is set up between the hierarchical policy decision layer and the output layer. This constraint decision layer simultaneously receives candidate scheduling strategies output by the higher-level decision layer, parameter strategies output by the lower-level execution layer, and runtime state information from the transmission state feature vector. The constraint decision layer first performs consistency verification between the candidate scheduling strategies and the runtime state information to determine the scheduling feasibility boundary. When a candidate scheduling strategy violates the scheduling feasibility boundary, it performs deterministic corrections on the scheduling object, transmission timing, or transmission batch size in a fixed order, and outputs a feasible scheduling strategy. Under the upper-level scheduling constraints corresponding to the feasible scheduling strategy, the constraint decision layer performs feasibility verification on the parameter strategies. When a parameter strategy does not meet the upper-level scheduling constraints, it performs deterministic corrections on the packet segmentation method, transmission rhythm, or transmission control parameters, and outputs a feasible transmission strategy. This forms a hierarchical constraint reinforcement learning model where the input layer, the hierarchical policy decision layer, and the constraint decision layer collaboratively constrain the output.

[0092] It should also be noted that upper-level scheduling constraints are a set of constraints used to limit the executable scope of scheduling policies. They describe the relationships that must be satisfied between the selection of scheduling objects, the arrangement of sending timing, the size of sending batches, and the current running state.

[0093] Upper-layer scheduling constraints are determined comprehensively based on the terminal's transmission capacity, the timeliness requirements of metering service data, and historical transmission operation characteristics.

[0094] S3.1 Load the transmission state feature vector into the hierarchical constraint reinforcement learning model, and call the high-level decision layer to perform a forward inference process according to the decision mapping relationship, and output the candidate scheduling class policy.

[0095] Furthermore, the transmission state feature vector is read from the transmission state feature vector storage area to the computation storage area, and a dimensionality consistency check is performed on the transmission state feature vector to confirm that the transmission state feature vector matches the input dimension of the high-level decision layer. The transmission state feature vector is loaded into the input layer of the hierarchical constraint reinforcement learning model according to the feature component order specified by the input layer. The input layer passes the transmission state feature vector to the high-level decision layer dimension by dimension. The high-level decision layer performs a forward inference process on the transmission state feature vector based on the decision mapping relationship obtained from offline training, calculates the action value corresponding to the candidate scheduling class strategy, obtains the action value output set, and generates a candidate scheduling action set from the action value output set. The terminal outputs the candidate scheduling action set as a candidate scheduling class strategy according to the candidate scheduling class strategy encoding rule. The candidate scheduling class strategy encoding rule is defined based on the selectable value space of scheduling object, sending time, and sending batch size, and is formed by discretely numbering each value combination.
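The encoding rule described above (discretely numbering each combination of scheduling object, sending time, and sending batch size) can be sketched as follows. The concrete value spaces `OBJECTS`, `TIMINGS`, and `BATCH_SIZES` are hypothetical placeholders; the source does not list the selectable values.

```python
from itertools import product

# Hypothetical selectable value spaces; the real ones are deployment-specific.
OBJECTS = ["billing", "event", "load_curve"]
TIMINGS = ["now", "next_slot"]
BATCH_SIZES = [8, 32]

# Each (object, timing, batch size) combination receives one discrete number,
# namely its position in this table.
ACTION_TABLE = list(product(OBJECTS, TIMINGS, BATCH_SIZES))


def encode_action(obj: str, timing: str, batch_size: int) -> int:
    """Return the discrete number of a value combination."""
    return ACTION_TABLE.index((obj, timing, batch_size))


def decode_action(index: int) -> dict:
    """Return the value combination behind a discrete action number."""
    obj, timing, batch_size = ACTION_TABLE[index]
    return {"object": obj, "timing": timing, "batch_size": batch_size}
```

With these placeholder spaces there are 3 × 2 × 2 = 12 candidate scheduling actions, so the high-level decision layer would output 12 action values.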

[0096] It should be noted that the hierarchical constraint reinforcement learning model is trained offline. During training, training samples are constructed based on the transmission state feature vectors, scheduling policies, parameter policies, and corresponding transmission feedback records formed by historical transmissions. The high-level decision layer uses a value function-based reinforcement learning method to learn the mapping relationship between the transmission state feature vectors and scheduling policies. The low-level execution layer uses a policy network-based reinforcement learning method under the scheduling constraints of the upper layer to learn the mapping relationship between the transmission state feature vectors and parameter policies. During training, the constraint decision layer filters or corrects policies that do not meet the upper-level scheduling constraints, thereby completing the training of the hierarchical constraint reinforcement learning model.

[0097] It should also be noted that the expression for calculating the action value corresponding to the candidate scheduling class strategy is:

[0098] $Q(s, a_i) = w_i^{\top}\,\sigma(W s + b) + c_i$;

[0099] where: $Q(s, a_i)$ denotes the action value of executing the $i$-th candidate scheduling action when the transmission state feature vector is $s$, and characterizes the relative merit of that candidate scheduling strategy in the current transmission state; $w_i^{\top}$ denotes the transpose of the weight vector corresponding to the $i$-th candidate scheduling action, used to perform a weighted summation of the nonlinearly transformed feature representation and thereby compute the value contribution of the $i$-th candidate scheduling action; $\sigma(\cdot)$ denotes a nonlinear activation function that maps the linear combination result nonlinearly, enhancing the expressive power of the mapping between the transmission state feature vector and the candidate scheduling action values; $W$ denotes the weight matrix that linearly transforms the transmission state feature vector, obtained through iterative optimization on historical transmission state feature vectors and transmission feedback data during the offline training phase, and used to extract combined feature information relevant to scheduling decisions; $s$ denotes the transmission state feature vector, obtained by concatenating and preprocessing the queue state feature vector, network feature vector, resource feature vector, and time feature vector, and used to characterize the current transmission operating state; $b$ denotes the bias vector corresponding to the weight matrix $W$, used to apply a numerical translation to the linear transformation result; $c_i$ denotes the bias term corresponding to the $i$-th candidate scheduling action, used to correct the action value output of that action as a whole; $i$ is the index of the candidate scheduling action.

[0100] The value output set of all candidate scheduling actions is calculated, and the expression is as follows:

[0101] $\mathcal{Q}(s) = \{\,Q(s, a_1),\ Q(s, a_2),\ \ldots,\ Q(s, a_N)\,\}$;

[0102] where: $\mathcal{Q}(s)$ denotes the action value output set computed over all candidate scheduling actions when the transmission state feature vector is $s$; $Q(s, a_1)$ denotes the action value of executing the first candidate scheduling action $a_1$ under that condition; $a_1$ denotes the first candidate scheduling action, corresponding to one combination of scheduling strategy values; $Q(s, a_2)$ denotes the action value of executing the second candidate scheduling action $a_2$; $a_2$ denotes the second candidate scheduling action; $Q(s, a_N)$ denotes the action value of executing the $N$-th candidate scheduling action $a_N$; and $a_N$ is the $N$-th (last) candidate scheduling action.
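As a hedged illustration of the action value computation, the sketch below evaluates a single hidden layer followed by one linear output per candidate action, i.e. Q(s, a_i) = w_i^T σ(W s + b) + c_i. The symbol names, the use of `tanh` as the activation, and the greedy `best_action` helper are assumptions; the text only specifies a nonlinear activation and does not state how the candidate set is derived from the action values.

```python
import math
from typing import List, Sequence


def action_values(
    s: Sequence[float],
    W: Sequence[Sequence[float]],
    b: Sequence[float],
    w_out: Sequence[Sequence[float]],
    c: Sequence[float],
) -> List[float]:
    """Return [Q(s, a_1), ..., Q(s, a_N)] for one transmission state s."""
    # Hidden representation h = sigma(W s + b), with tanh as the assumed sigma.
    h = [
        math.tanh(sum(w_jk * s_k for w_jk, s_k in zip(row, s)) + b_j)
        for row, b_j in zip(W, b)
    ]
    # One value per candidate action: Q(s, a_i) = w_i^T h + c_i.
    return [
        sum(w * h_j for w, h_j in zip(w_i, h)) + c_i
        for w_i, c_i in zip(w_out, c)
    ]


def best_action(q_values: Sequence[float]) -> int:
    """Index of the highest-valued action (greedy choice, an assumption)."""
    return max(range(len(q_values)), key=q_values.__getitem__)
```

In this sketch the returned list plays the role of the action value output set; a deployment could equally keep the top few actions as the candidate scheduling action set rather than a single greedy choice.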

[0103] S3.2. Based on the candidate scheduling strategies, select the scheduling object, the sending time, and the sending batch size one by one to form a scheduling strategy.

[0104] Furthermore, the terminal receives candidate scheduling policies and extracts the candidate sets corresponding to the scheduling object, the candidate sets corresponding to the sending time, and the candidate sets corresponding to the sending batch size from the candidate scheduling policies. It selects a unique scheduling object from the candidate set corresponding to the scheduling object, a unique sending time from the candidate set corresponding to the sending time, and a unique sending batch size from the candidate set corresponding to the sending batch size. It then combines the unique scheduling object, the unique sending time, and the unique sending batch size in the order of the scheduling policy fields and outputs them to form the scheduling policy.

[0105] S3.3. Compare the scheduling behavior corresponding to the scheduling strategy with the running state information in the transmission state feature vector to determine the scheduling feasibility boundary and form the upper-level scheduling constraint.

[0106] Furthermore, the scheduling object, sending timing, and sending batch size are extracted from the scheduling strategy to generate corresponding scheduling behavior description information. Feature components corresponding to the communication network state, terminal resource state, and queue state are extracted from the transmission state feature vector as running state information. The scheduling behavior description information and running state information are compared item by item to obtain the feasible intervals for sending timing, sending batch size, and scheduling object, which are then combined to form upper-level scheduling constraints.
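One way to picture how running state information turns into upper-level scheduling constraints is the sketch below. Every field name (`queued_by_object`, `link_budget_items`, `resource_budget_items`, `earliest_slot`, `latest_slot`) and every derivation rule is a hypothetical stand-in; the source states only that feasible intervals for sending timing and batch size and a feasible set of scheduling objects are obtained by item-by-item comparison.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass
class RunState:
    """Hypothetical running state extracted from the feature vector."""
    queued_by_object: Dict[str, int]  # queued items per scheduling object
    link_budget_items: int            # items the network can carry now
    resource_budget_items: int        # items the terminal resources allow
    earliest_slot: int                # first usable sending slot
    latest_slot: int                  # last slot that still meets timeliness


@dataclass
class UpperConstraints:
    objects: FrozenSet[str]   # feasible set of scheduling objects
    timing: Tuple[int, int]   # feasible interval of sending timing
    batch: Tuple[int, int]    # feasible interval of sending batch size


def build_constraints(state: RunState) -> UpperConstraints:
    # Only objects that actually have queued data can be scheduled.
    objects = frozenset(k for k, n in state.queued_by_object.items() if n > 0)
    # The batch size is capped by queue content, network, and terminal resources.
    max_batch = min(
        sum(state.queued_by_object.values()),
        state.link_budget_items,
        state.resource_budget_items,
    )
    min_batch = 1 if max_batch > 0 else 0
    return UpperConstraints(
        objects, (state.earliest_slot, state.latest_slot), (min_batch, max_batch)
    )
```

The resulting object groups the three feasible ranges so that later checks can consult them together.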

[0107] S3.4 Under the upper-level scheduling constraints, the transmission state feature vector is submitted as input to the lower-level execution layer, and forward inference processing is performed according to the parameter decision mapping relationship to output the parameter class strategy.

[0108] Furthermore, within the constraints of the upper-layer scheduling, the terminal receives the transmission state feature vector and inputs it to the lower-layer execution layer. The lower-layer execution layer performs forward inference processing on the transmission state feature vector based on the parameter decision mapping relationship (obtained through joint training and learning of historical transmission state feature vectors, corresponding parameter class strategies, and actual transmission feedback data under the constraints of the upper-layer scheduling during the offline training phase). The lower-layer execution layer generates parameter output results by performing linear combination operations and nonlinear activation operations on the transmission state feature vector. The parameter output results are parsed into packet segmentation method, transmission rhythm, and transmission control parameters. The packet segmentation method, transmission rhythm, and transmission control parameters are then checked for consistency with the feasible set of scheduling objects, feasible interval of transmission timing, and feasible interval of transmission batch size in the upper-layer scheduling constraints to obtain the parameter class strategy that satisfies the upper-layer scheduling constraints.

[0109] As shown in Figure 5, the diagram illustrates how the network packet loss rate weight and the transmission interval weight of the baseline strategy and of the strategy of this invention change over time under varying communication network conditions. The upper overview plot shows the response trends of the two strategies to changes in network conditions over the entire operating cycle, and marks the time intervals with more pronounced network fluctuations with red dashed rectangles. The lower enlarged partial plot compares the changes in the transmission interval and packet loss rate weights within such an interval, marking the key peak positions and the moments when the difference between the two curves is greatest. Figure 5 thus illustrates how the strategies differ in adjusting the transmission interval when the network state fluctuates.

[0110] S3.5 Input the scheduling policy and the corresponding upper-level scheduling constraints into the constraint checker. The constraint checker performs consistency verification on the scheduling policy. When the upper-level scheduling constraints are violated, the policy is corrected in a fixed order and a feasible scheduling policy is output.

[0111] Furthermore, the scheduling policy and upper-level scheduling constraints are input into the constraint checker. The constraint checker performs consistency checks on the scheduling object, sending timing, and sending batch size against the feasible set of scheduling objects, feasible intervals of sending timing, and feasible intervals of sending batch size in the upper-level scheduling constraints. When the scheduling object, sending timing, or sending batch size does not meet the upper-level scheduling constraints, the constraint checker performs policy corrections on the sending batch size, sending timing, and scheduling object in a fixed order to ensure that the scheduling object, sending timing, and sending batch size fall within the corresponding feasible range, and outputs a feasible scheduling policy.
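The fixed-order correction can be sketched as below, following the order named in the text (sending batch size, then sending timing, then scheduling object). Clamping to the nearest bound and falling back to the first entry of a priority-ordered object list are assumptions; the source does not specify the correction rule itself.

```python
from typing import Dict, Sequence, Tuple


def correct_schedule(
    policy: Dict[str, object],
    feasible_objects: Sequence[str],
    timing_range: Tuple[int, int],
    batch_range: Tuple[int, int],
) -> Dict[str, object]:
    """Return a feasible copy of `policy`; an already feasible policy is unchanged.

    `feasible_objects` is assumed to be non-empty and ordered by priority.
    """
    fixed = dict(policy)
    # 1) Sending batch size: clamp into its feasible interval.
    low, high = batch_range
    fixed["batch_size"] = min(max(fixed["batch_size"], low), high)
    # 2) Sending timing: clamp into its feasible interval.
    low, high = timing_range
    fixed["timing"] = min(max(fixed["timing"], low), high)
    # 3) Scheduling object: fall back to the highest-priority feasible object.
    if fixed["object"] not in feasible_objects:
        fixed["object"] = feasible_objects[0]
    return fixed
```

Because each step is a pure function of its inputs, the same infeasible policy always yields the same corrected policy, which is the sense in which the correction is deterministic.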

[0112] S3.6 Submit the parameter-type strategy to the constraint checker. The constraint checker performs a feasibility check on the parameter-type strategy based on the upper-level scheduling constraints corresponding to the feasible scheduling strategy. If the upper-level scheduling constraints are not met, a deterministic correction is performed, and a feasible transmission strategy is output.

[0113] Furthermore, the parameter-based strategy is input into the constraint checker. The constraint checker extracts the corresponding upper-level scheduling constraints based on the feasible scheduling strategy, and performs feasibility checks on the packet splitting method, transmission rhythm, and transmission control parameters against the feasible set of scheduling objects, the feasible interval of transmission timing, and the feasible interval of transmission batch size, respectively. When the packet splitting method, transmission rhythm, or transmission control parameters do not meet the upper-level scheduling constraints, the constraint checker performs deterministic corrections on the packet splitting method, transmission rhythm, and transmission control parameters to make them meet the upper-level scheduling constraints, and outputs a feasible transmission strategy.

[0114] S3.7. Perform unified vectorization encoding on feasible scheduling strategies and feasible transmission strategies and concatenate them sequentially to output a feasible transmission strategy vector.

[0115] Furthermore, the scheduling object, sending timing, and sending batch size are read from the feasible scheduling strategy, and the packet splitting method, sending rhythm, and transmission control parameters are read from the feasible transmission strategy. Vectorization encoding is performed on the scheduling object, sending timing, sending batch size, packet splitting method, sending rhythm, and transmission control parameters. The vectorized scheduling strategy component and the vectorized transmission strategy component are sequentially concatenated to form a feasible transmission strategy vector.
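A minimal sketch of the vectorization and concatenation step follows. The one-hot encoding for categorical fields, the division by a maximum for numeric fields, and all key names in `schedule`, `transmission`, and `spec` are illustrative assumptions; the source does not prescribe a particular encoding.

```python
from typing import Dict, List, Sequence


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode a categorical value as a one-hot list over `vocabulary`."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def encode_policy_vector(schedule: Dict, transmission: Dict, spec: Dict) -> List[float]:
    """Concatenate the encoded scheduling part and the encoded transmission part."""
    # Scheduling component: object (categorical), timing and batch size (scaled).
    sched_part = one_hot(schedule["object"], spec["objects"]) + [
        schedule["timing"] / spec["max_slot"],
        schedule["batch_size"] / spec["max_batch"],
    ]
    # Transmission component: split method (categorical), rhythm and retries (scaled).
    trans_part = one_hot(transmission["split"], spec["splits"]) + [
        transmission["interval_s"] / spec["max_interval_s"],
        transmission["retries"] / spec["max_retries"],
    ]
    # Sequential concatenation: scheduling fields first, transmission fields second.
    return sched_part + trans_part
```

Keeping the field order fixed means a consumer can slice the vector back into its scheduling and transmission components by position.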

[0116] S4. Select data to be sent from the marked queue of data to be sent according to the feasible scheduling strategy, and perform data processing and packetization on the data to be sent according to the feasible transmission strategy, and obtain transmission feedback records.

[0117] S4.1 Construct an upper-level scheduling constraint view based on feasible scheduling strategies, and select data to be sent from the marked queues within the scope of the upper-level scheduling constraint view to form a set of data to be sent.

[0118] Furthermore, the terminal reads the scheduling object, sending timing, and sending batch size from the feasible scheduling strategy, and converts the scheduling object, sending timing, and sending batch size into filtering conditions for the marked queue to be transmitted, including business labeling information matching conditions, timeliness constraints corresponding to the enqueue time and business timeliness identifier, and data item limit conditions corresponding to the sending batch size.

[0119] The filtering conditions are uniformly expressed, and the metering service data that meets all the filtering conditions are determined as the selectable range of the upper-level scheduling constraint view, thus forming the upper-level scheduling constraint view for the marked queues to be transmitted. Within the scope of the upper-level scheduling constraint view, the marked queues to be transmitted are traversed, and the metering service data that meets the upper-level scheduling constraint view are selected sequentially until the upper limit of the number of data items corresponding to the sending batch size is reached or there is no metering service data in the queue that meets the conditions. The selected metering service data are summarized and output according to a unique data identifier to form a set of data to be sent.
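The selection in S4.1 can be sketched as a single pass over the labeled queue. The `QueueItem` fields and the interpretation of the timeliness constraint (an item qualifies while its age does not exceed a per-item limit `max_age_s`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class QueueItem:
    data_id: str         # unique data identifier
    label: str           # service labeling information
    enqueue_time: float  # seconds, same clock as `now`
    max_age_s: float     # hypothetical timeliness limit for this item


def select_for_sending(
    queue: Sequence[QueueItem], target_label: str, now: float, batch_size: int
) -> List[str]:
    """Return the identifiers of items inside the scheduling constraint view."""
    selected: List[str] = []
    for item in queue:
        # Stop once the item limit given by the sending batch size is reached.
        if len(selected) >= batch_size:
            break
        # Label matching condition derived from the scheduling object.
        if item.label != target_label:
            continue
        # Timeliness condition derived from enqueue time and timeliness limit.
        if now - item.enqueue_time > item.max_age_s:
            continue
        selected.append(item.data_id)
    return selected
```

The traversal ends either when the batch limit is reached or when the queue holds no further matching item, mirroring the two stopping conditions in the text.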

[0120] S4.2. Based on the feasible transmission strategy, perform data organization and packet processing on the data set to be sent, and send the corresponding data packets; collect the data packet sending results and establish an association record with the corresponding data to be sent, and generate a transmission feedback record.

[0121] Furthermore, the terminal reads the packetization method, transmission rhythm, and transmission control parameters from the feasible transmission strategy, and organizes the metering service data in the data set to be transmitted according to unique data identifiers. Data organization includes reordering and grouping the metering service data to meet the packetization method requirements. The terminal encapsulates the data set to be transmitted into one or more data packets according to the packetization method, and writes a unique data identifier list and data packet sequence number information for backtracking into each data packet. The terminal sends each data packet according to the transmission rhythm and transmission control parameters, and collects the transmission time, transmission success or failure flag, acknowledgment delay, retransmission count, and timeout event record for each data packet during the transmission process. The terminal establishes a one-to-one association record between the transmission result of each data packet and the unique data identifier list in the data packet, and summarizes and outputs the association records to generate a transmission feedback record.
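The packetization and feedback collection in S4.2 might look like the sketch below. The names `packetize` and `send_all`, the fixed items-per-packet split, the retry loop, and the caller-supplied `send_fn` are hypothetical; real transmission, acknowledgment delay measurement, and timeout handling depend on the terminal's communication stack and are not shown.

```python
import time
from typing import Callable, Dict, List, Sequence


def packetize(data_ids: Sequence[str], items_per_packet: int) -> List[Dict]:
    """Split the data set into packets carrying a sequence number and an id list."""
    return [
        {"seq": seq, "data_ids": list(data_ids[start:start + items_per_packet])}
        for seq, start in enumerate(range(0, len(data_ids), items_per_packet))
    ]


def send_all(
    packets: Sequence[Dict],
    send_fn: Callable[[Dict], bool],
    max_retries: int,
    interval_s: float,
) -> List[Dict]:
    """Send each packet with retries and return one feedback record per packet."""
    records = []
    for packet in packets:
        attempts, success = 0, False
        while attempts <= max_retries and not success:
            success = send_fn(packet)  # assumed to return True on acknowledgment
            attempts += 1
        records.append({
            "seq": packet["seq"],
            "data_ids": packet["data_ids"],  # ties the result back to the data
            "sent_at": time.time(),
            "success": success,
            "retransmissions": attempts - 1,
        })
        if interval_s > 0:
            time.sleep(interval_s)  # transmission rhythm between packets
    return records
```

Each record carries both the packet sequence number and the identifier list, so a result can be traced back to the individual metering service data items.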

[0122] As shown in Figure 6, a single overview curve illustrates the overall trend of the indicator over time, reflecting the system's stability at different operational stages. Continuous recording across the entire time domain shows how the recognition accuracy fluctuates within a certain range during operation, providing a reference for subsequent analysis of the system's overall operating status and of strategy execution effectiveness.

[0123] In summary, this invention achieves intelligent scheduling and stable execution of data transmission from the electricity meter terminal in complex operating scenarios by integrating the data status to be transmitted, communication network status, terminal resource status, and time context information. It also generates scheduling strategies and transmission parameter strategies by combining a hierarchical constraint reinforcement learning mechanism and performs feasibility constraint verification. This improves the scenario adaptability and executability of transmission decisions.

[0124] It should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, and all such modifications or substitutions should be covered within the scope of the claims of the present invention.

Claims

1. A terminal data transmission method based on AI optimization, characterized in that it comprises: the terminal collects the metering service data to be transmitted, writes it into the queue to be transmitted, and adds service labeling information to form a labeled queue to be transmitted; at the same time, the terminal obtains the communication network status based on historical transmission records and extracts the terminal resource status from the terminal operating status information; the labeled queue to be transmitted, the communication network status, the terminal resource status, and the time context information are combined into a preliminary transmission state feature vector, and a transmission state feature vector is generated through preprocessing; the transmission state feature vector is input into the hierarchical constraint reinforcement learning model, a scheduling-type strategy is output through the high-level decision layer, a parameter-type strategy is output through the low-level execution layer under the upper-level scheduling constraints, and a constraint checker determines the feasibility of both types of strategy and outputs a feasible transmission strategy vector, the specific steps being as follows: the transmission state feature vector is loaded into the hierarchical constraint reinforcement learning model, the high-level decision layer is called to perform forward inference according to the decision mapping relationship, and the candidate scheduling-type strategy is output; based on the candidate scheduling-type strategy, the scheduling object, the sending timing, and the sending batch size are selected one by one to form the scheduling-type strategy; the scheduling behavior corresponding to the scheduling-type strategy is compared with the running state information in the transmission state feature vector to determine the scheduling feasibility boundary and form the upper-level scheduling constraints; under the upper-level scheduling constraints, the transmission state feature vector is submitted as input to the low-level execution layer, and forward inference is performed according to the parameter decision mapping relationship to output the parameter-type strategy; the scheduling-type strategy and its corresponding upper-level scheduling constraints are input into the constraint checker, which performs consistency verification on the scheduling-type strategy and, when the upper-level scheduling constraints are violated, performs strategy correction in a fixed order and outputs a feasible scheduling strategy; the parameter-type strategy is submitted to the constraint checker, which performs a feasibility check on it based on the upper-level scheduling constraints corresponding to the feasible scheduling strategy and, if those constraints are not met, performs a deterministic correction and outputs a feasible transmission strategy; the feasible scheduling strategy and the feasible transmission strategy are uniformly vectorized and sequentially concatenated to output the feasible transmission strategy vector; and the data to be sent is selected from the labeled queue to be transmitted according to the feasible scheduling strategy, data organization and packetization are performed on the data to be sent according to the feasible transmission strategy, and transmission feedback records are obtained.

2. The AI-optimized terminal data transmission method as described in claim 1, characterized in that the step in which the terminal collects the metering service data to be transmitted, writes it into the queue to be transmitted, and adds service labeling information to form a labeled queue to be transmitted specifically comprises: collecting metering service data and performing integrity verification on it to obtain the metering service data to be transmitted; writing the metering service data to be transmitted into the queue to be transmitted, assigning a unique data identifier to each item of metering service data, and recording its enqueue time; and adding service labeling information to each item of metering service data in the queue to be transmitted to generate the labeled queue to be transmitted.

3. The AI-optimized terminal data transmission method as described in claim 1, characterized in that the step of obtaining the communication network status based on historical transmission records while extracting the terminal resource status from the terminal operating status information specifically comprises: reading the historical transmission records within a preset time window, performing statistical analysis of communication transmission based on those records, and calculating the communication network status parameters to obtain the communication network status; and extracting the remaining battery power, processor utilization, and available storage space from the terminal operating status information to output the terminal resource status.

4. The AI-optimized terminal data transmission method as described in claim 3, characterized in that the step of combining the labeled queue to be transmitted, the communication network status, the terminal resource status, and the time context information into a preliminary transmission state feature vector and generating the transmission state feature vector through preprocessing specifically comprises: extracting a queue state feature vector from the labeled queue to be transmitted according to priority and timeliness; vectorizing the communication network status parameters and the terminal resource status to generate a network feature vector and a resource feature vector; performing discrete encoding and periodic sine and cosine encoding on the time context information and mapping it to a time feature vector; concatenating the queue state feature vector, network feature vector, resource feature vector, and time feature vector to generate the preliminary transmission state feature vector; and performing missing value processing, outlier processing, and normalization on the preliminary transmission state feature vector to output the transmission state feature vector.

5. The AI-optimized terminal data transmission method as described in claim 4, characterized in that: The time context information is obtained by reading the terminal's current clock information to determine the current time, the time period, and its position in the business statistics cycle.

6. The AI-optimized terminal data transmission method as described in claim 1, characterized in that the hierarchical constraint reinforcement learning model is constructed through the following specific steps: based on the reinforcement learning decision-making framework, a hierarchical structure is constructed that consists of an input layer, a hierarchical policy decision layer, a constraint decision layer, and an output layer, the hierarchical policy decision layer including a high-level decision layer and a low-level execution layer; the input layer receives the transmission state feature vector and inputs it to the hierarchical policy decision layer, the high-level decision layer makes global scheduling decisions, and the low-level execution layer makes transmission parameter decisions; and the hierarchical constraint reinforcement learning model is formed by establishing the constraint decision layer so that it runs through both the high-level decision layer and the low-level execution layer and performs feasibility verification and deterministic correction of execution.

7. The AI-optimized terminal data transmission method as described in claim 1, characterized in that: the high-level decision layer takes the transmission state feature vector as input, is trained offline with a value function-based reinforcement learning method to learn the scheduling decision rules under different operating states, and runs on the terminal side in forward inference mode; and the low-level execution layer is trained with a policy network-based reinforcement learning method under the upper-level scheduling constraints, and makes detailed decisions on packet segmentation, transmission rhythm, and transmission control parameters.

8. The AI-optimized terminal data transmission method as described in claim 1, characterized in that the transmission feedback record is obtained through the following specific steps: based on the feasible scheduling strategy, an upper-level scheduling constraint view is constructed, and within the scope of that view the data to be sent is selected from the labeled queue to be transmitted to form a set of data to be sent; and based on the feasible transmission strategy, data organization and packetization are performed on the set of data to be sent and the corresponding data packets are sent, the sending results of the data packets are collected and associated with the corresponding data to be sent, and the transmission feedback record is generated.