A method, system, and medium for constructing an intelligent data element identification system
By constructing an intelligent data element identification system and combining static and dynamic feature generation with reinforcement learning models, the problem of insufficient flexibility in traditional identification methods is solved, and the self-adaptation and security management of the identification system are realized, making it applicable to fields such as smart cities and the industrial internet.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GOLDEN TIMES CULTURE COMM
- Filing Date
- 2025-12-05
- Publication Date
- 2026-06-23
Figure CN121684073B_ABST
Description
Technical Field
[0001] This application relates to the field of computer data processing technology, and in particular to a method, system and medium for constructing an intelligent data element identification system.
Background Technology
[0002] With the development of the information society and the deepening of digital transformation, the scale of data generated by various business systems is experiencing explosive growth, and its sources are becoming increasingly diversified, including but not limited to unstructured or multi-source heterogeneous data streams continuously generated by heterogeneous devices such as IoT sensors, mobile terminals, and industrial control systems. Faced with such a vast and complex data ecosystem, how to efficiently identify, classify, track, and securely manage massive amounts of data has become one of the key bottlenecks restricting many industries from achieving intelligent upgrades.
[0003] Currently, traditional data identification methods often rely on static rules or pre-defined label systems, lacking sufficient flexibility and adaptability to effectively cope with dynamically changing data environments. Especially in complex scenarios requiring real-time responses to data feature fluctuations, contextual semantic changes, and security level adjustments, identifiers generated based on fixed templates are prone to high conflict rates, poor traceability, and weak scalability, thus impacting the overall system's operational efficiency and service quality. Furthermore, with distributed architectures becoming the mainstream deployment model, existing solutions fail to fully consider the resource-constrained characteristics of edge computing nodes, making it difficult to deploy high-performance models and resulting in low overall system collaboration efficiency, failing to meet the needs of diverse future application scenarios.
Summary of the Invention
[0004] To address the aforementioned technical issues, this application provides a method, system, and medium for constructing an intelligent data element identification system.
[0005] Firstly, this application provides a method for constructing an intelligent data element identification system, employing the following technical solution:
[0006] A method for constructing an intelligent data element identification system, the method comprising:
[0007] Obtain the raw data stream and user-defined identification rules, and extract static and dynamic features from them;
[0008] Generate a basic identifier based on the static features, input the dynamic features into a pre-configured reinforcement learning agent model, and output dynamic adjustment parameters;
[0009] Combine the basic identifier and the dynamic adjustment parameters into a dynamic identifier, and generate a parameter adjustment record;
[0010] Obtain the parameter adjustment records and externally added data type samples, perform incremental training on the reinforcement learning agent model, and generate an optimized reinforcement learning agent model;
[0011] Compress the optimized reinforcement learning agent model to generate a lightweight inference model;
[0012] Construct a distributed node network based on the obtained infrastructure topology map, deploy the lightweight inference model to the edge nodes, and deploy the optimized reinforcement learning agent model to the central node;
[0013] Execute a resource scheduling algorithm based on the node load status to allocate computing tasks, and map the dynamic identifier to blockchain storage through a blockchain smart contract;
[0014] Collect system runtime performance metrics and user feedback data, perform performance comparison tests and generate hyperparameter adjustment instructions, and call verification tools to verify the uniqueness of the dynamic identifier;
[0015] Feed the hyperparameter adjustment instructions back to the incremental training step of the reinforcement learning agent model for optimization.
[0016] By adopting the above technical solution, a complete chain is demonstrated, from raw data collection to identifier generation, model optimization, system deployment, and closed-loop feedback, realizing the dynamic evolution and global trustworthy management of the identifier system. This application has strong adaptive adjustment capabilities, a rigorous security protection mechanism, and a wide range of applications, and can play an important role in future smart cities, industrial internet, digital twins, and other fields.
[0017] Optionally, the step of generating a basic identifier based on the static features, inputting the dynamic features into a pre-configured reinforcement learning agent model, and outputting dynamic adjustment parameters includes:
[0018] Obtain a set of structured feature vectors, including static and dynamic feature vectors;
[0019] A basic identifier is generated based on the static feature vector through a hash operation;
[0020] The dynamic feature vector is input into a pre-configured reinforcement learning agent model, an initial adjustment value is output through a policy network, the initial adjustment value is corrected based on the temporal-difference (TD) error, and the converged dynamic adjustment parameters are output.
[0021] By adopting the above technical solution, which follows a dual-track design that separates static and dynamic aspects, the core identity attributes are fixed by a hash algorithm while the externally responsive features are adjusted adaptively by reinforcement learning. The dynamic adjustment parameters finally output are the result of multiple nonlinear mappings and gradient backpropagation within the deep neural network, reflecting a highly abstract understanding of the future behavior direction.
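As a non-limiting illustration of the correction described in paragraph [0020], the following minimal sketch computes a one-step temporal-difference error and uses it to shift an initial adjustment value. The application does not specify the correction rule; the additive update, the step size `step`, the discount factor `gamma`, and all function names are assumptions made for illustration only.

```python
def td_error(reward, value_now, value_next, gamma=0.9):
    """One-step temporal-difference error: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_now


def correct_adjustment(initial, reward, value_now, value_next, step=0.1, gamma=0.9):
    """Shift the policy network's initial adjustment value by a scaled TD error.

    The additive form is an assumed correction rule, not one stated in the source.
    """
    return initial + step * td_error(reward, value_now, value_next, gamma)


# Hypothetical values: the outcome was better than the value estimate predicted,
# so the TD error is positive and the adjustment value is nudged upward.
corrected = correct_adjustment(initial=0.5, reward=1.0, value_now=2.0, value_next=2.0)
```

Repeating such a correction until the TD error becomes small is one possible reading of "converged" in paragraph [0020].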
[0022] Optionally, the steps of obtaining the parameter adjustment records and externally added data type samples, incrementally training the reinforcement learning agent model, and generating an optimized reinforcement learning agent model include:
[0023] Based on the parameter adjustment record, extract the dynamic adjustment parameter values recorded during the dynamic identifier generation process, and associate them with the static feature metadata and timestamp sequence of the corresponding data elements;
[0024] Obtain a sample set of newly added external data types, including the feature vectors of the newly added data types;
[0025] Combine the parameter adjustment record dataset and the newly added data type sample set into an incremental training dataset;
[0026] Input the incremental training dataset into the reinforcement learning agent model, freeze the weights of the model's feature extraction layer, and unfreeze only the fully connected layers of the policy network for backpropagation;
[0027] Calculate the parameter update direction using a policy gradient algorithm, and store historical adjustment records in an experience replay buffer;
[0028] Correct the reward function weights using the temporal-difference (TD) error, and apply gradient clipping to constrain the parameter update magnitude;
[0029] Generate and output the optimized reinforcement learning agent model.
[0030] By adopting the above technical solution, rapid adaptation to new data types and strategy optimization are achieved, effectively solving the problems of catastrophic forgetting, low sample efficiency, and rigid reward design faced by traditional reinforcement learning in continuous learning scenarios. This technical solution not only improves the adaptability and decision-making accuracy of the agent model in dynamic identifier generation tasks, but also enhances its maintainability and scalability in complex and ever-changing data environments, providing solid technical support for building intelligent systems with continuous evolution capabilities.
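The three mechanisms named in paragraphs [0026] to [0028] (frozen layers, a replay buffer for historical records, and a bound on the update magnitude) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: weights are held in lists rather than in a deep learning framework, the bound is implemented as clipping by L2 norm, and the layer names `feature_extractor` and `policy_fc` are hypothetical.

```python
import math
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of past adjustment records; oldest entries are evicted first."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def add(self, record):
        self.items.append(record)

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self.items), min(batch_size, len(self.items)))


def clip_by_norm(gradient, max_norm):
    """Rescale the gradient so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm <= max_norm:
        return list(gradient)
    return [g * max_norm / norm for g in gradient]


def update(params, gradients, trainable, lr=0.1, max_norm=1.0):
    """Apply a clipped gradient step to trainable layers only; frozen layers are copied."""
    updated = {}
    for name, weights in params.items():
        if name in trainable:
            clipped = clip_by_norm(gradients[name], max_norm)
            updated[name] = [w - lr * g for w, g in zip(weights, clipped)]
        else:
            updated[name] = list(weights)
    return updated


params = {"feature_extractor": [1.0, 2.0], "policy_fc": [0.0, 0.0]}
grads = {"feature_extractor": [9.0, 9.0], "policy_fc": [3.0, 4.0]}
new_params = update(params, grads, trainable={"policy_fc"})
```

In this example the gradient of `policy_fc` has norm 5 and is rescaled to norm 1 before the step, while `feature_extractor` is left unchanged despite its large gradient. The step shown descends a loss; a policy gradient method that ascends an objective would flip the sign.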
[0031] Optionally, the steps of constructing a distributed node network based on the obtained infrastructure topology map, deploying the lightweight inference model to edge nodes, and deploying the optimized reinforcement learning agent model to the central node include:
[0032] Obtain the infrastructure topology map and analyze the node attributes and network connection relationships in the map;
[0033] Calculate the processing capability score and network latency score of each node, and divide the nodes into edge nodes and central nodes according to the dynamic partitioning threshold to construct a distributed node network;
[0034] A lightweight inference model is obtained, and the lightweight inference model is divided into data blocks adapted to the storage capacity of the edge device. The data blocks are distributed to the target edge node through a secure transmission protocol, and model integrity verification and runtime environment configuration are performed on the target edge node.
[0035] Obtain the optimized reinforcement learning agent model, create a model sandbox execution environment at the central node, load the model parameters of the reinforcement learning agent model and initialize the policy network weights, and establish a bidirectional communication channel with the edge nodes.
[0036] By adopting the above technical solutions and utilizing a scientifically sound node partitioning strategy, effective integration of edge and center resources is achieved. The differentiated deployment of lightweight models and reinforcement learning agent models fully leverages their respective advantages, collectively forming a highly automated and secure new intelligent computing service system. This system not only effectively addresses the diverse business challenges of large-scale IoT application scenarios but can also be widely applied in smart cities, industrial internet, autonomous driving, and other fields, demonstrating broad development prospects and practical value.
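The block splitting and integrity verification of paragraph [0034] might look like the following minimal sketch. It assumes the lightweight model is available as serialized bytes and uses SHA-256 digests for verification; the secure transmission protocol itself is out of scope here, and the function names and chunk size are hypothetical.

```python
import hashlib


def split_model(blob, chunk_size):
    """Split serialized model bytes into chunks, each paired with its SHA-256 digest."""
    chunks = [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]
    return [(chunk, hashlib.sha256(chunk).hexdigest()) for chunk in chunks]


def reassemble(received, expected_digest):
    """Verify every chunk and the whole model; raise ValueError on any mismatch."""
    for chunk, digest in received:
        if hashlib.sha256(chunk).hexdigest() != digest:
            raise ValueError("chunk failed integrity check")
    blob = b"".join(chunk for chunk, _ in received)
    if hashlib.sha256(blob).hexdigest() != expected_digest:
        raise ValueError("model failed integrity check")
    return blob


model_bytes = bytes(range(256)) * 4  # stand-in for a serialized model (1024 bytes)
model_digest = hashlib.sha256(model_bytes).hexdigest()
parts = split_model(model_bytes, chunk_size=300)
restored = reassemble(parts, model_digest)
```

A digest sent alongside its chunk only detects accidental corruption; protection against deliberate tampering relies on the secure channel, or on obtaining `model_digest` through a separately trusted path.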
[0037] Optionally, the steps of allocating computing tasks based on a resource scheduling algorithm according to node load status and mapping dynamic identifiers to blockchain storage via blockchain smart contracts include:
[0038] Real-time collection of load status data of all nodes in the distributed node network to generate a node load status matrix;
[0039] Based on the node load state matrix, a resource scheduling algorithm is executed, a multi-constraint function is constructed with task completion time and energy consumption as optimization objectives, a heuristic search algorithm is used to solve the optimal solution for task allocation, and a computational task allocation strategy containing target node identifiers and task priority sequences is output.
[0040] The dynamic identifier generation task is distributed to the target node according to the computation task allocation strategy.
[0041] Obtain the dynamic identifier and associated metadata generated by the target node through the smart contract interface;
[0042] The blockchain storage contract is invoked to map the dynamic identifier and associated metadata to the distributed ledger, generating a storage proof containing the blockchain transaction hash.
[0043] By adopting the above technical solutions, a closed-loop resource scheduling and blockchain identification management system has been constructed. This system can not only significantly improve the utilization efficiency of computing resources in a large-scale distributed environment, but also ensure the authenticity and non-repudiation of each key business entity throughout its entire lifecycle. In this way, it can provide enterprise users with a new type of digital asset management solution that combines high performance and strong security.
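One simple heuristic consistent with paragraph [0039] is a greedy assignment that minimizes a weighted sum of completion time and energy consumption. The application does not name a particular heuristic search algorithm, so the greedy rule, the cost weights, and the node and task fields below are all assumptions for illustration.

```python
def schedule(tasks, nodes, time_weight=1.0, energy_weight=0.2):
    """Greedy heuristic: place each task, highest priority first, on the cheapest node."""
    # "load" is the time each node is already committed to existing work.
    busy_until = {name: spec["load"] for name, spec in nodes.items()}
    plan = []
    for task in sorted(tasks, key=lambda t: -t["priority"]):

        def cost(name):
            spec = nodes[name]
            finish = busy_until[name] + task["work"] / spec["speed"]
            energy = task["work"] * spec["joules_per_unit"]
            return time_weight * finish + energy_weight * energy

        target = min(nodes, key=cost)
        busy_until[target] += task["work"] / nodes[target]["speed"]
        plan.append((task["id"], target))
    return plan


nodes = {
    "edge-a": {"speed": 1.0, "load": 0.0, "joules_per_unit": 1.0},
    "edge-b": {"speed": 2.0, "load": 3.0, "joules_per_unit": 2.0},
}
tasks = [
    {"id": "t1", "work": 4.0, "priority": 1},
    {"id": "t2", "work": 2.0, "priority": 5},
]
plan = schedule(tasks, nodes)
```

The returned list pairs each task with a target node identifier in priority order, which corresponds to the allocation strategy output described in paragraph [0039]. A greedy pass does not guarantee a global optimum; it is shown only because it is short.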
[0044] Optionally, the step of feeding back the hyperparameter adjustment instructions to the incremental training steps of the reinforcement learning agent model for optimization includes:
[0045] The hyperparameter adjustment instructions are parsed to generate an incremental training configuration update file;
[0046] Load the reinforcement learning agent model and the current model parameters;
[0047] The training process is reconstructed according to the incremental training configuration update file, and incremental training operations are performed on the reinforcement learning agent model.
[0048] Output the optimized parameter set of the reinforcement learning agent model.
[0049] By adopting the above technical solutions, a high degree of automation management is achieved throughout the entire process from instruction issuance to model upgrade. A unique dynamic training process reconstruction mechanism significantly enhances the robustness and adaptability of the overall system. Especially when facing the challenges of constantly evolving business scenarios, it can quickly respond to external stimuli and continuously improve its decision-making level through precise and effective parameter correction methods, thereby achieving the goal of long-term stable and efficient operation.
[0050] Optionally, the following steps are included after generating the dynamic identifier:
[0051] Obtain the record set generated by the dynamic identifier and statistically analyze the identifier conflict frequency distribution;
[0052] When the conflict frequency of an identifier exceeds a preset frequency threshold, the identifier is marked as a high-frequency conflicting identifier;
[0053] Extract the static and dynamic feature vectors corresponding to high-frequency conflicting identifiers;
[0054] Construct a feature dimension expansion matrix and add timestamp entropy features to the static feature vector;
[0055] The expanded static feature vector is input into the reinforcement learning agent model, and the conflict resolution parameters are output.
[0056] Reconstruct the dynamic identifier based on the conflict resolution parameters and update the parameter adjustment record.
[0057] By adopting the above technical solutions, it is possible to autonomously perceive changes in the external environment and make reasonable responses without much human intervention, which significantly improves the accuracy, flexibility and expansion potential of the identification allocation process.
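The conflict statistics of paragraphs [0051] and [0052] and the timestamp entropy feature of paragraph [0054] can be illustrated as below. The application does not define how timestamp entropy is computed; this sketch assumes it is the Shannon entropy of inter-arrival gaps grouped into fixed-width buckets, and the function names and bucket width are hypothetical.

```python
import math
from collections import Counter


def high_frequency_conflicts(identifiers, threshold):
    """Return identifiers whose conflict count exceeds the threshold."""
    counts = Counter(identifiers)
    # An identifier issued n times has collided n - 1 times.
    return {ident for ident, n in counts.items() if n - 1 > threshold}


def timestamp_entropy(timestamps, bucket_seconds=1.0):
    """Shannon entropy, in bits, of inter-arrival gaps grouped into fixed-width buckets."""
    ordered = sorted(timestamps)
    gaps = [int((b - a) // bucket_seconds) for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return 0.0
    total = len(gaps)
    return -sum((n / total) * math.log2(n / total) for n in Counter(gaps).values())


issued = ["a1", "b2", "a1", "a1", "c3", "b2"]
conflicting = high_frequency_conflicts(issued, threshold=1)
entropy = timestamp_entropy([0.0, 1.0, 2.0, 4.0, 8.0])
```

Appending a value such as `entropy` to the static feature vector gives records that share static attributes but arrive with different timing patterns a way to be told apart.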
[0058] Optionally, after the steps of deploying the lightweight inference model to the edge nodes and deploying the optimized reinforcement learning agent model to the central node, the method further includes:
[0059] Local feature evolution data are collected during the operation of the edge node;
[0060] Encrypt the local feature evolution data to generate federated learning data packets;
[0061] The federated learning data packets are transmitted to the central node via a secure aggregation channel;
[0062] The central node decrypts the federated learning data packet and fuses data from multiple edge nodes to generate a global feature evolution map.
[0063] The reward function of the reinforcement learning agent model is updated based on the global feature evolution map.
[0064] By adopting the above technical solutions, the entire identification engine has the ability to self-reflect and iteratively improve, realizing a major transformation from static configuration to autonomous evolution, greatly enhancing the robust stability of the system when facing unknown threats and interference factors, and improving the system's adaptability and intelligence level.
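The fusion step of paragraph [0062] could, for example, be a sample-weighted average of the vectors reported by the edge nodes, in the style of federated averaging. This is an assumed fusion rule, not one stated in the application; encryption and decryption of the packets are omitted, and the data layout is hypothetical.

```python
def aggregate(updates):
    """Sample-weighted average of per-node feature vectors (FedAvg-style)."""
    total = sum(count for _, count in updates)
    width = len(updates[0][0])
    return [
        sum(vector[i] * count for vector, count in updates) / total
        for i in range(width)
    ]


# Each tuple: (local feature evolution vector, number of local samples behind it).
edge_updates = [
    ([1.0, 0.0], 10),
    ([3.0, 2.0], 30),
]
global_features = aggregate(edge_updates)
```

Weighting by sample count lets nodes that observed more data contribute proportionally more to the global result.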
[0065] Secondly, this application provides a system for constructing an intelligent data element identification system, which adopts the following technical solution:
[0066] A system for constructing an intelligent data element identification system, the system comprising:
[0067] The feature extraction module is used to acquire the raw data stream and user-defined identification rules, and extract static and dynamic features from them.
[0068] The dynamic adjustment module is used to generate basic identifiers based on the static features, input the dynamic features into a pre-configured reinforcement learning agent model, and output dynamic adjustment parameters.
[0069] A dynamic identifier generation module is used to combine the basic identifier and the dynamic adjustment parameter into a dynamic identifier and generate a parameter adjustment record;
[0070] The incremental training module is used to acquire the parameter adjustment records and externally added data type samples, perform incremental training on the reinforcement learning agent model, and generate an optimized reinforcement learning agent model.
[0071] The model compression module is used to compress the optimized reinforcement learning agent model to generate a lightweight inference model.
[0072] The model deployment module is used to construct a distributed node network based on the acquired infrastructure topology map, deploy the lightweight inference model to the edge nodes, and deploy the optimized reinforcement learning agent model to the central node.
[0073] The resource allocation module is used to allocate computing tasks by executing a resource scheduling algorithm based on the node load status, and to map the dynamic identifier to blockchain storage through a blockchain smart contract;
[0074] The performance testing module is used to collect system runtime performance indicators and user feedback data, perform performance comparison tests and generate hyperparameter adjustment instructions, and call the verification tool to verify the uniqueness of the dynamic identifier.
[0075] The feedback optimization module is used to feed back the hyperparameter adjustment instructions to the incremental training steps of the reinforcement learning agent model for optimization.
[0076] Thirdly, this application provides a computer-readable storage medium, which adopts the following technical solution:
[0077] A computer-readable storage medium storing a computer program that can be loaded by a processor to execute any of the methods of the first aspect.
[0078] In summary, this application achieves at least one of the following beneficial technical effects: By constructing an intelligent data element identification system, a complete chain is realized, from raw data collection to identifier generation, model optimization, system deployment, and closed-loop feedback. First, by combining static feature-generated basic identifiers with a dynamic feature-driven reinforcement learning model, dynamic adaptive adjustment of identifiers is achieved, improving the intelligence level of the identification system. Second, incremental training and model compression techniques ensure continuous model optimization while reducing deployment costs. Third, distributed node network deployment and blockchain storage mapping enable efficient processing and reliable management of the identification system. Finally, a complete feedback optimization mechanism is established, enabling the system to self-evolve and adapt to changing application scenarios, providing strong technical support for data governance in smart cities, industrial internet, and other fields.
Attached Figure Description
[0079] Figure 1 is a first flowchart of a method for constructing an intelligent data element identification system according to one embodiment of this application.
[0080] Figure 2 is a second flowchart of a method for constructing an intelligent data element identification system according to one embodiment of this application.
[0081] Figure 3 is a third flowchart of a method for constructing an intelligent data element identification system according to one embodiment of this application.
[0082] Figure 4 is a fourth flowchart of a method for constructing an intelligent data element identification system according to one embodiment of this application.
[0083] Figure 5 is a fifth flowchart of a method for constructing an intelligent data element identification system according to one embodiment of this application.
[0084] Figure 6 is a sixth flowchart of a method for constructing an intelligent data element identification system according to one embodiment of this application.
[0085] Figure 7 is a seventh flowchart of a method for constructing an intelligent data element identification system according to one embodiment of this application.
[0086] Figure 8 is an eighth flowchart of a method for constructing an intelligent data element identification system according to one embodiment of this application.
Detailed Implementation
[0087] To make the purpose, technical solution, and advantages of this application clearer, the present application is described in further detail below with reference to Figures 1-8 and the embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of the application.
[0088] This application discloses a method for constructing an intelligent data element identification system.
[0089] Referring to Figure 1, a method for constructing an intelligent data element identification system specifically includes:
[0090] Step S101: Obtain the raw data stream and user-defined identification rules, and extract static and dynamic features from them;
[0091] The raw data stream refers to unstructured or multi-source heterogeneous data continuously collected from different business scenarios or from terminal devices such as sensors. The user-defined identification rules are a set of constraints or semantic specifications set by the upper-layer application; they tell the subsequent identifier generation process which specific attributes (such as security level and access control) must be satisfied.
[0092] Building upon this foundation, the raw data stream is parsed and classified to extract two types of key features. "Static features" primarily reflect the basic attributes of the data, such as data type (image/text), source node ID, and priority tags, which are relatively stable and unchanging. "Dynamic features" focus on the behavioral patterns of data over time, such as update frequency, contextual dependencies, and access popularity. This dual-dimensional feature extraction mechanism enables subsequent identifiers not only to reflect the essential characteristics of the data but also to respond to changing environmental trends, enhancing the flexibility and intelligence of the identification system.
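By way of example only, the separation into static and dynamic features described above might be sketched as follows. The record schema, the field names, and the choice of update frequency and read count as dynamic features are assumptions for illustration; the application does not prescribe them.

```python
from statistics import mean

# Hypothetical field names; the application does not prescribe a record schema.
STATIC_FIELDS = ("data_type", "source_node_id", "priority")


def extract_features(records):
    """Split a batch of records for one data element into static and dynamic features."""
    first = records[0]
    static = {name: first[name] for name in STATIC_FIELDS}
    timestamps = sorted(r["timestamp"] for r in records)
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dynamic = {
        # Updates per second, derived from the mean gap between records.
        "update_frequency": 1.0 / mean(gaps) if gaps and mean(gaps) > 0 else 0.0,
        # Total reads in the batch, used as a proxy for access popularity.
        "access_count": sum(r.get("reads", 0) for r in records),
    }
    return static, dynamic


records = [
    {"data_type": "text", "source_node_id": "node-7", "priority": 2, "timestamp": 0.0, "reads": 3},
    {"data_type": "text", "source_node_id": "node-7", "priority": 2, "timestamp": 2.0, "reads": 1},
    {"data_type": "text", "source_node_id": "node-7", "priority": 2, "timestamp": 4.0, "reads": 4},
]
static_features, dynamic_features = extract_features(records)
```

Here the static part stays the same for every record of the element, while the dynamic part changes as new records arrive.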
[0093] Step S102: Generate basic identifiers based on static features, input dynamic features into a pre-configured reinforcement learning agent model, and output dynamic adjustment parameters;
[0094] The basic identifier is typically a fixed-length character sequence obtained by applying a cryptographic hash function to the static feature vector. This ensures that the same combination of static attributes always yields the same identifier, laying the foundation for initial uniqueness. The reinforcement learning agent model is a pre-trained policy network that perceives the factors at play in the current dynamic environment and decides accordingly, i.e., it outputs a set of "dynamic adjustment parameters." These parameters are essentially fine-tuning factors for the basic identifier, which can be understood as a weighted offset or perturbation coefficient. They enable the identifier to react sensitively to external changes while maintaining core stability.
[0095] It is worth noting that reinforcement learning here is more than a simple mapping transformation: by accumulating experience through long-term trial and error, the agent gradually improves its ability to select the best form of representation in complex situations.
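The hash-based generation of the basic identifier can be illustrated with the following minimal sketch. SHA-256, the JSON serialization, and the truncation to 16 hexadecimal characters are assumptions chosen for the example; the application only requires a cryptographic hash function and a fixed-length result.

```python
import hashlib
import json


def base_identifier(static_features, length=16):
    """Derive a fixed-length identifier from static features with SHA-256."""
    # Sorting keys makes the serialization, and therefore the hash, order-independent.
    canonical = json.dumps(static_features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


id_a = base_identifier({"data_type": "text", "source_node_id": "node-7", "priority": 2})
id_b = base_identifier({"priority": 2, "source_node_id": "node-7", "data_type": "text"})
```

Both calls return the same identifier because the static attributes are identical. Truncating the digest shortens the identifier at the cost of a higher collision probability, so the length would need to be chosen with the expected number of data elements in mind.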
[0096] Step S103: Combine the basic identifier and the dynamic adjustment parameter into a dynamic identifier, and generate a parameter adjustment record;
[0097] The combination operation includes: normalizing the dynamic adjustment parameters and concatenating the basic identifier and the normalized parameters into a string according to a predetermined format.
[0098] Specifically, the output from the reinforcement learning model undergoes necessary normalization, compressing floating-point numbers that might otherwise fall within different scale ranges into a uniform interval (e.g., [0,1]). This aims to eliminate imbalances caused by inconsistent units between components, ensuring no dominant shift occurs during subsequent concatenation. After normalization, the basic identifier and adjustment coefficients are concatenated according to a pre-defined format template to form a new, readable string-based identifier. This step facilitates manual identification and logging, as well as cross-platform transmission and database storage management.
[0099] It should be noted that the synthesis method typically employs string concatenation or other reversible transformation techniques. The aim is to introduce new variable dimensions without disrupting the original identifier structure, thereby ensuring that the final "dynamic identifier" retains the initial basic meaning while incorporating specific influencing factors of the current environment. Although the embodiments in this application employ a relatively intuitive connection method for combination, it can be flexibly replaced with other more complex function mapping relationships, such as XOR operations, nested encryption, and other advanced encapsulation modes, depending on the application scenario.
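The normalization and concatenation described above can be sketched as follows. The parameter bounds, the three-digit encoding, and the `<base>-<p1>.<p2>` template are hypothetical; the application leaves the format template to the implementer.

```python
def normalize(values, lower, upper):
    """Min-max scale each value into [0, 1], clamping anything outside the bounds."""
    span = upper - lower
    return [min(1.0, max(0.0, (v - lower) / span)) for v in values]


def dynamic_identifier(base_id, params, lower=-1.0, upper=1.0):
    """Concatenate the basic identifier with the normalized adjustment parameters."""
    scaled = normalize(params, lower, upper)
    # Hypothetical template: <base>-<p1>.<p2>..., each parameter as three digits.
    suffix = ".".join(f"{round(p * 999):03d}" for p in scaled)
    return f"{base_id}-{suffix}"


example = dynamic_identifier("3fa9c1d2", [-1.0, 0.5, 1.0])
# example == "3fa9c1d2-000.749.999"
```

Because a hexadecimal basic identifier never contains the separator, splitting the result on `-` recovers it unchanged, which keeps the combination reversible in the sense discussed above.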
[0100] At the same time, every parameter adjustment is recorded in detail, forming the "parameter adjustment record". This record is not only an important resource for later incremental learning, but also an indispensable history for tracing the root cause of problems and evaluating the effectiveness of strategies.
[0101] Step S104: Obtain parameter adjustment records and externally added data type samples, perform incremental training on the reinforcement learning agent model, and generate an optimized reinforcement learning agent model.
[0102] Incremental training refers to the process of introducing new samples to fine-tune the model without losing existing knowledge. Compared to the traditional method of retraining from scratch, this method greatly reduces resource consumption and is more in line with the evolutionary needs of real-world business scenarios.
[0103] Specifically, leveraging the concept of incremental learning, only the most recent one or a few "parameter adjustment records" are selected as part of the incremental sample set. Combined with newly emerging unknown data category instances (i.e., "externally added data type samples"), the existing reinforcement learning agent's weights are updated locally in a targeted manner. This approach significantly reduces training overhead while improving the model's generalization ability to novel data types. Furthermore, since each incremental training iteration involves recalibrating the reward function (considering factors such as the decrease in conflict rate and energy savings), the model's learning direction always remains highly consistent with actual needs, avoiding the trap of blind optimization.
[0104] Step S105: Compress the optimized reinforcement learning agent model to generate a lightweight inference model;
[0105] This step addresses the challenge of directly applying high-performance models to resource-constrained environments. Although the complete reinforcement learning agent, after multiple rounds of iterative optimization, possesses strong generalization capabilities and robustness, its large model size and complex computational structure make it unsuitable for deployment on small devices at the network edge.
[0106] To address this, advanced model compression techniques, such as knowledge distillation, are employed to transfer knowledge from a large teacher model to a small student model, resulting in a "lightweight inference model" that maintains high accuracy while adapting to low-power hardware platforms. This lightweight version is particularly suitable for applications requiring fast real-time response and high fault tolerance, further expanding the application boundaries of the identification system.
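For illustration, the core of knowledge distillation is a loss that pulls the student's output distribution toward the teacher's softened distribution. The sketch below assumes both models output logits over a discretized set of adjustment actions and uses the common temperature-scaled KL divergence; these choices and the names used are assumptions, not details taken from the application.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max logit for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


teacher = [2.0, 1.0, 0.1]
loss_same = distillation_loss(teacher, teacher)          # student matches teacher
loss_far = distillation_loss(teacher, [0.1, 1.0, 2.0])   # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimizing it during training transfers the teacher's behavior to the smaller model.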
[0107] Step S106: Construct a distributed node network based on the obtained infrastructure topology map, deploy the lightweight inference model to the edge nodes, and deploy the optimized reinforcement learning agent model to the central node.
[0108] The infrastructure topology diagram encompasses the spatial layout and communication links of nodes at various levels, including data centers, edge servers, and IoT gateways. Based on this topology diagram, the system automatically plans a reasonable distributed node network topology and allocates model resources appropriately according to the functional positioning of different nodes: edge nodes closer to the data source undertake the task of quickly generating identifiers, thus requiring only lightweight and efficient inference models; while operations requiring in-depth analysis and judgment based on comprehensive global information are handled by powerful computing nodes located in the central position, correspondingly configured with complete reinforcement learning agent models. In this way, the entire identifier system exhibits a clear hierarchical division of labor, effectively alleviating the bottleneck effect of centralized processing and improving overall response speed and service quality.
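The division into edge and central nodes can be illustrated with a simple scoring rule. In this minimal sketch the score combines compute capability with latency to the network core, and the dynamic threshold is taken to be the mean score of the current topology; the weights, the normalization constants, and the node fields are all hypothetical.

```python
def score(node, weight_compute=0.6, weight_latency=0.4):
    """Combine capability and latency into one score in [0, 1]; higher is better."""
    compute = min(node["gflops"] / 1000.0, 1.0)
    latency = 1.0 - min(node["latency_ms"] / 100.0, 1.0)
    return weight_compute * compute + weight_latency * latency


def partition(nodes):
    """Label nodes scoring at or above the mean score as central, the rest as edge."""
    scores = {n["id"]: score(n) for n in nodes}
    threshold = sum(scores.values()) / len(scores)  # recomputed whenever the topology changes
    central = [i for i, s in scores.items() if s >= threshold]
    edge = [i for i, s in scores.items() if s < threshold]
    return central, edge


nodes = [
    {"id": "dc-1", "gflops": 900.0, "latency_ms": 5.0},
    {"id": "gw-1", "gflops": 20.0, "latency_ms": 40.0},
    {"id": "gw-2", "gflops": 10.0, "latency_ms": 60.0},
]
central_nodes, edge_nodes = partition(nodes)
```

With these inputs the data center node is labeled central and the two gateways are labeled edge, so the complete agent model would go to the former and the lightweight inference model to the latter.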
[0109] Step S107: Based on the node load status, execute the resource scheduling algorithm to allocate computing tasks, and map the dynamic identifier to the blockchain storage through the blockchain smart contract;
[0110] In particular, considering that there may be significant performance differences between nodes in a distributed environment, it is necessary to adopt differentiated scheduling measures to reasonably allocate computing resources in practical work.
[0111] Specifically, the system first collects a series of status parameters for each machine, such as current CPU utilization, remaining memory capacity, and disk I/O speed. Then, based on this information, a multi-dimensional task allocation matrix is constructed. After comprehensively evaluating factors such as the urgency and priority of each task, the most reasonable assignment decision is made. Data objects that have completed identification and binding are packaged and sent to the smart contract module deployed on the blockchain. After the module parses and confirms their accuracy, they are formally written into the distributed ledger. This not only permanently stores relevant information for easy future retrieval but also enhances the authority and credibility of the data by leveraging the tamper-proof nature of the blockchain.
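The write to the distributed ledger can be pictured with an in-memory, hash-chained log. This is a stand-in for illustration only: it is not a blockchain client or a smart contract, and the class name, entry layout, and returned proof format are assumptions.

```python
import hashlib
import json


class Ledger:
    """In-memory, hash-chained log standing in for a blockchain storage contract."""

    def __init__(self):
        self.entries = []

    def store(self, identifier, metadata):
        """Append an entry linked to the previous one and return a storage proof."""
        previous = self.entries[-1]["tx_hash"] if self.entries else "0" * 64
        body = json.dumps({"id": identifier, "meta": metadata, "prev": previous}, sort_keys=True)
        tx_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
        self.entries.append({"body": body, "tx_hash": tx_hash})
        return {"identifier": identifier, "tx_hash": tx_hash}

    def verify(self):
        """Recompute every hash and link; any edited entry makes this return False."""
        previous = "0" * 64
        for entry in self.entries:
            if json.loads(entry["body"])["prev"] != previous:
                return False
            if hashlib.sha256(entry["body"].encode("utf-8")).hexdigest() != entry["tx_hash"]:
                return False
            previous = entry["tx_hash"]
        return True


ledger = Ledger()
proof_1 = ledger.store("3fa9c1d2-000.749.999", {"node": "edge-a"})
proof_2 = ledger.store("77b0e4aa-120.500.003", {"node": "edge-b"})
```

Each entry embeds the hash of its predecessor, so altering a stored identifier or its metadata invalidates the chain from that point on. A real deployment would obtain the same property from the consensus and storage layers of the chosen blockchain platform.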
[0112] Step S108: Collect system runtime performance indicators and user feedback data, perform performance comparison tests and generate hyperparameter adjustment instructions, and call the verification tool to verify the uniqueness of the dynamic identifier.
[0113] Performance metrics primarily include quantitative measurements such as identifier generation latency, number of conflicts, and throughput; while user feedback data may encompass subjective satisfaction ratings and misidentification complaints. By conducting comprehensive statistical analysis of these two types of data and comparing them horizontally with other benchmark solutions (control group), the overall performance of the current identifier system can be scientifically evaluated. If certain aspects are found to fall short of expectations, corresponding hyperparameter adjustment procedures will be initiated, such as adjusting control variables like the exploration rate ε and discount factor γ, to guide the reinforcement learning agent towards a better solution space.
[0114] At the same time, it is also necessary to regularly use professional formal verification tools to conduct rigorous formal logic deductions on each of the dynamic identifiers to be released, to ensure that there is no duplication between the existing and new parts, and to fundamentally eliminate the potential risk of identity confusion.
[0115] Step S109: Feed the hyperparameter adjustment instructions back to the incremental training steps of the reinforcement learning agent model for optimization.
[0116] The entire identifier management system is not a rigid set of unchanging rules, but a living entity with self-learning capabilities. Whenever new optimization suggestions are generated, they can be seamlessly integrated back into the previous incremental training module without manual intervention, continuing to drive the model towards greater accuracy and efficiency.
[0117] The above embodiments demonstrate a complete chain from raw data collection to identifier generation, model optimization, system deployment, and closed-loop feedback, realizing the dynamic evolution and global trust management of the identifier system. This application possesses strong adaptive adjustment capabilities, a robust security protection mechanism, and a wide range of applications, enabling it to play a significant role in future smart cities, industrial internet, digital twins, and other fields.
[0118] Referring to Figure 2, as one implementation of step S102, the steps of generating basic identifiers based on static features, inputting dynamic features into a pre-configured reinforcement learning agent model, and outputting dynamically adjusted parameters include:
[0119] Step S201: Obtain a structured feature vector set, which includes static feature vectors and dynamic feature vectors;
[0120] Static feature vectors typically reflect the inherent attributes of the data itself, such as data type, source path, security level, or business priority tags; while dynamic feature vectors reflect the state changes of data throughout its lifecycle, such as access frequency, update cycle, and contextual parameters. This distinction helps to differentiate the processing of information from different dimensions, thereby improving the accuracy and adaptability of subsequent identifier generation.
[0121] Step S202: Generate a basic identifier based on the static feature vector through a hash operation;
[0122] Since static features often represent immutable core metadata, mapping them with a deterministic, uniformly distributed hash function is a reasonable choice. A hash operation is a one-way transformation commonly used in cryptography. It converts inputs of arbitrary length into fixed-length outputs, so that the same input always produces the same digest while even small differences in the input lead to significantly different hash values. To enhance security and collision resistance, SHA-256 or another high-performance hash algorithm can be chosen as the underlying implementation. MD5 may also be used where only speed matters, although it is no longer considered collision-resistant. The hash can be combined with a salt injection mechanism to prevent rainbow table attacks and improve system robustness.
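As an illustration of step S202, the following is a minimal sketch of a salted hash over static features. The function name `basic_identifier`, the feature field names, the salt value, and the JSON canonicalization are assumptions made for the example and are not specified in this application.

```python
import hashlib
import json


def basic_identifier(static_features: dict, salt: bytes) -> str:
    """Derive a deterministic identifier from static features and a salt."""
    # Canonical serialization: sorted keys, so field order cannot change the digest.
    canonical = json.dumps(static_features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(salt + canonical.encode("utf-8")).hexdigest()


# Hypothetical static features and salt, for illustration only.
features = {"data_type": "sensor", "source_path": "/plant/a/line1", "security_level": 2}
salt = b"example-deployment-salt"

id_a = basic_identifier(features, salt)
# Same content in a different key order yields the same identifier.
id_b = basic_identifier(dict(reversed(list(features.items()))), salt)
# A one-field change yields a different identifier.
id_c = basic_identifier({**features, "security_level": 3}, salt)
```

Prefixing the salt is the simplest form of salt injection. A keyed construction such as HMAC (available in Python's `hmac` module) would be a reasonable alternative where the salt is treated as a secret.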
[0123] Step S203: Input the dynamic feature vector into the pre-configured reinforcement learning agent model, output the initial adjustment value through the policy network, correct the initial adjustment value based on the temporal difference error, and output the converged dynamic adjustment parameter.
[0124] The role of this reinforcement learning agent model is to predict the optimal action policy based on the current state features (i.e., dynamic feature vectors) to adjust the behavior of the identifier. The core idea of the reinforcement learning framework is to simulate the interaction process between the agent and its environment, and to learn, through continuous trial and error, a policy that maximizes the long-term cumulative reward.
[0125] In this embodiment, the construction of the reinforcement learning agent model includes: adopting a policy gradient framework, with the input layer dimension matching the length of the dynamic feature vector; the reward function is defined as the reciprocal of a weighted function of the identifier conflict rate and the system energy consumption.
[0126] Specifically, the policy gradient method is used as the core training mechanism. It belongs to the model-free reinforcement learning approach, allowing direct optimization of the policy function without relying on value estimation, thus making it more suitable for modeling complex scenes in continuous action spaces. The input layer design must meet the consistency requirement with the dynamic feature vector dimension to fully preserve multi-dimensional state information over time. Meanwhile, the design of the reward function is particularly crucial. It needs to consider not only immediate performance feedback, such as the increased cost of relabeling due to frequent conflicts, but also the impact of resource consumption indicators such as CPU utilization and communication overhead, ultimately forming a composite evaluation standard that balances efficiency and energy efficiency.
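The reward definition in this embodiment (the reciprocal of a weighted function of conflict rate and energy consumption) can be sketched as follows. The weights `w_conflict` and `w_energy`, the assumption that both inputs are normalized to [0, 1], and the small constant `eps` that avoids division by zero are illustrative choices, not values given in this application.

```python
def reward(conflict_rate: float, energy: float,
           w_conflict: float = 0.7, w_energy: float = 0.3,
           eps: float = 1e-6) -> float:
    """Reciprocal of a weighted cost; lower conflict and energy give higher reward."""
    # Assumption: both inputs are already normalized to the range [0, 1].
    weighted_cost = w_conflict * conflict_rate + w_energy * energy
    # eps keeps the reward finite when the weighted cost is zero.
    return 1.0 / (weighted_cost + eps)


low_conflict = reward(conflict_rate=0.1, energy=0.2)
high_conflict = reward(conflict_rate=0.5, energy=0.2)
```

With these inputs, `low_conflict` is larger than `high_conflict`, which is the ordering the reward is meant to produce.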
[0127] After the reinforcement learning agent model completes its initial inference, it outputs a set of dynamically adjusted parameters for subsequent integration. To stabilize the output and accelerate convergence, an additional temporal difference (TD) correction module is typically added to the end of the policy network. This module uses the TD error to measure the deviation between the actual reward and the expected reward, and incrementally corrects the initial output accordingly. This method effectively alleviates the high variance problem caused by traditional Monte Carlo sampling, improving the reliability and generalization ability of the overall policy update. As the number of iterations increases, the dynamically adjusted parameters gradually converge, eventually reaching a configuration that balances local optimality with the global perspective.
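The temporal difference correction can be sketched with the standard one-step TD error, delta = r + gamma * V(s') - V(s). How the error is applied to the policy output is not specified in this application; the additive rule in `corrected_adjustment` and its `step` size are hypothetical.

```python
def td_error(reward: float, value_s: float, value_next: float,
             gamma: float = 0.9) -> float:
    """One-step TD error: observed reward plus discounted next value, minus the estimate."""
    return reward + gamma * value_next - value_s


def corrected_adjustment(initial: float, delta: float, step: float = 0.1) -> float:
    """Hypothetical correction rule: nudge the initial output in proportion to the TD error."""
    return initial + step * delta


delta = td_error(reward=1.0, value_s=0.5, value_next=0.5, gamma=0.9)
adjusted = corrected_adjustment(initial=0.2, delta=delta)
```

A positive error means the outcome was better than predicted, so under this rule the adjustment value is increased; a negative error decreases it.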
[0128] In the above implementation, based on the dual-track design concept of separating static and dynamic elements, the core identity attributes are solidified by hashing algorithms and the external response features are adaptively adjusted by reinforcement learning. The final output dynamic adjustment parameters are the result of multiple nonlinear mappings and gradient backpropagation within the deep neural network, reflecting a highly abstract understanding of the future behavior direction.
[0129] Referring to Figure 3, as one implementation of step S104, the steps of obtaining parameter adjustment records and externally added data type samples, incrementally training the reinforcement learning agent model, and generating an optimized reinforcement learning agent model include:
[0130] Step S301: Extract the dynamic adjustment parameter values recorded during the dynamic identifier generation process based on the parameter adjustment record, and associate them with the static feature metadata and timestamp sequence of the corresponding data element;
[0131] Dynamic identifiers typically refer to identity or attribute label information that changes in real time within the operating environment, and have wide application value in scenarios such as network security, user behavior tracking, and IoT device management. These identifiers are not static but evolve continuously with changes in the environmental state, thus their generation involves a series of strategic parameter adjustments. These parameters may involve setting classification thresholds, changing weight coefficients, switching activation functions, etc., forming an ordered time-series dataset. This dataset not only reflects the model's past behavior patterns and response mechanisms but also provides valuable historical reference for subsequent incremental training.
[0132] For example, in a malicious traffic detection task, whenever a new attack variant emerges, the system may automatically adjust the relevant parameters of the anomaly scoring rules to adapt to the new threat form; these continuous adjustment records constitute the aforementioned "parameter adjustment sequence". By structurally extracting and organizing these historical trajectories, a basic corpus that can be used for supervised fine-tuning or online learning is formed.
[0133] Step S302: Obtain the external newly added data type sample set, which includes the feature vector of the newly added data type;
[0134] In this context, real-world application needs are constantly evolving, meaning that existing training data may have blind spots or be outdated. To enable the model to handle new categories of objects that have never been seen before (such as new product styles in image recognition or new dialect accents in speech recognition), it is necessary to supplement the model with relevant representative examples in a timely manner.
[0135] In this embodiment, the newly added data type can be understood as any information entity that exceeds the cognitive boundaries of the current model. After preprocessing, they are converted into a numerical representation in a unified format, namely, the so-called "feature vector". This conversion process may involve various methods such as natural language encoding (e.g., TF-IDF, BERT embedding), image pixel normalization, and sensor signal filtering, with the aim of ensuring that different modal inputs can participate in computation and comparison in a common space.
[0136] It is worth noting that although the newly added samples themselves may not carry explicit labeling results, their inherent potential distribution characteristics are sufficient to stimulate the reactivation and reconstruction of the model's internal representation capabilities, thereby promoting its development towards a wider range of applicability.
[0137] Step S303: Merge the parameter adjustment record dataset and the newly added data type sample set into an incremental training dataset;
[0138] This process is not simply equivalent to file-level splicing and stacking, but rather a task of content fusion and logical association at a higher dimension. On the one hand, it is necessary to fully explore the potential connections between the two types of heterogeneous sources; on the other hand, it is necessary to consider how to rationally arrange the roles of the two in the training process, such as having the former take the lead in providing prior guidance or the latter as the core driver to explore unknown areas. To this end, methods such as multi-view learning frameworks and memory-enhanced network architectures can be used to coordinate the interaction between the two. In addition, attention must be paid to the quality control of the synthesized data to prevent misleading feedback caused by too much redundancy or conflicting items. In short, only when these two types of materials are organically combined can they truly exert a synergistic effect, making the evolutionary path of the entire system more robust and reliable.
[0139] Step S304: Input the incremental training dataset into the reinforcement learning agent model, freeze the network weights of the model's feature extraction layer, and only unfreeze the fully connected layers of the policy network for backpropagation.
[0140] The lower-level feature extraction layers (such as convolutional layers, embedding layers, or self-attention modules) are primarily responsible for extracting general, low-level semantic features (such as character patterns, syntactic structures, and field positional relationships) from the original input. These features exhibit high sharing and stability across most data types. The higher-level policy network (typically an action selection head composed of fully connected layers) focuses on mapping abstract features to specific action spaces (such as parameter adjustment direction and identifier generation rule selection). Its parameters are more susceptible to changes in task distribution. Therefore, when facing new data types, keeping the feature extraction layers fixed prevents the loss of existing knowledge while allowing the policy layers to adjust freely to adapt to the new environment, thus achieving efficient and stable transfer learning. Furthermore, this partial unfreezing strategy significantly reduces the computational resources required for training and the variance of gradient updates, improving the convergence speed and robustness of the training process.
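The partial unfreezing in step S304 can be illustrated with a framework-free sketch in which only the parameter groups named in `trainable` receive an update. The group names, the plain gradient step, and the learning rate are assumptions; a deep learning framework would normally provide its own mechanism for freezing layers.

```python
def sgd_step(params: dict, grads: dict, trainable: set, lr: float = 0.01) -> dict:
    """Apply one gradient step, skipping every parameter group not in `trainable`."""
    updated = {}
    for name, weights in params.items():
        if name in trainable:
            updated[name] = [w - lr * g for w, g in zip(weights, grads[name])]
        else:
            updated[name] = list(weights)  # frozen: copied unchanged
    return updated


# Hypothetical two-group model: a frozen feature extractor and a trainable policy head.
params = {"feature_extractor": [0.5, -0.2], "policy_head": [0.1, 0.3]}
grads = {"feature_extractor": [1.0, 1.0], "policy_head": [1.0, -1.0]}
new_params = sgd_step(params, grads, trainable={"policy_head"}, lr=0.1)
```

Only `policy_head` changes, so whatever the feature extractor has already learned is left intact while the policy layers adapt.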
[0141] Step S305: Calculate the parameter update direction based on the policy gradient algorithm, and use an experience replay buffer to store historical adjustment records;
[0142] Unlike value function-based methods, policy gradient directly performs gradient ascent on the parameters of the policy function to maximize the expected value of long-term cumulative reward. In this embodiment, the agent model analyzes the subsequent reward changes caused by each state-action pair in the incremental training dataset, uses the likelihood ratio trick to estimate the gradient direction, and then guides the policy parameters to evolve in a better direction.
[0143] To improve sample utilization and training stability, the system introduces an experience replay buffer mechanism to store state transition triples (state, action, reward) during historical parameter adjustments. These historical records are randomly sampled and replayed during training, breaking the temporal correlation between data and mitigating training oscillations caused by non-stationary policy updates. More importantly, experience replay allows the model to repeatedly "review" past successful adjustment paths, thereby strengthening the memory and solidification of key decision patterns and improving the generalization ability of the policy.
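A minimal sketch of such an experience replay buffer is shown below. The class name `ReplayBuffer`, the fixed capacity with oldest-first eviction, and the uniform sampling are assumptions; the application only states that (state, action, reward) triples are stored and randomly replayed.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward) triples with uniform sampling."""

    def __init__(self, capacity: int, seed=None):
        # deque with maxlen evicts the oldest record once capacity is reached.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward) -> None:
        self._buffer.append((state, action, reward))

    def sample(self, batch_size: int) -> list:
        # Uniform sampling without replacement breaks the temporal ordering of records.
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)

    def __len__(self) -> int:
        return len(self._buffer)


buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    # Hypothetical records: state is a feature tuple, action an adjustment value.
    buffer.add(state=(step, step + 1), action=0.1 * step, reward=float(step))
batch = buffer.sample(2)
```

After five insertions into a buffer of capacity three, only the three most recent records remain available for sampling.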
[0144] Step S306: Correct the reward function weights using the temporal difference error and apply gradient clipping to constrain the parameter update magnitude;
[0145] Temporal difference error is a crucial indicator measuring the deviation between the current value estimate and the actual observed return, reflecting the agent's accuracy in predicting environmental feedback. During incremental training, if a particular class of newly added data samples consistently generates significant temporal difference errors, it indicates that the existing reward function's incentive mechanism for this type of data may be biased or insufficient. To address this, the system dynamically adjusts the weights of each dimension in the reward function based on the magnitude and sign of the temporal difference error. For example, it increases the positive reward for correct privacy compliance judgments or reduces the penalty for invalid format adjustments. This error feedback-based reward reshaping mechanism enables the model to autonomously identify which behaviors are more conducive to achieving long-term goals, even in the absence of explicit annotations, thereby achieving adaptive calibration of the reward function.
[0146] Meanwhile, to prevent excessively large gradients from causing drastic fluctuations or even divergence in model parameters, the system employs gradient clipping to impose amplitude limitations on the gradient tensors generated during backpropagation. Specifically, when the overall gradient norm exceeds a preset threshold, it is scaled proportionally to a safe range, ensuring that each parameter update occurs within a controllable interval. This not only enhances the numerical stability of the training process but also improves the model's robustness when facing noisy data or outlier samples.
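The norm-based limiting described above (commonly called clipping by global norm) can be sketched as follows. The flat list representation of the gradient and the threshold value are assumptions made for the example.

```python
import math


def clip_by_global_norm(grads: list, max_norm: float):
    """Scale the gradient so its L2 norm does not exceed max_norm.

    Returns the (possibly scaled) gradient and the original norm.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads), norm
    scale = max_norm / norm
    # Proportional scaling preserves the update direction.
    return [g * scale for g in grads], norm


clipped, original_norm = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
untouched, small_norm = clip_by_global_norm([0.3, 0.4], max_norm=1.0)
```

Because every component is multiplied by the same factor, the direction of the update is unchanged and only its magnitude is bounded.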
[0147] Step S307: Generate and output the optimized reinforcement learning agent model.
[0148] The above implementation achieves rapid adaptation to new data types and strategy optimization, effectively solving problems such as catastrophic forgetting, low sample efficiency, and rigid reward design faced by traditional reinforcement learning in continuous learning scenarios. This technical solution not only improves the adaptability and decision-making accuracy of the agent model in dynamic identifier generation tasks, but also enhances its maintainability and scalability in complex and ever-changing data environments, providing solid technical support for building intelligent systems with continuous evolution capabilities.
[0149] Referring to Figure 4, as one implementation of step S106, the steps of constructing a distributed node network based on the obtained infrastructure topology map, deploying a lightweight inference model to edge nodes, and deploying an optimized reinforcement learning agent model to the central node include:
[0150] Step S401: Obtain the infrastructure topology diagram and parse the node attributes and network connection relationships in the diagram;
[0151] An infrastructure topology map refers to a network structure view composed of physical or virtual devices, which includes information about various computing nodes (such as servers, edge gateways, and terminal devices) and the communication links between them. By parsing this topology map, basic attributes of each node can be extracted, such as CPU performance metrics, memory capacity, storage space, network bandwidth, and operating system type; at the same time, the connection relationships between nodes can also be identified, such as whether there are direct paths, latency, and whether they are within the same local area network.
[0152] Step S402: Calculate the processing capability score and network latency score of each node, and divide the nodes into edge nodes and center nodes according to the dynamic partitioning threshold to construct a distributed node network.
[0153] This categorization is not statically set but based on a dynamic evaluation mechanism. Specifically, it comprehensively considers each node's Processing Capability Score, which may involve multiple dimensions such as floating-point operation speed, number of concurrent threads, and cache hit rate; it also introduces a Network Latency Score, reflecting the speed of communication response time between the node and other critical nodes. These two scores are used together, and a dynamically adjusted categorization threshold determines whether a node belongs to the edge or center level.
[0154] In this embodiment, the dynamic threshold means that it can be automatically updated based on actual operating conditions. For example, when a large number of new devices are added to a certain area, causing an increase in overall load, some high-performance devices originally designated as edge nodes may be reassigned as central nodes to maintain consistent service quality. This flexible role-switching mechanism enhances the system's ability to cope with complex scenario changes, making the distributed architecture more resilient and adaptable.
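One way to picture the dynamic partitioning of step S402 is sketched below. The equal weighting of the two scores, the `center_fraction` parameter, and the rule that recomputes the threshold from the current score distribution are all assumptions; the application does not specify how the threshold is derived.

```python
def classify_nodes(nodes: dict, center_fraction: float = 0.25):
    """Split nodes into 'center' and 'edge' using a threshold recomputed from current scores.

    `nodes` maps a node name to (processing_score, latency_score), both in [0, 1],
    where a higher latency score means faster communication.
    """
    # Assumption: the two scores are combined with equal weights.
    scores = {name: 0.5 * p + 0.5 * l for name, (p, l) in nodes.items()}
    ranked = sorted(scores.values(), reverse=True)
    n_center = max(1, round(center_fraction * len(ranked)))
    threshold = ranked[n_center - 1]  # moves whenever the node population changes
    roles = {name: ("center" if s >= threshold else "edge") for name, s in scores.items()}
    return roles, threshold


# Hypothetical node inventory.
nodes = {"dc1": (0.9, 0.8), "edge-srv": (0.6, 0.6), "gw1": (0.3, 0.4), "gw2": (0.2, 0.5)}
roles, threshold = classify_nodes(nodes, center_fraction=0.25)
# Raising the fraction (for example under higher load) promotes the next-best node.
roles_loaded, _ = classify_nodes(nodes, center_fraction=0.5)
```

In this example `edge-srv` is an edge node under the first setting and becomes a center node under the second, which mirrors the role switching described above.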
[0155] Step S403: Obtain the lightweight inference model, divide the lightweight inference model into data blocks adapted to the storage capacity of the edge device, distribute the data blocks to the target edge node through a secure transmission protocol, and perform model integrity verification and runtime environment configuration on the target edge node.
[0156] Lightweight inference models refer to machine learning models that have been modified through compression, quantization, pruning, or other optimization techniques. Their main purpose is to reduce the demand for hardware resources, enabling them to run smoothly in constrained environments. While these models sacrifice some accuracy, they can significantly improve deployment efficiency while maintaining basic availability.
[0157] Because edge nodes often have limited storage space, the model file must first be segmented, breaking the complete model down into several data chunks adapted to the current node's storage capacity. These chunks are then sent one by one to the target node using an encrypted transmission protocol (such as TLS 1.3), where the original model structure is reconstructed locally. To ensure the model's authenticity and integrity, verification measures such as hash checks must be performed at the receiving end to prevent erroneous inferences due to tampering during transmission. Furthermore, a suitable runtime environment must be configured for the target node, including installing dependent libraries and setting access control rules, to ensure the model can start stably and provide services.
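The segmentation and integrity check in step S403 can be sketched as follows. The manifest layout, the function names, and the use of SHA-256 for both per-chunk and whole-file digests are assumptions; transport encryption (for example TLS) is outside the scope of the sketch.

```python
import hashlib


def split_model(blob: bytes, chunk_size: int):
    """Split a serialized model into chunks and build a manifest of digests."""
    chunks = [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]
    manifest = {
        "sha256": hashlib.sha256(blob).hexdigest(),
        "chunks": [hashlib.sha256(c).hexdigest() for c in chunks],
    }
    return chunks, manifest


def reassemble(chunks: list, manifest: dict) -> bytes:
    """Verify every chunk and the whole file before accepting the model."""
    if len(chunks) != len(manifest["chunks"]):
        raise ValueError("chunk count does not match manifest")
    for index, (chunk, digest) in enumerate(zip(chunks, manifest["chunks"])):
        if hashlib.sha256(chunk).hexdigest() != digest:
            raise ValueError(f"chunk {index} failed integrity check")
    blob = b"".join(chunks)
    if hashlib.sha256(blob).hexdigest() != manifest["sha256"]:
        raise ValueError("reassembled model failed integrity check")
    return blob


model_bytes = bytes(range(256)) * 4  # stand-in for a serialized lightweight model
chunks, manifest = split_model(model_bytes, chunk_size=300)
restored = reassemble(chunks, manifest)
```

Checking each chunk separately lets the receiver request only the damaged chunk again instead of the whole model. The manifest itself would need to reach the edge node through a trusted channel.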
[0158] Step S404: Obtain the optimized reinforcement learning agent model, create a model sandbox execution environment at the central node, load the model parameters of the reinforcement learning agent model and initialize the policy network weights, and establish a bidirectional communication channel with the edge nodes.
[0159] Reinforcement learning agent models generally have high computational overhead and parameter size, making them difficult to deploy directly on ordinary edge devices. Therefore, it is necessary to perform specialized optimizations, such as gradient clipping, experience replay pool management, and policy distillation, to improve training convergence speed and generalization ability. Once the optimized version is obtained, it should be centrally deployed on a central node.
[0160] Specifically, central nodes generally refer to data center or cloud server instances with powerful computing capabilities, good network access, and high stability. During deployment, an independent sandbox execution environment is created for each node to isolate potential security risks between different models and prevent malicious attackers from penetrating the entire cluster through a vulnerable module. Simultaneously, pre-trained model parameters need to be loaded, and policy network weights initialized, putting the model into a standby state to receive state input from the edge and make corresponding adjustment commands.
[0161] In the above implementation, a scientifically sound node partitioning strategy effectively integrates edge and central resources. By leveraging the differentiated deployment of lightweight models and reinforcement learning agent models, their respective advantages are fully utilized, collectively forming a highly automated and secure new intelligent computing service system. This system not only effectively addresses the diverse business challenges of large-scale IoT application scenarios but can also be widely applied in smart cities, industrial internet, autonomous driving, and other fields, demonstrating broad development prospects and practical value.
[0162] Referring to Figure 5, as one implementation of step S107, the steps of allocating computing tasks based on the node load status using a resource scheduling algorithm and mapping dynamic identifiers to blockchain storage via a blockchain smart contract include:
[0163] Step S501: Collect load status data of all nodes in the distributed node network in real time and generate a node load status matrix.
[0164] The system relies on monitoring agents or sensor components deployed on each participating node to continuously acquire key performance indicators (KPIs) reflecting the current operational status. These KPIs not only cover traditional hardware-level parameters, such as CPU utilization, memory usage, and network bandwidth, but also include higher-level service quality assessment dimensions, such as task queue backlog and communication latency between neighboring nodes. This multi-dimensional data acquisition mechanism ensures that subsequent resource scheduling decisions can comprehensively consider the actual service capabilities of each node and its relative position in the network topology.
[0165] Next, the original measurements are normalized and organized into a two-dimensional matrix in a unified format, namely the so-called "node load state matrix". Each row represents the set of all observed attributes of a specific node, and each column represents the distribution of a certain type of load feature among different nodes.
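The construction of the node load state matrix can be sketched as below. Min-max scaling per column is one possible normalization and is an assumption, as are the metric names and the alphabetical ordering of rows and columns.

```python
def build_load_matrix(raw: dict):
    """Return (node_names, metric_names, matrix) with each column min-max normalized.

    `raw` maps node name -> {metric name -> measurement}. Rows are nodes, columns are metrics.
    """
    names = sorted(raw)
    metrics = sorted(next(iter(raw.values())))
    columns = []
    for metric in metrics:
        values = [raw[name][metric] for name in names]
        low, high = min(values), max(values)
        span = high - low
        # A constant column carries no ranking information, so it maps to 0.0.
        columns.append([(v - low) / span if span else 0.0 for v in values])
    matrix = [list(row) for row in zip(*columns)]
    return names, metrics, matrix


# Hypothetical raw measurements from three nodes.
raw = {
    "n1": {"cpu": 20.0, "mem": 4.0, "queue": 2.0},
    "n2": {"cpu": 80.0, "mem": 8.0, "queue": 2.0},
    "n3": {"cpu": 50.0, "mem": 6.0, "queue": 2.0},
}
names, metrics, matrix = build_load_matrix(raw)
```

Here `matrix[i][j]` is the normalized value of `metrics[j]` on `names[i]`, so metrics with different units become directly comparable.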
[0166] Step S502: Execute a resource scheduling algorithm based on the node load state matrix, construct a multi-constraint function with task completion time and energy consumption as optimization objectives, use a heuristic search algorithm to solve for the optimal solution of task allocation, and output a computational task allocation strategy containing the target node identifier and task priority sequence.
[0167] The resource scheduling algorithm revolves around two core optimization objectives: minimizing the overall task completion time and reducing the total system energy consumption. Together these form a multi-objective optimization problem, which is merged through linear weighting into a single evaluation function min(α·T + β·E), where T is the overall task completion time, E is the total system energy consumption, and α and β are the importance coefficients of the corresponding terms.
[0168] To address potential surges in concurrent requests or node failures in complex environments, adaptive heuristic search strategies can be employed to solve the objective function. Typical methods include, but are not limited to, metaheuristic frameworks such as genetic algorithms, particle swarm optimization, or simulated annealing. These algorithms can quickly converge to near-optimal task allocation results without fully understanding the precise global mathematical expression, effectively avoiding the exponential computational overhead of traditional exhaustive search methods. The final output is a structured task allocation strategy table, which not only clarifies which nodes should undertake which types of tasks but also implicitly prioritizes each task, facilitating subsequent stages to proceed with the execution process according to predetermined rules.
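As a hedged illustration of solving min(α·T + β·E) with one of the metaheuristics named above, the sketch below applies simulated annealing to a toy task-to-node assignment. The cost model (T as the busiest node's total time, E as time multiplied by node power), the node attributes `speed` and `power`, and all numeric parameters are assumptions for the example.

```python
import math
import random


def cost(assign, tasks, nodes, alpha, beta):
    """alpha * T + beta * E, with T the busiest node's time and E the total energy."""
    busy = [0.0] * len(nodes)
    energy = 0.0
    for work, n in zip(tasks, assign):
        duration = work / nodes[n]["speed"]
        busy[n] += duration
        energy += duration * nodes[n]["power"]
    return alpha * max(busy) + beta * energy


def anneal(tasks, nodes, alpha=1.0, beta=0.1, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over assignments; returns (best_assignment, best_cost)."""
    rng = random.Random(seed)
    current = [0] * len(tasks)  # start with every task on node 0
    current_cost = cost(current, tasks, nodes, alpha, beta)
    best, best_cost = current[:], current_cost
    temp = t0
    for _ in range(steps):
        # Neighbor move: reassign one randomly chosen task to a random node.
        candidate = current[:]
        candidate[rng.randrange(len(tasks))] = rng.randrange(len(nodes))
        c = cost(candidate, tasks, nodes, alpha, beta)
        # Always accept improvements; accept worse moves with a temperature-dependent probability.
        if c <= current_cost or rng.random() < math.exp((current_cost - c) / temp):
            current, current_cost = candidate, c
            if c < best_cost:
                best, best_cost = candidate[:], c
        temp = max(temp * cooling, 1e-9)
    return best, best_cost


tasks = [4.0, 2.0, 6.0, 3.0, 5.0]  # hypothetical work units per task
nodes = [{"speed": 1.0, "power": 1.0}, {"speed": 2.0, "power": 3.0}, {"speed": 1.5, "power": 1.5}]
baseline = cost([0] * len(tasks), tasks, nodes, alpha=1.0, beta=0.1)
best_assignment, best_cost = anneal(tasks, nodes)
```

The result is a near-optimal assignment rather than a guaranteed optimum. A task priority sequence, which the output strategy also contains, could be derived separately, for example by ordering tasks by urgency.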
[0169] Step S503: Distribute the dynamic identifier generation task to the target node according to the computing task allocation strategy;
[0170] In this process, the scheduling center initiates Remote Procedure Call (RPC) or other distributed communication mechanisms based on the information specified by the computing task allocation strategy, and pushes the specific identifier creation instructions to the selected candidate nodes.
[0171] Specifically, dynamic identifiers typically refer to unique identification codes that are continuously generated as the business context changes. These may originate from application scenarios such as IoT device registration, digital asset ownership certificates, or temporary tokens in various decentralized identity authentication systems. Because such identifiers often need to meet security requirements of high randomness, unpredictability, and collision resistance, their generation logic presents a high technical barrier. The node selected as the executor must be pre-configured with corresponding cryptographic primitive libraries (such as hash function families, elliptic curve signature toolkits, etc.) and immediately begin its internal computation process upon receiving the task.
[0172] Step S504: Obtain the dynamic identifier and associated metadata generated by the target node through the smart contract interface;
[0173] Smart contracts refer to programmable script units embedded within a blockchain platform. They define a series of automatically triggered conditions and corresponding sequences of execution actions, and can operate independently without third-party intervention once the preset rules are met.
[0174] In this embodiment, these contracts act as a bridge, responsible for receiving fresh identification results and related auxiliary descriptive information (collectively referred to as metadata) submitted by external nodes, and performing preliminary verification on the completeness and format correctness of the latter. Metadata should generally include at least a timestamp field to record the generation time, a version number to track iteration history, and even additional extended attributes such as geolocation tags or access restrictions.
[0175] Furthermore, standardized encapsulation protocols are needed to further abstract and transform the original messages, making them conform to the common interaction syntax requirements for on-chain processing. In this way, the originally distributed and heterogeneous local computing outputs can be transformed into consistent structure objects that can be shared and accessed across the entire network, laying the foundation for further persistent storage.
[0176] Step S505: Invoke the blockchain storage contract to map the dynamic identifier and associated metadata to the distributed ledger, and generate a storage proof containing the blockchain transaction hash.
[0177] The previously encapsulated identifier-metadata pairs are submitted to the blockchain storage contract module used to maintain permanent records. To improve storage efficiency and enhance security, the system uses a Merkle tree as the underlying index structure for batch processing. The digest value of each Merkle tree root node is submitted to the consensus engine to await consensus from all ledger nodes; only blocks confirmed as valid through a majority voting mechanism can be officially appended to the end of the ledger.
[0178] At the same time, each newly added data item will receive unique path proof information, allowing any querying party to trace back to the original content's origin at any future time using only a small hash link, greatly simplifying the verification and evidence collection process. More importantly, since all write operations are protected by cryptographic commitments, any attempt to tamper with historical records will be exposed because the complete chain cannot be forged, thus fundamentally eliminating the possibility of malicious repudiation.
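The Merkle root and per-item path proof can be sketched as follows. The use of SHA-256, the duplication of the last hash on odd-sized levels, and the record encoding are assumptions; production designs usually also separate leaf and interior hashes, which is omitted here for brevity.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list, index: int):
    """Return the Merkle root and the sibling path for leaves[index]."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        sibling = index ^ 1
        # Record the sibling hash and whether it sits to the left of the current node.
        proof.append((level[sibling], sibling < index))
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    """Recompute the root from a leaf and its sibling path."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


# Hypothetical identifier-metadata records in one batch.
records = [b"id-001|ts=1", b"id-002|ts=2", b"id-003|ts=3"]
root, proof = merkle_root_and_proof(records, index=2)
is_valid = verify(records[2], proof, root)
```

The proof holds one hash per tree level, so verifying a record in a batch of n items needs about log2(n) hashes rather than the whole batch.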
[0179] Furthermore, the proof of storage is essentially a digital certificate file with timestamped evidentiary value. It demonstrates to the outside world that specific data has indeed been reliably recorded in a publicly accessible and immutable distributed database. To this end, the system extracts the unique identifier corresponding to the transaction that was just successfully written, namely the transaction hash value, and integrates it along with other necessary attachments (such as the block height, Merkle path, etc.) into a standardized proof document.
[0180] The above implementation constructs a closed-loop resource scheduling and blockchain identification management system, which can not only significantly improve the utilization efficiency of computing resources in a large-scale distributed environment, but also ensure the authenticity and non-repudiation of each key business entity throughout its entire lifecycle, thereby providing enterprise users with a new type of digital asset management solution that combines high performance and strong security.
[0181] Referring to Figure 6, as one implementation of step S109, the step of feeding back hyperparameter adjustment instructions to the incremental training step of the reinforcement learning agent model for optimization includes:
[0182] Step S601: Parse the hyperparameter adjustment instructions and generate an incremental training configuration update file;
[0183] This process transforms the received abstract adjustment suggestions into concrete operational guidelines that are machine-readable and program-controllable. This requires not only decoding at the syntactic level but also semantic understanding and contextual reasoning, to ensure that the transformed information accurately reflects the original intent. The incremental training configuration update file is essentially a data structure describing how to modify the existing training settings. It may be a JSON file, a YAML document, or a file in another standardized format, and it contains information such as the names of the fields to be changed and a table comparing their old and new values.
[0184] More importantly, the document also includes information on whether the adjustment involves replacing key components (such as switching optimizer types) and whether there are potential risk warnings (such as gradient explosion warnings), providing supplementary decision-making support. By conducting in-depth analysis and standardization of the instructions, model collapse due to human error can be effectively avoided.
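By way of example only, the configuration update file of step S601 could be produced as in the following Python sketch. The field names (`learning_rate`, `optimizer`, `batch_size`), the JSON layout, and the tenfold learning-rate heuristic used for the risk warning are illustrative assumptions and are not taken from this application.

```python
import json


def build_config_update(current: dict, instruction: dict) -> str:
    """Turn a hyperparameter adjustment instruction into a JSON update document.

    The document lists old/new values per changed field, flags whether a key
    component (here: the optimizer) is replaced, and collects risk warnings.
    """
    changes = []
    for field, new_value in instruction.items():
        old_value = current.get(field)
        if old_value != new_value:
            changes.append({"field": field, "old": old_value, "new": new_value})

    replaces_component = any(c["field"] == "optimizer" for c in changes)

    warnings = []
    old_lr = current.get("learning_rate")
    new_lr = instruction.get("learning_rate")
    # Hypothetical heuristic: a large jump in learning rate may destabilize training.
    if old_lr and new_lr is not None and new_lr > 10 * old_lr:
        warnings.append("learning_rate raised more than 10x: possible gradient explosion")

    document = {
        "changes": changes,
        "replaces_component": replaces_component,
        "risk_warnings": warnings,
    }
    return json.dumps(document, indent=2, sort_keys=True)
```

Fields whose value does not change are omitted from the comparison table, so the update document stays limited to what the training step actually has to apply.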
[0185] Step S602: Load the reinforcement learning agent model and the current model parameters;
[0186] The current model parameters refer to the set of weight matrices most recently successfully saved, representing the optimal or at least usable state representation up to this point. The emphasis on "loading" rather than re-initialization is because the optimization process employs a gradual evolutionary path rather than a complete overhaul. This preserves the valuable knowledge assets accumulated in the early stages and provides a solid foundation for subsequent fine-tuning.
[0187] Step S603: Reconstruct the training process according to the incremental training configuration update file and perform incremental training operation on the reinforcement learning agent model.
[0188] The reconstruction of the training process includes: dynamically replacing the optimizer initialization parameters; reorganizing the batch partitioning strategy for the training data; and reconfiguring the normalization method of the network layers.
[0189] Specifically, once new configuration updates are received, the system immediately activates an intelligent scheduling engine to comprehensively review and adjust the existing training pipeline according to the latest rules. For example, if the learning rate needs to be reduced, it automatically switches to a new optimizer object with decay characteristics; if the importance of certain samples is found to have increased significantly, it triggers a priority-driven data sampler to reorder the elements in the training batch.
[0190] More importantly, all changes follow the principle of minimal intrusion: only the modules that a configuration change actually concerns are modified, and the functionality of unrelated modules is left untouched. At the same time, to keep the training process under control, a series of protective measures are applied, such as gradient clipping to limit the maximum update magnitude and freezing non-critical layers to prevent overfitting from worsening.
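The two protective measures named in [0190] can be sketched as follows. This is a framework-free Python illustration, assuming clipping by global L2 norm and a plain gradient-descent update; the application does not state which clipping variant or optimizer is used, and the names `clip_by_global_norm` and `sgd_step` are hypothetical.

```python
import math


def clip_by_global_norm(grads: dict[str, list[float]], max_norm: float) -> dict[str, list[float]]:
    """Scale all gradients so that their combined L2 norm does not exceed max_norm."""
    total = math.sqrt(sum(g * g for values in grads.values() for g in values))
    if total == 0.0 or total <= max_norm:
        return {name: list(values) for name, values in grads.items()}
    scale = max_norm / total
    return {name: [g * scale for g in values] for name, values in grads.items()}


def sgd_step(params: dict[str, list[float]],
             grads: dict[str, list[float]],
             lr: float,
             frozen: frozenset = frozenset()) -> dict[str, list[float]]:
    """Apply one gradient-descent update, leaving frozen layers unchanged."""
    updated = {}
    for name, values in params.items():
        if name in frozen:
            updated[name] = list(values)  # frozen layer: weights are kept as loaded
        else:
            updated[name] = [p - lr * g for p, g in zip(values, grads[name])]
    return updated
```

Here a layer named in `frozen` keeps the weights loaded in step S602, while the remaining layers move by at most `lr * max_norm` in total per step.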
[0191] Step S604: Output the optimized parameter set of the reinforcement learning agent model.
[0192] The above implementation achieves a high degree of automated management of the entire process from instruction issuance to model upgrade. A unique dynamic training process reconstruction mechanism significantly enhances the robustness and adaptability of the overall system. Especially when facing constantly evolving business scenarios, it can quickly respond to external stimuli and continuously improve its decision-making capabilities through precise and effective parameter correction methods, thereby achieving the goal of long-term stable and efficient operation.
[0193] Referring to Figure 7, as a further implementation of the method for constructing an intelligent data element identification system, the following steps are also included after the dynamic identifiers are generated:
[0194] Step S701: Obtain the dynamic identifier generation record set and statistically analyze the identifier conflict frequency distribution;
[0195] The dynamic identifier generation record set refers to the set of metadata generated during each dynamic identifier generation process, including but not limited to key fields such as the basic identifier composition method, dynamic parameter source, and timestamp markers. By performing cluster analysis on the occurrence of identical or similar identifiers in these record sets, the repetition probability and spatiotemporal distribution patterns of different identifiers can be obtained. This conflict frequency distribution not only reflects the effectiveness of the current identifier strategy but also provides a quantitative basis for whether to initiate conflict response measures. For example, if an identifier of a specific format frequently appears in multiple different contexts, it may indicate that the feature dimensions used to distinguish them are insufficient or the weight settings are unreasonable.
[0196] Step S702: When the conflict frequency of an identifier exceeds a preset frequency threshold, identify that identifier as a high-frequency conflict identifier;
[0197] The preset frequency threshold can be a configurable variable set according to business scenario requirements, such as the upper limit of the number of times the same identifier appears per unit time, or the consistency judgment standard for cross-node identifiers. Once the actual conflict rate is detected to exceed this limit, it means that the existing identifier generation logic can no longer meet the uniqueness guarantee requirements in a high-concurrency, multi-source heterogeneous environment, so a corresponding compensation mechanism must be triggered to correct potential problems.
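Steps S701 and S702 can be illustrated with the following Python sketch. It assumes that a conflict means the same identifier being assigned to more than one distinct data element, and that each generation record is reduced to an `(identifier, element_key)` pair; both are simplifying assumptions, and the function names are hypothetical.

```python
from collections import defaultdict


def conflict_counts(records):
    """Count, per identifier, how many extra distinct data elements share it.

    records: iterable of (identifier, element_key) pairs taken from the
    dynamic identifier generation record set.
    """
    owners = defaultdict(set)
    for identifier, element_key in records:
        owners[identifier].add(element_key)
    return {identifier: len(keys) - 1 for identifier, keys in owners.items()}


def high_frequency_conflicts(records, threshold: int):
    """Return the identifiers whose conflict count exceeds the preset threshold."""
    counts = conflict_counts(records)
    return sorted(identifier for identifier, n in counts.items() if n > threshold)
```

A repeated record for the same data element is not counted as a conflict in this sketch. A per-unit-time threshold, as mentioned in [0197], would additionally restrict `records` to a time window before counting.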
[0198] Step S703: Extract the static feature vector and dynamic feature vector corresponding to the high-frequency conflict identifier;
[0199] Step S704: Construct a feature dimension expansion matrix and add timestamp entropy features to the static feature vector;
[0200] The timestamp entropy value can be obtained by calculating the Shannon entropy or other relevant entropy functions of the time interval sequence of events within a time window, and is used to measure the level of randomness in the time distribution. For basic identifiers that originally relied solely on manually constructed rules, the lack of characterization of the uncertainty in the time dimension often leads to a high risk of collisions when faced with sudden traffic or nonlinear growth patterns.
[0201] Therefore, superimposing a new dimension that reflects temporal randomness on top of the original static features enhances the expressiveness and discriminative power of the original feature space. Specifically, the activity index at each time point can be collected using the sliding window method, and the local entropy value estimation result for the corresponding time period can be generated accordingly. This result is then encoded into a unified feature representation framework to form a new composite feature matrix for subsequent model use.
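One possible way to compute the timestamp entropy feature of step S704 is sketched below in Python. It assumes that the inter-event intervals within a window are discretized into fixed-width bins before the Shannon entropy is taken; the bin width and the function names are illustrative choices, not values given in this application.

```python
import math
from collections import Counter


def interval_entropy(timestamps, bin_width: float = 1.0) -> float:
    """Shannon entropy (in bits) of the binned intervals between consecutive events."""
    ordered = sorted(timestamps)
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not intervals:
        return 0.0
    bins = Counter(int(dt // bin_width) for dt in intervals)
    n = len(intervals)
    return -sum((count / n) * math.log2(count / n) for count in bins.values())


def expand_static_features(static_vector, timestamps, bin_width: float = 1.0):
    """Append the timestamp entropy as one extra dimension of the static feature vector."""
    return list(static_vector) + [interval_entropy(timestamps, bin_width)]
```

Perfectly regular arrivals fall into a single bin and give an entropy of zero, whereas arrivals alternating evenly between two interval lengths give one bit. For a sliding window, the caller would pass only the timestamps that fall inside the current window.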
[0202] Step S705: Input the expanded static feature vector into the reinforcement learning agent model and output the conflict resolution parameters.
[0203] The previously constructed reinforcement learning agent model already incorporates a knowledge base on how to automatically adjust internal parameters based on input features to maximize a specific reward objective. In the current step, because the input signal is provided after feature expansion, it theoretically guides the model to make more accurate and robust decision responses. The conflict resolution parameters are essentially a set of operational suggestions derived from the reinforcement learning algorithm. These may include, but are not limited to, modifying identifier length limits, switching hash algorithms, increasing the perturbation factor ratio, and introducing semantic constraints. All these parameters aim to reduce the likelihood of similar conflicts recurring in the future, while maintaining manageable overall system overhead.
[0204] Step S706: Reconstruct the dynamic identifier based on the conflict resolution parameters and update the parameter adjustment record.
[0205] This step is a typical component of a feedback control loop. By adopting the improvement strategies recommended by the reinforcement learning agent, targeted modifications can be made to previously problematic identifiers. For example, if the increase in conflicts is found to be caused by an excessively short hash, the hash string length can be increased in the new generation logic; or, if frequent mismatches are caused by neglecting geographical factors, auxiliary information such as IP address geolocation codes can be added to the new version.
[0206] It should be understood that each such iterative update is synchronously written to the globally maintained log file, namely the "parameter adjustment record," ensuring that all changes are traceable, reproducible, and evaluable. Furthermore, this process implicitly includes a mechanism for correcting earlier erroneous decisions, enabling the entire identifier management system to continuously evolve.
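A minimal Python sketch of step S706 is given below for illustration. It assumes that the conflict resolution parameters arrive as a dictionary with the hypothetical keys `hash_hex_length`, `salt`, and `geo_code`, and that the basic identifier is a truncated SHA-256 digest of the static features; none of these details is prescribed by this application.

```python
import hashlib
import json


def rebuild_identifier(static_features: dict, params: dict) -> str:
    """Regenerate an identifier according to the conflict resolution parameters."""
    payload = json.dumps(static_features, sort_keys=True)
    salt = str(params.get("salt", ""))  # perturbation factor (assumed to be a string salt)
    digest = hashlib.sha256(f"{payload}|{salt}".encode("utf-8")).hexdigest()
    core = digest[: int(params.get("hash_hex_length", 8))]
    geo = params.get("geo_code")  # optional geographic auxiliary information
    return f"{geo}-{core}" if geo else core


def record_adjustment(log: list, old_id: str, new_id: str, params: dict, timestamp: float) -> None:
    """Append one entry to the parameter adjustment record so the change is traceable."""
    log.append({"old_id": old_id, "new_id": new_id, "params": dict(params), "timestamp": timestamp})
```

Lengthening the hash keeps the shorter identifier as a prefix in this sketch, which makes it straightforward to relate a reconstructed identifier to the one it replaces.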
[0207] In the above implementation, the system can autonomously perceive changes in the external environment and make reasonable responses without much human intervention, which significantly improves the accuracy, flexibility and scalability of the identification allocation process.
[0208] Referring to Figure 8, as a further implementation of the method for constructing an intelligent data element identification system, after the steps of deploying the lightweight inference model to the edge nodes and deploying the optimized reinforcement learning agent model to the central node, the method further includes:
[0209] Step S801: Collect local feature evolution data while the edge node is running;
[0210] Local feature evolution data refers to the structural or statistical changes in data attributes observed by edge devices within their service range over time. For example, a sensor node might detect a significant increase in the frequency of temperature fluctuations in a certain area, or a shift in user access behavior patterns in a particular business scenario. These evolution processes often contain key clues about the current operating status of the system and are important bases for determining whether dynamic adjustments to the existing identification strategy are necessary. Because edge nodes are typically close to the physical world and the end-user interface, their sensing capabilities are highly timely and geographically relevant, thus enabling them to effectively capture new feature patterns that have not yet been identified by the upper-layer centralized model.
[0211] Step S802: Encrypt local feature evolution data to generate federated learning data packets;
[0212] Furthermore, considering the potential competition between different edge nodes and even the possibility of infiltration by malicious attackers, directly transmitting the original observation results in plaintext could easily lead to the leakage of sensitive information (such as patterns of population movement within a specific area) or to man-in-the-middle tampering. Therefore, the data is encapsulated using encryption, so that even if it is intercepted during network transmission, the true content cannot be recovered. The encryption may combine a symmetric encryption algorithm (such as AES) with an asymmetric encryption algorithm (such as RSA), or homomorphic encryption may be introduced to support aggregation operations without decryption. The resulting federated learning data packets not only retain the semantic structure necessary for subsequent analysis but also meet the stringent security and compliance requirements of modern information systems.
[0213] Step S803: Transmit the federated learning data packet to the central node through the secure aggregation channel;
[0214] This step constructs a controlled information backhaul path, enabling geographically dispersed edge nodes to complete knowledge-sharing tasks without exposing their local private information. The secure aggregation channel is not a traditional point-to-point link, but a higher-level security protocol stack design. It combines cryptographic and hardware isolation technologies, such as secure multi-party computation, differential privacy, and trusted execution environments, to achieve high-quality collaboration under decentralized conditions while ensuring that the interests of the participating parties are not infringed.
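As one simplified illustration of how aggregation can proceed without exposing any single node's data, the following Python sketch uses pairwise additive masking, a basic technique from secure multi-party computation. It is a swapped-in example and not necessarily the protocol intended by this application. It assumes that every pair of edge nodes already shares a secret seed, that updates are encoded as non-negative integers, and it uses Python's `random` module purely for readability, which is not cryptographically secure.

```python
import random

MODULUS = 2 ** 32  # all arithmetic is done modulo this value (assumed encoding range)


def pairwise_mask(node_id: int, peer_ids, pair_seeds: dict, length: int):
    """Build the mask of one node from the seeds it shares with every peer.

    For each pair, the lower-numbered node adds the shared random values and
    the higher-numbered node subtracts them, so all masks cancel in the sum.
    """
    mask = [0] * length
    for peer in peer_ids:
        if peer == node_id:
            continue
        seed = pair_seeds[(min(node_id, peer), max(node_id, peer))]
        rng = random.Random(seed)  # illustration only: not a secure generator
        sign = 1 if node_id < peer else -1
        for k in range(length):
            mask[k] = (mask[k] + sign * rng.randrange(MODULUS)) % MODULUS
    return mask


def mask_update(update, mask):
    """Hide a local update by adding the node's mask element-wise."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates):
    """Sum the masked updates; the pairwise masks cancel, leaving the true total."""
    return [sum(column) % MODULUS for column in zip(*masked_updates)]
```

In this sketch the central node learns only the element-wise sum of the updates. A production design would also have to handle node dropout and key agreement, which are outside the scope of the sketch.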
[0215] Step S804: Decrypt the federated learning data packet at the central node and fuse the data from multiple edge nodes to generate a global feature evolution map;
[0216] The decryption process relies on the authentication mechanism and key management system established in the preceding stages. After decoding, Graph Neural Networks (GNNs) or other topology modeling methods can be used to uniformly represent and transform the local feature vectors uploaded by each edge node, further exploring the correlation strength and transmission paths between them, ultimately creating a global feature evolution map covering the entire network. This map not only reflects the current state distribution of various data objects but also quantitatively displays the possible development directions and locations of anomalous mutation nodes in the future, providing strong support for the next stage of model optimization.
[0217] Step S805: Update the reward function of the reinforcement learning agent model based on the global feature evolution map.
[0218] The reward function, as one of the most crucial components of reinforcement learning, determines which actions an agent should take to maximize its cumulative reward, thus influencing its long-term policy choices. However, traditional reward and punishment settings are often a fixed set of rules pre-defined by humans, which is insufficient to meet the demands of increasingly complex real-world application scenarios.
[0219] In this embodiment, by leveraging the rich contextual knowledge provided by the previously constructed global feature evolution map, the long-term benefits of each labeling decision can be reassessed more rigorously, and the original scoring standard can be automatically corrected accordingly. For example, when it is found that some data labels that were originally considered to be of low importance frequently appear in the abnormal event triggering chain, the value weight coefficient of the corresponding branch action can be appropriately increased.
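The weight correction described in [0219] can be sketched as follows. The Python example assumes that the global feature evolution map has already been reduced to a count of how often each data label appears in abnormal event triggering chains; the linear boost rule, the `boost` and `cap` parameters, and the function name are hypothetical.

```python
def update_reward_weights(weights: dict, anomaly_chain_counts: dict,
                          total_chains: int, boost: float = 0.5, cap: float = 5.0) -> dict:
    """Raise the reward weight of labels that often appear in anomaly chains.

    A label seen in a fraction f of all anomaly chains has its weight
    multiplied by (1 + boost * f), limited to at most `cap`.
    """
    updated = {}
    for label, weight in weights.items():
        frequency = anomaly_chain_counts.get(label, 0) / total_chains if total_chains else 0.0
        updated[label] = min(cap, weight * (1.0 + boost * frequency))
    return updated
```

Labels that never appear in an anomaly chain keep their original weight, so the correction only touches the branches for which the map provides new evidence.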
[0220] In the above implementation, the entire identification engine has the ability to self-reflect and iteratively improve, realizing a major transformation from static configuration to autonomous evolution, greatly enhancing the robust stability of the system when facing unknown threats and interference factors, and improving the system's adaptability and intelligence level.
[0221] This application also discloses a system for constructing an intelligent data element identification system.
[0222] A system for constructing an intelligent data element identification system, the system comprising:
[0223] The feature extraction module is used to acquire the raw data stream and user-defined identification rules, and extract static and dynamic features from them.
[0224] The dynamic adjustment module is used to generate basic identifiers based on static features, input dynamic features into a pre-configured reinforcement learning agent model, and output dynamic adjustment parameters.
[0225] The dynamic identifier generation module is used to combine the basic identifier and the dynamic adjustment parameters into a dynamic identifier and generate parameter adjustment records.
[0226] The incremental training module is used to acquire parameter adjustment records and externally added data type samples to incrementally train the reinforcement learning agent model and generate an optimized reinforcement learning agent model.
[0227] The model compression module is used to compress the optimized reinforcement learning agent model to generate a lightweight inference model.
[0228] The model deployment module is used to build a distributed node network based on the acquired infrastructure topology map, deploy the lightweight inference model to the edge nodes, and deploy the optimized reinforcement learning agent model to the central node.
[0229] The resource allocation module is used to allocate computing tasks based on the node load status by executing a resource scheduling algorithm, and to map dynamic identifiers to blockchain storage through blockchain smart contracts.
[0230] The performance testing module is used to collect system runtime performance metrics and user feedback data, perform performance comparison tests and generate hyperparameter adjustment instructions, and call verification tools to verify the uniqueness of dynamic identifiers.
[0231] The feedback optimization module is used to feed back hyperparameter adjustment instructions to the incremental training steps of the reinforcement learning agent model for optimization.
[0232] The system for constructing an intelligent data element identification system according to this embodiment can implement any of the above methods; for the specific working process of each module in the system, refer to the corresponding process in the above method embodiments.
[0233] In the several embodiments provided in this application, it should be understood that the provided methods and systems can be implemented in other ways. The system embodiments described above are merely illustrative. For example, the division into modules is merely a division by logical function, and other divisions are possible in actual implementation: multiple modules can be combined or integrated into another system, or some features can be ignored or not executed.
[0234] This application also discloses a computer-readable storage medium.
[0235] A computer-readable storage medium storing a computer program that can be loaded by a processor to execute any of the methods described above for constructing an intelligent data element identification system.
[0236] The computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device; the program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to wireless, wire, optical fiber, RF, etc., or any suitable combination thereof.
[0237] In this application, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of this application, "multiple" means two or more, unless otherwise explicitly specified.
[0238] Although this application has been described herein in conjunction with various embodiments, those skilled in the art, by reviewing the accompanying drawings, disclosure, and appended claims, will understand and implement other variations of the disclosed embodiments in carrying out the claimed application. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit can implement several functions listed in the claims. While different dependent claims may recite certain measures, this does not mean that these measures cannot be combined to produce a good effect.
[0239] The above are all preferred embodiments of this application and are not intended to limit the scope of protection of this application. Any feature disclosed in this specification (including the abstract and drawings) may be replaced by other equivalent or similar features unless specifically stated otherwise. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
Claims
1. A method for constructing an intelligent data element identification system, characterized in that the method comprises: obtaining a raw data stream and user-defined identification rules, and extracting static features and dynamic features from them; generating a basic identifier based on the static features, inputting the dynamic features into a pre-configured reinforcement learning agent model, and outputting dynamic adjustment parameters; combining the basic identifier and the dynamic adjustment parameters into a dynamic identifier, and generating a parameter adjustment record; obtaining the parameter adjustment record and externally added data type samples, performing incremental training on the reinforcement learning agent model, and generating an optimized reinforcement learning agent model; compressing the optimized reinforcement learning agent model to generate a lightweight inference model; constructing a distributed node network based on an obtained infrastructure topology map, deploying the lightweight inference model to edge nodes, and deploying the optimized reinforcement learning agent model to a central node; executing a resource scheduling algorithm based on node load status to allocate computing tasks, and mapping the dynamic identifier to blockchain storage through a blockchain smart contract; collecting system runtime performance metrics and user feedback data, performing performance comparison tests and generating hyperparameter adjustment instructions, and calling a verification tool to verify the uniqueness of the dynamic identifier; and feeding the hyperparameter adjustment instructions back to the incremental training step of the reinforcement learning agent model for optimization.
2. The method for constructing an intelligent data element identification system according to claim 1, characterized in that the step of generating a basic identifier based on the static features, inputting the dynamic features into a pre-configured reinforcement learning agent model, and outputting dynamic adjustment parameters comprises: obtaining a set of structured feature vectors, including a static feature vector and a dynamic feature vector; generating the basic identifier from the static feature vector through a hash operation; and inputting the dynamic feature vector into the pre-configured reinforcement learning agent model, outputting an initial adjustment value through a policy network, correcting the initial adjustment value based on the temporal difference error, and outputting the converged dynamic adjustment parameters.
3. The method for constructing an intelligent data element identification system according to claim 1, characterized in that the step of obtaining the parameter adjustment record and externally added data type samples, performing incremental training on the reinforcement learning agent model, and generating an optimized reinforcement learning agent model comprises: extracting, from the parameter adjustment record, the dynamic adjustment parameter values recorded during the dynamic identifier generation process, and associating them with the static feature metadata and timestamp sequence of the corresponding data elements; obtaining a sample set of newly added external data types, including the feature vectors of the newly added data types; combining the parameter adjustment record dataset and the newly added data type sample set into an incremental training dataset; inputting the incremental training dataset into the reinforcement learning agent model, freezing the weights of the model's feature extraction layer, and unfreezing only the fully connected layers of the policy network for backpropagation; calculating the parameter update direction based on a policy gradient algorithm, and storing historical adjustment records in an experience replay buffer; correcting the reward function weights using the temporal difference error, and applying gradient clipping to constrain the parameter update magnitude; and generating and outputting the optimized reinforcement learning agent model.
4. The method for constructing an intelligent data element identification system according to claim 3, characterized in that the step of constructing a distributed node network based on the obtained infrastructure topology map, deploying the lightweight inference model to edge nodes, and deploying the optimized reinforcement learning agent model to the central node comprises: obtaining the infrastructure topology map and analyzing the node attributes and network connection relationships in the map; calculating a processing capability score and a network latency score for each node, and dividing the nodes into edge nodes and central nodes according to a dynamic partitioning threshold to construct the distributed node network; obtaining the lightweight inference model, dividing it into data blocks adapted to the storage capacity of the edge devices, distributing the data blocks to the target edge nodes through a secure transmission protocol, and performing model integrity verification and runtime environment configuration on the target edge nodes; and obtaining the optimized reinforcement learning agent model, creating a model sandbox execution environment at the central node, loading the model parameters of the reinforcement learning agent model and initializing the policy network weights, and establishing a bidirectional communication channel with the edge nodes.
5. The method for constructing an intelligent data element identification system according to claim 4, characterized in that the step of executing a resource scheduling algorithm based on node load status to allocate computing tasks and mapping the dynamic identifier to blockchain storage through a blockchain smart contract comprises: collecting, in real time, load status data of all nodes in the distributed node network to generate a node load status matrix; executing the resource scheduling algorithm based on the node load status matrix, constructing a multi-constraint function with task completion time and energy consumption as optimization objectives, solving for the optimal task allocation using a heuristic search algorithm, and outputting a computing task allocation strategy containing target node identifiers and a task priority sequence; distributing the dynamic identifier generation task to the target nodes according to the computing task allocation strategy; obtaining, through the smart contract interface, the dynamic identifier and associated metadata generated by the target nodes; and invoking the blockchain storage contract to map the dynamic identifier and associated metadata to the distributed ledger, generating a storage proof containing the blockchain transaction hash.
6. The method for constructing an intelligent data element identification system according to claim 1, characterized in that the step of feeding the hyperparameter adjustment instructions back to the incremental training step of the reinforcement learning agent model for optimization comprises: parsing the hyperparameter adjustment instructions to generate an incremental training configuration update file; loading the reinforcement learning agent model and the current model parameters; reconstructing the training process according to the incremental training configuration update file, and performing incremental training on the reinforcement learning agent model; and outputting the optimized parameter set of the reinforcement learning agent model.
7. The method for constructing an intelligent data element identification system according to any one of claims 1 to 6, characterized in that, after the dynamic identifier is generated, the method further comprises: obtaining the dynamic identifier generation record set and statistically analyzing the identifier conflict frequency distribution; when the conflict frequency of an identifier exceeds a preset frequency threshold, identifying that identifier as a high-frequency conflict identifier; extracting the static feature vector and the dynamic feature vector corresponding to the high-frequency conflict identifier; constructing a feature dimension expansion matrix and adding a timestamp entropy feature to the static feature vector; inputting the expanded static feature vector into the reinforcement learning agent model and outputting conflict resolution parameters; and reconstructing the dynamic identifier based on the conflict resolution parameters and updating the parameter adjustment record.
8. The method for constructing an intelligent data element identification system according to claim 1, characterized in that, after the steps of deploying the lightweight inference model to the edge nodes and deploying the optimized reinforcement learning agent model to the central node, the method further comprises: collecting local feature evolution data while the edge nodes are running; encrypting the local feature evolution data to generate federated learning data packets; transmitting the federated learning data packets to the central node through a secure aggregation channel; decrypting the federated learning data packets at the central node and fusing the data from multiple edge nodes to generate a global feature evolution map; and updating the reward function of the reinforcement learning agent model based on the global feature evolution map.
9. A system for constructing an intelligent data element identification system, characterized in that the system comprises: a feature extraction module, used to acquire a raw data stream and user-defined identification rules, and extract static features and dynamic features from them; a dynamic adjustment module, used to generate a basic identifier based on the static features, input the dynamic features into a pre-configured reinforcement learning agent model, and output dynamic adjustment parameters; a dynamic identifier generation module, used to combine the basic identifier and the dynamic adjustment parameters into a dynamic identifier and generate a parameter adjustment record; an incremental training module, used to acquire the parameter adjustment record and externally added data type samples, perform incremental training on the reinforcement learning agent model, and generate an optimized reinforcement learning agent model; a model compression module, used to compress the optimized reinforcement learning agent model to generate a lightweight inference model; a model deployment module, used to construct a distributed node network based on an acquired infrastructure topology map, deploy the lightweight inference model to edge nodes, and deploy the optimized reinforcement learning agent model to a central node; a resource allocation module, used to allocate computing tasks by executing a resource scheduling algorithm based on node load status, and to map the dynamic identifier to blockchain storage through a blockchain smart contract; a performance testing module, used to collect system runtime performance metrics and user feedback data, perform performance comparison tests and generate hyperparameter adjustment instructions, and call a verification tool to verify the uniqueness of the dynamic identifier; and a feedback optimization module, used to feed the hyperparameter adjustment instructions back to the incremental training step of the reinforcement learning agent model for optimization.
10. A computer-readable storage medium, characterized in that it stores a computer program that can be loaded by a processor to execute the method according to any one of claims 1 to 8.