Machine learning-based data asset table entry intelligent classification system and debt structure optimization method

By using dynamic spatiotemporal data graphs and deep deterministic policy gradient networks, the invention resolves the spatiotemporal mapping gap and gradient conflict between the high-dimensional physical domain and the financial domain, enabling high-precision data asset classification and debt structure optimization with stable, automated execution.

CN122199154A · Pending · Publication Date: 2026-06-12 · NANJING AUDIT UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: NANJING AUDIT UNIV
Filing Date: 2026-03-17
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

Existing machine learning-based artificial intelligence systems face computational bottlenecks, architectural flaws, spatiotemporal scale mismatches, and gradient conflicts when mapping high-dimensional physical domain data to the financial asset domain, resulting in low accuracy in data asset classification and unstable debt structure optimization.

Method used

The method combines dynamic spatiotemporal data graph construction, spatiotemporal graph attention network feature extraction, a multi-scale spatiotemporal semantic pooling layer, and a deep deterministic policy gradient network with an orthogonal gradient projection mechanism, bridging the spatiotemporal scale mismatch and resolving gradient conflicts in heterogeneous multi-task joint training.

Benefits of technology

It improves the accuracy and robustness of data asset classification, ensures the global stability and automated execution of debt structure optimization, and achieves high-precision, fully automated closed-loop control from the underlying physical device perception to the top-level smart contract fund transfer.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a machine learning-based intelligent classification system for data asset entry and a debt structure optimization method, belonging to the technical field of artificial intelligence systems in the production field. To address the spatiotemporal scale mismatch and the joint-training gradient conflict between high-frequency physical perception data and low-frequency macro financial decisions, the application acquires multi-source heterogeneous time-series perception data, calculates a dynamic mutual information entropy threshold, and constructs a dynamic spatiotemporal data graph. A spatiotemporal graph attention network extracts features and outputs asset classification confidence and hidden layer semantic features. A multi-scale spatiotemporal semantic pooling layer performs frequency reduction and aggregation to construct a Markov decision process state space. Finally, the state space is input into a deep deterministic policy gradient network jointly trained with orthogonal gradient projection intervention, which outputs a multi-dimensional action tensor; a configuration instruction is then issued and a smart contract is triggered to automatically execute fund account allocation and transfer. The application effectively bridges the cross-domain mapping gap, eliminates catastrophic forgetting, and realizes automatic global optimization.

Description

Technical Field

[0001] This invention relates to the field of artificial intelligence systems in the production sector, and in particular to a machine learning-based intelligent classification system for data asset entry and a debt structure optimization method.

Background Technology

[0002] With the deep integration of industrial internet and artificial intelligence technologies, production systems continuously generate massive amounts of heterogeneous time-series sensing data. In current intelligent production management and enterprise resource planning, accurately assessing and classifying these underlying physical data as data assets, and guiding macro-decision systems to optimize debt structure accordingly, has become an important means to improve resource utilization efficiency and reduce enterprise operational risks. However, existing machine learning-based artificial intelligence systems face serious computational bottlenecks and architectural defects when attempting to map high-dimensional physical domain data to the financial asset domain and perform automated optimization control.

[0003] In the data asset classification stage, conventional data processing models usually treat multi-source sensor data as isolated one-dimensional time series. This processing paradigm seriously ignores the highly dynamic spatial coupling relationship between production nodes. In actual industrial environments, the data quality and asset value of a specific node are often dynamically affected by upstream and downstream topological nodes. The lack of an effective spatiotemporal topological feature extraction mechanism will lead to a huge semantic gap between the underlying perception data and the high-level asset assessment dimension, making it difficult for the system to output a reliable asset classification confidence level.

[0004] Secondly, in the subsequent structural optimization stage, existing decision-making systems generally adopt a fragmented two-stage processing architecture, or mechanically feed high-frequency physical data directly into the optimization controller. The underlying perception data fluctuates at high frequency, while macro-optimization actions such as debt structure adjustment are low-frequency decisions. This severe mismatch of spatiotemporal scales forces reinforcement learning agents to face a dimensionality explosion of the state space when constructing Markov decision processes. The agents find it difficult to extract stable state transition patterns from rapidly changing high-frequency states, which directly causes the optimization algorithms to diverge and the generated structural configuration parameters to be extremely unstable.

[0005] Even with current technologies attempting to construct an end-to-end joint computing architecture coupling classification and reinforcement learning networks to eliminate read / write latency in intermediate stages, the model still exposes a fatal gradient conflict problem during the backpropagation phase of training. Classification tasks rely on supervised learning, resulting in relatively stable and well-defined gradients; while reinforcement learning tasks rely on trial and error, leading to extremely large policy gradient variances and often significant noise. When the system performs joint backpropagation to update the shared underlying feature representation layer, the drastic gradients generated by the reinforcement learning task can easily overwrite the feature extraction weights already established by the classification network. This forgetting phenomenon not only compromises the accuracy of data asset classification but also causes subsequent debt structure optimization tasks to completely fail.

[0006] In summary, existing technologies urgently need an intelligent processing architecture that can dynamically capture data topological relationships, effectively bridge the spatiotemporal scale mismatch, and resolve gradient conflicts in heterogeneous multi-task joint training at the level of the underlying mathematical mechanism, thereby achieving high-precision data asset classification and globally stable structural optimization.

Summary of the Invention

[0007] This invention overcomes the shortcomings of the prior art and provides a data asset entry intelligent classification system and debt structure optimization method based on machine learning.

[0008] To achieve the above objectives, the technical solution adopted by this invention is: a method for intelligent classification of data assets and optimization of debt structure based on machine learning, comprising the following steps:

[0009] S1. Acquire multi-source heterogeneous time-series sensing data in the production field, calculate the dynamic mutual information entropy threshold of the time-series sensing data in the current time window, establish dynamic topological connections between the nodes corresponding to each time-series sensing data based on the dynamic mutual information entropy threshold, and construct a dynamic spatiotemporal data graph.

[0010] S2. Input the dynamic spatiotemporal data graph into the pre-trained spatiotemporal graph attention network ST-GAT for feature extraction, and output the data asset classification confidence matrix for the time-series sensing data, as well as the corresponding hidden layer semantic feature vector.

[0011] S3. Input the hidden layer semantic feature vector into the multi-scale spatiotemporal semantic pooling layer for time-scale frequency reduction and feature aggregation to generate an aggregated semantic vector. Then, concatenate the aggregated semantic vector with the baseline debt structure tensor of the current system to construct the state space of the Markov decision process (MDP) for reinforcement learning.

[0012] S4. Input the MDP state space into the Deep Deterministic Policy Gradient (DDPG) agent network, and output a multidimensional action tensor through the policy function built into the DDPG agent network. The DDPG agent network is pre-trained based on maximizing the joint reward function as the optimization objective. Generate dynamic debt structure parameter configuration instructions according to the multidimensional action tensor, and send the configuration instructions to the resource scheduling gateway of the artificial intelligence optimization operating system to automatically update the capitalization label of the corresponding data asset in the underlying database, and trigger the smart contract connected to the gateway to execute automated fund account allocation and transfer.

[0013] In a preferred embodiment of the present invention, the specific process of constructing a dynamic spatiotemporal data graph in step S1 includes:

[0014] Within the current time window, calculate the time series mutual information entropy between any two of the time-series-aware data nodes;

[0015] Determine whether the mutual information entropy is greater than the dynamic mutual information entropy threshold; if yes, establish a directed edge between the two corresponding time-series-aware data nodes, and use the value of the mutual information entropy as the initial connection weight of the directed edge; if no, disconnect the corresponding node connection.

[0016] Using the time-series-aware data nodes as graph nodes and the directed edges and connection weights as the graph structure, a dynamic spatiotemporal data graph representing the topological relationships of production physical quantities is generated.

[0017] In a preferred embodiment of the present invention, the specific process of feature extraction by the spatiotemporal graph attention network ST-GAT in step S2 includes:

[0018] By utilizing the spatial graph convolution mechanism, the clustering value density of target features of adjacent nodes to the central node is calculated based on the connection weights between nodes, in order to extract spatial topological features;

[0019] By utilizing a temporal self-attention mechanism, attention weights are assigned to the historical time step features of the central node to extract temporal decay features;

[0020] The spatial topological features are fused with the temporal decay features and mapped to a preset asset class space through a multilayer perceptron. The data asset classification confidence matrix is output, and the output of the penultimate layer of the multilayer perceptron is extracted as the hidden layer semantic feature vector.
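The two feature paths in [0018] and [0019] can be illustrated with a small sketch. It is not the patented ST-GAT: it assumes that the spatial step is a connection-weight-normalized average of neighbor features and that the temporal step is scaled dot-product attention with the latest time step as the query. The function names `spatial_aggregate` and `temporal_attention` and the convention that `adj[j][center]` is the weight of the edge from node `j` into the central node are hypothetical.

```python
import math

def spatial_aggregate(features, adj, center):
    """Weighted mean of neighbor features, weighted by edges into `center`.

    Assumption: adj[j][center] holds the connection weight of edge j -> center.
    """
    n = len(adj)
    dim = len(features[center])
    total = sum(adj[j][center] for j in range(n))
    if total == 0.0:
        # No incoming edges: fall back to the node's own feature.
        return list(features[center])
    return [
        sum(adj[j][center] * features[j][d] for j in range(n)) / total
        for d in range(dim)
    ]

def temporal_attention(history):
    """Softmax attention over historical steps; the latest step is the query."""
    query = history[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, step)) / scale for step in history]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    norm = sum(weights)
    weights = [w / norm for w in weights]
    dim = len(query)
    pooled = [sum(w * step[d] for w, step in zip(weights, history)) for d in range(dim)]
    return pooled, weights
```

In a full model the two outputs would be fused (for example by concatenation) and passed through the multilayer perceptron described in [0020]; the fusion operator is not specified in the text, so it is left out here.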

[0021] In a preferred embodiment of the present invention, the specific process of temporal-scale frequency reduction and feature aggregation in the multi-scale spatiotemporal semantic pooling layer in step S3 includes:

[0022] The hidden layer semantic feature vectors, which are output at high frequency, are input into the multi-scale spatiotemporal semantic pooling layer, which is configured with a predefined asset activity decay function.

[0023] The asset activity decay function is used to perform time-step-based cumulative calculation on the hidden layer semantic feature vector to obtain the cumulative feature value;

[0024] When the accumulated feature value is greater than or equal to the preset state trigger threshold, the accumulated features are pooled to generate the aggregate semantic vector, and the state update of the MDP state space is triggered; when the accumulated feature value is less than the state trigger threshold, the aggregate semantic vector of the previous time step is kept unchanged.
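The trigger logic of [0022] to [0024] can be sketched as follows. The text does not give the form of the asset activity decay function, the definition of the accumulated feature value, or the pooling operator, so this sketch assumes an exponential per-step decay, the L2 norm of the decayed running sum as the accumulated value, and a decay-weighted mean as the pooling. The class name `SemanticPooling` and its parameters are hypothetical.

```python
import math

class SemanticPooling:
    """Event-triggered frequency reduction of high-frequency hidden features."""

    def __init__(self, dim, threshold, decay_rate=0.1):
        self.threshold = threshold
        self.decay = math.exp(-decay_rate)  # assumed exponential decay per step
        self.buffer = [0.0] * dim           # decayed running sum of features
        self.weight = 0.0                   # decayed count of accumulated steps
        self.aggregate = [0.0] * dim        # last emitted aggregated vector

    def step(self, hidden):
        """Consume one hidden vector; return (aggregate, state_updated)."""
        self.buffer = [self.decay * b + h for b, h in zip(self.buffer, hidden)]
        self.weight = self.decay * self.weight + 1.0
        # Assumed accumulated feature value: L2 norm of the decayed sum.
        value = math.sqrt(sum(b * b for b in self.buffer))
        if value >= self.threshold:
            # Pool, emit a new aggregated vector, and reset the accumulator.
            self.aggregate = [b / self.weight for b in self.buffer]
            self.buffer = [0.0] * len(self.buffer)
            self.weight = 0.0
            return self.aggregate, True
        # Below the trigger threshold: keep the previous aggregated vector.
        return self.aggregate, False
```

With this structure the downstream MDP state changes only when the pooled activity crosses the threshold, which is how the high-frequency input is reduced to low-frequency state updates.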

[0025] In a preferred embodiment of the present invention, the action space and state space of the DDPG agent network in step S4 are defined as follows:

[0026] The multidimensional action tensor is a normalized multidimensional tensor, and its action dimensions include at least the long-term debt adjustment ratio, the short-term debt adjustment ratio, and the data asset capitalization hedging ratio.

[0027] The baseline debt structure tensor is a multidimensional matrix containing the financial benchmark data set by the current system; the MDP state space is generated by flattening and concatenating the aggregated semantic vector with the baseline debt structure tensor, and then aligning the dimensions through a fully connected network.
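The state construction in [0027] amounts to a flatten, a concatenation, and a linear mapping. A minimal sketch, assuming a single fully connected layer without activation for the dimension alignment; the helper names `flatten` and `build_state` and the explicit `weights` and `bias` arguments are hypothetical, and in practice the layer parameters would be learned.

```python
def flatten(tensor):
    """Recursively flatten nested lists or tuples into a flat list of floats."""
    out = []
    for item in tensor:
        if isinstance(item, (list, tuple)):
            out.extend(flatten(item))
        else:
            out.append(float(item))
    return out

def build_state(aggregate, baseline_debt, weights, bias):
    """Concatenate the flattened inputs and align dimensions with one linear layer."""
    raw = flatten(aggregate) + flatten(baseline_debt)
    if any(len(row) != len(raw) for row in weights):
        raise ValueError("each weight row must match the concatenated input length")
    return [sum(w * x for w, x in zip(row, raw)) + b for row, b in zip(weights, bias)]
```

For example, a 2-element aggregated vector and a 2x2 baseline matrix give a 6-element concatenated input, which a 3x6 weight matrix would map to a 3-dimensional state.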

[0028] In a preferred embodiment of the present invention, the joint reward function mentioned in step S4 as the optimization objective is calculated in the pre-joint training phase as follows:

[0029] $R = -\omega_1 C_{fin} + \omega_2 \sum_{i=1}^{N} P_i \lambda_i - \Phi$;

[0030] where $R$ is the reward value output by the joint reward function; $C_{fin}$ is the estimated comprehensive financing cost generated after simulated execution based on the aforementioned multidimensional action tensor; $\omega_1$ and $\omega_2$ are the preset weighting coefficients; $N$ is the total number of asset classes; $P_i$ is the confidence score of the $i$-th asset class in the data asset classification confidence matrix; $\lambda_i$ is the preset liquidity conversion factor of the $i$-th asset class; and $\Phi$ is a penalty term triggered when the multidimensional action tensor violates a preset hard financial constraint.
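The terms listed in [0030] suggest a reward of the following shape. This is a sketch under assumptions: the financing cost is taken to enter with a negative sign, the confidence-weighted liquidity sum with a positive sign, and the penalty is subtracted. The signs, the function name `joint_reward`, and the parameter names are not taken from the source.

```python
def joint_reward(financing_cost, confidences, liquidity,
                 w_cost, w_liquidity, penalty=0.0):
    """Assumed reward: -w_cost * cost + w_liquidity * sum(P_i * lambda_i) - penalty."""
    if len(confidences) != len(liquidity):
        raise ValueError("one liquidity conversion factor is needed per asset class")
    liquidity_term = sum(p * lam for p, lam in zip(confidences, liquidity))
    return -w_cost * financing_cost + w_liquidity * liquidity_term - penalty
```

A caller would pass `penalty=0.0` when the action tensor satisfies every hard financial constraint and a positive value otherwise, so constraint violations always lower the reward.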

[0031] In a preferred embodiment of the present invention, the ST-GAT pre-trained in step S2 and the DDPG agent network pre-trained in step S4 share a common low-level feature representation layer. The ST-GAT and the DDPG agent network are jointly trained before model deployment based on a joint loss function with adaptive weights and according to an orthogonal gradient projection mechanism. The specific joint training process includes:

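[0032] The joint loss function $L_{joint}$ is calculated as:

[0033] $L_{joint} = \frac{1}{2\sigma_1^2} L_{cls} + \frac{1}{2\sigma_2^2} L_{rl} + \log \sigma_1 + \log \sigma_2$;

[0034] where $L_{cls}$ is the cross-entropy classification loss and $L_{rl}$ is the policy gradient loss; $\sigma_1$ and $\sigma_2$ are the learnable noise parameters corresponding to the classification task and the reinforcement learning task, respectively;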

[0035] During the joint backpropagation process, the classification gradient $g_{cls}$ of the joint loss function with respect to the classification task and the policy gradient $g_{rl}$ generated by the reinforcement learning task are calculated separately;

[0036] If the dot product of the classification gradient $g_{cls}$ and the policy gradient $g_{rl}$ is less than zero, the policy gradient $g_{rl}$ is projected onto the normal plane of the classification gradient $g_{cls}$ to generate the modified update gradient $g'_{rl}$, and the classification gradient $g_{cls}$ and the update gradient $g'_{rl}$ are used together to synchronously update the underlying feature representation layer and the learnable noise parameters.
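The joint training step described above can be sketched in two small functions. The projection follows directly from the text: when the dot product is negative, the component of the policy gradient along the classification gradient is removed. The loss weighting is an assumption: it uses the common homoscedastic-uncertainty form with learnable log-variances `s = log(sigma^2)`, which may differ from the exact formula in the filing. Gradients are plain Python lists here, and both function names are hypothetical.

```python
import math

def project_conflicting(policy_grad, class_grad):
    """Project policy_grad onto the normal plane of class_grad if they conflict."""
    dot = sum(p * c for p, c in zip(policy_grad, class_grad))
    if dot >= 0.0:
        return list(policy_grad)  # no conflict: leave the gradient unchanged
    norm_sq = sum(c * c for c in class_grad)  # > 0 whenever dot < 0
    scale = dot / norm_sq
    return [p - scale * c for p, c in zip(policy_grad, class_grad)]

def uncertainty_weighted_loss(loss_cls, loss_rl, log_var_cls, log_var_rl):
    """Assumed form: sum over tasks of 0.5 * exp(-s) * loss + 0.5 * s."""
    return (0.5 * math.exp(-log_var_cls) * loss_cls + 0.5 * log_var_cls
            + 0.5 * math.exp(-log_var_rl) * loss_rl + 0.5 * log_var_rl)
```

After projection the modified policy gradient is orthogonal to the classification gradient, so a shared-layer update along their sum no longer moves against the classification objective to first order.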

[0037] A machine learning-based intelligent classification system for data asset entry and debt structure optimization includes:

[0038] The data perception and graph construction module is used to acquire multi-source heterogeneous time-series perception data in the production field, calculate the dynamic mutual information entropy threshold of the time-series perception data in the current time window, establish dynamic topological connections between the nodes corresponding to each time-series perception data based on the dynamic mutual information entropy threshold, and construct a dynamic spatiotemporal data graph.

[0039] The asset classification and feature extraction module is used to input the dynamic spatiotemporal data graph into the pre-trained spatiotemporal graph attention network ST-GAT for feature extraction, and output the data asset classification confidence matrix for the time-series perception data, as well as the corresponding hidden layer semantic feature vector.

[0040] The state space mapping module is used to input the hidden layer semantic feature vector into a multi-scale spatiotemporal semantic pooling layer for temporal-scale frequency reduction and feature aggregation, generating an aggregated semantic vector. This aggregated semantic vector is then concatenated with the baseline debt structure tensor of the current system to construct the state space of a Markov decision process (MDP) for reinforcement learning.

[0041] The optimization control and instruction generation module is used to input the MDP state space into the Deep Deterministic Policy Gradient (DDPG) agent network, output a multi-dimensional action tensor through the policy function built into the DDPG agent network. The DDPG agent network is pre-trained based on maximizing the joint reward function as the optimization objective. The module generates dynamic debt structure parameter configuration instructions based on the multi-dimensional action tensor and sends the configuration instructions to the resource scheduling gateway of the artificial intelligence optimization operating system to automatically update the capitalization label of the corresponding data assets in the underlying database and trigger the smart contract connected to the gateway to execute automated fund account allocation and transfer.

[0042] An electronic device includes a memory and a processor, wherein the memory stores a computer program, and the processor executes the program to implement any of the methods described above.

[0043] A non-volatile computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements any of the methods described above.

[0044] This invention addresses the shortcomings of the prior art and has the following beneficial effects:

[0045] This invention constructs a dynamic spatiotemporal data map by calculating the dynamic mutual information entropy threshold of time-series sensing data in the production field within the current time window. It then utilizes a spatiotemporal graph attention network to extract spatial topological features and temporal decay features, accurately capturing the highly dynamic spatial coupling relationships and temporal dependencies between underlying physical devices. This multi-scale feature fusion mechanism effectively bridges the semantic gap between underlying sensing physical quantities and high-level asset assessment dimensions, directly outputting a highly reliable data asset classification confidence matrix and hidden semantic feature vectors. Traditional data processing paradigms often fragment sensor data into isolated one-dimensional time series, easily ignoring the data quality spillover effect between topological nodes. The data map construction and attention extraction mechanism of this invention completely overcomes this structural defect, significantly improving the accuracy and robustness of data asset classification in complex industrial IoT environments.

[0046] This invention creatively introduces a multi-scale spatiotemporal semantic pooling layer. It utilizes an asset activity decay function to perform time-step-based cumulative calculations on the hidden semantic feature vectors of high-frequency outputs, generating aggregated semantic vectors only when a state trigger threshold is exceeded to update the state space of the Markov decision process. This frequency reduction aggregation mechanism perfectly aligns with the frequency difference between high-frequency physical perception data fluctuations and low-frequency macro-level debt adjustment decisions at the underlying mathematical logic level, effectively preventing reinforcement learning agents from falling into dimensionality explosion due to high-frequency drastic state changes. Compared to existing decision systems, which often suffer from severe spatiotemporal scale mismatches leading to optimization algorithm divergence or highly unstable configuration parameters when connecting classification and optimization tasks, this invention's frequency reduction mapping architecture constructs a stable and highly regular state transition environment for the agent. This ensures that the deep deterministic policy gradient network can converge quickly under complex constraints, outputting smooth and globally optimal debt structure allocation actions in real time.

[0047] In the pre-joint training phase of the model, this invention constructs a joint loss function with adaptive weights based on homoscedastic uncertainty and innovatively introduces an orthogonal gradient projection mechanism. When the classification gradient of the classification task and the policy gradient of the reinforcement learning task conflict in direction (i.e., the dot product is less than zero), the policy gradient is forcibly projected onto the normal plane of the classification gradient to generate an update gradient. This underlying backpropagation intervention fundamentally resolves the severe antagonism between supervised learning (stable gradient) and trial-and-error exploration (high variance noise gradient) when synchronously updating the shared underlying feature representation layer. Compared to conventional end-to-end joint training in existing technologies, which is prone to catastrophic forgetting due to the brutal gradient of reinforcement learning destroying the classification network weights, the orthogonal projection mechanism of this invention ensures the absolute stability of classification feature extraction and the non-interference of reinforcement learning optimization exploration, realizing true deep coupling and mutual promotion of heterogeneous multi-task networks and maximizing the comprehensive optimization efficiency of the system.

[0048] This invention tightly binds the multidimensional action tensor output by the deep deterministic policy gradient network to the capitalization label update in the underlying database and to the smart contract connected to the resource scheduling gateway. It uses a joint reward function that integrates comprehensive financing costs, asset liquidity conversion coefficients, and hard financial constraint penalties as the sole optimization objective for network pre-training. This closed-loop control link transforms the abstract debt structure optimization requirements into quantitative mathematical indicators that the agent can accurately calculate, directly driving the system to automatically execute fund account allocation and transfer. Previous AI applications often lacked the ability to translate financial business logic into underlying calculation formulas and relied excessively on manual intervention, resulting in significant lag. This invention not only endows the model with a global quantitative perspective that considers both debt duration and asset liquidity but also constructs a fully automated technical closed loop from underlying data perception to top-level smart contract execution, achieving a breakthrough in the intelligent collaborative scheduling of physical and financial flows.

Attached Figure Description

[0049] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0050] Figure 1 is a flowchart of the machine learning-based method for intelligent classification of data assets and optimization of debt structure according to the present invention.

[0051] Figure 2 is an architectural block diagram of the data asset entry intelligent classification and debt structure optimization system and the electronic device of the present invention.

Detailed Implementation

[0052] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0053] Many specific details are set forth in the following description in order to provide a full understanding of the invention. However, the invention may also be practiced in other ways different from those described herein. Therefore, the scope of protection of the invention is not limited to the specific embodiments disclosed below.

[0054] Application Overview:

[0055] In the existing fields of data asset entry and corporate debt structure optimization, there are two conventional technical evolution routes. The first technical route relies on traditional static financial rule expert systems. The advantage of this route is that the business logic is clear and has high interpretability, but its disadvantage is that it is completely unable to handle the massive heterogeneous high-frequency sensor data generated by the underlying layer of modern industrial Internet of Things, resulting in asset valuation lagging far behind actual production conditions.

[0056] The second technical approach attempts to introduce conventional end-to-end deep learning models to directly map underlying perception data into decision outputs. While this approach improves the automation level of the system, it exposes a core technical contradiction that is extremely difficult to overcome. Specifically, the underlying physical perception data exhibits high-frequency and violent fluctuations at the millisecond level, while macro-level debt structure adjustments and fund transfers are low-frequency, stable, long-cycle decisions. There is an extremely serious mismatch between the two in terms of spatiotemporal scale. When conventional artificial intelligence models attempt to cross this scale gap, the optimization algorithm is very likely to fall into the dimensional explosion of the Markov state space.

[0057] Furthermore, if the supervised classification task for data assets is forcibly combined with the trial-and-error reinforcement learning task for debt optimization, the heterogeneous tasks are prone to mutual interference and gradient conflict in the underlying mathematical logic. Therefore, a typical unsolved problem in the current technical field is how to build a cross-scale artificial intelligence bridge, strictly convergent in mathematical optimization and highly stable in engineering execution, between the volatile high-frequency physical production domain and the low-frequency stable macro-financial domain.

[0058] To address the aforementioned core technical challenges, this invention overcomes the processing limitations of traditional artificial intelligence models based on a single time dimension. It proposes an intelligent processing architecture based on dynamic topology mapping of multi-source heterogeneous data and multi-scale feature fusion using a spatiotemporal graph attention network. Combined with a deep deterministic policy gradient network featuring orthogonal gradient projection intervention, this effectively solves the high-frequency noise interference in data asset quantification and the catastrophic forgetting problem during multi-task joint training in complex industrial IoT scenarios. Compared to existing technologies, this invention not only eliminates the spatiotemporal mapping gap between the physical and financial domains at the underlying feature transfer mechanism but also resolves gradient conflicts between heterogeneous tasks at the backpropagation mathematical level. Ultimately, it achieves an end-to-end high-precision, fully automated closed-loop control link from underlying physical device perception and high-dimensional feature downsampling and aggregation to top-level smart contract fund transfer, demonstrating significant system robustness and global optimization convergence advantages.

[0059] Example 1:

[0060] As shown in Figure 1, a method for intelligent classification of data assets and optimization of debt structure based on machine learning includes the following steps:

[0061] S1. Acquire multi-source heterogeneous time-series sensing data in the production field, calculate the dynamic mutual information entropy threshold of the time-series sensing data in the current time window, establish dynamic topological connections between the nodes corresponding to each time-series sensing data based on the dynamic mutual information entropy threshold, and construct a dynamic spatiotemporal data graph.

[0062] S2. Input the dynamic spatiotemporal data graph into the pre-trained spatiotemporal graph attention network ST-GAT for feature extraction, and output the data asset classification confidence matrix for the time-series sensing data, as well as the corresponding hidden layer semantic feature vector.

[0063] S3. Input the hidden layer semantic feature vector into the multi-scale spatiotemporal semantic pooling layer for time-scale frequency reduction and feature aggregation to generate an aggregated semantic vector. Then, concatenate the aggregated semantic vector with the baseline debt structure tensor of the current system to construct the state space of the Markov decision process (MDP) for reinforcement learning.

[0064] S4. Input the MDP state space into the Deep Deterministic Policy Gradient (DDPG) agent network, and output a multidimensional action tensor through the policy function built into the DDPG agent network. The DDPG agent network is pre-trained based on maximizing the joint reward function as the optimization objective. Generate dynamic debt structure parameter configuration instructions according to the multidimensional action tensor, and send the configuration instructions to the resource scheduling gateway of the artificial intelligence optimization operating system to automatically update the capitalization label of the corresponding data asset in the underlying database, and trigger the smart contract connected to the gateway to execute automated fund account allocation and transfer.

[0065] When applying the aforementioned core technologies to transform data asset classification and debt optimization into a feasible system, this invention faces a critical technological hurdle that must be overcome: the system must resolve the cross-domain mapping gap of feature semantics and the joint optimization gap of heterogeneous multi-tasks. That is, it must not only enable the underlying AI model to understand the "sensor language" of physical devices through rigorous mathematical logic and quantify it into "asset value," but also guide the reinforcement learning agent to find the globally optimal solution in a high-dimensional action space filled with hard financial constraints without compromising the accuracy of the pre-classification network. This requires a deep reconstruction of the feature transfer mechanism and the underlying logic of backpropagation to demonstrate the high maturity and feasibility of this solution in complex industrial scenarios.

[0066] Preferably, in step S1, "multi-source heterogeneous time-series sensing data" refers to the time series of underlying physical quantities collected by various sensors (such as temperature x, vibration frequency y) in the industrial field; in order to quantify the nonlinear spatial coupling relationship between these physical devices, the system introduces mutual information entropy as the basis for calculation.

[0067] Before calculating mutual information entropy, for multi-source heterogeneous time-series sensing data with inconsistent sampling frequencies, the system uses a preset high-frequency reference clock as a unified time axis and upsamples the low-frequency sensor data sequences onto it using linear interpolation or a zero-order hold algorithm. This ensures that, within the current sliding time window, the mutual information entropy I(X;Y) between any two device node sequences X and Y can be calculated as the relative entropy between the joint probability distribution and the product of the marginal probability distributions:

[0068] $$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$$

[0069] Furthermore, since the underlying sensing data is a continuous floating-point sequence, the system uses an equal-frequency binning algorithm or the Symbolic Aggregate approXimation (SAX) algorithm to discretize the continuous time-series sensing data into a finite-state discrete symbol sequence. Subsequently, the frequency of each discrete symbol within the time window is counted to estimate the discrete marginal probability distributions p(x) and p(y) and the joint probability distribution p(x,y). Finally, these values are substituted into the discrete mutual information entropy formula above.

[0070] The system dynamically extracts the statistical mean or a specific quantile of the mutual information entropy over all node pairs within the current time window as the dynamic mutual information entropy threshold.

[0071] If and only if I(X;Y) is greater than the dynamic mutual information entropy threshold, the system establishes directed edge connections in the graph adjacency matrix A in memory and assigns I(X;Y) as the edge weight, thereby completing the construction of the underlying data structure of the "dynamic spatiotemporal data graph".
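The graph construction in step S1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the sensor series are already aligned to a common clock, uses equal-frequency binning with a hypothetical bin count of 4, takes the mean over node pairs as the dynamic threshold, and stores each edge in both directions because mutual information is symmetric. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import mean


def equal_frequency_bins(series, n_bins=4):
    """Discretize a continuous series into n_bins symbols of roughly equal count."""
    order = sorted(range(len(series)), key=lambda i: series[i])
    symbols = [0] * len(series)
    for rank, idx in enumerate(order):
        symbols[idx] = min(rank * n_bins // len(series), n_bins - 1)
    return symbols


def mutual_information(xs, ys):
    """Discrete mutual information I(X;Y) in nats from two symbol sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def build_graph(window, n_bins=4):
    """window: dict node name -> aligned list of floats.

    Returns (adjacency, threshold), where adjacency maps (src, dst) to the
    mutual information used as the edge weight.
    """
    symbols = {k: equal_frequency_bins(v, n_bins) for k, v in window.items()}
    mi = {
        (a, b): mutual_information(symbols[a], symbols[b])
        for a, b in combinations(sorted(symbols), 2)
    }
    threshold = mean(mi.values())  # dynamic threshold: mean over node pairs
    adjacency = {}
    for (a, b), value in mi.items():
        if value > threshold:  # strict inequality, as in the text
            adjacency[(a, b)] = value
            adjacency[(b, a)] = value  # assumption: store both directions
    return adjacency, threshold
```

Calling `build_graph` once per sliding window yields a fresh adjacency structure each time, which is what makes the graph "dynamic"; swapping `mean` for a quantile changes only the threshold line.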

[0072] Preferably, in step S2, the Spatiotemporal Graph Attention Network (ST-GAT) is a deep learning architecture for processing graph-structured data. Its core is the spatial graph convolution mechanism: the hidden state $h_i^{(l+1)}$ of the central node $i$ at layer $l+1$ is updated by aggregating the features $h_j^{(l)}$ of its neighboring nodes $j$, specifically:

[0073] $$h_i^{(l+1)} = \sigma\Big(\sum_{j \in \mathcal{N}(i)} \alpha_{ij}\, W h_j^{(l)}\Big)$$

[0074] where $W$ is the learnable weight matrix and $\sigma$ is a nonlinear activation function; the attention weight $\alpha_{ij}$ represents the spillover effect of neighboring device $j$ on the asset value of the central device $i$. The system calculates this weight through a single-layer feedforward neural network:

[0075] $$\alpha_{ij} = \frac{\exp\big(\mathrm{LeakyReLU}\big(a^{\top}[W h_i^{(l)} \,\|\, W h_j^{(l)}]\big)\big)}{\sum_{k \in \mathcal{N}(i)} \exp\big(\mathrm{LeakyReLU}\big(a^{\top}[W h_i^{(l)} \,\|\, W h_k^{(l)}]\big)\big)}$$

[0076] where $\|$ denotes the concatenation of feature vectors, $a$ is the shared attention mechanism parameter vector, LeakyReLU is a nonlinear activation function, and $\mathcal{N}(i)$ is the set of neighboring nodes of the central node $i$.
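The spatial attention update for a single central node can be sketched as below. This is an illustrative sketch only: it uses plain Python lists rather than a tensor framework, assumes tanh as the outer nonlinearity (the text does not name one), and the function and variable names are hypothetical.

```python
import math


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_update(h, neighbors, W, a, i):
    """Return (new hidden state of node i, attention weights over its neighbors).

    h: dict node -> feature list; neighbors: dict node -> list of neighbor nodes;
    W: weight matrix as a list of rows; a: attention vector of length 2 * len(W).
    """
    wh = {n: matvec(W, h[n]) for n in [i, *neighbors[i]]}
    # Unnormalized scores: LeakyReLU(a^T [W h_i || W h_j])
    scores = {
        j: leaky_relu(sum(ak * v for ak, v in zip(a, wh[i] + wh[j])))
        for j in neighbors[i]
    }
    # Softmax over the neighborhood, shifted by the max for numerical stability
    top = max(scores.values())
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    alpha = {j: e / total for j, e in exps.items()}
    # Attention-weighted aggregation followed by the assumed tanh nonlinearity
    agg = [sum(alpha[j] * wh[j][k] for j in neighbors[i]) for k in range(len(W))]
    return [math.tanh(v) for v in agg], alpha
```

With an all-zero attention vector every neighbor receives the same weight, so the update reduces to a plain mean over the neighborhood; a trained `a` shifts weight toward neighbors whose transformed features matter more to the central device.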

[0077] After extracting spatial topological features, the temporal self-attention mechanism maps the spatial feature sequence of the central node across consecutive historical time steps into a query matrix Q, a key matrix K, and a value matrix V, and calculates the attention weights in the temporal dimension. The temporal decay feature $H_{time}$ is calculated as:

[0078] $$H_{time} = \mathrm{softmax}\Big(\frac{QK^{\top}}{\sqrt{d_k}}\Big) V$$

[0079] where $d_k$ is the scaling dimension of the feature vector, and the softmax function assigns decay weights to the different historical time steps with respect to the asset value at the current moment. Finally, the system fuses the spatial features with the temporal decay feature $H_{time}$ through a residual connection and layer normalization.
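The temporal self-attention step is the standard scaled dot-product form, which can be sketched as follows. The sketch covers only the attention computation; the projections that produce Q, K, and V, the residual connection, and the layer normalization are omitted, and the function names are hypothetical.

```python
import math


def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(Q, K, V):
    """Scaled dot-product attention over T time steps.

    Q, K, V: lists of T row vectors. Returns (outputs, weights), where
    weights[t] holds the decay weights that step t assigns to every step.
    """
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wt * v[c] for wt, v in zip(w, V)) for c in range(len(V[0]))])
    return outputs, weights
```

Each row of `weights` sums to 1, so the output for a time step is a convex combination of the historical value vectors.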

[0080] After multi-layer extraction, the data asset classification confidence matrix is output by the Softmax layer at the end of the network. It represents the probability distribution of the data source being rated as a core asset, a general asset, or an archived asset.

[0081] The hidden semantic feature vector is extracted from the penultimate layer of the network and contains the deep feature expression of the input data after multiple nonlinear transformations.

[0082] Preferably, in step S3, the multi-scale spatiotemporal semantic pooling layer, as a core component connecting the high-frequency physical domain and the low-frequency financial domain, is responsible for performing the time-scale downscaling operation, that is, smoothly accumulating the features of multiple high-frequency time steps.

[0083] Preferably, in step S3, feature concatenation refers to directly concatenating the aggregated semantic vector generated after frequency reduction with the benchmark debt structure tensor representing the current financial state along the feature dimension within the tensor computation framework. The concatenated complete tensor constitutes the state $s_t$ in the Markov Decision Process (MDP) state space through which the reinforcement learning agent observes the environment.

[0084] Preferably, in step S3, the Markov Decision Process (MDP) state space is the mathematical carrier for the reinforcement learning agent to observe the environment, which contains all the current environmental features required for the agent to make the next decision.

[0085] Preferably, in step S4, the Deep Deterministic Policy Gradient (DDPG) agent network is based on an Actor-Critic architecture, where the Actor network (i.e., the policy function π) directly receives the state $s_t$ and outputs a deterministic multidimensional action tensor $a_t$. The tensor is normalized to the [-1, 1] interval by the Tanh activation function, and its components are then mapped to the specific adjustment ratios of long-term debt and short-term debt, respectively.

[0086] Preferably, in step S4, at the physical execution level, the system does not merely output a suggestion report. Instead, it calls the API of the underlying relational database through the resource scheduling gateway and executes an UPDATE statement to update the capitalization label. Simultaneously, the gateway triggers a smart contract via the RPC protocol, that is, it sends an encrypted transaction carrying the action parameters; the smart contract then calls the bank's open API to automatically allocate and transfer funds between accounts, completing a closed loop from algorithm derivation to physical execution.
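The database side of this step can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the text does not specify the database engine or schema, so the table name `data_assets`, the columns `asset_id` and `capitalization_label`, and the helper function are all hypothetical. The smart contract RPC call is left out because its interface is not described.

```python
import sqlite3


def update_capitalization_label(conn, asset_id, label):
    """Run a parameterized UPDATE and return the number of rows changed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE data_assets SET capitalization_label = ? WHERE asset_id = ?",
            (label, asset_id),
        )
    return cur.rowcount


# In-memory demo database with a hypothetical schema
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_assets (asset_id TEXT PRIMARY KEY, capitalization_label TEXT)"
)
conn.execute("INSERT INTO data_assets VALUES ('sensor-001', 'general')")
rows_changed = update_capitalization_label(conn, "sensor-001", "core")
```

Using bound parameters rather than string formatting keeps the label update safe even if asset identifiers come from upstream devices.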

[0087] Example 2:

[0088] This optimized embodiment employs a multi-scale spatiotemporal semantic pooling mechanism with a time forgetting factor. Its core optimization principle lies in solving the spatiotemporal scale mismatch between high-frequency physical data and low-frequency financial decisions. By introducing an asset activity decay function, the system exponentially smooths the accumulation of high-frequency features. Only when the accumulated semantic energy exceeds a preset threshold is a low-frequency MDP state update triggered, thereby building a bridge for cross-domain frequency reduction mapping at the underlying mathematical logic.

[0089] In specific system engineering implementations, the Spatiotemporal Graph Attention Network (ST-GAT) outputs a high-dimensional hidden semantic feature vector $h_t \in \mathbb{R}^d$ at each high-frequency physical clock cycle t, where d is the feature dimension.

[0090] The multi-scale spatiotemporal semantic pooling layer internally maintains a dynamic feature accumulation tensor register $E_t$ and a sliding feature cache queue Q.

[0091] Preferably, when the system receives $h_t$, it first pushes $h_t$ into queue Q and then updates the accumulation register using a predefined asset activity decay function. The tensor accumulation formula is as follows:

[0092] $$E_t = \lambda\, E_{t-1} + h_t$$

[0093] In this formula, $E_t$ is the cumulative feature tensor updated at the current time step, $E_{t-1}$ is the accumulated state from the previous time step, and $\lambda \in (0, 1)$ is the preset time forgetting factor. Its physical significance is that the influence of historical physical data on the current macroeconomic value of assets decays exponentially over time.

[0094] Preferably, in order to determine whether the current accumulated features are sufficient to trigger a revaluation of macroeconomic asset values, the system calculates the $L_2$ norm of the accumulated feature tensor $E_t$ and uses it as the scalar asset activity evaluation value $A_t$ at the current moment:

[0095] $$A_t = \|E_t\|_2 = \sqrt{\sum_{i=1}^{d} e_{t,i}^2}$$

[0096] where $e_{t,i}$ is the component of tensor $E_t$ in the $i$-th dimension.

[0097] Preferably, during each high-frequency cycle, the system compares the activity evaluation value $A_t$ in real time with a preset state trigger threshold $\tau$. This threshold is pre-calibrated using a grid search based on the historical volatility of the company's asset inventory.

[0098] Trigger condition (when $A_t \ge \tau$): the system determines that there has been a substantial change in the data quality or operating condition of the underlying physical device, and performs an average pooling operation on all historical feature vectors in the cache queue Q to generate the low-frequency aggregated semantic vector $\bar{h}$:

[0099] $$\bar{h} = \frac{1}{|Q|} \sum_{h_k \in Q} h_k$$

[0100] The system then outputs $\bar{h}$ to the MDP state space, forcing the DDPG agent to perform one low-frequency action inference; at the same time, it resets the register ($E_t = 0$) and clears queue Q, starting the next frequency reduction cycle.

[0101] Maintenance condition (when $A_t < \tau$): the system determines that the current physical data fluctuation is normal high-frequency production noise and has not yet reached the level that warrants asset value revaluation. In this case, the system keeps the aggregated semantic vector of the previous low-frequency cycle unchanged, blocks the transmission of features to the next stage, and does not trigger an inference update of the DDPG agent.
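The accumulate, test, and trigger cycle of the pooling layer can be sketched as a small stateful class. This is a sketch under assumptions: the decay update is taken to be "forgetting factor times previous register plus new feature", the activity value is taken to be the L2 norm of the register, and the class name, parameter names, and default values are hypothetical.

```python
import math
from collections import deque


class SemanticPooling:
    """Event-triggered frequency reduction from high-frequency features."""

    def __init__(self, dim, forgetting=0.9, threshold=2.0):
        self.forgetting = forgetting      # assumed update: E_t = forgetting * E_{t-1} + h_t
        self.threshold = threshold        # preset state trigger threshold
        self.register = [0.0] * dim       # accumulation register
        self.queue = deque()              # feature cache queue Q
        self.last_vector = None           # aggregated vector of the previous cycle

    def step(self, h_t):
        """Feed one feature vector; return (aggregated_vector, triggered)."""
        self.queue.append(list(h_t))
        self.register = [self.forgetting * e + h for e, h in zip(self.register, h_t)]
        activity = math.sqrt(sum(e * e for e in self.register))  # assumed L2 norm
        if activity < self.threshold:
            # Maintenance condition: hold the previous vector, no agent inference
            return self.last_vector, False
        # Trigger condition: average-pool the queue, then reset register and queue
        n = len(self.queue)
        self.last_vector = [sum(col) / n for col in zip(*self.queue)]
        self.register = [0.0] * len(self.register)
        self.queue.clear()
        return self.last_vector, True
```

A caller would concatenate the returned vector with the benchmark debt structure tensor and run the agent only when `triggered` is true, which is what decouples the agent's decision rate from the sensor clock.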

[0102] This embodiment uses a quantified threshold to control frequency reduction and interception, filtering out over 90% of useless high-frequency noise in the industrial IoT. This prevents the DDPG network from performing invalid matrix multiplication inference every millisecond, greatly reducing the computational overhead and I/O latency of the AI-optimized operating system. At the same time, it maps the massive and disordered high-frequency features into a stable, sparse state transition sequence with strong economic regularity, constructing a stable Markov environment for the reinforcement learning agent. This ensures that the output debt structure configuration parameters do not oscillate violently at high frequency, guaranteeing the global stability of automated fund transfer instructions in the macro-financial environment.

[0103] Example 3:

[0104] This optimized embodiment refines the core training objective of the DDPG agent network. In conventional general reinforcement learning frameworks, agents can only receive a single scalar reward and cannot understand complex macro-financial business logic. This embodiment adopts a dynamic soft-constraint reward function that integrates financial costs and asset liquidity. Its core optimization principle is to transform the abstract debt structure optimization requirement into a differentiable and quantifiable mathematical optimization objective of the deep reinforcement learning algorithm. Through a carefully designed reward function, the agent is guided to explore the global optimal solution in a multi-dimensional action space that balances minimizing financing costs and maximizing asset liquidity.

[0105] In specific system engineering implementations, the Actor network of the DDPG agent takes the state $s_t$ at the current time step t and outputs a multidimensional action tensor.

[0106] Preferably, the system first defines the multidimensional action tensor $a_t$ output by the policy function π. To ensure that the financial adjustment parameters output by the agent stay within the compliance range, the system applies the Tanh activation function at the output layer of the Actor network to restrict the raw output to the range [-1, 1]. Subsequently, the system maps the normalized tensor components to the actual business adjustment ratios through a linear mapping:

[0107] $$a_t = \pi(s_t) = [a_t^{L},\ a_t^{S},\ a_t^{H}]^{\top}$$

[0108] where $a_t^{L}$ characterizes the long-term debt adjustment ratio, $a_t^{S}$ characterizes the short-term debt adjustment ratio, and $a_t^{H}$ characterizes the capitalization hedging ratio of data assets.
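The Tanh squashing and linear mapping of the Actor output can be illustrated as follows. The text does not give the coefficients of the linear mapping, so the per-component maximum ratios below are hypothetical placeholders, as are the function and key names.

```python
import math

# Hypothetical maximum adjustment ratio per action component
MAX_RATIO = {"long_term": 0.10, "short_term": 0.20, "hedging": 0.50}


def scale_action(raw_outputs, max_ratio=MAX_RATIO):
    """Squash raw Actor outputs with tanh, then scale each to its business range."""
    squashed = [math.tanh(x) for x in raw_outputs]  # each value lies in [-1, 1]
    return {name: s * max_ratio[name] for name, s in zip(max_ratio, squashed)}
```

Because tanh bounds every component before scaling, no raw network output, however large, can produce an adjustment ratio outside the configured range.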

[0109] During the pre-joint training phase, the system constructs a joint reward function based on simulation environment interaction. The function consists of three core components: a financing cost penalty term, an asset liquidity gain term, and a hard financial default penalty term. The specific calculation formula is as follows:

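[0110] $$R_t = -\omega_1 C_{fin} + \omega_2 G_{liq} - P_{def}$$ where $C_{fin}$ is the financing cost penalty term, $G_{liq}$ is the asset liquidity gain term, $P_{def}$ is the hard financial default penalty term, and $\omega_1$ and $\omega_2$ are weighting coefficients.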

[0111] The quantification logic for each component of the reward function is as follows:

[0112] Comprehensive financing cost estimate $C_{fin}$: the system has a built-in financial simulator based on historical interest rate curves.

[0113] When the agent outputs an action $a_t$, the simulator extracts the initial total long-term debt $D_L$, the initial total short-term debt $D_S$, and the corresponding market annualized interest rates $r_L$ and $r_S$ from the current benchmark debt matrix. The simulator then simulates the debt ratio adjustment and estimates the comprehensive financing cost $C_{fin}$ as follows:

[0114] $$C_{fin} = D_L (1 + a_t^{L})\, r_L + D_S (1 + a_t^{S})\, r_S - a_t^{H}\, V_{data}\, \delta$$

[0115] The first two terms are the adjusted interest costs of long-term and short-term debt; $V_{data}$ is the total value of the data assets assessed by the system, and $\delta$ is the preferential discount rate for data asset capitalization pledging. The lower $C_{fin}$ is, the higher the reward the agent receives.

[0116] Asset liquidity realization gain $G_{liq} = \sum_{i=1}^{N} p_i k_i$: here N is the total number of preset asset categories (e.g., core trade secret level assets, general production factor level assets, and low value archive level assets, N=3); $p_i$ is the confidence score (Softmax probability value) output by the ST-GAT classification network for the $i$-th asset class; and $k_i$ is the system's preset liquidity conversion factor for the $i$-th asset class. This term compels the agent to prioritize the actual monetization capability of current high-confidence data assets when adjusting debt.

[0117] Hard financial default penalty $P_{def}$: the system presets hard limits (red lines) such as "the debt-to-asset ratio shall not exceed 75%" or "the current ratio shall not be lower than 1.2".

[0118] After the simulated execution of $a_t$, if the system detects that any red line has been breached, for example the current debt-to-asset ratio $\rho_t$ exceeds the upper limit $\rho_{max}$, a hard financial default penalty is triggered. To prevent a fixed large penalty from causing gradient explosion in the reinforcement learning network, $P_{def}$ uses a dynamic adaptive penalty function: $$P_{def} = M \cdot C_{base} \cdot e^{\rho_t - \rho_{max}}$$ where $M$ is the preset magnification factor and $C_{base}$ is the absolute financing cost under the current benchmark conditions. This dynamic mechanism ensures that the penalty is scaled to the current financial level and increases exponentially with the degree of default.

[0119] If no red line is breached, then $P_{def} = 0$. This mechanism ensures that the agent quickly learns to avoid dangerous actions that could lead to the company's bankruptcy during exploration.

[0120] Weighting coefficients $\omega_1$ and $\omega_2$: preset hyperparameters used to balance cost reduction against efficiency improvement in the overall optimization goal.
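The three reward components and their combination can be sketched as plain functions. This is an illustrative reading of the description, not a confirmed formula: the hedging offset in the financing cost, the exponential form of the penalty, the signs in the final combination, and every function and parameter name are assumptions.

```python
import math


def financing_cost(action, debt_long, debt_short, rate_long, rate_short,
                   data_value, pledge_discount):
    """Adjusted interest on both debt classes minus the assumed pledging offset."""
    a_long, a_short, a_hedge = action
    return (debt_long * (1 + a_long) * rate_long
            + debt_short * (1 + a_short) * rate_short
            - a_hedge * data_value * pledge_discount)


def liquidity_gain(confidences, conversion_factors):
    """Confidence-weighted sum of per-class liquidity conversion factors."""
    return sum(p * k for p, k in zip(confidences, conversion_factors))


def default_penalty(debt_ratio, ratio_limit, magnification, baseline_cost):
    """Zero inside the red line; otherwise scaled to the baseline cost and
    growing exponentially with the size of the breach (assumed form)."""
    if debt_ratio <= ratio_limit:
        return 0.0
    return magnification * baseline_cost * math.exp(debt_ratio - ratio_limit)


def joint_reward(cost, gain, penalty, w_cost=1.0, w_gain=1.0):
    """Lower cost and penalty raise the reward; higher liquidity gain raises it."""
    return -w_cost * cost + w_gain * gain - penalty
```

Keeping the components as separate functions makes it easy to inspect which term dominates the reward for a given action during simulation.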

[0121] This embodiment introduces a very large hard penalty for default. It forces the DDPG agent to converge to a compliant policy space early in the training phase, avoiding system crashes or violations of financial regulations caused by illegal fund transfer instructions during the actual physical execution phase. It also embeds the classification confidence directly into the reward function. Traditional fragmented systems have no knowledge of whether the underlying assets are "good" or "bad" when optimizing debt, whereas the agent of this invention, in the course of maximizing the reward, spontaneously learns an advanced strategy: when the underlying high-frequency data indicates high-confidence, high-quality general-purpose assets, the agent tends to increase short-term debt to expand production scale; when the underlying data exhibits low liquidity characteristics, the agent automatically shortens the duration of liabilities. This dynamic optimization capability based on real-time fluctuations in underlying physical data significantly improves the risk resistance and capitalization return of the enterprise's capital account allocation.

[0122] Example 4:

[0123] This embodiment refines the core methods of heterogeneous multi-task joint training. In conventional end-to-end joint computing architectures, when the system attempts to simultaneously optimize the pre-supervised classification network ST-GAT and the post-reinforcement learning network DDPG, it faces fatal gradient tearing and catastrophic forgetting problems. This embodiment adopts an adaptive joint loss function based on homoscedastic uncertainty and an orthogonal gradient projection intervention algorithm. The core optimization principle is that, during the backpropagation stage of the model's pre-joint training, the system can not only dynamically balance the dimensional differences between the classification task and the trial-and-error exploration task, but also forcibly eliminate the adversarial gradients generated by the two tasks when updating the shared feature representation layer from the underlying tensor computation graph, thereby ensuring the absolute stability of classification feature extraction and the non-interference of reinforcement learning optimization exploration.

[0124] In specific system engineering implementations, the spatiotemporal graph attention network and the deep deterministic policy gradient network share some of the underlying multilayer perceptron feature representation layer parameters at the physical level. In each batch iteration of the pre-joint training, the system performs the following underlying mathematical intervention steps:

[0125] Preferably, to address the significant difference in numerical magnitude between the classification cross-entropy loss $L_{cls}$ and the reinforcement learning policy gradient loss $L_{rl}$, the system incorporates homoscedastic uncertainty theory from multi-task learning, adding two learnable scalar noise parameters $\sigma_1$ and $\sigma_2$ to the computational graph to dynamically adjust the weights of the two tasks. The joint loss function $L_{joint}$ is calculated as follows:

[0126] $$L_{joint} = \frac{1}{2\sigma_1^2} L_{cls} + \frac{1}{2\sigma_2^2} L_{rl} + \log \sigma_1 + \log \sigma_2$$

[0127] In this formula, when the loss variance (i.e., uncertainty) $\sigma^2$ of a task increases, the system automatically reduces that task's weighting factor $1/(2\sigma^2)$ in the total loss, which prevents high-noise tasks from dominating the update direction of the entire network. At the same time, the regularization term $\log \sigma$ prevents the network from increasing $\sigma$ without limit in order to evade the loss penalty.
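The uncertainty-weighted joint loss can be written compactly as below. This sketch follows the commonly used homoscedastic uncertainty weighting and parameterizes each noise scale by its logarithm for numerical stability, which is an assumption rather than something the text states. In real training the two log-sigma values would be learnable parameters in an automatic differentiation framework; here they are plain floats, and the function name is hypothetical.

```python
import math


def joint_loss(loss_cls, loss_rl, log_sigma_cls, log_sigma_rl):
    """L = L_cls / (2 s1^2) + L_rl / (2 s2^2) + log s1 + log s2."""
    w_cls = 0.5 * math.exp(-2.0 * log_sigma_cls)  # equals 1 / (2 * sigma_cls^2)
    w_rl = 0.5 * math.exp(-2.0 * log_sigma_rl)    # equals 1 / (2 * sigma_rl^2)
    return w_cls * loss_cls + w_rl * loss_rl + log_sigma_cls + log_sigma_rl
```

Raising a task's log-sigma shrinks its weight but adds to the regularization term, so the optimizer cannot drive the loss down simply by declaring a task infinitely noisy.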

[0128] Preferably, after calculating $L_{joint}$, the system invokes the automatic differentiation engine to calculate the classification gradient vector $g_{cls}$ that the joint loss function generates for the classification task and the policy gradient vector $g_{rl}$ generated by the reinforcement learning task.

[0129] Subsequently, before updating the underlying shared parameters $\theta$, the system calculates the dot product of these two high-dimensional gradient vectors (i.e., the numerator of the cosine similarity): $g_{cls} \cdot g_{rl}$.

[0130] Preferably, the system determines the sign of the dot product in real time to detect whether gradient antagonism exists between tasks:

[0131] Non-conflict state (when $g_{cls} \cdot g_{rl} \ge 0$): this indicates that the angle between the update directions of the classification task and the reinforcement learning task in the current parameter space is less than or equal to 90 degrees, and the two tasks reinforce each other. In this case, the system does not intervene in the gradients and directly adds the original gradients for the gradient descent update: $\theta \leftarrow \theta - \eta\,(g_{cls} + g_{rl})$, where $\eta$ is the learning rate.

[0132] Conflict state (when $g_{cls} \cdot g_{rl} < 0$): this indicates that the angle between the update directions of the two tasks is greater than 90 degrees, meaning that the "brute force gradient" of reinforcement learning is attempting to erase the feature extraction weights that the classification network has learned. In this case, the system enforces orthogonal gradient projection: it projects the policy gradient $g_{rl}$ onto the normal plane of the classification gradient $g_{cls}$ to generate the modified update gradient $g'_{rl}$. The geometric projection formula is as follows:

[0133] $$g'_{rl} = g_{rl} - \frac{g_{rl} \cdot g_{cls}}{\|g_{cls}\|^2}\, g_{cls}$$

[0134] Using this projection formula, the system removes from the reinforcement learning gradient $g_{rl}$ the component that directly opposes the classification gradient $g_{cls}$. Subsequently, the system uses the classification gradient $g_{cls}$, which retains its original direction, together with the modified gradient $g'_{rl}$, which no longer damages classification, to update the parameters of the shared layer synchronously: $\theta \leftarrow \theta - \eta\,(g_{cls} + g'_{rl})$.
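The conflict test and orthogonal projection can be sketched on flat gradient vectors as follows. This is a minimal illustration using Python lists; in a real system the gradients would be flattened tensors from the shared layers, and the function names and default learning rate are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project_policy_gradient(g_cls, g_rl):
    """Return g_rl unchanged when it does not conflict with g_cls; otherwise
    remove its component along g_cls so the result is orthogonal to g_cls."""
    inner = dot(g_rl, g_cls)
    norm_sq = dot(g_cls, g_cls)
    if inner >= 0 or norm_sq == 0:
        return list(g_rl)
    scale = inner / norm_sq
    return [r - scale * c for r, c in zip(g_rl, g_cls)]


def shared_update(theta, g_cls, g_rl, lr=0.01):
    """One gradient descent step on the shared parameters."""
    g_rl_proj = project_policy_gradient(g_cls, g_rl)
    return [t - lr * (c + r) for t, c, r in zip(theta, g_cls, g_rl_proj)]
```

After projection the dot product of the two gradients is zero, so to first order the reinforcement learning update no longer moves the shared parameters against the classification objective.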

[0135] In traditional joint training, the high variance gradient of reinforcement learning can instantly destroy the weights of the classification network, causing a precipitous drop in the accuracy of data asset classification. The orthogonal projection mechanism of this invention ensures that no matter how frantically the DDPG agent explores debt allocation actions, its backpropagated gradient will never negatively interfere with the already converged classification feature extraction capability of ST-GAT. At the same time, this embodiment automatically balances the learning pace of supervised learning and reinforcement learning through an adaptive loss function, enabling the AI-optimized operating system to simultaneously improve the accuracy of physical data understanding and the ability to find financial assets in a unified computational flow before deployment. This greatly shortens the model training cycle and significantly improves the global optimal solution achievement rate of the system's final output fund account allocation instructions.

[0136] Example 5:

[0137] As shown in Figure 2, this embodiment provides a data asset entry intelligent classification and debt structure optimization system based on machine learning. The system is deployed in an artificial intelligence optimized operating system to achieve fully automated closed-loop control from the perception of the underlying physical devices to the intelligent transfer of top-level financial assets.

[0138] The system specifically includes a data perception and graph construction module, an asset classification and feature extraction module, a state space mapping module, and an optimization control and instruction generation module. The data perception and graph construction module is used to acquire multi-source heterogeneous time-series perception data in the production field, calculate the dynamic mutual information entropy threshold of the time-series perception data within the current time window, and establish dynamic topological connections between the nodes corresponding to each time-series perception data based on the dynamic mutual information entropy threshold to construct a dynamic spatiotemporal data graph.

[0139] The asset classification and feature extraction module is communicatively connected to the data perception and graph construction module. It is used to input the dynamic spatiotemporal data graph into a pre-trained spatiotemporal graph attention network for feature extraction and output the data asset classification confidence matrix and the corresponding hidden layer semantic feature vector for the time-series perceived data.

[0140] The state space mapping module receives the hidden layer semantic feature vector, inputs it into the multi-scale spatiotemporal semantic pooling layer for time-scale frequency reduction and feature aggregation to generate an aggregated semantic vector, and concatenates the aggregated semantic vector with the baseline debt structure tensor of the current system to construct the state space of the Markov decision process for reinforcement learning.

[0141] The optimization control and instruction generation module receives the state space of the Markov decision process and inputs it into the deep deterministic policy gradient agent network, outputting a multi-dimensional action tensor through a built-in policy function. This module then generates dynamic debt structure parameter configuration instructions based on the multi-dimensional action tensor, and finally sends these configuration instructions to the resource scheduling gateway of the AI optimization operating system. This automatically updates the capitalization labels of the corresponding data assets in the underlying database and triggers the smart contract connected to the gateway to execute automated fund allocation and transfer.

[0142] In order to enable the above system and algorithm to move beyond pure mathematical derivation and be truly implemented in an industrial environment, this embodiment also provides an electronic device as a physical hardware carrier for executing the above artificial intelligence algorithm and control logic. The electronic device includes a physical communication bus and a processor and a memory that are electrically connected to the physical communication bus.

[0143] The processor can be a physical computing core with high-performance tensor matrix operations and floating-point concurrent computing capabilities, such as a central processing unit, graphics processing unit, tensor processing unit, or dedicated neural network acceleration chip. The memory is used to store the computer program and high-dimensional weight parameters of the network model; it can include volatile random access memory and non-volatile read-only memory. When the electronic device is powered on, the processor reads the computer program from the memory via a physical communication bus and loads it into memory for execution, thereby fully implementing all the calculation and control steps of the machine learning-based intelligent classification and debt structure optimization methods described in Examples 1 to 4 at the physical hardware level.

[0144] Furthermore, this embodiment also provides a non-volatile computer-readable storage medium for persistently storing the computer-executable instructions and algorithm models corresponding to the above methods. The computer-readable storage medium can take the form of a solid-state drive, a mechanical hard drive, a flash drive, or a cloud-based distributed storage cluster. The computer-readable storage medium stores a computer program. When the computer program is read and executed by the physical processor of the aforementioned electronic device, it can drive the physical processor to complete all core algorithm flow and instruction issuance operations, such as the construction of dynamic spatiotemporal data maps, the frequency reduction pooling of multi-scale spatiotemporal features, the orthogonal gradient projection intervention of the joint loss function, and the automated fund account allocation and transfer based on reinforcement learning.

[0145] In summary, this invention proposes a machine learning-based intelligent classification method and debt structure optimization system for data assets. By constructing a spatiotemporal data graph through dynamic mutual information entropy calculation and extracting multi-scale features using a spatiotemporal graph attention network, it successfully bridges the semantic mapping gap between underlying high-frequency physical perception data and high-level macro-financial asset value. Furthermore, this invention innovatively introduces a multi-scale spatiotemporal semantic pooling layer based on an asset activity decay function and a joint training mechanism based on orthogonal gradient projection intervention. This not only completely eliminates the dimensionality explosion and spatiotemporal scale mismatch problems faced by reinforcement learning agents from the underlying mathematical computation architecture, but also fundamentally resolves the gradient tearing and catastrophic forgetting phenomena caused by synchronous backpropagation updates in heterogeneous multi-task networks. Finally, it constructs a deep reinforcement learning optimization link that integrates comprehensive financing costs and dynamic soft constraints on asset liquidity, realizing end-to-end fully automated closed-loop control from the underlying physical data perception of the Industrial Internet of Things to the top-level resource scheduling gateway and smart contract fund account allocation and transfer. This significantly improves the assessment accuracy of enterprise data asset capitalization and the global optimal convergence stability of dynamic adjustment of macro debt structure.

[0146] Based on the preferred embodiments of the present invention described above, those skilled in the art can make various changes and modifications without departing from the inventive concept. The technical scope of this invention is not limited to the contents of the specification, but must be determined according to the scope of the claims.

Claims

1. A method for intelligent classification of data assets and optimization of debt structure based on machine learning, characterized in that it includes the following steps:
S1. Acquire multi-source heterogeneous time-series sensing data in the production field, calculate the dynamic mutual information entropy threshold of the time-series sensing data in the current time window, establish dynamic topological connections between the nodes corresponding to each time-series sensing data based on the dynamic mutual information entropy threshold, and construct a dynamic spatiotemporal data graph.
S2. Input the dynamic spatiotemporal data graph into the pre-trained spatiotemporal graph attention network ST-GAT for feature extraction, and output the data asset classification confidence matrix for the time-series sensing data, as well as the corresponding hidden layer semantic feature vector.
S3. Input the hidden layer semantic feature vector into the multi-scale spatiotemporal semantic pooling layer for time-scale frequency reduction and feature aggregation to generate an aggregated semantic vector. Then, concatenate the aggregated semantic vector with the baseline debt structure tensor of the current system to construct the state space of the Markov decision process (MDP) for reinforcement learning.
S4. Input the MDP state space into the Deep Deterministic Policy Gradient (DDPG) agent network, and output a multidimensional action tensor through the policy function built into the DDPG agent network. The DDPG agent network is pre-trained with maximization of the joint reward function as the optimization objective. Generate dynamic debt structure parameter configuration instructions according to the multidimensional action tensor, and send the configuration instructions to the resource scheduling gateway of the artificial intelligence optimization operating system to automatically update the capitalization label of the corresponding data asset in the underlying database, and trigger the smart contract connected to the gateway to execute automated fund account allocation and transfer.

2. The method for intelligent classification of data assets and optimization of debt structure based on machine learning according to claim 1, characterized in that the specific process of constructing the dynamic spatiotemporal data graph in step S1 includes: within the current time window, calculating the mutual information entropy between any two of the time-series sensing data nodes; determining whether the mutual information entropy is greater than the dynamic mutual information entropy threshold; if yes, establishing a directed edge between the two corresponding time-series sensing data nodes and using the value of the mutual information entropy as the initial connection weight of the directed edge; if no, disconnecting the corresponding node connection; and, using the time-series sensing data nodes as graph nodes and the directed edges and connection weights as the graph structure, generating a dynamic spatiotemporal data graph representing the topological relationships of production physical quantities.

3. The method for intelligent classification of data assets and optimization of debt structure based on machine learning according to claim 1, characterized in that the specific process of feature extraction by the spatiotemporal graph attention network ST-GAT in step S2 comprises: using a spatial graph convolution mechanism to calculate, based on the connection weights between nodes, the clustering value density of the target features of adjacent nodes relative to the central node, so as to extract spatial topological features; using a temporal self-attention mechanism to assign attention weights to the historical time-step features of the central node, so as to extract temporal decay features; and fusing the spatial topological features with the temporal decay features and mapping the result to a preset asset class space through a multilayer perceptron, wherein the data asset classification confidence matrix is output and the output of the penultimate layer of the multilayer perceptron is extracted as the hidden layer semantic feature vector.
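
As a rough illustration of the data flow in this claim, the following NumPy sketch runs a single forward pass with untrained weights. The weighted-mean neighbour aggregation, the scaled dot-product attention, the tanh hidden layer and all function and parameter names are assumptions made for the example; the claim itself does not fix these details.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def st_gat_forward(x, adjacency, w_hidden, w_out):
    """x: (n_nodes, n_steps, d) node features over the time window.
    adjacency: (n_nodes, n_nodes); adjacency[i, j] is the weight of edge i -> j.
    w_hidden: (2 * d, h) and w_out: (h, n_classes) are the perceptron weights.
    Returns (confidence, hidden): class confidences and penultimate features.
    """
    latest = x[:, -1, :]                                  # (n, d)
    # Spatial step: weighted mean of in-neighbour features per central node.
    in_w = adjacency.T                                    # row i: edges j -> i
    norm = np.maximum(in_w.sum(axis=1, keepdims=True), 1e-12)
    spatial = (in_w @ latest) / norm                      # (n, d)
    # Temporal step: the latest step attends over the node's own history.
    scores = np.einsum("nd,ntd->nt", latest, x) / np.sqrt(x.shape[-1])
    attn = softmax(scores, axis=1)                        # (n, t)
    temporal = np.einsum("nt,ntd->nd", attn, x)           # (n, d)
    # Fusion and classification head.
    fused = np.concatenate([spatial, temporal], axis=1)   # (n, 2d)
    hidden = np.tanh(fused @ w_hidden)                    # penultimate layer
    confidence = softmax(hidden @ w_out, axis=1)          # (n, n_classes)
    return confidence, hidden
```

Each row of `confidence` sums to one, and `hidden` plays the role of the hidden layer semantic feature vector passed on to step S3.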

4. The method for intelligent classification of data assets and optimization of debt structure based on machine learning according to claim 1, characterized in that the specific process of time-scale frequency reduction and feature aggregation in the multi-scale spatiotemporal semantic pooling layer in step S3 comprises: inputting the hidden layer semantic feature vectors, which are output at high frequency, into the multi-scale spatiotemporal semantic pooling layer; using a predefined asset activity decay function to perform a time-step-by-time-step cumulative calculation on the hidden layer semantic feature vectors to obtain a cumulative feature value; when the cumulative feature value is greater than or equal to a preset state trigger threshold, pooling the accumulated features to generate the aggregated semantic vector and triggering a state update of the MDP state space; and when the cumulative feature value is less than the state trigger threshold, keeping the aggregated semantic vector of the previous time step unchanged.
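
The trigger logic of this claim can be sketched as a small stateful class. The exponential decay, the use of the L2 norm as the per-step activity, the mean pooling, the reset after each trigger and the class name `TriggeredPooling` are all assumptions for illustration; the claim only requires some decay-based accumulation compared against a threshold.

```python
import numpy as np

class TriggeredPooling:
    """Emit a new aggregated vector only when accumulated activity is high enough."""

    def __init__(self, dim, decay=0.9, trigger=5.0):
        self.decay = decay                 # assumed exponential decay factor
        self.trigger = trigger             # preset state trigger threshold
        self.buffer = []                   # features seen since the last trigger
        self.score = 0.0                   # cumulative feature value
        self.aggregate = np.zeros(dim)     # last emitted aggregated vector

    def step(self, h):
        """Consume one high-frequency hidden vector; return (aggregate, updated)."""
        self.buffer.append(h)
        self.score = self.decay * self.score + float(np.linalg.norm(h))
        if self.score >= self.trigger:
            self.aggregate = np.mean(self.buffer, axis=0)   # pool the window
            self.buffer, self.score = [], 0.0               # start a new window
            return self.aggregate, True
        return self.aggregate, False       # previous aggregate kept unchanged
```

With `decay=1.0` and `trigger=3.0`, three unit-norm inputs produce one state update on the third step, so the downstream agent acts at a lower frequency than the sensors report.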

5. The method for intelligent classification of data assets and optimization of debt structure based on machine learning according to claim 1, characterized in that the action space and state space of the DDPG agent network in step S4 are defined as follows: the multidimensional action tensor is a normalized multidimensional tensor whose action dimensions include at least a long-term debt adjustment ratio, a short-term debt adjustment ratio, and a data asset capitalization hedging ratio; the baseline debt structure tensor is a multidimensional matrix containing the financial baseline data set by the current system; and the MDP state space is generated by flattening and concatenating the aggregated semantic vector with the baseline debt structure tensor and then aligning the dimensions through a fully connected network.
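
The state and action definitions above can be illustrated with a short NumPy sketch. The single dense layer, the tanh activations, the interpretation of "normalized" as the range [-1, 1], and the names `build_state`, `actor` and `ACTION_NAMES` are assumptions; a real DDPG actor would be a trained network rather than one random layer.

```python
import numpy as np

# Hypothetical labels for the three action dimensions named in the claim.
ACTION_NAMES = (
    "long_term_debt_adjustment_ratio",
    "short_term_debt_adjustment_ratio",
    "capitalization_hedging_ratio",
)

def build_state(aggregate, debt_tensor, w, b):
    """Flatten and concatenate both inputs, then align dimensions with a dense layer."""
    flat = np.concatenate([np.ravel(aggregate), np.ravel(debt_tensor)])
    return np.tanh(w @ flat + b)

def actor(state, w_a, b_a):
    """Toy deterministic policy; tanh keeps every action component in [-1, 1]."""
    return np.tanh(w_a @ state + b_a)
```

For a 6-dimensional aggregated vector and a 2x3 debt tensor, `w` must have 12 columns; the number of rows sets the state dimension seen by the actor, and `w_a` needs one row per entry of `ACTION_NAMES`.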

6. The method for intelligent classification of data assets and optimization of debt structure based on machine learning according to claim 1, characterized in that the joint reward function used as the optimization objective in step S4 is calculated as follows during the joint pre-training phase:
$$R = -\alpha\, C + \beta \sum_{i=1}^{N} P_i\, \lambda_i - \Phi$$
where $R$ is the reward value output by the joint reward function; $C$ is the estimated comprehensive financing cost generated after simulated execution based on the multidimensional action tensor; $\alpha$ and $\beta$ are preset weighting coefficients; $N$ is the total number of asset classes; $P_i$ is the confidence score of the $i$-th asset class in the data asset classification confidence matrix; $\lambda_i$ is the preset liquidity conversion factor of the $i$-th asset class; and $\Phi$ is a penalty term triggered when the multidimensional action tensor violates a preset hard financial constraint.
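
A minimal numeric sketch of such a reward follows. It assumes a form in which the weighted financing cost is subtracted, the weighted sum of confidence times liquidity factor is added, and the penalty is subtracted; the signs, the function name `joint_reward` and the default weights are assumptions rather than values taken from the claim.

```python
import numpy as np

def joint_reward(cost, confidence, liquidity, alpha=1.0, beta=1.0, penalty=0.0):
    """Assumed form: -alpha * cost + beta * sum_i(confidence_i * liquidity_i) - penalty.

    cost: estimated comprehensive financing cost from the simulated execution.
    confidence: per-class confidence scores, length N.
    liquidity: preset per-class liquidity conversion factors, length N.
    penalty: non-negative value, non-zero only when a hard constraint is violated.
    """
    asset_term = float(np.dot(confidence, liquidity))
    return -alpha * cost + beta * asset_term - penalty
```

For example, with `cost=2.0`, `confidence=[0.5, 0.5]`, `liquidity=[1.0, 3.0]`, `alpha=0.5`, `beta=2.0` and `penalty=1.0`, the terms are -1.0, +4.0 and -1.0, giving a reward of 2.0.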

7. The method for intelligent classification of data assets and optimization of debt structure based on machine learning according to claim 1, characterized in that the ST-GAT pre-trained in step S2 and the DDPG agent network pre-trained in step S4 share a common underlying feature representation layer, and the ST-GAT and the DDPG agent network are jointly trained before model deployment using a joint loss function based on adaptive weights and an orthogonal gradient projection mechanism, the specific joint training process comprising:
the joint loss function $L_{joint}$ is calculated as:
$$L_{joint} = \frac{1}{2\sigma_1^2} L_{cls} + \frac{1}{2\sigma_2^2} L_{pg} + \log \sigma_1 + \log \sigma_2$$
where $L_{cls}$ is the cross-entropy classification loss, $L_{pg}$ is the policy gradient loss, and $\sigma_1$ and $\sigma_2$ are the learnable noise parameters corresponding to the classification task and the reinforcement learning task, respectively;
during joint backpropagation, the classification gradient $g_{cls}$ of the joint loss function with respect to the classification task and the policy gradient $g_{pg}$ generated by the reinforcement learning task are calculated separately;
if the dot product of the classification gradient $g_{cls}$ and the policy gradient $g_{pg}$ is less than zero, the policy gradient $g_{pg}$ is projected onto the normal plane of the classification gradient $g_{cls}$ to generate the corrected update gradient $\tilde{g}_{pg}$, and the classification gradient $g_{cls}$ and the update gradient $\tilde{g}_{pg}$ are used together to synchronously update the underlying feature representation layer and the learnable noise parameters.
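
The two mechanisms in this claim can be sketched on flat gradient vectors. The sketch assumes an uncertainty-weighted loss with a factor of 1/(2σ²) per task plus log σ regularizers, parameterised by log σ for numerical stability, and the standard projection g_pg − (g_pg·g_cls / ‖g_cls‖²) g_cls onto the normal plane of g_cls. The function names and the simple sum used for the shared update are assumptions.

```python
import numpy as np

def joint_loss(loss_cls, loss_pg, log_sigma_cls, log_sigma_pg):
    """Adaptive-weight sum of the two task losses (assumed weighting form)."""
    w_cls = 0.5 * np.exp(-2.0 * log_sigma_cls)   # equals 1 / (2 * sigma_cls**2)
    w_pg = 0.5 * np.exp(-2.0 * log_sigma_pg)     # equals 1 / (2 * sigma_pg**2)
    return w_cls * loss_cls + w_pg * loss_pg + log_sigma_cls + log_sigma_pg

def project_policy_gradient(g_cls, g_pg):
    """Remove from g_pg the component that opposes g_cls, if there is one."""
    dot = float(np.dot(g_cls, g_pg))
    if dot >= 0.0:
        return g_pg                              # no conflict: leave unchanged
    # dot < 0 implies g_cls is non-zero, so the division is safe.
    return g_pg - dot / float(np.dot(g_cls, g_cls)) * g_cls

def shared_layer_update(g_cls, g_pg):
    """Combined gradient applied to the shared representation layer (assumed sum)."""
    return g_cls + project_policy_gradient(g_cls, g_pg)
```

With `g_cls = [1, 0]` and `g_pg = [-1, 1]` the dot product is -1, so the projection yields `[0, 1]`, which is orthogonal to `g_cls` and no longer works against the classification task.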

8. A machine learning-based system for intelligent classification of data asset entries and optimization of debt structure, for implementing the method according to any one of claims 1 to 7, characterized in that it comprises:
a data perception and graph construction module, used to acquire multi-source heterogeneous time-series sensing data from the production site, calculate the dynamic mutual information entropy threshold of the time-series sensing data in the current time window, establish dynamic topological connections between the nodes corresponding to the time-series sensing data based on the dynamic mutual information entropy threshold, and construct a dynamic spatiotemporal data graph;
an asset classification and feature extraction module, used to input the dynamic spatiotemporal data graph into the pre-trained spatiotemporal graph attention network ST-GAT for feature extraction, and output the data asset classification confidence matrix for the time-series sensing data together with the corresponding hidden layer semantic feature vector;
a state space mapping module, used to input the hidden layer semantic feature vector into the multi-scale spatiotemporal semantic pooling layer for time-scale frequency reduction and feature aggregation to generate an aggregated semantic vector, and to concatenate the aggregated semantic vector with the baseline debt structure tensor of the current system to construct the state space of a Markov decision process (MDP) for reinforcement learning;
an optimization control and instruction generation module, used to input the MDP state space into the Deep Deterministic Policy Gradient (DDPG) agent network and output a multidimensional action tensor through the policy function built into the DDPG agent network, wherein the DDPG agent network is pre-trained with maximization of the joint reward function as the optimization objective;
the optimization control and instruction generation module further generates dynamic debt structure parameter configuration instructions according to the multidimensional action tensor and sends the configuration instructions to the resource scheduling gateway of the artificial intelligence optimization operating system, so as to automatically update the capitalization label of the corresponding data asset in the underlying database and trigger the smart contract connected to the gateway to execute automated fund account allocation and transfer.

9. An electronic device, characterized in that it comprises a memory and a processor, wherein the memory stores a computer program, and the processor executes the program to implement the method according to any one of claims 1 to 7.

10. A non-volatile computer-readable storage medium, characterized in that a computer program is stored thereon which, when executed by a processor, implements the method according to any one of claims 1 to 7.