An e-commerce data security management method and system based on big data

By constructing a chain-like association graph and two prediction models, the method addresses the insufficient identification of risk transmission paths in the e-commerce supply chain, enabling proactive risk prediction and hierarchical prevention and control, and improving the effectiveness of data security management and the stability of the supply chain.

CN121980592B · Active · Publication Date: 2026-06-16 · SHENZHEN GLOBALBRANDS TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHENZHEN GLOBALBRANDS TECH CO LTD
Filing Date
2026-04-03
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing technologies cannot effectively identify the hidden risk transmission paths between multiple nodes in the e-commerce supply chain and lack the ability to predict risks proactively, so prevention and control of risk spread lags behind. Furthermore, traditional data security management methods are rigid and cannot balance risk control effectiveness with business continuity.

Method used

A chain-like association graph is constructed using big data technology, and risk transmission parameters are calculated quantitatively. Combined with a pre-trained node risk prediction model and a pre-trained chain-like risk transmission prediction model, this yields a comprehensive prediction of supply chain risks, from which a graded blocking strategy is formulated for differentiated management.

Benefits of technology

It enables proactive prediction and precise hierarchical prevention and control of e-commerce supply chain risks, improves the pertinence and effectiveness of data security management, avoids the spread of risks, and ensures the stability of the supply chain and business continuity.

✦ Generated by Eureka AI based on patent content.


Abstract

The application relates to the technical field of data security management and provides a big data-based e-commerce data security management method and system. The method comprises the following steps: in response to a preset response mechanism, collecting node security state data and inter-node association relationships for the nodes at each level of an e-commerce supply chain, and obtaining node risk features, association relationship features, and data sensitivity features through preprocessing; constructing a chain-like association graph from the node risk features and association relationship features, and obtaining risk transmission parameters through quantitative calculation; inputting the node risk features into a node risk prediction model to obtain node risk prediction results; inputting the chain-like association graph, the risk transmission parameters, and the node risk prediction results into a chain-like risk transmission prediction model to obtain link transmission risk prediction results; and formulating a hierarchical blocking strategy based on the node risk, the link transmission risk, and the data sensitivity features, and applying differentiated control to the supply chain according to that strategy, thereby realizing e-commerce supply chain risk identification and hierarchical security control.

Description

Technical Field

[0001] This invention relates to the field of data security management technology, specifically to a data security management method and system for e-commerce based on big data.

Background Technology

[0002] With the deepening digital upgrade of the e-commerce industry and the collaborative development of the supply chain, the e-commerce supply chain has formed a multi-level node linkage system consisting of a core platform, first-level suppliers, second-level service providers, and third-level partners. The frequency of data interaction between nodes, including user information, transaction data, and operational data, is growing exponentially. The complexity and correlation of data flows have increased significantly, and data security has become a core element in ensuring the stable operation of the e-commerce supply chain and preventing operational risks.

[0003] Currently, traditional data security management technologies have significant shortcomings. They fail to conduct quantitative analysis of the relationships between nodes across the entire supply chain, making it impossible to calculate key indicators such as correlation strength and risk transmission coefficients. This prevents the implicit risk chain transmission paths between multi-level nodes from becoming explicit, and also hinders the early identification of key transmission paths and their impact scope, leading to delayed risk spread prevention and control. Furthermore, their ability to predict associated risks is insufficient, often relying on passive post-incident remediation rather than proactive prediction based on real-time node security status and chain correlation characteristics. This makes it impossible to predict both node and chain transmission risks, and also makes it difficult to assess the risk outbreak time window and impact boundaries. Moreover, their risk handling methods are rigid, often employing a "one-size-fits-all" approach to sever data flow channels without implementing differentiated control based on risk transmission probability and data sensitivity levels, making it difficult to balance risk control effectiveness with the business continuity of the e-commerce supply chain.

[0004] At the same time, the rapid development and mature application of technologies such as big data, graph computing, graph neural networks, and spatiotemporal sequence analysis have provided a feasible path to solve the technical pain points of data security management across the entire e-commerce supply chain.

[0005] Therefore, there is an urgent need to build an e-commerce data security management method that is supported by big data technology and addresses the chain-like transmission characteristics of risks at multiple nodes in the supply chain. This method should enable full-link data association modeling, proactive multi-dimensional risk prediction, precise hierarchical transmission blocking, and dynamic iterative optimization, thereby breaking through the application limitations of traditional technologies and improving the control capabilities and efficiency of e-commerce supply chain data security.

Summary of the Invention

[0006] To overcome the shortcomings of existing technologies, this invention provides a data security management method and system for e-commerce based on big data, in order to solve the problems in existing technologies.

[0007] One embodiment of the present invention provides a data security management method for e-commerce based on big data, comprising the following steps:

[0008] In response to the preset response mechanism, the system collects node security status data and inter-node correlation data of each level of the e-commerce supply chain and performs preprocessing to obtain node risk characteristics, correlation characteristics and data sensitivity characteristics of each level of nodes.

[0009] A chain-like association graph is constructed by using node risk characteristics as graph node attributes and association relationship characteristics as association edge attributes. The chain-like association graph is then calculated based on preset quantitative calculation rules to obtain risk transmission parameters, which include node risk transmission coefficient, edge risk transmission probability, and edge association strength.

[0010] The node risk characteristics are input into a pre-trained node risk prediction model, and the node risk prediction result for a single node is output. The node risk prediction result includes the node risk level, the probability of risk occurrence, and the risk type.

[0011] The chain-linked graph, risk transmission parameters, and node risk prediction results are input into the pre-trained chain-linked risk transmission prediction model, and the link transmission risk prediction results are output. The link transmission risk prediction results include risk transmission probability, risk transmission path, risk impact range, and risk outbreak time window.

[0012] Based on the node risk prediction results, link transmission risk prediction results, and data sensitivity characteristics, a tiered blocking strategy is formulated, and corresponding differentiated control is implemented on the supply chain path according to the tiered blocking strategy.

[0013] This application also relates to a big data-based e-commerce data security management system, comprising:

[0014] The data acquisition module is used to collect node security status data and inter-node correlation data of each level of the e-commerce supply chain in response to the preset response mechanism, and to preprocess the data to obtain the node risk characteristics, correlation characteristics and data sensitivity characteristics of each level of nodes.

[0015] The risk transmission calculation module is used to construct a chain-like association graph by using node risk characteristics as graph node attributes and association relationship characteristics as association edge attributes; and to calculate the chain-like association graph based on preset quantization calculation rules to obtain risk transmission parameters, which include node risk transmission coefficient, edge risk transmission probability, and edge association strength.

[0016] The node risk prediction module is used to input node risk features into a pre-trained node risk prediction model and output the node risk prediction result for a single node. The node risk prediction result includes the node risk level, the probability of risk occurrence, and the risk type.

[0017] The transmission risk prediction module is used to input the chain-like association graph, risk transmission parameters and node risk prediction results into the pre-trained chain-like risk transmission prediction model, and output the link transmission risk prediction results, which include the risk transmission probability, risk transmission path, risk impact range and risk outbreak time window.

[0018] The strategy formulation module is used to formulate a graded blocking strategy based on the node risk prediction results, link transmission risk prediction results, and data sensitivity characteristics, and to implement corresponding differentiated control over the supply chain path according to the graded blocking strategy.

[0019] This application also relates to a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the above-described big data-based e-commerce data security management method.

[0020] This application also relates to a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the above-described method for e-commerce data security management based on big data.

[0021] The above embodiments provide a data security management method and system for e-commerce based on big data, which has the following beneficial effects:

[0022] This invention collects and preprocesses security status data and relationships of nodes at each level of the e-commerce supply chain through a pre-defined response mechanism. It constructs a chain-like association graph and quantifies risk transmission parameters. A node risk prediction model is used to predict the risk of individual nodes. Combining the chain-like association graph, risk transmission parameters, and node risk prediction results, a chain-like risk transmission prediction model is used to comprehensively predict the risk of transmission along the supply chain. Finally, a tiered blocking strategy is formulated based on node risk, transmission risk, and data sensitivity characteristics, and differentiated control is implemented across the supply chain. This approach aligns with the hierarchical and chain-like characteristics of the e-commerce supply chain, enabling risk identification, risk transmission simulation, and tiered prevention and control. It improves the targeting and effectiveness of e-commerce data security management, avoiding the impact of traditional one-size-fits-all management methods on normal business operations. Furthermore, it leverages big data and dual-model prediction to achieve proactive risk assessment, effectively curbing the transmission and spread of supply chain data security risks and ensuring the security and stability of e-commerce supply chain data flow.

Attached Figure Description

[0023] Figure 1 is a flowchart illustrating a big data-based e-commerce data security management method provided in an embodiment of the present invention;

[0024] Figure 2 is a schematic block diagram of a computer device provided in an embodiment of the present invention.

Detailed Implementation

[0025] The technical solutions in the embodiments of the present invention will now be clearly and completely described in conjunction with the accompanying drawings.

[0026] Referring to Figure 1, one embodiment of the present invention provides a data security management method for e-commerce based on big data, comprising the following steps:

[0027] S10. In response to the preset response mechanism, collect the node security status data and the relationship between nodes at each level of the e-commerce supply chain and perform preprocessing to obtain the node risk characteristics, relationship characteristics and data sensitivity characteristics of each level of nodes.

[0028] Specifically, this step, as the initial data processing stage of this invention, involves collecting security status data for each individual node and related relationship data such as business connections and data flow between nodes after the preset response mechanism is triggered, targeting the core hub level nodes, first-level transmission level nodes, and end-connection level nodes in the e-commerce supply chain. The node security status data includes security-related indicators such as abnormal node data transmission, abnormal access behavior, and sensitive data operation records. The inter-node relationship data includes the frequency of interaction between upstream and downstream nodes, data flow paths, and business binding relationships. The collected data is preprocessed using conventional data preprocessing methods, including data cleaning, deduplication, normalization, and feature extraction. Redundant and invalid data are removed, and valid data is standardized. From the valid data, node risk characteristics that characterize the security risk status of each level of node, relationship characteristics reflecting the degree of interaction between nodes, and data sensitivity characteristics that identify the confidentiality and importance level of the data are extracted.

[0029] For example, assuming the e-commerce supply chain is a complete e-commerce retail process, including the e-commerce platform's central control server node as the core hub, brand merchant business management nodes and payment gateway nodes as primary transmission nodes, and logistics and warehousing nodes and user terminal access nodes as end-connection nodes, the collected node security status data would at least include data transmission packet loss and unauthorized login alarms on the e-commerce platform's central control server, records of sensitive data violations on brand merchant nodes, abnormal transaction transmission on payment gateway nodes, order information tampering on logistics and warehousing nodes, and abnormal access behavior on user terminal nodes. The collected inter-node relationships would at least include the daily interaction frequency between the platform's central control server and brand merchants / payment gateways, order data flow paths, and business binding relationships between brand merchants and logistics and warehousing nodes. The node risk characteristics obtained through data preprocessing and feature extraction would be quantitative characteristics such as the frequency of abnormal behavior and risk operation weights of each node; the relationship characteristics would be characterizing characteristics such as the tightness of interaction between nodes and the strength of data flow associations; and the data sensitivity characteristics would be sensitive identification characteristics at different levels, such as user payment information, order privacy data, and core business data of merchants. The core purpose of this step is to provide basic data support for risk prediction and security management in the supply chain.
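
As a minimal, non-limiting sketch of the preprocessing described in step S10, the Python fragment below cleans, deduplicates, and normalizes a handful of raw node security events and extracts one quantitative risk feature per node. The record fields (`node`, `indicator`, `ts`, `count`), the node names, and the choice of min-max normalization are assumptions made for illustration; the embodiment only requires conventional preprocessing methods.

```python
from collections import defaultdict

def preprocess(events):
    """Clean, deduplicate, and min-max normalize raw node security events."""
    # Cleaning: drop records with a missing node id or a missing/negative count.
    cleaned = [e for e in events if e.get("node") and e.get("count", -1) >= 0]
    # Deduplication: keep one record per (node, indicator, timestamp).
    unique = {(e["node"], e["indicator"], e["ts"]): e for e in cleaned}
    # Feature extraction: total abnormal-event count per node.
    totals = defaultdict(int)
    for e in unique.values():
        totals[e["node"]] += e["count"]
    if not totals:
        return {}
    # Normalization: scale the totals into [0, 1] across nodes.
    lo, hi = min(totals.values()), max(totals.values())
    span = (hi - lo) or 1
    return {node: (total - lo) / span for node, total in totals.items()}

# Hypothetical raw records for the retail example above.
raw_events = [
    {"node": "platform_server", "indicator": "unauthorized_login", "ts": 1, "count": 8},
    {"node": "platform_server", "indicator": "unauthorized_login", "ts": 1, "count": 8},  # duplicate
    {"node": "payment_gateway", "indicator": "abnormal_transaction", "ts": 2, "count": 2},
    {"node": "logistics", "indicator": "order_tampering", "ts": 3, "count": 5},
    {"node": None, "indicator": "noise", "ts": 4, "count": 1},  # invalid record
]
node_risk_features = preprocess(raw_events)
print(node_risk_features)
```

In this sketch the duplicate and invalid records are discarded before aggregation, so the busiest node maps to 1.0 and the quietest to 0.0.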

[0030] S20. Using node risk characteristics as graph node attributes and association relationship characteristics as association edge attributes, construct a chain-like association graph; calculate the chain-like association graph based on preset quantitative calculation rules to obtain risk transmission parameters, including node risk transmission coefficient, edge risk transmission probability, and edge association strength.

[0031] Specifically, this step, based on the three types of feature data obtained in step S10, completes the construction of the basic model for supply chain risk analysis and quantifies the risk transmission capability. On the one hand, it uses the node risk characteristics of each level of nodes as node attributes and the correlation characteristics between nodes as connection edge attributes, combined with the hierarchical transmission characteristics of the e-commerce supply chain, to construct a chain-like correlation graph, which is used to intuitively display the node distribution and correlation transmission structure of the supply chain. On the other hand, it performs overall calculations on the chain-like correlation graph through preset quantitative calculation logic to obtain three types of risk transmission parameters that characterize the risk diffusion capability of nodes themselves, the probability of risk transmission through related edges, and the degree of correlation between nodes. For example, taking the above-mentioned e-commerce retail full-process link as an example, the constructed chain-like correlation graph uses nodes such as the e-commerce platform central control server, brand merchant management nodes, and payment gateways as graph nodes, and the business flow and data interaction relationships between each node as graph correlation edges. Then, through quantitative calculation, it obtains the risk transmission coefficient corresponding to each node, the edge risk transmission probability corresponding to each correlation edge, and the edge correlation strength, thereby realizing the quantitative characterization of the risk transmission characteristics of the supply chain. 
The core purpose of this step is to build a visualized and computable supply chain risk transmission structure and form quantitative risk transmission indicators, so as to provide structural and data support for risk prediction at the subsequent node and link levels.
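
The preset quantitative calculation rules are not spelled out in this passage, so the following Python sketch substitutes simple illustrative rules to show how the three risk transmission parameters could be derived from a chain-like association graph. The rules themselves are assumptions: edge association strength is taken as interaction frequency normalized by the maximum frequency, edge risk transmission probability as the source node's risk multiplied by that strength, and the node risk transmission coefficient as the mean probability over a node's outgoing edges. The node names and numeric values are hypothetical.

```python
def transmission_parameters(node_risk, edges):
    """node_risk: {node: risk in [0, 1]}; edges: {(src, dst): interaction frequency}."""
    max_freq = max(edges.values())
    # Assumed rule 1: edge association strength = normalized interaction frequency.
    strength = {edge: freq / max_freq for edge, freq in edges.items()}
    # Assumed rule 2: edge risk transmission probability = source risk * strength.
    prob = {(s, d): node_risk[s] * strength[(s, d)] for (s, d) in edges}
    # Assumed rule 3: node risk transmission coefficient = mean outgoing edge probability.
    coeff = {}
    for node in node_risk:
        outgoing = [p for (s, _), p in prob.items() if s == node]
        coeff[node] = sum(outgoing) / len(outgoing) if outgoing else 0.0
    return strength, prob, coeff

# Hypothetical graph for the retail example: graph nodes carry risk attributes,
# association edges carry daily interaction frequencies.
node_risk = {"platform": 0.8, "merchant": 0.5, "gateway": 0.25, "logistics": 0.1}
edges = {
    ("platform", "merchant"): 100,
    ("platform", "gateway"): 50,
    ("merchant", "logistics"): 25,
}
strength, prob, coeff = transmission_parameters(node_risk, edges)
print(strength, prob, coeff, sep="\n")
```

Under these assumed rules the hub node receives the largest transmission coefficient because it has both the highest risk and the strongest outgoing edges.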

[0032] S30. Input the node risk features into the pre-trained node risk prediction model and output the node risk prediction result for a single node. The node risk prediction result includes the node risk level, the probability of risk occurrence, and the risk type.

[0033] Specifically, this step uses the node risk features extracted in step S10 as input data and utilizes a pre-trained node risk prediction model to perform risk prediction analysis on each individual node in the e-commerce supply chain. Through the model's identification and reasoning of node risk features, it directly outputs prediction results that characterize the risk status of each independent node. The prediction results specifically cover the node risk level of the corresponding level node, the probability of node risk occurrence, and the type of risk when node risk occurs.
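
The architecture of the pre-trained node risk prediction model is not specified in this passage. Purely to illustrate the input and output contract of step S30, the sketch below uses a hand-set logistic scorer as a stand-in; the feature names, weights, bias, and level cut-offs are all hypothetical and are not the trained model of the embodiment.

```python
import math

# Hypothetical stand-in parameters; a real model would learn these from data.
WEIGHTS = {
    "unauthorized_access": 2.0,
    "abnormal_transmission": 1.5,
    "sensitive_data_violation": 2.5,
}
BIAS = -3.0

def predict_node_risk(features):
    """Map normalized node risk features to level, probability, and risk type."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    probability = 1.0 / (1.0 + math.exp(-z))  # logistic function
    if probability >= 0.7:
        level = "high"
    elif probability >= 0.3:
        level = "medium"
    else:
        level = "low"
    # The risk type is taken as the feature contributing most to the score.
    risk_type = max(features, key=lambda name: WEIGHTS[name] * features[name])
    return {"risk_level": level, "probability": probability, "risk_type": risk_type}

result = predict_node_risk({
    "unauthorized_access": 1.0,
    "abnormal_transmission": 0.2,
    "sensitive_data_violation": 0.9,
})
print(result)
```

The three returned fields correspond to the node risk level, the probability of risk occurrence, and the risk type named in step S30.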

[0034] S40. Input the chain-linked correlation graph, risk transmission parameters and node risk prediction results into the pre-trained chain-linked risk transmission prediction model, and output the link transmission risk prediction results. The link transmission risk prediction results include the risk transmission probability, risk transmission path, risk impact range and risk outbreak time window.

[0035] Specifically, this step uses the chain-like correlation graph and risk transmission parameters obtained in step S20, as well as the node risk prediction results obtained in step S30, as comprehensive input data. It uses a pre-trained chain-like risk transmission prediction model to analyze the overall risk transmission situation of the e-commerce supply chain. Through the comprehensive identification and deduction of multi-dimensional input data by the model, it outputs prediction results to characterize the risk transmission situation of the supply chain. The prediction results specifically cover the risk transmission probability, risk transmission path, risk impact range, and risk outbreak time window.
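
The four outputs of step S40 can be illustrated with a simple path enumeration that stands in for the pre-trained chain-like risk transmission prediction model. The sketch assumes that the probability along a path is the product of the source node's risk probability and the edge risk transmission probabilities, and that the outbreak time window can be approximated by summing hypothetical per-edge delays in hours. The cut-off value, the delays, and the node names are illustrative assumptions.

```python
def propagate(source, source_prob, edge_prob, edge_delay_h, cutoff=0.1):
    """Enumerate transmission paths from `source` whose probability stays >= cutoff."""
    results = []
    stack = [([source], source_prob, 0.0)]
    while stack:
        path, prob, hours = stack.pop()
        for (src, dst), p in edge_prob.items():
            if src != path[-1] or dst in path:
                continue  # not an outgoing edge, or it would revisit a node
            new_prob = prob * p
            if new_prob < cutoff:
                continue  # transmission too unlikely to report
            new_path = path + [dst]
            new_hours = hours + edge_delay_h[(src, dst)]
            results.append({"path": new_path, "probability": new_prob, "eta_hours": new_hours})
            stack.append((new_path, new_prob, new_hours))
    return results

edge_prob = {("platform", "merchant"): 0.8, ("platform", "gateway"): 0.4, ("merchant", "logistics"): 0.5}
edge_delay_h = {("platform", "merchant"): 2.0, ("platform", "gateway"): 1.0, ("merchant", "logistics"): 6.0}

paths = propagate("platform", 0.5, edge_prob, edge_delay_h)
impact_range = {p["path"][-1] for p in paths}          # nodes the risk may reach
longest = max(paths, key=lambda p: len(p["path"]))     # deepest transmission path
print(impact_range, longest)
```

Each returned entry carries a transmission path, its probability, and an estimated arrival time, while the set of end nodes gives the risk impact range.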

[0036] S50. Based on the node risk prediction results, link transmission risk prediction results, and data sensitivity characteristics, formulate a graded blocking strategy, and implement corresponding differentiated control over the supply chain path according to the graded blocking strategy.

[0037] Specifically, this step integrates the node risk prediction results obtained in step S30, the link transmission risk prediction results obtained in step S40, and the data sensitivity characteristics obtained in step S10. Based on these, a tiered blocking strategy adapted to the risk status of the supply chain is formulated. Then, based on the formulated tiered blocking strategy, differentiated control is implemented on the e-commerce supply chain. This differentiated control means that for nodes and corresponding transmission links that are predicted by the model to have different risk levels and different transmission trends, the control methods and control intensity are matched accordingly, rather than adopting uniform control measures, so as to achieve precise security control.
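
One possible way to express a graded blocking strategy is sketched below. The scoring formula, the three tier names, and the cut-off values are hypothetical; the passage above only requires that control intensity be matched to node risk, link transmission risk, and data sensitivity rather than applied uniformly.

```python
def blocking_tier(node_prob, transmission_prob, sensitivity):
    """Pick a control action; sensitivity ranges from 1 (low) to 3 (high)."""
    # Assumed score: the larger of the two risk probabilities, scaled by sensitivity.
    score = max(node_prob, transmission_prob) * sensitivity / 3
    if score >= 0.6:
        return "isolate_link"               # strongest control: cut the data channel
    if score >= 0.3:
        return "restrict_sensitive_fields"  # intermediate: limit sensitive data flows
    return "enhanced_monitoring"            # lightest: keep business running, watch closely

tiers = {
    "payment_data": blocking_tier(0.9, 0.2, 3),
    "order_data": blocking_tier(0.6, 0.75, 2),
    "catalog_data": blocking_tier(0.2, 0.1, 3),
}
print(tiers)
```

Only the highest-scoring link is isolated in this sketch, which mirrors the stated aim of avoiding a one-size-fits-all cut-off of data flows.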

[0038] In one embodiment, the triggering of the preset response mechanism specifically includes the following steps:

[0039] S101. Based on the initially constructed chain-like association graph, the nodes of the e-commerce supply chain are divided into core hub level nodes, first-level transmission level nodes, and end-association level nodes with decreasing trigger priorities.

[0040] S102. Based on the node risk transmission coefficient and edge association strength initially calculated for each level of node, configure differentiated trigger thresholds for each level of node; wherein, the higher the node risk transmission coefficient and edge association strength, the lower the corresponding trigger threshold.

[0041] S103. When the node security status data corresponding to any level node reaches the trigger threshold, or when the edge risk transmission probability of the associated edge of the level node reaches the preset transmission threshold, the response mechanism is triggered.

[0042] In this embodiment, specifically as described in steps S101-S103 above, step S101 uses the initial chain-like association map of the e-commerce supply chain, which is preliminarily constructed in the early stage of risk monitoring, as the structural basis and division criteria. This initial chain-like association map has initially sorted out the distribution relationship of each node in the supply chain and the basic association structure between nodes, which can intuitively reflect the position and association relationship of the nodes in the chain. On this basis, combined with the business functions undertaken by each node in the entire e-commerce supply chain, the core position of data flow, and the influence of risk transmission and diffusion, all nodes in the entire chain are divided into three levels, and the risk monitoring trigger priority corresponding to each level decreases in sequence. The core hub level nodes undertake the functions of full-domain data scheduling and core business control, and have the widest risk impact range, so they have the highest trigger priority; the first-level transmission level nodes undertake the business and data flow of the core nodes, with the next highest risk transmission capability, and have a medium trigger priority; the terminal association level nodes only participate in the execution of terminal business, with a limited risk impact range, and have the lowest trigger priority. 
For example, in the aforementioned e-commerce retail process, based on this initial chain-like association graph, the e-commerce platform's central control server node, which coordinates the data, transactions, and management of the entire platform, can be classified as a core hub-level node; the brand merchant business management node and payment gateway node, which are responsible for product management and transaction settlement, can be classified as first-level transmission-level nodes; and the logistics warehousing node and user terminal access node, which handle goods warehousing and distribution and user front-end access, can be classified as terminal association-level nodes.

[0043] Step S102 relies on two types of parameters obtained through initial quantitative calculation on the initial chain-like association graph: the node risk transmission coefficient of the nodes at each level and the edge association strength of their associated edges. The initial calculation refers to the preliminary calculation performed during the configuration phase of the risk monitoring and response mechanism, based solely on the node hierarchy, basic business attributes, and basic relationships between nodes as reflected in the initial chain-like association graph, using preset basic quantification rules. Its result reflects the risk diffusion potential and the degree of association of each node in the supply chain. Based on this, this step configures independent, differentiated trigger thresholds for nodes at different levels. The configuration follows the core rule of matching monitoring sensitivity to risk transmission capability. The higher the risk transmission coefficient of a node, the stronger its ability to spread risk throughout the entire chain once a risk occurs. The higher the edge association strength of a node's associated edges, the tighter the interaction and binding between the node and other nodes, and the more easily risk is transmitted. Such nodes are therefore given lower trigger thresholds, so that core nodes with high risk transmission capability are detected quickly when signs of risk appear, avoiding the monitoring delays that excessively high thresholds would cause. For end nodes with weak risk transmission capability and low association strength, relatively high trigger thresholds can be set to balance monitoring accuracy and system operating efficiency.
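
The threshold rule of step S102 only fixes a direction: higher transmission coefficient and higher association strength mean a lower trigger threshold. The sketch below shows one monotone formula that satisfies this rule. The linear form, the base threshold of 10 abnormal events, the factor `alpha`, and the per-level parameter values are assumptions for illustration.

```python
def trigger_threshold(coeff, strength, base=10.0, alpha=0.8):
    """Lower the base threshold as risk transmission capability grows.

    coeff and strength are both expected in [0, 1]; alpha < 1 keeps the
    threshold strictly positive even for the most capable node.
    """
    capability = (coeff + strength) / 2
    return base * (1.0 - alpha * capability)

# Hypothetical initial parameters per level: (transmission coefficient, association strength).
levels = {
    "core_hub": (0.9, 1.0),
    "first_level_transmission": (0.5, 0.6),
    "terminal_association": (0.1, 0.2),
}
thresholds = {name: trigger_threshold(c, s) for name, (c, s) in levels.items()}
print(thresholds)
```

With these values the core hub level receives the lowest threshold and the terminal association level the highest, matching the decreasing trigger priority of step S101.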

[0044] Step S103 clarifies the actual triggering and judgment logic of the entire response mechanism. This step adopts a dual judgment method of parallel monitoring of node risk anomalies and link transmission risks. In the process of routine real-time security monitoring of nodes at all levels of the e-commerce supply chain, it is not necessary for both types of conditions to be met at the same time. As long as any judgment condition is met, the response mechanism in step S10 can be triggered immediately, thereby initiating subsequent data collection, preprocessing, and full-process risk analysis and security control operations. The first condition is that the real-time collected node security status data of a certain level node reaches the corresponding trigger threshold pre-configured in step S102. This condition indicates that the node itself has shown clear signs of security risks such as abnormal data transmission, unauthorized access, and unauthorized operation of sensitive data, and the response process needs to be initiated immediately for handling. The second condition is that the edge risk transmission probability of the associated edge corresponding to the node at this level reaches the transmission threshold pre-set by the system. The edge risk transmission probability here is used to characterize the potential possibility of risk spreading and propagating along the links between nodes. Even if the node itself does not detect significant security status anomalies, if the risk transmission probability of its associated links reaches the preset standard, it means that the risk has the objective conditions to propagate and spread to the node along the upstream and downstream links. If the response mechanism is not triggered in time, it is very easy for the risk to be hidden and then concentrated and erupt. Therefore, this kind of link transmission risk is also used as an effective trigger condition. 
For example, in the entire e-commerce retail process, on the one hand, if the security status data of the platform's central control server node at the core hub level, such as the frequency of illegal logins or the packet loss rate of data transmission, reaches its corresponding low trigger threshold, the response mechanism can be directly triggered. On the other hand, if the payment gateway node itself does not show obvious security anomalies, but the probability of edge risk transmission between its connection edge and the brand merchant's business management node reaches the preset transmission threshold, it indicates that there is a possibility of hidden transmission of risk from the merchant node to the payment gateway. In this case, the response mechanism can also be triggered, thereby achieving comprehensive monitoring of the node's own risks and the risks transmitted through the link, and minimizing the possibility of omissions in risk monitoring.
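
The dual judgment of step S103 is a logical OR of the two conditions, which can be sketched as follows. The function name, argument names, and sample numbers are hypothetical.

```python
def should_trigger(node_metric, node_threshold, edge_probs, transmission_threshold):
    """Return True if either trigger condition of the response mechanism is met."""
    # Condition 1: the node's own security status data reaches its trigger threshold.
    node_hit = node_metric >= node_threshold
    # Condition 2: any associated edge reaches the preset transmission threshold.
    edge_hit = any(p >= transmission_threshold for p in edge_probs)
    return node_hit or edge_hit

# Hub node: its own anomaly count crosses a low threshold.
hub_triggered = should_trigger(3, 2.4, [0.1, 0.2], 0.6)
# Payment gateway: no node anomaly, but the edge from the merchant node is risky.
gateway_triggered = should_trigger(0, 5.2, [0.7], 0.6)
# Terminal node: neither condition is met.
terminal_triggered = should_trigger(1, 8.8, [0.05], 0.6)
print(hub_triggered, gateway_triggered, terminal_triggered)
```

The second call corresponds to the payment gateway example above, where the mechanism fires even though the node itself shows no obvious anomaly.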

[0045] In one embodiment, step S20, the construction of the chain association graph, specifically includes the following steps:

[0046] S211. Perform structured embedding processing on the node risk characteristics of each level of nodes to form a node attribute vector, wherein the node attribute vector includes node risk level weight, risk type identifier and level label.

[0047] S212. Classify and quantify the characteristics of the association relationship to form an association edge attribute vector, wherein the association edge attribute vector includes the association relationship type, interaction frequency, data flow volume and duration.

[0048] S213. Taking the core hub-level nodes as the topology center, and according to the hierarchical chain linkage relationship of the core hub-level nodes, the first-level transmission-level nodes and the terminal association-level nodes, perform topological mapping between the node attribute vectors and the associated edge attribute vectors to construct a chain association graph that includes the node topology relationship.

[0049] In this embodiment, specifically as described in steps S211-S213 above, step S211 performs structured embedding processing on the risk features of each level of nodes extracted in step S10. This structured embedding processing can use conventional feature vector encoding and embedding algorithms in the field to transform the originally discrete and non-standardized node risk feature data into standardized vector data that can be recognized and calculated by computers through vector encoding. The resulting node attribute vector integrates three types of key identification information: first, node risk level weight, which is used to quantify the degree of potential risk of the node; second, risk type identifier, which is used to distinguish different risk categories such as data leakage, illegal access, and abnormal transmission; and third, the level label, which is used to mark the node as a core hub level, a first-level transmission level, or a terminal association level. Through this vector, the risk attributes and level positioning of a single node can be completely characterized.
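
A minimal sketch of the node attribute vector of step S211 is given below, assuming a plain one-hot encoding in place of whichever conventional embedding algorithm is actually used. The vocabulary lists and the vector layout (risk level weight, then risk type identifier, then level label) are illustrative assumptions.

```python
RISK_TYPES = ["data_leakage", "illegal_access", "abnormal_transmission"]
LEVELS = ["core_hub", "first_level_transmission", "terminal_association"]

def one_hot(value, vocabulary):
    """Encode a categorical value as a one-hot list over a fixed vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]

def node_attribute_vector(risk_weight, risk_type, level):
    """Concatenate risk level weight, risk type identifier, and level label."""
    return [float(risk_weight)] + one_hot(risk_type, RISK_TYPES) + one_hot(level, LEVELS)

platform_vector = node_attribute_vector(0.8, "illegal_access", "core_hub")
print(platform_vector)  # [0.8, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0]
```

A learned embedding would replace the one-hot segments with dense vectors, but the three pieces of information carried by the vector would stay the same.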

[0050] Step S212 performs classification and quantification processing on the node association features obtained in step S10. This classification and quantification can adopt conventional association feature classification and numerical quantification algorithms in the field to transform the abstract association relationships between nodes into quantifiable numerical indicators, thereby forming an association edge attribute vector. The vector contains four types of core quantitative information: first, the association relationship type, used to distinguish different business association types such as order flow, payment interaction, and warehouse collaboration; second, the interaction frequency, used to count the number of business interactions between nodes per unit time; third, the data flow volume, used to quantify the size of data such as order data and payment data transmitted between nodes; and fourth, the duration, used to record the stable existence time of the association relationship between nodes. Through the above quantitative information, the core attributes of the association edges between nodes can be comprehensively characterized.

[0051] Step S213 uses the core hub-level nodes defined in step S101 as the topological center of the entire chain association graph. Conventional topological mapping and graph construction algorithms in this field can be employed. Following the hierarchical chain linkage relationship of "core hub-level nodes leading the linkage → first-level transmission-level nodes providing intermediate relay → terminal association-level nodes performing final execution," the vector topological mapping is completed in stages. First, the attribute vector of each core hub-level node is linked to the attribute vectors of the first-level transmission-level nodes through the corresponding association edge attribute vectors to establish the primary topological connection. Second, the attribute vectors of the first-level transmission-level nodes are linked to the attribute vectors of the terminal association-level nodes through the corresponding association edge attribute vectors to complete the secondary topological connection. Finally, the node attributes and edge attributes carried by the vectors are integrated with the hierarchical transmission relationships between nodes to construct a chain association graph that combines node risk attributes, quantified edge attributes, and explicit hierarchical topological relationships. This provides a structured and quantifiable graph foundation for the subsequent calculation of risk transmission parameters (node risk transmission coefficient, edge risk transmission probability, etc.). For example, in the aforementioned e-commerce retail process, the e-commerce platform's central control server node serves as the topology center. First, its node attribute vector is linked, through the corresponding association edge attribute vectors, with the node attribute vectors of first-level transmission-level nodes such as the brand merchant business management node and the payment gateway node. Then, the association edge attribute vectors of the brand merchant business management node are used to connect the attribute vectors of the logistics and warehousing nodes, and the association edge attribute vectors of the payment gateway node are used to connect the attribute vectors of the user terminal nodes, ultimately forming a chain association graph with clear hierarchy and complete attributes.
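As an illustration of steps S211-S213, the following minimal sketch builds a small chain association graph from node attribute vectors and association edge attribute vectors. The class names (NodeAttr, EdgeAttr, ChainGraph), the field names, the node names, and all numeric values are hypothetical and chosen only for this example; the specification does not prescribe a particular data structure.

```python
from dataclasses import dataclass, field

# Level labels named in the specification; the string spellings are assumptions.
LEVELS = ("core_hub", "first_level_transmission", "terminal_association")


@dataclass
class NodeAttr:
    risk_level_weight: float  # quantifies the potential risk of the node
    risk_type: str            # e.g. "data_leakage", "illegal_access"
    level: str                # one of LEVELS


@dataclass
class EdgeAttr:
    relation_type: str        # e.g. "order_flow", "payment_interaction"
    interaction_freq: float   # business interactions per unit time
    data_flow_volume: float   # size of data transmitted between the nodes
    duration: float           # how long the association has existed


@dataclass
class ChainGraph:
    nodes: dict = field(default_factory=dict)  # name -> NodeAttr
    edges: dict = field(default_factory=dict)  # (src, dst) -> EdgeAttr

    def add_node(self, name, attr):
        if attr.level not in LEVELS:
            raise ValueError(f"unknown level label: {attr.level}")
        self.nodes[name] = attr

    def add_edge(self, src, dst, attr):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both end nodes must be added before the edge")
        self.edges[(src, dst)] = attr

    def downstream(self, name):
        """Names of nodes directly reachable from `name`, in insertion order."""
        return [dst for (src, dst) in self.edges if src == name]


# Example values below are invented for illustration only.
graph = ChainGraph()
graph.add_node("central_control_server", NodeAttr(0.9, "illegal_access", "core_hub"))
graph.add_node("brand_merchant_mgmt", NodeAttr(0.5, "data_leakage", "first_level_transmission"))
graph.add_node("payment_gateway", NodeAttr(0.7, "abnormal_transmission", "first_level_transmission"))
graph.add_node("logistics_warehousing", NodeAttr(0.3, "data_leakage", "terminal_association"))
graph.add_node("user_terminal", NodeAttr(0.2, "illegal_access", "terminal_association"))

# Primary topological connection: core hub -> first-level transmission nodes.
graph.add_edge("central_control_server", "brand_merchant_mgmt", EdgeAttr("order_flow", 120.0, 35.0, 400.0))
graph.add_edge("central_control_server", "payment_gateway", EdgeAttr("payment_interaction", 300.0, 12.0, 400.0))
# Secondary topological connection: first-level -> terminal association nodes.
graph.add_edge("brand_merchant_mgmt", "logistics_warehousing", EdgeAttr("warehouse_collaboration", 60.0, 20.0, 200.0))
graph.add_edge("payment_gateway", "user_terminal", EdgeAttr("payment_interaction", 500.0, 5.0, 100.0))
```

The two groups of add_edge calls mirror the staged mapping described above, with the core hub connected first and the terminal nodes attached afterwards.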

[0052] In one embodiment, step S20, the calculation of the risk transmission parameter, specifically includes the following steps:

[0053] S221. Based on the level label in the node attribute vector, determine the basic transmission coefficient of each level node. Then, adjust the basic transmission coefficient of each level node by weighting it with the node risk level weight to obtain the node risk transmission coefficient of each level node.

[0054] S222. Based on the association relationship type, interaction frequency, data flow volume and duration in the associated edge attribute vector, the initial association strength of the associated edge is calculated; the initial association strength is corrected by combining the hierarchical matching degree of the nodes at both ends of the associated edge to obtain the edge association strength.

[0055] S223. Perform a weighted operation on the node risk transmission coefficient and the edge association strength to obtain the edge risk transmission probability of the corresponding associated edge.

[0056] In this embodiment, specifically as described in steps S221-S223 above, step S221 first assigns a corresponding basic transmission coefficient to each of the three types of nodes (core hub level, first-level transmission level, and terminal association level) based on the level label in the node attribute vector generated in step S211. The basic transmission coefficient is determined by the node's hierarchical position: core hub-level nodes have the highest basic transmission coefficient, first-level transmission-level nodes the next highest, and terminal association-level nodes the lowest. Then, using a conventional weighted correction calculation method in the art, the node risk level weight in the node attribute vector is used as a correction factor to numerically adjust the basic transmission coefficient of each level. The higher the risk level weight, the larger the corrected transmission coefficient. The result is a node risk transmission coefficient that reflects both the node's hierarchical level and its risk level, and that characterizes the ability of a single node to spread risk outward.

[0057] Step S222 first uses four quantitative indicators—relationship type, interaction frequency, data flow volume, and duration—found in the associated edge attribute vector to calculate the initial association strength of the associated edge using conventional multi-factor quantitative calculation methods. This initial association strength can initially reflect the tightness of business connections between nodes. Then, the hierarchical matching degree of the nodes at both ends of the associated edge is introduced to further refine the initial association strength. A higher hierarchical matching degree indicates stronger business linkage and risk transmission adaptability between the two nodes. The refined value is the final edge association strength, which more accurately characterizes the actual ability of the inter-node association link to bear risk transmission. For example, in an e-commerce retail link, the brand merchant business management node and the payment gateway node belong to the same first-level transmission level, and their hierarchical matching degree is high. After refinement based on the initial association strength, the resulting edge association strength will be higher than that of inter-node association links with significant hierarchical differences.

[0058] Step S223 employs a conventional weighted fusion calculation method in this field, weighting the node risk transmission coefficient obtained in step S221 and the edge association strength obtained in step S222. Based on the actual application, the calculation weights of the two types of parameters are reasonably allocated. By combining the node's own risk diffusion capability and the association link transmission capability, the edge risk transmission probability of the corresponding associated edge is finally obtained. This probability can intuitively reflect the likelihood of risk transmission between two nodes along the associated edge, providing a core quantitative basis for subsequent response mechanism triggering and link transmission risk prediction.
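The calculation in steps S221-S223 can be sketched as follows. The specification only states that conventional weighted methods are used, so every number and formula here is an assumption made for illustration: the base coefficients, the relationship-type weights, the equal weighting of the four edge indicators, the matching-degree formula, and the fusion weight alpha. Inputs are assumed to be pre-normalised to the range 0 to 1.

```python
# Hypothetical base coefficients: core hub highest, terminal association lowest.
BASE_COEFF = {
    "core_hub": 0.9,
    "first_level_transmission": 0.6,
    "terminal_association": 0.3,
}
LEVEL_RANK = {"core_hub": 0, "first_level_transmission": 1, "terminal_association": 2}
# Hypothetical numeric weights for association relationship types.
TYPE_WEIGHT = {"order_flow": 0.6, "payment_interaction": 0.9, "warehouse_collaboration": 0.5}


def node_transmission_coeff(level, risk_level_weight):
    """S221: correct the base coefficient by the risk level weight (0..1).

    A higher risk level weight yields a larger corrected coefficient.
    """
    return BASE_COEFF[level] * (0.5 + 0.5 * risk_level_weight)


def edge_strength(rel_type, freq, volume, duration, src_level, dst_level):
    """S222: initial strength from four indicators, corrected by level matching."""
    initial = 0.25 * (TYPE_WEIGHT[rel_type] + freq + volume + duration)
    # Assumed matching degree: 1.0 for same-level nodes, lower as levels diverge.
    matching = 1.0 / (1 + abs(LEVEL_RANK[src_level] - LEVEL_RANK[dst_level]))
    return initial * matching


def edge_transmission_prob(node_coeff, strength, alpha=0.5):
    """S223: weighted fusion of node coefficient and edge strength, clipped to [0, 1]."""
    p = alpha * node_coeff + (1 - alpha) * strength
    return min(1.0, max(0.0, p))


coeff = node_transmission_coeff("core_hub", 1.0)
strength = edge_strength("payment_interaction", 0.5, 0.5, 0.5,
                         "first_level_transmission", "first_level_transmission")
prob = edge_transmission_prob(coeff, strength)
```

With these assumed formulas, an edge between two first-level transmission-level nodes receives a higher strength than an otherwise identical edge that spans different levels, which matches the behaviour described for the hierarchical matching degree.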

[0059] In one embodiment, step S30 specifically includes the following steps:

[0060] S311. The node risk features of each level of node are fused with the corresponding node attribute vectors to obtain the fused node risk input features.

[0061] S312. Input the node risk input features into the pre-trained node risk prediction model. Through the hierarchical feature extraction branch of the model, extract the node hierarchical association features based on the hierarchical labels in the node attribute vector and the node topological relationship in the chain association graph. Through the risk feature analysis branch of the model, perform feature mining on the risk anomaly indicators and safety status data in the node risk input features to extract the real-time risk features of the node.

[0062] S313. The model's prediction output layer performs joint risk prediction on the node hierarchical association features and the real-time risk features of the nodes, and outputs the node risk prediction results for a single-level node, including the node risk level, the probability of risk occurrence, and the risk type.

[0063] In this embodiment, specifically as described in steps S311-S313 above, step S311 uses the real-time node risk features of each level of nodes collected and extracted in step S10 as the basic data, and performs feature fusion processing with the node attribute vectors generated in step S211 and corresponding to each node. This feature fusion can adopt conventional feature splicing and weighted fusion methods in the field to organically integrate the discretized real-time risk features with the standardized node attribute vectors, eliminate redundant feature information and strengthen the risk representation dimension, make up for the information limitations of single feature data, and obtain fused node risk input features with complete dimensions and comprehensive representation capabilities, providing standardized and high-quality input data for the model's risk prediction. For example, in the aforementioned e-commerce retail full-process chain, node risk features such as the abnormal access frequency and sensitive data operation records of the e-commerce platform's central control server are fused with node attribute vectors containing high-risk level weights and the level labels of the core hub to form node risk input features adapted to the model input.
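A minimal sketch of the feature fusion in step S311, using plain concatenation (feature splicing) of the real-time risk features with an encoded node attribute vector. The one-hot encoding, the vocabularies, and the function names are assumptions for illustration; weighted fusion or other encodings would fit the description equally well.

```python
# Assumed vocabularies; the specification lists these categories as examples.
RISK_TYPES = ("data_leakage", "illegal_access", "abnormal_transmission")
LEVELS = ("core_hub", "first_level_transmission", "terminal_association")


def one_hot(value, vocabulary):
    """Encode a categorical value as a 0/1 vector over the vocabulary."""
    if value not in vocabulary:
        raise ValueError(f"unknown category: {value}")
    return [1.0 if value == item else 0.0 for item in vocabulary]


def fuse_node_features(realtime_features, risk_level_weight, risk_type, level):
    """Concatenate real-time risk features with the node attribute vector."""
    attr_vector = [risk_level_weight] + one_hot(risk_type, RISK_TYPES) + one_hot(level, LEVELS)
    return list(realtime_features) + attr_vector


# Hypothetical real-time features: abnormal access frequency and
# sensitive-data operation rate, both already normalised.
fused = fuse_node_features([0.8, 0.3], 0.9, "illegal_access", "core_hub")
```

The fused vector keeps the real-time features first and the attribute vector last, so a downstream model can slice either part by position.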

[0064] Step S312 inputs the node risk input features obtained in step S311 into the pre-trained node risk prediction model, and uses the model's dual-branch parallel structure to simultaneously perform differentiated feature extraction: on the one hand, through the model's hierarchical feature extraction branch, based on the hierarchical label in the node attribute vector as the core basis, and combined with the node topology linkage relationship already constructed in the chain association graph, it mines the hierarchical positioning of the node in the supply chain and the upstream and downstream association transmission attributes, so as to extract node hierarchical association features that can reflect the impact of node hierarchical association; on the other hand, through the model's risk feature analysis branch, it focuses on the risk anomaly indicators and real-time security status data contained in the node risk input features, and performs feature mining and risk pattern extraction, so as to obtain node real-time risk features that can characterize the current real-time security status of the node. Through dual-branch feature decoupling extraction, hierarchical attributes and real-time risks are separately characterized.

[0065] Step S313 uses the prediction output layer of the pre-trained model to perform feature fusion and joint risk prediction on the node hierarchical association features and real-time risk features of the nodes extracted in step S312. Taking into account the impact of node hierarchical transmission and its own real-time risk status, the final output is a risk prediction result for a single-level node covering multiple dimensions, specifically including three core information categories: node risk level, probability of risk occurrence, and risk type. This prediction result can directly provide quantitative basis and data support for subsequent link transmission risk prediction and the formulation of hierarchical security blocking strategies.

[0066] It should be noted that the training and construction of the pre-trained node risk prediction model includes the following steps:

[0067] S301. Collect historical security data and historical business operation data of each node in the e-commerce supply chain, and construct a training dataset containing normal operation samples and risky abnormal samples; preprocess the training dataset, including data cleaning, missing value imputation, outlier removal and standardization, to obtain a standardized training dataset.

[0068] Specifically, step S301, as the data foundation preparation step for model training, has the core objective of collecting high-quality, comprehensive training data and completing standardized preprocessing to provide reliable support for subsequent model annotation and training. In practice, firstly, multi-dimensional historical data from each level of the e-commerce supply chain is collected, including historical security data such as node security logs, risk event records, and business operation logs, as well as historical business operation data such as inter-node business interaction data and data flow records. By integrating this data, a training dataset containing two types of core samples is constructed: one type consists of samples under normal operating conditions at each node, and the other type covers various risk and anomaly scenarios such as data leakage, unauthorized access, abnormal transmission, and order tampering. This ensures that the training dataset covers all scenarios of supply chain risk management and possesses sample diversity and comprehensiveness.

[0069] After the dataset is constructed, it undergoes standardized preprocessing. Data cleaning removes duplicate entries, incorrectly formatted data, and invalid data without business significance. Missing values in key fields are filled using conventional methods in the field, such as mean imputation, interpolation, or scenario-adaptive imputation (e.g., filling a missing business interaction frequency with the historical mean of the same node). Outliers exceeding a reasonable business range (e.g., abnormally high daily data flow) are identified and removed so that abnormal data does not interfere with model training. Finally, standardization (e.g., Z-score standardization or Min-Max normalization) maps the original data of different dimensions and magnitudes to a unified numerical range, eliminating training bias caused by differences in data scale. The result is a standardized training dataset, which lays the data foundation for subsequent supervised annotation.
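The preprocessing steps above can be sketched on a single numeric field as follows. The function names and the example business range are hypothetical; only the techniques themselves (mean imputation, range-based outlier removal, Min-Max normalization, Z-score standardization) come from the description.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def drop_out_of_range(values, low, high):
    """Remove values outside an assumed reasonable business range."""
    return [v for v in values if low <= v <= high]


def min_max(values):
    """Min-Max normalization to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Z-score standardization using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical interaction frequencies for one node, with a gap and an outlier.
raw = [10, None, 30, 10000]
cleaned = drop_out_of_range(impute_mean([10, None, 30]) + [10000], 0, 100)
scaled = min_max(cleaned)
```

The order of operations follows the description (imputation before outlier removal). In practice the fill value is best computed from data that does not already contain extreme outliers, which is why the example imputes before the outlier is appended.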

[0070] For example, in the aforementioned e-commerce retail process, historical secure login logs of the e-commerce platform's central control server, abnormal transaction records of payment gateway nodes, and order flow business data of logistics and warehousing nodes are collected to construct a training dataset containing normal operation samples (such as regular business operation data of the platform server) and risky abnormal samples (such as illegal transaction data of payment nodes and order tampering records of warehousing nodes). After data cleaning, missing value imputation, outlier removal, and standardization, a standardized training dataset with uniform format and compliant data is obtained.

[0071] S302. Supervised annotation is performed on each sample in the standardized training dataset. The annotation content includes the label of the level to which it belongs, the label of the node risk level, the label of the risk occurrence probability interval, and the label of the risk type, forming a labeled training sample set.

[0072] Specifically, step S302 is the supervised labeling stage. Its core is to assign clear business labels to each sample in the standardized training dataset, constructing a training sample set with supervision information and providing a clear learning objective for subsequent iterative training of the model. In practice, for each sample in the standardized training dataset, four types of core labels are assigned based on the node hierarchy classification rules of the e-commerce supply chain and the actual circumstances of historical risk events: first, the level label, which marks the sample's corresponding node as core hub level, first-level transmission level, or terminal association level according to its position in the chain association graph; second, the node risk level label, which divides the node risk level into high, medium, and low based on the severity, scope of impact, and frequency of historical risk events; third, the risk occurrence probability interval label, which divides the probability into high (0.8-1.0), medium (0.4-0.8), and low (0-0.4) intervals based on the statistical frequency of historical risk events; and fourth, the risk type label, which marks the risk event as a specific risk type such as data leakage, unauthorized access, abnormal transmission, or order tampering, based on its specific manifestation.

[0073] Through the above annotation, each sample is transformed into an annotated sample containing complete supervision information such as "node level - risk level - probability interval - risk type", and finally forms an annotated training sample set. This ensures that the model training can clearly understand the mapping logic of "input features → output prediction results", thereby improving training efficiency and prediction accuracy.
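The four-part annotation can be sketched as a small record type plus a helper that maps a historical risk frequency to its probability interval label. The intervals as stated share their boundary values 0.4 and 0.8; this sketch assumes half-open intervals in which a boundary value belongs to the higher interval. That boundary rule, the SampleLabel name, and the field names are assumptions.

```python
from dataclasses import dataclass


def probability_interval(p):
    """Map a probability to "low" [0, 0.4), "medium" [0.4, 0.8) or "high" [0.8, 1.0]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p >= 0.8:
        return "high"
    if p >= 0.4:
        return "medium"
    return "low"


@dataclass(frozen=True)
class SampleLabel:
    level: str                 # core hub / first-level transmission / terminal association
    risk_level: str            # high / medium / low
    probability_interval: str  # high / medium / low
    risk_type: str             # e.g. data leakage, order tampering


# Hypothetical annotated sample for a payment gateway node.
label = SampleLabel(
    level="first_level_transmission",
    risk_level="high",
    probability_interval=probability_interval(0.83),
    risk_type="unauthorized_access",
)
```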

[0074] S303. Construct a multi-branch model architecture, wherein the model architecture includes:

[0075] The hierarchical feature extraction branch is used to extract node hierarchical association features based on the node's hierarchical label and the node topology relationship in the chain association graph;

[0076] The risk feature analysis branch is used to extract real-time risk features of nodes based on risk anomaly indicators and safety status data in the node risk input features.

[0077] The prediction output layer is used to fuse node hierarchical association features with real-time risk features of nodes and output node risk prediction results.

[0078] Specifically, step S303 is the model architecture construction stage, the core of which is to build a multi-branch model architecture adapted to the risk prediction task of this solution, and to clarify the functional positioning and data flow logic of each branch. In practice, this involves constructing a multi-branch node risk prediction model architecture containing three core structures:

[0079] Hierarchical Feature Extraction Branch: The core function of this branch is to extract node hierarchical association features that characterize node hierarchical positioning and transmission attributes, based on the hierarchical labels and the node topology relationships in the chain association graph. This branch takes the sample's hierarchical labels as input and combines them with the hierarchical topological linkage relationships among the core hub level, the first-level transmission level, and the terminal association level in the chain association graph to uncover how nodes at different levels influence risk transmission. Ultimately, it outputs a node hierarchical association feature vector of unified dimension, providing feature support for the model to capture the impact of hierarchical attributes on risk.

[0080] Risk Feature Analysis Branch: The core function of this branch is to extract real-time risk features that characterize the current risk status of a node based on risk anomaly indicators and security status data from the node's risk input features. This branch uses risk anomaly indicators (such as the number of abnormal accesses and the frequency of sensitive data operations) and security status data (such as data transmission packet loss rate and login authentication failure rate) contained in the sample as input objects. It extracts real-time risk features of nodes through feature mining algorithms, ensuring that the input data matches the architecture and the actual application scenario.

[0081] Prediction Output Layer: The core function is to fuse the node hierarchical association features output from the hierarchical feature extraction branch with the real-time node risk features output from the risk feature analysis branch, and output the node risk prediction result. This layer integrates the two types of features through conventional methods such as feature concatenation and weighted fusion, combining node hierarchical attributes and real-time risk status, and finally outputs a multi-dimensional prediction result including node risk level, risk occurrence probability, and risk type.
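The three-part architecture can be illustrated with a tiny forward pass written in plain Python. This is only a structural sketch under stated assumptions: each branch is a single linear layer with ReLU, fusion is concatenation, the weights are random and untrained, and the layer sizes and class name NodeRiskModel are hypothetical. The specification does not fix the layer types or dimensions.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def init_layer(rng, n_out, n_in):
    """Random (untrained) weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class NodeRiskModel:
    def __init__(self, n_level_in, n_risk_in, hidden=4, n_levels=3, n_types=4, seed=0):
        rng = random.Random(seed)
        self.level_branch = init_layer(rng, hidden, n_level_in)  # hierarchical feature extraction branch
        self.risk_branch = init_layer(rng, hidden, n_risk_in)    # risk feature analysis branch
        self.level_head = init_layer(rng, n_levels, 2 * hidden)  # risk level (high/medium/low)
        self.prob_head = init_layer(rng, 1, 2 * hidden)          # risk occurrence probability
        self.type_head = init_layer(rng, n_types, 2 * hidden)    # risk type

    def forward(self, level_input, risk_input):
        h_level = relu(linear(level_input, *self.level_branch))
        h_risk = relu(linear(risk_input, *self.risk_branch))
        fused = h_level + h_risk  # feature concatenation in the prediction output layer
        return {
            "risk_level": softmax(linear(fused, *self.level_head)),
            "risk_probability": sigmoid(linear(fused, *self.prob_head)[0]),
            "risk_type": softmax(linear(fused, *self.type_head)),
        }


model = NodeRiskModel(n_level_in=3, n_risk_in=2)
# One-hot level label for a core hub node and two normalised risk indicators.
out = model.forward([1.0, 0.0, 0.0], [0.8, 0.3])
```

Because the weights are untrained, the output values carry no meaning; the sketch only shows how the two branches feed a shared output layer that produces the three prediction components.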

[0082] S304. Divide the labeled training sample set into a training subset and a validation subset according to a preset ratio; adopt a machine learning training method, input the training subset into a multi-branch model architecture for iterative training, obtain the prediction result through forward propagation, calculate the prediction error using a preset joint loss function, and update the parameters of each branch of the model through back propagation, iterating until the loss value of the training subset converges to a stable state.

[0083] Specifically, step S304 is the model iterative training stage. The core is to use the training subset to iteratively optimize the parameters of the constructed multi-branch model architecture, allowing the model to gradually learn the mapping of "input features → output prediction results" until the model's loss value on the training subset stabilizes. In practice, the labeled training sample set is first divided, according to a conventional dataset partitioning ratio in this field (e.g., 7:3 or 8:2), into a training subset (for model parameter learning) and a validation subset (for subsequent verification of model generalization ability). Then, conventional supervised machine learning training methods are used to iteratively train the model: the training subset is input into the multi-branch model architecture in fixed batch sizes (e.g., 32 or 64 samples per batch). During forward propagation, the hierarchical feature extraction branch and the risk feature analysis branch extract their respective features, and the prediction output layer then completes feature fusion and risk prediction to obtain the model's prediction result. The prediction error between the prediction result and the true labels of the samples is calculated using a preset joint loss function. For the classification tasks (risk level and risk type), the cross-entropy loss function is used; for the regression task (probability of risk occurrence), the mean squared error loss function is used. The two types of losses are combined through weighted summation to achieve joint optimization across multiple tasks. The prediction error is fed back to each branch of the model via backpropagation, and the model parameters of the hierarchical feature extraction branch, the risk feature analysis branch, and the prediction output layer are updated based on the error gradient. The training process of "forward propagation - error calculation - backpropagation - parameter update" is repeated to continuously optimize the model parameters until the loss value of the training subset converges to a stable state (e.g., the fluctuation range of the loss value is less than a preset threshold for 10 consecutive iterations). This indicates that the model has initially grasped the feature patterns in the training data and has completed basic training.
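The joint loss and the convergence criterion described in step S304 can be sketched as below for a single sample. The task weights w_cls and w_reg, the default threshold, and the function names are assumptions; the combination of cross-entropy for the two classification outputs and squared error for the probability output follows the description.

```python
import math


def cross_entropy(probs, target_index):
    """Cross-entropy for one sample given predicted class probabilities."""
    return -math.log(max(probs[target_index], 1e-12))


def squared_error(pred, target):
    """Per-sample squared error; averaging over a batch gives the MSE."""
    return (pred - target) ** 2


def joint_loss(level_probs, level_target, type_probs, type_target,
               prob_pred, prob_target, w_cls=1.0, w_reg=1.0):
    """Weighted sum of classification losses and the regression loss."""
    cls = cross_entropy(level_probs, level_target) + cross_entropy(type_probs, type_target)
    reg = squared_error(prob_pred, prob_target)
    return w_cls * cls + w_reg * reg


def has_converged(loss_history, threshold=1e-3, patience=10):
    """True when the loss fluctuation range over the last `patience` iterations is below threshold."""
    if len(loss_history) < patience:
        return False
    recent = loss_history[-patience:]
    return max(recent) - min(recent) < threshold


# Hypothetical prediction for one sample: risk level class 0, risk type class 2.
loss = joint_loss([0.7, 0.2, 0.1], 0, [0.1, 0.1, 0.7, 0.1], 2, 0.75, 0.9)
```

Adjusting w_cls and w_reg changes how strongly each task drives the parameter updates, which is one of the tuning options mentioned for the later validation stage.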

[0084] S305. Verify and optimize the model's predictive performance using the validation subset: Calculate the model's risk identification accuracy, risk level prediction accuracy, and risk type determination recall on the validation subset. If any indicator fails to meet the preset standard, adjust the weights of the joint loss function or the model parameters, and return to S304 to repeat the training iteration until all indicators meet the preset requirements, thus completing the construction of the pre-trained node risk prediction model.

[0085] Specifically, step S305 is the model performance verification and optimization stage. The core is to verify the model's generalization ability through a validation subset, avoiding overfitting, and adjusting model parameters based on the verification results until the model meets the performance requirements of practical applications, ultimately completing the construction and training of the node risk prediction model. In practice, the validation subset defined in step S304 is used to perform performance tests on the trained model, calculating three core verification metrics: first, risk identification accuracy, which measures the proportion of normal samples correctly distinguished from risky abnormal samples; second, risk level prediction accuracy, which measures the degree of match between the model's prediction of node risk levels (high / medium / low) and the actual risk level; and third, risk type determination recall, which measures the proportion of the model correctly identifying various risk types (data leakage / illegal access, etc.).

[0086] If any indicator fails to meet the preset standard (e.g., risk identification accuracy ≥ 85%, risk level prediction accuracy ≥ 80%, risk type determination recall ≥ 80%), then adjust the weight allocation of the joint loss function (e.g., the weight ratio of classification tasks to regression tasks) or model parameters (e.g., the number of neurons in the feature extraction branch, the learning rate of iterative training), and return to step S304 to restart the training iteration; repeat the "training-validation-optimization" cycle until the model meets the preset requirements for all three core indicators on the validation subset, indicating that the model not only grasps the patterns in the training data, but also has a good generalization ability to unseen data, and finally completes the construction and training of the node risk prediction model.
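The validation check in step S305 can be sketched as follows. The thresholds are the example values given above; the metric key names, the function names, and the use of macro-averaged recall across risk types are assumptions made for this illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Recall computed per true class, then averaged over classes."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in indices) / len(indices))
    return sum(recalls) / len(recalls)


# Example thresholds from the description.
THRESHOLDS = {
    "risk_identification_accuracy": 0.85,
    "risk_level_accuracy": 0.80,
    "risk_type_recall": 0.80,
}


def failed_metrics(metrics, thresholds=THRESHOLDS):
    """Names of metrics below their threshold; an empty list means training is complete."""
    return [name for name, minimum in thresholds.items() if metrics[name] < minimum]


# Hypothetical validation results.
metrics = {
    "risk_identification_accuracy": accuracy(["risk", "normal", "risk", "normal"],
                                             ["risk", "normal", "risk", "risk"]),
    "risk_level_accuracy": 0.82,
    "risk_type_recall": macro_recall(["leak", "leak", "tamper"], ["leak", "tamper", "tamper"]),
}
to_fix = failed_metrics(metrics)
```

A non-empty result from failed_metrics corresponds to the case where the loss weights or model parameters are adjusted and training returns to step S304.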

[0087] In one embodiment, step S40 specifically includes the following steps:

[0088] S411. The node topological relationships in the chain association graph, the node risk transmission coefficient and edge association strength in the risk transmission parameters, and the node risk level and risk occurrence probability in the node risk prediction results are fused to obtain the link risk input features.

[0089] S412. Input the link risk input features into the pre-trained chain risk transmission prediction model. Through the link topology feature extraction branch of the model, mine the transmission path features between nodes based on the hierarchical chain linkage relationship of the chain association graph. Through the risk transmission intensity analysis branch of the model, calculate the transmission potential energy features of each link in the link by combining the node risk transmission coefficient and the edge association strength. Through the node risk association mining branch of the model, extract the risk association features between nodes based on the node risk level and the probability of risk occurrence.

[0090] S413. The transmission path characteristics, transmission potential energy characteristics and risk correlation characteristics are jointly calculated by the prediction output layer of the model to output the link transmission risk prediction results, including risk transmission probability, risk transmission path, risk impact range and risk outbreak time window.

[0091] In this embodiment, specifically as described in steps S411-S413 above, step S411 uses the constructed chain-link graph, the calculated risk transmission parameters, and the node risk prediction results as multi-dimensional data sources, integrating and fusing the core features to obtain link risk input features adapted to the model input. Specifically, the node topological relationship features in the chain-link graph, the node risk transmission coefficient and edge association strength features in the risk transmission parameters, and the node risk level and risk occurrence probability features in the node risk prediction results are organically integrated using conventional feature splicing and weighted fusion methods. This unifies the scattered topological structure features, transmission capability features, and node risk state features into a unified and comprehensive link risk input feature, providing standardized input data for the subsequent chain-link risk transmission prediction model and ensuring that the model can comprehensively utilize multi-dimensional information to accurately predict link transmission risks. For example, in the aforementioned e-commerce retail process, the topological relationships of the platform's central control server, payment gateway, and warehousing nodes, the node risk transmission coefficients and edge association strengths between nodes, as well as the risk levels and probability of risk occurrence of each node are fused to form the link risk input features corresponding to the supply chain.

[0092] Step S412 inputs the link risk input features obtained in step S411 into the pre-trained chain-like risk transmission prediction model. Based on the model's three-branch parallel structure, differential feature extraction and mining are performed simultaneously: Through the link topology feature extraction branch of the model, the hierarchical chain linkage relationship of the chain-like association graph is used as the core basis to analyze the upstream and downstream associations and hierarchical transmission paths between nodes, and the transmission path features between nodes are mined; Through the risk transmission strength analysis branch of the model, the risk transmission capacity of each transmission link in the link is quantified by combining the risk transmission coefficient of the node itself and the edge association strength between the nodes, and the transmission potential energy features of each link are obtained; Through the node risk association mining branch of the model, the risk linkage and correlation between different nodes are mined based on the risk level and the probability of risk occurrence of each node, and the risk association features between nodes are extracted. Through the three-branch parallel feature extraction, the core influencing factors of link risk transmission are characterized and represented separately.

[0093] Step S413 uses the prediction output layer of the pre-trained chain-based risk transmission prediction model to perform multi-feature joint operation and comprehensive prediction on the transmission path features, transmission potential energy features, and risk association features extracted in step S412. It fully integrates the influence of three core features: link topology, transmission strength, and node risk association, and finally outputs a multi-dimensional link transmission risk prediction result, specifically including four core contents: risk transmission probability, risk transmission path, risk impact range, and risk outbreak time window. This prediction result can intuitively reflect the transmission possibility, specific transmission route, impact range, and outbreak period of risk in the supply chain, providing a comprehensive and quantitative decision-making basis for subsequent step S50 to formulate a graded security blocking strategy and achieve precise risk prevention and control.
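To make the shape of the link transmission risk prediction result more concrete, the following sketch computes a hand-made stand-in for three of its four components (transmission probability, transmission path, impact range). It is not the trained chain risk transmission prediction model: it simply enumerates paths and multiplies edge risk transmission probabilities, which assumes that transmission along each edge is independent. The function names, the min_prob cut-off, and the example probabilities are hypothetical, and the risk outbreak time window is omitted because it cannot be derived from these inputs.

```python
def enumerate_paths(edges, start):
    """List (path, probability) for every simple path that begins at `start`.

    `edges` maps (src, dst) to an edge risk transmission probability.
    """
    results = []

    def walk(node, path, prob):
        for (src, dst), p in edges.items():
            if src == node and dst not in path:
                new_path, new_prob = path + [dst], prob * p
                results.append((new_path, new_prob))
                walk(dst, new_path, new_prob)

    walk(start, [start], 1.0)
    return results


def summarize_transmission(edges, start, min_prob=0.1):
    """Keep paths at or above `min_prob` and report them with the impacted nodes."""
    kept = [(path, prob) for path, prob in enumerate_paths(edges, start) if prob >= min_prob]
    kept.sort(key=lambda item: item[1], reverse=True)
    impact_range = sorted({node for path, _ in kept for node in path[1:]})
    return {"paths": kept, "impact_range": impact_range}


# Hypothetical edge risk transmission probabilities.
edges = {
    ("server", "gateway"): 0.8,
    ("gateway", "user"): 0.5,
    ("server", "merchant"): 0.2,
    ("merchant", "warehouse"): 0.3,
}
summary = summarize_transmission(edges, "server")
```

In this example the path through the merchant node to the warehouse falls below the cut-off, so the warehouse is excluded from the reported impact range.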

[0094] It should be noted that the training and construction of the pre-trained chain-like risk transmission prediction model includes the following steps:

[0095] S401. Collect historical link topology data, historical risk transmission parameter data, historical node risk prediction data, and historical risk actual transmission event data of the e-commerce supply chain to construct a link training dataset that includes normal transmission scenarios and abnormal transmission scenarios. Preprocess the link training dataset, including data cleaning, missing value imputation, outlier removal, standardization, and transmission path regularization, to obtain a standardized link training dataset.

[0096] Specifically, step S401, as the data foundation preparation step for model training, has the core objective of collecting multi-dimensional historical data related to link transmission and completing standardized preprocessing to provide high-quality data support for subsequent model annotation and training. In practice, four types of core historical data are collected first: 1) historical link topology data, i.e., topological structure data such as hierarchical linkages and connections between nodes in the supply chain; 2) historical risk transmission parameter data, including historical node risk transmission coefficients for each node and historical edge association strength data between nodes; 3) historical node risk prediction data, i.e., prediction-related data such as historical risk levels and probability of risk occurrence for each node; and 4) historical actual risk transmission event data, including records of risk transmission processes and results that have occurred in the supply chain. By integrating this data, a link training dataset containing two types of core samples is constructed: one type is samples from risk-free or normal business transmission scenarios, and the other type is samples from abnormal risk transmission scenarios, ensuring that the training dataset covers all scenarios of link risk transmission and possesses sample diversity and comprehensiveness.

[0097] After the dataset is constructed, it undergoes standardized preprocessing. Data cleaning removes duplicate entries, incorrectly formatted data, and invalid data without business significance. Missing values in key fields (such as edge association strength data for certain time periods) are filled using methods conventional in the field, such as mean filling, interpolation filling, or scenario-adaptive filling (e.g., filling missing edge association strength values with the mean of nodes at the same level in the same link). Outliers exceeding a reasonable business range (such as abnormally high node risk transmission coefficients) are identified and removed so that abnormal data does not interfere with model training. Standardization (such as Z-score standardization or Min-Max normalization) maps raw data of different dimensions and magnitudes to a unified numerical range, eliminating training bias caused by differences in data scale. In addition, transmission path regularization unifies the risk transmission paths recorded in different formats into a standardized path format of "starting node - intermediate node - ending node" to ensure the consistency of path data. The result is a standardized link training dataset, which lays the data foundation for the subsequent supervised annotation.
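The preprocessing operations described above can be sketched as follows. This is a minimal illustration only: the function names, the accepted raw path separators (`->`, `>`, `,`), and the example value ranges are assumptions made for the sketch and are not specified in the source.

```python
import re
from statistics import mean, pstdev

def impute_mean(values):
    """Replace None entries with the mean of the observed values (mean filling)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def remove_outliers(values, low, high):
    """Keep only values inside an assumed reasonable business range [low, high]."""
    return [v for v in values if low <= v <= high]

def z_score(values):
    """Z-score standardization; a constant column maps to all zeros."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]

def min_max(values):
    """Min-Max normalization to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def normalize_path(raw_path):
    """Unify paths written with '->', '>' or ',' into 'start - ... - end'."""
    nodes = [n for n in re.split(r"\s*(?:->|>|,)\s*", raw_path.strip()) if n]
    return " - ".join(nodes)
```

For instance, `normalize_path("payment -> platform , warehouse")` yields `"payment - platform - warehouse"`, so paths recorded in different formats become comparable.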

[0098] For example, in the aforementioned e-commerce retail process, historical topology connection data between the platform's central control server and payment and warehousing nodes are collected, along with historical risk transmission coefficients and edge association strength data for each node, historical risk levels and probability of occurrence data for each node, and actual event data of past order information leakage risks transmitted from payment nodes to warehousing nodes. This data is used to construct a link training dataset containing normal transmission samples (such as business data transmission when there is no risk) and abnormal transmission samples (such as risk transmission across nodes). After data cleaning, missing value imputation, outlier removal, standardization processing, and transmission path regularization, a standardized link training dataset with a unified format and compliant data is obtained.

[0099] S402. Supervised annotation is performed on each sample in the standardized link training dataset. The annotation content includes the risk transmission probability interval label, the actual risk transmission path label, the actual risk impact range label, and the actual risk outbreak time window label, forming an annotated link training sample set.

[0100] Specifically, step S402 is the supervised labeling step, the core of which is to assign clear transmission-related labels to each sample in the standardized link training dataset, construct a training sample set with supervised information, and provide a clear learning objective for the subsequent iterative training of the model. In practice, four types of core annotations are applied to each sample in the standardized link training dataset based on the actual historical risk transmission situation. First, risk transmission probability interval annotation: according to the statistical probability with which the risk was actually transmitted in history, the sample is assigned to one of three intervals, namely high probability (0.8-1.0), medium probability (0.4-0.8), or low probability (0-0.4). Second, actual risk transmission path annotation: the sequence of nodes through which the risk was actually transmitted is recorded as a standardized path label of "starting node - intermediate node - ending node". Third, actual risk impact range annotation: based on the number and level of nodes ultimately affected by the risk transmission, the actual impact range is labeled as core level impact, full link impact, or local level impact. Fourth, actual risk outbreak time window annotation: based on the historical time span from the occurrence of the risk to its peak, the sample is labeled with a short-term (0-24 hours), medium-term (24-72 hours), or long-term (more than 72 hours) time window.
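The interval-based labels can be illustrated with a small sketch. The source does not say which interval a boundary value belongs to, so the sketch assumes that 0.4 and 0.8 fall into the higher probability interval and that exactly 24 and 72 hours fall into the shorter time window (consistent with "more than 72 hours" for long-term). The function names are hypothetical.

```python
def probability_interval(p):
    """Map a historical transmission probability to its interval label.

    Assumption: boundary values 0.4 and 0.8 belong to the higher interval.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p >= 0.8:
        return "high"
    if p >= 0.4:
        return "medium"
    return "low"

def time_window(hours):
    """Map the span from risk occurrence to peak (in hours) to a window label.

    Assumption: exactly 24 h is short-term and exactly 72 h is medium-term.
    """
    if hours < 0:
        raise ValueError("time span must be non-negative")
    if hours <= 24:
        return "short-term"
    if hours <= 72:
        return "medium-term"
    return "long-term"

def annotate(sample):
    """Attach the four supervised labels to one sample (a plain dict)."""
    return {
        **sample,
        "probability_label": probability_interval(sample["transmission_probability"]),
        "path_label": " - ".join(sample["path_nodes"]),
        "range_label": sample["impact_range"],
        "window_label": time_window(sample["hours_to_peak"]),
    }
```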

[0101] Through the above annotation, each sample is transformed into an annotated sample containing complete supervised information including "transmission probability interval - actual transmission path - actual impact range - actual outbreak time window". This ultimately forms an annotated link training sample set, ensuring that the model training can clearly define the mapping logic of "input features → output prediction results", thereby improving training efficiency and prediction accuracy.

[0102] S403. Construct a multi-branch model architecture, the model architecture including:

[0103] The link topology feature extraction branch is used to mine the transmission path features between nodes based on the hierarchical chain linkage relationship of the chain association graph;

[0104] The risk transmission strength analysis branch is used to calculate the transmission potential energy characteristics of each link by combining the node risk transmission coefficient and the edge association strength.

[0105] The node risk association mining branch is used to extract risk association features between nodes based on node risk level and risk occurrence probability.

[0106] The prediction output layer is used to perform joint operations on the three types of features and output the prediction results of the link transmission risk.

[0107] Specifically, step S403 is the model architecture construction stage, the core of which is to build a multi-branch model architecture adapted to the link risk transmission prediction task, and to clarify the functional positioning and data flow logic of each branch. In practice, this involves constructing a multi-branch chain-like risk transmission prediction model architecture containing four core structures:

[0108] Link Topology Feature Extraction Branch: The core function is to mine the transmission path features between nodes based on the hierarchical chain linkage relationship of the chain-linked graph. This branch uses the topology of the chain-linked graph as input, and combines the hierarchical linkage rules of core hubs, first-level transmission, and terminal connections to analyze key information such as reachability paths, path lengths, and path node densities between nodes. Finally, it outputs a transmission path feature vector with a unified dimension, providing feature support for the model to capture the impact of topology on risk transmission.
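As a rough illustration of the kind of path information this branch works with, the following sketch enumerates the reachable paths between two nodes of a small directed graph and summarizes them as path count and hop lengths. It is not the trained branch itself; the adjacency-dictionary representation, the function names, and the chosen summary statistics are assumptions.

```python
def simple_paths(graph, start, end, _prefix=None):
    """Enumerate all cycle-free paths from start to end by depth-first search."""
    prefix = (_prefix or []) + [start]
    if start == end:
        return [prefix]
    paths = []
    for nxt in graph.get(start, []):
        if nxt not in prefix:  # skip nodes already on the current path
            paths.extend(simple_paths(graph, nxt, end, prefix))
    return paths

def path_features(graph, start, end):
    """Summarize reachability between two nodes as a small feature dict."""
    paths = simple_paths(graph, start, end)
    if not paths:
        return {"path_count": 0, "min_hops": 0, "max_hops": 0}
    hops = [len(p) - 1 for p in paths]
    return {"path_count": len(paths), "min_hops": min(hops), "max_hops": max(hops)}

# Hypothetical retail chain: core hub -> first-level transmission -> terminal.
graph = {
    "platform": ["payment", "merchant"],
    "payment": ["warehouse"],
    "merchant": ["warehouse"],
}
features = path_features(graph, "platform", "warehouse")
```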

[0109] The risk transmission strength analysis branch's core function is to calculate the transmission potential energy features of each link by combining the node risk transmission coefficient and the edge association strength. This branch uses the node risk transmission coefficient and edge association strength in the sample as input, and quantifies the risk transmission capacity of each transmission link through potential energy calculation algorithms (such as weighted multiplication of coefficients and strengths, and summation of cumulative potential energy along the link). Finally, it outputs the transmission potential energy feature vector between the nodes in the link, providing a basis for the model to judge the risk transmission strength.
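The potential energy calculation mentioned above (multiplying coefficients by strengths and summing along the link) can be sketched as follows. The source does not specify which node's coefficient is paired with each edge, so this sketch assumes the upstream node's risk transmission coefficient is used; the names and numbers are illustrative.

```python
def path_potentials(path, node_coeff, edge_strength):
    """Return per-link potential energies and their cumulative sum along a path.

    Assumption: the potential of link (src, dst) is the risk transmission
    coefficient of the upstream node src times the edge association strength.
    """
    per_link = [
        node_coeff[src] * edge_strength[(src, dst)]
        for src, dst in zip(path, path[1:])
    ]
    return per_link, sum(per_link)

# Hypothetical values for a three-node link.
node_coeff = {"platform": 0.5, "payment": 0.4}
edge_strength = {("platform", "payment"): 0.8, ("payment", "warehouse"): 0.5}
per_link, total = path_potentials(
    ["platform", "payment", "warehouse"], node_coeff, edge_strength
)
```

With these values the two links receive potentials of about 0.4 and 0.2, and the cumulative potential along the path is about 0.6.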

[0110] The node risk association mining branch's core function is to extract risk association features between nodes based on their risk levels and probability of occurrence. This branch uses the risk level and probability of occurrence of each node as input data to mine the correlation between risk levels and the synergy of occurrence probabilities between nodes (such as the risk linkage pattern between high-risk nodes and surrounding nodes). Ultimately, it outputs a risk association feature vector between nodes, providing support for the model to capture the impact of node risks on transmission.

[0111] The prediction output layer's core function is to jointly operate on the three types of features output by the link topology feature extraction branch, the risk transmission strength analysis branch, and the node risk association mining branch, and to output the link transmission risk prediction result. This layer integrates the three types of features through conventional methods such as feature splicing and weighted fusion, comprehensively considering the influence of topology path, transmission strength, and node risk association, and finally outputs a multi-dimensional prediction result including risk transmission probability, transmission path, impact range, and outbreak time window.

[0112] S404. Divide the labeled link training sample set into a training subset and a validation subset according to a preset ratio; adopt machine learning training method, input the training subset into a multi-branch model architecture for iterative training, obtain the prediction result through forward propagation, calculate the prediction error using a preset multi-task joint loss function, and update the parameters of each branch of the model through back propagation, iterating until the loss value of the training subset converges to a stable state.

[0113] Specifically, step S404 is the model iterative training stage. The core is to use the training subset to iteratively optimize the parameters of the constructed multi-branch model architecture, allowing the model to gradually master the mapping rule of "input features → output prediction results" until the model's loss value on the training subset stabilizes. In practice, the labeled link training sample set is first divided, according to a dataset partitioning ratio conventional in this field (e.g., 7:3 or 8:2), into a training subset (for model parameter learning) and a validation subset (for subsequent verification of the model's generalization ability). Then, conventional supervised machine learning training methods are used to iteratively train the model: the training subset is input into the multi-branch model architecture in fixed batch sizes (e.g., 32 or 64 samples per batch). Through the model's forward propagation process, the three feature extraction branches extract their corresponding features, and the prediction output layer then completes feature fusion and risk transmission prediction to obtain the model's prediction result. The prediction error between the prediction result and the true labels of the samples is calculated using a preset multi-task joint loss function. In this joint loss function, the regression tasks for risk transmission probability and outbreak time window adopt the mean squared error loss function, while the matching / classification tasks for transmission path and impact range adopt the cross-entropy loss function; multi-task joint optimization is achieved by weighted summation of the individual losses. The prediction error is fed back to each branch of the model through the backpropagation algorithm, and the model parameters of the link topology feature extraction branch, risk transmission strength analysis branch, node risk association mining branch, and prediction output layer are updated according to the error gradient. The training process of "forward propagation - error calculation - backpropagation - parameter update" is repeated to continuously optimize the model parameters until the loss value of the training subset converges to a stable state (e.g., the fluctuation of the loss value over 10 consecutive iterations is less than the preset threshold), indicating that the model has initially mastered the feature patterns in the training data and completed the basic training.
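The multi-task joint loss and the convergence criterion can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the time window is treated as a numeric regression target, the path and impact range are treated as class indices over candidate sets, and the task weights (0.5 each), the window of 10 iterations, and the threshold value are placeholder choices. No gradient computation is shown.

```python
import math

def mse(pred, target):
    """Mean squared error over paired numeric predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(probs, true_index):
    """Cross-entropy of a predicted class distribution against the true class."""
    return -math.log(max(probs[true_index], 1e-12))  # clamp to avoid log(0)

def joint_loss(pred, label, w_reg=0.5, w_cls=0.5):
    """Weighted sum of the regression loss (probability, time window) and the
    classification loss (transmission path, impact range)."""
    reg = mse(
        [pred["probability"], pred["window"]],
        [label["probability"], label["window"]],
    )
    cls = (cross_entropy(pred["path_probs"], label["path"])
           + cross_entropy(pred["range_probs"], label["range"]))
    return w_reg * reg + w_cls * cls

def has_converged(losses, window=10, threshold=1e-3):
    """True when the loss fluctuated by less than threshold over the last
    `window` consecutive iterations."""
    if len(losses) < window:
        return False
    recent = losses[-window:]
    return max(recent) - min(recent) < threshold
```

Adjusting `w_reg` and `w_cls` corresponds to the weight adjustment between regression and classification tasks described for the validation stage.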

[0114] S405. Verify and optimize the model's predictive performance using the validation subset: Calculate the risk transmission probability prediction error, transmission path prediction accuracy, impact range matching degree, and time window hit rate of the model on the validation subset. If any indicator fails to meet the preset standard, adjust the weights of the multi-task joint loss function or the model parameters, and return to S404 to repeat the training iteration until all indicators meet the preset requirements, thus completing the construction of the pre-trained chain-like risk transmission prediction model.

[0115] Specifically, step S405 is the model performance verification and optimization stage. The core is to verify the model's generalization ability through the validation subset, avoid overfitting, and adjust model parameters based on the verification results until the model meets the performance requirements of practical applications, ultimately completing the construction and training of the chain-like risk transmission prediction model. In practice, the trained model is tested using the validation subset obtained in step S404, and four core verification metrics are calculated: 1) risk transmission probability prediction error, measuring the deviation between the model's predicted transmission probability and the true probability (e.g., mean absolute error); 2) transmission path prediction accuracy, measuring the proportion of samples for which the model's predicted transmission path matches the true path; 3) impact range matching degree, measuring the degree of fit between the model's predicted impact range and the true impact range; and 4) time window hit rate, measuring the proportion of samples for which the model's predicted outbreak time window agrees with the true time window.

[0116] If any indicator fails to meet the preset standard (e.g., risk transmission probability prediction error ≤ 0.1, transmission path prediction accuracy ≥ 80%, impact range matching degree ≥ 85%, time window hit rate ≥ 80%), then adjust the weight allocation of the multi-task joint loss function (e.g., the weight ratio of regression task and classification task) or model parameters (e.g., the number of neurons in the feature extraction branch, the learning rate of iterative training), and return to step S404 to restart the training iteration; repeat the "training-validation-optimization" cycle until the model meets the preset requirements for all four core indicators on the validation subset, indicating that the model not only grasps the patterns in the training data, but also has a good generalization ability to unseen data, and finally completes the construction and training of the pre-trained chain-like risk transmission prediction model.
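The acceptance check on the validation subset can be sketched as follows, using the example thresholds quoted above. The metric key names and helper functions are hypothetical, and exact-match comparison is an assumed way of computing the accuracy, matching degree, and hit rate.

```python
def mean_absolute_error(pred, true):
    """Average absolute deviation between predicted and true probabilities."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)

def match_rate(pred, true):
    """Share of samples whose predicted label equals the true label."""
    return sum(p == t for p, t in zip(pred, true)) / len(pred)

# Example thresholds from the text: (bound, direction).
THRESHOLDS = {
    "probability_mae": (0.10, "max"),   # error must be <= 0.1
    "path_accuracy": (0.80, "min"),     # accuracy must be >= 80%
    "range_match": (0.85, "min"),       # matching degree must be >= 85%
    "window_hit_rate": (0.80, "min"),   # hit rate must be >= 80%
}

def failed_indicators(metrics):
    """Return the names of indicators that miss their preset standard."""
    failed = []
    for name, (bound, direction) in THRESHOLDS.items():
        value = metrics[name]
        ok = value <= bound if direction == "max" else value >= bound
        if not ok:
            failed.append(name)
    return failed
```

An empty list means all four indicators pass; otherwise the loss weights or model parameters would be adjusted and training in step S404 repeated.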

[0117] In one embodiment, step S50 specifically includes the following steps:

[0118] S511. Extract and integrate the risk basis of the hierarchical blocking strategy. The risk basis includes the node risk level and risk occurrence probability in the node risk prediction results, the risk transmission probability and risk impact range in the link transmission risk prediction results, and the sensitivity level corresponding to the data sensitivity features.

[0119] S512. Based on the aforementioned risk basis, construct a weighted grading model, configure differentiated weights for each risk basis, and adjust the weight coefficients of the corresponding node risk levels step by step according to the hierarchical order of core hub level nodes, first-level transmission level nodes, and terminal connection level nodes.

[0120] S513. The risk basis is weighted by the weighted grading model to obtain the comprehensive risk value of each level node and the corresponding transmission link, and the comprehensive risk value is mapped to the corresponding comprehensive risk level according to the preset risk threshold.

[0121] S514. Match the corresponding blocking strategy according to the comprehensive risk level to obtain a hierarchical blocking strategy that is compatible with the risk status of each level of node and the corresponding transmission link.

[0122] In this embodiment, specifically as described in steps S511 to S514 above, step S511 is the risk basis integration step, the core of which is to extract and summarize the multi-dimensional core basis that affects the graded blocking decision, so as to ensure that the formulation of the blocking strategy has comprehensive and quantitative data support. In specific implementation, three key risk bases are integrated: The first is the basis for node risk prediction results, namely the risk level (high / medium / low) and probability of occurrence (e.g., 0.72, 0.89) of each node at each level output in step S313, reflecting the basic risk status of a single node; the second is the basis for link transmission risk prediction results, namely the risk transmission probability (e.g., 0.89) and risk impact range (e.g., local first-level transmission level, full-link impact) output in step S413, reflecting the potential for risk diffusion and the scope of impact in the link; the third is the basis for data sensitivity characteristics, namely the sensitivity level of the data flowing through each node (divided into core sensitive level, general sensitive level, and non-sensitive level according to data type, where the core sensitive level includes user payment information, identity information, etc., the general sensitive level includes order flow information, etc., and the non-sensitive level includes publicly available business promotion data, etc.), reflecting the degree of harm after data leakage or tampering.

[0123] By standardizing and organizing the above three types of criteria (such as converting risk level and sensitivity level into quantitative scores: high level = 3 points, medium level = 2 points, low level = 1 point), and then using feature integration to unify and associate the scattered multi-dimensional criteria with nodes at each level and corresponding transmission links, a complete and calculable set of risk criteria is formed, laying the data foundation for the subsequent construction of a weighted grading model.

[0124] Step S512 is the weighted hierarchical model construction stage. The core is to configure differentiated weights based on the importance of various risk criteria, and adjust the weight coefficients according to the characteristics of the node hierarchy to ensure the model can reflect the impact of different factors on blocking decisions. In specific implementation, firstly, based on the business logic of supply chain risk management, basic weights are configured for the three types of risk criteria: considering that data sensitivity directly relates to the degree of risk severity, the highest basic weight (e.g., 0.4) is configured; the probability of risk transmission and the scope of impact are related to the overall loss after risk spread, so the second highest basic weight (e.g., 0.3) is configured; the node risk level and the probability of occurrence are the basic sources of risk, so a basic weight (e.g., 0.3) is configured.

[0125] Based on this, following the hierarchical order of "core hub level nodes → first-level transmission level nodes → terminal connection level nodes," the weight coefficients of the corresponding node risk levels are progressively reduced: core hub level nodes, as the core of the link, have the greatest impact on the entire link, and their node risk level weight remains at the basic value (0.3); the risk impact of first-level transmission level nodes is the second largest, and their weight coefficient is reduced to 0.25; the risk impact of terminal connection level nodes is relatively limited, and their weight coefficient is reduced to 0.2. This reflects the difference in the impact of node levels on risk transmission and ensures that the model's decisions are more in line with the actual risk transmission patterns of the link.

[0126] For example, weights are configured for the e-commerce retail chain as follows: the basic weight for data sensitivity level is 0.4, the basic weight for risk transmission probability and impact scope is 0.3, and the basic weight for node risk level and occurrence probability is 0.3. The node risk level weight of the platform central control server at the core hub level remains at 0.3; the weight of the payment gateway and brand merchant nodes at the first-level transmission level is reduced to 0.25; and the weight of the logistics and warehousing nodes at the terminal connection level is reduced to 0.2, thus completing the weighted grading model adapted to this chain.

[0127] Step S513 is the comprehensive risk level calculation step. Its core is to quantify the integrated risk basis using the weighted grading model, transforming the multi-dimensional data into a unified comprehensive risk value and mapping it to a clear comprehensive risk level. Specifically, each risk basis is first quantified and assigned a value: risk level (high=3, medium=2, low=1), risk occurrence probability (using the original probability value, e.g., 0.85), risk transmission probability (using the original probability value, e.g., 0.89), risk impact scope (full-link=3, local first-level=2, end-level=1), and data sensitivity level (core sensitive=3, general sensitive=2, non-sensitive=1). Then, according to the weights configured in the weighted grading model, the quantified data for each node and its corresponding transmission link are weighted and summed to obtain the comprehensive risk value, calculated by the formula: Comprehensive Risk Value = Data Sensitivity Level Score × 0.4 + (Risk Transmission Probability + Risk Impact Scope Score) × 0.3 × Mean Coefficient + Node Risk Level Score × Hierarchical Weight + Node Risk Occurrence Probability × 0.3 × Mean Coefficient, where the mean coefficient is used to unify the magnitude of each factor's value and avoid calculation deviations caused by differences in units. Finally, based on the preset risk thresholds, the comprehensive risk value is mapped to the corresponding comprehensive risk level: high risk (comprehensive risk value ≥ 2.5), medium risk (1.5 ≤ comprehensive risk value < 2.5), and low risk (comprehensive risk value < 1.5).

[0128] For example, the quantitative values for the payment gateway node in the e-commerce retail chain are: data sensitivity level 3 points, risk transmission probability 0.89, risk impact range 2 points, node risk level 3 points (high), node risk occurrence probability 0.85, and hierarchical weight 0.25 (first-level transmission level). With a mean coefficient of 0.5, the weighted calculation gives a comprehensive risk value of 3×0.4+(0.89+2)×0.3×0.5+3×0.25+0.85×0.3×0.5≈2.51, which is mapped to the "high risk level" according to the preset threshold. The quantitative values for the logistics and warehousing node are: data sensitivity level 2 points, risk transmission probability 0.65, risk impact range 1 point, node risk level 2 points (medium), node risk occurrence probability 0.72, and hierarchical weight 0.2. With the same mean coefficient, the comprehensive risk value is 2×0.4+(0.65+1)×0.3×0.5+2×0.2+0.72×0.3×0.5≈1.56, which is mapped to the "medium risk level".
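The weighted calculation of step S513 can be sketched as follows. This is a minimal illustration, assuming the mean coefficient of 0.5 that appears in the payment gateway expression applies to every node; the function, dictionary, and tier names are hypothetical. Evaluated term by term, the formula gives approximately 2.51 for the payment gateway node and approximately 1.56 for the logistics and warehousing node, which map to the high and medium risk levels respectively.

```python
# Node risk level weight per hierarchy tier, as configured in step S512.
HIERARCHICAL_WEIGHT = {"core_hub": 0.3, "first_level": 0.25, "terminal": 0.2}

def comprehensive_risk(sensitivity, trans_prob, impact, node_level,
                       occ_prob, tier, mean_coeff=0.5):
    """Comprehensive risk value following the weighted-sum formula of S513.

    mean_coeff=0.5 is an assumption taken from the worked example.
    """
    return (sensitivity * 0.4
            + (trans_prob + impact) * 0.3 * mean_coeff
            + node_level * HIERARCHICAL_WEIGHT[tier]
            + occ_prob * 0.3 * mean_coeff)

def risk_level(value):
    """Map a comprehensive risk value to its level using the preset thresholds."""
    if value >= 2.5:
        return "high"
    if value >= 1.5:
        return "medium"
    return "low"

# Payment gateway node (first-level transmission level).
payment = comprehensive_risk(3, 0.89, 2, 3, 0.85, "first_level")
# Logistics and warehousing node (terminal connection level).
logistics = comprehensive_risk(2, 0.65, 1, 2, 0.72, "terminal")
```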

[0129] Step S514 is the tiered blocking strategy matching stage. Its core is to match differentiated and adaptive blocking strategies based on the comprehensive risk level of each level of node and its corresponding transmission link, achieving the prevention and control goal of "strong blocking for high risk, moderate control for medium risk, and light warning for low risk." In specific implementation, a blocking strategy library corresponding to three categories of comprehensive risk levels is preset:

[0130] High-risk level: Match the "complete block + emergency response" strategy, that is, immediately block the transmission link between the node and the upstream and downstream nodes, suspend the node's related business operations, and trigger emergency response procedures (such as data isolation, security audit, manual verification) to prevent the risk from spreading and escalating;

[0131] Medium risk level: Match the "rate limiting control + real-time monitoring" strategy, that is, limit the flow of business data of the node (such as reducing the data transmission rate and limiting the number of concurrent requests) to ensure that the core business is not interrupted, while starting real-time risk monitoring to continuously track changes in risk status. If the risk escalates, switch to the high risk strategy.

[0132] Low-risk level: Match the "risk warning + routine monitoring" strategy, that is, issue risk warnings through system alarms, log recording and other means to remind operation and maintenance personnel to pay attention, while maintaining routine security monitoring efforts, without interrupting business and ensuring business continuity.

[0133] For example, in the aforementioned e-commerce retail supply chain, the payment gateway node, classified as "high risk", is matched with the "complete block + emergency response" strategy. This immediately blocks its transmission links with the platform's central control server and brand merchant nodes, suspends payment transactions, and initiates a security audit. The logistics and warehousing node, classified as "medium risk", is matched with the "rate limiting control + real-time monitoring" strategy. This limits the order data flow rate while monitoring the node's security status in real time. The terminal promotion and advertising node (assuming a low comprehensive risk level) is matched with the "risk warning + routine monitoring" strategy, which only issues warnings without affecting normal business operations. Through this tiered strategy matching, risks are effectively controlled while the impact on normal supply chain operations is minimized.
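The matching of comprehensive risk levels to the preset blocking strategy library can be sketched as a simple lookup. The dictionary layout, action identifiers, and function names are assumptions made for illustration; only the three strategy names and their described measures come from the text.

```python
# Preset blocking strategy library keyed by comprehensive risk level.
STRATEGY_LIBRARY = {
    "high": {
        "name": "complete block + emergency response",
        "actions": ["block_links", "suspend_business", "trigger_emergency_response"],
    },
    "medium": {
        "name": "rate limiting control + real-time monitoring",
        "actions": ["limit_data_flow", "start_realtime_monitoring"],
    },
    "low": {
        "name": "risk warning + routine monitoring",
        "actions": ["issue_warning", "keep_routine_monitoring"],
    },
}

def match_strategy(level):
    """Return the blocking strategy for a comprehensive risk level."""
    return STRATEGY_LIBRARY[level]

def next_level(level, risk_escalated):
    """A medium-risk node switches to the high-risk strategy if risk escalates."""
    if level == "medium" and risk_escalated:
        return "high"
    return level

# Hypothetical node-to-level assignment from the retail chain example.
node_levels = {"payment_gateway": "high", "logistics_warehousing": "medium",
               "promotion_advertising": "low"}
plan = {node: match_strategy(level)["name"] for node, level in node_levels.items()}
```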

[0134] It should be noted that the tiered blocking strategy in this solution is not statically executed, but rather features a closed-loop mechanism for continuous iterative optimization. By combining the actual implementation results of the blocking strategy with reverse adjustments to model parameters and strategy configurations, the adaptability and accuracy of risk management are ensured. The specific iterative optimization process is as follows:

[0135] After executing the tiered blocking strategy in step S514, two types of core feedback data are continuously collected: first, risk control effectiveness data, including whether the risk was effectively contained after blocking, whether secondary transmission occurred, and whether the scope of risk spread met expectations; second, business impact assessment data, including the degree of impact of the blocking strategy on the normal business flow of the supply chain (such as business interruption duration, data transmission delay, changes in transaction success rate, etc.). Through comprehensive analysis of the two types of data, the suitability of the current tiered blocking strategy with the weighted tiered model is determined: if risk spread still occurs after blocking high-risk nodes, or if the control strategy for medium- and low-risk nodes excessively affects business continuity, an iterative optimization mechanism is triggered.
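The trigger condition for iterative optimization can be sketched as follows. The feedback field names and the two business-impact limits (maximum interruption minutes and minimum transaction success rate) are hypothetical placeholders; the source does not give concrete values for what counts as an excessive business impact.

```python
def needs_optimization(feedback, max_interruption_minutes=30.0,
                       min_success_rate=0.95):
    """Decide whether the closed-loop optimization should be triggered.

    feedback keys (all assumed for this sketch):
      level                     "high", "medium" or "low"
      risk_spread_after_block   True if risk still spread after blocking
      interruption_minutes      business interruption caused by the strategy
      transaction_success_rate  success rate observed while the strategy ran
    """
    # Case 1: risk still spreads after a high-risk node was blocked.
    if feedback["level"] == "high" and feedback["risk_spread_after_block"]:
        return True
    # Case 2: controls on medium/low-risk nodes hurt business continuity.
    if feedback["level"] in ("medium", "low"):
        return (feedback["interruption_minutes"] > max_interruption_minutes
                or feedback["transaction_success_rate"] < min_success_rate)
    return False
```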

[0136] During the iterative optimization process, the parameters of the weighted grading model are first adjusted based on the feedback data. This includes recalibrating the basic weights of each risk basis (e.g., increasing the weight ratio of data sensitivity features for core business nodes), optimizing the weight reduction coefficients of nodes at different levels (e.g., adjusting the weight difference between the first-level transmission level and the terminal connection level according to the actual transmission pattern), and correcting the preset thresholds of the comprehensive risk levels (e.g., appropriately lowering the threshold range of the medium risk level for links with high business continuity requirements). Secondly, the feedback data is used as new samples to supplement the training datasets of the node risk prediction model and the chain-like risk transmission prediction model. The model training processes of steps S301 to S305 and S401 to S405 are repeated to update the model parameters and improve the prediction accuracy. Finally, based on the optimized models and parameters, steps S511 to S514 are re-executed to generate the updated tiered blocking strategy.

[0137] Through a closed-loop iteration of "strategy execution → effect feedback → model optimization → strategy update", the tiered blocking strategy can dynamically adapt to the business changes and risk evolution patterns of the e-commerce supply chain. While effectively preventing and controlling risks, it minimizes unnecessary impact on normal business and achieves a dynamic balance between risk control and business continuity.

[0138] It should be noted that the aforementioned examples of the entire e-commerce retail process are merely illustrative, intended to more intuitively and clearly explain the technical solution of this invention, facilitating understanding and implementation by those skilled in the art, and do not constitute any limitation on the technical solution of this invention. The technical solution of this invention can be flexibly adapted to the risk management needs of various e-commerce supply chains according to the differences in actual application scenarios. Any technical implementation based on the core concepts of this invention (i.e., node risk prediction, link transmission risk prediction, tiered blocking strategy formulation, and closed-loop iterative optimization), regardless of the specific supply chain scenario, number of nodes, or business type, falls within the protection scope of this invention.

[0139] In one embodiment, a big data-based e-commerce data security management system is provided, which corresponds to the big data-based e-commerce data security management method described in the previous embodiment. This big data-based e-commerce data security management system includes:

[0140] The data acquisition module is used to collect node security status data and inter-node correlation data of each level of the e-commerce supply chain in response to the preset response mechanism, and to preprocess the data to obtain the node risk characteristics, correlation characteristics and data sensitivity characteristics of each level of nodes.

[0141] The risk transmission calculation module is used to construct a chain-like association graph by using node risk characteristics as graph node attributes and association relationship characteristics as association edge attributes; and to calculate the chain-like association graph based on preset quantization calculation rules to obtain risk transmission parameters, which include node risk transmission coefficient, edge risk transmission probability, and edge association strength.

[0142] The node risk prediction module is used to input node risk features into a pre-trained node risk prediction model and output the node risk prediction result for a single node. The node risk prediction result includes the node risk level, the probability of risk occurrence, and the risk type.

[0143] The transmission risk prediction module is used to input the chain-like association graph, risk transmission parameters and node risk prediction results into the pre-trained chain-like risk transmission prediction model, and output the link transmission risk prediction results, which include the risk transmission probability, risk transmission path, risk impact range and risk outbreak time window.

[0144] The strategy formulation module is used to formulate a graded blocking strategy based on the node risk prediction results, link transmission risk prediction results, and data sensitivity characteristics, and to implement corresponding differentiated control over the supply chain path according to the graded blocking strategy.
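As a non-limiting illustration of the quantitative calculation performed by the risk transmission calculation module, the following minimal Python sketch derives the three risk transmission parameters from node and edge attributes. The base coefficients, the equal attribute weights, the correction formulas, and all names (`Node`, `Edge`, `BASE_COEFFICIENT`, and so on) are assumptions made for this example only; the application does not specify concrete values or formulas.

```python
from dataclasses import dataclass

# Assumed base transmission coefficient per level label (illustrative values).
BASE_COEFFICIENT = {"core_hub": 0.9, "first_level": 0.6, "terminal": 0.3}
# Assumed ordering of levels along the chain, used for the matching degree.
LEVEL_RANK = {"core_hub": 0, "first_level": 1, "terminal": 2}


@dataclass
class Node:
    name: str
    level: str                # level label
    risk_level_weight: float  # node risk level weight, assumed in [0, 1]


@dataclass
class Edge:
    src: str
    dst: str
    relation_weight: float   # quantified association relationship type, in [0, 1]
    interaction_freq: float  # normalized interaction frequency, in [0, 1]
    data_volume: float       # normalized data flow volume, in [0, 1]
    duration: float          # normalized duration, in [0, 1]


def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def node_transmission_coefficient(node: Node) -> float:
    """Base coefficient of the level, corrected by the risk level weight (assumed form)."""
    return clamp01(BASE_COEFFICIENT[node.level] * (0.5 + 0.5 * node.risk_level_weight))


def edge_association_strength(edge: Edge, nodes: dict) -> float:
    """Initial strength from edge attributes, corrected by a level matching degree."""
    initial = 0.25 * (edge.relation_weight + edge.interaction_freq
                      + edge.data_volume + edge.duration)
    gap = abs(LEVEL_RANK[nodes[edge.src].level] - LEVEL_RANK[nodes[edge.dst].level])
    matching = 1.0 if gap == 1 else 0.8  # assumption: adjacent levels match best
    return clamp01(initial * matching)


def edge_transmission_probability(edge: Edge, nodes: dict) -> float:
    """Source node coefficient weighted by the edge association strength."""
    coefficient = node_transmission_coefficient(nodes[edge.src])
    return clamp01(coefficient * edge_association_strength(edge, nodes))


# Hypothetical two-node chain: a platform (core hub) feeding a warehouse.
nodes = {
    "platform": Node("platform", "core_hub", 1.0),
    "warehouse": Node("warehouse", "first_level", 0.5),
}
edge = Edge("platform", "warehouse", 1.0, 0.8, 0.6, 0.4)
coefficient = node_transmission_coefficient(nodes["platform"])
strength = edge_association_strength(edge, nodes)
probability = edge_transmission_probability(edge, nodes)
```

In this sketch the edge risk transmission probability is simply the product of the source node coefficient and the edge association strength; any other monotonic weighting would serve the same illustrative purpose.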

[0145] For specific limitations regarding a big data-based e-commerce data security management system, please refer to the limitations of a big data-based e-commerce data security management method described above, which will not be repeated here. The various modules in the aforementioned big data-based e-commerce data security management system can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the corresponding operations of each module.
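By way of example only, a software implementation of the strategy formulation module might resemble the following Python sketch of a weighted grading model. The weights, the per-level factors, the thresholds, and the strategy names (`isolate`, `restrict`, `monitor`) are hypothetical placeholders chosen for illustration and are not taken from this application; each risk basis is assumed to be normalized to the range 0 to 1.

```python
# Assumed weight of each risk basis (illustrative; chosen to sum to 1).
BASIS_WEIGHTS = {
    "node_risk_level": 0.25,
    "occurrence_probability": 0.20,
    "transmission_probability": 0.25,
    "impact_range": 0.15,
    "sensitivity_level": 0.15,
}
# Assumed factor applied to the node risk level weight; it decreases from
# core hub level nodes to terminal level nodes.
LEVEL_FACTOR = {"core_hub": 1.0, "first_level": 0.8, "terminal": 0.6}
# Assumed preset thresholds: (lower bound, comprehensive risk level, strategy).
THRESHOLDS = [
    (0.7, "high", "isolate"),
    (0.4, "medium", "restrict"),
    (0.0, "low", "monitor"),
]


def comprehensive_risk(basis: dict, level: str) -> float:
    """Weighted sum of the risk bases for one node and its transmission link."""
    total = 0.0
    for name, weight in BASIS_WEIGHTS.items():
        if name == "node_risk_level":
            weight *= LEVEL_FACTOR[level]
        total += weight * basis[name]
    return total


def graded_blocking_strategy(basis: dict, level: str) -> tuple:
    """Map the comprehensive risk value to a risk level and a blocking strategy."""
    value = comprehensive_risk(basis, level)
    for lower_bound, risk_level, strategy in THRESHOLDS:
        if value >= lower_bound:
            return value, risk_level, strategy
    raise ValueError("comprehensive risk value must be non-negative")


# The same hypothetical risk basis evaluated at two different levels.
basis = {name: 0.75 for name in BASIS_WEIGHTS}
core_result = graded_blocking_strategy(basis, "core_hub")
terminal_result = graded_blocking_strategy(basis, "terminal")
```

With identical inputs, the core hub level node receives a higher comprehensive risk value than the terminal level node, so the sketch assigns it a stricter strategy.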

[0146] In one embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in Figure 2. The computer device includes a processor, memory, network interface, and database connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and database. The internal memory provides the environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The database is used for data storage, data processing, and data analysis. The network interface is used for communication with external terminals via a network connection. When the computer program is executed by the processor, it implements a big data-based e-commerce data security management method.

[0147] In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements a big data-based e-commerce data security management method.

[0148] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements a big data-based e-commerce data security management method.

[0149] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments of the above methods. Any references to memory, storage, databases, or other media used in the embodiments provided in this application can include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

[0150] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional units and modules is used as an example. In practical applications, the above functions can be assigned to different functional units and modules as needed, that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above.

[0151] The above-described embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should all be included within the protection scope of this application.

Claims

1. An e-commerce data security management method based on big data, characterized in that it includes the following steps: In response to a preset response mechanism, node security status data and inter-node association relationship data of the nodes at each level of the e-commerce supply chain are collected and preprocessed to obtain the node risk characteristics, association relationship characteristics and data sensitivity characteristics of the nodes at each level. A chain-like association graph is constructed by using the node risk characteristics as graph node attributes and the association relationship characteristics as association edge attributes. Calculations are then performed on the chain-like association graph based on preset quantitative calculation rules to obtain risk transmission parameters, which include the node risk transmission coefficient, the edge risk transmission probability, and the edge association strength. The node risk characteristics are input into a pre-trained node risk prediction model, and the node risk prediction result for a single node is output. The node risk prediction result includes the node risk level, the probability of risk occurrence, and the risk type. The chain-like association graph, the risk transmission parameters, and the node risk prediction results are input into a pre-trained chain-like risk transmission prediction model, and the link transmission risk prediction results are output. The link transmission risk prediction results include the risk transmission probability, risk transmission path, risk impact range, and risk outbreak time window. Based on the node risk prediction results, the link transmission risk prediction results, and the data sensitivity characteristics, a graded blocking strategy is formulated, and corresponding differentiated control is implemented on the supply chain path according to the graded blocking strategy.

2. The e-commerce data security management method based on big data as described in claim 1, characterized in that the triggering of the preset response mechanism specifically includes the following steps: Based on the initially constructed chain-like association graph, the nodes of the e-commerce supply chain are divided, in order of decreasing trigger priority, into core hub level nodes, first-level transmission level nodes, and terminal association level nodes. Based on the node risk transmission coefficient and edge association strength initially calculated for the nodes at each level, differentiated trigger thresholds are configured for the nodes at each level, where the higher the node risk transmission coefficient and edge association strength, the lower the corresponding trigger threshold. The response mechanism is triggered when the node security status data corresponding to a node at any level reaches its trigger threshold, or when the edge risk transmission probability of an association edge of that node reaches the preset transmission threshold.

3. The e-commerce data security management method based on big data as described in claim 1, characterized in that the step of constructing a chain-like association graph by using the node risk characteristics as graph node attributes and the association relationship characteristics as association edge attributes specifically includes the following steps: The node risk characteristics of the nodes at each level are structured and embedded to form a node attribute vector, which includes the node risk level weight, risk type identifier and level label. The association relationship characteristics are classified and quantified to form an association edge attribute vector, which includes the association relationship type, interaction frequency, data flow volume and duration. Using the core hub level nodes as the topology center, and based on the hierarchical chain linkage relationship between the core hub level nodes, the first-level transmission level nodes, and the terminal association level nodes, the node attribute vectors and association edge attribute vectors are topologically mapped to construct a chain-like association graph that includes the node topology relationship.

4. The e-commerce data security management method based on big data as described in claim 3, characterized in that the step of performing calculations on the chain-like association graph based on the preset quantitative calculation rules to obtain the risk transmission parameters specifically includes the following steps: Based on the level label in the node attribute vector, the basic transmission coefficient of the nodes at each level is determined. The basic transmission coefficient is then weighted and corrected by the node risk level weight to obtain the node risk transmission coefficient of the nodes at each level. The initial association strength of an association edge is calculated based on the association relationship type, interaction frequency, data flow volume and duration in the association edge attribute vector. The initial association strength is corrected by combining the hierarchical matching degree of the nodes at both ends of the association edge to obtain the edge association strength. The node risk transmission coefficient is weighted by the edge association strength to obtain the edge risk transmission probability of the corresponding association edge.

5. The e-commerce data security management method based on big data as described in claim 3, characterized in that the step of inputting the node risk characteristics into a pre-trained node risk prediction model and outputting the node risk prediction result for a single node specifically includes the following steps: The node risk characteristics of the nodes at each level are fused with the corresponding node attribute vectors to obtain the fused node risk input features. The node risk input features are input into the pre-trained node risk prediction model. Through the hierarchical feature extraction branch of the model, node hierarchical association features are extracted based on the level labels in the node attribute vector and the node topology relationship in the chain-like association graph. Through the risk feature analysis branch of the model, feature mining is performed on the risk anomaly indicators and security status data in the node risk input features to extract the real-time risk features of the nodes. The prediction output layer of the model performs joint risk prediction on the node hierarchical association features and the real-time risk features of the nodes, and outputs the node risk prediction result for a single node, including the node risk level, the probability of risk occurrence, and the risk type.

6. The e-commerce data security management method based on big data as described in claim 4, characterized in that the step of inputting the chain-like association graph, the risk transmission parameters, and the node risk prediction results into the pre-trained chain-like risk transmission prediction model and outputting the link transmission risk prediction results specifically includes the following steps: The link risk input features are obtained by fusing the node topology relationship in the chain-like association graph, the node risk transmission coefficient and edge association strength in the risk transmission parameters, and the node risk level and risk occurrence probability in the node risk prediction results. The link risk input features are input into the pre-trained chain-like risk transmission prediction model. Through the link topology feature extraction branch of the model, the transmission path features between nodes are mined based on the hierarchical chain linkage relationship of the chain-like association graph; through the risk transmission intensity analysis branch of the model, the transmission potential energy features of each link are calculated by combining the node risk transmission coefficient and the edge association strength; and through the node risk association mining branch of the model, the risk association features between nodes are extracted based on the node risk level and the risk occurrence probability. The prediction output layer of the model performs joint calculations on the transmission path features, transmission potential energy features, and risk association features, and outputs the link transmission risk prediction results, including the risk transmission probability, risk transmission path, risk impact range, and risk outbreak time window.

7. The e-commerce data security management method based on big data as described in claim 1, characterized in that the step of formulating a graded blocking strategy based on the node risk prediction results, link transmission risk prediction results, and data sensitivity characteristics specifically includes the following steps: The risk bases of the graded blocking strategy are extracted and integrated. The risk bases include the node risk level and risk occurrence probability in the node risk prediction results, the risk transmission probability and risk impact range in the link transmission risk prediction results, and the sensitivity level corresponding to the data sensitivity characteristics. Based on these risk bases, a weighted grading model is constructed, with differentiated weights assigned to each risk basis. The weight coefficients of the corresponding node risk levels decrease progressively in the hierarchical order of core hub level nodes, first-level transmission level nodes, and terminal association level nodes. The risk bases are weighted and calculated using the weighted grading model to obtain the comprehensive risk value of the nodes at each level and their corresponding transmission links. The comprehensive risk value is then mapped to the corresponding comprehensive risk level according to a preset risk threshold. By matching the corresponding blocking strategy to the comprehensive risk level, a graded blocking strategy adapted to the risk status of the nodes at each level and their corresponding transmission links is obtained.

8. A big data-based e-commerce data security management system, used to implement the steps of the big data-based e-commerce data security management method as described in any one of claims 1-7, characterized in that it includes: The data acquisition module is used to collect node security status data and inter-node association relationship data of the nodes at each level of the e-commerce supply chain in response to the preset response mechanism, and to preprocess the data to obtain the node risk characteristics, association relationship characteristics and data sensitivity characteristics of the nodes at each level. The risk transmission calculation module is used to construct a chain-like association graph by using the node risk characteristics as graph node attributes and the association relationship characteristics as association edge attributes, and to perform calculations on the chain-like association graph based on preset quantitative calculation rules to obtain risk transmission parameters, which include the node risk transmission coefficient, the edge risk transmission probability, and the edge association strength. The node risk prediction module is used to input the node risk characteristics into a pre-trained node risk prediction model and output the node risk prediction result for a single node. The node risk prediction result includes the node risk level, the probability of risk occurrence, and the risk type. The transmission risk prediction module is used to input the chain-like association graph, the risk transmission parameters and the node risk prediction results into the pre-trained chain-like risk transmission prediction model, and output the link transmission risk prediction results, which include the risk transmission probability, risk transmission path, risk impact range and risk outbreak time window. The strategy formulation module is used to formulate a graded blocking strategy based on the node risk prediction results, link transmission risk prediction results, and data sensitivity characteristics, and to implement corresponding differentiated control over the supply chain path according to the graded blocking strategy.

9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when the processor executes the computer program, it implements the steps of the big data-based e-commerce data security management method as described in any one of claims 1-7.

10. A computer-readable storage medium storing a computer program, characterized in that when the computer program is executed by a processor, it implements the steps of the big data-based e-commerce data security management method as described in any one of claims 1-7.
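For illustration only, and without limiting the claims, the differentiated triggering of the preset response mechanism described in claim 2 might be sketched in Python as follows. The constants, the threshold formula, and the function names are assumptions introduced for this example; the only properties taken from the text are that a higher node risk transmission coefficient and edge association strength give a lower trigger threshold, and that either condition alone triggers the response.

```python
# Assumed constants (illustrative values only).
BASE_THRESHOLD = 0.8          # threshold for a node with no transmission capacity
MIN_THRESHOLD = 0.2           # floor so the threshold never reaches zero
TRANSMISSION_THRESHOLD = 0.6  # preset transmission threshold for edges


def trigger_threshold(coefficient: float, edge_strength: float) -> float:
    """Lower the threshold as the coefficient and edge strength increase."""
    threshold = BASE_THRESHOLD * (1.0 - 0.5 * coefficient) * (1.0 - 0.3 * edge_strength)
    return max(MIN_THRESHOLD, threshold)


def should_trigger(status_score: float, coefficient: float,
                   edge_strengths: list, edge_probabilities: list) -> bool:
    """Trigger on the node condition or on any associated edge condition."""
    strongest = max(edge_strengths, default=0.0)
    node_hit = status_score >= trigger_threshold(coefficient, strongest)
    edge_hit = any(p >= TRANSMISSION_THRESHOLD for p in edge_probabilities)
    return node_hit or edge_hit


# Hypothetical nodes with the same security status score of 0.5.
core_threshold = trigger_threshold(0.9, 0.7)      # core hub level node
terminal_threshold = trigger_threshold(0.3, 0.2)  # terminal level node
core_fires = should_trigger(0.5, 0.9, [0.7], [0.1])
terminal_fires = should_trigger(0.5, 0.3, [0.2], [0.1])
terminal_fires_on_edge = should_trigger(0.5, 0.3, [0.2], [0.65])
```

Under these assumed values the same status score triggers the response for the core hub level node but not for the terminal level node, unless one of the terminal node's edges reaches the transmission threshold.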