Data full life cycle privacy protection encryption management method and system

By constructing a data lineage chart and analyzing privacy propagation paths, and dynamically adjusting encryption strategies in conjunction with a risk assessment model, the shortcomings of existing technologies in data privacy protection are addressed, enabling the tracking of privacy propagation throughout the entire data lifecycle and the optimized allocation of encryption resources.

CN122197080A · Pending · Publication Date: 2026-06-12 · HANGZHOU MICROFLUIDIC TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HANGZHOU MICROFLUIDIC TECHNOLOGY CO LTD
Filing Date
2026-05-14
Publication Date
2026-06-12


Abstract

The application belongs to the technical field of information security and discloses a data full-lifecycle privacy protection encryption management method and system. The method includes: collecting data objects from each business system, tracking the complete evolution of each data object from creation to destruction, and constructing a data lineage graph; calculating, from the data lineage graph, the privacy inheritance degree and privacy diffusion degree of each data node, and combining them with a preset data asset value evaluation model to generate a lineage risk heat map; and, based on the lineage risk heat map, identifying key privacy nodes and high-risk propagation paths, determining the optimal encryption timing and encryption granularity parameters of each key privacy node with a preset encryption cost-benefit model, and performing encryption operations on each key privacy node. The application enables ordered scheduling, reliable execution, and fully traceable management of encryption operations, and ensures the integrity and auditability of privacy protection encryption management across the full data lifecycle.

Description

Technical Field

[0001] This invention relates to the field of information security technology, and more specifically, to a method and system for data privacy protection and encryption management throughout the entire data lifecycle.

Background Technology

[0002] With the deepening of digital transformation and the widespread application of big data technology, data has become a core asset and an important production factor for enterprises. Throughout the entire lifecycle of data collection, storage, processing, sharing, and destruction, data objects often undergo multiple derivation and transformation operations, forming complex data dependencies and posing higher requirements for data privacy protection. Currently, data privacy protection mainly relies on a combination of traditional static encryption and access control. This involves processing stored data using fixed encryption algorithms and then configuring corresponding access permission policies by security administrators. Some advanced management systems have begun to introduce data classification and grading technologies, enabling differentiated labeling and management of different types of data, which to some extent improves the level of precision in data privacy protection.

[0003] However, existing data privacy protection technologies still have significant shortcomings: On the one hand, traditional protection methods mostly target the static storage state of data, making it difficult to comprehensively track the evolution and flow path of data throughout its entire lifecycle. Furthermore, they lack systematic analysis of the propagation and diffusion patterns of privacy-sensitive information during data derivation and transformation, making it impossible to accurately assess the actual degree of privacy exposure at each stage. On the other hand, existing systems mostly employ uniform and fixed encryption strategies to protect data, lacking the ability to quantitatively assess the privacy risks of different data objects. They cannot dynamically adjust the strength and timing of protection based on the risk situation, and it is difficult to achieve optimal encryption performance under limited computing resources. Therefore, how to fully utilize risk quantification assessment technology to build a data privacy protection system with privacy propagation tracking and dynamic encryption decision-making capabilities has become an urgent technical problem to be solved in this field.

[0004] In view of this, the present invention proposes a data privacy protection encryption management method and system throughout the entire data lifecycle to solve the above problems.

Summary of the Invention

[0005] To overcome the aforementioned deficiencies of the prior art and achieve the above objectives, the present invention provides the following technical solution: a data lifecycle privacy protection encryption management method, comprising:
Collecting data objects from various business systems, tracking the complete evolution of each data object from creation to destruction, recording the derivation relationships, derivation operation types, and responsible parties for the derivation operations, and constructing a data lineage graph with each data object as a data node;
Marking a sensitivity level for each data node in the data lineage graph, analyzing the propagation path and diffusion range of each data node's sensitivity level during derivation operations, and calculating the privacy inheritance degree and privacy diffusion degree of each data node based on the propagation path and diffusion range;
Obtaining the privacy exposure risk of each data node through aggregated quantitative analysis, based on the privacy inheritance degree and privacy diffusion degree of each data node combined with the preset data asset value assessment model, and generating a lineage risk heat map;
Identifying, based on the lineage risk heat map, the key privacy nodes among the data nodes and the high-risk propagation paths composed of multiple key privacy nodes, and determining the optimal encryption timing and encryption granularity parameters of each key privacy node in combination with the preset encryption cost-benefit model;
Performing encryption operations on each key privacy node according to the optimal encryption timing and encryption granularity parameters, and recording the execution result of each encryption operation to form an encryption execution log.

[0006] Furthermore, the method for constructing the data lineage graph includes: obtaining the data object registration list and the operation behavior logs of each data object from each business system, where the data object registration list includes the data object registration record of each data object; obtaining, from the operation behavior logs, the lifecycle event sequence of each data object and marking the corresponding current lifecycle state; extracting the derivation relationships between data objects from the lifecycle event sequences; calculating the derivation depth and derivation breadth of each data object from the derivation relationships; and counting, from the lifecycle event sequence, the operation frequency and the number of operating entities for each data object. Each data object is treated as a data node and configured with node attributes, which include the data object registration record, current lifecycle state, derivation depth, derivation breadth, operation frequency, and number of operating entities. Each derivation relationship is treated as a directed edge and configured with edge attributes, which include the upstream data object code, downstream data object code, derivation operation type, derivation operation timestamp, and the responsible party for the derivation operation. All data nodes and directed edges are then integrated to construct the data lineage graph.

[0007] Furthermore, methods for analyzing the propagation path and diffusion range of the sensitivity level of each data node during the derivation operation include: Obtain a set of sensitive classification rules, which includes a sensitive keyword dictionary and a data type sensitivity mapping table; for each data node in the data lineage diagram, obtain the data object name and data type from the corresponding node attributes; match the data object name with the sensitive keyword dictionary, and obtain the keyword matching sensitivity level for each data node based on the matching results; based on the data type, obtain the corresponding baseline sensitivity level from the data type sensitivity mapping table; determine the sensitivity level for each data node based on the keyword matching sensitivity level and the baseline sensitivity level. For each data node in the data lineage graph, recursively trace back along the reverse direction of the directed edges to construct all directed paths from the root node to the corresponding data node, and record each directed path as a propagation path; where the root node is a data node without an upstream data object; each propagation path includes the path start node code, the path end node code, the path node code sequence, and the path edge sequence; for each data node in the data lineage graph, recursively traverse all downstream reachable data nodes along the forward direction of the directed edges to form a diffusion node set for each data node; count the number of data nodes in each diffusion node set to obtain the diffusion range of each data node.
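The path tracing and diffusion counting described in [0007] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes the lineage graph is acyclic and is given as a list of `(upstream, downstream)` pairs, and the function names and node codes `A` to `D` are hypothetical.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Index directed edges (upstream, downstream) in both directions."""
    down, up = defaultdict(list), defaultdict(list)
    for upstream, downstream in edges:
        down[upstream].append(downstream)
        up[downstream].append(upstream)
    return down, up

def propagation_paths(node, up):
    """Return every root-to-node path by recursing against edge direction.

    Assumes the graph is acyclic; a root is a node with no upstream object.
    """
    parents = up.get(node, [])
    if not parents:
        return [[node]]
    return [path + [node]
            for parent in parents
            for path in propagation_paths(parent, up)]

def diffusion_set(node, down):
    """Return all nodes reachable downstream of `node` (excluding itself)."""
    seen, stack = set(), list(down.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(down.get(current, []))
    return seen

# Hypothetical four-node graph: A -> B -> {C, D}, plus a shortcut A -> C.
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("A", "C")]
down, up = build_adjacency(edges)
paths_to_c = propagation_paths("C", up)            # two paths reach C
diffusion_range_a = len(diffusion_set("A", down))  # size of A's diffusion node set
```

For large graphs the number of root-to-node paths can grow exponentially, so a production system would likely cap or memoize the enumeration; the text does not specify this.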

[0008] Furthermore, methods for calculating the privacy inheritance and privacy diffusion of each data node include: Based on the preset sensitivity level score mapping table, obtain the sensitivity level score corresponding to the sensitivity level of each data node; for each propagation path, obtain the sensitivity level score corresponding to the root node and mark it as the path source sensitivity score; obtain the derivation operation type of each directed edge in each path edge sequence, and obtain the corresponding sensitivity transfer coefficient from the preset operation type sensitivity transfer coefficient set according to each derivation operation type; multiply each path source sensitivity score by the sensitivity transfer coefficient corresponding to each directed edge in the corresponding path edge sequence to obtain the path transfer sensitivity value of each propagation path. For each data node, select the inheritance path from the corresponding propagation path; if there is no inheritance path, set the privacy inheritance degree of the corresponding data node to zero; if there is one or more inheritance paths, obtain the maximum value of the path transmission sensitivity value corresponding to all inheritance paths and mark it as the maximum transmission sensitivity value; obtain the upper limit of sensitivity score according to the sensitivity level score mapping table; calculate the ratio of the maximum transmission sensitivity value to the upper limit of sensitivity score to obtain the privacy inheritance degree of each data node. 
For each data node, obtain the corresponding sensitivity level score and diffusion range; if the diffusion range is zero, set the privacy diffusion degree of the corresponding data node to zero; if the diffusion range is greater than zero, calculate the sensitivity intensity normalization value based on the sensitivity level score and the upper limit of the sensitivity score; calculate the diffusion coverage rate of each data node based on the diffusion range; calculate the privacy diffusion degree of each data node by weighted summation of the sensitivity intensity normalization value and the diffusion coverage rate based on the preset diffusion weight. Based on the sensitivity level, sensitivity level score, privacy inheritance degree, and privacy diffusion degree of each data node, the node attributes of the corresponding data nodes are supplemented with annotations.
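As a rough sketch of the two indicators in [0008]: the inheritance degree follows the text directly (source score times the product of transfer coefficients, maximum over paths, divided by the score upper limit). The text does not define "inheritance path" or the diffusion coverage rate, so the code assumes an inheritance path is a propagation path with at least one edge, and that coverage is the diffusion range divided by the number of other nodes in the graph. The coefficient values and the default weight are hypothetical.

```python
from math import prod

def path_transfer_sensitivity(source_score, edge_ops, transfer_coeff):
    """Source sensitivity score attenuated by each edge's transfer coefficient."""
    return source_score * prod(transfer_coeff[op] for op in edge_ops)

def privacy_inheritance(paths, score_upper, transfer_coeff):
    """paths: list of (root_score, [operation type of each edge on the path])."""
    # Assumption: only paths with at least one edge count as inheritance paths.
    inherited = [(score, ops) for score, ops in paths if ops]
    if not inherited:
        return 0.0
    best = max(path_transfer_sensitivity(score, ops, transfer_coeff)
               for score, ops in inherited)
    return best / score_upper

def privacy_diffusion(score, score_upper, diffusion_range, total_nodes, weight=0.5):
    """Weighted sum of normalized sensitivity intensity and diffusion coverage."""
    if diffusion_range == 0:
        return 0.0
    intensity = score / score_upper
    # Assumption: coverage is measured against all other nodes in the graph.
    coverage = diffusion_range / (total_nodes - 1)
    return weight * intensity + (1 - weight) * coverage

# Hypothetical transfer coefficients per derivation operation type.
coeff = {"transformation": 0.8, "merging": 1.0, "splitting": 0.5}
paths = [(4, ["transformation", "merging"]), (2, ["splitting"])]
inheritance = privacy_inheritance(paths, score_upper=4, transfer_coeff=coeff)
diffusion = privacy_diffusion(score=4, score_upper=4, diffusion_range=3, total_nodes=4)
```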

[0009] Furthermore, methods for obtaining the privacy exposure risk of each data node through aggregated quantitative analysis include: The operation frequency, number of operating entities, and derivation breadth of each data node are normalized to obtain evaluation indicators for each data node. These evaluation indicators include standard operation frequency, number of standard operating entities, and standard derivation breadth. A pre-defined data asset value assessment model is used, which includes value weights and lifecycle state correction coefficients for each evaluation indicator. Based on these value weights, the standard operation frequency, number of standard operating entities, and standard derivation breadth of the same data node are weighted and summed to obtain the basic asset value of each data node. The corresponding lifecycle state correction coefficient is obtained based on the current lifecycle state of each data node. Finally, the data asset value index of each data node is calculated based on the basic asset value and the lifecycle state correction coefficient. 
From the node attributes of each data node, obtain the privacy inheritance degree and privacy diffusion degree respectively; based on the privacy inheritance degree and privacy diffusion degree, calculate the propagation baseline value and coupling interaction value of each data node respectively; preset the interaction enhancement coefficient, calculate the product of the coupling interaction value and the interaction enhancement coefficient to obtain the interaction enhancement amount; calculate the sum of the propagation baseline value and the interaction enhancement amount to obtain the privacy propagation strength of each data node; for each data node, calculate the modulation base number according to the corresponding data asset value index; preset the value modulation index, calculate the value modulation index power of the modulation base number to obtain the asset modulation factor; calculate the product of the privacy propagation strength and the asset modulation factor to obtain the privacy exposure risk of each data node.
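The aggregation in [0009] names its intermediate quantities but does not give their formulas, so the following sketch fills them with explicit assumptions: the value index is the basic asset value multiplied by the lifecycle coefficient, the propagation baseline is the mean of inheritance and diffusion, the coupling interaction value is their product, and the modulation base is one plus the value index. All default parameter values are hypothetical.

```python
def asset_value_index(std_freq, std_subjects, std_breadth, weights, state_coeff):
    """Weighted sum of normalized indicators, corrected by lifecycle state."""
    w_freq, w_subjects, w_breadth = weights
    basic_value = w_freq * std_freq + w_subjects * std_subjects + w_breadth * std_breadth
    # Assumption: the correction coefficient is applied multiplicatively.
    return basic_value * state_coeff

def privacy_exposure_risk(inheritance, diffusion, value_index,
                          enhance=0.5, exponent=0.5):
    """Propagation strength modulated by a power function of asset value."""
    baseline = (inheritance + diffusion) / 2   # assumed propagation baseline value
    coupling = inheritance * diffusion         # assumed coupling interaction value
    strength = baseline + enhance * coupling   # privacy propagation strength
    factor = (1 + value_index) ** exponent     # assumed modulation base: 1 + index
    return strength * factor

value = asset_value_index(0.5, 0.5, 0.5, weights=(0.4, 0.3, 0.3), state_coeff=1.0)
risk = privacy_exposure_risk(inheritance=0.8, diffusion=0.4, value_index=3.0)
```

With these assumed forms, a node with zero asset value keeps its raw propagation strength (factor of 1), and the coupling term only contributes when a node both inherits and diffuses sensitivity.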

[0010] Furthermore, the method for generating the lineage risk heat map includes: classifying each data node into a risk level based on its privacy exposure risk; for each propagation path, obtaining the privacy exposure risk of all data nodes in the corresponding path node code sequence; calculating the average privacy exposure risk of all data nodes on the same propagation path to obtain the path average risk value; taking the maximum privacy exposure risk of all data nodes on the same propagation path as the path peak risk value; and, based on preset path risk weights, computing the weighted sum of the path average risk value and the path peak risk value of the same propagation path to obtain the path aggregate risk value of each propagation path. The node attributes of each data node are supplemented with its privacy exposure risk, risk level, and data asset value index; the path aggregate risk value of each propagation path is combined with the corresponding path node code sequence to form a path risk record set; and the lineage risk heat map is generated from the supplemented data lineage graph and the path risk record set.
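The path-level aggregation in [0010] is a weighted sum of a mean and a maximum, which can be sketched as below. The risk-level thresholds and the single `avg_weight` parameter (with the peak weight taken as its complement) are assumptions for illustration; the text only says the weights and levels are preset.

```python
def path_aggregate_risk(node_risks, avg_weight=0.5):
    """Weighted sum of the path average risk and the path peak risk."""
    average = sum(node_risks) / len(node_risks)
    peak = max(node_risks)
    # Assumption: the two path risk weights sum to one.
    return avg_weight * average + (1 - avg_weight) * peak

def risk_level(risk, thresholds=(0.25, 0.5, 0.75)):
    """Map a risk value onto four levels using hypothetical cut-offs."""
    levels = ("low", "medium", "high", "extremely high")
    return levels[sum(risk >= t for t in thresholds)]

aggregate = path_aggregate_risk([0.2, 0.4, 0.9])
```

Including the peak term keeps a single very risky node from being averaged away on a long path of otherwise low-risk nodes.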

[0011] Furthermore, the method for identifying key privacy nodes and high-risk propagation paths includes: extracting the risk level and current lifecycle state from the node attributes of each data node, where the risk level is one of low, medium, high, or extremely high, and the current lifecycle state is either alive or destroyed; and selecting the data nodes whose risk level is high or extremely high and whose current lifecycle state is alive as candidate privacy nodes. A privacy transmission reproduction number is introduced to quantitatively assess the ability of each candidate privacy node to propagate and amplify privacy risk; from all propagation paths, those whose path aggregate risk value is greater than a preset path risk screening threshold are selected as high-risk candidate paths; based on the high-risk candidate paths, the propagation blocking impact degree of each candidate privacy node is calculated. For each candidate privacy node, the derivation depth is obtained from the corresponding node attributes and the source proximity coefficient is calculated; based on the source proximity coefficient and the preset source protection amplification coefficient, the source protection amplification factor is calculated. For each candidate privacy node, the privacy transmission reproduction number, the propagation blocking impact degree, and the source protection amplification factor are multiplied together to obtain the immunization priority. Candidate privacy nodes whose immunization priority is greater than or equal to the preset immunization priority threshold are identified as key privacy nodes. From the high-risk candidate paths, the propagation paths whose path node code sequences contain two or more key privacy nodes are selected as high-risk propagation paths.
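The selection logic in [0011] can be sketched as three filters. The summary does not say how the reproduction number, blocking impact, or source protection amplification factor are computed, so this sketch takes them as precomputed inputs; the dictionary layout, node codes, and threshold value are hypothetical.

```python
def candidate_nodes(nodes):
    """Keep nodes that are high or extremely high risk and still alive."""
    return [code for code, attrs in nodes.items()
            if attrs["risk_level"] in ("high", "extremely high")
            and attrs["state"] == "alive"]

def key_nodes(candidates, reproduction, blocking, source_amp, threshold):
    """Priority is the product of the three factors; keep those at or above threshold."""
    return {code for code in candidates
            if reproduction[code] * blocking[code] * source_amp[code] >= threshold}

def high_risk_paths(candidate_paths, keys):
    """Keep candidate paths that contain two or more key privacy nodes."""
    return [path for path in candidate_paths
            if sum(code in keys for code in path) >= 2]

nodes = {
    "n1": {"risk_level": "high", "state": "alive"},
    "n2": {"risk_level": "extremely high", "state": "destroyed"},
    "n3": {"risk_level": "extremely high", "state": "alive"},
    "n4": {"risk_level": "low", "state": "alive"},
}
candidates = candidate_nodes(nodes)
keys = key_nodes(candidates,
                 reproduction={"n1": 2.0, "n3": 1.5},
                 blocking={"n1": 0.5, "n3": 1.0},
                 source_amp={"n1": 1.2, "n3": 1.0},
                 threshold=1.0)
selected = high_risk_paths([["n1", "n4", "n3"], ["n1", "n4"]], keys)
```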

[0012] Furthermore, the methods for determining the encryption granularity parameters of each key privacy node include: A pre-defined encryption cost-benefit model is established, comprising a set of granularity levels, a unit encryption cost coefficient and protection coverage coefficient corresponding to each granularity level, and data format granularity adaptation rules. The data format granularity adaptation rules determine the range of granularity levels supported by each data format. For each critical privacy node, the data format and privacy exposure risk are obtained from the corresponding node attributes. Based on the data format granularity adaptation rules, all granularity levels supported by the corresponding data format for each critical privacy node are determined and marked as the available granularity level set. For each granularity level in the available granularity level set, the corresponding unit encryption cost coefficient and protection coverage coefficient are obtained. Based on the privacy exposure risk and protection coverage coefficient, the granular protection benefit for each critical privacy node at each granularity level is calculated. Based on the granular protection benefit and the unit encryption cost coefficient, the encryption benefit ratio for each critical privacy node at each granularity level is calculated. From the available granularity level set, the granularity level with the largest encryption benefit ratio is selected as the encryption granularity parameter for the corresponding critical privacy node.
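The granularity choice in [0012] is an argmax over the levels a data format supports. A minimal sketch, assuming the granular protection benefit is the product of exposure risk and protection coverage coefficient and the encryption benefit ratio is that benefit divided by the unit cost coefficient; the level names and coefficient values are hypothetical.

```python
def choose_granularity(risk, available, unit_cost, coverage):
    """Pick the available granularity level with the largest benefit-to-cost ratio."""
    def benefit_ratio(level):
        benefit = risk * coverage[level]   # assumed granular protection benefit
        return benefit / unit_cost[level]  # assumed encryption benefit ratio
    return max(available, key=benefit_ratio)

# Hypothetical granularity levels and coefficients.
unit_cost = {"file": 1.0, "record": 2.0, "field": 5.0}
coverage = {"file": 0.5, "record": 0.9, "field": 1.0}

structured_choice = choose_granularity(0.8, ["file", "record", "field"], unit_cost, coverage)
restricted_choice = choose_granularity(0.8, ["record", "field"], unit_cost, coverage)
```

Under this assumed product form the risk value scales every ratio equally, so the winner depends only on coverage and cost; if the actual benefit function is nonlinear in risk, the selected level could change with the node's risk.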

[0013] Furthermore, methods for determining the optimal encryption timing for each key privacy node include: For each key privacy node, obtain the corresponding lifecycle event sequence; based on the lifecycle event sequence, determine the number of exposure operations for each key privacy node; obtain the creation timestamp and the number of operating subjects from the node attributes corresponding to each key privacy node, calculate the survival time based on the creation timestamp, and calculate the subject exposure amplification factor based on the number of operating subjects; calculate the exposure operation frequency for each key privacy node based on the number of exposure operations and the survival time; calculate the privacy exposure rate for each key privacy node based on the exposure operation frequency and the subject exposure amplification factor. If the privacy exposure rate is greater than zero, the privacy exposure half-life of the corresponding key privacy node is calculated based on the privacy exposure rate; the tolerance doubling number is calculated based on the preset risk tolerance multiple; the risk tolerance duration and encryption deadline of the corresponding key privacy node are calculated based on the privacy exposure half-life and the tolerance doubling number; the risk tolerance duration of the corresponding key privacy node is compared with the preset instant encryption duration threshold and emergency encryption duration threshold respectively, and the encryption timing level of the corresponding key privacy node is determined based on the comparison results; if the privacy exposure rate is equal to zero, the encryption deadline is set to the preset default encryption scheduling time, and the corresponding encryption timing level is set. 
Based on the encryption timing level and encryption deadline, the optimal encryption execution time for each key privacy node is determined; the optimal encryption execution time and encryption timing level of the same key privacy node are integrated to obtain the optimal encryption timing for each key privacy node.
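The timing calculation in [0013] can be sketched by analogy with exponential growth, since the text refers to a half-life and a doubling number. The formulas below are assumptions: the subject amplification factor is taken as 1 + ln(subjects), the half-life as ln 2 divided by the exposure rate, the doubling number as log2 of the tolerance multiple, and the tolerance duration as their product. The time unit (hours), threshold defaults, and level names are hypothetical.

```python
import math

def risk_tolerance_duration(exposures, survival_hours, subjects, tolerance_multiple):
    """Return the assumed risk tolerance duration in hours, or None if rate is zero."""
    frequency = exposures / survival_hours             # exposure operation frequency
    amplification = 1 + math.log(max(subjects, 1))     # assumed subject amplification
    rate = frequency * amplification                   # privacy exposure rate
    if rate == 0:
        return None
    half_life = math.log(2) / rate                     # assumed half-life formula
    doublings = math.log2(tolerance_multiple)          # tolerance doubling number
    return half_life * doublings

def timing_level(duration, instant_hours=1.0, emergency_hours=24.0):
    """Compare the duration with the two preset thresholds."""
    if duration is None:
        return "default"      # zero exposure rate: use default scheduling time
    if duration <= instant_hours:
        return "immediate"
    if duration <= emergency_hours:
        return "urgent"
    return "routine"

duration = risk_tolerance_duration(exposures=10, survival_hours=100,
                                   subjects=1, tolerance_multiple=4)
level = timing_level(duration)
```

The encryption deadline would then be the current time plus the tolerance duration, which is also an assumption about how the text's "encryption deadline" is derived.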

[0014] A data lifecycle privacy protection encryption management system, implementing the aforementioned data lifecycle privacy protection encryption management method, including:
The lineage construction module, used to collect data objects from various business systems, track the complete evolution of each data object from creation to destruction, record the derivation relationships, derivation operation types, and responsible parties for the derivation operations, and construct a data lineage graph with each data object as a data node;
The privacy propagation module, used to label the sensitivity level of each data node in the data lineage graph, analyze the propagation path and diffusion range of each data node's sensitivity level during derivation operations, and calculate the privacy inheritance degree and privacy diffusion degree of each data node based on the propagation path and diffusion range;
The risk aggregation module, used to obtain the privacy exposure risk of each data node by combining the privacy inheritance degree and privacy diffusion degree of each data node with the preset data asset value assessment model, and to generate a lineage risk heat map;
The encryption decision module, used to identify, based on the lineage risk heat map, the key privacy nodes among the data nodes and the high-risk propagation paths composed of multiple key privacy nodes, and to determine the optimal encryption timing and encryption granularity parameters for each key privacy node in combination with a preset encryption cost-benefit model;
The encryption execution module, used to perform encryption operations on each key privacy node according to the optimal encryption timing and encryption granularity parameters, and to record the execution result of each encryption operation to form an encryption execution log.

[0015] The technical effects and advantages of the data lifecycle privacy protection encryption management method and system of this invention are as follows:
By collecting data objects from various business systems and tracing their complete evolution from creation to destruction, a data lineage graph is constructed. This enables systematic recording and visual management of derivation relationships, transformation operation types, and responsible entities throughout the data lifecycle, and overcomes the shortcoming of existing technologies that protect only the static storage state of data and cannot comprehensively track the data flow path.
By labeling each data node with a sensitivity level and analyzing the propagation path and diffusion range of that level during derivation operations, two quantitative indicators, privacy inheritance degree and privacy diffusion degree, are introduced. This establishes a systematic mechanism for analyzing how privacy-sensitive information propagates and diffuses during data derivation and transformation, supports accurate assessment of the actual privacy exposure of data at each stage, and avoids the blind spots and omissions that result from ignoring lineage propagation effects.
By combining a data asset value assessment model with an aggregation method that integrates interaction coupling enhancement and power function modulation, privacy exposure risks are calculated and a lineage risk heat map is generated. This enables differentiated quantitative assessment of privacy risks for different data objects and an intuitive presentation of the overall risk situation.
By drawing on the basic reproduction number and targeted immunization strategies from epidemiology, a privacy transmission reproduction number and an immunization priority are introduced. Combined with the source proximity coefficient and the propagation blocking impact degree, key privacy nodes and high-risk propagation paths are identified. This allows the objects of encryption protection to be screened under limited encryption resources, improving both the utilization of encryption resources and the coverage of privacy protection.
By drawing on the concept of radioactive decay half-life from nuclear physics, a privacy exposure half-life and a risk tolerance duration are introduced and combined with an encryption cost-benefit model to determine the optimal encryption timing and encryption granularity parameters dynamically. Protection strength and timing are thus adjusted according to risk conditions, avoiding the security risks and resource waste caused by insufficient or excessive protection under traditional fixed encryption strategies.
By generating scheduling queues according to encryption timing levels and optimal encryption execution times, verifying and retrying encryption operation results, and calculating the encryption coverage of each high-risk propagation path to form a complete encryption execution log, orderly scheduling, reliable execution, and fully traceable management of encryption operations are achieved, ensuring the integrity and auditability of privacy protection encryption management throughout the entire data lifecycle.

Attached Figure Description

[0016] Figure 1 is a flowchart of the data lifecycle privacy protection encryption management method according to Embodiment 1 of the present invention; Figure 2 is a schematic diagram of the data lifecycle privacy protection encryption management system according to Embodiment 2 of the present invention.

Detailed Implementation

[0017] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

Embodiment 1:

[0018] As shown in Figure 1, the data lifecycle privacy protection encryption management method of this embodiment includes: collecting data objects from various business systems, tracking the complete evolution of each data object from creation to destruction, recording the derivation relationships, derivation operation types, and responsible parties for the derivation operations, and constructing a data lineage graph with each data object as a data node.

[0019] The method for collecting data objects from various business systems includes: The system retrieves the data object registration list from the data asset registry (a centralized platform for uniformly registering and managing the data asset metadata generated by the various business systems). This list records the basic attribute information of all registered data objects in each business system and consists of multiple data object registration records. A data object is a data entity with an independent identifier and lifecycle that a business system generates during data collection, processing, analysis, or sharing. Each data object registration record includes a unique data object code, data object name, business system identifier, data type, data format, and creation timestamp. The unique data object code is a predefined globally unique identifier for the data object. The business system identifier identifies the business system that generated the corresponding data object. Business systems include, but are not limited to, user information management systems, business transaction processing systems, data analysis service systems, and data sharing and exchange systems. Data types include raw data, intermediate processed data, analysis result data, and shared and published data. Raw data is unprocessed source data generated directly by terminal devices or manual input. Intermediate processed data is transitional data produced by cleaning, merging, or transforming raw data. Analysis result data is conclusive data produced by modeling analysis or statistical calculation on intermediate processed data. Shared and published data is data released to external systems or users after review. Data formats include, but are not limited to, structured data, semi-structured data, and unstructured data. 
The operation behavior logs of each data object are obtained from the operation log management platform of each business system (i.e., the log management platform used to record all operation behaviors performed by users and automated programs on data objects in each business system). The operation behavior logs record all operation events experienced by a data object during its lifecycle, specifically including multiple operation event records. Each operation event record includes an operation event code, operation timestamp, operation type, operation responsible party, operation input object code set, and operation output object code set. The operation event code is a predefined unique identifier for the operation event. Operation types include creation, reading, transformation, merging, splitting, copying, and destruction operations. The operation responsible party identifies the user account or automated program performing the operation. The operation input object code set records the unique data object code of the data object used when performing transformation, merging, and other operations. The operation output object code set records the unique data object code of the data object generated after the operation is executed.

[0020] Methods for tracking the complete evolution of each data object from creation to destruction, and recording derivation relationships, derivation operation types, and responsible parties for derivation operations include: Based on the operation behavior logs, the lifecycle event sequence of each data object is obtained, and the corresponding current lifecycle state is marked. Specifically, all operation event records are filtered from the operation behavior logs corresponding to the data objects. All operation event records are sorted from earliest to latest according to their corresponding operation timestamps to form the lifecycle event sequence of the data objects. The lifecycle event sequence describes the complete operation process of the data object from creation to destruction. If there is no operation event record of type destruction in the lifecycle event sequence, the current lifecycle state is marked as alive. If there is an operation event record of type destruction in the lifecycle event sequence, the current lifecycle state is marked as destroyed. The current lifecycle state is used to identify whether the data object is still in an active state that can be accessed and operated at the current point in time. Based on the lifecycle event sequence, the derivation relationships between data objects are extracted. Specifically, for each data object's lifecycle event sequence, operation event records with operation types of transformation, merging, or splitting are selected and marked as derivation event records. For each derivation event record, the unique code of each data object in the corresponding operation input object code set is obtained, and the corresponding data object is marked as an upstream data object. The unique code of each data object in the operation output object code set is obtained, and the corresponding data object is marked as a downstream data object. 
For each derivation event record, the association between each upstream data object and each downstream data object is recorded as a derivation relationship. Each derivation relationship includes an upstream data object code, a downstream data object code, a derivation operation type, a derivation operation timestamp, and a derivation operation responsible party. The upstream data object code is the unique code of the upstream data object; the downstream data object code is the unique code of the downstream data object; and the derivation operation type, derivation operation timestamp, and derivation operation responsible party are, respectively, the operation type, operation timestamp, and operation responsible party in the corresponding derivation event record. Based on the derivation relationships between data objects, the derivation depth and derivation breadth of each data object are calculated. Specifically, for each data object, the derivation relationships are traced recursively upstream, and the number of derivation levels traversed from the data object to a data object that has no upstream data object is counted to obtain the derivation depth. The derivation depth reflects the generational distance of the data object from its original data source. For each data object, the number of derivation relationships in which the data object is the upstream data object is counted to obtain the derivation breadth. The derivation breadth reflects the number of data objects directly derived downstream from the data object.
For example, consider four data objects in which the first is the upstream data object of the second, and the second is simultaneously the upstream data object of the third and the fourth. The first data object has no upstream data object. Because one derivation level separates the first data object from the second, the derivation depth of the second is 1; because two derivation levels separate the first data object from the third and from the fourth, the derivation depth of both is 2. Because the first data object is the upstream data object of only the second, its derivation breadth is 1; because the second is simultaneously the upstream data object of the third and the fourth, its derivation breadth is 2; because the third and the fourth have no downstream data objects, their derivation breadth is 0. It should be noted that if the first data object is also a direct upstream data object of the third, two paths lead from the first to the third (one direct, one through the second); in that case the maximum derivation depth over all paths is taken, so the derivation depth of the third is 2. Based on the lifecycle event sequence, the operation frequency and the number of operating entities for each data object are counted. Specifically, the total number of operation event records contained in the lifecycle event sequence of each data object is counted to obtain the operation frequency. The operation frequency reflects the activity level of the data object during its lifecycle. The number of distinct operation responsible parties in the lifecycle event sequence of each data object is counted to obtain the number of operating entities. The number of operating entities reflects the diversity of the responsible parties participating in the operation of the corresponding data object.
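The derivation depth and derivation breadth described above can be sketched as follows. This is a minimal illustration, not the method as claimed: the object codes `A` to `D`, the `derivations` list, and the function names are hypothetical, and the graph is assumed to be acyclic.

```python
from collections import defaultdict
from functools import lru_cache

# Hypothetical derivation relationships as (upstream code, downstream code).
# A -> B -> {C, D}, plus a direct A -> C edge, so two paths reach C.
derivations = [("A", "B"), ("B", "C"), ("B", "D"), ("A", "C")]

upstream = defaultdict(list)  # downstream code -> list of upstream codes
for up, down in derivations:
    upstream[down].append(up)


@lru_cache(maxsize=None)
def derivation_depth(code: str) -> int:
    """Largest number of derivation levels back to an object with no upstream."""
    parents = upstream.get(code, [])
    if not parents:
        return 0  # assumption: an original data source has depth 0
    return 1 + max(derivation_depth(p) for p in parents)


def derivation_breadth(code: str) -> int:
    """Number of derivation relationships in which `code` is the upstream object."""
    return sum(1 for up, _ in derivations if up == code)


for code in "ABCD":
    print(code, derivation_depth(code), derivation_breadth(code))
```

Taking the maximum over all upstream paths mirrors the rule that, when several paths reach the same data object, the largest derivation depth is used.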

[0021] Methods for constructing the data lineage graph include: Each data object is treated as a data node, and each data node is configured with corresponding node attributes. The node attributes of each data node include the unique code of the data object, the name of the data object, the identifier of the business system to which it belongs, the data type, the data format, the creation timestamp, the current lifecycle state, the derivation depth, the derivation breadth, the operation frequency, and the number of operating entities. Each derivation relationship is treated as a directed edge, and each directed edge is configured with corresponding edge attributes. The edge attributes of each directed edge include the upstream data object code, the downstream data object code, the derivation operation type, the derivation operation timestamp, and the derivation operation responsible party. Each directed edge points from the data node corresponding to the upstream data object to the data node corresponding to the downstream data object, representing the direction of data lineage. All data nodes and directed edges are integrated to construct the data lineage graph. The data lineage graph is a directed graph with data nodes as vertices and directed edges as connections. It fully describes the lineage relationships between data objects, the data flow paths, and the lineage position of each data object in the entire data management system.

[0022] A sensitivity level is marked for each data node in the data lineage graph. The propagation path and diffusion range of each data node's sensitivity level under derivation operations are analyzed. Based on the propagation path and diffusion range, the privacy inheritance degree and privacy diffusion degree of each data node are calculated.

[0023] Methods for labeling the sensitivity levels of data nodes in the data lineage graph include: A set of sensitive classification rules is obtained from the privacy policy management platform (i.e., the policy management platform used to formulate, store, and manage data privacy protection policies and sensitive classification and grading rules). This set defines the criteria for determining the privacy sensitivity of data objects, and specifically includes a sensitive keyword dictionary and a data type sensitivity mapping table. The sensitive keyword dictionary contains multiple keyword records, each containing a sensitive keyword and its corresponding sensitivity level. Sensitivity levels include public, internal, sensitive, and high-sensitivity. The data type sensitivity mapping table records the mapping between each data type and its corresponding baseline sensitivity level. The baseline sensitivity level likewise takes one of the values public, internal, sensitive, and high-sensitivity. A sensitivity level score mapping table is preset, which includes the sensitivity level score corresponding to each sensitivity level. The public level has the lowest sensitivity level score, and the high-sensitivity level has the highest. Each sensitivity level score is preset by those skilled in the art according to data security management specifications. For each data node in the data lineage graph, the data object name and data type are obtained from the corresponding node attributes. The data object name is matched against all sensitive keywords in the sensitive keyword dictionary. If the data object name contains one or more sensitive keywords, the sensitivity levels corresponding to all matched sensitive keywords are obtained, and the highest of them is selected as the keyword matching sensitivity level. If the data object name matches no sensitive keyword, the keyword matching sensitivity level is set to public.
Based on the data type, the corresponding baseline sensitivity level is obtained from the data type sensitivity mapping table. The keyword matching sensitivity level is compared with the baseline sensitivity level, and the higher sensitivity level is taken as the sensitivity level of the corresponding data node. Based on the sensitivity level score mapping table, the sensitivity level score corresponding to the sensitivity level of each data node is obtained.
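The labeling rule above can be sketched as below. The keyword dictionary, data type mapping, and scores are hypothetical placeholders (the text leaves them to be preset), and defaulting an unknown data type to public is an assumption of this sketch rather than something the text specifies.

```python
# Sensitivity levels in ascending order, as named in the text.
LEVELS = ["public", "internal", "sensitive", "high-sensitivity"]

# Hypothetical rule sets; real values would come from the policy platform.
KEYWORDS = {"dept": "internal", "salary": "sensitive", "id_card": "high-sensitivity"}
TYPE_BASELINE = {"report": "public", "log": "internal", "personal_info": "sensitive"}
LEVEL_SCORE = {"public": 1, "internal": 2, "sensitive": 3, "high-sensitivity": 4}


def node_sensitivity(name: str, data_type: str) -> tuple[str, int]:
    """Return (sensitivity level, sensitivity level score) for one data node."""
    rank = LEVELS.index
    # Highest level among matched keywords; public when nothing matches.
    matched = [level for kw, level in KEYWORDS.items() if kw in name.lower()]
    keyword_level = max(matched, key=rank, default="public")
    # Assumption: a data type missing from the mapping table counts as public.
    baseline_level = TYPE_BASELINE.get(data_type, "public")
    # The node takes the higher of the two levels.
    level = max(keyword_level, baseline_level, key=rank)
    return level, LEVEL_SCORE[level]


print(node_sensitivity("employee_salary_table", "report"))
print(node_sensitivity("weekly_summary", "personal_info"))
```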

[0024] Methods for analyzing the propagation path and diffusion range of the sensitivity level of each data node during derivation operations include: A set of operation type sensitivity transfer coefficients is preset, which includes the sensitivity transfer coefficient corresponding to each derivation operation type. The sensitivity transfer coefficient reflects the degree to which a derivation operation of the given type preserves the privacy sensitivity of upstream data nodes. The higher the sensitivity transfer coefficient, the more upstream privacy information the corresponding derivation operation preserves. The set of operation type sensitivity transfer coefficients is preset by those skilled in the art based on how each derivation operation preserves and transforms data content. For each data node in the data lineage graph, the directed edges are traced recursively in the reverse direction to construct all directed paths from a root node to the data node, and each directed path is recorded as a propagation path, where a root node is a data node without an upstream data object. Each propagation path includes a path start node code, a path end node code, a path node code sequence, and a path edge sequence. The path start node code is the unique code of the root node's data object; the path end node code is the unique code of the data object of the data node at which the path ends; the path node code sequence is the ordered sequence of the unique codes of all data objects the propagation path passes through; and the path edge sequence is the ordered sequence of all directed edges the propagation path passes through. The propagation path describes the complete link along which privacy sensitivity is transmitted from the data source through the lineage relationships to the current data node. For each propagation path, the sensitivity level score of the root node is obtained and marked as the path source sensitivity score. The derivation operation type of each directed edge in the path edge sequence is obtained, and the corresponding sensitivity transfer coefficient is looked up in the set of operation type sensitivity transfer coefficients. The path source sensitivity score is multiplied by the sensitivity transfer coefficients of all directed edges in the path edge sequence to obtain the path transfer sensitivity value of the propagation path. The path transfer sensitivity value reflects the residual strength of privacy sensitivity at the path end node after multiple derivation operations. For each data node in the data lineage graph, all downstream reachable data nodes are traversed recursively along the forward direction of the directed edges to form the diffusion node set of the data node. The diffusion node set is the set of all data nodes that the privacy sensitivity of the corresponding data node can reach and influence by propagating downstream through lineage relationships. For example, if a data node has two direct downstream data nodes and neither of them has further downstream data nodes, its diffusion node set consists of those two data nodes. The number of data nodes in each diffusion node set is counted to obtain the diffusion range of each data node, where the diffusion range reflects the size of the potential privacy impact range of the corresponding data node.
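A minimal sketch of the path enumeration, path transfer sensitivity value, and diffusion node set follows. The edge list, transfer coefficients, and scores are hypothetical values chosen for illustration, and the graph is assumed to be acyclic.

```python
from math import prod

# Hypothetical lineage edges: (upstream, downstream, derivation operation type).
EDGES = [
    ("A", "B", "transformation"),
    ("B", "C", "merging"),
    ("B", "D", "splitting"),
    ("A", "C", "transformation"),
]
TRANSFER = {"transformation": 0.8, "merging": 0.9, "splitting": 1.0}  # assumed
SCORE = {"A": 4, "B": 2, "C": 1, "D": 1}  # assumed sensitivity level scores


def propagation_paths(node):
    """All root-to-node paths as edge lists; a root node itself has none."""
    paths = []
    for edge in (e for e in EDGES if e[1] == node):
        upstream_paths = propagation_paths(edge[0])
        if upstream_paths:
            paths.extend(p + [edge] for p in upstream_paths)
        else:
            paths.append([edge])  # edge[0] is a root node
    return paths


def path_transfer_sensitivity(path):
    """Root score multiplied by the transfer coefficient of every edge."""
    root = path[0][0]
    return SCORE[root] * prod(TRANSFER[op] for _, _, op in path)


def diffusion_set(node):
    """All data nodes reachable downstream from `node`."""
    reached, stack = set(), [node]
    while stack:
        current = stack.pop()
        for up, down, _ in EDGES:
            if up == current and down not in reached:
                reached.add(down)
                stack.append(down)
    return reached


for p in propagation_paths("C"):
    print([e[:2] for e in p], path_transfer_sensitivity(p))
print(len(diffusion_set("A")))  # diffusion range of A
```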

[0025] Methods for calculating the privacy inheritance degree and privacy diffusion degree of each data node include: Based on the propagation paths, the privacy inheritance degree of each data node is calculated. Specifically, for each data node, all propagation paths ending at the data node are obtained and marked as inheritance paths. If no inheritance path exists, the privacy inheritance degree of the data node is set to zero. If one or more inheritance paths exist, the maximum of the path transfer sensitivity values of all inheritance paths is obtained and marked as the maximum transfer sensitivity value. The sensitivity level score corresponding to the high-sensitivity level in the sensitivity level score mapping table is obtained and marked as the sensitivity score upper limit. The ratio of the maximum transfer sensitivity value to the sensitivity score upper limit is calculated to obtain the privacy inheritance degree of the data node. The privacy inheritance degree reflects the strength of privacy sensitivity inherited by the data node from the upstream data source through the lineage propagation path. The higher the privacy inheritance degree, the more upstream privacy information the data node retains. Based on the diffusion range, the privacy diffusion degree of each data node is calculated. Specifically, for each data node, the corresponding sensitivity level score and diffusion range are obtained. If the diffusion range is zero, the privacy diffusion degree of the data node is set to zero. If the diffusion range is greater than zero, the ratio of the sensitivity level score to the sensitivity score upper limit is calculated to obtain the sensitivity intensity normalization value. The sensitivity intensity normalization value reflects the data node's own sensitivity as a proportion of the highest sensitivity level.
The total number of data nodes in the data lineage graph is counted to obtain the total number of nodes. The ratio of the diffusion range to the total number of nodes is calculated to obtain the diffusion coverage rate of each data node. The diffusion coverage rate reflects the proportion of all data nodes that fall within a data node's privacy diffusion range. A diffusion weight is assigned to each of the sensitivity intensity normalization value and the diffusion coverage rate, and the two quantities for the same data node are combined in a weighted sum using these diffusion weights to obtain the privacy diffusion degree of each data node. Each diffusion weight is preset by those skilled in the art according to privacy protection strategies. The privacy diffusion degree reflects the potential risk of a data node spreading its own privacy sensitivity downstream; a higher privacy diffusion degree indicates a greater potential impact of the data node on downstream data privacy security. The sensitivity level, sensitivity level score, privacy inheritance degree, and privacy diffusion degree of each data node are then added to the node attributes of the corresponding data node as supplementary annotations.
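The two degrees can be written as small functions, as in the sketch below. The function names, parameter names, and the equal default diffusion weights are assumptions; the text only states that the weights are preset.

```python
def privacy_inheritance(path_values, score_upper_limit):
    """Max path transfer sensitivity value over the upper limit; 0 with no paths."""
    if not path_values:
        return 0.0
    return max(path_values) / score_upper_limit


def privacy_diffusion(level_score, diffusion_range, total_nodes,
                      score_upper_limit, w_intensity=0.5, w_coverage=0.5):
    """Weighted sum of normalized sensitivity and diffusion coverage rate."""
    if diffusion_range == 0:
        return 0.0  # a node with no downstream nodes diffuses nothing
    intensity = level_score / score_upper_limit  # sensitivity intensity normalization value
    coverage = diffusion_range / total_nodes     # diffusion coverage rate
    return w_intensity * intensity + w_coverage * coverage


# Example: a node inheriting via two paths, and a root node reaching 3 of 4 nodes.
print(privacy_inheritance([2.88, 3.2], score_upper_limit=4))
print(privacy_diffusion(level_score=4, diffusion_range=3, total_nodes=4,
                        score_upper_limit=4))
```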

[0026] Based on the privacy inheritance degree and privacy diffusion degree of each data node, and combined with a preset data asset value assessment model, the privacy exposure risk of each data node is obtained through aggregated quantitative analysis, and a lineage risk heat map is generated.

[0027] Methods for obtaining the privacy exposure risk of each data node through aggregated quantitative analysis include: From the data lineage graph, the node attributes of each data node are obtained, and the operation frequency, number of operating entities, derivation breadth, and current lifecycle state are extracted from them. The operation frequency, number of operating entities, and derivation breadth of each data node are normalized to obtain the evaluation indicators of each data node. These evaluation indicators include the standard operation frequency, the standard number of operating entities, and the standard derivation breadth. The standard operation frequency reflects the relative level of operational activity of a data node among all data nodes; the standard number of operating entities reflects the relative level of operating entity diversity of a data node among all data nodes; and the standard derivation breadth reflects the relative level of downstream influence of a data node among all data nodes. Specifically, the largest operation frequency among all data nodes is taken as the maximum operation frequency, and the ratio of the operation frequency of each data node to the maximum operation frequency is calculated to obtain the standard operation frequency of each data node. The number of operating entities and the derivation breadth are normalized in the same way. A data asset value assessment model is preset. This model quantifies the value level of a data asset based on the usage activity and scope of influence of the corresponding data object. The data asset value assessment model includes a value weight for each evaluation indicator and lifecycle state correction coefficients. Each value weight is preset by those skilled in the art according to the data asset management strategy. The lifecycle state correction coefficients include an alive-state correction coefficient and a destroyed-state correction coefficient.
The alive-state correction coefficient is greater than the destroyed-state correction coefficient. Each lifecycle state correction coefficient is preset by those skilled in the art according to the data asset risk management requirements. Based on the value weights, the standard operation frequency, standard number of operating entities, and standard derivation breadth of the same data node are combined in a weighted sum to obtain the basic asset value of each data node. The basic asset value reflects the original asset value level of the data node without considering the impact of its lifecycle state. According to the current lifecycle state of each data node, the corresponding lifecycle state correction coefficient is obtained from the data asset value assessment model. The product of the basic asset value and the corresponding lifecycle state correction coefficient is calculated to obtain the data asset value index of each data node. The data asset value index reflects the comprehensive value level of the data node as a data asset; a higher data asset value index indicates higher business importance and protection value of the data node.
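The data asset value index can be sketched as follows. The attribute keys, value weights, and correction coefficients are hypothetical; the only constraint taken from the text is that the alive-state coefficient exceeds the destroyed-state coefficient.

```python
# Hypothetical value weights and lifecycle state correction coefficients.
WEIGHTS = {"operation_frequency": 0.4, "operating_entities": 0.3, "derivation_breadth": 0.3}
STATE_COEFF = {"alive": 1.0, "destroyed": 0.2}

NODES = {
    "A": {"operation_frequency": 10, "operating_entities": 4,
          "derivation_breadth": 2, "state": "alive"},
    "B": {"operation_frequency": 5, "operating_entities": 2,
          "derivation_breadth": 1, "state": "destroyed"},
}


def asset_value_indices(nodes):
    """Max-normalize each indicator, weight and sum, then apply the state coefficient."""
    maxima = {k: max(attrs[k] for attrs in nodes.values()) for k in WEIGHTS}
    result = {}
    for code, attrs in nodes.items():
        # Guard against an all-zero indicator (assumption: it then contributes 0).
        base = sum(w * (attrs[k] / maxima[k] if maxima[k] else 0.0)
                   for k, w in WEIGHTS.items())
        result[code] = base * STATE_COEFF[attrs["state"]]
    return result


print(asset_value_indices(NODES))
```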

[0028] Privacy inheritance degree and privacy diffusion degree are obtained from the node attributes of each data node. Based on these degrees, the privacy propagation intensity of each data node is calculated. Specifically, for each data node, the arithmetic mean of its corresponding privacy inheritance degree and privacy diffusion degree is calculated to obtain a propagation baseline value. This baseline value reflects the basic joint contribution level of privacy inheritance and privacy diffusion. The product of privacy inheritance degree and privacy diffusion degree is calculated to obtain a coupling interaction value. This value reflects the synergistic enhancement effect between privacy inheritance and privacy diffusion. When both privacy inheritance degree and privacy diffusion degree are at high levels, the coupling interaction value increases significantly, reflecting the superimposed amplification characteristics of the two privacy risk factors. A preset interaction enhancement coefficient is established, pre-set by those skilled in the art based on the coupling characteristics of privacy risks. The product of the coupling interaction value and the interaction enhancement coefficient is calculated to obtain the interaction enhancement amount. The sum of the propagation baseline value and the interaction enhancement amount is calculated to obtain the privacy propagation intensity of each data node. This intensity comprehensively reflects the degree of dual privacy risk contribution of the data node as both a receiver and a source of privacy information. Based on the privacy propagation intensity and data asset value index, the privacy exposure risk of each data node is calculated. Specifically, a value modulation index is preset, which is pre-set by those skilled in the art according to data asset protection strategies. For each data node, the sum of 1 and the corresponding data asset value index is calculated to obtain the modulation base. 
The modulation base is then raised to the power of the value modulation index to obtain the asset modulation factor. The asset modulation factor reflects the nonlinear modulation effect of data asset value on privacy exposure risk. When the data asset value index is zero, the asset modulation factor equals one, indicating that the privacy propagation intensity is not affected by asset value modulation. As the data asset value index increases, the asset modulation factor grows as a power function, nonlinearly amplifying the privacy exposure risk faced by high-value data assets. The product of the privacy propagation intensity and the asset modulation factor is calculated to obtain the privacy exposure risk of each data node. The privacy exposure risk reflects the comprehensive risk of privacy leakage faced by the data node in the data lineage graph. The higher the privacy exposure risk, the more privacy protection processing is required for the data node.
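The calculation in this paragraph can be summarized in formula form. The symbols are introduced here only for illustration and do not appear in the original text: $I_i$ and $D_i$ are the privacy inheritance degree and privacy diffusion degree of data node $i$, $\kappa$ is the interaction enhancement coefficient, $V_i$ is the data asset value index, and $\gamma$ is the value modulation index.

```latex
\begin{aligned}
B_i &= \tfrac{1}{2}\,(I_i + D_i)   && \text{propagation baseline value} \\
C_i &= I_i \, D_i                  && \text{coupling interaction value} \\
P_i &= B_i + \kappa \, C_i         && \text{privacy propagation intensity} \\
M_i &= (1 + V_i)^{\gamma}          && \text{asset modulation factor} \\
R_i &= P_i \, M_i                  && \text{privacy exposure risk}
\end{aligned}
```

As a worked example with assumed values $I_i = 0.8$, $D_i = 0.6$, $\kappa = 0.5$, $V_i = 0.5$, and $\gamma = 2$: $B_i = 0.7$, $C_i = 0.48$, $P_i = 0.7 + 0.24 = 0.94$, $M_i = 1.5^2 = 2.25$, and $R_i = 0.94 \times 2.25 = 2.115$. Setting $V_i = 0$ gives $M_i = 1$, so $R_i = P_i$, which matches the statement that a zero data asset value index leaves the privacy propagation intensity unmodulated.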

[0029] Methods for generating the lineage risk heat map include: Based on the privacy exposure risk, each data node is assigned a risk level. Specifically, a set of risk level thresholds is preset, including a low-risk upper limit threshold, a medium-risk upper limit threshold, and a high-risk upper limit threshold. Each risk level threshold is preset by those skilled in the art according to privacy protection management standards. If the privacy exposure risk is less than or equal to the low-risk upper limit threshold, the corresponding data node is classified as low-risk. If the privacy exposure risk is greater than the low-risk upper limit threshold but less than or equal to the medium-risk upper limit threshold, the corresponding data node is classified as medium-risk. If the privacy exposure risk is greater than the medium-risk upper limit threshold but less than or equal to the high-risk upper limit threshold, the corresponding data node is classified as high-risk. If the privacy exposure risk is greater than the high-risk upper limit threshold, the corresponding data node is classified as extremely high-risk. Based on the privacy exposure risk, the path aggregation risk value of each propagation path is calculated.
Specifically, for each propagation path, the privacy exposure risk of every data node in the corresponding path node code sequence is obtained. The average privacy exposure risk of all data nodes on the same propagation path is calculated to obtain the path average risk value, which reflects the overall risk level of the data nodes on the propagation path. The maximum privacy exposure risk of all data nodes on the same propagation path is taken as the path peak risk value, which reflects the extreme risk level represented by the highest-risk data node on the propagation path. A path risk weight is set for each of the path average risk value and the path peak risk value, and the two values for the same propagation path are combined in a weighted sum using these path risk weights to obtain the path aggregation risk value of each propagation path. Each path risk weight is preset by those skilled in the art according to the path risk assessment strategy. The path aggregation risk value reflects the overall privacy risk level of the propagation path. The privacy exposure risk, risk level, and data asset value index of each data node are added to the node attributes of the corresponding data node as supplementary annotations. The path aggregation risk value of each propagation path is combined with the corresponding path node code sequence to form a path risk record set. Based on the annotated data lineage graph and the path risk record set, the lineage risk heat map is generated. The lineage risk heat map visually presents the distribution of privacy exposure risk across the data nodes in the data lineage and the risk aggregation status of each propagation path.
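The risk level classification and the path aggregation risk value can be sketched as below. The threshold values and the equal path risk weights are assumed for illustration; the text leaves both to be preset.

```python
from statistics import mean

# Assumed upper-limit thresholds for low, medium, and high risk.
THRESHOLDS = (0.3, 0.6, 0.9)


def risk_level(risk, thresholds=THRESHOLDS):
    """Map a privacy exposure risk to one of the four risk levels."""
    low, medium, high = thresholds
    if risk <= low:
        return "low"
    if risk <= medium:
        return "medium"
    if risk <= high:
        return "high"
    return "extremely high"


def path_aggregation_risk(node_risks, w_average=0.5, w_peak=0.5):
    """Weighted sum of the path average risk value and the path peak risk value."""
    return w_average * mean(node_risks) + w_peak * max(node_risks)


path_risks = [0.2, 0.4, 1.2]  # exposure risks of the nodes on one propagation path
print([risk_level(r) for r in path_risks])
print(path_aggregation_risk(path_risks))
```

Including the peak value keeps a single very risky node from being averaged away on a long path of otherwise low-risk nodes.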

[0030] Based on the lineage risk heat map, key privacy nodes and high-risk propagation paths composed of multiple key privacy nodes are identified among the data nodes. Combined with a preset encryption cost-benefit model, the optimal encryption timing and encryption granularity parameters for each key privacy node are determined.

[0031] Methods for identifying key privacy nodes and high-risk propagation paths include: The node attributes of each data node are obtained, and the risk level and current lifecycle state are extracted from them. Data nodes whose risk level is high-risk or extremely high-risk and whose current lifecycle state is alive are selected as candidate privacy nodes. A privacy contagion reproduction number is introduced to quantitatively assess the privacy risk propagation and amplification capability of each candidate privacy node. Specifically, a privacy contagion judgment threshold is preset by those skilled in the art according to privacy protection and control standards. For each candidate privacy node, all directed edges whose upstream data object code is the unique code of the candidate's data object are obtained from the data lineage graph, and the data nodes identified by the downstream data object codes of these directed edges are marked as direct downstream nodes. The privacy inheritance degree is obtained from the node attributes of each direct downstream node. For each candidate privacy node, the number of direct downstream nodes with a privacy inheritance degree greater than the privacy contagion judgment threshold is counted to obtain the privacy contagion reproduction number of the candidate privacy node. The privacy contagion reproduction number borrows the concept of the basic reproduction number in epidemiology to reflect the ability of a candidate privacy node to transmit significant privacy risks to direct downstream data nodes through derivation operations.
When the privacy contagion reproduction number is greater than one, it indicates that the privacy risk of the candidate privacy node shows an amplification and diffusion trend in the lineage propagation process, that is, a single candidate privacy node can cause a significant increase in the privacy risk of multiple direct downstream nodes. High-risk candidate paths are selected from all propagation paths, and the propagation blocking impact of each candidate privacy node is calculated based on these high-risk candidate paths. Specifically, a path risk screening threshold is preset, which is pre-set by those skilled in the art according to path risk management requirements. From the path risk record set, propagation paths with a path aggregation risk value greater than the path risk screening threshold are selected and marked as high-risk candidate paths. The total number of all high-risk candidate paths is counted to obtain the total number of high-risk paths. For each candidate privacy node, the number of times the unique code of the corresponding data object appears in the path node code sequence of all high-risk candidate paths is counted to obtain the path hit count. The path hit count reflects the frequency of candidate privacy nodes participating in high-risk propagation paths. The ratio of the path hit count to the total number of high-risk paths is calculated to obtain the propagation blocking impact of each candidate privacy node. The propagation blocking impact reflects the proportion of high-risk propagation paths that can be effectively blocked after encryption of the candidate privacy node. The higher the propagation blocking impact, the more high-risk propagation links can be cut off simultaneously after encryption of the candidate privacy node. Based on the privacy contagion reproduction number, propagation blocking impact, and derivation depth, key privacy nodes among the candidate privacy nodes are identified. 
Specifically, for each candidate privacy node, the derivation depth is obtained from the corresponding node attributes. The reciprocal of the sum of 1 and the derivation depth is calculated to obtain the source proximity coefficient. The source proximity coefficient reflects how close the candidate privacy node is to the original data source in the data lineage graph. The smaller the derivation depth, the larger the source proximity coefficient, indicating that the candidate privacy node is closer to the original data source and that encrypting it can block longer downstream derivation propagation links from the lineage source. A source protection amplification coefficient is preset by those skilled in the art according to the source data protection priority strategy. The product of the source proximity coefficient and the source protection amplification coefficient is calculated, and 1 is added to obtain the source protection amplification factor. The source protection amplification factor is used to assign higher encryption priority weights to candidate privacy nodes that are close to the original data source. For each candidate privacy node, the corresponding privacy contagion reproduction number, propagation blocking impact, and source protection amplification factor are multiplied together to obtain the immunity priority. The immunity priority borrows the priority assessment concept of targeted immunization strategies in infectious disease prevention and control, and comprehensively reflects the priority of implementing encryption protection (i.e., implementing "immunization") on candidate privacy nodes. The higher the immunity priority, the stronger the necessity of encrypting the candidate privacy node.
The immunity priority of each candidate privacy node is compared with a preset immunity priority threshold, which is preset by those skilled in the art based on encryption resource constraints and privacy protection coverage objectives. If the immunity priority is greater than or equal to the immunity priority threshold, the corresponding candidate privacy node is determined to be a key privacy node; if the immunity priority is less than the immunity priority threshold, the corresponding candidate privacy node is not considered a key privacy node. From the high-risk candidate paths, propagation paths whose path node code sequences contain two or more key privacy nodes are selected as high-risk propagation paths. High-risk propagation paths identify concentrated propagation links of privacy risk along which multiple key privacy nodes are distributed in series.
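The three quantities that make up the immunity priority can be sketched as below. All function names and sample values are hypothetical, and returning 0 when there are no high-risk candidate paths is an assumption of this sketch (the text does not cover that case).

```python
def contagion_reproduction_number(downstream_inheritance, threshold):
    """Count direct downstream nodes whose inheritance degree exceeds the threshold."""
    return sum(1 for degree in downstream_inheritance if degree > threshold)


def propagation_blocking_impact(node, high_risk_paths):
    """Path hit count divided by the total number of high-risk candidate paths."""
    if not high_risk_paths:
        return 0.0  # assumption: no high-risk paths means nothing to block
    hits = sum(path.count(node) for path in high_risk_paths)
    return hits / len(high_risk_paths)


def immunity_priority(reproduction_number, blocking_impact, depth, amplification):
    """Product of reproduction number, blocking impact, and source protection factor."""
    source_proximity = 1 / (1 + depth)
    source_factor = 1 + source_proximity * amplification
    return reproduction_number * blocking_impact * source_factor


paths = [["A", "B", "C"], ["A", "C"], ["A", "B", "D"], ["E", "F"]]
r = contagion_reproduction_number([0.9, 0.4, 0.7], threshold=0.5)
impact = propagation_blocking_impact("B", paths)
print(r, impact, immunity_priority(r, impact, depth=1, amplification=1.0))
```

Because the three factors are multiplied, a candidate with a privacy contagion reproduction number of zero, or one that lies on no high-risk candidate path, receives an immunity priority of zero regardless of the other factors.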

[0032] Methods for determining the encryption granularity parameter of each key privacy node include: An encryption cost-benefit model is preset, which comprehensively evaluates the protection benefits and implementation costs at different encryption granularities to determine the optimal encryption granularity. The model includes a set of granularity levels, a unit encryption cost coefficient and a protection coverage coefficient for each granularity level, and data format granularity adaptation rules. The granularity level set includes object-level encryption, field-level encryption, and element-level encryption. Object-level encryption encrypts the entire data object; field-level encryption encrypts specific sensitive fields within the data object; and element-level encryption encrypts specific sensitive data elements within the data object. The unit encryption cost coefficient reflects the computational resource consumption at each encryption granularity: the unit encryption cost coefficient of element-level encryption is greater than that of field-level encryption, and the unit encryption cost coefficient of field-level encryption is greater than that of object-level encryption.
Each unit encryption cost coefficient is preset by those skilled in the art based on the computational complexity of the encryption algorithm and the key management overhead. Each protection coverage coefficient reflects the proportion of privacy risk effectively mitigated at the corresponding encryption granularity: the protection coverage coefficient of object-level encryption is greater than that of field-level encryption, and the protection coverage coefficient of field-level encryption is greater than that of element-level encryption. Each protection coverage coefficient is preset by those skilled in the art based on the coverage of sensitive information at each granularity level. The data format granularity adaptation rules determine the range of granularity levels supported by each data format: structured data supports object-level, field-level, and element-level encryption; semi-structured data supports object-level and field-level encryption; and unstructured data supports only object-level encryption. For each key privacy node, the data format and privacy exposure risk are obtained from the corresponding node attributes. Based on the data format granularity adaptation rules, all granularity levels supported by the data format of the key privacy node are determined and marked as the set of available granularity levels. For each granularity level in the set of available granularity levels, the corresponding unit encryption cost coefficient and protection coverage coefficient are obtained. The product of the privacy exposure risk and the corresponding protection coverage coefficient is calculated to obtain the granular protection benefit of each key privacy node at each granularity level.
The granular protection benefit reflects the amount of privacy risk that can be effectively mitigated at the corresponding granularity level. The ratio of the granular protection benefit to the corresponding unit encryption cost coefficient is calculated to obtain the encryption benefit ratio for each granularity level. The encryption benefit ratio reflects the privacy protection benefit that can be obtained per unit encryption cost at the corresponding granularity level. From the set of available granularity levels, the granularity level with the largest encryption benefit ratio is selected as the encryption granularity parameter for the corresponding key privacy node.
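
The selection rule of paragraph [0032] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the coefficient values and the names `LEVELS`, `SUPPORTED`, `benefit_ratios`, and `select_granularity` are hypothetical. Only the ordering of the coefficients and the format-to-granularity rules are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Granularity:
    unit_cost: float  # unit encryption cost coefficient
    coverage: float   # protection coverage coefficient


# Placeholder values (assumption). They only respect the stated ordering:
# cost: element > field > object; coverage: object > field > element.
LEVELS: Dict[str, Granularity] = {
    "object": Granularity(unit_cost=1.0, coverage=1.0),
    "field": Granularity(unit_cost=2.0, coverage=0.8),
    "element": Granularity(unit_cost=4.0, coverage=0.5),
}

# Data format granularity adaptation rules, as stated in the text.
SUPPORTED = {
    "structured": ["object", "field", "element"],
    "semi-structured": ["object", "field"],
    "unstructured": ["object"],
}


def benefit_ratios(data_format: str, exposure_risk: float) -> Dict[str, float]:
    """Encryption benefit ratio for every available granularity level."""
    ratios = {}
    for name in SUPPORTED[data_format]:
        level = LEVELS[name]
        # granular protection benefit = risk * coverage coefficient
        benefit = exposure_risk * level.coverage
        # encryption benefit ratio = benefit / unit cost coefficient
        ratios[name] = benefit / level.unit_cost
    return ratios


def select_granularity(data_format: str, exposure_risk: float) -> str:
    """Pick the available level with the largest encryption benefit ratio."""
    ratios = benefit_ratios(data_format, exposure_risk)
    return max(ratios, key=ratios.get)
```

Because the stated orderings give object-level encryption both the lowest unit cost and the highest coverage, this sketch returns object-level encryption for any positive privacy exposure risk; the text leaves the actual coefficient values to those skilled in the art.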

[0033] The method for determining the optimal encryption timing of each key privacy node includes: A privacy exposure half-life is introduced to quantify the encryption urgency of each key privacy node. Specifically, for each key privacy node, the corresponding lifecycle event sequence is obtained. From the lifecycle event sequence, the operation event records whose operation type is read or copy are selected, and their total number is counted to obtain the exposure operation count. The exposure operation count reflects the cumulative number of times the key privacy node has been accessed or copied externally during its lifecycle. The creation timestamp and the number of operating entities are obtained from the node attributes of each key privacy node, and the difference between the current system time and the creation timestamp is calculated to obtain the survival time. The survival time reflects the time span from the creation of the key privacy node to the current moment. The ratio of the exposure operation count to the survival time is calculated to obtain the exposure operation frequency of each key privacy node. The exposure operation frequency reflects how often a key privacy node is accessed or copied per unit time. The natural logarithm of the number of operating entities is calculated and added to one to obtain the entity exposure amplification factor. This factor reflects how an increase in the number of operating entities amplifies the privacy exposure risk per unit time; the amplification effect is limited when the number of operating entities is small, and significantly enhanced when the number of operating entities is large. The product of the exposure operation frequency and the entity exposure amplification factor is calculated to obtain the privacy exposure rate of each key privacy node. 
This rate reflects how quickly privacy risk accumulates over time as multiple entities continue to access and copy the key privacy node. If the privacy exposure rate is greater than zero, the ratio of the natural logarithm of 2 to the privacy exposure rate is calculated to obtain the privacy exposure half-life of the corresponding key privacy node. The privacy exposure half-life, borrowed from the concept of radioactive decay half-life in nuclear physics, describes the time required for the cumulative exposure risk of a key privacy node to double. A shorter privacy exposure half-life indicates faster accumulation of privacy risk and a higher urgency for encryption protection; a longer privacy exposure half-life indicates slower accumulation of privacy risk and a longer allowable encryption response time. A risk tolerance multiple is preset by those skilled in the art according to privacy and security management standards. The risk tolerance multiple represents the maximum multiple, relative to the current privacy exposure risk, to which the cumulative exposure risk is allowed to grow without encryption protection. The base-2 logarithm of the risk tolerance multiple is calculated to obtain the tolerance doubling count. The tolerance doubling count converts the risk tolerance multiple into a number of doublings measured in privacy exposure half-lives. The product of the privacy exposure half-life and the tolerance doubling count is calculated to obtain the risk tolerance duration of the corresponding key privacy node. The risk tolerance duration reflects the estimated time required for the cumulative exposure risk of the key privacy node to grow from its current level to the tolerance limit. The risk tolerance duration is added to the current system time to obtain the encryption deadline of the corresponding key privacy node. 
The encryption deadline is the latest time at which the critical privacy node must complete the encryption operation. If the privacy exposure rate is zero, it indicates that the corresponding critical privacy node is not frequently accessed. The encryption deadline is set to a preset default encryption scheduling time, and the encryption timing level is set to planned encryption. The default encryption scheduling time is preset by those skilled in the art based on the system maintenance cycle. The system presets an instant encryption duration threshold and an emergency encryption duration threshold. The instant encryption duration threshold is lower than the emergency encryption duration threshold, and both are preset by those skilled in the art based on encryption scheduling response capabilities. If the risk tolerance duration is less than or equal to the instant encryption duration threshold, the encryption timing level of the corresponding critical privacy node is marked as instant encryption. If the risk tolerance duration is greater than the instant encryption duration threshold but less than or equal to the emergency encryption duration threshold, the encryption timing level of the corresponding critical privacy node is marked as emergency encryption. If the risk tolerance duration is greater than the emergency encryption duration threshold or the privacy exposure rate is zero, the encryption timing level of the corresponding critical privacy node is marked as planned encryption. Instant encryption indicates that the privacy risk accumulation rate of the corresponding critical privacy node is extremely fast, requiring immediate encryption. Emergency encryption indicates that the privacy risk accumulation rate of the corresponding critical privacy node is relatively fast, requiring encryption to be completed in a short period, but allowing limited scheduling buffer before the encryption deadline. 
Planned encryption indicates that the privacy risk of the corresponding key privacy node accumulates relatively slowly, or that the node is not currently accessed frequently, so encryption can be scheduled for the system maintenance window. Based on the encryption timing level and the encryption deadline, the optimal encryption execution time of each key privacy node is determined. Specifically, if the encryption timing level is instant encryption, the current system time is used as the optimal encryption execution time of the corresponding key privacy node. If the encryption timing level is emergency encryption, an emergency scheduling buffer time, preset by those skilled in the art based on the encryption task scheduling cycle, is added to the current system time to obtain the emergency scheduling time; the emergency scheduling time is compared with the encryption deadline, and the earlier of the two is taken as the optimal encryption execution time of the corresponding key privacy node. If the encryption timing level is planned encryption, a system maintenance window time, preset by those skilled in the art based on the system operation and maintenance plan, is compared with the encryption deadline, and the earlier of the two is taken as the optimal encryption execution time of the corresponding key privacy node. The optimal encryption execution time and the encryption timing level together constitute the optimal encryption timing of the corresponding key privacy node. 
The unique data object code, the business system identifier, encryption granularity parameters, and the optimal encryption timing of each key privacy node are integrated to form an encryption decision record for each key privacy node; the encryption decision records of all key privacy nodes are summarized with high-risk propagation paths to generate an encryption decision scheme.
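
The timing calculation of paragraph [0033] can be sketched in Python as follows. This is an illustrative sketch, not the applicant's implementation: the names `TimingConfig`, `privacy_exposure_rate`, and `plan_timing` are hypothetical, time is measured in seconds as an assumption, and the sketch assumes a positive survival time and at least one operating entity.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TimingConfig:
    """Presets that the text leaves to those skilled in the art."""
    tolerance_multiple: float       # risk tolerance multiple (> 1)
    instant_threshold: timedelta    # instant encryption duration threshold
    emergency_threshold: timedelta  # emergency encryption duration threshold
    emergency_buffer: timedelta     # emergency scheduling buffer time
    maintenance_window: datetime    # system maintenance window time
    default_deadline: datetime      # default encryption scheduling time


def privacy_exposure_rate(exposure_ops: int, survival_s: float,
                          num_entities: int) -> float:
    """Exposure operation frequency times entity exposure amplification."""
    if exposure_ops == 0:
        return 0.0
    frequency = exposure_ops / survival_s
    # max(..., 1) guards log(0); an assumption not stated in the text.
    amplification = 1.0 + math.log(max(num_entities, 1))
    return frequency * amplification


def plan_timing(now: datetime, created: datetime, exposure_ops: int,
                num_entities: int, cfg: TimingConfig):
    """Return (timing level, encryption deadline, optimal execution time)."""
    survival_s = (now - created).total_seconds()
    rate = privacy_exposure_rate(exposure_ops, survival_s, num_entities)
    if rate == 0.0:
        level, deadline = "planned", cfg.default_deadline
    else:
        half_life = math.log(2) / rate                 # exposure half-life
        doublings = math.log2(cfg.tolerance_multiple)  # tolerance doubling count
        tolerance = timedelta(seconds=half_life * doublings)
        deadline = now + tolerance
        if tolerance <= cfg.instant_threshold:
            level = "instant"
        elif tolerance <= cfg.emergency_threshold:
            level = "emergency"
        else:
            level = "planned"
    if level == "instant":
        execute_at = now
    elif level == "emergency":
        execute_at = min(now + cfg.emergency_buffer, deadline)
    else:
        execute_at = min(cfg.maintenance_window, deadline)
    return level, deadline, execute_at
```

For example, a node read 8,640 times in one day by a single entity has an exposure rate of 0.1 per second, so with a risk tolerance multiple of 4 its risk tolerance duration is two half-lives, roughly 13.9 seconds.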

[0034] Based on the optimal encryption timing and encryption granularity parameters, encryption operations are performed on each key privacy node, and the execution results of each encryption operation are recorded to form an encryption execution log.

[0035] The method for performing encryption operations on each key privacy node includes: From the encryption decision scheme, the encryption decision records of all key privacy nodes and the high-risk propagation paths are obtained; from each encryption decision record, the unique code of the data object, the identifier of the business system to which it belongs, the encryption granularity parameter, and the optimal encryption timing are extracted; from each optimal encryption timing, the optimal encryption execution time and the encryption timing level are extracted. An encryption task scheduling queue is generated based on the encryption timing level and the optimal encryption execution time. Specifically, all key privacy nodes are grouped by encryption timing level into an instant encryption group, an emergency encryption group, and a planned encryption group. Within each group, the key privacy nodes are sorted from earliest to latest by their optimal encryption execution time. The sorted key privacy nodes of each group are then appended to the encryption task scheduling queue in the order of instant encryption group, emergency encryption group, and planned encryption group. The encryption task scheduling queue determines the encryption execution order of the key privacy nodes, and the key privacy nodes at the front of the queue receive encryption execution resources first. An encryption algorithm configuration table is preset, which includes the encryption algorithm identifier and encryption key length corresponding to each granularity level. The encryption algorithm identifier and encryption key length are preset by those skilled in the art based on data security compliance requirements and encryption performance requirements. Encryption operations are performed on the key privacy nodes one by one in the order of the encryption task scheduling queue. 
Specifically, for each critical privacy node, it is determined whether the current system time has reached the corresponding optimal encryption execution time. If the current system time has not reached the optimal encryption execution time, and the encryption timing level is not instant encryption, the encryption operation is performed after the optimal encryption execution time. If the encryption timing level is instant encryption, the waiting is skipped and the encryption operation is performed immediately. The specific process of performing the encryption operation is as follows: locate the corresponding business system based on the business system identifier of the critical privacy node; obtain the data object to be encrypted from the corresponding business system based on the unique code of the data object of the critical privacy node; obtain the corresponding encryption algorithm identifier and encryption key length from the encryption algorithm configuration table based on the encryption granularity parameter of the critical privacy node; call the encryption algorithm corresponding to the encryption algorithm identifier, and generate the corresponding encryption key based on the encryption key length; use the corresponding encryption key to perform encryption processing on the data object to be encrypted; write the encrypted data object back to the corresponding business system, and store the encryption key in the key management system (i.e., a security key management platform for centralized storage, distribution, and management of encryption keys). The execution result of each encryption operation is verified to determine the corresponding execution status. Specifically, the encrypted data object is reread from the corresponding business system and decrypted using the corresponding encryption key. 
If the decryption is successful and the decrypted data content is consistent with the data content before encryption, the execution status of the corresponding encryption operation is marked as encryption successful. If the decryption fails or the decrypted data content is inconsistent with the data content before encryption, the execution status of the corresponding encryption operation is marked as encryption failed. If an unexpected situation such as system abnormality or network interruption occurs during the encryption process, the execution status of the corresponding encryption operation is marked as encryption abnormal.
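
The construction of the encryption task scheduling queue in paragraph [0035] can be sketched in Python as follows. The names `Decision`, `LEVEL_ORDER`, and `build_queue` are hypothetical, and the sketch covers only the ordering step, not the encryption, write-back, or verification steps.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List

# Group order from the text: instant first, then emergency, then planned.
LEVEL_ORDER = {"instant": 0, "emergency": 1, "planned": 2}


@dataclass(frozen=True)
class Decision:
    """Subset of an encryption decision record needed for scheduling."""
    object_code: str      # unique code of the data object
    level: str            # encryption timing level
    execute_at: datetime  # optimal encryption execution time


def build_queue(decisions: Iterable[Decision]) -> List[Decision]:
    """Group by timing level, then sort each group from earliest to latest.

    Sorting on the tuple (group rank, execution time) gives the same order
    as sorting inside each group and concatenating the groups.
    """
    return sorted(decisions, key=lambda d: (LEVEL_ORDER[d.level], d.execute_at))
```

A single sort on a composite key is used here instead of three explicit groups; the resulting order is the same, and Python's `sorted` is stable, so nodes with identical level and time keep their input order.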

[0036] The method for recording the execution result of each encryption operation to form an encryption execution log includes: For the encryption operation on each key privacy node, an encryption execution record is generated. Each encryption execution record includes an encryption execution record code, the unique code of the data object, the identifier of the business system to which it belongs, the encryption granularity parameter, the encryption algorithm identifier, the encryption key length, the encryption timing level, the optimal encryption execution time, the actual encryption execution time, and the execution status. The encryption execution record code is a preset unique identifier of the encryption execution record; the actual encryption execution time is the system time at which the encryption operation actually started. For encryption execution records whose execution status is encryption failed or encryption abnormal, encryption retry processing is performed. Specifically, a maximum number of retries is preset by those skilled in the art based on the system fault tolerance strategy. For key privacy nodes whose execution status is encryption failed or encryption abnormal, the encryption operation and the verification process are re-executed. After each retry, the execution status and the actual encryption execution time in the corresponding encryption execution record are updated. The cumulative number of retries for each key privacy node is counted. If the cumulative number of retries reaches the maximum number of retries and the execution status is still not encryption successful, the execution status in the corresponding encryption execution record is marked as encryption terminated, and the termination reason is recorded in that encryption execution record. The termination reasons include decryption verification failure and unrecovered system abnormality. 
Based on each encryption execution record and high-risk propagation path, the encryption coverage status of each high-risk propagation path is calculated. Specifically, for each high-risk propagation path, the number of key privacy nodes with a successful encryption execution status is counted among all corresponding key privacy nodes, obtaining the number of encrypted nodes in the path; the total number of key privacy nodes in each high-risk propagation path is counted, obtaining the total number of key nodes in the path; the ratio of the number of encrypted nodes in the path to the total number of key nodes in the path is calculated, obtaining the path encryption coverage rate of each high-risk propagation path; where the path encryption coverage rate is used to reflect the degree of encryption completion of key privacy nodes on the high-risk propagation path; if the path encryption coverage rate is equal to one, the path protection status of the corresponding high-risk propagation path is marked as fully protected; if the path encryption coverage rate is greater than zero and less than one, the path protection status of the corresponding high-risk propagation path is marked as partially protected; if the path encryption coverage rate is equal to zero, the path protection status of the corresponding high-risk propagation path is marked as unprotected. All encrypted execution records, the encryption coverage of each high-risk propagation path, and the path protection status are summarized and integrated to form an encrypted execution log. The encrypted execution log is used to fully record the encrypted operation execution process and results of each key privacy node, as well as the encryption protection coverage of each high-risk propagation path.
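
The retry handling and the path encryption coverage calculation of paragraph [0036] can be sketched in Python as follows. The function names and the callable `attempt` are hypothetical; `attempt` stands in for one encryption operation followed by its verification, and returns one of the execution status strings used in the text.

```python
from typing import Callable, Dict, List, Tuple

SUCCESS = "encryption successful"
TERMINATED = "encryption terminated"


def run_with_retries(attempt: Callable[[], str],
                     max_retries: int) -> Tuple[str, int]:
    """Run one attempt, then retry while it is not successful.

    Returns the final execution status and the cumulative number of retries.
    """
    status = attempt()
    retries = 0
    while status != SUCCESS and retries < max_retries:
        status = attempt()
        retries += 1
    if status != SUCCESS:
        # Retries are exhausted and the operation still has not succeeded.
        status = TERMINATED
    return status, retries


def path_protection(path_nodes: List[str],
                    status_by_node: Dict[str, str]) -> Tuple[float, str]:
    """Path encryption coverage rate and path protection status."""
    if not path_nodes:
        raise ValueError("a high-risk propagation path needs key privacy nodes")
    total = len(path_nodes)
    encrypted = sum(1 for n in path_nodes if status_by_node.get(n) == SUCCESS)
    # Compare integer counts so the status does not depend on float rounding.
    if encrypted == total:
        state = "fully protected"
    elif encrypted > 0:
        state = "partially protected"
    else:
        state = "unprotected"
    return encrypted / total, state
```

For instance, a path of two key privacy nodes in which one is encrypted successfully and the other is terminated has a coverage rate of 0.5 and is marked as partially protected.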

[0037] This embodiment collects data objects from each business system and tracks their complete evolution from creation to destruction to construct a data lineage diagram. This enables systematic recording and visual management of the derivation relationships, derivation operation types, and responsible entities throughout the data lifecycle, and overcomes the shortcoming of existing technologies that protect only the static storage state of data and cannot comprehensively track data flow paths. By labeling each data node with a sensitivity level and analyzing the propagation path and diffusion range of the sensitivity level during derivation operations, two quantitative indicators, the privacy inheritance degree and the privacy diffusion degree, are introduced to establish a systematic analysis mechanism for how privacy-sensitive information propagates and diffuses during data derivation and transformation. This allows the actual privacy exposure level of data at each stage to be assessed accurately, and avoids the protection blind spots and omissions that arise when the propagation effect of data lineage is ignored. This embodiment combines a data asset value assessment model with an aggregation quantification method that integrates interactive coupling enhancement and power function modulation to calculate the privacy exposure risk and generate a pedigree risk heat map. This enables differentiated quantitative assessment of the privacy risk of different data objects and a clear presentation of the overall risk situation. Drawing on the concepts of the basic reproduction number and targeted immunization strategies in epidemiology, this embodiment introduces a privacy propagation reproduction number and an immunization priority, and combines a source proximity coefficient with the propagation blocking impact to accurately identify key privacy nodes and high-risk propagation paths. 
This achieves optimal screening of the objects to be encrypted under the constraint of limited encryption resources, thereby improving both the utilization efficiency of encryption resources and the coverage of privacy protection. Drawing on the concept of radioactive decay half-life in nuclear physics, this embodiment introduces the privacy exposure half-life and the risk tolerance duration, and determines the optimal encryption timing and encryption granularity parameters dynamically using an encryption cost-benefit model. This enables intelligent encryption decision-making in which protection strength and timing are adjusted according to risk conditions, avoiding the security risks and resource waste caused by insufficient or excessive protection under traditional fixed encryption strategies. By generating a scheduling queue according to the encryption timing level and the optimal encryption execution time, verifying and retrying the execution results of encryption operations, and calculating the encryption coverage of each high-risk propagation path to form a complete encryption execution log, this embodiment achieves orderly scheduling, reliable execution, and full traceability of encryption operations, thus ensuring the integrity and auditability of privacy protection encryption management throughout the entire data lifecycle.

Embodiment 2:

[0038] Referring to Figure 2, for the parts not described in detail in this embodiment, see the description of Embodiment 1. A data privacy protection encryption management system is provided, including a genealogy construction module, a privacy propagation module, a risk aggregation module, an encryption decision module, and an encryption execution module. The modules are connected by wired and/or wireless means to realize data transmission between modules.

[0039] The genealogy construction module is used to collect data objects from each business system, track the complete evolution process of each data object from creation to destruction, record the derivation relationships, derivation operation types, and responsible parties of the derivation operations, and construct a data lineage diagram with each data object as a data node. The privacy propagation module is used to label each data node in the data lineage diagram with a sensitivity level, analyze the propagation path and diffusion range of the sensitivity level of each data node during derivation operations, and calculate the privacy inheritance degree and privacy diffusion degree of each data node based on the propagation path and diffusion range. The risk aggregation module is used to obtain the privacy exposure risk of each data node by combining the privacy inheritance degree and privacy diffusion degree of each data node with the preset data asset value assessment model, and to generate a pedigree risk heat map. The encryption decision module is used to identify, based on the pedigree risk heat map, key privacy nodes among the data nodes and high-risk propagation paths composed of multiple key privacy nodes, and to determine the optimal encryption timing and encryption granularity parameters of each key privacy node in combination with a preset encryption cost-benefit model. The encryption execution module is used to perform encryption operations on each key privacy node according to the optimal encryption timing and encryption granularity parameters, and to record the execution result of each encryption operation to form an encryption execution log.

Embodiment 3:

[0040] This application also provides an electronic device. The electronic device may include one or more processors and one or more memories. The memories store computer-readable code that, when executed by the one or more processors, can perform the data lifecycle privacy-preserving encrypted management method described above.

[0041] The methods or systems according to the embodiments of this application can also be implemented using the architecture of the electronic device shown in this application. The electronic device may include a bus, one or more CPUs, ROM, RAM, a communication port connected to a network, input/output components, a hard disk, and so on. A storage device in the electronic device, such as the ROM or the hard disk, may store computer-readable code implementing the data lifecycle privacy protection encryption management method provided in this application. Furthermore, the electronic device may also include a user interface. The architecture shown in this application is merely exemplary; when implementing different devices, one or more components of the electronic device shown in this application may be omitted according to actual needs.

Embodiment 4:

[0042] One embodiment of this application discloses a computer-readable storage medium. The computer-readable storage medium stores computer-readable instructions. When executed by a processor, the computer-readable instructions can perform a data lifecycle privacy-preserving encryption management method according to an embodiment of this application, as described with reference to the above figures. The storage medium includes, but is not limited to, volatile memory and / or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and cache memory. Non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc.

[0043] Furthermore, according to embodiments of this application, the processes described in the above-referenced flowcharts can be implemented as computer software programs. For example, this application provides a non-transitory machine-readable storage medium storing machine-readable instructions that can be executed by a processor to perform instructions corresponding to the method steps provided in this application, such as a data lifecycle privacy protection encryption management method. When this computer program is executed by a central processing unit (CPU), it performs the functions defined in the method of this application.

[0044] The above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

[0045] All formulas in this specification are dimensionless and operate on numerical values. The formulas were obtained by software simulation on a large amount of collected data so as to approximate real-world conditions as closely as possible. The preset parameters and thresholds in the formulas are set by those skilled in the art according to the actual situation.

[0046] Although embodiments of the invention have been shown and described, those skilled in the art will understand that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. A data lifecycle privacy protection encryption management method, characterized in that it includes: collecting data objects from each business system, tracking the complete evolution process of each data object from creation to destruction, recording the derivation relationships, derivation operation types, and responsible parties of the derivation operations, and constructing a data lineage diagram with each data object as a data node; labeling each data node in the data lineage diagram with a sensitivity level, analyzing the propagation path and diffusion range of the sensitivity level of each data node during derivation operations, and calculating the privacy inheritance degree and privacy diffusion degree of each data node based on the propagation path and diffusion range; obtaining the privacy exposure risk of each data node through aggregated quantitative analysis based on the privacy inheritance degree and privacy diffusion degree of each data node in combination with a preset data asset value assessment model, and generating a pedigree risk heat map; identifying, based on the pedigree risk heat map, key privacy nodes among the data nodes and high-risk propagation paths composed of multiple key privacy nodes, and determining the optimal encryption timing and encryption granularity parameters of each key privacy node in combination with a preset encryption cost-benefit model; and performing encryption operations on each key privacy node based on the optimal encryption timing and encryption granularity parameters, and recording the execution result of each encryption operation to form an encryption execution log.

2. The data lifecycle privacy protection encryption management method according to claim 1, characterized in that the method for constructing the data lineage diagram includes: Obtain the data object registration list and the operation behavior logs of each data object from each business system. The data object registration list includes the data object registration record of each data object. Based on the operation behavior logs, obtain the lifecycle event sequence of each data object and mark the corresponding current lifecycle state. Based on the lifecycle event sequence, extract the derivation relationships between the data objects. Based on the derivation relationships between the data objects, calculate the derivation depth and derivation breadth of each data object. Based on the lifecycle event sequence, count the operation frequency and the number of operating entities for each data object. Each data object is treated as a data node, and each data node is configured with corresponding node attributes. The node attributes of each data node include the data object registration record, current lifecycle state, derivation depth, derivation breadth, operation frequency, and number of operating entities. Each derivation relationship is treated as a directed edge, and each directed edge is configured with corresponding edge attributes. The edge attributes of each directed edge include the upstream data object code, downstream data object code, derivation operation type, derivation operation timestamp, and the responsible party of the derivation operation. Integrate all data nodes and directed edges to construct the data lineage diagram.

3. The data lifecycle privacy protection encryption management method according to claim 2, characterized in that the method for analyzing the propagation path and diffusion range of the sensitivity level of each data node during derivation operations includes: Obtain a set of sensitivity classification rules, which includes a sensitive keyword dictionary and a data type sensitivity mapping table; for each data node in the data lineage diagram, obtain the data object name and data type from the corresponding node attributes; match the data object name against the sensitive keyword dictionary, and obtain the keyword matching sensitivity level of each data node based on the matching results; based on the data type, obtain the corresponding baseline sensitivity level from the data type sensitivity mapping table; determine the sensitivity level of each data node based on the keyword matching sensitivity level and the baseline sensitivity level. For each data node in the data lineage diagram, recursively trace back along the reverse direction of the directed edges to construct all directed paths from a root node to the corresponding data node, and record each directed path as a propagation path; where a root node is a data node without an upstream data object; each propagation path includes the path start node code, the path end node code, the path node code sequence, and the path edge sequence; for each data node in the data lineage diagram, recursively traverse all downstream reachable data nodes along the forward direction of the directed edges to form the diffusion node set of each data node; count the number of data nodes in each diffusion node set to obtain the diffusion range of each data node.

4. The data lifecycle privacy protection encryption management method according to claim 3, characterized in that the method for calculating the privacy inheritance degree and privacy diffusion degree of each data node includes: Based on the preset sensitivity level score mapping table, obtain the sensitivity level score corresponding to the sensitivity level of each data node; for each propagation path, obtain the sensitivity level score corresponding to the root node and mark it as the path source sensitivity score; obtain the derivation operation type of each directed edge in each path edge sequence, and obtain the corresponding sensitivity transfer coefficient from the preset operation type sensitivity transfer coefficient set according to each derivation operation type; multiply each path source sensitivity score by the sensitivity transfer coefficient corresponding to each directed edge in the corresponding path edge sequence to obtain the path transfer sensitivity value of each propagation path. For each data node, select the inheritance paths from the corresponding propagation paths; if there is no inheritance path, set the privacy inheritance degree of the corresponding data node to zero; if there is one or more inheritance paths, obtain the maximum of the path transfer sensitivity values of all inheritance paths and mark it as the maximum transfer sensitivity value; obtain the upper limit of the sensitivity score according to the sensitivity level score mapping table; calculate the ratio of the maximum transfer sensitivity value to the upper limit of the sensitivity score to obtain the privacy inheritance degree of each data node. 
For each data node, obtain the corresponding sensitivity level score and diffusion range; if the diffusion range is zero, set the privacy diffusion degree of the corresponding data node to zero; if the diffusion range is greater than zero, calculate the sensitivity intensity normalization value based on the sensitivity level score and the upper limit of the sensitivity score; calculate the diffusion coverage rate of each data node based on the diffusion range; calculate the privacy diffusion degree of each data node by weighted summation of the sensitivity intensity normalization value and the diffusion coverage rate based on the preset diffusion weight. Based on the sensitivity level, sensitivity level score, privacy inheritance degree, and privacy diffusion degree of each data node, the node attributes of the corresponding data nodes are supplemented with annotations.
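The two degrees defined in claim 4 can be sketched in Python as follows. Several details are assumptions, because the claim does not fix them: an inheritance path is taken to be any propagation path with at least one edge, the diffusion coverage rate is the diffusion range divided by the number of other nodes in the graph, the weighted sum uses `weight` and `1 - weight`, and the coefficient values and names are placeholders.

```python
from math import prod

# Hypothetical presets: the claim refers to these tables without giving values.
SCORE_UPPER = 100.0
TRANSFER = {"copy": 1.0, "aggregate": 0.5, "mask": 0.2}


def path_transfer_value(source_score, edge_ops):
    """Source score multiplied by the transfer coefficient of every edge on the path."""
    return source_score * prod(TRANSFER[op] for op in edge_ops)


def privacy_inheritance(paths):
    """paths: list of (path source sensitivity score, list of edge operation types)."""
    # Assumption: a path with no edges (the node is its own root) inherits nothing.
    inherited = [(score, ops) for score, ops in paths if ops]
    if not inherited:
        return 0.0
    best = max(path_transfer_value(score, ops) for score, ops in inherited)
    return best / SCORE_UPPER


def privacy_diffusion(score, diffusion_range, total_nodes, weight=0.6):
    """Weighted sum of normalized sensitivity intensity and diffusion coverage rate."""
    if diffusion_range == 0:
        return 0.0
    intensity = score / SCORE_UPPER
    coverage = diffusion_range / (total_nodes - 1)  # assumed definition of coverage
    return weight * intensity + (1 - weight) * coverage
```

For example, a node reached by one path with source score 80 through a copy and an aggregation, and by another with source score 100 through a mask, inherits max(40, 20) / 100 = 0.4 under these placeholder coefficients.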

5. The data lifecycle privacy protection encryption management method according to claim 4, characterized in that the method for obtaining the privacy exposure risk of each data node through aggregated quantitative analysis comprises: the operation frequency, number of operating entities, and derivation breadth of each data node are normalized to obtain the evaluation indicators of each data node, the evaluation indicators including the standard operation frequency, the standard number of operating entities, and the standard derivation breadth. A preset data asset value assessment model is used, which includes a value weight for each evaluation indicator and lifecycle state correction coefficients. Based on the value weights, the standard operation frequency, standard number of operating entities, and standard derivation breadth of the same data node are weighted and summed to obtain the basic asset value of each data node. The corresponding lifecycle state correction coefficient is obtained based on the current lifecycle state of each data node. Finally, the data asset value index of each data node is calculated based on the basic asset value and the lifecycle state correction coefficient. 
From the node attributes of each data node, obtain the privacy inheritance degree and privacy diffusion degree respectively; based on the privacy inheritance degree and privacy diffusion degree, calculate the propagation baseline value and the coupling interaction value of each data node respectively; preset an interaction enhancement coefficient, and calculate the product of the coupling interaction value and the interaction enhancement coefficient to obtain the interaction enhancement amount; calculate the sum of the propagation baseline value and the interaction enhancement amount to obtain the privacy propagation strength of each data node; for each data node, calculate the modulation base according to the corresponding data asset value index; preset a value modulation index, and raise the modulation base to the power of the value modulation index to obtain the asset modulation factor; calculate the product of the privacy propagation strength and the asset modulation factor to obtain the privacy exposure risk of each data node.
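Claim 5 names the intermediate quantities but leaves most formulas open, so the Python sketch below fills them with labeled assumptions: min-max normalization, a basic asset value multiplied by the lifecycle coefficient, a propagation baseline equal to the mean of the two degrees, a coupling interaction equal to their product, and a modulation base of one plus the asset value index. All weights, coefficients, and names are hypothetical.

```python
# Hypothetical presets for the data asset value assessment model.
VALUE_WEIGHTS = (0.5, 0.3, 0.2)  # frequency, operating entities, derivation breadth
LIFECYCLE_COEFF = {"live": 1.0, "destroyed": 0.1}


def min_max(values):
    """Normalize raw indicator values to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def asset_value_index(std_freq, std_entities, std_breadth, state):
    """Weighted sum of the standard indicators, corrected by lifecycle state."""
    indicators = (std_freq, std_entities, std_breadth)
    basic_value = sum(w * x for w, x in zip(VALUE_WEIGHTS, indicators))
    return basic_value * LIFECYCLE_COEFF[state]  # assumption: multiplicative correction


def exposure_risk(inheritance, diffusion, value_index, beta=0.5, gamma=2.0):
    """Privacy propagation strength modulated by asset value.

    beta is the interaction enhancement coefficient and gamma the value
    modulation index; both defaults are placeholders.
    """
    baseline = (inheritance + diffusion) / 2   # assumed propagation baseline value
    coupling = inheritance * diffusion         # assumed coupling interaction value
    strength = baseline + beta * coupling      # baseline + interaction enhancement
    factor = (1.0 + value_index) ** gamma      # assumed modulation base: 1 + index
    return strength * factor
```

With inheritance 0.4, diffusion 0.6, and an asset value index of 0.5, this gives a strength of 0.5 + 0.5 × 0.24 = 0.62 and a risk of 0.62 × 1.5² = 1.395.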

6. The data lifecycle privacy protection encryption management method according to claim 5, characterized in that the method for generating the lineage risk heat map comprises: based on the privacy exposure risk, each data node is classified into a risk level; for each propagation path, the privacy exposure risks of all data nodes in the corresponding path node code sequence are obtained; the average of the privacy exposure risks of all data nodes in the same propagation path is calculated to obtain the path average risk value; the maximum of the privacy exposure risks of all data nodes in the same propagation path is taken as the path peak risk value; based on preset path risk weights, the path average risk value and the path peak risk value of the same propagation path are weighted and summed to obtain the path aggregate risk value of each propagation path. Based on the privacy exposure risk, risk level, and data asset value index of each data node, the node attributes of the corresponding data nodes are supplemented with annotations; the path aggregate risk value of each propagation path is combined with the corresponding path node code sequence to form a path risk record set; based on the annotated data lineage graph and the path risk record set, the lineage risk heat map is generated.
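The node classification and path aggregation of claim 6 can be sketched in Python as below. The level boundaries, the weight split between average and peak, and the function names are assumptions; only the four level names and the average/peak weighted sum come from the claims.

```python
from statistics import mean

# Hypothetical upper bounds for the first three risk levels.
LEVEL_BOUNDS = [(0.25, "low"), (0.5, "medium"), (0.75, "high")]


def risk_level(risk):
    """Map a privacy exposure risk to one of the four levels named in the claims."""
    for bound, label in LEVEL_BOUNDS:
        if risk < bound:
            return label
    return "extremely high"


def path_aggregate_risk(path_nodes, node_risk, avg_weight=0.4):
    """Weighted sum of the path average risk value and the path peak risk value."""
    risks = [node_risk[n] for n in path_nodes]
    return avg_weight * mean(risks) + (1 - avg_weight) * max(risks)


def path_risk_records(paths, node_risk):
    """Pair each path node code sequence with its aggregate risk value."""
    return [
        {"nodes": path, "aggregate_risk": path_aggregate_risk(path, node_risk)}
        for path in paths
    ]
```

Weighting the peak alongside the average keeps a single very risky node from being averaged away on a long path of otherwise low-risk nodes.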

7. The data lifecycle privacy protection encryption management method according to claim 6, characterized in that the method for identifying key privacy nodes and high-risk propagation paths comprises: from the node attributes of each data node, the risk level and current lifecycle status are extracted respectively; the risk level includes low risk level, medium risk level, high risk level and extremely high risk level; the current lifecycle status includes live status and destroyed status; the risk level and current lifecycle status of each data node are analyzed, and data nodes whose risk level is high or extremely high and whose current lifecycle status is live are selected as candidate privacy nodes. A privacy contagion regeneration number is introduced to quantitatively assess the privacy risk propagation and amplification capability of each candidate privacy node; from all propagation paths, propagation paths with a path aggregate risk value greater than a preset path risk screening threshold are selected as high-risk candidate paths; based on the high-risk candidate paths, the propagation blocking impact degree of each candidate privacy node is calculated. For each candidate privacy node, the derivation depth is obtained from the corresponding node attributes, and the source proximity coefficient is calculated. Based on the source proximity coefficient and the preset source protection amplification coefficient, the source protection amplification factor is calculated. For each candidate privacy node, the corresponding privacy contagion regeneration number, propagation blocking impact degree and source protection amplification factor are multiplied together to obtain the immunity priority. Candidate privacy nodes with an immunity priority greater than or equal to the preset immunity priority threshold are identified as key privacy nodes. 
From the high-risk candidate paths, propagation paths whose path node code sequences contain two or more key privacy nodes are selected as high-risk propagation paths.
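The selection logic of claim 7 can be sketched in Python as follows. The claim does not define how the privacy contagion regeneration number is computed, so the sketch takes it as an input. The remaining formulas are assumptions: the propagation blocking impact degree is the share of high-risk candidate paths passing through the node, the source proximity coefficient is 1 / (1 + derivation depth), and the source protection amplification factor is 1 + alpha × proximity. Names and defaults are hypothetical.

```python
def immunity_priority(regen_number, node, candidate_paths, depth, alpha=0.5):
    """Product of regeneration number, blocking impact, and source amplification."""
    # Assumed blocking impact: fraction of high-risk candidate paths through the node.
    blocking = (
        sum(node in path for path in candidate_paths) / len(candidate_paths)
        if candidate_paths
        else 0.0
    )
    proximity = 1.0 / (1.0 + depth)          # assumed: closer to the source is higher
    amplification = 1.0 + alpha * proximity  # alpha: source protection coefficient
    return regen_number * blocking * amplification


def select_key_nodes(priorities, threshold):
    """Candidate nodes whose immunity priority reaches the preset threshold."""
    return {node for node, p in priorities.items() if p >= threshold}


def high_risk_paths(candidate_paths, key_nodes):
    """Candidate paths that contain two or more key privacy nodes."""
    return [
        path for path in candidate_paths
        if sum(node in key_nodes for node in path) >= 2
    ]
```

Under these assumptions a root node that lies on every candidate path gets the largest multiplier, which matches the stated intent of favoring protection near the source.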

8. The data lifecycle privacy protection encryption management method according to claim 7, characterized in that the method for determining the encryption granularity parameter of each key privacy node comprises: a preset encryption cost-benefit model is established, comprising a set of granularity levels, a unit encryption cost coefficient and a protection coverage coefficient corresponding to each granularity level, and data format granularity adaptation rules. The data format granularity adaptation rules determine the range of granularity levels supported by each data format. For each key privacy node, the data format and privacy exposure risk are obtained from the corresponding node attributes. Based on the data format granularity adaptation rules, all granularity levels supported by the data format of each key privacy node are determined and marked as the available granularity level set. For each granularity level in the available granularity level set, the corresponding unit encryption cost coefficient and protection coverage coefficient are obtained. Based on the privacy exposure risk and the protection coverage coefficient, the granular protection benefit of each key privacy node at each granularity level is calculated. Based on the granular protection benefit and the unit encryption cost coefficient, the encryption benefit ratio of each key privacy node at each granularity level is calculated. From the available granularity level set, the granularity level with the largest encryption benefit ratio is selected as the encryption granularity parameter of the corresponding key privacy node.
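The granularity selection of claim 8 can be sketched in Python as below. The granularity levels, coefficient values, and format rules are hypothetical, and two formulas are assumed: the granular protection benefit is the privacy exposure risk multiplied by the protection coverage coefficient, and the encryption benefit ratio is that benefit divided by the unit encryption cost coefficient.

```python
# Hypothetical cost-benefit model: level -> (unit cost coefficient, coverage coefficient).
GRANULARITY = {
    "field": (1.0, 0.6),
    "record": (2.0, 0.8),
    "file": (4.0, 1.0),
}
# Hypothetical data format granularity adaptation rules.
FORMAT_LEVELS = {
    "relational": ["field", "record", "file"],
    "document": ["file"],
}


def benefit_ratio(level, risk):
    """Assumed ratio: (risk * coverage) / unit cost."""
    cost, coverage = GRANULARITY[level]
    return risk * coverage / cost


def choose_granularity(data_format, risk):
    """Pick the supported granularity level with the largest benefit ratio."""
    available = FORMAT_LEVELS[data_format]
    return max(available, key=lambda level: benefit_ratio(level, risk))
```

Note that with a purely multiplicative benefit the risk scales every ratio equally, so the ranking depends only on coverage per unit cost; a benefit function that is non-linear in risk would be needed for the chosen level to vary with the node's risk.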

9. The data lifecycle privacy protection encryption management method according to claim 8, characterized in that the method for determining the optimal encryption timing of each key privacy node comprises: for each key privacy node, obtain the corresponding lifecycle event sequence; based on the lifecycle event sequence, determine the number of exposure operations of each key privacy node; obtain the creation timestamp and the number of operating subjects from the node attributes of each key privacy node, calculate the survival time based on the creation timestamp, and calculate the subject exposure amplification factor based on the number of operating subjects; calculate the exposure operation frequency of each key privacy node based on the number of exposure operations and the survival time; calculate the privacy exposure rate of each key privacy node based on the exposure operation frequency and the subject exposure amplification factor. If the privacy exposure rate is greater than zero, the privacy exposure half-life of the corresponding key privacy node is calculated based on the privacy exposure rate; the tolerance doubling number is calculated based on the preset risk tolerance multiple; the risk tolerance duration and encryption deadline of the corresponding key privacy node are calculated based on the privacy exposure half-life and the tolerance doubling number; the risk tolerance duration of the corresponding key privacy node is compared with the preset instant encryption duration threshold and emergency encryption duration threshold respectively, and the encryption timing level of the corresponding key privacy node is determined based on the comparison results; if the privacy exposure rate is equal to zero, the encryption deadline is set to the preset default encryption scheduling time, and the corresponding encryption timing level is set. 
Based on the encryption timing level and encryption deadline, the optimal encryption execution time for each key privacy node is determined; the optimal encryption execution time and encryption timing level of the same key privacy node are integrated to obtain the optimal encryption timing for each key privacy node.
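The timing calculation of claim 9 can be sketched in Python under an exponential-growth reading of the claim's terms. The assumptions are: exposure events are reads, exports, and shares; the subject exposure amplification factor is 1 + ln(number of subjects); the privacy exposure half-life is ln 2 divided by the exposure rate; the tolerance doubling number is log2 of the risk tolerance multiple; and the risk tolerance duration is their product. Timestamps are plain seconds, and every threshold, default, and level name is a placeholder.

```python
import math

# Hypothetical set of lifecycle events that count as exposure operations.
EXPOSURE_EVENTS = {"read", "export", "share"}


def encryption_timing(events, created_at, now, subjects,
                      tolerance_multiple=4.0,
                      instant_threshold=3600.0,
                      emergency_threshold=86400.0,
                      default_delay=7 * 86400.0):
    """Return (encryption timing level, encryption deadline in seconds)."""
    exposure_ops = sum(event in EXPOSURE_EVENTS for event in events)
    survival = now - created_at
    frequency = exposure_ops / survival if survival > 0 else 0.0
    amplification = 1.0 + math.log(max(subjects, 1))  # assumed amplification form
    rate = frequency * amplification                  # privacy exposure rate

    if rate == 0:
        # No observed exposure: fall back to the default scheduling time.
        return "routine", now + default_delay

    half_life = math.log(2) / rate             # assumed exponential model
    doublings = math.log2(tolerance_multiple)  # tolerance doubling number
    duration = half_life * doublings           # risk tolerance duration

    if duration <= instant_threshold:
        level = "immediate"
    elif duration <= emergency_threshold:
        level = "urgent"
    else:
        level = "routine"
    return level, now + duration
```

A node read 1000 times in its first 1000 seconds by a single subject has a rate of 1 per second, so with a tolerance multiple of 4 its tolerance duration is 2 ln 2, about 1.39 seconds, and it falls in the immediate level.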

10. A data lifecycle privacy protection encryption management system, implementing the data lifecycle privacy protection encryption management method according to any one of claims 1-9, characterized in that the system comprises: a lineage construction module, used to collect data objects from various business systems, track the complete evolution process of each data object from creation to destruction, record the derivation relationship, the derivation operation type and the responsible party of each derivation operation, and construct a data lineage graph with each data object as a data node; a privacy propagation module, used to label the sensitivity level of each data node in the data lineage graph, analyze the propagation path and diffusion range of the sensitivity level of each data node during the derivation operations, and calculate the privacy inheritance degree and privacy diffusion degree of each data node based on the propagation path and diffusion range; a risk aggregation module, used to obtain the privacy exposure risk of each data node by combining the privacy inheritance degree and privacy diffusion degree of each data node with the preset data asset value assessment model, and to generate a lineage risk heat map; an encryption decision module, used to identify, based on the lineage risk heat map, key privacy nodes among the data nodes and high-risk propagation paths composed of multiple key privacy nodes, and to determine the optimal encryption timing and encryption granularity parameters of each key privacy node by combining a preset encryption cost-benefit model; and an encryption execution module, used to perform encryption operations on each key privacy node according to the optimal encryption timing and encryption granularity parameters, and to record the execution result of each encryption operation to form an encryption execution log.