Identification encryption and dynamic cache method in data desensitization transmission

By using a globally unique cache number and a dynamic desensitization strategy, combined with AES-256 encryption, the problems of low data transmission efficiency and insufficient security are solved, achieving efficient, secure, and accurate data transmission traceability, and ensuring the real-time nature of personalized medical services and the full-cycle protection of user privacy.

CN122226366A · Pending · Publication Date: 2026-06-16 · CHENGDU SHUMEIZHIHUA INFORMATION TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
CHENGDU SHUMEIZHIHUA INFORMATION TECHNOLOGY CO LTD
Filing Date
2026-03-16
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing data transmission encryption algorithms are inefficient in high-frequency data interaction scenarios, resulting in agent response delays. Static desensitization rules cannot dynamically adjust the desensitization granularity. The cache management mechanism is not strongly associated with user identifiers. Unique number encryption algorithms are easily reverse-engineered, making it impossible to achieve accurate traceability and security control throughout the entire lifecycle.

Method used

It employs a globally unique cache number based on timestamps, user identifiers, and random numbers, combined with a dynamic de-identification strategy and the AES-256 algorithm, to encrypt and store de-identified data. It establishes a link between the globally unique cache number and data query commands, blocking direct cloud access to the original database, and realizing data filtering and encrypted transmission. It also incorporates a dynamic expiration mechanism and HTTPS two-way encryption.

Benefits of technology

It improves data transmission efficiency, ensures the real-time nature of personalized medical services, reduces the risk of key leakage, achieves secure and controllable data with accurate traceability, and ensures full-cycle protection of user privacy.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses an identification encryption and dynamic caching method in data desensitization transmission, comprising the following steps: receiving a data query instruction and performing legality verification; forwarding the legal data query instruction to a screening module of a local data server; dynamically desensitizing privacy fields in an original database, generating a globally unique cache number based on a timestamp, a user identifier and a random number, and establishing an association between the globally unique cache number and the legal data query instruction; constructing a desensitized data set from the globally unique cache number and the dynamically desensitized data information, and using the AES-256 algorithm to encrypt and store the desensitized data set; associating the encrypted and stored desensitized data set with the globally unique cache number, and presetting a dynamic expiration mechanism for the cache data; feeding back the globally unique cache number and the desensitized data set to a cloud-based large model, using the globally unique cache number to decrypt and obtain the user's original privacy data, and filling and converting the desensitized data set.

Description

Technical Field

[0001] This invention relates to the field of data desensitization and transmission technology, and in particular to a method for identifier encryption and dynamic caching in data desensitization and transmission.

Background Technology

[0002] The core data stored by this technology is highly sensitive and private information, encompassing user names, mobile phone numbers, and other identity-identifying data, as well as patient medical data such as surgical history and treatment records. The intelligent agent in this embodiment, designed to provide personalized medical services (such as recommending postoperative rehabilitation plans based on surgical history and interpreting medication guidance based on treatment records), incorporates large-scale artificial intelligence (AI) model technology. However, these AI models all utilize cloud service invocation, posing a risk to the security of sensitive user data. Data security is a core supporting element for AI-powered medical applications, and related technologies such as transmission encryption and data anonymization are rapidly developing. Various algorithms are widely used for the security protection of sensitive data to meet the privacy and compliance requirements of medical data.

[0003] Currently, existing technologies for data anonymization and transmission, such as the technology disclosed in "Patent Publication No. CN121145252A, entitled 'A Method and Apparatus for Auditing Function Execution in Reproductive Medical Services'", anonymize the original medical information in the audit parameters to obtain the medical information to be stored. Additionally, the technology disclosed in "Patent Publication No. CN117592555A, entitled 'A Federated Learning Method and System for Multi-Source Heterogeneous Medical Data'" anonymizes and encrypts the original medical data from various medical centers, using the k-anonymity algorithm for anonymization and the Advanced Encryption Standard (AES) algorithm for encryption. However, the above technologies have the following drawbacks: First, existing data transmission encryption algorithms mostly adopt standardized and general encryption schemes. In high-frequency data interaction scenarios involving large cloud models, the encryption and decryption efficiency is low, which can easily lead to delays in the response of intelligent agents and affect the real-time performance of personalized medical services.

[0004] Second, existing data anonymization technologies mostly employ static anonymization rules, which can only simply mask or replace fixed sensitive fields, and cannot dynamically adjust the anonymization granularity according to the application scenario of the data. If anonymization is overdone, the data will lose its application value and will not be able to support the accurate analysis of large artificial intelligence models. If anonymization is insufficient, it will lead to the leakage of privacy information, making it difficult to balance data usability and security.

[0005] Third, the existing cache management mechanism is not specifically designed for the privacy attributes of medical data. The cached data is not strongly associated with and isolated from the user's unique identifier, and the cached data of different users is prone to confusion. Moreover, the encryption protection of the cached data is insufficient. Once the cache server is attacked, a large amount of poorly protected medical data will be directly leaked, posing a serious threat to user privacy.

[0006] Fourth, the existing unique number encryption algorithms do not adequately guarantee randomness and uniqueness. The number generation logic is easily reverse-engineered, making it impossible to effectively hide the user's true identity and data association. Furthermore, the binding method between the number and the original data is relatively simple, which can easily lead to problems such as number tampering and data mismatch during data flow, making it difficult to achieve accurate traceability and security control of medical data throughout its entire lifecycle.

[0007] Therefore, there is an urgent need to propose a simple, secure, and reliable method for identifier encryption and dynamic caching in data desensitization and transmission.

Summary of the Invention

[0008] To address the aforementioned problems, the present invention aims to provide a method for identifier encryption and dynamic caching in data de-identification transmission. The technical solution adopted by the present invention is as follows:

A method for identifier encryption and dynamic caching in data de-identification transmission, used for data de-identification transmission between the data requesting end, the cloud-based large model, and the local data server, includes the following steps:

Receiving data query commands from the cloud-based large model and verifying the validity of the verification token carried in each data query command, blocking illegal data query commands; forwarding the valid data query commands to the filtering module of the local data server, preventing the cloud-based large model from directly accessing the original database in the local data server;

Dynamically desensitizing privacy fields in the original database, generating a globally unique cache number based on a timestamp, a user identifier, and a random number, and establishing an association between the globally unique cache number and the legitimate data query command;

Constructing a de-identified dataset from the globally unique cache number and the dynamically de-identified data information, and encrypting and storing the de-identified dataset using the AES-256 algorithm; using the encrypted de-identified dataset as cache data associated with the globally unique cache number, and presetting a dynamic expiration mechanism for the cache data;

Feeding the globally unique cache number and the de-identified dataset back to the cloud-based large model; using the globally unique cache number to decrypt and obtain the user's original privacy data; and populating and transforming the de-identified dataset before feeding it back to the data requesting end.

[0009] Compared with the prior art, the present invention has the following beneficial effects: (1) This invention establishes logical isolation (by verifying the legality of the verification token) and blocks the cloud-based large model's direct access to the original database. Only data that has been screened, desensitized, and encrypted is transmitted to the cloud, which greatly reduces the amount of data sent over the public network, reduces the computational overhead of encryption and decryption, improves the response speed of the intelligent agent, and ensures the real-time nature of personalized medical services. At the same time, combining the instruction verification token mechanism with HTTPS two-way encrypted transmission avoids the single-key management risk of general-purpose encryption algorithms, reduces the probability of key leakage and data interception at the source, and significantly improves transmission security.

[0010] (2) The present invention adopts a dynamic desensitization strategy based on system load status. When the load is low, deep desensitization is enabled to fully hide privacy fields and generate cache numbers. When the load is high, it switches to lightweight desensitization to remove only explicit identity identifiers. The two modes can be automatically switched according to query needs, avoiding the problem of data failure due to excessive desensitization or privacy leakage due to insufficient desensitization. At the same time, the data filtering algorithm extracts only the minimum necessary information to support the large model's answer, ensuring that the large model can generate accurate personalized service results and that the original privacy data is not touched by the large model throughout the process, thus achieving the dual goals of data security and controllability and service accuracy.

[0011] (3) This invention generates a globally unique cache number through a multi-dimensional feature hybrid encryption algorithm of "timestamp + user identifier + random number", and uses the AES-256 algorithm to encrypt and store cache data, so that the cache data is strongly associated with the unique number to achieve logical isolation of multi-user data. Even if the cache server is attacked, the attacker cannot directly obtain the original diagnosis and treatment data. Combined with the dynamic expiration mechanism, the window for sensitive data exposure is shortened, providing full-cycle protection for user privacy.

[0012] (4) This invention uses a unique cache number to establish a strong association between desensitized data and original privacy data. After the large model returns the results, the cache number is used to accurately locate and decrypt the cache data. Based on the correspondence between the privacy fields recorded in the desensitization stage and the general expression, the preliminary answer is personalized and converted. The original privacy data and the large model are logically isolated throughout the process. This ensures that the large model does not come into contact with any original privacy data while outputting personalized results that fit the user's actual situation. This enables precise traceability and security control of medical data from reception, desensitization, transmission, processing to output.

[0013] In summary, this invention has the advantages of simple logic and high security and reliability, and has high practical and promotional value in the field of data anonymization and transmission technology.

Attached Figure Description

[0014] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of the present invention and should not be regarded as a limitation on the scope of protection. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0015] Figure 1 is a logic flowchart of the present invention.

Detailed Implementation

[0016] To make the objectives, technical solutions, and advantages of this application clearer, the present invention will be further described below with reference to the accompanying drawings and embodiments. The embodiments of the present invention include, but are not limited to, the following embodiments. All other embodiments obtained by those skilled in the art based on the embodiments in this application without inventive effort are within the scope of protection of this application.

[0017] In this embodiment, the term "and / or" is merely a description of the relationship between related objects, indicating that there can be three relationships. For example, A and / or B can represent three situations: A exists alone, A and B exist simultaneously, and B exists alone.

[0018] The terms "first" and "second," etc., used in the specification and claims of this embodiment are used to distinguish different objects, not to describe a specific order of objects. For example, "first target object" and "second target object," etc., are used to distinguish different target objects, not to describe a specific order of target objects.

[0019] As shown in Figure 1, this embodiment provides a method for identifier encryption and dynamic caching in data anonymization transmission, used for data anonymization transmission between the data requesting end, the cloud-based large model, and the local data server. It achieves logical isolation between the original sensitive data and the large model through a closed-loop process of "instruction reception → local data filtering and anonymization on the local data server → encryption caching and unique number binding → encrypted transmission → result association and conversion → user output," preventing the large model from directly accessing user privacy information.

[0020] The first step is to receive data query commands from the cloud-based large model and verify the validity of the verification token carried in the data query command, blocking illegal data query commands; then forward the valid data query commands to the filtering module of the local data server to prevent the cloud-based large model from directly accessing the original database in the local data server.

[0021] Only an instruction exchange channel is established between the local data server and the cloud-based large model. Direct access ports to the original database are not opened, and the data manipulation language (DML) constructed by the cloud-based large model is not executed, thus blocking the large model's direct access requests to the original database at the channel level.

[0022] The system obtains data query commands from the cloud-based large model and determines whether each command carries a verification token. If no verification token is included or the token field is empty, the query command is intercepted and marked as invalid. No further processing is performed on invalid query commands. In this embodiment, a verification token for the data query command is preset in the cloud-based large model, and this token serves as the sole legitimate basis for accessing the cloud-based large model's interface.

[0023] If the data query command carries a preset verification token and the token is not empty, the valid token library pre-stored in the local data server is called to check the token for consistency. Only after the check passes, that is, when the data query command is valid, is the DML constructed on the local side. The cloud-based large model is prohibited from directly performing CRUD (Create, Read, Update, Delete) operations on the database, and only query permission under preset rules is retained.

[0024] If the verification token matches the valid token library pre-stored in the local data server, the valid data query command is forwarded to the filtering module of the local data server; otherwise, the data query command is intercepted. The determination is expressed as: Check = (Token_req = Token_local); Legitimate = (Token_req ≠ ∅) AND Check; where Token_req represents the token carried by the command received at the interface; Token_local represents the locally stored valid token; Check represents the token consistency verification result; Legitimate represents the command validity determination result; ∅ indicates a null value; and AND is the logical AND operation.
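The two-stage check in this step (reject a missing or empty token, then compare against the locally stored token library) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `VALID_TOKENS`, `is_legitimate`, and `handle_query`, the command layout, and the sample token value are all hypothetical, and the constant-time comparison is an added precaution that the source does not mention.

```python
import hmac
from typing import Optional

# Hypothetical token library; the source does not specify how it is stored.
VALID_TOKENS = {"token-abc123"}


def is_legitimate(token: Optional[str]) -> bool:
    """Return True only if the token is non-empty and matches a stored token."""
    if not token:  # missing token or empty token field: the command is invalid
        return False
    # Compare against every stored token in constant time (an assumption,
    # not a requirement stated in the source).
    return any(
        hmac.compare_digest(token.encode("utf-8"), valid.encode("utf-8"))
        for valid in VALID_TOKENS
    )


def handle_query(command: dict) -> str:
    """Forward valid commands to the filtering module; intercept the rest."""
    if is_legitimate(command.get("token")):
        return "forwarded"
    return "intercepted"
```

A command without a token, with an empty token, or with an unknown token is intercepted; only a command carrying a stored token is forwarded.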

[0025] The second step involves dynamically anonymizing the privacy fields in the original database, generating a globally unique cache number based on a timestamp, user identifier, and random number, and establishing a link between this globally unique cache number and legitimate data query commands. Specifically: The dynamic desensitization strategy in this embodiment prioritizes deep desensitization under low load and automatically switches to lightweight desensitization under high load to ensure stable system operation and prevent system crashes.
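The load-based switch between the two desensitization modes might look like the sketch below. The source does not say how load is measured or where the switching point lies, so the `load` value (a fraction between 0 and 1) and `HIGH_LOAD_THRESHOLD` are assumptions made purely for illustration.

```python
from enum import Enum


class Mode(Enum):
    DEEP = "deep"
    LIGHTWEIGHT = "lightweight"


# Hypothetical threshold; the source gives no concrete load metric or value.
HIGH_LOAD_THRESHOLD = 0.8


def choose_mode(load: float) -> Mode:
    """Prefer deep desensitization; fall back to lightweight under high load."""
    if load >= HIGH_LOAD_THRESHOLD:
        return Mode.LIGHTWEIGHT
    return Mode.DEEP
```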

[0026] Lightweight desensitization (direct filtering mode): reads the user's original sensitive data stored on the local data server; according to the requirements of the query command, directly removes explicit identity fields such as name and mobile phone number (for example, the key fields username and phone_number) and returns structured object information; outputs a lightweight dataset that retains only non-privacy-related information, meeting the large model's need for fast responses to simple queries such as general rehabilitation suggestions and standardized knowledge question answering.
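As a rough sketch of the direct filtering mode, the function below drops the explicit identity fields named in this paragraph (`username` and `phone_number`) and keeps everything else. The function name and the record layout are hypothetical.

```python
# Field names taken from the paragraph above; the set itself is illustrative.
EXPLICIT_ID_FIELDS = {"username", "phone_number"}


def lightweight_desensitize(record: dict) -> dict:
    """Return a copy of the record without explicit identity fields."""
    return {
        key: value
        for key, value in record.items()
        if key not in EXPLICIT_ID_FIELDS
    }
```

For a record holding `username`, `phone_number`, and a hypothetical `rehab_stage` field, only `rehab_stage` would remain in the output.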

[0027] Deep anonymization (privacy field hiding + cache association mode): Reads the user's original sensitive data stored on the local data server; comprehensively hides various privacy data fields, not only covering structured identity information, but also deeply masking sensitive content such as specific dates, hospitals visited, and disease details in unstructured texts such as medical records.

[0028] In this embodiment, the user's original sensitive data stored in the local data server is read, the privacy fields in the user's original sensitive data are hidden, the data information after dynamic data desensitization is obtained, and the correspondence between the privacy fields and the general description is recorded.
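The hiding step together with the recorded correspondence can be illustrated as below. This is a simplified sketch under stated assumptions: it handles structured fields only (masking free-text medical records would need additional processing that is not shown), and the placeholder format `<FIELD>`, the function name, and the choice of privacy fields are hypothetical.

```python
from typing import Dict, Set, Tuple


def deep_desensitize(
    record: Dict[str, str], privacy_fields: Set[str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Hide privacy fields and record placeholder -> original value."""
    masked: Dict[str, str] = {}
    mapping: Dict[str, str] = {}
    for key, value in record.items():
        if key in privacy_fields:
            placeholder = f"<{key.upper()}>"  # generic expression for the field
            masked[key] = placeholder
            mapping[placeholder] = value  # correspondence kept locally only
        else:
            masked[key] = value
    return masked, mapping
```

Only `masked` would leave the local data server; `mapping` stays in the local cache so that the answer can later be filled in.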

[0029] Generating a globally unique cache number based on a timestamp, a user identifier, and a random number, and establishing an association between the globally unique cache number and a valid data query command, includes the following steps: The globally unique cache number is UID = SHA256(TS⊕ID⊕R), where TS represents the current millisecond-level timestamp, which ensures the temporal uniqueness of the number; ID represents the user identifier; R represents a 128-bit random number, which enhances the number's resistance to reverse derivation; ⊕ represents the XOR operation; and SHA256(·) represents the SHA-256 hash function. Because SHA-256 outputs 256 bits, the UID is a unique cache number of 64 hexadecimal characters.
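A possible reading of UID = SHA256(TS⊕ID⊕R) is sketched below. The source does not say how operands of different lengths are aligned before the XOR, so this example assumes each one is normalised to 16 bytes: the timestamp as a big-endian integer, the user identifier as the first 16 bytes of its SHA-256 digest, and R as 16 random bytes. That normalisation and the name `make_uid` are assumptions, not part of the source.

```python
import hashlib
import secrets
import time
from typing import Optional


def make_uid(
    user_id: str, ts_ms: Optional[int] = None, r: Optional[bytes] = None
) -> str:
    """Compute SHA256(TS xor ID xor R) and return it as 64 hex characters."""
    if ts_ms is None:
        ts_ms = int(time.time() * 1000)  # millisecond-level timestamp
    if r is None:
        r = secrets.token_bytes(16)  # 128-bit random number
    if len(r) != 16:
        raise ValueError("r must be exactly 16 bytes (128 bits)")
    ts_bytes = ts_ms.to_bytes(16, "big")
    # Assumption: the user identifier is reduced to 16 bytes by hashing.
    id_bytes = hashlib.sha256(user_id.encode("utf-8")).digest()[:16]
    mixed = bytes(a ^ b ^ c for a, b, c in zip(ts_bytes, id_bytes, r))
    return hashlib.sha256(mixed).hexdigest()
```

With fixed `ts_ms` and `r` the result is deterministic, which is convenient for testing; in normal use both are left unset so that every call yields a fresh number.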

[0030] The globally unique cache number is used as the cache key, and the data information before and after dynamic desensitization is used as the cache value, establishing an association between the globally unique cache number and the valid data query command. Outside the local data server, only the globally unique cache number and the desensitized data exist; inside the local data server, the corresponding cached data can be resolved from the globally unique cache number. Finally, the answer from the cloud-based large model is combined with the cached data on the local data server to generate a personalized answer. When a data query command has been validated and is executed against the database, a copy of the original (non-desensitized) data is produced. The desensitization algorithm then processes this copy, generating a globally unique cache number and the corresponding cached data list (the original data paired with the desensitized data). Each validated data query command therefore corresponds one-to-one with a globally unique cache number.

[0031] The third step involves creating a de-identified dataset using a globally unique cache ID and dynamically anonymized data. This dataset is then encrypted and stored using the AES-256 algorithm. The cache server only stores the encrypted content and its UID, ensuring that even if the cache server is attacked, the original data cannot be directly accessed. This encrypted de-identified dataset is used as cached data and associated with the globally unique cache ID. A dynamic expiration mechanism for the cached data is also pre-defined. For example, the cache storage time for the postoperative rehabilitation suggestion field is set to 7 days, and the cache storage time for the medication guidance field is set to 3 days, to reduce the exposure window for sensitive data.
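The per-field expiration described here (7 days for postoperative rehabilitation suggestions, 3 days for medication guidance) can be illustrated with a small in-memory stand-in for the cache server. The class `ExpiringCache`, the field keys, and the default lifetime are hypothetical; the AES-256 encryption step is omitted and the stored value is treated as opaque ciphertext bytes.

```python
import time
from typing import Dict, Optional, Tuple

DAY = 24 * 60 * 60

# Lifetimes from the embodiment; the field keys themselves are hypothetical.
FIELD_TTL_SECONDS = {"rehab_suggestion": 7 * DAY, "medication_guidance": 3 * DAY}
DEFAULT_TTL_SECONDS = 1 * DAY  # assumed default, not stated in the source


class ExpiringCache:
    """Maps a cache number (UID) to ciphertext with a per-field lifetime."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[bytes, float]] = {}

    def put(self, uid: str, field: str, ciphertext: bytes,
            now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        ttl = FIELD_TTL_SECONDS.get(field, DEFAULT_TTL_SECONDS)
        self._store[uid] = (ciphertext, now + ttl)

    def get(self, uid: str, now: Optional[float] = None) -> Optional[bytes]:
        now = time.time() if now is None else now
        entry = self._store.get(uid)
        if entry is None:
            return None
        ciphertext, expires_at = entry
        if now >= expires_at:  # expired: drop the entry and report a miss
            del self._store[uid]
            return None
        return ciphertext
```

In a Redis deployment, which the embodiment names as the store, the same effect would normally come from setting an expiry on each key rather than checking timestamps by hand.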

[0032] In this embodiment, the de-identified dataset is constructed by combining the globally unique cache number and the dynamically anonymized data information in key-value pairs. This key-value pair format is stored in a non-relational database, specifically Redis. This de-identified dataset allows the corresponding cached data to be located using the globally unique cache number.

[0033] The de-identified dataset is encrypted and stored using the AES-256 algorithm, and the encrypted de-identified dataset is used as cached data.

[0034] The fourth step involves sending the globally unique cache number and the de-identified dataset back to the cloud-based big model. The globally unique cache number is used to decrypt and obtain the user's original privacy data. The de-identified dataset is then populated and transformed before being sent back to the data requesting end.

[0035] (41) The globally unique cache number and the de-identified dataset are fed back to the cloud-based large model via HTTPS protocol. This includes: The local data server converts the globally unique cache number and the de-identified dataset into HTTP data, encrypts it into ciphertext using a symmetric session key, and generates a first data digest using a hash algorithm. The cloud-based large model decrypts the ciphertext using the symmetric session key to obtain the original HTTP data, and generates a second data digest using a hash algorithm. If the first data digest is the same as the second data digest, the globally unique cache number and the de-identified dataset are stored in the cloud-based large model.
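The digest comparison in this step can be sketched as follows. The session-key encryption is left out because in practice it is handled by the HTTPS/TLS layer, and the source does not name the hash algorithm, so SHA-256 is assumed here. The function names and the JSON payload layout are hypothetical.

```python
import hashlib
import hmac
import json


def make_payload(uid: str, dataset: dict) -> bytes:
    """Serialise the cache number and the de-identified dataset."""
    return json.dumps({"uid": uid, "data": dataset}, sort_keys=True).encode("utf-8")


def digest(payload: bytes) -> str:
    """Data digest of the payload (SHA-256 is an assumption)."""
    return hashlib.sha256(payload).hexdigest()


def verify(received_payload: bytes, first_digest: str) -> bool:
    """Receiver side: recompute the digest and compare with the sender's."""
    second_digest = digest(received_payload)
    return hmac.compare_digest(first_digest, second_digest)
```

The sender computes `digest(make_payload(...))` as the first digest; the receiver accepts and stores the payload only if `verify` returns True.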

[0036] (42) Locate the cached data using the globally unique cache number, decrypt the cached data using the AES-256 decryption key, and obtain the user's original privacy data in the original database; fill and transform the user's original privacy data based on the correspondence between privacy fields and general expressions.
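Filling the large model's preliminary answer from the recorded correspondence might look like the sketch below. It assumes the cached entry has already been located and decrypted, and that the correspondence is a dictionary from placeholder to original value; the name `fill_answer` and the `<FIELD>` placeholder format are hypothetical.

```python
from typing import Dict


def fill_answer(preliminary_answer: str, mapping: Dict[str, str]) -> str:
    """Replace each generic placeholder with the user's original value."""
    result = preliminary_answer
    for placeholder, original in mapping.items():
        result = result.replace(placeholder, original)
    return result
```

Because the substitution runs only on the local data server, the large model never sees the original values that end up in the personalised answer.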

[0037] The above embodiments are merely preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any changes made based on the design principles of the present invention, or any non-creative modifications made thereon, shall fall within the scope of protection of the present invention.

Claims

1. A method for identifier encryption and dynamic caching in data anonymization transmission, used for data anonymization transmission between the data requesting end, the cloud-based large model, and the local data server, characterized in that it includes the following steps:

Receiving data query commands from the cloud-based large model and verifying the validity of the verification token carried in each data query command, blocking illegal data query commands; forwarding the valid data query commands to the filtering module of the local data server, preventing the cloud-based large model from directly accessing the original database in the local data server;

Dynamically desensitizing privacy fields in the original database, generating a globally unique cache number based on a timestamp, a user identifier, and a random number, and establishing an association between the globally unique cache number and the legitimate data query command;

Constructing a de-identified dataset from the globally unique cache number and the dynamically de-identified data information, and encrypting and storing the de-identified dataset using the AES-256 algorithm; using the encrypted de-identified dataset as cache data associated with the globally unique cache number, and presetting a dynamic expiration mechanism for the cache data;

Feeding the globally unique cache number and the de-identified dataset back to the cloud-based large model; using the globally unique cache number to decrypt and obtain the user's original privacy data; and populating and transforming the de-identified dataset before feeding it back to the data requesting end.

2. The method for identifier encryption and dynamic caching in data desensitization transmission according to claim 1, characterized in that, The process of receiving data query commands from a cloud-based large model and verifying the validity of the verification token carried in the data query command, and intercepting illegal data query commands, includes the following steps: Obtain data query commands from the cloud-based large model and determine whether the data query command carries a verification token; if it does not carry a verification token or the verification token field is empty, then intercept the data query command and mark it as an illegal data query command. If the data query command contains a preset verification token and the verification token is not empty, then the valid token library pre-stored in the local data server is called to perform consistency verification on the verification token. If the verification token matches the valid token library pre-stored in the local data server, the valid data query command is forwarded to the filtering module of the local data server; otherwise, the data query command is intercepted.

3. The method for identifier encryption and dynamic caching in data desensitization transmission according to claim 2, characterized in that, Also includes: In the cloud-based large model, a verification token for data query commands is preset, and the verification token is used as the sole legitimate basis for accessing the cloud-based large model interface.

4. The method for identifier encryption and dynamic caching in data desensitization transmission according to claim 1 or 2, characterized in that dynamically desensitizing privacy fields in the original database includes the following steps: reading the user's original sensitive data stored on the local data server, hiding the privacy fields in the original sensitive data, obtaining the data information after dynamic desensitization, and recording the correspondence between the privacy fields and the general description.

5. The method for identifier encryption and dynamic caching in data desensitization transmission according to claim 4, characterized in that, Generate a globally unique cache number based on timestamp, user identifier, and random number, and establish an association between the globally unique cache number and a valid data query command, including the following steps: The globally unique cache number is UID=SHA256(TS⊕ID⊕R); where TS represents the current millisecond-level timestamp; ID represents the user identifier; R represents a 128-bit random number; ⊕ represents the XOR operation; and SHA256(·) represents the SHA-256 hash function. The globally unique cache number is used as the key of the cache, and the data information before and after dynamic data anonymization is used as the cache value to establish an association between the globally unique cache number and the valid data query command.

6. The method for identifier encryption and dynamic caching in data desensitization transmission according to claim 5, characterized in that, A de-identified dataset is constructed using a globally unique cache number and dynamically anonymized data information, and is encrypted and stored using the AES-256 algorithm. This encrypted de-identified dataset is then used as cache data and associated with the globally unique cache number, including the following steps: The de-identified dataset is constructed by using key-value pairs to represent the globally unique cache number and the de-identified dynamic data information. The de-identified dataset is encrypted and stored using the AES-256 algorithm, and the encrypted de-identified dataset is used as cached data.

7. The method for identifier encryption and dynamic caching in data desensitization transmission according to claim 6, characterized in that presetting the dynamic expiration mechanism for the cache data includes configuring a cache expiration time for any field based on the field's privacy level and / or business application scenario.

8. The method for identifier encryption and dynamic caching in data desensitization transmission according to claim 7, characterized in that, The globally unique cache number and the de-identified dataset are fed back to the cloud-based large model. The globally unique cache number is used to decrypt and obtain the user's original private data. The de-identified dataset is then populated and transformed, including the following steps: The globally unique cache number and the de-identified dataset are fed back to the cloud-based large model via HTTPS protocol; The cached data is located using a globally unique cache number and decrypted using an AES-256 decryption key to obtain the user's original privacy data from the original database. The user's original privacy data is then populated and transformed based on the correspondence between privacy fields and general expressions.

9. The method for identifier encryption and dynamic caching in data desensitization transmission according to claim 8, characterized in that feeding the globally unique cache number and the de-identified dataset back to the cloud-based large model via HTTPS includes the following steps: The local data server converts the globally unique cache number and the de-identified dataset into HTTP data, encrypts it into ciphertext using a symmetric session key, and generates a first data digest using a hash algorithm. The cloud-based large model uses the symmetric session key to decrypt the ciphertext to obtain the original HTTP data, and then uses the hash algorithm to generate a second data digest. If the first data digest is the same as the second data digest, the globally unique cache number and the de-identified dataset are stored in the cloud-based large model.