A file data integrity encryption storage method, system and medium
By employing multi-factor key derivation, block-level authentication and verification, and encrypted container encapsulation, this technology addresses the issues of a single key derivation mechanism, lack of file block verification, and overly coarse access control granularity in existing technologies. It achieves full lifecycle security protection for files in a distributed environment, ensuring data confidentiality, integrity, and traceability.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: SHANDONG HUIZHENG INFORMATION TECH CO LTD
- Filing Date: 2025-11-13
- Publication Date: 2026-06-26
AI Technical Summary
Existing technologies suffer from a single key derivation mechanism, a lack of file block verification, insufficient metadata encryption, and coarse-grained access control. As a result, files in a distributed environment cannot be fully protected from encryption and storage through to access, which affects their confidentiality, integrity, and traceability in complex network environments.
By employing multi-factor key derivation, block-level authentication and verification, encrypted container encapsulation, and proxy re-encryption access control, we achieve full lifecycle security protection for file data.
It simultaneously ensures data confidentiality, tamper resistance, and verifiable access throughout the entire file lifecycle, achieving fine-grained secure storage and access control.
Description
Technical Field
[0001] This application relates to the field of data encryption and storage technology, and in particular to a method, system and medium for file data integrity encryption and storage.
Background Technology
[0002] Files often involve multiple levels of security protection and integrity verification during storage and transmission. Especially in complex application scenarios such as distributed storage, cloud collaboration, and remote sharing, traditional file encryption mechanisms are no longer sufficient to meet the security requirements throughout the entire file lifecycle.
[0003] Currently, existing file encryption storage technologies mainly include static encryption schemes based on single-key derivation, file integrity verification methods based on hash checks, and access control mechanisms relying on a central server. However, these methods all have varying degrees of security vulnerabilities in practical use. For example, the single-key derivation strategy reuses the same key across different devices, allowing attackers to decrypt multiple files once they obtain user passwords or derivation parameters; integrity detection schemes based on single hash checks can only verify overall consistency and cannot quickly locate tampering at the file block level; while centralized access control systems suffer from single points of failure and rigid permission management, lacking dynamic key negotiation and temporary authorization capabilities.
[0004] In summary, existing technologies suffer from several technical problems. These include a single key derivation mechanism, lack of file block verification, insufficient metadata encryption, and overly coarse access control. Consequently, files in distributed environments cannot achieve full security protection from encryption and storage to access, further affecting the confidentiality, integrity, and traceability of files in complex network environments.
Summary of the Invention
[0005] The purpose of this application is to provide a method, system, and medium for encrypting and storing complete file data, in order to solve the technical problems in the prior art, such as the inability to achieve full security protection of files in a distributed environment from encryption and storage to access due to the single key derivation mechanism, lack of file block verification, insufficient metadata encryption, and coarse access control granularity, which further affect the confidentiality, integrity, and traceability of files in complex network environments.
[0006] In view of the above problems, this application provides a method, system and medium for encrypting and storing file data to ensure its integrity.
[0007] Firstly, this application provides a file data integrity encryption and storage method, implemented through a file data integrity encryption and storage system, comprising: performing multi-factor key derivation based on a user password, a device hardware identifier, and a random salt value to obtain a derived key; performing file block encryption and authentication tag generation on a target file using the derived key to obtain block-level encrypted data units; constructing a block-level integrity structure for the block-level encrypted data units based on a tree data structure to obtain a root hash value; encapsulating the block-level encrypted data units and the root hash value into an encrypted container to obtain a container digest value; performing digital signature and timestamp binding on the container digest value; and performing distributed encrypted storage and redundant backup on the signed encrypted container to complete secure access to the distributed encrypted storage data.
[0008] Preferably, the file data integrity encryption and storage method further includes: receiving the user password, wherein the user password satisfies a preset strength policy; obtaining the device hardware identifier and performing irreversible hashing on the device hardware identifier to generate a device identifier digest; collecting the random salt value of the file to be protected; introducing a non-exportable random secret value; and performing key derivation operations on the user password, the device identifier digest, the random salt value, and the random secret value to generate a master derived key.
[0009] Preferably, the file data integrity encryption and storage method further includes: using the master derived key as a seed to perform multi-purpose key layering to generate the derived key, wherein the derived key includes a data encryption key, an authentication key, and a key encapsulation key; wherein the intermediate keys produced during the master derived key generation process and the derived key generation process are held in a controlled memory area under a non-interchangeable strategy and are erased from memory after use.
[0010] Preferably, the file data integrity encryption and storage method further includes: dividing the target file into continuous data blocks according to a preset block division strategy; generating an initialization vector for each data block in the continuous data blocks based on a random number generator; performing authenticated symmetric encryption on each data block with its initialization vector based on the derived key to output ciphertext data while generating an authentication tag corresponding to the ciphertext data; and combining the ciphertext data and the authentication tag to obtain the block-level encrypted data unit.
[0011] Preferably, the file data integrity encryption and storage method further includes: defining input items for the ciphertext data and authentication tags in the block-level encrypted data units to generate a ciphertext sequence and an authentication tag sequence; constructing leaf node inputs based on the ciphertext sequence and the authentication tag sequence; and performing bottom-up iterative hash calculations starting from the leaf node inputs to obtain the root hash value.
[0012] Preferably, the method for encrypting and storing complete file data further includes: using the ciphertext sequence, the authentication tag sequence, the root hash value, and the encrypted data block of the target file as data to be encapsulated; constructing a logical structure of the encrypted container using a hierarchical format; assembling the logical structure based on the data to be encapsulated to form a container binary image; and performing a hash operation on the container binary image to obtain the container digest value of the encrypted container.
[0013] Preferably, the method for encrypting and storing complete file data further includes: classifying the file metadata dataset of the target file at the field level into ordinary metadata and sensitive metadata; marking the sensitive metadata according to a predefined sensitive field table to form a sensitive field set; and encrypting the sensitive field set using an independent metadata key to generate the encrypted data block.
[0014] Preferably, the method for encrypting and storing complete file data further includes: receiving an access request; authenticating the user based on the access request to obtain the user's identity; generating a temporary session key according to the request category in the access request and the user's identity; performing access according to the temporary session key; destroying the temporary session key after the access is completed; and updating the access audit log.
[0015] Secondly, this application also provides a file data integrity encrypted storage system for executing a file data integrity encrypted storage method as described in the first aspect, comprising: a derived key obtaining module for performing multi-factor key derivation based on user password, device hardware identifier, and random salt value to obtain a derived key; an encrypted data unit obtaining module for performing file block encryption and authentication tag generation on a target file using the derived key to obtain block-level encrypted data units; a root hash value obtaining module for constructing a block-level integrity structure for the block-level encrypted data units based on a tree data structure to obtain a root hash value; a container digest value obtaining module for performing encrypted container encapsulation based on the block-level encrypted data units and the root hash value to obtain a container digest value; and a secure access module for performing digital signature and timestamp binding on the container digest value, and performing distributed encrypted storage and redundant backup on the signed encrypted container to complete secure access to the distributed encrypted storage data.
[0016] Thirdly, this application provides a computer-readable storage medium storing a computer program which, when executed, implements the steps of the file data integrity encryption and storage method described in any one of the first aspects above.
[0017] The technical solution provided in this application has at least the following technical effects or advantages: by achieving the integrated secure storage goal based on multi-factor key derivation, block-level authentication and verification, encrypted container encapsulation and proxy re-encryption access control, it achieves the technical effect of simultaneously ensuring data confidentiality, tamper-proofness and verifiable access throughout the entire life cycle of the file.
[0018] The above description is merely an overview of the technical solution of this application. To enable a clearer understanding of the technical means of this application and to facilitate its implementation according to the description, and to make the above and other objects, features, and advantages of this application more apparent, specific embodiments of this application are described below. It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of this application, nor is it intended to limit the scope of this application. Other features of this application will become readily apparent through the following description.
Attached Figure Description
[0019] To more clearly illustrate the technical solutions in this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are merely exemplary. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.
[0020] Figure 1 is a flowchart of the file data integrity encryption and storage method described in this application.
[0021] Figure 2 is a schematic structural diagram of the file data integrity encrypted storage system according to this application.
[0022] The reference numerals in the figures denote: derived key obtaining module 1, encrypted data unit obtaining module 2, root hash value obtaining module 3, container digest value obtaining module 4, and secure access module 5.
Detailed Implementation
[0023] This application provides a method, system, and medium for encrypting and storing complete file data. It addresses the technical problems in existing technologies where, due to a single key derivation mechanism, lack of file block verification, insufficient metadata encryption, and overly coarse access control granularity, files cannot achieve full security protection from encryption and storage to access in a distributed environment, further affecting the confidentiality, integrity, and traceability of files in complex network environments. It achieves an integrated secure storage goal based on multi-factor key derivation, block-level authentication verification, encrypted container encapsulation, and proxy re-encryption access control, simultaneously ensuring data confidentiality, tamper resistance, and verifiable access throughout the entire file lifecycle.
[0024] The technical solutions of this application will now be clearly and completely described with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. It should be understood that this application is not limited to the exemplary embodiments described herein. All other embodiments obtained by those skilled in the art based on the embodiments of this application without creative effort are within the scope of protection of this application. It should also be noted that, for ease of description, only the parts related to this application are shown in the accompanying drawings, not all of them.
[0025] Embodiment 1: Referring to Figure 1, this application provides a file data integrity encryption and storage method, which is applied to a file data integrity encryption and storage system and specifically includes the following steps:
[0026] S1: Multi-factor key derivation is performed based on the user password, device hardware identifier, and random salt value to obtain a derived key.
[0027] Furthermore, this application also includes: receiving the user password, wherein the user password satisfies a preset strength policy; obtaining the device hardware identifier and performing irreversible hashing on the device hardware identifier to generate a device identifier digest; collecting the random salt value of the file to be protected; introducing a non-exportable random secret value; and performing key derivation operations on the user password, the device identifier digest, the random salt value, and the random secret value to generate a master derived key.
[0028] Furthermore, this application also includes: using the master derived key as a seed to perform multi-purpose key layering to generate the derived key, wherein the derived key includes a data encryption key, an authentication key, and a key encapsulation key; wherein the intermediate keys produced during the master derived key generation process and the derived key generation process are held in a controlled memory area under a non-interchangeable strategy and are erased from memory after use.
[0029] Specifically, the system receives password information input by the user, which is the user's input used for key generation. The preset strength policy refers to the complexity requirements for the user password, such as a minimum length of eight characters, the inclusion of uppercase and lowercase letters, numbers, and special symbols, and the prohibition of common words or consecutive numbers. Setting a preset strength policy prevents attackers from obtaining user passwords through simple cracking or dictionary attacks.
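The example policy in the preceding paragraph can be sketched as a simple check. This is a minimal illustration only: the function names, the tiny `COMMON_WORDS` deny-list, and the choice to treat a run of four ascending digits as "consecutive numbers" are assumptions, not details given in the application.

```python
import re

# Hypothetical deny-list; a real deployment would load a much larger dictionary.
COMMON_WORDS = {"password", "qwerty", "letmein", "admin"}


def has_consecutive_digits(password: str, run: int = 4) -> bool:
    """Return True if the password contains `run` ascending digits such as '1234'."""
    for i in range(len(password) - run + 1):
        window = password[i:i + run]
        if window.isascii() and window.isdigit() and all(
            int(window[j + 1]) - int(window[j]) == 1 for j in range(run - 1)
        ):
            return True
    return False


def meets_strength_policy(password: str) -> bool:
    """Check length >= 8, mixed case, a digit, a symbol, no common word, no digit run."""
    if len(password) < 8:
        return False
    required_classes = [r"[a-z]", r"[A-Z]", r"[0-9]", r"[^A-Za-z0-9]"]
    if not all(re.search(pattern, password) for pattern in required_classes):
        return False
    lowered = password.lower()
    if any(word in lowered for word in COMMON_WORDS):
        return False
    return not has_consecutive_digits(password)
```

A password that fails any single rule is rejected before key derivation starts, so weak passwords never reach the derivation step.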
[0030] Subsequently, the hardware identifier of the currently used device is obtained, which could be the device's motherboard serial number or network card MAC address. To prevent the original hardware identifier from being stolen during transmission or storage, an irreversible hashing process is performed on the device's hardware identifier, that is, using a hash algorithm to convert it into a fixed-length digest data. Due to the one-way nature of hash algorithms, even if an attacker obtains the hash result, they cannot deduce the original device information, thus protecting device privacy and security.
[0031] Next, a random salt value is acquired for the file to be protected. The random salt value is a randomly generated piece of data used to introduce a unique interference factor into each key derivation process, so that the same password produces completely different key results under different files or different users.
[0032] A non-exportable random secret value is introduced: it is generated inside the secure hardware module and cannot be exported through any external interface. Even if an attacker possesses the user password, salt value, and device information, they cannot generate the same key without this random secret value, thus enhancing security.
[0033] The user password, device identifier digest, random salt value, and random secret value are input together into the key derivation algorithm. The key derivation algorithm is a high-security algorithm, such as Argon2id or PBKDF2, configured with a sufficiently high iteration count and, where the algorithm supports it, memory overhead. For example, the iteration count can be set to 50,000 and the memory overhead to 64 megabytes to increase the cost of cracking. After these iterations, the key derivation algorithm outputs a high-strength master derived key.
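As a rough sketch of the four-factor derivation described above, the following uses PBKDF2-HMAC-SHA256 from the Python standard library with the 50,000-iteration example value. The application does not say how the four factors are combined, so mixing the device digest and the secret value into the salt input is an assumption, as are the function names. PBKDF2 has no memory-cost parameter, so the 64-megabyte setting would apply only to a memory-hard choice such as Argon2id.

```python
import hashlib
import secrets


def device_digest(hardware_id: str) -> bytes:
    """One-way SHA-256 digest of the raw hardware identifier."""
    return hashlib.sha256(hardware_id.encode("utf-8")).digest()


def _length_prefixed(part: bytes) -> bytes:
    """Prefix a field with its length so concatenated fields stay unambiguous."""
    return len(part).to_bytes(4, "big") + part


def derive_master_key(password: str, hardware_id: str, salt: bytes,
                      secret: bytes, iterations: int = 50_000) -> bytes:
    """Derive a 32-byte master key from password, device digest, salt and secret.

    Assumption: the non-password factors are folded into the PBKDF2 salt.
    """
    combined_salt = hashlib.sha256(
        _length_prefixed(salt)
        + _length_prefixed(device_digest(hardware_id))
        + _length_prefixed(secret)
    ).digest()
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), combined_salt, iterations, dklen=32
    )


# Example inputs (all hypothetical). The secret stands in for the value that
# would stay inside the secure hardware module.
file_salt = secrets.token_bytes(16)
hardware_secret = secrets.token_bytes(32)
master_key = derive_master_key(
    "Tr0ub4dor&3x", "00:1A:2B:3C:4D:5E", file_salt, hardware_secret
)
```

Changing any one of the four inputs yields an unrelated key, which is the property the multi-factor design relies on.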
[0034] Multi-purpose key layering uses the master derived key as a seed and generates, through a key derivation function, an independent subkey for each security operation, forming a layered key system and avoiding the risk of reusing a single key across multiple functions. For example, a layered structure based on the HKDF algorithm can be used, introducing a different tag parameter in each derivation context so that each derived key is independent of the others.
[0035] Subsequently, the generated derived keys mainly include a data encryption key, an authentication key, and a key encapsulation key. The data encryption key is used to encrypt the actual content of the file, directly ensuring file confidentiality. The authentication key is used to generate and verify authentication tags, detecting whether data has been tampered with during transmission or storage, ensuring integrity. The key encapsulation key protects other keys or intermediate security parameters, ensuring that the keys exist in encrypted form during transmission or storage. This separation of the three key functions allows for fine-grained control without compromising the overall security structure. For example, when a file needs to be updated, only the data encryption key needs to be regenerated, without needing to derive the authentication or encapsulation keys again, thus improving efficiency.
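The HKDF-based layering mentioned above might look like the following sketch, which implements the extract and expand steps of HKDF (RFC 5869) with HMAC-SHA256 and derives the three purpose keys from one master key. The label strings and the `layer_keys` name are illustrative assumptions; the application only states that different tag parameters are used per context.

```python
import hashlib
import hmac
from typing import Dict

HASH_LEN = 32  # output size of SHA-256 in bytes


def hkdf_extract(salt: bytes, input_key: bytes) -> bytes:
    """HKDF-Extract: condense the input key into a pseudorandom key."""
    return hmac.new(salt or b"\x00" * HASH_LEN, input_key, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand: stretch the pseudorandom key under a context label."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested length too large for HKDF")
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def layer_keys(master_key: bytes) -> Dict[str, bytes]:
    """Derive one independent 32-byte key per purpose (labels are hypothetical)."""
    prk = hkdf_extract(b"", master_key)
    labels = {
        "data_encryption": b"file-data-encryption",
        "authentication": b"file-authentication",
        "key_encapsulation": b"key-encapsulation",
    }
    return {purpose: hkdf_expand(prk, label) for purpose, label in labels.items()}
```

Because each key depends on its own label, a new data encryption key can be produced under a fresh label without touching the authentication or encapsulation keys.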
[0036] Intermediate keys arise during both the master derived key generation phase and the derived key generation phase. An intermediate key is a temporary variable generated during the execution of the key derivation algorithm; if it is stolen by an attacker, it may be used to deduce the final key. Therefore, a non-interchangeable strategy is adopted, which prohibits the intermediate key from being copied, exported, or stored in uncontrolled areas; it can exist only temporarily in a secure, controlled memory area. A controlled memory area refers to an area pre-allocated by the operating system or encryption module and accessed only for secure operations, such as a trusted execution environment or secure memory pages.
[0037] Furthermore, once the intermediate key has completed its computational task, a memory erasure operation is performed immediately. Memory erasure refers to overwriting the memory space occupied by the intermediate key with random data or zero values, ensuring that even if an attacker physically reads the memory data, they cannot recover the original key information, thereby blocking potential leakage paths during key generation and derivation.
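The erase-after-use step can be illustrated with a mutable buffer that is overwritten in place. This is a best-effort sketch under the assumption that the key material lives in a `bytearray`; a Python runtime may still hold other copies (for example immutable `bytes` objects), so it does not replace the trusted execution environment or secure memory pages described above. The helper names are hypothetical.

```python
import hashlib


def erase(buffer: bytearray) -> None:
    """Overwrite every byte of the buffer with zero, in place."""
    for i in range(len(buffer)):
        buffer[i] = 0


def fingerprint_then_erase(intermediate_key: bytearray) -> str:
    """Use an intermediate key once, then erase it even if the use fails."""
    try:
        # Placeholder for the real computation that consumes the key.
        return hashlib.sha256(intermediate_key).hexdigest()
    finally:
        erase(intermediate_key)
```

Placing the erasure in a `finally` clause means the buffer is cleared on both the normal and the error path.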
[0038] S2: The target file is encrypted in blocks and authentication tags are generated using the derived key to obtain block-level encrypted data units.
[0039] Furthermore, this application also includes: dividing the target file according to a preset block division strategy to obtain continuous data blocks; generating an initialization vector for each data block in the continuous data blocks based on a random number generator; performing authenticated symmetric encryption on each data block with its initialization vector based on the derived key to output ciphertext data, and simultaneously generating an authentication tag corresponding to the ciphertext data; and combining the ciphertext data and the authentication tag to obtain the block-level encrypted data unit.
[0040] Specifically, the target file is divided according to a preset chunking strategy to obtain contiguous data blocks. A chunking strategy is a rule for dividing a large file into multiple fixed- or variable-sized data segments, which can be set based on the total file size, the block size of the storage system, or network transmission efficiency. For example, a 100-megabyte file can be divided into 100 data blocks of one megabyte each. This facilitates parallel encryption, fast verification, and independent access to partial data during subsequent transmission or storage. Contiguous data blocks are file segments arranged sequentially in logical order, each containing a complete portion of the original file content, so that the complete file can be reconstructed when the blocks are merged.
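A fixed-size chunking strategy of the kind described above can be sketched as follows. The one-mebibyte default and the function names are assumptions for illustration; the application also allows variable-sized blocks, which this sketch does not cover.

```python
from typing import Iterator

MIB = 1024 * 1024


def split_into_blocks(data: bytes, block_size: int = MIB) -> Iterator[bytes]:
    """Yield contiguous blocks in logical order; the last block may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def block_count(total_size: int, block_size: int = MIB) -> int:
    """Number of blocks needed for a file of total_size bytes (ceiling division)."""
    return -(-total_size // block_size)
```

Concatenating the yielded blocks in order reproduces the original data, which matches the reconstruction requirement stated in the paragraph.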
[0041] Next, an initialization vector is generated for each data block in the consecutive data blocks. The initialization vector is a random input value used in the encryption algorithm. By using a random number generator to generate a different initialization vector for each data block, even if the same plaintext is encrypted in different blocks, the output ciphertext will be completely different, effectively preventing the risk of information leakage due to ciphertext duplication. For example, if two data blocks have the same content but different initialization vectors, the encryption results will be completely different, thus improving the ability to resist analytical attacks. The random number generator uses cryptographically secure algorithms, such as a generation method based on system entropy pooling, to ensure that the generated vector is unpredictable and unique each time.
[0042] Subsequently, authenticated symmetric encryption is performed on each data block and its corresponding initialization vector using the derived key. Authenticated symmetric encryption is an algorithm that verifies data integrity while encrypting it. During encryption, not only is ciphertext data generated, but an authentication tag is also output to verify whether the data has been tampered with during decryption. For example, when a one-megabyte data block is encrypted, it may result in one megabyte of ciphertext and a sixteen-byte authentication tag, which is equivalent to an encrypted digital fingerprint. The tag will only pass verification if the decryption end uses the same key and vector; otherwise, it indicates that the data has been altered during transmission or storage.
[0043] After the ciphertext and authentication tags have been generated, the ciphertext data is combined with the corresponding authentication tags to form complete block-level encrypted data units. Each encrypted data unit is an independent secure packet containing ciphertext and verification information, and is therefore self-verifying. Even when only a portion of the data blocks is accessed or transmitted in a large-scale distributed storage system, each encrypted data unit can be verified independently without relying on full file verification.
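To make the per-block flow concrete, the sketch below generates a random initialization vector, encrypts a block, and attaches a 16-byte tag. The Python standard library has no AES-GCM, so this uses an encrypt-then-MAC construction with an HMAC-SHA256 keystream purely as a stand-in for a real authenticated cipher; it is not the cipher the application specifies and should not be used in production. Storing the initialization vector inside the unit, the 12-byte vector length, and all names are assumptions.

```python
import hashlib
import hmac
import secrets

IV_LEN, TAG_LEN = 12, 16


def _keystream(enc_key: bytes, iv: bytes, length: int) -> bytes:
    """Stand-in keystream: HMAC-SHA256 over the IV and a block counter."""
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        stream += hmac.new(enc_key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(stream[:length])


def encrypt_block(enc_key: bytes, auth_key: bytes, plaintext: bytes) -> bytes:
    """Return one encrypted data unit laid out as IV || ciphertext || tag."""
    iv = secrets.token_bytes(IV_LEN)  # fresh random IV for every block
    stream = _keystream(enc_key, iv, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(auth_key, iv + ciphertext, hashlib.sha256).digest()[:TAG_LEN]
    return iv + ciphertext + tag


def decrypt_block(enc_key: bytes, auth_key: bytes, unit: bytes) -> bytes:
    """Verify the tag first, then decrypt; raise ValueError on any mismatch."""
    if len(unit) < IV_LEN + TAG_LEN:
        raise ValueError("encrypted data unit is too short")
    iv, ciphertext, tag = unit[:IV_LEN], unit[IV_LEN:-TAG_LEN], unit[-TAG_LEN:]
    expected = hmac.new(auth_key, iv + ciphertext, hashlib.sha256).digest()[:TAG_LEN]
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication tag mismatch")
    stream = _keystream(enc_key, iv, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))
```

Because the tag is checked before decryption, a unit that was altered in storage or transit is rejected on its own, without needing the rest of the file.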
[0044] S3: Based on the tree data structure, construct the block-level integrity structure of the block-level encrypted data unit to obtain the root hash value.
[0045] Furthermore, this application also includes: defining input items for the ciphertext data and authentication tags in the block-level encrypted data units to generate a ciphertext sequence and an authentication tag sequence; constructing leaf node inputs based on the ciphertext sequence and the authentication tag sequence; and performing bottom-up iterative hash calculations starting from the leaf node inputs to obtain the root hash value.
[0046] Specifically, defining input items for ciphertext data and authentication tags in block-level encrypted data units refers to extracting each encrypted file data block and its corresponding authentication tag, and arranging them in a uniform format to form an input set. Ciphertext data refers to the data that cannot be directly read after the original file has been converted by the encryption algorithm, while authentication tags are verification information used to verify the integrity and authenticity of the ciphertext. The process of defining input items establishes an ordered mapping relationship for each pair of ciphertext and tags, enabling subsequent batch calculations and structured processing. For example, if a file is divided into 10 data blocks, and each data block has a corresponding authentication tag, then the result of defining input items is 10 pairs of ciphertext and tags, forming a sequential list, providing the input basis for the next step of hash tree calculation.
[0047] Next, leaf node inputs are constructed based on the ciphertext sequence and authentication tag sequence, meaning that each defined pair of ciphertext and authentication tags is used as a leaf node of the hash tree. A hash tree is a tree-like data structure used to ensure the integrity of large amounts of data during transmission and storage. The input to each leaf node is a concatenation or combination of ciphertext and authentication tags, which is transformed into a fixed-length hash value using a hash algorithm. Then, starting from the leaf node input, a bottom-up iterative hash calculation is performed, that is, the hash values of adjacent leaf nodes are combined again and hashed, repeating this process until only one hash value remains. The final hash value is called the root hash value, representing the overall verification identifier of all encrypted data blocks and authentication information in the entire file.
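The bottom-up hash tree calculation described above can be sketched as follows, with each leaf built from one ciphertext and its authentication tag. Several details are assumptions because the application leaves them open: SHA-256 as the hash, the one-byte prefixes that separate leaf hashes from interior hashes, duplicating the last node when a level has an odd count, and keeping the trusted leaf hashes so that altered blocks can be located.

```python
import hashlib
from typing import List, Sequence, Tuple

Unit = Tuple[bytes, bytes]  # (ciphertext, authentication tag)


def leaf_hash(ciphertext: bytes, tag: bytes) -> bytes:
    """Hash one leaf input: the concatenation of ciphertext and tag."""
    return hashlib.sha256(b"\x00" + ciphertext + tag).digest()


def merkle_root(units: Sequence[Unit]) -> bytes:
    """Combine adjacent hashes level by level until a single root remains."""
    if not units:
        raise ValueError("at least one encrypted data unit is required")
    level: List[bytes] = [leaf_hash(c, t) for c, t in units]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def tampered_blocks(units: Sequence[Unit], trusted_leaves: Sequence[bytes]) -> List[int]:
    """Return the indexes of blocks whose leaf hash no longer matches."""
    return [
        i for i, (c, t) in enumerate(units) if leaf_hash(c, t) != trusted_leaves[i]
    ]
```

A change to any single block alters its leaf hash and therefore the root, while comparing leaf hashes narrows the change down to the affected block.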
[0048] S4: Perform encryption container encapsulation based on the block-level encrypted data unit and the root hash value to obtain the container digest value.
[0049] Furthermore, this application also includes: using the ciphertext sequence, the authentication tag sequence, the root hash value, and the encrypted data block of the target file as the data to be encapsulated; constructing a logical structure of the encrypted container using a hierarchical format; assembling the logical structure based on the data to be encapsulated to form a container binary image; and performing a hash operation on the container binary image to obtain the container digest value of the encrypted container.
[0050] Furthermore, this application also includes: classifying the file metadata dataset of the target file at the field level into ordinary metadata and sensitive metadata; marking the sensitive metadata according to a predefined sensitive field table to form a sensitive field set; and encrypting the sensitive field set using an independent metadata key to generate the encrypted data block.
[0051] Specifically, the data to be encapsulated consists of the ciphertext sequence, the authentication tag sequence, the root hash value, and the encrypted data block of the target file, so the composition of the encapsulated content is fixed at the initial stage of generating the encrypted container. The encrypted data block is produced from the file metadata as follows. First, the file metadata dataset of the target file is classified at the field level into ordinary metadata and sensitive metadata, so that the descriptive information accompanying the target file is managed according to its security importance. File metadata describes file attributes, such as filename, creation time, modifier, version number, file size, and access permissions. Field-level classification means analyzing the security attributes of each metadata field individually, rather than processing the entire metadata uniformly. Ordinary metadata refers to content that poses no privacy or security risk, such as file format or size information; sensitive metadata may contain user identifiers, geographical locations, business key identifiers, or file content digests, and its leakage may pose system security or privacy risks. Classification makes subsequent encryption more precise and efficient, avoiding the performance loss caused by indiscriminate encryption. Next, sensitive metadata is marked according to a predefined sensitive field table to form a sensitive field set. The predefined sensitive field table defines the types or names of fields considered sensitive, and can be configured according to industry standards, security levels, or organizational policies, which allows flexible adaptation to different security requirements; for example, in research data, the names of experimenters and the location of experiments might be listed as sensitive fields. Marking labels the fields in the file metadata so that subsequent encryption modules can directly identify and protect them. The resulting sensitive field set is the collection of all marked fields and is encrypted independently. Then, the sensitive field set is encrypted using an independent metadata key to generate the encrypted data block. The purpose of the independent key is key isolation: even if the master encryption key is leaked, the metadata portion cannot be decrypted, which improves overall security. The encryption uses a symmetric encryption algorithm, such as AES, which converts the sensitive field set into ciphertext using the metadata key and a random initialization vector. The encrypted data block is stored or encapsulated separately from ordinary metadata to prevent sensitive information from being read or parsed during file exchange, cloud storage, or cross-system transmission.
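The field-level classification step can be sketched as a lookup against the sensitive field table, followed by serializing the sensitive field set into bytes ready for encryption under the separate metadata key. The field names, the table contents, and the JSON serialization are illustrative assumptions; the encryption itself is omitted here.

```python
import json
from typing import Any, Dict, Set, Tuple

# Hypothetical sensitive field table for a research-data deployment.
SENSITIVE_FIELD_TABLE: Set[str] = {
    "experimenter_name",
    "experiment_location",
    "user_id",
}


def classify_metadata(
    metadata: Dict[str, Any],
    sensitive_fields: Set[str] = SENSITIVE_FIELD_TABLE,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split metadata into (ordinary, sensitive) by checking each field name."""
    ordinary = {k: v for k, v in metadata.items() if k not in sensitive_fields}
    sensitive = {k: v for k, v in metadata.items() if k in sensitive_fields}
    return ordinary, sensitive


def serialize_sensitive(sensitive: Dict[str, Any]) -> bytes:
    """Serialize the sensitive field set deterministically before encryption."""
    return json.dumps(sensitive, sort_keys=True).encode("utf-8")
```

Only the serialized sensitive set would be passed to the cipher, so ordinary fields such as file size stay readable without any decryption cost.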
[0052] Furthermore, constructing the logical structure of the encrypted container using a hierarchical format refers to organizing data according to a predefined hierarchy and structure, which makes the encrypted container parseable and extensible. A hierarchical format means the encrypted container is divided into multiple logical layers, such as a header layer, index layer, data layer, and verification layer. The header layer contains metadata such as container identifier, version number, encryption algorithm information, and timestamps; the index layer describes the offset and length of each data block so that blocks can be located quickly; the data layer contains ciphertext data and authentication tags; and the verification layer stores the root hash value and related verification information. This hierarchical structure allows for targeted updates or verification of specific parts without unpacking the entire container.
[0053] Then, the logical structure is assembled from the data to be encapsulated to form a container binary image; that is, the logical structure is converted into a storable or transmittable form according to binary encoding rules. The logical structure is merely an abstract data organization scheme, while the binary image is its physical representation, existing as a continuous byte stream. Serializing, aligning, and padding the data at each level allows the image to be recognized and parsed on different platforms.
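As a non-limiting illustration, serialization with alignment and padding may be sketched as follows. The magic value, the 8-byte alignment, and the field widths are assumptions chosen for the example and are not specified by this application. A fixed (big-endian) byte order is used so that the image parses identically on different platforms.

```python
import struct

MAGIC = b"FCNT"   # hypothetical 4-byte container identifier
ALIGN = 8         # hypothetical alignment boundary in bytes

def pad(buf: bytes, align: int = ALIGN) -> bytes:
    """Zero-pad buf so that its length is a multiple of align."""
    return buf + b"\x00" * (-len(buf) % align)

def build_image(version: int, blocks: list, root_hash: bytes) -> bytes:
    """Serialize header, index, data and verification layers into one byte stream."""
    data = b""
    index = b""
    for block in blocks:
        # Index entry: offset into the data layer and unpadded block length.
        index += struct.pack(">QI", len(data), len(block))
        data += pad(block)
    # Header: identifier, version number, and block count.
    header = struct.pack(">4sHI", MAGIC, version, len(blocks))
    return pad(header) + pad(index) + data + pad(root_hash)

image = build_image(1, [b"abc", b"0123456789"], b"\x11" * 32)
```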
[0054] Finally, a hash operation is performed on the container binary image to obtain the container digest value of the encrypted container. This means performing a complete hash calculation on the packaged container file to generate a unique digital fingerprint. The hash operation uses algorithms such as SHA-256 or BLAKE3, calculating the encrypted container content byte by byte to output a fixed-length digest value. The container digest value is used to verify whether the container has been modified or corrupted during storage or transmission, because even slight byte changes will result in completely different hash values.
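As a non-limiting illustration, the container digest may be computed with SHA-256 as follows. The function name and chunk size are hypothetical. The container is read in chunks so that large container files need not be loaded into memory at once, and a single flipped bit is used to show that the digest changes.

```python
import hashlib
import io

def container_digest(stream, chunk_size: int = 65536) -> str:
    """Hash a container image incrementally and return the hex digest."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

image = b"\x00" * 1000 + b"example container image"
original = container_digest(io.BytesIO(image))

# Flip one bit in the middle of the image and hash again.
flipped = bytearray(image)
flipped[500] ^= 0x01
modified = container_digest(io.BytesIO(bytes(flipped)))
```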
[0055] S5: Digitally sign and timestamp the container digest value, and perform distributed encrypted storage and redundant backup on the signed encrypted container to complete secure access to the distributed encrypted storage data.
[0056] Furthermore, this application also includes: receiving an access request, authenticating the user identity based on the access request, generating a temporary session key according to the request category in the access request and the user identity, performing access according to the temporary session key, destroying the temporary session key after the access is completed, and updating the access audit log.
[0057] Specifically, digitally signing and timestamping the container digest value signifies a joint authentication of the entire encrypted container's trustworthiness and timeliness after encapsulation. A digital signature is generated by applying the signer's private key to the digest value, ensuring that only the entity possessing the private key can produce the signature, while any other party can verify its authenticity using the corresponding public key. Timestamping involves a trusted timestamp server marking the signature result, proving that the signature existed at a specific point in time and has not been tampered with since. For example, when the digest value of a file container is a 64-character hexadecimal hash string, this hash is signed using the signer's private key and combined with the timestamp from the timestamp server to generate a complete signature record. This not only ensures the trustworthiness of the container's content source but also proves that the file existed before a certain point in time, preventing forgery or subsequent modification.
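As a non-limiting illustration, a signature record binding the digest value to a timestamp may be sketched as follows. This is a toy: it uses textbook RSA without padding over a small modulus built from two known Mersenne primes, chosen only so the example runs with the standard library. It is not secure, and a real deployment would use a vetted signature library and obtain the timestamp from a trusted timestamp server. All names and constants are hypothetical.

```python
import hashlib

# Toy key material: insecure, for illustration only.
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537                  # public key (modulus, exponent)
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent

def _message_int(digest_hex: str, timestamp: int) -> int:
    """Hash the digest together with the timestamp and reduce it modulo N."""
    m = hashlib.sha256(f"{digest_hex}|{timestamp}".encode()).digest()
    return int.from_bytes(m, "big") % N

def sign(digest_hex: str, timestamp: int) -> dict:
    """Produce a signature record with the private exponent."""
    signature = pow(_message_int(digest_hex, timestamp), D, N)
    return {"digest": digest_hex, "timestamp": timestamp, "signature": signature}

def verify(record: dict) -> bool:
    """Check a signature record using only the public key (N, E)."""
    expected = _message_int(record["digest"], record["timestamp"])
    return pow(record["signature"], E, N) == expected

# The timestamp is passed in by the caller here; it stands in for the value
# that a trusted timestamp server would supply.
record = sign("ab" * 32, 1700000000)
forged = dict(record, digest="cd" * 32)
```

Because the timestamp is hashed together with the digest, changing either field after signing causes verification to fail.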
[0058] Next, distributed encrypted storage and redundant backup of the signed encrypted container refers to storing the fully signed encrypted container on multiple independent storage nodes to improve security and availability. Distributed encrypted storage means that the container file is divided into several parts, each encrypted separately and stored on different servers or cloud nodes, thereby avoiding single points of failure and centralized data leakage. Redundant backup further improves disaster recovery by storing identical copies of the encrypted data across multiple nodes, so that if one node is damaged or fails, the data can be quickly recovered from other nodes.
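As a non-limiting illustration, splitting the container and placing redundant copies on several nodes may be sketched as follows. The round-robin placement rule, node names, and function names are assumptions made for the example; the separate encryption of each part is omitted.

```python
def split_parts(container: bytes, part_size: int) -> list:
    """Divide the container image into consecutive parts."""
    return [container[i:i + part_size] for i in range(0, len(container), part_size)]

def place_parts(num_parts: int, nodes: list, replicas: int) -> dict:
    """Assign each part to `replicas` distinct nodes in round-robin order."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        part: [nodes[(part + r) % len(nodes)] for r in range(replicas)]
        for part in range(num_parts)
    }

def recoverable(placement: dict, failed: set) -> bool:
    """True if every part still has a copy on at least one healthy node."""
    return all(any(node not in failed for node in holders)
               for holders in placement.values())

parts = split_parts(b"x" * 10, 4)
placement = place_parts(len(parts), ["node-a", "node-b", "node-c"], replicas=2)
```

With two replicas per part, the loss of any single node leaves every part recoverable, while the simultaneous loss of two nodes may not.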
[0059] Subsequently, an access request is received, and authentication is performed based on the request to obtain the user's identity. This signifies that the user's identity is verified when attempting to access encrypted files. The access request includes information such as user identifier, access purpose, and the type of operation required. Authentication is completed by comparing credentials, keys, or biometrics. For example, it may require entering a username and password or verifying fingerprints, device IDs, or two-factor authentication codes. Upon successful authentication, an identity token bound to the user is generated, serving as the basis for subsequent access control. If authentication fails, the access request is denied, thereby preventing unauthorized users from accessing sensitive data.
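As a non-limiting illustration, credential comparison and identity token issuance may be sketched as follows. Password-based authentication is only one of the options named above; the iteration count, the in-memory user table, and the token format are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000   # hypothetical work factor

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a verifier so that the password itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def authenticate(user_db: dict, user_id: str, password: str):
    """Return an identity token on success, or None to deny the request."""
    entry = user_db.get(user_id)
    if entry is None:
        return None
    salt, stored = entry
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(hash_password(password, salt), stored):
        return None
    return {"user_id": user_id, "token": secrets.token_urlsafe(32)}

salt = secrets.token_bytes(16)
user_db = {"alice": (salt, hash_password("correct horse", salt))}
granted = authenticate(user_db, "alice", "correct horse")
denied = authenticate(user_db, "alice", "wrong password")
```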
[0060] Next, based on the request category and user identity in the access request, a temporary session key is generated. This means that at the start of the access session, a unique encryption key is dynamically created for the current access operation, valid only during that session. The request category refers to the specific type of access, such as reading a file, modifying a file, or downloading a container; different categories of operations may require different permission ranges or security levels. Temporary session keys are typically generated through key negotiation algorithms or proxy re-encryption mechanisms, without directly exposing the master key information, thus enabling fine-grained control. For example, when a user only requests file digest verification, a session key with only read permissions can be generated, while when the user is an administrator, the session key includes write or delete permissions.
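As a non-limiting illustration, a session key bound to the user identity and request category may be sketched as follows. The description names key negotiation and proxy re-encryption as typical mechanisms; this sketch substitutes a simpler HMAC-based derivation from a server-side secret, purely to show the binding and the permission scoping. The permission table, role names, and lifetime are hypothetical.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical mapping from request category to the permissions it grants.
CATEGORY_PERMISSIONS = {
    "verify_digest": {"read"},
    "read_file": {"read"},
    "modify_file": {"read", "write"},
}

def new_session(access_secret: bytes, user_id: str, role: str, category: str,
                ttl_seconds: int = 300) -> dict:
    """Create a one-off session key bound to the user and request category."""
    if category not in CATEGORY_PERMISSIONS:
        raise ValueError(f"unknown request category: {category}")
    permissions = set(CATEGORY_PERMISSIONS[category])
    if role == "admin":
        permissions |= {"write", "delete"}
    nonce = secrets.token_bytes(16)   # makes every session key unique
    info = b"|".join([user_id.encode(), category.encode(), nonce])
    key = hmac.new(access_secret, info, hashlib.sha256).digest()
    return {
        "key": bytearray(key),        # mutable so it can be wiped after use
        "user_id": user_id,
        "category": category,
        "permissions": permissions,
        "expires_at": time.time() + ttl_seconds,
    }

access_secret = secrets.token_bytes(32)   # stands in for server-side key material
reader = new_session(access_secret, "alice", "user", "verify_digest")
admin = new_session(access_secret, "root", "admin", "read_file")
```

The master key never appears in the session record; only a key derived for this one session is handed to the access path.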
[0061] Subsequently, access is performed based on the temporary session key. After the access is complete, the temporary session key is destroyed, and the access audit log is updated. This means that all encryption and decryption operations during the access process are based on the session key, and the original key of the encrypted container is never exposed to the visitor. After the access is completed, the temporary key is immediately cleared from memory to prevent it from being intercepted or reused by malicious programs. Simultaneously, the complete access behavior is recorded, including visitor identity, time, operation type, access result, and source IP address, and stored in the access audit log. The audit log serves as a security traceability and behavior auditing tool, helping administrators discover abnormal access, detect potential attacks, and provide evidence for subsequent security analysis. For example, if a user is found to have accessed encrypted files multiple times outside of working hours, an alert can be automatically triggered, and the relevant session can be frozen.
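As a non-limiting illustration, destroying the session key and updating the audit log after every access, whether it succeeds or fails, may be sketched as follows. The log format and function names are hypothetical. Overwriting a `bytearray` is a best-effort measure in Python, since the runtime may hold other copies of the data.

```python
import json
import time

def destroy_key(key: bytearray) -> None:
    """Overwrite the key bytes in place."""
    for i in range(len(key)):
        key[i] = 0

def audit(log: list, user_id: str, operation: str, result: str, source_ip: str) -> None:
    """Append one access record to the audit log."""
    log.append(json.dumps({
        "user_id": user_id,
        "time": time.time(),
        "operation": operation,
        "result": result,
        "source_ip": source_ip,
    }, sort_keys=True))

def run_access(session_key: bytearray, log: list, user_id: str,
               operation: str, source_ip: str, action):
    """Run `action` with the session key, then always wipe the key and log."""
    outcome = "failure"
    try:
        result = action(session_key)
        outcome = "success"
        return result
    finally:
        destroy_key(session_key)
        audit(log, user_id, operation, outcome, source_ip)

key = bytearray(b"\x01" * 32)
log = []
out = run_access(key, log, "alice", "read_file", "192.0.2.10", lambda k: len(k))
```

Placing the wipe and the log update in a `finally` clause means a failed access is still recorded and still leaves no usable key behind.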
[0062] In summary, the file data integrity encryption storage method provided in this application has the following technical effect: by integrating multi-factor key derivation, block-level authentication and verification, encrypted container encapsulation, and proxy re-encryption access control into a single secure storage scheme, it simultaneously ensures data confidentiality, tamper resistance, and verifiable access throughout the entire file lifecycle.
[0063] Embodiment 2: Based on the same inventive concept as the file data integrity encryption storage method in the foregoing embodiment, this application also provides a file data integrity encryption storage system. Referring to Figure 2, the system includes: a derived key acquisition module 1, used to obtain a derived key by performing multi-factor key derivation based on the user password, device hardware identifier, and random salt value; an encrypted data unit acquisition module 2, used to perform file block encryption and authentication tag generation on the target file using the derived key, to obtain a block-level encrypted data unit; a root hash value acquisition module 3, used to construct a block-level integrity structure for the block-level encrypted data unit based on a tree data structure, to obtain a root hash value; a container digest value acquisition module 4, used to perform encrypted container encapsulation based on the block-level encrypted data unit and the root hash value, to obtain a container digest value; and a secure access module 5, used to digitally sign and timestamp the container digest value, and perform distributed encrypted storage and redundant backup on the signed encrypted container, to complete secure access to the distributed encrypted storage data.
[0064] Furthermore, the file data integrity encryption storage system is also used for: receiving the user password, the user password satisfying a preset strength policy; obtaining the device hardware identifier, and performing irreversible hashing on the device hardware identifier to generate a device identifier digest; collecting the random salt value of the file to be protected; introducing a non-derivable random secret value; and performing key derivation operations on the user password, the device identifier digest, the random salt value, and the random secret value to generate a master derived key.
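As a non-limiting illustration, the multi-factor derivation of the master derived key may be sketched as follows. The choice of PBKDF2-HMAC-SHA-256, the way the non-password factors are combined into the salt input, the iteration count, and the strength policy are all assumptions made for the example; this application does not prescribe them.

```python
import hashlib
import secrets

def meets_policy(password: str, min_length: int = 12) -> bool:
    """Hypothetical strength policy: minimum length, a letter, and a digit."""
    return (len(password) >= min_length
            and any(c.isdigit() for c in password)
            and any(c.isalpha() for c in password))

def device_digest(hardware_id: str) -> bytes:
    """Irreversible hash of the device hardware identifier."""
    return hashlib.sha256(hardware_id.encode("utf-8")).digest()

def derive_master_key(password: str, dev_digest: bytes, salt: bytes,
                      secret: bytes, iterations: int = 100_000) -> bytes:
    """Combine all four factors into a 32-byte master derived key."""
    if not meets_policy(password):
        raise ValueError("password does not satisfy the strength policy")
    # Length-prefix each factor so that the boundaries between them are unambiguous.
    factors = b"".join(len(f).to_bytes(2, "big") + f
                       for f in (dev_digest, salt, secret))
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               factors, iterations, dklen=32)

salt = secrets.token_bytes(16)     # per-file random salt value
secret = secrets.token_bytes(32)   # random secret value
master_key = derive_master_key("correct-horse-42", device_digest("SN-0001"), salt, secret)
```

Changing any one factor, for example moving to a different device, yields an unrelated master key.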
[0065] Furthermore, the file data integrity encryption storage system is also used to: perform multi-purpose key layering using the master derived key as a seed to generate the derived key, wherein the derived key includes a data encryption key, an authentication key, and a key encapsulation key; wherein the intermediate keys in the master derived key generation process and the derived key generation process are stored in a controlled memory area through a non-interchangeable strategy and are erased from memory after use.
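As a non-limiting illustration, multi-purpose key layering may be sketched with the expand step of HKDF (RFC 5869), using one label per purpose. The use of HKDF, the label strings, and the function names are assumptions; the sketch also assumes the master derived key is already uniformly random, so it is used directly as the pseudorandom key. Memory protection of the controlled memory area is outside what the standard library offers and is not shown.

```python
import hashlib
import hmac

def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand with HMAC-SHA-256: T(i) = HMAC(prk, T(i-1) | info | i)."""
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]

def layer_keys(master_key: bytes) -> dict:
    """Derive one independent key per purpose from the master derived key."""
    labels = ("data-encryption", "authentication", "key-encapsulation")
    return {label: bytearray(hkdf_expand(master_key, label.encode()))
            for label in labels}

def wipe(buf: bytearray) -> None:
    """Overwrite key material in place after use (best effort in Python)."""
    buf[:] = bytes(len(buf))

keys = layer_keys(b"\x07" * 32)   # placeholder master key for the example
```

Because each purpose uses a distinct label, the three keys are computationally independent even though they share one seed.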
[0066] Furthermore, the file data integrity encryption storage system is also used for: dividing the target file according to a preset block division strategy to obtain continuous data blocks; generating an initialization vector for each data block in the continuous data blocks based on a random number generator; performing authenticated symmetric encryption on the initialization vector based on the derived key to output ciphertext data and simultaneously generating an authentication tag corresponding to the ciphertext data; and combining the ciphertext data and the authentication tag to obtain the block-level encrypted data unit.
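As a non-limiting illustration, block division and per-block initialization vector generation may be sketched as follows. The sketch assumes the initialization vector serves as the per-block nonce of an authenticated cipher that encrypts each data block; the cipher itself is passed in as a caller-supplied function `aead(key, iv, plaintext)` returning `(ciphertext, tag)`, which is a hypothetical interface standing in for, say, AES-GCM from a vetted library. The block size and IV length are likewise assumptions.

```python
import secrets

BLOCK_SIZE = 4 * 1024 * 1024   # hypothetical block size (4 MiB)
IV_LEN = 12                    # nonce length commonly used with AES-GCM

def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Divide the file content into continuous data blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def encrypt_blocks(data: bytes, key: bytes, aead, block_size: int = BLOCK_SIZE) -> list:
    """Encrypt each block under a fresh random IV and combine ciphertext with tag."""
    units = []
    for index, block in enumerate(split_blocks(data, block_size)):
        iv = secrets.token_bytes(IV_LEN)          # never reused across blocks
        ciphertext, tag = aead(key, iv, block)
        units.append({"index": index, "iv": iv, "unit": ciphertext + tag})
    return units

blocks = split_blocks(b"a" * 10, block_size=4)
```

Keeping the block index alongside each unit preserves the original order when the units are later used as leaf inputs or reassembled.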
[0067] Furthermore, the file data integrity encryption storage system is also used to: define input items for ciphertext data and authentication tags in the block-level encrypted data unit, and generate ciphertext sequence and authentication tag sequence; construct leaf node input based on the ciphertext sequence and authentication tag sequence, and perform bottom-up iterative leaf hash calculation with the leaf node input to obtain the root hash value.
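As a non-limiting illustration, the bottom-up hash calculation may be sketched as a Merkle-style tree over the ciphertext and authentication tag of each block. The use of SHA-256, the one-byte prefixes that distinguish leaves from internal nodes, and the rule of promoting an unpaired node to the next level are assumptions made for the example.

```python
import hashlib

def leaf_hash(ciphertext: bytes, tag: bytes) -> bytes:
    """Leaf node input: one block's ciphertext followed by its authentication tag."""
    return hashlib.sha256(b"\x00" + ciphertext + tag).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    """Internal node: hash of the two child hashes."""
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(leaves: list) -> bytes:
    """Combine leaf hashes pairwise, level by level, until one root remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        nxt = [node_hash(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # an unpaired node moves up unchanged
        level = nxt
    return level[0]

units = [(b"c0", b"t0"), (b"c1", b"t1"), (b"c2", b"t2")]
leaves = [leaf_hash(c, t) for c, t in units]
root = merkle_root(leaves)
```

A change to any single ciphertext or tag changes its leaf hash and therefore the root, so comparing root hash values detects tampering in any block.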
[0068] Furthermore, the file data integrity encryption storage system is also used for: using the ciphertext sequence, the authentication tag sequence, the root hash value, and the encrypted data block of the target file as data to be encapsulated; constructing a logical structure of the encryption container using a hierarchical format; assembling the logical structure based on the data to be encapsulated to form a container binary image; and performing a hash operation on the container binary image to obtain the container digest value of the encryption container.
[0069] Furthermore, the file data integrity encryption storage system is also used for: classifying the file metadata dataset of the target file at the field level into ordinary metadata and sensitive metadata; marking the sensitive metadata according to a predefined sensitive field table to form a sensitive field set; and encrypting the sensitive field set using an independent metadata key to generate the encrypted data block.
[0070] Furthermore, the file data integrity encryption storage system is also used for: receiving an access request, authenticating the user based on the access request to obtain the user identity; generating a temporary session key according to the request category in the access request and the user identity; performing access according to the temporary session key; destroying the temporary session key after the access is completed; and updating the access audit log.
[0071] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from the other embodiments. The file data integrity encryption storage method and specific examples in Embodiment 1 above are also applicable to the file data integrity encryption storage system in this embodiment. Through the foregoing detailed description of the file data integrity encryption storage method, those skilled in the art can clearly understand the file data integrity encryption storage system in this embodiment. Therefore, for the sake of brevity, it will not be described in detail here.
[0072] Embodiment 3: Based on the same inventive concept as the file data integrity encryption storage method in the foregoing embodiments, this application also provides a computer-readable storage medium storing a computer program which, when executed, implements the steps of the file data integrity encryption storage method described in Embodiment 1 above.
[0073] The above description of the disclosed embodiments enables those skilled in the art to make or use this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
[0074] Obviously, those skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. Therefore, if such modifications and variations fall within the scope of this application and its equivalents, this application also intends to include such modifications and variations.
Claims
1. A method for encrypting and storing file data with complete integrity, characterized in that it includes: performing multi-factor key derivation based on a user password, a device hardware identifier, and a random salt value to obtain a derived key; performing file block encryption and authentication tag generation on a target file using the derived key to obtain a block-level encrypted data unit; constructing a block-level integrity structure for the block-level encrypted data unit based on a tree data structure to obtain a root hash value, which includes: defining input items for the ciphertext data and authentication tags in the block-level encrypted data unit to generate a ciphertext sequence and an authentication tag sequence; constructing leaf node inputs based on the ciphertext sequence and the authentication tag sequence; and performing bottom-up iterative leaf hash calculations using the leaf node inputs to obtain the root hash value; performing encrypted container encapsulation based on the block-level encrypted data unit and the root hash value to obtain a container digest value; and digitally signing and timestamping the container digest value, and performing distributed encrypted storage and redundant backup on the signed encrypted container, to complete secure access to the distributed encrypted storage data; wherein obtaining the container digest value includes: using the ciphertext sequence, the authentication tag sequence, the root hash value, and an encrypted data block of the target file as the data to be encapsulated; constructing the logical structure of the encrypted container using a hierarchical format; assembling the logical structure based on the data to be encapsulated to form a container binary image, the logical structure being converted into a storable or transmittable entity form according to binary encoding rules, wherein the logical structure is an abstract data organization scheme and the binary image is its physical representation, existing as a continuous byte stream, and the data at each level is serialized, aligned, and padded so that the image can be recognized and parsed on different platforms; and performing a hash operation on the container binary image to obtain the container digest value of the encrypted container; and wherein obtaining the encrypted data block includes: classifying the file metadata dataset of the target file at the field level into ordinary metadata and sensitive metadata; marking the sensitive metadata according to a predefined sensitive field table to form a sensitive field set; and encrypting the sensitive field set using an independent metadata key to generate the encrypted data block.
2. The method for encrypting and storing file data with complete integrity as described in claim 1, characterized in that, before obtaining the derived key, the method includes: receiving the user password, the user password satisfying a preset strength policy; obtaining the device hardware identifier and performing irreversible hashing on the device hardware identifier to generate a device identifier digest; collecting the random salt value of the file to be protected; introducing a non-derivable random secret value; and performing key derivation operations on the user password, the device identifier digest, the random salt value, and the random secret value to generate a master derived key.
3. The method for encrypting and storing file data with complete integrity as described in claim 2, characterized in that obtaining the derived key includes: performing multi-purpose key layering using the master derived key as a seed to generate the derived key, wherein the derived key includes a data encryption key, an authentication key, and a key encapsulation key; wherein the intermediate keys in the master derived key generation process and the derived key generation process are stored in a controlled memory area through a non-interchangeable strategy and are erased from memory after use.
4. The method for encrypting and storing file data with complete integrity as described in claim 1, characterized in that obtaining the block-level encrypted data unit includes: dividing the target file according to a preset block division strategy to obtain continuous data blocks; generating an initialization vector for each data block in the continuous data blocks based on a random number generator; performing authenticated symmetric encryption on the initialization vector based on the derived key to output ciphertext data and simultaneously generating an authentication tag corresponding to the ciphertext data; and combining the ciphertext data and the authentication tag to obtain the block-level encrypted data unit.
5. The method for encrypting and storing file data with complete integrity as described in claim 1, characterized in that performing secure access includes: receiving an access request and authenticating the user based on the access request to obtain the user identity; generating a temporary session key according to the request category in the access request and the user identity; performing access according to the temporary session key; and, after the access is completed, destroying the temporary session key and updating the access audit log.
6. A file data integrity encryption storage system, characterized in that it is used to implement the steps of the file data integrity encryption storage method according to any one of claims 1 to 5, and includes: a derived key acquisition module, used to perform multi-factor key derivation based on the user password, device hardware identifier, and random salt value to obtain a derived key; an encrypted data unit acquisition module, used to perform file block encryption and authentication tag generation on the target file using the derived key, to obtain a block-level encrypted data unit; a root hash value acquisition module, used to construct a block-level integrity structure for the block-level encrypted data unit based on a tree data structure to obtain a root hash value, including: defining input items for the ciphertext data and authentication tags in the block-level encrypted data unit to generate a ciphertext sequence and an authentication tag sequence; constructing leaf node inputs based on the ciphertext sequence and the authentication tag sequence; and performing bottom-up iterative leaf hash calculations using the leaf node inputs to obtain the root hash value; a container digest value acquisition module, used to perform encrypted container encapsulation based on the block-level encrypted data unit and the root hash value to obtain a container digest value; and a secure access module, used to digitally sign and timestamp the container digest value, and to perform distributed encrypted storage and redundant backup on the signed encrypted container, to complete secure access to the distributed encrypted storage data.
7. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed, implements the steps of the file data integrity encryption storage method according to any one of claims 1 to 5.