A data encryption method and an encrypted query method

By using homomorphic encryption and secure indexes based on SAR-Tree structures, the problem of inaccurate queries under encryption methods is solved, enabling efficient queries while protecting data privacy, thus improving query efficiency and security.

CN119848889B · Active · Publication Date: 2026-06-30 · JIANGNAN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
JIANGNAN UNIV
Filing Date
2024-12-19
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing encryption methods, due to changes in data structure, cannot achieve accurate queries in an encrypted state, resulting in low query efficiency and insufficient privacy protection capabilities, which can easily lead to information leakage.

Method used

A public-private key pair is created using homomorphic encryption to encrypt the target data. A secure index based on the SAR-Tree structure is constructed, and the secure index is used for querying. By calculating the loose and exact scores of non-leaf nodes, a descending heap and pruner are constructed to obtain the target encrypted result set. Finally, the data is decrypted using the noise vector and the private key.

Benefits of technology

Enables efficient querying of encrypted data, reduces computational and storage overhead, improves query efficiency, and prevents privacy leaks during the query process, providing comprehensive protection for data privacy, query privacy, result privacy, and access patterns.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention relates to the field of data management technology and discloses a data encryption method and an encrypted query method, comprising encrypted upload on the client side and querying on the cloud server. A key pair is created using a homomorphic encryption method and used to encrypt the data. With the encrypted data as nodes, a secure index based on a SAR-Tree structure is constructed and sent to the cloud server for storage. During querying, an encrypted query request is obtained, the loose score corresponding to each non-leaf node in the secure index is calculated, and heap sort is used to arrange the scores in descending order, yielding the corresponding descending heap. Exact scores are then calculated to prune the descending heap, yielding the target encrypted result set. A preset noise component is added to each node in the target encrypted result set to obtain the noise vector and the encrypted interference points. The encrypted interference points are decrypted using the private key to obtain the decrypted data, which is sent to the client to restore the original data. This invention improves query efficiency and ensures data query security.

Description

Technical Field

[0001] This invention relates to the field of data management technology, and in particular to a data encryption method and an encrypted query method.

Background Technology

[0002] With the development of cloud computing infrastructure, more and more small and medium-sized enterprises are migrating their services and data to cloud platforms. To protect data privacy, data is usually encrypted before being outsourced; however, the query performance of encrypted data can be affected.

[0003] After data is encrypted, query performance is often affected due to the complexity of the encryption algorithm and the change in data structure. Traditional encryption methods such as AES, while providing strong data encryption capabilities, require decryption operations during queries, which increases query complexity and time costs. Furthermore, encrypted data may also face additional overhead during storage and transmission.

[0004] Order-Preserving Encryption (OPE) and Order-Revealing Encryption (ORE) remove the need to decrypt during a query, but both are built around order comparisons. Range queries can be run directly on the encrypted data, whereas exact-value queries usually require decryption first. As a result, exact location and distance calculations cannot be performed without decryption.

[0005] Furthermore, the methods described above have shortcomings in protecting access pattern privacy. Access patterns refer to the patterns and habits of users in querying data. By analyzing access patterns, attackers may be able to infer sensitive user information or the purpose of their queries. However, the methods described above do not fully consider this aspect, making users' access patterns vulnerable to leakage.

[0006] In summary, existing encrypted query methods cannot achieve accurate queries in an encrypted state because encryption changes the data structure, which reduces query efficiency. Furthermore, existing methods have insufficient privacy protection capabilities, which can easily lead to information leakage.

Summary of the Invention

[0007] Therefore, the technical problem to be solved by the present invention is that, in the prior art, encryption changes the data structure so that accurate queries cannot be performed in the encrypted state, which reduces query efficiency and leaves privacy protection insufficient.

[0008] To address the aforementioned technical problems, this invention provides a data encryption method applied to a client, comprising:

[0009] Use homomorphic encryption to create a key pair containing a public key and a private key;

[0010] Use the public key to encrypt the target data and obtain the corresponding encrypted data;

[0011] Using encrypted data as nodes, a secure index based on the SAR-Tree structure is built and sent to a cloud server for storage.

[0012] Preferably, the homomorphic encryption method includes: Paillier algorithm, RSA algorithm and ElGamal algorithm.
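The additive homomorphism that the later scoring steps rely on can be illustrated with a toy Paillier implementation. This is a minimal sketch, not the patent's implementation: the function names, the tiny demo primes, and the simplification g = n + 1 are all assumptions chosen for readability, and the parameters are far too small for real use.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key generation. p and q are tiny demo primes (assumption)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1 is used
    return (n, n + 1), (lam, mu)  # (public key, private key)

def encrypt(pub, m):
    """Encrypt integer m (0 <= m < n) under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:  # r must be invertible modulo n
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover the plaintext with the private key."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)
```

For example, `add_encrypted(pub, encrypt(pub, 12), encrypt(pub, 30))` decrypts to the sum of the two plaintexts without either value being revealed to the party that performs the addition.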

[0013] Preferably, the secure index based on the SAR-Tree structure includes:

[0014] Leaf nodes, represented as: [expression omitted in source];

[0015] Ordinary non-leaf nodes, represented as: [expression omitted in source];

[0016] Non-leaf nodes containing blind objects, represented as: [expression omitted in source];

[0017] where the symbols (omitted in the source text) denote, in order: the encrypted data represented by the i-th leaf node in the secure index; the total number of data nodes in the subtree rooted at the node; the minimum bounding rectangle of the node; the center of the node's bounding rectangle; the encryption operator; the pointer from the node to a node at the next level; the combination of encrypted carriers corresponding to all blind objects under the same parent node; the bucket of nodes that share the same parent node; and the function that retrieves child nodes from its two inputs, whose expression is also omitted. The remaining terms denote the subtree containing the non-leaf node with the blind object, a blind object identified by its layer and position, the total number of leaf nodes associated with the parent node, and the total number of blind objects that each leaf node can hold.

[0018] The present invention also provides an encrypted query method based on the data encryption method described above, applied to a cloud server, including:

[0019] Obtain the encrypted query request, which the client produces by encrypting the original query request with the public key of the key pair;

[0020] Based on the encrypted query request, the secure index and the private key, calculate the loose score corresponding to each non-leaf node in the secure index, and use heap sort to sort it in descending order to obtain the corresponding descending heap.

[0021] Based on a descending heap, the non-leaf node with the largest loose score is obtained, and its child nodes are accessed. Based on the dimensions of the non-leaf node and its corresponding child nodes, the result vector is obtained, and the loose score of the non-leaf node is updated to obtain the exact score corresponding to the non-leaf node.

[0022] Calculate the exact score of each node in the descending heap sequentially, and construct a pruner and an encrypted result set based on the exact score and a preset pruning threshold, until all remaining uncalculated nodes in the descending heap are dominated by the nodes in the pruner, and obtain the target encrypted result set.

[0023] Add a preset noise component to each node in the target encrypted result set to obtain the noise vector and the set of encrypted interference points. Use the private key to decrypt each encrypted interference point in the set to obtain the corresponding decrypted data set.

[0024] The decrypted data set and noise vector are sent to the client so that the client can use the noise vector to denoise each decrypted data in the decrypted data set, obtain the corresponding original data, and restore the data to be queried in the target dataset required by the original query request.
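The "descending heap" used in the steps above can be sketched with Python's `heapq`, which is a min-heap, so scores are negated to pop the largest first. This is a plaintext illustration only: in the method described here the scores are ciphertexts and comparisons go through a secure protocol, and the helper names `build_descending_heap` and `pop_best` are hypothetical.

```python
import heapq

def build_descending_heap(scored_nodes):
    """Build a heap from (node_id, loose_score) pairs.

    heapq pops the smallest item, so each score is stored negated and
    the node with the largest loose score comes out first.
    """
    heap = [(-score, node_id) for node_id, score in scored_nodes]
    heapq.heapify(heap)
    return heap

def pop_best(heap):
    """Remove and return (node_id, loose_score) for the largest score."""
    neg_score, node_id = heapq.heappop(heap)
    return node_id, -neg_score
```

Popping repeatedly visits nodes in descending order of loose score, which is the order in which exact scores would then be computed.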

[0025] Preferably, calculating the loose score corresponding to each non-leaf node in the secure index, based on the encrypted query request, the secure index, and the private key, includes:

[0026] Initialize the loose score of each non-leaf node in the set of non-leaf objects, which consists of all non-leaf nodes in the secure index, to [initial value omitted in source];

[0027] Traverse each non-leaf node in the set of non-leaf objects and determine whether the layer number of the non-leaf node is greater than 2;

[0028] If the layer number of the non-leaf node is not greater than 2, do not update the loose score of the non-leaf node;

[0029] If the layer number of the non-leaf node is greater than 2, determine whether a dominating object of the non-leaf node exists in the set of non-leaf objects;

[0030] If it exists, recursively call the procedure on the child nodes that the non-leaf node points to. Based on the homomorphic property, add the loose score of the child nodes of the non-leaf node to the total number of data nodes in the subtree rooted at the non-leaf node, then add the result to the loose score of the non-leaf node and take the sum as the loose score of the non-leaf node;

[0031] If it does not exist, do not update the loose score of the non-leaf node.

[0032] Preferably, obtaining the result vector based on the dimensions of the non-leaf node and its corresponding child node includes:

[0033] Take the non-leaf node and its corresponding child node as the first encrypted object and the second encrypted object, and calculate the spatial dimension and the non-spatial dimensions of each;

[0034] Calculate the sum of the spatial and non-spatial dimensions of the first encrypted object, and use it as the dimension sum of the first encrypted object;

[0035] Calculate the sum of the spatial and non-spatial dimensions of the second encrypted object, and use it as the dimension sum of the second encrypted object;

[0036] Compare the dimension sum of the first encrypted object with the dimension sum of the second encrypted object:

[0037] If [condition omitted in source], return a result vector of 0;

[0038] If [condition omitted in source], compare the spatial and non-spatial dimensions of the first encrypted object with those of the second encrypted object to obtain a result vector containing the comparison results for the spatial and non-spatial dimensions.

[0039] Preferably, obtaining the result vector containing the comparison results for the spatial and non-spatial dimensions includes:

[0040] If the spatial dimension of the first encrypted object is no greater than that of the second encrypted object, the spatial dimension comparison result is 1; otherwise, it is 0;

[0041] If a non-spatial dimension of the first encrypted object is no greater than that of the second encrypted object, the non-spatial dimension comparison result is 1; otherwise, it is 0;

[0042] The result vector is constructed from the comparison results for the spatial and non-spatial dimensions.
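The comparison described above can be sketched on plaintext values. The per-dimension rule (1 if the first object's value is no greater than the second's, else 0) follows the text. The early-exit condition is not recoverable from the source, so this sketch assumes it fires when the first object's dimension sum exceeds the second's, in which case the first object cannot be no-greater in every dimension. The function name `result_vector` and the tuple layout are hypothetical, and a real deployment would run the comparisons over ciphertexts.

```python
def result_vector(first, second):
    """Compare two objects dimension by dimension.

    first, second: sequences whose entries are the spatial dimension
    followed by the non-spatial dimensions (plaintext stand-ins).
    """
    if len(first) != len(second):
        raise ValueError("objects must have the same number of dimensions")
    # Assumed early exit: a larger dimension sum rules out
    # "no greater in every dimension", so return an all-zero vector.
    if sum(first) > sum(second):
        return [0] * len(first)
    # 1 where the first object is no greater than the second, else 0.
    return [1 if a <= b else 0 for a, b in zip(first, second)]
```

The sum check is only a cheap filter: two objects can pass it and still differ in direction on individual dimensions, which is why the per-dimension comparison follows.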

[0043] Preferably, updating the loose score of the non-leaf node to obtain the exact score corresponding to the non-leaf node includes:

[0044] Initialize the loose score of each non-leaf node in the set of non-leaf objects, which consists of all non-leaf nodes in the secure index, to [initial value omitted in source];

[0045] Traverse each non-leaf node in the set of non-leaf objects and determine whether the layer number of the non-leaf node is greater than 1;

[0046] If the layer number of the non-leaf node is not greater than 1, do not update the loose score of the non-leaf node;

[0047] If the layer number of the non-leaf node is greater than 1, calculate the result vector of the non-leaf nodes in the set of non-leaf objects:

[0048] If there exists a non-leaf node that satisfies [two conditions omitted in source], recursively call the procedure on the child nodes that the non-leaf node points to;

[0049] If there exists a non-leaf node that satisfies [condition omitted in source], then, based on the homomorphic property, add the loose score of the child nodes of the non-leaf node to the total number of data nodes in the subtree rooted at the non-leaf node, then add the result to the loose score of the non-leaf node and take the sum as the exact score of the non-leaf node;

[0050] where the symbols (omitted in the source text) denote, in order: the encrypted object represented by the non-leaf node; the encrypted query; the encrypted minimum bounding rectangle (MBR) with the maximum spatial distance; and the encrypted MBR with the minimum spatial distance.

[0051] Preferably, based on the precise score and a preset pruning threshold, a pruner and an encrypted result set are constructed until all remaining uncomputed nodes in the descending heap are dominated by nodes in the pruner, thus obtaining the target encrypted result set, including:

[0052] The child nodes of each non-leaf node in the descending heap are evaluated in descending order of their exact scores:

[0053] If the exact score of a child node of a non-leaf node is greater than the preset pruning threshold, then the corresponding child node is added to the encrypted result set and the dominant point is added to the pruner set.

[0054] This is repeated until all child nodes of the non-leaf nodes are dominated by nodes in the pruner set, at which point the current encrypted result set is output as the target encrypted result set.
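The threshold-and-pruner loop can be sketched on plaintext points as follows. Several details are assumptions rather than statements from the source: dominance is taken to mean "no greater in every dimension and strictly smaller in at least one", the point added to the pruner is taken to be the accepted candidate itself, and the names `dominates` and `prune` are hypothetical.

```python
def dominates(p, q):
    """p dominates q: no greater everywhere, strictly smaller somewhere (assumed)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def prune(candidates, threshold):
    """candidates: list of (point, exact_score) pairs.

    Visits candidates in descending order of exact score, accepts those
    above the threshold, and stops once every remaining candidate is
    dominated by some point already in the pruner.
    """
    ordered = sorted(candidates, key=lambda c: c[1], reverse=True)
    results, pruner = [], []
    for i, (point, score) in enumerate(ordered):
        remaining = [p for p, _ in ordered[i:]]
        if pruner and all(any(dominates(f, p) for f in pruner) for p in remaining):
            break  # nothing left that the pruner does not already dominate
        if score > threshold:
            results.append(point)
            pruner.append(point)  # assumption: the accepted point joins the pruner
    return results, pruner
```

With candidates `[((1, 4), 5), ((4, 1), 4), ((5, 5), 1)]` and threshold 2, the first two points are accepted and the loop stops before scoring `(5, 5)`, because `(1, 4)` already dominates it.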

[0055] Preferably, the client uses the noise vector to denoise each decrypted item in the decrypted data set and obtain the corresponding original data, represented as:

[0056] [formula omitted in source];

[0057] where the symbols (omitted in the source text) denote, in order: the original data corresponding to a given decrypted item in the decrypted data set; that decrypted item; and the noise component corresponding to that decrypted item.

[0058] Compared with the prior art, the above-described technical solution of the present invention has the following advantages:

[0059] The data encryption method described in this invention uses homomorphic encryption to encrypt the target dataset, ensuring that effective query operations can be performed on the encrypted data without decrypting it before querying, and also preventing potential privacy leaks during the query process. A secure index based on a SAR-Tree structure is constructed using the encrypted data as nodes and stored on a cloud server for querying based on the secure index. The SAR-Tree structure significantly reduces computational and storage overhead during the query process, improving query efficiency.

[0060] The encrypted query method described in this invention performs a dominance query on the secure index with a SAR-Tree structure constructed using the data encryption method of this invention. It calculates the loose score corresponding to each non-leaf node in the secure index and uses heap sort to arrange the scores in descending order, obtaining the corresponding descending heap. It then calculates exact scores to prune the descending heap, obtaining the target encrypted result set. The descending heap makes it possible to quickly locate the node most likely to contain the required data. The exact score of each node is calculated progressively, and a pruner is constructed using a preset pruning threshold to continuously remove dominated nodes, gradually narrowing the query range and retaining the Top-k dominating objects, thus completing the query. This effectively reduces unnecessary calculations and improves query efficiency. This invention adds a preset noise component to each node in the target encrypted result set to obtain a noise vector and encrypted interference points. The encrypted interference points are decrypted using the private key to obtain decrypted data, which is sent to the client to restore the original data. Throughout the query process, the data exists in encrypted form until it is sent to the client, ensuring data privacy and the security of the query results during transmission and preventing the leakage of result information. The query request is also encrypted to prevent the leakage of query content. While executing queries, this invention provides comprehensive protection for data privacy, query privacy, result privacy, and access pattern privacy, thus ensuring data query security.

Attached Figure Description

[0061] To make the content of this invention easier to understand, the invention will be further described in detail below with reference to specific embodiments and accompanying drawings, wherein:

[0062] Figure 1 is a flowchart of the data encryption method provided by the present invention;

[0063] Figure 2 is a flowchart of the encrypted query method provided by the present invention;

[0064] Figure 3 is a schematic diagram of the data flow model provided by the present invention;

[0065] Figure 4 is a flowchart of the steps of the Top-k dominance query provided by the present invention;

[0066] Figure 5 is a schematic diagram of the target dataset provided by the present invention;

[0067] Figure 6 is a schematic diagram of the SAR-Tree structure provided by the present invention;

[0068] Figure 7 is a schematic diagram of query iteration provided by the present invention.

Detailed Implementation

[0069] The present invention will be further described below with reference to the accompanying drawings and specific embodiments, so that those skilled in the art can better understand and implement the present invention. However, the embodiments described are not intended to limit the present invention.

Example 1:

[0070] Referring to Figure 1, which shows a flowchart of the steps of the data encryption method provided by this invention, the method is applied to a client and the specific steps include:

[0071] S101: Create a key pair containing a public key and a private key using homomorphic encryption;

[0072] S102: Use the public key to encrypt the target data and obtain the corresponding encrypted data;

[0073] S103: Using encrypted data as nodes, a secure index based on the SAR-Tree structure is built and sent to the cloud server for storage.

[0074] Specifically, the homomorphic encryption methods include: Paillier algorithm, RSA algorithm and ElGamal algorithm.

[0075] Specifically, the secure index based on the SAR-Tree structure includes:

[0076] Leaf nodes, represented as: [expression omitted in source];

[0077] Ordinary non-leaf nodes, represented as: [expression omitted in source];

[0078] Non-leaf nodes containing blind objects, represented as: [expression omitted in source];

[0079] where the symbols (omitted in the source text) denote, in order: the encrypted data represented by the i-th leaf node in the secure index; the total number of data nodes in the subtree rooted at the node; the minimum bounding rectangle of the node; the center of the node's bounding rectangle; the encryption operator; the pointer from the node to a node at the next level; the combination of encrypted carriers corresponding to all blind objects under the same parent node; the bucket of nodes that share the same parent node; and the function that retrieves child nodes from its two inputs, whose expression is also omitted. The remaining terms denote the subtree containing the non-leaf node with the blind object, a blind object identified by its layer and position, the total number of leaf nodes associated with the parent node, and the total number of blind objects that each leaf node can hold.
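The three node kinds listed above can be pictured as simple record types. This is only an illustrative layout: the source's own node expressions are not available, so every field name here is hypothetical, ciphertexts are stood in for by plain integers, and which fields are stored encrypted is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class LeafNode:
    enc_data: int  # ciphertext of the data item held by this leaf

@dataclass
class NonLeafNode:
    enc_count: int               # (assumed encrypted) number of data nodes in the subtree
    enc_mbr: Tuple[int, ...]     # (assumed encrypted) minimum bounding rectangle
    enc_center: Tuple[int, ...]  # (assumed encrypted) center of the rectangle
    children: List[object] = field(default_factory=list)  # pointers to the next level

@dataclass
class BlindNonLeafNode(NonLeafNode):
    # Combination of encrypted carriers for all blind objects under this parent.
    carriers: List[List[int]] = field(default_factory=list)
    # Bucket of nodes that share this parent.
    bucket: List[LeafNode] = field(default_factory=list)
```

Modeling the blind variant as a subclass keeps the count, rectangle, and center fields in one place while adding the carrier list and bucket that only blind nodes need.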

[0080] The data encryption method described in this invention uses homomorphic encryption to encrypt the target dataset, ensuring that effective query operations can be performed on the encrypted data without decrypting it before querying, and also preventing potential privacy leaks during the query process. A secure index based on a SAR-Tree structure is constructed using the encrypted data as nodes and stored on a cloud server for querying based on the secure index. The SAR-Tree structure significantly reduces computational and storage overhead during the query process, improving query efficiency.

[0081] Example 2:

[0082] Referring to Figure 2, which shows a flowchart of the steps of the encrypted query method provided by this invention, the method is applied to a cloud server and the specific steps include:

[0083] S201: Obtain the encrypted query request after the client has encrypted the original query request using the public key in the key pair;

[0084] S202: Based on the encrypted query request, the secure index and the private key, calculate the loose score corresponding to each non-leaf node in the secure index, and use heap sort to sort it in descending order to obtain the corresponding descending heap;

[0085] S203: Based on a descending heap, obtain the non-leaf node with the largest loose score, and access the child nodes of the non-leaf node. Based on the dimension of the non-leaf node and its corresponding child nodes, obtain the result vector, update the loose score of the non-leaf node, and obtain the precise score corresponding to the non-leaf node.

[0086] S204: Calculate the exact score of each node in the descending heap sequentially, and based on the exact score and a preset pruning threshold, construct a pruner and an encrypted result set until all remaining uncalculated nodes in the descending heap are dominated by nodes in the pruner, obtaining the target encrypted result set, including:

[0087] S204-1: Evaluate the child nodes of each non-leaf node in the descending heap according to their exact scores in descending order:

[0088] S204-2: If the exact score of the child node of a non-leaf node is greater than the preset pruning threshold, then add the corresponding child node to the encrypted result set and add the dominator point to the pruner set.

[0089] S204-3: Repeat until all child nodes of the non-leaf nodes are dominated by nodes in the pruner set, then output the current encrypted result set as the target encrypted result set.

[0090] S205: Add a preset noise component to each node in the target encrypted result set to obtain the noise vector and the set of encrypted interference points. Use the private key to decrypt each encrypted interference point in the set to obtain the corresponding decrypted data set.

[0091] S206: Send the decrypted data set and noise vector to the client so that the client can use the noise vector to denoise each decrypted data in the decrypted data set, obtain the corresponding original data, and restore the data to be queried in the target dataset required by the original query request.

[0092] The original data corresponding to each decrypted item in the decrypted data set is expressed by a formula [omitted in source] in terms of that decrypted item and the noise component corresponding to it.
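Steps S205 and S206 can be illustrated end to end with a toy Paillier setup: noise is added under encryption, the key holder decrypts only the noised value, and the client subtracts the noise. This is a sketch under assumptions, since the source's denoising formula is not available: the noise is taken to be additive, the primes are tiny demo values, and the names `mask` and `denoise` are hypothetical.

```python
import math
import random

P, Q = 1789, 1861  # tiny demo primes (assumption; not secure)
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # valid for generator g = N + 1

def encrypt(m):
    """Toy Paillier encryption with g = N + 1."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    """Toy Paillier decryption with the private values LAM and MU."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N

def mask(ciphertexts):
    """Homomorphically add a fresh noise value to each ciphertext."""
    noise = [random.randrange(1, 1000) for _ in ciphertexts]
    masked = [(c * encrypt(r)) % N2 for c, r in zip(ciphertexts, noise)]
    return masked, noise

def denoise(decrypted, noise):
    """Client side: subtract each noise component (assumed additive noise)."""
    return [d - r for d, r in zip(decrypted, noise)]
```

In this arrangement the party that decrypts sees only value plus noise, and the party that chose the noise never sees a plaintext, so the original values are recovered only where both pieces meet.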

[0093] Specifically, in step S202, calculating the loose score corresponding to each non-leaf node in the secure index, based on the encrypted query request, the secure index, and the private key, includes:

[0094] Initialize the loose score of each non-leaf node in the set of non-leaf objects, which consists of all non-leaf nodes in the secure index, to [initial value omitted in source];

[0095] Traverse each non-leaf node in the set of non-leaf objects and determine whether the layer number of the non-leaf node is greater than 2;

[0096] If the layer number of the non-leaf node is not greater than 2, do not update the loose score of the non-leaf node;

[0097] If the layer number of the non-leaf node is greater than 2, determine whether a dominating object of the non-leaf node exists in the set of non-leaf objects;

[0098] If it exists, recursively call the procedure on the child nodes that the non-leaf node points to. Based on the homomorphic property, add the loose score of the child nodes of the non-leaf node to the total number of data nodes in the subtree rooted at the non-leaf node, then add the result to the loose score of the non-leaf node and take the sum as the loose score of the non-leaf node;

[0099] If it does not exist, do not update the loose score of the non-leaf node.

[0100] Specifically, in step S203, obtaining the result vector includes:

[0101] Take the non-leaf node and its corresponding child node as the first encrypted object and the second encrypted object, and calculate the spatial dimension and the non-spatial dimensions of each;

[0102] Calculate the sum of the spatial and non-spatial dimensions of the first encrypted object, and use it as the dimension sum of the first encrypted object;

[0103] Calculate the sum of the spatial and non-spatial dimensions of the second encrypted object, and use it as the dimension sum of the second encrypted object;

[0104] Compare the dimension sum of the first encrypted object with the dimension sum of the second encrypted object:

[0105] If [condition omitted in source], return a result vector of 0;

[0106] If [condition omitted in source], compare the spatial and non-spatial dimensions of the first encrypted object with those of the second encrypted object to obtain a result vector containing the comparison results for the spatial and non-spatial dimensions.

[0107] Here, if the spatial dimension of the first encrypted object is no greater than that of the second encrypted object, the spatial dimension comparison result is 1; otherwise, it is 0. If a non-spatial dimension of the first encrypted object is no greater than that of the second encrypted object, the non-spatial dimension comparison result is 1; otherwise, it is 0. The result vector is constructed from these results and contains the comparison results for the spatial dimension and the multiple non-spatial dimensions.

[0108] Specifically, in step S203, updating the loose score of the non-leaf node to obtain the exact score corresponding to the non-leaf node includes:

[0109] Initialize the loose score of each non-leaf node in the set of non-leaf objects, which consists of all non-leaf nodes in the secure index, to [initial value omitted in source];

[0110] Traverse each non-leaf node in the set of non-leaf objects and determine whether the layer number of the non-leaf node is greater than 1;

[0111] If the layer number of the non-leaf node is not greater than 1, do not update the loose score of the non-leaf node;

[0112] If the layer number of the non-leaf node is greater than 1, calculate the result vector of the non-leaf nodes in the set of non-leaf objects:

[0113] If there exists a non-leaf node that satisfies [two conditions omitted in source], recursively call the procedure on the child nodes that the non-leaf node points to;

[0114] If there exists a non-leaf node that satisfies [condition omitted in source], then, based on the homomorphic property, add the loose score of the child nodes of the non-leaf node to the total number of data nodes in the subtree rooted at the non-leaf node, then add the result to the loose score of the non-leaf node and take the sum as the exact score of the non-leaf node;

[0115] where the symbols (omitted in the source text) denote, in order: the encrypted object represented by the non-leaf node; the encrypted query; the encrypted minimum bounding rectangle (MBR) with the maximum spatial distance; and the encrypted MBR with the minimum spatial distance.

[0116] The encrypted query method described in this invention performs a dominance query on the secure index with a SAR-Tree structure constructed using the data encryption method of this invention. It calculates the loose score corresponding to each non-leaf node in the secure index and uses heap sort to arrange the scores in descending order, obtaining the corresponding descending heap. It then calculates exact scores to prune the descending heap, obtaining the target encrypted result set. The descending heap makes it possible to quickly locate the node most likely to contain the required data. The exact score of each node is calculated step by step, and a pruner is constructed using a preset pruning threshold, effectively reducing unnecessary calculations and improving query efficiency. A preset noise component is added to each node in the target encrypted result set to obtain a noise vector and encrypted interference points. The encrypted interference points are decrypted using the private key to obtain the decrypted data, which is sent to the client to restore the original data. Throughout the query process, the data exists in encrypted form until it is sent to the client, ensuring data privacy and the security of the query results during transmission and preventing the leakage of result information. The query request is also encrypted to prevent the leakage of query content. While executing the query, this invention provides comprehensive protection for data privacy, query privacy, result privacy, and access pattern privacy, ensuring data query security.

[0117] Based on the above embodiments, in this embodiment the data encryption method and the encrypted query method provided by the present invention are used to carry out encrypted data requests in the data flow model shown in Figure 3, in which both the data owner and the data user are clients. Figure 4 shows the flowchart of the Top-k dominance query steps, which include:

[0118] S301: The data owner (DO) creates a key pair for the Paillier cryptosystem according to the security parameter [symbols omitted in source];

[0119] S302: The DO uses the public key and the Paillier cryptosystem to build a secure index over the target dataset D. This secure index uses a SAR-Tree structure;

[0120] The public key and the secure index are handed over to the data service provider (DSP), and the key pair is handed over to the data assistance provider (DAP);

[0121] In the SAR-Tree structure, an ordinary non-leaf node o, a leaf node, and a special non-leaf node (containing blind objects) each have their own representation [expressions omitted in source]. Each blind object is equipped with an encrypted carrier IV; to keep queries unlinkable, the corresponding blind node is located through the IV. For example, if the blind object o_i is the k-th node in NB, then IV_i[k] is set to 1 and the other dimensions are set to 0. The remaining symbols (omitted in the source text) denote the minimum bounding rectangle, the center of the bounding rectangle, the pointer to a node at the next level, the number of data nodes in the subtree rooted at the node, the combination of carriers for all blind objects corresponding to the same parent node, and the function that calculates the corresponding child nodes from its two inputs.
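The indicator-style carrier in the example above (a 1 in position k, 0 elsewhere) selects one bucket entry by a dot product. The sketch below shows this on plaintext integers with 0-based positions; the helper names `make_indicator` and `select` are hypothetical. With an additively homomorphic scheme the same sum can be evaluated over encrypted indicator entries, so the party doing the selection would not learn which slot was chosen, but that encrypted variant is not shown here.

```python
def make_indicator(k, size):
    """Return a vector of length `size` with a 1 at position k and 0 elsewhere."""
    if not 0 <= k < size:
        raise IndexError("k must be a valid position in the bucket")
    return [1 if j == k else 0 for j in range(size)]

def select(indicator, bucket):
    """Pick the bucket entry marked by the indicator via a dot product."""
    if len(indicator) != len(bucket):
        raise ValueError("indicator and bucket must have the same length")
    return sum(flag * item for flag, item in zip(indicator, bucket))
```

Because every bucket entry takes part in the sum, the access looks the same whichever position is selected, which is the property the text relies on for unlinkability.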

[0122] To prevent data service providers (DSPs) from inferring whether two queries are identical by tracking the entry counts in two leaf nodes, the DO generates artificial data points and adds them to the leaf nodes. The encrypted counts of these interference points are set to... Therefore, it will not affect the query.

[0123] Figure 5 is a schematic diagram of the target dataset, and Figure 6 is a schematic diagram of the SAR-Tree structure, which shows non-leaf nodes, leaf nodes (including noise points), and nodes containing blind objects.

[0124] S303: The authorized user (AU) encrypts the query request with the public key and submits the encrypted query request to the DSP for processing;

[0125] S304: Starting from the root node, the DSP computes preliminary (loose) scores according to the SLBC protocol and inserts the scored nodes into heap H;

[0126] The SLBC (secure lightweight batch counting) protocol calculates the loose scores of non-leaf nodes in order to guide the SAR-Tree search. Its inputs are a node of the SAR-Tree, the encrypted query request, and the set of non-leaf objects; its output is a loose score. The specific implementation steps are as follows:

[0127] 1) As the root node of the SAR-Tree, initialize Each object in The loose fraction is .

[0128] 2) Traversal For each object, if L(Z) > 2 (L represents the number of levels) and there exists a partially dominant object... ,but Compared to It is partially controlled. In this case, the SLBC protocol needs to recursively call the pointer. The child nodes.

[0129] 3) Based on the homomorphic property, the loose fraction and Add them together. If Complete domination Do not update loose scores.

[0130] As shown in Figure 6, the preliminary scores of the nodes are computed, and the scored nodes are then inserted into heap H.

[0131] S305: Select the highest-scoring node from heap H and visit its child nodes; compute the exact scores of these nodes using the SBC protocol and add the scored nodes to table T, updating the encrypted result set W, the pruner set F, and the pruning set synchronously;

[0132] The SBC (secure batch counting) protocol is primarily used to calculate the exact scores of nodes. Its implementation is similar to that of SLBC and specifically includes:

[0133] 1) As the root node of the SAR-Tree, initialize Each object in The score is .

[0134] 2) Traversal For each object, if L(Z) > 1 (L represents the number of layers) and There is an encrypted point p that satisfies and In this case, the SBC protocol needs to recursively call the pointer. The child nodes.

[0135] 3) When When, the score increases .

[0136] Special, when It is a non-leaf node and A point in the middle partially dominates the object. ,and satisfy ,use enter and calculate .

[0137] It is an encrypted object. It is an encrypted query. and These represent the encrypted minimum bounding rectangle (MBR) with the maximum spatial distance and the encrypted minimum bounding rectangle with the minimum spatial distance, respectively. Represents encrypted object Regarding the query Dominated or covered in space ; Represents encrypted object Regarding the query There is no dominant or covering space. ; Represents encrypted object Regarding the query Dominated or covered in space .

[0138] The SDDC (secure dynamic dominance check) protocol determines whether a dominance relationship holds between data points with respect to the query point, where lower values are better, without disclosing plaintext content to the cloud service provider. If the relationship holds, the SDDC protocol returns 1; otherwise, it returns 0. Its inputs are the encrypted objects, the Euclidean distance, and the encrypted query; its output is either 1 or 0. The specific implementation steps are as follows:

[0139] 1) Compute the dimension sum of a, that is, the sum of its spatial and non-spatial dimensions, and the dimension sum of b, likewise the sum of its spatial and non-spatial dimensions. Compare the two sums using the SIC protocol: if the dimension sum of a is less than that of b, the SIC protocol returns 1; otherwise, it returns 0.

[0140] 2) When the dimension sum of a is greater than that of b, the protocol ends and returns 0.

[0141] 3) When the dimension sum of a is less than that of b, use the SVC protocol to compare a and b in each effective dimension, and return a result vector c containing the comparison result for each dimension.

[0142] The SIC protocol is used to determine the comparison of two encrypted objects, returning 0 or 1. The SVC protocol is used to calculate the comparison of two encrypted vectors, returning a result vector containing the comparison results (0 or 1) of each dimension.
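The two-stage structure of the dominance check described above can be sketched in plaintext as follows. This is a minimal sketch, not the protocol itself: it assumes that smaller values are better, operates on plain integers instead of Paillier ciphertexts, and omits the interaction between the DSP and the DAP. The function names `sic`, `svc`, and `dominance_check` are illustrative, and reducing the result vector c to a single flag by requiring every entry to be 1 is an assumption.

```python
from typing import List, Sequence, Tuple


def sic(x: int, y: int) -> int:
    """Plaintext stand-in for the SIC comparison: 1 if x < y, else 0."""
    return 1 if x < y else 0


def svc(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Plaintext stand-in for SVC: per-dimension flags, 1 where a[i] <= b[i]."""
    return [1 if ai <= bi else 0 for ai, bi in zip(a, b)]


def dominance_check(a: Sequence[int], b: Sequence[int]) -> Tuple[int, List[int]]:
    """Two-stage check: compare dimension sums, then compare each dimension.

    Returns (flag, c), where c is the per-dimension result vector (empty when
    the sum test already rules dominance out).
    """
    if len(a) != len(b):
        raise ValueError("objects must have the same number of dimensions")
    if sic(sum(a), sum(b)) == 0:
        # The sum of a is not smaller than the sum of b, so a cannot
        # be at least as good everywhere and strictly better somewhere.
        return 0, []
    c = svc(a, b)
    # Assumption: a dominates b only if every per-dimension flag is 1.
    return (1 if all(c) else 0), c


print(dominance_check([1, 2, 3], [2, 2, 4]))
print(dominance_check([1, 5, 1], [2, 2, 4]))
print(dominance_check([5, 5, 5], [2, 2, 4]))
```

The sum comparison is a cheap filter: a single scalar comparison rejects many pairs before the more expensive per-dimension comparison is run.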

[0143] As shown in Figure 6, the highest-scoring node is selected, its child nodes are visited and scored, and the pruner F is updated. The pruner F decides whether to continue exploring a node based on the best results known so far. Ψ denotes the threshold given by the k-th highest score in the current result set; it is used to determine whether a newly discovered node can still enter the final result set. If a node's score is less than Ψ, the search below that node is terminated.

[0144] S306: Continue processing other nodes in sequence and skip the dominated nodes to obtain the final result set;

[0145] Figure 7 illustrates the query iteration process. The node with the second-highest score is processed next: its child nodes are visited and scored, and the associated sets are updated. The remaining nodes are visited in the same way. The search terminates because all child nodes under node o1 have scores less than Ψ, so the result set is not updated further. The encrypted result set W is a dynamically updated collection that stores the highest-scoring nodes relevant to the query as pairs <p, s>, where p denotes a node and s denotes its exact score.
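The interplay of heap H, the threshold Ψ, and the result set W in steps S304 to S306 can be sketched as a plaintext best-first search. This is a sketch under stated assumptions rather than the patented procedure: scores are plain integers, the loose score of a node is assumed to be an upper bound on every exact score beneath it, and the pruner set F and the sub-protocols are left out. The `Node` class, the `top_k` function, and the sample tree are hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    name: str
    loose: int                      # assumed upper bound on scores in this subtree
    exact: Optional[int] = None     # exact score, set only for data points
    children: List["Node"] = field(default_factory=list)


def top_k(root: Node, k: int) -> List[Tuple[str, int]]:
    """Best-first search with threshold pruning over a scored tree."""
    heap = [(-root.loose, 0, root)]  # max-heap via negated loose scores
    counter = 1                      # tie-breaker so nodes are never compared
    results: List[Tuple[int, str]] = []  # min-heap of (exact score, name)
    while heap:
        neg_loose, _, node = heapq.heappop(heap)
        # psi is the k-th highest exact score found so far.
        psi = results[0][0] if len(results) == k else float("-inf")
        if -neg_loose < psi:
            break  # every remaining node has an even smaller loose score
        if node.exact is not None:
            heapq.heappush(results, (node.exact, node.name))
            if len(results) > k:
                heapq.heappop(results)  # keep only the k best
            continue
        for child in node.children:
            heapq.heappush(heap, (-child.loose, counter, child))
            counter += 1
    return sorted(((n, s) for s, n in results), key=lambda t: -t[1])


# Hypothetical tree: o2 holds the two best points, so o1 is never expanded.
o1 = Node("o1", 4, children=[Node("p3", 4, 4), Node("p4", 2, 2)])
o2 = Node("o2", 10, children=[Node("p1", 10, 10), Node("p2", 7, 7)])
root = Node("root", 10, children=[o1, o2])
print(top_k(root, 2))
```

In this sample the search stops when o1 reaches the top of the heap, because its loose score of 4 is below the threshold of 7 set by the two points already found.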

[0146] The SME (secure minimum extraction) protocol extracts the target record with the minimum score from an encrypted dataset without revealing the plaintext content or the access pattern of the target record to the cloud servers. Its input is the dataset; its output is a new dataset whose last object is the target record with the minimum score. The specific implementation steps are as follows:

[0147] 1): DSP for each data calculate , Put in , Put in DSP encryption Send to DAP.

[0148] A binary sequence, Random permutation function Binary sequence Having fractions, It is a positive integer generated by a pseudo-random sequence.

[0149] 2): DAP Utilization Decryption ,get , Assigned to ,according to sorting in descending order Obtain a new binary sequence Generate the extraction set. The first element is The Middle elements ( The second element is the target extraction element. The remaining elements in the data are used Randomly arrange to generate new ,Will Send it to the DSP. It is a positive integer generated by a pseudo-random sequence. It is a random permutation function.

[0150] 3): DSP based on and Performing the XOR operation yields .

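[0151] S307: The DSP adds a noise component to each point in W, forming the noise vector r and the encrypted interference points; r is sent to the AU, and the encrypted interference points are sent to the DAP for decryption;

[0152] S308: The DAP receives the encrypted interference points, decrypts them with the private key, and sends the decrypted data to the AU;

[0153] S309: The AU removes the noise from the decrypted data using the noise vector r and obtains the original points.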
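Steps S307 to S309 rely on the additive homomorphism of the Paillier cryptosystem, which the following toy example illustrates. It is a minimal sketch for illustration only: the primes are far too small for real use, the generator is fixed to n + 1, and the noise is assumed to be additive and removed by subtraction. All function names and sample values are hypothetical.

```python
import math
import random


def keygen(p: int = 1009, q: int = 1013):
    """Toy Paillier key generation (tiny primes, for illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)          # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """E(m) = (1 + n)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, sk, c: int) -> int:
    """D(c) = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    lam, mu = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_noise(n: int, c: int, noise: int) -> int:
    """Homomorphic addition: E(m) * E(noise) mod n^2 decrypts to m + noise."""
    return (c * encrypt(n, noise)) % (n * n)


# DO: create keys and encrypt the result points (S301, S302).
n, sk = keygen()
points = [12, 345, 6789]
W = [encrypt(n, p) for p in points]

# DSP: mask every ciphertext with a fresh noise component (S307).
r = [random.randrange(1, 1000) for _ in W]
masked = [add_noise(n, c, ri) for c, ri in zip(W, r)]

# DAP: decrypt the masked ciphertexts; it sees only point + noise (S308).
noisy = [decrypt(n, sk, c) for c in masked]

# AU: subtract the noise vector to recover the original points (S309).
recovered = [(v - ri) % n for v, ri in zip(noisy, r)]
print(recovered)
```

The split matters for privacy: the DSP holds ciphertexts and noise but no private key, the DAP holds the private key but sees only masked values, and only the AU can combine decrypted data with the noise vector.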

[0154] This invention utilizes sub-protocols including SLBC, SBC, SDDC, SIC, and SME to optimize performance and protect data privacy, query privacy, result privacy, and access pattern privacy; it also uses early pruning techniques and result update algorithms to reduce unnecessary computation and communication overhead and improve query response speed.

[0155] The data encryption method of this invention uses homomorphic encryption to encrypt the target dataset, so that the encrypted data can be queried effectively without decryption and potential privacy leaks during the query process are prevented. A secure index based on a SAR-Tree structure is constructed using the encrypted data as nodes and is stored on a cloud server, where queries are answered from this index. The SAR-Tree structure reduces computation and storage overhead during the query process and improves query efficiency. The encrypted query method of this invention performs dominance queries over the secure index of SAR-Tree structure constructed by the data encryption method of this invention. It calculates the loose score of each non-leaf node in the secure index and sorts the nodes in descending order using heap sort, obtaining a descending heap; it then calculates exact scores to prune the descending heap, obtaining the target encrypted result set. Constructing the descending heap allows the nodes most likely to contain the required data to be located quickly. The exact score of each node is calculated progressively, and a pruner is constructed using a preset pruning threshold, which reduces unnecessary computation and improves query efficiency. A preset noise component is added to each node in the target encrypted result set to obtain a noise vector and encrypted interference points. The encrypted interference points are then decrypted using the private key to obtain the decrypted data, which is sent to the client so that the original data can be restored. Throughout the query process, the data remains in encrypted form until it is sent to the client, which ensures data privacy, secures the query results during transmission, and prevents result information from leaking. Query requests are also encrypted to prevent leakage of the query content. While executing queries, this invention thus protects data privacy, query privacy, result privacy, and access pattern privacy, ensuring data query security.

[0156] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0157] This application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0158] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0159] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0160] Obviously, the above embodiments are merely illustrative examples for clear explanation and are not intended to limit the implementation. Those skilled in the art will recognize that other variations or modifications can be made based on the above description. It is neither necessary nor possible to exhaustively list all possible implementations here. However, obvious variations or modifications derived therefrom are still within the scope of protection of this invention.

Claims

1. A data encryption method, characterized in that, Applied to the client side, including: Use homomorphic encryption to create a key pair containing a public key and a private key; Use the public key to encrypt the target data and obtain the corresponding encrypted data; Using encrypted data as nodes, a secure index based on the SAR-Tree structure is constructed and sent to a cloud server for storage; Among them, the security index based on the SAR-Tree structure includes: Leaf nodes are represented as: ; A normal non-leaf node is represented as: ; A non-leaf node containing a blind object is represented as: ; in, This represents the encrypted data represented by the i-th leaf node in the security index. This indicates the total number of data nodes in the subtree contained in the root node of this node. This represents the minimum boundary matrix of the node. This indicates the center of the spatial matrix of the node. For encryption symbols, This indicates that the node points to a node at the next level; This represents the combination of encryption carriers corresponding to all blind objects under the same parent node. This represents a bucket of nodes that share the same parent node. Indicates based on and The expression for retrieving child nodes is: , This represents the j-th blind object in the i-th leaf node of the security index. , , This indicates the total number of leaf nodes associated with this parent node. This represents the total number of blind objects that can be accommodated in each leaf node.

2. The data encryption method according to claim 1, characterized in that, The homomorphic encryption methods include: Paillier algorithm, RSA algorithm and ElGamal algorithm.

3. An encrypted query method based on the data encryption method according to any one of claims 1 to 2, characterized in that it is applied to a cloud server, where the public key and the secure index are held by the data service provider of the cloud server and the key pair is held by the data assistance provider of the cloud server, the method including: obtaining the encrypted query request that the client produces by encrypting the original query request with the public key of the key pair; based on the encrypted query request, the secure index, and the private key, calculating the loose score corresponding to each non-leaf node in the secure index, and sorting the loose scores in descending order using heap sort to obtain the corresponding descending heap; based on the descending heap, obtaining the non-leaf node with the largest loose score and accessing its child nodes; based on the dimensions of the non-leaf node and its corresponding child nodes, obtaining the result vector, updating the loose score of the non-leaf node, and obtaining the exact score corresponding to the non-leaf node; calculating the exact score of each node in the descending heap in turn, and constructing a pruner and an encrypted result set based on the exact scores and a preset pruning threshold, until all remaining unscored nodes in the descending heap are dominated by nodes in the pruner, thereby obtaining the target encrypted result set; adding a preset noise component to each node in the target encrypted result set to obtain the noise vector and the set of encrypted interference points; decrypting each encrypted interference point in the set using the private key to obtain the corresponding decrypted data set; and sending the decrypted data set and the noise vector to the client, so that the client uses the noise vector to denoise each item of decrypted data in the decrypted data set, obtains the corresponding original data, and restores the data in the target dataset requested by the original query request.

4. The encrypted query method according to claim 3, characterized in that, Based on the encrypted query request, the secure index, and the private key, calculate the loose score for each non-leaf node in the secure index, including: Initialize the set of non-leaf objects consisting of all non-leaf nodes in the secure index. Each non-leaf node The loose fraction is ; Traversing the set of non-leaf objects Each non-leaf node Determine the non-leaf node Is the number of layers greater than 2? If the non-leaf node If the number of layers is not greater than 2, then the non-leaf node is not updated. Loose fractions; If the non-leaf node If the number of layers is greater than 2, then the non-leaf node is determined. Does the dominant object belong to the set of non-leaf objects? If it exists, then the non-leaf node... The recursive call points to the non-leaf node. Based on the homomorphic property, the loose fraction of the child nodes of the non-leaf node and the total number of data nodes in the subtree contained in the root node of the non-leaf node are calculated. After adding, then combine with the non-leaf node. The loose fractions are added together and used as the non-leaf node. Loose fractions; If it does not exist, then the non-leaf node is not updated. Loose fractions.

5. The encrypted query method according to claim 4, characterized in that, Based on the dimensions of the non-leaf node and its corresponding child nodes, obtain the result vector, including: The non-leaf node and its corresponding child node are used as the first encryption object. With the second encrypted object Calculate the spatial dimension and non-spatial dimension of the first encrypted object and the second encrypted object respectively; Calculate the sum of the spatial and non-spatial dimensions of the first encrypted object, and use this as the dimension sum of the first encrypted object. ; Calculate the sum of the spatial and non-spatial dimensions of the second encrypted object, and use this as the dimension sum of the second encrypted object. ; Compare the dimensions of the first encrypted object. Dimensions of the second encrypted object and : like If so, the returned result vector will be 0; like Then compare the first encrypted object. With the second encrypted object The spatial and non-spatial dimensions are used to obtain a result vector containing the comparison results of the spatial and non-spatial dimensions.

6. The encrypted query method according to claim 5, characterized in that obtaining the result vector containing the comparison results of the spatial and non-spatial dimensions includes: if the spatial dimension of the first encrypted object is no greater than the spatial dimension of the second encrypted object, the spatial dimension comparison result is 1; otherwise, the spatial dimension comparison result is 0; if the non-spatial dimension of the first encrypted object is no greater than the non-spatial dimension of the second encrypted object, the non-spatial dimension comparison result is 1; otherwise, the non-spatial dimension comparison result is 0; and the result vector is constructed from the comparison results of the spatial and non-spatial dimensions.

7. The encrypted query method according to claim 6, characterized in that, Update the loose score of the non-leaf node to obtain the precise score corresponding to the non-leaf node, including: Initialize the set of non-leaf objects consisting of all non-leaf nodes in the secure index. Each non-leaf node The loose fraction is ; Traversing the set of non-leaf objects Each non-leaf node Determine the non-leaf node Is the number of layers greater than 1? If the non-leaf node If the layer number is not greater than 1, then the non-leaf node is not updated. Loose fractions; If the non-leaf node If the number of layers is greater than 1, then the set of non-leaf objects is calculated. The result vector of non-leaf nodes : If there exists a non-leaf node that satisfies and If the recursive call of the non-leaf node points to the child node of the non-leaf node; If there exists a non-leaf node that satisfies Based on the homomorphic property, the loose fraction of the child nodes of the non-leaf node and the total number of data nodes in the subtree contained in the root node of the non-leaf node are calculated. After adding, then combine with the non-leaf node. The loose fractions are added together and used as the non-leaf node. The exact fraction; in, It is the encrypted object represented by the non-leaf node. Indicates encrypted query, This represents the encrypted minimum bounding rectangle (MBR) with the maximum spatial distance. This represents the encrypted minimum bounding rectangle with the minimum spatial distance.

8. The encrypted query method according to claim 3, characterized in that, Based on the precise score and a preset pruning threshold, a pruner and an encrypted result set are constructed until all remaining uncomputed nodes in the descending heap are dominated by nodes in the pruner, thus obtaining the target encrypted result set, including: The child nodes of each non-leaf node in the descending heap are evaluated in descending order of their exact scores: If the exact score of a child node of a non-leaf node is greater than the preset pruning threshold, then the corresponding child node is added to the encrypted result set and the dominant point is added to the pruner set. The current encrypted result set is output as the target encrypted result set until all child nodes of non-leaf nodes are dominated by nodes in the pruner set.

9. The encrypted query method according to claim 3, characterized in that, The client uses a noise vector to denoise each decrypted data item in the decrypted dataset, obtaining the corresponding original data, represented as: ; in, Indicates the first [item] in the decrypted data set The original data corresponding to each decrypted data. Indicates the first [item] in the decrypted data set Decrypted data, Indicates the first [item] in the decrypted data set The noise component corresponding to each decrypted data.