An efficient enterprise data matching method based on privacy protection
By optimizing the enterprise data matching process through bilinear pairing and Paillier semi-homomorphic encryption technology, the problem of data privacy protection in third-party service environments is solved, and accurate matching of encrypted data and information security are achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications (China)
- Current Assignee / Owner
- YIQICHA TECH CO LTD
- Filing Date
- 2026-01-28
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies cannot reliably match sensitive information in third-party service environments while keeping the data private and secure, and they lack effective solutions for correlation matching and conditional queries on encrypted data.
Keywords are compared using identity-based bilinear pairing with identity verification, and core data is encrypted with Paillier semi-homomorphic encryption. This reduces reliance on third-party devices and uses cloud computing to streamline key management and data matching.
The method achieves accurate data matching and privacy protection on encrypted data, reduces communication overhead, and protects the information of both users and data providers during matching.
Figure CN122241745A_ABST
Description
Technical Field
[0001] This invention relates to an efficient, privacy-preserving method for matching enterprise data and belongs to the field of data privacy protection.
Background Technology
[0002] Privacy-preserving data matching, a crucial step in collaborative data processing, hinges on ensuring the security of sensitive information among participants while completing data comparison operations. This technology primarily involves various comparison methods, such as numerical range verification and attribute value comparison. Given the current widespread awareness of data security, achieving reliable matching of sensitive information within third-party service environments has become a pressing technical challenge.
[0003] Existing solutions generally adopt a direct matching model for raw data. This plaintext data, without any security processing (such as encryption encoding, hash digests, etc.), is completely exposed to the processing agency during the matching process. In real-world applications, due to the need for business collaboration or joint analysis, data holders often need to provide data resources to multiple partners, which has created a demand for data matching with privacy protection features.
[0004] To ensure secure data flow, privacy-preserving data matching technologies have emerged. These technologies first apply secure preprocessing to the data features and specific values involved in the matching, and then perform query and matching operations on the encrypted or encoded data. While this approach effectively protects data privacy, the data features and numerical information are deeply hidden after security processing, and the inherent correlations among the original data are no longer visible in the encrypted state. How to design feasible solutions that support correlation matching and conditional queries on encrypted data has therefore become a core research topic in this field.
Summary of the Invention
[0005] The purpose of this invention is to provide a privacy-preserving, efficient matching method for enterprise data to solve the problem of enterprise data privacy protection.
[0006] To achieve the above objective, the present invention adopts the following technical solution. The privacy-preserving, efficient method for matching enterprise data mainly comprises two core components: 1. Identity-based bilinear pairing is used for keyword comparison with identity verification, which improves the efficiency of key management. The authentication mechanism verifies the user's identity before the data comparison stage, which safeguards the security of the system.
[0007] 2. Paillier semi-homomorphic encryption is used to encrypt the core data. While keeping the data secure, it reduces the reliance on third-party devices in intermediate computation steps compared with the PPDMC algorithm, which lowers communication overhead.
[0008] The entity model framework is shown in Figure 1, and the data matching process is shown in Figure 2.
[0009] Furthermore, the privacy-preserving, efficient matching method for enterprise data comprises the following phases: 1. System Initialization: At system startup, the main task is to distribute keys and configure the data encryption and signature parameters (a trusted authority distributes keys to the data holder and the querying party; both parties encrypt their own data, attach digital signatures, and submit the data to the cloud server for verification and processing). Specifically, the authority first selects the parameters required for the bilinear pairing: two multiplicative cyclic groups of prime order p, a generator g, and a bilinear map e. It defines a security parameter that gives the number of padding bits in the symmetric encryption function. It then defines two symmetric encryption functions. The first is a symmetric encryption function with padding, whose parameters are distributed to the user and the data owner and whose padding length is determined by the security parameter. The second uses the AES symmetric encryption algorithm; its parameters are distributed to the cloud server and the user and are used to return results for data range queries. Three hash functions with different purposes are defined, including one that maps data into the multiplicative cyclic group and one with a fixed output length of k bits that is used for XOR operations during encryption and decryption. The authority selects a random number and computes the corresponding value from it. Second, the authority generates a Paillier public/private key pair for the cloud server, defines the public key, and distributes the respective identity identifiers to the data owner and the user. Finally, the authority assigns to both parties the random numbers required by Horner's rule, which must be chosen to satisfy the prescribed constraints.
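For reference, a bilinear map of the kind selected during initialization is normally required to satisfy the standard pairing properties sketched below. The group names $G_1$ and $G_2$ and the symmetric form of the map are assumptions made for illustration; the text above specifies only two multiplicative cyclic groups of prime order p, a generator g, and a bilinear map e.

```latex
\begin{aligned}
& e : G_1 \times G_1 \to G_2, \qquad |G_1| = |G_2| = p \\
& \text{Bilinearity: } e(u^{a}, v^{b}) = e(u, v)^{ab}
  \quad \text{for all } u, v \in G_1,\ a, b \in \mathbb{Z}_p \\
& \text{Non-degeneracy: } e(g, g) \neq 1_{G_2} \\
& \text{Computability: } e(u, v) \text{ can be computed efficiently for all } u, v \in G_1
\end{aligned}
```

Bilinearity is the property that lets two parties who each hold a different exponent arrive at the same pairing value, which is what an identity-based keyword comparison typically relies on.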
[0010] The data owner and the user convert the data under their database keywords into binary and pack it using Horner's rule. Assume the data owner holds multiple keywords:
[0011] where the dataset content under each keyword has a corresponding binary form. A random even number c is added to this binary value, and the result is then added to the binary form of the random number generated by the authority. The packed data m is:
[0012] Similarly, suppose the user holds multiple keywords:
[0013] where the dataset content under each keyword has a corresponding binary form. A random even number is added to this binary value, and the binary form of the random number generated by the authority is then subtracted from the result. The packed data q is:
[0014] in, ,and .
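The packing step can be illustrated with a minimal Python sketch of Horner's rule, which folds several small values into one integer by repeated multiply-and-add. The function names `horner_pack` and `horner_unpack` and the choice of base are hypothetical; the sketch does not include the random even number or the authority-generated random number described above, and it is not the patent's exact packing formula.

```python
def horner_pack(values, base):
    """Pack values into one integer via Horner's rule:
    ((v0 * base + v1) * base + v2) ... Each value must lie in [0, base)."""
    acc = 0
    for v in values:
        if not 0 <= v < base:
            raise ValueError("each value must satisfy 0 <= v < base")
        acc = acc * base + v
    return acc


def horner_unpack(packed, base, count):
    """Recover `count` values from a packed integer (inverse of horner_pack)."""
    out = []
    for _ in range(count):
        packed, v = divmod(packed, base)  # peel off the least significant digit
        out.append(v)
    return out[::-1]  # digits were peeled in reverse order
```

Packing several values into one integer means a single encryption can carry all of them, which is one plausible reason for using Horner's rule before Paillier encryption.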
[0015] 2. Matching Phase: The cloud server verifies the data and performs the matching, returning the matching results to the querying party in encrypted form. (1) Keyword matching stage: The user encrypts the keyword Q to be queried with the symmetric encryption function and computes a private key from the identity identifier issued by the authority. The user then selects a random number, computes the query parameters, and sends the computed values to the cloud server.
[0016] The cloud server then verifies the user's identity. If the identity is on the preset authorization list, the server accepts the data submitted by the user and performs the following calculation:
[0017] It then computes F, combines F with the user's identity into a single message, and sends the message to the data owner.
[0018] After receiving the data from the cloud server, the data owner first computes a value from the identity of the querying user and then evaluates the following formula:
[0019] Proof:
[0020]
[0021] When the recovered padding value matches the pre-agreed parameter, the query is confirmed as belonging to the requesting party. The padding is then stripped from the result, and the remainder is decrypted with the symmetric key to obtain the keyword the user wants to retrieve. Next, the records corresponding to that keyword are looked up in the database, and finally the cloud service returns the matching query results to the user.
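The padding check in the keyword matching stage can be illustrated with a simplified Python sketch: a keyword is extended with a pre-agreed padding, masked by XOR with a hash-derived value, and accepted by the receiver only if the padding reappears after unmasking. Everything here is an assumption made for illustration, including the names `wrap` and `unwrap`, the 4-byte zero padding, and the use of SHA-256; the patent's actual construction combines a padded symmetric cipher with pairing-derived values that this sketch does not model.

```python
import hashlib

# Hypothetical pre-agreed padding; stands in for the padding bits
# whose length is set by the security parameter.
PAD = b"\x00" * 4


def _mask(shared_secret: bytes, length: int) -> bytes:
    """Derive `length` mask bytes from a shared secret with SHA-256 and a counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(shared_secret + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap(keyword: bytes, shared_secret: bytes) -> bytes:
    """Append the padding to the keyword and XOR the result with the mask."""
    padded = keyword + PAD
    return bytes(a ^ b for a, b in zip(padded, _mask(shared_secret, len(padded))))


def unwrap(blob: bytes, shared_secret: bytes):
    """Remove the mask; return the keyword if the padding matches, else None."""
    padded = bytes(a ^ b for a, b in zip(blob, _mask(shared_secret, len(blob))))
    if not padded.endswith(PAD):
        return None  # padding mismatch: reject the query
    return padded[: -len(PAD)]
```

A receiver holding the correct shared value recovers the keyword, while a message whose padding bytes have been altered is rejected because the recovered padding no longer matches.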
[0022] (2) Data value range matching stage: The data holder and the end user use the Paillier public key issued by the authority to encrypt the original data and the query parameters, respectively. Before sending this information to the cloud server, each party must sign its data. A standard signature scheme can be used, such as the elliptic-curve-based ECDSA algorithm or the SM2 signature scheme of the Chinese national cryptographic standard.
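The Paillier encryption used in this stage can be sketched as a toy Python implementation. It shows the textbook scheme with the generator fixed to n + 1, together with the property that multiplying two ciphertexts yields an encryption of the sum of their plaintexts. The function names and the tiny primes in the usage note are illustrative assumptions; a real deployment would use large random primes and a vetted library, and signing is omitted here. The modular inverse via `pow(x, -1, n)` requires Python 3.8 or later.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a Paillier key pair from two distinct primes (toy sizes only).
    Returns (n, (lam, mu)): public modulus and private key."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt 0 <= m < n as (1 + n)^m * r^n mod n^2 with random r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    # (1 + n)^m mod n^2 equals 1 + m*n by the binomial theorem
    return (1 + m * n) * pow(r, n, n2) % n2


def add(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: the product of ciphertexts encrypts m1 + m2 mod n."""
    return c1 * c2 % (n * n)


def decrypt(n: int, priv, c: int) -> int:
    """Decrypt with m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return (x - 1) // n * mu % n
```

For example, with `n, priv = keygen(61, 53)`, calling `decrypt(n, priv, add(n, encrypt(n, 20), encrypt(n, 22)))` recovers the sum of the two plaintexts without either ciphertext being decrypted individually.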
[0023] The cloud server first verifies the signatures on the data received from the data owner and the user. It then adds the two ciphertexts in the encrypted state and decrypts the sum with the private key of the Paillier encryption scheme. The decrypted value D is converted into the result B through the following steps, and the result is finally sent back to the end user:
Input: the decrypted result D
Output: result B
Step 1: D is the value obtained by adding the packed data of the data owner and the packed data of the user.
[0024] Step 2: Step 3: Step 4: Step 5: Step 6: break; Step 7: End if; Step 8: ; Step 9: End For; Step 10: Step 11: Step 12: End if;
3. Result Analysis Phase: The user reverses the processing applied to the returned encrypted result through a preset decryption procedure to recover the outcome of the query. Offloading the computation to the cloud service significantly reduces the computational burden on local devices. After receiving the result returned by the cloud server, the user decrypts it with the following steps to obtain the data range matching result:
Input: the encrypted result returned by the cloud server
Output: the size relationship between the data values
Step 1: Run the symmetric decryption algorithm on the returned result to obtain B;
Step 2: Extract the user data;
Step 3: If B=0, the data value held is equal to the data value being queried.
[0025] Step 4: If B=1 and the accompanying condition for the greater-than case holds, the data value held is greater than the data value being queried.
[0026] Step 5: If B=1 and the accompanying condition for the less-than case holds, the data value held is less than the data value being queried.
[0027] Compared with the prior art, the present invention achieves the following beneficial effects: 1. Keyword matching is accurate, and sensitive information is kept strictly confidential from the third-party cloud service platform.
[0028] 2. Data value comparison can be completed with a single third-party cloud server, and key information remains hidden from the cloud during matching between users and data providers.
[0029] 3. Even if a user colludes with an external attacker, they cannot obtain the actual values submitted by the data provider. The matching results are returned to the user only through a symmetrically encrypted channel, which achieves two-way privacy protection throughout the data comparison process.
Attached Figure Description
[0030] Figure 1 shows the entity model framework; Figure 2 shows the data matching process.
Detailed Implementation
[0031] The privacy-preserving, efficient method for matching enterprise data mainly consists of two parts: 1. Identity-based bilinear pairing is used for keyword comparison with identity verification, which improves the efficiency of key management. The authentication mechanism verifies the user's identity before the data comparison stage, which safeguards the security of the system.
[0032] 2. Paillier semi-homomorphic encryption is used to encrypt the core data. While keeping the data secure, it reduces the reliance on third-party devices in intermediate computation steps compared with the PPDMC algorithm, which lowers communication overhead.
[0033] Furthermore, the privacy-preserving, efficient matching method for enterprise data comprises the following phases: 1. System Initialization: At system startup, the main task is to distribute keys and configure the data encryption and signature parameters (a trusted authority distributes keys to the data holder and the querying party; both parties encrypt their own data, attach digital signatures, and submit the data to the cloud server for verification and processing). Specifically, the authority first selects the parameters required for the bilinear pairing: two multiplicative cyclic groups of prime order p, a generator g, and a bilinear map e. It defines a security parameter that gives the number of padding bits in the symmetric encryption function. It then defines two symmetric encryption functions. The first is a symmetric encryption function with padding, whose parameters are distributed to the user and the data owner and whose padding length is determined by the security parameter. The second uses the AES symmetric encryption algorithm; its parameters are distributed to the cloud server and the user and are used to return results for data range queries. Three hash functions with different purposes are defined, including one that maps data into the multiplicative cyclic group and one with a fixed output length of k bits that is used for XOR operations during encryption and decryption. The authority selects a random number and computes the corresponding value from it. Second, the authority generates a Paillier public/private key pair for the cloud server, defines the public key, and distributes the respective identity identifiers to the data owner and the user. Finally, the authority assigns to both parties the random numbers required by Horner's rule, which must be chosen to satisfy the prescribed constraints.
[0034] The data owner and the user convert the data under their database keywords into binary and pack it using Horner's rule. Assume the data owner holds multiple keywords:
[0035] where the dataset content under each keyword has a corresponding binary form. A random even number c is added to this binary value, and the result is then added to the binary form of the random number generated by the authority. The packed data m is:
[0036] Similarly, suppose the user holds multiple keywords:
[0037] where the dataset content under each keyword has a corresponding binary form. A random even number is added to this binary value, and the binary form of the random number generated by the authority is then subtracted from the result. The packed data q is:
[0038] in, ,and .
[0039] 2. Matching Phase: The cloud server verifies the data and performs the matching, returning the matching results to the querying party in encrypted form. (1) Keyword matching stage: The user encrypts the keyword Q to be queried with the symmetric encryption function and computes a private key from the identity identifier issued by the authority. The user then selects a random number, computes the query parameters, and sends the computed values to the cloud server.
[0040] The cloud server then verifies the user's identity. If the identity is on the preset authorization list, the server accepts the data submitted by the user and performs the following calculation:
[0041] It then computes F, combines F with the user's identity into a single message, and sends the message to the data owner.
[0042] After receiving the data from the cloud server, the data owner first computes a value from the identity of the querying user and then evaluates the following formula:
[0043] Proof:
[0044]
[0045] When the recovered padding value matches the pre-agreed parameter, the query is confirmed as belonging to the requesting party. The padding is then stripped from the result, and the remainder is decrypted with the symmetric key to obtain the keyword the user wants to retrieve. Next, the records corresponding to that keyword are looked up in the database, and finally the cloud service returns the matching query results to the user.
[0046] (2) Data value range matching stage: The data holder and the end user use the Paillier public key issued by the authority to encrypt the original data and the query parameters, respectively. Before sending this information to the cloud server, each party must sign its data. A standard signature scheme can be used, such as the elliptic-curve-based ECDSA algorithm or the SM2 signature scheme of the Chinese national cryptographic standard.
[0047] The cloud server first verifies the signatures on the data received from the data owner and the user. It then adds the two ciphertexts in the encrypted state and decrypts the sum with the private key of the Paillier encryption scheme. The decrypted value D is converted into the result B through the following steps, and the result is finally sent back to the end user:
Input: the decrypted result D
Output: result B
Step 1: D is the value obtained by adding the packed data of the data owner and the packed data of the user.
[0048] Step 2: Step 3: Step 4: Step 5: Step 6: break; Step 7: End if; Step 8: ; Step 9: End For; Step 10: Step 11: Step 12: End if;
3. Result Analysis Phase: The user reverses the processing applied to the returned encrypted result through a preset decryption procedure to recover the outcome of the query. Offloading the computation to the cloud service significantly reduces the computational burden on local devices. After receiving the result returned by the cloud server, the user decrypts it with the following steps to obtain the data range matching result:
Input: the encrypted result returned by the cloud server
Output: the size relationship between the data values
Step 1: Run the symmetric decryption algorithm on the returned result to obtain B;
Step 2: Extract the user data;
Step 3: If B=0, the data value held is equal to the data value being queried.
[0049] Step 4: If B=1 and the accompanying condition for the greater-than case holds, the data value held is greater than the data value being queried.
[0050] Step 5: If B=1 and the accompanying condition for the less-than case holds, the data value held is less than the data value being queried.
Claims
1. A privacy-preserving, efficient method for matching enterprise data, characterized in that it comprises: using identity-based bilinear pairing for keyword comparison with identity verification, which improves the efficiency of key management, wherein the authentication mechanism verifies the user's identity before the data comparison stage, thereby safeguarding the security of the system; and using Paillier semi-homomorphic encryption to encrypt core data, which ensures data security while reducing reliance on third-party devices in intermediate computation steps compared with the PPDMC algorithm, thereby reducing communication resource consumption.
2. The method for efficient matching of enterprise data based on privacy protection according to claim 1, characterized in that: in the identity-based bilinear pairing mechanism, the user first encrypts the search keyword and generates a unique private key from the user's identity identifier. The user then transmits the computed set of encrypted parameters to a cloud server. After verifying the user's identity, the server generates a shared key, constructs an encrypted query, and forwards it to the data holder. The data holder decrypts the query, confirms the user's legitimacy, performs the data retrieval in the storage system, and finally returns the query results to the requesting party. This process protects the privacy of the search and the security of the stored data.
3. The method for efficient matching of enterprise data based on privacy protection according to claim 1, characterized in that: the data size comparison technique is based on the Paillier semi-homomorphic encryption scheme, which supports additive homomorphism and multiplication of a ciphertext by a plaintext constant while the data remains in ciphertext form. This computational property makes it a key technique in the field of secure multi-party computation: each participant can perform specific operations on the encrypted data without decryption, which maintains the confidentiality of the original data.