Set security intersection method and apparatus, electronic device, and storage medium

By adding a hash bucketing step in front of the set-secure intersection (PSI) protocol, the set elements are preprocessed and filtered before the protocol runs. This addresses the low efficiency and the security weaknesses of existing techniques and enables efficient, secure set intersection calculation.

CN115906177B · Active · Publication Date: 2026-06-23 · CHINA TELECOM CORP LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: CHINA TELECOM CORP LTD
Filing Date: 2022-12-15
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

Existing set-safe intersection methods are inefficient, especially when there are no intersecting elements, resulting in significant waste of computational resources. Furthermore, methods based on naive hashing have security issues.

Method used

By adding a hash bucketing step before the PSI protocol process, the set elements are filtered to obtain a reduced set, thereby reducing the number of elements input to the PSI protocol. The multi-round hash bucketing method reduces the number of set elements and improves computational efficiency.

Benefits of technology

It reduces the time overhead of the PSI protocol, improves the computational efficiency of set-safe intersection, enhances overall performance, and ensures both execution efficiency and security.

✦ Generated by Eureka AI based on patent content.


Abstract

The present disclosure provides a set secure intersection method, device, electronic equipment and storage medium, and relates to the technical field of privacy protection. The set secure intersection method comprises the following steps: a hash bucketing step is added before a PSI protocol process, elements in a first set and a second set are filtered to obtain a reduced first set and a reduced second set, and the reduced first set and the reduced second set are taken as inputs to execute the PSI protocol process to obtain an intersection result of the first set and the second set. The present disclosure reduces the number of intersection set elements to be solved in the PSI protocol process, thereby reducing the time cost of the PSI protocol execution, improving the calculation efficiency of the set secure intersection, and achieving the effect of improving the overall performance.

Description

Technical Field

[0001] This disclosure relates to the field of privacy protection technology, and in particular to a secure intersection method, apparatus, electronic device and storage medium.

Background Technology

[0002] Private Set Intersection (PSI), also known as privacy set intersection, solves the problem of secure computation of intersection between two or more parties. Each participating party holds its own data set, and by running the private set intersection protocol, one or all parties can obtain the intersection of all sets. At the same time, the elements of each party's data set other than the intersection cannot be known to the other party.

[0003] Secure intersection technology can be used independently for applications such as secure verification of advertising effectiveness, secure matching of social contacts, secure comparison of blacklists, and secure verification of multiple loans. It can also be combined with federated learning to complete the virtual fusion of multi-party data.

[0004] Existing set-secure intersection methods include those based on naive hashing, those based on RSA (Ron Rivest, Adi Shamir, Leonard Adleman) blind signatures, those based on the Diffie-Hellman key exchange protocol, and those based on OT (Oblivious Transfer). Among these, naive hashing-based set-secure intersection suffers from security issues and is being phased out. Other set-secure intersection methods, regardless of the intersection size, require blind signing, modular exponentiation, or execution of a set of OT protocols on all elements of the sets. These computations involve numerous and time-consuming cryptographic operations, significantly reducing efficiency. In extreme cases, when the two sets have no intersection elements, all time-consuming computations must be completed to conclude that the intersection is empty.
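The security weakness of naive hashing mentioned above can be illustrated with a minimal sketch. It assumes the parties exchange unsalted SHA-256 digests and that the elements come from a small, guessable domain (here, hypothetical four-digit phone extensions); the function names are illustrative and are not taken from the disclosure.

```python
import hashlib


def digest(element: str) -> str:
    """Unsalted SHA-256 digest, as a naive-hashing scheme might exchange."""
    return hashlib.sha256(element.encode("utf-8")).hexdigest()


def naive_hash_intersection(own_set, peer_digests):
    """Intersection as seen by the party that received the peer's digests."""
    return {x for x in own_set if digest(x) in peer_digests}


def brute_force(peer_digests, candidate_domain):
    """Recover peer elements by hashing every candidate in a small domain."""
    return {c for c in candidate_domain if digest(c) in peer_digests}


# Hypothetical inputs: both sets come from a domain of only 10,000 values.
first_set = {"555-0101", "555-0102"}
second_set = {"555-0102", "555-0199"}
second_digests = {digest(x) for x in second_set}

intersection = naive_hash_intersection(first_set, second_digests)

# The receiver can also enumerate the whole domain and learn the elements
# of the second set that lie outside the intersection.
domain = (f"555-{i:04d}" for i in range(10000))
recovered = brute_force(second_digests, domain)
```

In this sketch the receiver learns the full second set, not just the intersection, which is exactly what a PSI protocol is supposed to prevent.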

[0005] Therefore, improving the computational efficiency of set-safe intersection has become an urgent technical problem to be solved.

[0006] It should be noted that the information disclosed in the background section above is only used to enhance the understanding of the background of this disclosure, and therefore may include information that does not constitute prior art known to those skilled in the art.

Summary of the Invention

[0007] This disclosure provides a set-safe intersection method, apparatus, electronic device, and storage medium, which at least to some extent overcomes the problem of low efficiency in set-safe intersection in related technologies.

[0008] Other features and advantages of this disclosure will become apparent from the following detailed description, or may be learned in part by practice of this disclosure.

[0009] According to one aspect of this disclosure, a set-secure intersection (PSI) method is provided, applied to a first device, comprising: initiating a set-secure intersection (PSI) protocol process to a second device; determining, with the second device, execution conditions and common parameters for hash bucketing, wherein the execution conditions are the conditions that must be met for hash bucketing to be performed, and the common parameters are the parameters required to perform hash bucketing; performing hash bucketing on a locally stored first set according to the common parameters to obtain a first bucket identifier table and a first bucket element registration table for the first set; sending the first bucket identifier table to the second device, and receiving, from the second device, a second bucket identifier table of the second set and a reduced second set, wherein the reduced second set is obtained by the second device filtering elements from the locally stored second set according to the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table of the second set; filtering out the corresponding elements in the first set according to the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table, and obtaining the reduced first set once the execution condition is no longer met; and executing the PSI protocol process with the reduced first set and the reduced second set as input, to obtain the intersection result of the first set and the second set.

[0010] In one embodiment of this disclosure, the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy;

[0011] Performing hash bucketing on the locally stored first set according to the common parameters, to obtain the first bucket identifier table and the first bucket element registration table for the first set, includes obtaining the two tables through the following first preset steps: updating the bucketing strategy of each bucket according to the update strategy; creating a bucket identifier table and a bucket element registration table according to the bucket size; setting the bucket identifier table to zero and the bucket element registration table to empty; selecting the target hash function to be used in this round from the hash function sequence; calculating the hash value of each element in the first set one by one using the target hash function; determining, according to the bucketing strategy, the bucket number of the hash bucket corresponding to the hash value of each element in the first set, and setting the value at the corresponding preset position in the bucket identifier table to T; and, according to that bucket number, putting the hash value of each element in the first set into the corresponding hash bucket and updating the bucket identifier table and the bucket element registration table, to obtain the first bucket identifier table and the first bucket element registration table of the first set. The first bucket identifier table records whether an element has been put into each hash bucket that stores elements of the first set, and the first bucket element registration table records the elements that have been put into each such hash bucket.
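One round of the first preset steps might be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the hash function sequence is modeled as salted SHA-256 (`make_hash` and `HASH_SEQUENCE` are hypothetical names), the bucketing strategy is assumed to be "hash value modulo bucket count", the update strategy is omitted, and the registration table stores the elements themselves.

```python
import hashlib


def make_hash(salt: bytes):
    """Build one member of a hypothetical hash function sequence."""
    def hash_fn(element: str) -> int:
        data = salt + element.encode("utf-8")
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")
    return hash_fn


# Assumed common parameter: a sequence of hash functions agreed by both devices.
HASH_SEQUENCE = [make_hash(b"round-0"), make_hash(b"round-1"), make_hash(b"round-2")]


def bucket_round(elements, round_index, bucket_count):
    """Return (bucket identifier table, bucket element registration table)."""
    # Identifier table set to "zero" (False); registration table set to empty.
    identifier_table = [False] * bucket_count
    registration_table = [[] for _ in range(bucket_count)]

    # Select the target hash function for this round from the sequence.
    hash_fn = HASH_SEQUENCE[round_index % len(HASH_SEQUENCE)]

    for element in elements:
        # Assumed bucketing strategy: hash value modulo the bucket count.
        bucket_number = hash_fn(element) % bucket_count
        identifier_table[bucket_number] = True              # mark position as T
        registration_table[bucket_number].append(element)   # register the element
    return identifier_table, registration_table


flags, registry = bucket_round(["alice", "bob", "carol"], round_index=0, bucket_count=8)
```

Only `flags` would be sent to the other device; `registry` stays local and is used later to decide which elements to keep.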

[0012] In one embodiment of this disclosure, based on the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table, corresponding elements in the first set are filtered out. After the execution condition is not met, a reduced first set is obtained. This includes: filtering out elements in the first set through the following second preset steps to obtain the filtered first set: performing logical operations on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table one by one to obtain a merged bucket identifier table; determining the elements to be filtered out in the first bucket element registration table based on the merged bucket identifier table, and counting the elements retained in the first bucket element registration table to obtain the filtered first set.

[0013] In one embodiment of this disclosure, the elements to be filtered out in the first bucket element registration table are determined according to the merged bucket identifier table, and the elements retained in the first bucket element registration table are counted to obtain the first set after filtering, including: when the identifier in the merged bucket identifier table is T, the elements of the corresponding sequence in the first bucket element registration table are determined to be the elements to be filtered out; when the identifier in the merged bucket identifier table is F, the elements of the corresponding sequence in the first bucket element registration table are determined to be the elements to be retained.
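The second preset steps can be sketched as below. The disclosure only says that "logical operations" are applied position by position; this sketch assumes an exclusive OR, because that choice yields T exactly for buckets occupied on one side only, which matches the rule that T means "filter out" and F means "retain". The function names are hypothetical.

```python
def merge_identifier_tables(first_flags, second_flags):
    """Position-wise XOR of two bucket identifier tables (assumed operation)."""
    if len(first_flags) != len(second_flags):
        raise ValueError("identifier tables must have the same bucket count")
    return [a != b for a, b in zip(first_flags, second_flags)]


def filter_registration_table(merged_flags, registration_table):
    """Keep the elements of every bucket whose merged identifier is F."""
    kept = []
    for is_filtered, bucket in zip(merged_flags, registration_table):
        if not is_filtered:
            kept.extend(bucket)
    return kept


# Four buckets: occupied by both devices, first only, second only, neither.
first_flags = [True, True, False, False]
second_flags = [True, False, True, False]
first_registry = [["alice"], ["bob"], [], []]

merged = merge_identifier_tables(first_flags, second_flags)
filtered_first_set = filter_registration_table(merged, first_registry)
```

Here "bob" is dropped because no element of the second set landed in the same bucket, so "bob" cannot be in the intersection; "alice" is retained as a candidate.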

[0014] In one embodiment of this disclosure, based on the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table, corresponding elements in the first set are filtered out. After the execution condition is not met, a reduced first set is obtained. This includes: determining whether the filtered first set meets the execution condition; if not, using the filtered first set as the reduced first set; if yes, performing the first preset step and the second preset step on the filtered first set until a new filtered first set does not meet the execution condition, ending the execution of the first preset step and the second preset step, and using the new filtered first set as the reduced first set.
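The repeated application of the first and second preset steps might look like the following single-process simulation. It is a sketch under assumptions: both devices are simulated locally, the only execution condition is a maximum number of rounds, each round's hash function is derived by prefixing the round index (a hypothetical stand-in for the hash function sequence), and sets of occupied bucket numbers stand in for the identifier tables.

```python
import hashlib


def bucket_of(element: str, round_index: int, bucket_count: int) -> int:
    """Hypothetical per-round hash; the round index selects the function."""
    digest = hashlib.sha256(f"{round_index}:{element}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % bucket_count


def one_round(first, second, round_index, bucket_count):
    """Bucket both sets, then keep only elements in buckets occupied by both."""
    first_occupied = {bucket_of(e, round_index, bucket_count) for e in first}
    second_occupied = {bucket_of(e, round_index, bucket_count) for e in second}
    shared = first_occupied & second_occupied
    kept_first = {e for e in first if bucket_of(e, round_index, bucket_count) in shared}
    kept_second = {e for e in second if bucket_of(e, round_index, bucket_count) in shared}
    return kept_first, kept_second


def reduce_sets(first, second, bucket_count=1024, max_rounds=3):
    """Repeat rounds until the (assumed) execution condition is no longer met."""
    first, second = set(first), set(second)
    rounds_done = 0
    while True:
        first, second = one_round(first, second, rounds_done, bucket_count)
        rounds_done += 1
        if rounds_done >= max_rounds:  # condition no longer met: stop
            return first, second, rounds_done


first_set = {f"user{i}" for i in range(200)}
second_set = {f"user{i}" for i in range(150, 400)}
reduced_first, reduced_second, rounds = reduce_sets(first_set, second_set)
```

Elements common to both sets always hash to the same bucket in every round, so they are never filtered out; only elements outside the intersection can be removed.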

[0015] In one embodiment of this disclosure, the execution conditions include at least one of: maximum number of executions, set element screening rate threshold, and set element screening amount threshold.
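One way to represent the three kinds of execution condition is sketched below. The disclosure names the conditions but does not define how they are evaluated, so the interpretation here is an assumption: another round runs only while the round limit has not been reached and the previous round removed at least the configured share and number of elements. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecutionConditions:
    """Any field left as None is treated as 'not configured'."""
    max_rounds: Optional[int] = None              # maximum number of executions
    min_screening_rate: Optional[float] = None    # screening rate threshold
    min_screening_amount: Optional[int] = None    # screening amount threshold

    def still_met(self, rounds_done: int, size_before: int, size_after: int) -> bool:
        """Return True if another bucketing round should be executed."""
        removed = size_before - size_after
        if self.max_rounds is not None and rounds_done >= self.max_rounds:
            return False
        if self.min_screening_amount is not None and removed < self.min_screening_amount:
            return False
        if self.min_screening_rate is not None:
            rate = removed / size_before if size_before else 0.0
            if rate < self.min_screening_rate:
                return False
        return True


conditions = ExecutionConditions(max_rounds=3, min_screening_rate=0.1)
```

With these assumed settings, a round that removes 20 of 100 elements justifies another round, while a round that removes only 5 of 100 does not.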

[0016] In one embodiment of this disclosure, the method further includes: sending the intersection result of the first set and the second set to the second device.

[0017] According to another aspect of this disclosure, a set-secure intersection method is provided, applied to a second device, comprising: responding to a PSI protocol process initiated by a first device, and determining, with the first device, execution conditions and common parameters for hash bucketing, wherein the execution conditions are the conditions that must be met for hash bucketing to be performed, and the common parameters are the parameters required to perform hash bucketing; performing hash bucketing on a locally stored second set according to the common parameters to obtain a second bucket identifier table and a second bucket element registration table for the second set; sending the second bucket identifier table to the first device, and receiving the first bucket identifier table of the first set sent by the first device, so that the first device filters elements from the locally stored first set based on the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table of the first set, to obtain a reduced first set; filtering out the corresponding elements in the second set based on the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table, and obtaining a reduced second set once the execution condition is no longer met; and sending the reduced second set to the first device so that the first device can use the reduced first set and the reduced second set as input to execute the PSI protocol process and obtain the intersection result of the first set and the second set.
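The message exchange between the two devices can be sketched for a single round as follows. Everything here is illustrative: the `Device` class, `bucket_of`, and `BUCKET_COUNT` are hypothetical, and `placeholder_psi` is a plain set intersection standing in for a real PSI protocol, so it provides no privacy and only marks where the actual protocol would run.

```python
import hashlib

BUCKET_COUNT = 64  # hypothetical common parameter agreed by both devices


def bucket_of(element: str, round_index: int) -> int:
    digest = hashlib.sha256(f"{round_index}:{element}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKET_COUNT


class Device:
    def __init__(self, elements):
        self.elements = set(elements)
        self.registry = {}  # bucket element registration table (kept local)

    def build_identifier_table(self, round_index):
        """Bucket the local set; return the identifier table to send."""
        self.registry = {}
        for element in self.elements:
            self.registry.setdefault(bucket_of(element, round_index), []).append(element)
        return [b in self.registry for b in range(BUCKET_COUNT)]

    def filter_with(self, own_flags, peer_flags):
        """Keep elements only in buckets that both devices marked as occupied."""
        kept = set()
        for bucket, (mine, theirs) in enumerate(zip(own_flags, peer_flags)):
            if mine and theirs:
                kept.update(self.registry[bucket])
        self.elements = kept


def placeholder_psi(first_reduced, second_reduced):
    """NOT a secure protocol: stand-in for the real PSI step."""
    return first_reduced & second_reduced


first = Device({"a", "b", "c", "d"})
second = Device({"c", "d", "e", "f"})

first_flags = first.build_identifier_table(0)    # sent to the second device
second_flags = second.build_identifier_table(0)  # sent to the first device

first.filter_with(first_flags, second_flags)
second.filter_with(second_flags, first_flags)

# The second device sends its reduced set on as input to the PSI step.
result = placeholder_psi(first.elements, second.elements)
```

Only the identifier tables cross the wire during bucketing; the registration tables never leave the device that built them.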

[0018] In one embodiment of this disclosure, the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy. Performing hash bucketing on the locally stored second set according to the common parameters, to obtain a second bucket identifier table and a second bucket element registration table for the second set, includes obtaining the two tables through the following third preset steps: updating the bucketing strategy of each bucket according to the update strategy; creating a bucket identifier table and a bucket element registration table according to the bucket size; setting the bucket identifier table to zero and the bucket element registration table to empty; selecting the target hash function to be used in this round from the hash function sequence, and calculating the hash value of each element in the second set one by one using the target hash function; determining, according to the bucketing strategy, the bucket number of the hash bucket corresponding to the hash value of each element in the second set, and setting the value at the corresponding preset position in the bucket identifier table to T; and, according to that bucket number, putting the hash value of each element in the second set into the corresponding hash bucket and updating the bucket identifier table and the bucket element registration table, to obtain the second bucket identifier table and the second bucket element registration table of the second set. The second bucket identifier table records whether an element has been put into each hash bucket that stores elements of the second set, and the second bucket element registration table records the elements that have been put into each such hash bucket.

[0019] In one embodiment of this disclosure, based on the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table, corresponding elements in the second set are filtered out. After the execution condition is not met, a reduced second set is obtained. This includes: filtering out elements in the second set through the following fourth preset step to obtain the filtered second set: performing logical operations on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table one by one to obtain a merged bucket identifier table; determining the elements to be filtered out in the second bucket element registration table based on the merged bucket identifier table, and counting the elements retained in the second bucket element registration table to obtain the filtered second set.

[0020] In one embodiment of this disclosure, the elements to be filtered out in the second bucket element registration table are determined according to the merged bucket identifier table, and the elements retained in the second bucket element registration table are counted to obtain the second set after filtering, including: when the identifier in the merged bucket identifier table is T, the elements of the corresponding sequence in the second bucket element registration table are determined to be the elements to be filtered out; when the identifier in the merged bucket identifier table is F, the elements of the corresponding sequence in the second bucket element registration table are determined to be the elements to be retained.

[0021] In one embodiment of this disclosure, based on the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table, corresponding elements in the second set are screened out. After the execution condition is not met, a reduced second set is obtained. This includes: determining whether the screened second set meets the execution condition; if not, using the screened second set as the reduced second set; if yes, performing the third preset step and the fourth preset step on the screened second set until a new screened second set does not meet the execution condition, ending the execution of the third preset step and the fourth preset step, and using the new screened second set as the reduced second set.

[0022] In one embodiment of this disclosure, the execution conditions include at least one of: maximum number of executions, set element screening rate threshold, and set element screening amount threshold.

[0023] In one embodiment of this disclosure, the method further includes: receiving an intersection result sent by the first device.

[0024] According to another aspect of this disclosure, a set secure intersection (PSI) device is provided, applied to a first device, comprising: a first parameter determination module, configured to initiate a set secure intersection (PSI) protocol process to a second device, and to determine, with the second device, execution conditions and common parameters for hash bucketing, wherein the execution conditions are necessary conditions to be met for executing hash bucketing, and the common parameters are necessary parameters for executing hash bucketing; a first bucketing processing module, configured to perform hash bucketing processing on a locally stored first set according to the common parameters, to obtain a first bucket identifier table and a first bucket element registration table for the first set; a data receiving module, configured to send the first bucket identifier table to the second device, and to receive, from the second device, the second bucket identifier table of the second set and a reduced second set, wherein the reduced second set is obtained by the second device filtering elements from a locally stored second set based on the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table of the second set; the first bucketing processing module being further configured to filter out corresponding elements in the first set based on the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table, and to obtain the reduced first set once the execution condition is no longer met; and an intersection result determination module, configured to use the reduced first set and the reduced second set as inputs to execute the PSI protocol process to obtain the intersection result of the first set and the second set.

[0025] In one embodiment of this disclosure, the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy; the aforementioned first bucketing processing module is further configured to obtain a first bucket identifier table and a first bucket element registration table for the first set through the following first preset steps: updating the bucketing strategy of each bucket according to the update strategy; creating a bucket identifier table and a bucket element registration table according to the bucket size; setting the bucket identifier table to zero and the bucket element registration table to empty; selecting a target hash function to be used in this round from the hash function sequence; calculating the hash value of each element in the first set one by one using the target hash function; determining, according to the bucketing strategy, the bucket number of the hash bucket corresponding to the hash value of each element in the first set, and setting the value at the corresponding preset position in the bucket identifier table to T; and, according to that bucket number, putting the hash value of each element in the first set into the corresponding hash bucket and updating the bucket identifier table and the bucket element registration table, to obtain the first bucket identifier table and the first bucket element registration table of the first set. The first bucket identifier table records whether an element has been put into each hash bucket that stores elements of the first set, and the first bucket element registration table records the elements that have been put into each such hash bucket.

[0026] In one embodiment of this disclosure, the first bucket processing module is further configured to filter out elements in the first set through the following second preset steps to obtain a first set after filtering: performing logical operations on the corresponding identifier bits in the first bucket identifier table and the second bucket identifier table one by one to obtain a merged bucket identifier table; determining the elements to be filtered out in the first bucket element registration table according to the merged bucket identifier table, and counting the elements retained in the first bucket element registration table to obtain a first set after filtering.

[0027] In one embodiment of this disclosure, the first bucketing processing module is further configured to determine, when the identifier bit in the merged bucketing identifier table is T, that the element in the corresponding sequence in the first bucketing element registration table is an element to be filtered out; and when the identifier bit in the merged bucketing identifier table is F, that the element in the corresponding sequence in the first bucketing element registration table is an element to be retained.

[0028] In one embodiment of this disclosure, the first bucketing processing module is further configured to determine whether the first set after screening meets the execution conditions; if not, the first set after screening is used as the reduced first set; if yes, the first preset step and the second preset step are executed on the first set after screening until the newly screened first set does not meet the execution conditions, the execution of the first preset step and the second preset step is terminated, and the newly screened first set is used as the reduced first set.

[0029] In one embodiment of this disclosure, the above-mentioned apparatus further includes an intersection result sending module, which is used to send the intersection result of the first set and the second set to the second device.

[0030] According to another aspect of this disclosure, a set secure intersection device is provided, applied to a second device, comprising: a second parameter determination module, configured to respond to a PSI protocol process initiated by a first device, and to determine, with the first device, the execution conditions and common parameters for hash bucketing, wherein the execution conditions are necessary conditions to be met for executing hash bucketing, and the common parameters are necessary parameters for executing hash bucketing; a second bucketing processing module, configured to perform hash bucketing processing on a locally stored second set according to the common parameters, to obtain a second bucket identifier table and a second bucket element registration table for the second set; an identifier table inter-transfer module, configured to send the second bucket identifier table to the first device and receive the first bucket identifier table of the first set sent by the first device, so that the first device filters elements from a locally stored first set according to the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table of the first set to obtain a reduced first set; the second bucketing processing module being further configured to filter out corresponding elements in the second set according to the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table, and to obtain a reduced second set once the execution condition is no longer met; and a reduced set sending module, configured to send the reduced second set to the first device so that the first device can use the reduced first set and the reduced second set as input to execute the PSI protocol process and obtain the intersection result of the first set and the second set.

[0031] In one embodiment of this disclosure, the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy; the second bucketing processing module is further configured to obtain a second bucket identifier table and a second bucket element registration table of the second set through the following third preset steps: updating the bucketing strategy of each bucket according to the update strategy; creating a bucket identifier table and a bucket element registration table according to the bucket size; setting the bucket identifier table to zero and the bucket element registration table to empty; selecting the target hash function to be used in this round from the hash function sequence, and calculating the hash value of each element in the second set one by one using the target hash function; determining, according to the bucketing strategy, the bucket number of the hash bucket corresponding to the hash value of each element in the second set, and setting the value at the corresponding preset position in the bucket identifier table to T; and, according to that bucket number, putting the hash value of each element in the second set into the corresponding hash bucket and updating the bucket identifier table and the bucket element registration table, to obtain the second bucket identifier table and the second bucket element registration table of the second set. The second bucket identifier table records whether an element has been put into each hash bucket that stores elements of the second set, and the second bucket element registration table records the elements that have been put into each such hash bucket.

[0032] In one embodiment of this disclosure, the second bucketing processing module is further configured to filter out elements in the second set through the following fourth preset step to obtain a filtered second set: performing logical operations on the corresponding identifier bits in the first bucketing identifier table and the second bucketing identifier table one by one to obtain a merged bucketing identifier table; determining the elements to be filtered out in the second bucketing element registration table according to the merged bucketing identifier table, and counting the elements retained in the second bucketing element registration table to obtain a filtered second set.

[0033] In one embodiment of this disclosure, the second bucketing processing module is further configured to determine, when the identifier bit in the merged bucket identifier table is T, that the element in the corresponding sequence in the second bucket element registration table is an element to be filtered out; and when the identifier bit in the merged bucket identifier table is F, that the element in the corresponding sequence in the second bucket element registration table is an element to be retained.

[0034] In one embodiment of this disclosure, the second bucketing processing module is further configured to determine whether the second set after screening meets the execution conditions; if not, the second set after screening is used as the reduced second set; if yes, the third preset step and the fourth preset step are executed on the second set after screening until the newly screened second set does not meet the execution conditions, the execution of the third preset step and the fourth preset step is terminated, and the newly screened second set is used as the reduced second set.

[0035] In one embodiment of this disclosure, the apparatus further includes an intersection result receiving module, which is used to receive the intersection result sent by the first device.

[0036] According to another aspect of this disclosure, an electronic device is provided, comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the above-described set-safe intersection method by executing the executable instructions.

[0037] According to another aspect of this disclosure, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the above-described set-safe intersection method.

[0038] This disclosure provides a set-secure intersection method, apparatus, electronic device, and storage medium. The set-secure intersection method includes: adding a hash bucketing step before the PSI protocol process to filter elements in a first set and a second set, obtaining a reduced first set and a reduced second set; using the reduced first set and the reduced second set as input, executing the PSI protocol process to obtain the intersection result of the first set and the second set. This disclosure reduces the number of elements in the sets to be intersected input to the PSI protocol process, thereby reducing the time overhead of PSI protocol execution, improving the computational efficiency of set-secure intersection, and achieving the effect of improving overall performance.
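The size of the reduction can be estimated with a rough calculation. The model below is an assumption that does not come from the disclosure: each round uses an independent, uniformly distributed hash function, there are $m$ buckets, and the second set holds $n$ elements. An element $x$ of the first set that is not in the second set survives a round only if at least one element of the second set falls into the same bucket.

```latex
\Pr[x \text{ survives one round}]
  = 1 - \left(1 - \frac{1}{m}\right)^{n}
  \approx 1 - e^{-n/m}

\Pr[x \text{ survives } k \text{ rounds}]
  \le \left(1 - \left(1 - \frac{1}{m}\right)^{n}\right)^{k}
```

The second line is an upper bound because the second set also shrinks from round to round. For example, with $m = n$ a non-intersecting element survives one round with probability about 0.63 and three rounds with probability at most about 0.25. Elements that belong to the intersection always survive, since both devices hash them to the same bucket.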

[0039] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and are not intended to limit this disclosure.

Attached Figure Description

[0040] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this disclosure and, together with the description, serve to explain the principles of this disclosure. It is obvious that the drawings described below are merely some embodiments of this disclosure, and those skilled in the art can obtain other drawings based on these drawings without any inventive effort.

[0041] Figure 1 illustrates a communication system architecture according to an embodiment of the present disclosure;

[0042] Figure 2 illustrates a set-safe intersection method according to an embodiment of the present disclosure;

[0043] Figure 3 illustrates a set-safe intersection method according to an embodiment of the present disclosure;

[0044] Figure 4 illustrates another set-safe intersection method according to an embodiment of the present disclosure;

[0045] Figure 5 illustrates another set-safe intersection method according to an embodiment of the present disclosure;

[0046] Figure 6 illustrates another set-safe intersection method according to an embodiment of the present disclosure;

[0047] Figure 7 illustrates a set secure intersection device according to an embodiment of the present disclosure;

[0048] Figure 8 illustrates another set-secure intersection device according to an embodiment of the present disclosure;

[0049] Figure 9 is a structural block diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Implementation

[0050] Exemplary embodiments will now be described more fully with reference to the accompanying drawings. However, these exemplary embodiments can be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, they are provided so that this disclosure will be more comprehensive and complete, and will fully convey the concept of the exemplary embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

[0051] Furthermore, the accompanying drawings are merely illustrative of this disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and therefore repeated descriptions of them will be omitted. Some block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network and / or processor devices and / or microcontroller devices.

[0052] As mentioned in the background section, existing set-secure intersection methods include those based on naive hashing, RSA blind signatures, Diffie-Hellman key exchange, and oblivious transfer (OT). Among these, naive hashing-based methods suffer from security issues and are being phased out. The other methods require blind signing, modular exponentiation, or a batch of OT protocol runs for every element in the sets, regardless of the intersection size. These computations involve many time-consuming cryptographic operations, which significantly reduces efficiency. In the extreme case where the two sets share no elements, all of these computations must still be completed before concluding that the intersection is empty.

[0053] Based on this, this disclosure provides a set-secure intersection method, apparatus, electronic device, and storage medium. By adding a hash bucketing step before the PSI protocol process, elements in the first and second sets are filtered to obtain a reduced first set and a reduced second set. Using the reduced first set and the reduced second set as input, the PSI protocol process is executed to obtain the intersection result of the first set and the second set. This disclosure reduces the number of elements in the sets to be intersected input to the PSI protocol process, thereby reducing the time overhead of PSI protocol execution, improving the computational efficiency of set-secure intersection, and achieving the effect of improving overall performance.

[0054] This disclosure adds a preprocessing step to the existing two-party set secure intersection protocol. The preprocessing uses a dynamic multi-round hash bucketing method to reduce the number of elements in each set, thereby making the subsequent two-party set secure intersection protocol run more efficiently. By controlling the execution conditions of hash bucketing, the information exposed by the hash bucketing sequence number of each element can be controlled, which can simultaneously ensure execution efficiency and execution security.

[0055] By adding a preprocessing process based on multi-round hash bucketing, this disclosure effectively reduces the number of elements input to the PSI protocol for intersection, thereby reducing the time overhead of PSI protocol execution and improving overall performance.

[0056] From a cryptographic performance perspective, the time overhead of modular exponentiation of large integers is significantly greater than that of hash functions. For example, the time required for a 1024-bit modular exponentiation operation in a single thread is approximately 20 times that of a single execution of the SHA-256 hash function. Therefore, adding an appropriate amount of hash function operations to the set-secure intersection protocol can improve the execution efficiency of the set-secure intersection protocol.
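The cost gap described above can be checked locally with a small timing sketch. The function names, operand sizes, and repeat counts below are illustrative assumptions, and the measured ratio depends on the machine and library build, so it is not expected to reproduce any particular figure.

```python
import hashlib
import secrets
import time


def time_sha256(repeats: int = 2000) -> float:
    """Average seconds per SHA-256 call on a 32-byte input."""
    data = secrets.token_bytes(32)
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.sha256(data).digest()
    return (time.perf_counter() - start) / repeats


def time_modexp(repeats: int = 200, bits: int = 1024) -> float:
    """Average seconds per modular exponentiation with `bits`-bit operands."""
    # Random odd modulus with the top bit set, so it is exactly `bits` long.
    modulus = secrets.randbits(bits) | (1 << (bits - 1)) | 1
    base = secrets.randbits(bits) % modulus
    exponent = secrets.randbits(bits) | (1 << (bits - 1))
    start = time.perf_counter()
    for _ in range(repeats):
        pow(base, exponent, modulus)  # built-in three-argument pow
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    h, m = time_sha256(), time_modexp()
    print(f"SHA-256: {h:.2e} s, modexp: {m:.2e} s, ratio: {m / h:.1f}")
```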

[0057] Figure 1 shows a schematic diagram of an exemplary system architecture to which the set-secure intersection method or set-secure intersection apparatus of the embodiments of the present disclosure can be applied.

[0058] As shown in Figure 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105.

[0059] Network 104 is a medium used to provide a communication link between terminal devices 101, 102, 103 and server 105, and can be a wired network or a wireless network.

[0060] Optionally, the aforementioned wireless or wired networks use standard communication technologies and / or protocols. The network is typically the Internet, but can also be any network, including but not limited to Local Area Networks (LANs), Metropolitan Area Networks (MANs), Wide Area Networks (WANs), mobile, wired or wireless networks, private networks, or any combination of virtual private networks. In some embodiments, technologies and / or formats including Hyper Text Markup Language (HTML), Extensible Markup Language (XML), etc., are used to represent data exchanged over the network. Furthermore, conventional encryption technologies such as Secure Socket Layer (SSL), Transport Layer Security (TLS), Virtual Private Networks (VPNs), and Internet Protocol Security (IPsec) can be used to encrypt all or some links. In other embodiments, custom and / or dedicated data communication technologies can be used to replace or supplement the aforementioned data communication technologies.

[0061] Terminal devices 101, 102, and 103 can be various electronic devices, including but not limited to smartphones, tablets, laptops, desktop computers, wearable devices, augmented reality devices, virtual reality devices, etc.

[0062] Optionally, the client applications installed on different terminal devices 101, 102, and 103 may be the same, or clients of the same type of application based on different operating systems. Depending on the terminal platform, the specific form of the application client may also differ; for example, the application client may be a mobile client, a PC client, etc.

[0063] Server 105 can be a server that provides various services, such as a backend management server that supports the devices operated by users using terminal devices 101, 102, and 103. The backend management server can analyze and process received requests and other data, and feed the processing results back to the terminal devices.

[0064] Optionally, the server can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms. The terminal can be a smartphone, tablet, laptop, desktop computer, smart speaker, smartwatch, etc., but is not limited to these. The terminal and server can be directly or indirectly connected via wired or wireless communication, which is not limited herein.

[0065] Those skilled in the art will appreciate that the number of terminal devices, networks, and servers shown in Figure 1 is merely illustrative; any number of terminal devices, networks, and servers can be included depending on actual needs. The embodiments of this disclosure place no limitation on this.

[0066] The following detailed description of this exemplary implementation method is provided in conjunction with the accompanying drawings and embodiments.

[0067] First, this disclosure provides a set-safe intersection method that can be executed by any electronic device with computing capabilities.

[0068] Figure 2 illustrates a set-secure intersection method according to an embodiment of the present disclosure. As shown in Figure 2, the set-secure intersection method provided in this embodiment includes the following steps:

[0069] S202, initiate a set-secure intersection (Private Set Intersection, PSI) protocol process with the second device, and determine the execution conditions and common parameters of hash bucketing with the second device. The execution conditions are the necessary conditions that need to be met to execute hash bucketing, and the common parameters are the necessary parameters for executing hash bucketing.

[0070] It should be noted that the first or second device can be any electronic device with data storage capabilities, such as the aforementioned terminal device or server. The first and second devices can determine the execution conditions and common parameters of hash bucketing through interactive negotiation, or they can determine the execution conditions and common parameters of hash bucketing through non-interactive methods using pre-set data.

[0071] In one embodiment of this disclosure, the execution conditions may include at least one of: maximum number of executions, set element screening rate threshold, and set element screening amount threshold; the common parameters may include: hash function sequence, bucketing strategy, bucket size, and update strategy.
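As an illustration of how the two parties might hold the negotiated values, the sketch below groups them into two immutable records. The class names, field names, and string-valued strategy labels are hypothetical choices for this example and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExecutionConditions:
    """Conditions that must hold for another bucketing round to run."""
    max_rounds: Optional[int] = None           # maximum number of executions
    min_removal_rate: Optional[float] = None   # screening rate threshold
    min_removed_count: Optional[int] = None    # screening amount threshold

    def __post_init__(self) -> None:
        # The text requires at least one of the three conditions.
        if (self.max_rounds is None
                and self.min_removal_rate is None
                and self.min_removed_count is None):
            raise ValueError("at least one execution condition is required")


@dataclass(frozen=True)
class CommonParameters:
    """Parameters both devices must agree on before bucketing."""
    hash_family: str       # hypothetical label, e.g. "sha256(x || i)"
    bucket_strategy: str   # hypothetical label, e.g. "suffix-bits"
    bucket_count: int      # bucket size for the current round
    update_strategy: str   # hypothetical label for the per-round update rule
```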

[0072] S204. Based on the common parameters, perform hash bucketing on the first set stored locally to obtain the first bucket identifier table and the first bucket element registration table of the first set.

[0073] It should be noted that the first set or the second set can be any set of elements that requires set-safe intersection; the first bucket identifier table is used to record whether each hash bucket storing elements in the first set has been filled with elements; the first bucket element registration table is used to record the elements that have been filled with each hash bucket storing elements in the first set; the hash bucket processing may include the first preset step and the second preset step in the embodiments of this disclosure.

[0074] In one embodiment of this disclosure, the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy. Performing hash bucketing on the locally stored first set based on the common parameters, to obtain the first bucket identifier table and the first bucket element registration table of the first set, includes obtaining these tables through the following first preset step: updating the bucketing strategy of each bucket according to the update strategy; creating a bucket identifier table and a bucket element registration table according to the bucket size; setting the bucket identifier table to zero and the bucket element registration table to empty; selecting, from the hash function sequence, the target hash function to be used in this round; calculating the hash value of each element in the first set one by one using the target hash function; determining, according to the bucketing strategy, the bucket number of the hash bucket corresponding to the hash value of each element in the first set, and setting the identifier at the corresponding position in the bucket identifier table to T; and, according to those bucket numbers, putting the hash value of each element in the first set into the corresponding hash bucket and updating the bucket identifier table and the bucket element registration table, to obtain the first bucket identifier table and the first bucket element registration table of the first set.

[0075] It should be noted that the steps for initializing the bucket identifier table and the bucket element registration table are as follows: update the bucketing strategy of each bucket according to the update strategy, create the bucket identifier table and the bucket element registration table according to the bucket size, set the bucket identifier table to zero, and set the bucket element registration table to empty. When executing each round of hash bucketing, the steps of initializing the bucket identifier table and the bucket element registration table need to be executed first. The bucketing strategy can be to use the last bit of the binary representation of the hash value of an element as the bucket number of that element.
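The initialization and the suffix-based bucketing strategy can be sketched as follows. The helper names are hypothetical, and the `suffix_bits` parameter generalizes the single-bit case mentioned in the text to the last n bits, which is an assumption of this example.

```python
def init_tables(bucket_count: int):
    """Create a zeroed identifier table and an empty registration table."""
    flags = [False] * bucket_count                 # all identifiers cleared
    registry = [[] for _ in range(bucket_count)]   # one empty list per bucket
    return flags, registry


def bucket_number(hash_value: bytes, suffix_bits: int) -> int:
    """Use the last `suffix_bits` bits of the hash value as the bucket number."""
    as_int = int.from_bytes(hash_value, "big")
    return as_int & ((1 << suffix_bits) - 1)
```

With `suffix_bits = 1` the strategy yields two buckets; each extra bit doubles the number of buckets.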

[0076] In one embodiment of this disclosure, see the set-secure intersection method shown in Figure 3. During the i-th round of hash bucketing, the first device selects the hash function $H_i(\cdot)$ from the hash function sequence in the common parameters, uses $H_i(\cdot)$ to calculate the hash value of each element $x_j$ in the first set one by one, obtains the bucket number k according to the bucketing strategy, and sets the position with sequence number k in the bucket identifier table to T (i.e., True). Then $x_j$, or alternatively its index, is placed into the sequence of the bucket element registration table associated with sequence number k, and the bucket identifier table and bucket element registration table are updated to obtain the first bucket identifier table and the first bucket element registration table of the first set.

[0077] S206, send the first bucket identifier table to the second device, and receive the second bucket identifier table of the second set and the reduced second set sent by the second device. The reduced second set is obtained by the second device by filtering elements of the locally stored second set according to the first bucket identifier table, the second bucket identifier table and the second bucket element registration table of the second set.
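One party's pass over its set in round i might look like the sketch below. It assumes elements are byte strings, SHA-256 as the base hash, the round number appended as decimal ASCII, and a bucket count of `2 ** suffix_bits`; all of these, like the function name, are assumptions of the example rather than details fixed by the text.

```python
import hashlib


def bucket_round(elements, round_index: int, suffix_bits: int):
    """Return (flags, registry) for one party in one bucketing round."""
    bucket_count = 1 << suffix_bits
    flags = [False] * bucket_count
    registry = [[] for _ in range(bucket_count)]
    for x in elements:
        # Round-specific hash: base hash over the element followed by i.
        digest = hashlib.sha256(x + str(round_index).encode("ascii")).digest()
        # The last `suffix_bits` bits of the hash select the bucket k.
        k = int.from_bytes(digest, "big") % bucket_count
        flags[k] = True           # bucket k now holds at least one element
        registry[k].append(x)     # register the element under bucket k
    return flags, registry
```

Registering the element itself keeps the sketch short; storing its index instead, as the text allows, would only change what is appended.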

[0078] S208, based on the first bucket identifier table, the second bucket identifier table and the first bucket element registration table, filter out the corresponding elements in the first set, and obtain the reduced first set after the execution conditions are not met;

[0079] It should be noted that the hash bucketing step involves filtering out the corresponding elements in the first set based on the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table.

[0080] In one embodiment of this disclosure, based on the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table, corresponding elements in the first set are screened out. After the execution conditions are not met, a reduced first set is obtained. This includes: screening out elements in the first set through the following second preset steps to obtain the screened first set: performing logical operations on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table one by one to obtain a merged bucket identifier table; determining the elements to be screened out in the first bucket element registration table based on the merged bucket identifier table, and counting the elements retained in the first bucket element registration table to obtain the screened first set.

[0081] It should be noted that logical operations are performed on the corresponding identifier bits in the first and second bucket identifier bit tables one by one. The logical operation here can be an AND operation. For example, a bitwise AND operation can be performed on the corresponding identifier bits in the first and second bucket identifier bit tables one by one to obtain the merged bucket identifier bit table.
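A minimal sketch of the merge, assuming both identifier tables are equal-length lists of booleans (the function name is hypothetical):

```python
def merge_flags(first_flags, second_flags):
    """Position-wise AND of two bucket identifier tables."""
    if len(first_flags) != len(second_flags):
        raise ValueError("identifier tables must have the same bucket count")
    return [a and b for a, b in zip(first_flags, second_flags)]
```

A position in the merged table is True only when both parties placed at least one element in that bucket.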

[0082] In one embodiment of this disclosure, determining the elements to be screened out in the first bucket element registration table according to the merged bucket identifier table, and counting the elements retained in the first bucket element registration table to obtain the screened first set, includes: when the identifier in the merged bucket identifier table is T, the elements of the corresponding sequence in the first bucket element registration table are determined to be the elements to be retained; when the identifier in the merged bucket identifier table is F, the elements of the corresponding sequence in the first bucket element registration table are determined to be the elements to be screened out.

[0083] It should be noted that, with the AND operation described above, an identifier of F in the merged bucket identifier table means that at least one of the two sets has no element in the bucket with that number, so the set elements in the corresponding first bucket element registration table sequence cannot be final intersection elements. They do not need to enter the PSI protocol process and are removed. An identifier of T means that both sets have elements in that bucket, so the set elements in the corresponding sequence may be final intersection elements. They need to be checked in the next round of hash bucketing or entered into the PSI protocol process for calculation and verification, and are retained.
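The screening pass can be sketched as below, assuming the merged table came from a bitwise AND. Under that assumption a True merged flag means both parties placed at least one element in the bucket, so this sketch keeps those elements as intersection candidates and drops the elements of every other bucket. The function name is hypothetical.

```python
def filter_by_merged(merged_flags, registry):
    """Keep elements registered in buckets that are occupied on both sides."""
    kept = []
    for both_occupied, bucket in zip(merged_flags, registry):
        if both_occupied:
            kept.extend(bucket)   # possible intersection elements
        # otherwise the bucket's elements cannot be in the intersection
    return kept
```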

[0084] In one embodiment of this disclosure, based on the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table, corresponding elements in the first set are screened out. After the execution conditions are not met, a reduced first set is obtained. This includes: determining whether the screened first set meets the execution conditions; if not, using the screened first set as the reduced first set; if yes, performing a first preset step and a second preset step on the screened first set until a new screened first set does not meet the execution conditions, ending the execution of the first preset step and the second preset step, and using the new screened first set as the reduced first set.

[0085] S210: Using the reduced first set and the reduced second set as input, execute the PSI protocol process to obtain the intersection result of the first set and the second set.

[0086] The set-safe intersection method provided in this disclosure adds a hash bucketing step before the PSI protocol process to filter elements in the first and second sets, resulting in a reduced first set and a reduced second set. Using these reduced first and second sets as input, the PSI protocol process is executed to obtain the intersection result of the first and second sets. This disclosure reduces the number of elements in the sets to be intersected input to the PSI protocol process, thereby reducing the time overhead of PSI protocol execution, improving the computational efficiency of set-safe intersection, and ultimately enhancing overall performance.

[0087] In one embodiment of this disclosure, the method further includes sending the intersection result of the first set and the second set to a second device, so that the second device can obtain the intersection result of the first set and the second set.

[0088] In one embodiment of this disclosure, when the execution conditions for hash bucketing include a maximum number of executions and a set element screening rate threshold, the maximum number of executions $epoch_{max}$ is set to [formula missing], where $|H(\cdot)|$ is the length of the binary representation output by the hash function; the initial bucket size is set [value missing]; and the set element screening rate threshold is set to $T_{filter} = 1\%$. The execution condition requires that the maximum number of executions and the set element screening rate threshold be met simultaneously.

[0089] The hash function sequence in the above common parameters can be constructed using the international algorithm SHA-256 as the base hash function $H(\cdot)$. Combined with the round number i of the hash bucketing, the i-th hash function in the sequence is $H_i(x) = H(x \| i)$, where x is a set element and $\|$ denotes string concatenation. The bucket size in the i-th round is determined by the smaller of the numbers of elements in the first and second sets input to that round, i.e., [formula missing], where $|A^{(i)}|$ is the number of elements in the first set input to the i-th round and $|B^{(i)}|$ is the number of elements in the second set input to the i-th round. The bucketing strategy adopts the suffix method, which uses only the last [count missing] bits of the binary representation of the hash value; the value used in the i-th round is related to the bucket size. The update strategy is to recalculate the bucket size and the bucketing strategy value before each round, based on the first set and the second set as filtered by the first and second devices in the previous rounds.
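The round-indexed hash family can be sketched with the standard library as follows. Encoding the round number i as decimal ASCII before concatenation is an assumption of this example, since the text only specifies string concatenation; the function name is also hypothetical.

```python
import hashlib
from typing import Callable


def make_round_hash(i: int) -> Callable[[bytes], bytes]:
    """Build the i-th hash in the sequence: SHA-256 over x followed by i."""
    suffix = str(i).encode("ascii")   # assumed encoding of the round number

    def h_i(x: bytes) -> bytes:
        return hashlib.sha256(x + suffix).digest()

    return h_i
```

Because the round number changes the hash input, two elements that collide in one round's bucket are unlikely to collide again in the next. Bare concatenation lets different (x, i) pairs produce the same input, for example `b"a1"` with i = 2 and `b"a"` with i = 12; a fixed-width or delimited encoding of i would avoid this, although within a single round every element shares the same i.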

[0090] The first device initiates a set-secure intersection (PSI) protocol process with the second device, using the locally stored first set as input; the second device responds to the PSI protocol process, using the locally stored second set as input, and sends the number of elements in the second set, $|B^{(0)}|$, to the first device. The first device determines the maximum number of executions $epoch_{max}$ and the set element screening rate threshold $T_{filter} = 1\%$, and sends both to the second device.

[0091] When performing the i-th round of hash bucketing, the first device determines the bucketing strategy and the bucket size for the round and sends them to the second device. The first device locally creates a bucket identifier table and a bucket element registration table of size [size missing] and initializes both; the second device likewise locally creates a bucket identifier table and a bucket element registration table of size [size missing] and initializes both.

[0092] The first device uses the i-th hash function in the hash function sequence, $H_i(x) = H(x \| i)$, to calculate the hash value of each element in the first set $A^{(i)}$ one by one, takes the last [count missing] bits of the binary representation of each hash value as that element's bucket number k, and sets the first bucket identifier table and the first bucket element registration table accordingly.

[0093] The second device uses the i-th hash function in the hash function sequence, $H_i(x) = H(x \| i)$, to calculate the hash value of each element in the second set $B^{(i)}$ one by one, takes the last [count missing] bits of the binary representation of each hash value as that element's bucket number k, and sets the second bucket identifier table and the second bucket element registration table accordingly.

[0094] The second device sends the second bucket identifier table to the first device, and the first device merges the first bucket identifier table with the second bucket identifier table using a bitwise AND operation to obtain the merged bucket identifier table. The first device can send the first bucket identifier table to the second device so that the second device merges the first bucket identifier table with the second bucket identifier table using a bitwise AND operation and obtains the merged bucket identifier table itself. Alternatively, the first device can send the merged bucket identifier table directly to the second device.
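Only the identifier table crosses the network in this exchange, so it can be sent compactly. The sketch below packs one flag per bit and merges two packed tables byte by byte; the wire layout (bit k of a big-endian integer) and the function names are assumptions of this example, not a format defined by the disclosure.

```python
def pack_flags(flags) -> bytes:
    """Pack a list of booleans into bytes, one bit per bucket."""
    value = 0
    for k, flag in enumerate(flags):
        if flag:
            value |= 1 << k
    return value.to_bytes((len(flags) + 7) // 8, "big")


def unpack_flags(data: bytes, bucket_count: int):
    """Inverse of pack_flags for a known bucket count."""
    value = int.from_bytes(data, "big")
    return [bool((value >> k) & 1) for k in range(bucket_count)]


def merge_packed(first: bytes, second: bytes) -> bytes:
    """Bitwise AND of two packed identifier tables of equal length."""
    if len(first) != len(second):
        raise ValueError("packed tables must have the same length")
    return bytes(x & y for x, y in zip(first, second))
```

Either device can run `merge_packed` locally after receiving the other side's table, or one device can merge and forward the result, matching the two options described above.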

[0095] The first device checks the identifiers in the merged bucket identifier table one by one. If an identifier meets the condition, all elements in the corresponding entry of the first bucket element registration table are added to the next-round filtered set $A^{(i+1)}$; if the condition is not met, all elements in the corresponding entry of the first bucket element registration table are filtered out.

[0096] The second device checks the identifiers in the merged bucket identifier table one by one. If an identifier meets the condition, all elements in the corresponding entry of the second bucket element registration table are added to the next-round filtered set $B^{(i+1)}$; if the condition is not met, all elements in the corresponding entry of the second bucket element registration table are filtered out.

[0097] The first device calculates the set element screening rate for this round (i.e., the i-th round), [formula missing], where $|A^{(i+1)}|$ is the number of elements in the first set input to the (i+1)-th round and $|B^{(i+1)}|$ is the number of elements in the second set input to the (i+1)-th round. It then determines whether the execution condition for hash bucketing is met, i.e., [condition missing] and $i + 1 < epoch_{max}$. When the execution conditions are met, hash bucketing continues; when they are not met, hash bucketing ends. Finally, the first device obtains the reduced first set and the second device obtains the reduced second set. The second device sends the reduced second set to the first device. The first device uses the reduced first and second sets as inputs to the set-secure intersection (PSI) protocol and executes it to obtain the intersection of the first and second sets.
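The whole multi-round preprocessing can be simulated in one process to see how the pieces fit. This is a sketch under several stated assumptions: both sets are available locally (in the protocol each device only sees its own set and the exchanged identifier tables), the bucket count is the smallest power of two not below the smaller set size, the screening rate is the fraction of input elements removed in the round, and a round is repeated only while that rate stays at or above the threshold. None of these formulas are given in the text, and all names are hypothetical.

```python
import hashlib


def _bucketize(elements, round_index, bucket_count):
    """Return (flags, registry) for one party in one round."""
    flags = [False] * bucket_count
    registry = [[] for _ in range(bucket_count)]
    for x in elements:
        digest = hashlib.sha256(x + str(round_index).encode("ascii")).digest()
        # bucket_count is a power of two, so this keeps the hash's suffix bits.
        k = int.from_bytes(digest, "big") % bucket_count
        flags[k] = True
        registry[k].append(x)
    return flags, registry


def reduce_sets(first, second, max_rounds=8, min_removal_rate=0.01):
    """Filter both sets with multi-round hash bucketing before running PSI."""
    a, b = list(first), list(second)
    for i in range(max_rounds):                      # cap on executions
        total = len(a) + len(b)
        if total == 0:
            break
        smaller = max(1, min(len(a), len(b)))
        # Assumed sizing rule: smallest power of two >= the smaller set size.
        bucket_count = 1 << max(1, (smaller - 1).bit_length())
        flags_a, reg_a = _bucketize(a, i, bucket_count)
        flags_b, reg_b = _bucketize(b, i, bucket_count)
        merged = [fa and fb for fa, fb in zip(flags_a, flags_b)]
        next_a = [x for keep, bucket in zip(merged, reg_a) if keep for x in bucket]
        next_b = [x for keep, bucket in zip(merged, reg_b) if keep for x in bucket]
        removal_rate = (total - len(next_a) - len(next_b)) / total
        a, b = next_a, next_b
        if removal_rate < min_removal_rate:          # too little progress
            break
    return a, b
```

A common element hashes to the same bucket on both sides in every round, so it is never removed; the reduced sets therefore still contain the full intersection and can be handed to whichever PSI protocol the parties use.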

[0098] This disclosure also provides another set-secure intersection method. As shown in the flowchart of Figure 4, the method may include the following steps:

[0099] S402, the first device and the second device interact to generate PSI protocol execution parameters, and the first device obtains the first set from the local machine, and the second device obtains the second set from the local machine.

[0100] S404, the first device and the second device interact to perform multiple rounds of hash bucketing preprocessing, respectively filtering out relevant elements in the first set and the second set, to obtain the reduced first set and the reduced second set.

[0101] S406, the first device takes the reduced first set as input and the second device takes the reduced second set as input, executes the PSI protocol process according to the PSI protocol execution parameters, and the first device obtains the intersection result.

[0102] S408, the first device sends the intersection result to the second device.

[0103] In one embodiment of this disclosure, as shown in the flowchart of another set-secure intersection method in Figure 5, step S404 may further include the following steps:

[0104] S502, the first device and the second device negotiate and determine the execution conditions and common parameters for hash bucketing.

[0105] S504, the first device performs hash bucketing on the first set to obtain a first bucket identifier table and a first bucket element registration table; the second device performs hash bucketing on the second set to obtain a second bucket identifier table and a second bucket element registration table.

[0106] S506, perform an AND operation on the first bucket identifier table and the second bucket identifier table to obtain a merged bucket identifier table.

[0107] S508, the first device screens the first bucket element registration table according to the merged bucket identifier table to obtain the screened first set; the second device screens the second bucket element registration table according to the merged bucket identifier table to obtain the screened second set.

[0108] S510: Determine if the hash bucketing execution condition is met. If yes, continue executing S204-208. If no, obtain the reduced first set and the reduced second set.

[0109] This disclosure also provides another set-secure intersection method. As shown in the flowchart of Figure 6, the method may include the following steps:

[0110] S602, responding to the PSI protocol process initiated by the first device, determining the execution conditions and common parameters of hash bucketing with the first device, wherein the execution conditions are the necessary conditions to be met for executing hash bucketing, and the common parameters are the necessary parameters for executing hash bucketing;

[0111] S604, based on the common parameters, perform hash bucketing on the locally stored second set to obtain the second bucket identifier table and the second bucket element registration table of the second set;

[0112] S606, the second bucket identifier table is sent to the first device, and the first bucket identifier table of the first set is received from the first device, so that the first device can filter the elements of the locally stored first set according to the first bucket identifier table, the second bucket identifier table and the first bucket element registration table of the first set to obtain the reduced first set;

[0113] S608, based on the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table, filter out the corresponding elements in the second set, and obtain the reduced second set after the execution conditions are not met;

[0114] S610, the reduced second set is sent to the first device so that the first device can use the reduced first set and the reduced second set as input to execute the PSI protocol process and obtain the intersection result of the first set and the second set.

[0115] In one embodiment of this disclosure, the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy. Performing hash bucketing on the locally stored second set based on the common parameters, to obtain the second bucket identifier table and the second bucket element registration table of the second set, includes obtaining these tables through the following third preset step: updating the bucketing strategy of each bucket according to the update strategy; creating a bucket identifier table and a bucket element registration table according to the bucket size; setting the bucket identifier table to zero and the bucket element registration table to empty; selecting, from the hash function sequence, the target hash function to be used in this round; calculating the hash value of each element in the second set one by one using the target hash function; determining, according to the bucketing strategy, the bucket number of the hash bucket corresponding to the hash value of each element in the second set, and setting the identifier at the corresponding position in the bucket identifier table to T; and, according to those bucket numbers, putting the hash value of each element in the second set into the corresponding hash bucket and updating the bucket identifier table and the bucket element registration table, to obtain the second bucket identifier table and the second bucket element registration table of the second set. The second bucket identifier table records whether each hash bucket storing elements of the second set has been filled with elements, and the second bucket element registration table records the elements that have been placed in each of those hash buckets.

[0116] In one embodiment of this disclosure, elements in the second set are filtered out according to the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table. After the execution conditions are not met, a reduced second set is obtained. This includes: filtering out elements in the second set through the following fourth preset step to obtain the filtered second set: performing logical operations on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table one by one to obtain a merged bucket identifier table; determining the elements to be filtered out in the second bucket element registration table according to the merged bucket identifier table, and counting the elements retained in the second bucket element registration table to obtain the filtered second set.

[0117] In one embodiment of this disclosure, determining the elements to be screened out in the first bucket element registration table according to the merged bucket identifier table, and counting the elements retained in the first bucket element registration table to obtain the first set after screening, includes: when an identifier in the merged bucket identifier table is T, determining the elements at the corresponding position in the first bucket element registration table to be the elements to be screened out; and when an identifier in the merged bucket identifier table is F, determining the elements at the corresponding position in the first bucket element registration table to be the elements to be retained.
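The merge-and-filter step can be sketched as follows. The text only says "logical operations", so the use of exclusive OR here is an assumption, chosen because it matches the stated convention that a merged identifier of T marks elements to be screened out: a bucket occupied on only one side yields T, and a bucket occupied on both sides yields F. The function names are hypothetical.

```python
def merge_flag_tables(first_flags, second_flags):
    """Merge two bucket identifier tables position by position.

    Assumed logical operation: XOR. The result is True ("T") where exactly
    one party has elements in the bucket, so no common element can be there.
    """
    return [a != b for a, b in zip(first_flags, second_flags)]


def filter_elements(merged_flags, registration_table):
    """Keep only elements registered in buckets whose merged identifier is F."""
    retained = []
    for flag, bucket in zip(merged_flags, registration_table):
        if not flag:            # F: the bucket may hold common elements, retain it
            retained.extend(bucket)
    return retained


# Usage: bucket 0 is occupied on both sides, bucket 1 only on the second side.
first_flags = [True, False, True, False]
second_flags = [True, True, False, False]
second_table = [["x"], ["y"], [], []]
merged = merge_flag_tables(first_flags, second_flags)
filtered_second = filter_elements(merged, second_table)
```

In this usage example the element `"y"` is dropped because the first party has nothing in bucket 1, while `"x"` is retained for the later PSI protocol process.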

[0118] In one embodiment of this disclosure, filtering out the corresponding elements in the second set according to the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table, and obtaining the reduced second set once the execution conditions are no longer met, includes: determining whether the screened second set meets the execution conditions; if not, using the screened second set as the reduced second set; if so, performing the third preset step and the fourth preset step on the screened second set until a newly screened second set no longer meets the execution conditions, then ending the execution of the third preset step and the fourth preset step and using the newly screened second set as the reduced second set.
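A minimal sketch of the repeated rounds follows, under stated assumptions. The execution conditions are modelled as a maximum number of executions plus a removal-rate threshold (both named in the claims as possible conditions); the per-round hash function is SHA-256 salted with the round index; the merge is XOR; and the exchange with the peer is abstracted as a hypothetical callback `get_peer_flags`. None of these concrete choices is prescribed by the disclosure.

```python
import hashlib


def bucket_of(element, round_index, num_buckets):
    # Assumed hash function sequence and bucketing strategy (see lead-in).
    digest = hashlib.sha256(f"{round_index}:{element}".encode()).digest()
    return int.from_bytes(digest, "big") % num_buckets


def flag_table(elements, round_index, num_buckets):
    flags = [False] * num_buckets
    for element in elements:
        flags[bucket_of(element, round_index, num_buckets)] = True
    return flags


def reduce_set(local_set, get_peer_flags, num_buckets, max_rounds, min_removal_rate):
    """Repeat bucketing and filtering while the execution conditions hold."""
    current = set(local_set)
    rounds_done = 0
    while current and rounds_done < max_rounds:
        local_flags = flag_table(current, rounds_done, num_buckets)
        # Hypothetical exchange: send local flags, receive the peer's flags.
        peer_flags = get_peer_flags(rounds_done, local_flags)
        merged = [a != b for a, b in zip(local_flags, peer_flags)]
        kept = {e for e in current
                if not merged[bucket_of(e, rounds_done, num_buckets)]}
        removal_rate = 1 - len(kept) / len(current)
        current = kept
        rounds_done += 1
        if removal_rate < min_removal_rate:
            break   # too little was removed, further rounds are not worthwhile
    return current
```

Using a different hash function in each round means that elements which collided with a peer bucket by chance in one round are unlikely to collide again, so each additional round can remove more non-intersecting elements, while elements common to both sets are always retained.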

[0119] In one embodiment of this disclosure, the method further includes: receiving an intersection result sent by a first device.

[0120] Based on the same inventive concept, this disclosure also provides a set secure intersection device, as shown in the following embodiment. Since the principle by which this device solves the problem is similar to that of the method embodiment described above, the implementation of this device embodiment can refer to the implementation of the method embodiment described above, and repeated details will not be elaborated further.

[0121] Figure 7 is a schematic diagram of a set secure intersection device according to an embodiment of the present disclosure. As shown in Figure 7, the device can be applied to a first device and includes:

[0122] The first parameter determination module 710 is used to initiate a set secure intersection (PSI) protocol process with the second device and to determine, with the second device, the execution conditions and common parameters of hash bucketing. The execution conditions are the necessary conditions that need to be met to execute hash bucketing, and the common parameters are the necessary parameters for executing hash bucketing.

[0123] The first bucketing processing module 720 is used to perform hash bucketing processing on the locally stored first set according to common parameters to obtain the first bucket identifier table and the first bucket element registration table of the first set.

[0124] The data receiving module 730 is used to send the first bucket identifier table to the second device, and receive the second bucket identifier table of the second set and the reduced second set sent by the second device. The reduced second set is obtained by the second device by filtering elements of the locally stored second set according to the first bucket identifier table, the second bucket identifier table and the second bucket element registration table of the second set.

[0125] The first bucketing processing module 720 is also used to filter out the corresponding elements in the first set according to the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table, and to obtain the reduced first set once the execution conditions are no longer met.

[0126] The intersection result determination module 740 is used to take the reduced first set and the reduced second set as input, execute the PSI protocol process, and obtain the intersection result of the first set and the second set.
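The way modules 710 to 740 fit together can be illustrated with a single-round, single-process simulation. This is a sketch under assumptions, not the device itself: both parties are simulated locally, SHA-256 modulo a fixed `NUM_BUCKETS` stands in for the agreed common parameters, XOR is assumed as the logical operation, and `placeholder_psi` is a plain set intersection that stands in for a real PSI protocol and provides no privacy.

```python
import hashlib

NUM_BUCKETS = 64  # assumed bucket size agreed by both devices (module 710)


def bucketize(elements, num_buckets=NUM_BUCKETS):
    flags = [False] * num_buckets
    table = [[] for _ in range(num_buckets)]
    for e in elements:
        digest = hashlib.sha256(str(e).encode()).digest()
        b = int.from_bytes(digest, "big") % num_buckets
        flags[b] = True
        table[b].append(e)
    return flags, table


def reduce_with_peer_flags(own_flags, peer_flags, own_table):
    # Assumed XOR merge: T means only one side occupies the bucket.
    merged = [a != b for a, b in zip(own_flags, peer_flags)]
    return {e for flag, bucket in zip(merged, own_table) if not flag for e in bucket}


def placeholder_psi(reduced_first, reduced_second):
    # Stand-in only: a real deployment would run an actual PSI protocol here.
    return reduced_first & reduced_second


def run_first_device(first_set, second_set):
    first_flags, first_table = bucketize(first_set)        # module 720
    second_flags, second_table = bucketize(second_set)     # simulated second device
    # Module 730: the flag tables are exchanged and the reduced second set is received.
    reduced_second = reduce_with_peer_flags(second_flags, first_flags, second_table)
    reduced_first = reduce_with_peer_flags(first_flags, second_flags, first_table)  # module 720
    return placeholder_psi(reduced_first, reduced_second)  # module 740
```

Filtering never removes a common element, since a common element occupies the same bucket on both sides, so the result over the reduced sets equals the intersection of the original sets while the PSI stage receives fewer inputs.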

[0127] In one embodiment of this disclosure, the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy. The first bucketing processing module 720 is further configured to obtain the first bucket identifier table and the first bucket element registration table of the first set through the following first preset step: updating the bucketing strategy of each bucket according to the update strategy, creating a bucket identifier table and a bucket element registration table according to the bucket size, setting the bucket identifier table to zero, and setting the bucket element registration table to empty; selecting the target hash function to be used in this round from the hash function sequence, and calculating the hash value of each element in the first set one by one using the target hash function; determining, according to the bucketing strategy, the bucket number of the hash bucket corresponding to the hash value of each element in the first set, and setting the entry at the preset position in the bucket identifier table to T; and, according to the bucket number of the hash bucket corresponding to the hash value of each element in the first set, placing the hash value of each element in the first set into the corresponding hash bucket and updating the bucket identifier table and the bucket element registration table, to obtain the first bucket identifier table and the first bucket element registration table of the first set. The first bucket identifier table is used to record whether each hash bucket storing elements of the first set has had an element placed in it, and the first bucket element registration table is used to record the elements that have been placed in each hash bucket storing elements of the first set.

[0128] In one embodiment of this disclosure, the first bucketing processing module 720 is further configured to filter out elements in the first set through the following second preset step to obtain the filtered first set: performing logical operations, one by one, on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table to obtain a merged bucket identifier table; and determining, according to the merged bucket identifier table, the elements to be filtered out in the first bucket element registration table, and counting the elements retained in the first bucket element registration table to obtain the filtered first set.

[0129] In one embodiment of this disclosure, the first bucketing processing module 720 is further configured to: when an identifier in the merged bucket identifier table is T, determine the elements at the corresponding position in the first bucket element registration table to be the elements to be filtered out; and when an identifier in the merged bucket identifier table is F, determine the elements at the corresponding position in the first bucket element registration table to be the elements to be retained.

[0130] In one embodiment of this disclosure, the first bucketing processing module 720 is further configured to determine whether the first set after screening meets the execution conditions; if not, the first set after screening is used as the reduced first set; if so, the first preset step and the second preset step are executed on the first set after screening until the newly screened first set no longer meets the execution conditions, at which point the execution of the first preset step and the second preset step is terminated and the newly screened first set is used as the reduced first set.

[0131] In one embodiment of this disclosure, the above-mentioned apparatus further includes an intersection result sending module, which is used to send the intersection result of the first set and the second set to the second device.

[0132] Based on the same inventive concept, this disclosure also provides another set secure intersection device, as shown in the following embodiment. Since the principle by which this device embodiment solves the problem is similar to that of the method embodiment described above, the implementation of this device embodiment can refer to the implementation of the method embodiment described above, and repeated details will not be elaborated further.

[0133] Figure 8 is a schematic diagram of another set secure intersection device in an embodiment of this disclosure. As shown in Figure 8, the device can be applied to a second device and includes:

[0134] The second parameter determination module 810 is used to respond to the PSI protocol process initiated by the first device and determine the execution conditions and common parameters of hash bucketing with the first device. The execution conditions are the necessary conditions that need to be met to execute hash bucketing, and the common parameters are the necessary parameters for executing hash bucketing.

[0135] The second bucketing processing module 820 is used to perform hash bucketing processing on the locally stored second set according to common parameters to obtain the second bucket identifier table and the second bucket element registration table of the second set.

[0136] The identifier bit table inter-transmission module 830 is used to send the second bucket identifier table to the first device and to receive the first bucket identifier table of the first set sent by the first device, so that the first device can filter the elements of the locally stored first set according to the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table of the first set to obtain the reduced first set.

[0137] The second bucketing processing module 820 is also used to filter out the corresponding elements in the second set according to the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table, and to obtain the reduced second set once the execution conditions are no longer met.

[0138] The reduced set sending module 840 is used to send the reduced second set to the first device, so that the first device can use the reduced first set and the reduced second set as input to execute the PSI protocol process and obtain the intersection result of the first set and the second set.
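The disclosure does not specify how a bucket identifier table is encoded when module 830 transmits it. As one possible sketch, assuming each identifier is a single T/F bit, the table can be packed eight identifiers per byte before sending and unpacked on receipt. The function names and the least-significant-bit-first layout are assumptions made for illustration.

```python
def pack_flags(flags):
    """Pack a list of T/F identifiers into bytes, least significant bit first."""
    packed = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_flags(packed, num_buckets):
    """Recover the identifier list; num_buckets comes from the common parameters."""
    return [bool((packed[i // 8] >> (i % 8)) & 1) for i in range(num_buckets)]
```

With this layout a table for N buckets costs about N/8 bytes per round, which is small compared with transmitting the set elements themselves.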

[0139] In one embodiment of this disclosure, the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy. The second bucketing processing module 820 is further configured to obtain the second bucket identifier table and the second bucket element registration table of the second set through the following third preset step: updating the bucketing strategy of each bucket according to the update strategy, creating a bucket identifier table and a bucket element registration table according to the bucket size, setting the bucket identifier table to zero, and setting the bucket element registration table to empty; selecting the target hash function to be used in this round from the hash function sequence, and calculating the hash value of each element in the second set one by one using the target hash function; determining, according to the bucketing strategy, the bucket number of the hash bucket corresponding to the hash value of each element in the second set, and setting the entry at the preset position in the bucket identifier table to T; and, according to the bucket number corresponding to the hash value of each element in the second set, placing the hash value of each element in the second set into the corresponding hash bucket and updating the bucket identifier table and the bucket element registration table, to obtain the second bucket identifier table and the second bucket element registration table of the second set. The second bucket identifier table is used to record whether each hash bucket storing elements of the second set has had an element placed in it, and the second bucket element registration table is used to record the elements that have been placed in each hash bucket storing elements of the second set.

[0140] In one embodiment of this disclosure, the second bucketing processing module 820 is further configured to filter out elements in the second set through the following fourth preset step to obtain the filtered second set: performing logical operations, one by one, on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table to obtain a merged bucket identifier table; and determining, according to the merged bucket identifier table, the elements to be filtered out in the second bucket element registration table, and counting the elements retained in the second bucket element registration table to obtain the filtered second set.

[0141] In one embodiment of this disclosure, the second bucketing processing module 820 is further configured to: when an identifier in the merged bucket identifier table is T, determine the elements at the corresponding position in the first bucket element registration table to be the elements to be filtered out; and when an identifier in the merged bucket identifier table is F, determine the elements at the corresponding position in the first bucket element registration table to be the elements to be retained.

[0142] In one embodiment of this disclosure, the second bucketing processing module 820 is further configured to determine whether the second set after filtering meets the execution conditions; if not, the second set after filtering is used as the reduced second set; if so, the third preset step and the fourth preset step are executed on the second set after filtering until the newly filtered second set no longer meets the execution conditions, at which point the execution of the third preset step and the fourth preset step is terminated and the newly filtered second set is used as the reduced second set.

[0143] In one embodiment of this disclosure, the above-described apparatus further includes an intersection result receiving module, which is used to receive the intersection result sent by the first device.

[0144] Those skilled in the art will understand that various aspects of this disclosure can be implemented as systems, methods, or program products. Therefore, various aspects of this disclosure can be specifically implemented in the following forms: entirely in hardware, entirely in software (including firmware, microcode, etc.), or in a combination of hardware and software, collectively referred to herein as “circuit,” “module,” or “system.”

[0145] An electronic device 900 according to such an embodiment of the present disclosure is described below with reference to Figure 9. The electronic device 900 shown in Figure 9 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments disclosed herein.

[0146] As shown in Figure 9, the electronic device 900 takes the form of a general-purpose computing device. The components of the electronic device 900 may include, but are not limited to: at least one processing unit 910, at least one storage unit 920, and a bus 930 connecting different system components (including the storage unit 920 and the processing unit 910).

[0147] The storage unit stores program code, which can be executed by the processing unit 910, causing the processing unit 910 to perform the steps described in the "Exemplary Methods" section of this specification according to various exemplary embodiments of this disclosure. For example, the processing unit 910 can perform the following steps of the above method embodiment: initiating a set secure intersection (PSI) protocol process with the second device, and determining, with the second device, the execution conditions and common parameters of hash bucketing, wherein the execution conditions are the necessary conditions to be met for executing hash bucketing, and the common parameters are the necessary parameters for executing hash bucketing; performing hash bucketing processing on the locally stored first set according to the common parameters to obtain a first bucket identifier table and a first bucket element registration table of the first set; sending the first bucket identifier table to the second device, and receiving the second bucket identifier table of the second set and the reduced second set sent by the second device, wherein the reduced second set is obtained by the second device by filtering elements of the locally stored second set according to the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table of the second set; filtering out the corresponding elements in the first set according to the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table, and obtaining the reduced first set once the execution conditions are no longer met; and, using the reduced first set and the reduced second set as input, executing the PSI protocol process to obtain the intersection result of the first set and the second set.

[0148] Storage unit 920 may include readable media in the form of volatile storage units, such as random access memory (RAM) 9201 and/or cache memory 9202, and may further include read-only memory (ROM) 9203.

[0149] Storage unit 920 may also include a program/utility 9204 having a set of (at least one) program modules 9205. Such program modules 9205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.

[0150] Bus 930 can represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.

[0151] Electronic device 900 can also communicate with one or more external devices 940 (e.g., keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable a user to interact with electronic device 900, and/or with any device that enables electronic device 900 to communicate with one or more other computing devices (e.g., router, modem, etc.). This communication can be performed via input/output (I/O) interface 950. Furthermore, electronic device 900 can also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via network adapter 960. As shown, network adapter 960 communicates with other modules of electronic device 900 via bus 930. It should be understood that, although not shown in the figures, other hardware and/or software modules can be used in conjunction with electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.

[0152] From the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein can be implemented by software or by combining software with necessary hardware. Therefore, the technical solutions according to the embodiments of this disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, external hard drive, etc.) or on a network, including several instructions to cause a computing device (such as a personal computer, server, terminal device, or network device, etc.) to execute the methods according to the embodiments of this disclosure.

[0153] In exemplary embodiments of this disclosure, a computer-readable storage medium is also provided, which may be a readable signal medium or a readable storage medium. A program product capable of implementing the methods described above is stored thereon. In some possible implementations, various aspects of this disclosure may also be implemented as a program product including program code, which, when run on a terminal device, causes the terminal device to perform the steps described in the "Exemplary Methods" section of this specification according to various exemplary embodiments of this disclosure.

[0154] More specific examples of computer-readable storage media in this disclosure may include, but are not limited to: electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

[0155] In this disclosure, a computer-readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such propagated data signals may take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A readable signal medium may also be any readable medium other than a readable storage medium that is capable of sending, propagating, or transmitting a program for use by or in connection with an instruction execution system, apparatus, or device.

[0156] Optionally, the program code contained on the computer-readable storage medium may be transmitted using any suitable medium, including but not limited to wireless, wired, optical fiber, RF, etc., or any suitable combination thereof.

[0157] In practical implementation, program code for performing the operations of this disclosure can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as C or similar languages. The program code can execute entirely on the user's computing device, partially on the user's device, as a standalone software package, partially on the user's computing device and partially on a remote computing device, or entirely on a remote computing device or server. In cases involving remote computing devices, the remote computing device can be connected to the user's computing device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (e.g., via the Internet using an Internet service provider).

[0158] It should be noted that although several modules or units for the device used to perform actions have been mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of this disclosure, the features and functions of two or more modules or units described above can be embodied in one module or unit. Conversely, the features and functions of one module or unit described above can be further divided and embodied by multiple modules or units.

[0159] Furthermore, although the steps of the method in this disclosure are described in a specific order in the accompanying drawings, this does not require or imply that the steps must be performed in that specific order, or that all the steps shown must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or a step may be broken down into multiple steps.

[0160] From the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein can be implemented by software or by combining software with necessary hardware. Therefore, the technical solutions according to the embodiments of this disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, external hard drive, etc.) or on a network, including several instructions to cause a computing device (such as a personal computer, server, mobile terminal, or network device, etc.) to execute the methods according to the embodiments of this disclosure.

[0161] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the appended claims.

Claims

1. A set-safe intersection method, characterized in that it is applied to a first device and includes: initiating a set secure intersection (PSI) protocol process with a second device, and determining, with the second device, the execution conditions and common parameters of hash bucketing, wherein the execution conditions are the necessary conditions that need to be met to execute hash bucketing, the common parameters are the necessary parameters for executing hash bucketing, and the execution conditions include at least one of the following: a maximum number of executions, a set element removal rate threshold, and a set element removal amount threshold; performing hash bucketing on the locally stored first set according to the common parameters to obtain the first bucket identifier table and the first bucket element registration table of the first set; sending the first bucket identifier table to the second device, and receiving from the second device the second bucket identifier table of the second set and the reduced second set, wherein the reduced second set is obtained by the second device by filtering elements of the locally stored second set according to the first bucket identifier table, the second bucket identifier table, and the second bucket element registration table of the second set; filtering out elements in the first set through the following second preset step to obtain the filtered first set: performing logical operations on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table to obtain a merged bucket identifier table; determining, according to the merged bucket identifier table, the elements to be filtered out in the first bucket element registration table, counting the elements retained in the first bucket element registration table to obtain the filtered first set, and determining the reduced first set based on the filtered first set; and, using the reduced first set and the reduced second set as input, executing the PSI protocol process to obtain the intersection result of the first set and the second set.

2. The set-safe intersection method according to claim 1, characterized in that the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy; and performing hash bucketing on the locally stored first set according to the common parameters to obtain the first bucket identifier table and the first bucket element registration table of the first set includes obtaining the first bucket identifier table and the first bucket element registration table of the first set through the following first preset step: updating the bucketing strategy of each bucket according to the update strategy, creating a bucket identifier table and a bucket element registration table according to the bucket size, setting the bucket identifier table to zero, and setting the bucket element registration table to empty; selecting the target hash function to be used in this round from the hash function sequence, and using the target hash function to calculate the hash value of each element in the first set one by one; determining, according to the bucketing strategy, the bucket number of the hash bucket corresponding to the hash value of each element in the first set, and setting the entry at the preset position in the bucket identifier table to T; and, according to the bucket number of the hash bucket corresponding to the hash value of each element in the first set, placing the hash value of each element in the first set into the corresponding hash bucket and updating the bucket identifier table and the bucket element registration table, to obtain the first bucket identifier table and the first bucket element registration table of the first set; wherein the first bucket identifier table is used to record whether each hash bucket storing elements of the first set has had an element placed in it, and the first bucket element registration table is used to record the elements that have been placed in each hash bucket storing elements of the first set.

3. The set-safe intersection method according to claim 1, characterized in that determining, according to the merged bucket identifier table, the elements to be filtered out in the first bucket element registration table, and counting the elements retained in the first bucket element registration table to obtain the filtered first set, includes: when an identifier in the merged bucket identifier table is T, determining the elements at the corresponding position in the first bucket element registration table to be the elements to be filtered out; and when an identifier in the merged bucket identifier table is F, determining the elements at the corresponding position in the first bucket element registration table to be the elements to be retained.

4. The set-safe intersection method according to claim 2, characterized in that determining the reduced first set based on the filtered first set includes: determining whether the filtered first set meets the execution conditions; if not, using the filtered first set as the reduced first set; if so, executing the first preset step and the second preset step on the filtered first set until the newly filtered first set no longer meets the execution conditions, then ending the execution of the first preset step and the second preset step and using the newly filtered first set as the reduced first set.

5. The set-safe intersection method according to claim 1, characterized in that the method further includes: sending the intersection result of the first set and the second set to the second device.

6. A set-safe intersection method, characterized in that it is applied to a second device and includes: in response to the PSI protocol process initiated by the first device, determining, with the first device, the execution conditions and common parameters of hash bucketing, wherein the execution conditions are the necessary conditions that need to be met to execute hash bucketing, the common parameters are the necessary parameters for executing hash bucketing, and the execution conditions include at least one of the following: a maximum number of executions, a set element removal rate threshold, and a set element removal amount threshold; performing hash bucketing on the locally stored second set according to the common parameters to obtain the second bucket identifier table and the second bucket element registration table of the second set; sending the second bucket identifier table to the first device, and receiving from the first device the first bucket identifier table of the first set, so that the first device can filter the elements of the locally stored first set according to the first bucket identifier table, the second bucket identifier table, and the first bucket element registration table of the first set to obtain the reduced first set; filtering out elements in the second set through the following fourth preset step to obtain the filtered second set: performing logical operations, one by one, on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table to obtain a merged bucket identifier table; determining, according to the merged bucket identifier table, the elements to be filtered out in the second bucket element registration table, counting the elements retained in the second bucket element registration table to obtain the filtered second set, and determining the reduced second set based on the filtered second set; and sending the reduced second set to the first device, so that the first device can use the reduced first set and the reduced second set as input to execute the PSI protocol process and obtain the intersection result of the first set and the second set.

7. The set-safe intersection method according to claim 6, characterized in that the common parameters include: a hash function sequence, a bucketing strategy, a bucket size, and an update strategy;
performing, based on the common parameters, hash bucketing on the locally stored second set to obtain the second bucket identifier table and the second bucket element registration table of the second set comprises obtaining the two tables through the following third preset step:
updating the bucketing strategy of each bucket according to the update strategy, creating a bucket identifier table and a bucket element registration table according to the bucket size, setting the bucket identifier table to zero, and setting the bucket element registration table to empty;
selecting, from the hash function sequence, the target hash function to be used in the current round, and using the target hash function to calculate the hash value of each element in the second set one by one;
determining, according to the bucketing strategy, the sequence number of the hash bucket corresponding to the hash value of each element in the second set, and setting the preset position in the bucket identifier table corresponding to that sequence number to T;
placing, based on the sequence number of the hash bucket corresponding to the hash value of each element in the second set, the hash value of each element into the corresponding hash bucket, and updating the bucket identifier table and the bucket element registration table to obtain the second bucket identifier table and the second bucket element registration table of the second set;
wherein the second bucket identifier table records whether each hash bucket storing elements of the second set has been filled with elements, and the second bucket element registration table records the elements that have been filled into each hash bucket storing elements of the second set.
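As a non-limiting illustration, one round of the third preset step may be sketched as follows. The claim leaves the hash function sequence and the bucketing strategy open; this sketch assumes SHA-256 salted with a per-round value as the target hash function and a modulo over the bucket count as the bucketing strategy. For readability the registration table stores the elements themselves rather than their hash values. The name `hash_bucket` and its parameters are hypothetical.

```python
import hashlib


def hash_bucket(elements, bucket_count, salt):
    """Return (identifier_table, registration_table) for one round.

    identifier_table[i] is True ("T") once bucket i has received an element;
    registration_table[i] lists the elements placed into bucket i.
    """
    identifier_table = [False] * bucket_count                # set to zero
    registration_table = [[] for _ in range(bucket_count)]   # set to empty
    for element in elements:
        # Assumed target hash function: salted SHA-256.
        digest = hashlib.sha256(salt + element.encode("utf-8")).digest()
        # Assumed bucketing strategy: first 8 bytes modulo the bucket count.
        index = int.from_bytes(digest[:8], "big") % bucket_count
        identifier_table[index] = True
        registration_table[index].append(element)
    return identifier_table, registration_table


ids, registration = hash_bucket(["alice", "bob", "carol"], 8, b"round-0")
```

Both devices would need to use the same salt and bucket count in a given round, so that equal elements land in the same bucket on each side.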

8. The set-safe intersection method according to claim 7, characterized in that determining, according to the merged bucket identifier table, the elements to be filtered out of the first bucket element registration table, and collecting the elements retained in the first bucket element registration table to obtain the filtered first set, comprises:
when an identifier bit in the merged bucket identifier table is T, determining that the elements at the corresponding sequence number in the first bucket element registration table are elements to be filtered out;
when an identifier bit in the merged bucket identifier table is F, determining that the elements at the corresponding sequence number in the first bucket element registration table are elements to be retained.

9. The set-safe intersection method according to claim 7, characterized in that determining the reduced second set based on the filtered second set comprises:
determining whether the filtered second set meets the execution conditions;
if not, taking the filtered second set as the reduced second set;
if so, performing the third preset step and the fourth preset step on the filtered second set repeatedly until the newly filtered second set no longer meets the execution conditions, then ending the execution of the third preset step and the fourth preset step and taking the newly filtered second set as the reduced second set.
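As a non-limiting illustration, the repeated execution described in this claim may be sketched as follows. Several points are assumptions rather than statements of the claimed method: the execution conditions are modelled as a maximum number of rounds together with a minimum removal rate for the most recent round, the per-round hash function is salted SHA-256, the logical operation is XOR, and the exchange of identifier tables is replaced by a hypothetical callback `get_peer_ids`.

```python
import hashlib


def bucket(elements, bucket_count, salt):
    """One round of hash bucketing (salted SHA-256, modulo bucket count)."""
    ids = [False] * bucket_count
    registration = [[] for _ in range(bucket_count)]
    for element in elements:
        digest = hashlib.sha256(salt + element.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % bucket_count
        ids[index] = True
        registration[index].append(element)
    return ids, registration


def reduce_set(own_set, get_peer_ids, bucket_count, max_rounds, min_removal_rate):
    """Repeat bucketing and filtering while the assumed conditions hold."""
    current = list(own_set)
    for round_no in range(max_rounds):            # maximum number of executions
        if not current:
            break
        salt = f"round-{round_no}".encode("utf-8")
        own_ids, own_reg = bucket(current, bucket_count, salt)
        peer_ids = get_peer_ids(salt)             # stands in for the table exchange
        # XOR merge: a bucket is retained when both identifier bits agree.
        retained = [e for own, peer, b in zip(own_ids, peer_ids, own_reg)
                    if own == peer for e in b]
        removal_rate = 1 - len(retained) / len(current)
        current = retained
        if removal_rate < min_removal_rate:       # removal rate threshold not reached
            break
    return current


# Toy peer that always answers with the identifier table of a fixed set.
peer_set = ["carol", "dave", "erin"]
reduced = reduce_set(["alice", "bob", "carol", "dave"],
                     lambda salt: bucket(peer_set, 16, salt)[0],
                     bucket_count=16, max_rounds=3, min_removal_rate=0.1)
```

Elements common to both sets always fall into a bucket that is occupied on both sides, so they survive every round; only elements that cannot be in the intersection are removed.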

10. The set-safe intersection method according to claim 6, characterized in that the method further comprises: receiving the intersection result sent by the first device.

11. A set-safe intersection device, characterized in that it is applied to a first device and comprises:
a first parameter determination module, configured to initiate a set-safe intersection (PSI, i.e. private set intersection) protocol process with a second device, and to determine, with the second device, execution conditions and common parameters for hash bucketing, wherein the execution conditions are the conditions that must be met for hash bucketing to be executed, the common parameters are the parameters required to execute hash bucketing, and the execution conditions include at least one of the following: a maximum number of executions, a set element removal rate threshold, and a set element removal amount threshold;
a first bucketing processing module, configured to perform hash bucketing on a locally stored first set according to the common parameters to obtain a first bucket identifier table and a first bucket element registration table of the first set;
a data receiving module, configured to send the first bucket identifier table to the second device, and to receive a second bucket identifier table of a second set and a reduced second set sent by the second device, wherein the reduced second set is obtained by the second device by filtering the elements of the locally stored second set according to the first bucket identifier table, the second bucket identifier table and a second bucket element registration table of the second set;
the first bucketing processing module being further configured to filter the elements of the first set through the following second preset step to obtain a filtered first set: performing a logical operation on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table one by one to obtain a merged bucket identifier table; determining, according to the merged bucket identifier table, the elements to be filtered out of the first bucket element registration table, and collecting the elements retained in the first bucket element registration table to obtain the filtered first set; and determining a reduced first set based on the filtered first set;
an intersection result determination module, configured to take the reduced first set and the reduced second set as input, execute the PSI protocol process, and obtain the intersection result of the first set and the second set.
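As a non-limiting illustration, the cooperation between the two devices may be simulated in a single process as follows. This is a sketch under stated assumptions, not the claimed implementation: it runs a single bucketing round, uses salted SHA-256 and an XOR merge, and replaces the PSI protocol process with a plain Python set intersection, which offers no privacy protection and serves only to show where the reduced sets are consumed. All function names are hypothetical.

```python
import hashlib


def bucket(elements, bucket_count, salt):
    """One round of hash bucketing (salted SHA-256, modulo bucket count)."""
    ids = [False] * bucket_count
    registration = [[] for _ in range(bucket_count)]
    for element in elements:
        digest = hashlib.sha256(salt + element.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % bucket_count
        ids[index] = True
        registration[index].append(element)
    return ids, registration


def retain(own_ids, peer_ids, own_registration):
    """XOR merge followed by filtering: keep buckets whose bits agree."""
    return [e for own, peer, b in zip(own_ids, peer_ids, own_registration)
            if own == peer for e in b]


def run_once(first_set, second_set, bucket_count=16, salt=b"demo"):
    # Each device buckets its own set with the agreed common parameters.
    first_ids, first_reg = bucket(first_set, bucket_count, salt)
    second_ids, second_reg = bucket(second_set, bucket_count, salt)
    # The devices exchange identifier tables and filter locally.
    reduced_first = retain(first_ids, second_ids, first_reg)
    reduced_second = retain(second_ids, first_ids, second_reg)
    # Placeholder for the PSI protocol process: NOT a secure intersection.
    result = sorted(set(reduced_first) & set(reduced_second))
    return reduced_first, reduced_second, result


reduced_first, reduced_second, result = run_once(
    ["alice", "bob", "carol"], ["carol", "dave"])
```

The reduced sets are never larger than the original sets, and the intersection computed from them equals the intersection of the original sets, because a common element always lands in a bucket occupied on both sides.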

12. A set-safe intersection device, characterized in that it is applied to a second device and comprises:
a second parameter determination module, configured to respond to a PSI protocol process initiated by a first device, and to determine, with the first device, execution conditions and common parameters for hash bucketing, wherein the execution conditions are the conditions that must be met for hash bucketing to be executed, the common parameters are the parameters required to execute hash bucketing, and the execution conditions include at least one of the following: a maximum number of executions, a set element removal rate threshold, and a set element removal amount threshold;
a second bucketing processing module, configured to perform hash bucketing on a locally stored second set according to the common parameters to obtain a second bucket identifier table and a second bucket element registration table of the second set;
an identifier table exchange module, configured to send the second bucket identifier table to the first device and to receive a first bucket identifier table of a first set sent by the first device, so that the first device filters the elements of the locally stored first set according to the first bucket identifier table, the second bucket identifier table and a first bucket element registration table of the first set to obtain a reduced first set;
the second bucketing processing module being further configured to filter the elements of the second set through the following fourth preset step to obtain a filtered second set: performing a logical operation on the corresponding identifiers in the first bucket identifier table and the second bucket identifier table one by one to obtain a merged bucket identifier table; determining, according to the merged bucket identifier table, the elements to be filtered out of the second bucket element registration table, and collecting the elements retained in the second bucket element registration table to obtain the filtered second set; and determining a reduced second set based on the filtered second set;
a reduced set sending module, configured to send the reduced second set to the first device, so that the first device takes the reduced first set and the reduced second set as input, executes the PSI protocol process, and obtains the intersection result of the first set and the second set.

13. An electronic device, characterized in that it comprises: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the set-safe intersection method of any one of claims 1 to 10 by executing the executable instructions.

14. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the set-safe intersection method according to any one of claims 1 to 10.