A searchable public key encryption method supporting semantic query simultaneously

By semantically expanding and vectorizing the query keywords, and combining this with predicate encryption, the problem of semantic information not being considered in existing technologies is solved, thus achieving more accurate encrypted retrieval.

CN114969795B · Active · Publication Date: 2026-06-16 · XINYANG NORMAL UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
XINYANG NORMAL UNIVERSITY
Filing Date
2022-06-21
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing searchable public-key encryption schemes fail to adequately consider the semantic information of query keywords, causing search results to deviate from user intent and affecting search accuracy and user experience.

Method used

By semantically expanding the query keywords, a query expansion term set is generated. The document keywords and the query expansion term set are then transformed into coefficient vectors and root vectors, which are then encrypted using a predicate encryption method. The matching relationship in the encrypted state is then calculated.

Benefits of technology

It improves the accuracy of encrypted retrieval, making the search results closer to the user's query intent and enhancing the user experience.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN114969795B_ABST
Patent Text Reader

Abstract

The application relates to the technical field of data encryption, and in particular discloses a searchable public key encryption method that simultaneously supports semantic queries, comprising the following steps: performing semantic expansion on the query keyword set Q input during a search to obtain a query expansion term set; converting the document keyword set W in the dataset and the query expansion term set into a coefficient vector and a root vector, respectively; generating a public key and a private key for the dataset; and, using a predicate encryption method with the generated public key and private key, encrypting the coefficient vector of the document keyword set W and the root vector of the query expansion term set, respectively, to obtain the encrypted index $I_W$ of the document and the encrypted trapdoor $T_Q$ of the query. The application semantically expands user query information while guaranteeing data security, and addresses the accuracy problem in the field of ciphertext retrieval.

Description

Technical Field

[0001] The present invention relates to the technical field of data encryption, and in particular to a searchable public key encryption method that simultaneously supports semantic queries.

Background Art

[0002] In the era of big data, the utilization and mining of data are the main themes. Data retrieval is one of the common strategies for utilizing and mining data, so data security technologies must be introduced to protect the retrieval process. Since data is usually stored and utilized in plaintext, data encryption is one of the common means of protection. However, traditional encryption schemes often destroy the original structure of the data, making it difficult for the cloud to index user data, which hinders users from retrieving their cloud data. Downloading all ciphertext data, decrypting it locally, and then searching the plaintext incurs huge transmission, storage, and computing overheads. Therefore, traditional encryption schemes alone can no longer meet the ciphertext retrieval requirements of today's big data, and new technologies are urgently needed.

[0003] Searchable encryption is one of the effective technical means of solving the ciphertext retrieval problem. It provides a secure indexing mechanism for encrypted data and searches the ciphertext directly using a secure trapdoor, thereby realizing secure ciphertext retrieval. Searchable public key encryption allows multiple data owners to encrypt data using a public key, while data users securely retrieve data using a private key, so it is naturally applicable to the one-to-many application scenario. In recent years, searchable public key encryption has received a lot of attention. However, most existing searchable public key encryption schemes that support keyword queries are based on exact-match information retrieval and do not fully exploit the semantic relationship between query terms and document terms. Specifically, assume that the index keyword set is $W = \{w_1, w_2, \dots, w_t\}$ and the query keyword set is $Q = \{q_1, q_2, \dots, q_m\}$, where $m < t$. Existing schemes solve the problem of determining, in the encrypted state, whether Q is a subset of W. In many cases, however, users lack the relevant domain knowledge, so the submitted keywords cannot accurately and comprehensively express their actual retrieval intention, and the retrieval results are incomplete. For example, when a user wants to find documents about tomatoes and the query contains only the word "tomato", documents that use a synonym of "tomato" instead cannot be retrieved. This causes two major defects: the retrieval results satisfy the user's request only literally, while the actual content often deviates from the user's needs; and the exact-match retrieval mode lacks an understanding of the user's query intention, so the query results fall short of the user's expectations.

[0004] The reason for this deficiency is that current methods merely extract keywords from user queries and document text to construct encryption trapdoors and secure indexes, without fully considering the semantic information of these keywords. Only by fully considering the semantic information of query keywords and elevating keyword-level retrieval to semantic-level retrieval can the query process be transformed from exact matching to semantic matching, thereby significantly improving retrieval results.

[0005] As described above, current public key encryption schemes that support keyword queries do not fully consider the semantic information of the query keywords, resulting in search results that do not closely match the user's query intentions and greatly affecting the user's query experience.

Summary of the Invention

[0006] The purpose of this invention is to provide a searchable public key encryption method that simultaneously supports semantic queries, thereby semantically expanding user query information while ensuring data security and solving the accuracy problem in the field of encrypted retrieval.

[0007] This invention provides a searchable public-key encryption method that simultaneously supports semantic queries, comprising the following steps:

[0008] The query keyword set Q entered during the search is semantically expanded to obtain the query expansion term set;

[0009] The document keyword set W in the dataset to be retrieved and the query expansion term set are converted into a coefficient vector and a root vector, respectively;

[0010] Generate public and private keys for the dataset;

[0011] Based on the generated public key, the coefficient vector of the document keyword set W is encrypted using a predicate encryption method to obtain the encrypted index $I_W$ of the document;

[0012] Based on the generated private key, the root vector of the query expansion term set is encrypted using the predicate encryption method to obtain the encrypted trapdoor $T_Q$ of the query.

[0013] Furthermore, semantically expanding the query keyword set Q input during retrieval to obtain the query expansion term set includes the following steps:

[0014] Given the query keyword set $Q = \{q_1, q_2, \dots, q_m\}$, each word $q_i$ in Q, $i \in [1, m]$, is looked up in the synonym list; the set of similar words of $q_i$ is $\{q_{i1}, q_{i2}, \dots, q_{in}\}$;

[0015] Calculate the mutual information $I(q_i, q_{ij})$ between the word $q_i$ and each similar word $q_{ij}$ in its set of similar words, $j \in [1, n]$, using the following formula:

[0016]

[0017] Where $p(q_i)$ denotes the number of documents in the dataset that contain the word $q_i$; $p(q_{ij})$ denotes the number of documents in the dataset that contain the word $q_{ij}$;

[0018] $p(q_i)p(q_{ij})$ denotes the number of documents in the dataset that contain both the word $q_i$ and the word $q_{ij}$;

[0019] The mutual information between the word $q_i$ and the similar word $q_{ij}$ is normalized using the following formula:

[0020]

[0021] Where $I_{\max i}$ is the maximum value in the set $\{I(q_i, q_{i1}), I(q_i, q_{i2}), \dots, I(q_i, q_{in})\}$;

[0022] Calculate the scarcity score of the similar word $q_{ij}$ of the word $q_i$ using the following formula:

[0023]

[0024] Where D is the number of documents in the dataset;

[0025] Calculate the final relevance score of the similar word $q_{ij}$ of the word $q_i$ using the following formula:

[0026] $\mathrm{Score}(q_i, q_{ij}) = \alpha \cdot S(q_i, q_{ij}) + \beta \cdot I(q_i, q_{ij})$ (4)

[0027] where α and β are user-defined weights;

[0028] With a user-set threshold score γ: when $\mathrm{Score}(q_i, q_{ij}) > \gamma$, the similar word $q_{ij}$ is added to the query expansion term set; when $\mathrm{Score}(q_i, q_{ij}) \le \gamma$, the similar word $q_{ij}$ is discarded;

[0029] After all the words in the query keyword set Q have been expanded and screened, the generated query expansion term set is used as the final query keyword set.

[0030] Furthermore, transforming the document keyword set W in the dataset to be retrieved and the query expansion term set into a coefficient vector and a root vector, respectively, includes the following steps:

[0031] Given the document keyword set $W = \{w_1, w_2, \dots, w_t\}$, where $n < t$, the relationship between the roots and coefficients of a univariate equation of degree t is used to transform the document keyword set W into a coefficient vector and the query expansion term set into a root vector;

[0032] For the document keyword set W, construct the function f(x),

[0033] $f(x) = (x - w_1)(x - w_2) \cdots (x - w_t) = a_t x^t + a_{t-1} x^{t-1} + \dots + a_0$ (5)

[0034] When f(x) = 0, the roots of the function f(x) are $w_1, w_2, \dots, w_t$; the document keyword set W is thus transformed into the coefficient vector whose elements are the coefficients $a_i$, $i \in [0, t]$;

[0035] For the query expansion term set, construct the root vector,

[0036] where,

[0037]

[0038] where $q_j^{\,i}$ is the i-th power of $q_j$, with $i \in [0, t]$ and $j \in [1, n]$.

[0039] Furthermore, the method also includes determining, from the inner product of the coefficient vector and the root vector, the matching relationship between the index of the document keyword set W and the query of the query expansion term set;

[0040] When every term of the query expansion term set belongs to the document keyword set W, the inner product of the root vector of the query expansion term set and the coefficient vector of the document keyword set W is zero, and the index of the document keyword set W and the query of the query expansion term set match.
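The matching condition can be motivated by a short derivation. The following is only a sketch: it assumes that the coefficient vector is $\vec{a} = (a_0, \dots, a_t)$ and that the root vector $\vec{b}$ has elements $b_i = \sum_{j=1}^{n} q_j^{\,i}$. The exact form of the root vector is given by a formula that is not reproduced in this text, so the symbols $\vec{a}$ and $\vec{b}$ and this power-sum form are assumptions made for illustration.

```latex
\begin{align*}
\langle \vec{a}, \vec{b} \rangle
  &= \sum_{i=0}^{t} a_i b_i
   = \sum_{i=0}^{t} a_i \sum_{j=1}^{n} q_j^{\,i} \\
  &= \sum_{j=1}^{n} \sum_{i=0}^{t} a_i q_j^{\,i}
   = \sum_{j=1}^{n} f(q_j).
\end{align*}
```

If every $q_j$ is one of the roots $w_1, \dots, w_t$ of f, each $f(q_j)$ is zero and the inner product vanishes. As a small instance with $t = 2$: $f(x) = (x - w_1)(x - w_2) = x^2 - (w_1 + w_2)x + w_1 w_2$, so $(a_0, a_1, a_2) = (w_1 w_2, -(w_1 + w_2), 1)$.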

[0041] Furthermore, the generation of public and private keys for the dataset includes:

[0042] Choose two cyclic groups $G_1$ and $G_2$ of order q, and a bilinear pairing $G_1 \times G_1 \to G_2$;

[0043] Randomly select a generator g of the cyclic group $G_1$ and t+2 random numbers. The t+2 random numbers form the set $\{\gamma_0, \gamma_1, \dots, \gamma_t, \beta\}$, which is the private key, $SK = \{\gamma_0, \gamma_1, \dots, \gamma_t, \beta\}$, where $\gamma_0, \gamma_1, \dots, \gamma_t, \beta$ are random integers;

[0044] Calculate $\{U_0, U_1, \dots, U_t, V, X\}$, where:

[0045]

[0046]

[0047] ······

[0048]

[0049] $V = g^{\beta}$ (11)

[0050]

[0051] Where $U_0, \dots, U_t, V, X$ constitute the public key, $PK = \{U_0, U_1, \dots, U_t, V, X\}$.

[0052] Furthermore, encrypting the coefficient vector of the document keyword set W using a predicate encryption method based on the generated public key, to obtain the encrypted index $I_W$ of the document, includes:

[0053] Choose a random number s;

[0054] Calculate $C_i$, $C_\beta$, and $C_X$ respectively, where,

[0055]

[0056] $C_\beta = V^s = g^{s\beta}$ (14)

[0057]

[0058] Where $i \in [0, t]$, and $a_i$ is an element of the coefficient vector;

[0059] The set $\{C_0, C_1, \dots, C_t, C_\beta, C_X\}$ constitutes the encrypted index $I_W$ of the document keyword set W;

[0060] The encrypted index $I_W$ of the document keyword set W is then

[0061] $I_W = \{C_0, C_1, \dots, C_t, C_\beta, C_X\}$ (16).

[0062] Furthermore, encrypting the root vector of the query expansion term set using the predicate encryption method based on the generated private key, to obtain the encrypted trapdoor $T_Q$ of the query, includes:

[0063] Choose a random number u;

[0064] Calculate $T_i$ and $T_u$ respectively,

[0065]

[0066]

[0067] Where $i \in [0, t]$; $j \in [0, t]$;

[0068] $b_i$ is an element of the root vector;

[0069] The set $\{T_0, T_1, \dots, T_t, T_u\}$ constitutes the encrypted trapdoor $T_Q$ of the query expansion term set;

[0070] The encrypted trapdoor $T_Q$ of the query expansion term set is then:

[0071] $T_Q = \{T_0, T_1, \dots, T_t, T_u\}$ (19).

[0072] Furthermore, the method also includes determining the matching relationship between the document keyword set W and the query expansion term set in the encrypted state, which specifically includes:

[0073] Based on the encrypted index $I_W$ of the document keyword set W and the encrypted trapdoor $T_Q$ of the query expansion term set, calculate $A_4$ using the following formula:

[0074] $A_4 = A_2 \times A_3$ (20)

[0075] where,

[0076]

[0077]

[0078] where,

[0079]

[0080] Where $i \in [0, t]$;

[0081] $A_2$, $A_3$, and $A_4$ are intermediate values calculated during the judgment process;

[0082] When $A_4 = C_X$, the document keyword set W and the query expansion term set match;

[0083] When $A_4 \ne C_X$, the document keyword set W and the query expansion term set do not match.

[0084] Compared with the prior art, the beneficial effects of the present invention are as follows:

[0085] This invention proposes a searchable public-key encryption method that simultaneously supports semantic queries. First, the query keywords are semantically expanded. Then, the expanded query keywords and the original document keywords are vectorized. Next, a predicate encryption method is used to encrypt the document and query vectors. Finally, the matching relationship between the document and the query is calculated in the encrypted state. This invention semantically expands user query information while ensuring data security, solving the accuracy problem in the field of encrypted retrieval. Compared with existing encryption methods, the searchable public-key encryption method of this invention achieves significantly improved query accuracy and comes closer to the user's query intent.

Attached Figure Description

[0086] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:

[0087] Figure 1 is a flowchart of the searchable public key encryption method that simultaneously supports semantic queries, as proposed in this invention.

Detailed Implementation

[0088] The technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. However, it should be understood that the scope of protection of the present invention is not limited to the specific implementation.

[0089] Example

[0090] As shown in Figure 1, this invention proposes a searchable public key encryption method that simultaneously supports semantic queries. While ensuring data security, it semantically expands the user's query information, thus solving the accuracy problem in the field of encrypted retrieval.

[0091] To achieve semantic retrieval of encrypted text, this invention first semantically expands the query keywords, then vectorizes the expanded query keywords and the original document keywords. Next, it encrypts the document and query vectors using a predicate encryption method, and finally calculates the matching relationship between the document and the query in the encrypted state. The specific steps are as follows:

[0092] Step 1: Semantic expansion of the query keywords.

[0093] The query keyword set Q entered during the search is semantically expanded to obtain the query expansion term set. Specifically, the following steps are included:

[0094] Given the query keyword set $Q = \{q_1, q_2, \dots, q_m\}$, and so that the expanded query terms reflect the user's query intent more accurately, each word $q_i$ in Q, $i \in [1, m]$, is looked up in the synonym list; the set of similar words of $q_i$ is $\{q_{i1}, q_{i2}, \dots, q_{in}\}$;

[0095] Step 1.1: Calculate the mutual information $I(q_i, q_{ij})$ between the word $q_i$ and each similar word $q_{ij}$ in its set of similar words, $j \in [1, n]$, using the following formula:

[0096]

[0097] Where $p(q_i)$ denotes the number of documents in the dataset that contain the word $q_i$; $p(q_{ij})$ denotes the number of documents in the dataset that contain the word $q_{ij}$;

[0098] $p(q_i)p(q_{ij})$ denotes the number of documents in the dataset that contain both the word $q_i$ and the word $q_{ij}$;

[0099] Step 1.2: Normalize the mutual information between the word $q_i$ and the similar word $q_{ij}$ using the following formula:

[0100]

[0101] Where $I_{\max i}$ is the maximum value in the set $\{I(q_i, q_{i1}), I(q_i, q_{i2}), \dots, I(q_i, q_{in})\}$;

[0102] Step 1.3: Calculate the scarcity score of the similar word $q_{ij}$ of the word $q_i$ using the following formula:

[0103]

[0104] Where D is the number of documents in the dataset;

[0105] Step 1.4: Calculate the final relevance score of the similar word $q_{ij}$ of the word $q_i$ using the following formula:

[0106] $\mathrm{Score}(q_i, q_{ij}) = \alpha \cdot S(q_i, q_{ij}) + \beta \cdot I(q_i, q_{ij})$ (4)

[0107] Where α and β are user-defined weights;

[0108] With a user-set threshold score γ: when $\mathrm{Score}(q_i, q_{ij}) > \gamma$, the similar word $q_{ij}$ is added to the query expansion term set; when $\mathrm{Score}(q_i, q_{ij}) \le \gamma$, the similar word $q_{ij}$ is discarded;

[0109] Once all words in the query keyword set Q have been expanded and filtered, the resulting query expansion term set is used as the final set of query keywords.
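The filtering in Step 1.4 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `expand_query`, the synonym table, and the numeric scores are all hypothetical. Because the formulas for mutual information, normalization, and scarcity are not reproduced in this text, the sketch takes the two score terms as caller-supplied functions and only applies formula (4) and the threshold γ. It also assumes that the original query terms remain in the expanded set.

```python
from typing import Callable, Dict, List

Scorer = Callable[[str, str], float]


def expand_query(
    query: List[str],
    synonyms: Dict[str, List[str]],
    scarcity: Scorer,
    mutual_info: Scorer,
    alpha: float,
    beta: float,
    gamma: float,
) -> List[str]:
    """Return the query terms plus every similar word whose score exceeds gamma."""
    expanded = list(query)  # assumption: original terms stay in the expanded set
    for q in query:
        for candidate in synonyms.get(q, []):
            # Formula (4): Score = alpha * S + beta * I
            score = alpha * scarcity(q, candidate) + beta * mutual_info(q, candidate)
            if score > gamma and candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Hypothetical precomputed values standing in for the scarcity and
# mutual-information terms of formula (4).
S = {("tomato", "love apple"): 0.9, ("tomato", "nightshade"): 0.7}
I = {("tomato", "love apple"): 0.8, ("tomato", "nightshade"): 0.2}

result = expand_query(
    ["tomato"],
    {"tomato": ["love apple", "nightshade"]},
    scarcity=lambda q, c: S[(q, c)],
    mutual_info=lambda q, c: I[(q, c)],
    alpha=0.5,
    beta=0.5,
    gamma=0.6,
)
print(result)  # ['tomato', 'love apple']
```

With these illustrative numbers, "love apple" scores 0.85 and is kept, while "nightshade" scores 0.45 and is discarded. Raising γ makes the expansion more conservative; lowering it admits more loosely related words.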

[0110] Step 2: Vectorized representation of documents and query keywords.

[0111] The document keyword set W in the dataset to be retrieved and the query expansion term set are vectorized: the document keyword set W is transformed into a coefficient vector, and the query expansion term set is transformed into a root vector. This includes the following steps:

[0112] Given the document keyword set $W = \{w_1, w_2, \dots, w_t\}$, where $n < t$, the relationship between the roots and coefficients of a univariate equation of degree t is used to transform the document keyword set W into a coefficient vector and the query expansion term set into a root vector;

[0113] For the document keyword set W, construct the function f(x),

[0114] $f(x) = (x - w_1)(x - w_2) \cdots (x - w_t) = a_t x^t + a_{t-1} x^{t-1} + \dots + a_0$ (5)

[0115] When f(x) = 0, the roots of the function f(x) are $w_1, w_2, \dots, w_t$; the document keyword set W is thus transformed into the coefficient vector whose elements are the coefficients $a_i$, $i \in [0, t]$;

[0116] For the query expansion term set, construct the root vector,

[0117] where,

[0118]

[0119] where $q_j^{\,i}$ is the i-th power of $q_j$, with $i \in [0, t]$ and $j \in [1, n]$.

[0120] The matching relationship between the index of the document keyword set W and the query of the query expansion term set is determined from the inner product of the coefficient vector and the root vector. When every term of the query expansion term set belongs to the document keyword set W, the inner product of the root vector of the query expansion term set and the coefficient vector of the document keyword set W is zero, and the index of the document keyword set W and the query of the query expansion term set match.
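Step 2 can be illustrated on plaintext integers. The sketch below rests on assumptions that the text does not confirm: keywords are taken to be already mapped to integers, the coefficient vector is ordered as $(a_0, \dots, a_t)$, and the root vector elements are taken to be the power sums $b_i = \sum_j q_j^{\,i}$. The function names are hypothetical.

```python
from typing import List


def coefficient_vector(doc_keywords: List[int]) -> List[int]:
    """Coefficients [a_0, ..., a_t] of f(x) = (x - w_1)...(x - w_t)."""
    coeffs = [1]  # start from the constant polynomial 1
    for w in doc_keywords:
        # Multiply the current polynomial by (x - w).
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] += -w * c   # term contributed by -w
            nxt[i + 1] += c    # term contributed by x
        coeffs = nxt
    return coeffs


def root_vector(query_terms: List[int], t: int) -> List[int]:
    """Assumed form: b_i = sum over query terms of q_j ** i, for i = 0..t."""
    return [sum(q ** i for q in query_terms) for i in range(t + 1)]


def inner_product(a: List[int], b: List[int]) -> int:
    return sum(x * y for x, y in zip(a, b))


W = [3, 5, 7, 11]                       # document keywords as integers
a = coefficient_vector(W)               # [1155, -886, 236, -26, 1]

match = inner_product(a, root_vector([5, 11], t=len(W)))
no_match = inner_product(a, root_vector([5, 4], t=len(W)))
print(match)     # 0: both query terms are roots of f
print(no_match)  # -21: equals f(4), since 4 is not in W
```

Under this power-sum form the inner product equals the sum of $f(q_j)$ over the query terms, so a zero result is guaranteed when every term is in W. Over the plain integers used here, non-zero values of $f$ could in principle cancel for terms outside W, so the sketch shows the forward direction only.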

[0121] Step 3: Vector encryption of the document and the query.

[0122] Step 3.1: Generate public and private keys for the data set, specifically including:

[0123] Choose two cyclic groups $G_1$ and $G_2$ of order q, and a bilinear pairing $G_1 \times G_1 \to G_2$;

[0124] Randomly select a generator g of the cyclic group $G_1$ and t+2 random numbers. The t+2 random numbers form the set $\{\gamma_0, \gamma_1, \dots, \gamma_t, \beta\}$, which is the private key, $SK = \{\gamma_0, \gamma_1, \dots, \gamma_t, \beta\}$, where $\gamma_0, \gamma_1, \dots, \gamma_t, \beta$ are random integers;

[0125] Calculate $\{U_0, U_1, \dots, U_t, V, X\}$, where:

[0126]

[0127]

[0128] ······

[0129]

[0130] $V = g^{\beta}$ (11)

[0131]

[0132] Where $U_0, \dots, U_t, V, X$ constitute the public key, $PK = \{U_0, U_1, \dots, U_t, V, X\}$.

[0133] The public key PK is public, while the private key SK is kept secret.

[0134] Step 3.2: Based on the generated public key, encrypt the coefficient vector of the document keyword set W using the predicate encryption method to obtain the encrypted index $I_W$ of the document, which includes:

[0135] Given a document keyword set W, its corresponding coefficient vector, and the public key PK, choose a random number s;

[0136] Calculate $C_i$, $C_\beta$, and $C_X$ respectively,

[0137]

[0138] $C_\beta = V^s = g^{s\beta}$ (14)

[0139]

[0140] Where $i \in [0, t]$, and $a_i$ is an element of the coefficient vector;

[0141] The set $\{C_0, C_1, \dots, C_t, C_\beta, C_X\}$ constitutes the encrypted index $I_W$ of the document keyword set W;

[0142] The encrypted index $I_W$ of the document keyword set W is then

[0143] $I_W = \{C_0, C_1, \dots, C_t, C_\beta, C_X\}$ (16)
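Of the index components, only the one built from V is written out in full in the text. Its defining identity, formula (14), can be checked numerically with a toy example. This is a sketch under loud assumptions: it uses integers modulo a small prime in place of the pairing group $G_1$, and the values of `p` and `g` and the exponent range are illustrative choices, not parameters from the patent. It is not secure and involves no bilinear pairing.

```python
import secrets

# Toy parameters (assumptions for illustration only; far too small for real use).
p = 23  # small prime; all arithmetic is done modulo p
g = 5   # base element playing the role of g

# beta is part of the private key SK; s is the random number chosen per index.
beta = secrets.randbelow(20) + 1
s = secrets.randbelow(20) + 1

V = pow(g, beta, p)    # formula (11): V = g^beta
C_beta = pow(V, s, p)  # formula (14): C_beta = V^s

# V^s and g^(s*beta) denote the same group element.
same = C_beta == pow(g, s * beta, p)
print(same)
```

The check holds for any choice of `beta` and `s`, since exponentiation satisfies $(g^{\beta})^{s} = g^{s\beta}$ in any group.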

[0144] Step 3.3: Based on the generated private key, encrypt the root vector of the query expansion term set using the predicate encryption method to obtain the encrypted trapdoor $T_Q$ of the query, which includes:

[0145] Given a semantically expanded set of query keywords, its corresponding root vector, and the private key SK, choose a random number u;

[0146] Calculate $T_i$ and $T_u$ respectively,

[0147]

[0148]

[0149] Where $i \in [0, t]$; $j \in [0, t]$;

[0150] $b_i$ is an element of the root vector;

[0151] The set $\{T_0, T_1, \dots, T_t, T_u\}$ constitutes the encrypted trapdoor $T_Q$ of the query expansion term set;

[0152] The encrypted trapdoor $T_Q$ of the query expansion term set is then:

[0153] $T_Q = \{T_0, T_1, \dots, T_t, T_u\}$ (19).

[0154] Step 4: Determine the matching relationship between the document keyword set W and the query expansion term set in the encrypted state, i.e. the semantic retrieval process under encryption, which specifically includes:

[0155] Based on the encrypted index $I_W$ of the document keyword set W and the encrypted trapdoor $T_Q$ of the query expansion term set, calculate $A_4$ using the following formula:

[0156] $A_4 = A_2 \times A_3$ (20)

[0157] where,

[0158]

[0159]

[0160] where,

[0161]

[0162] Where $i \in [0, t]$;

[0163] $A_2$, $A_3$, and $A_4$ are intermediate values calculated during the judgment process;

[0164] When $A_4 = C_X$, the document keyword set W and the query expansion term set match, and the semantic search result output is 1;

[0165] When $A_4 \ne C_X$, the document keyword set W and the query expansion term set do not match, and the semantic search result output is 0.

[0166] Finally, it should be noted that the above-disclosed embodiment is only one specific embodiment of the present invention. However, the embodiments of the present invention are not limited thereto, and any variations that can be conceived by those skilled in the art should fall within the protection scope of the present invention.

Claims

1. A searchable public-key encryption method that simultaneously supports semantic queries, characterized in that it includes the following steps: semantically expanding the query keyword set Q entered during the search to obtain the query expansion term set, which includes: given the query keyword set $Q = \{q_1, q_2, \dots, q_m\}$, looking up each word $q_i$ in Q, $i \in [1, m]$, in the synonym list, the set of similar words of $q_i$ being $\{q_{i1}, q_{i2}, \dots, q_{in}\}$; calculating the mutual information $I(q_i, q_{ij})$ between the word $q_i$ and each similar word $q_{ij}$ in its set of similar words, $j \in [1, n]$, according to formula (1), where $p(q_i)$ is the number of documents in the dataset that contain the word $q_i$, $p(q_{ij})$ is the number of documents in the dataset that contain the word $q_{ij}$, and $p(q_i)p(q_{ij})$ is the number of documents in the dataset that contain both the word $q_i$ and the word $q_{ij}$; normalizing the mutual information between the word $q_i$ and the similar word $q_{ij}$ according to formula (2), where $I_{\max i}$ is the maximum value in the set $\{I(q_i, q_{i1}), I(q_i, q_{i2}), \dots, I(q_i, q_{in})\}$; calculating the scarcity score of the similar word $q_{ij}$ of the word $q_i$ according to formula (3), where D is the number of documents in the dataset; calculating the final relevance score of the similar word $q_{ij}$ of the word $q_i$ as $\mathrm{Score}(q_i, q_{ij}) = \alpha \cdot S(q_i, q_{ij}) + \beta \cdot I(q_i, q_{ij})$ (4), where α and β are user-defined weights; with a user-set threshold score γ, adding the similar word $q_{ij}$ to the query expansion term set when $\mathrm{Score}(q_i, q_{ij}) > \gamma$, and discarding the similar word $q_{ij}$ when $\mathrm{Score}(q_i, q_{ij}) \le \gamma$; after all words in the query keyword set Q have been expanded and filtered, using the generated query expansion term set as the final set of query keywords; converting the document keyword set W in the dataset to be retrieved and the query expansion term set into a coefficient vector and a root vector, respectively, which includes: given the document keyword set $W = \{w_1, w_2, \dots, w_t\}$, where $n < t$, using the relationship between the roots and coefficients of a univariate equation of degree t to transform the document keyword set W into a coefficient vector and the query expansion term set into a root vector; for the document keyword set W, constructing the function $f(x) = (x - w_1)(x - w_2) \cdots (x - w_t) = a_t x^t + a_{t-1} x^{t-1} + \dots + a_0$ (5); when f(x) = 0, the roots of the function f(x) are $w_1, w_2, \dots, w_t$, and the document keyword set W is thus transformed into the coefficient vector; for the query expansion term set, constructing the root vector according to formula (7), where $q_j^{\,i}$ is the i-th power of $q_j$, with $i \in [0, t]$ and $j \in [1, n]$; generating public and private keys for the dataset; based on the generated public key, encrypting the coefficient vector of the document keyword set W using a predicate encryption method to obtain the encrypted index $I_W$ of the document; and based on the generated private key, encrypting the root vector of the query expansion term set using the predicate encryption method to obtain the encrypted trapdoor $T_Q$ of the query.

2. The searchable public key encryption method that simultaneously supports semantic queries according to claim 1, characterized in that it also includes determining, from the inner product of the coefficient vector and the root vector, the matching relationship between the index of the document keyword set W and the query of the query expansion term set; when every term of the query expansion term set belongs to the document keyword set W, the inner product of the root vector of the query expansion term set and the coefficient vector of the document keyword set W is zero, and the index of the document keyword set W and the query of the query expansion term set match.

3. The searchable public key encryption method that simultaneously supports semantic queries according to claim 1, characterized in that the generation of public and private keys for the dataset includes: choosing two cyclic groups $G_1$ and $G_2$ of order q, and a bilinear pairing $G_1 \times G_1 \to G_2$; randomly selecting a generator g of the cyclic group $G_1$ and t+2 random numbers, the t+2 random numbers forming the set $\{\gamma_0, \gamma_1, \dots, \gamma_t, \beta\}$, this set being the private key, $SK = \{\gamma_0, \gamma_1, \dots, \gamma_t, \beta\}$, where $\gamma_0, \gamma_1, \dots, \gamma_t, \beta$ are random integers; and calculating $\{U_0, U_1, \dots, U_t, V, X\}$ according to formulas (8) to (12), where $U_0, \dots, U_t, V, X$ constitute the public key, $PK = \{U_0, U_1, \dots, U_t, V, X\}$.

4. The searchable public key encryption method that simultaneously supports semantic queries according to claim 1, characterized in that encrypting the coefficient vector of the document keyword set W using a predicate encryption method based on the generated public key, to obtain the encrypted index $I_W$ of the document, includes: choosing a random number s; calculating $C_i$, $C_\beta$, and $C_X$ respectively according to formulas (13) to (15), where $i \in [0, t]$ and $a_i$ is an element of the coefficient vector; the set $\{C_0, C_1, \dots, C_t, C_\beta, C_X\}$ constituting the encrypted index $I_W$ of the document keyword set W; the encrypted index of the document keyword set W then being $I_W = \{C_0, C_1, \dots, C_t, C_\beta, C_X\}$ (16).

5. The searchable public-key encryption method that simultaneously supports semantic queries according to claim 4, characterized in that encrypting the root vector of the query expansion term set using the predicate encryption method based on the generated private key, to obtain the encrypted trapdoor $T_Q$ of the query, includes: choosing a random number u; calculating $T_i$ and $T_u$ respectively according to formulas (17) and (18), where $i \in [0, t]$, $j \in [0, t]$, and $b_i$ is an element of the root vector; the set $\{T_0, T_1, \dots, T_t, T_u\}$ constituting the encrypted trapdoor $T_Q$ of the query expansion term set; the encrypted trapdoor of the query expansion term set then being $T_Q = \{T_0, T_1, \dots, T_t, T_u\}$ (19).

6. The searchable public-key encryption method that simultaneously supports semantic queries according to claim 5, characterized in that it also includes determining the matching relationship between the document keyword set W and the query expansion term set in the encrypted state, which specifically includes: based on the encrypted index $I_W$ of the document keyword set W and the encrypted trapdoor $T_Q$ of the query expansion term set, calculating $A_4 = A_2 \times A_3$ (20), with the intermediate quantities given by formulas (21) to (23), where $i \in [0, t]$; $A_2$, $A_3$, and $A_4$ being intermediate values calculated during the judgment process; when $A_4 = C_X$, the document keyword set W and the query expansion term set match; when $A_4 \ne C_X$, the document keyword set W and the query expansion term set do not match.