A search method, device, computer device and storage medium
By grouping and encrypting the information to be searched and the information to be stored, and generating the index string to be searched and the reference index string, the problem of ciphertext data not being able to be searched directly is solved, and efficient and secure ciphertext data search is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING YOUZHUJU NETWORK TECH CO LTD
- Filing Date
- 2022-04-28
- Publication Date
- 2026-06-23
AI Technical Summary
In existing technologies, encrypted data cannot be directly searched for using fuzzy search, resulting in low search efficiency and accuracy.
By grouping and encrypting the information to be searched and the information to be stored, and using the same encryption algorithm to generate the index string to be searched and the reference index string, the association between plaintext data and ciphertext data is established, enabling direct searching of ciphertext data.
It improves the efficiency and accuracy of searching encrypted data, ensures that retrieval is not affected when the key is changed or the encryption algorithm is upgraded, and enhances data security.
Figure CN117009404B_ABST
Description
Technical Field
[0001] This disclosure relates to the field of information security technology, and in particular to a search method, apparatus, computer device, and storage medium.
Background Technology
[0002] For data security, plaintext data is typically encrypted before storage, and then the encrypted ciphertext data is stored in a database. Currently, many scenarios require fuzzy searches to retrieve relevant data information. For example, in e-commerce, entering a buyer's mobile phone number allows you to find their order information for after-sales processing.
[0003] It can be seen that when plaintext data is stored, a fuzzy search based on known input data can retrieve the data related to that input. Encrypted data, however, has no direct relationship to the input data, so a fuzzy search cannot be performed directly on stored encrypted data, which reduces the efficiency and accuracy of the search.
Summary of the Invention
[0004] This disclosure provides a search method, apparatus, computer device, and storage medium that enables direct searching of stored encrypted data, while also improving search efficiency and accuracy.
[0005] In a first aspect, embodiments of this disclosure provide a search method, the method comprising:
[0006] determining the index string to be searched corresponding to the received information to be searched;
[0007] determining, based on the index string to be searched, target combination information matching the index string to be searched, the target combination information including a reference index string and associated ciphertext data information;
[0008] using the ciphertext data information in the target combination information as the search result for the information to be searched.
[0009] In a second aspect, embodiments of this disclosure also provide a search device, the device comprising:
[0010] an index string determination module, used to determine the index string to be searched corresponding to the received information to be searched;
[0011] a target combination information determination module, used to determine, based on the index string to be searched, the target combination information matching the index string to be searched, the target combination information including a reference index string and associated ciphertext data information;
[0012] a search result determination module, used to take the ciphertext data information in the target combination information as the search result for the information to be searched.
[0013] In a third aspect, embodiments of this disclosure also provide a computer device, the computer device comprising:
[0014] one or more processors;
[0015] a storage device for storing one or more programs;
[0016] when the one or more programs are executed by the one or more processors, the one or more processors implement the search method provided in any embodiment of this disclosure.
[0017] In a fourth aspect, embodiments of this disclosure also provide a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the search method provided in any embodiment of this disclosure.
[0018] The technical solution of this disclosure specifically discloses a search method, apparatus, computer device, and storage medium. The search method includes: determining a search index string corresponding to received search information; determining target combination information matching the search index string, wherein the target combination information includes a reference index string and associated ciphertext data information; and using the ciphertext data information in the target combination information as the search result for the search information. This technical solution uses the index string as an intermediate item, establishing an association between plaintext data and ciphertext data information based on the reference index string. When searching, the search index string corresponding to the search information is first determined, then the reference index string matching the search index string is determined, and the fuzzy search result can be determined based on the ciphertext data information in the target combination information corresponding to the matched reference index string. Compared to the prior art, which cannot directly search ciphertext data without decryption, this technical solution can directly search stored ciphertext data, improving search efficiency and accuracy.
Attached Figure Description
[0019] To illustrate the technical solutions of the exemplary embodiments of this disclosure more clearly, the accompanying drawings used in describing the embodiments are briefly introduced below. Obviously, the drawings described cover only some of the embodiments of this disclosure, not all of them. Those skilled in the art can obtain other drawings from these drawings without creative effort.
[0020] Figure 1 is a flowchart of a search method provided in Embodiment 1 of this disclosure;
[0021] Figure 2 is a flowchart of a search method provided in Embodiment 2 of this disclosure;
[0022] Figure 2a is an example flowchart of the search method provided in Embodiment 2 of this disclosure;
[0023] Figure 3 is a schematic structural diagram of a search device provided in Embodiment 3 of this disclosure;
[0024] Figure 4 is a schematic structural diagram of a computer device provided in Embodiment 4 of this disclosure.
Detailed Implementation
[0025] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.
[0026] It should be understood that the steps described in the method embodiments of this disclosure may be performed in different orders and / or in parallel. Furthermore, the method embodiments may include additional steps and / or omit the steps shown. The scope of this disclosure is not limited in this respect.
[0027] The term "comprising" and its variations as used herein are open-ended inclusions, meaning "including but not limited to". The term "based on" means "at least partially based on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms will be given in the description below.
[0028] It should be noted that the concepts of "first" and "second" mentioned in this disclosure are used only to distinguish different devices, modules, or units, and are not used to limit the order of the functions performed by these devices, modules, or units or their interdependencies. It should also be noted that the modifiers "a" and "a plurality of" mentioned in this disclosure are illustrative rather than restrictive, and those skilled in the art should understand that, unless the context expressly indicates otherwise, they should be understood as "one or more".
[0029] The names of messages or information exchanged between multiple devices in the embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
[0030] Embodiment 1
[0031] Figure 1 is a flowchart of a search method provided in Embodiment 1 of this disclosure. This embodiment is applicable to situations where stored encrypted data is searched directly. The method can be executed by a search device, which can be implemented in software and / or hardware and can be configured in a terminal and / or server to implement the search method in this disclosure.
[0032] It should be noted that fuzzy search can be understood as follows: the user enters keywords, and the system automatically searches for synonyms or tolerates some difference between the search terms and the stored information. When plaintext data is stored, data related to known input data can be retrieved by such a search. Encrypted data, however, has no direct relationship to the input data, so stored encrypted data cannot be searched directly without decryption. In many scenarios, to improve data security, plaintext data is encrypted before storage and only the ciphertext is stored in the database.
[0033] To establish an inclusion relationship between the storage and retrieval of encrypted data, an encryption algorithm can be used to encrypt and store the information to be stored in groups. When the information to be searched is received, the same encryption algorithm is used to encrypt the grouped information. Then, the encrypted grouped information is matched with the stored information, and the matching stored information is used as the search result.
[0034] The search method described above, in which all plaintext information is encrypted in groups and stored in the ciphertext database, has at least two problems. First, excessive ciphertext length: the larger the ciphertext blocks, the more the ciphertext grows, which raises transmission and storage costs. Second, security: to enable retrieval, the ciphertext generated from a given plaintext must never change, so the key cannot be changed and the encryption algorithm cannot be modified; otherwise the original ciphertext can no longer be retrieved. This fixes a one-to-one mapping between plaintext and ciphertext, which significantly reduces security. Therefore, a better method is needed to enable searching of ciphertext data.
[0035] As shown in Figure 1, the search method provided in Embodiment 1 may specifically include:
[0036] S101. Determine the index string to be searched corresponding to the received search information.
[0037] The search method provided in this embodiment can be applied to users searching for encrypted data, or to the system itself searching for encrypted data. For example, in an e-commerce scenario, recipient information in order information returned by an e-commerce open platform is sensitive and is often returned encrypted. If a merchant needs to retrieve order information related to that recipient for after-sales processing, they can use the search method in this solution to obtain the desired information. This application scenario involves users searching for encrypted data. Another example is that currently, website login passwords are mostly stored as encrypted text and cannot be decrypted. However, when a user needs to change their login password, if the new password is too similar to the old password, the user needs to be reminded. This application scenario involves the system itself searching for encrypted data.
[0038] Considering that conventional encryption algorithms encrypt plaintext data as a whole, plaintext data with inclusion relationships will not have inclusion relationships in the ciphertext after encryption, thus making retrieval impossible. Therefore, this embodiment encrypts plaintext data in groups, and similarly, the information to be searched is also encrypted in groups before the search is performed.
[0039] It should be understood that, to ensure that the ciphertext data associated with an index string can be found in the ciphertext database, the information to be searched must be grouped and encrypted in the same way as the index string associated with the ciphertext data. For example, if that index string was produced by splitting the data into 2-character groups and hashing each group, and the resulting index string was stored in the ciphertext database together with the ciphertext data, then the information to be searched should likewise be split into 2-character groups and each group hashed before the search is performed.
[0040] For example, assume the plaintext data is "AB City CD District", where "City" and "District" each stand for a single character. Grouped into two-character blocks, it becomes "AB", "B City", "City C", "CD", "D District". Each group is then encrypted, and the result is represented as follows (encrypt denotes an encryption algorithm, and "|" is a separator with no special meaning, the same below):
[0041] encrypt("AB")|encrypt("B City")|encrypt("City C")|encrypt("CD")|encrypt("D District");
[0042] Assume the search term is "AB City". It is first grouped into "AB" and "B City", and the groups are then encrypted, giving:
[0043] encrypt("AB")|encrypt("B City"). It can be seen that the search term, after grouped encryption, is contained in the stored ciphertext and can therefore be retrieved.
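The containment relationship in this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the disclosed implementation: `encrypt` is a hypothetical stand-in (a truncated SHA-256 digest) for the unspecified per-group algorithm, `index_string` is an illustrative helper name, and each list element stands for one character of the original text, so "City" and "District" are treated as single characters.

```python
import hashlib

def encrypt(group: str) -> str:
    # Hypothetical stand-in for the per-group algorithm: first 8 hex digits of SHA-256.
    return hashlib.sha256(group.encode("utf-8")).hexdigest()[:8]

def index_string(chars: list[str], size: int = 2) -> str:
    # Slide a window of `size` characters over the sequence, one step at a time.
    groups = ["".join(chars[i:i + size]) for i in range(len(chars) - size + 1)]
    # Encrypt each group and join the values in order; "|" is only a separator.
    return "|".join(encrypt(g) for g in groups)

stored = index_string(["A", "B", "City", "C", "D", "District"])
query = index_string(["A", "B", "City"])

# The query index string appears inside the stored one, so the record is a hit.
print(query in stored)
```

Because every encrypted value in this sketch has the same width and never contains the separator, a plain substring test can only match on whole groups.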
[0044] The information to be searched can be one or more keywords or key terms, or identity identifiers such as mobile phone numbers, ID card numbers, or login accounts, or any combination of these; they are not listed one by one here. The index string to be searched can be understood as the string obtained by grouping and encrypting the information to be searched.
[0045] In this step, when the information to be searched is received, the characters it contains are determined and formed into a sequence. The sequence is then grouped according to the set splitting rule to obtain one or more character groups. The number of character groups depends on the number of characters in the information to be searched and on the set splitting rule. Each character group is processed by the set encryption algorithm to obtain its encrypted value, and all the encrypted values are combined in sequence to form the index string to be searched for the information to be searched.
[0046] S102. According to the index string to be searched, determine the target combined information that matches the index string to be searched, where the target combined information includes a reference index string and associated ciphertext data information.
[0047] In this embodiment, when plaintext data is encrypted and stored, it must be processed to obtain combination information, which includes a reference index string and associated ciphertext data information. The reference index string can be the result of processing some key information in the plaintext data with a set algorithm, and it is used for retrieval. The ciphertext data information is the ciphertext generated by encrypting the plaintext data; it supports decryption and is the information the user wants to obtain.
[0048] It can be understood that the process of determining the reference index string is to generate it by grouping and encrypting some key information of the plaintext data, while in step S101, the information to be searched is grouped and encrypted to determine the index string to be searched. That is to say, the generation methods of the reference index string and the index string to be searched are the same, so as to ensure that there may be an inclusion relationship between the index string to be searched and the reference index string, so that the ciphertext data can be searched.
[0049] In this embodiment, when receiving information to be stored, key information can be extracted from the information to be stored, the characters contained in the key information can be determined, and all characters can be formed into a sequence. Then, the sequence is grouped according to a set splitting rule to obtain one or more character groups. The number of character groups is related to the number of characters contained in the information to be stored and the set splitting rule. Each character group is processed by a set encryption algorithm to obtain the encryption value corresponding to each character group. All encryption values are combined in sequence to form a reference index string for the information to be stored.
[0050] It should be noted that the rules for generating the reference index string are the same as those for generating the index string to be searched; that is, the grouping method and encryption algorithm are consistent. The process of generating the reference index string can be as follows: extract key information from the plaintext data, group and encrypt the key information according to a set rule, and then combine the encrypted values of each group in sequence to form the reference index string. For example, extracting key information can be as follows: for mobile phone numbers, the last 6 or 8 digits can be extracted; for ID card information, the last 6 digits can be extracted; for address information, the province, city, and district can be ignored, and only the last part can be extracted.
[0051] Of course, to further reduce the length of the index string, the encryption result of each group (group size denoted m) can be truncated (for example, to n bits), but this introduces a certain probability of collisions. The larger the group, the lower the probability of collisions. In practice, the group size m and the truncation length n can be adjusted to suit the specific circumstances.
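The trade-off between index length and collisions can be illustrated as follows. The sketch assumes SHA-256 as the per-group algorithm and keeps the top n bits of each digest; the names `truncated_hash` and `collision_probability` and the values of n are illustrative only. Under the idealised assumption that the hash output is uniformly distributed, two distinct groups receive the same n-bit value with a probability of about 2^-n.

```python
import hashlib

def truncated_hash(group: str, n_bits: int) -> int:
    # Assumed per-group algorithm: SHA-256, keeping only the top n_bits bits (1..256).
    digest = hashlib.sha256(group.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)

def collision_probability(n_bits: int) -> float:
    # Chance that two distinct groups share a truncated value,
    # assuming uniformly distributed hash output.
    return 2.0 ** -n_bits

for n in (8, 16, 32):
    print(n, truncated_hash("45", n), collision_probability(n))
```

A shorter n shrinks every entry of the index string but makes false matches more likely, so results may need to be verified after decryption.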
[0052] The information to be stored is encrypted using a set encryption algorithm to obtain the ciphertext data information, which is then associated with the reference index string to form the combination information. It should be noted that the plaintext data can be encrypted with a separate algorithm, that is, one different from the algorithm used to generate the reference index string. With this design, changing the key or upgrading the encryption algorithm does not affect retrieval of the ciphertext data, which improves its overall security.
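The independence of the index from the data key can be sketched as below. Everything here is an assumption made for illustration: `toy_encrypt` is a deliberately simple XOR keystream cipher that is not secure and only stands in for whatever vetted cipher a real system would use, and `reference_index` uses truncated SHA-256 as the index algorithm.

```python
import hashlib

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    # Toy XOR keystream cipher, for illustration only; NOT secure.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

toy_decrypt = toy_encrypt  # XOR with the same keystream reverses itself

def reference_index(key_info: str, size: int = 2) -> str:
    # Index values come from a hash that does not depend on the data key.
    groups = [key_info[i:i + size] for i in range(len(key_info) - size + 1)]
    return "|".join(hashlib.sha256(g.encode("utf-8")).hexdigest()[:8] for g in groups)

plaintext = b"order for 123456789"
record = {"index": reference_index("456789"),
          "ciphertext": toy_encrypt(b"old-key", plaintext)}

# Key rotation: re-encrypt the data under a new key; the index is untouched.
recovered = toy_decrypt(b"old-key", record["ciphertext"])
rotated = {"index": record["index"],
           "ciphertext": toy_encrypt(b"new-key", recovered)}

print(rotated["index"] == record["index"])
print(toy_decrypt(b"new-key", rotated["ciphertext"]) == plaintext)
```

Only the ciphertext field changes during rotation, so existing index strings to be searched continue to match.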
[0053] Specifically, the ciphertext database contains pieces of combination information, each of which contains a reference index string, so the index string to be searched can be compared with each reference index string. If a reference index string matches the index string to be searched, the corresponding combination information is associated with the index string to be searched and is determined to be target combination information. In the same way, the reference index strings of all the combination information in the ciphertext database are compared with the index string to be searched to find every matching reference index string, and the combination information corresponding to all matching reference index strings is taken as the target combination information.
[0054] It can be understood that comparing the index string to be searched with a reference index string has two possible outcomes. If the reference index string matches, the corresponding combination information is determined to be target combination information, and the comparison continues with the reference index strings in the other combination information. If it does not match, the comparison simply continues with the reference index strings in the other combination information. When every reference index string in the ciphertext database has been compared with the index string to be searched, the search is complete.
[0055] S103. Use the encrypted data information in the target combination information as the search result of the information to be searched.
[0056] In this embodiment, the combination information includes a reference index string and the corresponding ciphertext data information. Once the target combination information is determined, the ciphertext data information it contains can be determined. That ciphertext data information is the related information sought by the fuzzy search, so the ciphertext data information corresponding to each matched reference index string serves as the search result for the information to be searched.
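Steps S101 to S103 can be sketched together as a full scan over the stored combination information. The record layout (a dict with `reference_index` and `ciphertext` fields), the function names, and the truncated SHA-256 generation rule are all assumptions made for this illustration; the ciphertext values are opaque placeholders.

```python
import hashlib

def index_string(text: str, size: int = 2) -> str:
    # Assumed generation rule: 2-character sliding groups, truncated SHA-256 per group.
    groups = [text[i:i + size] for i in range(len(text) - size + 1)]
    return "|".join(hashlib.sha256(g.encode("utf-8")).hexdigest()[:8] for g in groups)

def search(database: list[dict], query: str) -> list[bytes]:
    wanted = index_string(query)              # S101: index string to be searched
    if not wanted:
        return []                             # query shorter than one group
    results = []
    for combo in database:                    # S102: compare with every reference index string
        if wanted in combo["reference_index"]:
            results.append(combo["ciphertext"])   # S103: ciphertext is the search result
    return results

database = [
    {"reference_index": index_string("456789"), "ciphertext": b"<ciphertext-1>"},
    {"reference_index": index_string("116789"), "ciphertext": b"<ciphertext-2>"},
    {"reference_index": index_string("400012"), "ciphertext": b"<ciphertext-3>"},
]

print(search(database, "6789"))
```

The scan never decrypts anything: it returns the ciphertext of every record whose reference index string contains the index string to be searched, here the first two records.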
[0057] This disclosure provides a search method comprising: first, determining a search index string corresponding to received search information; determining target combination information matching the search index string, wherein the target combination information includes a reference index string and associated ciphertext data information; and finally, using the ciphertext data information in the target combination information as the search result for the search information. Using this method, an index string is used as an intermediate item, and plaintext data and ciphertext data information are associated based on the reference index string. When searching, the search index string corresponding to the search information is first determined, then the reference index string matching the search index string is determined, and the fuzzy search result can be determined based on the ciphertext data information in the target combination information corresponding to the matched reference index string. Compared to existing technologies that cannot directly search ciphertext data without decryption, this technical solution can directly search stored ciphertext data, improving search efficiency and accuracy.
[0058] As an optional embodiment of this first embodiment, based on the above embodiments, the target combination information is recorded in a ciphertext database, and the step of determining the combination information recorded in the ciphertext database may include:
[0059] a) Determine the index string to be stored corresponding to the received information to be stored.
[0060] Understandably, before storing encrypted data, the plaintext data to be stored is processed according to the set rules to obtain a reference index string and encrypted data information. The reference index string is then associated with the corresponding encrypted data information and saved to the encrypted database for users to query and retrieve relevant information.
[0061] The ciphertext database stores at least one piece of combination information, each consisting of a reference index string and the corresponding ciphertext data information. In this embodiment, separate ciphertext databases can be created for different users. For example, each user's identifier (ID) can be used in the encryption, so that each user has their own ciphertext database. These databases are independent and isolated from one another, which improves security to a certain extent. Consider an e-commerce platform. Because of the large volume of data such as product and transaction records, if all e-commerce data were grouped and encrypted according to the set rules and stored in a single ciphertext database, searching would cost significant time and space. To address this, a separate ciphertext database can be created for each e-commerce user, in which the stored reference index strings are obtained by encryption using the corresponding user's ID. This isolates the stored data. When an e-commerce user needs to search, the ciphertext database corresponding to that user's ID is selected and searched. Data is not shared between users, which ensures data security and improves search efficiency.
[0062] In this embodiment, the reference index string and the corresponding ciphertext data information are combined and stored in the ciphertext database. Therefore, before storage, the reference index string and the associated ciphertext data information must be determined from the plaintext data. The reference index string can be determined as follows: form the characters contained in the information to be stored into a character sequence; group the character sequence according to a set rule into one or more character groups; process each group with a set algorithm, such as a hash algorithm; and combine all the processed values to form the index string to be stored.
[0063] Further, determining the index string to be stored corresponding to the received information to be stored includes:
[0064] a1) Determine the sequence of characters to be stored based on the characters contained in the information to be stored.
[0065] Specifically, determine all the characters contained in each piece of information to be stored. You can either construct a character sequence to be stored from all the characters or select key characters to construct a character sequence to be stored.
[0066] Preferably, determining the sequence of characters to be stored based on the characters contained in the information to be stored includes:
[0067] a11) Based on the characters contained in the information to be stored, obtain the initial character sequence.
[0068] Specifically, determine all the characters contained in each piece of information to be stored, and all the characters constitute the initial character sequence.
[0069] a12) Determine the information type to which the information to be stored belongs, and obtain at least one key character position corresponding to the information type.
[0070] In this step, the information type of the information to be stored is determined. The information type can be a mobile phone number, ID card information, address information, and so on, without specific limitation. Different information types correspond to different key characters. For example, for a mobile phone number, the last 6 or 8 digits can be taken as the key characters; for ID card information, the last 6 digits can be taken; for address information, the province, city, and district can be ignored and the remaining portion used. Specifically, once the information type is determined, the key character selection rule set for that type is looked up, which gives at least one key character position corresponding to the information type.
[0071] a13) Find the key character corresponding to each key character position in the initial character sequence, and construct the character sequence to be stored based on each key character.
[0072] Specifically, since the initial character sequence contains all the characters contained in the information to be stored, and given that the key character positions have been determined, the key characters corresponding to each key character position can be found from the initial character sequence, and the found key characters can be used to form the character sequence to be stored.
[0073] For example, assuming the information to be stored is a product number 123456789, the initial character sequence can be determined as 123456789. Assuming that for information types like product numbers, the last 6 digits are extracted as key characters, the key characters corresponding to the last 6 digits of the initial character sequence are used to form the character sequence to be stored as 456789.
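Steps a11 to a13 can be sketched as a lookup from information type to key character positions. The table `KEY_POSITIONS`, its type names, and the function `sequence_to_store` are hypothetical; the positions follow the examples in the text (last 8 digits of a mobile phone number, last 6 digits of an ID card or product number).

```python
# Hypothetical table: information type -> key character positions (as a slice).
KEY_POSITIONS = {
    "mobile_number": slice(-8, None),   # last 8 digits
    "id_card": slice(-6, None),         # last 6 digits
    "product_number": slice(-6, None),  # last 6 digits
}

def sequence_to_store(info: str, info_type: str) -> str:
    initial = list(info)                   # a11: initial character sequence
    positions = KEY_POSITIONS[info_type]   # a12: key positions for this type
    return "".join(initial[positions])     # a13: key characters, in order

print(sequence_to_store("123456789", "product_number"))  # 456789
```

An address type would need a parsing rule rather than fixed positions, so it is left out of this sketch.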
[0074] a2) Group the characters in the character sequence to be stored according to the set splitting rules to obtain at least one second character group.
[0075] The splitting rule is determined according to the actual application. It can be described as follows: the character sequence currently to be split is denoted as the sequence to be split, which here is the character sequence to be stored; each of the first set number of characters at the start of the sequence to be split is taken as a target character; for each target character, starting from that character, the second set number of characters are selected in order along the forward direction of the sequence to be split to form the character group corresponding to that target character; wherein the second set number is less than the length of the sequence to be split, and the first set number is determined based on that length and the second set number.
[0076] In this process, the character sequence to be stored is used as the sequence to be split. The leading characters of the sequence are selected as target characters; how many are selected is the first set number, which can be set according to user needs or experience. For each target character, starting from that character, several characters are selected in order along the forward direction of the sequence to be split to form the character group corresponding to that target character; how many are selected is the second set number. The first set number should be less than or equal to the length of the sequence to be split. The second set number determines the number of characters in each character group, so the first set number should be chosen such that every character group contains exactly the second set number of characters. For a sequence of length L and a second set number k, this means the first set number is at most L - k + 1.
[0077] For example, suppose the sequence to be split is 456789, the first set number is 5, and the second set number is 2. The first 5 characters of the sequence to be split are taken as target characters: "4", "5", "6", "7" and "8". Starting from each target character, 2 characters are selected in order along the forward direction of the sequence to be split to form the corresponding character group. The character groups are "45", "56", "67", "78" and "89".
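The splitting rule in this example can be sketched as a sliding window. The function name `split_sequence` is illustrative, and the relation first set number = length - second set number + 1 is inferred from the example (6 - 2 + 1 = 5) rather than stated in the text. As a further assumption, the sketch also accepts a sequence exactly as long as one group.

```python
def split_sequence(seq: str, second: int) -> list[str]:
    # `second` is the second set number: the number of characters per group.
    if not 0 < second <= len(seq):
        raise ValueError("group size must be between 1 and the sequence length")
    # First set number: how many leading characters can start a full group.
    first = len(seq) - second + 1
    # Each target character seq[i] starts a group of `second` consecutive characters.
    return [seq[i:i + second] for i in range(first)]

print(split_sequence("456789", 2))  # ['45', '56', '67', '78', '89']
```

Choosing the first set number this way guarantees that no group runs past the end of the sequence.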
[0078] It should be noted that the splitting rules for the information to be searched and the character sequence to be stored are the same. This ensures that the index string to be searched corresponding to the information to be searched and the reference index string corresponding to the character sequence to be stored can have an inclusion relationship, and thus a search can be performed.
[0079] a3) Perform hash processing on each of the second character groups to obtain the second hash value corresponding to each second character group.
[0080] The hashing process can be understood as processing each second character group with a hash algorithm. It should be noted that the algorithm used to generate the index string to be searched is the same as the algorithm used to generate the index string to be stored; both are hash algorithms. This serves the same purpose as using the same splitting rules: it ensures that the index string to be searched and the reference index string corresponding to the character sequence to be stored can have an inclusion relationship, so that the search can proceed.
[0081] It should also be noted that, to ensure data security, different hash functions can be used for different user IDs to create separate ciphertext databases. This isolates the databases from one another in storage and improves data security. For the same target database, that is, for the same user ID, the hash algorithm used for storage and for retrieval must be identical.
[0082] Specifically, based on the hash function, each second character group is hashed to obtain one or more second hash values. All hash values are then combined to form the index string to be stored for the information to be stored. For example, assuming the second character groups are "45", "56", "67", "78", and "89", after hash processing (where encrypt denotes the hash algorithm used), the results are: encrypt("45"), encrypt("56"), encrypt("67"), encrypt("78"), and encrypt("89").
[0083] a4) Determine the index string to be stored for the information to be stored based on each of the second hash values.
[0084] Specifically, all the second hash values are combined in the order of their corresponding second character groups to form the index string to be stored. Continuing with the example above, the index string to be stored can be represented as: encrypt("45")|encrypt("56")|encrypt("67")|encrypt("78")|encrypt("89") (where encrypt denotes the hash algorithm used, and | is a separator with no meaning of its own).
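As an illustration of steps a3) and a4), the sketch below hashes each character group and joins the results in order. The disclosure does not name a specific hash algorithm; SHA-256 from Python's standard `hashlib` module is used here purely as an assumption, and the names `hash_group` and `build_index_string` are hypothetical.

```python
import hashlib


def hash_group(group: str) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash algorithm.
    return hashlib.sha256(group.encode("utf-8")).hexdigest()


def build_index_string(groups: list[str], separator: str = "|") -> str:
    # Hash values are joined in the order of their character groups.
    return separator.join(hash_group(g) for g in groups)


index_string = build_index_string(["45", "56", "67", "78", "89"])
print(len(index_string.split("|")))  # one hash value per character group
```

Because a hexadecimal digest never contains the "|" character, the separator can be used safely to recover the individual hash values.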
[0085] b) Encrypt the information to be stored to obtain the ciphertext data information of the information to be stored.
[0086] In this step, the information to be stored is encrypted according to the set encryption strategy to obtain the corresponding ciphertext data. The encryption algorithm used should be reversible to ensure that after the ciphertext data is retrieved, it can be decrypted to obtain the plaintext data desired by the user. It should be noted that the plaintext data encryption can use a separate encryption algorithm, which can be understood as different from the encryption algorithm used to generate the reference index string. This design ensures that changing the key or upgrading the encryption algorithm does not affect the retrieval of the ciphertext data, thus improving the overall security of the ciphertext data.
[0087] c) Determine a reference index string based on the index string to be stored, and associate the reference index string with the ciphertext data information to form a combined information and store it in the ciphertext database.
[0088] In this step, after determining the index string to be stored, it can be directly used as the reference index string, or a portion of the index string can be truncated and used as the reference index string. Considering further reducing the length of the index string, the encryption result of each group (group size denoted as m) can be truncated (e.g., truncated to n bits), but this will introduce a certain probability of collisions. The larger the number of characters in a group, the lower the probability of collisions. The group size m and the truncated length n can be adjusted in practice based on the specific circumstances.
[0089] Specifically, after determining the reference index string, it is associated with the encrypted data information to form a combined piece of information, which is then stored in the encrypted database. For example, assuming the information to be stored is "AB Province, CD City, EF District, G Center", the reference index string is generated by grouping the key information "G Center" into two characters and performing a hash operation: hash("G Center")|hash("Center"), and the information to be stored is encrypted to obtain: encrypt("AB Province, CD City, EF District, G Center"). The combined information can be represented as: hash("G Center")|hash("Center")+encrypt("AB Province, CD City, EF District, G Center"), and stored in the encrypted database. Here, hash represents a hash algorithm, encrypt represents an encryption algorithm, and | represents a separator with no actual meaning. It is known that the encrypted database can contain multiple combined pieces of information. The reference index string is used for subsequent searches, and the encrypted data information is the data information the user wants.
[0090] Further, determining the reference index string based on the index string to be stored includes: directly using the index string to be stored as the reference index string; or, selecting a sub-index string from the index string to be stored as the reference index string.
[0091] In this embodiment, the index string to be stored can be directly used as the reference index string. Considering that to further reduce the length of the index string, the encryption result of each group can be truncated to obtain a sub-index string, which can then be used as the reference index string. For example, the first n characters, the middle n characters, or the last n characters can be selected (n can be any integer less than the length of each group), and the selection method can be preset by the user. It is understood that selecting a portion of the index string may result in a certain probability of collision. The larger the number of characters in each group, the lower the probability of collision. Therefore, the length of each group and the length of the truncated portion can be adjusted in practice according to the actual situation.
[0092] For example, suppose the selection method is set as follows: select the last two characters of the encryption result of each group as a sub-index string. Suppose the groups are "456", "567", "678", and "789", and the corresponding encryption results are encrypt("456"), encrypt("567"), encrypt("678"), and encrypt("789"), where encrypt denotes the hash algorithm used. The last two characters of each encryption result are extracted as sub-index strings, and all the extracted sub-index strings are combined, in order, to form the reference index string.
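The truncation described above might look like the following sketch. SHA-256, the "|" separator between sub-index strings, and the function name `truncated_reference_index` are all assumptions made for illustration; the parameter `keep` corresponds to the truncated length n.

```python
import hashlib


def truncated_reference_index(groups: list[str], keep: int = 2) -> str:
    """Keep only the last `keep` characters of each group's hash value."""
    parts = []
    for group in groups:
        # Assumption: SHA-256 stands in for the unspecified hash algorithm.
        digest = hashlib.sha256(group.encode("utf-8")).hexdigest()
        parts.append(digest[-keep:])
    return "|".join(parts)


reference_index = truncated_reference_index(["456", "567", "678", "789"])
print(reference_index)
```

Keeping only two hexadecimal characters leaves 256 possible values per group, which makes the collision probability mentioned in the text easy to see; a larger `keep` lowers it at the cost of a longer index string.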
[0093] As another optional implementation of this embodiment, the following is further added on the basis of the above embodiment: decrypting the search results and feeding back the decrypted search results to the search interface of the client.
[0094] In this optional embodiment, after determining the search result, the search result is decrypted, i.e., the ciphertext data information is decrypted. Since the encryption algorithm for encrypting plaintext data into ciphertext data information is reversible, the ciphertext data information can be decrypted by performing the inverse encryption operation, and the decrypted search result is displayed on the client's search interface for the user to view.
[0095] The above optional embodiments of this example provide feedback on the search results after they are determined. It can be seen that, compared to existing technologies that cannot directly search encrypted data, this optional embodiment can directly decrypt and feed back the pre-determined search results to the client's search interface, thus achieving direct searching of encrypted data and improving search efficiency and accuracy.
[0096] Example 2
[0097] Figure 2 is a flowchart of a search method provided by Embodiment 2 of this disclosure. This embodiment is a further optimization of the above embodiment. In this embodiment, determining the index string to be searched corresponding to the received information to be searched is further specified as follows: grouping the characters in the information to be searched according to a set splitting rule to obtain at least one first character group; performing hash processing on each first character group to obtain a first hash value corresponding to each first character group; and determining the index string to be searched of the information to be searched based on each first hash value.
[0098] Meanwhile, in this embodiment, determining the target combination information matching the index string to be searched is specified as follows: for each pre-stored combination information, the index string to be searched is compared with the reference index string in the combination information to obtain the index string comparison result; the combination information whose index string comparison result meets the comparison condition is taken as the target combination information matching the index string to be searched; wherein, the comparison condition is set as: the index string to be searched is a substring of the reference index string in the combination information; or, the reference index string in the combination information is a substring of the index string to be searched.
[0099] As shown in Figure 2, the search method provided in Embodiment 2 specifically includes the following steps:
[0100] S201. Group the characters in the information to be searched according to the set splitting rules to obtain at least one first character group.
[0101] Specifically, when a search query is received, if the query consists of characters such as text or numbers, all characters contained within the query are identified, and the query is used as the search character sequence. For example, to search for information related to a mobile phone number ending in 1234, the search character sequence would be "1234". If the query is in the form of voice or image, it needs to be converted into character form first, and then the characters contained in the converted query are used to construct the search character sequence.
[0102] In this step, the splitting rules for the information to be searched should be consistent with the splitting rules for the reference index strings stored in the encrypted database. This ensures that the search string corresponding to the information to be searched and the reference index strings corresponding to the character sequence to be stored may have an inclusion relationship, thus enabling a search. Specifically, by grouping all characters in the information to be searched according to the splitting rules, one or more groups of first character groups can be obtained.
[0103] Furthermore, the setting of the splitting rules includes:
[0104] The current character sequence to be split is denoted as the sequence to be split, which is the information to be searched. The first set number of characters in the sequence to be split are respectively taken as target characters. For each target character in the sequence to be split, starting from the target character, a second set number of characters are sequentially selected along the positive direction of the sequence to be split to form a character group corresponding to the target character. The second set number is less than the length of the sequence to be split. The first set number is determined based on the length and the second set number.
[0105] For the setting of the splitting rules, refer to the description in Embodiment 1; it will not be repeated here.
[0106] S202. Perform hash processing on each of the first character groups to obtain the first hash value corresponding to each first character group.
[0107] The hashing process can be understood as processing each first character group with a hash algorithm. It should be noted that the algorithm used to generate the index string to be searched is the same as the algorithm used to generate the index string to be stored. This serves the same purpose as using the same splitting rules: it ensures that the index string to be searched can have an inclusion relationship with the reference index string corresponding to the character sequence to be stored, thus enabling the search.
[0108] It should be noted that, for the same database, or the same user ID, the index strings for storage and retrieval must use the same hash algorithm.
[0109] Specifically, based on the set hash function, each first character group is hashed to obtain one or more first hash values.
[0110] S203. Determine the search index string of the information to be searched based on each of the first hash values.
[0111] All hash values are combined in the order of their corresponding first character groupings to form the search index string for the information to be searched.
[0112] S204. For each pre-stored combination of information, compare the index string to be searched with the reference index string in the combination information to obtain the index string comparison result.
[0113] Specifically, the search process iterates through the combined information in the encrypted database, comparing the index string to be searched with the reference index string in each piece of combined information. It is understood that the comparison result may be one of three cases: the index string to be searched is a substring of the reference index string in the combined information; the reference index string in the combined information is a substring of the index string to be searched; or the two index strings are unrelated.
[0114] For each piece of combined information, the comparison result between the index string to be searched and the reference index string is one of the three cases mentioned above. The search ends when all combined information in the encrypted database has been compared with the index string to be searched.
[0115] S205. The combination information that meets the comparison conditions of the index string comparison results is taken as the target combination information that matches the index string to be searched.
[0116] The comparison conditions are set as follows: the index string to be searched is a substring of the reference index string in the combined information; or, the reference index string in the combined information is a substring of the index string to be searched.
[0117] For example, suppose the index string to be searched is: encrypt("45")|encrypt("56") (where encrypt represents an encryption algorithm, and | represents a separator with no actual meaning). The reference index string in combination A is: encrypt("45")|encrypt("56")|encrypt("67")|encrypt("78"); the reference index string in combination B is encrypt("56"). According to this comparison condition, it can be determined that: the index string to be searched, encrypt("45")|encrypt("56"), is a substring of the reference index string encrypt("45")|encrypt("56")|encrypt("67")|encrypt("78") in combination A, and the index string comparison result satisfies the comparison condition. The reference index string encrypt("56") in combination B is a substring of the index string encrypt("45")|encrypt("56") to be searched, and the index string comparison result satisfies the comparison condition.
[0118] Specifically, if the index string to be searched is a substring of the reference index string in the combined information; or, if the reference index string in the combined information is a substring of the index string to be searched, it indicates that the reference index string is associated with the index string to be searched, and the combined information of the reference index string is what the user wants to obtain. The combined information of the reference index string is then used as the target combined information, and the reference index strings in other combined information in the encrypted database are compared with the index string to be searched. The traversal ends when all combined information in the encrypted database has been traversed.
[0119] If the index string to be searched is not a substring of the reference index string in the combined information, and the reference index string in the combined information is also not a substring of the index string to be searched, it indicates that the reference index string and the index string to be searched are not related, and the combined information of the reference index string is not what the user wants to obtain. Then, continue searching for other combined information in the encrypted database, and continue comparing the reference index strings in the other combined information in the encrypted database with the index string to be searched. The search ends when all combined information in the encrypted database has been traversed.
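The comparison condition of S204 and S205 can be sketched as follows. This is an illustrative reading rather than the claimed implementation: it compares whole hash values (the pieces between "|" separators) instead of raw characters, which is an assumption intended to keep a match from starting in the middle of a hash value. The function names are hypothetical, and the short tokens such as `h45` are placeholders standing in for real hash values like encrypt("45").

```python
def is_contiguous_sublist(needle: list[str], haystack: list[str]) -> bool:
    """True if needle occurs in haystack as a run of adjacent elements."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def meets_comparison_condition(query_index: str, reference_index: str, sep: str = "|") -> bool:
    query = query_index.split(sep)
    reference = reference_index.split(sep)
    # Either index string may be contained in the other.
    return is_contiguous_sublist(query, reference) or is_contiguous_sublist(reference, query)


query = "h45|h56"
print(meets_comparison_condition(query, "h45|h56|h67|h78"))  # combination A: True
print(meets_comparison_condition(query, "h56"))              # combination B: True
print(meets_comparison_condition(query, "h12|h23"))          # unrelated: False
```

The first two calls mirror combinations A and B from the example above; the third shows the unrelated case, in which the traversal simply moves on to the next piece of combined information.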
[0120] S206. Use the encrypted data information in the target combination information as the search result of the information to be searched.
[0121] Specifically, after determining that the index string comparison result meets the comparison conditions according to the above steps, it can be known that the ciphertext data information in the target combination information is the information that the user wants to obtain. Therefore, the ciphertext data information in the target combination information is used as the search result for the information to be searched.
[0122] To illustrate the embodiments of this disclosure more clearly, an e-commerce application scenario is described as an example. Recipient information in the order information returned by an e-commerce open platform is sensitive, so it is encrypted. The platform's database stores recipients' order information: a portion of the recipient's mobile phone number is hashed to obtain a reference index string, the corresponding order information is encrypted to obtain ciphertext data information, and the reference index string and the associated encrypted order information are stored in the database as one piece of combined information. Suppose a merchant needs to query, for after-sales processing, the order information related to a recipient whose mobile phone number ends in "1234"; then "1234" is used as the information to be searched. Assume the splitting rule takes the first three characters of the last four digits of the mobile phone number as target characters and groups every two characters together. The search steps are: 1. Group the characters in the information to be searched, "1234", according to the set splitting rule to obtain three first character groups "12", "23", and "34"; 2. Hash the first character groups "12", "23", and "34" to obtain the first hash values hash("12"), hash("23"), and hash("34"), where hash denotes a hash algorithm; 3. Determine the index string to be searched, hash("12")|hash("23")|hash("34"), based on the first hash values; 4. For each piece of pre-stored combined information, compare the index string to be searched with the reference index string in the combined information to obtain the index string comparison result; 5. If the comparison result shows that the index string to be searched is a substring of the reference index string, or vice versa, take the combined information containing that reference index string as the target combined information matching the index string to be searched; 6. Use the encrypted order information in the target combined information as the search result for the information to be searched; 7. Decrypt the order information and send the decrypted order information to the client's search interface. The merchant can then view the search results in the search interface and obtain the order information associated with recipients whose mobile phone numbers end in "1234".
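Steps 1 to 6 of this scenario can be traced with a small, self-contained sketch. Everything concrete in it is an assumption made for illustration: the phone numbers are invented, SHA-256 stands in for the unspecified hash algorithm, the ciphertexts are opaque placeholder byte strings rather than real encrypted orders, and all names are hypothetical. Decryption (step 7) is omitted because the disclosure leaves the reversible encryption algorithm open.

```python
import hashlib


def split_into_groups(sequence: str, group_size: int = 2) -> list[str]:
    return [sequence[i:i + group_size] for i in range(len(sequence) - group_size + 1)]


def index_tokens(sequence: str) -> list[str]:
    # One hash value per character group, kept in group order.
    return [hashlib.sha256(g.encode("utf-8")).hexdigest() for g in split_into_groups(sequence)]


def contains(haystack: list[str], needle: list[str]) -> bool:
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


# Combined information: (reference index built from the last four digits, ciphertext).
database = [
    (index_tokens("13800001234"[-4:]), b"<ciphertext of order A>"),
    (index_tokens("13900005678"[-4:]), b"<ciphertext of order B>"),
    (index_tokens("13700001234"[-4:]), b"<ciphertext of order C>"),
]


def search(query: str) -> list[bytes]:
    query_tokens = index_tokens(query)  # steps 1 to 3
    return [
        ciphertext
        for reference_tokens, ciphertext in database  # step 4: traverse every record
        if contains(reference_tokens, query_tokens) or contains(query_tokens, reference_tokens)  # step 5
    ]  # step 6: the matching ciphertexts are the search result


results = search("1234")
print(results)  # [b'<ciphertext of order A>', b'<ciphertext of order C>']
```

A shorter query such as "234" returns the same two records in this sketch, which is the fuzzy-search behaviour the inclusion relationship is meant to provide.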
[0123] This second embodiment provides a search method that specifies the implementation of determining the index string to be searched corresponding to the received search information, and also specifies the implementation of determining the target combination information that matches the index string. Using the method provided in this embodiment, when a search is required, the search information is first grouped and encrypted in the same way as the reference index string is generated to generate a search string; this ensures that the search string and the reference index string have an inclusion relationship. By comparing the search string and the reference index string, the reference index string that matches the search string can be determined. The combination information corresponding to the reference index string is taken as the target combination information, and the encrypted data information in the target combination information is taken as the search result. Compared with the prior art, which cannot directly search encrypted data without decryption, this technical solution enables direct searching of stored encrypted data and improves search efficiency and accuracy.
[0124] To facilitate a better understanding of the method provided in this embodiment, Figure 2a gives an example flowchart of the search method provided in Embodiment 2 of this disclosure. As shown in Figure 2a, an exemplary process illustrates how the search method is executed in practical applications:
[0125] S1. Receive the information to be stored.
[0126] S2. Obtain the initial character sequence based on the characters contained in the information to be stored.
[0127] S3. Obtain the key character positions based on the information type to which the information to be stored belongs.
[0128] S4. Find the key characters corresponding to each key character position in the initial character sequence, and construct the character sequence to be stored based on each key character.
[0129] S5. Group the characters in the character sequence to be stored to obtain the second character group;
[0130] S6. Perform hash processing on each second character group to obtain the second hash value corresponding to each second character group;
[0131] S7. Determine the index string to be stored for the information to be stored based on each second hash value;
[0132] S8. Encrypt the information to be stored to obtain the ciphertext data information of the information to be stored.
[0133] S9. Determine the reference index string based on the index string to be stored, and associate the reference index string with the ciphertext data information to form a combined information and store it in the ciphertext database.
[0134] Steps S1 to S9 above are executed after receiving the information to be stored. The execution order of steps S1 to S7 and step S8 is not specifically restricted. Steps S10 to S17 are executed after receiving the search information and constitute the specific search process.
[0135] S10. Receive the information to be searched;
[0136] S11. Group the characters in the search information according to the set splitting rules to obtain the first character group;
[0137] S12. Perform hash processing on each first character group to obtain the first hash value corresponding to each first character group;
[0138] S13. Determine the search index string of the information to be searched based on each of the first hash values;
[0139] S14. For each pre-stored combination of information, compare the index string to be searched with the reference index string in the combination information to obtain the index string comparison result;
[0140] S15. Use the combination information whose index string comparison result meets the comparison conditions as the target combination information that matches the index string to be searched;
[0141] S16. Use the ciphertext data information in the target combination information as the search result of the information to be searched;
[0142] S17. Decrypt the search results and send the decrypted search results to the client's search interface.
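Steps S2 to S4 of the storage flow, which build the character sequence to be stored from key character positions, could be sketched as below. The table `KEY_POSITIONS`, the information type name `mobile_number`, and the choice of the last four positions of an 11-digit number are hypothetical values chosen to match the e-commerce example; the disclosure only states that the positions depend on the information type.

```python
# Hypothetical mapping from information type to key character positions (0-based).
KEY_POSITIONS = {
    "mobile_number": [7, 8, 9, 10],  # assumed: last four digits of an 11-digit number
}


def build_sequence_to_store(info: str, info_type: str) -> str:
    initial_sequence = list(info)          # S2: initial character sequence
    positions = KEY_POSITIONS[info_type]   # S3: key character positions for this type
    # S4: look up the key character at each position and join them in order.
    return "".join(initial_sequence[p] for p in positions)


sequence_to_store = build_sequence_to_store("13800001234", "mobile_number")
print(sequence_to_store)  # 1234
```

The resulting sequence is what steps S5 to S7 then group, hash, and combine into the index string to be stored.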
[0143] Example 3
[0144] Figure 3 is a schematic diagram of a search device provided in Embodiment 3 of this disclosure. This embodiment is applicable to situations where stored encrypted data is searched directly. The device can be implemented by software and / or hardware and can be configured in a terminal and / or server to implement the search method in this disclosure. Specifically, the device may include: an index string determination module 31, a target combination information determination module 32, and a search result determination module 33.
[0145] The index string determination module 31 is used to determine the index string to be searched corresponding to the received information to be searched.
[0146] The target combination information determination module 32 is used to determine the target combination information that matches the index string to be searched based on the index string to be searched. The target combination information includes the reference index string and the associated ciphertext data information.
[0147] The search result determination module 33 is used to use the encrypted data information in the target combination information as the search result of the information to be searched.
[0148] This embodiment provides a search device integrated into an execution device. First, it determines the index string corresponding to the received search information. Then, based on the index string, it determines target combination information matching the index string. The target combination information includes a reference index string and associated encrypted data information. Finally, it uses the encrypted data information in the target combination information as the search result for the search information. Using this method, the index string is used as an intermediate item, and a connection is established between plaintext data and encrypted data information based on the reference index string. When searching, the search index string corresponding to the search information is first determined, then the reference index string matching the index string is determined. Based on the encrypted data information in the target combination information corresponding to the matching reference index string, the search result can be determined, realizing fuzzy search of encrypted data and improving search efficiency and accuracy.
[0149] Based on any optional technical solution in the embodiments of this disclosure, optionally, the index string determination module 31 can be specifically used for:
[0150] The characters in the search character sequence are grouped according to the set splitting rules to obtain at least one first character group;
[0151] Perform hash processing on each of the first character groups to obtain the first hash value corresponding to each first character group;
[0152] Based on each of the first hash values, determine the search index string of the information to be searched.
[0153] Based on any optional technical solution in the embodiments of this disclosure, optionally, the target combination information is recorded in an encrypted database, and the device further includes a combination information determination module, which may specifically include:
[0154] The to-be-stored index string determination unit is used to determine the index string to be stored corresponding to the received information to be stored.
[0155] The encrypted data information acquisition unit is used to encrypt the information to be stored and obtain the encrypted data information of the information to be stored.
[0156] The information combination unit is used to determine a reference index string based on the index string to be stored, and associate the reference index string with the ciphertext data information to form a piece of combined information and store it in the ciphertext database.
[0157] Optionally, based on any optional technical solution in the embodiments of this disclosure, the to-be-stored index string determination unit can be specifically used for:
[0158] The sequence of characters to be stored is determined based on the characters contained in the information to be stored;
[0159] The characters contained in the character sequence to be stored are grouped according to the set splitting rules to obtain at least one second character group;
[0160] Perform hash processing on each of the second character groups to obtain the second hash value corresponding to each second character group;
[0161] Based on each of the second hash values, determine the index string to be stored for the information to be stored.
[0162] Furthermore, the setting of the splitting rules includes:
[0163] The current character sequence to be split is denoted as the sequence to be split, where the sequence to be split is the information to be searched or the character sequence to be stored.
[0164] The first set number of characters in the sequence to be split are respectively taken as target characters;
[0165] For each target character in the sequence to be split, starting from the target character, a second set number of characters are sequentially selected along the positive direction of the sequence to be split to form a character group corresponding to the target character;
[0166] Wherein, the second set number is less than the length of the sequence to be split; the first set number is determined based on that length and the second set number.
[0167] Based on any optional technical solution in the embodiments of this disclosure, optionally, the step in which the to-be-stored index string determination unit determines the character sequence to be stored based on the characters contained in the information to be stored can be specifically described as follows:
[0168] Based on the characters contained in the information to be stored, an initial character sequence is obtained;
[0169] Determine the information type to which the information to be stored belongs, and obtain at least one key character position corresponding to the information type;
[0170] Find the key character corresponding to each key character position in the initial character sequence, and construct the character sequence to be stored based on each key character.
[0171] Based on any optional technical solution in the embodiments of this disclosure, optionally, the step in which the information combination unit determines the reference index string based on the index string to be stored can be described as follows:
[0172] Use the index string to be stored directly as the reference index string; or
[0173] Select a sub-index string from the index string to be stored and use it as a reference index string.
[0174] Based on any optional technical solution in the embodiments of this disclosure, optionally, the target combination information determination module 32 may be specifically used for:
[0175] For each pre-stored combination of information, the index string to be searched is compared with the reference index string in the combination information to obtain the index string comparison result;
[0176] The combination information that meets the comparison conditions of the index string comparison results shall be used as the target combination information that matches the index string to be searched.
[0177] The comparison conditions are set as follows: the index string to be searched is a substring of the reference index string in the combined information; or, the reference index string in the combined information is a substring of the index string to be searched.
[0178] Optionally, based on any of the optional technical solutions in the embodiments of this disclosure, the device further includes a decryption module, used for:
[0179] Decrypt the search results and display the decrypted results on the client's search interface.
[0180] The above-described apparatus can execute the methods provided in any embodiment of this disclosure, and has the corresponding functional modules and beneficial effects for executing the methods.
[0181] It is worth noting that the various units and modules included in the above-mentioned device are only divided according to functional logic, but are not limited to the above division, as long as the corresponding functions can be realized; in addition, the specific names of each functional unit are only for easy differentiation and are not used to limit the protection scope of the embodiments of this disclosure.
[0182] Embodiment 4
[0183] Figure 4 is a schematic structural diagram of a computer device provided in Embodiment 4 of this disclosure. Figure 4 shows the structure of a computer device 40 (e.g., a terminal device or a server) suitable for implementing embodiments of this disclosure. The terminal device in this embodiment may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle terminals (e.g., vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The computer device shown in Figure 4 is merely an example and should not be construed as limiting the functionality and scope of the embodiments disclosed herein.
[0184] As shown in Figure 4, the computer device 40 may include a processing device (e.g., a central processing unit, a graphics processing unit, etc.) 41, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 42 or a program loaded from a storage device 48 into a random access memory (RAM) 43. The RAM 43 also stores various programs and data required for the operation of the computer device 40. The processing device 41, ROM 42, and RAM 43 are interconnected via a bus 45. An input/output (I/O) interface 44 is also connected to the bus 45.
[0185] Typically, the following devices can be connected to the I/O interface 44: input devices 46 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output devices 47 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; storage devices 48 including, for example, magnetic tapes, hard disks, etc.; and communication devices 49. The communication device 49 allows the computer device 40 to communicate wirelessly or by wire with other devices to exchange data. Although Figure 4 shows a computer device 40 with various devices, it should be understood that not all of the devices shown are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
[0186] In particular, according to embodiments of this disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments of this disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via a communication device 49, or installed from a storage device 48, or installed from a ROM 42. When the computer program is executed by the processing device 41, it performs the functions defined in the methods of embodiments of this disclosure.
[0187] The names of messages or information exchanged between multiple devices in the embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
[0188] The computer device provided in this embodiment and the search method provided in the above embodiments belong to the same inventive concept. Technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
[0189] Embodiment 5
[0190] This disclosure provides a computer storage medium storing a computer program that, when executed by a processor, implements the search method provided in the above embodiments.
[0191] It should be noted that the computer-readable medium described above in this disclosure can be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof.
[0192] In this disclosure, a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device. In this disclosure, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals can take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wires, optical fibers, RF (radio frequency), etc., or any suitable combination thereof.
[0193] In some implementations, clients and servers can communicate using any currently known or future-developed network protocol such as HTTP (Hypertext Transfer Protocol) and can be interconnected by digital data communication of any form or medium (e.g., a communication network). Examples of communication networks include local area networks (“LANs”), wide area networks (“WANs”), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.
[0194] The aforementioned computer-readable medium may be included in the aforementioned computer device; or it may exist independently and not assembled into the computer device.
[0195] The aforementioned computer-readable medium carries one or more programs that, when executed by the computer device, cause the computer device to:
[0196] Computer program code for performing the operations of this disclosure can be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (e.g., via the Internet using an Internet service provider).
[0197] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.
[0198] The units described in the embodiments of this disclosure can be implemented in software or in hardware. The name of a unit does not necessarily limit the unit itself; for example, the first acquisition unit can also be described as "a unit that acquires at least two Internet Protocol addresses".
[0199] The functions described above in this document can be performed, at least in part, by one or more hardware logic components. For example, exemplary types of hardware logic components that can be used, without limitation, include: Field Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems-on-Chip (SoCs), Complex Programmable Logic Devices (CPLDs), and so on.
[0200] In the context of this disclosure, a machine-readable medium can be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can be, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
[0201] In this embodiment of the disclosure, the computer-executable instructions contained in the computer storage medium are used, when executed by a computer processor, to perform all embodiments corresponding to the search method mentioned above.
[0202] The above description is merely a preferred embodiment of this disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of this disclosure is not limited to technical solutions formed by specific combinations of the above-described technical features, but should also cover other technical solutions formed by arbitrary combinations of the above-described technical features or their equivalents without departing from the above-described concept. One example is a technical solution formed by interchanging the above features with technical features having similar functions that are disclosed in (but not limited to) this disclosure.
[0203] Furthermore, although the operations are described in a specific order, this should not be construed as requiring these operations to be performed in the specific order shown or in a sequential order. In certain environments, multitasking and parallel processing may be advantageous. Similarly, while some specific implementation details are included in the above discussion, these should not be construed as limiting the scope of this disclosure. Certain features described in the context of individual embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented individually or in any suitable sub-combination in multiple embodiments.
[0204] Although the subject matter has been described using language specific to structural features and / or methodological logic, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above. Rather, the specific features and actions described above are merely illustrative examples of implementing the claims.
Claims
1. A search method, characterized in that it comprises: determining an index string to be searched corresponding to received information to be searched; determining, based on the index string to be searched, target combination information matching the index string to be searched, the target combination information including a reference index string and associated ciphertext data information; and using the ciphertext data information in the target combination information as the search result of the information to be searched; wherein the target combination information is recorded in a ciphertext database, and the ciphertext database has a corresponding relationship with a user identity identifier; wherein the step of determining the combination information recorded in the ciphertext database includes: obtaining an initial character sequence based on the characters contained in information to be stored; determining the information type to which the information to be stored belongs, and obtaining at least one key character position corresponding to the information type; finding, in the initial character sequence, the key character corresponding to each key character position, and constructing a character sequence to be stored based on the key characters; determining, based on the character sequence to be stored, an index string to be stored corresponding to the received information to be stored; encrypting the information to be stored to obtain the ciphertext data information of the information to be stored; and determining a reference index string based on the index string to be stored, associating the reference index string with the ciphertext data information to form combination information, and storing the combination information in the ciphertext database; and wherein the step of determining the reference index string based on the index string to be stored includes: extracting the encryption operation result of each group in the index string to be stored to obtain the reference index string.
2. The method according to claim 1, characterized in that the step of determining the index string to be searched corresponding to the received information to be searched includes: grouping the characters in the information to be searched according to a set splitting rule to obtain at least one first character group; performing hash processing on each first character group to obtain a first hash value corresponding to each first character group; and determining the index string to be searched of the information to be searched based on the first hash values.
3. The method according to claim 1, characterized in that the step of determining, based on the character sequence to be stored, the index string to be stored corresponding to the received information to be stored includes: determining the character sequence to be stored based on the characters contained in the information to be stored; grouping the characters contained in the character sequence to be stored according to a set splitting rule to obtain at least one second character group; performing hash processing on each second character group to obtain a second hash value corresponding to each second character group; and determining the index string to be stored of the information to be stored based on the second hash values.
4. The method according to claim 2 or 3, characterized in that the set splitting rule includes: denoting the character sequence currently to be split as the sequence to be split, where the sequence to be split is the information to be searched or the character sequence to be stored; taking each of the first set number of leading characters in the sequence to be split as a target character; and, for each target character in the sequence to be split, starting from the target character, sequentially selecting a second set number of characters along the forward direction of the sequence to be split to form the character group corresponding to the target character; wherein the second set number is less than the length of the sequence to be split, and the first set number is determined based on the length and the second set number.
5. The method according to claim 1, characterized in that the step of determining the reference index string based on the index string to be stored includes: using the index string to be stored directly as the reference index string; or selecting a sub-index string from the index string to be stored and using it as the reference index string.
6. The method according to claim 1, characterized in that the step of determining the target combination information that matches the index string to be searched includes: for each piece of pre-stored combination information, comparing the index string to be searched with the reference index string in the combination information to obtain an index string comparison result; and using the combination information whose index string comparison result meets a set comparison condition as the target combination information that matches the index string to be searched; wherein the set comparison condition is: the index string to be searched is a substring of the reference index string in the combination information; or, the reference index string in the combination information is a substring of the index string to be searched.
7. The method according to any one of claims 1-3 and 5-6, characterized in that it further includes: decrypting the search result and displaying the decrypted result on the client's search interface.
8. A search device, characterized in that it comprises: an index string determination module, used to determine an index string to be searched corresponding to received information to be searched; a target combination information determination module, used to determine, based on the index string to be searched, target combination information that matches the index string to be searched, the target combination information including a reference index string and associated ciphertext data information; and a search result determination module, used to take the ciphertext data information in the target combination information as the search result of the information to be searched; wherein the target combination information is recorded in a ciphertext database, and the ciphertext database has a corresponding relationship with a user identity identifier; wherein the device further includes a combination information determination module, which includes: a unit for determining the index string to be stored, used to obtain an initial character sequence based on the characters contained in information to be stored, determine the information type to which the information to be stored belongs, obtain at least one key character position corresponding to the information type, find in the initial character sequence the key character corresponding to each key character position, construct a character sequence to be stored based on the key characters, and determine, based on the character sequence to be stored, an index string to be stored corresponding to the received information to be stored; a ciphertext data information acquisition unit, used to encrypt the information to be stored to obtain the ciphertext data information of the information to be stored; and a combination information constituting unit, used to determine a reference index string based on the index string to be stored, associate the reference index string with the ciphertext data information to form combination information, and store the combination information in the ciphertext database; and wherein determining the reference index string based on the index string to be stored includes: extracting the encryption operation result of each group in the index string to be stored to obtain the reference index string.
9. A computer device, characterized in that the computer device includes: one or more processors; and a storage device for storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the search method according to any one of claims 1-7.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the search method according to any one of claims 1-7.
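As an illustrative note on claims 2 to 4, the splitting rule and per-group hashing might be sketched as below. The claims state only that the first set number is determined from the sequence length and the second set number; taking it as length minus group size plus one (a sliding window) is an assumption. The use of SHA-256, the truncation of each digest to `digest_chars` characters, and the concatenation of the results are likewise assumptions, since the claims refer only to "hash processing".

```python
import hashlib


def split_into_groups(sequence: str, group_size: int) -> list:
    """Slide a window of group_size characters over the sequence."""
    if not 0 < group_size < len(sequence):
        raise ValueError("group size must be positive and less than the length")
    # Assumed: first set number = length - second set number + 1.
    target_count = len(sequence) - group_size + 1
    return [sequence[i:i + group_size] for i in range(target_count)]


def index_string(sequence: str, group_size: int, digest_chars: int = 8) -> str:
    """Hash each character group and concatenate the truncated digests."""
    parts = []
    for group in split_into_groups(sequence, group_size):
        digest = hashlib.sha256(group.encode("utf-8")).hexdigest()
        parts.append(digest[:digest_chars])  # assumed fixed width per group
    return "".join(parts)


stored = index_string("13812345678", 3)
query = index_string("1234", 3)
print(query in stored)  # prints True
```

Because consecutive groups of a fragment are also consecutive groups of the full sequence, the fragment's index string appears as a substring of the stored index string, which is what the substring comparison of claim 6 relies on.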