Data classification method and device, processing equipment and storage medium
By segmenting the fingerprint of the data to be classified into multiple segments and performing segmented retrieval and matching in a preset database, the method addresses the long processing time caused by the large number of fingerprints in the database in existing technology and makes the classification process more efficient.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- 南京中孚信息技术有限公司
- Filing Date
- 2022-09-02
- Publication Date
- 2026-06-23
AI Technical Summary
In existing technologies, when the database contains many classified-point fingerprints, the classification process takes a long time and is inefficient.
The fingerprint of the data to be classified is segmented into multiple fingerprint segments to be matched. Segmented retrieval is performed in a preset database to determine the candidate fingerprint set, and the target fingerprint is determined by Hamming distance matching.
It reduces the amount of data for which Hamming distances must be computed, improves search efficiency, shortens the time required for classification, and improves the accuracy and efficiency of classification.
[Figure: abstract drawing, CN115408720B]
Abstract
Description
Technical Field
[0001] This invention relates to the field of confidentiality science and technology, and more specifically, to a data classification method, apparatus, processing device, and storage medium.
Background Technology
[0002] Classifying classified documents is the prerequisite and foundation for carrying out all aspects of confidentiality management. The key content within a document that determines whether a section needs to be classified is called a "classified point," which includes text, images, videos, and audio. The process of determining the location, classification level, and type of these classified points is called "classification."
[0003] In existing technologies, the data to be classified is typically fingerprinted and then matched sequentially against multiple classified-point fingerprints pre-stored in a database. The classification of the data is then determined based on the best-matching classified-point fingerprint.
[0004] However, when the database contains a large number of pre-stored classified-point fingerprints, this search method is time-consuming and inefficient.
Summary of the Invention
[0005] This invention provides a data classification method, apparatus, processing device, and storage medium, which can segment the fingerprint of the data to be classified and then perform segmented retrieval in a preset database to determine the target fingerprint. Compared with the prior art, which searches and matches the entire fingerprint, this reduces the system's retrieval and matching overhead, reduces search time, and improves efficiency.
[0006] The embodiments of the present invention can be implemented as follows:
[0007] In a first aspect, embodiments of this application provide a data classification method, the method comprising:
[0008] The fingerprint of the data to be classified is segmented to obtain multiple fingerprint segments to be matched;
[0009] Based on the results of segmented retrieval of each fingerprint segment to be matched in a preset database, a candidate fingerprint set is determined, which includes multiple candidate fingerprints;
[0010] Each candidate fingerprint in the candidate fingerprint set is matched with the fingerprint of the data to be classified to determine the target fingerprint, and the classification result of the data to be classified is determined based on the target fingerprint.
[0011] In an optional implementation, before determining the candidate fingerprint set based on the results of segmented retrieval in a preset database for each of the fingerprint segments to be matched, the method further includes:
[0012] Multiple classification rule fingerprints are segmented to obtain multiple rule fingerprint segments corresponding to each classification rule fingerprint;
[0013] Based on the position of each rule fingerprint segment in the corresponding classification rule fingerprint, each rule fingerprint segment is added to the corresponding segmented fingerprint set, where the number of segmented fingerprint sets is equal to the number of rule fingerprint segments of each classification rule fingerprint;
[0014] The preset database is obtained by combining the segmented fingerprint sets.
[0015] In one optional implementation, determining the candidate fingerprint set based on the results of segmented retrieval of each of the fingerprint segments to be matched in a preset database includes:
[0016] The corresponding fingerprint segment to be matched is retrieved in each of the segmented fingerprint sets to obtain multiple search results;
[0017] Based on the search results, at least one candidate fingerprint is determined;
[0018] The set of all candidate fingerprints is taken as the candidate fingerprint set.
[0019] In an optional implementation, the method further includes:
[0020] If the search result indicates that the segmented fingerprint set does not match the fingerprint segment to be matched, then the corresponding fingerprint segment to be matched is searched in the next segmented fingerprint set.
[0021] In an optional implementation, the method further includes:
[0022] If the search result shows that the segmented fingerprint set matches the fingerprint segment to be matched, then at least one rule fingerprint segment identical to the fingerprint segment to be matched is obtained from the segmented fingerprint set;
[0023] Based on the identifier of each such rule fingerprint segment, the candidate fingerprints corresponding to the rule fingerprint segments are determined.
[0024] In one optional implementation, the step of matching each candidate fingerprint in the candidate fingerprint set with the fingerprint of the data to be classified to determine the target fingerprint includes:
[0025] The Hamming distance between each candidate fingerprint in the candidate fingerprint set and the fingerprint of the data to be classified is calculated to obtain multiple fingerprint matching values;
[0026] The candidate fingerprint corresponding to the smallest of the fingerprint matching values is taken as the target fingerprint.
[0027] In one optional implementation, determining the classification result of the data to be classified based on the target fingerprint includes:
[0028] The classified-point information corresponding to the target fingerprint is used as the classification result of the data to be classified.
[0029] In a second aspect, embodiments of this application provide a data classification device, comprising:
[0030] A segmentation module, used to segment the fingerprint of the data to be classified into multiple fingerprint segments to be matched;
[0031] A retrieval module, used to determine a candidate fingerprint set based on the results of segmented retrieval of each fingerprint segment to be matched in a preset database, where the candidate fingerprint set includes multiple candidate fingerprints;
[0032] A matching module, used to match each candidate fingerprint in the candidate fingerprint set with the fingerprint of the data to be classified, determine the target fingerprint, and determine the classification result of the data to be classified based on the target fingerprint.
[0033] The segmentation module is further configured to: segment multiple classification rule fingerprints to obtain multiple rule fingerprint segments corresponding to each classification rule fingerprint; add each rule fingerprint segment to the corresponding segmented fingerprint set according to its position in the corresponding classification rule fingerprint, where the number of segmented fingerprint sets is equal to the number of rule fingerprint segments of each classification rule fingerprint; and combine the segmented fingerprint sets to obtain the preset database.
[0034] The retrieval module is further configured to: retrieve the corresponding fingerprint segment to be matched in each of the segmented fingerprint sets to obtain multiple retrieval results; determine at least one candidate fingerprint based on the retrieval results; and take the set of all candidate fingerprints as the candidate fingerprint set.
[0035] The retrieval module is further configured to: if the retrieval result indicates that the segmented fingerprint set does not match the fingerprint segment to be matched, retrieve the next fingerprint segment to be matched in the next segmented fingerprint set.
[0036] The retrieval module is further configured to: if the retrieval result indicates that the segmented fingerprint set matches the fingerprint segment to be matched, obtain from the segmented fingerprint set at least one rule fingerprint segment identical to the fingerprint segment to be matched; and determine the candidate fingerprints corresponding to the rule fingerprint segments based on the identifier of each rule fingerprint segment.
[0037] The matching module is further configured to: calculate the Hamming distance between each candidate fingerprint in the candidate fingerprint set and the fingerprint of the data to be classified to obtain multiple fingerprint matching values; and take the candidate fingerprint corresponding to the smallest of the fingerprint matching values as the target fingerprint.
[0038] The matching module is further configured to use the classified-point information corresponding to the target fingerprint as the classification result of the data to be classified.
[0039] In a third aspect, embodiments of this application provide a processing device, the processing device comprising: a processor, a storage medium, and a bus, the storage medium storing machine-readable instructions executable by the processor. When the processing device is running, the processor communicates with the storage medium via the bus, and the processor executes the machine-readable instructions to perform the steps of the data classification method as described in any one of the first aspects.
[0040] In a fourth aspect, embodiments of this application provide a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the data classification method as described in any one of the first aspects.
[0041] The beneficial effects of the embodiments of the present invention include:
[0042] The data classification method, apparatus, processing device, and storage medium provided in the embodiments of this application obtain a candidate fingerprint set by performing segmented retrieval, in a preset database, of the multiple fingerprint segments obtained by segmenting the fingerprint of the data to be classified, and then match the candidate fingerprint set against the fingerprint of the data to be classified to obtain the classification result. Compared with the whole-fingerprint retrieval method in the prior art, this segmented retrieval method reduces the amount of computation when matching in the preset database, reduces the time overhead of retrieval and matching, and improves search efficiency.
Attached Figure Description
[0043] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of the present invention and should not be regarded as a limitation on the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.
[0044] Figure 1 is a flowchart of the steps of the data classification method provided in the embodiments of this application;
[0045] Figure 2 is a schematic diagram of the fingerprint segmentation of the data to be classified in the data classification method provided in the embodiments of this application;
[0046] Figure 3 is a schematic diagram of another step of the data classification method provided in the embodiments of this application;
[0047] Figure 4 is a schematic diagram of the preset database of the data classification method provided in the embodiments of this application;
[0048] Figure 5 is a schematic diagram of another step of the data classification method provided in the embodiments of this application;
[0049] Figure 6 is a schematic diagram of matching in the data classification method provided in the embodiments of this application;
[0050] Figure 7 is a schematic diagram of another step of the data classification method provided in the embodiments of this application;
[0051] Figure 8 is a schematic diagram of another step of the data classification method provided in the embodiments of this application;
[0052] Figure 9 is an execution flowchart of the data classification method provided in the embodiments of this application;
[0053] Figure 10 is a schematic diagram of the data classification device provided in the embodiments of this application;
[0054] Figure 11 is a schematic diagram of the processing device provided in the embodiments of this application.
[0055] Reference numerals: 100 - Data classification device; 1001 - Segmentation module; 1002 - Retrieval module; 1003 - Matching module; 2001 - Processor; 2002 - Memory.
Detailed Implementation
[0056] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. The components of the embodiments of the present invention described and shown in the accompanying drawings can generally be arranged and designed in various different configurations.
[0057] Therefore, the following detailed description of the embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely to illustrate selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort are within the scope of protection of the invention.
[0058] It should be noted that similar labels and letters in the following figures indicate similar items. Therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures.
[0059] Furthermore, the terms "first" and "second" are used only to distinguish descriptions and should not be interpreted as indicating or implying relative importance.
[0060] It should be noted that, where there is no conflict, the features in the embodiments of the present invention can be combined with each other.
[0061] The efficiency and accuracy of classification determine the efficiency and quality of confidentiality work. In existing technologies, databases typically pre-store multiple complete classified-point fingerprints. After the data to be classified is fingerprinted, its fingerprint is matched sequentially against each classified-point fingerprint in the database, which involves calculating a Hamming distance for every stored fingerprint. Finally, based on the calculated Hamming distances, the best-matching classified-point fingerprint is obtained, and the data to be classified is classified accordingly.
[0062] However, when the database stores a large number of classified-point fingerprints, this search method is time-consuming and inefficient.
[0063] Based on this, this application provides a data classification method, apparatus, processing device, and storage medium, which can perform segmented retrieval of the fingerprint of the data to be classified in a preset database to determine the target fingerprint. Compared with the prior art, which searches and matches the entire fingerprint, this reduces the system's retrieval and matching overhead, reduces search time, and improves efficiency.
[0064] A confidential document may contain only a few confidential parts, such as a piece of text, a sentence, or a password. Furthermore, a confidential document may contain multiple contents with different levels of confidentiality. In this case, if the entire confidential document is simply managed as a single confidential matter, it may result in a large number of confidential documents and an inaccurate level of confidentiality.
[0065] Therefore, in order to capture the essential attributes of classified documents, it is necessary to distinguish the key, confidential, minimal information units in the classified documents from other content and determine the corresponding classification level. These minimal information units are the classified points, and the process of determining their classification level is called classification.
[0066] To facilitate classification, the fingerprints of classified points with determined classification levels are typically pre-stored in a database. These fingerprints are usually saved as binary strings. Fingerprinting the data to be classified likewise yields a binary string. It can be understood that the fingerprint binary strings of classified points with the same classification level and similar content are equal or similar. The classified point and classification level of the data to be classified can therefore be determined by calculating the Hamming distance between the two fingerprints.
[0067] In this embodiment of the application, segmenting the fingerprint of the data to be classified and matching it segment by segment reduces the amount of data for which the Hamming distance must be calculated, thereby improving the search efficiency.
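The distance computation described above can be illustrated with a minimal sketch, assuming fingerprints are held as equal-length strings of '0' and '1'. The function name `hamming_distance` and the sample fingerprints are hypothetical and are not taken from the patent.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length binary strings differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 16-bit fingerprints, used only for illustration.
stored_fingerprint = "1011001110001111"   # a pre-stored classified-point fingerprint
query_fingerprint = "1011001010001101"    # fingerprint of the data to be classified

distance = hamming_distance(stored_fingerprint, query_fingerprint)
print(distance)  # 2: the strings differ in two bit positions
```

A small distance suggests similar content; which threshold counts as "similar" is not specified in the text and would have to be chosen by the implementer.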
[0068] Figure 1 is a flowchart of the steps of the data classification method provided in this application embodiment. The executing entity of this application can be a computer device with computing and processing capabilities. As shown in Figure 1, the method includes the following steps:
[0069] S101, the fingerprint of the data to be classified is segmented to obtain multiple fingerprint segments to be matched.
[0070] The fingerprint of the data to be classified can be obtained by fingerprinting the data to be classified, and it is represented in the form of a binary string.
[0071] The data to be classified may be text, images, audio, etc., and this application does not limit it.
[0072] Based on the preset number of segments and the preset segment length, the binary string corresponding to the fingerprint of the data to be classified is evenly divided into the preset number of fingerprint segments to be matched, each of the preset segment length. Each fingerprint segment to be matched is represented in the form of a binary string.
[0073] It should be noted that since the binary string corresponding to the fingerprint of the data to be classified is obtained by converting the fingerprint through a preset conversion algorithm, the length of the binary string is the same for different data to be classified.
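Step S101 can be sketched as follows, assuming the fingerprint is a string of '0' and '1' whose length is an exact multiple of the number of segments. The helper name `split_fingerprint` and the 16-bit example are hypothetical.

```python
from typing import List


def split_fingerprint(fingerprint: str, num_segments: int) -> List[str]:
    """Evenly divide a binary-string fingerprint into num_segments segments."""
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    if len(fingerprint) % num_segments != 0:
        raise ValueError("fingerprint length must be a multiple of num_segments")
    segment_length = len(fingerprint) // num_segments
    return [
        fingerprint[i * segment_length:(i + 1) * segment_length]
        for i in range(num_segments)
    ]


# Hypothetical fingerprint of the data to be classified, cut into 4 segments.
segments = split_fingerprint("1011001010001101", 4)
print(segments)  # ['1011', '0010', '1000', '1101']
```

Because the fingerprint length is fixed by the conversion algorithm, the segment length follows directly from the preset number of segments.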
[0074] S102, determine the candidate fingerprint set based on the results of segmented retrieval of each fingerprint segment to be matched in the preset database.
[0075] The candidate fingerprint set includes: multiple candidate fingerprints.
[0076] Candidate fingerprints are fingerprints that have already been classified, with their classification level, content, and other information determined. They are represented in the form of binary strings, and their length can be equal to the length of the binary string corresponding to the fingerprint of the data to be classified. Furthermore, each candidate fingerprint is uniquely identified by a candidate fingerprint identifier.
[0077] It should be noted that when the fingerprint segments to be matched are compared with each fingerprint in the preset database, if any fingerprint segment to be matched matches the corresponding part of a stored fingerprint, then the fingerprint of the data to be classified is likely close to that stored fingerprint, and the Hamming distance between them may be small. That stored fingerprint can serve as a candidate fingerprint.
[0078] Optionally, as shown in Figure 2, the multiple fingerprint segments to be matched of the fingerprint of the data to be classified can be searched sequentially in the preset database. For example, the first fingerprint segment to be matched of the fingerprint of the first data to be classified can be searched first, to determine whether the portion of each fingerprint in the preset database from its start to the preset segment length matches the first fingerprint segment to be matched. If there is no match, the second fingerprint segment to be matched is searched next, and so on until a match is found.
[0079] It can be understood that when fingerprint segments to be matched successfully match the corresponding segments of fingerprints in the preset database, there may be one or more candidate fingerprints, and the set of these candidate fingerprints is taken as the candidate fingerprint set.
[0080] S103, each candidate fingerprint in the candidate fingerprint set is matched with the fingerprint of the data to be classified to determine the target fingerprint, and the classification result of the data to be classified is determined based on the target fingerprint.
[0081] The distance, such as the Hamming distance, between each candidate fingerprint in the candidate fingerprint set and the fingerprint of the data to be classified can be calculated, and the candidate fingerprint with the smallest distance is then taken as the target fingerprint.
[0082] Furthermore, since the target fingerprint is a fingerprint with a determined classification level, the classification level of the target fingerprint can be used as the classification level of the data to be classified, thus obtaining the classification result of the data to be classified.
[0083] In this way, the classification result can be obtained by calculating the distance only between the fingerprint of the data to be classified and the candidate fingerprints that partially match it, without having to calculate the distance to every fingerprint in the preset database, which greatly reduces the amount of computation.
[0084] In this embodiment, the multiple fingerprint segments obtained by segmenting the fingerprint of the data to be classified are retrieved in a preset database to obtain a candidate fingerprint set. The candidate fingerprint set is then matched with the fingerprint of the data to be classified, which reduces the amount of computation when matching in the preset database, reduces the time overhead of retrieval and matching, and improves search efficiency.
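Step S103 can be sketched as a minimum search over the candidate fingerprint set. This is only an illustration under stated assumptions: the identifiers `rule-1` and `rule-2`, the level labels, and the function name `pick_target` are hypothetical, and ties are resolved by taking the first minimum, which the text does not specify.

```python
from typing import Dict, Tuple


def hamming_distance(a: str, b: str) -> int:
    """Count differing positions of two equal-length binary strings."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pick_target(query: str, candidates: Dict[str, str]) -> Tuple[str, int]:
    """Return (identifier, distance) of the candidate closest to the query."""
    if not candidates:
        raise ValueError("candidate fingerprint set is empty")
    target_id = min(
        candidates, key=lambda cid: hamming_distance(candidates[cid], query)
    )
    return target_id, hamming_distance(candidates[target_id], query)


# Hypothetical candidate fingerprints and their already-determined levels.
candidates = {
    "rule-1": "1011001110001111",
    "rule-2": "0000001011111101",
}
levels = {"rule-1": "confidential", "rule-2": "secret"}

query = "1011001010001101"  # fingerprint of the data to be classified
target_id, distance = pick_target(query, candidates)
result = levels[target_id]  # level of the target fingerprint becomes the result
print(target_id, distance, result)  # rule-1 2 confidential
```

Only the candidates are scanned here, so the cost of this step grows with the size of the candidate fingerprint set rather than with the size of the whole database.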
[0085] Optionally, as shown in Figure 3, before the candidate fingerprint set is determined in step S102 above based on the results of segmented retrieval of each fingerprint segment to be matched in the preset database, the method may further include the following steps S201 to S203:
[0086] S201, multiple classification rule fingerprints are segmented to obtain multiple rule fingerprint segments corresponding to each classification rule fingerprint.
[0087] The classification rule fingerprints can be multiple fingerprints that have already been classified and are stored in the preset database. The classification level and content of each classification rule fingerprint are determined; it is represented by a binary string and uniquely identified by a classification rule fingerprint identifier.
[0088] It should be noted that the classification rule fingerprints are all the classified-point fingerprints in the preset database, while the above-mentioned candidate fingerprints are those classified-point fingerprints that match a fingerprint segment of the fingerprint of the data to be classified at the corresponding position.
[0089] The length of the binary string corresponding to each classification rule fingerprint can be the same as the length of the fingerprint of the data to be classified. Therefore, each classification rule fingerprint can be divided into the preset number of rule fingerprint segments, each of the preset segment length.
[0090] S202, based on the position of each rule fingerprint segment in the corresponding classification rule fingerprint, add each rule fingerprint segment to the corresponding segmented fingerprint set.
[0091] The number of segmented fingerprint sets is equal to the number of rule fingerprint segments of each classification rule fingerprint.
[0092] As shown in Figure 4, after the classification rule fingerprints are divided into rule fingerprint segments, each of the rule fingerprint segments of a classification rule fingerprint can be marked with the classification rule fingerprint identifier of that fingerprint.
[0093] Furthermore, the segmented fingerprint set to which each rule fingerprint segment belongs is determined according to its position in the classification rule fingerprint. For example, starting from the beginning of the first classification rule fingerprint, a binary string of the preset segment length is taken as the first segment of the first classification rule fingerprint and added to the first segmented fingerprint set. The subsequent segments of the first classification rule fingerprint are taken in order in the same way and added to the segmented fingerprint sets at the corresponding positions.
[0094] S203, combine the segmented fingerprint sets to obtain the preset database.
[0095] It can be understood that the number of segmented fingerprint sets is equal to the preset number of segments, and the number of rule fingerprint segments contained in each segmented fingerprint set is equal to the number of classification rule fingerprints. Therefore, the combination of the segmented fingerprint sets can be used as the preset database for subsequent segmented retrieval of the fingerprint of the data to be classified.
[0096] In this embodiment, the classification rule fingerprints are segmented to generate multiple segmented fingerprint sets, which are then combined to form the preset database. This allows the segments of the fingerprint of the data to be classified to be looked up according to their corresponding positions, improving the efficiency of searching for and determining the candidate fingerprint set.
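Steps S201 to S203 can be sketched as building one lookup table per segment position. The layout below (a list of dictionaries mapping a segment string to the identifiers of the rule fingerprints that contain it at that position) is an assumption made for illustration; the patent only requires that each segment be stored with its identifier in the set for its position. The name `build_preset_database` and the sample fingerprints are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_preset_database(
    rule_fingerprints: Dict[str, str], num_segments: int
) -> List[Dict[str, Set[str]]]:
    """Build one segmented fingerprint set per segment position.

    segment_sets[pos][segment] holds the identifiers of all classification
    rule fingerprints whose segment at position pos equals `segment`.
    """
    segment_sets = [defaultdict(set) for _ in range(num_segments)]
    for rule_id, fingerprint in rule_fingerprints.items():
        if len(fingerprint) % num_segments != 0:
            raise ValueError("fingerprint length must be a multiple of num_segments")
        segment_length = len(fingerprint) // num_segments
        for pos in range(num_segments):
            segment = fingerprint[pos * segment_length:(pos + 1) * segment_length]
            segment_sets[pos][segment].add(rule_id)  # tag segment with its identifier
    return [dict(s) for s in segment_sets]


# Hypothetical classification rule fingerprints keyed by identifier.
rules = {
    "rule-1": "1011001110001111",
    "rule-2": "0000001011111101",
    "rule-3": "1110001000010110",
}
database = build_preset_database(rules, 4)
print(len(database))        # 4 segmented fingerprint sets
print(database[1]["0010"])  # identifiers sharing the second segment '0010'
```

Keying each set by the segment string makes an exact-match lookup a single dictionary access; identical segments from different rule fingerprints are grouped under one key while every identifier is still retained.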
[0097] Optionally, as shown in Figure 5, determining the candidate fingerprint set in step S102 above based on the results of segmented retrieval of each fingerprint segment to be matched in the preset database can be achieved by the following steps S301 to S303.
[0098] S301, retrieve the corresponding fingerprint segment to be matched in each segmented fingerprint set to obtain multiple search results.
[0099] Referring to Figure 6, suppose both the classification rule fingerprint and the fingerprint of the data to be classified are divided into 4 segments and the segments are compared position by position, and the third segment of the classification rule fingerprint is found to be equal to the third segment of the fingerprint of the data to be classified. Then the Hamming distance between this classification rule fingerprint and the fingerprint of the data to be classified may be within 3. Therefore, the classification rule fingerprint can be used as a candidate fingerprint. After multiple candidate fingerprints are obtained, the candidate fingerprint set can be screened further to determine the target fingerprint.
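The reasoning behind the 4-segment example can be checked with a small brute-force sketch. By the pigeonhole principle, if two fingerprints cut into 4 segments differ in at most 3 bits, at least one segment must be identical; the converse does not hold, which is why a matching segment only makes a fingerprint a candidate. The 8-bit width and all function names below are hypothetical choices that keep the exhaustive check small.

```python
from itertools import product
from typing import List


def split(fp: str, n: int) -> List[str]:
    """Evenly divide a binary string into n segments."""
    size = len(fp) // n
    return [fp[i * size:(i + 1) * size] for i in range(n)]


def hamming(a: str, b: str) -> int:
    """Count differing positions of two equal-length binary strings."""
    return sum(x != y for x, y in zip(a, b))


def shares_a_segment(a: str, b: str, n: int) -> bool:
    """True if a and b are identical in at least one segment position."""
    return any(x == y for x, y in zip(split(a, n), split(b, n)))


def check_pigeonhole(bits: int = 8, n: int = 4) -> bool:
    """Exhaustively verify: distance <= n - 1 implies a shared segment."""
    all_fps = ["".join(p) for p in product("01", repeat=bits)]
    return all(
        shares_a_segment(a, b, n)
        for a in all_fps
        for b in all_fps
        if hamming(a, b) <= n - 1
    )


print(check_pigeonhole())                             # True
print(shares_a_segment("00000000", "01010101", 4))    # False: distance 4, no shared segment
```

This is why exact segment lookup does not miss any stored fingerprint within distance 3 when 4 segments are used, while the final Hamming distance step is still needed to rank the candidates.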
[0100] Based on the above principle, the multiple fingerprint segments to be matched obtained by segmenting the fingerprint of the data to be classified can be searched sequentially in the corresponding segmented fingerprint sets to determine multiple search results. For example, when the first fingerprint segment to be matched of the fingerprint of the second data to be classified in Figure 2 is searched in the first segmented fingerprint set of the preset database in Figure 4, if no rule fingerprint segment matching it is found, the search result is a mismatch. The search then continues in the second segmented fingerprint set of the preset database with the second fingerprint segment to be matched. If a rule fingerprint segment matching the second fingerprint segment to be matched is found, the search result is a match, and the search stops.
[0101] S302, based on each search result, determine at least one candidate fingerprint.
[0102] Based on the search results, i.e. the matches and mismatches described above, each classification rule fingerprint that contains a rule fingerprint segment matching a fingerprint segment of the fingerprint of the data to be classified is selected as a candidate fingerprint.
[0103] S303, the set of all candidate fingerprints is taken as the candidate fingerprint set.
[0104] It can be understood that the same fingerprint segment to be matched may match the rule fingerprint segments of one or more classification rule fingerprints in the corresponding segmented fingerprint set. In this case, all of these classification rule fingerprints can be used as candidate fingerprints, and the set of these candidate fingerprints is taken as the candidate fingerprint set.
[0105] In this embodiment, each fingerprint segment to be matched is searched sequentially in its corresponding segmented fingerprint set, and a candidate fingerprint set is generated based on the search results. This segmented matching method reduces the amount of data for which the Hamming distance between the fingerprint of the data to be classified and the classification rule fingerprints must be calculated, thus improving the classification efficiency.
[0106] Optionally, the data classification method provided in this application embodiment may further include the following step:
[0107] If the search result shows that the segmented fingerprint set does not match the fingerprint segment to be matched, then the next fingerprint segment to be matched is searched in the next segmented fingerprint set.
[0108] The processing device performs segmented searches from front to back according to the position of each fingerprint segment to be matched in the fingerprint of the data to be classified. If no matching rule fingerprint segment is found in the corresponding segmented fingerprint set, the next fingerprint segment to be matched is searched in its corresponding segmented fingerprint set, until a rule fingerprint segment that matches the corresponding fingerprint segment to be matched is found in some segmented fingerprint set.
[0109] For example, referring to Figure 2, if no matching rule fingerprint segment is found when the first fingerprint segment to be matched of the fingerprint of the Nth data to be classified is searched in the first segmented fingerprint set in Figure 4, the second fingerprint segment to be matched of that fingerprint is searched in the second segmented fingerprint set, and the search result determines whether to go on to the next fingerprint segment to be matched.
[0110] In this embodiment, when no matching rule fingerprint segment is found for a fingerprint segment to be matched in its corresponding segmented fingerprint set, the search continues with the next fingerprint segment to be matched. Since this stage requires only segmented lookup and no distance calculation, the search time is short and the search efficiency is high.
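The front-to-back search described in this embodiment can be sketched as a loop that moves to the next segment position on a mismatch and stops at the first match. The function name `first_matching_segment`, the return convention, and the sample database are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def first_matching_segment(
    query: str, segment_sets: List[Dict[str, Set[str]]]
) -> Optional[Tuple[int, Set[str]]]:
    """Search segment by segment, front to back, stopping at the first match.

    Returns (segment position, matched rule fingerprint identifiers),
    or None if no segment of the query matches anywhere.
    """
    segment_length = len(query) // len(segment_sets)
    for pos, segment_set in enumerate(segment_sets):
        segment = query[pos * segment_length:(pos + 1) * segment_length]
        matched = segment_set.get(segment)
        if matched:                   # match: stop searching
            return pos, set(matched)
        # mismatch: fall through to the next segmented fingerprint set
    return None


# Hypothetical preset database: one dictionary per segment position.
segment_sets = [
    {"1011": {"rule-1"}, "0000": {"rule-2"}},
    {"0011": {"rule-1"}, "0010": {"rule-2"}},
    {"1000": {"rule-1"}, "1111": {"rule-2"}},
    {"1111": {"rule-1"}, "1101": {"rule-2"}},
]

hit = first_matching_segment("1111001010001101", segment_sets)
miss = first_matching_segment("0101010101010101", segment_sets)
print(hit)   # (1, {'rule-2'}): first segment mismatched, second matched
print(miss)  # None
```

Stopping at the first match keeps the lookup cheap, at the cost of not collecting candidates that would only match on a later segment.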
[0111] Optionally, as shown in Figure 7, the data classification method provided in this embodiment of the application may further include the following steps:
[0112] S401, if the search result is that the segmented fingerprint set matches the fingerprint segment to be matched, obtain from that segmented fingerprint set at least one rule fingerprint segment that is identical to the fingerprint segment to be matched.
[0113] If one or more identical rule fingerprint segments are found in the corresponding segmented fingerprint set for a given fingerprint segment to be matched, the search result is considered a match, and the search for the next fingerprint segment to be matched will not continue.
[0114] For example, when the second fingerprint segment to be matched in the fingerprint of the first data to be classified shown in Figure 2 is searched in the second segmented fingerprint set shown in Figure 4, two matching rule fingerprint segments are obtained.
[0115] S402, based on the identifier of each rule fingerprint segment, determine multiple candidate fingerprints corresponding to each rule fingerprint segment.
[0116] As described in the above embodiments, although the rule fingerprint segments of one classification rule fingerprint belong to different segmented fingerprint sets, they all carry the identifier of that same classification rule fingerprint. Thus, by reading the identifier of a matching rule fingerprint segment, the classification rule fingerprint to which that segment belongs can be determined and used as a candidate fingerprint.
[0117] For example, referring to Figure 2 and Figure 4, suppose that searching the second fingerprint segment to be matched of the fingerprint of the first data to be classified in the second segmented fingerprint set shows that it matches both the second rule fingerprint segment of the first classification rule fingerprint and the second rule fingerprint segment of the second classification rule fingerprint. Based on the identifiers of these two rule fingerprint segments, the first and second classification rule fingerprints are both determined to be candidate fingerprints, and together they form the candidate fingerprint set.
[0118] In this embodiment, the candidate fingerprint is determined from the identifier of the rule fingerprint segment that matches the fingerprint segment to be matched. The complete classification rule fingerprint can thus be recovered from a single rule fingerprint segment, which avoids retrieval errors.
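Step S402 amounts to resolving identifiers back to whole fingerprints. The sketch below assumes a hypothetical identifier-to-fingerprint table `RULE_FINGERPRINTS` with made-up 32-bit values; the specification does not state how the identifier is stored, so this layout is only one possibility.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical store: rule fingerprint identifier -> full 32-bit fingerprint.
RULE_FINGERPRINTS: Dict[str, int] = {
    "rule-1": 0xA13C5D7E,
    "rule-2": 0xB23C0F11,
    "rule-3": 0x00FF00FF,
}


def candidates_from_ids(
    matched_ids: Iterable[str],
    rule_fingerprints: Dict[str, int],
) -> List[Tuple[str, int]]:
    """Resolve matched segment identifiers to whole rule fingerprints."""
    seen = set()
    candidates = []
    for rule_id in matched_ids:
        if rule_id in seen:
            continue  # the same rule may be reported more than once
        seen.add(rule_id)
        candidates.append((rule_id, rule_fingerprints[rule_id]))
    return candidates


# Both rules share the second segment 0x3C, so both become candidates.
candidate_set = candidates_from_ids(["rule-1", "rule-2"], RULE_FINGERPRINTS)
```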
[0119] Optionally, as shown in Figure 8, the operation in step S103 above of matching each candidate fingerprint in the candidate fingerprint set with the fingerprint of the data to be classified to determine the target fingerprint can be implemented by the following steps S501 to S502.
[0120] S501, calculate the Hamming distance between each candidate fingerprint in the candidate fingerprint set and the fingerprint of the data to be classified, and obtain multiple fingerprint matching values.
[0121] Hamming distance, also known as code distance, is the number of bit positions at which two equal-length codewords differ. For example, 10101 and 00110 differ in their first, fourth, and fifth bits, so their Hamming distance is 3.
[0122] The Hamming distance between each candidate fingerprint and the fingerprint of the data to be classified is calculated with an XOR operation, and the resulting Hamming distance is used as the fingerprint matching value between that candidate fingerprint and the fingerprint of the data to be classified.
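The XOR-based calculation can be illustrated with a short function. It assumes that fingerprints are held as Python integers of equal bit width; the function name `hamming_distance` is chosen for this example only.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two equal-length fingerprints differ."""
    # XOR leaves a 1 exactly where the two inputs disagree,
    # so counting the 1 bits gives the Hamming distance.
    return bin(a ^ b).count("1")


print(hamming_distance(0b10101, 0b00110))  # 3, the example from the text
```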
[0123] S502, select the candidate fingerprint corresponding to the fingerprint matching value with the smallest value among multiple fingerprint matching values as the target fingerprint.
[0124] It should be noted that the smaller the Hamming distance, i.e., the fingerprint matching value, the closer the candidate fingerprint is to the fingerprint of the data to be classified and the higher the degree of matching. Therefore, the candidate fingerprint with the smallest fingerprint matching value among all candidate fingerprints can be taken as the target fingerprint.
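Selecting the target fingerprint in step S502 is a minimum search over the fingerprint matching values. The following sketch assumes candidates are supplied as an identifier-to-fingerprint dictionary; the function name `pick_target` and the tie-breaking behaviour are assumptions, since the specification does not say how ties are resolved.

```python
from typing import Dict, Tuple


def pick_target(query: int, candidates: Dict[str, int]) -> Tuple[str, int]:
    """Return (identifier, fingerprint matching value) of the closest candidate."""
    if not candidates:
        raise ValueError("candidate fingerprint set is empty")
    matching_values = {
        rule_id: bin(fp ^ query).count("1") for rule_id, fp in candidates.items()
    }
    # min() returns the first minimum, so ties go to the earlier candidate.
    best = min(matching_values, key=matching_values.get)
    return best, matching_values[best]


candidates = {"rule-1": 0b10110010, "rule-2": 0b10111111}
print(pick_target(0b10110011, candidates))
```

With these sample values, `rule-1` differs from the query in one bit and `rule-2` in two, so `rule-1` is selected.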
[0125] In this embodiment, the target fingerprint is determined by comparing the Hamming distances between each candidate fingerprint and the fingerprint of the data to be classified. The computational cost is low, which improves the efficiency of data classification.
[0126] Optionally, the data classification method provided in this embodiment of the application may further include: using the classified point information corresponding to the target fingerprint as the classification result of the data to be classified.
[0127] It will be understood that the target fingerprint is one of the classification rule fingerprints. A classification rule fingerprint can be the fingerprint of a classified point whose classification level and content have been determined in advance. Therefore, the classification level and content of the target fingerprint, i.e., its classified point information, are also known.
[0128] If the target fingerprint matches the fingerprint of the data to be classified, the classification level of the target fingerprint can be used as the classification level of the data to be classified, and the classification result of the data to be classified can be generated.
[0129] In this embodiment, the classified point information of the target fingerprint that matches the data to be classified is used as the classification result of the data to be classified, thereby improving the accuracy of classification.
[0130] Building on the above embodiments, the steps of the data classification method provided in the embodiments of this application are described below with reference to Figure 9.
[0131] After the fingerprint of the data to be classified is input, it is first segmented to obtain multiple fingerprint segments to be matched. Optionally, before, after, or at the same time as this step, the multiple classification rule fingerprints can be segmented to obtain multiple rule fingerprint segments, from which multiple segmented fingerprint sets are generated and combined into the preset database.
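Segmentation and construction of the preset database can be sketched as below. The 32-bit fingerprint width, the four equal-width segments, and the names `split_fingerprint` and `build_database` are assumptions for illustration; the specification does not fix the fingerprint length or the number of segments here.

```python
from collections import defaultdict
from typing import Dict, List

FINGERPRINT_BITS = 32  # assumed fingerprint width
NUM_SEGMENTS = 4       # assumed segment count; must divide FINGERPRINT_BITS


def split_fingerprint(fp: int) -> List[int]:
    """Split a fingerprint into equal-width segments, highest bits first."""
    width = FINGERPRINT_BITS // NUM_SEGMENTS
    mask = (1 << width) - 1
    return [
        (fp >> (FINGERPRINT_BITS - width * (i + 1))) & mask
        for i in range(NUM_SEGMENTS)
    ]


def build_database(rules: Dict[str, int]) -> List[Dict[int, List[str]]]:
    """Build one segmented fingerprint set per segment position."""
    database = [defaultdict(list) for _ in range(NUM_SEGMENTS)]
    for rule_id, fp in rules.items():
        for position, segment in enumerate(split_fingerprint(fp)):
            # Each segment keeps the identifier of the rule it came from.
            database[position][segment].append(rule_id)
    return [dict(segment_set) for segment_set in database]


database = build_database({"rule-1": 0xA13C5D7E, "rule-2": 0xB23C0F11})
print([hex(s) for s in split_fingerprint(0xA13C5D7E)])
```

The number of segmented fingerprint sets equals the number of segments per fingerprint, and the same `split_fingerprint` function is applied to the fingerprint of the data to be classified so that positions line up.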
[0132] Then, following the position of each fingerprint segment to be matched in the fingerprint of the data to be classified, the segments are searched in turn in their corresponding segmented fingerprint sets. If no matching rule fingerprint segment is found in the corresponding segmented fingerprint set, the next fingerprint segment to be matched is searched in its corresponding segmented fingerprint set, and this step is repeated until a fingerprint segment to be matched finds a matching rule fingerprint segment in its corresponding segmented fingerprint set.
[0133] If at least one rule fingerprint segment that matches the fingerprint segment to be matched is found in the corresponding segmented fingerprint set, the candidate fingerprint of each such rule fingerprint segment is determined according to its identifier, and the candidate fingerprints together are taken as the candidate fingerprint set.
[0134] Finally, based on the Hamming distance between each candidate fingerprint in the candidate fingerprint set and the fingerprint of the data to be classified, the candidate fingerprint with the smallest Hamming distance is determined as the target fingerprint, and the classified point information of the target fingerprint is output as the classification result.
[0135] Referring to Figure 10, this application provides a data classification device 100, comprising:
[0136] The segmentation module 1001 is used to segment the fingerprint of the data to be classified into multiple fingerprint segments to be matched.
[0137] The retrieval module 1002 is used to determine a candidate fingerprint set based on the results of segmented retrieval of each fingerprint segment to be matched in a preset database. The candidate fingerprint set includes multiple candidate fingerprints.
[0138] The matching module 1003 is used to match each candidate fingerprint in the candidate fingerprint set with the fingerprint of the data to be classified, determine the target fingerprint, and determine the classification result of the data to be classified based on the target fingerprint.
[0139] The segmentation module 1001 is further configured to: segment multiple classification rule fingerprints to obtain multiple rule fingerprint segments corresponding to each classification rule fingerprint; add each rule fingerprint segment to the corresponding segmented fingerprint set according to the position of that rule fingerprint segment in its classification rule fingerprint, wherein the number of segmented fingerprint sets is equal to the number of rule fingerprint segments; and combine the segmented fingerprint sets to obtain the preset database.
[0140] The retrieval module 1002 is further configured to retrieve the corresponding fingerprint segments to be matched in each segment fingerprint set to obtain multiple retrieval results; determine at least one candidate fingerprint based on each retrieval result; and use the set of each candidate fingerprint as the candidate fingerprint set.
[0141] The retrieval module 1002 is also used to retrieve the corresponding fingerprint segment to be matched in the next segment fingerprint set if the retrieval result is that the segment fingerprint set does not match the fingerprint segment to be matched.
[0142] The retrieval module 1002 is further configured to: if the retrieval result is that the segmented fingerprint set matches the fingerprint segment to be matched, obtain from that segmented fingerprint set at least one rule fingerprint segment that is identical to the fingerprint segment to be matched; and determine the candidate fingerprints corresponding to each rule fingerprint segment based on the identifier of each rule fingerprint segment.
[0143] The matching module 1003 is further used to calculate the Hamming distance between each candidate fingerprint in the candidate fingerprint set and the fingerprint of the data to be classified, so as to obtain multiple fingerprint matching values; and to take the candidate fingerprint corresponding to the smallest of the multiple fingerprint matching values as the target fingerprint.
[0144] The matching module 1003 is also specifically used to use the classified point information corresponding to the target fingerprint as the classification result of the data to be classified.
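The division of work among modules 1001 to 1003 can be mirrored in a small class. This is a toy sketch under stated assumptions: the class name `ClassificationDevice`, the 16-bit fingerprints split into four segments, and the labels `"secret"` and `"confidential"` are hypothetical, and a real device may divide the functions differently.

```python
from typing import Dict, List, Optional, Tuple


class ClassificationDevice:
    """Toy counterpart of device 100; all names and sizes are assumptions."""

    def __init__(self, rules: Dict[str, Tuple[int, str]], bits: int = 16, parts: int = 4):
        self.bits, self.parts = bits, parts
        self.rules = rules  # identifier -> (fingerprint, classified point info)
        # Segmentation module 1001, second role: build the preset database.
        self.database: List[Dict[int, List[str]]] = [{} for _ in range(parts)]
        for rule_id, (fp, _info) in rules.items():
            for pos, seg in enumerate(self.segment(fp)):
                self.database[pos].setdefault(seg, []).append(rule_id)

    def segment(self, fp: int) -> List[int]:
        """Segmentation module 1001: split a fingerprint into segments."""
        width = self.bits // self.parts
        mask = (1 << width) - 1
        return [(fp >> (self.bits - width * (i + 1))) & mask for i in range(self.parts)]

    def retrieve(self, fp: int) -> List[str]:
        """Retrieval module 1002: candidates from the first segment that hits."""
        for pos, seg in enumerate(self.segment(fp)):
            hits = self.database[pos].get(seg)
            if hits:
                return hits
        return []

    def match(self, fp: int) -> Optional[str]:
        """Matching module 1003: classified point info of the closest candidate."""
        candidates = self.retrieve(fp)
        if not candidates:
            return None
        target = min(candidates, key=lambda r: bin(self.rules[r][0] ^ fp).count("1"))
        return self.rules[target][1]


device = ClassificationDevice({
    "rule-1": (0xA3C5, "secret"),
    "rule-2": (0xA0FF, "confidential"),
})
print(device.match(0xA3C4))
```

In this example both rules share the first segment with the query, so both become candidates, and the Hamming distance then selects `rule-1`.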
[0145] Referring to Figure 11, this embodiment also provides a processing device, which includes a processor 2001, a memory 2002, and a bus. The memory 2002 stores machine-readable instructions executable by the processor 2001. When the processing device is running, the processor 2001 and the memory 2002 communicate via the bus, and the processor 2001 executes the machine-readable instructions to perform the steps of the data classification method in the above embodiments.
[0146] The memory 2002, the processor 2001, and the bus are electrically connected to one another, directly or indirectly, to enable data transmission or interaction. For example, these components can be electrically connected to each other via one or more communication buses or signal lines. The data classification device includes at least one software function module that can be stored in the memory 2002 or embedded in the operating system (OS) of the processing device in the form of software or firmware. The processor 2001 is used to execute executable modules stored in the memory 2002, such as the software function modules and computer programs included in the data classification device.
[0147] The memory 2002 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), etc.
[0148] Optionally, this application also provides a storage medium storing a computer program, which, when run by a processor, executes the steps of the above-described method embodiments. The specific implementation and technical effects are similar and will not be repeated here.
[0149] Those skilled in the art will clearly understand that, for convenience and brevity, the specific working processes of the systems and devices described above are as described for the corresponding processes in the method embodiments and will not be repeated here. In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of modules is only a logical functional division, and in actual implementation there may be other division methods. Furthermore, multiple modules or components can be combined or integrated into another system, or some features can be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed can be made through communication interfaces; the indirect coupling or communication connection of devices or modules can be electrical, mechanical, or of other forms.
[0150] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. If the functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, ROM, RAM, magnetic disks, or optical disks.
[0151] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.
Claims
1. A data classification method, characterized in that the method includes: segmenting the fingerprint of the data to be classified to obtain multiple fingerprint segments to be matched; determining a candidate fingerprint set based on the results of segmented retrieval of each fingerprint segment to be matched in a preset database, wherein the candidate fingerprint set includes multiple candidate fingerprints, each of which is a classified point fingerprint that matches a fingerprint segment to be matched at the corresponding position in the fingerprint of the data to be classified; and matching each candidate fingerprint in the candidate fingerprint set with the fingerprint of the data to be classified to determine the target fingerprint, and determining the classification result of the data to be classified based on the target fingerprint. Before determining the candidate fingerprint set based on the results of segmented retrieval of each fingerprint segment to be matched in the preset database, the method further includes: segmenting multiple classification rule fingerprints to obtain multiple rule fingerprint segments corresponding to each classification rule fingerprint; adding each rule fingerprint segment to the corresponding segmented fingerprint set according to the position of that rule fingerprint segment in its classification rule fingerprint, wherein the number of segmented fingerprint sets is equal to the number of rule fingerprint segments; and combining the segmented fingerprint sets to obtain the preset database.
The step of determining the candidate fingerprint set based on the results of segmented retrieval of each fingerprint segment to be matched in the preset database includes: searching the fingerprint segments to be matched in turn in their corresponding segmented fingerprint sets, according to the position of each fingerprint segment to be matched in the fingerprint of the data to be classified; if no matching rule fingerprint segment is found in the corresponding segmented fingerprint set, searching the next fingerprint segment to be matched in its corresponding segmented fingerprint set, and repeating this until a fingerprint segment to be matched finds a matching rule fingerprint segment in its corresponding segmented fingerprint set; and, based on the identifier of at least one rule fingerprint segment that matches the fingerprint segment to be matched, determining the candidate fingerprint of each such rule fingerprint segment and taking the set of all candidate fingerprints as the candidate fingerprint set.
2. The data classification method according to claim 1, characterized in that the step of determining the candidate fingerprint set based on the results of segmented retrieval of each fingerprint segment to be matched in a preset database includes: retrieving the corresponding fingerprint segment to be matched in each segmented fingerprint set to obtain multiple search results; determining at least one candidate fingerprint based on the search results; and taking the set of the candidate fingerprints as the candidate fingerprint set.
3. The data classification method according to claim 2, characterized in that the method further includes: if the search result indicates that the segmented fingerprint set does not match the fingerprint segment to be matched, searching the next fingerprint segment to be matched in the next segmented fingerprint set.
4. The data classification method according to claim 2, characterized in that the method further includes: if the search result indicates that the segmented fingerprint set matches the fingerprint segment to be matched, obtaining from that segmented fingerprint set at least one rule fingerprint segment that is identical to the fingerprint segment to be matched; and determining, based on the identifier of each rule fingerprint segment, the candidate fingerprints corresponding to each rule fingerprint segment.
5. The data classification method according to claim 1, characterized in that the step of matching each candidate fingerprint in the candidate fingerprint set with the fingerprint of the data to be classified to determine the target fingerprint includes: calculating the Hamming distance between each candidate fingerprint in the candidate fingerprint set and the fingerprint of the data to be classified to obtain multiple fingerprint matching values; and taking the candidate fingerprint corresponding to the smallest of the multiple fingerprint matching values as the target fingerprint.
6. The data classification method according to claim 1, characterized in that the step of determining the classification result of the data to be classified based on the target fingerprint includes: using the classified point information corresponding to the target fingerprint as the classification result of the data to be classified.
7. A data classification device, characterized in that it includes: a segmentation module, used to segment the fingerprint of the data to be classified into multiple fingerprint segments to be matched; a retrieval module, used to determine a candidate fingerprint set based on the results of segmented retrieval of each fingerprint segment to be matched in a preset database, wherein the candidate fingerprint set includes multiple candidate fingerprints, each of which is a classified point fingerprint that matches a fingerprint segment to be matched at the corresponding position in the fingerprint of the data to be classified; and a matching module, used to match each candidate fingerprint in the candidate fingerprint set with the fingerprint of the data to be classified, determine the target fingerprint, and determine the classification result of the data to be classified based on the target fingerprint. The segmentation module is further configured to segment multiple classification rule fingerprints to obtain multiple rule fingerprint segments corresponding to each classification rule fingerprint; add each rule fingerprint segment to the corresponding segmented fingerprint set according to the position of that rule fingerprint segment in its classification rule fingerprint, wherein the number of segmented fingerprint sets is equal to the number of rule fingerprint segments; and combine the segmented fingerprint sets to obtain the preset database. The retrieval module is specifically used to search the fingerprint segments to be matched in turn in their corresponding segmented fingerprint sets, according to the position of each fingerprint segment to be matched in the fingerprint of the data to be classified. If no matching rule fingerprint segment is found in the corresponding segmented fingerprint set, the next fingerprint segment to be matched is searched in its corresponding segmented fingerprint set.
This process is repeated until a fingerprint segment to be matched finds a matching rule fingerprint segment in its corresponding segmented fingerprint set. Based on the identifier of at least one rule fingerprint segment that matches the fingerprint segment to be matched, the candidate fingerprint of each such rule fingerprint segment is determined, and the set of all candidate fingerprints is taken as the candidate fingerprint set.
8. A processing device, characterized in that the processing device includes a processor, a storage medium, and a bus. The storage medium stores machine-readable instructions executable by the processor. When the processing device is running, the processor communicates with the storage medium via the bus, and the processor executes the machine-readable instructions to perform the steps of the data classification method according to any one of claims 1-6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the data classification method according to any one of claims 1-6.