A data processing method, device, storage medium and electronic equipment
By determining the security parameters and distinguishability of data processing strategies under specified business scenarios and selecting target evaluation methods, the problem of inaccurate selection of data protection strategies in existing technologies is solved, thereby improving the security and privacy protection of the data processing process.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
- Filing Date
- 2022-09-07
- Publication Date
- 2026-06-19
AI Technical Summary
The lack of accurate evaluation methods in existing technologies to determine the degree of data privacy protection provided by data processing strategies makes it impossible to select data protection strategies suitable for the current business scenario, and thus fails to guarantee privacy and security during data processing.
Sample data is acquired from a specified business scenario and processed with each data processing strategy; the security parameters and discrimination of each strategy are determined under multiple evaluation methods; the target evaluation method suited to the business scenario is selected; and the target processing strategy is then determined and used to process the data to be processed.
It enables the accurate selection of appropriate data processing strategies in different business scenarios, thereby improving the security and privacy protection capabilities of the data processing process.
Description
Technical Field
[0001] This specification relates to the field of information technology, and in particular to a data processing method, apparatus, storage medium, and electronic device.
Background Technology
[0002] With the continuous advancement of computer technology, technologies such as mobile internet, cloud computing, and big data have also developed rapidly. This has given rise to numerous new service models and applications. However, while these services provide convenience for users' lives and work, they often require the collection of large amounts of information. This information often contains sensitive and private information such as user identity, interests, location, or internal corporate data. The collection, sharing, publication, analysis, and utilization of this information can directly or indirectly leak user privacy, posing a significant threat to user privacy and security. Therefore, it is necessary to process this information to achieve data processing while fully protecting data and privacy.
[0003] Currently, arbitrary evaluation methods are commonly used to determine the degree of data privacy protection provided by data processing strategies. However, this approach only determines the degree of data privacy protection from a single perspective. The selected data protection strategy may not be suitable for the current business scenario in which the data is located, and therefore cannot guarantee that it can provide better protection for the privacy of the data in the current business scenario during the data processing process.
[0004] Therefore, accurately determining the degree of data privacy protection provided by different data processing strategies, and thus identifying appropriate data processing strategies to improve the security of the data processing process, is an urgent problem to be solved.
Summary of the Invention
[0005] This specification provides a data processing method, apparatus, storage medium, and electronic device. It enables the accurate determination of data processing strategies that match the business scenario in which the data to be processed exists.
[0006] The following technical solution is adopted in this specification:
[0007] This specification provides a data processing method, including:
[0008] Obtain sample data for a specified business scenario;
[0009] For each preset data processing strategy, the sample data is processed using that data processing strategy to obtain the processed data corresponding to that data processing strategy;
[0010] Based on the processed data corresponding to each data processing strategy, determine the security parameters corresponding to each data processing strategy under each evaluation method;
[0011] For each evaluation method, based on the differences between the security parameters corresponding to each data processing strategy under that evaluation method, the discrimination of each data processing strategy in processing the sample data under that evaluation method is determined, and this discrimination is used as the discrimination of that evaluation method.
[0012] Based on the discrimination of each evaluation method, an evaluation method suitable for the specified business scenario is selected from all evaluation methods and used as the target evaluation method for the specified business scenario.
[0013] When the data to be processed in the specified business scenario is obtained, a target processing strategy is determined from each data processing strategy through the target evaluation method, and the data to be processed is processed through the target processing strategy.
[0014] Optionally, for each data processing strategy, at least one sub-strategy is included in the data processing strategy;
[0015] For each preset data processing strategy, the sample data is processed according to that strategy to obtain the processed data corresponding to that strategy, specifically including:
[0016] For each preset data processing strategy, the sample data is processed by each of the sub-strategies contained in the data processing strategy to obtain the processed data after each sub-strategy.
[0017] Based on the processed data corresponding to each data processing strategy, determine the security parameters for each data processing strategy under each evaluation method, specifically including:
[0018] For each data processing strategy, based on the processed data obtained after processing by each sub-strategy included in the data processing strategy, the security parameters corresponding to the data processing strategy under each evaluation method are determined.
[0019] Optionally, the target processing strategy is determined from various data processing strategies through the target evaluation method, specifically including:
[0020] For each data processing strategy, determine the corresponding security parameters under each target evaluation method after processing the data to be processed according to the data processing strategy.
[0021] Based on the security parameters corresponding to each of the aforementioned target evaluation methods, determine the comprehensive security parameters corresponding to this data processing strategy;
[0022] Based on the comprehensive security parameters corresponding to each data processing strategy, the target processing strategy is determined from each data processing strategy.
[0023] Optionally, based on the security parameters corresponding to each of the target evaluation methods, the comprehensive security parameters corresponding to the data processing strategy are determined, specifically including:
[0024] The security parameters corresponding to each target evaluation method are normalized to obtain the normalized security parameters corresponding to each target evaluation method.
[0025] Based on the normalized security parameters corresponding to each target evaluation method, determine the comprehensive security parameters corresponding to the data processing strategy.
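As a non-limiting illustration of the normalization and aggregation described above, the following Python sketch normalizes the security parameters within each target evaluation method and averages them into a comprehensive security parameter per data processing strategy. The min-max normalization, the unweighted mean, the convention that a larger value means stronger protection, and all function names are assumptions made for illustration; this specification does not prescribe them.

```python
from typing import Dict, List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant list maps to all zeros (assumed convention)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def comprehensive_scores(params: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """params[method][strategy] is the raw security parameter of a strategy under a method.

    Returns the mean of the per-method normalized values for each strategy.
    """
    strategies = sorted(next(iter(params.values())))
    totals = {s: 0.0 for s in strategies}
    for by_strategy in params.values():
        normalized = min_max_normalize([by_strategy[s] for s in strategies])
        for s, v in zip(strategies, normalized):
            totals[s] += v
    return {s: totals[s] / len(params) for s in strategies}


def select_target_strategy(params: Dict[str, Dict[str, float]]) -> str:
    """Pick the strategy with the highest comprehensive score (assumed: higher is safer)."""
    scores = comprehensive_scores(params)
    return max(scores, key=scores.get)


# Hypothetical input: two target evaluation methods, four strategies.
example = {
    "mutual information entropy": {"A": 1, "B": 3, "C": 5, "D": 7},
    "relative information entropy": {"A": 1, "B": 2, "C": 4, "D": 9},
}
print(select_target_strategy(example))  # selects "D"
```

Normalizing per evaluation method keeps a method with a large numeric range from dominating the comprehensive value; a weighted mean could be substituted if some target evaluation methods matter more in a given business scenario.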
[0026] Optionally, the evaluation method includes at least one of a first evaluation method, a second evaluation method, and a third evaluation method;
[0027] The first evaluation method evaluates the security of the processed data by using the information entropy of the processed data;
[0028] The second evaluation method assesses the security of the processed data by the similarity between the sample data and the processed sample data.
[0029] The third evaluation method evaluates the security of the processed data by determining the indistinguishability between the sample data and the processed sample data.
[0030] This specification provides a data processing apparatus, including:
[0031] The acquisition module retrieves sample data for a specified business scenario.
[0032] The first processing module processes the sample data according to each preset data processing strategy to obtain the processed data corresponding to the data processing strategy.
[0033] The first determination module determines the security parameters of each data processing strategy under each evaluation method based on the processed data corresponding to each data processing strategy.
[0034] The second determining module, for each evaluation method, determines the discrimination of each data processing strategy in processing the sample data under the evaluation method based on the differences between the security parameters corresponding to each data processing strategy under the evaluation method, and uses it as the discrimination of the evaluation method.
[0035] The selection module selects an evaluation method suitable for the specified business scenario from among the evaluation methods based on the discrimination degree corresponding to each evaluation method, and uses it as the target evaluation method for the specified business scenario.
[0036] The second processing module, upon acquiring the data to be processed under the specified business scenario, determines a target processing strategy from various data processing strategies through the target evaluation method, and processes the data to be processed through the target processing strategy.
[0037] Optionally, for each data processing strategy, at least one sub-strategy is included in the data processing strategy;
[0038] The first processing module is specifically used to process the sample data by adopting each sub-strategy included in the preset data processing strategy, so as to obtain each processed data after processing by each sub-strategy.
[0039] The first determining module is specifically used to determine the security parameters corresponding to each evaluation method for each data processing strategy, based on the processed data obtained after processing by each sub-strategy included in the data processing strategy.
[0040] Optionally, the second processing module is specifically used to: for each data processing strategy, determine the security parameters corresponding to each target evaluation method after processing the data to be processed according to the data processing strategy; determine the comprehensive security parameters corresponding to the data processing strategy based on the security parameters corresponding to each target evaluation method; and determine the target processing strategy from each data processing strategy based on the comprehensive security parameters corresponding to each data processing strategy.
[0041] Optionally, the second processing module is specifically used to normalize the security parameters corresponding to each target evaluation method to obtain normalized security parameters corresponding to each target evaluation method; and to determine the comprehensive security parameters corresponding to the data processing strategy based on the normalized security parameters corresponding to each target evaluation method.
[0042] Optionally, the evaluation method includes at least one of a first evaluation method, a second evaluation method, and a third evaluation method; the first evaluation method evaluates the security of the processed data based on the information entropy of the processed data; the second evaluation method evaluates the security of the processed data based on the similarity between the sample data and the processed sample data; and the third evaluation method evaluates the security of the processed data by determining the indistinguishability between the sample data and the processed sample data.
[0043] This specification provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described data processing method.
[0044] This specification provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the above-described data processing method.
[0045] The above-mentioned technical solutions adopted in this specification can achieve the following beneficial effects:
[0046] In the data processing method provided in this specification, sample data under a specified business scenario is acquired in advance. For each preset data processing strategy, the sample data is processed by the data processing strategy to obtain the processed data corresponding to the data processing strategy. For each evaluation method, based on the processed data corresponding to each data processing strategy, the security parameters corresponding to each data processing strategy under the evaluation method are determined. When the distinguishability between the security parameters corresponding to each data processing strategy under the evaluation method meets the preset conditions, the evaluation method is used as the target evaluation method for the specified business scenario. Then, when the data to be processed under the specified business scenario is acquired, the target processing strategy is determined from each data processing strategy through the target evaluation method, and the data to be processed is processed through the target processing strategy.
[0047] As can be seen from the above method, this solution can pre-determine the target evaluation method for a specified business scenario based on sample data. Therefore, when data to be processed in that specific business scenario is obtained, the target processing strategy can be directly determined using the previously determined target evaluation method for that business scenario, and then the data to be processed can be processed. Compared to current methods that determine the security parameters of different data processing strategies through arbitrary evaluation methods, this solution can accurately determine the target processing strategy that matches different business scenarios through a unified target evaluation method, thereby improving the security of the data processing process.
Attached Figure Description
[0048] The accompanying drawings, which are included to provide a further understanding of this specification and form part of this specification, illustrate exemplary embodiments and are used to explain this specification, but do not constitute an undue limitation thereof. In the drawings:
[0049] Figure 1 is a flowchart of a data processing method provided in this specification;
[0050] Figure 2 is a schematic diagram of a data processing procedure provided in this specification;
[0051] Figure 3 is a schematic diagram of a data processing apparatus provided in this specification;
[0052] Figure 4 is a schematic diagram of an electronic device, provided in this specification, corresponding to Figure 1.
Detailed Implementation
[0053] To make the objectives, technical solutions, and advantages of this specification clearer, the technical solutions of this specification will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this specification, and not all of them. Based on the embodiments in this specification, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this specification.
[0054] The technical solutions provided in the various embodiments of this specification are described in detail below with reference to the accompanying drawings.
[0055] Figure 1 is a flowchart of a data processing method provided in this specification, which includes the following steps:
[0056] S100: Obtain sample data for a specified business scenario.
[0057] In services and applications across various sectors such as government affairs, healthcare, and finance, it is often necessary to acquire large amounts of data for analysis and processing to obtain and utilize relevant results. For example, in the financial sector, institutions such as banks typically need to analyze users' historical behavioral records and business data (such as credit records, deposit information, consumption patterns, and loan history) through multiple information channels to assess users' repayment ability and personal creditworthiness.
[0058] For example, user profiles can be created based on data such as a user's historical order records, occupation, and location, and then information can be recommended to that user based on that profile.
[0059] However, the data obtained in the above process usually includes some sensitive data, such as users' personal privacy data, financial institutions' transaction data, and medical institutions' patient data. Once this sensitive data is leaked, it will pose a serious threat to privacy and security.
[0060] Therefore, in the process of acquiring and processing data, privacy calculations are usually performed on the data to process it while protecting the data itself from being leaked to the outside world, thereby making full use of the data while fully protecting data privacy and security.
[0061] Currently, there are numerous data processing strategies for privacy-preserving computations on different types of data. For data within the same scenario, there is typically no standard evaluation method to differentiate the privacy protection levels of different data processing strategies. Therefore, it is impossible to accurately determine which data processing strategy offers the highest level of privacy protection for a given data, and consequently, the most suitable data processing strategy cannot be determined. Based on this, this specification provides a data processing method that requires obtaining sample data from a specified business scenario to determine the target processing strategy for that scenario. Thus, when processing data to be processed within that specified scenario, the determined target processing strategy can be directly applied to the data.
[0062] In practical applications, the specified business scenarios mentioned above can include various scenarios such as user profiling in information recommendation, transaction and credit analysis in the financial field, and joint model training and diagnostic analysis in the medical field. Correspondingly, the sample data and the data to be processed can be different types of data in different scenarios. For example, in the information recommendation scenario, the data to be processed can be personal data such as the user's occupation, location, and preferences; in the financial field, it can be transaction and financial data of various financial institutions; and in the medical field, it can be patient data of medical institutions. Of course, other business scenario types and other data types can also be included, and this specification does not specifically limit them.
[0063] In addition, the server can also obtain various data processing strategies, which may include: Secure Multi-party Computation (MPC), Federated Learning (FL), Trusted Execution Environment (TEE), Multi-party Intermediary Computation (MPIC), etc. Of course, other data processing strategies may also be included, but this specification does not specifically limit them.
[0064] In practical applications, data in different business scenarios is processed and applied through different data processing strategies. Therefore, the data processing strategies corresponding to the sample data and the data to be processed in this specification can be the data processing strategies corresponding to the specified business scenario in which the sample data and the data to be processed are located. Of course, they can also be all known data processing strategies that are preset in advance.
[0065] In this specification, the executing entity for implementing the data processing method can refer to a designated device such as a server set up on the business platform. For ease of description, this specification will only use the server as the executing entity as an example to illustrate one data processing method provided in this specification.
[0066] S102: For each preset data processing strategy, the sample data is processed by the data processing strategy to obtain the processed data corresponding to the data processing strategy.
[0067] After obtaining sample data for a specified business scenario and the corresponding data processing strategy, the server can process the sample data using the various data processing strategies to obtain the processed data.
[0068] Specifically, to ensure that data privacy is not leaked, each data processing strategy usually contains multiple sub-strategies. The server can therefore apply each sub-strategy contained in a preset data processing strategy to the sample data and obtain the processed data produced by each sub-strategy.
[0069] For example, when the sample data is sent, the server can encrypt it using a sub-strategy included in the data processing strategy, for instance by hashing the data with a hash function to obtain the encrypted sample data. Of course, the sample data can also be encrypted using methods such as homomorphic encryption, secret sharing, asymmetric encryption (RSA), and the Data Encryption Standard (DES); this specification does not specifically limit the methods used.
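The hashing sub-strategy mentioned above can be sketched as follows. This is only an illustrative sketch: the choice of SHA-256, the optional salt, and the function names `hash_record` and `apply_hash_substrategy` are assumptions and are not specified by this specification. Note that a hash is one-way, so unlike the reversible ciphers listed above, the original record cannot be recovered from the output.

```python
import hashlib
from typing import List


def hash_record(record: str, salt: bytes = b"") -> str:
    """Return the hex SHA-256 digest of one record, optionally prefixed by a salt."""
    return hashlib.sha256(salt + record.encode("utf-8")).hexdigest()


def apply_hash_substrategy(sample_data: List[str], salt: bytes = b"") -> List[str]:
    """Apply the (assumed) hashing sub-strategy to every record of the sample data."""
    return [hash_record(record, salt) for record in sample_data]


# Hypothetical sample records.
protected = apply_hash_substrategy(["alice|engineer|Hangzhou", "bob|teacher|Beijing"])
print(protected)
```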
[0070] After obtaining the encrypted data, the server can further process the encrypted sample data using another sub-strategy included in the data processing strategy to obtain the processing result. The data type of this processing result can be determined based on the actual application scenario of the data, such as gradient data, sorting information, statistical information, etc., and this specification does not specify any particular type.
[0071] S104: Based on the processed data corresponding to each data processing strategy, determine the security parameters corresponding to each data processing strategy under each evaluation method.
[0072] After obtaining the processed data, the server can determine the security parameters corresponding to each data processing strategy under the evaluation method based on the processed data corresponding to each data processing strategy. The security parameters are used to characterize the protection strength of each data processing strategy when processing data. In practical applications, the protection of data is mainly to protect the privacy of data from being leaked. Therefore, the security parameters can also be understood as the protection strength or protection capability of each data processing strategy for data privacy when processing data.
[0073] The aforementioned evaluation methods may include: a first evaluation method, a second evaluation method, and a third evaluation method. The first evaluation method evaluates the security of the processed data based on the information entropy of the processed data. The second evaluation method evaluates the security of the processed data based on the similarity between the sample data and the processed sample data. The third evaluation method evaluates the security of the processed data by determining the indistinguishability between the sample data and the processed sample data. Of course, other evaluation methods may also be included, and this specification does not specifically limit them.
[0074] Furthermore, for each evaluation method, the server can determine the security parameters corresponding to the data processing strategy under that evaluation method based on the processed data obtained after processing each sub-strategy contained in the data processing strategy. For example, the server can determine the security parameters of each data processing strategy under different evaluation methods based on the encrypted data and the processed data mentioned above.
[0075] Specifically, under information entropy, the server can determine the security parameters of each data processing strategy based on the information entropy of the encrypted data and the information entropy of the processed data. Under data similarity, it can determine them based on the similarity between the encrypted data and the sample data and the similarity between the processed data and the sample data. Under data indistinguishability, it can determine them based on the indistinguishability between the encrypted data and the sample data and the indistinguishability between the processed data and the sample data.
[0076] Of course, the server can also determine the security parameters of each data processing strategy under different evaluation methods based solely on the processed data or solely on the encrypted data.
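To make the first two evaluation methods concrete, the sketch below computes the Shannon entropy of a sequence of symbols and a simple matching similarity between the sample data and the processed data. The specific formulas, the element-wise matching measure, and the function names are assumptions chosen for illustration; this specification does not fix how the entropy or similarity is converted into a security parameter.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def shannon_entropy(symbols: Sequence[Hashable]) -> float:
    """Shannon entropy, in bits, of the empirical distribution of the symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def matching_similarity(original: Sequence[Hashable], processed: Sequence[Hashable]) -> float:
    """Fraction of positions where the processed value equals the original value."""
    if len(original) != len(processed):
        raise ValueError("sequences must have the same length")
    if not original:
        return 0.0
    matches = sum(1 for a, b in zip(original, processed) if a == b)
    return matches / len(original)


print(shannon_entropy("aabb"))                             # 1.0
print(matching_similarity([1, 2, 3, 4], [1, 2, 0, 0]))     # 0.5
```

Under these assumed measures, a lower similarity between the sample data and the processed data suggests that less of the original content remains visible after processing.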
[0077] To facilitate understanding, this specification also provides a schematic diagram of the data processing procedure, as shown in Figure 2.
[0078] Figure 2 is a schematic diagram of a data processing procedure provided in this specification.
[0079] In this process, participants A and B will send data to be processed to the server. When sending the data, the data sent by A and the data sent by B will be encrypted respectively, so as to obtain the encrypted data corresponding to A and the encrypted data corresponding to B. Then, the server will interact with and process the encrypted data corresponding to A and the encrypted data corresponding to B to obtain the processing result, and send the different processing results to A and B respectively.
[0080] During this process, the server can calculate the security parameters of the encrypted data corresponding to A, the encrypted data corresponding to B, the processing result received by A, and the processing result received by B under each evaluation method, and use them as the security parameters of the data processing strategy under each evaluation method.
[0081] In practical applications, a single evaluation method often corresponds to multiple specific evaluation methods. Taking information entropy as an example, it includes concepts such as mutual information entropy, mutual information, relative information entropy, and information entropy gain / loss. Data similarity includes nominal attribute similarity, binary attribute similarity, numerical attribute similarity, ordinal attribute similarity, and mixed-type attribute similarity. Indistinguishability includes statistical indistinguishability and computational indistinguishability.
[0082] Therefore, the server can determine the security parameters of different data processing strategies under various specific evaluation methods. It should be noted that, in this specification, the security parameters of different data processing strategies under various evaluation methods can refer to the security parameters of different data processing strategies under broad evaluation methods such as information entropy, data similarity, and indistinguishability, or they can refer to the security parameters of different data processing strategies under specific evaluation methods such as mutual information entropy, mutual information, relative information entropy, and information entropy gain / loss (included in information entropy); nominal attribute similarity, binary attribute similarity, numerical attribute similarity, ordinal attribute similarity, and mixed-type attribute similarity (included in data similarity); and statistical indistinguishability and computational indistinguishability (included in indistinguishability). This explanation only uses a few representative evaluation methods as examples; other evaluation methods are not listed here.
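As one example of a specific evaluation method under information entropy, mutual information between the sample data and the processed data can be estimated from paired discrete values. The sketch below uses the standard empirical estimate I(X;Y) = Σ p(x,y) log2(p(x,y) / (p(x) p(y))); treating each record as a single discrete symbol and the function name `mutual_information` are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Empirical mutual information, in bits, between two paired discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count(x) * count(y)).
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


original = [0, 0, 1, 1]
print(mutual_information(original, original))        # 1.0: processed data reveals the original
print(mutual_information(original, [0, 1, 0, 1]))    # 0.0: processed data is independent of it
```

A value near zero indicates that the processed data carries little information about the sample data; how that value is mapped to a security parameter is left open here.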
[0083] S106: For each evaluation method, based on the differences between the security parameters corresponding to each data processing strategy under that evaluation method, determine the discrimination of each data processing strategy in processing the sample data under that evaluation method, and use it as the discrimination of that evaluation method.
[0084] Because each evaluation method calculates security parameters for different data processing strategies differently, their ability to differentiate the magnitudes of these parameters also varies. For example, some evaluation methods determine security parameters for each data processing strategy that are quite similar. Therefore, the security parameters determined by these methods are not very valuable and instead consume a lot of system resources. Thus, these less effective evaluation methods can be removed, leaving only the more effective methods to calculate the security parameters for each data processing strategy.
[0085] Therefore, after the server determines the security parameters corresponding to each data processing strategy under the evaluation method, it can determine the discrimination of each data processing strategy in processing the sample data under the evaluation method based on the differences between the security parameters corresponding to each data processing strategy under the evaluation method, and use this as the discrimination of the evaluation method.
[0086] S108: Based on the discrimination of each evaluation method, select the evaluation method applicable to the specified business scenario from among the evaluation methods, and use it as the target evaluation method for the specified business scenario.
[0087] After determining the discrimination level corresponding to each evaluation method, the server can select the evaluation method suitable for the specified business scenario from among the evaluation methods based on the discrimination level corresponding to each evaluation method, and use it as the target evaluation method for the specified business scenario.
[0088] Specifically, the server can use an evaluation method as the target evaluation method for the specified business scenario when the discrimination between the security parameters corresponding to each data processing strategy under this evaluation method meets a preset condition. For example, if that discrimination is greater than a preset discrimination threshold, this evaluation method is used as the target evaluation method.
[0089] For example, in the business scenario where the sample data is located, there are four data processing strategies: A, B, C, and D. Information entropy includes four specific evaluation methods: mutual information entropy, mutual information, relative information entropy, and information entropy gain / loss. For data processing strategies A, B, C, and D respectively, the mutual information entropy is 1, 3, 5, and 7 bits; the mutual information is 3, 4, 4, and 5 bits; the relative information entropy is 1, 2, 4, and 9 bits; and the information entropy gain / loss is 4, 4, 4, and 4 bits. Since variance usually reflects the magnitude of the differences among the values in a data set accurately, it can serve as the deviation here. Calculation shows that the (population) variance of the security parameters of strategies A, B, C, and D is 5 under mutual information entropy, 0.5 under mutual information, 9.5 under relative information entropy, and 0 under information entropy gain / loss. This indicates that mutual information entropy and relative information entropy distinguish the security parameters of different data processing strategies well, while mutual information and information entropy gain / loss distinguish them poorly. When the preset deviation is set to 4, mutual information entropy and relative information entropy can therefore be used as target evaluation methods.
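The variances in this example can be reproduced with a few lines of Python. The sketch uses the population variance from the standard library and the preset deviation of 4 from the example; the variable names are illustrative assumptions.

```python
from statistics import pvariance

# Security parameters (in bits) of strategies A, B, C, D under each specific evaluation method.
security_params = {
    "mutual information entropy": [1, 3, 5, 7],
    "mutual information": [3, 4, 4, 5],
    "relative information entropy": [1, 2, 4, 9],
    "information entropy gain/loss": [4, 4, 4, 4],
}

# Discrimination of each evaluation method, taken here as the population variance.
discrimination = {method: pvariance(values) for method, values in security_params.items()}

preset_deviation = 4
target_methods = [m for m, d in discrimination.items() if d > preset_deviation]

print(discrimination)
print(target_methods)  # ['mutual information entropy', 'relative information entropy']
```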
[0090] Furthermore, the server can sort the evaluation methods according to the deviation between the security parameters corresponding to each data processing strategy under different evaluation methods, in descending order of the deviation, to obtain the sorting results for each evaluation method. Then, based on the sorting results, a specified number of evaluation methods are selected as the target evaluation methods. Of course, the server can also select evaluation methods that are ranked before a specified position as the target evaluation methods. The specified number and specified ranking can be set according to the actual situation, and this specification does not impose specific limitations on them.
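The ranking-based selection in [0090] can be sketched as follows. The deviation values reuse the variances from the example in [0089]; the function name `select_top_methods` and the choice of two methods are assumptions made only for illustration, since the specification leaves the specified number open.

```python
def select_top_methods(deviations, count):
    """Sort evaluation methods by deviation in descending order and keep the first `count`."""
    ranked = sorted(deviations, key=deviations.get, reverse=True)
    return ranked[:count]


# Deviation (variance) of each evaluation method, taken from the example in [0089].
deviations = {
    "mutual information entropy": 5.0,
    "mutual information": 0.5,
    "relative information entropy": 9.5,
    "information entropy gain/loss": 0.0,
}

# Hypothetical setting: keep the two highest-ranked evaluation methods.
top_two = select_top_methods(deviations, 2)
print(top_two)
```

Selecting "methods ranked before a specified position" is the same slice with the position in place of the count.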
[0091] In addition, for evaluation methods such as data similarity and indistinguishability, the target evaluation methods included in data similarity and indistinguishability can be selected through the above methods to obtain a combination of each target evaluation method.
[0092] For example, the server determines that the target evaluation methods in information entropy are mutual information entropy and relative information entropy, the target evaluation methods in data similarity are nominal attribute similarity and numerical attribute similarity, and the target evaluation method in indistinguishability is statistical indistinguishability. This results in a combination of target evaluation methods that include mutual information entropy, relative information entropy, nominal attribute similarity, numerical attribute similarity, and statistical indistinguishability.
[0093] It should be noted that, in this specification, the above-mentioned target evaluation method can be selected from a range of evaluation methods such as information entropy, data similarity, and indistinguishability.
[0094] S110: When the data to be processed in the specified business scenario is obtained, a target processing strategy is determined from each data processing strategy through the target evaluation method, and the data to be processed is processed through the target processing strategy.
[0095] Once the target evaluation method is determined, when the server obtains the data to be processed in the specified business scenario, it can select the target data processing strategy through the aforementioned target evaluation method.
[0096] Specifically, for each data processing strategy, after determining the security parameters corresponding to the data to be processed under each target evaluation method, the server can determine the comprehensive security parameters corresponding to the data processing strategy based on the security parameters corresponding to each target evaluation method.
[0097] Furthermore, the server can normalize the security parameters corresponding to the data processing strategy under each target evaluation method to obtain the normalized security parameters corresponding to each target evaluation method. Then, the normalized security parameters corresponding to each target evaluation method are combined to obtain the comprehensive security parameters corresponding to the data processing strategy.
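One possible reading of the normalization step in [0097] is sketched below. The specification does not name a normalization scheme or a way of combining the normalized values, so min-max scaling across strategies and a plain sum are assumptions here, as are the function names and the sample values (which reuse the figures from the example in [0089] purely for illustration).

```python
def min_max_normalize(values):
    """Scale one evaluation method's security parameters to [0, 1] across strategies."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All strategies score the same under this method; it carries no information.
        return {strategy: 0.0 for strategy in values}
    return {strategy: (v - lo) / (hi - lo) for strategy, v in values.items()}


def comprehensive_parameters(params_by_method):
    """Sum the normalized security parameters of each strategy over all target methods."""
    combined = {}
    for per_strategy in params_by_method.values():
        for strategy, score in min_max_normalize(per_strategy).items():
            combined[strategy] = combined.get(strategy, 0.0) + score
    return combined


# Illustrative security parameters (bits) under two target evaluation methods.
params_by_method = {
    "mutual information entropy": {"A": 1, "B": 3, "C": 5, "D": 7},
    "relative information entropy": {"A": 1, "B": 2, "C": 4, "D": 9},
}

combined = comprehensive_parameters(params_by_method)
print(combined)
```

Normalizing first keeps a method with a wide numeric range from dominating the combined value.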
[0098] Of course, the server can also determine the weight of the security parameters under each target evaluation method, and then perform a weighted summation of the security parameters under each target evaluation method to obtain the comprehensive security parameters corresponding to the data processing strategy.
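The weighted summation in [0098] can be sketched as below. How the weights are determined is not stated in the specification, so the weights 0.75 and 0.25, the function name `weighted_comprehensive`, and the sample security parameters are all hypothetical.

```python
def weighted_comprehensive(params_by_method, weights):
    """Weighted sum of each strategy's security parameters over the target evaluation methods.

    params_by_method: {method: {strategy: security parameter}}
    weights:          {method: weight}
    """
    strategies = next(iter(params_by_method.values())).keys()
    return {
        s: sum(weights[m] * params_by_method[m][s] for m in params_by_method)
        for s in strategies
    }


# Illustrative security parameters (bits) under two target evaluation methods.
params_by_method = {
    "mutual information entropy": {"A": 1, "B": 3, "C": 5, "D": 7},
    "relative information entropy": {"A": 1, "B": 2, "C": 4, "D": 9},
}

# Hypothetical weights; in practice they would be set per business scenario.
weights = {
    "mutual information entropy": 0.75,
    "relative information entropy": 0.25,
}

scores = weighted_comprehensive(params_by_method, weights)
print(scores)
```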
[0099] Then, the server can determine the data processing strategy that matches the specified business scenario (such as the data processing strategy with the highest comprehensive security parameters) based on the comprehensive security parameters corresponding to each data processing strategy, and use it as the target data processing strategy.
[0100] In practical applications, different data processing strategies have varying levels of privacy protection capabilities for the computational data, resulting in differences in computational complexity and system resource consumption during the computation process. This leads to some data processing strategies with higher privacy protection capabilities requiring higher system computing power, time consumption, and system resource consumption. Therefore, servers can combine the comprehensive security parameters and computational complexity of each data processing strategy to select a suitable data processing strategy as the target processing strategy, and then process the data to be processed.
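The trade-off described in [0100] can be illustrated with a minimal sketch. The specification does not say how security and computational complexity are combined; the rule assumed here is to discard strategies whose cost exceeds a budget and then take the one with the highest comprehensive security parameter. The cost figures, the budget, and the name `pick_strategy` are hypothetical.

```python
def pick_strategy(security, cost, cost_budget):
    """Return the most secure strategy whose cost fits the budget, or None if none fits."""
    feasible = [s for s in security if cost[s] <= cost_budget]
    if not feasible:
        return None
    return max(feasible, key=security.get)


# Comprehensive security parameters per strategy (illustrative values).
security = {"A": 0.0, "B": 0.46, "C": 1.04, "D": 2.0}

# Hypothetical relative computational cost of each strategy.
cost = {"A": 1, "B": 2, "C": 5, "D": 20}

# With a budget of 10, strategy D is too expensive, so C is selected.
chosen = pick_strategy(security, cost, cost_budget=10)
print(chosen)
```

Other combinations, such as a weighted score of security and cost, would fit the same paragraph equally well.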
[0101] As can be seen from the above method, this solution can pre-determine the target evaluation method for a specified business scenario based on sample data. Therefore, when data to be processed in that specific business scenario is obtained, the target processing strategy can be directly determined using the previously determined target evaluation method for that business scenario, and then the data to be processed can be processed. Compared to current methods that determine the security parameters of different data processing strategies through arbitrary evaluation methods, this solution can accurately determine the target processing strategy that matches different business scenarios through a unified target evaluation method, thereby improving the security of the data processing process.
[0102] The above describes one or more data processing methods implemented in this specification. Based on the same idea, this specification also provides a corresponding data processing apparatus, as shown in Figure 3.
[0103] Figure 3 is a schematic diagram of a data processing apparatus provided in this specification, which includes:
[0104] The acquisition module 300 obtains sample data for a specified business scenario;
[0105] The first processing module 302 processes the sample data according to each preset data processing strategy to obtain the processed data corresponding to the data processing strategy.
[0106] The first determining module 304 determines the security parameters corresponding to each data processing strategy under each evaluation method based on the processed data corresponding to each data processing strategy.
[0107] The second determining module 306, for each evaluation method, determines the discrimination of each data processing strategy in processing the sample data under the evaluation method based on the differences between the security parameters corresponding to each data processing strategy under the evaluation method, and uses it as the discrimination of the evaluation method.
[0108] The selection module 308 selects an evaluation method suitable for the specified business scenario from among the evaluation methods based on the discrimination degree corresponding to each evaluation method, and uses it as the target evaluation method corresponding to the specified business scenario.
[0109] The second processing module 310, upon acquiring the data to be processed under the specified business scenario, determines a target processing strategy from various data processing strategies through the target evaluation method, and processes the data to be processed through the target processing strategy.
[0110] Optionally, for each data processing strategy, at least one sub-strategy is included in the data processing strategy;
[0111] The first processing module 302 is specifically used to process the sample data by adopting each sub-strategy included in the preset data processing strategy, and to obtain the processed data after processing by each sub-strategy.
[0112] The first determining module 304 is specifically used to determine the security parameters corresponding to each evaluation method for each data processing strategy based on the processed data obtained after processing by each sub-strategy included in the data processing strategy.
[0113] Optionally, the second processing module 310 is specifically configured to: for each data processing strategy, determine the security parameters corresponding to each target evaluation method after processing the data to be processed according to the data processing strategy; determine the comprehensive security parameters corresponding to the data processing strategy based on the security parameters corresponding to each target evaluation method; and determine the target processing strategy from each data processing strategy based on the comprehensive security parameters corresponding to each data processing strategy.
[0114] Optionally, the second processing module 310 is specifically used to normalize the security parameters corresponding to each target evaluation method to obtain normalized security parameters corresponding to each target evaluation method; and to determine the comprehensive security parameters corresponding to the data processing strategy based on the normalized security parameters corresponding to each target evaluation method.
[0115] Optionally, the evaluation method includes at least one of a first evaluation method, a second evaluation method, and a third evaluation method; the first evaluation method evaluates the security of the processed data based on the information entropy of the processed data; the second evaluation method evaluates the security of the processed data based on the similarity between the sample data and the processed sample data; and the third evaluation method evaluates the security of the processed data by determining the indistinguishability between the sample data and the processed sample data.
[0116] This specification also provides a computer-readable storage medium storing a computer program that can be used to execute the data processing method provided in Figure 1 above.
[0117] This specification also provides a schematic structural diagram, shown in Figure 4, of an electronic device corresponding to Figure 1. As shown in Figure 4, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, memory, and non-volatile memory, and may also include other hardware required for the business operations. The processor reads the corresponding computer program from the non-volatile memory into memory and then runs it to implement the data processing method described in Figure 1 above. Of course, in addition to software implementation, this specification does not exclude other implementations, such as logic devices or a combination of hardware and software. In other words, the execution subject of the following processing flow is not limited to individual logic units, but can also be hardware or logic devices.
[0118] In the 1990s, improvements to a technology could be clearly distinguished as either hardware improvements (e.g., improvements to the circuit structure of diodes, transistors, switches, etc.) or software improvements (improvements to the methodology). However, with technological advancements, many methodological improvements today can be considered direct improvements to the hardware circuit structure. Designers almost always obtain the corresponding hardware circuit structure by programming the improved methodology into the hardware circuit. Therefore, it cannot be said that a methodological improvement cannot be implemented using hardware physical modules. For example, a Programmable Logic Device (PLD) (such as a Field Programmable Gate Array (FPGA)) is such an integrated circuit whose logic function is determined by the user programming the device. Designers can program and "integrate" a digital system onto a PLD themselves, without needing chip manufacturers to design and manufacture dedicated integrated circuit chips. Furthermore, nowadays, instead of manually manufacturing integrated circuit chips, this programming is mostly implemented using "logic compiler" software. Similar to the software compiler used in program development, the original code before compilation must also be written in a specific programming language, called a Hardware Description Language (HDL). There are many HDLs, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language). Currently, the most commonly used are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. 
Those skilled in the art should also understand that by simply performing some logic programming on the method flow using one of these hardware description languages and programming it into an integrated circuit, the hardware circuit implementing the logical method flow can be easily obtained.
[0119] The controller can be implemented in any suitable manner. For example, it can take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, and embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory. Those skilled in the art will also recognize that, in addition to implementing the controller in purely computer-readable program code form, the same functionality can be achieved by logically programming the method steps to make the controller take the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, and embedded microcontrollers. Therefore, such a controller can be considered a hardware component, and the means included therein for implementing various functions can also be considered as structures within the hardware component. Alternatively, the means for implementing various functions can be considered as both software modules implementing the method and structures within the hardware component.
[0120] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. Specifically, a computer can be, for example, a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or any combination of these devices.
[0121] For ease of description, the above devices are described in terms of function, divided into various units. Of course, in implementing this specification, the functions of each unit can be implemented in one or more software and / or hardware.
[0122] Those skilled in the art will understand that embodiments of this specification can be provided as methods, systems, or computer program products. Therefore, this specification may take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this specification may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0123] This specification is described with reference to flowcharts and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this specification. It will be understood that each flow and / or block of the flowcharts and / or block diagrams, and combinations of flows and / or blocks in the flowcharts and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, executed by the processor of the computer or other programmable data processing apparatus, produce a means for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0124] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0125] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0126] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.
[0127] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.
[0128] Computer-readable media include permanent and non-permanent, removable and non-removable media that can store information using any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
[0129] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0130] Those skilled in the art will understand that the embodiments of this specification can be provided as methods, systems, or computer program products. Therefore, this specification may take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this specification may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0131] This specification can be described in the general context of computer-executable instructions that are executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a specific task or implement a specific abstract data type. This specification can also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.
[0132] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to interchangeably. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions in the method embodiments.
[0133] The above description is merely an embodiment of this specification and is not intended to limit this specification. Various modifications and variations can be made to this specification by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this specification should be included within the scope of the claims of this specification.
Claims
1. A data processing method, comprising: obtaining sample data for a specified business scenario; for each preset data processing strategy, processing the sample data according to the data processing strategy to obtain the processed data corresponding to the data processing strategy; based on the processed data corresponding to each data processing strategy, determining the security parameters corresponding to each data processing strategy under each evaluation method, wherein the security parameters are used to characterize the strength or ability of each data processing strategy to protect data privacy when processing data; for each evaluation method, based on the differences between the security parameters corresponding to each data processing strategy under that evaluation method, determining the discrimination of each data processing strategy in processing the sample data under that evaluation method, and using this discrimination as the discrimination of that evaluation method; based on the discrimination of each evaluation method, selecting, from all evaluation methods, an evaluation method suitable for the specified business scenario as the target evaluation method for the specified business scenario; and when the data to be processed in the specified business scenario is obtained, determining a target processing strategy from the data processing strategies through the target evaluation method, and processing the data to be processed through the target processing strategy.
2. The method of claim 1, wherein for each data processing strategy, the data processing strategy includes at least one sub-strategy; For each preset data processing strategy, the sample data is processed according to that strategy to obtain the processed data corresponding to that strategy, specifically including: For each preset data processing strategy, the sample data is processed by each of the sub-strategies contained in the data processing strategy to obtain the processed data after each sub-strategy. Based on the processed data corresponding to each data processing strategy, determine the security parameters for each data processing strategy under each evaluation method, specifically including: For each data processing strategy, based on the processed data obtained after processing by each sub-strategy included in the data processing strategy, the security parameters corresponding to the data processing strategy under each evaluation method are determined.
3. The method of claim 1, wherein determining the target processing strategy from the data processing strategies through the target evaluation method specifically includes: for each data processing strategy, determining the security parameters corresponding to each target evaluation method after the data to be processed is processed according to the data processing strategy; based on the security parameters corresponding to each target evaluation method, determining the comprehensive security parameters corresponding to the data processing strategy; and based on the comprehensive security parameters corresponding to each data processing strategy, determining the target processing strategy from the data processing strategies.
4. The method of claim 1, wherein determining the comprehensive security parameters corresponding to the data processing strategy based on the security parameters corresponding to each target evaluation method specifically includes: normalizing the security parameters corresponding to each target evaluation method to obtain the normalized security parameters corresponding to each target evaluation method; and based on the normalized security parameters corresponding to each target evaluation method, determining the comprehensive security parameters corresponding to the data processing strategy.
5. The method of claim 1, wherein the evaluation method includes at least one of a first evaluation method, a second evaluation method, and a third evaluation method; the first evaluation method evaluates the security of the processed data by using the information entropy of the processed data; the second evaluation method evaluates the security of the processed data by the similarity between the sample data and the processed data; and the third evaluation method evaluates the security of the processed data by determining the indistinguishability between the sample data and the processed data.
6. A data processing apparatus, comprising: The acquisition module retrieves sample data for a specified business scenario. The first processing module processes the sample data according to each preset data processing strategy to obtain the processed data corresponding to the data processing strategy. The first determination module determines the security parameters of each data processing strategy under each evaluation method based on the processed data corresponding to each data processing strategy. The security parameters are used to characterize the strength or ability of each data processing strategy to protect data privacy when processing data. The second determining module, for each evaluation method, determines the discrimination of each data processing strategy in processing the sample data under the evaluation method based on the differences between the security parameters corresponding to each data processing strategy under the evaluation method, and uses it as the discrimination of the evaluation method. The selection module selects an evaluation method suitable for the specified business scenario from among the evaluation methods based on the discrimination degree corresponding to each evaluation method, and uses it as the target evaluation method for the specified business scenario. The second processing module, upon acquiring the data to be processed under the specified business scenario, determines a target processing strategy from various data processing strategies through the target evaluation method, and processes the data to be processed through the target processing strategy.
7. The apparatus of claim 6, wherein for each data processing strategy, the data processing strategy includes at least one sub-strategy; The first processing module is specifically used to process the sample data by adopting each sub-strategy included in the preset data processing strategy, so as to obtain each processed data after processing by each sub-strategy. The first determining module is specifically used to determine the security parameters corresponding to each evaluation method for each data processing strategy, based on the processed data obtained after processing by each sub-strategy included in the data processing strategy.
8. The apparatus of claim 6, wherein the second processing module is specifically configured to: for each data processing strategy, determine the security parameters corresponding to each target evaluation method after processing the data to be processed according to the data processing strategy; determine the comprehensive security parameters corresponding to the data processing strategy based on the security parameters corresponding to each target evaluation method; and determine the target processing strategy from each data processing strategy based on the comprehensive security parameters corresponding to each data processing strategy.
9. The apparatus of claim 8, wherein the second processing module is specifically configured to: normalize the security parameters corresponding to each target evaluation method to obtain normalized security parameters corresponding to each target evaluation method; and determine the comprehensive security parameters corresponding to the data processing strategy based on the normalized security parameters corresponding to each target evaluation method.
10. The apparatus of claim 6, wherein the evaluation method includes: At least one of the first evaluation method, the second evaluation method, and the third evaluation method; The first evaluation method evaluates the security of the processed data by using the information entropy of the processed data; the second evaluation method evaluates the security of the processed data by using the similarity between the sample data and the processed data; and the third evaluation method evaluates the security of the processed data by determining the indistinguishability between the sample data and the processed data.
11. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the method described in any one of claims 1 to 5.
12. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the method described in any one of claims 1 to 5.
Citation Information
Patent Citations
Security policy evaluation method and device, computer readable medium and electronic device
CN110278201A
Multi-party data association query method and device based on event triggering
CN110543498A