Method and system for training data privacy measurement in machine learning

A training-data privacy measurement technology for machine learning, applied in the field of privacy and security in artificial intelligence. It addresses problems such as high attack success rates, model privacy leakage, and the lack of a blank-control baseline success rate, with the effect of preventing privacy leakage and achieving high accuracy.

Active Publication Date: 2021-06-29
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] At present, although many kinds of privacy attacks against the training samples of machine learning models have been proposed in China and abroad, methods for evaluating a model's privacy leakage risk are limited to applying one specific attack and taking that attack's success rate as the likelihood of privacy leakage. This evaluation approach has several problems:

1. Some attack methods assume a white-box setting. In practice, however, to better protect the model and to keep it from leaking private information during evaluation, the evaluator may only be given black-box access to the model. The black-box condition limits the evaluator's prior knowledge of the model's internal parameters, structure, and algorithm: the evaluator can only submit data to the model and receive the model's predictions for that data. Effectively measuring the degree of privacy leakage under these conditions is therefore a major challenge.

2. A model's privacy leakage risk depends on factors such as the model structure, the optimization strategy, the amount of information in the data set itself, and the number of classification labels. An evaluation based on one attack method is therefore specific and limited: different privacy attacks perform very differently, their success rates vary widely, and success rates cannot be compared across attacks. These limitations reduce the stability of the evaluation method.

3. The success rate of a single attack method is of limited use as a measure of leakage risk, because there is no baseline success rate from a blank control. A high attack success rate shows that the attack is effective, but it does not by itself mean that the model is vulnerable to privacy attacks, because the success rate of a single attack and the risk of privacy leakage are not linearly positively correlated.



Examples


Detailed Description of the Embodiments

[0025] To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. In addition, the technical features of the embodiments described below may be combined with one another as long as they do not conflict.

[0026] The flow of the method of the invention is shown in Figure 1. It is divided into two stages: the perturbation processing stage and the model sensitivity evaluation stage. Users need to upload the query API of the model to be evaluated, which operates as a black box, together with a certain number of training samples and non-training samples. The training samples refer to the data use...
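The two stages can be illustrated with a minimal sketch, offered under stated assumptions rather than as the patented procedure, whose details are truncated here. It assumes that perturbation means central finite differences on each input feature and that sensitivity is scored as the Frobenius norm of the estimated Jacobian. The names `estimate_jacobian`, `sensitivity`, and `toy_query`, the step size `eps`, and the sample values are all hypothetical; `toy_query` merely stands in for the uploaded black-box query API.

```python
import numpy as np

def estimate_jacobian(query, x, eps=1e-3):
    """Estimate d query(x) / d x with central differences, using only queries."""
    x = np.asarray(x, dtype=float)
    columns = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        # Perturbation stage (assumed form): two queries per input feature.
        columns.append((query(x + step) - query(x - step)) / (2.0 * eps))
    # Result has shape (num_outputs, num_features).
    return np.stack(columns, axis=1)

def sensitivity(query, x, eps=1e-3):
    """Sensitivity stage (assumed score): Frobenius norm of the Jacobian."""
    return float(np.linalg.norm(estimate_jacobian(query, x, eps)))

# Hypothetical stand-in for the black-box query API: a fixed softmax classifier.
_W = np.array([[1.0, -2.0, 0.5],
               [0.0, 1.5, -1.0]])

def toy_query(x):
    z = _W @ np.asarray(x, dtype=float)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

# Illustrative samples only; real inputs would be the uploaded sample sets.
train_samples = [np.array([0.2, 0.1, -0.3]), np.array([1.0, 0.0, 0.5])]
non_train_samples = [np.array([-0.5, 0.4, 0.9]), np.array([0.0, 0.0, 0.0])]

train_scores = [sensitivity(toy_query, x) for x in train_samples]
non_train_scores = [sensitivity(toy_query, x) for x in non_train_samples]
print("mean sensitivity (training):    ", np.mean(train_scores))
print("mean sensitivity (non-training):", np.mean(non_train_scores))
```

Each sample costs two queries per input feature, so the query budget grows linearly with the input dimension. The toy model was not trained on either sample set, so its two means carry no meaning beyond showing the data flow.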


Abstract

The invention discloses a method and a system for measuring training data privacy in machine learning, and belongs to the field of privacy and security in artificial intelligence. The method targets the black-box query interface through which machine learning models are exposed in real deployments. The evaluation requires no internal information about the model: it only computes the Jacobian matrix and uses it to evaluate the model's sensitivity to data samples and features, which avoids privacy leakage during the evaluation itself. Based on gradient optimization theory and the relationship between model output and input, the method quantifies the likelihood that the model leaks private information about its data. It does not depend on any single privacy attack and is effective against most privacy attacks, especially attacks that rely on model gradients and prediction outputs. The privacy leakage risk of a model can thus be evaluated without knowing the model's internals, the privacy of the model and its training set is preserved during evaluation, and a stable guarantee is provided for the development of the artificial intelligence industry.
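To make the role of the Jacobian concrete, the block below writes out one possible formalization. The notation is an assumption and does not come from the patent text: f is the black-box model mapping a d-dimensional input to K output scores, e_j is the j-th unit vector, and ε is a small step size. The central-difference approximation and the norm-based score s(x) are illustrative choices, not confirmed details of the claimed method.

```latex
J(x) = \frac{\partial f(x)}{\partial x} \in \mathbb{R}^{K \times d},
\qquad
J_{kj}(x) = \frac{\partial f_k(x)}{\partial x_j}
\approx \frac{f_k(x + \epsilon e_j) - f_k(x - \epsilon e_j)}{2\epsilon},
\qquad
s(x) = \lVert J(x) \rVert_F
```

Under this reading, every entry of J(x) is obtained from model outputs alone, at a cost of 2d queries per sample, which is consistent with the statement that no internal information about the model is needed.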

Description

Technical field

[0001] The invention belongs to the field of privacy and security in artificial intelligence, and more specifically relates to a method and system for measuring the privacy of training data in machine learning.

Background

[0002] In recent years, with the development of the artificial intelligence industry, technologies led by machine learning have been widely used in fields such as object detection, image recognition, and speech recognition, bringing profound changes to people's lives. In machine learning, given a data set and an initial model that match the training task, a specific optimization algorithm is used to optimize the model so that it can make predictions on data outside the training set. However, when the model performs prediction, there is a security risk: data leakage. The data leakage here does not refer to the direct leakage of data packets in the traditional network security...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/62, G06K9/62, G06N20/20
CPC: G06F21/6245, G06N20/20, G06F18/241, G06F18/214
Inventors: 王琛, 刘高扬, 徐天龙, 彭凯
Owner: HUAZHONG UNIV OF SCI & TECH