Model training method and device based on privacy data set

A technology for private data and model training, applied in neural learning methods, digital data protection, electrical digital data processing, etc.

Active Publication Date: 2022-02-01
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a model training method and device based on priva...

Embodiment Construction

[0051] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0052] Data processing such as data analysis, data mining, and trend prediction is applied in a growing number of scenarios to the large volumes of information generated in industries such as the economy, culture, education, medical care, and public management. Through data cooperation, multiple data owners can obtain better data p...

Abstract

The invention relates to the technical field of multi-party data cooperation and provides a model training method and device based on a privacy data set. The method comprises the following steps: training a server-side model based on a public data set and the real labels corresponding to the public data set; obtaining a first model output sent by each client, wherein the first model output is obtained by inputting the public data set into that client's local learning model, and the local learning model is trained on the client's privacy data set and its corresponding labels; training the server-side model based on the public data set and each corresponding first model output; inputting the public data set into the server-side model to obtain a second model output; and sending the second model output to each client, so that each client retrains its local learning model based on the second model output and the public data set. In this way, the privacy data sets contribute to model training as part of the training samples through knowledge distillation and knowledge fusion, without the privacy data sets themselves being leaked.
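
The steps listed in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `SoftmaxModel` class, the `run_round` and `make_data` helpers, the synthetic two-class data, and the learning rate and epoch counts are all hypothetical. The abstract does not say how the server combines the clients' outputs, so this sketch simply fits the server-side model to each client's output in turn.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def one_hot(y, n_classes):
    return np.eye(n_classes)[y]

class SoftmaxModel:
    """Tiny linear classifier standing in for any learning model (hypothetical)."""

    def __init__(self, n_features, n_classes):
        self.W = np.zeros((n_features, n_classes))

    def predict_proba(self, X):
        return softmax(X @ self.W)

    def fit(self, X, targets, lr=0.5, epochs=200):
        # targets are rows of class probabilities: one-hot labels or soft outputs
        for _ in range(epochs):
            grad = X.T @ (self.predict_proba(X) - targets) / len(X)
            self.W -= lr * grad
        return self

def run_round(server, clients, private_sets, X_pub, y_pub, n_classes):
    # Step 1: server trains on the public data set and its real labels.
    server.fit(X_pub, one_hot(y_pub, n_classes))

    # Step 2: each client trains locally on its private data, then sends only
    # its predictions on the public data set (the "first model output").
    first_outputs = []
    for model, (X_priv, y_priv) in zip(clients, private_sets):
        model.fit(X_priv, one_hot(y_priv, n_classes))
        first_outputs.append(model.predict_proba(X_pub))

    # Step 3: server trains on the public data with each first model output
    # as soft targets (assumption: one pass per client, in order).
    for out in first_outputs:
        server.fit(X_pub, out)

    # Step 4: server computes the "second model output" on the public data.
    second_output = server.predict_proba(X_pub)

    # Step 5: each client retrains on the public data and the second output.
    for model in clients:
        model.fit(X_pub, second_output)

    return first_outputs, second_output

# Synthetic demo data (assumption): two features, label = sign of their sum.
rng = np.random.default_rng(0)

def make_data(n):
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y

X_pub, y_pub = make_data(100)
private_sets = [make_data(80) for _ in range(3)]
server = SoftmaxModel(2, 2)
clients = [SoftmaxModel(2, 2) for _ in range(3)]

first_outputs, second_output = run_round(
    server, clients, private_sets, X_pub, y_pub, n_classes=2
)
accuracy = (second_output.argmax(axis=1) == y_pub).mean()
print("server accuracy on public data:", accuracy)
```

Note that only model outputs on the public data set cross the client boundary in this sketch; the arrays in `private_sets` are read solely inside each client's own `fit` call.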

Description

Technical field

[0001] The present invention relates to the technical field of multi-party data cooperation, and in particular to a model training method and device based on private data sets.

Background technique

[0002] In data analysis, data mining, economic forecasting, and other fields, machine learning models can be used to analyze data and discover its potential value. Because the data held by a single data owner may be incomplete, it is difficult to describe the target accurately. To obtain better model prediction results, joint training of a model through data cooperation among multiple data owners has been widely used. However, multi-party data cooperation raises issues such as data security and model security.

[0003] Especially in the medical field, some data sets involve privacy and cannot be made public, and can only be used within the hospital. It is very difficult to build a learning model based on the privat...

Application Information

IPC(8): G06F21/62, G06V10/80, G06V10/82, G06K9/62, G06N3/04, G06N3/08
CPC: G06F21/6245, G06N3/04, G06N3/08, G06F18/214
Inventor: 刘洋, 程思婕, 武婧雯
Owner: TSINGHUA UNIV