
Training sample recombination method and system for distributed model training

A technique for recombining training samples in distributed model training, applicable to computing models and machine learning. It addresses uneven data distribution across parties, which biases training and degrades the overall performance of the model.

Active Publication Date: 2020-11-13
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
Cites: 3; Cited by: 2

AI Technical Summary

Problems solved by technology

When the data sets held by the parties are not independent and identically distributed (Non-IID), the data distribution differs from party to party. Distributed model training on such data introduces deviations into the training process, which in turn reduce the accuracy and overall performance of the trained model.
[0003] Therefore, a training sample reorganization method for distributed model training is needed to reduce the unevenness of the data distribution across the parties.



Embodiment Construction

[0015] To illustrate the technical solutions of the embodiments of this specification more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some examples or embodiments of this specification, and those skilled in the art can also apply this specification to other similar scenarios. Unless otherwise apparent from context or otherwise indicated, like reference numerals in the figures represent like structures or operations.

[0016] It should be understood that "system", "device", "unit" and/or "module" as used in this specification are a way of distinguishing different components, elements, parts, sections or assemblies at different levels. However, these words may be replaced by other expressions if those expressions achieve the same purpose.

[0017] As indicated in the specification and claims, the terms "a", "an", "an" and / ...


Abstract

One or more embodiments of the present specification relate to a training sample recombination method and system for distributed model training. The method is implemented by the server among the participants. The method comprises: acquiring a fusion training sample set, which comprises training samples from one or more training members; obtaining a first model and sending it to each training member; and carrying out one or more rounds of training sample recombination for each training member. Each round of recombination comprises: acquiring the transmission proportion coefficient of the current round; selecting a part of the training samples for each training member based on the transmission proportion coefficient and issuing them; obtaining the model performance parameter for the current round uploaded by each training member, which is the performance parameter of the model that the training member obtains by training on the training samples it holds together with the training samples issued by the server; and determining whether to carry out the next round of recombination or to stop recombination.
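
The server-side loop described in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the `Member` class, the `recombine` function, the `ratio_for_round` schedule, the stopping rule (every member reaches a `target` score, or `max_rounds` is exhausted) and the toy "performance" metric are all hypothetical. The abstract does not say how the coefficient is obtained, how samples are selected, or how the stop decision is made; here samples are assumed to be drawn uniformly at random from the fusion set.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Sample = Tuple[float, int]  # hypothetical (feature, label) pair


@dataclass
class Member:
    """Hypothetical stand-in for a training member."""
    local_samples: List[Sample]
    num_classes: int
    model: Optional[object] = None

    def receive_model(self, model: object) -> None:
        self.model = model

    def train_and_report(self, issued: Sequence[Sample]) -> float:
        # Placeholder for real local training (assumption): the reported
        # "performance" is just the share of classes that appear in the
        # member's own samples plus the samples issued by the server.
        labels = {label for _, label in list(self.local_samples) + list(issued)}
        return len(labels) / self.num_classes


def recombine(
    fused: List[Sample],
    members: List[Member],
    first_model: object,
    ratio_for_round: Callable[[int], float],
    target: float,
    max_rounds: int,
    seed: int = 0,
) -> List[List[float]]:
    """Run recombination rounds; return the per-round scores of all members."""
    rng = random.Random(seed)
    for member in members:
        member.receive_model(first_model)  # send the first model to each member

    history: List[List[float]] = []
    for round_index in range(max_rounds):
        # Transmission proportion coefficient of the current round, clamped to [0, 1].
        ratio = min(1.0, max(0.0, ratio_for_round(round_index)))
        count = int(round(ratio * len(fused)))
        scores = []
        for member in members:
            issued = rng.sample(fused, count)  # assumed: uniform random selection
            scores.append(member.train_and_report(issued))
        history.append(scores)
        # Assumed stopping rule: stop once every member reaches the target.
        if scores and min(scores) >= target:
            break
    return history


# Usage: two members with skewed labels, a fusion set covering four classes.
fused = [(float(i), i % 4) for i in range(8)]
members = [Member([(0.0, 0)], num_classes=4), Member([(1.0, 1)], num_classes=4)]
history = recombine(
    fused, members, first_model="model-v0",
    ratio_for_round=lambda r: 0.25 * (r + 1), target=1.0, max_rounds=10,
)
```

In this toy setup the coefficient grows by 0.25 per round, so by the fourth round at the latest every member is issued the whole fusion set and the loop stops. A real system would replace `train_and_report` with actual local training and evaluation.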

Description

Technical field

[0001] One or more embodiments of this specification relate to multi-party collaborative model training, and in particular to a training sample reorganization method and system for distributed model training.

Background

[0002] In fields such as data analysis, data mining and economic forecasting, distributed model training can collaboratively train machine learning models for multiple parties to use while ensuring the security of each party's data. However, distributed model training expects the data sets held by the parties to have the same distribution and the data features to be independent of each other. When the data sets held by the parties are not independent and identically distributed (Non-IID), the data distribution differs from party to party, so using distributed learning for model training causes deviations in the training process, which in turn affect the accuracy of the trained model....


Application Information

IPC(8): G06N20/00
CPC: G06N20/00
Inventor: 郑龙飞, 周俊, 王力, 陈超超
Owner: ALIPAY (HANGZHOU) INFORMATION TECH CO LTD