Sample data processing method and device and multi-party model training system

A sample data processing and model training technology, applied in the field of multi-party model training systems, that addresses problems such as differing data quality, differing data collection methods, and conflicting labels on data samples.

Active Publication Date: 2020-07-10
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
Cites: 8; Cited by: 3

AI Technical Summary

Problems solved by technology

[0003] In joint training involving multiple data owners, the owners have different data sources and different data collection methods, and various errors arise during data aggregation, so for the same training model the data quality differs from one data owner to another. In addition, for data samples with the same data identifier (ID), different data owners may generate different sample labels, so that label conflicts exist among the data samples collected by the data owners.

Embodiment Construction

[0043] The subject matter described herein will now be discussed with reference to example implementations. It should be understood that the discussion of these implementations is only to enable those skilled in the art to better understand and realize the subject matter described herein, and is not intended to limit the protection scope, applicability or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the methods described may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with respect to some examples may also be combined in other examples.

[0044] As used herein, the term "comprising" and its variants represent open terms meaning "including but not limited to". The...

Abstract

The embodiments of the invention provide a sample data processing method and device for multi-party model training. In the method, a first sample dataset is divided into a second sample dataset and a third sample dataset based on the data labels of the sample data, where each second sample in the second sample dataset has a unique data label and each third sample in the third sample dataset has at least two different data labels. A first model is trained using the second sample dataset. The first model is then used to evaluate the data quality of the local data of each first member node. Based on the data quality evaluation result of each first member node, label reconstruction is performed on the third sample data in the third sample dataset, so that each third sample has a unique data label after reconstruction.
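The four steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how the first model is built, how data quality is scored, or how labels are reconstructed, so the sketch assumes a toy nearest-centroid model over a single numeric feature, a quality score equal to the fraction of a node's local labels that the first model agrees with, and a quality-weighted vote for reconstruction. It also pools all records in one process for readability, whereas in a multi-party setting each member node would keep its data local. All function names, node names, and data values are hypothetical.

```python
from collections import defaultdict

# Each record is (node, sample_id, feature, label). Hypothetical toy data:
# node C reports labels that conflict with nodes A and B on samples s3 and s4.
records = [
    ("A", "s1", 0.0, 0), ("B", "s1", 0.5, 0), ("C", "s1", 0.2, 0),
    ("A", "s2", 10.0, 1), ("B", "s2", 9.5, 1), ("C", "s2", 9.8, 1),
    ("A", "s3", 9.0, 1), ("C", "s3", 9.0, 0),
    ("B", "s4", 1.0, 0), ("C", "s4", 1.0, 1),
]

def split_by_label_agreement(records):
    """Step 1: separate sample IDs with one label from IDs with conflicting labels."""
    labels_by_id = defaultdict(set)
    for _node, sid, _x, y in records:
        labels_by_id[sid].add(y)
    unique_ids = {sid for sid, ys in labels_by_id.items() if len(ys) == 1}
    return unique_ids, set(labels_by_id) - unique_ids

def train_centroid_model(records, unique_ids):
    """Step 2 (assumed model): per-label mean feature over unique-label samples."""
    sums, counts = defaultdict(float), defaultdict(int)
    for _node, sid, x, y in records:
        if sid in unique_ids:
            sums[y] += x
            counts[y] += 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: abs(x - model[y]))

def node_quality(records, model):
    """Step 3 (assumed metric): share of a node's local labels the model agrees with."""
    hits, totals = defaultdict(int), defaultdict(int)
    for node, _sid, x, y in records:
        totals[node] += 1
        hits[node] += int(predict(model, x) == y)
    return {node: hits[node] / totals[node] for node in totals}

def reconstruct_labels(records, conflict_ids, quality):
    """Step 4 (assumed rule): quality-weighted vote among the conflicting labels."""
    votes = defaultdict(lambda: defaultdict(float))
    for node, sid, _x, y in records:
        if sid in conflict_ids:
            votes[sid][y] += quality[node]
    return {sid: max(v, key=v.get) for sid, v in votes.items()}

unique_ids, conflict_ids = split_by_label_agreement(records)
model = train_centroid_model(records, unique_ids)
quality = node_quality(records, model)
resolved = reconstruct_labels(records, conflict_ids, quality)
print(quality)   # per-node quality scores
print(resolved)  # one label per previously conflicting sample ID
```

In this toy data, node C disagrees with the first model on both conflicting samples, so it receives a lower quality score and its labels are outvoted. After reconstruction every sample ID in the third dataset carries exactly one label, which is the property the abstract requires.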

Description

Technical field

[0001] The embodiments of this specification generally relate to the field of artificial intelligence, and in particular to a sample data processing method and device for multi-party model training, and a multi-party model training system.

Background

[0002] With the development of artificial intelligence technology, business models such as deep neural networks (DNNs) have gradually been applied to various business scenarios, such as risk assessment, speech recognition, and natural language processing. The model structure of business models in different application scenarios is relatively fixed. To achieve better model performance, more data owners are required to provide more training sample data during model training. For example, when a business model is applied to medical, financial and other fields, different medical or financial institutions will collect different data samples. Once these data samples are used to ...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06K9/62, G06N3/08, G06N20/00
CPC: G06N3/08, G06N20/00, G06F18/24, G06F18/214
Inventor: 郑龙飞 (Zheng Longfei), 周俊 (Zhou Jun), 王力 (Wang Li), 陈超超 (Chen Chaochao)
Owner: ALIPAY (HANGZHOU) INFORMATION TECH CO LTD