A third-party-free federated gradient boosting decision tree model training method

A decision tree and gradient boosting technology, applied in character and pattern recognition, data processing applications, finance, etc. It addresses problems such as data leakage, excessive space consumption, and the difficulty of finding a trusted third party, thereby protecting data security, reducing storage space, and ensuring training accuracy.

Active Publication Date: 2022-04-26
蓝象智联(杭州)科技有限公司

AI Technical Summary

Problems solved by technology

[0004] 1. The first batch of performance evaluation data from the privacy computing institute directly under the Ministry of Industry and Information Technology shows that federated tree modeling on 900 features and 400,000 samples takes an industry average of 2 hours, 23 minutes and 47 seconds, which is difficult to meet industry needs;
[0005] 2. Schemes exist in which a third-party assistant participates in training and distributes synchronized model parameters, but a trusted third party is difficult to find for actual commercial deployment, and there is a risk of data leakage;
[0006] 3. Existing feature value storage is inefficient: a data set with 900 features and 400,000 samples occupies 3.9 GB of space, and if the intermediate results of federated gradient boosting decision tree model training are stored on the local disk, a single training run consumes more than 10 GB of space (a storage sketch follows this list).
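The binning and bit-slice storage described in the abstract below targets problem 3. As a rough, hypothetical illustration (not the patent's exact encoding): quantile-binning each feature into at most 256 bins lets every value be stored as a one-byte bin index instead of an 8-byte float, shrinking a 400,000-sample, 900-feature matrix from roughly 2.9 GB to about 0.36 GB, and packing the bin indices bit-plane by bit-plane ("bit slices") compresses further when fewer bins are needed. A minimal Python sketch with hypothetical helper names:

```python
import numpy as np

def quantile_bin(feature: np.ndarray, n_bins: int = 256):
    """Map raw feature values to bin indices via quantile cut points.

    Hypothetical helper; the patent's exact binning rule is not quoted here.
    Returns (bin indices as uint8, bin edges).
    """
    edges = np.quantile(feature, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    idx = np.searchsorted(edges, feature, side="right").astype(np.uint8)
    return idx, edges

def bit_slices(bin_idx: np.ndarray, n_bits: int = 8) -> np.ndarray:
    """Store bin indices as n_bits packed bit planes ("bit slices").

    With fewer bins, fewer planes are needed, so storage drops below one
    byte per value.
    """
    planes = [((bin_idx >> b) & 1).astype(np.uint8) for b in range(n_bits)]
    return np.stack([np.packbits(p) for p in planes])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.normal(size=400_000)                 # one feature column, float64
    idx, _ = quantile_bin(raw, n_bins=256)
    packed = bit_slices(idx, n_bits=8)
    print(raw.nbytes, idx.nbytes, packed.nbytes)   # 3,200,000 vs 400,000 bytes
```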


Examples


Embodiment

[0100] Embodiment: The third-party-free federated gradient boosting decision tree model training method of this embodiment is used for joint risk control modeling between banks and operators and, as shown in Figure 1, includes the following steps:

[0101] S1: The training initiator and the training participants synchronously initialize the model parameters of their respective federated gradient boosting decision tree models. The model parameters include the depth of each federated gradient boosting decision tree, the number of trees, the large-gradient sample sampling rate, the small-gradient sample sampling rate, the per-tree column sampling rate, the per-tree row sampling rate, the learning rate, the maximum number of leaves, the minimum number of node samples after splitting, the minimum split gain, the number of bins, the L2 regularization term, the L1 regularization term, the termination threshold, and the modeling method; a configuration sketch is given below.
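For readability, the S1 parameter list can be pictured as a single configuration object that both sides initialize identically. The following sketch is illustrative only; the field names and default values are assumptions, not taken from the patent:

```python
from dataclasses import dataclass

@dataclass
class FedGBDTParams:
    """Hyperparameters both parties initialize synchronously in S1.

    Field names and defaults are illustrative, not the patent's exact ones.
    """
    max_depth: int = 5                 # depth of each federated tree
    n_trees: int = 100                 # number of federated gradient boosting trees
    top_rate: float = 0.2              # sampling rate of large-gradient samples
    other_rate: float = 0.1            # sampling rate of small-gradient samples
    col_sample_rate: float = 0.8       # per-tree column (feature) sampling rate
    row_sample_rate: float = 0.8       # per-tree row (sample) sampling rate
    learning_rate: float = 0.1
    max_leaves: int = 31
    min_child_samples: int = 20        # minimum node size after a split
    min_split_gain: float = 0.0
    n_bins: int = 256
    reg_lambda: float = 1.0            # L2 regularization
    reg_alpha: float = 0.0             # L1 regularization
    tol: float = 1e-4                  # termination threshold
    objective: str = "binary"          # modeling method (e.g. classification)
```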

[0102] S2: The training initiator samples d sample data sets x fro...
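S2 is truncated in this extract, but the large-gradient and small-gradient sampling rates initialized in S1 suggest a GOSS-style (gradient-based one-side sampling) step when the d sample data sets are drawn. A minimal sketch under that assumption; the patent's actual sampling rule may differ:

```python
import numpy as np

def goss_sample(grad: np.ndarray, top_rate: float, other_rate: float,
                rng: np.random.Generator):
    """Gradient-based one-side sampling (GOSS), as popularized by LightGBM.

    Keeps the top_rate fraction of samples with the largest |gradient| plus a
    random other_rate fraction of the rest, up-weighting the latter so that
    gradient sums stay unbiased. Assumed here from the S1 sampling-rate
    parameters; not quoted from the patent.
    """
    n = grad.shape[0]
    n_top = int(n * top_rate)
    n_other = int(n * other_rate)
    order = np.argsort(-np.abs(grad))
    top_idx = order[:n_top]
    other_idx = rng.choice(order[n_top:], size=n_other, replace=False)
    idx = np.concatenate([top_idx, other_idx])
    weights = np.ones(idx.shape[0])
    weights[n_top:] = (1.0 - top_rate) / other_rate   # compensate the sub-sampling
    return idx, weights
```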


Abstract

The invention discloses a third-party-free federated gradient boosting decision tree model training method. It includes the following steps: the training initiator and the training participants synchronously initialize; the training initiator and the training participants synchronously sample d sample data sets; the training initiator and the training participants bin each feature of their respective sample data sets, record the binning information, and store it as bit slices; the training initiator calculates the first-order and second-order gradient sums corresponding to each bin of each feature of each of its sample data sets, and the training initiator and the training participants jointly calculate, via a secure multiplication protocol, the first-order and second-order gradient sums corresponding to each bin of each feature of the training participants' sample data sets; the training initiator searches for the optimal split point and synchronizes the result to the training participants; the above steps are repeated until the termination condition is met. The invention protects data security, reduces storage space, and greatly compresses communication traffic.
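The abstract does not spell out how the optimal split point is scored from the per-bin first-order and second-order gradient sums. A minimal sketch, assuming the standard second-order (XGBoost-style) gain with the L2 regularization term lambda over per-bin sums G_k and H_k; this is a common choice for gradient boosting trees and not necessarily the patent's exact formula:

```python
import numpy as np

def best_split_from_histogram(G: np.ndarray, H: np.ndarray,
                              reg_lambda: float = 1.0,
                              min_split_gain: float = 0.0):
    """Pick the best bin boundary for one feature from per-bin gradient sums.

    G[k] and H[k] are the first- and second-order gradient sums of bin k (in
    the federated setting these may be reconstructed from shares exchanged
    under the secure multiplication protocol). Uses the assumed gain
        gain = 1/2 * (G_L^2/(H_L+lam) + G_R^2/(H_R+lam)
                      - (G_L+G_R)^2/(H_L+H_R+lam)).
    """
    G_total, H_total = G.sum(), H.sum()
    G_left = np.cumsum(G)[:-1]              # candidate split after each bin
    H_left = np.cumsum(H)[:-1]
    G_right, H_right = G_total - G_left, H_total - H_left
    gain = 0.5 * (G_left**2 / (H_left + reg_lambda)
                  + G_right**2 / (H_right + reg_lambda)
                  - G_total**2 / (H_total + reg_lambda))
    k = int(np.argmax(gain))
    return (k, float(gain[k])) if gain[k] > min_split_gain else (-1, 0.0)
```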

Description

Technical field

[0001] The invention relates to the technical field of gradient boosting decision tree model training, in particular to a third-party-free federated gradient boosting decision tree model training method.

Background technique

[0002] The federated gradient boosting decision tree model can solve both classification and regression problems and has good interpretability, so it is widely used in the field of federated learning, especially in bank risk control; it is a very practical tree model. In the federated gradient boosting decision tree model, each participant calculates the first- and second-order derivatives of the decision tree based on local data, and the best split is decided from them. In this process, the first- and second-order derivatives of different participants need to be added together. Additive homomorphic encryption can be used to protect the data privacy of each participant from being leaked to the tree model during t...
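The background refers to additive homomorphic encryption as the existing way to add the first- and second-order derivatives of different participants without revealing them. A minimal sketch of that prior approach using the python-paillier (`phe`) package; it illustrates the background technique only, not the patent's own protocol, which instead relies on a secure multiplication protocol and no third party:

```python
# Sketch of the prior additive-homomorphic approach mentioned above, using
# the python-paillier (`phe`) package. Toy values; NOT the patent's protocol.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Each party's first-order gradient sum for one bin (toy values).
g_bin_party_a = 0.37
g_bin_party_b = -1.12

enc_a = public_key.encrypt(g_bin_party_a)   # encrypted at party A
enc_b = public_key.encrypt(g_bin_party_b)   # encrypted at party B
enc_sum = enc_a + enc_b                     # additive homomorphism: add ciphertexts

print(private_key.decrypt(enc_sum))         # -0.75; only the key holder can decrypt
```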


Application Information

Patent Type & Authority: Patent (China)
IPC (8): G06K9/62, G06Q40/02, G06F30/27
CPC: G06Q40/02, G06F30/27, G06F18/214
Inventors: 郭梁, 徐时峰, 刘洋, 裴阳, 毛仁歆, 宋鎏屹
Owner: 蓝象智联(杭州)科技有限公司