Model training system based on separation degree index

A model training technology based on a separation degree index, applied in the field of machine learning model training systems. It addresses the low training efficiency and loss of model effect seen in existing approaches, while protecting data security and customer privacy.

Active Publication Date: 2020-05-08
SICHUAN XW BANK CO LTD

AI Technical Summary

Problems solved by technology

Training a model on encrypted data requires redeveloping the model training code, and such training is much less efficient than local plaintext training. Asynchronous optimization of parameters during training further degrades the model effect.




Embodiment Construction

[0038] As shown in Figure 1, the model training system based on the separation degree index of the present invention comprises the following units, each established on the storage medium by the processor module: a model training unit, a model pruning and compression unit, and an output unit.

[0039] The model training unit includes:

[0040] a. Data cleaning module: The model to be trained is determined through the label definition. After the original variables are input through the data input port, they are cleaned: missing values are filled (usually with the mean or median), character variables are eliminated, and categorical variables are mapped to corresponding numeric values (for example, a categorical variable with an implicit order such as professional title level is mapped as primary = 1, intermediate = 2, advanced = 3). A structured training data structure is then generated.

[0041] b. Feature...
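The cleaning steps described in [0040] can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `clean_rows`, the row layout (a list of dicts), the column names, and the choice of the median over the mean are all hypothetical and do not come from the patent text.

```python
from statistics import median

# Ordinal mapping taken from the professional-title example in [0040].
TITLE_LEVELS = {"primary": 1, "intermediate": 2, "advanced": 3}


def clean_rows(rows, ordinal_maps):
    """Return cleaned copies of `rows` (a list of dicts sharing the same keys).

    Hypothetical helper: maps ordered categories to integers, drops the
    remaining character (string) columns, and fills missing values (None)
    with the column median.
    """
    columns = list(rows[0].keys())
    cleaned = [dict(r) for r in rows]  # do not mutate the caller's data

    for col in columns:
        # 1. Map categorical variables with an implicit order to numbers.
        if col in ordinal_maps:
            for r in cleaned:
                if r[col] is not None:
                    r[col] = ordinal_maps[col][r[col]]

        values = [r[col] for r in cleaned if r[col] is not None]

        # 2. Eliminate character variables (and columns with no data at all).
        if not values or any(isinstance(v, str) for v in values):
            for r in cleaned:
                del r[col]
            continue

        # 3. Fill missing values with the median of the observed values.
        fill = median(values)
        for r in cleaned:
            if r[col] is None:
                r[col] = fill
    return cleaned


# Made-up sample data for illustration only.
rows = [
    {"age": 30, "income": 5000.0, "title": "primary", "name": "a"},
    {"age": None, "income": 7000.0, "title": "advanced", "name": "b"},
    {"age": 40, "income": None, "title": None, "name": "c"},
]
cleaned = clean_rows(rows, {"title": TITLE_LEVELS})
```

In this sketch the free-text `name` column is dropped, while `title` survives because it is mapped to an ordered integer before the character-variable check runs.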



Abstract

The invention relates to a model training system based on a separation degree index. The model training system comprises a model training unit, a model pruning and compression unit, and an output unit. The model training unit comprises: a, a data cleaning module for cleaning the original variables; b, a feature selection module for screening the candidate feature set used in model compression; c, a model training module for model training and optimization. The model pruning and compression unit comprises: d, a data sample grouping module for grouping data samples; e, a feature correlation discrimination module for calculating correlation coefficients between features and the target variable and for grouping and sorting samples; f, a feature optimal breakpoint selection module for selecting the optimal breakpoint of each feature; g, a feature separation degree index calculation module, which constructs the feature separation degree index and outputs the feature with the best effect. The output unit comprises: h, an optimal feature selection module for selecting the optimal features; and i, an output module for outputting a single-point rule list. With this system, the model can be trained without the data of one party being transmitted out, so that the data security and customer privacy of both parties are effectively protected.
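The pruning and output steps named in the abstract (a correlation coefficient per feature, an optimal breakpoint per feature, and a ranked single-point rule list) can be illustrated roughly as follows. The excerpt does not define the separation degree index, so this sketch substitutes a stand-in measure: the absolute difference between the target mean on either side of the breakpoint. The function names, feature names, and data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between a feature and the target."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def best_breakpoint(xs, ys):
    """Return (breakpoint, separation) for the split `x <= breakpoint`.

    ASSUMPTION: separation is |mean(y | x <= b) - mean(y | x > b)|,
    a stand-in for the patent's separation degree index.
    """
    best = (None, 0.0)
    for b in sorted(set(xs))[:-1]:  # the largest value would leave one side empty
        left = [y for x, y in zip(xs, ys) if x <= b]
        right = [y for x, y in zip(xs, ys) if x > b]
        sep = abs(sum(left) / len(left) - sum(right) / len(right))
        if sep > best[1]:
            best = (b, sep)
    return best


def single_point_rules(features, target):
    """Rank features by separation; each rule is (name, breakpoint, separation)."""
    rules = [(name, *best_breakpoint(xs, target)) for name, xs in features.items()]
    return sorted(rules, key=lambda rule: rule[2], reverse=True)


# Made-up data: one informative feature and one uninformative feature.
target = [0, 0, 0, 1, 1, 1]
features = {
    "utilisation": [1, 2, 3, 4, 5, 6],
    "noise": [3, 1, 2, 3, 1, 2],
}
r = pearson(features["utilisation"], target)
rules = single_point_rules(features, target)
```

Here the informative feature ranks first with a rule of the form "utilisation <= 3", while the uninformative feature yields no useful breakpoint. A real implementation would use whatever index the full specification defines in place of the stand-in measure.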

Description

Technical field

[0001] The invention relates to a training system for machine learning models, specifically a model training system based on a separation degree index.

Background technique

[0002] In the field of machine learning and artificial intelligence, the traditional joint modeling method is generally that both partners hold part of the data (explanatory variables or labels), and one party carries its data to the other party for data cleaning, processing, modeling, and deployment. There are two problems in such a process. One is that carrying the data directly creates a risk of data leakage; the other is legal compliance risk. As legislation protecting citizens' privacy becomes increasingly stringent, the circulation of sensitive data may raise legal issues and lead to regulatory intervention.

[0003] For the above problems, the existing solution is called federated learning. The core idea of federated learning is data encryption. The two parties independe...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N20/00
CPC: G06N20/00
Inventor 毛正冉刘嵩韩晗郑乐王张琦
Owner SICHUAN XW BANK CO LTD