Method of solving data imbalance based on Epochs

A data balancing method, applied in the field of deep learning, that addresses data imbalance while keeping training simple and fast.

Inactive Publication Date: 2018-01-12
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

The difficulty of this method is to set reasonable weights. In practical applicatio


Embodiment Construction

[0044] To make the object, technical solution and features of the present invention clearer, the invention is described in further detail below with reference to specific implementation examples and the accompanying drawings. The flow chart of the Epoch-based method for solving data imbalance is shown in Figure 1.

[0045] The individual steps are explained below:

[0046] 1) Feed the full training data into the Epoch-based method, randomly draw a sample set of the specified sample size according to the initialized class weights to obtain a relatively balanced resampled set, and send it to the deep learning neural network for training.

[0047] 2) After the neural network has trained on one Epoch of data, adjust the class weights, run the Epoch-based method again, and continue training. After many iterations, the class weights approach the set final weights ever more closely, so that the training error of the netwo...
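
The two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the update rule (moving each class weight a fixed fraction of the way toward its final value), the `rate` parameter, and the example weights are all hypothetical, since the text does not specify how the weights are adjusted. The training call is left as a comment.

```python
import random
from collections import Counter


def resample_epoch(labels, class_weights, epoch_size, rng):
    """Draw epoch_size sample indices with replacement.

    Each sample is drawn with a weight equal to the weight of its class.
    """
    sample_weights = [class_weights[y] for y in labels]
    return rng.choices(range(len(labels)), weights=sample_weights, k=epoch_size)


def adjust_weights(current, final, rate):
    """Assumed update rule: move each weight a fraction `rate` toward its final value."""
    return {c: w + rate * (final[c] - w) for c, w in current.items()}


def run(labels, initial, final, epochs, epoch_size, rate=0.5, seed=0):
    rng = random.Random(seed)
    weights = dict(initial)
    history = []  # class counts of each resampled epoch
    for _ in range(epochs):
        indices = resample_epoch(labels, weights, epoch_size, rng)
        # A real pipeline would train the network on `indices` here.
        history.append(Counter(labels[i] for i in indices))
        weights = adjust_weights(weights, final, rate)
    return weights, history


if __name__ == "__main__":
    # Hypothetical data: 90 samples of class "a", 10 of class "b".
    labels = ["a"] * 90 + ["b"] * 10
    weights, history = run(
        labels,
        initial={"a": 1.0, "b": 9.0},  # illustrative starting weights
        final={"a": 1.0, "b": 5.0},    # illustrative final weights
        epochs=20,
        epoch_size=100,
    )
    print(weights)
    print(history[0], history[-1])
```

With this assumed rule the gap to the final weight halves after every Epoch, so the weights approach the final values without ever jumping to them, which matches the description of weights that gradually fit the set final weight.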


Abstract

The invention discloses an Epoch-based method for solving data imbalance and belongs to the field of deep learning. During training, each class is randomly resampled in every Epoch according to its weight, so that the samples in each Epoch are evenly represented. The main idea is that each sample is assigned the resampling weight of its class, and a sample set of single-Epoch size is randomly resampled from the sample base according to the weight ratio, so that the data of each resampled Epoch are relatively balanced. The advantage of the method is that it solves the data imbalance problem more effectively.
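
As a rough illustration of what resampling "according to the weight ratio" implies, the sketch below computes the expected share of each class in one resampled Epoch. It assumes that every sample carries the weight of its class and is drawn with probability proportional to that weight; the function name, class counts, and weights are hypothetical and chosen only for the example.

```python
def expected_class_share(class_counts, class_weights):
    """Expected fraction of each class in a weighted resample.

    A class's total draw mass is (number of samples) * (per-sample weight).
    """
    mass = {c: class_counts[c] * class_weights[c] for c in class_counts}
    total = sum(mass.values())
    return {c: m / total for c, m in mass.items()}


# Hypothetical imbalanced sample base: 90 samples vs. 10 samples.
counts = {"first": 90, "second": 10}

# Equal weights reproduce the original imbalance.
print(expected_class_share(counts, {"first": 1, "second": 1}))
# {'first': 0.9, 'second': 0.1}

# Weighting the minority class 9x balances the expected Epoch.
print(expected_class_share(counts, {"first": 1, "second": 9}))
# {'first': 0.5, 'second': 0.5}
```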

Description

Technical field

[0001] The invention belongs to the technical field of deep learning, and in particular relates to an Epoch-based method for solving data imbalance.

Background art

[0002] In both academia and industry, imbalanced learning has attracted increasing attention. Imbalanced data appears throughout Internet applications, such as click prediction in search engines (clicked web pages usually make up a small proportion), product recommendation in e-commerce (recommended products are purchased at a very low rate), credit card fraud detection, and network attack recognition.

[0003] The impact of imbalanced data usually shows up in classification problems. Consider a binary classification problem (two classes of data) with 100 rows of data, of which 90 rows belong to the first class and the remaining 10 rows belong to the second class. This is unbalanced da...


Application Information

IPC(8): G06K9/62, G06N3/04, G06N3/08
Inventor: 赵建峰, 宁振虎, 蔡永泉, 薛菲, 公备, 王昱波
Owner BEIJING UNIV OF TECH