Method of solving data imbalance based on Epochs

A data balancing method, applied in the field of deep learning, that addresses data imbalance while keeping training simple and fast
CN107578071A · Inactive · Publication Date: 2018-01-12 · BEIJING UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
BEIJING UNIV OF TECH
Publication Date
2018-01-12
Estimated Expiration
Not applicable · inactive patent


Abstract

The invention discloses an Epoch-based method for solving data imbalance and belongs to the field of deep learning. During training, each Epoch randomly resamples each class according to a weight, so that every class is evenly represented among the samples of that Epoch. Each sample is weighted according to the resampling weights, and a sample set the size of a single Epoch is randomly resampled from the sample base according to the weight ratio, so that the data of each resampled Epoch is relatively balanced. The advantage of the method is that it solves the data imbalance problem more effectively.
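The resampling step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say how the weights are computed, so the sketch assumes each sample is weighted by the inverse of its class size, and the function name `balanced_epoch_indices` and its parameters are hypothetical.

```python
import numpy as np

def balanced_epoch_indices(labels, epoch_size, rng):
    """Draw the sample indices for one Epoch from the sample base.

    Assumption (not stated in the source): every sample is weighted by
    1 / (number of samples in its class), so each class carries the same
    total weight and is expected to appear equally often in the Epoch.
    """
    labels = np.asarray(labels)
    # counts[k] is the size of class k; inverse maps each sample to its class slot
    _, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
    sample_weight = 1.0 / counts[inverse]
    # Normalise the weights into sampling probabilities (the "weight ratio")
    probs = sample_weight / sample_weight.sum()
    # Sample with replacement so minority samples can be drawn repeatedly
    return rng.choice(len(labels), size=epoch_size, replace=True, p=probs)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.array([0] * 90 + [1] * 10)  # illustrative imbalanced sample base
    for epoch in range(3):
        # A fresh random draw is made for every Epoch
        idx = balanced_epoch_indices(labels, epoch_size=len(labels), rng=rng)
        print(epoch, np.bincount(labels[idx], minlength=2))
```

Because the draw is repeated at the start of every Epoch, the minority class is not tied to one fixed oversampled copy of the data; each Epoch sees a different, roughly balanced subset of the sample base.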

Description

Technical field

[0001] The invention belongs to the technical field of deep learning, and in particular relates to an Epoch-based method for solving data imbalance.

Background technique

[0002] In both academia and industry, learning from imbalanced data has attracted growing attention. Imbalanced data appears throughout Internet applications, for example search engine click prediction (clicked web pages make up only a small proportion), product recommendation in e-commerce (recommended products are purchased at a very low rate), credit card fraud detection, and network attack recognition.

[0003] The impact of imbalanced data usually shows up in classification problems. Consider a binary classification problem (two classes of data) with 100 rows of data, of which 90 rows belong to the first class and the remaining 10 rows belong to the second class. This is unbalanced da...

Claims
