Method and device for constructing unbalanced sample classification model

This classification-model and sampling technology, applied in the computer field, addresses problems such as degraded classification performance, model overfitting, and loss of hidden information in training samples. It avoids added model complexity and engineering development difficulty, and it improves classification performance.

Active Publication Date: 2021-12-07
BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Oversampling can cause the model to overfit, which degrades the performance of the classification model.
Undersampling may discard hidden information in the training samples, which reduces the accuracy of the classification model.
Cost-sensitive learning assigns different costs to the positive and negative classes, which requires modifying the cost function or objective function of the classification model. This increases the complexity of the model and the difficulty of engineering development, and it does not guarantee the final classification effect.
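
The first two problems can be illustrated with a minimal sketch. It assumes the simplest forms of each technique, random duplication for oversampling and random deletion for undersampling, on a binary problem; the function names and parameters are hypothetical and do not come from the patent text.

```python
import random
from collections import Counter


def random_oversample(samples, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority samples until both classes are equal in size.

    Assumption: binary labels, and the minority class is the smaller one.
    """
    rng = random.Random(seed)
    minority = [s for s, l in zip(samples, labels) if l == minority_label]
    n_majority = len(labels) - len(minority)
    # Every added sample is an exact copy of an existing minority sample,
    # so no new information enters the training set.
    extra = [rng.choice(minority) for _ in range(n_majority - len(minority))]
    return list(samples) + extra, list(labels) + [minority_label] * len(extra)


def random_undersample(samples, labels, minority_label, seed=0):
    """Randomly discard majority samples until both classes are equal in size."""
    rng = random.Random(seed)
    minority = [(s, l) for s, l in zip(samples, labels) if l == minority_label]
    majority = [(s, l) for s, l in zip(samples, labels) if l != minority_label]
    # Majority samples that are not drawn here are lost to the model.
    kept = rng.sample(majority, len(minority))
    pairs = minority + kept
    return [s for s, _ in pairs], [l for _, l in pairs]


if __name__ == "__main__":
    samples = [(float(i), 0.0) for i in range(8)] + [(10.0, 10.0), (11.0, 11.0)]
    labels = [0] * 8 + [1] * 2
    over_x, over_y = random_oversample(samples, labels, minority_label=1)
    under_x, under_y = random_undersample(samples, labels, minority_label=1)
    print(Counter(over_y), Counter(under_y))
```

After oversampling, the minority class still contains only its original distinct points, repeated several times, which is one way a model can end up memorizing them. After undersampling, most majority samples are gone along with whatever structure they carried.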



Embodiment Construction

[0030] Hereinafter, embodiments of the present disclosure will be described with reference to the drawings. It should be understood, however, that these descriptions are exemplary only, and are not intended to limit the scope of the present disclosure. Also, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concepts of the present disclosure.

[0031] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. The words "a", "an" and "the" used herein shall also cover the meanings of "plurality" and "multiple", unless the context clearly indicates otherwise. In addition, the terms "comprising", "including", etc. used herein indicate the existence of stated features, steps, operations and/or components, but do not exclude the existence or addition of one or more other features, steps, operations or comp...



Abstract

The present disclosure provides a method for constructing an imbalanced sample classification model. The method includes receiving original sample population data, constructing a classification model, and outputting the classification model. Constructing the classification model includes the following steps. First, for each first sample individual belonging to the minority class in the original sample population, the k closest first sample individuals in the minority class are calculated, where k is a positive integer greater than or equal to 2 and the classification of all sample individuals in the original sample population is known. Then, for each first sample individual, at least one second sample individual is determined according to its k nearest first sample individuals. Next, the original sample population is mixed with all the second sample individuals to form a mixed sample population. Finally, a classification model is constructed based on the mixed sample population. The present disclosure also provides a device, a system, and a readable storage medium for constructing an imbalanced sample classification model.
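
The sample-generation steps in the abstract can be sketched as follows. The abstract does not say how a second sample individual is derived from the k nearest neighbours, so this sketch assumes linear interpolation toward a randomly chosen neighbour, in the style of SMOTE, and Euclidean distance. All function names, parameters, and defaults are hypothetical. The final step, building the classifier on the mixed population, is left to any standard learner.

```python
import math
import random


def k_nearest_minority(minority, index, k):
    """Return the k minority samples closest (Euclidean) to minority[index]."""
    target = minority[index]
    others = [p for j, p in enumerate(minority) if j != index]
    others.sort(key=lambda p: math.dist(target, p))
    return others[:k]


def synthesize(minority, k=2, per_sample=1, seed=0):
    """Create `per_sample` synthetic ("second") samples for each minority sample.

    Assumption: each synthetic sample lies on the segment between a minority
    sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        neighbours = k_nearest_minority(minority, i, k)
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            t = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, nb)))
    return synthetic


def build_mixed_population(samples, labels, minority_label, k=2, per_sample=1, seed=0):
    """Mix the original population with the synthetic minority samples."""
    if k < 2:
        raise ValueError("k must be an integer greater than or equal to 2")
    minority = [s for s, l in zip(samples, labels) if l == minority_label]
    if len(minority) < k + 1:
        raise ValueError("need at least k + 1 minority samples")
    new = synthesize(minority, k=k, per_sample=per_sample, seed=seed)
    return list(samples) + new, list(labels) + [minority_label] * len(new)


if __name__ == "__main__":
    majority = [(float(i), float(i % 3)) for i in range(6)]
    minority = [(10.0, 10.0), (11.0, 10.0), (10.0, 12.0)]
    mixed, mixed_labels = build_mixed_population(
        majority + minority, [0] * 6 + [1] * 3, minority_label=1, k=2, per_sample=2
    )
    print(len(mixed), mixed_labels.count(1))
```

Unlike plain duplication, the synthetic points here are new positions between existing minority samples, and no majority sample is discarded. Whether this matches the patent's actual rule for determining second sample individuals cannot be confirmed from the abstract alone.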

Description

Technical field

[0001] The present disclosure relates to the field of computer technology, and more specifically, to a method and device for constructing an imbalanced sample classification model.

Background art

[0002] In data mining classification or prediction tasks, the distribution of the obtained data is sometimes imbalanced. In imbalanced samples, where the proportions of the categories differ greatly, the characteristics of the minority category are easily overlooked, so data that should belong to the minority category is easily classified into the majority category during prediction. This is very unfavorable for analyzing imbalanced samples in which the minority class is small but highly influential. For example, when network users are divided into normal users and malicious users (such as network hackers), the number of malicious users is far smaller than that of normal users, but the destructive power of these mali...


Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06K9/62
CPC: G06F18/24147, G06F18/214
Inventor: 刘朋飞, 赵一鸿, 李爱华, 葛胜利
Owner: BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD