Unbalanced industrial data classification method and device, and computer readable storage medium

A technology for classifying industrial data, applied in computer components, computing, and character and pattern recognition. It addresses the problems of too few training samples, failure to learn sample characteristics, and poor learning performance, and it alleviates the loss of potentially useful information.

Pending Publication Date: 2022-02-01
CHINA TELECOM CORP LTD

AI Technical Summary

Problems solved by technology

[0004] The inventor found through research that the EasyEnsemble algorithm, a related technique for addressing data imbalance, is a down-sampling technique based on ensemble learning. Only a portion of each class's samples is taken as the training set, which makes the already scarce samples even scarcer, so potentially useful information is lost and learning performance suffers, especially when the amount of data is small and the classes are extremely unbalanced. With this ensemble down-sampling approach, each sub-classifier is trained on too few samples to learn the sample characteristics, which leads to poor classifier performance.
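To make the sample-count problem concrete, the following is a minimal sketch of EasyEnsemble-style down-sampling. It assumes the commonly described form in which every subset keeps all minority samples and draws an equal number of majority samples at random. The function name `balanced_downsampled_subsets` and the toy class sizes are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def balanced_downsampled_subsets(y, n_subsets, seed=0):
    """Return one index array per subset: all minority samples plus an
    equally sized random draw (without replacement) from the majority class.
    Assumes a binary label vector y."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y != minority)
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.choice(maj_idx, size=min_idx.size, replace=False)
        subsets.append(np.concatenate([min_idx, drawn]))
    return subsets

# Toy data (assumed sizes): 1000 majority samples, 10 minority samples.
y = np.array([0] * 1000 + [1] * 10)
subsets = balanced_downsampled_subsets(y, n_subsets=5)
print([s.size for s in subsets])  # each sub-classifier sees 20 of 1010 samples
```

With these assumed sizes, every sub-classifier is trained on only 20 samples, which illustrates why the learned characteristics can be poor when the data set is small and highly unbalanced.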



Examples


Embodiment Construction

[0047] The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the disclosure, its application, or its uses. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in the present disclosure without creative effort fall within the protection scope of the present disclosure.

[0048] Relative arrangements of components and steps, numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.

[0049] At the same time, it should be unders...



Abstract

The invention relates to an unbalanced industrial data classification method and device, and a computer-readable storage medium. The method comprises the following steps: dividing an input unbalanced data set into a minority class sample set and a majority class sample set; during sampling, obtaining a plurality of balanced subsets by inducing missing values in the minority class sample set and then estimating and imputing those missing values; training a sub-classifier on each balanced subset; and integrating all the sub-classifiers to obtain a final data classifier. Because missing values are induced in the minority samples during sampling and then estimated and imputed, the loss of potentially useful information caused by down-sampling can be alleviated.
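The four steps in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the abstract does not specify the missing-value mechanism, the estimator, the sub-classifier, or the integration rule. Here missing values are induced uniformly at random, imputed with the per-feature mean of the observed minority values, classified with a nearest-centroid rule, and integrated by majority vote over binary 0/1 labels. All function names and parameters (`induce_and_impute`, `copies`, `missing_rate`, and so on) are hypothetical.

```python
import numpy as np

def induce_and_impute(X_min, missing_rate, rng):
    """Mask random entries of the minority samples, then fill them back in
    with the per-feature mean of the unmasked entries (assumed estimator)."""
    mask = rng.random(X_min.shape) < missing_rate
    observed = ~mask
    counts = observed.sum(axis=0)
    sums = np.where(observed, X_min, 0.0).sum(axis=0)
    # Fall back to the full column mean if every entry in a column was masked.
    col_means = np.where(counts > 0, sums / np.maximum(counts, 1), X_min.mean(axis=0))
    return np.where(mask, col_means, X_min)

def build_balanced_subsets(X, y, n_subsets, copies, missing_rate, seed=0):
    """Split into minority/majority sets, then build balanced subsets from the
    original minority samples, `copies` imputed variants of them, and an
    equally sized majority draw (assumed construction)."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    X_min, X_maj, y_maj = X[y == minority], X[y != minority], y[y != minority]
    subsets = []
    for _ in range(n_subsets):
        variants = [induce_and_impute(X_min, missing_rate, rng) for _ in range(copies)]
        X_min_aug = np.vstack([X_min] + variants)
        n_draw = min(len(X_maj), len(X_min_aug))
        pick = rng.choice(len(X_maj), size=n_draw, replace=False)
        Xs = np.vstack([X_min_aug, X_maj[pick]])
        ys = np.concatenate([np.full(len(X_min_aug), minority), y_maj[pick]])
        subsets.append((Xs, ys))
    return subsets

def fit_centroids(Xs, ys):
    """Sub-classifier (assumed): one centroid per class."""
    labels = np.unique(ys)
    return labels, np.stack([Xs[ys == c].mean(axis=0) for c in labels])

def predict_ensemble(models, X):
    """Integrate sub-classifiers by majority vote (assumes labels 0 and 1)."""
    votes = []
    for labels, centroids in models:
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        votes.append(labels[np.argmin(dist, axis=1)])
    return (np.stack(votes).mean(axis=0) >= 0.5).astype(int)

# Synthetic stand-in data: 500 majority and 15 minority samples, 4 features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (500, 4)), rng.normal(3.0, 1.0, (15, 4))])
y = np.array([0] * 500 + [1] * 15)

subsets = build_balanced_subsets(X, y, n_subsets=5, copies=3, missing_rate=0.2, seed=1)
models = [fit_centroids(Xs, ys) for Xs, ys in subsets]
pred = predict_ensemble(models, X)
print(subsets[0][0].shape, np.bincount(subsets[0][1]))
```

In this sketch each subset holds 120 samples (60 per class) instead of the 30 that plain down-sampling would give, because the imputed variants enlarge the minority side and allow a larger majority draw. That is one plausible reading of how inducing and imputing missing values could offset the information lost to down-sampling.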

Description

Technical field

[0001] The present disclosure relates to the field of emerging information technologies, and in particular to a method and device for classifying unbalanced industrial data, and a computer-readable storage medium.

Background

[0002] Given the high complexity and high integration of industrial systems, data-driven machine learning models can monitor the operating status of equipment, discover the patterns of industrial equipment operation, perform predictive maintenance on faults, and prevent accidents, thereby saving costs and improving efficiency in industrial production.

[0003] Unbalanced industrial data is ubiquitous in real-world scenarios and greatly degrades prediction performance.

Summary of the invention

[0004] The inventor found through research that the EasyEnsemble algorithm, a related technique for addressing data imbalance, is a down-sampling technique based on ensemble learning. Various types of samples are ta...


Application Information

IPC(8): G06K9/62
CPC: G06F18/214, G06F18/241
Inventor: 刘珮, 项超, 贾丹, 王学敏, 孟维业, 王建秀
Owner: CHINA TELECOM CORP LTD