Imbalanced data industrial fault classification method based on k-means

A fault classification technology, applied to instruments, character and pattern recognition, computer parts, etc. It addresses the overfitting and poor practical results of existing sampling methods, so as to better handle the classification of unbalanced data, increase classification accuracy, and reduce overfitting.

Inactive Publication Date: 2017-10-10

ZHEJIANG UNIV

Cites: 4; Cited by: 7

Problems solved by technology

Sampling-based improvements fall into two main categories. The first is oversampling, that is, resampling the minority class until the data are balanced. A major drawback of this approach is that it tends to cause overfitting, so its results in practice are not ideal. The second is undersampling, that is, selecting part of the majority class as training data according to certain rules and discarding the rest until the data are balanced. Because this approach ignores part of the information in the majority class, the trained classifier can be insufficiently accurate.
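The two sampling strategies described above can be sketched as follows. This is a minimal illustration in Python rather than code from the patent; the function names `random_oversample` and `random_undersample`, the random selection rule, and the 90/10 toy data are all assumptions made for the example.

```python
import random
from collections import Counter


def group_by_class(samples, labels):
    """Collect the samples of each class into a dict: label -> list of samples."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    return by_class


def random_oversample(samples, labels, seed=0):
    """Repeat randomly chosen samples of each smaller class until it matches the largest class."""
    rng = random.Random(seed)
    by_class = group_by_class(samples, labels)
    target = max(len(xs) for xs in by_class.values())
    out = []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out.extend((x, y) for x in xs + extra)
    return out


def random_undersample(samples, labels, seed=0):
    """Keep a random subset of each larger class so it matches the smallest class."""
    rng = random.Random(seed)
    by_class = group_by_class(samples, labels)
    target = min(len(xs) for xs in by_class.values())
    out = []
    for y, xs in by_class.items():
        out.extend((x, y) for x in rng.sample(xs, target))
    return out


# Toy data (assumed): 90 "normal" samples and 10 "fault" samples, one variable each.
samples = [[float(i)] for i in range(100)]
labels = ["normal"] * 90 + ["fault"] * 10

over = random_oversample(samples, labels)
under = random_undersample(samples, labels)
print(Counter(y for _, y in over))
print(Counter(y for _, y in under))
```

In this toy case oversampling only repeats the 10 existing fault samples, and undersampling drops 80 of the 90 normal samples, which is the information loss the text refers to.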

Embodiment Construction

[0013] The present invention addresses the problem of fault classification in industrial processes. The method first applies k-means to the classes that contain more data, dividing the majority classes into N subclasses according to the degree of imbalance. These N subclasses are then combined with the M minority classes to form an (M+N)-class classification problem, and finally a naive Bayes classifier is trained on it.
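The pipeline in [0013] can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the patent's implementation: the helper names (`kmeans`, `split_majority`, `fit_gaussian_nb`, `predict`), the Gaussian form of the naive Bayes likelihood, the rule N = round(majority size / minority size), the toy data, and the final step of mapping a predicted subclass back to its original class are all assumptions.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for every point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def split_majority(X, y, n_sub):
    """Relabel each class named in n_sub into k-means subclasses: label -> (label, j)."""
    new_y = [(label, 0) for label in y]
    for label, k in n_sub.items():
        idx = [i for i, l in enumerate(y) if l == label]
        for i, a in zip(idx, kmeans([X[i] for i in idx], k)):
            new_y[i] = (label, a)
    return new_y


def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-variable mean and variance for every class."""
    model = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        varis = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                 for col, m in zip(zip(*rows), means)]
        model[label] = (len(rows) / len(X), means, varis)
    return model


def predict(model, x):
    """Pick the most probable (sub)class, then return its original class label."""
    def log_posterior(label):
        prior, means, varis = model[label]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, varis))
    best = max(model, key=log_posterior)
    return best[0]  # assumption: a subclass is reported as its parent class


# Toy data (assumed): one majority class and M = 2 minority classes.
rng = random.Random(0)


def blob(cx, cy, n):
    return [[rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)] for _ in range(n)]


X = blob(0, 0, 120) + blob(6, 6, 20) + blob(-6, 6, 20)
y = ["normal"] * 120 + ["fault_a"] * 20 + ["fault_b"] * 20

n_subclasses = round(120 / 20)  # assumed rule tying N to the degree of imbalance
sub_y = split_majority(X, y, {"normal": n_subclasses})
model = fit_gaussian_nb(X, sub_y)

print(len(set(sub_y)), "classes after splitting")
print(predict(model, [6.0, 6.0]), predict(model, [0.0, 0.0]))
```

Because every original sample is kept and only relabeled, no majority-class data is discarded and no minority-class data is duplicated, which is the property the method relies on.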

[0014] The main steps of the technical solution adopted in the present invention are as follows:

[0015] Step 1: Use the system to collect data from the normal operating conditions of the process, together with data from the various faults, to form a labeled training sample set for modeling. Assume there are C fault categories; adding the normal class gives C+1 categories of modeling data in total. The data of category i is a matrix in R^(n_i × m), i = 1, 2, ..., C+1, where n_i is the number of training samples in category i, m is the number of process variables, and R is the set of real numbers. So the complete labeled tr...

Abstract

The invention discloses a k-means-based industrial fault classification method for imbalanced data. The method first applies k-means to the classes with relatively large amounts of data and, based on the degree of imbalance, divides the majority classes into N subclasses. It then combines the N subclasses with the M minority classes to form an (M+N)-class classification problem, and finally trains a naive Bayes classifier. Compared with existing methods in the prior art, the method of the invention retains as much of the information in the original data as possible and handles the classification of class-imbalanced data better while guarding against overfitting. As a result, classification accuracy is increased and overfitting is reduced relative to other methods.

Description

Technical field

[0001] The invention belongs to the field of industrial process control, and in particular relates to an industrial process fault classification method for unbalanced data.

Background technique

[0002] In industrial fault classification, some commonly used classification methods carry a prerequisite: the classes in the training set must contain comparable amounts of data. In practice this often does not hold. When one class has a very large amount of data, or one class is rare, the data are unbalanced, and directly applying a traditional classification method produces a large classification error.

[0003] In recent years, research on unbalanced data has been a hot topic. Existing methods approach the problem from two directions: the algorithm level and the sampling level. The present invention mainly improves the traditional classification method at the sampling level. The improvement methods for sampling...

Application Information

IPC(8): G06K9/62

CPC: G06F18/23213, G06F18/24, G06F18/214

Inventor: 葛志强, 陈革成

Owner: ZHEJIANG UNIV
