Kernel function based rare category detection method fusing active learning and nonparametric semi-supervised clustering

A semi-supervised clustering and active learning technology, applied in character and pattern recognition, instruments, computer parts, etc., that addresses the problem of rare categories going undetected.

Active Publication Date: 2016-04-06
ZHEJIANG HONGCHENG COMP SYST

AI Technical Summary

Problems solved by technology

This process is repeated until a certain proportion of data points have been labeled or no new rare categories have been found after a certain number of iterations. It addresses the problem of how to use manually labeled data points to update the data distribution model and efficiently detect rare categories in a dataset.
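The stopping rule described above can be sketched as a simple loop. This is a minimal illustration, not the patented procedure: the function name `detect_rare_categories`, the callables `select_point` and `ask_expert`, and the parameters `max_label_ratio` and `patience` are all hypothetical names introduced here, and the sketch treats any previously unseen label as a newly found category.

```python
def detect_rare_categories(n_points, select_point, ask_expert,
                           max_label_ratio=0.2, patience=3):
    """Query labels until a labeling budget is used up or no new
    category has appeared for `patience` consecutive queries.

    select_point(labels) -> index of the next point to query, or None
    ask_expert(index)    -> label supplied by the human expert
    (Both callables are assumptions made for this sketch.)
    """
    labels = {}        # index -> label obtained so far
    found = set()      # distinct categories discovered so far
    stale_rounds = 0   # consecutive queries that revealed nothing new
    budget = int(max_label_ratio * n_points)

    while len(labels) < budget and stale_rounds < patience:
        idx = select_point(labels)
        if idx is None:          # nothing left to query
            break
        label = ask_expert(idx)
        labels[idx] = label
        if label in found:
            stale_rounds += 1
        else:
            found.add(label)     # a category not seen before
            stale_rounds = 0
    return labels, found
```

In a fuller version, `select_point` would be where the active learning criteria and the updated clustering come in; here it is left as a placeholder so that only the two stopping conditions are shown.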

Examples

Embodiment

[0063] Embodiment: A kernel function based rare category detection method fusing active learning and nonparametric semi-supervised clustering. The method has three stages: construction of the semi-supervised clustering hierarchy, active learning based on multiple criteria, and iterative process control.

[0064] The semi-supervised clustering hierarchy construction stage comprises two sub-stages: kernel function based distance metric learning, and nonparametric construction of the clustering hierarchy.

[0065] 1) Distance metric learning based on the kernel function. The process is shown in Figure 1.

[0066] Step 1: Compute the kernel matrix K of the dataset X in the kernel space using the selected kernel function. If the mapping is linear, output the matrix directly; otherwise, perform steps 2-9.

[0067] Using the mapping function φ(x), the dataset X = (x_1, x_2, ..., x_n) is mapped from the original Euclidean space to the inner product space (kernel space). ...
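As a rough illustration of Step 1, the kernel matrix can be computed pairwise without ever forming φ(x) explicitly, since each entry K[i][j] is the inner product of φ(x_i) and φ(x_j). The sketch below is an assumption-laden example rather than the patent's implementation: the source does not say which kernel is selected, so the Gaussian (RBF) kernel, its `gamma` parameter, and the function names are choices made here for illustration only.

```python
import math

def linear_kernel(a, b):
    """Plain inner product; corresponds to the linear-mapping case."""
    return sum(ai * bi for ai, bi in zip(a, b))

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel, used here only as an example of a
    nonlinear kernel; gamma is an assumed bandwidth parameter."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(X, kernel):
    """Return the n x n matrix K with K[i][j] = kernel(X[i], X[j]).

    Only the upper triangle is evaluated; the rest is filled in by
    symmetry, which holds for any valid kernel function.
    """
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            K[i][j] = K[j][i] = kernel(X[i], X[j])
    return K
```

For example, `kernel_matrix(X, linear_kernel)` gives the Gram matrix for the linear case, while `kernel_matrix(X, rbf_kernel)` gives a matrix of the kind the later steps would start from.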


Abstract

The invention relates to a kernel function based rare category detection method fusing active learning and nonparametric semi-supervised clustering. Conventional rare category detection methods do not make full use of labeled data points and require category-related information to be specified in advance. To address these problems, the method uses nonparametric semi-supervised clustering to optimize a data distribution model from a small amount of labeled data and a large amount of unlabeled data, and uses active learning to select the most representative anomalous points among the unlabeled data and submit them to experts for labeling. This reduces the manual labeling workload in rare category detection, improves the efficiency of the detection process, and addresses rare category discovery under nonlinear conditions.

Description

Technical field

[0001] The invention relates to the field of anomalous data detection, and in particular to a kernel function based rare category detection method fusing active learning and nonparametric semi-supervised clustering.

Background

[0002] Anomalous data detection plays a key role in many applications, such as healthcare, fault detection in safety-critical systems, and tracking of specific actors in videos. Anomalous data points matter because they often carry a great deal of useful information in a particular application. They can be divided into two categories. The first consists of ordinary anomalies, which usually arise from predictable causes. The second consists of anomalies that carry additional information and usually require further exploration and analysis. Compared with ordinary anomalies, these more interesting anomalies usually account for only a small proportion of all anomalies. The rare category...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/23, G06F18/24147
Inventors: 吴勇, 季海琦, 陈岭, 涂鼎
Owner: ZHEJIANG HONGCHENG COMP SYST