A rare class detection method and device based on a k-nearest neighbor graph

A k-nearest neighbor graph and detection method technology, applied in the fields of instruments, computing, and character and pattern recognition. It addresses problems such as high time complexity, improves discovery efficiency, and reduces the number of queries.

Status: Inactive
Publication Date: 2019-06-28
WUHAN UNIV
Cites: 2; Cited by: 7

AI Technical Summary

Problems solved by technology

[0008] In view of this, the present invention provides a rare class detection method and device based on the k-neare



Examples


Embodiment 1

[0055] This embodiment provides a rare class detection method based on the k-nearest neighbor graph. Referring to Figure 1, the method includes:

[0056] Step S1: For a given unlabeled data set S, construct its k-nearest neighbor graph.

[0057] Specifically, the given unlabeled data set S is an existing calibrated data set specified in advance, and the k value of the k-nearest neighbor graph can be calculated by a preset algorithm.

[0058] In one implementation, for a preset given unlabeled data set S, construct its k-nearest neighbor graph, specifically including:

[0059] Calculate the k value through the preset clustering algorithm;

[0060] Based on the calculated k value, a k-nearest neighbor graph is constructed for the data set S, where the k-nearest neighbor graph G=(V, E) is a weighted directed graph, each node p∈V represents a data sample, each edge e∈E indicates that the two endpoints of the edge are neighbors, and the weight of the edge is the Euclidean di...
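
The graph construction described in [0060] can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name build_knn_graph is hypothetical, k is passed in directly rather than derived from the preset clustering algorithm of [0059], and the brute-force distance computation is chosen only for clarity.

```python
import math


def build_knn_graph(points, k):
    """Return a weighted directed k-NN graph as {p: [(q, distance), ...]}.

    Each node p gets one outgoing edge to each of its k nearest samples,
    weighted by Euclidean distance. Here k is an input (an assumption);
    the source derives it from a clustering algorithm.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of samples")
    graph = {}
    for p in range(n):
        # Distances from p to every other sample; ties are broken by index.
        candidates = sorted(
            (math.dist(points[p], points[q]), q) for q in range(n) if q != p
        )
        graph[p] = [(q, d) for d, q in candidates[:k]]
    return graph
```

The brute-force search costs O(n²) distance evaluations, so a spatial index would likely be preferable for large data sets.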

Embodiment 2

[0091] This embodiment provides a rare class detection device based on the k-nearest neighbor graph. Referring to Figure 4, the device comprises:

[0092] The k-nearest neighbor graph construction module, which constructs the k-nearest neighbor graph of a given unlabeled data set S;

[0093] The variation coefficient definition setting module is used to set the definition of the variation coefficient Vc of the node p based on the constructed k-nearest neighbor graph, Vc(p)=maxV(p)×std(EL(p)),

[0094]

[0095]

[0096] Here, kNN(p) denotes the set of k nearest neighbor nodes of p, EL(p) denotes the set of edge lengths of node p, and deg(p) is the in-degree of node p in the k-nearest neighbor graph. Based on this definition of the variation coefficient, the module calculates the variation coefficient value of each node in the data set S;

[0097] The rare category acquisition module is used to find the node x with the largest variation coe...
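
The quantities named in [0093] and [0096] can be sketched as below, under several stated assumptions. The definition of maxV(p) is not reproduced in this text, so it is left as a caller-supplied function (the max_v parameter is hypothetical). EL(p) is assumed to be the lengths of the outgoing edges from p to its k nearest neighbors, and std is assumed to be the population standard deviation. The graph is assumed to be a dictionary mapping each node to a list of (neighbor, distance) pairs.

```python
import statistics


def edge_lengths(graph, p):
    """EL(p): lengths of the edges from p to its k nearest neighbors (assumed)."""
    return [d for _, d in graph[p]]


def in_degrees(graph):
    """deg(p): how many nodes list p among their k nearest neighbors."""
    deg = {p: 0 for p in graph}
    for neighbors in graph.values():
        for q, _ in neighbors:
            deg[q] += 1
    return deg


def variation_coefficients(graph, max_v):
    """Vc(p) = maxV(p) * std(EL(p)) for every node p.

    max_v(p, graph, deg) is a stand-in for maxV, whose definition is
    missing from this text; std is taken as the population standard deviation.
    """
    deg = in_degrees(graph)
    return {
        p: max_v(p, graph, deg) * statistics.pstdev(edge_lengths(graph, p))
        for p in graph
    }
```

Passing max_v as a function keeps the sketch usable once the actual definition of maxV(p) is available from the full patent text.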

Embodiment 3

[0107] Referring to Figure 5, based on the same inventive concept, the present application also provides a computer-readable storage medium 300 on which a computer program 311 is stored. When the program is executed, the method described in Embodiment 1 is implemented.

[0108] The computer-readable storage medium introduced in Embodiment 3 of the present invention is used to implement the rare class detection method based on the k-nearest neighbor graph in Embodiment 1. Based on the method introduced in Embodiment 1, those skilled in the art can understand the specific structure and variations of the computer-readable storage medium, so details are not repeated here. All computer-readable storage media used by the method in Embodiment 1 of the present invention fall within the scope of protection intended by the present invention.


Abstract

The invention discloses a rare class detection method based on a k-nearest neighbor graph, comprising the following steps. First, a k-nearest neighbor graph is constructed for a given unlabeled data set S, with the k value selected automatically by an algorithm. Then, based on the constructed graph, a variation coefficient Vc is defined and its value is calculated for each node in the data set. The node x with the largest variation coefficient among all nodes is found, a labeler is queried to obtain the category label y of x, and x and y are added to the selected data sample set I and the selected data sample true category label set L, respectively. Rare classes are detected by detecting local abrupt changes in the distribution of data samples in the data set. Compared with other prior-free rare class detection methods, the KRED method is more efficient and has lower algorithmic overhead. In addition, automatic selection of the k value effectively improves the discovery efficiency for each class in the data set and significantly reduces the number of queries required to discover all classes in the data set.
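
The query step summarized in the abstract can be sketched as a simple loop. This is a simplified illustration under assumptions: the function name query_rare_classes, the budget parameter, and the labeler callable are all hypothetical, and the only update between queries is to exclude nodes that were already queried, which may differ from the full KRED procedure.

```python
def query_rare_classes(scores, labeler, budget):
    """Query nodes in descending variation-coefficient order.

    scores  : {node: Vc value}, assumed to be precomputed
    labeler : callable returning the true class label of a node (the oracle)
    budget  : maximum number of queries (an assumed stopping rule)
    Returns (I, L): the queried nodes and their labels, in query order.
    """
    selected, labels = [], []
    remaining = dict(scores)  # copy so the caller's scores are untouched
    while remaining and len(selected) < budget:
        x = max(remaining, key=remaining.get)  # node with the largest Vc
        y = labeler(x)                         # ask the labeler for x's class
        selected.append(x)
        labels.append(y)
        del remaining[x]                       # never query the same node twice
    return selected, labels
```

In practice the loop would more likely stop once every class has been discovered; the fixed budget here only keeps the sketch self-contained.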

Description

Technical field

[0001] The invention relates to the technical field of data mining, in particular to a rare class detection method and device based on a k-nearest neighbor graph.

Background technique

[0002] Rare class detection is a very important task in data mining. It aims to find rare classes in unlabeled data sets. The task has practical significance and is thus worthy of further study. For example, a mass of financial transaction records sometimes hides a small number of illegal transactions that exploit loopholes in the financial system or rely on fraudulent means, and a mass of normal network access contains a small amount of malicious network behavior. Beyond these practical problems, rare class detection can also obtain a small number of classified data samples from a given unlabeled data set, which can be further used to construct classifiers or for semi-supervised learning methods such as collaborative training and ac...


Application Information

IPC(8): G06K9/62
Inventor 李易, 黄浩, 李宗鹏
Owner WUHAN UNIV