Anomaly detection algorithm based on random hashing

An anomaly detection algorithm, applied in computing, computing models, machine learning, etc. It addresses the failure of traditional methods in high-dimensional settings and their long running time, while achieving high accuracy.

Inactive Publication Date: 2018-04-20
NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

AI Technical Summary

Problems solved by technology

The method also overcomes the failure of traditional methods in high-dimensional settings.



Embodiment Construction

[0029] The random hash-based anomaly detection algorithm proposed by the present invention is described in detail below with reference to the accompanying drawings.

[0030] As shown in Figure 1, the random hash-based anomaly detection algorithm proposed in the present invention comprises the following steps:

[0031] Step 1) Determine the algorithm's input variables: the unlabeled sample set D to be examined, the number of trees t in the forest, the height limit h of each tree, the current height I of a tree, and a family of hash functions F;

[0032] Step 2) Normalize the original data set D so that the value of every attribute lies in [0, 1]; the normalized data set is denoted D';

[0033] Step 3) Randomly sample D' to obtain a sample set S;

[0034] Step 4) Use the sample set S to train a tree T: randomly select an attribute A_i on the sample set S, select a hash function f from F, and use the hash function f in attrib...
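The steps above can be sketched in Python as follows. This is only an illustrative sketch, not the patented procedure: the text of Step 4 is truncated, so the way a hash value routes a sample to a child node, the hash family built by `make_hash_family`, the per-tree resampling, and the `sample_size` and `num_buckets` parameters are all assumptions introduced here.

```python
import random


def normalize(data):
    """Step 2: min-max scale every attribute into [0, 1]."""
    cols = list(zip(*data))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in data
    ]


def make_hash_family(num_functions, num_buckets, rng):
    """Hypothetical family F: each f maps a value in [0, 1] to a bucket
    index after adding a random shift (an assumption, not from the source)."""
    def make(shift):
        return lambda x: int((x + shift) * num_buckets) % num_buckets
    return [make(rng.random()) for _ in range(num_functions)]


def build_tree(sample, height, limit, hash_family, rng):
    """Step 4 (assumed form): pick a random attribute and a random hash
    function, then send each sample to the child named by its hash value."""
    if height >= limit or len(sample) <= 1:
        return {"size": len(sample)}  # a leaf stores how many samples reached it
    attr = rng.randrange(len(sample[0]))
    f = hash_family[rng.randrange(len(hash_family))]
    buckets = {}
    for row in sample:
        buckets.setdefault(f(row[attr]), []).append(row)
    children = {
        b: build_tree(rows, height + 1, limit, hash_family, rng)
        for b, rows in buckets.items()
    }
    return {"attr": attr, "hash": f, "children": children}


def build_forest(data, t, h, sample_size, hash_family, rng):
    """Steps 2 to 4: normalize D, then train t trees, each on its own
    random subsample S (drawing a fresh S per tree is an assumption)."""
    normalized = normalize(data)
    forest = []
    for _ in range(t):
        sample = rng.sample(normalized, min(sample_size, len(normalized)))
        forest.append(build_tree(sample, 0, h, hash_family, rng))
    return forest
```

Under these assumptions, a call such as `build_forest(D, t=100, h=8, sample_size=256, hash_family=F, rng=random.Random(0))` would return a list of nested dictionaries whose leaves record how many training samples fell into them; the numeric values shown are placeholders, not settings taken from the patent.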


Abstract

The invention discloses an anomaly detection algorithm based on random hashing, belonging to the field of machine learning and data mining. The algorithm is built on the ensemble idea and combines hashing with a random forest. The anomaly score of a data point is measured by the number of data points in the leaf node to which the point belongs: for a data point under test, the fewer data points its leaf node contains, the more likely the point is an anomaly. Compared with traditional density-based and distance-based methods, the method achieves higher accuracy and requires less running time. It also overcomes the failure of traditional methods on high-dimensional data.
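The scoring rule described in the abstract can be illustrated with a short Python sketch. The abstract only states that the score is measured by the number of points in the leaf a point falls into; averaging that count over the trees of the forest, and the function names `anomaly_score` and `rank_anomalies`, are assumptions made here for illustration.

```python
def anomaly_score(leaf_sizes):
    """Average leaf population over all trees for one point.
    A smaller value means the point sits in sparser leaves and is
    therefore more likely to be an anomaly (averaging is an assumption)."""
    return sum(leaf_sizes) / len(leaf_sizes)


def rank_anomalies(leaf_sizes_by_point):
    """Return point ids ordered from most to least suspicious."""
    return sorted(
        leaf_sizes_by_point,
        key=lambda p: anomaly_score(leaf_sizes_by_point[p]),
    )


# Hypothetical leaf populations for three points across three trees.
leaf_sizes = {"a": [12, 15, 11], "b": [1, 2, 1], "c": [8, 9, 14]}
ranking = rank_anomalies(leaf_sizes)  # ['b', 'c', 'a']
```

Point `b` lands in nearly empty leaves in every tree, so it is ranked first as the most likely anomaly.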

Description

Technical field

[0001] The invention relates to the technical field of machine learning and data mining, in particular to an anomaly detection algorithm based on random hashing.

Background

[0002] Anomalies, or outliers, are data points that deviate from the other, normal points and do not match expectations. Although outliers are rare, they carry more important information than normal points, as with cancer cases among medical cases. Outlier detection is an important data mining task that is widely used in application fields such as intrusion detection, fraud detection, financial fraud detection, medical diagnosis, and event detection in sensor networks. Various methods have been proposed according to the anomaly models of different domains: distance-based methods [1, 2], relative density-based methods [3, 4], angle-based methods [5, 6], cluster-based methods [7, 8], and other anomaly detection algorithms. These algorithms have beco...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N99/00, G06F17/30
CPC: G06F16/2465, G06N20/00, G06F18/214
Inventor: 关东海, 陈凯, 袁伟伟
Owner: NANJING UNIV OF AERONAUTICS & ASTRONAUTICS