Heterogeneity data set anomaly detection method and computer readable storage medium

An anomaly detection technique for heterogeneous data sets, applied in the computer field, that addresses the inability of existing methods to detect anomalies in such data sets well.

Active Publication Date: 2020-11-13
北京志翔科技股份有限公司
11 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

[0003] The present invention provides an anomaly detection method for heterogeneous data sets and a computer-readable storage medium, to solve the prior-art problem that anomaly detection cannot be performed well on heterogeneous data sets.



Embodiment Construction

[0026] The embodiment of the present invention addresses the existing inability to accurately detect anomalies in high-dimensional, unlabeled heterogeneous data sets. Several unused classification indicators are selected from a preset classification indicator set; index threshold segmentation is performed on the heterogeneous data set based on the selected classification indicators, generating segmented and classified data subsets under each classification indicator; and anomaly detection is performed on each data subset, so that anomalies in high-dimensional, unlabeled heterogeneous data sets are detected accurately. The present invention is described in further detail below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described herein only explain the present invention and do not limit it.
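The segment-then-detect idea in paragraph [0026] can be illustrated with a minimal Python sketch. It is not the patented procedure: it uses a single classification indicator with one fixed threshold, and the per-subset detector (a modified z-score based on the median absolute deviation) is an assumption chosen for brevity. The field names `capacity` and `usage`, the threshold 50, and the sample values are all hypothetical.

```python
from statistics import median


def split_by_threshold(records, index, threshold):
    """Split records into two subsets by comparing one classification
    indicator (`index`) against a threshold."""
    low = [r for r in records if r[index] <= threshold]
    high = [r for r in records if r[index] > threshold]
    return low, high


def mad_outliers(records, field, cutoff=3.5):
    """Flag records whose modified z-score on `field` exceeds `cutoff`.
    This detector is an assumption; the source does not name one here."""
    values = [r[field] for r in records]
    if len(values) < 3:
        return []
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [r for r in records if r[field] != med]
    return [r for r in records if 0.6745 * abs(r[field] - med) / mad > cutoff]


def detect(records, index, threshold, field):
    """Segment the data set first, then run the detector on each subset."""
    anomalies = []
    for subset in split_by_threshold(records, index, threshold):
        anomalies.extend(mad_outliers(subset, field))
    return anomalies


# Hypothetical heterogeneous data: small and large consumers mixed together.
small = [{"capacity": 10, "usage": u} for u in (4, 5, 6, 5, 4, 60)]
large = [{"capacity": 100, "usage": u} for u in (480, 500, 520, 510, 490)]
records = small + large

per_subset = detect(records, index="capacity", threshold=50, field="usage")
unsplit = mad_outliers(records, field="usage")
print([r["usage"] for r in per_subset])
print([r["usage"] for r in unsplit])
```

On this toy data, the per-subset run flags only the small consumer with usage 60, while the same detector applied to the unsplit data misses that record and flags all five large consumers instead. That contrast is the motivation for segmenting a heterogeneous data set before detection.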

[0027] The first embodiment of the present invention provid...


Abstract

The invention discloses an anomaly detection method for heterogeneous data sets and a computer-readable storage medium. Several unused classification indexes are selected from a preset classification index set; index threshold segmentation is performed on the heterogeneous data set based on the selected classification indexes; segmented and classified data subsets are generated under the selected classification indexes; and anomaly detection is carried out on each data subset. In other words, the data are segmented by index thresholds under the selected classification indexes to obtain a plurality of data subsets, and anomaly detection is performed on those subsets, so that anomalies in a high-dimensional, label-free heterogeneous data set are detected accurately.

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to an anomaly detection method for heterogeneous data sets and a computer-readable storage medium.

Background technique

[0002] Currently, statistical hypothesis testing and isolation forest methods are used to detect anomalies in data sets. Statistical hypothesis testing requires the assumption that the data obey a certain distribution and is only applicable to one-dimensional data. An isolation forest repeatedly selects a random dimension and threshold to segment the data set until each set contains only one data point, which constitutes an isolation tree; the fewer segmentations needed to isolate a data point, the higher its outlier score. However, because the anomaly detection thresholds of heterogeneous data sets are different, the existing statistical hypothesis tests and isolation forest methods cannot detect anomalies in heterogeneous data set...
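The isolation mechanism described in the background can be sketched in a few lines of Python. This is a simplified illustration rather than a full isolation forest: it has no subsampling or score normalization, and it traces one point through a fresh random partition on each trial instead of building shared trees. The function names, the sample points, and the trial count are hypothetical.

```python
import random


def path_length(point, data, rng, depth=0):
    """Count the random splits needed until `point` is alone in its partition."""
    if len(data) <= 1:
        return depth
    # Only dimensions with some spread can be split.
    dims = [d for d in range(len(point))
            if min(p[d] for p in data) < max(p[d] for p in data)]
    if not dims:
        return depth
    d = rng.choice(dims)                      # random dimension
    lo = min(p[d] for p in data)
    hi = max(p[d] for p in data)
    cut = rng.uniform(lo, hi)                 # random threshold
    # Keep only the points that fall on the same side of the cut as `point`.
    same_side = [p for p in data if (p[d] < cut) == (point[d] < cut)]
    return path_length(point, same_side, rng, depth + 1)


def mean_path_length(point, data, trials=100, seed=0):
    """Average the path length over several independent random partitions."""
    rng = random.Random(seed)
    return sum(path_length(point, data, rng) for _ in range(trials)) / trials


# Ten clustered points plus one far-away point (hypothetical values).
inliers = [(10 + i, 10 + (3 * i) % 10) for i in range(10)]
outlier = (1000, 1000)
data = inliers + [outlier]

print(mean_path_length(outlier, data))
print(mean_path_length(inliers[4], data))
```

The far-away point is usually isolated by the first cut, so its mean path length stays close to 1, while a point inside the cluster always needs at least two cuts. A shorter path therefore corresponds to a higher outlier score, as the background section states.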


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/2433
Inventor: 巩国栋, 严朝豪, 薛野, 宋洋, 孙凯
Owner: 北京志翔科技股份有限公司