Secondary screening based high-efficiency abnormal time series data extraction method

A secondary-screening technique for time series data, applied in the computer field. It addresses the high time complexity of existing approaches, which makes them time-consuming and makes satisfactory results difficult to obtain.

Active Publication Date: 2016-12-14
北京中科慧云科技有限公司

AI Technical Summary

Problems solved by technology

However, DTW itself has high time complexity, so computing the nearest-neighbor distance nearest_neighbor_dist through an inner loop over a large amount of time series data is too time-consuming. As a result, the Euclidean distance cannot simply be replaced with the DTW distance to measure the difference between two subsequences, and satisfactory results are difficult to achieve.
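The cost described above can be seen in a minimal sketch of the two distance measures. This is the textbook dynamic-programming form of DTW, not code from the patent; the function names `euclidean` and `dtw_distance` and the sample sequences are assumptions made for illustration. Euclidean distance takes one pass over the points, while DTW fills an n-by-m cost table, which is why calling it inside an inner loop over every subsequence pair becomes expensive.

```python
import math


def euclidean(a, b):
    """Point-by-point distance; O(n) and sensitive to phase shifts."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Textbook DTW with squared point costs; O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


# Two copies of the same bump, shifted by one sample (hypothetical data).
a = [0, 0, 1, 2, 1, 0, 0]
b = [0, 1, 2, 1, 0, 0, 0]
print(euclidean(a, b), dtw_distance(a, b))
```

On this shifted pair the Euclidean distance is nonzero although the shapes are identical, whereas DTW warps the time axis and reports no difference, which is the phase-displacement effect that motivates using DTW despite its cost.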



Examples


Embodiment Construction

[0053] The features and exemplary embodiments of various aspects of the present invention will be described in detail below. The following description covers many specific details in order to provide a comprehensive understanding of the present invention. However, it is obvious to those skilled in the art that the present invention can be implemented without some of these specific details. The following description of the embodiments is only to provide a clearer understanding of the present invention by showing examples of the present invention. The present invention is by no means limited to any specific configuration and algorithm proposed below, but covers any modification, replacement and improvement of related elements, components and algorithms without departing from the spirit of the present invention.

[0054] The embodiment of the present invention provides a method that can accurately find abnormal time series in ultra-large-scale time series data. On the basis of the existing i...


Abstract

The invention discloses a high-efficiency method for extracting abnormal time series data, which can be used to find anomalies in electrocardiogram (ECG) data in order to detect heart disease. The method includes: using the DTW (Dynamic Time Warping) distance as the system's distance function in place of the conventional Euclidean distance, so as to reduce phase displacement errors; mapping the original time series data (ECG) into a series of character strings through the SAX technique, and storing the strings in an Array data structure and a Trie ternary tree data structure; using the Array and the Trie ternary tree to find the sequences most likely to be abnormal and taking them as candidate anomalies; using secondary screening to find, within the ECG data, the nearest-neighbor distance of the first candidate anomaly and taking it as the first threshold distance; verifying through nested inner and outer loops whether the candidate is the final anomaly sought, and otherwise updating the candidate; and finally obtaining the abnormal time series in the ECG data once the inner and outer loops have run to completion. The method addresses the difficulty of finding anomalies quickly and accurately in massive ECG data caused by the high redundancy of DTW distance computations.
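The SAX mapping and candidate-selection steps in the abstract can be illustrated with a small sketch. It rests on several assumptions that are not in the source: a three-letter alphabet with the standard Gaussian breakpoints of about -0.43 and 0.43, a window length divisible by the word length, and a plain dictionary of buckets standing in for the Trie. The names `sax_word`, `index_words` and `candidate_order` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Assumed breakpoints splitting a standard normal into 3 equal-probability bins.
BREAKPOINTS = (-0.43, 0.43)
ALPHABET = "abc"


def sax_word(sub, word_len):
    """Z-normalize a subsequence, average it per segment (PAA), then discretize."""
    mu, sigma = mean(sub), pstdev(sub)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in sub]
    seg = len(z) // word_len  # assumes len(sub) is divisible by word_len
    word = []
    for k in range(word_len):
        paa = mean(z[k * seg:(k + 1) * seg])
        symbol = sum(paa > bp for bp in BREAKPOINTS)  # 0, 1 or 2
        word.append(ALPHABET[symbol])
    return "".join(word)


def index_words(series, window, word_len):
    """Return the word of every sliding window (the array) and word -> positions."""
    words = [sax_word(series[i:i + window], word_len)
             for i in range(len(series) - window + 1)]
    buckets = defaultdict(list)  # dictionary used here in place of a trie
    for pos, w in enumerate(words):
        buckets[w].append(pos)
    return words, buckets


def candidate_order(buckets):
    """Positions whose word is rarest come first: the likeliest anomalies."""
    rare_first = sorted(buckets, key=lambda w: len(buckets[w]))
    return [pos for w in rare_first for pos in buckets[w]]
```

Visiting the rarest words first is a heuristic: a subsequence whose symbolic word occurs only once or twice is more likely to be far from all others, so checking it early tends to raise the pruning threshold sooner.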

Description

Technical field

[0001] The invention belongs to the computer field, and particularly relates to a method for high-precision, rapid extraction of anomalies from massive time series data. It is applied to anomaly detection in ECG (electrocardiogram) data in order to detect heart disease.

Background technique

[0002] Over the past ten years, hundreds of articles have studied how to find, in a large amount of time series data, the subsequence most similar to a given time series (time series data refers to data recorded in chronological order). This patent instead studies how to find the subsequence that differs most from the rest of a large body of time series data, which is called a time series anomaly.

[0003] Put simply, abnormal time series data are fragments within a very large time series that differ greatly from the rest of the data. Time series data anomalies are very useful in the field ...
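The definition above, the subsequence that differs most from all others, corresponds to the subsequence with the largest nearest-neighbor distance. A minimal baseline sketch follows, assuming Euclidean distance, a fixed window length, and exclusion of overlapping windows as trivial matches; the name `find_discord` and the sample data are hypothetical, and the ordering heuristics, DTW distance and secondary screening of the invention are not reproduced here.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_discord(series, window, dist=euclidean):
    """Return (start, nearest_neighbor_dist) of the most anomalous window."""
    n = len(series) - window + 1
    best_dist, best_pos = -1.0, None
    for p in range(n):  # outer loop: each candidate subsequence
        nearest_neighbor_dist = math.inf
        for q in range(n):  # inner loop: search for its nearest neighbor
            if abs(p - q) < window:
                continue  # overlapping windows are trivial matches
            d = dist(series[p:p + window], series[q:q + window])
            nearest_neighbor_dist = min(nearest_neighbor_dist, d)
            if nearest_neighbor_dist < best_dist:
                break  # early abandon: p cannot beat the current best
        if nearest_neighbor_dist != math.inf and nearest_neighbor_dist > best_dist:
            best_dist, best_pos = nearest_neighbor_dist, p
    return best_pos, best_dist


# A regular 0/1 pattern with a single spike at index 9 (hypothetical data).
series = [0, 1, 0, 1, 0, 1, 0, 1, 0, 9, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
print(find_discord(series, 4))
```

Without pruning, the two loops compare every pair of windows, so the cost of a single distance call is multiplied by roughly the square of the number of windows. The early-abandon check only helps once a large `best_dist` has been found, which is why the order in which candidates are visited matters.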


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F19/00
CPC: G06F19/3418
Inventor: 许泽文, 李建强, 莫豪文, 田猛, 刘璐, 孙靖超
Owner 北京中科慧云科技有限公司