Time sequence variable-length motif mining method based on suffix tree

A time series motif mining technology, applied in the field of information processing, that addresses problems such as the inability to ensure similarity among results and the low precision of motif mining

Pending Publication Date: 2021-11-30
HOHAI UNIV

AI Technical Summary

Problems solved by technology

For example, motif discovery by random projection computes the mean value of the sequence and finds frequent patterns after a symbolic representation based on that mean. It can therefore only ensure that the overall change trend of the motifs is the same and cannot ensure similarity between the results, so the accuracy of mining longer time series motifs is relatively low.
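The limitation described here can be illustrated with a small sketch. It assumes a simplified mean-based symbolization (segment means mapped to letters through fixed breakpoints, with no normalization); the function name `paa_symbols`, the breakpoints, and the alphabet are hypothetical and are not taken from the patent or from any specific random projection implementation.

```python
import numpy as np

def paa_symbols(series, segment_len, breakpoints=(-0.5, 0.5), alphabet="abc"):
    """Map each fixed-length segment to a letter based on its mean (assumed scheme)."""
    x = np.asarray(series, dtype=float)
    usable = len(x) - len(x) % segment_len          # drop a trailing partial segment
    means = x[:usable].reshape(-1, segment_len).mean(axis=1)
    # searchsorted returns 0, 1, or 2 depending on which interval the mean falls in
    return "".join(alphabet[i] for i in np.searchsorted(breakpoints, means))

flat = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
spiky = np.array([0, 4, 0, 0, -4, 0, 0, 0], dtype=float)

same_word = paa_symbols(flat, 4) == paa_symbols(spiky, 4)   # identical symbol words
raw_distance = float(np.linalg.norm(flat - spiky))          # yet the raw shapes differ
```

Both sequences receive the same symbol word because their segment means match, even though one contains sharp extreme points and the other does not. That is the kind of information a mean-based representation hides.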

Examples

Embodiment

[0077] Embodiment: To verify the effect of the present application, hydrological data observed by the Dongsha Island survey station of China is used as the experimental data, and the experiment is carried out from three aspects: (1) analyze the stability of the algorithm of this application in detail on the data set; (2) compare with the case without piecewise linear representation to analyze the effectiveness of the algorithm of this application; (3) analyze the time performance of the algorithm of this application based on the existing data set.

[0078] The following analyzes the stability, effectiveness, and efficiency of the algorithm of this application on the Dongsha Island station data set.

[0079] 1) Stability analysis: change the change rate threshold d, which in turn affects the compression rate, and compare whether the results of motif discovery are related under different compression strengths.

[0080] 2) Validity analysis: compare whether the motifs found witho...
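The link between the change rate threshold d and the compression rate in the stability analysis of [0079] can be sketched as below. This is an illustration only: it assumes an edge point is a sample where the slope changes by more than d, that the two endpoints are always kept, and that the compression rate is the fraction of samples discarded. None of these definitions, nor the function names, are confirmed by the visible text.

```python
import numpy as np

def edge_point_indices(series, d):
    """Indices kept as edge points: endpoints plus points whose slope change exceeds d."""
    x = np.asarray(series, dtype=float)
    # second difference = slope after the point minus slope before it (unit time step)
    slope_change = np.abs(np.diff(x, n=2))
    interior = np.where(slope_change > d)[0] + 1
    return [0] + interior.tolist() + [len(x) - 1]

def compression_rate(series, d):
    """Fraction of samples discarded when only edge points are kept (assumed definition)."""
    kept = len(edge_point_indices(series, d))
    return 1.0 - kept / len(series)

def sweep_thresholds(series, thresholds):
    """Compression rate for each candidate threshold d."""
    return {d: compression_rate(series, d) for d in thresholds}

series = [0, 1, 2, 1, 0, 1, 2, 1, 0]
rates = sweep_thresholds(series, [0.5, 3.0])
```

A larger d keeps fewer edge points and so compresses more strongly. A stability check would then rerun motif discovery at each d and compare the results.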

Abstract

The invention relates to a time sequence variable-length motif mining method based on a suffix tree. The method comprises the following steps: performing slope-based pattern representation, setting a change rate threshold, and extracting all edge points to obtain an edge point set; constructing a suffix tree from the edge points of the edge point set and using the suffix tree to count the frequency of edge point subsequences, wherein the edge point subsequence with the maximum frequency is a frequent pattern; mapping the frequent pattern back to the original time sequence and recording the positions of the variable-length motifs; and, according to the positions of the variable-length motifs, calculating Matrix Profile values between the variable-length motifs, wherein the motif with the minimum Matrix Profile value is an effective motif. Because the invention adds the extraction of an effective motif, it solves the problem of low motif discovery precision caused by symbolization hiding extreme point information, and improves the precision of time sequence variable-length motif mining.
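The steps in the abstract can be sketched end to end as follows. This is a minimal illustration under several assumptions that the abstract does not confirm: an edge point is taken to be a sample where the slope changes by more than the threshold d, each segment between consecutive edge points is labeled `u`, `d`, or `f` by its direction, a dictionary of suffix prefixes stands in for a real suffix tree, and a z-normalized Euclidean distance after resampling stands in for the Matrix Profile value. All function names are hypothetical.

```python
import numpy as np
from collections import defaultdict
from itertools import combinations

def extract_edge_points(series, d):
    """Endpoints plus interior points whose slope change exceeds d (assumed definition)."""
    x = np.asarray(series, dtype=float)
    slope_change = np.abs(np.diff(x, n=2))
    interior = np.where(slope_change > d)[0] + 1
    return [0] + interior.tolist() + [len(x) - 1]

def symbolize(series, edges):
    """Label each segment between consecutive edge points: u (up), d (down), f (flat)."""
    x = np.asarray(series, dtype=float)
    return "".join("u" if x[b] > x[a] else "d" if x[b] < x[a] else "f"
                   for a, b in zip(edges[:-1], edges[1:]))

def most_frequent_pattern(symbols, min_len=2):
    """Count every prefix of every suffix (a stand-in for walking a suffix tree)."""
    positions = defaultdict(list)
    for start in range(len(symbols)):
        for end in range(start + min_len, len(symbols) + 1):
            positions[symbols[start:end]].append(start)
    # highest occurrence count wins; longer patterns break ties
    pattern = max(positions, key=lambda p: (len(positions[p]), len(p)))
    return pattern, positions[pattern]

def znorm_distance(a, b, length=32):
    """Resample both subsequences to a common length, z-normalize, compare (assumption)."""
    def prep(seg):
        seg = np.asarray(seg, dtype=float)
        grid = np.linspace(0, len(seg) - 1, length)
        seg = np.interp(grid, np.arange(len(seg)), seg)
        std = seg.std()
        return (seg - seg.mean()) / std if std > 0 else seg - seg.mean()
    return float(np.linalg.norm(prep(a) - prep(b)))

def best_motif_pair(series, edges, pattern, starts):
    """Map occurrences back to index spans in the series and return the closest pair."""
    x = np.asarray(series, dtype=float)
    spans = [(edges[s], edges[s + len(pattern)]) for s in starts]
    return min(combinations(spans, 2),
               key=lambda p: znorm_distance(x[p[0][0]:p[0][1] + 1],
                                            x[p[1][0]:p[1][1] + 1]),
               default=None)

series = [0, 1, 2, 1, 0, 1, 2, 1, 0]
edges = extract_edge_points(series, d=0.5)
symbols = symbolize(series, edges)
pattern, starts = most_frequent_pattern(symbols)
pair = best_motif_pair(series, edges, pattern, starts)
```

Because occurrences are mapped back through edge point indices, the recovered spans can differ in length, which is why the distance step resamples before comparing. The sketch expects at least three samples and at least `min_len` segments, and it does not filter out overlapping occurrences; a fuller implementation would handle both.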

Description

Technical field

[0001] The present application relates to the technical field of information processing, in particular to a time series variable length motif mining method based on a suffix tree.

Background technique

[0002] Time series data mining belongs to the category of data mining. Its main goal is to discover meaningful information from time series data, and it needs to complete tasks such as clustering, classification, similarity search, anomaly detection, and motif mining. Among them, time series motif mining is to find recurring unknown patterns in time series without any prior information about their location or shape. In addition, time series motif mining is not only applicable to one-dimensional or multi-dimensional data, but also applicable to different types of sequence data, such as spatial sequence data, time series data, and stream data. And time series motif mining technology has also been applied in many fields such as genetics, medicine, mathematics, m...

Application Information

IPC(8): G06F16/2458
CPC: G06F16/2465, G06F16/2474, Y02A10/40
Inventor: 王继民, 保宏程, 崔明星
Owner: HOHAI UNIV