Streamed data frequent item set mining algorithm based on nested time window

The technique combines frequent itemset mining with time windows and applies to electrical digital data processing, special-purpose data processing applications, and computing. It addresses the problem of an uncertain window size and achieves good efficiency.

Inactive Publication Date: 2017-10-03
CHONGQING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

In existing algorithms, every window model uses the transaction as its basic unit, and the algorithm cannot determine a window size appropriate for containing the main recent frequent itemsets.



Examples

Embodiment Construction

[0038] The preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0039] The basic idea of the present invention is as follows: a fixed-size outer time window first filters out the recent data; a value evaluation model then evaluates the data items and determines the range containing the most recent frequent itemsets, so that the window length is adjusted adaptively. In this way the algorithm can filter out more meaningful frequent itemsets.
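The first stage described above, filtering recent data with a fixed-size outer time window, can be sketched as follows. This is a minimal illustration only: the function name `filter_outer_window`, the `(timestamp, items)` tuple layout, and the sample data are assumptions made for the example and do not come from the patent text. The value evaluation model and the adaptive adjustment of the window are not shown, because the excerpt does not define them.

```python
from datetime import datetime, timedelta

def filter_outer_window(transactions, now, window_length):
    """Keep transactions whose timestamp lies in (now - window_length, now].

    `transactions` is an iterable of (timestamp, items) pairs; this layout
    is a hypothetical choice for the sketch, not the patent's data format.
    """
    start = now - window_length
    return [(ts, items) for ts, items in transactions if start < ts <= now]

# Hypothetical sample stream of timestamped transactions.
stream = [
    (datetime(2017, 1, 1, 9, 0), {"a", "b"}),
    (datetime(2017, 1, 1, 9, 30), {"b", "c"}),
    (datetime(2017, 1, 1, 10, 15), {"a", "c"}),
    (datetime(2017, 1, 1, 10, 45), {"a", "b", "c"}),
]

# A one-hour outer window ending at 11:00 keeps the last two transactions.
recent = filter_outer_window(
    stream,
    now=datetime(2017, 1, 1, 11, 0),
    window_length=timedelta(hours=1),
)
```

Because the window is defined by time rather than by a transaction count, the number of transactions it holds varies with the arrival rate of the stream.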

[0040] The technical scheme of the present invention comprises the following steps:

[0041] Step 1: Data item-time axis mapping

[0042] In traditional frequent itemset mining algorithms based on the sliding window model, a fixed-size sliding window is given and frequent itemset mining is then performed within it. Observing the mining results, we find that the obtained frequent itemsets follow a certain distribution, as shown in Figure 1.
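For comparison, the traditional fixed-size sliding window model mentioned in this paragraph can be sketched as below. The sketch is simplified in two ways that are assumptions of this example: the window is measured in number of transactions, and only single items are counted rather than full itemsets. The name `frequent_items_in_window` and the sample data are hypothetical.

```python
from collections import Counter, deque

def frequent_items_in_window(stream, window_size, min_support):
    """Count item support over the last `window_size` transactions.

    Returns a dict mapping each item to its support count, restricted to
    items whose count is at least `min_support` (an absolute count).
    """
    # A deque with maxlen drops the oldest transaction automatically,
    # which is what gives the window its fixed size.
    window = deque(maxlen=window_size)
    for transaction in stream:
        window.append(frozenset(transaction))

    counts = Counter()
    for transaction in window:
        counts.update(transaction)
    return {item: n for item, n in counts.items() if n >= min_support}

# Hypothetical sample stream of transactions.
stream = [["a", "b"], ["b", "c"], ["a", "c"], ["a", "b", "c"], ["a", "b"]]

# Only the last three transactions are inside the window.
result = frequent_items_in_window(stream, window_size=3, min_support=3)
```

The window size here is a fixed parameter chosen in advance, which is the limitation the nested time window approach is meant to address.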

[0043] In this fixed-size window, it...

Abstract

The invention relates to a frequent itemset mining algorithm for streaming data based on nested time windows, and belongs to the field of data stream mining. The algorithm comprises the following steps: first, an outer time window screens out the recent data; next, all transaction data in that window are mapped onto a time axis; then the size of the inner time window is adjusted adaptively according to the retention factor of each data item and an expected window value; finally, data mining is carried out with the classic Eclat algorithm. The algorithm can quickly and effectively extract the main recent frequent itemsets in a data stream, and time and space complexity is improved to a certain degree. The algorithm is also highly extensible and adaptable.
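The final mining step names the Eclat algorithm. The sketch below shows a generic textbook form of Eclat, which stores for each item the set of transaction IDs containing it and grows itemsets depth-first by intersecting those sets. It is not the patent's implementation: the function name `eclat`, the absolute `min_support` threshold, and the sample transactions are assumptions for illustration, and the retention factor and inner-window adjustment are not modeled because the excerpt does not define them.

```python
def eclat(transactions, min_support):
    """Return {itemset tuple: support count} for all frequent itemsets.

    `min_support` is an absolute count. Itemsets are tuples of items in
    sorted order.
    """
    # Vertical layout: item -> set of ids of transactions containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # `candidates` holds (item, tidset) pairs, each already frequent
        # when combined with `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect with later candidates to form longer itemsets.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    initial = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), initial)
    return frequent

# Hypothetical transactions, e.g. those left inside the inner window.
transactions = [{"a", "b"}, {"b", "c"}, {"a", "c"}, {"a", "b", "c"}, {"a", "b"}]
itemsets = eclat(transactions, min_support=2)
```

With these sample transactions, `itemsets[("a", "b")]` is 3, and `("a", "b", "c")` is absent because it appears in only one transaction.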

Description

Technical field

[0001] The invention belongs to the field of data stream mining and relates to a frequent itemset mining algorithm for streaming data based on nested time windows.

Background

[0002] With the rapid development of computer technology, complex data is growing explosively. As a special form of data, data streams are found widely across industries and functional fields, for example e-commerce data, satellite remote sensing data, web click-stream data, financial services data, and sensor data. Mining frequent itemsets on data streams is a significant and challenging task. Streaming data differs from traditional static data: it is continuous, high-speed, and unbounded, and cannot be held in memory in full. Algorithms that scan the database multiple times are therefore no longer suitable for frequent itemset mining on streaming data. In addition, data streams have a strong real-time nature, so the analysis and processing of data is required to be instant or online, and t...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2465, G06F16/24568
Inventor: 熊安萍, 黄奕, 蒋溢, 祝清意, 水源
Owner: CHONGQING UNIV OF POSTS & TELECOMM