Data flow high-utility item set mining system based on historical utility table pruning

A technology for data items and data streams, applied in visual data mining, special data processing applications, database indexing, etc. It addresses the high space complexity of high-utility tree structures, useless processing of low-utility data, and low algorithm efficiency, with the effect of reducing time and space complexity, reducing the number of generation and recursion steps, and improving performance.

Pending Publication Date: 2021-12-14
上海熙业信息科技有限公司

AI Technical Summary

Problems solved by technology

[0007] Current pattern growth algorithms inevitably suffer from problems such as too many candidate itemsets and redundant items and useless processing of low-utility data, which often leads to high space complexity of ...




Embodiment Construction

[0024] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, specific implementations of the present invention are described clearly and completely below with reference to the accompanying drawings. It should be understood that the specific implementations described here are only used to illustrate and explain the present invention and are not intended to limit it.

[0025] The implementation process of the system of the present invention is shown in Figure 2:

[0026] Step 1. Creation and update of the historical utility value table;

[0027] Step 2. Construction, update, and optimization of the global header table and the global tree;

[0028] Step 3. High-utility itemset mining on the optimized global data structure;

[0029] Step 4. Distributed high-utility itemset mining system.

[0030] The individual steps are detailed below.

[0031] Step 1: ...



Abstract

The invention discloses a data stream high-utility itemset mining system based on historical utility table pruning. Sliding-window-based high-utility itemset mining over data streams is one of the most challenging subjects in the field of data mining. Current algorithms generate a large number of candidate itemsets and redundant items, so performance degrades when large-scale data streams are mined, and few historical mining results are referenced during the mining process. The innovation of the method is that a historical utility value table is established, so that the search space of the data stream is built effectively from historical data and candidate items and redundant items are reduced. In addition, the data mining system is built on a distributed architecture, so that the historical utility value table is created and updated without affecting data stream mining. The efficiency of high-utility itemset mining over data streams is thereby effectively improved.
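The abstract does not spell out how the historical utility value table is laid out, so the following is only a minimal sketch of one way a per-item utility table could be maintained over a sliding window and used for pruning. It substitutes the widely used transaction-weighted utility (TWU) bound for whatever measure the invention actually stores; the class name `SlidingWindowUtilityTable`, the batch layout, and the toy profits are all hypothetical. The pruning step relies on the known property that an itemset's utility never exceeds the TWU of any item it contains.

```python
from collections import Counter, deque


class SlidingWindowUtilityTable:
    """Hypothetical per-item TWU table over the last `window_size` batches."""

    def __init__(self, window_size, unit_profit):
        self.window_size = window_size
        self.unit_profit = unit_profit  # item -> profit per unit (assumed known)
        self.batches = deque()          # per-batch TWU contributions, oldest first
        self.twu = Counter()            # item -> TWU inside the current window

    def _batch_twu(self, batch):
        # Each transaction adds its total utility to every item it contains.
        contrib = Counter()
        for transaction in batch:
            tu = sum(q * self.unit_profit[i] for i, q in transaction.items())
            for item in transaction:
                contrib[item] += tu
        return contrib

    def add_batch(self, batch):
        contrib = self._batch_twu(batch)
        self.batches.append(contrib)
        self.twu.update(contrib)
        if len(self.batches) > self.window_size:
            # Slide the window: remove what the oldest batch contributed.
            expired = self.batches.popleft()
            self.twu.subtract(expired)
            for item in expired:
                if self.twu[item] <= 0:
                    del self.twu[item]

    def promising_items(self, min_util):
        # Items below the threshold cannot appear in any high-utility itemset
        # of the current window, so they can be pruned before mining.
        return {i for i, v in self.twu.items() if v >= min_util}


# Toy stream: each transaction maps item -> quantity.
table = SlidingWindowUtilityTable(window_size=2, unit_profit={"a": 1, "b": 2, "c": 10})
table.add_batch([{"a": 2, "b": 1}, {"a": 1}])
table.add_batch([{"b": 1, "c": 1}])
after_two = table.promising_items(min_util=10)
table.add_batch([{"a": 1, "c": 2}])  # the first batch now leaves the window
after_three = table.promising_items(min_util=20)
```

Storing each batch's contribution separately makes expiry a subtraction rather than a rescan of the window, which is the kind of saving a table built from historical data is meant to provide.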

Description

Technical field

[0001] The invention relates to a frequent pattern mining algorithm and a data stream mining system.

[0002] High-utility itemset mining is an important branch of frequent pattern mining.

Background technique

[0003] Frequent itemset mining is an important branch of data mining. It finds, among all transactions in a data set, the itemsets whose occurrence frequency exceeds a user-defined threshold. As frequent itemsets came into wide use, it was found that some infrequent itemsets can create higher value than frequent ones. In response, scholars proposed the concept of high-utility itemset mining. It overcomes the shortcoming that frequent itemset mining ignores item weight information such as occurrence count, price, profit, and regional distribution, and it evaluates the importance of an itemset through a comprehensive utility measure.

[0004] A...
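The contrast drawn in paragraph [0003] between frequency and utility can be made concrete with a small worked example. This sketch assumes the common definition in which an itemset's utility in a transaction is the sum of quantity times unit profit over its items. The toy transactions, profit values, and function names are hypothetical, and the brute-force enumeration is for illustration only, not the mining procedure of the invention.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "milk": 2},
    {"bread": 3},
    {"milk": 1, "caviar": 1},
]
unit_profit = {"bread": 1, "milk": 2, "caviar": 50}


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def utility(itemset, transactions, unit_profit):
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * unit_profit[i] for i in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_util):
    """Brute-force enumeration; exponential in the number of items."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, unit_profit)
            if u >= min_util:
                result[itemset] = u
    return result


found = high_utility_itemsets(transactions, unit_profit, min_util=40)
```

Here `("bread",)` appears in three of the four transactions but has a utility of only 6, while `("caviar",)` appears once and has a utility of 50. A frequency threshold would keep the former and drop the latter, whereas a utility threshold does the opposite.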


Application Information

IPC(8): G06F16/26, G06F16/22, G06F16/27
CPC: G06F16/26, G06F16/27, G06F16/2282, Y02D10/00
Inventor: 闫凤麒, 陈欣如
Owner: 上海熙业信息科技有限公司