Method for Compressing Intermediate Candidate Frequent Itemsets in the Field of Database Intrusion Detection

A technology of frequent itemsets and intrusion detection, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses the declining efficiency of the Apriori algorithm, the large number of frequent itemsets, and the large I/O overhead, thereby reducing the database scanning workload, increasing computation speed, and improving search efficiency.

Active Publication Date: 2018-11-27
TIANJIN NANKAI UNIV GENERAL DATA TECH

AI Technical Summary

Problems solved by technology

For a massive database, the number of frequent itemsets becomes very large, so the efficiency of the improved Apriori algorithm declines and still cannot meet requirements; the system's I/O overhead is also very large.


Embodiment Construction

[0032] Specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0033] Based on the improved Apriori algorithm described in the background section (where Ck denotes the set of candidate frequent k-itemsets and Lk denotes the set of frequent k-itemsets), a method for compressing intermediate candidate frequent itemsets in the field of database intrusion detection is proposed. The method comprises the following steps. Taking the database shown in Figure 1 as an example, the execution of the algorithm of the present invention is shown in Figure 2:

[0034] Step 1: Given the target transaction number, select from the transaction database the transactions whose item count is not less than the target transaction number, and use them as the new transaction database. In this embodiment the target transaction number is 3, that is, filter out the number ...
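The filtering in Step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `filter_transactions`, the parameter `target_size`, and the toy database are all hypothetical, and transactions are assumed to be plain sets of item labels.

```python
def filter_transactions(transactions, target_size):
    """Keep only transactions that contain at least `target_size` distinct items.

    Assumption: each transaction is an iterable of hashable item labels.
    """
    return [frozenset(t) for t in transactions if len(set(t)) >= target_size]


# Hypothetical toy database (not the one in the patent's Figure 1).
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "C", "D"},
    {"C"},
    {"B", "C", "D"},
]

new_db = filter_transactions(transactions, target_size=3)
print(len(new_db))  # 3
```

A transaction with fewer than 3 items cannot contain any 3-itemset, so dropping it does not change the support count of any 3-itemset.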


Abstract

The invention provides a method for compressing intermediate candidate frequent itemsets in the field of database intrusion detection. The method comprises the following steps: (1) according to a target transaction number, selecting from a transaction database the transactions whose item count is not less than the target transaction number, to construct a new transaction database; (2) scanning the new transaction database according to the join step and the prune step of the Apriori algorithm, and computing the frequent 1-itemsets L(1); (3) finding, among the frequent 1-itemsets L(1), a plurality of top-ranked candidate itemsets whose size equals the target transaction number; (4) scanning the candidate itemsets to obtain the frequent itemsets whose size equals the target transaction number. The method has the following advantages: it avoids generating intermediate candidate frequent itemsets and intermediate frequent itemsets one size at a time starting from 1, which greatly improves data mining and search efficiency; and it reduces the database scanning workload, which greatly speeds up frequent itemset computation.
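The four steps of the abstract can be sketched end to end as below. This is one possible reading under stated assumptions, not the patented procedure: step (3) is interpreted as ranking the frequent items by support count, optionally keeping only the top `top_n` of them, and forming every combination of `target_size` items as a candidate. The names `mine_target_itemsets`, `target_size`, `min_count`, and `top_n` are hypothetical, and the minimum support is expressed as an absolute count.

```python
from collections import Counter
from itertools import combinations


def mine_target_itemsets(transactions, target_size, min_count, top_n=None):
    # Step 1: keep transactions with at least `target_size` items.
    db = [frozenset(t) for t in transactions if len(set(t)) >= target_size]

    # Step 2: one scan of the reduced database to get the frequent 1-itemsets L1.
    counts = Counter(item for t in db for item in t)
    l1 = {item: c for item, c in counts.items() if c >= min_count}

    # Step 3 (assumed interpretation): rank frequent items by support count,
    # optionally keep the top `top_n`, and build size-`target_size` candidates.
    ranked = sorted(l1, key=lambda item: (-l1[item], item))
    if top_n is not None:
        ranked = ranked[:top_n]
    candidates = [frozenset(c) for c in combinations(ranked, target_size)]

    # Step 4: scan the reduced database to count each candidate's support.
    frequent = {}
    for cand in candidates:
        count = sum(1 for t in db if cand <= t)
        if count >= min_count:
            frequent[cand] = count
    return frequent


# Hypothetical toy database.
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "C", "D"},
    {"C"},
    {"B", "C", "D"},
    {"A", "B", "C", "E"},
]
result = mine_target_itemsets(transactions, target_size=3, min_count=2)
for itemset, count in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), count)
```

In this sketch the sizes between 1 and `target_size` are never materialized, which is the saving the abstract describes. The number of combinations can still grow quickly, which is why `top_n` is offered as an optional cap.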

Description

Technical field

[0001] The invention belongs to the technical field of the Apriori algorithm, and in particular relates to a method for compressing intermediate candidate frequent itemsets in the field of database intrusion detection.

Background

[0002] Association rule mining plays an extremely important role in data mining and is one of its main tasks. The classic association rule algorithm is the Apriori algorithm. It uses a layer-by-layer iterative approach in which k-itemsets are used to search for (k+1)-itemsets, and it relies on the Apriori property: all non-empty subsets of a frequent itemset must also be frequent.

[0003] Apriori algorithm: by definition, if an itemset I does not satisfy the minimum support (min_sup), then I is not frequent, that is, P(I) < min_sup. If an item A is added to itemset I, the resulting itemset (I ∪ A) cannot occur more frequently than I. Therefore, P(I∪A) ...
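The Apriori property and the layer-by-layer step from k-itemsets to (k+1)-itemsets can be illustrated with a short sketch. The helper names `support` and `apriori_gen` and the toy data are hypothetical, and the join shown here is a simplified union-based join rather than the prefix-based join of the textbook algorithm.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def apriori_gen(frequent_k):
    """Build (k+1)-item candidates from frequent k-itemsets (join, then prune)."""
    frequent_k = set(frequent_k)
    candidates = set()
    for a in frequent_k:
        for b in frequent_k:
            union = a | b
            # Join: keep unions that add exactly one item.
            if len(union) != len(a) + 1:
                continue
            # Prune: every k-item subset must itself be frequent.
            if all(frozenset(s) in frequent_k
                   for s in combinations(union, len(a))):
                candidates.add(union)
    return candidates


# Hypothetical toy database.
transactions = [
    frozenset("ABC"),
    frozenset("AB"),
    frozenset("ABCD"),
    frozenset("C"),
    frozenset("BCD"),
]

# Adding an item can never raise the support.
print(support("A", transactions), support("AD", transactions))

frequent_2 = {frozenset("AB"), frozenset("AC"), frozenset("BC"), frozenset("BD")}
candidates_3 = apriori_gen(frequent_2)
print([sorted(c) for c in candidates_3])
```

Here {A, B, D} and {B, C, D} are pruned because {A, D} and {C, D} are not among the frequent 2-itemsets, leaving {A, B, C} as the only 3-item candidate.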


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/2453
Inventor: 李淼, 吕迅, 朱宏军, 崔维力, 武新
Owner: TIANJIN NANKAI UNIV GENERAL DATA TECH