Lightning activity data statistics method based on modified Apriori algorithm

A statistical method for lightning activity data, applied in the laser field, that addresses problems such as low efficiency, improving execution efficiency and saving storage overhead.

Inactive Publication Date: 2014-02-19
陕西省气象局

AI Technical Summary

Problems solved by technology

In practical applications, the amount of data is large, so each stage will generate a large number of candidat

Examples

Embodiment 1

[0023] The present invention improves the Apriori algorithm with an optimization strategy based on a directed graph and weighted association rules. The first step is to calculate the weighted support and the weighted confidence:

[0024] Let I = {i1, i2, ..., im} be the set of items, with a corresponding weight vector W = {w1, w2, ..., wm}. The i-th transaction ti is a subset of I, and the j-th item in ti (denoted ti[ij]) has a weight w. In this way, each item corresponds to a value in W.

[0025] The item set transaction weight is a summary of the weight of each item in the item set in the database. The item weight of the item set X in the transaction ti is calculated as:

[0026] Weighted support is a summary of the weights of transaction item sets in the transaction database that contain the item:

[0027]

[0028] Where NX is the count of occurrences of X in the database; ...
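The formulas for this step did not survive in the text above, so the following Python sketch only illustrates the general idea of weighting support and confidence. It assumes, hypothetically, that the weight of an item set is the mean of its item weights and that weighted support is that weight multiplied by N_X / N, where N is the total number of transactions. The function names, weights, and transactions are invented for illustration and are not taken from the patent.

```python
from typing import Dict, FrozenSet, List


def itemset_weight(itemset: FrozenSet[str], weights: Dict[str, float]) -> float:
    """ASSUMPTION: item-set weight is the mean of the weights of its items."""
    return sum(weights[i] for i in itemset) / len(itemset)


def weighted_support(itemset: FrozenSet[str],
                     transactions: List[FrozenSet[str]],
                     weights: Dict[str, float]) -> float:
    """ASSUMPTION: weighted support = item-set weight * N_X / N."""
    # N_X: number of transactions that contain every item of the item set.
    n_x = sum(1 for t in transactions if itemset <= t)
    return itemset_weight(itemset, weights) * n_x / len(transactions)


def weighted_confidence(antecedent: FrozenSet[str],
                        consequent: FrozenSet[str],
                        transactions: List[FrozenSet[str]],
                        weights: Dict[str, float]) -> float:
    """Weighted confidence of antecedent -> consequent as a ratio of weighted supports."""
    ws_a = weighted_support(antecedent, transactions, weights)
    if ws_a == 0:
        return 0.0
    return weighted_support(antecedent | consequent, transactions, weights) / ws_a


# Hypothetical data, not from the patent.
weights = {"a": 1.0, "b": 0.5, "c": 0.5}
transactions = [frozenset(t) for t in (
    {"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"},
)]

ws_ab = weighted_support(frozenset({"a", "b"}), transactions, weights)
wc_a_b = weighted_confidence(frozenset({"a"}), frozenset({"b"}), transactions, weights)
```

With these invented values, {a, b} appears in 2 of 4 transactions with mean weight 0.75, so `ws_ab` is 0.375, and the weighted confidence of a -> b is 0.375 / 0.75 = 0.5. A different weighting scheme would change these numbers but not the overall structure.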

Embodiment 2

[0032] The invention improves the Apriori algorithm by adopting an optimization strategy based on a directed graph and weighted association rules. The algorithm stores item information in a bit-vector structure: each frequent set has a corresponding bit vector, and the number of bits in each vector equals the total number of transactions in the database. The algorithm scans the database only once, counting the frequent items and setting their corresponding bit vectors.

[0033] For example: the transactional database is {, , , , , , , , }

[0034] Refer to Figure 2. The specific method is:

[0035] If an item is present in a transaction, the bit for that transaction in the item's bit string is set to 1; otherwise it is set to 0. After all transactions have been checked, each item corresponds to a binary bit string. Items (nodes) in the database are then mapped to bitmaps sorted from highest to lowest support. If the minimum support count is 2, then the frequent item in this database is i ...
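The single-scan conversion to a vertical bit-vector format can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the example transactions were lost from the text above, so the database here is hypothetical, and the function names are invented. Each bit vector is held in a Python integer whose bit k is 1 when the item occurs in transaction k.

```python
from typing import Dict, List, Set


def build_bit_vectors(transactions: List[Set[str]]) -> Dict[str, int]:
    """Scan the database once and build one bit vector (as an int) per item."""
    vectors: Dict[str, int] = {}
    for pos, transaction in enumerate(transactions):
        for item in transaction:
            # Set bit `pos` in this item's vector.
            vectors[item] = vectors.get(item, 0) | (1 << pos)
    return vectors


def support_count(vector: int) -> int:
    """Number of 1 bits, i.e. the number of transactions containing the item."""
    return bin(vector).count("1")


def frequent_items(vectors: Dict[str, int], min_count: int) -> List[str]:
    """Items meeting the minimum support count, sorted from highest to lowest support."""
    kept = [i for i, v in vectors.items() if support_count(v) >= min_count]
    # Ties are broken by item name so the order is deterministic (an assumption).
    return sorted(kept, key=lambda i: (-support_count(vectors[i]), i))


# Hypothetical transaction database, not the one from the patent.
transactions = [{"i1", "i2"}, {"i2", "i3"}, {"i1", "i2", "i3"}, {"i4"}]
vectors = build_bit_vectors(transactions)
ordered = frequent_items(vectors, min_count=2)
```

For this invented database, `i2` occurs in three transactions and `i1` and `i3` in two each, so `ordered` is `["i2", "i1", "i3"]`; `i4` is dropped because it falls below the minimum support count of 2.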

Embodiment 3

[0038] The vector bitmap is constructed in the same way as in Embodiment 2. The frequent binomial graph is then built as described in step 3. This example, with reference to Figure 1, describes how the directed graph is constructed. The node whose bitmap contains the most 1s is placed on the top layer. If two items appear together in the same transaction, and the number of such co-occurrences meets the minimum support requirement (greater than or equal to the minimum support), an edge is drawn between these two nodes in the directed graph. The edge is represented by a binary string (the binary string is obtained by combining the bit strings of the two nodes, and the number of 1s in the string represents the number of times the two nodes appear at the same time). Combined with the transaction database instance in Embodiment 2, the frequent item diagram...
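The graph construction described here can be sketched in Python under a few stated assumptions. The edge bit string is computed with a bitwise AND of the two node bit vectors, which is the operation consistent with "the number of 1s is the number of co-occurrences"; edges are directed from the higher-support node to the lower-support one; and the bit vectors are hypothetical sample data. The name `build_pair_graph` is invented for this illustration.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def ones(vector: int) -> int:
    """Count the 1 bits in a bit vector stored as an int."""
    return bin(vector).count("1")


def build_pair_graph(vectors: Dict[str, int],
                     min_count: int) -> Tuple[List[str], Dict[Tuple[str, str], int]]:
    """Return (node order, edges) for the graph of frequent 2-item sets.

    Nodes are frequent items sorted by descending support, so the first node
    is the one with the most 1s (the top layer). Each edge maps an ordered
    pair of items to the bit string of transactions that contain both.
    """
    order = sorted(
        (i for i in vectors if ones(vectors[i]) >= min_count),
        key=lambda i: (-ones(vectors[i]), i),
    )
    edges: Dict[Tuple[str, str], int] = {}
    for a, b in combinations(order, 2):
        # ASSUMPTION: co-occurrence bit string = bitwise AND of the two vectors.
        both = vectors[a] & vectors[b]
        if ones(both) >= min_count:
            edges[(a, b)] = both  # directed from higher to lower support
    return order, edges


# Hypothetical bit vectors over four transactions (bit k = transaction k).
vectors = {"i1": 0b0101, "i2": 0b0111, "i3": 0b0110, "i4": 0b1000}
order, edges = build_pair_graph(vectors, min_count=2)
```

With a minimum support count of 2, this sample yields the edges (i2, i1) and (i2, i3); the pair (i1, i3) co-occurs in only one transaction and gets no edge. Storing the AND result on the edge means longer candidate item sets can later be checked by further AND operations instead of rescanning the database.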

Abstract

The invention provides a lightning activity data statistics method based on a modified Apriori algorithm. The method includes: 1, calculating weighted support and weighted confidence; 2, performing vertical bit vector format conversion; 3, generating frequent bipartite graphs; 4, mining candidate sets. Items are assigned suitable weights according to actual needs, and the original support and confidence are replaced by weighted support and weighted confidence, which are more practical. In addition, the algorithm stores item information in a vertical bit-vector data format, which saves storage space and improves I/O efficiency. Following a top-down approach, the modified algorithm locates the longest frequent item sets that meet the support and confidence requirements through frequent bipartite digraphs, and then generates all frequent items that meet the requirements according to the properties of frequent item sets. Applying the algorithm improves the efficiency of the Apriori algorithm in both space and time, and the algorithm better meets actual needs.

Description

Technical field

[0001] The invention belongs to the technical field of lasers, and in particular relates to a statistical method for lightning activity data based on an improved Apriori algorithm.

Background technique

[0002] With the rapid development of computer networks and the maturity of database technology, people's ability to collect and use data has greatly improved. Data mining technology came into being in order to extract usable information from these large amounts of random, practical application data. Data mining, also known as knowledge discovery, is currently the main research direction in databases. It is the process of extracting patterns or knowledge of potential value from data.

[0003] Association rules, proposed by R. Agrawal et al. in 1993, are an important topic in the field of data mining. Association rule mining refers to digging out meaningful association relationships among the items of a large data set, so as to provide valuable information...

Application Information

IPC(8): G06F17/30, G06F19/00
CPC: G06F16/283
Inventor 王卫民李婧雷欣田社教高莹
Owner 陕西省气象局