
Application based on association rule mining technology in PACS system

An association rule mining technique applied in a PACS system. It addresses the large amount of calculation and the large number of TIDset elements that limit algorithm efficiency, and achieves high mining efficiency.

Status: Inactive. Publication date: 2021-04-30
河北上晟医疗科技发展有限公司

AI Technical Summary

Problems solved by technology

[0003] The Apriori algorithm mines frequent itemsets with a layer-by-layer (level-wise) search. Before each scan, it generates candidate frequent itemsets through candidate generation and pruning. This greatly reduces the number of candidate itemsets whose support must be counted and gives good mining efficiency to a certain extent, but two bottlenecks remain:
(1) The algorithm still generates a large number of candidate itemsets, especially second-order (2-item) candidates.
(2) The algorithm must scan the entire data set multiple times and check a large set of candidate itemsets by pattern matching, which is a large overhead and greatly reduces the efficiency of the algorithm.
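
The candidate generation and pruning step described above can be sketched as follows. This is a minimal illustration, not the patent's code: the function name `generate_candidates` and the item labels are assumptions. It also shows why second-order candidates are numerous: n frequent single items yield n(n-1)/2 candidate pairs.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune any candidate
    that has an infrequent (k-1)-subset."""
    prev = set(frequent_prev)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(s) in prev for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


# Hypothetical frequent 1-itemsets: four items give six 2-item candidates.
frequent_1 = [frozenset([x]) for x in ("I1", "I2", "I3", "I4")]
candidates_2 = generate_candidates(frequent_1, 2)

# Hypothetical frequent 2-itemsets: only {I1, I2, I3} survives pruning.
frequent_2 = [frozenset(p) for p in (("I1", "I2"), ("I1", "I3"), ("I2", "I3"), ("I1", "I4"))]
candidates_3 = generate_candidates(frequent_2, 3)
```

Each surviving candidate still has to be counted against the data set, which is the repeated-scan cost named in bottleneck (2).
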
[0008] (1) The Eclat algorithm finds all frequent itemsets depth-first and cannot use the pruning theorem of the Apriori algorithm. Its search space is therefore much larger than that of Apriori, which increases the amount of calculation and reduces the efficiency of the mining algorithm.
[0009] (2) The Eclat algorithm benefits from the vertical data representation, but this has a drawback of its own: when the transaction database contains many transactions, the TIDset of each itemset contains many elements. Obtaining the support of an itemset by intersecting TIDsets then requires a very large amount of calculation, which becomes another bottleneck limiting the efficiency of the Eclat algorithm.
[0010] (3) The Eclat algorithm needs to keep the TIDsets of all itemsets while mining frequent itemsets. For large-scale data, especially dense data, this consumes a large amount of memory and limits the use of the algorithm to a certain extent.
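
For reference, the vertical representation and depth-first search that these three points refer to can be sketched as below. This is a generic, simplified Eclat, not the patent's Eclat_LSH; the function names and the four sample transactions are assumptions made for illustration.

```python
def to_vertical(transactions):
    """Map each item to its TIDset (the IDs of the transactions containing it)."""
    tidsets = {}
    for tid, items in enumerate(transactions, start=1):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def eclat(prefix, items, min_sup, out):
    """Depth-first search; the support of an itemset is the size of its TIDset."""
    for i, (item, tids) in enumerate(items):
        if len(tids) < min_sup:
            continue
        itemset = prefix | {item}
        out[itemset] = len(tids)
        # Extend the current itemset by intersecting its TIDset with later items.
        suffix = [(other, tids & other_tids) for other, other_tids in items[i + 1:]]
        eclat(itemset, suffix, min_sup, out)


# Hypothetical transaction database with four transactions.
transactions = [{"I1", "I2", "I3"}, {"I1", "I2"}, {"I2", "I3"}, {"I1", "I2", "I3"}]
frequent = {}
eclat(frozenset(), sorted(to_vertical(transactions).items()), 3, frequent)
```

Every extension costs one TIDset intersection, and every intermediate TIDset is held in memory during the recursion, which is where the calculation and memory costs in points (2) and (3) come from.
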



Examples


Specific Embodiment 1

[0057] Specific embodiment 1: For any two itemsets, denote their corresponding TIDsets as S_1 and S_2. Joining the two itemsets generates a candidate itemset, whose support must then be calculated as the size of the intersection of the two TIDsets. Assume minSup = 3. When Theorem 1 is used to evaluate the support of the itemset, the specific steps are as shown in Figure 13. In Figure 13, the arrow marks the position the algorithm has reached, where it judges whether the current element belongs to the intersection. After the algorithm finds that the third element "3" of S_1 does not belong to S_2, the comparison can stop: the intersection obtained so far has size |T| = 1, and min(|S_12|, |S_22|) = 1, so according to Theorem 1 the support count of {I2, I4} cannot reach minSup, and {I2, I4} cannot be frequent.
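
The early stop can be sketched as below. Theorem 1 is not stated in this excerpt, so the bound used here is an assumption inferred from the example: the final support cannot exceed the matches found so far plus the smaller of the two remaining parts. The function name and the two TID lists are hypothetical and are not the data of Figure 13.

```python
def bounded_support(s1, s2, min_sup):
    """Intersect two sorted TID lists with a merge scan.

    Returns the intersection size, or None as soon as the upper bound
    (matches so far + smaller remaining part) falls below min_sup.
    """
    i = j = count = 0
    while i < len(s1) and j < len(s2):
        # Assumed form of the bound: |T| + min(remaining of s1, remaining of s2).
        if count + min(len(s1) - i, len(s2) - j) < min_sup:
            return None
        if s1[i] == s2[j]:
            count += 1
            i += 1
            j += 1
        elif s1[i] < s2[j]:
            i += 1
        else:
            j += 1
    return count if count >= min_sup else None


# Hypothetical TID lists: the scan stops after the third element of s1,
# when one match has been found and at most one more is possible.
stopped = bounded_support([1, 2, 3, 5], [2, 4, 5], 3)
full = bounded_support([1, 2, 3], [1, 2, 3], 3)
```

With these lists the bound drops to 1 + 1 = 2 < 3 after element 3 of the first list is passed, so the last element is never compared.
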

[0058] Generally speaking, the time complexity of calculating the intersection of...

Specific Embodiment 2

[0060] Specific embodiment 2: This experiment is carried out on the accidents data set with minSup = 0.74. The hash function is h(x) = (4*x+5) % k, where k takes the values 1, 3, 5, 7, 9, and 11 in turn. Because the running time fluctuates from run to run, the algorithm is run 5 times for each k, and the average of the 5 runs is taken as the final experimental result. The execution time of the algorithm as k varies is shown in Figure 2.
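
The role of the hash function can be sketched as below: each TIDset is split into k subsets by h(x), and only subsets with the same hash value are intersected, with the sizes accumulated. The hash function and the k values are taken from the text; the helper names and the two sample TIDsets are assumptions, and Python's built-in set intersection stands in for whatever comparison routine the patent uses.

```python
def h(x, k):
    """Hash function from the experiment: h(x) = (4*x + 5) % k."""
    return (4 * x + 5) % k


def partition(tidset, k):
    """Split a TIDset into k subsets according to h."""
    buckets = [set() for _ in range(k)]
    for tid in tidset:
        buckets[h(tid, k)].add(tid)
    return buckets


def bucketed_support(buckets_a, buckets_b):
    """Sum the sizes of the pairwise intersections of matching subsets."""
    return sum(len(a & b) for a, b in zip(buckets_a, buckets_b))


# Hypothetical TIDsets; the bucketed result equals the direct intersection size.
s1 = {1, 2, 3, 5, 8}
s2 = {2, 3, 4, 8, 9}
supports = {k: bucketed_support(partition(s1, k), partition(s2, k))
            for k in (1, 3, 5, 7, 9, 11)}
```

Equal TIDs always hash to the same subset, so the accumulated count is exact for every k; k only changes how many elements each comparison has to look at.
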

[0061] Analysis of Figure 2 shows that as k increases, the overall execution time of the algorithm gradually decreases. This is consistent with the earlier analysis: the larger k is, the shorter the running time. However, closer inspection shows that at k = 9 and k = 11 the curve rises slightly. This is because the hash function affects the distribution of elements in the subsets, which in turn affect...

Specific Embodiment 3

[0064] Specific embodiment 3: The association rule mining algorithm is applied to mining electronic medical records for heart disease. The data are heart-disease-related physical examination records from a certain area of the United States, covering 270 patients in total. Each record has 13 attribute values (each attribute represents one physical examination index of the patient) and a class label (sick or not). Part of the data is shown in Figure 10.

[0065] Because the original data set contains real-valued attributes, the data cannot be fed directly to a frequent itemset mining algorithm. The physical examination data must therefore be preprocessed so that each attribute is discretized into a limited number of values. The specific method is as follows: for the age attribute, each person's age is divided into three intervals according to the standards of old age, middle age, an...
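
Discretizing a real-valued attribute into intervals can be sketched as below. The cut points and item labels are hypothetical placeholders chosen only for illustration; the thresholds actually used in the patent are not given in this excerpt.

```python
from bisect import bisect_right


def discretize(value, cut_points, labels):
    """Return the label of the interval that value falls into.

    cut_points must be sorted; len(labels) must be len(cut_points) + 1.
    """
    return labels[bisect_right(cut_points, value)]


# Hypothetical cut points and labels for the age attribute (three intervals).
age_cuts = [45, 60]
age_labels = ["age_bin0", "age_bin1", "age_bin2"]

items = [discretize(age, age_cuts, age_labels) for age in (30, 45, 70)]
```

Each discretized value then acts as an ordinary item, so a patient record becomes a transaction of such items plus the class label.
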



Abstract

The invention discloses an application of association rule mining technology in a PACS system. The Eclat_LSH algorithm works by reducing the number of elements that need to be compared. First, using the idea of locality-sensitive hashing, it converts the calculation of the intersection of two large sets into calculating the intersections of several small sets and accumulating the results, which reduces the number of comparisons per element. Second, it makes full use of the minimum support while calculating the support of an itemset: it evaluates an upper bound on the support and stops the calculation immediately once that bound shows the itemset cannot meet the screening condition. Because Eclat_LSH evaluates the support upper bound during the intersection calculation, the support of frequent itemsets is calculated efficiently and the number of comparisons per element during the intersection calculation is reduced.

Description

Technical field

[0001] The invention relates to the technical field of association rule mining in data mining, in particular to an application of association rule mining technology in a PACS system.

Background technique

[0002] Association rule mining algorithms find correlations between transactions by counting the items that occur most frequently. Apriori, FP-growth and Eclat are the three most classic association rule mining methods, and many later algorithms that improve mining efficiency are based on these three methods.

[0003] The Apriori algorithm mines frequent itemsets with a layer-by-layer (level-wise) search. Before each scan, it generates candidate frequent itemsets through candidate generation and pruning. This greatly reduces the number of candidate itemsets whose support must be counted and gives good mining efficiency to a certain extent, but there are still two bottlenecks in...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F16/2458, G06F16/26, G06F16/22
CPC: G06F16/2465, G06F16/26, G06F16/2246
Inventor: 徐秀芳, 张曦予, 陈宜亮, 闫国庆
Owner: 河北上晟医疗科技发展有限公司