Association rule mining method of large-scale data
An association rule mining method for large-scale data, applied in the fields of distributed computing and data mining. It addresses problems such as long data mining run times, and achieves improved mining efficiency, improved processing efficiency, and good scalability.
Examples
Example 1
[0030] The input data table shown in Table 1 has 9 records (T1, T2, ..., T9), each containing a subset of the items I1, I2, I3, I4, I5:
[0031] Table 1: Record table
[0032]
Record number | Collection of items
T1 | I1, I2, I5
T2 | I2, I4
T3 | I2, I3
T4 | I1, I2, I4
T5 | I1, I3
T6 | I2, I3
T7 | I1, I3
T8 | I1, I2, I3, I5
T9 | I1, I2, I3
[0033] To facilitate calculating the similarity between items in the data, the input data table is converted into a 0,1 state table, as shown in Table 2, where 0 means that the item does not appear in the corresponding record and 1 means that it does:
[0034] Table 2: 0,1 state table
[0035]
Record | I1 | I2 | I3 | I4 | I5
T1 | 1 | 1 | 0 | 0 | 1
T2 | 0 | 1 | 0 | 1 | 0
T3 | 0 | 1 | 1 | 0 | 0
T4 | 1 | 1 | ...
...
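The conversion described in [0033] can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the names `records`, `items`, and `to_state_table` are hypothetical, and the data is copied from Table 1.

```python
# Records from Table 1: record number -> items it contains.
records = {
    "T1": ["I1", "I2", "I5"],
    "T2": ["I2", "I4"],
    "T3": ["I2", "I3"],
    "T4": ["I1", "I2", "I4"],
    "T5": ["I1", "I3"],
    "T6": ["I2", "I3"],
    "T7": ["I1", "I3"],
    "T8": ["I1", "I2", "I3", "I5"],
    "T9": ["I1", "I2", "I3"],
}
items = ["I1", "I2", "I3", "I4", "I5"]


def to_state_table(records, items):
    """Return {record: [0 or 1 per item]}; 1 means the item appears in the record."""
    return {
        rec: [1 if item in present else 0 for item in items]
        for rec, present in records.items()
    }


state_table = to_state_table(records, items)
for rec, row in state_table.items():
    print(rec, row)
```

Keeping one fixed item order (`items`) for every row is what makes the rows comparable column by column, which the similarity calculation relies on.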
Example 2
[0069] Take frequent itemset mining for one category (T2, T8) as an example, with a default minimum support of 0.22.
[0070] The 0,1 state tables of records T2 and T8 are shown in Table 5:
[0071] Table 5: State table
[0072], [0073]
Record | I1 | I2 | I3 | I4 | I5
T2 | 1 | 1 | 0 | 0 | 1
T8 | 0 | 1 | 0 | 1 | 0
[0074] In the first scan, each item contained in this category (I1, I2, I4, I5) is treated as a candidate itemset on its own. The support of each is greater than the minimum support of 0.22, as shown in Table 6:
[0075] Table 6: Support after the first scan
[0076]
Item | Support
I1 | 50%
I2 | 100%
I4 | 50%
I5 | 50%
[0077] The frequent 1-itemsets generated by the first scan are: I1, I2, I4, I5
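The first scan can be reproduced with a short Python sketch. The rows are copied from Table 5 under the record labels used there; the names `category`, `first_scan`, and `MIN_SUPPORT` are hypothetical, and treating the threshold as inclusive (`>=`) is an assumption, since the text only says the supports are greater than 0.22.

```python
items = ["I1", "I2", "I3", "I4", "I5"]

# Rows of Table 5, keyed by the record labels used in that table.
category = {
    "T2": [1, 1, 0, 0, 1],
    "T8": [0, 1, 0, 1, 0],
}
MIN_SUPPORT = 0.22


def first_scan(category, items, min_support):
    """Compute the support of each single item and keep the frequent ones."""
    n = len(category)
    support = {}
    for col, item in enumerate(items):
        count = sum(row[col] for row in category.values())
        if count > 0:  # only items that occur in the category are candidates
            support[item] = count / n
    # Assumption: an item is frequent when support >= min_support.
    frequent = [item for item, s in support.items() if s >= min_support]
    return support, frequent


support, frequent_1 = first_scan(category, items, MIN_SUPPORT)
print(support)
print(frequent_1)
```

Support here is the fraction of the category's records that contain the item, so with two records the only possible non-zero values are 50% and 100%.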
[0078] In the second scan, the candidate 2-itemsets ({I1, I2}, {I1, I4}, {I1, I5}, {I2, I4}, {I2, I5}, {I4, I5}) are generated from the frequent 1-itemsets, and the ...
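The candidates listed for the second scan are the pairwise combinations of the frequent 1-itemsets. A minimal Python sketch of that generation step, assuming a plain Apriori-style join (the function name `candidate_pairs` is hypothetical, and the remainder of the second scan is not shown in this excerpt):

```python
from itertools import combinations

# Frequent 1-itemsets produced by the first scan ([0077]).
frequent_1 = ["I1", "I2", "I4", "I5"]


def candidate_pairs(frequent_items):
    """Return every unordered pair of frequent items as a candidate 2-itemset."""
    return list(combinations(sorted(frequent_items), 2))


candidates_2 = candidate_pairs(frequent_1)
for pair in candidates_2:
    print(pair)
```

Four frequent items yield six candidate pairs, matching the six candidates named in the text.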