Method for evaluating the availability of a data set for frequent itemset mining

A technology concerning frequent itemset mining and maximum frequent itemsets, applied in special data processing applications, digital data protection, digital data processing, etc. Stated effects: improved operating efficiency, avoidance of many repeated calculations, and a reduced search space.
CN113568942A · Pending · Publication Date: 2021-10-29 · NANJING NORMAL UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
NANJING NORMAL UNIVERSITY
Publication Date
2021-10-29

Abstract

The invention discloses a method for evaluating the availability of a data set for frequent itemset mining, comprising the following steps: (1) let C = {I1, I2, ..., In} be a set of items; given transaction data sets D1 and D2, mine D1 and D2 with the Apriori algorithm to obtain their sets of maximum frequent itemsets, denoted FIS1 and FIS2; (2) match itemsets MIS1 from FIS1 with itemsets MIS2 from FIS2 using an itemset matching algorithm F to obtain a table of paired itemsets, Pairs, whose entries are triples <MIS1, MIS2, score1>, where score1 is the item similarity of MIS1 and MIS2, computed during matching; (3) for each entry <MIS1, MIS2, score1> in Pairs, calculate the support similarity score2 of MIS1 and MIS2, then calculate their composite similarity score, and update the entry to <MIS1, MIS2, score>; and (4) sum the composite similarity scores of all entries in Pairs and divide the sum by the number of entries to obtain the similarity score SCORE of D1 and D2, whose value lies in [0, 1].
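The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented procedure: the abstract does not define the matching algorithm F or the formulas for score1, score2 and the composite score, so the sketch substitutes Jaccard similarity for item similarity, a relative-difference measure for support similarity, a greedy best-match for F, and an equally weighted average for the composite score. All function names and the `weight` parameter are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search. Returns {frozenset: support}."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)
    current = [frozenset([i]) for i in sorted({i for t in tx for i in t})]
    result = {}
    while current:
        # Count the relative support of every candidate at this level.
        level = {}
        for cand in current:
            support = sum(1 for t in tx if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-candidates.
        joined = {a | b for a, b in combinations(level, 2)
                  if len(a | b) == len(a) + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        current = [c for c in joined if all(c - {i} in level for i in c)]
    return result


def maximal(freq):
    """Keep only itemsets that have no frequent proper superset."""
    return {s: sup for s, sup in freq.items()
            if not any(s < other for other in freq)}


def item_similarity(a, b):
    """ASSUMPTION: Jaccard similarity stands in for score1."""
    return len(a & b) / len(a | b)


def support_similarity(s1, s2):
    """ASSUMPTION: 1 - relative difference stands in for score2."""
    top = max(s1, s2)
    return 1.0 if top == 0 else 1 - abs(s1 - s2) / top


def match(fis1, fis2):
    """ASSUMPTION: greedy best-match stands in for matching algorithm F."""
    pairs = []
    for m1 in fis1:
        m2 = max(fis2, key=lambda m: item_similarity(m1, m))
        pairs.append((m1, m2, item_similarity(m1, m2)))
    return pairs


def dataset_score(fis1, fis2, weight=0.5):
    """Average composite similarity over Pairs; stays within [0, 1]."""
    if not fis1 or not fis2:
        return 0.0
    total = 0.0
    for m1, m2, score1 in match(fis1, fis2):
        score2 = support_similarity(fis1[m1], fis2[m2])
        # ASSUMPTION: composite score is a weighted average of both parts.
        total += weight * score1 + (1 - weight) * score2
    return total / len(fis1)


if __name__ == "__main__":
    d1 = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    d2 = [{"a", "b"}, {"a", "b"}, {"c"}, {"a"}]
    fis1 = maximal(frequent_itemsets(d1, 0.5))
    fis2 = maximal(frequent_itemsets(d2, 0.5))
    print(dataset_score(fis1, fis2))
```

Because both component similarities lie in [0, 1] and the composite is a convex combination of them, the averaged result also lies in [0, 1], which agrees with the range stated for SCORE. Comparing a data set with itself gives 1.0 under these assumptions.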

Description

Technical field

[0001] The invention relates to a method for evaluating the availability of a data set for frequent itemset mining, that is, how well the data set supports frequent itemset mining and analysis.

Background art

[0002] Frequent itemset mining and analysis has been studied extensively, but evaluating the availability of a data set for frequent itemset mining is still in its infancy. Existing evaluation indicators include precision and relative error (RE).
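As a rough illustration of the two indicators named above, the sketch below uses definitions that are common in the literature; the text here does not give formulas, so both definitions are assumptions. Precision is taken as the fraction of itemsets mined from a second data set that are also frequent in the reference data set, and RE as the median relative difference in support over the itemsets the two results share. The function names and the dictionary layout (itemset mapped to support) are hypothetical.

```python
from statistics import median


def precision(freq_ref, freq_other):
    """ASSUMED definition: share of itemsets in freq_other also in freq_ref."""
    if not freq_other:
        return 0.0
    return len(set(freq_ref) & set(freq_other)) / len(freq_other)


def relative_error(freq_ref, freq_other):
    """ASSUMED definition: median relative support difference on shared itemsets."""
    common = set(freq_ref) & set(freq_other)
    if not common:
        return None  # undefined when the two results share no itemset
    return median(abs(freq_other[s] - freq_ref[s]) / freq_ref[s] for s in common)


if __name__ == "__main__":
    ref = {frozenset("a"): 0.5, frozenset("b"): 0.5, frozenset("ab"): 0.5}
    other = {frozenset("a"): 0.25, frozenset("b"): 0.5, frozenset("c"): 0.5}
    print(precision(ref, other), relative_error(ref, other))
```

Note that precision looks only at which itemsets appear and RE looks only at their supports, so each indicator reports one aspect of similarity in isolation.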

[0003] However, the commonly used precision indicator is based mainly on the item similarity of frequent itemsets, while RE uses the median of the support similarity to represent the support similarity between frequent itemsets. The two indicators are largely independent of each other, and each is rather one-sided, whereas the similarity of frequent itemsets depends inseparably on both item similarity and support similarity. Using two e...
