Completely-weighted mode mining method for discovering association rules among texts

A fully weighted, rule-based technique in the data mining field, applied to feature word association pattern discovery in text mining and to query expansion in text information retrieval.

Inactive Publication Date: 2014-06-04
GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to address the deficiencies of the prior art by providing a fully weighted pattern mining method for discovering association rules between text words, to enrich the results of association rule mining based on item weights, and to solve the technical difficulties of mining fully weighted positive and negative association rules.


Examples


[0096] Example: Table 3 contains 5 items and 5 transaction records, where the item set is {i_1, i_2, i_3, i_4, i_5} = {Apple, Orange, Banana, Milk, Coca-cola}. As Table 3 shows, i_1 does not appear in transaction record T_3. Table 4 is an example of fully weighted item data, with the same number of items and transaction records as Table 3. In Table 4, item i_1 has weights 0.85, 0.93, 0.65, and 0.75 in transaction records T_1, T_2, T_3, and T_5 respectively; it does not appear in transaction record T_4, so its weight there is 0.
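The weight convention in this example (a stated weight where the item occurs, 0 where it does not) can be sketched in a few lines of Python. This is only an illustration: the names `I1_WEIGHTS`, `TRANSACTIONS`, and `weight_of_i1` are hypothetical, and only the weights of i_1 are taken from the text, because the other entries of Table 4 are not reproduced here.

```python
# Weights of item i_1 per transaction, as stated in the example for Table 4.
# Only i_1's values appear in the text, so other items are left out.
I1_WEIGHTS = {"T1": 0.85, "T2": 0.93, "T3": 0.65, "T5": 0.75}

# The five transaction records of the example, in order.
TRANSACTIONS = ["T1", "T2", "T3", "T4", "T5"]


def weight_of_i1(transaction_id: str) -> float:
    """Return the weight of i_1 in a transaction, or 0.0 if it does not appear."""
    return I1_WEIGHTS.get(transaction_id, 0.0)


# One row of a fully weighted table: a weight for every transaction.
row = [weight_of_i1(t) for t in TRANSACTIONS]
print(row)  # [0.85, 0.93, 0.65, 0.0, 0.75]
```

Storing only the non-zero weights and defaulting to 0 keeps the representation sparse, which may matter when most feature words are absent from most documents.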

[0097] Table 3: Item weighted data example. Table 4: Item fully weighted data example.

[0098]

[0099] 2. Basic concepts of fully weighted data mining

[0100] Let the fully weighted database be AWD = {T_1, T_2, ..., T_n}, where n is the number of transactions and T_i (1 ≤ i ≤ n) denotes the i-th transaction in AWD. The itemset I = {i_1, i_2, ..., i_m} is the collection of all items in AWD, where m is the number of items and i_j (1 ≤ j ≤ m) ind...
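As a rough illustration of these definitions, a fully weighted database can be modelled as a list of transactions, each mapping an item to its weight in that transaction. The sketch below makes several assumptions: the toy values are hypothetical except for the weights of `i1`, which follow the Table 4 example, and `weighted_support` is a simplified stand-in measure, not the patent's formula, since the definition is truncated in this excerpt.

```python
from typing import Dict, FrozenSet, Iterable, List

Transaction = Dict[str, float]  # item -> weight of that item in this transaction

# Hypothetical toy AWD. The weights of "i1" follow the Table 4 example;
# the weights of "i2" are invented purely for illustration.
AWD: List[Transaction] = [
    {"i1": 0.85, "i2": 0.40},  # T_1
    {"i1": 0.93},              # T_2
    {"i1": 0.65, "i2": 0.20},  # T_3
    {"i2": 0.50},              # T_4 (i1 absent, so its weight is 0)
    {"i1": 0.75},              # T_5
]


def all_items(awd: List[Transaction]) -> FrozenSet[str]:
    """The itemset I: every item that occurs in at least one transaction."""
    return frozenset(item for t in awd for item in t)


def weighted_support(awd: List[Transaction], itemset: Iterable[str]) -> float:
    """ASSUMED measure: weight carried by the itemset in the transactions that
    contain all of its items, divided by the total weight in the database."""
    items = set(itemset)
    total = sum(w for t in awd for w in t.values())
    covered = sum(sum(t[i] for i in items) for t in awd if items <= t.keys())
    return covered / total if total else 0.0


n = len(AWD)             # number of transactions
m = len(all_items(AWD))  # number of items
print(n, m, weighted_support(AWD, {"i1", "i2"}))
```

Unlike a frequency count, this kind of measure lets a transaction contribute more or less depending on how heavily the items are weighted in it.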

Abstract

The invention discloses a completely-weighted (fully weighted) pattern mining method for discovering association rules between text words. The fully weighted data to be processed are preprocessed, and a fully weighted database and an item library are established; fully weighted frequent itemsets and negative itemsets are mined, and interesting fully weighted frequent itemsets and interesting negative itemsets are obtained through pruning; effective fully weighted positive and negative association rules are then mined through a support-CPIR model-correlation-interestingness evaluation framework. The method overcomes defects of existing weighted mining techniques. Because it takes into account the characteristic of fully weighted data that item weights are objectively distributed in the database and vary with the transaction records, it yields more realistic and reasonable fully weighted positive and negative association patterns and avoids invalid and uninteresting ones. The numbers of candidate itemsets, frequent itemsets, negative itemsets, and positive and negative association rule patterns mined are smaller than those of the prior art. Mining efficiency is greatly improved, and the method has good extensibility.
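Two of the measures named in the evaluation framework, correlation and CPIR (conditional probability increment ratio), have commonly used definitions for ordinary probabilities, sketched below. This is an illustration under assumptions: the patent applies fully weighted counterparts with thresholds that are not shown in this excerpt, and the function names `correlation` and `cpir` are hypothetical.

```python
def correlation(p_xy: float, p_x: float, p_y: float) -> float:
    """Lift-style correlation of X and Y.

    A value above 1 suggests positive correlation, below 1 negative
    correlation, and exactly 1 independence.
    """
    if p_x <= 0 or p_y <= 0:
        raise ValueError("p_x and p_y must be positive")
    return p_xy / (p_x * p_y)


def cpir(p_y_given_x: float, p_y: float) -> float:
    """Conditional probability increment ratio of Y given X.

    Positive values mean X raises the probability of Y, negative values
    mean X lowers it; the result lies in [-1, 1].
    """
    if p_y_given_x >= p_y:
        if p_y == 1:
            raise ValueError("CPIR is undefined when p_y == 1")
        return (p_y_given_x - p_y) / (1.0 - p_y)
    # Here p_y > p_y_given_x >= 0, so p_y is strictly positive.
    return (p_y_given_x - p_y) / p_y


# Usage with made-up probabilities: P(X)=0.5, P(Y)=0.4, P(XY)=0.3.
p_x, p_y, p_xy = 0.5, 0.4, 0.3
print(correlation(p_xy, p_x, p_y), cpir(p_xy / p_x, p_y))
```

A rule X → Y would typically be kept as a positive rule only when both measures indicate positive dependence and exceed chosen thresholds; negative rules such as X → ¬Y are evaluated analogously using the complement probabilities.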

Description

Technical field

[0001] The invention belongs to the field of data mining, and in particular relates to a fully weighted positive and negative pattern mining method for discovering association rules between text words. It is applicable to fields such as feature word association pattern discovery in text mining and query expansion in text information retrieval.

Background art

[0002] Over the past 20 years, association rule mining has attracted great interest from many scholars and has become one of the hot spots of data mining research. The research focuses mainly on two aspects: mining based on item frequency and mining based on item weight.

[0003] The main feature of positive and negative association pattern mining based on item frequency is that it treats all items in the database equally and uses the probability of an itemset appearing in the database as its support when mining association patterns. The defect of association rule mining based on item frequency is: only p...

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/30, G06F40/20
Inventors: 黄名选, 元昌安
Owner: GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS