Method of item-all-weighted positive or negative association model mining between text terms and mining system applied to method

A fully weighted pattern-mining technique, applied in fields such as computing and electrical digital data processing, that addresses problems such as invalid and redundant association patterns.

Status: Inactive; Publication Date: 2014-07-30
GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS
Cites: 2; Cited by: 20

AI Technical Summary

Problems solved by technology

Traditional unweighted association pattern mining does not take item weights into account, which often yields a large number of redundant, uninteresting, and invalid association patterns.
Weighted positive and negative association rule mining does consider the differing importance of items, but it ignores the fact that the item weigh...

Examples

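[0132] Example:
$\mathrm{awAPInt}(i_1, i_2) = |(5 \times 1.47 - 3.18 \times 0.61) / (5 \times 1.47 + 3.18 \times 0.61)| = |5.41 / 9.29| = 0.58$,
$\mathrm{awAPInt}(i_1, \neg i_2) = |5.41 / (9.29 - 2 \times 5 \times 3.18)| = 0.24$,
$\mathrm{awAPInt}(\neg i_1, i_2) = |5.41 / (9.29 - 2 \times 5 \times 0.61)| = 1.69$,
$\mathrm{awAPInt}(\neg i_1, \neg i_2) = |5.41 / (9.29 + 2 \times 5 \times (5 - 3.18 - 0.61))| = 0.25$.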
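The arithmetic in paragraph [0132] can be re-checked with a short Python sketch. It only re-evaluates the four expressions exactly as printed. The general definition of awAPInt is not part of this excerpt, so the names `n`, `w12`, `w1`, and `w2` are hypothetical labels for the printed constants, and treating every 5 in the example as the same quantity is an assumption.

```python
# Constants copied from the worked example in [0132].
# The names are hypothetical labels; the excerpt does not define them.
n, w12, w1, w2 = 5, 1.47, 3.18, 0.61

num = n * w12 - w1 * w2    # shared numerator, printed as 5.41
base = n * w12 + w1 * w2   # shared denominator term, printed as 9.29

# Each entry mirrors one printed expression; "not" marks a negated item.
aw_ap_int = {
    "(i1, i2)": abs(num / base),
    "(i1, not i2)": abs(num / (base - 2 * n * w1)),
    "(not i1, i2)": abs(num / (base - 2 * n * w2)),
    "(not i1, not i2)": abs(num / (base + 2 * n * (n - w1 - w2))),
}

for pair, value in aw_ap_int.items():
    print(pair, round(value, 3))
```

The first, second, and fourth results round to the printed 0.58, 0.24, and 0.25. The third evaluates to about 1.696, which the text prints as 1.69.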

[0133] Definition 5

[0134] All-weighted mutual information of an itemset (all-weighted Mutual Information of Itemset, abbreviated awMI): mutual information is a common tool for model analysis in computational linguistics. It measures the degree of correlation between two objects x and y and is defined as the logarithm of the ratio of the posterior probability p(x|y) of x to the prior probability p(x) (Fu Zuyun. Basic Theory and Application of Information Theory (Third Edition). Electronic Industry Press, 2011.2, ISBN 9787121129001). If the value of mutual information is greater than 0, x and y are positively correlated; if the value is less than 0, it is negativ...
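The ordinary, unweighted quantity described in [0134], the logarithm of p(x|y) / p(x), can be sketched in Python as follows. This is not the awMI measure itself, whose definition is cut off in this excerpt. The function name `pointwise_mi`, the example probabilities, and the use of the natural logarithm are assumptions; the sign of the result does not depend on the logarithm base.

```python
import math

def pointwise_mi(p_xy, p_x, p_y):
    """Return log(p(x|y) / p(x)), where p(x|y) = p(x, y) / p(y)."""
    p_x_given_y = p_xy / p_y
    return math.log(p_x_given_y / p_x)

# Hypothetical probabilities, with p(x) = p(y) = 0.5 in every case.
print(pointwise_mi(0.40, 0.5, 0.5))  # x and y co-occur more often than chance
print(pointwise_mi(0.25, 0.5, 0.5))  # x and y are independent
print(pointwise_mi(0.10, 0.5, 0.5))  # x and y co-occur less often than chance
```

The three calls give a positive value, zero, and a negative value respectively, matching the sign interpretation in the paragraph.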

Abstract

The invention discloses a method for mining item-all-weighted positive and negative association patterns between text terms, and a mining system that applies the method. The method comprises the following steps: (1) a Chinese text preprocessing module preprocesses the texts to build a text database and a feature word item library; (2) a feature word frequent itemset and negative itemset mining module mines item-all-weighted candidate feature word itemsets from the text database, calculates a weight dimension ratio, and prunes uninteresting itemsets with a multi-interestingness threshold pruning strategy, yielding interesting item-all-weighted feature word frequent itemsets and negative itemsets; (3) an item-all-weighted positive and negative association rule mining module mines valid item-all-weighted positive and negative association rules from the frequent itemsets and negative itemsets; and (4) an item-all-weighted association pattern result display module outputs the mined positive and negative association rules to the user. Applying the method and the system greatly reduces the number of unnecessary frequent itemsets, negative itemsets, and association rules, improves the efficiency of Chinese feature word association rule mining, and yields high-quality association patterns between Chinese terms.

Description

Technical field

[0001] The invention belongs to the field of data mining. Specifically, it is a method, and a corresponding mining system, for mining fully weighted positive and negative association patterns between text terms based on the weight dimension ratio. It is applicable to fields such as feature word association pattern discovery in text mining and query expansion in text information retrieval.

Background

[0002] Over the past 20 years, research on association pattern mining has made remarkable progress and has gone through three stages: unweighted item mining, weighted item mining, and fully weighted item mining.

[0003] Stage 1: mining unweighted positive and negative association patterns

[0004] The main feature of unweighted positive and negative association pattern mining is that the probability of the itemset appearing in the database is the support degree...
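The unweighted notion of support mentioned in [0004], the probability that an itemset appears in the database, can be illustrated with a minimal Python sketch. The toy documents and the function name `support` are hypothetical, and the sketch covers only the classical unweighted case, not the fully weighted measure of the invention.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)

# Hypothetical text database: each document is reduced to its set of feature words.
docs = [
    {"data", "mining", "text"},
    {"data", "mining"},
    {"text", "retrieval"},
    {"data", "retrieval", "query"},
]

print(support({"data"}, docs))            # "data" appears in 3 of 4 documents
print(support({"data", "mining"}, docs))  # both words appear together in 2 of 4
```

Because every occurrence counts equally here, a word that is central to a document and one that appears only in passing contribute the same amount, which is the limitation that the weighted stages described in [0002] respond to.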

Application Information

IPC(8): G06F17/30
CPC: G06F16/313, G06F16/3335
Inventors: 黄名选, 夏冰
Owner: GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS