
Method and system for mining association patterns of Chinese feature words based on dynamic item weights

A technology based on feature words and dynamic item weights, applied in the field of data mining, that addresses two problems of existing methods: they do not consider changes in item weight, and they produce a large and growing number of association patterns.

Status: Inactive. Publication Date: 2017-10-24
GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS
Cites: 3; Cited by: 0

AI Technical Summary

Problems solved by technology

Mining methods based on item frequency consider only the frequency of items and ignore item weights, which increases the number of redundant, invalid and uninteresting association patterns.
Mining methods based on fixed item weights do not consider that item weights change with the transaction records, so they cannot solve data mining problems in which item weights vary.
Existing mining methods based on dynamic item weights still mine a huge number of association patterns, many of them uninteresting, false or invalid, which makes it harder for users to choose the patterns they want.



Examples


[0088] Example: an instance of a matrix-weighted Chinese text database with 5 Chinese document records and 5 feature word items with their weights. The document collection is {d_1, d_2, d_3, d_4, d_5}, and the set of feature words is {i_1, i_2, i_3, i_4, i_5} = {program, queue, function, environment, member}.

[0089]

[0090] The invention mines matrix-weighted Chinese feature word association patterns from this Chinese document data instance as follows (ms = 0.1, mc = 0.55):

[0091] 1. Find the sum of the weights of all Chinese feature words in the document database: w = 8.18.

[0092] 2. Mine the matrix-weighted frequent 1-itemsets L_1 of Chinese feature words, as shown in Table 1.

[0093] Table 1:

[0094]

[0095] As can be seen from Table 1, the weight of the 1-itemset (i_2) is less than mw(C_1), so this itemset is an infrequent itemset. The other itemset weights are greater than mw(C_1), so they are all frequen...
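Steps 1 and 2 above can be sketched in Python. The weight matrix of paragraph [0089] and Table 1 are not reproduced in this text, so the weights below are hypothetical placeholders (their total is not the 8.18 of the example). The definition of mw(C_1) is also not given here, so the threshold is passed in as a plain parameter rather than derived from ms. All function names are illustrative and are not taken from the patent.

```python
from typing import Dict, List

# Hypothetical matrix-weighted database: one dict per document, mapping a
# feature word to its weight in that document. Placeholder values only,
# NOT the data of the patent's example.
DOCS: List[Dict[str, float]] = [
    {"program": 0.5, "function": 0.25},
    {"program": 0.25, "queue": 0.125, "member": 0.5},
    {"function": 0.5, "environment": 0.25},
    {"program": 0.75, "environment": 0.5, "member": 0.25},
    {"function": 0.25, "member": 0.5},
]


def total_weight(docs: List[Dict[str, float]]) -> float:
    """Step 1: sum the weights of all feature words over all documents."""
    return sum(w for doc in docs for w in doc.values())


def item_weights(docs: List[Dict[str, float]]) -> Dict[str, float]:
    """Accumulate, for each feature word, its weight over all documents."""
    totals: Dict[str, float] = {}
    for doc in docs:
        for word, w in doc.items():
            totals[word] = totals.get(word, 0.0) + w
    return totals


def frequent_1_itemsets(docs: List[Dict[str, float]],
                        threshold: float) -> Dict[str, float]:
    """Step 2: keep the 1-itemsets whose weight reaches the threshold.

    Assumption: a weight exactly equal to the threshold counts as
    frequent; the source only says frequent itemsets are "greater than"
    mw(C_1).
    """
    return {word: w for word, w in item_weights(docs).items()
            if w >= threshold}


if __name__ == "__main__":
    print(total_weight(DOCS))
    # 0.5 is a hypothetical threshold standing in for mw(C_1).
    print(sorted(frequent_1_itemsets(DOCS, threshold=0.5)))
```

With these placeholder weights and the hypothetical threshold of 0.5, only "queue" falls below the threshold, which mirrors the role of i_2 in the example without reproducing its actual numbers.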



Abstract

The invention discloses a method and system for mining Chinese feature word association patterns based on dynamic item weights. The method comprises the following steps: a Chinese text preprocessing module preprocesses the documents and constructs a Chinese text database and a feature word item library; a Chinese feature word candidate itemset generation and pruning module generates matrix-weighted feature word candidate itemsets and prunes them with a new matrix-weighted itemset pruning method to obtain the final candidate itemsets; a Chinese feature word frequent itemset generation module calculates itemset weights to obtain the feature word frequent itemsets; and a Chinese feature word association pattern generation and result display module generates all proper subsets of the itemset, mines valid association patterns through simple calculation and comparison of itemset weights, and displays them to the user. The invention shows good pruning performance: the number of candidate itemsets and the mining time are markedly reduced, and mining efficiency is greatly improved. Applied to the field of information retrieval, the mined patterns can improve query performance.
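The last step of the abstract, generating all proper subsets of an itemset and keeping the patterns that pass a weight comparison, can be sketched as follows. This is a minimal illustration, not the patent's procedure: the abstract does not give the scoring formula, so `score` is a caller-supplied hypothetical function, and `min_score` only plays the role that a threshold such as mc might play.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, Iterator, List, Tuple

Pattern = Tuple[FrozenSet[str], FrozenSet[str]]


def proper_subsets(itemset: Iterable[str]) -> Iterator[FrozenSet[str]]:
    """Yield every non-empty proper subset of the itemset."""
    items = sorted(set(itemset))
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)


def association_patterns(
    itemset: Iterable[str],
    score: Callable[[FrozenSet[str], FrozenSet[str]], float],
    min_score: float,
) -> List[Pattern]:
    """Split the itemset into antecedent -> consequent pairs and keep
    those whose score reaches min_score.

    `score` is a placeholder for the itemset-weight comparison mentioned
    in the abstract; its actual definition is not given in this text.
    """
    full = frozenset(itemset)
    kept: List[Pattern] = []
    for antecedent in proper_subsets(full):
        consequent = full - antecedent
        if score(antecedent, consequent) >= min_score:
            kept.append((antecedent, consequent))
    return kept


if __name__ == "__main__":
    # Dummy score that favors single-word antecedents (illustration only).
    demo = association_patterns(
        {"program", "function", "member"},
        score=lambda a, c: 1.0 if len(a) == 1 else 0.0,
        min_score=0.55,
    )
    for antecedent, consequent in demo:
        print(sorted(antecedent), "->", sorted(consequent))
```

Passing the score as a function keeps the enumeration separate from whatever weight-based measure is actually used.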

Description

Technical field

[0001] The invention belongs to the field of data mining. Specifically, it is a matrix-weighted Chinese feature word association pattern mining method based on dynamic item weights, together with its mining system. It is applicable to feature word association pattern discovery in Chinese text mining, query expansion in Chinese text information retrieval, and cross-language text information retrieval. The feature word association patterns it mines can serve as a source of high-quality expansion terms and be applied in web search engines, which helps improve their query performance.

Background

[0002] At present, mining methods based on item frequency and mining methods based on fixed item weights have been widely studied and applied, while mining methods based on dynamic item weights are rarely reported. The mining method based on dynamic item weight has important application value and broad application prospect in text...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/3344, G06F2216/03
Inventor: 黄名选
Owner: GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS