Chinese words matrix weighted association rules mining method based on item frequency and weight

A matrix-weighting technique for Chinese words, applied in data mining, special data processing applications, knowledge representation, and related areas. It addresses the problems that existing methods ignore the importance (weight) of association patterns and cannot handle item weights that change across transaction records, and it thereby improves retrieval performance and offers good application value.

Inactive Publication Date: 2018-08-17
GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS

AI Technical Summary

Problems solved by technology

The first type, the unweighted support calculation, considers only the frequency of an association pattern and ignores the importance of the pattern in the transaction database (i.e., the weight of the association pattern).
The second type calculates association pattern support with fixed item weights. It takes the product of the sum of the itemset's item weights and the unweighted support of the association pattern as the weighted itemset support (C.H. Cai, Ada W.C. Fu, et al. Mining Association Rules with Weighted Items [C] // Proceedings of IEEE International Database Engineering and Applications Symposium, 1998: 68-77). This method overcomes the defect of the first type by taking item weights into account, but the item weights stay fixed during mining, so it cannot handle item weights that change from one transaction record to another.
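
The two existing support calculations described above can be sketched as follows. This is a minimal illustration, not code from the patent or from the cited papers; the function names, the toy transactions, and the weight values are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, List


def unweighted_support(itemset: FrozenSet[str],
                       transactions: List[FrozenSet[str]]) -> float:
    """First type: fraction of transactions containing every item of `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def fixed_weight_support(itemset: FrozenSet[str],
                         transactions: List[FrozenSet[str]],
                         item_weights: Dict[str, float]) -> float:
    """Second type: (sum of the itemset's fixed item weights) x (unweighted support)."""
    weight_sum = sum(item_weights[item] for item in itemset)
    return weight_sum * unweighted_support(itemset, transactions)


# Toy data (hypothetical): four transactions and one fixed weight per item.
transactions = [
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"a", "b", "c"}),
    frozenset({"b"}),
]
item_weights = {"a": 0.5, "b": 0.25, "c": 0.25}

pattern = frozenset({"a", "b"})
plain = unweighted_support(pattern, transactions)
weighted = fixed_weight_support(pattern, transactions, item_weights)
print(plain, weighted)
```

Note that `item_weights` holds a single value per item for the whole database, which is exactly the limitation described above: an item cannot carry a different weight in different transaction records.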

Examples

Embodiment Construction

[0039] To better illustrate the technical solution of the present invention, specific embodiments are described in detail below with reference to the accompanying drawings; this does not limit the scope of protection of the claims of the present invention.

[0040] As shown in Figure 1, the Chinese inter-word matrix-weighted association rule mining method based on item frequency and weight includes the following steps:

[0041] 1. Preprocess the Chinese documents to be mined: remove Chinese stop words, extract feature words, calculate their weights, and build a Chinese feature word library and a Chinese document index library.

[0042] The feature word weight indicates the importance of a Chinese feature word to the Chinese document in which it appears. The classic, widely used tf-idf method is used to calculate the feature word weight. The calculation formula is:

[0043]

[0044] In f...
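
The formula in [0043] and its explanation are not reproduced in this excerpt, so the preprocessing step can only be sketched under assumptions. The example below uses one common tf-idf variant (raw term frequency times the natural log of N/df); the patent's exact formula may differ. The function name, the stop word set, and the toy documents are hypothetical, and the documents are assumed to be already segmented into words.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def tf_idf_weights(docs: List[List[str]],
                   stop_words: Set[str]) -> List[Dict[str, float]]:
    """Return, per document, a mapping feature word -> tf-idf weight.

    Assumed variant: weight = tf * ln(N / df). This is a common textbook
    form, not necessarily the formula used in the patent.
    """
    # Remove stop words; what remains are the candidate feature words.
    filtered = [[w for w in doc if w not in stop_words] for doc in docs]
    n_docs = len(filtered)

    # Document frequency: number of documents in which each word occurs.
    df: Counter = Counter()
    for doc in filtered:
        df.update(set(doc))

    weights = []
    for doc in filtered:
        tf = Counter(doc)
        weights.append({w: tf[w] * math.log(n_docs / df[w]) for w in tf})
    return weights


# Toy, pre-segmented documents (hypothetical).
docs = [
    ["数据", "挖掘", "的", "数据"],
    ["关联", "规则", "的", "挖掘"],
    ["数据", "关联"],
]
stop_words = {"的"}

doc_index = tf_idf_weights(docs, stop_words)       # toy stand-in for a document index library
feature_words = sorted({w for d in doc_index for w in d})  # toy stand-in for a feature word library
print(feature_words)
print(doc_index[0])
```

The result is a per-document weight table, so the same feature word can carry a different weight in each document.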

Abstract

The invention discloses a Chinese inter-word matrix-weighted association rule mining method based on item frequency and weight. The method includes the following steps: preprocessing the Chinese documents to be mined, that is, removing Chinese stop words, extracting feature words, and calculating feature word weights, and constructing a Chinese feature word library and a Chinese document index library; mining matrix-weighted frequent itemsets of Chinese feature words by means of a matrix-weighted support calculation based on item frequency and weight, to obtain a set of such frequent itemsets; and mining matrix-weighted association rule patterns of Chinese feature words from these frequent itemsets by means of a confidence-interest evaluation framework. The method fully considers both the occurrence frequency and the weight of feature words in the documents, and can mine matrix-weighted association rule patterns that represent the association relations between feature words more practically and more reasonably. Applied to information retrieval and query expansion, these patterns can improve information retrieval performance.
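
To illustrate what "matrix-weighted" means here, the sketch below stores a separate weight for each feature word in each document and computes a support value from those weights. The support measure shown (the share of the database's total weight carried by the itemset in the documents that contain all of its words) is a hypothetical stand-in chosen for illustration; it is not the patent's item-frequency-and-weight formula, which this excerpt does not give. The confidence-interest evaluation is likewise not shown because its definitions are not given here. All names and values are assumptions.

```python
from typing import Dict, FrozenSet, List

# One row per document: feature word -> weight in that document (hypothetical values).
WeightMatrix = List[Dict[str, float]]


def weight_share_support(itemset: FrozenSet[str], docs: WeightMatrix) -> float:
    """Hypothetical matrix-weighted support, NOT the patent's formula.

    Numerator: weights of the itemset's words, summed over the documents
    that contain every word of the itemset.
    Denominator: the sum of all weights in the database.
    """
    total = sum(sum(row.values()) for row in docs)
    if total == 0 or not itemset:
        return 0.0
    covered = sum(
        sum(row[w] for w in itemset)
        for row in docs
        if all(w in row for w in itemset)
    )
    return covered / total


docs_matrix: WeightMatrix = [
    {"数据": 0.5, "挖掘": 0.25},
    {"数据": 0.25, "关联": 0.5},
    {"挖掘": 0.5, "关联": 0.5},
]

support = weight_share_support(frozenset({"数据", "挖掘"}), docs_matrix)
print(support)
```

Unlike a fixed-weight scheme, the word "数据" contributes 0.5 in the first document and 0.25 in the second, so both how often a word occurs and how heavily it is weighted in each document affect the result.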

Description

Technical field

[0001] The invention belongs to the field of Chinese text mining, and in particular relates to a Chinese inter-word matrix-weighted association rule mining method based on item frequency and weight.

Background

[0002] In the study of association pattern mining, the core problem is the calculation of the support of association patterns. Current research mainly uses three types of support calculation methods. The first type is the unweighted support calculation method (see R. Agrawal, T. Imielinski, A. Swami. Mining association rules between sets of items in large databases [C]. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington D.C., 1993, (5): 207-216). This is an early, classic support calculation method, which takes the probability that an association pattern occurs in the transaction database as the support of the pattern. This met...

Application Information

IPC(8): G06N5/02, G06F17/30
CPC: G06F2216/03, G06N5/025, G06F16/316, G06F16/3334, G06F16/3335
Inventor 黄名选
Owner GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS