Method and device for item-sets mining

An itemset and candidate-itemset mining technique, applied in the fields of instruments, computing, and electrical digital data processing. It addresses the low practical value of mined itemsets, which arises because transaction occurrence probabilities differ widely and high-utility itemsets may occur with low probability, and it yields itemsets with high practical value.

Active Publication Date: 2016-10-19
HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL +1

AI Technical Summary

Problems solved by technology

[0004] In the process of implementing the embodiments of the present invention, the inventors found that the above-mentioned technology has at least the following problem: in practice, the data stored in a database is often uncertain, that is, each transaction in the database has a probability of occurrence, and these probabilities differ widely from one transaction to another. However, the existing HUIM-based algorithm does not consider the occurrence probability of transactions, so it readily mines itemsets with high utility value but low occurrence probability, which gives the mined itemsets low practical value.

Embodiment Construction

[0026] To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

[0027] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, fall within the protection scope of the present invention.

[0028] To facilitate the description of the embodiments of the present invention, the basic concepts involved are introduced first:

[0...

Abstract

The invention discloses a method and a device for itemset mining and belongs to the field of data mining. The method comprises: obtaining a user-defined minimum expected support μ and a minimum utility ratio ε; calculating the actual expected support expSup and the actual utility value u of an itemset in an uncertain database D, the itemset including at least one item; and, when expSup ≥ |D| × μ and u ≥ TU × ε, determining the itemset to be a high-probability, high-utility itemset, where TU is the total utility, that is, the sum of the utilities of all items in the uncertain database D, and |D| is the total number of transactions in D. Because the mined itemsets have both a high utility value and a high occurrence probability, they have relatively high practical value.
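The two thresholds in the abstract can be illustrated with a small brute-force sketch. The abstract does not say how expSup and u are computed, so the sketch makes several assumptions: each transaction carries a single occurrence probability, expSup of an itemset is the sum of the probabilities of the transactions that contain it, and the utility of an item in a transaction is its quantity times a unit profit. The toy database `D`, the `PROFIT` table, and all function names are hypothetical, and the exhaustive enumeration is only for illustration, not the patent's mining procedure.

```python
from itertools import combinations

# Hypothetical uncertain database: each transaction has an occurrence
# probability and a quantity for every item it contains (assumed layout).
D = [
    {"prob": 0.9, "items": {"a": 2, "b": 1}},
    {"prob": 0.8, "items": {"a": 1, "c": 3}},
    {"prob": 0.2, "items": {"b": 4, "c": 1}},
    {"prob": 0.7, "items": {"a": 1, "b": 2, "c": 1}},
]
# Hypothetical unit profit per item (assumed utility model: quantity * profit).
PROFIT = {"a": 5, "b": 2, "c": 1}


def total_utility(db):
    """TU: sum of the utilities of all items in all transactions."""
    return sum(q * PROFIT[i] for tx in db for i, q in tx["items"].items())


def exp_sup(itemset, db):
    """Assumed expected support: summed probability of containing transactions."""
    return sum(tx["prob"] for tx in db if itemset <= tx["items"].keys())


def utility(itemset, db):
    """Utility of the itemset, summed over the transactions that contain it."""
    return sum(
        tx["items"][i] * PROFIT[i]
        for tx in db
        if itemset <= tx["items"].keys()
        for i in itemset
    )


def mine(db, mu, eps):
    """Return {itemset: (expSup, u)} for itemsets meeting both thresholds."""
    items = sorted({i for tx in db for i in tx["items"]})
    min_sup = len(db) * mu            # |D| x mu
    min_util = total_utility(db) * eps  # TU x epsilon
    found = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s, u = exp_sup(itemset, db), utility(itemset, db)
            if s >= min_sup and u >= min_util:
                found[itemset] = (s, u)
    return found


if __name__ == "__main__":
    for itemset, (s, u) in mine(D, mu=0.3, eps=0.25).items():
        print(sorted(itemset), round(s, 2), u)
```

In this toy data, the itemset {b, c} has a utility of 14, which clears the utility threshold of 9.75, but its expected support of 0.9 falls below the required 1.2, so it is rejected. This is the kind of high-utility, low-probability itemset that the abstract aims to filter out.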

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of data mining, and in particular to an itemset mining method and device.

Background

[0002] A database usually includes at least one transaction, and each transaction includes at least one data item (item). For example, a transaction for a personal record includes data items such as name, date of birth, gender, and blood type.

[0003] To discover association rules between different data items, target itemsets need to be mined. An itemset is a collection of at least one data item and represents an inherent association rule in the database. High-utility itemset mining (HUIM) is a common data mining method used to mine, from a database, itemsets with high utility values composed of different data items. In the existing HUIM-based algorithm, by calculat...

Application Information

IPC(8): G06F17/30
Inventors: 林浚玮, 赖晓平, 李勇, 王巨宏, 甘文生
Owner: HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL