
Method, system and database terminal for mining approximate dictionary rules in database

A database rule-mining technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc., that addresses the low efficiency and poor performance of earlier mining methods and their inability to meet the needs of big data analysis and processing.

Active Publication Date: 2016-11-16
SHENZHEN AUDAQUE DATA TECH
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0006] One object of the present invention is to provide a method for mining approximate dictionary rules in a database, aiming to solve the poor performance and low efficiency of previous mining methods, which cannot meet the needs of big data analysis and processing.



Examples


Detailed Description of Embodiments

[0097] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0098] Related concepts

[0099] Consider a database r. Define the set of all columns in r as R; the distinct values in each column are called items, and the set of all items is defined as the item set I. Each row of r is called a transaction t.

[0100] (1) Support: For a given itemset X ⊆ I, define its support supp(X) as the number of transactions in r that contain X, which satisfies 0 <= supp(X) <= |r|, where |r| is the number of transactions.

[0101] (2) Superset, subset: For two itemsets X and Y, if they satisfy X ⊆ Y, then Y is said to be a superset of X and X a subset of Y, and there is supp(Y)<=su...
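The support and subset definitions above can be illustrated with a minimal Python sketch. The toy rows, the column names `city` and `zip`, and the choice to represent an item as a `(column, value)` pair are assumptions made for illustration; they do not come from the patent text. The sketch also shows the standard property that a superset never has higher support than its subset.

```python
from typing import FrozenSet, Hashable, List, Tuple

Item = Tuple[str, Hashable]  # assumed representation: (column name, value)


def supp(itemset: FrozenSet[Item], transactions: List[FrozenSet[Item]]) -> int:
    """Return the number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


# Hypothetical toy database r with two columns.
rows = [
    {"city": "Shenzhen", "zip": "518000"},
    {"city": "Shenzhen", "zip": "518000"},
    {"city": "Shenzhen", "zip": "518001"},
    {"city": "Beijing", "zip": "100000"},
]
# Each row becomes one transaction: a set of (column, value) items.
transactions = [frozenset(row.items()) for row in rows]

X = frozenset({("city", "Shenzhen")})
Y = X | {("zip", "518000")}  # Y is a superset of X

supp_x = supp(X, transactions)
supp_y = supp(Y, transactions)
print(supp_x, supp_y)  # prints "3 2"
```

Here `supp_y <= supp_x` holds because every transaction that contains Y necessarily contains X.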


Abstract

The invention is applicable to the field of approximate dictionary rule mining and provides a method, a system and a database terminal for mining approximate dictionary rules in a database. The mining method comprises the following steps: scanning and analyzing a database r, eliminating single-valued columns and columns in which all values are unique, and marking the remaining candidate column set as R; counting the support of each item of each column in the candidate column set R, and numbering with an integer each item whose support is greater than the given maximum support degree; sequentially numbering each row (transaction) of the database r, recording in a list the row numbers of the transactions that contain each item, and caching the lists; mining the approximate dictionary rules of the database r with the DCfd method; and outputting the approximate dictionary rules. Because the DCfd method uses a reverse incremental search strategy, prunes the search tree, and caches the rules already found, the amount of computation of the whole mining method is reduced, and the approximate dictionary rules in the database can be found automatically and efficiently.
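The preprocessing steps listed in the abstract (column filtering, support counting, item numbering, and per-item row lists) might be sketched as follows. This is only an illustration under stated assumptions: the function name `preprocess`, the `threshold` parameter (standing in for the abstract's "given" support degree), the `(column, value)` item representation, and the sample rows are all hypothetical. The DCfd mining step itself is not shown, because this excerpt does not describe it.

```python
from collections import Counter, defaultdict


def preprocess(rows, columns, threshold):
    """Hypothetical sketch of the preprocessing steps named in the abstract."""
    n = len(rows)

    # Step 1: drop single-valued columns and columns whose values are all unique.
    candidates = []
    for col in columns:
        distinct = {row[col] for row in rows}
        if 1 < len(distinct) < n:
            candidates.append(col)

    # Step 2: count the support of each (column, value) item in the candidate
    # columns, then give an integer id to every item above the threshold.
    support = Counter((col, row[col]) for row in rows for col in candidates)
    item_ids = {}
    for item, count in support.items():
        if count > threshold:
            item_ids[item] = len(item_ids)

    # Step 3: number the rows and record, per numbered item, the rows containing it.
    row_lists = defaultdict(list)
    for row_id, row in enumerate(rows):
        for col in candidates:
            item = (col, row[col])
            if item in item_ids:
                row_lists[item_ids[item]].append(row_id)

    return candidates, item_ids, dict(row_lists)


# Hypothetical sample data: "id" is all-unique and "country" is single-valued.
rows = [
    {"id": 1, "country": "CN", "city": "Shenzhen", "zip": "518000"},
    {"id": 2, "country": "CN", "city": "Shenzhen", "zip": "518000"},
    {"id": 3, "country": "CN", "city": "Shenzhen", "zip": "518001"},
    {"id": 4, "country": "CN", "city": "Beijing", "zip": "100000"},
]
candidates, item_ids, row_lists = preprocess(
    rows, ["id", "country", "city", "zip"], threshold=1
)
print(candidates)  # prints "['city', 'zip']"
print(row_lists)   # prints "{0: [0, 1, 2], 1: [0, 1]}"
```

Keeping a row list per item means the support of a combined itemset can later be obtained by intersecting lists instead of rescanning the database, which is one plausible reason for the caching step the abstract mentions.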

Description

Technical field

[0001] The invention relates to the field of mining approximate dictionary rules, in particular to a method, a system and a database terminal for mining approximate dictionary rules in a database.

Background

[0002] With the rapid development of the Internet and the improvement of informatization in various fields of society, the amount of data is exploding at an unprecedented rate, and human beings are entering the era of big data. The era of big data is characterized by a larger amount of data, more complex data sources, faster data updates, and uneven data quality. It is almost impossible to manage data quality by manual means alone. The field of data management is undergoing major changes and breakthroughs. The technologies that have been commercialized basically remain at the second-generation stage of data quality management, which is manual and based on experience; an automated management system has not yet appeared. An important part of the automated management system is the autom...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
Inventors: 王明兴, 贾西贝
Owner: SHENZHEN AUDAQUE DATA TECH