Denormalization strategy selection method based on frequent item set mining algorithm

A frequent itemset mining technique, applied in computing, structured data retrieval, and special-purpose data processing applications, that addresses performance bottlenecks.

Active Publication Date: 2014-05-28

UNIV OF ELECTRONIC SCI & TECH OF CHINA

Cites: 3. Cited by: 21.

Problems solved by technology

[0018] Researchers abroad have studied denormalization in some depth, but the work is still at a developmental stage.

Examples

example 1

[0065] Example 1: A denormalization strategy selection method based on a frequent itemset mining algorithm, comprising the following steps:

[0066] 1-(a). Obtain the database log file: obtain the database log file to be analyzed.

[0067] 1-(b). Parse the log: analyze the SELECT statements in the log and extract the table names and field names involved as transaction items; then obtain the transaction records that involve cross-table queries, or only the transaction records of single-table queries.
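The log-parsing step can be illustrated with a minimal Python sketch. It assumes that each log line holds one SQL statement and that columns are written in fully qualified `table.field` form; the regular expression, the function names, and the sample log lines are hypothetical and are not taken from the patent. Table aliases and string literals are not handled.

```python
import re

# Matches qualified column references such as "course.course_id".
# Assumption: columns in the log are always written as table.field.
QUALIFIED_COLUMN = re.compile(r"\b([A-Za-z_]\w*)\.([A-Za-z_]\w*)\b")


def select_to_transaction(statement):
    """Return the set of table.field items referenced by one SELECT statement."""
    if not statement.lstrip().upper().startswith("SELECT"):
        return frozenset()  # non-SELECT statements yield no transaction
    return frozenset(f"{t}.{c}" for t, c in QUALIFIED_COLUMN.findall(statement))


def split_by_table_count(transactions):
    """Separate cross-table transactions from single-table ones."""
    cross, single = [], []
    for items in transactions:
        tables = {item.split(".")[0] for item in items}
        (cross if len(tables) > 1 else single).append(items)
    return cross, single


# Hypothetical log lines, for illustration only.
log_lines = [
    "SELECT course.course_id FROM course, academy "
    "WHERE course.academy_id = academy.academy_id",
    "SELECT teacher.teacher_id FROM teacher",
    "UPDATE course SET course.name = 'x'",
]
transactions = [t for t in map(select_to_transaction, log_lines) if t]
cross_table, single_table = split_by_table_count(transactions)
```

A production parser would more likely rely on a real SQL grammar than on a regular expression; the sketch only shows how a statement becomes a transaction of items.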

[0068] 1-(c). Data mining: this step performs frequent pattern mining based on a simplified prefix tree and comprises three parts in sequence:

[0069] (c-1). Build the FP-tree: read the transaction record set and build a frequent pattern tree (FP-tree) based on the preset empirical support value; the persistence threshold is determined by analyzing a large number of denormalization examples, is ...
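For orientation, here is a minimal sketch of how a textbook FP-tree is built from a transaction set and a support threshold. It is not the patent's simplified prefix tree, whose details are cut off above; the class `FPNode`, the function `build_fp_tree`, and the sample transactions are assumptions made for illustration.

```python
from collections import Counter


class FPNode:
    """One node of a standard FP-tree: an item, a count, and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a standard FP-tree; return (root, support counts of frequent items)."""
    # First pass: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode(None)
    # Second pass: insert each transaction as a path of its frequent items,
    # ordered by descending support (ties broken by item name).
    for t in transactions:
        path = sorted((i for i in set(t) if i in frequent),
                      key=lambda i: (-frequent[i], i))
        node = root
        for item in path:
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root, frequent


# Hypothetical transactions, for illustration only.
sample = [
    {"course.academy_id", "academy.academy_id"},
    {"course.academy_id", "course.course_id"},
    {"course.academy_id", "academy.academy_id", "course.course_id"},
    {"teacher.teacher_id"},
]
root, frequent = build_fp_tree(sample, min_support=2)
```

Because every path starts with the most frequent items, transactions that share a prefix share nodes, which is what keeps the tree compact.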

example 2

[0093] Obtain the database log file to be analyzed: Assume a simplified test data set TestSet, as shown in Table 1.

[0094] The default support count threshold is 3.

[0095]

[0096] Table 1: Test dataset TestSet

[0097] Analyze the SELECT statements in the log and extract the table names and field names involved as transaction items: Table 2 shows the frequent 1-itemsets (the item sequence conversion table) obtained after the test data set TestSet is read into memory.

[0098]
serial number   item
0               course.academy_id
1               academy.academy_id
2               course.course_id
3               teacher.teacher_id
4               give_lesson.givelesson_id

[0099] Table 2: Frequent 1-itemset (item sequence conversion table)

[0100] As shown in Figure 2, read the transaction data set into memory and filter it by the preset support threshold to obtain the frequent 1-itemsets;

[0101] Insert all frequent items of the transaction set into the FP-tree;
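The filtering step above can be sketched as follows. Because Table 1 is not reproduced here, the transactions are hypothetical, so the resulting table is not meant to match Table 2; the function name `frequent_one_itemsets` and the ordering rule (descending support, then item name) are likewise assumptions.

```python
from collections import Counter


def frequent_one_itemsets(transactions, min_count):
    """Return a serial-number-to-item table for items meeting the support count."""
    counts = Counter(item for t in transactions for item in set(t))
    kept = [(item, c) for item, c in counts.items() if c >= min_count]
    # Assumed ordering: descending support count, ties broken by item name.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return {serial: item for serial, (item, _) in enumerate(kept)}


# Hypothetical stand-in for TestSet (the real Table 1 is not available here).
test_set = [
    ["course.academy_id", "academy.academy_id", "course.course_id"],
    ["course.academy_id", "academy.academy_id"],
    ["course.academy_id", "academy.academy_id", "teacher.teacher_id"],
    ["course.academy_id", "course.course_id"],
    ["course.course_id", "teacher.teacher_id"],
]
conversion_table = frequent_one_itemsets(test_set, min_count=3)
```

With a support count threshold of 3, `teacher.teacher_id` appears in only two of these sample transactions and is therefore dropped before any tree is built.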

[0102] Su...

example 3

[0109] Obtain the database log file to be analyzed: Assume a simplified test data set TestSet, as shown in Table 1.

[0110] The default support count threshold is 3.

[0111]

[0112] Table 1: Test dataset TestSet

[0113] Analyze the SELECT statements in the log and extract the table names and field names involved as transaction items: Table 2 shows the frequent 1-itemsets (the item sequence conversion table) obtained after the test data set TestSet is read into memory.

[0114]
serial number   item
0               course.academy_id
1               academy.academy_id
2               course.course_id
3               teacher.teacher_id
4               give_lesson.givelesson_id

[0115] Table 2: Frequent 1-itemset (item sequence conversion table)

[0116] As shown in Figure 2, read the transaction data set into memory and filter it by the preset support threshold to obtain the frequent 1-itemsets;

[0117] Insert all frequent items of the transaction set into the FP-tree;

[0118] Su...

Abstract

The invention discloses a denormalization strategy selection method based on a frequent itemset mining algorithm, and in particular such a method for massive data sets. Frequent pattern mining is applied to guide database denormalization for the first time; building on a frequent pattern mining algorithm over a concise tree, the invention provides a new procedure for constructing the concise tree and a correct counting method, both serving database denormalization selection. Through the frequent itemset mining algorithm of association rules, important associations or relations among itemsets in massive data are discovered to guide DBAs and others in selecting and building database denormalization strategies, which addresses the performance bottleneck caused by large numbers of table joins on massive data.
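One way a DBA might read mined itemsets is sketched below. This is a hypothetical heuristic, not the selection rule of the invention: it assumes the mining step yields a mapping from frequent itemsets of `table.field` items to support counts, and it simply ranks the groups of tables that are frequently queried together. The function name and the sample data are illustrative.

```python
def join_candidates(frequent_itemsets):
    """Rank multi-table groups by the highest support among their itemsets.

    Hypothetical heuristic: tables that often appear together in queries
    are treated as candidates to review for denormalization.
    """
    best = {}
    for items, support in frequent_itemsets.items():
        tables = frozenset(item.split(".")[0] for item in items)
        if len(tables) > 1:  # single-table itemsets involve no join
            best[tables] = max(best.get(tables, 0), support)
    return sorted(best.items(), key=lambda kv: (-kv[1], sorted(kv[0])))


# Hypothetical mining output, for illustration only.
mined = {
    frozenset({"course.academy_id", "academy.academy_id"}): 3,
    frozenset({"course.course_id"}): 3,
    frozenset({"teacher.teacher_id", "give_lesson.givelesson_id"}): 5,
}
ranked = join_candidates(mined)
```

The ranking only surfaces candidates; whether to merge tables or add redundant columns remains a design decision that depends on update cost and consistency requirements.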

Description

Technical field

[0001] The invention relates to a denormalization strategy selection method, in particular to a denormalization strategy selection method based on a frequent itemset mining algorithm for massive data sets.

Background

[0002] Constructing a relational database must follow certain rules, called normal forms. The higher the normal form, the stricter the requirements on database design. As the normal form rises, database redundancy decreases step by step and data consistency increases step by step. However, relational database theory also has shortcomings. The higher the normal form, the finer-grained the data model, which means more tables and therefore more table join operations while the program runs. Some database systems support stored procedures and other techniques, but these do not bring revolutionary efficiency improvements, especially when the data of two or more tabl...

Application Information

IPC(8): G06F17/30

CPC: G06F16/284

Inventor: 牛新征, 周冬梅, 侯孟书, 杨健

Owner: UNIV OF ELECTRONIC SCI & TECH OF CHINA
