Method for querying keyword based on topic cluster unit in relational database

A keyword query method based on topic cluster units in a relational database. It addresses shortcomings of existing approaches: query efficiency on large-scale relational databases is not considered, users have difficulty finding the expected query answers, and the tables involved are very large.

Inactive Publication Date: 2016-09-28
HARBIN ENG UNIV
3 Cites, 20 Cited by

AI Technical Summary

Problems solved by technology

However, the offline query methods above do not consider query efficiency in large-scale relational databases.
Enterprise databases usually contain hundreds or even thousands of data tables. Using the above method, the ...



Examples


Specific Embodiment 1

[0066] Specific Embodiment 1. The keyword query method based on topic cluster units in a relational database described in this embodiment is carried out according to the following steps:

[0067] 1. Construct the topic cluster units;

[0068] 1.1. Perform vertical grouping based on data table characteristics and query logs;

[0069] 1.2. Apply an optimization scheme for the table join order within each topic cluster;

[0070] 1.3. Perform horizontal grouping based on the topic cluster tuple association graph;

[0071] 2. Establish an index optimization mechanism based on association rules;

[0072] 3. Return the query result to the user.
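
The offline idea behind these steps (join tables ahead of time, then answer keyword queries from an index without joins at query time) can be illustrated with a minimal sketch. It assumes a comprehensive table that has already been joined offline, and it uses a plain inverted index with AND semantics. This is a generic stand-in, not the association-rule-based index of step 2, whose details are not given here. The function names `build_keyword_index` and `keyword_query` and the sample rows are hypothetical.

```python
from collections import defaultdict


def build_keyword_index(rows):
    """Map each lowercase token to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for value in row.values():
            for token in str(value).lower().split():
                index[token].add(row_id)
    return index


def keyword_query(index, rows, keywords):
    """Return the rows that contain every keyword (AND semantics)."""
    postings = [index.get(k.lower(), set()) for k in keywords]
    if not postings:
        return []
    hits = set.intersection(*postings)
    return [rows[i] for i in sorted(hits)]


# Hypothetical comprehensive table: author and paper tables joined offline.
rows = [
    {"author": "Alice Chen", "title": "Keyword search in databases"},
    {"author": "Bob Li", "title": "Graph clustering methods"},
    {"author": "Alice Chen", "title": "Index structures"},
]
index = build_keyword_index(rows)
result = keyword_query(index, rows, ["alice", "index"])
```

Because the rows are pre-joined, the query touches only the index and the comprehensive table, which is the property the offline approach relies on.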

[0073] This embodiment has the following beneficial effects:

[0074] 1. It proposes an offline query method based on topic cluster units, which is suitable for keyword queries on large-scale relational databases;

[0075] 2. It constructs a new type of data structure, the topic cluster unit. The data ta...

Specific Embodiment 2

[0078] Specific Embodiment 2. This embodiment further describes the keyword query method based on topic cluster units in a relational database described in Specific Embodiment 1. The specific process of the vertical grouping based on data table characteristics and query logs described in Step 1.1 is as follows. This application constructs a similarity matrix between tables, building the initial input matrix from table characteristics (the topological closeness between tables and the content similarity between tables) and from query logs. The vertical grouping method takes the relational database D and the user query logs as input and outputs a set of topic clusters. The specific process is shown in Figure 4. The vertical grouping method can be roughly divided into three modules: an input module, a similarity matrix building module, and an output module. The input module takes the relational ...
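
As a rough illustration of the similarity-matrix idea, the sketch below combines three pairwise table similarities (topological closeness, content similarity, and query-log co-occurrence) and groups tables into topic clusters. The equal weights, the threshold, the connected-components grouping, and all names and numbers are assumptions made for illustration. The text above does not specify how the matrices are combined or which clustering procedure is used.

```python
def sym(n, pairs):
    """Build a symmetric n x n matrix from {(i, j): value} entries."""
    m = [[0.0] * n for _ in range(n)]
    for (i, j), v in pairs.items():
        m[i][j] = m[j][i] = v
    return m


def combine_similarity(topo, content, log, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of three similarity matrices (the weights are assumed)."""
    n = len(topo)
    w_t, w_c, w_l = weights
    return [[w_t * topo[i][j] + w_c * content[i][j] + w_l * log[i][j]
             for j in range(n)] for i in range(n)]


def group_tables(tables, sim, threshold):
    """Group tables linked by similarity >= threshold (connected components)."""
    parent = list(range(len(tables)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(len(tables)):
        for j in range(i + 1, len(tables)):
            if sim[i][j] >= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i, name in enumerate(tables):
        clusters.setdefault(find(i), []).append(name)
    return sorted(clusters.values())


# Hypothetical database with four tables and made-up similarity scores.
tables = ["author", "paper", "customer", "orders"]
topo = sym(4, {(0, 1): 1.0, (2, 3): 1.0})                 # foreign-key links
content = sym(4, {(0, 1): 0.2, (2, 3): 0.1})              # shared values
log = sym(4, {(0, 1): 0.8, (2, 3): 0.9, (1, 3): 0.1})     # co-queried tables
sim = combine_similarity(topo, content, log)
topic_clusters = group_tables(tables, sim, threshold=0.5)
```

With these made-up scores, tables that are linked by a foreign key and frequently queried together end up in the same cluster, while the weak `paper`/`orders` link stays below the threshold.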

Specific Embodiment 3

[0121] Specific Embodiment 3. This embodiment further describes the keyword query method based on topic cluster units in a relational database described in Specific Embodiment 1 or 2. The specific process of the table join order optimization scheme for the topic cluster in Step 1.2 is as follows:

[0122] To avoid complex table join operations during the query process, the n tables $T_1, T_2, \ldots, T_n$ in the topic cluster $C_i = (T_1, T_2, \ldots, T_n)$ need to be joined to obtain the comprehensive table $T'_i$. Existing methods only perform a breadth-first traversal to join tables according to primary-foreign key relationships. In a large database, which usually contains hundreds or even thousands of data tables, this takes a lot of time and greatly reduces preprocessing efficiency. To address this problem, the present invention designs a table join order optimization scheme based on genetic algorith...
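
To make the idea of searching for a join order with a genetic algorithm concrete, here is a minimal sketch. It is not the scheme of the invention, whose details are cut off above. The cost model (sum of estimated intermediate result sizes for a left-deep join), the order crossover, the swap mutation, the elitist selection, and every table name, size, and selectivity below are assumptions chosen for illustration.

```python
import random


def join_cost(order, sizes, selectivity):
    """Sum of estimated intermediate sizes for a left-deep join order."""
    joined = [order[0]]
    current = float(sizes[order[0]])
    total = 0.0
    for t in order[1:]:
        factor = 1.0
        for u in joined:
            # Pairs without a join predicate default to a cross product.
            factor *= selectivity.get(frozenset((u, t)), 1.0)
        current = current * sizes[t] * factor
        total += current
        joined.append(t)
    return total


def order_crossover(a, b, rng):
    """Keep a slice of parent a in place, fill the rest in parent b's order."""
    i, j = sorted(rng.sample(range(len(a)), 2))
    middle = a[i:j + 1]
    rest = [t for t in b if t not in middle]
    return rest[:i] + middle + rest[i:]


def mutate(order, rng, rate=0.2):
    """Swap two positions with probability `rate`; returns a new list."""
    order = order[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order


def optimize_join_order(tables, sizes, selectivity,
                        generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)

    def cost(order):
        return join_cost(order, sizes, selectivity)

    # Seed the population with the given order so the result is never worse.
    population = [list(tables)] + [rng.sample(tables, len(tables))
                                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=cost)
        elite = population[:pop_size // 2]  # elitism: best half survives
        children = [mutate(order_crossover(rng.choice(elite),
                                           rng.choice(elite), rng), rng)
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return min(population, key=cost)


# Hypothetical topic cluster with made-up sizes and join selectivities.
tables = ["orders", "item", "customer", "product"]
sizes = {"orders": 1000, "item": 5000, "customer": 100, "product": 50}
selectivity = {
    frozenset(("orders", "customer")): 1 / 100,
    frozenset(("orders", "item")): 1 / 1000,
    frozenset(("item", "product")): 1 / 50,
}
best = optimize_join_order(tables, sizes, selectivity)
```

Seeding the population with the input order and keeping the best half each generation means the returned order costs no more than the input order under this assumed cost model. A real optimizer would need cardinality estimates from the database rather than fixed numbers.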


Abstract

The invention relates to the field of information retrieval, in particular to a keyword query method based on topic cluster units in a relational database. It aims to solve two problems: the large time overhead caused by frequent table joins during querying in existing online keyword query methods, and the low query efficiency of existing offline keyword query methods on large-scale databases with a complicated internal structure and huge data volume. The method is implemented by the following steps: 1, constructing the topic cluster units: (1) performing vertical grouping based on data table characteristics and query logs; (2) applying a table join order optimization scheme within each topic cluster; and (3) performing horizontal grouping based on a topic cluster tuple association graph; 2, establishing an association rule-based index optimization mechanism; and 3, returning the query result to the user. The method is applied to the field of information retrieval.

Description

Technical field

[0001] The invention relates to the field of information retrieval, in particular to a keyword query method based on topic cluster units in a relational database.

Background

[0002] In recent years, keyword query has been successfully applied as an important query technology in the field of information retrieval. Because it is simple and easy to use, it is accepted by more and more users. For relational databases, there is also a need for a simple and effective query method to obtain information of interest to users from complex relational databases. Traditional structured query methods, such as SQL queries, not only require users to understand the complex underlying schema of relational databases, but also require users to master the relevant query languages, which makes querying difficult and inconvenient. Therefore, keyword query technology based on relational databases has received extensive attention. Som...


Application Information

IPC(8): G06F17/30
CPC: G06F16/2471, G06F16/285
Inventor: 王念滨, 周连科, 王红滨, 王瑛琦, 何鸣, 宋奎勇
Owner: HARBIN ENG UNIV