Structural data distributed index and retrieval method

A structured data distributed indexing technique, applied in the field of data retrieval, addresses the low retrieval efficiency caused by insufficient index support in distributed column databases, and thereby improves retrieval speed.

Inactive Publication Date: 2015-01-07
SHENZHEN UNIV
Cites: 5; Cited by: 10

AI Technical Summary

Problems solved by technology

[0005] Although distributed column databases solve the storage and retrieval problem for massive structured data, their index support is limited. For example, HBase supports only a primary key index. Retrieval on a non-primary-key column requires scanning the entire table, so retrieval efficiency is low.
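
The cost difference described in [0005] can be illustrated with a toy in-memory table. This is only a sketch: the dictionary, the row keys, and the function names are assumptions made for illustration, and none of it is the HBase API.

```python
# Toy model of a table addressed by row key (hypothetical data, not the HBase API).
table = {
    "row1": {"C3": "b"},
    "row2": {"C3": "s"},
    "row3": {"C3": "b"},
}


def get_by_key(row_key):
    """Primary-key lookup: a single direct access."""
    return table.get(row_key)


def scan_by_value(column, value):
    """Lookup on a non-key column: every row has to be inspected."""
    inspected = 0
    hits = []
    for row_key, row in table.items():
        inspected += 1
        if row.get(column) == value:
            hits.append(row_key)
    return hits, inspected
```

The scan inspects every row regardless of how many rows match, which is the inefficiency the indexing method is meant to avoid.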


Examples

Specific Embodiment

[0060] Because the prefix method is used, not every index needs to be searched. When the user specifies a data set, it is only necessary to add the prefix of that data set in front of the search keyword. For example, if the index is built with a date prefix and the user wants to see the latest results, adding the latest date prefix in front of the search keyword is enough to perform the search. A specific example follows:
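
The prefix idea in [0060] can be sketched as follows. The key layout `<prefix>|<keyword>`, the `|` separator, the sample row IDs, and the function names are all assumptions for illustration; the source does not specify how the prefix and keyword are joined.

```python
from bisect import bisect_left

# Hypothetical index entries: key = "<data set prefix>|<column value>", value = row IDs.
index = {
    "T1|b": ["id1"],
    "T1|s": ["id2"],
    "T2|b": ["id3", "id4"],
    "T2|s": ["id5"],
}
sorted_keys = sorted(index)


def make_key(prefix, keyword):
    """Put the data set prefix in front of the search keyword."""
    return f"{prefix}|{keyword}"


def lookup(prefix, keyword):
    """Exact lookup that touches only the entry for this prefix and keyword."""
    return index.get(make_key(prefix, keyword), [])


def keys_with_prefix(prefix):
    """Range scan restricted to one prefix, as a key-sorted store would allow."""
    start = bisect_left(sorted_keys, prefix + "|")
    matched = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix + "|"):
            break
        matched.append(key)
    return matched
```

Because keys sharing a prefix sit next to each other in sorted order, a search limited to one data set never reads index entries that belong to another.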

[0061] In the first step, the user inputs the retrieval conditions; in general there may be several. In the example of Figure 4, an index has been created on column C3, and the user can input "select ID from C3 where value=b or value=s with T1,T2". That is, the user retrieves the IDs of the data whose value in column C3 is b or s at times T1 and T2.

[0062] In the second step, the system analyzes the retrieval conditions to obtain different retrieval keywords. The system r...
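
One possible reading of the analysis step is sketched below for the example query from [0061]. The source text is truncated at this point, so the keyword format (`<prefix>|<value>`), the regular expression, and the function name `parse_query` are assumptions, not the patent's stated procedure.

```python
import re
from itertools import product

# Matches only the query shape quoted in [0061]; anything else is rejected.
QUERY_RE = re.compile(
    r"select\s+ID\s+from\s+(?P<column>\w+)\s+where\s+(?P<conds>.+?)"
    r"\s+with\s+(?P<prefixes>.+)$",
    re.IGNORECASE,
)


def parse_query(query):
    """Split a query into its column name and one keyword per (prefix, value) pair."""
    match = QUERY_RE.match(query.strip())
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    values = re.findall(r"value\s*=\s*(\w+)", match.group("conds"))
    prefixes = [p.strip() for p in match.group("prefixes").split(",") if p.strip()]
    # Assumed keyword layout: the prefix placed in front of the value.
    keywords = [f"{p}|{v}" for p, v in product(prefixes, values)]
    return match.group("column"), keywords
```

For the query in [0061], this sketch yields column `C3` and four keywords, one for each combination of T1 or T2 with b or s.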

Abstract

The invention discloses a structured data distributed index and a retrieval method. A reverse (inverted) index table with a distributed partial index structure is built by a MapReduce program and stored in a distributed column database. Index establishment comprises selecting frequently used columns, building reverse indexes, and implementing the distributed index. The retrieval method comprises the following steps: a retrieval column name is given; retrieval keywords are constructed; the reverse indexes are searched with the retrieval keywords; and the union of the returned search results is the retrieval result set. Because the reverse index table is built by the MapReduce program, the problem of low retrieval efficiency for massive structured data is solved and retrieval speed is greatly increased.
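
The build and retrieval flow in the abstract can be sketched in plain Python. This is a single-process imitation of the map and reduce phases, not an actual MapReduce job or a column-database client; the sample rows and all function names are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical source rows: (row ID, {column name: value}).
rows = [
    ("id1", {"C3": "b"}),
    ("id2", {"C3": "s"}),
    ("id3", {"C3": "b"}),
    ("id4", {"C3": "x"}),
]


def map_phase(rows, column):
    """Emit (column value, row ID) pairs for the selected column."""
    for row_id, row in rows:
        if column in row:
            yield row[column], row_id


def reduce_phase(pairs):
    """Group row IDs by column value to form the reverse index."""
    index = defaultdict(set)
    for value, row_id in pairs:
        index[value].add(row_id)
    return dict(index)


def search(index, keywords):
    """Look up each keyword and return the union of the row ID sets."""
    result = set()
    for keyword in keywords:
        result |= index.get(keyword, set())
    return result


c3_index = reduce_phase(map_phase(rows, "C3"))
```

Calling `search(c3_index, ["b", "s"])` reads only two index entries and returns their union, instead of scanning every source row.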

Description

[Technical field]

[0001] The invention relates to data retrieval, in particular to a structured data distributed index and retrieval method.

[Background technique]

[0002] Traditional structured data retrieval relies mainly on traditional database technology; commonly used traditional databases include Oracle, MySQL, and SQL Server. Users only need to import structured data sets into the database and can then retrieve the data they want using the operations the database provides. Technically, the database's built-in index mechanism builds a hierarchical index over the imported structured data, and the search interface then obtains the requested data through that index.

[0003] The main technologies for massive structured data retrieval are distributed databases, such as the distributed columnar database HBase. Distributed columnar database breaks the traditional way of storing relat...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/24, G06F16/2228, G06F16/284
Inventor: 毛睿, 陆敏华, 李荣华, 王毅, 刘刚, 岳磅, 廖凯华
Owner: SHENZHEN UNIV