Index query method and index query device

An index query method and indexing technology, applied in the field of big data, that improves query performance and enables fast distributed indexing.

Active Publication Date: 2017-11-24
SHENZHEN UWAY TECH CO LTD
Cites: 4; Cited by: 6

AI Technical Summary

Problems solved by technology

The requirements for real-time query over big data keep rising, yet real-time query projects in the current open-source big data environment still carry risks in use because of their performance, their stability, and the limited experience accumulated with them.

Examples

Embodiment 1

[0065] Referring to Figure 1, which is a schematic flowchart of the index query method provided in Embodiment 1 of the present invention, the method includes the following steps:

[0066] S11. Obtain original data in the distributed file system, and generate inverted index information of the original data, wherein the inverted index information includes a file name and an offset;

[0067] Specifically, the inverted index information generated for the original data mainly stores data index information, that is, the position information of the specified fields of the original records. With this index, the file in which the data is located and the offset of the corresponding data can be queried quickly. It is called an inverted index because each entry in the index table consists of an attribute value and the address of every record having that attribute value: the attribute value is not determined by the record; instead, the record position is determined by the attribute value, thus becomin...
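
The idea in [0067] can be illustrated with a minimal sketch. It assumes newline-delimited, comma-separated records; the file names, the sample data, and the `build_inverted_index` helper are hypothetical and do not come from the patent text.

```python
from collections import defaultdict

def build_inverted_index(files, field):
    """Map each value of column `field` to the (file name, byte offset)
    of every record that carries that value."""
    index = defaultdict(list)
    for file_name, data in files.items():
        offset = 0
        for line in data.splitlines(keepends=True):
            columns = line.rstrip(b"\n").split(b",")
            index[columns[field]].append((file_name, offset))
            offset += len(line)  # byte offset of the next record
    return dict(index)

# Hypothetical sample data: two small files of "name,city" records.
files = {
    "part-0.csv": b"alice,shenzhen\nbob,beijing\n",
    "part-1.csv": b"carol,shenzhen\n",
}
index = build_inverted_index(files, field=1)
print(index[b"shenzhen"])
```

Looking up an attribute value returns the positions of all matching records without scanning the files, which is the property the paragraph attributes to the inverted index.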

Embodiment 2

[0082] With reference to the specific process of steps S11 to S15 described in Embodiment 1 of the present invention and Figure 1, and referring to Figure 2, step S12 of compressing the original data according to a preset LZO compression mode to generate an LZO compressed block and a corresponding random access index specifically includes:

[0083] S121. According to the size of the preset compressed block, compress the original data into an LZO compressed block by using the LZO compression mode;

[0084] It can be understood that, by modifying the LZO source code, the generation of the related index files and of supporting classes, such as those for MapReduce, has been implemented. MapReduce is a programming model mainly used for parallel computation over large-scale data sets. MapReduce monitors the directory and applies LZO compression to the original data to obtain compressed file blocks. The LZO file format consists of a file header and multiple file blocks. Each block ...
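
Block-wise compression with a random access index, as described in S121, might look like the following sketch. It makes several assumptions: `zlib` stands in for LZO because LZO is not in the Python standard library, the 16-byte block size is chosen only to keep the example small, the file header is omitted, and the function names are hypothetical.

```python
import zlib

BLOCK_SIZE = 16  # hypothetical preset block size, tiny for illustration

def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress `data` one block at a time (zlib as a stand-in for LZO).
    Returns the compressed stream and a random access index whose entries
    are (original data offset, compressed block offset)."""
    stream = bytearray()
    random_access_index = []
    for original_offset in range(0, len(data), block_size):
        random_access_index.append((original_offset, len(stream)))
        stream += zlib.compress(data[original_offset:original_offset + block_size])
    return bytes(stream), random_access_index

def read_block(stream, random_access_index, block_number):
    """Decompress a single block without touching the others."""
    start = random_access_index[block_number][1]
    if block_number + 1 < len(random_access_index):
        end = random_access_index[block_number + 1][1]
    else:
        end = len(stream)
    return zlib.decompress(stream[start:end])

data = b"alice,shenzhen\nbob,beijing\ncarol,shenzhen\n"
stream, ra_index = compress_blocks(data)
restored = b"".join(read_block(stream, ra_index, n) for n in range(len(ra_index)))
```

Because each block is compressed independently, any block can be decompressed on its own once its compressed offset is known, which is what makes random access into the compressed file possible.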

Embodiment 3

[0110] Corresponding to the methods disclosed in Embodiment 1 and Embodiment 2 of the present invention, Embodiment 3 of the present invention further provides an index query device. Referring to Figure 6, the device specifically includes:

[0111] The acquisition module 1 is configured to acquire original data in the distributed file system, and generate inverted index information of the original data, wherein the inverted index information includes a file name and an offset;

[0112] The compression module 2 is configured to compress the original data according to a preset LZO compression mode to generate an LZO compressed block, and generate a corresponding random access index, wherein the random access index includes an original data offset and a compressed block offset;

[0113] The writing module 3 is used to write the inverted index information into the inverted index file according to the index file format, wherein the inverted index file includes a file header, an index block,...

Abstract

The invention discloses an index query method and an index query device. The index query method includes: generating inverted index information of original data; compressing the original data according to a preset LZO compression mode to generate an LZO compressed block and a corresponding random access index; writing the inverted index information into an inverted index file according to an index file format; when an index query request is received, calling the inverted index file to query the inverted index information of the original data; and, according to the inverted index information and the random access index, locating the corresponding position of the original data in the LZO compressed block and acquiring the original data from that position. The purpose of improving real-time index query performance is thereby achieved.
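
The query path summarized in the abstract can be sketched end to end as follows. This is an illustration under stated assumptions rather than the patented implementation: `zlib` replaces LZO, a single in-memory file is used so the file name in each index entry is ignored, records are newline-terminated, and all names (`load_block`, `fetch_record`, `query`) and sample data are hypothetical.

```python
import bisect
import zlib

BLOCK_SIZE = 16  # hypothetical block size
data = b"alice,shenzhen\nbob,beijing\ncarol,shenzhen\n"

# Setup standing in for the build steps: compressed stream, random access
# index of (original data offset, compressed block offset), inverted index.
stream = bytearray()
random_access_index = []
for original_offset in range(0, len(data), BLOCK_SIZE):
    random_access_index.append((original_offset, len(stream)))
    stream += zlib.compress(data[original_offset:original_offset + BLOCK_SIZE])
original_offsets = [o for o, _ in random_access_index]
inverted_index = {
    b"shenzhen": [("part-0", 0), ("part-0", 27)],
    b"beijing": [("part-0", 15)],
}

def load_block(n):
    """Decompress block n using only its compressed offsets."""
    start = random_access_index[n][1]
    end = random_access_index[n + 1][1] if n + 1 < len(random_access_index) else len(stream)
    return zlib.decompress(bytes(stream[start:end]))

def fetch_record(original_offset):
    """Find the block containing `original_offset`, then read until newline,
    continuing into following blocks if the record spans a block boundary."""
    n = bisect.bisect_right(original_offsets, original_offset) - 1
    position = original_offset - original_offsets[n]
    record = bytearray()
    while n < len(random_access_index):
        chunk = load_block(n)
        end = chunk.find(b"\n", position)
        if end != -1:
            return bytes(record + chunk[position:end])
        record += chunk[position:]
        n, position = n + 1, 0
    return bytes(record)

def query(value):
    """Inverted index lookup, then random access into the compressed blocks."""
    return [fetch_record(offset) for _file, offset in inverted_index.get(value, [])]

print(query(b"shenzhen"))
```

Only the blocks that overlap a matching record are decompressed, so the cost of a query depends on the number of hits rather than on the size of the compressed file.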

Description

Technical field

[0001] The invention relates to the field of big data technology, in particular to a real-time index query method and device based on a distributed file system.

Background

[0002] With the advent of the cloud era, data is undergoing tremendous changes. Customer data, transaction data, social media data, network behavior data, and similar data all contain large amounts of high-value business information that shapes the future and development of enterprises. The era of big data has clearly arrived, and in business, economics, and other fields, decisions will increasingly rely on information extracted from massive data rather than on experience and intuition. Therefore, the requirements for real-time query based on big data are becoming higher and higher, and real-time query projects in the current open-source big data environment still carry risks in use because of their performance, their stability, and the limited experience accumulated with them.

Contents of the invent...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/13, G06F16/134, G06F16/182
Inventor: 周莅涛, 李适季, 王新胜, 彭仕文, 王巧瑞, 张超, 施全立, 陈立志
Owner: SHENZHEN UWAY TECH CO LTD