Method for quickly retrieving mass data

A fast retrieval technology for massive data, applied in structured data retrieval, digital data information retrieval, database indexing, etc., which can solve the problem of wasted storage space

Pending Publication Date: 2020-09-18
NANJING LES INFORMATION TECH
4 Cites, 3 Cited by

AI Technical Summary

Problems solved by technology

Such a secondary index design solves the problem in scheme 1 of wasting storage space after query conditions change, but it creates new problems. HBase matches a RowKey from front to back according to the ASCII codes of its characters. As a result, if there are multiple query conditions, the number of secondary indexes needed to support the various combined queries becomes very large. When the number of conditions reaches 7 to 8, there are already too many indexes, so the storage space occupied by the indexes may exceed that of the primary data.
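The growth described above can be illustrated with a small counting sketch. It is not taken from the patent: the field names are hypothetical, every condition is assumed to be an equality match, and each secondary index is assumed to be a RowKey that concatenates fields in a fixed order. Under those assumptions, a query can use an index only if its fields are exactly the leading fields of that index, and two different field sets of the same size can never share one index, which gives a lower bound on the number of indexes.

```python
from math import comb


def can_prefix_scan(index_fields, query_fields):
    """Return True if the queried fields are exactly the leading fields of the index key.

    Assumes equality conditions only, so the order among the leading fields
    does not matter, but every leading field must be supplied.
    """
    k = len(query_fields)
    return set(index_fields[:k]) == set(query_fields)


def count_query_combinations(n):
    """Number of non-empty combinations of n optional query conditions."""
    return 2 ** n - 1


def min_indexes_lower_bound(n):
    """Lower bound on the number of field orderings needed to serve every combination.

    Two different field sets of the same size cannot both be the leading
    prefix of one ordering, so at least comb(n, n // 2) orderings are needed.
    """
    return comb(n, n // 2)


if __name__ == "__main__":
    index = ["plate", "time", "road"]  # hypothetical index field order
    print(can_prefix_scan(index, ["time", "plate"]))  # leading fields: usable
    print(can_prefix_scan(index, ["road"]))           # not a leading field: unusable
    for n in (7, 8):
        print(n, count_query_combinations(n), min_indexes_lower_bound(n))
```

With 7 conditions there are 127 possible combinations and at least 35 index orderings are required under these assumptions; with 8 conditions the figures are 255 and 70.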

Examples

Embodiment Construction

[0033] In order to facilitate the understanding of those skilled in the art, the present invention will be further described below in conjunction with the embodiments and accompanying drawings, and the contents mentioned in the embodiments are not intended to limit the present invention.

[0034] As shown in Figure 1, the method for fast retrieval of mass data of the present invention comprises the following steps:

[0035] 1) Build a massive data storage system;

[0036] The mass data storage system is Apache HBase, a distributed and scalable mass data storage system based on Hadoop. The business data to be retrieved is stored in HBase according to its respective business design, and a suitable RowKey is designed to serve as the unique identifier of each record. The RowKey should let the data be evenly distributed across multiple HBase RegionServers, which improves concurrent processing performance and avoids local hotspots.
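The text does not say how the RowKey is constructed. One common way to spread writes evenly across RegionServers is to prefix the natural key with a hash-derived bucket ("salting"); the following is a minimal sketch of that technique, not the patent's own design. The field names (plate, timestamp_ms), the bucket count and the key layout are all assumptions made for illustration.

```python
import hashlib

NUM_BUCKETS = 16  # hypothetical number of salt buckets, e.g. one per pre-split region


def make_rowkey(plate, timestamp_ms, num_buckets=NUM_BUCKETS):
    """Build a salted RowKey as bytes: '<bucket>_<plate>_<timestamp>'.

    The bucket is derived from a hash of the natural key, so the same record
    always maps to the same RowKey while consecutive records are scattered
    over different key ranges instead of piling up on one region.
    """
    natural = f"{plate}_{timestamp_ms:013d}"
    digest = hashlib.md5(natural.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % num_buckets
    return f"{bucket:02d}_{natural}".encode("utf-8")


if __name__ == "__main__":
    # Records with consecutive timestamps land in different buckets.
    for ts in range(1600000000000, 1600000000005):
        print(make_rowkey("A12345", ts))
```

The trade-off of salting is that a range scan over the natural key must be issued once per bucket, which is acceptable here because lookups are by exact RowKey after the index search.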

[0037] 2) Establish a secondary index for...

Abstract

The invention discloses a method for quickly retrieving mass data. The method comprises the following steps: constructing a mass data storage system; establishing a secondary index for the data in the mass data storage system; starting a data retrieval service and listening for HTTP requests; parsing the HTTP request received from the client to generate an index retrieval condition, sending an index retrieval request to the ElasticSearch index service, and obtaining the response; and, using the RowKeys of the data in the response, reading the corresponding structured data from the HBase service, then parsing and returning the retrieved structured data. With this method, massive data can be quickly retrieved according to multiple conditions, query results can be returned within a very short time, and the defects of existing technical schemes are overcome at minimum cost.
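The retrieval flow in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the URL, the field names and the record contents are hypothetical, and FakeIndex and FakeStore are in-memory stand-ins for the ElasticSearch index service and the HBase service so that the example runs without either system. The query body follows the shape of an ElasticSearch bool query with term filters.

```python
from urllib.parse import urlparse, parse_qs


def build_index_query(url):
    """Turn the query string of an HTTP request URL into a bool/filter query body."""
    params = parse_qs(urlparse(url).query)
    filters = [{"term": {field: values[0]}} for field, values in sorted(params.items())]
    return {"query": {"bool": {"filter": filters}}}


class FakeIndex:
    """In-memory stand-in for the index service: maps RowKey -> indexed fields."""

    def __init__(self, docs):
        self.docs = docs

    def search(self, body):
        terms = [f["term"] for f in body["query"]["bool"]["filter"]]
        return [
            rowkey
            for rowkey, fields in sorted(self.docs.items())
            if all(fields.get(k) == v for term in terms for k, v in term.items())
        ]


class FakeStore:
    """In-memory stand-in for the primary store: maps RowKey -> full record."""

    def __init__(self, rows):
        self.rows = rows

    def get(self, rowkey):
        return self.rows.get(rowkey)


def retrieve(url, index, store):
    """Parse the request, search the index for RowKeys, then read the records by RowKey."""
    body = build_index_query(url)
    rowkeys = index.search(body)
    return [store.get(rowkey) for rowkey in rowkeys]


if __name__ == "__main__":
    index = FakeIndex({
        "01_A": {"plate": "A123", "road": "R1"},
        "02_B": {"plate": "B456", "road": "R1"},
    })
    store = FakeStore({
        "01_A": {"plate": "A123", "road": "R1", "speed": "62"},
        "02_B": {"plate": "B456", "road": "R1", "speed": "48"},
    })
    print(retrieve("http://localhost/search?plate=A123&road=R1", index, store))
```

The point of the split is that the index only has to return RowKeys, so any combination of conditions is served by one index, while the full records are read from the primary store by exact key.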

Description

Technical field

[0001] The invention belongs to the technical field of fast retrieval of big data, and specifically relates to a method for fast retrieval of massive data.

Background technique

[0002] With the development of society and technology, massive amounts of data are generated in different fields every day, and storing and using this data has become a very challenging technical problem. For example, in the transportation industry, a county-level city with a population of 3 million has 10 million pieces of vehicle passing data generated by video detectors. Ordinary transactional information management systems store this data in relational databases. Within the first year, retrieval of the data can still be carried out normally. Later, even when such a system is optimized to the extreme, it is still impossible to find the required data in a short time. How to store massive data more effectively and achieve fast retrieval through certain technologies has become an urgent pr...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/22, G06F16/242
CPC: G06F16/221, G06F16/242
Inventor: 徐晓贝陈胡陈宽陶伟洋叶兆裕王远友
Owner: NANJING LES INFORMATION TECH