An automatic identification and optimization method based on massive data-like SQL retrieval scenarios

An optimization method and massive-data technology applied to structured data retrieval, digital information retrieval, database indexing, and related areas. It addresses the problem that the retrieval performance requirements of different retrieval scenarios cannot be met, with the effect of reducing retrieval resource consumption and improving retrieval performance.

Active Publication Date: 2019-04-12
BEIJING SCISTOR TECH +1

AI Technical Summary

Problems solved by technology

Storing massive data on a single storage medium cannot by itself meet the retrieval performance requirements of the various retrieval scenarios. On this basis, the present invention provides an automatic identification and optimization technology for SQL retrieval scenarios over massive data, so as to meet the high-performance retrieval requirements of different retrieval scenarios.

Detailed Description of the Embodiments

[0030] In order to make the purpose, technical solution and advantages of the present invention clearer, the technical solution of the present invention will be further described in detail below in conjunction with the accompanying drawings and implementation examples.

[0031] Before a retrieval scenario can be automatically identified and optimized, the following steps are required:

[0032] 1) Lexical and syntactic analysis of the SQL-like statement;

[0033] 2) Semantic analysis of the retrieval;

[0034] 3) Generation of the logical plan tree.

[0035] The details of these steps are outside the scope of the present invention and are not described here. In the logical plan tree generation stage, however, operations such as scan / join / group by / order by / aggregation functions have already been divided into levels. The present invention optimizes the single-table data scanning stage (scan), data indexing, and storage medium selection. The present invention illustrates the optimization strat...
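As a rough illustration of the point in [0035], the sketch below models a logical plan tree and collects its single-table scan nodes, which are the stage the invention targets. The `PlanNode` class, its fields, and the sample query are hypothetical and are not taken from the patent text.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class PlanNode:
    """One operator in a hypothetical logical plan tree."""
    op: str                                   # e.g. "scan", "join", "group_by"
    table: Optional[str] = None               # set only for scan nodes
    predicates: List[str] = field(default_factory=list)
    children: List["PlanNode"] = field(default_factory=list)


def scan_nodes(node: PlanNode) -> Iterator[PlanNode]:
    """Yield every single-table scan in the tree, depth first, left to right."""
    if node.op == "scan":
        yield node
    for child in node.children:
        yield from scan_nodes(child)


# Assumed plan for a query that joins two tables, groups, and sorts.
plan = PlanNode("order_by", children=[
    PlanNode("group_by", children=[
        PlanNode("join", children=[
            PlanNode("scan", table="logs", predicates=["ip = '10.0.0.1'"]),
            PlanNode("scan", table="users"),
        ]),
    ]),
])

scans = list(scan_nodes(plan))
```

Each element of `scans` carries the table name and the filter predicates, which is the information a scan-stage optimizer would need when deciding on an index or a storage medium.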

Abstract

The invention provides an automatic identification and optimization method based on SQL-like retrieval scenarios over massive data, and belongs to the field of statistical analysis of massive data. The method optimizes retrieval in five respects: introducing Lucene as an optional storage medium; adding a Bloom filter index (bf index) on the retrieval fields of each data file; classifying retrieval scenarios and selecting the optimal storage medium for each; converting the SQL-like statement into a Lucene statement when the Lucene retrieval scenario applies; and adding session-level settings that control whether the Lucene storage medium and the bf index take effect. When massive data is retrieved, the session-level setting of the bf index is checked first, the list of data files to be retrieved is reduced through the bf index, and the session-level setting of the Lucene storage medium is then checked. The method effectively lowers the resource consumption of the cluster during massive-data retrieval and greatly improves retrieval performance.
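The retrieval flow in the abstract can be sketched roughly as follows: check the session-level bf index setting, prune the file list with one Bloom filter per data file, then check the session-level Lucene setting. This is a minimal sketch under assumptions, not the patented implementation: the `BloomFilter` sizing, the `Session` fields, the function names, and the sample file names are all hypothetical, and the scenario classification step is omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value: str) -> List[int]:
        positions = []
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.num_bits)
        return positions

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


@dataclass
class Session:
    """Hypothetical session-level switches for the two optimizations."""
    bf_index_enabled: bool = True
    lucene_enabled: bool = True


def prune_files(file_filters: Dict[str, BloomFilter], value: str,
                session: Session) -> List[str]:
    """Return the data files that may hold `value` in the indexed field."""
    if not session.bf_index_enabled:
        return list(file_filters)          # index disabled: scan every file
    return [name for name, bf in file_filters.items() if bf.might_contain(value)]


def lucene_allowed(session: Session) -> bool:
    """Second check in the flow: may Lucene be used in this session?"""
    return session.lucene_enabled


# Assumed sample data: one filter per data file, built over an "ip" field.
files = {"part-0": BloomFilter(), "part-1": BloomFilter()}
files["part-0"].add("10.0.0.1")
files["part-1"].add("10.0.0.2")

candidates = prune_files(files, "10.0.0.1", Session())
use_lucene = lucene_allowed(Session())
```

Because a Bloom filter never reports a stored value as absent, pruning cannot drop a file that actually contains the value; it can only keep a few extra files on false positives, which costs scan time but not correctness.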

Description

Technical field

[0001] The invention belongs to the technical field of statistical analysis of massive data, and relates to a technical scheme that automatically identifies massive-data retrieval scenarios based on SQL and selects corresponding means to speed up retrieval.

Background

[0002] With the rapid development of information science and technology, massive data in various forms, such as webpage files, text data, and multimedia data, is continuously generated. The scale of data is expanding rapidly, and the application fields of the various types of data are expanding as well. These applications have the following features: first, the data scale is large and continues to grow, and the data must be retained for statistical analysis; second, the requirements for complex query operations and online transaction processing capabilities are high, and the requirements for response time are relatively strict. Moreover, this is carried out under the condition of...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/2453, G06F16/22, G06F16/2452
CPC: G06F16/2228, G06F16/2452, G06F16/2453
Inventor: 王宇, 徐晓燕, 周渊, 刘庆良, 郑彩娟, 王振宇, 黄成, 李斌斌, 周游, 刘斌斌
Owner: BEIJING SCISTOR TECH