Cache management method of distributed internal memory column database

A cache management technique applied to distributed in-memory columnar databases. It aims to improve query efficiency, save query time and storage space, and reduce the repeated computation of tasks.

Active Publication Date: 2017-01-04
UNIV OF ELECTRONIC SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0004] Query requests to the database are highly semantically related.



Embodiment

[0030] This embodiment provides a cache management method for a distributed in-memory columnar database. The structure of the cache management system is shown in Figure 1. It includes a query execution engine, a cache master node, a standby node, and at least one cache slave node.

[0031] When a query request arrives, the query execution engine parses the SQL statement into a physical execution plan represented as a DAG (directed acyclic graph). Each node in the physical execution plan represents a physical task; the physical task types include GetColumn, Join, Filter, Group, BuildRow, etc. Each edge represents the transfer of calculation results between two physical tasks. The physical execution plan of a typical query statement (SELECT A.id FROM A,B WHERE A.id=B.id AND A.id...) is shown in Figure 1. In the cache management system, the granularity of cached data is the calculation result of a single physical task. When the cach...
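As a rough illustration of the plan structure described in [0031], the sketch below models physical tasks as nodes and lists the edges along which results flow. The class name `PhysicalTask`, its fields, and the helper functions are hypothetical and are not taken from the patent. Because the example query is truncated in the source, only the join on A.id=B.id is modelled and the remaining condition is left out.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalTask:
    """One node of the physical execution plan (hypothetical structure)."""
    op: str                      # task type, e.g. "GetColumn", "Join", "BuildRow"
    args: str = ""               # task parameters, kept as plain text here
    inputs: list = field(default_factory=list)  # tasks whose results this task consumes


def example_plan():
    """Plan for the join part of the example query (the rest is elided in the source)."""
    a_id = PhysicalTask("GetColumn", "A.id")
    b_id = PhysicalTask("GetColumn", "B.id")
    join = PhysicalTask("Join", "A.id=B.id", [a_id, b_id])
    return PhysicalTask("BuildRow", "A.id", [join])


def edges(root):
    """Return (producer, consumer) pairs, one per edge of the plan."""
    found, seen, stack = [], set(), [root]
    while stack:
        task = stack.pop()
        if id(task) in seen:     # a shared sub-plan is expanded only once
            continue
        seen.add(id(task))
        for child in task.inputs:
            found.append((child, task))
            stack.append(child)
    return found
```

Each node's result is the unit that would be cached, so a cache keyed per node matches the granularity stated in the paragraph above.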


Abstract

The invention discloses a cache management method for a distributed in-memory columnar database. The method comprises the following steps: a cache queue is established on the cache master node; each physical task is taken as a root node to cut the physical execution plan in which it is located, yielding the cache calculation track corresponding to that physical task; a cache feature tree is established on the cache master node according to the cache calculation tracks of the physical tasks; when a query request arrives, the query execution engine parses the SQL statement into a physical execution plan; starting from the root node of the physical execution plan, a level-order traversal is performed over its nodes, and for each physical task it is judged whether its cache calculation track matches the cache feature tree; if it matches, the actual cached data of the physical task is read directly from the cache node, and if not, the physical task is computed. By means of an efficient cache matching algorithm, the method rapidly detects whether the cache is hit, which improves query efficiency.
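The matching flow in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented algorithm. The cache calculation track is approximated by a canonical string of the sub-plan rooted at a task, and the cache feature tree is replaced by a plain dictionary keyed on that string. The names `Task`, `track`, and `run_query` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical physical task: operator, parameters, input tasks."""
    op: str
    args: str = ""
    inputs: list = field(default_factory=list)


def track(task):
    """Canonical text of the sub-plan rooted at `task` (stand-in for a cache calculation track)."""
    return f"{task.op}[{task.args}](" + ",".join(track(i) for i in task.inputs) + ")"


def run_query(root, cache, compute):
    """Walk the plan level by level from the root, reuse cached results, compute the rest.

    `cache` is a dict standing in for the cache feature tree plus cache nodes.
    `compute(task, input_results)` is a caller-supplied function (an assumption).
    Returns the root result and the list of tracks that hit the cache.
    """
    results, hits, seen = {}, [], set()
    queue = deque([root])
    while queue:                          # level-order traversal
        task = queue.popleft()
        if id(task) in seen:
            continue
        seen.add(id(task))
        key = track(task)
        if key in cache:                  # hit: read the result, skip the sub-plan
            results[id(task)] = cache[key]
            hits.append(key)
        else:                             # miss: the inputs must be checked too
            queue.extend(task.inputs)

    def evaluate(task):
        # Compute missed tasks bottom-up and store their results for later queries.
        if id(task) not in results:
            value = compute(task, [evaluate(i) for i in task.inputs])
            results[id(task)] = value
            cache[track(task)] = value
        return results[id(task)]

    return evaluate(root), hits
```

Checking from the root downward means a hit near the top of the plan avoids examining, and recomputing, everything beneath it. A real feature tree would presumably share common prefixes between tracks rather than store whole strings.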

Description

Technical field

[0001] The invention relates to the technical field of computer software, in particular to a cache management method for a distributed in-memory columnar database.

Background

[0002] With the development of the information age, the scale of data has grown explosively. Extracting valuable information from this massive data is a major challenge facing society today. On-Line Analytical Processing (OLAP) systems have demonstrated powerful data analysis capabilities and are widely used in commercial fields such as banking, telecommunications, and stock exchanges.

[0003] A distributed in-memory columnar database that supports OLAP systems allows users to extract and analyze valuable information from massive data along multiple dimensions. This information may be a simple report or a complex analysis result. As the complexity of query statements increases, the time required for query operations will...


Application Information

IPC(8): G06F17/30
CPC: G06F16/24552
Inventor: 段翰聪, 闵革勇, 张建, 郑松, 詹文翰
Owner UNIV OF ELECTRONIC SCI & TECH OF CHINA