Distributed database system, method for building index therein and query method

A query method and database technology, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses problems such as poor suitability for semi-structured data or changing data structures, high indexing overhead, and high repetition rates of attribute values.

Inactive Publication Date: 2012-03-14
CHINA MOBILE COMM GRP CO LTD

AI Technical Summary

Problems solved by technology

[0007] (2) The data is structured or semi-structured, and its structure may change.
[0009] (4) Values in many attribute fields have a high repetition rate.
[0016] Users who want to query massive data sets and retrieve the data they need find it very difficult to keep using existing databases and their indexing methods.
Databases often cannot store such a huge amount of data, and they are poorly suited to semi-structured data or changing data structures.
For massive data, a dense and complete index is expensive and slow to build and maintain, and the index itself is very large, which makes it difficult for the data writing speed to keep up with the data generation speed.

Method used



Embodiment Construction

[0044] Hereinafter, exemplary embodiments of the present application will be described in detail with reference to the accompanying drawings.

[0045] The embodiments in this application are based on a distributed file system. A distributed file system consists of multiple storage and computing nodes; these nodes may be networked PC servers, and their number can reach several thousand. Data nodes can be added or removed smoothly according to capacity needs without interrupting service, and the failure of a few data nodes does not interrupt system service. As will be described below, file data is divided into blocks and distributed as evenly as possible across the data nodes, and multiple copies are kept to ensure data reliability. Any file in the file system, and the data distributed and stored on each data node, can be accessed by calling the client API of the distributed file system, wherein the reading and writing of data in the fil...
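The block splitting and replication described in paragraph [0045] can be illustrated with a minimal sketch. The round-robin placement policy, the function names `split_into_blocks` and `place_blocks`, and the node names are assumptions made for illustration only; the source does not say how the distributed file system actually chooses nodes.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Divide file data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, nodes: list[str], replicas: int = 3) -> dict[int, list[str]]:
    """Assign each block to `replicas` distinct nodes.

    Hypothetical policy: plain round-robin, so that copies are spread
    as evenly as possible across the data nodes.
    """
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replicas)]
        for block in range(num_blocks)
    }


blocks = split_into_blocks(b"0123456789", block_size=4)
placement = place_blocks(len(blocks), ["node-a", "node-b", "node-c", "node-d"], replicas=2)
```

Because every block has more than one copy on distinct nodes, losing a single node in this sketch still leaves at least one copy of each block readable, which is the reliability property the paragraph describes.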


Abstract

The invention discloses a distributed database system and a method for building an index in the distributed database system. The distributed database system comprises a plurality of distributed storage units, an index memory, a parser, an index query module and a parallel processing engine. The distributed storage units store a plurality of data block files in segments. The index memory stores the indexes of the data block files. The parser parses a query statement initiated by a user and selects a corresponding query index. The index query module searches the indexes of the data block files according to the selected query index to obtain at least one query data block set; each query data block set comprises an index key value and the position information, within the data block files, of the data blocks corresponding to that index key value. The parallel processing engine splits the at least one query data block set and initiates parallel scanning tasks.
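The query flow in the abstract (index lookup, query data block set, split, parallel scan) can be sketched roughly as follows. This is an illustrative toy, not the patented implementation: the in-memory `BLOCK_FILES` layout, the `BlockRef` type, the function names, and the use of a thread pool in place of a distributed processing engine are all assumptions, and the parser step is omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockRef:
    """Position of one data block: which file, and which block inside it."""
    file: str
    block: int


# Hypothetical data block files: file name -> list of blocks -> list of records.
BLOCK_FILES = {
    "part-0": [[{"city": "Beijing", "n": 1}, {"city": "Xi'an", "n": 2}],
               [{"city": "Shanghai", "n": 3}]],
    "part-1": [[{"city": "Beijing", "n": 4}],
               [{"city": "Shanghai", "n": 5}, {"city": "Beijing", "n": 6}]],
}


def build_index(block_files, field):
    """Block-level index: key value -> positions of blocks containing that value."""
    index = {}
    for file, blocks in block_files.items():
        for pos, block in enumerate(blocks):
            for key in {rec[field] for rec in block}:
                index.setdefault(key, []).append(BlockRef(file, pos))
    return index


def scan(refs, field, key):
    """Scan only the referenced blocks and keep the matching records."""
    return [rec for ref in refs
            for rec in BLOCK_FILES[ref.file][ref.block] if rec[field] == key]


def query(index, field, key, workers=2):
    block_set = index.get(key, [])                            # index query step
    chunks = [block_set[i::workers] for i in range(workers)]  # split the block set
    with ThreadPoolExecutor(max_workers=workers) as pool:     # parallel scan tasks
        parts = pool.map(scan, chunks, [field] * workers, [key] * workers)
        return [rec for part in parts for rec in part]


index = build_index(BLOCK_FILES, "city")
beijing = query(index, "city", "Beijing")
```

The index here stores one entry per key value per block rather than one per record, so highly repetitive attribute values add little to the index size; blocks that do not contain the key are never scanned.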

Description

Technical field

[0001] The present application relates to a distributed database system, a method for establishing an index therein, and a query method.

Background art

[0002] Storing large quantities of structured data in databases, especially relational databases, is a common data management method. The simple and intuitive practice is to deploy a mature database management system, use standard interfaces (such as SQL) to define data tables and data structures, and import or insert the collected data into the corresponding tables of the database. As needed, the database system builds indexes on the data for fast querying. When data is queried, an appropriate index can be selected according to the query conditions to optimize query performance.

[0003] In terms of large-scale data management, the key factors affecting data query performance are the amount of data accessed during query and disk IO. Indexing technology is an important method to improve que...
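The conventional workflow described in paragraph [0002] (define a table through SQL, insert data, build an index, and let the database choose it at query time) can be sketched with Python's built-in `sqlite3` module. SQLite stands in here for "a mature database management system"; the table name, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define the table and its structure through the standard SQL interface.
conn.execute(
    "CREATE TABLE call_records (caller TEXT, callee TEXT, duration_s INTEGER)"
)

# Import the collected data into the table.
conn.executemany(
    "INSERT INTO call_records VALUES (?, ?, ?)",
    [("1001", "2001", 30), ("1002", "2002", 45), ("1001", "2003", 12)],
)

# Build an index on the attribute used in query conditions.
conn.execute("CREATE INDEX idx_caller ON call_records (caller)")

# Ask the database which access path it selects for this query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT callee FROM call_records WHERE caller = ?",
    ("1001",),
).fetchall()
plan_detail = " ".join(row[-1] for row in plan)  # last column holds the description

rows = conn.execute(
    "SELECT callee, duration_s FROM call_records WHERE caller = ? ORDER BY callee",
    ("1001",),
).fetchall()
```

In this sketch `plan_detail` names `idx_caller`, showing that the query condition on `caller` lets the database use the index instead of scanning the whole table.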

Claims


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 齐骥, 钱岭, 郭磊涛, 周大, 罗治国, 孙少陵, 张松波, 张卫平
Owner: CHINA MOBILE COMM GRP CO LTD