
Standard interface for accessing an HDFS (Hadoop Distributed File System) distributed storage system

A distributed storage and standard interface technology, applied in database distribution/replication, special data processing applications, instruments, etc. It addresses the problems of low efficiency, high cost, and insufficient SQL compatibility, with the effects of improved execution efficiency and easier cluster extension.

Status: Inactive
Publication Date: 2018-03-09
北京人大金仓信息技术股份有限公司

AI Technical Summary

Problems solved by technology

[0007] The solutions currently in use all fall short in some respect: they have insufficient SQL compatibility, low execution efficiency, or high cost, and therefore cannot meet users' needs well.



Examples


Embodiment Construction

[0024] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention belong to the protection scope of the present invention.

[0025] First, the parallel analysis engine as a whole is referred to as Panda.

[0026] Figure 1 shows the main components of a typical Panda cluster. There are several master nodes: the Panda master node, the HDFS master node (NameNode), and the YARN master node (ResourceManager). The Panda metadata service runs on the Panda master node; all other nodes are slave nodes. An HDFS DataNode, a YARN NodeManager, and a Panda Segment are deployed on each slave node. The Panda Segment will start multiple QEs (Query Executor, query ...
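
The cluster layout described in paragraph [0026] can be written down as a small data model. The following is only an illustrative sketch: the host names, the `Node` class, and the `build_cluster` helper are hypothetical and do not come from the patent; only the service names (NameNode, ResourceManager, DataNode, NodeManager, Panda master, Panda Segment) are taken from the text.

```python
from dataclasses import dataclass

# Service names taken from paragraph [0026]; everything else is hypothetical.
MASTER_SERVICES = ("PandaMaster", "NameNode", "ResourceManager")
SLAVE_SERVICES = ("DataNode", "NodeManager", "PandaSegment")


@dataclass(frozen=True)
class Node:
    host: str
    services: tuple


def build_cluster(master_hosts, slave_hosts):
    """Return the node list for a cluster.

    master_hosts maps each master service to the host that runs it.
    Every slave host is given all three slave services, as in the text.
    """
    missing = set(MASTER_SERVICES) - set(master_hosts)
    if missing:
        raise ValueError(f"missing master services: {sorted(missing)}")
    masters = [Node(master_hosts[s], (s,)) for s in MASTER_SERVICES]
    slaves = [Node(h, SLAVE_SERVICES) for h in slave_hosts]
    return masters + slaves


# Hypothetical host names, for illustration only.
cluster = build_cluster(
    {"PandaMaster": "m1", "NameNode": "m2", "ResourceManager": "m3"},
    ["s1", "s2"],
)
for node in cluster:
    print(node.host, ", ".join(node.services))
```

Placing each master service on its own host is a choice made for the sketch; the text does not say whether the master services share a machine.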


Abstract

The invention discloses a standard interface for accessing an HDFS (Hadoop Distributed File System) distributed storage system. The system table information of the database is stored in local files on the master node of a distributed database, and the data of all other tables is stored in HDFS. The master node accepts client connections and is the final handler of client commands: it parses and optimizes each query, distributes it to the segments, coordinates its execution, and stores the metadata of the whole system without storing any user data. The beneficial effects are as follows. Because the metadata is stored in the local file system on the master host and the data of all other tables is stored in HDFS, the segments store no state or data and are responsible only for computation, so that reading and writing, storage, and computation are separated. Any number of virtual segments can be started dynamically to execute a query, which improves execution efficiency. Because the segments are stateless, the cluster is also easier to extend.
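
The division of labor in the abstract (a master that keeps only metadata, stateless segments that only compute, and data that lives in shared storage) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the patented implementation: an in-memory dictionary stands in for HDFS, threads stand in for virtual segments, and the names `SHARED_STORE`, `virtual_segment`, `Master`, and `run_sum_by_key` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

# Stand-in for HDFS (assumption): block path -> rows of (key, amount).
SHARED_STORE = {
    "/panda/sales/part-0": [("a", 10), ("b", 5)],
    "/panda/sales/part-1": [("a", 7), ("c", 1)],
    "/panda/sales/part-2": [("b", 3)],
}


def virtual_segment(paths):
    """Stateless worker: read the assigned blocks, return a partial sum per key."""
    partial = {}
    for path in paths:
        for key, amount in SHARED_STORE[path]:
            partial[key] = partial.get(key, 0) + amount
    return partial


@dataclass
class Master:
    # Metadata only: table name -> block paths. No user rows are kept here.
    catalog: dict = field(default_factory=dict)

    def run_sum_by_key(self, table, n_segments):
        """Split a table's blocks over n_segments workers and merge the results."""
        paths = self.catalog[table]
        # Round-robin assignment of blocks to virtual segments.
        assignments = [paths[i::n_segments] for i in range(n_segments)]
        with ThreadPoolExecutor(max_workers=n_segments) as pool:
            partials = list(pool.map(virtual_segment, assignments))
        total = {}
        for partial in partials:
            for key, amount in partial.items():
                total[key] = total.get(key, 0) + amount
        return total


master = Master(catalog={"sales": sorted(SHARED_STORE)})
print(master.run_sum_by_key("sales", n_segments=2))
```

Because the workers hold no state, the number of segments can be changed per query without moving any data; in this sketch the merged result is the same for any positive `n_segments`.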

Description

Technical field

[0001] The invention relates to the technical field of big data, in particular to a standard interface for accessing an HDFS distributed storage system.

Background

[0002] Big data is currently a very active topic, and its core is how to store, analyze, and mine massive data to solve practical problems. How to store, query, and analyze TB/PB-level data is an unavoidable problem in the era of big data. HDFS is a distributed file system that runs on commodity hardware. It provides a highly fault-tolerant, high-throughput storage solution for massive data, is well suited to applications on large-scale data sets, and is widely used as a storage system for big data. Research on access to the HDFS distributed storage system has therefore become a focus of big data research.

[0003] HDFS is one of the core components of Hadoop. Although the MapReduce component in the Hadoop system can extract data from it and perfo...


Application Information

IPC(8): G06F17/30
CPC: G06F16/182, G06F16/27
Inventor: 袁远松
Owner: 北京人大金仓信息技术股份有限公司