Routing method and device for multiple HDFS clusters based on Alluxio

A routing and clustering technology applied in the field of distributed storage in the big data ecosystem. It addresses problems such as inconvenient management and maintenance, and aims to improve performance, response time, and availability.

Active Publication Date: 2022-03-18
SUNING COM CO LTD

AI Technical Summary

Problems solved by technology

[0010] The purpose of the present invention is to provide an Alluxio-based routing method and device for multiple HDFS clusters, in order to solve the inconvenient management and maintenance of the original "Federation + ViewFs" configuration scheme.


Embodiment Construction

[0039] To enable those skilled in the art to better understand the technical solution of the present invention, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments, which are not intended to limit the invention.

[0040] Alluxio is a memory-based distributed file system. It is middleware that sits between the underlying distributed file system and the upper-layer distributed computing framework. Its main responsibility is to provide data access services in the form of files, with the data held in memory or other storage media. In the field of big data, the bottom layer is a distributed storage system, such as Amazon S3 or Apache HDFS, while the higher-level applications are distributed computing frameworks, such as Spark, MapReduce, HBase, and Flink. These distributed frameworks often read and write data directly from the distributed file system, which is relatively inefficient a...



Abstract

The invention discloses a routing method and device for multiple HDFS clusters based on Alluxio. The method includes: mounting the HDFS paths that need to be routed to the Alluxio master node to form a routing table; having the HDFS client obtain the routing table from the Alluxio master node and pull it to the local side; and parsing the path of the client's RPC request according to the information in the routing table, then forwarding the request to the corresponding HDFS cluster according to the parsed path. The invention solves the problem of inconvenient management and maintenance of the Federation plus ViewFs configuration scheme.
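The three steps in the abstract (mount, pull, resolve) can be illustrated with a minimal Python sketch. It is not Alluxio's API and not the patented implementation: the `RoutingTable` class, its method names, and the `hdfs://cluster-a` style URIs are hypothetical. The sketch also assumes that the most specific (longest) mount point wins when several match and that the request path is forwarded unchanged; the abstract states neither.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


def _normalize(path: str) -> str:
    """Collapse a path to '/a/b' form (no trailing slash, root stays '/')."""
    parts = [p for p in path.split("/") if p]
    return "/" + "/".join(parts)


@dataclass
class RoutingTable:
    """Hypothetical routing table: mount point -> HDFS cluster URI."""
    mounts: Dict[str, str] = field(default_factory=dict)

    def mount(self, mount_point: str, cluster_uri: str) -> None:
        # Step 1: register a path that should be routed to a given cluster.
        self.mounts[_normalize(mount_point)] = cluster_uri.rstrip("/")

    def snapshot(self) -> "RoutingTable":
        # Step 2: stands in for the client pulling the table to the local side.
        return RoutingTable(dict(self.mounts))

    def resolve(self, request_path: str) -> Tuple[str, str]:
        # Step 3: pick the cluster for a request path.
        # Assumption: the longest matching mount point wins.
        path = _normalize(request_path)
        best = None
        for mount_point in self.mounts:
            prefix = mount_point.rstrip("/") + "/"
            if path == mount_point or path.startswith(prefix):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise KeyError(f"no route for {path}")
        return self.mounts[best], path


master = RoutingTable()
master.mount("/user", "hdfs://cluster-a")
master.mount("/user/logs", "hdfs://cluster-b")

local = master.snapshot()
print(local.resolve("/user/logs/2022/03/18"))
print(local.resolve("/user/alice"))
```

Matching is done on whole path components, so a request for `/userdata` does not match the `/user` mount point and raises `KeyError` in this sketch.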

Description

Technical field

[0001] The invention belongs to the field of distributed storage of big data ecosystems, and in particular relates to a routing method and device for multi-HDFS clusters based on Alluxio.

Background technique

[0002] Hadoop Distributed File System (HDFS) is a distributed file system designed to run on general-purpose hardware. HDFS is a highly fault-tolerant system suitable for deployment on cheap machines. Multiple computers work together on a network (sometimes called a cluster) to solve a certain problem just like a single system. We call such a system a distributed system. Distributed file systems are a subset of distributed systems, and the problem they solve is data storage. In other words, they are storage systems that span multiple computers. Data stored on a distributed file system is automatically distributed across different nodes. Distributed file systems have broad application prospects in the era of big data, and they provide the required s...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L45/74, H04L45/745, H04L67/63, H04L67/50, H04L67/141, H04L67/133, H04L67/10, H04L67/1097, H04L43/08
CPC: H04L45/74, H04L45/742, H04L45/745, H04L67/34, H04L67/141, H04L67/10, H04L67/1097, H04L43/08, H04L67/133, H04L67/63
Inventor: 郭业俊, 林海强, 王志强, 许立群
Owner: SUNING COM CO LTD