
Method for realizing HDFS high-availability scheme

An implementation method for an HDFS high-availability solution, applied in the field of HDFS high availability. It addresses problems such as system resource consumption, file system inaccessibility, and loss of file data, and reduces the time required for NameNode switching.

Inactive Publication Date: 2015-10-28
HANGZHOU GEEKOO INFORMATION TECH
6 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

HDFS also has a fatal weakness: a single point of failure (SPOF). Its central server, the NameNode, is a single node, and after a crash the system can take hours to restart. As a result, the file system becomes inaccessible and file data may be lost.



Examples


Embodiment Construction

[0026] The invention is further described below with reference to the accompanying drawings and a specific implementation process:

[0027] The implementation details of the invention are described with reference to the overall architecture diagram in Figure 1:

[0028] In the invention, the NameNode is the central node of the cluster and stores the metadata of the entire file system. It also maintains the namespace and responds to read and write requests from clients. A DataNode stores the actual application data. HDFS splits files that exceed a certain size into fixed-size data blocks and stores them on DataNodes. Each block usually has multiple replicas stored on different DataNodes, so that the failure of one DataNode does not make the block unavailable.
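The block splitting and replication described in [0028] can be illustrated with a minimal Python sketch. The function names `split_into_blocks` and `place_replicas`, the block size, and the round-robin placement are all assumptions made for illustration; the source does not describe a placement policy, and real HDFS placement also takes rack topology into account.

```python
def split_into_blocks(file_size, block_size):
    """Return (offset, length) pairs for the fixed-size blocks of a file.

    The last block may be shorter than block_size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]


def place_replicas(num_blocks, datanodes, replication):
    """Assign each block to `replication` distinct DataNodes.

    Hypothetical round-robin policy, used only to show that replicas of
    one block land on different nodes.
    """
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            datanodes[(block_id + r) % len(datanodes)]
            for r in range(replication)
        ]
    return placement


# A 300-byte file with 128-byte blocks, replicated 3 times over 4 nodes.
blocks = split_into_blocks(300, 128)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"], 3)
```

Because every block has replicas on three different nodes in this sketch, losing any single DataNode still leaves at least two copies of each block.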

[0029] In the NameNode, all metadata is kept in memory for fast access, so the NameNode needs more memory than DataNo...


Abstract

The present invention relates to a method for realizing an HDFS (Hadoop Distributed File System) high-availability scheme by using a hot standby NameNode. On top of the original HDFS configuration, the method introduces one hot standby NameNode, the Standby NameNode. The system keeps the namespace consistent by continuously synchronizing the in-memory metadata of the Active NameNode with the in-memory metadata of the Standby NameNode. When the NameNode is switched, only the data in the Editlog needs to be imported, which shortens the time HDFS needs for switching. When the Active NameNode goes down and becomes unavailable, the system can switch to the Standby NameNode automatically and quickly. The method can therefore provide a high-availability service for HDFS.
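The switching behaviour in the abstract can be sketched as a small in-memory simulation. Everything named here (`NameNode`, `EditLog`, `write`, `catch_up`, the transaction-id scheme, and synchronization by replaying a shared log) is an assumption for illustration and not the patent's actual mechanism; the point is only that a standby that is kept in sync has very few Editlog entries left to import at failover time.

```python
class NameNode:
    """Holds an in-memory namespace and the id of the last applied edit."""

    def __init__(self, name):
        self.name = name
        self.namespace = {}  # path -> metadata
        self.last_txid = 0

    def apply(self, txid, op, path, meta=None):
        if op == "create":
            self.namespace[path] = meta
        elif op == "delete":
            self.namespace.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
        self.last_txid = txid


class EditLog:
    """Append-only log of namespace edits; transaction ids start at 1."""

    def __init__(self):
        self.entries = []

    def append(self, op, path, meta=None):
        txid = len(self.entries) + 1
        self.entries.append((txid, op, path, meta))
        return txid

    def since(self, txid):
        # Entry with transaction id n is stored at index n - 1.
        return self.entries[txid:]


def write(active, log, op, path, meta=None):
    """Record an edit in the log, then apply it on the active node."""
    txid = log.append(op, path, meta)
    active.apply(txid, op, path, meta)


def catch_up(node, log):
    """Replay only the edits the node has not applied; return their count."""
    pending = log.since(node.last_txid)
    for txid, op, path, meta in pending:
        node.apply(txid, op, path, meta)
    return len(pending)


log = EditLog()
active, standby = NameNode("active"), NameNode("standby")

write(active, log, "create", "/a", {"size": 1})
write(active, log, "create", "/b", {"size": 2})
write(active, log, "delete", "/a")
synced = catch_up(standby, log)      # continuous synchronization step

write(active, log, "create", "/c", {"size": 3})
replayed = catch_up(standby, log)    # failover: import the remaining edits
```

In this run the standby replays a single edit at failover instead of rebuilding the whole namespace, which is the effect the abstract attributes to importing only the Editlog data.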

Description

Technical field

[0001] The invention relates to high availability for the Hadoop distributed file system, and in particular to a method for realizing an HDFS high-availability scheme that introduces a hot standby NameNode.

Background

[0002] Hadoop provides a distributed file system, HDFS, that can run on commodity PCs. It offers applications high throughput and is well suited to large-scale data processing. HDFS was designed on the assumption that moving computation to the data is cheaper than moving data to the computation. The core idea is to move application computation to where the data is stored and perform it there, which greatly improves Hadoop's throughput.

[0003] HDFS (Hadoop Distributed File System) is an underlying module of the Hadoop framework. Other components of Hadoop, such as the programm...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): H04L12/24, H04L29/08
CPC: H04L41/0668, H04L41/0695, H04L67/1095, H04L67/1097
Inventor: 任祖杰, 张纪林, 余卓尔, 兰云龙
Owner: HANGZHOU GEEKOO INFORMATION TECH