Method and device for remote backup of data in HDFSs

An off-site backup technology, applied in the field of big data, that can solve problems such as the low efficiency of backing up massive data

Inactive Publication Date: 2018-07-03
AEROSPACE INFORMATION
5 Cites, 3 Cited by

AI Technical Summary

Problems solved by technology

At present, remote disaster recovery for big data stored in HDFS usually relies on an overall backup of the stored data. Because HDFS holds a very large amount of data, this approach makes remote disaster recovery of massive data inefficient.

Method used

The changed data blocks in HDFS are determined and then backed up one by one to a remote HDFS backup cluster according to their data block information, which reduces the amount of data backed up and the bandwidth the backup requires.

Examples

Embodiment Construction

[0033] To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0034] Figure 1 is a schematic flow chart of a method for off-site backup of data in HDFS provided by an embodiment of the present application. As shown in the figure, the method includes:

[0035] S11. Determine the data block information corresponding to the changed data block in HDFS. At least one file is saved in HDFS, and each file includes at leas...

Abstract

The invention provides a method and device for remote backup of data in HDFS. The method comprises the following steps: determining the changed data blocks in an HDFS, wherein the HDFS stores at least one file and each file comprises at least one data block; and backing up the changed data blocks one by one to a remote HDFS backup cluster according to the data block information corresponding to the changed data blocks. Because only the changed data blocks are backed up, one by one, the amount of backed-up data and the bandwidth required by the backup are reduced, lightweight remote backup is realized, and the efficiency of remote disaster recovery is improved.
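The two steps in the abstract (determine the changed blocks, then copy them one by one) can be illustrated with a minimal Python sketch. It is not the patented implementation: the `BlockInfo` record, the `generation` counter used to detect a change, and the `read_block` / `write_remote` callbacks are all hypothetical stand-ins, and plain dictionaries take the place of the local and remote HDFS clusters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockInfo:
    """Hypothetical data block information: owning file, block id, change counter."""
    file_path: str
    block_id: int
    generation: int  # assumed to increase whenever the block's content changes


def changed_blocks(current, last_backed_up):
    """Return blocks that are new or whose generation differs from the last backup."""
    changed = []
    for info in current:
        key = (info.file_path, info.block_id)
        if last_backed_up.get(key) != info.generation:
            changed.append(info)
    return changed


def backup_one_by_one(current, last_backed_up, read_block, write_remote):
    """Copy each changed block to the remote side, one block at a time."""
    copied = []
    for info in changed_blocks(current, last_backed_up):
        write_remote(info, read_block(info))
        # Record the generation that is now safely stored remotely.
        last_backed_up[(info.file_path, info.block_id)] = info.generation
        copied.append(info.block_id)
    return copied


# In-memory stand-ins for the local cluster and the remote backup cluster.
local_store = {
    ("/logs/a.log", 1): b"aaa",
    ("/logs/a.log", 2): b"bbb",
    ("/logs/b.log", 3): b"ccc",
}
remote_store = {}

current = [
    BlockInfo("/logs/a.log", 1, 1),  # unchanged since the last backup
    BlockInfo("/logs/a.log", 2, 2),  # modified since the last backup
    BlockInfo("/logs/b.log", 3, 1),  # never backed up
]
last_backed_up = {("/logs/a.log", 1): 1, ("/logs/a.log", 2): 1}


def read_block(info):
    return local_store[(info.file_path, info.block_id)]


def write_remote(info, data):
    remote_store[(info.file_path, info.block_id)] = data


copied = backup_one_by_one(current, last_backed_up, read_block, write_remote)
print(copied)  # [2, 3]
```

In this toy run only blocks 2 and 3 cross the network, while block 1 is skipped, which is the source of the bandwidth saving the abstract describes.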

Description

Technical field

[0001] The invention relates to the field of big data, and in particular to a method and a device for off-site backup of data in HDFS.

Background

[0002] HDFS (Hadoop Distributed File System) is widely used because of its high fault tolerance, high reliability, and high scalability. HDFS adopts a master-slave architecture: an HDFS cluster includes a NameNode master node and many DataNode slave nodes. As the master node of the HDFS file system, the NameNode is responsible for maintaining the namespace of the entire file system and managing the metadata of all files and directories. As a slave node of the HDFS file system, each DataNode is responsible for storing multiple fixed-size data blocks (the default block size is 64 MB or 128 MB). The NameNode stores data block-related information, including the mapping between files and data blocks and the mapping between data blocks and DataNodes.

[0003] Dur...
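The two NameNode mappings described in paragraph [0002] (file to data blocks, and data block to DataNodes) can be pictured with a small Python sketch. The class and method names here are hypothetical and chosen only for illustration; they are not Hadoop's API.

```python
from collections import defaultdict


class NameNodeMetadata:
    """Toy model of the block-related metadata a NameNode keeps (illustrative only)."""

    def __init__(self):
        self.file_to_blocks = defaultdict(list)     # file path -> ordered block ids
        self.block_to_datanodes = defaultdict(set)  # block id -> DataNodes holding a replica

    def add_block(self, path, block_id, datanodes):
        """Register a block of a file and the DataNodes that store it."""
        self.file_to_blocks[path].append(block_id)
        self.block_to_datanodes[block_id].update(datanodes)

    def locate(self, path):
        """Resolve a file to its blocks and, for each block, its DataNodes."""
        return [
            (block_id, sorted(self.block_to_datanodes.get(block_id, ())))
            for block_id in self.file_to_blocks.get(path, [])
        ]


meta = NameNodeMetadata()
meta.add_block("/data/f", 1, ["dn1", "dn2"])
meta.add_block("/data/f", 2, ["dn2", "dn3"])
print(meta.locate("/data/f"))  # [(1, ['dn1', 'dn2']), (2, ['dn2', 'dn3'])]
```

A block-level backup needs both lookups: the first tells it which blocks make up a file, and the second tells it where each block can be read from.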

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/14, H04L29/08
CPC: G06F11/1435, H04L67/10, H04L67/1095
Inventor: 林文辉
Owner: AEROSPACE INFORMATION