Data backup method and system based on Hadoop distributed file system

A distributed file system data backup technology, applied in the fields of file system data error detection, digital data authentication, and redundancy in computing, which protects data integrity, guards against disasters, and improves security.

Pending Publication Date: 2021-05-14
STATE GRID GANSU ELECTRIC POWER CORP +3

AI Technical Summary

Problems solved by technology

When a catastrophic failure occurs in a Hadoop cluster, the market lacks a method for protecting and restoring the data from a remote, external location.

Method used

A folder is backed up in snapshot mode through an HDFS client: the client generates a point-in-time snapshot of the folder, and the data in the folder is stored on an external storage medium.

Embodiment Construction

[0022] The technical solution of the present invention will be further described below in conjunction with the accompanying drawings.

[0023] The Hadoop distributed file system (HDFS) of the present invention includes a backup proxy node and a storage server. The backup server is an independent machine, which can be a physical machine or a virtual machine. A proxy service is installed on the backup proxy node. The proxy service downloads the Hadoop configuration and the Kerberos user authentication, creates an HDFS client, and interacts with the Hadoop cluster through that HDFS client. The storage server contains a storage medium and a file index database. The storage medium is used to save system file data, and the file index database is used to save system file metadata.
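The split between a storage medium and a file index database can be illustrated with a minimal Python sketch. It is not the patent's implementation: it assumes a local directory stands in for the storage medium and a SQLite table stands in for the file index database, and the names `open_index`, `store_file`, `restore_file`, and the `file_index` columns are all hypothetical.

```python
import hashlib
import sqlite3
import time
from pathlib import Path


def open_index(db_path):
    """Open (or create) the hypothetical file index database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_index ("
        " hdfs_path TEXT NOT NULL,"
        " snapshot TEXT NOT NULL,"
        " stored_at TEXT NOT NULL,"
        " size INTEGER NOT NULL,"
        " sha256 TEXT NOT NULL,"
        " backed_up REAL NOT NULL,"
        " PRIMARY KEY (hdfs_path, snapshot))"
    )
    return conn


def store_file(conn, media_root, hdfs_path, snapshot, data):
    """Write file data to the storage medium and its metadata to the index."""
    digest = hashlib.sha256(data).hexdigest()
    # Assumption: backups are laid out as <media_root>/<snapshot>/<hdfs path>.
    target = Path(media_root) / snapshot / hdfs_path.lstrip("/")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    conn.execute(
        "INSERT OR REPLACE INTO file_index VALUES (?, ?, ?, ?, ?, ?)",
        (hdfs_path, snapshot, str(target), len(data), digest, time.time()),
    )
    conn.commit()
    return digest


def restore_file(conn, hdfs_path, snapshot):
    """Look up a file in the index, read it back, and verify its checksum."""
    row = conn.execute(
        "SELECT stored_at, sha256 FROM file_index"
        " WHERE hdfs_path = ? AND snapshot = ?",
        (hdfs_path, snapshot),
    ).fetchone()
    if row is None:
        raise KeyError(f"{hdfs_path} not found in snapshot {snapshot}")
    data = Path(row[0]).read_bytes()
    if hashlib.sha256(data).hexdigest() != row[1]:
        raise ValueError(f"checksum mismatch for {hdfs_path}")
    return data
```

Keeping the metadata in a separate index means a restore can locate and verify a file without scanning the storage medium; the checksum column is an addition of this sketch, not something the text above specifies.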

[0024] An HDFS cluster is composed of one Namenode and a certain number of Datanodes. The Namenode is a central server responsible for managing the namespace of the file system and client acc...

Abstract

The invention discloses a data backup method and system based on the Hadoop distributed file system (HDFS). In the method, a folder is backed up in snapshot mode through an HDFS client: the client generates a point-in-time snapshot of the folder, and the data in the folder is stored on an external storage medium. The system comprises an HDFS system and a storage server connected to it. The storage server comprises a storage medium, which stores system file data, and a file index database, which stores system file metadata. The method improves the security of data in HDFS, protects the Hadoop cluster against disasters, allows system data to be recovered automatically and quickly, and protects the integrity and consistency of company data.
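The snapshot-then-copy flow can be sketched with the standard HDFS shell commands. The abstract does not say which client interface is used, so the use of the `hdfs` command line here is an assumption, as are the function name `snapshot_backup_commands` and the example paths. The sketch only builds the command lines; it does not run them.

```python
import shlex


def snapshot_backup_commands(hdfs_dir, snapshot_name, local_target):
    """Build the HDFS shell commands for one snapshot-based backup run.

    Hypothetical helper: returns argument lists suitable for subprocess.run,
    but does not execute anything itself.
    """
    # HDFS exposes a snapshot's contents under <dir>/.snapshot/<name>.
    snapshot_path = f"{hdfs_dir.rstrip('/')}/.snapshot/{snapshot_name}"
    return [
        # 1. An administrator marks the directory as snapshottable.
        ["hdfs", "dfsadmin", "-allowSnapshot", hdfs_dir],
        # 2. Create the point-in-time snapshot of the directory.
        ["hdfs", "dfs", "-createSnapshot", hdfs_dir, snapshot_name],
        # 3. Copy the frozen snapshot contents out to external storage.
        ["hdfs", "dfs", "-get", snapshot_path, local_target],
    ]


if __name__ == "__main__":
    # Example paths are placeholders, not values from the patent.
    for cmd in snapshot_backup_commands(
        "/warehouse/sales", "backup-20210514", "/mnt/backup"
    ):
        print(shlex.join(cmd))
```

Copying from the `.snapshot` path rather than the live directory means files that change during the copy do not affect the backup, which is the point of taking the snapshot first.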

Description

Technical field

[0001] The invention relates to a data backup method and system, in particular to a data backup method and system based on a Hadoop distributed file system.

Background technique

[0002] "Big data" has existed for a long time in the fields of physics, biology, environmental ecology, military, finance, communication and other industries, but it has attracted people's attention because of the development of the Internet and information industry in recent years. With the rapid development and popularization of computer and information technology, big data has increasingly demonstrated its advantages, and the scale of industrial application systems has expanded rapidly, and the data generated by industrial applications has grown explosively.

[0003] Hadoop implements a highly fault-tolerant distributed file system (Hadoop Distributed File System, HDFS), which is used to solve problems such as low-cost hardware, scalable super-large clusters, and storage and acces...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F16/182, G06F16/13, G06F11/14, G06F21/44, G06F21/64
CPC: G06F16/182, G06F16/134, G06F11/1448, G06F21/44, G06F21/64
Inventor: 段军红, 靳丹, 张旭, 杨波, 王琼
Owner: STATE GRID GANSU ELECTRIC POWER CORP