Massive small file storage and management method and system

The technology applies to the field of massive small-file storage and management methods and systems. It addresses the name node server's memory consumption, low retrieval and update efficiency, and data access failure. Its effects are avoiding a single point of failure, balancing network load, and improving the efficiency of small-file processing.

Active Publication Date: 2015-08-05
GLOBAL ENERGY INTERCONNECTION RES INST CO LTD +3

AI Technical Summary

Problems solved by technology

A large number of small files consumes the memory of the name node server, and retrieval and update efficiency is low.
In addition, Hadoop has only one name node (NameNode), which is responsible for managing the file system namespace and controlling access by external clients. Once the NameNode fails, data access fails.




Embodiment Construction

[0042] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0043] A method for storing and managing a large number of small files, the method comprising:

[0044] storing a large number of small files and writing their metadata into the name node service network;

[0045] managing the metadata through the name node service network, which responds to client access requests.

[0046] As shown in Figure 1, a method for storing a large number of small files includes the following steps:

[0047] Step 101, classifying a large number of small files to generate the metadata file;

[0048] Step 102, using the MapReduce programming framework to decompose and process the data blocks to obtain data values;

[0049] Using the Map and Reduce functions of the MapReduce programming framework, the Map function decomposes the incoming intermediate data file to generate an intermediate key/value data sequence, and the Reduce function analyzes and merg...
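The Map and Reduce roles described in [0049] can be illustrated with a minimal in-memory sketch. It is not the patent's implementation and does not use Hadoop. The record layout `(file_name, file_type, size_in_bytes)`, the function names, and the choice to sum sizes per file type in the Reduce step are all assumptions made for illustration, since the source text is truncated before it states what the Reduce function produces.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator

# Hypothetical record layout: (file_name, file_type, size_in_bytes).
Record = tuple[str, str, int]


def map_fn(record: Record) -> Iterator[tuple[str, int]]:
    """Map: decompose one input record into intermediate (key, value) pairs."""
    _name, file_type, size = record
    yield file_type, size


def reduce_fn(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: merge all values that share a key (here, by summing them)."""
    return key, sum(values)


def run_mapreduce(
    records: Iterable[Record],
    mapper: Callable[[Record], Iterator[tuple[str, int]]],
    reducer: Callable[[str, list[int]], tuple[str, int]],
) -> dict[str, int]:
    """Run map, group by key (the shuffle step), then reduce, all in memory."""
    groups: dict[str, list[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    sample = [("a.log", "log", 100), ("b.log", "log", 50), ("c.csv", "csv", 10)]
    print(run_mapreduce(sample, map_fn, reduce_fn))
```

In a real Hadoop job the grouping step is performed by the framework between the Map and Reduce phases; the dictionary here only stands in for that shuffle.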



Abstract

The invention provides a method and a system for storing and managing massive numbers of small files. In the method, the small files are stored and their metadata is written into a name node service network, which manages the metadata and responds to client access requests. The system comprises the name node service network and a massive small file storage system. The method and system use peer-to-peer computing technology to avoid a single point of failure, and provide a keyword-based routing lookup method that balances network load and improves query efficiency.
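The abstract does not say how the keyword-based routing lookup works. As one possible reading, the sketch below spreads metadata across several peer name nodes with a consistent-hash ring, so that no single node holds all metadata. Consistent hashing is a substituted, well-known technique, not the patent's stated method, and the class `NameNodeRing`, the node names, and the metadata fields are hypothetical.

```python
import bisect
import hashlib
from typing import Optional


def _hash(key: str) -> int:
    """Map a string to a point on the ring using the first 8 bytes of SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:8], "big")


class NameNodeRing:
    """Hypothetical peer-to-peer name node network keyed by file keyword."""

    def __init__(self, nodes: list[str]) -> None:
        if not nodes:
            raise ValueError("at least one name node is required")
        # Each node owns the arc of the ring that ends at its hash point.
        self._ring = sorted((_hash(node), node) for node in nodes)
        self._points = [point for point, _ in self._ring]
        self._metadata: dict[str, dict[str, dict]] = {node: {} for node in nodes}

    def route(self, keyword: str) -> str:
        """Return the name node responsible for this keyword."""
        idx = bisect.bisect_left(self._points, _hash(keyword)) % len(self._ring)
        return self._ring[idx][1]

    def put(self, keyword: str, meta: dict) -> None:
        """Write metadata to the node that owns the keyword."""
        self._metadata[self.route(keyword)][keyword] = meta

    def get(self, keyword: str) -> Optional[dict]:
        """Answer a client lookup by routing to the owning node."""
        return self._metadata[self.route(keyword)].get(keyword)


if __name__ == "__main__":
    ring = NameNodeRing(["nn-1", "nn-2", "nn-3"])
    ring.put("sensor-0001.dat", {"block": "blk_1", "offset": 0})
    print(ring.route("sensor-0001.dat"), ring.get("sensor-0001.dat"))
```

Because every peer can compute the same route from the keyword alone, a client can contact any node to locate metadata. Replication and node failure handling, which a production design would need to avoid data loss, are omitted here.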

Description

Technical field

[0001] The invention relates to a storage and management method and system, in particular to a storage and management method and system for massive small files.

Background

[0002] The Hadoop platform adopts the manager/worker model and consists of a name node (NameNode) server and multiple data node (DataNode) servers. Both the NameNode server and the DataNode servers are deployed on ordinary PCs, which greatly reduces the cost of implementing a distributed system. In Hadoop, the NameNode manages the metadata of the file system and responds to client requests by returning file locations and similar information. Therefore, the limit on the number and size of files is determined by the NameNode. Assume that the metadata of one small data file requires 1KB (1024B) of memory. If there are 10 million such files, and a Block is assigned to each file, then the NameNode will consume about 10GB of memory to save the information of these Blocks; if the defau...
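The memory estimate in [0002] can be checked with a few lines of arithmetic. The function name is hypothetical; the figures (1024 bytes of metadata per file, 10 million files, one Block per file) come from the paragraph itself.

```python
def namenode_memory_bytes(file_count: int, bytes_per_file: int = 1024) -> int:
    """Estimate NameNode memory when each small file needs a fixed metadata size."""
    return file_count * bytes_per_file


if __name__ == "__main__":
    total = namenode_memory_bytes(10_000_000)
    print(f"{total} bytes")
    print(f"{total / 10**9:.2f} GB (decimal)")
    print(f"{total / 2**30:.2f} GiB (binary)")
```

The product is 10,240,000,000 bytes, which is 10.24 GB in decimal units or roughly 9.54 GiB, consistent with the paragraph's figure of about 10GB.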


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/134, G06F16/1827
Inventor: 周爱华, 孟祥君, 何金陵, 丁杰, 戴江鹏, 杨佩, 饶玮, 潘森
Owner: GLOBAL ENERGY INTERCONNECTION RES INST CO LTD