Small file storage method based on Hadoop distributed file system

A distributed file system and small file technology, applied in the computer field, that addresses the problems of high NameNode memory usage and low storage access efficiency, and achieves low-latency access, a reduced storage burden, and high efficiency

Active Publication Date: 2014-06-11
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

This method can effectively overcome the shortcomings of high memory usage of the name node NameNode and low storage a




Example Embodiment

[0042] The present invention will be further described below in conjunction with the drawings.

[0043] With reference to Figure 1, the specific implementation steps of the present invention are as follows:

[0044] Step 1. Add two servers.

[0045] In addition to the Hadoop distributed file system HDFS, a new web server (Webserver) is added to monitor file read and write requests, and a small file processing server is added to process small files. The system architecture of the present invention thus consists of three parts: the web server, the small file processing server, and the original HDFS system. The small file processing server mainly performs file merging, file mapping, and file prefetching operations on small files.
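The merging and mapping operations named in this step can be illustrated with a minimal sketch. The excerpt does not describe the patent's actual merged-file layout or index format, so everything below is an assumption made for illustration: the names IndexEntry, merge_small_files, and read_small_file are hypothetical, and the sketch works on in-memory bytes rather than on HDFS.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class IndexEntry:
    """Location of one small file inside a merged blob (hypothetical layout)."""
    offset: int
    length: int


def merge_small_files(files: Dict[str, bytes]) -> Tuple[bytes, Dict[str, IndexEntry]]:
    """Concatenate small files into one blob and build a name -> location map.

    Assumption: simple back-to-back concatenation with no padding or headers.
    """
    merged = bytearray()
    index: Dict[str, IndexEntry] = {}
    for name, data in files.items():
        # Record where this file starts before appending its bytes.
        index[name] = IndexEntry(offset=len(merged), length=len(data))
        merged.extend(data)
    return bytes(merged), index


def read_small_file(merged: bytes, index: Dict[str, IndexEntry], name: str) -> bytes:
    """Recover one small file from the merged blob using the mapping."""
    entry = index[name]
    return merged[entry.offset:entry.offset + entry.length]
```

With such a scheme, only the merged blob would need to be stored as a single HDFS file, while the mapping lets an individual small file be located by name. Whether the patent keeps its mapping in this form is not stated in the excerpt.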

[0046] Step 2. Determine whether the file is a small file.

[0047] The web server (Webserver) judges whether the file named in the monitored request is smaller than 16 MB. If it is, the file is regarded as a small file and the method proceeds to step 4; otherwise, it is...
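The size check in this step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the constant and function names are hypothetical, and the sketch assumes that "16M" means 16 binary megabytes (16 × 1024 × 1024 bytes), which the source does not state.

```python
# Assumption: the 16M threshold is interpreted as 16 MiB.
SMALL_FILE_THRESHOLD_BYTES = 16 * 1024 * 1024


def is_small_file(size_bytes: int) -> bool:
    """Return True when a file is strictly smaller than the threshold."""
    if size_bytes < 0:
        raise ValueError("file size cannot be negative")
    return size_bytes < SMALL_FILE_THRESHOLD_BYTES
```

A file of exactly the threshold size is not treated as small here, which matches the "smaller than" wording of the step.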


Abstract

The invention discloses a small file storage method based on the Hadoop distributed file system. The method comprises the steps of: (1) adding two servers; (2) judging whether a file is a small file; (3) judging the request state of a large file; (4) judging the request state of a small file; (5) pre-processing a write request; (6) processing the write request; (7) checking the cache; (8) pre-processing a read request; (9) processing the read request; (10) separating small files; (11) establishing a prefetching record; and (12) updating the prefetching record. Compared with existing methods for storing large numbers of small files, the method preserves the generality of the system, offers high read and write performance and efficiency, and eases the memory burden on the NameNode, thereby addressing the high NameNode memory usage and low storage access efficiency that arise when storing large numbers of small files. The method can be used by a distributed file system to store and manage large numbers of small files.

Description

Technical field

[0001] The invention belongs to the field of computer technology, and further relates to a small file storage method based on the Hadoop Distributed File System (HDFS) in the field of optimized distributed data storage. The invention uses a small file processing server independent of the HDFS system to perform operations such as merging, mapping, and prefetching of small files, and can be applied to efficiently store and access a large number of small files.

Background art

[0002] The Hadoop Distributed File System, HDFS for short, is a distributed file system. At present, in the field of distributed file storage technology represented by HDFS, HDFS is widely used to efficiently process various large files. However, as user needs change, the number of small files is increasing, and the interaction between users and the NameNode is becoming more and more frequent. Due to HDFS's own master-slave structure and metadat...


Application Information

IPC(8): H04L29/08
Inventor: 樊凯, 李慧莹, 李晖
Owner: XIDIAN UNIV