A method and device for storing and accessing massive small files

A technology for storing and accessing massive numbers of small files, applied in special data processing applications, instruments, electronic digital data processing, etc. It addresses the problems of a reduced small-file access rate, a bloated file directory system, and a cache that cannot hold all directory indexes, with the effects of reducing storage overhead, saving storage space, and avoiding waste.

Active Publication Date: 2014-10-08
BEIJING SOHU NEW MEDIA INFORMATION TECH

AI Technical Summary

Problems solved by technology

Storing a small number of small files on a disk or RAID disk array can still meet application requirements. When the number of stored small files grows to a massive scale, however, the directory index structure of the storage system becomes very large, with many nodes, and the file directory system built for access becomes bloated, which increases storage overhead.
Moreover, the cache set up to improve access speed can no longer hold all directory indexes, so at least part of the directory index data is moved to disk. Accessing a single small file may then require multiple I/O operations, which greatly reduces the access rate of small files.
In addition, small files usually carry metadata describing their attributes (such as read and write times, access time, etc.). Storing massive small-file data therefore requires additional space for this metadata, which users do not need, and storage space is wasted.



Examples


Embodiment 500

[0081] See Figure 5, which shows the structural framework of the device for accessing massive small files in Embodiment 4 of the present application. The device embodiment 500 includes a receiving unit 501, a retrieval unit 502, an acquisition unit 503, and a readout unit 504, wherein:

[0082] The receiving unit 501 is configured to receive the file name of the small file to be accessed;

[0083] The retrieval unit 502 is configured to search the index table by the file name of the small file to obtain the identification number of the file group in which the small file is stored and the sequence number of the small file within that group. The index table uses the identification number of the file group and the sequence number of the small file within the file group as an index, and stores the correspondence between that index and the file name of the small file. A file group includes at least two small files;

[0084] The acquisition unit 503 is configured to obtain the start...
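The lookup path through units 501 to 504 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text above is truncated before it says what the acquisition unit obtains, so the per-group table of `(start, length)` pairs, the function name `read_small_file`, and the in-memory `medium` are all assumptions made for the example. The index is also held as a simple name-to-pair mapping for brevity.

```python
def read_small_file(name, index, group_table, medium):
    """Return the bytes of one small file (hypothetical helper)."""
    # Retrieval step (unit 502): file name -> (group id, sequence number).
    group_id, seq_no = index[name]
    # Acquisition step (unit 503): ASSUMED layout, one (start, length)
    # entry per member of the group, ordered by sequence number.
    start, length = group_table[group_id][seq_no]
    # Readout step (unit 504): slice the payload out of the medium.
    return bytes(medium[start:start + length])


# Example data: three small files packed back to back in group 7.
medium = b"helloworld!!"
group_table = {7: [(0, 5), (5, 5), (10, 2)]}
index = {"a.txt": (7, 0), "b.txt": (7, 1), "c.txt": (7, 2)}

print(read_small_file("b.txt", index, group_table, medium))
```

Because the index holds only a group identifier and a sequence number per file name, it stays small; the byte-level addresses live in the per-group table, which in this sketch is consulted only after the group is known.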


Abstract

An embodiment of the application discloses a method for storing massive small files. The method comprises the steps of: writing the received small files sequentially into a memory element; determining the file group to which each small file belongs, and the sequence number of the small file within that file group, according to the starting address and capacity of the small file in the memory element, wherein a file group includes at least two small files; and taking the identification number of the file group and the sequence number of the small file as an index, and establishing a correspondence between the index and the file name of the small file to complete storage. The embodiment of the application further discloses a method for accessing massive small files, as well as a storing device and an accessing device for massive small files that correspond to the storing method and the accessing method. The methods and devices provided by the invention can compress the directory index structure of the small files, save storage cost, and improve the storage and access efficiency of small files.
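The three storage steps in the abstract can be sketched as follows. This is an illustrative reading under stated assumptions, not the claimed method: the class name `SmallFileStore`, the fixed `GROUP_CAPACITY`, and the rule that the group identifier is the start address divided by that capacity are all hypothetical. The sketch also does not enforce the requirement that a file group contain at least two small files.

```python
from dataclasses import dataclass, field

GROUP_CAPACITY = 64  # hypothetical span of one file group, in bytes


@dataclass
class SmallFileStore:
    data: bytearray = field(default_factory=bytearray)  # stands in for the memory element
    index: dict = field(default_factory=dict)    # file name -> (group id, sequence number)
    groups: dict = field(default_factory=dict)   # group id -> [(start, length), ...]

    def store(self, name: str, payload: bytes) -> tuple:
        # Step 1: sequential write, so each file starts where the last ended.
        start = len(self.data)
        self.data += payload
        # Step 2 (ASSUMED rule): derive the group from the start address,
        # then number the file by its arrival order within that group.
        group_id = start // GROUP_CAPACITY
        members = self.groups.setdefault(group_id, [])
        seq_no = len(members)
        members.append((start, len(payload)))
        # Step 3: tie the (group id, sequence number) pair to the file name.
        self.index[name] = (group_id, seq_no)
        return group_id, seq_no


store = SmallFileStore()
for name, size in [("a.jpg", 10), ("b.jpg", 20), ("c.jpg", 50), ("d.jpg", 5)]:
    print(name, store.store(name, b"x" * size))
```

Under these assumptions the first three files start at addresses 0, 10, and 30 and share group 0, while the fourth starts at address 80 and opens group 1. The name index carries two small integers per file instead of a full directory entry, which is the compression the abstract describes.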

Description

Technical field

[0001] The present application relates to the technical field of data storage and access, and in particular to a method for storing massive small files and its corresponding device, and an access method and its corresponding device.

Background

[0002] With the development of information technology, information of all kinds is growing rapidly, and the individual files that carry it appear in large numbers, especially small files of modest size. These files can be as small as a few KB, and the larger ones usually do not exceed 20 MB. Common examples include Weibo posts, photos uploaded by users, emails, and UGC data. The bottleneck created by this flood of small files is how to store and access them. In the prior art, each independent small file is usually stored directly on a disk or a RAID disk array (Redundant Array of Independent Di...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 刘晓云
Owner: BEIJING SOHU NEW MEDIA INFORMATION TECH