Block size variable data blocking method for cloud storage system

This cloud storage blocking technique, applied in the field of communication, addresses problems of fixed-block-size storage such as excessive access to hotspot files and local system overload. Its stated effects are to avoid disk fragmentation, improve space utilization, and relieve load overload.

Inactive Publication Date: 2012-06-20
XIDIAN UNIV
Cites: 2; Cited by: 17

AI Technical Summary

Problems solved by technology

[0004] However, this fixed file block size strategy improves system performance only to a limited extent and is not a complete solution. Its main disadvantages are as follows:
[0005] 1. It easily causes disk fragmentation.
With a fixed 64M block size, a 64.0001M file is split so that the trailing 0.0001M is stored as a separate block on another node, which aggravates disk fragmentation. A separate metadata index entry must also be created for the 0.0001M block, and an additional network connection is needed during parallel transmission. Compared with storing the whole file on one node, this is wasteful, and it does not help parallel transmission or load balancing of hot files.
[0006] 2. Excessive access to hotspot files easily overloads the system, especially for small files, which contain few blocks or even a single block.
When many clients repeatedly access the same small file, the servers storing its blocks become hotspots: the few block servers holding the file receive concurrent requests from hundreds of clients, resulting in local system overload.
[0007] 3. Parallel computing suffers from uneven load.
Nodes that store small blocks often have to wait for nodes that store larger blocks before the system can integrate the results. Such parallel processing underuses computing resources because the blocks are uneven, which inevitably degrades performance.
Nodes storing smaller blocks tend to finish their tasks early and carry a relatively light load, while nodes storing larger blocks take longer. This uneven load slows the response of nodes starting new tasks, which introduces larger delays and reduces computational efficiency.
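
The fragmentation problem in [0005] can be illustrated with a minimal sketch. It assumes a fixed 64 MiB block size and works in whole bytes; the function name `fixed_size_blocks` and the 105-byte tail (roughly 0.0001M) are illustrative choices, not part of the patent text.

```python
MIB = 1024 * 1024


def fixed_size_blocks(file_size, block_size=64 * MIB):
    """Return the block lengths (in bytes) produced by a fixed block size.

    Every block is block_size bytes long except possibly the last one,
    which holds whatever remainder is left over.
    """
    full_blocks, tail = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([tail] if tail else [])


# A file only 105 bytes larger than one block still needs a second block,
# with its own metadata entry and its own network connection.
blocks = fixed_size_blocks(64 * MIB + 105)
print(len(blocks), blocks[-1])  # prints: 2 105
```

The second block here is tiny compared with the first, which is the uneven-block situation that paragraphs [0005] and [0007] describe.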

Examples

Example 1

[0023] Example 1: Blocking a file whose size is less than 4M and whose heat flag is 1

[0024] Step 1, the cloud storage system determines, according to experience and cloud application requirements, the best block factor vector $D = \{d_1, d_2, \ldots, d_n\}$, where $d_i$, $i = 1, 2, \ldots, n$, are $n$ block factors arranged from small to large. In the actual system, the best values are: with $n = 6$, take $d_1 = 4$, $d_2 = 8$, $d_3 = 16$, $d_4 = 32$, $d_5 = 64$, $d_6 = 128$, that is, $D = \{4, 8, 16, 32, 64, 128\}$.

[0025] Step 2, the cloud storage system obtains the file size, size < 4M byte. Since the file is smaller than the minimum block factor in the block factor vector D, it does not need to be divided into blocks, and the file block size is the final size of the file.

[0026] Step 3, the cloud storage system then obtains the file heat flag, flag = 1, indicating that the file is a hot file, so it needs to be copied and stored, that is, three backups of th...

Example 2

[0027] Example 2: Blocking a file whose size is less than 4M and whose heat flag is 0

[0028] Step 1, the cloud storage system determines the best block factor vector $D = \{d_1, d_2, \ldots, d_n\}$, where $d_i$, $i = 1, 2, \ldots, n$, are $n$ block factors arranged from small to large. In the actual system, the best values are: with $n = 6$, take $d_1 = 4$, $d_2 = 8$, $d_3 = 16$, $d_4 = 32$, $d_5 = 64$, $d_6 = 128$, that is, $D = \{4, 8, 16, 32, 64, 128\}$.

[0029] Step 2, the cloud storage system obtains the file size, size < 4M byte. Since the file is smaller than the minimum block factor in the block factor vector D, it does not need to be divided into blocks, and the file block size is the final size of the file.

[0030] Step 3, the cloud storage system obtains the file heat flag, flag = 0, indicating that the file is not a hot file, so the file is stored directly on a single storage node and storage ends, that is, the blocking method ends.
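
The small-file path of Examples 1 and 2 can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: the names `Placement` and `place_small_file` are hypothetical, and the default of 3 stored copies for a hot file is an assumption based on the truncated phrase "three backups" in [0026].

```python
from dataclasses import dataclass

# Block factor vector D from Step 1, in M.
BLOCK_FACTORS_MB = (4, 8, 16, 32, 64, 128)


@dataclass
class Placement:
    block_size_mb: float  # size of the single stored block
    copies: int           # number of stored copies of the file


def place_small_file(size_mb, hot_flag, hot_copies=3):
    """Decide how to store a file smaller than the minimum block factor.

    hot_copies=3 is an assumption; the source text is cut off after
    "three backups".
    """
    if size_mb >= min(BLOCK_FACTORS_MB):
        raise ValueError("file is not smaller than the minimum block factor")
    # Step 2: no splitting, the block size is the file's own size.
    # Step 3: a hot file (flag 1) is copied, any other file goes to one node.
    copies = hot_copies if hot_flag == 1 else 1
    return Placement(block_size_mb=size_mb, copies=copies)


print(place_small_file(2.5, hot_flag=1))
print(place_small_file(2.5, hot_flag=0))
```

Keeping the replica count as a parameter avoids hard-coding a value the truncated source does not fully specify.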

Example 3

[0031] Example 3: Blocking a file larger than 4M byte

[0032] Step A, the cloud storage system determines, according to experience and cloud application requirements, the best block factor vector $D = \{d_1, d_2, \ldots, d_n\}$, where $d_i$, $i = 1, 2, \ldots, n$, are $n$ block factors arranged from small to large. In the actual system, the best values are: with $n = 6$, take $d_1 = 4$, $d_2 = 8$, $d_3 = 16$, $d_4 = 32$, $d_5 = 64$, $d_6 = 128$, that is, $D = \{4, 8, 16, 32, 64, 128\}$.

[0033] Step B, the cloud storage system obtains the file size, size > 4M byte. Since the file is larger than the minimum block factor in D, it needs to be divided into blocks, and step C is performed.

[0034] Step C, the block factor vector $D = \{4, 8, 16, 32, 64, 128\}$ contains 6 block factors, so $d_6 = 128$ is used as the initial block factor to start the block judgment, that is, the block factor $d_6$ is compared with the size of the file: if size % $d_6$ […] $d_6 / 2$, the block size i...
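
Step C is cut off in this copy, so the decision rule itself cannot be reproduced here. The sketch below only computes the two quantities that the step compares, size % d and d / 2, for each block factor starting from the largest. The function name `remainder_table` and the 200M example file are illustrative assumptions.

```python
# Block factor vector D from Step A, in M.
BLOCK_FACTORS_MB = (4, 8, 16, 32, 64, 128)


def remainder_table(size_mb, factors=BLOCK_FACTORS_MB):
    """For each factor d, largest first, return (d, size % d, d / 2).

    This lists the inputs to the block judgment only; it does not decide
    which factor to use, because that rule is truncated in the source.
    """
    return [(d, size_mb % d, d / 2) for d in sorted(factors, reverse=True)]


for d, remainder, half in remainder_table(200):
    print(f"d={d:>3}  size % d={remainder:>3}  d/2={half}")
```

For a 200M file the first row is d = 128 with remainder 72 and d / 2 = 64.0, which are the values the judgment in Step C would start from.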

Abstract

The invention discloses a method that uses dynamic adaptation to block and store files in a cloud storage system. It mainly aims to solve the low space utilization and disk fragmentation caused by the fixed block size used in the prior art. The scheme is as follows: the cloud storage system sets a block factor vector and, when a file is to be stored, obtains the file's size and heat flag. When the file is smaller than the minimum block factor, the block size is the final size of the file; if the heat flag is 1, the file is copied and stored, and otherwise it is stored directly on a single node. When the file is larger than the minimum block factor, the file size is compared with the block factors to obtain an optimal block size. Compared with the fixed-block-size method, the invention achieves higher file uniformity, better solves the load balancing problem caused by hot files, and is particularly suitable for cloud storage systems.

Description

Technical field

[0001] The invention belongs to the technical field of communication and relates to a block data storage method in cloud storage. It is used for file segmentation in a cloud storage system and stores files with an optimal block size, so that the cloud storage system has better file uniformity.

Background

[0002] Existing cloud storage systems all adopt a storage strategy with a fixed file block size; Google File System and Hadoop Distributed File System are representative applications of this strategy.

[0003] The Google file system adopts a fixed file block size of 64MB, which is much larger than the block size of traditional file systems or even the file size. Each block and its copies are stored on the block server as ordinary Linux files. One benefit of this fixed file block size strategy for the Google file system is that it can reduce the number of fragmented files on the disk to a certain ext...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, H04L29/08
Inventors: 樊凯, 李晖, 赵黎斌, 王康, 陈曦
Owner: XIDIAN UNIV