Distributed storage method and system for small spatial file data based on access log information

A distributed storage technology for file data, applied in the fields of digital data processing, special data processing applications, and data processing input/output processes. It addresses the problem that the optimization of small file data has not been fundamentally solved.

Inactive Publication Date: 2015-04-29
WUHAN UNIV
2 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

Therefore, the optimization problem of small

Examples

Embodiment Construction

[0089] In a distributed environment, it is difficult to achieve parallel access to small spatial file data by distributing the data in blocks. It is therefore necessary to analyze the relationships among the small spatial files and, as far as possible, to store small spatial files that are requested together on different storage servers. The requested files can then be fetched in parallel, which improves the performance of the spatial information service system.
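The placement idea in [0089] can be illustrated with a minimal sketch. It assumes a simple greedy heuristic, which is not the search procedure of the invention: each file goes to the server where it has the least total association with the files already placed there. The function name `spread_across_servers`, the `assoc` dictionary format, and the weights are hypothetical.

```python
def spread_across_servers(files, assoc, n_servers):
    """Greedily place files so that strongly associated files land on different servers.

    files:     list of file ids, in placement order (hypothetical input).
    assoc:     dict mapping frozenset({f, g}) -> association weight (assumed format).
    n_servers: number of storage servers.
    """
    servers = [[] for _ in range(n_servers)]
    for f in files:
        def cost(s):
            # Total association between f and the files already on server s.
            return sum(assoc.get(frozenset((f, g)), 0) for g in servers[s])

        # Prefer the lowest cost, then the emptiest server, then the lowest index.
        best = min(range(n_servers), key=lambda s: (cost(s), len(servers[s]), s))
        servers[best].append(f)
    return servers


# Illustrative weights only: "a" and "b" are often requested together, as are "c" and "d".
assoc = {
    frozenset(("a", "b")): 5,
    frozenset(("c", "d")): 4,
    frozenset(("a", "c")): 1,
}
placement = spread_across_servers(["a", "b", "c", "d"], assoc, n_servers=2)
print(placement)
```

In this toy input the strongly associated pairs end up on different servers, so a request for both members of a pair could be served by two servers in parallel.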

[0090] Because the volume of small spatial file data is huge, optimizing the storage combination for a large-scale set of small spatial files has high computational complexity and a long search time. It is therefore necessary to classify small spatial file data by popularity and to use a different method for each class to obtain the optimal storage combination scheme.
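As a rough illustration of the popularity classification in [0090], the sketch below counts accesses in a log and treats the most requested share of files as frequently accessed. The log format (one file id per request), the name `split_by_popularity`, and the `hot_fraction` threshold are assumptions for illustration; the text here does not state how the threshold is chosen.

```python
from collections import Counter


def split_by_popularity(access_log, hot_fraction=0.2):
    """Split file ids into (frequently accessed, infrequently accessed) lists.

    access_log:   iterable of file ids, one per request (hypothetical format).
    hot_fraction: share of distinct files treated as frequently accessed
                  (illustrative threshold, not taken from the patent).
    """
    counts = Counter(access_log)
    # Rank by descending access count; break ties by file id for determinism.
    ranked = sorted(counts, key=lambda f: (-counts[f], f))
    n_hot = max(1, round(len(ranked) * hot_fraction)) if ranked else 0
    return ranked[:n_hot], ranked[n_hot:]


log = ["a", "b", "a", "c", "a", "b", "d", "e"]
hot, cold = split_by_popularity(log, hot_fraction=0.4)
print(hot, cold)
```

Files that never appear in the log are not seen by this sketch; a real system would add them to the infrequently accessed subset.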

[0091] The following provides detailed su...

Abstract

The invention provides a distributed storage method and system for small spatial file data based on access log information. The method includes: dividing a small spatial file data set into a frequently accessed subset and an infrequently accessed subset; extracting the access sequence of the frequently accessed subset, calculating the association degree of each frequently accessed small spatial file, and forming an association matrix from the association degree values; performing magnitude conversion on each value in the association matrix, rearranging the matrix with the RCM sorting algorithm, and outputting the result; searching the rearranged association matrix for the optimal combination with a local approximation search method and using that combination to distribute the storage of the frequently accessed small spatial file data; and storing the infrequently accessed small spatial file data separately according to spatial adjacency.
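The association matrix step can be sketched as follows. The abstract does not define the association degree, so this example assumes one simple definition: the number of times two files appear within `window` consecutive requests of each other in the access sequence. The name `association_matrix` and the `window` parameter are hypothetical, and the magnitude conversion, RCM reordering, and search steps are not shown.

```python
def association_matrix(sequence, files, window=1):
    """Build a symmetric co-access count matrix from an access sequence.

    sequence: list of file ids in request order (hypothetical log extract).
    files:    list of frequently accessed file ids; fixes the row/column order.
    window:   how many following requests count as "accessed together"
              (assumed definition of association, not taken from the patent).
    """
    index = {f: i for i, f in enumerate(files)}
    n = len(files)
    matrix = [[0] * n for _ in range(n)]
    for pos, f in enumerate(sequence):
        if f not in index:
            continue
        # Look at the next `window` requests after position pos.
        for g in sequence[pos + 1 : pos + 1 + window]:
            if g in index and g != f:
                i, j = index[f], index[g]
                matrix[i][j] += 1
                matrix[j][i] += 1
    return matrix


seq = ["a", "b", "a", "c", "b"]
m = association_matrix(seq, files=["a", "b", "c"], window=1)
for row in m:
    print(row)
```

A matrix of this kind is the input that a bandwidth-reducing reordering such as RCM would operate on.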

Description

Technical field

[0001] The invention belongs to the technical field of distributed storage of small spatial file data, and in particular relates to a new method and system for distributed storage of small spatial file data based on access log information.

Background technique

[0002] The storage and fast access of massive spatial information has always been an important problem that spatial information service systems try to solve. Commonly used spatial information service systems such as the NASA Earth Observation System collect up to 2 TB of data per day. Distributing and storing these data reasonably, so that they can be accessed quickly and in parallel, becomes the key. One important solution is to improve data access efficiency by distributing data storage to achieve parallel access to data.

[0003] At present, typical distributed file storage systems mainly include GFS (Google File System), HDFS (Hadoop Distributed File System) and Lustre. However, the improve...

Application Information

IPC(8): G06F17/30, G06F3/06
CPC: G06F3/067, G06F16/182
Inventor: 潘少明, 徐正全, 种衍文, 李红, 李明, 汤戈
Owner WUHAN UNIV