Optimization method for data layout of multi-data centres based on calculating relevancy

A multi-data center data layout technology, applied in the field of distributed data storage and management, that addresses the high complexity of existing data layout optimization methods, the loss of storage efficiency and space utilization, and the very large data volume of big data.

Status: Inactive
Publication Date: 2014-08-13
WUHAN UNIV

AI Technical Summary

Problems solved by technology

However, the data volume of big data is already very large, and unreasonable replication methods add a huge storage space overhead on top of it. Copies of data sets that are rarely used introduce unnecessary redundancy into the system and seriously reduce storage efficiency and the space utilization rate.
[0006] Comprehensive analysis shows that the current multi-data center data layout optimization methods have the following




Embodiment Construction

[0048] Data layout optimization brings the data layout closer to actual needs, makes reasonable and full use of system resources, reduces both the difficulty of organizing and managing data in a distributed multi-data center storage system and the pressure on the system, and improves the system's overall access performance and management efficiency. The technical solution of the present invention provides a multi-data center data layout optimization method based on calculation correlation. It targets the problem of optimizing the storage layout of massive data in a large-scale distributed multi-data center storage system, together with the execution efficiency of data-intensive computing. Based on the "data common" phenomenon of data-intensive computing, the method defines the computing correlation between data sets, realizes a non-duplicated layout of massive data sets without considering data copies, and deploys data sets with high computing correlation in the ...


Abstract

The invention discloses an optimization method for the data layout of multi-data centres based on calculating relevancy. The method comprises the steps of: generating an access associated matrix between the computation set and the collection of data sets according to how computations are executed on the data sets; calculating the relevancy between any two data sets and generating the corresponding calculating relevancy matrix; and calculating the basic capacity of each data centre, defining a layout associated matrix, and laying out the data sets according to the calculating relevancy. By establishing the access associated matrix and the layout matrix, the method gives a specific mathematical expression of the calculating relevancy. Based on the established calculating relevancy matrix, data layout is realized by a low-complexity method. Moreover, new data and intermediate data are dynamically laid out to a proper data centre, so that data scheduling across data centres is effectively reduced and the access performance of the system is improved.
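The steps in the abstract can be illustrated with a small Python sketch. The patent's actual formula for calculating relevancy and its layout procedure are not reproduced in this excerpt, so everything here is an assumption made for illustration: the access associated matrix is taken to be binary (row = computation, column = data set), relevancy between two data sets is taken to be the number of computations that access both, and the layout uses a simple greedy rule under per-centre capacities. The names `relevancy_matrix`, `greedy_layout`, and `cross_centre_computations` are hypothetical.

```python
def relevancy_matrix(access):
    """Assumed definition: rel[i][j] = number of computations that use
    both data set i and data set j (diagonal left at 0)."""
    n = len(access[0])
    rel = [[0] * n for _ in range(n)]
    for row in access:
        used = [d for d, flag in enumerate(row) if flag]
        for i in used:
            for j in used:
                if i != j:
                    rel[i][j] += 1
    return rel


def greedy_layout(rel, sizes, capacities):
    """Place each data set in exactly one data centre (no replicas).

    Greedy heuristic, not the patented procedure: data sets are visited in
    order of decreasing total relevancy, and each goes to the centre with
    enough free capacity that already holds the most relevant data sets.
    """
    n = len(sizes)
    free = list(capacities)
    placement = [None] * n
    order = sorted(range(n), key=lambda d: (-sum(rel[d]), d))
    for d in order:
        best, best_score = None, -1
        for k, room in enumerate(free):
            if room < sizes[d]:
                continue  # centre k cannot hold this data set
            score = sum(rel[d][e] for e in range(n) if placement[e] == k)
            if score > best_score:
                best, best_score = k, score
        if best is None:
            raise ValueError(f"no data centre has room for data set {d}")
        placement[d] = best
        free[best] -= sizes[d]
    return placement


def cross_centre_computations(access, placement):
    """Count computations whose input data sets span several centres."""
    count = 0
    for row in access:
        centres = {placement[d] for d, flag in enumerate(row) if flag}
        if len(centres) > 1:
            count += 1
    return count
```

With four computations over four unit-sized data sets and two centres of capacity 2, this heuristic keeps the two most strongly related pairs together, leaving a single computation that must fetch data across centres. A real deployment would need the patent's own relevancy expression and capacity model in place of these assumptions.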

Description

Technical field

[0001] The invention relates to the field of distributed data storage and management, in particular to a multi-data center data layout optimization method based on calculation correlation.

Background technique

[0002] The data explosion has brought the information society into the era of big data. Big data involves a huge amount of data, contains rich and diverse information, and brings huge economic and social benefits, but it also faces severe technical challenges. The "3V" characteristics of big data (that is, large capacity, fast update speed, and various types) mean that traditional database management cannot meet the requirements of big data storage and management. Cloud computing is currently a very important distributed network computing platform. It is regarded as a very effective storage, management and analysis platform for big data, and is a cost-effective solution for big data management and analysis.

[0003] However, in the distributed multi-data cente...


Application Information

IPC(8): G06F17/30
CPC: G06F16/217
Inventor: 徐正全, 王涛, 姚世红, 熊礼治
Owner: WUHAN UNIV