Distributed data device, method and system based on spatial correlation

A distributed data technology based on spatial relationships, applied in the field of distributed data devices based on spatial correlation. It addresses three problems in prior systems: they cannot be called true data systems, their overall performance degrades, and the I/O pressure on some nodes is high. The claimed effects are large expansion capability, improved efficiency, and dispersed I/O pressure.

Active Publication Date: 2014-08-20
罗敬宁

AI Technical Summary

Problems solved by technology

[0005] However, the inventors found that the prior art has at least the following deficiencies. First, the HDFS system is a distributed implementation of a file system, and its processing subject is still each independent file, so it cannot be called a real data system. Second, for data associated by attributes such as space, time, and hierarchy, the HDFS system cannot recognize the attribute information, cannot segment data according to attribute associations, and cannot establish associations between data blocks, which results in unreasonable data distribution. Third, because the HDFS system does not divide and store data according to data attributes, it cannot provide reasonable load balancing when data is read by attribute characteristics; the I/O pressure on some nodes becomes too high, and overall performance declines. Finally, HDFS data access is file-based: it cannot access data by attribute characteristics, cannot provide attribute-based query, merge, crop, and other operations, and cannot read and reorganize data from multiple files according to attribute associations.
[0007] However, the inventors found that prior art 2 has at least the following deficiencies. First, the Key/Value structure of BigTable has only a single key, generally a string, which cannot describe keys with a multidimensional attribute structure, such as spatial data that carries spatial location, time attributes, and hierarchical attributes. Second, the data stored in BigTable is allocated in order of arrival or at random to form different tablets, so spatially associated data is stored unreasonably and data access frequently concentrates on a single tablet, which loses the advantages of a distributed system. Third, BigTable data records cannot express attribute associations through the key, such as spatial adjacency, time order, or hierarchical relationship. Therefore, when data is queried and read, tasks cannot be allocated and processed according to attribute characteristics, which results in unbalanced system performance.




Embodiment Construction

[0062] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments are described in further detail below with reference to the accompanying drawings. The exemplary embodiments and their descriptions are used to explain the present invention, not to limit it.

[0063] Figure 1 shows a flow chart of a distributed data storage method based on spatial association according to an embodiment of the present invention.

[0064] Step 101: Divide the data with spatial characteristics into a plurality of grids, where each grid holds the data of the space in which it is located.

[0065] Step 102: Store the grid data in a plurality of storage nodes according to the association between the spatial positions of the grids.
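
Steps 101 and 102 can be illustrated with a minimal Python sketch. The visible text does not specify the cell size, the data layout, or the exact placement rule, so everything here is an assumption for illustration: the names `grid_index`, `partition`, `assign_node`, and `store` are hypothetical, points are `(x, y, value)` tuples, and the placement rule `(row + col) % num_nodes` is just one simple rule that puts horizontally and vertically adjacent grids on different nodes.

```python
from collections import defaultdict
from math import floor


def grid_index(x, y, cell_size):
    """Return the (row, col) of the grid cell that contains point (x, y)."""
    return (floor(y / cell_size), floor(x / cell_size))


def partition(points, cell_size):
    """Step 101 (sketch): group spatial records by the grid cell they fall in."""
    grids = defaultdict(list)
    for x, y, value in points:
        grids[grid_index(x, y, cell_size)].append((x, y, value))
    return dict(grids)


def assign_node(cell, num_nodes):
    """Assumed placement rule, not taken from the patent text:
    neighbouring cells in a row or column map to different nodes."""
    row, col = cell
    return (row + col) % num_nodes


def store(grids, num_nodes):
    """Step 102 (sketch): place each grid's data on a node chosen from its position."""
    nodes = {n: {} for n in range(num_nodes)}
    for cell, data in grids.items():
        nodes[assign_node(cell, num_nodes)][cell] = data
    return nodes


points = [
    (0.5, 0.5, "a"),
    (1.5, 0.5, "b"),
    (0.5, 1.5, "c"),
    (1.5, 1.5, "d"),
    (0.7, 0.2, "e"),
]
grids = partition(points, cell_size=1.0)
nodes = store(grids, num_nodes=2)
```

With this rule, a read that covers a contiguous region touches several nodes instead of one, which is the kind of pressure dispersal the summary describes. A real system would also need replication and a metadata index, which this sketch leaves out.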

[0066] As an embodiment of the present invention, the data of the spatial ch...


Abstract

The invention relates to the technical field of data systems, in particular to a distributed data device, method and system based on spatial correlation. The storage method includes the steps of dividing data with spatial characteristics into a plurality of grids, where each grid holds the data of the space in which it is located, and storing the grid data in a plurality of storage nodes according to the association between the spatial positions of the grids. For massive spatial data of various types, the device, method and system achieve highly parallel data writing and reading, and ensure that data divided by spatial attributes is stored safely across the nodes in a balanced way that preserves spatial correlation. The distributed data system also has high expansion capability: performance scales linearly with system expansion, and large numbers of idle nodes or I/O bottleneck nodes do not appear in the system, which fulfils the original design intention of a distributed data system.

Description

Technical field

[0001] The present invention relates to the technical field of data systems, in particular to a distributed data device, method and system based on spatial correlation.

Background art

[0002] After years of development, the distributed data system has become an important solution for efficient, highly available, and cost-effective storage and application of massive data. It plays a pivotal role in promoting cloud computing and big data applications. The core idea of a distributed data system is to disperse the storage of data: divide the data into standard subsets, use multiple nodes to store the subsets, and store the location information of the data subsets on the master node. When data is read, each storage node is only responsible for providing its own data subset, and the client interface component is responsible for reassembling the data subsets and finally delivering the data file or record. Through data segmentation storage a...
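
The general split, store, and reassemble idea described in the background can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the design of any particular system: `write_data` and `read_data` are hypothetical names, storage nodes are plain dictionaries, the master index is a dictionary from chunk id to node id, and chunks are placed round-robin.

```python
def write_data(data, chunk_size, nodes, master):
    """Split data into fixed-size subsets and spread them over the nodes.

    nodes  : list of dicts, each standing in for one storage node
    master : dict mapping chunk id -> node id (the location information)
    """
    for offset in range(0, len(data), chunk_size):
        chunk_id = offset // chunk_size
        node_id = chunk_id % len(nodes)  # assumed round-robin placement
        nodes[node_id][chunk_id] = data[offset:offset + chunk_size]
        master[chunk_id] = node_id


def read_data(nodes, master):
    """Client side: look up each subset's node on the master and reassemble in order."""
    return b"".join(nodes[master[cid]][cid] for cid in sorted(master))


payload = b"distributed data system"
nodes = [{}, {}, {}]
master = {}
write_data(payload, chunk_size=4, nodes=nodes, master=master)
restored = read_data(nodes, master)
```

Each node only ever serves the chunks it holds, and the client does the reassembly, which mirrors the division of work between storage nodes and the client interface component described above.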


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F3/06
CPC: G06F16/13, G06F16/182
Inventor: 罗敬宁
Owner: 罗敬宁