Pruning method for spatial big data partition duplicated data

A technique for pruning duplicated data across data partitions, applicable to database distribution/replication, database indexing, and electrical digital data processing. It addresses low query efficiency and poor query response time, reduces query response time, and is easy to implement and use.

Pending Publication Date: 2022-01-07
DALIAN MARITIME UNIVERSITY

AI Technical Summary

Problems solved by technology

[0004] To address the technical problems described above, namely low query efficiency and poor query response time, a pruning method for duplicate data in spatial big data partitions is provided.

Examples

Embodiment 1

[0054] Detailed description of the preferred embodiments: the following examples are intended to illustrate the invention, not to limit its scope.

[0055] This embodiment builds a Hadoop cluster of three servers, each with a 4-core Intel Xeon E5-2609 2.40 GHz processor, as the test environment for the method of the present invention. One server acts as the master node and the other two act as slave nodes. The specific hardware configuration is shown in Table 1.

[0056] Table 1 Server hardware configuration

[0057]
Configuration        Specification
CPU                  Intel Core E7500
RAM                  2 GB
Hard disk            300 GB
Network bandwidth    1 GB/s

[0058] This embodiment uses Eclipse as the development environment and Java as the programming language for designing and implementing the method of the present invention. The software environment in which the method runs includes: operating system Ubuntu 16.04.2, ...

Abstract

The invention provides a pruning method for duplicated data across partitions of spatial big data. The method can be built into a distributed spatial big data query system and does not require a refinement operation on candidate results. It comprises the following steps. First, the partition information p_i of the spatial big data is read to obtain the spatial minimum bounding rectangle r_i covered by the data of each partition, and r_i is intersected with the rectangular spatial query window q to obtain the query range s_i of each partition. Second, s_i is intersected with s_j (i ≠ j) to obtain the overlapping rectangular region s_ij = s_i ∩ s_j between the partition query ranges s_i and s_j; a reference point is introduced, the partition to which the region s_ij belongs is determined through the reference point, and the deduplicated query ranges tr_i and tr_j of p_i and p_j are returned in the form <p_i, tr_i>. Then, the ranges tr belonging to the same partition p are intersected to obtain the final query range of p. Finally, the final query range of each partition is used as a new constraint for pruning the partition data, and the query result obtained is the final result.
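The steps in the abstract can be pictured with a small Java sketch. It is not the patented procedure: the abstract does not say how the reference point selects the owning partition, so this sketch swaps in a simpler rule (the partition with the lower index owns each overlap region). It also assumes that records are points and that every record lying in an overlap region is stored in both partitions. All class and method names (`OverlapPruning`, `Rect`, `Range`, `finalRanges`, `prune`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OverlapPruning {
    /** Axis-aligned rectangle with closed bounds; null stands for "empty". */
    static final class Rect {
        final double x1, y1, x2, y2;
        Rect(double x1, double y1, double x2, double y2) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }
        /** Intersection of two rectangles, or null if they do not meet. */
        static Rect intersect(Rect a, Rect b) {
            if (a == null || b == null) return null;
            double x1 = Math.max(a.x1, b.x1), y1 = Math.max(a.y1, b.y1);
            double x2 = Math.min(a.x2, b.x2), y2 = Math.min(a.y2, b.y2);
            return (x1 <= x2 && y1 <= y2) ? new Rect(x1, y1, x2, y2) : null;
        }
        boolean contains(double[] p) {
            return p[0] >= x1 && p[0] <= x2 && p[1] >= y1 && p[1] <= y2;
        }
    }

    /** Final query range of one partition: s_i minus the overlaps it does not own. */
    static final class Range {
        final Rect base;                               // s_i = r_i intersected with q
        final List<Rect> excluded = new ArrayList<>(); // overlaps owned by other partitions
        Range(Rect base) { this.base = base; }
        boolean accepts(double[] p) {
            if (base == null || !base.contains(p)) return false;
            for (Rect e : excluded) if (e.contains(p)) return false;
            return true;
        }
    }

    /** Computes one final query range per partition MBR for query window q. */
    static List<Range> finalRanges(List<Rect> mbrs, Rect q) {
        List<Range> ranges = new ArrayList<>();
        for (Rect r : mbrs) ranges.add(new Range(Rect.intersect(r, q)));
        for (int i = 0; i < ranges.size(); i++) {
            for (int j = i + 1; j < ranges.size(); j++) {
                Rect sij = Rect.intersect(ranges.get(i).base, ranges.get(j).base);
                // ASSUMPTION: the lower index owns s_ij (stand-in for the reference-point rule).
                if (sij != null) ranges.get(j).excluded.add(sij);
            }
        }
        return ranges;
    }

    /** Keeps only the points of one partition that fall inside its final range. */
    static List<double[]> prune(List<double[]> points, Range range) {
        List<double[]> out = new ArrayList<>();
        for (double[] p : points) if (range.accepts(p)) out.add(p);
        return out;
    }

    public static void main(String[] args) {
        Rect q = new Rect(0, 0, 10, 10);
        List<Rect> mbrs = Arrays.asList(new Rect(0, 0, 6, 10), new Rect(4, 0, 10, 10));
        double[] shared = {5, 5}; // lies in the overlap, stored in both partitions
        List<double[]> p0 = Arrays.asList(new double[]{1, 1}, shared);
        List<double[]> p1 = Arrays.asList(shared, new double[]{9, 9});
        List<Range> ranges = finalRanges(mbrs, q);
        int total = prune(p0, ranges.get(0)).size() + prune(p1, ranges.get(1)).size();
        System.out.println("results without duplicates: " + total);
    }
}
```

Because the excluded regions are applied as a filter inside each partition, the point stored in both partitions is returned only by the partition that owns the overlap, so no deduplication pass over the merged results is needed. A fixed ownership order is used here because it guarantees that a point covered by three or more overlapping partitions is still kept by exactly one of them.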

Description

Technical field

[0001] The present invention relates to the field of spatial big data management, and more particularly to a pruning method for duplicated data in spatial big data partitions.

Background technique

[0002] Distributed data storage is the main solution to the challenges of spatial data storage, and for spatial data, data partitioning has become an essential operation of distributed storage. Research in China and abroad mainly uses the R-tree and its variant structures, such as R*, STR and STR+. The R-tree is a data structure dedicated to organizing and managing spatial data. It preserves spatial proximity, but it produces a large number of coinciding boundaries, that is, partition overlap, which poses a major challenge for spatial data queries (such as range queries and k-NN queries).

[0003] However, existing techniques for pruning duplicated partition data mainly refine the query result set, processing the query result set in a single-machine environment, and the query r...
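For contrast with the pruning approach, the refinement-style deduplication mentioned in the background can be read as a merge step that collects candidate results from all partitions on one machine and drops repeats. The following is a minimal Java sketch of that reading only; the use of numeric record IDs and the names `RefineDedup` and `refine` are assumptions made for illustration and do not come from the source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RefineDedup {
    /** Merges per-partition candidate IDs and drops repeats, keeping first-seen order. */
    static List<Long> refine(List<List<Long>> candidatesPerPartition) {
        Set<Long> seen = new LinkedHashSet<>();
        for (List<Long> partition : candidatesPerPartition) {
            seen.addAll(partition); // a set ignores IDs it already holds
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Record 2 (hypothetical ID) was returned by both partitions.
        List<List<Long>> candidates = Arrays.asList(
            Arrays.asList(1L, 2L), Arrays.asList(2L, 3L));
        System.out.println(refine(candidates)); // prints [1, 2, 3]
    }
}
```

In this style every candidate, including each duplicate, must first be shipped to the merging machine, which is the cost a partition-side pruning step avoids.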

Application Information

IPC(8): G06F16/2455, G06F16/22, G06F16/27
CPC: G06F16/24556, G06F16/2246, G06F16/27
Inventor: 张维石, 田瑞杰, 翟华伟, 崔立成, 周立甲
Owner: DALIAN MARITIME UNIVERSITY