Data connection method and system across data center

A technology based on data nodes and metadata nodes, applied in the computer field, that addresses the inability to perform data connection (join) operations across HDFS data centers, among other problems.

Active Publication Date: 2014-11-05
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

The application scenarios in the prior art are all oriented to the same data center, that is, to a single distributed file system. However, many application scenarios require the system to perform join operations on two o



Embodiment Construction

[0035] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on these embodiments, without creative work, fall within the scope of protection of the present invention.

[0036] An embodiment of the present invention provides a data connection method across HDFS data centers. Referring to Figure 1, the method includes:

[0037] Step 101: After receiving a data connection operation request, the coordinating node obtains the IP addresses of the metadata nodes from the configuration file;

[0038] Step 102: the coordinating nod...
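The lookup that begins with Step 101 can be illustrated with a minimal Python sketch. It assumes a configuration file with one metadata node IP address per line and replaces the network connection with an in-memory dictionary. The names `parse_metadata_nodes`, `locate_table`, `fake_query`, and `FAKE_CATALOG`, the addresses, and the file format are all hypothetical and do not come from the patent text.

```python
from typing import Callable, Dict, List, Optional, Tuple

Metadata = Dict[str, object]


def parse_metadata_nodes(config_text: str) -> List[str]:
    """Return metadata node addresses from the config text.

    Assumed format: one IP address per line, '#' starts a comment line.
    """
    nodes = []
    for line in config_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            nodes.append(line)
    return nodes


def locate_table(
    nodes: List[str],
    table: str,
    query_node: Callable[[str, str], Optional[Metadata]],
) -> Optional[Tuple[str, Metadata]]:
    """Ask the metadata nodes one by one; stop at the first that knows the table."""
    for address in nodes:
        metadata = query_node(address, table)
        if metadata is not None:
            return address, metadata
    return None


# In-memory stand-in for the metadata nodes of two data centers (hypothetical).
FAKE_CATALOG: Dict[str, Dict[str, Metadata]] = {
    "10.0.0.1": {"orders": {"path": "/warehouse/orders"}},
    "10.1.0.1": {"users": {"path": "/warehouse/users"}},
}


def fake_query(address: str, table: str) -> Optional[Metadata]:
    """Pretend to connect to a metadata node and look up a table."""
    return FAKE_CATALOG.get(address, {}).get(table)


CONFIG = """# metadata nodes, one IP per line
10.0.0.1
10.1.0.1
"""

nodes = parse_metadata_nodes(CONFIG)
found = locate_table(nodes, "users", fake_query)
```

In a real deployment `fake_query` would be replaced by whatever protocol the metadata nodes expose; the sketch only shows the order of operations.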


Abstract

The invention provides a data connection method across HDFS (Hadoop Distributed File System) data centers. The method comprises the following steps. After receiving a data connection operation request, a coordination node acquires the IP (Internet Protocol) addresses of the metadata nodes from a configuration file, establishes a connection with the metadata nodes in the configuration file one by one, and, when the requested table is found in the current metadata node, acquires the metadata information held by that metadata node. Each requested node filters its own data according to the information about the requested data, selects the requested data, and transmits the size of the result set it holds to the coordination node. The coordination node instructs the nodes with small result sets to transmit their result sets to all nodes with large result sets. Each large-result-set node performs a hash join between its own result set and the result sets received from all small-result-set nodes to obtain result records, and the coordination node aggregates the result records. With this method and system, data connection across data centers can be realized.
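The size-based exchange and hash join described in the abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not say how a result set is classified as small or large, so the `threshold` parameter is a hypothetical cutoff, and network transfer is modelled as plain list copying inside one process. The function names and sample rows are also invented for the example.

```python
from collections import defaultdict
from typing import Dict, List

Row = Dict[str, object]


def hash_join(build_rows: List[Row], probe_rows: List[Row], key: str) -> List[Row]:
    """Classic hash join: build a hash table on build_rows, probe with probe_rows."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)
    joined = []
    for row in probe_rows:
        for match in table.get(row[key], []):
            joined.append({**match, **row})
    return joined


def cross_center_join(node_results: Dict[str, List[Row]], key: str, threshold: int) -> List[Row]:
    """node_results maps a node name to the filtered rows that node holds.

    threshold is an assumed cutoff: nodes with fewer rows count as 'small'.
    """
    # Each node reports only the size of its result set to the coordinator.
    sizes = {node: len(rows) for node, rows in node_results.items()}
    small = [n for n, s in sizes.items() if s < threshold]
    large = [n for n, s in sizes.items() if s >= threshold]

    # Small nodes send their rows to every large node (modelled as one shared list).
    shipped = [row for n in small for row in node_results[n]]

    # Each large node joins locally; the coordinator concatenates the results.
    aggregated = []
    for n in large:
        aggregated.extend(hash_join(shipped, node_results[n], key))
    return aggregated


node_results = {
    "dc1-node": [
        {"user_id": 1, "order_id": 100},
        {"user_id": 2, "order_id": 101},
        {"user_id": 1, "order_id": 102},
    ],
    "dc2-node": [{"user_id": 1, "name": "a"}],
}
result = cross_center_join(node_results, "user_id", threshold=2)
```

Shipping only the small result sets keeps the cross-data-center traffic proportional to the smaller side of the join, which is the apparent motivation for having nodes report sizes before any rows are moved.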

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a data connection method and system across data centers.

Background technique

[0002] In big data applications, when stand-alone performance hits a bottleneck and relational database cluster systems can no longer meet requirements, storing data in a distributed storage system and exposing a standard SQL-based user interface has become an effective way to solve large-scale structured data retrieval problems. In the prior art, HDFS (Hadoop Distributed File System) is usually used to store data files, and HDFS files and their contents are mapped into a table structure by maintaining a piece of metadata. The application scenarios in the prior art are all oriented to the same data center, that is, to a single distributed file system. However, many application scenarios r...


Application Information

IPC(8): H04L29/08, G06F17/30
Inventor: 汪东升, 张宝权, 王占业
Owner: TSINGHUA UNIV