Data migration method and tool based on Hadoop cluster

A Hadoop cluster-based data migration technology, applied in the field of data migration based on Hadoop clusters. It addresses high development cost, complicated programming, and the lack of monitoring of synchronization progress and data accuracy, and achieves easy management, a simple deployment process, and reduced human intervention.

Active Publication Date: 2017-01-04
BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1
6 Cites, 9 Cited by
AI Technical Summary

Problems solved by technology

[0007] 1) Manually copying and migrating each directory with the data synchronization command provided by Hadoop itself is suitable for temporary synchronization of small directories, but not for one-time synchronization of large data volumes, and it lacks monitoring of synchronization progress and data accuracy;
[0008] 2) Programming in other programming languages is more complicated and has a higher development cost, and the corresponding monitoring and management functions must be developed at the same time.
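
As a rough illustration of the monitoring gap described above, the sketch below runs per-directory synchronization concurrently while recording progress and a per-directory result. It is not the patented tool: `sync_all` and the `sync_one` callback are hypothetical names, and `sync_one` stands in for whatever actually invokes the Hadoop synchronization command and verifies the copy.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def sync_all(dirs: Iterable[str],
             sync_one: Callable[[str], bool],
             workers: int = 4) -> Dict[str, bool]:
    """Synchronize directories concurrently and report progress.

    sync_one is a hypothetical callback: it should copy one directory
    and return True only if the copy was verified as correct.
    """
    dirs = list(dirs)
    results: Dict[str, bool] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(sync_one, d): d for d in dirs}
        for done, future in enumerate(as_completed(futures), start=1):
            d = futures[future]
            try:
                results[d] = bool(future.result())
            except Exception:
                # A crashed task counts as a failed directory.
                results[d] = False
            status = "ok" if results[d] else "FAILED"
            print(f"[{done}/{len(dirs)}] {d}: {status}")
    return results
```

A caller can retry only the directories whose result is `False`, which is the kind of bookkeeping that manual per-directory copying does not provide.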


Examples

Embodiment Construction

[0033] Exemplary embodiments of the present invention are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present invention to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0034] Figure 1 is a schematic architecture diagram of a Hadoop cluster-based data migration method provided according to the present invention; Figure 2 is a schematic diagram of the main steps of the method. As shown in Figures 1-2, the method mainly includes the followin...

Abstract

The invention provides a data migration method based on a Hadoop cluster. In the method, each cluster calculates its current data directory information according to the list of data directories to be synchronized, which is sent by the master server, and returns the result to the master server; the master server compares the results to obtain a difference directory list; the master server divides the difference directory list according to the number of clients executing the synchronization task; and each client executing the synchronization task receives the synchronization request initiated by the master server, requests the web service to acquire the divided difference directory list, and executes the synchronization task. The technical scheme wraps the synchronization command provided by Hadoop itself, adding data difference comparison, multi-threaded concurrent synchronization, synchronization result verification, synchronization progress tracking, and process monitoring.
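
The comparison and division steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the per-directory summary (file count and total bytes), the function names `diff_directories` and `partition`, and the round-robin split are all hypothetical choices, since the abstract does not say how directory information is represented or how the list is divided.

```python
from typing import Dict, List, Tuple

# Assumed summary reported by each cluster: path -> (file_count, total_bytes).
DirInfo = Dict[str, Tuple[int, int]]


def diff_directories(source: DirInfo, target: DirInfo) -> List[str]:
    """Return source directories that are missing or different on the target."""
    return sorted(path for path, info in source.items()
                  if target.get(path) != info)


def partition(dirs: List[str], n_clients: int) -> List[List[str]]:
    """Divide the difference list across clients (round-robin, an assumption)."""
    if n_clients < 1:
        raise ValueError("n_clients must be at least 1")
    return [dirs[i::n_clients] for i in range(n_clients)]


if __name__ == "__main__":
    source = {"/a": (1, 10), "/b": (2, 20), "/c": (3, 30)}
    target = {"/a": (1, 10), "/b": (2, 25)}
    diff = diff_directories(source, target)
    print(partition(diff, 2))
```

Each sublist would then be served to one client, which synchronizes only the directories assigned to it.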

Description

Technical field

[0001] The invention relates to the technical field of computer networks, and in particular to a Hadoop cluster-based data migration method and tool.

Background technique

[0002] In the process of building a data platform, as the business grows, the cluster scale expands, and software and hardware environments are upgraded, tasks such as cluster data migration and merging are inevitable. Therefore, to ensure the efficiency of data migration as well as the integrity and accuracy of the data, it is of great significance to develop a set of data migration methods and tools for the Hadoop environment.

[0003] Existing data migration takes two main forms:

[0004] 1) Based on the data synchronization command provided by Hadoop itself, manually copy and migrate each directory;

[0005] 2) Use other programming languages, such as Java, Python, etc. to develop a set of separate data read and write synchronization to...

Application Information

IPC(8): G06F17/30
CPC: G06F16/214, G06F16/27
Inventors: 刘传奇, 李文学
Owner BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD