Method for importing data into multiple Hadoop components simultaneously

A data import technique for the field of rapid transfer and processing of large amounts of data. It addresses a capability that was not previously provided, and is described as having prominent substantive features, a simple structure, and wide application prospects.
CN106919697A · Active · Publication Date: 2017-07-04 · SHANDONG LANGCHAO YUNTOU INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: SHANDONG LANGCHAO YUNTOU INFORMATION TECH CO LTD
Publication Date: 2017-07-04


Abstract

The invention relates to a method for importing data into multiple Hadoop components simultaneously. The method comprises the following steps: (1) extending the Sqoop import tool by adding a service for importing data into Kafka; (2) writing a parameter verification program based on the configuration parameters of the database import components; and (3) extending the Sqoop import tool by adding a service for simultaneously exporting data to HDFS, Hive, HBase and Kafka. On top of Sqoop's original functions of connecting to the database and reading data, the method adds the ability to export data to multiple components at the same time: the database data is read once, and the export modules specified by the user are started simultaneously, which makes data import efficient and convenient. This avoids writing multiple export tasks for the same data and avoids reading the same data repeatedly, and therefore improves efficiency.
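The read-once, write-to-many idea in the abstract can be illustrated with a minimal Python sketch. This is not Sqoop code and not the patent's implementation: the names `REQUIRED_PARAMS`, `validate_params`, `ListSink`, and `import_once`, as well as the parameter keys, are hypothetical and chosen only for illustration. The sinks are in-memory stand-ins for real exporters, and they are called in turn for each row rather than started concurrently.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]

# Hypothetical required parameters per target (illustrative names,
# not real Sqoop options).
REQUIRED_PARAMS: Dict[str, List[str]] = {
    "hdfs": ["target_dir"],
    "hive": ["hive_table"],
    "hbase": ["hbase_table", "column_family"],
    "kafka": ["topic"],
}


def validate_params(targets: List[str],
                    params: Dict[str, Dict[str, str]]) -> List[str]:
    """Return a list of error messages; an empty list means the config passes."""
    errors: List[str] = []
    for target in targets:
        if target not in REQUIRED_PARAMS:
            errors.append(f"unknown target: {target}")
            continue
        for key in REQUIRED_PARAMS[target]:
            if not params.get(target, {}).get(key):
                errors.append(f"{target}: missing parameter '{key}'")
    return errors


class ListSink:
    """In-memory stand-in for an exporter (HDFS, Hive, HBase, or Kafka)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: List[Row] = []

    def write(self, row: Row) -> None:
        self.rows.append(row)


def import_once(rows: Iterable[Row], sinks: List[ListSink]) -> int:
    """Consume the source a single time and hand every row to every sink."""
    count = 0
    for row in rows:
        for sink in sinks:
            sink.write(row)
        count += 1
    return count
```

A caller would first run `validate_params` for the targets the user selected, and only call `import_once` if it returns no errors. Passing a one-shot iterator as `rows` makes the single read explicit: each sink still receives every row.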

Description

Technical field

[0001] The invention belongs to the technical field of rapid transfer processing of large amounts of data, and in particular relates to a method for simultaneously importing data into multiple Hadoop components.

Background technique

[0002] With the rapid development of society, all walks of life generate large amounts of data every day. Data sources include any type of data that can be captured around us, such as websites, social media, transactional business data, and data created in other business environments. The Apache Hadoop framework came into being in this environment. It is an increasingly general distributed computing environment, mainly used to process big data. As cloud providers leverage this framework and more users move datasets between Hadoop and traditional databases, tools that facilitate data transfer become even more important. Apache Sqoop is a data transfer tool, mainly used for data transfer between Hadoop and traditi...
