A way to import data into multiple hadoop components simultaneously

A component and data technology, which is applied in the field of rapid transfer processing of large amounts of data, can solve the problem of non-supply, etc., and achieve the effect of broad application prospects, prominent substantive features, and simple structure
CN106919697BActive Publication Date: 2020-09-25SHANDONG LANGCHAO YUNTOU INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANDONG LANGCHAO YUNTOU INFORMATION TECH CO LTD
Publication Date
2020-09-25

Smart Images

  • Figure 1
    Figure 1
Patent Text Reader

Abstract

The invention relates to a method for importing data into multiple Hadoop components simultaneously, and is characterized in that the method comprises the following steps of (1) extending an import tool of the Sqoop, and adding service for importing data to the Kafka; (2) importing configuration parameters of the components according to a database, and writing parameter verification program; and (3) extending the import tool of the Sqoop, and adding service for simultaneously exporting the data to the HDFS, Hive, Hbase and Kafka. On basis of original functions of connecting the database and reading the data of the Sqoop, the function of simultaneously exporting the data to the multiple components is added, the database data are read for one time, multiple user-specified export modules are started simultaneously, and efficient and convenient data import is achieved. On one hand, multiple export tasks are prevented from being written for the same data, on the other hand, the same data are prevented from being repeatedly read, and therefore efficiency is improved.
Need to check novelty before this filing date? Find Prior Art

Description

Technical field

[0001] The invention belongs to the technical field of rapid transfer processing of large amounts of data, and specifically relates to a method for simultaneously importing data into multiple Hadoop components. Background technique

[0002] With the rapid development of society today, various industries generate a large amount of data every day. The data sources include any type of data that can be captured around us, websites, social media, transactional business data, and data created in other business environments. As cloud providers use this framework, more users transfer data sets between Hadoop and traditional databases, and tools that can help data transfer become more important. In this environment, the Apache framework Hadoop came into being, it is an increasingly common distributed computing environment, mainly used to process big data. Apache Sqoop is a data transfer tool, which is mainly used to transfer data between Hadoop and traditional databases. ...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More