Data extraction system and data extraction method

A data extraction system and method, applied in the field of big data processing, that addresses problems such as the inability to synchronize data table processing operations

Status: Inactive. Publication Date: 2018-01-09
新智云数据服务有限公司
5 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

[0005] In view of this, embodiments of the present invention provide a data extraction system and a data extraction method to address a technical defect in the prior art: when the same data table is stored both in a Hadoop big data platform and in the enterprise management software SAP, the Hadoop big data platform cannot synchronize the processing operations that SAP performs on that data table.

Examples

Embodiment 1

[0044] Figure 1 is a structural diagram of a data extraction system provided in Embodiment 1 of the present invention. The data extraction system in this embodiment specifically includes:

[0045] A data source 11 and a big data platform 12. The data source 11 includes a data replication module 111 and a data extraction module 112, and the big data platform 12 includes a distributed file module 121 and/or a data warehouse tool module 122.

[0046] The data replication module 111 is used to add a database trigger to the data source 11, copy the update data from the database trigger, and generate an incremental data extraction queue according to the update data, wherein the database trigger is used to record data change information when it determines that the data of the data source 11 has changed.

[0047] In this embodiment, the data replication module 111 may add a database trigger in the data source 11, and the database trigger is used to record data ch...
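
The trigger-and-queue mechanism described for the data replication module 111 can be illustrated with a minimal sketch. It assumes an in-memory SQLite database as a stand-in for data source 11; the table `orders`, the log table `change_log`, and the functions `build_source` and `replicate_changes` are hypothetical names chosen for illustration and do not come from the patent.

```python
import sqlite3
from collections import deque


def build_source():
    """Create an in-memory stand-in for data source 11 with one data table.

    The table and trigger names are hypothetical; the patent does not name them.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);

        -- Change log that the triggers write to when the data changes.
        CREATE TABLE change_log (
            seq    INTEGER PRIMARY KEY AUTOINCREMENT,
            op     TEXT NOT NULL,
            row_id INTEGER NOT NULL,
            amount REAL
        );

        CREATE TRIGGER orders_ins AFTER INSERT ON orders BEGIN
            INSERT INTO change_log (op, row_id, amount)
            VALUES ('INSERT', NEW.id, NEW.amount);
        END;

        CREATE TRIGGER orders_upd AFTER UPDATE ON orders BEGIN
            INSERT INTO change_log (op, row_id, amount)
            VALUES ('UPDATE', NEW.id, NEW.amount);
        END;

        CREATE TRIGGER orders_del AFTER DELETE ON orders BEGIN
            INSERT INTO change_log (op, row_id, amount)
            VALUES ('DELETE', OLD.id, NULL);
        END;
    """)
    return conn


def replicate_changes(conn, last_seq=0):
    """Copy update data recorded by the triggers into an incremental queue.

    Only entries newer than last_seq are copied, so repeated calls can pick
    up where the previous call stopped.
    """
    rows = conn.execute(
        "SELECT seq, op, row_id, amount FROM change_log "
        "WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    return deque(rows)
```

Calling `replicate_changes(conn)` after a few writes to `orders` returns the recorded changes in the order they occurred; passing the last seen `seq` on the next call yields only newer entries.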

Embodiment 2

[0056] Figure 2 is a structural diagram of a data extraction system provided by Embodiment 2 of the present invention. This embodiment refines the foregoing embodiments. In this embodiment, the data extraction module 112 is further used to send a data replication rule to the data replication module 111.

[0057] Correspondingly, the data replication module 111 is specifically used to replicate update data from the database triggers according to the data replication rule.

[0058] Further, the update data specifically comprises the incremental data of a data table and the timestamp of the incremental data, wherein the data table is a data table stored in the data source 11.

[0059] Further, the system also includes a diversified interface system 13 for receiving the update data sent by the data extraction module 112 and sending the received update data to the big data platform 12.
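
As a rough illustration of the refinements in Embodiment 2, the sketch below models update data as incremental rows with timestamps, filters them with a replication rule, and forwards them through an interface object. The excerpt does not say what a data replication rule contains, so treating it as a set of table names is an assumption, as are the names `UpdateRecord`, `ReplicationRule`, `replicate`, and `InterfaceSystem`, and the choice to forward records in timestamp order.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class UpdateRecord:
    """Incremental data of a data table plus its timestamp."""
    table: str
    op: str
    row: dict
    timestamp: float


@dataclass(frozen=True)
class ReplicationRule:
    """Hypothetical rule: replicate only changes to the listed tables."""
    tables: frozenset

    def matches(self, record: UpdateRecord) -> bool:
        return record.table in self.tables


def replicate(captured: Iterable[UpdateRecord],
              rule: ReplicationRule) -> List[UpdateRecord]:
    """Keep only the captured changes that the replication rule selects."""
    return [record for record in captured if rule.matches(record)]


class InterfaceSystem:
    """Stand-in for interface system 13: receives update data and passes it on."""

    def __init__(self, deliver: Callable[[UpdateRecord], None]):
        # deliver is whatever callable hands a record to the big data platform.
        self._deliver = deliver

    def forward(self, records: Iterable[UpdateRecord]) -> int:
        # Forwarding in timestamp order is an assumption made for this sketch.
        ordered = sorted(records, key=lambda record: record.timestamp)
        for record in ordered:
            self._deliver(record)
        return len(ordered)
```

In this sketch the platform side is just a callable, so a list's `append` method is enough to observe what would be delivered.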

[0060] Further, the data source 11 is optimized...

Embodiment 3

[0070] Figure 3 is a flowchart of a data extraction method provided by Embodiment 3 of the present invention. The method of this embodiment can be executed by a data extraction system, which can be implemented in hardware and/or software and can generally be integrated into a computer or server. The method of this embodiment specifically includes:

[0071] Step 310: Add a database trigger to the internal data source, copy update data from the database trigger, and generate an incremental data extraction queue according to the update data.

[0072] In this embodiment, the internal data source may specifically be a data source that stores some or all of its data in the form of data tables. The database trigger added to the internal data source is specifically used to record the data change information, forming update data, when it determines that the data of the data source has changed. The database triggers correspond one-to-one with the data tables, and as many tables as there a...
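
The one-to-one correspondence between triggers and data tables can be sketched as a loop that adds one trigger per table. This is only an illustration under stated assumptions: SQLite stands in for the internal data source, only insert events are logged to keep the example short, and `add_triggers_for_all_tables` and `change_log` are hypothetical names. Table names are interpolated into SQL directly, which is acceptable here only because the names come from the database catalog itself.

```python
import sqlite3


def add_triggers_for_all_tables(conn, log_table="change_log"):
    """Add one insert trigger per user table and return the table names."""
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {log_table} ("
        "seq INTEGER PRIMARY KEY, table_name TEXT, op TEXT, row_id INTEGER)"
    )
    # Read the full table list first, skipping SQLite internals and the log.
    tables = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' AND name != ?",
            (log_table,),
        )
    ]
    for table in tables:
        # One trigger per data table, named after the table it watches.
        conn.execute(
            f"CREATE TRIGGER IF NOT EXISTS {table}_changes "
            f"AFTER INSERT ON {table} BEGIN "
            f"INSERT INTO {log_table} (table_name, op, row_id) "
            f"VALUES ('{table}', 'INSERT', NEW.rowid); END"
        )
    return tables
```

After the call, the number of triggers in `sqlite_master` matches the number of user tables, and an insert into any of them leaves one entry in the log table.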

Abstract

Embodiments of the invention disclose a data extraction system and a data extraction method. The data extraction system comprises a data source and a big data platform. The data source comprises a data replication module and a data extraction module, and the big data platform comprises a distributed file module and/or a data warehouse tool module. The data replication module adds database triggers to the data source and copies update data from the database triggers to generate an incremental data extraction queue. The data extraction module extracts the update data from the incremental data extraction queue at a set time interval and sends it to the big data platform. The big data platform corrects its stored data according to the update data, and the distributed file module and the data warehouse tool module receive the update data. With this technical scheme, a Hadoop big data platform can obtain the processing operations that the enterprise management software SAP performs on data tables and update its own stored copies of those tables accordingly.
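
The interval-based extraction and the correction of stored data summarized in the abstract can be sketched as follows. This is a simplified illustration, not the patented implementation: a Python dictionary stands in for the data stored on the big data platform, queue entries are assumed to be `(op, key, row)` tuples, and the functions `extract_batch`, `apply_updates`, and `run_extraction` are hypothetical.

```python
import time
from collections import deque


def extract_batch(queue):
    """Drain everything currently in the incremental data extraction queue."""
    batch = []
    while queue:
        batch.append(queue.popleft())
    return batch


def apply_updates(stored, batch):
    """Correct the platform's stored copy according to the update data."""
    for op, key, row in batch:
        if op == "DELETE":
            stored.pop(key, None)
        else:
            # INSERT and UPDATE both leave the latest row version in place.
            stored[key] = row
    return stored


def run_extraction(queue, stored, interval_seconds, cycles, sleep=time.sleep):
    """Extract and apply updates once per interval for a fixed number of cycles."""
    for _ in range(cycles):
        apply_updates(stored, extract_batch(queue))
        sleep(interval_seconds)
    return stored
```

The `sleep` parameter is injectable so the loop can be exercised without waiting; a real deployment would more likely rely on a scheduler than on a blocking loop.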

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of big data processing, and in particular, to a data extraction system and a data extraction method.

Background art

[0002] With the rapid development of the national economy, the amount of data generated and stored in all walks of life is rising rapidly. "Big data" has penetrated every industry and field and has become an important factor of production. Hadoop is a distributed system infrastructure developed by the Apache Foundation, which implements a distributed file system. Hadoop can process data in a reliable, efficient, and scalable manner. As a result, Hadoop has rapidly developed into a leading platform for analyzing big data.

[0003] Since Hadoop cannot independently modify and delete its own existing data, nor can it independently add new data, if it is necessary to modify or delete existing Hadoop data, the corresponding data modification instructio...

Application Information

IPC(8): G06F17/30
Inventors: 张含宇, 许伟, 孟凡华, 米文龙
Owner: 新智云数据服务有限公司