Method for synchronously replicating data to Hadoop platform from PG database based on log analysis technology

A replication method and data synchronization technology, applied in database distribution/replication, electronic digital data processing, structured data retrieval, etc. It addresses the problem of data exchange between business systems, with the effects of reducing the backup burden, speeding up response time, and reducing transmission volume.

Inactive Publication Date: 2018-06-29
CHINA REALTIME DATABASE +1

AI Technical Summary

Problems solved by technology

The master-slave configuration scheme can only be used when both the source and target databases are PG databases, which makes data exchange between business systems difficult.
In particular, it is very difficult to synchronously replicate the data of a PG database to the Hadoop platform.



Examples


Embodiment 1

[0022] An embodiment of the present invention discloses a method for synchronously replicating data from a PostgreSQL database to a Hadoop platform based on log analysis technology. Its main architecture, shown in Figure 1, comprises three parts: log parsing, message receiving, and SQL adaptation.

[0023] Before formal synchronous replication begins, the logical replication function of the PostgreSQL database must be enabled, the maximum number of log sending processes must be greater than the set number (2 in this embodiment), and the database user settings must be modified so that the streaming replication protocol can be used directly.
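The prerequisites in [0023] can be sketched as a small check over a snapshot of server settings. This is a minimal sketch rather than the patented implementation. It assumes that the "logical replication function" corresponds to the PostgreSQL parameter `wal_level = logical`, that the "maximum number of log sending processes" corresponds to `max_wal_senders`, and that the user setting is a flag saying whether the role may use the streaming replication protocol. The function name `check_prerequisites` and the dictionary layout are hypothetical.

```python
from typing import Dict, List

# Threshold from this embodiment: max log sending processes must exceed 2.
MIN_WAL_SENDERS = 2


def check_prerequisites(settings: Dict[str, str],
                        role_can_replicate: bool,
                        min_senders: int = MIN_WAL_SENDERS) -> List[str]:
    """Return a list of unmet prerequisites (empty when all are satisfied).

    `settings` is a hypothetical snapshot of server parameters, for example
    collected from postgresql.conf; the mapping to [0023] is an assumption.
    """
    problems: List[str] = []

    # Assumption: logical replication is enabled via wal_level = logical.
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical'")

    # The count must be strictly greater than the set number.
    if int(settings.get("max_wal_senders", "0")) <= min_senders:
        problems.append(f"max_wal_senders must be greater than {min_senders}")

    # Assumption: the database user must be allowed to use streaming replication.
    if not role_can_replicate:
        problems.append("database user may not use the streaming replication protocol")

    return problems
```

An empty result means synchronization can start; otherwise each string names one setting to correct first.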

[0024] As shown in Figure 2, the log parsing module filters the logical logs of the PostgreSQL database that need to be processed and sends complete data according to transaction integrity. Specifically, the log parsing module analyzes the format of the logical log of the PostgreSQL database, and obtains ...
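The filtering and transaction-integrity behavior described in [0024] can be illustrated as follows. This is only a sketch under stated assumptions: the event tuple `(kind, xid, table, row)` is a hypothetical, already-decoded representation of the logical log, not PostgreSQL's actual log format, and the function name `complete_transactions` is invented for illustration. Changes are buffered per transaction and released only when that transaction's commit is seen, so the receiver never gets a partial transaction.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

# Hypothetical decoded-log event: (kind, transaction id, table name, row data).
Event = Tuple[str, int, str, dict]
Change = Tuple[str, str, dict]


def complete_transactions(events: Iterable[Event],
                          tables: Set[str]) -> Iterator[Tuple[int, List[Change]]]:
    """Yield (xid, changes) only for committed transactions.

    Changes on tables outside `tables` are filtered out, and transactions
    left with no relevant changes are skipped entirely.
    """
    open_txns: Dict[int, List[Change]] = {}
    for kind, xid, table, row in events:
        if kind == "BEGIN":
            open_txns[xid] = []
        elif kind == "COMMIT":
            # Release the buffered changes as one complete unit.
            changes = open_txns.pop(xid, None)
            if changes:
                yield xid, changes
        elif kind == "ROLLBACK":
            # Discard everything buffered for an aborted transaction.
            open_txns.pop(xid, None)
        elif kind in ("INSERT", "UPDATE", "DELETE"):
            # Filtering rule: keep only changes on the configured tables.
            if table in tables and xid in open_txns:
                open_txns[xid].append((kind, table, row))
```

Buffering per transaction id keeps interleaved transactions separate; a transaction that has not committed by the end of the input is simply never emitted.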


Abstract

The invention belongs to the technical field of power system databases, and discloses a method for synchronously replicating data from a PostgreSQL database to a Hadoop platform based on log analysis technology. The method includes the following steps: the logical replication function of the PostgreSQL database is enabled, the maximum number of log sending processes is ensured to be greater than the set number, and the database user settings are modified so that the streaming replication protocol can be used directly; a log analysis module filters, according to rules, the logical logs of the PostgreSQL database that need to be processed and sends complete data according to transaction integrity; a message receiving module receives the data from the log analysis module according to the configured receiving information and writes it into a local cache data file for data loading according to the local rule; an SQL adaptation module reads the cache data file, converts it into a general standard SQL statement format according to the type of the Hadoop platform, and loads the data into the Hadoop platform. The method improves the efficiency of synchronous database replication.
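The last two stages of the abstract, writing received data to a local cache file and converting that file into SQL statements, can be sketched as below. This is a hedged illustration, not the patent's code: the cache format (one JSON object per line with hypothetical keys `op`, `table`, and `row`), the function names, and the restriction to INSERT statements are all assumptions. Loading the statements into a particular Hadoop SQL engine is not shown.

```python
import json
from typing import Iterable, Iterator


def write_cache(path: str, changes: Iterable[dict]) -> None:
    """Message receiving side: append-free rewrite of a local cache file,
    one JSON object per line (hypothetical cache format)."""
    with open(path, "w", encoding="utf-8") as f:
        for change in changes:
            f.write(json.dumps(change, ensure_ascii=False) + "\n")


def sql_literal(value) -> str:
    """Render a Python value as a standard SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings: escape embedded single quotes by doubling them.
    return "'" + str(value).replace("'", "''") + "'"


def to_insert_statements(path: str) -> Iterator[str]:
    """SQL adaptation side: read the cache file and emit INSERT statements.

    Only INSERT changes are handled in this sketch; other operations are skipped.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            change = json.loads(line)
            if change["op"] != "INSERT":
                continue
            columns = ", ".join(change["row"])
            values = ", ".join(sql_literal(v) for v in change["row"].values())
            yield f"INSERT INTO {change['table']} ({columns}) VALUES ({values})"
```

Decoupling the two stages through a file means the receiver and the loader can run at different rates, which matches the abstract's description of a local cache between receiving and loading.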

Description

Technical field

[0001] The invention belongs to the technical field of power system databases, and specifically relates to a method for synchronously replicating data from a PostgreSQL database to a Hadoop platform based on log analysis technology.

Background technique

[0002] With the construction of the "State Grid Resource Planning Information System" (SG-ERP) project of the State Grid Corporation of China, the corporation has built application systems covering three areas, five major centers, two centers, information platforms, and comprehensive analysis and decision-making, and the information system architecture has become more complex. To ensure data consistency between different business systems, the problem of data exchange between business systems must be solved, and real-time synchronization between business system databases is one feasible way to solve it.

[0003] However, there are many types of database synchronization...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/27
Inventor: 蒋元晨, 徐增荣, 李贤慧, 何阳, 黄伟
Owner: CHINA REALTIME DATABASE