Data reading and writing method and data reading and writing system

A data reading and writing technology, applied in the field of data technology and data processing, that addresses the hanging of the Hive JDBC server, the task scheduling time consumed by JDBC writes, and slow write speed, achieving high throughput and strong stability.

Active Publication Date: 2019-02-22
杭州玳数科技有限公司

AI Technical Summary

Problems solved by technology

Writing Hive data through JDBC is slow because requests to insert records are converted into a large number of small MapReduce tasks, which consume a lot of task scheduling time. Reading Hive data through JDBC can easily cause Hive's JDBC server to hang, which blocks all connections to that server.

Embodiment Construction

[0043] The implementation of the present invention will be illustrated in detail below with reference to the accompanying drawings. The examples of the present invention are for further explaining the present invention, rather than limiting the protection scope of the present invention.

[0044] Referring to Figure 2 and Figure 3, this application proposes a method for the Flink platform to read Hive quickly, including:

[0045] Define a Hive data reading class that implements the InputFormat interface of the Flink framework. In this embodiment, the class is defined in Java as: class HiveInputFormat implements InputFormat.

[0046] Implement the configure method of the InputFormat interface of the Flink framework in the Hive data reading class;

[0047] In the implemented configure method, the following sub-steps are included:

[0048] Obtain the database connection instance of Hive through the JDBC connection string of Hive;

[0...
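
The configure step described in [0046] to [0048] could look roughly like the sketch below. Flink's real InputFormat interface lives in org.apache.flink.api.common.io, takes a Flink Configuration object, and declares further methods; to keep the example free of external dependencies, a one-method stand-in named InputFormatSketch is used here. The ConnectionFactory hook, the isConnected helper, and the use of java.util.Properties are illustrative assumptions and do not come from the patent text.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

/** Simplified stand-in for Flink's InputFormat; only configure is modeled (assumption). */
interface InputFormatSketch {
    void configure(Properties parameters);
}

/** Hive data reading class whose configure method opens the Hive JDBC connection. */
class HiveInputFormat implements InputFormatSketch {

    /** Hypothetical hook so the way a connection is opened can be swapped out. */
    interface ConnectionFactory {
        Connection open(String jdbcUrl) throws SQLException;
    }

    private final String jdbcUrl;
    private final ConnectionFactory connectionFactory;
    private Connection connection;

    /** Default: open the connection through the standard JDBC DriverManager. */
    HiveInputFormat(String jdbcUrl) {
        this(jdbcUrl, DriverManager::getConnection);
    }

    HiveInputFormat(String jdbcUrl, ConnectionFactory connectionFactory) {
        this.jdbcUrl = jdbcUrl;
        this.connectionFactory = connectionFactory;
    }

    /** Sub-step [0048]: obtain the Hive connection instance from the JDBC connection string. */
    @Override
    public void configure(Properties parameters) {
        try {
            connection = connectionFactory.open(jdbcUrl);
        } catch (SQLException e) {
            throw new IllegalStateException("Cannot connect to Hive at " + jdbcUrl, e);
        }
    }

    /** True once configure has obtained a connection. */
    boolean isConnected() {
        return connection != null;
    }
}
```

With a Hive JDBC driver on the classpath, calling configure on new HiveInputFormat("jdbc:hive2://host:10000/default") would open the connection through DriverManager; the host, port, and database in that URL are placeholders.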

Abstract

The application provides a data reading and writing method and a data reading and writing system. By parsing the details of a Hive data table and converting reads and writes of the table into file reads and writes on the HDFS file system, the method and system avoid the server hang and slow write speed that can occur when the Hive data table is accessed through JDBC. Because the table data is read and written directly on the underlying HDFS file system, the method and system offer high throughput and strong stability.
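
The idea of turning table access into file access can be illustrated with a minimal sketch. It makes several assumptions that are not stated in the source: the parsed table details are available as a map with a "Location" entry, the table is stored as delimited text using Hive's default Ctrl-A field delimiter, and a local temporary directory plays the role of the table's HDFS directory (a real implementation would go through the Hadoop file system API instead of java.nio). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: table reads and writes expressed as plain file I/O. */
class HiveTableFiles {
    /** Hive's default field delimiter for text tables (Ctrl-A). */
    static final String DEFAULT_DELIMITER = "\u0001";

    /** Picks the storage directory out of already-parsed table details (assumed key: "Location"). */
    static String locationOf(Map<String, String> tableDetails) {
        String location = tableDetails.get("Location");
        if (location == null || location.isEmpty()) {
            throw new IllegalArgumentException("table details contain no Location");
        }
        return location;
    }

    /** Writes all rows into one new data file under the table directory. */
    static void writeRows(Path tableDir, String fileName, List<String[]> rows, String delimiter) {
        List<String> lines = rows.stream()
                .map(row -> String.join(delimiter, row))
                .collect(Collectors.toList());
        try {
            Files.write(tableDir.resolve(fileName), lines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads every regular file in the table directory and splits each line into fields. */
    static List<String[]> readRows(Path tableDir, String delimiter) {
        List<String[]> rows = new ArrayList<>();
        try (Stream<Path> files = Files.list(tableDir)) {
            List<Path> dataFiles = files.filter(Files::isRegularFile)
                    .sorted()
                    .collect(Collectors.toList());
            for (Path file : dataFiles) {
                for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                    rows.add(line.split(Pattern.quote(delimiter), -1));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    /** Round trip against a temporary local directory standing in for the table's HDFS directory. */
    static List<String[]> roundTripDemo() {
        try {
            Path tempDir = Files.createTempDirectory("hive_table_sketch");
            String location = locationOf(Collections.singletonMap("Location", tempDir.toString()));
            Path tableDir = Paths.get(location);
            writeRows(tableDir, "part-00000", Arrays.asList(
                    new String[] {"1", "alice"},
                    new String[] {"2", "bob"}), DEFAULT_DELIMITER);
            return readRows(tableDir, DEFAULT_DELIMITER);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a write produces a single data file and a read scans the table directory, so no SQL statement and no JDBC server are involved in either direction.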

Description

Technical field

[0001] The invention belongs to the field of data technology and data processing, and in particular relates to a data reading and writing method and a data reading and writing system.

Background

[0002] Flink is an open source computing platform for distributed stream processing and batch processing. It is implemented mainly in Java and offers high throughput and low latency. By implementing the InputFormat and OutputFormat interfaces of the Flink framework, the Flink platform can read data from and write data to different data sources. Hive is a Hadoop-based data warehouse engine that maps structured data files to database tables and provides a simple SQL query capability by converting SQL statements into MapReduce tasks for execution.

[0003] To let the Flink platform read and write Hive, a common approach is to use the JDBCInputFormat and JDBCOutputFormat classes provided by the Flink framework to read and write ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/22, G06F16/242, G06F3/06
CPC: G06F3/061
Inventor: 胡一帆
Owner: 杭州玳数科技有限公司