Method and device for processing Redshift external table dynamic column

A processing method and device, applied in the field of data processing, that address the problems that the column structure cannot be modified directly and that operation is inconvenient, making query operations convenient and simplifying operation.

Active Publication Date: 2019-10-01
CHENDU PINGUO TECH
8 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, this storage method does not support direct modification of the column structure.
When you need to modify the column structure of the table (for example, when you need to invalidate or add so

Examples

Embodiment 2

[0030] Figure 2 is a flow chart of the method of Embodiment 2 of the present invention. Building on Embodiment 1, Embodiment 2 of the present invention further includes:

[0031] Step 105: Create or update a Redshift View according to the second table header information; the Redshift View is used to query predetermined columns in the Redshift external table.

[0032] In this step, the Redshift View can be created or updated by presetting a SQL statement as the View; querying the View afterwards is equivalent to running the preset SQL. In this embodiment, the predetermined columns are the valid columns. The Redshift View filters out unoccupied reserved columns and disabled columns and shows only valid columns to users. In the program code shown in Figure 9, empty Metadata information means that the reserved column is not occupied. The columns whose in_use field in the metadata is false will be filtered out when using Redshift...
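The filtering described in Step 105 might be sketched as follows. This is not the code of Figure 9, which is not reproduced here; it is a minimal illustration under stated assumptions. The header layout (a list of entries with `column` and `metadata` keys, where `metadata` carries `name` and `in_use`), the function names, and the table and view names are all hypothetical. The `WITH NO SCHEMA BINDING` clause is what Redshift requires for a view that references an external table.

```python
from typing import Dict, List, Optional, Tuple


def valid_columns(header: List[Dict[str, Optional[dict]]]) -> List[Tuple[str, str]]:
    """Return (physical column, display name) pairs for valid columns only.

    Assumed layout (hypothetical): each entry is
    {"column": <physical name>, "metadata": {"name": ..., "in_use": bool} or None}.
    """
    cols = []
    for entry in header:
        meta = entry.get("metadata")
        if not meta:
            # Empty metadata: the reserved column is not occupied.
            continue
        if not meta.get("in_use", False):
            # in_use is false: the column has been disabled.
            continue
        cols.append((entry["column"], meta.get("name", entry["column"])))
    return cols


def build_view_sql(view_name: str, external_table: str,
                   header: List[Dict[str, Optional[dict]]]) -> str:
    """Build a CREATE OR REPLACE VIEW statement exposing only valid columns."""
    cols = valid_columns(header)
    if not cols:
        raise ValueError("no valid columns to expose")
    select_list = ",\n    ".join(f'"{phys}" AS "{alias}"' for phys, alias in cols)
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT\n    {select_list}\n"
        f"FROM {external_table}\n"
        f"WITH NO SCHEMA BINDING;"
    )
```

Because the statement is regenerated from the header each time, re-running it after a header change updates the View without touching the external table itself.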

Abstract

The invention discloses a method and a device for processing dynamic columns of a Redshift external table. The method comprises the following steps: loading first header information of a Redshift external table, wherein the first header information comprises the column names of the external table, the mapping relation between each column name and a column of the Spark DataFrame to be stored, and the usage status of each column name; according to the first header information, mapping the columns of the Redshift external table one-to-one to the columns of the Spark DataFrame to be stored, and generating second header information of the Redshift external table; according to the second header information, updating the header structure of the Spark DataFrame to be stored to obtain the updated Spark DataFrame; and storing the updated Spark DataFrame in the Redshift external table. According to the technical scheme provided by the invention, the column structure of the Redshift external table can be changed dynamically, and operation is simplified.
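The mapping step in the abstract could look roughly like the sketch below, written in plain Python so it runs without a Spark cluster. The header layout (`column`, `source`, `in_use` keys), the function name `map_columns`, and the policy of assigning a new DataFrame column to the first unoccupied reserved column are assumptions made for illustration; the patent text available here does not specify them.

```python
import copy
from typing import Dict, List, Optional, Tuple

HeaderEntry = Dict[str, Optional[object]]


def map_columns(first_header: List[HeaderEntry],
                df_columns: List[str]) -> Tuple[List[HeaderEntry], Dict[str, str]]:
    """Map DataFrame columns one-to-one onto external-table columns.

    Assumed layout (hypothetical): each header entry is
    {"column": <external column>, "source": <DataFrame column or None>, "in_use": bool}.
    Returns the second header and a {DataFrame column: external column} rename map.
    """
    # Work on a copy so the first header stays unchanged.
    second = copy.deepcopy(first_header)
    by_source = {e["source"]: e for e in second if e["source"] is not None}
    free = [e for e in second if e["source"] is None]  # unoccupied reserved columns
    rename: Dict[str, str] = {}
    for name in df_columns:
        entry = by_source.get(name)
        if entry is None:
            # New DataFrame column: occupy the next reserved column.
            if not free:
                raise ValueError(f"no reserved column left for {name!r}")
            entry = free.pop(0)
            entry["source"] = name
            entry["in_use"] = True
        rename[name] = str(entry["column"])
    return second, rename
```

With PySpark, the rename map could then be applied through `df.toDF(*[rename[c] for c in df.columns])` before the DataFrame is written to the location backing the external table; that write step is left out of this sketch.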

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a method and device for processing dynamic columns of a Redshift external table.

Background

[0002] Amazon Web Services provides a series of basic services, including AWS Redshift. AWS Redshift is a fast, scalable data warehouse that makes it easy, cost-effective, and efficient to analyze all data in data warehouses and data lakes. AWS Redshift is used as the storage medium and analysis engine once the landed data has been cleaned, and the data analysis department can directly view, extract, summarize, and otherwise operate on the data.

[0003] The existing statistical data with day granularity is stored on the AWS Redshift server, with the day as the partition key. In consideration of storage cost and security, we keep a copy of each piece of data in AWS S3, and the AWS Redshift server only stores the data of the past three months. For the data stored on the AWS Re...

Application Information

IPC(8): G06F16/22
CPC: G06F16/2282
Inventors: 朱亮, 徐滢
Owner: CHENDU PINGUO TECH