Data processing method and device applied to data warehouse

A data processing technology for data warehouses, applied in the computer field. It addresses problems such as inefficient deduplication and complicated downstream business processing, and achieves the effects of reducing duplicate data storage, improving timeliness, and meeting query and analysis needs.

Active Publication Date: 2018-10-09
BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1

AI Technical Summary

Problems solved by technology

[0008] In view of this, the embodiments of the present invention provide a data processing method and device applied to a data warehouse, which can at least solve the prior-art problem that users must manually deduplicate real-time data and historical data, which makes deduplication inefficient and also complicates subsequent business processing.

Embodiment Construction

[0035] Exemplary embodiments of the present invention are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present invention to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0036] It should be noted that the embodiments of the present invention are mainly applicable to the construction of real-time data warehouses for online platforms, such as e-commerce platforms. In addition, "splitting" as used here means redistributing the original data according to predetermined rules.
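As a rough illustration of redistributing data according to a predetermined rule, the sketch below assigns each record to a partition by hashing one of its fields. The rule itself (CRC32 of a key field modulo a partition count), the function name `split_records`, and the field name `order_id` are assumptions made for this example; the source does not say which rules the invention uses.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List


def split_records(records: Iterable[dict], key_field: str,
                  num_partitions: int) -> Dict[int, List[dict]]:
    """Redistribute records into partitions using a fixed, deterministic rule.

    Assumed rule (not from the source): CRC32 of the key field, modulo the
    number of partitions. Records sharing a key always land together.
    """
    partitions: Dict[int, List[dict]] = defaultdict(list)
    for record in records:
        key_bytes = str(record[key_field]).encode("utf-8")
        index = zlib.crc32(key_bytes) % num_partitions
        partitions[index].append(record)
    return dict(partitions)


# Hypothetical sample data for illustration only.
rows = [
    {"order_id": "A1", "status": "created"},
    {"order_id": "B2", "status": "created"},
    {"order_id": "A1", "status": "paid"},
]
result = split_records(rows, key_field="order_id", num_partitions=4)
```

A deterministic rule like this keeps every record for the same key in the same partition, which is what allows later steps to compare records for that key in one place.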

[0037] Referring to Figure 1, which shows a main flowchart of a data processing method a...

Abstract

The invention discloses a data processing method and device applied to a data warehouse, and relates to the technical field of computers. The data processing method includes the following steps: real-time data is received, and historical data related to the real-time data is determined; the file of the historical data is determined, and the real-time data is transmitted to the real-time data layer of that file; the version of the real-time data and the version of the historical data are obtained, whichever of the two has the latest version is extracted as the latest data record of the file, and the latest data record is written into the historical data layer to update the record. The method and device of the embodiments combine a traditional data warehouse with real-time computing technology, improve the ETL timeliness of the data warehouse, and meet users' needs for querying and analyzing real-time data. In addition, because they build on existing offline data, they reduce the development cost of real-time computing in the data warehouse and better meet its business requirements.
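The version comparison described in the abstract can be sketched roughly as follows. This is a minimal in-memory illustration, not the patented implementation: the class names `Record` and `WarehouseFile`, the integer `version` field, and the use of a business key to relate real-time data to historical data are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Record:
    key: str        # assumed business key relating real-time and historical data
    version: int    # assumed comparable version; higher means newer
    payload: dict


@dataclass
class WarehouseFile:
    # Historical data layer: one latest record per key.
    historical_layer: Dict[str, Record] = field(default_factory=dict)
    # Real-time data layer: incoming records not yet merged.
    realtime_layer: List[Record] = field(default_factory=list)


def receive(file: WarehouseFile, record: Record) -> None:
    """Transmit an incoming real-time record to the file's real-time layer."""
    file.realtime_layer.append(record)


def merge_latest(file: WarehouseFile) -> None:
    """Compare versions per key and write the newest record to the historical layer."""
    for record in file.realtime_layer:
        current = file.historical_layer.get(record.key)
        if current is None or record.version > current.version:
            file.historical_layer[record.key] = record
    file.realtime_layer.clear()


# Hypothetical usage: a stale duplicate arrives after a newer update.
f = WarehouseFile(
    historical_layer={"order-1": Record("order-1", 1, {"status": "created"})}
)
receive(f, Record("order-1", 2, {"status": "paid"}))
receive(f, Record("order-1", 1, {"status": "created"}))
merge_latest(f)
print(f.historical_layer["order-1"].payload)  # {'status': 'paid'}
```

Because the merge keeps only the highest version per key, duplicates and out-of-order arrivals are resolved without a manual deduplication pass, which is the effect the abstract claims.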

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a data processing method and device applied to a data warehouse.

Background technique

[0002] With the rapid growth of business data, data warehouse construction has become a key measure for effectively managing and using big data for business intelligence and decision support. Usually, a Hive-based data warehouse focuses on offline data processing and mainly supports data services with T+1 timeliness. The offline ETL (Extract-Transform-Load) operating mode of such a data warehouse is increasingly unable to meet users' requirements for processing real-time data.

[0003] In the prior art, the solution for realizing the quasi-real-time ETL of the data warehouse is to store the historical data separately from the streaming data files processed by the real-time computing framework. Real-time data and historical data will not be correlate...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 安金龙刘业辉张飞张宁王淼鑫张增
Owner: BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD