Real-time streaming data processing method and device, computer equipment and storage medium

A real-time streaming data processing technology, applied in the field of data processing, that addresses problems such as the poor timeliness of row deletion and modification in Hive and improves real-time processing efficiency.

Pending Publication Date: 2022-07-01
SF TECH

AI Technical Summary

Problems solved by technology

[0004] Based on this, it is necessary to provide a real-time streaming data processing method, device, computer equipment and storage medium to address the technical problem that row deletion and modification in Hive have poor timeliness in the current technology.



Examples


Embodiment Construction

[0046] To make the objectives, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application, not to limit it.

[0047] In the technology involved in this application, the data set storage management program may be a program for acquiring and processing data sets in real time, and the processed data may be stored in a corresponding database through a program interface. The data set storage management program includes but is not limited to Hudi, Kudu, Hive Transactions, HBase, Stream Processing, etc. Different data set storage management programs can be configured according to different data processing requirements. The following takes Hudi as an example.
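The idea of configuring different data set storage management programs behind one program interface can be sketched as follows. This is only an illustrative sketch, not the implementation described in the application: `DatasetStore`, `DictStore`, `STORE_FACTORIES` and `make_store` are hypothetical names, and the in-memory `DictStore` merely stands in for a real backend such as Hudi or Kudu.

```python
from typing import Callable, Dict, List, Protocol


class DatasetStore(Protocol):
    """Hypothetical common interface for a data set storage management program."""

    def upsert(self, rows: List[dict]) -> int:
        """Insert or update rows and return the number of rows handled."""
        ...


class DictStore:
    """In-memory stand-in for a real backend (assumption, for illustration only)."""

    def __init__(self, key: str) -> None:
        self.key = key
        self.rows: Dict[object, dict] = {}

    def upsert(self, rows: List[dict]) -> int:
        # A later row with the same key replaces the earlier one.
        for row in rows:
            self.rows[row[self.key]] = row
        return len(rows)


# Registry mapping a configured name to a factory; a real deployment would
# register wrappers around actual storage clients here.
STORE_FACTORIES: Dict[str, Callable[[], DatasetStore]] = {
    "memory": lambda: DictStore(key="id"),
}


def make_store(name: str) -> DatasetStore:
    """Select a backend by its configured name."""
    try:
        return STORE_FACTORIES[name]()
    except KeyError:
        raise ValueError(f"unknown data set store: {name!r}") from None
```

Calling `make_store("memory")` returns a store whose `upsert` keeps only the latest row per key; swapping the backend only requires registering another factory under a different name.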

[0048] Hudi is an op...


Abstract

The invention provides a real-time streaming data processing method and device, computer equipment and a storage medium. The method comprises the following steps: acquiring to-be-processed streaming data and generating a transaction number for it; sending the to-be-processed streaming data to at least one computing node, dividing it into a plurality of data segments, and processing the data segments in parallel; based on the transaction number, storing the results of the parallel processing in a distributed file system through an interface of a data set storage management program; and, when all processing results of the parallel processing have been stored in the distributed file system, executing the transaction submission for the to-be-processed streaming data. According to the method, the transaction submission initiated by each computing node for the to-be-processed streaming data is detected, the related data is stored based on the transaction number, and the processing results are stored in the distributed file system through the interface of the data set storage management program, so that the files of the distributed file system can be updated in real time and the real-time processing efficiency of the data is improved.
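The flow in the abstract (transaction number, segmentation, parallel processing, storage keyed by the transaction number, commit only once every result is stored) can be sketched roughly as below. This is a minimal single-process sketch under stated assumptions, not the claimed implementation: `InMemoryStore` is a hypothetical stand-in for the storage interface and the distributed file system, threads stand in for computing nodes, and upper-casing a record stands in for the real processing.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


class InMemoryStore:
    """Hypothetical stand-in for the storage interface plus file system."""

    def __init__(self):
        self._lock = threading.Lock()
        self.pending = {}    # txn_id -> {segment_index: result}
        self.committed = {}  # txn_id -> ordered list of segment results

    def write(self, txn_id, segment_index, result):
        # Stage one segment's result under its transaction number.
        with self._lock:
            self.pending.setdefault(txn_id, {})[segment_index] = result

    def commit(self, txn_id, expected_segments):
        # Commit only when every segment of the transaction has been stored.
        with self._lock:
            staged = self.pending.get(txn_id, {})
            if len(staged) != expected_segments:
                raise RuntimeError("not all segment results are stored yet")
            self.committed[txn_id] = [staged[i] for i in range(expected_segments)]
            self.pending.pop(txn_id, None)


_txn_counter = itertools.count(1)  # assumed source of transaction numbers


def split(records, n):
    """Divide records into at most n contiguous segments."""
    size = max(1, -(-len(records) // n))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def process_segment(segment):
    """Placeholder for the real per-segment processing."""
    return [record.upper() for record in segment]


def process_stream(records, store, parallelism=3):
    txn_id = next(_txn_counter)
    segments = split(records, parallelism)

    def work(item):
        index, segment = item
        store.write(txn_id, index, process_segment(segment))

    # Leaving the with-block waits for all workers, so every write is done.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        list(pool.map(work, enumerate(segments)))

    store.commit(txn_id, len(segments))
    return txn_id
```

For example, `process_stream(["a", "b", "c", "d", "e"], InMemoryStore(), parallelism=2)` stages two segments and makes them visible in `committed` only after both have been written; until then readers of `committed` see nothing from that transaction.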

Description

Technical field

[0001] The present application relates to the technical field of data processing, and in particular to a real-time streaming data processing method, apparatus, computer equipment and storage medium.

Background technique

[0002] The underlying file storage of Hive is a distributed file system (HDFS). HDFS itself does not support row modification and deletion, which makes it difficult to delete and modify rows in Hive.

[0003] In the current technology, row deletion and modification in Hive is usually done offline with the help of MapReduce (MR) or Spark, which has poor timeliness.

Summary of the invention

[0004] Based on this, it is necessary to provide a real-time streaming data processing method, apparatus, computer equipment and storage medium to address the technical problem that row deletion and modification in Hive have poor timeliness in the current technology.

[0005] A real-time streaming data processing method, the method comprising:

[0006] Obtain the stream data ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/182, G06F16/11
CPC: G06F16/182, G06F16/11
Inventor: 杨卓荦, 刘杰
Owner: SF TECH