Data platform system based on real-time streaming computing

A real-time data platform technology, applied in computing, digital data processing, and special data processing applications. It addresses the difficulty of meeting timeliness and accuracy requirements in data processing and analysis, and the inability of traditional data processing models to analyze massive data, and achieves low latency, high throughput, and high security.

Pending Publication Date: 2019-09-06
HUITONGDA NETWORK CO LTD

AI Technical Summary

Problems solved by technology

[0003] The traditional standardized data warehouse mainly relies on Oracle's scheduled tasks, triggers, partitioned tables, and similar mechanisms to perform quasi-real-time statistical analysis of incoming data. In both performance and functionality, this approach struggles to meet users' needs for timely and accurate data processing and analysis.
[0004] With the rapid development of the Internet, the era of big data has arrived and data volumes have grown explosively. The traditional data processing model is no longer suitable for analyzing massive data, so a new data processing method is needed to handle complex business logic in real time.

Examples

Embodiment

[0104] In this embodiment, 260M of data (the sales data of a certain company, as shown in Table 2) is stored in a table in the configured Oracle database.

[0105] Table 2

[0106]

[0107]

[0108] In Table 2, the first column ORDERNO indicates the order number, the second column MEMBERNO indicates the member number, the third column TOTALPRICE indicates the total price, the fourth column ORDERSTATUS indicates the order status, the fifth column ORGID indicates the organization number, and the sixth column OPERATORID indicates the operator number.

[0109] The Oracle database connection information, including the user name, password, and Oracle driver, is configured through the data source module. Because the data volume is large, at 1.2 million rows, the data initialization process must be performed first, followed by the data synchronization process.
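
As a rough illustration of what the data source module's connection settings might look like, the sketch below collects them in a small Python structure and maps them to the option names used by Spark's JDBC data source. The class name, host, port, service name, table name, and credentials are hypothetical placeholders and do not come from the patent; the URL follows the Oracle thin-driver format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OracleSourceConfig:
    """Hypothetical container for the data source module's settings."""
    host: str
    port: int
    service_name: str
    user: str
    password: str
    driver: str = "oracle.jdbc.OracleDriver"  # Oracle thin JDBC driver class

    def jdbc_url(self) -> str:
        # Oracle thin-driver URL: jdbc:oracle:thin:@//host:port/service
        return f"jdbc:oracle:thin:@//{self.host}:{self.port}/{self.service_name}"

    def jdbc_options(self, table: str) -> dict:
        # Keys match the option names of Spark's JDBC data source.
        return {
            "url": self.jdbc_url(),
            "dbtable": table,
            "user": self.user,
            "password": self.password,
            "driver": self.driver,
        }


# Placeholder values for illustration only.
config = OracleSourceConfig(
    host="db.example.com",
    port=1521,
    service_name="ORCL",
    user="report_user",
    password="change-me",
)
options = config.jdbc_options("SALES_ORDERS")
```

Keeping the settings in one immutable object means the initialization and synchronization steps can share the same connection description; in a real deployment the password would come from a secret store rather than source code.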

[0110] The data initialization process uses Spark SQL micro-batches to extract all the data ...
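
The remainder of paragraph [0110] is not available here, so the following is only a minimal sketch of the general idea of reading a full table in small batches. It is not the patent's Spark SQL procedure: an in-memory SQLite database stands in for Oracle, plain Python stands in for Spark, and the table name, sample rows, and batch size are made up. Only the column names are taken from Table 2.

```python
import sqlite3

# Column names from Table 2; everything else in this sketch is a placeholder.
COLUMNS = ["ORDERNO", "MEMBERNO", "TOTALPRICE", "ORDERSTATUS", "ORGID", "OPERATORID"]


def extract_in_micro_batches(conn, table, batch_size):
    """Yield the full contents of `table` as lists of at most `batch_size` rows."""
    cursor = conn.execute(
        f"SELECT {', '.join(COLUMNS)} FROM {table} ORDER BY ORDERNO"
    )
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# In-memory SQLite database standing in for the Oracle source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_orders ("
    "ORDERNO TEXT PRIMARY KEY, MEMBERNO TEXT, TOTALPRICE REAL, "
    "ORDERSTATUS TEXT, ORGID TEXT, OPERATORID TEXT)"
)
sample = [
    (f"O{i:04d}", f"M{i % 3}", 10.0 * i, "PAID", "ORG1", "OP1")
    for i in range(1, 8)
]
conn.executemany("INSERT INTO sales_orders VALUES (?, ?, ?, ?, ?, ?)", sample)

batches = list(extract_in_micro_batches(conn, "sales_orders", batch_size=3))
batch_sizes = [len(batch) for batch in batches]
total_rows = sum(batch_sizes)
```

Reading with `fetchmany` keeps at most one batch in memory at a time, which is the property that matters when the source table holds millions of rows.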


Abstract

The invention discloses a data platform system based on real-time streaming computing. The data platform system performs the following steps: acquiring and storing behavior data information of predetermined data, converting the predetermined data into a series of short data streams, and processing the real-time streaming data with high throughput and a fault-tolerant mechanism. In the data platform system, the behavior data information of the predetermined data is converted into a series of short batch processing operations, and key information is analyzed and extracted, so that large batches of data are converted into micro-batches and computed rapidly in a distributed manner. This achieves high throughput and low latency, satisfies the high timeliness requirement of a data release system, and makes the system suitable for latency-sensitive services.
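
The conversion of a large stream into micro-batches described in the abstract can be illustrated with a small single-process sketch. It assumes hypothetical behavior events with made-up field names (`org_id`, `amount`) and a fixed batch size, and it leaves out the distribution and fault tolerance that the actual system relies on.

```python
from collections import defaultdict
from itertools import islice


def micro_batches(stream, batch_size):
    """Split an iterable of records into consecutive lists of at most batch_size items."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_batch(batch, totals):
    """Extract the key fields of each event and update running totals per organization."""
    for event in batch:
        totals[event["org_id"]] += event["amount"]


# Hypothetical behavior events; the field names are illustrative only.
events = [
    {"org_id": "A", "amount": 10},
    {"org_id": "B", "amount": 5},
    {"org_id": "A", "amount": 7},
    {"org_id": "B", "amount": 1},
    {"org_id": "A", "amount": 3},
]

totals = defaultdict(int)
batch_count = 0
for batch in micro_batches(events, batch_size=2):
    process_batch(batch, totals)
    batch_count += 1
```

Because each batch is short, results are updated after every few records instead of after the whole data set has been read, which is where the low latency of the micro-batch approach comes from.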

Description

Technical field

[0001] The invention relates to computer Internet data warehouse design technology, and in particular to a data platform system based on real-time stream computing.

Background technique

[0002] Real-time streaming computing segments the data stream into a series of short batch processing jobs, transmits the segmented data stream to the batch processing engine, cleans, extracts, and transforms the data in the engine, and saves the resulting data in memory.

[0003] The traditional standardized data warehouse mainly relies on Oracle's scheduled tasks, triggers, partitioned tables, and similar mechanisms to perform quasi-real-time statistical analysis of incoming data. In both performance and functionality, this approach struggles to meet users' needs for timely and accurate data processing and analysis.

[0004] With the rapid development of the Internet, the era of big data has arrived, and the data has shown e...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/182, G06F16/2458, G06F9/48
CPC: G06F16/182, G06F16/2471, G06F9/4881
Inventor: 钟证业, 毕军, 胡雨成, 胥小燕
Owner: HUITONGDA NETWORK CO LTD