Self-adaptive rate control method for stream data processing

A stream data processing technology, applied in special data processing applications, electrical digital data processing, program control design, etc. It addresses problems such as inaccurate calculation results, the uncertain running delay caused by dynamic batch sizes, and the inability to fully utilize computing resources.

Status: Inactive
Publication Date: 2017-05-10
DALIAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

Existing approaches each have drawbacks: discarding data usually leads to inaccurate calculation results, dynamic resource management requires additional hardware resources to handle processing peaks, and dynamic batch sizing makes the running delay uncertain.
[0005] The state of the cluster is not static, so a manually set, static upper limit on the data processing rate may be inaccurate. If the limit is too small, the system may be unable to fully utilize computing resources when the achievable processing rate rises sharply. If it is too large, the system may receive too much data, causing high latency in the computing cluster and making the system unstable.
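The trade-off in [0005] can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions that do not come from the patent text: time is divided into fixed batch intervals, the upstream queue always has data available, and the cluster's capacity per interval is given as a list. The function name `simulate_static_cap` and all numbers are hypothetical.

```python
def simulate_static_cap(cap, capacities):
    """Simulate a fixed per-interval ingestion cap against a varying cluster capacity.

    cap:        maximum number of records received per batch interval (static limit)
    capacities: records the cluster can process in each interval (varies over time)

    Returns (backlog, idle):
      backlog = records received but still unprocessed at the end (a proxy for latency)
      idle    = total processing capacity that went unused (a proxy for wasted resources)
    """
    backlog = 0
    idle = 0
    for capacity in capacities:
        # Assumption: upstream always has at least `cap` records waiting.
        pending = backlog + cap
        processed = min(pending, capacity)
        idle += capacity - processed
        backlog = pending - processed
    return backlog, idle


if __name__ == "__main__":
    # Hypothetical capacity trace: the cluster speeds up, then slows down.
    trace = [100, 100, 200, 200, 50, 50]
    for cap in (50, 150):
        print(cap, simulate_static_cap(cap, trace))
```

With this trace, a cap of 50 leaves no backlog but wastes 400 records' worth of capacity, while a cap of 150 wastes nothing but ends with a backlog of 200 records. Neither fixed value suits a cluster whose capacity changes over time.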

Examples

Embodiment Construction

[0074] Specific embodiments of the present invention are further described below with reference to the accompanying drawings and technical solutions.

[0075] The system architecture is shown in Figure 1. The system can use a variety of data sources. The byte stream sent by each source is parsed by a parser, and the system uses Kafka as a message queue to aggregate and buffer the data: the parsed data is sent to the Kafka cluster through a Kafka producer. The system uses Spark Streaming to process the data stream. The Kafka cluster serves as the upstream data source of Spark Streaming, which continuously pulls data for computation. ZooKeeper records the offset of the data currently being processed, and a database and a downstream Kafka cluster are used to manage the output results.
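The data path in [0075] (parse, buffer in a queue, pull in batches, record the offset) can be sketched with in-memory stand-ins. Nothing below uses the real Kafka, Spark Streaming, or ZooKeeper APIs: `InMemoryLog` stands in for one queue partition, a plain dict stands in for the offset store, and the newline-delimited JSON input format is an assumption made only for this illustration.

```python
import json


def parse(raw: bytes):
    """Parse a newline-delimited JSON byte stream into records (hypothetical format)."""
    return [json.loads(line) for line in raw.decode("utf-8").splitlines() if line.strip()]


class InMemoryLog:
    """Stand-in for one message-queue partition: an append-only list of records."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def read(self, offset, max_records):
        # Return at most `max_records` records starting at `offset`.
        return self.records[offset:offset + max_records]


def run_micro_batch(log, offsets, key, max_records, handler):
    """Pull one batch from `log`, process it, then record the new offset.

    `offsets` is a dict standing in for the external offset store.
    The offset is advanced only after `handler` returns, so a failed
    batch would be read again on the next attempt.
    """
    start = offsets.get(key, 0)
    batch = log.read(start, max_records)
    result = handler(batch)
    offsets[key] = start + len(batch)
    return result


if __name__ == "__main__":
    log = InMemoryLog()
    for record in parse(b'{"v": 1}\n{"v": 2}\n{"v": 3}\n'):
        log.append(record)
    offsets = {}
    total = run_micro_batch(log, offsets, "p0", 2, lambda b: sum(r["v"] for r in b))
    print(total, offsets)
```

The `max_records` argument is the point at which a rate limit would take effect in this sketch: it bounds how much data each batch pulls from the queue.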

[0076] The processing flow of this method is shown in Figure 2. The data source continuously generates a large amount of data. First, analyze the ...

Abstract

The invention belongs to the technical field of computer applications and relates to a self-adaptive rate control method for stream data processing. The method builds on a common data-receiving message queue and a distributed big data computing framework. It adjusts the degree of parallelism of data processing through pre-partitioning according to the current state of the computing cluster, and it dynamically adjusts the amount of data the cluster currently processes using a self-adaptive real-time rate control method, so that the stability of the computing cluster is ensured and the delay of data stream processing is reduced. As big data gradually penetrates various industries, the range of applications for real-time processing of massive data keeps expanding, and the real-time performance and stability of a massive data processing system are very important. Without adding cluster hardware or increasing task programming complexity, the method improves the stability and processing efficiency of the computing cluster to a certain extent.
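The patent's actual rate control algorithm is not included in this excerpt. As a generic illustration of what "dynamically adjusting the quantity of data processed" can mean, the sketch below moves a rate limit toward the throughput the cluster actually sustained in the last batch. The function `next_rate_limit`, its parameters, and the exponential smoothing rule are all assumptions made for this example and should not be read as the claimed method.

```python
def next_rate_limit(current_limit, records, processing_time, smoothing=0.5, min_rate=1.0):
    """Return an updated rate limit (records per second) after one finished batch.

    current_limit:   rate limit used for the batch that just finished
    records:         number of records in that batch
    processing_time: seconds the cluster needed to process the batch
    smoothing:       weight given to the new measurement (0 = ignore, 1 = follow fully)
    min_rate:        floor that keeps the limit from collapsing to zero
    """
    if records <= 0 or processing_time <= 0:
        # No usable measurement: keep the previous limit.
        return current_limit
    # Throughput the cluster actually sustained for this batch.
    measured = records / processing_time
    # Move part of the way from the old limit toward the measured throughput.
    new_limit = (1 - smoothing) * current_limit + smoothing * measured
    return max(min_rate, new_limit)


if __name__ == "__main__":
    limit = 1000.0
    # A slow batch (1000 records in 2 s) lowers the limit; a fast one raises it.
    limit = next_rate_limit(limit, records=1000, processing_time=2.0)
    print(limit)
    limit = next_rate_limit(limit, records=750, processing_time=0.5)
    print(limit)
```

A feedback rule of this kind lowers the limit when batches take long to process and raises it when the cluster has headroom, which is the behavior a static limit cannot provide.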

Description

Technical field

[0001] The invention belongs to the technical field of computer applications and relates to an adaptive rate control method for stream data processing.

Background

[0002] With the development of technology, the amount of data is growing day by day, and "big data" technology has penetrated all walks of life. Many devices now collect large amounts of data, which should be processed promptly to extract its value. For example, data generated by smartphones, sensors, Internet of Things devices, social networks, and online transaction systems needs to be continuously collected and analyzed in real time to enable rapid response. Improving the capacity for real-time data analysis and processing has therefore become a very important issue.

[0003] Current mainstream real-time big data processing frameworks include Spark, Storm, and Flink. Spark Streaming is an extension of the Spark core API, whi...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50, G06F17/30
CPC: G06F9/5038, G06F16/182, G06F16/24568
Inventors: 申彦明, 李晓东
Owner: DALIAN UNIV OF TECH