Distributed streaming data processing method and system in cloud environment

A distributed data processing technology, applied in electrical digital data processing, special data processing applications, transmission systems, etc. It addresses problems such as the increasing difficulty of development and maintenance, and achieves lower development and maintenance costs and good maintainability.

Status: Inactive. Publication date: 2017-11-24
NANJING UNIV OF POSTS & TELECOMM
Cites: 5; Cited by: 72

AI Technical Summary

Problems solved by technology

[0007] However, the disadvantage of the Lambda architecture is that two sets of code must be maintained, one for stream processing and one for batch processing. Every algorithm is implemented twice, once for the batch processing system and once for the real-time system, and the query results of the two systems must also be merged, which makes development and maintenance more difficult.
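
The duplication described in [0007] can be illustrated with a minimal plain-Python sketch. It is not taken from the patent: the function names (`batch_view`, `realtime_update`, `merged_query`) and the sample readings are hypothetical. The same per-sensor sum is written once for the batch layer and once for the real-time layer, and a query has to merge both views.

```python
from collections import defaultdict

# Hypothetical sensor readings as (sensor_id, value) pairs.
historical = [("s1", 2.0), ("s2", 5.0), ("s1", 3.0)]
recent = [("s1", 1.0), ("s3", 4.0)]


def batch_view(records):
    """Batch layer: recompute per-sensor totals over the full data set."""
    totals = defaultdict(float)
    for sensor_id, value in records:
        totals[sensor_id] += value
    return dict(totals)


def realtime_update(view, record):
    """Real-time layer: the same aggregation, re-implemented incrementally."""
    sensor_id, value = record
    view[sensor_id] = view.get(sensor_id, 0.0) + value
    return view


def merged_query(batch, realtime, sensor_id):
    """Query side: the answer must combine the results of both systems."""
    return batch.get(sensor_id, 0.0) + realtime.get(sensor_id, 0.0)


batch = batch_view(historical)
realtime = {}
for rec in recent:
    realtime_update(realtime, rec)

print(merged_query(batch, realtime, "s1"))
```

Any change to the aggregation logic has to be made in both `batch_view` and `realtime_update`, which is the maintenance cost the paragraph refers to.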

Examples

Embodiment Construction

[0042] The design of the present invention focuses on optimizing the three major processes of multi-node, distributed data stream handling (processing, storage, and query) and on performing the full calculation through stream processing. For the stream processing stage of the traditional Lambda architecture, the present invention creates multiple input DStreams in the Spark Streaming application to receive data from different data nodes in parallel; for the full calculation stage, the present invention replaces the traditional MapReduce batch computation by creating a new Spark Streaming task to handle the calculation.
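
The idea of receiving from several data nodes in parallel and then processing one combined stream can be sketched in plain Python. This is a simulation under stated assumptions, not Spark code and not the patent's implementation: the receiver threads stand in for input DStreams, and the names `receive`, `run_receivers`, and `count_per_node` as well as the sample data are hypothetical. In Spark Streaming itself, several input DStreams are typically combined with a union operation before further processing.

```python
import queue
import threading


def receive(node_id, readings, out):
    """Hypothetical receiver: forwards one node's readings to a shared queue."""
    for value in readings:
        out.put((node_id, value))


def run_receivers(sources):
    """Start one receiver thread per data node and collect everything received."""
    out = queue.Queue()
    threads = [
        threading.Thread(target=receive, args=(node_id, readings, out))
        for node_id, readings in sources.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All receivers have finished, so the queue can be drained safely.
    merged = []
    while not out.empty():
        merged.append(out.get())
    return merged


def count_per_node(batch):
    """A single processing step applied to the merged stream."""
    counts = {}
    for node_id, _ in batch:
        counts[node_id] = counts.get(node_id, 0) + 1
    return counts


sources = {"node-a": [1, 2, 3], "node-b": [10, 20]}
merged = run_receivers(sources)
print(count_per_node(merged))
```

The order in which records from different nodes arrive is not deterministic, so the processing step here depends only on the contents of the merged batch, not on arrival order.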

[0043] 1. Architecture

[0044] The system framework of the present invention can be divided into the following four modules:

[0045] 1. Distributed message queue module: Sensor data from different locations and nodes enters the distributed message queue, and the distributed message queue divides the data into multiple partition...
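
As a rough illustration of how a message queue might divide incoming records into partitions, the following plain-Python sketch assigns each record to a partition by hashing its key. The source text is truncated and does not state the partitioning scheme, so key-based hashing, the partition count, and the names `partition_for` and `split_into_partitions` are all assumptions made for illustration.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a record key to a partition index with a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def split_into_partitions(records, num_partitions):
    """Group (key, value) records into lists, one list per partition."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


# Hypothetical sensor records keyed by sensor name.
records = [("sensor-1", 20.5), ("sensor-2", 21.0), ("sensor-1", 20.7)]
partitions = split_into_partitions(records, 3)

for index, partition in enumerate(partitions):
    print(index, partition)
```

With key-based hashing, all records from the same sensor land in the same partition, which keeps per-sensor ordering intact while letting different partitions be consumed in parallel.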

Abstract

The invention provides a distributed streaming data processing method and system in a cloud environment. To address the challenges of the Internet of Things era, in which data arrives with high concurrency and at high speed, the stream computing engine Spark Streaming is used to replace the MapReduce batch computing of the traditional Lambda architecture. Stream computing over multi-table data is achieved by instantiating multiple input streams, the computing results are stored in the distributed file system HDFS, and efficient queries are achieved through the distributed query system Impala.

Description

Technical field

[0001] Given the high concurrency and fast flow of data in the era of the Internet of Things, the present invention designs a distributed stream data processing method for a cloud environment. It uses the stream computing engine Spark Streaming to replace the MapReduce batch computing of the traditional Lambda architecture, realizes stream computing over multi-table data by instantiating multiple input streams, saves the calculation results in the distributed file system HDFS, and realizes efficient queries through the distributed query system Impala. The invention belongs to the field of big data processing based on a cloud computing platform.

Background technique

[0002] The rapid development of Internet of Things technology places stricter requirements on big data processing technology. A big data processing system in the Internet of Things era mainly faces the following challenges:

[0003] (1) The amount of data to be processed...

Application Information

IPC(8): G06F17/30, H04L29/08
CPC: G06F16/182, G06F16/2453, G06F16/24568, G06F16/2471, H04L67/10, H04L67/12
Inventor: 李鹏, 李亮德, 徐鹤, 王汝传, 陈芳州, 宋金全
Owner: NANJING UNIV OF POSTS & TELECOMM