Online capacity expansion method and device for distributed streaming computing application and computer equipment

The technology relates to streaming computing and distributed systems and is applied in special data processing applications, database design/maintenance, and structured data retrieval. It addresses problems such as cumbersome data migration and streaming applications becoming unavailable.

Active Publication Date: 2019-10-25
BEIJING QIYI CENTURY SCI & TECH CO LTD
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The machines can be restarted only after data migration is complete, which leaves the streaming application unavailable for a period of time and makes data migration cumbersome.

Examples

Example 1

[0058] Consider a piece of data a that has just entered the distributed stream computing application. Its partition key value is key1, and the timestamp corresponding to key1 in the first data table of the entry node is time1, so the first timestamp of the data is time1. Comparing time1 with the timestamps in the second data table gives t0 < time1 < t1. The only timestamp smaller than the first timestamp is t0, so t0 is the second timestamp of the data, and the node list of the second timestamp t0 is {node1, node2, node3}. The partition node feeds the partition key value of the data into a hash function, and the resulting hash value corresponds to node2, so the data is allocated to node2, which performs the related processing on it. Finally, the local state on node2 is updated.

Example 2

[0060] Consider a piece of data b that has just entered the distributed streaming computing application. Its partition key value is key3, and the timestamp corresponding to key3 in the first data table of the entry node is time3, so the first timestamp of the data is time3. Comparing time3 with the timestamps in the second data table gives t0 < t1 < time3. Both t0 and t1 are smaller than the first timestamp, and t1 > t0, so t1 is closer to time3 than t0. Therefore t1 is the second timestamp of the data, and the node list of the second timestamp t1 is {node4, node5, node6}. The partition node feeds the partition key value of the data into a hash function, and the resulting hash value corresponds to node6, so the data is allocated to node6, which performs the related processing on it. Finally, the local state on node6 is updated.
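The lookup walked through in the two examples can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the numeric values chosen for t0, t1, time1 and time3, the function names `stable_hash` and `route`, and the use of MD5 modulo the list length as the "preset partitioning mode" are all assumptions made for the sketch. The node a key actually lands on depends on the hash function, so the sketch does not reproduce the specific node2 and node6 outcomes.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash of a partition key (assumed; the source only says 'a hash function')."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def route(key, first_table, second_table):
    """Return (second_timestamp, node) for a partition key.

    first_table:  partition key -> first timestamp
    second_table: timestamp -> node list
    """
    first_ts = first_table[key]
    timestamps = sorted(second_table)
    # Largest timestamp strictly smaller than first_ts.
    idx = bisect.bisect_left(timestamps, first_ts) - 1
    if idx < 0:
        raise LookupError("no timestamp in the second table is smaller than the first timestamp")
    second_ts = timestamps[idx]
    nodes = second_table[second_ts]
    return second_ts, nodes[stable_hash(key) % len(nodes)]


# Hypothetical values: t0 = 0, t1 = 100, time1 = 50, time3 = 150.
first_table = {"key1": 50, "key3": 150}
second_table = {
    0: ["node1", "node2", "node3"],
    100: ["node4", "node5", "node6"],
}

ts_a, node_a = route("key1", first_table, second_table)  # falls under t0
ts_b, node_b = route("key3", first_table, second_table)  # falls under t1
```

Because the second table is searched by timestamp before hashing, the hash only ever selects among the nodes that existed when the key was first recorded.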

[0061] From the above example 1, it can be seen that the timestamp of the first appearance of the partition key value of da...

Abstract

The invention relates to an online capacity expansion method and device for a distributed streaming computing application, computer equipment, and a storage medium. The method comprises the following steps: when a piece of data enters the distributed streaming computing application, searching a first data table for the timestamp corresponding to the partition key value of the data, and marking the found timestamp as a first timestamp; searching a second data table for the timestamp that is smaller than the first timestamp and closest to it, and marking the found timestamp as a second timestamp; and allocating the data to one node in the node list corresponding to the second timestamp according to a preset partitioning mode. With this capacity expansion method, adding nodes does not affect the distribution of old data and does not change the distribution of existing local states on the nodes. No shutdown is needed, use of the distributed streaming computing application is not affected, and no special tool is required to migrate the local states on the nodes.
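The property claimed in the abstract, that adding nodes leaves the placement of old data unchanged, can be illustrated with a small Python sketch. The class name `ExpandableRouter`, its methods, the SHA-256 based partitioning, and the numeric timestamps are hypothetical. In particular, recording a key's arrival time in the first data table the first time the key is seen is an assumption of this sketch, since the abstract only describes the lookup.

```python
import bisect
import hashlib


class ExpandableRouter:
    """Toy router: keys stay on the node list that was current when they were first seen."""

    def __init__(self, start_ts, nodes):
        self.first_seen = {}                         # first data table: key -> first timestamp
        self.expansions = [(start_ts, list(nodes))]  # second data table, sorted by timestamp

    def expand(self, ts, nodes):
        """Register a new node list that takes effect for keys first seen after ts."""
        if ts <= self.expansions[-1][0]:
            raise ValueError("expansion timestamps must be increasing")
        self.expansions.append((ts, list(nodes)))

    def route(self, key, now):
        # Assumption: an unseen key is recorded with its arrival time.
        first_ts = self.first_seen.setdefault(key, now)
        stamps = [ts for ts, _ in self.expansions]
        # Largest expansion timestamp strictly smaller than first_ts.
        idx = bisect.bisect_left(stamps, first_ts) - 1
        if idx < 0:
            raise LookupError("key predates the first node list")
        nodes = self.expansions[idx][1]
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


router = ExpandableRouter(0, ["node1", "node2", "node3"])
before = router.route("key1", now=50)            # key1 first seen before the expansion
router.expand(100, ["node4", "node5", "node6"])  # online expansion, no restart
after = router.route("key1", now=150)            # same key arriving after the expansion
fresh = router.route("key3", now=150)            # key first seen after the expansion
```

In this sketch `before` and `after` are always the same node, so the local state for key1 never has to move, while `fresh` is drawn from the newly added node list.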

Description

Technical field

[0001] The present application relates to the technical field of capacity expansion of distributed stream computing applications, and in particular to an online capacity expansion method, device, computer equipment and storage medium for distributed stream computing applications.

Background

[0002] Streaming computing is a cutting-edge technology in the current big data field. It focuses on real-time processing of continuously generated data streams and provides functions such as local storage of data stream state (referred to as local state), out-of-order data processing, and automatic handling of back pressure, with the characteristics of low latency and high throughput. To cope with extremely high throughput, streaming computing applications are often distributed applications deployed on multiple machines; that is, the continuously injected data flow is distributed to multiple machines in a user-defined way, and e...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/21, G06F16/22
CPC: G06F16/214, G06F16/2282
Inventor: 易帆, 段效晨, 康林, 赵艳杰, 秦占明
Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD