Job scheduling method and device for streaming data

A technology for job scheduling of streaming data, applied in data exchange networks, digital transmission systems, electrical components, etc. It addresses the problem that existing systems do not comprehensively consider the communication between processing units and the real-time load of physical nodes, which degrades the overall performance of streaming data processing and increases communication cost and delay. The method avoids overload, reduces communication cost and network delay, and improves overall performance.

Active Publication Date: 2014-01-01
INST OF INFORMATION ENG CHINESE ACAD OF SCI
Cites: 7; Cited by: 16
AI Technical Summary

Problems solved by technology

[0005] Existing streaming data processing systems do not comprehensively consider the communication between processing units and the real-time load of physical nodes; they only ensure that the number of processing units is evenly distributed across nodes. However, because different processing units have different requirements for various resources, balance in quantity does not mean balance in resource usage or actual load.
Moreover, processing units that communicate frequently with each other may be dispatched to physical nodes far from each other, increasing their communication cost and delay.
Existing scheduling methods therefore cannot meet the needs of streaming data processing well: they lead to uneven loads on physical nodes and high communication delays, which degrade the overall performance of streaming data processing.

Examples

Embodiment 1

[0056] This embodiment of the present invention implements a streaming data processing system that includes multiple executors and a scheduling manager. An executor is a daemon process running on a physical node. Every physical node managed by the system runs an executor, except for the physical node where the scheduling manager is located.

[0057] An executor can start and stop processing units on its physical node. To start a processing unit, the executor first creates a Linux container with a specified resource capacity on the physical node, and then starts the tasks that the processing unit needs to execute inside that container. Processing units and Linux containers correspond one-to-one: each processing unit is placed in its own Linux container. The Linux container can allocate specified resources to the processes inside it. Since the streaming data processing model is usually accompanied by high-traffic communication, the resource types allocated by this syst...
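The start and stop sequence described in [0057] can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the patent's implementation: the class names `Executor` and `Container`, the methods `start_unit`, `stop_unit`, and `free`, and the resource names `cpu` and `memory_mb` are all hypothetical, and no real Linux container is created.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical stand-in for a Linux container with fixed resource limits."""
    unit_id: str
    limits: dict
    running: bool = False


@dataclass
class Executor:
    """Hypothetical per-node daemon that owns one container per processing unit."""
    node: str
    capacity: dict
    containers: dict = field(default_factory=dict)

    def free(self):
        """Return the capacity left after subtracting every container's limits."""
        free = dict(self.capacity)
        for container in self.containers.values():
            for resource, amount in container.limits.items():
                free[resource] = free.get(resource, 0) - amount
        return free

    def start_unit(self, unit_id, limits):
        """Create a container with the requested limits, then start the unit in it."""
        if unit_id in self.containers:
            raise ValueError(f"{unit_id} already has a container on {self.node}")
        free = self.free()
        for resource, amount in limits.items():
            if amount > free.get(resource, 0):
                raise RuntimeError(f"not enough {resource} on {self.node}")
        # Step 1: create the container with the specified resource capacity.
        container = Container(unit_id, dict(limits))
        self.containers[unit_id] = container
        # Step 2: start the processing unit's tasks inside that container.
        container.running = True
        return container

    def stop_unit(self, unit_id):
        """Stop the unit and release its container (one container per unit)."""
        container = self.containers.pop(unit_id)
        container.running = False
        return container
```

Keying `containers` by the processing unit's identifier keeps the one-to-one correspondence between processing units and containers explicit in the data structure.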

Abstract

The invention relates to a job scheduling method for streaming data. The method comprises the following steps: a scheduling manager acquires jobs to be scheduled from a scheduling queue in real time, and uses a directed acyclic graph to generate a processing unit queue according to the information of the jobs to be scheduled; according to the ratio of the non-local communication number of physical nodes to dominant resources, the scheduling manager selects one physical node for each processing unit in the processing unit queue and distributes each processing unit to its corresponding physical node; when an executor starts a processing unit, it creates a Linux container on the physical node of the processing unit and then starts the processing unit inside that Linux container. With this method, processing units are scheduled to physical nodes that have a small non-local communication number and a low load, and processing units that need to communicate frequently can be concentrated on the same physical node, which reduces network communication across physical nodes.
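The selection rule in the abstract can be sketched as follows. The abstract does not define its terms precisely, so this sketch makes several assumptions that are not taken from the patent: the processing unit queue is a topological order of the job's directed acyclic graph; the "non-local communication number" of a candidate node is the count of already placed graph neighbors that sit on other nodes; and the "dominant resource" term is the node's remaining fraction of its most heavily used resource after placement. The function names `unit_queue` and `schedule` are hypothetical.

```python
from collections import deque


def unit_queue(units, edges):
    """Order processing units topologically (Kahn's algorithm)."""
    indegree = {u: 0 for u in units}
    successors = {u: [] for u in units}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    ready = deque(u for u in units if indegree[u] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if len(order) != len(units):
        raise ValueError("job graph contains a cycle")
    return order


def schedule(units, edges, demand, capacity):
    """Place each unit on the node with the lowest assumed ratio score."""
    neighbors = {u: set() for u in units}
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    used = {n: {r: 0.0 for r in cap} for n, cap in capacity.items()}
    placement = {}
    for u in unit_queue(units, edges):
        best = None
        for node, cap in capacity.items():
            after = {r: used[node][r] + demand[u].get(r, 0.0) for r in cap}
            # Dominant share: the most heavily used resource after placement.
            share = max(after[r] / cap[r] for r in cap)
            if share > 1.0:
                continue  # placing the unit here would overload the node
            # Already placed neighbors that would have to be reached over the network.
            non_local = sum(
                1 for v in neighbors[u] if v in placement and placement[v] != node
            )
            remaining = 1.0 - share
            if remaining > 0:
                ratio = non_local / remaining
            else:
                ratio = 0.0 if non_local == 0 else float("inf")
            # Assumed tie-breakers: lower load first, then node name.
            key = (ratio, share, node)
            if best is None or key < best[0]:
                best = (key, node, after)
        if best is None:
            raise RuntimeError(f"no physical node can host {u}")
        _, node, after = best
        placement[u] = node
        used[node] = after
    return placement
```

Under these assumptions, a chain of three units with one unit of `cpu` demand each lands entirely on one node when that node has room, and spills to a second node only when the first would be overloaded.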

Description

Technical field

[0001] The present invention relates to the field of computer parallel computing, in particular to a job scheduling method and device for streaming data.

Background

[0002] In recent years, with the continuous development of applications such as real-time search, advertisement recommendation, social networking, and online log analysis, a new form of data, streaming data, has been emerging. Streaming data refers to a large, fast, uninterrupted sequence of events. Depending on the scenario, streaming data can take various forms, such as real-time queries, user clicks, online logs, and streaming media. Streaming applications focus on real-time interaction, and high-latency responses seriously affect their functions or user experience. Because of the importance and uniqueness of streaming data, a number of streaming data processing systems have emerged, such as Yahoo!'s S4 system.

[0003] Events are the basic unit of streaming data in the form of key...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): H04L12/863
Inventor: 王旻, 韩冀中, 李勇, 张章, 孟丹
Owner: INST OF INFORMATION ENG CHINESE ACAD OF SCI