Method, apparatus, device, and storage medium for task-based parallel processing of data streams

A parallel processing and data stream technology, applied in the field of data science, that addresses problems such as the increased switching overhead of system scheduling.

Active Publication Date: 2020-06-26
THE FOURTH PARADIGM BEIJING TECH CO LTD
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Since the number of physical CPUs in a computer is fixed, creating too many threads increases the switching overhead during system scheduling.

Examples

Detailed Description of Embodiments

[0040] Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

[0041] Unlike existing parallel processing schemes, the present invention proposes a parallel operation scheme based on task generation. The scheme treats the data and its corresponding operation steps as a whole: following the data processing flow, it packs the operation steps to be performed and the corresponding data to be operated on into tasks to be processed, and puts the packaged tasks int...
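
The packing step described in [0041] can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Task`, `make_tasks`, and `double`, and the fixed batch size, are hypothetical and chosen only for illustration. Each task bundles one operation step with the batch of data that step operates on.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, List


@dataclass
class Task:
    """Hypothetical task: one operation step plus the batch it operates on."""
    step: Callable[[List[Any]], List[Any]]
    batch: List[Any]

    def run(self) -> List[Any]:
        # Executing the task applies the packed step to the packed data.
        return self.step(self.batch)


def make_tasks(stream: Iterable[Any],
               step: Callable[[List[Any]], List[Any]],
               batch_size: int) -> Iterator[Task]:
    """Cut the stream into batches and pack each batch together with the step."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield Task(step=step, batch=batch)


def double(batch: List[int]) -> List[int]:
    # Illustrative operation step (an assumption, not from the source).
    return [x * 2 for x in batch]


tasks = list(make_tasks(range(5), double, batch_size=2))
results = [t.run() for t in tasks]
```

Because a task carries both its data and its step, any thread that picks it up has everything needed to execute it.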

Abstract

The invention discloses a method, apparatus, device, and storage medium for task-based parallel processing of data streams. Each of a determined number of worker threads extracts tasks to be processed from a task queue, and the extracted tasks are processed in parallel. Each task to be processed is generated by packaging a batch of data to be operated on in the data stream together with the corresponding operation step in the data stream processing. The disclosure thus provides a parallel operation mechanism based on task generation: for tasks packaged from different operation steps, the degree of parallelism can be adjusted automatically according to the time the steps consume during actual operation.
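
The mechanism in the abstract can be sketched in Python roughly as follows. This is one interpretation under stated assumptions, not the claimed implementation: the function `run_pipeline`, the tuple layout `(batch_id, step_index, batch)`, the `None` shutdown sentinel, and the two example steps are all hypothetical. A fixed set of worker threads pulls tasks from one shared queue; when a task finishes a step, it is repackaged with the next step and put back on the queue.

```python
import queue
import threading
from typing import Callable, Dict, List

Step = Callable[[List[int]], List[int]]


def run_pipeline(batches: List[List[int]], steps: List[Step],
                 num_workers: int = 4) -> List[List[int]]:
    """Process every batch through all steps using a shared task queue."""
    task_queue: queue.Queue = queue.Queue()
    results: Dict[int, List[int]] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            item = task_queue.get()
            if item is None:  # hypothetical shutdown sentinel
                task_queue.task_done()
                return
            batch_id, step_idx, batch = item
            out = steps[step_idx](batch)
            if step_idx + 1 < len(steps):
                # Repackage the output with the next step as a new task.
                task_queue.put((batch_id, step_idx + 1, out))
            else:
                with lock:
                    results[batch_id] = out
            task_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for i, batch in enumerate(batches):
        task_queue.put((i, 0, batch))
    task_queue.join()  # returns once every task, including follow-ups, is done
    for _ in threads:
        task_queue.put(None)
    for t in threads:
        t.join()
    return [results[i] for i in range(len(batches))]


# Illustrative steps (assumptions, not from the source).
example_steps: List[Step] = [
    lambda batch: [x + 1 for x in batch],
    lambda batch: [x * 10 for x in batch],
]
output = run_pipeline([[1, 2], [3, 4], [5]], example_steps)
```

In this sketch no thread is bound to a particular step. Workers take whatever task is next in the queue, so a step that runs longer simply occupies more workers at any given moment, which is one way to read the abstract's statement that the degree of parallelism adjusts to each step's time consumption. The thread count stays fixed regardless of how many steps or batches there are.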

Description

Technical field

[0001] The present invention relates to the field of data science, and in particular to a method, apparatus, device, and storage medium for task-based parallel processing of data streams.

Background

[0002] When a data processing job involves a large amount of data, multi-threaded parallel execution is usually needed to reduce the job's overall execution time. A thread is the unit that the operating system schedules: computation must be handed to the operating system in the form of threads, and at run time each thread is assigned to a physical core for execution. Using as much of the machine's physical resources as possible reduces the total time cost of the job.

[0003] One way to use multithreading is to divide the data to be processed into multiple batches, with each thread responsible for processing one batch. The problem with this multi-thr...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/2453, G06F16/2455, G06F9/50
CPC: G06F9/5027, G06F2209/5018, G06F16/24532, G06F16/24568
Inventor: 杨强, 陈雨强, 戴文渊, 焦英翔, 石光川
Owner: THE FOURTH PARADIGM BEIJING TECH CO LTD