Distributed scheduling method and device, electronic equipment and computer storage medium

A distributed scheduling method applied in the field of big data platforms. It addresses problems such as the large number of fragment and junk files, high maintenance effort and investment cost, and the long time needed for fault analysis and location in the scheduler. Its effects are reduced overhead time, less resource waste, and shorter fault location time.

Active Publication Date: 2019-09-17
欧冶云商股份有限公司

AI Technical Summary

Problems solved by technology

[0007] The existing scheduler cannot support rapid location and investigation of abnormal task runs across the overall scope;
[0008] Fault analysis and location in the existing scheduler takes a long time, so the execution logs of all batches must be stored to leave enough time to locate the fault information before it is overwritten by the next scheduling run. The logs of the same hour-level computing task are stored 20+ times a day, so about 15,000 or more computing log files are generated every day, and the daily log data volume on distributed storage is greater than 1G. As a result, many fragment and junk files are generated during operation, and maintenance effort and input costs are high.

Embodiment Construction

[0067] To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below in conjunction with the drawings of the embodiments. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the described embodiments fall within the protection scope of the present invention.

[0068] Firstly, the distributed scheduling method according to the embodiment of the present invention will be described in detail with reference to the accompanying drawings.

[0069] The distributed scheduling method according to the present invention is used for scheduling tasks in the big data platform.

[0070] For example, layering tasks in a big data platform can include:

[0071] 0_ods laye...

Abstract

The invention provides a distributed scheduling method and device, electronic equipment and a computer storage medium. The distributed scheduling method is used for scheduling tasks in a big data platform and comprises the following steps: S1, determining the range of calculation tasks that need to be scheduled, and generating an initial task set; S2, performing layer-by-layer statistics on the tasks of the upstream reference objects of all tasks in the initial task set, and arranging the tasks in upstream-to-downstream order; S3, estimating the memory and processor overhead of each task, calculating a score for each task through a resource overhead assessment algorithm, and sorting by score to generate an execution task set; and S4, distributing the tasks in the execution task set to a plurality of computing nodes of the big data platform so that the computing nodes execute their respective tasks. By performing distribution control and overhead measurement, the distributed scheduling method provided by the invention reduces resource waste.
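Steps S1 to S4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not disclose the resource overhead assessment algorithm or the distribution policy, so the weighted-sum `score`, the heaviest-first ordering within a layer, and the least-loaded-node assignment below are all assumptions made for illustration. The `Task` fields, function names, and the task and node names in the usage note are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    upstream: list = field(default_factory=list)  # names of tasks this one depends on
    mem_gb: float = 1.0      # estimated memory overhead (assumed unit)
    cpu_cores: float = 1.0   # estimated processor overhead (assumed unit)


def layer_tasks(tasks):
    """S2: assign each task a layer; 0 if it has no upstream task in the set,
    otherwise 1 + the deepest layer among its upstream tasks."""
    depth = {}

    def visit(name, path):
        if name in depth:
            return depth[name]
        if name in path:
            raise ValueError(f"dependency cycle at {name}")
        # Assumption: upstream references outside the initial set are ignored.
        ups = [u for u in tasks[name].upstream if u in tasks]
        depth[name] = 1 + max((visit(u, path | {name}) for u in ups), default=-1)
        return depth[name]

    for name in tasks:
        visit(name, frozenset())
    return depth


def score(task, w_mem=0.5, w_cpu=0.5):
    """S3: hypothetical overhead score, a weighted sum of the two estimates."""
    return w_mem * task.mem_gb + w_cpu * task.cpu_cores


def build_execution_set(tasks):
    """S2 + S3: order by layer, then by descending score (assumed tie-break: name)."""
    depth = layer_tasks(tasks)
    return sorted(tasks, key=lambda n: (depth[n], -score(tasks[n]), n))


def distribute(tasks, order, nodes):
    """S4: give each task, in execution order, to the node with the lowest
    accumulated score so far (an assumed greedy policy)."""
    load = {n: 0.0 for n in nodes}
    plan = {n: [] for n in nodes}
    for name in order:
        target = min(nodes, key=lambda n: (load[n], n))
        plan[target].append(name)
        load[target] += score(tasks[name])
    return plan
```

For example, with an initial task set (S1) of two source tasks `ods_orders` (2 GB, 1 core) and `ods_users` (1 GB, 1 core), a task `dwd_orders` (4 GB, 2 cores) that depends on both, and a task `rpt_daily` (1 GB, 1 core) that depends on `dwd_orders`, `build_execution_set` yields the order `ods_orders`, `ods_users`, `dwd_orders`, `rpt_daily`. The sketch only produces an assignment plan; waiting for upstream tasks that run on another node is left out.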

Description

Technical field

[0001] The present invention relates to big data platform technology, and in particular to a distributed scheduling method and device, electronic equipment and a non-transitory computer storage medium for tasks of a big data platform.

Background

[0002] As the scope of business expands, a company's big data platform usually undertakes more and more data computing tasks and gradually becomes an important pillar platform for data services. Taking the big data platform of an e-commerce company as an example: starting from an initial 30 data collection tasks and 20-30 calculation tasks for the report tasks of the e-commerce analysis center, it gradually grew to cover entrusted reports, financial statements, supply chain business, risk warning, GMV operation daily reports and other businesses, with more than 400 data collection tasks and more than 700 data analysis and calculation tasks; at the same time, the object level of...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50
CPC: G06F9/5016, G06F9/5027, G06F9/5066
Inventor: 冯若寅, 万仕龙, 邹晓峰
Owner: 欧冶云商股份有限公司