Distributed scheduling method, device, equipment and storage medium based on ER relationship

A distributed scheduling method applied in the field of big data platforms. It addresses the large number of fragment and junk files, the high maintenance effort and input cost, and the inability to quickly locate and troubleshoot abnormal tasks, and it improves concurrency and reduces overhead time.

Active Publication Date: 2020-10-30
欧冶云商股份有限公司

AI Technical Summary

Problems solved by technology

[0007] Abnormal tasks cannot be quickly located and investigated across the overall range of operation;
[0008] Fault analysis and location with the existing scheduler take a long time, so the execution logs of all batches must be stored to leave enough time to locate the fault information before it is overwritten at the start of the next scheduling run. The logs of a single hour-level computing task are stored 20+ times a day, so about 15,000 or more computing log files are generated every day, and the daily log volume on distributed storage exceeds 1 GB. As a result, many fragment and garbage files are generated during operation, and maintenance effort and input costs are high.


Embodiment Construction

[0089] In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the described embodiments fall within the protection scope of the present invention.

[0090] Firstly, the distributed scheduling method according to the embodiment of the present invention will be described in detail with reference to the accompanying drawings.

[0091] The distributed scheduling method according to the present invention is used for scheduling tasks in the big data platform.

[0092] For example, the layering of tasks in a big data platform can include:

[0093] 0_ods laye...

Abstract

The invention provides a distributed scheduling method and device based on an ER relationship, an electronic device and a computer storage medium. The distributed scheduling method based on the ER relationship is used for scheduling tasks in a big data platform and comprises the following steps: S1, determining the range of calculation tasks that need to be scheduled and generating an initial task set; S2, arranging all tasks in the initial task set hierarchically based on the ER relationship; S3, estimating the memory and processor overhead of each task, calculating a score for each task through a resource overhead evaluation algorithm, and sorting the tasks by score to generate an execution task set; and S4, allocating the tasks in the execution task set to a plurality of computing nodes of the big data platform so that the computing nodes execute their respective tasks. With this distributed scheduling method, allocation is controlled and overhead is measured and calculated, which addresses the problem of resource waste.
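Steps S2 to S4 can be illustrated with a minimal sketch. The abstract does not disclose the resource overhead evaluation algorithm or the allocation policy, so everything concrete here is an assumption made for illustration: the `Task` fields, the weighted-sum `score`, the "heaviest first" ordering within a layer, and the greedy least-loaded node assignment are hypothetical and are not the patented method. Task dependencies stand in for the ER relationship between the tables the tasks read and write.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    memory_gb: float   # estimated memory overhead (input to S3)
    cpu_cores: float   # estimated processor overhead (input to S3)
    upstream: list = field(default_factory=list)  # names of tasks this task depends on


def build_layers(tasks):
    """S2 (sketch): a task's layer is one more than the deepest upstream layer."""
    by_name = {t.name: t for t in tasks}
    layers = {}

    def depth(name, path=()):
        if name in layers:
            return layers[name]
        if name in path:
            raise ValueError(f"cyclic dependency at {name}")
        ups = [u for u in by_name[name].upstream if u in by_name]
        layers[name] = 1 + max((depth(u, path + (name,)) for u in ups), default=-1)
        return layers[name]

    for t in tasks:
        depth(t.name)
    return layers


def score(task, w_mem=0.5, w_cpu=0.5):
    """S3 (assumed formula): weighted sum of the estimated overheads."""
    return w_mem * task.memory_gb + w_cpu * task.cpu_cores


def build_execution_set(tasks):
    """S3 (sketch): order by layer, then by descending score, then by name."""
    layers = build_layers(tasks)
    ordered = sorted(tasks, key=lambda t: (layers[t.name], -score(t), t.name))
    return ordered, layers


def assign(tasks, nodes):
    """S4 (assumed policy): within each layer, give the next task to the least-loaded node."""
    ordered, layers = build_execution_set(tasks)
    plan, current_layer, heap = [], None, []
    for t in ordered:
        if layers[t.name] != current_layer:
            # Layers run one after another, so node loads start from zero per layer.
            current_layer = layers[t.name]
            heap = [(0.0, n) for n in nodes]
            heapq.heapify(heap)
        load, node = heapq.heappop(heap)
        plan.append((current_layer, node, t.name))
        heapq.heappush(heap, (load + score(t), node))
    return plan
```

Calling `assign(tasks, ["node-a", "node-b"])` returns a list of `(layer, node, task_name)` tuples. Resetting the load per layer reflects the assumption that a layer starts only after the previous one has finished; a real scheduler would also need to track actual completion and failures.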

Description

Technical field

[0001] The present invention relates to big data platform technology, in particular to a distributed scheduling method and device based on ER relationship for tasks of a big data platform, electronic equipment, and a non-transitory computer storage medium.

Background technique

[0002] As the scope of business development gradually expands, a company's big data platform usually undertakes more and more data computing tasks and gradually becomes an important pillar platform for data services. Taking the big data platform of an e-commerce company as an example, the platform started with about 30 data collection tasks and 20-30 calculation tasks for the reports of the e-commerce analysis center, and has gradually grown to cover entrusted reports, financial statements, supply chain business, risk warning, GMV operation daily reports and other businesses, with more than 400 data collection tasks and more than 700 data analysis and calculation tasks; at the same...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/50, G06F16/18, G06F16/2458
CPC: G06F9/5016, G06F9/5027, G06F16/1815, G06F16/2462
Inventor: 冯若寅, 万仕龙, 邹晓峰, 仲跻炜, 朱彭生
Owner: 欧冶云商股份有限公司