A scheduling method based on backup task running time estimation in a Hadoop big data platform

A backup task technology for big data platforms, applied to scheduling based on backup task running time estimation. It addresses inefficient backup task scheduling caused by inaccurate time estimation, the invalid allocation and execution of backup tasks, and a speculative execution mechanism that is rendered meaningless as a result. The effects are shorter job turnaround time, greater reliability, and improved efficiency.

Active Publication Date: 2019-04-16
重庆信科通信工程有限公司

AI Technical Summary

Problems solved by technology

However, the speculative execution mechanism performs poorly in both the accuracy of its remaining-time estimates and the efficiency of its backup task scheduling. As a result, a large number of backup tasks finish no earlier than the original slow tasks, so their allocation and execution are wasted.
These deficiencies waste system resources, and the invalid backup tasks defeat the purpose of the speculative execution mechanism.

Examples

Embodiment Construction

[0054] The preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0055] Figure 1 is a high-level flow chart of the scheme of the present invention, and Figure 2 is the flow chart of the scheduling method based on backup task running time estimation. As shown in the figures, the scheduling method based on backup task running time estimation in the Hadoop big data platform mainly comprises the following seven steps:

Step 1: Determine whether the TaskTracker, that is, the task-process entity requesting a task of the Job on the JobTracker node (the task requester), is a slow node.
Step 2: Check whether the number of tasks already started in the Job on the JobTracker node exceeds Threshold.
Step 3: Filter out all tasks in the Job that meet the conditions and save them in the candidates table; calculate the remaining time leftTime of each task according to LATE (the lon...
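The three steps visible above can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation. The `Task` fields, the function names, the filter conditions in Step 3, and the choice to return no candidates for a slow requester or once the threshold is reached are all assumptions. The remaining-time formula is the usual LATE heuristic (remaining work divided by the observed progress rate), which the truncated text does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    """Hypothetical view of a running task; field names are assumptions."""
    task_id: str
    progress: float          # fraction of work done, in [0, 1]
    elapsed: float           # seconds since the task started
    has_backup: bool = False


def left_time(task: Task) -> float:
    """LATE-style estimate: (1 - progress) / (progress / elapsed)."""
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        return float("inf")  # no progress information yet
    return task.elapsed * (1.0 - task.progress) / task.progress


def select_candidates(tasks: List[Task], requester_is_slow: bool,
                      started_count: int, threshold: int) -> List[Task]:
    # Step 1 (assumed outcome): a slow requester receives no backup task.
    if requester_is_slow:
        return []
    # Step 2 (assumed outcome): stop once the started-task count reaches the cap.
    if started_count >= threshold:
        return []
    # Step 3 (assumed conditions): unfinished tasks that have no backup yet.
    candidates = [t for t in tasks if not t.has_backup and 0.0 < t.progress < 1.0]
    # Longest estimated remaining time first.
    return sorted(candidates, key=left_time, reverse=True)
```

For example, a task at 20% progress after 100 s has a longer estimated remaining time than one at 50% after 100 s, so it is ranked first in the candidates table.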

Abstract

The invention relates to a scheduling method based on backup task running time estimation in a Hadoop big data platform and belongs to the technical field of cloud computing platform optimization. The method uses the bandwidth-awareness capability of SDN to build a backup task running time estimation model, BWRE, and optimizes backup task scheduling under the speculative execution mechanism. When a backup task is about to be assigned to a requesting node TTi, the remaining time of the slow task is compared with the estimated running time of the backup task to be started on TTi. This improves the credibility of the backup task, that is, the confidence that it will end earlier than the original slow task, and so increases the proportion of effective backup tasks. The method shortens job turnaround time and reduces the system resource waste caused by invalid backup tasks.
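The comparison described in the abstract might look like the sketch below. The abstract does not give the BWRE model, so the estimate here (input transfer time over the available bandwidth plus a compute term) is a stand-in assumption, and every function and parameter name is hypothetical. Only the final comparison, which launches the backup when it is expected to finish before the slow task, follows the text.

```python
def estimate_backup_runtime(input_bytes: float,
                            bandwidth_bytes_per_s: float,
                            compute_seconds: float) -> float:
    """Stand-in for BWRE (assumption): transfer time plus compute time on TTi."""
    if bandwidth_bytes_per_s <= 0:
        return float("inf")  # no usable bandwidth, so never a good candidate
    return input_bytes / bandwidth_bytes_per_s + compute_seconds


def should_launch_backup(slow_task_left_time: float,
                         backup_estimate: float) -> bool:
    """Launch only if the backup is expected to end before the slow task."""
    return backup_estimate < slow_task_left_time


# Hypothetical numbers: 1000 bytes at 100 bytes/s plus 5 s of compute.
estimate = estimate_backup_runtime(1000, 100, 5)
launch = should_launch_backup(20.0, estimate)
```

With these made-up inputs the backup is expected to need 15 s against 20 s remaining for the slow task, so it would be launched; with only 10 s remaining it would be rejected as an invalid backup.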

Description

Technical field

[0001] The invention belongs to the technical field of cloud computing platform optimization, and relates to a scheduling method based on backup task running time estimation in a Hadoop big data platform.

Background

[0002] With the rapid development of the information technology industry, the amount of data generated by enterprises, organizations and individuals is increasing day by day. We live in an era in which data is growing faster than ever before. In 2012, Google had data centers with millions of servers all over the world, processing an average of 3.3 billion search requests per day and more than 400 PB of user-generated data per month; in the same year, Facebook announced that its data centers received an average of 300 million user-uploaded pictures every day, and that the new data in its database exceeded 500 TB. IDC's 2014 annual data report predicted that 4 billion people would be connected to the Internet in 2020, and the ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/48, H04L29/08
CPC: G06F9/4881, H04L67/1097
Inventor: 尚凤军, 李路中, 闫辰云
Owner: 重庆信科通信工程有限公司