Heterogeneous Hadoop cluster-based task scheduling method

A task scheduling technology for Hadoop clusters, applied in the field of big data. It addresses the failure of existing scheduling strategies to meet performance requirements, improving cluster resource utilization and shortening job completion time.

Active Publication Date: 2018-08-31
NORTHWEST UNIV(CN)

AI Technical Summary

Problems solved by technology

Most existing scheduling strategies fall short of performance requirements such as stability, scalability, efficiency, and load balancing.


Examples

Embodiment

[0071] Two different types of physical hosts are used to form a heterogeneous Hadoop cluster. One type of host has a 4-core CPU (Intel Core i7-4790) with a base clock of 3.6 GHz and 16 GB of memory. The other type also has a 4-core CPU (Intel Xeon E3-1231 v3) with a base clock of 3.4 GHz and 16 GB of memory. The Hadoop cluster consists of 6 virtual machine nodes, distributed across the two types of hosts. Because the cluster is relatively small, the HDFS replication factor is reduced from 3 to 2, and the HDFS block size is set to 64 MB. The virtual machines run on VMware Workstation 12.0 with Ubuntu 14.04 as the guest operating system, and the cluster runs Hadoop 2.4.1. The specific configuration of the cluster is shown in Table 1.
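To make the storage settings above concrete, the following is a minimal sketch of the arithmetic they imply. The block size (64 MB) and replication factor (2) come from the embodiment; in Hadoop 2.x these are normally controlled by the `dfs.blocksize` and `dfs.replication` properties. The function name `hdfs_footprint` and the 1 GB input size are hypothetical and chosen only for illustration.

```python
import math

BLOCK_SIZE_MB = 64  # HDFS block size stated in the embodiment
REPLICATION = 2     # replication factor stated in the embodiment (Hadoop's default is 3)


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (logical_blocks, stored_replicas, raw_storage_mb) for one file.

    The last block of a file may be partially filled; HDFS stores only the
    actual bytes, so raw storage is file size times the replication factor.
    """
    blocks = math.ceil(file_size_mb / block_size_mb)
    return blocks, blocks * replication, file_size_mb * replication


# Hypothetical 1 GB input file (not a figure from the patent).
blocks, replicas, raw_mb = hdfs_footprint(1024)
print(blocks, replicas, raw_mb)
```

With the default input format, a MapReduce job usually launches roughly one map task per block, so the block count also gives a rough idea of how many map tasks the scheduler has to place across the 6 nodes.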

[0072] Table 1 Hadoop cluster configuration

[0073]

[0074] In this embodiment, a comparative experiment is ...

Abstract

The invention discloses a task scheduling method based on heterogeneous Hadoop clusters. Tasks in the Hadoop task queues are scheduled according to the real-time performance of each node in the cluster, taking into account the matching degree between each node and the different tasks. The method addresses several shortcomings of existing approaches: existing scheduling technology targets only homogeneous clusters in big data centers, and existing scheduling algorithms suffer from low cluster resource utilization, unbalanced node load, and relatively long job completion times.
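The idea in the abstract can be illustrated with a minimal sketch, under stated assumptions. The text visible here does not define how the matching degree is computed, so the weighted score below, the `Node` and `Task` fields, the 0.25 load penalty, and the greedy assignment loop are all hypothetical stand-ins rather than the patented method.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_free: float  # fraction of CPU currently idle, 0..1 (hypothetical metric)
    mem_free: float  # fraction of memory currently free, 0..1 (hypothetical metric)
    free_slots: int  # number of tasks the node can still accept


@dataclass
class Task:
    name: str
    cpu_weight: float  # how CPU-bound the task is, 0..1 (hypothetical)
    mem_weight: float  # how memory-bound the task is, 0..1 (hypothetical)


def matching_degree(node, task):
    """Assumed score: a task matches a node with spare capacity where it needs it."""
    return task.cpu_weight * node.cpu_free + task.mem_weight * node.mem_free


def schedule(queue, nodes):
    """Greedily assign each queued task to the node with the highest matching degree."""
    assignments = []
    for task in queue:
        candidates = [n for n in nodes if n.free_slots > 0]
        if not candidates:
            break  # no capacity left; remaining tasks stay queued
        best = max(candidates, key=lambda n: matching_degree(n, task))
        best.free_slots -= 1
        # Crude stand-in for a real-time performance update after placement.
        best.cpu_free = max(0.0, best.cpu_free - 0.25 * task.cpu_weight)
        best.mem_free = max(0.0, best.mem_free - 0.25 * task.mem_weight)
        assignments.append((task.name, best.name))
    return assignments


nodes = [Node("vm1", 0.9, 0.3, 2), Node("vm2", 0.4, 0.8, 2)]
queue = [Task("sort", 0.8, 0.2), Task("join", 0.2, 0.8), Task("grep", 0.9, 0.1)]
print(schedule(queue, nodes))
```

In this toy run the CPU-heavy tasks land on the node with more idle CPU and the memory-heavy task on the node with more free memory, which is the kind of node-to-task matching the abstract describes. A real scheduler would obtain node metrics from cluster heartbeats rather than from a fixed penalty.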

Description

Technical field

[0001] The invention belongs to the field of big data and relates to a task scheduling method based on heterogeneous Hadoop clusters.

Background technique

[0002] With the rapid development of Internet applications, the Internet has entered the Web 2.0 era, humanity has entered an age of information explosion, and the amount of information on the Internet is growing exponentially. Large volumes of data are being generated in many fields. In the Internet of Things, for example, sensors and wearable devices generate data continuously. In e-commerce, large amounts of data are generated when users browse products, add items to shopping carts, and place orders. In social networking, communication produces large amounts of video, audio, and text data. In addition, user behavior logs recorded by social network applications are usually measured in GB or even TB...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/48, G06F9/50
CPC: G06F9/4881, G06F9/505, G06F9/5088
Inventor: 吴奇石, 王猛, 侯爱琴, 张晓阳, 王永强
Owner: NORTHWEST UNIV (CN)