Hadoop optimal parameter evaluation method and device

A Hadoop cluster and optimal-parameter technology, applied in multi-programming devices, electrical digital data processing, program control design, etc. It addresses the problems of resources not being considered, impact on operating performance, and high skill requirements, and achieves the effects of saving time and resource cost, shortening run time, and improving cluster performance.

Active Publication Date: 2020-10-30
SHANDONG UNIV +1
Cites: 5. Cited by: 2.

AI Technical Summary

Problems solved by technology

However, the inventors found that this method has a drawback: many parameters affect job performance, yet the coverage of the formula is extremely limited. Other important parameters are therefore ignored, and the optimization effect is poor. The method also requires the user to have sufficient mastery and a high level of ability. The second type of method takes the parameters that have an important impact on job performance as the input of a prediction model and uses a training data set to obtain a model relating job execution time to parameter configuration. The inventors found, however, that existing models consider only parameters and not resources. Resources also have a very important impact on job performance: when resources are insufficient, execution slows down.

Embodiment Construction

[0031] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0032] It should be noted that the following detailed description is exemplary and intended to provide further explanation of the present invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

[0033] It should be noted that the terminology used here is only for describing specific embodiments and is not intended to limit exemplary embodiments according to the present invention. As used herein, unless the context clearly dictates otherwise, the singular forms are intended to include the plural forms as well. It should also be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of features, steps, operations, devices, components and/or combinations thereof.

[0034] Su...

Abstract

The invention belongs to the field of big data information processing, and provides a Hadoop optimal parameter evaluation method and device. The Hadoop optimal parameter evaluation method comprises the following steps: receiving jobs transmitted by a client, constructing a job sequence, and determining the number of jobs in the job sequence; according to the number of jobs in the job sequence, calling a matched tuning scheme from a scheme database to evaluate Hadoop parameters, obtaining Hadoop optimal configuration parameters with the shortest job completion time, and outputting the Hadoop optimal configuration parameters to a Hadoop cluster server, wherein a first tuning scheme for a single job and a second tuning scheme for a non-single job are pre-stored in the scheme database.
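The dispatch described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how either tuning scheme evaluates parameters, so `TuningScheme`, `select_scheme`, `evaluate_optimal_params`, the `num_reducers` parameter, and both toy time estimators are hypothetical names and formulas invented for the example. Only the control flow (build the job sequence, count the jobs, pick the single-job or non-single-job scheme from a scheme database, return the configuration with the shortest estimated completion time) follows the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Params = Dict[str, int]


@dataclass(frozen=True)
class TuningScheme:
    """A pre-stored tuning scheme (hypothetical structure)."""
    name: str
    # Hypothetical estimator: completion time of a job sequence under `params`.
    estimate_time: Callable[[Sequence[str], Params], float]


def select_scheme(job_count: int, scheme_db: Dict[str, TuningScheme]) -> TuningScheme:
    """Pick the first scheme for a single job, the second for a non-single job."""
    if job_count < 1:
        raise ValueError("job sequence is empty")
    return scheme_db["single"] if job_count == 1 else scheme_db["multi"]


def evaluate_optimal_params(
    jobs: Sequence[str],
    scheme_db: Dict[str, TuningScheme],
    candidates: List[Params],
) -> Params:
    """Return the candidate configuration with the shortest estimated time."""
    job_sequence = list(jobs)  # construct the job sequence
    scheme = select_scheme(len(job_sequence), scheme_db)
    return min(candidates, key=lambda p: scheme.estimate_time(job_sequence, p))


# Toy estimators, assumptions for illustration only.
def single_job_time(jobs: Sequence[str], p: Params) -> float:
    return 100 / p["num_reducers"] + 2 * p["num_reducers"]


def multi_job_time(jobs: Sequence[str], p: Params) -> float:
    return len(jobs) * 100 / p["num_reducers"] + 10 * p["num_reducers"]


SCHEME_DB = {
    "single": TuningScheme("first tuning scheme", single_job_time),
    "multi": TuningScheme("second tuning scheme", multi_job_time),
}
CANDIDATES = [{"num_reducers": n} for n in (2, 4, 8)]

if __name__ == "__main__":
    print(evaluate_optimal_params(["job-a"], SCHEME_DB, CANDIDATES))
    print(evaluate_optimal_params(["job-a", "job-b", "job-c"], SCHEME_DB, CANDIDATES))
```

In this sketch the returned dictionary stands in for the "Hadoop optimal configuration parameters"; sending it to the Hadoop cluster server is left out because the abstract gives no detail on that step.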

Description

Technical field

[0001] The invention belongs to the field of big data information processing, and in particular relates to a Hadoop optimal parameter evaluation method and device.

Background technique

[0002] The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.

[0003] Google is a pioneer in big data processing, and MapReduce, GFS, and BigTable are three important technologies that lay the foundation for big data distributed processing. MapReduce is a distributed computing technology and a simplified distributed programming model. Based on these three core Google technologies, Apache has developed open source software called Hadoop, a framework that implements the MapReduce computing model. The performance of MapReduce is significantly affected by configuration parameters.

[0004] There are hundreds of configuration parameters that affect MapReduce job execution, such ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/48
CPC: G06F9/4881
Inventor: 史玉良, 张建林, 王心鹤, 孔凡玉, 梁飞, 马智强
Owner: SHANDONG UNIV