
Hadoop configuration parameter selection method based on kernel clustering feature selection

A technique for configuration parameter and feature selection, applied to multi-program arrangements, instruments, and character and pattern recognition. It addresses the problem that the configuration parameters most important to the running performance of a distributed processing system cannot be selected, which adds to the configuration work of distributed system administrators. The method reduces the maintenance workload and improves the effect of parameter optimization.

Status: Inactive; Publication Date: 2020-08-11
CHONGQING UNIV OF POSTS & TELECOMM
Cites: 4; Cited by: 1

AI Technical Summary

Problems solved by technology

Existing approaches cannot select the important configuration parameters that affect the performance of a distributed processing system, which increases the configuration workload of the distributed system administrator.

Examples

Embodiment Construction

[0055] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0056] To solve the above problems of the prior art, the present invention proposes a Hadoop configuration parameter selection method based on kernel clustering feature selection, comprising the following steps:

[0057] S1. Collect data sets of different configuration parameters of the Hadoop platform;

[0058] S2. Establish a vector model representing the Hadoop platform configuration parameters, and represent this vector model with a kernel wi...


Abstract

The invention belongs to the technical field of distributed processing systems, and particularly relates to a Hadoop configuration parameter selection method based on kernel clustering feature selection. The method comprises the following steps: collecting data sets of different configuration parameters of a Hadoop platform; establishing a vector model representing the Hadoop platform configuration parameters, and representing the vector model by a kernel width vector; establishing, based on the kernel width vector, a kernel function that reflects the importance of the configuration parameters; executing a kernel clustering algorithm to form a clustering set; updating the kernel width vector v, which represents the sample configuration parameters in the clustering set, by a gradient descent algorithm, and deleting any element of v that is smaller than a preset threshold; and, if the configuration parameter sets corresponding to the kernel width vectors at two adjacent moments are identical, outputting the set of configuration parameters corresponding to the kernel width vector at that moment. The invention selects a small number of important configuration parameters in the system, thereby reducing the maintenance workload of platform administrators in the distributed processing system.
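The loop described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the excerpt does not give the kernel function, the clustering objective, or the gradient formula, so the sketch assumes a Gaussian kernel in which each entry of v weights one parameter, plain kernel k-means, and the mean within-cluster kernel scatter as the quantity to descend. The function names, the threshold, the learning rate, and the synthetic data are all hypothetical, and the samples are assumed to be scaled to a comparable range beforehand.

```python
import numpy as np


def weighted_kernel(diff2, v):
    """Assumed kernel: K_ij = exp(-sum_m v_m * (x_im - x_jm)^2)."""
    return np.exp(-(diff2 * v).sum(axis=2))


def kernel_kmeans(K, k, n_iter=20):
    """Kernel k-means on a precomputed kernel matrix, seeded deterministically."""
    n = K.shape[0]
    seeds = [0]
    while len(seeds) < k:
        # Next seed: the sample least similar to every seed chosen so far.
        seeds.append(int(np.argmin(K[:, seeds].max(axis=1))))
    labels = np.argmax(K[:, seeds], axis=1)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            m = labels == c
            if m.any():
                # Squared feature-space distance to the mean of cluster c.
                dist[:, c] = (np.diag(K) - 2 * K[:, m].mean(axis=1)
                              + K[np.ix_(m, m)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels


def scatter_gradient(diff2, K, labels):
    """Gradient of the mean within-cluster kernel scatter with respect to v."""
    n, d = diff2.shape[0], diff2.shape[2]
    grad = np.zeros(d)
    for c in np.unique(labels):
        m = labels == c
        Kc = K[np.ix_(m, m)]          # kernel values inside cluster c
        Dc = diff2[np.ix_(m, m)]      # per-parameter squared differences
        grad += (Kc[:, :, None] * Dc).sum(axis=(0, 1)) / m.sum()
    return grad / n


def select_parameters(X, k, threshold=0.1, lr=1.0, inner_steps=30, max_rounds=20):
    """Return the indices of the parameters (columns of X) that survive pruning."""
    d = X.shape[1]
    diff2 = (X[:, None, :] - X[None, :, :]) ** 2
    v = np.ones(d)                    # kernel width vector, one entry per parameter
    active = np.ones(d, dtype=bool)
    for _round in range(max_rounds):
        labels = kernel_kmeans(weighted_kernel(diff2, v), k)
        for _step in range(inner_steps):
            K = weighted_kernel(diff2, v)
            step = lr * scatter_gradient(diff2, K, labels)
            v = np.clip(v - step, 0.0, None) * active
        keep = active & (v >= threshold)
        if not keep.any():
            keep[np.argmax(v)] = True  # safeguard added here: never delete everything
        v = v * keep
        if np.array_equal(keep, active):
            break                      # same parameter set at two adjacent rounds
        active = keep
    return [int(j) for j in np.flatnonzero(active)]


# Synthetic data (not Hadoop measurements): columns 0 and 1 separate two
# groups of samples, columns 2 and 3 are uniform noise.
rng = np.random.default_rng(0)
informative = np.vstack([rng.normal(0.0, 0.05, (30, 2)),
                         rng.normal(1.0, 0.05, (30, 2))])
X = np.hstack([informative, rng.uniform(0.0, 1.0, (60, 2))])
selected = select_parameters(X, k=2)
print(selected)
```

With this assumed objective every entry of v can only shrink, and the entries of parameters that vary widely inside a cluster shrink fastest, so those parameters cross the threshold first. An objective with a separation or regularization term would behave differently, and the excerpt does not say which one the method uses.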

Description

Technical field

[0001] The invention belongs to the technical field of distributed processing systems, and in particular relates to a Hadoop configuration parameter selection method based on kernel clustering feature selection.

Background art

[0002] Hadoop is a widely used distributed processing system built on the MapReduce model. Parameter optimization is one of the important issues in improving the performance of Hadoop jobs. The difficulty mainly comes from the more than 190 configuration parameters of the MapReduce model, which mainly cover I/O management, slot resource allocation, memory management, concurrency, and map and reduce configuration. It is difficult for general Hadoop platform administrators to fully understand and correctly configure these configuration parameters, because it is an NP (nondeterministic polynomial) problem to configure all parameters correctly to make MapR...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/46, G06K9/62
CPC: G06F9/465, G06F18/23, G06F18/23213
Inventor: 刘俊, 唐苏乐, 徐光侠, 马创, 解绍词, 杨敬尊, 赵娟, 李威
Owner CHONGQING UNIV OF POSTS & TELECOMM