A full-link benchmarking system for distributed scheduling systems

A benchmarking technology for scheduling systems, applicable to transmission systems, digital transmission systems, data exchange networks, and similar fields. It addresses problems such as the lack of data set construction, load sets built on a single software stack, and distorted evaluation results, with the effects of ensuring data generation speed, retaining the characteristics of real data, and ensuring validity.

Active Publication Date: 2021-05-28
BEIHANG UNIV

AI Technical Summary

Problems solved by technology

[0012] (1) The load sets and test index sets in existing benchmarking systems mainly target big data systems as a whole, whereas the distributed scheduling system is only one important pluggable module of a big data system. The final evaluation result therefore reflects the overall performance of all modules working together and does not represent the performance of the distributed scheduling system itself.
Research on benchmarking technology for distributed scheduling systems is currently lacking, and a benchmarking system is urgently needed to evaluate distributed scheduling systems quantitatively in a fair and reasonable way;
[0013] (2) Lack of data set construction
Most of the test data used in current benchmarking systems is either randomly generated or crawled from existing data on the Internet. Randomly generated data does not reflect the characteristics of real data, and crawling data has a relatively high time cost, so an evaluation cannot be carried out quickly;
[0014] (3) The load set is implemented on a single software stack
Most of the workloads in current benchmarking systems are Hadoop-type tasks. Loads implemented on different software stacks differ considerably in computing logic and data processing, so it is unreasonable to evaluate using only Hadoop-type tasks; other software stacks need to be covered;
[0015] (4) There is no test index set designed for distributed scheduling systems
The existing indicators reflect the overall performance of the big data system and cannot directly and objectively evaluate the distributed scheduling system;
[0016] (5) There is no uniform and quantifiable load submission strategy
Current benchmarking systems give few instructions on the load submission strategy, which leaves a large degree of freedom in the evaluation.
The load submission strategy has a major impact on system evaluation. Without a unified and quantifiable submission method, it is difficult to compare systems fairly against one another, and the evaluation results become distorted;
[0017] (6) Lack of indicator collection and monitoring modules
Current benchmarking systems do not include indicator collection and monitoring modules, which is inconvenient: the evaluator has to choose indicator collection and monitoring tools separately;
[0018] (7) Lack of a full-link test system
Current benchmarking systems focus mainly on constructing load sets and pay relatively little attention to data sets, test index sets, load submission strategy design, load submission, indicator collection, and monitoring. The evaluator has to find separate tools for each of these before testing, which makes the test process more complicated;
[0019] (8) The simulator of the native cluster management system has several problems: (a) the scheduler and the task node simulator run on the same computing node, and threads are used to simulate tasks applying for resources and nodes reporting heartbeat information, which directly affects the evaluation of the scheduler; (b) a layer of encapsulation is added around the pluggable scheduler at the scheduling layer, but the implementation of this encapsulation contains some unreasonable logic; (c) because it is designed for generality, the simulator can only obtain some index data from the periphery and cannot obtain the scheduler's internal indices; (d) the simulator focuses on testing the performance of the scheduler, whereas actual optimization of the resource manager involves many aspects, so its evaluation is not comprehensive enough.
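
Problem (5) above concerns the absence of a uniform, quantifiable load submission strategy. As a minimal sketch of what "quantifiable" could mean in practice, the following Python generates submission timestamps from an explicit rate parameter and a seed, so that two evaluated systems can be given exactly the same schedule. The function names, the fixed-rate and Poisson modes, and the parameters are illustrative assumptions and are not taken from the patent.

```python
import random
from typing import List


def fixed_rate_schedule(num_jobs: int, jobs_per_second: float) -> List[float]:
    """Return submission offsets (seconds) with a constant gap between jobs."""
    if jobs_per_second <= 0:
        raise ValueError("jobs_per_second must be positive")
    interval = 1.0 / jobs_per_second
    return [i * interval for i in range(num_jobs)]


def poisson_schedule(num_jobs: int, jobs_per_second: float, seed: int) -> List[float]:
    """Return submission offsets with exponentially distributed gaps.

    The seed makes the schedule reproducible, so every system under test
    can be driven with the same arrival pattern (hypothetical design choice).
    """
    if jobs_per_second <= 0:
        raise ValueError("jobs_per_second must be positive")
    rng = random.Random(seed)
    offsets: List[float] = []
    now = 0.0
    for _ in range(num_jobs):
        now += rng.expovariate(jobs_per_second)
        offsets.append(now)
    return offsets
```

Recording the mode, rate, and seed alongside the results would be one way to make a submission strategy comparable across runs.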


Embodiment Construction

[0036] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0037] The invention provides a full-link benchmarking system for distributed scheduling systems, comprising a data set module, a load set module, a test index set module, a load submission strategy module, a performance index monitoring and collection module, and a client. The client obtains the configuration parameters from the configuration file and is responsible for the...

Abstract

A full-link benchmarking system for a distributed scheduling system, characterized in that it includes a data set module, a load set module, a test index set module, a load submission strategy module, a performance index monitoring and collection module, and a client. The client obtains the configuration parameters from the configuration file and is responsible for connecting and controlling the modules, submitting tasks, and processing the feedback returned after the distributed scheduling system has been tested. The data set module provides the test data required to run the loads. The load set module prepares the load set according to the configured load type. The test index set module selects the test index set according to the configured test indices. The load submission strategy module prepares a submission script according to the configured load submission method and uses the script to submit the load to the system according to the established strategy. The performance index monitoring and collection module collects index information in each dimension in real time and sends it to the client for front-end display.
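
The division of responsibilities described in the abstract can be sketched roughly as follows. This is only an illustration of how a client might wire the five modules together; the class names, configuration fields, placeholder data paths, and callback signatures are all hypothetical, since the patent text shown here does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BenchmarkConfig:
    # Field names are hypothetical; the source only says that the client
    # reads configuration parameters from a configuration file.
    load_types: List[str]
    test_indices: List[str]
    submission_method: str


class Client:
    """Connects the modules and drives one benchmark run."""

    def __init__(self, config: BenchmarkConfig,
                 submit: Callable[[str, str, str], None],
                 collect: Callable[[List[str]], Dict[str, float]]):
        self.config = config
        self.submit = submit      # stand-in for submitting to the system under test
        self.collect = collect    # stand-in for the monitoring and collection module

    def prepare_data(self) -> Dict[str, str]:
        # Data set module: provide test data for each load (placeholder paths).
        return {load: f"data/{load}.input" for load in self.config.load_types}

    def prepare_loads(self) -> List[str]:
        # Load set module: prepare loads according to the configured types.
        return list(self.config.load_types)

    def select_indices(self) -> List[str]:
        # Test index set module: select indices according to the configuration.
        return list(self.config.test_indices)

    def run(self) -> Dict[str, float]:
        data = self.prepare_data()
        indices = self.select_indices()
        # Load submission strategy module: submit each load using the
        # configured submission method.
        for load in self.prepare_loads():
            self.submit(load, data[load], self.config.submission_method)
        # Monitoring and collection module: return the selected indices.
        return self.collect(indices)
```

Passing the submission and collection steps in as callables keeps the sketch independent of any particular scheduler or monitoring tool.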

Description

Technical field

[0001] The invention relates to a test system, in particular to a full-link benchmarking system for distributed scheduling systems.

Background

[0002] With the rapid development of social productivity and of science and technology, especially Internet and multimedia technology, information explosion has become an inevitable trend. Data is growing exponentially, and the amount of data has reached the exabyte level. Massive data contains rich and valuable information, and mining this hidden value poses great challenges for data storage and computing. The scale effect of computing platforms is becoming more pronounced. Today's computing tasks are characterized by large scale and high concurrency. The traditional stand-alone mode can no longer meet these computing needs, and the emergence of distributed scheduling systems provides reliable support for the stable operation of larg...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): H04L12/26
CPC: H04L43/045, H04L43/0817, H04L43/0852, H04L43/0888, H04L43/50
Inventor: 胡春明, 邵凯阳, 朱建勇, 薛世卿
Owner BEIHANG UNIV