Selection method of HDFS load source and sink nodes based on multiple metrics
A method for selecting HDFS load source and sink nodes based on multiple metrics. It addresses problems such as cluster performance degradation and inaccurate selection of HDFS source and sink nodes, with the effects of more reasonable and accurate load migration, improved overall performance, and reduced resource consumption.
Examples
Embodiment 1
[0090] Embodiment 1: The main steps for quantifying the actual workload of a server with the analytic hierarchy process (AHP) are as follows:
[0091] (1) Build a server load hierarchy model (as shown in Figure 1);
[0092] (2) Construct a judgment matrix of the importance of each factor or index;
[0093]
[0094] The quantified matrices obtained from the 1-9 scale table are as follows:
[0095]
[0096] where A_1, A_2, and A_3 are the judgment matrices of the performance index, the time index, and the total load index in the hierarchical model, respectively.
[0097] (3) Calculate the relative weight of each factor or index, that is, the weight vector of the judgment matrix. The present invention uses the sum-product method to approximate the principal eigenvector (the eigenvector of the largest eigenvalue) of the judgment matrix. The result is as follows:
[0098] ω'_A = (0.75, 0.25)^T
[0099] (4) Check the consistency of the judgment matrix. After the check passes, the fina...
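The sum-product method in step (3) and the consistency check in step (4) can be sketched as follows. This is a minimal illustration, not the patent's implementation. The judgment matrices are not reproduced in this text, so the 2x2 matrix below is an assumed example that happens to yield the weight vector (0.75, 0.25). The function names are hypothetical, and the random index table holds the commonly cited Saaty values.

```python
# Sketch of AHP weight calculation (sum-product method) and consistency check.
# The example matrix is an assumption; the patent's own matrices are not shown here.

# Commonly cited Saaty random index (RI) values, keyed by matrix order n.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def sum_product_weights(matrix):
    """Approximate the principal eigenvector: normalize columns, then average rows."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return [sum(row) / n for row in normalized]


def consistency_ratio(matrix, weights):
    """Return CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n <= 2:
        return 0.0  # matrices of order 1 or 2 are always consistent
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Assumed judgment matrix: the first factor is 3 times as important as the second.
example = [[1.0, 3.0],
           [1.0 / 3.0, 1.0]]
weights = sum_product_weights(example)
print(weights)
print(consistency_ratio(example, weights))
```

A judgment matrix is usually accepted when the consistency ratio is below 0.1; otherwise the pairwise comparisons are revised and the weights recomputed.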
Embodiment 2
[0121] Figure 1 is a model diagram of the quantified server load hierarchy in the present invention.
[0122] In previous studies, the load of servers in a cluster is estimated from one indicator or a combination of several. The main indicators are as follows:
[0123] ●Storage usage
[0124] ●Disk I/O access rate
[0125] ●Service response time
[0126] ●CPU utilization
[0127] ●Memory usage
[0128] ●Number of tasks
[0129] ●Response delay time of network communication
[0130] ●Virtual memory usage
[0131] ●Cumulative processing time of currently active tasks
[0132] ●CPU temperature
[0133] ●Network bandwidth usage
[0134] ●Failure time
[0135] In the present invention, balancing the distributed cluster system means balancing data; that is, it concerns only file operations in HDFS, namely uploading and downloading files. It follows that in this scenario, for the cluster system the main pressure ...
Embodiment 3
[0139] Figure 2 is a flow chart of the naive Bayes-based load migration strategy in the present invention.
[0140] The main steps are as follows:
[0141] 1) The master node collects the load information of the node and saves it in a file.
[0142] 2) Use the naive Bayes (NB) algorithm to train a classifier on the historical load information of the nodes. The classifier has three classes: overloaded, balanced, and idle. Each class is described by 8 feature attributes, and the classification thresholds for these 8 attribute values are shown in Figure 4.
[0143] 3) After the classifier is generated, use it to compute the class of each node and write the results to a file, which the load balancer uses to select the source and sink nodes.
[0144] 4) The balancer starts and reads the classification result file.
[0145] 5) The balancer divides the nodes into three queues according to the classification results, and the queues are so...
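Steps 2) through 5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation. It uses a categorical naive Bayes classifier with Laplace smoothing, two discretized attributes instead of the 8 in the source, made-up training rows, and hypothetical function and node names. The thresholds of Figure 4 are not available here, so attribute values are assumed to be already mapped to the levels "high", "mid", and "low". The sorting of the queues is omitted because the source text is truncated at that point.

```python
import math
from collections import Counter, defaultdict

CLASSES = ("overload", "balance", "idle")


def train(samples, alpha=1.0):
    """Train on (features, label) pairs; features is a tuple of discrete levels."""
    class_counts = Counter(label for _, label in samples)
    value_counts = defaultdict(Counter)  # (label, attribute index) -> level counts
    levels = defaultdict(set)            # attribute index -> levels seen in training
    for features, label in samples:
        for i, level in enumerate(features):
            value_counts[(label, i)][level] += 1
            levels[i].add(level)
    return {"class_counts": class_counts, "value_counts": value_counts,
            "levels": levels, "alpha": alpha, "total": len(samples)}


def classify(model, features):
    """Return the class with the highest log posterior for one node."""
    best_label, best_score = None, -math.inf
    alpha = model["alpha"]
    for label, count in model["class_counts"].items():
        score = math.log(count / model["total"])  # log prior
        for i, level in enumerate(features):
            seen = model["value_counts"][(label, i)][level]
            k = len(model["levels"][i])
            # Laplace-smoothed conditional probability P(level | label).
            score += math.log((seen + alpha) / (count + alpha * k))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def split_queues(model, nodes):
    """Group node names into one queue per class (queue sorting not shown)."""
    queues = {label: [] for label in CLASSES}
    for name, features in nodes.items():
        queues[classify(model, features)].append(name)
    return queues


# Made-up history: (cpu_level, disk_io_level) -> class.
history = [
    (("high", "high"), "overload"), (("high", "mid"), "overload"),
    (("mid", "mid"), "balance"), (("mid", "low"), "balance"),
    (("low", "low"), "idle"), (("low", "mid"), "idle"),
]
model = train(history)
# Hypothetical node names with their current discretized load levels.
nodes = {"dn1": ("high", "high"), "dn2": ("mid", "mid"), "dn3": ("low", "low")}
print(split_queues(model, nodes))
```

In this sketch, nodes in the overload queue would be candidates for source nodes and nodes in the idle queue candidates for sink nodes, which matches the roles the classification file plays for the balancer in step 3).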