Spark platform dynamic resource allocation method for flow analysis

A dynamic resource allocation technology for traffic analysis, applied in the Internet field. It addresses problems such as freezes, process failures, and wasted resources, and it reduces server burden, memory use, and the number of processing threads while using cluster resources efficiently.

Active Publication Date: 2021-04-06
JIANGSU FUTURE NETWORKS INNOVATION

AI Technical Summary

Problems solved by technology

[0005] In addition, instantaneous surges of traffic data and the large differences in traffic content produced by a large number of users pose serious risks and challenges to the stable operation of traffic processing tasks. On the other hand, pre-collection analysis of different business traffic in different scenarios shows that changes in traffic data are not entirely unpredictable. For example, once traffic is divided along time dimensions such as day and night, or working days and rest days, the data volume and type complexity of network traffic stay within a certain range. As noted above, Spark's own scheduler does not dynamically adjust resources for Spark tasks that have already been allocated resources. To guarantee normal processing, Spark tasks are therefore configured with a large amount of memory and a large number of cores. When the amount of data actually processed is small, this over-allocation wastes resources and may cause other processes on the server to freeze or fail to run normally.
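The observation that traffic stays within a range once it is split by time dimension can be illustrated with a short Python sketch. The bucket boundaries (08:00 to 20:00 as "day", Saturday and Sunday as rest days), the function names, and the sample volumes are all assumptions made for illustration; the patent text does not specify them.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

Bucket = Tuple[str, str]


def time_bucket(ts: datetime) -> Bucket:
    """Map a timestamp to (day type, period). Boundaries are assumed values."""
    day_type = "rest_day" if ts.weekday() >= 5 else "working_day"
    period = "day" if 8 <= ts.hour < 20 else "night"
    return day_type, period


def volume_ranges(
    samples: Iterable[Tuple[datetime, float]]
) -> Dict[Bucket, Tuple[float, float]]:
    """Return the (min, max) observed traffic volume for each time bucket."""
    buckets: Dict[Bucket, List[float]] = defaultdict(list)
    for ts, volume in samples:
        buckets[time_bucket(ts)].append(volume)
    return {bucket: (min(v), max(v)) for bucket, v in buckets.items()}


# Hypothetical samples: (collection time, traffic volume in arbitrary units).
samples = [
    (datetime(2021, 4, 5, 10, 0), 120.0),  # Monday, daytime
    (datetime(2021, 4, 5, 14, 0), 150.0),  # Monday, daytime
    (datetime(2021, 4, 5, 23, 0), 40.0),   # Monday, night
    (datetime(2021, 4, 10, 11, 0), 80.0),  # Saturday, daytime
]
ranges = volume_ranges(samples)
```

A per-bucket range of this kind is one plausible input for choosing a resource configuration for the upcoming time window.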


Examples

Embodiment 1

[0035] The specific implementation is as follows:

[0036] Deployment of the initial resource configuration module:

[0037] With the existing Spark platform successfully deployed, the first module of the present invention is deployed. First, configure the server environment: CentOS 7.5, JDK 1.8, Python 3, Crontab scheduled scripts, and a Hive database. The Crontab scheduled scripts are mainly used to trigger monitoring and data collection. The collected historical data is processed by Python tasks and persisted to HDFS. The data analysis module retrieves the processed data from HDFS and uses a collaborative filtering algorithm to find the historical record most similar to the latest configuration, which yields a more appropriate resource configuration.
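The lookup of the most similar historical record can be sketched in Python as follows. This is a minimal illustration, not the patented procedure: the excerpt does not name the similarity measure, so cosine similarity is assumed here, and the feature vector (normalized CPU use, memory use, task complexity), the record layout, and the configuration keys are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_config(current: Sequence[float], history: List[Dict]) -> Dict:
    """Return the config of the historical record most similar to `current`."""
    best = max(history, key=lambda rec: cosine_similarity(current, rec["features"]))
    return best["config"]


# Hypothetical history: features are (cpu_use, memory_use, task_complexity),
# each normalized to [0, 1]; configs are illustrative executor settings.
history = [
    {"features": (0.9, 0.8, 0.7),
     "config": {"executor_memory_gb": 8, "executor_cores": 4}},
    {"features": (0.2, 0.3, 0.1),
     "config": {"executor_memory_gb": 2, "executor_cores": 1}},
]
current = (0.85, 0.75, 0.8)
recommended = recommend_config(current, history)
```

In a deployment like the one described, `history` would be loaded from the records persisted to HDFS instead of being defined inline.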

[0038] Adaptive Resource Scheduler module deployment:

[0039] When the Spark application starts normally, the Crontab scheduled task will monitor the traffic size, the complex...


Abstract

A Spark platform dynamic resource allocation method for flow analysis, characterized by comprising the following steps: 1, a Spark resource scheduler recommends a suitable combined configuration of memory and number of cores according to basic server performance indexes such as CPU and memory, in combination with the complexity of the Spark task process; and 2, a resource scheduler is implemented that performs adaptive resource allocation by analyzing the CPU, memory, and load occupied by the application program together with the network flow data characteristics, in combination with the ARSA algorithm. With this automatic scheduling method for Spark resources for flow analysis, a suitable initial memory and core-count configuration is derived by analyzing the performance indexes of the cluster servers, and the configuration is then adjusted adaptively according to the complexity, size, and other properties of the actual flow, so that cluster resources are fully utilized and the flow processing task runs stably.
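The second step, adaptive adjustment, can be pictured with a simple threshold rule in Python. This is explicitly not the ARSA algorithm, whose details do not appear in this excerpt; the thresholds, the doubling and halving policy, the bounds, and all names below are assumptions chosen only to show the shape of such a scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    memory_gb: int
    cores: int


def adjust(
    alloc: Allocation,
    utilization: float,
    low: float = 0.3,    # assumed: below this, shrink the allocation
    high: float = 0.8,   # assumed: above this, grow the allocation
    min_alloc: Allocation = Allocation(1, 1),
    max_alloc: Allocation = Allocation(16, 8),
) -> Allocation:
    """Return a new allocation for the observed utilization in [0, 1]."""
    if utilization > high:
        # Heavy load: double memory and add one core, capped at the maximum.
        return Allocation(
            min(alloc.memory_gb * 2, max_alloc.memory_gb),
            min(alloc.cores + 1, max_alloc.cores),
        )
    if utilization < low:
        # Light load: halve memory and drop one core, floored at the minimum.
        return Allocation(
            max(alloc.memory_gb // 2, min_alloc.memory_gb),
            max(alloc.cores - 1, min_alloc.cores),
        )
    return alloc  # Utilization is within the target band; keep the allocation.


grown = adjust(Allocation(4, 2), 0.9)
shrunk = adjust(Allocation(4, 2), 0.1)
```

The bounds keep the rule from starving the task or from crowding out other processes on the server, which is the failure mode the invention sets out to avoid.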

Description

Technical field

[0001] The invention relates to the technical field of the Internet, and in particular to a dynamic resource allocation method on a Spark platform for traffic analysis.

Background

[0002] With the rapid development of Internet technology and the rapid growth of data, processing massive Internet data has become a technical problem. Traffic analysis is a prerequisite for network operation and security management tasks such as bandwidth management, traffic visibility, attack tracing, virus defense, and intrusion detection. Because traffic data is large-scale, diverse, and unstable, traffic analysis has become a focus of the industry. With the emergence and development of distributed computing, parallel processing across multiple server nodes has opened a new path for handling massive data, and distributed processing frameworks such as Hadoop, Spark, and Flink have emerged in response.

[0003] Apac...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F9/48, G06F9/50, G06F16/182
CPC: G06F9/4881, G06F9/5016, G06F9/5027, G06F9/5061, G06F16/182
Inventor: 张广兴, 何旭, 梁帅
Owner: JIANGSU FUTURE NETWORKS INNOVATION