
A Method for Reducing Energy Consumption in Large-Scale Distributed Machine Learning Systems

A machine learning and distributed-computing technology, applied in instrumentation, resource allocation, energy-saving computing, etc. It addresses the problems of wasted server power, degraded overall performance, and long iteration times, with the effects of reducing system energy consumption, improving worker utilization, and shortening execution time.

Active Publication Date: 2020-07-24
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

However, in a heterogeneous WAN environment, especially over links between geographically distant nodes, limited bandwidth congests the large volume of parameter updates, making each iteration take too long and significantly degrading overall performance. At the same time, high latency leaves working machines that depend on those parameter updates idle, wasting server power and increasing energy consumption.




Detailed Description of the Embodiments

[0020] The method proposed by the present invention for reducing the energy consumption of large-scale distributed machine learning comprises the following steps:

[0021] Step 1: The scheduler collects real-time CPU, GPU, memory, and disk I/O information from each working machine and sends it to the state memory.
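A minimal sketch of this collection step, assuming psutil for host metrics; the state memory is modeled as a plain list, and read_gpu_utilization is a hypothetical placeholder, since the patent names no specific GPU API:

```python
import time
import psutil

def read_gpu_utilization():
    # Placeholder: on NVIDIA hardware this could come from NVML
    # (pynvml.nvmlDeviceGetUtilizationRates); returned as percent.
    return 0.0

def collect_worker_metrics(worker_id):
    """Gather one real-time CPU, GPU, memory, and disk I/O sample."""
    disk = psutil.disk_io_counters()
    return {
        "worker_id": worker_id,
        "timestamp": time.time(),
        "cpu_percent": psutil.cpu_percent(interval=0.1),
        "gpu_percent": read_gpu_utilization(),
        "mem_percent": psutil.virtual_memory().percent,
        "disk_read_bytes": disk.read_bytes,
        "disk_write_bytes": disk.write_bytes,
    }

# The scheduler would periodically push these samples to the state memory:
state_memory = []   # stand-in for the patent's state-memory component
state_memory.append(collect_worker_metrics(worker_id=0))
```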

[0022] Step 2: The state memory uses the received real-time processor, memory, and disk I/O information to calculate the load status of each working machine (CPU usage, GPU usage, memory usage, and disk I/O usage).
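A sketch of how the state memory might derive these usage figures from two consecutive raw samples; the disk throughput ceiling used for normalization is an assumed calibration constant, not a value from the patent:

```python
DISK_MAX_BYTES_PER_SEC = 500e6  # assumed device ceiling for normalization

def load_status(prev, curr):
    """Derive CPU/GPU/memory/disk-I/O utilization from two samples."""
    dt = max(curr["timestamp"] - prev["timestamp"], 1e-6)
    io_bytes = (curr["disk_read_bytes"] - prev["disk_read_bytes"]
                + curr["disk_write_bytes"] - prev["disk_write_bytes"])
    return {
        "worker_id": curr["worker_id"],
        "cpu": curr["cpu_percent"],
        "gpu": curr["gpu_percent"],
        "mem": curr["mem_percent"],
        "disk_io": min(100.0, 100.0 * io_bytes / dt / DISK_MAX_BYTES_PER_SEC),
    }
```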

[0023] Step 3: The scheduling policy manager reads the load information from the state memory. The load status of different working machines at the same moment is used to identify the load type of machine learning tasks (compute-intensive, I/O-intensive, GPU-accelerated, or hybrid), and each machine's load curve over time is used to predict its future load.
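A hedged illustration of both halves of this step: a simple threshold rule for the four load types named above, and a moving-average forecast over the recent load curve. The 70% threshold and window size are illustrative assumptions, not parameters from the patent:

```python
def classify_load(status, threshold=70.0):
    """Map a load status onto the four load types named in Step 3."""
    hot = {k for k in ("cpu", "gpu", "disk_io") if status[k] >= threshold}
    if hot == {"cpu"}:
        return "compute-intensive"
    if hot == {"disk_io"}:
        return "I/O-intensive"
    if "gpu" in hot and "disk_io" not in hot:
        return "GPU-accelerated"
    return "hybrid"          # mixed pressure, or nothing dominant

def predict_future_load(history, window=5):
    """Moving-average forecast of a worker's next load sample."""
    recent = history[-window:]
    return sum(recent) / len(recent)
```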

[0024] Step 4: When a machine learning task arrives, first use the ...
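Consistent with the abstract's goal of raising worker utilization and avoiding idle machines, one plausible placement rule, offered purely as an assumption rather than the patent's stated policy, is to match the task's predicted load type against each worker's predicted headroom:

```python
def place_task(task_type, workers):
    """Place a task on the worker with the most headroom on its bottleneck."""
    bottleneck = {"compute-intensive": "cpu",
                  "I/O-intensive": "disk_io",
                  "GPU-accelerated": "gpu",
                  "hybrid": "cpu"}[task_type]   # assumed mapping
    # workers: list of (worker_id, predicted_load_status) pairs
    return min(workers, key=lambda w: w[1][bottleneck])[0]
```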



Abstract

The invention discloses a method for reducing the energy consumption of a large-scale distributed machine learning system. A classifier identifies and classifies the system's load and predicts its state, and the energy consumption of the whole system is reduced by cutting communication between distributed parameter servers and accelerating task execution. The method comprises two parts: a machine learning load prediction and type identification method, and a "lazy synchronization" mechanism for distributed machine learning node parameters. By transmitting only significant updates to remote data centers, the parameter synchronization mechanism reduces wide-area-network communication, effectively shortening the system's waiting time and accelerating machine learning convergence. Predicting the machine learning load and identifying its type helps to raise the utilization rate of the working machines and prevents large numbers of them from sitting idle after being turned on. Together, these techniques shorten the execution time of machine learning tasks, improve working machine utilization, and greatly reduce system energy consumption.
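A minimal sketch of the "lazy synchronization" idea described above: a node accumulates its parameter updates locally and sends them over the WAN only once their relative magnitude crosses a significance threshold. The threshold value, NumPy representation, and class interface are illustrative assumptions:

```python
import numpy as np

class LazySync:
    """Accumulate parameter updates; release them only when significant."""

    def __init__(self, shape, significance=0.01):
        self.pending = np.zeros(shape)     # updates withheld from the WAN
        self.significance = significance   # assumed relative threshold

    def push(self, delta, params):
        """Add an update; return a batch for WAN transfer, or None to defer."""
        self.pending += delta
        ratio = np.abs(self.pending) / (np.abs(params) + 1e-12)
        if ratio.max() >= self.significance:
            out = self.pending
            self.pending = np.zeros_like(out)
            return out       # ship the accumulated update to the remote DC
        return None          # insignificant so far: keep it local
```

Deferring insignificant updates trades a small amount of parameter staleness for far fewer WAN transfers, which is where the claimed iteration-time and energy savings come from.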

Description

Technical Field

[0001] The invention relates to a method for reducing the energy consumption of a large-scale computer system, and in particular to a method for reducing energy consumption by rationally optimizing inter-machine communication and load scheduling in a large-scale distributed machine learning system.

Background

[0002] With advances in computing, communication, and sensor technology and the spread of smart terminal devices, ever more data is generated in human production and life, and the rate of data growth keeps accelerating. This rapidly generated raw data is generally large in scale but low in value density. The common approach to big data processing today is to introduce machine learning into the analysis pipeline, building a system model through linear regression, deep neural networks, and other methods, and mining the latent patterns in the data through iterative training ...


Application Information

Patent Type & Authority: Patent (China)
IPC (8): G06F1/329; G06F1/3287; G06F1/3206; G06F9/50
CPC: G06F1/3206; G06F1/3287; G06F1/329; G06F9/505; G06F2209/508; Y02D10/00
Inventors: 蒋从锋, 王济伟, 丁佳明, 俞俊, 赵乃良, 樊甜甜, 仇烨亮, 万健, 张纪林, 殷昱煜, 任祖杰
Owner: HANGZHOU DIANZI UNIV