Adaptive distributed parallel training method for neural network based on reinforcement learning

A reinforcement-learning and neural-network technology in the field of model-parallel training schemes. It addresses the problems that existing methods cover only a single parallel dimension and cannot guarantee strategy performance beyond running time, achieving expanded offline learning capability, faster strategy search, and improved comprehensive performance.

Pending Publication Date: 2021-07-16
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

However, the parallel dimensions of the above methods are relatively limited and their tuning models are simple; these methods focus on optimizing the running time of the distributed execution strategy and cannot guarantee any aspect of the strategy's performance other than running time.

Method used




Embodiment Construction

[0024] The present invention will be further described below in conjunction with the accompanying drawings and specific implementation steps:

[0025] As shown in Figure 1, a neural network adaptive distributed parallel training method based on reinforcement learning includes the following steps:

[0026] Step 1: Build a multidimensional performance evaluation model R(π_g, π_s) to measure the comprehensive performance of a strategy. First, analyze the factors that affect the execution performance of the neural network, including the neural network model structure, computation properties, and cluster topology; secondly, extract performance factors including the computation cost E_i, communication cost C_i, and memory usage M_i, where E_i, C_i, and M_i are defined as follows:
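To make the role of these per-operator factors concrete, here is a minimal sketch of turning them into a scalar score usable as a reinforcement-learning reward. The weights and the linear aggregation are assumptions for illustration only; the patent's actual model R(π_g, π_s) is not disclosed in this excerpt.

```python
# Illustrative sketch (not the patent's exact formula): combine per-operator
# computation cost E_i, communication cost C_i, and memory usage M_i into a
# single strategy score. Negating the weighted total cost makes "higher is
# better", which suits a reward signal.

def strategy_score(costs, w_compute=1.0, w_comm=1.0, w_mem=1.0):
    """costs: iterable of (E_i, C_i, M_i) tuples, one per operator."""
    total = 0.0
    for e_i, c_i, m_i in costs:
        total += w_compute * e_i + w_comm * c_i + w_mem * m_i
    return -total

# Two candidate strategies for a 3-operator graph: plan_b trades a little
# extra computation for much less communication, so it scores higher.
plan_a = [(1.0, 4.0, 2.0), (1.0, 4.0, 2.0), (1.0, 4.0, 2.0)]
plan_b = [(1.5, 1.0, 2.0), (1.5, 1.0, 2.0), (1.5, 1.0, 2.0)]
assert strategy_score(plan_b) > strategy_score(plan_a)
```

A multidimensional model of this kind is what lets the search optimize comprehensive performance rather than running time alone.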

[0027] The calculation cost is expressed by dividing the precision of the tensor involved in the operation by the calculation densi...



Abstract

The invention discloses an adaptive distributed parallel training method for neural networks based on reinforcement learning, providing an optimal solution for the partitioning and scheduling of large-scale, complex neural networks. Firstly, the influence of the neural network model structure and computation attributes on execution performance is analyzed; on this basis, performance factors including computation cost, communication cost, and memory usage are extracted, and a multidimensional performance evaluation model that comprehensively reflects distributed training performance is constructed, improving the comprehensive performance of the parallel strategy. Secondly, a feed-forward network groups operators adaptively according to their attribute characteristics and determines the degree of parallelism, achieving end-to-end strategy search while reducing the search space. Finally, a proximal policy gradient based on importance sampling iteratively optimizes the reinforcement learning model to search for the optimal partitioning and scheduling strategy, expanding the policy network's offline learning capability and improving algorithm stability, convergence rate, and strategy search performance.
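The abstract's final step pairs importance sampling with a proximal policy gradient. A minimal sketch of the standard PPO clipped surrogate objective illustrates the mechanism; this is the generic textbook form, not necessarily the patent's exact update rule.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantage, eps=0.2):
    """Clipped surrogate objective of proximal policy optimization.

    The importance-sampling ratio r = pi_new(a|s) / pi_old(a|s) lets the
    policy be updated repeatedly from trajectories sampled under an older
    policy (the "offline learning" capability the abstract mentions), while
    clipping r to [1 - eps, 1 + eps] keeps each update close to the sampling
    policy, aiding stability and convergence.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps)
    # Take the pessimistic (lower) bound of the two surrogates.
    return min(ratio * advantage, clipped * advantage)

# When the new policy matches the old one, the objective is just the advantage.
assert abs(ppo_clip_objective(0.0, 0.0, 1.0) - 1.0) < 1e-9
# A large ratio with positive advantage is clipped to 1 + eps = 1.2.
assert abs(ppo_clip_objective(1.0, 0.0, 1.0) - 1.2) < 1e-9
```

In this setting the "action" would be a partitioning/scheduling decision and the advantage would derive from the multidimensional performance model described above.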

Description

Technical field

[0001] The invention relates to a neural network adaptive distributed parallel training method based on reinforcement learning, which provides an optimal model-parallel training scheme for large-scale, complex neural networks.

Background technique

[0002] In recent years, benefiting from the development of AI algorithms, hardware computing power, and data sets, deep neural network technology has been widely applied in natural language processing, computer vision, and search and recommendation. As these fields continue to iterate toward larger and more complex neural networks, "Moore's Law" can no longer match the computing demand, and a single device can no longer support large-scale deep network training. It has therefore become common practice to address large-scale neural network training by partitioning the neural network computation graph and scheduling the partitions onto clusters containing multiple CPUs and GPUs to achieve m...
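The background describes partitioning a computation graph and scheduling the partitions onto a cluster. As a toy illustration of that idea (not the patent's algorithm, which searches strategies via reinforcement learning), a greedy load-balancing placement might look like this:

```python
# Illustrative sketch: place each operator of a computation graph on the
# currently least-loaded device, balancing per-operator cost across devices.
# A real strategy must also account for communication and memory, which is
# exactly what motivates the multidimensional model and learned search.

def partition_graph(op_costs, num_devices):
    """op_costs: per-operator costs in topological order.
    Returns a placement list mapping operator index -> device id."""
    loads = [0.0] * num_devices
    placement = []
    for cost in op_costs:
        dev = loads.index(min(loads))  # pick the least-loaded device
        placement.append(dev)
        loads[dev] += cost
    return placement

print(partition_graph([4.0, 3.0, 2.0, 2.0], 2))  # → [0, 1, 1, 0]
```

Greedy placement ignores communication between adjacent operators, which is one reason learned, cost-model-driven strategies outperform simple heuristics on large graphs.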

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N20/00
CPC: G06N20/00
Inventor: 吴吉央, 曾艳, 张纪林, 袁俊峰, 任永坚, 周丽
Owner: HANGZHOU DIANZI UNIV