Distributed training method and device of model, equipment and storage medium

A distributed model training method, applied in the fields of deep learning, cloud computing, and artificial intelligence. It addresses the limited solution range of dynamic programming methods, where changing the modeling approach or adding model constraints makes the optimal value hard to obtain, and it aims to improve training efficiency, shorten training time, and reduce training cost.

Active Publication Date: 2022-03-15
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

However, dynamic programming covers only a limited range of problems: once the modeling approach changes or model constraints are added, the optimal value becomes difficult to obtain.

Examples

Embodiment approach

[0193] In one embodiment, the method further includes:

[0194] Matching each segmentation result to a target computing unit selected from the multiple computing units included in the computing resources, the target computing unit being used to perform distributed training on the model to be trained.

[0195] The matching between each segmentation result and its target computing unit can be determined from the hardware topology of the computing resources in which the multiple computing units are located. The hardware topology can be obtained by analyzing the computing resources allocated to the model to be trained.

[0196] The hardware topology of the computing resources may include their connection relationships, bandwidth information, task processing capability, and the like. Exemplarily, in the case where the...
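
The matching step in [0194] to [0196] can be illustrated with a minimal Python sketch. It is not the patent's algorithm: the greedy rule, the `ComputeUnit` and `Segment` types, and their fields (`throughput` standing in for task processing capability, `memory_gb`, `work`) are all assumptions made for illustration, and the sketch ignores connection relationships and bandwidth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeUnit:
    name: str
    throughput: float  # hypothetical stand-in for "task processing capability"
    memory_gb: float


@dataclass(frozen=True)
class Segment:
    name: str
    work: float        # hypothetical compute cost of this segmentation result
    memory_gb: float   # hypothetical memory footprint


def match_segments(segments, units):
    """Greedy matching: heaviest segment first, onto the fastest free unit that fits."""
    free = sorted(units, key=lambda u: u.throughput, reverse=True)
    assignment = {}
    for seg in sorted(segments, key=lambda s: s.work, reverse=True):
        target = next((u for u in free if u.memory_gb >= seg.memory_gb), None)
        if target is None:
            raise ValueError(f"no free unit can hold segment {seg.name}")
        assignment[seg.name] = target.name
        free.remove(target)  # one segment per unit in this sketch
    return assignment


units = [
    ComputeUnit("gpu0", throughput=10.0, memory_gb=16.0),
    ComputeUnit("gpu1", throughput=20.0, memory_gb=32.0),
    ComputeUnit("gpu2", throughput=5.0, memory_gb=16.0),
]
segments = [
    Segment("stage_a", work=4.0, memory_gb=12.0),
    Segment("stage_b", work=9.0, memory_gb=24.0),
]
assignment = match_segments(segments, units)
```

Here the heavier `stage_b` lands on the fastest unit with enough memory and `stage_a` takes the next one. A topology-aware matcher would presumably also weigh the bandwidth between units that host adjacent segments.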

Abstract

The invention provides a distributed training method and apparatus for a model, a device, and a storage medium, and relates to the technical field of artificial intelligence, in particular to the fields of deep learning and cloud computing. In the specific implementation, an initial segmentation strategy is generated based on a model to be trained. First attribute description information of the model under the initial segmentation strategy is then determined; this information represents at least one of the storage space occupied by the model and its running time under the initial segmentation strategy. Based on the first attribute description information, the initial segmentation strategy is optimized to obtain a target segmentation strategy that meets a preset condition. The model is then segmented according to the target segmentation strategy, and the resulting segmentation result is used for distributed training of the model. For distributed training scenarios, the disclosed technology shortens training time, improves training efficiency, and reduces training cost.

Description

Technical field

[0001] The present disclosure relates to the field of artificial intelligence technology, in particular to the fields of deep learning and cloud computing, and specifically to a distributed training method, apparatus, device, and storage medium for models.

Background

[0002] At present, existing methods for finding the optimal device segmentation on heterogeneous devices mainly rely on dynamic programming, which generally decomposes the problem into sub-problems and solves them. However, dynamic programming covers only a limited range of problems: once the modeling approach changes or model constraints are added, the optimal value becomes difficult to obtain.

Summary of the invention

[0003] The present disclosure provides a distributed training method, apparatus, device, and storage medium for a model.

[0004] According to an aspect of the present disclosure, a distributed training method o...
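
For context on the dynamic-programming approach mentioned in the background ([0002]), the following Python sketch shows one common formulation. The source does not give the formulation used by existing methods, so the objective here (splitting a chain of layers into contiguous stages so that the slowest stage is as fast as possible) and the function name `dp_partition` are assumptions.

```python
from functools import lru_cache
from itertools import accumulate


def dp_partition(times, num_stages):
    """Smallest achievable slowest-stage time for a contiguous split."""
    prefix = [0, *accumulate(times)]  # prefix[i] = total time of the first i layers

    @lru_cache(maxsize=None)
    def best(end, k):
        # Sub-problem: first `end` layers split into `k` non-empty stages.
        if k == 1:
            return prefix[end]
        return min(
            max(best(mid, k - 1), prefix[end] - prefix[mid])
            for mid in range(k - 1, end)
        )

    return best(len(times), num_stages)


bottleneck = dp_partition([1, 1, 1, 1, 4, 4], 3)
```

The recurrence depends on each sub-problem being summarized by a single number. Adding a constraint that couples stages, such as a shared memory budget, would require changing the state of the recurrence, which is one way to read the limitation described in [0002].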

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/04, G06N3/082, G06F18/214, Y02T10/40
Inventor: 翁珺, 曹州, 敖玉龙, 吴志华, 于佃海, 马艳军
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD