Deep learning calculation method and device, chip and medium
A deep learning computing method, applied in the computer field, addresses the problems that deep learning computing on the chip is not supported and that the chip's computing performance cannot be fully utilized, thereby improving processing efficiency.
Examples
Embodiment 1
[0028] Figure 1 is a flow chart of a deep learning calculation method provided by Embodiment 1 of the present invention. This embodiment is applicable to implementing distributed computing on a chip. The method can be executed by the deep learning computing device provided by the embodiment of the present invention. The device can be implemented in software and/or hardware, and can generally be integrated in a chip.
[0029] As shown in Figure 1, the deep learning calculation method provided in this embodiment is applied to a chip and includes:
[0030] S110. Obtain an initial calculation graph.
[0031] The core of a machine learning task is the definition of the model and the method for solving the model's parameters. Abstracting these two determines a unique calculation logic, and this logic is represented as a graph, called a calculation graph. Calculation graphs are directed acyclic graphs, which define the flow of...
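To make the notion of a calculation graph as a directed acyclic graph concrete, the following is a minimal sketch in Python. The toy graph, node names, and the `evaluate` helper are illustrative assumptions and do not come from the patent text; the sketch only shows that a DAG can be evaluated by visiting nodes in topological order.

```python
from graphlib import TopologicalSorter

# Hypothetical toy graph for y = (x * w) + b. Node names and operations are
# illustrative assumptions, not taken from the patent.
# Each entry maps a node name to (operation, list of input node names).
graph = {
    "x": ("input", []),
    "w": ("variable", []),
    "b": ("variable", []),
    "mul": ("mul", ["x", "w"]),
    "add": ("add", ["mul", "b"]),
}


def evaluate(graph, feeds):
    """Evaluate every node of a DAG in topological order."""
    # TopologicalSorter takes a mapping of node -> predecessors.
    sorter = TopologicalSorter({n: set(ins) for n, (_, ins) in graph.items()})
    values = dict(feeds)
    for node in sorter.static_order():
        op, inputs = graph[node]
        if op in ("input", "variable"):
            continue  # value is supplied through feeds
        args = [values[i] for i in inputs]
        values[node] = args[0] * args[1] if op == "mul" else args[0] + args[1]
    return values


result = evaluate(graph, {"x": 3.0, "w": 2.0, "b": 1.0})
print(result["add"])  # 3.0 * 2.0 + 1.0
```

Because the graph is acyclic, every node's inputs are available by the time the node is visited, which is what allows a single pass over the topological order.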
Embodiment 2
[0049] Figure 2 is a flow chart of a deep learning calculation method provided by Embodiment 2 of the present invention. This embodiment elaborates on the foregoing embodiments, wherein generating a reconstructed calculation graph based on the initial calculation graph may specifically comprise:
[0050] Determining data input nodes, calculation node groups, and trainable variable nodes in the initial calculation graph;
[0051] Adjusting the input subgraph structure corresponding to the data input nodes and the variable subgraph structure corresponding to the trainable variable nodes, copying the calculation subgraph structure corresponding to the calculation node group, and adding a calculation result summary subgraph structure, to obtain the reconstructed calculation graph;
[0052] Wherein the execution devices corresponding to the calculation subgraph structure in the initial calculation graph and to the copied calculation subgraph structure are different computation ...
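The reconstruction steps in [0050] to [0052] can be sketched roughly as follows. This is a simplified illustration under several assumptions that are not in the source: the `Node` class, the `reconstruct` function, the `@index` naming of copies, and the device names `dev0` and `dev1` are all hypothetical; input and variable nodes are simply shared rather than adjusted; and the last node of the calculation node group is assumed to be its output.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # "input", "variable", "compute", or "summary"
    inputs: list = field(default_factory=list)
    device: str = ""


def reconstruct(nodes, devices):
    """Copy the compute node group once per device and add a summary node."""
    compute = [n for n in nodes if n.kind == "compute"]
    compute_names = {n.name for n in compute}
    # Assumption: input and variable nodes are kept once and shared by all copies.
    rebuilt = [Node(n.name, n.kind, list(n.inputs)) for n in nodes if n.kind != "compute"]
    outputs = []
    for idx, dev in enumerate(devices):
        suffix = f"@{idx}"
        for n in compute:
            # Edges inside the compute group are redirected to the same copy.
            ins = [i + suffix if i in compute_names else i for i in n.inputs]
            rebuilt.append(Node(n.name + suffix, "compute", ins, dev))
        # Assumption: the last compute node is the output of the group.
        outputs.append(compute[-1].name + suffix)
    # The summary node collects the result of every copy.
    rebuilt.append(Node("summary", "summary", outputs))
    return rebuilt


initial = [
    Node("data", "input"),
    Node("weights", "variable"),
    Node("matmul", "compute", ["data", "weights"]),
    Node("loss", "compute", ["matmul"]),
]
rebuilt = reconstruct(initial, ["dev0", "dev1"])
for n in rebuilt:
    print(n.name, n.kind, n.inputs, n.device)
```

Each copy of the calculation subgraph is tagged with a different execution device, and the added summary node takes one input per copy, which mirrors the structure described in the steps above without claiming to reproduce the patented procedure.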
Embodiment 3
[0096] Figure 5 is a flow chart of a deep learning calculation method provided by Embodiment 3 of the present invention. This embodiment provides a specific implementation on the basis of the foregoing embodiments.
[0097] As shown in Figure 5, the deep learning calculation method provided in this embodiment is applied to a chip and includes:
[0098] S310. Obtain an initial calculation graph.
[0099] S320. Determine whether the initial calculation graph and the chip hardware structure meet the on-chip distributed computing condition; if yes, execute S330; if not, execute S360.
[0100] After the chip obtains the initial calculation graph, if it determines that the chip driver is in the on-chip distributed multi-computing-cluster mode, that the deep learning calculation type of the initial calculation graph is training or inference, and that the computing node deployment belongs to heterogeneous computing, then it determines that the initial calculation graph an...
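The branch in S320 can be illustrated with a small predicate. This is only a sketch: the sentence in [0100] is truncated in the source, so treating the three listed checks as jointly sufficient is an assumption, and the names `GraphInfo`, `meets_condition`, `next_step`, and the mode string `"multi_cluster"` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GraphInfo:
    calc_type: str       # e.g. "training" or "inference"
    heterogeneous: bool  # whether node deployment is heterogeneous computing


def meets_condition(driver_mode, info):
    """Return True when all three checks listed in [0100] hold (assumed sufficient)."""
    return (
        driver_mode == "multi_cluster"  # hypothetical label for the driver mode
        and info.calc_type in ("training", "inference")
        and info.heterogeneous
    )


def next_step(driver_mode, info):
    """Pick the next step of the flow: S330 if the condition is met, else S360."""
    return "S330" if meets_condition(driver_mode, info) else "S360"


print(next_step("multi_cluster", GraphInfo("training", True)))
print(next_step("single_cluster", GraphInfo("inference", True)))
```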