Automatic vector optimization method for width inconsistency of deep learning framework compiler

This optimization method, applied in the field of deep learning, addresses the loss of calculation graph information, the failure to recognize optimizable code segments, and the inability to make full use of the performance of domestic many-core processors. It improves inference performance and increases the degree of vectorization.

Active Publication Date: 2021-03-19
JIANGNAN INST OF COMPUTING TECH

AI Technical Summary

Problems solved by technology

When deep learning workloads are deployed, the framework compiler loses part of the calculation graph information while generating code in languages such as C++ and LLVM IR. As a result, many code segments with optimization potential cannot be recognized by the basic compiler, and the deep learning workload cannot make full use of the performance of domestic many-core processors.



Examples


Embodiment

[0032] Embodiment: a non-uniform-width automatic vector optimization method for a deep learning framework compiler, based on a heterogeneous platform, comprising the following steps:

[0033] S1. The front end of the framework compiler identifies the subgraphs in the calculation graph that can be vector-optimized, as follows:

[0034] S11. Taking the deep learning workload generated by the AI framework as input, the framework compiler identifies the model format of the workload according to the type of AI framework and converts the workload into a unified calculation graph;

[0035] S12. The framework compiler traverses the entire calculation graph obtained in step S11 and identifies the subgraphs that can be vector-optimized. The specific method is as follows:

[0036] S13. The framework compiler analyzes the data dependencies in the calculation graph obtained in S11, constructs a data dependency graph of the calculati...
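As a rough illustration of steps S12 and S13, the Python sketch below builds a data dependency map from a toy calculation graph and groups connected, vectorizable operators into candidate subgraphs. The source text is truncated before it states the actual identification criteria, so everything here is an assumption made for illustration: the graph encoding, the `VECTORIZABLE_OPS` set, and the function names are hypothetical and are not the patent's method.

```python
from collections import defaultdict

# Hypothetical set of operator types treated as vectorizable (an assumption;
# the source does not list the operators the compiler actually accepts).
VECTORIZABLE_OPS = {"add", "sub", "mul", "relu"}

# Toy calculation graph: node name -> (operator type, list of input node names).
EXAMPLE_GRAPH = {
    "x": ("input", []),
    "w": ("input", []),
    "conv": ("conv2d", ["x", "w"]),
    "bias": ("add", ["conv", "w"]),
    "act": ("relu", ["bias"]),
    "pool": ("max_pool", ["act"]),
    "scale": ("mul", ["pool", "w"]),
}

def build_dependency_graph(graph):
    """Map each node to the set of nodes that consume its output."""
    consumers = defaultdict(set)
    for name, (_, inputs) in graph.items():
        for src in inputs:
            consumers[src].add(name)
    return consumers

def find_vectorizable_subgraphs(graph):
    """Group vectorizable nodes that are connected by data dependencies."""
    consumers = build_dependency_graph(graph)
    candidates = {n for n, (op, _) in graph.items() if op in VECTORIZABLE_OPS}
    seen, subgraphs = set(), []
    for start in sorted(candidates):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen or node not in candidates:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node][1])   # producers of this node's inputs
            stack.extend(consumers[node])  # consumers of this node's output
        subgraphs.append(sorted(group))
    return subgraphs

print(find_vectorizable_subgraphs(EXAMPLE_GRAPH))
```

In this toy graph, `bias` and `act` form one candidate subgraph, while `scale` stands alone because the non-vectorizable `max_pool` separates it from the others.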


Abstract

The invention discloses a non-uniform-width automatic vector optimization method for a deep learning framework compiler. The method is based on a heterogeneous platform and comprises the following steps. S1: the front end of the framework compiler identifies the subgraphs in the calculation graph that can be vector-optimized. S2: the middle end of the framework compiler fuses the operators in the vector-optimizable subgraphs marked in step S15. S3: the back end of the framework compiler performs vector optimization with inconsistent widths on the low-level IR obtained in step S2, according to the vector widths of the control core and the computing cores of the heterogeneous many-core processor. S4: the code generation module of the framework compiler converts the vector-optimized low-level IR obtained in step S32 into the high-level language code specified by the user, and a basic compiler then generates vector-optimized target code for the platform. The method further exploits the instruction-level parallelism of the deep learning workload and raises its degree of vectorization, which improves the inference performance of the workload on the heterogeneous many-core platform.
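To make the idea of width-dependent vectorization in step S3 concrete, the sketch below plans how a loop over `n_elements` values would split into full vector iterations plus a scalar tail for each core type. The register widths in `VECTOR_BITS` are placeholder values chosen for illustration; the source does not state the actual vector widths of the control core or the computing cores, and `plan_vector_loop` is a hypothetical helper, not part of the described compiler.

```python
# Hypothetical vector register widths in bits (assumed values; the source
# gives no concrete widths for either core type).
VECTOR_BITS = {"control_core": 256, "compute_core": 512}

def plan_vector_loop(n_elements, core, elem_bits=32):
    """Split a loop into full vector iterations and a scalar remainder."""
    lanes = VECTOR_BITS[core] // elem_bits   # elements processed per vector op
    vector_iters, scalar_tail = divmod(n_elements, lanes)
    return {"lanes": lanes, "vector_iters": vector_iters, "scalar_tail": scalar_tail}

for core in ("control_core", "compute_core"):
    print(core, plan_vector_loop(100, core))
```

Under these assumed widths, the same 100-element float32 loop needs 12 vector iterations on the narrower core and 6 on the wider one, with a 4-element scalar tail in both cases, which is why a single fixed width cannot serve both core types well.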

Description

Technical field

[0001] The invention relates to a non-uniform-width automatic vector optimization method for a deep learning framework compiler and belongs to the technical field of deep learning.

Background

[0002] Deep learning workloads need to be deployed on specific hardware to be fully effective. Developers in the field of deep learning have designed a variety of frameworks, such as TensorFlow and Caffe, to carry out the training and inference of deep learning models. At the same time, hardware manufacturers have launched a variety of hardware backends, such as GPUs and FPGAs, to accelerate model training and inference. A bridge is needed between the large number of deep learning frameworks and the growing number of hardware architectures. As a complete optimization toolchain, the deep learning framework compiler provides an end-to-end solution for deploying deep learning workloads of differ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F8/30, G06F8/41, G06N20/00
CPC: G06F8/443, G06F8/37, G06F8/447, G06N20/00, Y02D10/00
Inventor: 沈莉, 周文浩, 王飞, 武文浩, 肖谦
Owner: JIANGNAN INST OF COMPUTING TECH