Automatic vector optimization method for width inconsistency of deep learning framework compiler

A deep learning optimization method, applied in the field of deep learning, that addresses the loss of calculation graph information, the failure to recognize optimizable code segments, and the inability to fully exploit the performance of domestic many-core processors. It increases the degree of vectorization and thereby improves inference performance.

Active Publication Date: 2021-03-19

JIANGNAN INST OF COMPUTING TECH



Problems solved by technology

During the deployment of deep learning workloads, the framework compiler loses part of the calculation graph information when it generates high-level language code such as C++ or LLVM IR. As a result, many code segments with optimization potential are not recognized by the basic compiler, and deep learning workloads cannot exploit the full performance of domestic many-core processors.


Embodiment

[0032] Embodiment: an automatic vector optimization method for non-uniform vector widths in a deep learning framework compiler, based on a heterogeneous platform, comprising the following steps:

[0033] S1. The front end of the framework compiler identifies the subgraphs of the calculation graph that can be vector-optimized, as follows:

[0034] S11. Taking the deep learning workload generated by the AI framework as input, the framework compiler identifies the model format of the workload according to the type of AI framework and converts the workload into a unified calculation graph;

[0035] S12. The framework compiler traverses the entire calculation graph obtained in step S11 and identifies the subgraphs that can be vector-optimized. The specific method is as follows:

[0036] S13. The framework compiler analyzes the data dependencies in the calculation graph obtained in S11, constructs a data dependency graph of the calculati...
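The front-end steps above (traverse the calculation graph, build a data dependency graph, mark vector-optimizable subgraphs) can be sketched roughly as follows. This is a minimal illustration, not the patented procedure: the text of S13 is truncated in the source, so the graph encoding, the `ELEMENTWISE_OPS` set, and the rule "group element-wise operators that are connected by data dependencies" are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical set of operators treated as vectorizable (an assumption of this sketch).
ELEMENTWISE_OPS = {"add", "sub", "mul", "relu"}


def build_dependency_graph(nodes):
    """Map each node name to the names of the nodes that consume its output."""
    consumers = defaultdict(list)
    for name, (_op, inputs) in nodes.items():
        for src in inputs:
            if src in nodes:
                consumers[src].append(name)
    return consumers


def find_vectorizable_subgraphs(nodes):
    """Group candidate nodes that are linked to each other by data dependencies."""
    consumers = build_dependency_graph(nodes)
    candidates = {n for n, (op, _inputs) in nodes.items() if op in ELEMENTWISE_OPS}
    seen, subgraphs = set(), []
    for start in sorted(candidates):
        if start in seen:
            continue
        group, stack = [], [start]
        seen.add(start)
        while stack:
            cur = stack.pop()
            group.append(cur)
            # Neighbours are the producers and consumers of the current node.
            for nxt in list(nodes[cur][1]) + consumers[cur]:
                if nxt in candidates and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        subgraphs.append(sorted(group))
    return subgraphs


# Toy calculation graph: node name -> (operator, list of input node names).
graph = {
    "x": ("input", []),
    "conv": ("conv2d", ["x"]),
    "bias": ("add", ["conv"]),
    "act": ("relu", ["bias"]),
    "pool": ("maxpool", ["act"]),
    "scale": ("mul", ["pool"]),
}

SUBGRAPHS = find_vectorizable_subgraphs(graph)
```

In this toy graph the `add` and `relu` nodes form one candidate subgraph, while `mul` is isolated behind the pooling operator and forms its own. A real framework compiler would apply richer legality checks than the operator-name test used here.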


Abstract

The invention discloses an automatic vector optimization method for inconsistent vector widths in a deep learning framework compiler. The method is based on a heterogeneous platform and comprises the following steps: S1, the front end of the framework compiler identifies the subgraphs of the calculation graph that can be vector-optimized; S2, the middle end of the framework compiler fuses the operators in the vector-optimizable subgraphs marked in step S15; S3, the back end of the framework compiler applies width-inconsistent vector optimization to the low-level IR obtained in step S2, according to the vector widths of the control core and the computing cores of the heterogeneous many-core processor; and S4, the code generation module of the framework compiler converts the vector-optimized low-level IR obtained in step S32 into the high-level language code specified by the user, and the basic compiler then generates the vector-optimized platform target code. The method further exploits the instruction-set-level parallelism of deep learning workloads and raises their degree of vectorization, thereby improving their inference performance on the heterogeneous many-core platform.
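To make the idea behind step S3 concrete, the sketch below shows how the same loop might be split differently depending on which core type it targets. It is only an illustration under stated assumptions: the source gives no vector widths, so the 256-bit and 512-bit values, the core names in `VECTOR_WIDTH_BITS`, and the helper functions are hypothetical, and the lane-wise addition is emulated with Python lists rather than performed on low-level IR.

```python
# Hypothetical vector widths in bits; the source does not state the real values.
VECTOR_WIDTH_BITS = {"control_core": 256, "compute_core": 512}


def lanes(core, elem_bits=32):
    """Number of elements that fit in one vector register of the given core."""
    return VECTOR_WIDTH_BITS[core] // elem_bits


def plan_vector_loop(n, core, elem_bits=32):
    """Split n elements into full-width vector iterations plus a scalar tail."""
    w = lanes(core, elem_bits)
    return {"core": core, "lanes": w, "vector_iters": n // w, "scalar_tail": n % w}


def vectorized_add(a, b, core, elem_bits=32):
    """Emulate lane-wise addition in chunks sized to the core's vector width."""
    w = lanes(core, elem_bits)
    n = len(a)
    main = n - n % w  # number of elements covered by full vector iterations
    out = []
    for i in range(0, main, w):
        out.extend(x + y for x, y in zip(a[i:i + w], b[i:i + w]))
    for i in range(main, n):  # scalar remainder loop
        out.append(a[i] + b[i])
    return out


CONTROL_PLAN = plan_vector_loop(100, "control_core")
COMPUTE_PLAN = plan_vector_loop(100, "compute_core")
```

With these assumed widths, a 100-element loop of 32-bit values is planned as 12 vector iterations of 8 lanes on the control core and 6 iterations of 16 lanes on a computing core, each with a 4-element scalar tail. The result of the addition is the same on either core; only the chunking differs.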

Description

Technical field

[0001] The invention relates to an automatic vector optimization method for non-uniform vector widths in a deep learning framework compiler, and belongs to the technical field of deep learning.

Background

[0002] Deep learning workloads need to be deployed on specific hardware to be fully effective. Developers in the field of deep learning have designed a variety of frameworks, such as TensorFlow and Caffe, to carry out the training and inference tasks of deep learning models. At the same time, hardware manufacturers have launched a variety of hardware back ends, such as GPUs and FPGAs, to accelerate the training and inference of deep learning models. A bridge is needed between the large number of different deep learning frameworks and the growing number of hardware architectures. As a complete optimization toolchain, the deep learning framework compiler provides an end-to-end solution for deploying deep learning workloads of differ...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F8/30, G06F8/41, G06N20/00

CPC: G06F8/443, G06F8/37, G06F8/447, G06N20/00, Y02D10/00

Inventors: 沈莉, 周文浩, 王飞, 武文浩, 肖谦

Owner: JIANGNAN INST OF COMPUTING TECH
