Deep neural network multi-path reasoning acceleration method for edge intelligent application

A deep neural network multi-path technology, applied in the field of deep learning model inference optimization and acceleration, that addresses problems such as failure to meet low-latency requirements, unconsidered computing resource load, and declining model inference accuracy, so as to reduce the impact of network load, meet real-time performance requirements, and avoid redundant calculations.

Active Publication Date: 2020-07-24
SOUTHEAST UNIV

AI Technical Summary

Problems solved by technology

However, such methods ignore the selection of the location and number of exits. Simple preset exits do not take into account the complex and variable computing resource loads of the terminal layer and the edge layer, and manually set thresholds lack good decision-making ability on newly collected data. This leads to a decrease in model inference accuracy, which cannot meet the high-precision requirements of applications such as autonomous driving.
[0007] Therefore, existing deep learning model inference acceleration methods still have significant limitations when applied to edge computing and artificial intelligence application scenarios, and cannot meet the low-latency and high-precision requirements of edge intelligence applications.


Examples


Embodiment 1

[0086] Embodiment 1: As shown in Figure 2, in the exit selection step, it is necessary to analyze, based on the deep neural network prototype, the inference benefit brought by adding an exit after each convolutional layer. The convolutional layer is the key layer for feature extraction in deep neural networks, and it accounts for a large part of the computation. To calculate the inference benefit of the exit at each layer, the model backbone and the classifier (fully connected layer) of each exit are trained on the existing training data set with the same training intensity. To reduce the coupling between the exits and the model backbone, each exit classifier is trained using a separate loss function. Using cross-validation, the accuracy achievable at the exit of each layer can be obtained. In general, the accuracy increases with the depth of the exit, and the amount of calculation will increase at the...
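The trade-off described above, where deeper exits are more accurate but cost more computation, can be illustrated with a minimal sketch. The benefit formula here (accuracy minus a weighted cost), the `ExitCandidate` type, and all numbers are hypothetical and are not taken from the patent. The sketch also ranks exits one at a time, whereas the patent selects an exit combination.

```python
from dataclasses import dataclass


@dataclass
class ExitCandidate:
    layer: int       # index of the convolutional layer the exit follows
    accuracy: float  # cross-validated accuracy of this exit's classifier
    cost: float      # cumulative computation up to this exit, as a fraction of the full model


def inference_benefit(candidate, weight=1.0):
    # Hypothetical score (an assumption): reward accuracy, penalize computation.
    return candidate.accuracy - weight * candidate.cost


def select_exits(candidates, num_exits, weight=1.0):
    # Rank candidates by benefit and return the chosen layer indices in model order.
    ranked = sorted(candidates, key=lambda c: inference_benefit(c, weight), reverse=True)
    return sorted(c.layer for c in ranked[:num_exits])


# Made-up measurements for a five-layer backbone.
candidates = [
    ExitCandidate(1, 0.55, 0.10),
    ExitCandidate(2, 0.78, 0.25),
    ExitCandidate(3, 0.86, 0.45),
    ExitCandidate(4, 0.90, 0.70),
    ExitCandidate(5, 0.93, 1.00),
]
shallow = select_exits(candidates, num_exits=2)            # cost weighted heavily
deep = select_exits(candidates, num_exits=2, weight=0.2)   # cost weighted lightly
```

With the cost weighted heavily the sketch prefers shallow exits, and with a lighter weight it moves toward deeper ones, which is the behaviour a benefit-based selection would be expected to show.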

Embodiment 2

[0096] Embodiment 2: As shown in Figure 3, based on the multi-path model constructed by the above algorithm, the present invention places a threshold unit between the model backbone and each exit classifier to make multi-path inference decisions, thereby eliminating a large amount of redundant computation in the DNN and shortening the inference delay. Training the threshold unit requires building a training set for it, which involves the following three steps:

[0097] Step 1: The original training set is input into the multi-path model, and the classification results are obtained through each inference path.

[0098] Step 2: Extract the intermediate feature map x generated on each path before the sample exits, and use it as the training set Gate_Dateset of the threshold (gate) unit.

[0099] Step 3: The label is initialized as an all-zero sequence R. According to the classification result, if the index of the shortest path with correct classification is k, that is, ...
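The labeling in Step 3 can be sketched as follows. The source text is truncated before it says how the label sequence is updated, so marking only position k with 1 is an assumption, and the function name `gate_labels` is hypothetical.

```python
def gate_labels(path_predictions, true_label):
    """Build the label sequence R for one training sample.

    path_predictions: predicted class from each inference path, shortest path first.
    true_label: ground-truth class of the sample.
    """
    labels = [0] * len(path_predictions)  # R starts as an all-zero sequence
    for k, predicted in enumerate(path_predictions):
        if predicted == true_label:
            # Assumption: only the shortest correctly classifying path is marked.
            labels[k] = 1
            break
    return labels


# A sample of class 7 that the first exit misclassifies and the second gets right.
example = gate_labels([3, 7, 7], true_label=7)
```

Under this assumption, a sample that no path classifies correctly keeps its all-zero label.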

Embodiment 3

[0109] Embodiment 3: As shown in Figure 4, the intermediate feature data is compressed and encoded before being transmitted to the edge layer to reduce the transmission overhead. For data-intensive applications, this step effectively reduces the extent to which the transmission stage constrains overall computation when the network load is heavy. Whether lossless or lossy compression is used depends on the sparsity ratio of the feature maps of each layer.

[0110] The present invention uses lossless compression for feature maps with relatively high sparsity, and lossy compression for feature maps with relatively low sparsity. By calculating the information entropy of the feature map of each layer, the change in the sparsity ratio of the feature maps across layers can be judged directly. An entropy threshold is set, and the feature maps of each layer are divided into two categories: the feature maps lower than or equal to the threshold are clas...
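A minimal sketch of the entropy test, assuming NumPy is available. The histogram-based entropy estimate, the bin count, and the function names are illustrative choices. The source is truncated before it states which class maps to which method, so treating low entropy as a sign of high sparsity, and therefore choosing lossless compression for it, is an assumption.

```python
import numpy as np


def feature_map_entropy(fmap, bins=256):
    # Shannon entropy (in bits) of the value histogram of a feature map.
    hist, _ = np.histogram(fmap, bins=bins)
    p = hist[hist > 0] / hist.sum()  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())


def choose_compression(fmap, entropy_threshold):
    # Assumption: entropy at or below the threshold indicates a sparse map.
    if feature_map_entropy(fmap) <= entropy_threshold:
        return "lossless"
    return "lossy"


# A mostly-zero map (as after ReLU) versus a map with widely spread values.
sparse_map = np.zeros((8, 8))
sparse_map[0, 0] = 1.0
dense_map = np.arange(256, dtype=float).reshape(16, 16)
```

The entropy threshold would need to be tuned per model. The value used when calling `choose_compression` is up to the caller.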


Abstract

The invention discloses a deep neural network multi-path inference acceleration method for edge intelligent applications, comprising the following steps: first, analyzing the classification capability and computational cost of an early-exit branch at each layer of a deep neural network, and selecting the exit combination with the maximum inference benefit to add to the original model; setting a threshold unit between each exit and the backbone layer and training it to judge whether a task can exit from the current exit; compressing the intermediate feature data for tasks that cannot exit early at the terminal layer and must be transmitted to the edge layer; and finally, monitoring and analyzing online the network load and the computing capabilities of the terminal and edge devices in an edge computing environment, partitioning the multi-path model with the goal of minimizing inference latency, and deploying the model blocks in the terminal layer and the edge layer respectively to form a multi-path inference acceleration framework. The method improves inference flexibility, maintains accuracy, reduces total inference latency, and meets the real-time and high-precision requirements of edge intelligent applications.
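The final step, partitioning the model to minimize inference latency, can be illustrated with a simplified sketch. It assumes per-layer latencies on each device and the size of the data crossing each possible cut have already been measured, and it ignores early exits and compression. The function name, parameters, and numbers are all hypothetical.

```python
def best_partition(terminal_ms, edge_ms, transfer_bytes, bandwidth_bytes_per_ms):
    """Return (cut, latency_ms): layers [0, cut) run on the terminal, the rest on the edge.

    terminal_ms[i], edge_ms[i]: measured latency of layer i on each device.
    transfer_bytes[i]: size of the data sent when the cut is placed before layer i
                       (transfer_bytes[0] is the raw input).
    """
    n = len(terminal_ms)
    best = None
    for cut in range(n + 1):
        latency = sum(terminal_ms[:cut]) + sum(edge_ms[cut:])
        if cut < n:  # something still runs on the edge, so data must be transmitted
            latency += transfer_bytes[cut] / bandwidth_bytes_per_ms
        if best is None or latency < best[1]:
            best = (cut, latency)
    return best


# Made-up profile of a three-layer model.
terminal_ms = [10, 20, 30]
edge_ms = [2, 4, 6]
transfer_bytes = [1000, 200, 100]
moderate = best_partition(terminal_ms, edge_ms, transfer_bytes, 10)
```

Re-running the search whenever the measured bandwidth or device load changes moves the cut point: with a fast link the sketch offloads everything to the edge, and with a very slow link it keeps the whole model on the terminal.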

Description

Technical field

[0001] The present invention belongs to the fields of edge intelligence and deep learning, and specifically relates to a method for optimizing and accelerating the inference of the deep learning models on which intelligent applications depend when those applications are deployed in an edge computing environment.

Background technique

[0002] During the rapid development of artificial intelligence, the Deep Neural Network (DNN), with its powerful learning ability, has achieved excellent results in classic task scenarios such as computer vision and natural language processing. At the same time, with the arrival of the Internet of Things era, the rapid spread of smart terminals such as smart cameras, smart sensors, and various Internet of Things devices has allowed deep learning algorithms to be applied successfully in practical deployments such as face recognition and smart security. Terminal intelligence has become an inevit...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N5/04, G06N3/08, G06N3/04
CPC: G06N5/04, G06N3/08, G06N3/045
Inventor: 东方王慧田沈典蔡光兴黄兆武
Owner: SOUTHEAST UNIV