An image processing method and device based on embedded GPU and convolution calculation

An image processing method using embedded technology, applied in the field of computer vision. It addresses the complexity of the SSD detection algorithm, its large storage consumption, and the limited computing units and memory of embedded GPUs. Its effects are simpler control logic for segmented convolution, faster convolution computation, and lower memory overhead.

Active Publication Date: 2020-11-03
HANGZHOU INNOVATION RES INST OF BEIJING UNIV OF AERONAUTICS & ASTRONAUTICS
7 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0003] Because the SSD detection pipeline is complex, implementing it on an embedded hardware platform consumes a large amount of storage and many computing units, which places high demands on the hardware platform.
Programming hardware such as DSPs and FPGAs is much harder than software development. Many software algorithms are difficult to implement in hardware, and hardware development cycles are longer and more costly. A GPU is therefore used as the processor.
Embedded GPUs can execute programs concurrently and support the CUDA libraries used for deep learning. Their memory is limited, however, so optimizing the memory utilization and running time of convolution calculations on embedded platforms is important for image processing.


Examples

Embodiment Construction

[0052] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that persons of ordinary skill in the art obtain from these embodiments without creative effort fall within the protection scope of the present invention.

[0053] Referring to Figure 1, an embodiment of the present invention discloses an image processing method based on an embedded GPU and convolution calculation, including:

[0054] S1: Use the memory-optimized convolution expansion method to perform matrix transformation and CUDA parallel processing on the input image to obtain an intermediate matrix;

[0055] S2: Expand the rows and columns of the convolution kernel matrix on the input image to obtain a tempo...
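
Step S1 turns the input image into an intermediate matrix so that convolution becomes a matrix product. The following is a minimal CPU-side sketch of the standard im2col expansion in plain Python, not the patent's memory-optimized variant or its CUDA kernel, neither of which is shown in this excerpt. It assumes a single-channel image, stride 1, and no padding; the names `im2col` and `conv_via_matmul` are hypothetical.

```python
def im2col(image, kh, kw):
    """Unfold every kh x kw window of a 2-D image into one column.

    Assumption: stride 1, no padding, single channel.
    Returns a matrix with kh*kw rows and out_h*out_w columns.
    """
    h, w = len(image), len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = [[0] * (out_h * out_w) for _ in range(kh * kw)]
    for i in range(out_h):
        for j in range(out_w):
            col = i * out_w + j  # one column per output position
            for di in range(kh):
                for dj in range(kw):
                    cols[di * kw + dj][col] = image[i + di][j + dj]
    return cols


def conv_via_matmul(image, kernel):
    """Convolve (CNN-style, no kernel flip) as flattened kernel x im2col matrix."""
    kh, kw = len(kernel), len(kernel[0])
    cols = im2col(image, kh, kw)
    flat = [v for row in kernel for v in row]  # kernel as a single row vector
    out_w = len(image[0]) - kw + 1
    out = [sum(f * cols[r][c] for r, f in enumerate(flat))
           for c in range(len(cols[0]))]
    # Reshape the flat result back into out_h rows of out_w values.
    return [out[k:k + out_w] for k in range(0, len(out), out_w)]
```

Each output position owns one column of the intermediate matrix, so the columns can be filled independently, which is what makes this step suitable for one-thread-per-element parallel processing on a GPU.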


Abstract

The invention discloses an image processing method and device based on an embedded GPU and convolution calculation. The method optimizes the convolution calculation in the SSD algorithm. A memory-optimized convolution expansion performs a matrix transformation on the input image, and CUDA processes it in parallel to form an intermediate matrix. The convolution kernel matrix is expanded with its rows and columns aligned and is then partitioned into blocks to reduce memory overhead at run time. The highly optimized cuBLAS matrix multiplication function in the CUDA library then accelerates the convolution calculation in parallel, and finally the resulting matrices are merged and output. The method reduces memory overhead, improves the performance of the algorithm, exploits the GPU's parallelism, shortens matrix multiplication time, and improves computational efficiency.
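
The partition, multiply, and merge sequence in the abstract can be illustrated with a small plain-Python sketch. It is only an illustration under stated assumptions: the kernel matrix is split into row blocks (the excerpt does not say how the patent partitions it), the function `matmul` stands in for the cuBLAS matrix multiplication call, and the names `matmul` and `blocked_matmul` are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product; a stand-in for the cuBLAS GEMM call."""
    inner, n = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(n)]
            for row in a]


def blocked_matmul(kernel_matrix, intermediate, block_rows):
    """Multiply the kernel matrix block by block and merge the partial outputs.

    Assumption: the kernel matrix is partitioned by rows, so only
    block_rows rows of the result are produced per multiplication.
    """
    if block_rows <= 0:
        raise ValueError("block_rows must be positive")
    merged = []
    for start in range(0, len(kernel_matrix), block_rows):
        block = kernel_matrix[start:start + block_rows]
        merged.extend(matmul(block, intermediate))  # merge by stacking rows
    return merged
```

With row partitioning, the merged result is identical to the unpartitioned product, while each multiplication only needs one block of the kernel matrix and the corresponding slice of the output in memory at a time.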

Description

Technical field

[0001] The present invention relates to the technical field of computer vision, and more specifically to an image processing method and device based on an embedded GPU and convolution calculation.

Background technique

[0002] Since convolutional neural networks were introduced in the ImageNet competition, computer vision technology has made great progress in the past few years, performing strongly in fields such as image classification, pattern recognition, and multimedia compression. Among these techniques, the SSD algorithm has been widely used. The SSD algorithm performs uniform dense sampling at different positions of the picture, and the sampling can use different scales and aspect ratios. A convolutional neural network then extracts features, and classification and regression are performed directly. The whole process requires only one step and is faster than the RCNN series of algorithms. The SSD algorithm has been op...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06T1/20, G06F17/15, G06F17/16
CPC: G06F17/15, G06F17/16, G06T1/20
Inventor: 姜宏旭, 王玺坤, 李波, 张永华, 林珂玉
Owner: HANGZHOU INNOVATION RES INST OF BEIJING UNIV OF AERONAUTICS & ASTRONAUTICS