Monocular image depth estimation method based on pyramid pooling module

A technique combining pyramid pooling and image depth, applied in the field of image depth estimation, that addresses the difficulty and ambiguity of estimating depth from a single image, where no matching is available.

Active Publication Date: 2019-03-01
ZHEJIANG UNIVERSITY OF SCIENCE AND TECHNOLOGY
Cites: 6; Cited by: 27

AI Technical Summary

Problems solved by technology

In contrast, estimating depth from monocular images is harder and more ambiguous because it permits neither spatial matching, as with stereo images, nor temporal matching, as with motion sequences.
In 2016, Laina I and others proposed the Fully Convolutional Residual Networks (FCRN) framework, whi




Embodiment Construction

[0032] The present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments.

[0033] The overall implementation block diagram of the monocular image depth estimation method based on the pyramid pooling module proposed by the present invention is shown in Figure 1. The method includes two processes: a training phase and a testing phase.

[0034] The specific steps of the training phase are:

[0035] Step 1_1: Select Q original monocular images and the real depth image corresponding to each original monocular image to form a training set. Denote the q-th original monocular image in the training set as {I_q(i,j)}, and denote the real depth image in the training set corresponding to {I_q(i,j)} as [symbol omitted in source]. Here Q is a positive integer with Q≥200, for example Q=4000; q is a positive integer with 1≤q≤Q; 1≤i≤W and 1≤j≤H; W denotes the width of {I_q(i,j)} and of its real depth image; H denotes the height of {I_q(i,j)} and of its real depth image; I_q(i,j...
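The pairing described in Step 1_1 can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function name `build_training_set`, the array layout (height by width by channels) and the random toy data are all assumptions made for the example. Only the constraint Q≥200 and the requirement that each image shares its width W and height H with its depth image come from the text.

```python
import numpy as np

def build_training_set(images, depths, min_q=200):
    """Pair each monocular image with its real depth map (hypothetical helper)."""
    if len(images) != len(depths):
        raise ValueError("each image needs exactly one depth map")
    q_total = len(images)
    if q_total < min_q:
        # Step 1_1 requires Q >= 200
        raise ValueError(f"Q={q_total} is below the required minimum {min_q}")
    pairs = []
    for image, depth in zip(images, depths):
        # Assumed layout: image is H x W x C, depth is H x W.
        if image.shape[:2] != depth.shape:
            raise ValueError("image and depth map must share width and height")
        pairs.append((image, depth))
    return pairs

# Toy data (assumption): Q = 200 random images of height 8 and width 12.
rng = np.random.default_rng(0)
Q, H, W = 200, 8, 12
images = [rng.random((H, W, 3)) for _ in range(Q)]
depths = [rng.random((H, W)) for _ in range(Q)]
training_set = build_training_set(images, depths)
```

Validating shapes when the set is built, rather than during training, makes a mismatched image and depth pair fail early with a clear message.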


Abstract

The invention discloses a monocular image depth estimation method based on a pyramid pooling module. In the training stage, a neural network is first constructed, comprising an input layer, a hidden layer and an output layer. The hidden layer includes a separate first convolution layer, a feature extraction network framework, a scale recovery network framework, a separate second convolution layer, a pyramid pooling module, and a separate connection layer. Each original monocular image in the training set is used as an original input image and fed into the neural network for training. The loss function value between the predicted depth image and the real depth image corresponding to each original monocular image in the training set is calculated, and the optimal weight vector and optimal bias term of the trained neural network model are thereby obtained. In the testing phase, the monocular image to be predicted is input into the neural network model, and the predicted depth image is obtained using the optimal weight vector and the optimal bias term. The advantages are high prediction accuracy and low computational complexity.
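To make the idea of a pyramid pooling module concrete, the following NumPy sketch pools a feature map at several grid resolutions, resizes each pooled map back to the input size, and concatenates the results with the original features. It is an illustration under stated assumptions, not the module claimed in the patent: the bin sizes (1, 2, 3, 6), the nearest-neighbour upsampling, the channels-first layout and all function names are choices made here, and the per-branch convolutions that such modules commonly use are omitted.

```python
import numpy as np

def adaptive_avg_pool(feat, bins):
    """Average-pool a C x H x W feature map onto a bins x bins grid."""
    c, h, w = feat.shape
    out = np.empty((c, bins, bins), dtype=feat.dtype)
    for r in range(bins):
        # Row window: floor of the start edge, ceiling of the end edge.
        r0, r1 = (r * h) // bins, -((-(r + 1) * h) // bins)
        for s in range(bins):
            s0, s1 = (s * w) // bins, -((-(s + 1) * w) // bins)
            out[:, r, s] = feat[:, r0:r1, s0:s1].mean(axis=(1, 2))
    return out

def upsample_nearest(feat, h, w):
    """Resize a C x h0 x w0 map to C x h x w by nearest-neighbour lookup."""
    _, h0, w0 = feat.shape
    rows = (np.arange(h) * h0) // h
    cols = (np.arange(w) * w0) // w
    return feat[:, rows[:, None], cols[None, :]]

def pyramid_pooling(feat, bin_sizes=(1, 2, 3, 6)):
    """Concatenate the input with its pooled-and-upsampled versions.

    bin_sizes is an assumption; the source does not state the pooling scales.
    """
    _, h, w = feat.shape
    branches = [feat]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(feat, bins)
        branches.append(upsample_nearest(pooled, h, w))
    return np.concatenate(branches, axis=0)

# Toy feature map (assumption): 4 channels, height 12, width 16.
rng = np.random.default_rng(0)
feat = rng.random((4, 12, 16))
fused = pyramid_pooling(feat)
```

With four bin sizes, the output has five times as many channels as the input: the original features plus one context branch per scale. The branch with a single bin carries the global average of each channel to every pixel, which is how the module injects scene-level context.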

Description

Technical field

[0001] The present invention relates to an image depth estimation technology, in particular to a monocular image depth estimation method based on a pyramid pooling module.

Background technology

[0002] Depth estimation is the process of using one or more images to predict a depth map of a scene. Depth information is an important clue for understanding the geometric relationships in a scene, and can be applied to tasks such as 3D model reconstruction, stereo matching, and human pose estimation. Depth information can be obtained from stereo images containing left and right viewpoints or from motion sequences, which provide relatively rich information for understanding depth in space and in time, respectively. In contrast, estimating depth from monocular images is harder and more ambiguous because it permits neither spatial matching, as with stereo images, nor temporal matching, as with motion sequences. In 2016, Laina et al. proposed the Fully Convolutional ...


Application Information

IPC(8): G06T7/50, G06N3/04
CPC: G06T7/50, G06N3/045
Inventor: 周武杰, 潘婷, 顾鹏笠, 钱亚冠, 楼宋江
Owner: ZHEJIANG UNIVERSITY OF SCIENCE AND TECHNOLOGY