Method for generating spatial-temporal consistency depth map sequence based on convolution neural network

A convolutional-neural-network-based spatiotemporal-consistency technology, applied in the field of generating spatiotemporally consistent depth map sequences, which can solve problems such as inter-frame continuity being ignored, flickering of synthesized virtual views, and degraded user perception.

Active Publication Date: 2017-05-03
ZHEJIANG GONGSHANG UNIVERSITY

Problems solved by technology

Jumps in the depth map between adjacent frames cause flickering in the synthesized virtual view, which seriously degrades the viewing experience.
In addition, continuity between frames provides important cues for depth recovery, yet existing methods simply ignore this information.

Examples

Specific embodiment

[0126] Specific embodiments: the present invention is compared with existing methods on the public dataset NYU Depth v2 and on LYB3D-TV, a dataset constructed by the inventors. The NYU Depth v2 dataset consists of 795 training scenes and 654 test scenes, each containing 30 consecutive frames of RGB images and their corresponding depth maps. The LYB3D-TV dataset is taken from scenes of the TV series "Nirvana in Fire": 5124 frames from 60 scenes, together with their manually labeled depth maps, serve as the training set, and 1278 frames from 20 scenes, with their manually labeled depth maps, serve as the test set (a sketch of this sample layout follows the reference list below). The depth recovery accuracy of the proposed method is compared with that of the following methods:

[0127] 1. Depth transfer: Karsch, Kevin, Ce Liu, and Sing Bing Kang. "Depth transfer: Depth extraction from video using non-parametric sampling." IEEE Transactions on Pattern Analysis and Machine Intelligence 36.11 (2...
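To make the sample layout concrete, below is a minimal sketch of how one training sample described above (a clip of consecutive RGB frames paired with its depth maps, e.g. a 30-frame NYU Depth v2 scene) might be held in memory. The class and field names are illustrative assumptions, not the patent's data format.

```python
# Hypothetical container for one training sample: a clip of consecutive RGB
# frames and the matching ground-truth depth maps. Names are illustrative.
from dataclasses import dataclass
import numpy as np

@dataclass
class DepthClip:
    rgb: np.ndarray    # (F, H, W, 3) consecutive RGB frames of one scene
    depth: np.ndarray  # (F, H, W) corresponding ground-truth depth maps

    def __post_init__(self):
        # Each RGB frame must have exactly one depth map.
        assert self.rgb.shape[0] == self.depth.shape[0], "frame counts must match"
```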

Abstract

The invention discloses a method for generating a spatiotemporally consistent depth map sequence based on a convolutional neural network, which can be used in 2D-to-3D conversion of film and television works. The method comprises the following steps: (1) collecting a training set, in which each training sample consists of a continuous RGB image sequence and the corresponding depth map sequence; (2) performing spatiotemporally consistent superpixel segmentation on each image sequence in the training set, and constructing a spatial similarity matrix and a temporal similarity matrix; (3) constructing a convolutional neural network composed of a single-superpixel depth regression network and a spatiotemporal-consistency conditional random field (CRF) loss layer; (4) training the convolutional neural network; (5) for an RGB image sequence of unknown depth, using the trained network to recover the corresponding depth map sequence through forward propagation. The method avoids the over-reliance on scene assumptions of cue-based depth recovery methods, as well as the inter-frame discontinuity of depth maps produced by existing CNN-based depth recovery methods.
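As an illustration of step (3), here is a minimal, hedged sketch of what a spatiotemporal-consistency CRF loss of this general shape could look like, assuming a common quadratic formulation: a unary term tying each superpixel's predicted depth to ground truth, plus pairwise terms weighted by the spatial and temporal similarity matrices of step (2). The function name, the quadratic energy, and the lambda weights are assumptions for illustration, not the patent's exact loss layer.

```python
# Minimal sketch (PyTorch) of a quadratic spatiotemporal CRF-style loss over
# superpixel depths. Assumed form, not the patent's exact formulation.
import torch

def st_crf_loss(pred: torch.Tensor, gt: torch.Tensor,
                S: torch.Tensor, T: torch.Tensor,
                lam_s: float = 1.0, lam_t: float = 1.0) -> torch.Tensor:
    """pred, gt: (N,) predicted / ground-truth depths for the N superpixels
    of a clip; S, T: (N, N) spatial / temporal similarity matrices."""
    unary = ((pred - gt) ** 2).sum()              # data (regression) term
    diff = pred.unsqueeze(0) - pred.unsqueeze(1)  # (N, N) pairwise depth differences
    spatial = (S * diff ** 2).sum()               # smoothness between neighbours within a frame
    temporal = (T * diff ** 2).sum()              # consistency between frames
    return unary + lam_s * spatial + lam_t * temporal
```

During training, the per-superpixel outputs of the regression network would be fed in as pred; minimizing the pairwise terms is what discourages the inter-frame depth jumps described under "Problems solved by technology".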

Description

technical field

[0001] The invention relates to the field of computer vision and stereoscopic video, in particular to a method for generating a spatiotemporally consistent depth map sequence based on a convolutional neural network.

Background technique

[0002] The basic principle of stereoscopic video is to superimpose and play two images with horizontal parallax; through stereoscopic glasses, each of the viewer's eyes sees the corresponding image, which produces stereoscopic perception. Stereoscopic video provides an immersive three-dimensional experience and is very popular among consumers. However, as 3D film and television hardware becomes more widespread, a shortage of 3D content follows. Shooting directly with a 3D camera is expensive and its post-production is difficult, so it is usually viable only for high-budget movies. Therefore, the 2D/3D conversion technology of film and television works is an...
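For a concrete sense of the parallax in [0002], the sketch below uses the textbook pinhole-stereo relation d = f·B/Z (disparity d, focal length f, baseline B, depth Z) to show how a recovered depth value translates into a horizontal shift between left and right views. The relation and the parameter values are standard illustrative assumptions, not taken from the patent.

```python
# Textbook pinhole-stereo parallax: disparity = focal * baseline / depth.
# Parameter values are illustrative assumptions (focal in pixels, baseline in metres).
def disparity_px(depth_m: float, focal_px: float = 1000.0, baseline_m: float = 0.065) -> float:
    """Horizontal shift in pixels between left/right views for a point at depth_m."""
    return focal_px * baseline_m / depth_m

print(disparity_px(1.0))  # 65.0 -> near objects shift a lot between the views
print(disparity_px(5.0))  # 13.0 -> far objects shift little
```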

Application Information

Patent Type & Authority: Application (China)
IPC(8): H04N13/00, G06T7/20, G06T7/285
CPC: H04N13/122, H04N13/128
Inventors: 王勋 (Wang Xun), 赵绪然 (Zhao Xuran)
Owner: ZHEJIANG GONGSHANG UNIVERSITY