A method of generating a neural network model for video prediction

Applied in the field of unsupervised video frame prediction, this neural network model technology addresses problems of existing approaches, such as the sensitivity of optical flow to occlusion, fast motion, illumination changes, and nonlinear structures, as well as blurry predictions and unsatisfactory prediction results, and achieves a good long-term prediction effect.

Active Publication Date: 2019-01-08
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

For example, Sudheendra et al. published the article "SfM-Net: Learning of Structure and Motion from Video" on arXiv in 2017, proposing a single-branch or dual-branch neural network that explicitly models pixel-level motion in combination with optical flow information. However, because optical flow is sensitive to occlusion, fast motion, illumination changes, and nonlinear structures, the prediction results of this technique are not ideal.
For another example, the article "Video Frame Synthesis using Deep Voxel Flow", published by Liu et al. at ICCV in 2017, proposed the use of fully convolutional code



Examples


Embodiment Construction

[0040] After studying the existing technology, the inventor found that current pixel-level video prediction techniques predict frame by frame. This requires a very large amount of computation both to build a video prediction model and to use the model for prediction, especially when the model is built by training a neural network. In this regard, the inventor proposes that the differences between multiple consecutive frames of a video can be used for video prediction: the inter-frame differences of video samples are extracted, and a generator model G for encoding and decoding is established. The generator model G includes an encoder and a decoder, each with a neural network model structure. The encoder takes the inter-frame differences of the video samples as input, and skip connections between the encoder and the decoder are used to generate the predicted inter-frame difference ΔX, the sum of the predicted inter-frame diffe...
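The idea of working on inter-frame differences rather than on whole frames can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the tiny 2x2 frames, and the stand-in for the generator G (which simply repeats the last observed difference) are all assumptions made for illustration.

```python
from typing import List

Frame = List[List[float]]  # a grayscale frame as rows of pixel values


def frame_difference(prev: Frame, curr: Frame) -> Frame:
    """Pixel-wise difference curr - prev between two consecutive frames."""
    return [[c - p for p, c in zip(prow, crow)] for prow, crow in zip(prev, curr)]


def inter_frame_differences(frames: List[Frame]) -> List[Frame]:
    """Differences for each pair of consecutive frames (len(frames) - 1 items)."""
    return [frame_difference(a, b) for a, b in zip(frames, frames[1:])]


def add_difference(frame: Frame, delta: Frame) -> Frame:
    """Predicted frame = known frame + predicted inter-frame difference."""
    return [[f + d for f, d in zip(frow, drow)] for frow, drow in zip(frame, delta)]


# Three tiny 2x2 frames standing in for a video sample (hypothetical data).
frames = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[1.0, 1.0], [2.0, 5.0]],
    [[2.0, 1.0], [2.0, 7.0]],
]
diffs = inter_frame_differences(frames)

# Assumed placeholder for the generator G: it just repeats the last difference.
predicted_delta = diffs[-1]
predicted_frame = add_difference(frames[-1], predicted_delta)
```

In this sketch only the differences are passed to the model stand-in, and the predicted frame is recovered by a single addition to the last known frame.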


Abstract

The present invention provides a method of training a generator model G for video prediction, so that a better long-term video prediction effect can be obtained with less computation using the model. The generator model G comprises an encoder and a decoder, each adopting a neural network model structure, with skip connections between the encoder and the decoder, and is used to generate a predicted inter-frame difference ΔX. The result of summing the predicted inter-frame difference ΔX with the training sample is the predicted frame X̂. The method includes: 1) selecting successive video frames as training samples and extracting the inter-frame differences of the training samples; 2) taking the inter-frame differences as the input of the encoder in the generator model G (a formula is shown in the description), wherein ΔX_{i-1} is a value related to the i-th inter-frame difference, X_i is the i-th frame in the training sample, X̂_i is the i-th predicted frame, and X_i and X̂_i are associated with the neural network weights of the encoder and the decoder.
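The encoder, decoder, and skip connection described in the abstract can be pictured with a toy one-dimensional sketch. Everything here is an assumption for illustration: the scalar weights `w_enc` and `w_dec`, the averaging and repetition used in place of learned convolutional layers, and the sample values are hypothetical, and the training formula referenced in the abstract is not reproduced.

```python
from typing import List

Vector = List[float]


def encode(x: Vector, w_enc: float) -> Vector:
    """Toy encoder: halve the resolution by averaging neighbouring pairs, then scale."""
    return [w_enc * (x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def decode(code: Vector, skip: Vector, w_dec: float) -> Vector:
    """Toy decoder: upsample by repetition, scale, then add the skip connection."""
    upsampled = [value for value in code for _ in range(2)]
    return [w_dec * u + s for u, s in zip(upsampled, skip)]


def generator(delta_in: Vector, w_enc: float, w_dec: float) -> Vector:
    """G: input inter-frame difference -> predicted inter-frame difference."""
    code = encode(delta_in, w_enc)
    # Skip connection: the encoder's full-resolution input bypasses the bottleneck.
    return decode(code, skip=delta_in, w_dec=w_dec)


# Hypothetical 1-D "frame" and its most recent inter-frame difference.
last_frame = [10.0, 10.0, 20.0, 20.0]
last_delta = [1.0, 3.0, 0.0, 0.0]

predicted_delta = generator(last_delta, w_enc=0.5, w_dec=1.0)
# Predicted frame = last known frame + predicted inter-frame difference.
predicted_frame = [x + d for x, d in zip(last_frame, predicted_delta)]
```

The skip connection lets full-resolution detail from the encoder input reach the decoder output directly, which is the role such connections usually play in encoder-decoder networks; in a trained model the weights would be learned rather than fixed by hand.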

Description

Technical field

[0001] The present invention relates to video image processing, in particular to unsupervised prediction of video frames by training a neural network model.

Background technique

[0002] With the development of information technology, the amount of video data generated by various applications has increased dramatically, which makes it difficult for traditional video analysis technologies to meet the image processing requirements of these applications. On the one hand, traditional video analysis techniques are usually based on manual selection of image features; as the data set grows, this approach consumes considerable time and labor. On the other hand, the image features used in traditional video analysis are often chosen by technicians, based on their own assumptions, to characterize the data set at a certain level, and the selection of data samples usually depends on the experience of the technicians, which makes it difficult to guarantee Obt...


Application Information

IPC(8): H04N19/503, H04N19/70, H04N19/44
CPC: H04N19/44, H04N19/503, H04N19/70
Inventor: 金贝贝, 胡瑜, 曾一鸣, 唐乾坤, 刘世策, 叶靖
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI