Decomposing a video sequence into VOPs is a very difficult task, and comparatively little research has been undertaken in this field.
An intrinsic problem of VOP generation is that objects of interest are not homogeneous with respect to low-level features such as color, intensity, or optical flow.
Thus, conventional segmentation algorithms will fail to obtain meaningful partitions.
From (1) it can also be seen that apparent motion is highly sensitive to noise because of the derivatives, which can lead to largely incorrect results.
Unfortunately, we can only observe apparent motion.
In addition to the difficulties mentioned above, motion estimation algorithms have to solve the so-called occlusion and aperture problems.
The occlusion problem refers to the fact that no correspondence vectors exist for covered and uncovered background.
The aperture problem states that the number of unknowns is larger than the number of observations.
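The aperture problem can be illustrated at a single pixel. The following is a minimal sketch, assuming the standard gradient-based (brightness-constancy) constraint ix*u + iy*v + it = 0, which gives one observation for the two unknowns (u, v). The function names and the derivative values are hypothetical and chosen only for illustration. Only the flow component along the spatial gradient (the normal flow) is determined; any displacement along the edge satisfies the constraint equally well.

```python
def normal_flow(ix, iy, it):
    """Return the flow component along the spatial gradient (ix, iy).

    Assumes the brightness-constancy constraint ix*u + iy*v + it = 0.
    """
    g2 = ix * ix + iy * iy
    if g2 == 0.0:
        return None  # no gradient: the constraint says nothing about (u, v)
    scale = -it / g2
    return (scale * ix, scale * iy)


def satisfies_constraint(ix, iy, it, u, v, tol=1e-9):
    """Check whether (u, v) fulfils the single constraint equation."""
    return abs(ix * u + iy * v + it) < tol


# Hypothetical derivatives at one pixel on a vertical edge (no gradient in y).
ix, iy, it = 2.0, 0.0, -4.0
un, vn = normal_flow(ix, iy, it)

# Adding any displacement along the edge leaves the constraint satisfied,
# so the observation alone cannot distinguish between these candidates.
candidates = [(un, vn + k) for k in (-3.0, 0.0, 5.0)]
all_ok = all(satisfies_constraint(ix, iy, it, u, v) for u, v in candidates)
```

Additional assumptions, such as smoothness of the field or a common motion within a block or region, are what make the estimate unique in practice.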
1. Nonparametric representation, in which a dense field is estimated where each pixel is assigned a correspondence or flow vector. Block matching is then applied, where the current frame is subdivided into blocks of equal size, and for each block the best match in the next (or previous) frame is computed. All pixels of a block are assumed to undergo the same translation, and are assigned the same correspondence vector. The selection of the block size is crucial. Block matching is unable to cope with rotations and deformations. Nevertheless, its simplicity and relative robustness make it a popular technique. Nonparametric representations are not suitable for segmentation, because an object moving in 3-D space generates a spatially varying 2-D motion field even within the same region, except for the simple case of pure translation. This is the reason why parametric models are commonly used in segmentation algorithms. However, dense field estimation is often the first step in calculating the model parameters.
2. Parametric models require a segmentation of the scene, which is our ultimate goal, and describe the motion of each region by a set of a few parameters. The motion vectors can then be synthesized from these model parameters. A parametric representation is more compact than a dense field description, and less sensitive to noise, because many pixels are treated jointly to estimate a few parameters.
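The block matching step described in the first item can be sketched as follows. This is an illustrative example under stated assumptions, not a reference implementation: it uses an exhaustive search, the sum of absolute differences (SAD) as the matching criterion, and frames stored as plain nested lists to stay dependency-free. The names `sad`, `block_matching`, and `frame_with_square`, as well as the block size and search range, are hypothetical.

```python
def sad(cur, ref, by, bx, dy, dx, n):
    """Sum of absolute differences between the n-by-n block of `cur` at
    (by, bx) and the block of `ref` displaced by (dy, dx)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def block_matching(cur, ref, n=4, search=2):
    """Return {(by, bx): (dy, dx)}: one translation vector per n-by-n block."""
    h, w = len(cur), len(cur[0])
    field = {}
    for by in range(0, h - n + 1, n):
        for bx in range(0, w - n + 1, n):
            best, best_cost = (0, 0), sad(cur, ref, by, bx, 0, 0, n)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates that fall outside the reference frame.
                    if by + dy < 0 or by + dy + n > h:
                        continue
                    if bx + dx < 0 or bx + dx + n > w:
                        continue
                    cost = sad(cur, ref, by, bx, dy, dx, n)
                    if cost < best_cost:
                        best, best_cost = (dy, dx), cost
            # All pixels of the block share this single vector.
            field[(by, bx)] = best
    return field


def frame_with_square(top, left, size=8):
    """Synthetic test frame: a bright 2x2 square on a dark background."""
    f = [[0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            f[y][x] = 9
    return f


cur = frame_with_square(1, 1)
ref = frame_with_square(2, 3)  # same square moved by (dy, dx) = (1, 2)
field = block_matching(cur, ref)  # field[(0, 0)] == (1, 2)
```

In this toy example the block containing the square recovers the displacement (1, 2), while textureless blocks keep the zero vector because no candidate improves on it. A larger block averages over more pixels but is more likely to straddle a motion boundary, which is one way to see why the block size matters.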
Although parametric representations are less noise sensitive, they still suffer from the intrinsic problems of motion estimation.
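To make the parametric representation concrete, the sketch below assumes a six-parameter affine model, u = a1 + a2*x + a3*y and v = a4 + a5*x + a6*y. The text does not commit to a particular model, so this choice is an assumption, as are the names `solve3`, `fit_affine`, and `synthesize`. The parameters of one region are estimated from its dense vectors by least squares, and the motion vectors are then synthesized from the parameters.

```python
def solve3(a, b):
    """Solve the 3x3 system a*x = b by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate region: too few or collinear points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        s = m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / m[r][r]
    return x


def fit_affine(points, vectors):
    """Least-squares fit of [a1..a6] to the dense vectors of one region."""
    # Both components share the same 3x3 normal-equation matrix.
    n = [[0.0] * 3 for _ in range(3)]
    bu, bv = [0.0] * 3, [0.0] * 3
    for (x, y), (u, v) in zip(points, vectors):
        phi = (1.0, x, y)
        for i in range(3):
            bu[i] += phi[i] * u
            bv[i] += phi[i] * v
            for j in range(3):
                n[i][j] += phi[i] * phi[j]
    return solve3(n, bu) + solve3(n, bv)


def synthesize(params, x, y):
    """Motion vector at (x, y) generated from the six affine parameters."""
    a1, a2, a3, a4, a5, a6 = params
    return (a1 + a2 * x + a3 * y, a4 + a5 * x + a6 * y)


# Hypothetical region: a 4x4 grid of pixels with noise-free affine motion.
true_params = [1.0, 0.1, 0.0, -0.5, 0.0, 0.2]
points = [(float(x), float(y)) for y in range(4) for x in range(4)]
vectors = [synthesize(true_params, x, y) for x, y in points]
estimated = fit_affine(points, vectors)
```

Here 32 observations (two per pixel) determine only six unknowns, which is the sense in which many pixels are treated jointly. The fit still inherits any systematic errors in the dense field it starts from, and it presupposes that the region really contains a single motion.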
The major drawbacks of this proposal are the computational complexity, and the need to specify the number of objects likely to be found.
The techniques of Adiv, Bouthemy and Francois, and Murray and Buxton incorporate only optical flow data into the segmentation decision, and hence their performance is limited by the accuracy of the estimated flow field.
The results are unsatisfactory because the scene is over-segmented, and the method is computationally expensive.
These approaches suffer from high computational complexity, and many algorithms need the number of objects or regions in the scene as an input parameter.
The result is an over-segmentation.
A drawback of this technique is the lack of temporal correspondence to enforce continuity in time.
However, due to its nature, the watershed algorithm suffer