Methods and systems use a grid of video sensors deployed over an area, together with extensive signal processing, to create a model-based view of reality. Grid-based synchronous capture, point cloud generation and refinement, morphological processing, polygonal tiling and surface representation, texture mapping, data compression, and system-level components for user-directed signal processing are used to create, on user demand, a virtualized world viewable from any location in the area, in any direction of gaze, at any time within the capture interval. The resulting data stream is suited to near-term network-based delivery, including over 5G. Finally, because the virtualized world is inherently model-based, it can be integrated with augmentations (or deletions), yielding a harmonized, photorealistic mix of real and synthetic worlds. The result is a fully immersive mixed-reality world that supports full gesture-based interactivity.
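As an illustration only, the capture-to-rendering pipeline enumerated above can be sketched in skeletal form. Every function and class name below is a hypothetical stand-in, not a component of the disclosed system; the geometric stages (triangulation, refinement, tessellation) are stubbed rather than implemented.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    camera_id: int
    timestamp: float
    pixels: list  # placeholder for image data

def synchronous_capture(grid_size: int, t: float) -> list:
    """Simulate time-synchronized capture across the sensor grid."""
    return [Frame(cam, t, []) for cam in range(grid_size)]

def to_point_cloud(frames):
    """Stub: multi-view triangulation would produce 3D points here."""
    return [(float(i), 0.0, 0.0) for i, _ in enumerate(frames)]

def refine(points):
    """Stub: outlier removal and morphological cleanup of the cloud."""
    return points

def tessellate(points):
    """Stub: polygonal tiling of the refined cloud; faces as index triples."""
    return [(i, i + 1, i + 2) for i in range(max(len(points) - 2, 0))]

def render_view(faces, position, gaze, t):
    """Stub: synthesize the user's view of the textured surface model
    for an arbitrary position, gaze direction, and capture-interval time."""
    return {"pos": position, "gaze": gaze, "t": t, "faces": len(faces)}

frames = synchronous_capture(grid_size=4, t=0.0)
points = refine(to_point_cloud(frames))
mesh = tessellate(points)
view = render_view(mesh, position=(1.0, 2.0, 0.0), gaze=(0.0, 1.0, 0.0), t=0.0)
```

Texture mapping, compression, and the augmentation/deletion compositing step would slot in between `tessellate` and `render_view`; they are omitted here to keep the sketch minimal.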