A system 1000, and its alternates 1002 through 1022, for automatically tracking and videoing the movements of multiple participants and objects during an event. System 1000 and its alternates comprise a scalable area tracking matrix of overhead tracking cameras 120c, whose individual fields-of-view 120v combine to form a contiguous view 504m of the performance area where the event is being held. As participants 110, and all other necessary objects such as 103 (e.g. a puck in ice-hockey) and 104 (e.g. a stick in ice-hockey), move about during the event, computer 160 analyzes the images from contiguous view 504m to create a real-time tracking database at least including participant and object centroid locations relative to the performance area, and preferably including their identities matched to these ongoing locations. System 1000 and its alternates then employ the real-time database to automatically direct, without operator intervention, one or more side-view cameras such as 140-a, 140-b, 140-c and 140-d, to maintain optimal viewing of the event. The participants and objects may additionally be marked with encoded or non-encoded, visible or non-visible markers, denoting centroid locations visible from their upper surfaces and/or non-centroid locations visible from perspective views. The encoded markers are preferably placed on upper surfaces and are detectable by computer 160 as it analyzes the images from contiguous overhead view 504m, thus providing participant and object identities to correspond with ongoing locations and further enhancing the algorithms for subsequently controlling side-view cameras 140-a through 140-d.
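The step of directing a side-view camera from a tracked centroid can be illustrated with a minimal sketch. The source does not state how the camera angles are computed, so the function name `aim_camera`, the coordinate convention (x and y in the plane of the performance area, z as height) and the degree units are all assumptions made for illustration only.

```python
import math

def aim_camera(camera_xyz, target_xy, target_height=0.0):
    """Return (pan, tilt) in degrees that point a camera at a tracked centroid.

    Hypothetical helper: camera_xyz is the assumed camera position, target_xy
    is a centroid location taken from the tracking database, and target_height
    is an assumed height of the point of interest above the surface.
    """
    cx, cy, cz = camera_xyz
    tx, ty = target_xy
    dx, dy = tx - cx, ty - cy
    # Horizontal distance from the camera to the target.
    ground = math.hypot(dx, dy)
    # Pan is measured counter-clockwise from the +x axis.
    pan = math.degrees(math.atan2(dy, dx))
    # Tilt is negative when the camera looks down at the target.
    tilt = math.degrees(math.atan2(target_height - cz, ground))
    return pan, tilt
```

In this sketch a camera mounted 10 units above the surface and aimed at a centroid 10 units away along the x axis would look down at 45 degrees. A real system would also need to account for zoom, camera mounting offsets and motion smoothing, none of which are modelled here.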
The non-encoded markers are preferably placed on multiple non-centroid locations, at least on the participants, that are then adjustably viewable by system 1000 and its alternates as the system uses the determined locations of each participant from the overhead view to automatically adjust one or more side-view cameras to tightly follow the participant. The resulting images from side-view cameras 140-a through 140-d may then be processed to determine the non-encoded marker locations, thus forming a three-dimensional model of each participant's and object's movements.
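One way a marker location might be recovered in three dimensions from two side-view cameras is sketched below. The source does not specify the reconstruction method; this example assumes each camera contributes a ray (an origin and a direction toward the marker) and takes the midpoint of the shortest segment between the two rays. The name `triangulate_marker` and the ray representation are hypothetical.

```python
def triangulate_marker(p1, d1, p2, d2, eps=1e-12):
    """Estimate a 3D marker position from two camera rays.

    p1, p2: assumed camera positions (x, y, z).
    d1, d2: direction vectors from each camera toward the marker.
    Returns the midpoint of the closest points on the two rays.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        # Parallel rays give no unique closest point.
        raise ValueError("camera rays are parallel")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    q1 = tuple(p + s * v for p, v in zip(p1, d1))
    q2 = tuple(p + t * v for p, v in zip(p2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(q1, q2))
```

Repeating this estimate for each non-encoded marker, frame by frame, would yield the set of points from which a three-dimensional model could be assembled. Taking the midpoint tolerates small measurement errors that keep the two rays from intersecting exactly.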