A camera tracking device based on sound source localization comprises a pan-tilt head for carrying a camera, a single-chip microcomputer, and four sound pickups. The first, second and third sound pickups are arranged in sequence along a horizontal straight line, and the fourth sound pickup is positioned directly above the second sound pickup. The output ends of the four sound pickups are connected to four input ports of the single-chip microcomputer through four filters respectively, and a horizontal angle motor and an elevation angle motor of the pan-tilt head are connected to output ports of the single-chip microcomputer. From the sound signals picked up by the four sound pickups, a positioning method based on time-delay estimation determines the position of the monitored target, so that a specific target is tracked dynamically. The device has the advantages of low computational load, ease of implementation, and high positioning accuracy. With the device, multi-angle coverage monitoring can be realized with only a single camera, which enhances monitoring capability while reducing monitoring cost.
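
The time-delay positioning described above can be illustrated with a minimal sketch. The abstract does not state which delay estimator or angle formulas are used, so everything below is an assumption made for illustration: delays are taken at the peak of a plain cross-correlation, the source is treated as far-field (plane wave) and in front of the array, the three horizontal pickups are equally spaced by `spacing`, the fourth pickup sits `spacing` above the second, and the speed of sound is taken as 343 m/s. The names `estimate_delay` and `direction_from_delays` are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def estimate_delay(ref, sig, max_lag):
    """Return the lag in samples by which `sig` trails `ref`.

    The lag is chosen at the peak of the cross-correlation, searched
    over the range [-max_lag, max_lag]. A negative result means `sig`
    leads `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(sig):
                score += r * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def direction_from_delays(tau_13, tau_24, spacing, c=SPEED_OF_SOUND):
    """Convert two delays (seconds) into (horizontal, elevation) angles in degrees.

    tau_13 = arrival time at pickup 1 minus arrival time at pickup 3
             (the two ends of the horizontal line, 2 * spacing apart).
    tau_24 = arrival time at pickup 2 minus arrival time at pickup 4
             (the vertical pair, spacing apart).
    Far-field assumption: the wavefront is planar across the array.
    """
    def clamp(v):
        return max(-1.0, min(1.0, v))

    ux = clamp(c * tau_13 / (2.0 * spacing))  # direction component along the line
    uz = clamp(c * tau_24 / spacing)          # direction component pointing up
    uy = math.sqrt(max(0.0, 1.0 - ux * ux - uz * uz))  # forward component
    horizontal = math.degrees(math.atan2(ux, uy))
    elevation = math.degrees(math.asin(uz))
    return horizontal, elevation
```

In such a sketch, the two returned angles would be the set points for the horizontal angle motor and the elevation angle motor. A sample delay from `estimate_delay` is converted to seconds by dividing by the sampling rate before it is passed to `direction_from_delays`.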