Video surveillance system with vantage point transformation
By applying vantage point transformation and surveillance system controller analysis in a video surveillance system, the problem of combining video images from multiple cameras is solved, enabling a "bird's-eye view" display and information integration of the monitored area and improving the coverage and information processing capabilities of the surveillance system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HONEYWELL INTERNATIONAL INC
- Filing Date
- 2022-04-28
- Publication Date
- 2026-06-26
Figure CN115297295B_ABST
Abstract
Description
Technical Field
[0001] This disclosure relates generally to video surveillance systems. More specifically, this disclosure relates to video surveillance systems that enable vantage point transformations.
Background Technology
[0002] Many video surveillance systems employ cameras installed or otherwise deployed around a monitored area, such as a city, a portion of a city, a facility, or a building. Video surveillance systems may also include mobile cameras, such as drones carrying cameras. Each camera has a vantage point corresponding to its physical location and a field of view corresponding to what that camera can see from its physical location. In the case of mobile cameras, it should be understood that their vantage point can change. What would be desirable is a way of combining video images from multiple cameras and performing vantage point transformations to provide a "bird's-eye view" of the monitored area that can be controlled by an operator.
Summary of the Invention
[0003] This disclosure relates to video surveillance systems. In one example, a method for monitoring a monitored area is provided. The method includes receiving multiple video streams captured by multiple cameras in the monitored area, each of the multiple cameras having a camera vantage point and a camera field of view (FOV). Each of the multiple video streams captures one or more objects within the monitored area, the one or more objects being within the FOV of the corresponding camera and from the camera vantage point of that corresponding camera. A vantage point transformation is applied to each of two or more objects captured in two or more of the multiple video streams. The vantage point transformation transforms the view of each of the two or more objects from the camera vantage point of the corresponding camera to a common vantage point. The view of each of the two or more objects transformed from the common vantage point is then presented on a display. In some cases, the video streams from two or more cameras are stitched together to produce a view of one or more objects.
[0004] In another example, performing a search may include receiving a search query from an operator. In response, multiple videos may be searched for objects that satisfy the search query. The location of the matching object may be displayed at a first frame rate on a view transformed from a common vantage point. Operator input may be received that zooms in on a selected matching object from the common vantage point. The selected matching object from the zoomed-in common vantage point may be displayed at a second frame rate, which may be higher or lower than the first frame rate.
[0005] In another example, a method is provided for operating a surveillance system comprising multiple cameras. Each camera is configured to provide a video stream from a corresponding camera vantage point, and the surveillance system includes a surveillance system controller. The surveillance system controller may be provided centrally, such as in the cloud, or may be distributed in edge controllers at or near the individual cameras. The illustrative method includes the surveillance system controller analyzing the video streams provided by the multiple cameras to find a common object from a first video frame from a first video stream and a second video frame from a second video stream. The common object is marked with an object identifier. View information including view information from the camera vantage points of each of the first and second video streams is stored for the marked common object. A vantage point transformation is applied to the common object, transforming the view information of the common object from the camera vantage points of the first and second video streams to the common vantage point. The view information of the common object after the vantage point transformation is presented on a display. User input may be received to move the common vantage point to the updated common vantage point, and once moved, the vantage point transformation is applied using the updated common vantage point, and the view information of the common object after the updated common vantage point transformation is presented.
[0006] In another example, a drone is configured for use in a surveillance system. The drone includes a camera with a camera field of view (FOV), memory, a transceiver, and a controller operatively coupled to the camera, memory, and transceiver. The controller is configured to capture a first frame of video of an event at a first location using the camera and determine a flight path following that event. The controller is configured to cause the drone to fly along the flight path and capture a second frame of video of an event at a second location using the camera, wherein the second location is determined such that there is overlap in the camera field of view (FOV) between the first and second frames of the video. The controller is configured to transmit the resulting video via the transceiver.
[0007] In another example, a surveillance system is configured to provide monitoring of a monitored area. The surveillance system includes multiple cameras positioned within the monitored area. Each of the multiple cameras is configured to capture and store a video stream corresponding to its specific field of view. A surveillance system monitoring controller is operatively coupled to each of the multiple cameras via a high-speed wireless network. The surveillance system monitoring controller includes: a high-speed input configured to receive video streams from one or more of the multiple cameras via the high-speed wireless network; a memory operatively coupled to the high-speed input and configured to store the received video streams; and a controller operatively coupled to the high-speed input and the memory. The controller is configured to analyze each video stream to find a first common landmark between a first video frame from a first video stream and a second video frame from a second video stream. When the first common landmark is present in both the first video frame from the first video stream and the second video frame from the second video stream, the controller is configured to stitch the first video frame and the second video frame together and place the stitched image into a main image. When there is no first common landmark between the first video frame from the first video stream and the second video frame from the second video stream, the controller is configured to fill the main image with the first video frame and the second video frame, wherein each of the first and second video frames is positioned within the main image at a relative position corresponding to the physical position of the view included in that video frame. The controller is configured to transform the main image to a top view of the main image using an image transformation.
[0008] In another example, a method for operating a surveillance system is provided, the surveillance system including a surveillance system controller and a plurality of cameras configured to provide video streams. The illustrative method includes the surveillance system controller analyzing the video streams provided by the plurality of cameras to find common landmarks between first video frames from a first video stream and second video frames from a second video stream. When a common landmark exists in both the first video frames from the first video stream and the second video frames from the second video stream, the surveillance system controller stitches the first video frames and the second video frames together and places the stitched image into a master image. When there is no common landmark shared by the first video frames from the first video stream and the second video frames from the second video stream, the surveillance system controller fills the master image with the first video frames and the second video frames, wherein each of the first and second video frames is positioned within the master image at a relative position corresponding to the physical position of the view included in that video frame. The surveillance system controller then transforms the master image into a top view of the master image.
[0009] In another example, a drone is configured for use in a surveillance system that includes multiple cameras positioned within a monitored area. The drone includes a camera, memory, a cellular transceiver, and a controller operatively coupled to the camera, memory, and cellular transceiver. The controller is configured to receive instructions to fly to a specific location where an event is believed to have occurred and to capture a first video frame of the event using the camera. The controller is configured to cause the drone to fly to a second location away from the specific location to follow the event, to determine the timing for capturing a second video frame of the event such that there is sufficient overlap between the first and second video frames to allow them to be stitched together, and to capture the second video frame of the event. The controller is configured to stitch the first and second video frames together to create a stitched video image and to transmit the stitched video image via the cellular transceiver.
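As a non-limiting illustration of how capture timing could be tied to frame overlap, the following Python sketch computes the longest interval between two captures that still leaves a chosen overlap fraction. It assumes a downward-looking camera over flat ground and a constant ground speed; the function names, the 30% overlap figure, and the flat-ground model are assumptions made for illustration and are not taken from this disclosure.

```python
import math


def footprint_width(altitude_m, fov_deg):
    """Ground width seen by a downward-looking camera (flat-ground assumption)."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def max_capture_interval(altitude_m, fov_deg, speed_m_s, min_overlap):
    """Longest time between two captures that keeps at least `min_overlap`
    (a fraction between 0 and 1) of the footprint shared by both frames."""
    if not 0.0 <= min_overlap < 1.0:
        raise ValueError("min_overlap must be in [0, 1)")
    if speed_m_s <= 0.0:
        raise ValueError("speed_m_s must be positive")
    # The drone may move by at most the non-overlapping part of the footprint.
    max_shift = (1.0 - min_overlap) * footprint_width(altitude_m, fov_deg)
    return max_shift / speed_m_s


if __name__ == "__main__":
    # Hypothetical values: 50 m altitude, 90 degree FOV, 10 m/s, 30% overlap.
    print(max_capture_interval(50.0, 90.0, 10.0, 0.3))
```

A controller following a moving event would presumably re-evaluate this interval as altitude or speed changes; the sketch only covers the steady-state case.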
[0010] The foregoing summary is provided to facilitate understanding of the innovative features unique to this disclosure and is not intended as a complete description. A full understanding of this disclosure can be obtained by considering the entire specification, claims, drawings, and abstract as a whole.
Attached Figure Description
[0011] This disclosure can be more fully understood by considering the following description of various examples in conjunction with the accompanying drawings, in which:
[0012] Figure 1A is a schematic block diagram of an illustrative surveillance system;
[0013] Figure 1B is a schematic block diagram of an illustrative surveillance system;
[0014] Figures 2 to 6 are flowcharts showing illustrative methods that may be performed by the illustrative surveillance systems of Figures 1A and 1B;
[0015] Figure 7 is a schematic block diagram of an illustrative surveillance system controller;
[0016] Figure 8 is a flowchart showing an illustrative method that may be performed by the illustrative surveillance system controller of Figure 7;
[0017] Figure 9 is a flowchart showing an illustrative method;
[0018] Figure 10 is a schematic block diagram of an illustrative drone;
[0019] Figures 11 to 12 are flowcharts showing illustrative methods that may be performed by the illustrative drone of Figure 10;
[0020] Figures 13 to 15 are flowcharts showing illustrative methods; and
[0021] Figure 16 is a schematic block diagram showing a series of steps that can be performed when conducting a video search.
[0022] While this disclosure is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the accompanying drawings and will be described in detail. However, it should be understood that the intention is not to limit this disclosure to the specific examples described. Rather, the intention is to cover all modifications, equivalents, and alternatives that fall within the spirit and scope of this disclosure.
Detailed Implementation
[0023] The following description should be read with reference to the accompanying drawings, in which similar elements in different drawings are numbered in the same manner. The drawings are not necessarily drawn to scale and depict examples that are not intended to limit the scope of this disclosure. While examples of various elements are shown, those skilled in the art will recognize that many of the examples provided have suitable alternatives that can be utilized.
[0024] This document assumes that all numbers are modified by the term “about” unless otherwise explicitly stated. Expressions of numerical ranges using endpoints include all numbers contained within that range (e.g., 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.80, 4, and 5).
[0025] As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural references, unless otherwise expressly stated. As used in this specification and the appended claims, the term “or” is generally used in its sense including “and/or,” unless otherwise expressly stated.
[0026] It should be noted that references to "one embodiment," "some embodiments," or "other embodiments" in the specification indicate that the described embodiments may include specific features, structures, or characteristics, but each embodiment may not necessarily include that specific feature, structure, or characteristic. Furthermore, these phrases do not necessarily refer to the same embodiment. Additionally, when a specific feature, structure, or characteristic is described in conjunction with an embodiment, it is contemplated that, whether explicitly described or not, that feature, structure, or characteristic may be applied to other embodiments, unless expressly stated otherwise.
[0027] Figure 1A is a schematic block diagram of an illustrative surveillance system 10 configured to provide monitoring of a monitored area. The illustrative surveillance system 10 includes a surveillance system controller 12 and multiple cameras 14 positioned within the monitored area. The cameras 14 are individually labeled 14a, 14b, and 14c. Although a total of three cameras are shown, it should be understood that the surveillance system 10 may include, for example, hundreds or even thousands of cameras 14 positioned around a small city. At least some of the cameras 14 may be fixed cameras, meaning they are each mounted in a fixed location. At least some of the cameras 14 may be mobile cameras configured to move around within the monitored area. For example, at least some of the cameras 14 may be mounted on a drone configured to fly around the monitored area, thereby providing camera coverage at various locations and/or vertical positions within the monitored area. In some cases, mobile cameras may be dashcams of emergency vehicles, body cameras of emergency personnel (such as police officers), and/or portable or wearable devices carried by citizens. These are merely examples.
[0028] Each camera 14 has a vantage point 16 and a field of view (FOV) 18. The vantage points 16 are individually labeled 16a, 16b, and 16c, and the FOVs 18 are individually labeled 18a, 18b, and 18c. For each camera 14, its vantage point 16 may be at least partially defined by its installation location (if fixed) or its current location (if mobile). For example, a particular camera 14 may be installed on the exterior of a building at the intersection of First Street and Main Street at a height of ten feet. This installation location can be considered to define its vantage point 16. As another example, a particular camera 14 may be installed at the intersection of Third Street and Sixteenth Avenue at a height of twenty-five feet, such as on the exterior of a building or on a streetlight. This particular camera 14 (installed at a different location and at a different height of twenty-five feet) can be considered to have a vantage point 16 different from that of the camera installed at a height of ten feet. Cameras 14 mounted on drones will also have different vantage points 16.
[0029] Some of the cameras 14 may have a fixed field of view (FOV) 18, which is determined by how the camera is mounted and by the lens fitted to the camera. Some of the cameras 14 may have, for example, a 120-degree FOV or a 360-degree FOV. Some of the cameras 14 may have an adjustable FOV 18. For example, some of the cameras 14 may be pan-tilt-zoom (PTZ) cameras, whose FOV can be adjusted by adjusting one or more of the pan, tilt, and zoom of the particular camera 14.
[0030] The surveillance system controller 12 can be configured to control at least some aspects of the operation of the surveillance system 10. For example, the surveillance system controller 12 can be configured to provide instructions to at least some of the cameras 14 to transmit video, or to change one or more of the pan, tilt, and zoom of those cameras 14 that are PTZ cameras. The surveillance system controller 12 can be configured to control the operation of any mobile camera that is part of the surveillance system 10. The surveillance system controller 12 can be configured to perform many different methods. Figures 2 to 6 are flowcharts showing illustrative methods that can be coordinated by the surveillance system controller 12 and thus executed by the surveillance system 10.
[0031] Figure 1B is a schematic block diagram of an illustrative surveillance system 11 configured to provide monitoring of a monitored area. The surveillance system 11 may include multiple cameras 15 positioned within the monitored area. The cameras 15 are individually labeled 15a, 15b, 15c, and 15d. Although a total of four cameras 15 are shown, it should be understood that the surveillance system 11 may include, for example, hundreds or even thousands of cameras 15 positioned around a small city. At least some of the cameras 15 may be fixed cameras, meaning they are each mounted in a fixed location. At least some of the cameras 15 may be mobile cameras configured to move around within the monitored area. For example, at least some of the cameras 15 may be mounted on a drone configured to fly around the monitored area, thereby providing camera coverage at various locations and/or vertical positions within the monitored area. In some cases, mobile cameras may be dashcams of emergency vehicles, body cameras of emergency personnel (such as police officers), and/or portable or wearable devices carried by citizens. These are merely examples.
[0032] Each camera 15 has a position 17, which can be specified in terms of latitude and longitude, and a field of view (FOV) 18. The positions 17 are individually labeled 17a, 17b, 17c, and 17d, and the FOVs 18 are individually labeled 18a, 18b, 18c, and 18d. For each camera 15, its position 17 can be at least partially defined by its mounted location (if fixed) or its current location (if mobile). Some of the cameras 15 may have a fixed FOV 18, which is determined by how the camera is mounted and by the lens fitted to the camera. Some of the fixed cameras 15 may have, for example, a 120-degree FOV or a 360-degree FOV. Some of the cameras 15 may have an adjustable FOV 18. For example, some of the cameras 15 may be pan-tilt-zoom (PTZ) cameras, whose FOV can be adjusted by adjusting one or more of the pan, tilt, and zoom of the particular camera 15.
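As a non-limiting illustration, camera positions given as latitude and longitude could be converted into a local east/north coordinate frame before any geometric processing. The sketch below uses a simple equirectangular approximation around a reference point, which is reasonable only over city-scale distances. The function name, the choice of reference point, and the mean Earth radius constant are assumptions made for illustration and are not taken from this disclosure.

```python
import math

# Mean Earth radius in metres (a common approximation; assumption of this sketch).
EARTH_RADIUS_M = 6_371_000.0


def to_local_metres(lat_deg, lon_deg, origin_lat_deg, origin_lon_deg):
    """Approximate east/north offsets in metres of a point from a reference origin.

    Uses an equirectangular approximation, adequate over short distances
    such as a facility or a small city.
    """
    d_lat = math.radians(lat_deg - origin_lat_deg)
    d_lon = math.radians(lon_deg - origin_lon_deg)
    east = EARTH_RADIUS_M * d_lon * math.cos(math.radians(origin_lat_deg))
    north = EARTH_RADIUS_M * d_lat
    return east, north


if __name__ == "__main__":
    # Hypothetical camera one thousandth of a degree north of the origin.
    print(to_local_metres(40.001, -75.0, 40.0, -75.0))
```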
[0033] The surveillance system 11 includes one or more edge devices 19. Two edge devices 19 are shown, labeled 19a and 19b respectively. It should be understood that there may be a substantially larger number of edge devices 19. As shown, cameras 15a, 15b, and 15c are operatively coupled to edge device 19a, and camera 15d (and possibly other cameras) is operatively coupled to edge device 19b. In some cases, the edge devices 19 can provide some of the functions described with respect to the surveillance system controller 12 (Figure 1A). Some of the functions described with respect to the surveillance system controller 12 may be provided by a cloud-based server 13 and/or a computer or workstation 21 operably coupled to the edge devices 19 via the cloud-based server 13. The functions of the surveillance system controller 12 may be provided centrally, such as via the cloud-based server 13, or may be distributed among the edge devices 19, which may be located at or near the individual cameras 15.
[0034] In some cases, each edge device 19 may be an edge controller. In some cases, each edge device 19 may be configured to control the operation of each camera 15 operatively coupled to that particular edge device 19. A particular edge device 19 may be programmed with, or may otherwise learn, details associated with the particular cameras 15 operatively coupled to that edge device 19. As will be discussed, video stitching may occur, for example, within one or more of the edge devices 19, or may occur within the cloud-based server 13 and/or the computer or workstation 21.
[0035] One or more of the cloud-based server 13, the computer or workstation 21, and/or the edge devices 19 can be configured to control at least some aspects of the operation of the surveillance system 11. For example, one or more of the cloud-based server 13, the computer or workstation 21, and/or the edge devices 19 can be configured to provide instructions to at least some of the cameras 15 to transmit video, or to change one or more of the pan, tilt, and zoom of those cameras 15 that are PTZ cameras. One or more of the cloud-based server 13, the computer or workstation 21, and/or the edge devices 19 can be configured to control the operation of any mobile camera that is part of the surveillance system 11. One or more of the cloud-based server 13, the computer or workstation 21, and/or the edge devices 19 can be configured to perform a number of different methods. Figures 2 to 6 are flowcharts showing illustrative methods that can be coordinated by the surveillance system controller 12 and thus executed by the surveillance system 10. Alternatively, the illustrative methods shown in Figures 2 to 6 can be coordinated by, and thus carried out via, one or more of the cloud-based server 13, the computer or workstation 21, and/or the edge devices 19.
[0036] Figure 2 is a flowchart showing an illustrative method 20 for monitoring a monitored area. Method 20 includes receiving multiple video streams captured by multiple cameras (such as cameras 14) in the monitored area, each of the multiple cameras having a camera vantage point (such as vantage point 16) and a camera field of view (FOV) (such as FOV 18). Each of the multiple video streams captures one or more objects within the monitored area that are within the camera FOV of the corresponding camera, as seen from the camera vantage point of the corresponding camera, as shown in box 22. A vantage point transformation can be applied to each of two or more objects captured in two or more of the multiple video streams, transforming the view of each of the two or more objects from the camera vantage point 16 of the corresponding camera to a common vantage point, as shown in box 24. The common vantage point may correspond to a “bird’s-eye view” of the monitored area, which can be moved and/or otherwise controlled by an operator of the surveillance system. The transformation may include a geometric transformation from each camera vantage point to the common vantage point.
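One possible form of such a geometric transformation, offered only as a non-limiting sketch, is a planar homography that maps image points lying on the ground plane into a shared top-down coordinate frame. The Python example below estimates the homography from four hypothetical point correspondences with the direct linear transform and then applies it to image points. The use of a homography, the function names, and the sample coordinates are all assumptions made for illustration; this disclosure does not specify the transformation.

```python
import numpy as np


def estimate_homography(src, dst):
    """Direct linear transform from four or more point correspondences.

    `src` holds pixel coordinates in a camera image and `dst` holds the
    matching coordinates in the common top-down frame.
    """
    rows = []
    for (x, y), (u, v) in zip(np.asarray(src, float), np.asarray(dst, float)):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def to_common_view(h, points):
    """Map image points into the common frame using homography `h`."""
    pts = np.asarray(points, float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


if __name__ == "__main__":
    # Hypothetical calibration: a road section seen as a trapezoid in the image
    # corresponds to a 10 m by 20 m rectangle on the ground.
    image_corners = [(100, 400), (540, 400), (400, 200), (240, 200)]
    ground_corners = [(0, 0), (10, 0), (10, 20), (0, 20)]
    h = estimate_homography(image_corners, ground_corners)
    print(to_common_view(h, [(320, 400)]))
```

Each camera would need its own homography, and objects with height above the ground plane would be distorted by this simple model, so a practical system would likely need a richer camera model.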
[0037] In one example, suppose the camera vantage point 16 of a first camera 14 corresponds to a specific location and a height of ten feet, and the camera vantage point 16 of a second camera 14 corresponds to a location fifty yards south of the first camera 14 and a height of twenty-five feet. The common vantage point may correspond to the camera vantage point of one of the cameras 14, or it may correspond to a different location, such as a “bird’s-eye view” of the monitored area that can be moved and/or otherwise controlled by the operator of the surveillance system. It may be desirable for the common vantage point to correspond to a location between the first and second cameras 14, at a height intermediate between their heights. In another example, the common vantage point may correspond to a significantly greater height, such as a top view of the area or a desired bird’s-eye view.
[0038] The transformed view of each of the two or more objects from the common vantage point can be displayed on the monitor, as shown in box 26. In some cases, method 20 may further include displaying indicators on the monitor that indicate areas of the monitored area not within the camera FOV of any of the multiple cameras, as shown in box 28. Method 20 may additionally or alternatively include displaying indicators on the monitor that indicate portions of an object and/or areas of the monitored area not within the camera FOV of any of the multiple cameras, as shown in box 30. This informs the operator of areas of the monitored area that are not covered by the cameras of the surveillance system 10.
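As a non-limiting sketch of how uncovered areas might be found before an indicator is drawn, the Python example below divides a rectangular monitored area into grid cells and lists the cells whose centres fall outside every camera's field of view. The two-dimensional camera model (position, heading, angular FOV, and maximum range), the grid approach, and all names are assumptions made for illustration and are not taken from this disclosure.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera2D:
    """Simplified top-down camera model (hypothetical)."""
    x: float
    y: float
    heading_deg: float  # direction of the optical axis, measured from the +x axis
    fov_deg: float      # full angular width of the field of view
    max_range: float    # distance beyond which the camera is treated as blind


def covers(cam, px, py):
    """Return True if the point (px, py) lies within the camera's FOV wedge."""
    dx, dy = px - cam.x, py - cam.y
    dist = math.hypot(dx, dy)
    if dist > cam.max_range:
        return False
    if dist == 0.0:
        return True
    bearing = math.degrees(math.atan2(dy, dx))
    # Wrap the angular difference into [-180, 180) before comparing.
    off_axis = (bearing - cam.heading_deg + 180.0) % 360.0 - 180.0
    return abs(off_axis) <= cam.fov_deg / 2.0


def uncovered_cells(cameras, width, height, cell=1.0):
    """List (column, row) indices of grid cells not seen by any camera."""
    gaps = []
    for row in range(int(height / cell)):
        for col in range(int(width / cell)):
            cx, cy = (col + 0.5) * cell, (row + 0.5) * cell
            if not any(covers(cam, cx, cy) for cam in cameras):
                gaps.append((col, row))
    return gaps


if __name__ == "__main__":
    cams = [Camera2D(x=0.0, y=0.0, heading_deg=0.0, fov_deg=100.0, max_range=100.0)]
    print(uncovered_cells(cams, width=4.0, height=4.0))
```

The returned cells could then be shaded or outlined on the common view. Occlusion by buildings is ignored here.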
[0039] Figure 3 is a flowchart showing an illustrative method 32 for monitoring a monitored area. Method 32 includes receiving multiple video streams captured by multiple cameras (such as cameras 14) in the monitored area, each of the multiple cameras having a camera vantage point (such as vantage point 16) and a camera field of view (FOV) (such as FOV 18). Each of the multiple video streams captures one or more objects within the monitored area that are within the camera FOV of the corresponding camera, as seen from the camera vantage point of the corresponding camera, as shown in box 34. A vantage point transformation can be applied to each of two or more objects captured in two or more of the multiple video streams, transforming the view of each of the two or more objects from the camera vantage point of the corresponding camera to a common vantage point, as shown in box 36. In some cases, the view of each of the two or more objects is transformed and presented from the common vantage point at a first frame rate. The transformed view of each of the two or more objects from the common vantage point can be presented on a display, as shown in box 38.
[0040] In some cases, method 32 includes receiving user input to move the common vantage point to an updated common vantage point, as shown in box 40. In some cases, moving the common vantage point may include one or more of panning, zooming, tilting, and rotating. As an example, the common vantage point may be zoomed in to a zoomed-in and updated common vantage point. A vantage point transformation may be applied using the updated common vantage point, as shown in box 42. A view of each of two or more objects transformed from the updated common vantage point may be rendered, as shown in box 44. In some cases, views of two or more objects from the zoomed-in and updated common vantage point may be rendered at a second frame rate higher than the first frame rate.
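As a non-limiting sketch, the common vantage point could be modeled as an immutable viewport over a shared ground coordinate frame, with operator input producing an updated viewport that is then used to select which objects to transform and render. The class name, its fields, and the helper function are assumptions made for illustration; tilt and rotation are omitted for brevity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CommonVantagePoint:
    """Top-down viewport over shared ground coordinates (hypothetical model)."""
    center_x: float
    center_y: float
    view_width: float   # ground extent visible along x
    view_height: float  # ground extent visible along y

    def pan(self, dx, dy):
        """Return an updated vantage point shifted by (dx, dy)."""
        return replace(self, center_x=self.center_x + dx, center_y=self.center_y + dy)

    def zoom(self, factor):
        """Return an updated vantage point; factor > 1 zooms in."""
        if factor <= 0:
            raise ValueError("factor must be positive")
        return replace(
            self,
            view_width=self.view_width / factor,
            view_height=self.view_height / factor,
        )

    def contains(self, x, y):
        """Return True if ground point (x, y) is inside the viewport."""
        return (
            abs(x - self.center_x) <= self.view_width / 2.0
            and abs(y - self.center_y) <= self.view_height / 2.0
        )


def visible_objects(vantage, positions):
    """Return sorted identifiers of objects whose ground position is in view."""
    return sorted(oid for oid, (x, y) in positions.items() if vantage.contains(x, y))


if __name__ == "__main__":
    overview = CommonVantagePoint(50.0, 50.0, 100.0, 100.0)
    updated = overview.pan(5.0, 10.0).zoom(4.0)
    objects = {1: (10.0, 10.0), 2: (55.0, 60.0), 3: (60.0, 70.0)}
    print(visible_objects(overview, objects), visible_objects(updated, objects))
```

Because a zoomed-in viewport typically contains fewer objects, less transformation work is needed per frame, which is consistent with rendering the updated view at a higher frame rate.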
[0041] Figure 4 is a flowchart showing an illustrative method 46 for monitoring a monitored area. Method 46 includes receiving multiple video streams captured by multiple cameras (such as cameras 14) in the monitored area, each of the multiple cameras having a camera vantage point (such as vantage point 16) and a camera field of view (FOV) (such as FOV 18). Each of the multiple video streams captures one or more objects within the monitored area that are within the camera FOV of the corresponding camera, as seen from the camera vantage point of the corresponding camera, as shown in box 48. A vantage point transformation can be applied to each of two or more objects captured in two or more video streams. The vantage point transformation transforms the view of each of the two or more objects from the camera vantage point of the corresponding camera to a common vantage point, as shown in box 50. In some cases, the view of each of the two or more objects is transformed and rendered from the common vantage point at a first frame rate. The first frame rate can depend on the number of objects and the number of cameras of interest. The fewer the objects and/or cameras of interest (for example, a zoomed-in common vantage point with a smaller field of view can have fewer objects and fewer cameras of interest), the higher the frame rate can be. The transformed view of each of the two or more objects from the common vantage point can be displayed on the monitor, as shown in box 52.
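This disclosure states only that the frame rate can rise as the number of objects and cameras of interest falls; it gives no formula. Purely as a hypothetical sketch of one such relationship, the Python example below assumes a fixed per-second processing budget shared across every object and camera pair, clamped between a minimum and a maximum frame rate. The budget model, the parameter names, and the default values are all assumptions made for illustration.

```python
def render_frame_rate(num_objects, num_cameras,
                      budget_per_second=600.0, max_fps=30.0, min_fps=1.0):
    """Hypothetical frame rate under a fixed transformation budget.

    Assumes the work per rendered frame scales with objects times cameras,
    and that `budget_per_second` such units of work are available.
    """
    work_per_frame = max(1, num_objects) * max(1, num_cameras)
    return max(min_fps, min(max_fps, budget_per_second / work_per_frame))


if __name__ == "__main__":
    wide_view = render_frame_rate(num_objects=40, num_cameras=10)
    zoomed_view = render_frame_rate(num_objects=5, num_cameras=2)
    print(wide_view, zoomed_view)
```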
[0042] In some cases, method 46 may further include identifying one or more objects within the monitored area that are within the camera FOV of at least one of the multiple cameras, as shown in box 54. Method 46 may include determining when objects identified within the camera FOVs of two or more of the multiple cameras correspond to the same object, as shown in box 56. The object is labeled with an object identifier, as shown in box 58. View information for the labeled object is stored, including view information from the camera vantage point of each of the multiple cameras that captured the labeled object over time, as shown in box 60. The view information may, for example, include view information captured by different cameras and at different times. In some cases, the vantage point transformation transforms the view information to a common vantage point.
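As a non-limiting sketch of the identify, match, label, and store steps, the Python example below treats two detections as the same object when their positions in a shared ground coordinate frame lie within a matching radius, assigns an object identifier, and accumulates per-camera view records over time. Matching by ground-plane proximity, the class and field names, and the radius value are assumptions made for illustration; this disclosure does not specify how objects are matched across cameras.

```python
import itertools
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One sighting of an object by one camera (hypothetical record)."""
    camera_id: str
    timestamp: float
    ground_xy: tuple  # position in shared ground coordinates


class ObjectRegistry:
    """Assigns object identifiers and stores view information per object."""

    def __init__(self, match_radius=2.0):
        self.match_radius = match_radius
        self._next_id = itertools.count(1)
        self.last_seen = {}              # object identifier -> latest position
        self.views = defaultdict(list)   # object identifier -> list of Detection

    def observe(self, detection):
        """Label a detection with an existing or new object identifier."""
        for object_id, position in self.last_seen.items():
            if math.dist(position, detection.ground_xy) <= self.match_radius:
                break  # close enough to a known object: reuse its identifier
        else:
            object_id = next(self._next_id)
        self.last_seen[object_id] = detection.ground_xy
        self.views[object_id].append(detection)
        return object_id


if __name__ == "__main__":
    registry = ObjectRegistry(match_radius=2.0)
    first = registry.observe(Detection("14a", 0.0, (10.0, 5.0)))
    second = registry.observe(Detection("14b", 0.1, (10.5, 5.2)))
    print(first, second, [d.camera_id for d in registry.views[first]])
```

A practical system would likely also compare appearance features, since proximity alone can confuse objects that pass close to one another.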
[0043] Figure 5 is a flowchart showing an illustrative method 62 for monitoring a monitored area. Method 62 includes receiving multiple video streams captured by multiple cameras (such as cameras 14) in the monitored area, each of the multiple cameras having a camera vantage point (such as vantage point 16) and a camera field of view (FOV) (such as FOV 18). Each of the multiple video streams captures one or more objects within the monitored area that are within the camera FOV of the corresponding camera, as seen from the camera vantage point of the corresponding camera, as shown in box 64. A vantage point transformation can be applied to each of two or more objects captured in two or more video streams, transforming the view of each of the two or more objects from the camera vantage point of the corresponding camera to a common vantage point, as shown in box 66. In some cases, the view of each of the two or more objects is transformed and rendered from the common vantage point at a first frame rate. The first frame rate can depend on the number of objects and the number of cameras of interest. The fewer the objects and/or cameras of interest (for example, a zoomed-in common vantage point with a smaller field of view can have fewer objects and fewer cameras of interest), the higher the frame rate can be. The transformed view of each of the two or more objects from the common vantage point can be displayed on the monitor, as shown in box 68.
[0044] In some cases, method 62 further includes determining when the camera FOV of one of the multiple cameras overlaps with the camera FOV of another of the multiple cameras, as shown in box 70. When the camera FOV of one of the multiple cameras overlaps with the camera FOV of another of the multiple cameras, the corresponding video streams are stitched together, as shown in box 72.
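As a non-limiting sketch of the overlap determination, the Python example below approximates each camera's field of view by an axis-aligned rectangular footprint on the ground and reports the camera pairs whose footprints overlap, which are the pairs whose video streams would be candidates for stitching. The rectangular footprint model, the function names, and the sample coordinates are assumptions made for illustration; the stitching itself is not shown.

```python
from itertools import combinations


def overlap_area(a, b):
    """Overlap area of two footprints given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def stitch_pairs(footprints, min_area=0.0):
    """Return camera-identifier pairs whose footprints overlap by more than min_area."""
    return [
        (i, j)
        for i, j in combinations(sorted(footprints), 2)
        if overlap_area(footprints[i], footprints[j]) > min_area
    ]


if __name__ == "__main__":
    # Hypothetical ground footprints in metres for three cameras.
    footprints = {
        "14a": (0.0, 0.0, 10.0, 10.0),
        "14b": (8.0, 0.0, 18.0, 10.0),
        "14c": (30.0, 0.0, 40.0, 10.0),
    }
    print(stitch_pairs(footprints))
```

Raising `min_area` would restrict stitching to pairs with enough shared content for feature matching.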
[0045] Figure 6 is a flowchart showing an illustrative method 74 of operating a surveillance system including multiple cameras (such as cameras 14), each camera configured to provide a video stream from a corresponding camera vantage point. The surveillance system includes a surveillance system controller. Method 74 includes the surveillance system controller analyzing the video streams provided by the multiple cameras to find a common object in a first video frame from a first video stream and a second video frame from a second video stream, as shown in box 76. The common object is marked with an object identifier, as shown in box 78. View information, including view information from the camera vantage points of each of the first and second video streams, is stored for the marked common object, as shown in box 80. A vantage point transformation is applied to the common object, transforming the view information of the common object from the camera vantage points of the first and second video streams to a common vantage point, as shown in box 82. The transformed view information of the common object from the common vantage point is presented on a display, as shown in box 84.
[0046] In some cases, method 74 may further include receiving user input to move the common vantage point to an updated common vantage point and, once it has been moved, applying the vantage point transformation using the updated common vantage point and presenting the view information of the common object as transformed to the updated common vantage point, as shown in box 86. The view information of the common object as seen from the common vantage point may be presented at a first frame rate. The view information of the common object as seen from the updated common vantage point may be presented at a second frame rate different from the first frame rate. Method 74 may further include presenting an indicator on a display that indicates a region of the monitored area and/or of a common object that is not within the camera FOV of any of the multiple cameras.
[0047] Figure 7 is a schematic block diagram of an illustrative monitoring system controller 90. The illustrative monitoring system controller 90 can be considered an example of the monitoring system controller 12. The monitoring system controller 90 can be operatively coupled to a plurality of cameras 14 via a high-speed wireless network (such as, but not limited to, a 5G cellular network). When so provided, the monitoring system controller 90 includes a high-speed input 92 configured to receive video streams from one or more of the plurality of cameras via the high-speed wireless network. In some cases, the monitoring system controller 90 also includes a high-speed output 94 operatively coupled to the high-speed wireless network to output images such as video. In some cases, the high-speed input 92 and the high-speed output 94 can be combined to form a transceiver 96 that provides bidirectional communication. A memory 98 is operatively coupled to the high-speed input 92 and configured to store the received video streams. A controller 100 is operatively coupled to the high-speed input 92 and the memory 98. The controller 100 is configured to perform various methods, such as the methods outlined in Figure 8 and Figure 9.
[0048] Figure 8 is a flowchart illustrating an illustrative method 102 that can be coordinated by controller 100 and therefore executed by monitoring system controller 90. Controller 100 can be configured to analyze each video stream to find a first common landmark in a first video frame from a first video stream and a second video frame from a second video stream, as shown in box 104. When the first common landmark is present in both the first video frame from the first video stream and the second video frame from the second video stream, controller 100 is configured to stitch the first video frame and the second video frame together and place the stitched image into a main image, as shown in box 106. When the first common landmark is not present in both the first video frame from the first video stream and the second video frame from the second video stream, controller 100 is configured to fill the main image with the first video frame and the second video frame, wherein each of the first and second video frames is positioned within the main image at a relative position corresponding to the physical position of the view included in that frame, as shown in box 108. The main image is converted into a top view (or other bird's-eye view) of the main image, as shown in box 110.
[0049] In some cases, controller 100 may also be configured to analyze each video stream to find a second common landmark in a second video frame from a second video stream and a third video frame from a third video stream, as shown in box 112. When the second common landmark is present in both the second video frame from the second video stream and the third video frame from the third video stream, controller 100 may be configured to stitch the second video frame and the third video frame together and place the stitched image into the main image, as shown in box 114. When the second common landmark is not present in both the second video frame from the second video stream and the third video frame from the third video stream, controller 100 may be further configured to fill the main image with the third video frame from the third video stream, wherein the third video frame is positioned within the main image at a relative position corresponding to the physical position of the view included in the third video frame, as shown in box 116.
[0050] In some cases, as indicated, at least some of the cameras 14 may have an adjustable field of view (FOV). Controller 100 may be configured to send commands to adjust the FOV of one or more of the cameras 14 that have an adjustable FOV. When some of the cameras 14 are mobile cameras, such as cameras mounted on a drone, controller 100 may be configured to provide instructions to a particular mobile camera (e.g., a drone) to move to a particular location. In some cases, controller 100 and/or the mobile camera itself may be configured to estimate the optimal time for capturing a live video stream as one or more of the mobile cameras approach and/or pass through a particular location.
[0051] In some cases, controller 100 can be configured to display a position within a top view of the main image from any of a plurality of different vantage points. For example, the plurality of different vantage points may include different locations on the ground, that is, in the XY plane. As another example, the plurality of different vantage points may include different heights along the Z axis, which is orthogonal to the XY plane. Controller 100 can be configured to create and store multiple top views of the main image based on video streams received from multiple cameras over a period of time, and to display a selected top view of the main image corresponding to a time point within that time period. Controller 100 can be configured to display a position within the selected top view of the main image, corresponding to that time point, from any of a plurality of different vantage points.
[0052] Figure 9 is a flowchart illustrating an illustrative method 118 of operating a surveillance system comprising multiple cameras (such as camera 14) configured to provide video streams, the surveillance system including a surveillance system controller (such as surveillance system controllers 12, 90). Method 118 includes the surveillance system controller analyzing the video streams provided by the multiple cameras to find a common landmark in a first video frame from a first video stream and a second video frame from a second video stream, as shown in box 120. When a common landmark is present in both the first video frame from the first video stream and the second video frame from the second video stream, the surveillance system controller stitches the first video frame and the second video frame together and places the stitched image into a main image, as shown in box 122. When there is no common landmark between the first video frame from the first video stream and the second video frame from the second video stream, the surveillance system controller fills the main image with the first video frame and the second video frame, with each of the first and second video frames positioned within the main image at a relative position corresponding to the physical location of the view included in that frame, as shown in box 124. The surveillance system controller then transforms the main image into a top view of the main image, as shown in box 126.
[0053] Figure 10 is a schematic block diagram of a drone 128 configured for use in a monitoring system (such as monitoring system 10) that includes multiple cameras (such as camera 14) positioned within a monitored area. The drone 128 includes a camera 130, a memory 132, a transceiver 134, and a controller 136 operatively coupled to the camera 130, the memory 132, and the transceiver 134. In some cases, the camera 130 may be considered to have a camera field of view (FOV) 131. The transceiver 134 may be, for example, a cellular transceiver, such as, but not limited to, a 5G cellular transceiver. The controller 136 may be considered to be configured to control the operation of the drone and the camera 130, including the operation of the drone's flight capabilities. Figure 11 and Figure 12 are flowcharts illustrating illustrative methods that can be coordinated by the controller 136 and thus executed by the drone 128.
[0054] Figure 11 is a flowchart illustrating an illustrative method 138 that the controller 136 of the drone 128 can be configured to perform. A first frame of video of an event at a first location is captured using the camera, as shown in box 140. A flight path that follows the event is determined, as shown in box 142. The drone flies along the flight path, as shown in box 144. A second frame of video of the event at a second location is captured using the camera, wherein the second location is determined such that the camera's field of view (FOV) overlaps between the first and second frames of the video, as shown in box 146. The video is transmitted via the transceiver, as shown in box 148. In some cases, the frame rate of the video is dynamic and may depend on the speed at which the drone flies along the flight path. In some cases, the frame rate may depend on the altitude at which the drone flies along the flight path.
[0055] Figure 12 is a flowchart illustrating an illustrative method 150 that the controller 136 of the drone 128 can be configured to perform. The drone receives instructions to fly to a specific location where an event is believed to be occurring, as shown in box 152. A first video of the event is captured using the camera, as shown in box 154. The controller 136 is configured to fly the drone to a second location away from the specific location to follow the event, as shown in box 156. The controller 136 is configured to determine the timing of capturing a second video of the event such that there is sufficient overlap between the first and second videos to allow them to be stitched together, as shown in box 158. The controller 136 is configured to capture the second video of the event, as shown in box 160, and to stitch the first and second videos together to create a stitched video, as shown in box 162. The controller 136 is configured to transmit the stitched video via a cellular transceiver (e.g., a 5G transceiver), as shown in box 164.
[0056] Figure 13 is a flowchart illustrating an illustrative method 166 for determining when and how video images should be stitched together. As can be seen, method 166 can process video provided by any of the fixed camera 168, PTZ camera 170, and drone camera 172, particularly if drone camera 172 is carried by a drone at a similar altitude to the fixed camera 168 and/or PTZ camera 170. In some cases, a drone camera 172 with a similar field of view (FOV) may also be used. At box 174, cameras whose images can and should be stitched together are identified. Box 174 receives as input details about the area of interest (as shown in box 176) and the physical locations of the cameras (as shown in box 178). The area of interest can be defined by the field of view (FOV) of the specific bird's-eye view currently selected by the operator of monitoring system 10. Cameras whose images should be stitched together can include all cameras with an FOV that overlaps the FOV of the bird's-eye view currently selected by the operator. As the operator zooms in, the FOV of the currently selected bird's-eye view decreases, and thus fewer cameras are identified as cameras whose images should be stitched together. When fewer cameras are involved, a given set of processing resources can stitch more frames together per unit of time, and the stitched video can be displayed on the operator's monitor at a higher frame rate.
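The relationship between the operator's selected view, the number of cameras to stitch, and the achievable frame rate can be sketched as follows. This is an illustration only: it assumes rectangular ground footprints and a hypothetical processing budget expressed as the number of camera frames that can be stitched per second; neither the budget model nor the function names come from the disclosure.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max


def rects_overlap(a: Rect, b: Rect) -> bool:
    """True when two ground rectangles share a region of positive area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def cameras_to_stitch(view: Rect, footprints: Dict[str, Rect]) -> List[str]:
    """Cameras whose FOV overlaps the bird's-eye view selected by the operator."""
    return sorted(name for name, fp in footprints.items() if rects_overlap(view, fp))


def estimated_frame_rate(camera_count: int,
                         stitched_frames_per_second: float,
                         max_rate: float = 30.0) -> float:
    """Output frame rate under a fixed stitching budget (simplified model)."""
    if camera_count == 0:
        return 0.0
    return min(max_rate, stitched_frames_per_second / camera_count)


# Hypothetical layout: four cameras, a wide view, and a zoomed-in view.
FOOTPRINTS = {
    "a": (0, 0, 10, 10),
    "b": (10, 0, 20, 10),
    "c": (20, 0, 30, 10),
    "d": (0, 10, 10, 20),
}
wide = cameras_to_stitch((0, 0, 30, 20), FOOTPRINTS)
zoomed = cameras_to_stitch((2, 2, 8, 8), FOOTPRINTS)
wide_rate = estimated_frame_rate(len(wide), 60.0)
zoomed_rate = estimated_frame_rate(len(zoomed), 60.0)
```

In this simplified model the zoomed-in view involves a single camera, so the same budget yields a higher frame rate than the wide view that involves all four.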
[0057] In some cases, camera positions can be specified based on latitude and longitude, but other identification criteria, such as GPS coordinates, can also be used. The camera view overlay module 180 determines which camera views overlap and which do not. The overlapping views are stitched together in the image stitching module 182 to produce a stitched image or stitched frame 184. The non-overlapping views are either aggregated as is into the final view or transformed into a top view or other bird's-eye view.
[0058] Figure 14 is a flowchart illustrating an illustrative method 186 for stitching views together, and can be considered to provide additional details about the functionality of the camera view overlay module 180 and the image stitching module 182. Illustrative method 186 takes as input the horizontal field of view (HFOV) and vertical field of view (VFOV) of each camera according to its specifications, and the physical location of each camera, as shown in box 188. Google Maps or a similar engine can be used to identify landmarks and their physical locations, as shown in box 190. These are provided as input to box 192, where it is determined whether, and by how much, the landmarks overlap in the various camera views. This can be determined at least in part by identifying the landmark positions within each camera view. If there is no overlap between the landmarks in the various camera views, control proceeds to box 194, where the main image is filled with camera view details. Conversely, if the landmarks overlap in some camera views, control proceeds to decision box 196, where it is determined whether the overlap is significant. If so, control proceeds to box 198, where stitching is performed. At box 200, a transformation to a top view or other bird's-eye view is performed according to a known method, and the result is viewed as shown in box 202.
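The decision flow of boxes 192 through 198 can be sketched as a small function. The text does not define "significant overlap" or say what happens when overlap exists but is not significant, so the overlap measure, the 0.3 threshold, and the fallback to filling the main image are all assumptions made for illustration.

```python
from typing import Set


def landmark_overlap(view_a: Set[str], view_b: Set[str]) -> float:
    """Fraction of the smaller landmark set that is visible in both views."""
    if not view_a or not view_b:
        return 0.0
    return len(view_a & view_b) / min(len(view_a), len(view_b))


def choose_action(view_a: Set[str], view_b: Set[str], significant: float = 0.3) -> str:
    """Return 'stitch' for significant landmark overlap, otherwise 'fill'.

    'fill' means the views are placed into the main image by physical
    position. Treating a small but non-zero overlap as 'fill' is an assumption.
    """
    if landmark_overlap(view_a, view_b) >= significant:
        return "stitch"
    return "fill"
```

For example, two views that share two of three landmarks would be stitched under this threshold, while views with no landmark in common would simply be placed into the main image.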
[0059] Figure 15 is a flowchart illustrating illustrative method 204. As shown at 206, the drone flies from position P1 at time t1 to position P2 at time t2. At time t1, the drone's field of view extends from A to B. At time t2, the drone's field of view extends from C to D. Method 204 includes determining the drone's altitude at position P1, as shown in box 208, and the camera FOV, as shown in box 210. A first image is captured at position P1, as shown in box 212. The maximum range is calculated at box 214, and position E is determined as shown in box 216 (refer to 206). In some cases, position E may correspond to 80 percent of the maximum range divided by two. This allows the drone to travel a certain distance between images while still leaving sufficient overlap between successive images. At box 218, the predicted position P2 is determined. The inputs can include the drone's position at time t1 (such as latitude and longitude), as shown in 218a; the movement speed, as shown in 218b; the drone's altitude, as shown in 218c; and the drone's yaw, pitch, and roll, as shown in 218d. A second image is captured at the predicted position P2, as shown in box 220. The first and second images are provided to image stitching algorithm 222, and then to box 224, which provides the panoramic view.
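The spacing arithmetic described above can be sketched numerically. The sketch assumes a camera pointing straight down, interprets the maximum range as the ground length covered along the flight direction, and treats position E as an offset from P1; these interpretations, along with the function names, are assumptions rather than statements of the disclosed method.

```python
import math


def ground_range(altitude_m: float, fov_deg: float) -> float:
    """Ground length covered along the flight direction (nadir camera assumed)."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def offset_to_e(max_range_m: float, fraction: float = 0.8) -> float:
    """Offset of position E: a fraction (80 percent) of half the maximum range."""
    return fraction * max_range_m / 2.0


def seconds_until_capture(offset_m: float, speed_m_s: float) -> float:
    """Time for the drone to cover the offset at constant speed."""
    if speed_m_s <= 0:
        raise ValueError("speed must be positive")
    return offset_m / speed_m_s


def overlap_fraction(offset_m: float, max_range_m: float) -> float:
    """Share of the first image's ground range still visible in the second."""
    return max(0.0, 1.0 - offset_m / max_range_m)
```

As a usage example under these assumptions, a drone at 50 m with a 90 degree FOV covers about 100 m of ground, position E lies about 40 m from P1, and at 10 m/s the second image would be taken roughly 4 seconds later with about 60 percent overlap.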
[0060] Figure 16 is a schematic block diagram illustrating a series of steps 226 that can be performed as part of a search. At block 228, a search query can be received from the operator. The search query can take various forms. For example, it may include a request to search for a man 60 to 75 inches tall, wearing a red shirt and carrying a black briefcase. As another example, it may include a request to search for a yellow car seen passing through a specific intersection within a specific time window, or a request to search for a gathering of more than twenty people. At block 230, video is searched to find objects matching the search query. At block 232, the locations of the matching objects are displayed at a first frame rate on a view transformed to a common vantage point. At block 234, user input is received that zooms the common vantage point in on a selected one of the matching objects. At block 236, the selected matching object is displayed from the zoomed-in vantage point at a second frame rate. In some cases, the second frame rate may be higher than the first frame rate.
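Blocks 228 and 230 can be illustrated with a simple attribute filter over stored object records. The record layout (`PersonSighting`) and the function `search_people` are hypothetical; the disclosure does not describe how detected attributes are stored or matched.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class PersonSighting:
    object_id: int
    height_in: float                 # estimated height in inches
    shirt_color: str
    carrying: Optional[str]          # e.g. "black briefcase", or None
    position: Tuple[float, float]    # location in the common view


def search_people(sightings: Iterable[PersonSighting],
                  min_height_in: float,
                  max_height_in: float,
                  shirt_color: str,
                  carrying: Optional[str]) -> List[PersonSighting]:
    """Return sightings that match every attribute in the query."""
    return [
        s for s in sightings
        if min_height_in <= s.height_in <= max_height_in
        and s.shirt_color == shirt_color
        and s.carrying == carrying
    ]
```

The `position` of each match is what would be marked on the common-vantage-point view at block 232.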
[0061] Having thus described several illustrative embodiments of this disclosure, those skilled in the art will readily understand that other embodiments can be made and used within the scope of the appended claims. It should be understood, however, that this disclosure is, in many respects, only illustrative. Changes may be made to details, particularly those relating to shape, size, arrangement of parts, and exclusion and order of steps, without departing from the scope of this disclosure. The scope of this disclosure is, of course, defined by the language in which the appended claims are expressed.
Claims
1. A method for monitoring a monitored area, the method comprising: receiving a plurality of video streams captured by a plurality of cameras in the monitored area, wherein each of the plurality of cameras has a camera advantageous viewpoint and a camera field of view (FOV), and each of the plurality of video streams captures one or more objects in the monitored area, the one or more objects being within the camera FOV of the corresponding camera and viewed from the camera advantageous viewpoint of the corresponding camera; applying an advantageous viewpoint transformation to each of two or more objects captured in two or more video streams of the plurality of video streams, the advantageous viewpoint transformation transforming the view of each of the two or more objects from the camera advantageous viewpoint of the corresponding camera to a common advantageous viewpoint; presenting on a display the view of each of the two or more objects as transformed to the common advantageous viewpoint; receiving user input to move the common advantageous viewpoint to an updated common advantageous viewpoint and, once moved, applying the advantageous viewpoint transformation using the updated common advantageous viewpoint and presenting the view of each of the two or more objects as transformed to the updated common advantageous viewpoint; wherein the view of each of the two or more objects is transformed and rendered from the common advantageous viewpoint at a first frame rate; and wherein the view of each of the two or more objects is transformed and rendered from the updated common advantageous viewpoint at a second frame rate, the second frame rate being different from the first frame rate.
2. The method of claim 1, wherein moving the common advantageous viewpoint comprises one or more of translation, zooming, and rotation.
3. The method according to claim 2, wherein the method further comprises: zooming in the common advantageous viewpoint to a zoomed-in updated common advantageous viewpoint; and transforming and presenting the view of the two or more objects from the zoomed-in updated common advantageous viewpoint at a second frame rate higher than the first frame rate.
4. The method of claim 1, further comprising presenting an indicator on the display, the indicator indicating an area of the monitored area that is not within the camera FOV of any of the plurality of cameras.
5. The method of claim 1, further comprising presenting an indicator on the display, the indicator indicating an area of the object not within the camera FOV of any of the plurality of cameras.
6. The method according to claim 1, wherein the method further comprises: identifying one or more objects within the monitored area, wherein the one or more objects are within the camera field of view (FOV) of at least one of the plurality of cameras; determining when objects identified within the FOVs of two or more of the plurality of cameras correspond to the same object; marking the object with an object identifier; and storing view information of the marked object, including view information from the camera advantageous viewpoint of each of the two or more of the plurality of cameras that capture the marked object.
7. The method according to claim 1, wherein the method further comprises: determining when the FOV of one of the plurality of cameras overlaps with the FOV of another of the plurality of cameras; and when the FOV of one of the plurality of cameras overlaps with the FOV of another of the plurality of cameras, stitching the corresponding video streams together.
8. A monitoring system, the monitoring system comprising: a plurality of cameras, each configured to provide a video stream from a corresponding camera advantageous viewpoint; and a monitoring system controller operatively coupled to the plurality of cameras and configured to: analyze the video streams provided by the plurality of cameras to find a common object captured in both a first video frame from a first video stream and a second video frame from a second video stream; mark the common object with an object identifier; store view information of the marked common object, including view information from the camera advantageous viewpoint of each of the first video stream and the second video stream; apply an advantageous viewpoint transformation to the common object, the advantageous viewpoint transformation transforming the view information of the common object from the camera advantageous viewpoints of the first video stream and the second video stream to a common advantageous viewpoint; present on a display the view information of the common object as transformed to the common advantageous viewpoint; receive user input to move the common advantageous viewpoint to an updated common advantageous viewpoint and, once moved, apply the advantageous viewpoint transformation using the updated common advantageous viewpoint and present the view information of the common object as transformed to the updated common advantageous viewpoint; and present the transformed view information of the common object from the common advantageous viewpoint at a first frame rate, and present the transformed view information of the common object from the updated common advantageous viewpoint at a second frame rate different from the first frame rate.