System and method for background modeling of video streams

By rapidly resetting the background model when the camera stops moving and using image segmentation and object detection algorithms, the problems of classification errors and processing power limitations in background modeling with moving cameras are solved, achieving robust and energy-efficient background modeling.

CN119090913B · Active · Publication Date: 2026-06-19 · AXIS

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: AXIS
Filing Date: 2024-05-22
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

For cameras with mobility, traditional background modeling methods struggle to distinguish between changes caused by camera movement and object movement in the scene, leading to errors in background and foreground classification. Furthermore, with limited processing power, it is difficult to effectively update the background model.

Method used

The background model is repeatedly updated by analyzing changes in the image frames. When the camera stops moving, a fast reset is performed in which image segmentation and / or object detection algorithms identify the foreground objects.

Benefits of technology

Robust and energy-efficient background modeling is achieved during camera movement, quickly generating effective background models suitable for mobile cameras, especially in situations with limited power.

Generated by Eureka AI based on patent content.


Abstract

This disclosure relates to systems and methods for background modeling of video streams. Specifically, a method for background modeling of video streams acquired by a mobile camera is disclosed. The method includes: acquiring the video stream; repeatedly updating a background model of the video stream by analyzing changes in a sequence of image frames and classifying time-invariant image regions in the image frames as background; detecting camera movement; and resetting the background model by applying an image segmentation and / or object detection algorithm to identify at least foreground objects; and, after resetting the background model, returning to repeatedly updating the background model of the video stream by analyzing changes in a sequence of image frames and classifying time-invariant image regions in the image frames as background. This disclosure further relates to an image processing system.

Description

Technical Field

[0001] This disclosure relates to a method for background modeling of video streams acquired by a mobile camera. This disclosure further relates to an image processing system for implementing this method.

Background Technology

[0002] Video processing is a special case of signal processing. It uses hardware, software, or a combination thereof to process sequences of images representing video. In video processing, various processing tasks require preprocessing. A common step in this process is defining a background model and applying it to the video sequence. Background modeling can be described as dividing image frames into parts or regions that are classified as background or foreground. The resulting background and foreground segmentation allows further video processing to focus on the image frame portions relevant to certain processing tasks. For example, some processing tasks may only need to be performed in the foreground, while other processing tasks are only relevant to the background.

[0003] One type of background modeling is based on motion analysis within a video stream. If a portion of an image does not show significant change, it can be considered part of the background. This type of background modeling is an affordable and effective way to provide background models useful for a variety of applications. It can be used, for example, in background overlay applications and dynamic privacy masking applications. For fixed cameras, traditional motion-based background modeling in the video stream, used in conjunction with, for example, configurable background merging timing, is generally considered a fairly effective solution.

[0004] However, background modeling is often more challenging for cameras with mobility capabilities, such as pan-tilt-zoom (PTZ) cameras. PTZ cameras are cameras whose orientation and zoom can both be controlled. Because such cameras can move rapidly, and the imaged scene changes with them, traditional background modeling can struggle to correctly identify the background and foreground in a video stream by analyzing these changes. This is because motion-based background modeling cannot distinguish between changes caused by camera movement and changes caused by the movement of objects in the scene. Therefore, when using this modeling approach, there is a risk that some or all of the stationary parts of the scene are incorrectly classified as foreground when the camera moves.

[0005] In addition to the aforementioned challenges of background modeling with mobile cameras, such cameras often come with other requirements or limitations. For example, they may have limited processing power, especially while the camera is moving, which makes otherwise possible alternatives infeasible or unsuitable. Therefore, there is a need to improve background modeling of video streams acquired by mobile cameras.

Summary of the Invention

[0006] This invention relates to a method for background modeling of video streams acquired by a mobile camera, the method comprising:

[0007] acquiring a video stream containing a sequence of image frames;

[0008] repeatedly updating a background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background;

[0009] detecting camera movement and, optionally, stopping the repeated updating of the background model;

[0010] when the camera movement stops or falls below a movement threshold, resetting the background model by: acquiring one or more image frames; applying an image segmentation and / or object detection algorithm to the one or more image frames to identify at least foreground objects; and classifying the region outside the identified foreground objects as background; and

[0011] after resetting the background model, returning to repeatedly updating the background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background.

[0012] The inventors have realized that robust and energy-efficient background modeling can be achieved for a mobile camera by combining two approaches to background modeling. By rapidly resetting the background model when the camera stops moving, background modeling can quickly obtain a valid background, so that further processing of image frames that requires a background model can begin. Preferably, the step of repeatedly updating the background model by analyzing changes in a sequence of image frames and classifying image regions in the image frames that do not change over time as background is a continuous process, while the resetting of the background model is a single event. The image segmentation and / or object detection algorithms used to reset the background model typically involve neural networks or other suitable artificial intelligence (AI)-based methods. While AI-based methods excel at identifying background and foreground objects, they may also require more processing power. In the example of a moving PTZ camera, the motor may cause significant power spikes during movement. Furthermore, when the camera is not moving, there are often power limitations depending on how the camera is powered.

[0013] The inventors have found a hybrid solution that addresses the aforementioned challenges. In this solution, conventional modeling is used to repeatedly update the background model of the video stream as long as there is no movement or only limited movement, while image segmentation and / or object detection algorithms are used to perform a rapid reset when the camera stops.

[0014] Several options are available during fast movement. One is to stop maintaining the background model during fast movement. In applications involving, for example, static background overlays, where an overlay anchored at a static position is normally drawn only over the background area and not over the foreground area, the overlay may be made unconditionally visible, i.e., drawn over both the background and the foreground. In the case of dynamic masking, a full-screen mask can be temporarily applied during fast movement. As soon as the camera stops, a single quick reset of the background model can be performed to restore the background.

[0015] The present invention further relates to an image processing system, comprising:

[0016] at least one camera, such as a pan-tilt-zoom camera; and

[0017] a processing unit configured to:

[0018] acquire a video stream containing a sequence of image frames;

[0019] repeatedly update a background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background;

[0020] detect camera movement and, optionally, stop repeatedly updating the background model;

[0021] when the camera movement stops or falls below a movement threshold, reset the background model by: acquiring one or more image frames; applying an image segmentation and / or object detection algorithm to the one or more image frames to identify foreground objects; and classifying the region outside the identified foreground objects as background; and

[0022] after resetting the background model, return to repeatedly updating the background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background.

[0023] The system may further include a display for showing a sequence of image frames from the video stream. Typically, further processing, such as dynamic privacy masking, overlay, or any type of image / video processing, will be applied to the displayed video stream. Dynamic privacy masking may include masking one or more foreground regions, i.e., moving objects in the scene. Conversely, static privacy masking is a fixed-position mask, such as masking a user-defined image region in the scene. A fixed-position mask can be used to mask, for example, windows of a building.

[0024] The term "overlay" refers to added content or information, typically superimposed on an image frame. Overlays may be limited to covering a background area. Overlaid content or information can include, but is not limited to, text, images, symbols, graphics, charts, statistics, boxes, etc. Image processing systems can be used, for example, in camera-based surveillance systems.

[0025] Those skilled in the art will recognize that the currently disclosed methods for background modeling of video streams acquired by mobile cameras can be performed using any embodiment of the currently disclosed image processing system, and vice versa.

Attached Figure Description

[0026] Various embodiments are described below with reference to the accompanying drawings. The drawings are examples of embodiments intended to illustrate some features of the currently disclosed methods and systems for background modeling of video streams, and do not limit the currently disclosed methods and systems.

[0027] Figure 1 shows a flowchart of an embodiment of the currently disclosed method for background modeling of video streams.

[0028] Figures 2A and 2B are schematic diagrams of an image frame in a video sequence, with a background model that defines whether a spatial region in the image frame belongs to the background or the foreground.

[0029] Figure 3 is a schematic diagram of image frames in a video sequence in which a foreground object transitions to the background.

[0030] Figure 4 shows an example of image frames in a video sequence as the camera moves.

[0031] Figure 5 shows an image frame in which multiple objects have been detected and bounding boxes have been added.

[0032] Figure 6 shows an example of a conceptual framework for object detection using neural networks.

[0033] Figure 7 shows a schematic diagram of an embodiment of the currently disclosed image processing system.

Detailed Implementation

[0034] This disclosure relates to a method for background modeling of video streams acquired by a mobile camera. Figure 1 shows a flowchart of an embodiment of the currently disclosed method 100 for background modeling of a video stream acquired by a mobile camera. Method 100 includes the following steps:

[0035] Obtain a video stream containing a sequence of image frames (101);

[0036] The background model of the video stream is repeatedly updated by analyzing the changes in the sequence of image frames and classifying the image regions in the image frames that do not change over time as background (102);

[0037] Detect camera movement (103), and optionally, stop repeatedly updating the background model;

[0038] When the camera movement stops or the camera movement is below the movement threshold, the background model is reset by the following steps: acquiring one or more image frames; applying image segmentation and / or object detection algorithms to one or more image frames to identify at least foreground objects (104); classifying the region other than the identified foreground objects as background (105);

[0039] After the background model has been reset, the process returns to repeatedly updating the background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background (106).
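
The control flow of steps 102 to 106 can be sketched as a small state machine. This is a minimal illustration only, not the patented implementation: the names `run_hybrid`, `Mode`, `update`, and `reset` are hypothetical, and the per-frame camera speed and the threshold are assumed to be supplied by the caller.

```python
from enum import Enum, auto


class Mode(Enum):
    UPDATING = auto()  # steps 102/106: conventional motion-based updating
    MOVING = auto()    # step 103: camera movement detected, updating paused


def run_hybrid(frames, speeds, threshold, update, reset):
    """Drive the two background-modeling paths from a per-frame camera speed.

    `update(frame)` stands in for the conventional repeated update and
    `reset(frame)` for the segmentation/detection-based reset (steps 104-105).
    Both are hypothetical callables supplied by the caller.
    """
    mode = Mode.UPDATING
    log = []
    for frame, speed in zip(frames, speeds):
        if mode is Mode.UPDATING and speed > threshold:
            mode = Mode.MOVING  # movement detected: stop updating
        if mode is Mode.MOVING:
            if speed <= threshold:
                reset(frame)  # single reset event when movement ends
                mode = Mode.UPDATING
                log.append("reset")
            else:
                log.append("paused")
        else:
            update(frame)
            log.append("update")
    return log


if __name__ == "__main__":
    print(run_hybrid("abcdef", [0, 0, 5, 5, 0, 0], 1, print, print))
```

In this sketch the reset happens exactly once per movement episode, on the first frame whose speed is at or below the threshold, after which conventional updating resumes.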

[0040] "Background model" is generally considered a term understood by those skilled in the art within the context of this disclosure. In the context of this disclosure, "background model" should be interpreted as a data model that determines whether a spatial region in a video sequence belongs to the background or foreground of the video sequence. The data model may store this information or process it in any suitable manner. "Background" can include any region of an image frame whose image data, for example, is sufficiently similar to a corresponding region in a previous image frame in terms of pixel data. For example, the brightness and / or color intensity of pixels or groups of pixels between image frames in a video stream can be compared. Typically, the background is intended to correspond to areas in a monitored scene that are not particularly relevant from an image analysis perspective. Similarly, "foreground" can include any region of an image frame whose image data is very dissimilar to a corresponding region in a previous image frame. Foreground objects are typically in motion. In a practical and simplified context, the foreground can correspond to a monitored object, such as a person, car, cargo, etc. In the context of this disclosure, "spatial region" can be interpreted as any number of pixels or subpixels in an image frame, which can be further grouped according to, for example, the shape of the object or the object to which it belongs.

[0041] The process of detecting camera movement can include determining whether the camera movement exceeds a predefined limit. If the movement is only slow, traditional background modeling can still function correctly.

[0042] When camera movement is detected or the camera movement exceeds a predefined limit, the repeated updating of the background model can be paused. When the camera moves rapidly, maintenance of the background model may be stopped for the duration of the movement. One embodiment of the currently disclosed background modeling method further includes the following step: upon detection of camera movement, but before detection that the camera movement has stopped or fallen below a movement threshold, making a background overlay visible over an image region covering the entire image frame and / or applying full masking and / or pixelation to an image region covering the entire image frame. In other words, in the case of a background overlay, one option is to make the entire overlay visible. In the case of dynamic privacy masking, the system can switch to full-screen masking. Once the camera movement ends, a reset of the background model is performed, and the process can return to repeatedly updating the background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background.
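
The fallback behavior during movement can be illustrated with two small helper functions. This is a sketch under assumptions: the background model is represented as a grid of booleans (`True` for background), and the function names `overlay_visibility` and `privacy_mask` are hypothetical.

```python
def overlay_visibility(background_mask, camera_moving):
    """Return a grid that is True wherever the background overlay is drawn.

    Normally the overlay is drawn only over background regions; while the
    camera moves, the whole overlay is made visible.
    """
    if camera_moving:
        return [[True] * len(row) for row in background_mask]
    return [list(row) for row in background_mask]


def privacy_mask(background_mask, camera_moving):
    """Return a grid that is True wherever the privacy mask is applied.

    Normally only foreground regions are masked (dynamic masking); while
    the camera moves, a full-screen mask is applied instead.
    """
    if camera_moving:
        return [[True] * len(row) for row in background_mask]
    return [[not is_background for is_background in row] for row in background_mask]


if __name__ == "__main__":
    model = [[True, False], [True, True]]
    print(privacy_mask(model, camera_moving=False))
    print(privacy_mask(model, camera_moving=True))
```

Both helpers ignore the background model entirely while the camera moves, which is why the model does not need to be maintained during that time.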

[0043] The following steps can be sequential: repeatedly update the background model; detect camera movement; detect when camera movement stops or when camera movement falls below a movement threshold; and perform a reset of the background model.

[0044] Repeatedly update the background model of the video stream.

[0045] The step of repeatedly updating the background model of a video stream by analyzing changes in a sequence of image frames and classifying image regions in the image frames that do not change over time as background may include analyzing sub-regions of a sequence of image frames, wherein if there are no significant changes between image frames in a sub-region during a predefined time period, the sub-region is classified as a background sub-region.

[0046] Figures 2A, 2B, and 3 show an example of the conventional background modeling used in the currently disclosed method.

[0047] Figure 2A shows an image frame 200 from a video sequence. The video sequence has a background model that defines whether a spatial region in image frame 200 belongs to background 201 or foreground 202. In this embodiment, any significant change in background 201 transforms the corresponding spatial region to foreground 202. Spatial regions whose image data remain substantially unchanged over a sufficiently long period of time can be transformed into background.

[0048] For example, image frame 200 can be divided into multiple spatial regions 203, and each spatial region 203 can be estimated as background or foreground depending on whether the spatial region changes. Figure 2B shows image frame 200 of Figure 2A overlaid with a schematic illustration of an example of a first algorithm that includes multiple timers 204. Each timer 204 is associated with one spatial region 203 of the plurality of spatial regions 203 in image frame 200. If the image data of a spatial region 203 of image frame 200 has not changed significantly relative to the image data of said spatial region 203 of the previous image frame 200 before a predefined time limit, then said spatial region 203 is defined as an idle region 201 in the background model.

[0049] If a significant change is detected in the image data of spatial region 203 of image frame 200 relative to the image data of spatial region 203 of the previous image frame 200, then spatial region 203 of background 201 can be converted to foreground 202.

[0050] Timer 204 is one way to implement the repeated updating of the background model, and its advantages lie in its simplicity and computational efficiency. Compared to similar algorithms, it makes it easy to apply thresholds and / or manipulate the results. This is partly because timer 204 is just a number, and numbers are simple to manipulate. For example, by simply adding a number of seconds to timer 204, timer 204 will reach its threshold sooner without the threshold having to be changed. As another example, by resetting timer 204 at predictable intervals, timer 204 can be prevented from ever reaching its threshold.

[0051] Timer 204 can further be counted up and / or down without significantly increasing computational complexity. When it is determined that an idle region 201 is not to be converted, the timer 204 associated with that idle region 201 can be reset or paused, or the time limit of that timer 204 can be increased. Manipulating timer 204 is thus a simple and effective way to keep the idle region 201 as a foreground region 202 in the background model. When the timer 204 associated with an idle region 201 reaches an indication threshold that is below the predefined time limit, the idle region 201 can be flagged for transition from foreground 202 to background 201.
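
A per-region timer scheme of this kind can be sketched as follows. This is a simplified illustration, not the disclosed algorithm: each spatial region is assumed to be summarized by a single number (for example its mean brightness), all regions are assumed to start as foreground, and a region is converted to background directly when its timer reaches the time limit. The class name `TimerBackgroundModel` and its parameters are hypothetical.

```python
class TimerBackgroundModel:
    """One timer per spatial region; unchanged regions become background."""

    def __init__(self, n_regions, time_limit, change_threshold):
        self.time_limit = time_limit              # seconds without change
        self.change_threshold = change_threshold  # what counts as "significant"
        self.timers = [0.0] * n_regions
        self.is_background = [False] * n_regions  # assumption: start as foreground
        self.previous = None

    def update(self, region_values, dt):
        """Feed one frame (one value per region) captured dt seconds after the last."""
        if self.previous is not None:
            for i, (old, new) in enumerate(zip(self.previous, region_values)):
                if abs(new - old) > self.change_threshold:
                    # Significant change: region is foreground, timer restarts.
                    self.timers[i] = 0.0
                    self.is_background[i] = False
                else:
                    # No significant change: let the timer run.
                    self.timers[i] += dt
                    if self.timers[i] >= self.time_limit:
                        self.is_background[i] = True
        self.previous = list(region_values)
        return list(self.is_background)


if __name__ == "__main__":
    model = TimerBackgroundModel(n_regions=2, time_limit=1.0, change_threshold=10)
    for frame in ([100, 50], [100, 90], [100, 90], [100, 90]):
        print(model.update(frame, dt=0.5))
```

Because the state per region is a single number, the manipulations described above (adding seconds, resetting, pausing, or raising the limit) each amount to one assignment on `timers[i]` or `time_limit`.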

[0052] Figure 3 shows an example of a foreground object transitioning to the background. In this example, the lower foreground object 202a becomes idle in the middle image frame 200b. The spatial region 203 containing object 202a therefore becomes an idle region 201, which it was not in the leftmost image frame 200a, where object 202a was in motion. The idle region 201 of the middle image frame 200b is indicated as being in transition from foreground to background. In the rightmost image frame 200c, object 202a has thus transitioned to the background. The figure depicts nine spatial regions 203 per image frame 200a-c. In reality, there may be more or fewer than shown. For example, each pixel or each four-pixel square in image frames 200a-c can be a spatial region 203.

[0053] As those skilled in the art will appreciate, the embodiments described with reference to Figures 2A, 2B, and 3 are not the only way to repeatedly update the background model of a video stream by analyzing changes in a sequence of image frames and classifying image regions in the image frames that do not change over time as background.

[0054] Reset background model

[0055] As mentioned above, by using a rapid reset of the background model when the camera stops moving, background modeling can quickly generate a valid background and begin further processing of the image frame, which may require the background model.

[0056] Preferably, the step of resetting the background model is performed essentially immediately after detecting that the camera movement has stopped or that the camera movement has fallen below a movement threshold. Several methods exist for determining camera movement and the degree of camera movement, which are readily available to those skilled in the art. A straightforward method is to use control signals that control camera movement, but this can also be achieved using, for example, physical data measured from the camera or by analyzing captured images, such as by studying how reference points in the images move.
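
Detecting the moment at which movement falls below the threshold can be sketched from sampled control signals. The sketch assumes that pan and tilt speeds (in degrees per second) are available per sample; the function names and the choice of combining the two speeds into one magnitude are illustrative assumptions, not something the disclosure prescribes.

```python
import math


def movement_magnitude(pan_deg_s, tilt_deg_s):
    """Combine pan and tilt speed into a single non-negative magnitude."""
    return math.hypot(pan_deg_s, tilt_deg_s)


def stop_events(samples, threshold):
    """Return the sample indices at which movement drops to or below threshold.

    `samples` is a sequence of (pan_speed, tilt_speed) pairs. An index is
    reported only on the transition from moving to not moving, so each
    movement episode triggers exactly one reset.
    """
    events = []
    was_moving = False
    for index, (pan, tilt) in enumerate(samples):
        moving = movement_magnitude(pan, tilt) > threshold
        if was_moving and not moving:
            events.append(index)
        was_moving = moving
    return events


if __name__ == "__main__":
    speeds = [(0, 0), (30, 40), (3, 4), (0.3, 0.4), (0, 0), (10, 0), (0, 0)]
    print(stop_events(speeds, threshold=1.0))
```

The same edge-triggered logic would apply if the magnitude came from measured physical data or from tracked reference points in the images instead of control signals.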

[0057] The traditional, repetitive steps of updating the background model are typically a continuous process. As previously mentioned, this process involves analyzing the changes in a sequence of image frames over time. In contrast, the step of resetting the background model can preferably be a fast, single event. Preferably, the step of applying image segmentation and / or object detection algorithms does not involve analysis of changes over time, but is performed on a limited number of images, or even possibly a single image. It may be useful to generate a fast first background model when camera movement stops, rather than using traditional background modeling, which typically takes some time to generate the background model. It can be noted that short power spikes are acceptable at this point, since movement that would otherwise require a large amount of power has stopped, or at least is relatively low. Once the rapid reset of the background model is complete, the process can revert to repeatedly updating the background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background.

[0058] Image segmentation can be, for example, object segmentation, instance segmentation, semantic segmentation, or panoptic segmentation. None of these types of image segmentation involves analysis of motion over time, i.e., of changes between several image frames, to classify time-invariant image regions within an image frame as background. Therefore, they can be applied immediately to one or a limited number of image frames. The advantage of resetting the background model as described in this disclosure can be illustrated by the following example. The camera's frame rate can be, for example, 30 frames per second (fps), 50 fps, 60 fps, or 120 fps. Conventional algorithms that update the background model by analyzing changes in a sequence of image frames and classifying time-invariant image regions within the image frames as background may require, for example, 2 seconds. In other words, if the camera movement stops or falls below a movement threshold, it may take several seconds until conventional background modeling produces the desired background model. The background model reset described in this disclosure can be performed significantly faster, for example, in less than 300 milliseconds, preferably in less than 200 milliseconds, and even more preferably in less than 100 milliseconds. An image segmentation and / or object detection algorithm can therefore be applied to a single image frame, which yields a background model sooner than conventional background modeling does. The image segmentation and / or object detection algorithm can also be applied to additional image frames, such as fewer than 3, 5, or 8 image frames. In principle, the image segmentation and / or object detection algorithm can be applied repeatedly until conventional background modeling is ready to generate a background model.

In one embodiment of the currently disclosed method for background modeling of video streams acquired by a mobile camera, the step of applying image segmentation to one or more image frames to identify at least foreground objects includes performing image segmentation on a single image frame, or on multiple independent image frames, without analyzing motion to identify foreground and background.
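
To put the example timings in terms of frames, the number of frames that pass during a given interval is simply the frame rate multiplied by the duration. The helper below is a hypothetical illustration that uses the frame rates and durations mentioned above.

```python
def frames_in(fps, milliseconds):
    """Number of whole frames captured in the given time at the given frame rate."""
    return fps * milliseconds // 1000


if __name__ == "__main__":
    for fps in (30, 50, 60, 120):
        print(
            f"{fps} fps: {frames_in(fps, 2000)} frames in 2 s, "
            f"{frames_in(fps, 100)} frames in 100 ms"
        )
```

At 30 fps, for instance, a 2-second conventional warm-up spans 60 frames, whereas a reset completed within 100 milliseconds spans only 3.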

[0059] In the context of this disclosure, the term "object detection" refers to detecting one or more objects in an image and returning their positions, typically in coordinate form. In some of the most common object detection applications, bounding boxes indicate the detected objects.

[0060] In this disclosure, image segmentation and / or object detection algorithms are applied to one or more image frames to identify at least foreground objects. The method can be trained to know which types of objects are typically foreground objects. This can typically be objects known to be able to move in the video stream, such as people, vehicles, animals, etc. The method can also have knowledge about which objects will be considered background. This can typically be objects known to be stationary in the video stream, such as buildings.
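
Once a detector has returned labeled bounding boxes, the reset itself reduces to marking everything outside the foreground boxes as background. The sketch below assumes a hypothetical detection format of `(label, (x0, y0, x1, y1))` with half-open pixel coordinates and an illustrative set of foreground classes; neither is specified by the disclosure.

```python
# Assumption: object types treated as foreground because they can move.
FOREGROUND_CLASSES = {"person", "vehicle", "animal"}


def reset_background_mask(width, height, detections):
    """Build a fresh background model from a single frame's detections.

    Returns a grid where mask[y][x] is True for background and False for
    foreground. Detections whose label is not a foreground class (for
    example "building") leave the mask unchanged.
    """
    mask = [[True] * width for _ in range(height)]
    for label, (x0, y0, x1, y1) in detections:
        if label not in FOREGROUND_CLASSES:
            continue
        # Clip the box to the frame and mark its pixels as foreground.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = False
    return mask


if __name__ == "__main__":
    detections = [("person", (1, 0, 3, 2)), ("building", (0, 0, 4, 3))]
    for row in reset_background_mask(4, 3, detections):
        print(row)
```

No history of earlier frames is needed, which is what allows the reset to run on a single image.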

[0061] The steps of identifying at least foreground objects using image segmentation and / or object detection algorithms may include applying a machine learning model, such as a neural network, trained to identify objects, such as those described above. The neural network may include, for example, a deep learning model. Various neural network algorithms / techniques exist for object detection, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), etc. Typically, those skilled in the art can perform this type of object detection using neural networks.

[0062] In a non-limiting example of a setup for object detection using a neural network, the neural network is fed labeled data. The labeled data is, for example, an image of an object, where the image is labeled with the correct type of object; that is, the labeled data includes both the image data and the ground truth for that image data. The image data is fed into a classifier, and the ground truth is sent to a loss function calculator. The classifier processes the data representing the objects to be classified and generates classification identifiers. The processing in the classifier may include applying weights to the values as the data passes through the classifier. The classification identifier can be a feature vector, a classification vector, or a single value identifying the category. In the loss function calculator, the classification identifier is compared to the ground truth using a loss function. The result of the loss function is then passed to a weight adjustment function, which is configured to adjust the weights used in the classifier. Once trained, the neural network can be used to scan an image in order to detect and classify objects in the image.
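
The classifier, loss function, and weight adjustment loop described above can be sketched with the smallest possible model. The example below swaps in a single-layer logistic classifier trained by stochastic gradient descent on two hand-made features, purely to show the data flow; a real detector would use a deep network, and all names and hyperparameters here are illustrative assumptions.

```python
import math


def classify(weights, bias, features):
    """Classifier: apply weights to the input and return a score in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def loss(prediction, truth):
    """Loss function: binary cross-entropy between prediction and ground truth."""
    eps = 1e-12  # avoids log(0)
    return -(truth * math.log(prediction + eps)
             + (1 - truth) * math.log(1 - prediction + eps))


def adjust_weights(weights, bias, features, prediction, truth, lr):
    """Weight adjustment: one gradient-descent step on the cross-entropy loss."""
    error = prediction - truth  # derivative of the loss w.r.t. the pre-activation
    new_weights = [w - lr * error * x for w, x in zip(weights, features)]
    return new_weights, bias - lr * error


def train(labelled, epochs=200, lr=0.5):
    """Feed labelled data through classifier, loss, and weight adjustment."""
    weights, bias = [0.0] * len(labelled[0][0]), 0.0
    history = []
    for _ in range(epochs):
        epoch_loss = 0.0
        for features, truth in labelled:
            prediction = classify(weights, bias, features)
            epoch_loss += loss(prediction, truth)
            weights, bias = adjust_weights(weights, bias, features, prediction, truth, lr)
        history.append(epoch_loss / len(labelled))
    return weights, bias, history


if __name__ == "__main__":
    data = [([0.0, 0.1], 0), ([0.2, 0.0], 0), ([0.9, 1.0], 1), ([1.0, 0.8], 1)]
    w, b, hist = train(data)
    print(f"loss {hist[0]:.3f} -> {hist[-1]:.3f}")
```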

[0063] Figure 6 shows an example of a conceptual framework 300 for object detection using neural networks. Some building blocks of a CNN are convolutional layers, pooling layers, flattening layers, and fully connected layers. The input to a CNN is a tensor. Convolutional layers convolve the input and pass the result to the next layer. This is analogous to the response of neurons in the visual cortex to a specific stimulus. Convolutional networks can further include pooling layers. Pooling layers reduce the dimensionality of the data by combining the outputs of clusters of neurons in one layer into a single neuron in the next layer. CNNs can further include fully connected layers that connect each neuron in one layer to each neuron in another layer. Thus, a flattened matrix can be used to classify images through fully connected layers.
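
The four building blocks named above can be written out in a few lines each. The sketch below is a plain-Python illustration with hand-picked weights rather than a trained network; as in most deep learning frameworks, the "convolution" is implemented as a cross-correlation (the kernel is not flipped), and no padding or activation function is applied.

```python
def conv2d(image, kernel):
    """Convolutional layer: slide the kernel over the image (valid region only)."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(len(image[0]) - kw + 1)
        ]
        for y in range(len(image) - kh + 1)
    ]


def max_pool(feature_map, size=2):
    """Pooling layer: keep the maximum of each non-overlapping size x size block."""
    return [
        [
            max(feature_map[y + i][x + j]
                for i in range(size) for j in range(size))
            for x in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for y in range(0, len(feature_map) - size + 1, size)
    ]


def flatten(feature_map):
    """Flattening layer: turn the 2-D feature map into a vector."""
    return [value for row in feature_map for value in row]


def fully_connected(vector, weights, biases):
    """Fully connected layer: every input feeds every output neuron."""
    return [sum(w * v for w, v in zip(ws, vector)) + b
            for ws, b in zip(weights, biases)]


if __name__ == "__main__":
    image = [[0, 0, 1, 1, 1] for _ in range(5)]  # vertical edge
    kernel = [[-1, 1], [-1, 1]]                  # responds to that edge
    pooled = max_pool(conv2d(image, kernel))
    print(fully_connected(flatten(pooled), [[1, 0, 1, 0], [0, 1, 0, 1]], [0, 0.5]))
```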

[0064] There are other object detection algorithms that do not rely on CNNs or machine learning.

[0065] In one embodiment of the currently disclosed method for background modeling of video streams acquired by a mobile camera, the algorithm for image segmentation and / or object detection of one or more image frames to identify foreground objects includes thresholding and / or clustering and / or histogram-based segmentation and / or edge detection.
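
As one concrete instance of histogram-based thresholding, the sketch below uses Otsu's method, which picks the gray level that maximizes the between-class variance of the two resulting pixel groups. Otsu's method is chosen here only as a well-known example; the disclosure does not name a specific thresholding technique, and which side of the threshold counts as foreground is a scene-dependent assumption.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that best splits pixels into <= t and > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_variance = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        variance = w0 * w1 * (mean0 - mean1) ** 2  # between-class variance (unnormalized)
        if variance > best_variance:
            best_variance, best_t = variance, t
    return best_t


def segment(pixels, threshold):
    """Assumption for this sketch: pixels brighter than the threshold are foreground."""
    return [p > threshold for p in pixels]


if __name__ == "__main__":
    gray = [10, 12, 11, 200, 205, 198]
    t = otsu_threshold(gray)
    print(t, segment(gray, t))
```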

[0066] An example of an object detection algorithm is the Viola-Jones detection framework. In this method, image frames are scanned using a sliding window, where each region is classified as containing or not containing an object. This method uses Haar features and a cascade of classifiers to detect objects. Various object detection methods of this kind, in which cascades of classifiers are used to detect objects, are known in the prior art, such as those described in Viola, Paul, and Michael Jones, "Rapid object detection using a boosted cascade of simple features," Computer Vision and Pattern Recognition, 2001, CVPR 2001, Proceedings of the 2001 IEEE Computer Society Conference on, vol. 1, IEEE, 2001. Because visual features are central to these algorithms, groups of objects sharing similar visual features can be detected; examples of such groups include faces, vehicles, and people. Any of these methods can be used individually or in combination to detect objects in image data. Several objects can also be detected in the same set of image data. When an object is detected, a set of recognition features can be created to describe the visual appearance of the detected object.
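
Two ingredients of this approach, the integral image that makes rectangle sums cheap and a Haar-like feature evaluated in a sliding window, can be sketched as follows. This is not a full Viola-Jones detector: there is no boosted cascade and no training, only a single hand-chosen two-rectangle feature with a fixed threshold, and all function names are illustrative.

```python
def integral_image(img):
    """ii[y][x] holds the sum of img over rows < y and columns < x."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left corner (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: right half minus left half of the window."""
    half = w // 2
    return rect_sum(ii, x + half, y, half, h) - rect_sum(ii, x, y, half, h)


def sliding_window_hits(img, win_w, win_h, threshold):
    """Scan every window position and keep those whose feature passes the threshold."""
    ii = integral_image(img)
    hits = []
    for y in range(len(img) - win_h + 1):
        for x in range(len(img[0]) - win_w + 1):
            if haar_two_rect(ii, x, y, win_w, win_h) >= threshold:
                hits.append((x, y))
    return hits


if __name__ == "__main__":
    image = [[0, 0, 9, 9], [0, 0, 9, 9]]  # dark-to-bright vertical edge
    print(sliding_window_hits(image, win_w=2, win_h=2, threshold=18))
```

In the full framework, many such features are combined by boosting into stages, and a window is rejected as soon as one stage fails, which keeps the sliding-window scan fast.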

[0067] Figure 5 shows an example of an image frame 200 in which multiple objects 208 have been detected and bounding boxes 207 have been added. In the example of Figure 5, object 208 is a car, but an object can be any item that is considered to belong to the foreground or background.

[0068] Figure 4 shows an example of image frames 200a / 200b in a video sequence during camera movement. In this example, the camera pans between the left and right image frames. The stationary background objects (trees 205 and buildings 207) do not have the same position in the two image frames 200a and 200b. Meanwhile, a moving object 206 has a different position relative to the background objects 205 and 207 in the two image frames 200a and 200b.

[0069] Figure 7 shows a schematic diagram of an embodiment of the currently disclosed image processing system 400. The image processing system 400 includes a camera 401, a processing unit 402, and a display 403.

[0070] The system may (but does not necessarily have to) include a display for showing a sequence of image frames. Background modeling can be useful in a variety of applications where visualization is performed on the display, including overlay and dynamic masking applications. However, in further applications, the background model may not necessarily be displayed, but rather used in additional applications. These additional applications may include, for example, extracting statistical data from the video or further analysis.

[0071] The system may further include peripheral components, such as one or more memories for storing instructions executable by the processing unit. The system may further include any of the following: internal and external network interfaces, input and / or output ports, a keyboard or mouse, etc.

[0072] As those skilled in the art will understand, the processing unit can also be a single processor in a multi-core / multi-processor system. A computing hardware accelerator and a central processing unit can both be connected to a data communication infrastructure.

[0073] The system may include memory, such as random access memory (RAM) and / or read-only memory (ROM) or any other suitable type of memory. The system may further include a communication interface that allows software and / or data to be transferred between the system and external devices. The software and / or data transferred through the communication interface may take the form of any suitable signal, such as an electrical, optical, or radio frequency signal. The communication interface may be, for example, a wired or wireless interface.

[0074] The present invention further relates to a computer program having instructions that, when executed by a computing device or computing system, cause the computing device or computing system to perform any embodiment of the currently disclosed background modeling method for video streams. The computer program can be stored on any suitable type of storage medium, such as a non-transitory storage medium.
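The control flow of the disclosed method (repeated updating, pausing while the camera moves, and a single reset once it stops) can be sketched as a small state machine. The Python code below is only an illustration: the class name, the scalar camera_motion measure, and the two numeric thresholds are assumptions, and the actual updating and resetting of the model are left to the caller.

```python
from enum import Enum


class Mode(Enum):
    UPDATING = "updating"  # background model is being updated repeatedly
    PAUSED = "paused"      # camera is moving, updates are stopped


class BackgroundController:
    """Decides per frame whether to update, pause, or reset the model."""

    def __init__(self, move_limit, stop_threshold):
        self.move_limit = move_limit          # motion above this stops updates
        self.stop_threshold = stop_threshold  # motion below this triggers reset
        self.mode = Mode.UPDATING

    def step(self, camera_motion):
        """Return 'update', 'pause', or 'reset' for the current frame."""
        if self.mode is Mode.UPDATING:
            if camera_motion > self.move_limit:
                self.mode = Mode.PAUSED
                return "pause"
            return "update"
        # Paused: wait until the motion falls below the movement threshold.
        if camera_motion < self.stop_threshold:
            self.mode = Mode.UPDATING
            return "reset"  # single event: segment / detect, then rebuild
        return "pause"


# Usage with a hypothetical motion magnitude per frame.
controller = BackgroundController(move_limit=1.0, stop_threshold=0.2)
motion = [0.0, 0.1, 2.0, 1.5, 0.5, 0.1, 0.0]
actions = [controller.step(m) for m in motion]
```

Using a separate, lower threshold for the reset than for the pause is a design choice of this sketch; it keeps the controller from toggling when the motion hovers around a single limit.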

Claims

1. A method for background modeling of a video stream, the video stream being acquired by a moving camera, the method comprising:
acquiring a video stream containing a sequence of image frames;
repeatedly updating a background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background;
detecting camera movement, and stopping the repeated updating of the background model when camera movement is detected or when the camera movement exceeds a predefined limit;
when the camera movement stops or the camera movement falls below a movement threshold, resetting the background model by the following steps:
  acquiring image frames;
  applying an image segmentation and / or object detection algorithm to the image frame to identify at least a foreground object, wherein the image segmentation and / or object detection algorithm is trained to identify objects known to be able to move in the video stream without analyzing changes over time between image frames; and
  classifying the area outside the identified foreground object as background; and
after the background model has been reset, returning to repeatedly updating the background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background.

2. The method for background modeling of a video stream according to claim 1, wherein the step of resetting the background model is performed immediately after detecting that the camera movement has stopped or that the camera movement has fallen below a movement threshold.

3. The method for background modeling of a video stream according to claim 1, wherein the camera movement includes panning and / or tilting and / or zooming.

4. The method for background modeling of a video stream according to claim 1, further comprising the steps of performing background overlay and / or masking based on the background model.

5. The method for background modeling of a video stream according to claim 1, further comprising the step of: when camera movement is detected, but before the camera movement stops or the camera movement falls below a movement threshold, making a background overlay visible for an image area covering the entire image frame and / or applying a complete mask and / or pixelation to the image area covering the entire image frame.

6. The method for background modeling of a video stream according to claim 1, wherein the repeated updating of the background model is a continuous process, and the reset of the background model is a single event.

7. The method for background modeling of a video stream according to claim 1, wherein the step of repeatedly updating the background model includes analyzing sub-regions of the sequence of image frames, wherein if there is no significant change between the image frames in a sub-region during a predefined time period, the sub-region is classified as a background sub-region.

8. The method for background modeling of a video stream according to claim 1, wherein the step of applying image segmentation and / or object detection algorithms to the one or more image frames to identify foreground objects includes thresholding and / or clustering and / or histogram-based segmentation and / or edge detection.

9. The method for background modeling of a video stream according to claim 1, wherein the following steps are sequential: repeatedly updating the background model; detecting camera movement; detecting when the camera movement stops or the camera movement falls below a movement threshold; and performing a reset of the background model.

10. The method for background modeling of a video stream according to claim 1, further comprising displaying a sequence of image frames on a display.

11. A non-transitory storage medium storing a computer program having instructions that, when executed by a computing device or computing system, cause the computing device or computing system to perform the method for background modeling of a video stream according to claim 1.

12. An image processing system, comprising:
at least one camera; and
a processing unit configured to:
acquire a video stream containing a sequence of image frames;
repeatedly update a background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background;
detect camera movement, and stop repeatedly updating the background model when camera movement is detected or when the camera movement exceeds a predefined limit;
when the camera movement stops or the camera movement falls below a movement threshold, reset the background model by the following steps:
  acquiring image frames;
  applying an image segmentation and / or object detection algorithm to the one or more image frames to identify foreground objects, wherein the image segmentation and / or object detection algorithm is trained to identify known objects that can move in the video stream without analyzing changes over time between image frames; and
  classifying the area outside the identified foreground object as background; and
after the background model has been reset, return to repeatedly updating the background model of the video stream by analyzing changes in the sequence of image frames and classifying image regions in the image frames that do not change over time as background.

Citation Information

Patent Citations

  • System and method for managing moving surveillance cameras (CN101272483A)

  • Image segmentation for a live camera feed (CN105321171A)