A method, device and electronic equipment for detecting image scene change

The image scene change detection method uses a preset detection mode to determine scene changes in an image set, which solves the problem of low detection efficiency in existing technologies and achieves efficient and unified change detection, applicable to a variety of scenarios.

CN115457372B · Active · Publication Date: 2026-06-12 · ZHEJIANG DAHUA TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: ZHEJIANG DAHUA TECH CO LTD
Filing Date: 2022-09-07
Publication Date: 2026-06-12

Application Information

IPC: G06V20/00; G06V10/75; G06V10/74; G06V10/82; G06V10/22

CPC: G06V20/35; G06V10/751; G06V10/761; G06V10/22; G06V10/82

AI Tagging

Application Domain

Character and pattern recognition

AI Technical Summary

Technical Problem

Existing technologies for detecting changes in different industries and scenarios are inefficient and costly, requiring customization on a case-by-case basis, which leads to low detection efficiency.

Method used

A method for detecting scene changes in images is provided. By acquiring detection description information and a set of images to be detected, a preset detection mode (including at least a first detection mode and a second detection mode) is used to determine whether the scene in the image set has changed. The first detection mode judges by similarity, and the second detection mode judges by difference regions. It is applicable to various types of task detection instructions and scenes.

Benefits of technology

It achieves efficient and unified change detection in different scenarios, avoiding the inefficiency caused by designing detection methods one by one, thus improving detection efficiency and reducing costs.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses a method and device for detecting image scene change and electronic equipment, providing a change detection method that is widely applicable to different scenes and thereby solving the problem of low efficiency of detecting image scene change in the prior art. The method comprises: obtaining detection description information from a task detection instruction; obtaining a set of images to be detected; determining whether the scene in the set of images to be detected changes through a preset detection mode matched with the detection description information in K preset detection modes; the preset detection mode comprises a first detection mode and / or a second detection mode; the first detection mode comprises: determining whether the scene in the set of images to be detected changes through the similarity between preset regions in different images to be detected included in the set of images to be detected; and the second detection mode comprises: determining whether the scene in the set of images to be detected changes through difference regions between different images to be detected included in the set of images to be detected.

Description

Technical Field

[0001] This application relates to the field of image detection technology, and in particular to a method, apparatus and electronic device for detecting changes in an image scene.

Background Technology

[0002] With the intelligent development of various industries, there is a growing demand for change-based machine detection: identifying changes in videos (or photos) to reduce labor costs. Examples include detecting the opening / closing of fire doors in stairwells, checking for blockages in equipment discharge ports in workshops, checking the cleanliness of key areas such as tabletops in restaurant kitchens, and checking for illegally parked vehicles in fire lanes. Currently, this type of demand is typically met by customizing detection methods to detect changes in fixed objects at fixed locations. However, as the demand for change-based detection grows across different industries and scenarios, customizing a detection method for each scenario becomes inefficient and costly.

Summary of the Invention

[0003] This application provides a method, apparatus, and electronic device for detecting changes in an image scene, which provides a change detection method that is widely applicable to different scenes, avoiding the problem of low detection efficiency caused by designing change detection methods one by one.

[0004] In a first aspect, this application provides a method for detecting changes in an image scene, comprising:

[0005] Obtain detection description information from task detection instructions; and

[0006] Obtain a set of images to be detected; the set of images to be detected contains at least two images to be detected.

[0007] Whether the scene in the set of images to be detected has changed is determined by using a preset detection mode that matches the detection description information from among K preset detection modes; wherein, K is an integer not less than 1, and the preset detection modes include at least a first detection mode and / or a second detection mode;

[0008] The first detection mode includes: determining whether the scene in the image set to be detected has changed based on the similarity between preset regions in different images to be detected contained in the image set to be detected;

[0009] The second detection mode includes: determining whether the scene in the set of images to be detected has changed by identifying the difference regions between different images in the set of images to be detected.

[0010] In this embodiment, by obtaining detection description information, a preset detection mode is matched for at least two images in the image set to be detected, thereby detecting whether the scene in the image set to be detected has changed in a targeted manner. This method is applicable to various types of task detection instructions and change detection in various scenarios. Therefore, by directly using the scene change detection method provided in this embodiment, the problem of low detection efficiency caused by designing / customizing different detection methods for each scene or business in the image to be detected can be avoided.

[0011] In one possible implementation, the at least two images to be detected are acquired for a detection region, and the detection description information includes the regional location information of the detection region; then, before determining whether the scene in the set of images to be detected has changed by using a preset detection mode that matches the detection description information from K preset detection modes, the method further includes:

[0012] Based on the region location information in the detection description information, a preset detection mode that matches the detection description information is determined.

[0013] In one possible implementation, the preset detection mode matching the detection description information includes the first detection mode; then, determining whether the scene in the set of images to be detected has changed by using the preset detection mode matching the detection description information from among K preset detection modes includes:

[0014] A first vector of a first image to be detected and a second vector of a second image to be detected are determined; wherein the first vector indicates the features of the preset region in the first image to be detected, and the second vector indicates the features of the preset region in the second image to be detected; the first image to be detected and the second image to be detected are different images to be detected in the set of images to be detected;

[0015] The similarity is determined based on the first vector and the second vector;

[0016] When the similarity is less than the first similarity threshold, it is determined that the scene in the set of images to be detected has changed.

[0017] One possible implementation involves acquiring a reference image and determining a reference vector for the reference image; wherein the similarity between the preset region in the reference image and the preset region in the first image to be detected is less than a second similarity threshold, and the reference vector indicates the features of the preset region in the reference image.

[0018] Based on the second vector and the reference vector, a reference similarity is determined;

[0019] When the reference similarity is less than the second similarity threshold, it is determined that an interfering target appears in the preset region of the second image to be detected; then a third image to be detected is acquired, and the second vector is updated using the third vector of the third image to be detected; wherein, the acquisition time of the third image to be detected is later than the acquisition time of the second image to be detected, and the third vector indicates the features of the preset region in the third image to be detected;

[0020] Based on the similarity between the updated second vector and the first vector, it is determined whether the scene in the set of images to be detected has changed.

[0021] In one possible implementation, the preset detection mode matching the detection description information includes the second detection mode; then, determining whether the scene in the set of images to be detected has changed by using the preset detection mode matching the detection description information from among K preset detection modes includes:

[0022] The first feature matrix of the third image to be detected and the second feature matrix of the fourth image to be detected in the set of images to be detected are determined respectively.

[0023] The difference between the elements indicating the same region between the first feature matrix and the second feature matrix is determined as a feature parameter; wherein, the feature parameter indicates the degree of difference between the same region in the third image to be detected and the same region in the fourth image to be detected;

[0024] The feature parameters that are greater than the difference threshold are marked as difference parameters;

[0025] Based on a preset mapping relationship, in the fourth image to be detected, the difference region corresponding to the difference parameter is determined; the preset mapping relationship indicates the correspondence between the features of any region in any image and the elements in the feature matrix of any image.

[0026] Based on the difference regions, it is determined whether the scene in the set of images to be detected has changed.
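As a rough illustration of the second detection mode steps above, the following Python sketch compares two feature matrices element by element. It assumes that each matrix element is a single number, that the feature parameter is the absolute difference between corresponding elements, and that the preset mapping relationship is a regular grid in which each element covers a `cell_size` by `cell_size` pixel block. The function names, the grid mapping, and the rule that any difference region counts as a change are illustrative assumptions, not details given in the patent.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def difference_regions(first: Matrix, second: Matrix,
                       diff_threshold: float, cell_size: int) -> List[Box]:
    """Return pixel boxes whose feature difference exceeds diff_threshold."""
    same_shape = len(first) == len(second) and all(
        len(r1) == len(r2) for r1, r2 in zip(first, second))
    if not same_shape:
        raise ValueError("feature matrices must have the same shape")

    regions: List[Box] = []
    for row, (r1, r2) in enumerate(zip(first, second)):
        for col, (a, b) in enumerate(zip(r1, r2)):
            feature_parameter = abs(a - b)  # assumed difference measure
            if feature_parameter > diff_threshold:
                # Assumed grid mapping from matrix element to image region.
                regions.append((col * cell_size, row * cell_size,
                                (col + 1) * cell_size, (row + 1) * cell_size))
    return regions


def scene_changed_second_mode(first: Matrix, second: Matrix,
                              diff_threshold: float, cell_size: int) -> bool:
    """Assumed decision rule: any difference region means the scene changed."""
    return len(difference_regions(first, second, diff_threshold, cell_size)) > 0
```

A real system would likely obtain the feature matrices from a neural network and might filter difference regions by size or class before deciding; the patent text here does not specify either.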

[0027] In one possible implementation, the second detection mode includes a first sub-detection mode and a second sub-detection mode. The first sub-detection mode is used to detect environmental changes in the scene within the set of images to be detected, and the second sub-detection mode is used to detect target changes in the scene within the images to be detected.

[0028] In one possible implementation, determining the first feature matrix of the third image to be detected and the second feature matrix of the fourth image to be detected in the set of images to be detected comprises:

[0029] Based on the detection items in the detection description information, determine whether to perform environmental change detection for the scene in the set of images to be detected;

[0030] If so, input the third image to be detected and the fourth image to be detected into the first sub-detection mode, and determine the first feature matrix corresponding to the third image to be detected and the second feature matrix corresponding to the fourth image to be detected;

[0031] If not, input the third image to be detected and the fourth image to be detected into the second sub-detection mode to determine the first feature matrix corresponding to the third image to be detected and the second feature matrix corresponding to the fourth image to be detected.
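The branch between the two sub-detection modes can be sketched as a simple dispatch. In this hypothetical Python example, the detection description is a dictionary, the detection items are a list of strings, the string `"environment_change"` requests environmental change detection, and the two sub-detection modes are passed in as callables that turn an image into a feature matrix. All of these names and conventions are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

Matrix = List[List[float]]
Extractor = Callable[[Any], Matrix]


def extract_feature_matrices(
    description: Dict[str, Any],
    third_image: Any,
    fourth_image: Any,
    environment_extractor: Extractor,  # stands in for the first sub-detection mode
    target_extractor: Extractor,       # stands in for the second sub-detection mode
) -> Tuple[Matrix, Matrix]:
    """Pick a sub-detection mode from the detection items and run it on both images."""
    # Hypothetical convention: "environment_change" among the detection items
    # means environmental change detection is requested.
    items = description.get("detection_items", [])
    if "environment_change" in items:
        extractor = environment_extractor
    else:
        extractor = target_extractor
    return extractor(third_image), extractor(fourth_image)
```

Both images go through the same extractor so that the resulting matrices are comparable element by element.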

[0032] In a second aspect, this application provides an image scene change detection device, comprising:

[0033] Collection unit: used to obtain detection description information from task detection instructions; and

[0034] Obtain a set of images to be detected; the set of images to be detected contains at least two images to be detected.

[0035] Matching unit: used to determine whether the scene in the set of images to be detected has changed by using a preset detection mode that matches the detection description information from among K preset detection modes; wherein, K is an integer not less than 1, and the preset detection modes include at least a first detection mode and / or a second detection mode;

[0036] The first detection mode includes: determining whether the scene in the image set to be detected has changed based on the similarity between preset regions in different images to be detected contained in the image set to be detected;

[0037] The second detection mode includes: determining whether the scene in the set of images to be detected has changed by identifying the difference regions between different images in the set of images to be detected.

[0038] In one possible implementation, the device further includes a mode unit, which is specifically used to determine a preset detection mode that matches the detection description information based on the region location information in the detection description information.

[0039] In one possible implementation, the preset detection mode matching the detection description information includes the first detection mode; then the matching unit is specifically used to determine a first vector of the first image to be detected and a second vector of the second image to be detected; wherein, the first vector indicates the features of the preset region in the first image to be detected, and the second vector indicates the features of the preset region in the second image to be detected; the first image to be detected and the second image to be detected are different images to be detected in the set of images to be detected; the similarity is determined based on the first vector and the second vector; when the similarity is less than a first similarity threshold, it is determined that the scene in the set of images to be detected has changed.

[0040] In one possible implementation, the matching unit is specifically used to acquire a reference image and determine a reference vector of the reference image; wherein the similarity between the preset region in the reference image and the preset region in the first image to be detected is less than a second similarity threshold, and the reference vector indicates the features of the preset region in the reference image; a reference similarity is determined based on the second vector and the reference vector; when the reference similarity is less than the second similarity threshold, it is determined that an interfering target appears in the preset region of the second image to be detected; then a third image to be detected is acquired, and the second vector is updated using the third vector of the third image to be detected; wherein the acquisition time of the third image to be detected is later than the acquisition time of the second image to be detected, and the third vector indicates the features of the preset region in the third image to be detected; based on the similarity between the updated second vector and the first vector, it is determined whether the scene in the set of images to be detected has changed.

[0041] In one possible implementation, the preset detection mode matching the detection description information includes the second detection mode; then the matching unit is specifically used to determine the first feature matrix of the third image to be detected and the second feature matrix of the fourth image to be detected in the set of images to be detected; determine the difference between the elements indicating the same region between the first feature matrix and the second feature matrix as a feature parameter; wherein, the feature parameter indicates the degree of difference between the same region in the third image to be detected and the same region in the fourth image to be detected; mark the feature parameter greater than the difference threshold as a difference parameter; based on a preset mapping relationship, determine the difference region corresponding to the difference parameter in the fourth image to be detected; the preset mapping relationship indicates the correspondence between the features of any region in any image and the elements in the feature matrix of any image; based on the difference region, determine whether the scene in the set of images to be detected has changed.

[0042] In one possible implementation, the second detection mode includes a first sub-detection mode and a second sub-detection mode. The first sub-detection mode is used to detect environmental changes in the scene within the set of images to be detected, and the second sub-detection mode is used to detect target changes in the scene within the images to be detected.

[0043] In one possible implementation, the matching unit is further configured to determine, based on the detection items in the detection description information, whether to perform environmental change detection on the scene in the set of images to be detected; if yes, input the third image to be detected and the fourth image to be detected into the first sub-detection mode to determine the first feature matrix corresponding to the third image to be detected and the second feature matrix corresponding to the fourth image to be detected; if no, input the third image to be detected and the fourth image to be detected into the second sub-detection mode to determine the first feature matrix corresponding to the third image to be detected and the second feature matrix corresponding to the fourth image to be detected.

[0044] In a third aspect, this application provides a readable storage medium, comprising:

[0045] a memory;

[0046] the memory is used to store instructions that, when executed by a processor, cause an apparatus including the readable storage medium to perform the method described in the first aspect and any of its possible implementations.

[0047] In a fourth aspect, this application provides an electronic device, comprising:

[0048] Memory, used to store computer programs;

[0049] A processor which, when executing the computer program stored in the memory, implements the method described in the first aspect and any of its possible implementations.

Attached Figure Description

[0050] Figure 1 is a schematic flowchart of an image scene change detection method provided in an embodiment of this application;

[0051] Figure 2 is a schematic diagram of a second detection mode provided in an embodiment of this application;

[0052] Figure 3 is a schematic diagram of determining whether the scene in the set of images to be detected has changed, provided in an embodiment of this application;

[0053] Figure 4 is a schematic diagram of the structure of an image scene change detection device provided in an embodiment of this application;

[0054] Figure 5 is a schematic diagram of the structure of an electronic device for detecting changes in an image scene, provided in an embodiment of this application.

Detailed Implementation

[0055] To address the low efficiency of existing methods for detecting scene changes in images, this application proposes a method for detecting scene changes in images. First, detection description information is obtained from the task detection instruction, and a set of images to be detected is obtained. Then, a preset detection mode matched with the detection description information is used to determine whether the scene in the set of images to be detected has changed. The preset detection modes include at least a first detection mode and / or a second detection mode, and both modes work on regions of the image that may have changed; in the first detection mode in particular, these regions are preset regions. The method can therefore take a set of images to be detected, match a preset detection mode to it based on the obtained detection description information, and determine whether the scene in the images has changed, thereby avoiding the low detection efficiency caused by designing a new detection method for each change detection task.

[0056] It should be noted that the images described in this application embodiment are applicable to images captured by any image acquisition device, such as a camera, camcorder, or webcam.

[0057] To better understand the above technical solutions, the technical solutions of this application will be described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the embodiments of this application and the specific features in the embodiments are detailed descriptions of the technical solutions of this application, rather than limitations on the technical solutions of this application. In the absence of conflict, the embodiments of this application and the technical features in the embodiments can be combined with each other.

[0058] Referring to Figure 1, this application proposes a method for detecting changes in an image scene, avoiding the low detection efficiency caused by designing each change detection method individually. The method specifically includes the following implementation steps:

[0059] Step 101: Obtain detection description information from the task detection instructions; and obtain the set of images to be detected.

[0060] The set of images to be detected contains at least two images to be detected.

[0061] Specifically, the images to be detected in the set are obtained by capturing the detection area with the same image acquisition device and / or with different image acquisition devices.

[0062] The detection description information includes the information that needs to be collected for the detection area, for example the location information of the detection area. This location information could be the coordinates of the vertices and / or midpoint of the detection area, its shape, etc.

[0063] The detection description information may also include image noise indication information, which describes image information that is not needed for the detection task. Combined with the other information in the detection description information, it helps to accurately match the preset detection mode described in step 102.

[0064] For example, the detection area is a restaurant kitchen. The area location information in the detection description could be the location of the kitchen door, and the corresponding task detection instruction could be to detect changes in the kitchen door (opening / closing). Alternatively, the area location information could describe the location of the kitchen countertops or the entire kitchen floor. The corresponding task detection instruction could then be to detect whether items (e.g., dishcloths) are scattered haphazardly on the kitchen countertops, or whether the kitchen floor has not been cleaned promptly, i.e., whether there are paper scraps, food residue, or other trash on the floor.

[0065] Step 102: Determine whether the scene in the set of images to be detected has changed by using the preset detection mode that matches the detection description information from among K preset detection modes.

[0066] Where K is an integer not less than 1, and the preset detection modes include at least the first detection mode and / or the second detection mode.

[0067] The first detection mode includes: determining whether the scene in the image set has changed by using the similarity between preset regions in different images contained in the image set to be detected.

[0068] The second detection mode includes: determining whether the scene in the image set has changed by identifying the difference regions between different images in the image set.

[0069] Specifically, one of the differences between the first detection mode and the second detection mode is whether the detection area is fixed. That is, the detection area in the first detection mode is a preset area, while the detection area in the second detection mode is a variable area.

[0070] Therefore, before detecting the image to be detected to determine whether the scene in the image set has changed, a preset detection mode that matches the detection description information can be determined first by using the regional location information of the detection area in the detection description information.
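One way to picture this matching step is a small selector that inspects the detection description. The Python sketch below is hypothetical: it assumes the description is a dictionary and that a non-empty `"region_vertices"` entry marks a fixed, preset region (first detection mode), while its absence means the changed area is not known in advance (second detection mode). The patent states only that the mode is determined from the region location information, not how, so the rule and the field name are assumptions.

```python
from enum import Enum
from typing import Any, Dict


class DetectionMode(Enum):
    FIRST = "similarity between preset regions"
    SECOND = "difference regions between images"


def match_detection_mode(description: Dict[str, Any]) -> DetectionMode:
    """Choose a preset detection mode from the detection description."""
    # Assumption: explicit vertex coordinates describe a fixed preset region.
    vertices = description.get("region_vertices")
    if vertices:
        return DetectionMode.FIRST
    # Assumption: without a fixed region, fall back to difference regions.
    return DetectionMode.SECOND
```

For instance, a kitchen-door task would carry the door's vertex coordinates and select the first mode, while a floor-cleanliness task without a fixed region would select the second.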

[0071] Furthermore, another difference between the first and second detection modes lies in the area of the detection region. Specifically, the preset region in the first detection mode is generally larger than the difference region in the second detection mode. Correspondingly, the proportion of the difference region in the image to be detected in the second detection mode is smaller than the proportion of the preset region in the image to be detected in the first detection mode.

[0072] Furthermore, the preset region in the first detection mode is pre-set, while the difference region in the second detection mode is actually obtained by processing the set of images to be detected based on the second detection mode.

[0073] Furthermore, when the preset detection mode that matches the detection description information among the K preset detection modes is the first detection mode, whether the scene in the set of images to be detected has changed is determined using the first detection mode.

[0074] Since the set of images to be detected includes at least two images, any two images can be selected for detection / processing. It should be noted that if a scene change is detected between any two images in the set, the scene in the set can be considered to have changed. However, if no change is detected between the two currently selected images, this does not mean that the scene in the set has not changed, unless no change is detected for any pair of images in the set.
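The set-level reasoning in the paragraph above can be sketched in Python as follows: a change between any single pair is enough to report a change for the set, whereas one unchanged pair proves nothing. The callable `pair_changed` is a hypothetical stand-in for whichever preset detection mode was matched; checking every pair exhaustively is an assumption of this sketch, not a requirement stated in the patent.

```python
from itertools import combinations
from typing import Any, Callable, Sequence


def scene_changed_in_set(images: Sequence[Any],
                         pair_changed: Callable[[Any, Any], bool]) -> bool:
    """Report a change for the set if any pair of images shows a change."""
    if len(images) < 2:
        raise ValueError("the set must contain at least two images to be detected")
    # any() stops at the first changed pair; False means every pair was unchanged.
    return any(pair_changed(a, b) for a, b in combinations(images, 2))
```

With n images this performs up to n(n-1)/2 comparisons, so a practical system might compare only consecutive frames or compare each frame against one baseline instead.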

[0075] The two images to be detected can be named the first image to be detected and the second image to be detected based on their acquisition times. In this embodiment, the acquisition time of the first image to be detected can be earlier than the acquisition time of the second image to be detected. Alternatively, the acquisition time of the first image to be detected can be later than the acquisition time of the second image to be detected.

[0076] The following is a detailed explanation of the processing of the image set to be detected in the first detection mode.

[0077] First, a first vector of the first image to be detected and a second vector of the second image to be detected are determined. The first vector indicates the features of the preset region in the first image to be detected, and the second vector indicates the features of the preset region in the second image to be detected.

[0078] Then, based on the first vector and the second vector, the similarity between the preset regions in the first and second images to be detected is determined. Specifically, the cosine similarity between the first and second vectors can be used as the similarity between the preset regions in the first and second images to be detected.

[0079] When the similarity is greater than or equal to the first similarity threshold, it can be determined that the scene in the first image to be detected and the second image to be detected has not changed.

[0080] When the similarity is less than the first similarity threshold, it is determined that the scene in the image to be detected has changed.
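A minimal Python sketch of this threshold test is given below. It assumes the feature vectors are already available as plain lists of numbers and uses cosine similarity (the cosine of the angle between the vectors), so that a lower value means less similar regions. The function names and the default threshold of 0.8 are illustrative assumptions; the patent does not give a threshold value or say how the vectors are extracted.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def scene_changed_first_mode(first_vector: Sequence[float],
                             second_vector: Sequence[float],
                             first_threshold: float = 0.8) -> bool:
    """True when the preset regions are less similar than the first threshold."""
    # 0.8 is an arbitrary placeholder value, not taken from the patent.
    return cosine_similarity(first_vector, second_vector) < first_threshold
```

Because cosine similarity ignores vector magnitude, this comparison is insensitive to a uniform scaling of the features.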

[0081] Furthermore, for any preset region in the image to be detected, when the state of the preset region changes, it should match the preset region in the reference image, that is, have the same state as the preset region in the reference image. However, when an interfering target appears in the preset region of the second image to be detected, the second vector includes the features of the interfering target, and the interfering target occludes the preset region, so the similarity can fall below the first similarity threshold even though the preset region itself has not changed. Therefore, in another embodiment of this application, when the similarity is less than the first similarity threshold, a reference image can be obtained to further determine whether the low similarity is caused by an interfering target. The following describes the method for determining whether an interfering target appears:

[0082] First, a reference image is acquired, and a reference vector for a preset region within the reference image is determined. The preset region in the reference image differs in state from the preset region in the first image to be detected. That is, the reference image also includes the same preset region as the first and second images to be detected, but the features of its preset region are different from those of the preset region in the first image to be detected. Specifically, the similarity between the reference vector indicating the features of the preset region in the reference image and the first vector is less than a second similarity threshold. This second similarity threshold can be the same as or different from the first similarity threshold. For example, if the preset region is a door, its state can be either open or closed. If the door is open in the first image to be detected, then the door in the reference image is closed or partially closed. Therefore, the preset regions (doors) in the first and reference images are different, and the similarity between the first vector and the reference vector is less than the second similarity threshold.

[0083] Then, a reference similarity is determined based on the second vector and the reference vector. Specifically, the reference similarity between the preset regions of the reference image and the second image to be detected can be obtained as the cosine similarity between the second vector and the reference vector.

[0084] Next, the reference similarity is compared with the second similarity threshold. If the reference similarity is also less than the second similarity threshold, it indicates that the preset region in the second image to be detected is also different from the preset region in the reference image. It can be seen that the preset region in the second image to be detected is neither similar to the preset region in the first image to be detected nor similar to the preset region in the reference image. Since the features of the preset region in the reference image actually correspond to the features of the preset region in the first image to be detected after changes, it can be determined that there is an interfering target in the second image to be detected.

[0085] Finally, after confirming the presence of an interfering target, a third image to be detected, acquired later than the second image to be detected, is obtained. A third vector indicating a preset region in the third image to be detected is still determined using the first detection mode. The second vector is updated using this third vector, and the similarity between the first vector and the updated second vector is determined. The interfering target is determined to have disappeared when the similarity between the first vector and the updated second vector is greater than or equal to a first similarity threshold, or when the similarity between the reference vector and the updated second vector is greater than or equal to a second similarity threshold. Correspondingly, it is determined whether the scene in the image to be detected has changed.

[0086] In this embodiment, the similarity of the preset regions between images to be detected, or between an image to be detected and a reference image, is determined by comparing the features of the preset region in each image. This establishes whether the states of the preset regions in the images are the same, and thereby whether the scene in the image to be detected has changed. Furthermore, in this embodiment, the number of states of the preset region is not limited; it can be two, three, or more. For example, when the preset region is a traffic light, the state of the preset region can be that only the red light is on, only the green light is on, or only the yellow light is on.

[0087] The following example uses a door as the preset region for specific illustration:

[0088] The preset region in the first image to be detected is a door, and its state is open. When the similarity between the second image to be detected and the first image to be detected, as determined by the first detection mode, is lower than the first similarity threshold, the state of the door in the second image to be detected differs from the "open" state in the first image to be detected. Therefore, if a door has only two states, the door in the second image to be detected should be closed. However, the reference similarity, obtained by the first detection mode, between the second image to be detected and the reference image (in which the preset region is a door and the door is closed) is also lower than the corresponding threshold (the second similarity threshold). Evidently, the door in the second image to be detected is not closed either.

[0089] Therefore, when the preset region (the door) has only two states, if the similarities between the second image to be detected and the two images whose preset regions are in different states (the first image to be detected and the reference image) are both less than the corresponding thresholds, that is, the door in the second image to be detected is neither open nor closed, this indicates that there is an interfering target in the preset region, for example, a potted plant temporarily placed in front of the door. A third image to be detected, acquired later than the second image to be detected, can then be obtained, and the second vector can be updated with the third vector corresponding to the third image to be detected. Once the similarity between the updated second vector and the first vector is greater than or equal to the first similarity threshold, the interfering target has disappeared, and the scene in the set of images to be detected has not changed. Alternatively:

[0090] If the similarity between the updated second vector and the first vector is less than the first similarity threshold, and the similarity between the updated second vector and the reference vector is greater than or equal to the second similarity threshold, it is determined that the interfering target in the preset region has disappeared and that the state of the preset region now corresponds to the state of the preset region in the reference image and differs from that in the first image to be detected; that is, the scene in the set of images to be detected has changed.
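The follow-up procedure of paragraphs [0085], [0089] and [0090] can be sketched as a loop over later images. This is a minimal sketch under stated assumptions: the function names, the string return values, and the idea of passing the later vectors as a list are illustrative choices and do not come from the source.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is treated as dissimilar
    return dot / (norm_a * norm_b)


def resolve_after_interference(first_vec, reference_vec, later_vecs,
                               first_threshold, second_threshold):
    """Update the second vector with later vectors until the interference clears.

    Returns "unchanged" if the region returns to the state of the first image,
    "changed" if it settles in the state of the reference image, and "unknown"
    if no later vector resolves the interference.
    """
    for third_vec in later_vecs:
        second_vec = third_vec  # the second vector is updated with the third vector
        if cosine_similarity(first_vec, second_vec) >= first_threshold:
            return "unchanged"
        if cosine_similarity(reference_vec, second_vec) >= second_threshold:
            return "changed"
    return "unknown"


# Hypothetical vectors: the door is blocked in one frame, then visible and open again.
door_open, door_closed, blocked = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
print(resolve_after_interference(door_open, door_closed,
                                 [blocked, door_open], 0.8, 0.8))
```

Returning "unknown" when the interference persists mirrors the statement in the worked example that, while an interfering target is present, it cannot be decided whether the scene has changed.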

[0091] Furthermore, when the preset detection mode that matches the detection description information among the K preset detection modes is the second detection mode, any two different images to be detected can be processed based on the second detection mode, as explained in detail below.

[0092] Similarly, any two images to be detected can be distinguished by their acquisition times. That is, the third and fourth images to be detected are images of the same scene or region acquired at different times.

[0093] Specifically, the second detection mode can consist of a RepVGG (Re-param VGG) backbone network and a YOLO detection head. The backbone network is used to extract feature maps from the third and fourth images to be detected. These feature maps can be represented by corresponding feature matrices. Therefore, the second detection mode includes the mapping relationship between the features of any region in the third / fourth image and the elements in the corresponding first / second feature matrices. In other words, the features of any region in the third image correspond to at least one element in the first feature matrix, and the features of any region in the fourth image correspond to at least one element in the second feature matrix.

[0094] Furthermore, after the first feature matrix corresponding to the third image to be detected and the second feature matrix corresponding to the fourth image to be detected are determined through the RepVGG backbone network, feature fusion can be performed so that the YOLO detection head can detect the difference regions and determine their category. Referring to Figure 2, in one embodiment of this application, the fused feature map obtained by subtracting the feature maps of the third and fourth images to be detected is used as the feature fusion result. The YOLO detection head then regresses the target detection box within the fused feature map, and the category of the target detection box is determined. This category can be: target addition, target replacement, and / or target disappearance.

[0095] The aforementioned feature fusion step, which involves subtracting feature maps, can be accomplished by subtracting corresponding elements of the first and second feature matrices. The difference between the elements of the first and second feature matrices that indicate the same region is used as a feature parameter. This feature parameter indicates the degree of difference between the same region in the third and fourth images to be detected. Feature parameters exceeding a difference threshold are then labeled as difference parameters. Finally, based on the preset mapping relationship in the second detection mode, the difference region corresponding to each difference parameter can be determined in the fourth image to be detected.
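The element-wise subtraction and thresholding of paragraph [0095] can be sketched as below. Several simplifications are assumptions made for illustration: each feature matrix holds one scalar per cell (real feature maps have many channels), the difference is taken as an absolute value, and the preset mapping relationship is modelled as a fixed `stride` that maps a matrix cell to a square block of pixels. The function name is hypothetical.

```python
def difference_regions(first_matrix, second_matrix, difference_threshold, stride):
    """Return pixel boxes (x0, y0, x1, y1) for cells whose difference is too large.

    first_matrix / second_matrix: equally sized lists of rows of floats.
    stride: assumed number of image pixels covered by one matrix cell per side.
    """
    regions = []
    for row, (cells_1, cells_2) in enumerate(zip(first_matrix, second_matrix)):
        for col, (e1, e2) in enumerate(zip(cells_1, cells_2)):
            feature_parameter = abs(e1 - e2)  # degree of difference for this region
            if feature_parameter > difference_threshold:
                # Difference parameter: map the cell back to the image region.
                regions.append((col * stride, row * stride,
                                (col + 1) * stride, (row + 1) * stride))
    return regions


first = [[0.1, 0.2],
         [0.3, 0.4]]
second = [[0.1, 0.9],
          [0.3, 0.4]]
print(difference_regions(first, second, difference_threshold=0.5, stride=32))
```

In the described embodiment the fused map is passed to a YOLO detection head that regresses boxes and categories; the per-cell boxes here only illustrate how a difference parameter is tied back to an image region.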

[0096] It should be noted that although the second detection mode targets any region in any two images to be detected, the detection network (i.e., the second detection mode) uses the target detection box to regress the difference region and outputs the target detection box category (target addition, target replacement, and / or target disappearance). Therefore, it does not cause a significant decrease in efficiency compared to the first detection mode. Moreover, it can effectively improve the detection efficiency compared to the existing method of detecting any region in an image through a segmentation network.

[0097] Furthermore, for the third and fourth images to be detected, where any region may change, a corresponding sub-detection mode can be matched from the second detection mode among K preset detection modes based on the detection items in the detection description information. This application embodiment provides two sub-detection modes for the second detection mode: a first sub-detection mode and a second sub-detection mode. The first sub-detection mode is used to detect environmental changes in the scene within the set of images to be detected, and the second sub-detection mode is used to detect target changes in the scene within the set of images to be detected.

[0098] It should be noted that the detection items in the detection description information can be set based on information such as business requirements. Furthermore, environmental changes and target changes can be specifically categorized into objects without fixed forms in the natural environment and objects with relatively fixed forms in space. For example, when the detection description item indicates detection of changes in light or water stains, a first sub-detection mode for detecting environmental changes is matched based on this detection item. Then, the third and fourth images to be detected can be input into the first sub-detection mode to determine a first feature matrix corresponding to the third image and a second feature matrix corresponding to the fourth image.

[0099] When the detection description information indicates that paper scraps on the substrate, stains on the wall, etc., should be detected, a second sub-detection mode that detects changes in the target is matched based on this detection item. Then, the third and fourth images to be detected can be input into the second sub-detection mode to determine a first feature matrix corresponding to the third image to be detected and a second feature matrix corresponding to the fourth image to be detected.
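The matching of a sub-detection mode to a detection item, as described in paragraphs [0097] to [0099], amounts to a simple dispatch. The sketch below is illustrative only: the item identifiers and the returned mode names are hypothetical labels, and in practice the detection items would be configured from business requirements.

```python
# Hypothetical identifiers for detection items that concern the environment
# (objects without a fixed form, such as light or water stains).
ENVIRONMENT_ITEMS = {"light", "water_stain"}


def select_sub_detection_mode(detection_item):
    """Pick the sub-detection mode of the second detection mode for an item."""
    if detection_item in ENVIRONMENT_ITEMS:
        return "first_sub_detection_mode"   # detects environmental changes
    return "second_sub_detection_mode"      # detects target changes


for item in ("light", "paper_scraps", "wall_stain"):
    print(item, "->", select_sub_detection_mode(item))
```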

[0100] The first and second sub-detection modes have the same structure, but they can be obtained using different training samples depending on their intended use. The training methods for the first and second sub-detection modes are described below.

[0101] First, a first training label and a second training label are set for the first training image. The first training label indicates the features of the environment in any region of the first training image, and the second training label indicates the features of the target in any region of the first training image, while ignoring the features of the environment.

[0102] The first training image and the first label are input into the first training model to obtain the first training matrix. The parameters of the first training model are adjusted based on the first error between the first training matrix and the first preset matrix corresponding to the first label until the first error is less than the first error threshold, thus obtaining the first sub-detection mode.

[0103] The first training image and the second label are input into the second training model to obtain the second training matrix. The parameters of the second training model are adjusted by the second error between the second training matrix and the second preset matrix corresponding to the second label until the second error is less than the second error threshold, thus obtaining the second sub-detection mode.
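The stopping rule in paragraphs [0102] and [0103], adjusting parameters until the error falls below an error threshold, can be illustrated with a deliberately tiny stand-in. The sketch below fits a two-parameter linear model by gradient descent on a mean squared error; it is not the RepVGG and YOLO network of the embodiment, and the model, the error measure, the learning rate, and all names are assumptions chosen only to show the loop structure.

```python
def train_until_below_threshold(inputs, targets, error_threshold,
                                learning_rate=0.1, max_steps=10_000):
    """Adjust parameters until the error is below the threshold (toy model).

    Returns (weight, bias, error). The "training matrix" of the text is
    replaced here by a list of scalar predictions.
    """
    weight, bias = 0.0, 0.0
    error = float("inf")
    for _ in range(max_steps):
        predictions = [weight * x + bias for x in inputs]
        residuals = [p - t for p, t in zip(predictions, targets)]
        error = sum(r * r for r in residuals) / len(residuals)
        if error < error_threshold:
            break  # training is finished once the error is small enough
        # Gradient of the mean squared error with respect to each parameter.
        grad_w = 2.0 * sum(r * x for r, x in zip(residuals, inputs)) / len(inputs)
        grad_b = 2.0 * sum(residuals) / len(inputs)
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias, error


# Toy data generated from y = 2x + 1.
w, b, err = train_until_below_threshold([0.0, 1.0, 2.0, 3.0],
                                        [1.0, 3.0, 5.0, 7.0],
                                        error_threshold=1e-6)
print(round(w, 2), round(b, 2), err < 1e-6)
```

The `max_steps` bound is an added safeguard so that the loop terminates even if the threshold is never reached; the source does not mention such a limit.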

[0104] During the training of the first and second sub-detection modes, negative samples can be set so that the first or second training model learns from them, which makes training more efficient and accurate. For example, the second training model learns to regress a detection box only for regions where the target changes, and not for regions where the target remains unchanged. Therefore, training images in which the target in a certain region remains unchanged but the environment changes (e.g., changes in lighting or shadows) can be used as negative samples and input into the second training model for training.

[0105] It is worth noting that in the same training image mentioned above, changes in the target or environment are relative, not absolute. For example, consider the detection of changes in fire lane occupancy. Since the location of vehicles occupying fire lanes is not fixed, this corresponds to the second sub-detection mode in the second detection mode. In this case, motor vehicles or non-motor vehicles can be detected as targets, while pedestrians are considered as the environment and not detected. Accordingly, pedestrian features are excluded, and motor vehicles / non-motor vehicles are used as targets to train the second training model, so that the second training model focuses on learning to distinguish the features of motor vehicles / non-motor vehicles, thus obtaining the second sub-detection mode. For important and confidential locations, the appearance of any foreign object should trigger an alarm, including motor vehicles, pedestrians, etc. In this case, pedestrians are no longer classified as the environment, but are trained together with motor vehicles, etc., as targets.

[0106] Based on steps 101-103 above, a specific example is described below with reference to Figure 3.

[0107] First, the detection description information is obtained from the task detection instruction to match the corresponding detection mode among multiple preset detection modes and process the images to be detected in the set of images to be detected.

[0108] When the detection area is determined based on the region location information in the detection description information, and the detection area is fixed, it is matched to the first detection mode, and the preset area in the first detection mode is set according to the region location information in the detection description information. Then, any two different images to be detected are input into the first detection mode for feature extraction, and the similarity between the preset areas in the images to be detected at different acquisition times is determined. When the similarity is greater than or equal to the similarity threshold, it is determined that the preset area in the current image to be detected has not changed. Otherwise, it can be further determined whether there is an interfering target in the preset area according to the method described in step 102. If there is no interfering target, it can be determined that the scene in the set of images to be detected has changed; otherwise, due to the existence of the interfering target, it is unknown whether the scene in the set of images to be detected has changed.

[0109] When the image set to be processed is matched to the second detection mode in the preset detection modes according to the detection description information, it can be further determined whether the detection item in the image set to be detected is an environment detection type according to the detection description information, and the scene in the image set to be detected is detected in the first sub-detection mode or the second sub-detection mode accordingly. Since the second detection mode in this embodiment regresses the difference region through the target detection box, if the first sub-detection mode or the second sub-detection mode does not regress the target detection box, it is determined that the scene in the image set to be detected has not changed. If the target detection box is regressed, it can be determined that the scene in the image set to be detected has changed, and the type of change in the scene in the image set to be detected can be determined according to the category of the target detection box.
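The top-level flow of paragraphs [0107] to [0109] can be summarised in a short sketch. The dictionary keys (`region_location`, `category`), the mode names, and the rule that a present region location means a fixed detection area are assumptions made for illustration; the source only states that the mode is matched from the region location information.

```python
def select_detection_mode(description):
    """Match a preset detection mode from the detection description information.

    description: dict with a hypothetical optional key "region_location"
    holding the fixed detection area, e.g. (x0, y0, x1, y1).
    """
    if description.get("region_location") is not None:
        return "first_detection_mode"   # fixed preset region: compare similarity
    return "second_detection_mode"      # any region may change: find difference regions


def summarize_second_mode(detection_boxes):
    """Interpret the boxes regressed by a sub-detection mode.

    No box means the scene is unchanged; otherwise the box categories give
    the type of change.
    """
    if not detection_boxes:
        return False, []
    return True, sorted({box["category"] for box in detection_boxes})


print(select_detection_mode({"region_location": (10, 20, 110, 220)}))
print(summarize_second_mode([{"category": "target_addition"}]))
```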

[0110] It is worth noting that, when determining whether the scene has changed between any two images to be detected, if the first image to be detected or the third image to be detected does not change, that is, when the image used as the basis for comparison is fixed, the first image to be detected and the third image to be detected serve as the background image in Figure 3. The features of the preset region in the first image to be detected, as well as the first feature matrix of the third image to be detected, can then be recorded, and feature extraction / detection is performed only on the second and fourth images to be detected, which reduces the computational load of the first / second detection mode. This effectively improves the efficiency of determining whether the scene in the set of images to be detected has changed without affecting the accuracy of the first / second detection mode.
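The saving described in paragraph [0110] comes from extracting the background features once and reusing them. A minimal sketch of that idea follows; the class name and the `extract_features` callable are hypothetical, and the sketch assumes the background image stays fixed for the lifetime of the cache.

```python
class BackgroundFeatureCache:
    """Record the features of a fixed background image so they are computed once."""

    def __init__(self, extract_features):
        # extract_features: any callable mapping an image to its features
        # (a vector for the first mode, a feature matrix for the second).
        self._extract_features = extract_features
        self._recorded = None

    def features(self, background_image):
        """Return the recorded features, extracting them only on first use."""
        if self._recorded is None:
            self._recorded = self._extract_features(background_image)
        return self._recorded
```

Only the later image (the second or fourth image to be detected) then needs to pass through feature extraction on each comparison. If the background image is replaced, a new cache instance would be needed, since this sketch does not detect that case.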

[0111] Based on the same inventive concept, this application provides an image scene change detection device, which corresponds to the image scene change detection method shown in Figure 1. For the specific implementation of the device, please refer to the foregoing method embodiments; repeated content is not described again. Referring to Figure 4, the device includes:

[0112] Collection unit 401: used to obtain detection description information from task detection instructions; and

[0113] Obtain a set of images to be detected; the set of images to be detected contains at least two images to be detected.

[0114] Matching unit 402: used to determine whether the scene in the set of images to be detected has changed by using a preset detection mode that matches the detection description information from among K preset detection modes; wherein, K is an integer not less than 1, and the preset detection modes include at least a first detection mode and / or a second detection mode;

[0115] The first detection mode includes: determining whether the scene in the image set to be detected has changed based on the similarity between preset regions in different images to be detected contained in the image set to be detected;

[0116] The second detection mode includes: determining whether the scene in the set of images to be detected has changed by identifying the difference regions between different images in the set of images to be detected.

[0117] The image scene change detection device further includes a mode unit, which is specifically used to determine a preset detection mode that matches the detection description information based on the region location information in the detection description information.

[0118] If the preset detection mode includes the first detection mode, then the matching unit 402 is specifically used to determine a first vector of the first image to be detected and a second vector of the second image to be detected; wherein, the first vector indicates the features of the preset region in the first image to be detected, and the second vector indicates the features of the preset region in the second image to be detected; the first image to be detected and the second image to be detected are different images to be detected in the set of images to be detected; the similarity is determined based on the first vector and the second vector; when the similarity is less than a first similarity threshold, it is determined that the scene in the set of images to be detected has changed.

[0119] The matching unit 402 is specifically used to acquire a reference image and determine a reference vector of the reference image; wherein, the similarity between the preset region in the reference image and the preset region in the first image to be detected is less than a second similarity threshold, and the reference vector indicates the features of the preset region in the reference image; a reference similarity is determined based on the second vector and the reference vector; when the reference similarity is less than the second similarity threshold, it is determined that an interfering target appears in the preset region of the second image to be detected; then, a third image to be detected is acquired, and the second vector is updated using the third vector of the third image to be detected; wherein, the acquisition time of the third image to be detected is later than the acquisition time of the second image to be detected, and the third vector indicates the features of the preset region in the third image to be detected; based on the similarity between the updated second vector and the first vector, it is determined whether the scene in the set of images to be detected has changed.

[0120] The preset detection mode that matches the detection description information includes the second detection mode; then the matching unit 402 is further configured to determine the first feature matrix of the third image to be detected and the second feature matrix of the fourth image to be detected in the image set to be detected; determine the difference between the elements indicating the same region between the first feature matrix and the second feature matrix as a feature parameter; wherein, the feature parameter indicates the degree of difference between the same region in the third image to be detected and the same region in the fourth image to be detected; mark the feature parameter that is greater than the difference threshold as a difference parameter; based on a preset mapping relationship, determine the difference region corresponding to the difference parameter in the fourth image to be detected; the preset mapping relationship indicates the correspondence between the features of any region in any image and the elements in the feature matrix of any image; based on the difference region, determine whether the scene in the image set to be detected has changed.

[0121] The second detection mode includes a first sub-detection mode and a second sub-detection mode. The first sub-detection mode is used to detect environmental changes in the scene in the set of images to be detected, and the second sub-detection mode is used to detect target changes in the scene in the images to be detected.

[0122] The matching unit 402 is further configured to determine, based on the detection items in the detection description information, whether to perform environmental change detection on the scene in the set of images to be detected; if yes, input the third image to be detected and the fourth image to be detected into the first sub-detection mode to determine the first feature matrix corresponding to the third image to be detected and the second feature matrix corresponding to the fourth image to be detected; if no, input the third image to be detected and the fourth image to be detected into the second sub-detection mode to determine the first feature matrix corresponding to the third image to be detected and the second feature matrix corresponding to the fourth image to be detected.

[0123] Based on the same inventive concept, embodiments of this application also provide a readable storage medium, including:

[0124] memory,

[0125] The memory is used to store instructions that, when executed by a processor, cause the apparatus including the readable storage medium to perform the image scene change detection method described above.

[0126] Based on the same inventive concept as the aforementioned image scene change detection method, this application also provides an electronic device that can implement the function of the aforementioned image scene change detection method. Referring to Figure 5, the electronic device includes:

[0127] At least one processor 501 and a memory 502 connected to the at least one processor 501. In this embodiment, the specific connection medium between the processor 501 and the memory 502 is not limited. In Figure 5, the processor 501 and the memory 502 are connected via bus 500 as an example. Bus 500 is indicated by a thick line in Figure 5; the connections between other components are shown for illustrative purposes only and are not limiting. Bus 500 can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, it is shown in Figure 5 as a single thick line, but this does not imply that there is only one bus or one type of bus. Alternatively, the processor 501 can also be called a controller; there is no restriction on the name.

[0128] In this embodiment, the memory 502 stores instructions executable by the at least one processor 501. By executing the instructions stored in the memory 502, the at least one processor 501 can perform the scene change detection method described above. The processor 501 can implement the functions of each module in the device shown in Figure 4.

[0129] The processor 501 is the control center of the device. It can connect to various parts of the control device through various interfaces and lines. By running or executing instructions stored in memory 502 and calling data stored in memory 502, the processor can perform various functions and process data, thereby monitoring the device as a whole.

[0130] In one possible design, processor 501 may include one or more processing units. Processor 501 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and applications, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into processor 501. In some embodiments, processor 501 and memory 502 may be implemented on the same chip; in some embodiments, they may also be implemented on separate chips.

[0131] The processor 501 can be a general-purpose processor, such as a central processing unit (CPU), digital signal processor, application-specific integrated circuit, field-programmable gate array or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component, capable of implementing or executing the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the image scene change detection method disclosed in the embodiments of this application can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules within the processor.

[0132] Memory 502, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. Memory 502 may include at least one type of storage medium, such as flash memory, hard disk, multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic storage, magnetic disk, optical disk, etc. Memory 502 can be any other medium capable of carrying or storing desired program code in the form of instructions or data structures that can be accessed by a computer, but is not limited thereto. In the embodiments of this application, memory 502 can also be a circuit or any other device capable of implementing storage functions for storing program instructions and / or data.

[0133] By designing and programming the processor 501, the code corresponding to the image scene change detection method described in the foregoing embodiments can be embedded into the chip, enabling the chip, at runtime, to execute the steps of the image scene change detection method shown in Figure 1. How to design and program the processor 501 is a technique well-known to those skilled in the art and will not be elaborated here.

[0134] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional modules is used as an example. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. The specific working process of the system, device, and unit described above can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.

[0135] In the several embodiments provided by this invention, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. For instance, the division of modules or units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.

[0136] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0137] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0138] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes: Universal Serial Bus flash disks, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media capable of storing program code.

[0139] Obviously, those skilled in the art can make various modifications and variations to this invention without departing from its spirit and scope. Therefore, if these modifications and variations fall within the scope of the claims of this invention and their equivalents, this invention also intends to include these modifications and variations.

Claims

1. A method for detecting changes in an image scene, characterized in that it includes: obtaining detection description information from a task detection instruction; and obtaining a set of images to be detected, wherein the set of images to be detected contains at least two images to be detected, the at least two images to be detected are acquired for a detection area, and the detection description information includes region location information of the detection area; based on the region location information in the detection description information, determining, from K preset detection modes, a preset detection mode matching the detection description information, and determining, by the preset detection mode, whether the scene in the set of images to be detected has changed; wherein K is an integer not less than 1, and the preset detection modes include at least a first detection mode and / or a second detection mode; the first detection mode includes: determining whether the scene in the set of images to be detected has changed based on the similarity between preset regions in different images to be detected contained in the set of images to be detected; the second detection mode includes: determining whether the scene in the set of images to be detected has changed by identifying the difference regions between different images in the set of images to be detected.

2. The method as described in claim 1, characterized in that the preset detection mode that matches the detection description information includes the first detection mode; in that case, determining whether the scene in the set of images to be detected has changed by using the preset detection mode that matches the detection description information from among the K preset detection modes includes: determining a first vector of a first image to be detected and a second vector of a second image to be detected, wherein the first vector indicates the features of the preset region in the first image to be detected, the second vector indicates the features of the preset region in the second image to be detected, and the first image to be detected and the second image to be detected are different images to be detected in the set of images to be detected; determining the similarity based on the first vector and the second vector; and, when the similarity is less than a first similarity threshold, determining that the scene in the set of images to be detected has changed.
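The comparison in claim 2 can be sketched as follows. The claim does not name a similarity measure; cosine similarity is used here purely as an assumption, and the function names and the default threshold value are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (an assumed similarity measure)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def scene_changed(first_vector: Sequence[float],
                  second_vector: Sequence[float],
                  first_threshold: float = 0.8) -> bool:
    """The scene is judged changed when the similarity falls below the threshold."""
    return cosine_similarity(first_vector, second_vector) < first_threshold
```

Here `first_vector` and `second_vector` stand for the features of the same preset region in two different images to be detected; how those features are extracted is outside the scope of this sketch.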

3. The method as described in claim 2, characterized in that the step of determining that the scene in the set of images to be detected has changed includes: acquiring a reference image and determining a reference vector of the reference image, wherein the similarity between the preset region in the reference image and the preset region in the first image to be detected is less than a second similarity threshold, and the reference vector indicates the features of the preset region in the reference image; determining a reference similarity based on the second vector and the reference vector; when the reference similarity is less than the second similarity threshold, determining that an interfering target appears in the preset region of the second image to be detected, then acquiring a third image to be detected and updating the second vector using a third vector of the third image to be detected, wherein the acquisition time of the third image to be detected is later than the acquisition time of the second image to be detected, and the third vector indicates the features of the preset region in the third image to be detected; and determining, based on the similarity between the updated second vector and the first vector, whether the scene in the set of images to be detected has changed.
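The interference check of claim 3 can be illustrated with a small sketch. Cosine similarity is again an assumed measure, all names are hypothetical, and the behaviour when no interfering target is found (falling back to the comparison of claim 2) is an assumption, since the claim only spells out the interference branch.

```python
import math
from typing import Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity (assumed measure; the claim does not specify one)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def changed_after_interference_check(first_vector: Sequence[float],
                                     second_vector: Sequence[float],
                                     reference_vector: Sequence[float],
                                     third_vector: Sequence[float],
                                     first_threshold: float,
                                     second_threshold: float) -> bool:
    """Re-evaluate a tentative change while guarding against an interfering target."""
    reference_similarity = _cosine(second_vector, reference_vector)
    if reference_similarity < second_threshold:
        # An interfering target is assumed to occupy the preset region of the
        # second image, so the second vector is replaced by the vector of a
        # later (third) image of the same region.
        second_vector = third_vector
    # Assumption: without interference, the original second vector is compared as-is.
    return _cosine(second_vector, first_vector) < first_threshold
```

A caller would supply `third_vector` from an image acquired after the second image, as the claim requires.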

4. The method according to any one of claims 1-3, characterized in that the preset detection mode that matches the detection description information includes the second detection mode; in that case, determining whether the scene in the set of images to be detected has changed by using the preset detection mode that matches the detection description information from among the K preset detection modes includes: determining a first feature matrix of a third image to be detected and a second feature matrix of a fourth image to be detected in the set of images to be detected; determining, as a feature parameter, the difference between elements of the first feature matrix and the second feature matrix that indicate the same region, wherein the feature parameter indicates the degree of difference between that region in the third image to be detected and the same region in the fourth image to be detected; marking the feature parameters that are greater than a difference threshold as difference parameters; determining, based on a preset mapping relationship, the difference region in the fourth image to be detected that corresponds to each difference parameter, wherein the preset mapping relationship indicates the correspondence between the features of any region in any image and the elements in the feature matrix of that image; and determining, based on the difference regions, whether the scene in the set of images to be detected has changed.
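The steps of claim 4 can be sketched as below. Several points are assumptions made only for illustration: the feature parameter is taken as the absolute element-wise difference, the preset mapping relationship is modelled as a uniform grid in which each matrix element covers a `cell_size` by `cell_size` pixel block, and the scene is judged changed when at least one difference region exists. All function names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Box = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) in pixels


def difference_regions(first_matrix: Matrix,
                       second_matrix: Matrix,
                       difference_threshold: float,
                       cell_size: int) -> List[Box]:
    """Map feature-matrix elements that differ strongly back to image regions."""
    if len(first_matrix) != len(second_matrix) or any(
            len(row_a) != len(row_b)
            for row_a, row_b in zip(first_matrix, second_matrix)):
        raise ValueError("feature matrices must have the same shape")

    regions: List[Box] = []
    for r, (row_a, row_b) in enumerate(zip(first_matrix, second_matrix)):
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            feature_parameter = abs(a - b)  # degree of difference for this region
            if feature_parameter > difference_threshold:
                # Assumed grid mapping: element (r, c) covers one pixel block.
                regions.append((c * cell_size, r * cell_size, cell_size, cell_size))
    return regions


def scene_changed_by_regions(first_matrix: Matrix,
                             second_matrix: Matrix,
                             difference_threshold: float,
                             cell_size: int) -> bool:
    """Assumed decision rule: any difference region means the scene changed."""
    return bool(difference_regions(first_matrix, second_matrix,
                                   difference_threshold, cell_size))
```

In practice the decision rule could also weigh the number or total area of the difference regions; the claim leaves that choice open.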

5. The method as described in claim 4, characterized in that the second detection mode includes a first sub-detection mode and a second sub-detection mode, wherein the first sub-detection mode is used to detect environmental changes in the scene in the set of images to be detected, and the second sub-detection mode is used to detect target changes in the scene in the set of images to be detected.

6. The method as described in claim 5, characterized in that the step of determining the first feature matrix of the third image to be detected and the second feature matrix of the fourth image to be detected in the set of images to be detected includes: determining, based on the detection items in the detection description information, whether to perform environmental change detection for the scene in the set of images to be detected; if so, inputting the third image to be detected and the fourth image to be detected into the first sub-detection mode, and determining the first feature matrix corresponding to the third image to be detected and the second feature matrix corresponding to the fourth image to be detected; if not, inputting the third image to be detected and the fourth image to be detected into the second sub-detection mode, and determining the first feature matrix corresponding to the third image to be detected and the second feature matrix corresponding to the fourth image to be detected.
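The branching in claim 6 amounts to choosing a feature extractor from the detection items. The sketch below assumes the detection items are a set of strings and that environmental change detection is requested by a hypothetical `"environment_change"` item; the two extractors stand in for the first and second sub-detection modes and are passed in as plain callables.

```python
from typing import Callable, List, Set, Tuple

Matrix = List[List[float]]
Extractor = Callable[[object], Matrix]

# Hypothetical detection item name; the claims do not define the item format.
ENVIRONMENT_ITEM = "environment_change"


def extract_feature_matrices(third_image: object,
                             fourth_image: object,
                             detection_items: Set[str],
                             environment_extractor: Extractor,
                             target_extractor: Extractor) -> Tuple[Matrix, Matrix]:
    """Route both images through the sub-detection mode chosen by the detection items."""
    if ENVIRONMENT_ITEM in detection_items:
        extractor = environment_extractor  # first sub-detection mode
    else:
        extractor = target_extractor       # second sub-detection mode
    # Both images go through the same extractor so that the matrices are comparable.
    return extractor(third_image), extractor(fourth_image)
```

The resulting pair of matrices can then be compared element by element as described in claim 4.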

7. A device for detecting changes in an image scene, characterized in that it comprises: a collection unit, used to obtain detection description information from a task detection instruction and to obtain a set of images to be detected, wherein the set of images to be detected contains at least two images to be detected, the at least two images to be detected are acquired for a detection area, and the detection description information includes region location information of the detection area; and a matching unit, used to determine, from K preset detection modes and based on the region location information in the detection description information, a preset detection mode that matches the detection description information, and to determine, by means of that preset detection mode, whether the scene in the set of images to be detected has changed; wherein K is an integer not less than 1, and the preset detection modes include at least a first detection mode and/or a second detection mode; the first detection mode includes: determining whether the scene in the set of images to be detected has changed based on the similarity between preset regions in different images to be detected contained in the set of images to be detected; the second detection mode includes: determining whether the scene in the set of images to be detected has changed by identifying difference regions between different images to be detected in the set of images to be detected.

8. A readable storage medium, characterized in that it comprises a memory, the memory being used to store instructions that, when executed by a processor, cause a device including the readable storage medium to perform the method as described in any one of claims 1-6.

9. An electronic device, characterized in that it comprises: a memory, used to store a computer program; and a processor which, when executing the computer program stored in the memory, implements the method as described in any one of claims 1-6.