A fire passage occupation detection method and device, an electronic device, and a storage medium

By performing target detection and depth data analysis on the video stream of fire lanes, and combining depth difference and duration of continuous occupancy, the problems of false detection and missed detection in traditional methods are solved, and accurate detection and timely early warning of fire lane occupancy are achieved.

CN120431508B · Active · Publication Date: 2026-06-19 · CHINA NAT BUILDING MATERIALS TECH CO LTD +3

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHINA NAT BUILDING MATERIALS TECH CO LTD
Filing Date
2025-04-27
Publication Date
2026-06-19

Abstract

This invention discloses a method, apparatus, electronic device, and storage medium for detecting the occupancy of fire lanes. The method includes: acquiring a first image frame from a video stream, the first image frame including a target detection area, which is the area where the fire lane is located; performing target detection on the first image frame to obtain a target detection result; if the target detection result indicates the presence of a detected object, determining first depth data corresponding to the detected object in the first image frame; acquiring second depth data corresponding to the target detection area, and determining the area occupancy detection result of the target detection area based on the first and second depth data; if the area occupancy detection result indicates occupancy, determining the duration of continuous occupancy of the target detection area by the detected object, generating warning information when the duration of continuous occupancy meets warning conditions, and executing warning operations based on the warning information. This method achieves the determination of area occupancy detection results through image frames and multiple depth data, improving the accuracy of the detection results.

Description

Technical Field

[0001] This invention relates to the field of automatic detection technology, and in particular to a method, device, electronic equipment, and storage medium for detecting the occupancy of fire lanes.

Background Technology

[0002] Fire lane obstruction detection technology is a crucial technology in areas such as factory safety, residential area monitoring, and public place surveillance. If fire lanes are blocked by debris, vehicles, or other objects, they become obstructed, posing a significant threat to life and property in the event of an emergency.

[0003] Traditional methods for detecting the occupancy of fire lanes are mostly based on two-dimensional plane boundary delineation, which cannot effectively handle complex situations in three-dimensional space, such as passing vehicles and the proximity of accumulated objects. Current fire lane occupancy detection algorithms are based on two-dimensional image detection models, which cannot accurately detect the occupancy of fire lanes, resulting in false positives and false negatives, posing safety hazards.

Summary of the Invention

[0004] This invention provides a method, device, electronic equipment, and storage medium for detecting the occupancy of fire lanes, in order to solve the problem of inaccurate detection of the occupancy status of fire lanes.

[0005] According to one aspect of the present invention, a method for detecting the occupancy of fire lanes is provided, comprising:

[0006] The first image frame in the video stream is acquired. The first image frame includes the target detection area, which is the area where the fire lane is located.

[0007] Target detection is performed on the first image frame to obtain the target detection results;

[0008] If the target detection result indicates the presence of a detection object, determine the first depth data corresponding to the detection object in the first image frame;

[0009] Obtain the second depth data corresponding to the target detection area, and determine the area occupancy detection result of the target detection area based on the first depth data and the second depth data, wherein the second depth data is the calibration depth data of the target detection area in the unoccupied state;

[0010] If the area occupancy detection result is that the area is occupied, determine the duration of continuous occupancy of the target detection area by the detected object. If the duration of continuous occupancy meets the warning conditions, generate warning information and execute warning operations based on the warning information.

[0011] Optionally, target detection is performed on the first image frame to obtain target detection results, including: calling a pre-trained target detection model, performing target detection on the first image frame using the pre-trained target detection model, and obtaining target recognition results, wherein the target recognition results include target object type; obtaining a preset detection object type, performing matching based on the target object type within the preset detection object type, and obtaining matching results corresponding to the target object type; if the matching result is successful, then the target detection result is determined to indicate that a detection object exists in the target detection region, and if the matching result is unsuccessful, then the target detection result is determined to indicate that a detection object does not exist in the target detection region.

[0012] Optionally, determining the first depth data of the detected object in the first image frame includes: calling a pre-trained monocular depth estimation model, processing the first image frame based on the pre-trained monocular depth estimation model to obtain the depth data corresponding to the first image frame; and extracting the first depth data of the detected object from the depth data corresponding to the first image frame.

[0013] Optionally, determining the region occupancy detection result of the target detection area based on the first depth data and the second depth data includes: traversing the first depth data corresponding to the detection object and the second depth data corresponding to the target detection area, comparing the first depth data and the second depth data of the same location point, and determining the depth difference corresponding to each location point; if the depth difference of any location point in the target detection area is less than a preset depth difference threshold, determining that the location point is in an occupied state; determining the number of location points in the target detection area that are in an occupied state, and if the number is greater than or equal to a preset number threshold, then determining that the region occupancy detection result corresponding to the target detection area is occupied.

[0014] Optionally, determining the region occupancy detection result of the target detection area based on the first depth data and the second depth data includes: determining the overlapping area between the detection object and the target detection area based on the first depth data and the second depth data; determining a first average depth value based on the first depth data corresponding to the overlapping area; determining a second average depth value based on the second depth data corresponding to the overlapping area; determining an average depth difference based on the first average depth value and the second average depth value; and determining the region occupancy detection result of the target detection area as occupied if the average depth difference is greater than or equal to a preset depth difference threshold.

[0015] Optionally, determining the duration of continuous occupation of the target detection area by the detected object includes: acquiring a second image frame in the video stream, the second image frame being located after the first image frame; acquiring the first detection box corresponding to the detected object in the target detection result corresponding to the first image frame; performing target detection on the second image frame and determining the second detection box of the detected object in the second image frame; determining the intersection over union (IoU) of the first and second detection boxes; if the IoU is greater than or equal to a preset IoU threshold, then updating the duration of continuous occupation based on the time interval between the second image frame and the first image frame; if the IoU is less than the preset IoU threshold, then resetting the duration of continuous occupation.

[0016] Optionally, the method further includes: acquiring the acquisition scene of the video stream, and acquiring the second depth data of the target detection region under the acquisition scene; the acquisition conditions of the video stream are different under different acquisition scenes; wherein, the method for determining the second depth data of the target detection region under different acquisition scenes includes: acquiring the third image frame of the target detection region under different acquisition scenes, in which the target detection region is in an unoccupied state; processing the third image frame based on a pre-trained monocular depth estimation model to obtain the second depth data of the target detection region under different acquisition scenes.

[0017] According to another aspect of the present invention, a fire lane occupancy detection device is provided, comprising:

[0018] The image frame acquisition module is used to acquire the first image frame in the video stream. The first image frame includes a target detection area, which is the area where the fire lane is located.

[0019] The target detection result determination module is used to perform target detection on the first image frame and obtain the target detection result;

[0020] The first depth data determination module is used to determine the first depth data corresponding to the detected object in the first image frame when the target detection result indicates that a detected object exists.

[0021] The region occupancy detection result determination module is used to obtain the second depth data corresponding to the target detection region, and determine the region occupancy detection result of the target detection region based on the first depth data and the second depth data, wherein the second depth data is the calibration depth data of the target detection region in the unoccupied state;

[0022] The early warning processing module is used to determine the duration of continuous occupation of the target detection area by the detected object when the area occupancy detection result is that the area is occupied. When the duration of continuous occupation meets the early warning conditions, an early warning message is generated, and an early warning operation is performed based on the early warning message.

[0023] According to another aspect of the present invention, an electronic device is provided, the electronic device comprising:

[0024] At least one processor; and

[0025] A memory that is communicatively connected to at least one processor; wherein,

[0026] The memory stores a computer program that can be executed by at least one processor, such that the at least one processor is able to perform the fire lane occupancy detection method according to any embodiment of the present invention.

[0027] According to another aspect of the present invention, a computer-readable storage medium is provided, which stores computer instructions for causing a processor to execute and implement the fire lane occupancy detection method of any embodiment of the present invention.

[0028] The technical solution of this invention involves acquiring a first image frame from a video stream, the first image frame including a target detection region, which is the area where the fire lane is located; performing target detection on the first image frame to obtain a target detection result, which provides a data foundation for subsequently determining the area occupancy detection result; when the target detection result indicates the presence of a detection object, determining the first depth data corresponding to the detection object in the first image frame, which avoids extracting depth information from image frames without detection objects and so avoids wasting computing resources; acquiring second depth data corresponding to the target detection region, and determining the area occupancy detection result of the target detection region based on the first and second depth data, wherein the second depth data is the calibration depth data of the target detection region in an unoccupied state; determining the result from these two sets of depth data helps to improve the accuracy of the area occupancy detection result;

[0029] When the area occupancy detection result is "occupied," the duration of continuous occupancy of the target detection area by the detected object is determined. When the duration of continuous occupancy meets the warning conditions, warning information is generated, and a warning operation is executed based on the warning information. This allows the occupancy duration to be determined once the area is found to be occupied, and the warning information to be generated based on that duration. It effectively avoids false alarms for detected objects that merely pass through the target detection area, addresses inaccurate detection and false alarms in fire lane occupancy detection, and improves the accuracy of fire lane occupancy detection.

[0030] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of the present invention, nor is it intended to limit the scope of the invention. Other features of the invention will become readily apparent from the following description.

Attached Figure Description

[0031] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0032] Figure 1 is a flowchart of a method for detecting the occupancy of fire lanes provided in Embodiment 1 of the present invention;

[0033] Figure 2 is a flowchart of a method for detecting the occupancy of fire lanes provided in Embodiment 2 of the present invention;

[0034] Figure 3 is a flowchart of a method for detecting the occupancy of fire lanes provided in Embodiment 3 of the present invention;

[0035] Figure 4 is a schematic diagram of the structure of a fire lane occupancy detection device provided in Embodiment 4 of the present invention;

[0036] Figure 5 is a schematic diagram of the structure of an electronic device that implements the fire lane occupancy detection method of this invention.

Detailed Implementation

[0037] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.

[0038] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0039] Embodiment 1

[0040] Figure 1 is a flowchart of a fire lane occupancy detection method provided in Embodiment 1 of the present invention. This embodiment is applicable to situations requiring fire lane occupancy detection. The method can be executed by a fire lane occupancy detection device, which can be implemented in hardware and/or software. This device can be configured in electronic devices such as computers and servers. As shown in Figure 1, the method includes:

[0041] S110. Obtain the first image frame in the video stream. The first image frame includes the target detection area, which is the area where the fire lane is located.

[0042] Specifically, the video stream refers to video data obtained by capturing images of the fire lane area to be detected using a camera device. In this embodiment, the camera device can be a monocular camera. The first image frame can be understood as a frame of image extracted from the video stream. The target detection area can be understood as the area that needs to be occupancy detected. In this embodiment, the target detection area is the area where the fire lane is located.

[0043] Specifically, monocular cameras can be installed near the fire lane area to monitor the area in real time and acquire the corresponding video stream. Sampling can be performed on the video stream at a preset sampling frequency, and the image frame corresponding to the closest timestamp to the current moment is determined as the first image frame. The first image frame includes the target detection area, i.e., the area where the fire lane is located.

[0044] S120. Perform target detection on the first image frame to obtain the target detection result.

[0045] Specifically, the target detection result can be understood as the output of identifying and locating a specific target object in an image. It typically includes the detected object category and location information, where the location information can be bounding box coordinates. Target detection algorithms can analyze image content to determine the presence of target objects in the image and their precise location within the scene, ultimately presenting the results in structured data form. In this embodiment, the target detection result specifically refers to information about whether a target object exists in the target detection region. If a target object exists, the target detection result includes the object category and location information; if no target object exists, the target detection result is empty.

[0046] Specifically, the object detection algorithm is invoked to analyze and process the first image frame to detect whether the first image frame contains a target object. If a target object is detected, the corresponding target detection box and target category can be output. The detected target object's corresponding target detection box and target category are used as the target detection result and output.

[0047] Optionally, target detection is performed on the first image frame to obtain target detection results, including: calling a pre-trained target detection model, performing target detection on the first image frame using the pre-trained target detection model, and obtaining target recognition results. The target recognition results include the target object type; obtaining a preset detection object type, performing matching based on the target object type within the preset detection object type, and obtaining the matching result corresponding to the target object type; if the matching result is successful, the target detection result is determined to indicate that a detection object exists in the target detection region; if the matching result is unsuccessful, the target detection result is determined to indicate that a detection object does not exist in the target detection region.

[0048] Specifically, a pre-trained object detection model is invoked. The first image frame is input into the pre-trained object detection model, which processes the data. If a target object is detected, the target object type is output. If the target object type is detected, a preset detection object type is retrieved from a preset storage space. It should be noted that the preset detection object type may vary depending on the scenario, and should be set according to the actual detection scenario. The target object type is then matched with the preset detection object type. If the match is successful, the target detection result is determined to be that a detected object exists in the target detection area; if the match fails, the target detection result is determined to be that no detected object exists in the target detection area. This process can remove target objects identified in the image frame that are not in the target detection area, such as trees, signs, and buildings next to fire lanes. Only target objects that appear in the target detection area are detected, which helps improve the accuracy of fire lane occupancy detection results and reduces false alarms.
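The type-matching step in [0048] can be illustrated with a minimal sketch. It assumes the detection model has already returned a list of recognized objects; the `Recognition` class, the `PRESET_DETECTION_TYPES` set, and the example type labels are hypothetical names chosen for illustration and do not come from the source.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    """One recognized object: its type label and box (x1, y1, x2, y2)."""
    object_type: str
    box: tuple


# Hypothetical preset list; the source says it depends on the actual scenario.
PRESET_DETECTION_TYPES = {"car", "truck", "bicycle", "box"}


def match_detection_objects(recognitions, preset_types=PRESET_DETECTION_TYPES):
    """Keep only the recognized objects whose type is in the preset list."""
    return [r for r in recognitions if r.object_type in preset_types]


def detection_object_exists(recognitions, preset_types=PRESET_DETECTION_TYPES):
    """True when at least one recognized object matches a preset type."""
    return len(match_detection_objects(recognitions, preset_types)) > 0
```

With this sketch, a frame containing a tree and a car would yield one detection object (the car), while a frame containing only a tree would yield none.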

[0049] S130. If the target detection result indicates the presence of a detection object, determine the first depth data corresponding to the detection object in the first image frame.

[0050] Specifically, the first depth data can be understood as the distance data of each point in the image relative to the camera. The image can be processed by a depth estimation algorithm to obtain the depth data corresponding to the image. The depth estimation algorithm can be a depth estimation model based on deep learning. The depth estimation model is trained with a large amount of image data to learn the mapping relationship between image features and depth information, thereby outputting a depth map corresponding to the monocular image. The value of each pixel in the image represents the depth information of that point in the scene, and different depths are visualized by means of grayscale values or color encoding.

[0051] Specifically, if the target detection result indicates the presence of a detection object, a depth estimation algorithm is invoked to process the first image frame and estimate the depth data of each pixel in the first image frame, thereby obtaining the first depth data corresponding to the detection object in the first image frame.

[0052] Optionally, determining the first depth data of the detected object in the first image frame includes: calling a pre-trained monocular depth estimation model, processing the first image frame based on the pre-trained monocular depth estimation model to obtain the depth data corresponding to the first image frame; and extracting the first depth data of the detected object from the depth data corresponding to the first image frame.

[0053] Specifically, a pre-trained monocular depth estimation model is invoked. The first image frame is input into the monocular depth estimation model, which processes the first image frame, automatically extracting various features from it. Based on the learned feature-depth mapping relationship, the model predicts the depth value of each pixel in the first image frame, ultimately outputting a depth map of the same size as the input first image frame. In this map, the value of each pixel represents the depth information of that point relative to the camera in the actual scene, thus realizing the conversion from monocular image to depth data. Furthermore, based on the location data corresponding to the detected object in the target detection results, matching is performed from the depth data corresponding to the first image frame to obtain the first depth data of the detected object.
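As a rough illustration of the last step in [0053], the sketch below crops the detected object's depth values out of a frame-sized depth map. It assumes the depth map is a list of rows and that the detection box is given as `(x1, y1, x2, y2)` with exclusive upper bounds; both the representation and the function name `extract_object_depth` are assumptions, and plain lists are used only to keep the sketch dependency-free.

```python
def extract_object_depth(depth_map, box):
    """Return the rows of depth values that fall inside a detection box.

    depth_map: list of rows with the same height and width as the image frame.
    box: (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive (an assumption).
    """
    height = len(depth_map)
    width = len(depth_map[0]) if height else 0
    x1, y1, x2, y2 = box
    # Clamp the box to the frame so a box that spills over the edge stays valid.
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    return [row[x1:x2] for row in depth_map[y1:y2]]
```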

[0054] In this embodiment, the depth data corresponding to the image is determined by processing the image through a depth estimation model. This eliminates the need for complex and expensive depth sensing equipment, resulting in lower costs and greater flexibility. The method can be widely applied to various scenarios and is not limited by special environmental conditions. With the help of advanced deep learning algorithms, depth data can be generated quickly, meeting the needs of application scenarios with high real-time requirements. This helps to improve the efficiency and accuracy of determining the occupancy detection results of fire lanes.

[0055] S140. Obtain the second depth data corresponding to the target detection area, and determine the area occupancy detection result of the target detection area based on the first depth data and the second depth data.

[0056] The second depth data refers to the calibrated depth data of the target detection area when it is unoccupied. An image can be acquired when the target detection area is unoccupied by any person or object, and depth estimation processing can be performed on the image to obtain the second depth data corresponding to the target detection area. This second depth data can be stored in a preset storage space. When occupancy detection of the target detection area is required, matching can be performed based on the target detection area to obtain the corresponding second depth data for that area.

[0057] Specifically, matching is performed in a preset storage space based on the location information or name identifier corresponding to the target detection area to obtain the second depth data corresponding to the target detection area. The first depth data and the second depth data can be compared to determine the overlapping area. The area occupancy detection result of the target detection area can be determined based on the depth difference of the overlapping area. For example, it can be set that if the depth difference exceeds a preset threshold, the area occupancy detection result of the target detection area is determined to be occupied, and if the depth difference does not exceed the preset threshold, the area occupancy detection result of the target detection area is determined to be unoccupied.
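One way to picture the overlap-based comparison is the following sketch, which follows the average-depth variant summarized in [0014]. It assumes the detected object and the target detection area are both axis-aligned boxes inside the frame, that depth maps are lists of rows, and that the depth difference is taken as an absolute value; these representations and all function names are illustrative assumptions, not details given in the source.

```python
def box_overlap(box_a, box_b):
    """Intersection of two (x1, y1, x2, y2) boxes, or None if they do not overlap."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    if x1 >= x2 or y1 >= y2:
        return None
    return (x1, y1, x2, y2)


def mean_depth(depth_map, box):
    """Average depth over the pixels inside box (x2 and y2 exclusive)."""
    x1, y1, x2, y2 = box
    values = [v for row in depth_map[y1:y2] for v in row[x1:x2]]
    return sum(values) / len(values)


def is_area_occupied(frame_depth, calibration_depth, object_box, area_box, threshold):
    """Compare average depths over the overlap against a preset threshold."""
    overlap = box_overlap(object_box, area_box)
    if overlap is None:
        return False  # the object does not touch the target detection area
    first_avg = mean_depth(frame_depth, overlap)         # current frame
    second_avg = mean_depth(calibration_depth, overlap)  # unoccupied calibration
    # Absolute difference is an assumption; the source only says "depth difference".
    return abs(first_avg - second_avg) >= threshold
```

An object standing on the lane surface changes the estimated depth inside the overlap relative to the empty-lane calibration, so the difference grows; an object whose box does not intersect the lane area is never counted.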

[0058] In this embodiment, the area occupancy detection result of the target detection area is determined by combining the first depth data and the second depth data. The second depth data is the calibration depth data of the target detection area in the unoccupied state, which can effectively remove detection objects that are not in the fire lane, thereby helping to improve the accuracy of the area occupancy detection result and reduce the problem of false alarms.

[0059] S150. If the area occupancy detection result is that the area is occupied, determine the duration of continuous occupancy of the target detection area by the detection object, generate early warning information when the duration of continuous occupancy meets the early warning conditions, and execute early warning operations based on the early warning information.

[0060] It should be noted that objects that briefly pass through the target detection area need to be identified to reduce false alarms. Specifically, this can be achieved by calculating the duration of continuous occupancy of the detected object within the target detection area, and then further verifying the area occupancy detection results based on the duration of continuous occupancy. The warning conditions specifically characterize whether the duration of the detected object's occupancy within the target detection area meets the conditions set for issuing a warning. These conditions can be set according to the actual occupancy detection situation in the scenario, and are not limited here.

[0061] Specifically, when the area occupancy detection result indicates that the area is occupied, the continuous occupancy duration calculation method is invoked to determine the duration of continuous occupancy of the target detection area by the detected object. This continuous occupancy duration is then compared with a preset duration threshold corresponding to the warning conditions. If the continuous occupancy duration is greater than the preset duration threshold, the continuous occupancy duration is determined to meet the warning conditions; if the continuous occupancy duration is less than or equal to the preset duration threshold, the continuous occupancy duration is determined not to meet the warning conditions. When the continuous occupancy duration meets the warning conditions, corresponding warning information is generated according to the warning information template. Based on the warning information, corresponding warning operations are executed. For example, the warning information can be broadcast or uploaded to a management platform, where relevant personnel can determine whether on-site personnel are needed to remove the obstacle.
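The threshold comparison in [0061] can be sketched as below. The strict "greater than" test follows the paragraph; the message wording, the function names, and the use of seconds as the unit are assumptions made for illustration, since the source does not specify the warning information template.

```python
def meets_warning_condition(occupied_seconds, threshold_seconds):
    """Strictly greater than the preset duration threshold, as in [0061]."""
    return occupied_seconds > threshold_seconds


def build_warning(area_name, occupied_seconds, threshold_seconds):
    """Return a warning message, or None when the condition is not met."""
    if not meets_warning_condition(occupied_seconds, threshold_seconds):
        return None
    # Hypothetical template; a real system would use its own format.
    return (f"Fire lane '{area_name}' has been occupied for "
            f"{occupied_seconds:.0f} s (threshold {threshold_seconds:.0f} s)")
```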

[0062] Optionally, determining the duration of continuous occupation of the target detection area by the detected object includes: acquiring a second image frame in the video stream, the second image frame being located after the first image frame; acquiring the first detection box corresponding to the detected object in the target detection result corresponding to the first image frame; performing target detection on the second image frame and determining the second detection box of the detected object in the second image frame; determining the intersection over union (IoU) of the first and second detection boxes; if the IoU is greater than or equal to a preset IoU threshold, then updating the duration of continuous occupation based on the time interval between the second image frame and the first image frame; if the IoU is less than the preset IoU threshold, then resetting the duration of continuous occupation.

[0063] Specifically, if the region occupancy detection result of the target detection area determined based on the first image frame is that it is occupied, a second image frame immediately adjacent to the first image frame in the video stream is obtained. The second image frame is located after the first image frame in chronological order. It should be noted that the second image frame may include multiple image frames. A first detection box of the detected object is obtained from the target detection result of the first image frame, and target detection processing is performed on the next image frame immediately adjacent to the first image frame to obtain a second detection box of the detected object in that image frame. The intersection over union (IoU) of the first and second detection boxes is calculated, and it is determined whether this IoU is greater than or equal to a preset IoU threshold. If the IoU is greater than or equal to the preset IoU threshold, the continuous occupancy duration is updated according to the time interval between the second and first image frames. That is, the time interval and the continuous occupancy duration are summed, and the sum is assigned to the continuous occupancy duration. If the IoU is less than the preset IoU threshold, the continuous occupancy duration is reset. It should be noted that after updating the duration of continuous occupation, a further second image frame can be acquired to determine the IoU again and check whether it is greater than or equal to the preset IoU threshold. If so, the duration of continuous occupation continues to be updated. Preferably, after each update of the duration of continuous occupation, it is determined whether the updated duration meets the warning conditions. If the warning conditions are met, warning information is generated, and a warning operation is performed based on the warning information.
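The duration update described in [0063] can be sketched as follows. The sketch assumes boxes are `(x1, y1, x2, y2)` tuples and that durations and frame intervals are in seconds; the function names are illustrative and not taken from the source.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def update_duration(duration, prev_box, curr_box, interval, iou_threshold):
    """Add the frame interval when the boxes still overlap enough, else reset."""
    if iou(prev_box, curr_box) >= iou_threshold:
        return duration + interval
    return 0.0
```

A parked vehicle keeps nearly the same box from frame to frame, so its IoU stays high and the duration accumulates; a vehicle driving through shifts its box quickly, the IoU drops below the threshold, and the duration resets before any warning condition is reached.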

[0064] Based on the above embodiments, the method further includes: acquiring the acquisition scene of the video stream, and acquiring the second depth data of the target detection region under the acquisition scene; the acquisition conditions of the video stream are different under different acquisition scenes; wherein, the method for determining the second depth data of the target detection region under different acquisition scenes includes: acquiring the third image frame of the target detection region under different acquisition scenes, in which the target detection region is in an unoccupied state; processing the third image frame based on a pre-trained monocular depth estimation model to obtain the second depth data of the target detection region under different acquisition scenes.

[0065] The video stream acquisition scenario refers to the overall environmental background in which the camera acquires video data. To better detect whether fire lanes are obstructed, it is necessary to acquire video stream data in scenarios where fire lanes are not obstructed, thereby determining the corresponding depth data and providing accurate reference data for detecting fire lane occupancy. In this embodiment, the video stream acquisition scenario is a scenario where there are no obstacles or vehicles or other detection objects in the target detection area. Therefore, the corresponding video stream acquisition condition is also set so that the target detection area is not obstructed. The third image frame specifically refers to the image frame extracted from the video stream acquired in a scenario where the target detection area is not obstructed.

[0066] Specifically, based on the requirements for fire lane occupancy detection, a scenario where the fire lane is not occupied is first set up, along with corresponding video stream acquisition conditions. Under this scenario, a video stream meeting the set acquisition conditions is acquired. The second depth data for the current acquisition scenario is determined based on at least one image frame in the video stream, and this depth data is stored in a preset storage space. This data can be directly accessed when fire lane occupancy detection is needed, improving detection efficiency. Specifically, the determination of the second depth data for the target detection area under different acquisition scenarios is as follows: an image frame of the target detection area is acquired from the video stream of the current acquisition scenario, and this image frame is designated as the third image frame. This means that the acquired third image frame does not contain any object obstructing the fire lane, indicating that the target detection area is unoccupied. A pre-trained monocular depth estimation model is then used to process the third image frame to obtain the second depth data for the target detection area under different acquisition scenarios. Alternatively, the pre-trained monocular depth estimation model can be used to process the third image frame to obtain multiple depth data points, and the average depth data is calculated and used as the second depth data.
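As a rough illustration of the calibration step, the following Python sketch averages the depth maps estimated from several unoccupied third image frames and stores the result per acquisition scene. The depth estimator is passed in as a callable because the source does not name a specific model; `estimate_depth`, `calibrate_baseline`, `baseline_store`, and the nested-list depth map layout are hypothetical names and choices.

```python
from typing import Callable, Sequence

# Assumed layout: a depth map is a list of rows, each a list of depth values.
DepthMap = list[list[float]]


def calibrate_baseline(
    frames: Sequence[object],
    estimate_depth: Callable[[object], DepthMap],
) -> DepthMap:
    """Average the estimated depth maps of unoccupied frames, point by point."""
    maps = [estimate_depth(frame) for frame in frames]
    if not maps:
        raise ValueError("at least one unoccupied frame is required")
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(m[r][c] for m in maps) / len(maps) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical preset storage space: one baseline per acquisition scene.
baseline_store: dict[str, DepthMap] = {}


def store_baseline(scene: str, baseline: DepthMap) -> None:
    """Save the second depth data so detection can read it directly later."""
    baseline_store[scene] = baseline
```

With a single frame the average reduces to that frame's depth map, which corresponds to the non-averaged variant described above.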

[0067] In this embodiment, by determining the depth data of the target detection area that is not occupied, it can be used to assist in determining the area occupancy detection result of the target detection area. This can determine the occupancy status of the target detection area more quickly and accurately, while also avoiding the influence of interference factors near the target detection area and improving the accuracy of the detection results.

[0068] The technical solution of this embodiment involves acquiring a first image frame from a video stream, the first image frame including a target detection area, which is the area where the fire lane is located; performing target detection on the first image frame to obtain a target detection result; if the target detection result indicates the presence of a detection object, determining the first depth data corresponding to the detection object in the first image frame; acquiring the second depth data corresponding to the target detection area, and determining the area occupancy detection result of the target detection area based on the first depth data and the second depth data, wherein the second depth data is the calibration depth data of the target detection area in an unoccupied state; if the area occupancy detection result indicates occupancy, determining the duration of continuous occupancy of the target detection area by the detection object, generating a warning message when the duration of continuous occupancy meets the warning conditions, and performing a warning operation based on the warning message. This solution extracts image frames from a video stream and performs target detection to obtain target detection results. If the target detection result indicates the presence of a detected object, it determines first depth data and then acquires second depth data. This second depth data is used in conjunction with the first depth data to determine the area occupancy detection result of the target detection region. If the area occupancy detection result indicates occupancy, it further determines the duration of continuous occupancy. An early warning message is generated when the duration of continuous occupancy meets the early warning conditions. This effectively avoids false alarms caused by detected objects that merely pass through the target detection region, solves the problems of inaccurate detection and false alarms in fire lane occupancy detection, and improves the accuracy of fire lane occupancy detection.

[0069] Example 2

[0070] Figure 2 is a flowchart of a fire lane occupancy detection method provided in Embodiment 2 of the present invention. The method in this embodiment is a further optimization of the method in the above embodiments. Optionally, the method traverses the first depth data corresponding to the detection object and the second depth data corresponding to the target detection area, compares the first depth data and the second depth data for the same location point, and determines the depth difference corresponding to each location point; if the depth difference of any location point within the target detection area is greater than a preset depth difference threshold, the location point is determined to be in an occupied state; the number of location points in the target detection area that are in an occupied state is determined, and if the number is greater than or equal to a preset number threshold, the area occupancy detection result corresponding to the target detection area is determined to be occupied. As shown in Figure 2, the method includes:

[0071] S210. Acquire the first image frame in the video stream. The first image frame includes the target detection area, which is the area where the fire lane is located.

[0072] S220. Perform target detection on the first image frame to obtain the target detection result.

[0073] S230. If the target detection result indicates the presence of a detection object, determine the first depth data corresponding to the detection object in the first image frame.

[0074] S240. Obtain the second depth data corresponding to the target detection area.

[0075] S250. Traverse the first depth data corresponding to the detection object and the second depth data corresponding to the target detection area, compare the first depth data and the second depth data of the same location point, and determine the depth difference corresponding to each location point.

[0076] Specifically, the process traverses the first depth data corresponding to the detected object and the second depth data corresponding to the target detection area. The depth information from these two sources is cross-checked point by point: the first depth data of the detected object and the second depth data of the target detection area are spatially aligned to ensure that they correspond to the same physical location point in the same coordinate system. By comparing the values of the two depth data point by point, the depth difference at each location point is calculated, which is the absolute value or relative error of the second depth data minus the first depth data. This quantifies the difference between the two depth data at the same location point, which can be used to assess the reliability of the depth data or to provide a calibration basis for subsequent detection results. Determining the depth difference helps improve the accuracy of the area occupancy detection results for the target detection area.

[0077] S260. If the depth difference of any location point within the target detection area is greater than a preset depth difference threshold, determine that the location point is in an occupied state.

[0078] S270. Determine the number of location points in the target detection area that are occupied. If the number is greater than or equal to a preset number threshold, then determine that the area occupancy detection result corresponding to the target detection area is occupied.

[0079] Specifically, after obtaining the depth difference values of each location point in the target detection area, the depth difference of each location point is compared with a preset depth difference threshold. If the depth difference value of a location point is greater than the preset depth difference threshold, the location point is determined to be occupied; if the depth difference value is less than or equal to the preset depth difference threshold, the location point is determined to be unoccupied. After determining the occupancy status of each location point, the number of location points in the occupied state is calculated, and this number is compared with a preset number threshold. If the number is greater than or equal to the preset number threshold, the region occupancy detection result for the target detection area is determined to be occupied; if the number is less than the preset number threshold, the region occupancy detection result for the target detection area is determined to be unoccupied.

[0080] S280. If the area occupancy detection result is that the area is occupied, determine the duration of continuous occupancy of the target detection area by the detection object, generate early warning information when the duration of continuous occupancy meets the early warning conditions, and perform early warning operations based on the early warning information.
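The point-by-point comparison of steps S250 to S270 can be sketched in Python as follows. The sketch follows the rule spelled out in paragraph [0079] (a depth difference greater than the threshold marks a point as occupied) and uses the absolute difference. Representing each depth data set as a dictionary keyed by `(row, col)` and the function names are assumptions made for illustration only.

```python
# Assumed representation: depth data as {(row, col): depth} in a shared,
# already aligned pixel coordinate system.
Point = tuple[int, int]


def count_occupied_points(
    first_depth: dict[Point, float],
    second_depth: dict[Point, float],
    depth_diff_threshold: float,
) -> int:
    """Count location points whose depth differs from the unoccupied baseline."""
    occupied = 0
    for point, baseline in second_depth.items():
        observed = first_depth.get(point)
        if observed is None:
            continue  # the detected object does not cover this point
        if abs(baseline - observed) > depth_diff_threshold:
            occupied += 1
    return occupied


def region_is_occupied(
    first_depth: dict[Point, float],
    second_depth: dict[Point, float],
    depth_diff_threshold: float,
    count_threshold: int,
) -> bool:
    """Region is occupied when enough location points are occupied."""
    count = count_occupied_points(first_depth, second_depth, depth_diff_threshold)
    return count >= count_threshold
```

Counting points rather than relying on a single point makes the decision less sensitive to isolated depth estimation errors.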

[0081] The technical solution of this embodiment extracts image frames from a video stream, determines the target detection result based on the image frames, and, assuming the target detection result indicates the presence of a detection object, determines first depth data and acquires second depth data. It compares the depth data of the same location points in the first and second depth data to determine the depth difference between each location point, determines the number of occupied location points based on the depth difference, and then determines the area occupancy detection result of the target detection area based on the number of occupied location points. This achieves the determination of area occupancy detection results through multi-dimensional depth data, improving the accuracy of area occupancy detection results. Furthermore, when the area occupancy detection result indicates occupancy, it further determines the duration of continuous occupancy and generates early warning information, effectively avoiding false alarms about detection objects passing through the target detection area. This solves the problems of inaccurate and false alarm detection of fire lane occupancy, improving the accuracy of fire lane occupancy detection.

[0082] Example 3

[0083] Figure 3 is a flowchart of a fire lane occupancy detection method provided in Embodiment 3 of the present invention. The method in this embodiment is a further optimization of the method in the above embodiments. Optionally, the overlapping area between the detection object and the target detection area is determined based on first depth data and second depth data; a first average depth value is determined based on the first depth data corresponding to the overlapping area; a second average depth value is determined based on the second depth value corresponding to the overlapping area; an average depth difference is determined based on the first average depth value and the second average depth value; if the average depth difference is greater than or equal to a preset depth difference threshold, the occupancy detection result of the target detection area is determined to be occupied. As shown in Figure 3, the method includes:

[0084] S310. Acquire the first image frame in the video stream. The first image frame includes the target detection area, which is the area where the fire lane is located.

[0085] S320. Perform target detection on the first image frame to obtain the target detection result.

[0086] S330. If the target detection result indicates the presence of a detection object, determine the first depth data corresponding to the detection object in the first image frame.

[0087] S340. Obtain the second depth data corresponding to the target detection area.

[0088] S350. Based on the first depth data and the second depth data, determine the overlapping area between the detection object and the target detection area, determine the first average depth value based on the first depth data corresponding to the overlapping area, and determine the second average depth value based on the second depth value corresponding to the overlapping area.

[0089] Specifically, spatial alignment methods can be used to map the first depth data of the detected object and the second depth data of the target detection area to a unified coordinate system. The geometric intersection region, or overlapping region, is determined by pixel-by-pixel or point-by-point comparison. For all locations within this overlapping region, the corresponding first and second depth data are extracted. The first and second average depth values are calculated using the arithmetic mean, which quantifies the numerical distribution of the two depth datasets within the shared coverage area. This allows for effective evaluation of the consistency of different depth acquisition methods in the overlapping region.

[0090] S360. Determine the average depth difference based on the first average depth value and the second average depth value.

[0091] S370. If the average depth difference is greater than or equal to the preset depth difference threshold, the region occupancy detection result of the target detection area is determined to be occupied.

[0092] Specifically, after obtaining the first and second average depth values corresponding to the overlapping areas, the difference between the first and second average depth values is calculated to obtain the average depth difference. Further, the average depth difference is compared with a preset depth difference threshold. If the average depth difference is less than the preset depth difference threshold, it indicates that the depth data of the detected object in the target detection area in the first depth data is similar to the depth data of the corresponding point in the target detection area in the second depth data, and the region occupancy detection result of the target detection area can be determined as unoccupied. If the average depth difference is greater than or equal to the preset depth difference threshold, it indicates that the depth data of the detected object in the target detection area in the first depth data differs significantly from the depth data of the corresponding point in the target detection area in the second depth data, and the region occupancy detection result of the target detection area can be determined as occupied.

[0093] In this embodiment, by comparing the first depth data and the second depth data, an overlapping area is determined, specifically a fire lane area. Then, the depth data difference of each location point in the overlapping area is determined. Based on the depth difference, the degree of difference between the depth data of the target detection area in the first depth data and the target detection area in the second depth data is determined. Then, the occupancy status of the target detection area is evaluated based on the quantitative data of the average depth difference. This achieves cross-validation and statistical analysis of depth data, providing a more reliable decision-making basis for area occupancy detection and helping to improve the accuracy of area occupancy detection results.

[0094] S380. If the area occupancy detection result is that the area is occupied, determine the duration of continuous occupancy of the target detection area by the detection object, generate early warning information when the duration of continuous occupancy meets the early warning conditions, and execute early warning operations based on the early warning information.
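Steps S350 to S370 can be illustrated with a short Python sketch. The overlapping area is taken here as the set of location points present in both depth data sets, and the average depth difference is taken as the absolute difference of the two arithmetic means. Both choices, the `(row, col)` dictionary representation, and the function names are assumptions for illustration.

```python
from statistics import fmean
from typing import Optional

# Assumed representation: depth data as {(row, col): depth} in a shared,
# already aligned pixel coordinate system.
Point = tuple[int, int]


def average_depth_difference(
    first_depth: dict[Point, float],
    second_depth: dict[Point, float],
) -> Optional[float]:
    """Return |mean(second) - mean(first)| over the overlap, or None if empty."""
    overlap = first_depth.keys() & second_depth.keys()
    if not overlap:
        return None  # the detected object does not overlap the fire lane
    first_avg = fmean(first_depth[p] for p in overlap)
    second_avg = fmean(second_depth[p] for p in overlap)
    return abs(second_avg - first_avg)


def region_is_occupied_by_average(
    first_depth: dict[Point, float],
    second_depth: dict[Point, float],
    depth_diff_threshold: float,
) -> bool:
    """Occupied when the average depth difference reaches the threshold."""
    diff = average_depth_difference(first_depth, second_depth)
    return diff is not None and diff >= depth_diff_threshold
```

Compared with the per-point count of Embodiment 2, this variant needs only one threshold, at the cost of letting a few extreme depth values shift the mean.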

[0095] The technical solution of this embodiment extracts image frames from a video stream, determines the target detection result based on the image frames, and, if the target detection result indicates the presence of a detection object, determines first depth data and acquires second depth data. Based on the first and second depth data, it determines the overlapping area between the detection object and the target detection area. It then determines a first average depth value based on the first depth data corresponding to the overlapping area, and a second average depth value based on the second depth value corresponding to the overlapping area. Furthermore, it determines an average depth difference based on the first and second average depth values. If the average depth difference is greater than or equal to a preset depth difference threshold, it determines that the target detection area is occupied. This achieves the determination of the area occupancy detection result through multi-dimensional depth data, improving the accuracy of the area occupancy detection result. Furthermore, if the area occupancy detection result indicates that the area is occupied, it further determines the duration of continuous occupancy and generates early warning information, effectively avoiding false alarms about detection objects passing through the target detection area. This solves the problems of inaccurate and false alarm detection of fire lane occupancy, improving the accuracy of fire lane occupancy detection.

[0096] Example 4

[0097] Figure 4 is a schematic diagram of a fire lane occupancy detection device provided in Embodiment 4 of the present invention. As shown in Figure 4, the device includes:

[0098] The image frame acquisition module 410 is used to acquire the first image frame in the video stream. The first image frame includes a target detection area, which is the area where the fire lane is located.

[0099] The target detection result determination module 420 is used to perform target detection on the first image frame and obtain the target detection result.

[0100] The first depth data determination module 430 is used to determine the first depth data corresponding to the detected object in the first image frame when the target detection result indicates that a detected object exists.

[0101] The region occupancy detection result determination module 440 is used to acquire the second depth data corresponding to the target detection region, and determine the region occupancy detection result of the target detection region based on the first depth data and the second depth data, wherein the second depth data is the calibration depth data of the target detection region in the unoccupied state.

[0102] The early warning processing module 450 is used to determine the duration of continuous occupation of the target detection area by the detection object when the area occupancy detection result is that the area is occupied, generate early warning information when the duration of continuous occupation meets the early warning conditions, and perform early warning operations based on the early warning information.

[0103] The technical solution of this embodiment involves acquiring a first image frame from a video stream via an image frame acquisition module. The first image frame includes a target detection region, which is the area where the fire lane is located. A target detection result determination module performs target detection on the first image frame to obtain a target detection result. A first depth data determination module determines the first depth data corresponding to the detected object in the first image frame if the target detection result indicates the presence of a detected object. A region occupancy detection result determination module acquires the second depth data corresponding to the target detection region and determines the region occupancy detection result of the target detection region based on the first and second depth data. The second depth data is the calibration depth data of the target detection region in an unoccupied state. An early warning processing module determines the duration of continuous occupancy of the target detection region by the detected object if the region occupancy detection result indicates occupancy. When the duration of continuous occupancy meets the early warning conditions, an early warning message is generated, and an early warning operation is performed based on the early warning message. This solution extracts image frames from a video stream and performs target detection to obtain target detection results. If the target detection result indicates the presence of a detected object, it determines first depth data and then acquires second depth data. This second depth data is used in conjunction with the first depth data to determine the area occupancy detection result of the target detection region. If the area occupancy detection result indicates occupancy, it further determines the duration of continuous occupancy. An early warning message is generated when the duration of continuous occupancy meets the early warning conditions. This effectively avoids false alarms caused by detected objects that merely pass through the target detection region, solves the problems of inaccurate detection and false alarms in fire lane occupancy detection, and improves the accuracy of fire lane occupancy detection.

[0104] Based on the above embodiments, optionally, the target detection result determination module 420 is specifically used to call a pre-trained target detection model, perform target detection on the first image frame through the pre-trained target detection model, and obtain a target recognition result, wherein the target recognition result includes the target object type; obtain a preset detection object type, perform matching based on the target object type in the preset detection object type, and obtain a matching result corresponding to the target object type; if the matching result is a successful match, then determine that the target detection result is that a detection object exists in the target detection area; if the matching result is a failed match, then determine that the target detection result is that a detection object does not exist in the target detection area.
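The type-matching step performed by module 420 can be sketched in Python as below. The recognition record `Recognition`, the example type names in `PRESET_DETECTION_TYPES`, and the function names are hypothetical; the source specifies neither the detection model nor the preset detection object types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recognition:
    """One item of the target recognition result (hypothetical structure)."""
    object_type: str
    box: tuple[int, int, int, int]  # assumed (x1, y1, x2, y2) pixel box


# Assumed preset detection object types; the source does not list them.
PRESET_DETECTION_TYPES = frozenset({"car", "truck", "motorcycle", "debris"})


def match_detection_objects(recognitions, preset_types=PRESET_DETECTION_TYPES):
    """Keep only recognized objects whose type matches a preset type."""
    return [r for r in recognitions if r.object_type in preset_types]


def detection_object_exists(recognitions, preset_types=PRESET_DETECTION_TYPES) -> bool:
    """A successful match means a detection object exists."""
    return bool(match_detection_objects(recognitions, preset_types))
```

Filtering by type before any depth processing keeps objects that are not of interest, such as pedestrians in this assumed configuration, from triggering the later steps.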

[0105] Optionally, the first depth data determination module 430 is specifically used to call a pre-trained monocular depth estimation model, process the first image frame based on the pre-trained monocular depth estimation model to obtain the depth data corresponding to the first image frame, and extract the first depth data of the detected object from the depth data corresponding to the first image frame.
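One possible way to extract the first depth data from the full-frame depth data is to crop the depth map to the detection box, as in the following Python sketch. The nested-list depth map, the half-open pixel box `(x1, y1, x2, y2)`, and the function name `extract_object_depth` are assumptions; the source does not state how the extraction is performed.

```python
# Assumed layout: depth_map[row][col] holds the estimated depth of a pixel.
DepthMap = list[list[float]]
Point = tuple[int, int]


def extract_object_depth(
    depth_map: DepthMap,
    box: tuple[int, int, int, int],
) -> dict[Point, float]:
    """Return {(row, col): depth} for pixels inside the detection box.

    The box is (x1, y1, x2, y2) with x2 and y2 exclusive, clipped to the map.
    """
    rows = len(depth_map)
    cols = len(depth_map[0]) if rows else 0
    x1, y1, x2, y2 = box
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(cols, x2), min(rows, y2)
    return {
        (r, c): depth_map[r][c]
        for r in range(y1, y2)
        for c in range(x1, x2)
    }
```

Keying the result by `(row, col)` keeps each depth value tied to its location point, which is what a later point-by-point comparison against the second depth data requires.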

[0106] Optionally, the region occupancy detection result determination module 440 is specifically used to traverse the first depth data corresponding to the detection object and the second depth data corresponding to the target detection area, compare the first depth data and the second depth data of the same location point, and determine the depth difference corresponding to each location point; if the depth difference of any location point in the target detection area is greater than a preset depth difference threshold, the location point is determined to be in an occupied state; determine the number of location points in the target detection area that are in an occupied state, and if the number is greater than or equal to a preset number threshold, then the region occupancy detection result corresponding to the target detection area is determined to be occupied.

[0107] Optionally, the region occupancy detection result determination module 440 is further specifically used to determine the overlapping area between the detection object and the target detection area based on the first depth data and the second depth data; determine a first average depth value based on the first depth data corresponding to the overlapping area; determine a second average depth value based on the second depth value corresponding to the overlapping area; determine an average depth difference based on the first average depth value and the second average depth value; and determine the region occupancy detection result of the target detection area as occupied if the average depth difference is greater than or equal to a preset depth difference threshold.

[0108] Optionally, the early warning processing module 450 includes a continuous occupancy duration determination unit. The continuous occupancy duration determination unit is used to acquire a second image frame in the video stream, the second image frame being located after the first image frame; acquire a first detection box corresponding to the detected object in the target detection result corresponding to the first image frame; perform target detection on the second image frame and determine the second detection box of the detected object in the second image frame; determine the intersection over union (IoU) of the first and second detection boxes; if the IoU is greater than or equal to a preset IoU threshold, then update the continuous occupancy duration based on the time interval between the second image frame and the first image frame; if the IoU is less than the preset IoU threshold, then reset the continuous occupancy duration.

[0109] Optionally, the device is also used to acquire the acquisition scene of the video stream and acquire the second depth data of the target detection region under the acquisition scene; the acquisition conditions of the video stream are different under different acquisition scenes; wherein, the method for determining the second depth data of the target detection region under different acquisition scenes includes: acquiring the third image frame of the target detection region under different acquisition scenes, in which the target detection region is in an unoccupied state; processing the third image frame based on a pre-trained monocular depth estimation model to obtain the second depth data of the target detection region under different acquisition scenes.

[0110] The fire lane occupancy detection device provided in this embodiment of the invention can execute the fire lane occupancy detection method provided in any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the method.

[0111] Example 5

[0112] Figure 5 is a schematic diagram of the structure of an electronic device provided in Embodiment 5 of the present invention. The electronic device 10 is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices (such as helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the invention described and/or claimed herein.

[0113] As shown in Figure 5, the electronic device 10 includes at least one processor 11 and a memory, such as a read-only memory (ROM) 12 or a random access memory (RAM) 13, communicatively connected to the at least one processor 11. The memory stores computer programs executable by the at least one processor. The processor 11 can perform various appropriate actions and processes based on the computer program stored in the ROM 12 or loaded from storage unit 18 into the RAM 13. The RAM 13 may also store various programs and data required for the operation of the electronic device 10. The processor 11, ROM 12, and RAM 13 are interconnected via a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.

[0114] Multiple components in electronic device 10 are connected to I/O interface 15, including: input unit 16, such as a keyboard, mouse, etc.; output unit 17, such as various types of displays, speakers, etc.; storage unit 18, such as a magnetic disk, optical disk, etc.; and communication unit 19, such as a network card, modem, wireless transceiver, etc. Communication unit 19 allows electronic device 10 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.

[0115] Processor 11 can be a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various processors running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. Processor 11 performs the various methods and processes described above, such as the fire lane occupancy detection method.

[0116] In some embodiments, the fire lane occupancy detection method may be implemented as a computer program tangibly contained in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed on electronic device 10 via ROM 12 and/or communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the fire lane occupancy detection method described above may be performed. Alternatively, in other embodiments, processor 11 may be configured to perform the fire lane occupancy detection method by any other suitable means (e.g., by means of firmware).

[0117] Various embodiments of the systems and techniques described above can be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-a-chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementations in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transmitting data and instructions to the storage system, the at least one input device, and the at least one output device.

[0118] Computer programs used to implement the fire lane occupancy detection method of the present invention can be written in any combination of one or more programming languages. These computer programs can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, such that when executed by the processor, the computer programs cause the functions / operations specified in the flowcharts and / or block diagrams to be implemented. The computer programs can be executed entirely on a machine, partially on a machine, as a standalone software package partially on a machine and partially on a remote machine, or entirely on a remote machine or server.

[0119] Embodiment 6

[0120] Embodiment 6 of the present invention also provides a computer-readable storage medium storing computer instructions for causing a processor to execute a method for detecting occupancy of fire lanes, the method comprising:

[0121] Acquire a first image frame from the video stream, the first image frame including a target detection area, which is the area where the fire lane is located;

[0122] Perform target detection on the first image frame to obtain a target detection result;

[0123] If the target detection result indicates the presence of a detection object, determine the first depth data corresponding to the detection object in the first image frame;

[0124] Obtain the second depth data corresponding to the target detection area, and determine the area occupancy detection result of the target detection area based on the first depth data and the second depth data, wherein the second depth data is the calibration depth data of the target detection area in the unoccupied state;

[0125] If the area occupancy detection result is that the area is occupied, determine the duration of continuous occupancy of the target detection area by the detected object. If the duration of continuous occupancy meets the warning conditions, generate warning information and execute warning operations based on the warning information.
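
The flow of steps [0121] to [0125] can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the callables `detect`, `object_depth`, and `is_occupied`, the `state` dictionary, and the rule that a warning is raised once the duration reaches `warn_after_s` seconds are all assumptions introduced for the example, since the text only requires that the duration "meets the warning conditions".

```python
def process_frame(frame, timestamp_s, state, detect, object_depth,
                  calibrated_depth, is_occupied, warn_after_s):
    """Run one frame through the five steps; return a warning string or None.

    All callables are hypothetical stand-ins supplied by the caller.
    `state` is a dict that remembers when the current occupancy started.
    """
    detection = detect(frame)                          # step [0122]
    if detection is None:                              # no detection object
        state["occupied_since"] = None
        return None

    first_depth = object_depth(frame, detection)       # step [0123]
    if not is_occupied(first_depth, calibrated_depth):  # step [0124]
        state["occupied_since"] = None
        return None

    # Step [0125]: track how long the area has been continuously occupied.
    if state.get("occupied_since") is None:
        state["occupied_since"] = timestamp_s
    duration_s = timestamp_s - state["occupied_since"]

    # Assumed warning condition: duration reaches a fixed threshold.
    if duration_s >= warn_after_s:
        return f"fire lane occupied for {duration_s:.0f} s"
    return None
```

Passing the detector, the depth step, and the occupancy test in as callables keeps the sketch independent of any particular model.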

[0126] In the context of this invention, a computer-readable storage medium can be a tangible medium that may contain or store a computer program for use by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable storage medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination thereof. Alternatively, a computer-readable storage medium may be a machine-readable signal medium. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.

[0127] To provide interaction with a user, the systems and techniques described herein can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the electronic device. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).

[0128] The systems and technologies described herein can be implemented in computing systems that include backend components (e.g., as data servers), or computing systems that include middleware components (e.g., application servers), or computing systems that include frontend components (e.g., user computers with graphical user interfaces or web browsers through which users can interact with implementations of the systems and technologies described herein), or any combination of such backend, middleware, or frontend components. The components of the system can be interconnected via digital data communication of any form or medium (e.g., communication networks). Examples of communication networks include local area networks (LANs), wide area networks (WANs), blockchain networks, and the Internet.

[0129] A computing system can include clients and servers. Clients and servers are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server can be a cloud server, also known as a cloud computing server or cloud host, which is a hosting product in the cloud computing service system that addresses the shortcomings of traditional physical hosts and VPS services, such as high management difficulty and weak business scalability.

[0130] It should be understood that the various forms of processes shown above can be used, with steps reordered, added, or deleted. For example, the steps described in this invention can be executed in parallel, sequentially, or in different orders, as long as the desired result of the technical solution of this invention can be achieved, and this is not limited herein.

[0131] The specific embodiments described above do not constitute a limitation on the scope of protection of this invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this invention should be included within the scope of protection of this invention.

Claims

1. A fire lane occupancy detection method, characterized by comprising: acquiring a first image frame from a video stream, the first image frame including a target detection area, the target detection area being the area where the fire lane is located; performing target detection on the first image frame to obtain a target detection result; if the target detection result indicates the presence of a detection object, determining first depth data corresponding to the detection object in the first image frame; obtaining second depth data corresponding to the target detection area, and determining an area occupancy detection result of the target detection area based on the first depth data and the second depth data, wherein the second depth data is calibration depth data of the target detection area in the unoccupied state; if the area occupancy detection result is that the area is occupied, determining the duration of continuous occupancy of the target detection area by the detection object, generating warning information when the duration of continuous occupancy meets the warning conditions, and performing a warning operation based on the warning information; wherein determining the duration of continuous occupancy of the target detection area by the detection object includes: acquiring a second image frame from the video stream, wherein the second image frame is located after the first image frame; obtaining a first detection box corresponding to the detection object in the target detection result corresponding to the first image frame; performing target detection on the second image frame, and determining a second detection box of the detection object in the second image frame; determining the intersection-over-union (IoU) of the first detection box and the second detection box; if the IoU is greater than or equal to a preset IoU threshold, updating the duration of continuous occupancy based on the time interval between the second image frame and the first image frame; and if the IoU is less than the preset IoU threshold, resetting the duration of continuous occupancy.
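
The duration update at the end of claim 1 can be illustrated with a short sketch. Several details are assumptions made for the example rather than requirements of the claim: boxes are given as `(x1, y1, x2, y2)` corner coordinates, "updating based on the time interval" is read as adding the interval to the running duration, and the default threshold of 0.5 is arbitrary.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def update_duration(duration_s: float, first_box: Box, second_box: Box,
                    interval_s: float, iou_threshold: float = 0.5) -> float:
    """Extend the duration if the boxes still overlap enough, else reset it."""
    if iou(first_box, second_box) >= iou_threshold:
        # Assumed update rule: accumulate the time between the two frames.
        return duration_s + interval_s
    return 0.0
```

A stationary vehicle yields nearly identical boxes in consecutive frames, so its duration keeps growing, while an object that has moved away or been replaced produces a low ratio and resets the count.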

2. The method of claim 1, wherein performing target detection on the first image frame to obtain the target detection result includes: invoking a pre-trained target detection model, and performing target detection on the first image frame through the pre-trained target detection model to obtain a target recognition result, wherein the target recognition result includes a target object type; obtaining the preset detection object types, and matching the target object type against the preset detection object types to obtain a matching result corresponding to the target object type; if the matching succeeds, determining that the target detection result indicates that a detection object exists in the target detection area; and if the matching fails, determining that the target detection result indicates that no detection object exists in the target detection area.
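
The type-matching step of claim 2 amounts to a membership test, sketched below. The function name and the class labels used in the usage note are hypothetical, and the recognition model itself is not shown.

```python
def has_detection_object(recognized_types, preset_types):
    """Return True if any recognized object type is a preset detection type."""
    preset = set(preset_types)
    return any(object_type in preset for object_type in recognized_types)
```

For example, with preset types `{"car", "bicycle"}`, a frame whose only recognized type is `"person"` would yield no detection object.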

3. The method of claim 1, wherein determining the first depth data corresponding to the detection object in the first image frame includes: invoking a pre-trained monocular depth estimation model, and processing the first image frame based on the pre-trained monocular depth estimation model to obtain depth data corresponding to the first image frame; and extracting the first depth data of the detection object from the depth data corresponding to the first image frame.
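
The extraction step of claim 3 can be pictured as cropping the frame-level depth map to the detection box. The sketch assumes the depth map is a row-major list of rows (standing in for the output of the depth estimation model, which is not shown) and that the box uses integer pixel coordinates `(x1, y1, x2, y2)` with exclusive right and bottom edges; neither detail is specified in the claim.

```python
def extract_object_depth(depth_map, box):
    """Crop a row-major depth map to the pixels inside the detection box."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in depth_map[y1:y2]]
```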

4. The method of claim 1, wherein determining the area occupancy detection result of the target detection area based on the first depth data and the second depth data includes: traversing the first depth data corresponding to the detection object and the second depth data corresponding to the target detection area, comparing the first depth data and the second depth data at the same location point, and determining the depth difference corresponding to each location point; if the depth difference of any location point within the target detection area is less than a preset depth difference threshold, determining that the location point is in the occupied state; determining the number of location points in the occupied state within the target detection area; and if the number is greater than or equal to a preset threshold, determining that the area occupancy detection result corresponding to the target detection area is occupied.
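
A minimal sketch of the point-counting test in claim 4 follows. It keeps the comparison direction exactly as the claim words it (a point counts as occupied when its depth difference is less than the threshold). The use of an absolute difference, the equally sized row-major grids, and the parameter names are assumptions made for the example.

```python
def occupied_by_point_count(first_depth, second_depth,
                            depth_diff_threshold, min_points):
    """Count location points whose depth difference is below the threshold.

    `first_depth` and `second_depth` are assumed to be grids of equal shape
    covering the same location points.
    """
    occupied_points = 0
    for object_row, calibrated_row in zip(first_depth, second_depth):
        for d_object, d_calibrated in zip(object_row, calibrated_row):
            # Assumed: the depth difference is taken as an absolute value.
            if abs(d_object - d_calibrated) < depth_diff_threshold:
                occupied_points += 1
    return occupied_points >= min_points
```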

5. The method of claim 1, wherein determining the area occupancy detection result of the target detection area based on the first depth data and the second depth data includes: determining, based on the first depth data and the second depth data, the overlapping area between the detection object and the target detection area; determining a first average depth value based on the first depth data corresponding to the overlapping area, and determining a second average depth value based on the second depth data corresponding to the overlapping area; determining an average depth difference based on the first average depth value and the second average depth value; and if the average depth difference is greater than or equal to a preset depth difference threshold, determining that the area occupancy detection result of the target detection area is occupied.
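
The average-depth test of claim 5 can be sketched as below. The inputs are assumed to be flat lists of depth values already restricted to the overlapping area, and the average depth difference is assumed to be the absolute difference of the two means; how the overlapping area is found is not shown.

```python
def occupied_by_average_depth(first_depth, second_depth, depth_diff_threshold):
    """Compare mean object depth with mean calibrated depth over the overlap."""
    if not first_depth or not second_depth:
        return False  # no overlapping area, so nothing to compare
    first_mean = sum(first_depth) / len(first_depth)
    second_mean = sum(second_depth) / len(second_depth)
    # Assumed: the average depth difference is an absolute value.
    return abs(first_mean - second_mean) >= depth_diff_threshold
```

Averaging over the overlap makes the decision less sensitive to noise at individual points than a per-point comparison.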

6. The method of claim 1, further comprising: obtaining the acquisition scene of the video stream, and obtaining the second depth data of the target detection area in that acquisition scene, wherein the acquisition conditions of the video stream differ between acquisition scenes; wherein determining the second depth data of the target detection area under different acquisition scenes includes: acquiring a third image frame of the target detection area under different acquisition scenes, wherein the target detection area is in an unoccupied state in the third image frame; and processing the third image frame based on a pre-trained monocular depth estimation model to obtain the second depth data of the target detection area under different acquisition scenes.
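
One way to picture the per-scene calibration of claim 6 is a lookup table built once from unoccupied frames. In this sketch `estimate_depth` is a hypothetical stand-in for the pre-trained monocular depth estimation model, and the scene labels are whatever keys the caller chooses (for instance day and night); none of these names come from the claim.

```python
def calibrate(unoccupied_frames, estimate_depth):
    """Build {scene: second depth data} from one unoccupied frame per scene."""
    return {scene: estimate_depth(frame)
            for scene, frame in unoccupied_frames.items()}


def second_depth_for(scene, calibration):
    """Look up the calibrated depth data for the current acquisition scene."""
    return calibration[scene]
```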

7. A fire lane occupancy detection apparatus, characterized by comprising: an image frame acquisition module, used to acquire a first image frame from a video stream, wherein the first image frame includes a target detection area, and the target detection area is the area where the fire lane is located; a target detection result determination module, used to perform target detection on the first image frame to obtain a target detection result; a first depth data determination module, used to determine first depth data corresponding to the detection object in the first image frame when the target detection result indicates that a detection object exists; an area occupancy detection result determination module, used to obtain second depth data corresponding to the target detection area, and determine an area occupancy detection result of the target detection area based on the first depth data and the second depth data, wherein the second depth data is calibration depth data of the target detection area in the unoccupied state; and an early warning processing module, used to determine the duration of continuous occupancy of the target detection area by the detection object when the area occupancy detection result is that the area is occupied, generate early warning information when the duration of continuous occupancy meets the early warning conditions, and perform an early warning operation based on the early warning information; wherein the early warning processing module includes a continuous occupancy duration determination unit, used to: acquire a second image frame from the video stream, wherein the second image frame is located after the first image frame; obtain a first detection box corresponding to the detection object in the target detection result corresponding to the first image frame; perform target detection on the second image frame, and determine a second detection box of the detection object in the second image frame; determine the intersection-over-union (IoU) of the first detection box and the second detection box; if the IoU is greater than or equal to a preset IoU threshold, update the duration of continuous occupancy based on the time interval between the second image frame and the first image frame; and if the IoU is less than the preset IoU threshold, reset the duration of continuous occupancy.

8. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the fire lane occupancy detection method according to any one of claims 1-6.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions that, when executed by a processor, implement the fire lane occupancy detection method according to any one of claims 1-6.