Fire source authenticity identification method based on image recognition

By processing images in HSV space and combining intersection-over-union (IoU) variance with spectral peak sharpness calculation, the method addresses the misjudgment of artificial stroboscopic light sources in fire source detection systems. It differentiates real flames from false fire sources with low computational overhead, reduces the false alarm rate, and generalizes to new types of light source.

CN122244764A · Pending · Publication Date: 2026-06-19 · 辽源高新技术产业开发区消防救援大队 +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: 辽源高新技术产业开发区消防救援大队
Filing Date: 2026-03-30
Publication Date: 2026-06-19

IPC: G06V20/40; G06V10/44; G06V10/88; G06V10/56; G08B21/18

AI Tagging

Application Domain

Character and pattern recognition Alarms

Abstract

This invention discloses a method for identifying the authenticity of fire sources based on image recognition, belonging to the field of fire detection technology. The method performs color space conversion and threshold filtering frame by frame on the video stream to obtain a binary mask of the suspected fire source area. After morphological processing and connected component extraction, a time sliding window is established for each suspected target. In the spatial morphology identification path, the variance of the intersection-over-union (IoU) sequence between consecutive frames is calculated to quantify the randomness of contour deformation. In the temporal frequency identification path, a fast Fourier transform is performed on the brightness sequence and the spectral peak sharpness is calculated to quantify the periodicity of flicker. Finally, the spatial contour deformation variance and the spectral peak sharpness are combined as a two-dimensional orthogonal feature for a joint decision that distinguishes the non-rigid broadband flicker of real flames from the rigid periodic flicker of artificial stroboscopic light sources, reducing the false alarm rate that such light sources cause in visual fire detection systems.

Description

Technical Field

[0001] This invention relates to the field of fire detection technology, and in particular to a method for identifying the authenticity of fire sources based on image recognition.

Background Technology

[0002] With the development and popularization of computer vision technology, image recognition-based fire source detection systems have been widely deployed in various scenarios such as factories, tunnels, forests, and warehouses. Existing visual fire detection solutions mainly rely on the characteristics of flames in color space and simple motion features for identification, while some solutions employ deep learning object detection models. However, in actual industrial and urban environments, there are widespread artificial strobe light sources with color features extremely similar to real flames, such as yellow strobe warning lights at construction sites, red hazard lights on vehicles, LED neon billboards with dynamic refresh rates, and dynamic reflections of these light sources on water surfaces or glass. These artificial strobe light sources are not only difficult to distinguish from real flames in color space, but also exhibit dynamic changes in brightness and area in image sequences due to their flickering characteristics. This makes it extremely easy for existing conventional image algorithms and lightweight models to misidentify them as real fires, leading to frequent false alarms from fire protection systems. High-frequency false alarms not only consume a large amount of manpower for verification but also cause a continuous decline in the trust of on-duty personnel in the system, and even lead to complacency when real fire alarms occur. Current solutions to this problem often rely on increasing the model size to improve recognition accuracy. However, this approach not only increases hardware computing costs but also results in poor generalization performance due to the diverse shapes of stroboscopic light sources. Therefore, accurately distinguishing real flames from artificial stroboscopic light sources at the physical mechanism level with low computational overhead is a pressing technical problem in the field of fire visual detection. 
Summary of the Invention

[0003] To address the technical problem that existing image recognition-based fire source detection systems often fail to distinguish between real flames and artificial strobe fire sources at the physical mechanism level, leading to frequent false alarms, this invention provides an image-based method for identifying the authenticity of fire sources.

[0004] A method for identifying the authenticity of fire sources based on image recognition includes the following steps:

[0005] The video stream is acquired at a constant frame rate. After color space conversion of each frame, the binarized mask image of the suspected fire source area is extracted according to the preset flame color threshold range.

[0006] The binarized mask image is subjected to morphological opening operation, and the independent connected components in the mask image after opening operation are extracted as suspected fire source targets.

[0007] A time sliding window with a length of $N$ frames is established for each suspected fire source target, and the binarized mask and grayscale mean value corresponding to each frame are continuously recorded within the time sliding window.

[0008] In the spatial morphology identification path, the intersection-over-union (IoU) of the effective pixel sets of the suspected fire source target in each pair of adjacent frames within the time sliding window is calculated, and the variance of the resulting IoU sequence is taken as the spatial contour deformation variance.

[0009] In the time-frequency identification path, after DC component elimination and windowing processing are performed on the one-dimensional brightness time series composed of the gray-scale mean of each frame of the suspected fire source target within the time sliding window, a one-dimensional fast Fourier transform is performed to obtain the amplitude spectrum, and the peak sharpness of the spectrum is calculated based on the ratio of the maximum peak amplitude in the amplitude spectrum to the average amplitude of the remaining frequency bands.

[0010] The spatial contour deformation variance and the spectral peak sharpness are used as two-dimensional orthogonal features for joint judgment. When the spatial contour deformation variance is greater than a preset lower threshold and the spectral peak sharpness is less than a preset upper threshold, it is determined to be a real fire source; otherwise, it is determined to be a false fire source.

[0011] As a preferred technical solution of the present invention, the color space conversion is to convert the RGB image to the HSV space, and the flame color threshold range includes a hue channel value range of 0 to 30 and 150 to 180, a saturation channel value range of 100 to 255, and a brightness channel value range of 150 to 255.

[0012] As a preferred embodiment of the present invention, the morphological opening operation applies erosion followed by dilation, with the size of the structuring element matrix being [size missing] or [size missing].

[0013] As a preferred embodiment of the present invention, the intersection-over-union is calculated as follows: for two adjacent frames $t-1$ and $t$, the intersection area is obtained by performing a bitwise AND operation on the sets of valid pixels $S_{t-1}$ and $S_t$ of the suspected fire source target, and the union area is obtained by performing a bitwise OR operation. The intersection area is divided by the union area to obtain the IoU of frame $t$, using the following formula:

[0014] $$IoU_t = \frac{|S_{t-1} \cap S_t|}{|S_{t-1} \cup S_t|}$$

[0015] As a preferred embodiment of the present invention, the windowing process employs any one of Hamming windows, Hanning windows, or Blackman windows.

[0016] As a preferred embodiment of the present invention, the peak sharpness of the spectrum is calculated as follows: the maximum peak amplitude $A_{\max}$ is searched for in the amplitude spectrum $A(k)$. After removing the peak and its adjacent frequency bins, the average amplitude of the remaining effective frequency components is calculated. The ratio of the two is used as the sharpness factor $K$, and the corresponding calculation formula is:

[0017] $$K = \frac{A_{\max}}{\frac{1}{M}\sum_{k \in \Omega} A(k)}$$

[0018] where $\Omega$ is the set of remaining effective frequency components and $M$ is the number of effective frequency components involved in the averaging calculation.

[0019] As a preferred technical solution of the present invention, a time delay integration step is further included after the joint decision: a confidence integration slot is maintained for each suspected fire source target. When a single sliding window decision is a real fire source, the confidence integration slot is increased by a preset positive increment. When the decision is a false fire source, the confidence integration slot is decreased by a preset negative increment or cleared to zero. A fire alarm command is triggered only when the cumulative value of the confidence integration slot exceeds a preset alarm threshold.

[0020] As a preferred embodiment of the present invention, the absolute value of the negative increment is greater than the positive increment, which prevents brief, occasional misjudgments from easily accumulating to the level that triggers an alarm.

[0021] As a preferred embodiment of the present invention, the constant frame rate is 30 frames per second, and the length $N$ of the time sliding window is 90 frames.

[0022] As a preferred technical solution of the present invention, in the joint decision step, when the spatial contour deformation variance is lower than the lower threshold, the calculation of the temporal frequency identification path is skipped and the target is directly determined to be a false fire source, thereby saving computing resources.

[0023] Compared with the prior art, the present invention has the following beneficial effects:

[0024] This invention employs a joint decision based on a two-dimensional orthogonal feature composed of the spatial contour deformation variance and the spectral peak sharpness. It examines suspected fire sources along two physical dimensions: the randomness of fluid dynamics and the periodicity of circuit control. This distinguishes the irregular, non-rigid deformation and broadband random flicker of real flames from the rigid contour and single-period flicker of artificial stroboscopic light sources, filtering out false alarms caused by sources such as construction warning lights, vehicle hazard lights, and LED neon billboards. Furthermore, all computations involved in this invention belong to classic digital image processing and one-dimensional signal processing, with computational complexity far lower than the forward inference of deep learning models, so ordinary embedded processors can meet real-time processing requirements. Moreover, the extracted identification features originate from the underlying physical differences between flames and artificial light sources rather than from statistical learning on appearance samples of specific light source types, so the method generalizes to various novel stroboscopic light sources.

Description of the Drawings

[0025] Figure 1 is a schematic diagram of the method flow according to an embodiment of the present invention.

Detailed Implementation

[0026] The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the following embodiments are only for illustrative purposes and do not constitute a limitation on the scope of protection of the present invention. All equivalent transformations and improvements based on the technical concept of the present invention in this field should be covered within the scope of protection of the present invention.

[0027] The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings.

[0028] As shown in Figure 1, this invention provides a method for identifying the authenticity of fire sources based on image recognition. Its core lies in simultaneously extracting, from video image sequences, the randomness of the spatial contour deformation of a suspected fire source and the periodicity of its brightness flicker in the temporal dimension, and then making a joint judgment on these two-dimensional orthogonal features to differentiate real flames from artificial stroboscopic fire sources. The physical basis of this method is that real flames, as high-temperature plasma fluids, are subject to continuous disturbances from thermal convection and air turbulence during combustion. Spatially, this manifests as irregular, non-rigid deformation of the flame edge contour, and temporally, as low-frequency, broadband, random brightness flicker. Artificial stroboscopic fire sources, such as construction warning lights, vehicle hazard lights, and LED neon billboards, have their luminous areas fixed within rigid mechanical housings. Spatially, their effective contours overlap almost completely from frame to frame while illuminated, and temporally, they are driven by circuits to produce strictly periodic, single-frequency flicker. This invention uses these essential differences in spatial morphology and temporal frequency, quantifying them mathematically with the intersection-over-union (IoU) variance and the spectral peak sharpness, thereby identifying genuine and false fire sources at the signal processing level in a physically meaningful way.

[0029] In one specific embodiment, the method of the present invention includes the following steps.

[0030] The system workflow begins with the acquisition and conversion of ambient light energy by a photoelectric sensor. A CMOS or CCD camera acquires an RGB video stream at a constant frame rate, which can be set to 30 frames per second. At this frame rate the Nyquist frequency is 15 Hz, so flicker signals from 1 Hz up to just below 15 Hz can be sampled without aliasing. The camera lens collects photons emitted from real flames or artificial light sources in the scene. These photons undergo photoelectric conversion by photodiodes at each pixel in the photosensitive element array and digitization by an analog-to-digital converter, forming a sequence of RGB three-channel digital images output frame by frame.

[0031] After acquiring the RGB image of the current frame, a color space transformation is performed. In the RGB color model, the values of the red, green, and blue channels not only encode color information but are also coupled with the influence of ambient light intensity. If the fire source recognition threshold is directly set in the RGB space, the judgment will fail when the lighting conditions change. Therefore, the RGB image is converted to the HSV space in real time through color space transformation. This space decouples the essential attribute of color, i.e., hue, from the light intensity, i.e., brightness. In the HSV space, threshold ranges corresponding to the spectral characteristics of flames are set, where the hue channel H is set to a range of 0 to 30 and 150 to 180, the saturation channel S is set to a range of 100 to 255, and the brightness channel V is set to a range of 150 to 255. The above hue ranges correspond to the wavelength regions of the red, orange, and yellow frequency bands, the lower limit of saturation is used to filter out low-saturation white light interference, and the lower limit of brightness is used to filter out background noise in dark areas. Each pixel in the current frame image is compared with the above threshold. Pixels falling within the range are assigned a logical high value (white) in memory, while the remaining pixels are assigned a logical low value (black). This reduces the complexity of the three-dimensional color image to a binary mask image containing only the spatial distribution of suspected light-emitting points.
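The per-pixel threshold test described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the patented implementation: it assumes 8-bit RGB input and an OpenCV-style scaling in which H spans 0 to 180 and S and V span 0 to 255 (which is what the stated ranges suggest), and the names `is_flame_colored` and `binarize` are hypothetical.

```python
import colorsys

# Threshold ranges from the embodiment; the 0-180 / 0-255 scaling is an assumption.
H_RANGES = ((0, 30), (150, 180))
S_RANGE = (100, 255)
V_RANGE = (150, 255)

def is_flame_colored(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the flame HSV thresholds."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    h, s, v = h * 180.0, s * 255.0, v * 255.0  # rescale to the assumed ranges
    in_hue = any(lo <= h <= hi for lo, hi in H_RANGES)
    return in_hue and S_RANGE[0] <= s <= S_RANGE[1] and V_RANGE[0] <= v <= V_RANGE[1]

def binarize(frame):
    """Map a frame (rows of (r, g, b) tuples) to a 0/1 mask."""
    return [[int(is_flame_colored(*px)) for px in row] for row in frame]

# Orange passes; blue (wrong hue), white (low saturation) and dark red (low value) do not.
frame = [[(255, 128, 0), (0, 0, 255)],
         [(255, 255, 255), (100, 0, 0)]]
mask = binarize(frame)
```

A production system would more likely vectorize this with an image library; the loop form is used here only to keep each comparison visible.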

[0032] Due to the complexity of real-world monitoring scenarios, binarized mask images often contain isolated noise points caused by factors such as sensor background noise, tiny reflective spots, or flying insects. Directly analyzing these noise points would unnecessarily consume computational resources. Therefore, mathematical morphological filtering is applied to the binarized mask image, specifically an opening operation, i.e., erosion followed by dilation. A structuring element matrix of size [size missing] or [size missing] is selected. The erosion operation removes isolated white pixels with excessively small areas, and the subsequent dilation operation reconnects larger suspected fire source patches fragmented by internal dark bands into complete regions. In the morphologically processed mask image, the remaining white connected regions are the suspected fire source targets that need to be identified. A connected component labeling algorithm is used to extract the bounding rectangle, centroid coordinates, and precise pixel set of each independent luminous region. Each independent region is then analyzed as a suspected fire source target.
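The opening and labeling steps can be illustrated with a small pure-Python sketch. Several choices here are assumptions made for illustration only: the 3 x 3 square structuring element (the size is not recoverable from the text), 8-connectivity, treating pixels outside the image as background during erosion, and the function names.

```python
from collections import deque

def _morph(mask, k, reducer):
    """Apply `reducer` (all -> erosion, any -> dilation) over a k x k neighborhood."""
    h, w, r = len(mask), len(mask[0]), k // 2
    return [[int(reducer(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx] == 1
                         for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]

def opening(mask, k=3):
    """Erosion followed by dilation; k=3 is an assumed size, not taken from the source."""
    return _morph(_morph(mask, k, all), k, any)

def connected_components(mask):
    """Label 8-connected white regions; return a list of sets of (row, col) pixels."""
    h, w = len(mask), len(mask[0])
    seen, components = set(), []
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1 or (y, x) in seen:
                continue
            region, queue = set(), deque([(y, x)])
            seen.add((y, x))
            while queue:  # breadth-first flood fill from the seed pixel
                cy, cx = queue.popleft()
                region.add((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w and mask[ny][nx] == 1
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            components.append(region)
    return components

# A 4 x 4 bright patch plus one isolated noise pixel.
mask = [[0] * 7 for _ in range(7)]
for y in range(1, 5):
    for x in range(1, 5):
        mask[y][x] = 1
mask[6][6] = 1
targets = connected_components(opening(mask))
```

In this example the isolated pixel is removed by the erosion and the 4 x 4 patch survives as a single target.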

[0033] To achieve continuous feature analysis of suspected targets over time, the system establishes a first-in-first-out (FIFO) time sliding window with a length of $N$ frames for each independently tracked suspected target. Taking a frame rate of 30 frames per second and an observation duration of 3 seconds as an example, $N = 90$. By calculating the Euclidean distance between the centroids of connected components in adjacent frames, or the overlap of their bounding boxes, historical frame data of the same physical entity are linked along the time axis to form a spatiotemporal data pipeline. Within this sliding window, the binarized mask shape matrix and the grayscale mean of the corresponding region are continuously stored for each tracked target in each frame. When a new frame of data arrives, the oldest frame at the end of the window is popped out and the new frame data is pushed in at the front, achieving real-time streaming processing without waiting for the entire video recording to finish before offline analysis.
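A bounded double-ended queue is one straightforward way to realize such a FIFO window. The sketch below assumes the 90-frame window from the example; the class name `TargetTrack`, the function `match_track`, and the gating distance `max_dist` are hypothetical and only illustrate the centroid-distance association.

```python
from collections import deque
from math import hypot

WINDOW_LEN = 90  # 30 fps x 3 s, as in the embodiment

class TargetTrack:
    """FIFO sliding window of (mask_pixels, gray_mean) records for one target."""

    def __init__(self, centroid, window_len=WINDOW_LEN):
        self.centroid = centroid
        self.window = deque(maxlen=window_len)  # oldest record is dropped automatically

    def push(self, mask_pixels, gray_mean, centroid):
        self.window.append((mask_pixels, gray_mean))
        self.centroid = centroid

    def is_full(self):
        return len(self.window) == self.window.maxlen

def match_track(tracks, centroid, max_dist=20.0):
    """Return the track whose last centroid is nearest, or None if all are too far.

    max_dist (in pixels) is a hypothetical gating distance.
    """
    def dist(track):
        return hypot(track.centroid[0] - centroid[0], track.centroid[1] - centroid[1])

    best = min(tracks, key=dist, default=None)
    if best is None or dist(best) > max_dist:
        return None
    return best

track = TargetTrack(centroid=(10.0, 10.0))
for i in range(100):  # push more frames than the window holds
    track.push({(0, i)}, float(i), (10.0, 10.0))
```

After 100 pushes the window holds only the most recent 90 records, so analysis can start as soon as `is_full()` returns true and continue on every new frame.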

[0034] Once the sliding window of a suspected target has been filled with $N$ frames of data, the system's internal data stream is split into two parallel identification paths that are executed synchronously.

[0035] The first path is the spatial morphology identification path, used to quantify the degree of random deformation of the target contour. Traditional security algorithms often only calculate the area change rate when determining whether an object has changed. However, this area change indicator is deceptive: the halo area of a flashing warning light also changes drastically, from large to small and from small to large, as it switches on and off, so it is easily misidentified as a flame. This invention introduces the pixel-level intersection-over-union (IoU) between consecutive frames to quantify shape features. For two adjacent frames $t-1$ and $t$ within the sliding window, the set of valid pixels of the suspected target in frame $t-1$, i.e., the set of white pixels in the binary mask, is extracted and denoted as $S_{t-1}$. At the same time, the set of valid pixels corresponding to the target in frame $t$ is extracted and denoted as $S_t$. The intersection area of the two sets is calculated using a bitwise AND operation. This intersection represents the pixels that overlap exactly in position in both frames, i.e., the part of the shape that remains unchanged. The union area of the two sets is calculated using a bitwise OR operation. This union represents the total area occupied by the target contour in both frames. Dividing the intersection area by the union area yields the IoU of frame $t$:

[0036] $$IoU_t = \frac{|S_{t-1} \cap S_t|}{|S_{t-1} \cup S_t|}$$
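As a small illustration of the intersection-over-union between adjacent frames, the sketch below represents each mask as a set of (row, col) coordinates, so that set intersection and union play the role of the bitwise AND and OR on binary masks. Returning 0.0 when both masks are empty is an assumption, as are the function names.

```python
def iou(prev_pixels, curr_pixels):
    """Pixel-level IoU of two sets of (row, col) coordinates."""
    union = prev_pixels | curr_pixels
    if not union:
        return 0.0  # assumption: two empty masks are treated as "no overlap"
    return len(prev_pixels & curr_pixels) / len(union)

def iou_sequence(masks):
    """IoU of every adjacent pair; a window of N masks yields N - 1 values."""
    return [iou(a, b) for a, b in zip(masks, masks[1:])]

# A 2 x 2 patch that shifts one pixel to the right in the second frame.
frame_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
frame_b = {(0, 1), (0, 2), (1, 1), (1, 2)}
values = iou_sequence([frame_a, frame_a, frame_b])
```

Here the first pair is identical (IoU of 1.0) and the second pair shares two of six pixels (IoU of one third).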

[0037] For real flames, due to random airflow interference, the flame tongue may shift to the left in one frame and suddenly to the right in the next, or exhibit abrupt breaks. Although there may be significant overlap between consecutive frames, the edge pixels of one frame are often not contained in the other. Therefore, the $IoU_t$ sequence calculated over $N$ consecutive frames is a set of random variables that jump randomly within a specific range, typically 0.4 to 0.8. Artificial warning lights are rigid light sources: when they are in a constant-on state, the intersection of two consecutive frames equals the union, and the IoU remains almost constant at a very high level close to 1.0. At the instant of switching from a dark state to a bright state, the IoU drops to a very low value close to zero. Regardless of the state, the IoU sequence is either extremely stable or exhibits extremely regular jumps, without the small random fluctuations seen in flames.

[0038] After obtaining the IoU time series, which contains $N-1$ values for a window of $N$ frames, calculate the statistical variance of the series:

[0039] $$\sigma_{IoU}^2 = \frac{1}{N-1}\sum_{t=2}^{N}\left(IoU_t - \mu_{IoU}\right)^2$$

[0040] where $\mu_{IoU}$ is the mean of the sequence. Variance, as a statistic measuring the dispersion of data, transforms the difference between the highly random deformation of real flames and the rigid flicker of artificial lamps into a numerical indicator that a computer can evaluate. If the target's $\sigma_{IoU}^2$ is below the system's preset lower threshold $T_{\sigma}$, the target's contour changes are extremely small or extremely regular, and it does not possess the non-rigid deformation characteristics of a natural fluid flame.
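To make the variance statistic concrete, the following sketch computes the population variance of an IoU sequence with the standard library. The two sample sequences are made-up values chosen only to contrast a steady rigid source with a fluctuating one; they are not measurements, and the function name is hypothetical.

```python
from statistics import pvariance

def contour_deformation_variance(iou_values):
    """Population variance of the IoU sequence (divides by the sequence length)."""
    return pvariance(iou_values)

steady_light = [1.0] * 10  # constant-on rigid source: IoU stays at 1.0
fluctuating = [0.45, 0.72, 0.58, 0.79, 0.41, 0.66, 0.52, 0.75, 0.48, 0.63]  # illustrative only

var_steady = contour_deformation_variance(steady_light)
var_fluct = contour_deformation_variance(fluctuating)
```

The steady sequence has zero variance, while the fluctuating one has a strictly positive variance; where the threshold between them should sit depends on the deployment.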

[0041] The second path is the temporal frequency identification path, used to analyze the periodic characteristics of target brightness flicker. Spatial morphology identification can be biased in certain situations (for example, wind-blown leaves obscuring a fixed warning light and forcibly altering its image outline), so judging solely by spatial variance still carries a small probability of failure. The temporal frequency identification path is introduced to provide complementary verification independent of the spatial dimension. This path no longer focuses on the target's geometry but extracts the temporal variation pattern of its overall illumination energy. Specifically, in each frame, the original image is converted to grayscale and the arithmetic mean of all pixel grayscale values in the region covered by the binarized mask is calculated, yielding a one-dimensional discrete brightness time series $x(n)$ of length $N$, where $n = 0, 1, \dots, N-1$.
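The per-frame brightness sample can be sketched as a masked mean. The text does not say how the grayscale conversion is done, so the BT.601 luma weights used below are an assumption, as are the function name and the convention of returning 0.0 for an empty mask.

```python
def gray_mean(frame, mask_pixels):
    """Mean luma of the pixels covered by the mask; frame is rows of (r, g, b) tuples."""
    if not mask_pixels:
        return 0.0  # assumption: an empty mask contributes zero brightness
    total = 0.0
    for y, x in mask_pixels:
        r, g, b = frame[y][x]
        total += 0.299 * r + 0.587 * g + 0.114 * b  # BT.601 luma weights (assumed)
    return total / len(mask_pixels)

frame = [[(255, 255, 255), (0, 0, 0), (10, 200, 30)]]
sample = gray_mean(frame, {(0, 0), (0, 1)})  # the third pixel lies outside the mask
```

Collecting one such value per frame over the sliding window gives the brightness time series that the frequency-domain path analyzes.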

[0042] Before performing frequency domain transformation on the sequence, signal preprocessing is required. The mean of all $N$ sampling points is subtracted from each sample to eliminate the DC component of the signal. Without eliminating the DC component, a large amount of signal energy would be concentrated at zero frequency, masking the true frequency domain distribution of the flickering AC component. Subsequently, to mitigate spectral leakage caused by the signal being truncated at both ends of the time window, the zero-mean signal is multiplied point by point by a Hamming window function. The Hamming window smoothly attenuates the signal toward both ends of the window (to a small but nonzero value), thereby suppressing the impact of the truncation effect on the accuracy of the spectrum estimation. In an alternative implementation, other commonly used window functions such as the Hann (Hanning) window or the Blackman window can be used instead of the Hamming window, with similar technical effects.
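The preprocessing can be written in a few lines. The sketch below uses the common symmetric Hamming definition, w(n) = 0.54 - 0.46 cos(2 pi n / (N - 1)); the text does not state which variant is intended, so this choice and the function names are assumptions.

```python
from math import cos, pi

def hamming(n_samples):
    """Symmetric Hamming window: w(n) = 0.54 - 0.46 * cos(2 * pi * n / (N - 1))."""
    if n_samples == 1:
        return [1.0]
    return [0.54 - 0.46 * cos(2 * pi * n / (n_samples - 1)) for n in range(n_samples)]

def preprocess(brightness):
    """Remove the DC component, then apply a Hamming window point by point."""
    mean = sum(brightness) / len(brightness)
    window = hamming(len(brightness))
    return [(x - mean) * w for x, w in zip(brightness, window)]

window = hamming(91)                 # odd length so the center sample is exactly mid-window
flat = preprocess([128.0] * 90)      # constant brightness has no AC component
```

The window tapers from 1.0 at the center to 0.08 at the edges, and a constant-brightness input is reduced to all zeros by the mean subtraction.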

[0043] After preprocessing, a one-dimensional fast Fourier transform is performed on the windowed signal to convert the time-domain signal to the frequency domain, obtaining the complex form of the spectrum. The amplitude spectrum $A(k)$ is then obtained by taking the modulus of the complex result. According to the Nyquist sampling theorem, at a frame rate of 30 frames per second this spectrum shows the energy distribution across the frequency range of 0 to 15 Hz.
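The mapping from frame index to frequency bin can be checked with a short sketch. To stay dependency-free it uses a naive O(N^2) discrete Fourier transform in place of an FFT (both produce the same spectrum); the synthetic 3 Hz test signal and the function name are illustrative assumptions.

```python
import cmath
from math import pi, sin

def amplitude_spectrum(signal, fps):
    """Return (frequencies_hz, amplitudes) for bins 0 .. N // 2 of a real signal.

    A naive O(N^2) DFT is used so the sketch needs no third-party FFT library;
    an FFT yields the same values faster.
    """
    n = len(signal)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * pi * k * i / n) for i, x in enumerate(signal))
        freqs.append(k * fps / n)  # bin k corresponds to k * fps / N hertz
        amps.append(abs(coeff))    # modulus of the complex coefficient
    return freqs, amps

fps, n = 30, 90
blink = [sin(2 * pi * 3.0 * i / fps) for i in range(n)]  # synthetic 3 Hz flicker
freqs, amps = amplitude_spectrum(blink, fps)
peak_bin = max(range(len(amps)), key=amps.__getitem__)
```

With 90 samples at 30 frames per second the bins are spaced one third of a hertz apart, the last bin sits at 15 Hz, and the 3 Hz test tone lands in bin 9.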

[0044] The peak sharpness is extracted from the amplitude spectrum $A(k)$. Search for the maximum amplitude value $A_{\max}$ in the spectrum and its corresponding frequency $f_{\max}$. After removing the peak point and its nearest neighboring frequency bins, calculate the average amplitude of all remaining effective frequency components. The ratio of the two is defined as the sharpness factor:

[0045] $$K = \frac{A_{\max}}{\frac{1}{M}\sum_{k \in \Omega} A(k)}$$

[0046] where $\Omega$ is the set of remaining effective frequency components and $M$ is the number of effective frequency components involved in the averaging calculation. For a real flame, the brightness flicker is caused by the superposition of countless tiny airflow fluctuations of different periods. The spectrum exhibits a wideband continuous distribution similar to low-frequency pink noise in the 1 to 10 Hz range. Although there may be a frequency point with the highest energy, the surrounding frequencies also carry significant energy, and the peak is not abrupt. The sharpness factor $K$ is therefore typically low, for example in the range of 1.5 to 3.0. For artificial warning lights, the flashing is precisely controlled by a crystal oscillator or resonant circuit, with extremely strict periodicity. In the frequency domain, the energy is highly concentrated at a specific frequency and its higher harmonics, forming extremely sharp spectral lines. The background at other frequencies has almost no energy, so the sharpness factor $K$ often reaches 10 or even exceeds 50.
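A possible reading of the sharpness factor is sketched below. The guard band of one bin on each side of the peak and the exclusion of the zero-frequency bin from the "effective" components are assumptions (the text only says the peak and its nearest neighbors are removed); the two spectra are made-up examples, and the function name is hypothetical.

```python
def peak_sharpness(amps, guard=1, skip_dc=True):
    """Ratio of the maximum amplitude to the mean of the remaining bins.

    guard: number of bins excluded on each side of the peak (assumed to be 1).
    skip_dc: ignore bin 0, since the DC component was removed beforehand (assumed).
    """
    start = 1 if skip_dc else 0
    bins = range(start, len(amps))
    peak = max(bins, key=lambda k: amps[k])
    rest = [amps[k] for k in bins if abs(k - peak) > guard]
    if not rest:
        raise ValueError("spectrum too short for the chosen guard band")
    background = sum(rest) / len(rest)
    return float("inf") if background == 0 else amps[peak] / background

# Narrowband: a single strong line over a flat background (illustrative values).
narrowband = [0.0] + [1.0] * 20
narrowband[9] = 50.0
# Broadband: energy spread over many bins (illustrative values).
broadband = [0.0, 2.0, 3.0, 2.5, 3.5, 3.0, 2.0, 2.5, 3.0, 2.0, 2.5]

k_narrow = peak_sharpness(narrowband)
k_broad = peak_sharpness(broadband)
```

The single-line spectrum gives a factor of 50, while the spread-out spectrum gives a factor below 2, matching the qualitative contrast described above.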

[0047] After calculating the two core features, spatial contour deformation variance and temporal frequency sharpness, in parallel, the system enters the joint true / false judgment stage. This invention uses two-dimensional Boolean logic for the final classification, with the specific judgment rule being: the evaluation result of a single sliding window is judged as a real fire source only when the target simultaneously meets the following two conditions:

[0048] Condition 1: the spatial contour deformation variance $\sigma_{IoU}^2$ is greater than the preset lower threshold $T_{\sigma}$, i.e., the target must possess highly random, non-rigid fluid edge deformation. Condition 2: the spectral peak sharpness $K$ is less than the preset upper threshold $T_K$, i.e., the target's brightness time series does not contain sharp periodic frequency components characteristic of artificial circuitry. If either of these two conditions is not met, the system classifies the target as a false fire source and rejects it. For example, if a target exhibits a large shape variation but also displays extremely sharp single-frequency flicker, it may be a piece of fabric reflecting a stroboscopic light source, and the system will also reject it. In a specific parameter setting, $T_{\sigma}$ may be 0.05 and $T_K$ may be 5.0. Those skilled in the art can adjust these thresholds according to the environmental characteristics of the actual deployment scenario.
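The two-condition rule reduces to a single Boolean expression. In the sketch below the example values 0.05 and 5.0 are assigned to the variance threshold and the sharpness threshold respectively; that assignment is inferred from the order in which the text lists them, and the names `T_VAR`, `T_K` and `is_real_fire` are hypothetical.

```python
T_VAR = 0.05  # example lower threshold on the IoU variance (assignment inferred)
T_K = 5.0     # example upper threshold on the sharpness factor (assignment inferred)

def is_real_fire(iou_variance, sharpness, t_var=T_VAR, t_k=T_K):
    """Boolean AND of the two conditions; anything else is a false fire source."""
    return iou_variance > t_var and sharpness < t_k

decisions = {
    "deforming, broadband": is_real_fire(0.08, 2.0),
    "deforming, sharp line": is_real_fire(0.08, 12.0),  # e.g. fabric lit by a strobe
    "rigid, broadband": is_real_fire(0.001, 2.0),
    "rigid, sharp line": is_real_fire(0.001, 40.0),
}
```

Only the first case is accepted; the input values are arbitrary numbers on either side of the thresholds, not measurements.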

[0049] To further improve the fault tolerance in fire alarm engineering applications and prevent false triggering caused by calculation anomalies in a few isolated frames, this invention incorporates a time hysteresis integration mechanism after the joint decision. For each tracked suspected target, a confidence integration slot is maintained internally. When the decision result of a single sliding window is a real fire source, the value of the integration slot increases by a preset positive increment; when the decision result is a false fire source, the value of the integration slot decreases by a preset negative increment. The absolute value of this negative increment can be greater than the positive increment, for example, a positive increment of 1 and a negative increment of 2, thus ensuring that brief, isolated false judgments do not easily accumulate to the level required to trigger an alarm. In another implementation, the integration slot for false fire source decisions can be directly cleared to zero. Only when the accumulated value of the confidence integration slot for a suspected target exceeds a preset alarm threshold during its continuous existence period—that is, when the target continuously exhibits double high-confidence real fire characteristics for a considerable period—does the system officially trigger a fire alarm command. The underlying software sends alarm screenshot data frames with timestamps, occurrence coordinates, and red borders to the central fire control platform via a serial communication interface or network protocol. At the same time, it can activate the on-site audible and visual alarms and water sprinkler systems.
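The confidence integration slot behaves like a leaky counter. The sketch below uses the example increments of +1 and -2 from the text; the alarm threshold of 5, the floor at zero, and the class name `ConfidenceIntegrator` are assumptions for illustration.

```python
class ConfidenceIntegrator:
    """Leaky counter that raises an alarm only after sustained 'real fire' decisions."""

    def __init__(self, pos_inc=1, neg_inc=2, alarm_threshold=5, reset_on_false=False):
        self.pos_inc = pos_inc
        self.neg_inc = neg_inc
        self.alarm_threshold = alarm_threshold  # hypothetical value
        self.reset_on_false = reset_on_false    # alternative: clear to zero on a false decision
        self.score = 0

    def update(self, is_real):
        """Feed one sliding-window decision; return True when the alarm should fire."""
        if is_real:
            self.score += self.pos_inc
        elif self.reset_on_false:
            self.score = 0
        else:
            self.score = max(0, self.score - self.neg_inc)  # floor at 0 is an assumption
        return self.score > self.alarm_threshold

integrator = ConfidenceIntegrator()
stream = [True] * 5 + [False] + [True] * 3
alarms = [integrator.update(d) for d in stream]
```

In this run the score climbs to 5, drops to 3 on the single false decision, and only crosses the threshold on the ninth window, which illustrates how the asymmetric decrement delays the alarm after any contrary evidence.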

[0050] In another implementation, the processing order of each step in the above method can be adaptively adjusted according to the computing resources of the specific hardware platform. For example, when deployed on an edge computing device with sufficient computing power, the spatial morphology identification path and the temporal frequency identification path can be executed simultaneously in parallel threads to shorten the processing latency of a single frame; when deployed on an embedded platform with limited computing power, the two identification paths can be executed serially, and the spatial contour deformation variance calculation, which has a smaller computational load, can be performed first. When this judgment has determined that the target is a rigid body, the frequency domain analysis step is skipped directly to output the pseudo-fire source determination result, thereby saving computing resources. The above adjustment of the processing method does not affect the technical essence of the method of the present invention.
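The serial, resource-saving variant amounts to lazy evaluation of the frequency-domain path. In the sketch below the sharpness computation is passed in as a callable so that it runs only when the variance test passes; the function name, the callable interface, and the default thresholds are illustrative assumptions.

```python
def classify_serial(iou_variance, compute_sharpness, t_var=0.05, t_k=5.0):
    """Serial variant: compute_sharpness() is only called if the variance test passes."""
    if iou_variance <= t_var:
        return False  # rigid target: skip the frequency-domain path entirely
    return compute_sharpness() < t_k

calls = []

def fake_sharpness():
    """Stand-in for the FFT path that records how often it is invoked."""
    calls.append(1)
    return 2.0

rigid_result = classify_serial(0.001, fake_sharpness)   # sharpness is never computed
calls_after_rigid = len(calls)
flame_result = classify_serial(0.08, fake_sharpness)    # sharpness is computed once
```

On a parallel platform both features would simply be computed up front and combined with the same two comparisons.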

[0051] In addition, the sliding window length $N$ is not limited to the 90 frames mentioned in the above embodiments. In scenarios with higher frame rates, such as when using a high-speed camera at 60 frames per second, $N$ can be increased to 180 frames to maintain a 3-second observation duration; in scenarios with lower frame rates or higher requirements for response speed, $N$ can be shortened to 45 frames or even 30 frames. In this case, the observation time is shortened accordingly and the frequency resolution is reduced, but the basic requirement of distinguishing broadband flicker from narrowband periodic flicker can still be met. The stepping method of the sliding window can also be selected as frame-by-frame stepping or fixed-interval stepping as needed. Frame-by-frame stepping gives the finest time resolution, while stepping at larger intervals further reduces the computational load at the cost of a small amount of time accuracy.
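The effect of the window length on frequency resolution follows from the bin spacing of a discrete Fourier transform over the window. The symbols $f_s$ (frame rate) and $\Delta f$ (bin spacing) are introduced here for illustration, and the 45-frame and 30-frame cases assume a frame rate of 30 frames per second, which the text does not specify:

```latex
\Delta f = \frac{f_s}{N}, \qquad
\frac{30}{90} \approx 0.33\ \text{Hz}, \quad
\frac{60}{180} \approx 0.33\ \text{Hz}, \quad
\frac{30}{45} \approx 0.67\ \text{Hz}, \quad
\frac{30}{30} = 1\ \text{Hz}.
```

Keeping the observation duration at 3 seconds therefore preserves the resolution regardless of frame rate, while halving the duration doubles the bin spacing.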

[0052] Regarding the setting of HSV color space thresholds, the above embodiments provide a typical parameter configuration. In actual deployments, the threshold range may need to be fine-tuned for different monitoring scenarios. For example, in industrial environments where natural gas is the primary fuel, the flame color is bluish, and the threshold range of the hue channel H can be extended accordingly to cover the blue hue range; in outdoor scenes at night without ambient light, the lower threshold of the brightness channel V can be lowered appropriately to improve the detection sensitivity for weak flames. These parameter adjustments are routine optimizations made by those skilled in the art based on specific application conditions and do not depart from the scope of the technical concept of this invention.

[0053] All the operations involved in the above embodiments, including color space conversion, morphological filtering, connected component labeling, intersection-over-union (IoU) calculation, and one-dimensional fast Fourier transform, are classic operations of digital image processing and one-dimensional signal processing. Their computational complexity is far lower than that of the forward inference process of deep learning models. Ordinary embedded processors can meet the computing power requirements for real-time processing, without the need for dedicated graphics processing unit hardware. This gives the method of the present invention a significant advantage in hardware implementation cost. At the same time, since the identification features extracted by the present invention originate from the underlying physical differences between flame hydrodynamics and the circuit control characteristics of artificial light sources, rather than from statistical learning on appearance samples of specific types of warning lights, the method has a natural ability to generalize to novel stroboscopic sources that have not appeared in any training set.

[0054] It should be noted that the above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A method for identifying the authenticity of fire sources based on image recognition, characterized in that it includes the following steps:
acquiring a video stream at a constant frame rate and, after color space conversion of each frame, extracting a binarized mask image of the suspected fire source area according to a preset flame color threshold range;
performing a morphological opening operation on the binarized mask image, and extracting the independent connected components in the opened mask image as suspected fire source targets;
establishing, for each suspected fire source target, a time sliding window with a length of a set number of frames, and continuously recording the binarized mask and the grayscale mean value corresponding to each frame within the time sliding window;
in the spatial morphology identification path, calculating the intersection-over-union (IoU) of the effective pixel sets of the suspected fire source target between each two adjacent frames within the time sliding window, and calculating the variance of the resulting IoU sequence as the spatial contour deformation variance;
in the temporal frequency identification path, performing DC component elimination and windowing on the one-dimensional brightness time series composed of the grayscale mean of each frame of the suspected fire source target within the time sliding window, then performing a one-dimensional fast Fourier transform to obtain the amplitude spectrum, and calculating the spectral peak sharpness as the ratio of the maximum peak amplitude in the amplitude spectrum to the average amplitude of the remaining frequency bands;
performing a joint judgment using the spatial contour deformation variance and the spectral peak sharpness as two-dimensional orthogonal features: when the spatial contour deformation variance is greater than a preset lower threshold and the spectral peak sharpness is less than a preset upper threshold, the target is determined to be a real fire source; otherwise, it is determined to be a false fire source.
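As an illustration of the spatial morphology identification path in claim 1, the following minimal sketch computes the IoU sequence over a window and its variance. It makes several assumptions that are not stated in the source: each mask is represented as a Python set of (x, y) pixel coordinates, so that set intersection and union stand in for bitwise AND and OR on binary masks; the population variance is used; two empty masks are given an IoU of 0; and the names `iou`, `contour_deformation_variance`, and `rectangle` are hypothetical.

```python
from statistics import pvariance


def rectangle(width, height):
    """Pixel set of a width x height block at the origin (helper for the demo)."""
    return {(x, y) for x in range(width) for y in range(height)}


def iou(mask_a, mask_b):
    """Intersection area divided by union area of two pixel sets."""
    union = mask_a | mask_b
    if not union:
        return 0.0  # assumption: two empty masks count as no overlap
    return len(mask_a & mask_b) / len(union)


def contour_deformation_variance(masks):
    """Variance of the IoU sequence between adjacent frames in the window."""
    if len(masks) < 2:
        raise ValueError("at least two frames are needed")
    ious = [iou(prev, curr) for prev, curr in zip(masks, masks[1:])]
    return pvariance(ious)  # assumption: population variance


# A rigid target keeps the same outline; a flame-like target changes shape.
rigid = [rectangle(4, 4)] * 6
deforming = [rectangle(w, 4) for w in (4, 2, 4, 1, 3, 4)]

rigid_variance = contour_deformation_variance(rigid)
deforming_variance = contour_deformation_variance(deforming)
print(rigid_variance, deforming_variance)
```

The rigid sequence yields an IoU of 1 for every frame pair and therefore zero variance, while the deforming sequence yields a clearly positive variance.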

2. The method for identifying the authenticity of fire sources based on image recognition according to claim 1, characterized in that: The color space conversion converts the RGB image to the HSV space. The flame color threshold range includes a hue channel value range of 0 to 30 and 150 to 180, a saturation channel value range of 100 to 255, and a brightness channel value range of 150 to 255.
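The threshold ranges of claim 2 can be expressed as a per-pixel test. This is a minimal sketch that assumes 8-bit HSV scaling in the OpenCV style (hue on a 0 to 180 scale, saturation and brightness on 0 to 255), which the hue range in the claim suggests but the text does not state; `in_flame_range` and `binary_mask` are hypothetical names, and the image is a plain nested list of (H, S, V) tuples to keep the example dependency-free.

```python
def in_flame_range(h, s, v):
    """Check one HSV pixel against the claim 2 threshold ranges.

    Assumes hue on a 0..180 scale and saturation/brightness on 0..255.
    """
    hue_ok = (0 <= h <= 30) or (150 <= h <= 180)
    return hue_ok and (100 <= s <= 255) and (150 <= v <= 255)


def binary_mask(hsv_image):
    """Build a 0/1 mask from a nested list of (H, S, V) tuples."""
    return [[1 if in_flame_range(*pixel) else 0 for pixel in row] for row in hsv_image]


hsv_image = [
    [(10, 200, 220), (90, 200, 220)],   # orange-red pixel, green pixel
    [(170, 120, 160), (10, 50, 220)],   # wrapped red hue, washed-out pixel
]
mask = binary_mask(hsv_image)
print(mask)
```

The two hue intervals are needed because red hues wrap around the ends of the hue scale.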

3. The method for identifying the authenticity of fire sources based on image recognition according to claim 1, characterized in that: The morphological opening operation employs erosion followed by dilation, with the structuring element matrix having a size of [missing value] or [missing value].

4. The method for identifying the authenticity of fire sources based on image recognition according to claim 1, characterized in that: The intersection-over-union is calculated as follows: for two adjacent frames $t-1$ and $t$, the intersection area is obtained by performing a bitwise AND operation on the sets of valid pixels of the suspected fire source target, and the union area is obtained by performing a bitwise OR operation. The intersection area is divided by the union area to obtain the intersection-over-union of the $t$-th frame, using the following formula: $\mathrm{IoU}_t = \dfrac{|M_{t-1} \cap M_t|}{|M_{t-1} \cup M_t|}$, where $M_t$ denotes the set of valid pixels of the target in frame $t$.

5. The method for identifying the authenticity of fire sources based on image recognition according to claim 1, characterized in that: The windowing process employs any one of the following: Hamming window, Hanning window, or Blackman window.

6. The method for identifying the authenticity of fire sources based on image recognition according to claim 1, characterized in that: The spectral peak sharpness is calculated as follows: the maximum peak amplitude is searched for in the amplitude spectrum. After removing the peak and its adjacent frequency bands, the average amplitude of the remaining effective frequency components is calculated. The ratio of the two is used as the sharpness factor, and the corresponding calculation formula is: $S = \dfrac{A_{\max}}{\frac{1}{K}\sum_{k \in \Omega} A_k}$, where $A_{\max}$ is the maximum peak amplitude, $\Omega$ is the set of remaining effective frequency components with amplitudes $A_k$, and $K$ represents the number of effective frequency components involved in the averaging calculation.
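The temporal frequency identification path (DC removal, windowing, transform, peak-to-average ratio) can be sketched as follows. Several points are assumptions made for illustration: a direct discrete Fourier transform replaces the fast Fourier transform so that the example needs only the standard library (the amplitudes are the same); a Hamming window is chosen from the options in claim 5; the number of excluded bins on each side of the peak (`guard_bins`) is a hypothetical parameter, since the text does not quantify "adjacent frequency bands"; and the two test signals are synthetic.

```python
import cmath
import math
import random


def hamming_window(n):
    """Symmetric Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def amplitude_spectrum(samples):
    """Amplitudes of bins 1..n//2 from a direct DFT (the DC bin is skipped)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)))
        for k in range(1, n // 2 + 1)
    ]


def spectral_peak_sharpness(brightness, guard_bins=1):
    """Ratio of the largest spectral peak to the mean of the remaining bins."""
    n = len(brightness)
    mean = sum(brightness) / n  # DC component to remove
    windowed = [(b - mean) * w for b, w in zip(brightness, hamming_window(n))]
    spectrum = amplitude_spectrum(windowed)
    peak = max(range(len(spectrum)), key=spectrum.__getitem__)
    # Exclude the peak and guard_bins neighbours on each side before averaging.
    rest = [a for k, a in enumerate(spectrum) if abs(k - peak) > guard_bins]
    rest_mean = sum(rest) / len(rest)
    return spectrum[peak] / rest_mean if rest_mean > 0 else math.inf


# Synthetic 3-second windows at 30 frames per second (90 samples each).
strobe = [128 + 100 * math.sin(2 * math.pi * 5 * t / 30) for t in range(90)]  # 5 Hz periodic
rng = random.Random(0)
flame_like = [128 + rng.uniform(-100, 100) for _ in range(90)]  # broadband stand-in

strobe_sharpness = spectral_peak_sharpness(strobe)
flame_sharpness = spectral_peak_sharpness(flame_like)
print(strobe_sharpness > flame_sharpness)
```

The periodic signal concentrates its energy in one bin and produces a much larger ratio than the broadband signal, which is the separation the sharpness factor is meant to capture.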

7. The method for identifying the authenticity of fire sources based on image recognition according to claim 1, characterized in that: The joint decision is followed by a time delay integration step: a confidence integration slot is maintained for each suspected fire source target. When a single sliding window determines that the target is a real fire source, the confidence integration slot is increased by a preset positive increment. When the target is determined to be a false fire source, the confidence integration slot is decreased by a preset negative increment or cleared to zero. A fire alarm command is triggered only when the cumulative value of the confidence integration slot exceeds a preset alarm threshold.

8. The method for identifying the authenticity of fire sources based on image recognition according to claim 7, characterized in that: The absolute value of the negative increment is greater than the positive increment.
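The confidence integration of claims 7 and 8 behaves like an asymmetric counter. The sketch below is illustrative only: the class name `ConfidenceIntegrator`, the parameter names, and the numeric values are hypothetical, and flooring the accumulated value at zero is an assumption (the claims allow either a decrement or a reset to zero, without specifying a lower bound).

```python
class ConfidenceIntegrator:
    """Accumulates per-window decisions for one suspected fire source target."""

    def __init__(self, gain=1.0, penalty=2.0, alarm_threshold=3.0):
        # Claim 8: the decrement must be larger in magnitude than the increment.
        if penalty <= gain:
            raise ValueError("penalty must exceed gain")
        self.gain = gain
        self.penalty = penalty
        self.alarm_threshold = alarm_threshold
        self.value = 0.0

    def update(self, is_real_fire):
        """Apply one window decision; return True when the alarm should fire."""
        if is_real_fire:
            self.value += self.gain
        else:
            # Assumption: the accumulated value never drops below zero.
            self.value = max(0.0, self.value - self.penalty)
        return self.value > self.alarm_threshold


integrator = ConfidenceIntegrator()
decisions = [True, True, False, True, True, True, True]
alarms = [integrator.update(d) for d in decisions]
print(alarms)
```

With these illustrative values, a single false-source decision cancels two earlier real-fire decisions, so the alarm is raised only after four consecutive real-fire windows.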

9. The method for identifying the authenticity of fire sources based on image recognition according to claim 1, characterized in that: The constant frame rate is 30 frames per second, and the length of the time sliding window is 90 frames.

10. The method for identifying the authenticity of fire sources based on image recognition according to claim 1, characterized in that: In the joint decision step, when the spatial contour deformation variance is lower than the preset lower threshold, the calculation of the temporal frequency identification path is skipped, and the target is directly determined to be a false fire source.
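The joint decision of claim 1 with the early exit of claim 10 can be sketched as below. The function name `is_real_fire`, the threshold values, and the use of a callable to defer the frequency-domain computation are all assumptions made for illustration; the source does not give threshold values in this section.

```python
def is_real_fire(deformation_variance, compute_sharpness, variance_min, sharpness_max):
    """Joint decision on the two features, evaluating the cheap one first.

    compute_sharpness is a zero-argument callable so that the frequency-domain
    path runs only when the spatial feature does not already rule the target out.
    """
    if deformation_variance <= variance_min:
        return False  # rigid target: skip the frequency path entirely
    return compute_sharpness() < sharpness_max


# Hypothetical thresholds and a stand-in for the costly sharpness computation.
calls = []


def expensive_sharpness():
    calls.append("called")
    return 2.5


rigid_result = is_real_fire(0.0001, expensive_sharpness, variance_min=0.001, sharpness_max=8.0)
calls_after_rigid = len(calls)
flame_result = is_real_fire(0.02, expensive_sharpness, variance_min=0.001, sharpness_max=8.0)
strobe_result = is_real_fire(0.02, lambda: 40.0, variance_min=0.001, sharpness_max=8.0)
print(rigid_result, flame_result, strobe_result)
```

Passing the sharpness computation as a callable makes the saving explicit: for the rigid target, the frequency-domain path is never invoked.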