An aerial smoke sensing method based on image recognition

By constructing a four-dimensional spatiotemporal tensor and a spatiotemporal attention mechanism, the problem of inaccurate smoke source location in strong airflow environments in airborne smoke detection was solved, achieving accurate smoke source location and reliable fire indication under low-concentration smoke conditions.

CN122090424B · Active · Publication Date: 2026-06-30 · XIAN AERONAUTICAL POLYTECHNIC INST

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: XIAN AERONAUTICAL POLYTECHNIC INST
Filing Date: 2026-04-21
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

Existing image recognition-based aerial smoke detection methods cannot achieve accurate spatial positioning of smoke sources in strong airflow environments, leading to misjudgments and coordinate calculation errors, especially in low-concentration smoke conditions where the location of smoke sources cannot be effectively identified.

Method used

By constructing a four-dimensional spatiotemporal tensor and performing three-dimensional convolution operations to extract the spatiotemporal joint features of smoke particles, spatial attention and temporal attention weights are generated by combining airflow direction and smoke diffusion patterns, spatiotemporal aggregation is performed, three-dimensional coordinates of the smoke source are generated, and a confidence score for the location reliability is provided.

Benefits of technology

It effectively overcomes the problem of spatiotemporal shift of smoke particles in strong airflow environments, improves the positioning accuracy of smoke sources in the three-dimensional space of the cabin, and provides intuitive and reliable fire location indication, shortening the time for fire confirmation and response.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention relates to the field of aviation technology in image processing, specifically disclosing an aviation smoke detection method based on image recognition. The method acquires continuous image sequences from multiple viewpoints within the cabin, constructs a four-dimensional spatiotemporal tensor, extracts temporal feature sequences along the time axis, and performs directional convolution sampling based on airflow direction parameters to obtain spatiotemporal response values, which are then aggregated into a feature map set. Spatial weights are generated based on the physical laws of smoke diffusion, and temporal weights are generated based on grayscale fluctuation frequency; applying these weights to the feature map set yields a feature representation. The feature representation is decomposed into spatial response maps, and the three-dimensional coordinate parameters of the smoke source are obtained through spatial coordinate voting and weighted fusion. A confidence score is calculated based on the dispersion of the candidate coordinates relative to the airflow path. The coordinate information is matched to cabin spatial layout data to generate a situational output. This invention can locate smoke sources precisely in strong airflow environments and output a quantitative reliability indicator.

Description

Technical Field

[0001] This invention relates to the field of aviation safety technology in image processing, and specifically to an aviation smoke detection method based on image recognition.

Background Technology

[0002] With the development of image processing technology and airborne embedded platforms, camera-based visual smoke detection methods have gradually attracted attention. Cabin fire safety is an important component of civil aircraft operational safety. Traditional aviation smoke detection mainly uses point-type smoke detectors, such as photoelectric or ionization sensors, installed in areas such as the cabin ceiling, cargo hold, and electronics bay. These sensors trigger alarms by detecting changes in light scattering or ionization current caused by smoke particles entering the detection chamber.

[0003] In existing image recognition-based aviation smoke detection methods, the strong airflow in the cabin causes the same smoke particle cluster to be imaged at significantly different times by cameras at different viewpoints. Traditional positioning methods that rely on matching frames taken at the same instant therefore misidentify a single smoke source as multiple independent sources, or produce coordinate calculation errors. At the same time, dilution by the airflow makes the image texture feature points sparse. When the number of effective feature points falls below the three-dimensional reconstruction threshold, the positioning function fails completely, so the smoke source cannot be accurately located in space under the combined effect of strong airflow and low-concentration smoke.

Summary of the Invention

[0004] The purpose of this invention is to provide an image recognition-based method for detecting smoke in aviation, in order to solve the problems mentioned above.

[0005] The objective of this invention can be achieved through the following technical solutions:

[0006] An image recognition-based method for detecting smoke in aircraft includes the following steps:

[0007] S1. Acquire a continuous sequence of images from multiple preset locations within the cabin, align the image sequences according to the acquisition time, and construct a four-dimensional spatiotemporal tensor containing time, viewpoint, and spatial dimensions.

[0008] S2. Perform a three-dimensional convolution operation on the four-dimensional spatiotemporal tensor, extracting the spatiotemporal joint features of smoke particles along the time axis and the spatial axes simultaneously, to obtain a set of feature maps describing the dynamic evolution of smoke in the spatiotemporal domain.

[0009] S3. Based on the feature map set, generate spatial attention weights along the spatial dimension and temporal attention weights along the temporal dimension. Apply the spatial and temporal attention weights to the feature map set to obtain a feature representation focusing on the smoke diffusion pattern.

[0010] S4. Perform a spatiotemporal aggregation operation on the feature representation, map the aggregated features to coordinate parameters of the smoke source in the three-dimensional space of the cabin, and simultaneously generate a confidence score characterizing the reliability of the positioning.

[0011] S5. Based on the comparison between the confidence score and the preset threshold, output the three-dimensional coordinate information of the smoke source, and associate it with the cabin space layout data to generate a situational output indicating the location of the fire.

[0012] As a further aspect of the present invention: S1 specifically includes:

[0013] Based on the acquisition timestamps of image frames in each image sequence, image frames from different preset locations at the same time are time-synchronized and grouped to obtain multi-view synchronized frame groups;

[0014] Based on the preset air conditioning operating parameters in the cabin, spatial position compensation is performed on each image frame in the multi-view synchronous frame group so that the same physical space point under different views is mapped to the corresponding pixel position in each image frame.

[0015] The compensated image frames are stacked in chronological order and stitched together along the viewpoint dimension to form a four-dimensional spatiotemporal tensor.

[0016] As a further aspect of the present invention: S2 specifically includes:

[0017] Using each pixel in the four-dimensional spatiotemporal tensor as an anchor point, the grayscale value change sequence of each anchor point is extracted along the time axis to form a temporal feature sequence.

[0018] Based on the preset airflow direction parameters in the cabin, the temporal feature sequence is sampled by directional convolution along the airflow direction in the spatial dimension to obtain the spatiotemporal response value of each anchor point along the airflow propagation path.

[0019] The spatiotemporal response values of adjacent anchor points are locally aggregated to form a set of feature maps describing the dynamic evolution of smoke in the spatiotemporal domain.

[0020] As a further aspect of the present invention: obtaining the spatiotemporal response values of each anchor point along the airflow propagation path specifically includes:

[0021] Centered on the current anchor point in the time-series feature sequence, a neighborhood sampling path along the airflow direction is determined based on the airflow direction parameter. The neighborhood sampling path includes multiple consecutive anchor points located downstream of the current anchor point.

[0022] Perform element-wise weighted accumulation on the gray value change sequence of the current anchor point and the gray value change sequences of each downstream anchor point in the neighborhood sampling path;

[0023] The weighted summation result is output as the spatiotemporal response value of the current anchor point along the airflow propagation path.

[0024] As a further aspect of the present invention: S3 specifically includes:

[0025] The feature map set is divided into several spatial regions. According to the preset physical laws of smoke diffusion in the cabin, each spatial region is assigned an initial spatial weight. Expected smoke source regions are given a higher initial spatial weight than non-expected regions.

[0026] The grayscale fluctuation frequency of each spatial region in multiple consecutive frames is extracted along the time dimension. The grayscale fluctuation frequency is compared with the preset smoke fluctuation frequency band, and the time weight of each spatial region is generated based on the comparison result.

[0027] The initial spatial weights and temporal weights are multiplied element by element and then applied to the feature map set to obtain a feature representation focusing on the diffusion pattern of smoke.

[0028] As a further aspect of the present invention: S4 specifically includes:

[0029] The feature representation is decomposed into spatial response maps under multiple perspectives according to the acquisition viewpoint. Each spatial response map contains the smoke response intensity at each spatial location under the corresponding perspective.

[0030] Spatial coordinate voting is performed on the spatial response maps from each perspective. The spatial locations where the smoke response intensity in each spatial response map exceeds the preset threshold are projected onto the cabin three-dimensional spatial coordinate system to form a set of candidate coordinates for each perspective and the voting weight corresponding to each candidate coordinate.

[0031] The candidate coordinate sets from each perspective are weighted and fused according to the voting weights to obtain the coordinate parameters of the smoke source in the three-dimensional space of the cabin. At the same time, the confidence score is calculated based on the dispersion of the candidate coordinate sets from each perspective. The dispersion is inversely proportional to the confidence score.
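The weighted fusion in [0031] can be illustrated with a minimal sketch. It assumes the fusion is a voting-weight-weighted mean of the candidate coordinates; the function name `fuse_candidates` and the array layout are hypothetical and not taken from the patent text.

```python
import numpy as np

def fuse_candidates(candidates, weights):
    """Fuse candidate 3-D coordinates into one estimate.

    candidates: (N, 3) array-like of candidate coordinates from all viewpoints.
    weights:    (N,) array-like of voting weights, one per candidate.
    Assumption: fusion is a weighted mean; the patent only says
    "weighted and fused according to the voting weights".
    """
    c = np.asarray(candidates, dtype=float)
    w = np.asarray(weights, dtype=float)
    if w.sum() <= 0:
        raise ValueError("voting weights must sum to a positive value")
    return (w[:, None] * c).sum(axis=0) / w.sum()

# Two candidates on the x axis; the second has three times the voting weight.
source_xyz = fuse_candidates([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]], [1.0, 3.0])
```

Under this assumption the fused coordinate is pulled toward candidates with larger voting weights.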

[0032] As a further aspect of the present invention: the calculation of the confidence score based on the dispersion of the candidate coordinate sets from each viewpoint specifically includes:

[0033] Obtain all candidate coordinates in the candidate coordinate set for each viewpoint, calculate the spatial deviation distance between each candidate coordinate and the airflow path defined by the preset airflow direction parameter in the cabin, sum the spatial deviation distances of all candidate coordinates under each viewpoint and take the average value to obtain the average deviation value of each viewpoint.

[0034] Based on the maximum value of the smoke response intensity in the spatial response map of each viewpoint, the response confidence benchmark for each viewpoint is determined. The response confidence benchmark is multiplied by the reciprocal of the average deviation value to obtain the preliminary confidence level for each viewpoint.

[0035] The preliminary confidence levels of all viewpoints are summed and then divided by the total number of viewpoints to obtain the confidence score that characterizes the reliability of the positioning.
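The confidence calculation in [0033] to [0035] can be sketched as follows. This is an illustrative sketch under stated assumptions: the airflow path is modeled as a single straight line (an origin point plus a direction vector), and a small `eps` guards against division by zero. Both are assumptions, as are the function names.

```python
import numpy as np

def distance_to_line(points, origin, direction):
    """Perpendicular distance from each 3-D point to a straight line."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    rel = np.asarray(points, dtype=float) - np.asarray(origin, dtype=float)
    return np.linalg.norm(np.cross(rel, d), axis=1)

def confidence_score(views, origin, direction, eps=1e-6):
    """views: list of (candidate_coords, max_response) pairs, one per viewpoint.

    Per viewpoint: mean deviation of candidates from the airflow path,
    then preliminary confidence = max_response * (1 / mean deviation).
    The overall score is the mean of the preliminary values.
    """
    prelim = []
    for candidates, max_response in views:
        mean_dev = distance_to_line(candidates, origin, direction).mean()
        prelim.append(max_response / max(mean_dev, eps))  # eps is an assumption
    return float(np.mean(prelim))

views = [
    ([[0.0, 1.0, 0.0], [5.0, 3.0, 0.0]], 0.8),  # deviations 1 and 3, mean 2
    ([[1.0, 0.0, 4.0]], 0.4),                   # deviation 4
]
score = confidence_score(views, origin=[0, 0, 0], direction=[1, 0, 0])
```

Candidates that cluster tightly around the airflow path give a small mean deviation and therefore a high score, which matches the stated inverse relationship between dispersion and confidence.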

[0036] As a further aspect of the present invention: S5 specifically includes:

[0037] Obtain cabin space layout data, which includes the spatial coordinates of each seat position and the corresponding seat number, as well as the spatial coordinates of each aisle area and the corresponding aisle number.

[0038] Based on the comparison between the confidence score and the preset threshold, when the confidence score is greater than or equal to the preset threshold, the three-dimensional coordinate information of the smoke source is spatially matched with the cabin space layout data to determine the seat number or aisle number of the area where the smoke source is located.

[0039] The three-dimensional coordinate information and the corresponding seat number or aisle number are superimposed onto the cabin floor plan image to generate a situational output used to indicate the location of the fire.
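A minimal sketch of the threshold check and layout matching in [0037] and [0038] follows. It assumes that spatial matching means picking the nearest labeled layout point. The threshold value, the layout dictionary, and the name `locate_in_layout` are hypothetical examples.

```python
import numpy as np

def locate_in_layout(source_xyz, confidence, layout, threshold=0.5):
    """Return the seat or aisle label nearest to the smoke source.

    layout: dict mapping a label (seat number or aisle number) to its
            (x, y, z) coordinates in the cabin coordinate system.
    Returns None when the confidence is below the threshold.
    Assumptions: nearest-neighbor matching; threshold=0.5 is a placeholder.
    """
    if confidence < threshold:
        return None
    labels = list(layout)
    coords = np.array([layout[k] for k in labels], dtype=float)
    dists = np.linalg.norm(coords - np.asarray(source_xyz, dtype=float), axis=1)
    return labels[int(np.argmin(dists))]

layout = {"12A": (1.0, 0.0, 0.0), "12B": (2.0, 0.0, 0.0), "Aisle 3": (1.5, 1.0, 0.0)}
label = locate_in_layout((1.9, 0.1, 0.0), confidence=0.7, layout=layout)
```

The returned label would then be drawn onto the cabin floor plan together with the coordinates, as described in [0039].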

[0040] The beneficial effects of this invention are:

[0041] (1) This invention constructs a four-dimensional spatiotemporal tensor and performs directional convolution sampling along the airflow direction, incorporating the cabin air conditioning airflow parameters as known conditions into the calculation process of the spatiotemporal response value. This effectively overcomes the cross-view matching misalignment problem caused by the spatiotemporal shift of smoke particles under strong airflow conditions. At the same time, it enhances the weight of regions that conform to the smoke diffusion law and fluctuation frequency band through a spatiotemporal attention mechanism, thereby improving the positioning accuracy of the smoke source in the three-dimensional space of the cabin.

[0042] (2) By integrating the weighted voting of candidate coordinates from multiple viewpoints with a confidence score based on dispersion from the airflow path, this invention provides a quantitative indicator of positioning reliability while outputting the three-dimensional coordinates of the smoke source. The coordinate information is associated with the cabin seat number or aisle number and then superimposed on the cabin floor plan image, providing the crew with an intuitive and reliable fire location indication and effectively shortening the time for fire confirmation and handling.

Description of the Drawings

[0043] The invention will now be further described with reference to the accompanying drawings.

[0044] Figure 1 is a flowchart of the method of the present invention;

[0045] Figure 2 is a flowchart of the process for obtaining the spatiotemporal response value in the present invention.

Detailed Implementation

[0046] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0047] As shown in Figure 1, the present invention is an aviation smoke detection method based on image recognition, comprising the following steps:

[0048] S1. Acquire a continuous sequence of images from multiple preset locations within the cabin, align the image sequences according to the acquisition time, and construct a four-dimensional spatiotemporal tensor containing time, viewpoint, and spatial dimensions.

[0049] S2. Perform a three-dimensional convolution operation on the four-dimensional spatiotemporal tensor, extracting the spatiotemporal joint features of smoke particles along the time axis and the spatial axes simultaneously, to obtain a set of feature maps describing the dynamic evolution of smoke in the spatiotemporal domain.

[0050] S3. Based on the feature map set, generate spatial attention weights along the spatial dimension and temporal attention weights along the temporal dimension. Apply the spatial and temporal attention weights to the feature map set to obtain a feature representation focusing on the smoke diffusion pattern.

[0051] S4. Perform a spatiotemporal aggregation operation on the feature representation, map the aggregated features to coordinate parameters of the smoke source in the three-dimensional space of the cabin, and simultaneously generate a confidence score characterizing the reliability of the positioning.

[0052] S5. Based on the comparison between the confidence score and the preset threshold, output the three-dimensional coordinate information of the smoke source, and associate it with the cabin space layout data to generate a situational output indicating the location of the fire.

[0053] In S1, a continuous sequence of images collected from multiple preset locations within the cabin is acquired. These image sequences are then aligned according to their acquisition times to construct a four-dimensional spatiotemporal tensor containing time, viewpoint, and spatial dimensions. Specifically, this includes:

[0054] Multiple near-infrared cameras are installed at predetermined locations on the cabin ceiling. Each camera has a fixed acquisition frame rate of 25 frames per second, and the internal clocks of each camera are uniformly calibrated via an onboard time synchronization bus. Each camera continuously acquires near-infrared images of the cabin area, forming a continuous image sequence corresponding to its respective viewpoint. Each image sequence contains image frames arranged in chronological order of acquisition time, and each image frame carries an acquisition timestamp assigned by a unified clock.

[0055] First, based on the acquisition timestamps of image frames in each image sequence, image frames from different preset locations at the same time are time-synchronized and grouped. Specifically, with a time window of 1 second, image frames with acquisition timestamp deviations of no more than 10 milliseconds within this time window are grouped into the same group. This group contains image frames acquired by all cameras at the same time and is denoted as a multi-view synchronization frame group.
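The grouping rule in [0055] can be sketched as below. This is a simplified illustration: frames are represented only by `(camera_id, timestamp_ms)` pairs, the batch passed in is assumed to be one time window, and groups missing any camera are discarded. The function name and these simplifications are assumptions.

```python
def group_synchronized_frames(frames, num_cameras, tolerance_ms=10):
    """Group frames whose timestamps lie within tolerance_ms of each other.

    frames: iterable of (camera_id, timestamp_ms) pairs from all cameras.
    Returns a list of dicts {camera_id: timestamp_ms}; only groups that
    contain every camera are kept (assumption: incomplete groups are dropped).
    """
    ordered = sorted(frames, key=lambda f: f[1])
    groups, current, start_ts = [], {}, None
    for cam, ts in ordered:
        # Start a new group when the deviation exceeds the tolerance
        # or when this camera already contributed a frame to the group.
        if start_ts is None or ts - start_ts > tolerance_ms or cam in current:
            if len(current) == num_cameras:
                groups.append(current)
            current, start_ts = {}, ts
        current[cam] = ts
    if len(current) == num_cameras:
        groups.append(current)
    return groups

# Three cameras at 25 frames per second (40 ms spacing) with small clock jitter.
frames = [(c, 40 * i + j) for i in range(3) for c, j in [(0, 0), (1, 3), (2, 7)]]
groups = group_synchronized_frames(frames, num_cameras=3)
```

With a 40 ms frame interval and a 10 ms tolerance, frames from neighboring capture instants can never fall into the same group.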

[0056] Secondly, based on the preset air conditioning operating parameters in the cabin, spatial position compensation is performed on each image frame in the multi-view synchronous frame group. These air conditioning operating parameters include the current air supply flow rate and cabin pressure differential of the cabin air conditioning system. A pre-established correspondence between different air conditioning operating parameters and image pixel offsets under each camera's viewpoint is obtained by acquiring calibration images under different air conditioning conditions in a ground simulation cabin. For each image frame in the current multi-view synchronous frame group, based on the currently read air conditioning operating parameters, the offset of each pixel position under that condition is obtained. Each pixel in the image frame is then shifted in the opposite direction according to the offset, so that the same physical spatial point under different camera views is mapped to the corresponding pixel position in each image frame, resulting in a spatially compensated image frame.
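The reverse shift in [0056] can be sketched as follows. The sketch assumes one integer `(dy, dx)` offset per frame, looked up from calibration, whereas the text describes an offset for each pixel position. The helper names and the zero fill at exposed borders are likewise assumptions.

```python
import numpy as np

def shift_image(frame, dy, dx, fill=0):
    """Shift a 2-D frame by (dy, dx) pixels, filling exposed borders."""
    h, w = frame.shape
    out = np.full_like(frame, fill)
    if abs(dy) >= h or abs(dx) >= w:
        return out  # shifted entirely out of view
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = frame[src_y, src_x]
    return out

def compensate(frame, calibrated_offset):
    """Undo a calibrated pixel offset by shifting in the opposite direction."""
    dy, dx = calibrated_offset
    return shift_image(frame, -dy, -dx)

# A point that should appear at (1, 1) is observed at (2, 3), i.e. offset (1, 2).
frame = np.zeros((4, 4))
frame[2, 3] = 1.0
restored = compensate(frame, calibrated_offset=(1, 2))
```

A per-pixel offset field would need an interpolating remap instead of a rigid shift, which this sketch does not attempt.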

[0057] Finally, the spatially compensated image frames are stacked in chronological order and stitched together along the viewpoint dimension to form a four-dimensional spatiotemporal tensor. Specifically, 30 consecutive multi-view synchronized frame groups are selected, and the compensated image frames within each synchronized frame group are arranged side-by-side along the viewpoint dimension to form a three-dimensional tensor at that moment. The three dimensions of this three-dimensional tensor are image height, image width, and number of viewpoints. The 30 three-dimensional tensors at each moment are then stacked sequentially in chronological order to form a four-dimensional spatiotemporal tensor. The four dimensions of this four-dimensional spatiotemporal tensor are time, viewpoint, image height, and image width.
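The stacking order in [0057] maps directly onto array operations. The following sketch assumes each synchronized group is a list of equally sized 2-D frames; the function name is hypothetical.

```python
import numpy as np

def build_spacetime_tensor(sync_groups):
    """Build a (time, viewpoint, height, width) tensor.

    sync_groups: list of T groups, each a list of V frames of shape (H, W).
    """
    # Per moment: frames side by side along the viewpoint axis -> (H, W, V).
    per_moment = [np.stack(group, axis=-1) for group in sync_groups]
    # Stack the moments chronologically -> (T, H, W, V).
    stacked = np.stack(per_moment, axis=0)
    # Reorder to the dimension order named in the text -> (T, V, H, W).
    return np.moveaxis(stacked, -1, 1)

rng = np.random.default_rng(0)
sync_groups = [[rng.integers(0, 256, size=(4, 5)) for _ in range(3)] for _ in range(30)]
tensor = build_spacetime_tensor(sync_groups)
```

Indexing `tensor[:, v, y, x]` then gives the 30 grayscale values of one pixel in one viewpoint along the time axis.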

[0058] As shown in Figure 2, in S2 a three-dimensional convolution operation is performed on the four-dimensional spatiotemporal tensor to extract the spatiotemporal joint features of smoke particles along the time and spatial axes simultaneously, resulting in a set of feature maps describing the dynamic evolution of smoke in the spatiotemporal domain. Specifically, this includes:

[0059] Near-infrared cameras are installed at multiple predetermined locations on the cabin ceiling. Each camera captures near-infrared images of the cabin area at a fixed frame rate of 25 frames per second. The images captured at 30 consecutive moments are constructed into a four-dimensional spatiotemporal tensor according to the aforementioned step S1. The four dimensions of this four-dimensional spatiotemporal tensor are time, viewpoint, image height, and image width. The time dimension includes 30 moments, the viewpoint dimension includes the total number of cameras, the image height dimension corresponds to the number of pixels in the vertical direction of the image, and the image width dimension corresponds to the number of pixels in the horizontal direction of the image.

[0060] Using each pixel in the four-dimensional spatiotemporal tensor as an anchor point, the grayscale value change sequence of each anchor point along the time axis is extracted to form a temporal feature sequence. Specifically, for any pixel in the four-dimensional spatiotemporal tensor with a fixed viewpoint, fixed image height, and fixed image width, the grayscale values of that pixel at 30 consecutive time points are extracted sequentially along the time axis. These 30 grayscale values are arranged in chronological order to form the temporal feature sequence of that anchor point. The length of the temporal feature sequence is 30, and each element in the sequence is the grayscale value at the corresponding time point, an integer from 0 to 255.

[0061] Based on preset airflow direction parameters within the cabin, directional convolution sampling is performed on the temporal feature sequence along the airflow direction in the spatial dimension to obtain the spatiotemporal response values of each anchor point along the airflow propagation path. The airflow direction parameters are jointly determined by the layout of the air inlets of the cabin air conditioning system and the current airflow rate. The airflow direction vectors at each spatial location within the cabin are obtained in advance through computational fluid dynamics simulation, and the correspondence between these airflow direction vectors and spatial coordinates is stored in the onboard storage unit. During directional convolution sampling, for each anchor point, the corresponding airflow direction vector is queried based on the anchor point's coordinates in the cabin's three-dimensional space to determine the downstream direction along the airflow.

[0062] Centered on the current anchor point in the temporal feature sequence, a neighborhood sampling path is determined along the airflow direction based on the airflow direction parameter. This neighborhood sampling path includes multiple consecutive anchor points downstream of the current anchor point. Specifically, starting from the pixel position of the current anchor point in the image, subsequent pixels adjacent to the projection direction of the queried airflow direction vector on the image plane are selected sequentially, for a total of 5 consecutive anchor points selected in the downstream direction. Together with the current anchor point, a total of 6 anchor points constitute the neighborhood sampling path.
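The path selection in [0062] can be sketched as follows. The sketch assumes the airflow direction has already been projected onto the image plane as a `(d_row, d_col)` vector, steps along it so that consecutive points are adjacent pixels, and truncates the path at the image border. The function name and the truncation rule are assumptions.

```python
import numpy as np

def sampling_path(anchor, direction, shape, n_downstream=5):
    """Return the anchor plus up to n_downstream pixels along the airflow.

    anchor:    (row, col) of the current anchor point.
    direction: (d_row, d_col) projection of the airflow vector on the image.
    shape:     (height, width) of the image.
    """
    row, col = anchor
    d = np.asarray(direction, dtype=float)
    scale = np.max(np.abs(d))
    if scale == 0:
        raise ValueError("airflow direction must be non-zero")
    step = d / scale  # largest component is +/-1, so each step reaches a neighbor
    path = [(row, col)]
    for k in range(1, n_downstream + 1):
        r = int(round(row + k * step[0]))
        c = int(round(col + k * step[1]))
        if not (0 <= r < shape[0] and 0 <= c < shape[1]):
            break  # assumption: stop at the image border
        path.append((r, c))
    return path

path = sampling_path(anchor=(10, 10), direction=(0, 1), shape=(100, 100))
```

Away from the border the path holds 6 anchor points, matching the count given in the text.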

[0063] The grayscale value change sequence of the current anchor point is combined with the grayscale value change sequences of each downstream anchor point in the neighborhood sampling path by element-wise weighted accumulation. Let the temporal feature sequence of the current anchor point be denoted as sequence $A$, which contains 30 grayscale values $A(1), A(2), \dots, A(30)$, one for each time point. The temporal feature sequence of the $k$-th downstream anchor point is denoted as sequence $B_k$ ($k = 1, \dots, 5$); each sequence $B_k$ also contains 30 grayscale values. First, the similarity between the current anchor sequence and each downstream anchor sequence is calculated. The similarity is the reciprocal of the sum of the absolute values of the grayscale differences at each time point; that is, for each sequence $B_k$ and sequence $A$, the absolute value of the grayscale difference at corresponding positions is calculated at the 30 time points, the 30 absolute differences are summed, and the reciprocal is taken to obtain the similarity value $s_k$:

$$s_k = \frac{1}{\sum_{t=1}^{30} \lvert A(t) - B_k(t) \rvert}$$

Then the weighting coefficient $w_k$ of each downstream anchor point is determined from the similarity values. The weighting coefficient $w_k$ equals the similarity value $s_k$ of that downstream anchor point divided by the sum of the similarity values of all downstream anchor points:

$$w_k = \frac{s_k}{\sum_{j=1}^{5} s_j}$$

Finally, the current anchor sequence $A$ and each downstream anchor sequence $B_k$ are weighted and summed according to the weighting coefficients $w_k$ to obtain the spatiotemporal response value $R$ of the current anchor point. The calculation formula is as follows:

[0064] $$R = \alpha A + \sum_{k=1}^{5} w_k B_k$$

[0065] where $A$ represents the temporal feature sequence of the current anchor point; $B_k$ represents the temporal feature sequence of the $k$-th downstream anchor point in the neighborhood sampling path; and $s_k$ represents the similarity value between sequence $A$ and sequence $B_k$, equal to the reciprocal of the sum of the absolute values of the grayscale differences between the two sequences at each time point. When that sum is equal to 0, it is set to a preset minimum positive number. In the above formula, the first term represents the contribution of the current anchor sequence after adjustment by the attenuation factor $\alpha$, and the denominator of the attenuation factor includes the average degree of difference between the current anchor sequence and each downstream anchor sequence; the second term represents the contribution of each downstream anchor sequence after weighting by similarity.

[0066] The weighted summation result is output as the spatiotemporal response value of the current anchor point along the airflow propagation path. This spatiotemporal response value is a sequence of 30 values, where the value at each position in the sequence represents the joint response intensity of the anchor point and its downstream anchor points at that moment.
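The similarity weighting described in [0063] to [0066] can be sketched as follows. The exact expression of the attenuation factor is not given in the text, so the sketch exposes it as a plain `attenuation` parameter. The function name, that parameter, and the default `eps` value are assumptions.

```python
import numpy as np

def spatiotemporal_response(anchor_seq, downstream_seqs, attenuation=1.0, eps=1e-6):
    """Combine an anchor sequence with its downstream sequences.

    anchor_seq:      (T,) grayscale sequence of the current anchor point.
    downstream_seqs: (K, T) grayscale sequences of the K downstream anchors.
    attenuation:     placeholder for the attenuation factor (assumption).
    eps:             stands in for the "preset minimum positive number".
    """
    a = np.asarray(anchor_seq, dtype=float)
    b = np.asarray(downstream_seqs, dtype=float)
    diff = np.abs(b - a).sum(axis=1)       # sum of absolute differences per anchor
    diff = np.where(diff == 0, eps, diff)  # avoid division by zero
    similarity = 1.0 / diff
    weights = similarity / similarity.sum()
    return attenuation * a + weights @ b   # (T,) response sequence

a = np.full(30, 10.0)
b = np.stack([a + 1.0, a + 3.0])  # differences sum to 30 and 90
response = spatiotemporal_response(a, b)
```

In this example the closer downstream sequence receives weight 0.75 and the other 0.25, so sequences that evolve like the anchor dominate the response.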

[0067] The spatiotemporal response values of adjacent anchor points are locally aggregated to form a set of feature maps describing the dynamic evolution of smoke in the spatiotemporal domain. Specifically, for each time step, the spatiotemporal response values of all anchor points at that time step are reorganized according to the spatial arrangement of the original image to form a two-dimensional response map for that time step. The height and width of the two-dimensional response map are the same as those of the original image. The two-dimensional response maps at 30 time steps are arranged in chronological order to form a set of feature maps. This set of feature maps contains 30 two-dimensional response maps, and the pixel value in each two-dimensional response map represents the smoke response intensity at the corresponding spatial location at that time step.

[0068] In S3, spatial attention weights are generated along the spatial dimension based on the feature map set, and temporal attention weights are generated along the temporal dimension. These spatial and temporal attention weights are then weighted and applied to the feature map set to obtain a feature representation focusing on the smoke diffusion pattern, specifically including:

[0069] The feature map set is divided into several spatial regions. Based on the preset physical laws of smoke diffusion within the cabin, each spatial region is assigned an initial spatial weight, with expected smoke source regions receiving a higher initial spatial weight than non-expected regions. Specifically, each two-dimensional response map is divided into multiple rectangular regions according to pixel location, with each rectangular region being 16 pixels by 16 pixels. The physical laws of smoke diffusion are established in advance from historical statistical data on cabin fires and computational fluid dynamics simulation results. These laws define the probability of each spatial location within the cabin becoming a smoke source. For the space under the seats, the space inside the overhead luggage racks, and the area around the air conditioning vents, the probability value is set to 0.8; for the cabin aisle area and seat back area, the probability value is set to 0.3; and for other spatial locations, the probability value is set to 0.1. The average of the probability values corresponding to the spatial locations covered by each rectangular region is used as the initial spatial weight for that rectangular region. Expected smoke source regions, i.e. rectangular regions with a high probability value, thus receive a higher initial spatial weight, while non-expected regions receive a lower one.
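The block averaging in [0069] can be sketched as below, assuming a per-pixel probability map has already been prepared from the 0.8, 0.3 and 0.1 values and that the image size is a multiple of the block size. The function name and the example map are hypothetical.

```python
import numpy as np

def initial_spatial_weights(prob_map, block=16):
    """Average a per-pixel probability map over block x block regions.

    prob_map: (H, W) array of smoke-source probabilities per pixel.
    Returns an (H // block, W // block) array of initial spatial weights.
    """
    h, w = prob_map.shape
    if h % block or w % block:
        raise ValueError("image size must be a multiple of the block size")
    blocks = prob_map.reshape(h // block, block, w // block, block)
    return blocks.mean(axis=(1, 3))

prob_map = np.full((32, 32), 0.1)  # default: other spatial locations
prob_map[:16, :16] = 0.8           # e.g. the space under a seat
prob_map[16:, :8] = 0.3            # e.g. part of an aisle
spatial_w = initial_spatial_weights(prob_map)
```

A region that straddles two zones receives an intermediate weight, as in the lower-left block of the example.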

[0070] The grayscale fluctuation frequency of each spatial region within multiple consecutive frames is extracted along the time dimension. This grayscale fluctuation frequency is then compared with a preset smoke fluctuation frequency band, and a temporal weight for each spatial region is generated based on the comparison result. Specifically, for each rectangular region, the pixel values at the corresponding positions in the two-dimensional response maps at 30 time points are extracted to form the time-series data for that region. This time-series data contains one value per time point, namely the average of all pixel values within the rectangular region at that time point. The grayscale fluctuation frequency of this time-series data is calculated as follows: the direction of change of the value between adjacent time points is determined, and each change in direction from rising to falling or from falling to rising is counted as one fluctuation. The total number of fluctuations within the 30 time points is divided by 29 to obtain the grayscale fluctuation frequency of the rectangular region. The preset smoke fluctuation frequency band is 0.5 Hz to 2.0 Hz, obtained by collecting image data of the smoke diffusion process in a real cabin fire experiment and performing spectral analysis. The calculated grayscale fluctuation frequency is compared with the preset smoke fluctuation frequency band. If the grayscale fluctuation frequency is between 0.5 Hz and 2.0 Hz, the time weight of the rectangular region is set to 1.0; if it is below 0.5 Hz, the time weight is set to the grayscale fluctuation frequency divided by 0.5; if it is above 2.0 Hz, the time weight is set to 2.0 divided by the grayscale fluctuation frequency.

[0071] It should be noted that the temporal weight of 1.0 is based on the physical characteristic that the grayscale fluctuation frequency during actual smoke diffusion is concentrated in the range of 0.5 Hz to 2.0 Hz. Regions within this frequency band are assigned the highest weight of 1.0 to enhance characteristics consistent with smoke patterns, while the weight of regions deviating from this band is proportionally reduced to suppress non-smoke interference. Above the band, the temporal weight is assigned as the quotient of 2.0 divided by the grayscale fluctuation frequency. This rule is based on the physical characteristic that the upper limit of the grayscale fluctuation frequency of smoke diffusion is 2.0 Hz. High-frequency fluctuations above this band are considered non-smoke interference, and the reciprocal relationship reduces the weight monotonically as the frequency increases, thereby suppressing the interference of high-frequency noise with smoke detection.

[0072] The initial spatial weights and temporal weights are multiplied element-wise and then applied to the feature map set to obtain a feature representation focusing on the smoke diffusion pattern. Specifically, for each rectangular region, the initial spatial weight and temporal weight of that region are multiplied to obtain the comprehensive weight of that region. For the two-dimensional response map at each time step in the feature map set, each pixel value within each rectangular region of the two-dimensional response map is multiplied by the comprehensive weight corresponding to that rectangular region to obtain a weighted pixel value. The weighted two-dimensional response maps at all time steps together constitute a feature representation focusing on the smoke diffusion pattern. Spatial regions that conform to the characteristics of the smoke source location and the characteristics of smoke fluctuation frequency are enhanced, while spatial regions that do not conform to the above characteristics are suppressed.
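A minimal sketch of this weighting step, assuming the feature maps of one viewpoint are held as an array of shape (T, H, W) and both weight grids have one entry per 16 × 16 region. The function name `apply_region_weights` and the demo values are hypothetical.

```python
import numpy as np

def apply_region_weights(features, spatial_w, temporal_w, block=16):
    """features: (T, H, W); spatial_w and temporal_w: (H // block, W // block)."""
    combined = spatial_w * temporal_w  # comprehensive weight per region
    # Expand each region weight so it covers every pixel of its block.
    per_pixel = np.repeat(np.repeat(combined, block, axis=0), block, axis=1)
    # Broadcasting applies the same per-pixel weights at every time step.
    return features * per_pixel[np.newaxis, :, :]

features = np.ones((30, 32, 32))
spatial_w = np.array([[0.8, 0.1], [0.3, 0.1]])
temporal_w = np.array([[1.0, 0.5], [1.0, 1.0]])
weighted = apply_region_weights(features, spatial_w, temporal_w)
```

Because the weights are constant within a region and over time, regions that match both the location prior and the frequency band keep most of their response, while the others are scaled down.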

[0073] In S4, a spatiotemporal aggregation operation is performed on the feature representation, mapping the aggregated features to coordinate parameters of the smoke source in the three-dimensional space of the cabin, and simultaneously generating a confidence score characterizing the reliability of the positioning, specifically including:

[0074] The feature representation focusing on the smoke diffusion pattern obtained in step S3 is used as input. This feature representation contains two-dimensional response maps at 30 time points. Each two-dimensional response map corresponds to a collection viewpoint. The value of each pixel in the two-dimensional response map represents the weighted smoke response intensity at the corresponding time point.

[0075] The feature representation is decomposed into spatial response maps at multiple viewpoints according to the acquisition perspective. Each spatial response map contains the smoke response intensity at each spatial location under the corresponding viewpoint. Specifically, each viewpoint in the feature representation corresponds to a set of two-dimensional response maps for 30 consecutive time moments. The maximum response intensity value at the same pixel location at each time moment in this set of two-dimensional response maps is taken to obtain a spatial response map for that viewpoint. The height and width of the spatial response map are consistent with the original acquired image. The pixel value in the spatial response map represents the maximum smoke response intensity at the corresponding spatial location under that viewpoint within the entire time window.
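The per-viewpoint collapse over the time window amounts to a per-pixel maximum. The sketch below assumes an axis order of (time, viewpoint, height, width), which the text does not fix; the function name is hypothetical.

```python
import numpy as np

def spatial_response_maps(features):
    """features: (T, V, H, W) -> (V, H, W), per-pixel maximum over the time axis."""
    return features.max(axis=0)

# Hypothetical input: 30 time points, 2 viewpoints, 4 x 4 maps, one strong response.
demo_features = np.zeros((30, 2, 4, 4))
demo_features[7, 1, 2, 3] = 5.0
maps = spatial_response_maps(demo_features)
```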

[0076] Spatial coordinate voting is performed on the spatial response maps from each viewpoint. Spatial locations in each spatial response map where the smoke response intensity exceeds a preset threshold are projected onto the cabin's three-dimensional spatial coordinate system, forming a candidate coordinate set for each viewpoint and a corresponding voting weight for each candidate coordinate. The preset threshold is set to twice the average value of all pixel values ​​in the spatial response map, obtained by summing all pixel values ​​and dividing by the total number of pixels. For each viewpoint's spatial response map, all pixel positions are traversed; when a pixel value is greater than the preset threshold, that pixel position is designated as a candidate pixel. Based on pre-calibrated internal and external parameters of the camera for that viewpoint, a mapping relationship is established between the candidate pixel coordinates and the cabin's three-dimensional spatial coordinates. This mapping relationship is obtained by acquiring calibration board images in a ground simulation cabin and solving the homography matrix. The three-dimensional spatial coordinates of each candidate pixel are used as a candidate coordinate for that viewpoint, and the pixel value of that candidate pixel is used as the corresponding voting weight, forming a candidate coordinate set for that viewpoint. Each candidate coordinate in the set is accompanied by a voting weight.
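The candidate selection rule (pixel value greater than twice the map mean) can be sketched as below. The projection into cabin coordinates is left out, because it depends on calibration data that the text does not reproduce; the function name `candidate_pixels` is hypothetical.

```python
import numpy as np

def candidate_pixels(response_map):
    """Return (row, col) positions and voting weights above twice the map mean."""
    threshold = 2.0 * response_map.mean()
    rows, cols = np.nonzero(response_map > threshold)
    weights = response_map[rows, cols]  # the pixel value doubles as the voting weight
    return np.stack([rows, cols], axis=1), weights

# Hypothetical 4 x 4 map: mean is 9/16, so the threshold is 1.125.
demo_map = np.zeros((4, 4))
demo_map[1, 2] = 8.0
demo_map[3, 0] = 1.0
positions, votes = candidate_pixels(demo_map)
```

In a full pipeline each returned position would then be passed through the calibrated pixel-to-cabin mapping for its viewpoint.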

[0077] The candidate coordinate sets from each viewpoint are weighted and fused according to their voting weights to obtain the coordinate parameters of the smoke source in the cabin's three-dimensional space. Specifically, all candidate coordinates and their corresponding voting weights from all viewpoints are collected. Each of the three coordinate components of a candidate coordinate is multiplied by that candidate's voting weight to obtain weighted coordinate components. The weighted components of all candidate coordinates are summed per axis to obtain the weighted total x-coordinate, total y-coordinate, and total z-coordinate. Each weighted total is then divided by the sum of the voting weights of all candidate coordinates to obtain the x-, y-, and z-coordinate parameters of the smoke source. These three coordinate parameters together constitute the coordinate parameters of the smoke source in the cabin's three-dimensional space.
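The fusion described here is a weighted centroid. A minimal sketch, with a hypothetical function name and made-up candidate values in millimeters:

```python
import numpy as np

def fuse_candidates(coords, weights):
    """Weighted centroid of candidate 3D coordinates (N x 3) with N voting weights."""
    coords = np.asarray(coords, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    if total == 0:
        raise ValueError("no voting weight available")
    # Per-axis weighted sum divided by the sum of all voting weights.
    return (coords * weights[:, np.newaxis]).sum(axis=0) / total

source = fuse_candidates([[0, 0, 0], [1000, 2000, 400]], [1, 3])
```

With weights 1 and 3, the fused point lies three quarters of the way toward the second candidate.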

[0078] The confidence score is calculated based on the dispersion of the candidate coordinate sets for each viewpoint, with the dispersion inversely proportional to the confidence score. First, all candidate coordinates in the candidate coordinate sets for each viewpoint are obtained. Then, the spatial deviation distance between each candidate coordinate and the airflow path defined by the preset airflow direction parameters within the cabin is calculated. The preset airflow direction parameters include the airflow direction vectors at each spatial location within the cabin, obtained through computational fluid dynamics simulation. The airflow path is defined as a straight line passing through the expected area of ​​the smoke source along the airflow direction. The spatial deviation distance is calculated as follows: a perpendicular line is drawn from the candidate coordinate to the airflow path; the distance between the foot of the perpendicular and the candidate coordinate is the spatial deviation distance. The average deviation value for each viewpoint is obtained by summing the spatial deviation distances of all candidate coordinates for each viewpoint and taking the average value.
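The perpendicular distance from a candidate coordinate to a straight airflow path, and the per-viewpoint average, can be sketched as follows. Representing the path by one point plus a direction vector is an assumption made for illustration; both function names are hypothetical.

```python
import numpy as np

def deviation_from_path(point, path_point, direction):
    """Perpendicular distance from a 3D point to the line path_point + t * direction."""
    direction = np.asarray(direction, dtype=float)
    norm = np.linalg.norm(direction)
    if norm == 0:
        raise ValueError("airflow direction must be non-zero")
    unit = direction / norm
    offset = np.asarray(point, dtype=float) - np.asarray(path_point, dtype=float)
    # Remove the component along the path; what remains is the perpendicular.
    perpendicular = offset - np.dot(offset, unit) * unit
    return float(np.linalg.norm(perpendicular))

def average_deviation(points, path_point, direction):
    """Mean perpendicular distance of one viewpoint's candidate coordinates."""
    return sum(deviation_from_path(p, path_point, direction) for p in points) / len(points)

d = deviation_from_path([3, 4, 10], [0, 0, 0], [0, 0, 2])
mean_d = average_deviation([[3, 4, 10], [0, 0, 7]], [0, 0, 0], [0, 0, 2])
```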

[0079] The preset airflow direction parameters are obtained as follows. First, the boundary conditions of the airflow field are determined from the geometric contour of the cabin ceiling, the seat arrangement, the lower edge height of the overhead luggage racks, and the installation coordinates of the air outlets and return air inlets of the air conditioning system. At the same time, the air supply flow rate of the air conditioning system under typical operating conditions is read from the airborne data bus; these typical operating conditions include a common range of 0.2 to 0.5 cubic meters per second during the cruise phase. Then, computational fluid dynamics is used to numerically simulate the airflow distribution within the cabin. Specifically, the cabin space is discretized into a cubic grid with cells of 0.05 meters × 0.05 meters × 0.05 meters. The Navier-Stokes equations are solved for each grid cell, iterating until the velocity change between two consecutive iterations is less than 0.01 meters per second, at which point the solution is considered converged. This yields the airflow velocity vector at the center coordinates of each grid cell, and the direction of this vector is taken as the airflow direction at that coordinate point. Finally, the spatial coordinates of all grid cells and their corresponding airflow direction vectors are stored in an onboard storage unit as a lookup table. The lookup table is arranged in coordinate order, with each entry occupying 8 bytes: the first 3 bytes hold the spatial coordinate value, and the last 5 bytes hold the three direction cosine values of the airflow direction vector. During actual detection, the currently acquired real-time air supply flow rate is divided by the air supply flow rate under typical operating conditions to obtain a scaling factor. This scaling factor is then multiplied by the direction cosine values stored in the lookup table to obtain the corrected airflow direction parameters.
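The run-time correction at the end of this paragraph can be sketched as below. The sketch only covers the scaling step and follows the text literally; the byte-level table layout is not modeled, and the function name and sample values are hypothetical.

```python
def corrected_direction(direction_cosines, realtime_flow, typical_flow):
    """Scale stored direction cosines by the ratio of real-time to typical supply flow."""
    if typical_flow <= 0:
        raise ValueError("typical flow rate must be positive")
    scale = realtime_flow / typical_flow
    return tuple(scale * c for c in direction_cosines)

# Hypothetical table entry pointing along the first axis; flow rates in cubic meters per second.
corrected = corrected_direction((1.0, 0.0, 0.0), realtime_flow=0.5, typical_flow=0.25)
```

After scaling, the three values no longer form a unit vector unless the two flow rates are equal, so a consumer that needs a pure direction would have to renormalize.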

[0080] Based on the maximum smoke response intensity in the spatial response map of each viewpoint, a response confidence benchmark is determined for each viewpoint. The response confidence benchmark equals the maximum of all pixel values in the spatial response map for that viewpoint. The initial confidence level for each viewpoint is obtained by multiplying the response confidence benchmark by the reciprocal of the average deviation value. When the average deviation value is zero, the reciprocal of the average deviation value is set to a preset upper limit value, which is the reciprocal of the diagonal length of the spatial response map. The initial confidence levels of all viewpoints are summed and divided by the total number of viewpoints to obtain a confidence score characterizing the positioning reliability. This confidence score ranges from 0 to 1, with a higher value indicating higher positioning reliability.
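A minimal sketch of the confidence computation, with hypothetical function names and input values. It does not clamp the result, so the stated 0 to 1 range holds only if the response intensities and deviation distances are scaled accordingly, which the text does not spell out.

```python
def viewpoint_confidence(max_response, mean_deviation, reciprocal_cap):
    """Response benchmark times the reciprocal of the average deviation."""
    # For a zero deviation the reciprocal is replaced by the preset upper limit value.
    reciprocal = reciprocal_cap if mean_deviation == 0 else 1.0 / mean_deviation
    return max_response * reciprocal

def confidence_score(per_viewpoint):
    """Mean of the initial confidence levels over all viewpoints."""
    return sum(per_viewpoint) / len(per_viewpoint)

c_a = viewpoint_confidence(max_response=0.8, mean_deviation=2.0, reciprocal_cap=0.01)
c_b = viewpoint_confidence(max_response=0.8, mean_deviation=0.0, reciprocal_cap=0.01)
score = confidence_score([0.4, 0.6])
```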

[0081] In S5, based on the comparison between the confidence score and a preset threshold, the three-dimensional coordinate information of the smoke source is output, and this three-dimensional coordinate information is correlated with the cabin space layout data to generate a situational output indicating the location of the fire, specifically including:

[0082] The three-dimensional coordinate parameters of the smoke source and the confidence score obtained in step S4 are used as input. The three-dimensional coordinate parameters of the smoke source comprise horizontal, longitudinal and vertical coordinate values, all in millimeters. The origin of the coordinates is set at the ground projection point on the cabin floor near the front cabin door. The horizontal coordinate is positive toward the right side of the cabin, the longitudinal coordinate is positive toward the rear of the cabin, and the vertical coordinate is positive toward the top of the cabin.

[0083] The cabin space layout data is acquired and pre-stored in the onboard data storage unit. This data includes the spatial coordinate range of each seat position and its corresponding seat number, as well as the spatial coordinate range of each aisle area and its corresponding aisle number. The spatial coordinate range of each seat position is defined by the coordinates of its four corner points in three-dimensional space, forming a rectangular area. The spatial coordinate range of each aisle area is defined by the coordinates of its boundary line, forming a polygonal area. Seat numbers use a three-digit code: the first digit indicates the row number, and the last two digits indicate the seat number. Aisle numbers use a two-digit code: the first digit indicates the aisle type, and the second digit indicates the aisle sequence number.

[0084] Based on the comparison between the confidence score and a preset threshold, when the confidence score is greater than or equal to the preset threshold, the three-dimensional coordinate information of the smoke source is spatially matched with the cabin space layout data to determine the seat number or aisle number of the area where the smoke source is located. The preset threshold is set to 0.6, which was obtained by conducting 100 smoke detection experiments at different fire source locations in a ground simulation cabin and statistically analyzing the distribution of confidence scores when the location was successfully determined. The spatial matching proceeds as follows. The two floor-plane coordinate values of the smoke source (the horizontal coordinate and the coordinate pointing toward the rear of the cabin) are used to construct a two-dimensional coordinate point. The spatial coordinate ranges of all seat positions in the cabin space layout data are traversed to determine whether the two-dimensional coordinate point falls within the rectangular area of any seat. If it does, the seat number of that seat is extracted as the matching result. If it does not fall within any seat area, the spatial coordinate ranges of all aisle areas are traversed to determine whether the two-dimensional coordinate point falls within the polygonal area of any aisle. If it does, the aisle number of that aisle is extracted as the matching result. If the point does not fall within any aisle area either, the matching result is marked as an unlocated area.
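The seat-first, aisle-second matching can be sketched as follows. Seats are modeled as axis-aligned rectangles and aisles as vertex lists tested with a standard ray-casting check; the data layout, the function names, the sample layout, and the choice to return `None` below the threshold are all assumptions made for illustration.

```python
def in_rect(p, rect):
    """rect = (x_min, y_min, x_max, y_max) in floor-plane coordinates."""
    x0, y0, x1, y1 = rect
    return x0 <= p[0] <= x1 and y0 <= p[1] <= y1

def in_polygon(p, poly):
    """Ray-casting point-in-polygon test; poly is a list of (x, y) vertices."""
    x, y = p
    inside = False
    for i in range(len(poly)):
        xa, ya = poly[i]
        xb, yb = poly[(i + 1) % len(poly)]
        if (ya > y) != (yb > y):  # edge crosses the horizontal line through p
            x_cross = xa + (y - ya) * (xb - xa) / (yb - ya)
            if x < x_cross:
                inside = not inside
    return inside

def match_location(score, point_xy, seats, aisles, threshold=0.6):
    if score < threshold:
        return None  # assumption: no matching is attempted below the threshold
    for number, rect in seats.items():
        if in_rect(point_xy, rect):
            return number
    for number, poly in aisles.items():
        if in_polygon(point_xy, poly):
            return number
    return "unlocated"

# Hypothetical layout in millimeters: one seat and one rectangular aisle.
seats = {"101": (0, 0, 500, 500)}
aisles = {"11": [(500, 0), (1000, 0), (1000, 3000), (500, 3000)]}
```

Calling `match_location(0.7, (250, 250), seats, aisles)` returns the seat number, while a point outside both areas yields the unlocated marker.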

[0085] The 3D coordinate information and corresponding seat or aisle numbers are overlaid onto the cabin floor plan image to generate a situational output indicating the location of a fire. The cabin floor plan image is pre-stored in the onboard data storage unit. This image is a two-dimensional image, and its coordinate system is consistent with the cabin space layout data. The image includes the outlines of each seat, the boundary lines of each aisle, and the corresponding seat and aisle numbers. When the matching result is a seat number, a highlighted mark is overlaid on the corresponding seat outline in the cabin floor plan image, and the seat number and the 3D coordinate parameters of the smoke source are displayed in text form next to the seat outline. When the matching result is an aisle number, a highlighted mark is overlaid on the corresponding aisle area in the cabin floor plan image, and the aisle number and the 3D coordinate parameters of the smoke source are displayed in text form next to the aisle area. When the matching result is an unlocated area, only the 3D coordinate parameters of the smoke source are displayed in the cabin floor plan image, and the corresponding position is marked with a circular marker. The overlaid cabin floor plan image is transmitted as the situational output to the cockpit display equipment for the crew to identify the location of the fire.

[0086] The working principle of this invention is as follows. First, multiple near-infrared cameras acquire continuous image sequences from various perspectives within the cabin. Image frames captured at the same time are synchronously grouped according to their acquisition timestamps, and spatial position compensation is performed on each image frame based on air conditioning operating parameters. The compensated image frames are then stacked in chronological order and stitched along the perspective dimension to construct a four-dimensional spatiotemporal tensor containing time, perspective, image height, and image width. Next, using each pixel in this four-dimensional spatiotemporal tensor as an anchor point, a grayscale value change sequence is extracted along the time axis to form a temporal feature sequence. The downstream neighborhood sampling path of each anchor point is determined according to preset airflow direction parameters. The similarity between the current anchor point sequence and each downstream anchor point sequence is calculated as the reciprocal of the sum of the absolute values of their time-by-time grayscale differences. The current anchor point sequence and the downstream anchor point sequences are weighted and accumulated according to similarity to obtain the spatiotemporal response value of each anchor point along the airflow propagation path, and the spatiotemporal response values of adjacent anchor points are locally aggregated to form a feature map set. This feature map set is then divided into several spatial regions, and an initial spatial weight is assigned to each region based on preset physical laws of smoke diffusion. Simultaneously, the grayscale fluctuation frequency of each region over consecutive frames is extracted and compared with the preset smoke fluctuation frequency band to generate temporal weights. The two weights are multiplied element-wise and applied to the feature map set to obtain a feature representation focusing on the smoke diffusion pattern. Subsequently, this feature representation is decomposed into spatial response maps according to viewpoint. Spatial coordinate voting is conducted on the locations in each map where the smoke response intensity exceeds a threshold, forming candidate coordinate sets and voting weights. The three-dimensional coordinate parameters of the smoke source are obtained through weighted fusion. The confidence score is calculated from the spatial deviation distance between the candidate coordinates of each viewpoint and the preset airflow path, together with the maximum response intensity of each viewpoint. Finally, when the confidence score reaches the preset threshold, the three-dimensional coordinates of the smoke source are matched with the seat number or aisle number in the cabin space layout data. The matching results and coordinate information are superimposed on the cabin floor plan image to generate a situational output indicating the location of the fire.

[0087] The foregoing has provided a detailed description of one embodiment of the present invention, but this description is merely a preferred embodiment and should not be construed as limiting the scope of the invention. All equivalent variations and modifications made within the scope of the claims of this invention should still fall within the patent coverage of this invention.

Claims

1. An aerial smoke detection method based on image recognition, characterized in that it includes the following steps: S1. Acquire a continuous sequence of images from multiple preset locations within the cabin, align the image sequences according to the acquisition time, and construct a four-dimensional spatiotemporal tensor containing time, viewpoint, and spatial dimensions. S2. Perform a three-dimensional convolution operation on the four-dimensional spatiotemporal tensor, simultaneously extracting the spatiotemporal joint features of smoke particles along the time and spatial axes, to obtain a set of feature maps describing the dynamic evolution of smoke in the spatiotemporal domain, specifically including: using each pixel in the four-dimensional spatiotemporal tensor as an anchor point, extracting the gray value change sequence of each anchor point along the time axis to form a temporal feature sequence; based on the preset airflow direction parameters in the cabin, sampling the temporal feature sequence by directional convolution along the airflow direction in the spatial dimension to obtain the spatiotemporal response value of each anchor point along the airflow propagation path; and locally aggregating the spatiotemporal response values of adjacent anchor points to form the set of feature maps describing the dynamic evolution of smoke in the spatiotemporal domain; wherein obtaining the spatiotemporal response value of each anchor point along the airflow propagation path specifically includes: centered on the current anchor point in the temporal feature sequence, determining a neighborhood sampling path along the airflow direction based on the airflow direction parameters, the neighborhood sampling path including multiple consecutive anchor points located downstream of the current anchor point; performing element-wise weighted accumulation on the gray value change sequence of the current anchor point and the gray value change sequences of each downstream anchor point in the neighborhood sampling path; and outputting the weighted accumulation result as the spatiotemporal response value of the current anchor point along the airflow propagation path. S3. Generate spatial attention weights along the spatial dimension and temporal attention weights along the temporal dimension based on the feature map set, and apply the spatial attention weights and temporal attention weights to the feature map set to obtain a feature representation focusing on the smoke diffusion pattern. S4. Perform a spatiotemporal aggregation operation on the feature representation, map the aggregated features to coordinate parameters of the smoke source in the three-dimensional space of the cabin, and simultaneously generate a confidence score characterizing the reliability of the positioning. S5. Based on the comparison between the confidence score and the preset threshold, output the three-dimensional coordinate information of the smoke source, and associate the three-dimensional coordinate information with the cabin space layout data to generate a situational output for indicating the location of the fire.

2. The image recognition-based aerial smoke detection method according to claim 1, characterized in that, S1 specifically includes: Based on the acquisition timestamps of image frames in each image sequence, image frames from different preset locations at the same time are time-synchronized and grouped to obtain multi-view synchronized frame groups; Based on the preset air conditioning operating parameters in the cabin, spatial position compensation is performed on each image frame in the multi-view synchronous frame group so that the same physical space point under different views is mapped to the corresponding pixel position in each image frame. The compensated image frames are stacked in chronological order and stitched together along the viewpoint dimension to form a four-dimensional spatiotemporal tensor.

3. The aerial smoke detection method based on image recognition according to claim 1, characterized in that, S3 specifically includes: The feature map set is divided into several spatial regions. According to the preset physical laws of smoke diffusion in the cabin, each spatial region is assigned an initial spatial weight. The expected smoke source region is given a higher initial spatial weight than the unexpected region. The grayscale fluctuation frequency of each spatial region in multiple consecutive frames is extracted along the time dimension. The grayscale fluctuation frequency is compared with the preset smoke fluctuation frequency band, and the time weight of each spatial region is generated based on the comparison result. The initial spatial weights and temporal weights are multiplied element by element and then applied to the feature map set to obtain a feature representation focusing on the diffusion pattern of smoke.

4. The aerial smoke detection method based on image recognition according to claim 1, characterized in that, S4 specifically includes: The feature representation is decomposed into spatial response maps under multiple perspectives according to the acquisition viewpoint. Each spatial response map contains the smoke response intensity at each spatial location under the corresponding perspective. Spatial coordinate voting is performed on the spatial response maps from each perspective. The spatial locations where the smoke response intensity in each spatial response map exceeds the preset threshold are projected onto the cabin three-dimensional spatial coordinate system to form a set of candidate coordinates for each perspective and the voting weight corresponding to each candidate coordinate. The candidate coordinate sets from each perspective are weighted and fused according to the voting weights to obtain the coordinate parameters of the smoke source in the three-dimensional space of the cabin. At the same time, the confidence score is calculated based on the dispersion of the candidate coordinate sets from each perspective. The dispersion is inversely proportional to the confidence score.

5. The image recognition-based aerial smoke detection method according to claim 4, characterized in that the calculation of the confidence score based on the dispersion of the candidate coordinate sets from each viewpoint specifically includes: obtaining all candidate coordinates in the candidate coordinate set for each viewpoint, calculating the spatial deviation distance between each candidate coordinate and the airflow path defined by the preset airflow direction parameters in the cabin, and summing the spatial deviation distances of all candidate coordinates under each viewpoint and taking the average to obtain the average deviation value of each viewpoint; determining the response confidence benchmark for each viewpoint based on the maximum value of the smoke response intensity in the spatial response map of that viewpoint; multiplying the response confidence benchmark by the reciprocal of the average deviation value to obtain the preliminary confidence level for each viewpoint; and summing the preliminary confidence levels of all viewpoints and dividing by the total number of viewpoints to obtain a confidence score that characterizes the reliability of the positioning.

6. The image recognition-based aerial smoke detection method according to claim 1, characterized in that, S5 specifically includes: Obtain cabin space layout data, which includes the spatial coordinates of each seat position and the corresponding seat number, as well as the spatial coordinates of each aisle area and the corresponding aisle number. Based on the comparison between the confidence score and the preset threshold, when the confidence score is greater than or equal to the preset threshold, the three-dimensional coordinate information of the smoke source is spatially matched with the cabin space layout data to determine the seat number or aisle number of the area where the smoke source is located. The three-dimensional coordinate information and the corresponding seat number or aisle number are superimposed onto the cabin floor plan image to generate a situational output used to indicate the location of the fire.