A method for intelligent identification of inspection images of a thermal power plant

By combining time-series video instance segmentation and sonar imaging technology with real-time equipment operating parameters, the problem of false alarms under varying operating conditions during equipment inspection in thermal power plants has been solved, achieving accurate defect identification and analysis.

CN122244818A · Pending · Publication Date: 2026-06-19 · DATANG YANGLING THERMAL POWER CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: DATANG YANGLING THERMAL POWER CO LTD
Filing Date: 2026-05-20
Publication Date: 2026-06-19


IPC: G06V20/56; G06N3/042; G06N5/04; G10L21/10; G06V10/26; G06V10/764; G06V10/80; G06V10/82; G06V20/40

AI Tagging

Application Domain

Speech analysis; Character and pattern recognition




AI Technical Summary

Technical Problem

In existing equipment inspections at thermal power plants, static image comparison and sound detection solutions cannot distinguish between real defects and normal operating condition fluctuations when the equipment is operating under varying conditions, resulting in a high false alarm rate.

Method used

Temporal video instance segmentation and sonar imaging are combined with real-time equipment operating parameters. Causal inference feature attribution then quantifies the relationship between the fused features and abnormal states, and interpretable anomaly identification results are output.

Benefits of technology

The method isolates the feature interference caused by normal load fluctuations of the equipment, outputs anomaly identification results and defect cause analysis with causal logic, and reduces the false alarm rate.



Abstract

This invention discloses an intelligent image recognition method for thermal power plant inspections, belonging to the field of intelligent image recognition technology. This method synchronously collects continuous inspection video and full-band acoustic signature data using an inspection robot, aligning the timestamps and converting the acoustic signature data into a sonar imaging atlas that matches the video frames at the pixel level. It uses time-series video instance segmentation to track moving targets, fusing appearance, dynamic operation, and acoustic emission features to generate a joint feature matrix. Real-time operating parameters from the DCS system are introduced to normalize and correct the joint feature matrix, removing interference from operating condition fluctuations. A causal inference feature attribution algorithm is used to quantify the causal relationship between features and anomalies, locate core feature factors, and output interpretable anomaly identification and defect cause analysis results. This invention achieves spatiotemporal fusion of audiovisual data and decoupling from operating conditions, suppresses motion blur, reduces false alarm rates, and is suitable for dynamic inspections of thermal power plant equipment, improving recognition accuracy and result traceability.


Description

Technical Field

[0001] This invention relates to the field of intelligent image recognition technology, specifically to an intelligent image recognition method for inspection of thermal power plants.

Background Technology

[0002] Currently, equipment inspections in thermal power plants generally employ static image-based defect identification schemes. Specifically, an inspection robot equipped with a visible light camera stops at various points along a fixed path and takes photos of the equipment. The backend system uses a conventional convolutional neural network to process each frame of the image, extracting the texture and geometric features of the equipment surface. This data is then compared with a pre-set defect sample library, outputting a binary classification result indicating whether an anomaly has occurred. Regarding sound detection, existing solutions typically collect audio segments independently, extract acoustic features, and perform thresholding. This results in a disconnect between image and sound data in both time and space. Furthermore, the recognition process relies on the original distribution of the input data, with the recognition result presented as a direct output, and the internal feature mapping process is closed.

[0003] When the above-mentioned existing technical solutions are used to identify anomalies in the dynamic operation of thermal power generation equipment, they rely on single-frame static image comparison and do not introduce equipment operating status parameters as a reference. When the equipment is in a variable operating condition, the changes in equipment appearance and the sound field fluctuations caused by normal load fluctuations are judged directly as defect features by the backend system. This leads to false alarms caused by operating condition interference, and real physical defects cannot be distinguished from normal operating condition fluctuations.

Summary of the Invention

[0004] The purpose of this invention is to provide a solution that can effectively address the problems described in the background section.

[0005] To achieve the above objective, the technical solution adopted by the present invention is as follows. A method for intelligent recognition of inspection images in thermal power plants includes: acquiring the continuous inspection video sequence and full-band acoustic signature data synchronously collected by the inspection robot, and completing timestamp synchronization and alignment; converting the acoustic signature data into a sonar imaging map that matches the video frames at the pixel level, where the pixel values of the map represent the sound pressure level and spectral characteristics of the corresponding spatial location; using temporal video instance segmentation to track moving targets in the video sequence and extract appearance change and dynamic operation features, while extracting anomalous acoustic emission features from the sonar imaging map, and fusing these features to generate a joint feature matrix; acquiring the real-time operating parameters of the equipment collected by the DCS system and performing operating condition normalization correction on the joint feature matrix; and using causal inference feature attribution to quantify the causal correlation between the fused features and abnormal states, locate the core feature factors, and output interpretable anomaly identification results and defect cause analysis results.

[0006] Preferably, the method of tracking moving targets in a video sequence using temporal video instance segmentation includes: The video sequence is input into a two-stream network containing a spatial mask branch and a temporal memory branch. The vortex dynamics constraint from computational fluid dynamics is introduced to model the pixel motion field between adjacent frames as an optical flow field. The two-dimensional vortex field distribution of the optical flow field is calculated. The vortex field is defined as the curl scalar field of the optical flow field on the image plane. In the temporal memory branch, an adaptive deformation convolution kernel is constructed based on the vortex field distribution. The adaptive deformation convolution kernel is applied to the feature map of the temporal memory branch to resample and aggregate the pixel features that undergo topological deformation between frames. This compensates for the pixel motion offset caused by the high-speed operation and vibration of the equipment, completes vortex compensation aggregation, suppresses motion blur of thermal power generation equipment under high-speed operation or vibration, and outputs an accurate temporal instance mask.

[0007] Preferably, converting the acoustic signature data into a sonar imaging map that matches the video frames at the pixel level includes: applying the autofocus processing used in synthetic aperture sonar imaging to perform phase compensation and beamforming on the full-band acoustic signature data, generating an initial sonar imaging map; introducing aperture synthesis technology from radio astronomy together with a joint constraint based on image edge priors, in which the geometric contours of device edges extracted from the visible light video sequence of the same scene serve as spatial prior constraints; constructing a joint inverse problem model containing an acoustic data fidelity term and an optical edge prior regularization term, whose objective function is defined as the L2 norm of the difference between the initial sonar imaging map and the map to be solved plus the L1 norm of the device edge prior constraint; and solving the objective function with the alternating direction method of multipliers (ADMM).

[0008] Preferably, performing operating condition normalization correction on the joint feature matrix includes: constructing the real-time operating parameters of the equipment as a topological manifold structure in non-Euclidean space. Specifically, the real-time operating parameter vector of the equipment at each moment is mapped to a point on the topological manifold, and the dimension of the manifold is consistent with the dimension of the operating parameter vector. The geodesic distance metric from Riemannian geometry is introduced, and the Riemannian metric tensor on the manifold is defined as a Gaussian kernel metric based on the state transition probability of the queuing network and the Euclidean distance between parameter vectors. The shortest path deviation between the real-time operating parameters and the historical standard operating parameters on the manifold surface is calculated, and the operating condition migration matrix is computed from this deviation. The joint feature matrix is then projected by parallel transport along the tangent space of the manifold to eliminate the feature distribution drift caused by operation under different operating conditions, and the pure joint feature matrix after operating condition decoupling is output.

[0009] Preferably, using causal inference feature attribution to quantify the causal correlation between the fused features and abnormal states includes: constructing a structural causal model of the fused features and abnormal states, and introducing a counterfactual network framework from epidemiological etiology inference. The counterfactual network framework includes an association layer, an intervention layer, and a counterfactual layer. The association layer corresponds to the observational distribution of the structural causal model and models the observational correlation between features and abnormal states. The intervention layer corresponds to the interventional distribution and models, through intervention variables, the distribution of abnormal states after a feature is intervened on. The counterfactual layer corresponds to the counterfactual distribution and generates features and abnormal states under counterfactual scenarios that have not occurred. The intervention variable is the do-operator acting on the endogenous variables of the structural causal model, specifically $do(X_d = x)$, indicating that the d-th dimension feature variable $X_d$ is forced to the value $x$, or $do(Y = 0)$, indicating that the abnormal state variable $Y$ is forced to the normal state. Using the observed fused features as factual conditions, counterfactual fused features without anomalies are generated through the intervention variables. The mean causal effect of the factual and counterfactual features in the potential outcome space is calculated, the dimensions of the fused features are sorted in descending order of mean causal effect, and the top-ranked core feature factors are extracted.

[0010] Preferably, constructing the adaptive deformation convolution kernel in the temporal memory branch based on the vorticity field distribution includes: drawing on the wave function interference principle in quantum mechanics, converting the curl in the vorticity field distribution into a complex phase feature of the target pixel; dynamically allocating the update threshold of the hidden state in the temporal memory branch according to the coherence intensity of the complex phase features; and eliminating redundant motion features caused by background thermal ripple interference through destructive phase interference, retaining only the constructive interference features caused by mechanical defects of the equipment, to generate the adaptive deformation convolution kernel.

[0011] Preferably, solving the joint inverse problem model using the alternating direction method of multipliers (ADMM) includes: introducing the local resonance mechanism from the field of acoustic metamaterials and adding a resonance scattering regularization term to the augmented Lagrangian function of ADMM. The resonance scattering regularization term is expressed in terms of $S(f)$, the acoustic signal spectrum of the corresponding pixel in the sonar imaging map, the Dirac function $\delta(\cdot)$, and $f_0$, the resonant frequency of the equivalent acoustic Helmholtz resonator formed by the internal cavity of the thermal power generation equipment. The internal cavity structure of the equipment is treated as an equivalent acoustic Helmholtz resonator, and the resonant frequency $f_0$ is obtained either by calculation from the geometric parameters of the equipment cavity or by calibration against acoustic data recorded under historical normal operating conditions. During the iterative solution, pixel values of the sonar imaging map whose spectrum deviates from the resonant frequency are penalized by the resonance scattering regularization term, which enhances the energy concentration, in the sonar imaging map, of the weak abnormal noise caused by tiny cracks inside the equipment.

[0012] Preferably, the real-time operating parameters of the device are constructed as a topological manifold structure in non-Euclidean space, including: By introducing the queuing network dynamics model from operations research, the load change process of multiple related devices in real-time operating parameters is mapped as a sequence of arrival rate and service rate of Markov-modulated Poisson process. The node connection weights of the topological manifold structure are constructed based on the state transition probability of the queuing network. The information entropy of the node characteristics is calculated by combining the entropy increase principle in thermodynamics. Nodes with information entropy below a preset threshold are pruned in the topological manifold structure to reduce the computational dimension of the working condition normalization correction.

[0013] Preferably, calculating the mean causal effect of factual and counterfactual features in the potential outcome space includes: drawing on the disease gene module tracing method in network medicine, mapping the distribution differences of factual and counterfactual features in the potential outcome space into a multi-layered heterogeneous causal graph; and, in the multi-layered heterogeneous causal graph, introducing a message passing mechanism from graph neural networks to propagate activation signals of abnormal states along causal edges. The cumulative activation energy received by each feature node is calculated, and this cumulative activation energy replaces the mean causal effect as the quantitative indicator for locating core feature factors. The cumulative activation energy is the sum of the hidden states of the feature node over all iterations of message passing: $E_d = \sum_{l=1}^{L} h_d^{(l)}$, where $E_d$ is the cumulative activation energy of the d-th feature node, $L$ is the total number of iterations in message passing, and $h_d^{(l)}$ is the hidden state of the d-th dimension feature node in the l-th iteration.
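The accumulation described in [0013] can be illustrated with a minimal sketch. It assumes scalar hidden states and a purely linear message-passing update, neither of which is specified by the patent text; the function name `cumulative_activation_energy`, the edge-weight matrix `adj`, and the toy three-node graph are hypothetical.

```python
def cumulative_activation_energy(adj, h0, num_iters):
    """Propagate activations and accumulate them per node.

    adj[i][j] is the (hypothetical) weight of the edge carrying messages
    from node j to node i; h0 holds the initial hidden state of each node.
    """
    n = len(h0)
    h = list(h0)
    energy = [0.0] * n
    for _ in range(num_iters):
        # One linear message-passing step: each node sums its weighted inputs.
        h = [sum(adj[i][j] * h[j] for j in range(n)) for i in range(n)]
        # E_d accumulates the hidden state of node d over iterations 1..L.
        energy = [e + v for e, v in zip(energy, h)]
    return energy


# Nodes 0 and 1 are feature nodes; node 2 is the abnormal-state node.
adj = [
    [0.0, 0.0, 0.9],  # node 0 receives from node 2
    [0.5, 0.0, 0.2],  # node 1 receives from nodes 0 and 2
    [0.0, 0.0, 0.0],  # node 2 receives nothing
]
energy = cumulative_activation_energy(adj, h0=[0.0, 0.0, 1.0], num_iters=2)
# Rank the two feature nodes by accumulated energy, largest first.
ranking = sorted(range(2), key=lambda d: energy[d], reverse=True)
```

In this toy graph the activation starts at the abnormal-state node, and the feature node that accumulates the most energy is ranked first as the candidate core feature factor.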

[0014] Preferably, introducing the message passing mechanism from graph neural networks into the multi-layered heterogeneous causal graph includes: drawing on model predictive control strategies from cybernetics, constructing, in each iteration step of message passing, a joint cost function of forward prediction loss and backward causal consistency constraint; and solving the joint cost function by the interior point method while adaptively adjusting the decay coefficient of the edge weights in the graph neural network, so as to block redundant message passing paths caused by spurious correlations in the multi-layered heterogeneous causal graph. The output is a sparse causal graph that retains only direct causal links, which completes traceability verification.

[0015] Compared with the prior art, the beneficial effects of the present invention are as follows: 1. This invention uses the real-time operating parameters of the equipment to apply operating condition normalization correction to the joint feature matrix, and uses a causal inference feature attribution algorithm to quantify the correlation between features and abnormal states. This removes the feature interference caused by normal load fluctuations of the equipment and outputs anomaly identification results and defect cause analysis with causal logic. It solves the problem that static identification schemes cannot distinguish real defects from normal operating condition fluctuations under variable operating conditions, which leads to false alarms.

[0016] 2. In temporal video processing, this invention introduces vortex dynamics constraints to construct an adaptive deformation convolution kernel and performs vortex compensation aggregation on pixel features that undergo inter-frame topological deformation, suppressing motion blur under high-speed operation. In acoustic imaging, it combines aperture synthesis technology with a solution constrained by the device edge geometric contour prior to achieve sub-pixel-level spatial registration between sonar imaging maps and video frames. In operating parameter processing, it constructs a topological manifold structure in non-Euclidean space and eliminates feature distribution drift through parallel transport projection. In causal inference, it uses a counterfactual network framework to generate counterfactual fused features and calculate the mean causal effect. In message passing, it applies a model predictive control strategy to adaptively adjust the edge weight decay coefficient, blocking redundant message passing paths caused by spurious correlations and completing the traceability verification of the recognition results.

Attached Figure Description

[0017] Figure 1 is a general flowchart of the overall method of the present invention; Figure 2 is a flowchart of the temporal video instance segmentation process of the present invention; Figure 3 is a flowchart of the conversion from acoustic signature data to the sonar imaging map of the present invention; Figure 4 is a flowchart of the operating condition normalization correction process of the present invention; Figure 5 is a flowchart of the causal inference feature attribution process of the present invention; Figure 6 is a flowchart of the sparse causal graph optimization process of the present invention.

Detailed Implementation

[0018] As a preferred embodiment, referring to Figures 1 to 3, the system acquires the continuous inspection video sequence and full-band acoustic signature data synchronously collected by the inspection robot and completes timestamp synchronization and alignment. The inspection robot's synchronization clock unit adopts the IEEE 1588 Precision Time Protocol. It assigns a unique acquisition timestamp $t_i$ to each video frame of the visible light camera unit, where $i$ is the sequence number of the video frame and $N$ is the total number of frames in the video sequence, and a unique acquisition timestamp $\tau_j$ to each sampling point of the full-band acoustic signature acquisition unit, where $j$ is the sequence number of the sampling point and $M$ is the total number of sampling points of the acoustic signature data. Timestamp synchronization and alignment proceeds as follows: taking the acquisition timestamp $t_i$ of a video frame as the reference, the acoustic signature data segment within an interval of one frame period $\Delta t$ referenced to $t_i$ is extracted from the timeline of the acoustic signature data, where $\Delta t = 1/f_v$ is the sampling period of a video frame and $f_v$ is the acquisition frame rate of the video sequence. Each video frame thus corresponds to a unique acoustic signature data segment of equal duration, which completes the synchronization and alignment of the timestamps of the video sequence and the acoustic signature data.
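The alignment step in [0018] can be sketched as follows. This is a minimal illustration, assuming integer microsecond timestamps on a shared clock and assuming that the segment for a frame covers the half-open interval starting at the frame timestamp and lasting one frame period; the patent text does not state the exact interval bounds. The function name `align_audio_to_frames` and the toy rates are hypothetical.

```python
from bisect import bisect_left


def align_audio_to_frames(frame_ts_us, audio_ts_us, frame_period_us):
    """Map each video frame to the index range of its audio samples.

    Returns a list of (start, stop) pairs such that audio_ts_us[start:stop]
    holds the samples with timestamps in [t_i, t_i + frame_period_us).
    Both timestamp lists must be sorted in ascending order.
    """
    segments = []
    for t in frame_ts_us:
        start = bisect_left(audio_ts_us, t)
        stop = bisect_left(audio_ts_us, t + frame_period_us)
        segments.append((start, stop))
    return segments


# Toy data: 4 frames at 25 fps (40 000 us period), audio sampled at 1 kHz.
frame_period_us = 40_000
frame_ts_us = [i * frame_period_us for i in range(4)]
audio_ts_us = [j * 1_000 for j in range(200)]
segments = align_audio_to_frames(frame_ts_us, audio_ts_us, frame_period_us)
```

Using integer timestamps avoids floating-point boundary effects, so every frame receives a segment of exactly the same number of samples and no sample is assigned to two frames.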

[0019] The acoustic signature data is converted into a sonar imaging map that matches the video frames at the pixel level. The pixel values of the map represent the sound pressure level and spectral characteristics of the corresponding spatial location. For the acoustic signature data segment corresponding to the synchronized and aligned i-th video frame, a beamforming algorithm is used for spatial focusing to generate a sonar imaging map $P_i$ with the same resolution as the i-th video frame. The pixel coordinates $(x, y)$ of the sonar imaging map correspond one-to-one with the pixel coordinates $(x, y)$ of the video frame, achieving pixel-level matching. The pixel value $P_i(x, y)$ of the pixel $(x, y)$ in the sonar imaging map $P_i$ is calculated from $\mathbf{a}(x, y)$, the steering vector corresponding to the pixel coordinates $(x, y)$, where the superscript $H$ denotes the conjugate transpose; $\mathbf{R}_i$, the array received-data matrix of the acoustic signature data segment; and $p_{ref}$, the reference sound pressure, the value is [value to be filled in]. Meanwhile, the pixel value $P_i(x, y)$ embeds the spectral features of the corresponding spatial location. Specifically, the focused acoustic signal corresponding to the pixel $(x, y)$ is subjected to a Fast Fourier Transform to obtain the spectrum vector $\mathbf{g}_i(x, y)$, and the peak frequency and frequency band energy distribution of $\mathbf{g}_i(x, y)$ are mapped as additional dimensions of the pixel value, so that each pixel in the sonar imaging map contains both the sound pressure level and the spectral characteristics.
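A minimal sketch of the per-pixel focusing described in [0019], assuming a narrowband delay-and-sum beamformer and assuming the pixel value is computed as 20·log10 of the beamformed RMS pressure over the reference pressure. The exact formula is not recoverable from the source, so this form, the function names `steering_vector` and `pixel_spl`, the microphone layout, and the 20 µPa reference used in the toy call are all illustrative assumptions.

```python
import cmath
import math


def steering_vector(mic_positions, focus_point, freq_hz, c=343.0):
    """Unit-modulus phase terms for the path from focus_point to each mic."""
    vec = []
    for mic in mic_positions:
        dist = math.dist(mic, focus_point)
        vec.append(cmath.exp(-2j * math.pi * freq_hz * dist / c))
    return vec


def pixel_spl(steer, snapshots, p_ref):
    """Assumed form: 20*log10(rms(|a^H r| / M) / p_ref) over all snapshots."""
    m = len(steer)
    power = 0.0
    for snap in snapshots:
        # Delay-and-sum output for one snapshot: a^H r, averaged over mics.
        y = sum(a.conjugate() * r for a, r in zip(steer, snap)) / m
        power += abs(y) ** 2
    rms = math.sqrt(power / len(snapshots))
    return 20.0 * math.log10(rms / p_ref)


# Toy scene: 4-microphone line array, one 2 kHz source of 1 Pa RMS.
mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (0.3, 0.0, 0.0)]
source = (0.15, 0.0, 1.0)
a_src = steering_vector(mics, source, 2000.0)
snapshots = [[a * cmath.exp(1j * 0.7 * n) for a in a_src] for n in range(8)]

p_ref = 20e-6  # illustrative airborne reference pressure, not from the source
spl_on = pixel_spl(a_src, snapshots, p_ref)
spl_off = pixel_spl(steering_vector(mics, (1.0, 0.0, 1.0), 2000.0), snapshots, p_ref)
```

Evaluating `pixel_spl` for the steering vector of every pixel location would yield the sound pressure level channel of the map; the spectral channels would come from a separate FFT of the focused signal.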

[0020] Temporal video instance segmentation is employed to track moving targets in video sequences, extracting features of appearance changes and dynamic operation. Simultaneously, anomalous acoustic emission features from sonar imaging maps are extracted and fused to generate a joint feature matrix. Temporal video instance segmentation uses a temporal convolutional network to segment the device to be detected in a continuous video sequence, generating an instance mask for the device in each frame. Based on the instance mask, moving target tracking is performed on the same device in adjacent frames, extracting appearance change features, including geometric and textural features of surface cracks, deformation, leakage, and component detachment; and extracting dynamic operation features, including temporal variations in the rotational speed, vibration amplitude, and trajectory of rotating components. Simultaneously, for the sonar imaging map synchronized with the video frames, anomalous acoustic emission features of the corresponding region are extracted based on the same instance mask, including features of sudden changes in sound pressure level, spectral shift, and anomalies in resonance peaks. The extracted appearance change features, dynamic operation features, and anomalous acoustic emission features are concatenated along the feature dimension to generate a joint feature matrix $F \in \mathbb{R}^{D \times T}$, where $D$ is the total number of feature dimensions and $T$ is the time step length of the video sequence.

[0021] Referring to Figure 4, the system acquires the real-time operating parameters of the equipment collected by the DCS system and performs operating condition normalization correction on the joint feature matrix. It retrieves from the DCS system database the real-time operating parameters of the equipment synchronized with the video sequence timestamps. These operating parameters include, but are not limited to, equipment load, inlet and outlet pressure, inlet and outlet temperature, speed, current, voltage, and flow rate. The real-time operating parameters are constructed into an operating condition parameter vector $\mathbf{p}_t$, where $t$ is the time step. The operating condition deviation coefficient $\alpha_t$ is calculated from the operating condition parameter vector $\mathbf{p}_t$ and the preset standard operating condition parameter vector $\mathbf{p}_0$ using the L2 norm $\|\cdot\|_2$. Based on the operating condition deviation coefficient $\alpha_t$, the feature vector $\mathbf{f}_t$ at the t-th time step of the joint feature matrix $F$ is normalized and corrected to give the corrected feature vector $\tilde{\mathbf{f}}_t$, and all corrected feature vectors constitute the joint feature matrix $\tilde{F}$ after operating condition normalization correction.
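The correction in [0021] can be sketched under explicit assumptions, because the formulas themselves are missing from the source. The sketch assumes the deviation coefficient is the relative L2 distance between the current and standard parameter vectors, and that the correction divides each feature vector by one plus that coefficient. Both forms, and the function names `deviation_coefficient` and `normalize_features`, are hypothetical stand-ins.

```python
import math


def deviation_coefficient(params, standard):
    """Assumed form: ||p_t - p_0||_2 / ||p_0||_2."""
    num = math.sqrt(sum((p - s) ** 2 for p, s in zip(params, standard)))
    den = math.sqrt(sum(s ** 2 for s in standard))
    return num / den


def normalize_features(features, params_seq, standard):
    """Assumed correction: f_t / (1 + alpha_t) for every time step t.

    features[t] is the feature vector at time step t and params_seq[t]
    is the operating condition parameter vector at the same time step.
    """
    corrected = []
    for f_t, p_t in zip(features, params_seq):
        alpha = deviation_coefficient(p_t, standard)
        corrected.append([v / (1.0 + alpha) for v in f_t])
    return corrected


standard = [3.0, 4.0]                 # toy standard operating condition
params_seq = [[3.0, 4.0], [6.0, 8.0]]  # at standard load, then at double load
features = [[1.0, 2.0], [2.0, 4.0]]
corrected = normalize_features(features, params_seq, standard)
```

In the toy data the second time step runs at twice the standard load and its raw features are twice as large; after correction both time steps yield the same feature vector, which is the kind of drift removal the paragraph describes.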

[0022] Referring to Figure 5, causal inference feature attribution is used to quantify the causal correlation between the fused features and abnormal states, locate the core feature factors, and output interpretable anomaly identification results and defect cause analysis results. A structural causal model (SCM) is constructed, which includes exogenous variables $U$ and endogenous variables $X_d$ and $Y$. The endogenous variable $X_d$ is the d-th feature dimension of the corrected joint feature matrix, and the endogenous variable $Y \in \{0, 1\}$ is the label for the abnormal status of the device, where $Y = 0$ indicates a normal state and $Y = 1$ represents an abnormal state. Based on the structural causal model, the average treatment effect $ATE_d$ of each feature dimension $X_d$ on the abnormal state $Y$ is calculated as the measure of causal correlation: $ATE_d = \mathbb{E}[Y \mid do(X_d = x_d^{a})] - \mathbb{E}[Y \mid do(X_d = x_d^{n})]$, where $do(\cdot)$ is the intervention operator, $x_d^{a}$ is the abnormal value of feature $X_d$, $x_d^{n}$ is the normal value of feature $X_d$, and $\mathbb{E}[\cdot]$ is the mathematical expectation. The $ATE_d$ values of all feature dimensions are sorted by absolute value in descending order, and the top $K$ feature dimensions are extracted as the core feature factors, where $K$ is a preset positive integer. Based on the physical meaning of the core feature factors, the system outputs anomaly identification results, including the abnormal device, abnormal location, and abnormal type. It also outputs defect cause analysis results, including the causal relationship between the core feature factors and the abnormal state, and an analysis of the defect generation mechanism.
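The ranking step in [0022] can be sketched as below. The sketch assumes that an intervention on one feature can be emulated by overwriting that feature while leaving the others at their observed values, which only holds if the features do not causally influence one another. The outcome model `toy_predict` and the helper names are hypothetical and stand in for whatever model the structural causal model provides.

```python
def average_treatment_effects(predict, samples, abnormal, normal):
    """Estimate E[Y | do(X_d = abnormal_d)] - E[Y | do(X_d = normal_d)] per d.

    predict(x) returns the modelled probability of the abnormal state.
    """
    effects = []
    for d in range(len(abnormal)):
        total = 0.0
        for x in samples:
            x_abn = list(x)
            x_abn[d] = abnormal[d]   # emulate do(X_d = abnormal value)
            x_nor = list(x)
            x_nor[d] = normal[d]     # emulate do(X_d = normal value)
            total += predict(x_abn) - predict(x_nor)
        effects.append(total / len(samples))
    return effects


def top_k_factors(effects, k):
    """Indices of the k features with the largest absolute effect."""
    order = sorted(range(len(effects)), key=lambda d: abs(effects[d]), reverse=True)
    return order[:k]


def toy_predict(x):
    # Hypothetical outcome model: feature 0 dominates, feature 2 is irrelevant.
    return 0.8 * x[0] + 0.1 * x[1] + 0.0 * x[2]


samples = [[0.0, 1.0, 0.5], [1.0, 0.0, 0.2], [0.5, 0.5, 0.9]]
effects = average_treatment_effects(
    toy_predict, samples, abnormal=[1.0, 1.0, 1.0], normal=[0.0, 0.0, 0.0]
)
core = top_k_factors(effects, k=2)
```

Sorting by absolute value means a feature that strongly suppresses the abnormal state would rank as highly as one that strongly promotes it, matching the ordering rule in the paragraph.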

[0023] As a preferred embodiment, the autofocus processing used in synthetic aperture sonar imaging is employed to perform phase compensation and beamforming on the full-band acoustic signature data, generating an initial sonar imaging map. For the synchronized and aligned acoustic signature data segments, the phase gradient autofocus (PGA) algorithm is used for phase error compensation. Specifically, pulse compression is performed along the range dimension of the acoustic signature data, the phase error of strong scattering points is extracted, and the phase error function $\phi_e(t)$ is solved using the minimum entropy criterion, where $t$ is fast time. Based on the phase error function $\phi_e(t)$, phase compensation is applied to each range gate of the acoustic signature data, giving the compensated data $s_c(t) = s(t)\,e^{-j\phi_e(t)}$, where $s(t)$ is the data before compensation and $j$ is the imaginary unit. A delay-and-sum beamforming algorithm then performs synthetic aperture processing on the phase-compensated data to generate the initial sonar imaging map $\mathbf{x}_0$.
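The compensation step in [0023] amounts to multiplying each sample by a conjugate phase term. The sketch below assumes the phase error has already been estimated (the PGA estimation itself is not shown) and adds a small entropy helper to illustrate the kind of sharpness score a minimum entropy criterion would minimize. The function names and toy values are hypothetical.

```python
import cmath
import math


def compensate_phase(signal, phase_error):
    """Apply s_c = s * exp(-j * phi_e) sample by sample."""
    return [s * cmath.exp(-1j * phi) for s, phi in zip(signal, phase_error)]


def image_entropy(samples):
    """Shannon entropy of the normalized intensity |s|^2 (lower is sharper)."""
    energy = [abs(s) ** 2 for s in samples]
    total = sum(energy)
    return -sum((e / total) * math.log(e / total) for e in energy if e > 0.0)


clean = [1.0 + 0.0j, 0.0 + 0.5j, -0.25 + 0.25j]
phase_error = [0.3, -1.2, 2.0]  # assumed to be known for this illustration
corrupted = [s * cmath.exp(1j * phi) for s, phi in zip(clean, phase_error)]
restored = compensate_phase(corrupted, phase_error)

# A single bright point has lower entropy than an evenly smeared response.
focused_entropy = image_entropy([1.0, 0.0, 0.0, 0.0])
blurred_entropy = image_entropy([0.5, 0.5, 0.5, 0.5])
```

With an exact phase estimate the restored samples match the clean ones up to rounding; in practice the estimate would be refined until the entropy of the focused image stops decreasing.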

[0024] This embodiment introduces aperture synthesis techniques from radio astronomy and incorporates joint constraints based on image edge priors. The geometric contours of device edges extracted from visible light video sequences within the same scene are used as spatial prior constraints to construct a joint inverse problem model that includes an acoustic data fidelity term and an optical edge prior regularization term. For the synchronized video frames, the edge geometry of the device is extracted using the Canny edge detection operator, and a binarized edge prior mask $M$ is generated. In the mask $M$, the pixel value is 1 in the device's edge region and 0 in the non-edge region. Based on aperture synthesis technology, the acoustic observation process is modeled as a linear observation model: $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n}$, where $\mathbf{y}$ is the acoustic observation data, $\mathbf{A}$ is the acoustic observation matrix, $\mathbf{x}$ is the sonar imaging map to be solved, and $\mathbf{n}$ is the observation noise. Simultaneously, the edge prior constraint from optical observation is modeled as a regularization term: $R_{edge}(\mathbf{x}) = \|(1 - M) \odot \nabla\mathbf{x}\|_1$, where ∇ is the gradient operator and ⊙ is the Hadamard product. The L1 norm regularization term constrains the gradient sparsity in non-device edge regions while preserving gradient information in device edge regions. A joint inverse problem model of acoustic and optical observations is constructed, with the objective function $J(\mathbf{x}) = D(\mathbf{x}) + \lambda R_{edge}(\mathbf{x})$, where $D(\mathbf{x})$ is the acoustic data fidelity term and $\lambda$ is the weight coefficient of the edge prior regularization term.
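To make the edge prior in [0024] concrete, the sketch below evaluates an L1 gradient penalty that is switched off wherever the edge mask equals 1. It assumes forward finite differences and an anisotropic L1 norm (sum of absolute horizontal and vertical differences), which the patent text does not specify; the function name `edge_prior_penalty` and the toy image are hypothetical.

```python
def edge_prior_penalty(image, edge_mask):
    """Sum of (1 - M) * (|dx| + |dy|) over all pixels.

    image and edge_mask are lists of rows with identical shape;
    differences that would cross the image border are taken as zero.
    """
    h, w = len(image), len(image[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            dx = image[y][x + 1] - image[y][x] if x + 1 < w else 0.0
            dy = image[y + 1][x] - image[y][x] if y + 1 < h else 0.0
            # Gradients on known device edges (mask == 1) are not penalized.
            total += (1 - edge_mask[y][x]) * (abs(dx) + abs(dy))
    return total


# A vertical step between columns 1 and 2 of a 3 x 4 image.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
no_edges = [[0, 0, 0, 0] for _ in range(3)]
edge_at_step = [[0, 1, 0, 0] for _ in range(3)]

penalty_without_prior = edge_prior_penalty(image, no_edges)
penalty_with_prior = edge_prior_penalty(image, edge_at_step)
```

The same intensity step costs 3.0 when no optical edge is known at that location and nothing when the mask marks it as a device edge, which is how the prior steers sharp transitions in the sonar map toward the optical contours.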

[0025] By solving the joint inverse problem model using the alternating direction method of multipliers (ADMM), sub-pixel-level spatial registration of sonar imaging maps and video frames is achieved. Introducing the local resonance mechanism from the field of acoustic metamaterials, the internal cavity structure of thermal power generation equipment is treated as an equivalent acoustic Helmholtz resonator, and the natural resonant frequency $f_0$ of the Helmholtz resonator is calculated as $f_0 = \frac{c}{2\pi}\sqrt{\frac{A}{V L_{eq}}}$, where $c$ is the speed of sound in air, $A$ is the cross-sectional area of the resonator neck, $V$ is the volume of the resonator cavity, and $L_{eq}$ is the equivalent length of the neck opening. The resonant frequency $f_0$ can also be obtained through calibration using acoustic signature data under historical normal operating conditions. A resonant scattering regularization term is added to the augmented Lagrangian function of ADMM. The resonant scattering regularization term is expressed in terms of $S(f)$, the acoustic signal spectrum of the corresponding pixel in the sonar imaging map, and the Dirac function $\delta(\cdot)$. The regularization term penalizes pixel values of the sonar imaging map whose spectrum deviates from the resonant frequency $f_0$, enhancing the energy concentration, in the sonar imaging map, of faint abnormal noises caused by tiny cracks inside the device.
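As a numeric illustration of the resonant frequency in [0025], the sketch below evaluates the standard Helmholtz resonator relation, in which the frequency equals the speed of sound over 2π times the square root of the neck area divided by the product of cavity volume and equivalent neck length. The function name `helmholtz_frequency` and the cavity dimensions are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def helmholtz_frequency(c, neck_area, cavity_volume, neck_length_eq):
    """f0 = c / (2*pi) * sqrt(A / (V * L_eq)), all quantities in SI units."""
    return c / (2.0 * math.pi) * math.sqrt(neck_area / (cavity_volume * neck_length_eq))


# Illustrative cavity: 0.01 m^2 neck, 1 m^3 volume, 0.01 m equivalent neck length.
f0 = helmholtz_frequency(c=343.0, neck_area=0.01, cavity_volume=1.0, neck_length_eq=0.01)

# Doubling the cavity volume lowers the resonance by a factor of sqrt(2).
f0_large = helmholtz_frequency(c=343.0, neck_area=0.01, cavity_volume=2.0, neck_length_eq=0.01)
```

Because the speed of sound varies with gas temperature, a value computed this way would in practice be cross-checked against the calibration route that the paragraph also mentions.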

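[0026] Construct the augmented Lagrangian function that includes the resonant scattering regularization term: $L_\rho(x, z, u) = \|y - Ax\|_2^2 + \alpha\|(1 - M)\odot z\|_1 + \beta R_{\mathrm{res}}(x) + u^{\top}(\nabla x - z) + \frac{\rho}{2}\|\nabla x - z\|_2^2$; where $z$ is the auxiliary variable standing in for the gradient $\nabla x$, $u$ is the Lagrange multiplier, $\rho$ is the penalty parameter, $\beta$ is the weighting coefficient of the resonant scattering regularization term $R_{\mathrm{res}}$, and $y$, $A$, $x$, $M$ and $\alpha$ are the acoustic observation data, the acoustic observation matrix, the sonar imaging map to be solved, the edge prior mask and the edge prior weight of the joint inverse problem model.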

[0027] The solution is obtained iteratively with the alternating direction method of multipliers. Each iteration comprises: an $x$-minimization step, in which the auxiliary variable $z$ and the Lagrange multiplier $u$ are fixed and the minimization problem with respect to the sonar imaging map $x$ is solved to obtain the updated $x$; a $z$-minimization step, in which $x$ and $u$ are fixed and the minimization problem with respect to $z$ is solved with the soft-threshold operator to obtain the updated $z$; and a Lagrange multiplier update step, in which $u$ is updated.

[0028] Once the preset convergence condition is met, the iteration stops and the solved sonar imaging map is output. The sonar imaging map and the video frames are then spatially registered at the sub-pixel level, with a matching accuracy better than 0.5 pixels.
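The three alternating steps can be illustrated on a one-dimensional toy problem. The sketch below is an assumption-laden simplification rather than the method of the application: it omits the resonant scattering term, uses the scaled form of the multiplier, puts the common 1/2 factor on the fidelity term, and takes the observation matrix as the identity. The names `admm_masked_tv` and `soft_threshold` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-threshold operator used in the z-minimization step."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_masked_tv(y, A, edge_mask, alpha=1.0, rho=1.0, iters=200):
    """Minimize 0.5*||y - A x||^2 + alpha*||(1 - M) * D x||_1 with ADMM.

    D is the forward-difference operator; edge_mask (M) has one entry per
    difference and is 1 where an optical edge allows a jump.
    """
    n = A.shape[1]
    D = np.diff(np.eye(n), axis=0)            # (n-1, n) forward differences
    thresh = alpha * (1.0 - edge_mask) / rho  # no penalty on masked edges
    lhs = A.T @ A + rho * D.T @ D
    Aty = A.T @ y
    z = np.zeros(n - 1)
    u = np.zeros(n - 1)                       # scaled Lagrange multiplier
    for _ in range(iters):
        x = np.linalg.solve(lhs, Aty + rho * D.T @ (z - u))  # x-minimization
        Dx = D @ x
        z = soft_threshold(Dx + u, thresh)                   # z-minimization
        u = u + Dx - z                                       # multiplier update
    return x

# Toy data: a noisy step whose location is known from the "optical" edge mask.
rng = np.random.default_rng(0)
truth = np.concatenate([np.zeros(20), np.ones(20)])
y = truth + 0.1 * rng.standard_normal(40)
edge_mask = np.zeros(39)
edge_mask[19] = 1.0
x_hat = admm_masked_tv(y, np.eye(40), edge_mask, alpha=1.0)
```

Because the mask removes the penalty at the known edge, the step between samples 19 and 20 survives while the noise on both flat segments is suppressed.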

[0029] In a preferred embodiment, the video sequence is input into a two-stream network containing a spatial mask branch and a temporal memory branch. Vortex dynamics constraints from computational fluid dynamics are introduced to model the pixel motion field between adjacent frames as an optical flow field, and the two-dimensional vortex field distribution of the optical flow field is calculated. The vortex field is defined as the curl scalar field of the optical flow field on the image plane. The two-stream network is an encoder-decoder structure. The spatial mask branch and the temporal memory branch share the encoder's bottom-level feature extraction layer. The spatial mask branch is configured to extract the spatial semantic features of a single frame of video to generate an initial instance mask. The temporal memory branch is configured to extract the temporal motion features between frames and perform temporal optimization on the initial instance mask.

[0030] For the adjacent $i$-th and $(i+1)$-th video frames, the pixel motion field between the frames, i.e., the optical flow field $\mathbf{v} = (v_x, v_y)$, is calculated, where $v_x$ is the optical flow component in the $x$ direction and $v_y$ is the optical flow component in the $y$ direction. Introducing the vortex dynamics constraint from computational fluid dynamics, the vorticity field distribution $\omega$ of the optical flow field is defined as: $\omega = \frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y}$; the vorticity field distribution $\omega$ characterizes the rotational component of pixel motion, which corresponds to the rotation of rotating parts in thermal power generation equipment and to the pixel deformation caused by vibration.
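As an illustration of the vorticity definition, the sketch below evaluates the scalar curl of a dense optical flow field with finite differences. The function name and the synthetic rigid-rotation flow are assumptions made for the example; a real flow field would come from an optical flow estimator.

```python
import numpy as np

def vorticity(flow_x, flow_y, spacing=1.0):
    """Scalar curl of a 2-D flow field: d(v_y)/dx - d(v_x)/dy.

    Arrays are indexed [row, col] = [y, x], so axis 1 is x and axis 0 is y.
    """
    dvy_dx = np.gradient(flow_y, spacing, axis=1)
    dvx_dy = np.gradient(flow_x, spacing, axis=0)
    return dvy_dx - dvx_dy

# Synthetic rigid rotation v = Omega * (-y, x); its vorticity is 2 * Omega.
ys, xs = np.mgrid[-8:9, -8:9].astype(float)
rotation_rate = 0.3
omega = vorticity(-rotation_rate * ys, rotation_rate * xs)
```

For this rotation the computed field equals 0.6 at every pixel, which is the sense in which the vorticity isolates rotating parts from pure translation, whose curl is zero.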

[0031] In the temporal memory branch, an adaptive deformation convolution kernel is constructed based on the vortex field distribution. The adaptive deformation convolution kernel is applied to the feature map of the temporal memory branch to resample and aggregate the pixel features that undergo topological deformation between frames. This compensates for the pixel motion offset caused by the high-speed operation and vibration of the equipment, completes vortex compensation aggregation, suppresses motion blur of thermal power generation equipment under high-speed operation or vibration, and outputs an accurate temporal instance mask.

[0032] By applying the wave function interference principle from quantum mechanics, the curl vector in the vorticity field distribution is converted into the complex phase feature of the target pixel. Specifically, taking the vorticity field distribution $\omega$ as the curl vector, the complex wave function $\psi(p)$ of the target pixel $p$ is constructed: $\psi(p) = a(p)\,e^{j\varphi(p)}$; where $a(p)$ is the feature magnitude of pixel $p$, namely the feature response value output for that pixel by the spatial mask branch; $\varphi(p)$ is the complex phase feature of pixel $p$, $\varphi(p) = \kappa\,\omega(p)$; $\kappa$ is the preset phase conversion coefficient; and $j$ is the imaginary unit.

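[0033] Based on the coherence strength of the complex phase features, the update threshold for the hidden state in the temporal memory branch is dynamically allocated. The temporal memory branch is a gated recurrent unit (GRU) structure whose hidden state $h_t$ is updated as: $z_t = \sigma(W_z[h_{t-1}, x_t] + b_z)$, $r_t = \sigma(W_r[h_{t-1}, x_t] + b_r)$, $\tilde{h}_t = \tanh(W_h[r_t \odot h_{t-1}, x_t] + b_h)$, $h_t = (1 - z_t)\odot h_{t-1} + z_t \odot \tilde{h}_t$; where $z_t$ is the update gate, $r_t$ is the reset gate, $\sigma$ is the sigmoid activation function, $W_z$, $W_r$, $W_h$ are the weight matrices, $b_z$, $b_r$, $b_h$ are the bias terms, and $x_t$ is the input feature of the current frame.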

[0034] Based on the coherence strength of the complex phase features, the threshold $\tau$ of the update gate is dynamically adjusted. The coherence strength $C$ is calculated as: $C = \frac{|\langle \Psi_t, \Psi_{t-1}\rangle|}{\|\Psi_t\|\,\|\Psi_{t-1}\|}$; where $\langle\cdot,\cdot\rangle$ is the inner product operation, $\Psi_t$ is the complex wave function matrix of the current frame, and $\Psi_{t-1}$ is the complex wave function matrix of the previous frame. The coherence strength $C$ takes values in $[0, 1]$; the closer it is to 1, the stronger the phase coherence.
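The complex phase feature and the coherence strength can be sketched as follows. The normalized inner product used here is one reading of the description that keeps the value within [0, 1]; the function names, the value of the phase conversion coefficient, and the test patterns are hypothetical.

```python
import numpy as np

def complex_wave(amplitude, vorticity, kappa=1.0):
    """Per-pixel complex feature a(p) * exp(j * kappa * omega(p))."""
    return amplitude * np.exp(1j * kappa * vorticity)

def coherence(psi_t, psi_prev, eps=1e-12):
    """Normalized magnitude of the inner product of two complex feature maps."""
    inner = np.vdot(psi_prev, psi_t)  # vdot conjugates its first argument
    norms = np.linalg.norm(psi_t) * np.linalg.norm(psi_prev)
    return float(np.abs(inner) / (norms + eps))

amplitude = np.ones((4, 4))
psi_a = complex_wave(amplitude, np.zeros((4, 4)))
# Checkerboard of phases 0 and pi: half the pixels are in antiphase with psi_a.
checker = np.indices((4, 4)).sum(axis=0) % 2
psi_b = complex_wave(amplitude, np.pi * checker)

c_same = coherence(psi_a, psi_a)
c_mixed = coherence(psi_a, psi_b)
```

Identical frames give a coherence close to 1, while the checkerboard frame cancels in the inner product and gives a value close to 0. A uniform phase shift of the whole frame leaves this measure unchanged, which is a property of the normalization chosen here.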

[0035] The threshold $\tau$ is adjusted dynamically as a function of the coherence strength $C$ and of $\tau_0$, the preset initial threshold value. When the value of the update gate $z_t$ is greater than $\tau$, the hidden state is updated; otherwise, the hidden state of the previous frame is retained.
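A gated recurrent unit step with the threshold rule described above might look like the sketch below. It uses the standard GRU equations and applies the threshold per hidden unit; the parameter dictionary layout and the fixed value of `tau` are assumptions, since the rule that derives the threshold from the coherence strength is not reproduced here.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, params, tau=None):
    """One GRU update; if tau is given, units with z_t <= tau keep h_prev."""
    concat = np.concatenate([h_prev, x_t])
    z = sigmoid(params["W_z"] @ concat + params["b_z"])    # update gate
    r = sigmoid(params["W_r"] @ concat + params["b_r"])    # reset gate
    gated = np.concatenate([r * h_prev, x_t])
    cand = np.tanh(params["W_h"] @ gated + params["b_h"])  # candidate state
    h_new = (1.0 - z) * h_prev + z * cand
    if tau is not None:
        h_new = np.where(z > tau, h_new, h_prev)           # threshold rule
    return h_new, z

hidden, inputs = 2, 3
params = {
    "W_z": np.zeros((hidden, hidden + inputs)), "b_z": np.zeros(hidden),
    "W_r": np.zeros((hidden, hidden + inputs)), "b_r": np.zeros(hidden),
    "W_h": np.zeros((hidden, hidden + inputs)), "b_h": np.zeros(hidden),
}
h_prev = np.array([0.4, -0.2])
x_t = np.array([1.0, 2.0, 3.0])
h_free, z_free = gru_step(x_t, h_prev, params)
h_held, _ = gru_step(x_t, h_prev, params, tau=0.9)
```

With all-zero weights the update gate equals 0.5, so the unconstrained step halves the hidden state, whereas a threshold of 0.9 blocks the update and the previous state is kept.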

[0036] Redundant motion features caused by background thermal ripple interference are eliminated through destructive phase interference, so that only the constructive interference features caused by mechanical defects of the equipment are retained. Specifically, for the complex wave functions $\psi_t$ and $\psi_{t-1}$ of adjacent frames, the composite wave function after interference is calculated:

[0037] $\psi_{\Sigma} = \psi_t + \psi_{t-1}$; when the phase difference $\Delta\varphi$ places the two wave functions in antiphase, destructive interference occurs, and the feature is treated as a redundant motion feature generated by background thermal ripple interference and is discarded; when the phase difference $\Delta\varphi$ places the two wave functions in phase, constructive interference occurs, and the feature is retained as a valid motion feature caused by a mechanical defect of the equipment.
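The interference step can be sketched per pixel as below. The decision rule used here, keeping a pixel when the cosine of the phase difference is positive, is an assumed stand-in for the phase-difference conditions of the description, and the function name is hypothetical.

```python
import numpy as np

def constructive_mask(psi_t, psi_prev):
    """True where adjacent-frame features interfere constructively.

    Assumed rule: the phase difference lies within a quarter turn of zero,
    i.e. cos(delta_phi) > 0.
    """
    delta_phi = np.angle(psi_t * np.conj(psi_prev))
    return np.cos(delta_phi) > 0.0

psi_prev = np.array([1.0 + 0j, 1.0 + 0j])
psi_t = np.array([1.0 + 0j, -1.0 + 0j])   # second pixel is in antiphase

psi_sum = psi_t + psi_prev                # composite wave function
mask = constructive_mask(psi_t, psi_prev)
retained = np.where(mask, psi_sum, 0.0)   # discard destructive pixels
```

The first pixel doubles in amplitude and is kept; the second cancels and is discarded.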

[0038] Based on the retained constructive interference features, the adaptive deformation convolution kernel is constructed. Its offsets are determined jointly by the vorticity field distribution and the constructive interference features. The offset $(\Delta x_k, \Delta y_k)$ of each sampling point of the convolution kernel is computed from these two quantities, where $k$ is the index of the sampling point of the convolution kernel, $\eta$ is the offset scaling factor, and $(x_k, y_k)$ are the coordinates of the $k$-th sampling point of the convolution kernel.

[0039] An adaptive deformation convolution kernel is used to perform vortex compensation aggregation on pixel-level features of moving targets during inter-frame topological deformation. Specifically, the adaptive deformation convolution kernel is applied to the feature map of the temporal memory branch to resample and aggregate the pixel features that undergo inter-frame topological deformation, thereby compensating for pixel motion shift caused by high-speed operation and vibration of the device and suppressing motion blur.

[0040] Finally, the decoder of the dual-stream network fuses the spatial semantic features of the spatial mask branch with the temporal compensation features of the temporal memory branch to output an accurate temporal instance mask, which precisely matches the topological deformation of moving targets in the video sequence between frames.

[0041] As a preferred embodiment, the real-time operating parameters of the equipment are constructed as a topological manifold structure in a non-Euclidean space. A queuing network dynamics model from operations research is introduced, which maps the load change process of multiple associated devices in the real-time operating parameters to the arrival rate and service rate sequences of a Markov-modulated Poisson process. Specifically, each associated device in the thermal power plant is modeled as a node in a queuing network. The input load process of the device is modeled as the arrival process of a Markov-modulated Poisson process with arrival rate $\lambda$, and the output processing capacity of the device is modeled as the service process of a Markov-modulated Poisson process with service rate $\mu$. The state transition process of the Markov-modulated Poisson process is described by a Markov chain whose state space corresponds to the operating conditions of the equipment and whose state transition probability matrix is $P = [p_{ij}]$, where $p_{ij}$ is the probability of switching from operating condition $i$ to operating condition $j$.

[0042] Node connection weights are constructed from the state transition probabilities of the queuing network to create the topological manifold structure. The topological manifold structure is a Riemannian manifold $\mathcal{M}$ in which each node corresponds to the operating parameter features of one device. The edge weight $w_{ij}$ between nodes $i$ and $j$ is a Gaussian kernel of the operating condition parameter vectors, where $\theta_i$ is the operating condition parameter vector of node $i$, $\theta_j$ is the operating condition parameter vector of node $j$, and $\sigma$ is the Gaussian kernel width coefficient. The Riemannian metric tensor is defined as a Gaussian kernel metric based on these edge weights.

[0043] Applying the entropy increase principle from thermodynamics, the information entropy of the node features is calculated, and nodes whose information entropy is below a preset threshold are pruned from the topological manifold structure. The feature information entropy $H_i$ is calculated for each node $i = 1, \dots, N$, where $N$ is the total number of nodes in the topological manifold and $H_{\mathrm{th}}$ is the preset information entropy threshold. When the information entropy of node $i$ satisfies $H_i < H_{\mathrm{th}}$, the feature information of the node is judged insufficient and the node is pruned: the node and its associated edges are deleted, which reduces the dimension of the topological manifold structure and therefore the computational dimension of the operating condition normalization correction.
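The pruning rule can be sketched as follows. Since the entropy expression itself is not reproduced above, this example assumes the Shannon entropy of each node's normalized outgoing edge weights; the function names, the weight matrix, and the threshold are hypothetical.

```python
import numpy as np

def node_entropy(weights, eps=1e-12):
    """Shannon entropy of each row of edge weights after normalization."""
    p = weights / (weights.sum(axis=1, keepdims=True) + eps)
    return -(p * np.log(p + eps)).sum(axis=1)

def prune_nodes(weights, entropy_threshold):
    """Delete nodes whose entropy is below the threshold, with their edges."""
    keep = node_entropy(weights) >= entropy_threshold
    return weights[np.ix_(keep, keep)], keep

# Three nodes: nodes 0 and 1 spread their weight, node 2 has a single edge.
weights = np.array([
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
])
pruned, keep = prune_nodes(weights, entropy_threshold=0.5)
```

Nodes 0 and 1 have entropy ln 2 and are kept, while node 2 has entropy near zero and is removed together with its row and column.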

[0044] The geodesic distance metric from Riemannian geometry is introduced to calculate the shortest path deviation between the real-time operating parameters and the historical standard operating parameters on the manifold surface. Specifically, on the pruned Riemannian manifold $\mathcal{M}$, the geodesic distance between two points is the length of the shortest curve on the manifold surface connecting them. Let $p$ be the point on the manifold corresponding to the real-time operating parameters and $q$ the point corresponding to the historical standard operating condition parameters. The geodesic distance $d_g(p, q)$ is calculated as: $d_g(p, q) = \inf_{\gamma}\int_0^1 \sqrt{g_{ab}(\gamma(s))\,\dot\gamma^a(s)\,\dot\gamma^b(s)}\,ds$; where $\gamma$ is a parameterized curve on the manifold connecting $p$ and $q$, with $s \in [0, 1]$, $\gamma(0) = p$, $\gamma(1) = q$; $g_{ab}$ is the Riemannian metric tensor of the manifold at $\gamma(s)$; and $\dot\gamma^a$, $\dot\gamma^b$ are the tangent vector components of the curve $\gamma$. The geodesic distance $d_g(p, q)$ is the shortest path deviation between the real-time operating parameters and the historical standard operating parameters on the manifold surface.
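On sampled data the geodesic integral is usually approximated rather than solved in closed form. The sketch below uses one common approximation, the shortest path through a k-nearest-neighbor graph with Euclidean edge lengths (the idea behind Isomap). This is a substitute technique chosen for illustration, not the Gaussian kernel metric of the application, and the function name and sample points are hypothetical.

```python
import heapq
import numpy as np

def graph_geodesic(points, source, target, k=2):
    """Approximate geodesic distance via Dijkstra on a symmetric k-NN graph."""
    n = len(points)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    neighbors = np.argsort(dists, axis=1)[:, 1:k + 1]  # skip self at column 0
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in neighbors[i]:
            adjacency[i].add(int(j))
            adjacency[int(j)].add(i)
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, i = heapq.heappop(heap)
        if i == target:
            return float(d)
        if d > best.get(i, np.inf):
            continue                                   # stale heap entry
        for j in adjacency[i]:
            nd = d + dists[i, j]
            if nd < best.get(j, np.inf):
                best[j] = nd
                heapq.heappush(heap, (nd, j))
    return float("inf")

# Operating points sampled along a half circle of radius 1.
angles = np.linspace(0.0, np.pi, 50)
points = np.stack([np.cos(angles), np.sin(angles)], axis=1)
d_geo = graph_geodesic(points, 0, 49)
d_euclid = float(np.linalg.norm(points[0] - points[49]))
```

The path along the arc is close to pi, whereas the straight-line distance between the end points is 2, which shows how a Euclidean comparison can understate the deviation between two operating conditions that lie on a curved manifold.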

[0045] The operating condition migration matrix is calculated from the shortest path deviation. This migration matrix is a parallel transport operator on the Riemannian manifold. The joint feature matrix is projected along the tangent space of the manifold by parallel transport, which eliminates the feature distribution drift caused by varying operating conditions and outputs a pure joint feature matrix decoupled from the operating conditions. Specifically, the operating condition migration matrix $T$ is calculated from the geodesic distance $d_g(p, q)$. $T$ is a parallel transport operator on the Riemannian manifold, used to transport feature vectors in the tangent space at the point $p$ (the real-time operating parameters) in parallel to the tangent space at the point $q$ (the historical standard operating parameters).

[0046] The parallel transport process satisfies the parallel transport equation of the Levi-Civita connection: $\nabla_{\dot\gamma(s)} V(s) = 0$; where $\nabla$ is the covariant derivative operator and $V(s)$ is the vector field transported along the geodesic $\gamma$. The parallel transport operator $T$ is obtained by solving the parallel transport equation. The feature vector $f_t$ of the joint feature matrix at time step $t$, which lies in the tangent space at the point $p$, is projected by parallel transport into the tangent space at the point $q$ to obtain the corrected feature vector $\tilde f_t$: $\tilde f_t = T f_t$; after the parallel transport projection, the feature vectors of all time steps form the pure joint feature matrix decoupled from the operating conditions, which eliminates the feature distribution drift caused by variable operating conditions and achieves the operating condition normalization correction.

[0047] As a preferred embodiment, a structural causal model from the fusion features to the abnormal state is constructed, incorporating a counterfactual network framework from epidemiological etiology inference. Specifically, the structural causal model is a quadruple $\langle U, V, F, P(U)\rangle$, where $U$ is the set of exogenous variables, including unobserved interference variables of the equipment operating environment and measurement noise variables; $V$ is the set of endogenous variables, $V = \{X_1, \dots, X_D, Y\}$, in which $X_1, \dots, X_D$ are the feature variables corresponding to the corrected joint feature matrix, $D$ is the total number of feature dimensions, and $Y$ is the abnormal state variable of the equipment; $F$ is the set of functions, one per endogenous variable, each mapping the parent node variables of that endogenous variable to its value, $X_d = f_d(\mathrm{pa}(X_d), U_d)$ and $Y = f_Y(\mathrm{pa}(Y), U_Y)$, where $\mathrm{pa}(X_d)$ and $\mathrm{pa}(Y)$ are the parent node variables of $X_d$ and $Y$, and $U_d$, $U_Y$ are the corresponding exogenous variables; and $P(U)$ is the joint probability distribution of the exogenous variables $U$.

[0048] A counterfactual network framework for epidemiological etiology inference is introduced, comprising three layers: an association layer, an intervention layer, and a counterfactual layer. The association layer is configured to model the observed correlation between features and abnormal states; the intervention layer is configured to model the distribution of abnormal states after feature intervention using the do operator; and the counterfactual layer is configured to generate features and abnormal states under counterfactual scenarios that did not occur, and calculate the causal effect.

[0049] Using the observed fusion features as factual conditions, counterfactual fusion features without anomalies are generated through intervention variables. The mean causal effect of factual and counterfactual features in the potential outcome space is calculated. The dimensions in the fusion features are sorted in descending order based on the mean causal effect, and the top-ranked core feature factors are extracted.

[0050] The actually observed fusion features $(X = x_{\mathrm{obs}}, Y = y_{\mathrm{obs}})$ are taken as the factual condition, where $x_{\mathrm{obs}}$ is the actually observed feature matrix and $y_{\mathrm{obs}}$ is the corresponding observed abnormal state value. The intervention variable $do(Y = y_0)$, which forces the abnormal state variable to the normal state $y_0$, is used to generate the counterfactual fusion features $X_{\mathrm{cf}}$ without anomalies, i.e., the fusion features under the counterfactual scenario "the device did not malfunction". The counterfactual fusion features $X_{\mathrm{cf}}$ are solved with the three-step counterfactual inference method: abduction step: based on the observed data $(x_{\mathrm{obs}}, y_{\mathrm{obs}})$, infer the posterior distribution $P(U \mid x_{\mathrm{obs}}, y_{\mathrm{obs}})$ of the exogenous variables $U$; intervention step: apply the intervention $do(Y = y_0)$ within the structural causal model, modifying the structural equation of $Y$ so that its value is fixed at $y_0$; prediction step: based on the posterior distribution of the exogenous variables and the modified structural equations, predict the value of the counterfactual fusion features $X_{\mathrm{cf}}$.
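The three-step counterfactual inference can be made concrete on a toy model. The sketch below assumes a deterministic linear structural equation in which the abnormal state drives each feature, X_d = b_d * Y + U_d. That equation, the coefficient values, and the function name are hypothetical and serve only to show abduction, intervention, and prediction in sequence.

```python
import numpy as np

def counterfactual_features(x_obs, y_obs, coeffs, y_normal=0.0):
    """Abduction, intervention, prediction for X_d = b_d * Y + U_d."""
    u = x_obs - coeffs * y_obs     # abduction: recover the exogenous noise
    y_cf = y_normal                # intervention: do(Y = normal state)
    return coeffs * y_cf + u       # prediction: re-evaluate the equations

coeffs = np.array([0.1, 2.0, -0.8])   # assumed effect of Y on each feature
x_obs = np.array([1.0, 3.5, 0.2])     # observed fusion features
y_obs = 1.0                           # observed abnormal state

x_cf = counterfactual_features(x_obs, y_obs, coeffs)
effect = np.abs(x_obs - x_cf)         # per-dimension causal effect
ranking = np.argsort(-effect)         # descending order of effect
```

Here the second feature dimension changes most between the factual and counterfactual worlds, so it ranks first as a core feature factor. With noisy or nonlinear equations the abduction step yields a posterior distribution rather than a single value, and the effect becomes an average over that posterior.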

[0051] By combining the disease gene module tracing method from network medicine, the distribution differences between factual and counterfactual features in the potential outcome space are mapped into a multi-layered heterogeneous causal graph $G = (\mathcal{V}, \mathcal{E})$, where the node set $\mathcal{V}$ contains three layers of nodes: feature layer nodes, intermediate hidden layer nodes, and abnormal state layer nodes. The edge set $\mathcal{E}$ consists of directed edges, each representing a causal relationship between nodes and pointing from the cause node to the effect node.

[0052] In the multi-layered heterogeneous causal graph, a message passing mechanism from graph neural networks is introduced to propagate the activation signals of abnormal states along the causal edges. The cumulative activation energy received by each feature node is calculated, and this cumulative activation energy replaces the mean causal effect as the quantitative indicator for locating the core feature factors. See Figure 6.

[0053] Each iteration of the message passing mechanism includes a message aggregation step and a node update step. For the $l$-th iteration, the message aggregation formula of node $i$ is: $m_i^{(l)} = \sum_{j \in \mathrm{Pa}(i)} w_{ji}\,h_j^{(l-1)}$; where $m_i^{(l)}$ is the aggregated message received by node $i$ in the $l$-th iteration, $\mathrm{Pa}(i)$ is the set of parent nodes of node $i$, $w_{ji}$ is the weight of the edge from node $j$ to node $i$, and $h_j^{(l-1)}$ is the hidden state of node $j$ in the $(l-1)$-th iteration.

[0054] The hidden state update formula of node $i$ is: $h_i^{(l)} = \phi(W m_i^{(l)} + b)$; where $W$ is the weight matrix, $b$ is the bias term, and $\phi$ is the activation function.

[0055] In the initial iteration, the hidden state $h^{(0)}$ of the abnormal state layer nodes is initialized with the activation signal and the remaining nodes are assigned their initial hidden state. That is, the activation signal starts from the abnormal state layer nodes and propagates backward along the causal edges to the feature layer nodes.

[0056] Combining the model predictive control strategy from cybernetics, a joint cost function of a forward prediction loss and a backward causal consistency constraint is constructed in each iteration of message passing. Specifically, for the $l$-th iteration, the forward prediction loss $L_{\mathrm{pred}}^{(l)}$ is the mean square error between the predicted abnormal state value $\hat{y}^{(l)}$ and the actually observed value $y_{\mathrm{obs}}$, where $\hat{y}^{(l)}$ is the abnormal state value obtained by forward prediction from the hidden states of the nodes in the $l$-th iteration.

[0057] The backward causal consistency constraint $L_{\mathrm{causal}}^{(l)}$ constrains the transfer entropy of the causal edges, ensuring that message passing occurs only along true causal links. It is expressed in terms of $TE_{j \to i}$, the transfer entropy from node $j$ to node $i$, which quantifies the intensity of the causal information flow from $j$ to $i$.

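[0058] The joint cost function $J^{(l)}$ is constructed: $J^{(l)} = L_{\mathrm{pred}}^{(l)} + \lambda_1 L_{\mathrm{causal}}^{(l)} + \lambda_2 \|w\|_1$; where $\lambda_1$ is the weighting coefficient of the causal consistency constraint, $\lambda_2$ is the weighting coefficient of the regularization term, and $\|w\|_1$ is the L1 norm of the edge weights, configured to constrain the sparsity of the edge weights.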

[0059] The joint cost function $J^{(l)}$ is minimized with the interior point method, and the decay coefficients of the edge weights in the graph neural network are adjusted adaptively. Specifically, the minimization of the joint cost function is transformed into a constrained convex optimization problem. The interior point method introduces a logarithmic barrier function that converts the constrained problem into an unconstrained problem, which is solved iteratively to yield the optimal edge weights $w_{ji}^{*}$. The decay coefficient of each edge weight is calculated from the optimal edge weight $w_{ji}^{*}$ and $w_{\max}$, the maximum absolute value of all edge weights. The weights of edges whose decay coefficient is greater than a preset threshold are reset to 0, which blocks the redundant message passing paths caused by spurious correlations in the multi-layered heterogeneous causal graph and outputs a sparse causal graph that retains only direct causal links.

[0060] Based on the sparse causal graph, the full iterative message passing process is completed, and the cumulative activation energy $E_d$ received by each feature layer node $d$ is calculated: $E_d = \sum_{l=1}^{L} h_d^{(l)}$; where $L$ is the total number of iterations of message passing and $h_d^{(l)}$ is the hidden state of feature node $d$ in the $l$-th iteration.
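The accumulation of activation energy can be sketched on a four-node graph. Several simplifications are assumed here and are not taken from the application: hidden states are scalars, the update uses tanh with unit weight and zero bias, the abnormal state node is clamped to 1 as a constant source, and `msg_weights[i, j]` is the weight of the message sent from node j to node i. The function name and edge weights are hypothetical.

```python
import numpy as np

def cumulative_activation(msg_weights, source, feature_nodes, rounds=3):
    """Sum of feature-node hidden states over all message passing rounds."""
    n = msg_weights.shape[0]
    h = np.zeros(n)
    h[source] = 1.0                       # activation starts at the anomaly node
    energy = np.zeros(len(feature_nodes))
    for _ in range(rounds):
        messages = msg_weights @ h        # message aggregation
        h = np.tanh(messages)             # node update
        h[source] = 1.0                   # keep the source clamped (assumption)
        energy += h[feature_nodes]        # accumulate E_d
    return energy

# Nodes 0 and 1: features; node 2: hidden layer; node 3: abnormal state.
msg_weights = np.zeros((4, 4))
msg_weights[2, 3] = 1.0   # anomaly -> hidden
msg_weights[0, 2] = 0.9   # hidden -> feature 0 (strong causal link)
msg_weights[1, 2] = 0.1   # hidden -> feature 1 (weak link)

energy = cumulative_activation(msg_weights, source=3, feature_nodes=[0, 1])
top_feature = int(np.argmax(energy))
```

The signal needs one round to reach the hidden node and a second to reach the features, after which feature 0 accumulates far more energy than feature 1 and would be ranked first.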

[0061] The cumulative activation energy $E_d$ replaces the mean causal effect as the quantitative indicator of causal correlation. All feature dimensions are sorted in descending order of $E_d$, and the top $K$ feature dimensions are extracted as the core feature factors, where $K$ is a preset positive integer. Based on the core feature factors and the sparse causal graph, interpretable anomaly identification results and defect cause analysis results are output. At the same time, the traceability of the identification results is verified against the causal links of the sparse causal graph.

Claims

1. A method for intelligent recognition of inspection images in thermal power plants, characterized in that it comprises: acquiring the continuous inspection video sequence and the full-band voiceprint data synchronously collected by the inspection robot, and completing timestamp synchronization and alignment; converting the voiceprint data into a sonar imaging map that matches the video frames at the pixel level, where the pixel values of the map represent the sound pressure level and spectral characteristics of the corresponding spatial location; tracking moving targets in the video sequence by temporal video instance segmentation, extracting appearance changes and dynamic operation features, simultaneously extracting anomalous acoustic emission features from the sonar imaging map, and fusing them to generate a joint feature matrix; acquiring the real-time operating parameters of the equipment collected by the DCS system and performing operating condition normalization correction on the joint feature matrix; and determining the causal correlation between the fusion features and abnormal states by a causal-inference feature attribution quantification method, locating the core feature factors, and outputting interpretable anomaly identification results and defect cause analysis results.

2. The method according to claim 1, characterized in that, Moving target tracking in video sequences is performed using temporal video instance segmentation, including: The video sequence is input into a two-stream network containing a spatial mask branch and a temporal memory branch. The vortex dynamics constraint from computational fluid dynamics is introduced to model the pixel motion field between adjacent frames as an optical flow field. The two-dimensional vortex field distribution of the optical flow field is calculated. The vortex field is defined as the curl scalar field of the optical flow field on the image plane. In the temporal memory branch, an adaptive deformation convolution kernel is constructed based on the vortex field distribution. The adaptive deformation convolution kernel is applied to the feature map of the temporal memory branch to resample and aggregate the pixel features that undergo topological deformation between frames. This compensates for the pixel motion offset caused by the high-speed operation and vibration of the equipment, completes vortex compensation aggregation, suppresses motion blur of thermal power generation equipment under high-speed operation or vibration, and outputs an accurate temporal instance mask.

3. The method according to claim 1, characterized in that, Converting acoustic signature data into a sonar imaging atlas that matches video frames at the pixel level includes: The self-focusing processing in synthetic aperture sonar imaging is used to perform phase compensation and beamforming on the full-band acoustic pattern data to generate an initial sonar imaging map. By introducing aperture synthesis technology from radio astronomy and a joint constraint based on image edge priors, the geometric contours of device edges extracted from visible light video sequences in the same scene are used as spatial prior constraints. A joint inverse problem model containing acoustic data fidelity terms and optical edge prior regularization terms is constructed. The objective function of the model is defined as the L2 norm of the difference between the initial sonar map and the map to be solved plus the L1 norm of the device edge prior constraints. The objective function is solved by the alternating direction multiplier method.

4. The method according to claim 1, characterized in that, The joint characteristic matrix is subjected to operating condition normalization correction, including: The real-time operating parameters of the equipment are constructed as a topological manifold structure in non-Euclidean space. Specifically, the real-time operating parameter vector of the equipment at each moment is mapped to a point on the topological manifold, and the dimension of the manifold is consistent with the dimension of the operating parameter vector. The geodesic distance metric in Riemannian geometry is introduced, and the Riemannian metric tensor on the manifold is defined as a Gaussian kernel metric based on the state transition probability of the queuing network and the Euclidean distance of the parameter vector. The shortest path deviation between the real-time operating parameters and the historical standard operating parameters on the manifold surface is calculated. The operating condition migration matrix is calculated based on the shortest path deviation. The joint feature matrix is then projected in parallel along the tangent space of the manifold to output the pure joint feature matrix after operating condition decoupling.

5. The method according to claim 1, characterized in that determining the causal correlation between the fusion features and abnormal states by the causal-inference feature attribution quantification method comprises: constructing a structural causal model from the fusion features to the abnormal state and introducing a counterfactual network framework from epidemiological etiology inference, the counterfactual network framework including an association layer, an intervention layer, and a counterfactual layer, wherein the association layer corresponds to the observational distribution of the structural causal model and is used to model the observed correlation between features and abnormal states; the intervention layer corresponds to the interventional distribution of the structural causal model and is used to model the distribution of abnormal states after feature intervention through intervention variables; the counterfactual layer corresponds to the counterfactual distribution of the structural causal model and is used to generate features and abnormal states under counterfactual scenarios that have not occurred; the intervention variable is the $do$ operator acting on an endogenous variable of the structural causal model, specifically $do(X_d = x)$, indicating that the $d$-th dimension feature variable $X_d$ is forced to the value $x$, or $do(Y = y_0)$, indicating that the abnormal state variable $Y$ is forced to the normal state $y_0$; using the observed fusion features as factual conditions, generating counterfactual fusion features without anomalies through the intervention variables; calculating the mean causal effect of the factual and counterfactual features in the potential outcome space; and sorting the dimensions of the fusion features in descending order of the mean causal effect and extracting the top-ranked core feature factors.

6. The method according to claim 2, characterized in that, In the time-memory branch, an adaptive deformation convolution kernel is constructed based on the vorticity field distribution, including: Combining the wave function interference principle in quantum mechanics, the curl vector in the vorticity field distribution is converted into the complex phase feature of the target pixel; Based on the coherence intensity of the complex phase characteristics, the update threshold of the hidden state in the time memory branch is dynamically allocated. Redundant motion features caused by background thermal ripple interference are eliminated by phase destructive interference, and only constructive interference features caused by mechanical defects of the equipment are retained to generate an adaptive deformation convolution kernel.

7. The method according to claim 3, characterized in that solving the joint inverse problem model by the alternating direction method of multipliers comprises: introducing the local resonance mechanism from the field of acoustic metamaterials and adding a resonant scattering regularization term to the augmented Lagrangian function of the alternating direction method of multipliers, the resonant scattering regularization term being expressed in terms of $P(f)$, the acoustic signal spectrum of the corresponding pixel in the sonar imaging map, $\delta(\cdot)$, the Dirac function, and $f_0$, the resonant frequency of the equivalent acoustic Helmholtz resonator formed by the internal cavity of the thermal power generation equipment; wherein the internal cavity structure of the thermal power generation equipment is treated as equivalent to an acoustic Helmholtz resonator, and the resonant frequency $f_0$ is obtained either by calculation from the geometric parameters of the equipment cavity or by calibration against voiceprint data under historical normal operating conditions; and, during the iterative solution, the resonant scattering regularization term penalizes the pixel values of the sonar imaging map whose spectrum deviates from the resonant frequency, thereby enhancing the energy concentration, in the sonar imaging map, of the weak abnormal noise caused by tiny cracks inside the equipment.

8. The method according to claim 4, characterized in that, Constructing the real-time operating parameters of the device into a topological manifold structure in non-Euclidean space includes: By introducing the queuing network dynamics model from operations research, the load change process of multiple related devices in real-time operating parameters is mapped as a sequence of arrival rate and service rate of Markov-modulated Poisson process. The node connection weights of the topological manifold structure are constructed based on the state transition probability of the queuing network. The information entropy of the node characteristics is calculated by combining the entropy increase principle in thermodynamics. Nodes with information entropy lower than a preset threshold are pruned in the topological manifold structure.

9. The method according to claim 5, characterized in that calculating the mean causal effect of the factual and counterfactual features in the potential outcome space comprises: combining the disease gene module tracing method from network medicine, mapping the distribution differences of the factual and counterfactual features in the potential outcome space into a multi-layered heterogeneous causal graph; and, in the multi-layered heterogeneous causal graph, introducing a message passing mechanism from graph neural networks to propagate the activation signals of abnormal states along the causal edges, calculating the cumulative activation energy received by each feature node, and using this cumulative activation energy in place of the mean causal effect as the quantitative indicator for locating the core feature factors; wherein the cumulative activation energy is the sum of the hidden states of the feature node over all iterations of message passing, calculated as: $E_d = \sum_{l=1}^{L} h_d^{(l)}$, where $E_d$ is the cumulative activation energy of the $d$-th feature node, $L$ is the total number of iterations of message passing, and $h_d^{(l)}$ is the hidden state of the $d$-th dimension feature node in the $l$-th iteration.

10. The method according to claim 9, characterized in that, In multi-layer heterogeneous causal graphs, a message-passing mechanism from graph neural networks is introduced, including: Combining model prediction control strategies from cybernetics, a joint cost function of forward prediction loss and backward causal consistency constraint is constructed in each iteration step of message passing; The joint cost function is solved by interior point method, and the decay coefficient of edge weights in graph neural network is adaptively adjusted to block redundant message passing paths caused by spurious correlations in multi-layer heterogeneous causal graph. The output is a sparse causal graph that retains only direct causal links to complete traceability verification.