A reciprocating compressor AI automatic diagnosis method and system based on multi-modal data fusion
By preprocessing and feature quantizing the vibration, pressure, and temperature signals of the reciprocating compressor, and adaptively generating modal feature weights, the problem of missed fault diagnosis caused by fixed weights in multimodal fusion is solved, and higher accuracy fault diagnosis is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Application (China)
- Current Assignee / Owner
- BEIJING YINGHUADA POWER ELECTRONICS ENG TECH CO LTD
- Filing Date
- 2026-03-18
- Publication Date
- 2026-06-19
AI Technical Summary
Existing multimodal fusion technology has failed to fully explore the deep physical correlation between different modal data in the fault diagnosis of reciprocating compressors. As a result, the simple splicing method with fixed weights cannot be adaptively adjusted, making it difficult to effectively distinguish between mechanical faults and process faults, and leading to missed fault diagnosis.
By acquiring vibration, pressure, and temperature signals from a reciprocating compressor, features are extracted after preprocessing, the strength of evidence is quantified, and modal feature weights are adaptively generated. These weights are then adjusted and fused to generate a multimodal fusion feature vector, which is then input into a deep learning diagnostic model for fault identification.
It enables adaptive adjustment of modal feature weights based on the current state of the equipment, improving the accuracy of fault diagnosis, reducing the rate of missed fault diagnosis, and more accurately identifying the differences between mechanical faults and process faults.
Figure CN122241430A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of data processing technology, specifically to an AI-based automatic diagnostic method and system for reciprocating compressors based on multimodal data fusion.
Background Technology
[0002] Reciprocating compressors, as core power equipment in industrial production, are widely used in key fields such as petrochemicals, metallurgy, and natural gas transportation. Their internal structure is complex, containing numerous moving parts such as cylinders, crankshafts, and connecting rods. Operating under harsh conditions such as variable loads, high pressures, and high temperatures for extended periods, they are highly susceptible to mechanical wear or process abnormalities.
[0003] With the development of Industrial Internet of Things (IIoT) technology, current reciprocating compressor fault diagnosis technology is gradually evolving towards multimodal data collaborative diagnosis. Existing automated diagnostic solutions typically install multiple sensors, such as vibration, pressure, and temperature sensors, simultaneously on key parts of the compressor to collect multi-dimensional operational status data. In the data processing stage, feature extraction is generally performed on the raw data of each modality in the time or frequency domain. Subsequently, the extracted multimodal features are directly concatenated at the data layer to construct a comprehensive feature vector. Finally, this comprehensive feature vector is input into a pre-built basic neural network for classification and inference, thereby outputting the fault diagnosis results of the equipment.
[0004] However, existing multimodal fusion methods often remain at a superficial level, directly splicing features without fully exploring the deep physical relationships between different modal data. In the actual operation of reciprocating compressors, different types of faults exhibit vastly different sensitivities to each modal feature. Existing technologies employ a simple splicing method with fixed weights, failing to adaptively evaluate and adjust the weight ratios of each modal feature based on the real-time abnormal state of the equipment. This rigid fusion approach makes it difficult for the model to effectively distinguish between mechanical faults and process faults, ultimately leading to missed fault diagnoses.
Summary of the Invention
[0005] This application provides an AI-based automatic diagnosis method and system for reciprocating compressors based on multimodal data fusion. This method achieves a leap from fixed weights to dynamic adaptive weights, thereby improving the accuracy of fault diagnosis results.
[0006] Firstly, this application provides an AI-based automatic diagnostic method for reciprocating compressors based on multimodal data fusion. The method includes: synchronously acquiring, in real time, vibration signals, pressure signals, and temperature signals from key components of the reciprocating compressor; preprocessing the vibration signals, pressure signals, and temperature signals to obtain preprocessed vibration signals, preprocessed pressure signals, and preprocessed temperature signals; extracting vibration features, pressure features, and temperature features from the preprocessed vibration signals, preprocessed pressure signals, and preprocessed temperature signals, respectively; quantifying the differences between the vibration features, pressure features, and temperature features and the corresponding baseline features under normal operating conditions to obtain the vibration evidence strength, pressure evidence strength, and temperature evidence strength, which characterize the degree of deviation of each mode from the normal state; combining the vibration evidence strength, pressure evidence strength, and temperature evidence strength to obtain a real-time anomaly evidence vector, and mapping and comparing the real-time anomaly evidence vector with a preset fault mode template to obtain a set of modal feature weight values; adaptively weighting and adjusting the vibration features, pressure features, and temperature features according to the set of modal feature weight values, and concatenating and fusing the adjusted multimodal features to generate a multimodal fusion feature vector; inputting the multimodal fusion feature vector into a preset deep learning diagnostic model for processing to obtain the fault diagnosis result of the reciprocating compressor; and generating a fault diagnosis report based on the fault diagnosis result and pushing it to the user terminal in real time.
[0007] By adopting the above technical solution, vibration signals, pressure signals and temperature signals of key parts of reciprocating compressor are obtained. After targeted preprocessing of these multimodal signals, their respective characteristic parameters are extracted. By comparing the real-time extracted vibration features, pressure features and temperature features with the benchmark features under normal operating conditions, vibration evidence strength, pressure evidence strength and temperature evidence strength are generated. This method of quantifying evidence strength effectively solves the problem of lack of physical correlation in multimodal data fusion in existing technologies. By mapping and comparing the combined real-time anomaly evidence vector with the preset fault mode template, a set of modal feature weight values can be adaptively generated according to the current abnormal state characteristics of the equipment. These weight values fully consider the sensitivity differences of different fault types to each modal feature, so that in the subsequent weighting adjustment process, vibration features, pressure features, and temperature features are assigned importance weights that match the current fault state, realizing the leap from fixed weights to dynamic adaptive weights. The multimodal fusion feature vector generated by splicing and fusing after adaptive weighting adjustment not only contains the original information of each mode, but also contains the fault-oriented feature importance distribution, enabling the preset deep learning diagnostic model to more accurately identify the differences between mechanical faults and process faults, significantly improving the accuracy of fault diagnosis results and effectively reducing the fault missed diagnosis rate.
[0008] Optionally, the vibration signal, pressure signal, and temperature signal are preprocessed to obtain preprocessed vibration signal, preprocessed pressure signal, and preprocessed temperature signal. Specifically, this includes: denoising the vibration signal based on wavelet packet decomposition algorithm to obtain a denoised vibration signal; filtering the pressure signal using a moving average filtering algorithm to obtain a filtered pressure signal; removing outliers from the temperature signal based on preset criteria and performing data completion processing on the temperature signal after outlier removal using a linear interpolation algorithm to obtain a processed temperature signal; and standardizing and normalizing the denoised vibration signal, filtered pressure signal, and processed temperature signal respectively to obtain preprocessed vibration signal, preprocessed pressure signal, and preprocessed temperature signal.
[0009] By adopting the above technical solutions, vibration signals, as high-frequency dynamic signals, are easily affected by electromagnetic interference and environmental noise during the acquisition process. Noise reduction based on wavelet packet decomposition algorithm can effectively suppress high-frequency noise while preserving fault characteristic frequency components, ensuring a higher signal-to-noise ratio for the noise-reduced vibration signal. Pressure signals, as key parameters reflecting the thermodynamic cycle of the compressor, often contain a mixture of operating condition changes and measurement errors. Using a moving average filtering algorithm to filter the pressure signal can smooth instantaneous fluctuations while preserving the overall trend of pressure changes, making the filtered pressure signal more accurately reflect the equipment's process operating status. Temperature signals may contain outliers due to sensor failures or abnormal data transmission. Outlier removal based on preset criteria and data completion using linear interpolation algorithm effectively avoid interference from outlier data points in subsequent analysis, ensuring the continuity and reliability of the processed temperature signal. The vibration signal after noise reduction, the pressure signal after filtering, and the temperature signal after processing are standardized and normalized respectively, eliminating the differences in dimensions and numerical ranges of different modal signals. This makes the pre-processed vibration signal, pre-processed pressure signal, and pre-processed temperature signal comparable, ensuring the accuracy of the entire diagnostic system from the source.
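The pressure and temperature preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names, the window length, and the use of a fixed valid range as the "preset criteria" for outlier removal are all assumptions, and the wavelet packet denoising of the vibration signal is omitted to keep the sketch dependency-free.

```python
from statistics import mean, pstdev


def moving_average(signal, window=5):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def remove_outliers(signal, low, high):
    """Mark samples outside [low, high] as missing (None).

    The fixed valid range is an assumed stand-in for the 'preset criteria'.
    """
    return [x if low <= x <= high else None for x in signal]


def interpolate_gaps(signal):
    """Fill None entries by linear interpolation between valid neighbors.

    Gaps at either end are filled with the nearest valid sample.
    """
    valid = [i for i, x in enumerate(signal) if x is not None]
    if not valid:
        raise ValueError("no valid samples to interpolate from")
    out = list(signal)
    for i, x in enumerate(signal):
        if x is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            out[i] = signal[right]
        elif right is None:
            out[i] = signal[left]
        else:
            t = (i - left) / (right - left)
            out[i] = signal[left] + t * (signal[right] - signal[left])
    return out


def zscore(signal):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(signal), pstdev(signal)
    if sigma == 0:
        return [0.0] * len(signal)
    return [(x - mu) / sigma for x in signal]


# Hypothetical temperature trace with one sensor glitch (500.0).
temps = [60.0, 61.0, 500.0, 63.0, 64.0]
cleaned = interpolate_gaps(remove_outliers(temps, low=0.0, high=150.0))
standardized = zscore(cleaned)
```

Standardizing each modality separately, as in the last step, is what puts vibration, pressure, and temperature on a comparable scale before feature extraction.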
[0010] Optionally, vibration features, pressure features, and temperature features are extracted from the preprocessed vibration signal, preprocessed pressure signal, and preprocessed temperature signal, respectively. Specifically, this includes: performing time-domain and frequency-domain analysis on the preprocessed vibration signal to extract time-domain statistical features and frequency-domain energy features; combining the time-domain statistical features and frequency-domain energy features to form vibration features, where the time-domain statistical features include root mean square value, peak-to-peak value, skewness, and kurtosis; and the frequency-domain energy features include the main frequency amplitude extracted based on fast Fourier transform and the energy proportion of each frequency band. Thermodynamic and waveform feature analysis is performed on the preprocessed pressure signal to extract pressure features, including maximum exhaust pressure, minimum intake pressure, pressure ratio, pressure pulsation amplitude, and indicated work characteristics. Statistical and trend analysis is performed on the preprocessed temperature signal to extract temperature features, including average temperature value, temperature variance, and temperature change rate.
[0011] By adopting the above technical solutions, the preprocessed vibration signal is analyzed in both the time domain and the frequency domain. The time-domain statistical features capture the overall energy distribution and waveform characteristics of the vibration signal, while the main frequency amplitude and the energy proportion of each frequency band, extracted by fast Fourier transform, make it possible to distinguish the spectral characteristics of different mechanical faults such as bearing wear, so that the combined vibration features can comprehensively characterize the mechanical state. For the preprocessed pressure signal, the pressure features extracted through thermodynamic and waveform feature analysis cover the key parameters of the compressor working cycle. For the preprocessed temperature signal, the temperature features extracted through statistical and trend analysis capture the temperature level, its variability, and its rate of change. This multi-modal and multi-dimensional feature extraction strategy ensures that the vibration features, pressure features, and temperature features can comprehensively characterize the equipment operating state from multiple perspectives such as mechanics, thermodynamics, and thermal management.
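The vibration features named above can be sketched as follows. This is an illustration under stated assumptions: the function names, the number of frequency bands, and the equal-width band split are hypothetical, and a direct O(n²) discrete Fourier transform stands in for the fast Fourier transform so that the example needs only the standard library (both yield the same spectrum).

```python
import cmath
import math


def time_domain_features(x):
    """RMS, peak-to-peak, skewness and (non-excess) kurtosis of a signal."""
    n = len(x)
    mu = sum(x) / n
    std = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    return {
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "peak_to_peak": max(x) - min(x),
        "skewness": sum((v - mu) ** 3 for v in x) / n / std ** 3 if std else 0.0,
        "kurtosis": sum((v - mu) ** 4 for v in x) / n / std ** 4 if std else 0.0,
    }


def amplitude_spectrum(x):
    """One-sided amplitude spectrum via a direct DFT (stand-in for an FFT)."""
    n = len(x)
    amps = []
    for k in range(n // 2 + 1):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        # DC and Nyquist bins are not doubled in a one-sided spectrum.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        amps.append(scale * abs(s) / n)
    return amps


def frequency_domain_features(x, fs, n_bands=4):
    """Main frequency, its amplitude, and per-band energy proportions."""
    n = len(x)
    amps = amplitude_spectrum(x)
    k_peak = max(range(1, len(amps)), key=lambda k: amps[k])  # skip the DC bin
    energy = [a * a for a in amps[1:]]
    total = sum(energy) or 1.0
    size = math.ceil(len(energy) / n_bands)  # roughly equal-width bands
    ratios = [sum(energy[i:i + size]) / total
              for i in range(0, len(energy), size)]
    return {
        "main_freq_hz": k_peak * fs / n,
        "main_amp": amps[k_peak],
        "band_energy_ratio": ratios,
    }


# Synthetic test signal: an 8 Hz sine of amplitude 2, sampled at 64 Hz for 1 s.
fs = 64.0
x = [2.0 * math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
td = time_domain_features(x)
fd = frequency_domain_features(x, fs)
vibration_features = [td["rms"], td["peak_to_peak"], td["skewness"],
                      td["kurtosis"], fd["main_amp"], *fd["band_energy_ratio"]]
```

Concatenating the two dictionaries' values into one list, as in the last line, corresponds to combining the time-domain and frequency-domain features into a single vibration feature vector.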
[0012] Optionally, the vibration, pressure, and temperature features are differentiated and quantified with the corresponding baseline features under normal operating conditions to obtain the vibration evidence strength, pressure evidence strength, and temperature evidence strength of each mode's deviation from the normal state. Specifically, this includes: acquiring historical multimodal feature data of the reciprocating compressor under normal operating conditions, and calculating the mean vector and covariance matrix corresponding to the feature vectors of the vibration mode, pressure mode, and temperature mode respectively; using the mean vector and covariance matrix corresponding to each mode as the baseline features; using the vibration, pressure, and temperature features as real-time vibration feature vectors, real-time pressure feature vectors, and real-time temperature feature vectors respectively; and using the vibration mode, pressure mode, and temperature mode as the target modes respectively. The process involves subtracting the real-time feature vector of the target mode from the mean vector corresponding to the target mode to obtain the feature deviation vector of the target mode. The real-time feature vector is the vector corresponding to the target mode. The covariance matrix corresponding to the target mode is inverted to obtain the inverse covariance matrix of the target mode. The feature deviation vector is transposed to obtain the transposed feature deviation vector. The transposed feature deviation vector, the inverse covariance matrix, and the feature deviation vector are then multiplied sequentially to obtain the squared Mahalanobis distance of the target mode. The squared Mahalanobis distances of the vibration mode, the pressure mode, and the temperature mode are used as the strength of evidence for vibration, pressure, and temperature, respectively.
[0013] By adopting the above technical solution, historical data of multimodal characteristics of reciprocating compressors under normal operating conditions are obtained. The mean vectors and covariance matrices corresponding to the feature vectors of vibration mode, pressure mode, and temperature mode are calculated as benchmark features. After using the real-time acquired vibration features, pressure features, and temperature features as real-time feature vectors, the feature deviation vector is obtained by subtracting the real-time feature vector of the target mode from the corresponding mean vector. This can intuitively reflect the direction and magnitude of the deviation of the current state from the normal state. The inverse covariance matrix is obtained by inverting the covariance matrix. The transposed feature deviation vector, the inverse covariance matrix, and the feature deviation vector are then multiplied sequentially. The resulting squared Mahalanobis distance value can comprehensively consider the correlation and variance difference of each feature dimension. The squared Mahalanobis distance values of vibration mode, pressure mode, and temperature mode are used as the vibration evidence strength, pressure evidence strength, and temperature evidence strength, respectively. This realizes the transformation of complex multidimensional feature deviation degree into scalar indicators with clear physical meaning, ensuring that the diagnostic system can accurately capture subtle abnormal changes in the equipment and make correct judgments.
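In symbols, the evidence strength of a target mode is the squared Mahalanobis distance d² = (x − μ)ᵀ Σ⁻¹ (x − μ), where x is the real-time feature vector and μ and Σ are the baseline mean vector and covariance matrix. The sketch below follows the steps as described; the function names, the sample (n − 1) covariance estimator, and the Gauss-Jordan inversion routine are assumptions chosen to keep the example dependency-free.

```python
def mean_vector(rows):
    """Column-wise mean of historical feature vectors (one row per sample)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, mu):
    """Sample covariance matrix (divides by n - 1; an assumed estimator)."""
    n, d = len(rows), len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        dev = [r[i] - mu[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += dev[i] * dev[j]
    return [[c / (n - 1) for c in row] for row in cov]


def invert(matrix):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    d = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(d)]
           for i, row in enumerate(matrix)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(d):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]


def squared_mahalanobis(x, mu, cov_inv):
    """(x - mu)^T * cov_inv * (x - mu): the evidence strength of one mode."""
    dev = [a - b for a, b in zip(x, mu)]
    tmp = [sum(cov_inv[i][j] * dev[j] for j in range(len(dev)))
           for i in range(len(dev))]
    return sum(d * t for d, t in zip(dev, tmp))


# Hypothetical baseline for one mode with two features.
mu = [0.0, 0.0]
cov_inv = invert([[4.0, 0.0], [0.0, 1.0]])
evidence = squared_mahalanobis([2.0, 1.0], mu, cov_inv)  # 2**2/4 + 1**2/1
```

Because the distance is scaled by the inverse covariance, a deviation along a feature that normally varies widely counts for less than the same deviation along a tightly controlled feature. In practice the baseline covariance can be near-singular when features are strongly correlated; how the application handles that case is not stated.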
[0014] Optionally, the real-time anomaly evidence vector is mapped and compared with a preset fault mode template to obtain a set of modal feature weight values. Specifically, this includes: obtaining fault prototype vectors as the preset fault mode template, where the fault prototype vectors include at least a mechanical fault prototype vector, a fluid fault prototype vector, and a health state prototype vector, and each fault prototype vector consists of a vibration reference evidence value, a pressure reference evidence value, and a temperature reference evidence value; performing dot product operations between the real-time anomaly evidence vector and the mechanical fault prototype vector, fluid fault prototype vector, and health state prototype vector respectively to obtain a mechanical fault alignment score, a fluid fault alignment score, and a health state alignment score; based on a temperature hyperparameter, performing exponential calculation and normalization on the mechanical fault alignment score, fluid fault alignment score, and health state alignment score to obtain a mechanical fault confidence, a fluid fault confidence, and a health state confidence; obtaining a preset mechanical fault attention vector, fluid fault attention vector, and health state attention vector; multiplying the mechanical fault confidence by the mechanical fault attention vector, the fluid fault confidence by the fluid fault attention vector, and the health state confidence by the health state attention vector, and summing the three products to obtain a dynamic weight vector; and normalizing each component of the dynamic weight vector to obtain the set of modal feature weight values.
[0015] By adopting the above technical solution, fault prototype vectors, including mechanical fault prototype vectors, fluid fault prototype vectors, and health status prototype vectors, are obtained as preset fault mode templates. After performing dot product operations with each fault prototype vector to obtain mechanical fault alignment scores, fluid fault alignment scores, and health status alignment scores, the similarity between the current equipment state and various preset fault modes can be quantified. Based on the temperature hyperparameter, each alignment score is exponentially calculated and normalized to obtain mechanical fault confidence, fluid fault confidence, and health status confidence. A soft probability distribution characterizes the most likely fault category corresponding to the current state. Each fault confidence is multiplied and added with a pre-set mechanical fault attention vector, fluid fault attention vector, and health status attention vector to obtain a dynamic weight vector. The components of the dynamic weight vector are normalized to obtain a set of modal feature weight values, ensuring the rationality and interpretability of the weight allocation. This fundamentally solves the technical deficiency of existing technologies that use fixed weights and cannot adapt to different fault types, significantly improving the accuracy of the diagnostic system in identifying complex fault modes.
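The weight generation step can be sketched as follows. The text does not give the exact formulas, so this sketch makes several assumptions: "exponentially calculated and normalized" is read as a softmax over the alignment scores divided by the temperature hyperparameter, the final normalization is read as scaling the weights to sum to 1, and all prototype and attention values are invented for illustration.

```python
import math


def modal_weights(evidence, prototypes, attention, temperature=1.0):
    """Map an anomaly evidence vector to per-modality weights.

    evidence:   [vibration, pressure, temperature] evidence strengths
    prototypes: {fault class: reference evidence vector}
    attention:  {fault class: preset per-modality attention vector}
    """
    names = list(prototypes)
    # Alignment score: dot product of the evidence with each prototype.
    scores = [sum(e * p for e, p in zip(evidence, prototypes[k])) for k in names]
    # Softmax with temperature (assumed form); subtracting the max score
    # avoids overflow without changing the result.
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    z = sum(exps)
    confidence = {k: e / z for k, e in zip(names, exps)}
    # Confidence-weighted sum of the attention vectors.
    dynamic = [sum(confidence[k] * attention[k][i] for k in names)
               for i in range(len(evidence))]
    # Normalize so the modality weights sum to 1 (assumed normalization;
    # requires attention vectors with positive entries).
    total = sum(dynamic)
    return confidence, [w / total for w in dynamic]


# Illustrative values only; none of these numbers come from the application.
prototypes = {
    "mechanical": [1.0, 0.0, 0.0],
    "fluid": [0.0, 1.0, 0.5],
    "healthy": [0.0, 0.0, 0.0],
}
attention = {
    "mechanical": [0.70, 0.15, 0.15],
    "fluid": [0.15, 0.60, 0.25],
    "healthy": [1 / 3, 1 / 3, 1 / 3],
}
confidence, weights = modal_weights([9.0, 1.0, 0.5], prototypes, attention)
```

With strong vibration evidence, the mechanical class dominates the confidence distribution and the vibration weight ends up largest. A higher temperature value flattens the distribution, pulling the weights toward a blend of all three attention vectors.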
[0016] Optionally, the vibration feature, pressure feature, and temperature feature are adaptively weighted and adjusted according to a set of modal feature weight values, and the adjusted multimodal features are then concatenated and fused to generate a multimodal fusion feature vector. Specifically, this includes: parsing the vibration feature weight value corresponding to the vibration mode, the pressure feature weight value corresponding to the pressure mode, and the temperature feature weight value corresponding to the temperature mode from a set of modal feature weight values; performing scalar multiplication of the vibration feature weight value with each feature element in the vibration feature to obtain the weighted vibration feature; performing scalar multiplication of the pressure feature weight value with each feature element in the pressure feature to obtain the weighted pressure feature; performing scalar multiplication of the temperature feature weight value with each feature element in the temperature feature to obtain the weighted temperature feature; and concatenating the weighted vibration feature, weighted pressure feature, and weighted temperature feature according to a preset modal dimension order to generate a multimodal fusion feature vector.
[0017] By employing the above technical solution, vibration feature weight values, pressure feature weight values, and temperature feature weight values are extracted from a set of modal feature weight values. The process of performing scalar multiplication between each modal feature weight value and each feature element in the corresponding feature is essentially a selective amplification or suppression operation on the original features, resulting in weighted vibration features, weighted pressure features, and weighted temperature features. The weighted multimodal features are then concatenated according to a preset modal dimension order to generate a multimodal fusion feature vector. This not only preserves the original physical information and complementarity of each mode but also redistributes the representation weights of each mode in the feature space, ensuring that the information composition of the fusion feature vector is perfectly aligned with the feature sensitivity of the current fault type. Compared to the simple direct feature concatenation of existing technologies, the multimodal fusion feature vector has stronger fault discrimination ability and higher information utilization efficiency, providing higher quality feature input for subsequent deep learning diagnostic models.
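The weighting and concatenation step is simple enough to show directly. In this minimal sketch the function name, the dictionary layout, and the example values are assumptions; only the scalar multiplication per modality and the fixed concatenation order come from the description.

```python
def fuse_features(features, weights,
                  order=("vibration", "pressure", "temperature")):
    """Scale each modality's features by its weight, then concatenate.

    The concatenation follows a fixed, preset modality order so that every
    position in the fused vector always refers to the same feature.
    """
    fused = []
    for mode in order:
        fused.extend(weights[mode] * value for value in features[mode])
    return fused


# Illustrative feature values and weights (not taken from the application).
features = {
    "vibration": [1.0, 2.0],
    "pressure": [4.0],
    "temperature": [10.0, 20.0],
}
weights = {"vibration": 0.5, "pressure": 0.25, "temperature": 0.25}
fused = fuse_features(features, weights)
```

Because each modality is multiplied by a single scalar, the relative proportions of the features within a modality are preserved; only the balance between modalities changes.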
[0018] Optionally, the multimodal fusion feature vector is input into a preset deep learning diagnostic model for processing to obtain the fault diagnosis result of the reciprocating compressor. Specifically, this includes: inputting the multimodal fusion feature vector into the multimodal feature input layer of the preset deep learning diagnostic model, and performing standardization and dimensional alignment processing to obtain an aligned feature sequence; inputting the aligned feature sequence into a feature embedding and position encoding layer, mapping the aligned feature sequence to a high-dimensional latent space through a linear embedding layer, and superimposing sinusoidal position encoding information to obtain sequence features with position information; inputting the sequence features with position information into a Transformer encoder stack layer, computing the global dependencies between the modal features in the sequence features with position information through a multi-head self-attention module, and performing nonlinear transformation through residual connections, layer normalization, and a feedforward neural network module to output deep sequence features; inputting the deep sequence features into a global feature aggregation layer, and performing dimensionality reduction and aggregation through a global average pooling unit and a feature compression unit to generate a global fault feature vector; inputting the global fault feature vector into a fault diagnosis output layer, and performing classification mapping through a fully connected layer and a classification activation function to output the fault type, fault location, and severity of the reciprocating compressor; and combining the fault type, fault location, and severity as the fault diagnosis result.
[0019] By adopting the above technical solution, when the multimodal fusion feature vector is input into the multimodal feature input layer of the preset deep learning diagnostic model, the aligned feature sequence is obtained after standardization and dimensional alignment processing. The aligned feature sequence is then mapped to a high-dimensional latent space through feature embedding and positional encoding layers, and sinusoidal positional encoding information is superimposed to obtain sequence features with positional information. This not only projects the original features to a more expressive high-dimensional space but also preserves the order relationship and relative positional information of each modality feature through positional encoding. After inputting the sequence features with positional information into the stacked layers of the Transformer encoder, the multi-head self-attention module can automatically learn and capture the global dependencies between each modality feature, breaking through the limitations of the local receptive field of traditional neural networks. It can simultaneously focus on the deep coupling relationship between the impact signal in vibration features and the fluctuation anomaly in pressure features. The deep sequence features output after residual connection, layer normalization, and nonlinear transformation processing by the feedforward neural network module contain rich cross-modal interaction information. The global average pooling unit and feature compression unit of the global feature aggregation layer perform dimensionality reduction and aggregation processing on the deep sequence features to generate a global fault feature vector, effectively extracting the most discriminative fault representation. 
The final fault diagnosis output layer maps the global fault feature vector into a multi-dimensional diagnostic result of fault type, fault location, and severity through a fully connected layer and a classification activation function. Compared with the existing technology that only outputs a single fault category, it provides more comprehensive and practical diagnostic information. It can not only accurately distinguish between mechanical faults and process faults, but also locate specific fault locations and assess the severity of faults.
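Three of the building blocks named above (sinusoidal position encoding, self-attention, and global average pooling) can be illustrated with a heavily simplified forward pass. This is not the diagnostic model itself: the sketch uses a single attention head with identity query, key, and value projections, and it omits the learned embedding, residual connections, layer normalization, feedforward module, feature compression, and classification layers. The encoding formula is the commonly used sinusoidal scheme, which the application is assumed, not confirmed, to follow.

```python
import math


def sinusoidal_encoding(seq_len, d_model):
    """Even dimensions use sin, odd dimensions use cos, at
    geometrically spaced wavelengths (standard sinusoidal scheme)."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def softmax(row):
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [v / total for v in exps]


def self_attention(seq):
    """Single-head scaled dot-product attention with identity projections
    (a simplification: a real model learns Q, K, V projection matrices)."""
    d = len(seq[0])
    out = []
    for query in seq:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
                  for key in seq]
        w = softmax(scores)
        out.append([sum(w[j] * seq[j][c] for j in range(len(seq)))
                    for c in range(d)])
    return out


def global_average_pool(seq):
    """Average over sequence positions, giving one vector per sequence."""
    n = len(seq)
    return [sum(col) / n for col in zip(*seq)]


def forward(seq):
    """Add position encoding, attend across positions, then pool."""
    pe = sinusoidal_encoding(len(seq), len(seq[0]))
    with_pos = [[v + p for v, p in zip(tok, enc)] for tok, enc in zip(seq, pe)]
    return global_average_pool(self_attention(with_pos))


# Hypothetical aligned feature sequence: 3 tokens of dimension 4.
sequence = [[0.5, 1.0, 1.0, 2.5], [0.1, 0.2, 0.3, 0.4], [1.0, 0.0, 1.0, 0.0]]
global_vector = forward(sequence)
```

Every output position of `self_attention` is a weighted mix of all input positions, which is what allows a vibration-derived token to be combined with a pressure-derived token regardless of how far apart they sit in the sequence.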
[0020] In a second aspect, this application provides an AI-based automatic diagnostic system for reciprocating compressors based on multimodal data fusion. The system includes an acquisition unit, a processing unit, a fusion unit, and a diagnostic unit. The acquisition unit synchronously acquires, in real time, vibration, pressure, and temperature signals from key components of the reciprocating compressor. The processing unit preprocesses the vibration, pressure, and temperature signals to obtain preprocessed vibration, pressure, and temperature signals. The fusion unit extracts vibration, pressure, and temperature features from the preprocessed vibration, pressure, and temperature signals, respectively; quantifies the differences between the vibration, pressure, and temperature features and the corresponding baseline features under normal operating conditions to obtain the vibration evidence strength, pressure evidence strength, and temperature evidence strength, which characterize the degree of deviation of each mode from the normal state; combines the vibration, pressure, and temperature evidence strengths to obtain a real-time anomaly evidence vector, and maps and compares the real-time anomaly evidence vector with a preset fault mode template to obtain a set of modal feature weight values; and adaptively weights and adjusts the vibration, pressure, and temperature features according to the set of modal feature weight values, then concatenates and fuses the adjusted multimodal features to generate a multimodal fusion feature vector. The diagnostic unit inputs the multimodal fusion feature vector into a preset deep learning diagnostic model for processing to obtain the fault diagnosis result of the reciprocating compressor, generates a fault diagnosis report based on the fault diagnosis result, and pushes the report to the user terminal in real time.
[0021] In a third aspect, this application provides an electronic device including a processor, a memory, a user interface, and a network interface. The memory is used to store instructions, the user interface and the network interface are used to communicate with other devices, and the processor is used to execute the instructions stored in the memory, causing the electronic device to perform any of the methods described above in this application.
[0022] In a fourth aspect, this application provides a computer-readable storage medium storing instructions that, when executed, perform any of the methods described above in this application.
[0023] In summary, one or more technical solutions provided in the embodiments of this application have at least the following technical effects or advantages: 1. Vibration, pressure, and temperature signals are obtained from key components of the reciprocating compressor. After targeted preprocessing of these multimodal signals, their respective characteristic parameters are extracted. By comparing the real-time extracted vibration, pressure, and temperature features with the baseline features under normal operating conditions, the vibration evidence strength, pressure evidence strength, and temperature evidence strength are calculated. This method of quantifying evidence strength effectively solves the problem of lack of physical correlation in multimodal data fusion in existing technologies. By mapping and comparing the combined real-time anomaly evidence vector with the preset fault mode template, a set of modal feature weight values can be adaptively generated according to the current abnormal state characteristics of the equipment. These weight values fully consider the sensitivity differences of different fault types to each modal feature, so that in the subsequent weighting adjustment process, vibration features, pressure features, and temperature features are assigned importance weights that match the current fault state, realizing the leap from fixed weights to dynamic adaptive weights. The multimodal fusion feature vector generated by splicing and fusing after adaptive weighting adjustment not only contains the original information of each mode, but also contains the fault-oriented feature importance distribution, enabling the preset deep learning diagnostic model to more accurately identify the differences between mechanical faults and process faults, significantly improving the accuracy of fault diagnosis results and effectively reducing the fault missed diagnosis rate.
Attached Figure Description
[0024] Figure 1 is a schematic diagram of the first process of an AI-based automatic diagnosis method for reciprocating compressors based on multimodal data fusion, provided in an embodiment of this application. Figure 2 is a second flowchart illustrating an AI-based automatic diagnostic method for reciprocating compressors based on multimodal data fusion, provided in an embodiment of this application. Figure 3 is a schematic diagram of the structure of an electronic device disclosed in an embodiment of this application.
[0025] Explanation of reference numerals in the attached figures: 300, electronic device; 301, processor; 302, memory; 303, user interface; 304, network interface; 305, communication bus.
Detailed Implementation
[0026] To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments.
[0027] In the description of the embodiments of this application, the words "for example" or "for instance" are used to indicate examples, illustrations, or explanations. Any embodiment or design that is described as "for example" or "for instance" in the embodiments of this application should not be construed as being more preferred or advantageous than other embodiments or design options. Rather, the use of the words "for example" or "for instance" is intended to present the relevant concepts in a specific manner.
[0028] In the description of the embodiments of this application, the term "multiple" means two or more. For example, multiple systems means two or more systems, and multiple screen terminals means two or more screen terminals. Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. The terms "comprising," "including," "having," and variations thereof all mean "including but not limited to," unless otherwise specifically emphasized.
[0029] Therefore, how to replace the simple fixed-weight splicing of existing technologies, which leads to a mismatch with the current state of the equipment, is a problem that urgently needs to be solved. This application provides an AI-based automatic diagnosis method for reciprocating compressors based on multimodal data fusion, applied to a server. The server in this application can be a platform providing fault diagnosis services for the components of industrial reciprocating compressors. Figure 1 is a first schematic flowchart of the method provided in an embodiment of this application. Referring to Figure 1, the method includes the following steps S101-S108.
[0030] S101: Acquire vibration, pressure, and temperature signals from key components of the reciprocating compressor in real time.
[0031] In S101 above, because the reciprocating compressor is a complex electromechanical device, a fault often manifests as simultaneous abnormal changes in multiple physical quantities. For example, bearing wear increases the vibration signal amplitude while the bearing temperature rises abnormally, whereas an airflow pulsation fault is mainly reflected in the fluctuation characteristics of the pressure signal. A single modal signal is rarely sufficient to distinguish the fault type and locate the faulty component accurately. Therefore, the vibration signal, pressure signal, and temperature signal must be collected in real time, with the data of each modality strictly aligned on the time axis.
[0032] Based on the structural characteristics and fault mechanisms of reciprocating compressors, vibration sensors are installed in mechanically vibration-sensitive parts such as the cylinder head and crankcase. A high-frequency vibration sensor, model EN060, is selected, with a sampling frequency set to 20kHz to capture high-frequency impact characteristics such as bearing failures and piston impacts. The measurement range is set to 0-50g to adapt to the strong vibration environment of industrial sites, and the sensitivity reaches 50mV / g to ensure effective identification of weak fault signals. Pressure sensors are installed at the cylinder inlet and outlet pipes. A high-precision pressure sensor, model EN080, is selected, with a measurement range covering 0-15MPa to adapt to compressors with different exhaust pressure levels. The accuracy reaches 0.2 grade to ensure accurate capture of pressure fluctuation characteristics, and the sampling frequency is set to 1kHz to meet the dynamic monitoring requirements of the thermodynamic cycle process. Temperature sensors are installed on critical heat-generating components such as connecting rod bearings and cylinder walls. A PT100 platinum resistance temperature sensor is selected, which has a measurement range of -50℃ to 200℃, covering the entire operating temperature range of the compressor. Its accuracy reaches 0.1℃, which can identify the slight temperature rise caused by early bearing wear. The sampling frequency is set to 1Hz to match the slow change characteristics of the temperature signal.
[0033] After confirming that all sensors have been installed in the key parts of the reciprocating compressor, all sensors are connected to the 5G industrial gateway via wired transmission. A unified clock source is used to trigger synchronous sampling of each sensor, and the sampling period is strictly controlled within 8ms to ensure accurate alignment of vibration, pressure and temperature signals in the time dimension. The collected data is transmitted to the server wirelessly in real time through the 5G industrial gateway.
[0034] S102: Preprocess the vibration signal, pressure signal and temperature signal to obtain the preprocessed vibration signal, preprocessed pressure signal and preprocessed temperature signal.
[0035] In step S102 above, the purpose of preprocessing vibration, pressure, and temperature signals is to eliminate various noise interferences and data anomalies introduced by the complex industrial environment, extract pure signal features that truly reflect the equipment's operating status, and provide a high-quality data foundation for subsequent multimodal feature fusion and AI diagnosis. Because the reciprocating compressor's operating environment is subject to various interference sources such as strong electromagnetic interference, mechanical shock, and ambient temperature fluctuations, the raw vibration, pressure, and temperature signals inevitably contain noise components, baseline drift, and abrupt changes that affect diagnostic accuracy. Directly using unprocessed raw signals for feature extraction will lead to distorted feature parameters, resulting in misdiagnosis or missed diagnosis. Therefore, it is necessary to employ targeted preprocessing algorithms to purify and standardize the data based on the physical characteristics and noise features of different modal signals.
[0036] In addition, the vibration signal, pressure signal, and temperature signal are preprocessed to obtain preprocessed vibration signal, preprocessed pressure signal, and preprocessed temperature signal. Specifically, this includes: denoising the vibration signal based on wavelet packet decomposition algorithm to obtain a denoised vibration signal; filtering the pressure signal using a moving average filtering algorithm to obtain a filtered pressure signal; removing outliers from the temperature signal based on preset criteria and performing data completion processing on the temperature signal after outlier removal using a linear interpolation algorithm to obtain a processed temperature signal; and standardizing and normalizing the denoised vibration signal, filtered pressure signal, and processed temperature signal respectively to obtain the preprocessed vibration signal, preprocessed pressure signal, and preprocessed temperature signal.
[0037] Specifically, noise reduction of vibration signals based on wavelet packet decomposition is a key step in achieving vibration signal purification. The core principle lies in wavelet packet decomposition's ability to decompose non-stationary vibration signals into different frequency bands, effectively separating useful fault characteristic signals from high-frequency noise components in the frequency domain. Specifically, the db5 wavelet basis is chosen as the decomposition basis function. This wavelet basis possesses good time-frequency localization and orthogonality, effectively matching the impact characteristics of the reciprocating compressor vibration signal. The vibration signal undergoes six levels of wavelet packet decomposition, generating 32 wavelet packet nodes, each corresponding to a specific frequency band. By calculating the energy distribution of each node, the effective frequency band containing fault characteristics and the interference frequency band mainly containing noise can be identified. A soft thresholding method is used to shrink the coefficients of high-frequency noise nodes. The threshold setting adopts an adaptive threshold criterion, dynamically determining the threshold size based on the energy statistical characteristics of each frequency band to ensure that fault impact characteristics are preserved to the greatest extent while suppressing noise. After thresholding, the wavelet packet coefficients of each frequency band are reconstructed to obtain the noise-reduced vibration signal. Taking the bearing wear failure of the 2D12 compressor as an example, the original vibration signal is affected by environmental vibration and motor operation interference, and the signal-to-noise ratio is only 30dB. After wavelet packet noise reduction processing, the high-frequency random noise is effectively suppressed, and the bearing failure characteristic frequency of 100Hz and its harmonic components are clearly displayed.
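The soft-thresholding step described above can be sketched as follows. This is a minimal illustration rather than the method of the application: it assumes the wavelet packet coefficients of one node are already available as a list, and the threshold value, the coefficient values, and the function names are hypothetical. The adaptive rule that derives the threshold from the energy statistics of each band is not specified in this section and is therefore not reproduced.

```python
import math


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`.

    Coefficients whose magnitude is below the threshold become zero;
    larger ones keep their sign and lose `threshold` in magnitude.
    """
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def node_energy(coeffs):
    """Energy of one wavelet packet node: the sum of squared coefficients."""
    return sum(c * c for c in coeffs)


# Hypothetical coefficients of one noise-dominated node (illustrative values):
# small entries stand for noise, the two large ones for fault impacts.
noisy_node = [0.05, -0.02, 0.90, -0.04, 0.03, -1.20]

# Assumed fixed threshold; the source derives it adaptively per band.
denoised_node = soft_threshold(noisy_node, threshold=0.10)
```

Soft thresholding keeps the large impact-like coefficients, slightly reduced, while zeroing the small ones, which is why it suits signals where fault impacts must survive denoising.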
[0038] Using a moving average filtering algorithm to filter pressure signals is an effective method to eliminate high-frequency pulsation interference and measurement noise. During the acquisition process, pressure signals are affected by factors such as airflow pulsation, pipeline vibration, and sensor measurement errors, which superimpose small-amplitude but high-frequency pulsation components onto the true pressure waveform. These pulsation components interfere with the accurate extraction of characteristic parameters such as pressure peak value and pressure fluctuation amplitude. The moving average filtering algorithm sets a fixed-length sliding window and calculates the arithmetic mean of the pressure sample values within the window point by point on the time series as the filtered output at that moment, thereby achieving smooth suppression of high-frequency pulsation components. Specifically, based on the compressor's cycle characteristics and the pressure signal sampling frequency, the sliding window size is set to 10 sampling points, corresponding to a time length of 10ms. This window length can effectively smooth high-frequency measurement noise at a sampling frequency of 1kHz, while preserving key waveform characteristics of the pressure signal such as rising edge, peak value, and falling edge during the working cycle. The filtering frequency is set to 0.1Hz to ensure correction for slow drift of the pressure baseline. By using a point-by-point sliding window and calculating the mean, the filtered pressure signal is output, while the pressure baseline is corrected, with a maximum correction of 0.2 MPa. Taking the pressure fluctuation fault of a 6M50 compressor as an example, the original pressure signal has a high-frequency pulsation of ±0.8 MPa at the pressure peak of the exhaust stroke, resulting in a repeatability measurement error of 6% for the pressure peak characteristic. 
After the moving average filtering process, the high-frequency pulsation is effectively smoothed, and the pressure peak stabilizes from a fluctuation range of 8-12 MPa to 10.5 ± 0.3 MPa. The measurement accuracy of the pressure fluctuation amplitude characteristic parameter is improved to within 2%, ensuring the reliability of pressure feature extraction.
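The sliding-window averaging described above can be sketched as follows. The sketch assumes a trailing window (the text does not say whether the window is trailing or centered) and averages over the samples available so far at the start of the sequence. The function name and the test signal are illustrative, and the 0.1 Hz baseline correction is not covered.

```python
def moving_average(samples, window=10):
    """Trailing moving average over `window` samples.

    For the first `window - 1` outputs, fewer samples are available,
    so the mean is taken over the samples seen so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    filtered = []
    running_sum = 0.0
    for i, value in enumerate(samples):
        running_sum += value
        if i >= window:
            # Drop the sample that just left the window.
            running_sum -= samples[i - window]
        filtered.append(running_sum / min(i + 1, window))
    return filtered


SAMPLE_RATE_HZ = 1000  # pressure sampling frequency stated in the text
WINDOW = 10            # window size stated in the text
window_ms = WINDOW / SAMPLE_RATE_HZ * 1000  # 10 samples at 1 kHz span 10 ms

# Illustrative signal: 10.5 MPa with an alternating +/-0.8 MPa pulsation.
raw = [10.5 + (0.8 if i % 2 == 0 else -0.8) for i in range(40)]
smoothed = moving_average(raw, WINDOW)
```

Once the window is full, each output averages five positive and five negative pulsation samples, so the alternating component cancels and the output settles at the underlying 10.5 MPa level.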
[0039] Outlier removal based on preset criteria and data completion using linear interpolation algorithms are essential steps to ensure the integrity and consistency of temperature data. During long-term continuous acquisition, temperature signals may exhibit abrupt changes or missing values due to factors such as poor sensor contact, transient electromagnetic interference, and data transmission packet loss. If these anomalous data are used directly in feature calculations without processing, they will severely distort the true values of characteristic parameters such as the temperature mean, temperature fluctuation range, and temperature gradient. Therefore, the 3σ criterion is used as a preset criterion for outlier detection. First, the mean μ and standard deviation σ of the temperature signal sequence are calculated. Then, each temperature sample value is checked to see if it falls within the interval [μ-3σ, μ+3σ]. According to statistical principles, normally distributed data has a 99.7% probability of falling within this interval; data points outside this interval can be identified as outliers. Identified outliers are marked and removed from the original sequence. Statistical analysis shows that the proportion of outliers is controlled within 0.3%. To fill the data gaps created after outlier removal, a linear interpolation algorithm is used. This involves estimating the temperature at the missing location based on adjacent normal temperature samples before and after the gap using a linear relationship. The formula is T(t) = T(t1) + (T(t2) - T(t1)) × (t - t1) / (t2 - t1), where t1 and t2 are the adjacent sampling times before and after the gap, T(t1) and T(t2) are the corresponding temperature values, and t is the time at which the data needs to be filled. Because temperature signals exhibit slow-changing characteristics with small temperature variations between adjacent times, linear interpolation can effectively restore the true trend of the missing data. 
For continuous data loss caused by data transmission packet loss, the proportion of missing data is controlled to within 0.5%, and the processed temperature signal is obtained after interpolation completion. Taking the temperature monitoring of the connecting rod bearing of the 2D12 compressor as an example, during an 8-second acquisition cycle the original temperature data contained 3 abnormal abrupt changes (the instantaneous temperature jumped from 55℃ to 120℃) and 2 missing data points. After the outliers were removed by the 3σ criterion and the gaps were filled by linear interpolation, the temperature curve regained its smooth transition characteristics. The calculated average temperature was corrected from the original 62.3℃ to 54.8℃, and the temperature fluctuation range was corrected from 35℃ to 8℃, which is consistent with the actual physical state and prevents abnormal data from misleading the fault diagnosis.
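The 3σ screening and linear interpolation described above can be sketched as follows. The sketch assumes missing samples are represented as `None`, uses the population standard deviation, and leaves gaps at the very start or end of the sequence unfilled; these choices, the function names, and the sample data are assumptions made for illustration.

```python
import statistics


def remove_outliers_3sigma(values):
    """Replace samples outside [mu - 3*sigma, mu + 3*sigma] with None."""
    present = [v for v in values if v is not None]
    mu = statistics.fmean(present)
    sigma = statistics.pstdev(present)
    return [
        v if v is not None and abs(v - mu) <= 3 * sigma else None
        for v in values
    ]


def fill_linear(times, values):
    """Fill None gaps with T(t) = T(t1) + (T(t2) - T(t1)) * (t - t1) / (t2 - t1)."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for i, v in enumerate(values):
        if v is not None:
            continue
        before = [k for k in known if k < i]
        after = [k for k in known if k > i]
        if not before or not after:
            continue  # no neighbour on one side: leave the gap unfilled
        i1, i2 = before[-1], after[0]
        t1, t2 = times[i1], times[i2]
        filled[i] = values[i1] + (values[i2] - values[i1]) * (times[i] - t1) / (t2 - t1)
    return filled


# Illustrative 1 Hz sequence: a slow ramp with one spike and one lost sample.
times = list(range(20))
temps = [55.0 + 0.1 * t for t in times]
temps[7] = 120.0   # abrupt spike
temps[12] = None   # lost packet

cleaned = remove_outliers_3sigma(temps)
restored = fill_linear(times, cleaned)
```

Note that the 3σ test needs enough samples to work: in a very short sequence a single spike inflates σ so much that it can fall inside the interval and escape detection.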
[0040] Standardizing and normalizing the denoised vibration signal, filtered pressure signal, and processed temperature signal eliminates the dimensional differences between modalities and is a precondition for effective multimodal feature fusion. The vibration amplitude is measured in g (acceleration), the pressure in MPa, and the temperature in °C, so the numerical ranges and statistical distributions of the three physical quantities differ significantly. Directly fusing feature parameters with different dimensions lets features with larger values dominate the fusion result and masks smaller but equally important features, which unbalances the multimodal representation. The Z-score standardization and normalization method maps the data of each modality onto a common zero-mean, unit-variance scale, eliminating the influence of dimensions. Specifically, the mean μ and standard deviation σ of the denoised vibration signal, filtered pressure signal, and processed temperature signal are calculated separately on the training sample set, and the data value x at each sampling point is then standardized using the formula z = (x - μ) / σ. The transformed data has a mean of 0 and a variance of 1, so the data of different modalities lie on the same numerical scale. In practical applications, normalization parameter libraries are established for both the 2D12 and 6M50 models to store the statistical parameters of each modal signal under normal operating conditions. Signals acquired in real time are processed with the normalization parameters of the corresponding model, which keeps data from different models comparable. After standardization and normalization, the preprocessed vibration, pressure, and temperature signals are obtained. The calculation time of the entire preprocessing process is controlled to within 0.3 seconds per group.
[0041] Taking the bearing wear failure scenario as an example, the vibration peak value after noise reduction is 1.1g, the pressure peak value after filtering is 10.2MPa, and the average temperature after processing is 78℃. After Z-score standardization, these become z-vibration = 2.15, z-pressure = 1.87, and z-temperature = 2.43, respectively, so the feature values of the three modalities fall in a similar numerical range.
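The Z-score step and the per-model parameter library can be sketched as follows. The training samples, dictionary layout, and function names are hypothetical. The baseline statistics behind the z-values quoted above are not given in the text, so the sketch does not attempt to reproduce those numbers.

```python
import statistics


def fit_zscore(training_samples):
    """Return (mu, sigma) estimated on normal-condition training samples."""
    mu = statistics.fmean(training_samples)
    sigma = statistics.pstdev(training_samples)
    if sigma == 0:
        raise ValueError("standard deviation is zero; cannot standardize")
    return mu, sigma


def zscore(x, mu, sigma):
    """Standardize one value with z = (x - mu) / sigma."""
    return (x - mu) / sigma


# Hypothetical normal-condition samples for one model and one modality.
vibration_training_g = [0.50, 0.52, 0.54, 0.51, 0.53]

# Hypothetical parameter library keyed by compressor model and modality.
parameter_library = {
    "2D12": {"vibration": fit_zscore(vibration_training_g)},
}

mu, sigma = parameter_library["2D12"]["vibration"]
standardized = [zscore(x, mu, sigma) for x in vibration_training_g]
```

Storing (μ, σ) per model and per modality means a real-time sample is always scaled against the healthy baseline of its own machine type, so a positive z-value reads directly as "above normal" in units of normal-condition spread.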
[0042] S103: Extract vibration features, pressure features, and temperature features from the preprocessed vibration signal, preprocessed pressure signal, and preprocessed temperature signal, respectively.
[0043] In S103 above, extracting vibration features, pressure features, and temperature features from the preprocessed vibration signal, preprocessed pressure signal, and preprocessed temperature signal respectively is a key step in converting multimodal time-domain waveform data into feature vectors that can be used for AI diagnosis. The core purpose is to extract key parameters that can characterize the operating status and fault characteristics of the equipment from a massive amount of original sampling points, providing high-dimensional and highly discriminative feature inputs for subsequent multimodal feature fusion.
[0044] Furthermore, vibration features, pressure features, and temperature features are extracted from the preprocessed vibration signal, preprocessed pressure signal, and preprocessed temperature signal, respectively. Specifically, this includes: performing time-domain and frequency-domain analysis on the preprocessed vibration signal to extract time-domain statistical features and frequency-domain energy features; combining the time-domain statistical features and frequency-domain energy features to form vibration features, where the time-domain statistical features include root mean square value, peak-to-peak value, skewness, and kurtosis; and the frequency-domain energy features include the main frequency amplitude extracted based on fast Fourier transform and the energy proportion of each frequency band. Thermodynamic and waveform feature analysis is performed on the preprocessed pressure signal to extract pressure features, including maximum exhaust pressure, minimum intake pressure, pressure ratio, pressure pulsation amplitude, and indicated work characteristics. Statistical and trend analysis is performed on the preprocessed temperature signal to extract temperature features, including average temperature value, temperature variance, and temperature change rate.
[0045] Specifically, fault information of reciprocating compressors is dispersed in vibration signals, pressure signals, and temperature signals in different physical forms. Mechanical faults, such as bearing wear, are significantly reflected in the time-domain impact characteristics and frequency-domain harmonic energy of vibration signals. Process faults, such as airflow pulsation, are mainly reflected in the waveform characteristics and thermodynamic parameter changes of pressure signals. The statistical characteristics and trends of temperature signals can reflect the evolution of the thermal state of the equipment. Therefore, based on the physical meaning and fault sensitivity characteristics of each modal signal, differentiated feature extraction methods are designed to ensure that the extracted features can not only fully express the fault information of the current modality, but also provide complementary information for cross-modal correlation analysis.
[0046] Both time-domain and frequency-domain analyses are performed on the preprocessed vibration signal because the vibration signal is essentially a dynamic response to the mechanical motion state of the equipment. Time-domain features directly reflect transient events such as impacts and periodic fluctuations, while frequency-domain features reveal the changes in spectral structure and energy distribution caused by faults. Only the combination of the two characterizes the fault features of the vibration signal comprehensively. The time-domain analysis first calculates the root mean square (RMS) value of the preprocessed vibration signal sequence x(n) using the formula RMS = √((1/N) × Σ x(n)²), where N is the number of sampling points and the sum runs over all N points. This parameter reflects the overall energy level of the vibration signal. Under normal operating conditions, the RMS value remains stable at a low level. When mechanical faults such as bearing wear or piston ring damage occur, friction and impact increase the vibration energy, causing the RMS value to rise significantly. The peak-to-peak value is calculated using the formula PV = max(x(n)) - min(x(n)). This parameter characterizes the maximum fluctuation amplitude of the vibration signal and is highly sensitive to impact-related faults, such as the instantaneous impacts caused by crankshaft cracks. Skewness is calculated using the formula Skewness = E[(x-μ)³] / σ³, where μ is the mean and σ is the standard deviation. Skewness reflects the symmetry of the probability distribution of the vibration signal. Normal vibration signals are close to a symmetrical distribution, with skewness close to zero. Under fault conditions, the impact component skews the distribution, increasing the absolute value of the skewness. Kurtosis is calculated using the formula Kurtosis = E[(x-μ)⁴] / σ⁴. Kurtosis reflects the peakedness of the probability distribution of the vibration signal. Under normal conditions, kurtosis is close to 3. 
Periodic impacts caused by early bearing failures can significantly increase kurtosis to above 6, making it a sensitive indicator for early fault diagnosis.
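The four time-domain statistics can be sketched as follows, using the population (1/N) moments as the expectation E[·]. The function name and the test signals are illustrative assumptions. A pure sine wave is used as the reference because its statistics are known in closed form (RMS = A/√2, kurtosis = 1.5), and a second signal adds periodic spikes to show how impacts raise the kurtosis.

```python
import math


def time_domain_features(x):
    """Return RMS, peak-to-peak, skewness and kurtosis of a sample sequence."""
    n = len(x)
    mu = sum(x) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    return {
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "peak_to_peak": max(x) - min(x),
        "skewness": sum((v - mu) ** 3 for v in x) / n / sigma ** 3,
        "kurtosis": sum((v - mu) ** 4 for v in x) / n / sigma ** 4,
    }


N = 1000
# Five full periods of a unit-amplitude sine: smooth, impact-free reference.
sine = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]

# Same sine with a spike every 100 samples, imitating periodic impacts.
impulsive = [v + (5.0 if n % 100 == 0 else 0.0) for n, v in enumerate(sine)]

smooth_stats = time_domain_features(sine)
impact_stats = time_domain_features(impulsive)
```

The reference value of 3 quoted in the text applies to Gaussian-like signals. The sine here gives 1.5, and the point of the comparison is only that sparse, large impacts push kurtosis up sharply while RMS changes far less.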
[0047] Frequency-domain analysis converts the time-domain vibration signal to the frequency domain using the Fast Fourier Transform (FFT), with the formula X(k) = Σ x(n) × e^(-j2πkn / N), where the sum runs over n = 0 to N-1. This yields the spectral distribution of the vibration signal, from which the amplitude of the dominant frequency component (i.e., the frequency component with the largest amplitude) is extracted. This parameter reflects the energy intensity of the equipment's main excitation frequency and fault characteristic frequency. Based on the rotational speed and structural parameters of the reciprocating compressor, the spectrum is divided into four frequency bands: low frequency (0-50Hz), power frequency (50-150Hz), octave frequency (150-500Hz), and high frequency (above 500Hz). The energy proportion of each frequency band is calculated as the energy of that band divided by the total energy of the four bands, that is, η_i = E_i / (E_1 + E_2 + E_3 + E_4), where E_i = Σ |X(k)|² is summed over the frequency bins k that fall in band i. Bearing wear increases the proportion of energy in the high-frequency band, while crankshaft cracks increase the proportion of energy in the octave band. The extracted root mean square value, peak-to-peak value, skewness, and kurtosis form the time-domain statistical feature vector, and the main frequency amplitude and the energy proportion of each frequency band form the frequency-domain energy feature vector. The two are combined into a vibration feature vector containing 9 feature parameters. Taking the bearing wear failure of the 2D12 compressor as an example, under normal operating conditions the root mean square value of the vibration signal is 0.52g, the peak-to-peak value is 1.8g, the skewness is 0.15, and the kurtosis is 3.7. The main frequency appears at the rotational frequency of 12Hz with an amplitude of 0.12g. The energy proportions are as follows: low frequency band 25%, power frequency band 45%, octave band 20%, and high frequency band 10%. 
When the bearing wear reaches 0.2mm, the root mean square value increases to 1.05g, an increase of 102%, the peak-to-peak value increases to 4.2g, an increase of 133%, the skewness changes to 0.38, the kurtosis increases significantly to 6.8, and the main frequency amplitude increases to 0.28g. More importantly, the energy proportion of the high frequency band jumps to 28% while the low frequency band drops to 18%. This coordinated change in time-domain statistical characteristics and frequency-domain energy characteristics clearly indicates the development of the bearing wear failure and provides a highly discriminative feature input for subsequent fault identification.
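The band energy proportions and main frequency amplitude can be sketched as follows. To stay self-contained, the sketch evaluates the DFT sum directly instead of using an FFT routine; the result is the same spectrum, only slower to compute. The half-open band boundaries, the one-sided spectrum, the function names, and the two-tone test signal are assumptions made for illustration.

```python
import cmath
import math

# Band edges in Hz as listed in the text; each band is treated as [low, high).
BANDS_HZ = {
    "low": (0.0, 50.0),
    "power": (50.0, 150.0),
    "octave": (150.0, 500.0),
    "high": (500.0, float("inf")),
}


def dft_magnitudes(x):
    """One-sided |X(k)| for k = 0..N/2, X(k) = sum x(n) * exp(-j*2*pi*k*n/N)."""
    n = len(x)
    return [
        abs(sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)))
        for k in range(n // 2 + 1)
    ]


def frequency_features(x, sample_rate_hz):
    """Return main frequency, its amplitude, and the energy share of each band."""
    n = len(x)
    mags = dft_magnitudes(x)
    freqs = [k * sample_rate_hz / n for k in range(len(mags))]
    energy = {name: 0.0 for name in BANDS_HZ}
    for f, m in zip(freqs, mags):
        for name, (low, high) in BANDS_HZ.items():
            if low <= f < high:
                energy[name] += m * m
    total = sum(energy.values())
    # Skip the DC bin when searching for the dominant component.
    k_main = max(range(1, len(mags)), key=lambda k: mags[k])
    return {
        "main_freq_hz": freqs[k_main],
        "main_amplitude": 2 * mags[k_main] / n,
        "band_energy": {name: e / total for name, e in energy.items()},
    }


FS = 2000  # illustrative sampling rate, chosen so both tones land on exact bins
N = 400
signal = [
    math.sin(2 * math.pi * 100 * i / FS) + 0.5 * math.sin(2 * math.pi * 700 * i / FS)
    for i in range(N)
]
features = frequency_features(signal, FS)
```

For this test signal the energy splits in the ratio of the squared amplitudes (1 : 0.25), so about 80% falls in the power frequency band and about 20% in the high-frequency band.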
[0048] The core of thermodynamic and waveform characteristic analysis of preprocessed pressure signals lies in the fact that pressure signals directly reflect the thermodynamic cycle process and gas flow state of the compressor. Waveform characteristics can reveal the essential characteristics of process faults such as intake and exhaust valve malfunctions, airflow pulsation, and pressure fluctuations. Specifically, a complete working cycle is identified from the preprocessed pressure signal. Based on the periodic changes in the pressure signal, the starting point of each cycle is determined by detecting the pressure rise edge, and pressure characteristic parameters are extracted within a single cycle. The maximum discharge pressure is calculated using the formula Pdischarge-max=max(p(t)|t∈Tdischarge), where Tdischarge is the discharge stroke time period. This parameter reflects the compressor's discharge capacity and the working state of the exhaust valve; exhaust valve leakage will cause a decrease in the maximum discharge pressure. The minimum suction pressure is calculated using the formula Psuction-min=min(p(t)|t∈Tsuction), where Tsuction is the suction stroke time period. This parameter reflects the working state of the suction valve and intake system; suction valve malfunction will cause the minimum suction pressure to deviate from the normal value. The pressure ratio is calculated using the formula ε=Pdischarge-max / Psuction-min. The pressure ratio is a comprehensive indicator reflecting the compressor load and operating efficiency. Under normal operating conditions, the pressure ratio is stable near the design value. Airflow pulsation or pressure fluctuation faults will cause increased pressure ratio fluctuations. 
The pressure pulsation amplitude is calculated by first band-pass filtering the pressure signal to extract the pulsation component and then taking the root mean square value of that component. This parameter directly characterizes the severity of the airflow pulsation fault. The indicated work feature is extracted by calculating the area enclosed by the closed curve of the pressure-volume diagram, using the formula Wi=∮p(V)dV. The indicated work reflects the actual compression work performed by the compressor, and anomalies in its value and in the shape of the diagram comprehensively reflect the impact of various faults. The maximum discharge pressure, minimum intake pressure, pressure ratio, pressure pulsation amplitude, and indicated work feature are combined to form a pressure feature vector containing five characteristic parameters.
[0049] Taking the airflow pulsation fault of the 6M50 compressor as an example, under normal operating conditions, the maximum discharge pressure is 11.8MPa, the minimum intake pressure is 0.95MPa, the pressure ratio is 12.4, the pressure pulsation amplitude is 0.15MPa, and the indicated work is 458kJ. When the intake pipeline resonates and causes the airflow pulsation fault, the maximum discharge pressure fluctuates to the range of 10.5-13.2MPa, the minimum intake pressure fluctuates to the range of 0.75-1.15MPa, the pressure ratio fluctuates to the range of 9.1-17.6, the pressure pulsation amplitude surges to 0.85MPa with an increase of 467%, and the indicated work fluctuates to the range of 420-495kJ. The sharp increase in the pressure pulsation amplitude and the large fluctuations in various parameters clearly identify the characteristics of the airflow pulsation fault. However, this type of process fault is not obvious in the vibration characteristics, which fully demonstrates the unique sensitivity of pressure characteristics to process faults.
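The five pressure features can be sketched as follows. The sketch assumes the discharge-stroke and suction-stroke samples have already been separated, that the pulsation component has already been band-pass filtered, and that the pressure-volume loop is given as a closed polygon of (volume in m³, pressure in MPa) points. The function names, the rectangular loop, and the sample values other than the 11.8 MPa and 0.95 MPa quoted above are illustrative assumptions.

```python
import math


def indicated_work_kj(pv_loop):
    """Area enclosed by a closed p-V polygon (shoelace formula), in kJ.

    pv_loop holds (volume_m3, pressure_mpa) points; 1 MPa*m^3 = 1000 kJ.
    """
    twice_area = 0.0
    for i, (v1, p1) in enumerate(pv_loop):
        v2, p2 = pv_loop[(i + 1) % len(pv_loop)]
        twice_area += v1 * p2 - v2 * p1
    return abs(twice_area) / 2 * 1000


def pressure_features(p_discharge, p_suction, pulsation, pv_loop):
    """Return the five pressure features for one working cycle."""
    p_dis_max = max(p_discharge)
    p_suc_min = min(p_suction)
    return {
        "max_discharge_mpa": p_dis_max,
        "min_suction_mpa": p_suc_min,
        "pressure_ratio": p_dis_max / p_suc_min,
        "pulsation_rms_mpa": math.sqrt(sum(v * v for v in pulsation) / len(pulsation)),
        "indicated_work_kj": indicated_work_kj(pv_loop),
    }


# Illustrative single-cycle data; only 11.8 and 0.95 come from the example above.
discharge = [11.2, 11.6, 11.8, 11.5]
suction = [1.05, 0.98, 0.95, 1.00]
pulsation = [0.15, -0.15, 0.15, -0.15]
# Idealised rectangular loop, not a measured indicator diagram.
loop = [(0.01, 1.0), (0.05, 1.0), (0.05, 11.0), (0.01, 11.0)]

features = pressure_features(discharge, suction, pulsation, loop)
```

With the quoted normal-condition pressures the ratio evaluates to about 12.42, which rounds to the 12.4 stated in the example, and the rectangular loop encloses 0.04 m³ × 10 MPa = 400 kJ.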
[0050] The purpose of statistical and trend analysis on the preprocessed temperature signal is that, although the temperature signal changes slowly, it can reflect the evolution of the equipment's thermal state and energy loss characteristics, serving as an important basis for judging friction-related mechanical faults and abnormal thermal efficiency. Specifically, the average temperature value is calculated using the formula... This parameter reflects the overall thermal state of the monitored area. Increased frictional power consumption due to bearing wear will raise the average temperature, and cooling system malfunctions will also cause an abnormally high average temperature. The temperature variance is calculated using the formula... Temperature variance reflects the degree of fluctuation in the temperature signal. Under normal operating conditions, the temperature is relatively stable with a small variance, while intermittent faults or frequent load changes will lead to an increase in temperature variance. The rate of temperature change is calculated using the formula dT / dt=(T(n)-T(n-Δn)) / Δt, where Δn is the length of the time window and Δt is the corresponding time interval. The rate of temperature change reflects the speed at which the temperature rises or falls. Sudden changes in the rate of temperature change in the early stages of a fault often precede significant changes in the average temperature value, making it a sensitive indicator for early fault warning. In practical applications, temperature features are extracted for different monitoring parts such as connecting rod bearings, crankcases, and cylinder walls. For multi-point temperature signals, the temperature difference characteristics between each measuring point are also calculated. Uneven temperature distribution often indicates local faults. 
The average temperature value, temperature variance, and rate of temperature change are combined to form a temperature feature vector containing three characteristic parameters.
[0051] Taking the connecting rod bearing wear failure of the 2D12 compressor as an example, under normal operating conditions, the average bearing temperature is 52℃, the temperature variance is 6.5℃², and the temperature change rate is 0.8℃ / min. As the bearing wear increases from 0.05mm to 0.15mm, the average temperature gradually rises to 68℃, 75℃, and 82℃, and the temperature variance increases to 18℃², reflecting intensified temperature fluctuations. The temperature change rate shows an abnormal increase of 2.2℃ / min in the early stage of the failure and then stabilizes at 1.5℃ / min. The evolution of temperature characteristics shows a clear correspondence with the degree of bearing wear. In particular, the early abrupt change in the temperature change rate provides a time window for fault warning. At this time, the change in vibration characteristics is not obvious, and the kurtosis only increases from 3.7 to 4.2. This fully demonstrates the unique sensitivity and early warning capability of temperature characteristics for friction-related failures.
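The three temperature features can be sketched as follows. The formulas for the mean and variance are elided in the text, so the sketch assumes the ordinary arithmetic mean and the population (1/N) variance. The rate of change follows dT/dt = (T(n) - T(n-Δn)) / Δt evaluated at the last sample. The function name and the ramp data are illustrative.

```python
def temperature_features(temps_c, sample_period_s, window_samples):
    """Return mean (deg C), variance (deg C^2) and rate of change (deg C/min).

    The rate compares the last sample with the one `window_samples` earlier.
    """
    if len(temps_c) <= window_samples:
        raise ValueError("need more samples than the rate window")
    n = len(temps_c)
    mean = sum(temps_c) / n
    variance = sum((t - mean) ** 2 for t in temps_c) / n  # assumed 1/N form
    delta_t_s = window_samples * sample_period_s
    rate_per_min = (temps_c[-1] - temps_c[-1 - window_samples]) / delta_t_s * 60
    return {"mean_c": mean, "variance_c2": variance, "rate_c_per_min": rate_per_min}


# Illustrative 1 Hz record: one minute of a steady 0.02 deg C/s rise from 52 deg C.
temps = [52.0 + 0.02 * i for i in range(61)]
features = temperature_features(temps, sample_period_s=1.0, window_samples=60)
```

A steady rise of 0.02 ℃ per second over the 60-sample window gives a rate of 1.2 ℃/min, while the mean moves only 0.6 ℃ above the starting value, which illustrates why the rate reacts earlier than the mean.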
[0052] Finally, a vibration feature vector containing 9 parameters is extracted from the preprocessed vibration signal, a pressure feature vector containing 5 parameters is extracted from the preprocessed pressure signal, and a temperature feature vector containing 3 parameters is extracted from the preprocessed temperature signal. The three modal features have a total of 17 feature parameters, covering multiple physical dimensions such as time domain statistics, frequency domain energy, thermodynamic cycle, waveform morphology, and temperature trend, ensuring a comprehensive characterization capability for mechanical and process faults.
[0053] S104: The vibration characteristics, pressure characteristics, and temperature characteristics are each compared with the corresponding benchmark characteristics under normal operating conditions, and the difference is quantified to obtain the vibration evidence strength, pressure evidence strength, and temperature evidence strength, which represent the degree of deviation of each mode from the normal state.
[0054] In S104 above, the vibration, pressure, and temperature characteristics are quantified and differentiated from the corresponding baseline characteristics under normal operating conditions to obtain the vibration evidence strength, pressure evidence strength, and temperature evidence strength of the degree of deviation of each mode from the normal state. This is a key step in transforming multimodal characteristics from the feature space to the decision space. The core purpose is to quantitatively assess the degree of deviation of the current operating state from the healthy baseline state, providing reliable evidence support for subsequent fault diagnosis decisions. Fault diagnosis of reciprocating compressors is essentially a pattern recognition problem, requiring the determination of whether the current equipment state belongs to a normal operating state or a certain fault state. The basis for this determination is the degree of difference between the current state characteristics and the normal state characteristics. Directly using simple measurement methods such as Euclidean distance has obvious shortcomings because there are often correlations between the various dimensions of multimodal characteristics, and the scale and fluctuation range of different feature dimensions vary greatly. Simple Euclidean distance cannot accurately reflect the true degree of deviation in the feature space. Furthermore, the characteristic distribution under normal operating conditions is not isotropic, but exhibits a specific statistical distribution pattern. Larger changes in characteristics in some directions are normal fluctuations, while even small changes in other directions may indicate anomalies. 
Therefore, a distance metric that accounts for the statistical characteristics of the feature distribution must be used to quantify the degree of deviation accurately, so that normal fluctuations are not misjudged as fault symptoms and real anomalies are not drowned out by feature noise.
[0055] Furthermore, comparing the vibration, pressure, and temperature features with the corresponding baseline features under normal operating conditions and quantifying the difference to obtain the vibration evidence strength, pressure evidence strength, and temperature evidence strength specifically includes: acquiring historical multimodal feature data of the reciprocating compressor under normal operating conditions, calculating the mean vector and covariance matrix of the feature vectors of the vibration mode, pressure mode, and temperature mode respectively, and using the mean vector and covariance matrix of each mode as the baseline features; using the vibration, pressure, and temperature features as the real-time vibration feature vector, real-time pressure feature vector, and real-time temperature feature vector respectively; and taking the vibration mode, pressure mode, and temperature mode in turn as the target mode. For each target mode, the mean vector of the target mode is subtracted from the real-time feature vector of the target mode to obtain the feature deviation vector of the target mode. The covariance matrix of the target mode is inverted to obtain the inverse covariance matrix of the target mode. The feature deviation vector is transposed to obtain the transposed feature deviation vector. The transposed feature deviation vector, the inverse covariance matrix, and the feature deviation vector are then multiplied in sequence to obtain the squared Mahalanobis distance of the target mode. The squared Mahalanobis distances of the vibration mode, pressure mode, and temperature mode are used as the vibration evidence strength, pressure evidence strength, and temperature evidence strength, respectively.
[0056] Specifically, acquiring historical multimodal characteristic data of reciprocating compressors under normal operating conditions is fundamental to building a health benchmark, ensuring the representativeness and reliability of the historical data. Initially, after the compressor is installed, commissioned, and running stably, at least 1000 hours of operating data are continuously collected using a multimodal data acquisition module. This ensures coverage of the equipment's break-in and stabilization periods, as well as the complete range of various typical load conditions from 60% to 100% of rated load. During the data collection process, data is rigorously screened, retaining only data confirmed by professional engineers as being in normal operating condition, and excluding data from transitional processes such as start-up, shutdown, and operating condition switching, as well as data from any known anomalies. For the 6M50 compressor, under normal operating conditions of 420 r / min rated speed and 12.0 MPa, 1200 hours of data were also collected, and 5000 sets of normal operating condition samples were selected. For each sample, using the aforementioned feature extraction method, a complete feature vector containing 9-dimensional vibration features, 5-dimensional pressure features, and 3-dimensional temperature features was obtained, constituting a historical multimodal characteristic dataset under normal operating conditions.
[0057] The mean vector and covariance matrix of the feature vectors of the vibration mode, pressure mode, and temperature mode are calculated separately, so that the distribution of each mode under normal conditions is described by statistical parameters. For the vibration mode, vibration features are extracted from the 5000 sets of normal operating condition samples to form a 5000×9 vibration feature matrix Xv, where each row is the 9-dimensional vibration feature vector of one sample and each column is the sequence of values of one vibration feature parameter across the 5000 samples. The mean vector μv of the vibration mode is calculated using the formula μv = (1 / N) ∑ Xv(i), where Xv(i) is the i-th row of Xv and N = 5000 is the number of samples. This operation averages each column of the vibration feature matrix to obtain a 9-dimensional mean vector, in which the first dimension is the average root mean square value, the second dimension is the average peak-to-peak value, and the remaining dimensions are, in order, the average skewness, kurtosis, dominant frequency amplitude, and energy proportion of each frequency band.
[0058] Taking the 2D12 compressor as an example, the calculated mean vector of the vibration mode is μv = [0.53g, 1.85g, 0.16, 3.72, 0.115g, 0.245, 0.442, 0.198, 0.115]. These nine values constitute the central position of the vibration characteristics under normal conditions. The covariance matrix Σv of the vibration mode is calculated using the formula Σv = (1 / (N-1)) ∑ (Xv(i)-μv)(Xv(i)-μv)ᵀ. The operation first calculates the deviation between the vibration feature vector of each sample and the mean vector. It then multiplies each deviation vector by its transpose to obtain a 9×9 outer product matrix. Summing the outer product matrices of all samples and dividing by N-1 yields the covariance matrix. The diagonal elements of the covariance matrix are the variances of the individual feature dimensions, while the off-diagonal elements are the covariances between different feature dimensions, so the matrix describes both the fluctuation of each vibration feature and the correlation between features. The calculated vibration mode covariance matrix Σv is a 9×9 symmetric positive definite matrix. Its diagonal elements show a variance of 0.0082g² for the root mean square value, 0.156g² for the peak-to-peak value, and 0.18 for the kurtosis, indicating that kurtosis fluctuates relatively strongly under normal conditions. The off-diagonal elements show a positive correlation between the root mean square value and the peak-to-peak value (covariance 0.035g²), and a positive correlation between kurtosis and the proportion of high-frequency energy (covariance 0.012). This correlation information is crucial for subsequent distance calculations.
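The baseline statistics described above can be sketched in NumPy as follows. This is an illustrative sketch only: the function name `baseline_statistics` is hypothetical, and the 5000×9 matrix is filled with synthetic random numbers as a placeholder, since the actual normal-condition samples are not given in the source.

```python
import numpy as np

def baseline_statistics(X):
    """Mean vector and unbiased covariance matrix of an (N, d) feature matrix."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)                    # column-wise average, shape (d,)
    centered = X - mu                      # deviation of every sample from the mean
    # Sum of the outer products of the deviations, divided by N - 1.
    sigma = centered.T @ centered / (X.shape[0] - 1)
    return mu, sigma

# Placeholder data standing in for 5000 normal-condition vibration feature vectors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5000, 9))
mu_v, sigma_v = baseline_statistics(X_demo)
```

The same function would apply unchanged to the 5000×5 pressure matrix and the 5000×3 temperature matrix. `np.cov(X, rowvar=False)` computes the same covariance matrix and could be used instead.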
[0059] Using the same calculation method, pressure features were extracted from 5000 sets of normal operating condition samples to form a 5000×5 pressure feature matrix Xp. A 5-dimensional pressure mode mean vector μp and a 5×5 pressure mode covariance matrix Σp were then calculated. Taking the 2D12 compressor as an example, the pressure mode mean vector is μp=[3.8MPa, 0.35MPa, 0.18MPa / ms, 12.2, 0.14MPa], corresponding to the normal levels of maximum discharge pressure, pressure fluctuation amplitude, pressure rise and fall rate, pressure ratio, and pressure pulsation amplitude, respectively. The diagonal elements of the covariance matrix Σp show that the variance of the maximum discharge pressure is 0.65MPa², the variance of the pressure ratio is 1.25, and the variance of the pressure pulsation amplitude is 0.0045MPa². The off-diagonal elements show a strong positive correlation between the maximum discharge pressure and the pressure ratio, with a covariance of 0.82MPa. Temperature features were extracted from 5000 normal operating condition samples to form a 5000×3 temperature feature matrix Xt. A 3-dimensional temperature mode mean vector μt and a 3×3 temperature mode covariance matrix Σt were then calculated. Taking the 2D12 compressor as an example, the temperature mode mean vector is μt = [51℃, 7.2℃, 1.1℃ / min], corresponding to the normal levels of average temperature, temperature variance, and temperature change rate, respectively. The diagonal elements of the covariance matrix Σt show that the variance of the average temperature is 12.5℃², the variance of the temperature variance is 2.8℃², and the variance of the temperature change rate is 0.25 (℃ / min)². The off-diagonal elements show a weak positive correlation between the average temperature and the temperature change rate, with a covariance of 0.45℃² / min.
[0060] Using vibration, pressure, and temperature characteristics as real-time vibration, pressure, and temperature feature vectors, respectively, is the initial step in incorporating the equipment status data to be diagnosed into the evaluation process. In actual diagnostics, the multimodal data acquisition module collects the vibration, pressure, and temperature signals of the compressor in real time. After noise reduction, filtering, and standardization by the data preprocessing module, and feature extraction by the multimodal feature fusion module, the current 9-dimensional vibration feature vector, 5-dimensional pressure feature vector, and 3-dimensional temperature feature vector are obtained. Taking the 2D12 compressor at a certain monitoring moment as an example, the real-time vibration feature vector obtained after feature extraction is xv=[1.08g, 4.15g, 0.36, 6.75, 0.27g, 0.185, 0.385, 0.205, 0.265]. Compared with the baseline mean vector μv, it can be seen that the root mean square value increased from 0.53g to 1.08g, the kurtosis increased from 3.72 to 6.75, and the high-frequency energy ratio increased from 0.115 to 0.265. These significant changes in features suggest that there may be a bearing wear failure. At the same time, the extracted real-time pressure feature vector is xp=[3.6MPa, 0.38MPa, 0.19MPa / ms, 12.0, 0.16MPa], and the real-time temperature feature vector is xt=[78℃, 22℃, 2.7℃ / min]. The significant increase in temperature features further confirms the inference that bearing wear leads to increased frictional heat.
[0061] The vibration mode, pressure mode, and temperature mode are each taken as the target mode in turn so that the degree of deviation of each mode is calculated individually, avoiding the confusion of dimensions and physical meaning that cross-modal comparison would cause. For each target mode, the mean vector of the target mode is first subtracted from the real-time feature vector of the target mode to obtain the feature deviation vector of the target mode. Taking the vibration mode as the target mode, the vibration mode mean vector μv = [0.53g, 1.85g, 0.16, 3.72, 0.115g, 0.245, 0.442, 0.198, 0.115] is subtracted element-wise from the real-time vibration feature vector xv = [1.08g, 4.15g, 0.36, 6.75, 0.27g, 0.185, 0.385, 0.205, 0.265]. The calculation formula is Δxv = xv - μv, which gives the feature deviation vector of the vibration mode Δxv = [0.55g, 2.30g, 0.20, 3.03, 0.155g, -0.060, -0.057, 0.007, 0.150]. Each dimension of this vector represents the degree to which the corresponding feature parameter deviates from the normal mean. Positive values indicate above-normal levels, and negative values indicate below-normal levels. Specifically, the root mean square value is 0.55g higher, the kurtosis is 3.03 higher, the high-frequency energy proportion is 0.150 higher, and the low-frequency energy proportion is 0.060 lower. This deviation pattern is consistent with the typical characteristics of bearing wear failure. The calculation of the feature deviation vector eliminates the influence of the absolute value of the real-time feature vector, focusing instead on the deviation relative to the normal baseline, laying the foundation for subsequent distance measurement.
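The deviation step can be reproduced directly from the 2D12 example values quoted in this description. The sketch below is illustrative: the dictionary layout and variable names are assumptions, and units (g, MPa, ℃) are dropped so that only the numerical values are carried.

```python
import numpy as np

# Real-time feature vectors and baseline mean vectors from the 2D12 example.
realtime = {
    "vibration":   np.array([1.08, 4.15, 0.36, 6.75, 0.27, 0.185, 0.385, 0.205, 0.265]),
    "pressure":    np.array([3.6, 0.38, 0.19, 12.0, 0.16]),
    "temperature": np.array([78.0, 22.0, 2.7]),
}
baseline_mean = {
    "vibration":   np.array([0.53, 1.85, 0.16, 3.72, 0.115, 0.245, 0.442, 0.198, 0.115]),
    "pressure":    np.array([3.8, 0.35, 0.18, 12.2, 0.14]),
    "temperature": np.array([51.0, 7.2, 1.1]),
}

# Feature deviation vector per mode: real-time vector minus baseline mean.
deviation = {mode: realtime[mode] - baseline_mean[mode] for mode in realtime}
```

Keeping the three modes in separate entries mirrors the per-mode treatment in the text: no deviation is ever computed across modes with different dimensions.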
[0062] Inverting the covariance matrix corresponding to the target mode to obtain the inverse covariance matrix of the target mode is a key mathematical transformation in the Mahalanobis distance calculation. Its physical significance lies in whitening the feature space, eliminating correlations and scale differences between feature dimensions. Taking the vibration mode as an example, the vibration mode covariance matrix Σv is a 9×9 symmetric positive definite matrix. Performing matrix inversion yields the inverse covariance matrix Σv⁻¹. Numerical methods such as Gaussian elimination or singular value decomposition are used in the calculation to ensure numerical stability. The inverse covariance matrix Σv⁻¹ is also a 9×9 symmetric positive definite matrix. Its function is to perform a weighted transformation on the feature deviation vector, assigning larger weights to feature dimensions with smaller variances and smaller weights to feature dimensions with larger variances, while also using the correlation between dimensions for decorrelation. The physical meaning of this weighting mechanism is that if a feature dimension naturally has small fluctuations and small variance under normal conditions, then even a small deviation in that dimension should be considered a significant anomaly. Conversely, if a feature dimension naturally has large fluctuations and large variance under normal conditions, then a large deviation in that dimension may simply be normal fluctuation. Taking kurtosis as an example, under normal conditions the variance of kurtosis is 0.18, which is a relatively large fluctuation, while the variance of the root mean square (RMS) value is 0.0082g², which is a small fluctuation. A simple Euclidean distance would weight each unit of deviation in kurtosis and in RMS equally, ignoring how much each feature normally fluctuates.
In reality, the kurtosis deviation of 3.03 corresponds to about 7.1 standard deviations (standard deviation √0.18 ≈ 0.424), while the RMS deviation of 0.55g corresponds to about 6.0 standard deviations (standard deviation √0.0082g² ≈ 0.091g). Both anomalies are therefore significant and of comparable statistical weight, even though their raw magnitudes differ by a factor of more than five. The inverse covariance matrix is introduced precisely to achieve this weighting based on the statistical distribution.
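The scale effect discussed above can be checked with a short calculation. The sketch below is a simplification: it standardizes each deviation by its own standard deviation only, which corresponds to using just the diagonal of the covariance matrix, whereas the full inverse covariance matrix also accounts for the correlations between dimensions. Variable names are illustrative.

```python
import math

# Deviations and normal-condition variances quoted for the 2D12 example.
kurtosis_dev, kurtosis_var = 3.03, 0.18
rms_dev, rms_var = 0.55, 0.0082   # in g and g^2 respectively

# Deviation expressed in units of each feature's own standard deviation.
z_kurtosis = kurtosis_dev / math.sqrt(kurtosis_var)
z_rms = rms_dev / math.sqrt(rms_var)

# Ratio of raw magnitudes versus ratio of standardized magnitudes.
raw_ratio = kurtosis_dev / rms_dev
standardized_ratio = z_kurtosis / z_rms
```

The raw ratio is about 5.5, while the standardized ratio is close to 1, which is why an unweighted distance would let the kurtosis term dominate even though both deviations are statistically of similar size.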
[0063] The feature deviation vector is transposed to obtain the transposed feature deviation vector so that the dimensions match in the subsequent matrix multiplication. Taking the vibration mode as an example, the feature deviation vector Δxv is a 9×1 column vector, and transposing it yields the 1×9 row vector Δxvᵀ = [0.55g, 2.30g, 0.20, 3.03, 0.155g, -0.060, -0.057, 0.007, 0.150]. This row vector is used as the first operand in the subsequent matrix multiplication. Mathematically, the transpose is a simple swap of rows and columns, but it is an indispensable step in the computation, ensuring that the dimensions of the three operands match for the consecutive multiplications.
[0064] Multiplying the transposed feature deviation vector, the inverse covariance matrix, and the feature deviation vector in sequence to obtain the squared Mahalanobis distance of the target mode is the core calculation step for quantifying the degree of deviation. Taking the vibration mode as an example, the calculation formula is D²v = Δxvᵀ Σv⁻¹ Δxv, a consecutive product of three matrices. First, the transposed feature deviation vector Δxvᵀ (1×9) is multiplied by the inverse covariance matrix Σv⁻¹ (9×9) to yield a 1×9 intermediate vector. Each element of this intermediate vector is a weighted sum of the elements of the transposed feature deviation vector with the corresponding column of the inverse covariance matrix, reflecting the weighted deviation after feature correlation and variance weights are taken into account. Then, this 1×9 intermediate vector is multiplied by the feature deviation vector Δxv (9×1), resulting in a 1×1 scalar, the squared Mahalanobis distance D²v. The entire calculation can be expanded as D²v = ∑ᵢ∑ⱼ (Δxv)ᵢ (Σv⁻¹)ᵢⱼ (Δxv)ⱼ, where i and j each run over the 9 feature dimensions; this is a quadratic form. Its value reflects the weighted distance of the feature deviation vector after the covariance matrix is taken into account. For the real-time vibration feature vector at the moment of bearing wear in the aforementioned 2D12 compressor, the calculated squared Mahalanobis distance of the vibration mode is D²v = 187.3. This is a dimensionless value, and its magnitude reflects the statistical distance of the current vibration state from the normal reference state; the larger the value, the more severe the deviation. Through the weighted transformation of the inverse covariance matrix, the Mahalanobis distance provides a statistically better-founded measure of the degree of deviation than the Euclidean distance.
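A minimal sketch of the squared Mahalanobis distance is given below. The function name `squared_mahalanobis` is hypothetical, and the 2×2 covariance matrices are made-up toy values chosen so the result can be checked by hand; the full 9×9 matrix Σv is not reproduced in this description, so the value 187.3 cannot be recomputed here. As a design choice, the sketch solves the linear system Σy = Δx rather than forming the inverse explicitly, which gives the same quadratic form and is generally the more numerically stable route.

```python
import numpy as np

def squared_mahalanobis(x, mu, sigma):
    """Return (x - mu)^T sigma^{-1} (x - mu) for one feature vector."""
    delta = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    # Solve sigma @ y = delta instead of computing the inverse of sigma.
    y = np.linalg.solve(np.asarray(sigma, dtype=float), delta)
    return float(delta @ y)

# Toy example 1: uncorrelated features with variances 4 and 1.
d2_diag = squared_mahalanobis([2.0, 1.0], [0.0, 0.0],
                              [[4.0, 0.0], [0.0, 1.0]])

# Toy example 2: correlated features; the inverse is (1/3) * [[2, -1], [-1, 2]].
d2_corr = squared_mahalanobis([1.0, 0.0], [0.0, 0.0],
                              [[2.0, 1.0], [1.0, 2.0]])
```

In the first toy case each squared deviation is simply divided by its variance (4/4 + 1/1); in the second, the off-diagonal covariance changes the result, which is the decorrelation effect described in the text.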
[0065] Using exactly the same calculation procedure, the squared Mahalanobis distance of the pressure mode is calculated. The pressure mode mean vector μp = [3.8 MPa, 0.35 MPa, 0.18 MPa / ms, 12.2, 0.14 MPa] is subtracted from the real-time pressure feature vector xp = [3.6 MPa, 0.38 MPa, 0.19 MPa / ms, 12.0, 0.16 MPa] to obtain the pressure mode feature deviation vector Δxp = [-0.2 MPa, 0.03 MPa, 0.01 MPa / ms, -0.2, 0.02 MPa]. The covariance matrix Σp of the pressure mode is inverted to obtain the inverse covariance matrix Σp⁻¹, and the feature deviation vector is transposed to obtain Δxpᵀ. Performing the matrix multiplication D²p = Δxpᵀ Σp⁻¹ Δxp gives the squared Mahalanobis distance of the pressure mode, D²p = 2.8. This value is far smaller than the squared Mahalanobis distance of the vibration mode, indicating that under a bearing wear failure the deviation of the pressure characteristics is much smaller than that of the vibration characteristics. This agrees with the physical behavior that bearing wear primarily affects mechanical vibration while having a relatively small impact on the pressure of the thermodynamic cycle. The squared Mahalanobis distance of the temperature mode is then calculated. Subtracting the temperature mode mean vector μt = [51℃, 7.2℃, 1.1℃ / min] from the real-time temperature feature vector xt = [78℃, 22℃, 2.7℃ / min] yields the temperature mode feature deviation vector Δxt = [27℃, 14.8℃, 1.6℃ / min]. Inverting the temperature mode covariance matrix Σt yields the inverse covariance matrix Σt⁻¹, and transposing the feature deviation vector yields Δxtᵀ. Performing the matrix multiplication D²t = Δxtᵀ Σt⁻¹ Δxt gives the squared Mahalanobis distance of the temperature mode, D²t = 95.6. This large value indicates that the increased frictional heat caused by bearing wear has led to a significant deviation in the temperature characteristics.
Temperature evidence therefore plays a crucial supporting role in the diagnosis of this fault.
[0066] The squared Mahalanobis distance values of the calculated vibration mode, pressure mode, and temperature mode were used as the strength of evidence for vibration, pressure, and temperature, respectively, thus completing the transformation from feature space to evidence space.
[0067] S105: Combine the vibration evidence strength, pressure evidence strength, and temperature evidence strength to obtain a real-time anomaly evidence vector. Map and compare the real-time anomaly evidence vector with a preset fault mode template to obtain a set of modal feature weight values.
[0068] In S105 above, mapping and comparing real-time anomaly evidence vectors with preset fault mode templates to obtain a set of modal feature weight values is the core mechanism for achieving adaptive dynamic adjustment of feature weights. Its fundamental purpose is to automatically identify the most likely fault type based on the current equipment status and dynamically adjust the contribution weight of each modal feature in the final diagnostic decision accordingly. Different fault types in reciprocating compressors exhibit significantly differentiated multimodal feature manifestations. Mechanical faults, such as bearing wear, mainly manifest as vibration and temperature anomalies with relatively small pressure changes; process faults, such as airflow pulsation, mainly manifest as pressure fluctuations with insignificant vibration and temperature changes. If fixed weights are used to fuse modal features, some fault types' features will inevitably be submerged or misjudged. Therefore, establishing a dynamic weight allocation mechanism based on fault mode recognition allows the mode most sensitive to the current fault to automatically receive higher weights, improving the targeting and accuracy of the diagnosis.
[0069] The real-time anomaly evidence vector is mapped and compared with a preset fault mode template to obtain a set of modal feature weight values. Specifically, this includes: obtaining fault prototype vectors as preset fault mode templates; the fault prototype vectors include at least mechanical fault prototype vectors, fluid fault prototype vectors, and health state prototype vectors; each fault prototype vector consists of vibration reference evidence values, pressure reference evidence values, and temperature reference evidence values; performing dot product operations between the real-time anomaly evidence vectors and the mechanical fault prototype vectors, fluid fault prototype vectors, and health state prototype vectors respectively to obtain mechanical fault alignment scores, fluid fault alignment scores, and health state alignment scores; and applying temperature hyperparameters... The alignment scores for mechanical faults, fluid faults, and health status are calculated and normalized using exponential methods to obtain the confidence scores for mechanical faults, fluid faults, and health status. Pre-defined attention vectors for mechanical faults, fluid faults, and health status are then obtained. The mechanical fault confidence score is multiplied by the mechanical fault attention vector, the fluid fault confidence score is multiplied by the fluid fault attention vector, and the health status confidence score is multiplied by the health status attention vector. The results of these multiplications are then summed to obtain a dynamic weight vector. Each component in the dynamic weight vector is normalized to obtain a set of modal feature weight values.
[0070] Specifically, obtaining fault prototype vectors as preset fault mode templates is the foundation for establishing a fault mode knowledge base. From historical fault samples on a cloud-based big data platform, 1000 sets of typical mechanical fault samples, 1000 sets of typical process fault samples, and 1000 sets of typical health state samples were selected. For each type of sample, the statistical mean values of vibration evidence strength, pressure evidence strength, and temperature evidence strength were calculated to form the prototype vector for that type of fault. Taking the 2D12 compressor as an example, the mechanical fault prototype vector is [185.2, 3.1, 92.8], representing the mean vibration evidence strength of 185.2, the mean pressure evidence strength of 3.1, and the mean temperature evidence strength of 92.8 during mechanical faults; the fluid fault prototype vector is [16.5, 325.7, 9.2], representing the typical distribution of the three modal evidence strengths during fluid faults; and the health state prototype vector is [2.8, 2.5, 3.1], indicating that the evidence strengths of each modality are close to the baseline level during normal operation. These three prototype vectors fully characterize the multimodal evidence distribution patterns under different states, providing a standard template for subsequent pattern matching. Fault prototype vectors can be obtained through statistical analysis of historical fault data, set based on the prior knowledge of domain experts, or calibrated using simulation data.
[0071] Performing a dot product operation between the real-time anomaly evidence vector and each fault prototype vector to obtain the alignment score for each type of fault is a mathematical method of quantifying the similarity between the current state and each fault mode. Taking the real-time anomaly evidence vector [187.3, 2.8, 95.6] at the aforementioned bearing wear moment as an example, its dot product with the mechanical fault prototype vector [185.2, 3.1, 92.8] is smech = 187.3 × 185.2 + 2.8 × 3.1 + 95.6 × 92.8 = 43568.3. This value reflects the projection of the real-time evidence vector onto the direction of the mechanical fault mode; the larger the value, the closer the current state is to that fault mode. Similarly, the fluid fault alignment score is sfluid = 187.3 × 16.5 + 2.8 × 325.7 + 95.6 × 9.2 = 4881.9, and the health status alignment score is shealth = 187.3 × 2.8 + 2.8 × 2.5 + 95.6 × 3.1 = 827.8. Comparing the three alignment scores shows that the mechanical fault alignment score of 43568.3 is far higher than the fluid fault alignment score of 4881.9 and the health status alignment score of 827.8, clearly indicating that the current state belongs to the mechanical fault mode.
[0072] Scaling the alignment scores by the temperature hyperparameter and normalizing them exponentially to obtain the confidence level of each fault type is the key step in converting alignment scores into a probability distribution. The temperature hyperparameter τ controls how concentrated the probability distribution is; in this embodiment, τ = 0.1. Each alignment score is divided by the temperature hyperparameter and passed through the exponential function, giving exp(smech / τ) = exp(43568.3 / 0.1) = exp(435683), exp(sfluid / τ) = exp(48819), and exp(shealth / τ) = exp(8278). The three exponential values are then normalized: mechanical fault confidence = exp(435683) / (exp(435683) + exp(48819) + exp(8278)) ≈ 0.9998, fluid fault confidence ≈ 0.0001, and health status confidence ≈ 0.0001. The physical meaning of a confidence level is the probability that the current state belongs to a given category. A mechanical fault confidence of approximately 1.0 means that it is almost certain that the current state is a mechanical failure.
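The alignment and confidence steps can be sketched as follows, using the evidence vector and prototype vectors quoted for the 2D12 example. The function names are hypothetical. One detail is an assumption of this sketch rather than something stated in the source: exponentials of scores of this size exceed the range of double-precision floating point, so the largest scaled score is subtracted before exponentiating, which leaves the normalized result mathematically unchanged.

```python
import numpy as np

evidence = np.array([187.3, 2.8, 95.6])      # [vibration, pressure, temperature]
prototypes = np.array([
    [185.2,   3.1, 92.8],                    # mechanical fault prototype
    [ 16.5, 325.7,  9.2],                    # fluid fault prototype
    [  2.8,   2.5,  3.1],                    # health state prototype
])

def alignment_scores(evidence, prototypes):
    """Dot product of the evidence vector with every prototype (one per row)."""
    return prototypes @ evidence

def confidences(scores, tau):
    """Temperature-scaled exponential normalization of the alignment scores."""
    z = np.asarray(scores, dtype=float) / tau
    z = z - z.max()          # shift for numerical stability; result is unchanged
    w = np.exp(z)
    return w / w.sum()

scores = alignment_scores(evidence, prototypes)
conf = confidences(scores, tau=0.1)
```

With raw scores in the tens of thousands and τ = 0.1, the normalized output saturates: in double precision the mechanical fault confidence is numerically indistinguishable from 1. If a softer distribution were wanted, the evidence and prototype vectors could be scaled to unit length before the dot product, or a larger τ used; both are design options outside what the source specifies.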
[0073] The pre-defined attention vectors for the various fault types provide prior knowledge about which modal characteristics should be emphasized for each fault type. Based on the fault mechanism of reciprocating compressors, the mechanical fault attention vector is set to [0.45, 0.20, 0.35], indicating that mechanical fault diagnosis should focus on vibration (0.45) and temperature (0.35), while pressure (0.20) has a lower weight. The fluid fault attention vector is set to [0.15, 0.70, 0.15], indicating that fluid fault diagnosis should focus on pressure (0.70). The health status attention vector is set to [0.40, 0.35, 0.25], indicating that the modal weights are relatively balanced under healthy conditions. Scaling the mechanical fault attention vector [0.45, 0.20, 0.35] by the mechanical fault confidence of 0.9998 gives [0.4499, 0.2000, 0.3499]. Scaling the fluid fault attention vector [0.15, 0.70, 0.15] by the fluid fault confidence of 0.0001 gives [0.000015, 0.000070, 0.000015]. Scaling the health status attention vector [0.40, 0.35, 0.25] by the health status confidence of 0.0001 gives [0.000040, 0.000035, 0.000025]. Adding the three results gives the dynamic weight vector [0.4500, 0.2001, 0.3500]. Normalizing this vector so that its components sum to 1.0 gives a vibration feature weight of 0.45, a pressure feature weight of 0.20, and a temperature feature weight of 0.35. This weighting matches the characteristic distribution pattern of bearing wear, a mechanical fault.
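The weighting step amounts to a confidence-weighted mixture of the three attention vectors followed by normalization. A minimal sketch, with the hypothetical function name `modal_weights` and the confidence and attention values quoted in this embodiment:

```python
import numpy as np

# Rows: mechanical, fluid, health. Columns: vibration, pressure, temperature.
attention = np.array([
    [0.45, 0.20, 0.35],
    [0.15, 0.70, 0.15],
    [0.40, 0.35, 0.25],
])

def modal_weights(confidence, attention):
    """Mix the attention vectors by confidence, then normalize to sum to 1."""
    dynamic = np.asarray(confidence, dtype=float) @ attention
    return dynamic / dynamic.sum()

weights = modal_weights([0.9998, 0.0001, 0.0001], attention)
```

Because each attention vector already sums to 1 and the confidences sum to 1, the mixture sums to 1 before normalization; the final division only guards against rounding and against attention vectors that are not pre-normalized.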
[0074] In one possible implementation, besides determining the corresponding weight value based on the deviation of the current operating state from the health baseline state, an attention mechanism can also be used to obtain modal feature weight values. The process of obtaining modal feature weight values based on the attention mechanism automatically captures the interaction relationships and importance differences between different modal features through a learnable neural network structure, thus achieving the core technical path of deep feature layer fusion. The preprocessed vibration, pressure, and temperature modal raw features are respectively fed into the corresponding feature extraction submodules for encoding. The vibration feature extraction submodule calculates the time-domain feature peak, kurtosis, impulse factor, and frequency-domain feature 100Hz / 200Hz / 400Hz spectral peak and harmonic energy to form a 6-dimensional vibration feature vector V. The pressure feature extraction submodule extracts the pressure peak, fluctuation amplitude, and rise / fall rate to form a 3-dimensional pressure feature vector P. The temperature feature extraction submodule extracts the temperature mean, fluctuation range, and temperature gradient to form a 3-dimensional temperature feature vector T. The three feature vectors are concatenated to form a 12-dimensional original multimodal feature matrix X=[V; P; T].
[0075] Next, a linear projection layer maps the original feature matrix to the three representation spaces required by the attention mechanism: the query vector, the key vector, and the value vector. The linear projection layer is implemented using three learnable weight matrices WQ, WK, and WV, each with a dimension of 12×64, mapping the 12-dimensional input features to a 64-dimensional latent space. The query vector Q = X·WQ reflects what information the current feature needs to focus on, the key vector K = X·WK represents what information each modality feature can provide, and the value vector V = X·WV (distinct from the vibration feature vector V above) carries the actual content of each modality feature. A scaled dot product attention mechanism is used to calculate the attention score. First, the dot product QKᵀ of the query vector and the key vector is calculated to obtain the original attention score matrix S, where each element S(i,j) represents the attention intensity of the i-th feature to the j-th feature. A larger dot product value indicates stronger correlation. To prevent excessively large dot product values from causing vanishing gradients, the attention scores are divided by a scaling factor equal to the square root of the latent dimension, √64 = 8, giving S' = S / 8. Then, the Softmax function is applied row-wise to the scaled attention score matrix, converting the scores into an attention weight matrix A in the form of a probability distribution. The elements of each row sum to 1.0, and each element A(i,j) represents the proportion of weight that the i-th feature assigns to the j-th feature. The Softmax function highlights large values and suppresses small ones, so that highly correlated features receive significantly higher weights. Multiplying the attention weight matrix A by the value vector V yields a weighted fusion vector F = A·V. Each element in this vector incorporates information from all modal features, and the fusion weights are adaptively determined by the feature content.
To enhance the model's ability to model complex fault modes, a multi-head attention mechanism is employed to compute eight sets of attention weights in parallel. Each head independently executes linear projection, attention calculation, and weighted fusion processes using different parameter matrices, allowing each head to focus on different subspaces of the feature space. The 64-dimensional fusion vectors output by the eight heads are concatenated along the dimensions to obtain a 512-dimensional multi-head fusion feature. This feature is then mapped back to 12-dimensional space using the output projection matrix WO to obtain the final attention fusion feature Fmulti. When extracting the comprehensive weight values for each mode from the attention weight matrix, the attention weight matrix A is summed column-wise to obtain the total weight vector W = [w1, w2, ..., w12] for each feature dimension. The weights of dimensions belonging to the same mode are then averaged to obtain the weight values for vibration, pressure, and temperature modes. After normalization, these values can be used for feature weighting in the diagnostic decision layer.
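The score, scaling, Softmax, and modal weight extraction steps described above can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the applicant's implementation: the projection matrices are random and untrained, only one attention head is computed, the value projection is omitted, and, so that the score matrix is 12×12, each feature is treated as one token by taking X as a diagonal matrix of the feature values. The function name `modal_weights` and the random initialization are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable Softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def modal_weights(features, d_k=64, seed=0):
    """Return (A, weights): the attention weight matrix and per-mode weights."""
    rng = random.Random(seed)
    n = len(features)  # 12 = 6 vibration + 3 pressure + 3 temperature
    # Assumption: token i is the i-th unit vector scaled by feature i (X = diag(x)).
    x = [[features[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Assumption: untrained random projections stand in for the learned WQ and WK.
    w_q = [[rng.gauss(0.0, 0.1) for _ in range(d_k)] for _ in range(n)]
    w_k = [[rng.gauss(0.0, 0.1) for _ in range(d_k)] for _ in range(n)]
    q = matmul(x, w_q)                      # 12 x 64
    k = matmul(x, w_k)                      # 12 x 64
    k_t = [list(col) for col in zip(*k)]    # 64 x 12
    s = matmul(q, k_t)                      # raw scores S = Q·K^T, 12 x 12
    scale = math.sqrt(d_k)                  # sqrt(64) = 8
    a = [softmax([v / scale for v in row]) for row in s]
    # Column-wise sum gives the total weight received by each feature dimension.
    col_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
    groups = {
        "vibration": range(0, 6),
        "pressure": range(6, 9),
        "temperature": range(9, 12),
    }
    means = {name: sum(col_sums[j] for j in idx) / len(idx) for name, idx in groups.items()}
    total = sum(means.values())
    return a, {name: m / total for name, m in means.items()}


if __name__ == "__main__":
    sample = [1.1, 6.8, 2.0, 0.27, 0.28, 0.09, 4.2, 0.35, 0.2, 80.0, 22.0, 2.8]
    _, weights = modal_weights(sample)
    print(weights)
```

Because the projections are untrained, the printed weights will not match the figures in the description; the sketch only shows the order of operations and that the resulting modal weights sum to 1.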
[0076] Taking the moment of bearing wear failure in the 2D12 compressor as an example, the vibration feature vector at that moment is V=[1.1g, 6.8, 2.0, 0.27g, 0.28g, 0.09g²], the pressure feature vector is P=[4.2MPa, 0.35MPa, 0.2MPa / ms], and the temperature feature vector is T=[80℃, 22℃, 2.8℃ / min]. These are spliced together to form a 12-dimensional original feature matrix X. After linear projection, a 64-dimensional Q, K, V vector is obtained. Taking the vibration kurtosis feature v2 (value 6.8) and the temperature mean feature t1 (value 80℃) as an example, the dot product calculation yields the original score S(2,10)=6.26. This relatively large score indicates that the vibration kurtosis and temperature mean are highly correlated during bearing wear. After scaling and Softmax processing, it is converted into attention weights. The weights of the second-dimensional vibration kurtosis feature on the temperature-related dimensions (dimensions 10-12) are 0.32, 0.28, and 0.25 (total 0.85), respectively, while the weights on the pressure-related dimensions (dimensions 7-9) are only 0.04, 0.03, and 0.02 (total 0.09). This clearly reflects the strong coupling relationship between vibration and temperature features during mechanical failure. After parallel computation by 8 attention heads, the first head mainly captures the coupling relationship between vibration and temperature (mean attention score 0.68), and the second head mainly captures the correlation between pressure and vibration (mean attention score 0.54). The multi-head mechanism outputs 512-dimensional fused features, which are then mapped back to 12-dimensional space to obtain the final fused feature Fmulti=[1.05, 0.92, -0.28, 0.98, 0.71, 0.34, 0.18, 0.13, 0.09, 1.12, 1.08, 0.95]. 
When extracting the comprehensive weights for each mode, the average total weight of vibration across the six dimensions is 0.42, the average total weight of pressure across the three dimensions is 0.17, and the average total weight of temperature across the three dimensions is 0.35. After normalization, the vibration feature weight is 0.45, the pressure feature weight is 0.18, and the temperature feature weight is 0.37. This weight allocation scheme accurately matches the feature distribution pattern of bearing wear faults, proving that the attention mechanism can adaptively adjust the weight allocation strategy according to real-time feature content, effectively strengthening fault-sensitive features while suppressing redundant features, and achieving deep feature fusion.
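The normalization in this example can be checked with a few lines of Python. The sketch assumes that "normalization" means dividing each averaged modal weight by the sum of the three; the function name `normalize` is hypothetical.

```python
def normalize(raw):
    """Divide each modal weight by the sum so the weights add up to 1."""
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Averaged modal weights from the bearing wear example.
raw_weights = {"vibration": 0.42, "pressure": 0.17, "temperature": 0.35}
weights = normalize(raw_weights)
rounded = {name: round(value, 2) for name, value in weights.items()}
print(rounded)
```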
[0077] S106: The vibration feature, pressure feature and temperature feature are adaptively weighted and adjusted according to a set of modal feature weight values, and the adjusted multimodal features are spliced and fused to generate a multimodal fusion feature vector.
[0078] In S106 above, the vibration feature, pressure feature, and temperature feature are adaptively weighted and adjusted according to a set of modal feature weight values, and then spliced and fused to generate a multimodal fusion feature vector. This is a key execution step that transforms the dynamic weights calculated above into a weighted representation of actual features. The fundamental purpose is to inject weight information into the numerical representation of each modal feature through mathematical operations, so that the feature values of fault-sensitive modes dominate the fusion vector, while the feature values of non-sensitive modes are appropriately suppressed. This results in a high-quality fusion feature vector that can accurately reflect the current fault mode for subsequent inference by the Transformer diagnostic model. The essence of reciprocating compressor fault diagnosis is to identify the feature distribution patterns of fault modes in a multi-dimensional feature space. If the original features of each mode are directly spliced together without weighting adjustment, the three modal features will have the same representation intensity in the fusion vector, which cannot reflect how strongly each fault type depends on each mode. For example, bearing wear faults should highlight vibration and temperature features while weakening pressure features. However, simple splicing makes the numerical influence of pressure features comparable to that of vibration and temperature features, causing confusion in the feature space and misjudgment by the diagnostic model. Therefore, it is necessary to establish an adaptive weighting mechanism based on weight values to ensure that the numerical distribution of the fusion feature vector matches the physical nature of the current fault mode.
[0079] Furthermore, the vibration feature, pressure feature, and temperature feature are adaptively weighted and adjusted according to a set of modal feature weight values. The adjusted multimodal features are then concatenated and fused to generate a multimodal fusion feature vector. Specifically, this includes: parsing the vibration feature weight values corresponding to the vibration mode, the pressure feature weight values corresponding to the pressure mode, and the temperature feature weight values corresponding to the temperature mode from a set of modal feature weight values; performing scalar multiplication of the vibration feature weight values with each feature element in the vibration feature to obtain the weighted vibration feature; performing scalar multiplication of the pressure feature weight values with each feature element in the pressure feature to obtain the weighted pressure feature; performing scalar multiplication of the temperature feature weight values with each feature element in the temperature feature to obtain the weighted temperature feature; and concatenating the weighted vibration feature, weighted pressure feature, and weighted temperature feature according to a preset modal dimension order to generate a multimodal fusion feature vector.
[0080] Specifically, independent weight values corresponding to each mode are extracted from a set of modal feature weight values calculated above. The weight values obtained after attention mechanism processing are usually presented in a normalized form, with the sum of the three modal weights strictly equal to 1.0, ensuring the probabilistic meaning of the weight allocation. Taking the bearing wear failure moment of a 2D12 compressor as an example, the set of modal feature weight values output by the attention mechanism is [0.45, 0.18, 0.37]. This triplet is arranged according to a preset modal order. The first element, 0.45, corresponds to the vibration mode, which is the vibration feature weight value wV; the second element, 0.18, corresponds to the pressure mode, which is the pressure feature weight value wP; and the third element, 0.37, corresponds to the temperature mode, which is the temperature feature weight value wT. The magnitude of the weight value directly reflects the diagnostic contribution of the modal feature in the current fault state. The vibration weight 0.45 and the temperature weight 0.37 are significantly higher than the pressure weight 0.18, clearly indicating that the mechanical fault of bearing wear mainly relies on vibration and temperature information for identification, while the contribution of pressure information is relatively small.
[0081] The core mathematical operation for achieving weighted feature representation is to calculate the weighted vibration feature by performing scalar multiplication between the vibration feature weight value and each feature element in the vibration feature. The vibration feature extraction submodule outputs a 6-dimensional vibration feature vector V=[v1, v2, v3, v4, v5, v6], where v1 is the peak feature, v2 is the kurtosis feature, v3 is the impulse factor feature, and v4 / v5 / v6 are the spectral peak features at 100Hz / 200Hz / 400Hz, respectively. The vibration characteristic vector at the moment of bearing wear is V = [1.1g, 6.8, 2.0, 0.27g, 0.28g, 0.09g²]. The vibration characteristic weight value wV = 0.45 is multiplied by each element of this vector one by one. The calculation formula is weighted vibration characteristic Vweighted = [wV×v1, wV×v2, wV×v3, wV×v4, wV×v5, wV×v6]. Substituting the values, we get Vweighted = [0.45×1.1, 0.45×6.8, 0.45×2.0, 0.45×0.27, 0.45×0.28, 0.45×0.09] = [0.495g, 3.06, 0.90, 0.122g, 0.126g, 0.041g²]. The physical meaning of the weighting operation is to scale the values of vibration features according to their importance in the current fault state. A weight of 0.45, which is less than 1.0, means that the vibration features are moderately suppressed compared to their original values, leaving more room for the representation of temperature features. This proportional scaling ensures that the numerical intensity of each modal feature in the fused feature vector strictly corresponds to its diagnostic contribution.
[0082] The weighted pressure feature is calculated by scalar multiplication of the pressure feature weight value with each feature element in the pressure feature, following the same mathematical principle as the vibration feature weighting. The pressure feature extraction submodule outputs a 3-dimensional pressure feature vector P=[p1, p2, p3], where p1 is the pressure peak feature, p2 is the pressure fluctuation amplitude feature, and p3 is the pressure rise / fall rate feature. The pressure feature vector at the bearing wear moment is P=[4.2MPa, 0.35MPa, 0.2MPa / ms]. The pressure feature weight value wP=0.18 is multiplied with each element of this vector to obtain the weighted pressure feature Pweighted=[0.18×4.2, 0.18×0.35, 0.18×0.2]=[0.756MPa, 0.063MPa, 0.036MPa / ms]. Comparing the values before and after weighting shows that the pressure peak was scaled down from the original 4.2 MPa to 0.756 MPa, a reduction of 82%. This significant reduction reflects the physical fact that bearing wear failure is not strongly correlated with pressure changes, ensuring that pressure features do not have an excessive impact on the fusion vector and do not interfere with the diagnostic model's identification of mechanical fault features.
[0083] The weighted temperature feature is obtained by scalar multiplication of the temperature feature weight value with each feature element in the temperature feature, thus completing the weighting of the three modal features. The temperature feature extraction submodule outputs a 3-dimensional temperature feature vector T=[t1, t2, t3], where t1 is the temperature mean feature, t2 is the temperature fluctuation range feature, and t3 is the temperature gradient feature. The temperature feature vector at the bearing wear moment is T=[80℃, 22℃, 2.8℃ / min]. The temperature feature weight value wT=0.37 is multiplied with each element of this vector to obtain the weighted temperature feature Tweighted=[0.37×80, 0.37×22, 0.37×2.8]=[29.6℃, 8.14℃, 1.04℃ / min]. The temperature mean was scaled down from the original 80℃ to 29.6℃, a reduction of 63%, which is much smaller than the 82% reduction in the pressure feature. This indicates that the temperature feature retains a stronger characterization ability in the fused vector, which is consistent with the mechanism that bearing wear leads to frictional heat generation, making temperature a key diagnostic indicator.
[0084] The final step in constructing a unified feature representation is to concatenate the weighted vibration features, pressure features, and temperature features according to the preset modal dimension order to generate a multimodal fusion feature vector. Using a sequential concatenation operation, the 6-dimensional weighted vibration feature Vweighted, the 3-dimensional weighted pressure feature Pweighted, and the 3-dimensional weighted temperature feature Tweighted are concatenated end-to-end according to the dimension order of [vibration; pressure; temperature] to form a 12-dimensional multimodal fusion feature vector F=[Vweighted; Pweighted; Tweighted]. The fused feature vector for the bearing wear moment, F = [0.495g, 3.06, 0.90, 0.122g, 0.126g, 0.041g², 0.756MPa, 0.063MPa, 0.036MPa / ms, 29.6℃, 8.14℃, 1.04℃ / min], preserves the physical meaning and dimensional units of the three modal features. At the same time, the weighting mechanism adaptively adjusts the feature intensity so that, across the entire 12-dimensional vector, the values of the temperature-related dimensions 10 to 12 (29.6, 8.14, 1.04) and the values of the vibration-related dimensions 1 to 6 (0.495, 3.06, 0.90, 0.122, 0.126, 0.041) are significantly higher in magnitude than those of the pressure-related dimensions 7 to 9 (0.756, 0.063, 0.036). This numerical distribution matches the characteristic pattern of bearing wear failure. After the fused feature vector is input into the Transformer diagnostic model, the model can quickly identify the mechanical failure type corresponding to the high numerical combination of temperature and vibration features.
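The scalar weighting and concatenation of S106 can be sketched as follows. The function name `fuse` is hypothetical and units are dropped, but the input values and weights are those of the bearing wear example.

```python
def fuse(vibration, pressure, temperature, weights):
    """Scale each modal feature vector by its weight, then concatenate
    in the preset order [vibration; pressure; temperature]."""
    w_v, w_p, w_t = weights
    weighted_v = [w_v * x for x in vibration]
    weighted_p = [w_p * x for x in pressure]
    weighted_t = [w_t * x for x in temperature]
    return weighted_v + weighted_p + weighted_t


V = [1.1, 6.8, 2.0, 0.27, 0.28, 0.09]   # 6-dimensional vibration features
P = [4.2, 0.35, 0.2]                    # 3-dimensional pressure features
T = [80.0, 22.0, 2.8]                   # 3-dimensional temperature features

F = fuse(V, P, T, (0.45, 0.18, 0.37))   # 12-dimensional fused vector
print([round(x, 3) for x in F])
```

The result agrees with the fused vector in the description to within rounding of the last quoted digit.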
[0085] S107: Input the multimodal fusion feature vector into the preset deep learning diagnostic model for processing to obtain the fault diagnosis result of the reciprocating compressor.
[0086] In S107 above, before the multimodal fusion feature vector is input into the preset deep learning diagnostic model to obtain the fault diagnosis result of the reciprocating compressor, the preset deep learning diagnostic model needs to be constructed. Specifically, this includes: acquiring a historical multimodal fusion feature training dataset, which includes historical multimodal fusion feature samples and corresponding real diagnostic labels; constructing an initial network model, which sequentially includes a multimodal feature input layer, a feature embedding and position encoding layer, a Transformer encoder stack layer, a global feature aggregation layer, and a fault diagnosis output layer, where the Transformer encoder stack layer is composed of multiple cascaded Transformer encoder units with identical structures, and each Transformer encoder unit sequentially includes a multi-head self-attention module, a first residual connection and layer normalization module, a feedforward neural network module, and a second residual connection and layer normalization module; and inputting the training dataset into the initial network model for forward propagation. During forward propagation, the data passes through the multimodal feature input layer for standardization and dimensionality alignment to obtain aligned feature sequences. These sequences are then mapped to a high-dimensional latent space via the feature embedding and position encoding layer, which injects positional encoding information to obtain sequence features with positional information. The Transformer encoder stack layer performs global dependency learning and deep feature extraction to output deep sequence features. The global feature aggregation layer reduces dimensionality and aggregates these features into a global fault feature vector. Finally, the fault diagnosis output layer outputs the diagnostic prediction result. 
Based on the diagnostic prediction result and the true diagnostic label, the loss value is calculated, and the network parameters of the initial network model are updated using a backpropagation algorithm until the loss value meets the preset convergence condition. The trained initial network model is then used as the preset deep learning diagnostic model.
[0087] Specifically, reciprocating compressor fault diagnosis is essentially a complex nonlinear mapping problem. It requires establishing a mapping function from a 12-dimensional fused feature space to a fault category space. This mapping relationship cannot be explicitly expressed by traditional mathematical formulas and must rely on deep neural networks to perform implicit learning through a large number of labeled samples. Therefore, the first step in building a pre-defined deep learning diagnostic model is to obtain a historical multimodal fused feature training dataset. This dataset includes historical multimodal fused feature samples and corresponding real diagnostic labels to ensure that each feature sample has clear labeling information on fault type, faulty component, and severity for the model to learn.
[0088] Then, historical sample data that has undergone feature fusion processing is extracted. These samples cover the full data of two compressor types, 2D12 and 6M50, under six operating conditions: normal operation, bearing wear, crankshaft crack, piston ring damage, airflow pulsation, and pressure fluctuation. Each historical multimodal fusion feature sample is a 12-dimensional vector containing 6-dimensional vibration features, 3-dimensional pressure features, and 3-dimensional temperature features after weighted adjustment and splicing fusion. The corresponding true diagnostic label is encoded in the form of a triple [fault type, faulty component, severity]. Fault types include three categories: normal, mechanical fault, and process fault. Faulty components include six parts: cylinder, bearing, crankshaft, piston ring, valve, and pipeline. Severity includes four levels: normal, mild, moderate, and severe. Taking the moment of bearing wear in a 2D12 compressor as an example, the historical multimodal fusion feature sample at that moment is F=[0.495g, 3.06, 0.90, 0.122g, 0.126g, 0.041g², 0.756MPa, 0.063MPa, 0.036MPa / ms, 29.6℃, 8.14℃, 1.04℃ / min]. The corresponding real diagnostic label is [mechanical fault, bearing, moderate]. After one-hot encoding, it is converted into the numerical labels [0, 1, 0], indicating that the fault type is mechanical fault, [0, 1, 0, 0, 0, 0], indicating that the faulty component is the bearing, and [0, 0, 1, 0], indicating that the severity is moderate. The training dataset consists of 14,400 groups, representing 80% of the total 18,000 samples. Among them, 2,400 groups are for normal operating conditions, 2,400 groups for bearing wear, 2,400 groups for crankshaft cracks, 2,400 groups for piston ring damage, 2,400 groups for airflow pulsation, and 2,400 groups for pressure fluctuations. 
This ensured that the number of samples in each category was balanced to avoid bias in model training towards a particular fault type. At the same time, the training dataset was divided into a training subset of 11,520 groups and a validation subset of 2,880 groups in an 8:2 ratio to monitor overfitting during model training.
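The label encoding and the dataset split sizes can be illustrated with a short Python sketch. The category orderings are assumed to follow the order in which the description lists them, which is consistent with the encoded example; the helper names `one_hot`, `encode_label`, and `split_counts` are hypothetical.

```python
FAULT_TYPES = ["normal", "mechanical fault", "process fault"]
COMPONENTS = ["cylinder", "bearing", "crankshaft", "piston ring", "valve", "pipeline"]
SEVERITIES = ["normal", "mild", "moderate", "severe"]


def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value}")
    return [1 if c == value else 0 for c in categories]


def encode_label(fault_type, component, severity):
    """Encode the [fault type, faulty component, severity] triple."""
    return (
        one_hot(fault_type, FAULT_TYPES),
        one_hot(component, COMPONENTS),
        one_hot(severity, SEVERITIES),
    )


def split_counts(total, numerator=8, denominator=10):
    """Split `total` samples in an 8:2 ratio using integer arithmetic."""
    first = total * numerator // denominator
    return first, total - first


label = encode_label("mechanical fault", "bearing", "moderate")
train_total, test_total = split_counts(18000)          # training set vs. remainder
train_subset, val_subset = split_counts(train_total)   # training vs. validation subset
print(label, train_total, train_subset, val_subset)
```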
[0089] When constructing the initial network model, five functional modules are sequentially built according to the data flow direction from input to output to form an end-to-end diagnostic inference architecture. The first layer, the multimodal feature input layer, includes a feature vector interface and a normalization unit. The feature vector interface receives 12-dimensional historical multimodal fusion feature samples, and the normalization unit performs Z-score standardization on the input features to ensure that the mean of each dimension is 0 and the variance is 1, eliminating the influence of different physical dimensions on the training of the neural network. The second layer, the feature embedding and position encoding layer, includes a linear embedding layer and a position encoding unit. The linear embedding layer is implemented by a 12×128 weight matrix, mapping the 12-dimensional input features to a 128-dimensional high-dimensional latent space. The position encoding unit uses a sine and cosine position encoding scheme to generate a unique position vector for each position in the feature sequence and adds it to the embedded features, enabling the model to perceive the temporal or structural relationships of the feature sequence. The third layer, the Transformer encoder stack layer, consists of six cascaded Transformer encoder units with identical structures. Each encoder unit includes, in sequence, a multi-head self-attention module, a first residual connection and layer normalization module, a feedforward neural network module, and a second residual connection and layer normalization module. The multi-head self-attention module has eight attention heads, each with a hidden dimension of 128 ÷ 8 = 16. Attention weights are calculated using the Q / K / V projection matrices and then used for weighted fusion. The first residual connection adds the input and output of the attention module and feeds the sum into the layer normalization module to stabilize the training process. 
The feedforward neural network module consists of two fully connected layers and a GELU activation function. The first fully connected layer expands the 128-dimensional features to 512 dimensions for nonlinear transformation. The second fully connected layer compresses the 512-dimensional features back to 128 dimensions. The dropout coefficient is set to 0.25 to prevent overfitting. The second residual connection adds the input and output of the feedforward network and then performs layer normalization again. The fourth layer, the global feature aggregation layer, contains a global average pooling unit and a feature compression unit. The global average pooling unit averages the 128-dimensional sequence features output by the sixth encoder along the sequence dimension to obtain a single 128-dimensional global fault feature vector. The feature compression unit further compresses the feature dimension to 64 dimensions through a 128×64 fully connected layer. The fifth layer, the fault diagnosis output layer, contains three parallel fully connected classification layers that output diagnostic results for fault type, faulty component, and severity, respectively. The fault type classification layer contains a 64×3 weight matrix and a Softmax activation function to output a 3-dimensional probability distribution. The faulty component classification layer contains a 64×6 weight matrix to output a 6-dimensional probability distribution. The severity classification layer contains a 64×4 weight matrix to output a 4-dimensional probability distribution. The initial network model has approximately 1.2 million parameters, and all weight matrices are initialized with random initial values using the Xavier initialization method.
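As a rough plausibility check on the stated model size, the layer dimensions above can be turned into a parameter count. This sketch assumes details the description does not state: every fully connected layer has a bias, each attention module has four 128×128 projections (Q, K, V, and output), each layer normalization has a learnable scale and shift, and the sine and cosine position encoding has no trainable parameters. All function names are hypothetical.

```python
def linear(n_in, n_out):
    """Parameters of a fully connected layer with bias (assumed)."""
    return n_in * n_out + n_out


def layer_norm(d):
    """Learnable scale and shift per feature (assumed)."""
    return 2 * d


def encoder_unit(d_model=128, d_ff=512):
    """One Transformer encoder unit as described: attention, FFN, two norms."""
    attention = 4 * linear(d_model, d_model)            # Q, K, V, output projections
    feed_forward = linear(d_model, d_ff) + linear(d_ff, d_model)
    return attention + feed_forward + 2 * layer_norm(d_model)


def total_parameters(num_encoders=6):
    embedding = linear(12, 128)                         # 12 -> 128 linear embedding
    encoders = num_encoders * encoder_unit()
    compression = linear(128, 64)                       # 128 -> 64 feature compression
    heads = linear(64, 3) + linear(64, 6) + linear(64, 4)
    return embedding + encoders + compression + heads


print(total_parameters())
```

Under these assumptions the count comes to about 1.2 million, in line with the figure given in the description.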
[0090] When the training dataset is input into the initial network model for forward propagation, a 12-dimensional feature matrix of 64 samples in a batch is input into the multimodal feature input layer. The normalization unit independently calculates the mean and standard deviation of each feature dimension to standardize and obtain an aligned feature sequence with a shape of [64, 12]. Then, through the linear embedding matrix of the feature embedding and position encoding layers, it is transformed into an embedded feature of shape [64, 128]. The position encoding unit generates 128-dimensional position vectors for each of the 12 feature positions and adds them element-wise to the embedded features to obtain sequence features with positional information. The sequence features are sequentially passed through 6 Transformer encoder units. In the first encoder layer, the multi-head self-attention module calculates the global dependency matrix within the sequence features and performs weighted aggregation. Residual connections and layer normalization stabilize the feature distribution. The feedforward neural network independently performs nonlinear transformations on each position feature to extract higher-order features. The output deep sequence features are passed to the second encoder layer to repeat the above processing. After stacking 6 encoder layers, the shape of the output deep sequence features is still [64, 128], but the feature representation ability is significantly enhanced. The global feature aggregation layer performs average pooling along the second dimension on the deep sequence features to obtain a global fault feature vector of shape [64, 128]. Then, a 64-dimensional compression layer is used to obtain compressed features of shape [64, 64]. 
The three classification layers of the fault diagnosis output layer perform linear transformation and Softmax normalization on the compressed features, respectively, and output the fault type probability distribution of shape [64, 3], the fault component probability distribution of shape [64, 6], and the severity probability distribution of shape [64, 4] as the diagnostic prediction results.
[0091] When calculating the loss value based on the diagnostic prediction results and the actual diagnostic labels and updating the network parameters through the backpropagation algorithm, the cross-entropy loss function is used to calculate the fault type loss, faulty component loss, and severity loss respectively. The three losses are summed with a weight ratio of 1:1:1 to obtain the total loss value. Taking the bearing wear sample as an example, the predicted probability of fault type is [0.05, 0.90, 0.05], the true label is [0, 1, 0], and the cross-entropy loss is -log(0.90)=0.105. The predicted probability of faulty component is [0.02, 0.92, 0.02, 0.02, 0.01, 0.01], the true label is [0, 1, 0, 0, 0, 0], and the cross-entropy loss is -log(0.92)=0.083. The predicted probability of severity is [0.05, 0.10, 0.80, 0.05], the true label is [0, 0, 1, 0], and the cross-entropy loss is -log(0.80)=0.223. The total loss is (0.105+0.083+0.223) / 3=0.137. The SGD optimizer was configured with a momentum of 0.9 and a weight decay of 1e-4, and the learning rate was set to 0.002. The gradient of the total loss with respect to all network parameters was calculated using the backpropagation algorithm, and the parameters were updated according to the gradient descent direction. After training for 120 epochs, the loss value converged from the initial 2.5 to 0.023±0.002, and the accuracy on the validation set reached 99.3%. The trained initial network model was saved as a preset deep learning diagnostic model for actual diagnostic inference.
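The loss arithmetic for the bearing wear sample can be reproduced with the natural logarithm. The sketch below combines the three task losses by an equal-weight average, matching the division by 3 in the example; the function name `cross_entropy` is hypothetical.

```python
import math


def cross_entropy(predicted, one_hot_label):
    """Cross-entropy for a one-hot label: minus the log of the probability
    assigned to the true class."""
    return -sum(t * math.log(p) for p, t in zip(predicted, one_hot_label) if t)


type_loss = cross_entropy([0.05, 0.90, 0.05], [0, 1, 0])
component_loss = cross_entropy([0.02, 0.92, 0.02, 0.02, 0.01, 0.01], [0, 1, 0, 0, 0, 0])
severity_loss = cross_entropy([0.05, 0.10, 0.80, 0.05], [0, 0, 1, 0])

# Equal weights 1:1:1, averaged over the three tasks as in the worked example.
total_loss = (type_loss + component_loss + severity_loss) / 3
print(round(type_loss, 3), round(component_loss, 3), round(severity_loss, 3), round(total_loss, 3))
```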
[0092] Furthermore, after constructing the preset deep learning diagnostic model, the multimodal fusion feature vector is input into the preset deep learning diagnostic model for processing to obtain the fault diagnosis result of the reciprocating compressor. Specifically, this includes: inputting the multimodal fusion feature vector into the multimodal feature input layer of the preset deep learning diagnostic model, and performing standardization and dimensional alignment processing to obtain an aligned feature sequence; inputting the aligned feature sequence into the feature embedding and position encoding layer, mapping the aligned feature sequence to a high-dimensional latent space through a linear embedding layer, and superimposing sinusoidal position encoding information to obtain a sequence feature with position information; inputting the sequence feature with position information into the Transformer encoder stack layer, computing the global dependencies between modal features in the sequence feature through the multi-head self-attention module, and performing nonlinear transformation through the residual connection, layer normalization, and feedforward neural network modules to output deep sequence features; inputting the deep sequence features into the global feature aggregation layer, where the global average pooling unit and the feature compression unit reduce the dimensionality of the deep sequence features and aggregate them into a global fault feature vector; and inputting the global fault feature vector into the fault diagnosis output layer, where a fully connected layer and a classification activation function classify and map the global fault feature vector, outputting the fault type, fault location, and severity of the reciprocating compressor. The fault type, fault location, and severity are combined as the fault diagnosis result.
[0093] Specifically, the 12-dimensional multimodal fusion feature vector is input into the multimodal feature input layer of the preset deep learning diagnostic model for standardization and dimensional alignment to obtain an aligned feature sequence. Taking a 2D12 compressor at a certain operating moment as an example, the multimodal fusion feature vector generated after fusion via the attention mechanism is F=[0.52g, 3.2, 0.95, 0.13g, 0.14g, 0.045g², 0.8MPa, 0.07MPa, 0.04MPa / ms, 31℃, 8.5℃, 1.1℃ / min]. This 12-dimensional vector contains feature values with different physical dimensions, resulting in significant differences in numerical range. Numerically, the temperature mean of 31℃ is nearly 60 times the vibration peak of 0.52g. If the vector were input directly into the neural network, the features with larger values would dominate the gradient calculation while the features with smaller values would be ignored. Therefore, the normalization unit of the multimodal feature input layer standardizes each dimension of the input vector independently by Z-score, using the mean and standard deviation of that dimension. For the first dimension, the vibration peak feature, the training set mean is 0.75g and the standard deviation is 0.25g, so the standardized value is (0.52-0.75) / 0.25=-0.92. For the 10th dimension, the temperature mean feature, the training set mean is 60℃ and the standard deviation is 15℃, so the standardized value is (31-60) / 15=-1.93. After standardizing each of the 12 dimensions, the aligned feature sequence S=[-0.92, 0.48, 0.15, -0.36, -0.28, -0.15, -1.20, -0.85, -0.72, -1.93, -0.58, -0.45] is obtained. After standardization, all feature values fall within the interval [-2, 2], with a mean close to 0 and a variance close to 1. This eliminates the dimensional differences between modal features and ensures that the neural network treats all input dimensions on an equal footing.
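The two standardized values worked out above follow directly from the Z-score formula. The sketch below uses only the training set statistics that the description provides (dimensions 1 and 10); the function name `z_score` is hypothetical.

```python
def z_score(value, mean, std):
    """Standardize one feature value with training set statistics."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std


# Dimension 1: vibration peak (g); dimension 10: temperature mean (℃).
vibration_peak_z = z_score(0.52, mean=0.75, std=0.25)
temperature_mean_z = z_score(31.0, mean=60.0, std=15.0)
print(round(vibration_peak_z, 2), round(temperature_mean_z, 2))
```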
[0094] The aligned feature sequence is input into the feature embedding and positional encoding layer. A linear embedding layer maps the 12-dimensional aligned feature sequence to a 128-dimensional high-dimensional latent space and superimposes sinusoidal positional encoding information to obtain sequence features with positional information. The linear embedding layer consists of a 12×128 weight matrix Wembed and a 128-dimensional bias vector bembed. The aligned feature sequence S is multiplied by the weight matrix and the bias is added to obtain the embedded feature E = S×Wembed + bembed. The shape of the embedded feature E is [1, 128], representing the mapping of 12 input features to a 128-dimensional vector. The purpose of the high-dimensional mapping is to expand the feature representation space so that the model can learn more complex feature combinations. The positional encoding unit uses a sine and cosine positional encoding scheme to generate a unique position vector for each feature position. For the i-th feature position and the j-th encoding dimension, the positional encoding is PE(i, j) = sin(i / 10000^(2j / 128)) when j is even, and PE(i, j) = cos(i / 10000^(2j / 128)) when j is odd. The 128-dimensional positional encoding vector generated for the first feature position is PE1 = [sin(1 / 10000^0), cos(1 / 10000^(2 / 128)), sin(1 / 10000^(4 / 128)), ..., cos(1 / 10000^(126 / 128))]. The position encoding vector is added element by element to the embedded features to obtain the sequence feature H=E+PE with position information. The shape of the sequence feature is [1, 128]. By injecting position encoding information, the model can perceive the relative positional relationship of the 12 input features in the fusion vector. For example, the vibration features from position 1 to 6 and the pressure features from position 7 to 9 have clear modal affiliations in a physical sense. 
Position encoding helps the model recognize this structured information.
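For illustration only, the embedding and positional encoding steps of this paragraph can be sketched as follows. The weight values are random placeholders rather than trained parameters, and all function names are hypothetical. The exponent 2j/128 is applied per dimension as the paragraph's formula states; this indexing is an assumption of the sketch, since the paragraph's example vector ends at an exponent of 126/128, which corresponds to the common convention of sharing one exponent per sine/cosine pair.

```python
import math
import random

D_IN, D_MODEL = 12, 128


def linear_embed(s, w, b):
    """E = S x W + b for one 12-dim row vector and a 12x128 weight matrix."""
    return [sum(s[k] * w[k][j] for k in range(len(s))) + b[j]
            for j in range(len(b))]


def positional_encoding(i, d_model=D_MODEL):
    """PE(i, j): sine for even j, cosine for odd j, exponent 2j/d_model."""
    pe = []
    for j in range(d_model):
        angle = i / (10000 ** (2 * j / d_model))
        pe.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
    return pe


rng = random.Random(0)  # placeholder (untrained) parameters, for shape only
w_embed = [[rng.gauss(0.0, 0.02) for _ in range(D_MODEL)] for _ in range(D_IN)]
b_embed = [0.0] * D_MODEL
s = [-0.92, 0.48, 0.15, -0.36, -0.28, -0.15,
     -1.20, -0.85, -0.72, -1.93, -0.58, -0.45]

e = linear_embed(s, w_embed, b_embed)      # embedded feature E, 128 values
pe1 = positional_encoding(1)               # encoding for feature position 1
h = [ej + pj for ej, pj in zip(e, pe1)]    # H = E + PE
```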
[0095] The sequence features with positional information are input into the Transformer encoder stack. A multi-head self-attention module calculates the global dependencies between the modal features in the sequence features, and nonlinear transformation is performed through residual connection, layer normalization, and a feedforward neural network module to output deep sequence features. The first layer of the Transformer encoder receives the input sequence feature H. The multi-head self-attention module uses 8 attention heads and maps the 128-dimensional input feature to a query matrix Q, a key matrix K, and a value matrix V through the respective Q / K / V projection matrices. Each head has a dimension of 128 ÷ 8 = 16. The first attention head calculates the attention weight matrix $A = \mathrm{softmax}(Q K^{T} / \sqrt{16})$. For a single sample, the attention weight matrix has shape [1, 1] and degenerates into a scalar. The weighted output O=A×V is calculated to obtain the output of this head. The outputs of the 8 heads are concatenated and passed through the output projection matrix to obtain a multi-head attention output of shape [1, 128]. This output is added to the input sequence feature H through a residual connection, and layer normalization is applied to obtain the normalized feature. The normalized feature is input to the feedforward neural network module. The first fully connected layer expands the 128-dimensional feature to 512 dimensions and applies the GELU activation function for nonlinear transformation. The second fully connected layer compresses the 512-dimensional feature back to 128 dimensions, and dropout randomly deactivates 25% of the neurons to prevent overfitting. The feedforward network output is again added to the normalized feature through a residual connection, followed by layer normalization, to obtain the output of the first encoder layer. This output is passed to the second encoder layer, which repeats the above processing. After 6 stacked encoder layers, the output deep sequence feature still has shape [1, 128], but its representation ability is significantly enhanced, and it can capture high-order interactions between multimodal features.
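The degenerate single-sample case mentioned above can be checked with a minimal sketch of one scaled dot-product attention head. The input values are arbitrary placeholders and the function names are hypothetical; the projection matrices, the remaining seven heads, and the feedforward module are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [x / total for x in exps]


def attention_head(q, k, v):
    """A = softmax(Q K^T / sqrt(d_k)), O = A V; q, k, v are lists of rows."""
    d_k = len(k[0])
    scores = [[sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d_k) for kj in k]
              for qi in q]
    weights = [softmax(row) for row in scores]
    out = [[sum(w * vj[c] for w, vj in zip(row, v)) for c in range(len(v[0]))]
           for row in weights]
    return weights, out


# One sequence element with a 16-dimensional head (placeholder values)
q = [[0.1 * n for n in range(16)]]
k = [[0.05 * n for n in range(16)]]
v = [[float(n) for n in range(16)]]
a, o = attention_head(q, k, v)
print(a)  # [[1.0]]
```

Because the softmax of a single score is always 1, the head output equals V in this case, regardless of the values of Q and K.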
[0096] Deep sequence features are input into a global feature aggregation layer. A global average pooling unit and a feature compression unit perform dimensionality reduction and aggregation on the deep sequence features to generate a global fault feature vector. The global average pooling unit directly uses the 128-dimensional vector of deep sequence features as the global feature without pooling operations because the input is a single sample and a single sequence. The feature compression unit is implemented by a 128×64 fully connected layer, compressing the 128-dimensional deep features into a 64-dimensional global fault feature vector G. The purpose of the compression operation is to extract the most critical fault discrimination features and reduce the computational complexity of subsequent classification layers. The 64-dimensional global fault feature vector contains highly abstract fault mode information extracted from the original 12-dimensional fused features through multiple nonlinear transformations, providing a compact and information-rich feature representation for final diagnosis and classification.
[0097] The global fault feature vector is input into the fault diagnosis output layer. A fully connected layer and a classification activation function are used to classify and map the global fault feature vector, outputting the fault type, fault location, and severity of the reciprocating compressor. The combination of these three factors is used as the fault diagnosis result. The fault diagnosis output layer contains three parallel fully connected classification branches. The fault type classification branch consists of a 64×3 weight matrix and a Softmax activation function, mapping the 64-dimensional global feature vector to a 3-dimensional output vector [pnormal, pmechanical, pprocess]. After Softmax normalization, the sum of the three probability values is 1.0. Taking a 2D12 compressor at this moment as an example, the output [0.05, 0.15, 0.80] represents a 5% probability of normal operation, a 15% probability of mechanical fault, and an 80% probability of process fault. The category corresponding to the highest probability, i.e., process fault, is selected as the fault type diagnosis result.
[0098] The fault location classification branch maps the global feature vector through a 64×6 weight matrix to a 6-dimensional output vector [pcylinder, pbearing, pcrankshaft, ppiston, pvalve, ppipe]. The output [0.02, 0.03, 0.02, 0.02, 0.05, 0.86] indicates that the probability of a pipeline fault is the highest at 86%, so the fault location is determined to be the pipeline. The severity classification branch maps the global feature vector through a 64×4 weight matrix to a 4-dimensional output vector [pnormal, pmild, pmoderate, psevere]. The output [0.10, 0.75, 0.12, 0.03] indicates that the probability of a mild fault is the highest at 75%, so the severity is determined to be mild. The three classification results are combined to obtain the complete fault diagnosis result [process fault, pipeline, mild]. Comparison with the actual working conditions verified that the fault was a mild fluctuation in the pressure of the inlet and outlet pipelines caused by airflow pulsation, so the diagnosis was accurate. The entire inference process, from inputting the fused features to outputting the diagnosis result, took about 15ms, which meets the requirements of real-time diagnosis. This example shows that the preset deep learning diagnostic model can complete the automatic fault diagnosis task for reciprocating compressors efficiently and accurately.
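The decoding of the three classification branches can be sketched as follows. This is a minimal illustration that takes the Softmax outputs quoted in the example as given; the label lists and function names are hypothetical renderings of the categories named in the text.

```python
FAULT_TYPES = ["normal", "mechanical fault", "process fault"]
LOCATIONS = ["cylinder", "bearing", "crankshaft", "piston", "valve", "pipeline"]
SEVERITIES = ["normal", "mild", "moderate", "severe"]


def argmax_label(labels, probs):
    """Return the label whose probability is highest."""
    if len(labels) != len(probs):
        raise ValueError("label and probability counts differ")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best]


def decode(type_probs, location_probs, severity_probs):
    """Combine the three branch outputs into one diagnosis tuple."""
    return (argmax_label(FAULT_TYPES, type_probs),
            argmax_label(LOCATIONS, location_probs),
            argmax_label(SEVERITIES, severity_probs))


result = decode([0.05, 0.15, 0.80],
                [0.02, 0.03, 0.02, 0.02, 0.05, 0.86],
                [0.10, 0.75, 0.12, 0.03])
print(result)  # ('process fault', 'pipeline', 'mild')
```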
[0099] As shown in Figure 2, the architecture of the entire process from raw multimodal data acquisition to feature fusion output is illustrated. The topmost box in the figure represents the raw sensor data of three different modes (vibration signal, pressure signal, and temperature signal) that are acquired simultaneously from the key parts of the reciprocating compressor. The data of these three modes are transmitted in real time through a 5G industrial gateway to the cloud big data platform for subsequent processing. The data preprocessing stage uses three parallel submodules to process the different modal data. The left box shows the wavelet packet decomposition algorithm used to denoise and detrend the vibration signal. With 6-level decomposition on the db5 wavelet basis and 32 wavelet packet nodes, the signal-to-noise ratio of the vibration signal is improved from 30dB to over 48dB, effectively removing environmental noise and baseline drift. The middle box shows the moving average filtering algorithm used to filter the pressure signal. A filter with a window size of 10 and a filtering frequency of 0.1Hz removes periodic interference caused by airflow pulsation and corrects the pressure baseline, with a maximum correction of 0.2MPa. The right box shows the preset threshold judgment and linear interpolation algorithm used to handle outliers in the temperature signal. Based on the 3σ criterion, outlier temperature data points, which account for no more than 0.3% of the data, are removed, and missing data, which account for no more than 0.5%, are filled in by linear interpolation, ensuring the integrity and continuity of the temperature sequence.
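By way of a hedged example, the pressure and temperature preprocessing branches described above can be sketched as follows. Only the window size of 10 and the 3σ criterion are taken from the text; the trailing-window form of the filter, the handling of gaps at the sequence ends, and all function names are assumptions of this sketch. The wavelet packet branch and the 0.1Hz filtering frequency are not modeled.

```python
import statistics


def moving_average(signal, window=10):
    """Trailing moving average; early samples average over what is available."""
    out, running = [], 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= window:
            running -= signal[i - window]
        out.append(running / min(i + 1, window))
    return out


def remove_outliers_3sigma(signal):
    """Mark points farther than 3 standard deviations from the mean as None."""
    mu = statistics.fmean(signal)
    sigma = statistics.pstdev(signal)
    return [x if abs(x - mu) <= 3 * sigma else None for x in signal]


def fill_linear(signal):
    """Fill None gaps by linear interpolation between valid neighbours."""
    valid = [i for i, x in enumerate(signal) if x is not None]
    if not valid:
        raise ValueError("no valid samples to interpolate from")
    out = list(signal)
    for i, x in enumerate(signal):
        if x is None:
            left = max((j for j in valid if j < i), default=None)
            right = min((j for j in valid if j > i), default=None)
            if left is None:
                out[i] = signal[right]      # gap at the start: copy next value
            elif right is None:
                out[i] = signal[left]       # gap at the end: copy last value
            else:
                t = (i - left) / (right - left)
                out[i] = signal[left] + t * (signal[right] - signal[left])
    return out


# Temperature sequence with one spike (placeholder data, deg C)
temps = [30.0] * 20
temps[10] = 90.0
cleaned = fill_linear(remove_outliers_3sigma(temps))
```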
[0100] The clean data output from the three preprocessing submodules is collected in the middle-layer box for standardization and normalization. The Z-score normalization algorithm converts multimodal data with different dimensions into standardized values with a mean of 0 and a variance of 1, eliminating the dimensional differences between the vibration signal unit (g), the pressure signal unit (MPa), and the temperature signal unit (℃) and laying the data foundation for subsequent feature fusion. The feature extraction stage is divided into three parallel feature extraction branches. The left box represents the time-domain and frequency-domain feature analysis of the processed vibration signal. The time-domain features comprise three dimensions: peak value, kurtosis, and impulse factor. The frequency-domain features likewise comprise three dimensions, covering the spectral peak value and the harmonic energy at the characteristic frequencies of 100Hz, 200Hz, and 400Hz. After vibration feature extraction, a 6-dimensional vibration feature vector is output. The middle box represents the pressure feature analysis of the processed pressure signal, which extracts a 3-dimensional pressure feature vector including the pressure peak value, the pressure fluctuation amplitude, and the pressure rise and fall rate. The right box represents the temperature feature analysis of the processed temperature signal, which extracts a 3-dimensional temperature feature vector including the temperature mean, the temperature fluctuation range, and the temperature gradient.
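The three time-domain vibration features named above can be sketched as follows. The text does not give formulas, so the sketch assumes the conventional definitions: peak as the maximum absolute value, kurtosis as the fourth central moment divided by the squared variance, and impulse factor as the peak divided by the mean absolute value. The function name is hypothetical.

```python
import math


def time_domain_features(x):
    """Return (peak, kurtosis, impulse factor) of a sampled signal."""
    n = len(x)
    mean = sum(x) / n
    peak = max(abs(v) for v in x)
    m2 = sum((v - mean) ** 2 for v in x) / n   # variance
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    mean_abs = sum(abs(v) for v in x) / n
    if m2 == 0 or mean_abs == 0:
        raise ValueError("signal is constant or all zero")
    return peak, m4 / m2 ** 2, peak / mean_abs


# A pure sine wave over one full period as a reference signal
sine = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]
peak, kurtosis, impulse = time_domain_features(sine)
```

For a pure sine wave the kurtosis is 1.5 and the impulse factor is close to π/2; impulsive fault signatures raise both values, which is why these features are commonly used for vibration analysis.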
[0101] The lower rectangle in the diagram represents the deep fusion of the extracted 12-dimensional feature vector (vibration, pressure, and temperature features) through an attention mechanism fusion submodule. The attention mechanism dynamically allocates weights to each modal feature based on the current operating conditions: under normal conditions, vibration features are assigned a weight of 0.45, pressure features 0.35, and temperature features 0.2. Under fault conditions, the weights of fault-sensitive features are automatically increased to optimize the feature weighting, ultimately generating a fused feature vector containing multimodal correlation information. The two boxes at the bottom of the flowchart represent, respectively, (1) the weighting adjustment of the vibration, pressure, and temperature features according to the set of modal feature weight values, followed by concatenation and fusion of the adjusted features into the multimodal fusion feature vector, and (2) the input of the multimodal fusion feature vector into the preset deep learning diagnostic model to obtain a complete diagnostic result including fault type, fault location, and severity. This completes the end-to-end automated processing flow from raw sensor data to fault diagnosis conclusions.
[0102] S108: Generate a fault diagnosis report based on the fault diagnosis results and push the fault diagnosis report to the user terminal in real time.
[0103] In S108 above, generating a fault diagnosis report based on the fault diagnosis results and pushing the report to the user terminal in real time is a crucial step in transforming the output of the AI diagnostic model into a standardized document that can be understood and used for decision-making by engineering technicians, and delivering it to the field in a timely manner. The fundamental purpose is to ensure the completeness, standardization, and timeliness of fault information, avoiding equipment maintenance delays caused by information lag or unclear descriptions. The ultimate goal of fault diagnosis for industrial reciprocating compressors is to guide maintenance personnel to quickly locate faults and implement repair measures. Simple classification labels cannot provide sufficient decision-making basis. It is necessary to integrate the diagnostic results, such as fault type, fault location, and severity, into a structured report that includes equipment information, diagnosis time, characteristic data, diagnostic conclusions, and maintenance suggestions, and deliver it to field personnel in real time via mobile terminals.
[0104] The diagnostic report generation submodule reads the basic equipment information of the current diagnostic object from the platform, including compressor model 2D12, equipment number CF-2024-001, installation location Area A of the metallurgical workshop, rated speed 730 r / min, and rated exhaust pressure 2.5 MPa, and records the diagnostic timestamp March 13, 202x, 14:35:42. It extracts the 12-dimensional original values of the multimodal fusion feature vector for this diagnosis and the weight allocation of each mode. It converts the fault type (process fault), fault location (pipeline), and severity (mild) output by the fault diagnosis output layer into a natural language description: "A slight pressure fluctuation caused by airflow pulsation in the inlet and outlet pipelines was detected. The pressure fluctuation amplitude is 0.7 MPa, which exceeds the normal range of 0.2-0.5 MPa." Based on the fault type and severity, it matches the corresponding maintenance suggestion from the preset knowledge base: "It is recommended to check the pipeline valve opening and airflow distribution system. If necessary, adjust the valve opening and closing sequence to reduce the pulsation amplitude. The maintenance time is expected to be 1-2 hours, which will not affect continuous production." The above information is formatted according to the standard template to generate a fault diagnosis report file in PDF format with a size of approximately 180KB. Once the report is generated, it is pushed to the preset user terminals via the 5G industrial gateway. The user terminals include the field engineer's mobile device, terminal number M-2024-015, and the workshop supervisor's tablet, terminal number T-2024-008. The report is pushed through the WeChat message push interface. A real-time notification pops up on the user terminal: "A minor process fault has been detected in compressor CF-2024-001 of type 2D12. Please check the detailed report." The user can click on the notification to download and view the complete fault diagnosis report. The push delay is no more than 3 seconds, ensuring that field personnel can obtain the fault information and arrange maintenance work as soon as possible, preventing the fault from escalating into an equipment downtime accident, and realizing closed-loop management of the entire process from fault detection to maintenance decision-making.
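As one possible sketch of the structured report described in this paragraph, the fields can be collected in a record and the notification text derived from them. The class, field names, and the severity-to-wording table are hypothetical; JSON serialization stands in for the PDF template, and the WeChat push itself is not modeled.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DiagnosisReport:
    model: str
    equipment_id: str
    timestamp: str
    fault_type: str
    fault_location: str
    severity: str
    suggestion: str


# Hypothetical mapping from severity label to the wording used in notifications
SEVERITY_WORDING = {"mild": "minor", "moderate": "moderate", "severe": "severe"}


def notification_text(report):
    """Build the short pop-up message shown on the user terminal."""
    wording = SEVERITY_WORDING[report.severity]
    return (f"A {wording} {report.fault_type} has been detected in compressor "
            f"{report.equipment_id} of type {report.model}. "
            "Please check the detailed report.")


report = DiagnosisReport(
    model="2D12",
    equipment_id="CF-2024-001",
    timestamp="March 13, 202x, 14:35:42",
    fault_type="process fault",
    fault_location="pipeline",
    severity="mild",
    suggestion="Check the pipeline valve opening and airflow distribution system.",
)
payload = json.dumps(asdict(report))  # structured body to render or transmit
print(notification_text(report))
```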
[0105] An embodiment of this application also provides an AI-based automatic diagnostic system for reciprocating compressors based on multimodal data fusion. The system includes an acquisition unit, a processing unit, a fusion unit, and a diagnostic unit. The acquisition unit acquires vibration signals, pressure signals, and temperature signals from key components of the reciprocating compressor in real time and synchronously. The processing unit preprocesses the vibration, pressure, and temperature signals to obtain preprocessed vibration, pressure, and temperature signals. The fusion unit extracts vibration features, pressure features, and temperature features from the preprocessed vibration, pressure, and temperature signals, respectively. The fusion unit then differentially quantifies the vibration features, pressure features, and temperature features against the corresponding baseline features under normal operating conditions to obtain the vibration evidence strength, pressure evidence strength, and temperature evidence strength, which represent the degree of deviation of each modality from the normal state. It combines these evidence strengths to obtain a real-time anomaly evidence vector, which is then mapped and compared to a preset fault mode template to generate a set of modal feature weight values. The vibration, pressure, and temperature features are adaptively weighted according to these weight values, and the adjusted multimodal features are then concatenated and fused to generate a multimodal fusion feature vector. The diagnostic unit inputs this multimodal fusion feature vector into a preset deep learning diagnostic model for processing, yielding the fault diagnosis result for the reciprocating compressor. A fault diagnosis report is generated based on the result and pushed to the user terminal in real time.
[0106] In one possible implementation, the processing unit is used to perform noise reduction processing on the vibration signal based on the wavelet packet decomposition algorithm to obtain a noise-reduced vibration signal; to perform filtering processing on the pressure signal using a moving average filtering algorithm to obtain a filtered pressure signal; to remove outliers from the temperature signal based on a preset criterion, and to perform data completion processing on the temperature signal after outlier removal using a linear interpolation algorithm to obtain a processed temperature signal; and to perform standardization and normalization processing on the noise-reduced vibration signal, the filtered pressure signal, and the processed temperature signal respectively to obtain a preprocessed vibration signal, a preprocessed pressure signal, and a preprocessed temperature signal.
[0107] In one possible implementation, the fusion unit performs time-domain and frequency-domain analysis on the preprocessed vibration signal, extracting time-domain statistical features and frequency-domain energy features. The time-domain statistical features include root mean square value, peak-to-peak value, skewness, and kurtosis; the frequency-domain energy features include the dominant frequency amplitude extracted based on fast Fourier transform and the energy proportion of each frequency band. The preprocessed pressure signal undergoes thermodynamic and waveform feature analysis to extract pressure features, including maximum exhaust pressure, minimum intake pressure, pressure ratio, pressure pulsation amplitude, and indicated work. The preprocessed temperature signal undergoes statistical and trend analysis to extract temperature features, including average temperature value, temperature variance, and temperature change rate.
[0108] In one possible implementation, the acquisition unit is used to acquire historical multimodal feature data of the reciprocating compressor under normal operating conditions and to calculate the mean vector and covariance matrix of the feature vectors of the vibration mode, pressure mode, and temperature mode respectively, using the mean vector and covariance matrix of each mode as the baseline feature. The fusion unit is used to take the vibration feature, pressure feature, and temperature feature as the real-time vibration feature vector, real-time pressure feature vector, and real-time temperature feature vector respectively, and, taking the vibration mode, pressure mode, and temperature mode in turn as the target mode, to compute the difference between the real-time feature vector of the target mode and the mean vector corresponding to the target mode to obtain the feature deviation vector of the target mode. The covariance matrix corresponding to the target mode is inverted to obtain the inverse covariance matrix of the target mode. The feature deviation vector is transposed to obtain the transposed feature deviation vector. The transposed feature deviation vector, the inverse covariance matrix, and the feature deviation vector are multiplied in sequence to obtain the squared Mahalanobis distance of the target mode. The squared Mahalanobis distances of the vibration mode, the pressure mode, and the temperature mode are used as the vibration evidence strength, pressure evidence strength, and temperature evidence strength, respectively.
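The evidence strength computation of this implementation corresponds to the squared Mahalanobis distance $D^2 = (x - \mu)^{T} \Sigma^{-1} (x - \mu)$, where x is the real-time feature vector of the target mode and μ and Σ are its baseline mean vector and covariance matrix. A minimal sketch follows; the function names and the numerical baseline values are hypothetical, and the sketch solves a linear system rather than forming the inverse matrix explicitly, which yields the same product.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance d^T cov^{-1} d with d = x - mean."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    y = solve(cov, d)                      # y = cov^{-1} d
    return sum(di * yi for di, yi in zip(d, y))


# Placeholder two-feature baseline for one mode
baseline_mean = [0.0, 0.0]
baseline_cov = [[4.0, 0.0], [0.0, 1.0]]
evidence = mahalanobis_sq([2.0, 1.0], baseline_mean, baseline_cov)
print(evidence)  # 2.0
```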
[0109] In one possible implementation, the acquisition unit is used to acquire fault prototype vectors as the preset fault mode templates. The fault prototype vectors include at least a mechanical fault prototype vector, a fluid fault prototype vector, and a health state prototype vector. Each fault prototype vector consists of a vibration reference evidence value, a pressure reference evidence value, and a temperature reference evidence value. The fusion unit is used to perform dot product operations between the real-time anomaly evidence vector and the mechanical fault prototype vector, the fluid fault prototype vector, and the health state prototype vector, respectively, to obtain a mechanical fault alignment score, a fluid fault alignment score, and a health state alignment score. The mechanical fault alignment score, the fluid fault alignment score, and the health state alignment score are adjusted based on a temperature hyperparameter and normalized to obtain a mechanical fault confidence, a fluid fault confidence, and a health state confidence. A preset mechanical fault attention vector, fluid fault attention vector, and health state attention vector are then obtained. The mechanical fault confidence is multiplied by the mechanical fault attention vector, the fluid fault confidence is multiplied by the fluid fault attention vector, and the health state confidence is multiplied by the health state attention vector. The results are summed to obtain a dynamic weight vector. Each component of the dynamic weight vector is normalized to obtain the set of modal feature weight values.
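The weight generation steps of this implementation can be sketched as follows. The text states only that the alignment scores are adjusted by a temperature hyperparameter and normalized; the sketch assumes a temperature-scaled softmax for that step and a sum-to-one normalization for the final step. The prototype vectors, attention vectors, and evidence values are hypothetical placeholders, with the health state attention vector borrowing the normal-condition weights 0.45, 0.35, and 0.2 mentioned in the description.

```python
import math

# Hypothetical templates: (vibration, pressure, temperature) reference evidence
PROTOTYPES = {
    "mechanical": [1.0, 0.0, 0.0],
    "fluid": [0.0, 1.0, 0.0],
    "health": [0.0, 0.0, 0.0],
}
# Hypothetical preset attention vectors over the three modes
ATTENTION = {
    "mechanical": [0.70, 0.20, 0.10],
    "fluid": [0.20, 0.70, 0.10],
    "health": [0.45, 0.35, 0.20],
}


def softmax(scores, temperature):
    """Temperature-scaled softmax (assumed form of the normalization)."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def modal_weights(evidence, temperature=1.0):
    """Map an anomaly evidence vector to (vibration, pressure, temperature) weights."""
    names = list(PROTOTYPES)
    scores = [sum(e * p for e, p in zip(evidence, PROTOTYPES[n])) for n in names]
    confidences = softmax(scores, temperature)
    dynamic = [sum(c * ATTENTION[n][k] for c, n in zip(confidences, names))
               for k in range(3)]
    total = sum(dynamic)
    return [w / total for w in dynamic]


# Strong vibration evidence shifts weight toward the vibration mode
weights = modal_weights([5.0, 0.5, 0.2])
```

With zero evidence in every mode, the three confidences are equal and the resulting weights are the plain average of the three attention vectors.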
[0110] In one possible implementation, the fusion unit is used to parse the vibration feature weight value corresponding to the vibration mode, the pressure feature weight value corresponding to the pressure mode, and the temperature feature weight value corresponding to the temperature mode from a set of modal feature weight values; perform scalar multiplication of the vibration feature weight value with each feature element in the vibration feature to obtain the weighted vibration feature; perform scalar multiplication of the pressure feature weight value with each feature element in the pressure feature to obtain the weighted pressure feature; perform scalar multiplication of the temperature feature weight value with each feature element in the temperature feature to obtain the weighted temperature feature; and perform vector concatenation of the weighted vibration feature, weighted pressure feature, and weighted temperature feature according to a preset modal dimension order to generate a multimodal fusion feature vector.
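The scalar weighting and concatenation described in this implementation can be sketched in a few lines. The feature values below are placeholders, the function name is hypothetical, and the weights reuse the normal-condition example of 0.45, 0.35, and 0.2 from the description.

```python
def fuse(vibration, pressure, temperature, weights):
    """Scale each mode's features by its weight, then concatenate in fixed order."""
    w_vib, w_pres, w_temp = weights
    return ([w_vib * x for x in vibration]
            + [w_pres * x for x in pressure]
            + [w_temp * x for x in temperature])


# 6 vibration, 3 pressure, and 3 temperature features (placeholder values)
fused = fuse([1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
             [1.0, 2.0, 3.0],
             [1.0, 2.0, 3.0],
             (0.45, 0.35, 0.2))
print(len(fused))  # 12
```

Keeping the modal order fixed (vibration, then pressure, then temperature) means each position of the fused vector always carries the same physical feature, which the downstream model relies on.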
[0111] In one possible implementation, the diagnostic unit is used to input the multimodal fusion feature vector into the multimodal feature input layer of a preset deep learning diagnostic model, perform standardization and dimensionality alignment processing to obtain an aligned feature sequence; input the aligned feature sequence into the feature embedding and position encoding layer, map the aligned feature sequence to a high-dimensional latent space through a linear embedding layer, and superimpose sinusoidal position encoding information to obtain a sequence feature with position information; input the sequence feature with position information into the Transformer encoder stack layer, calculate the global dependency relationship between each modality feature in the sequence feature with position information through a multi-head self-attention module, and perform nonlinear transformation processing through residual connection, layer normalization and feedforward neural network module to output deep sequence features; input the deep sequence features into the global feature aggregation layer, perform dimensionality reduction and aggregation processing on the deep sequence features through a global average pooling unit and a feature compression unit to generate a global fault feature vector; input the global fault feature vector into the fault diagnosis output layer, perform classification mapping on the global fault feature vector through a fully connected layer and a classification activation function, output the fault type, fault location and severity of the reciprocating compressor respectively, and combine the fault type, fault location and severity as the fault diagnosis result.
[0112] It should be noted that the system provided in the above embodiments is only illustrated by the division of the above functional modules. In actual applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the system and method embodiments provided in the above embodiments belong to the same concept, and the specific implementation process can be found in the method embodiments, which will not be repeated here.
[0113] This application also discloses an electronic device. Referring to Figure 3, Figure 3 is a schematic structural diagram of an electronic device provided by an embodiment of this application. The electronic device 300 may include: at least one processor 301, at least one network interface 304, a user interface 303, a memory 302, and at least one communication bus 305.
[0114] The communication bus 305 is used to enable communication between these components.
[0115] The user interface 303 may include a display screen and a camera. Optionally, the user interface 303 may also include a standard wired interface and a wireless interface.
[0116] The network interface 304 may optionally include a standard wired interface or a wireless interface (such as a Wi-Fi interface).
[0117] The processor 301 may include one or more processing cores. The processor 301 connects to various parts of the server using various interfaces and lines, and performs various server functions and processes data by running or executing instructions, programs, code sets, or instruction sets stored in memory 302, and by calling data stored in memory 302. Optionally, the processor 301 may be implemented using at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The processor 301 may integrate one or a combination of several of the following: Central Processing Unit (CPU), Graphics Processing Unit (GPU), and modem. The CPU primarily handles the operating system, user interface, and application requests; the GPU is responsible for rendering and drawing the content required for display; and the modem handles wireless communication. It is understood that the modem may also not be integrated into the processor 301 and may be implemented as a separate chip.
[0118] The memory 302 may include random access memory (RAM) or read-only memory. Optionally, the memory 302 may include a non-transitory computer-readable storage medium. The memory 302 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 302 may include a program storage area and a data storage area. The program storage area may store instructions for implementing an operating system, instructions for at least one function (such as touch functionality, sound playback functionality, image playback functionality, etc.), instructions for implementing the various method embodiments described above, etc. The data storage area may store data involved in the various method embodiments described above. Optionally, the memory 302 may also be at least one storage device located remotely from the aforementioned processor 301.
[0119] As shown in Figure 3, the memory 302, which serves as a computer storage medium, may include an operating system, a network communication module, a user interface module, and an application for AI-based automatic diagnosis of reciprocating compressors based on multimodal data fusion.
[0120] In the electronic device 300 shown in Figure 3, the user interface 303 is mainly used to provide an input interface for the user and to obtain the data input by the user, while the processor 301 can be used to call the application for AI-based automatic diagnosis of reciprocating compressors based on multimodal data fusion stored in the memory 302. When the application is executed by one or more processors 301, the electronic device 300 performs one or more of the methods described in the above embodiments.
[0121] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that this application is not limited to the described order of actions, as some steps may be performed in other orders or simultaneously according to this application. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily essential to this application.
[0122] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions in other embodiments.
[0123] In the several embodiments provided in this application, it should be understood that the disclosed apparatus can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the displayed or discussed mutual couplings, direct couplings, or communication connections may be through some service interfaces; indirect couplings or communication connections between devices or units may be electrical or other forms.
[0124] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0125] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.
[0126] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable memory. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a memory and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned memory includes various media capable of storing program code, such as USB flash drives, portable hard drives, magnetic disks, or optical disks.
[0127] The above description is merely an exemplary embodiment of this disclosure and should not be construed as limiting the scope of this disclosure. Any equivalent changes and modifications made in accordance with the teachings of this disclosure shall still fall within the scope of this disclosure. Those skilled in the art will readily conceive of other embodiments of this disclosure upon considering the specification and practicing the disclosure herein. This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not described in this disclosure.
Claims
1. An AI-based automatic diagnostic method for reciprocating compressors based on multimodal data fusion, characterized in that the method includes: acquiring the vibration signal, pressure signal, and temperature signal of key components of the reciprocating compressor in real time and synchronously; preprocessing the vibration signal, the pressure signal, and the temperature signal to obtain a preprocessed vibration signal, a preprocessed pressure signal, and a preprocessed temperature signal; extracting vibration features, pressure features, and temperature features from the preprocessed vibration signal, the preprocessed pressure signal, and the preprocessed temperature signal, respectively; differentially quantifying the vibration features, the pressure features, and the temperature features against the corresponding baseline features under normal operating conditions to obtain a vibration evidence strength, a pressure evidence strength, and a temperature evidence strength representing the degree of deviation of each mode from the normal state; combining the vibration evidence strength, the pressure evidence strength, and the temperature evidence strength to obtain a real-time anomaly evidence vector, and mapping and comparing the real-time anomaly evidence vector with a preset fault mode template to obtain a set of modal feature weight values; adaptively weighting the vibration features, the pressure features, and the temperature features according to the set of modal feature weight values, and concatenating and fusing the adjusted multimodal features to generate a multimodal fusion feature vector; inputting the multimodal fusion feature vector into a preset deep learning diagnostic model for processing to obtain the fault diagnosis result of the reciprocating compressor; and generating a fault diagnosis report based on the fault diagnosis result and pushing the fault diagnosis report to the user terminal in real time.
2. The method according to claim 1, characterized in that, The preprocessing of the vibration signal, the pressure signal, and the temperature signal to obtain preprocessed vibration signal, preprocessed pressure signal, and preprocessed temperature signal specifically includes: The vibration signal is denoised using the wavelet packet decomposition algorithm to obtain the denoised vibration signal. The pressure signal is filtered using a moving average filtering algorithm to obtain a filtered pressure signal. The temperature signal is subjected to outlier removal based on preset criteria, and the temperature signal after outlier removal is processed by a linear interpolation algorithm to obtain the processed temperature signal. The vibration signal after noise reduction, the pressure signal after filtering, and the temperature signal after processing are standardized and normalized to obtain the preprocessed vibration signal, the preprocessed pressure signal, and the preprocessed temperature signal.
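As an illustration only, the pressure and temperature branches of the preprocessing in claim 2 might be sketched as follows. The window length, the physical range used as the "preset criteria" for outliers, and all function names are assumptions, not values from this application. Wavelet packet denoising of the vibration signal is omitted because it needs a wavelet library, and only z-score standardization is shown for the final step.

```python
import numpy as np

def moving_average(signal, window=5):
    """Smooth a 1-D signal with a centered moving average (window is assumed)."""
    kernel = np.ones(window) / window
    # mode="same" keeps the length; edge samples are averaged with zero padding
    return np.convolve(np.asarray(signal, dtype=float), kernel, mode="same")

def remove_outliers_and_interpolate(signal, lower, upper):
    """Treat samples outside [lower, upper] as outliers (an assumed criterion)
    and fill them by linear interpolation between the remaining samples."""
    signal = np.asarray(signal, dtype=float)
    valid = (signal >= lower) & (signal <= upper)
    if not valid.any():
        raise ValueError("no valid samples left to interpolate from")
    idx = np.arange(len(signal))
    return np.interp(idx, idx[valid], signal[valid])

def zscore(signal):
    """Standardize a signal to zero mean and unit variance."""
    signal = np.asarray(signal, dtype=float)
    std = signal.std()
    if std == 0:
        return np.zeros_like(signal)
    return (signal - signal.mean()) / std

# Example: one implausible temperature reading is replaced by its neighbours' average
temperature = [50.0, 51.0, 500.0, 53.0, 54.0]
cleaned = remove_outliers_and_interpolate(temperature, lower=0.0, upper=150.0)
pressure_smoothed = moving_average([1.0, 1.0, 1.0, 1.0, 1.0], window=3)
temperature_preprocessed = zscore(cleaned)
```

A fixed physical range is the simplest possible outlier criterion; a statistical rule could be substituted without changing the interpolation step.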
3. The method according to claim 1, characterized in that, The extraction of vibration features, pressure features, and temperature features from the preprocessed vibration signal, the preprocessed pressure signal, and the preprocessed temperature signal, respectively, specifically includes: The preprocessed vibration signal is subjected to time-domain analysis and frequency-domain analysis respectively to extract time-domain statistical features and frequency-domain energy features. The time-domain statistical features and the frequency-domain energy features are combined to form the vibration features. The time-domain statistical features include root mean square value, peak-to-peak value, skewness and kurtosis. The frequency-domain energy features include the main frequency amplitude extracted based on fast Fourier transform and the energy proportion of each frequency band. The preprocessed pressure signal is subjected to thermodynamic and waveform feature analysis to extract the pressure features, which include maximum exhaust pressure, minimum intake pressure, pressure ratio, pressure pulsation amplitude, and indicated work characteristics. The preprocessed temperature signal is subjected to statistical and trend analysis to extract the temperature features, which include the average temperature value, temperature variance, and temperature change rate.
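The vibration and temperature features listed in claim 3 could be computed roughly as below. This is a minimal sketch under assumptions: the number of frequency bands, the equal-width band split, the least-squares slope used as the temperature change rate, and the function and key names are all illustrative. The pressure features (pressure ratio, indicated work, and so on) are left out because they depend on cylinder geometry not given here.

```python
import numpy as np

def vibration_features(x, fs, n_bands=4):
    """Time-domain statistics plus FFT-based frequency-domain energy features."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    std = centered.std()
    # Single-sided amplitude spectrum (factor 2 is valid for non-DC, non-Nyquist bins)
    amplitude = 2.0 * np.abs(np.fft.rfft(x)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    k = int(np.argmax(amplitude[1:])) + 1          # skip the DC bin
    power = amplitude[1:] ** 2
    bands = np.array_split(power, n_bands)         # equal-width bands (assumed)
    return {
        "rms": float(np.sqrt(np.mean(x ** 2))),
        "peak_to_peak": float(x.max() - x.min()),
        "skewness": float(np.mean(centered ** 3) / std ** 3),
        "kurtosis": float(np.mean(centered ** 4) / std ** 4),
        "main_freq_hz": float(freqs[k]),
        "main_freq_amp": float(amplitude[k]),
        "band_energy_ratio": [float(b.sum() / power.sum()) for b in bands],
    }

def temperature_features(temp, dt):
    """Mean, variance, and change rate (least-squares slope per unit time)."""
    temp = np.asarray(temp, dtype=float)
    t = np.arange(len(temp)) * dt
    slope = np.polyfit(t, temp, 1)[0]
    return {"mean": float(temp.mean()),
            "variance": float(temp.var()),
            "change_rate": float(slope)}

# Example: a 50 Hz sine of amplitude 2 sampled at 1 kHz for one second
t = np.arange(1000) / 1000.0
vib = vibration_features(2.0 * np.sin(2 * np.pi * 50 * t), fs=1000.0)
tmp = temperature_features(60.0 + 0.5 * np.arange(10), dt=1.0)
```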
4. The method according to claim 3, characterized in that, The step of quantifying the differences between the vibration features, pressure features, and temperature features and the corresponding benchmark features under normal operating conditions to obtain the vibration evidence strength, pressure evidence strength, and temperature evidence strength of each mode's deviation from the normal state specifically includes: The historical multimodal feature data of the reciprocating compressor under normal operating conditions are obtained, and the mean vector and covariance matrix corresponding to the feature vectors of the vibration mode, pressure mode and temperature mode are calculated respectively. The mean vector and covariance matrix corresponding to each mode are used as the benchmark features. The vibration features, the pressure features, and the temperature features are respectively used as the real-time vibration feature vector, the real-time pressure feature vector, and the real-time temperature feature vector; The vibration mode, the pressure mode, and the temperature mode are respectively used as target modes. The mean vector corresponding to the target mode is subtracted from the real-time feature vector of the target mode to obtain the feature deviation vector of the target mode, wherein the real-time feature vector is the vector corresponding to the target mode; The covariance matrix corresponding to the target mode is inverted to obtain the inverse covariance matrix of the target mode; the feature deviation vector is transposed to obtain the transposed feature deviation vector. The transposed feature deviation vector, the inverse covariance matrix, and the feature deviation vector are sequentially multiplied to obtain the squared Mahalanobis distance of the target mode.
The squared Mahalanobis distance values of the vibration mode, the pressure mode, and the temperature mode are calculated and used as the strength of the vibration evidence, the strength of the pressure evidence, and the strength of the temperature evidence, respectively.
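The evidence strength of claim 4 is the squared Mahalanobis distance D² = (x − μ)ᵀ Σ⁻¹ (x − μ), computed once per mode. A minimal sketch follows; the function names are hypothetical, and the use of a pseudo-inverse (to tolerate a singular covariance matrix) is an assumption rather than something stated in the claims.

```python
import numpy as np

def fit_baseline(normal_features):
    """Estimate the benchmark mean vector and covariance matrix of one mode
    from historical feature vectors recorded under normal operation
    (rows are samples, columns are features)."""
    X = np.asarray(normal_features, dtype=float)
    mean = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    return mean, cov

def evidence_strength(x, mean, cov):
    """Squared Mahalanobis distance of a real-time feature vector from the baseline."""
    deviation = np.asarray(x, dtype=float) - mean     # feature deviation vector
    cov_inv = np.linalg.pinv(cov)                     # pseudo-inverse: an assumption
    return float(deviation @ cov_inv @ deviation)     # d^T * Sigma^-1 * d

# Example with a hand-written baseline for a two-feature mode
baseline_mean = np.array([0.0, 0.0])
baseline_cov = np.array([[4.0, 0.0], [0.0, 1.0]])
d2 = evidence_strength([2.0, 0.0], baseline_mean, baseline_cov)

# Example of estimating a baseline from (made-up) historical samples
hist_mean, hist_cov = fit_baseline([[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]])
```

Running `evidence_strength` once for each of the vibration, pressure, and temperature modes and stacking the three scalars gives the real-time anomaly evidence vector used in claim 5.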
5. The method according to claim 4, characterized in that, The step of mapping and comparing the real-time anomaly evidence vector with a preset fault mode template to obtain a set of modal feature weight values specifically includes: Obtain a fault prototype vector as the preset fault mode template. The fault prototype vector includes at least a mechanical fault prototype vector, a fluid fault prototype vector, and a health state prototype vector. Each fault prototype vector consists of a vibration reference evidence value, a pressure reference evidence value, and a temperature reference evidence value. The real-time anomaly evidence vector is multiplied by the mechanical fault prototype vector, the fluid fault prototype vector and the health state prototype vector to obtain the mechanical fault alignment score, the fluid fault alignment score and the health state alignment score. Based on the temperature hyperparameter, the mechanical fault alignment score, the fluid fault alignment score, and the health state alignment score are exponentiated and normalized to obtain the mechanical fault confidence, fluid fault confidence, and health state confidence. Obtain a pre-defined mechanical fault concern vector, fluid fault concern vector, and health state concern vector; The mechanical fault confidence is multiplied by the mechanical fault concern vector, the fluid fault confidence is multiplied by the fluid fault concern vector, the health state confidence is multiplied by the health state concern vector, and the results of the multiplications are added together to obtain the dynamic weight vector. The components in the dynamic weight vector are normalized to obtain the set of modal feature weight values.
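One plausible reading of claim 5 is sketched below. It assumes that "multiplied" means an inner product, that the temperature hyperparameter divides the scores before exponentiation (a temperature-scaled softmax), and that the final normalization divides by the sum of the components. All prototype and concern values are made-up placeholders, not values from this application.

```python
import numpy as np

def modal_weights(evidence, prototypes, concern, temperature=1.0):
    """Map a real-time anomaly evidence vector to modal feature weights.

    evidence   : [vibration, pressure, temperature] evidence strengths
    prototypes : rows = mechanical, fluid, health state prototype vectors
    concern    : rows = mechanical, fluid, health state concern vectors
    """
    e = np.asarray(evidence, dtype=float)
    P = np.asarray(prototypes, dtype=float)
    A = np.asarray(concern, dtype=float)
    scores = P @ e                          # alignment scores (inner products, assumed)
    z = scores / temperature                # temperature scaling (assumed form)
    z -= z.max()                            # subtract the max for numerical stability
    confidence = np.exp(z) / np.exp(z).sum()
    dynamic = confidence @ A                # confidence-weighted sum of concern vectors
    return dynamic / dynamic.sum(), confidence

# Placeholder templates: each row is [vibration, pressure, temperature]
prototypes = [[1.0, 0.2, 0.3],   # mechanical fault
              [0.2, 1.0, 0.5],   # fluid (process) fault
              [0.0, 0.0, 0.0]]   # health state
concern = [[0.7, 0.1, 0.2],
           [0.1, 0.7, 0.2],
           [1 / 3, 1 / 3, 1 / 3]]

# Strong vibration evidence only: the weights shift toward the vibration mode
weights, confidence = modal_weights([10.0, 0.0, 0.0], prototypes, concern)
```

A smaller temperature makes the confidences closer to a hard selection of one fault hypothesis; a larger one blends the concern vectors more evenly.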
6. The method according to claim 5, characterized in that, The step of adaptively weighting and adjusting the vibration features, pressure features, and temperature features according to the set of modal feature weight values, and then concatenating and fusing the adjusted multimodal features to generate a multimodal fusion feature vector, specifically includes: From the set of modal feature weight values, the vibration feature weight value corresponding to the vibration mode, the pressure feature weight value corresponding to the pressure mode, and the temperature feature weight value corresponding to the temperature mode are respectively extracted; The weighted vibration feature is obtained by performing a scalar multiplication between the vibration feature weight value and each feature element in the vibration feature; The weighted pressure feature is obtained by performing a scalar multiplication between the pressure feature weight value and each feature element in the pressure feature; The weighted temperature feature is obtained by performing a scalar multiplication between the temperature feature weight value and each feature element in the temperature feature. The weighted vibration feature, weighted pressure feature, and weighted temperature feature are concatenated into a single vector according to a preset modal dimension order to generate the multimodal fusion feature vector.
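The weighting and concatenation in claim 6 reduce to a few lines. In this sketch the modal order vibration, pressure, temperature is assumed as the "preset modal dimension order", and the function name and example values are illustrative.

```python
import numpy as np

def fuse_features(vibration, pressure, temperature, weights):
    """Scale each mode's feature vector by its scalar weight and concatenate
    the results in the order vibration, pressure, temperature (assumed order)."""
    w_vib, w_pres, w_temp = weights
    return np.concatenate([
        w_vib * np.asarray(vibration, dtype=float),
        w_pres * np.asarray(pressure, dtype=float),
        w_temp * np.asarray(temperature, dtype=float),
    ])

# Example: modes may have different numbers of features
fused = fuse_features([1.0, 2.0], [3.0], [4.0, 5.0], weights=(0.5, 0.3, 0.2))
```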
7. The method according to claim 1, characterized in that, The step of inputting the multimodal fusion feature vector into a preset deep learning diagnostic model for processing to obtain the fault diagnosis result of the reciprocating compressor specifically includes: The multimodal fusion feature vector is input into the multimodal feature input layer of the preset deep learning diagnostic model, and standardized and dimension-aligned to obtain an aligned feature sequence. The aligned feature sequence is input into the feature embedding and position encoding layer. The aligned feature sequence is mapped to a high-dimensional latent space through a linear embedding layer, and sinusoidal position encoding information is superimposed to obtain sequence features with position information. The sequence features with position information are input into the stacked layers of the Transformer encoder. The global dependencies between modal features in the sequence features with position information are calculated by the multi-head self-attention module, and a nonlinear transformation is then performed by the residual connection, layer normalization and feedforward neural network modules to output deep sequence features. The deep sequence features are input into the global feature aggregation layer, and the deep sequence features are reduced in dimension and aggregated through the global average pooling unit and the feature compression unit to generate a global fault feature vector. The global fault feature vector is input into the fault diagnosis output layer. The global fault feature vector is classified and mapped through a fully connected layer and a classification activation function. The fault type, fault location and severity of the reciprocating compressor are output respectively. The fault type, fault location and severity are combined as the fault diagnosis result.
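The data flow of claim 7 can be illustrated with a forward pass through one simplified encoder layer. This is a sketch under heavy assumptions: the weights are random and untrained, a single attention head stands in for the multi-head module, each element of the fused vector is treated as one sequence token, and the feature compression unit and output layer are omitted. It shows tensor shapes and the order of operations, not a usable diagnostic model.

```python
import numpy as np

def sinusoidal_position_encoding(seq_len, d_model):
    """Standard sine/cosine position encoding of shape (seq_len, d_model)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])   # even columns use sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])   # odd columns use cosine
    return pe

def layer_norm(x, eps=1e-5):
    """Normalize each token vector to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ v

def encode(fused, d_model=8, seed=0):
    """Embed, add positions, apply one encoder layer, then average-pool."""
    rng = np.random.default_rng(seed)      # random stand-in for trained weights
    tokens = np.asarray(fused, dtype=float)[:, None]        # (seq_len, 1)
    x = tokens @ rng.normal(size=(1, d_model))               # linear embedding
    x = x + sinusoidal_position_encoding(len(tokens), d_model)
    wq, wk, wv = (rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
                  for _ in range(3))
    x = layer_norm(x + self_attention(x, wq, wk, wv))        # residual + norm
    w1 = rng.normal(size=(d_model, 2 * d_model)) / np.sqrt(d_model)
    w2 = rng.normal(size=(2 * d_model, d_model)) / np.sqrt(2 * d_model)
    x = layer_norm(x + np.maximum(x @ w1, 0.0) @ w2)         # feedforward + norm
    return x.mean(axis=0)                                    # global average pooling

pe = sinusoidal_position_encoding(seq_len=5, d_model=8)
global_vector = encode([0.5, 1.0, 0.9, 0.8, 1.0])
```

In a trained model, the pooled vector would feed fully connected heads that output fault type, fault location, and severity.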
8. A multimodal data fusion-based AI automatic diagnostic system for reciprocating compressors, characterized in that, The system includes an acquisition unit, a processing unit, a fusion unit, and a diagnostic unit. The acquisition unit acquires vibration signals, pressure signals, and temperature signals from key components of the reciprocating compressor in real time and synchronously. The processing unit preprocesses the vibration signal, the pressure signal, and the temperature signal to obtain a preprocessed vibration signal, a preprocessed pressure signal, and a preprocessed temperature signal. The fusion unit extracts vibration features, pressure features, and temperature features from the preprocessed vibration signal, the preprocessed pressure signal, and the preprocessed temperature signal, respectively. The differences between the vibration features, pressure features, and temperature features and the corresponding benchmark features under normal operating conditions are quantified to obtain the vibration evidence strength, pressure evidence strength, and temperature evidence strength, which represent the degree of deviation of each mode from the normal state. The vibration evidence strength, pressure evidence strength, and temperature evidence strength are combined to obtain a real-time anomaly evidence vector. The real-time anomaly evidence vector is mapped and compared with a preset fault mode template to obtain a set of modal feature weight values. The vibration features, pressure features, and temperature features are adaptively weighted and adjusted according to the set of modal feature weight values. The adjusted multimodal features are then spliced and fused to generate a multimodal fusion feature vector.
The diagnostic unit inputs the multimodal fusion feature vector into a preset deep learning diagnostic model for processing to obtain the fault diagnosis result of the reciprocating compressor; generates a fault diagnosis report based on the fault diagnosis result, and pushes the fault diagnosis report to the user terminal in real time.
9. An electronic device, characterized in that, The device includes a processor, a memory, a user interface, and a network interface. The memory is used to store instructions, the user interface and the network interface are used to communicate with other devices, and the processor is used to execute the instructions stored in the memory to cause the electronic device to perform the method as described in any one of claims 1-7.
10. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores instructions that, when executed, perform the method as described in any one of claims 1-7.