A big data-based thermoelectric power generation emission pollutant detection system and method
The invention analyzes thermal power emission parameters through a distributed sensor network and a big data platform, and combines a two-stage pollutant detection model with spectral feature analysis. This overcomes the shortcomings of existing detection methods and enables real-time, comprehensive, and accurate monitoring and source tracing of pollutants emitted from thermal power generation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HUANENG POWER INT INC JINGGANGSHAN POWER PLANT
- Filing Date
- 2025-08-19
- Publication Date
- 2026-06-30
AI Technical Summary
Existing methods for detecting pollutants emitted from thermal power generation are insufficient for real-time, comprehensive, and accurate monitoring. They cannot identify combined patterns of multiple pollutants, lack effective data processing and source tracing analysis, and traditional detection results are not targeted or accurate enough.
A distributed sensor network is used to collect multi-dimensional emission parameters in real time. A big data processing platform and a pollution characteristic knowledge base are used for correlation analysis. A two-level pollutant detection model is used to process macroscopic and microscopic emission characteristic areas to generate pollutant exceedance early warning and component source tracing analysis reports. Combined with dynamic region segmentation and spectral feature analysis technology, the emission parameters can be accurately divided and detected.
It enables comprehensive and dynamic monitoring of pollutants emitted from thermal power generation, identifies potential target pollutant combination patterns, generates accurate warnings of exceeding standards and component source tracing analysis reports, improves the pertinence and comprehensiveness of detection, and provides specific directions for pollution control.
Abstract figure: CN120974238B_ABST
Description
Technical Field
[0001] This invention relates to the field of thermoelectric pollutant detection technology, specifically to a system and method for detecting pollutants emitted from thermoelectric power generation based on big data.
Background Technology
[0002] During cogeneration, the pollutants emitted from chimneys are complex in composition and fluctuate significantly, posing a potential impact on the ecological environment and human health. Currently, pollutant detection largely relies on single-point sampling and analysis, making it difficult to comprehensively capture the real-time dynamic changes at emission points. Traditional detection methods typically monitor individual pollutants, ignoring the correlations between different pollutants, resulting in limited ability to identify complex pollution.
[0003] In existing technologies, sensor networks are relatively concentrated in their layout, and the data collected is limited to a single dimension, failing to simultaneously cover multi-dimensional parameters such as flue gas composition, particulate matter characteristics, flow velocity, temperature, and pressure. Furthermore, the lack of an effective knowledge base in the data processing stage makes it difficult to perform in-depth analysis of massive amounts of monitoring data, often only enabling simple concentration exceedance judgments without identifying potential pollutant combinations.
[0004] Traditional detection models fail to properly regionalize emission parameters, conflating macroscopic emission characteristics with microscopic component characteristics, resulting in insufficient specificity and accuracy of detection results. Regarding pollutant source tracing, existing technologies lack systematic analytical methods, making it difficult to track the sources and formation pathways of pollutants, thus posing significant challenges to pollution control. These issues render existing methods for detecting pollutants emitted from thermal power generation inadequate for meeting the demands for real-time, comprehensive, and precise monitoring.
Summary of the Invention
[0005] The purpose of this invention is to provide a big data-based system and method for detecting pollutants emitted from thermal power generation, in order to solve the problems mentioned in the background art.
[0006] To achieve the above objectives, the present invention provides a method for detecting pollutants emitted from thermal power generation based on big data, the method comprising:
[0007] Multiple emission parameters from the chimney exhaust outlet of a thermal power plant are collected in real time through a distributed sensor network. These emission parameters include flue gas composition concentration, particulate matter size distribution, emission velocity, and temperature and pressure data.
[0008] The collected multidimensional emission parameters are transmitted to a big data processing platform, and the pollution feature knowledge base is used to analyze the correlation of the multidimensional emission parameters to identify potential target pollutant combination patterns.
[0009] Based on the identified target pollutant combination pattern, the preset pollutant detection rule library is invoked, and the emission parameters are divided into macro emission characteristic regions and micro component characteristic regions based on the dynamic region segmentation algorithm.
[0010] By using a two-level pollutant detection model to process parameter data from macroscopic emission characteristic areas and microscopic component characteristic areas respectively, pollutant exceedance early warning results and component source tracing analysis reports are generated.
[0011] Preferably, the process of performing correlation analysis using a pollution feature knowledge base includes:
[0012] A pollution feature knowledge base based on a graph database is constructed, which stores the mapping relationship between different fuel types, combustion conditions and pollutant emission characteristics;
[0013] A multi-source data association engine is used to analyze real-time data collected by a distributed sensor network and extract the symbiotic relationship features between flue gas components.
[0014] The extracted symbiotic relationship features are compared with the standard pollution patterns in the pollution feature knowledge base using a knowledge graph matching algorithm to determine the target pollutant combination pattern corresponding to the current emission parameters.
[0015] Preferably, the execution process of the dynamic region segmentation algorithm includes:
[0016] Based on the target pollutant combination pattern, the corresponding pollutant detection rule base is invoked to obtain the emission monitoring standards under the current pollution pattern;
[0017] The emission parameters are decomposed into a data matrix with spatial distribution characteristics using a spectral feature analysis algorithm.
[0018] Based on the threshold parameters specified in the emission monitoring standards, the data matrix is divided into multi-scale regions to generate macro-emission feature regions containing overall emission intensity characteristics and micro-component feature regions containing the concentration distribution of specific substances.
[0019] Preferably, the operation process of the two-stage pollutant detection model includes:
[0020] In macro-emission characteristic regions, an emission total assessment model is applied, and spatiotemporal feature extraction algorithms are used to calculate the total emission and diffusion trend of pollutants.
[0021] A component fingerprinting model is activated in the microscopic component characteristic region, and high-resolution spectral matching technology is used to analyze the fingerprint spectrum of characteristic pollutants;
[0022] The trend prediction data from the total emission assessment model and the substance source tracing data from the component fingerprinting model are received simultaneously, and a comprehensive detection report is generated that integrates the total emission exceedance warning with the source tracing results for specific substances.
[0023] Preferably, the operation process of the component fingerprint recognition model includes:
[0024] A pollutant fingerprint spectral database is established to store the characteristic absorption spectra of different industrial pollution sources;
[0025] Wavelet denoising is performed on the parameter data of the microscopic component characteristic region to extract effective substance characteristic spectral curves;
[0026] A spectral similarity calculation engine matches the extracted characteristic spectral curves step by step against the pollutant fingerprint spectral database to identify the source categories of characteristic pollutants.
[0027] Preferably, the method further includes a dynamic optimization process for pollutant detection:
[0028] Real-time monitoring of the analysis bias indicators of the total emission assessment model and the component fingerprinting model;
[0029] When there is a significant difference between the total emission assessment results of macro-emission characteristic regions and the substance identification results of micro-component characteristic regions, the model parameter coordination mechanism is activated.
[0030] By dynamically adjusting the weight coefficients in the total emissions assessment model and the matching thresholds in the component fingerprinting model through a feedback adjustment algorithm, the outputs of the two-level models converge.
[0031] Preferably, the execution process of the model parameter coordination mechanism includes:
[0032] Construct a correlation analysis matrix of the output results of the two-level model, and calculate the statistical correlation index between total emissions data and material composition data;
[0033] Based on the degree to which the statistical correlation indicators deviate from the normal range, a set of model parameter correction coefficients is generated;
[0034] The model parameter correction coefficients are input into the dynamic weight controller of the total emissions assessment model and the threshold regulator of the component fingerprinting model to achieve collaborative optimization of the two-level detection models.
[0035] Preferably, the method further includes a pollutant migration prediction process:
[0036] Real-time wind direction and speed, atmospheric humidity and temperature stratification data are obtained based on meteorological data interfaces;
[0037] The pollutant exceedance early warning results are spatiotemporally overlaid with meteorological data, and fluid dynamics simulation algorithms are used to deduce the pollutant diffusion path.
[0038] Based on the simulation results, a pollutant migration prediction map is generated, which includes the affected area range, concentration distribution gradient, and duration.
[0039] Preferably, the method further includes an emission control feedback process:
[0040] The pollutant exceedance warning results are input into the thermal power generator set control system;
[0041] The fuel supply ratio, air volume, and flue gas recirculation parameters are adjusted through combustion optimization control algorithms.
[0042] Real-time acquisition of the adjusted emission parameter change curves verifies whether the pollutant concentration decline trend meets the expected control target.
[0043] Preferably, the present invention also includes a big data-based system for detecting pollutants emitted from thermal power generation, used to implement the above-described big data-based method for detecting pollutants emitted from thermal power generation, the system comprising the following modules:
[0044] A distributed sensing and acquisition module is deployed in the emission pipeline of a thermal power plant, equipped with multiple types of gas sensors and particulate matter monitoring probes.
[0045] The big data processing engine module connects the pollution characteristic knowledge base and the pollutant detection rule base to perform data association parsing and dynamic region segmentation.
[0046] The two-level analysis and decision-making module includes a macro-emission assessment unit and a micro-component identification unit, which process data from different characteristic regions respectively.
[0047] The dynamic coordination and optimization module receives the output results from the two-level analysis and decision-making module and performs model parameter coordination operations.
[0048] The pollution migration prediction module connects to the meteorological data platform and performs pollutant diffusion simulations.
[0049] The closed-loop control execution module converts the detection results into control commands and transmits them to the generator set control system.
[0050] Compared with the prior art, the beneficial effects of the present invention are:
[0051] By collecting multiple emission parameters through a distributed sensor network, the limitations of traditional single-point sampling and single-parameter monitoring are overcome, enabling comprehensive capture of dynamic information from chimney emission outlets. The simultaneous acquisition of multi-dimensional parameters means that pollutant monitoring is no longer limited to one or a few substances, but encompasses multiple aspects such as flue gas composition, particulate matter characteristics, flow velocity, temperature, and pressure, providing rich foundational data for subsequent comprehensive analysis.
[0052] By utilizing a pollution characteristic knowledge base to analyze the correlations of multidimensional emission parameters, it is possible to uncover the intrinsic relationships between different pollutants and identify potential target pollutant combination patterns. This pattern recognition based on correlation analysis changes the traditional approach of monitoring single pollutants, making the detection process more aligned with the characteristic that pollutants often exist in complex forms in actual emissions. This helps to discover pollution risks that are difficult to detect using traditional methods.
[0053] The dynamic region segmentation algorithm divides emission parameters into macroscopic emission characteristic regions and microscopic component characteristic regions, enabling effective differentiation of emission information at different levels. This segmentation method makes subsequent detection and processing more targeted, avoiding analytical biases caused by interference between macroscopic and microscopic information, and making the detection process more accurate and efficient.
[0054] The dual-level pollutant detection model processes parameter data from two regions separately, generating corresponding results based on both macroscopic emission characteristics and microscopic component characteristics. For the macroscopic region, it can generate exceedance warnings, promptly reflecting the overall emission status; for the microscopic region, it can generate component source tracing analysis reports, providing in-depth analysis of the composition and sources of pollutants. This tiered approach not only focuses on whether overall emissions exceed standards but also delves into the detailed information of pollutants, making the detection results more comprehensive and providing more specific directions for pollution control.
Attached Figure Description
[0055] Figure 1 is a schematic diagram illustrating the working principle of the big data-based method for detecting pollutants emitted from thermal power generation described in this invention;
[0056] Figure 2 is a flowchart of the correlation analysis using the pollution feature knowledge base;
[0057] Figure 3 is a flowchart of the operation of the two-stage pollutant detection model;
[0058] Figure 4 is a flowchart of the execution of the model parameter coordination mechanism;
[0059] Figure 5 is a flowchart of pollutant migration prediction.
Detailed Implementation
[0060] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0061] Referring to Figure 1, this invention provides a method for detecting pollutants emitted from thermal power generation based on big data, the method comprising:
[0062] The distributed sensor network includes a laser gas analyzer, a beta-ray particulate matter monitor, a Pitot tube flow meter, and temperature and pressure transmitters. The sensors collect real-time concentrations of SO2, NOx, CO, and CO2 in the flue gas at a sampling frequency of 10 Hz, record PM2.5 and PM10 particle size distribution data, and simultaneously measure flue gas velocity, temperature, and pressure. The collected raw data undergoes protocol conversion at an industrial IoT gateway and is then transmitted in JSON format to a big data processing platform.
[0063] The big data processing platform uses the Apache Spark framework to build a streaming computing pipeline, performing standardized preprocessing on the input data, including outlier removal, dimensional unification, and timestamp alignment. The preprocessed multidimensional emission parameters are imported into a pollution characteristic knowledge base for correlation analysis. This knowledge base stores typical emission patterns of 12 fuels, including coal and natural gas, under different load rates. A graph neural network algorithm is used to calculate the similarity matrix between the current emission parameters and the knowledge base templates, identifying pollutant combination patterns such as SO2, NOx, and PM2.5 jointly exceeding their standards.
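The three preprocessing steps named above (outlier removal, dimensional unification, timestamp alignment) could look roughly like the following plain-Python sketch. The patent runs them in an Apache Spark streaming pipeline and does not give the rules; the median/MAD outlier test, the ppm to mg/m³ conversion at standard conditions, and all function names here are assumptions made for illustration.

```python
from statistics import median

def remove_outliers(values, k=3.0):
    """Drop values farther than k scaled MADs from the median (assumed rule)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(values)
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [v for v in values if abs(v - med) <= k * scale]

def ppm_to_mg_m3(ppm, molar_mass_g_mol, molar_volume_l_mol=22.4):
    """Unify gas units: ppm to mg/m3 at 0 degrees C and 101.325 kPa."""
    return ppm * molar_mass_g_mol / molar_volume_l_mol

def align_timestamp(ts_s, period_s=0.1):
    """Snap a timestamp to the nearest tick of the 10 Hz sampling grid."""
    return round(round(ts_s / period_s) * period_s, 6)
```

In a Spark pipeline these would typically be applied per window or per record; the sketch only shows the per-value logic.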
[0064] After identifying the target pollutant combination pattern, the system calls the preset pollutant detection rule base and loads the emission limit standards for the corresponding pollution pattern. An improved K-means clustering algorithm is used to dynamically segment the emission parameters into regions. Using an SO2 concentration threshold of 50 mg/m³ as the boundary, the data space is divided into a macroscopic emission characteristic region reflecting the overall pollution level and a microscopic component characteristic region showing the distribution of heavy metals. The macroscopic assessment module in the two-level pollutant detection model calculates the total hourly SO2 emissions of the region, while the microscopic analysis module focuses on the characteristic spectra of trace elements such as arsenic and mercury, ultimately generating a comprehensive report that includes warnings of total emissions exceeding limits and heavy metal source markers.
[0065] Example 1: See Figure 2. The pollution characteristic knowledge base is constructed using a graph database structure, with nodes and relationship types rigorously defined to meet the needs of thermal power emission analysis. The graph database contains three core node types. The first type is fuel characteristic nodes, storing industrial analysis data for 12 fuels including coal and biomass, covering 28 indicators such as fixed carbon content, volatile matter percentage, and absolute sulfur content. Each indicator is associated with a corresponding detection standard and unit of measurement. The second type is combustion condition nodes, recording boiler operating parameters, including but not limited to dynamic data such as load rate fluctuation curves, furnace oxygen concentration gradients, and steam temperature-pressure relationships. These parameters are stored in time-series format with a sampling interval of 1 minute. The third type is pollutant characteristic nodes, storing the spectral characteristics and chemical composition of typical emissions, including absorption peaks of conventional gaseous pollutants such as sulfur dioxide and nitrogen oxides, as well as characteristic emission lines of heavy metals such as arsenic and mercury. Each spectral line is labeled with the measuring instrument model and calibration date.
[0066] The multi-source data association engine employs complex event processing technology to achieve real-time data stream analysis. The engine's input connects to a Kafka message queue of a distributed sensor network, configured with a sliding time window mechanism. The window width is set to 5 minutes, and the sliding step size is 1 minute. Three types of event association rules are established within the window: the first type detects the temporal relationship between events of decreasing oxygen content and increasing sulfur dioxide concentration, with the trigger condition being a 0.5% decrease in oxygen content within 10 seconds accompanied by a 20 ppm increase in SO2; the second type captures the spatial correlation between changes in particulate matter size distribution and flow velocity fluctuations, generating an abnormal event marker when the PM2.5 / PM10 ratio exceeds 0.7 and the flue gas velocity is below 12 m / s; the third type analyzes the nonlinear relationship between temperature and pressure parameters and the concentrations of multiple pollutants, using kernel density estimation to calculate the joint distribution probability of the parameters. All triggered events are appended with a timestamp and confidence score, stored in the relational edge attributes of a graph database.
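The first two association rules can be sketched in plain Python as follows. This is a minimal illustration, not the engine described in the patent: the sample dictionary keys are hypothetical, the "0.5% decrease in oxygen content" is read as 0.5 percentage points, and the Kafka input, the 1-minute sliding step, and the confidence scores are left out.

```python
from collections import deque

WINDOW_S = 300        # 5-minute window width (from the text)
LOOKBACK_S = 10.0     # rule 1 looks back 10 seconds
O2_DROP_PCT = 0.5     # assumed to mean percentage points of O2
SO2_RISE_PPM = 20.0

class SlidingWindow:
    def __init__(self, width_s=WINDOW_S):
        self.width_s = width_s
        self.samples = deque()  # dicts with hypothetical keys, oldest first

    def add(self, sample):
        """Append a sample and evict everything older than the window width."""
        self.samples.append(sample)
        while sample["ts"] - self.samples[0]["ts"] > self.width_s:
            self.samples.popleft()

    def rule_o2_so2(self):
        """Rule 1: O2 fell by >= 0.5 while SO2 rose by >= 20 ppm within 10 s."""
        latest = self.samples[-1]
        for old in self.samples:
            dt = latest["ts"] - old["ts"]
            if 0 < dt <= LOOKBACK_S:
                o2_drop = old["o2_pct"] - latest["o2_pct"]
                so2_rise = latest["so2_ppm"] - old["so2_ppm"]
                if o2_drop >= O2_DROP_PCT and so2_rise >= SO2_RISE_PPM:
                    return True
        return False

def rule_particulate(sample):
    """Rule 2: PM2.5/PM10 ratio above 0.7 with flue gas velocity below 12 m/s."""
    if sample["pm10"] <= 0:
        return False
    return sample["pm25"] / sample["pm10"] > 0.7 and sample["velocity"] < 12.0
```

The third rule (kernel density estimation of the joint parameter distribution) is omitted because the text gives no bandwidth or probability threshold.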
[0067] The spectral feature analysis process comprises two stages: signal preprocessing and feature extraction. The preprocessing stage employs a five-stage pipeline: the raw spectral data is first filtered using a moving average to eliminate high-frequency noise, with a window width set to 7 data points; then baseline correction is performed, using asymmetric least squares to fit the spectral baseline; the third step involves wavelength calibration, adjusting the instrument offset based on the standard position of the mercury lamp emission spectral lines; the fourth step is intensity normalization, converting detector counts into relative absorbance; finally, data resampling is performed, interpolating the irregularly spaced raw data into uniform spectra with 0.1 nm intervals. The feature extraction stage utilizes matrix factorization, organizing the preprocessed spectral data into a wavelength-time two-dimensional matrix. A non-negative constraint-based decomposition algorithm extracts five basis vectors, each representing the spectral characteristics of a class of substances. During the decomposition process, sparsity constraints are set to ensure that the proportion of non-zero elements in each basis vector does not exceed 30%.
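Three of the steps above are simple enough to sketch directly: the 7-point moving average, resampling onto a uniform 0.1 nm grid, and the 30% sparsity check on a basis vector. The edge handling, the use of linear interpolation, and the function names are assumptions; baseline correction, wavelength calibration, and the non-negative factorization itself are not shown.

```python
def moving_average(signal, window=7):
    """Centered moving average; the window shrinks at the edges (assumed)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def resample_uniform(wavelengths, values, step=0.1):
    """Linearly interpolate irregular samples onto a uniform wavelength grid."""
    n = int((wavelengths[-1] - wavelengths[0]) / step + 1e-9) + 1
    grid = [wavelengths[0] + i * step for i in range(n)]
    out, j = [], 0
    for x in grid:
        # advance to the input segment that contains x
        while j < len(wavelengths) - 2 and wavelengths[j + 1] < x:
            j += 1
        x0, x1 = wavelengths[j], wavelengths[j + 1]
        t = (x - x0) / (x1 - x0)
        out.append(values[j] + t * (values[j + 1] - values[j]))
    return grid, out

def sparsity_ok(basis_vector, max_nonzero_fraction=0.30, eps=1e-12):
    """Check that at most 30% of the basis vector's elements are non-zero."""
    nonzero = sum(1 for v in basis_vector if abs(v) > eps)
    return nonzero / len(basis_vector) <= max_nonzero_fraction
```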
[0068] The dynamic region segmentation algorithm establishes a dual-criteria mechanism during implementation. The first criterion is based on the absolute value of pollutant concentration, referencing national emission standards to set threshold limits for each substance, such as a 1-hour average concentration limit of 50 mg/m³ for SO2. Exceeding this value qualifies the region as a macro-emission characteristic area. The second criterion considers the relative proportions of pollutants; when the SO2 / NOx concentration ratio exceeds 4:1, region labeling is triggered even if the concentration of any single pollutant does not exceed the standard. The segmentation process is performed in a three-dimensional parameter space, with the coordinate axes representing the concentrations of gaseous pollutants, particulate matter, and trace heavy metals, respectively. A density peak clustering algorithm is used to automatically determine the region boundaries. The clustering parameters are dynamically adjusted according to the fuel type; a larger neighborhood radius parameter is set for coal-fired units, while a finer clustering scale is used for gas-fired units.
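The dual-criteria check can be written as a small function. This sketch covers only the two thresholds quoted in the text for SO2 and the SO2/NOx ratio; the function name and the returned labels are hypothetical, and the density peak clustering step is not included.

```python
SO2_LIMIT_MG_M3 = 50.0  # 1-hour average limit quoted in the text
SO2_NOX_RATIO = 4.0     # ratio of 4:1 quoted in the text

def triggered_criteria(so2_1h_avg, nox_1h_avg):
    """Return which of the two criteria fire for one pair of 1-hour averages."""
    fired = []
    if so2_1h_avg > SO2_LIMIT_MG_M3:
        fired.append("absolute")  # concentration alone exceeds the limit
    if nox_1h_avg > 0 and so2_1h_avg / nox_1h_avg > SO2_NOX_RATIO:
        fired.append("ratio")     # proportion is abnormal even below the limit
    return fired
```

For example, an SO2 average of 45 mg/m³ with NOx at 10 mg/m³ stays under the absolute limit but still triggers labeling through the ratio criterion.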
[0069] The knowledge graph matching algorithm employs a multi-stage similarity calculation method. The first stage calculates topological similarity, comparing the subgraph constructed from real-time data with the knowledge base template in terms of node connectivity. Evaluation metrics include the number of common neighbors and path similarity. The second stage calculates attribute similarity, performing cosine similarity calculations on node feature vectors, focusing on comparing the relative intensity distribution of spectral feature peaks. The third stage performs temporal pattern matching, using a dynamic time warping algorithm to align historical operating condition curves with the real-time data stream. The final similarity score is a weighted composite of the three-stage results: topological similarity (40%), attribute similarity (35%), and temporal similarity (25%). The matching results generate a list of the top five candidate patterns, each accompanied by a difference analysis report, annotating the main inconsistencies and their degree of deviation.
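The weighted combination and top-five ranking can be illustrated as follows. The weights are the ones stated above; the function names and the assumption that each stage yields a score in the range 0 to 1 are illustrative, and the topological and dynamic time warping stages are taken as precomputed inputs.

```python
import math

# Weights stated in the text for the three similarity stages.
WEIGHTS = {"topological": 0.40, "attribute": 0.35, "temporal": 0.25}

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def composite_score(topological, attribute, temporal):
    """Weighted sum of the three stage scores."""
    return (WEIGHTS["topological"] * topological
            + WEIGHTS["attribute"] * attribute
            + WEIGHTS["temporal"] * temporal)

def top_candidates(scores_by_pattern, k=5):
    """Return the k best (pattern, score) pairs, highest score first."""
    ranked = sorted(scores_by_pattern.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]
```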
[0070] The pollution characteristic knowledge base's update and maintenance mechanism includes an automated verification process. Newly collected emission data undergoes three verifications before being entered into the database: instrument status verification checks sensor calibration status codes to exclude data from equipment that has not been calibrated on schedule; operating condition consistency verification compares current operating parameters with typical values under historical loads, marking data that deviate by more than three standard deviations; and chemical balance verification calculates element conservation relationships, such as the mass balance of sulfur in pulverized coal, slag, and flue gas. Verified data enters the knowledge base's incremental update queue. The update process maintains version control, generating differential backups with each modification, supporting rollback to any historical version.
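A rough sketch of the three verifications follows. The three-standard-deviation rule comes from the text; the 5% closure tolerance for the sulfur balance, the record fields, and the function names are assumptions, since the patent does not specify them.

```python
from statistics import mean, stdev

def within_three_sigma(value, history):
    """Operating-condition check: value within 3 standard deviations of history."""
    mu, sigma = mean(history), stdev(history)
    return abs(value - mu) <= 3 * sigma

def sulfur_balance_ok(s_coal, s_slag, s_flue_gas, tolerance=0.05):
    """Element conservation: sulfur in should match sulfur out (tolerance assumed)."""
    if s_coal <= 0:
        return False
    return abs(s_coal - (s_slag + s_flue_gas)) / s_coal <= tolerance

def passes_verification(record, load_history):
    """Apply the instrument, operating-condition, and chemical-balance checks in order."""
    return (record["calibrated"]
            and within_three_sigma(record["load"], load_history)
            and sulfur_balance_ok(record["s_coal"], record["s_slag"], record["s_flue_gas"]))
```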
[0071] The real-time data analytics pipeline employs a microservices architecture for elastic scaling. Data processing units are encapsulated as independent containers, each dedicated to a specific type of analysis task. For example, the spectral preprocessing container is configured with 4 CPU cores and 8GB of memory, while the feature extraction container is allocated 6 CPU cores and 16GB of memory. Containers communicate with each other via a service mesh, using the gRPC protocol to transmit large messages such as spectral data. A load balancer monitors the resource utilization of each container, automatically triggering horizontal scaling when the CPU load consistently exceeds 70%, scalable up to a maximum of 20 parallel processing instances. An error handling mechanism implements circuit breaker protection; a single container is isolated after more than three failures, and its tasks are taken over by a backup container.
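The scaling and circuit breaker thresholds above can be expressed as a small decision sketch. It is not tied to any particular orchestrator or service mesh; "consistently exceeds 70%" is interpreted here as every sample in a recent list exceeding the threshold, and scaling by one instance at a time is an assumption.

```python
CPU_THRESHOLD = 0.70  # scale out above 70% CPU (from the text)
MAX_INSTANCES = 20    # upper bound on parallel instances (from the text)
MAX_FAILURES = 3      # isolate after more than three failures (from the text)

def desired_instances(current, recent_cpu_loads):
    """Add one instance if all recent CPU samples exceed the threshold, capped at 20."""
    sustained = bool(recent_cpu_loads) and all(
        load > CPU_THRESHOLD for load in recent_cpu_loads)
    return min(current + 1, MAX_INSTANCES) if sustained else current

class CircuitBreaker:
    """Counts failures for one container and reports when it should be isolated."""

    def __init__(self):
        self.failures = 0

    def record_failure(self):
        self.failures += 1

    @property
    def isolated(self):
        return self.failures > MAX_FAILURES
```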
[0072] Spatial mapping of emission parameters employs adaptive gridding technology. The base grid is established in cylindrical coordinates centered on the chimney outlet, with radial spacing distributed on a logarithmic scale. The grid is denser in the near-field region with a resolution of 0.5 meters, gradually thinning out to a 10-meter interval in the far field. Vertically, it is layered, with each computational layer measuring 5 meters in height, extending up to five times the chimney height. Grid attributes include scalar fields such as pollutant concentration, temperature, and velocity, as well as derived parameters such as turbulence intensity and diffusion coefficient. Grid data is refreshed every 15 seconds, using an incremental calculation strategy, recalculating only grid cells with concentration changes exceeding 10%.
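The grid layout and the incremental refresh rule can be sketched as below. The text gives only the end points of the radial resolution (0.5 m and 10 m), so the geometric growth between them and the number of rings are assumptions, as are the function names.

```python
import math

def radial_spacings(n_rings, near=0.5, far=10.0):
    """Ring widths growing geometrically from 0.5 m (near field) to 10 m (far field)."""
    if n_rings == 1:
        return [near]
    ratio = (far / near) ** (1.0 / (n_rings - 1))
    return [near * ratio ** i for i in range(n_rings)]

def vertical_layer_count(chimney_height_m, layer_m=5.0, height_factor=5):
    """Number of 5 m layers needed to reach five times the chimney height."""
    return math.ceil(height_factor * chimney_height_m / layer_m)

def needs_recalculation(old_conc, new_conc, rel_change=0.10):
    """Incremental strategy: recompute a cell only if concentration moved by > 10%."""
    if old_conc == 0:
        return new_conc != 0
    return abs(new_conc - old_conc) / abs(old_conc) > rel_change
```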
[0073] The spectral database undergoes rigorous version management and quality control. Each spectral entry must contain complete metadata: measurement date and time accurate to the second, instrument serial number and calibration certificate number, operator identification, and environmental temperature and humidity records. The database uses blockchain technology to store fingerprint information of important spectra, generating an immutable record with each modification. The query interface supports multi-condition searches, allowing data to be filtered by wavelength range, substance category, measurement time, and other dimensions. Search results are exported in standardized JSON-LD format, containing complete semantic annotation information.
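The idea of an immutable modification record can be illustrated with a simple hash-linked log. This is only a sketch of the linking principle using SHA-256 from the standard library; it is not a blockchain implementation, and the entry fields and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry

def entry_hash(payload, prev_hash):
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps(payload, sort_keys=True) + prev_hash
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_entry(chain, payload):
    """Append a modification record linked to the previous one."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev,
                  "hash": entry_hash(payload, prev)})

def verify_chain(chain):
    """Return True only if no entry has been altered or reordered."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(entry["payload"], prev):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing any stored fingerprint invalidates every later entry.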
[0074] Example 2: See Figure 3. The emission assessment model is constructed based on a three-dimensional spatial interpolation algorithm, using the Kriging method to establish the pollutant concentration field. The model input data comes from real-time monitoring values from a distributed sensor network, including data from eight monitoring points located at the chimney outlet section and auxiliary monitoring points positioned 50 meters, 100 meters, and 200 meters downstream. The spatial interpolation process considers anisotropy, setting a larger correlation length under the prevailing wind direction and employing an exponential variogram model in the vertical direction. The interpolation results generate a 500m × 500m × 200m computational domain grid. The grid resolution gradually decreases with distance from the chimney center, maintaining a fine 5-meter grid in the near field and transitioning to a coarse 20-meter grid in the far field. A sliding time window is established in the time dimension, with a window width of 1 hour and a sliding step of 10 minutes, containing at least 3600 valid sampling points within each window period.
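Two pieces of this paragraph lend themselves to a short sketch: the exponential variogram and the sliding time window. The variogram is written in one common convention, gamma(h) = nugget + (sill - nugget) * (1 - exp(-h / range)); the patent does not give parameter values or say which convention it uses, so the parameter names are assumptions. The window helper uses the 1-hour width and 10-minute step from the text.

```python
import math

def exponential_variogram(h, nugget, sill, range_m):
    """Semivariance at lag distance h under an exponential model."""
    return nugget + (sill - nugget) * (1.0 - math.exp(-h / range_m))

def window_starts(total_s, width_s=3600, step_s=600):
    """Start times of all full 1-hour windows sliding by 10 minutes."""
    return list(range(0, total_s - width_s + 1, step_s))

def window_is_valid(n_valid_samples, minimum=3600):
    """A window is usable only with at least 3600 valid sampling points."""
    return n_valid_samples >= minimum
```

Anisotropy can be represented by calling the variogram with a larger `range_m` along the prevailing wind direction than across it.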
[0075] The analysis of macroscopic emission characteristic regions employs a spatiotemporal convolutional neural network. The network input layer receives spatially interpolated 3D concentration field data, with the data structure being a floating-point tensor of width 50, height 40, and depth 30. The first hidden layer performs 3D convolution operations with a kernel size of 5×5×5 and a stride of 2×2×2 to extract local spatial features. The second hidden layer performs max pooling with a pooling region of 3×3×3 to reduce data dimensionality. The third hidden layer uses a gated recurrent unit structure, processing data from six consecutive time windows along the time axis, with 128 memory units. The output layer generates two types of prediction results: the trend of total pollutant variation over the next 3 hours is output in minute-level time series format, and the spatial diffusion range is represented by an isosurface grid, with isosurface intervals set to three levels: 50%, 80%, and 100% of the national standard limit.
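The tensor sizes after the first two layers follow from the standard output-size formula for convolution and pooling. The sketch below works through that arithmetic for the 50 × 40 × 30 input; zero padding and a pooling stride equal to the pooling size are assumptions, since the text does not state them.

```python
def output_size(n, kernel, stride, padding=0):
    """Standard output length of a convolution or pooling along one axis."""
    return (n + 2 * padding - kernel) // stride + 1

def conv_then_pool(shape):
    """Apply a 5x5x5 stride-2 convolution, then 3x3x3 max pooling, per axis."""
    conv = tuple(output_size(n, kernel=5, stride=2) for n in shape)
    pool = tuple(output_size(n, kernel=3, stride=3) for n in conv)
    return conv, pool

conv_shape, pool_shape = conv_then_pool((50, 40, 30))
print(conv_shape, pool_shape)  # (23, 18, 13) (7, 6, 4)
```

Under these assumptions each of the six time windows fed to the gated recurrent unit would carry a 7 × 6 × 4 spatial feature map per channel.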
[0076] The spectral acquisition system for the component fingerprinting model is equipped with an echelle grating spectrometer and a high-sensitivity CCD detector. The spectrometer operates in the 200-800 nm range with an optical resolution of 0.1 nm. The detector is cooled to an operating temperature of -30°C to reduce dark current noise. A synchronous triggering mechanism is implemented during spectral acquisition; the flue gas sampling probe and the spectrometer shutter are precisely synchronized via an optocoupler, with the sampling pulse width controlled within 100 milliseconds. Each spectral sample includes three repeated measurements, and the system automatically discards abnormal measurements whose dispersion exceeds 5%. Spectrometer wavelength calibration uses the standard emission lines of a mercury-argon lamp. The calibration process is performed automatically before daily operation, triggering an alarm when the calibration deviation exceeds 0.05 nm.
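The replicate screening and the calibration alarm reduce to two threshold checks. In this sketch the 5% criterion is interpreted as relative deviation from the median of the three repeats, which is an assumption; the text does not define how the deviation is measured. The function names are also hypothetical.

```python
from statistics import median

def screen_replicates(intensities, max_rel_dev=0.05):
    """Keep repeats within 5% of the median of the repeated measurements."""
    ref = median(intensities)
    if ref == 0:
        return list(intensities)
    return [v for v in intensities if abs(v - ref) / abs(ref) <= max_rel_dev]

def calibration_alarm(measured_nm, reference_nm, tolerance_nm=0.05):
    """Alarm when a lamp line is found more than 0.05 nm from its reference position."""
    return abs(measured_nm - reference_nm) > tolerance_nm
```

For instance, with the mercury line near 253.652 nm as reference, a measured peak at 253.72 nm would raise the alarm while one at 253.66 nm would not.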
[0077] Data processing for the microscopic component feature regions employs wavelet multi-resolution analysis. The original spectral signal first undergoes a 9th-order Daubechies wavelet transform with 5 decomposition levels. The first-level detail coefficients preserve high-frequency noise information for subsequent quality control, while the coefficients of the second to fourth levels participate in feature extraction. The fifth-level approximation coefficients characterize the spectral baseline. Adaptive thresholding is applied during noise reduction, with the threshold for each wavelet coefficient dynamically adjusted based on the noise level. The threshold is calculated using Stein's unbiased risk estimation. The reconstruction process uses only the coefficients of the third to fifth levels, preserving characteristic peak shapes while suppressing random noise. During the feature extraction stage, the second-order derivative spectrum of each band is calculated to enhance the separation of overlapping peaks. The derivative is calculated with the Savitzky-Golay filtering method, using a polynomial order of 5 and a window width of 11 data points.
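The Savitzky-Golay second derivative with polynomial order 5 and an 11-point window can be illustrated with a dependency-free sketch. It derives the filter weights from the least-squares normal equations using exact fractions and then applies them as a sliding dot product. The function names are illustrative, edge points are simply dropped (an assumption), and a production system would more likely call a tested library routine.

```python
from fractions import Fraction


def solve_exact(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def sg_second_derivative_weights(window=11, order=5):
    """Weights w_k such that sum(w_k * y_k) is the fitted 2nd derivative at the window centre."""
    half = window // 2
    xs = range(-half, half + 1)
    # Normal-equation matrix A^T A for the polynomial basis 1, x, ..., x^order.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(order + 1)]
           for i in range(order + 1)]
    # Solve (A^T A) u = e_2; the quadratic coefficient is then c2 = sum_k (sum_j u_j x_k^j) y_k.
    unit = [1 if i == 2 else 0 for i in range(order + 1)]
    u = solve_exact(ata, unit)
    # The second derivative of the fitted polynomial at x = 0 is 2 * c2.
    return [2 * sum(u[j] * x ** j for j in range(order + 1)) for x in xs]


def second_derivative(signal, spacing=1.0, window=11, order=5):
    """Second-derivative spectrum for the interior points of the signal."""
    weights = [float(w) for w in sg_second_derivative_weights(window, order)]
    half = window // 2
    return [sum(w * signal[i + k - half] for k, w in enumerate(weights)) / spacing ** 2
            for i in range(half, len(signal) - half)]


quadratic = [3 * i * i + 2 * i + 1 for i in range(20)]
derivs = second_derivative(quadratic)
```

For a spectrum sampled at 0.1 nm, `spacing=0.1` would give the derivative per nm squared. On the quadratic test signal every interior value comes out as 6, the exact second derivative.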
[0078] The construction of the pollutant fingerprint spectral database incorporates two data sources: standard reference material measurements and actual sample collection. Standard reference material measurements are performed under controlled conditions, using NIST standard reference materials to prepare samples with different concentration gradients. Each concentration level is measured 10 times, and the average value is taken. Actual samples are collected from dust collector hoppers and desulfurization byproducts in different thermal power plants. Samples undergo pretreatment such as microwave digestion and solvent extraction before spectral measurement. Each record in the database contains complete spectral metadata: measurement time accurate to the second, instrument operating parameters such as slit width and gain settings, environmental conditions including temperature and relative humidity, operator identification, and sample source code. Spectral data is stored as a floating-point array with a wavelength interval of 0.1 nm, and absorption intensity values are normalized to the 0-1 range.
[0079] The spectral matching algorithm employs an improved dynamic time warping technique. Before matching, both the query spectrum and the database spectrum are preprocessed, including wavelength alignment, intensity normalization, and baseline correction. The dynamic time warping constraint is set at a 45-degree slope, and the path weight matrix uses a symmetric triangular window function. A local feature enhancement strategy is introduced in the similarity calculation, assigning higher weights to characteristic peak regions and lower weights to background regions. The matching results generate a top-5 candidate list, with each candidate entry displaying its similarity score, matching status of major characteristic peaks, and source information. For multi-component mixed spectra, non-negative least squares linear decomposition is used to solve for the contribution ratio of each component, and the decomposition residual is used as a matching quality evaluation index.
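A simplified view of the matching step is sketched below. It uses plain dynamic time warping restricted to a diagonal band and ranks library entries by distance. The triangular path weights, the peak-region weighting, and the non-negative least squares decomposition from the text are not reproduced; the band width, the toy spectra, and the names `dtw_distance` and `top_candidates` are assumptions for illustration.

```python
def dtw_distance(query, reference, band=3):
    """Dynamic time warping distance with warping limited to |i - j| <= band."""
    n, m = len(query), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            step = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def top_candidates(query, library, k=5):
    """Names of the k library spectra closest to the query (smallest distance first)."""
    scored = sorted((dtw_distance(query, spectrum), name)
                    for name, spectrum in library.items())
    return [name for _, name in scored[:k]]


# Toy, already-normalized spectra (hypothetical data).
library = {
    "peak_near_3": [0, 0, 0, 1, 0, 0],
    "peak_near_5": [0, 0, 0, 0, 0, 1],
    "flat": [0, 0, 0, 0, 0, 0],
}
query = [0, 0, 1, 0, 0, 0]
ranking = top_candidates(query, library, k=2)
print(ranking)  # ['peak_near_3', 'flat']
```

The query peak is shifted by one sample relative to `peak_near_3`, and the warping absorbs that shift at zero cost, which is the behaviour that motivates time warping for spectra with small wavelength drift.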
[0080] A synergistic analysis of total emissions and component concentrations establishes a mass conservation verification mechanism. The hourly SO2 emissions output from the macroscopic model are converted to molar amounts, and the concentrations of various sulfides (sulfates, sulfites, etc.) identified by the microscopic model are also converted to molar equivalents; the difference between the two should be controlled within 15%. The verification process considers the distribution ratio of sulfur between the gas and particulate phases, employing different phase equilibrium constants depending on the temperature conditions. When significant deviations are detected, the system automatically checks the synchronization of data acquisition time, instrument calibration status, and analytical parameter settings, generating a difference analysis report to identify potential problems.
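The sulfur balance check can be written out as a small worked example. This sketch assumes that both model outputs have already been expressed as masses over the same hour, that each listed species carries one sulfur atom, and that the 15% tolerance is taken relative to the macroscopic value. The phase-equilibrium correction mentioned in the text is omitted. Molar masses are approximate standard values.

```python
# Approximate molar masses in g/mol; each species contains one sulfur atom.
MOLAR_MASS = {"SO2": 64.066, "sulfate": 96.06, "sulfite": 80.06}


def sulfur_moles(masses_g):
    """Total moles of sulfur in a {species: mass in grams} mapping."""
    return sum(mass / MOLAR_MASS[species] for species, mass in masses_g.items())


def closure_gap(macro_so2_g, micro_species_g):
    """Relative difference between macro and micro sulfur amounts (relative to macro, assumed)."""
    macro = macro_so2_g / MOLAR_MASS["SO2"]
    micro = sulfur_moles(micro_species_g)
    return abs(macro - micro) / macro


# Hypothetical hour: about 1000 mol S from the macro model, 900 mol S from the micro model.
gap = closure_gap(64066.0, {"sulfate": 48030.0, "sulfite": 32024.0})
within_tolerance = gap <= 0.15
print(round(gap, 3), within_tolerance)  # 0.1 True
```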
[0081] The model output is presented using interactive 3D visualization technology. The total pollutant distribution is projected onto the digital elevation model using a color map, with the color gradient transitioning from blue (low concentration) to red (high concentration). The transparency of the isosurfaces dynamically adjusts according to the concentration value. Component identification results are displayed in the form of molecular structure diagrams, annotating the positions of characteristic absorption peaks and the names of matching substances. The visualization system supports multi-view viewing, including top-down plan views, vertical cross-section views, and 3D perspective views, all of which are linked in real time. Users can click on any grid point to view detailed data, including concentration values, component composition, and uncertainty assessments.
[0082] The real-time data processing pipeline implements a quality control-in-the-loop mechanism. Data quality checkpoints are set at each processing stage: signal-to-noise ratio and baseline flatness are checked during spectral preprocessing; peak symmetry and full width at half maximum (FWHM) stability are monitored during feature extraction; and goodness of fit and residual distribution are evaluated during matching analysis. Quality indicators are displayed in real-time on a monitoring dashboard. Abnormal situations trigger tiered alarms: minor anomalies are logged, moderate anomalies prompt operators for inspection, and severe anomalies suspend the automatic analysis process. All alarm events record complete contextual data, supporting post-event traceability analysis.
[0083] Historical data for component analysis is stored in a time-series database. Complete analytical results for each sample include the raw spectrum, preprocessed spectrum, feature extraction parameters, matching results, and validation indices, organized into queryable data blocks by timestamp. The database supports complex queries, such as retrieving the frequency of a specific substance within a given time period or comparing component differences between different units. Data export formats are compatible with common analytical software, supporting multiple formats including CSV, JSON, and HDF5 to meet the needs of various application scenarios.
[0084] Example 3: See Figure 4. The core of the model coordination mechanism lies in establishing a dynamic balance between the outputs of the two-level models. The system constructs a multi-dimensional evaluation matrix to quantify the consistency level between the macro-emission assessment unit and the micro-component identification unit. This matrix includes six dimensions: the time-series correlation coefficient reflects the synchronization between the trend of total pollutant changes and component concentration fluctuations; the spatial distribution consistency index compares the matching degree between the concentration gradient direction and the mass diffusion pattern; the mass conservation balance value calculates the mass closure degree between the macro-total and micro-component levels of key elements such as sulfur and nitrogen; the error propagation factor assesses the amplification effect of sensor noise between the two-level models; the confidence interval overlap rate counts the proportion of intersection between two sets of results within the uncertainty range; and the decision conflict frequency records the number of times contradictory conclusions are generated in historical analysis. Each dimension indicator is standardized to eliminate dimensional differences, ultimately forming a six-dimensional hypercube-shaped evaluation structure.
[0085] The deviation detection algorithm implements a continuous monitoring strategy. Every five minutes, the system calculates the Euclidean distance modulus of the current evaluation matrix and compares it with the reference modulus under baseline conditions. The reference modulus is determined through historical data analysis, taking the upper limit of the 95% confidence interval during the system's stable operation phase. When the real-time modulus exceeds the reference value, a coordination mechanism is triggered; the coordination strength is proportional to the degree of modulus deviation. The formula for calculating the modulus is as follows:
[0086]
[0087] Where Φ represents the compatibility modulus, w_i represents the weight coefficient of the i-th dimension, x_i is the current indicator value, μ_i is the baseline average, and σ_i is the historical standard deviation. The weighting coefficients are dynamically adjusted according to the type of pollutant. In the correlation analysis of sulfur oxides, the weight of the mass conservation dimension is set to 0.3, while the nitrogen oxide analysis is given a higher weight of 0.35 for time series correlation.
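The formula itself is not reproduced in this text. One form consistent with the symbol definitions and with the description of a Euclidean distance modulus is a weighted norm of the standardized deviations, Φ = sqrt(Σ w_i ((x_i − μ_i) / σ_i)²) summed over the six dimensions. The sketch below uses that form as an assumption; it is not the patent's stated equation. The non-sulfur weights, the baseline values, and the `gain` parameter are also hypothetical.

```python
import math


def compatibility_modulus(values, baseline_means, baseline_stds, weights):
    """Assumed form: sqrt(sum_i w_i * ((x_i - mu_i) / sigma_i) ** 2)."""
    total = 0.0
    for x, mu, sigma, w in zip(values, baseline_means, baseline_stds, weights):
        z = (x - mu) / sigma
        total += w * z * z
    return math.sqrt(total)


def coordination_strength(modulus, reference, gain=1.0):
    """Strength proportional to the excess over the reference modulus, zero below it."""
    return gain * max(0.0, modulus - reference)


# Dimension order follows the text; index 2 is mass conservation (weight 0.3 for sulfur oxides).
weights = [0.14, 0.14, 0.30, 0.14, 0.14, 0.14]  # other weights are assumed
means = [0.8, 0.7, 0.9, 1.0, 0.6, 2.0]          # hypothetical baselines
stds = [0.1, 0.1, 0.05, 0.2, 0.1, 1.0]          # hypothetical standard deviations
current = [m + s for m, s in zip(means, stds)]  # every indicator one sigma above baseline
phi = compatibility_modulus(current, means, stds, weights)
print(round(phi, 6))  # 1.0
```

With every indicator one standard deviation from its baseline and weights summing to one, the modulus is 1, which gives a convenient scale for choosing the reference value.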
[0088] The parameter adjustment process employs a hierarchical optimization strategy. The first layer targets the total emissions assessment model, dynamically adjusting the smoothing coefficient and trend term weights of its spatial interpolation algorithm. The smoothing coefficient controls the smoothness of the concentration field, with an initial value of 0.7 and an adjustment range limited to 0.5-0.9. The trend term weights influence the contribution ratio of the background field, with a baseline value of 0.4, allowing fluctuations between 0.3-0.6. The second layer targets the component fingerprinting model, primarily adjusting the similarity threshold and peak shape matching strictness of spectral matching. The similarity threshold is initially set at 85%, fine-tuned within the 80%-90% range based on coordination requirements. The peak shape matching strictness controls the tolerance for characteristic peak position shifts, employing a five-level adjustment mechanism, with each level corresponding to a wavelength tolerance of 0.2 nm.
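The bounded adjustment ranges can be expressed as a small state object that clamps every change. This is a sketch under stated assumptions: the starting strictness level of 3 is invented, and the tolerance is taken to be the level multiplied by 0.2 nm, which is one reading of the five-level mechanism. `TuningState` and its method names are hypothetical.

```python
from dataclasses import dataclass


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class TuningState:
    smoothing: float = 0.7             # layer 1, allowed 0.5-0.9
    trend_weight: float = 0.4          # layer 1, allowed 0.3-0.6
    similarity_threshold: float = 0.85 # layer 2, allowed 0.80-0.90
    strictness_level: int = 3          # layer 2, levels 1-5 (starting level assumed)

    def adjust(self, d_smoothing=0.0, d_trend=0.0, d_similarity=0.0, d_level=0):
        """Apply requested deltas, keeping every parameter inside its range."""
        self.smoothing = clamp(self.smoothing + d_smoothing, 0.5, 0.9)
        self.trend_weight = clamp(self.trend_weight + d_trend, 0.3, 0.6)
        self.similarity_threshold = clamp(self.similarity_threshold + d_similarity, 0.80, 0.90)
        self.strictness_level = int(clamp(self.strictness_level + d_level, 1, 5))

    def peak_tolerance_nm(self):
        """Assumed mapping: each level adds 0.2 nm of peak-position tolerance."""
        return round(self.strictness_level * 0.2, 1)


state = TuningState()
state.adjust(d_smoothing=0.5, d_trend=-0.3, d_similarity=0.01, d_level=4)
print(state, state.peak_tolerance_nm())
```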
[0089] The feedback control system employs a dual-loop adjustment architecture. The inner loop responds quickly to short-term fluctuations, checking the rate of change of the coordination modulus every 30 seconds. When the rate of change exceeds 5% per minute, rapid adjustment is initiated, primarily adjusting the model's calculation frequency and data sampling interval. The outer loop handles systematic deviations, performing a comprehensive calibration daily and recalculating the baseline parameters and weighting coefficients of the evaluation matrix. A buffer mechanism is established between the two loops to prevent oscillations caused by frequent adjustments. Adjustment commands are transmitted via a message queue, with each adjustment action accompanied by complete contextual information, including trigger conditions, adjustment parameters, and expected results.
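The inner-loop trigger can be sketched as a rate check with a cool-down. The text gives the 30-second check interval and the 5% per minute threshold; the interpretation of the rate as a relative change, the use of its absolute value, and the 120-second cool-down standing in for the buffer mechanism are assumptions.

```python
def rate_of_change_per_minute(previous, current, interval_s=30.0):
    """Relative change of the modulus, scaled to a per-minute rate."""
    return (current - previous) / previous * (60.0 / interval_s)


def needs_fast_adjustment(previous, current, interval_s=30.0, limit=0.05):
    """True when the modulus changes faster than 5% per minute in either direction."""
    return abs(rate_of_change_per_minute(previous, current, interval_s)) > limit


class FastLoop:
    """Inner loop with a cool-down so that adjustments cannot fire back to back."""

    def __init__(self, cooldown_s=120.0):  # cool-down length is an assumed value
        self.cooldown_s = cooldown_s
        self.last_action_s = None

    def step(self, now_s, previous, current):
        if not needs_fast_adjustment(previous, current):
            return False
        if self.last_action_s is not None and now_s - self.last_action_s < self.cooldown_s:
            return False  # still inside the buffer period
        self.last_action_s = now_s
        return True


loop = FastLoop()
decisions = [loop.step(0, 1.00, 1.03), loop.step(30, 1.03, 1.07), loop.step(150, 1.00, 1.03)]
print(decisions)  # [True, False, True]
```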
[0090] The historical data analysis module maintains a circular buffer to store operational data for the most recent 30 days. The buffer employs a time-slicing storage strategy, dividing the data into five-minute segments. Each segment contains the original input data, intermediate model results, and final output values. Data analysis uses a sliding window method with a window width of 24 hours and a step of 12 hours, calculating the moving average and trend slope of each indicator. The system automatically identifies abnormal patterns, such as persistent unidirectional deviations or periodic oscillations, and generates corresponding adjustment suggestions. Buffer data is periodically archived to a long-term storage system, supporting multi-dimensional retrieval by time range, pollutant type, and unit number.
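The buffer sizing and the windowed statistics follow directly from the figures in the text: 288 five-minute segments per day, 8640 segments for 30 days, a 288-segment window, and a 144-segment step. The sketch below stores one scalar indicator per segment, which is a simplification, and computes the trend slope by ordinary least squares (an assumed choice).

```python
from collections import deque

SEGMENTS_PER_DAY = 24 * 60 // 5          # 288 five-minute segments
BUFFER_SEGMENTS = 30 * SEGMENTS_PER_DAY  # 8640 segments for 30 days


def slope(values):
    """Least-squares trend slope, in indicator units per segment."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def window_stats(buffer, width=SEGMENTS_PER_DAY, step=SEGMENTS_PER_DAY // 2):
    """(moving average, trend slope) for each 24-hour window advanced in 12-hour steps."""
    data = list(buffer)
    stats = []
    for start in range(0, len(data) - width + 1, step):
        window = data[start:start + width]
        stats.append((sum(window) / width, slope(window)))
    return stats


buffer = deque(maxlen=BUFFER_SEGMENTS)  # oldest segments drop out automatically
for i in range(600):                    # hypothetical, steadily rising indicator
    buffer.append(float(i))
stats = window_stats(buffer)
print(len(stats), stats[0])
```

A persistent unidirectional deviation would show up as slopes of the same sign across consecutive windows, as in this synthetic series.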
[0091] The coordination process is visualized using a parallel coordinate graph to display the six-dimensional evaluation matrix. Each line represents the model state at a given time point, with color coding indicating the magnitude of the coordination modulus, transitioning from blue (normal) to red (severe deviation). The monitoring interface features interactive filters, allowing operators to select specific dimensions for detailed analysis. Key parameter changes are displayed as band charts, highlighting the time points and magnitudes of important adjustment events. All visualization elements support dynamic refresh, with update frequency synchronized with data acquisition.
[0092] The anomaly handling mechanism implements a tiered response strategy. Level 1 anomalies refer to a brief exceedance of a single dimension indicator; the system automatically logs this and marks data points requiring attention. Level 2 anomalies involve simultaneous deviations across multiple dimensions, triggering an early warning notification and suggesting adjustment measures. Level 3 anomalies indicate a fundamental contradiction between models, suspending the automated analysis process and requiring manual intervention for verification. Each anomaly type is associated with corresponding remedial measures, including data re-verification, model restart, and sensor calibration. Anomaly events are categorized and stored according to severity, forming a knowledge base for improving coordination strategies.
[0093] The system maintenance interface provides manual adjustment functionality. Operators can temporarily override automatic adjustment parameters and set...
[0094] The system offers various model configuration combinations. Manual adjustments are divided into trial mode and effective mode. Trial mode allows simulation of adjustments without affecting actual operation, while effective mode immediately applies changes to the production environment. All manual operations are logged in detail, including operator identity, modification time, original parameter values, and new parameter values. The system periodically analyzes the effectiveness of manual adjustments and incorporates successful cases into the automatic adjustment rule base.
[0095] The version control system manages the iterative updates of model parameters. Each major adjustment results in a new configuration version, with version numbers following a semantic numbering rule. The major version number indicates an algorithm architecture change, the minor version number identifies parameter combination adjustments, and the revision number records minor optimizations. The version repository stores complete configuration snapshots and performance benchmark data, supporting quick rollback to any historical version. Version difference comparison tools visualize the parameter change path, aiding in the analysis of the effectiveness of adjustment strategies.
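The numbering rule can be sketched as a small helper. The three change categories mirror the text; the string labels and the function name `bump` are illustrative.

```python
def bump(version: str, change: str) -> str:
    """Return the next configuration version for a given kind of change."""
    major, minor, revision = (int(part) for part in version.split("."))
    if change == "architecture":   # algorithm architecture change -> major
        return f"{major + 1}.0.0"
    if change == "parameters":     # parameter combination adjustment -> minor
        return f"{major}.{minor + 1}.0"
    if change == "optimization":   # minor optimization -> revision
        return f"{major}.{minor}.{revision + 1}"
    raise ValueError(f"unknown change type: {change}")


print(bump("1.4.2", "parameters"))  # 1.5.0
```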
[0096] The coordination mechanism is evaluated using cross-validation. The system retains 10% of the real-time data as a validation set, which is not used in the tuning process. Validation set data is periodically input into the currently configured model to calculate its performance metrics on unseen data. The validation results are compared with the training set performance to detect overfitting or underfitting. When the validation set performance degrades beyond a predetermined threshold, a configuration review process is triggered to reassess the generalization ability of the tuning strategy.
[0097] Inter-model communication employs a standardized data format. Macroscopic model outputs of total pollutant assessments are converted into a unified pollutant equivalent representation, including structured fields such as concentration values, uncertainty, and spatiotemporal range. Microscopic model-generated component identification results utilize a standard substance coding system, with each identification result accompanied by a matching score and a list of alternative substances. Data exchange is achieved through a shared memory region, ensuring the timeliness of large-volume data transfer, while a read-write lock mechanism prevents concurrent access conflicts.
[0098] The coordination strategy is optimized using reinforcement learning. The system treats each adjustment as a state-action pair, recording the performance improvement after adjustment as a reward signal. The policy network employs a deep Q-learning framework, with an input layer receiving a six-dimensional evaluation matrix, a fully connected layer with 128 hidden nodes, and an output layer predicting the expected reward for each adjustment action. Training utilizes experience replay technology, randomly sampling from historical adjustment records to construct training batches. Policy updates are performed offline and deployed to the production system after thorough validation.
[0099] Example 4: See Figure 5. The meteorological data access layer of the pollutant migration prediction system is configured with a multi-source data fusion mechanism. The system accesses real-time data from the China Meteorological Administration's GRAPES numerical weather prediction system, the US NCEP global forecast system, and local meteorological station observations. After spatiotemporal alignment, these data sources are combined to generate wind field analysis results with a 1-kilometer resolution. The data fusion process employs a confidence-weighted algorithm: for forecasts within 6 hours, local observations are assigned a 70% weight; forecasts between 6 and 12 hours use a balanced weighting of numerical models and observations; forecasts exceeding 12 hours primarily rely on numerical model results. The system establishes a dedicated quality control module to remove obviously anomalous abrupt changes in wind speed (such as instantaneous wind speeds jumping from 2 m/s to 20 m/s and then immediately falling back).
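The lead-time-dependent weighting and the spike screening can be sketched for a single scalar wind speed. The 70% and balanced weights come from the text; the 20% observation weight beyond 12 hours, the 10 m/s jump threshold, and the choice to replace a spike with the mean of its neighbours rather than delete it are assumptions. A real wind field would fuse vector components on a grid.

```python
def observation_weight(lead_hours):
    """Weight given to local observations as a function of forecast lead time."""
    if lead_hours <= 6:
        return 0.7   # from the text
    if lead_hours <= 12:
        return 0.5   # balanced weighting
    return 0.2       # assumed value for "primarily numerical model"


def fuse(observed, modelled, lead_hours):
    """Confidence-weighted blend of an observed and a modelled wind speed."""
    w = observation_weight(lead_hours)
    return w * observed + (1 - w) * modelled


def remove_spikes(speeds, jump=10.0):
    """Replace single-sample spikes (jump up and straight back) by the neighbour mean."""
    cleaned = list(speeds)
    for i in range(1, len(speeds) - 1):
        before, here, after = speeds[i - 1], speeds[i], speeds[i + 1]
        if (abs(here - before) > jump and abs(here - after) > jump
                and abs(after - before) <= jump):
            cleaned[i] = (before + after) / 2
    return cleaned


print(remove_spikes([2.0, 2.0, 20.0, 2.0, 2.0]))  # [2.0, 2.0, 2.0, 2.0, 2.0]
```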
[0100] The fluid dynamics simulation module employs computational fluid dynamics (CFD) methods to establish a three-dimensional diffusion model. The computational domain is centered on the chimney, covering a circular area with a radius of 5 km and extending vertically to the top of the boundary layer. An adaptive octree structure is used for mesh generation, with the mesh size near the chimney refined to 10 meters and gradually increasing to 200 meters with increasing distance. Boundary conditions include: chimney outlet parameters (diameter 3 meters, flow velocity 15 m/s, flue gas temperature 140 °C), terrain elevation data (1:50,000-scale digital elevation model), and surface roughness classification (based on 30-meter resolution land use data). The turbulence model is the realizable k-ε model, and the standard wall function is used to handle near-surface flow.
[0101] The pollutant diffusion calculation employs a phased approach. The first phase solves the steady-state flow field iteratively until the residual converges to the order of 10^-4, obtaining the background wind field distribution. The second phase incorporates the pollutant transport equation, using a second-order upwind scheme to discretize the convection term with a time step of 1 second. The third phase considers both dry and wet deposition effects, setting the dry deposition velocity of sulfur dioxide to 0.8 cm/s and dynamically adjusting the wet deposition coefficient based on precipitation intensity. The calculations are executed in parallel on a GPU cluster, with each node processing a sector-shaped region, and data exchange between nodes is achieved via MPI.
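Two quantities implied by these settings can be worked out directly. The Courant number combines the 1-second time step with the 15 m/s outlet velocity and 10-meter mesh given for the chimney region; whether it constrains the solver depends on the time integration scheme, which the text does not state. The dry deposition flux uses the standard relation of flux equal to deposition velocity times near-surface concentration, with 150 μg/m³ taken only as an example value.

```python
def courant_number(velocity_m_s, dt_s, cell_m):
    """Courant number u * dt / dx for one cell."""
    return velocity_m_s * dt_s / cell_m


def dry_deposition_flux(concentration_ug_m3, velocity_cm_s=0.8):
    """Deposition flux in ug per m^2 per s: concentration times deposition velocity."""
    return concentration_ug_m3 * velocity_cm_s / 100.0  # cm/s -> m/s


print(courant_number(15.0, 1.0, 10.0))  # 1.5
print(dry_deposition_flux(150.0))       # about 1.2 ug/(m^2 s)
```

A value above 1 in the finest cells would only matter for an explicit scheme, so this is a consistency check rather than a statement about the patented solver.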
[0102] The spatiotemporal overlay analysis module correlates exceedance warning results with meteorological fields in multiple dimensions. The system establishes a spatiotemporal index structure to classify areas with pollutant concentrations exceeding standards (e.g., SO2 exceeding 150 μg/m³). The analysis is performed by overlaying the pollutant flux diagram with a concurrent wind rose diagram. The process identifies the dominant transport pathway and calculates the pollutant flux contribution rate at various azimuth angles. For persistent exceedance events, the system retrospectively analyzes meteorological conditions over a 72-hour period to establish a table showing the correlation between pollution accumulation and changes in meteorological parameters. This table records the trajectory of key parameter changes, providing a reference model for subsequent forecasts.
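The per-azimuth contribution rate can be sketched by binning flux samples into compass sectors. The 16-sector division is a common wind rose convention and is assumed here; the text does not state the number of sectors. Each sample is a hypothetical (direction in degrees, flux) pair.

```python
def sector_index(direction_deg, sectors=16):
    """Index of the compass sector centred on multiples of 360/sectors degrees."""
    width = 360.0 / sectors
    return int(((direction_deg % 360.0) + width / 2) // width) % sectors


def flux_contributions(samples, sectors=16):
    """Share of the total pollutant flux carried in each azimuth sector."""
    totals = [0.0] * sectors
    for direction_deg, flux in samples:
        totals[sector_index(direction_deg, sectors)] += flux
    grand_total = sum(totals)
    return [t / grand_total if grand_total else 0.0 for t in totals]


samples = [(350.0, 2.0), (5.0, 1.0), (90.0, 1.0)]  # hypothetical flux samples
shares = flux_contributions(samples)
dominant = max(range(16), key=lambda i: shares[i])
print(dominant, shares[dominant])  # 0 0.75
```

Directions of 350° and 5° both fall in the northern sector, so the wrap-around at 360° does not split the dominant pathway.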
[0103] The visualization output of the prediction results employs hierarchical rendering technology. The GIS engine divides the concentration prediction values into five levels, each corresponding to different display colors and transparency. The underlying map integrates administrative divisions, the distribution of sensitive points (such as schools and hospitals), and the locations of real-time monitoring stations. The pollutant diffusion animation uses particle tracking technology, with each particle representing a specific mass of pollutant, and its trajectory updated in real time based on the calculated flow field. The animation playback speed is adjustable, allowing for detailed observation of the transport and diffusion process of the pollution plume. All graphic elements support click-to-query functionality, displaying the predicted concentration value, arrival time, and duration at that location.
[0104] The emergency response system automatically generates response recommendations based on forecast results. When a forecast indicates that pollutants may affect sensitive areas, the system triggers a tiered response process. Level 1 response (blue alert) recommends increasing monitoring frequency; Level 2 response (yellow alert) initiates pre-notification of surrounding sensitive areas; Level 3 response (orange alert) implements emission reduction measures and prepares evacuation plans. Each response level is associated with a specific action list, including the responsible departments to be notified, recommended measures, and implementation deadlines. The system automatically tracks the implementation of measures and records the degree of match between the actual response time and the predicted event.
[0105] The predictive model is updated and maintained using a dynamic calibration mechanism. The system continuously compares the deviations between predicted concentrations and actual monitored values, establishing an error statistics database. Model parameter calibration is performed daily, primarily adjusting the turbulent diffusion coefficient and deposition velocity parameters. A comprehensive validation is conducted monthly, using independent datasets to test the model's predictive capabilities under different meteorological conditions. Validation results generate a model performance index report, including key parameters such as hit rate, false alarm rate, and lead time.
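The report metrics can be computed from a simple event tally. The text does not define the false alarm rate; this sketch assumes the common forecast-verification ratio of false alarms to all issued warnings, and takes lead time as the mean interval between a warning and the observed exceedance.

```python
def verification_scores(hits, misses, false_alarms):
    """Return (hit rate, false alarm ratio) from event counts."""
    observed = hits + misses
    warned = hits + false_alarms
    hit_rate = hits / observed if observed else 0.0
    # Assumed definition: fraction of issued warnings that did not verify.
    false_alarm_ratio = false_alarms / warned if warned else 0.0
    return hit_rate, false_alarm_ratio


def mean_lead_time(lead_minutes):
    """Average warning lead time in minutes over verified events."""
    return sum(lead_minutes) / len(lead_minutes) if lead_minutes else 0.0


print(verification_scores(hits=8, misses=2, false_alarms=2))  # (0.8, 0.2)
```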
[0106] The historical case library stores complete records of typical pollution events. Each case includes initial emission conditions, meteorological background field, forecast process data, and actual impact range. Case retrieval supports multi-condition queries, such as filtering by season, querying by pollutant type, or sorting by impact severity. Selecting a case allows replaying the entire forecast process and comparing the differences in forecast results under different parameter settings. The case library is updated regularly with new representative real-world events.
[0107] Mobile terminal access services allow on-site personnel to view prediction results in real time. Predicted data is compressed and pushed to mobile devices, supporting offline viewing of key information. The mobile interface simplifies complex graphics, highlighting the impact level and protection recommendations for the current location. The location service automatically matches the user's location with the predicted impact area, triggering vibration alerts when the user enters a high-risk zone. All mobile-terminal operations are logged together with trajectory information for post-event analysis and assessment.
[0108] The data archiving system implements full-lifecycle management. Raw meteorological data is retained for three months, feature-extracted data is kept for one year, and forecast results are permanently stored. Archived data uses a columnar storage format, coupled with efficient compression algorithms to reduce storage space usage. The data retrieval interface supports batch export to meet the analytical needs of different departments. The archiving process implements integrity verification to ensure data traceability and tamper-proofness.
[0109] The system interfaces with the ambient air quality forecasting platform to achieve two-way data exchange. On one hand, it receives background pollution field data over a large area as boundary conditions for local forecasts; on the other hand, it uploads local forecast results, contributing them to the regional ensemble forecasting system. Data exchange uses the standard NetCDF format and is automatically synchronized hourly. Data consistency checks are implemented during the interface process to ensure a unified spatiotemporal reference.
[0110] Example 5: The closed-loop control execution system adopts a layered distributed architecture, deeply integrated with the existing control system of the thermal power generator set. The system deploys a high-speed data acquisition module at the physical layer, directly connecting to the OPC UA interface of the generator set's DCS to read key parameters of the boiler combustion control system, flue gas purification system, and auxiliary equipment in real time. The data acquisition frequency is set according to the parameter characteristics; core parameters such as burner oxygen content and main steam pressure are acquired at 1-second intervals, while auxiliary parameters such as dust collector differential pressure and desulfurization tower pH value are acquired at 5-second intervals. The acquisition module has a built-in signal conditioning circuit to digitize 4-20 mA analog signals and also supports the Modbus TCP protocol for reading data from smart instruments. All acquired data is labeled with a quality code to distinguish normal values, out-of-range values, and communication anomalies.
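The conversion of a 4-20 mA signal to engineering units with a quality code can be sketched as follows. The linear scaling is the standard convention for such loops; the 0.1 mA tolerance, the quality code strings, and the use of `None` to represent a lost reading are assumptions.

```python
def scale_4_20ma(current_ma, low, high, tolerance_ma=0.1):
    """Convert loop current to engineering units and attach a quality code."""
    if current_ma is None:            # no reading received (assumed representation)
        return None, "COMM_FAULT"
    value = low + (current_ma - 4.0) / 16.0 * (high - low)
    if current_ma < 4.0 - tolerance_ma or current_ma > 20.0 + tolerance_ma:
        return value, "OUT_OF_RANGE"  # e.g. broken wire or saturated transmitter
    return value, "GOOD"


print(scale_4_20ma(12.0, 0.0, 100.0))  # (50.0, 'GOOD')
```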
[0111] The transmission of pollutant exceedance warning signals employs a dual-channel approach, combining hard-wired and soft-communication methods. The hard-wired channel connects to the unit's safety interlock system via a relay output module. A passive dry contact signal is triggered when a Level 1 exceedance (exceeding the national standard limit) occurs, and an audible and visual alarm is activated when a Level 2 exceedance (exceeding the limit by 80%) occurs. The soft-communication channel transmits structured alarm information via the plant's fiber optic ring network, including fields such as the name of the exceeding substance, current concentration, exceedance magnitude, and duration. Alarm information is encapsulated according to the OPC AE (Alarms and Events) standard format to ensure compatibility with control systems from different manufacturers. Alarm priorities are divided into three levels, each corresponding to different response time requirements. The highest-level alarm requires delivery to the main control panel in the control room within 10 seconds.
[0112] The combustion optimization control algorithm employs a multivariate predictive control framework, establishing a dynamic matrix model with 28 input variables and 15 output variables. Input variables cover fuel characteristic parameters (such as volatile matter and sulfur content in the coal fed into the furnace), combustion condition parameters (such as primary air ratio and pulverizer outlet temperature), and environmental protection facility status (such as electrostatic precipitator secondary current and ammonia slip rate in the denitrification system). Output variables focus on three pollutant control indicators (NOx, SO2, and dust concentration) and three operational economic indicators (boiler efficiency, coal consumption for power generation, and plant power consumption rate). Model identification is based on historical unit operating data, using a subspace identification method to extract the state-space equations, and the model parameters are retrained quarterly. Constraints are added during the control algorithm solution process to ensure that adjustments do not cause steam parameters to exceed limits or auxiliary equipment to overload.
[0113] The fuel supply regulation system implements a feedforward-feedback composite control strategy. The feedforward stage dynamically adjusts the coal feeding ratio of different coal bunkers based on online coal quality monitoring data. When an increase in sulfur content is detected, it automatically increases the blending ratio of low-sulfur coal. The feedback stage receives real-time pollutant concentration monitoring values and adjusts the feeder speed via a PID controller. The control loop is equipped with an anti-windup function that prevents integral saturation. Fuel regulation commands must undergo safety verification before being issued. Verification includes checking the mill outlet temperature limit, coal bunker level alarm status, and conveyor belt operation status. All regulation actions are recorded in a complete operation log, including timestamps, command values, actual execution values, and operator confirmation status.
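The feedback stage with integral saturation protection can be sketched as a PID controller using conditional integration, one common anti-windup technique. The text does not say which technique is used, so this choice, the gains, and the output limits are assumptions. The feedforward blending logic and the safety verification are not shown.

```python
class AntiWindupPID:
    """PID controller that stops integrating while the output is saturated."""

    def __init__(self, kp, ki, kd, out_min, out_max):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement, dt):
        error = setpoint - measurement
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        candidate = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate + self.kd * derivative
        output = max(self.out_min, min(self.out_max, raw))
        # Accept the new integral only if the output is not saturated,
        # or if the error would drive the output back into range.
        recovering = (raw > self.out_max and error < 0) or (raw < self.out_min and error > 0)
        if raw == output or recovering:
            self.integral = candidate
        return output


pid = AntiWindupPID(kp=1.0, ki=0.5, kd=0.0, out_min=0.0, out_max=10.0)
for _ in range(100):                      # long saturation: large persistent error
    saturated = pid.update(setpoint=100.0, measurement=0.0, dt=1.0)
integral_after_saturation = pid.integral
recovered = pid.update(setpoint=100.0, measurement=95.0, dt=1.0)
print(saturated, integral_after_saturation, recovered)  # 10.0 0.0 7.5
```

Because the integral term did not accumulate during the saturated period, the output leaves the limit as soon as the error becomes small, instead of staying pinned while a large integral unwinds.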
[0114] The air supply control system establishes an air-coal coordination mechanism. Primary air volume adjustment is based on a baseline setting according to the pulverizer load characteristic curve, and then fine-tuned based on NOx concentration monitoring values. Secondary air volume distribution adopts a hierarchical control strategy, dividing the combustion zone into four control areas, with the damper opening of each area adjusted independently. The air volume measurement device is calibrated periodically, and the calibration data is automatically updated to the compensation parameters of the control system. Air volume adjustment and fuel changes are decoupled; when a rapid change in fuel quantity is detected, it automatically switches to a special control mode to prevent combustion instability.
[0115] The flue gas recirculation system implements adaptive PID control. The control system dynamically adjusts the PID parameters based on load commands and the NOx concentration change rate, employing weaker integral action at high loads and enhanced derivative control at low loads. Acceleration limits are set for the circulating fan speed regulation to prevent excessively rapid action that could cause fluctuations in furnace negative pressure. The system monitors the oxygen content in the circulating flue gas in real time, automatically reducing the recirculation rate when an abnormal decrease in oxygen is detected. Key control parameters, such as proportional band and integral time, are stored in non-volatile memory, retaining the latest settings even after power failure.
[0116] The ammonia injection control in the denitrification system employs a cascade control structure. The main loop calculates the basic ammonia demand based on the deviation between the NOx concentration measured by CEMS and the set value, while the secondary loop dynamically compensates based on changes in flue gas flow and temperature. Ammonia slip monitoring values are used as constraints in the control calculations; when the slip amount approaches the alarm value, the ammonia injection rate is automatically reduced. The control system incorporates a spray gun blockage detection function, identifying abnormal spray guns by analyzing differences in ammonia flow rates across different branches. Pressure regulation at the ammonia station utilizes pressure-flow coordinated control to maintain the main pipe pressure stable within the range of 0.3 MPa ± 5%.
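The branch-flow comparison and the header pressure band can be sketched as two small checks. The text gives the 0.3 MPa ± 5% band; the rule that a lance is suspect when its flow falls more than 20% below the median branch flow is an assumed criterion, and the branch names are hypothetical.

```python
from statistics import median


def blocked_lances(branch_flows, max_shortfall=0.2):
    """Branches whose ammonia flow is well below the median of all branches."""
    reference = median(branch_flows.values())
    return sorted(name for name, flow in branch_flows.items()
                  if flow < reference * (1 - max_shortfall))


def pressure_in_band(pressure_mpa, setpoint=0.3, tolerance=0.05):
    """True when the main pipe pressure lies within setpoint +/- 5%."""
    return abs(pressure_mpa - setpoint) <= setpoint * tolerance


flows = {"A": 10.0, "B": 10.2, "C": 6.0, "D": 9.9}  # hypothetical branch flows
suspects = blocked_lances(flows)
print(suspects, pressure_in_band(0.31), pressure_in_band(0.32))  # ['C'] True False
```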
[0117] The emission parameter verification system is configured with a multi-level data verification process. The raw monitoring data first undergoes instrument status verification, checking health indicators such as analyzer calibration marks, sample gas flow rate, and optical mirror cleanliness. The second level performs physicochemical rationality verification, such as the negative correlation between SO2 and CO2 concentrations and the synchronicity of NOx and oxygen changes. The third level performs process consistency verification, comparing the expected changes in pollutant concentrations before and after control actions with the actual changes. The verification results generate a data quality assessment report, marking suspicious data points and their potential influencing factors.
[0118] The control effectiveness evaluation employs a dual comparison method. Horizontally, it compares the control performance indicators of similar units to eliminate the impact of fuel characteristic differences. Vertically, it compares historical data from the same load range to eliminate the impact of operating condition fluctuations. The evaluation period is set to 24 hours, calculating two key indicators: the standard deviation of pollutant concentration and the cumulative time of exceedance. The evaluation results are displayed in radar chart format, visually presenting the achievement of each control objective. Control strategies that consistently demonstrate excellent performance are marked as preferred solutions and given priority in similar operating conditions.
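The two indicators named above are straightforward to compute from a sampled concentration series. The sketch below assumes evenly spaced samples and uses the population standard deviation. The function name `evaluate_period` and its parameters are hypothetical.

```python
from statistics import pstdev


def evaluate_period(concentrations, limit, sample_interval_s):
    """Return (standard deviation, cumulative exceedance time in seconds).

    concentrations: evenly spaced samples covering one evaluation period.
    limit: emission limit in the same unit as the samples.
    """
    std = pstdev(concentrations)
    exceed_s = sum(sample_interval_s for c in concentrations if c > limit)
    return std, exceed_s
```

For a 24-hour period sampled once per minute, `concentrations` would hold 1440 values and `sample_interval_s` would be 60.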
[0119] The system employs multiple safety and security measures. Before issuing control commands, a logical interlock check is performed to prevent conflicting commands. Important adjustments require manual confirmation by an operator at the control panel before execution. The system periodically and automatically tests safety loops to verify the reliability of hard-wired alarm channels. All control algorithms run on independent real-time processors, physically isolated from the monitoring network. Data communication uses encryption protocols to prevent unauthorized access and modification.
[0120] The knowledge base management system automatically archives typical control cases. Each case includes an initial operating condition description, control challenge analysis, measures taken, and a final effect evaluation. Case retrieval supports fuzzy searches; for example, entering "high-sulfur coal" + "low load" will find relevant handling experience. The knowledge base is regularly reviewed by an expert team to update outdated control strategies and supplement new best practices. The system provides a case similarity calculation function to help operators find historical handling solutions that are closest to the current situation.
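The source does not say how case similarity is calculated. One simple possibility, shown here purely as an assumption, is Jaccard similarity over the keyword tags attached to each archived case. The names `jaccard` and `closest_cases` and the `tags` field are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_cases(query_tags, cases, top_n=3):
    """Rank archived cases by tag similarity to the current situation."""
    ranked = sorted(cases, key=lambda c: jaccard(query_tags, c["tags"]),
                    reverse=True)
    return ranked[:top_n]


# Example archive with hypothetical tags.
cases = [
    {"id": "A", "tags": ["high-sulfur coal", "low load"]},
    {"id": "B", "tags": ["low load", "startup"]},
    {"id": "C", "tags": ["biomass co-firing"]},
]
best = closest_cases(["high-sulfur coal", "low load"], cases, top_n=2)
```

A production system would likely weight numeric operating conditions as well as tags; the tag-only measure is used here only to keep the example short.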
[0121] The user interface is designed according to ergonomic principles. The main display screen in the control room adopts a panoramic layout, with the central area showing the concentration trends of key pollutants and the status indicator lights for each subsystem distributed around it. The operating terminal supports both touch and keyboard input, and changes to important parameter settings require a password and a second confirmation. The interface color coding follows industry standards: red indicates an alarm, yellow indicates a warning, and green indicates normal operation. All operating procedures have context-sensitive help guides, and pressing the F1 key brings up detailed documentation for the current screen.
[0122] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus.
[0123] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
Claims
1. A method for detecting pollutants emitted from thermal power generation based on big data, characterized in that it includes the following steps: multiple emission parameters from the chimney exhaust outlet of a thermal power plant are collected in real time through a distributed sensor network, the emission parameters including flue gas composition concentration, particulate matter size distribution, emission velocity, and temperature and pressure data; the collected multidimensional emission parameters are transmitted to a big data processing platform, and a pollution feature knowledge base is used to perform correlation analysis on the multidimensional emission parameters to identify potential target pollutant combination patterns; based on the identified target pollutant combination pattern, a preset pollutant detection rule base is invoked, and the emission parameters are divided into macro emission characteristic regions and micro component characteristic regions by a dynamic region segmentation algorithm; and a two-level pollutant detection model processes the parameter data from the macro emission characteristic regions and the micro component characteristic regions respectively, generating pollutant exceedance early warning results and component source tracing analysis reports; wherein the process of using the pollution feature knowledge base for correlation analysis includes: constructing a pollution feature knowledge base based on a graph database, which stores the mapping relationships between different fuel types, combustion conditions, and pollutant emission characteristics; using a multi-source data association engine to analyze the real-time data collected by the distributed sensor network and extract the symbiotic relationship features between flue gas components; and comparing the extracted symbiotic relationship features with the standard pollution patterns in the pollution feature knowledge base using a knowledge graph matching algorithm to determine the target pollutant combination pattern corresponding to the current emission parameters; the execution process of the dynamic region segmentation algorithm includes: invoking, based on the target pollutant combination pattern, the corresponding pollutant detection rule base to obtain the emission monitoring standards under the current pollution pattern; decomposing the emission parameters into a data matrix with spatial distribution characteristics using a spectral feature analysis algorithm; and dividing the data matrix into multi-scale regions based on the threshold parameters specified in the emission monitoring standards, to generate macro emission characteristic regions containing overall emission intensity characteristics and micro component characteristic regions containing the concentration distribution of specific substances; and the operation process of the two-level pollutant detection model includes: applying a total emission assessment model in the macro emission characteristic regions, and using spatiotemporal feature extraction algorithms to calculate the total emission and diffusion trend of pollutants; activating a component fingerprinting model in the micro component characteristic regions, and using high-resolution spectral matching technology to analyze the fingerprint spectrum of characteristic pollutants; and simultaneously receiving trend prediction data from the total emission assessment model and substance source tracing data from the component fingerprinting model, and generating a comprehensive detection report that integrates the total emission exceedance warning and the source tracing results for specific substances.
2. The method for detecting pollutants emitted from thermal power generation based on big data according to claim 1, characterized in that the operation process of the component fingerprinting model includes: establishing a pollutant fingerprint spectral database that stores the characteristic absorption spectra of different industrial pollution sources; performing wavelet denoising on the parameter data of the micro component characteristic regions to extract effective substance characteristic spectral curves; and using a spectral similarity calculation engine to match the extracted substance characteristic spectral curves step by step against the pollutant fingerprint spectral database to identify the source categories of the characteristic pollutants.
3. The method for detecting pollutants emitted from thermal power generation based on big data according to claim 2, characterized in that it also includes a dynamic optimization process for pollutant detection: monitoring in real time the analysis bias indicators of the total emission assessment model and the component fingerprinting model; activating a model parameter coordination mechanism when there is a significant difference between the total emission assessment results for the macro emission characteristic regions and the substance identification results for the micro component characteristic regions; and dynamically adjusting, through a feedback adjustment algorithm, the weight coefficients in the total emission assessment model and the matching thresholds in the component fingerprinting model, so that the output results of the two-level pollutant detection model converge.
4. The method for detecting pollutants emitted from thermal power generation based on big data according to claim 3, characterized in that the execution process of the model parameter coordination mechanism includes: constructing a correlation analysis matrix for the output results of the two-level pollutant detection model, and calculating the statistical correlation indicators between the total emission data and the substance composition data; generating a set of model parameter correction coefficients based on the degree to which the statistical correlation indicators deviate from the normal range; and inputting the model parameter correction coefficients into the dynamic weight controller of the total emission assessment model and the threshold regulator of the component fingerprinting model to achieve collaborative optimization of the two-level pollutant detection model.
5. The method for detecting pollutants emitted from thermal power generation based on big data according to claim 4, characterized in that it also includes a pollutant migration prediction process: obtaining real-time wind direction and speed, atmospheric humidity, and temperature stratification data through meteorological data interfaces; overlaying the pollutant exceedance early warning results with the meteorological data in space and time, and using fluid dynamics simulation algorithms to derive the pollutant diffusion path; and generating, based on the simulation results, a pollutant migration prediction map that includes the affected area, the concentration distribution gradient, and the duration.
6. The method for detecting pollutants emitted from thermal power generation based on big data according to claim 5, characterized in that it also includes an emission control feedback process: inputting the pollutant exceedance early warning results into the control system of the thermal power generator set; adjusting the fuel supply ratio, air volume, and flue gas recirculation parameters through combustion optimization control algorithms; and acquiring the emission parameter change curves after adjustment in real time to verify whether the declining trend of the pollutant concentration meets the expected control target.
7. A big data-based system for detecting pollutants emitted from thermal power generation, used to implement the big data-based method for detecting pollutants emitted from thermal power generation as described in any one of claims 1-6, characterized in that the system includes the following modules: a distributed sensing and acquisition module, deployed in the emission pipeline of a thermal power plant and equipped with multiple types of gas sensors and particulate matter monitoring probes; a big data processing engine module, which connects to the pollution feature knowledge base and the pollutant detection rule base to perform data association analysis and dynamic region segmentation; a two-level analysis and decision-making module, which includes a macro emission assessment unit and a micro component identification unit that process data from the respective characteristic regions; a dynamic coordination and optimization module, which receives the output results from the two-level analysis and decision-making module and performs model parameter coordination operations; a pollution migration prediction module, which connects to the meteorological data platform and performs pollutant diffusion simulations; and a closed-loop control execution module, which converts the detection results into control commands and transmits them to the generator set control system.
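For illustration only, the spectral matching step recited in claim 2 can be sketched as a similarity search against a fingerprint library. The claims do not specify the similarity measure. Cosine similarity, the 0.9 acceptance threshold, and the names `cosine_similarity` and `identify_source` are assumptions. The wavelet denoising step is omitted, and the step-by-step matching is reduced to a single pass.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two equally sampled spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def identify_source(spectrum, library, threshold=0.9):
    """Return (source name or None, best score) for a measured spectrum.

    library maps a source category to its reference absorption spectrum.
    A match is reported only when the best score reaches the threshold.
    """
    best_name, best_score = None, 0.0
    for name, reference in library.items():
        score = cosine_similarity(spectrum, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score
```

The matching threshold in this sketch plays the role of the parameter that the coordination mechanism of claims 3 and 4 adjusts when the two models disagree.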