An intelligent monitoring and early warning method for surface subsidence of a coal mine area

By constructing a neural network model that combines multi-source data processing and physical simulation, the problems of model stability and adaptability in coal mine surface subsidence monitoring and trend prediction were solved, achieving high-precision subsidence trend prediction and intelligent early warning.

CN122241340APending Publication Date: 2026-06-19CHINA COAL INFORMATION TECH (BEIJING) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
CHINA COAL INFORMATION TECH (BEIJING) CO LTD
Filing Date
2026-02-05
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies for monitoring and predicting surface subsidence in coal mine areas are insufficient in terms of model stability and generalization ability, making it difficult to adapt to complex changes and abnormal working conditions in mining areas and affecting the actual effectiveness of automatic early warning systems.

Method used

A smart monitoring and early warning method for surface subsidence in coal mining areas is constructed. By acquiring multi-source observation data and performing spatiotemporal alignment and normalization processing, combined with an elastoplastic physical simulation submodule and an intermediate mapping network, a master prediction neural network with a dual-stream coding path is used for training. The spatiotemporal attention mechanism is used to fuse data-driven features and physical prior information to generate prediction results of surface subsidence trends for future periods. The adaptive decoupling coefficient is adjusted by gradient coordination state flags to dynamically adjust the model learning strategy.

Benefits of technology

It significantly improves the model's convergence and generalization capabilities, enhances its adaptability to complex mining environments and the accuracy of long-term trend prediction, reduces computational errors and parameter tuning complexity in engineering deployment, and achieves high spatiotemporal resolution and good interpretability.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122241340A_ABST
    Figure CN122241340A_ABST
Patent Text Reader

Abstract

This invention relates to an intelligent monitoring and early warning method for surface subsidence in coal mining areas. The method includes: collecting and preprocessing GNSS and InSAR deformation, meteorological, and mining log data to achieve spatiotemporal alignment and normalization; constructing an elastoplastic physical simulation submodule based on geomechanical parameters to output a surface mechanical response vector; implicitly representing the physical state through an intermediate mapping network and inputting it along with observational data features into a main prediction network; and using a spatiotemporal attention mechanism to fuse physical priors and data-driven information to predict future subsidence trends. An adaptive decoupling mechanism dynamically adjusts the physical constraint weights to achieve a progressive balance learning between physics and data. This scheme effectively improves the model's interpretability and generalization ability, accurately supporting risk classification, early warning, and intelligent response for surface deformation in mining areas.
Need to check novelty before this filing date? Find Prior Art

Description

Technical Field

[0001] This invention relates to the field of coal mine surface subsidence monitoring and intelligent early warning technology, and in particular to an intelligent monitoring and early warning method for coal mine surface subsidence. Background Technology

[0002] Surface subsidence monitoring and trend prediction technology in coal mining areas is of great significance for safe production, environmental protection, and sustainable development. It enables real-time monitoring of surface deformation, timely early warning of geological disasters such as collapses and landslides, and ensures the safety of miners and the stability of mine facilities. Simultaneously, this technology helps assess the extent of damage mining causes to surface buildings, farmland, and water systems, providing a scientific basis for ecological restoration and damage compensation.

[0003] Existing technologies for monitoring and predicting surface subsidence in coal mine areas suffer from model instability in practical engineering applications. Under hard constraint strategies, the models have limited generalization ability and are difficult to adapt to complex changes and abnormal working conditions in mining areas. They are also prone to overfitting to theoretical models and deviating from real monitoring data, which affects the actual effectiveness of automatic early warning systems. Summary of the Invention

[0004] This application provides an intelligent monitoring and early warning method for surface subsidence in coal mining areas, aiming to solve one of the problems or issues of the existing technology mentioned in the background section.

[0005] This application provides a method for intelligent monitoring and early warning of surface subsidence in coal mine areas, specifically including:

[0006] S1: Acquire multi-source observation data of the coal mine surface, including GNSS deformation monitoring sequences, InSAR differential interferometric atlases, and meteorological and mining activity logs, and perform spatiotemporal alignment and normalization preprocessing on the data to generate a structured historical observation dataset under a unified spatiotemporal benchmark.

[0007] S2: Based on prior knowledge of geomechanics, an elastoplastic physical simulation submodule is constructed. This elastoplastic physical simulation submodule uses the thickness of the rock strata, elastic modulus and mining stress distribution parameters in the mining area to calculate the theoretical predicted value of the surface displacement field at each time step and outputs a physical state vector sequence containing stress gradient and strain increment.

[0008] S3: Based on the physical state vector sequence and the structured historical observation dataset, an intermediate mapping network is constructed. This network maps the physical state vector into an implicit feature representation through nonlinear transformation, realizing a differentiable buffered connection between the physical model output and the main prediction network input.

[0009] S4: The structured historical observation dataset and implicit feature representation are fed into the main prediction neural network as dual inputs. The main prediction neural network includes a dual-stream coding path with a physical guidance branch and a data guidance branch. The network uses a spatiotemporal attention mechanism to fuse data-driven features and physical prior information to generate prediction results of land subsidence trends in future periods.

[0010] S5: Train the intermediate mapping network and the main prediction neural network. During the training process, fix the parameters of the physical simulation sub-module, jointly optimize the weights of the intermediate mapping network and the main prediction neural network, and determine the learning consistency between the physical guidance branch and the data guidance branch based on the gradient direction angle in the backpropagation path, and generate a gradient coordination state flag.

[0011] S6: Dynamically adjust the adaptive decoupling coefficient based on the gradient coordination state flag. This coefficient is used to adjust the contribution weight of implicit feature representation in the loss function, realizing a progressive learning strategy with high physical guidance in the early stage of training and high data adaptability in the later stage.

[0012] S7: Determine whether the validation set error of the main prediction neural network in the current training round meets the convergence condition. If it does, output the final trained settlement trend prediction model; otherwise, return to continue executing parameter updates and decoupling control.

[0013] S8: Input the newly collected surface observation data in real time into the trained subsidence trend prediction model to obtain the future subsidence risk level, and combine it with the preset threshold to determine whether to trigger the graded early warning signal, so as to realize the intelligent response to the abnormal surface deformation in the mining area.

[0014] The intelligent monitoring and early warning method for surface subsidence in coal mine areas provided in this application has the following beneficial effects:

[0015] (1) This scheme significantly improves the convergence and generalization ability of the model by constructing a dynamically decoupled collaborative training architecture. This scheme innovatively introduces an independently running physical simulation submodule and an intermediate mapping network (IMN) to realize the formal extraction and flexible injection of physical prior information. Since the physical model parameters remain fixed during training, its output features are transmitted only through the differentiable IMN, effectively isolating the direct interference of solving complex differential equations on the optimization path of the neural network; at the same time, by scaling control and direction discriminative truncation of the IMN output gradient, the excessive traction effect that the physical guidance branch may bring in a specific training stage is further suppressed. The main prediction neural network includes a dual-stream encoding path of physical guidance branch and data guidance branch. During training, the parameters of the physical simulation submodule are fixed, the weights of the intermediate mapping network and the main prediction neural network are jointly optimized, and the learning consistency between the physical guidance branch and the data guidance branch is judged based on the gradient direction angle in the backpropagation path, effectively overcoming the structural defects of traditional fusion strategies in the coordination of multi-source information.

[0016] (2) This scheme achieves a dynamic balance between physical knowledge guidance and data-driven learning by designing a dual-path attention fusion mechanism, which greatly enhances the model's adaptability to complex mining environments and the accuracy of long-term trend prediction. Unlike coarse-grained control methods such as static weighting or multi-stage frozen training, this application combines a feedback adjustment mechanism based on gradient angle. When there is a significant divergence between the physical path and the data path, the system can actively reduce the influence of the former to prevent erroneous priors from misleading the learning direction. The main network uses an attention mechanism to deeply fuse historical spatiotemporal observation sequences with implicit physical features generated by the IMN. This not only preserves the dynamic evolution pattern in the time dimension but also explicitly models the correlation and dependency between spatial locations, making the prediction results have both high spatiotemporal resolution and good interpretability. The entire framework does not require differentiable reconstruction or approximate discretization of the original physical model, avoiding computational errors and stability risks introduced by numerical differentiation, and greatly reducing the engineering deployment threshold and parameter tuning complexity. Attached Figure Description

[0017] Figure 1 This is the main flowchart of a smart monitoring and early warning method for surface subsidence in coal mining areas.

[0018] Figure 2 This is a sub-flowchart of a method for intelligent monitoring and early warning of surface subsidence in coal mining areas.

[0019] Figure 3 This is another sub-flowchart of a method for intelligent monitoring and early warning of surface subsidence in coal mining areas. Detailed Implementation

[0020] Embodiments of the present invention are described in detail below, examples of which are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the present invention, and should not be construed as limiting the present invention.

[0021] The following disclosure provides many different embodiments or examples for implementing different structures of the invention. To simplify the disclosure, specific examples of components and arrangements are described below. Of course, these are merely examples and are not intended to limit the invention. Furthermore, reference numerals and / or letters may be repeated in different examples; such repetition is for simplification and clarity and does not in itself indicate a relationship between the various embodiments and / or arrangements discussed.

[0022] like Figure 1 As shown, this application provides an intelligent monitoring and early warning method for surface subsidence in coal mining areas, specifically including:

[0023] S1: Acquire multi-source observation data of the coal mine surface, including GNSS deformation monitoring sequences, InSAR differential interferometric atlases, and meteorological and mining activity logs, and perform spatiotemporal alignment and normalization preprocessing on the data to generate a structured historical observation dataset under a unified spatiotemporal benchmark.

[0024] S2: Based on prior knowledge of geomechanics, an elastoplastic physical simulation submodule is constructed. This elastoplastic physical simulation submodule uses the thickness of the rock strata, elastic modulus and mining stress distribution parameters in the mining area to calculate the theoretical predicted value of the surface displacement field at each time step and outputs a physical state vector sequence containing stress gradient and strain increment.

[0025] S3: Based on the physical state vector sequence and the structured historical observation dataset, an intermediate mapping network is constructed. This network maps the physical state vector into an implicit feature representation through nonlinear transformation, realizing a differentiable buffered connection between the physical model output and the main prediction network input.

[0026] S4: The structured historical observation dataset and implicit feature representation are fed into the main prediction neural network as dual inputs. The main prediction neural network includes a dual-stream coding path with a physical guidance branch and a data guidance branch. The network uses a spatiotemporal attention mechanism to fuse data-driven features and physical prior information to generate prediction results of land subsidence trends in future periods.

[0027] S5: Train the intermediate mapping network and the main prediction neural network. During the training process, fix the parameters of the physical simulation sub-module, jointly optimize the weights of the intermediate mapping network and the main prediction neural network, and determine the learning consistency between the physical guidance branch and the data guidance branch based on the gradient direction angle in the backpropagation path, and generate a gradient coordination state flag.

[0028] S6: Dynamically adjust the adaptive decoupling coefficient based on the gradient coordination state flag. This coefficient is used to adjust the contribution weight of implicit feature representation in the loss function, realizing a progressive learning strategy with high physical guidance in the early stage of training and high data adaptability in the later stage.

[0029] S7: Determine whether the validation set error of the main prediction neural network in the current training round meets the convergence condition. If it does, output the final trained settlement trend prediction model; otherwise, return to continue executing parameter updates and decoupling control.

[0030] S8: Input the newly collected surface observation data in real time into the trained subsidence trend prediction model to obtain the future subsidence risk level, and combine it with the preset threshold to determine whether to trigger the graded early warning signal, so as to realize the intelligent response to the abnormal surface deformation in the mining area.

[0031] Step S1: Acquire multi-source surface observation data of the coal mine area, including GNSS deformation monitoring sequences, InSAR differential interferometric atlases, and meteorological and mining activity logs. Perform spatiotemporal alignment and normalization preprocessing on the data to generate a structured historical observation dataset under a unified spatiotemporal benchmark. Specifically, this includes:

[0032] S1.1: Acquire raw surface observation data from multiple sources in the coal mine area, including three-dimensional coordinate time series data output by GNSS continuous observation stations, surface deformation phase atlas generated by InSAR differential interferometry, rainfall and temperature time series recorded by meteorological monitoring systems, and working face advance logs and blasting operation schedules from the mining production management system, as the initial input conditions for multimodal data fusion.

[0033] The input conditions are raw observation data generated by various sensors and monitoring systems distributed in the coal mine area, including three-dimensional coordinate time series data output by GNSS continuous observation stations, surface deformation phase atlas generated by InSAR differential interferometry, rainfall and temperature time series recorded by meteorological monitoring systems, and working face advance logs and blasting operation schedules provided by the mining production management system.

[0034] Using the GNSS data acquisition protocol (parameters: sampling frequency 1Hz, output format RINEX), the three-dimensional displacement component time series of each fixed observation station in a unified reference coordinate system (such as WGS-84) is acquired in real time to achieve high-precision positioning and recording of surface deformation.

[0035] Furthermore, by using the InSAR differential interferometry processing method (parameters: baseline threshold for interferometric pair selection ≤ 300m, time interval ≤ 24 days), satellite SAR image data are fused to generate a surface deformation phase difference atlas, and deformation quantification indicators are extracted to obtain spatially comprehensive isometric deformation observation information.

[0036] Furthermore, by using an automatic weather station acquisition protocol (parameters: sampling frequency 1 hour, accuracy of rainfall 0.1 mm, temperature 0.1 °C), continuous time series data of rainfall and temperature are obtained from the meteorological monitoring network in the mining area, thereby realizing the quantitative recording of meteorological conditions driving surface subsidence.

[0037] Furthermore, by reading the working face advancement log and blasting operation schedule through the mining production management system interface protocol (parameters: data refresh cycle of 1 day, position accuracy ≤5m), the spatiotemporal distribution information of mining activities in the mining area is transformed into a structured event sequence, which serves as an important correlation factor for deformation data.

[0038] By acquiring and synchronously encapsulating multi-source data, the aforementioned GNSS displacement time series, InSAR phase atlas, meteorological time series, and mining event logs are combined to form a raw multimodal dataset, thereby providing a unified input condition for subsequent spatial resampling and temporal alignment processing.

[0039] For example, five continuous GNSS observation stations were deployed in a typical mining area, with a sampling frequency of 1 Hz, outputting three-dimensional coordinates in RINEX format. Combined with Sentinel-1 satellite imagery, an interferometric baseline threshold of 200 m and a time interval of 12 days were set to generate a deformation phase difference map covering the mining area. Rainfall data was collected at 1-hour intervals with an accuracy of 0.1 mm, and temperature data with an accuracy of 0.1 °C. The mining management system recorded an average face advance speed of 3 m per day and blasting operations twice a week. The GNSS displacement sequence, with timestamp accuracy at the second level, was spatially indexed and matched with the InSAR deformation map to generate point-area fused data with time-matched labels. Meteorological factors and mining events were mapped along the time axis to form an external driving variable matrix, forming a multimodal raw input set that can be directly called in subsequent spatial interpolation and temporal resampling steps, significantly improving data fusion accuracy and consistency.

[0040] S1.2: Based on the spatial coordinates of GNSS observation points and InSAR pixel geocoding information, the InSAR differential interferometric atlas is resampled using a spatial interpolation algorithm to match the observation grid density of GNSS stations. The non-deformation phase noise caused by atmospheric delay effect is corrected by the least squares fitting method to generate surface deformation rate field data with consistent spatial resolution, which serves as the spatial reference for subsequent spatiotemporal alignment.

[0041] Based on the spatial coordinates of GNSS observation points and InSAR pixel geocoding information, the Kriging spatial interpolation method (parameters: the semi-variogram model type is a spherical model, the range parameter is set according to the length of the main structural zone in the mining area, and the effective sample radius is 5km) is used to resample the InSAR differential interferometric atlas to generate pixel grid data with the same density as the GNSS station network observation points.

[0042] Furthermore, a bilinear interpolation algorithm (parameter: elevation correction value corresponding to latitude and longitude of input grid) is used to perform position matching adjustment on the resampled InSAR deformable phase data to ensure that the center point of each pixel in the resampled map is accurately aligned with the grid nodes of the spatial distribution of GNSS stations, and the matched deformable phase matrix is ​​output.

[0043] Furthermore, the non-deformation phase noise caused by atmospheric delay effect is corrected by using the least squares fitting method (parameters: the fitting equation is a linear model, and the fitting variables are elevation and phase residual values). The fitting calculation formula is as follows:

[0044]

[0045] in, These are the original InSAR phase values. For pixel elevation values, The atmospheric delay phase gradient coefficient is estimated by least squares. This is the corrected phase value.

[0046] Furthermore, the corrected phase matrix is ​​mapped to the surface deformation rate field using the phase-to-deformation rate conversion formula, which is:

[0047]

[0048] in, This is the deformation rate value. For radar carrier wavelength, Obtain the time difference between adjacent images.

[0049] The above algorithm transforms the InSAR differential interferometric atlas from the previous step into surface deformation rate field data with spatial resolution consistent with the GNSS observation grid and noise correction, thereby achieving consistency and accuracy improvement of the spatial reference in the subsequent spatiotemporal alignment process.

[0050] For example, in a practical application in a mining area, the GNSS network has 64 fixed observation points, with spatial coordinates distributed at one node every 5 km. The original InSAR atlas (original pixel resolution 30m) was resampled using Kriging interpolation to generate a deformation phase matrix for each 5km grid. The range parameter of the spherical semi-variogram model was set to 20km, and the mean square error of the fitting result was less than 0.005 radians. After aligning the center coordinates of the resampled pixels with the GNSS nodes using bilinear interpolation, the atmospheric delay gradient coefficient k obtained by fitting the phase-elevation relationship using the least squares method was... The value was calculated in radians per kilometer, and the phase value was corrected accordingly. After correction, the phase transition rate was calculated using a radar wavelength λ = 0.056 m and an image interval Δt = 12 days. The resulting deformation rate field showed that the subsidence rate in the central mining area was significantly higher than that in the peripheral areas, with the maximum rate reaching [missing value]. m / year, significantly improving spatial matching accuracy and meeting the spatial consistency requirements for subsequent fusion with GNSS data.

[0051] S1.3: Based on the timestamp sequence of GNSS data and the acquisition time of InSAR images, time series resampling technology is used to unify all sensor data to an equal time step (e.g., one time point per hour). For missing time periods, an interpolation algorithm based on the autoregressive moving average model (ARMA) is used to fill in the missing time periods, generating a time-synchronized and aligned multi-source observation sequence to ensure that each data stream has consistency in the time dimension.

[0052] S1.4: Using the known geological zoning boundaries of the mining area and the working face location information in the mining activity log, a spatial mask matrix is ​​constructed and regional clipping operations are performed on GNSS and InSAR data to extract the observation subset of the active subsidence area that is significantly affected by mining. At the same time, meteorological factors and mining events are mapped into binary flag sequences or continuous intensity variables according to the time axis to form a set of external driving factors that match the deformation data in time and space.

[0053] For GNSS observation sequences and InSAR deformation rate field data that have been time-synchronized and aligned, a spatial mask generation method based on the geological zoning boundary of the mining area (parameters: boundary coordinate vector, grid resolution) is adopted to mark the pixel positions in the monitoring grid that do not belong to the effective geological zoning as invalid.

[0054] Furthermore, by using the working face location information parsing algorithm (parameters: spatial coordinates recorded in the mining log, advance direction vector), the dynamically changing boundary of the mining area is mapped to a unified spatial reference system, and a set of active area contour vectors that evolves over time is generated.

[0055] Furthermore, spatial logical intersection operation (parameters: geological zoning mask matrix, mining activity area outline set) is used to achieve regional cropping of GNSS and InSAR observation data, retaining the observation values ​​within the intersection area to form spatial subset data of the subsidence active area.

[0056] Furthermore, for meteorological monitoring sequences and mining operation events, a time axis discretization method (parameter: unified time step Δt) is used to standardize timestamps, and a binary flag mapping algorithm is used to convert the event occurrence state into a 0 / 1 flag sequence, mapping continuous quantitative information (such as rainfall intensity) into a continuous intensity variable sequence, so that it keeps the time step consistent with the deformation variable data.

[0057] Furthermore, spatiotemporal index matching (parameters: time step index, spatial grid index) is performed between external driving factors and the observed subset of settlement active areas to form a multimodal observation dataset with external driving factor annotations, and the direct callability of the dataset in both sample and variable dimensions is ensured.

[0058] By combining spatial mask construction with temporal state mapping, the time-synchronized observation data from the previous step is transformed into characteristic data of active subsidence areas labeled with external driving factors, enabling targeted analysis input for areas of the mining area significantly affected by mining activities.

[0059] For example, in a coal mine area A, the known geological boundary consists of 200 vertex coordinates, the spatial reference system is UTM 50N, and the observation grid resolution is 10 m. The position coordinate sequence (accuracy ±0.5 m) in the working face advance log is parsed into a dynamic active area outline, updated hourly. A spatial mask is generated using Boolean matrix operations, resulting in a reduction of the number of GNSS stations from the original 45 to 12 within the active area, and the number of InSAR pixels from the original 40,000 to 9,200 after clipping. The rainfall sequence in the meteorological factors is discretized with a 1-hour step size, and the rainfall intensity (mm / h) is directly used as a continuous variable. The blasting operation schedule is mapped to a length-synchronized binary sequence. Through spatiotemporal index matching, the clipped subsidence active area data is bound to meteorological and blasting markers to generate a tensor dataset with dimensions of [number of samples × number of variables], where the variables include three-dimensional displacement, deformation rate, rainfall intensity, and blasting markers. In subsequent model calls, this dataset significantly improved the accuracy of subsidence trend analysis in active areas and the stability of the model's response to mining-driven factors.

[0060] S1.5: The aligned multi-source observation sequence is normalized to zero mean and unit variance. Specifically, the mean and standard deviation parameters are calculated based on the historical statistical distribution of each variable during the training period. Linear transformations are performed on data of different dimensions such as GNSS displacement components, InSAR deformation rate, and rainfall intensity to generate standardized feature vector sequences. These sequences are then encapsulated into a structured tensor dataset in a unified format for joint use by the subsequent physical simulation submodule and neural network model.

[0061] Step S2: Constructing an elastoplastic physical simulation submodule based on prior knowledge of geomechanics. This submodule uses parameters such as the thickness of the rock strata, elastic modulus, and mining-induced stress distribution in the mining area to calculate the theoretical predicted values ​​of the surface displacement field at each time step, outputting a sequence of physical state vectors containing stress gradients and strain increments. Specifically, it includes:

[0062] S2.1: Based on the geological exploration report and borehole data of the mining area, obtain the rock stratum structure parameters, including the thickness, density, Poisson's ratio and elastic modulus of each rock layer, and perform spatial interpolation processing on the parameters to generate a three-dimensional lithological distribution field with a uniform grid resolution, which serves as the geometric and material property input for elastoplastic physical simulation.

[0063] Based on the geological exploration report and borehole data of the mining area, a data parsing and parameter extraction algorithm (parameters: borehole record file, rock stratum profile atlas, and mining area mapping coordinate system) is used to systematically extract the structural parameters of each rock mass layer, including thickness, density, Poisson's ratio, and elastic modulus.

[0064] Furthermore, by using a coordinate standardization method (parameter: unified projection coordinate system of the mining area), the spatial positions of different acquisition coordinate systems in the original borehole and profile data are converted into a unified reference, thereby achieving consistent alignment of spatial positions.

[0065] Furthermore, the Kriging interpolation algorithm (parameters: variogram model type, neighborhood search radius, interpolation grid resolution) is used to perform spatial interpolation calculations on the rock strata parameters of discrete measurement points, and a continuous three-dimensional parameter field covering the entire mining area is generated, thereby establishing a complete spatial distribution table of material parameters.

[0066] Furthermore, a multi-attribute fusion interpolation technique (parameter: multi-attribute weighting factor) is adopted to simultaneously consider the correlation of multiple attributes such as thickness, density, Poisson's ratio and elastic modulus during the interpolation process, thereby realizing multi-attribute coupling optimization of the three-dimensional parameter field of rock strata.

[0067] Furthermore, the generated three-dimensional parameter field is discretized into a three-dimensional lithological distribution field dataset with a unified grid resolution through a mesh generation and resolution unification processing method (parameter: target mesh size) to ensure the consistency and computability of the data format when the subsequent elastoplastic physical simulation submodule is called.

[0068] Through the above spatial interpolation and fusion processing methods, the original rock strata exploration data from the previous step is transformed into a gridded three-dimensional lithology distribution field, thereby achieving the standardization and precision of the geometric and material property inputs required for elastoplastic physical simulation.

[0069] For example, when this step was implemented in the western wing of a mining area, the obtained exploration data included 120 borehole measuring points at a depth of 300 meters. Each measuring point recorded the thickness (range: 3~35 meters), density (range: 2.35~2.75 g / cm³), Poisson's ratio (range: 0.21~0.28), and elastic modulus (range: 28~40 GPa) of six rock layers. Spatial consistency of all measuring point locations was achieved using a coordinate standardization method and a unified projected coordinate system (EPSG:32650) for the mining area. Kriging interpolation parameters were set: a spherical model was selected for the variogram model, the neighborhood search radius was 500 meters, and the interpolation grid resolution was 50-meter grid points. The multi-attribute fusion interpolation weighting factors were set to thickness 0.3, density 0.25, Poisson's ratio 0.2, and elastic modulus 0.25, generating a three-dimensional parameter field covering a 10km × 8km area of ​​the mining area. After mesh generation, the three-dimensional parameter field was discretized into a lithological distribution field with 200×160×30 grid points. This dataset, used as input in subsequent elastoplastic physical simulations, significantly improved the accuracy of initial conditions and simulation stability in displacement field calculations. Verification showed that the convergence step size in the initial stress field calculation was reduced to 70% of the original method, significantly improving computational efficiency.

[0070] S2.2: Extract the time-space distribution characteristics of mining-induced stress sources from the coal mine mining engineering log, calculate the stress redistribution process of the surrounding rock under dynamic load by combining the overburden fracture mechanics model, and use the finite difference method to solve the equilibrium equation to obtain the initial stress tensor field within each time step.

[0071] S2.3: Based on the elastoplastic constitutive relation, the yield criterion and flow rule are introduced to perform incremental iterative calculations on the initial stress tensor field, determine whether the element has entered the plastic deformation stage, and update the strain increment tensor and residual stress state of the current step size accordingly, so as to realize the nonlinear transition modeling from elastic to plastic deformation.

[0072] Based on the initial stress tensor field obtained in step S2.2, the yield criterion in the elastoplastic constitutive relation (parameters: Mohr-Coulomb yield criterion selected, internal friction angle φ and cohesion c derived from the lithological distribution field) is used to determine the yield state of each element, thus identifying the elastoplastic stage of each mesh element. Furthermore, the direction of plastic strain increment is calculated using the flow rule (parameters: associated or unassociated flow, plastic potential function determined by the dilatation coefficient ψ), thus determining the deformation path after the stress state enters the plastic region. Further, for elements determined to have entered the plastic stage, the strain increment tensor of each iteration step is updated using an incremental iteration algorithm (parameters: iteration step size Δt and convergence tolerance ε set within the material mechanical stability range), ensuring that the stress-strain relationship satisfies the incremental form of the selected constitutive model.

[0073] The shear yield condition is calculated using the following formula:

[0074]

[0075] in, The normal stress component at the current step size. For cohesion, The Mohr-Coulomb yield angle. When the value is ≥0, the plastic deformation stage is determined.

[0076] Furthermore, after determining the plastic stage, the plastic strain increment is calculated using the flow rule. :

[0077]

[0078] in, As plastic multipliers, the solution is obtained through the consistency condition. Let be the plastic potential function. Let N be the stress tensor. The Newton-Raphson iterative method is used (parameter: maximum number of iterations N). max =50, residual threshold ε res =1e-6) Solve for the plastic multiplier to achieve accurate updates of plastic strain.

[0079] Furthermore, the plastic strain increment and the elastic strain increment (calculated by the generalized Hooke's law) are superimposed in the tensor space to form the total strain increment tensor, and the residual stress state is updated. An explicit time integration method is used to ensure the stability of the calculation.

[0080] Through the above incremental iteration and constitutive update processing, the initial stress state of the previous step is transformed into strain increments and residual stress data containing elastic and plastic responses, thus achieving the effect of nonlinear transition modeling technology in the elastic-plastic stage.

[0081] For example, in the lower strata of the east wing working face of a coal mine, Poisson's ratio is set to 0.25, elastic modulus to 35 GPa, cohesion c = 2.5 MPa, internal friction angle φ = 28°, and dilatation coefficient ψ = 5°. At the first time step Δt = 1 h, the maximum principal stress of the initial stress tensor field under mining stress reaches 8 MPa, which, according to the Mohr-Coulomb yield condition calculation, satisfies... , =7.23 MPa, therefore the yield function f = 0.77 MPa ≥ 0, and the element is determined to have entered plastic deformation. Unassociated flow is selected in the flow rule, with a plastic potential function dilatation coefficient ψ = 5°. The initial plastic strain direction is calculated using the flow rule formula, and iterative convergence is achieved. Calculate the elastic strain increment using the generalized Hooke's law. The total strain increment is formed by superimposing the tensors of the two. This process achieves a residual stress release of 0.5 MPa. In this embodiment, the strain increment and residual stress calculated through this step provide significantly improved physical accuracy and numerical stability for subsequent displacement field calculations. In the verification example, the distribution of the plastic zone highly matches the measured underground fracture propagation area.

[0082] S2.4: The updated strain increment tensor is converted into a displacement increment field through geometric equations. Combined with boundary constraints (such as fixed bottom surface and free sides), displacement cumulative integration is performed to calculate the theoretical displacement field prediction value of the surface node at each time step, and the data is organized into a spatially gridded time series dataset.

[0083] S2.5: Extract key physical state variables from the theoretical displacement field predictions, including local stress gradient, vertical strain increment and maximum principal strain direction, encapsulate them into a physical state vector sequence in a time-aligned manner, and output them to the next processing stage as input to the intermediate mapping network to realize the formal expression of physical prior information.

[0084] like Figure 2 As shown, step S3 involves constructing an intermediate mapping network based on the physical state vector sequence and the structured historical observation dataset. This network maps the physical state vectors into implicit feature representations through nonlinear transformations, achieving a differentiable buffered connection between the physical model output and the main prediction network input. Specifically, this includes:

[0085] S3.1: Based on the physical state vector sequence generated in the preceding step S2, extract the stress gradient and strain increment at each time step as input variables, and use the physical state vector to construct the input tensor of the intermediate mapping network to form a quantitative characterization of the mechanical response of the surface rock strata.

[0086] S3.2: Design an intermediate mapping network (IMN) consisting of multiple fully connected neurons. Its network structure includes an input layer, two hidden layers and an output layer. The ReLU activation function is used to perform nonlinear transformation on the input tensor to generate a high-order abstract implicit feature space representation.

[0087] Based on the physical state vector input tensor generated in the preceding step S3.1, an intermediate mapping network structure (IMN) is designed using a multilayer fully connected neural network to realize the nonlinear high-dimensional feature mapping of the mechanical response of the surface rock strata.

[0088] The input layer receives the physical state vector tensor, and the number of neurons in the input layer is consistent with the dimension of the physical state vector. Linear transformation is achieved through weight matrix multiplication and bias vector addition, providing a basic representation capability for subsequent nonlinear mapping.

[0089] Furthermore, feature dimension enhancement is performed using the first hidden layer, with the number of neurons in the hidden layer set to twice the input dimension, and the Rectified Linear Unit (ReLU) activation function is used. Nonlinear transformations are implemented to enhance the network's ability to fit complex patterns of physical states and generate preliminary high-order features.

[0090] Furthermore, a second hidden layer is used for feature compression. The number of neurons is set to the target compression dimension of the physical prior features according to the output target dimension. The ReLU activation function is used to perform non-linear processing. At the same time, the Dropout mechanism (dropout rate p=0.2) is introduced to suppress overfitting and ensure the generalization performance of the implicit features in different training batches.

[0091] Furthermore, the compressed features output by the second hidden layer are mapped to the implicit feature space representation using an output layer. The number of output neurons is equal to the set implicit feature dimension. The interpretability of the feature values ​​is maintained by a linear activation function, and the original feature distribution is provided for subsequent L2 normalization processing.

[0092] Through the above-mentioned multi-layer fully connected and nonlinear transformation processing method, the physical state vector of the previous step is transformed into high-order abstract implicit feature space data, realizing a differentiable buffered connection between the physical model output and the main prediction network input.

[0093] For example, in a subsidence trend prediction task for a certain mining area, the physical state vector has a dimension of 6 (containing 3 components of stress gradient and 3 components of strain increment), the number of neurons in the input layer is set to 6, the number of neurons in the first hidden layer is set to 12, the weight matrix is ​​initialized using a Xavier uniform distribution, and the ReLU activation function is set to... The output features are non-linearly upscaled to generate a 12-dimensional vector. The second hidden layer has 8 neurons, uses the same ReLU activation function, and applies Dropout with p=0.2. The output layer has 5 neurons, and the output of the second hidden layer is mapped to a 5-dimensional implicit feature vector through linear activation. During training, an input batch size of 64 is used, and the AdamW optimizer learning rate is set to 0.001. After 50 iterations, the correlation between the implicit features and the observed deformation is significantly improved. The final generated implicit feature space representation effectively enhances the interpretability and stability of trend prediction in the fusion computation of the main prediction network.

[0094] S3.3: Perform L2 normalization on the implicit feature space representation to eliminate the influence of different physical dimensions on the subsequent fusion process of the master prediction network, and output the normalized implicit feature vector sequence as a compressed expression of physical prior information.

[0095] S3.4: Based on the implicit feature vector sequence and the input interface requirements of the main prediction network, a differentiable buffer connection path is constructed so that the information output by the physical model is transmitted unidirectionally to the main prediction network through this path, avoiding direct coupling between physical parameters and network weights during backpropagation.

[0096] The input conditions are the standardized implicit feature vector sequence output by the preceding step S3.3, and the input interface specifications defined by the main predictive neural network in the structural design phase, including the length of the time dimension, the size of the feature dimension, and data type constraints.

[0097] An interface adaptation mapping algorithm is used (parameter: input vector dimension d). p The expected dimension d of the master predictor network m This algorithm achieves dimensionality reshaping of implicit features by constructing a linear transformation matrix W. map ,Will Multiplying the matrix of dimensions with the input vector generates a feature dimension acceptable to the main prediction network while preserving gradient transitivity.

[0098] Furthermore, through a one-way gradient isolation mechanism (parameter: isolation factor γ) iso This ensures the complete transmission of the physical model output during forward propagation, while scaling or blocking the gradients along the path during backward propagation. Specifically, a gradient modulation operator is inserted into the computation nodes of the buffered path, causing the backward gradient to be multiplied by γ. iso Alternatively, the values ​​can be set to zero to eliminate the direct coupling effect of physical parameters on network weights.

[0099] Furthermore, through a time synchronization interpolation method (parameter: time step alignment length T) alignThis enables the implicit feature vector to be interpolated and resampled on the time axis, making it completely consistent with the input of the data-driven branch in the time dimension, ensuring that the main prediction network can align and process the physical and data features of each time step during the dual-input fusion stage.

[0100] Furthermore, by encoding the feature channel identifier (parameters: physical guide branch identifier bit 1, data guide branch identifier bit 0), metadata tags for distinguishing the source are added to the output of the buffer path, providing accurate feature source information for the downstream attention mechanism and supporting selective enhancement computation in the cross-modal fusion process.

[0101] Through the aforementioned joint algorithm of interface adaptation and gradient isolation, the implicit feature vector sequence from the previous step is transformed into differentiable input data that matches the main prediction network in both time and space and dimensionality, thereby achieving unidirectional transmission of physical model output information and gradient path.

[0102] S3.5: The intermediate mapping network is pre-trained using a structured historical observation dataset to learn the nonlinear mapping relationship between the physical state vector and the actual observed deformation, thereby improving the implicit feature representation's ability to interpret the real settlement trend and providing reliable initialization parameters for subsequent joint optimization.

[0103] Based on the differentiable buffer connection path constructed in the preceding step S3.4, the physical state vector sequence is selected as the source input, and the real shape variables in the structured historical observation dataset are combined as the target output to construct pre-trained sample pairs and realize the input conditions for preliminary parameter optimization.

[0104] The mean squared error loss function (parameters: batch size B, learning rate η) is used to map the implicit feature vectors output by the intermediate mapping network back to the estimated values ​​of the observed variables, and to measure the difference between the estimated values ​​and the true values ​​on a sample-by-sample basis.

[0105] Furthermore, the Adam optimization algorithm (parameters: β1=0.9, β2=0.999) was used. This involves performing gradient calculations and weight updates to ensure that the intermediate mapping network quickly approximates the mapping relationship between the physical state and the observed deformation during the initial training phase.

[0106] Furthermore, by using the Dropout regularization method (parameter: inactivation rate p=0.2), a random inactivation mechanism is introduced into the hidden layer to reduce the risk of overfitting in the pre-training stage and improve the generalization ability of the implicit features.

[0107] Furthermore, by employing an early stopping strategy (parameter: tolerance period=10), the validation set loss is monitored in real time. When there is no improvement in consecutive periods, the pre-training process is terminated to prevent parameter oscillations caused by redundant iterations.

[0108] Through the above algorithm, the weight parameters obtained in the pre-training stage are initialized into the intermediate mapping network. While maintaining the compressed expression of physical information, the implicit features are improved to interpret the real settlement trend, thereby improving the stability and convergence speed of the subsequent joint optimization with the main prediction network.

[0109] For example, when implementing this step in a certain mining area, B=64 is selected as the batch size, the learning rate η is set to 0.001, and the Adam optimization parameters are kept at their default values ​​β1=0.9 and β2=0.999. Historical observation data, including GNSS 3D displacement sequences and InSAR deformation rate fields, are standardized to form a unified tensor. Using the 128-dimensional implicit eigenvectors output by the intermediate mapping network, deformation variables are predicted through linear transformation regression. The L value is calculated based on the mean squared error loss formula; in the first iteration, L is approximately... The training dropped to [a certain level] by the 50th round. During pre-training, a Dropout strategy with p=0.2 was used to suppress overfitting. The validation set loss remained stable around 0.50. An early stopping mechanism with patience=10 was triggered, terminating training at round 65. The final generated initial weights significantly improved the stability of the subsequent collaborative training of the main prediction network and the intermediate mapping network, and demonstrated strong adaptability to complex working conditions in the actual mining area subsidence trend prediction task.

[0110] like Figure 3 As shown, step S4 involves feeding the structured historical observation dataset and implicit feature representation as dual inputs into the main prediction neural network. The main prediction neural network includes a two-stream encoding path with a physical-guided branch and a data-guided branch. This main prediction neural network employs a spatiotemporal attention mechanism to fuse data-driven features and physical prior information to generate a prediction result for future land subsidence trends. Specifically, this includes:

[0111] S4.1: Based on the structured historical observation dataset, multivariate time series embedding processing is performed. The learnable linear projection matrix is ​​used to vectorize the GNSS deformation sequence, InSAR differential interferometric atlas, and spatiotemporally aligned meteorological and mining activity logs to generate a time step feature vector sequence with a unified dimension, which serves as the initial input representation for the data-driven branch.

[0112] Based on the multimodal sequences in the structured historical observation dataset, GNSS deformation monitoring sequences, InSAR differential interferometric atlases, and meteorological and mining activity logs that have undergone spatiotemporal alignment are selected as input objects.

[0113] A learnable linear projection matrix is ​​used to construct a multivariate time series embedding channel (the parameter dimension is determined according to the original feature length of each modality data) to achieve vectorized encoding of observation variables with different dimensions and spatial resolutions.

[0114] Furthermore, by using matrix multiplication, the data from each sensor are projected from the original feature space to a unified low-dimensional embedding space, generating a continuous vector sequence of fixed dimensions. This eliminates the differences in dimensionality and scale of the data and improves the feasibility of fusion.

[0115] Furthermore, a batch normalization method based on time steps is adopted to perform zero-mean-unit variance standardization on the embedding vector in both the batch and feature dimensions, thereby suppressing the offset effect caused by the differences in the distribution of data from different sensors and improving the training stability of the main prediction network.

[0116] Furthermore, periodic and incremental temporal position information is added to the embedded feature sequence through a temporal position encoding method to explicitly characterize the relative position of each observation in the long-term time series prediction task, ensuring that the temporal dependence can be effectively captured by the subsequent LSTM encoder.

[0117] Through the above embedding and standardization processing methods, the original multimodal observation sequence is transformed into a set of time-step feature vectors with unified dimensions and time synchronization, realizing a highly compatible initial input representation for data-driven branches.

[0118] S4.2: Spatial dimension expansion and temporal alignment interpolation are performed on the implicit feature representation output by the intermediate mapping network to keep it synchronized with the structured historical observation dataset in terms of temporal resolution. An enhanced physical prior feature tensor is generated through a nonlinear activation function as the input to the physical guidance branch to ensure that the information output by the physical model can effectively participate in subsequent attention calculations.

[0119] For the implicit feature vector sequence output by the intermediate mapping network, a spatial dimension expansion algorithm (parameter: the target spatial grid resolution is consistent with GNSS / InSAR data) is adopted to improve the physical features from a lower spatial sampling density to the same spatial resolution as the structured historical observation dataset, so as to ensure the matching of spatial features in subsequent fusion.

[0120] Furthermore, by using a time-axis interpolation method (parameters: linear interpolation or cubic spline interpolation, with the target time step being the unified synchronization step of the observation data), the implicit features are fully aligned with the structured historical observation dataset in the time dimension, and a time-synchronized physical prior feature time series is generated.

[0121] Furthermore, a nonlinear activation function is used (parameters: ReLU or LeakyReLU, threshold setting to prevent gradient vanishing) to perform a nonlinear transformation on the spatially expanded and temporally aligned physical feature sequence, thereby enhancing the sensitivity of the features to changes in local physical states and suppressing the influence of noise in the low-response region.

[0122] Furthermore, by using the feature tensor construction method (parameters: time × space × channel dimension arrangement rules), the physical features after nonlinear transformation are organized into an enhanced physical prior feature tensor, so that it has a unified dimension and is consistent with the input format of the master prediction network of structured historical observation data.

[0123] By using feature synchronization expansion and nonlinear enhancement processing, the implicit features from the previous step are transformed into physical prior feature tensors that match the data-driven input in terms of spatial resolution, temporal synchronization, and response intensity. This enables the effective participation of the physical model output in the main prediction network and the efficient computation of the attention mechanism.

[0124] For example, during implementation in a mining area, the implicit feature vector sequence output by the intermediate mapping network has a spatial resolution of 5km grid, while the GNSS / InSAR fused grid resolution of the structured historical observation dataset is 1km. A bilinear interpolation expansion algorithm is used to expand the implicit feature matrix from 5×5 nodes to 25×25 nodes, with the parameters set so that the weight matrix varies inversely with the square of the distance. On the time axis, the original time step of the implicit features is 3 hours, and the observation data has a uniform time step of 1 hour. A cubic spline interpolation method is applied to generate the aligned time series, with the boundary first derivative set to 0 during interpolation to smooth the endpoint trends. The nonlinear enhancement stage uses the LeakyReLU function with a leakage coefficient of 0.01 to ensure that a certain gradient signal is retained in the low response region of the physical features. After processing, the constructed enhanced physical prior feature tensor has dimensions of T=720 time steps, X=25 spatial grids, and C=16 channels. Structure. In performance verification, this tensor is input to the physical guidance branch of the main prediction network, which can significantly improve the weight allocation of the spatiotemporal attention mechanism to the region of abrupt change in settlement rate, thereby improving the response performance of the prediction model in the identification of abnormal settlement events in mining areas.

[0125] S4.3: The time-step feature vector sequence and the enhanced physical prior feature tensor are fed into the dual-stream coding path respectively. The data-driven branch uses stacked LSTM layers to extract long-term temporal dependencies and generate high-order temporal hidden states. The physical guidance branch captures local physical state evolution patterns through a one-dimensional convolutional network and outputs physical perception context vectors to form independent deep representations of the two types of features.

[0126] The input conditions are the time-step feature vector sequence generated by step S4.1 and the enhanced physical prior feature tensor generated by step S4.2. Both are consistent in terms of temporal resolution and spatial encoding dimension, and are both standardized tensor formats adapted to the main prediction network interface.

[0127] A stacked Long Short-Term Memory (LSTM) network (parameters: number of hidden units h=128, number of layers L=3, dropout rate=0.2) is used to perform stepwise recursive state updates on the time step feature vector sequence, thereby realizing the long-term temporal dependency modeling of the surface subsidence deformation process, and outputting the temporal hidden state matrix of the highest layer as a high-order representation of the data-driven branch.

[0128] Furthermore, the input gate, forget gate, and output gate of the LSTM unit are batch normalized through a gated recurrent unit (GRU) preactivation strategy to enhance the stability of the model to fluctuations in different operating conditions and generate a set of time-series hidden state vectors that have been gated, thereby improving the consistency of long-term predictions.

[0129] Furthermore, a one-dimensional convolutional network (1D-CNN) (parameters: kernel size k=5, stride=1, number of kernels filters=64) is used to perform convolution operations on the enhanced physical prior feature tensor along the time dimension to capture the local evolution pattern of the physical state vector between adjacent time steps, and generate convolution output feature maps as preliminary representations of the physical guidance branch.

[0130] Furthermore, the convolution output feature map is temporally compressed by the max pooling operation (parameter: pooling window size=2), which preserves significant physical pattern change points and reduces the sequence length, so that the context features of the physical guide branch have higher weights on key changes during the fusion stage.

[0131] Through the aforementioned dual-stream coding path, the higher-order temporal hidden states of the data-driven branch and the physical perception context vectors of the physical guidance branch are used as two independent deep representations to provide input for the cross-modal spatiotemporal attention mechanism in the next sub-step, thereby achieving complementary enhancement of physical information and observation data features.

[0132] For example, for the input time-step feature vector sequence, the stacked LSTM layers are configured with 128 hidden units per layer and a regularized dropout rate of 0.2. Let the input sequence length be T=48 (representing data from the past 48 hours), and each time step contain M=64-dimensional features. After processing through 3 layers of LSTM, a high-order temporal hidden state matrix with shape [48, 128] is obtained. For the enhanced physical prior feature tensor, with a shape of [48, 32], the local patterns of the physical features within the time window are calculated using a 1D-CNN convolution kernel size of 5 and a stride of 1. The convolution output shape is [44, 64]. A max-pooling window of 2 is used to compress it, resulting in an output shape of [22, 64], which serves as the physical awareness context vector. During the fusion phase, the high-order temporal hidden state matrix and the physical perception context vector are fed into the attention weight calculation module. Feature interaction is achieved through soft normalization of the correlation score matrix. Finally, in the validation set test, the RMSE of trend prediction is significantly reduced, and the smoothness and physical consistency of the prediction curve under complex conditions are improved.

[0133] S4.4: Based on the generated high-order temporal hidden states and physical perception context vectors, a cross-modal spatiotemporal attention mechanism is constructed: the correlation score matrix of the two in the time step and spatial dimension is calculated, the attention weight distribution is obtained by using softmax normalization, and the high-order temporal hidden states are weighted and aggregated to generate a data-driven context representation that integrates physical constraints, thereby achieving selective enhancement of key spatiotemporal nodes.

[0134] Based on the high-order temporal hidden state matrix H t With the physical perception context matrix C p A cross-modal correlation calculation method (parameters: time step range T, number of spatial nodes N) is adopted to measure the matching of two types of features in the spatiotemporal dimension.

[0135] Furthermore, a two-dimensional correlation mapping algorithm is used (parameters: correlation is measured by dot product attention, and the spatial topological adjacency matrix A). s This achieves joint matching of time series features and spatial distribution features, resulting in a preliminary correlation score matrix S. tc .

[0136] Furthermore, the softmax normalization method is employed (parameter: normalization dimension is a dual axis of time step and spatial node) to achieve normalization of the correlation score matrix S. tc The probabilistic processing is performed, and a normalized attention weight matrix W is generated. tc This is used to measure the contribution intensity of each spatiotemporal location to the fusion result.

[0137] Furthermore, through a weighted aggregation algorithm (parameter: weight matrix W) tcWith the higher-order temporal hidden state matrix H t ), to achieve H t We perform a weighted summation of the spatiotemporal nodes to obtain a data-driven context representation U that incorporates the effects of physical constraints. td .

[0138] By using a cross-modal spatiotemporal attention fusion algorithm, the weighted context representation U from the previous step is... td This transforms the input features into physically interpretable predictive features, thereby enhancing the response to key spatiotemporal nodes.

[0139] S4.5: Input the fused context representation into the decoder module, combine it with the autoregressive strategy to gradually predict the surface subsidence displacement sequence for multiple time steps in the future, output the subsidence trend curve and confidence interval for the future period, complete the end-to-end mapping from multi-source input to trend prediction, and provide interpretable deformation evolution prediction results for the target mining area.

[0140] The fused context representation is input into the decoder module, and a sequence-to-sequence decoding algorithm based on gated recurrent units (GRU) is adopted (parameters: 128 hidden units, 2 layers, dropout rate 0.1) to gradually unfold and predict the temporal information of the fused features, forming a preliminary displacement prediction sequence for multiple future time steps.

[0141] Furthermore, by employing an autoregressive strategy (parameters: prediction step size Δt = 1 hour, window length H = 24 hours), the prediction result of the previous step is used as an auxiliary input for the current step in the prediction of each time step, thereby realizing the recursive iteration of time series prediction and obtaining the full sequence prediction value as time progresses.

[0142] Furthermore, a multi-step feedforward correction method (parameter: the correction coefficient matrix is ​​derived from the least squares fitting of historical residuals) is adopted to compensate for errors in the displacement sequence generated by autoregression, thereby improving the consistency of the predicted sequence in terms of contour shape and numerical amplitude, and generating a corrected predicted displacement time series.

[0143] Furthermore, based on the predicted displacement sequence, the trend curve and confidence interval are calculated. Using Bayesian residual analysis, a conditional probability distribution is constructed, and the mean prediction and standard deviation estimate for each future time step are obtained. The mean prediction is based on the center value of the entire sequence output, and the standard deviation is calculated using the following formula:

[0144]

[0145] in For the first The residual of the time step, To predict the total number of steps.

[0146] By smoothing curves and extracting interval boundaries, the statistically analyzed prediction sequence is transformed into the future subsidence trend curve of the target mining area and its corresponding confidence interval, thus achieving end-to-end trend prediction and uncertainty assessment.

[0147] For example, in a mining area, the decoding and prediction steps are performed. The fused context vector dimension is set to 256. After two layers of processing by the GRU decoder, 128 hidden units are configured, the dropout rate is 0.1, the autoregressive window length is set to 24 hours, and the step size is 1 hour, resulting in a preliminary predicted displacement sequence for the next 72 hours. A correction coefficient matrix is ​​established using historical one-week residual data, with the predicted value adjusted by no more than 2.5 millimeters to reduce error accumulation. Bayesian residual analysis is used to calculate the predicted standard deviation for each time step in the next 72 hours. The center value of the curve is taken as the mean trend, and the upper and lower bounds are generated based on the standard deviation calculated by the formula, resulting in a complete trend curve and confidence interval. The application results show that the curve shape is stable, the interval width is reasonable, the interpretability of trend changes is significantly improved, and it still possesses high generalization prediction ability under complex working conditions.

[0148] Step S5: Train the intermediate mapping network and the main prediction neural network. During training, fix the parameters of the physics simulation submodule, jointly optimize the weights of the intermediate mapping network and the main prediction neural network, and determine the learning consistency between the physics-guided branch and the data-guided branch based on the gradient direction angle in the backpropagation path, generating a gradient coordination state flag. Specifically, this includes:

[0149] S5.1: Based on the physical state vector sequence and structured historical observation dataset output by the physical simulation submodule completed in the previous steps, obtain the implicit feature representation output by the intermediate mapping network (IMN) and the structured historical observation dataset processed by the main prediction neural network, and use them as dual-path differentiable input signals to construct a joint loss function containing physical prior and data-driven information, providing a unified optimization objective basis for subsequent gradient analysis.

[0150] Based on the physical state vector sequence output by the fixed parameters of the physical simulation submodule and the structured historical observation dataset, a parallel loading method using a data pipeline is adopted. The physical state vector sequence is input into the intermediate mapping network (IMN) to generate implicit feature representation, while the structured historical observation dataset is input into the main prediction neural network to form a historical observation feature sequence, thereby realizing the synchronous acquisition of dual-path differentiable signals.

[0151] Furthermore, an input interface adaptation algorithm (parameter: feature dimension matching threshold δ) is used. f The implicit feature vectors output by the IMN and the embedding layer output of the main prediction network are dimensionally calibrated to generate a feature set that meets the joint optimization requirements, ensuring that the two types of features maintain a consistent tensor shape during gradient backpropagation.

[0152] Furthermore, a feature fusion preprocessing method (parameters: normalized mode Z-score, fusion ratio η) is adopted to map the dimension-calibrated implicit feature representation and the historical observation feature sequence to the physical prior branch and the data-driven branch respectively in the loss function calculation path, generating a joint feature input matrix, which is used to construct a composite loss function containing physical constraint terms and data fitting terms.

[0153] Furthermore, by constructing an algorithm using the loss function, the physical guidance branch loss term is defined as the physical consistency error, and the data guidance branch loss term is defined as the prediction error, resulting in the following composite loss L expression:

[0154]

[0155] in, For physical consistency weights, Fit weights to the data, The constraint error is calculated based on the physical state vector and implicit feature representation. This represents the prediction error based on a structured historical observation dataset.

[0156] By using the aforementioned composite loss function, the dual input signals are unified into the same optimization target space, generating comparable basic data for subsequent gradient analysis and direction angle calculation, thus achieving the optimized fusion of physical priors and data-driven information.

[0157] For example, in a coal mine monitoring scenario, the physical state vector sequence includes local stress gradient values ​​ranging from [0.15, 0.35] MPa / m, vertical strain increments ranging from [1.5e-5, 3.2e-5], and principal strain direction angles ranging from [45°, 60°]. The structured historical observation dataset includes GNSS deformation rates ranging from [5.2, 8.3] mm / day and InSAR deformation rates ranging from [4.9, 8.0] mm / day. The IMN output implicit feature representation dimension is 128, and the main prediction network embedding layer output dimension is 128. A perfect match of feature dimensions is achieved through an input interface adaptation algorithm. The fusion ratio η is set to 0.4, and Z-score normalization is calculated based on the mean and standard deviation of the past 30 days. In the composite loss function, α = 0.6 and β = 0.4.

[0158] S5.2: During backpropagation, extract the first gradient vector G from the implicit feature representation path respectively. p (Physically guided branch gradient) and the second gradient vector G from the historical observation sequence path d (Data-driven branch gradient), where G p G reflects the direction of the influence of physical priors on network parameter updates. dThis reflects the adjustment direction of data-driven learning, and the two constitute a dual-path learning representation in the gradient space.

[0159] S5.3: Based on the formula for calculating the vector angle, the first gradient vector G is... p With the second gradient vector G d Perform cosine similarity calculation to obtain the gradient direction angle θ = arccos( (G p ·G d ) / (||G p || ||G d ||) ) is used to quantify the degree of learning consistency between the physics-guided branch and the data-guided branch within the current training step, forming a continuous learning coordination index.

[0160] The gradient vectors G extracted during backpropagation based on the physical bootstrap branch and the data bootstrap branch, respectively. p With G d We use the vector angle calculation formula for quantitative analysis to construct a continuous learning coordination index.

[0161] The method of calculating similarity using vector dot product and normalized cosine similarity (parameter: G) is adopted. p G d This allows for a quantitative measurement of the similarity between the directions of two types of gradient vectors.

[0162] Furthermore, the inner product value of the gradient direction is obtained through vector dot product operation, and the norm of the two gradient vectors is solved using the modulus calculation formula.

[0163]

[0164] Among them, G p G represents the gradient vector from the physical prior feature path. d This represents the gradient vector from the data-driven feature path. The angle between the two directions. This is the vector dot product operator. and These are the vector magnitudes.

[0165] Furthermore, the original similarity value is mapped to a continuous variable within the [0,π] radian range using the angle calculation results, thereby achieving a quantifiable representation of the degree of consistency in gradient direction.

[0166] Furthermore, the included angle value is converted into a dimensionless coordination coefficient through normalization, forming a smooth and continuously trackable sequence of learning coordination indicators, which serves as the data input for the dynamic decoupling coefficient adjustment mechanism.

[0167] By using the cosine similarity operation of the vector angle, the gradient vector result of the previous step is transformed into a quantifiable directional consistency index, thereby realizing the knowledge fusion stability assessment of the fusion model during the training process.

[0168] For example, in the training scenario of a surface subsidence prediction model in a mining area, the physical guide branch gradient vector... Length is Data-guided branch gradient vector Length is The inner product of the two is calculated as follows: Substituting into the above formula, we get: the denominator is ,Right now molecule is The result is that the cosine value is Corresponding angle radians. This included angle value is normalized and mapped to a compatibility coefficient. This state is marked as high consistency in the training log. In this state, the dynamic decoupling coefficient remains high, the physical guidance branch has a significant guiding effect on the main prediction network, and the model effectively captures the nonlinear characteristics of the observed data while incorporating physical laws.

[0169] S5.4: Based on the preset included angle threshold range [θ] low , θ high The gradient direction angle θ is mapped to a discrete state label: when θ ≤ θ low When θ is determined to be a strongly consistent state, θ ≥ θ high When a state is identified as a strong conflict state, and the state in between is a transitional state, a gradient coordination state flag with clear semantics is generated to indicate the stability of knowledge fusion in the current training phase.

[0170] Based on the gradient direction angle obtained in the preceding step S5.3 As an input variable, an interval threshold discrimination method is used to realize the discretization mapping function of gradient consistency degree.

[0171] Furthermore, by comparing the operation module (condition: This process generates a strongly consistent state label and binds the label to the semantic identifier "Strong-Consistent," resulting in a judgment result data showing a high degree of coordination between the physical bootstrap branch and the data bootstrap branch.

[0172] Furthermore, by comparing the operation module (condition: This process generates a label for a strong conflict state and binds the label to the semantic identifier "Strong-Conflict," resulting in a judgment data showing that the gradient directions of the physical guidance branch and the data guidance branch are significantly diverging.

[0173] Furthermore, through the range discrimination method (condition: This enables the generation of transition state labels and assigns the semantic identifier "Moderate-Consistency", forming a judgment result data that shows a certain difference between the physical guidance branch and the data guidance branch but does not reach a conflict.

[0174] Through the aforementioned interval mapping and label binding algorithm, the continuous gradient coordination index of the previous step is transformed into a state flag with three distinct semantic categories: strong consistency, transition, and strong conflict, thereby achieving a precise indication of the stability of knowledge fusion in the current training phase.

[0175] For example, during the training process of a settlement prediction model in a certain mining area, the following settings are made: = , = In the 200th round of training, calculations showed that... = ,satisfy The system generates a strongly consistent state label, "Strong-Consistent," indicating a high degree of alignment between the physical prior and the data-driven path. In the 350th training round, calculations... = ,satisfy The system generates a transitional state label "Moderate-Consistency," indicating that the two paths exhibit moderate differences but still maintain basic fusion. In the 500th training round, calculations... = ,satisfy The system generates a "Strong-Conflict" label, indicating a significant deviation between the physical guidance and data fitting directions. This necessitates adjusting the adaptive decoupling coefficients in subsequent training to mitigate the conflict. The aforementioned label output has been validated to significantly improve the system's ability to adjust its response to fusion stability at different training stages, ensuring the stability and generalization performance of the model optimization process.

[0176] S5.5: The generated gradient coordination state flag is output as a feedback signal to the next training stage for the adaptive decoupling coefficient adjustment mechanism to achieve dynamic control over the contribution weight of implicit feature representation in the loss function, ensuring that the model maintains a balanced learning process between physical guidance and data fitting.

[0177] Step S6: Dynamically adjust the adaptive decoupling coefficient based on the gradient coordination state flag. This coefficient is used to adjust the contribution weight of implicit feature representations in the loss function, realizing a progressive learning strategy with high physical guidance in the early stages of training and high data adaptability in the later stages. Specifically, it includes:

[0178] S6.1: Based on the gradient coordination state flag generated in the preceding steps, this flag is obtained by calculating the angle between the gradient directions of the physical guidance branch and the data guidance branch during backpropagation, and is used to characterize the consistency between the two types of learning paths. Using this flag as an input condition, the adaptive decoupling coefficient α(t) is initialized, and its initial value is set to a high level to enhance the learning intensity of physical prior knowledge in the early stage of training.

[0179] S6.2: Normalize the current training round t and map it to the interval [0,1] as the time evolution factor τ(t). This factor reflects the relative stage of the model training process. Perform an S-shaped decay function transformation based on τ(t) to generate a basic decoupling weight curve. The output of this curve serves as the benchmark reference value for the adaptive decoupling coefficient α(t), ensuring that the physical guidance effect gradually weakens as training progresses.

[0180] S6.3: The gradient coordination state flag and the basic decoupling weight curve are nonlinearly fused, and a dynamic modulation function is constructed using a differentiable gating mechanism. This function takes the gradient consistency degree as a condition. When the gradient directions of the physical guidance branch and the data guidance branch are highly consistent, a higher α(t) value is retained to maintain the strength of the physical constraint. When a significant deviation occurs, α(t) is reduced more rapidly to weaken the influence of the physical path on the main prediction network and avoid optimization oscillations caused by the propagation of erroneous gradients.

[0181] S6.4: The dynamically adjusted adaptive decoupling coefficient α(t) is introduced into the composite loss function of the main prediction network, and the physical consistency term in the loss function is multiplied by α(t) to achieve weighted control of implicit feature representation in the model optimization objective. Through this weighting process, the model prioritizes fitting geomechanical laws in the early stage of training and focuses on capturing nonlinear dynamic characteristics in the observation data in the later stage, forming a learning mechanism with phased emphasis.

[0182] S6.5: Output the time-varying adaptive decoupling coefficient α(t) and its corresponding weighted loss function as the basis for the next round of parameter updates; at the same time, record the α(t) sequence to the training log for subsequent analysis of the physical-data learning dynamic balance process, and support the optimization of transfer training strategies in cross-mining scenarios.

[0183] Step S7: Determine whether the validation set error of the main prediction neural network in the current training round meets the convergence condition. If it does, output the final trained subsidence trend prediction model; otherwise, return to continue executing parameter updates and decoupling control. Specifically, this includes:

[0184] S7.1: Based on the difference between the prediction output of the main prediction neural network on the independent validation set and the actual surface subsidence observations in the previous training round, the root mean square error (RMSE) is calculated as the current model performance evaluation index to quantify the model's learning accuracy of historical deformation trends.

[0185] S7.2: Compare the RMSE of the validation set calculated in the current round with the preset convergence threshold, and combine this with whether the error decrease within three consecutive training rounds is lower than the set minimum rate of change Δmin, to determine whether the model has entered a performance saturation state, and generate a preliminary convergence judgment flag f. converge .

[0186] S7.3: Initial convergence determination flag f converge Gradient stability verification: By analyzing the gradient norm change trend of the weights of the last layer of the master prediction network, it is determined whether it tends to be stable; if the gradient oscillation amplitude is less than the preset stable interval ε, it is confirmed that the model training process has converged, and the final convergence confirmation signal is output.

[0187] For the training state corresponding to the initial convergence judgment flag, the gradient norm calculation method (parameter: the weight vector takes all trainable parameters of the last layer of the main prediction network) is used to realize the quantitative evaluation of the gradient change magnitude.

[0188] Furthermore, through time series analysis methods (parameter: sliding window length set to...), (In 1 training round), the stationarity of the gradient norm sequence is tested, and the gradient mean change curve is obtained.

[0189] Furthermore, by using a gradient oscillation amplitude calculation method, the gradient norm range between adjacent windows is compared with a preset stable interval. The results are compared and a stability determination is generated.

[0190] S7.4: If the final convergence confirmation signal is true, freeze all trainable parameters of the intermediate mapping network and the main prediction neural network, encapsulate them to form a fusion subsidence trend prediction model with physical interpretability and high generalization ability, and persistently store it in the model library for subsequent online inference.

[0191] S7.5: If the convergence condition is not met, the adaptive moment estimation optimizer (AdamW) is used to perform incremental update operations on the weight parameters of the intermediate mapping network and the main prediction neural network based on the gradient information obtained by backpropagation of the current loss function, and the adaptive decoupling coefficient α(t) is adjusted simultaneously to enter the next round of iterative training process.

[0192] Based on the determination that the convergence condition is not met in the current training round, the input data includes the loss function value of the previous round, the gradient tensor set obtained by backpropagation, and the adaptive decoupling coefficient α(t) dynamically adjusted by step S6.

[0193] The gradient backpropagation algorithm (parameters: learning rate η=0.001, weight decay λ=0.01) is used to perform gradient calculation on the trainable weights of the intermediate mapping network and the main prediction neural network while preserving the frozen state of the physical simulation submodule parameters, thereby generating a complete set of parameter update direction matrices.

[0194] Furthermore, by combining the α(t) value of the current round, the update step size of AdamW output is weighted and modulated so that the weight update magnitude corresponding to the implicit feature path loss term is multiplied by α(t) to ensure a dynamic balance between physical guidance and data-driven learning.

[0195] By synchronously adjusting α(t) and the weight update step size, the learning intensity in the gradient conflict region of the previous training is appropriately weakened, while the update amplitude is enhanced in the region where the gradient direction is highly consistent, thereby improving the stability of iterative optimization.

[0196] Through the AdamW optimization and decoupling coefficient modulation processing described above, the gradient information from the previous step is transformed into weight parameter update data for the next iteration, thereby achieving the technical effect of continuing to optimize the fusion model in a non-convergent state and improving prediction accuracy and stability.

[0197] Step S8: Input the newly collected surface observation data for the new time period into the trained subsidence trend prediction model to obtain the future subsidence risk level, and determine whether to trigger a graded early warning signal based on a preset threshold, thereby realizing an intelligent response to abnormal surface deformation in the mining area. Specifically, this includes:

[0198] S8.1: Acquire new real-time multi-source surface observation data for the next time period, including GNSS deformation monitoring sequences, InSAR differential interferometric atlases, and meteorological and mining activity logs. Perform spatiotemporal alignment and normalization on the data to generate a structured input tensor consistent with the training phase, ensuring that the input data matches the model's expected input in terms of timestamps, spatial resolution, and dimensions.

[0199] S8.2: The structured input tensor is fed into the trained main prediction neural network. The data-driven features and temporal dynamic patterns are extracted through its built-in spatiotemporal attention mechanism. The data is then fused with the implicit feature representation output from the intermediate mapping network to calculate the predicted values ​​of the surface subsidence displacement sequence for multiple future time steps.

[0200] S8.3: Based on the predicted surface subsidence displacement sequence, calculate key risk indicator parameters, including maximum cumulative subsidence, subsidence rate change rate and acceleration trend slope, and use the sliding window statistical method to generate a risk feature vector with physical meaning as the basis for subsidence risk level judgment.

[0201] S8.4: Input the risk feature vector into the preset multi-level classification and discrimination logic module. This module performs graded judgment based on the threshold range calibrated by historical typical working conditions: if the maximum cumulative settlement exceeds the first-level threshold and the settlement rate is lower than the critical value, a yellow warning is output; if the settlement rate change rate exceeds the second-level threshold or the acceleration trend slope suddenly increases, it is upgraded to an orange warning; if multiple indicators simultaneously exceed the limit and continue to deteriorate, a red warning signal is triggered.

[0202] The risk feature vector, consisting of the maximum cumulative settlement, the rate of change of settlement rate, and the slope of acceleration trend, is input into the multi-level classification and discrimination logic module. A discrimination method based on threshold intervals (parameter: historical typical working condition calibration data) is used to classify the current mining area status.

[0203] Furthermore, through an interval matching algorithm (parameter: maximum cumulative settlement threshold) Settlement rate change threshold acceleration trend slope threshold This allows for the numerical comparison of feature components with corresponding risk levels, and yields a preliminary set of risk level labels.

[0204] By using a multi-level classification logic processing method, the risk feature vector from the previous step is transformed into a clear early warning level label, thereby achieving a graded response effect to abnormal surface conditions in the mining area.

[0205] For example, in the real-time prediction results of a certain mining area, the maximum cumulative settlement is 12.3 mm, the settlement rate is 0.15 mm / day, the rate of change is 0.045 mm / day², and the acceleration trend slope is 0.007 mm / day³. A first-level threshold is set. The value is 10 mm, and the critical rate is L. crit 0.2 mm / day, secondary threshold The slope threshold is 0.04 mm / day². The value is 0.006 mm / day³. Substituting the input parameters into the warning determination formula, the yellow warning condition is calculated as follows: The condition is deemed true; the calculation for an orange alert is as follows: True, and The condition for a red alert is true if all conditions are met. The above input is considered true if multiple indicators exceed limits, therefore the final output is a red alert label. This embodiment demonstrates that in complex limit-exceeding scenarios, the module can significantly improve its ability to identify high-risk states and provide the most stringent emergency response level.

[0206] S8.5: The generated settlement risk level information is packaged into a standardized early warning message package, which is pushed to the monitoring center platform through the communication module and linked with the mine safety management system to activate the corresponding emergency response plan. At the same time, the timestamp, location tag, triggering conditions and handling suggestions of this early warning event are recorded to form a closed-loop management log for subsequent traceability and analysis.

[0207] For those skilled in the art, various other corresponding changes and modifications can be made based on the technical solutions and concepts described above, and all such changes and modifications should fall within the protection scope of the claims of this invention.

[0208] Unless otherwise defined, the technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this application pertains. The terms “first,” “second,” “third,” and similar terms used in this patent application specification and claims do not indicate any order, quantity, or importance, but are merely used to distinguish different components. Similarly, the terms “an” or “a” and similar terms do not indicate a quantity limitation, but rather indicate the presence of at least one. The terms “comprising” or “including” and similar terms mean that the elements or objects preceding “comprising” or “including” encompass the elements or objects listed following “comprising” or “including” and their equivalents, and do not exclude other elements or objects. The “multiple” mentioned in the embodiments of this application refers to two or more. A and / or B indicate three possibilities: A; B; and A and B.

[0209] The above description is merely an exemplary embodiment of this application, but the scope of protection of this application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in this application, and such modifications or substitutions should all be covered within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A method for intelligent monitoring and early warning of surface subsidence in coal mining areas, characterized in that, include: S1: Acquire multi-source observation data of the coal mine surface and preprocess it to generate a structured historical observation dataset; S2: Based on prior knowledge of geomechanics, an elastoplastic physical simulation submodule is constructed to calculate the theoretical prediction of the surface displacement field at each time step and output a physical state vector sequence containing stress gradient and strain increment. S3: Based on the physical state vector sequence and the structured historical observation dataset, construct an intermediate mapping network and map the physical state vector into implicit feature representations; S4: The structured historical observation dataset and implicit feature representation are used as dual inputs and fed into the main prediction neural network to generate the prediction results of land subsidence trend in future periods. The main prediction neural network includes a dual-stream coding path with a physical guidance branch and a data guidance branch. S5: Train the intermediate mapping network and the main prediction neural network, determine the learning consistency between the physical guidance branch and the data guidance branch, and generate gradient coordination state flags. S6: Dynamically adjust the adaptive decoupling coefficient based on the gradient coordination state flag. This coefficient is used to adjust the contribution weight of the implicit feature representation in the loss function. S7: Determine whether the validation set error of the main prediction neural network in the current training round meets the convergence condition. If it does, output the final subsidence trend prediction model after training; otherwise, return to continue executing parameter updates and decoupling control. S8: Input the newly collected surface observation data for the next time period into the trained prediction model to obtain the future subsidence risk level.

2. The intelligent monitoring and early warning method for surface subsidence in coal mine areas according to claim 1, characterized in that, The multi-source observation data includes GNSS deformation monitoring sequences, InSAR differential interferometry atlases, and meteorological and mining activity logs.

3. The intelligent monitoring and early warning method for surface subsidence in coal mine areas according to claim 1, characterized in that, Step S1 involves preprocessing the multi-source observation data, including spatiotemporal alignment and normalization.

4. The intelligent monitoring and early warning method for surface subsidence in coal mine areas according to claim 1, characterized in that, The elastoplastic physical simulation submodule described in step S2 uses the thickness of the rock strata in the mining area, the elastic modulus and the distribution parameters of mining-induced stress to calculate the theoretical predicted value of the surface displacement field at each time step.

5. The intelligent monitoring and early warning method for surface subsidence in coal mine areas according to claim 1, characterized in that, The intermediate mapping network in step S3 consists of multiple fully connected neurons, including an input layer, two hidden layers, and an output layer.

6. The intelligent monitoring and early warning method for surface subsidence in coal mine areas according to claim 1, characterized in that, Step S4 includes: Based on a structured historical observation dataset, multivariate time series embedding processing is performed to generate a sequence of time step feature vectors with a unified dimension; an enhanced physical prior feature tensor is generated based on implicit feature representation. The time-step feature vector sequence and the enhanced physical prior feature tensor are fed into the dual-stream coding path respectively. The data-driven branch uses stacked LSTM layers to extract long-term temporal dependencies and generate high-order temporal hidden states. The physical guidance branch captures local physical state evolution patterns through a one-dimensional convolutional network and outputs physical perception context vectors to form independent deep representations of the two types of features. Based on higher-order temporal hidden states and physical perception context vectors, a cross-modal spatiotemporal attention mechanism is constructed: the correlation score matrix between the two in the time step and spatial dimension is calculated, the attention weight distribution is obtained by using softmax normalization, and the higher-order temporal hidden states are weighted and aggregated to generate a data-driven context representation that incorporates physical constraints. The fused context representation is input into the decoder module, and combined with the autoregressive strategy, the surface subsidence displacement sequence at multiple time steps in the future is predicted step by step, and the subsidence trend curve and confidence interval for the future period are output.

7. The intelligent monitoring and early warning method for surface subsidence in coal mine areas according to claim 6, characterized in that, For structured historical observation datasets, multivariate time series embedding processing is performed, including: using learnable linear projection matrices to vectorize GNSS deformation sequences, InSAR differential interferometric atlases, and spatiotemporally aligned meteorological and mining activity logs to generate time step feature vector sequences with uniform dimensions.

8. The intelligent monitoring and early warning method for surface subsidence in coal mine areas according to claim 6, characterized in that, The generation of enhanced physical prior feature tensors based on implicit feature representations includes: performing spatial dimension expansion and time-aligned interpolation on the implicit feature representations output by the intermediate mapping network to keep them synchronized with the structured historical observation dataset in terms of time resolution, and generating enhanced physical prior feature tensors through nonlinear activation functions.

9. The intelligent monitoring and early warning method for surface subsidence in coal mine areas according to claim 1, characterized in that, In step S5, during the training of the main prediction neural network, the parameters of the physical simulation submodule are fixed, the weights of the intermediate mapping network and the main prediction neural network are jointly optimized, and the learning consistency between the physical guidance branch and the data guidance branch is determined based on the gradient direction angle in the backpropagation path, and a gradient coordination state flag is generated.

10. The intelligent monitoring and early warning method for surface subsidence in coal mine areas according to claim 9, characterized in that, During backpropagation, the first gradient vector G from the implicit feature representation path is extracted respectively. p The second gradient vector G from the path of the structured historical observation dataset d For the first gradient vector G p With the second gradient vector G d Perform cosine similarity calculation to obtain the gradient direction angle θ, so as to quantify the degree of learning consistency between the physical guidance branch and the data guidance branch within the current training step; According to the preset included angle threshold range [θ low ,θ high The gradient direction angle θ is mapped to a discrete state label: When θ ≤ θ low When θ is determined to be a strongly consistent state, θ ≥ θ high When a state is in a strong conflict state, it is determined to be in a transitional state, and thus a gradient coordination state flag with clear semantics is generated.