A runoff ensemble prediction method based on spatial heterogeneity and variable attention
By employing watershed spatial partitioning, adaptive feature selection for forecast period, and variable attention modeling, the spatiotemporal heterogeneity and multi-scale features of runoff prediction in complex watersheds are addressed, achieving high-precision runoff ensemble prediction and improving the model's stability and applicability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHINA THREE GORGES CORPORATION
- Filing Date
- 2026-03-25
- Publication Date
- 2026-06-19
AI Technical Summary
Existing deep learning runoff prediction methods suffer from problems such as insufficient spatiotemporal heterogeneity matching, lack of multi-source feature screening mechanisms, limited reconstruction methods for prediction results, and limited multi-scale feature capture capabilities in complex watersheds. These problems result in insufficient generalization ability of the models in long-sequence prediction and low prediction accuracy for extreme runoff events.
A runoff ensemble prediction method based on spatial heterogeneity and variable attention is adopted. Through watershed spatial partitioning, adaptive feature selection for the forecast period, two-stage runoff sequence decomposition and variable attention modeling, combined with a reconstruction strategy of physical interaction coupling, the method can achieve accurate modeling and prediction of runoff multi-components.
It improves the accuracy and stability of runoff prediction, can accurately capture extreme runoff events under complex watershed conditions, provides high-quality multi-scale prediction results, and enhances the interpretability and applicability of the model.
Figure CN122241152A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of hydrological forecasting and water resources management technology, and in particular to a runoff ensemble forecasting method based on spatial heterogeneity and variable attention.
Background Technology
[0002] Runoff forecasting is a key technological support for water resource regulation, flood control and disaster reduction, and sustainable watershed development. In recent years, driven by both climate change and human activities, the frequency of extreme hydrological events worldwide has increased significantly, and watershed hydrological processes have exhibited stronger spatial heterogeneity and non-stationary dynamic characteristics. This complexity poses greater challenges to runoff modeling and forecasting in large watersheds, placing higher demands on the accuracy, stability, and interpretability of models. To address the runoff forecasting problem in complex watersheds, various modeling approaches have been proposed. Among them, physical mechanism models, by parametrically describing watershed topography, underlying surface, and hydrological processes, can systematically characterize the mechanism of rainfall-runoff conversion. However, they are highly dependent on high-quality input data, have numerous model parameters, and a rigid structure, making them insufficiently adaptable to rapidly changing climate and land use conditions. In contrast, data-driven methods, especially time-series forecasting models based on deep learning, have demonstrated significant advantages in tasks such as runoff forecasting and flood warning due to their powerful nonlinear fitting and spatiotemporal feature learning capabilities, providing a new approach for the dynamic simulation of complex watersheds.
[0003] However, existing deep learning runoff prediction methods still have several shortcomings: (1) Insufficient matching of spatiotemporal heterogeneity. Most studies ignore the differences in spatial structure within the watershed and the spatiotemporal lag of the confluence process. Models struggle to adaptively capture the dynamic time lag between rainfall and runoff response, which makes key features hard to align. (2) Lack of a multi-source feature screening mechanism. Multi-source meteorological variables are generally coupled and partly redundant. Existing methods struggle to dynamically screen the key driving factors for different forecast periods, so the models generalize poorly in long-sequence prediction. (3) Limited reconstruction of prediction results. Existing multi-component prediction methods mostly reconstruct the result by simple linear superposition, ignoring the complex nonlinear interaction and coupling between runoff components, which limits the prediction accuracy for extreme runoff events. (4) Limited multi-scale feature capture in existing time series models. Traditional RNN-type models struggle to characterize long-range dependence and multi-scale changes. Although the Transformer has the advantage of global modeling, it adapts poorly to local abrupt changes and shifts in time series patterns, and there is room for improvement in spatiotemporal feature alignment and fusion. Therefore, how to effectively integrate spatial heterogeneity information, optimize feature representation, and improve the model's ability to capture multi-scale temporal features in complex watersheds has become a key issue in current runoff prediction research.
Summary of the Invention
[0004] The purpose of this invention is to overcome the above-mentioned shortcomings and provide a runoff ensemble prediction method based on spatial heterogeneity and variable attention, which aims to improve the accuracy and stability of runoff prediction under complex watershed conditions.
[0005] To solve the above technical problems, the present invention adopts the following technical solution: a runoff ensemble prediction method based on spatial heterogeneity and variable attention, including the following steps:
Step 1, Data collection and processing: Acquire the historical runoff sequence, meteorological reanalysis data, and topographic and spatial baseline data of the target hydrological station; preprocess the data to form a spatiotemporal driving dataset.
Step 2, Spatial partitioning of the watershed: Based on the digital elevation model (DEM) of the watershed in the spatiotemporal driving dataset, the control watershed of the target hydrological station is divided into several sub-regions by calculating the cumulative flow (flow accumulation), extracting the river network, and delineating sub-watersheds.
Step 3, Adaptive feature selection for the forecast period: Based on the spatial partitioning results, the meteorological driving factors of each sub-region are extracted, and the regional average of the raster data is calculated for each sub-region to form a numerical sequence representing its overall meteorological conditions. Highly correlated candidate features are initially selected through Pearson correlation analysis. An adaptive selection mechanism for the forecast period is then introduced, which sets differentiated selection thresholds according to the forecast duration: a low threshold for short forecast periods to retain high-frequency detailed features, and a high threshold for long forecast periods to retain dominant trend features. Finally, a dynamic input feature set suited to each forecast period is constructed, so that the input data physically match the forecast target.
Step 4, Two-stage runoff sequence decomposition: Seasonal-trend decomposition based on locally weighted regression (STL) is used to separate the trend component, seasonal component, and residual component of the historical runoff sequence. The parameters of variational mode decomposition (VMD) are then optimized by the sparrow search algorithm (SSA). Finally, the residual component is adaptively decomposed by the optimized VMD to extract multi-scale high-frequency disturbance components.
Step 5, Variable attention modeling: Construct a time series prediction model with variable deformable attention and time deformable attention mechanisms. The runoff multi-component sequences obtained in Step 4 and the meteorological features selected in Step 3 are modeled jointly, and a corresponding prediction model is trained and optimized for each forecast period.
Step 6, Two-stage reconstruction based on physical interaction coupling: A two-stage training strategy is adopted to reconstruct the multi-component prediction results through physical interaction coupling. In the first stage, the parameters of each component prediction model are frozen, and the predicted values of the trend, seasonal, and high-frequency disturbance components are output independently. In the second stage, a component interaction coupling network is constructed to apply nonlinear modulation to the runoff. Finally, the ensemble prediction results for multiple forecast periods are generated by nonlinear weighted superposition.
[0006] Preferably, in step 1, the historical runoff sequence includes the collected runoff time series of the target hydrological station, with the time scale selected as daily, monthly, or hourly scale according to the research objective; the data sources include historical monitoring records of the watershed management agency, measured data from automatic hydrological monitoring stations, and scientific research databases; the meteorological reanalysis data includes rainfall, average temperature, maximum temperature, minimum temperature, and evapotranspiration or potential evapotranspiration, and the data are derived from reanalysis data; the topographic and spatial basic data include digital elevation model (DEM), watershed boundaries, and river network distribution.
[0007] Preferably, in step 1, the data preprocessing includes: Time and space unification: In the time dimension, the target prediction time scale is used as the benchmark to ensure that multi-source data are aligned in time; in the spatial dimension, meteorological and spatial data are uniformly resampled to a consistent spatial resolution. Missing data processing: For missing data in the hydrological runoff sequence, linear interpolation is used to fill in the missing data points, and interpolation correction is performed by combining runoff trend information in the adjacent time period.
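The gap-filling part of the preprocessing can be illustrated with a minimal Python sketch. It covers only the linear interpolation of missing runoff values; the trend-based correction mentioned above is not implemented. The function name `fill_gaps_linear`, the use of `None` to mark missing values, and the choice to copy the nearest valid value at the series edges are all assumptions made for illustration.

```python
from typing import List, Optional


def fill_gaps_linear(series: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between the nearest valid neighbours.

    Leading and trailing gaps copy the nearest valid value (an assumption;
    the source does not say how edge gaps are handled).
    """
    valid = [i for i, v in enumerate(series) if v is not None]
    if not valid:
        raise ValueError("series has no valid observations")

    filled = list(series)
    # Edge gaps: extend the first and last valid observations outward.
    for i in range(valid[0]):
        filled[i] = series[valid[0]]
    for i in range(valid[-1] + 1, len(series)):
        filled[i] = series[valid[-1]]

    # Interior gaps: straight line between consecutive valid observations.
    for left, right in zip(valid, valid[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled
```

For example, `fill_gaps_linear([1.0, None, None, 4.0])` yields an evenly spaced ramp between the two observed values.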
[0008] Preferably, in step 2, the watershed spatial zoning further includes a DEM preprocessing step: filling depressions in the DEM data to eliminate false depressions, and using the D8 algorithm to calculate the surface runoff direction of each grid cell.
[0009] Preferably, in step 2, the calculation of the cumulative flow is based on DEM flow direction data and is performed according to the logic of grid water flow converging from upstream to downstream. By traversing each grid, the number or area value of all grid points upstream of each grid that can converge to that grid unit is counted, thereby obtaining the cumulative flow.
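As a rough illustration of the D8 flow direction and cumulative flow calculation, the following Python sketch works on a small DEM given as a nested list. It assumes the DEM has already been depression-filled, counts accumulation in grid cells (each cell counts itself), and treats a cell with no lower neighbour as an outlet. The function names and data layout are hypothetical and not taken from the source.

```python
# Offsets (row, col) of the eight neighbours considered by the D8 scheme.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_direction(dem):
    """For each cell, return the neighbour with the steepest downhill slope (or None)."""
    rows, cols = len(dem), len(dem[0])
    direction = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope, best_cell = 0.0, None
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    distance = (dr * dr + dc * dc) ** 0.5  # 1 or sqrt(2)
                    slope = (dem[r][c] - dem[nr][nc]) / distance
                    if slope > best_slope:
                        best_slope, best_cell = slope, (nr, nc)
            direction[r][c] = best_cell  # None marks an outlet or pit
    return direction


def flow_accumulation(dem, direction):
    """Count, for each cell, the cells (including itself) that drain through it."""
    rows, cols = len(dem), len(dem[0])
    acc = [[1] * cols for _ in range(rows)]
    # Flow is strictly downhill, so visiting cells from high to low elevation
    # guarantees that a cell's total is final before it is passed downstream.
    order = sorted(
        ((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
        reverse=True,
    )
    for _, r, c in order:
        target = direction[r][c]
        if target is not None:
            tr, tc = target
            acc[tr][tc] += acc[r][c]
    return acc
```

On a 3x3 DEM that slopes toward one corner, the corner cell ends up with an accumulation equal to the total number of cells, which is the kind of value a confluence threshold would be compared against when extracting the river network.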
[0010] Preferably, in step 2, the river network extraction and watershed division are carried out by setting a confluence threshold to extract the river network skeleton, and using the watershed algorithm to simulate water flow convergence from local high points based on the cumulative flow and river network information, and delineating the range of adjacent sub-basins. Each sub-basin contains an independent outlet point; each sub-region corresponds to an independent catchment unit with consistent topographic structure and runoff response characteristics.
[0011] Preferably, in step 2, the watershed spatial partitioning further includes a sub-watershed optimization and merging step: eliminating sub-watersheds with relatively small areas that cannot independently reflect runoff generation and confluence characteristics, merging adjacent sub-watersheds with similar hydrological attributes, so that the runoff process and hydrological response within each sub-region remain consistent.
[0012] Preferably, in step 3, the meteorological driving factors include rainfall, average temperature, maximum temperature, minimum temperature, and evapotranspiration or potential evapotranspiration.
[0013] Preferably, in step 3, the correlation coefficient for Pearson correlation analysis is calculated as follows:
$$ r = \frac{\sum_{i=1}^{n}\left(x_i-\bar{x}\right)\left(y_i-\bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^{2}}\,\sqrt{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}}} $$
[0014] In the formula, $x_i$ and $y_i$ represent the meteorological feature value and the runoff observation value at the i-th time step, respectively; $\bar{x}$ and $\bar{y}$ are the means of their respective sequences; and $n$ is the total number of samples.
[0015] Preferably, in step 3, the adaptive feature selection mechanism for the forecast period specifically involves: constructing a dynamic feature selection strategy based on the differences in the forecast requirements of hydrological processes for different forecast periods; setting a lower correlation selection threshold for short-term forecast tasks to retain high-frequency detailed features containing instantaneous disturbance information; setting a higher correlation selection threshold for long-term forecast tasks to eliminate high-frequency noise and retain dominant features reflecting long-term evolution patterns; and achieving physical mechanism matching between the input feature set and the forecast target through the above differentiated threshold settings.
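The forecast-period-dependent screening can be sketched in Python as follows. The threshold values in `HORIZON_THRESHOLDS`, the use of the absolute value of the correlation, and the function names are illustrative assumptions; the source only states that short forecast periods use a lower threshold and long forecast periods a higher one.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant sequence carries no linear information
    return cov / (sd_x * sd_y)


# Hypothetical thresholds: forecast period -> minimum |r| for a feature to be kept.
HORIZON_THRESHOLDS = {1: 0.3, 3: 0.5, 6: 0.7}


def select_features(features, runoff, horizon, thresholds=HORIZON_THRESHOLDS):
    """Keep features whose |r| with runoff reaches the threshold for this horizon."""
    threshold = thresholds[horizon]
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, runoff)) >= threshold
    ]
```

With these assumed thresholds, a moderately correlated feature is kept for the shortest forecast period but dropped for the longest one, which mirrors the differentiated selection described above.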
[0016] Preferably, in step 4, the expression for the seasonal-trend decomposition based on locally weighted regression (STL) is: $Y_t = T_t + S_t + R_t$; in the formula, $Y_t$ is the original natural runoff value at time $t$; $T_t$ is the trend value at time $t$; $S_t$ is the seasonal value at time $t$; and $R_t$ is the residual value at time $t$.
[0017] Preferably, in step 4, when optimizing the parameters of variational mode decomposition (VMD) using the Sparrow Search Algorithm (SSA), the number of modes K and the penalty factor α are selected as optimization variables. The optimal parameter combination is found through the co-evolution of three types of individuals—discoverers, participants, and vigilants—in the SSA algorithm.
[0018] Preferably, in step 4, when optimizing the parameters of variational mode decomposition (VMD) using the Sparrow Search Algorithm (SSA), the constructed multi-objective fitness function is composed of two terms: the reconstruction error and the total envelope entropy;
[0019] The reconstruction error is computed from the original signal sequence, the signal reconstructed from all IMF components, and the total length of the signal;
[0020] The total envelope entropy is computed from the normalized envelope amplitude of each modal component, with a small constant added to prevent numerical overflow.
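One way such a fitness function could be assembled is sketched below in Python. Several choices here are assumptions rather than the source's definitions: the reconstruction error is taken as a mean squared error, the envelope entropy as a Shannon entropy of the normalized envelope, the two terms are added without weights, and the envelopes are passed in precomputed (for instance from the magnitude of an analytic signal).

```python
import math


def reconstruction_error(signal, modes):
    """Mean squared error between the signal and the sum of its modes (assumed form)."""
    n = len(signal)
    reconstructed = [sum(mode[t] for mode in modes) for t in range(n)]
    return sum((s - r) ** 2 for s, r in zip(signal, reconstructed)) / n


def envelope_entropy(envelope, eps=1e-12):
    """Shannon entropy of a normalized envelope; eps guards against log(0)."""
    total = sum(envelope)
    if total == 0:
        return 0.0
    probs = [a / total for a in envelope]
    return -sum(p * math.log(p + eps) for p in probs)


def fitness(signal, modes, envelopes, eps=1e-12):
    """Assumed fitness: reconstruction error plus the summed envelope entropy."""
    total_entropy = sum(envelope_entropy(env, eps) for env in envelopes)
    return reconstruction_error(signal, modes) + total_entropy
```

Under this assumed form, a candidate pair (K, α) scores better when the modes reproduce the residual closely and each mode has a concentrated, low-entropy envelope, so the SSA would minimize the returned value.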
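[0021] Preferably, in step 4, VMD adaptively decomposes the residual term into K intrinsic mode functions with compact spectral characteristics through iterative optimization. The frequency domain update formula for the k-th mode in the (n+1)-th iteration is:
$$ \hat{u}_k^{\,n+1}(\omega) = \frac{\hat{R}(\omega) - \sum_{i \neq k} \hat{u}_i^{\,n}(\omega) + \dfrac{\hat{\lambda}^{\,n}(\omega)}{2}}{1 + 2\alpha\left(\omega - \omega_k^{\,n}\right)^{2}} $$
[0022] In the formula, $\hat{u}_k^{\,n+1}(\omega)$ is the spectrum of the k-th mode at the (n+1)-th iteration, corresponding to the oscillation component within a specific frequency band of the residual term; $\hat{R}(\omega)$ is the spectrum of the STL residual term; $\hat{\lambda}^{\,n}(\omega)$ is the Lagrange multiplier at the n-th iteration; $\sum_{i \neq k} \hat{u}_i^{\,n}(\omega)$ is the sum of the spectra of all modes except the k-th mode at the n-th iteration; $\alpha$ is the penalty factor; $\omega$ is the frequency variable; $\omega_k^{\,n}$ is the current center frequency of the k-th mode; and $n$ is the number of iteration steps. The center frequency is updated based on the spectral energy centroid of the mode:
$$ \omega_k^{\,n+1} = \frac{\int_{0}^{\infty} \omega \left|\hat{u}_k^{\,n+1}(\omega)\right|^{2} d\omega}{\int_{0}^{\infty} \left|\hat{u}_k^{\,n+1}(\omega)\right|^{2} d\omega} $$
[0023] In the formula, $\omega_k^{\,n+1}$ is the new center frequency of the k-th mode at the (n+1)-th iteration, reflecting the frequency domain position of the dominant oscillation of this mode in the residual signal; and $\hat{u}_k^{\,n+1}(\omega)$ is the spectrum of the k-th mode at the (n+1)-th iteration.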
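A single VMD iteration for one mode can be sketched in Python on a discrete grid of non-negative frequencies. The sketch uses the commonly published VMD update (a Wiener-filter-like mode update followed by a power-weighted centroid for the center frequency) and replaces the integrals by sums. The function names and the list-based spectra are assumptions for illustration, and the Lagrange multiplier update and convergence test are omitted.

```python
def update_mode(residual_hat, others_hat, lambda_hat, omega, omega_k, alpha):
    """Update the spectrum of one mode around its current center frequency omega_k.

    residual_hat: spectrum of the signal being decomposed
    others_hat:   summed spectra of all other modes
    lambda_hat:   spectrum of the Lagrange multiplier
    """
    return [
        (f - o + lam / 2) / (1 + 2 * alpha * (w - omega_k) ** 2)
        for f, o, lam, w in zip(residual_hat, others_hat, lambda_hat, omega)
    ]


def update_center_frequency(mode_hat, omega):
    """Move the center frequency to the power-weighted centroid of the mode spectrum."""
    power = [abs(u) ** 2 for u in mode_hat]
    total = sum(power)
    if total == 0:
        return 0.0
    return sum(w * p for w, p in zip(omega, power)) / total
```

A larger `alpha` narrows the band that each mode keeps around its center frequency, which is why the penalty factor and the number of modes are natural variables for the parameter search.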
[0024] Preferably, in step 5, the runoff multi-component sequence and meteorological features are used as input data. The input data is constructed by dividing the time series composed of runoff multi-component sequence and meteorological features into training set, validation set and test set in a ratio of 7:2:1, constructing training labels using the sliding time window method, and standardizing all features to eliminate dimensional differences.
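The dataset construction can be illustrated with a short Python sketch for a single series. It assumes a chronological 7:2:1 split, fits the standardization statistics on the training portion only (a common precaution against leakage that the source does not specify), and builds sliding-window samples whose label is the value `horizon` steps after the end of the window. All function and parameter names are hypothetical.

```python
def chronological_split(series, ratios=(7, 2, 1)):
    """Split a series into train/validation/test parts in time order."""
    total = sum(ratios)
    n = len(series)
    train_end = n * ratios[0] // total
    val_end = train_end + n * ratios[1] // total
    return series[:train_end], series[train_end:val_end], series[val_end:]


def standardize(train, *others):
    """Z-score all parts using the mean and standard deviation of the training part."""
    mean = sum(train) / len(train)
    std = (sum((v - mean) ** 2 for v in train) / len(train)) ** 0.5 or 1.0

    def scale(values):
        return [(v - mean) / std for v in values]

    return [scale(train)] + [scale(part) for part in others]


def sliding_windows(series, lookback, horizon):
    """Pair each lookback window with the value `horizon` steps after the window."""
    samples = []
    for start in range(len(series) - lookback - horizon + 1):
        window = series[start:start + lookback]
        target = series[start + lookback + horizon - 1]
        samples.append((window, target))
    return samples
```

Calling `sliding_windows` with different `horizon` values produces the separate training sets that forecast-period-specific models would be fitted on.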
[0025] Preferably, in step 5, the time-series prediction model is DeformTime, which includes a variable deformable attention block (V-DAB) and a time deformable attention block (T-DAB). In the variable dimension, the deformable attention learns feature offsets to dynamically focus on key driving factors; in the time dimension, the deformable attention learns location offsets to adaptively capture the confluence time lag effect between rainfall and runoff.
[0026] Preferably, the two-dimensional deformable offset of the V-DAB is calculated from the query vector by an offset network.
[0027] Here, the query vector is generated within the p-th time segment after the multiple meteorological variables are jointly encoded by an embedding layer; the offset network is a two-layer convolutional network that takes the query vector as input and outputs position offsets in both the time and variable dimensions; and a learnable scalar controls the maximum offset magnitude.
[0028] Preferably, the one-dimensional time offset of the T-DAB is calculated from the grouped queries by an offset prediction network.
[0029] Here, the input is the g-th query subset obtained by the neighborhood-aware input embedding module (NAE) after grouping according to the correlation between meteorological variables and runoff components; $\eta_{\mathrm{off}}$ is the offset prediction network, implemented as a one-dimensional convolutional network; and $\beta$ is a learnable scaling factor that constrains the temporal offset range to maintain causality and stability.
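The idea of a bounded, learned time offset can be illustrated with a small Python sketch of deformable sampling along the time axis. It is not the DeformTime implementation: the offset network is replaced by a raw offset passed in directly, the bounding by `beta * tanh(raw)` is an assumed form, and values at fractional positions are obtained by linear interpolation with clamping to the series bounds.

```python
import math


def bounded_offset(raw_offset, beta):
    """Squash a raw offset into the range (-beta, beta); the tanh form is an assumption."""
    return beta * math.tanh(raw_offset)


def sample_linear(series, position):
    """Read a series at a fractional position using linear interpolation."""
    position = min(max(position, 0.0), len(series) - 1.0)  # clamp to valid range
    lower = int(math.floor(position))
    upper = min(lower + 1, len(series) - 1)
    frac = position - lower
    return (1 - frac) * series[lower] + frac * series[upper]


def deformable_sample(series, t, raw_offset, beta):
    """Sample the series at time index t shifted by a bounded offset."""
    return sample_linear(series, t + bounded_offset(raw_offset, beta))
```

A negative offset makes the model look at an earlier time step than `t`, which is how a learned shift could line up a rainfall signal with the runoff response that follows it.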
[0030] Preferably, the NAE is used to preprocess the multi-source input variables for runoff prediction. First, the variables are reordered according to their linear correlation with the target runoff component and divided into G groups of adjacent variables. Each group is then embedded into a specified dimension through its own fully connected layer, and the group embeddings are concatenated. Finally, sinusoidal positional encodings are injected to preserve the temporal order, and the structured features are output.
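The reordering, grouping, and positional encoding steps can be sketched in Python as below. The per-group fully connected embedding is left out. Ordering by the absolute value of the correlation, the even split into G groups, and the Transformer-style sine/cosine encoding are assumptions, as are the function names.

```python
import math


def group_by_correlation(correlations, num_groups):
    """Order variables by |correlation| and split them into adjacent groups."""
    ordered = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
    base, extra = divmod(len(ordered), num_groups)
    groups, start = [], 0
    for g in range(num_groups):
        size = base + (1 if g < extra else 0)  # spread any remainder over the first groups
        groups.append(ordered[start:start + size])
        start += size
    return groups


def sinusoidal_encoding(position, dim):
    """Sine/cosine positional encoding for one time position."""
    encoding = []
    for i in range(dim):
        angle = position / (10000 ** (2 * (i // 2) / dim))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding
```

Grouping variables with similar correlation strength places them next to each other, so an embedding layer applied per group sees inputs with comparable relevance to the target component.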
[0031] Preferably, in step 5, when training and optimizing the corresponding prediction models for different forecast periods, the model training uses mean squared error as the loss function, introduces an early stopping strategy to monitor the prediction error of the validation set, and dynamically adjusts the learning rate using the validation set; for different forecast periods, sub-models with independent weights are trained.
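The validation-driven training control can be sketched as a small framework-independent Python helper. The patience values, the halving factor, and the rule of reducing the learning rate after a fixed number of epochs without improvement are illustrative assumptions; the source only states that early stopping and validation-based learning rate adjustment are used.

```python
class ValidationMonitor:
    """Track validation loss, reduce the learning rate on plateaus, and signal early stopping."""

    def __init__(self, lr, patience_stop=10, patience_lr=3, factor=0.5):
        self.lr = lr
        self.patience_stop = patience_stop  # epochs without improvement before stopping
        self.patience_lr = patience_lr      # epochs without improvement before each LR cut
        self.factor = factor                # multiplicative LR reduction
        self.best = float("inf")
        self.bad_epochs = 0

    def update(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs % self.patience_lr == 0:
                self.lr *= self.factor
        return self.bad_epochs >= self.patience_stop
```

In a training loop, one monitor would be created per forecast-period sub-model, `update` would be called with the mean squared error on the validation set after each epoch, and the loop would break once it returns `True`.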
[0032] Preferably, step 6 specifically includes the following: a two-stage training strategy is adopted. In the first stage, the parameters of each runoff component prediction model are frozen, and the prediction results of the trend component, seasonal component, and high-frequency disturbance component are output independently. In the second stage, a component interaction coupling network is constructed. The predicted values of the trend component and seasonal component are used as input, and a nonlinear correction factor for the high-frequency disturbance component is output to achieve nonlinear modulation of the runoff. Finally, the predicted values of each component are nonlinearly corrected and weighted to generate the final runoff prediction result. Through this mechanism, a multi-forecast period ensemble prediction model is generated, wherein the model with a forecast period of 1 outputs the runoff value for the first day, the first month, or the first hour of the future, the model with a forecast period of 3 outputs the runoff value for the third day, the third month, or the third hour of the future, and so on, to achieve high-precision joint runoff prediction at multiple time scales.
[0033] Preferably, the component interaction coupling network is built as a multilayer perceptron consisting of fully connected layers and activation functions. The predicted trend component and the predicted seasonal component are concatenated as features and used as the input vector of the coupling network, which outputs a correction factor for the high-frequency disturbance component through nonlinear mapping. In the final reconstruction, the final predicted runoff value is obtained from the predicted trend and seasonal components together with the predicted high-frequency disturbance components adjusted by the nonlinear correction coefficient for runoff, whose range is constrained to the interval (0,1) by the Sigmoid activation function.
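A minimal Python sketch of such a coupling step is given below. It is an assumption-laden illustration, not the patented network: the perceptron has one hidden layer with tanh activation and random, untrained weights, and the reconstruction is assumed to be additive, with a single sigmoid-bounded coefficient scaling the sum of the high-frequency components. The class and function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class CouplingMLP:
    """Tiny perceptron mapping (trend, seasonal) to a correction coefficient in (0, 1)."""

    def __init__(self, hidden=4, seed=0):
        rng = random.Random(seed)
        # Untrained random weights, for illustration only.
        self.w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b2 = 0.0

    def correction(self, trend, seasonal):
        hidden = [
            math.tanh(w[0] * trend + w[1] * seasonal + b)
            for w, b in zip(self.w1, self.b1)
        ]
        return sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)


def reconstruct(trend, seasonal, high_freq, net):
    """Assumed reconstruction: trend + seasonal + gamma * sum of high-frequency parts."""
    gamma = net.correction(trend, seasonal)
    return trend + seasonal + gamma * sum(high_freq)
```

In this assumed form, the component predictors stay frozen and only the weights of `CouplingMLP` would be trained in the second stage, using the error between the reconstructed and observed runoff.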
[0034] Preferably, step 6 further includes: training and optimizing the corresponding component interaction coupling network for each forecast period, using mean squared error as the loss function to measure the error between the reconstructed runoff and the actual runoff, introducing an early stopping strategy to monitor the prediction error on the validation set, and dynamically adjusting the learning rate using the validation set; a component interaction coupling model with independent weights is trained for each forecast period.
[0035] Beneficial effects of this invention: (1) A comprehensive framework for runoff ensemble prediction integrating spatial heterogeneity and multi-stage modeling: This invention proposes a systematic runoff ensemble prediction framework that achieves deep synergistic integration of watershed spatial information, runoff temporal characteristics, and multi-source meteorological driving factors. This framework fully encompasses six core components: data collection and processing, watershed spatial zoning, adaptive feature selection for the forecast period, two-stage runoff sequence decomposition, variable attention modeling, and two-stage reconstruction based on physical interaction coupling. This forms a closed-loop process from refined construction of input features and deep characterization of spatiotemporal features to multi-scale forecast ensemble prediction. Through this systematic framework, multi-scale temporal characteristics of runoff can be accurately extracted based on fully exploiting the spatial heterogeneity information within the watershed, ultimately achieving collaborative modeling of meteorological driving processes and runoff response mechanisms.
[0036] (2) Construction of Regional Meteorological Features and Adaptive Forecasting Mechanism Based on Sub-basin Spatial Partitioning: This invention divides the control basin into several sub-basins based on DEM data, combined with runoff accumulation analysis and watershed algorithm. Sub-basins are then optimized and merged to form a spatial organization framework adapted to runoff generation and confluence characteristics. Based on this framework, meteorological elements such as rainfall, temperature, and evapotranspiration are extracted from each sub-basin, fully preserving the spatial heterogeneity within the basin. Furthermore, correlation scores are calculated using Pearson correlation analysis, and an innovative adaptive forecasting mechanism is proposed. This mechanism retains high-frequency detailed features for short-term forecasts and dominant trend features for long-term forecasts, achieving physical mechanism matching between the input feature set and the prediction target. This mechanism reduces redundant input variables, lowers the risk of model overfitting, and strengthens the physical consistency between meteorological driving factors and runoff response, achieving the construction of a high-quality input feature set for multi-scale prediction.
[0037] (3) A runoff modeling paradigm combining two-stage temporal decomposition and dual-dimensional deformable attention: To address the technical challenges of the strong nonstationarity and complex dynamic characteristics of natural runoff sequences, this invention proposes a two-stage temporal decomposition method based on STL and SSA-optimized VMD. First, the trend, seasonal, and residual terms of the runoff sequence are separated by STL decomposition. Then, the core parameters of VMD are optimized by the Sparrow Search Algorithm (SSA), and the residual terms are decomposed a second time to extract multi-scale high-frequency perturbation components, thereby reducing the volatility and nonstationarity of the original runoff sequence from the root. Subsequently, a dual-dimensional deformable attention network is constructed. The temporal dimension deformable attention adaptively captures the confluence time lag effect between meteorology and runoff, and the variable dimension deformable attention dynamically focuses on key driving factors, realizing adaptive interactive modeling of meteorological characteristics and multi-scale runoff components in each sub-basin. This significantly improves the model's ability to deeply learn and accurately represent complex hydrological nonlinear processes.
[0038] (4) Integration strategy of differentiated modeling by forecast period and multi-scale component collaborative prediction: Traditional models typically show high accuracy in short-term prediction and a sharp drop in accuracy at long forecast periods, and multi-scale runoff features differ in their response patterns and are unevenly difficult to fit. To address these problems, this invention adopts an integrated strategy of independent training by forecast period and decoupled multi-component prediction. At the model training level, dedicated sub-models with independent weights are trained for different forecast periods, such as 1, 3, and 6 months, and each is matched with the optimal feature subset for its forecast period, which addresses the limited adaptability of a single model to multiple forecast periods. At the prediction reconstruction level, a component interaction coupling network is designed to apply nonlinear modulation to the runoff, and the prediction results of the components are reconstructed through physical interaction coupling. This eliminates mutual interference between features of different scales, greatly alleviates error accumulation at long forecast periods, and achieves a simultaneous and balanced improvement of short-, medium-, and long-term runoff prediction performance.
[0039] (5) A solution to enhance the interpretability of models through deep integration of data-driven and hydrophysical mechanisms: This invention breaks through the industry bottlenecks of traditional deep learning hydrological models, which suffer from strong "black box" attributes, weak physical meaning, and poor interpretability. It embeds hydrophysical mechanism constraints into the entire technical process: the sub-basin division and optimization based on DEM fully conforms to the spatial physical processes of natural runoff generation and confluence in the basin; the sensitivity screening and dynamic adaptation of forecast period features ensure the physical consistency of the meteorological-runoff response relationship; the two-dimensional deformable attention mechanism can quantify the contribution of different meteorological variables and different time periods to runoff prediction, intuitively revealing the inherent nonlinear correlation between meteorological driving and runoff response. While improving prediction accuracy, it endows the pure data-driven deep learning model with clear hydrophysical connotations, greatly improves the credibility of the model results, and provides reliable technical support for water resources management engineering decision-making.
[0040] (6) Accurate capture of extreme runoff events and robustness optimization across all scenarios: In natural runoff sequences, extreme events account for a small proportion of samples, are strongly nonlinear, and are predicted with large bias by traditional models. This invention addresses these engineering challenges through a collaborative approach across the entire process. The two-stage time-series decomposition separates and models the high-frequency extreme disturbance components of the runoff sequence, so that extreme features are not masked by the conventional stationary components. The two-dimensional deformable attention mechanism adaptively focuses on key meteorological driving signals and key time nodes before and after extreme events, enhancing the model's ability to learn the features of extreme hydrological processes. The independent training strategy for different forecast periods reduces the interference of extreme samples with full-cycle model training, improving the stability of the model's fit to extreme events. This invention significantly improves the prediction accuracy of peak flow, peak occurrence time, and dry season flow, while maintaining stable prediction performance across different watershed scales, data availability levels, and forecast periods. It has excellent engineering applicability and robustness, providing key technical support for watershed flood control early warning and for water resource allocation during drought relief.
Attached Figure Description
[0041] Figure 1 is a flowchart of the runoff ensemble prediction method based on spatial heterogeneity and variable attention in Example 1; Figure 2 is a flowchart of the mechanism for constructing and screening regional meteorological features based on spatial partitioning; Figure 3 is a flowchart of runoff prediction combining two-stage temporal decomposition and the deformable attention mechanism; Figure 4 shows the results of optimizing the VMD parameters with SSA; Figure 5 is a diagram showing the results of the two-stage decomposition; Figure 6 is a diagram showing the prediction performance of the model under multiple forecast periods; Figure 7 is a graph showing the trend of runoff at the target station.
Detailed Implementation
[0042] The present invention will now be described in further detail with reference to the accompanying drawings and specific embodiments.
[0043] Example 1: As shown in Figures 1 to 3, a runoff ensemble prediction method based on spatial heterogeneity and variable attention includes six core steps: data collection and processing, watershed spatial partitioning, adaptive feature selection for the forecast period, two-stage runoff sequence decomposition, variable attention modeling, and two-stage reconstruction based on physical interaction coupling.
[0044] Step (1): In the data collection and processing stage, historical runoff sequences, meteorological reanalysis data (i.e., multi-source meteorological data for the corresponding period in the control basin), and topographic and spatial basic data of the target hydrological station are obtained. Meteorological elements include basic driving factors such as rainfall, temperature, and evapotranspiration. According to the needs of model calculation and analysis, the multi-source data are unified in time scale, standardized in format, spatially aligned, and missing value processed to form a complete spatiotemporal driving dataset, laying the data foundation for subsequent spatial feature construction and time series prediction.
[0045] Step (2): In the watershed spatial zoning stage, based on the watershed digital elevation model (DEM), methods such as runoff accumulation analysis and watershed delineation are used to divide the control watershed of the target hydrological station into several representative sub-regions. Each sub-region corresponds to an independent catchment unit, possessing a relatively consistent topographic structure and runoff generation and runoff response characteristics, effectively preserving the spatial heterogeneity within the watershed, and providing a clear spatial organization framework for regional meteorological feature extraction and modeling.
[0046] Step (3): In the adaptive feature selection stage of the forecast period, based on the spatial zoning results of the watershed, the meteorological driving elements of each sub-region are extracted, and for each sub-region, the regional average value of the raster data is calculated to form a numerical sequence representing the overall meteorological conditions of the sub-region; high-correlation candidate features are initially selected through Pearson correlation analysis; further, an adaptive screening mechanism for the forecast period is introduced, and a differentiated screening threshold is set according to the difference in forecast duration. A low threshold is set for the short-term forecast period to retain high-frequency detailed features, and a high threshold is set for the long-term forecast period to retain the dominant trend features; finally, a dynamic input feature set suitable for different forecast periods is constructed to achieve physical matching between the input data and the forecast target.
[0047] Step (4): In the two-stage runoff sequence decomposition stage, to address the non-stationarity of historical runoff sequences, the trend component, seasonal component, and residual component of the runoff sequence are first separated using Seasonal-Trend decomposition using Loess (STL), which is based on locally weighted regression. Then, the variational mode decomposition (VMD) parameters are optimized using the Sparrow Search Algorithm (SSA) to adaptively decompose the residual component and extract multi-scale high-frequency disturbance components. This two-stage decomposition reduces the non-stationarity of the original sequence and improves the subsequent model's learning ability and prediction accuracy for complex runoff dynamics.
[0048] Step (5): In the variable attention modeling stage, a time series prediction model with variable deformable attention and time deformable attention mechanism is constructed to jointly model the multi-component sequence of runoff and highly correlated meteorological characteristics. For different forecast periods, the corresponding prediction models are trained and optimized to accurately characterize the spatiotemporal coupling characteristics and runoff evolution law of complex watersheds.
[0049] Step (6): In the two-stage reconstruction stage based on physical interaction coupling, a two-stage training strategy is adopted to reconstruct the multi-component prediction results through physical interaction coupling. In the first stage, the parameters of each component's prediction model are frozen, and the predicted values of the trend, seasonal, and high-frequency disturbance components are output independently. In the second stage, a component interaction coupling network is constructed to realize the nonlinear modulation of runoff. Finally, the ensemble prediction results for multiple forecast periods are generated by nonlinear weighted superposition.
[0050] Through the above steps, this invention achieves deep integration of spatial information, temporal characteristics and multi-source meteorological driving factors, effectively improving the accuracy, stability and generalization ability of runoff prediction, and is applicable to water resource scheduling and flood forecasting scenarios under complex watershed conditions.
[0051] The specific details of each main step are as follows: I. Data Collection and Processing: In watershed-scale runoff prediction and regulation analysis, the completeness, accuracy, and consistency of data directly determine the reliability of subsequent modeling and analysis. Therefore, this technique first constructs a unified data acquisition and preprocessing system based on multi-source data from the target watershed, including basic hydrological, meteorological, and remote sensing data, laying a solid data foundation for subsequent feature construction and model training.
[0052] 1. Data types: (1) Historical runoff sequence: The runoff time series data from the target hydrological stations are collected, with the time scale selected as daily, monthly, or hourly, depending on the research objectives. Data sources include historical monitoring records from watershed management agencies, measured data from automatic hydrological monitoring stations, and relevant research databases. This type of data directly reflects the watershed outflow process and serves as a core reference for training and validating predictive models.
[0053] (2) Meteorological reanalysis data: Meteorological elements closely related to runoff formation were collected, including rainfall, average temperature, maximum temperature, minimum temperature, and evapotranspiration. The data primarily came from reanalysis data, such as the ERA5 reanalysis dataset published by the European Centre for Medium-Range Weather Forecasts. Reanalysis data is characterized by good temporal continuity, wide spatial coverage, and complete elements, effectively compensating for the deficiencies caused by the sparse distribution of surface meteorological observation stations.
[0054] (3) Topographic and spatial basic data: Basic spatial information, including digital elevation models (DEMs) of the target site's control basin, basin boundaries, and river network distribution, is collected. This data can be used to clarify the spatial extent, topographic features, and drainage system structure of the basin, providing necessary support for subsequent basin spatial delineation, feature extraction, and hydrological process mechanism analysis.
[0055] 2. Data preprocessing: (1) Spatiotemporal unification: Because different data sources differ in temporal and spatial resolution, they need to be standardized. In the temporal dimension, the target prediction timescale is used as the benchmark (e.g., for monthly runoff prediction, the meteorological data are aggregated to monthly values) to ensure temporal alignment of multi-source data. In the spatial dimension, meteorological and spatial data are resampled to a consistent spatial resolution (e.g., 1 km or 10 km) to ensure that different data types are spatially comparable and can be fused.
[0056] (2) Handling of missing measurements: To address missing data in hydrological runoff sequences, linear interpolation is used to fill in the missing data points, thereby restoring the integrity and continuity of the time series. When there are a large number of missing data points or their distribution is irregular, interpolation correction can be performed by combining runoff trend information from adjacent time periods.
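The gap-filling step can be illustrated with a minimal sketch. It assumes missing values are encoded as NaN and that gaps at the start or end of the record are filled with the nearest valid value; the text does not specify edge handling, and the function name `fill_missing_linear` is hypothetical.

```python
import math


def fill_missing_linear(values):
    """Fill NaN gaps by linear interpolation between the nearest valid neighbours.

    Assumption: leading and trailing gaps take the nearest valid value.
    """
    filled = list(values)
    valid = [i for i, v in enumerate(filled) if not math.isnan(v)]
    if not valid:
        raise ValueError("series contains no valid observations")
    # Edge gaps: hold the nearest valid observation.
    for i in range(valid[0]):
        filled[i] = filled[valid[0]]
    for i in range(valid[-1] + 1, len(filled)):
        filled[i] = filled[valid[-1]]
    # Interior gaps: straight line between the two bracketing observations.
    for left, right in zip(valid, valid[1:]):
        span = right - left
        for i in range(left + 1, right):
            w = (i - left) / span
            filled[i] = (1 - w) * filled[left] + w * filled[right]
    return filled


nan = float("nan")
print(fill_missing_linear([10.0, nan, nan, 16.0, nan, 20.0]))
```

For long or irregular gaps, the text suggests additionally using trend information from adjacent periods; that refinement is not covered by this sketch.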
[0057] II. Watershed Spatial Zoning: In watershed-scale runoff forecasting, spatial heterogeneity within the watershed significantly impacts the hydrological response. Different sub-regions differ substantially in topography, catchment area, underlying surface features, and runoff generation characteristics. Consequently, the influence on the target station varies across upstream sub-regions, and modeling the entire watershed without differentiation may misrepresent the runoff generation processes. Therefore, this technique applies flow accumulation analysis and watershed delineation, based on a digital elevation model (DEM) and the river network framework, to divide the target watershed into several representative sub-regions. This provides a clear spatial structure for subsequent regional meteorological feature extraction, temporal modeling, and forecasting.
[0058] 1. DEM preprocessing: In Digital Elevation Model (DEM) data, local depressions often appear due to measurement noise, interpolation errors, or insufficient data accuracy. These depressions are usually not real landform features but rather spurious topography. If directly used for surface runoff simulation, they will lead to interrupted flow paths and distorted flow direction results. To eliminate this problem, DEM data needs to be filled with depressions. The basic principle of depression filling is to identify closed depressions surrounded by high ground and raise their elevation values to the lowest outlet height that can form a continuous outflow path. Through this process, spurious depressions can be effectively eliminated, restoring the hydrological connectivity of the topographic surface, thereby ensuring the continuity and rationality of subsequent flow direction analysis and runoff calculations.
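As a minimal sketch of depression filling, the following uses the priority-flood approach, which is one common way to raise each closed depression to its lowest outlet elevation. The text does not name a specific algorithm, so this choice is an assumption, as are the function name `fill_depressions` and the representation of the DEM as a list of rows.

```python
import heapq


def fill_depressions(dem):
    """Priority-flood filling: raise each pit to the level of its lowest outlet."""
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    visited = [[False] * cols for _ in range(rows)]
    heap = []
    # Seed the queue with boundary cells, which are assumed to drain off the grid.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r][c], r, c))
                visited[r][c] = True
    neighbours = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    while heap:
        # Always expand from the lowest cell reached so far (the current spill level).
        elev, r, c = heapq.heappop(heap)
        for dr, dc in neighbours:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                visited[nr][nc] = True
                # A cell below the spill level is raised to it; higher cells are kept.
                filled[nr][nc] = max(filled[nr][nc], elev)
                heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled


dem = [
    [5, 5, 5, 5],
    [5, 1, 2, 5],
    [5, 5, 5, 3],
    [5, 5, 5, 5],
]
for row in fill_depressions(dem):
    print(row)
```

In the example, the two interior cells (elevations 1 and 2) can only drain through the boundary cell with elevation 3, so both are raised to 3.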
[0059] After filling depressions, it is necessary to calculate the surface runoff direction for each grid cell. A commonly used algorithm is the D8 algorithm (Deterministic Eight-direction method). This algorithm assumes that water flow in each grid cell can only flow into one of its eight neighborhoods (up / down, left / right, and four diagonal directions). In practice, the direction with the largest slope is determined by comparing the elevation differences between the central grid cell and its neighboring grid cells. The D8 algorithm has advantages such as computational simplicity and clear physical meaning, and can accurately describe the downslope path of surface water flow.
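The D8 rule described above can be sketched as follows. The direction encoding (an index into `D8_OFFSETS`, with -1 for cells that have no lower neighbour) and the function name are illustrative assumptions; GIS packages use their own encodings.

```python
import math

# Offsets (row, col) of the eight neighbours, clockwise starting from east.
D8_OFFSETS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def d8_flow_direction(dem, cell_size=1.0):
    """Return, per cell, the index into D8_OFFSETS of the steepest descent, or -1."""
    rows, cols = len(dem), len(dem[0])
    direction = [[-1] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for k, (dr, dc) in enumerate(D8_OFFSETS):
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Diagonal neighbours are sqrt(2) cell sizes away.
                distance = cell_size * (math.sqrt(2.0) if dr and dc else 1.0)
                slope = (dem[r][c] - dem[nr][nc]) / distance
                if slope > best_slope:
                    best_slope = slope
                    direction[r][c] = k
    return direction


dem = [
    [9, 8, 7],
    [8, 6, 4],
    [7, 4, 1],
]
print(d8_flow_direction(dem))
```

Dividing the elevation drop by the centre-to-centre distance is what makes a diagonal drop comparable with a cardinal one.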
[0060] 2. Calculation of flow accumulation: Using the flow direction grid derived from the DEM, the cumulative catchment area of each grid cell is calculated to quantitatively characterize the spatial convergence of surface runoff. The cumulative catchment area of a grid cell is the total area of that cell and all upstream cells draining through it, reflecting the scale and capacity of water accumulation at that location. Calculating it over the entire region reveals the convergence of water flow from higher to lower elevations and the spatial distribution of runoff paths.
[0061] The calculation follows the logic of grid-based water flow converging from upstream to downstream. The algorithm traverses each grid cell and counts the number or area of all upstream cells that drain to it, thus obtaining the cumulative catchment area at that location. Cells with larger cumulative areas typically correspond to lower-lying, terrain-converging areas, such as main channels, valleys, or major catchment areas, while cells with smaller cumulative areas are mostly distributed along drainage divides or on higher-lying terrain, representing the source areas of water flow or runoff boundaries.
[0062] 3. River network extraction and watershed delineation: After obtaining the flow accumulation results, the river network can be extracted by setting an accumulation threshold. When the cumulative catchment area of a grid cell exceeds the set threshold, the cell is considered to belong to a stable runoff channel. This yields a river network framework consistent with the topography, providing hydrological boundary constraints for subsequent watershed delineation.
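Flow accumulation and threshold-based channel extraction can be sketched together. The sketch assumes a flow direction grid encoded as indices into `D8_OFFSETS` (with -1 for cells without an outflow), counts cells rather than area, and uses hypothetical function names.

```python
from collections import deque

# Offsets (row, col) of the eight neighbours, clockwise starting from east.
D8_OFFSETS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def flow_accumulation(direction):
    """Count, per cell, how many cells drain through it (the cell itself included)."""
    rows, cols = len(direction), len(direction[0])

    def downstream(r, c):
        k = direction[r][c]
        if k < 0:
            return None
        nr, nc = r + D8_OFFSETS[k][0], c + D8_OFFSETS[k][1]
        return (nr, nc) if 0 <= nr < rows and 0 <= nc < cols else None

    # Number of neighbours that drain directly into each cell.
    inflow = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            target = downstream(r, c)
            if target is not None:
                inflow[target[0]][target[1]] += 1

    accumulation = [[1] * cols for _ in range(rows)]
    # Start at source cells (no inflow) and pass counts downstream in order.
    queue = deque((r, c) for r in range(rows) for c in range(cols) if inflow[r][c] == 0)
    while queue:
        r, c = queue.popleft()
        target = downstream(r, c)
        if target is None:
            continue
        tr, tc = target
        accumulation[tr][tc] += accumulation[r][c]
        inflow[tr][tc] -= 1
        if inflow[tr][tc] == 0:
            queue.append((tr, tc))
    return accumulation


def extract_streams(accumulation, threshold):
    """Mark cells whose accumulation reaches the channel-initiation threshold."""
    return [[value >= threshold for value in row] for row in accumulation]


direction = [[0, 0, -1]]  # a single row draining eastwards
acc = flow_accumulation(direction)
print(acc, extract_streams(acc, threshold=2))
```

Multiplying the cell counts by the cell area gives the cumulative catchment area; the threshold value itself is basin-specific and is not prescribed here.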
[0063] Based on the flow accumulation and river network information, a watershed algorithm is used for preliminary division. Using topographic elevation and flow direction data, this algorithm starts from local high points (ridgelines) and simulates the convergence of water flow along descending slopes. Where the boundaries of different flow convergence zones meet, a watershed line is formed, which delineates the boundaries of adjacent sub-basins. Each sub-basin contains an independent outlet point, typically corresponding to a river network confluence node, and serves as the basic unit for subsequent hydrological response analysis and feature extraction.
[0064] 4. Sub-basin optimization and merging: Based on the initial division, the sub-basin results need to be optimized to ensure the hydrological rationality and spatial operability of the division. Sub-basins that are relatively small in area and cannot independently demonstrate runoff generation and confluence characteristics, or those that are not representative, should be removed to avoid spatial redundancy caused by over-subdivision. Taking into account the actual river network distribution and topographic and hydrological characteristics, adjacent sub-basins with similar hydrological attributes should be reasonably merged to ensure that the runoff processes and hydrological responses within each sub-region remain relatively consistent, thereby improving the stability and application value of the overall division results.
[0065] III. Adaptive Feature Filtering in the Forecast Period: In watershed runoff prediction, feature construction is a crucial step in model performance. Accurate and stable features not only reflect the core driving factors of watershed hydrological response but also reduce model input redundancy and improve training efficiency and generalization ability. This technique, based on watershed spatial division, extracts and filters features from meteorological data in each sub-region to form a regional-scale feature set that can be used in deep learning models.
[0066] 1. Extraction of regional meteorological features: Based on the zoning results, each sub-basin is used as the basic unit. Meteorological driving factors are extracted from the reanalysis data, including rainfall, average temperature, maximum temperature, minimum temperature, and potential evapotranspiration. For each sub-region, the regional average of the raster data is calculated to form a numerical sequence representing the overall meteorological conditions of the sub-region. This regional scale feature can reflect the spatial consistency within the sub-region while preserving the meteorological differences between different sub-regions, providing a basis for the model to capture spatial heterogeneity.
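The regional averaging step amounts to a zonal mean. The sketch below assumes the meteorological rasters and the sub-basin map share one grid, that sub-basins are labelled with integer ids, and that cells outside the watershed are marked `None`; all names are illustrative.

```python
def zonal_mean_series(rasters, zones):
    """Average each raster over every sub-basin.

    rasters: list over time steps of 2D grids (lists of rows).
    zones:   2D grid of sub-basin ids aligned with the rasters (None = outside).
    Returns a dict mapping sub-basin id to its time series of regional means.
    """
    means_per_step = []
    for grid in rasters:
        totals, counts = {}, {}
        for grid_row, zone_row in zip(grid, zones):
            for value, zone in zip(grid_row, zone_row):
                if zone is None:
                    continue
                totals[zone] = totals.get(zone, 0.0) + value
                counts[zone] = counts.get(zone, 0) + 1
        means_per_step.append({z: totals[z] / counts[z] for z in totals})
    # Reorganise from "one dict per time step" to "one series per sub-basin".
    zone_ids = sorted(means_per_step[0]) if means_per_step else []
    return {z: [step[z] for step in means_per_step] for z in zone_ids}


zones = [[1, 1], [2, None]]
rainfall = [
    [[2.0, 4.0], [10.0, 99.0]],  # time step 0
    [[0.0, 0.0], [5.0, 99.0]],   # time step 1
]
print(zonal_mean_series(rainfall, zones))
```

An unweighted mean is used here; weighting by cell area would be a natural refinement when cells differ in size.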
[0067] 2. Feature Correlation Analysis: Under high-dimensional input conditions, redundant or inefficient features may not only introduce noise and increase model training costs, but may also lead to overfitting and affect the model's generalization ability. In order to reduce the noise interference caused by redundant or irrelevant features to the model prediction and improve the model performance, this study uses Pearson correlation analysis to calculate the correlation of key features that affect the runoff prediction effect.
[0068] The Pearson correlation coefficient measures the degree of linear correlation between two variables and takes values in [-1, 1]; a larger absolute value indicates a stronger correlation. By calculating the Pearson correlation coefficients between each meteorological feature and the target runoff sequence, this study identifies the variables that are highly linearly correlated with runoff changes as preferred model inputs. Let x denote the time series of a given meteorological feature and y the runoff observation sequence for the corresponding period. The correlation coefficient is calculated as: $$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}$$
[0069] In the formula, $x_i$ and $y_i$ are the meteorological feature value and the runoff observation at the i-th time step, respectively; $\bar{x}$ and $\bar{y}$ are the means of the respective sequences; and $n$ is the total number of samples.
[0070] 3. Adaptive feature selection during the forecast period: Runoff forecasting typically involves different lead times (e.g., 1 day, 3 days, 7 days), and the responses of each lead time to driving features differ significantly. To address the problem that traditional single feature sets are insufficient to meet the forecasting needs of multiple lead times, this study proposes an adaptive feature selection mechanism based on lead times. This mechanism constructs a dynamic feature selection strategy based on the differences in the forecasting needs of hydrological processes at different lead times: The Pearson correlation coefficient screening threshold is defined as θ. For the short-term forecast period, the watershed is sensitive to instantaneous meteorological events such as rainfall, so θ is set to a lower threshold (preferred range 0.3~0.5) to retain high-frequency detailed features containing instantaneous disturbance information. For the long-term forecast period, runoff is mainly controlled by slowly changing factors, so θ is set to a higher threshold (preferred range 0.6~0.8) to eliminate high-frequency noise interference and retain the dominant features reflecting long-term evolution patterns.
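The lead-time-dependent screening can be sketched as below. The specific values θ = 0.4 and θ = 0.7 are picked from the preferred ranges stated above, while the 3-day boundary between "short" and "long" lead times and the use of the absolute correlation are assumptions made only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def threshold_for_lead_time(lead_days, short_max_days=3, theta_short=0.4, theta_long=0.7):
    """Lower threshold for short lead times, higher threshold for long ones."""
    return theta_short if lead_days <= short_max_days else theta_long


def select_features(features, runoff, lead_days):
    """Keep the features whose |r| with runoff reaches the lead-time threshold."""
    theta = threshold_for_lead_time(lead_days)
    return sorted(
        name for name, series in features.items()
        if abs(pearson(series, runoff)) >= theta
    )


runoff = [2.0, 4.0, 6.0, 8.0, 10.0]
features = {
    "rain": [1.0, 2.0, 3.0, 4.0, 5.0],  # r = 1.0 with runoff
    "temp": [3.0, 1.0, 2.0, 5.0, 4.0],  # r = 0.6 with runoff
}
print(select_features(features, runoff, lead_days=1))  # ['rain', 'temp']
print(select_features(features, runoff, lead_days=7))  # ['rain']
```

The weaker feature survives the short-lead threshold but is dropped for the 7-day forecast, which is the intended effect of the differentiated thresholds.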
[0071] IV. Two-stage runoff sequence decomposition: In watershed runoff forecasting, historical runoff sequences typically exhibit non-stationarity, seasonal fluctuations, and multi-scale disturbances. Directly inputting these sequences into deep learning models may lead to learning difficulties or the accumulation of prediction errors. To improve the model's ability to characterize complex runoff dynamics, this technique employs a two-stage time series decomposition method, STL-SSA-VMD, to decompose the original runoff sequence into multiple components, extracting trend components, seasonal components, and high-frequency disturbance components, thus providing the model with a more structured input.
[0072] 1. Seasonal-Trend decomposition using Loess (STL): STL is a decomposition technique based on locally weighted regression (LOESS). It decomposes a time series into three parts: a trend term, a seasonal term, and a residual term. Compared to traditional decomposition methods, STL offers greater flexibility and robustness and can handle non-stationary seasonal time series. Its additive model structure is suitable for most hydrological and meteorological data. STL decomposition reduces the non-stationarity of runoff series, lowers model training difficulty, and allows the model to learn long-term trends, seasonal fluctuations, and local disturbances separately. The decomposition can generally be expressed as: $$Y_t = T_t + S_t + R_t$$ In the formula, $Y_t$ is the original natural runoff value at time $t$; $T_t$ is the trend value at time $t$; $S_t$ is the seasonal value at time $t$; and $R_t$ is the residual value at time $t$.
[0073] The core of STL decomposition is a double-loop structure based on LOESS. The inner loop fits the trend and computes the seasonal component, while the outer loop adjusts the robustness weights. Let $T_t^{(k)}$ and $S_t^{(k)}$ denote the trend and seasonal terms after the k-th inner iteration. The inner loop steps are summarized as follows: (1) Parameter initialization: set the iteration counter $k = 0$ and the initial trend component $T_t^{(0)} = 0$.
[0074] (2) Trend removal: subtract the current trend component from the original sequence, giving $Y_t - T_t^{(k)}$.
[0075] (3) Seasonal subsequence smoothing: subsequences are constructed based on the seasonal cycle length, and LOESS smoothing is applied to each subsequence to form a preliminary seasonal sequence $C_t^{(k+1)}$.
[0076] (4) Low-pass filtering of seasonal subsequences: the preliminary seasonal sequence $C_t^{(k+1)}$ is low-pass filtered using a combination of moving averages and LOESS smoothing to obtain the smoothed baseline sequence $L_t^{(k+1)}$.
[0077] (5) Seasonal component update: remove the smoothed baseline from the preliminary seasonal sequence to update the seasonal component, $S_t^{(k+1)} = C_t^{(k+1)} - L_t^{(k+1)}$.
[0078] (6) Seasonal term removal: subtract the latest seasonal estimate from the original sequence, giving $Y_t - S_t^{(k+1)}$.
[0079] (7) Trend smoothing: apply LOESS smoothing to $Y_t - S_t^{(k+1)}$ to obtain $T_t^{(k+1)}$.
[0080] (8) Convergence check and iteration control: if the decomposition result reaches the set convergence criterion or the preset maximum number of iterations has been completed, output the trend, seasonal, and residual components; otherwise, return to step (2) and continue iterating.
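To make the additive structure concrete, the following is a deliberately simplified stand-in and not STL itself: it replaces LOESS with a centred moving average, runs a single pass instead of the inner and outer loops, and omits robustness weights. It only illustrates how a series is split into trend, seasonal, and residual parts that sum back to the original. All function names are hypothetical; in practice a library implementation of STL would be used (statsmodels provides one, for example).

```python
def moving_average(values, window):
    """Centred moving average; the window shrinks near the ends of the series."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def additive_decompose(series, period):
    """Single-pass additive split into trend, seasonal, and residual parts."""
    # Trend: smooth over roughly one full cycle so the seasonal swing averages out.
    window = period + 1 if period % 2 == 0 else period
    trend = moving_average(series, window)
    detrended = [y - t for y, t in zip(series, trend)]
    # Seasonal: mean of the detrended values at each phase of the cycle,
    # shifted so that the seasonal pattern has zero mean over one period.
    phase_means = []
    for phase in range(period):
        values = detrended[phase::period]
        phase_means.append(sum(values) / len(values))
    offset = sum(phase_means) / period
    seasonal = [phase_means[i % period] - offset for i in range(len(series))]
    # Residual: whatever the trend and seasonal parts do not explain.
    residual = [y - t - s for y, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


series = [10 + 0.5 * i + (3 if i % 4 == 0 else -1) for i in range(16)]
trend, seasonal, residual = additive_decompose(series, period=4)
print([round(s, 3) for s in seasonal[:4]])
```

The residual returned here plays the role of the component that is passed on to VMD in the second decomposition stage.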
[0081] 2. Variational Mode Decomposition (VMD): The residual components obtained after STL decomposition often contain high-frequency noise, and directly using them for prediction may degrade the overall model performance. This technique employs Variational Mode Decomposition (VMD) to further process the residual components. VMD is an adaptive signal processing method that decomposes non-stationary time series into several Intrinsic Mode Functions (IMFs) with finite bandwidth. Compared with traditional empirical mode decomposition, VMD constructs a variational optimization model and iteratively solves for the center frequency and function shape of each mode in the frequency domain, exhibiting stronger theoretical interpretability and noise resistance and extracting potential multi-scale dynamic features from the time series. Assuming the original sequence is a superposition of several IMFs, the goal of VMD is to extract the finite-bandwidth characteristics of each mode, making the modes as independent as possible in the frequency domain and reducing mutual interference. The core idea of VMD is to minimize the sum of bandwidths across all modes. Its variational form and constraint are expressed as follows: $$\min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K} \left\| \partial_t\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2$$ $$\text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$ In the formula, $\partial_t$ denotes the first derivative with respect to time; $u_k$ is the k-th mode; $\omega_k$ is the center frequency of mode $u_k$; $K$ is the number of modes; $f(t)$ is the original input signal; $\delta(t)$ is the Dirac delta function and $*$ denotes convolution; $e^{-j\omega_k t}$ is a complex exponential signal, where j is the imaginary unit, which shifts the spectrum of mode $u_k(t)$ according to its center frequency $\omega_k$ and thereby facilitates the calculation of the modal bandwidth.
[0082] Introducing the Lagrange multiplier $\lambda$ and the penalty factor $\alpha$, an unconstrained Lagrangian function is constructed: $$\mathcal{L}(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\ f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$$ where $\alpha$ is the penalty coefficient, $\lambda$ is the Lagrange multiplier, and $\langle \cdot, \cdot \rangle$ denotes the inner product of two vectors.
[0083] Based on the alternating direction method of multipliers (ADMM) framework, VMD iteratively updates each mode and center frequency in the frequency domain, adaptively decomposing the residual term into K intrinsic mode functions with compact spectral characteristics. The mode update formula is: $$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha\left(\omega - \omega_k^{n}\right)^2}$$ In the formula, $\hat{u}_k^{n+1}(\omega)$ is the spectrum of the k-th mode at iteration $n+1$; $\hat{f}(\omega)$ is the spectrum of the original signal; $\hat{\lambda}(\omega)$ is the spectrum of the Lagrange multiplier; $\omega_k^{n}$ is the current center frequency of the k-th mode; and $n$ is the number of iteration steps.
[0084] The center frequency is updated based on the spectral energy centroid of the mode: $$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left|\hat{u}_k^{n+1}(\omega)\right|^2 d\omega}{\int_0^{\infty} \left|\hat{u}_k^{n+1}(\omega)\right|^2 d\omega}$$ In the formula, $\hat{u}_k^{n+1}(\omega)$ is the spectrum of the k-th mode at iteration $n+1$. 3. Sparrow Search Algorithm (SSA) optimization of VMD parameters: To improve the adaptability and accuracy of signal decomposition, the Sparrow Search Algorithm (SSA) is used to jointly optimize the key parameters of Variational Mode Decomposition (VMD). The decomposition quality of VMD, a commonly used time-frequency analysis method, is directly affected by the number of modes K and the penalty factor α. Inappropriate parameter selection can easily lead to mode aliasing or decomposition distortion, affecting subsequent feature extraction and analysis. Therefore, the SSA algorithm is introduced, combined with a multi-objective fitness function, to automatically optimize the VMD parameters and improve decomposition quality.
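The two frequency-domain VMD updates (the mode spectrum update and the center frequency update as a spectral energy centroid) can be sketched on a discrete frequency grid. This is a minimal illustration, assuming spectra are given as lists sampled at the non-negative frequencies in `freqs` and that the integrals are approximated by sums; the function names are hypothetical, and the full ADMM loop (including the multiplier update and the stopping test) is omitted.

```python
def update_mode_spectrum(f_hat, other_modes_sum, lam_hat, freqs, omega_k, alpha):
    """One mode update: a Wiener-filter-like band-pass centred on omega_k.

    f_hat:           spectrum of the input signal
    other_modes_sum: sum of the spectra of all other modes
    lam_hat:         spectrum of the Lagrange multiplier
    """
    return [
        (f - s + lam / 2) / (1 + 2 * alpha * (w - omega_k) ** 2)
        for f, s, lam, w in zip(f_hat, other_modes_sum, lam_hat, freqs)
    ]


def update_center_frequency(u_hat, freqs):
    """Power-weighted mean frequency (spectral centroid) of one mode."""
    power = [abs(u) ** 2 for u in u_hat]
    total = sum(power)
    if total == 0:
        return 0.0
    return sum(w * p for w, p in zip(freqs, power)) / total


freqs = [0.0, 0.1, 0.2, 0.3]
f_hat = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]  # flat spectrum, for illustration only
zeros = [0j] * 4
u_hat = update_mode_spectrum(f_hat, zeros, zeros, freqs, omega_k=0.1, alpha=50.0)
print([round(abs(u), 3) for u in u_hat])
print(update_center_frequency(u_hat, freqs))
```

A larger α narrows the filter around the center frequency, which is why α and K jointly control how cleanly the modes separate.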
[0085] The Sparrow Search Algorithm (SSA) is an emerging swarm intelligence optimization method inspired by the foraging and vigilance behaviors of sparrows in nature. The algorithm defines three types of individuals (discoverers, joiners, and vigilants) that co-evolve to balance global search and local exploitation capabilities, which helps avoid getting trapped in local optima.
[0086] The SSA population consists of $N$ individuals, each represented by its position vector: $$X_i = [x_{i,1}, x_{i,2}, \dots, x_{i,d}], \quad i = 1, 2, \dots, N$$ where $d$ is the dimension of the decision variables of the problem and $X_i$ is the current position of the i-th sparrow.
[0087] Discoverer position update: Discoverers constitute a certain proportion of the total population and possess strong global search capabilities. Their position update formula is as follows: $$X_{i,j}^{t+1} = X_{i,j}^{t} \cdot \exp\left(-\frac{i}{a \cdot t_{\max}}\right), \quad R_2 < ST$$ $$X_{i,j}^{t+1} = X_{i,j}^{t} + Q \cdot L, \quad R_2 \ge ST$$ where $X_{i,j}^{t}$ is the j-th dimension of the i-th sparrow at iteration $t$; $t$ is the current iteration number; $a$ is a control parameter; $t_{\max}$ is the maximum number of iterations; $R_2$ is a random number; $ST$ is the warning threshold; $Q$ is a random number following a normal distribution; and $L$ is a $1 \times d$ vector whose elements are all 1.
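[0088] Joiner position update: Joiners improve their local search capabilities by following the discoverers and combining this with their own strategies. The update formula is as follows: $$X_{i,j}^{t+1} = Q \cdot \exp\left(\frac{X_{\mathrm{worst}}^{t} - X_{i,j}^{t}}{i^2}\right), \quad i > N/2$$ $$X_{i,j}^{t+1} = X_{P}^{t+1} + \left|X_{i,j}^{t} - X_{P}^{t+1}\right| \cdot A^{+} \cdot L, \quad i \le N/2$$ where $X_{P}$ is the optimal position of the current population (the best position occupied by a discoverer); $X_{i,j}^{t}$ is the j-th dimension of the i-th of $N$ sparrows at iteration $t$; $Q$ is a normally distributed random number; $L$ is a $1 \times d$ vector of ones; and $A$ is a $1 \times d$ vector whose elements are randomly 1 or -1, with $A^{+} = A^{T}(AA^{T})^{-1}$.
[0089] When the population faces potential risks (such as the appearance of predators), some individuals (the vigilants) enhance the randomness of their search through the following strategy: $$X_{i,j}^{t+1} = X_{\mathrm{best}}^{t} + \beta \cdot \left|X_{i,j}^{t} - X_{\mathrm{best}}^{t}\right|, \quad f_i > f_g$$ $$X_{i,j}^{t+1} = X_{i,j}^{t} + \kappa \cdot \frac{\left|X_{i,j}^{t} - X_{\mathrm{worst}}^{t}\right|}{(f_i - f_w) + \varepsilon}, \quad f_i = f_g$$ where $X_{\mathrm{best}}$ and $X_{\mathrm{worst}}$ are the current global best and worst positions, respectively; $\beta$ is a random variable that follows a normal distribution; $\kappa$ is a random number in [-1, 1]; $f_i$ is the fitness of the current individual; $f_g$ and $f_w$ are the current global best and worst fitness values; and $\varepsilon$ is a small constant that avoids division by zero.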
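As an illustration of the discoverer rule only, the sketch below assumes the commonly published form of SSA: when a random number falls below the warning threshold, the position is shrunk by an exponential factor that depends on the individual's rank; otherwise a normally distributed step is added to every dimension. The random draws are passed in explicitly so that the rule is easy to test; the function and parameter names are hypothetical.

```python
import math
import random


def update_discoverer(position, rank, max_iter, r2, safety_threshold, control, q):
    """Discoverer update for one sparrow.

    position:         current position vector (list of floats)
    rank:             1-based index of the sparrow in the sorted population
    r2:               random alarm value drawn for this iteration
    safety_threshold: warning threshold ST
    control:          random control parameter in (0, 1]
    q:                normally distributed random step
    """
    if r2 < safety_threshold:
        # No danger detected: contract the position by a rank-dependent factor.
        factor = math.exp(-rank / (control * max_iter))
        return [x * factor for x in position]
    # Danger detected: shift every dimension by the same random step.
    return [x + q for x in position]


rng = random.Random(0)
new_position = update_discoverer(
    position=[5.0, 2000.0],           # e.g. [K, alpha] before rounding and clipping
    rank=1,
    max_iter=50,
    r2=rng.random(),
    safety_threshold=0.8,
    control=1.0 - rng.random(),       # maps [0, 1) onto (0, 1]
    q=rng.gauss(0.0, 1.0),
)
print(new_position)
```

When the position encodes VMD parameters, the first entry would still need rounding to an integer K and both entries clipping to their search bounds, which this sketch leaves out.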
[0090] For the VMD parameter optimization task, this paper selects the number of modes K (integer) and the penalty factor α (continuous value) as optimization variables to construct a multi-objective fitness function, which comprehensively considers the signal reconstruction error and the envelope entropy of the decomposed modes.
[0091] Reconstruction error reflects the degree to which the VMD decomposition result fits the original signal and is a direct indicator of the decomposition effect. Ideally, the signal reconstructed from the VMD mode functions should be as close as possible to the original signal. A large reconstruction error indicates information loss, insufficient mode separation, or unreasonable parameter settings. The reconstruction error is defined in root mean square error (RMSE) form: $$E_{\mathrm{rec}} = \sqrt{\frac{1}{M}\sum_{m=1}^{M}\left(f(m) - \tilde{f}(m)\right)^2}$$ where $f$ is the original signal sequence, $\tilde{f}$ is the signal reconstructed from all IMF components, and $M$ is the total length of the signal. The smaller the reconstruction error, the higher the decomposition accuracy, and the more comprehensively the IMF components express the characteristics of the original signal.
[0092] Envelope entropy reflects the structural complexity and information distribution characteristics of each modal component. After the signal envelope is obtained through the Hilbert transform, the envelope entropy is larger when the modal component contains more noise and little feature information, and smaller when the IMF contains less noise and more feature information. The entropy value is calculated using the following formula: $$H_k = -\sum_{m=1}^{M} p_k(m)\,\ln\left(p_k(m) + \varepsilon\right)$$ where $p_k(m)$ is the normalized envelope amplitude of the k-th modal component and $\varepsilon$ is a small constant that prevents taking the logarithm of zero.
[0093] To comprehensively evaluate the reconstruction accuracy and modal structure characteristics of VMD decomposition, the two indices (reconstruction error and envelope entropy) are combined in a multi-objective fitness function. Reconstruction error reflects the signal reconstruction capability, while envelope entropy measures the structural complexity of the decomposition result. Considering the dimensional differences and numerical ranges of the two indices, a logarithmic transformation is introduced to normalize their scales. The specific expression is as follows: $$F = \ln\left(1 + E_{\mathrm{rec}}\right) + \ln\left(1 + H\right)$$ where $E_{\mathrm{rec}}$ is the reconstruction error, $H = \sum_{k=1}^{K} H_k$ is the total envelope entropy, and the constant 1 is added to avoid zero or negative values affecting the calculation. Since both a smaller reconstruction error and a smaller envelope entropy indicate a better decomposition, SSA searches for the K and α that minimize F. This fusion strategy balances the reconstruction accuracy and information representation quality of the decomposition, improving the stability and effectiveness of VMD parameter optimization.
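A fitness evaluation of this kind can be sketched as follows. It assumes the fitness is the sum of log(1 + reconstruction RMSE) and log(1 + total envelope entropy), which is one reading of the description above, and it takes the envelopes as precomputed inputs (in practice they could be obtained as the magnitude of the analytic signal, for example with SciPy's `scipy.signal.hilbert`). Function names are hypothetical.

```python
import math


def reconstruction_rmse(signal, modes):
    """RMSE between the signal and the sum of all decomposed modes."""
    reconstructed = [sum(values) for values in zip(*modes)]
    squared = sum((s - r) ** 2 for s, r in zip(signal, reconstructed))
    return math.sqrt(squared / len(signal))


def envelope_entropy(envelope, eps=1e-12):
    """Shannon entropy of the envelope after normalising it to sum to one."""
    total = sum(envelope)
    probabilities = [a / total for a in envelope]
    # eps keeps log() finite when a normalised amplitude is exactly zero.
    return -sum(p * math.log(p + eps) for p in probabilities)


def vmd_fitness(signal, modes, envelopes):
    """Log-scaled combination of reconstruction error and total envelope entropy."""
    e_rec = reconstruction_rmse(signal, modes)
    h_total = sum(envelope_entropy(env) for env in envelopes)
    return math.log(1 + e_rec) + math.log(1 + h_total)


signal = [1.0, 2.0, 3.0]
modes = [[1.0, 1.0, 1.0], [0.0, 1.0, 2.0]]      # modes that sum exactly to the signal
envelopes = [[1.0, 1.0, 1.0], [0.0, 1.0, 2.0]]  # placeholder envelopes for illustration
print(vmd_fitness(signal, modes, envelopes))
```

Lower values are better under this formulation, so an optimizer would evaluate this function for each candidate (K, α) and keep the minimizer.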
[0094] V. Variable Attention Modeling: The core objective of this phase is to construct a time-series forecasting model capable of accurately characterizing the relationship between meteorological drivers and runoff response, and to improve the model's generalization ability and stability through scientific training strategies. The model design fully integrates previous feature engineering results, forming a highly adaptable deep learning framework. During training, the model's prediction accuracy and robustness are gradually improved by setting appropriate loss functions and optimization strategies. Simultaneously, model training is conducted separately for different forecast periods to enhance its applicability and relevance in specific time periods, providing stable and reliable technical support for dynamic runoff forecasting.
[0095] 1. Input data construction: The original time series data was divided into training, validation, and test sets in chronological order, with a ratio of 7:2:1. The training set was used to fit the model parameters, the validation set was used to adjust the model structure and hyperparameters, and the test set was used to test the model's performance in actual predictions. Using chronological division effectively prevents future information from entering the training process, thereby improving the reliability of the prediction results.
[0096] In constructing the training labels, a sliding time window method is adopted, using a continuous historical meteorological information sequence as input and the subsequent runoff sequence as the prediction target. Specifically, a fixed-length historical sequence is selected as the model input each time, and the corresponding actual runoff value for a future period is used as the output label. This method can fully reflect the temporal relationship between meteorological changes and runoff response, and supports flexible applications of single-step or multi-step prediction.
[0097] To eliminate the dimensional differences between various meteorological elements and runoff data, all features were standardized to ensure more stable convergence of the model during training. These steps resulted in a structured dataset with clear input-output relationships.
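The dataset construction steps (chronological 7:2:1 split, standardization, and sliding windows) can be sketched as below. Computing the mean and standard deviation on the training portion only is an assumption made here to stay consistent with the goal of keeping future information out of training; the window lengths and function names are illustrative.

```python
import math


def chronological_split(n):
    """Index ranges for a 7:2:1 train/validation/test split in time order."""
    train_end = n * 7 // 10
    val_end = n * 9 // 10
    return range(0, train_end), range(train_end, val_end), range(val_end, n)


def fit_standardizer(values):
    """Mean and (population) standard deviation of the training values."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, (std if std > 0 else 1.0)


def sliding_windows(inputs, target, history, horizon):
    """Pair each block of `history` inputs with the next `horizon` targets."""
    samples = []
    for start in range(len(target) - history - horizon + 1):
        x = inputs[start:start + history]
        y = target[start + history:start + history + horizon]
        samples.append((x, y))
    return samples


runoff = [float(v) for v in range(10)]
train_idx, val_idx, test_idx = chronological_split(len(runoff))
mean, std = fit_standardizer([runoff[i] for i in train_idx])
scaled = [(v - mean) / std for v in runoff]  # same statistics for all three subsets
samples = sliding_windows(scaled, scaled, history=3, horizon=2)
print(len(samples), list(train_idx), list(val_idx), list(test_idx))
```

In the full method the inputs would be multivariate (runoff components plus selected meteorological features per time step), but the windowing logic is the same.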
[0098] 2. DeformTime model: DeformTime is a deep learning architecture for multivariate time series forecasting tasks, designed to address the shortcomings of traditional models in simultaneously capturing the dynamic dependencies between multiple variables and the internal correlation features of time series. In this invention, the model uses multi-source meteorological driving factors for watershed runoff forecasting. This model introduces a deformable attention mechanism to fully exploit the cross-variable asynchronous responses and non-stationary evolution features in the meteorological-runoff component system. Its processing flow includes reordering and grouping multiple meteorological variables according to their correlation with the target runoff component using a neighborhood-aware input embedding module (NAE) to enhance the representation of hydrologically sensitive factors; the embedding results are input in parallel into a variable deformable attention block (V-DAB) and a time deformable attention block (T-DAB) to model the synergistic influence mechanism of different meteorological factors on the runoff component and the dynamic temporal dependencies within the runoff component sequence, respectively; after fusing the two types of time series representations, the data is fed into a decoder, where a GRU generates the hidden states for each prediction time step, and a multilayer perceptron outputs the final runoff component prediction results. The core modules of this model include V-DAB and T-DAB, whose structure and functions are as follows.
[0099] V-DAB is used to capture the impact of dynamic interactions among multiple meteorological variables on runoff components. Traditional attention mechanisms are often based on fixed temporal windows or global dependency modeling, making it difficult to flexibly focus on the feature locations most relevant to the target variable. V-DAB, building upon the multi-head attention mechanism of the standard Transformer, introduces a deformable sampling offset mechanism. It adaptively selects the most relevant meteorological variables and their effective time points through learned offset parameters, thereby improving the effectiveness of cross-variable information interaction. Specifically, the input embedding is first divided into several spatiotemporal segments along the time dimension, and a lightweight 2D CNN is used to predict a two-dimensional deformable offset for each segment: $$\Delta p = \gamma \cdot \tanh\left(\theta_{\mathrm{off}}(q_p)\right)$$ In the formula, $q_p$ is the query vector generated by jointly encoding multiple meteorological variables through an embedding layer within the p-th time segment, reflecting the comprehensive state of multi-source meteorological factors during that period; $\theta_{\mathrm{off}}$ is a two-layer convolutional network that takes $q_p$ as input and outputs the position offset $\Delta p$ in both the time and variable dimensions; and $\gamma$ is a learnable scalar that controls the maximum offset magnitude. Then, at the shifted position $p + \Delta p$, the key and value vectors are obtained through bilinear interpolation, and attention is computed across variables and across time to adaptively focus on the spatiotemporal features that contribute most.
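The sampling step of deformable attention, reading keys and values at a fractional position in the time-by-variable grid, can be sketched with plain bilinear interpolation. The sketch uses scalar grid entries instead of embedding vectors, clamps positions to the grid, and bounds the raw offset with a scaled tanh; the bounding function and all names are assumptions for illustration, not details confirmed by the text.

```python
import math


def bounded_offset(raw, gamma):
    """Squash a raw network output into the range (-gamma, gamma)."""
    return gamma * math.tanh(raw)


def bilinear_sample(grid, t, v):
    """Sample grid[time][variable] at fractional (t, v), clamped to the grid."""
    n_t, n_v = len(grid), len(grid[0])
    t = min(max(t, 0.0), n_t - 1)
    v = min(max(v, 0.0), n_v - 1)
    t0, v0 = math.floor(t), math.floor(v)
    t1, v1 = min(t0 + 1, n_t - 1), min(v0 + 1, n_v - 1)
    wt, wv = t - t0, v - v0
    # Interpolate along the variable axis first, then along the time axis.
    upper = (1 - wv) * grid[t0][v0] + wv * grid[t0][v1]
    lower = (1 - wv) * grid[t1][v0] + wv * grid[t1][v1]
    return (1 - wt) * upper + wt * lower


grid = [
    [0.0, 1.0],
    [2.0, 3.0],
]
dt = bounded_offset(0.55, gamma=1.0)  # hypothetical raw offsets from the offset network
dv = bounded_offset(0.55, gamma=1.0)
print(bilinear_sample(grid, 0.0 + dt, 0.0 + dv))
print(bilinear_sample(grid, 0.5, 0.5))  # 1.5
```

Because interpolation is differentiable with respect to the sampling position, the offsets can be learned by gradient descent together with the rest of the model.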
[0100] T-DAB aims to model the non-stationary dynamic characteristics of individual meteorological or runoff components over time. Since hydrological processes often involve multiple time scales, such as diurnal cycles, flood season trends, and abrupt changes in extreme events, attention with a fixed time window struggles to capture them simultaneously. T-DAB introduces a deformable time-shifting mechanism, enabling the model to flexibly focus on key historical moments at different time scales. Specifically, this module rearranges the time series of length $L$ into a matrix, with each row corresponding to a coarser-grained time window. Then, a one-dimensional offset is learned for each row using a 1D CNN: $$\Delta t_g = \gamma_t \cdot \tanh\left(\phi(q_g)\right)$$ In the formula, $q_g$ is the g-th query subset obtained by the neighborhood-aware input embedding module (NAE) after grouping according to the correlation between meteorological variables and runoff components; $\phi$ is a one-dimensional convolutional network for offset prediction; and $\gamma_t$ is a learnable scaling factor that constrains the time offset range to maintain causality and stability. This offset allows the attention head to flexibly "slide back and forth" within a time window, thus adaptively capturing hydrological response patterns of different frequencies. The NAE module preprocesses the multi-source input variables for runoff prediction. First, the variables are rearranged according to their linear correlation with the target runoff and divided into G adjacent groups. Each group is then embedded into a specified dimension through an independent fully connected layer, and the group embeddings are concatenated. Finally, sinusoidal positional encodings are injected to preserve the temporal order, and structured features are output.
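The reordering and grouping performed before embedding can be sketched as follows. Sorting by the absolute value of the correlation and giving any leftover variables to the first groups are assumptions of this sketch; the per-group fully connected embedding and the positional encoding are not shown.

```python
def group_by_correlation(correlations, n_groups):
    """Order variables by |correlation| with the target and cut into adjacent groups.

    correlations: dict mapping variable name to its correlation with the target.
    Returns a list of n_groups lists of variable names.
    """
    ordered = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
    base, extra = divmod(len(ordered), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)  # earlier groups absorb the remainder
        groups.append(ordered[start:start + size])
        start += size
    return groups


correlations = {"rain": 0.9, "tmax": -0.5, "tmin": 0.2, "et": -0.7, "tavg": 0.4}
print(group_by_correlation(correlations, n_groups=2))
# [['rain', 'et', 'tmax'], ['tavg', 'tmin']]
```

Placing similarly correlated variables next to each other means that a small offset in the variable dimension moves attention between related drivers rather than arbitrary ones.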
[0101] 3. Model training settings: This technology employs refined training configurations and strategies to ensure a balance between accuracy, stability, and generalization ability in the prediction model. For the loss function, mean squared error (MSE) is chosen as the primary objective function to measure the deviation between predicted and actual runoff values. MSE is more sensitive to larger errors, which helps the model accurately fit key periods such as peak floods during training, thus improving the overall reliability of the prediction results.
[0102] To prevent overfitting and improve the model's generalization performance, an early stopping strategy is introduced to monitor the training process. Training automatically terminates when the prediction error on the validation set is not effectively reduced within a set number of consecutive iterations, preventing performance degradation during ineffective iterations. Simultaneously, the learning rate is dynamically adjusted using the validation set, decreasing it later in the training process to improve the model's convergence stability.
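The interplay of early stopping and validation-driven learning rate decay can be sketched in plain Python. The class name `ValidationMonitor`, the halving factor, and the learning-rate patience of 10 epochs are assumptions made for illustration; the initial learning rate of 0.001 and the stopping patience of 30 epochs (15% of 200 epochs) echo the settings given later in Example 2, and the validation losses are simulated.

```python
class ValidationMonitor:
    """Track validation loss, decay the learning rate on plateaus, and signal when to stop."""

    def __init__(self, lr, stop_patience, lr_patience, lr_factor=0.5, min_delta=0.0):
        self.lr = lr
        self.stop_patience = stop_patience
        self.lr_patience = lr_patience
        self.lr_factor = lr_factor
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def update(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            # Decay the learning rate every lr_patience epochs without improvement.
            if self.bad_epochs % self.lr_patience == 0:
                self.lr *= self.lr_factor
        return self.bad_epochs >= self.stop_patience

# Simulated validation losses: 20 improving epochs followed by a plateau.
val_losses = [1.0 / (epoch + 1) for epoch in range(20)] + [0.06] * 180

monitor = ValidationMonitor(lr=0.001, stop_patience=30, lr_patience=10)
stopped_epoch = None
for epoch, loss in enumerate(val_losses):
    if monitor.update(loss):
        stopped_epoch = epoch
        break
```

In this simulated run the learning rate is halved three times during the plateau before the stopping patience is exhausted, so training ends well before the 200-epoch limit.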
[0103] To address the differences in runoff forecasting across different forecast periods, this technique employs a "forecast period-specific training" approach. This involves training separate sub-models with independent weights for each forecast period. This not only fully uncovers the runoff response patterns across various time scales but also significantly improves the prediction accuracy and stability for specific forecast periods, avoiding the trade-offs in performance across multiple forecast periods for a single model.
[0104] VI. Two-stage reconstruction based on physical interaction coupling: To address the problem that traditional runoff prediction models often employ simple linear superposition reconstruction strategies, neglecting the complex interactions between different runoff components, this invention proposes a two-stage reconstruction and multi-forecast period integration strategy based on physical interaction coupling.
[0105] 1. Two-stage physical interaction coupling reconstruction: For the decomposed time series, the prediction results of the individual components are not simply superimposed linearly but are reconstructed through a physical interaction coupling mechanism. This invention designs a component interaction coupling network containing fully connected layers and activation functions. The reconstruction process is divided into two stages. Stage 1: The parameters of each runoff component prediction model are frozen so that the models independently output the prediction results of the trend component, the seasonal component, and the high-frequency disturbance components, ensuring that each component model focuses on extracting features at a specific scale. Stage 2: The predicted values of the trend component and the seasonal component are concatenated and fed into the coupling network, which simulates the nonlinear modulation of runoff and outputs the correction factor γ for the high-frequency disturbance components.
[0106] The final reconstruction formula is: $\hat{Q} = \hat{T} + \hat{S} + \gamma \cdot \hat{H}$. In the formula, $\hat{Q}$ is the final predicted runoff value; $\hat{T}$ and $\hat{S}$ are the predicted values of the trend component and the seasonal component; $\hat{H}$ is the predicted value of the high-frequency disturbance components; and $\gamma$ is the nonlinear correction coefficient for runoff, constrained to the interval (0,1) by the Sigmoid activation function, thereby achieving adaptive correction of the high-frequency disturbances.
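A minimal sketch of the second-stage coupling follows. It assumes the reconstruction adds the trend and seasonal predictions to the γ-scaled sum of the high-frequency predictions. The single tanh hidden layer, the weight values, and the names `coupling_gamma` and `reconstruct` are hypothetical; in practice the weights would be learned while the component models stay frozen.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def coupling_gamma(trend_hat, season_hat, params):
    """Tiny MLP: [trend, season] -> tanh hidden layer -> gamma in (0, 1)."""
    hidden = [math.tanh(w_t * trend_hat + w_s * season_hat + b)
              for (w_t, w_s, b) in params["hidden"]]
    z = sum(w * h for w, h in zip(params["out_w"], hidden)) + params["out_b"]
    return sigmoid(z)

def reconstruct(trend_hat, season_hat, high_freq_hats, params):
    """Combine frozen component predictions; only the coupling network would be trained."""
    gamma = coupling_gamma(trend_hat, season_hat, params)
    runoff_hat = trend_hat + season_hat + gamma * sum(high_freq_hats)
    return runoff_hat, gamma

# Hypothetical (untrained) weights and component predictions.
params = {
    "hidden": [(0.01, 0.02, 0.0), (-0.01, 0.03, 0.1)],
    "out_w": [0.5, -0.5],
    "out_b": 0.0,
}
runoff_hat, gamma = reconstruct(1200.0, -150.0, [30.0, -12.0, 4.5], params)
```

Because γ depends on the trend and seasonal state, the same high-frequency prediction is damped or passed through differently in, for example, flood and dry seasons.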
[0107] 2. Multi-forecast-period differentiated ensemble strategy: For the multiple sub-models trained for different forecast periods, an ensemble strategy is used to fuse the prediction results. Given the significant differences in how different forecast periods respond to the driving features, an independent model is built for each forecast period: the model with a forecast period of 1 day outputs the runoff value for the first day in the future, the model with a forecast period of 3 days outputs the runoff value for the third day in the future, and so on.
[0108] When applying the model, sub-models for the corresponding forecast period are invoked according to actual needs. Short-term forecast models focus more on timeliness, while medium- and long-term forecast models can supplement trend and lagged response information. Through the optimal model selection mechanism, the predictive advantages of multiple forecast periods are complementary and integrated to form a comprehensive prediction result with higher reliability. The final output runoff prediction sequence has both temporal continuity and multi-scale feature expression capabilities, which not only meets the needs of short-term flood peak forecasting and emergency response, but also provides solid support for medium- and long-term water resource allocation and planning.
[0109] Example 2: This embodiment provides a technical solution for runoff ensemble prediction based on spatial heterogeneity and variable attention, aiming to improve the accuracy and stability of runoff prediction under complex watershed conditions. The method mainly includes the following steps. Step 1: In the data collection and processing stage, the historical runoff sequence of the target hydrological station is acquired, together with multi-source reanalysis meteorological data for its controlled watershed over the corresponding time period. The meteorological elements include basic driving factors such as rainfall, temperature, and evapotranspiration. The runoff variation trend at the target station is shown in Figure 7. Based on the needs of model calculation and analysis, the multi-source data were unified in time scale, standardized in format, spatially aligned, and treated for missing values to form a complete spatiotemporal driven dataset, laying the data foundation for subsequent spatial feature construction and time series prediction. Specific information about the meteorological dataset is shown in Table 1.
[0110] Table 1 Meteorological Data Details
[0111] Step 2: In the watershed spatial zoning stage, based on the watershed digital elevation model (DEM), methods such as flow accumulation (cumulative flow) analysis and watershed delineation are used to divide the control watershed of the target hydrological station into seven representative sub-regions. Each sub-region corresponds to an independent catchment unit with a relatively consistent topographic structure and consistent runoff generation and response characteristics. This preserves the spatial heterogeneity within the watershed and provides a clear spatial organization framework for regional meteorological feature extraction and modeling.
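The flow accumulation analysis can be illustrated on a toy grid. The sketch below assumes a depression-filled DEM, assigns each cell to its steepest-descent neighbor (the D8 rule), and counts the upstream cells draining into every cell. The elevations, the threshold of 3 cells, and the function names are hypothetical.

```python
import math

# Offsets to the 8 neighbours considered by the D8 scheme.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def d8_flow_direction(dem):
    """For each cell, return the (row, col) of its steepest-descent neighbour, or None."""
    rows, cols = len(dem), len(dem[0])
    direction = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Drop divided by distance (diagonal steps are longer).
                    slope = (dem[r][c] - dem[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best_slope = slope
                        direction[r][c] = (nr, nc)
    return direction

def flow_accumulation(dem):
    """Count the upstream cells draining into each cell (the cell itself excluded)."""
    rows, cols = len(dem), len(dem[0])
    direction = d8_flow_direction(dem)
    acc = [[0] * cols for _ in range(rows)]
    # Visit cells from highest to lowest so each total is final before it is passed on.
    order = sorted(((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
                   reverse=True)
    for _, r, c in order:
        target = direction[r][c]
        if target is not None:
            tr, tc = target
            acc[tr][tc] += acc[r][c] + 1
    return acc

# Toy depression-free DEM: a valley running down the middle column.
dem = [
    [9, 8, 9],
    [8, 5, 8],
    [7, 3, 7],
    [6, 1, 6],
]
acc = flow_accumulation(dem)
# Cells whose accumulation reaches a hypothetical confluence threshold form the river network.
stream = [(r, c) for r in range(4) for c in range(3) if acc[r][c] >= 3]
```

Thresholding the accumulation grid yields the river network skeleton, from which sub-basin outlets and boundaries can then be delineated.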
[0112] Step 3: In the adaptive feature selection stage for the forecast period, based on the spatial partitioning results, the regional average values of meteorological elements such as rainfall, temperature, and evapotranspiration for each sub-region are extracted from the reanalysis data to form a meteorological feature set at the sub-region scale. The multi-year average values of meteorological data for each sub-region are shown in Table 2. The importance of each meteorological variable is evaluated through Pearson correlation analysis and adaptive feature selection for the forecast period. The optimal combination of input features is selected according to different forecast periods to reduce redundant features and improve model stability and generalization performance.
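The forecast-period-dependent screening can be sketched as a Pearson correlation followed by a threshold that grows with the forecast period. The threshold values, feature names, and sample series below are invented for illustration; the source specifies only that short forecast periods use a lower threshold and long ones a higher threshold.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear information
    return cov / math.sqrt(var_x * var_y)

def select_features(features, runoff, horizon, thresholds):
    """Keep the features whose |r| with runoff meets the threshold for this forecast period."""
    limit = thresholds[horizon]
    scores = {name: pearson(series, runoff) for name, series in features.items()}
    return sorted(name for name, r in scores.items() if abs(r) >= limit)

# Hypothetical thresholds: lower for the short forecast period, higher for the long one.
THRESHOLDS = {1: 0.3, 7: 0.9}

runoff = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = {
    "rain_sub1": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0],  # strongly correlated
    "temp_sub1": [2.0, 1.0, 4.0, 3.0, 5.0, 4.5],  # moderately correlated
    "evap_sub1": [3.0, 1.0, 2.0, 3.0, 1.0, 2.0],  # weakly correlated
}
short_set = select_features(features, runoff, 1, THRESHOLDS)
long_set = select_features(features, runoff, 7, THRESHOLDS)
```

With these toy values the short-period model keeps both the rainfall and temperature features, while the long-period model keeps only the dominant rainfall feature.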
[0113] Table 2 Multi-year average meteorological data for each sub-region
[0114] Step 4: In the two-stage runoff series decomposition stage, considering the non-stationarity of historical runoff series, Seasonal-Trend decomposition using Loess (STL, a locally weighted regression method) is first used to separate the trend component, seasonal component, and residual component of the runoff series. Then, the Sparrow Search Algorithm (SSA) is used to optimize the Variational Mode Decomposition (VMD) parameters, so that the residual component is adaptively decomposed and multi-scale high-frequency disturbance components are extracted. The VMD parameter optimization results and the runoff series decomposition results are shown in Figure 4 and Figure 5, respectively. This two-stage decomposition reduces the non-stationarity of the original sequence and improves the ability of subsequent models to learn and predict complex runoff dynamics.
[0115] Step 5: In the variable attention modeling stage, a time-series prediction model with variable deformable attention and time deformable attention mechanisms is constructed to jointly model the multi-component runoff sequence and the highly correlated meteorological features. The forecast periods are set to 1, 3, 5, and 7 months, with a sliding window length of 15 time steps, a batch size of 64, a maximum of 200 training epochs, an initial learning rate of 0.001, and an early-stopping patience equal to 15% of the total training epochs. Independent models are constructed for the multi-component runoff sequence at each forecast period so as to characterize more finely the spatiotemporal coupling characteristics and runoff evolution patterns of complex watershed systems.
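The construction of per-forecast-period training samples can be sketched as follows. The example assumes a chronological 7:2:1 split followed by a sliding window of 15 steps, with the label taken as the value a given number of steps after the window. The series is a stand-in for real data, feature standardization is omitted, and the helper names are hypothetical.

```python
def split_series(series, ratios=(7, 2, 1)):
    """Split chronologically into train/validation/test parts by integer ratios."""
    total = sum(ratios)
    n = len(series)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    return series[:n_train], series[n_train:n_train + n_val], series[n_train + n_val:]

def make_windows(series, window, horizon):
    """Pair each length-`window` input slice with the value `horizon` steps after it."""
    samples = []
    for start in range(len(series) - window - horizon + 1):
        x = series[start:start + window]
        y = series[start + window + horizon - 1]
        samples.append((x, y))
    return samples

series = list(range(400))  # stand-in for a runoff component series

# One independent dataset (and hence one sub-model) per forecast period.
datasets = {}
for horizon in (1, 3, 5, 7):
    datasets[horizon] = tuple(
        make_windows(part, window=15, horizon=horizon)
        for part in split_series(series)
    )

train_h1, val_h1, test_h1 = datasets[1]
```

Splitting before windowing keeps every window inside a single subset, so no validation or test value leaks into a training sample.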
[0116] Step 6: In the two-stage reconstruction phase based on physical interaction coupling, a physical interaction coupling reconstruction mechanism is designed for the multi-component prediction models under different forecast periods. First, the parameters of each component prediction model are frozen, and the prediction results of the trend, seasonal, and high-frequency disturbance components are output independently. Then, the component interaction coupling network takes the trend and seasonal components as input and outputs a nonlinear correction factor that adaptively corrects the high-frequency disturbance components. Finally, the final runoff prediction is generated through the reconstruction formula. A comparison of the prediction results of the ensemble model with those of other methods is shown in Figure 6.
[0117] Figure 4 shows the iterative process of optimizing the VMD parameters with SSA. The optimization was run with 20 iterations, a search dimension of 2, a population size of 30, and parameter ranges of K∈[2,10] and α∈[300,3000]. The optimal parameter combination was K=7 and α=391.37.
[0118] Figure 5 shows the result of decomposing the original runoff series. The STL method yields three parts: the trend term, the seasonal term, and the residual term. As shown in Figure 5(a), the trend term represents the runoff well. To further explore the potential multi-scale features in the residual sequence, variational mode decomposition (VMD) is introduced to perform a secondary decomposition of the residual part. As shown in Figure 5(b), each IMF component of the VMD result corresponds to a variation pattern at a different time scale and exhibits good frequency separation. Among them, the low-order IMFs mainly capture high-frequency noise and local disturbances, while the high-order IMFs reflect potential mid-to-low-frequency structural information.
[0119] As shown in Figure 6, the prediction curves of the STL-VMD-DeformTime model under the different forecast periods are generally consistent with the observed sequence and accurately reflect the trend of runoff change, further verifying the reliability and stability of the proposed model.
[0120] The above embodiments are merely preferred technical solutions of the present invention and should not be considered as limitations on the present invention. The scope of protection of the present invention should be limited to the technical solutions described in the claims, including equivalent substitutions of the technical features described in the claims. That is, equivalent substitutions and improvements within this scope are also within the scope of protection of the present invention.
Claims
1. A runoff ensemble prediction method based on spatial heterogeneity and variable attention, characterized in that it includes the following steps:
Step 1, Data collection and processing: acquire historical runoff sequences, meteorological reanalysis data, and topographic and spatial baseline data of the target hydrological station; preprocess the data to form a spatiotemporal driven dataset.
Step 2, Spatial partitioning of the watershed: based on the digital elevation model (DEM) of the watershed in the spatiotemporal driven dataset, the control watershed of the target hydrological station is divided into several sub-regions by calculating the cumulative flow, extracting the river network, and delineating the sub-watersheds.
Step 3, Adaptive feature screening for forecast period: based on the spatial zoning results of the watershed, meteorological driving factors for each sub-region are extracted, and for each sub-region the regional average value of the raster data is calculated to form a numerical sequence representing the overall meteorological conditions of the sub-region; high-correlation candidate features are initially selected through Pearson correlation analysis; further, an adaptive screening mechanism for forecast period is introduced, which sets differentiated screening thresholds according to the forecast duration, with a low threshold for short forecast periods to retain high-frequency detailed features and a high threshold for long forecast periods to retain dominant trend features; finally, a dynamic input feature set suitable for the different forecast periods is constructed to achieve physical matching between input data and prediction targets.
Step 4, Two-stage runoff sequence decomposition: Seasonal-Trend decomposition using Loess (STL, a locally weighted regression method) is used to separate the trend component, seasonal component and residual component of the historical runoff sequence; then, the variational mode decomposition (VMD) parameters are optimized by the sparrow search algorithm (SSA); finally, the residual component is adaptively decomposed by the optimized VMD to extract multi-scale high-frequency disturbance components.
Step 5, Variable attention modeling: construct a time series prediction model with variable deformable attention and time deformable attention mechanisms; the runoff multi-component sequence obtained in Step 4 and the meteorological features selected in Step 3 are used for joint modeling; the corresponding prediction models are trained and optimized for the different forecast periods.
Step 6, Two-stage reconstruction based on physical interaction coupling: a two-stage training strategy is adopted to realize the physical interaction coupling reconstruction of the multi-component prediction results; in the first stage, the prediction model parameters of each component are frozen, and the predicted values of the trend, seasonal and high-frequency disturbance components are output independently; in the second stage, a component interaction coupling network is constructed to achieve nonlinear modulation of runoff; finally, a multi-forecast-period ensemble prediction result is generated through nonlinear weighted superposition.
2. The runoff ensemble prediction method based on spatial heterogeneity and variable attention as described in claim 1, characterized in that, In step 1, the historical runoff sequence includes the time series of runoff collected from the target hydrological station, with the time scale selected as daily, monthly, or hourly based on the research objective; the data sources include historical monitoring records from the watershed management agency, measured data from automatic hydrological monitoring stations, and scientific research databases; the meteorological reanalysis data includes rainfall, average temperature, maximum temperature, minimum temperature, and evapotranspiration or potential evapotranspiration, which are derived from reanalysis data; the topographic and spatial basic data include digital elevation model (DEM), watershed boundaries, and river network distribution.
3. The runoff ensemble prediction method based on spatial heterogeneity and variable attention as described in claim 1, characterized in that, Step 1, data preprocessing includes: Time and space unification: In the time dimension, the target prediction time scale is used as the benchmark to ensure that multi-source data are aligned in time; in the spatial dimension, meteorological and spatial data are uniformly resampled to a consistent spatial resolution. Missing data processing: For missing data in the hydrological runoff sequence, linear interpolation is used to fill in the missing data points, and interpolation correction is performed by combining runoff trend information in the adjacent time period.
4. The runoff ensemble prediction method based on spatial heterogeneity and variable attention as described in claim 1, characterized in that, In step 2, the watershed spatial zoning also includes a DEM preprocessing step: filling depressions in the DEM data to eliminate false depressions, and using the D8 algorithm to calculate the surface runoff direction of each grid cell.
5. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 4, characterized in that, In step 2, the calculation of the cumulative flow is based on DEM flow direction data and follows the logic of grid water flow converging from upstream to downstream. By traversing each grid, the number or area of all grid points upstream of each grid that can converge to that grid unit is counted, thereby obtaining the cumulative flow.
6. The runoff ensemble prediction method based on spatial heterogeneity and variable attention as described in claim 5, characterized in that, In step 2, the river network extraction and watershed delineation are carried out by setting a confluence threshold to extract the river network skeleton. Based on the cumulative flow and river network information, the watershed algorithm is used to simulate the convergence of water flow from local high points and delineate the range of adjacent sub-basins. Each sub-basin contains an independent outlet point. Each sub-region corresponds to an independent catchment unit with consistent topographic structure and runoff response characteristics.
7. The runoff ensemble prediction method based on spatial heterogeneity and variable attention as described in claim 6, characterized in that, In step 2, the watershed spatial partitioning also includes a sub-watershed optimization and merging step: eliminating sub-watersheds with relatively small areas that cannot independently reflect runoff generation and confluence characteristics, merging adjacent sub-watersheds with similar hydrological attributes, so that the runoff process and hydrological response within each sub-region are consistent.
8. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 1, characterized in that, In step 3, the meteorological driving factors include rainfall, average temperature, maximum temperature, minimum temperature, and evapotranspiration or potential evapotranspiration.
9. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 1, characterized in that, in step 3, the correlation coefficient for Pearson correlation analysis is calculated as follows: $r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}$; in the formula, $x_i$ and $y_i$ represent the meteorological feature value and the runoff observation value at the i-th time step, respectively; $\bar{x}$ and $\bar{y}$ are the means of their respective sequences; and $n$ is the total number of samples.
10. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 1, characterized in that, In step 3, the adaptive feature selection mechanism for the forecast period is specifically as follows: based on the differences in the forecast requirements of hydrological processes for different forecast periods, a dynamic feature selection strategy is constructed; for short-term forecast tasks, a lower correlation selection threshold is set to retain high-frequency detailed features containing instantaneous disturbance information; for long-term forecast tasks, a higher correlation selection threshold is set to remove high-frequency noise and retain the dominant features reflecting long-term evolution patterns; through the above differentiated threshold settings, the physical mechanism matching between the input feature set and the forecast target is achieved.
11. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 1, characterized in that, in step 4, the expression for the Seasonal-Trend decomposition using Loess (STL) is: $Y_t = T_t + S_t + R_t$; in the formula, $Y_t$ is the original natural runoff value at time $t$; $T_t$ is the trend value at time $t$; $S_t$ is the seasonal value at time $t$; and $R_t$ is the residual value at time $t$.
12. The runoff ensemble prediction method based on spatial heterogeneity and variable attention as described in claim 11, characterized in that, In step 4, when optimizing the parameters of variational mode decomposition (VMD) using the Sparrow Search Algorithm (SSA), the number of modes K and the penalty factor α are selected as optimization variables. The optimal parameter combination is found through the co-evolution of three types of individuals: discoverers, participants, and vigilants in the SSA algorithm.
13. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 12, characterized in that, in step 4, when optimizing the parameters of variational mode decomposition (VMD) using the Sparrow Search Algorithm (SSA), a multi-objective fitness function is constructed from two terms: the reconstruction error and the total envelope entropy. The reconstruction error is computed from the original signal sequence, the signal reconstructed from all IMF components, and the total length of the signal. The total envelope entropy is computed from the normalized envelope amplitude of the k-th modal component, with a small constant added to prevent numerical overflow.
14. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 13, characterized in that, in step 4, VMD adaptively decomposes the residual term into K intrinsic mode functions (IMFs) with compact spectral characteristics through iterative optimization. The frequency-domain update formula for the k-th mode in the (n+1)-th iteration is as follows: $\hat{u}_k^{n+1}(\omega) = \frac{\hat{r}(\omega) - \sum_{i \neq k} \hat{u}_i^{n}(\omega) + \hat{\lambda}^{n}(\omega)/2}{1 + 2\alpha\,(\omega - \omega_k^{n})^2}$; in the formula, $\hat{u}_k^{n+1}(\omega)$ is the spectrum of the k-th mode at the (n+1)-th iteration, corresponding to the oscillation component within a specific frequency band of the residual term; $\hat{r}(\omega)$ is the spectrum of the STL residual term; $\hat{\lambda}^{n}(\omega)$ is the Lagrange multiplier at the n-th iteration; $\sum_{i \neq k} \hat{u}_i^{n}(\omega)$ is the sum of the spectra of all modes except the k-th mode at the n-th iteration; $\alpha$ is the penalty factor; $\omega$ is the frequency variable; $\omega_k^{n}$ is the current center frequency of the k-th mode; and $n$ is the number of iteration steps. The center frequency is updated based on the spectral energy centroid of the mode: $\omega_k^{n+1} = \frac{\int_0^{\infty} \omega\,|\hat{u}_k^{n+1}(\omega)|^2\,d\omega}{\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2\,d\omega}$; in the formula, $\omega_k^{n+1}$ is the new center frequency of the k-th mode at the (n+1)-th iteration, reflecting the frequency-domain position of the dominant oscillation of this mode in the residual signal, and $\hat{u}_k^{n+1}(\omega)$ is the spectrum of the k-th mode at the (n+1)-th iteration.
15. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 1, characterized in that, in step 5, the runoff multi-component sequence and meteorological features are used as input data. The input data is constructed by dividing the time series composed of the runoff multi-component sequence and meteorological features into a training set, a validation set and a test set in a ratio of 7:2:1. The sliding time window method is used to construct training labels, and all features are standardized to eliminate dimensional differences.
16. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 1, characterized in that, In step 5, the time-series prediction model is DeformTime, which includes a variable deformable attention block V-DAB and a time deformable attention block T-DAB. In the variable dimension, the deformable attention learns feature offsets to dynamically focus on key driving factors; in the time dimension, the deformable attention learns location offsets to adaptively capture the confluence time lag effect between rainfall and runoff.
17. The runoff ensemble prediction method based on spatial heterogeneity and variable attention as described in claim 16, characterized in that the two-dimensional deformable offset of the V-DAB is calculated from the following quantities: the query vector generated within the p-th time segment after joint encoding of multiple meteorological variables through an embedding layer; a two-layer convolutional network that takes the query vector as input and outputs the position offset in both the time and variable dimensions; and a learnable scalar that controls the maximum offset magnitude.
18. The runoff ensemble prediction method based on spatial heterogeneity and variable attention as described in claim 16, characterized in that the one-dimensional time offset of the T-DAB is calculated from the following quantities: the g-th query subset, obtained by the neighborhood-aware input embedding module NAE after grouping according to the correlation between meteorological variables and runoff components; η_off, a one-dimensional convolutional network for offset prediction; and β, a learnable scaling factor that constrains the temporal offset range to maintain causality and stability.
19. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 18, characterized in that the NAE is used to preprocess the multi-source input variables for runoff prediction. First, the variables are rearranged according to their linear correlation with the target runoff component and divided into G adjacent groups. Each group is then embedded into a specified dimension through an independent fully connected layer, and the group embeddings are concatenated. Finally, sinusoidal positional encodings are injected to preserve the temporal order, and structured features are output.
20. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 1, characterized in that, In step 5, when training and optimizing the corresponding prediction models for different forecast periods, the model training uses mean squared error as the loss function, introduces an early stopping strategy to monitor the prediction error of the validation set, and dynamically adjusts the learning rate using the validation set; for different forecast periods, sub-models with independent weights are trained.
21. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 1, characterized in that, Step 6 specifically includes the following: A two-stage training strategy is adopted. In the first stage, the parameters of the prediction model for each runoff component are frozen, and the prediction results of the trend component, seasonal component and high-frequency disturbance component are output independently. The second stage constructs a component interaction coupling network, taking the predicted values of trend components and seasonal components as inputs and outputting a nonlinear correction factor for high-frequency disturbance components to achieve nonlinear modulation of runoff. Finally, the predicted values of each component are weighted and summed after nonlinear correction to generate the final runoff prediction result; This mechanism generates a multi-forecast-period ensemble prediction model. The model with a forecast period of 1 outputs the runoff value for the first day, the first month, or the first hour of the future. The model with a forecast period of 3 outputs the runoff value for the third day, the third month, or the third hour of the future, and so on, thus achieving high-precision joint runoff prediction across multiple time scales.
22. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 21, characterized in that the component interaction coupling network is constructed as follows: a multilayer perceptron structure containing fully connected layers and activation functions is built; the trend component prediction value $\hat{T}$ and the seasonal component prediction value $\hat{S}$ are concatenated as features and used as the input vector of the coupling network; the coupling network outputs a correction factor for the high-frequency disturbance components through nonlinear mapping; the final reconstruction formula is as follows: $\hat{Q} = \hat{T} + \hat{S} + \gamma \cdot \hat{H}$; in the formula, $\hat{Q}$ is the final predicted runoff value; $\hat{H}$ is the predicted value of the high-frequency disturbance components; and $\gamma$ is the nonlinear correction coefficient for runoff, whose range is constrained to the interval (0,1) by the Sigmoid activation function.
23. The runoff ensemble prediction method based on spatial heterogeneity and variable attention according to claim 21, characterized in that step 6 also includes: training and optimizing the corresponding component interaction coupling network for each forecast period, using mean squared error as the loss function for model training, calculating the error between the reconstructed runoff and the actual runoff, introducing an early stopping strategy to monitor the prediction error of the validation set, and dynamically adjusting the learning rate using the validation set; component interaction coupling models with independent weights are trained for the different forecast periods.