A gradient-constrained neural network-based short-term prediction method for Arctic sea ice

By constructing a gradient-constrained neural network model and combining multi-source meteorological and oceanographic data with deep learning methods, the uncertainties and computational resource problems in sea ice forecasting were solved, achieving efficient short-term sea ice forecasting with low resources, thus meeting the needs of Arctic navigation and resource development.

CN116401939B · Active · Publication Date: 2026-06-30 · NAT UNIV OF DEFENSE TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NAT UNIV OF DEFENSE TECH
Filing Date
2023-03-09
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing sea ice forecasting methods suffer from high uncertainty, large computational resource consumption, and difficulty in accurately predicting sea ice change patterns when faced with rapid changes and complex multi-scale physical processes in polar sea ice. In particular, traditional methods are unable to meet the needs of Arctic navigation decision-making when the melting pool phenomenon is severe in summer and autumn.

Method used

A gradient-constrained neural network model was constructed by combining multi-source meteorological and oceanographic data. Through spatiotemporal matching, factor selection, and gradient loss function optimization, short-term daily forecasts of sea ice concentration and thickness were achieved. The Grad-PredRNN network model was constructed by combining the nonlinear fitting capability of deep learning with meteorological and oceanographic expertise.

Benefits of technology

It achieves high-accuracy sea ice forecasting with low computational resource requirements, and can provide short-term daily forecast data support in the Arctic Ocean, providing a guarantee for Arctic navigation and resource development, and improving the accuracy and applicability of forecasts.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN116401939B_ABST

Abstract

This invention discloses a gradient-constrained neural network method for short-term forecasting of Arctic sea ice. The method includes: 1) preparing, extracting, and filtering the data, including acquiring multi-source data on Arctic sea ice concentration (SIC) and Arctic sea ice thickness (SIT); 2) preprocessing the acquired data, and spatiotemporally matching the Arctic sea ice thickness data, which lie on a different spatial grid, with the other element data; 3) constructing a gradient-constrained neural network model, namely Grad-PredRNN, in which the local gradient of sea ice constrains model convergence; setting up multiple control models; training the parameters on the preprocessed spatiotemporal sequence data under multiple training schemes; and saving the parameters of the best-performing model; 4) using the best trained model for short-term forecasting of sea ice in the Arctic Ocean, and comparing the forecasts with the actual results to verify forecast performance.

Description

Technical Field

[0001] This invention relates to a short-term forecasting method for Arctic sea ice using a gradient-constrained neural network, belonging to the field of deep learning applications in Arctic sea ice forecasting technology.

Background Technology

[0002] The Arctic region is perpetually covered by sea ice, playing a crucial role in global climate regulation. According to the latest IPCC-AR6 report, Arctic sea ice cover is declining year by year, and by 2050, it is highly likely that the entire Arctic Ocean will be ice-free during summer, with the melting trend of Arctic sea ice almost irreversible. On the one hand, rising temperatures cause the ice sheet to melt; on the other hand, the positive feedback effect of ice sheet melting exacerbates global warming.

[0003] The widespread melting of Arctic sea ice will have a significant impact on human production and daily life, such as the continued rise in sea level and the intensification of cold waves. However, it also presents opportunities for the development and utilization of Arctic resources. For example, as sea ice cover continues to decrease, the navigation time of Arctic shipping routes will be extended, and their commercial value will increase significantly. In particular, the Northeast Passage in the Arctic has partially thawed in the summer, allowing ships to navigate during the summer months. However, the development and utilization of the Arctic still faces many difficulties. On the one hand, the existing infrastructure is still incomplete, and emergency response and rescue capabilities are poor. On the other hand, the extreme weather conditions unique to the Arctic and the large amount of sea ice and floating ice increase the safety risks to ships. Therefore, sea ice forecasting in the Arctic Ocean is extremely important, especially weather-scale forecasts (1-10 days), which are crucial to navigation decisions and are currently a hot topic in "weather" navigation research in the Arctic Ocean.

[0004] Currently, sea ice forecasting mainly relies on two methods: numerical model forecasting and statistical forecasting. Deep learning is a relatively new statistical forecasting method. Numerical models, starting from known physical laws, simulate and forecast the aforementioned sea ice change processes based on the dynamics, thermodynamics, and thickness distribution of sea ice. Based on well-defined physical laws and considering various factors affecting sea ice changes, model forecasts can provide predictions with stable errors and strong interpretability.

[0005] However, the rapid changes in polar sea ice and their evolving physical properties in recent years have significantly increased the uncertainty of physical parameterization schemes and simulation results in sea ice models, placing higher demands on model resolution. Furthermore, many phenomena in nature lack clearly defined physical laws, such as small-scale sea ice dynamics and deformation, as well as variations in sea ice extent. In addition, parameterization schemes for coupling different models require further refinement, and the computational power of computers and the feasibility of deploying polar marine observation equipment both constrain the development of numerical sea ice models.

[0006] Statistical forecasting, limited by both algorithmic constraints and insufficient data, had a relatively late start, and its application in sea ice forecasting still has significant room for research. Only in the last decade, with the development of remote sensing satellite observation technology, has the acquisition of sea ice observation data become increasingly easier, and this data is publicly available, providing data support for the further application of statistical methods in sea ice forecasting. However, due to the highly nonlinear nature of sea ice changes, traditional statistical methods, limited by the models themselves, struggle to accurately fit the patterns of sea ice variation.

[0007] In recent years, deep learning algorithms have gradually attracted researchers' attention due to their excellent nonlinear fitting capabilities. Deep learning excels at extracting patterns of change from data, simulating complex dynamic systems or nonlinear changes, aligning with people's understanding of natural laws, and can achieve or even surpass the simulation effects of numerical models. While deep learning is widely and maturely applied in other fields, such as computer image recognition and natural language processing, its application in sea ice forecasting was previously limited by the availability of observational data, preventing it from achieving ideal forecasting results. However, the public release of a large amount of satellite observation data and high-quality reanalysis data has provided fertile ground for the development of deep learning in sea ice forecasting.

[0008] Compared to model forecasting, deep learning, as a statistical forecasting method, not only bypasses the development challenges of numerical models but also yields fast and accurate forecasts with less computational resources and is easier to deploy. Compared to traditional statistical forecasting, deep learning algorithms have higher complexity, stronger fitting capabilities, and are better suited for extracting sea ice variation patterns from long-term, high-resolution data.

[0009] Existing research indicates that sea ice forecasting models built using deep learning methods have significant advantages over traditional statistical forecasting models. By deeply analyzing the spatiotemporal information of sea ice concentration data, they can effectively characterize the changing patterns of sea ice concentration and improve the effectiveness of sea ice concentration forecasts. However, due to the complex multi-scale physical processes of sea ice changes, and the close correlation between sea ice changes at different scales and changes in the atmospheric and oceanic environment, relying solely on the analysis of sea ice concentration data patterns is insufficient, especially during the summer and autumn seasons when melt pooling is more severe.

[0010] Therefore, in order to overcome the limitations of deep learning in sea ice forecasting, this invention, supported by professional knowledge in the fields of meteorology and oceanography, and combined with the time-recurrent network model, which has a relatively high performance in the field of deep learning, proposes a gradient-constrained neural network method for short-term forecasting of Arctic sea ice, which can realize short-term daily forecasts of sea ice concentration and thickness.

Summary of the Invention

[0011] The purpose of this invention is to provide a short-term forecasting method for Arctic sea ice using gradient-constrained neural networks. This method combines deep learning with meteorological and oceanographic expertise from the perspective of multi-factor and physical mechanism constraints to achieve short-term daily forecasts of sea ice concentration and thickness. This provides forecast data support for Arctic meteorological navigation and ensures future development of Arctic shipping routes and Arctic resources.

[0012] The present invention provides a short-term forecasting method for Arctic sea ice based on gradient-constrained neural networks, comprising the following steps:

[0013] 1) Prepare, extract, and filter the data. This includes the following two parts: 1-1) Obtain multi-source data products for Arctic sea ice concentration (SIC) and Arctic sea ice thickness (SIT). The SIC data come from ERA5 (ECMWF Reanalysis v5), the fifth-generation reanalysis product of the European Centre for Medium-Range Weather Forecasts (ECMWF), which assimilates various satellite radiometer and scatterometer observations. The SIT data come from the PIOMAS (Pan-Arctic Ice Ocean Modeling and Assimilation System) model product, an ice-ocean model product that assimilates Arctic sea surface temperature and sea ice concentration. Both are daily gridded data, but their spatial grids differ.

[0014] 1-2) Acquire various Arctic meteorological and oceanographic data related to Arctic sea ice changes, including Arctic sea surface temperature (SST), vertical heat flux (VHF), vertical water vapor flux (VMF), mean sea level pressure (MSL), 10-meter wind speed (WIND), 2-meter air temperature (T2M), albedo (AL), low cloud cover (LCC), and skin temperature (SKT), using daily gridded data from ERA5;

[0015] The above data are all publicly available datasets that can be accessed online. The extracted data includes information such as sea ice thickness, latitude and longitude, and sampling time.

[0016] 2) Preprocess the acquired data from various sources, and perform spatiotemporal matching of Arctic sea ice thickness data with different spatial grid points with other element data; perform dimensionality reduction on the data through factor screening, and retain only the elements that are highly correlated with Arctic sea ice changes; including the following three parts: 2-1) Grid matching: perform grid matching on data of different resolutions, and perform nearest neighbor interpolation on Arctic sea ice thickness data to unify it with elements such as Arctic sea ice concentration to the same grid point;

[0017] 2-2) Dimensional unification: Bring the different elements to a common scale. Taking sea ice concentration forecasting as an example, the other meteorological and oceanographic elements are first made dimensionless and then scaled to the same range as sea ice concentration.

[0018] 2-3) Data dimensionality reduction: To remove redundant information and reduce data dimensionality, the input factors of the model are screened. The time-lag correlation coefficient between each element at time t-1 and the sea ice element at time t is calculated, the factors are sorted in descending order of the absolute value of this coefficient, and the top 50% of factors are selected as the input factors of the model.

[0019] 3) Construct a gradient-constrained neural network model, namely Grad-PredRNN, using the local gradient of sea ice to constrain model convergence; set up multiple control models and select the optimal model; including the following three parts:

[0020] 3-1) Network construction: Based on the TensorFlow platform, a basic time-recurrent predictive neural network model is built. The specific model hyperparameters are shown in Table 1. The model extracts spatial features through a three-layer convolutional network and uses information from 10 historical time points to predict subsequent sea ice information.

[0021] 3-2) Gradient constraint: Considering the local variation of sea ice, a gradient loss function is proposed, and the convergence direction of the model parameters is constrained by the local gradient. A Grad-PredRNN network model suitable for short-term sea ice forecasting is constructed.

[0022] 3-3) Model training: The preprocessed spatiotemporal sequence data related to Arctic sea ice are used for parameter training. Multiple training schemes are designed. Based on the forecast lead time, the MAE, and the spatial distribution of the error, the effect of introducing local gradient constraints is tested, and the forecast performance of the time-recurrent predictive neural network is compared with that of the convolutional neural network. The best-performing model is selected and its parameters are saved.

[0023] Multiple sets of control models are set up and the optimal one is selected: the preprocessed spatiotemporal sequence data are used for parameter training under multiple training schemes, and the parameters of the best-performing model are saved;

[0024] 4) The best model obtained from the training was used for short-term sea ice forecasting in the Arctic Ocean, and the forecast results were compared with the actual results to verify the forecast effect.
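As a rough illustration of how step 3-1) could turn the daily fields into training samples, the sketch below slices a time series into windows of 10 historical time points and the days that follow. The function name `make_windows`, the array layout (time, lat, lon, channels), and the 10-day target length are assumptions for illustration; the patent text does not specify them.

```python
import numpy as np

def make_windows(series, n_in=10, n_out=10):
    """Split a (time, lat, lon, channels) array into input/target windows.

    n_in=10 follows the 10 historical time points in the text; n_out=10 is
    an assumption based on the stated maximum forecast period of 10 days.
    """
    series = np.asarray(series)
    n_samples = series.shape[0] - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested windows")
    # Sample k uses days k .. k+n_in-1 as input and the next n_out days as target
    x = np.stack([series[k:k + n_in] for k in range(n_samples)])
    y = np.stack([series[k + n_in:k + n_in + n_out] for k in range(n_samples)])
    return x, y

# Hypothetical data: 30 days on a 4x5 grid with 5 input factors
data = np.random.default_rng(0).random((30, 4, 5, 5))
x, y = make_windows(data)
```

Each sample is built only from days that precede its targets, so a split by date (for example, training on earlier years and testing on a later one) keeps the test period unseen.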

[0025] 1) The data characteristics of the data products used.

[0026] We collected daily data products from PIOMAS and ERA5 to identify the optimal multifactor forecasting model, reserving data from a subset of years for model performance validation. ERA5 data was used to extract information on SIC, SST, VHF, VMF, MSL, WIND, T2M, AL, LCC, and SKT, while PIOMAS data was used to extract information on SIT. Both ERA5 and PIOMAS data products are daily horizontal gridded data.

[0027] 2) Data preprocessing before model training;

[0028] Because the spatial grid points of different datasets do not match, it is necessary to unify them to the same grid points. Bilinear interpolation is used to interpolate the PIOMAS SIT data onto the ERA5 spatial grid, ensuring that all sea ice and meteorological / oceanographic elements are uniformly formatted as ERA5. Bilinear interpolation involves calculating a total of three single-linear interpolations in two directions. First, two single-linear interpolations are performed in the x-direction to obtain two temporary points, R1(x, y1) and R2(x, y2). Then, one single-linear interpolation is performed in the y-direction to obtain f(x, y). The calculation formula is as follows:

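[0029]

$$
\begin{aligned}
f(R_1) &= \frac{x_2 - x}{x_2 - x_1} f(Q_{11}) + \frac{x - x_1}{x_2 - x_1} f(Q_{21}), & R_1 &= (x, y_1) \\
f(R_2) &= \frac{x_2 - x}{x_2 - x_1} f(Q_{12}) + \frac{x - x_1}{x_2 - x_1} f(Q_{22}), & R_2 &= (x, y_2) \\
f(x, y) &= \frac{y_2 - y}{y_2 - y_1} f(R_1) + \frac{y - y_1}{y_2 - y_1} f(R_2)
\end{aligned}
$$

[0030] Here $Q_{ij} = (x_i, y_j)$, i = 1, 2, j = 1, 2, are the four grid points surrounding the interpolation point; $x_1$ and $x_2$ are the x-coordinates of the surrounding grid points; $y_1$ and $y_2$ are the y-coordinates of the surrounding grid points; and f(x, y) is the result of the interpolation.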
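The three linear interpolations described above can be sketched for a single target point as follows. This is a minimal illustration, not the patent's implementation: the function name `bilinear`, the argument order, and the sample thickness values are hypothetical, and `qij` is taken to be the field value at grid point (x_i, y_j).

```python
import numpy as np

def bilinear(x, y, x1, x2, y1, y2, q11, q21, q12, q22):
    """Bilinear interpolation at (x, y) inside the cell [x1, x2] x [y1, y2].

    qij is the field value at grid point (xi, yj).
    """
    # Two linear interpolations in the x-direction give R1=(x, y1), R2=(x, y2)
    wx = (x - x1) / (x2 - x1)
    r1 = (1.0 - wx) * q11 + wx * q21
    r2 = (1.0 - wx) * q12 + wx * q22
    # One linear interpolation in the y-direction gives f(x, y)
    wy = (y - y1) / (y2 - y1)
    return (1.0 - wy) * r1 + wy * r2

# Hypothetical sea ice thickness values (m) at the four surrounding points
sit = bilinear(0.25, 0.5, 0.0, 1.0, 0.0, 1.0, q11=1.0, q21=2.0, q12=3.0, q22=4.0)
corner = bilinear(0.0, 0.0, 0.0, 1.0, 0.0, 1.0, q11=1.0, q21=2.0, q12=3.0, q22=4.0)
```

At a grid point the function returns the grid value itself, which is a quick sanity check when regridding one product onto another.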

[0031] After spatial grid matching, input factors are screened. This removes redundant information and reduces data dimensionality, improving model convergence speed. Taking the short-term forecast of Arctic sea ice concentration (SIC) as an example, the time-lag correlation coefficient between SIC at time t and each of the 10 input factors (VHF, VMF, MSL, etc.) at time t-1 is calculated. The absolute values of the time-lag correlation coefficients are sorted from largest to smallest, and the top 50% of factors are selected as input factors for the model.

[0032] 3) The design of the network model;

[0033] In deep learning algorithms, based on the loss function MAE-loss, gradient constraints are used to reflect local changes in sea ice, thus improving the loss function and proposing Grad-loss, whose calculation formula is as follows.

[0034]

[0035] In the above formula, $\mathrm{Grad}_{lat}$ and $\mathrm{Grad}_{lon}$ denote the gradients in the meridional and zonal directions, respectively, and std denotes the standard deviation. $y_{(i,j)}$ denotes the value at a specific grid point of the spatial field, $y$ denotes the entire spatial field of the element, and $\hat{y}$ denotes the forecast of the element. To give MAE-loss and Grad-loss the same dimensions, the forecast error and the local gradient error are divided by the std of SIC and the std of the local gradient, respectively.

[0036] In deep learning algorithms, the loss functions used are mostly MSE-loss or MAE-loss. Their formulas are shown below.

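[0037]

$$\mathrm{MSE} = \frac{1}{N} \sum_{(i,j)} \left( y_{(i,j)} - \hat{y}_{(i,j)} \right)^2$$

[0038]

$$\mathrm{MAE} = \frac{1}{N} \sum_{(i,j)} \left| y_{(i,j)} - \hat{y}_{(i,j)} \right|$$

where N is the number of grid points, $y_{(i,j)}$ is the true value at grid point (i, j), and $\hat{y}_{(i,j)}$ is the predicted value at the same grid point.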

[0039] However, both MSE and MAE compare the true and predicted values grid-by-grid. During network training, these loss functions can only compare the difference between the true SIC and the predicted SIC at the same grid point. In reality, different sea areas respond differently to different environmental factors; that is, local variations in sea ice can also significantly impact prediction performance.
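A small numerical illustration of this limitation, using made-up values along a one-dimensional transect across an ice edge: two forecasts have the same point-wise MAE, yet one preserves the local structure of the field far better than the other. The arrays and the use of `np.diff` as a stand-in for the local gradient are assumptions for illustration only.

```python
import numpy as np

def mae(a, b):
    """Mean absolute error, compared point by point."""
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

# Hypothetical SIC values along a transect crossing the ice edge
truth = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
forecast_a = np.array([0.9, 0.9, 0.9, 0.1, 0.1, 0.1])   # smooth bias
forecast_b = np.array([1.0, 0.8, 1.0, 0.2, 0.0, 0.2])   # noisy, same MAE

mae_a, mae_b = mae(truth, forecast_a), mae(truth, forecast_b)

# Error of the local (finite-difference) gradient along the transect
grad_a = mae(np.diff(truth), np.diff(forecast_a))
grad_b = mae(np.diff(truth), np.diff(forecast_b))
```

Both forecasts score an MAE of about 0.1, but the gradient error of the noisy forecast is several times larger, which a purely grid-by-grid loss cannot see.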

[0040] Therefore, in deep learning algorithms, based on the loss function MAE-loss, gradient constraints are used to reflect local changes in sea ice, and the loss function is improved, thus proposing Grad-loss;

[0041] The idea behind the gradient-constrained neural network method for short-term Arctic sea ice forecasting in this invention is:

[0042] Based on the PredRNN++ deep learning model, a short-term daily sea ice forecast model is constructed by incorporating various meteorological and oceanographic factors related to sea ice changes. Meteorological and oceanographic elements, as well as sea ice concentration and thickness, are extracted from multi-source meteorological and oceanographic datasets (ERA5 and PIOMAS). Factors with strong correlation to sea ice changes are selected by calculating time-lag correlation coefficients, thus reducing data dimensionality. Grad-loss is proposed to improve the existing PredRNN++ network model, considering the local gradient variation pattern of sea ice. Comparative experiments are designed to evaluate the forecasting performance of each model using three evaluation metrics, and the best-performing short-term sea ice forecast model is selected.

[0043] 4) In this section, the model's forecast performance is verified and the optimal model is selected.

[0044] Comparative experiments were conducted on various models to compare the performance of the PredRNN++ model and the ConvLSTM model, as well as the performance of the MAE loss function and the Grad-loss loss function. Each comparative model can be used for short-term daily forecasting of sea ice, with a maximum forecast period of 10 days.

[0045] The optimal forecasting model was selected through three comparisons: 1. Comparison with the MAE of the monthly average of the sea ice elements, to verify the effectiveness of daily forecasts; 2. Comparison with daily real data, calculating the MAE and the structural similarity (SSIM), to verify the model's forecasting performance in the time dimension; 3. Comparison with the spatial field of daily data, calculating the spatial difference (DIFF) between the forecast results and the real data, to verify the model's forecasting performance in the spatial dimension. Based on the above three comparison results, the optimal forecasting model was selected.
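The three indices named above can be sketched as follows. This is an illustrative simplification rather than the evaluation code used in the patent: the function names are hypothetical, DIFF is taken as forecast minus truth, and the SSIM here uses a single window covering the whole field, whereas the usual SSIM averages the same statistic over local sliding windows.

```python
import numpy as np

def mae(truth, pred):
    """Mean absolute error over all grid points."""
    return float(np.mean(np.abs(pred - truth)))

def diff_field(truth, pred):
    """Spatial difference (DIFF): forecast minus truth at every grid point."""
    return pred - truth

def global_ssim(truth, pred, data_range=1.0):
    """Single-window SSIM over the whole field (simplified, see lead-in)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_t, mu_p = truth.mean(), pred.mean()
    var_t, var_p = truth.var(), pred.var()
    cov = np.mean((truth - mu_t) * (pred - mu_p))
    return float(((2 * mu_t * mu_p + c1) * (2 * cov + c2))
                 / ((mu_t ** 2 + mu_p ** 2 + c1) * (var_t + var_p + c2)))

# Hypothetical SIC field in [0, 1] and a slightly biased forecast
rng = np.random.default_rng(0)
truth = rng.random((8, 8))
pred = np.clip(truth + 0.05, 0.0, 1.0)

scores = {
    "MAE": mae(truth, pred),
    "SSIM": global_ssim(truth, pred),
    "DIFF": diff_field(truth, pred),
}
```

MAE and SSIM give one number per forecast day, which suits comparison over lead time, while DIFF keeps the full map and shows where the errors sit.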

[0046] Therefore, this invention examines the forecasting performance of each model through comparison in three aspects and selects the optimal forecasting model.

[0047] Beneficial Effects: This invention provides a gradient-constrained neural network-based short-term forecasting method for Arctic sea ice. Compared to model forecasting, this method requires less computational equipment, can be carried on board ships, and is convenient for practical deployment. Compared to autoregressive forecasting of sea ice elements, this method, based on multiple meteorological and oceanographic elements, rationally selects relevant factors and constructs a multi-factor forecasting model to improve the accuracy of sea ice element forecasts. Compared to existing sea ice forecasting research, this method achieves short-term daily forecasts of sea ice elements, providing data support for Arctic meteorological navigation research and supporting the implementation planning of Arctic meteorological navigation.

[0048] Figure and Table Description

[0049] Figure 1 is a technical roadmap of the gradient-constrained neural network method for short-term forecasting of Arctic sea ice according to the present invention.

[0050] Figure 2 shows the time-lag correlation coefficients between the various meteorological and oceanographic elements and SIC. A stronger warm color indicates a greater positive time-lag correlation between the factor and SIC, while a stronger cool color indicates a greater negative time-lag correlation.

[0051] In Figure 3, the magenta dashed line represents the monthly average MAE over the continuous 10-day forecast period in October 2020, while the other eight solid lines represent the average MAE of each comparative experimental model's continuous 10-day forecast in October.

[0052] In Figure 4, a and b show the SSIM of the single-factor and multi-factor models, respectively. DIFF is obtained by subtracting the single-factor model from the multi-factor model; light red represents positive values, and dark blue represents negative values.

[0053] In Figure 5, a and b show the MAE of the single-factor and multi-factor models, respectively. DIFF is calculated in the same way as in Figure 4.

[0054] In Figure 6, a and b show the SSIM of the MAE-loss and Grad-loss models, respectively. DIFF is obtained by subtracting the MAE-loss model from the Grad-loss model.

[0055] In Figure 7, a and b show the MAE of the MAE-loss and Grad-loss models, respectively. DIFF is calculated in the same way as in Figure 6.

[0056] In Figure 8, a and b show the SSIM of the ConvLSTM and PredRNN++ models, respectively. DIFF is obtained by subtracting the ConvLSTM model from the PredRNN++ model.

[0057] In Figure 9, a and b show the MAE of the ConvLSTM and PredRNN++ models, respectively. DIFF is calculated in the same way as in Figure 8.

[0058] Figure 10 compares the forecast results of the various models and their DIFF for July 15, 2020. In Figure 10: a shows the historical SIC data; b shows the prediction results of ConvLSTM; c shows the prediction results of PredRNN++; d shows the DIFF between the July monthly average SIC and a; e shows the DIFF obtained by subtracting a from b; and f shows the DIFF obtained by subtracting a from c.

[0059] Table 1 shows the comparative experimental design of various SIC prediction models. The prediction performance of SIC is compared between autoregressive and multifactor regression; the prediction performance of MAE-loss and Grad-loss is compared; and the prediction performance of ConvLSTM and PredRNN++ models is compared, with labels provided for each.

Specific Implementation

[0060] To make the objectives, technical solutions, and advantages of this invention clearer, the method of this invention will be further described in detail below with reference to practical examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

[0061] A gradient-constrained neural network method for short-term forecasting of Arctic sea ice.

[0062] The technical solution adopted in this invention is as follows ( Figure 1 ):

[0063] Step 1: Data Extraction. Acquire multi-source data products for Arctic sea ice concentration (SIC) and Arctic sea ice thickness (SIT), including PIOMAS data (an ice-ocean model product that assimilates Arctic sea surface temperature and sea ice concentration) and ERA5 reanalysis data (which assimilates various satellite radiometer and scatterometer observations). Both products are daily gridded data, but their spatial grids differ. Extract SIC, SST, and the other meteorological and oceanographic elements related to sea ice changes from the ERA5 dataset, and sea ice thickness from the PIOMAS dataset. All of these products are publicly available online datasets. Based on actual needs, select daily data within a specified time and spatial range, and extract element information and latitude / longitude information from the datasets.

[0064] Step 2: Data Preprocessing. Because the elements in multi-source datasets are calculated and stored differently, their spatial grid points and temporal records differ, so the extracted data must be matched in space and time. This invention employs bilinear interpolation to interpolate PIOMAS data onto the standard ERA5 spatial grid. Furthermore, land can introduce outliers in sea ice elements and in some meteorological and oceanographic elements, so the land influence must be removed. This can be achieved by setting the land portion of sea ice elements to 0 and setting land outliers in other elements to large negative values, enabling the model to identify or remove the land influence.

[0065] Step 3: Factor Screening. Calculate the time-lag correlation coefficients between each meteorological and oceanographic element and the sea ice element, as well as the time-lag correlation coefficients of the sea ice element itself. Based on the absolute values of the time-lag correlation coefficients, select factors with high correlation for use in training subsequent sea ice forecasting models.

[0066] Step 4: Design the implementation of Grad-loss: calculate the std of the sea ice elements and the std of the local gradient in the meridional and zonal directions, and use them to make the forecast error and the local gradient error in Grad-loss dimensionless.

[0067] Step 5: Design multiple sets of comparative experiments to compare the impact of different network models and different loss functions on the prediction results. Based on the experimental design, construct the network model and set hyperparameters such as the number of network layers, nonlinear activation functions, number and size of convolutional kernels.

[0068] Step 6: Calculate the results of various evaluation indicators such as MAE, SSIM, and DIFF, analyze the forecasting performance of each model based on the results, and select the optimal sea ice forecasting model for daily sea ice forecasting in the selected area.

[0069] More specifically: Taking short-term daily forecasts of sea ice concentration as an example:

[0070] Step 1: Define the time range of the data: January 2010 to December 2019 is used as the training set, and January to December 2020 as the test set. Define the spatial range of the data as longitude 120°~180°E and latitude 66.75°N~83°N. Data for the corresponding time and spatial ranges are extracted from the ERA5 dataset, including meteorological and oceanographic elements such as sea surface temperature (SST), vertical heat flux (VHF), vertical water vapor flux (VMF), mean sea level pressure (MSL), 10-meter wind speed (WIND), 2-meter air temperature (T2M), albedo (AL), low cloud cover (LCC), and skin temperature (SKT), as well as sea ice concentration (SIC). The spatial resolution of each element is 0.25°×0.25°, and the temporal resolution is 1 day.

[0071] Step 2: Since the sea ice concentration data and other meteorological and oceanographic elements in this example all come from the same dataset, there is no spatiotemporal matching issue. However, the SIC data contains the influence of land. To remove the influence of land, the NaN value of land is set to 0, assuming that there is no sea ice on land, which is consistent with the actual situation.
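A minimal sketch of the land handling, covering both the SIC case in this example (land NaN set to 0) and the "large negative value" treatment described for other elements in the general method. The function name `mask_land`, the sentinel value -9999.0, and the sample arrays are assumptions; the text does not give a specific fill value.

```python
import numpy as np

def mask_land(field, is_sea_ice, land_fill=-9999.0):
    """Replace land NaNs: 0 for sea ice elements, a large negative otherwise.

    land_fill=-9999.0 is an assumed sentinel; the text only says
    "large negative values".
    """
    field = np.asarray(field, dtype=float)
    fill = 0.0 if is_sea_ice else land_fill
    return np.where(np.isnan(field), fill, field)

# Hypothetical 2x2 fields in which the top-right cell is land
sic = np.array([[0.8, np.nan], [0.3, 0.0]])
sst = np.array([[271.5, np.nan], [272.0, 273.1]])

sic_clean = mask_land(sic, is_sea_ice=True)
sst_clean = mask_land(sst, is_sea_ice=False)
```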

[0072] Step 3: Calculate the time-lag correlation coefficient between SIC at time t and each input factor (SST, VHF, VMF, MSL, WIND, T2M, AL, LCC, SKT, and SIC) at time t-1. The calculation formula is as follows:

[0073]

$$\rho = \frac{\operatorname{cov}\left(SIC_t, X_{t-1}\right)}{\sqrt{\delta^2\left(SIC_t\right)\,\delta^2\left(X_{t-1}\right)}}$$

[0074] In the above formula, ρ is the time-lag correlation coefficient, $SIC_t$ is the SIC value at time t, $X_{t-1}$ is the value of the input factor at time t-1, cov denotes the covariance, and $\delta^2$ denotes the variance. Based on the above formula, the time-lag correlation coefficients of each element are calculated (Figure 2). Based on their absolute values, the top 50% of factors were selected. Therefore, SST, MSL, T2M, SKT, and SIC were chosen as input factors for subsequent model training.
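The screening in Step 3 can be sketched as below, assuming each element has first been reduced to a single time series (for example, an area mean); the text does not say whether the correlation is computed per grid point or on aggregated series. The function names and the synthetic series are hypothetical and carry no physical meaning.

```python
import numpy as np

def lag1_correlation(target, factor):
    """Pearson correlation between target[t] and factor[t-1]."""
    target = np.asarray(target, dtype=float)
    factor = np.asarray(factor, dtype=float)
    y, x = target[1:], factor[:-1]          # pair time t with time t-1
    cov = np.mean((y - y.mean()) * (x - x.mean()))
    return float(cov / np.sqrt(y.var() * x.var()))

def select_top_half(target, factors):
    """Rank factors by |rho| and keep the top 50%."""
    rho = {name: lag1_correlation(target, s) for name, s in factors.items()}
    ranked = sorted(rho, key=lambda name: abs(rho[name]), reverse=True)
    return ranked[: len(ranked) // 2], rho

# Synthetic series for illustration only
rng = np.random.default_rng(0)
sic = np.cumsum(rng.normal(size=200))
factors = {
    "SIC": sic,                                      # lagged autocorrelation
    "SKT": -sic + rng.normal(scale=0.5, size=200),   # strong negative link
    "LCC": rng.normal(size=200),                     # unrelated noise
    "WIND": rng.normal(size=200),                    # unrelated noise
}
selected, rho = select_top_half(sic, factors)
```

Ranking by absolute value keeps strongly negative factors such as the synthetic `SKT` series, which a ranking by signed correlation would discard.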

[0075] Step 4: Construct the Grad-loss function based on the following formula:

[0076]

[0077] In the above formula, $\mathrm{Grad}_{lat}$ and $\mathrm{Grad}_{lon}$ denote the gradients of SIC in the meridional and zonal directions, respectively, and std denotes the standard deviation. $y_{(i,j)}$ is the SIC value at the corresponding grid point, $y$ is the entire SIC spatial field, and $\hat{y}$ is the predicted SIC. To give MAE and Grad the same dimensions, the std of SIC and the std of the meridional and zonal gradients are calculated and used to make the MAE loss and the gradient loss dimensionless.
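Because the Grad-loss formula itself is not reproduced in this text, the following is only one plausible reading of the description, not the patented formula. It assumes equal weights on three terms (value error, meridional gradient error, zonal gradient error), first-order finite differences as the local gradient, and division by the std of the reference field to make each term dimensionless. The function name `grad_loss`, the `eps` guard, and the test fields are hypothetical.

```python
import numpy as np

def grad_loss(y_true, y_pred, eps=1e-8):
    """One possible gradient-constrained loss on (lat, lon) fields.

    Assumed form: MAE(y)/std(y) + MAE(grad_lat)/std(grad_lat)
                  + MAE(grad_lon)/std(grad_lon), with equal weights.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    def scaled_mae(a, b):
        # Dividing by the std of the reference makes each term dimensionless
        return np.mean(np.abs(a - b)) / (np.std(a) + eps)

    value_term = scaled_mae(y_true, y_pred)
    lat_term = scaled_mae(np.diff(y_true, axis=0), np.diff(y_pred, axis=0))
    lon_term = scaled_mae(np.diff(y_true, axis=1), np.diff(y_pred, axis=1))
    return float(value_term + lat_term + lon_term)

# Hypothetical SIC-like field and two forecasts with the same MAE (0.05)
lat = np.linspace(0.0, 1.0, 5) ** 2
lon = np.linspace(0.5, 1.0, 6)
truth = 0.1 + 0.8 * np.outer(lat, lon)
ii, jj = np.indices(truth.shape)
sign = np.where((ii + jj) % 2 == 0, 1.0, -1.0)
biased = truth + 0.05            # uniform offset, gradients unchanged
checker = truth + 0.05 * sign    # checkerboard error, gradients distorted

loss_biased = grad_loss(truth, biased)
loss_checker = grad_loss(truth, checker)
```

Under these assumptions the checkerboard forecast is penalized more heavily than the uniformly biased one even though their MAE is identical, which is the behavior the gradient constraint is meant to provide.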

[0078] Step 5: This example uses the TensorFlow open-source deep learning platform to build two network models, ConvLSTM and PredRNN++, for subsequent comparative experiments. The main computation process of ConvLSTM is as follows:

[0079]

[0080]

[0081]

[0082]

[0083]

[0084] In the formula, i is the input gate, f is the forget gate, o is the output gate, C is the memory unit, H is the hidden state, W is the weight, X is the input data, and b is the bias. ∘ denotes the Hadamard product, and * denotes convolution. To ensure that the outputs C and H maintain the same dimensions as the input X, zero-padding is required during the convolution calculation.
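For reference, the standard ConvLSTM formulation from the literature (Shi et al., 2015), written with the symbols defined above, is given below. The weight subscripts are a naming assumption, and the terms with $W_{ci}$, $W_{cf}$, and $W_{co}$ are the peephole connections of the original formulation, which many implementations omit; whether this example includes them is not stated in the text.

$$
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
$$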

[0085] Compared to ConvLSTM, PredRNN++ primarily improves the network structure. By adding a GHU (Gradient Highway Unit) layer, neurons at different time points are connected, allowing feature information from historical time points to be directly passed to subsequent time points. The formula for calculating the GHU is as follows:

[0086] $$P_t = \tanh(W_{px} * X_t + W_{pz} * Z_{t-1})$$

[0087] $$S_t = \sigma(W_{sx} * X_t + W_{sz} * Z_{t-1})$$

[0088] $$Z_t = S_t \circ P_t + (1 - S_t) \circ Z_{t-1}$$

[0089] In the above formulas, S is the switch gate, P is the input of the GHU layer, and Z is the hidden state of the GHU layer. Passing information through the GHU layer alleviates the vanishing-gradient problem to some extent.
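To illustrate how the switch gate lets historical information pass through, the sketch below applies one GHU update element by element, assuming the update rule published for PredRNN++, $Z_t = S_t \circ P_t + (1 - S_t) \circ Z_{t-1}$. Scalar weights stand in for the convolution kernels, which is a simplification made only for this example; the function and parameter names are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def ghu_step(x, z_prev, w_px, w_pz, w_sx, w_sz):
    """One GHU update on flat lists; scalar weights replace convolutions."""
    z_next = []
    for x_i, z_i in zip(x, z_prev):
        p = math.tanh(w_px * x_i + w_pz * z_i)   # transformed input P_t
        s = sigmoid(w_sx * x_i + w_sz * z_i)     # switch gate S_t in (0, 1)
        z_next.append(s * p + (1.0 - s) * z_i)   # blend new and previous state
    return z_next

x = [1.0, 2.0]
z_prev = [0.3, 0.7]
# Gate driven towards 0: the previous state is carried forward almost unchanged.
closed = ghu_step(x, z_prev, w_px=0.5, w_pz=0.5, w_sx=-50.0, w_sz=-50.0)
# Gate driven towards 1: the state is replaced by the transformed input.
opened = ghu_step(x, z_prev, w_px=0.5, w_pz=0.5, w_sx=50.0, w_sz=50.0)
```

When the gate is near zero the state is copied forward with no squashing nonlinearity in the path, which is why this kind of shortcut helps gradients survive over many time steps.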

[0090] Based on the above calculation process, this example constructs a three-layer network for each of ConvLSTM and PredRNN++. The first layer is the feature extraction layer, which extracts feature maps from the input data; the second layer is the hidden layer, which performs deep feature extraction on the feature maps; the third layer is the output layer, which compares the extracted features with the label data to obtain the final output. The hyperparameter settings of the network model are shown in Table 1: the three layers use 128, 128, and 64 filters with kernel sizes of 5, 3, and 3, respectively; the activation function is ReLU, the optimizer is Adam, and the training batch size is 32.

[0091] To select the optimal forecasting model, this example designed eight SIC comparison models, mainly comparing three aspects: the forecasting performance of SIC autoregressive and multifactor regression; the forecasting performance of MAE-loss and Grad-loss; and the forecasting performance of ConvLSTM and PredRNN++ models. The design of each forecasting model is shown in Table 2 below.
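The eight comparison models are consistent with a full 2 x 2 x 2 design over the three aspects named above. Since Table 2 is not reproduced in this text, the enumeration below is an assumption about how the eight configurations are formed, and the labels are illustrative.

```python
from itertools import product

inputs = ("single-factor", "multi-factor")   # SIC autoregression vs. multifactor
losses = ("MAE-loss", "Grad-loss")
networks = ("ConvLSTM", "PredRNN++")

# One configuration per combination of input set, loss, and network.
experiments = [
    {"inputs": i, "loss": l, "network": n}
    for i, l, n in product(inputs, losses, networks)
]
```

Holding two of the three settings fixed and varying the third yields the pairwise comparisons described in the text.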

[0092] Step 6: To verify the forecast performance of each comparative model, this example compares the models' forecast timeliness (effective lead time), average forecast error, and spatial distribution of error by calculating the MAE, SSIM, and DIFF indices. These indices are also used to compare single-factor against multi-factor inputs, MAE-loss against Grad-loss, and the two network models.

[0093] Forecast timeliness is measured by MAE and compared with the monthly average MAE of SIC, shown as the dashed line in Figure 3. Judging by the MAE growth of each model, the worst model has a forecast timeliness of only 6 days, while the Grad-PredRNN++ model has the best, reaching 9 days; that is, its forecast error on the 9th day is still smaller than the monthly average change for that month.
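The timeliness criterion described above can be sketched as a simple threshold test: count the lead days, starting from day 1, for which the forecast MAE stays below the monthly-average MAE. The function name and the numbers are illustrative assumptions, not values from the patent.

```python
def effective_lead_time(daily_mae, monthly_mae):
    """Number of consecutive lead days whose MAE is below the monthly baseline."""
    days = 0
    for err in daily_mae:
        if err >= monthly_mae:
            break               # the forecast no longer beats the monthly mean
        days += 1
    return days

# Illustrative MAE values for lead days 1..5 and a monthly-mean baseline.
example_mae = [0.01, 0.02, 0.03, 0.05, 0.04]
example_lead = effective_lead_time(example_mae, monthly_mae=0.045)
```

Stopping at the first exceedance means a later day that happens to dip below the baseline again (day 5 here) does not extend the reported lead time.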

[0094] The average forecast error was measured by MAE and SSIM, and divided into three groups to compare the forecast performance of single-factor versus multi-factor, MAE-loss versus Grad-loss, and ConvLSTM versus PredRNN++.

[0095] The comparison between single-factor and multi-factor forecasts is shown in Figures 4 and 5. Multi-factor forecasting models exhibit higher spatial structure similarity and lower errors, and their advantage is most pronounced in summer, when sea ice variation patterns are complex. In other words, multi-factor forecasting models are superior to single-factor models.

[0096] The comparison between MAE-loss and Grad-loss is shown in Figures 6 and 7. Building on multi-factor forecasting, Grad-loss further improves the accuracy of SIC forecasts, by approximately 2% overall and by nearly 5% at most. The improvement is not limited to summer; forecast accuracy improves significantly throughout the year. In other words, Grad-loss outperforms MAE-loss in forecast performance.

[0097] The comparison between ConvLSTM and PredRNN++ is shown in Figures 8 and 9. The Grad-loss-based PredRNN++ multi-factor forecasting model exhibits higher SSIM and lower MAE. Except in winter, it improves forecasts across spring, summer, and autumn, and in seasons with large sea ice variations it improves forecast performance by nearly 4%. That is, the PredRNN++ model outperforms ConvLSTM in forecasting performance.

[0098] The spatial distribution of error is measured by the spatial distribution of DIFF, as shown in Figure 10, where red areas indicate that the predicted SIC is too low and blue areas indicate that it is too high. The monthly average result has the largest DIFF, indicating significant variability in SIC within the month, which is precisely why daily forecasts are valuable. Overall, the ConvLSTM forecasts have a higher DIFF and a larger error area in the central sea area. In contrast, the Grad-PredRNN++ multi-factor forecast model has the smallest DIFF, with a more uniform spatial distribution and no large-scale deviations in any sea area.
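A DIFF map of this kind can be sketched as a grid-point difference between the forecast and the reference field. The sign convention used here (forecast minus observation, so that negative values mean the predicted SIC is too low) is an assumption, as are the function names and toy values.

```python
def diff_field(forecast, observed):
    """Grid-point difference: forecast minus observation (assumed convention)."""
    return [[f - o for f, o in zip(f_row, o_row)]
            for f_row, o_row in zip(forecast, observed)]

def field_mae(forecast, observed):
    """Mean absolute error over all grid points."""
    values = [abs(v) for row in diff_field(forecast, observed) for v in row]
    return sum(values) / len(values)

# Toy 2x2 fields (illustrative values only).
forecast = [[0.5, 0.8], [1.0, 0.0]]
observed = [[0.5, 0.6], [0.5, 0.5]]
diff = diff_field(forecast, observed)
error = field_mae(forecast, observed)
```

The signed map shows where the model over- or under-predicts, while the MAE collapses the same differences into a single magnitude.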

[0099] Table 1

[0100]

[0101] Table 2

[0102]

[0103] In conclusion, considering the SSIM, MAE, and DIFF indices, the Grad-PredRNN++ multi-factor forecasting model exhibits the best forecasting performance. Furthermore, given that summer and autumn are the peak navigation seasons for the Northeast Passage, and the East Siberian Sea is a crucial waterway traversed by the Northeast Passage, the requirements for sea ice forecasting are relatively high. Therefore, from the perspective of meteorological navigation needs, the Grad-PredRNN++ multi-factor forecasting model is the optimal model.

Claims

1. A method for short-term prediction of Arctic sea ice based on a gradient-constrained neural network, characterized in that it includes the following steps: 1) Prepare data, and extract and filter the data; this includes the following two parts: 1-1) Obtain multi-source data for Arctic sea ice concentration (SIC) and Arctic sea ice thickness (SIT) products. The SIC data comes from the ERA5 reanalysis dataset, which is the fifth-generation reanalysis product of the European Centre for Medium-Range Weather Forecasts (ECMWF) and assimilates various satellite radiometer and scatterometer observations; the SIT data comes from the PIOMAS model product, an ice-ocean model product that assimilates Arctic sea surface temperature and sea ice concentration; both types of data are daily gridded data, but their spatial grids are different. 1-2) Obtain various Arctic meteorological and oceanographic data related to Arctic sea ice changes, including Arctic sea surface temperature (SST), vertical heat flux (VHF), vertical water vapor flux (VMF), mean sea level pressure (MSL), 10-meter wind speed (WIND), 2-meter air temperature (T2M), radiation albedo (AL), low cloud cover (LCC), and skin temperature (SKT), using daily gridded data from ERA5. The above data are all publicly available datasets that can be accessed online. The extracted data includes sea ice thickness, latitude and longitude, and sampling time information.
2) Preprocess the acquired data from various sources, performing spatiotemporal matching of Arctic sea ice thickness data at different spatial grid points with other feature data; reduce the dimensionality of the data through factor screening, retaining only features highly correlated with Arctic sea ice changes; this includes the following three parts: 2-1) Grid matching: Perform grid matching on data of different resolutions, applying nearest neighbor interpolation to the Arctic sea ice thickness data so that it shares the same grid points as the Arctic sea ice concentration element; 2-2) Dimensional unification: Bring the different elements to the same dimension. Taking sea ice concentration forecasting as an example, first nondimensionalize the other meteorological and oceanographic elements and then scale them to the same dimension as sea ice concentration. 2-3) Data dimensionality reduction: In order to remove redundant information and reduce data dimensionality, the input factors of the model are screened; the time-lag correlation coefficients of each element at time t+1 and the sea ice element at time t are calculated, the factors are sorted from largest to smallest according to the absolute value of the time-lag correlation coefficients, and the top 50% of factors are selected as the input factors of the model. 3) Construct a gradient-constrained neural network model, namely Grad-PredRNN, using the local gradient of sea ice to constrain model convergence; set up multiple control models and select the optimal model; including the following three parts: 3-1) Network construction: Based on the TensorFlow platform, a basic predictive recurrent neural network model with gradient constraints is built. This model extracts spatial features through a three-layer convolutional network and uses information from 10 historical time points to predict subsequent sea ice information.
3-2) Gradient constraint: Considering the local variation of sea ice, a gradient loss function is proposed, and the convergence direction of the model parameters is constrained by the local gradient. A Grad-PredRNN network model suitable for short-term sea ice forecasting is constructed. 3-3) Model training: The preprocessed spatiotemporal sequence data related to Arctic sea ice are used for parameter training. Multiple training schemes are designed. Based on the forecast lead time, MAE, and the spatial distribution index of error, the effect of introducing local gradient constraints is tested, and the forecast performance of the predictive recurrent neural network and the convolutional neural network is compared. The model with satisfactory performance is selected and its parameters are saved. 3) Design of the network model: In deep learning algorithms, starting from the loss function MAE-loss, gradient constraints are used to reflect local changes in sea ice, and the loss function is improved to propose Grad-loss, whose calculation formula is as follows: In the above equation, $\mathrm{Grad}_{lat}$ and $\mathrm{Grad}_{lon}$ represent the zonal and meridional gradients, respectively, and std represents the standard deviation; $y_{(i,j)}$ is the information at a specific grid point of the spatial field, $y$ is the information of the entire spatial field of the element, and $\hat{y}$ is the forecast information of the element; in order to make the dimensions of MAE-loss and Grad-loss the same, they are divided by the std of SIC and the std of the local gradient, respectively, in the above equation; 4) The best model obtained from the training is used for short-term sea ice forecasting in the Arctic Ocean, and the forecast results are compared with the actual results to verify the forecast performance.
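As an illustration of step 3-1), training samples for a model that reads 10 historical time points can be built with a sliding window over the daily sequence. This is a minimal sketch: the function name is hypothetical, and the output length of 10 days in the example is an assumption borrowed from the maximum forecast period stated in claim 4.

```python
def make_windows(series, n_in=10, n_out=1):
    """Split a daily sequence into (history, target) pairs of fixed length."""
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        history = series[start:start + n_in]
        target = series[start + n_in:start + n_in + n_out]
        samples.append((history, target))
    return samples

# 25 consecutive days, indexed 0..24; each entry stands in for one daily field.
days = list(range(25))
windows = make_windows(days, n_in=10, n_out=10)
```

In practice each entry of the sequence would be a stack of gridded fields for the selected factors instead of a single number.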

2. The method for short-term forecasting of Arctic sea ice based on gradient-constrained neural networks according to claim 1, characterized in that: 1) Data characteristics of the data products used: daily data products from PIOMAS and ERA5 are collected to find the best multifactor forecasting model, with data for some years reserved for testing model effectiveness. ERA5 data is used to extract SIC, SST, VHF, VMF, MSL, WIND, T2M, AL, LCC, and SKT, while PIOMAS data is used to extract SIT. Both ERA5 and PIOMAS data products are daily horizontal gridded data.

3. The method for short-term forecasting of Arctic sea ice based on gradient-constrained neural networks according to claim 1, characterized in that: 2) Data preprocessing before model training: Because the spatial grid points of different datasets do not match, they must be unified to the same grid points. Bilinear interpolation is used to interpolate the PIOMAS SIT data onto the ERA5 spatial grid, ensuring that all sea ice and meteorological/oceanographic elements share the ERA5 grid. Bilinear interpolation consists of three linear interpolations in two directions. First, two linear interpolations are performed in the x-direction to obtain two temporary points R1(x,y1) and R2(x,y2). Then, one linear interpolation is performed in the y-direction to obtain f(x,y). The calculation formula is as follows: where $Q_{ij}$, i = 1, 2, j = 1, 2, represent the four points around the interpolation point, $x_i$ and $y_j$ represent the horizontal and vertical coordinates of the surrounding grid points, and f(x, y) is the result of interpolation. After spatial grid matching, input factors are screened to remove redundant information and reduce data dimensionality, thereby improving model convergence speed. The time-delay correlation coefficient between the input factors at time t and time t-1 of the Arctic sea ice data is calculated, and the absolute values of the time-delay correlation coefficients are sorted from largest to smallest. The top 50% of the factors are selected as input factors for the model.
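The three-step bilinear interpolation described in this claim can be sketched as follows, using the textbook formula. Here `q11`, `q21`, `q12`, and `q22` are the values at the surrounding points (x1, y1), (x2, y1), (x1, y2), and (x2, y2); the function and argument names are chosen for this example.

```python
def bilinear(x, y, x1, x2, y1, y2, q11, q21, q12, q22):
    """Bilinear interpolation at (x, y) inside the cell [x1, x2] x [y1, y2]."""
    # Two linear interpolations in the x-direction.
    r1 = ((x2 - x) * q11 + (x - x1) * q21) / (x2 - x1)   # value at (x, y1)
    r2 = ((x2 - x) * q12 + (x - x1) * q22) / (x2 - x1)   # value at (x, y2)
    # One linear interpolation in the y-direction.
    return ((y2 - y) * r1 + (y - y1) * r2) / (y2 - y1)

# Value at the centre of a unit cell with corner values 1, 2, 3, 4.
centre = bilinear(0.5, 0.5, 0.0, 1.0, 0.0, 1.0, 1.0, 2.0, 3.0, 4.0)
# At a corner, the interpolation returns that corner's value.
corner = bilinear(0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 2.0, 3.0, 4.0)
```

Regridding a full SIT field would apply this at every target grid point after locating the enclosing source cell, a step omitted here.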

4. The method for short-term forecasting of Arctic sea ice based on gradient-constrained neural networks according to claim 1, characterized in that: 4) In this section, the model's forecast performance is verified and the optimal model is selected; Comparative experiments were conducted using multiple models to compare the performance of the PredRNN++ model and the ConvLSTM model; the performance of the MAE-loss loss function and the Grad-loss loss function; each comparative model can be used for short-term daily forecasting of sea ice, with a maximum forecast period of 10 days; The optimal forecasting model was selected through three comparisons: 1) comparing with the monthly average MAE-loss of sea ice elements to verify the effectiveness of daily forecasts; 2) comparing with daily real data to calculate MAE-loss and spatial structure similarity (SSIM) to verify the model's forecasting performance in the time dimension; 3) comparing with the spatial field of daily data to calculate the spatial difference DIFF between the forecast results and the real data to verify the model's forecasting performance in the spatial dimension. Based on the above three comparison results, the optimal forecasting model was selected.