A precipitation forecasting method and a forecasting system with a physically constrained prediction model

By constructing a prediction model with physical constraints and training the model using VIL radar image samples and an improved loss function, the problem of lack of physical constraints in deep learning models in QPN tasks is solved, and efficient and accurate prediction of extreme precipitation events is achieved.

CN117496324B · Active · Publication Date: 2026-06-30 · NAT UNIV OF DEFENSE TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NAT UNIV OF DEFENSE TECH
Filing Date
2023-09-22
Publication Date
2026-06-30

Abstract

The embodiments of this application relate to a precipitation forecasting method and system with a physically constrained prediction model. Compared with a model trained using MSE alone, the method incorporates a physical constraint based on the convection-diffusion equation (a partial differential equation from fluid mechanics) into the precipitation prediction model. Precipitation data within an observation area are highly imbalanced; in particular, high-frequency signals representing extreme precipitation events are scarce, so a conventional loss function is not optimal. The method uses a physically constrained loss term to increase the model's attention and sensitivity to extreme precipitation events, which in turn improves the accuracy of extreme precipitation prediction. The use of physical constraints also offers a new direction for combining physical prior knowledge with deep learning in precipitation forecasting.

Description

Technical Field

[0001] This application relates to the field of precipitation prediction technology, and in particular to a precipitation forecasting method and system with a physically constrained prediction model.

Background Technology

[0002] Quantitative precipitation nowcasting (QPN) is an extremely challenging task in weather forecasting. Accurate, timely, and efficient precipitation forecasts are essential for ensuring the success of events in various everyday scenarios, including storm warnings, air transport, and large gatherings.

[0003] Recent advances in computer vision (CV) have expanded the possibilities of applying deep learning (DL) techniques to QPN tasks, but the following challenges must still be overcome when using various deep learning models for QPN tasks:

[0004] Black-box deep learning (DL) models lack physical constraints and prior knowledge, and meteorologists are particularly concerned with how DL models respect physical constraints. However, for small-scale processes such as precipitation, convection, and turbulence, large-scale conservation laws do not apply directly, so it is difficult to impose mass or momentum conservation on QPN models. Physics-informed neural networks (PINNs), on the other hand, are specifically designed to solve nonlinear partial differential equations (PDEs) that describe physical laws. While recent research has used PINNs as alternatives to traditional numerical solvers for simulating flow past cylinders, aircraft wings, and wind turbines, they have not yet been applied to real-world QPN tasks.

Summary of the Invention

[0005] The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of the claims.

[0006] The main objective of this disclosure is to propose a precipitation forecasting method and system with a physically constrained prediction model. By using a physically constrained loss term, the model's attention and sensitivity to extreme precipitation events are improved, thereby increasing the accuracy with which the model predicts extreme precipitation events.

[0007] In a first aspect, this disclosure proposes a precipitation forecasting method with a physically constrained prediction model, characterized in that the precipitation forecasting method with a physically constrained prediction model includes the following steps:

[0008] A prediction model is constructed, and VIL image samples are selected, where a VIL image is a radar image of vertically integrated liquid water content;

[0009] A loss function for the prediction model is constructed, and the prediction model is trained using the VIL image samples and the loss function to obtain the trained prediction model; the loss function includes an MSE loss term and a convection-diffusion term;

[0010] The measured VIL image is input into the trained prediction model to obtain prediction images for multiple future frames, which are then used to predict precipitation events.

[0011] In some embodiments of this application, the convection-diffusion term includes:

[0012] $$L_{CD} = \alpha\left(V_{i,j}^{n+1}-V_{i,j}^{n}\right)+\beta\left(V_{i+1,j}^{n}-V_{i,j}^{n}\right)+\gamma\left(V_{i,j+1}^{n}-V_{i,j}^{n}\right)-\delta\left(V_{i+1,j}^{n}+V_{i-1,j}^{n}+V_{i,j+1}^{n}+V_{i,j-1}^{n}-4V_{i,j}^{n}\right)$$

[0013] where $\alpha$, $\beta$, and $\gamma$ are constants; $\delta$ is the coefficient of the diffusion term; $V$ is the VIL image; the superscript denotes time, with $n+1$ being the moment following $n$; the subscript $i$ indexes grid points in the $x$ direction and the subscript $j$ indexes grid points in the $y$ direction; grid point $i-1$ is the grid point preceding $i$ in the $x$ direction and grid point $i+1$ is the grid point following $i$ in the $x$ direction; grid point $j-1$ is the grid point preceding $j$ in the $y$ direction and grid point $j+1$ is the grid point following $j$ in the $y$ direction.

[0014] In some embodiments of this application, the prediction model includes:

[0015] The input module consists of a 3×3 convolutional layer;

[0016] The encoding module includes three cascaded encoders, each comprising multiple stacked visual transformers and a downsampling layer. The visual transformer includes cascaded multi-head squared attention and a feedforward neural network. The multi-head squared attention process for processing the feature map includes:

[0017]

[0018]

[0019]

[0020]

[0021]

[0022] where the symbols denote, in order: the input features of the multi-head squared attention; the output features of the multi-head squared attention; a 1×1 pointwise convolution; a 3×3 depthwise convolution; the query, key, and value matrices obtained by transforming the input features of the multi-head squared attention; a trainable parameter; a 1×1 convolution; the activation function; the attention mechanism; and the input features after unbiased layer normalization;

[0023] Before the multi-head squared attention output features are input into the feedforward neural network, the process includes: unbiased layer normalization of the multi-head squared attention output features to obtain the layer-normalized features; the feedforward neural network's feature processing includes:

[0024] In the first path, the layer-normalized features are passed through a pointwise convolution and then a depthwise convolution, and the output of the depthwise convolution is activated by the GELU function. In the second path, the layer-normalized features are passed through a pointwise convolution and then a depthwise convolution, and the output of the depthwise convolution is activated by the sigmoid function. The Hadamard product of the GELU-activated features and the sigmoid-activated features is computed and input into a 1×1 convolution, and the output of the 1×1 convolution is skip-connected with the output features of the multi-head squared attention to obtain the output features of the feedforward neural network.

[0025] The central module consists of multiple stacked visual transformers;

[0026] The decoding module includes three cascaded decoders, each of which sequentially includes an upsampling layer, a convolutional layer, and multiple stacked visual transformers; the output features of each encoder are skip-connected to the output features of the upsampling layer of the corresponding decoder.

[0027] The output module includes multiple stacked visual transformers, two convolutional layers, and a Swish activation layer; the feature processing in the output module includes:

[0028] The output features of the decoding module are input into the first visual transformer in a stacked array of visual transformers; the output features of the last visual transformer in the stacked array of visual transformers are input into the first convolutional layer to obtain the output features of the first convolutional layer; the output features of the first convolutional layer are connected to the input features of the input module using a skip connection; the skip-connected features are input into the second convolutional layer to obtain the output features of the second convolutional layer; the output features of the second convolutional layer are activated using a Swish activation layer to obtain the output features of the output module.

[0029] In some embodiments of this application, when training the prediction model, the precipitation forecasting method with a physically constrained prediction model further includes:

[0030] A first mask sample set and a second mask sample set are derived from the VIL image samples, each containing the same number of VIL images as the VIL image samples. The VIL images in the first mask sample set retain only feature tensors with pixel values greater than 74, and the VIL images in the second mask sample set retain only feature tensors with pixel values greater than 133.

[0031] The prediction model is trained based on the first mask sample set and the second mask sample set.
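A minimal NumPy sketch of how the two mask sample sets could be derived. It assumes VIL frames are stored as 8-bit pixel arrays and that pixels failing the threshold are set to zero; the fill value and the helper name `mask_below` are illustrative assumptions, not details stated in this application.

```python
import numpy as np

def mask_below(vil, threshold):
    """Keep pixels strictly greater than `threshold`; zero the rest (assumed fill value)."""
    vil = np.asarray(vil)
    return np.where(vil > threshold, vil, 0)

# Toy 2x3 VIL frame on a 0-255 pixel scale (illustrative data only).
frame = np.array([[10, 74, 75],
                  [133, 134, 200]], dtype=np.uint8)

masked74 = mask_below(frame, 74)    # frame for the first mask sample set
masked133 = mask_below(frame, 133)  # frame for the second mask sample set
```

Both masked frames keep the shape of the original frame, so each mask sample set contains as many images as the original samples.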

[0032] In some embodiments of this application, when training the prediction model, the precipitation forecasting method with physical constraints further includes:

[0033] Precipitation events tagged with thunderstorms, hail, and floods are selected from the VIL image samples;

[0034] The prediction model is trained based on precipitation events tagged with thunderstorms, hail, and floods.
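The event selection above can be sketched as a simple filter. The record type `VilEvent`, its field names, and the label strings are hypothetical; the application does not specify how event tags are stored.

```python
from dataclasses import dataclass

@dataclass
class VilEvent:
    event_id: str    # hypothetical identifier
    event_type: str  # hypothetical tag, e.g. "Hail"

# Assumed spelling of the three tags named in the text.
SEVERE_TYPES = {"thunderstorm", "hail", "flood"}

def select_severe(events):
    """Return only the events whose tag is thunderstorm, hail, or flood."""
    return [e for e in events if e.event_type.lower() in SEVERE_TYPES]

events = [
    VilEvent("e1", "Hail"),
    VilEvent("e2", "Drizzle"),
    VilEvent("e3", "flood"),
    VilEvent("e4", "Thunderstorm"),
]
severe = select_severe(events)
```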

[0035] In some embodiments of this application, when training the prediction model, the precipitation forecasting method with physical constraints further includes:

[0036] The prediction model is trained based on the first N / 2 minutes of VIL image sequences in the VIL image samples, and the prediction model is evaluated based on the last N / 2 minutes of VIL image sequences in the VIL image samples; N is the length of the VIL image sequence in the VIL image samples.

[0037] In some embodiments of this application, the loss function comprises a weighted sum of an MSE loss term and a convection-diffusion term, wherein the weight of the MSE loss term is 1 and the weight of the convection-diffusion term is 0.5.

[0038] In a second aspect, embodiments of this disclosure propose a precipitation forecasting system with a physically constrained prediction model, comprising:

[0039] A model building unit, used to construct a prediction model and select VIL image samples, where a VIL image is a radar image of vertically integrated liquid water content;

[0040] The model training unit is used to construct the loss function of the prediction model, and to train the prediction model using the VIL image samples and the loss function to obtain the trained prediction model; the loss function includes an MSE loss term and a convection-diffusion term;

[0041] The model prediction unit is used to input the measured VIL image into the trained prediction model to obtain prediction images for multiple future frames, which are then used for precipitation event prediction.

[0042] In a third aspect, embodiments of this disclosure provide an electronic device including at least one memory;

[0043] At least one processor;

[0044] At least one computer program;

[0045] The computer program is stored in the memory, and the processor executes the at least one computer program to implement:

[0046] The precipitation forecasting method with a physically constrained prediction model as described in any of the embodiments of the first aspect.

[0047] In a fourth aspect, embodiments of this disclosure provide a computer-readable storage medium storing computer-executable instructions for causing a computer to perform:

[0048] The precipitation forecasting method with a physically constrained prediction model as described in any of the embodiments of the first aspect.

[0049] Some embodiments of this application provide a precipitation forecasting method with a physically constrained prediction model. Compared with models trained using MSE alone, this method incorporates a physical constraint based on the convection-diffusion equation (a partial differential equation from fluid mechanics) into the precipitation forecasting model. Precipitation data within the observation area are highly skewed and imbalanced; in particular, there are few high-frequency signals representing extreme precipitation events, so conventional loss functions are not optimal. This method uses a physically constrained loss term to improve the model's attention and sensitivity to extreme precipitation events, thereby increasing the accuracy with which the model predicts them. Moreover, the use of physical constraints offers a new direction for combining physical prior knowledge with deep learning in precipitation forecasting.

[0050] It is understood that the beneficial effects of the second to fourth aspects compared with the related technologies are the same as those of the first aspect; refer to the description of the first aspect above, which is not repeated here.

Brief Description of the Drawings

[0051] To illustrate the technical solutions in the embodiments of this application more clearly, the drawings used in the description of the embodiments or related technologies are briefly introduced below. The drawings described below show only some embodiments of this application; those skilled in the art can obtain other drawings from them without creative effort.

[0052] Figure 1 is a flowchart of a precipitation forecasting method with a physically constrained prediction model provided in one embodiment of this application;

[0053] Figure 2 is a schematic diagram of the structure of a prediction model provided in one embodiment of this application;

[0054] Figure 3 is a schematic diagram of the structure of the multi-head squared attention and the feedforward neural network provided in one embodiment of this application;

[0055] Figure 4 is a schematic diagram of the original dataset, the Masked74 dataset, and the Masked133 dataset provided in one embodiment of this application;

[0056] Figure 5 is a schematic diagram illustrating the separation of three types of precipitation events from the original dataset provided in one embodiment of this application;

[0057] Figure 6 is a schematic diagram of the structure of an electronic device provided in one embodiment of this application.

Detailed Description

[0058] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0059] It should be noted that although functional modules are divided in the device schematic diagram and a logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in a different order than the module division in the device or the order in the flowchart. The terms "first," "second," etc., in the specification, claims, and the aforementioned drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.

[0060] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.

[0061] The following is an introduction to the background technology:

[0062] Precipitation is a key variable in weather forecasting because it can lead to severe natural disasters such as flash floods, torrential rains, and mudslides. Quantitative precipitation nowcasting (QPN), that is, short-term quantitative precipitation forecasting, is considered one of the most challenging tasks in this field. For a considerable period, numerical weather prediction (NWP) systems have been the primary method for forecasting precipitation. However, limited model resolution and an incomplete understanding of subgrid processes prevent NWP systems from modeling small-scale processes such as convection and turbulence. Furthermore, model spin-up issues and the slow numerical solution of partial differential equations further limit the performance of NWP systems in short-term forecasting tasks. Recent advances in computer vision (CV) have expanded the possibilities of applying deep learning (DL) techniques to QPN tasks.

[0063] In the traditional field of machine learning, transfer learning involves transferring knowledge from one domain (i.e., the source domain) to another, enabling better learning outcomes in the target domain. It has been applied in various fields, such as natural language processing and cyber-physical systems. Furthermore, transfer learning is considered an important direction for future research in atmospheric science.

[0064] In the development of neural networks, researchers regard the incorporation of physical constraints as a major challenge. Existing research identifies two main types of physical constraints: soft constraints and hard constraints. Soft physical constraints are enforced by embedding partial differential equations (PDEs) and their initial and boundary conditions into the loss function of physics-informed neural networks (PINNs). Hard physical constraints are enforced by designing a custom neural network that satisfies the initial and boundary conditions by construction while embedding the PDEs into the loss function. As universal approximators, PINNs can be applied to various types of equations, including fractional equations, integro-differential equations, and stochastic PDEs. After training, these deep neural networks produce an estimated solution of the differential equation at a given point. PINNs employ a network that encodes the physical governing equations and add a residual term derived from those equations to the loss function as a penalty term that constrains the network.

[0065] Example embodiments:

[0066] Referring to Figure 1, one embodiment of this application provides a precipitation forecasting method with a physically constrained prediction model, the method comprising the following steps:

[0067] Step S100: Construct a prediction model and select VIL image samples, where a VIL image is a radar image of vertically integrated liquid water content.

[0068] Step S200: Construct the loss function of the prediction model. Train the prediction model using VIL image samples and the loss function to obtain the trained prediction model. The loss function includes the MSE loss term and the convection-diffusion term.

[0069] Step S300: Input the measured VIL image into the trained prediction model to obtain prediction images for multiple future frames, which are then used for precipitation event prediction.

[0070] Referring to Figure 2, the prediction model used in the embodiments of this application is introduced first. The prediction model includes:

[0071] The input module consists of a 3×3 convolutional layer.

[0072] The encoding module includes three cascaded encoders, each consisting of multiple stacked visual transformers and a downsampling layer.

[0073] Referring to Figure 3, the visual transformer consists of cascaded multi-head squared attention and a feedforward neural network. The processing of a feature map by the multi-head squared attention includes:

[0074]

[0075]

[0076]

[0077]

[0078]

[0079]

[0080] where the symbols denote, in order: the input features of the multi-head squared attention; the output features of the multi-head squared attention; a 1×1 pointwise convolution; a 3×3 depthwise convolution; the query, key, and value matrices obtained by transforming the input features of the multi-head squared attention; a trainable parameter; a 1×1 convolution; the activation function; the attention mechanism; and the input features after unbiased layer normalization.

[0081] Compared with existing precipitation prediction networks, the prediction model used here is new. The encoder and decoder have been redesigned, with each encoder and decoder containing multiple stacked visual transformers. These visual transformers consist of cascaded multi-head squared attention and a feedforward neural network. Compared with conventional multi-head attention, this embodiment first applies unbiased layer normalization to improve information flow during backpropagation, accelerating training convergence. It then decomposes the standard convolution into two separate operations: a pointwise convolution and a depthwise convolution. The query, key, and value are generated by applying the pointwise convolution and the depthwise convolution to the feature tensor. The pointwise convolution aggregates cross-channel contextual information: every output channel combines information from all input channels, which reduces the overall number of model parameters. The depthwise convolution encodes the spatial features of precipitation in different frames by applying a single filter independently to each channel, further reducing the overall number of model parameters. Finally, the multi-head squared attention uses the sigmoid activation function instead of the softmax function, improving its attention performance. The improved prediction model raises the accuracy of longer-lead predictions while significantly reducing the parameter count and computational complexity.
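The parameter saving from splitting a standard convolution into a pointwise and a depthwise convolution can be checked with simple arithmetic. The sketch below ignores bias terms and uses an illustrative channel count of 64, which is an assumption and not a value from this application.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def separable_conv_params(c_in, c_out, k):
    """Weights of a 1x1 pointwise convolution followed by a k x k depthwise convolution."""
    pointwise = c_in * c_out   # each output channel mixes all input channels
    depthwise = c_out * k * k  # one k x k filter per channel
    return pointwise + depthwise

c_in, c_out, k = 64, 64, 3  # illustrative sizes only
std = standard_conv_params(c_in, c_out, k)
sep = separable_conv_params(c_in, c_out, k)
ratio = std / sep
```

For these sizes the standard convolution has 36,864 weights and the separable pair has 4,672, roughly an eightfold reduction.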

[0082] Before the multi-head squared attention output features are input into the feedforward neural network, the process includes: unbiased layer normalization of the multi-head squared attention output features to obtain the layer-normalized features; the feedforward neural network's feature processing includes:

[0083] The output features of the multi-head squared attention are subjected to unbiased layer normalization. In the first path, the normalized features are passed through a pointwise convolution and then a depthwise convolution, and the output of the depthwise convolution is activated by the GELU function. In the second path, the normalized features are passed through a pointwise convolution and then a depthwise convolution, and the output of the depthwise convolution is activated by the sigmoid function. The Hadamard product of the GELU-activated features and the sigmoid-activated features is computed and input into a 1×1 convolution, and the output of the 1×1 convolution is skip-connected with the output features of the multi-head squared attention to obtain the output features of the feedforward neural network.

[0084] Referring to Figure 3, like the multi-head squared attention, the feedforward neural network also adds an unbiased layer normalization operation together with pointwise and depthwise convolution operations. The feedforward neural network consists of two paths. In each path, the layer-normalized tensor is passed through separable convolutional layers to extract spatial information. In one path, the output is activated by the GELU function to add nonlinearity; in the other path, it is activated by the sigmoid function. The GELU activation function has a strong nonlinear response for large inputs while avoiding the vanishing gradient problem for small inputs. The sigmoid function compresses the data to keep its amplitude in an appropriate range while controlling the size of weight updates during backpropagation. Combining these two paths is intended to enhance the model's ability to capture nonlinear relationships in precipitation data. The output tensors of the two parallel paths are then combined by a Hadamard product and transformed by a pointwise convolution. A residual connection is also applied to address the problems of vanishing gradients and weight matrix degradation.
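The two-path feedforward block can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the application's implementation: layer normalization is assumed to have been applied already, the hidden width, the weight layout in the dictionary `p`, and all function names are hypothetical, and GELU uses the common tanh approximation.

```python
import numpy as np

def pointwise_conv(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def depthwise_conv3x3(x, k):
    """3x3 depthwise convolution with zero padding: x is (C, H, W), k is (C, 3, 3)."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for di in range(3):
        for dj in range(3):
            out += k[:, di, dj][:, None, None] * padded[:, di:di + h, dj:dj + w]
    return out

def gelu(x):
    """tanh approximation of GELU."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_ffn(x_norm, x_skip, p):
    """x_norm: layer-normalized features; x_skip: attention output used for the skip connection."""
    path1 = gelu(depthwise_conv3x3(pointwise_conv(x_norm, p["pw1"]), p["dw1"]))
    path2 = sigmoid(depthwise_conv3x3(pointwise_conv(x_norm, p["pw2"]), p["dw2"]))
    gated = path1 * path2                                # Hadamard product of the two paths
    return pointwise_conv(gated, p["pw_out"]) + x_skip   # 1x1 convolution plus skip connection

rng = np.random.default_rng(0)
c, hidden, h, w = 4, 8, 6, 6  # illustrative sizes
params = {
    "pw1": rng.normal(size=(hidden, c)), "dw1": rng.normal(size=(hidden, 3, 3)),
    "pw2": rng.normal(size=(hidden, c)), "dw2": rng.normal(size=(hidden, 3, 3)),
    "pw_out": rng.normal(size=(c, hidden)),
}
x_skip = rng.normal(size=(c, h, w))  # stands in for the attention output
x_norm = rng.normal(size=(c, h, w))  # stands in for its layer-normalized version
y = gated_ffn(x_norm, x_skip, params)
```

The sigmoid path acts as a gate with values between 0 and 1 that scales the GELU path element by element, and the skip connection keeps the block's output the same shape as the attention output.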

[0085] The central module comprises multiple stacked visual transformers to enhance the model's ability to simulate nonlinear variations in precipitation events.

[0086] The decoding module includes three cascaded decoders, each of which sequentially includes an upsampling layer, a convolutional layer, and multiple stacked visual transformers. The output features of each encoder are skip-connected to the output features of the upsampling layer of the corresponding decoder.

[0087] The encoder and decoder follow the UNet architecture, incorporating skip connections between the encoder and decoder blocks. This offers several advantages. First, the visual transformer overcomes the vanishing gradient problem of recurrent architectures by using self-attention to capture relationships across the whole sequence, regardless of distance. This allows the model to handle complex dependencies better and improves performance in tasks requiring long-term memory. Second, skip-connecting the output features of each encoder to the output features of the upsampling layer of the corresponding decoder allows information to flow directly between layers, preserving low-level input features while giving access to both high- and low-level information. This enhances the model's ability to capture fine-grained details and contextual information. Finally, the proposed architecture predicts all output images in a single step, avoiding the high computational complexity and low training efficiency of recurrent architectures, in which the computation at each time step depends on the hidden state of the previous time step.

[0088] The output module consists of four stacked visual transformers, two convolutional layers, and a Swish activation layer; the feature processing in the output module includes:

[0089] The output features of the decoding module are input into the first visual transformer in a stacked visual transformer array; the output features of the last visual transformer in the stacked visual transformer array are input into the first convolutional layer to obtain the output features of the first convolutional layer; the output features of the first convolutional layer are connected to the input features of the input module using a skip connection; the skip-connected features are input into the second convolutional layer to obtain the output features of the second convolutional layer; the output features of the second convolutional layer are activated using a Swish activation layer to obtain the output features of the output module.

[0090] The output module uses four stacked visual transformers to refine the details of the VIL image. A skip connection adds the last frame of the input features from the input module to the output features of the first convolutional layer. A further convolutional layer followed by a Swish activation layer then forms an output block that further enhances the nonlinear fitting capability.

[0091] In step S100, a prediction model is first constructed, as described in the above embodiment. Then, VIL image samples are selected, for example, 5 frames (25 minutes) of VIL images are selected as the raw data. The VIL data from January 1, 2017 to December 31, 2019 includes a total of 20,393 sequences, which are divided into three parts for training, validation and testing, respectively.

[0092] Description of the QPN task:

[0093]

[0094] where the two symbols denote, respectively, the observed VIL frame at a given time and the predicted VIL frame at that time. In this implementation, the QPN task is formalized as an image-to-image regression task: 5 frames (25 minutes) of input data are used to predict the VIL of the next 20 frames (100 minutes), to suit real-world scenarios.
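The input and target windows of this regression task can be sketched as array slices. The sketch assumes each sample is stored as one array of at least 25 consecutive frames with shape (frames, height, width); that storage layout and the function name `split_input_target` are assumptions for illustration.

```python
import numpy as np

IN_FRAMES = 5    # 25 minutes of observed VIL
OUT_FRAMES = 20  # 100 minutes of VIL to predict

def split_input_target(sequence):
    """Split a (T, H, W) sequence into 5 input frames and the following 20 target frames."""
    sequence = np.asarray(sequence)
    needed = IN_FRAMES + OUT_FRAMES
    if sequence.shape[0] < needed:
        raise ValueError(f"need at least {needed} frames, got {sequence.shape[0]}")
    return sequence[:IN_FRAMES], sequence[IN_FRAMES:needed]

# Toy sequence of 25 frames on a 4x4 grid (illustrative data only).
seq = np.arange(25 * 4 * 4, dtype=np.float32).reshape(25, 4, 4)
inputs, targets = split_input_target(seq)
```

The prediction model maps `inputs` to an array with the same shape as `targets` in one forward pass, which is what makes the task an image-to-image regression.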

[0095] In step S200, the loss function of the prediction model is constructed from two parts: an MSE loss term and a convection-diffusion term. For example, the loss function is the weighted sum of the two terms:

[0096] $$L = a\,L_{MSE} + b\,L_{CD} \tag{1}$$

[0097] where $L_{MSE}$ is the MSE loss term, $L_{CD}$ is the convection-diffusion term, and $a$ and $b$ are weight values.

[0098] The formation of the convection-diffusion term is explained in detail below:

[0099] Based on prior physical knowledge, the global atmosphere satisfies several key physical conservation laws, including conservation of mass, momentum, angular momentum, energy, and vorticity. However, small-scale weather systems, such as precipitation systems, often fail to satisfy large-scale conservation laws. Furthermore, constructing these conservation equations requires multiple variables, such as wind speed, heat, and pressure, which are not included in the current QPN dataset. Obtaining these variables from reanalysis data is infeasible, because reanalysis data cannot match the 1-kilometer resolution of radar echo data and cannot be retrieved immediately during precipitation events. Directly applying these conservation equations in the QPN task is therefore challenging. This embodiment instead proposes using the convection-diffusion equation from fluid dynamics as the physical constraint for QPN, with several assumptions made on the basis of the available data and computational conditions. The convection-diffusion equation is shown below:

[0100] $$\frac{\partial c}{\partial t} + \nabla\cdot(\mathbf{u}\,c) - D\,\nabla^{2}c = 0 \tag{2}$$

[0101] where $c$ is the transported quantity, for which VIL is used in this embodiment; $\mathbf{u}$ is the convective velocity vector, representing the velocity of the macroscopic fluid; $\partial c/\partial t$ is the partial derivative with respect to time, representing the change of $c$ over time; $\nabla\cdot(\mathbf{u}\,c)$ is the divergence of the flux, describing the change of $c$ caused by convection; $D$ is the diffusion coefficient, a constant; and $\nabla^{2}c$ is the Laplace term, describing the change of $c$ caused by diffusion. Specifically, the convection term describes how $c$ is transported by fluid motion, while the diffusion term describes how $c$ spreads out. Substituting VIL for $c$ is therefore considered a reasonable way to simulate precipitation movement. In some cases, the velocity of a precipitation system is relatively stable and can be approximated as fixed over a short timescale. This assumption rests on the understanding that, in many cases, the motion of a precipitation system is fairly consistent over a finite time period. Some assumptions and simplifications are therefore made to the convection-diffusion equation. First, the precipitation system is assumed to move at a fixed velocity over 120 minutes, so $\mathbf{u}$ is a constant. Second, the partial differential equation is discretized using numerical methods. Finally, the residual of the discretized partial differential equation is added as a loss term to the total loss function of the network. This encourages the network to learn results that satisfy the physical constraint: backpropagation updates the weights not only to minimize the prediction error (MSE) but also to approximate the physical constraint. The different terms of the loss function are also balanced, so that no single term becomes abnormally large and masks the influence of the others. The discretized partial differential equation can be expressed as:

[0102] (3)

[0103] where $\Delta t$ and $\Delta x$, $\Delta y$ are the time step and the space steps, which are constants; $u$ and $v$ are the velocities in the $x$ and $y$ directions, which are also constants under the assumption made earlier. In the above equation, the superscript $n$ represents time, and $n+1$ denotes the moment following $n$. The subscript $i$ represents grid points in the $x$ direction, and $j$ represents grid points in the $y$ direction; $i-1$ represents the previous grid point in the $x$ direction and $i+1$ the next grid point, both of which arise from the discretization of the differential equation. Therefore, the partial derivative with respect to time in the first term becomes a change in the time index $n$, and the partial derivatives with respect to space in the second term become changes in the grid indices $i$ and $j$.

[0104] Therefore, after a series of assumptions and simplifications, the above equation (3) can be simplified as follows:

[0105] (4)

[0106] The left side of the above equation represents the convection-diffusion loss term. The purpose of this loss term is to optimize the network parameters so that its value approaches zero. The closer the left side of equation (4) is to zero, the better the output satisfies the physical constraint of convection-diffusion. The output of the prediction model can be substituted directly into equation (4). For example, to obtain the difference in time, simply subtract frames 1 to 19 from frames 2 to 20 of the model's 20 output frames. Similarly, to obtain the differences in space, simply subtract the values at grid points that are offset by one unit in the corresponding direction. In other words, the mathematical form of the constraint is the differential expression on the left side of equation (2); to facilitate numerical calculation, it is transformed into the discretized difference form on the left side of equation (4), so that the model's output can be used directly in the calculation.
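The frame and grid differencing described above can be sketched in plain Python. This is a minimal sketch and not the embodiment's implementation: the stencil (forward difference in time, one-unit-offset central differences in space, five-point Laplacian), the function name `cd_residual`, and the coefficient names `k_t`, `k_x`, `k_y`, `k_d` are all assumptions made for illustration, since formula (4) is not reproduced here.

```python
def cd_residual(frames, k_t, k_x, k_y, k_d):
    """Mean absolute residual of an ASSUMED discretized convection-diffusion form.

    frames: list of T frames, each an H x W nested list of floats (e.g. predicted VIL).
    k_t, k_x, k_y, k_d: hypothetical constant coefficients of the time-difference,
    x-convection, y-convection and diffusion terms.
    Only interior grid points are used, so no boundary handling is needed.
    """
    total, count = 0.0, 0
    for n in range(len(frames) - 1):          # frame n+1 minus frame n
        cur, nxt = frames[n], frames[n + 1]
        for i in range(1, len(cur) - 1):
            for j in range(1, len(cur[0]) - 1):
                d_time = nxt[i][j] - cur[i][j]
                d_x = cur[i + 1][j] - cur[i - 1][j]   # neighbours offset by one unit in x
                d_y = cur[i][j + 1] - cur[i][j - 1]   # neighbours offset by one unit in y
                lap = (cur[i + 1][j] + cur[i - 1][j]
                       + cur[i][j + 1] + cur[i][j - 1] - 4.0 * cur[i][j])
                residual = k_t * d_time + k_x * d_x + k_y * d_y - k_d * lap
                total += abs(residual)
                count += 1
    return total / count if count else 0.0


# A ramp in x that does not move, and the same ramp shifted by one cell per frame.
static = [[[float(i) for _ in range(4)] for i in range(4)] for _ in range(3)]
moving = [[[float(i - n) for _ in range(4)] for i in range(4)] for n in range(3)]

static_residual = cd_residual(static, 1.0, 0.5, 0.0, 0.0)
moving_residual = cd_residual(moving, 1.0, 0.5, 0.0, 0.0)
```

With these toy coefficients the translating ramp satisfies the assumed pure-convection constraint, so its residual vanishes, while the static ramp does not.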

[0107] In formula (4), the four coefficients are all constants. A convolutional layer and a ReLU (rectified linear unit) activation function are applied outside the neural network to generate these four parameters, similar to the idea of approximating unknown partial differential equations. Because the data lie on a regular grid, the space steps $\Delta x$ and $\Delta y$ are equal, so a single coefficient is defined for the diffusion term. In general, the coefficient of the time-difference term is determined by the time step $\Delta t$; the coefficients of the two convection terms are determined by the space steps $\Delta x$, $\Delta y$ and the assumed velocities $u$, $v$; and the coefficient of the diffusion term is determined by the diffusion coefficient $D$ and the space steps. Based on the above assumptions, the velocity of a single precipitation event remains constant over two hours, and both the time step and the space step are fixed. Therefore, all four coefficients are considered constants.
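As an illustration only, the dependence of the four coefficients on the step sizes can be written out for one possible discretization. The stencil below (forward difference in time, central differences in space) and the coefficient names $k_1,\dots,k_4$ are assumptions made here for illustration; they are not taken from formulas (3) and (4), and $h$ is a hypothetical symbol for the common space step $\Delta x = \Delta y$.

```latex
% Assumed discretization of the convection-diffusion equation
\frac{c^{n+1}_{i,j}-c^{n}_{i,j}}{\Delta t}
 + u\,\frac{c^{n}_{i+1,j}-c^{n}_{i-1,j}}{2\Delta x}
 + v\,\frac{c^{n}_{i,j+1}-c^{n}_{i,j-1}}{2\Delta y}
 - D\left(\frac{c^{n}_{i+1,j}-2c^{n}_{i,j}+c^{n}_{i-1,j}}{\Delta x^{2}}
        + \frac{c^{n}_{i,j+1}-2c^{n}_{i,j}+c^{n}_{i,j-1}}{\Delta y^{2}}\right) = 0

% With \Delta x = \Delta y = h this collapses to four constant coefficients
k_1\left(c^{n+1}_{i,j}-c^{n}_{i,j}\right)
 + k_2\left(c^{n}_{i+1,j}-c^{n}_{i-1,j}\right)
 + k_3\left(c^{n}_{i,j+1}-c^{n}_{i,j-1}\right)
 - k_4\left(c^{n}_{i+1,j}+c^{n}_{i-1,j}+c^{n}_{i,j+1}+c^{n}_{i,j-1}-4c^{n}_{i,j}\right) = 0,
\qquad
k_1=\frac{1}{\Delta t},\quad k_2=\frac{u}{2h},\quad k_3=\frac{v}{2h},\quad k_4=\frac{D}{h^{2}}
```

Under this assumed form, one coefficient depends only on the time step, two depend on the space step and the assumed velocities, and one depends on the diffusion coefficient and the space step, which matches the dependencies stated in the text.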

[0108] In formula (1), a and b are the weights of the two terms. Due to the non-differentiability of the convection-diffusion term, it cannot be used directly as the loss function, and if its weight is set too large, it may adversely affect gradient descent and backpropagation. Therefore, the coefficient of the MSE term (denoted "a") is set to 1 to ensure a normal training process, and the coefficient "b" is set to 0.5 so that the convection-diffusion term acts as a regularization term that constrains the model's learning process. Setting this coefficient too large may cause the model to underfit, meaning that it cannot fully learn the characteristics of the training data. Since the purpose of this embodiment is not to optimize the physically constrained model, 0.5 is chosen as the training value. The complete training process of the prediction model is represented by the following pseudocode:
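The weighting of the two loss terms can be illustrated with a small sketch. It assumes that formula (1) is a plain weighted sum of the MSE term and the convection-diffusion term, as the text and claim 5 describe; the function names `mse` and `total_loss` are hypothetical, and the convection-diffusion value is passed in as an already computed number.

```python
def mse(pred, target):
    """Mean squared error between two equally sized flat sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def total_loss(pred, target, cd_term, a=1.0, b=0.5):
    """Weighted sum a * MSE + b * convection-diffusion term (weights from the text)."""
    return a * mse(pred, target) + b * cd_term


# MSE of ([1, 2], [1, 4]) is (0 + 4) / 2 = 2.0; the physics term contributes 0.5 * 2.0.
example_loss = total_loss([1.0, 2.0], [1.0, 4.0], cd_term=2.0)
```

Keeping `b` below `a` makes the physics term act as a regularizer rather than as the dominant objective.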

[0109]

[0110] Table 1

[0111] Compared to models trained using only MSE, this method innovatively incorporates the physical constraint of the convection-diffusion equation (a partial differential equation in fluid mechanics) into the precipitation prediction model. Precipitation data within the observation area are highly skewed and imbalanced; in particular, high-frequency signals representing extreme precipitation events are scarce, so conventional loss functions are not optimal. This method uses a physically constrained loss to improve the model's attention and sensitivity to extreme precipitation events, thereby increasing the accuracy of its predictions of such events. By utilizing physical constraints, the method also provides a new direction for future work that combines physical prior knowledge with deep learning in the field of precipitation forecasting.

[0112] In one embodiment of this application, fine-tuning is introduced because the unique features of precipitation events are over-smoothed by current prediction models after long-term training. Typically, the prediction area is dominated by light rain and dry regions, leading deep learning models to fit these data in order to reduce errors. As a result, the precipitation predicted by many deep learning models tends to be lower than the ground truth. However, convective precipitation is crucial in QPN tasks due to its transient variability and destructive impact. Furthermore, different types of precipitation and individual precipitation events have their own causes, including topography, atmospheric circulation, and humidity. Fine-tuning can therefore be used to learn the characteristics of different precipitation events. To meet these requirements and improve the accuracy of model predictions, three different fine-tuning schemes are designed:

[0113] The first fine-tuning scheme attempts to address the imbalance in the original dataset and the under-prediction of high-intensity regions. To mitigate this, only the portions of the original VIL image data with pixel values greater than a threshold (74 or 133) are retained to represent high-intensity regions, while the remaining regions are masked to 0, thus creating two masked datasets. As shown in Figure 4, the original data (first row) contains green low-value areas. The Masked74 dataset removes the parts with smaller pixel values, keeping only the pixels with values greater than 74. The Masked133 dataset removes even more, keeping only the pixels with values greater than 133.
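The masking step can be sketched as follows. This is a minimal illustration on a nested list standing in for one VIL frame; the function name `mask_below` is hypothetical, and the strict "greater than" comparison follows the wording of the text.

```python
def mask_below(image, threshold):
    """Keep pixels strictly greater than `threshold`; set all other pixels to 0."""
    return [[p if p > threshold else 0 for p in row] for row in image]


vil_frame = [[10, 74, 75],
             [133, 134, 200]]

masked74 = mask_below(vil_frame, 74)     # analogue of one Masked74 sample
masked133 = mask_below(vil_frame, 133)   # analogue of one Masked133 sample
```

Applying the same function with the two thresholds to every frame would yield the two masked datasets, each with the same number of frames as the original.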

[0114] As shown in Figure 5, the second fine-tuning scheme studies fine-tuning on subsets of the original data. Three types of precipitation events were separated from the original data: thunderstorms, hail, and floods, containing 1833, 1043, and 385 samples respectively. The SEVIR dataset (a dataset of approximately 15,000 spatially and temporally consistent image sequences generated by the GOES-16 satellite and next-generation weather radar) provides an index file containing the ID, event type, latitude, and longitude of each precipitation event. These event types include the three distinct categories of thunderstorms, hail events, and floods, so it is not necessary to select precipitation events individually. The fine-tuning scheme was applied to each category separately, and its effectiveness was analyzed.
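Grouping events by type from an index can be sketched as below. The record layout is an assumption: the dictionary keys `id` and `event_type` and the label strings are hypothetical stand-ins and are not claimed to match the actual field names or values of the SEVIR index file.

```python
from collections import defaultdict


def group_by_event_type(index_rows, wanted):
    """Map each wanted event type to the list of event IDs carrying that type."""
    groups = defaultdict(list)
    for row in index_rows:
        if row["event_type"] in wanted:
            groups[row["event_type"]].append(row["id"])
    return dict(groups)


# Hypothetical index records for illustration only.
index_rows = [
    {"id": "E1", "event_type": "thunderstorm"},
    {"id": "E2", "event_type": "hail"},
    {"id": "E3", "event_type": "flood"},
    {"id": "E4", "event_type": "tornado"},
    {"id": "E5", "event_type": "hail"},
]

subsets = group_by_event_type(index_rows, {"thunderstorm", "hail", "flood"})
```

Each resulting subset can then be fine-tuned on separately, as the second scheme describes.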

[0115] In the third fine-tuning scheme, fine-tuning is studied for individual events, rather than for a single category or the entire dataset, because each precipitation event is independent and has unique characteristics. Training is performed using the first 120 minutes of data, and inference and evaluation are performed using the next 120 minutes. The SEVIR dataset, with 49 frames per sequence covering a 240-minute time range (one frame every 5 minutes), is well suited to this design. During the training phase of fine-tuning, the five frames between -120 and -100 minutes (inclusive) are used as input features, and the twenty frames between -95 and 0 minutes (inclusive) are used as labels. During the inference phase, the five frames between 0 and 20 minutes are used as input, and the twenty frames between 25 and 120 minutes are used as ground truth values for evaluation.
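The frame windows of the third scheme can be checked with a short index calculation. The sketch assumes 49 frames at 5-minute spacing with relative time 0 at frame index 24, and it assumes the twenty label frames are the ones ending at minute 0 (that is, starting at -95 minutes); the names `frame_indices` and `ORIGIN_INDEX` are hypothetical.

```python
FRAME_INTERVAL_MIN = 5   # 49 frames over 240 minutes
NUM_FRAMES = 49
ORIGIN_INDEX = 24        # assumed frame index of relative time 0


def frame_indices(start_min, end_min):
    """Indices of frames whose relative time lies in [start_min, end_min] (inclusive)."""
    first = ORIGIN_INDEX + start_min // FRAME_INTERVAL_MIN
    last = ORIGIN_INDEX + end_min // FRAME_INTERVAL_MIN
    return list(range(first, last + 1))


train_input = frame_indices(-120, -100)   # 5 input frames for fine-tuning
train_label = frame_indices(-95, 0)       # 20 label frames for fine-tuning
infer_input = frame_indices(0, 20)        # 5 input frames for inference
infer_truth = frame_indices(25, 120)      # 20 ground-truth frames for evaluation
```

Under these assumptions the training windows use only the first half of the sequence, and the evaluation targets lie entirely in the second half.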

[0116] In the first two fine-tuning schemes, all parameters of the model were fine-tuned. For the third scheme, considering computational speed in operational prediction, all trainable parameters except those in the last three blocks were frozen, as shown by the dashed lines in the output module in Figure 2. The last three blocks contain only 55.2k trainable parameters, which significantly improves training speed. For the first two fine-tuning schemes, all samples are divided into 80% for training, 10% for validation, and 10% for testing. For the third fine-tuning scheme, 660 individual precipitation events were selected, and the first 120 minutes of each sequence were used for training, with the last 120 minutes used for inference.
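The 80/10/10 division used for the first two schemes can be sketched as a shuffled split. The shuffling, the fixed seed, and the function name `split_samples` are assumptions; the text states only the proportions.

```python
import random


def split_samples(samples, seed=0):
    """Shuffle and split into 80% training, 10% validation, 10% testing."""
    items = list(samples)
    random.Random(seed).shuffle(items)     # seeded so the split is reproducible
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]         # remainder goes to the test set
    return train, val, test


train_set, val_set, test_set = split_samples(range(100))
```

Integer arithmetic is used for the split sizes so that rounding never drops or duplicates a sample.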

[0117] To demonstrate the rationality of this scheme, a set of experiments is provided:

[0118] All experiments were conducted on a machine with an NVIDIA Quadro RTX 8000 graphics processing unit (GPU), a 24-core Intel Xeon 6248R central processing unit (CPU), and 376 GB of random access memory (RAM). All models were trained using the AdamW optimizer. A cosine annealing learning rate scheduler was used, with an initial learning rate of 1e-3 decaying to 1e-9. Training was performed for up to 100 epochs, using an early stopping policy with a patience of 20 epochs that monitored the loss on the validation set.
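The learning-rate schedule and the stopping rule can be sketched without any deep learning library. The sketch assumes the commonly used closed-form cosine annealing curve over the full 100 epochs and a stopping rule that triggers after 20 consecutive epochs without improvement; the exact scheduler and early-stopping implementation used in the experiments are not stated, and the function names are hypothetical.

```python
import math


def cosine_lr(epoch, total_epochs=100, lr_max=1e-3, lr_min=1e-9):
    """Closed-form cosine annealing from lr_max (epoch 0) to lr_min (final epoch)."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


def stop_epoch(val_losses, patience=20):
    """Epoch index at which training stops, given per-epoch validation losses."""
    best, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch                   # no improvement for `patience` epochs
    return len(val_losses) - 1             # ran through every available epoch


# Validation loss improves until epoch 1 and then plateaus.
losses = [1.0, 0.5] + [0.6] * 30
stopped_at = stop_epoch(losses)
```

In this toy run the best loss occurs at epoch 1, so the rule fires once 20 further epochs pass without improvement.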

[0119] Table 2 below lists the results of the three fine-tuning schemes, evaluated using five CSI thresholds (0.1, 16, 74, 133, and 181). The first row of each fine-tuning scheme shows the inference results of the prediction model on the corresponding dataset. Fine-tuning the prediction model on the Masked74 dataset significantly improved performance in terms of overall prediction error (RMSE), false alarm rate (FAR), and Heidke skill score (HSS) compared to using the prediction model directly on the test set. Similarly, after fine-tuning on the dataset that masks pixel values less than 133, the model also showed a significant improvement in prediction accuracy for convective regions (CSI-181). These results demonstrate the effectiveness of fine-tuning on datasets restricted to specific pixel values. In the second fine-tuning scheme, fine-tuning was found to consistently improve the prediction accuracy and hit rate for the three types of precipitation events (thunderstorms, hail, and floods). In the third fine-tuning scheme, fine-tuning on individual events significantly improved the evaluation metrics, especially for regions with large values. These results directly demonstrate that the fine-tuning schemes can significantly improve the accuracy of the prediction model.
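The threshold-based scores named above can be computed from a contingency table, as in the sketch below. It uses the standard definitions of CSI, FAR, and HSS; treating a pixel as an event when its value is greater than or equal to the threshold is an assumption, as are the function names.

```python
def contingency(pred, truth, threshold):
    """Count hits, false alarms, misses and correct negatives at one threshold."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        p_event, t_event = p >= threshold, t >= threshold
        if p_event and t_event:
            tp += 1
        elif p_event:
            fp += 1
        elif t_event:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(pred, truth, threshold):
    """Return (CSI, FAR, HSS) for flat sequences of pixel values."""
    tp, fp, fn, tn = contingency(pred, truth, threshold)
    csi = tp / (tp + fn + fp) if (tp + fn + fp) else 0.0
    far = fp / (tp + fp) if (tp + fp) else 0.0
    denom = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    hss = 2.0 * (tp * tn - fp * fn) / denom if denom else 0.0
    return csi, far, hss


# One hit, one false alarm, one miss, one correct negative at threshold 74.
csi74, far74, hss74 = scores([80, 80, 10, 10], [80, 10, 80, 10], 74)
```

Repeating the call for each threshold gives one CSI value per threshold, for example a CSI-181 value for convective regions.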

[0120]

[0121] Table 2

[0122] Each precipitation event has a unique generating environment, such as topography, atmospheric circulation, and humidity. Long-term training can only capture the common features of different precipitation events, while the distinctive features of different types of precipitation are ignored. Therefore, transfer learning techniques have broad application prospects. The results in Table 2 show the great potential of fine-tuning techniques, which provide an alternative to stacking complex deep learning modules to improve prediction accuracy. In the first scheme, fine-tuning on the masked datasets resulted in better performance. The second fine-tuning scheme shows that building datasets for specific types of precipitation is valuable. However, for specific types of precipitation such as hail, customized fine-tuning schemes and more feature engineering are needed. In the third fine-tuning scheme, applying fine-tuning techniques to a single precipitation event yielded greater confidence.

[0123] This application provides a precipitation forecasting system with a physically constrained prediction model, comprising, in some embodiments:

[0124] The model building unit is used to build a prediction model and select VIL image samples, where VIL images are radar images of vertically integrated liquid (VIL).

[0125] The model training unit is used to construct the loss function of the prediction model. The prediction model is trained using VIL image samples and the loss function to obtain the trained prediction model. The loss function includes the MSE loss term and the convection-diffusion term.

[0126] The model prediction unit is used to input the measured VIL image into the trained prediction model to obtain prediction images for multiple future frames, which can be used to predict precipitation events.

[0127] It should be noted that the precipitation forecasting system with a physical constraint prediction model in this application embodiment is based on the same inventive concept as the aforementioned precipitation forecasting method with a physical constraint prediction model. Therefore, the precipitation forecasting system with a physical constraint prediction model in this application embodiment corresponds to the aforementioned precipitation forecasting method with a physical constraint prediction model.

[0128] This application also provides an electronic device, which includes:

[0129] At least one memory;

[0130] At least one processor;

[0131] At least one program;

[0132] The program is stored in memory, and the processor executes at least one program to implement the precipitation forecasting method with a physically constrained prediction model described above in this disclosure.

[0133] This electronic device can be any smart terminal, including mobile phones, tablets, personal digital assistants (PDAs), and in-vehicle computers.

[0134] The electronic device according to embodiments of this application is described in detail below with reference to Figure 6. The electronic device includes:

[0135] The processor 1600 can be implemented using a general-purpose central processing unit (CPU), microprocessor, application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute relevant programs to implement the technical solutions provided in the embodiments of this disclosure.

[0136] The memory 1700 can be implemented as a read-only memory (ROM), static storage device, dynamic storage device, or random access memory (RAM). The memory 1700 can store the operating system and other application programs. When the technical solutions provided in the embodiments of this specification are implemented through software or firmware, the relevant program code is stored in the memory 1700 and is called and executed by the processor 1600 to execute the precipitation forecasting method with a physically constrained prediction model according to the embodiments of this disclosure.

[0137] The input / output interface 1800 is used to implement information input and output.

[0138] The communication interface 1900 is used to enable communication and interaction between this device and other devices. Communication can be achieved through wired means (such as USB, Ethernet cable, etc.) or wireless means (such as mobile network, WIFI, Bluetooth, etc.).

[0139] Bus 2000 transmits information between various components of the device (e.g., processor 1600, memory 1700, input / output interface 1800, and communication interface 1900);

[0140] The processor 1600, memory 1700, input / output interface 1800 and communication interface 1900 communicate with each other within the device via bus 2000.

[0141] This disclosure also provides a storage medium, which is a computer-readable storage medium storing computer-executable instructions for causing a computer to execute the above-described precipitation forecasting method with a physically constrained prediction model.

[0142] Memory, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs and non-transitory computer-executable programs. Furthermore, memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory may optionally include memory remotely located relative to the processor, which can be linked to the processor via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

[0143] The embodiments described in this disclosure are for the purpose of more clearly illustrating the technical solutions of this disclosure and do not constitute a limitation on the technical solutions provided by this disclosure. As those skilled in the art will know, with the evolution of technology and the emergence of new application scenarios, the technical solutions provided by this disclosure are also applicable to similar technical problems.

[0144] Those skilled in the art will understand that the technical solutions shown in the figures do not constitute a limitation on the embodiments of this disclosure, and may include more or fewer steps than shown, or combine certain steps, or different steps.

[0145] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.

[0146] Those skilled in the art will understand that all or some of the steps in the methods disclosed above, as well as the functional modules / units in the systems and devices, can be implemented as software, firmware, hardware, or suitable combinations thereof.

[0147] The terms “first,” “second,” “third,” “fourth,” etc. (if present) in the specification and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this application described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms “comprising” and “having,” and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0148] It should be understood that in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And / or" is used to describe the relationship between related objects, indicating that three relationships can exist. For example, "A and / or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one (item) of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one (item) of a, b, or c can represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.

[0149] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication links shown or discussed may be through some interfaces; the indirect coupling or communication links between apparatuses or units may be electrical, mechanical, or other forms.

[0150] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0151] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0152] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes multiple instructions to cause an electronic device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing programs, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0153] The above is a detailed description of the preferred embodiments of this application. However, the embodiments of this application are not limited to the above-described implementation methods. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the embodiments of this application. All such equivalent modifications or substitutions are included within the scope defined by the claims of the embodiments of this application.

Claims

1. A precipitation forecasting method with a physically constrained prediction model, characterized in that the method includes the following steps: constructing a prediction model and selecting VIL image samples, where VIL images are radar images of vertically integrated liquid; constructing a loss function for the prediction model, and training the prediction model using the VIL image samples and the loss function to obtain a trained prediction model; the loss function includes an MSE loss term and a convection-diffusion term; the convection-diffusion term is expressed by a formula in which the coefficients are constants, one of which is the coefficient of the diffusion term; the transported quantity is the VIL image; the superscript represents time, with the next moment following the current moment; one subscript denotes grid points in one grid direction and the other subscript denotes grid points in the other grid direction; and, for a given grid point, the formula uses the previous grid point and the next grid point in each of the two grid directions; the prediction model includes: an input module consisting of a 3×3 convolutional layer; an encoding module including three cascaded encoders, each comprising multiple stacked visual transformers and a downsampling layer, wherein the visual transformer includes cascaded multi-head squared attention and a feedforward neural network; the processing of the feature map by the multi-head squared attention is expressed by formulas whose symbols denote, respectively: the input features of the multi-head squared attention, the output features of the multi-head squared attention, a 1×1 point-wise convolution, a 3×3 depthwise convolution, the query, key, and value matrices obtained by transforming the input features of the multi-head squared attention, trainable parameters, a 1×1 convolution, the activation function, the attention mechanism, and the features after unbiased layer normalization; before the output features of the multi-head squared attention are input into the feedforward neural network, unbiased layer normalization is applied to them to obtain layer-normalized features; the feature processing of the feedforward neural network includes: inputting the layer-normalized features into a point-wise convolution and a depthwise convolution in a first path, and activating the output features of that depthwise convolution with the GELU function; inputting the layer-normalized features into a point-wise convolution and a depthwise convolution in a second path, and activating the output features of that depthwise convolution with the sigmoid function; calculating the Hadamard product of the GELU-activated features and the sigmoid-activated features; and inputting the Hadamard product into a 1×1 convolution, the output features of which are skip-connected with the output features of the multi-head squared attention to obtain the output features of the feedforward neural network; a central module consisting of multiple stacked visual transformers; a decoding module including three cascaded decoders, each of which sequentially includes an upsampling layer, a convolutional layer, and multiple stacked visual transformers, wherein the output features of each encoder are skip-connected to the output features of the upsampling layer of a corresponding decoder; and an output module including multiple stacked visual transformers, two convolutional layers, and a Swish activation layer, wherein the feature processing in the output module includes: inputting the output features of the decoding module into the first of the stacked visual transformers; inputting the output features of the last of the stacked visual transformers into the first convolutional layer to obtain the output features of the first convolutional layer; connecting the output features of the first convolutional layer to the input features of the input module using a skip connection; inputting the skip-connected features into the second convolutional layer to obtain the output features of the second convolutional layer; and activating the output features of the second convolutional layer using the Swish activation layer to obtain the output features of the output module; and inputting the measured VIL image into the trained prediction model to obtain prediction images for multiple future frames, which are then used to predict precipitation events.

2. The precipitation forecasting method with a physically constrained prediction model according to claim 1, characterized in that, when training the prediction model, the method further includes: deriving a first mask sample set and a second mask sample set from the VIL image samples, where the number of VIL images in each mask sample set is the same as the number of VIL image samples; the VIL images in the first mask sample set retain only feature tensors with pixel values greater than 74, and the VIL images in the second mask sample set retain only feature tensors with pixel values greater than 133; and training the prediction model based on the first mask sample set and the second mask sample set.

3. The precipitation forecasting method with a physically constrained prediction model according to claim 1, characterized in that, when training the prediction model, the method further includes: selecting precipitation events tagged as thunderstorms, hail, and floods from the VIL image samples; and training the prediction model based on the precipitation events tagged as thunderstorms, hail, and floods.

4. The precipitation forecast method with a physically constrained prediction model according to claim 1, characterized in that, When training the prediction model, the precipitation forecasting method with physical constraints further includes: The prediction model is trained based on the first N / 2 minutes of VIL image sequences in the VIL image samples, and the prediction model is evaluated based on the last N / 2 minutes of VIL image sequences in the VIL image samples; N is the length of the VIL image sequence in the VIL image samples.

5. The precipitation forecast method with a physically constrained prediction model according to claim 1, characterized in that, The loss function consists of a weighted sum of an MSE loss term and a convection-diffusion term, wherein the weight of the MSE loss term is 1 and the weight of the convection-diffusion term is 0.5.

6. A precipitation forecasting system with a physically constrained prediction model, characterized in that the system includes: a model building unit used to build a prediction model and select VIL image samples, where VIL images are radar images of vertically integrated liquid; a model training unit used to construct a loss function for the prediction model, and to train the prediction model using the VIL image samples and the loss function to obtain a trained prediction model; the loss function includes an MSE loss term and a convection-diffusion term; the convection-diffusion term is expressed by a formula in which the coefficients are constants, one of which is the coefficient of the diffusion term; the transported quantity is the VIL image; the superscript represents time, with the next moment following the current moment; one subscript denotes grid points in one grid direction and the other subscript denotes grid points in the other grid direction; and, for a given grid point, the formula uses the previous grid point and the next grid point in each of the two grid directions; the prediction model includes: an input module consisting of a 3×3 convolutional layer; an encoding module including three cascaded encoders, each comprising multiple stacked visual transformers and a downsampling layer, wherein the visual transformer includes cascaded multi-head squared attention and a feedforward neural network; the processing of the feature map by the multi-head squared attention is expressed by formulas whose symbols denote, respectively: the input features of the multi-head squared attention, the output features of the multi-head squared attention, a 1×1 point-wise convolution, a 3×3 depthwise convolution, the query, key, and value matrices obtained by transforming the input features of the multi-head squared attention, trainable parameters, a 1×1 convolution, the activation function, the attention mechanism, and the features after unbiased layer normalization; before the output features of the multi-head squared attention are input into the feedforward neural network, unbiased layer normalization is applied to them to obtain layer-normalized features; the feature processing of the feedforward neural network includes: inputting the layer-normalized features into a point-wise convolution and a depthwise convolution in a first path, and activating the output features of that depthwise convolution with the GELU function; inputting the layer-normalized features into a point-wise convolution and a depthwise convolution in a second path, and activating the output features of that depthwise convolution with the sigmoid function; calculating the Hadamard product of the GELU-activated features and the sigmoid-activated features; and inputting the Hadamard product into a 1×1 convolution, the output features of which are skip-connected with the output features of the multi-head squared attention to obtain the output features of the feedforward neural network; a central module consisting of multiple stacked visual transformers; a decoding module including three cascaded decoders, each of which sequentially includes an upsampling layer, a convolutional layer, and multiple stacked visual transformers, wherein the output features of each encoder are skip-connected to the output features of the upsampling layer of a corresponding decoder; and an output module including multiple stacked visual transformers, two convolutional layers, and a Swish activation layer, wherein the feature processing in the output module includes: inputting the output features of the decoding module into the first of the stacked visual transformers; inputting the output features of the last of the stacked visual transformers into the first convolutional layer to obtain the output features of the first convolutional layer; connecting the output features of the first convolutional layer to the input features of the input module using a skip connection; inputting the skip-connected features into the second convolutional layer to obtain the output features of the second convolutional layer; and activating the output features of the second convolutional layer using the Swish activation layer to obtain the output features of the output module; and a model prediction unit used to input the measured VIL image into the trained prediction model to obtain prediction images for multiple future frames, which are then used for precipitation event prediction.

7. An electronic device, comprising: at least one memory; at least one processor; and at least one computer program; wherein the computer program is stored in the memory, and the processor executes the at least one computer program to implement the precipitation forecasting method with a physically constrained prediction model as described in any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores computer-executable instructions for causing a computer to perform: The precipitation forecasting method with a physical constraint prediction model as described in any one of claims 1 to 5.