A method for modeling delay differential equations based on Bayesian optimization and neural networks

By combining Bayesian optimization with neural networks, the method addresses the high computational cost and low identification accuracy of delay differential equation modeling. It achieves efficient and accurate delay parameter identification and error quantification, and can adapt to the nonlinear dynamic characteristics of complex systems.

CN121920244B · Active · Publication Date: 2026-06-30 · CHONGQING DIDA IND TECH RES INST CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: CHONGQING DIDA IND TECH RES INST CO LTD
Filing Date: 2026-03-25
Publication Date: 2026-06-30


IPC: G06F30/27; G06N7/01; G06F119/02



Abstract

This application belongs to the field of delay differential equation modeling and discloses a delay differential equation modeling method based on Bayesian optimization and neural networks. The method includes: generating multiple trajectories of system data; searching for the optimal delay term with a Bayesian optimization algorithm, in which the delay term to be solved is the optimization variable, the neural-network-based simulation error is the objective function, and the optimal delay term is obtained through iterative optimization with a surrogate model; constructing a state matrix and a delay matrix from the trajectory data based on the optimal delay term, and using the concatenated matrix as the input of a neural network that approximates the nonlinear function; integrating a linear multistep method into the loss function of the neural network and training the network on the trajectory data to obtain a nonlinear function approximation model; and, based on this model, separating the numerical discretization error from the neural network approximation error to construct a total error bound. The method achieves high-precision, high-efficiency delay differential equation modeling with a quantified error guarantee.


Description

Technical Field

[0001] This application belongs to the field of delay differential equation modeling and, more specifically, relates to a delay differential equation modeling method based on Bayesian optimization and neural networks.

Background Technology

[0002] Delay differential equations (DDEs) are widely used in biology, mechanical engineering, and electrical systems to describe the dynamic dependence of a system's current state on its historical states. The core of DDE model identification is the accurate recovery of delay parameters, system parameters, and unknown nonlinear functions, which directly affects how accurately the evolution of a real system can be characterized and controlled. In recent years, data-driven methods, which do not rely on prior system knowledge and can uncover the underlying dynamics from observational data alone, have become a core research direction for DDE identification. However, existing technologies still have shortcomings and cannot meet the requirements of high-precision, high-efficiency identification in complex scenarios.

[0003] Some data-driven methods (such as early extensions of SINDy) use brute-force search over candidate delay values, which requires repeated calls to the core reconstruction algorithm and results in extremely high computational cost. Methods based on time-delay neural networks (TDNNs) support joint training of the delay terms and network parameters, but are prone to spurious convergence when the data is noisy or the sampling step is large, which significantly reduces the accuracy of delay parameter identification. Methods based on sparse identification (SINDy) rely on a predefined function library to construct sparse combinations: if the library does not contain the system's true nonlinear terms, identification accuracy drops sharply, while an excessively large library leads to overfitting and redundant computation. Methods based on neural delay differential equations (NDDEs) have strong nonlinear expressive power, but their high model complexity and stringent requirements on data quantity and quality make them difficult to adapt to the complex and diverse nonlinear dynamics of real-world systems. Existing data-driven methods are mostly designed for a single equation with a specific parameter configuration, so the model must be retrained for each parameter combination, which is inefficient. Furthermore, the coupling between numerical discretization error and model approximation error in data-driven identification affects the final accuracy, yet existing technologies do not systematically quantify these two types of error, which reduces reliability in practical applications.

[0004] Therefore, how to achieve high-precision, high-efficiency modeling of delay differential equations with a quantified error guarantee is an urgent problem to be solved.

Summary of the Invention

[0005] To address the shortcomings of existing technologies, the purpose of this application is to provide a delay differential equation modeling method based on Bayesian optimization and neural networks, which can achieve high-precision delay parameter identification, adaptive approximation of nonlinear functions, and systematic error quantification analysis in the delay differential equation modeling process.

[0006] To achieve the above objectives, in a first aspect, this application provides a method for modeling delay differential equations based on Bayesian optimization and neural networks, comprising the following steps:

[0007] S10: generating multiple trajectories of data for the delay differential equation system;

[0008] S20: searching for the optimal delay term based on the trajectory data using a Bayesian optimization algorithm, wherein the Bayesian optimization algorithm uses the delay term to be solved as the optimization variable and the neural-network-based simulation error as the objective function, and obtains the optimal delay term through iterative optimization with a surrogate model;

[0009] S30: based on the optimal delay term, constructing a state matrix and a delay matrix from the trajectory data, and using the concatenation of the state matrix and the delay matrix as the input of a neural network constructed to approximate the nonlinear function in the delay differential equation;

[0010] S40: incorporating a linear multistep method into the loss function of the neural network, and training the neural network on the trajectory data to obtain a nonlinear function approximation model;

[0011] S50: based on the nonlinear function approximation model, separating the numerical discretization error from the neural network approximation error in the identification of the delay differential equation, and constructing the total error bound.

[0012] As a further preferred embodiment, in step S20, the Bayesian optimization algorithm specifically includes:

[0013] Construct a black-box function optimization framework using Gaussian processes as surrogate models;

[0014] Using expected improvement as the acquisition function, candidate delay values are selected from a preset search space;

[0015] For each candidate delay value, a subset is randomly selected from the trajectory data, and a neural network with two hidden layers is trained on the subset. The trained neural network is used to perform forward trajectory simulation on a given initial state. The root mean square error between the simulated trajectory and the real trajectory is calculated as the simulation error, and the simulation error is used as the output value of the black box function.

[0016] Update the Gaussian process surrogate model based on the current candidate delay values and their corresponding simulation errors;

[0017] The optimization process continues until the preset number of evaluations is reached, and the candidate delay value corresponding to the minimum simulation error obtained during the iteration is determined as the optimal delay term.

[0018] As a further preferred embodiment, the kernel function of the Gaussian process is a squared exponential kernel, whose expression is $k(x, x') = \sigma_f^2 \exp\left(-\frac{\|x - x'\|^2}{2\ell^2}\right)$, where $\sigma_f$ and $\ell$ are the hyperparameters of the kernel function, which control the magnitude of the function value change and the length scale of the input features, respectively, and $x$ and $x'$ are any two distinct points in the input space.
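The squared exponential kernel named above can be sketched in a few lines. This is a minimal illustration of the standard form $\sigma_f^2 \exp(-(x - x')^2 / (2\ell^2))$ for scalar inputs such as candidate delay values; the function names `squared_exponential` and `gram_matrix` and the default hyperparameter values are assumptions made for the example, not taken from the application.

```python
import math


def squared_exponential(x1, x2, sigma_f=1.0, length=1.0):
    """Standard squared exponential (RBF) kernel for scalar inputs.

    sigma_f scales the amplitude of function values; length is the
    input length scale. Both defaults are illustrative assumptions.
    """
    return sigma_f ** 2 * math.exp(-((x1 - x2) ** 2) / (2.0 * length ** 2))


def gram_matrix(points, sigma_f=1.0, length=1.0):
    """Kernel (Gram) matrix over a list of scalar points."""
    return [[squared_exponential(a, b, sigma_f, length) for b in points]
            for a in points]
```

The kernel equals $\sigma_f^2$ when the two points coincide and decays smoothly as they move apart, which is what lets the surrogate model interpolate between evaluated delay values.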

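[0019] As a further preferred embodiment, the expression of the expected improvement acquisition function, written for minimization, is $\mathrm{EI}(x) = (g^{*} - \mu(x))\,\Phi(Z) + \sigma(x)\,\varphi(Z)$ with $Z = \frac{g^{*} - \mu(x)}{\sigma(x)}$, where $\mu(x)$ and $\sigma(x)$ denote the posterior mean and standard deviation of the Gaussian process at point $x$; $\Phi$ and $\varphi$ denote the cumulative distribution function and probability density function of the standard normal distribution, respectively; and $g^{*}$ is the best value observed so far.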
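As a hedged illustration of the expected improvement acquisition function, the sketch below evaluates the standard closed form for a minimization problem from a posterior mean, a posterior standard deviation, and the best value observed so far. The sign convention (minimization) and the names `expected_improvement`, `normal_cdf`, and `normal_pdf` are assumptions; the application's own notation was not recoverable.

```python
import math


def normal_pdf(z):
    """Probability density function of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """Expected improvement for minimization: E[max(best - f, 0)], f ~ N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    return (best - mu) * normal_cdf(z) + sigma * normal_pdf(z)
```

The first term rewards points whose predicted error is already below the incumbent (exploitation), and the second rewards points where the surrogate is uncertain (exploration).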

[0020] As a further preferred embodiment, in step S30, the neural network is a fully connected neural network containing three hidden layers and employing the Tanh activation function.

[0021] As a further preferred embodiment, in step S10, the trajectory data is generated on an ultrafine grid using a fourth-order Runge-Kutta (RK4) scheme.
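Trajectory generation with a fourth-order Runge-Kutta scheme on a fine grid might look like the following sketch. It assumes a scalar state, a constant delay that is an integer multiple of the step size, and linear interpolation of stored states at half-steps; the function names and the test equation $\dot{x}(t) = -x(t - \tau)$ used in the usage note are hypothetical and do not come from the application.

```python
def delayed_state(xs, history, dt, half_index):
    """State at time (half_index / 2) * dt, with time measured in half-steps.

    Times at or before zero come from the history function; half-step
    times after zero are linearly interpolated from stored grid values.
    """
    if half_index <= 0:
        return history(0.5 * half_index * dt)
    lo = half_index // 2
    if half_index % 2 == 0:
        return xs[lo]
    return 0.5 * (xs[lo] + xs[lo + 1])


def rk4_dde(f, history, tau, dt, n_steps):
    """Integrate x'(t) = f(x(t), x(t - tau)) with classical RK4 on a uniform grid."""
    k = round(tau / dt)
    if k < 1 or abs(k * dt - tau) > 1e-12:
        raise ValueError("tau must be a positive integer multiple of dt")
    xs = [history(0.0)]
    for n in range(n_steps):
        x = xs[n]
        base = 2 * (n - k)  # delayed time t_n - tau, in half-steps
        d0 = delayed_state(xs, history, dt, base)
        d1 = delayed_state(xs, history, dt, base + 1)
        d2 = delayed_state(xs, history, dt, base + 2)
        k1 = f(x, d0)
        k2 = f(x + 0.5 * dt * k1, d1)
        k3 = f(x + 0.5 * dt * k2, d1)
        k4 = f(x + dt * k3, d2)
        xs.append(x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0)
    return xs
```

For example, `rk4_dde(lambda x, xd: -xd, lambda t: 1.0, 1.0, 0.01, 200)` integrates the hypothetical test equation with constant history 1 over two delay intervals. The linear interpolation at half-steps limits the formal accuracy of the scheme, which is one reason a very fine grid is a sensible choice when generating reference data.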

[0022] As a further preferred embodiment, in step S40, the linear multistep method is the Adams-Bashforth linear multistep method, whose general form is $x_{n+1} = x_n + \Delta t \sum_{j=0}^{s-1} \beta_j f_{n-j}$, where $s$ is the order of the linear multistep method; $\Delta t$ is the time step; $\beta_j$ are the coefficients of the linear multistep method; and $f_{n-j}$ denotes the right-hand side evaluated at step $n-j$;
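For reference, the explicit Adams-Bashforth update can be sketched as below, using the standard textbook coefficients for orders 1 to 4. The names `AB_COEFFS` and `ab_step` are illustrative assumptions.

```python
# Standard Adams-Bashforth coefficients; index 0 multiplies the most recent
# right-hand-side value f_n, index 1 multiplies f_{n-1}, and so on.
AB_COEFFS = {
    1: [1.0],
    2: [3.0 / 2.0, -1.0 / 2.0],
    3: [23.0 / 12.0, -16.0 / 12.0, 5.0 / 12.0],
    4: [55.0 / 24.0, -59.0 / 24.0, 37.0 / 24.0, -9.0 / 24.0],
}


def ab_step(x_n, f_hist, dt, order):
    """One explicit Adams-Bashforth step.

    f_hist[0] is the right-hand side at step n, f_hist[1] at step n-1, etc.
    """
    betas = AB_COEFFS[order]
    if len(f_hist) < order:
        raise ValueError("need at least `order` stored right-hand-side values")
    return x_n + dt * sum(b * fv for b, fv in zip(betas, f_hist))
```

Each coefficient set sums to 1, which is the consistency condition for these methods. AB1 is the forward Euler method.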

[0023] The loss function is defined in terms of the following quantities: $M$, the number of trajectories generated; $N$, the number of time steps; $\tau$, a real number representing the length of the time lag in the system and reflecting the dependence of the current state on the historical state; the system state of the $m$-th trajectory at time step $n+1$; the coefficients of the linear multistep method; the time step $\Delta t$; and the value of the nonlinear function output by the neural network.
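One plausible reading of a loss that embeds a linear multistep method is a mean squared residual of the Adams-Bashforth update over all trajectories and time steps. The sketch below illustrates that idea only; the exact form and normalization in the application are not recoverable from the text. It assumes scalar states, a delay expressed as an integer number of steps (`delay_steps`), and any callable `model(x, x_delayed)` standing in for the neural network; all names are hypothetical.

```python
AB_COEFFS = {
    1: [1.0],
    2: [3.0 / 2.0, -1.0 / 2.0],
    3: [23.0 / 12.0, -16.0 / 12.0, 5.0 / 12.0],
    4: [55.0 / 24.0, -59.0 / 24.0, 37.0 / 24.0, -9.0 / 24.0],
}


def ab_loss(model, trajectories, dt, delay_steps, order):
    """Mean squared Adams-Bashforth residual over all trajectories and steps."""
    betas = AB_COEFFS[order]
    total, count = 0.0, 0
    # Earliest step n for which every delayed, lagged sample exists.
    start = delay_steps + order - 1
    for traj in trajectories:
        for n in range(start, len(traj) - 1):
            increment = sum(
                b * model(traj[n - j], traj[n - j - delay_steps])
                for j, b in enumerate(betas)
            )
            residual = traj[n + 1] - traj[n] - dt * increment
            total += residual * residual
            count += 1
    if count == 0:
        raise ValueError("trajectories are too short for this delay and order")
    return total / count
```

In a real training loop the `model` callable would be a differentiable network and this residual would be minimized with a gradient-based optimizer; here the function only evaluates the loss.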

[0024] As a further preferred embodiment, the method further comprises:

[0025] By adding a system parameter vector to the input layer of the neural network, and training the network based on a composite dataset consisting of trajectory data under multiple different system parameters, a class of delay differential equations can be learned in a single training session.
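The multi-parameter idea can be illustrated by how the network input is assembled: each input row carries the current state, the delayed state, and the system parameter vector of the trajectory it came from. The sketch below assumes scalar states and a delay given as an integer number of grid steps; `build_parametric_inputs` is a hypothetical helper name.

```python
def build_parametric_inputs(param_trajectories, delay_steps):
    """Build input rows [x_n, x_{n - delay_steps}, *params] for a composite dataset.

    param_trajectories is a list of (param_vector, trajectory) pairs, where
    param_vector is a tuple of system parameters and trajectory is a list
    of scalar states on a uniform time grid.
    """
    rows = []
    for params, traj in param_trajectories:
        for n in range(delay_steps, len(traj)):
            rows.append([traj[n], traj[n - delay_steps], *params])
    return rows
```

Because the parameter values are part of the input, one trained network can be queried at parameter values that did not appear in training, which is the generalization the application evaluates on its untrained parameters.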

[0026] As a further preferred embodiment, in step S50, the total error bound is expressed in terms of the following quantities: a constant $C_1$ related to the coefficients of the linear multistep method and to the higher-order derivatives; constants $C_2$ and $L$ related to the length of the time interval and to the Lipschitz constant; a further constant; the time step $h$; the order of the linear multistep method; and an increasing function.

[0027] In a second aspect, this application provides a delay differential equation modeling system based on Bayesian optimization and neural networks, used to implement the delay differential equation modeling method based on Bayesian optimization and neural networks described above, including:

[0028] The data generation module is used to generate multiple trajectory data for a delay differential equation system.

[0029] The delay optimization module is used to search for the optimal delay term based on the trajectory data using a Bayesian optimization algorithm. The Bayesian optimization algorithm uses the delay term to be solved as the optimization variable and the error based on neural network simulation as the objective function, and obtains the optimal delay term through iterative optimization using a surrogate model.

[0030] The network construction module is used to construct a state matrix and a delay matrix from the trajectory data based on the optimal delay term, and to use the concatenated matrix of the state matrix and the delay matrix as the input of the neural network to construct the neural network to approximate the nonlinear function in the delay differential equation.

[0031] The model training module is used to incorporate the linear multi-step method into the loss function of the neural network, and to train the neural network using the trajectory data to obtain a nonlinear function approximation model.

[0032] The error analysis module is used to separate the numerical discretization error from the neural network approximation error in the identification of the delay differential equation based on the nonlinear function approximation model, and to construct the total error bound.

[0033] To address the significant shortcomings of existing methods for identifying delay differential equations (DDEs) in delay parameter optimization, nonlinear approximation, multi-parameter generalization, and error control, this application proposes a systematic solution with the following beneficial effects:

[0034] (1) Efficient and accurate identification of delay parameters: A Bayesian optimization-driven delay parameter search strategy is adopted, which eliminates the need for brute-force traversal of candidate values, greatly reducing computational costs and effectively avoiding local optima. By training a small neural network and designing an embedded simulation error as the objective function, pseudo-convergence is avoided, further improving the identification accuracy.

[0035] (2) Optimize the approximation performance of nonlinear functions: Utilizing the strong expressive power of neural networks, complex and diverse nonlinear terms in biological, mechanical, and other systems can be adaptively captured without the need for a pre-defined function library, thus solving the "function library dependency" problem of sparse identification methods. The Adams-Bashforth (AB) series of linear multi-step methods are innovatively integrated into the loss function of neural networks. By constraining the dynamic correlation between multi-step historical states and the current state, the neural network is guided to focus on learning the evolution law of real systems.

[0036] (3) Enhance multi-parameter generalization capability: Design a multi-parameter joint learning framework. By embedding parameter vectors, it is possible to achieve unified identification of a class of DDE equations without repeatedly training the model for different parameter combinations, and efficiently adapt to the identification needs of parameter fluctuations in actual systems.

[0037] (4) Establish a systematic error quantification framework: the numerical discretization error and the neural network approximation error are quantified systematically, and a quantitative error bound is established through rigorous mathematical derivation.

Description of the Drawings

[0038] Figure 1 is an overall framework diagram of the delay differential equation approximation model provided in the embodiments of this application;

[0039] Figure 2 is a diagram illustrating the optimal delay identified by Bayesian optimization, as provided in the embodiments of this application;

[0040] Figure 3 shows comparison diagrams of trajectory prediction using different AB methods, as provided in the embodiments of this application, where (a) corresponds to the AB1 method, (b) to the AB2 method, (c) to the AB3 method, and (d) to the AB4 method;

[0041] Figure 4 shows comparison diagrams of the neural network approximation function and the real function, as provided in the embodiments of this application, where (a) corresponds to the AB1 method, (b) to the AB2 method, (c) to the AB3 method, and (d) to the AB4 method;

[0042] Figure 5 is a diagram illustrating the optimal delay identified by Bayesian optimization in a multi-parameter scenario, as provided in the embodiments of this application;

[0043] Figure 6 shows comparison diagrams of trajectory prediction by the neural network for the training parameters in a multi-parameter scenario, as provided in the embodiments of this application, where (a) corresponds to parameter 1.0, (b) to parameter 1.2, (c) to parameter 1.5, and (d) to parameter 1.8;

[0044] Figure 7 shows comparison diagrams of trajectory prediction by the neural network for untrained parameters in a multi-parameter scenario, as provided in the embodiments of this application, where (a) corresponds to parameter 0.9, (b) to parameter 1.6, and (c) to parameter 2.0;

[0045] Figure 8 shows comparison diagrams of function approximation by the neural network for the training parameters in a multi-parameter scenario, as provided in the embodiments of this application, where (a) corresponds to parameter 1.0, (b) to parameter 1.2, (c) to parameter 1.5, and (d) to parameter 1.8;

[0046] Figure 9 shows comparison diagrams of function approximation by the neural network for untrained parameters in a multi-parameter scenario, as provided in the embodiments of this application, where (a) corresponds to parameter 0.9, (b) to parameter 1.6, and (c) to parameter 2.0.

Detailed Implementation

[0047] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0048] Delay differential equations (DDEs) are a class of mathematical models that describe the relationship between the current state and past states of a system, and are widely used in fields such as biology and mechanical engineering. These equations are characterized by a time delay term, and their general form can be expressed as:

[0049] $\dot{x}(t) = f(x(t), x(t-\tau))$ (1)

[0050] where $x(t)$ is the state of the system at the current time $t$, and $x(t-\tau)$ is the state of the system at the past time $t-\tau$. $\tau$ is a real number, usually positive, representing the length of the time lag in the system and reflecting the dependence of the current state on the historical state. $f$ defines the mapping that governs the system behavior.

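[0051] To further illustrate the initial value problem (IVP) of equation (1), let $\tau \ge 0$ be a real number and $n$ a positive integer, and let $C = C([-\tau, 0], \mathbb{R}^n)$ denote the Banach space of all continuous functions mapping the interval $[-\tau, 0]$ into the $n$-dimensional space $\mathbb{R}^n$, equipped with the uniform norm:

[0052] $\|\phi\| = \sup_{\theta \in [-\tau, 0]} |\phi(\theta)|$

[0053] where $|\cdot|$ is any norm in $\mathbb{R}^n$. The classic DDE can be represented as:

[0054] $\dot{x}(t) = F(x_t)$ (2)

[0055] where $F: C \to \mathbb{R}^n$ is a continuous functional and $x_t \in C$ is defined as:

[0056] $x_t(\theta) = x(t+\theta), \quad \theta \in [-\tau, 0]$

[0057] The initial value problem of formula (2) is determined by a given function $\phi \in C$, meaning that the solution satisfies:

[0058] $x_0 = \phi$, that is, $x(\theta) = \phi(\theta)$ for $\theta \in [-\tau, 0]$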

[0059] Under the standard Lipschitz continuity condition, the existence and uniqueness of solutions to DDEs are fundamental results in the theory of functional differential equations.

[0060] Theorem 1 (Existence and uniqueness of solutions): Consider the nonlinear delay differential equation:

[0061] $\dot{x}(t) = f(x(t), x(t-\tau)),\ t \in [0, T]; \quad x(t) = \phi(t),\ t \in [-\tau, 0]$ (3)

[0062] where $[0, T]$ is a finite time interval, $\phi$ represents the continuous initial history of the system, and $f$ is the nonlinear function governing the dynamic behavior of the system.

[0063] Assume the following two conditions are met:

[0064] (1) Continuity: the function $f$ is continuous on its domain;

[0065] (2) Lipschitz condition: there exists a constant $L > 0$ such that for any $x_1, x_2$ and $y_1, y_2$:

[0066] $\|f(x_1, y_1) - f(x_2, y_2)\| \le L\,(\|x_1 - x_2\| + \|y_1 - y_2\|)$

[0067] where $x_1, x_2$ are different values of the system state of equation (3) at the current time $t$, and $y_1, y_2$ are different values of the system state at the past time $t-\tau$.

[0068] Then equation (3) has a unique solution $x(t)$ on $[0, T]$.

[0069] Linear multistep methods (LMMs) are classic methods for the numerical solution of differential equations. Their core principle is to use the states and vector-field values of the preceding $s$ steps to construct the numerical solution at the current step, thereby reducing the discretization error. To address the discretization requirements of the delay differential equation, this embodiment employs the explicit Adams-Bashforth (AB) series of linear multistep methods, whose general form is:

[0070] $x_{n+1} = x_n + \Delta t \sum_{j=0}^{s-1} \beta_j f_{n-j}$ (4)

[0071] where $s$ is the order of the linear multistep method, with $s = 1, 2, 3, 4$ selected in this embodiment (referred to as AB1 to AB4); $\Delta t$ is the time step; $\beta_j$ are the coefficients of the linear multistep method; and $f_{n-j}$ denotes the right-hand side evaluated at step $n-j$.

[0072] Bayesian optimization (BO) is a method for optimizing computationally expensive black-box functions. It is a sequential, model-based optimization framework designed to find the global optimum. It approaches the global optimum with as few function evaluations as possible by trading off "exploring" unknown regions against "exploiting" regions already known to be good. Mathematically, this embodiment seeks the global minimum (or maximum) point of an unknown objective function $g$:

[0073] $x^{*} = \arg\min_{x \in \mathcal{X}} g(x)$

[0074] where $\mathcal{X}$ is the search space of interest, and the computationally expensive black-box function $g$ is optimized without any knowledge of its derivatives.

[0075] The BO framework mainly consists of two core probabilistic components: a probabilistic model of the objective function $g$ and an acquisition function used to determine the sampling points. The probabilistic model reflects the current knowledge of the objective function and is typically a Gaussian process (GP). A GP treats the unknown objective function $g$ as a random process fully defined by a mean function $m(x)$ and a covariance function (or kernel function) $k(x, x')$:

[0076] $g(x) \sim \mathcal{GP}(m(x), k(x, x'))$

[0077] where the kernel function $k(x, x')$ measures the correlation between the function values at two points $x$ and $x'$. A commonly used choice is the squared exponential kernel (or RBF kernel):

[0078] $k(x, x') = \sigma_f^2 \exp\left(-\frac{\|x - x'\|^2}{2\ell^2}\right)$

[0079] where $\sigma_f$ and $\ell$ are the hyperparameters of the kernel function, which control the magnitude of the function value change and the length scale of the input features, respectively.

[0080] Given an observed dataset $\mathcal{D} = (X, y)$, the GP uses Bayes' theorem to update its belief about the unknown function $g$. This enables this embodiment to infer, for any new set of test inputs $X_*$, the posterior predictive distribution of the corresponding outputs $g_*$. Assuming the evaluations are free of observation noise, the posterior predictive distribution is a multivariate Gaussian distribution:

[0081] $g_* \mid X_*, X, y \sim \mathcal{N}(\mu_*, \Sigma_*)$

[0082] where $X$ is the set of observed inputs and $y$ is the vector of observed outputs. The mean vector $\mu_*$ and covariance matrix $\Sigma_*$ of this distribution are:

[0083] $\mu_* = m(X_*) + K(X_*, X)\,K(X, X)^{-1}\,(y - m(X)), \quad \Sigma_* = K(X_*, X_*) - K(X_*, X)\,K(X, X)^{-1}\,K(X, X_*)$

[0084] where $K(\cdot, \cdot)$ is a matrix whose elements are computed by the kernel function $k$.

[0085] After the GP model provides a probabilistic description of the objective function, this embodiment requires a strategy to determine the next most valuable evaluation point. This strategy is implemented using an easily computed acquisition function. This embodiment uses the classic Expected Improvement (EI) as the acquisition function.

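[0086] Let $g^{*}$ denote the best value observed so far. The EI function represents the expected amount by which a new evaluation improves on $g^{*}$ and, written for minimization, is expressed as:

[0087] $\mathrm{EI}(x) = (g^{*} - \mu(x))\,\Phi(Z) + \sigma(x)\,\varphi(Z), \quad Z = \frac{g^{*} - \mu(x)}{\sigma(x)}$

[0088] where $\mu(x)$ and $\sigma(x)$ denote the posterior mean and standard deviation of the GP at point $x$, and $\Phi$ and $\varphi$ denote the cumulative distribution function and probability density function of the standard normal distribution, respectively. The next evaluation point is selected by maximizing the acquisition function:

[0089] $x_{\text{next}} = \arg\max_{x \in \mathcal{X}} \mathrm{EI}(x)$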

[0090] In this embodiment, the Bayesian optimization framework is applied to find the optimal delay $\tau^{*}$, as shown in Figure 2. The goal of Bayesian optimization is to find a $\tau$ such that the dynamic model learned from the delay term $\tau$ reproduces the real evolution of the system as accurately as possible. This embodiment quantifies the accuracy of the model through an embedded simulation error evaluation. Specifically, for each candidate value $\tau$, this embodiment performs the following steps:

[0091] 1. Train a small network: randomly sample a subset of the trajectory data and quickly train a small neural network model on that subset, so that it learns the dynamics under the current $\tau$.

[0092] 2. Forward simulation: using the trained small network and the current $\tau$, perform a forward trajectory simulation from a given initial state.

[0093] 3. Compute the error: compare the simulated trajectory with the actual trajectory data and compute the root mean square error (RMSE) between the two, denoted $E(\tau)$.
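The forward simulation and RMSE steps above can be sketched as follows. To keep the example short and runnable, the small-network training step is skipped: `model` is any callable `model(x, x_delayed)` standing in for the trained network, the rollout uses forward Euler, and the delay is an integer number of grid steps. These simplifications and all function names are assumptions for illustration.

```python
import math


def simulate(model, history, delay_steps, dt, n_steps):
    """Forward Euler rollout of x' = model(x(t), x(t - tau)).

    history holds past states on the grid, ending with the state at t = 0;
    it must contain at least delay_steps + 1 values.
    """
    xs = list(history)
    if len(xs) < delay_steps + 1:
        raise ValueError("history is shorter than the delay")
    for _ in range(n_steps):
        xs.append(xs[-1] + dt * model(xs[-1], xs[-1 - delay_steps]))
    return xs[len(history):]


def rmse(simulated, reference):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(
        sum((u - v) ** 2 for u, v in zip(simulated, reference)) / len(reference)
    )


def simulation_error(model, history, reference, delay_steps, dt):
    """Simulation error for one candidate delay (in grid steps)."""
    simulated = simulate(model, history, delay_steps, dt, len(reference))
    return rmse(simulated, reference)
```

Evaluating `simulation_error` over a range of candidate `delay_steps` gives a curve whose minimum marks the delay that best reproduces the reference trajectory. In the method described here each evaluation also includes retraining the small network, which is what makes the function expensive.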

[0094] The simulation error $E(\tau)$ is a computationally expensive black-box function of $\tau$, because evaluating it involves training and iteratively simulating the inner neural network. The ultimate goal of this embodiment is to find the $\tau$ that minimizes the simulation error. Therefore, this embodiment sets up the Bayesian optimization framework to solve the following minimization problem:

[0095] $\tau^{*} = \arg\min_{\tau \in \mathcal{T}} E(\tau)$

[0096] where $\tau^{*}$ is the identified optimal delay and $\mathcal{T}$ is the search space defined for $\tau$. By constructing a Gaussian process surrogate model of $E(\tau)$ and using the expected improvement (EI) acquisition function, Bayesian optimization can efficiently approach the global optimum $\tau^{*}$ with a very small number of evaluations.
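The overall search loop can be sketched end to end in plain Python: fit a Gaussian process surrogate to the evaluated delays, pick the candidate with the largest expected improvement, evaluate it, and repeat until the evaluation budget is used. This is a minimal sketch under stated assumptions: a one-dimensional grid of candidate delays, a constant prior mean equal to the sample mean, fixed kernel hyperparameters, and a small diagonal jitter for numerical stability. None of these choices or names come from the application.

```python
import math


def kernel(a, b, sigma_f=1.0, length=0.2):
    return sigma_f ** 2 * math.exp(-((a - b) ** 2) / (2.0 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= factor * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def gp_posterior(xs, ys, x_new, jitter=1e-6):
    """Posterior mean and standard deviation of the GP surrogate at x_new."""
    mean_y = sum(ys) / len(ys)
    K = [[kernel(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [kernel(a, x_new) for a in xs]
    alpha = solve(K, [y - mean_y for y in ys])
    v = solve(K, k_star)
    mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    var = kernel(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mu, math.sqrt(max(var, 0.0))


def expected_improvement(mu, sigma, best):
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf


def bayes_opt(objective, candidates, init_points, n_evals):
    """Minimize objective over candidates with a fixed evaluation budget."""
    xs = list(init_points)
    ys = [objective(x) for x in xs]
    while len(xs) < n_evals:
        best = min(ys)
        remaining = [c for c in candidates if c not in xs]
        if not remaining:
            break
        nxt = max(remaining,
                  key=lambda c: expected_improvement(*gp_posterior(xs, ys, c), best))
        xs.append(nxt)
        ys.append(objective(nxt))
    i = min(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i], xs
```

In use, `objective` would be the simulation error for a candidate delay, and the returned delay is the evaluated candidate with the smallest error, matching the stopping rule of a preset number of evaluations. A production implementation would also fit the kernel hyperparameters rather than fixing them.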

[0097] To determine the nonlinear function $f$ in the delay differential equation (Equation (1)), this application proposes a framework based on "Bayesian optimization + neural network", whose architecture is shown in Figure 1. The core of this framework is to use Bayesian optimization to find the optimal delay term $\tau$ from system-generated data, and then to train a neural network to approximate the unknown dynamic function $f$.

[0098] The model building process begins with data preparation. First, this embodiment generates $M$ trajectories of data and uses Bayesian optimization to identify the unknown delay term $\tau$. Based on the delay term $\tau$, the state matrix and the corresponding delay matrix are constructed, and their concatenation serves as the input to the neural network. Since the discretization error in the delay differential equation is cumulative, this embodiment uses linear multistep methods of different orders to construct the loss function during the model training phase. This setting allows this embodiment to study systematically the impact of discretization error on model identification.
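Constructing the state matrix, the delay matrix, and their concatenation can be sketched as follows. The example assumes a trajectory sampled on a uniform grid and a delay equal to an integer number of sampling steps; the helper name `build_state_delay_matrices` is hypothetical.

```python
def build_state_delay_matrices(traj, delay_steps):
    """Return (state matrix, delay matrix, concatenated input matrix).

    traj is a list of state vectors (lists of floats) on a uniform grid.
    Row i of the state matrix is the state at step delay_steps + i, and
    row i of the delay matrix is the state delay_steps steps earlier.
    """
    steps = range(delay_steps, len(traj))
    state = [list(traj[n]) for n in steps]
    delayed = [list(traj[n - delay_steps]) for n in steps]
    concatenated = [x + xd for x, xd in zip(state, delayed)]
    return state, delayed, concatenated
```

Each row of the concatenated matrix pairs a current state with its delayed counterpart, so a network fed these rows sees exactly the two arguments of the nonlinear function in Equation (1).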

[0099]

[0100] Finally, this embodiment shows that when the neural network approximation $f_{NN}$ replaces the original function $f$, the solution to equation (1) still satisfies the conditions for existence and uniqueness.

[0101] Theorem 2 (Existence and uniqueness of solutions to delay differential equations approximated by neural networks): Consider the following neural network approximation of the nonlinear delay differential equation:

[0102] $\dot{x}(t) = f_{NN}(x(t), x(t-\tau)),\ t \in [0, T]; \quad x(t) = \phi(t),\ t \in [-\tau, 0]$ (5)

[0103] where $f_{NN}$ is a neural network with the Tanh activation function that approximates $f$ in equation (1). Then equation (5) has a unique continuous solution.

[0104] Proof: By Theorem 1, if $f_{NN}$ satisfies the continuity and Lipschitz conditions, then the solution to equation (5) exists and is unique. These two conditions are verified below:

[0105] Let $f_{NN}$ be a feedforward neural network with $D$ hidden layers and input $z = [x(t), x(t-\tau)]$. Its structure can be represented as:

[0106] $h_0 = z, \quad h_l = \tanh(W_l h_{l-1} + b_l),\ l = 1, \dots, D, \quad f_{NN}(z) = W_{D+1} h_D + b_{D+1}$ (6)

[0107] where $W_l$ is the weight matrix and $b_l$ is the bias vector of layer $l$. With the spectral norm $\|W_l\|_2$, the Lipschitz condition for the linear transformation is:

[0108] $\|W_l u - W_l v\| \le \|W_l\|_2 \, \|u - v\|$

[0109] (1) Continuity:

[0110] The Tanh function is continuous, and linear (affine) mappings are also continuous. By the composition property of continuous functions, $f_{NN}$ is also continuous on its domain.

[0111] (2) Lipschitz condition:

[0112] It must be shown that there exists $L_{NN} > 0$ such that for any $x_1, x_2$ and $y_1, y_2$:

[0113] $\|f_{NN}(x_1, y_1) - f_{NN}(x_2, y_2)\| \le L_{NN}\,(\|x_1 - x_2\| + \|y_1 - y_2\|)$ (7)

[0114] The activation function Tanh satisfies the 1-Lipschitz condition, that is, for any $a, b \in \mathbb{R}$:

[0115] $|\tanh(a) - \tanh(b)| \le |a - b|$

[0116] For the $l$-th layer, the output difference satisfies:

[0117] $\|h_l(z_1) - h_l(z_2)\| \le \|W_l\|_2 \, \|h_{l-1}(z_1) - h_{l-1}(z_2)\|$ (8)

[0118] That is, the output difference of a single layer is bounded by the output difference of the previous layer multiplied by the weight norm.

[0119] Therefore, for a neural network with $D$ hidden layers, induction gives:

[0120] $\|f_{NN}(z_1) - f_{NN}(z_2)\| \le \left(\prod_{l=1}^{D+1} \|W_l\|_2\right) \|z_1 - z_2\|$ (9)

[0121] Because $\|z_1 - z_2\| \le \|x_1 - x_2\| + \|y_1 - y_2\|$, formula (7) holds with $L_{NN} = \prod_{l=1}^{D+1} \|W_l\|_2$.

[0122] In conclusion, $f_{NN}$ satisfies the two conditions of Theorem 1, and therefore equation (5) has a unique solution on $[0, T]$.

[0123] This embodiment aims to establish a strict error bound for the proposed "high-order linear multistep method (LMM) discretization + neural network approximation" framework.

[0124] Considering the nonlinear delay differential equation of formula (3), to ensure the rigor of the error analysis, this embodiment introduces the following standard assumptions:

[0125] 1. Smoothness and stability: It is continuous within its domain and satisfies the Lipschitz condition, with the Lipschitz constant being... ,at the same time have Continuous mixed partial derivatives of order 1.

[0126] 2. Neural Networks Approximation properties: It is continuous and satisfies the Lipschitz condition, with the Lipschitz constant being... . right The approximation error is uniformly bounded, that is, there exists a constant. , making

[0127] (10)

[0128] 3. Compatibility of linear multi-step methods: The method used in this embodiment... The first-order linear multistep method satisfies the compatibility condition, that is, when the time step size is... At that time, its local truncation error (LTE) satisfies And the coefficients satisfy .

[0129] 4. Initial condition consistency: numerical solution and neural network approximate solution In the initial interval Above and the true solution Completely identical, that is The initial error is 0.

[0130] According to the source of error, this embodiment will calculate the total error. Decomposed into two parts:

[0131]

[0132] From the triangle inequality, we can obtain that the total error satisfies:

[0133]

[0134] Theorem 3 (Total Error Bound): Under the above assumptions, for any ,Depend on The total error of the solution to the delay differential equation obtained by the first-order linear multistep method and neural network approximation satisfy:

[0135]

[0136] in, It is related to the coefficients of the linear multistep method. The constants related to the higher-order derivatives; C 2 and L It is related to the length of time The constants related to the Lipschitz constant.

[0137] Proof: This embodiment will prove the theorem step by step, first deriving the discretization error. Approximation error of neural networks Then merge them into a single boundary.

[0138] Step 1: Discretization error bound

[0139] The relationship satisfied by the true solution at each time point can be expressed using the linear multistep operator and the local truncation error:

[0140] (11)

[0141] The numerical solution is defined by:

[0142] (12)

[0143] Subtracting equations (11) and (12), we get:

[0144]

[0145] Taking the norm of the above expression and using the triangle inequality:

[0146] (13)

[0147] Based on assumptions 1 and 3:

[0148]

[0149] Substituting into formula (13) and setting ..., we obtain:

[0150]

[0151] By the zero-stability of the Adams-Bashforth (AB) family of linear multistep methods, the coefficients satisfy ...; letting ..., the above formula simplifies to:

[0152]

[0153] Setting ..., we obtain:

[0154]

[0155] At the initial step, it follows from the consistency of the initial conditions and the order of the local truncation error that .... Clearly, there exists a constant such that ....

[0156] Expanding the recurrence relation iteratively from the initial step to step ..., and using the discrete Gronwall inequality and geometric series summation:

[0157]

[0158] Because ..., it follows that .... Substituting, we get:

[0159]

[0160] Defining a constant that is independent of ..., we have:

[0161]

[0162] Therefore, for all ..., the discretization error bound holds, that is:

[0163]
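The role of the method order in the discretization bound can be checked numerically. The sketch below is an illustration on an ordinary test equation x' = -x (no delay, chosen here only because its exact solution is known); it is not the embodiment's system. It uses the standard Adams-Bashforth coefficients, starts from exact values so that the initial error is zero as in Assumption 4, and estimates the observed convergence order by halving the step size. The function names and step sizes are assumptions made for this example.

```python
import math

# Standard Adams-Bashforth coefficients, newest function value first.
AB_COEFFS = {
    1: [1.0],
    2: [3 / 2, -1 / 2],
    4: [55 / 24, -59 / 24, 37 / 24, -9 / 24],
}


def ab_error(order, h, t_end=1.0):
    """Error at t_end when integrating x' = -x, x(0) = 1 with an AB method.

    Start-up values come from the exact solution exp(-t), so the initial
    error is zero (mirroring Assumption 4).
    """
    coeffs = AB_COEFFS[order]
    n_steps = round(t_end / h)
    xs = [math.exp(-k * h) for k in range(order)]  # exact start-up values
    for n in range(order - 1, n_steps):
        # f(x) = -x at the last `order` grid points, newest first
        fs = [-xs[n - j] for j in range(order)]
        xs.append(xs[n] + h * sum(c * f for c, f in zip(coeffs, fs)))
    return abs(xs[n_steps] - math.exp(-t_end))


def observed_order(order, h=0.02):
    """Estimate the convergence order from the errors at h and h/2."""
    return math.log2(ab_error(order, h) / ab_error(order, h / 2))


if __name__ == "__main__":
    for p in (1, 2, 4):
        print(f"AB{p}: observed order ~ {observed_order(p):.2f}")
```

Under these assumptions the observed order should sit close to the nominal order of each method, which is the behavior the discretization term of the bound relies on.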

[0164] Step 2: Neural network approximation error bound

[0165] For the neural network approximation error, consider its derivative, in which the continuous form of ... satisfies ... and ... satisfies ....

[0166] Therefore, we can obtain:

[0167]

[0168] Furthermore, we can obtain:

[0169] (14)

[0170] Applying Assumption 1 to the first term:

[0171]

[0172] Applying Assumption 2 to the second term:

[0173]

[0174] Taking the norm of formula (14) yields:

[0175]

[0176] Integrating from 0 to ... and using the initial conditions in Assumption 4, we obtain the integral inequality:

[0177] (15)

[0178] When ..., we have ...; therefore:

[0179]

[0180] Denote ...; then formula (15) simplifies to:

[0181] (16)

[0182] When ..., it follows from the initial conditions that ..., and inequality (16) simplifies to:

[0183] (17)

[0184] Multiplying both sides by ... and integrating from 0 to ...:

[0185]

[0186] where .... Substituting into the above formula, we get:

[0187] (18)

[0188] Because ..., substituting formula (18) into formula (17) yields:

[0189] (19)

[0190] When ..., the argument is extended by induction to ...; in this case ..., and from formula (18) we obtain:

[0191]

[0192] Substituting into formula (16), we get:

[0193]

[0194] Similarly, integration yields:

[0195]

[0196] Formula (19) is thus obtained again. Similarly, for any ..., induction gives:

[0197]

[0198] Taking ..., we obtain:

[0199]
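The Gronwall-type argument of Step 2 can be illustrated on a toy problem. The sketch below does not reproduce the bound derived above, whose exact form is not repeated here. It checks a coarser bound of the same flavor: if the right-hand side is Lipschitz with constant L in both the current and the delayed argument, and the approximate right-hand side differs from it by at most eps, then with identical histories the gap between the two solutions is at most eps/(2L) * (exp(2Lt) - 1). The functions `f_true` and `f_approx`, the constants, and the explicit Euler integrator are all assumptions made for this example; `f_approx` merely stands in for a trained network.

```python
import math


def euler_dde(f, tau, h, t_end, history=1.0):
    """Explicit Euler for x'(t) = f(x(t), x(t - tau)) with constant history."""
    lag = round(tau / h)            # delay expressed in grid steps
    n_steps = round(t_end / h)
    xs = [history] * (lag + 1)      # samples on [-tau, 0]
    for n in range(n_steps):
        i = lag + n                 # index of the current time t_n
        xs.append(xs[i] + h * f(xs[i], xs[i - lag]))
    return xs[lag:]                 # samples on [0, t_end]


L = 1.0      # Lipschitz constant of f_true in each argument (holds for f_true below)
EPS = 1e-2   # uniform bound on |f_true - f_approx|


def f_true(x, x_del):
    # partial derivatives are bounded by 0.5 and 1, so L = 1 is valid
    return 0.5 * math.cos(x) - math.sin(x_del)


def f_approx(x, x_del):
    # hypothetical stand-in for a learned right-hand side
    return f_true(x, x_del) + EPS * math.sin(3.0 * x)


h, tau, t_end = 0.01, 0.5, 3.0
x_ref = euler_dde(f_true, tau, h, t_end)
x_nn = euler_dde(f_approx, tau, h, t_end)

errors = [abs(a - b) for a, b in zip(x_ref, x_nn)]
bounds = [EPS / (2 * L) * (math.exp(2 * L * n * h) - 1.0)
          for n in range(len(errors))]
bound_holds = all(e <= b + 1e-12 for e, b in zip(errors, bounds))

if __name__ == "__main__":
    print("max error:", max(errors), "bound at t_end:", bounds[-1])
    print("bound holds at every step:", bound_holds)
```

The bound grows with time, which matches the remark in Step 3 that the time-dependent factor is increasing and is evaluated at the end of the interval.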

[0200] Step 3: Combined total error bound

[0201] Finally, substituting the results of Steps 1 and 2 into ..., we obtain:

[0202]

[0203] Because ... is an increasing function on the interval ... and attains its maximum at ..., this embodiment obtains the final total error bound:

[0204]

[0205] This embodiment aims to verify the effectiveness of using Bayesian optimization and neural networks to approximate the right-hand side of the delay differential equation. To this end, a one-dimensional numerical simulation example is set up. To explore the impact of different parameters on the system, this embodiment introduces a multi-parameter identification mechanism for this example, aiming to learn a class of equations in a single training session. This embodiment uses a fourth-order Runge-Kutta (RK) scheme to generate data on an ultrafine mesh, with delay parameter ... and time interval ...; the system is then identified, and the identified result is verified through numerical simulation.

[0206] The Hutchinson model is one of the simplest delay differential equations, and its expression is:

[0207] (20)

[0208] where ... controls the growth trend of the system and is set to ... in this example. This embodiment selects a random number within the range ... as the initial value. In this example, the parameters for Bayesian optimization are set as follows: the search space is ...; the total number of Bayesian optimization evaluations is 30, of which the first 10 are random candidate points to ensure a preliminary global understanding of the search space. The training data subset for the small network consists of 10 randomly selected trajectories, and this network contains two hidden layers. The outer neural network contains three hidden layers, all using the Tanh activation function; it is trained with the Adam optimizer, with the learning rate set to [value missing].
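Data generation for this example can be sketched as follows. Formula (20) is not legible in this text, so the code assumes the commonly used normalized Hutchinson form x'(t) = r x(t) (1 - x(t - tau)); the values of `r`, `tau`, the step `h`, and the constant history `x0` are placeholders chosen for illustration, not the settings of the embodiment. The delayed value at the half step is linearly interpolated between grid points, which is a simplification made for this sketch.

```python
def hutchinson_rk4(r, tau, h, t_end, x0):
    """Classical RK4 for x'(t) = r * x(t) * (1 - x(t - tau)).

    Assumes a constant history x(t) = x0 on [-tau, 0] and that tau is
    (approximately) an integer multiple of h.
    """
    lag = round(tau / h)
    if lag < 1:
        raise ValueError("the step h must not exceed the delay tau")
    n_steps = round(t_end / h)
    xs = [x0] * (lag + 1)                      # history on [-tau, 0]

    def f(x, x_del):
        return r * x * (1.0 - x_del)

    for n in range(n_steps):
        i = lag + n                            # index of the current time t_n
        d0 = xs[i - lag]                       # x(t_n - tau)
        d1 = xs[i - lag + 1]                   # x(t_n + h - tau)
        dm = 0.5 * (d0 + d1)                   # x(t_n + h/2 - tau), interpolated
        k1 = f(xs[i], d0)
        k2 = f(xs[i] + 0.5 * h * k1, dm)
        k3 = f(xs[i] + 0.5 * h * k2, dm)
        k4 = f(xs[i] + h * k3, d1)
        xs.append(xs[i] + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4))
    return xs[lag:]                            # samples on [0, t_end]


# Placeholder settings (assumptions, not the embodiment's values).
traj = hutchinson_rk4(r=1.0, tau=1.0, h=0.01, t_end=40.0, x0=0.5)

if __name__ == "__main__":
    print("x(0) =", traj[0], " x(t_end) =", traj[-1])
```

Repeating the call with different random initial values would give the multiple trajectories used for training; with r * tau below pi/2, as in the placeholder settings, the solution settles toward the equilibrium x = 1.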

[0209] As shown in Figure 3, the long-term trajectory simulated with the ... obtained by the AB4 method closely matches the true value. In contrast, the prediction from the ... obtained by AB1 deviates from the true value over time. The results show that training the neural network with a higher-order linear multistep method improves the identification accuracy of the system. Specifically, according to the error bound in Theorem 3, a higher order reduces the discretization error component and also allows the neural network to achieve a smaller approximation error. This conclusion is quantitatively verified in Table 1: compared to AB1, the ... error of the neural network trained using AB4 was reduced by nearly 86.92%. Crucially, this improvement in function approximation accuracy also reduced the ... error of the long-term predicted trajectory by nearly 77.87%, providing strong empirical support for Theorem 3. The superiority of AB4 is further illustrated in Figure 4, which compares the neural network learning results obtained under the four methods with the true function.
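The idea of training against a linear multistep residual can be sketched as below. This is a minimal NumPy illustration, not the embodiment's training code: the callable `f` stands in for the neural network, the coefficients are the standard Adams-Bashforth values, and the names `ab_loss`, `lag`, and `dt` are chosen here. In practice the same residual would be written in an automatic-differentiation framework so that it can be minimized with Adam.

```python
import numpy as np

# Standard Adams-Bashforth coefficients, newest function value first.
AB_BETAS = {
    1: [1.0],
    2: [1.5, -0.5],
    4: [55 / 24, -59 / 24, 37 / 24, -9 / 24],
}


def ab_loss(f, trajs, lag, dt, order):
    """Mean squared Adams-Bashforth residual of `f` on sampled trajectories.

    trajs : array of shape (M, N), M trajectories on a uniform grid
    lag   : delay expressed in grid steps (tau = lag * dt)
    f     : callable f(x, x_delayed) accepting NumPy arrays
    """
    trajs = np.asarray(trajs, dtype=float)
    betas = AB_BETAS[order]
    start = lag + order - 1                  # first index with a full stencil
    n = np.arange(start, trajs.shape[1] - 1)
    incr = np.zeros((trajs.shape[0], len(n)))
    for j, beta in enumerate(betas):
        incr += beta * f(trajs[:, n - j], trajs[:, n - j - lag])
    resid = trajs[:, n + 1] - trajs[:, n] - dt * incr
    return float(np.mean(resid ** 2))


def f_true(x, x_del):
    # assumed normalized Hutchinson right-hand side with r = 1
    return x * (1.0 - x_del)


# Build one trajectory with explicit Euler (AB1) so the AB1 residual of
# f_true vanishes up to rounding; all values are illustrative.
dt, lag = 0.01, 50
x = [0.5] * (lag + 1)
for i in range(lag, lag + 400):
    x.append(x[i] + dt * f_true(x[i], x[i - lag]))
traj = np.array([x])

loss_true = ab_loss(f_true, traj, lag, dt, order=1)
loss_wrong = ab_loss(lambda a, b: 0.5 * f_true(a, b), traj, lag, dt, order=1)

if __name__ == "__main__":
    print("loss with the generating f:", loss_true)
    print("loss with a mis-scaled f:  ", loss_wrong)
```

The loss is essentially zero for the function that generated the data and clearly positive for a mis-scaled one, which is what makes the residual usable as a training signal.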

[0210] Table 1. Comparison of Errors of Four Methods

[0211]

[0212] In general, the system in equation (20) exhibits different properties depending on the value of its parameter. A key advantage of this framework is its ability to learn not only individual equations but also entire families of equations parameterized by system constants. This embodiment validates this capability by extending the Hutchinson equation from a fixed-parameter model to a multi-parameter model.

[0213] To explore the generalization ability of the neural network across multiple parameters, this embodiment constructs a framework identical in architecture to the single-parameter model, except that the multi-parameter framework adds a parameter to the input layer of the neural network. This embodiment generates multiple sets of data with different parameter values and trains on the resulting composite dataset. Four different parameter values, {1.0, 1.2, 1.5, 1.8}, are used in the experiment, and 200 trajectories are generated for each value.
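One way to assemble such a composite dataset is sketched below, assuming trajectories sampled on a uniform grid and a delay equal to `lag` grid steps. Each input row concatenates the current state, the delayed state, and the system parameter. The function names and the dictionary layout `trajs_by_r` are assumptions made for this illustration.

```python
import numpy as np


def build_inputs(traj, lag, r):
    """Rows of [x(t_n), x(t_n - tau), r] for one trajectory."""
    traj = np.asarray(traj, dtype=float)
    state = traj[lag:]                    # x(t_n)
    delayed = traj[:len(traj) - lag]      # x(t_n - tau), tau = lag grid steps
    param = np.full_like(state, r)        # parameter repeated for every sample
    return np.column_stack([state, delayed, param])


def composite_dataset(trajs_by_r, lag):
    """Stack the input rows of all trajectories over all parameter values."""
    blocks = [build_inputs(traj, lag, r)
              for r, trajs in trajs_by_r.items()
              for traj in trajs]
    return np.vstack(blocks)


# Tiny synthetic example (the numbers carry no physical meaning).
demo = {1.0: [np.arange(6.0)], 1.5: [2.0 * np.arange(6.0)]}
X = composite_dataset(demo, lag=2)

if __name__ == "__main__":
    print(X.shape)
    print(X)
```

With the parameter stored as an extra input column, a single network can be trained on all parameter values at once and later queried at a value that was not in the training set.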

[0214] Figure 5 shows the process by which Bayesian optimization finds the optimal delay on the composite dataset in the multi-parameter scenario. From Figure 6, it can be observed that the true and predicted trajectories are highly consistent, indicating that the model achieves high-precision identification for all four parameters. To verify the model's generalization to untrained parameters, this embodiment selects three untrained parameter values (as shown in Figure 7): ... (extrapolation beyond the lower limit of the training range); ... (interpolation between two training points); and ... (extrapolation beyond the upper limit of the training range). From Figure 8, it can be observed that the data points cluster densely along the ... reference line, indicating that the model achieved high-precision identification of all four parameters. Similarly, for the untrained parameters (as shown in Figure 9), the learned functions approximate the true function with high accuracy in both interpolation and extrapolation tasks. The results confirm that the model in this embodiment can learn a class of delay differential equations in a single training session with relatively high accuracy, without separate training for each parameter.
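The delay search itself can be sketched as a small Bayesian optimization loop with a Gaussian-process surrogate, a squared exponential kernel, and expected improvement, using 30 evaluations of which the first 10 are random, as stated for this example. Everything else below is an assumption made for illustration: the kernel hyperparameters are fixed rather than fitted, the search interval [0.2, 2.0] is a placeholder because the actual search space is not legible in this text, candidates come from a fixed grid, and `toy_simulation_error` is a cheap stand-in for the real objective (training a small network and computing the RMSE of its simulated trajectory).

```python
import math
import numpy as np


def sq_exp_kernel(a, b, sigma_f=1.0, length=0.3):
    """Squared exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return sigma_f ** 2 * np.exp(-0.5 * (d / length) ** 2)


def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a constant-mean GP."""
    y_mean = y_obs.mean()
    K = sq_exp_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = sq_exp_kernel(x_obs, x_new)
    mu = y_mean + Ks.T @ np.linalg.solve(K, y_obs - y_mean)
    # prior variance k(x, x) equals sigma_f**2 = 1 with the defaults above
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))


def expected_improvement(mu, sigma, best):
    """Expected improvement for minimization: E[max(best - f, 0)]."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf


def bayes_opt(objective, lo, hi, n_total=30, n_init=10, seed=0):
    """Minimize `objective` on [lo, hi]; returns (best input, best value)."""
    rng = np.random.default_rng(seed)
    xs = rng.uniform(lo, hi, n_init)             # random initial candidates
    ys = np.array([objective(x) for x in xs])
    grid = np.linspace(lo, hi, 401)              # candidate delay values
    for _ in range(n_total - n_init):
        mu, sigma = gp_posterior(xs, ys, grid)
        ei = expected_improvement(mu, sigma, ys.min())
        x_next = grid[np.argmax(ei)]
        xs = np.append(xs, x_next)
        ys = np.append(ys, objective(x_next))
    return float(xs[np.argmin(ys)]), float(ys.min())


def toy_simulation_error(tau, tau_true=1.0):
    # hypothetical stand-in for the simulation RMSE at a candidate delay
    return abs(tau - tau_true) + 0.05 * math.sin(8.0 * tau) ** 2


tau_best, err_best = bayes_opt(toy_simulation_error, 0.2, 2.0)

if __name__ == "__main__":
    print("best delay:", tau_best, "error:", err_best)
```

Returning the candidate with the smallest observed error, rather than the minimizer of the surrogate mean, follows the selection rule described for the delay search.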

[0215] Those skilled in the art will readily understand that the above description is merely a preferred embodiment of this application and is not intended to limit this application. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this application should be included within the protection scope of this application.

Claims

1. A method for modeling delay differential equations based on Bayesian optimization and neural networks, characterized in that it includes the following steps:
S10, generating multiple trajectory data for the delay differential equation system;
S20, using a Bayesian optimization algorithm to search for the optimal delay term based on the trajectory data, wherein the Bayesian optimization algorithm takes the delay term to be solved as the optimization variable and the error based on neural network simulation as the objective function, and obtains the optimal delay term through iterative optimization using a surrogate model;
S30, based on the optimal delay term, constructing a state matrix and a delay matrix from the trajectory data, and using the concatenated matrix of the state matrix and the delay matrix as the input of the neural network, so as to construct the neural network for approximating the nonlinear function in the delay differential equation;
S40, incorporating the linear multistep method into the loss function of the neural network, and training the neural network using the trajectory data to obtain a nonlinear function approximation model;
S50, based on the nonlinear function approximation model, separating the numerical discretization error and the neural network approximation error in the process of identifying the delay differential equation, and constructing the total error bound;
wherein, in step S20, the Bayesian optimization algorithm specifically includes: constructing a black-box function optimization framework using a Gaussian process as the surrogate model; selecting candidate delay values from a preset search space with expected improvement as the acquisition function; for each candidate delay value, randomly selecting a subset from the trajectory data, training a neural network with two hidden layers on the subset, using the trained neural network to perform forward trajectory simulation from a given initial state, calculating the root mean square error between the simulated trajectory and the real trajectory as the simulation error, and using the simulation error as the output value of the black-box function; updating the Gaussian process surrogate model based on the current candidate delay values and their corresponding simulation errors; and continuing the optimization until the preset number of evaluations is reached, whereupon the candidate delay value corresponding to the minimum simulation error obtained during the iterations is determined as the optimal delay term.

2. The method for modeling delay differential equations based on Bayesian optimization and neural networks as described in claim 1, characterized in that the kernel function of the Gaussian process is a squared exponential kernel whose expression is ..., where ... and ... are the hyperparameters of the kernel function, which control the magnitude of the function value variation and the length scale of the input features, respectively, and ... and ... represent any two distinct points in the input space of a neural network.

3. The method for modeling delay differential equations based on Bayesian optimization and neural networks as described in claim 1, characterized in that the expression of the expected improvement acquisition function is: ..., where ... and ... denote the posterior mean and standard deviation of the Gaussian process at point ...; ... and ... denote the cumulative distribution function and the probability density function of the standard normal distribution, respectively; and ... is the best value observed so far.

4. The method for modeling delay differential equations based on Bayesian optimization and neural networks as described in claim 1, characterized in that, in step S30, the neural network is a fully connected neural network containing three hidden layers and employing the Tanh activation function.

5. The method for modeling delay differential equations based on Bayesian optimization and neural networks as described in claim 1, characterized in that, in step S10, the trajectory data is generated on an ultrafine mesh using a fourth-order Runge-Kutta (RK) scheme.

6. The method for modeling delay differential equations based on Bayesian optimization and neural networks as described in claim 1, characterized in that, in step S40, the linear multistep method is the Adams-Bashforth linear multistep method, whose general form is: ..., where ... is the order of the linear multistep method; ... denotes the time step; and ... denotes the coefficients of the linear multistep method; the loss function is expressed as: ..., where M is the number of trajectories generated; N is the number of time steps; ... is a real number representing the length of the time lag in the system, reflecting the dependence of the current state on the historical state; ... denotes the system state of the m-th trajectory at time step n+1; ... and ... are the coefficients of the linear multistep method; Δt denotes the time step; and ... is the value of the nonlinear function output by the neural network.

7. The method for modeling delay differential equations based on Bayesian optimization and neural networks as described in claim 1, characterized in that it also includes: adding a system parameter vector to the input layer of the neural network, and training the network on a composite dataset consisting of trajectory data under multiple different system parameters, so that a class of delay differential equations is learned in a single training session.

8. The method for modeling delay differential equations based on Bayesian optimization and neural networks as described in claim 1, characterized in that, in step S50, the total error bound satisfies ..., where ... is a constant related to the coefficients of the linear multistep method and to the higher-order derivatives; C_2 and L are constants related to the time length and the Lipschitz constant; ... is a constant; h is the time step; ... is the order of the linear multistep method; and ... is an increasing function.

9. A delay differential equation modeling system based on Bayesian optimization and neural networks, characterized in that it is used to implement the method for modeling delay differential equations based on Bayesian optimization and neural networks as described in any one of claims 1 to 8, and includes: a data generation module, used to generate multiple trajectory data for a delay differential equation system; a delay optimization module, used to search for the optimal delay term based on the trajectory data using a Bayesian optimization algorithm, wherein the Bayesian optimization algorithm takes the delay term to be solved as the optimization variable and the error based on neural network simulation as the objective function, and obtains the optimal delay term through iterative optimization using a surrogate model; a network construction module, used to construct a state matrix and a delay matrix from the trajectory data based on the optimal delay term, and to use the concatenated matrix of the state matrix and the delay matrix as the input of the neural network, so as to construct the neural network for approximating the nonlinear function in the delay differential equation; a model training module, used to incorporate the linear multistep method into the loss function of the neural network and to train the neural network using the trajectory data to obtain a nonlinear function approximation model; and an error analysis module, used to separate the numerical discretization error and the neural network approximation error in the identification process of the delay differential equation based on the nonlinear function approximation model, and to construct the total error bound.

Citation Information

Patent Citations

Neural network method for solving delay differential algebraic model with weak discontinuity
CN111310886A
Control method and system of flexible joint robot under input delay
CN119858164A
