A ship dynamic positioning anti-disturbance control method considering vertical dynamic optimization

An improved Actor-Critic reinforcement learning framework combined with a disturbance observer addresses high-precision anti-interference control for large ships in complex marine environments, giving stable, high-precision positioning that adapts to different sea states and loading conditions.

CN122284342A · Pending · Publication Date: 2026-06-26 · DALIAN MARITIME UNIVERSITY +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
DALIAN MARITIME UNIVERSITY
Filing Date
2026-05-15
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing ship dynamic positioning control technology struggles with the control difficulty, unknown nonlinear parameters, and complex marine environmental disturbances associated with large ships. Traditional Actor-Critic network training is unstable, disturbance estimation accuracy is low, and adaptability to four-degree-of-freedom motion is poor, making high-precision anti-interference control difficult to achieve.

Method used

A nonlinear dynamic model of a ship with four degrees of freedom is established. By combining an improved Actor-Critic reinforcement learning framework and a disturbance observer, external disturbances are estimated and compensated in real time through a first-order filter and a virtual control law. The control strategy is then optimized to achieve high-precision positioning.

Benefits of technology

It enables high-precision anti-interference control of ships in complex marine environments, improves positioning accuracy and stability, reduces the impact of external disturbances on ship attitude, and adapts to different sea conditions and loading conditions.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention discloses a ship dynamic positioning anti-interference control method considering vertical dynamic optimization, comprising: establishing a four-degree-of-freedom nonlinear dynamic model of the ship and obtaining the error vector by combining the desired pose; defining a virtual control law by backstepping and processing the velocity error using a first-order filter; approximating the uncertainties in the model based on an improved Actor-Critic reinforcement learning framework and designing an interference observer to estimate external disturbances; calculating the compensated desired acceleration by combining the approximation output of the uncertainties, the estimated value of the external disturbances, and the velocity error; obtaining the thruster pitch ratio based on the desired acceleration and generating the final control command through a thrust distribution model. This invention can effectively handle model uncertainties and external environmental disturbances in ship dynamic positioning, especially optimizing vertical dynamic performance and improving control accuracy, response speed, and system robustness.

Description

Technical Field

[0001] This invention relates to the field of ship motion control, and more particularly to a ship dynamic positioning anti-interference control method that considers vertical dynamic optimization.

Background Technology

[0002] A dynamic positioning system is a closed-loop control system. Ships that perform dynamic positioning tasks are generally fully actuated vessels, and fully actuated systems are a general class of mechanical systems. In actual marine engineering, such ships may inevitably encounter actuator failures. System signals are transmitted frequently between sensors, controllers, and actuators: the sensors observe the ship's errors and send signals to the controller, and the controller sends control commands to the actuators, whose servo systems apply them to the ship to carry out the control task.

In existing ship dynamic positioning control technologies, large ships are characterized by large mass, large inertia, and long time delays, which make them harder to maneuver and control and demand more precise control inputs and faster response. Furthermore, ships have significant unknown nonlinear parameters, such as hydrodynamic coefficients, added mass, and damping coefficients. These parameters depend on draft, loading condition, and sea state, so they are difficult to obtain through precise modeling and can cause control system instability. In addition, external disturbances in the marine environment (such as irregular waves, changing winds, and strong currents) are random and time-varying, so traditional control methods struggle to compensate for them accurately, which degrades dynamic positioning accuracy and makes anti-interference control difficult.

Conventional reinforcement learning algorithms have the following shortcomings in ship dynamic positioning control tasks. First, traditional Actor-Critic networks use only the immediate action value function as the optimization objective and do not construct a long-term policy performance index, which makes it difficult for the control policy to adapt to dynamic changes in complex marine environments; the parameter update mechanism is imperfect, network training is unstable, and convergence is slow. Second, disturbance estimation accuracy is low: traditional disturbance observers rely on linearized models and cannot exploit the strong nonlinear approximation capability of neural networks, and they also suffer from lagging disturbance compensation, uncoordinated network updates, and insufficient consideration of the time-delay characteristics of the marine environment and of long-term control performance. Third, adaptability to four-degree-of-freedom motion is poor: the nonlinear characteristics of vertical motion in the heave direction are insufficiently modeled, and no cooperative control strategy is designed for the coupling among the four degrees of freedom, making balanced, high-precision control difficult to achieve.

Summary of the Invention

[0003] This invention provides a ship dynamic positioning anti-interference control method that considers vertical dynamic optimization to overcome the above-mentioned problems.

[0004] To achieve the above objectives, the technical solution of the present invention is as follows. A ship dynamic positioning anti-interference control method considering vertical dynamic optimization includes:
S1. Establish a four-degree-of-freedom nonlinear dynamic model of the ship;
S2. Based on the four-degree-of-freedom nonlinear dynamic model of the ship and the set desired ship pose, obtain the error vector;
S3. Based on the error vector, define a virtual control law using the backstepping method; pass the virtual control law through a first-order filter to obtain the filter output, and then obtain the velocity error;
S4. Based on the ship's velocity vector, obtain the output of the Actor network using the improved Actor-Critic reinforcement learning framework;
S5. Based on the improved Actor-Critic reinforcement learning framework and the ship's velocity vector, obtain the estimated value of the external disturbance through the designed disturbance observer;
S6. Based on the estimated value of the external disturbance, the output of the Actor network, and the velocity error, calculate the compensated desired acceleration;
S7. Calculate the thruster pitch ratio based on the compensated desired acceleration, and obtain the final control command from the pitch ratio command and the thrust distribution model.

[0005] Furthermore, the four-degree-of-freedom nonlinear dynamic model of the ship is given by equations (1) and (2). The quantities in these equations are: the ship's position and attitude vector in the geodetic coordinate system, comprising the longitudinal position, the lateral position, the vertical position, and the heading angle of the ship; the time derivative of that vector; the velocity vector in the hull coordinate system, comprising the longitudinal velocity, the lateral velocity, the vertical velocity, and the yaw angular velocity; the time derivative of the velocity vector; the rotation matrix between the hull coordinate system and the geodetic coordinate system, together with the two properties it satisfies; the inertial mass matrix; the nonlinear hydrodynamic function; the restoring force vector, including its component in the vertical degree of freedom; the vector of control forces and torque; and the external disturbance vector.

[0006] Furthermore, the error vector is given by equation (3), whose terms are the error vector and the set desired pose of the ship.

[0007] Furthermore, the virtual control law is given by equation (4), whose terms are the virtual control law, a positive definite diagonal matrix, and a time-derivative term. The first-order filter is given by equation (5), whose terms are the output of the first-order filter, its time derivative, and the time-constant matrix. The velocity error is given by equation (6).

[0008] Furthermore, the improved Actor-Critic reinforcement learning framework uses an artificial fish swarm algorithm to globally optimize the initial weights and learning rate, and introduces a loss network to adjust the learning rate of the composite observer.

[0009] Furthermore, the output of the Actor network is given by equation (7), whose terms are: the uncertainty term of the four-degree-of-freedom nonlinear dynamic model of the ship; the ideal weight matrix of the Actor network; the basis function vector of the Actor network; and the approximation error of the Actor network.

[0010] Furthermore, the designed disturbance observer is given by equation (8), whose terms are: the estimate of the external disturbance; the auxiliary state vector of the observer; the observer gain diagonal matrix; the basis function vector of the loss network; the learning rate matrix; the estimated weight matrix of the loss network; and, for each degree of freedom, the auxiliary state variable of the disturbance observer and its derivative, the observer gain, the corresponding component of the Actor network output, the estimate of the external disturbance, the control force or torque, the corresponding row of the inertial mass matrix, and the velocity.

[0011] Furthermore, the compensated desired acceleration is given by equation (9), whose terms are: the compensated desired acceleration; the velocity error feedback matrix; and the output of the Actor network.

[0012] Furthermore, the pitch ratio of the thruster is given by equations (10) and (11), whose terms are: the pitch ratio of the propeller; the adaptive update rate of each estimated parameter; a diagonal matrix formed from the corresponding diagonal elements; the pseudo-inverse of the thrust allocation matrix; a positive adaptive gain; the indices over the ship's velocity degrees of freedom; the row-and-column elements of the pseudo-inverse of the thrust allocation matrix; the row-and-column elements of the thrust allocation matrix; the components of the compensated desired acceleration; the indexed velocity error; and the initial value of the estimate. The thrust distribution model is given by equation (12), whose terms are: the vector of control forces and torque, i.e., the control command; the thrust allocation matrix; and a diagonal matrix of unknown coefficients for the propeller rotational speed.

[0013] Beneficial effects: The present invention provides a ship dynamic positioning anti-interference control method that considers vertical dynamic optimization. Through an improved Actor-Critic reinforcement learning framework, it approximates and compensates online for model uncertainties (such as unmodeled dynamics and parameter perturbations) in the ship's four-degree-of-freedom nonlinear dynamic model without the need for an accurate mathematical model, enabling the control system to adapt to different sea states and ship loading conditions. By designing a specialized disturbance observer, the equivalent force of external environmental disturbances (such as wind, waves, and currents) is estimated in real time, and the estimate is fed forward into the desired acceleration calculation as compensation. This significantly suppresses the impact of external disturbances on the ship's attitude and achieves high-precision anti-interference dynamic positioning.

Attached Figure Description

[0014] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0015] Figure 1 is a flowchart of the anti-interference control method of the present invention;
Figure 2 is a comparison diagram of positioning effects in an embodiment of the present invention;
Figure 3 is a comparison diagram of position variables in an embodiment of the present invention;
Figure 4 is a comparison diagram of velocity variables in an embodiment of the present invention;
Figure 5 is a comparison diagram of control inputs in an embodiment of the present invention;
Figure 6 is a comparison diagram of observation results in an embodiment of the present invention;
Figure 7 is a comparison diagram of position optimization effects in an embodiment of the present invention;
Figure 8 is a comparison diagram of velocity optimization effects in an embodiment of the present invention;
Figure 9 is a comparison diagram of control input optimization effects in an embodiment of the present invention.

Detailed Implementation

[0016] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0017] This embodiment provides a ship dynamic positioning anti-interference control method that considers vertical dynamic optimization. As shown in Figure 1, it includes:

S1. Establish a four-degree-of-freedom nonlinear dynamic model of the ship.

Preferably, the four-degree-of-freedom nonlinear dynamic model of the ship is given by equations (1) and (2). The quantities in these equations are: the ship's position and attitude vector in the geodetic coordinate system, comprising the longitudinal position, the lateral position, the vertical position, and the heading angle of the ship; the time derivative of that vector; the velocity vector in the hull coordinate system, comprising the longitudinal velocity, the lateral velocity, the vertical velocity, and the yaw angular velocity; the time derivative of the velocity vector; the rotation matrix between the hull coordinate system and the geodetic coordinate system, together with the two properties it satisfies; the inertial mass matrix; the nonlinear hydrodynamic function; the restoring force vector, including its component in the vertical degree of freedom; the vector of control forces and torque; and the external disturbance vector.

Individual terms of the model are expanded in equations (3), (4), and (5). The quantities in equations (3) and (4) are the total mass of the ship, the coordinate of the ship's center of gravity along an axis of the hull coordinate system, the yaw moment of inertia, and a set of hydrodynamic derivatives. The quantities in equation (5) are further hydrodynamic derivatives.
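The kinematic part of step S1 can be illustrated with a short sketch. The patent text does not reproduce the rotation matrix, so the block below assumes the form commonly used for surge, sway, heave, and yaw, in which only the two horizontal components are rotated by the heading angle; the names `rotation_matrix`, `mat_vec`, `transpose`, and `eta_dot` are hypothetical.

```python
import math


def rotation_matrix(psi):
    """Assumed 4-DOF rotation matrix (surge, sway, heave, yaw) for heading psi in radians."""
    c, s = math.cos(psi), math.sin(psi)
    return [
        [c, -s, 0.0, 0.0],
        [s, c, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def mat_vec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def eta_dot(eta, nu):
    """Pose rate in the geodetic frame from hull-frame velocity nu; eta = [x, y, z, psi]."""
    return mat_vec(rotation_matrix(eta[3]), nu)
```

With a heading of 90 degrees, a pure surge velocity maps onto the lateral geodetic axis, while heave and yaw rate pass through unchanged. Under this assumed form the transpose of the matrix is also its inverse.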

[0018] S2. Based on the four-degree-of-freedom nonlinear dynamic model of the ship and the set desired ship pose, obtain the error vector.

Preferably, the error vector is given by equation (6), whose terms are the error vector and the set desired pose of the ship.

[0019] S3. Based on the error vector, define a virtual control law using the backstepping method; pass the virtual control law through a first-order filter to obtain the filter output, and then obtain the velocity error.

Preferably, the virtual control law is given by equation (7), whose terms are the virtual control law, a positive definite diagonal matrix, and a time-derivative term. The first-order filter is given by equation (8), whose terms are the output of the first-order filter, its time derivative, and the time-constant matrix. The velocity error is given by equation (9).
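Steps S2 and S3 follow the usual pattern of backstepping with a first-order command filter. The sketch below is only an illustration under stated assumptions: the pose error is taken as pose minus desired pose, the virtual control law as the transposed rotation applied to a proportional correction plus the desired pose rate, and the filter is integrated with explicit Euler. None of these forms is confirmed by the patent text, and the names `virtual_control`, `filter_step`, and `velocity_error` are hypothetical.

```python
import math


def virtual_control(eta, eta_d, eta_d_dot, k1):
    """Return (pose error, virtual control law) for pose eta = [x, y, z, psi].

    k1 holds the diagonal of an assumed positive definite gain matrix.
    """
    c, s = math.cos(eta[3]), math.sin(eta[3])
    z1 = [e - d for e, d in zip(eta, eta_d)]
    # Desired pose rate in the geodetic frame: proportional correction plus feedforward.
    w = [-k * z + dd for k, z, dd in zip(k1, z1, eta_d_dot)]
    # Rotate the horizontal components into the hull frame (transpose of the rotation).
    alpha = [c * w[0] + s * w[1], -s * w[0] + c * w[1], w[2], w[3]]
    return z1, alpha


def filter_step(beta, alpha, t_const, dt):
    """One explicit-Euler step of t_const * beta_dot + beta = alpha, per channel."""
    return [b + dt * (a - b) / t for b, a, t in zip(beta, alpha, t_const)]


def velocity_error(nu, beta):
    """Velocity error: hull-frame velocity minus the filtered virtual control."""
    return [n - b for n, b in zip(nu, beta)]
```

The filter avoids differentiating the virtual control law analytically: its output follows the virtual control law with a lag set by the time constants, and the velocity error is then formed against the filtered signal.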

[0020] S4. Based on the ship's velocity vector, obtain the output of the Actor network using the improved Actor-Critic reinforcement learning framework.

Preferably, the improved Actor-Critic reinforcement learning framework uses an artificial fish swarm algorithm to globally optimize the initial weights and learning rates, and introduces a loss network to adjust the learning rate of the composite observer. The artificial fish swarm algorithm introduces two measures: a fish swarm dispersion measure, which quantifies the spatial diversity of the individual fish in the swarm and is intended to prevent premature convergence to local regions, and an optimal-fish improvement rate, which quantifies how effectively the position of the optimal fish is being updated. These are given by equations (10) and (11), whose terms are: the fish swarm dispersion degree, where a higher value indicates stronger exploration ability; the size of the fish swarm; the position of each artificial fish; the center of gravity of the fish swarm; a small constant that avoids singularities; the optimal position of each artificial fish at a given time, which is used to update the Actor weights or to correct the Critic evaluation for the corresponding dimension; and the integral reinforcement interval.

The gradient-descent adaptive update laws are given by equations (12), (13), and (14), whose terms are: for the Critic network, the derivative of each weight, the learning rate, the increment or gradient of the basis functions, the change in the basis functions, the performance index function, a regulating factor, the guiding term supplied by the fish swarm algorithm, and a positive gain coefficient; for the Actor network, the estimate and the derivative of each weight, the learning rate, the basis function vector, and the velocity error of each degree of freedom; for the loss network, the derivative of each weight, the learning rate, the basis function vector, and the position error of each degree of freedom; and a set of gain coefficients.

Specifically, in the improved Actor-Critic reinforcement learning framework, the long-term utility function components on the four degrees of freedom are defined as the learning objectives of the Critic network, and the loss function components are defined as the learning objectives of the loss network. The long-term utility function component is given by equation (15), whose terms are: the long-term utility function of each degree of freedom; the integral reinforcement interval; the discount rate; and the performance tracking index function. The loss function component is given by equation (16), whose terms are: the loss function of each degree of freedom; the position error of each degree of freedom; the velocity error of each degree of freedom; and the disturbance estimation error of each degree of freedom. The long-term utility function components are combined into a long-term utility function vector, and the loss function components are combined into a loss function vector.

The long-term utility function integrates future information that is not available at the current moment and is difficult to compute directly, even for linear systems. To solve this problem, a Critic network is used to approximate it, as given by equation (17), whose terms are the ideal weight matrix of the Critic network and the approximation error of the Critic network.
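The two fish swarm measures described in step S4 can be sketched as follows. The patent's equations are not reproduced in the text, so the forms here are assumptions: dispersion is taken as the mean Euclidean distance of the fish from the swarm's center of gravity, and the improvement rate as the relative displacement of the best position between two evaluations, with a small constant guarding the denominator. The names `centroid`, `dispersion`, and `improvement_rate` are hypothetical.

```python
import math


def centroid(swarm):
    """Center of gravity of a swarm given as a list of equal-length position lists."""
    n = len(swarm)
    return [sum(fish[k] for fish in swarm) / n for k in range(len(swarm[0]))]


def dispersion(swarm):
    """Assumed dispersion measure: mean distance of the fish from the centroid."""
    c = centroid(swarm)
    return sum(math.dist(fish, c) for fish in swarm) / len(swarm)


def improvement_rate(best_now, best_prev, eps=1e-9):
    """Assumed improvement rate: relative movement of the best position.

    eps is the small constant that keeps the ratio finite when best_prev is zero.
    """
    origin = [0.0] * len(best_prev)
    return math.dist(best_now, best_prev) / (math.dist(best_prev, origin) + eps)
```

A swarm spread over the corners of a square has a positive dispersion, and a swarm collapsed onto one point has zero dispersion. A falling dispersion together with a stalled improvement rate is the kind of signal that could trigger extra exploration.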

[0021] Preferably, the uncertain model structure and the uncertain hydrodynamic derivatives are approximated by the Actor network to obtain the approximate output of the uncertainty term of the four-degree-of-freedom nonlinear dynamic model of the ship, and the gain uncertainty of the actuator is compensated online and dynamically optimized by an adaptive technique. The output of the Actor network is given by equation (18), whose terms are: the uncertainty term of the four-degree-of-freedom nonlinear dynamic model of the ship; the ideal weight matrix of the Actor network; the basis function vector of the Actor network; and the approximation error of the Actor network.
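The Actor approximation described above has the familiar structure of a weight matrix applied to a basis function vector. A minimal sketch follows, assuming Gaussian radial basis functions of the hull-frame velocity; the patent text does not state which basis functions are used, and the names `rbf_features` and `actor_output` are hypothetical.

```python
import math


def rbf_features(nu, centers, width):
    """Gaussian basis function vector evaluated at velocity nu (assumed basis type)."""
    return [
        math.exp(-sum((n - c) ** 2 for n, c in zip(nu, ctr)) / (2.0 * width ** 2))
        for ctr in centers
    ]


def actor_output(weights, phi):
    """Transposed weights times phi: one approximated uncertainty value per degree of freedom.

    weights has one row per basis function and one column per degree of freedom.
    """
    n_out = len(weights[0])
    return [sum(weights[j][i] * phi[j] for j in range(len(phi))) for i in range(n_out)]
```

In use, `weights` would hold the current weight estimates rather than the unknown ideal weights, and the gap between the two is what the adaptive update laws are meant to shrink.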

[0022] Specifically, the loss function, which contains unpredictable external disturbances, is also difficult to compute directly. It is approximated using the loss network, as given by equation (19), whose terms are the ideal weight matrix of the loss network and the approximation error of the loss network.

[0023] Specifically, because the true values of the ideal weights are unknown, the above variables are approximated by their estimates, as given by equations (20), (21), and (22). The terms of these equations are: the approximate estimates associated with the Critic network, the Actor network, and the loss network; a piecewise function vector related to the error threshold; and the weight estimates of the Critic network, the Actor network, and the loss network. Each element of the piecewise function vector takes its nonzero value when the threshold condition holds and is 0 otherwise, and each indexed term denotes the corresponding element of the associated vector.

[0024] S5. Based on the improved Actor-Critic reinforcement learning framework and the ship's velocity vector, obtain the estimated value of the external disturbance through the designed disturbance observer.

In this embodiment, nonlinear dynamic positioning (DP) ships typically face a dual challenge of model uncertainty and complex environmental disturbances, such as unmodeled hydrodynamics, thruster gain variations, and coupling disturbances. This embodiment designs a composite observer that integrates reinforcement learning and artificial fish swarm optimization to perform anti-interference control.

Preferably, the designed disturbance observer is given by equation (23), whose terms are: the estimate of the external disturbance; the auxiliary state vector of the observer; the observer gain diagonal matrix; the basis function vector of the loss network; the learning rate matrix; the weight estimate of the loss network; and, for each degree of freedom, the auxiliary state variable of the disturbance observer and its derivative, the observer gain, the corresponding component of the Actor network output, the estimate of the external disturbance, the control force or torque, the corresponding row of the inertial mass matrix, and the velocity.
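The auxiliary-state structure listed for the observer resembles a standard nonlinear disturbance observer. The sketch below shows that standard form for a single degree of freedom with scalar mass, under the assumption that the dynamics are mass times acceleration equals control plus known dynamics plus disturbance. It omits the loss-network correction term that the patent adds, so it is an illustration of the underlying idea rather than the patent's observer; `simulate_observer` and all its parameters are hypothetical.

```python
def simulate_observer(m=2.0, k=5.0, d_true=1.5, dt=0.001, steps=4000):
    """Estimate a constant disturbance d_true acting on m * v_dot = tau + f + d.

    Observer (standard form, assumed here):
        d_hat = p + k * m * v
        p_dot = -k * (tau + f + d_hat)
    which gives the estimation dynamics d_hat_dot = k * (d - d_hat).
    """
    v = 0.0
    p = -k * m * v  # auxiliary state chosen so the initial estimate is zero
    for _ in range(steps):
        tau = -1.0 * v  # arbitrary stabilising control input
        f = -0.5 * v    # stands in for the approximated model dynamics
        d_hat = p + k * m * v
        p_dot = -k * (tau + f + d_hat)
        v_dot = (tau + f + d_true) / m
        p += dt * p_dot
        v += dt * v_dot
    return p + k * m * v


estimate = simulate_observer()
```

The auxiliary state lets the observer avoid measuring acceleration: the estimate is rebuilt from the velocity and an integrated state, and the estimation error decays at a rate set by the observer gain `k`.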

[0025] S6. Based on the estimated value of the external disturbance, the approximate output of the uncertainty term of the four-degree-of-freedom nonlinear dynamic model of the ship, and the velocity error, calculate the compensated desired acceleration.

Preferably, the compensated desired acceleration is given by equation (24), whose terms are: the compensated desired acceleration; the velocity error feedback matrix; and the output of the Actor network.

[0026] S7. Calculate the thruster pitch ratio based on the compensated desired acceleration, and obtain the final control command from the pitch ratio command and the thrust distribution model.

Preferably, the pitch ratio of the thruster is given by equations (25) and (26), whose terms are: the pitch ratio of the propeller; the adaptive update rate of each estimated parameter; a diagonal matrix formed from the corresponding diagonal elements; the pseudo-inverse of the thrust allocation matrix; a positive adaptive gain; the indices over the ship's velocity degrees of freedom; the row-and-column elements of the pseudo-inverse of the thrust allocation matrix; the row-and-column elements of the thrust allocation matrix; the components of the compensated desired acceleration; the indexed velocity error; and the initial value of the estimate. The thrust distribution model is given by equation (27), whose terms are: the vector of control forces and torque, i.e., the control command; the thrust allocation matrix; and a diagonal matrix of unknown coefficients for the propeller rotational speed.
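The role of the pseudo-inverse in step S7 can be illustrated with a reduced example. The sketch assumes a thrust distribution model of the form control vector equals allocation matrix times a diagonal coefficient matrix times the actuator command, and it treats the coefficients as known, whereas the patent estimates them adaptively. To stay short it handles only two generalized forces and any number of thrusters; `pinv_wide_2xn`, `allocate`, and `generalized_force` are hypothetical names.

```python
def pinv_wide_2xn(t):
    """Right pseudo-inverse of a full-row-rank 2 x n matrix t (list of two rows)."""
    a = sum(x * x for x in t[0])
    b = sum(x * y for x, y in zip(t[0], t[1]))
    d = sum(y * y for y in t[1])
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("allocation matrix is not full row rank")
    inv = [[d / det, -b / det], [-b / det, a / det]]  # inverse of t times its transpose
    return [
        [t[0][j] * inv[0][c] + t[1][j] * inv[1][c] for c in range(2)]
        for j in range(len(t[0]))
    ]


def allocate(tau_c, t, kappa):
    """Actuator commands u such that t * diag(kappa) * u reproduces tau_c."""
    t_pinv = pinv_wide_2xn(t)
    thrust = [row[0] * tau_c[0] + row[1] * tau_c[1] for row in t_pinv]
    return [f / k for f, k in zip(thrust, kappa)]


def generalized_force(t, kappa, u):
    """Forward model: t * diag(kappa) * u."""
    return [sum(t_ij * k * u_j for t_ij, k, u_j in zip(row, kappa, u)) for row in t]


T = [[1.0, 1.0, 0.0], [0.5, -0.5, 1.0]]
KAPPA = [2.0, 2.0, 4.0]
u = allocate([4.0, 1.0], T, KAPPA)
```

Because there are more thrusters than generalized forces, many commands produce the same force; the pseudo-inverse picks the one with the smallest total squared thrust.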

[0027] In this embodiment, the same positioning task is completed using both the anti-interference control method of this embodiment and, for comparison, a robust adaptive dynamic positioning control algorithm based on feedback propagation delay, thereby verifying the advantages and effectiveness of the present invention. The comparison also verifies the dynamic optimization effect of the artificial fish swarm algorithm and the superiority of the improved Actor-Critic reinforcement learning framework built on it. Environmental disturbances are treated as unavoidable: in actual marine engineering they are random and uncertain, and the designed observer is used to observe them so that anti-interference control can be achieved.

Figure 2 shows the positioning trajectories of the control algorithm proposed in this invention and of the comparison algorithm. As shown in Figure 2, with the anti-interference control method of this embodiment the position trajectory converges rapidly and remains near the target position with minimal fluctuation, whereas the comparison algorithm fluctuates significantly in the initial stage and stabilizes only gradually. Figures 3 and 4 show that the method of this embodiment performs well in tracking accuracy, stability, dynamic response, and steady-state accuracy, indicating that the ship is more resistant to drift caused by wind and waves; its low fluctuation and high accuracy suit demanding marine engineering scenarios. The position variables converge to their target values faster, and the fluctuation amplitude in the steady-state stage is significantly reduced, demonstrating more accurate positioning. In terms of velocity control, the transient overshoot of the longitudinal velocity, lateral velocity, vertical velocity, and yaw rate is lower, the dynamic adjustment is smoother, and the steady-state velocity fluctuation is gentler. The method of this embodiment therefore has better response characteristics and stronger anti-interference ability, maintains the ship's motion state more stably, and better meets the high-precision, high-stability requirements of dynamic positioning systems.

As shown in Figure 5, the forces and torque produced by the method of this embodiment vary more smoothly over time, with few sudden adjustments or fluctuations in the control input. The thrust and torque adjustments contain fewer high-frequency fluctuations, which reduces wear from frequent actuator action and improves operating stability. Figure 6 indicates that the disturbance observer effectively tracks the actual state of the system for most of the time. Although the estimate initially deviates from the actual value, the observation error diminishes over time as the velocity error decreases and the correction term is dynamically optimized through iteration. This shows that the disturbance observer has strong state estimation capability and can capture the dynamic changes of the system in multiple dimensions. The observed values are extremely close to the true values, and the disturbance estimation error in each dimension stays within a very small range, indicating higher observation accuracy and stronger anti-interference capability. This in turn supports the stability of the control input and the high precision of position and velocity control. Overall, the proposed algorithm performs better in control efficiency, observation accuracy, and system stability.

Figures 7 to 9 present the main results of the robustness comparison between the two algorithms. Selected degrees of freedom, together with their corresponding velocities and control forces, are compared, with a focus on positioning accuracy, velocity response stability, and control force output efficiency. As shown in the figures, the method of this embodiment significantly reduces position fluctuation and speeds up convergence, and the velocity and control force change more smoothly. Combining the dynamically optimized reinforcement learning framework with the disturbance observer further improves convergence speed and enhances the transient response of the system. Figures 7 to 9 verify that the system enters the steady state smoothly, with overshoot suppressed, oscillation and amplitude reduced, and stability improved. In terms of velocity, the method of this embodiment shows low oscillation frequency and small fluctuation amplitude, ensuring smooth convergence, whereas the comparison algorithm shows severe initial oscillation of relatively long duration. The comparison indicates that, after dynamic optimization, convergence is faster, the process is smoother, steady-state fluctuations are almost eliminated, and control energy consumption is significantly reduced. Dynamic optimization also improves the ship's positioning accuracy, meeting the high-precision position requirements of marine engineering operations, and the steadier velocity directly reduces sudden changes in ship attitude and the energy consumption and mechanical wear caused by frequent thruster adjustments.

[0028] The present invention has the following beneficial effects. It provides a ship dynamic positioning anti-interference control method that considers vertical dynamic optimization. Through an improved Actor-Critic reinforcement learning framework, the method approximates and compensates online for the model uncertainties (such as unmodeled dynamics and parameter perturbations) in the ship's four-degree-of-freedom nonlinear dynamic model. This removes the need for an accurate mathematical model and allows the control system to adapt to different sea states and ship loading conditions. A specially designed disturbance observer estimates in real time the equivalent force of external environmental disturbances (such as wind, waves, and currents), and the estimate is fed forward as compensation in the desired acceleration calculation. This significantly suppresses the effect of external disturbances on the ship's attitude and achieves high-precision anti-interference dynamic positioning.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A ship dynamic positioning anti-interference control method considering vertical dynamic optimization, characterized in that it comprises:
S1. Establishing a nonlinear dynamic model of the ship with four degrees of freedom;
S2. Based on the nonlinear dynamic model of the ship with four degrees of freedom, and combined with the set desired ship pose, obtaining the error vector;
S3. Based on the error vector, defining a virtual control law using the backstepping method; processing the virtual control law with a first-order filter to obtain the output of the first-order filter, and then obtaining the velocity error;
S4. Based on the ship's velocity vector, obtaining the output of the Actor network using the improved Actor-Critic reinforcement learning framework;
S5. Based on the improved Actor-Critic reinforcement learning framework and the ship's velocity vector, obtaining the estimated value of the external disturbance through the designed disturbance observer;
S6. Based on the estimated value of the external disturbance, the output of the Actor network, and the velocity error, calculating the compensated desired acceleration;
S7. Calculating the thruster pitch ratio based on the compensated desired acceleration, and finally obtaining the control command based on the pitch ratio command and the thrust allocation model.

2. The ship dynamic positioning anti-interference control method considering vertical dynamic optimization according to claim 1, characterized in that the nonlinear dynamic model of the ship with four degrees of freedom is expressed by equations (1) and (2), whose formulas and symbols are not reproduced in this text. The quantities in these equations are: the ship's position and attitude vector in the geodetic coordinate system, consisting of the longitudinal position, the lateral position, the vertical position, and the heading angle of the ship, together with the derivative of this vector; the velocity vector of the ship's hull, consisting of the longitudinal velocity, the lateral velocity, the vertical velocity, and the yaw angular velocity, together with the derivative of this vector; the rotation matrix between the ship's coordinate system and the geodetic coordinate system, which satisfies two conditions whose expressions are likewise not reproduced; the inertial mass matrix; the nonlinear hydrodynamic function; the restoring force vector, including its component in the vertical degree of freedom; the vector of control forces and torque; and the external disturbance vector.
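As an illustration of the kinematic part of such a model, the following Python sketch builds a rotation matrix for the four degrees of freedom and maps hull velocities to pose rates. The formulas of equations (1) and (2) are not shown in this text, so everything here is an assumption: the names `eta`, `nu`, and `psi`, the Fossen-style relation `eta_dot = R(psi) @ nu`, and the choice that yaw only rotates the horizontal plane while heave and yaw rate pass through unchanged.

```python
import numpy as np

def rotation_matrix(psi: float) -> np.ndarray:
    """Assumed 4-DOF rotation (surge, sway, heave, yaw) from hull to geodetic frame."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([
        [c,  -s,  0.0, 0.0],
        [s,   c,  0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],   # heave is assumed unaffected by heading
        [0.0, 0.0, 0.0, 1.0],   # yaw rate maps directly to heading rate
    ])

def pose_rate(eta: np.ndarray, nu: np.ndarray) -> np.ndarray:
    """Assumed kinematics: eta_dot = R(psi) @ nu, with eta = [x, y, z, psi]."""
    return rotation_matrix(eta[3]) @ nu

# Hypothetical values: heading of 90 degrees, mostly surge motion.
eta = np.array([0.0, 0.0, 0.0, np.pi / 2])
nu = np.array([1.0, 0.0, 0.2, 0.05])
R = rotation_matrix(eta[3])
eta_dot = pose_rate(eta, nu)
```

With a heading of 90 degrees, pure surge in the hull frame becomes motion along the second geodetic axis, while the vertical velocity and yaw rate are unchanged. The matrix built this way is orthogonal, which matches the usual property required of a rotation matrix.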

3. The ship dynamic positioning anti-interference control method considering vertical dynamic optimization according to claim 2, characterized in that the error vector is expressed by equation (3), whose formula and symbols are not reproduced in this text, and which relates the error vector to the set desired pose of the ship.

4. The ship dynamic positioning anti-interference control method considering vertical dynamic optimization according to claim 3, characterized in that the virtual control law is expressed by equation (4), the first-order filter by equation (5), and the velocity error by equation (6); the formulas and symbols of these equations are not reproduced in this text. Equation (4) involves the virtual control law, a positive definite diagonal matrix, and a derivative term; equation (5) involves the output of the first-order filter, a derivative term, and a matrix of time constants; equation (6) defines the velocity error.
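The chain from pose error to velocity error can be sketched in Python as follows. Since equations (3) to (6) are not shown in this text, the forms used here are assumptions taken from the common backstepping and dynamic surface control pattern: `z1 = eta - eta_d`, `alpha = R^T (-K1 z1 + eta_d_dot)`, a filter `T * alpha_f_dot + alpha_f = alpha`, and `z2 = nu - alpha_f`. All names and numerical values are hypothetical.

```python
import numpy as np

def virtual_control(R, K1, z1, eta_d_dot):
    """Assumed backstepping law: alpha = R^T (-K1 z1 + eta_d_dot)."""
    return R.T @ (-K1 @ z1 + eta_d_dot)

def filter_step(alpha_f, alpha, T, dt):
    """One forward-Euler step of the assumed filter T * alpha_f_dot + alpha_f = alpha."""
    return alpha_f + dt * np.linalg.solve(T, alpha - alpha_f)

K1 = np.diag([0.5, 0.5, 0.5, 0.5])   # hypothetical positive definite diagonal gain
T = np.diag([0.1, 0.1, 0.1, 0.1])    # hypothetical filter time constants
R = np.eye(4)                        # rotation matrix at zero heading

eta = np.array([2.0, -1.0, 0.5, 0.1])   # current pose (hypothetical)
eta_d = np.zeros(4)                     # constant desired pose
z1 = eta - eta_d                        # assumed pose error definition

# For the demo the pose is held fixed, so alpha is constant and the
# filter output simply settles onto it.
alpha = virtual_control(R, K1, z1, np.zeros(4))
alpha_f = np.zeros(4)
for _ in range(1000):
    alpha_f = filter_step(alpha_f, alpha, T, dt=0.01)

nu = np.zeros(4)        # ship at rest
z2 = nu - alpha_f       # assumed velocity error definition
```

The filter avoids differentiating the virtual control law analytically, which is the usual motivation for adding it to a backstepping design.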

5. The ship dynamic positioning anti-interference control method considering vertical dynamic optimization according to claim 4, characterized in that the improved Actor-Critic reinforcement learning framework uses an artificial fish swarm algorithm to globally optimize the initial weights and the learning rate, and introduces a loss network to adjust the learning rate of the composite observer.
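To make the idea of a fish-swarm search over initial weights and a learning rate more concrete, here is a deliberately simplified Python sketch. It is not the procedure of this claim: it keeps only the follow, prey, and random-move behaviours, omits the swarm (centre-seeking) behaviour and the crowding factor, and minimizes a hypothetical quadratic cost. The function name `afsa_minimize`, its parameters, and the cost function are all illustrative assumptions.

```python
import numpy as np

def afsa_minimize(cost, dim, n_fish=20, visual=1.0, step=0.3,
                  tries=5, iters=100, seed=0):
    """Simplified artificial fish swarm search (follow + prey + random move)."""
    rng = np.random.default_rng(seed)
    fish = rng.uniform(-2.0, 2.0, size=(n_fish, dim))
    costs = np.array([cost(x) for x in fish])
    best_x, best_c = fish[costs.argmin()].copy(), costs.min()
    initial_best = best_c
    for _ in range(iters):
        for i in range(n_fish):
            x = fish[i].copy()
            target = None
            # Follow: head for the best neighbour in visual range if it is better.
            dists = np.linalg.norm(fish - x, axis=1)
            nbrs = np.where((dists < visual) & (dists > 0))[0]
            if nbrs.size:
                j = nbrs[costs[nbrs].argmin()]
                if costs[j] < costs[i]:
                    target = fish[j].copy()
            # Prey: otherwise probe a few random points in visual range.
            if target is None:
                for _attempt in range(tries):
                    probe = x + visual * rng.uniform(-1.0, 1.0, dim)
                    if cost(probe) < costs[i]:
                        target = probe
                        break
            if target is None:
                new_x = x + step * rng.uniform(-1.0, 1.0, dim)   # random move
            else:
                direction = target - x
                new_x = x + step * rng.uniform() * direction / (
                    np.linalg.norm(direction) + 1e-12)
            fish[i] = new_x
            costs[i] = cost(new_x)
            # Bulletin board: remember the best point seen so far.
            if costs[i] < best_c:
                best_x, best_c = new_x.copy(), costs[i]
    return best_x, best_c, initial_best

# Hypothetical search over two initial weights and one learning rate.
target_params = np.array([0.5, -0.2, 0.01])

def demo_cost(p):
    return float(np.sum((p - target_params) ** 2))

best_x, best_c, initial_best = afsa_minimize(demo_cost, dim=3)
```

In a real controller the cost would be some measure of closed-loop performance obtained from simulation, which makes each evaluation far more expensive than in this toy case.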

6. The ship dynamic positioning anti-interference control method considering vertical dynamic optimization according to claim 4, characterized in that the output of the Actor network is expressed by equation (7), whose formula and symbols are not reproduced in this text. The quantities in this equation are: the uncertainty term in the nonlinear dynamic model of the ship with four degrees of freedom; the ideal weight matrix of the Actor network; the basis function vector of the Actor network; and the approximation error of the Actor network.
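A weight matrix acting on a basis function vector is the typical structure of a linearly parameterized approximator. The Python sketch below assumes the form `W^T S(nu)` with Gaussian radial basis functions, which is a common choice but is not confirmed by this text. The weights are fitted offline by least squares purely for illustration, whereas the method described here updates them online within the Actor-Critic framework. The target function `hypothetical_uncertainty` and all sizes are invented for the demo.

```python
import numpy as np

def gaussian_basis(nu, centers, width=1.0):
    """Gaussian radial basis vector S(nu), one entry per center."""
    d2 = np.sum((centers - nu) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * width ** 2))

def actor_output(W, nu, centers, width=1.0):
    """Assumed Actor output: W^T S(nu), one value per degree of freedom."""
    return W.T @ gaussian_basis(nu, centers, width)

def hypothetical_uncertainty(nu):
    """Stand-in for the unknown model term (quadratic damping, invented)."""
    return -0.3 * nu * np.abs(nu)

rng = np.random.default_rng(0)
centers = rng.uniform(-1.0, 1.0, size=(25, 4))   # hypothetical basis centers
samples = rng.uniform(-1.0, 1.0, size=(200, 4))  # sampled velocity vectors

Phi = np.array([gaussian_basis(nu, centers) for nu in samples])   # (200, 25)
Y = np.array([hypothetical_uncertainty(nu) for nu in samples])    # (200, 4)

# Offline least-squares fit of the weights (a stand-in for online learning).
W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)

err_zero = np.linalg.norm(Y)             # error with all-zero weights
err_fit = np.linalg.norm(Phi @ W - Y)    # error after fitting
out = actor_output(W, samples[0], centers)
```

The residual that remains after fitting plays the role of the approximation error named in the claim.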

7. The ship dynamic positioning anti-interference control method considering vertical dynamic optimization according to claim 5, characterized in that the designed disturbance observer is expressed by equation (8), whose formula and symbols are not reproduced in this text. The quantities in this equation are: the estimate of the external disturbance; the auxiliary state variables of the observer; the observer gain diagonal matrix; the basis function vector of the loss network; the learning rate matrix; the estimated weight matrix of the loss network; and, for each degree of freedom, the auxiliary state variable of the disturbance observer and its derivative, the corresponding observer gain, the corresponding component of the Actor network output, the estimate of the external disturbance, the control force or torque, the corresponding row of the inertial mass matrix, and the corresponding velocity component.
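The role of the auxiliary state and the observer gain can be illustrated with a textbook nonlinear disturbance observer, sketched below in Python. This is a swapped-in standard design, not equation (8): it leaves out the loss network and its learning-rate adaptation, treats the uncertainty term as known, and assumes dynamics of the form `M nu_dot = tau + d - F(nu)`. The matrices, the damping function, and the constant disturbance are hypothetical.

```python
import numpy as np

M = np.diag([5.0, 8.0, 6.0, 3.0])          # hypothetical inertial mass matrix
K0 = np.diag([2.0, 2.0, 2.0, 2.0])         # hypothetical observer gain
d_true = np.array([1.0, -0.5, 0.3, 0.1])   # constant disturbance for the demo

def damping(nu):
    """Hypothetical model term F(nu); in the method this role falls to the Actor output."""
    return 0.5 * nu

xi = np.zeros(4)     # auxiliary observer state
nu = np.zeros(4)     # hull velocity
tau = np.zeros(4)    # control input held at zero for the demo
dt = 0.01

for _ in range(1000):
    F = damping(nu)
    d_hat = xi + K0 @ M @ nu                   # disturbance estimate
    xi_dot = -K0 @ (d_hat + tau - F)           # auxiliary state dynamics
    nu_dot = np.linalg.solve(M, tau + d_true - F)
    xi = xi + dt * xi_dot
    nu = nu + dt * nu_dot

d_hat = xi + K0 @ M @ nu
```

Substituting the assumed dynamics shows that the estimate obeys `d_hat_dot = K0 (d - d_hat)`, so for a constant disturbance the estimation error decays exponentially at a rate set by the gain, without measuring acceleration.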

8. The ship dynamic positioning anti-interference control method considering vertical dynamic optimization according to claim 7, characterized in that the compensated desired acceleration is expressed by equation (9), whose formula and symbols are not reproduced in this text. The quantities in this equation are: the compensated desired acceleration; the velocity error feedback matrix; and the output of the Actor network.

9. The ship dynamic positioning anti-interference control method considering vertical dynamic optimization according to claim 8, characterized in that the pitch ratio of the thruster is expressed by equations (10) and (11), and the thrust allocation model by equation (12); the formulas and symbols of these equations are not reproduced in this text. The quantities in equations (10) and (11) are: the pitch ratio of the propeller; the adaptive update law of each estimated parameter; the diagonal matrix whose diagonal elements are the estimated parameters; the pseudo-inverse of the thrust allocation matrix; a positive adaptive gain; two indices over the ship's velocity degrees of freedom; the indexed row and column elements of the pseudo-inverse of the thrust allocation matrix; the indexed row and column elements of the thrust allocation matrix; the indexed component of the compensated desired acceleration; the indexed component of the velocity error; and an initial value. The quantities in equation (12) are: the vector of control forces and torque, that is, the control command; the thrust allocation matrix; and a diagonal matrix of unknown coefficients for the propeller rotational speed.
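The part played by the pseudo-inverse of the thrust allocation matrix can be shown with a short Python sketch. It assumes an allocation model of the form `tau = B @ K @ u`, which is consistent with the quantities listed for equation (12) but is not confirmed by this text. It also treats the coefficient matrix `K` as known, whereas the claim describes these coefficients as unknown and estimated adaptively. The matrix entries, the six-thruster layout, and the commanded force vector are hypothetical.

```python
import numpy as np

# Hypothetical 4 x 6 thrust allocation matrix: rows are surge, sway, heave, yaw.
B = np.array([
    [1.0,  1.0, 0.0,  0.0, 0.0, 0.0],
    [0.0,  0.0, 1.0,  1.0, 0.0, 0.0],
    [0.0,  0.0, 0.0,  0.0, 1.0, 1.0],
    [0.5, -0.5, 2.0, -2.0, 0.0, 0.0],
])

# Hypothetical thruster coefficients, treated as known for this demo only.
K = np.diag([2.0, 2.0, 1.5, 1.5, 1.0, 1.0])

def allocate(tau_cmd: np.ndarray) -> np.ndarray:
    """Minimum-norm thruster command u such that B @ K @ u equals tau_cmd."""
    B_pinv = np.linalg.pinv(B)               # Moore-Penrose pseudo-inverse
    return np.linalg.solve(K, B_pinv @ tau_cmd)

tau_cmd = np.array([10.0, -4.0, 2.0, 1.5])   # desired forces and yaw torque
u = allocate(tau_cmd)
tau_achieved = B @ K @ u
```

Because `B` has full row rank here, the pseudo-inverse reproduces the commanded vector exactly and, among all solutions, picks the one with the smallest Euclidean norm of `K @ u`.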