A wind turbine active power collaborative optimization method based on piecewise linear load prediction and convex state-value function, equipment and medium

By combining piecewise linear load prediction with a convex state-value function-based active power co-optimization method for wind turbines, the problem of coordinating long-term fatigue suppression and short-term power dispatch in wind farm control is solved, achieving precise load smoothing control and equipment protection under wide-area operating conditions.

CN122246864A · Pending · Publication Date: 2026-06-19 · SHANDONG UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHANDONG UNIV
Filing Date
2026-02-14
Publication Date
2026-06-19


Abstract

This invention provides a method, equipment, and medium for coordinated active power optimization of wind turbines based on piecewise linear load prediction and convex state-value functions, belonging to the field of new energy power generation control and optimization technology. First, data is collected and preprocessed. A high-dimensional state feature vector is constructed using state additive hat-shaped basis functions, transforming the model into an equivalent linear prediction model. Then, a "state-action value" input convex neural network is constructed using reinforcement learning as the parameterized value function. Tangents are calculated at anchor points to form a linear lower bound constraint for long-term control value. Finally, an objective is set, and a linear programming problem is constructed and solved online, yielding a reference command sequence for the unit's active power to be transmitted and executed. This invention, by constructing multiple models and constraints, can predict loads, evaluate long-term control value, and quickly obtain a power reference value sequence satisfying multiple constraints through online solving of the linear programming problem, improving the stability and safety of unit operation.

Description

Technical Field

[0001] This invention belongs to the field of new energy power generation control and optimization technology, and specifically relates to a method, equipment and medium for coordinated active power optimization of wind turbine units based on piecewise linear load prediction and a convex state-value function.

Background Technology

[0002] The existing field-level power allocation methods are mainly as follows: one is linear MPC based on mechanistic models; the other is empirical regression or black-box prediction superimposed with QP / MILP.

[0003] For related technologies, the above methods perform load prediction under wide-area operating conditions. While deep learning networks can fit load responses under complex operating conditions, their complex structure makes them unsuitable for solving the linear engineering constraints of wind farms simultaneously. Although linear models are easy to optimize, their fitting ability is limited. When wind turbines switch between start-up, rated operation, and shutdown conditions, the inability to capture nonlinear characteristics leads to a sharp increase in prediction errors, resulting in a lack of reliable basis for load control strategies and making it difficult to achieve accurate and smooth load control.

[0004] The challenge lies in jointly addressing long-term fatigue load suppression and short-term power dispatch. Wind farm control must simultaneously meet real-time power quota requirements from the grid and ensure long-term equipment fatigue protection. However, many related technologies prioritize short-term power targets and relegate long-term fatigue suppression to a secondary role, which accelerates fatigue accumulation in equipment during frequent power adjustments. Some attempts to introduce long-term value assessments employ nonlinear value functions, which cannot be transformed into linear constraints and integrated into the real-time optimization framework. This results in a lack of coordination between long-term and short-term objectives, leading to unbalanced optimization decisions. When existing reinforcement learning is used for wind farm control, the value function is prone to overestimation. Even with a double-Q network structure, it is difficult to generate quantifiable constraints that can be integrated into linear optimization. Significant fluctuations in the value estimate can bias optimization decisions, making it impossible to stabilize the long-term fatigue suppression objective and impacting the sustainability of wind farm operation.

Summary of the Invention

[0005] This invention provides a method for coordinated optimization of active power of wind turbines based on piecewise linear load prediction and convex state-value function. The method achieves the solution through linear programming, satisfies the grid quota and unit boundary, and synergistically reduces the fatigue load of tower and shaft system, thereby improving the modeling accuracy and optimization feasibility.

[0006] The method includes: S101: Collect operating status data and structural load data of wind turbine generator sets, and perform preprocessing; S102: Based on the preprocessed data, construct a high-dimensional state feature vector using state-additive hat-shaped basis functions; S103: Using the high-dimensional state feature vector, construct a piecewise linear load prediction model that is convex with respect to the unit's active power reference command, to predict tower bending moment and shaft torque respectively; S104: Equivalently transform the piecewise linear load prediction model, which is convex with respect to the active power reference command, into a linearized prediction model that includes auxiliary variables and linear inequality relationships; S105: Construct a "state-action value" input convex neural network that is convex with respect to the dimensionless active power reference command, as a parameterized value function; S106: At multiple preset active power reference command anchor points, calculate the tangents of the parameterized value function and form a linear lower bound constraint by a linear combination of the tangents; S107: With the goal of minimizing the load variation amplitude and maximizing the linear lower bound of long-term control value, construct a linear programming problem by combining the field-level total active power constraint and the upper and lower limits of single-unit active power; S108: Solve the linear programming problem online to obtain the active power reference command sequence of the unit that satisfies all constraints, and transmit it to the wind turbine for execution.

[0007] It should be further explained that S101 specifically includes the following steps: S1011: Collect real-time operating status data of each wind turbine generator set at a preset cycle through the SCADA system or an edge computing device. The status data include ultra-short-term predicted inflow wind speed, generator speed, generator output active power, pitch angle and generator torque. S1012: Perform data quality verification on the collected real-time operating status quantities, remove missing data points, and perform amplitude limiting and cleaning on outliers that exceed the physical range. S1013: Collect and record the structural load data corresponding in time to the real-time operating status quantities, including tower root bending moment and transmission chain shaft torque; S1014: According to the preset scaling factors, the cleaned state quantities, single-machine active power reference command and structural load data are normalized and scaled respectively. The scaling factors are determined according to the typical range of each physical quantity. S1015: Organize the scaled state variables, structural load data, and corresponding active power reference commands into a time-aligned normalized sample dataset.

[0008] It should be further explained that S102 specifically includes the following steps: S1021: Determine the dimension of the state vector representing the operating conditions of the wind turbine unit; S1022: For each dimension of the state vector, set multiple nodes at equal intervals along the dimension within the interval determined by the minimum and maximum values of the historical observation data. S1023: Based on nodes, define a set of hat-shaped basis functions with local support properties for each state dimension, with the center of each basis function located at a node; S1024: For a given current state vector, calculate the output value of each dimension component on all corresponding hat-shaped basis functions; S1025: Concatenate a constant term with the output values of all hat-shaped basis functions in all dimensions to form a high-dimensional sparse feature vector.

[0009] It should be further explained that S103 specifically includes the following steps: S1031: For the two load channels of tower bending moment and shaft torque, respectively, a prediction model is constructed. The output of the prediction model is the prediction of the load value at a future moment. The input of the prediction model is the state feature vector and active power reference command at the current moment. S1032: Based on high-dimensional state feature vectors, linear mapping is performed through trainable parameter matrices to generate state sensitivity coefficients based on the current operating conditions, including intercept coefficients, linear coefficients, and two sets of non-negative piecewise linear rate of change coefficients. S1033: Within the allowable variation range of the single-machine active power reference command, a set of fixed left and right turning points are preset at equal intervals along the power axis to define the turning points of the model piecewise linearity. S1034: Apply nonnegativity constraints to the two sets of piecewise linear rate of change coefficients; S1035: Combine the intercept term, the linear term, and two sets of non-negative piecewise linear rate of change coefficients according to the preset left and right inflection point positions to calculate the predicted load value.

[0010] It should be further explained that S104 specifically includes the following steps: S1041: For each piecewise linear term in the load prediction model, auxiliary variables are configured to mathematically replace the nonlinear positive part operation in the original model. S1042: Establish linear inequality constraints between auxiliary variables, control quantities, and corresponding inflection points, such that the auxiliary variables are numerically not less than the difference between the control quantities and the inflection points and are not less than 0; S1043: Rewrite the calculation expression of the original load prediction model into a linear weighted sum of control variables, auxiliary variables and state sensitivity coefficients; S1044: Combine the rewritten linear load calculation equation with the linear inequality constraints for all auxiliary variables to form a linear equivalent system describing the input-output relationship of a single load channel; S1045: Summarize the linear equivalent system of all load channels, field-level power balance constraints, single-machine power upper and lower limit constraints, and load variation range constraints into a linear constraint set.
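As a worked illustration of S1041 to S1043, the replacement of a single positive-part term can be written as follows. The symbols (a non-negative rate-of-change coefficient γ, an inflection point κ, an auxiliary variable z) are notation assumed here for illustration, not taken from the original text.

```latex
\begin{aligned}
\text{original term:}\quad & \gamma\,(u-\kappa)_{+} = \gamma \max(0,\; u-\kappa), \qquad \gamma \ge 0,\\
\text{linear form:}\quad   & \gamma\, z, \qquad z \ge u-\kappa, \qquad z \ge 0 .
\end{aligned}
```

Because γ is non-negative, any objective or upper-bound constraint that pushes the predicted load downward drives z to its smallest feasible value, max(0, u − κ), so the linear form reproduces the original term at the optimum. This is the usual condition under which such a substitution is exact.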

[0011] It should be further explained that S105 specifically includes the following steps: S1051: Map the physical active power reference command to the zero-one interval to generate a dimensionless action representation decoupled from the upper and lower limits of single-machine power; S1052: Determine the input composition of the neural network, including dimensionless actions and feature vectors representing the current operating state of the wind turbine. S1053: Construct a multi-layer feedforward network. The update calculation of the feature vector of each layer is a linear combination of the features of the previous layer, the dimensionless action and the state feature vector, and is transformed by a non-linear activation function. S1054: During the forward propagation of the network, a non-negativity constraint is applied to the weight matrix acting on the feature vector of the previous layer to ensure that the entire network mapping is a convex function with respect to dimensionless actions. S1055: The feature vectors of the last layer of the network are combined again with the dimensionless action and state feature vectors in a constrained linear combination to output the final action-state value function estimate, which is used to characterize the long-term control value.
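The structure in S1051 to S1055 can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the network of the invention: the class name `TinyICNN`, the two hidden layers, the ReLU activation, the random weights, the purely linear state paths, and the use of absolute values to enforce non-negativity are all illustrative choices.

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def normalize_action(p_ref, p_min, p_max):
    """S1051: map a physical power reference to the zero-one interval."""
    return (p_ref - p_min) / (p_max - p_min)

class TinyICNN:
    """Hypothetical two-layer network that is convex in the scalar action a."""

    def __init__(self, state_dim, hidden=4, seed=0):
        rng = random.Random(seed)
        # State and action paths of each hidden layer: unconstrained weights.
        self.Ws = [rand_matrix(rng, hidden, state_dim) for _ in range(2)]
        self.Wa = [[rng.uniform(-1.0, 1.0) for _ in range(hidden)] for _ in range(2)]
        # Weights acting on the previous layer's features: used as |w| >= 0.
        self.Wz = rand_matrix(rng, hidden, hidden)
        self.wz_out = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        # Output-layer pass-through terms for action and state.
        self.wa_out = rng.uniform(-1.0, 1.0)
        self.ws_out = [rng.uniform(-1.0, 1.0) for _ in range(state_dim)]

    def value(self, a, s):
        # Layer 1: ReLU of a function that is affine in a, hence convex in a.
        z1 = [relu(ws + wa * a)
              for ws, wa in zip(matvec(self.Ws[0], s), self.Wa[0])]
        # Layer 2: non-negative mix of convex features plus affine terms,
        # passed through a convex, non-decreasing activation.
        wz_pos = [[abs(w) for w in row] for row in self.Wz]
        z2 = [relu(zz + ws + wa * a)
              for zz, ws, wa in zip(matvec(wz_pos, z1),
                                    matvec(self.Ws[1], s), self.Wa[1])]
        # Output: non-negative combination of z2 plus affine terms in a and s.
        out = sum(abs(w) * z for w, z in zip(self.wz_out, z2))
        return out + self.wa_out * a + sum(w * v for w, v in zip(self.ws_out, s))
```

The only weights that must be non-negative are those multiplying features of the previous layer; the direct action and state paths may take either sign, which matches the constraint described in S1054.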

[0012] It should be further explained that S106 specifically includes the following steps: S1061: Within the standard range of the dimensionless action, select a fixed number of discrete points as anchor points for evaluating the curvature of the function. S1062: For each selected anchor point, call the parameterized value function network to calculate the action-state value function value at the anchor point and its partial derivative with respect to the dimensionless action. S1063: Based on the function value and derivative value at each anchor point, calculate the parameters of the equation of the straight line passing through the point with the derivative as its slope; S1064: Introduce an auxiliary variable for each unit and establish constraints to ensure that it is not greater than the values, at the current action of the unit, of the tangents corresponding to all anchor points; S1065: Based on the auxiliary variables of all units, construct an aggregate expression for a long-term value linear lower bound across the entire field as part of the optimization objective.
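The tangent construction of S1061 to S1064 can be sketched as follows. This is a minimal sketch that assumes a closed-form convex stand-in (`stand_in_q`, with derivative `stand_in_dq`) in place of the trained value network; in practice the value and derivative at each anchor would come from the network, for example by automatic differentiation. All function names are hypothetical.

```python
def stand_in_q(a):
    """Illustrative convex function of the dimensionless action a."""
    return (a - 0.3) ** 2 + 1.0

def stand_in_dq(a):
    """Derivative of stand_in_q with respect to a."""
    return 2.0 * (a - 0.3)

def tangent_cuts(q, dq, anchors):
    """S1062-S1063: (slope, intercept) of the tangent line at each anchor."""
    return [(dq(a0), q(a0) - dq(a0) * a0) for a0 in anchors]

def lower_bound(cuts, a):
    """S1064: largest auxiliary value that stays below every tangent at a.

    Each tangent of a convex function lies below the function, so the
    minimum over the tangents is also a valid lower bound, and it is
    concave in a, which keeps its maximization a linear program.
    """
    return min(slope * a + intercept for slope, intercept in cuts)

anchors = [0.0, 0.25, 0.5, 0.75, 1.0]
cuts = tangent_cuts(stand_in_q, stand_in_dq, anchors)
```

In the linear program, each pair in `cuts` would contribute one inequality of the form "auxiliary variable ≤ slope · action + intercept" for the unit concerned.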

[0013] It should be further explained that S107 specifically includes the following steps: S1071: Assign optimization weight coefficients to the terms describing the variation range of tower and shaft loads and to the linear lower bound characterizing long-term value, and combine them with the corresponding decision variables to form a weighted multi-objective optimization function; S1072: Use the field-level power balance equation and the upper and lower limit inequalities of individual unit power as a set of constraints; S1073: Combine the linear equivalent equations and inequalities of the load prediction model, as well as the slack variable constraints introduced to smooth the load, with the power constraints to form a linear constraint system. S1074: Define the solution time domain of the optimization problem as the current single time step, and set the linear programming problem to be repeatedly constructed and solved online at a frequency matching the control cycle; S1075: Define the set of decision variables to be optimized, including the active power reference commands of all wind turbines, auxiliary variables in the load prediction model, load smoothing slack variables, and auxiliary variables for the lower bound of the value function.

[0014] According to another embodiment of this application, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the steps of the wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function.

[0015] According to another embodiment of this application, a storage medium is also provided, on which a computer program is stored, which, when executed by a processor, implements the steps of the wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function.

[0016] As can be seen from the above technical solutions, the present invention has the following advantages: The active power co-optimization method for wind turbines based on piecewise linear load prediction and convex state-value functions provided by this invention achieves data time synchronization through collaborative acquisition by the SCADA system and edge modules. Layered execution of missing data removal, outlier clipping, and standardization ensures the physical plausibility of operating status and structural load data, eliminating interference from invalid data in modeling. High-dimensional state feature vectors, relying on the local activation characteristics of hat-shaped basis functions, capture local changes and global correlations in each state dimension, characterizing the operating characteristics under different conditions such as turbine startup, rated operation, and shutdown. The convex piecewise linear load prediction model retains the ability to fit complex load responses through the superposition of linear basic terms and nonlinear piecewise terms, and is suitable for optimization because of its convex design. The model linearization achieves a structural transformation without approximation errors through a combination of hinge-type auxiliary variables and linear constraints, making the load prediction model compatible with engineering constraints such as field-level power quotas and single-unit power boundaries. An input convex neural network combined with a double-Q network framework captures the balance between fatigue load accumulation and power output in long-term control, effectively avoiding optimization bias caused by overestimation. The anchor-point tangent envelope technique transforms nonlinear long-term value into quantifiable linear constraints, allowing long-term fatigue suppression objectives to be directly integrated into the short-term optimization framework, achieving a balance between short-term control effectiveness and long-term equipment benefits. By balancing load smoothing and long-term value through a weighted objective function, it integrates multiple constraints such as field-level power quotas, single-machine safety boundaries, and load smoothing into one optimization framework, ensuring that the optimization results meet all of these requirements simultaneously. The primal-dual interior-point method, combined with feasible projection, enables solution and result correction, ensuring the generation and execution of optimization instructions that adapt to control needs.

Description of the Drawings

[0017] To more clearly illustrate the technical solution of the present invention, the accompanying drawings used in the description will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0018] Figure 1 is a flowchart of the active power coordinated optimization method for wind turbine units; Figure 2 is a schematic diagram of an electronic device.

Detailed Implementation

[0019] The active power co-optimization method for wind turbines based on piecewise linear load prediction and convex state-value functions of this invention provides a structured approach at both the model and optimization ends. On the prediction side, VC-CPWL, which maintains convexity for control variables, is employed, and a state-additive "hat-shaped" basis covers the entire observation interval, ensuring high accuracy of the load relative to the power reference under wide-area operating conditions. It also possesses a linear equivalent expression, facilitating integration with engineering constraints into the LP solution. On the decision side, a PICNN value function with convex action is used, and the long-term value is stably incorporated into the MPC objective in the form of a linear lower bound through the tangent envelope of a small number of anchor points. Thus, under online solution, while satisfying grid quotas and individual unit boundaries, the equivalent fatigue load on the tower and shaft system is synergistically reduced, and the modeling accuracy and optimization feasibility under wide-area operating conditions are improved.

[0020] The active power co-optimization method for wind turbines based on piecewise linear load prediction and convex state-value function, as described in this application, will be described in detail below. Specific details, such as particular system structures and technologies, are presented for illustrative purposes and not for limitation, to provide a thorough understanding of the embodiments of this application. However, those skilled in the art will understand that this application can also be implemented in other embodiments without these specific details.

[0021] It should be understood that, when used in this specification, the term "comprising" indicates the presence of the described feature, integral, step, operation, element, and / or component, but does not exclude the presence or addition of one or more other features, integrals, steps, operations, elements, components, and / or collections thereof. The terms "comprising," "including," "having," and variations thereof all mean "including but not limited to," unless otherwise specifically emphasized.

[0022] The terms "one embodiment" or "some embodiments" used in this application mean that one or more embodiments of this application include the specific features, structures, or characteristics described in that embodiment. Therefore, the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in still other embodiments," etc., appearing in different parts of this application do not necessarily refer to the same embodiment, but rather mean "one or more, but not all, embodiments," unless otherwise specifically emphasized.

[0023] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0024] Referring to Figure 1, the diagram shows a flowchart of a wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function in a specific embodiment. The method includes: S101: Collect operating status data and structural load data of wind turbine generator sets, and perform preprocessing.

[0025] S101 specifically includes the following steps: S1011: Collect real-time operating status data of each wind turbine generator set at preset intervals through the SCADA system or an edge computing device. The status data include ultra-short-term predicted inflow wind speed, generator speed, generator output active power, pitch angle and generator torque.

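[0026] In some embodiments, the specific composition of the state variable $s(t)$ is as defined in formula (1):

[0027] $$\hat{y}_c(t+\Delta t) = f_c\big(s(t),\,u(t)\big), \qquad s(t) = \big[v(t),\ \omega_g(t),\ P(t),\ \beta(t),\ T_g(t)\big]^{\top} \tag{1}$$

[0028] In formula (1), $\hat{y}_c(t+\Delta t)$ is the predicted structural load for the next cycle; $v$, $\omega_g$, $P$, $\beta$, and $T_g$ are, in order, the inflow wind speed, generator speed, unit output active power, blade pitch angle, and generator torque; $u(t)$ is the active power reference of a single unit.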

[0029] These are the core variables reflecting the transient operating conditions of wind turbines. The data comes from the wind farm-level SCADA system or edge computing devices deployed in the turbine control cabinet, and is forced to be sampled synchronously at a period of Δt = 1 s. This ensures that the timestamps of all state variables are strictly aligned, avoiding phase errors caused by asynchronous data. Synchronous acquisition avoids data mismatch caused by asynchronous sampling times of various sensors, improving the spatiotemporal consistency quality of the data samples used for subsequent modeling.

[0030] S1012: Perform data quality verification on the collected real-time operating status quantities, remove missing data points, and perform amplitude limiting and cleaning on outliers that exceed the physical range.

[0031] In some embodiments, signal loss points caused by communication interruptions or momentary sensor malfunctions are removed from the sequence. Based on the physical limits and safe operating range of the wind turbine design, reasonable upper and lower thresholds are set for each state variable. Abnormal data points exceeding these ranges are subjected to amplitude limiting processing; that is, values exceeding the upper limit are set to the upper limit value, and values below the lower limit are set to the lower limit value.

[0032] S1013: Collect and record the structural load data corresponding to the real-time operating status quantities in time. The structural load data includes the bending moment at the tower root and the torque of the transmission chain shaft system.

[0033] In some embodiments, the tower root bending moment and the transmission chain shaft torque are collected at the same time t as the state quantities in S1011, and serve as the historical truth values corresponding to the predicted load in formula (1).

[0034] The structural load data in this embodiment are the target values for constructing the load prediction model.

[0035] This embodiment acquires load data and state data in time correspondence, ensuring that a complete sample pair can be obtained at each time point t: the input is the state s(t) and the control quantity u(t), and the output is the true value of the load at the next time step. This precise alignment is fundamental to training predictive models that accurately reflect the mapping from state and control to load, avoiding model distortion caused by temporal misalignment.

[0036] S1014: According to the preset scaling factor, the cleaned state quantity, single-machine active power reference command, and structural load data are normalized and scaled respectively. The scaling factor is determined according to the typical range of each physical quantity.

[0037] In some embodiments, the scaling operation is performed according to formula (2), in which the scaling matrix, its coefficients, and the remaining scaling coefficients are all preset fixed values. These coefficients map quantities with different units and orders of magnitude, such as wind speed and pitch angle, to a similar numerical range.

[0038] S1015: Organize the scaled state variables, structural load data, and corresponding active power reference commands into a time-aligned normalized sample dataset.

[0039] In some embodiments, the normalized state vector, the normalized control command, and the normalized load truth value at the next moment, obtained through S1011 to S1014, are arranged in chronological order into structured sample units. The sample units from all time steps together constitute a normalized sample dataset, which is used directly for feature mapping in step S102 and model training in step S103.
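The preprocessing chain of S1012 to S1015 can be sketched as follows. This is a minimal sketch under stated assumptions: the limit values, scaling factors, units, and the names `LIMITS`, `STATE_SCALE`, `U_SCALE`, `LOAD_SCALE`, and `build_samples` are hypothetical, a single load channel is shown, and missing readings are assumed to be marked with `None` so that the list index still equals the sampling instant.

```python
# Hypothetical physical limits per state channel (illustrative values only):
# wind speed [m/s], generator speed [rpm], active power [kW],
# pitch angle [deg], generator torque [kNm].
LIMITS = [(0.0, 40.0), (0.0, 2000.0), (0.0, 5000.0), (-5.0, 90.0), (0.0, 50.0)]

# Hypothetical fixed scaling factors, chosen from typical ranges.
STATE_SCALE = [1 / 40.0, 1 / 2000.0, 1 / 5000.0, 1 / 90.0, 1 / 50.0]
U_SCALE = 1 / 5000.0
LOAD_SCALE = 1 / 1.0e5

def clip(x, lo, hi):
    """Amplitude limiting: values outside [lo, hi] are set to the bound."""
    return max(lo, min(hi, x))

def build_samples(rows):
    """rows[t] = (state, u, load), all taken at the same instant t.

    Returns tuples (scaled state at t, scaled u at t, scaled load at t+1).
    A sample is dropped when any of its three parts is missing.
    """
    samples = []
    for now, nxt in zip(rows, rows[1:]):
        state, u, _ = now
        next_load = nxt[2]
        if state is None or u is None or next_load is None:
            continue
        if any(v is None for v in state):
            continue
        cleaned = [clip(v, lo, hi) for v, (lo, hi) in zip(state, LIMITS)]
        scaled = [v * k for v, k in zip(cleaned, STATE_SCALE)]
        samples.append((scaled, u * U_SCALE, next_load * LOAD_SCALE))
    return samples
```

Keeping placeholders for missing readings, rather than deleting rows before pairing, prevents a gap in the record from silently pairing a state with a load that is more than one sampling period later.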

[0040] S102: Based on the preprocessed data, construct a high-dimensional state feature vector using state additive hat-shaped basis functions.

[0041] S102 specifically includes the following steps: S1021: Determine the dimensions of the state vector that characterizes the operating conditions of the wind turbine. The state vector includes five dimensions: inflow wind speed, generator speed, turbine output active power, pitch angle, and generator torque.

[0042] In some embodiments, the state vector s is explicitly defined as containing five key physical quantities, as shown in formula (1). These five quantities together characterize the aerodynamic, mechanical, and electrical operating conditions of the wind turbine at a specific moment. In feature construction, the dimension D of the state vector is set to 5, corresponding to these five physical quantities.

[0043] S1022: For each dimension of the state vector, within the interval determined by the minimum and maximum values of the historical observation data, set multiple nodes at equal intervals along that dimension.

[0044] In some embodiments, for the d-th dimension state quantity $s_d$, K nodes are placed at equal intervals within its historical observation range $[s_d^{\min}, s_d^{\max}]$, as described in connection with formula (5).

[0045] The state-additive hat-shaped basis feature mapping in this embodiment is:

[0046] $$\phi_{d,m}(s_d) = \max\!\Big(0,\ 1 - \frac{|s_d - c_{d,m}|}{h_d}\Big), \qquad \Phi(s) = \big[1,\ \phi_{1,1}(s_1),\ \dots,\ \phi_{D,K}(s_D)\big]^{\top} \tag{5}$$

[0047] In formula (5), $c_{d,m}$ ($m = 1, \dots, K$) are the equidistant nodes of the d-th dimension, covering the observation range $[s_d^{\min}, s_d^{\max}]$; $h_d$ is a local scale parameter related to the distance between adjacent nodes; $\Phi(s)$ is the concatenated state feature vector; $D$ is the state dimension, $K$ is the number of nodes per dimension, and the feature dimension is $1 + DK$.

[0048] These nodes evenly cover the entire possible range of values ​​for the state variable, ensuring that under any operating condition, the value of the state variable falls within the neighborhood of a few adjacent nodes.

[0049] S1023: Based on nodes, define a set of hat-shaped basis functions with local support properties for each state dimension. The center of each basis function is located at a node, and its value decreases linearly to zero as the distance between the state variable and the node increases.

[0050] In some embodiments, the hat-shaped basis function corresponding to the m-th node of the d-th dimension state variable takes the specific mathematical form given by formula (5).

[0051] The function reaches its maximum value of 1 at the center point (the node) and decreases linearly as the distance from the center point increases; when the distance exceeds the local scale parameter, the output is 0, forming a triangular "hat-shaped" support region. This structure provides the nonlinear state features for constructing the piecewise linear model of formula (6), which is convex in the control quantity.

[0052] S1024: For a given current state vector, calculate the output value of each dimension component on all corresponding hat-shaped basis functions.

[0053] In some embodiments, for a given preprocessed and scaled current state vector, each of its dimensional components is substituted into the K corresponding hat-shaped basis functions of that dimension, as defined in formula (5), yielding K scalar output values. Due to the local support of the basis functions, for each dimension typically only 2-3 adjacent basis functions output non-zero values, while the rest output zero, thus forming a sparse feature vector.

[0054] S1025: Concatenate a constant term with the output values ​​of all hat-shaped basis functions in all dimensions to form a unified high-dimensional sparse feature vector.

[0055] In some embodiments, as shown in formula (5), the constant term (bias term) with a value of 1 is concatenated in sequence with the basis function output values of all dimensions to form the final global state feature vector Φ(s). With D state dimensions and K nodes per dimension, the total dimension of the vector is 1 + D·K.

[0056] The final Φ(s) is a fixed-dimensional, highly structured, sparse feature vector. It converts the state variables into a mathematical representation suitable for efficient linear operations and parameter learning, for use by the load prediction model in step S103.
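The feature construction of S1022 to S1025 can be sketched as follows. This is a minimal sketch under one explicit assumption: the local scale parameter is set equal to the node spacing, whereas the text only says it is related to that spacing. With this choice at most two basis functions per dimension are non-zero. The function names are hypothetical.

```python
def make_nodes(lo, hi, k):
    """S1022: k equidistant nodes covering [lo, hi]; also returns the spacing."""
    step = (hi - lo) / (k - 1)
    return [lo + i * step for i in range(k)], step

def hat(x, center, width):
    """S1023: equals 1 at the node and falls linearly to 0 at distance width."""
    return max(0.0, 1.0 - abs(x - center) / width)

def hat_features(state, ranges, k):
    """S1024-S1025: constant term followed by all hat outputs, dimension by dimension."""
    feats = [1.0]
    for x, (lo, hi) in zip(state, ranges):
        nodes, step = make_nodes(lo, hi, k)
        # Assumption: scale parameter = node spacing.
        feats.extend(hat(x, c, step) for c in nodes)
    return feats
```

For a state with D components and k nodes per dimension the returned list has 1 + D·k entries; for example, `hat_features([0.3, 0.5], [(0.0, 1.0), (0.0, 1.0)], 5)` has 11 entries, of which only the constant and three hat outputs are non-zero.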

[0057] S103: Utilizing high-dimensional state feature vectors, a piecewise linear load prediction model based on the convexity of the unit's active power reference command is constructed to predict tower bending moment and shaft torque, respectively. S103 specifically includes the following steps: S1031: For the two load channels of tower bending moment and shaft torque, a prediction model with the same mathematical structure is constructed. The output of the prediction model is the prediction of the load value at a future moment. The input of the prediction model is the state feature vector and active power reference command at the current moment.

[0058] In some embodiments, an independent prediction model is established for each of the two structural loads to be predicted, namely tower bending moment (c=tower) and shaft torque (c=shaft). The functional forms are identical, but the model parameters differ. The model inputs are the high-dimensional state feature vector Φ(s) output by step S102 and the current active power reference command u. The model output y(c) is a prediction of the load value after Δt. Using a homogeneous model means that the two load channels share the same model architecture and feature processing flow, which simplifies system design and training, while independent parameter sets characterize the different ways in which state variables such as wind speed and rotational speed influence tower bending moment and shaft torque. Modeling the channels separately captures the dynamic response of the different physical components to control commands more accurately.

[0059] S1032: Based on the high-dimensional state feature vector, a set of state sensitivity coefficients that are completely dependent on the current operating conditions are generated by linear mapping through a trainable parameter matrix, including an intercept coefficient, a linear coefficient, and two sets of non-negative piecewise linear rate of change coefficients.

[0060] In some embodiments, the four state sensitivity coefficients of the prediction model are all obtained from the state feature vector Φ(s) through a linear transformation, as shown below:

[0061] In the formula, the right and left inflection points are distributed at equal intervals over the allowed range of the control variable u, M is the number of inflection points on each side, and the remaining matrices are trainable parameters.

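[0062] The non-negativity of the piecewise rate-of-change coefficients is used to guarantee that the predicted load is convex with respect to u.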

[0063] The two sets of piecewise rate-of-change coefficients, w and v, are calculated in a similar form, but with a larger parameter matrix because each set has M outputs, where M is the number of inflection points (for example, 12). The output is passed through a nonlinear function to ensure non-negativity.

[0064] S1033: Within the allowable variation range of the single-unit active power reference command, preset a set of fixed left and right inflection points at equal intervals along the power axis to define where the piecewise linear model changes slope.

[0065] In some embodiments, two sets of fixed points, equal in number, are predefined within the physically feasible range of u: the right inflection points and the left inflection points, as described in formula (6). These inflection points cover the entire interval at equal spacing; for example, M=12 right inflection points and M=12 left inflection points are set. Each right inflection point defines a positive piecewise term that is triggered when u exceeds that point. Each left inflection point defines a piecewise term in the negative direction that is triggered when u falls below that point.

[0066] S1034: Apply non-negativity constraints to the two sets of piecewise linear rate of change coefficients to ensure that these coefficients are greater than or equal to zero under any operating conditions.

[0067] In some embodiments, the coefficients w and v associated with the piecewise terms in formula (6) are constrained to be non-negative. This is achieved through an activation function:

[0068] The coefficients w and v in formula (6) are defined by formula (7), which also contains a parameter controlling the smoothness of the function. The function maps any real input z to a positive output, thus ensuring the non-negativity of w and v. The Softplus function acts as a soft non-negativity threshold: it allows gradients to backpropagate smoothly during model training, while its output is strictly greater than zero. In the model structure of formula (6), when w and v are non-negative, the slope of the prediction with respect to u can only increase or stay the same as u increases, because each piecewise term that switches on or off changes the slope by a non-negative amount. The second difference of the predicted value y(c) with respect to u is therefore non-negative, which guarantees the convexity of the overall function. This convexity guarantee is crucial for the optimization: for a given state, the curve of load versus power is either convex or linear, so the part of the optimization problem that depends on u remains tractable and retains favorable mathematical properties under any linear constraints.

[0069] S1035: Combine the intercept term, linear term, and two sets of non-negative piecewise linear rate of change coefficients according to the preset left and right inflection point positions to calculate the piecewise linear and overall convex load prediction value with respect to the active power reference command.

[0070] In some embodiments, according to formula (6), all coefficients generated in S1032 are combined with the inflection point structure defined in S1033 for calculation:

[0071] Here, the subscript "+" denotes the positive-part operation. The expression is a piecewise linear function of u: α is the base intercept and βu is the global linear term. The two summation terms add further linear increments when u exceeds a right inflection point or falls below a left inflection point. Since w and v are non-negative, these increments can only increase the slope or leave it unchanged as u increases, so the result is a convex piecewise linear function.
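The structure described for formula (6) can be illustrated with the following sketch. It assumes the form "intercept plus linear term plus two sums of positive-part terms" given in the text; the function names, argument names, and the plain Softplus without a smoothness parameter are assumptions made for illustration.

```python
import math
from typing import Sequence


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z)); the output is always positive."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def predict_load(u: float, alpha: float, beta: float,
                 w: Sequence[float], v: Sequence[float],
                 right_pts: Sequence[float], left_pts: Sequence[float]) -> float:
    """Convex piecewise linear prediction in u, assuming w >= 0 and v >= 0."""
    right = sum(wm * max(u - r, 0.0) for wm, r in zip(w, right_pts))
    left = sum(vm * max(l - u, 0.0) for vm, l in zip(v, left_pts))
    return alpha + beta * u + right + left


# Hypothetical raw network outputs are made non-negative with softplus.
raw_w, raw_v = [-1.0, 0.3], [0.7, -2.0]
w = [softplus(z) for z in raw_w]
v = [softplus(z) for z in raw_v]
```

Because every coefficient in `w` and `v` is positive, the second difference of `predict_load` on any evenly spaced grid of u is non-negative, which is the convexity property the text relies on.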

[0072] S104: The piecewise linear load prediction model, which is convex with respect to the active power reference command, is equivalently transformed into a linearized prediction model that includes auxiliary variables and linear inequality relationships.

[0073] S104 specifically includes the following steps: S1041: For each piecewise linear term in the load prediction model, configure a set of independent auxiliary variables to mathematically replace the nonlinear positive part operation in the original model.

[0074] In some embodiments, the prediction model for each load channel c contains positive-part terms based on the right inflection points and positive-part terms based on the left inflection points. Each such term is assigned a corresponding auxiliary variable. For example, when the number of inflection points on each side of the model is M=12, a total of 2M=24 auxiliary variables are configured for each load channel. Under the constraints introduced below, these variables are mathematically equivalent to the positive part of the original expressions.

[0075] S1042: Establish linear inequality constraints among each auxiliary variable, the control variable (active power reference command), and the corresponding inflection point, forcing the auxiliary variable to be no less than the signed difference between the control variable and the inflection point, and no less than 0.

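[0076] In some embodiments, two linear inequality constraints are applied to each right-inflection-point auxiliary variable: it must be no less than u minus the right inflection point, and no less than 0. Similarly, two constraints are applied to each left-inflection-point auxiliary variable: it must be no less than the left inflection point minus u, and no less than 0. These constraints are the parts listed below the model equation in formula (8).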

[0077] Formula (8): These inequalities together ensure that, in any feasible solution, each right-side auxiliary variable is at least the larger of 0 and the difference between u and its right inflection point; the same holds for each left-side auxiliary variable, with the difference taken from its left inflection point to u. The two sets of inequalities constitute a relaxed definition of the auxiliary variables. This embodiment does not require the auxiliary variables to equal the positive part, but only specifies a lower bound. When these auxiliary variables appear in the subsequently constructed linear programming objective function with positive coefficients, the optimization, in minimizing the objective, automatically compresses each auxiliary variable to its smallest possible value, that is, to the larger of its two lower bounds. This ensures that in the optimal solution of the linear program the auxiliary variables equal the value of the original nonlinear terms, thus achieving mathematical equivalence.
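As a hedged sketch of the structure that formula (8) is described as having, the linearized model for one channel c can be written as follows. The symbols p_m and q_m (auxiliary variables) and r_m and l_m (right and left inflection points) are notation assumed here for illustration; the original formula may use different symbols.

```latex
\begin{aligned}
y^{(c)} &= \alpha + \beta u + \sum_{m=1}^{M} w_m\, p_m + \sum_{m=1}^{M} v_m\, q_m, \\
p_m &\ge u - r_m, \qquad p_m \ge 0, \\
q_m &\ge l_m - u, \qquad q_m \ge 0, \qquad m = 1, \dots, M .
\end{aligned}
```

For a fixed u, the smallest feasible values are p_m = max(u - r_m, 0) and q_m = max(l_m - u, 0), which recovers the positive-part terms of formula (6).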

[0078] S1043: The calculation expression of the original load prediction model is rewritten as a linear weighted sum of control variables, auxiliary variables and state sensitivity coefficients, eliminating the positive part operation in the original expression.

[0079] In some embodiments, the nonlinear terms in the original model formula (6) are directly replaced with the corresponding auxiliary variables to obtain a completely linear expression.

[0080] Specifically, the calculation of the load prediction value y(c) is rewritten as the equation in formula (8). In this equation, α, β, w and v are known constants calculated from the current state s in step S103, and the decision variables are u and the auxiliary variables. By substituting the auxiliary variables, a piecewise linear function of u is represented as a single, global linear function of u and a series of auxiliary variables. The linear equation can serve as an equality constraint in a linear programming problem, linking the decision variables (u and the auxiliary variables) to the predicted load output y that needs to be optimized or constrained.

[0081] S1044: Combine the rewritten linear load calculation equation with the linear inequality constraints for all auxiliary variables to form a complete linear equivalent system describing the input-output relationship of a single load channel.

[0082] In some embodiments, for any channel c of tower bending moment or shaft torque, its complete linear equivalent system consists of two parts: 1) Linear calculation equation: i.e., the equation obtained from S1043

[0083] 2) Linear inequality constraint set: namely, the 4M inequalities established in S1042 for all M right inflection points and M left inflection points of the channel. This system is equivalent to the mapping from input (u) to output (y) described by the original nonlinear formula (6).

[0084] S1045: Summarize the linear equivalent systems of all load channels, field-level power balance constraints, single-machine power upper and lower limit constraints, and load variation range constraints into a unified set of linear constraints.

[0085] In some embodiments, all linear relationships that must be satisfied simultaneously are grouped together, including the linear equivalent systems of all tower and shaft channels and the field-level power balance equality constraint:

[0086] Formula (10). Single-unit power upper and lower limit constraints: the active power reference of each unit must lie between its lower and upper power limits.

[0087] In the formulas, the symbols denote, respectively: the active power reference of the j-th wind turbine; the field-level quota issued by the power grid; the lower and upper limits of single-unit power; and N, the number of grid-connected wind turbines.

[0088] To suppress abrupt changes in structural loads, two types of absolute value relaxation variables are set for each wind turbine.

[0089] Formula (24): In the formula, the two bound parameters are the upper limits on the one-step variation of the tower bending moment and of the shaft torque, respectively.

[0090] The load smoothing constraint in this embodiment is shown in formula (24): by introducing the slack variables s^abs, the absolute-value constraint on the load variation between consecutive steps is transformed into linear inequalities. All these constraints together define a multivariable, multi-constraint linear system.

[0091] In this way, grid dispatching requirements, unit safe-operation boundaries, structural load smoothness requirements, and high-precision load prediction models, which come from different sources and carry different physical meanings, are unified in a single linear constraint system. This integration makes the Model Predictive Control (MPC) framework implementable. It ensures that the solution of the final online linear programming problem is not only mathematically optimal but also satisfies the grid quota and the individual unit boundaries under the safety constraints.

[0092] S105: Construct a "state-action value" input convex neural network that is convex with respect to the dimensionless active power reference command, to serve as a parameterized value function.

[0093] In some embodiments, the single-machine active power reference command is transformed into a dimensionless action within the [0,1] interval according to a linear mapping rule, eliminating the training impact caused by differences in physical dimensions. The neural network adopts a hierarchical design. The hidden layer achieves feature fusion through a linear combination of non-negative weight matrices, dimensionless actions, and state front-end features, processed by the Softplus activation function. The output layer directly outputs the value estimate through a linear combination.

[0094] During training, non-negative projection is applied to the core weights of the hidden and output layers to ensure that the network is convex in the dimensionless action. Following the double-Q framework of the twin delayed deep deterministic policy gradient (TD3) algorithm, two networks with identical structures and independent parameters are constructed. Instantaneous rewards are calculated from the equivalent fatigue load over a sliding window, and training targets are constructed according to the temporal-difference rule. Training uses a learning rate of 10⁻⁴ and a batch size of 256. A high-fidelity simulation online pre-training method is employed: the Q-function is added to the objective function of the model predictive control algorithm built from the completed load prediction model, and training is carried out through online interaction with the simulation environment, for a total of 1000 training episodes of 1000 seconds each. A validation episode is run every 10 training episodes, and early stopping is triggered based on the change in validation error.

[0095] The convex network design provides the necessary conditions for constructing a linear lower bound on long-term value, while the double-Q network structure improves the reliability of value estimation. Standardized training procedures and early stopping mechanisms ensure the network has good generalization ability, enabling it to stably output accurate long-term value assessment results under wide-area conditions.

[0096] S106: At multiple preset active power reference command anchor points, calculate the tangents of the parameterized value function and form a linear lower bound constraint by a linear combination of the tangents.

[0097] In some embodiments, five equidistant anchor points are set within the dimensionless action [0,1] interval, covering the endpoints and key intermediate positions of the interval. The output value and action gradient of the dual-path network at each anchor point are calculated using an automatic differentiation tool, and the tangent slope and intercept are derived.

[0098] A linear lower bound constraint is constructed for each network path, ensuring that the auxiliary variable representing the lower bound of the single-path value simultaneously satisfies the inequality constraints of the tangents at all anchor points, guaranteeing that the auxiliary variable always lies below the output of its corresponding network. A joint auxiliary variable is configured, and through dual inequality constraints, it is made simultaneously less than or equal to the lower bound auxiliary variables of both networks, forming a linear lower bound for the minimum value of both paths. Finally, the joint auxiliary variables of all wind turbines in the entire field are summed to obtain the total linear lower bound of the global long-term control value. The linear lower bound constraint can be directly integrated into subsequent linear programming problems, providing structured support for the coordinated optimization of long-term value and short-term load smoothing. The multi-anchor point design makes the lower bound more compact, improving the accuracy of value estimation. Dual-path fusion continues the overestimation resistance advantage of the dual-Q network, and global aggregation meets the needs of field-level optimization.

[0099] S107: With the goal of minimizing the load variation amplitude and maximizing the linear lower bound of long-term control value, a linear programming problem is constructed by combining the field-level total active power constraint and the upper and lower limits of single-unit active power constraint.

[0100] In some embodiments, the objective function adopts a weighted summation form. The weights for tower load smoothing, shaft load smoothing, and long-term value terms are all configured as positive real numbers. The weight ratios are determined through offline simulation debugging, balancing the priorities of different optimization directions according to engineering requirements. Constraints cover four core categories: field-level total active power equality constraints, where the sum of all wind turbine power reference values equals the grid-issued quota; single-unit active power two-sided inequality constraints, defining the safe operating boundary of a single wind turbine; load smoothing constraints, which transform the absolute value constraint of the load's one-step change into a two-way linear inequality by configuring non-negative slack variables; and linear equivalence model constraints, ensuring the logical correlation between load prediction and power reference values. The objective function and all constraints are integrated into a standard linear programming mathematical form, clearly defining elements such as decision variables, objective coefficient vectors, equality constraint matrices, and inequality constraint matrices.

[0101] S108: Solve the linear programming problem online to obtain a reference command sequence of active power of the unit that satisfies all constraints, and transmit it to the wind turbine for execution.

[0102] In some embodiments, the parameters of the primal-dual interior-point solver are configured as follows: the initial value of the barrier parameter is set to 0.1, the decay coefficient is set to 0.5, and iteration terminates when the KKT residual falls below 10⁻⁶. The maximum number of iterations is limited to 100 to ensure that the solution is completed within a 1-second control cycle.

[0103] This embodiment transforms the constructed linear programming problem into a standard matrix form, specifying the decision variable vector, objective coefficient vector, equality and inequality constraint matrices, and right-hand side terms.

[0104] The solver is invoked to solve the problem by interior-point iteration. Newton's method is used to update the primal decision variables and the dual variables until the optimality condition is met. A feasibility projection is then applied to the solution: values exceeding the single-unit power boundaries are corrected first, and the total power deviation across the entire field is adjusted afterwards, so that the result strictly meets the core constraints. The optimized power reference value sequence is transmitted using an industrial Ethernet protocol.

[0105] The interior-point method in this embodiment offers fast solution speed and is suitable for online real-time control requirements. Feasible projection ensures the executability of the optimization results.
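The two-stage feasibility projection can be sketched as below. The text does not specify how the field-level deviation is redistributed, so spreading it in proportion to each unit's remaining headroom is an assumption of this sketch, as are the names `project_feasible`, `p_min`, `p_max`, and `p_total`.

```python
from typing import List, Sequence


def project_feasible(p: Sequence[float], p_min: Sequence[float],
                     p_max: Sequence[float], p_total: float) -> List[float]:
    """Clip each unit to its limits, then spread the remaining field-level
    deviation across units in proportion to their headroom (assumed rule)."""
    # Stage 1: correct values outside the single-unit power boundaries.
    x = [min(max(pi, lo), hi) for pi, lo, hi in zip(p, p_min, p_max)]
    # Stage 2: adjust the total so that sum(x) matches the quota where possible.
    dev = p_total - sum(x)
    if dev > 0:
        room = [hi - xi for xi, hi in zip(x, p_max)]   # upward headroom, >= 0
    else:
        room = [lo - xi for xi, lo in zip(x, p_min)]   # downward headroom, <= 0
    total_room = sum(room)
    if total_room == 0:
        return x  # no unit can move; the quota cannot be matched more closely
    frac = min(1.0, dev / total_room)  # fraction of headroom actually used
    return [xi + frac * ri for xi, ri in zip(x, room)]
```

Since each unit moves by at most its own headroom, the second stage never pushes a unit back outside its limits; if the quota lies outside the reachable range, the result saturates at the limits instead of meeting the quota.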

[0106] In one embodiment of the present invention, a possible implementation of step S105 is described below in a non-limiting manner. S105 specifically includes the following steps: S1051: Map the physical active power reference command to the zero-one interval to generate a dimensionless action representation decoupled from the upper and lower limits of single-unit power.

[0107] In some embodiments, the physical active power reference command of the j-th wind turbine is converted, using the unit's permissible lower and upper power limits, through the linear transformation defined by the following formula:

[0108] Formula (11): In the formula, the dimensionless action is obtained by linearly mapping the physical power reference to the interval [0,1].

[0109] With this calculation, regardless of the actual power range of a single unit, its control command is normalized to the fixed interval [0,1]. This variable is what the subsequent neural network refers to as the dimensionless action. It ensures that the input action domain of the subsequently constructed evaluation network (value function) is fixed and consistent, simplifying network design and training.
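A minimal sketch of this normalization and its inverse, assuming the usual min-max form implied by the description of formula (11); the function names are hypothetical.

```python
def to_action(p_ref: float, p_min: float, p_max: float) -> float:
    """Map a physical power reference in [p_min, p_max] linearly onto [0, 1]."""
    if p_max <= p_min:
        raise ValueError("p_max must be greater than p_min")
    return (p_ref - p_min) / (p_max - p_min)


def to_power(a: float, p_min: float, p_max: float) -> float:
    """Inverse map: recover the physical power reference from the action."""
    return p_min + a * (p_max - p_min)
```

For example, with limits of 1.0 and 5.0 (arbitrary units), a reference of 3.0 maps to the action 0.5, and `to_power(0.5, 1.0, 5.0)` maps it back to 3.0.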

[0110] S1052: Determine the input composition of the neural network, including the dimensionless action and the feature vector representing the current operating state of the wind turbine.

[0111] In some embodiments, the neural network receives two inputs, either concatenated or fed in separately: one is the dimensionless action a generated in step S1051, and the other is the feature vector φ(s) characterizing the operating state of the wind turbine. The state feature vector φ(s) can be constructed in the same way as the high-dimensional state feature vector Φ(s) in step S102, with the aim of encoding operating condition information such as wind speed and rotational speed. The network takes the pair (a, φ(s)) as input and evaluates the expected long-term return obtainable by taking a specific action a in a specific state s.

[0112] S1053: Construct a multi-layer feedforward network. The update calculation of the feature vector of each layer is a linear combination of the features of the previous layer, dimensionless actions and state feature vectors, and is transformed by a non-linear activation function.

[0113] In some embodiments, a Partially Input Convex Neural Network (PICNN) structure is employed, in which the feature vector of the k-th hidden layer is calculated by the following formula:

[0114]

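[0115] Formula (13): Here, each hidden layer takes as input the feature vector of the previous layer; the three weight matrices act on the previous-layer features, the dimensionless action, and the state front-end feature, respectively; the bias is a vector; and the activation is an element-wise nonlinear function. The weights acting on the previous-layer features, in the hidden layers and in the output layer, are projected onto the non-negative domain after each update to maintain convexity with respect to the action. All weights and biases are trainable parameters.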

[0116] In a preferred embodiment, the Softplus function is used as the activation function. This calculation shows that the input of each layer integrates features from the previous layer, the current action, and the original state information.

[0117] Allowing the action a and the state feature φ(s) to participate directly in the computation at each layer enables the network to fuse state and action information in a highly nonlinear manner. Smooth, monotonically increasing activation functions such as Softplus help preserve the favorable properties of the function and facilitate gradient calculation. This structure allows the network to fit the true value function surface while preserving convexity with respect to the action through the weight constraint of step S1054.

[0118] S1054: During the forward propagation of the network, a non-negativity constraint is applied to the weight matrix acting on the feature vectors of the previous layer to ensure that the entire network mapping is a convex function with respect to dimensionless actions.

[0119] In some embodiments, in formula (13), the weight matrix acting on the previous-layer features is marked with a superscript "+", which indicates that the matrix is projected onto the non-negative domain after each parameter update. In the final output layer of the network, the weight vector acting on the last-layer features requires the same non-negative projection, as shown in the second row of formula (13). This ensures that the entire network mapping is a convex function of the dimensionless action a. Convexity means that, for a fixed state s, the curve of the value function Q against the action a is convex or linear in shape. This property is crucial because it ensures that, in subsequent steps, a global lower bound of the convex function can be constructed from a set of linear inequalities.

[0120] S1055: The feature vectors of the last layer of the network are combined again with the dimensionless action and state feature vectors in a constrained linear combination to output the final action-state value function estimate, which is used to characterize the long-term control value.

[0121] In some embodiments, after several layers of calculation as described in S1053, the final-layer feature vector z_L is obtained. The final value function output is given by the second equation of formula (13).

[0122] Here, the weight vector acting on z_L is subject to the non-negativity constraint, the weights acting on the action and on the state feature are ordinary, unconstrained weights, and the remaining term is the output bias. This calculation is a linear combination whose inputs are the convex features z_L, the original action a, and the original state feature φ(s): the abstract convex features of action-state interaction learned by the preceding hidden layers are linearly combined with the original input information. Once trained offline, the network is a parameterized function capable of predicting long-term cumulative rewards. This reduces a multi-step, long-horizon optimization problem that is difficult to solve online to a function evaluation that can be queried quickly, thus providing guidance for online decision-making.
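The forward pass described in steps S1053 to S1055 can be illustrated with a simplified, dependency-free sketch. It is not the network of formula (13), which is not reproduced in this text: the layer layout, the random initialization, and all names (`make_net`, `q_value`, `Wz`, `wa`, `Ws`) are assumptions. What it does show is that non-negative weights on the previous-layer features, combined with Softplus, keep the output convex in a scalar action.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple


def softplus(z: float) -> float:
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def project_nonneg(W: List[List[float]]) -> List[List[float]]:
    """Projection applied after each update: clip negative entries to zero."""
    return [[max(w, 0.0) for w in row] for row in W]


def make_net(state_dim: int, hidden: int, n_layers: int,
             seed: int = 0) -> Tuple[List[Dict], Dict]:
    rng = random.Random(seed)
    g = lambda n: [rng.uniform(-1.0, 1.0) for _ in range(n)]
    layers, prev = [], 0  # the first layer has no previous-layer features
    for _ in range(n_layers):
        layers.append({"Wz": project_nonneg([g(prev) for _ in range(hidden)]),
                       "wa": g(hidden),
                       "Ws": [g(state_dim) for _ in range(hidden)],
                       "b": g(hidden)})
        prev = hidden
    out = {"wz": [max(w, 0.0) for w in g(hidden)], "wa": rng.uniform(-1.0, 1.0),
           "ws": g(state_dim), "b": rng.uniform(-1.0, 1.0)}
    return layers, out


def q_value(a: float, phi: Sequence[float], layers: List[Dict], out: Dict) -> float:
    z: List[float] = []
    for layer in layers:
        pre = [sum(w * zi for w, zi in zip(row, z)) + wa * a
               + sum(w * s for w, s in zip(srow, phi)) + b
               for row, wa, srow, b in zip(layer["Wz"], layer["wa"],
                                           layer["Ws"], layer["b"])]
        z = [softplus(p) for p in pre]  # convex and non-decreasing activation
    return (sum(w * zi for w, zi in zip(out["wz"], z)) + out["wa"] * a
            + sum(w * s for w, s in zip(out["ws"], phi)) + out["b"])
```

The weights on the action and on the state feature are left unconstrained; only the weights that multiply previous-layer features need to be non-negative for the convexity argument to hold.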

[0123] In one embodiment of the present invention, a possible implementation of step S106 is described below in a non-limiting manner. S106 specifically includes the following steps: S1061: Within the standard range of the dimensionless action, select a fixed number of discrete points as anchor points at which the function and its slope are evaluated.

[0124] In some embodiments, K fixed points are selected as anchor points on the domain [0,1] of the dimensionless action, indexed by k=1,...,K. A preferred embodiment is given in formula (16).

[0125] Formula (16): The anchor point set is {0, 0.25, 0.5, 0.75, 1}, i.e., K=5. These points uniformly cover the entire interval from the minimum to the maximum action. The positions of these anchor points are known in advance and remain unchanged during the online optimization process.

[0126] Choosing a fixed set of anchor points covering the entire action range is equivalent to pre-setting a series of key observation stations for the complex convex value function curve. The selection of these anchor point positions balances approximation accuracy and computational complexity. Since the value function is convex with respect to action, the global behavior can be characterized by the local slopes at these discrete points.

[0127] S1062: For each selected anchor point, call the parameterized value function network to calculate the action-state value at the anchor point and its partial derivative with respect to the dimensionless action.

[0128] In some embodiments, for the j-th unit, two calculations are performed at each anchor point: 1) Forward calculation: the anchor point and the current state feature φ(s) of unit j are fed into the trained parameterized value function network (such as a PICNN) to obtain the function value, as shown in formula (17).

[0129] Formula (17): In the formula, the slope and intercept of the tangent at each anchor point are obtained, for each of the two value functions, from a forward pass of the PICNN and its gradient with respect to the action. Each per-network auxiliary variable is bounded above by all tangents of its network and is thus a linear lower bound of that value function; the joint auxiliary variable is pressed below both per-network variables and is thus a linear lower bound of the minimum of the two value functions; the final term is the sum of these lower bounds over the entire field.

[0130] 2) Gradient calculation: calculate the partial derivative of the value function output Q_j with respect to the input action a, evaluated at the anchor point. Because a framework with automatic differentiation is used, the gradient value can be obtained efficiently and accurately.

[0131] In this embodiment, the forward computation yields the height of the curve at a specific point, and the gradient calculation yields the slope at that point. For a convex function, the function value at any point in its domain is no less than the value there of a tangent drawn at any other point. Therefore, once the function value and derivative at an anchor point are known, the tangent needed to construct the global lower bound of the convex function is available.

[0132] S1063: Based on the function value and derivative value at each anchor point, calculate the parameters of the equation of the straight line passing through the point and with the derivative as the slope.

[0133] In some embodiments, for the j-th unit and the k-th anchor point, the function value and slope calculated at that anchor point uniquely determine a straight line. Thus, for each unit and each anchor point, a pair of line parameters (slope and intercept) is obtained that fully defines the tangent. The local information at the anchor point (point and slope) is thereby encapsulated in a standard linear function form: the tangent equation is y = (slope)·a + (intercept), a linear expression in the action variable a. Because the value function is convex, for any action a in the domain its true function value is greater than or equal to the maximum of all tangent values at that point a.

[0134] S1064: Introduce auxiliary variables for each unit and establish constraints to ensure that they are not greater than the function values ​​of the tangents corresponding to all anchor points at the current action of the unit.

[0135] In some embodiments, an auxiliary variable, denoted t_j, is introduced for each unit j. For the K tangents calculated at the K anchor points of the unit, a set of linear inequality constraints is established: for all k=1,...,K, t_j must not exceed the value of the k-th tangent at the unit's current action. Since the value of the convex function is not lower than any tangent value, t_j constitutes a linear lower bound on the value of the convex function. In other words, the set of linear inequalities creates an upper bound on t_j, namely the minimum of all tangent values, and t_j is constrained to lie below it. During optimization, if the objective function seeks to maximize t_j, it pushes t_j as close as possible to this upper bound. This ensures that the long-term value can be stably incorporated into the MPC objective in the form of a linear lower bound.
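The tangent construction of steps S1062 to S1064 can be illustrated on a toy convex function. In this sketch an analytic derivative stands in for automatic differentiation, and the names `tangents` and `lower_bound` as well as the quadratic test function are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

ANCHORS = [0.0, 0.25, 0.5, 0.75, 1.0]  # the K=5 anchor set from the text


def tangents(f: Callable[[float], float], df: Callable[[float], float],
             anchors: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (slope, intercept) of the tangent of f at every anchor point."""
    return [(df(ak), f(ak) - df(ak) * ak) for ak in anchors]


def lower_bound(a: float, lines: Sequence[Tuple[float, float]]) -> float:
    """Largest t with t <= slope * a + intercept for every tangent line."""
    return min(g * a + h for g, h in lines)


# Toy convex value function standing in for the trained network.
f = lambda a: (a - 0.3) ** 2
df = lambda a: 2.0 * (a - 0.3)
lines = tangents(f, df, ANCHORS)
```

Every tangent of a convex function lies on or below it, so the minimum over tangents returned by `lower_bound` never exceeds `f(a)`. It is looser than the maximum over tangents, but unlike the maximum it can be imposed with plain "less than or equal" constraints inside a linear program that maximizes t.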

[0136] S1065: Based on the auxiliary variables of all units, construct an aggregate expression for a long-term value linear lower bound across the entire field as part of the optimization objective.

[0137] In some embodiments, where a dual-Q network is used to improve stability, each unit j has two value functions, which correspondingly generate two lower-bound variables.

[0138] To obtain robust value estimates, a further joint variable is introduced for each unit and constrained to be no greater than either of the two lower-bound variables. It is therefore a linear lower bound of the smaller of the two value functions, i.e., of their minimum. Summing the joint variables over all units yields the field-wide sum of long-term value lower bounds, which is added as a linear term to the objective function of the subsequent linear programming problem.

[0139] Taking the minimum of the two Q networks is a common technique in reinforcement learning for mitigating value overestimation. Introducing the joint variables and the additional linear constraints transforms the nonlinear min operation into a form that linear programming can handle, and the final sum is linear in the decision variables. Maximizing this term in the linear program is equivalent to raising the lower bound of the long-term cumulative reward as far as possible while satisfying all tangent inequality constraints. This allows long-term and short-term objectives to be co-optimized within a unified linear framework.
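For fixed actions, the value that the maximization drives the joint variables to can be computed directly, which makes the min-linearization easy to check. The sketch below assumes each unit carries two lists of (slope, intercept) tangent pairs, one per Q network; the function names and the data layout are hypothetical.

```python
from typing import Sequence, Tuple

Line = Tuple[float, float]  # (slope, intercept) of one tangent


def single_bound(a: float, lines: Sequence[Line]) -> float:
    """Tightest value of a per-network variable: minimum over its tangents."""
    return min(g * a + h for g, h in lines)


def field_bound(actions: Sequence[float],
                lines_q1: Sequence[Sequence[Line]],
                lines_q2: Sequence[Sequence[Line]]) -> float:
    """Sum over units of the joint variable at its largest feasible value,
    i.e. the smaller of the two per-network bounds."""
    return sum(min(single_bound(a, l1), single_bound(a, l2))
               for a, l1, l2 in zip(actions, lines_q1, lines_q2))
```

In the linear program itself, the same quantity is expressed without any `min` call: each joint variable is constrained to be no greater than both per-network variables, and the objective rewards making it large.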

[0140] In one embodiment of the present invention, a possible implementation of step S107 is described below in a non-limiting manner. S107 specifically includes the following steps: S1071: Optimization weight coefficients are assigned to the slack variables describing the one-step variation of the tower and shaft loads and to the linear lower bound representing the long-term value, and these are combined with the corresponding decision variables to form a multi-objective optimization function in the form of a linear summation.

[0141] In some embodiments, the objective function consists of two linearly weighted parts. The first part is the short-term objective, used to smooth structural load fluctuations; it corresponds to minimizing a weighted sum of the slack variables s_abs introduced in formula (24). The second part is the long-term objective, which aims to maximize the linear lower bound of the overall long-term value obtained in step S106. Minimizing the load variation directly corresponds to reducing structural fatigue and is a short-term, instantaneous optimization objective. Maximizing the lower bound of the long-term value guides the optimization towards higher long-term benefits.

[0142] S1072: The field-level power balance equation and the upper and lower limit inequalities of individual unit power are incorporated as hard constraints into the set of constraints of the optimization problem.

[0143] In some embodiments, the field-level power balance equation constraint requires that the sum of the active power reference commands of all N wind turbines equals the total power command issued by the power grid dispatch center, i.e., formula (10).

[0144] Single-unit operating boundary constraints require the power reference value of each unit to lie between its lower and upper power limits.

[0145] Field-level equality constraints ensure that the optimization results strictly meet the power quota requirements of the power grid, thus achieving grid quota compliance. Individual unit box-type inequality constraints ensure that the power command allocated to each wind turbine is within the range that its converter and pitch system can safely execute, avoiding overload or underload operation and meeting the individual unit boundary requirements.

[0146] S1073: Combine the linear equivalent equations and inequalities of the load prediction model, as well as the slack variable constraints introduced to smooth the load, with the power constraints mentioned above to form a complete linear constraint system.

[0147] In some embodiments, the set of equalities and inequalities of the linear equivalent prediction model obtained in step S104 for each load channel c, as shown in formula (8), is added as constraints.

[0148]

[0149] Formula (8). In the formula, the hinge auxiliary variables express the piecewise linear terms through linear inequalities, which allows them to be solved together with the power equality and box constraints in the same problem.
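As a non-limiting illustration, the hinge replacement can be sketched as follows. The coefficient names `alpha`, `beta`, `gammas`, and `knots` are hypothetical, and only right-hand hinges of the form max(0, u - knot) are shown; the embodiment uses both left and right inflection points.

```python
def hinge_load(u, alpha, beta, gammas, knots):
    # Convex piecewise linear prediction written with the positive-part operation.
    return alpha + beta * u + sum(g * max(0.0, u - k) for g, k in zip(gammas, knots))

def smallest_feasible_z(u, knots):
    # Smallest auxiliary values satisfying z_m >= u - knot_m and z_m >= 0.
    return [max(u - k, 0.0) for k in knots]

def linear_load(u, z, alpha, beta, gammas):
    # Same prediction written as a linear expression in u and the auxiliary variables.
    return alpha + beta * u + sum(g * zm for g, zm in zip(gammas, z))

alpha, beta = 1.0, 2.0
gammas = [0.5, 1.5]   # non-negative rate-of-change coefficients (assumed values)
knots = [0.3, 0.6]    # preset inflection points (assumed values)
u = 0.7
z = smallest_feasible_z(u, knots)
print(hinge_load(u, alpha, beta, gammas, knots), linear_load(u, z, alpha, beta, gammas))
```

When each auxiliary variable takes its smallest feasible value, the linear expression reproduces the hinge model; with non-negative coefficients, any larger feasible value can only increase the predicted load.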

[0150] In this embodiment, the model links the decision variable u_j to the predicted load value. Load smoothing constraints are then added, which use the predicted and currently measured loads and limit the step change of the load through slack variables. By adding the prediction model constraints, the optimizer can anticipate the instantaneous impact of different allocation schemes on the structural load of each wind turbine when searching for the optimal power allocation. Furthermore, by adding the load smoothing constraints, the optimizer proactively avoids power commands that cause drastic load fluctuations during decision-making.
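As a non-limiting illustration, the role of a load smoothing slack variable can be sketched as follows. Formula (24) is not reproduced here, so the two-sided form below (slack not less than the change and not less than its negative) is an assumption, as are the function names and sample numbers.

```python
def smallest_slack(y_pred, y_meas):
    # Smallest s satisfying s >= y_pred - y_meas and s >= -(y_pred - y_meas).
    return max(y_pred - y_meas, y_meas - y_pred)

def short_term_cost(pred, meas, weights):
    # Weighted sum of the slack variables over all load channels.
    return sum(w * smallest_slack(p, m) for p, m, w in zip(pred, meas, weights))

# Two load channels (e.g. tower bending moment and shaft torque), normalized values.
print(short_term_cost([1.0, 2.0], [1.5, 1.0], [1.0, 2.0]))
```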

[0151] S1074: Define the solution time domain of the optimization problem as the current single time step, and set the linear programming problem to be repeatedly constructed and solved online at a frequency that matches the control cycle.

[0152] In some embodiments, a model predictive control principle based on the rolling time domain is employed, but the prediction time domain length is set to 1. That is, at each discrete control time t, based on the currently measured state s(t) and load y(t), and the current field-level total command Utot(t), a linear programming problem as described in S1071-S1073 is constructed and solved. After solving, only the decision variables corresponding to the current time t in the optimal solution are used as the actual commands to be issued. Upon reaching the next control time t+Δt, this process is repeated, using the new state and command to construct and solve a new linear programming problem.
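As a non-limiting illustration, the single-step rolling procedure can be sketched as follows. The function `solve_single_step` is a hypothetical placeholder that splits the total command equally; in the embodiment it would be replaced by the linear program of S1071 to S1073.

```python
def solve_single_step(state, load, u_tot, n_units):
    # Placeholder for the single-step linear program (assumed equal split).
    return [u_tot / n_units] * n_units

def run_controller(measurements, n_units):
    issued = []
    for state, load, u_tot in measurements:   # one entry per control instant
        plan = solve_single_step(state, load, u_tot, n_units)
        issued.append(plan)                   # only the current-step commands are issued
    return issued

# Three control instants with made-up states, loads and total commands.
history = [(None, None, 6.0), (None, None, 4.5), (None, None, 3.0)]
for commands in run_controller(history, n_units=3):
    print(commands)
```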

[0153] S1075: Define the set of decision variables to be optimized, including the active power reference command of all wind turbines, auxiliary variables in the load prediction model, load smoothing relaxation variables, and auxiliary variables for the lower bound of the value function.

[0154] In some embodiments, the set of decision variables x includes: the active power reference commands u_j of all N wind turbines; the auxiliary variables corresponding to each wind turbine, each load channel, and each inflection point, which are used to equivalently represent the load prediction model; the load change amplitude slack variables corresponding to each wind turbine and each load channel; and the linear lower bound variables and aggregate lower bound variables of the value function corresponding to each wind turbine. These variables are the unknowns to be solved when constructing the problem. Their dimensions are determined by parameters such as the number of wind turbines N, the number of model inflection points M, and the number of value anchor points K, as described in the note to formula (18), and their total number grows linearly with these parameters.

[0155] In this way, all the quantities that need to be determined are unified into a decision variable vector x, so that the objective function and constraints can be written as linear expressions in terms of x. Because the problem size grows only linearly, the constructed linear programming problem can still be handled by the MPC solver even for large wind farms containing dozens of wind turbines. The linear program, its dual, the KKT conditions, and the interior-point method can be expressed by the following formulas:

[0156] $$\min_{x}\; c^{\top}x \quad \text{s.t.} \quad Ax = b,\quad Gx \le h$$ Formula (19). In the formula, x is the decision variable vector, c is the objective coefficient vector, A and b are the equality constraint matrix and its right-hand side, and G and h are the inequality constraint matrix and its right-hand side.

[0157]

[0158] Formula (20). In the formula, the two multiplier vectors are the dual variables of the equality constraints and the inequality constraints, respectively.

[0159]

[0160] Formula (21). The above equations represent the KKT conditions, and the last equation represents complementary slackness.

[0161]

[0162] Formula (22). In the formula, s is the slack variable vector and μ is the barrier parameter. The primal-dual interior-point method decreases μ and performs Newton steps along the central path to approach the KKT solution.
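For orientation, the standard textbook relations that these paragraphs refer to can be sketched as follows. The multiplier symbols y and z, the index i, and the sign conventions are assumptions chosen for this sketch; the formulas (19) to (22) of the embodiment may be written differently.

```latex
\begin{aligned}
&\text{Primal LP:} && \min_{x}\; c^{\top}x \quad \text{s.t.}\quad Ax=b,\;\; Gx\le h \\
&\text{Dual LP:} && \max_{y,\,z}\; -b^{\top}y-h^{\top}z \quad \text{s.t.}\quad A^{\top}y+G^{\top}z+c=0,\;\; z\ge 0 \\
&\text{KKT:} && Ax=b,\quad Gx+s=h,\quad A^{\top}y+G^{\top}z+c=0,\quad s\ge 0,\quad z\ge 0,\quad z_i s_i=0 \;\; \forall i \\
&\text{Barrier problem:} && \min_{x,\,s}\; c^{\top}x-\mu\sum_{i}\ln s_i \quad \text{s.t.}\quad Ax=b,\;\; Gx+s=h \\
&\text{Central path:} && z_i s_i=\mu \;\; \forall i,\quad \mu\to 0^{+}
\end{aligned}
```

Here y and z play the role of the dual variables of the equality and inequality constraints, and the central-path condition relaxes complementary slackness by the barrier parameter μ.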

[0163] In one embodiment of the present invention, a possible implementation of step S108 is described below in a non-limiting manner. S108 specifically includes the following steps: S1081: The optimization problem constructed in step S107, which includes the objective function and the set of linear constraints, is rearranged into the standard mathematical form of linear programming.

[0164] In some embodiments, the objective function and all constraints defined in step S107 are organized according to the standard linear programming form shown in formula (19). Specifically, all decision variables are arranged as a column vector x, and the coefficients of the corresponding variables in the objective function are arranged as a row vector. The field-level power balance equation in formula (10) and the equality relations in formula (8) are rearranged into the form Ax=b, where A is the equality constraint matrix and b is the constant vector on the right-hand side.

[0165] Field-level constraints and load smoothing

[0166] Formula (10). In the formula, u_j is the active power reference of the j-th wind turbine; Utot is the field-level quota issued by the power grid; the lower and upper limits of single-unit power bound each u_j; and N is the number of grid-connected wind turbines.

[0167] The inequality relations in formulas (10), (8), (24), and (16) are rearranged into the form Gx≤h, where G is the inequality constraint matrix and h is the corresponding constant vector. By transforming constraints with explicit engineering semantics into pure coefficient matrices (A, G) and vectors (b, h, c), the engineering optimization problem is transformed into a standard mathematical problem.
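As a non-limiting illustration, the rearrangement of the power constraints alone into Ax=b and Gx≤h can be sketched as follows. Only the power reference commands are included in x here; the auxiliary, slack, and lower bound variables of the full problem are omitted, and the function names are hypothetical.

```python
def build_power_constraints(u_tot, lower, upper):
    n = len(lower)
    A = [[1.0] * n]              # sum_j u_j = u_tot
    b = [u_tot]
    G, h = [], []
    for j in range(n):
        row = [0.0] * n
        row[j] = 1.0
        G.append(row)            # u_j <= upper_j
        h.append(upper[j])
        G.append([-v for v in row])   # -u_j <= -lower_j
        h.append(-lower[j])
    return A, b, G, h

def dot(row, x):
    return sum(r * v for r, v in zip(row, x))

def is_feasible(x, A, b, G, h, tol=1e-9):
    eq_ok = all(abs(dot(row, x) - rhs) <= tol for row, rhs in zip(A, b))
    ineq_ok = all(dot(row, x) <= rhs + tol for row, rhs in zip(G, h))
    return eq_ok and ineq_ok

A, b, G, h = build_power_constraints(3.0, [0.5, 0.5, 0.5], [1.5, 1.5, 1.5])
print(is_feasible([1.0, 1.0, 1.0], A, b, G, h), is_feasible([2.0, 0.5, 0.5], A, b, G, h))
```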

[0168] S1082: Call the linear programming algorithm to iteratively solve the standard form optimization problem numerically, and obtain the optimal solution for all decision variables, including the active power reference command.

[0169] In some embodiments, an integrated or linked linear programming solver is invoked, with the coefficients c, A, b, G, h in standard form as input. The solver iterates based on the barrier function method principle shown in formula (22): by introducing a slack variable s, the inequality Gx≤h is transformed into Gx+s=h, s>0, and a logarithmic barrier term is added to the objective function to force s to remain positive. Using an iterative method such as Newton's method, a series of perturbed optimization problems is solved while the barrier parameter μ is gradually decreased, ultimately approaching the optimal solution. The solver outputs the optimal decision variable vector x, which contains the required optimal active power references u_j. The solver is highly optimized to handle potentially ill-conditioned matrices in a numerically stable manner and to provide high-precision solutions within a finite number of iterations.
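As a non-limiting illustration, the barrier principle can be shown on a one-variable toy problem: minimize x subject to x ≥ 1, with slack s = x - 1 > 0 and barrier objective s - μ·ln(s). This is not the solver of the embodiment; the problem, the shrink factor, and the stopping thresholds are assumptions made only for this sketch.

```python
def barrier_solve(mu0=1.0, shrink=0.5, mu_min=1e-8, newton_iters=20):
    s = 1.0      # strictly positive initial slack, so x = 1 + s
    mu = mu0
    while mu > mu_min:
        for _ in range(newton_iters):
            grad = 1.0 - mu / s          # derivative of s - mu * ln(s)
            if abs(grad) < 1e-12:
                break
            hess = mu / (s * s)          # second derivative, always positive
            step = -grad / hess          # Newton step
            while s + step <= 0.0:       # backtrack so the slack stays positive
                step *= 0.5
            s += step
        mu *= shrink                     # decrease the barrier parameter
    return 1.0 + s

x_opt = barrier_solve()
print(x_opt)
```

For each fixed μ the barrier objective is minimized at s = μ, so the iterate approaches the constrained optimum x = 1 from the interior as μ decreases.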

[0170] S1083: During the reinforcement learning training phase, random perturbations are injected into the obtained active power reference optimal solution to encourage policy exploration.

[0171] In some embodiments, the optimal action (power command) u obtained by solving the MPC problem under the current state s is not directly used for environmental interaction or as the final output command. Instead, a random noise vector ε is added to it to generate a temporary reference command û=u+ε. The noise ε can be drawn from a zero-mean Gaussian distribution whose amplitude decays as training progresses. The exploration and feasibility projection method is described by formula (23):
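As a non-limiting illustration, the noise injection can be sketched as follows. The exponential decay schedule and the values of `sigma0` and `decay` are assumptions; the embodiment only states that the amplitude decays with training.

```python
import random

def noise_scale(step, sigma0=0.05, decay=0.999):
    # Assumed exponential decay of the noise standard deviation.
    return sigma0 * decay ** step

def explore(u, step, rng):
    # Add zero-mean Gaussian noise to every unit's power command.
    sigma = noise_scale(step)
    return [uj + rng.gauss(0.0, sigma) for uj in u]

rng = random.Random(0)          # fixed seed for a repeatable sketch
u = [0.8, 1.0, 1.2]
print(explore(u, step=0, rng=rng))
print(explore(u, step=5000, rng=rng))
```

The perturbed command generally violates the power balance, which is why a projection back onto the feasible region follows.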

[0172] Formula (23). In the formula, û is the temporary reference after adding exploration noise. The projected feasible solution is obtained by Lagrange multiplier bisection and piecewise saturation updates, which ensures that the summation equation and the box constraints are satisfied simultaneously.

[0173] The purpose of formula (23) is to encourage the system to attempt actions that are slightly different from the optimal solution in order to collect more diverse state, action, and reward data. This approach of exploring around the MPC baseline combines the short-term rationality of model-based MPC with the long-term exploration capability of model-free RL. It is a key mechanism for efficiently learning complex value functions Q(s,a), ultimately enabling the PICNN network in step S105 to learn better value estimates.

[0174] S1084: Project the power reference value after the injected disturbance into the feasible region that satisfies the field-level power balance and the single-unit power limit to obtain the final executable instruction.

[0175] In some embodiments, temporary instructions û with added noise during the training phase, or instructions that may be fine-tuned during the testing phase to ensure robustness, are projected onto the feasible region that satisfies all hard constraints in formula (10). This is achieved by solving the quadratic programming problem shown in formula (23), whose constraints are the field-level power balance equation and the single-unit power limits. This problem can be solved using efficient numerical methods, such as bisection on the Lagrange multiplier combined with piecewise saturation updates for the box constraints, to quickly find the point within the feasible region that is closest to û in Euclidean distance. This ensures that all exploratory actions during training, and the final instructions issued, meet the requirements of grid quotas and individual machine boundaries, thus confining the exploratory behavior of the learning process within a safe boundary.
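As a non-limiting illustration, the Euclidean projection onto the power balance and box constraints can be sketched with bisection on a single multiplier. The function name, iteration count, and sample numbers are assumptions; the sketch uses the known structure of this projection, in which every unit is shifted by the same multiplier and then saturated at its limits.

```python
def project(u_hat, u_tot, lower, upper, iters=100):
    if not sum(lower) <= u_tot <= sum(upper):
        raise ValueError("total command outside the range reachable by the units")

    def clipped(lam):
        # Shift every unit by the multiplier, then saturate at the box limits.
        return [min(max(x - lam, lo), hi) for x, lo, hi in zip(u_hat, lower, upper)]

    lam_lo = min(x - hi for x, hi in zip(u_hat, upper))   # every unit at its upper limit
    lam_hi = max(x - lo for x, lo in zip(u_hat, lower))   # every unit at its lower limit
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(clipped(lam)) > u_tot:    # sum too large: increase the multiplier
            lam_lo = lam
        else:
            lam_hi = lam
    return clipped(0.5 * (lam_lo + lam_hi))

u = project([1.2, 0.4, 0.9], 2.0, [0.2, 0.2, 0.2], [1.0, 1.0, 1.0])
print(u, sum(u))
```

The sum of the saturated commands is non-increasing in the multiplier, which is what makes bisection applicable.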

[0176] S1085: The final active power reference command is mapped from dimensionless form back to physical dimensions and sent to the basic controller of the corresponding wind turbine generator for execution via the communication network.

[0177] In some embodiments, the power reference value obtained after solving and possibly projecting is the final command in physical dimensions. Through the wind farm's internal communication network, the command for each turbine is sent in real time to the turbine's main controller or converter control system. The controller receives this active power reference command and uses it as an outer loop command. Through inner-layer torque control and pitch control, it drives the generator and pitch system to execute, ultimately enabling the turbine's actual output power to track the reference value.
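As a non-limiting illustration, the mapping between the dimensionless action and the physical power command can be sketched as follows. The affine min-max form and the sample limits are assumptions; the embodiment only states that the physical command is mapped to the zero-one interval and back.

```python
def to_dimensionless(u, lower, upper):
    # Map a physical power reference to the interval [0, 1].
    return (u - lower) / (upper - lower)

def to_physical(a, lower, upper):
    # Map a dimensionless action in [0, 1] back to physical units.
    return lower + a * (upper - lower)

lower_mw, upper_mw = 0.5, 2.0     # assumed single-unit limits in MW
a = to_dimensionless(1.2, lower_mw, upper_mw)
print(a, to_physical(a, lower_mw, upper_mw))
```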

[0178] It should be understood that the sequence number of each step in the above embodiments does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.

[0179] As shown in Figure 2, this application also provides an electronic device, including a display module 103, a memory 102, a processor 101, a communication module 104, and a computer program stored in the memory 102 and executable on the processor 101. When the processor 101 executes the program, it implements the steps of the wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function.

[0180] In embodiments of the present invention, electronic devices include, but are not limited to, laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementation of the embodiments described and / or claimed herein.

[0181] In this embodiment, processor 101 may be implemented using at least one of an application-specific integrated circuit, a programmable logic device, a field-programmable gate array, a processor, a controller, a microcontroller, a microprocessor, or an electronic unit designed to perform the functions described herein. In some cases, such an implementation may be implemented within a controller. For software implementation, implementations such as processes or functions may be implemented with separate software modules that allow the performance of at least one function or operation. Software code may be implemented by a software application (or program) written in any suitable programming language, and the software code may be stored in memory and executed by the controller.

[0182] The display module 103 is used to display information input by the user or information provided to the user. The display module 103 may include a display panel, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like.

[0183] The memory 102 can be used to store software programs and various data. The memory 102 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other non-volatile solid-state storage device.

[0184] The communication module 104 transmits radio signals to and / or receives radio signals from at least one of a base station, an external terminal, and a server. Such radio signals may include voice call signals, video call signals, or various types of data sent and / or received according to text and / or multimedia messages.

[0185] The present invention also provides a storage medium storing a computer program thereon, wherein the computer program, when executed by a processor, implements the steps of the wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function.

[0186] The storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: electrical connections having one or more wires, portable disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.

[0187] The storage medium stores a program product capable of implementing the methods described above in this specification. In some possible implementations, various aspects of this disclosure may also be implemented as a program product comprising program code that, when run on a terminal device, causes the terminal device to perform the steps described in the "Exemplary Methods" section of this specification according to various exemplary embodiments of this disclosure.

[0188] The above description of the disclosed embodiments enables those skilled in the art to make or use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method for coordinated active power optimization of wind turbines based on piecewise linear load prediction and a convex state-value function, characterized in that the method includes: S101: Collect operating status data and structural load data of wind turbine generator sets, and perform preprocessing; S102: Based on the preprocessed data, construct a high-dimensional state feature vector using state additive hat basis functions; S103: Using the high-dimensional state feature vector, construct a piecewise linear load prediction model that is convex in the unit's active power reference command, to predict tower bending moment and shaft torque respectively; S104: Equivalently transform the piecewise linear load prediction model, which is convex in the active power reference command, into a linearized prediction model that includes auxiliary variables and linear inequality relationships; S105: Construct a "state-action value" input convex neural network that is convex in the dimensionless active power reference command, as a parameterized value function; S106: At multiple preset active power reference command anchor points, calculate the tangents of the parameterized value function and form a linear lower bound constraint by a linear combination of the tangents; S107: With the goal of minimizing the load variation amplitude and maximizing the linear lower bound of long-term control value, construct a linear programming problem by combining the total active power constraint at the field level and the upper and lower limits of active power of a single unit; S108: Solve the linear programming problem online to obtain the active power reference command sequence of the unit that satisfies all constraints, and transmit it to the wind turbine for execution.

2. The wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function as described in claim 1, characterized in that S101 specifically includes the following steps: S1011: Collect real-time operating status data of each wind turbine generator set at a preset cycle through a SCADA system or an edge computing device, the status data including ultra-short-term predicted inflow wind speed, generator speed, generator output active power, pitch angle and generator torque; S1012: Perform data quality verification on the collected real-time operating status quantities, remove missing data points, and perform amplitude limiting and cleaning on outliers that exceed the physical range; S1013: Collect structural load data synchronized in time with the real-time operating status data, including tower root bending moment and transmission chain shaft torque; S1014: According to preset scaling factors, normalize and scale the cleaned state quantities, the single-machine active power reference command and the structural load data respectively, the scaling factors being determined according to the typical range of each physical quantity; S1015: Organize the scaled state variables, structural load data, and corresponding active power reference commands into a time-aligned normalized sample dataset.

3. The wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function as described in claim 1, characterized in that S102 specifically includes the following steps: S1021: Determine the dimension of the state vector representing the operating conditions of the wind turbine unit; S1022: For each dimension of the state vector, set multiple nodes at equal intervals along the dimension within the interval determined by the minimum and maximum values of the historical observation data; S1023: Based on the nodes, define a set of hat-shaped basis functions with local support properties for each state dimension, with the center of each basis function located at a node; S1024: For a given current state vector, calculate the output value of each dimension component on all corresponding hat-shaped basis functions; S1025: Concatenate a constant term with the output values of all hat-shaped basis functions in all dimensions to form a high-dimensional sparse feature vector.

4. The wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function as described in claim 1, characterized in that S103 specifically includes the following steps: S1031: Construct a prediction model for each of the two load channels of tower bending moment and shaft torque, wherein the output of the prediction model is the prediction of the load value at a future moment and the input of the prediction model is the state feature vector and the active power reference command at the current moment; S1032: Based on the high-dimensional state feature vector, perform a linear mapping through trainable parameter matrices to generate state sensitivity coefficients for the current operating conditions, including intercept coefficients, linear coefficients, and two sets of non-negative piecewise linear rate of change coefficients; S1033: Within the allowable variation range of the single-machine active power reference command, preset a set of fixed left and right inflection points at equal intervals along the power axis to define the inflection points of the piecewise linear model; S1034: Apply non-negativity constraints to the two sets of piecewise linear rate of change coefficients; S1035: Combine the intercept term, the linear term, and the two sets of non-negative piecewise linear rate of change coefficients according to the preset left and right inflection point positions to calculate the predicted load value.

5. The wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function as described in claim 1, characterized in that, S104 specifically includes the following steps: S1041: For each piecewise linear term in the load prediction model, auxiliary variables are configured to mathematically replace the nonlinear positive part operation in the original model. S1042: Establish linear inequality constraints between auxiliary variables, control quantities, and corresponding inflection points, such that the auxiliary variables are numerically not less than the difference between the control quantities and the inflection points and are not less than 0; S1043: Rewrite the calculation expression of the original load prediction model into a linear weighted sum of control variables, auxiliary variables and state sensitivity coefficients; S1044: Combine the rewritten linear load calculation equation with the linear inequality constraints for all auxiliary variables to form a linear equivalent system describing the input-output relationship of a single load channel; S1045: Summarize the linear equivalent system of all load channels, field-level power balance constraints, single-machine power upper and lower limit constraints, and load variation range constraints into a linear constraint set.

6. The wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function as described in claim 1, characterized in that, S105 specifically includes the following steps: S1051: Map the physical active power reference command to the zero-one interval to generate a dimensionless action representation decoupled from the upper and lower limits of single-machine power; S1052: Determine the input composition of the neural network, including dimensionless actions and feature vectors representing the current operating state of the wind turbine. S1053: Construct a multi-layer feedforward network. The update calculation of the feature vector of each layer is a linear combination of the features of the previous layer, the dimensionless action and the state feature vector, and is transformed by a non-linear activation function. S1054: During the forward propagation of the network, a non-negativity constraint is applied to the weight matrix acting on the feature vector of the previous layer to ensure that the entire network mapping is a convex function with respect to dimensionless actions. S1055: The feature vectors of the last layer of the network are combined again with the dimensionless action and state feature vectors in a constrained linear combination to output the final action-state value function estimate, which is used to characterize the long-term control value.

7. The wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function as described in claim 1, characterized in that S106 specifically includes the following steps: S1061: Within the standard range of the dimensionless action, select a fixed number of discrete points as anchor points for evaluating the curvature of the function; S1062: For each selected anchor point, call the parameterized value function network to calculate the action-state value function value at the anchor point and its partial derivative with respect to the dimensionless action; S1063: Based on the function value and derivative value at each anchor point, calculate the parameters of the equation of the straight line passing through the point with the derivative as the slope; S1064: Introduce an auxiliary variable for each unit and establish constraints to ensure that it is not greater than the values of the tangents corresponding to all anchor points at the current action of the unit; S1065: Based on the auxiliary variables of all units, construct an aggregate expression for a long-term value linear lower bound across the entire field as part of the optimization objective.

8. The wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function as described in claim 1, characterized in that S107 specifically includes the following steps: S1071: Assign optimization weight coefficients to the terms describing the variation range of the tower and shaft loads and to the linear lower bound characterizing the long-term value, and combine them with the corresponding decision variables to form a multi-objective optimization function; S1072: Use the field-level power balance equation and the upper and lower limit inequalities of individual unit power as a set of constraints; S1073: Combine the linear equivalent equations and inequalities of the load prediction model, as well as the slack variable constraints introduced to smooth the load, with the power constraints to form a linear constraint system; S1074: Define the solution time domain of the optimization problem as the current single time step, and set the linear programming problem to be repeatedly constructed and solved online at a frequency matching the control cycle; S1075: Define the set of decision variables to be optimized, including the active power reference commands of all wind turbines, the auxiliary variables in the load prediction model, the load smoothing slack variables, and the auxiliary variables for the lower bound of the value function.

9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, When the processor executes the program, it implements the steps of the wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function as described in any one of claims 1 to 8.

10. A storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by the processor, it implements the steps of the wind turbine active power co-optimization method based on piecewise linear load prediction and convex state-value function as described in any one of claims 1 to 8.