Primary frequency regulation method and device for thermal power units based on constrained discrimination by a reflective agent, and electronic equipment

By using a constrained discrimination method based on a reflective agent, an adaptive piecewise droop mapping and a continuous change rule are generated. This addresses the asymmetric sensitivity of heating constraints to the frequency regulation response in primary frequency regulation of heating units, and achieves continuous adjustment of frequency regulation commands and a stable response within the heating boundary.

CN122267916A · Pending · Publication Date: 2026-06-23 · XIAN THERMAL POWER RES INST CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
XIAN THERMAL POWER RES INST CO LTD
Filing Date
2026-05-25
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

In the existing primary frequency regulation control of heating units, the effect of heating constraints on the frequency regulation response mainly falls on the end cutoff or hard limiting stage. The power command is prone to sudden changes, making it difficult to express the asymmetric sensitivity of the heating side to different regulation directions. Furthermore, the limitation judgment depends on a single preset direction threshold or a single rule, making it difficult to output verifiable limitation levels and causes.

Method used

A constrained discrimination method based on a reflective agent is adopted. The power grid frequency difference and the heating constraint state set are acquired, and a candidate restricted conclusion set is generated. The reflective agent extracts heating change direction features for consistency determination and outputs the upward and downward adjustment allowance coefficients. From these coefficients, an adaptive piecewise droop mapping is generated, the frequency regulation target power change is calculated, and the frequency regulation command is updated according to the continuous change rule when an allowance coefficient shrinks.

Benefits of technology

It keeps the frequency regulation power command changing continuously when heating constraints contract, reduces the risk of abrupt command changes, strengthens the verifiable correspondence between frequency regulation actions and heating constraints, and keeps the frequency regulation response within the heating boundary.



Abstract

The application discloses a primary frequency regulation method and device for a thermal power unit based on constrained discrimination by a reflective agent, and electronic equipment. The method comprises: acquiring a power grid frequency difference, a heating constraint state set, and a command direction identifier; generating a candidate restricted conclusion set from a preset rule chain; performing consistency discrimination with a reflective agent based on a boundary proximity feature and a heating change direction feature, selecting a restriction level, and outputting an upward adjustment allowance coefficient, a downward adjustment allowance coefficient, and a restriction reason label; generating an adaptive piecewise droop mapping online from the allowance coefficients; calculating a frequency regulation target power change and an achievable response upper limit; and, when an allowance coefficient contracts, shrinking the upper limit according to a continuous change rule to form the frequency regulation power command. The application can strengthen the verifiable correspondence between frequency regulation actions and heating constraints and reduce the risk of abrupt command changes.

Description

Technical Field

[0001] This application relates to the field of power equipment technology, and in particular to a primary frequency regulation method, apparatus, and electronic device for thermal power units based on constrained discrimination by a reflective agent.

Background Technology

[0002] Heating units typically supply heat to the heating network by extracting steam from steam turbines. The unit's electrical power regulation is strongly coupled with key quantities such as heating pressure, heating temperature, and heating flow rate. When the speed control system changes valve positions or power settings, it causes changes in the distribution of main steam and extracted steam, thus affecting the dynamic response of pressure, temperature, and flow rate on the heating side. When the power grid experiences frequency deviations, primary frequency regulation requires the unit to rapidly increase or decrease active power in the direction of the frequency difference within a short period to maintain frequency stability. However, the heating side must also meet process boundaries and user heating demands to avoid problems such as pressure exceeding limits, temperature deviations, or insufficient flow. Therefore, implementing primary frequency regulation under heating conditions requires establishing a coordination mechanism between the power response triggered by the power grid frequency difference and heating constraints. This ensures that frequency regulation actions are reliably executed by the speed control system without disrupting heating boundary conditions and can adapt to changes in constraint strength over time.

[0003] In related technologies, primary frequency regulation control of heating units largely follows the conventional approach: the grid frequency difference is mapped directly to a power change through a fixed droop coefficient, fixed gain, or piecewise function, and execution constraints such as dead zone, amplitude limiting, slope limiting, and valve saturation are configured on the speed control system side. To account for the heating side, engineering practice often sets static upper and lower limits for the key quantities of extraction steam heating, applies load reduction or load increase logic near the boundary, or reduces the primary frequency regulation gain under specific operating conditions based on simple rules. Some solutions also apply end-point corrections based on real-time heating pressure or temperature values after the frequency regulation power command is generated, so that key quantities do not exceed their limits. For heating units, these schemes affect the frequency regulation response mainly through end-point cutoff or hard limiting. Power commands are prone to abrupt changes when they reach the limit, and they have no continuous correspondence with changes in the key heating quantities. The restriction determination often relies on a single preset direction threshold or a single rule, making it difficult to output verifiable restriction levels and causes. In addition, upward and downward adjustments are usually limited or reduced symmetrically, making it difficult to express the asymmetric sensitivity of the heating side to different adjustment directions.

Summary of the Invention

[0004] This disclosure provides a primary frequency regulation method, apparatus, and electronic device for thermal power units based on constrained discrimination by a reflective agent, in order to at least solve the above-mentioned technical problems in the prior art.

[0005] According to a first aspect of this application, a primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent is provided, the method comprising:
acquiring the power grid frequency difference, the heating constraint state set, and the command direction identifier;
generating a candidate restricted conclusion set based on the heating constraint state set and a preset rule chain; extracting heating change direction features from the heating constraint state set using a reflective agent; and performing consistency determination on each candidate path in the candidate restricted conclusion set based on the heating change direction features and the command direction identifier, wherein the candidate restricted conclusion set includes multiple candidate paths and each candidate path includes a restriction level and a restriction reason label;
when the consistency determination result of a candidate path meets the preset determination threshold, determining the restriction level of that candidate path; otherwise, switching the candidate path or regenerating the candidate restricted conclusion set and repeating the consistency determination until the result meets the determination threshold or the preset maximum number of loops is reached, thereby determining the final restriction level; and, based on the final restriction level, outputting the upward adjustment allowance coefficient, the downward adjustment allowance coefficient, and the restriction reason label;
generating an adaptive piecewise droop mapping based on the power grid frequency difference, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient;
calculating the initial frequency regulation target power change based on the adaptive piecewise droop mapping, and determining the initial achievable response upper limit according to the final restriction level and the adaptive piecewise droop mapping, wherein the initial achievable response upper limit includes an initial upward achievable response upper limit and an initial downward achievable response upper limit;
when the upward or downward adjustment allowance coefficient shrinks relative to the previous control cycle, shrinking the initial achievable response upper limit according to a preset continuous change rule and updating the initial frequency regulation target power change, to obtain the final frequency regulation target power change and the final achievable response upper limit; otherwise, taking the initial achievable response upper limit directly as the final achievable response upper limit and the initial frequency regulation target power change as the final frequency regulation target power change; wherein the continuous change rule is to multiply the initial achievable response upper limit by the shrinkage ratio of the upward or downward adjustment allowance coefficient, so that the limit shrinks proportionally;
obtaining the frequency regulation power command based on the final frequency regulation target power change and the final achievable response upper limit;
combining the restriction reason label, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient to obtain the restriction explanation quantity;
inputting the frequency regulation power command into the turbine speed control system for execution, and collecting the key quantities of extraction steam heating after execution to obtain the execution feedback heating state;
updating the heating constraint state set based on the execution feedback heating state; and
outputting the updated heating constraint state set, the restriction explanation quantity, and the frequency regulation power command.
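As an illustrative, non-limiting sketch, the continuous change rule described above might be expressed as follows. The shrinkage ratio is assumed here to be the current allowance coefficient divided by that of the previous control cycle, positive power changes are assumed to denote upward adjustment, and the target power change is assumed to be updated by keeping it within the shrunk limit. The text does not fix these details, and the names `ResponseLimits`, `shrink_ratio`, and `apply_continuous_rule` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseLimits:
    up: float    # upward achievable response upper limit, non-negative
    down: float  # downward achievable response upper limit, non-negative


def shrink_ratio(current: float, previous: float) -> float:
    """Return current/previous clipped to [0, 1]; 1.0 means no contraction."""
    if previous <= 0.0:
        return 1.0  # assumption: without a valid previous value, do not shrink
    return min(1.0, max(0.0, current / previous))


def apply_continuous_rule(initial, target, k_up, k_down, prev_k_up, prev_k_down):
    """Shrink the limits proportionally when an allowance coefficient contracts."""
    r_up = shrink_ratio(k_up, prev_k_up)
    r_down = shrink_ratio(k_down, prev_k_down)
    if r_up == 1.0 and r_down == 1.0:
        # No contraction: initial values are taken directly as final values.
        return target, initial
    final = ResponseLimits(up=initial.up * r_up, down=initial.down * r_down)
    # Assumption: the target power change is updated by keeping it inside
    # the shrunk limit while preserving its sign.
    if target >= 0.0:
        final_target = min(target, final.up)
    else:
        final_target = -min(-target, final.down)
    return final_target, final
```

Because the limit is scaled by a ratio rather than replaced by a fixed cap, a small contraction of a coefficient produces a correspondingly small contraction of the limit.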

[0006] In one possible implementation, acquiring the power grid frequency difference, the heating constraint state set, and the command direction identifier includes: calculating the raw frequency difference from the collected real-time frequency and the rated frequency of the power grid; normalizing the raw frequency difference and segmenting it by amplitude to obtain the power grid frequency difference; collecting the current key quantities of extraction steam heating, the heating pressure boundary, the heating temperature boundary, and the heating flow rate boundary, wherein the current key quantities of extraction steam heating include the current values of heating pressure, heating temperature, and heating flow rate, the heating pressure boundary includes an upper limit and a lower limit of heating pressure, the heating temperature boundary includes an upper limit and a lower limit of heating temperature, and the heating flow rate boundary includes an upper limit and a lower limit of heating flow rate; packaging the current key quantities of extraction steam heating, the heating pressure boundary, the heating temperature boundary, and the heating flow rate boundary in a fixed order to obtain the heating constraint state set; obtaining the frequency regulation power command of the previous control cycle and determining its direction based on a preset direction threshold; and determining the command direction identifier from the direction determination result.
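The acquisition step can be sketched as below, purely as an example. The 50 Hz rated frequency, the normalized segmentation thresholds, the units, and the encoding of the direction identifier as +1, -1, or 0 are all assumptions introduced for illustration; the function names are hypothetical.

```python
RATED_HZ = 50.0  # assumption: a 50 Hz grid


def frequency_difference(real_hz: float, rated_hz: float = RATED_HZ) -> float:
    """Raw frequency difference: real-time frequency minus rated frequency."""
    return real_hz - rated_hz


def segment_frequency_difference(delta_hz, rated_hz=RATED_HZ,
                                 thresholds=(0.0006, 0.004)):
    """Normalize by the rated frequency, then count the thresholds reached.

    The thresholds are hypothetical normalized amplitudes in ascending order.
    Segment 0 is the smallest-amplitude segment.
    """
    normalized = delta_hz / rated_hz
    segment = sum(1 for t in thresholds if abs(normalized) >= t)
    return normalized, segment


def pack_constraint_state(pressure, temperature, flow):
    """Each argument is (current, upper limit, lower limit).

    Returns a 9-tuple in the fixed order pressure, temperature, flow rate.
    """
    return (*pressure, *temperature, *flow)


def direction_identifier(prev_command_mw: float, direction_threshold_mw: float) -> int:
    """+1 for upward, -1 for downward, 0 when inside the direction threshold."""
    if prev_command_mw > direction_threshold_mw:
        return 1
    if prev_command_mw < -direction_threshold_mw:
        return -1
    return 0
```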

[0007] In one possible implementation, generating the candidate restricted conclusion set based on the heating constraint state set and the preset rule chain, extracting the heating change direction features from the heating constraint state set using the reflective agent, and performing consistency determination on each candidate path in the candidate restricted conclusion set based on the heating change direction features and the command direction identifier includes: The reflective agent extracts the current value of heating pressure, the upper limit of heating pressure, the lower limit of heating pressure, the current value of heating temperature, the upper limit of heating temperature, the lower limit of heating temperature, the current value of heating flow rate, the upper limit of heating flow rate, and the lower limit of heating flow rate from the heating constraint state set, and forms a nine-dimensional vector in a fixed order. From the heating constraint state sets of adjacent control cycles, the heating change direction features are extracted and formed into a three-dimensional vector in the fixed order of heating pressure change direction, heating temperature change direction, and heating flow rate change direction. The nine-dimensional vector, the three-dimensional vector, and the one-dimensional command direction identifier are concatenated in a fixed order to obtain a thirteen-dimensional input vector, which is input into the consistency determination network. The consistency determination network performs forward computation on the thirteen-dimensional input vector and outputs a bounded consistency score vector, each dimension of which corresponds to one candidate path in the candidate restricted conclusion set. The maximum value in the consistency score vector is compared with the preset determination threshold; if the maximum value is greater than or equal to the threshold, the candidate path corresponding to the maximum value is selected to obtain the consistency determination result.
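By way of example only, the assembly of the thirteen-dimensional input vector (9 + 3 + 1) and the threshold comparison might look like the sketch below. Encoding each change direction as the sign of the cycle-to-cycle difference is an assumption, as are the helper names and the small tolerance `eps`.

```python
def sign(value: float, eps: float = 1e-9) -> float:
    """Return 1.0, -1.0, or 0.0 depending on the sign of value."""
    if value > eps:
        return 1.0
    if value < -eps:
        return -1.0
    return 0.0


def change_direction(state, prev_state):
    """Three-dimensional vector: pressure, temperature, flow change direction.

    Indices 0, 3, and 6 hold the current pressure, temperature, and flow
    rate in the nine-dimensional layout.
    """
    return [sign(state[i] - prev_state[i]) for i in (0, 3, 6)]


def build_input(state, prev_state, direction_id):
    """Concatenate 9 state values, 3 change directions, and 1 direction id."""
    if len(state) != 9 or len(prev_state) != 9:
        raise ValueError("heating constraint state must have nine entries")
    return list(state) + change_direction(state, prev_state) + [float(direction_id)]


def select_path(scores, threshold):
    """Return the index of the best-scoring path, or None if below threshold."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else None
```

A `None` result corresponds to the case in which the candidate path is switched or the candidate restricted conclusion set is regenerated.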

[0008] In one possible implementation, the consistency determination network includes a first fully connected layer with thirty-two neurons, a second fully connected layer with sixteen neurons, a third fully connected layer with eight neurons, and an output layer. The number of neurons in the output layer equals the number of candidate paths in the candidate restricted conclusion set. Each neuron consists of a weighted summation unit and a nonlinear transformation unit. The weighted summation unit computes a weighted combination of the previous layer's outputs and adds a bias. The nonlinear transformation units of the first, second, and third fully connected layers apply the ReLU activation function to the weighted summation result. The nonlinear transformation unit of the output layer applies the Sigmoid activation function to the weighted summation result, so that each entry of the consistency score vector is bounded within the interval from 0 to 1.
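The layer structure above (13 inputs, then 32, 16, and 8 ReLU neurons, then one Sigmoid output per candidate path) can be illustrated with a minimal forward pass in plain Python. The random initial weights, zero biases, and function names are placeholders chosen for this sketch; how the network is trained is not addressed here.

```python
import math
import random


def dense(x, weights, biases):
    """Weighted summation unit: one row of weights and one bias per neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(values):
    """Numerically stable logistic function, output in (0, 1)."""
    out = []
    for v in values:
        if v >= 0.0:
            out.append(1.0 / (1.0 + math.exp(-v)))
        else:
            e = math.exp(v)
            out.append(e / (1.0 + e))
    return out


def build_network(n_paths, seed=0):
    """Placeholder (untrained) parameters for layer sizes 13-32-16-8-n_paths."""
    rng = random.Random(seed)
    sizes = [13, 32, 16, 8, n_paths]
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(network, x):
    """ReLU on the three hidden layers, Sigmoid on the output layer."""
    h = list(x)
    for weights, biases in network[:-1]:
        h = relu(dense(h, weights, biases))
    weights, biases = network[-1]
    return sigmoid(dense(h, weights, biases))
```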

[0009] In one possible implementation, generating the adaptive piecewise droop mapping based on the power grid frequency difference, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient includes: The segmentation structure is determined from the amplitude of the power grid frequency difference and a preset set of segmentation thresholds, and the frequency difference interval corresponding to each segment is determined. The frequency difference intervals include the primary frequency regulation dead zone interval, the linear adjustment interval, and the amplitude limiting interval; the segmentation structure includes multiple upward adjustment segments and multiple downward adjustment segments. For the upward adjustment direction, the upward adjustment allowance coefficient is mapped to the first segment slope parameter and the segment upper limit parameter of each upward adjustment segment to generate an upward adjustment segment mapping. For the downward adjustment direction, the downward adjustment allowance coefficient is mapped to the second segment slope parameter and the segment lower limit parameter of each downward adjustment segment to generate a downward adjustment segment mapping. The upward adjustment segment mapping and the downward adjustment segment mapping are joined to form the adaptive piecewise droop mapping.
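A simplified, non-limiting sketch of such a mapping is given below, using a single linear segment per direction rather than multiple segments. It assumes that the allowance coefficients scale the slope and cap linearly, that a negative frequency difference (under-frequency) calls for an upward power adjustment, and that the dead zone, base slope, and base cap take the hypothetical values shown.

```python
def droop_power_change(delta_hz: float, k_up: float, k_down: float,
                       dead_hz: float = 0.033,
                       base_slope_mw_per_hz: float = 100.0,
                       base_cap_mw: float = 20.0) -> float:
    """Map a frequency difference to a target power change in MW.

    Dead zone: no response. Linear interval: slope scaled by the allowance
    coefficient of the active direction. Limiting interval: cap scaled by
    the same coefficient. All numeric defaults are illustrative assumptions.
    """
    magnitude = abs(delta_hz)
    if magnitude <= dead_hz:
        return 0.0
    excess = magnitude - dead_hz
    if delta_hz < 0.0:
        # Under-frequency: upward adjustment, governed by k_up.
        return min(k_up * base_slope_mw_per_hz * excess, k_up * base_cap_mw)
    # Over-frequency: downward adjustment, governed by k_down.
    return -min(k_down * base_slope_mw_per_hz * excess, k_down * base_cap_mw)
```

Because `k_up` and `k_down` are applied separately, the upward and downward branches can have different slopes and limits, which is how the asymmetry between adjustment directions is expressed in this sketch.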

[0010] In one possible implementation, obtaining the frequency regulation power command based on the final frequency regulation target power change and the final achievable response upper limit includes: obtaining the adjustment direction of the final frequency regulation target power change; selecting, according to the adjustment direction, the final upward achievable response upper limit or the final downward achievable response upper limit as the limiting reference; comparing the absolute value of the final frequency regulation target power change with the limiting reference; when the absolute value exceeds the limiting reference, limiting the final frequency regulation target power change to the limiting reference while keeping the adjustment direction unchanged, to obtain the limited frequency regulation target power change; and generating the frequency regulation power command from the limited frequency regulation target power change.
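The direction-preserving amplitude check can be sketched as follows, assuming that a positive power change denotes upward adjustment and that both limits are given as non-negative magnitudes. The function name `limit_command` is hypothetical.

```python
import math


def limit_command(target_mw: float, up_limit_mw: float, down_limit_mw: float) -> float:
    """Clamp the target power change to the limit of its own direction."""
    # Select the limiting reference from the adjustment direction.
    reference = up_limit_mw if target_mw >= 0.0 else down_limit_mw
    if abs(target_mw) > reference:
        # Limit the magnitude while keeping the sign (direction) unchanged.
        return math.copysign(reference, target_mw)
    return target_mw
```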

[0011] In one possible implementation, inputting the frequency regulation power command into the turbine speed control system for execution and collecting the key quantities of extraction steam heating after execution to obtain the execution feedback heating state includes: The frequency regulation power command is input into the turbine speed control system, which converts it into a speed control execution quantity that drives the turbine to complete the active power response. Timing logic is started once the command has been input. When the elapsed time reaches the preset execution confirmation time threshold, the key quantities of extraction steam heating are collected to obtain the current values of heating pressure, heating temperature, and heating flow rate; the threshold is chosen to match the response lag of the turbine speed control system. These current values are packaged in a fixed order to form the execution feedback heating state.
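As a rough illustration of the timing logic, the sketch below sends the command, waits until a confirmation time has elapsed, and then samples the heating quantities. The callables `send_command` and `read_heating` stand in for the interfaces to the speed control system and the heating-side sensors, which are not specified here; the polling interval is likewise an assumption.

```python
import time


def collect_feedback(send_command, read_heating, command_mw, confirm_s,
                     clock=time.monotonic, sleep=time.sleep):
    """Send the command, wait for the confirmation time, then sample.

    send_command and read_heating are hypothetical callables; read_heating
    is assumed to return (pressure, temperature, flow rate).
    """
    send_command(command_mw)
    start = clock()  # start the timing logic
    while clock() - start < confirm_s:
        sleep(min(0.01, confirm_s))  # assumed polling interval
    pressure, temperature, flow = read_heating()
    # Package in the fixed order pressure, temperature, flow rate.
    return (pressure, temperature, flow)
```

The clock and sleep functions are injectable so that the waiting behavior can be exercised without real delays.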

[0012] In one possible implementation, updating the heating constraint state set based on the execution feedback heating state includes: Obtain the preset heating pressure boundary, heating temperature boundary, and heating flow rate boundary; The current value of heating pressure, its upper and lower limits, the current value of heating temperature, its upper and lower limits, and the current value of heating flow rate, along with their upper and lower limits, are combined in a fixed order to form an updated set of heating constraint states.

[0013] According to a second aspect of this application, a primary frequency regulation device for a thermal power unit based on constrained discrimination by a reflective agent is provided, the device comprising:
a data acquisition module, used to acquire the power grid frequency difference, the heating constraint state set, and the command direction identifier;
a constraint determination module, used to generate a candidate restricted conclusion set based on the heating constraint state set and a preset rule chain, to extract heating change direction features from the heating constraint state set using a reflective agent, and to perform consistency determination on each candidate path in the candidate restricted conclusion set based on the heating change direction features and the command direction identifier, wherein the candidate restricted conclusion set includes multiple candidate paths and each candidate path includes a restriction level and a restriction reason label;
a constraint output module, used to determine the restriction level of a candidate path when the consistency determination result of that candidate path meets a preset determination threshold, and otherwise to switch the candidate path or regenerate the candidate restricted conclusion set and repeat the consistency determination until the result meets the determination threshold or a preset maximum number of loops is reached, thereby determining the final restriction level, and to output the upward adjustment allowance coefficient, the downward adjustment allowance coefficient, and the restriction reason label based on the final restriction level;
a mapping generation module, used to generate an adaptive piecewise droop mapping based on the power grid frequency difference, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient;
a parameter calculation module, used to calculate the initial frequency regulation target power change based on the adaptive piecewise droop mapping, and to determine the initial achievable response upper limit according to the final restriction level and the adaptive piecewise droop mapping, wherein the initial achievable response upper limit includes an initial upward achievable response upper limit and an initial downward achievable response upper limit;
a power smoothing update module, used, when the upward or downward adjustment allowance coefficient shrinks relative to the previous control cycle, to shrink the initial achievable response upper limit according to a preset continuous change rule and update the initial frequency regulation target power change, thereby obtaining the final frequency regulation target power change and the final achievable response upper limit, and otherwise to take the initial achievable response upper limit directly as the final achievable response upper limit and the initial frequency regulation target power change as the final frequency regulation target power change, wherein the continuous change rule is to multiply the initial achievable response upper limit by the shrinkage ratio of the upward or downward adjustment allowance coefficient, so that the limit shrinks proportionally;
a command generation module, used to obtain the frequency regulation power command based on the final frequency regulation target power change and the final achievable response upper limit;
an explanation quantity generation module, used to combine the restriction reason label, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient to obtain the restriction explanation quantity;
a feedback acquisition module, used to input the frequency regulation power command into the turbine speed control system for execution, and to collect the key quantities of extraction steam heating after execution to obtain the execution feedback heating state;
a state update module, used to update the heating constraint state set based on the execution feedback heating state; and
a data output module, used to output the updated heating constraint state set, the restriction explanation quantity, and the frequency regulation power command.

[0014] According to a third aspect of this application, an electronic device is provided, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method described in this application.

[0015] According to a fourth aspect of this application, a non-transitory computer-readable storage medium is provided storing computer instructions for causing the computer to perform the methods described in this application.

[0016] According to a fifth aspect of this application, a computer program product is provided, comprising a computer program or instructions that, when executed by a processor, implement the method described in this application.

[0017] With the technical solution of this application, an interpretable restriction determination result can be formed from the heating constraint state set during primary frequency regulation of the heating unit. Based on this result, an online achievable response boundary and a piecewise droop mapping that is asymmetric between upward and downward regulation can be generated, so that the frequency regulation power command changes continuously while the constraints contract and does not exceed the heating boundary. This application strengthens the verifiable correspondence between frequency regulation actions and heating constraints and reduces the risk of abrupt command changes.

[0018] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of this application, nor is it intended to limit the scope of this application. Other features of this application will become readily apparent from the following description.

Attached Figure Description

[0019] The above and other objects, features, and advantages of exemplary embodiments of this application will become readily apparent from the following detailed description taken in conjunction with the accompanying drawings. Several embodiments of this application are illustrated in the drawings by way of example and not limitation. In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.

[0020] Figure 1 shows the first implementation flow diagram of the primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent, according to an embodiment of this application;
Figure 2 shows the second implementation flow diagram of the method according to an embodiment of this application;
Figure 3 shows the third implementation flow diagram of the method according to an embodiment of this application;
Figure 4 shows the fourth implementation flow diagram of the method according to an embodiment of this application;
Figure 5 shows the fifth implementation flow diagram of the method according to an embodiment of this application;
Figure 6 shows a schematic diagram of the consistency determination and re-determination of the candidate restricted conclusion set in an embodiment of this application;
Figure 7 shows a schematic diagram of the asymmetric upward and downward achievable margins and the absence of jumps during continuous contraction in an embodiment of this application;
Figure 8 shows the sixth implementation flow diagram of the method according to an embodiment of this application;
Figure 9 shows a schematic diagram comparing continuous contraction of the achievable response upper limit with a hard-limiting jump in an embodiment of this application;
Figure 10 shows the seventh implementation flow diagram of the method according to an embodiment of this application;
Figure 11 shows a block diagram of the primary frequency regulation device for a thermal power unit based on constrained discrimination by a reflective agent, according to an embodiment of this application;
Figure 12 shows a schematic diagram of the structure of an electronic device according to an embodiment of this application.

Detailed Implementation

[0021] To make the objectives, features, and advantages of this application more apparent and understandable, the technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0022] In the following description, the terms "first" and "second" are used merely to distinguish similar objects and do not represent a specific ordering of objects. It is understood that "first" and "second" may be interchanged in a specific order or sequence where permitted, so that the embodiments of this application described herein can be implemented in an order other than that illustrated or described herein.

[0023] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.

[0024] The following description, in conjunction with the accompanying drawings, introduces the primary frequency regulation method, apparatus, and electronic device for thermal power units based on constrained discrimination by a reflective agent provided in this application.

[0025] As shown in Figure 1, this application provides a primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent. The method comprises:
S101, obtain the power grid frequency difference, the set of heating constraint states, and the command direction identifier;
S102, generate a candidate restricted conclusion set based on the set of heating constraint states and a preset rule chain; use the reflective agent to extract the heating change direction feature from the set of heating constraint states; and, based on the heating change direction feature and the command direction identifier, perform a consistency judgment on each candidate path in the candidate restricted conclusion set, wherein the candidate restricted conclusion set includes multiple candidate paths and each candidate path includes a restriction level and a restriction reason label;
S103, in response to the consistency judgment result of a candidate path meeting the preset judgment threshold, determine the restriction level of that candidate path as the final restriction level; otherwise, switch the candidate path or regenerate the candidate restricted conclusion set and repeat the consistency judgment until the result meets the judgment threshold or the preset maximum number of loops is reached, and then determine the final restriction level; based on the final restriction level, output the upward adjustment allowable coefficient, the downward adjustment allowable coefficient, and the restriction reason label;
S104, generate an adaptive piecewise droop mapping based on the power grid frequency difference, the upward adjustment allowable coefficient, and the downward adjustment allowable coefficient;
S105, calculate the initial frequency regulation target power change based on the adaptive piecewise droop mapping, and determine the initial achievable response upper limit according to the final restriction level and the adaptive piecewise droop mapping, wherein the initial achievable response upper limit includes an initial upward achievable response upper limit and an initial downward achievable response upper limit;
S106, in response to the upward or downward adjustment allowable coefficient having contracted relative to the previous control cycle, contract the initial achievable response upper limit and update the initial frequency regulation target power change according to a preset continuous change rule, to obtain the final frequency regulation target power change and the final achievable response upper limit; otherwise, take the initial achievable response upper limit directly as the final achievable response upper limit and the initial frequency regulation target power change as the final frequency regulation target power change; wherein the continuous change rule is: multiply the initial achievable response upper limit by the contraction ratio of the upward or downward adjustment allowable coefficient, so that the limit contracts proportionally;
S107, obtain the frequency regulation power command based on the final frequency regulation target power change and the final achievable response upper limit;
S108, combine the restriction reason label, the upward adjustment allowable coefficient, and the downward adjustment allowable coefficient to obtain the restricted explanation quantity;
S109, input the frequency regulation power command into the turbine speed control system for execution, and collect the key quantities of steam extraction heating after execution to obtain the execution feedback heating state;
S1010, update the set of heating constraint states based on the execution feedback heating state;
S1011, output the updated set of heating constraint states, the restricted explanation quantity, and the frequency regulation power command.

[0026] A reflective agent is an artificial intelligence agent possessing the capabilities of self-evaluation, self-criticism, and self-correction. Through a built-in reflection loop, the reflective agent reviews its reasoning process, output results, or decisions, thereby improving judgment accuracy, logical consistency, and task reliability. Reflective agents can be implemented using existing intelligent agents.

[0027] This application provides a primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent. The method acquires the power grid frequency difference, the set of heating constraint states, and the command direction identifier. A candidate restricted conclusion set is generated by a preset rule chain. The reflective agent performs a consistency judgment based on boundary proximity features and heating change direction features, selects the restriction level, and outputs an upward adjustment allowable coefficient, a downward adjustment allowable coefficient, and a restriction reason label. An adaptive piecewise droop mapping is then generated online from the allowable coefficients and used to calculate the frequency regulation target power change and the achievable response upper limit. When an allowable coefficient contracts, the upper limit is contracted continuously to form the frequency regulation power command. This application can strengthen the verifiable correspondence between frequency regulation actions and heating constraints and reduce the risk of abrupt changes in the command. The boundary proximity features refer to how close the current actual operating parameters of the unit are to the allowable safety boundaries (the heating pressure boundary, heating temperature boundary, and heating flow rate boundary) and to the upper and lower power limits imposed by the heating constraints.

[0028] In some embodiments, as shown in Figure 2, obtaining the power grid frequency difference, the set of heating constraint states, and the command direction identifier includes:
S201, calculate the power grid frequency difference from the collected real-time frequency and the rated frequency of the power grid, normalize the power grid frequency difference, and then perform amplitude segmentation to obtain the power grid frequency difference quantity;
S202, collect the current key quantities of steam extraction heating, the heating pressure boundary, the heating temperature boundary, and the heating flow rate boundary; the current key quantities of steam extraction heating include the current value of heating pressure, the current value of heating temperature, and the current value of heating flow rate; the heating pressure boundary includes the upper and lower limits of heating pressure; the heating temperature boundary includes the upper and lower limits of heating temperature; the heating flow rate boundary includes the upper and lower limits of heating flow rate;
S203, encapsulate the current key quantities of steam extraction heating, the heating pressure boundary, the heating temperature boundary, and the heating flow rate boundary in a fixed order to obtain the set of heating constraint states;
S204, obtain the frequency regulation power command of the previous control cycle, and determine its direction based on a preset direction threshold;
S205, determine the command direction identifier based on the direction determination result.

[0029] In this application, the grid frequency difference is collected during the current control cycle and converted into a grid frequency difference quantity for subsequent droop mapping generation. During the same control cycle, the current values of heating pressure, heating temperature, and heating flow rate are collected, along with the upper and lower limits of heating pressure, heating temperature, and heating flow rate, and these are encapsulated in a fixed order to form the set of heating constraint states. The frequency regulation power command of the previous cycle is obtained, and its direction is determined using a preset direction threshold. The determination result is mapped to a command direction identifier, and the grid frequency difference quantity, the set of heating constraint states, and the command direction identifier are output.

[0030] For example, the control system has a fixed control cycle, and the frequency regulation calculation is executed once per control cycle, triggered by the frequency regulation calculation task of the turbine speed control system. At the sampling time of each control cycle, the real-time frequency of the power grid is collected and the rated frequency is read. The power grid frequency difference is defined as the difference between the real-time frequency and the rated frequency. The power grid frequency difference is then converted into the power grid frequency difference quantity. First, the power grid frequency difference is normalized by the preset frequency difference limit to obtain a normalized frequency difference. Then the normalized frequency difference is truncated in amplitude according to the following rule: when the normalized frequency difference is greater than or equal to the positive truncation threshold, it is set to the positive truncation threshold; when it is less than the negative truncation threshold, it is set to the negative truncation threshold; otherwise it is left unchanged. This yields a power grid frequency difference quantity that is sign-preserving and amplitude-limited. Both the frequency difference limit and the truncation threshold are preset parameters of the control system.
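The normalization and truncation described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name, the parameter names, and the numeric values in the usage lines are assumptions chosen for the example.

```python
def grid_frequency_difference_quantity(f_real_hz, f_rated_hz,
                                       diff_limit_hz, trunc_threshold):
    """Normalize the grid frequency difference and truncate its amplitude.

    diff_limit_hz   : preset frequency difference limit (assumed name)
    trunc_threshold : preset truncation threshold on the normalized value
    """
    if diff_limit_hz <= 0 or trunc_threshold <= 0:
        raise ValueError("preset limits must be positive")
    delta_f = f_real_hz - f_rated_hz          # grid frequency difference
    normalized = delta_f / diff_limit_hz      # normalized frequency difference
    # Sign-preserving truncation to [-trunc_threshold, +trunc_threshold].
    return max(-trunc_threshold, min(trunc_threshold, normalized))


# Illustrative values only: 50 Hz rated frequency, 0.2 Hz difference limit.
small_dip = grid_frequency_difference_quantity(49.95, 50.0, 0.2, 1.0)
large_rise = grid_frequency_difference_quantity(50.50, 50.0, 0.2, 1.0)
print(small_dip, large_rise)
```

A small deviation passes through scaled but unclipped, while a large one is held at the truncation threshold with its sign preserved.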

[0031] Within the same control cycle, the key quantities of steam extraction heating are collected synchronously, including the current value of heating pressure, the current value of heating temperature, and the current value of heating flow rate. At the same time, the upper and lower limits of the heating pressure boundary, the upper and lower limits of the heating temperature boundary, and the upper and lower limits of the heating flow rate boundary are obtained. These nine boundary-related quantities are grouped by process parameter into three sets of data, which together form the boundary proximity feature matrix. The first row of the matrix corresponds to the current value of heating pressure and the upper and lower limits of the heating pressure boundary; the second row corresponds to the current value of heating temperature and the upper and lower limits of the heating temperature boundary; and the third row corresponds to the current value of heating flow rate and the upper and lower limits of the heating flow rate boundary.

[0032] The boundary proximity feature matrix is expanded in row-major order into a nine-dimensional vector. The components of the nine-dimensional vector are, in order: current heating pressure, upper limit of the heating pressure boundary, lower limit of the heating pressure boundary, current heating temperature, upper limit of the heating temperature boundary, lower limit of the heating temperature boundary, current heating flow rate, upper limit of the heating flow rate boundary, and lower limit of the heating flow rate boundary. The nine-dimensional vector serves as the numerical representation of the set of heating constraint states, so that the consistency determination network can directly take boundary proximity features with a consistent ordering as input.
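The fixed-order encapsulation can be illustrated with a short sketch. The function name and the sample readings are hypothetical; only the row layout (current value, upper limit, lower limit) and the row-major expansion follow the description.

```python
def flatten_heating_state(pressure, temperature, flow):
    """Build the 3x3 boundary proximity matrix and expand it row by row.

    Each argument is a (current, upper_limit, lower_limit) triple.
    """
    rows = [tuple(pressure), tuple(temperature), tuple(flow)]
    if any(len(row) != 3 for row in rows):
        raise ValueError("each row must be (current, upper_limit, lower_limit)")
    # Row-major expansion: pressure row, then temperature row, then flow row.
    return [value for row in rows for value in row]


# Hypothetical readings, units omitted.
state_vector = flatten_heating_state(
    pressure=(1.05, 1.20, 0.80),
    temperature=(250.0, 300.0, 200.0),
    flow=(78.0, 100.0, 50.0),
)
print(state_vector)
```

Keeping the order fixed matters because the downstream network has no other way to tell which component is which.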

[0033] The frequency regulation power command issued in the previous control cycle is obtained, and its direction is determined using the preset direction threshold: when the command is greater than the direction threshold, it is determined to be in the upward adjustment direction; when the command is less than the negative of the direction threshold, it is determined to be in the downward adjustment direction; and when the command lies in the interval from the negative direction threshold to the direction threshold, it is determined to be in the zero direction. The determination result is mapped to the command direction identifier, which is a one-dimensional quantity and is used together with the nine-dimensional vector to construct the input of the consistency determination network. This yields the basic input data set used for constrained discrimination and allowable coefficient generation.
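The three-way direction determination can be sketched as below. The function name is an assumption; the numeric codes 1, -1, and 0 follow the encoding that the description later applies to the direction quantities.

```python
def command_direction_identifier(previous_command, direction_threshold):
    """Map the previous cycle's power command to +1 (up), -1 (down) or 0."""
    if direction_threshold < 0:
        raise ValueError("direction threshold must be non-negative")
    if previous_command > direction_threshold:
        return 1        # upward adjustment direction
    if previous_command < -direction_threshold:
        return -1       # downward adjustment direction
    return 0            # zero direction (inside the threshold band)


print(command_direction_identifier(0.8, 0.1))
print(command_direction_identifier(-0.05, 0.1))
```

The threshold band keeps small commands near zero from flipping the identifier between cycles.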

[0034] In some embodiments, as shown in Figure 3, generating a candidate restricted conclusion set based on the set of heating constraint states and a preset rule chain, extracting the heating change direction feature from the set of heating constraint states using the reflective agent, and performing a consistency judgment on each candidate path in the candidate restricted conclusion set based on the heating change direction feature and the command direction identifier includes:
S301, use the reflective agent to extract from the set of heating constraint states the current value of heating pressure, the upper limit of heating pressure, the lower limit of heating pressure, the current value of heating temperature, the upper limit of heating temperature, the lower limit of heating temperature, the current value of heating flow rate, the upper limit of heating flow rate, and the lower limit of heating flow rate, and form a nine-dimensional vector in a fixed order;
S302, extract the heating change direction feature based on the change direction of the set of heating constraint states between adjacent control cycles, and form a three-dimensional vector in a fixed order; the components of the three-dimensional vector are, in order, the heating pressure change direction, the heating temperature change direction, and the heating flow rate change direction;
S303, concatenate the nine-dimensional vector, the three-dimensional vector, and the one-dimensional command direction identifier in a fixed order to obtain a thirteen-dimensional input vector, and input the thirteen-dimensional input vector into the consistency determination network;
S304, use the consistency determination network to perform a forward calculation on the thirteen-dimensional input vector and output the bounded consistency score vector;
S305, compare the maximum value in the consistency score vector with the preset judgment threshold; if the maximum value is greater than or equal to the preset judgment threshold, select the candidate path corresponding to the maximum value to obtain the consistency judgment result; wherein each dimension of the consistency score vector corresponds to one candidate path in the candidate restricted conclusion set.

[0035] In this application, the constrained discrimination and allowable coefficient generation of the reflective agent takes the set of heating constraint states as the sole heating-side basis and uses the command direction identifier as the reference quantity for discrimination, forming a restriction judgment link that differs from limiting by a fixed preset threshold or matching against a single rule.

[0036] In this application, when the preset rule chain generates the candidate restricted conclusion set from the set of heating constraint states, it does not output a single conclusion. Instead, it outputs multiple candidate paths, each containing a restriction level and a restriction reason label, which gives the restriction determination a structural basis of switchable candidate paths. The reflective agent does not replace the preset rule chain; it performs a consistency judgment on the candidate restricted conclusion set, forming a two-stage structure of candidate generation followed by consistency screening. The heating change direction feature is extracted by the reflective agent from the set of heating constraint states and is limited to an interpretable description of direction, characterizing the change direction of the key quantities of steam extraction heating in the current time period. The reflective agent combines the heating change direction feature with the command direction identifier to obtain the correspondence between the direction of the frequency regulation action and the direction of the heating response, and then compares this correspondence with the typical directional relationship of each candidate path in the candidate restricted conclusion set to form the consistency judgment result.

[0037] The consistency judgment result uses the preset judgment threshold as its decision boundary. When the consistency judgment result meets the preset judgment threshold, the reflective agent selects the restriction level and fixes the restriction reason label. When the consistency judgment result does not meet the preset judgment threshold, the reflective agent switches the rule path of the candidate restricted conclusion set and re-determines the restriction level, so that the restriction level always comes from a path with verifiable directional consistency. The reflective agent outputs the upward adjustment allowable coefficient, the downward adjustment allowable coefficient, and the restriction reason label. The two allowable coefficients parameterize the restriction level as the executable boundary of primary frequency regulation and serve as the direct driving quantities for the primary frequency regulation control with allowable coefficient piecewise droop. At the same time, the restriction reason label provides a verifiable basis for explaining the restriction on the grid side.

[0038] In some embodiments, the consistency determination network includes a first fully connected layer with thirty-two neurons, a second fully connected layer with sixteen neurons, a third fully connected layer with eight neurons, and an output layer. The number of neurons in the output layer equals the number of candidate paths in the candidate restricted conclusion set. Each neuron consists of a weighted summation unit and a nonlinear transformation unit. The weighted summation unit forms a weighted combination of the previous layer's outputs and adds a bias. The nonlinear transformation units of the first, second, and third fully connected layers apply the ReLU activation function to the weighted summation result. The nonlinear transformation unit of the output layer applies the Sigmoid activation function to the weighted summation result, so that the consistency score vector is bounded within the interval from 0 to 1.
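The layer structure of this embodiment can be sketched as a forward pass in plain Python. This is an illustration under stated assumptions: the helper names are invented, and the weights are random placeholders with zero biases, because the description does not give trained parameters or a training procedure.

```python
import math
import random

HIDDEN_SIZES = [32, 16, 8]   # three fully connected layers, as described
INPUT_SIZE = 13              # thirteen-dimensional input vector


def init_network(num_paths, seed=0):
    """Random placeholder parameters (assumption, not trained weights)."""
    rng = random.Random(seed)
    sizes = [INPUT_SIZE] + HIDDEN_SIZES + [num_paths]
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def dense(weights, bias, x):
    """Weighted summation unit: weighted combination of inputs plus bias."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    """Numerically stable Sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def consistency_scores(network, x):
    """Forward pass: ReLU on hidden layers, Sigmoid on the output layer."""
    if len(x) != INPUT_SIZE:
        raise ValueError("expected a thirteen-dimensional input vector")
    hidden = list(x)
    for weights, bias in network[:-1]:
        hidden = [max(0.0, z) for z in dense(weights, bias, hidden)]
    weights, bias = network[-1]
    return [sigmoid(z) for z in dense(weights, bias, hidden)]


network = init_network(num_paths=4)
scores = consistency_scores(network, [0.0] * INPUT_SIZE)
print(len(scores))
```

One score is produced per candidate path, so the output width changes whenever the candidate restricted conclusion set changes size.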

[0039] In this application, the reflexive agent constraint discrimination and permission coefficient generation includes a heating change direction feature extraction unit and a consistency determination network. The input to the consistency determination network is formed by a set of heating constraint states and a command direction identifier. The set of heating constraint states is expanded on the network side into boundary proximity features and heating change direction features. The boundary proximity features are expressed using a nine-dimensional vector, which sequentially includes the current value of heating pressure, the upper limit of heating pressure boundary, the lower limit of heating pressure boundary, the current value of heating temperature, the upper limit of heating temperature boundary, the lower limit of heating temperature boundary, the current value of heating flow rate, the upper limit of heating flow rate boundary, and the lower limit of heating flow rate boundary. The heating change direction features are expressed using a three-dimensional vector, which sequentially represents the direction of heating pressure change, the direction of heating temperature change, and the direction of heating flow rate change. The command direction identifier is expressed using a one-dimensional quantity. The boundary proximity features, heating change direction features, and command direction identifier are concatenated to form a thirteen-dimensional input vector.

[0040] The consistency determination network employs a multi-layer fully connected structure. The input layer receives a thirteen-dimensional input vector. The first fully connected layer contains thirty-two neurons, the second fully connected layer contains sixteen neurons, and the third fully connected layer contains eight neurons. The output layer contains the same number of neurons as the number of candidate paths in the candidate restricted conclusion set, outputting a consistency score vector. Each dimension of the consistency score vector corresponds to a candidate path in the candidate restricted conclusion set. Each neuron in the consistency determination network consists of a weighted summation unit and a nonlinear transformation unit. The weighted summation unit combines the weights of the previous layer's output and adds a bias, while the nonlinear transformation unit maps the weighted summation result. The output layer neurons output a bounded consistency score vector, which is used to determine consistency with a preset threshold.

[0041] The reflective agent matches the consistency score vector against the candidate restricted conclusion set, selects the candidate path whose consistency score meets the preset judgment threshold, and obtains the restriction level and restriction reason label. If the consistency score vector does not meet the preset judgment threshold, the reflective agent switches the rule path of the candidate restricted conclusion set and outputs the consistency score vector again in order to reselect the restriction level and restriction reason label. The reflective agent then looks up the upward adjustment allowable coefficient and the downward adjustment allowable coefficient in a table based on the restriction level, and outputs the upward adjustment allowable coefficient, the downward adjustment allowable coefficient, and the restriction reason label. This structure limits the network output of the reflective agent to the consistency score vector and ties the restriction reason label to the candidate restricted conclusion set, so that the candidate paths and their restriction reason labels remain verifiable, unlike black-box classification structures that output the restriction level directly.

[0042] For example, the constrained discrimination and allowable coefficient generation of the reflective agent takes the set of heating constraint states and the command direction identifier as input data. The set of heating constraint states is represented by the boundary proximity feature matrix and its expanded vector. The boundary proximity feature matrix is a 3x3 matrix whose three rows correspond, respectively, to the current value, boundary upper limit, and boundary lower limit of the heating pressure; the current value, boundary upper limit, and boundary lower limit of the heating temperature; and the current value, boundary upper limit, and boundary lower limit of the heating flow rate. The matrix is expanded in row order to form the nine-dimensional vector, whose nine components are fixed in the following order: current heating pressure, upper limit of the heating pressure boundary, lower limit of the heating pressure boundary, current heating temperature, upper limit of the heating temperature boundary, lower limit of the heating temperature boundary, current heating flow rate, upper limit of the heating flow rate boundary, and lower limit of the heating flow rate boundary. The command direction identifier is represented by a one-dimensional quantity, which is determined from the frequency regulation power command of the previous control cycle according to the preset direction threshold.

[0043] When the candidate restricted conclusion set is generated using the preset rule chain and the operating condition state machine, the boundary state vector is first obtained by analyzing the nine-dimensional vector. The boundary state vector is a three-dimensional vector whose components are, in order, the heating pressure boundary state, the heating temperature boundary state, and the heating flow rate boundary state. The heating pressure boundary state is obtained by comparing the current value of heating pressure with the upper and lower limits of the heating pressure boundary: when the current value is greater than the upper limit, the upper out-of-bounds state is taken; when the current value is less than the lower limit, the lower out-of-bounds state is taken; and when the current value lies between the lower limit and the upper limit, the within-boundary state is taken. The heating temperature boundary state and the heating flow rate boundary state are obtained in the same way. The preset rule chain is triggered by this boundary state information: restricted rules are triggered one by one according to the preset rule priority, and each triggered restricted rule generates one candidate path record. Each candidate path record includes a restriction level and a restriction reason label. To ensure that the candidate paths have verifiable typical directional relationships, each candidate path record is also configured with a typical directional relationship vector. The typical directional relationship vector is a four-dimensional vector whose components correspond, in order, to the heating pressure change direction, the heating temperature change direction, the heating flow rate change direction, and the command direction. The operating condition state machine is described by an operating condition state quantity, which is determined jointly by the boundary state vector and the currently triggered restricted rules, and which undergoes state transitions according to preset state transition conditions. The candidate restricted conclusion set is represented by a candidate path matrix, each row of which consists of one candidate path record. The number of rows of the candidate path matrix is determined by the current operating condition state quantity and the trigger result of the preset rule chain and the operating condition state machine.
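The boundary state analysis and the priority-ordered rule chain can be sketched as follows. Everything named here is hypothetical: the state labels, the `CandidatePath` record, and the three example rules are placeholders, since the description does not list the actual rules, levels, or labels. The operating condition state machine is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePath:
    restriction_level: int
    reason_label: str
    typical_direction: tuple   # (pressure, temperature, flow, command)


def boundary_state(current, upper, lower):
    """Classify one quantity against its boundary."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if current > upper:
        return "above_upper"
    if current < lower:
        return "below_lower"
    return "within"


def boundary_state_vector(x9):
    """x9 is ordered (current, upper, lower) for pressure, temperature, flow."""
    return [boundary_state(*x9[i:i + 3]) for i in (0, 3, 6)]


# Hypothetical rules in priority order: (trigger on boundary states, record).
RULE_CHAIN = [
    (lambda b: b[0] == "above_upper",
     CandidatePath(2, "heating pressure above upper limit", (1, 0, 0, 1))),
    (lambda b: b[2] == "below_lower",
     CandidatePath(2, "heating flow below lower limit", (0, 0, -1, -1))),
    (lambda b: True,
     CandidatePath(0, "no heating restriction", (0, 0, 0, 0))),
]


def candidate_paths(x9):
    """Each triggered rule contributes one row of the candidate path matrix."""
    states = boundary_state_vector(x9)
    return [record for trigger, record in RULE_CHAIN if trigger(states)]


paths = candidate_paths([1.25, 1.20, 0.80, 250.0, 300.0, 200.0, 78.0, 100.0, 50.0])
print([p.reason_label for p in paths])
```

Returning every triggered record, rather than the first match, is what gives the later consistency screening several paths to choose between.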

[0044] The heating change direction feature is represented by a three-dimensional vector whose components are, in order, the heating pressure change direction, the heating temperature change direction, and the heating flow rate change direction. To extract it, the current values of heating pressure, heating temperature, and heating flow rate collected in the previous control cycle are read, and the differences between the values of the current cycle and those of the previous cycle are calculated for pressure, temperature, and flow rate respectively. Each difference is compared with its corresponding preset direction threshold to determine the direction: a difference greater than the corresponding threshold is determined as the upward direction; a difference less than the negative of the corresponding threshold is determined as the downward direction; and all other cases are determined as the stable direction. The upward direction is encoded as the value 1, the downward direction as the value -1, and the stable direction as the value 0, which gives a computable representation of the three change directions. The command direction identifier is mapped to the value 1, -1, or 0 according to the same encoding rule. The three change directions and the encoded command direction form a four-dimensional direction vector, and the nine-dimensional vector, the three-dimensional heating change direction vector, and the command direction identifier are concatenated in a fixed order to form the thirteen-dimensional input vector. The order of arrangement is: nine-dimensional vector, heating change direction vector, command direction identifier.
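Assembling the network input from these pieces might look like the sketch below. The function names, argument layout, and sample numbers are assumptions; the encoding (1, -1, 0) and the concatenation order follow the paragraph above.

```python
def encode_direction(difference, threshold):
    """+1 for upward, -1 for downward, 0 for stable."""
    if difference > threshold:
        return 1
    if difference < -threshold:
        return -1
    return 0


def build_input_vector(x9, previous_currents, thresholds, command_direction):
    """Concatenate 9 boundary features, 3 change directions, 1 command flag.

    x9                : (current, upper, lower) for pressure, temperature, flow
    previous_currents : previous-cycle (pressure, temperature, flow) values
    thresholds        : per-quantity direction thresholds
    """
    if len(x9) != 9 or len(previous_currents) != 3 or len(thresholds) != 3:
        raise ValueError("unexpected input sizes")
    currents = (x9[0], x9[3], x9[6])
    change_directions = [
        encode_direction(now - before, threshold)
        for now, before, threshold in zip(currents, previous_currents, thresholds)
    ]
    direction_vector = change_directions + [command_direction]   # 4-dimensional
    input_vector = list(x9) + change_directions + [command_direction]
    return input_vector, direction_vector


x13, d4 = build_input_vector(
    x9=[1.05, 1.20, 0.80, 250.0, 300.0, 200.0, 78.0, 100.0, 50.0],
    previous_currents=(1.00, 250.0, 80.0),
    thresholds=(0.01, 1.0, 0.5),
    command_direction=1,
)
print(d4)
```

The same four direction values are reused later when scores are corrected for directional consistency.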

[0045] The consistency determination network adopts a multi-layer fully connected structure and takes the thirteen-dimensional input vector as input. The parameters of the consistency determination network consist of the first weight matrix and the first bias vector, the second weight matrix and the second bias vector, the third weight matrix and the third bias vector, and the output layer weight matrix and the output layer bias vector. The first fully connected layer contains thirty-two neurons and outputs a thirty-two-dimensional hidden vector. The second fully connected layer contains sixteen neurons and outputs a sixteen-dimensional hidden vector. The third fully connected layer contains eight neurons and outputs an eight-dimensional hidden vector. Each fully connected layer applies a nonlinear transformation function to the weighted summation result. The nonlinear transformation function adopts the rectified linear unit (ReLU) mapping rule: the output is zero when the input is negative, and the output equals the input when the input is non-negative. The number of neurons in the output layer equals the number of rows of the candidate path matrix. The output layer outputs the consistency score vector, whose dimension equals that number of rows and whose components correspond one-to-one with the candidate path records. The output layer applies a bounded function so that the consistency scores are bounded. The bounded function adopts a piecewise saturation rule: when the input is less than zero, the output is zero; when the input is greater than one, the output is one; and when the input is between zero and one, the output equals the input.

[0046] To explicitly incorporate directional consistency into the consistency determination, after the consistency score vector is obtained, a corrected consistency score that fuses the directional agreement is calculated for each candidate path record, and the corrected scores are assembled into the corrected consistency score vector. The corrected score is defined by the following formula:

[0047] In the formula, the quantities are: the corrected consistency score of the i-th candidate path record; the bounded function; the fusion coefficient, which takes values in the closed interval from zero to one; the i-th component of the consistency score vector; the j-th component of the four-dimensional direction vector; the j-th component of the typical directional relationship vector corresponding to the i-th candidate path record; the candidate path index i; the directional component index j; and the absolute value operation.
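One way the listed quantities could be combined is sketched below. The specific expression is an assumption made for illustration and should not be read as the formula of this application: it blends the network score with a directional agreement term derived from the absolute differences between the direction vector and the typical directional relationship vector, then applies the piecewise saturation rule.

```python
def saturate(value):
    """Bounded function: clamp to the interval [0, 1]."""
    return max(0.0, min(1.0, value))


def corrected_scores(scores, direction_vector, typical_directions, fusion):
    """Assumed form: sat(fusion * score + (1 - fusion) * agreement).

    agreement = 1 - sum_j |d_j - r_ij| / (2 * n), which lies in [0, 1]
    because every direction component is -1, 0, or +1.
    """
    if not 0.0 <= fusion <= 1.0:
        raise ValueError("fusion coefficient must lie in [0, 1]")
    if len(scores) != len(typical_directions):
        raise ValueError("one typical direction vector per candidate path")
    corrected = []
    for score, typical in zip(scores, typical_directions):
        mismatch = sum(abs(d - r) for d, r in zip(direction_vector, typical))
        agreement = 1.0 - mismatch / (2.0 * len(direction_vector))
        corrected.append(saturate(fusion * score + (1.0 - fusion) * agreement))
    return corrected


example = corrected_scores(
    scores=[0.8, 0.4],
    direction_vector=[1, 0, -1, 1],
    typical_directions=[[1, 0, -1, 1], [-1, 0, 1, -1]],
    fusion=0.5,
)
print(example)
```

Under this assumed form, a path whose typical directions match the observed ones is lifted, while a path that contradicts them is pulled down even if its network score is moderate.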

[0048] The consistency determination uses the preset judgment threshold. When the corrected consistency score vector contains a component greater than or equal to the judgment threshold, the index of the component with the highest corrected consistency score is selected, and the candidate path record corresponding to that index is extracted from the candidate path matrix. The restriction level of that candidate path record is determined as the restriction level, and its restriction reason label is determined as the restriction reason label. When all components of the corrected consistency score vector are less than the judgment threshold, the rule path of the candidate restricted conclusion set is switched. A rule path is indicated by a rule path identifier, and the rule path sequence consists of multiple rule path identifiers arranged in a preset order. The next rule path identifier is selected according to the preset order to drive the preset rule chain and the operating condition state machine to regenerate the candidate path matrix. With the input vector kept unchanged under the updated candidate path matrix, the consistency score vector and the corrected consistency score vector are recalculated, until a restriction level and a restriction reason label are obtained.
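The threshold check and rule path switching can be sketched as a bounded loop. The callables `generate_paths` and `score_paths` are hypothetical stand-ins for the rule chain and the corrected scoring, and returning `None` when the loop limit is reached is an assumption, since the fallback in that case is not spelled out here.

```python
def select_restriction(rule_path_sequence, generate_paths, score_paths,
                       threshold, max_loops):
    """Try rule paths in preset order until a score meets the threshold.

    Returns (selected_record, rule_path_id), or (None, None) if no path
    qualifies within max_loops attempts (fallback handling is assumed).
    """
    for attempt, rule_path_id in enumerate(rule_path_sequence):
        if attempt >= max_loops:
            break
        paths = generate_paths(rule_path_id)      # regenerate candidate rows
        if not paths:
            continue
        scores = score_paths(paths)               # corrected consistency scores
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            return paths[best], rule_path_id
    return None, None


# Stub data for illustration only.
stub_paths = {"A": ["a1", "a2"], "B": ["b1", "b2"]}
stub_scores = {"a1": 0.2, "a2": 0.3, "b1": 0.9, "b2": 0.1}

selected = select_restriction(
    rule_path_sequence=["A", "B"],
    generate_paths=lambda path_id: stub_paths[path_id],
    score_paths=lambda paths: [stub_scores[p] for p in paths],
    threshold=0.6,
    max_loops=3,
)
print(selected)
```

Capping the number of attempts keeps the decision within one control cycle even when no candidate path is directionally consistent.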

[0049] According to the restriction level, the upward adjustment allowable coefficient and the downward adjustment allowable coefficient are looked up in the preset mapping table. The preset mapping table is indexed by restriction level, and each table entry is a tuple consisting of an upward adjustment allowable coefficient and a downward adjustment allowable coefficient. The two coefficients may take symmetric or asymmetric values, and the restriction reason label is kept consistent with the selected candidate path record.
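A table lookup of this kind is easy to sketch. The levels and coefficient values below are invented placeholders; only the structure (index by restriction level, tuple of two coefficients, label carried through unchanged) follows the description.

```python
# Hypothetical mapping table: restriction level -> (upward, downward) coefficient.
COEFFICIENT_TABLE = {
    0: (1.0, 1.0),   # unrestricted, symmetric
    1: (0.6, 0.9),   # mild restriction, asymmetric
    2: (0.2, 0.7),   # strong restriction on upward adjustment
}


def restriction_outputs(restriction_level, reason_label):
    """Look up both coefficients and pass the reason label through."""
    if restriction_level not in COEFFICIENT_TABLE:
        raise KeyError(f"no table entry for restriction level {restriction_level}")
    k_up, k_down = COEFFICIENT_TABLE[restriction_level]
    return {"k_up": k_up, "k_down": k_down, "reason_label": reason_label}


outputs = restriction_outputs(2, "heating pressure above upper limit")
print(outputs)
```

Because the label comes from the selected candidate path rather than from the table, the explanation always matches the path that was actually chosen.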

[0050] In some embodiments, as shown in Figure 4, generating the adaptive piecewise droop mapping based on the power grid frequency difference, the upward adjustment allowable coefficient, and the downward adjustment allowable coefficient includes:
S401, determine the segmentation structure based on the amplitude of the power grid frequency difference and the preset set of segmentation thresholds, and determine the frequency difference interval corresponding to each segment in the segmentation structure; the frequency difference intervals include the primary frequency regulation dead zone interval, the linear adjustment interval, and the amplitude limiting interval; the segmentation structure includes multiple upward adjustment segments and multiple downward adjustment segments;
S402, for the upward adjustment direction of the segmentation structure, map the upward adjustment allowable coefficient to the first segment slope parameter and the segment upper limit parameter of each upward adjustment segment to generate the upward adjustment segment mapping;
S403, for the downward adjustment direction of the segmentation structure, map the downward adjustment allowable coefficient to the second segment slope parameter and the segment lower limit parameter of each downward adjustment segment to generate the downward adjustment segment mapping;
S404, concatenate the upward adjustment segment mapping and the downward adjustment segment mapping to form the adaptive piecewise droop mapping.
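Steps S401 to S404 can be illustrated with a deliberately simplified mapping. The sketch assumes one dead zone, one linear segment, and one limiting segment per direction (the description allows multiple segments), coefficients in the range 0 to 1, a negative frequency difference calling for upward adjustment, and slopes and limits that scale linearly with the coefficient. All names and default values are hypothetical.

```python
def droop_target_power_change(x, k_up, k_down,
                              dead_zone=0.05, limit_start=0.8, base_gain=1.0):
    """Target power change for a normalized frequency difference x.

    x < 0 (under-frequency) maps to a positive (upward) power change,
    x > 0 to a negative (downward) one; this sign convention is assumed.
    """
    if not (0.0 <= k_up <= 1.0 and 0.0 <= k_down <= 1.0):
        raise ValueError("allowable coefficients are assumed to lie in [0, 1]")
    if not 0.0 <= dead_zone < limit_start:
        raise ValueError("dead zone must end before the limiting interval")
    magnitude = abs(x)
    if magnitude <= dead_zone:
        return 0.0                                  # dead zone interval
    raising = x < 0
    slope = base_gain * (k_up if raising else k_down)
    # Linear interval up to limit_start, constant in the limiting interval.
    change = slope * (min(magnitude, limit_start) - dead_zone)
    return change if raising else -change


def achievable_limits(k_up, k_down,
                      dead_zone=0.05, limit_start=0.8, base_gain=1.0):
    """Upward and downward achievable response upper limits of the mapping."""
    span = base_gain * (limit_start - dead_zone)
    return k_up * span, k_down * span


print(droop_target_power_change(-0.5, k_up=0.5, k_down=1.0))
print(achievable_limits(k_up=0.5, k_down=1.0))
```

With unequal coefficients the same frequency deviation produces a smaller response in the more restricted direction, which is the asymmetry the mapping is meant to express; the output is also continuous at both segment boundaries.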

[0051] In this application, the primary frequency regulation control with allowable coefficient piecewise droop uses the grid frequency difference as the frequency regulation trigger input, and uses the upward and downward adjustment allowable coefficients produced by the reflective agent's constrained discrimination and allowable coefficient generation as the sole adaptive factors to generate the adaptive piecewise droop mapping online. Under the constraints of the adaptive piecewise droop mapping, the target power change and the achievable response upper limit are calculated. Existing technologies for heating units commonly use a fixed droop or fixed gain: the grid frequency difference is mapped directly to a power change, and a limit or slope restriction is then superimposed at the end. The defining feature of this type of structure is that the mapping is fixed first and the restriction is superimposed afterwards. As a result, the influence of the heating pressure boundary, heating temperature boundary, and heating flow rate boundary on primary frequency regulation falls on the end cutoff, the power command is more likely to jump at the cutoff point, and the degree of restriction in the upward and downward directions is difficult to express through a unified mechanism.

[0052] The adaptive piecewise droop mapping replaces a fixed mapping with end truncation by a mapping generated from the allowance coefficients. The control first generates a segment structure from the amplitude range of the grid frequency difference and binds it to the upward and downward adjustment allowance coefficients: the slope and upper limit of each upward segment are determined by the upward adjustment allowance coefficient, and the slope and lower limit of each downward segment are determined by the downward adjustment allowance coefficient. The resulting mapping is asymmetric between the upward and downward paths, which corresponds directly to the different sensitivities of the unit's extraction steam heating to upward and downward adjustments. The control then calculates the frequency regulation target power change within the segment structure, while the achievable response upper limit is determined by the same allowance coefficients. The target power change and the achievable response upper limit therefore originate from one set of allowance coefficient constraints, forming a control link in which the mapping and the boundary share the same origin.

[0053] When a change in the restriction level causes the upward or downward adjustment allowance coefficient to contract, the control does not handle the achievable response upper limit by hard limiting. Instead, it contracts the achievable response upper limit according to a continuous change rule and updates the frequency regulation target power change. The continuous change rule is defined as follows: the achievable response upper limit is adjusted continuously as the allowance coefficient changes, and the frequency regulation target power change is adjusted synchronously with the achievable response upper limit, so that the trajectory of the frequency regulation power command corresponds continuously to the trajectory of the heating constraint state set. With this structure, the constraint that the heating constraint state set places on primary frequency regulation enters at the droop mapping generation stage and is referenced continuously, through the achievable response upper limit, at the target power calculation stage and the command formation stage, forming a path through all three processing stages that cannot be bypassed.

[0054] In some embodiments, as shown in Figure 5, the step of obtaining the frequency regulation power command based on the final frequency regulation target power change and the final achievable response upper limit includes:
S501, obtaining the adjustment direction of the final frequency regulation target power change;
S502, selecting, according to the adjustment direction, the corresponding final upward achievable response upper limit or final downward achievable response upper limit as the limiting benchmark;
S503, verifying the absolute value of the final frequency regulation target power change against the limiting benchmark;
S504, in response to the absolute value of the final frequency regulation target power change exceeding the limiting benchmark, limiting the final frequency regulation target power change while keeping the adjustment direction unchanged, to obtain the limited frequency regulation target power change;
S505, generating the frequency regulation power command based on the limited frequency regulation target power change.
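Steps S501 to S505 can be illustrated with a minimal Python sketch. The function name and the convention that both limits are passed as non-negative magnitudes are assumptions made for illustration; they are not taken from the application.

```python
def limit_target_power_change(delta_p: float, up_limit: float, down_limit: float) -> float:
    """Clamp a target power change to the limit for its own direction.

    up_limit and down_limit are non-negative magnitudes (assumption of this sketch).
    """
    if delta_p == 0.0:
        return 0.0
    direction = 1.0 if delta_p > 0 else -1.0                 # S501: adjustment direction
    benchmark = up_limit if direction > 0 else down_limit    # S502: limiting benchmark
    magnitude = abs(delta_p)                                 # S503: amplitude verification
    if magnitude > benchmark:                                # S504: limit, sign unchanged
        magnitude = benchmark
    return direction * magnitude                             # S505: basis of the command
```

Because the benchmark is chosen per direction, an upward request and a downward request of the same size can be limited to different values, which is the asymmetry the method relies on.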

[0055] In this application, the grid frequency difference, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient are used as inputs. The segment structure is determined based on the amplitude of the grid frequency difference and the preset segment threshold set, and the frequency difference interval corresponding to each segment is determined. For the upward adjustment direction of the segment structure, the upward adjustment allowance coefficient is mapped to the segment slope parameter and segment upper limit parameter of each upward adjustment segment, generating the upward adjustment segment mapping. For the downward adjustment direction, the downward adjustment allowance coefficient is mapped to the segment slope parameter and segment lower limit parameter of each downward adjustment segment, generating the downward adjustment segment mapping. The two mappings are spliced at the zero frequency difference position to form the adaptive piecewise droop mapping. Under the constraint of this mapping, the grid frequency difference is converted into the frequency regulation target power change, and the achievable response upper limit is determined according to the restriction level, so that the achievable response upper limit is jointly constrained by the upward and downward adjustment allowance coefficients.
When a change in the restriction level causes the upward or downward adjustment allowance coefficient to contract relative to the previous control cycle, the target achievable response upper limit is recalculated from the contracted coefficient, and the achievable response upper limit of the previous control cycle is used as the initial value. The achievable response upper limit is then contracted cycle by cycle according to the preset continuous change threshold. The frequency regulation target power change is limited synchronously by the contracted achievable response upper limit, and the adaptive piecewise droop mapping, the frequency regulation target power change, and the achievable response upper limit are output.

[0056] For example, the allowance-coefficient piecewise droop generation and the continuous contraction are performed once in each control cycle. The inputs are the grid frequency difference, the upward adjustment allowance coefficient, the downward adjustment allowance coefficient, and the restriction level. Based on these inputs, the adaptive piecewise droop mapping is generated, and the frequency regulation target power change and the achievable response upper limit are calculated.

[0057] The preset segment threshold set is represented as a segment threshold vector. Its entries are arranged in ascending order of amplitude and include a zero threshold, and they divide the amplitude of the grid frequency difference into multiple frequency difference intervals. The absolute value of the grid frequency difference is taken to obtain its amplitude, which is compared with each entry of the segment threshold vector to determine the index of the current interval. The effective number of segments in the segment structure is determined from this interval index, and the interval boundaries of each segment are determined at the same time. The segment structure is stored as a segment boundary matrix that holds the lower and upper boundary of each segment row by row, so that the segment slope parameters and segment limits can be referenced in alignment with the segment index.
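A short Python sketch of the interval lookup described here follows. The function names, the example threshold values, and the choice to assign amplitudes beyond the last threshold to the last segment are assumptions for illustration only.

```python
from bisect import bisect_right

def build_segment_boundaries(thresholds):
    """Return one [lower, upper] row per segment from an ascending threshold vector."""
    return [[lo, hi] for lo, hi in zip(thresholds[:-1], thresholds[1:])]

def locate_segment(freq_diff, thresholds):
    """Index of the segment that contains |freq_diff|.

    thresholds is ascending and starts at 0.0. Amplitudes beyond the last
    threshold are assigned to the last segment (assumption of this sketch).
    """
    amplitude = abs(freq_diff)
    idx = bisect_right(thresholds, amplitude) - 1
    return min(max(idx, 0), len(thresholds) - 2)

# Hypothetical thresholds in Hz: dead zone, linear interval, limiting interval.
thresholds = [0.0, 0.033, 0.1, 0.2]
boundaries = build_segment_boundaries(thresholds)
current = locate_segment(-0.05, thresholds)
effective_segments = current + 1  # segments from the first up to the current one
```

Using a binary search keeps the lookup cheap even when the threshold vector is long, and the same index can address the slope and limit vectors directly.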

[0058] When generating the upward adjustment segment mapping, a preset upward adjustment reference slope vector and a preset upward adjustment reference upper limit vector are used. Each component of the reference slope vector is the reference segment slope for one frequency difference interval, and each component of the reference upper limit vector is the reference segment upper limit for one frequency difference interval. Multiplying the upward adjustment allowance coefficient component-wise with the reference slope vector yields the upward adjustment segment slope parameter vector; multiplying it component-wise with the reference upper limit vector yields the upward adjustment segment upper limit parameter vector. The segment boundaries, the slope parameter vector, and the upper limit parameter vector are assembled in a fixed column order into the upward adjustment mapping matrix, each row of which describes the boundaries, slope parameter, and upper limit parameter of one frequency difference interval.
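The component-wise scaling can be sketched as follows. The row layout `[lower, upper, slope, limit]` and the function name are assumptions chosen for illustration; the same helper could serve the downward direction with its own reference vectors and coefficient.

```python
def build_direction_mapping(boundaries, ref_slopes, ref_limits, allowance):
    """Build rows of [lower, upper, slope, limit] for one adjustment direction.

    Slope and limit of every segment are the reference values scaled by the
    allowance coefficient of that direction.
    """
    if not (len(boundaries) == len(ref_slopes) == len(ref_limits)):
        raise ValueError("one reference slope and one reference limit per segment are required")
    return [
        [lo, hi, allowance * slope, allowance * limit]
        for (lo, hi), slope, limit in zip(boundaries, ref_slopes, ref_limits)
    ]

# Hypothetical reference data: a dead zone with zero slope, then a linear segment.
up_mapping = build_direction_mapping(
    boundaries=[[0.0, 0.033], [0.033, 0.1]],
    ref_slopes=[0.0, 100.0],
    ref_limits=[0.0, 6.0],
    allowance=0.5,
)
```

With an allowance coefficient of 0.5, both the slope and the limit of the linear segment are halved, so the mapping flattens and its ceiling drops together rather than being cut off afterwards.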

[0059] When generating the downward adjustment segment mapping, a preset downward adjustment reference slope vector and a preset downward adjustment reference lower limit vector are used; the reference lower limit vector expresses the lower limit as a negative value, in line with the direction of the power change. Multiplying the downward adjustment allowance coefficient component-wise with the reference slope vector yields the downward adjustment segment slope parameter vector; multiplying it component-wise with the reference lower limit vector yields the downward adjustment segment lower limit parameter vector. These are assembled in a fixed column order into the downward adjustment mapping matrix. The upward adjustment mapping matrix and the downward adjustment mapping matrix are then spliced, with zero frequency difference as the splicing boundary, into the adaptive piecewise droop mapping table, which contains both the upward and the downward mapping regions and keeps the segment indexes consistent between them.
As shown in Figure 6, this application first generates candidate paths and then screens them for consistency. Specifically, a preset rule chain and a working condition state machine generate multiple candidate restricted paths (the candidate conclusion set) in parallel. A reflective agent performs a consistency judgment on each candidate, for example on how well it matches the working condition evidence, the constraint logic, and the threshold conditions. If the consistency judgment result meets the preset threshold, the agent outputs the restriction level, the restriction reason label, and the upward and downward adjustment allowance coefficients. If the threshold is not met, a backflow is triggered: the rule path is switched and the restriction level is determined again. Compared with producing a conclusion in a single pass, this provides verifiable path selection and backflow re-determination, reduces false positives and false negatives, and improves the interpretability and robustness of the results and their adaptability to complex working conditions.
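The generate-then-screen loop with backflow can be sketched in Python. The scoring function, the dictionary layout of a candidate, and the fallback to the best-scoring candidate when the loop budget is exhausted are all assumptions for illustration; the application does not specify them here.

```python
def select_restriction(candidates, consistency_score, threshold, max_loops):
    """Try candidate paths in turn until one passes the consistency check.

    candidates: list of dicts with 'level' and 'reason' keys (hypothetical layout).
    consistency_score: callable that returns a score for one candidate.
    Returns (candidate, passed). When no candidate passes within max_loops
    attempts, the best-scoring candidate seen so far is returned
    (fallback policy assumed for this sketch).
    """
    best, best_score = None, float("-inf")
    for attempt, candidate in enumerate(candidates):
        if attempt >= max_loops:
            break
        score = consistency_score(candidate)
        if score >= threshold:
            return candidate, True          # consistent: accept this path
        if score > best_score:
            best, best_score = candidate, score
        # otherwise: backflow, switch to the next rule path
    return best, False

candidates = [{"level": 1, "reason": "pressure"}, {"level": 2, "reason": "flow"}]
score = lambda c: 0.4 if c["level"] == 1 else 0.9
chosen, passed = select_restriction(candidates, score, threshold=0.8, max_loops=3)
```

Returning the pass flag alongside the chosen path keeps the decision auditable: a downstream consumer can tell an accepted path from a forced fallback.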

[0060] Specifically, when the grid frequency difference is converted into the frequency regulation target power change, the corresponding mapping region of the adaptive piecewise droop mapping table is first selected according to the sign of the grid frequency difference. The amplitude of the grid frequency difference is then compared with the segment boundary matrix to obtain the interval index, and the power change is accumulated segment by segment from the first segment up to the segment given by the interval index. For a fully covered segment, the contribution is the segment slope parameter multiplied by the width of that segment's frequency difference interval; for the final segment, which contains the grid frequency difference, the contribution is the segment slope parameter multiplied by the remaining amplitude within that segment. The accumulated result is given the sign of the adjustment direction to obtain the frequency regulation target power change, and the segment upper limit parameter vector or segment lower limit parameter vector, referenced by the interval index, is used to verify the result against the segment limit.
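The segment-by-segment accumulation can be sketched as follows. The row layout `[lower, upper, slope, limit]` with non-negative entries, and the convention that a positive frequency difference selects the upward region and yields a positive power change, are assumptions of this sketch; the sign convention of the application is not recoverable from the text.

```python
def target_power_change(freq_diff, up_map, down_map):
    """Accumulate the power change over segments, then apply the segment limit.

    Each map row is [lower, upper, slope, limit] with non-negative entries.
    Positive freq_diff selects up_map and gives a positive result (assumed).
    """
    if freq_diff == 0.0:
        return 0.0
    rows = up_map if freq_diff > 0 else down_map
    amplitude = abs(freq_diff)
    total, limit = 0.0, 0.0
    for lower, upper, slope, segment_limit in rows:
        if amplitude <= lower:
            break
        # Full width for covered segments, remaining amplitude for the last one.
        covered = min(amplitude, upper) - lower
        total += slope * covered
        limit = segment_limit
    total = min(total, limit)  # segment limit verification
    return total if freq_diff > 0 else -total

up_map = [[0.0, 0.5, 0.0, 0.0], [0.5, 1.5, 4.0, 10.0], [1.5, 2.5, 2.0, 4.5]]
down_map = [[0.0, 0.5, 0.0, 0.0], [0.5, 1.5, 2.0, 3.0]]
```

Accumulating over segments keeps the mapping continuous at every segment boundary, so a change of slope between segments never produces a step in the target power change.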

[0061] The achievable response upper limit is expressed as a directional upper limit quantity and is generated under the joint constraint of the restriction level and the allowance coefficients. A restriction level reference upper limit table is preset; it is indexed by restriction level and stores the reference upward achievable response upper limit and the reference downward achievable response upper limit. The rated active power of the unit is read at the same time and used to unify dimensions. The two reference upper limits are looked up in the table according to the restriction level. The reference upward upper limit is multiplied by the upward adjustment allowance coefficient and scaled by the rated active power to give the target upward achievable response upper limit; the reference downward upper limit is multiplied by the downward adjustment allowance coefficient and scaled in the same way to give the target downward achievable response upper limit. The target achievable response upper limit is selected from these two according to the sign of the grid frequency difference. The achievable response upper limit of the previous control cycle and the preset continuous change threshold are then read, and in each control cycle the achievable response upper limit is updated using the following formula:

[0062] where the terms of the formula are: the achievable response upper limit of the current control cycle; the grid frequency difference; the sign function, which takes the value one when its argument is greater than zero, negative one when its argument is less than zero, and zero when its argument is zero; the target achievable response upper limit; the achievable response upper limit of the previous control cycle; the preset continuous change threshold; the absolute value operation; and the maximum value operation.
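The update formula itself is not legible in this text. The Python sketch below shows one reading that is consistent with the operators listed (sign function, absolute value, maximum, and a per-cycle change threshold): the magnitude of the limit moves from its previous value toward the target by at most one threshold step per cycle. The function names, the per-unit interpretation of the reference limits, and the exact form of the update are assumptions, not the formula of the application.

```python
def sign(x):
    """Return 1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)

def target_upper_limit(freq_diff, ref_up, ref_down, allow_up, allow_down, rated_power):
    """Direction-selected target limit: reference (per unit, assumed) x allowance x rated power."""
    s = sign(freq_diff)
    if s > 0:
        return ref_up * allow_up * rated_power
    if s < 0:
        return -ref_down * allow_down * rated_power
    return 0.0

def contract_upper_limit(freq_diff, target_limit, prev_limit, step):
    """Assumed rule: limit(k) = sgn(df) * max(|target|, |limit(k-1)| - step)."""
    magnitude = max(abs(target_limit), abs(prev_limit) - step)
    return sign(freq_diff) * magnitude

# The limit contracts from 10.0 toward a target of 6.0 in steps of at most 1.5.
limit, history = 10.0, []
for _ in range(4):
    limit = contract_upper_limit(0.1, 6.0, limit, 1.5)
    history.append(limit)
```

Under this reading a tightened target is approached over several cycles instead of being applied in one step, while a relaxed target takes effect immediately because the maximum then selects the target magnitude.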

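[0063] After the achievable response upper limit of the current control cycle is obtained, the frequency regulation target power change is verified against it in amplitude, in the same direction. When the amplitude of the frequency regulation target power change exceeds the amplitude of the achievable response upper limit, the target power change is limited synchronously to that upper limit while its direction is kept unchanged.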

[0064] As shown in Figure 7, the "baseline power" is the center line; the area above it represents the achievable upward adjustment margin, and the area below it represents the achievable downward adjustment margin. The dashed arrows represent the target (unconstrained) increase or decrease in frequency regulation power, and the solid arrows represent the actual (constrained) output. In the "traditional hard limiting" case on the left, when the heating-side constraint contracts suddenly, the target command is still too large and is truncated directly at the end, so the output jumps in a step at the limit value. The right side shows this application: at time t0, the reflective agent determines the restriction level and generates the upward and downward adjustment allowance coefficients, forming an asymmetric envelope of achievable upward and downward margins; when the constraint tightens (t1, t2) and the allowance coefficients contract, the achievable response upper limit converges cycle by cycle according to the continuous change rule, and the output arrow shortens smoothly with the envelope instead of "hitting the wall". This application can therefore express the difference between upward and downward adjustment capability explicitly while satisfying the heating boundary, reducing the risk of abrupt command changes and strengthening the verifiable correspondence between the command limitation and the heating constraint.

[0065] The generation of the frequency regulation power command and the construction of the restriction explanation quantity are performed once per control cycle. The following are read: the frequency regulation target power change, which is the target power change generated by the allowance-coefficient piecewise droop; the achievable response upper limit, which is the directional power upper limit corresponding to the restriction level and jointly constrained by the upward and downward adjustment allowance coefficients; the restriction reason label, which is the reason label of the path selected by the reflective agent; and the upward and downward adjustment allowance coefficients, which are obtained by looking up the restriction level in the preset mapping table.

[0066] Direction extraction is performed on the frequency regulation target power change using a preset direction threshold, to determine the direction of the change. The preset direction threshold has the same dimension as the power change. Three cases are distinguished by comparing the target power change with the preset direction threshold, and a direction value is assigned in each case. The direction determination result and the key amplitude values are combined into a direction verification data set, which is used for uniformly selecting the limiting benchmark and verifying the amplitude.
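A three-way direction extraction of this kind can be sketched in Python. The specific conditions and the values +1, -1 and 0 are assumptions for illustration, since the conditions are not legible in this text; the layout of the verification record is likewise hypothetical.

```python
def extract_direction(delta_p, threshold):
    """Return +1 (upward), -1 (downward) or 0 (no adjustment).

    A symmetric band of width `threshold` around zero maps to 0
    (assumed convention for this sketch).
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    if delta_p > threshold:
        return 1
    if delta_p < -threshold:
        return -1
    return 0

def direction_record(delta_p, threshold):
    """Bundle the direction with the amplitudes used later for limiting."""
    return [extract_direction(delta_p, threshold), abs(delta_p), threshold]
```

A non-zero threshold keeps measurement noise around zero from toggling the direction between cycles.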

[0067] A limiting benchmark is selected according to the direction. The limiting benchmark is an amplitude quantity determined by the achievable response upper limit, and it is assigned case by case for the three direction values. The amplitude of the frequency regulation target power change is then verified against the limiting benchmark by comparing the two. When the amplitude of the target power change exceeds the limiting benchmark, the limited frequency regulation target power change is generated with its direction constrained by the determined direction and its amplitude equal to the limiting benchmark; when it does not exceed the limiting benchmark, the limited frequency regulation target power change is set equal to the frequency regulation target power change. The key quantities of the limiting calculation are combined into a limiting result vector, so that the direction and amplitude constraints of the limiting process are traceable.

[0068] The frequency regulation power command is generated from the limited frequency regulation target power change. The power reference setpoint of the turbine speed regulation system for the current control cycle is read; this is the power reference setpoint provided for the speed regulation system. The power reference setpoint and the limited frequency regulation target power change are superimposed to obtain the frequency regulation power command, which is the power setting issued to the turbine speed regulation system. To construct the restriction explanation quantity, the restriction reason label is mapped to a restriction reason code according to a preset coding table, which maps restriction reason labels to restriction reason codes; the code is an integer or a fixed-point number. The explanation quantities are arranged in a fixed order into two restriction explanation vectors, which are assembled in a fixed row order into the restriction explanation matrix, one vector per row.
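The superposition and the explanation matrix can be sketched as follows. The coding table contents and all names are hypothetical. The row layout used here (reason code and the two allowance coefficients in the first row; direction, limiting benchmark and limited target power change in the second) follows the 2x3 description given for the matrix in paragraph [0078].

```python
# Hypothetical coding table from restriction reason labels to integer codes.
REASON_CODES = {"none": 0, "pressure": 1, "temperature": 2, "flow": 3}

def build_command_and_explanation(base_setpoint, limited_delta_p, reason_label,
                                  allow_up, allow_down, direction, benchmark):
    """Return the power command and the 2x3 restriction explanation matrix."""
    command = base_setpoint + limited_delta_p  # superposition on the reference setpoint
    explanation = [
        [REASON_CODES[reason_label], allow_up, allow_down],
        [direction, benchmark, limited_delta_p],
    ]
    return command, explanation

command, explanation = build_command_and_explanation(
    base_setpoint=200.0, limited_delta_p=-5.0, reason_label="flow",
    allow_up=0.5, allow_down=1.0, direction=-1, benchmark=5.0,
)
```

Emitting the explanation together with the command means every issued setpoint carries the reason and the limits that shaped it.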

[0069] In some embodiments, as shown in Figure 8, the step of inputting the frequency regulation power command into the turbine speed regulation system for execution, and collecting the key quantities of extraction steam heating after execution to obtain the execution feedback heating state, includes:
S601, inputting the frequency regulation power command into the turbine speed regulation system, which converts the command into a speed regulation execution quantity to drive the turbine to complete the active power response;
S602, starting the timing logic after the frequency regulation power command is input to the turbine speed regulation system;
S603, when the timed duration reaches the preset execution confirmation time threshold, collecting the key quantities of extraction steam heating after execution to obtain the current values of heating pressure, heating temperature, and heating flow; the preset execution confirmation time threshold is used to match the response lag characteristics of the turbine speed regulation system;
S604, packaging the current value of heating pressure, the current value of heating temperature, and the current value of heating flow in a fixed order to form the execution feedback heating state.

[0070] Specifically, the frequency regulation power command is input into the turbine speed regulation system, which converts the command into a speed regulation execution quantity to drive the turbine power response. Once the preset execution confirmation time threshold has elapsed after the command is input, the key quantities of extraction steam heating are collected: the current values of heating pressure, heating temperature, and heating flow. These current values are then packaged in a fixed order to form the execution feedback heating state, which is output.

[0071] For example, the execution feedback heating state is generated once per control cycle. The frequency regulation power command, which is a power setting, is read and input to the turbine speed regulation system. The turbine speed regulation system reads the unit active power feedback, which is the measured generator power, and calculates the power deviation as the difference between the frequency regulation power command and the unit active power feedback. It applies a proportional calculation to the power deviation to obtain the proportional term, and accumulates the power deviation over the sampling period to obtain the integral term. The proportional term and the integral term are added to give the unlimited execution quantity, which is then saturation-limited to the valve position constraint range to obtain the speed regulation execution quantity. The speed regulation execution quantity is the valve opening setpoint or guide vane opening setpoint, and the valve position constraint range is given by the actuator stroke. The turbine speed regulation system sends the speed regulation execution quantity to the actuators to drive the turbine power response. The execution-related data are arranged in a fixed order to form an execution drive vector.
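The proportional-plus-integral calculation with saturation limiting can be sketched in Python. The class name, the gains, and the discrete integration by rectangular summation are assumptions for illustration; the text does not state gain values and does not mention anti-windup, so none is included.

```python
class SpeedGovernorPI:
    """Discrete PI calculation followed by saturation to the valve range."""

    def __init__(self, kp, ki, sample_period, valve_min, valve_max):
        self.kp = kp
        self.ki = ki
        self.sample_period = sample_period
        self.valve_min = valve_min
        self.valve_max = valve_max
        self.integral = 0.0  # running sum of deviation x sampling period

    def step(self, power_command, power_feedback):
        """Return the saturated speed regulation execution quantity."""
        deviation = power_command - power_feedback
        self.integral += deviation * self.sample_period
        unlimited = self.kp * deviation + self.ki * self.integral
        return min(max(unlimited, self.valve_min), self.valve_max)

governor = SpeedGovernorPI(kp=0.5, ki=0.25, sample_period=1.0,
                           valve_min=0.0, valve_max=100.0)
first = governor.step(power_command=10.0, power_feedback=6.0)
second = governor.step(power_command=10.0, power_feedback=8.0)
```

In this form the integral keeps accumulating while the output is saturated, which is a known limitation of the plain scheme and worth checking in any real implementation.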

[0072] When the frequency regulation power command is input, the command issuance time is recorded; it is provided by the control system clock. The preset execution confirmation time threshold, a fixed duration parameter, is read. The execution confirmation acquisition time is set to the sum of the command issuance time and the preset execution confirmation time threshold. The time-related quantities are arranged in a fixed order to form a time vector, and one heating-side data acquisition is triggered at the execution confirmation acquisition time.

[0073] At the execution confirmation acquisition time, the key quantities of extraction steam heating are collected: the current value of heating pressure, the current value of heating temperature, and the current value of heating flow. During acquisition, the raw output of each sensor is read first, namely the raw output of the heating pressure sensor, of the heating temperature sensor, and of the heating flow meter. The linear calibration coefficient and the zero offset of each sensor are then read, and a linear transformation using the corresponding calibration coefficient and zero offset converts each raw output into the current value. The raw outputs and the converted results are arranged in a fixed order to form an acquisition data matrix, whose first row holds the raw outputs and whose second row holds the converted results.
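The linear conversion can be sketched as follows. The form `gain * raw + offset` is an assumption; the text only states that a linear transformation with a calibration coefficient and a zero offset is applied. The numeric values are hypothetical.

```python
def calibrate(raw, gain, offset):
    """Linear calibration, assumed here to be gain * raw + offset."""
    return gain * raw + offset

def collect_heating_state(raw_outputs, gains, offsets):
    """Convert raw sensor outputs ordered as [pressure, temperature, flow].

    Returns (state, acquisition_matrix); the matrix holds the raw outputs in
    its first row and the converted values in its second row.
    """
    state = [calibrate(r, g, o) for r, g, o in zip(raw_outputs, gains, offsets)]
    return state, [list(raw_outputs), state]

state, acquisition = collect_heating_state(
    raw_outputs=[4.0, 8.0, 12.0],
    gains=[0.5, 25.0, 10.0],
    offsets=[0.0, -100.0, 5.0],
)
```

Keeping the raw row next to the converted row lets a later check reproduce the conversion from the recorded data alone.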

[0074] The three current values are packaged in a fixed order (heating pressure, heating temperature, heating flow) to form the execution feedback heating state, which is represented by the execution feedback heating state vector. The execution confirmation acquisition time and the execution feedback heating state vector are arranged in a fixed order to form the execution feedback record vector. The execution drive vector and the execution feedback record vector are then arranged in a fixed row order to form the execution feedback matrix, whose first row is the execution drive vector and whose second row is the execution feedback record vector.

[0075] Figure 9 compares two limiting methods. On the left, the "hard limiting jump" method directly truncates the target power change when the allowance coefficient tightens or a boundary is triggered, causing abrupt output changes that can easily lead to command jitter, tracking overshoot, or control loop oscillation. On the right is the mechanism of this application: when the upward or downward adjustment allowance coefficient contracts, the achievable response upper limit shrinks smoothly, cycle by cycle, according to the continuous change rule, and the target power change is limited synchronously within the gradually shrinking achievable domain, without abrupt truncation. The advantages are a lower risk of abrupt command changes, better stability and predictability of the frequency regulation process, and less impact on the actuators and the power grid.

[0076] In some embodiments, as shown in Figure 10, updating the heating constraint state set based on the execution feedback heating state includes:
S701, obtaining the preset heating pressure boundary, heating temperature boundary, and heating flow boundary;
S702, combining, in a fixed order, the current value of heating pressure with the upper and lower limits of the heating pressure boundary, the current value of heating temperature with the upper and lower limits of the heating temperature boundary, and the current value of heating flow with the upper and lower limits of the heating flow boundary, to form the updated heating constraint state set.

[0077] In this application, the heating pressure boundary, heating temperature boundary, and heating flow boundary corresponding to the execution feedback heating state are obtained. The current value of heating pressure in the execution feedback heating state is combined with the upper and lower limits of the heating pressure boundary, the current value of heating temperature with the upper and lower limits of the heating temperature boundary, and the current value of heating flow with the upper and lower limits of the heating flow boundary, in a fixed order, to form the updated heating constraint state set. The frequency regulation power command, the restriction explanation quantity, and the updated heating constraint state set are output.

[0078] For example, the heating constraint state set is updated once per control cycle. The execution feedback heating state vector is read; it consists of the current value of heating pressure, the current value of heating temperature, and the current value of heating flow, in that fixed order. The frequency regulation power command, which is the power setting for the turbine speed regulation system, is read. The restriction explanation matrix is read; it is a 2x3 matrix whose first row holds the restriction reason code, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient, and whose second row holds the direction of the frequency regulation target power change, the limiting benchmark, and the limited frequency regulation target power change.

[0079] To obtain the heating pressure boundary, heating temperature boundary, and heating flow boundary corresponding to the execution feedback heating state vector, the heating boundary parameter table is read. The table stores the upper and lower limits of the heating pressure boundary, the upper and lower limits of the heating temperature boundary, and the upper and lower limits of the heating flow boundary. The heating condition identifier is also read; it is provided by the unit control system and is used to index the boundary entries in the heating boundary parameter table. The six boundary quantities are extracted from the table according to the heating condition identifier and arranged in a fixed order into a boundary vector. The upper limits are arranged in a fixed order into an upper limit vector and the lower limits into a lower limit vector, and the two are stacked by row into a boundary alignment matrix: a two-row, three-column matrix whose first row is the upper limit vector and whose second row is the lower limit vector.

[0080] The execution feedback heating state vector and the boundary vector are combined in a fixed order to form the updated heating constraint state set, which is represented by the boundary proximity feature matrix Mh and its expanded vector. Mh is constructed as follows: the current value, boundary upper limit, and boundary lower limit of the heating pressure are written to the first row; those of the heating temperature to the second row; and those of the heating flow to the third row. Mh is thus a 3x3 matrix in which each row is ordered as current value, upper limit, lower limit. Expanding Mh in row order gives a nine-dimensional vector whose nine components are in a fixed order. In addition, the execution feedback heating state vector, the upper limit vector, and the lower limit vector are stacked in a fixed row order to form the heating alignment matrix, whose first row is the execution feedback heating state vector, whose second row is the upper limit vector, and whose third row is the lower limit vector.
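The construction of Mh and its row-order expansion can be sketched in Python. The function name and the numeric values are hypothetical; the row and column order follows the description above.

```python
def build_constraint_state(state, upper, lower):
    """Build the 3x3 matrix Mh and its row-order expansion.

    state, upper and lower are ordered as [pressure, temperature, flow].
    Each row of Mh is [current value, upper limit, lower limit].
    """
    if not (len(state) == len(upper) == len(lower) == 3):
        raise ValueError("three quantities are required: pressure, temperature, flow")
    matrix = [[cur, hi, lo] for cur, hi, lo in zip(state, upper, lower)]
    flat = [value for row in matrix for value in row]
    return matrix, flat

# Hypothetical values in arbitrary units.
mh, mh_flat = build_constraint_state(
    state=[0.9, 120.0, 300.0],
    upper=[1.2, 130.0, 400.0],
    lower=[0.6, 90.0, 150.0],
)
```

Placing each current value beside its own limits makes the distance to either boundary readable from a single row.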

[0081] The restriction explanation matrix is expanded in row order into a six-dimensional restriction explanation vector whose six components are in a fixed order. The frequency regulation power command, the restriction explanation vector, and the expanded vector of Mh are spliced in a fixed order into a sixteen-dimensional output combination vector.
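The dimension count (1 + 6 + 9 = 16) and the splicing order can be checked with a short Python sketch. The function name and the example values are hypothetical.

```python
def build_output_vector(command, explanation_matrix, constraint_vector):
    """Splice command (1), explanation vector (6) and expanded Mh (9) into 16 components."""
    explanation = [value for row in explanation_matrix for value in row]
    if len(explanation) != 6 or len(constraint_vector) != 9:
        raise ValueError("expected a 2x3 explanation matrix and a 9-component constraint vector")
    return [command] + explanation + list(constraint_vector)

output = build_output_vector(
    command=195.0,
    explanation_matrix=[[3, 0.5, 1.0], [-1, 5.0, -5.0]],
    constraint_vector=[0.9, 1.2, 0.6, 120.0, 130.0, 90.0, 300.0, 400.0, 150.0],
)
```

A fixed order allows a receiver to address each component by position without extra metadata.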

[0082] As shown in Figure 11, this application provides a primary frequency regulation device for thermal power units based on constrained discrimination by a reflective agent. The device includes:
The data acquisition module 1101 is used to acquire the power grid frequency difference, the heating constraint state set, and the command direction identifier;
The constraint determination module 1102 is used to generate a candidate restricted conclusion set based on the heating constraint state set and a preset rule chain, extract heating change direction features from the heating constraint state set using a reflective agent, and perform consistency determination on each candidate path in the candidate restricted conclusion set based on the heating change direction features and the command direction identifier; the candidate restricted conclusion set includes multiple candidate paths, and each candidate path includes a restriction level and a restriction reason label;
The constraint output module 1103 is used to determine the restriction level of a candidate path when the consistency determination result corresponding to that candidate path meets the preset determination threshold; otherwise, to switch the candidate path or regenerate the candidate restricted conclusion set and perform consistency determination again, until the consistency determination result meets the determination threshold or the preset maximum number of loops is reached, and to determine the final restriction level; and, based on the final restriction level, to output the upward adjustment allowance coefficient, the downward adjustment allowance coefficient, and the restriction reason label;
The mapping generation module 1104 is used to generate an adaptive piecewise droop mapping based on the power grid frequency difference, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient;
The parameter calculation module 1105 is used to calculate the initial frequency modulation target power change based on the adaptive piecewise droop mapping, and to determine the initial achievable response upper limit according to the final restriction level and the adaptive piecewise droop mapping; the initial achievable response upper limit includes an initial upward achievable response upper limit and an initial downward achievable response upper limit;
The power smoothing update module 1106 is used, in response to a contraction of the upward adjustment allowance coefficient or the downward adjustment allowance coefficient relative to the previous control cycle, to contract the initial achievable response upper limit and update the initial frequency modulation target power change according to a preset continuous change rule, thereby obtaining the final frequency modulation target power change and the final achievable response upper limit; otherwise, the initial achievable response upper limit is directly determined as the final achievable response upper limit, and the initial frequency modulation target power change is determined as the final frequency modulation target power change; wherein the continuous change rule is: the initial achievable response upper limit is multiplied by the contraction ratio of the upward adjustment allowance coefficient or the downward adjustment allowance coefficient, so that it contracts proportionally;
The command generation module 1107 is used to obtain the frequency modulation power command based on the final frequency modulation target power change and the final achievable response upper limit;
The interpretation quantity generation module 1108 is used to combine the restriction reason label, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient to obtain the restricted interpretation quantity;
The feedback acquisition module 1109 is used to input the frequency modulation power command into the turbine speed regulation system for execution, and to collect the key quantities of extraction steam heating after execution to obtain the execution feedback heating state;
The state update module 1110 is used to update the heating constraint state set based on the execution feedback heating state;
The data output module 1111 is used to output the updated heating constraint state set, the restricted interpretation quantity, and the frequency modulation power command.
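The interplay between the constraint determination module and the constraint output module can be sketched as a bounded search loop. This is a minimal illustration and not the applicant's implementation: `CandidatePath`, `score_fn` (standing in for the consistency determination), `regenerate_fn` (standing in for the rule chain), and the fallback to the best-scoring path once the loop budget is exhausted are all assumptions, since the text only states that a final restriction level is determined.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class CandidatePath:
    level: int   # restriction level (hypothetical integer encoding)
    reason: str  # restriction reason label


def decide_restriction(
    candidates: List[CandidatePath],
    score_fn: Callable[[CandidatePath], float],
    regenerate_fn: Callable[[], List[CandidatePath]],
    threshold: float,
    max_loops: int,
) -> Tuple[CandidatePath, float]:
    """Evaluate one candidate path per loop until one meets the threshold.

    Assumption: when the loop budget runs out, the best-scoring path seen
    so far is returned as the final decision.
    """
    best: Optional[CandidatePath] = None
    best_score = float("-inf")
    queue = list(candidates)
    for _ in range(max_loops):
        if not queue:
            # All paths tried: regenerate the candidate restricted conclusion set.
            queue = list(regenerate_fn())
            if not queue:
                break
        path = queue.pop(0)  # switch to the next candidate path
        score = score_fn(path)
        if score >= threshold:
            return path, score
        if score > best_score:
            best, best_score = path, score
    if best is None:
        raise ValueError("no candidate path was evaluated")
    return best, best_score
```

The returned path carries both the restriction level and the restriction reason label, so the caller can derive the two allowance coefficients and the explanatory output from a single object.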

[0083] According to embodiments of this application, this application also provides an electronic device and a readable storage medium.

[0084] The electronic device includes at least one processor and a memory communicatively connected to the at least one processor. The memory stores instructions executable by the at least one processor, which, when executed, enable the at least one processor to perform the primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent as described in this application. The readable storage medium stores computer instructions, which are used to cause a computer to perform the same method.

[0085] This application also provides a computer program product, including a computer program / instruction, which, when executed by a processor, implements the primary frequency regulation method for thermal power units based on the constrained discrimination of a reflective intelligent agent of this application.

[0086] Figure 12 shows a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of this application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the application described and/or claimed herein.

[0087] As shown in Figure 12, the electronic device 800 includes a computing unit 801, which can perform various appropriate actions and processes based on a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 may also store various programs and data required for the operation of the electronic device 800. The computing unit 801, ROM 802, and RAM 803 are interconnected via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

[0088] Multiple components in electronic device 800 are connected to I / O interface 805, including: input unit 806, such as keyboard, mouse, etc.; output unit 807, such as various types of displays, speakers, etc.; storage unit 808, such as disk, optical disk, etc.; and communication unit 809, such as network card, modem, wireless transceiver, etc. Communication unit 809 allows electronic device 800 to exchange information / data with other devices through computer networks such as the Internet and / or various telecommunications networks.

[0089] The computing unit 801 can be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 801 performs the various methods and processes described above, such as the primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent. For example, in some embodiments, this method can be implemented as a computer software program that is tangibly contained in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed on the electronic device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the method described above can be performed. Alternatively, in other embodiments, the computing unit 801 may be configured by any other suitable means (e.g., by means of firmware) to perform the primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent.

[0090] Various embodiments of the systems and techniques described herein can be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-a-chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These embodiments may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transmitting data and instructions to the storage system, the at least one input device, and the at least one output device.

[0091] The program code used to implement the methods of this application may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that when executed by the processor or controller, the functions / operations specified in the flowcharts and / or block diagrams are implemented. The program code may be executed entirely on a machine, partially on a machine, as a standalone software package partially on a machine and partially on a remote machine, or entirely on a remote machine or server.

[0092] In the context of this application, a machine-readable medium can be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media can be, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

[0093] To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having: a display device for displaying information to the user (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor); and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the computer. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).

[0094] The systems and technologies described herein can be implemented in computing systems that include backend components (e.g., as a data server), or computing systems that include middleware components (e.g., an application server), or computing systems that include frontend components (e.g., a user computer with a graphical user interface or web browser through which a user can interact with implementations of the systems and technologies described herein), or any combination of such backend, middleware, or frontend components. The components of the system can be interconnected via digital data communication of any form or medium (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.

[0095] Computer systems can include clients and servers. Clients and servers are generally located far apart and typically interact via communication networks. Client-server relationships are created by computer programs running on the respective computers and having a client-server relationship with each other. Servers can be cloud servers, servers in distributed systems, or servers incorporating blockchain technology.

[0096] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent, characterized in that the method includes:
Acquiring the power grid frequency difference, the heating constraint state set, and the command direction identifier;
Generating a candidate restricted conclusion set based on the heating constraint state set and a preset rule chain, extracting heating change direction features from the heating constraint state set using a reflective agent, and performing consistency determination on each candidate path in the candidate restricted conclusion set based on the heating change direction features and the command direction identifier; the candidate restricted conclusion set includes multiple candidate paths, and each candidate path includes a restriction level and a restriction reason label;
When the consistency determination result corresponding to a candidate path meets the preset determination threshold, determining the restriction level corresponding to that candidate path; otherwise, switching the candidate path or regenerating the candidate restricted conclusion set and performing consistency determination again, until the consistency determination result meets the determination threshold or the preset maximum number of loops is reached, and determining the final restriction level; based on the final restriction level, outputting the upward adjustment allowance coefficient, the downward adjustment allowance coefficient, and the restriction reason label;
Generating an adaptive piecewise droop mapping based on the power grid frequency difference, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient;
Calculating the initial frequency modulation target power change based on the adaptive piecewise droop mapping, and determining the initial achievable response upper limit according to the final restriction level and the adaptive piecewise droop mapping; the initial achievable response upper limit includes an initial upward achievable response upper limit and an initial downward achievable response upper limit;
When the upward adjustment allowance coefficient or the downward adjustment allowance coefficient contracts relative to the previous control cycle, contracting the initial achievable response upper limit and updating the initial frequency modulation target power change according to a preset continuous change rule, to obtain the final frequency modulation target power change and the final achievable response upper limit; otherwise, directly determining the initial achievable response upper limit as the final achievable response upper limit, and the initial frequency modulation target power change as the final frequency modulation target power change; wherein the continuous change rule is: the initial achievable response upper limit is multiplied by the contraction ratio of the upward adjustment allowance coefficient or the downward adjustment allowance coefficient, so that it contracts proportionally;
Obtaining the frequency modulation power command based on the final frequency modulation target power change and the final achievable response upper limit;
Combining the restriction reason label, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient to obtain the restricted interpretation quantity;
Inputting the frequency modulation power command into the turbine speed regulation system for execution, and collecting the key quantities of extraction steam heating after execution to obtain the execution feedback heating state;
Updating the heating constraint state set based on the execution feedback heating state;
Outputting the updated heating constraint state set, the restricted interpretation quantity, and the frequency modulation power command.
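The continuous change rule recited in claim 1 can be illustrated with a short sketch. Several points are assumptions rather than statements of the claim: the contraction ratio is taken as the current coefficient divided by the previous one, the target power change is scaled by the ratio of its own direction, positive power changes denote upward adjustment, and the names `Limits`, `shrink_ratio`, and `apply_continuous_change` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Limits:
    up: float    # upward achievable response upper limit (magnitude, MW)
    down: float  # downward achievable response upper limit (magnitude, MW)


def shrink_ratio(current: float, previous: float) -> float:
    """Assumed definition: current / previous, applied only on contraction."""
    if previous <= 0.0 or current >= previous:
        return 1.0  # no contraction relative to the previous control cycle
    return max(current, 0.0) / previous


def apply_continuous_change(
    target_delta_p: float,
    limits: Limits,
    k_up: float,
    k_up_prev: float,
    k_down: float,
    k_down_prev: float,
) -> Tuple[float, Limits]:
    """Contract limits proportionally instead of cutting them off abruptly."""
    r_up = shrink_ratio(k_up, k_up_prev)
    r_down = shrink_ratio(k_down, k_down_prev)
    final_limits = Limits(up=limits.up * r_up, down=limits.down * r_down)
    # Assumption: the target is scaled by the ratio of its own direction.
    ratio = r_up if target_delta_p >= 0.0 else r_down
    return target_delta_p * ratio, final_limits
```

When neither coefficient contracts, both ratios are 1.0 and the initial values pass through unchanged, which matches the "otherwise" branch of the claim.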

2. The primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent according to claim 1, characterized in that acquiring the power grid frequency difference, the heating constraint state set, and the command direction identifier includes:
Calculating the power grid frequency difference based on the collected real-time frequency and the rated frequency of the power grid; after the power grid frequency difference is normalized, its amplitude is segmented to obtain the power grid frequency difference quantity;
Collecting the current key quantities of extraction steam heating, the heating pressure boundary, the heating temperature boundary, and the heating flow rate boundary; the current key quantities of extraction steam heating include the current values of heating pressure, heating temperature, and heating flow rate; the heating pressure boundary includes an upper limit and a lower limit of heating pressure; the heating temperature boundary includes an upper limit and a lower limit of heating temperature; and the heating flow rate boundary includes an upper limit and a lower limit of heating flow rate;
Packaging the current key quantities of extraction steam heating, the heating pressure boundary, the heating temperature boundary, and the heating flow rate boundary in a fixed order to obtain the heating constraint state set;
Obtaining the frequency modulation power command from the previous control cycle, determining its direction based on a preset direction threshold to obtain a direction determination result, and determining the command direction identifier based on the direction determination result.
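As a hedged illustration of the acquisition step in claim 2, the sketch below computes a normalized, segmented frequency difference and a command direction identifier. The sign convention (real-time minus rated), the normalization base, the segment edges, the direction threshold, and the -1/0/+1 encoding are all hypothetical values chosen for the example; only the 50 Hz rated frequency of the Chinese grid is a known fact.

```python
from bisect import bisect_right
from typing import Tuple

RATED_HZ = 50.0               # rated grid frequency in China
NORM_BASE_HZ = 0.5            # hypothetical normalization base
SEGMENT_EDGES = (0.066, 0.4)  # hypothetical edges on |normalized difference|


def frequency_difference_quantity(f_hz: float) -> Tuple[float, int]:
    """Return (normalized frequency difference, amplitude segment index)."""
    delta = f_hz - RATED_HZ  # sign convention is an assumption
    norm = max(-1.0, min(1.0, delta / NORM_BASE_HZ))
    # Segment 0 is below the first edge, segment 1 between edges, and so on.
    segment = bisect_right(SEGMENT_EDGES, abs(norm))
    return norm, segment


def direction_identifier(prev_command_mw: float, threshold_mw: float = 0.5) -> int:
    """Classify the previous command as upward (+1), downward (-1), or neutral (0)."""
    if prev_command_mw > threshold_mw:
        return 1
    if prev_command_mw < -threshold_mw:
        return -1
    return 0
```

Using a threshold band around zero keeps small residual commands from flipping the direction identifier between control cycles.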

3. The primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent according to claim 2, characterized in that generating a candidate restricted conclusion set based on the heating constraint state set and a preset rule chain, extracting heating change direction features from the heating constraint state set using a reflective agent, and performing consistency determination on each candidate path in the candidate restricted conclusion set based on the heating change direction features and the command direction identifier includes:
The reflective agent extracts the current value, upper limit, and lower limit of heating pressure, the current value, upper limit, and lower limit of heating temperature, and the current value, upper limit, and lower limit of heating flow rate from the heating constraint state set, and forms a nine-dimensional vector in a fixed order;
Based on the heating constraint state set, the heating change direction features are extracted from the direction of change between adjacent control cycles, and a three-dimensional vector is formed in the fixed order of heating pressure change direction, heating temperature change direction, and heating flow rate change direction;
The nine-dimensional vector, the three-dimensional vector, and the one-dimensional command direction identifier are concatenated in a fixed order to obtain a thirteen-dimensional input vector, which is then input into the consistency determination network;
The consistency determination network performs forward computation on the thirteen-dimensional input vector and, after bounded processing, outputs a consistency score vector; each dimension of the consistency score vector corresponds to one candidate path in the candidate restricted conclusion set;
The maximum value in the consistency score vector is compared with the preset determination threshold; if the maximum value is greater than or equal to the preset determination threshold, the candidate path corresponding to the maximum value is selected to obtain the consistency determination result.
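The vector assembly and path selection described in claim 3 can be sketched as follows. The encoding of a change direction as -1, 0, or +1, the tolerance `eps`, and the function names are assumptions made for the example; the index positions 0, 3, and 6 follow from the fixed order of the nine-dimensional vector stated in the claim.

```python
import numpy as np


def direction_features(curr, prev, eps=1e-6):
    """Sign of change between adjacent control cycles (pressure, temperature, flow)."""
    diff = np.asarray(curr, dtype=float) - np.asarray(prev, dtype=float)
    # Changes smaller than eps are treated as "no change" (assumption).
    return np.where(np.abs(diff) <= eps, 0.0, np.sign(diff))


def build_input(state9, prev_state9, direction_id):
    """Concatenate 9 state values, 3 direction features, and 1 direction identifier."""
    state9 = np.asarray(state9, dtype=float)
    prev_state9 = np.asarray(prev_state9, dtype=float)
    current_idx = [0, 3, 6]  # current pressure, temperature, flow rate
    feats = direction_features(state9[current_idx], prev_state9[current_idx])
    x = np.concatenate([state9, feats, [float(direction_id)]])
    assert x.shape == (13,)
    return x


def select_path(scores, threshold):
    """Return the index of the best candidate path, or None if below threshold."""
    idx = int(np.argmax(scores))
    return idx if scores[idx] >= threshold else None
```

A `None` result corresponds to the case in which the method switches the candidate path or regenerates the candidate restricted conclusion set.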

4. The primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent according to claim 3, characterized in that the consistency determination network includes:
A first fully connected layer with thirty-two neurons, a second fully connected layer with sixteen neurons, a third fully connected layer with eight neurons, and an output layer;
The number of neurons in the output layer is the same as the number of candidate paths in the candidate restricted conclusion set;
Each neuron consists of a weighted summation unit and a nonlinear transformation unit; the weighted summation unit computes a weighted combination of the outputs of the previous layer and adds a bias;
The nonlinear transformation units of the first, second, and third fully connected layers use the ReLU activation function to perform nonlinear mapping on the weighted summation result;
The nonlinear transformation unit of the output layer uses the Sigmoid activation function to map the weighted summation result, so that the consistency score vector is bounded within the interval from 0 to 1.
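The layer structure in claim 4 (13 inputs, then 32, 16, and 8 ReLU neurons, then a Sigmoid output with one neuron per candidate path) can be written as a plain NumPy forward pass. The random initialization below is only a placeholder so the example runs; training and the actual weights are not covered here, and `init_params` and `forward` are hypothetical names.

```python
import numpy as np

LAYER_SIZES = (13, 32, 16, 8)  # input width and the three hidden layers


def init_params(n_paths, seed=0):
    """Create placeholder (weight, bias) pairs; real weights would come from training."""
    rng = np.random.default_rng(seed)
    sizes = LAYER_SIZES + (n_paths,)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x):
    """Weighted sum plus bias per layer, ReLU on hidden layers, Sigmoid on output."""
    h = np.asarray(x, dtype=float)
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU
    w, b = params[-1]
    z = h @ w + b
    return 1.0 / (1.0 + np.exp(-z))     # Sigmoid bounds each score to (0, 1)
```

Because each output uses its own Sigmoid rather than a softmax, the scores are independent per candidate path, which is consistent with comparing the maximum score against a fixed threshold.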

5. The primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent according to claim 1, characterized in that generating an adaptive piecewise droop mapping based on the power grid frequency difference, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient includes:
Determining the segmentation structure based on the amplitude of the power grid frequency difference and a preset segmentation threshold set, and determining the frequency difference interval corresponding to each segment in the segmentation structure; the frequency difference intervals include the primary frequency regulation dead zone interval, the linear adjustment interval, and the amplitude limiting interval; the segmentation structure includes multiple upward adjustment segments and multiple downward adjustment segments;
For the upward adjustment direction of the segmentation structure, mapping the upward adjustment allowance coefficient to the first segment slope parameter and the segment upper limit parameter of each upward adjustment segment to generate an upward adjustment segment mapping;
For the downward adjustment direction of the segmentation structure, mapping the downward adjustment allowance coefficient to the second segment slope parameter and the segment lower limit parameter of each downward adjustment segment to generate a downward adjustment segment mapping;
Concatenating the upward adjustment segment mapping and the downward adjustment segment mapping to form the adaptive piecewise droop mapping.
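A simplified sketch of an adaptive piecewise droop mapping is given below. It uses one dead zone, one linear segment, and one limiting segment per direction, whereas the claim allows multiple segments per direction. The way each allowance coefficient scales the slope and the limit (plain multiplication), the numeric defaults in `DroopConfig`, and the convention that under-frequency calls for upward adjustment with a positive power change are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DroopConfig:
    dead_band_hz: float = 0.033           # hypothetical dead zone half-width
    base_slope_mw_per_hz: float = 200.0   # hypothetical unrestricted slope
    base_limit_mw: float = 30.0           # hypothetical unrestricted limit


def make_droop_mapping(k_up: float, k_down: float, cfg: DroopConfig = DroopConfig()):
    """Build a mapping from frequency difference (Hz) to target power change (MW)."""

    def mapping(delta_f_hz: float) -> float:
        mag = abs(delta_f_hz)
        if mag <= cfg.dead_band_hz:
            return 0.0  # dead zone interval: no response
        upward = delta_f_hz < 0.0  # under-frequency asks for more power
        k = k_up if upward else k_down
        slope = k * cfg.base_slope_mw_per_hz  # direction-specific slope
        limit = k * cfg.base_limit_mw         # direction-specific limit
        # Linear interval, capped by the amplitude limiting interval.
        amount = min(slope * (mag - cfg.dead_band_hz), limit)
        return amount if upward else -amount

    return mapping
```

Because the linear segment starts from zero at the dead zone edge, the mapping stays continuous in the frequency difference, and a smaller coefficient flattens the response in one direction without touching the other.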

6. The primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent according to claim 1, characterized in that obtaining the frequency modulation power command based on the final frequency modulation target power change and the final achievable response upper limit includes:
Obtaining the adjustment direction of the final frequency modulation target power change;
Selecting, according to the adjustment direction, the corresponding final upward achievable response upper limit or final downward achievable response upper limit as the limiting reference;
Comparing the absolute value of the final frequency modulation target power change with the limiting reference for amplitude verification;
When the absolute value of the final frequency modulation target power change exceeds the limiting reference, limiting the final frequency modulation target power change while keeping the adjustment direction unchanged, to obtain the limited frequency modulation target power change;
Generating the frequency modulation power command based on the limited frequency modulation target power change.
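The direction-preserving amplitude limiting in claim 6 reduces to a few lines. The sketch assumes that positive values denote upward adjustment and that both limits are non-negative magnitudes; `limit_command` is a hypothetical name.

```python
import math


def limit_command(final_delta_p: float, up_limit: float, down_limit: float) -> float:
    """Clamp the target power change to the limit of its own direction."""
    if final_delta_p == 0.0:
        return 0.0
    # Pick the limiting reference that matches the adjustment direction.
    reference = up_limit if final_delta_p > 0.0 else down_limit
    if abs(final_delta_p) > reference:
        # Keep the sign, replace the magnitude with the limiting reference.
        return math.copysign(reference, final_delta_p)
    return final_delta_p
```

For example, with an upward limit of 10 MW and a downward limit of 20 MW, a target of 12 MW is limited to 10 MW, while a target of -25 MW is limited to -20 MW.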

7. The primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent according to claim 2, characterized in that inputting the frequency modulation power command into the turbine speed regulation system for execution, and collecting the key quantities of extraction steam heating after execution to obtain the execution feedback heating state, includes:
Inputting the frequency modulation power command into the turbine speed regulation system, which converts the frequency modulation power command into a speed regulation execution quantity to drive the turbine to complete the active power response;
Starting the timing logic after the frequency modulation power command is input into the turbine speed regulation system;
Once the timing reaches the preset execution confirmation time threshold, collecting the key quantities of extraction steam heating to obtain the current values of heating pressure, heating temperature, and heating flow rate; the preset execution confirmation time threshold is used to match the response lag characteristics of the turbine speed regulation system;
Packaging the current values of heating pressure, heating temperature, and heating flow rate in a fixed order to form the execution feedback heating state.
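The timing logic of claim 7 can be sketched as a wait-then-sample routine. Everything about the interfaces is an assumption: `send_command` and `read_heating` are hypothetical callables standing in for the turbine speed regulation system and the heating instrumentation, and the clock and sleep functions are injectable so the routine can be exercised without real delays.

```python
import time
from typing import Callable, Tuple

HeatingState = Tuple[float, float, float]  # (pressure, temperature, flow rate)


def collect_feedback(
    send_command: Callable[[float], None],
    read_heating: Callable[[], HeatingState],
    command_mw: float,
    confirm_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    poll_s: float = 0.05,
) -> HeatingState:
    """Issue the command, wait for the confirmation time, then sample heating."""
    send_command(command_mw)
    start = clock()  # timing starts once the command has been input
    while True:
        remaining = confirm_s - (clock() - start)
        if remaining <= 0.0:
            break
        sleep(min(poll_s, remaining))
    pressure, temperature, flow = read_heating()
    # Fixed order: pressure, temperature, flow rate.
    return (pressure, temperature, flow)
```

Sampling only after the confirmation time has elapsed avoids reading heating values that do not yet reflect the turbine response to the command.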

8. The primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent according to claim 7, characterized in that updating the heating constraint state set based on the execution feedback heating state includes:
Obtaining the preset heating pressure boundary, heating temperature boundary, and heating flow rate boundary;
Combining, in a fixed order, the current value, upper limit, and lower limit of heating pressure, the current value, upper limit, and lower limit of heating temperature, and the current value, upper limit, and lower limit of heating flow rate to form the updated heating constraint state set.
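The state update in claim 8 amounts to interleaving the fed-back current values with the preset boundaries. The sketch below uses the same nine-element order recited in claim 3 (current value, upper limit, lower limit for pressure, then temperature, then flow rate); the `Bounds` type and the function name are hypothetical.

```python
from typing import NamedTuple, Tuple


class Bounds(NamedTuple):
    upper: float
    lower: float


def update_constraint_state(
    feedback: Tuple[float, float, float],
    pressure_b: Bounds,
    temperature_b: Bounds,
    flow_b: Bounds,
) -> Tuple[float, ...]:
    """Combine fed-back current values with preset boundaries in a fixed order."""
    p, t, q = feedback  # execution feedback: pressure, temperature, flow rate
    return (
        p, pressure_b.upper, pressure_b.lower,
        t, temperature_b.upper, temperature_b.lower,
        q, flow_b.upper, flow_b.lower,
    )
```

Keeping the order identical to the nine-dimensional vector means the updated set can be fed straight back into the next control cycle.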

9. A primary frequency regulation device for thermal power units based on constrained discrimination by a reflective agent, characterized in that the device includes:
A data acquisition module, used to acquire the power grid frequency difference, the heating constraint state set, and the command direction identifier;
A constraint determination module, used to generate a candidate restricted conclusion set based on the heating constraint state set and a preset rule chain, extract heating change direction features from the heating constraint state set using a reflective agent, and perform consistency determination on each candidate path in the candidate restricted conclusion set based on the heating change direction features and the command direction identifier; the candidate restricted conclusion set includes multiple candidate paths, and each candidate path includes a restriction level and a restriction reason label;
A constraint output module, used to determine the restriction level of a candidate path when the consistency determination result corresponding to that candidate path meets a preset determination threshold; otherwise, to switch the candidate path or regenerate the candidate restricted conclusion set and perform consistency determination again, until the consistency determination result meets the determination threshold or a preset maximum number of loops is reached, and to determine the final restriction level; and, based on the final restriction level, to output the upward adjustment allowance coefficient, the downward adjustment allowance coefficient, and the restriction reason label;
A mapping generation module, used to generate an adaptive piecewise droop mapping based on the power grid frequency difference, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient;
A parameter calculation module, used to calculate the initial frequency modulation target power change based on the adaptive piecewise droop mapping, and to determine the initial achievable response upper limit according to the final restriction level and the adaptive piecewise droop mapping; the initial achievable response upper limit includes an initial upward achievable response upper limit and an initial downward achievable response upper limit;
A power smoothing update module, used, in response to a contraction of the upward adjustment allowance coefficient or the downward adjustment allowance coefficient relative to the previous control cycle, to contract the initial achievable response upper limit and update the initial frequency modulation target power change according to a preset continuous change rule, thereby obtaining the final frequency modulation target power change and the final achievable response upper limit; otherwise, the initial achievable response upper limit is directly determined as the final achievable response upper limit, and the initial frequency modulation target power change is determined as the final frequency modulation target power change; wherein the continuous change rule is: the initial achievable response upper limit is multiplied by the contraction ratio of the upward adjustment allowance coefficient or the downward adjustment allowance coefficient, so that it contracts proportionally;
A command generation module, used to obtain the frequency modulation power command based on the final frequency modulation target power change and the final achievable response upper limit;
An interpretation quantity generation module, used to combine the restriction reason label, the upward adjustment allowance coefficient, and the downward adjustment allowance coefficient to obtain the restricted interpretation quantity;
A feedback acquisition module, used to input the frequency modulation power command into the turbine speed regulation system for execution, and to collect the key quantities of extraction steam heating after execution to obtain the execution feedback heating state;
A state update module, used to update the heating constraint state set based on the execution feedback heating state;
A data output module, used to output the updated heating constraint state set, the restricted interpretation quantity, and the frequency modulation power command.

10. An electronic device, characterized by comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, which, when executed by the at least one processor, enable the at least one processor to perform the primary frequency regulation method for thermal power units based on constrained discrimination by a reflective agent according to any one of claims 1 to 8.