An automatic data extraction method and system for water turbine characteristic curves
By establishing a mapping between pixels and the physical coordinate system and a regular sampling grid, combined with a neural network algorithm, the problem of low efficiency in manually selecting points for turbine characteristic curves was solved, achieving uniform data distribution and the construction of nonlinear models, thus supporting simulation modeling.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUNAN WULING POWER TECH CO LTD
- Filing Date
- 2026-03-17
- Publication Date
- 2026-06-12
AI Technical Summary
The current method of extracting characteristic curve data of water turbines relies on manual sampling, which results in low efficiency, uneven sampling, poor standardization, and the inability to be directly used for simulation modeling.
By establishing a mapping relationship between the image pixel coordinate system and the physical coordinate system, a regular sampling grid is constructed to collect and constrain user trajectory points, and a nonlinear model is built by combining neural network algorithms.
It achieves efficient and standardized extraction of characteristic curve data, ensures uniform distribution of sampling points, improves the standardization of data, and constructs a nonlinear model that can be used for simulation modeling.
Figure CN122197718A_ABST
Description
Technical Field
[0001] This invention relates to the field of hydropower engineering and computer-aided modeling technology, specifically to an automated data extraction method and system for turbine characteristic curves.
Background Technology
[0002] The comprehensive characteristics of a hydraulic turbine model are the core basis for the design, operation analysis, and regulation system modeling of hydropower station units. Because the internal flow of a hydraulic turbine is complex, its external characteristics are difficult to describe precisely with analytical formulas; engineering practice therefore relies mainly on large amounts of discrete measurement data obtained from model tests. To facilitate comparative analysis under different operating conditions, the industry typically uses dimensionless parameters such as unit rotational speed and unit flow as independent variables, and expresses key hydraulic performance indicators such as turbine efficiency and guide vane opening in the form of a model comprehensive characteristic curve.
[0003] In this type of comprehensive characteristic diagram, unit flow rate and unit speed are usually used as the coordinate axes, and multiple sets of guide vane opening curves, isoefficiency lines, and cavitation coefficient contour lines describe the overall distribution of the operating characteristics of the model runner. Based on specific head and operating conditions, engineers can consult these characteristic curves to obtain key parameters such as the turbine's hydraulic efficiency, optimal operating range, and cavitation performance. This provides data support for unit design, operation optimization, and dynamic simulation of the regulating system.
[0004] However, existing comprehensive characteristic data for hydro-turbine models are typically stored and disseminated in unstructured formats such as paper charts, scanned images, or PDF files, lacking a unified digital representation method and thus unsuitable as direct input data for hydro-turbine simulation modeling. In practical engineering applications, the acquisition of relevant characteristic data still primarily relies on manual reading, recording, and processing of characteristic curve images point by point. This process is not only inefficient and labor-intensive but also susceptible to human factors such as operator experience level, display resolution, and mouse precision, resulting in poor data accuracy and repeatability. Furthermore, existing manual reading methods typically rely on freely drawn trajectories or discrete point selection, lacking effective constraints on the spatial distribution of sampling points. The resulting curve data is prone to significant differences between different operators or during different reading processes, failing to meet the requirements of hydro-turbine simulation modeling and engineering calculations for the continuity, stability, and standardization of characteristic data.
[0005] Especially when dealing with dense curves, non-equidistant coordinates, and slight image rotation or distortion, traditional point-taking methods struggle to guarantee the overall continuity and accurate restoration of curve features. Furthermore, the lack of a unified digital data output interface prevents the extracted data from directly interfacing with subsequent simulation platforms, digital twin systems, or parameter optimization programs, thus limiting the automated development of turbine modeling technology.
[0006] Therefore, how to achieve efficient and standardized data extraction of turbine characteristic curves while preserving engineers' understanding of the overall shape of the characteristic curves, and further construct a nonlinear turbine model based on engineering characteristic data, has become a technical problem that urgently needs to be solved in the field of hydropower unit modeling and simulation.
Summary of the Invention
[0007] This invention provides an automated data extraction method and system for turbine characteristic curves. Its purpose is to solve the technical problems of low data extraction efficiency, uneven sampling, poor standardization, and inability to be directly used for simulation modeling caused by the reliance on manual point sampling of existing turbine characteristic curves.
[0008] To achieve the above objectives, the first aspect of the present invention provides an automated data extraction method for turbine characteristic curves, comprising the following steps: acquire a characteristic curve image of the water turbine model, and perform coordinate calibration on the image to establish a mapping relationship between the image pixel coordinate system and the physical coordinate system; in the physical coordinate system, construct a regular sampling grid according to the set resolution parameters, dividing the physical coordinate space into multiple grid cells; collect trajectory points drawn by the user along the characteristic curve, and transform the trajectory points to the physical coordinate system through the mapping relationship; based on the regular sampling grid, spatially partition the transformed trajectory points, and for multiple trajectory points falling into the same grid cell, retain only one trajectory point as the representative sampling point of that grid cell, to obtain discrete data of the characteristic curve; standardize the discrete data, and construct a nonlinear model of the turbine from the standardized data using a neural network algorithm.
[0009] Furthermore, the coordinate calibration method includes: Select at least two reference points on the characteristic curve image and input the actual physical coordinates corresponding to each reference point; Record the pixel coordinates of the reference point in the image pixel coordinate system; Based on the pixel coordinates and physical coordinates of the at least two sets of reference points, a linear scaling model is used to establish the mapping relationship between the pixel coordinate system and the physical coordinate system. For any pixel in the characteristic curve image, its transformed physical coordinates are calculated using the following formula:
[0010] $$x = x_1 + \frac{u - u_1}{u_2 - u_1}\,(x_2 - x_1), \qquad y = y_1 + \frac{v - v_1}{v_2 - v_1}\,(y_2 - y_1)$$
[0011] where $(u_1, v_1)$ and $(u_2, v_2)$ are the coordinates of the bottom left and top right reference points in the image pixel coordinate system, respectively; $(x_1, y_1)$ and $(x_2, y_2)$ are the actual physical coordinates corresponding to the bottom left and top right reference points, respectively; $(x, y)$ are the physical coordinates of the point after transformation; and $(u, v)$ are the pixel coordinates of any point in the image.
[0012] Furthermore, the construction of the regular sampling grid includes: Determine the range of values for the horizontal and vertical coordinates in the physical coordinate space; Generate horizontal coordinate grid nodes based on the set horizontal resolution:
[0013] $$x_i = x_{\min} + i\,\Delta x, \qquad i = 0, 1, \dots, N_x$$ where $x_{\min}$ is the minimum value of the horizontal coordinate in the physical coordinate space; $N_x$ is the maximum index of the horizontal grid division; $x_i$ are the horizontal coordinate grid nodes; and $\Delta x$ is the horizontal resolution. Generate vertical coordinate grid nodes based on the set vertical resolution:
[0014] $$y_j = y_{\min} + j\,\Delta y, \qquad j = 0, 1, \dots, N_y$$ where $y_{\min}$ is the minimum value of the vertical coordinate in the physical coordinate space; $N_y$ is the maximum index of the vertical grid division; $y_j$ are the vertical coordinate grid nodes; and $\Delta y$ is the vertical resolution. The horizontal and vertical grid nodes form a regular sampling grid, which divides the physical coordinate space into multiple grid cells. Each grid cell is enclosed by adjacent horizontal and vertical grid nodes.
[0015] Furthermore, the method for collecting trajectory points drawn by the user along the characteristic curve and transforming the trajectory points to the physical coordinate system through the mapping relationship includes: Perform continuous drawing operations along the target characteristic curve using a mouse or touch device; The system collects trajectory points during the user's drawing process in real time at fixed time intervals and records the pixel coordinates of each trajectory point in the image pixel coordinate system. For each acquired trajectory point, the pixel coordinates are converted into corresponding physical coordinates using the established mapping relationship between the pixel coordinate system and the physical coordinate system. Arrange all the converted trajectory points in chronological order to form the representation of the user trajectory in physical coordinate space:
[0016] $$P = \{\, (x(t_k),\, y(t_k)) \mid 0 \le t_k \le T \,\}$$ where $t_k$ is the sampling time point; $T$ is the total drawing time; $(x(t_k), y(t_k))$ are the physical coordinates obtained by converting the trajectory point collected at time $t_k$; and $P$ is the user's trajectory expressed in physical coordinates.
[0017] Furthermore, for multiple trajectory points falling within the same grid cell, the representative sampling point is determined according to any of the following rules: the trajectory point that first enters the grid cell; the trajectory point that last leaves the grid cell; the trajectory point with the smallest distance to the center of the grid cell; or the median trajectory point along the trajectory direction within the grid cell.
[0018] Furthermore, the standardization process includes: normalizing the guide vane opening corresponding to the characteristic curve based on the maximum guide vane opening input by the user, thereby generating standardized data for neural network modeling.
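The normalization described above can be sketched in a few lines. This is only an illustration: the function name `normalize_openings` and the sample opening values are assumptions, not taken from the patent, and the sketch simply divides each opening by the user-supplied maximum.

```python
from typing import List, Sequence


def normalize_openings(openings: Sequence[float], max_opening: float) -> List[float]:
    """Scale guide vane openings by the user-supplied maximum opening.

    Hypothetical helper: the source only states that openings are normalized
    by the maximum guide vane opening input by the user.
    """
    if max_opening <= 0:
        raise ValueError("max_opening must be positive")
    return [a / max_opening for a in openings]


# Illustrative values only (units follow whatever the chart uses).
normalized = normalize_openings([0.0, 12.0, 24.0], max_opening=24.0)
```

After this step each opening is a fraction of the maximum opening, which keeps the network input on a comparable scale across different turbine models.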
[0019] Furthermore, the method for constructing a nonlinear model of a water turbine using a neural network algorithm includes: constructing a feedforward neural network with unit rotational speed and normalized guide vane opening as inputs and unit flow rate or unit torque as outputs; the feedforward neural network contains one or more hidden layers, the number of neurons in the hidden layers is set according to the complexity of the characteristic curve, the Levenberg-Marquardt (LM) algorithm is used for training, and the network performance is evaluated by mean square error.
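To make the model structure concrete, the following sketch shows the forward pass of a single-hidden-layer feedforward network and the mean square error used for evaluation. It is a minimal sketch under stated assumptions: the tanh hidden activation, the linear output, and all weights are illustrative choices that the source does not specify, and the LM training step itself is deliberately omitted.

```python
import math
from typing import Sequence


def forward(
    x: Sequence[float],
    w_hidden: Sequence[Sequence[float]],
    b_hidden: Sequence[float],
    w_out: Sequence[float],
    b_out: float,
) -> float:
    """Forward pass: inputs -> tanh hidden layer -> linear scalar output.

    x is (unit speed, normalized guide vane opening); the output stands for
    unit flow or unit torque. The activation choice is an assumption.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def mean_squared_error(predictions: Sequence[float], targets: Sequence[float]) -> float:
    """Mean square error between predictions and targets."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and equal length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Illustrative (untrained) parameters: zero hidden weights give tanh(0) = 0,
# so the output equals the output bias.
y_hat = forward([70.0, 0.5], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [1.0, 1.0], 0.5)
mse = mean_squared_error([y_hat, y_hat], [0.5, 1.5])
```

In practice the weights would be fitted by a Levenberg-Marquardt least-squares routine on the standardized samples; the error function here is what such a routine would report after training.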
[0020] Furthermore, during the construction of the nonlinear model of the turbine, the model is extended by incorporating the runaway characteristic curve, zero opening boundary condition, and zero speed boundary condition. The expressions for unit torque and unit flow rate under the zero speed condition are as follows:
[0021] In these expressions, the quantities are, in order: the unit torque; the unit rotational speed; the unit power output; the derivative of unit power output with respect to unit rotational speed; the unit flow rate; the efficiency; and the derivative of efficiency with respect to unit rotational speed.
[0022] Furthermore, the collection of user trajectory points includes a "Read Line" mode and a "Read Point" mode: in "Read Line" mode, the user-drawn trajectory is continuously collected at fixed time intervals; in "Read Point" mode, the user's click locations are collected point by point as sampling points.
[0023] To achieve the above objectives, a second aspect of the present invention provides an automated data extraction system for turbine characteristic curves based on coordinate calibration, comprising: an image input module, used to load images of the characteristic curves of the water turbine model; a coordinate calibration module, used to calibrate the coordinates of the image and establish a mapping relationship between the image pixel coordinate system and the physical coordinate system; a sampling grid construction module, used to construct a regular sampling grid in the physical coordinate system according to the set resolution parameters and divide the physical coordinate space into multiple grid cells; a trajectory acquisition module, used to acquire trajectory points drawn by the user along the characteristic curve and convert the trajectory points to the physical coordinate system through the mapping relationship; a constraint sampling module, used to spatially partition the transformed trajectory points based on the regular sampling grid and, for multiple trajectory points falling into the same grid cell, retain only one trajectory point as the representative sampling point of the grid cell, to obtain discrete data of the characteristic curve; and a model building module, used to standardize the discrete data and, based on the standardized data, construct a nonlinear model of the water turbine using a neural network algorithm.
[0024] The beneficial effects of this invention are: Compared with existing technologies, the present invention provides an automated data extraction method and system for turbine characteristic curves. By constructing a precise mapping from image coordinates to physical space and combining it with a regular sampling grid of configurable resolution to spatially partition the user-drawn trajectory, only one representative sampling point is retained in each grid cell. This achieves a uniform distribution of sampling points in physical space while preserving the overall geometric shape of the characteristic curve. It completely eliminates the problems of uneven sampling density and poor repeatability caused by differences in human operation such as hand tremors and uneven drawing speeds, significantly improving the efficiency and standardization of data extraction. On this basis, by normalizing the discrete data and using a neural network for nonlinear fitting, a nonlinear turbine model is constructed that can be used directly for speed control system simulation and for interfacing with digital twin platforms. This effectively solves the technical problems of low efficiency, poor accuracy, and inability to be directly used for engineering modeling in traditional manual sampling methods.
Attached Figure Description
[0025] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below.
[0026] Figure 1 is a flowchart of an automated data extraction method for turbine characteristic curves disclosed in an embodiment of the present invention.
[0027] Figure 2 is an example diagram of loading a comprehensive characteristic curve disclosed in an embodiment of the present invention.
[0028] Figure 3 is an example diagram of loading a runaway characteristic curve disclosed in an embodiment of the present invention.
[0029] Figure 4 is a schematic diagram of the image after coordinate calibration and grid generation.
[0030] Figure 5 is a surface diagram showing the flow characteristics of a water turbine.
[0031] Figure 6 is a surface diagram showing the torque characteristics of a water turbine.
Detailed Implementation
[0032] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.
[0033] According to embodiments of the present invention, it should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system such as a set of computer-executable instructions, and although a logical order is shown in the following methods, in some cases the steps shown or described may be executed in a different order than that shown here.
[0034] As shown in Figure 1, this invention provides an automated data extraction method for turbine characteristic curves, which includes the following steps: Step S100: Obtain an image of the characteristic curve of the water turbine model, and perform coordinate calibration on the image to establish a mapping relationship between the image pixel coordinate system and the physical coordinate system; In this step, the characteristic curve images include the model's comprehensive characteristic curve and the runaway characteristic curve. These images can be obtained from scanned copies of paper test reports, screenshots of PDF files, electronic drawings, or other unstructured image formats. The images are loaded and displayed in the main interface's display area, providing a visual interface for subsequent coordinate calibration and curve acquisition.
[0035] In a preferred embodiment of the present invention, examples of a comprehensive characteristic curve image and a runaway characteristic curve image are shown in Figure 2 and Figure 3. In these figures, unit flow rate is the horizontal axis and unit speed is the vertical axis, and the comprehensive operating characteristics of the turbine are described by multiple sets of equal opening lines and equal efficiency lines.
[0036] After the image is loaded, the coordinate calibration step begins. The purpose of coordinate calibration is to establish a mapping relationship between the image pixel coordinate system and the physical coordinate system, thereby achieving an accurate conversion of geometric information in the characteristic curve image to the physical quantity space. In practice, the user needs to select at least two reference points in the characteristic curve image, preferably the lower left and upper right reference points, as shown in Figure 4. The user first clicks the lower left reference point on the image with the mouse, which records its pixel coordinates $(u_1, v_1)$, and enters the actual physical coordinates $(x_1, y_1)$ of the point in the system prompt box. Then, in the same manner, the user selects the upper right reference point, which records its pixel coordinates $(u_2, v_2)$, and inputs the corresponding actual physical coordinates $(x_2, y_2)$. The specific values of the physical coordinates can be read directly from the coordinate axis labels of the characteristic curve image. For example, the lower left corner corresponds to the origin (0,0), and the upper right corner corresponds to the maximum unit flow rate and the maximum unit speed.
[0037] When the image coordinate axes are substantially orthogonal and without significant distortion, a linear scaling model is used to map the horizontal and vertical coordinates independently. Based on the correspondence between the two reference points mentioned above, for any pixel $(u, v)$ in the image, its transformed physical coordinates $(x, y)$ are calculated using the following linear interpolation formula:
[0038] $$x = x_1 + \frac{u - u_1}{u_2 - u_1}\,(x_2 - x_1), \qquad y = y_1 + \frac{v - v_1}{v_2 - v_1}\,(y_2 - y_1)$$
[0039] where $(u_1, v_1)$ and $(u_2, v_2)$ are the coordinates of the bottom left and top right reference points in the image pixel coordinate system, respectively; $(x_1, y_1)$ and $(x_2, y_2)$ are the actual physical coordinates corresponding to the bottom left and top right reference points, respectively; $(x, y)$ are the physical coordinates of the point after transformation; and $(u, v)$ are the pixel coordinates of any point in the image.
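The two-point linear scaling described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `make_pixel_to_physical` and the calibration values are assumptions, and the sketch presumes orthogonal, undistorted axes as stated in the text.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


def make_pixel_to_physical(
    px_lower_left: Point,
    px_upper_right: Point,
    phys_lower_left: Point,
    phys_upper_right: Point,
) -> Callable[[float, float], Point]:
    """Build a pixel -> physical mapping from two calibrated reference points."""
    (u1, v1), (u2, v2) = px_lower_left, px_upper_right
    (x1, y1), (x2, y2) = phys_lower_left, phys_upper_right
    if u2 == u1 or v2 == v1:
        raise ValueError("reference points must differ along both pixel axes")

    def to_physical(u: float, v: float) -> Point:
        # Each axis is interpolated independently between the reference points.
        x = x1 + (u - u1) / (u2 - u1) * (x2 - x1)
        y = y1 + (v - v1) / (v2 - v1) * (y2 - y1)
        return (x, y)

    return to_physical


# Hypothetical calibration: pixel rows grow downward, so the lower left
# reference point has the larger v value.
to_physical = make_pixel_to_physical(
    (100.0, 900.0), (1100.0, 100.0), (0.0, 0.0), (1.0, 100.0)
)
mid_point = to_physical(600.0, 500.0)
```

Because the vertical interpolation uses the signed difference between the two reference rows, the downward-pointing pixel axis of typical image coordinates needs no special handling.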
[0040] Once this mapping relationship is established, the system can convert any pixel location within the image area into its corresponding physical coordinate value in real time, laying the foundation for subsequent regular sampling grid construction and user trajectory acquisition. After coordinate calibration, the system displays the calibrated reference point positions on the image interface using crosshairs or color highlights, allowing users to confirm the calibration accuracy. If the image has slight rotation or distortion, users can choose to increase the number of reference points or use a more complex coordinate transformation model.
[0041] Step S200: In the physical coordinate system, a regular sampling grid is constructed according to the set resolution parameters to divide the physical coordinate space into multiple grid cells; The purpose of a regular sampling grid is to standardize and partition the physical coordinate space, providing a spatial basis for the subsequent constrained sampling of user trajectory points. By constructing a uniformly distributed grid within the physical coordinate space, continuous user-drawn trajectories can be transformed into a discrete set of spatially controllable sampling points, thereby effectively eliminating the problem of uneven sampling point distribution caused by factors such as differences in user operation speed and hand tremors.
[0042] In practice, the range of values in the physical coordinate space is first determined based on the coordinate calibration results. Let the minimum and maximum values of the horizontal coordinate $x$ (i.e., unit flow rate) in the physical coordinate system be $x_{\min}$ and $x_{\max}$, and let the minimum and maximum values of the vertical coordinate $y$ (i.e., unit rotational speed) be $y_{\min}$ and $y_{\max}$. This range of values can be determined directly from the coordinate axis markings in the characteristic curve image, or it can be calculated from the physical coordinates of the reference points.
[0043] After determining the boundaries of the physical coordinate space, a regular sampling grid is constructed according to the resolution parameters set by the user. These resolution parameters are the horizontal resolution $\Delta x$ and the vertical resolution $\Delta y$, which represent the spacing between adjacent grid nodes in physical coordinate space. Users can flexibly set these two parameters according to the complexity of the characteristic curve and the required sampling accuracy: for curve regions with drastic changes, a smaller spacing can be set to obtain denser sampling points; for regions with gentle changes, the spacing can be increased appropriately to improve data extraction efficiency.
[0044] Based on the above parameters, grid nodes are generated in the horizontal and vertical coordinate directions, and the calculation formula is as follows:
[0045] $$x_i = x_{\min} + i\,\Delta x, \qquad i = 0, 1, \dots, N_x$$ where $x_{\min}$ is the minimum value of the horizontal coordinate in the physical coordinate space; $N_x$ is the maximum index of the horizontal grid division; $x_i$ are the horizontal coordinate grid nodes; and $\Delta x$ is the horizontal resolution. Generate vertical coordinate grid nodes based on the set vertical resolution:
[0046] $$y_j = y_{\min} + j\,\Delta y, \qquad j = 0, 1, \dots, N_y$$ where $y_{\min}$ is the minimum value of the vertical coordinate in the physical coordinate space; $N_y$ is the maximum index of the vertical grid division; $y_j$ are the vertical coordinate grid nodes; and $\Delta y$ is the vertical resolution. All grid nodes $(x_i, y_j)$ together form the regular sampling grid $G = \{(x_i, y_j) \mid i = 0, \dots, N_x;\ j = 0, \dots, N_y\}$.
[0047] The physical coordinate space defined by the aforementioned grid nodes is naturally divided into several rectangular grid cells, each cell consisting of two adjacent horizontal grid nodes and two vertical grid nodes. These grid cells are not directly used as the final sampling points, but rather as the basis for spatial partitioning: in subsequent steps, the grid cell to which each user trajectory point belongs will be determined, and only one representative sampling point will be retained within each cell. It is worth noting that although grid nodes have definite physical coordinates, the representative sampling point is not required to fall on a grid node, but is allowed to fall at any location within the grid cell. This ensures that the sampling points accurately reflect the actual trend of the characteristic curve and avoids additional errors introduced by forced merging into grid nodes.
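The node generation and cell partition described above might look like the sketch below. The helper name `grid_nodes`, the example ranges, and the rule for choosing the maximum index (rounding the range up so the grid covers the whole axis) are assumptions; the source does not state how the maximum index is computed.

```python
import math
from typing import List


def grid_nodes(v_min: float, v_max: float, step: float) -> List[float]:
    """Return evenly spaced nodes v_min + k * step covering [v_min, v_max].

    Assumption: the maximum index is the range divided by the step, rounded
    up, so the last node may lie slightly beyond v_max when the range is not
    an exact multiple of the step.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if v_max <= v_min:
        raise ValueError("v_max must exceed v_min")
    # The small tolerance avoids an extra node caused by floating-point rounding.
    n_max = math.ceil((v_max - v_min) / step - 1e-9)
    return [v_min + k * step for k in range(n_max + 1)]


# Illustrative ranges and resolutions only.
x_nodes = grid_nodes(0.0, 1.0, 0.25)      # horizontal axis, e.g. unit flow
y_nodes = grid_nodes(0.0, 100.0, 25.0)    # vertical axis, e.g. unit speed
n_cells = (len(x_nodes) - 1) * (len(y_nodes) - 1)
```

Each cell is the rectangle between two adjacent horizontal nodes and two adjacent vertical nodes, so a grid with 5 nodes per axis yields 16 cells.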
[0048] Once the regular sampling grid is constructed, it can be displayed on the image interface as light-colored grid lines or a dot matrix overlay, allowing users to intuitively understand the currently set sampling density and spatial distribution. Users who are not satisfied with the current grid can adjust the $\Delta x$ and $\Delta y$ parameters at any time and regenerate the grid until the desired sampling density is achieved. This configurable grid resolution mechanism allows the method of this invention to flexibly adapt to characteristic curves of different types and complexities, ensuring data quality while also considering operational efficiency and user-specific needs.
[0049] Step S300: Collect the trajectory points drawn by the user along the characteristic curve, and convert the trajectory points to the physical coordinate system through the mapping relationship; In practice, two selectable acquisition modes are provided, "Read Line" mode and "Read Point" mode, to accommodate different curve shapes and operating habits. In "Read Line" mode, the user draws continuously along the target characteristic curve by holding down the left mouse button or by using a touch device. The system collects trajectory points in real time at a fixed time interval $\Delta t$ during the user's drawing process and records the pixel coordinates of each trajectory point in the image pixel coordinate system. The time interval $\Delta t$ can be configured according to system performance and operational accuracy requirements, with typical values ranging from 10 milliseconds to 50 milliseconds. This ensures the continuity of the trajectory while avoiding data redundancy caused by overly dense sampling. Users can move the cursor along the curve naturally during drawing, and the system will automatically record the entire drawing process.
[0050] For each acquired trajectory point, the pixel coordinates are immediately converted to their corresponding physical coordinates using the mapping relationship between the pixel coordinate system and the physical coordinate system established in step S100. Through this real-time conversion mechanism, the user can see the physical coordinate value of the current point in the status bar or cursor tooltip of the image interface during the drawing process, facilitating real-time confirmation of the accuracy of the drawing position. Simultaneously, the converted physical coordinate trajectory points are stored sequentially in chronological order, forming a complete representation of the user's trajectory in the physical coordinate space.
[0051] $$P = \{\, (x(t_k),\, y(t_k)) \mid 0 \le t_k \le T \,\}$$ where $t_k$ is the sampling time point; $T$ is the total drawing time; $(x(t_k), y(t_k))$ are the physical coordinates obtained by converting the trajectory point collected at time $t_k$; and $P$ is the user's trajectory expressed in physical coordinates.
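As a small illustration of how timestamped pixel samples become a chronological trajectory in physical coordinates, consider the sketch below. The function name `trajectory_to_physical`, the sample values, and the stand-in mapping are all hypothetical; any calibrated pixel-to-physical mapping could be passed in its place.

```python
from typing import Callable, List, Sequence, Tuple

PixelSample = Tuple[float, float, float]   # (time in seconds, pixel u, pixel v)
PhysSample = Tuple[float, float, float]    # (time in seconds, physical x, physical y)


def trajectory_to_physical(
    samples: Sequence[PixelSample],
    to_physical: Callable[[float, float], Tuple[float, float]],
) -> List[PhysSample]:
    """Convert timestamped pixel samples to physical coordinates, in time order."""
    ordered = sorted(samples, key=lambda s: s[0])
    return [(t, *to_physical(u, v)) for t, u, v in ordered]


def demo_mapping(u: float, v: float) -> Tuple[float, float]:
    # Stand-in for the calibrated mapping; illustrative scaling only.
    return (u / 1000.0, (1000.0 - v) / 10.0)


# Samples arrive every 20 ms here; they are deliberately out of order to show
# that the result is sorted chronologically.
raw = [(0.02, 200.0, 800.0), (0.00, 100.0, 900.0), (0.04, 300.0, 700.0)]
trajectory = trajectory_to_physical(raw, demo_mapping)
```

Keeping the timestamp alongside each converted point preserves the drawing order, which later steps can use when choosing a representative point per grid cell.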
[0052] In "Read Point" mode, users can select key feature points on a characteristic curve by clicking point by point. This mode is suitable for handling isolated data points, the start and end points of curves, inflection points, or intersections with other curves. Each click records the pixel coordinates of the current cursor position and converts them into physical coordinates through a mapping relationship, storing them as an independent sampling point. Compared to continuous drawing, "Read Point" mode gives users more precise control, allowing them to supplement key feature points on top of the dense data collected in "Read Line" mode to improve the accuracy of curve representation.
[0053] Whether in "read line" mode or "read point" mode, all collected trajectory points are temporarily stored in a list of trajectory points in memory and displayed in real time on the image interface using different colors or symbols, allowing users to intuitively view the coverage of the currently collected trajectory. If users find errors or unsatisfactory parts during the drawing process, they can delete the most recent trajectory segment using the undo function, or clear all trajectories and re-collect. This interactive acquisition method ensures operational flexibility and provides users with ample opportunities for error correction, ensuring that the final obtained raw trajectory data accurately reflects the true shape of the characteristic curve.
[0054] Step S400: Based on the regular sampling grid, the converted trajectory points are spatially partitioned. For multiple trajectory points falling into the same grid cell, only one trajectory point is retained as the representative sampling point of the grid cell to obtain discrete data of the characteristic curve. The constrained sampling mechanism based on the regular sampling grid proposed in this invention decouples the user's drawing operation from the final sampling result through two key steps: spatial partitioning and representative point selection. This fundamentally addresses the problems of uneven sampling point density and poor repeatability in traditional manual point selection methods.
[0055] In practice, based on the regular sampling grid constructed in step S200, the physical coordinate space is divided into several grid cells. Each grid cell corresponds to a defined spatial region, and its boundary is enclosed by adjacent horizontal grid nodes $x_i$ and $x_{i+1}$ and adjacent vertical grid nodes $y_j$ and $y_{j+1}$.
[0056] Iterate through all user trajectory points collected in step S300. For each trajectory point, determine the grid cell it belongs to based on its physical coordinates $(x, y)$. The determination method is: find the grid index $(i, j)$ that satisfies $x_i \le x < x_{i+1}$ and $y_j \le y < y_{j+1}$; the trajectory point is then marked as belonging to the grid cell with index $(i, j)$. In this way, the originally continuous user trajectory is discretized into sets of trajectory points belonging to different grid cells.
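The cell lookup described above can be sketched with a floor division per axis. This is an illustration under assumptions: the names `cell_index` and `group_by_cell` are hypothetical, the grid is taken to be uniform, and points exactly on the upper boundary are assigned to the last cell, a convention the source does not specify.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def cell_index(
    x: float, y: float,
    x_min: float, y_min: float,
    dx: float, dy: float,
    n_x: int, n_y: int,
) -> Optional[Cell]:
    """Return the (i, j) cell containing (x, y), or None if outside the grid."""
    i = math.floor((x - x_min) / dx)
    j = math.floor((y - y_min) / dy)
    # Assumed convention: points on the upper boundary join the last cell.
    if x == x_min + n_x * dx:
        i = n_x - 1
    if y == y_min + n_y * dy:
        j = n_y - 1
    if 0 <= i < n_x and 0 <= j < n_y:
        return (i, j)
    return None


def group_by_cell(
    points: Sequence[Point],
    x_min: float, y_min: float,
    dx: float, dy: float,
    n_x: int, n_y: int,
) -> Dict[Cell, List[Point]]:
    """Group trajectory points by cell, preserving their original order."""
    cells: Dict[Cell, List[Point]] = {}
    for p in points:
        idx = cell_index(p[0], p[1], x_min, y_min, dx, dy, n_x, n_y)
        if idx is not None:
            cells.setdefault(idx, []).append(p)
    return cells


# Illustrative 4 x 4 grid over x in [0, 1] and y in [0, 100].
cells = group_by_cell(
    [(0.05, 5.0), (0.10, 10.0), (0.30, 30.0)], 0.0, 0.0, 0.25, 25.0, 4, 4
)
```

Since the points are appended in input order, each per-cell list keeps the chronological order of the drawn trajectory.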
[0057] After determining the attribution of all trajectory points, the trajectory point set within each grid cell is processed independently. For multiple user trajectory points falling into the same grid cell, the principle of "retaining only one representative sampling point" is adopted for filtering, and the remaining trajectory points are discarded.
[0058] The rationale for this filtering is as follows: when the resolution parameters $\Delta x$ and $\Delta y$ of the regular sampling grid are set properly, the spatial range within a single grid cell is small enough that multiple trajectory points within it are necessarily close together, so retaining just one point is sufficient to represent the curve characteristics of that local area. At the same time, by retaining only one point in each grid cell, the distribution density of the final sampling points in the physical coordinate space is made to correspond strictly to the grid resolution, thereby achieving uniform control of the sampling points.
[0059] The selection rules for representative sampling points can be flexibly configured according to actual application needs. The system provides a variety of optional determination rules for users to choose from. In the preferred embodiment, representative sampling points can be determined according to any of the following rules: Rule 1: Select the point where the trajectory point first enters the grid cell. This rule can better preserve the entry characteristics of the curve along the drawing direction and is suitable for application scenarios that need to maintain curve direction information. Rule 2: Select the point where the trajectory point last leaves the grid cell. This rule is suitable for situations where the exit characteristics of the curve are of interest. Rule 3: Select the point among the trajectory points with the smallest distance to the center of the grid cell. This rule makes the representative sampling point as close as possible to the center of the grid cell, which is beneficial for subsequent gridded data processing. Rule 4: Select the median point of the trajectory points along the trajectory direction within the grid cell, that is, the point located in the middle position after sorting according to the trajectory order. This rule can eliminate the influence of accidental jitter to a certain extent, making the representative point better reflect the overall trend of the curve. Users can choose the most suitable one from the above rules according to the specific shape of the characteristic curve and the purpose of the data.
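The four selection rules above can be sketched as a single function. This is a hedged illustration: the rule names, the function `pick_representative`, and the sample points are assumptions, the distance to the cell center is measured in units of the cell size (an added choice so that axes with very different scales contribute equally), and for an even number of points the median rule takes the later of the two middle points.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def pick_representative(
    points: Sequence[Point],
    center: Point,
    cell_size: Tuple[float, float],
    rule: str,
) -> Point:
    """Pick one point from the chronologically ordered points of one cell."""
    if not points:
        raise ValueError("cell contains no trajectory points")
    if rule == "first":            # Rule 1: point that first enters the cell
        return points[0]
    if rule == "last":             # Rule 2: point that last leaves the cell
        return points[-1]
    if rule == "nearest_center":   # Rule 3: point closest to the cell center
        dx, dy = cell_size
        return min(
            points,
            key=lambda p: math.hypot((p[0] - center[0]) / dx, (p[1] - center[1]) / dy),
        )
    if rule == "median":           # Rule 4: middle point in trajectory order
        return points[len(points) // 2]
    raise ValueError(f"unknown rule: {rule}")


# Illustrative points inside the cell [0, 0.25] x [0, 25], in drawing order.
pts = [(0.02, 2.0), (0.06, 6.0), (0.08, 8.0), (0.13, 12.0), (0.24, 24.0)]
center, size = (0.125, 12.5), (0.25, 25.0)
chosen = {r: pick_representative(pts, center, size, r)
          for r in ("first", "last", "nearest_center", "median")}
```

Applying the same rule to every occupied cell yields at most one sample per cell, which is what ties the final sampling density to the grid resolution rather than to the drawing speed.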
[0060] Through the aforementioned constrained sampling mechanism, spatially uniform discrete data of characteristic curves can be obtained. For regions where trajectory points are initially dense or sparse due to varying user drawing speeds, after grid cell filtering, at most one point is retained per cell, thus decoupling the sampling point density from the drawing speed. For local trajectory jitter caused by user hand tremors, since jittery trajectory points are often distributed across multiple adjacent grid cells, only one point is retained per cell, effectively suppressing the jitter amplitude while preserving the overall curve's direction. This mechanism ensures that even if different users draw the same curve, or the same user draws the same curve multiple times, the resulting discrete data exhibits high spatial consistency and repeatability.
[0061] In addition, the horizontal and vertical resolution parameters of the regular sampling grid provide a direct means of controlling the sampling point density. Users can adjust the density according to the complexity of the characteristic curve: for curve segments with drastic changes, a smaller grid spacing is set to obtain dense sampling points, ensuring that curve details are not lost; for curve segments with gentle changes, a larger grid spacing is set to reduce data redundancy. This adjustable sampling density allows the method of this invention to balance data accuracy and storage efficiency, making it suitable for engineering applications with different accuracy requirements.
[0062] After constrained sampling is completed, the representative sampling points retained within each grid cell are organized according to their physical coordinates to form a discrete dataset of the turbine model's comprehensive characteristic curve and runaway characteristic curve. This dataset has three significant characteristics: first, it is spatially uniform, with the distance between adjacent sampling points largely determined by the grid resolution; second, its density is controllable, allowing users to preset the sampling density through grid parameters; and third, it is minimally affected by human factors, with data extracted by different operators exhibiting high consistency. Simultaneously, the trajectory points before and after filtering are overlaid in different colors on the image interface, enabling users to intuitively compare the differences between the original trajectory and the final sampling result. Users can also adjust the grid parameters and resample at any time until satisfactory discrete data results are obtained.
[0063] Step S500: Standardize the discrete data and construct a nonlinear model of the turbine based on the standardized data using a neural network algorithm.
[0064] Data standardization is a necessary preparatory step before neural network modeling. Because the curves corresponding to different guide vane openings in the characteristic curves have different dimensions and numerical ranges, directly using them as neural network input may lead to slow convergence or getting stuck in local optima during the training process.
[0065] To solve this problem, the guide vane opening corresponding to each collected equal-opening line is normalized by the maximum guide vane opening, a parameter entered by the user during the model information setting phase. The normalization formula is $\bar{a} = a / a_{\max}$, where $a$ is the guide vane opening of the line and $a_{\max}$ is the maximum guide vane opening. This ensures that the normalized guide vane opening falls within the interval $[0, 1]$. After normalization, the input parameters of all sample data (the unit rotational speed and the normalized guide vane opening) have a uniform scale range, which helps improve the training efficiency and fitting accuracy of the neural network.
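As a small illustration of this normalization step, the sketch below attaches the normalized opening to every point of each equal-opening line. The function name `build_flow_samples` and the dictionary layout of the input are assumptions made for the example, not part of the described system.

```python
def build_flow_samples(opening_lines, a_max):
    """Normalize guide vane openings and build (input, output) samples.

    opening_lines: {guide vane opening: [(unit speed, unit flow), ...]}
    a_max:         maximum guide vane opening entered by the user.
    Returns a list of ((unit speed, normalized opening), unit flow).
    """
    if a_max <= 0:
        raise ValueError("a_max must be positive")
    samples = []
    for opening, pts in opening_lines.items():
        a_norm = opening / a_max  # falls in [0, 1] when 0 <= opening <= a_max
        if not 0.0 <= a_norm <= 1.0:
            raise ValueError(f"opening {opening} lies outside [0, a_max]")
        for n11, q11 in pts:
            samples.append(((n11, a_norm), q11))
    return samples


lines = {10.0: [(60.0, 0.3)], 40.0: [(70.0, 1.1)]}
print(build_flow_samples(lines, a_max=40.0))
```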
[0066] After data standardization, training samples are organized according to the turbine characteristic type to be modeled. For flow characteristic modeling, the data points on the equal-opening lines collected in step S400 are used directly as training samples, with the input of each sample being the unit rotational speed and the normalized guide vane opening, and the output being the unit flow rate. For torque characteristic modeling, both the equal-opening line data and the equal-efficiency line data are needed: first, the efficiency characteristic surface is fitted from the equal-efficiency line data to obtain the efficiency value corresponding to an arbitrary operating point; then, based on the theoretical relationship among the unit torque, unit flow rate, unit rotational speed, and efficiency of the turbine, each flow characteristic sample is converted into a torque characteristic sample:
[0067] where the quantities involved are, in order: the unit torque; the unit rotational speed; the unit flow rate; and the efficiency.
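The conversion formula itself is not legible in this text. For orientation only, the standard hydraulic relation between these quantities can be written as below. The symbols $M_{11}$, $n_{11}$, $Q_{11}$, $P_{11}$ and $\eta$, the use of $\rho g$ for the specific weight of water, and the assumption that $n_{11}$ is expressed in r/min are choices made here; the original formula may use a numerical constant and different units.

```latex
% Unit output from unit flow rate and efficiency (assumed SI-based unit quantities)
P_{11} = \rho g \, Q_{11} \, \eta
% Torque = power / angular speed, with \omega = \pi n_{11} / 30 for n_{11} in r/min
M_{11} = \frac{30 \, P_{11}}{\pi \, n_{11}}
       = \frac{30 \, \rho g \, Q_{11} \, \eta}{\pi \, n_{11}}
```

Under these assumptions, each flow sample with a known efficiency yields one torque sample, provided the unit rotational speed is nonzero.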
[0068] In a preferred embodiment of the invention, a feedforward neural network with one hidden layer is employed. The input layer has two neurons, corresponding to the unit rotational speed and the normalized guide vane opening. The output layer has one neuron, corresponding to the unit flow rate or the unit torque. The number of hidden layer neurons is set according to the complexity of the characteristic curve, typically ranging from 5 to 20, and can be determined through cross-validation or empirical formulas. Hidden layer neurons use a nonlinear activation function, such as the hyperbolic tangent function (tansig) or the sigmoid function, to enhance the network's nonlinear mapping ability; output layer neurons use the linear activation function (purelin), allowing the network output to cover the entire real number range.
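The structure described above (two inputs, one tanh hidden layer, one linear output) can be sketched with NumPy as follows. The parameter names `W1`, `b1`, `W2`, `b2`, the random initialization, and the default of 10 hidden neurons are illustrative assumptions; the text only fixes the layer layout and the activation types.

```python
import numpy as np


def init_network(n_hidden=10, seed=0):
    """Create random weights for a 2 -> n_hidden -> 1 feedforward network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(n_hidden, 2)),  # input -> hidden
        "b1": np.zeros(n_hidden),                         # hidden thresholds
        "W2": rng.normal(scale=0.5, size=(1, n_hidden)),  # hidden -> output
        "b2": np.zeros(1),                                # output threshold
    }


def forward(params, X):
    """Evaluate the network.

    X: array-like of shape (n_samples, 2) holding
       (unit speed, normalized guide vane opening) per row.
    Returns predictions of shape (n_samples,).
    """
    X = np.asarray(X, dtype=float)
    hidden = np.tanh(X @ params["W1"].T + params["b1"])     # tansig layer
    return (hidden @ params["W2"].T + params["b2"]).ravel()  # purelin layer


params = init_network(n_hidden=8)
print(forward(params, [[70.0, 0.5], [80.0, 1.0]]).shape)
```

In practice the unit rotational speed would probably also be rescaled before being fed to the tanh layer, since large inputs saturate it; the text does not say how this is handled, so the sketch leaves the inputs unchanged.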
[0069] The neural network is trained with the Levenberg-Marquardt (LM) algorithm, which converges quickly. The algorithm combines the global search characteristics of gradient descent with the fast local convergence of the Gauss-Newton method, improving training efficiency and accuracy. During training, the mean squared error (MSE) is used as the network performance metric, calculated as follows:
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$
[0070] where $y_i$ is the true value of the $i$-th sample, $\hat{y}_i$ is the corresponding network prediction, and $N$ is the number of samples.
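To make the blend of gradient descent and Gauss-Newton concrete, the following is a minimal, generic LM loop, not the training routine of the invention. It assumes a finite-difference Jacobian (real neural network toolboxes compute it by backpropagation), and the names `levenberg_marquardt`, `mu`, and `tol` are introduced here for illustration.

```python
import numpy as np


def numerical_jacobian(residual_fn, w, eps=1e-6):
    """Forward-difference Jacobian of the residual vector with respect to w."""
    r0 = residual_fn(w)
    J = np.empty((r0.size, w.size))
    for k in range(w.size):
        step = np.zeros_like(w)
        step[k] = eps
        J[:, k] = (residual_fn(w + step) - r0) / eps
    return J


def levenberg_marquardt(residual_fn, w0, max_iter=1000, mu=1e-2, tol=1e-12):
    """Minimize mean(residual**2); returns (weights, final MSE)."""
    w = np.asarray(w0, dtype=float).copy()
    r = residual_fn(w)
    mse = float(np.mean(r ** 2))
    for _ in range(max_iter):
        J = numerical_jacobian(residual_fn, w)
        # Damped normal equations: (J^T J + mu I) step = J^T r
        step = np.linalg.solve(J.T @ J + mu * np.eye(w.size), J.T @ r)
        w_new = w - step
        r_new = residual_fn(w_new)
        mse_new = float(np.mean(r_new ** 2))
        if mse_new < mse:
            # Accept the step; smaller mu behaves more like Gauss-Newton
            w, r, mse = w_new, r_new, mse_new
            mu *= 0.1
        else:
            # Reject the step; larger mu behaves more like gradient descent
            mu = min(mu * 10.0, 1e10)
        if mse < tol:
            break
    return w, mse


# Usage: recover slope 2 and intercept 1 of a straight line
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
w, mse = levenberg_marquardt(lambda w: w[0] * x + w[1] - y, [0.0, 0.0])
print(w, mse)
```

For a network, `residual_fn` would return prediction minus target over all training samples, with `w` holding the flattened weights and thresholds.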
[0071] To prevent overfitting, all samples are randomly divided into three parts: a training set, a validation set, and a test set, with proportions of 80%, 10%, and 10%, respectively. The training set is used to adjust network weights and thresholds; the validation set is used to monitor overfitting during training and stop training at an appropriate time; and the test set is used only after training is completed to evaluate the network's generalization performance. The maximum number of training iterations is set to 1000. If the validation set error fails to decrease for several consecutive training iterations, training is terminated early to conserve computational resources.
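The 80/10/10 split and the early-stopping check can be sketched as below. The text says only that training stops after "several" consecutive iterations without improvement, so the `patience` value of 6 is a hypothetical default, as are the function names.

```python
import random


def split_samples(samples, seed=0, train=0.8, val=0.1):
    """Randomly split samples into training, validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * train)
    n_val = round(n * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # remainder is the test set
    )


def should_stop(val_errors, patience=6):
    """True if the validation error has not improved in the last `patience` epochs."""
    if len(val_errors) <= patience:
        return False
    best_before = min(val_errors[:-patience])
    return min(val_errors[-patience:]) >= best_before


train_set, val_set, test_set = split_samples(range(100))
print(len(train_set), len(val_set), len(test_set))
```

A training loop would append the validation MSE after each iteration and break when `should_stop` returns true or the iteration count reaches 1000.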
[0072] Before training the neural network, the boundary conditions of the turbine model need to be reasonably extended to ensure the model's applicability across all operating conditions. The model extension considers three aspects. First, the runaway boundary condition: the flow and torque characteristic data of the turbine in the runaway state are obtained directly from the runaway characteristic curve collected in step S400 and used as supplementary training samples. Second, the zero-opening boundary condition: when the guide vane opening is 0, the unit flow rate is always 0, and the unit torque is negative and proportional to the square of the unit rotational speed; artificial samples are generated according to this physical law. Third, the zero-speed boundary condition: when the unit rotational speed is 0, the efficiency and the unit output are both 0, and the unit torque and unit flow rate must be determined through limit analysis. For the zero-speed boundary condition, the limit expressions for unit torque and unit flow rate are derived using L'Hôpital's rule:
[0073] where the quantities involved are, in order: the unit torque; the unit rotational speed; the unit output; the derivative of the unit output with respect to the unit rotational speed; the unit flow rate; the efficiency; and the derivative of the efficiency with respect to the unit rotational speed.
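The limit expressions are not legible in this text. A derivation consistent with the quantities listed above can be sketched as follows, assuming the standard relations $P_{11} = \rho g\, Q_{11}\, \eta$ and $M_{11} = 30 P_{11} / (\pi n_{11})$ with $n_{11}$ in r/min. The symbols and constants are assumptions made here, and the original expressions may differ in units or constant factors.

```latex
% Both P_{11} and n_{11} tend to 0, so L'Hopital's rule applies to M_{11}:
\lim_{n_{11} \to 0} M_{11}
  = \lim_{n_{11} \to 0} \frac{30 \, P_{11}}{\pi \, n_{11}}
  = \frac{30}{\pi} \left. \frac{\mathrm{d} P_{11}}{\mathrm{d} n_{11}} \right|_{n_{11} = 0}

% Both P_{11} and \eta tend to 0, so the same rule applies to Q_{11}:
\lim_{n_{11} \to 0} Q_{11}
  = \lim_{n_{11} \to 0} \frac{P_{11}}{\rho g \, \eta}
  = \frac{1}{\rho g} \left.
      \frac{\mathrm{d} P_{11} / \mathrm{d} n_{11}}
           {\mathrm{d} \eta / \mathrm{d} n_{11}}
    \right|_{n_{11} = 0}
```

The second limit requires the derivative of efficiency with respect to unit rotational speed to be nonzero at zero speed.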
[0074] By extending the boundary conditions described above, a complete training sample set covering the entire operating range can be obtained, ensuring that the neural network model still has a reasonable physical response under extreme operating conditions such as zero speed, zero opening, and runaway.
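As an illustration of the zero-opening boundary condition, artificial samples can be generated as below. The text gives only the form of the law (zero flow, negative torque proportional to the square of the unit rotational speed); the coefficient `k` and the function name are hypothetical and would have to be chosen for the specific turbine.

```python
def zero_opening_samples(n11_values, k):
    """Generate artificial samples at zero guide vane opening.

    n11_values: unit rotational speeds at which to create samples.
    k:          hypothetical positive coefficient of the torque law
                M11 = -k * n11**2 (not specified in the text).
    Returns (flow_samples, torque_samples), each a list of
    ((unit speed, normalized opening), value).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    flow = [((n11, 0.0), 0.0) for n11 in n11_values]              # Q11 = 0
    torque = [((n11, 0.0), -k * n11 ** 2) for n11 in n11_values]  # M11 < 0 for n11 != 0
    return flow, torque


flow, torque = zero_opening_samples([10.0, 20.0], k=0.01)
print(flow)
print(torque)
```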
[0075] After the neural network is trained, the trained network structure and parameters (including weight matrices, threshold vectors, activation function types, etc.) are saved as a nonlinear model file for the hydro turbine. This model can be directly embedded into a speed control system simulation platform, a digital twin system, or a parameter optimization program. Users can use the model validation function to compare the predicted values of the trained neural network with the original sampling points to check the fitting accuracy.
[0076] To further visualize the model's characteristics, the turbine's flow characteristic surface and torque characteristic surface can be plotted from the trained neural network. Schematic results are shown in Figure 5 and Figure 6. These surfaces take the unit rotational speed and the normalized guide vane opening as the base coordinates and the unit flow rate or unit torque as the vertical coordinate. They give a complete view of the nonlinear characteristics of the turbine throughout its entire operating range, offering an intuitive reference for unit operation analysis and control strategy design.
[0077] Through the above steps, this invention realizes an automated modeling process from turbine characteristic curve images to turbine nonlinear models, which greatly improves modeling efficiency while ensuring model accuracy, and provides strong technical support for the simulation analysis and optimized operation of hydropower units.
[0078] According to another aspect of the embodiments of this application, an automated data extraction system for turbine characteristic curves based on coordinate calibration is also provided, comprising: an image input module, used to load images of the characteristic curves of the turbine model; a coordinate calibration module, used to calibrate the coordinates of the image and establish a mapping relationship between the image pixel coordinate system and the physical coordinate system; a sampling grid construction module, used to construct a regular sampling grid in the physical coordinate system according to the set resolution parameters and divide the physical coordinate space into multiple grid cells; a trajectory acquisition module, used to acquire trajectory points drawn by the user along the characteristic curve and convert the trajectory points to the physical coordinate system through the mapping relationship; a constrained sampling module, used to spatially partition the converted trajectory points based on the regular sampling grid and, for multiple trajectory points falling into the same grid cell, retain only one trajectory point as the representative sampling point of that grid cell, so as to obtain discrete data of the characteristic curve; and a model building module, used to standardize the discrete data and, based on the standardized data, construct a nonlinear model of the turbine using a neural network algorithm.
[0079] In the above embodiments of the present invention, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments.
[0080] In the several embodiments provided in this application, it should be understood that the disclosed technical content can be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division of units can be a logical functional division, and in actual implementation, there may be other division methods. For instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the displayed or discussed mutual coupling, direct coupling, or communication connection may be through some interfaces; the indirect coupling or communication connection between units or modules may be electrical or other forms.
[0081] Furthermore, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.
[0082] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, read-only memory (ROM), random access memory (RAM), portable hard drives, magnetic disks, or optical disks.
[0083] The above description is only a preferred embodiment of the present invention. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.
Claims
1. A method for automated data extraction of turbine characteristic curves, characterized in that it includes the following steps: an image of the characteristic curve of the turbine model is obtained, and coordinate calibration is performed on the image to establish a mapping relationship between the image pixel coordinate system and the physical coordinate system; in the physical coordinate system, a regular sampling grid is constructed according to the set resolution parameters, dividing the physical coordinate space into multiple grid cells; trajectory points drawn by the user along the characteristic curve are collected and transformed to the physical coordinate system through the mapping relationship; based on the regular sampling grid, the transformed trajectory points are spatially partitioned, and for multiple trajectory points falling into the same grid cell, only one trajectory point is retained as the representative sampling point of that grid cell, to obtain discrete data of the characteristic curve; the discrete data is standardized, and a nonlinear model of the turbine is constructed from the standardized data using a neural network algorithm.
2. The automated data extraction method for turbine characteristic curves as described in claim 1, characterized in that the coordinate calibration includes: selecting at least two reference points on the characteristic curve image and inputting the actual physical coordinates corresponding to each reference point; recording the pixel coordinates of the reference points in the image pixel coordinate system; based on the pixel coordinates and physical coordinates of the at least two reference points, establishing the mapping relationship between the pixel coordinate system and the physical coordinate system using a linear scaling model, whereby for any pixel in the characteristic curve image the transformed physical coordinates are calculated using the following formula: $$x = x_1 + \frac{u - u_1}{u_2 - u_1}\,(x_2 - x_1), \qquad y = y_1 + \frac{v - v_1}{v_2 - v_1}\,(y_2 - y_1)$$ where $(u_1, v_1)$ and $(u_2, v_2)$ are the coordinates of the bottom-left and top-right reference points in the image pixel coordinate system, respectively; $(x_1, y_1)$ and $(x_2, y_2)$ are the actual physical coordinates corresponding to the bottom-left and top-right reference points, respectively; $(x, y)$ are the physical coordinates of the point after transformation; and $(u, v)$ are the pixel coordinates of any point in the image.
3. The automated data extraction method for turbine characteristic curves as described in claim 1, characterized in that constructing the regular sampling grid includes: determining the range of values of the horizontal and vertical coordinates in the physical coordinate space; generating horizontal coordinate grid nodes based on the set horizontal resolution: $$x_i = x_{\min} + i\,\Delta x, \quad i = 0, 1, \dots, N_x$$ where $x_{\min}$ is the minimum value of the abscissa in the physical coordinate space, $N_x$ is the maximum index of the horizontal grid division, $x_i$ are the horizontal coordinate grid nodes, and $\Delta x$ is the horizontal resolution; generating vertical coordinate grid nodes based on the set vertical resolution: $$y_j = y_{\min} + j\,\Delta y, \quad j = 0, 1, \dots, N_y$$ where $y_{\min}$ is the minimum value of the ordinate in the physical coordinate space, $N_y$ is the maximum index of the vertical grid division, $y_j$ are the vertical coordinate grid nodes, and $\Delta y$ is the vertical resolution; the horizontal and vertical grid nodes form the regular sampling grid, which divides the physical coordinate space into multiple grid cells, each grid cell being bounded by adjacent horizontal and vertical grid nodes.
4. The automated data extraction method for turbine characteristic curves as described in claim 1, characterized in that collecting trajectory points drawn by the user along the characteristic curve and transforming the trajectory points to the physical coordinate system through the mapping relationship includes: performing a continuous drawing operation along the target characteristic curve using a mouse or touch device; collecting, by the system, trajectory points in real time at fixed time intervals during the user's drawing process and recording the pixel coordinates of each trajectory point in the image pixel coordinate system; for each acquired trajectory point, converting the pixel coordinates into the corresponding physical coordinates using the established mapping relationship between the pixel coordinate system and the physical coordinate system; arranging all the converted trajectory points in chronological order to form the representation of the user trajectory in the physical coordinate space: $$P(t) = \big(x(t),\, y(t)\big), \quad t \in [0, T]$$ where $t$ is the sampling time point, $T$ is the total drawing time, $(x(t), y(t))$ are the physical coordinates obtained by converting the trajectory point collected at time $t$, and $P(t)$ represents the user's trajectory in physical coordinates.
5. The automated data extraction method for turbine characteristic curves as described in claim 1, characterized in that, for multiple trajectory points falling within the same grid cell, the representative sampling point is determined according to any of the following rules: the first trajectory point to enter the grid cell; the last trajectory point to leave the grid cell; the trajectory point with the smallest distance to the center of the grid cell; the median trajectory point along the trajectory direction within the grid cell.
6. The automated data extraction method for turbine characteristic curves as described in claim 1, characterized in that, The standardization process includes: normalizing the guide vane opening corresponding to the characteristic curve based on the maximum guide vane opening input by the user, and generating standardized data for neural network modeling.
7. The automated data extraction method for turbine characteristic curves as described in claim 1, characterized in that, The method for constructing a nonlinear model of a hydro turbine using a neural network algorithm includes: using unit rotational speed and normalized guide vane opening as inputs, and unit flow rate or unit torque as outputs to construct a feedforward neural network; the feedforward neural network contains one or more hidden layers, the number of neurons in the hidden layers is set according to the complexity of the characteristic curve, the LM algorithm is used for training, and the network performance is evaluated by mean square error.
8. The automated data extraction method for turbine characteristic curves as described in claim 7, characterized in that, in the process of constructing the nonlinear model of the turbine, the model is also extended by combining the runaway characteristic curve, the zero-opening boundary condition, and the zero-speed boundary condition. The expressions for unit torque and unit flow rate under the zero-speed condition are as follows: where the quantities involved are, in order: the unit torque; the unit rotational speed; the unit output; the derivative of the unit output with respect to the unit rotational speed; the unit flow rate; the efficiency; and the derivative of the efficiency with respect to the unit rotational speed.
9. The automated data extraction method for turbine characteristic curves as described in claim 1, characterized in that user trajectory point collection includes a "read line" mode and a "read point" mode: in "read line" mode, the user-drawn trajectory is collected continuously at fixed time intervals; in "read point" mode, the user's click locations are collected point by point as sampling points.
10. An automated data extraction system for turbine characteristic curves based on coordinate calibration, characterized in that it includes: an image input module, used to load images of the characteristic curves of the turbine model; a coordinate calibration module, used to calibrate the coordinates of the image and establish a mapping relationship between the image pixel coordinate system and the physical coordinate system; a sampling grid construction module, used to construct a regular sampling grid in the physical coordinate system according to the set resolution parameters and divide the physical coordinate space into multiple grid cells; a trajectory acquisition module, used to acquire trajectory points drawn by the user along the characteristic curve and convert the trajectory points to the physical coordinate system through the mapping relationship; a constrained sampling module, used to spatially partition the converted trajectory points based on the regular sampling grid and, for multiple trajectory points falling into the same grid cell, retain only one trajectory point as the representative sampling point of that grid cell, so as to obtain discrete data of the characteristic curve; and a model building module, used to standardize the discrete data and, based on the standardized data, construct a nonlinear model of the turbine using a neural network algorithm.