River water level early warning method and device and nonvolatile storage medium

By combining a multimodal language model with multi-step reasoning prompt word templates, the robustness problem of traditional river water level monitoring in complex environments is solved, enabling accurate perception and risk assessment of water level changes, and improving the automation and early warning accuracy of river water level monitoring.

CN122245059APending Publication Date: 2026-06-19STATE GRID BEIJING ELECTRIC POWER CO

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
STATE GRID BEIJING ELECTRIC POWER CO
Filing Date
2026-03-13
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Traditional river water level monitoring methods lack robustness in complex field environments, leading to frequent false alarms and missed alarms, and delayed risk assessment, thus failing to meet the requirements of high-precision and high-timeliness intelligent flood control.

Method used

By employing a multimodal language model combined with multi-step inference prompt word templates, the system identifies fixed reference points and judges water level changes by comparing current monitoring images with baseline images, and then uses time-series trends and confidence levels to determine water level warning results.

Benefits of technology

It enables precise perception and autonomous risk assessment of water level changes in rivers surrounding power transmission towers, improving the automation level of water level monitoring and the accuracy and real-time nature of risk warnings, and avoiding false alarms and missed alarms.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122245059A_ABST
    Figure CN122245059A_ABST
Patent Text Reader

Abstract

This invention discloses a method, device, and non-volatile storage medium for river water level early warning. The method includes: acquiring a current monitoring image of the target river at the current moment and a baseline monitoring image pre-stored in a preset database; encoding the current monitoring image and the baseline monitoring image, and combining them with a preset multi-step inference prompt word template to construct a multimodal input message; inputting the multimodal input message into a preset multimodal language model to obtain the current water level of the target river; and determining the water level early warning result of the target river based on the current water level, wherein the water level early warning result includes whether there is a risk of rising water or not. This invention solves the technical problem that traditional image analysis methods are difficult to achieve robust recognition in complex field scenes, leading to frequent false alarms and missed alarms, and delayed risk assessment.
Need to check novelty before this filing date? Find Prior Art

Description

Technical Field

[0001] This invention relates to the field of water conservancy monitoring and early warning technology, and more specifically, to a method, device and non-volatile storage medium for river water level early warning. Background Technology

[0002] In the fields of river hydrological monitoring and power transmission line safety protection, traditional methods mainly rely on regular manual inspections and fixed threshold sensor networks. Manual inspections are limited by high labor costs, long response cycles, and are greatly affected by severe weather and complex terrain, making it difficult to achieve real-time perception of sudden water level rises during the flood season. While fixed sensors have automation capabilities, they have high deployment density, high maintenance costs, poor adaptability, and are prone to false alarms and missed alarms due to siltation, equipment aging, or environmental interference. Moreover, they can only monitor a single physical quantity and lack the ability to understand the overall situation.

[0003] While current computer vision-based image analysis solutions have seen some application, they generally employ traditional target detection or image difference algorithms. These algorithms lack robustness to complex field environments such as changes in lighting, vegetation occlusion, and water surface reflections. They struggle to accurately identify the relative positional changes of water levels and reference objects, and lack the ability to reason about temporal evolution trends and fuse multi-source information. Consequently, they cannot achieve closed-loop intelligent analysis from "image recognition" to "risk assessment" and then to "decision linkage." Furthermore, current systems generally suffer from the separation of water level monitoring and tower safety assessment, resulting in severe information silos. They have failed to establish a correlation model between river hydrological changes and the status of transmission equipment, leading to delayed early warnings and a lack of coordinated response. This makes it difficult to meet the demands of modern power grids for high-precision, high-timeliness, and highly interpretable intelligent flood control under extreme weather conditions.

[0004] There is currently no effective solution to the above problems. Summary of the Invention

[0005] This invention provides a method, device, and non-volatile storage medium for river water level early warning, which at least solves the technical problem that traditional image analysis methods are difficult to achieve robust recognition in complex field scenarios, resulting in frequent false alarms and missed alarms, and delayed risk assessment.

[0006] According to one aspect of the present invention, a river water level early warning method is provided, comprising: acquiring a current monitoring image of a target river at the current moment and a baseline monitoring image pre-stored in a preset database; encoding the current monitoring image and the baseline monitoring image, and combining them with a preset multi-step inference prompt word template to construct a multimodal input message, wherein the multi-step inference prompt word template is used to guide a preset multimodal language large model to determine the water level; inputting the multimodal input message into the preset multimodal language large model to obtain the current water level status of the target river, wherein the current water level status includes whether the current water level is rising, the confidence level of the current result, and the severity of the current water level; and determining a water level early warning result for the target river based on the current water level status, wherein the water level early warning result includes whether there is a risk of rising water or not.

[0007] Optionally, acquiring the current monitoring image of the target river at the current moment includes: acquiring the original monitoring image of the target river at the current moment; correcting the original monitoring image based on camera parameters in a preset database to obtain a first monitoring image; increasing the contrast of the waterfront boundary of the target river and the fixed reference object of the target river in the first monitoring image to obtain a second monitoring image; and standardizing the second monitoring image based on a preset resolution to obtain the current monitoring image.

[0008] Optionally, the preset multi-step inference prompt template includes the following sub-steps: identifying fixed reference objects that match the current monitoring image and the reference monitoring image, wherein the fixed reference objects are poles, markers, vegetation, or fixed structures on the bank in the target river channel; comparing the relative vertical positions of the water surface edges of the target river channel in the current monitoring image and the reference monitoring image based on the fixed reference objects; determining whether the current water level has risen and the magnitude of the water level change based on the relative vertical positions; determining the severity of the current water level based on the ratio between the magnitude of the water level change and the height of the fixed reference objects, wherein the severity of the current water level includes slight, significant, and severe; calculating the confidence level of this inference process; and outputting the results including whether the current water level has risen, the confidence level of this result, and the severity of the current water level.

[0009] Optionally, based on the current water level, the water level warning result for the target river is determined, including: obtaining multiple historical water level conditions prior to the current water level; and determining the water level warning result as indicating a risk of rising water levels when the water levels in each of the multiple historical water level conditions have risen and the severity of the current water level condition reaches a preset threshold.

[0010] Optionally, it also includes: generating rich media warning content based on the water level warning results, wherein the rich media warning content includes multiple comparison image frames of the current monitoring image and the baseline monitoring image; determining the sending method based on the severity of the current water level, wherein the sending method includes at least one of the following: SMS, telephone, system platform; and sending the rich media warning content to the target user terminal according to the sending method.

[0011] Optionally, it also includes: storing the current monitoring image and the current water level in a preset database; obtaining multiple historical water levels and the corresponding historical monitoring images from the preset database; updating the parameters of the multimodal language large model based on the current monitoring image, the current water level, multiple historical water levels, and the corresponding historical monitoring images to obtain the updated multimodal language large model.

[0012] According to another aspect of the present invention, a river water level early warning device is also provided, comprising: an acquisition module, configured to acquire a current monitoring image of a target river at the current moment and a baseline monitoring image pre-stored in a preset database; a construction module, configured to encode the current monitoring image and the baseline monitoring image, and combine them with a preset multi-step inference prompt word template to construct a multimodal input message, wherein the multi-step inference prompt word template is used to guide a preset multimodal language large model to perform water level judgment; a judgment module, configured to input the multimodal input message into the preset multimodal language large model to obtain the current water level status of the target river, wherein the current water level status includes whether the current water level is rising, the confidence level of the current result, and the severity of the current water level; and a determination module, configured to determine the water level early warning result of the target river based on the current water level status, wherein the water level early warning result includes whether there is a risk of rising water or not.

[0013] According to another aspect of the present invention, a non-volatile storage medium is also provided, the non-volatile storage medium including a stored program, wherein, when the program is running, the device where the non-volatile storage medium is located is controlled to execute any of the above-described river water level early warning methods.

[0014] According to another aspect of the present invention, a computer device is also provided, the computer device including a processor, the processor being configured to run a program, wherein the program executes any of the above-described river water level early warning methods during runtime.

[0015] According to another aspect of the present invention, a computer program product is also provided, including a computer program that, when executed by a processor, implements any of the above-described river water level early warning methods.

[0016] In this embodiment of the invention, a river water level early warning method is adopted. This method acquires the current monitoring image of the target river at the current moment and a baseline monitoring image pre-stored in a preset database. The current monitoring image and the baseline monitoring image are encoded and combined with a preset multi-step inference prompt word template to construct a multimodal input message. The multi-step inference prompt word template guides a preset multimodal language model to determine the water level. The multimodal input message is input into the preset multimodal language model to obtain the current water level of the target river, including whether the current water level is rising, the confidence level of the current result, and the severity of the current water level. Based on the current water level, a water level early warning result for the target river is determined, including whether there is a risk of rising water or not. This achieves the goal of accurate perception and autonomous risk assessment of water level changes in rivers surrounding power transmission towers, thereby improving the automation level of water level monitoring, enhancing the accuracy and real-time performance of risk early warning, and solving the technical problem that traditional image analysis methods struggle to achieve robust recognition in complex field scenarios, leading to frequent false alarms and missed alarms, and delayed risk assessment. Attached Figure Description

[0017] The accompanying drawings, which are included to provide a further understanding of the invention and form part of this application, illustrate exemplary embodiments of the invention and, together with their description, serve to explain the invention and do not constitute an undue limitation thereof. In the drawings:

[0018] Figure 1 A hardware structure block diagram of a computer terminal for implementing a river water level early warning method is shown.

[0019] Figure 2 This is a flowchart illustrating the river water level early warning method provided according to an embodiment of the present invention;

[0020] Figure 3 This is a schematic flowchart of a closed-loop river flood monitoring and early warning method according to an optional embodiment of the present invention;

[0021] Figure 4 This is a structural block diagram of a river water level early warning device provided according to an embodiment of the present invention. Detailed Implementation

[0022] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.

[0023] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0024] According to an embodiment of the present invention, a method for early warning of river water levels is provided. It should be noted that the steps shown in the flowchart in the accompanying drawings can be executed in a computer system such as a set of computer-executable instructions. Furthermore, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be executed in a different order than that shown here.

[0025] The method embodiment provided in Embodiment 1 of this application can be executed on a mobile terminal, computer terminal, or similar computing device. Figure 1 A hardware block diagram of a computer terminal for implementing a river water level early warning method is shown. Figure 1 As shown, the computer terminal 10 may include one or more processors (shown as 102a, 102b, ..., 102n in the figure) (the processor may include, but is not limited to, a microprocessor MCU or a programmable logic device FPGA, etc.) and a memory 104 for storing data. In addition, it may also include: a display, an input / output interface (I / O interface), a universal serial bus (USB) port (which may be included as one of the ports of a BUS bus), a network interface, a power supply, and / or a camera. Those skilled in the art will understand that... Figure 1 The structure shown is for illustrative purposes only and does not limit the structure of the aforementioned electronic device. For example, computer terminal 10 may also include... Figure 1 The more or fewer components shown, or having the same Figure 1 The different configurations shown.

[0026] It should be noted that the aforementioned one or more processors and / or other data processing circuits are generally referred to herein as "data processing circuits". These data processing circuits may be embodied, in whole or in part, in software, hardware, firmware, or any other combination thereof. Furthermore, the data processing circuits may be a single, independent processing module, or may be integrated, in whole or in part, into any other element within the computer terminal 10. As involved in the embodiments of this application, the data processing circuits serve as a processor control mechanism (e.g., selection of a variable resistor termination path connected to an interface).

[0027] The memory 104 can be used to store software programs and modules of application software, such as the program instructions / data storage device corresponding to the river water level early warning method in this embodiment of the invention. The processor executes various functional applications and data processing by running the software programs and modules stored in the memory 104, thereby realizing the river water level early warning method of the aforementioned application. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 104 may further include memory remotely located relative to the processor, and these remote memories can be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.

[0028] The display can be, for example, a touchscreen liquid crystal display (LCD) that allows the user to interact with the user interface of the computer terminal 10.

[0029] Figure 2 This is a flowchart illustrating the river water level early warning method provided according to an embodiment of the present invention, as shown below. Figure 2 As shown, the method includes the following steps:

[0030] Step S201: Obtain the current monitoring image of the target river channel at the current moment and the baseline monitoring image pre-stored in the preset database.

[0031] In this step, the system first retrieves the camera configuration information corresponding to the current monitoring point from the pre-set tower ledger database. This includes the camera's RTSP (Real Time Streaming Protocol) video stream address, installation azimuth, and the path to the baseline monitoring image. The baseline monitoring image is a high-quality image captured by the camera under conditions of no flooding or stable water levels (such as during the dry season or historically safe periods) and manually verified. It is pre-stored in the database and bound to the tower's geographical coordinates, timestamp, and camera intrinsic parameters to ensure spatial and temporal traceability. The monitoring image at the current moment is automatically triggered by the system according to the scheduling strategy. It first captures real-time video frames through the main stream. If this fails, it automatically switches to the backup sub-stream to ensure the reliability of image acquisition. The acquired image is immediately appended with camera encoding, a precise timestamp, and a geographic tag to form a complete data packet. Subsequently, both the current and baseline images undergo uniform distortion correction and enhancement processing, outputting in a standard 640×640 RGB format. This ensures consistency between the two images under different viewing angles, resolutions, and lighting conditions, providing a high-quality, comparable dual input source for subsequent time-series comparative analysis of multimodal large models. This mechanism avoids misjudgments caused by differences in image acquisition conditions, enabling the model to focus on the relative displacement between the water level and fixed reference objects (such as stone piers, railings, and tree trunks), rather than environmental noise, thereby achieving sensitive identification and reliable judgment of subtle water level rise trends.

[0032] Step S202: Encode the current monitoring image and the baseline monitoring image, and combine them with the preset multi-step reasoning prompt word template to construct a multimodal input message. The multi-step reasoning prompt word template is used to guide the preset multimodal language large model to determine the water level.

[0033] In this step, to guide the multimodal large model to accurately understand river water level changes, the current monitoring image and the pre-stored baseline monitoring image are converted into character sequences using Base64 encoding, ensuring that the image data can be embedded into the input structure of the large model in text form. Subsequently, these two images are dynamically combined with a pre-set multi-step reasoning prompt template to construct a structured multimodal input message. This prompt template is carefully designed to guide the model step-by-step in logical reasoning in the form of natural language instructions: first, the model is required to identify a common fixed reference object (such as a stone pier, railing, or tree root) in the two images; second, based on the reference object, it determines the relative displacement direction of the water body edge; then, it assesses the magnitude of the water level change as "minor," "significant," or "drastic"; finally, it outputs a structured JSON result, including "whether it has risen," "confidence level," "severity," and supporting descriptions. Domain terminology and discriminative logic are embedded in the prompts (such as "based on the bottom of the reference object" and "whether the waterline exceeds the upper edge of the stone steps"), enabling the model to overcome the limitations of simple image recognition and possess scene semantic understanding and spatiotemporal reasoning capabilities. By fusing images with prompts as input, the model no longer relies solely on pixel differences but simulates the analytical process of human experts, achieving an intelligent closed loop from "seeing" to "understanding" to "decision-making," significantly improving the accuracy and interpretability of water level judgment in complex field environments.

[0034] Step S203: Input the multimodal input message into the preset multimodal language large model to obtain the current water level of the target river. The current water level includes whether the current water level has risen, the confidence level of the current result, and the severity of the current water level.

[0035] In this step, the constructed multimodal input message—containing Base64 encodings of the current image and the reference image, along with structured multi-step reasoning prompts—is input into the locally deployed multimodal large language model. A multimodal large language model (MLLM) is an AI model capable of simultaneously receiving and jointly understanding multimodal inputs such as images and text, and performing semantic reasoning and decision-making based on large-scale parameters and cross-modal alignment capabilities. It extracts semantic features from images through a visual encoder and fuses them with text prompts in a unified semantic space, thereby achieving end-to-end judgments on "whether the water level has risen," "the magnitude of the change," and "confidence level." Unlike traditional computer vision models that can only detect edge or pixel changes, MLLM can understand the scene semantics in images—such as recognizing human-understandable contexts like "the stone pier is a fixed reference" and "the water level has overflowed the steps"—combined with prompts to guide multi-step logical reasoning, ultimately outputting structured results.

[0036] The multimodal language model, based on its powerful vision-language joint understanding capability, follows a logical chain guided by prompt words. It first locates stable ground references in the image, compares their relative positions in images from two different time periods to determine whether the water boundary has risen. It then combines semantic features such as shoreline morphology and water infiltration marks on references to comprehensively assess the water level change trend. The model outputs analysis results in structured JSON format, explicitly including three core pieces of information: first, "whether the current water level is rising," represented by a Boolean value (true / false) to indicate the dynamic trend of the water level; second, "the confidence level of this result," generated by the model's internal attention and probability scoring mechanisms, reflecting the certainty of the judgment result, with a value between 0 and 1, used to quantify the reliability of the analysis; and third, "the severity of the current water level," using graded labels to describe the intensity of the water level change, aligning with the risk level classification in power flood control practices. This output not only provides a judgment conclusion but also includes traceable semantic evidence, enabling the system to distinguish between real water level rise and light and shadow interference, avoiding false alarms.

[0037] Step S204: Based on the current water level, determine the water level warning result for the target river channel, wherein the water level warning result includes whether there is a risk of rising water or not.

[0038] In this step, the final warning is not generated directly based on a single judgment result from the MLLM output. Instead, it is based on three core indicators: "whether the current water level is rising," "confidence level," and "severity level," combined with a multi-rule fusion decision-making mechanism for comprehensive analysis to ultimately determine whether there is a risk of rising water in the target river. If the model output confidence level is lower than a preset threshold (e.g., 0.75), the system will automatically trigger a reanalysis process to avoid misjudgments due to noise interference or low-quality images. When the confidence level meets the threshold, the system further compares the results of multiple recent analyses to identify whether the trend is consistent. If it is a single abrupt change without historical trend support, it is judged as an occasional disturbance and no warning is triggered. If the water level shows a steady rise or drastic changes multiple times in a row, it is confirmed as a real rising water trend, and the system judges it as "there is a risk of rising water" and simultaneously marks the severity level. Conversely, if the model consistently outputs "no rise" or the change is slight and the confidence level is consistently low, it is judged as "there is no risk of rising water."

[0039] Through the above steps, the goal of accurately sensing and autonomously assessing the water level changes in the river channels surrounding the power transmission towers was achieved. This improved the automation level of water level monitoring, enhanced the accuracy and real-time performance of risk warnings, and solved the technical problem that traditional image analysis methods are difficult to use in complex field scenarios, leading to frequent false alarms and missed alarms, and delayed risk assessment.

[0040] As an optional embodiment, acquiring the current monitoring image of the target river at the current moment includes: acquiring the original monitoring image of the target river at the current moment; correcting the original monitoring image based on camera parameters in a preset database to obtain a first monitoring image; increasing the contrast of the waterfront boundary of the target river and the fixed reference object of the target river in the first monitoring image to obtain a second monitoring image; and standardizing the second monitoring image based on a preset resolution to obtain the current monitoring image.

[0041] Optionally, to ensure high consistency and analyzability of the images input to the multimodal large model, the original image at the current moment can first be obtained from the RTSP video stream of the target river monitoring camera. If the main stream capture fails, it automatically switches to the backup sub-stream to ensure data continuity, and the image is appended with camera encoding, timestamp, and geographic tags to ensure traceability. Subsequently, the intrinsic parameter calibration data of the camera in the preset ledger database is called to perform lens distortion correction on the original image, eliminating geometric distortion caused by fisheye or wide-angle distortion, restoring the true spatial proportion, and generating the first monitoring image. On this basis, to enhance the recognition of the waterfront boundary and fixed reference objects (such as stone piers, guardrails, and tree roots) under low light, haze, or backlight conditions, the CLAHE (Contrast Limited Adaptive Histogram Equalization) algorithm can be used to nonlinearly enhance local areas of the image, significantly improving the grayscale contrast at the water-land interface and key reference objects, forming the second monitoring image, making it easier for the model to capture subtle water level changes. Finally, the enhanced images are uniformly scaled and padded to a standardized resolution of 640×640 pixels, maintaining the input format consistent with that used during MLLM training to ensure stable and reliable model inference. The entire preprocessing workflow constitutes a closed-loop, adaptive, and highly robust image preparation chain. It not only eliminates input noise caused by device differences and environmental interference but also provides a high-quality, standardized visual input foundation for subsequent multimodal intelligent analysis through geometric correction and semantic enhancement. This is a crucial preliminary step for achieving accurate water level judgment.

[0042] As an optional embodiment, the preset multi-step inference prompt template includes the following sub-steps: identifying fixed reference objects that match the current monitoring image and the reference monitoring image, wherein the fixed reference objects are poles, markers, vegetation, or fixed structures on the bank in the target river channel; comparing the relative vertical positions of the water surface edges of the target river channel in the current monitoring image and the reference monitoring image based on the fixed reference objects; determining whether the current water level has risen and the magnitude of the water level change based on the relative vertical positions; determining the severity of the current water level based on the ratio between the magnitude of the water level change and the height of the fixed reference objects, wherein the severity of the current water level includes slight, significant, and severe; calculating the confidence level of this inference process; and outputting the results including whether the current water level has risen, the confidence level of this result, and the severity of the current water level.

[0043] Optionally, the prompts first guide the MLLM to accurately identify reference structures in the current and historical baseline images that match the power-specific fixed structures such as pole foundations, grounding grids, slope protection piers, and flood control piles. These structures have long-term stability and are not easily destroyed in flood season images, far superior to ordinary riverside trees or rocks. Subsequently, using the fixed elevations of these power facilities as geometric benchmarks, the relative vertical offset of the water surface edge on them is accurately compared, rather than relying on the fuzzy "waterline" for identification, thus avoiding misjudgments caused by water flow fluctuations and reflection confusion. By calculating the rise ratio of the water level on known elevation marks such as pole grounding bolts and inspection platform edges, three severity levels can be automatically quantified: "minor" (not exceeding 20cm), "significant" (20–50cm), and "severe" (>50cm or overflowing the foundation), achieving direct alignment with the flood control design standards for power poles. Meanwhile, the model dynamically generates the confidence level of the inference based on the clarity of the reference object, the continuity of the waterfront edge, and the consistency of the background lighting, and refuses to output low-confidence results when there is heavy rain, glare, or severe occlusion.

[0044] As an optional embodiment, the water level warning result of the target river is determined based on the current water level, including: obtaining multiple historical water level conditions before the current water level; and determining that there is a risk of rising water levels when the water levels of the multiple historical water levels have risen and the severity of the current water level reaches a preset threshold.

[0045] Optionally, the water level warning does not rely on a single model judgment, but constructs an intelligent decision-making mechanism based on the time series trend to avoid false alarms caused by instantaneous interference. After obtaining the severity (slight, obvious, severe) and confidence level of the current water level, it automatically retraces the results of multiple recent historical analyses to form a water level change sequence within a dynamic sliding window. For example, only when three or more consecutive historical analyses show that the water level is on a stable upward trend and the severity of the current result reaches the "obvious" or "severe" level, is it comprehensively determined that there is a risk of rising water; if there is only a single severe change but the previous trend is stable, or although the trend is upward but the current severity does not reach the threshold (such as only "slight"), it is determined as environmental disturbance or normal fluctuation and no warning is triggered. This mechanism simulates the empirical logic of "looking at trends and emphasizing confirmation" of power inspection personnel and has strong robustness against common pseudo-rising water disturbances such as water surface reflection, tree branch swaying, and fog occlusion around transmission towers. At the same time, by combining confidence filtering and trend consistency verification, it effectively avoids false alarms caused by poor single-frame image quality or occasional model misjudgments, ensures that each warning has sufficient time series evidence support, and truly realizes the precise and engineering intelligent warning of "only alarm when it rises steadily and responds only to drastic mutations", greatly enhancing the reliability and authority of operation and maintenance decisions during the flood season.

[0046] As an optional embodiment, it further includes: generating rich media warning content based on the water level warning result, where the rich media warning content includes multiple comparison image frames of the current monitoring image and the reference monitoring image; determining the sending method based on the severity of the current water level, where the sending method includes at least one of the following: text message, phone call, system platform; and sending the rich media warning content to the target user terminal according to the sending method.

[0047] Optionally, it can automatically extract the aligned water bank areas in the current monitoring image and the historical reference image to generate a dynamic comparison GIF or a sliding comparison graph, intuitively showing the rising process of the water level on power-specific reference objects such as the tower foundation and slope protection gabions, supplemented by a natural language summary stating that "the water level has risen 42 cm compared to the reference and has reached the 'obvious' level, only 8 cm from the top surface of the tower foundation", enabling non-technical users to quickly understand the risk situation. Subsequently, it can intelligently match a hierarchical push strategy based on the severity of the water level: when it is determined as "slight", it is only pushed to the work order system of the operation and maintenance team platform for daily inspection tracking; once it reaches the "obvious" or "severe" level, it immediately triggers multi-channel collaborative alarms - simultaneously sending double reminders of text messages and voice calls to the person in charge of the local power supply station to ensure that key information is not missed, while popping up a highlighted alarm window on the power grid dispatching large screen and the equipment health monitoring platform, and automatically associating the tower number, geographical coordinates, and emergency contact information to achieve precise reach of the "image + data + notification" trinity.

[0048] As an optional embodiment, the method further includes: storing the current monitoring image and the current water level in a preset database; obtaining multiple historical water levels and the corresponding historical monitoring images from the preset database; and updating the parameters of the multimodal language large model based on the current monitoring image, the current water level, the multiple historical water levels, and the corresponding historical monitoring images to obtain the updated multimodal language large model.

[0049] Optionally, a closed-loop, self-optimizing intelligent monitoring system can be constructed. This system stores images and water level results from each effective early warning—including current monitoring images, structured water level results output by the large model (e.g., "rising, confidence level 0.89, significant severity"), along with corresponding timestamps and geographic information—in a highly reliable spatiotemporal database, forming a traceable and reusable power flood control knowledge base. Historical water level change sequences and their corresponding images can be periodically extracted from this database, especially those samples manually verified as "real rises" or "false alarm corrections," for incremental visual cue fine-tuning of the locally deployed multimodal language model. Unlike traditional full-parameter retraining, incremental visual cue fine-tuning only adjusts lightweight cue vectors related to visual semantic alignment, enhancing the model's ability to identify changes in transmission tower-specific reference objects (such as grounding bolts and gabions) and water level boundaries under complex outdoor lighting conditions. By continuously injecting positive and negative samples from real-world scenarios, the model gradually learns to distinguish between "real water level rise" and noise patterns such as "water flow fluctuations," "reflection interference," and "leaf occlusion," automatically optimizing its inference logic and confidence judgment boundaries. This makes subsequent analyses more aligned with the experience and understanding of power inspection experts. This mechanism achieves a sustainable evolutionary closed loop of "monitoring-feedback-learning-upgrade," enabling the multimodal model to continuously adapt to flood season environments of different regions and tower structures without the need for manual annotation of massive amounts of data. This significantly improves the system's generalization ability and early warning accuracy during long-term operation.

[0050] In conjunction with the above optional embodiments, a closed-loop method for monitoring and early warning of rising water levels in rivers is also provided. Figure 3 This is a schematic flowchart of a closed-loop river flood monitoring and early warning method according to an optional embodiment of the present invention, as shown below. Figure 3As shown, based on a pre-built tower register database, each monitoring point's camera is configured with a timed image capture and analysis task, and a hash round-robin algorithm is used to balance the scheduling of computing resources. The camera is triggered to capture raw images according to the scheduling strategy, and an adaptive retry and failover mechanism ensures successful acquisition. Standardized preprocessing pipeline operations are performed on successfully acquired images to output standard images of uniform quality. Using a locally deployed Qwen-VL multimodal large model, combined with time-series comparison and structured prompt word engineering, intelligent identification and risk assessment of water level changes are achieved. When a risk of rising water is determined, the system automatically associates information such as the responsible entity and geographical location in the register, and integrates external rainfall data to generate rich media warning content (such as comparative GIFs and natural language summaries). Based on the warning level and the recipient's role, the system adaptively selects the push channel (such as SMS, telephone, platform notification), and ensures that key information is delivered first according to a hierarchical strategy. Warning information is pushed to existing operation and maintenance monitoring platforms (such as Zabbix and Prometheus Alertmanager) through standard interfaces to achieve automatic creation of alarm work orders and large-screen visualization. The system supports iterative optimization of models and decision rules based on subsequent feedback, continuously improving monitoring accuracy and early warning timeliness.

[0051] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that the present invention is not limited to the described order of actions, because according to the present invention, some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily essential to the present invention.

[0052] Through the above description of the embodiments, those skilled in the art can clearly understand that the river water level early warning method according to the above embodiments can be implemented by means of software plus necessary general-purpose hardware platform. Of course, it can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, or network device, etc.) to execute the methods described in the various embodiments of the present invention.

[0053] According to embodiments of the present invention, a river water level early warning device for implementing the above-described river water level early warning method is also provided. Figure 4 This is a structural block diagram of a river water level early warning device provided according to an embodiment of the present invention, such as... Figure 4As shown, the device includes: an acquisition module 41, a construction module 42, a judgment module 43, and a determination module 44. The device will be described below.

[0054] The acquisition module 41 is used to acquire the current monitoring image of the target river at the current moment and the baseline monitoring image pre-stored in the preset database.

[0055] The construction module 42, connected to the acquisition module 41, is used to encode the current monitoring image and the reference monitoring image, and combine them with the preset multi-step reasoning prompt word template to construct a multimodal input message. The multi-step reasoning prompt word template is used to guide the preset multimodal language large model to judge the water level.

[0056] The judgment module 43, connected to the construction module 42, is used to input multimodal input messages into a preset multimodal language large model to obtain the current water level of the target river. The current water level includes whether the current water level has risen, the confidence level of the current result, and the severity of the current water level.

[0057] The determination module 44, connected to the judgment module 43, is used to determine the water level warning result of the target river based on the current water level. The water level warning result includes whether there is a risk of rising water or not.

[0058] It should be noted that the acquisition module 41, construction module 42, judgment module 43, and determination module 44 mentioned above correspond to steps S201 to S204 in the embodiments. Multiple modules and their corresponding steps implement the same instances and application scenarios, but are not limited to the content disclosed in the above embodiments. It should also be noted that the above modules, as part of the device, can run on the computer terminal 10 provided in the embodiments.

[0059] Embodiments of the present invention may provide a computer device. Optionally, in this embodiment, the computer device may be located in at least one of a plurality of network devices in a computer network. The computer device includes a memory and a processor.

[0060] The memory can be used to store software programs and modules, such as the program instructions / modules corresponding to the river water level early warning method and device in this embodiment of the invention. The processor executes various functional applications and data processing by running the software programs and modules stored in the memory, thereby realizing the aforementioned river water level early warning method. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory may further include memory remotely located relative to the processor, and these remote memories can be connected to a computer terminal via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.

[0061] The processor can access information and applications stored in memory via a transmission device to perform the following steps: acquiring the current monitoring image of the target river channel at the current moment and the baseline monitoring image pre-stored in a preset database; encoding the current monitoring image and the baseline monitoring image, and combining them with a preset multi-step inference prompt word template to construct a multimodal input message, wherein the multi-step inference prompt word template is used to guide a preset multimodal language model to determine the water level; inputting the multimodal input message into the preset multimodal language model to obtain the current water level of the target river channel, wherein the current water level includes whether the current water level is rising, the confidence level of the current result, and the severity of the current water level; based on the current water level, determining the water level warning result of the target river channel, wherein the water level warning result includes whether there is a risk of rising water or not.

[0062] Optionally, the processor may also execute program code for the following steps: acquiring the current monitoring image of the target river at the current moment, including: acquiring the original monitoring image of the target river at the current moment; correcting the original monitoring image based on camera parameters in a preset database to obtain a first monitoring image; increasing the contrast of the waterfront boundary of the target river and the fixed reference object of the target river in the first monitoring image to obtain a second monitoring image; and standardizing the second monitoring image based on a preset resolution to obtain the current monitoring image.

[0063] Optionally, the processor may also execute program code with the following steps: The preset multi-step inference prompt template includes the following sub-steps: identifying a fixed reference object that matches the current monitoring image and the reference monitoring image, wherein the fixed reference object is a pole, marker, vegetation, or fixed structure on the bank in the target river channel; comparing the relative vertical position of the water surface edge of the target river channel in the current monitoring image and the reference monitoring image with the fixed reference object as a reference; determining whether the current water level has risen and the magnitude of the water level change based on the relative vertical position; determining the severity of the current water level based on the ratio between the magnitude of the water level change and the height of the fixed reference object, wherein the severity of the current water level includes slight, significant, and severe; calculating the confidence level of this inference process; and outputting a result including whether the current water level has risen, the confidence level of this result, and the severity of the current water level.

[0064] Optionally, the processor may also execute program code that performs the following steps: based on the current water level, determine the water level warning result for the target river channel, including: obtaining multiple historical water level conditions prior to the current water level; and determining the water level warning result as indicating a risk of rising water levels when the water levels in the multiple historical water level conditions have risen and the severity of the current water level condition reaches a preset threshold.

[0065] Optionally, the processor may also execute program code that includes the following steps: generating rich media warning content based on the water level warning result, wherein the rich media warning content includes multiple comparison image frames of the current monitoring image and the baseline monitoring image; determining the sending method based on the severity of the current water level, wherein the sending method includes at least one of the following: SMS, telephone, system platform; and sending the rich media warning content to the target user terminal according to the sending method.

[0066] Optionally, the processor may also execute program code that includes the following steps: storing the current monitoring image and the current water level in a preset database; obtaining multiple historical water levels and their corresponding historical monitoring images from the preset database; updating the parameters of the multimodal language model based on the current monitoring image, the current water level, the multiple historical water levels, and their corresponding historical monitoring images to obtain the updated multimodal language model.

[0067] This invention provides a method for river water level early warning. It acquires a current monitoring image of the target river at the current moment and a baseline monitoring image pre-stored in a preset database. The current and baseline monitoring images are encoded and combined with a preset multi-step inference prompt template to construct a multimodal input message. The multi-step inference prompt template guides a preset multimodal language model to determine the water level. The multimodal input message is input into the preset multimodal language model to obtain the current water level of the target river, including whether the water level is rising, the confidence level of the result, and the severity of the current water level. Based on the current water level, a water level early warning result is determined for the target river, including whether there is a risk of rising water or not. This method achieves accurate perception and autonomous risk assessment of water level changes in rivers surrounding power transmission towers, thereby improving the automation level of water level monitoring and enhancing the accuracy and real-time performance of risk warnings. It also solves the technical problem that traditional image analysis methods struggle to achieve robust recognition in complex field scenarios, leading to frequent false alarms and missed alarms, and delayed risk assessment.

[0068] Those skilled in the art will understand that all or part of the steps in the various methods of the above embodiments can be implemented by a program instructing the hardware related to the terminal device. The program can be stored in a non-volatile storage medium, which may include: flash drive, read-only memory (ROM), random access memory (RAM), disk or optical disk, etc.

[0069] Embodiments of the present invention also provide a non-volatile storage medium. Optionally, in this embodiment, the aforementioned non-volatile storage medium can be used to store the program code executed by the river water level early warning method provided in the above embodiments.

[0070] Optionally, in this embodiment, the non-volatile storage medium may be located in any computer terminal in a group of computer terminals in a computer network, or in any mobile terminal in a group of mobile terminals.

[0071] Optionally, in this embodiment, the non-volatile storage medium is configured to store program code for performing the following steps: acquiring the current monitoring image of the target river channel at the current moment and the baseline monitoring image pre-stored in a preset database; encoding the current monitoring image and the baseline monitoring image, and combining them with a preset multi-step inference prompt word template to construct a multimodal input message, wherein the multi-step inference prompt word template is used to guide the preset multimodal language large model to determine the water level; inputting the multimodal input message into the preset multimodal language large model to obtain the current water level of the target river channel, wherein the current water level includes whether the current water level is rising, the confidence level of the current result, and the severity of the current water level; based on the current water level, determining the water level warning result of the target river channel, wherein the water level warning result includes whether there is a risk of rising water and whether there is no risk of rising water.

[0072] Optionally, in this embodiment, the non-volatile storage medium is configured to store program code for performing the following steps: acquiring the current monitoring image of the target river at the current moment, including: acquiring the original monitoring image of the target river at the current moment; performing correction processing on the original monitoring image based on camera parameters in a preset database to obtain a first monitoring image; improving the contrast of the waterfront boundary of the target river and the fixed reference object of the target river in the first monitoring image to obtain a second monitoring image; and performing standardization processing on the second monitoring image based on a preset resolution to obtain the current monitoring image.

[0073] Optionally, in this embodiment, the non-volatile storage medium is configured to store program code for performing the following steps: The preset multi-step inference prompt template includes the following sub-steps: identifying a fixed reference object that matches the current monitoring image and the reference monitoring image, wherein the fixed reference object is a pole, marker, vegetation, or fixed structure on the bank in the target river channel; comparing the relative vertical position of the water surface edge of the target river channel in the current monitoring image and the reference monitoring image with the fixed reference object as a reference; determining whether the current water level has risen and the magnitude of the water level change based on the relative vertical position; determining the severity of the current water level based on the ratio between the magnitude of the water level change and the height of the fixed reference object, wherein the severity of the current water level includes slight, significant, and severe; calculating the confidence level of this inference process; and outputting a result including whether the current water level has risen, the confidence level of this result, and the severity of the current water level.

[0074] Optionally, in this embodiment, the non-volatile storage medium is configured to store program code for performing the following steps: determining the water level warning result of the target river based on the current water level, including: obtaining multiple historical water level conditions before the current water level; and determining that there is a risk of rising water levels when the water levels of the multiple historical water level conditions rise and the severity of the current water level condition reaches a preset threshold.

[0075] Optionally, in this embodiment, the non-volatile storage medium is configured to store program code for performing the following steps: further including: generating rich media warning content based on the water level warning result, wherein the rich media warning content includes multiple comparison image frames of the current monitoring image and the baseline monitoring image; determining the sending method based on the severity of the current water level, wherein the sending method includes at least one of the following: SMS, telephone, system platform; and sending the rich media warning content to the target user terminal according to the sending method.

[0076] Optionally, in this embodiment, the non-volatile storage medium is configured to store program code for performing the following steps: further including: storing the current monitoring image and the current water level in a preset database; obtaining multiple historical water levels and the historical monitoring images corresponding to each of the multiple historical water levels from the preset database; updating the parameters of the multimodal language large model based on the current monitoring image, the current water level, the multiple historical water levels, and the historical monitoring images corresponding to each of the multiple historical water levels, to obtain the updated multimodal language large model.

[0077] Embodiments of the present invention also provide a computer program product, including a computer program. Optionally, in this embodiment, when the computer program is executed by a processor, it can: acquire a current monitoring image of the target river at the current moment and a baseline monitoring image pre-stored in a preset database; encode the current monitoring image and the baseline monitoring image, and combine them with a preset multi-step inference prompt word template to construct a multimodal input message, wherein the multi-step inference prompt word template is used to guide a preset multimodal language large model to determine the water level; input the multimodal input message into the preset multimodal language large model to obtain the current water level of the target river, wherein the current water level includes whether the current water level is rising, the confidence level of the current result, and the severity of the current water level; based on the current water level, determine the water level warning result of the target river, wherein the water level warning result includes whether there is a risk of rising water and whether there is no risk of rising water.

[0078] The sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0079] In the above embodiments of the present invention, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments.

[0080] In the several embodiments provided in this application, it should be understood that the disclosed technical content can be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division of units can be a logical functional division, and in actual implementation, there may be other division methods. For instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the displayed or discussed mutual coupling, direct coupling, or communication connection may be through some interfaces; the indirect coupling or communication connection between units or modules may be electrical or other forms.

[0081] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0082] Furthermore, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0083] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a non-volatile storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, read-only memory (ROM), random access memory (RAM), portable hard drives, magnetic disks, or optical disks.

[0084] The above description is only a preferred embodiment of the present invention. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.

Claims

1. A method for early warning of river water levels, characterized in that, include: Acquire the current monitoring image of the target river channel at the current moment and the baseline monitoring image pre-stored in the preset database; The current monitoring image and the reference monitoring image are encoded and combined with a preset multi-step reasoning prompt word template to construct a multimodal input message. The multi-step reasoning prompt word template is used to guide the preset multimodal language large model to determine the water level. The multimodal input message is input into the preset multimodal language large model to obtain the current water level of the target river channel, wherein the current water level includes whether the current water level has risen, the confidence level of the current result, and the severity of the current water level. Based on the current water level, a water level warning result for the target river is determined, wherein the water level warning result includes whether there is a risk of rising water or not.

2. The method according to claim 1, characterized in that, The acquisition of the current monitoring image of the target river channel at the current moment includes: Obtain the original monitoring image of the target river channel at the current time; Based on the camera parameters in the preset database, the original surveillance image is corrected to obtain the first surveillance image; Increase the contrast of the waterfront boundary of the target river and the fixed reference point of the target river in the first monitoring image to obtain a second monitoring image; Based on a preset resolution, the second monitoring image is standardized to obtain the current monitoring image.

3. The method according to claim 1, characterized in that, The preset multi-step reasoning prompt template includes the following sub-steps: Identify a fixed reference object that matches the current monitoring image with the reference monitoring image, wherein the fixed reference object is a pole, marker stone, vegetation or fixed structure on the bank in the target river channel; Using the fixed reference point as a benchmark, compare the relative vertical positions of the water surface edge of the target river channel in the current monitoring image and the benchmark monitoring image; Based on the relative vertical position, determine whether the current water level has risen and the magnitude of the water level change; The severity of the current water level is determined based on the ratio between the magnitude of the water level change and the height of the fixed reference point, wherein the severity of the current water level includes slight, significant, and severe. Calculate the confidence level of this reasoning process; The output includes fields indicating whether the current water level has risen, the confidence level of the current result, and the severity of the current water level.

4. The method according to claim 1, characterized in that, The determination of the water level warning result for the target river channel based on the current water level includes: Obtain multiple historical water level data prior to the current water level; If the water level in each of the multiple historical water level conditions has risen, and the severity of the current water level condition reaches a preset threshold, the water level warning result is determined to indicate that there is a risk of rising water levels.

5. The method according to claim 1, characterized in that, Also includes: Based on the water level warning results, rich media warning content is generated, wherein the rich media warning content includes multiple comparison image frames between the current monitoring image and the reference monitoring image; Based on the severity of the current water level, the sending method is determined, wherein the sending method includes at least one of the following: SMS, telephone, system platform; According to the sending method, the rich media warning content is sent to the target user terminal.

6. The method according to any one of claims 1 to 5, characterized in that, Also includes: The current monitoring image and the current water level are stored in the preset database; Obtain multiple historical water level data and corresponding historical monitoring images from the preset database; Based on the current monitoring image, the current water level, the multiple historical water levels, and the historical monitoring images corresponding to each of the multiple historical water levels, the parameters of the multimodal language large model are updated to obtain the updated multimodal language large model.

7. A river water level early warning device, characterized in that, include: The acquisition module is used to acquire the current monitoring image of the target river channel at the current moment and the baseline monitoring image pre-stored in the preset database; The construction module is used to encode the current monitoring image and the reference monitoring image, and combine them with a preset multi-step reasoning prompt word template to construct a multimodal input message. The multi-step reasoning prompt word template is used to guide the preset multimodal language large model to determine the water level. The judgment module is used to input the multimodal input message into the preset multimodal language large model to obtain the current water level of the target river channel, wherein the current water level includes whether the current water level has risen, the confidence level of the current result, and the severity of the current water level. The determination module is used to determine the water level warning result of the target river channel based on the current water level situation, wherein the water level warning result includes whether there is a risk of rising water or not.

8. A non-volatile storage medium, characterized in that, The non-volatile storage medium includes a stored program, wherein, when the program is executed, it controls the device containing the non-volatile storage medium to execute the river water level early warning method according to any one of claims 1 to 6.

9. A computer device, characterized in that, include: Memory and processor The memory stores computer programs; The processor is configured to execute a computer program stored in the memory, wherein when the computer program is executed, the processor performs the river water level early warning method according to any one of claims 1 to 6.

10. A computer program product, comprising a computer program, characterized in that, When the computer program is executed by the processor, it implements the river water level early warning method according to any one of claims 1 to 6.