A waterlogging detection method based on horizontal plane features, medium and equipment
By performing three-dimensional spatial mapping reconstruction of visual image data and detecting the geometric constraints of the horizontal plane, and combining the reference plane information to calculate the quantitative index of water accumulation, the problems of high equipment cost and high false judgment rate in water accumulation detection in traditional methods are solved, and high-precision measurement of water accumulation depth and area is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING QIDAISONG TECH CO LTD
- Filing Date
- 2026-05-15
- Publication Date
- 2026-06-12
AI Technical Summary
Existing technologies struggle to achieve high-density grid-based water accumulation detection within urban areas, and traditional methods cannot accurately determine the physical depth and area of water accumulation, resulting in a high false positive rate.
By performing three-dimensional spatial mapping and reconstruction on visual image data, spatial geometric features are generated and used to detect suspected water accumulation areas that meet the geometric constraints of the horizontal plane. Water accumulation quantification indicators are then calculated from reference plane information and the reference information of environmental reference objects, achieving three-dimensional geometric detection.
The method reduces equipment costs, improves the robustness and accuracy of detection, and provides high-precision measurements of water depth and area, offering reliable decision-making support for flood control scheduling and traffic management.
Smart Images

Figure CN122199547A_ABST
Description
Technical Field
[0001] This invention relates to the field of water accumulation detection technology, and in particular to a method, medium, and equipment for water accumulation detection based on horizontal surface characteristics.
Background Technology
[0002] Real-time and accurate detection of road flooding is a crucial requirement for smart city emergency management. Existing technologies for urban flooding monitoring mainly fall into the following categories. Monitoring methods based on physical sensors deploy water level gauges, ultrasonic sensors, and other equipment in flood-prone road sections for single-point measurements and offer high accuracy. However, these methods are costly, difficult to deploy and maintain, and struggle to achieve high-density grid coverage across urban areas. They also leave numerous blind spots and fail to detect the spread of large-scale flooding.
[0003] Traditional two-dimensional visual image detection methods utilize RGB images captured by surveillance cameras to identify the pixel area ratio of water accumulation regions through image segmentation or object detection algorithms. However, these methods can only perform semantic recognition of water accumulation at the two-dimensional pixel level, losing three-dimensional spatial geometric information. They cannot directly obtain the physical water depth and actual water area, which are of most concern to flood control decisions. At the same time, the surface of water accumulation has strong specular reflection characteristics, and traditional two-dimensional visual models are prone to misidentifying reflections as real physical entities, resulting in a high false positive rate.
[0004] Detection methods based on lidar or depth sensors acquire dense depth information of a scene by actively emitting lasers or structured light, enabling high-precision 3D measurement. However, their equipment costs are too high, exceeding those of ordinary surveillance cameras by tens or even hundreds of times, making large-scale deployment in urban surveillance systems impractical.
[0005] Therefore, how to identify waterlogged areas and measure their physical depth and area with high precision using only existing ordinary visual monitoring equipment in the city, without adding dedicated depth sensors or physical rulers, has become an urgent problem to be solved.
Summary of the Invention
[0006] To address the aforementioned technical problems, the present invention provides a water accumulation detection method based on horizontal surface features, which includes the following steps: S1, perform three-dimensional spatial mapping and reconstruction on the visual image data of the target monitoring area to obtain spatial geometric features, wherein the spatial geometric features are used to characterize the three-dimensional topological structure and / or relative depth relationship of the scene within the target monitoring area.
[0007] S2. Identify, from the spatial geometric features, several suspected water accumulation areas that meet the geometric constraints of the horizontal plane.
[0008] S3. Obtain reference information from the visual image data, wherein the reference information includes reference plane information and / or reference position information of environmental reference objects with known geometric features.
[0009] S4. For each suspected water accumulation area, calculate the water accumulation quantification index of that area relative to the reference plane, using the reference information and the relative spatial position of the area. The water accumulation quantification index includes water accumulation depth and / or water accumulation area.
[0010] S5. Based on the water accumulation quantification indicators corresponding to all suspected water accumulation areas, the water accumulation detection results of the target monitoring area are obtained.
[0011] The present invention also provides a non-transitory computer-readable storage medium storing at least one instruction or at least one program, wherein the at least one instruction or at least one program is loaded and executed by a processor to implement the above-described method for detecting water accumulation based on horizontal surface features.
[0012] The present invention also provides an electronic device, including a processor and the aforementioned non-transitory computer-readable storage medium.
[0013] This invention has at least the following beneficial effects. It obtains spatial geometric features by reconstructing ordinary visual image data in three dimensions, eliminating the need for expensive hardware such as LiDAR or depth sensors; three-dimensional geometric information of a scene can be acquired using only a monocular camera, significantly reducing deployment and maintenance costs. By detecting suspected water accumulation areas that meet horizontal plane geometric constraints from the spatial geometric features, it transforms the physical tendency of a water body to form a horizontal surface into quantifiable three-dimensional geometric constraints, moving from two-dimensional pixel-level semantic segmentation to three-dimensional spatial geometric detection. This effectively filters out false detections caused by water reflections and lighting changes, improving detection robustness. By acquiring reference benchmark information, including reference plane information and / or reference position information of environmental reference objects, it obtains a quantification reference without pre-deploying physical scales or relying on historical depth maps from the same camera position. Furthermore, it calculates water accumulation quantification indicators from the relative spatial relationship between the reference benchmark information and the suspected water accumulation area, solving the problem that traditional two-dimensional vision methods cannot obtain true physical depth and area, and providing a high-precision basis for flood control scheduling and traffic control decisions.
Attached Figure Description
[0014] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0015] Figure 1 is a flowchart of the water accumulation detection method based on horizontal surface features provided in Embodiment 1 of the present invention.
Detailed Implementation
[0016] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0017] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It is understood that, where appropriate, the terms used to distinguish similar objects can be interchanged so that the invention can also be implemented in other embodiments besides the illustrated or described embodiments. Furthermore, the terms "including," "having," and any variations are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or server that includes a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to these processes, methods, products, or devices.
[0018] Embodiment 1. This embodiment provides a water accumulation detection method based on horizontal surface features. As shown in Figure 1, the method includes the following steps: S1, perform three-dimensional spatial mapping and reconstruction on the visual image data of the target monitoring area to obtain spatial geometric features, wherein the spatial geometric features are used to characterize the three-dimensional topological structure and / or relative depth relationship of the scene within the target monitoring area.
[0019] The target monitoring area is a specific geographical area that is equipped with image acquisition devices (such as surveillance cameras) and requires water accumulation detection. It is typically a flood-prone section of urban road, such as a low-lying area, bridge culvert, tunnel entrance or exit, or underpass. In practice, the target monitoring area is the scene covered by a fixed camera's field of view, including the road surface and surrounding environmental reference objects (such as curbs and vehicles). By collecting and analyzing visual image data within this area, water accumulation can be identified and quantified.
[0020] Visual image data consists of two-dimensional color image data collected by ordinary RGB cameras deployed in the target monitoring area. It contains rich visual information such as the color, texture, and outline of objects in the scene.
[0021] Spatial geometric features are data structures that abstractly represent the 3D shape, surface undulations, spatial positions, and mutual occlusion relationships of objects in a scene after reconstruction through 3D spatial mapping. They are used to characterize the 3D topological structure and / or relative depth relationships of the scene within the target monitoring area. Specifically, the 3D topological structure refers to the connection relationships, boundary contours, and overall shape construction of various objects such as ground, water accumulation, vehicles, curbs, and traffic barriers within the target monitoring area in 3D space, used to distinguish the distribution of different objects in 3D space. The relative depth relationship refers to the distance from the real physical point corresponding to different pixels in the image to the camera's optical center, reflecting the relative positions of points in the scene (front / back, high / low), providing a comparative benchmark for detecting abnormal water surface features.
[0022] While a single 2D image itself lacks direct absolute scale information, the "image texture-depth" mapping pattern learned from massive amounts of data by deep learning models can predict a physically consistent relative or absolute depth distribution. Combined with pre-set camera calibration parameters, and using a rigorous geometric inverse projection formula, the 2D pixel array is reconstructed into a 3D spatial point set, thus achieving 3D spatial mapping reconstruction. This reconstruction process enables ordinary cameras to perceive true physical scale, replacing the 3D data that requires specialized hardware such as LiDAR or stereo cameras.
[0023] In one specific embodiment, the spatial geometric features are a three-dimensional point cloud model, and S1 includes the following steps: S11, input the visual image data into the preset monocular depth estimation network to obtain the continuous depth map corresponding to the visual image data.
[0024] S12, based on the camera imaging geometry principle and pre-calibrated camera intrinsic and extrinsic parameters, performs three-dimensional spatial mapping on the two-dimensional pixel coordinates of the visual image data and the corresponding depth values in the continuous depth map, generating a three-dimensional point cloud model representing the three-dimensional topological structure of the scene.
[0025] The monocular depth estimation network can be implemented using a convolutional neural network or a visual Transformer network based on an encoder-decoder architecture. The encoder performs multi-level downsampling on the input image to extract high-level semantic features, while the decoder progressively upsamples these features to restore the original input resolution and outputs pixel-by-pixel depth estimates. A skip connection is used between the encoder and decoder to pass intermediate feature maps from each level of the encoder to the corresponding level of the decoder, fusing shallow detail information with deep semantic information to improve the accuracy of depth estimation. During training, the monocular depth estimation network is supervised through a dataset containing tens of thousands of pairs of RGB images and their corresponding real depth maps, which are acquired by LiDAR or depth sensors. The network weight parameters are updated using a backpropagation algorithm with a scale-invariant loss function and / or an edge-aware loss function as the optimization objective. After training, the monocular depth estimation network is capable of predicting pixel-by-pixel depth values for any input RGB image. For example, a monocular depth estimation network can specifically use pre-trained models such as MiDaS or DPT, where the encoder uses ResNet-50 or Vision Transformer-Base as the backbone network, the decoder uses a progressive upsampling module, and the output depth map has the same size as the input image.
[0026] Specifically, visual image data (a single-frame RGB image) of the target monitoring area is input into a pre-trained monocular depth estimation network. The network performs multi-layer convolution (or self-attention computation) and downsampling encoding on the input image to extract high-level semantic features, then performs upsampling decoding to restore the original image resolution. Finally, it outputs a depth estimate for each pixel, and the depth values of all pixels are assembled into a continuous depth map spatially aligned with the original image. The value of each pixel in the continuous depth map represents the estimated distance from the corresponding real physical point to the camera's optical center, and serves as the key intermediate for transforming two-dimensional pixel coordinates into a three-dimensional point cloud.
[0027] The camera imaging geometry principle is the classic pinhole imaging model, which describes how any point in three-dimensional space is projected onto the two-dimensional imaging plane through the camera's optical center to form corresponding pixel coordinates. The inverse process is the theoretical basis for recovering a three-dimensional point from a two-dimensional image. The camera intrinsic parameters include the focal lengths $f_x$, $f_y$ and the principal point coordinates $c_x$, $c_y$, which describe the internal optical and sensor characteristics of the camera. The extrinsic parameters include the camera's rotation matrix and translation vector, which describe the camera's pose in the real-world coordinate system and are used to establish a precise mathematical mapping relationship between two-dimensional pixel coordinates (u, v) and three-dimensional spatial coordinates (X, Y, Z).
[0028] During camera imaging, a point P(X, Y, Z) in 3D space is projected onto the image plane at pixel coordinates (u, v). By performing the mathematical inverse operation of this process, given the pixel coordinates (u, v) and the depth Z of the point, the X and Y coordinates of the original 3D point P can be deduced. By performing the inverse operation on the entire image, each pixel in the 2D image can be restored to a geometric point in 3D space, thus completing the 3D reconstruction of the entire scene.
[0029] Specifically, the camera's intrinsic and extrinsic parameters obtained in advance through camera calibration are loaded, including the focal lengths $f_x$, $f_y$ and the principal point coordinates $c_x$, $c_y$. Each pixel (u, v) in the continuous depth map is traversed and its depth value Z = D(u, v) is read. Based on the camera imaging geometry principle (the inverse transformation of the pinhole imaging model), the three-dimensional spatial coordinates (X, Y, Z) of the pixel are calculated as $X = (u - c_x) \times Z / f_x$, $Y = (v - c_y) \times Z / f_y$, $Z = D(u, v)$. All calculated 3D coordinate points are aggregated, and the original RGB color value corresponding to each point is optionally retained, to form the final 3D point cloud model.
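The inverse pinhole projection described above can be illustrated with a minimal sketch. It assumes the depth map is available as a NumPy array and that only the intrinsic parameters are applied, so the resulting points are expressed in the camera coordinate frame; the function name `backproject_depth` and the sample parameter values are hypothetical and are not taken from the embodiment.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Back-project an H x W depth map into an (H*W, 3) array of camera-frame points.

    Row index is v, column index is u; the point for pixel (u, v) is stored
    at index v * W + u, which doubles as a pixel-to-point lookup table.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # u: columns, v: rows
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx  # X = (u - c_x) * Z / f_x
    y = (v - cy) * z / fy  # Y = (v - c_y) * Z / f_y
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Hypothetical 2 x 3 depth map with a constant depth of 2.0
depth = np.full((2, 3), 2.0)
points = backproject_depth(depth, fx=2.0, fy=2.0, cx=1.0, cy=0.5)
```

Applying the extrinsic rotation and translation to these points, which the sketch omits, would express them in the real-world coordinate system.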
[0030] As described above, by using a monocular depth estimation network to perform three-dimensional spatial mapping and reconstruction on ordinary visual image data and combining it with camera parameters, spatial geometric features representing the three-dimensional topological structure and relative depth relationship of the scene are generated. This enables the system to accurately measure the physical area and depth of water accumulation using only existing surveillance cameras, solving the problem that traditional two-dimensional vision methods cannot obtain real physical quantitative indicators.
[0031] S2. Identify, from the spatial geometric features, several suspected water accumulation areas that meet the geometric constraints of the horizontal plane.
[0032] The horizontal plane geometric constraint condition is a set of geometric criteria for identifying water accumulation areas in the spatial geometric features, based on the physical law that "the free surface of a water body tends to form a horizontal plane perpendicular to the direction of gravity under the action of gravity". It transforms this natural physical law into detection rules that can be quantified and executed by a computer.
[0033] Water surfaces have a smooth, specular reflective property, which makes it difficult for monocular depth estimation networks to match the correct texture points, resulting in output depth values that are significantly lower than the actual road surface or are completely missing. In 3D point cloud models, this phenomenon manifests as downward-facing "pits" or "cavities" in areas that should be flat. Therefore, by detecting this specific 3D geometric anomaly, water accumulation can be identified with high confidence.
[0034] In one specific embodiment, S2 includes the following steps: S21, obtain the depth gradient map corresponding to the continuous depth map, and determine the pixels with depth gradient magnitude greater than the preset gradient threshold as depth jump anomaly points according to the depth gradient map, and / or determine the continuous pixel region with depth value lower than the preset effective depth lower limit as depth missing anomaly region.
[0035] S22, map depth jump anomalies and / or depth missing anomalies to the 3D point cloud model to obtain the corresponding local point cloud regions.
[0036] S23, fit the reference road surface plane equation in the three-dimensional point cloud model, and calculate the vertical distance from each point in the local point cloud region to the reference road surface plane equation.
[0037] S24, when the vertical distance is less than the preset negative threshold, the local point cloud region is identified as a suspected water accumulation region that meets the geometric constraints of the horizontal plane.
[0038] Normal road surfaces exhibit rich texture, allowing the monocular depth estimation network to output continuous and smooth depth values with relatively small depth gradient magnitudes. At the edge of the water, however, specular reflection causes drastic texture changes, so the network's output depth jumps abruptly from normal to invalid values, producing extremely large depth gradient magnitudes. In the interior of the water, specular reflection eliminates texture, preventing the network from matching corresponding points and resulting in near-zero depth values that form large hole regions. By employing two complementary anomaly detection mechanisms, gradient jumps and depth loss, the anomaly signals produced by the water surface in the continuous depth map can be captured comprehensively.
[0039] Specifically, spatial gradient operations are performed on the continuous depth map (such as convolving the depth map using the Sobel operator) to obtain a depth gradient map. The value of each pixel in the depth gradient map reflects the rate of change of the depth at that location along the horizontal or vertical direction, and is used to highlight edge regions where the depth value in the continuous depth map changes drastically.
[0040] The depth gradient magnitude is the magnitude of the gradient vector at each pixel location in the depth gradient map, quantifying the drastic change in depth value at that location. By iterating through each pixel in the depth gradient map, if the depth gradient magnitude of that pixel is greater than a preset gradient threshold, it is marked as a depth jump anomaly.
[0041] Simultaneously, each pixel in the continuous depth map is traversed. If the depth value of a pixel is lower than a preset effective lower depth limit, it is included in the connected component search of the depth missing anomaly region. In the continuous depth map, pixels with depth values continuously lower than the preset effective lower depth limit constitute a continuous hole region, which is also the depth missing anomaly region.
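The two detection mechanisms of S21 can be sketched as follows. This is an illustrative example under stated assumptions rather than the implementation of the embodiment: the Sobel response is divided by 8 so that the magnitude is in depth units per pixel, borders are handled by edge replication, and 4-connectivity is used for the connected component search. All function names and the sample thresholds are hypothetical.

```python
import numpy as np
from collections import deque

def sobel_gradient_magnitude(depth):
    """Depth gradient magnitude from 3x3 Sobel kernels (edge-replicated borders)."""
    d = np.pad(depth.astype(np.float64), 1, mode="edge")
    # Horizontal derivative: right column minus left column, weighted 1-2-1.
    gx = (d[:-2, 2:] + 2 * d[1:-1, 2:] + d[2:, 2:]
          - d[:-2, :-2] - 2 * d[1:-1, :-2] - d[2:, :-2])
    # Vertical derivative: bottom row minus top row, weighted 1-2-1.
    gy = (d[2:, :-2] + 2 * d[2:, 1:-1] + d[2:, 2:]
          - d[:-2, :-2] - 2 * d[:-2, 1:-1] - d[:-2, 2:])
    return np.hypot(gx, gy) / 8.0  # assumption: normalise to depth units per pixel

def connected_regions(mask):
    """Group True pixels of a boolean mask into 4-connected lists of (row, col)."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    regions = []
    for r in range(h):
        for c in range(w):
            if not mask[r, c] or seen[r, c]:
                continue
            queue, region = deque([(r, c)]), []
            seen[r, c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

def detect_depth_anomalies(depth, grad_threshold, min_valid_depth):
    """Return the depth-jump mask and the list of depth-missing regions."""
    jump_mask = sobel_gradient_magnitude(depth) > grad_threshold
    missing_regions = connected_regions(depth < min_valid_depth)
    return jump_mask, missing_regions

# Hypothetical normalised depth map: valid road on the left, a hole on the right
depth = np.ones((5, 5))
depth[:, 3:] = 0.0
jump_mask, missing_regions = detect_depth_anomalies(depth, grad_threshold=0.2, min_valid_depth=0.05)
```

In this toy input the jump mask marks the two columns straddling the boundary, and the hole forms a single missing region of ten pixels.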
[0042] By using the mapping table from two-dimensional pixel coordinates to three-dimensional point cloud point indices, the pixel positions corresponding to the detected depth jump anomalies and depth missing anomalies are directly looked up in the three-dimensional spatial points of the three-dimensional point cloud model to form local point cloud regions.
[0043] In the 3D point cloud model, a subset of point clouds deemed healthy road surfaces (not covered by standing water) is selected. A robust plane fitting algorithm is used to fit the healthy road surface point cloud, yielding a baseline road surface plane equation representing "the height the road surface should be at if there were no standing water." For local point cloud regions, the vertical distance from each point to this baseline plane is calculated to quantitatively describe the deviation of each point in the local point cloud region from its expected height. Points on normal road surfaces have vertical distances close to zero, while points in water-covered areas show negative vertical distances due to distorted depth estimation.
[0044] Because water surfaces should remain horizontal under gravity and not be higher than the surrounding road surface, in a 3D point cloud model, the water surface region appears as a continuous depression located below the reference road surface plane. By checking whether the vertical distance between points within the region is less than a preset negative threshold, the true water surface depression can be effectively distinguished from local point cloud noise, ensuring high confidence in the detection.
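Steps S23 and S24 can be illustrated with the sketch below. It makes several assumptions that the text does not specify: an ordinary least-squares fit via singular value decomposition stands in for the "robust plane fitting algorithm" (a robust scheme such as RANSAC could wrap the same fit), the points are expressed in a frame where an "up" direction is known, and a region is judged by the median signed distance of its points. The function names and sample values are hypothetical.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through (N, 3) points: unit normal n and offset d with n.p + d = 0."""
    centroid = points.mean(axis=0)
    # The right singular vector of the smallest singular value is the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, -float(normal.dot(centroid))

def signed_heights(points, normal, d, up):
    """Signed distance of each point to the plane, positive on the side 'up' points to."""
    if normal.dot(up) < 0:  # orient the normal consistently with 'up'
        normal, d = -normal, -d
    return points @ normal + d

def is_suspected_water(region_points, normal, d, up, negative_threshold=-0.05):
    # Assumption: the region is judged by the median signed height of its points.
    median_height = float(np.median(signed_heights(region_points, normal, d, up)))
    return median_height < negative_threshold

# Hypothetical data: a flat dry road patch at height 0 and a small depressed region
xs, ys = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 1, 5))
road = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)])
pit = np.array([[0.5, 0.5, -0.10], [0.6, 0.5, -0.12]])
up = np.array([0.0, 0.0, 1.0])

normal, d = fit_plane(road)
pit_heights = signed_heights(pit, normal, d, up)
```

With these inputs the depressed region lies about 0.1 m below the fitted plane and is flagged, while the road patch itself is not.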
[0045] The specific value of the preset gradient threshold can be set by the implementer based on the range and accuracy of the depth values output by the monocular depth estimation network. For example, when the depth value is normalized to the interval [0, 1], the preset gradient threshold ranges from 0.1 to 0.3, with a preferred value of 0.2. When the depth value is expressed in actual physical scales (such as meters), the preset gradient threshold ranges from 0.5m to 2.0m.
[0046] The specific value of the preset effective depth lower limit can be set by the implementer based on the output characteristics of the monocular depth estimation network for invalid regions. For example, when the monocular depth estimation network outputs a depth value close to 0 for pixels that cannot be reliably estimated, the preset effective depth lower limit can be set to 0.01m to 0.1m; when the monocular depth estimation network outputs a fixed identifier value (such as -1) for invalid regions, pixels with this identifier value are directly identified as depth loss anomalies.
[0047] The specific value of the preset negative threshold can be set by the implementer based on the point cloud density and measurement accuracy of the 3D point cloud model. The preset negative threshold usually ranges from -0.05m to -0.2m. When the point cloud model has high accuracy and high point cloud density, a threshold of smaller magnitude (such as -0.05m) can be used; when the point cloud model has a certain amount of noise, a threshold of larger magnitude (such as -0.15m) can be used to tolerate small point cloud fluctuations and avoid missing real water surface areas.
[0048] The above-mentioned method transforms the failure of monocular depth estimation caused by water surface reflection from a passive defect into an active detection feature by detecting depth jump anomalies and depth missing anomalies, thus achieving comprehensive capture of abnormal signals in water accumulation areas. By mapping depth anomaly areas to a 3D point cloud model and using a reference plane for vertical distance threshold determination, the method effectively distinguishes between real water surface depressions and point cloud noise, significantly improving the robustness and accuracy of detection under complex lighting and reflection environments.
[0049] S3, obtain reference benchmark information from visual image data.
[0050] S4. For each suspected water accumulation area, calculate the water accumulation quantification index of that area relative to the reference plane, using the reference benchmark information and the relative spatial position of the area.
[0051] The reference benchmark information includes reference plane information and / or reference position information of environmental reference objects with known geometric features. It provides a stable and traceable reference system for the physical quantification of water depth and area, so that relative measurements can be converted into absolute physical quantities.
[0052] The reference plane information is a mathematical expression of the spatial plane equation obtained by fitting the point cloud of a healthy road surface that is not covered by water. It is used to establish a virtual waterless road surface in three-dimensional spatial geometry, serving as a geometric reference for measuring the downward depression depth and planar extension range of the water-covered area.
[0053] The reference position information is the precise spatial coordinate range occupied by the environmental reference object in the three-dimensional spatial geometry. It is used to determine the position of the environmental reference object in three-dimensional space and to provide spatial positioning for subsequent truncation plane detection and height measurement.
[0054] The quantitative indicators of water accumulation include water depth and / or water area.
[0055] In one specific embodiment, S3 includes the following steps: S311, extract the point cloud of healthy road surface not covered by water from the spatial geometric features.
[0056] S312, Generate a reference road surface plane equation by fitting the point cloud of the healthy road surface, and use the plane represented by the reference road surface plane equation as the reference plane information.
[0057] The healthy road surface point cloud is the subset of points in the 3D point cloud model that belong to normal dry road surface. It excludes non-road or abnormal objects such as water accumulation areas, vehicles, pedestrians, and curb stones. It serves as the raw data for fitting the reference plane, ensuring that the fitting result reflects the geometry of the real road surface.
[0058] The reference road surface plane equation is the final spatial plane mathematical expression obtained by integrating and fitting the point clouds of all healthy road surfaces. As the overall reference plane information, it is used to provide a unique and deterministic geometric benchmark for calculating the depth and area of water accumulation.
[0059] In one specific embodiment, S311 includes the following steps: S3111 divides the 3D point cloud model into several point cloud sub-regions and calculates the mean direction and variance of the normal vector of each point cloud sub-region.
[0060] S3112, the point cloud sub-regions whose mean normal vector direction and preset gravity direction have an angle less than a preset angle threshold and whose normal vector variance is less than a preset variance threshold are identified as candidate road surface regions.
[0061] S3113, perform plane fitting on the candidate road surface regions to obtain several local plane equations.
[0062] S3114, among the planes represented by the local plane equations, the point cloud of any plane whose number of in-plane points is greater than a preset threshold and whose in-plane point elevation distribution is continuous is determined as the healthy road surface point cloud that is not covered by water.
[0063] In this context, a point cloud sub-region is a cluster of local points obtained by dividing a 3D point cloud model according to certain rules. This sub-region reduces the computational cost of subsequent normal vector calculations, maintains geometric consistency within a local area, and improves the accuracy of region classification. Specifically, dividing the 3D point cloud model into several point cloud sub-regions can include: performing voxel downsampling on the 3D point cloud model, treating the point cloud within each voxel mesh as a point cloud sub-region; or segmenting the 3D point cloud model based on a region growing algorithm, treating each connected point cluster obtained as a point cloud sub-region.
[0064] Voxel downsampling divides the 3D space into cubic units by setting the side length of the voxel mesh (e.g., 0.1m to 0.5m). The point cloud within each unit constitutes a point cloud sub-region, offering the advantage of high computational efficiency. Region growing algorithms select seed points and expand to the neighborhood based on normal vector continuity or curvature thresholds, aggregating points with consistent geometric features into a point cloud sub-region, offering the advantage of high segmentation accuracy. Implementers can choose one or a combination of these methods depending on scene complexity and computational resources.
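The voxel-based division into point cloud sub-regions can be sketched as below. The sketch assumes the point cloud is an (N, 3) NumPy array in metres and simply groups point indices by voxel cell; the function name `voxel_subregions` and the sample coordinates are hypothetical, and the region growing alternative is not shown.

```python
import numpy as np
from collections import defaultdict

def voxel_subregions(points, voxel_size=0.2):
    """Group point indices by the cubic voxel cell each point falls in.

    Returns a dict mapping an integer cell key (i, j, k) to an array of
    indices into `points`; each entry is one point cloud sub-region.
    """
    keys = np.floor(points / voxel_size).astype(np.int64)
    cells = defaultdict(list)
    for idx, key in enumerate(map(tuple, keys.tolist())):
        cells[key].append(idx)
    return {key: np.asarray(indices) for key, indices in cells.items()}

# Hypothetical points: two share a cell, the other two fall in separate cells
points = np.array([
    [0.05, 0.05, 0.0],
    [0.15, 0.10, 0.0],
    [0.25, 0.05, 0.0],
    [0.05, 0.05, 0.3],
])
subregions = voxel_subregions(points, voxel_size=0.2)
```

A smaller `voxel_size` yields more, finer sub-regions at higher computational cost, which matches the trade-off described for the 0.1m to 0.5m range.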
[0065] By dividing the point cloud into local sub-regions and statistically analyzing their normal vector distributions, different objects (road surface, curb, vehicles, puddles, etc.) that were originally mixed together can be initially separated in the feature space. The mean direction of the normal vector of the sub-region of the horizontal road surface should be close to vertically upward and have a small variance; the direction of the normal vector of the sub-region of the curb side should be close to horizontal and have a small variance; the direction of the normal vector of the vehicle surface sub-region is chaotic or has a large variance.
[0066] Specifically, the average direction of the surface normal vectors of all points within the point cloud sub-region is calculated to obtain the mean direction of the normal vectors. The deviation between the normal vectors of each point within the point cloud sub-region and the mean direction is statistically analyzed to obtain the variance of the normal vectors. The smaller the variance of the normal vectors, the smoother the surface of the sub-region.
[0067] The preset gravity direction is a pre-defined vertical direction vector in three-dimensional space, typically (0, 0, 1) or (0, -1, 0), serving as a reference direction for determining the road surface's levelness. The preset angle threshold is a critical angle value used to determine whether the mean direction of the normal vector of a sub-region is sufficiently close to the gravity direction. When the angle is less than this preset angle threshold, the sub-region is considered to generally meet the requirements for a level road surface orientation. The preset variance threshold is a critical value for determining the smoothness within a sub-region. When the variance of the normal vector is less than this threshold, the surface of the sub-region is considered sufficiently flat, belonging to the road surface rather than a rough or undulating object.
[0068] For each point cloud sub-region, perform a dual judgment: whether the angle between the mean direction of the normal vector and the preset gravity direction is less than a preset angle threshold; and whether the variance of the normal vector is less than a preset variance threshold. Point cloud sub-regions that pass the dual judgment are marked as candidate road surface regions.
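The dual judgment can be illustrated with a minimal sketch. It assumes that per-point normals have already been estimated, are non-zero, and are oriented consistently (pointing upward for ground points). The variance is taken here as the mean squared angular deviation from the mean direction, which is one possible definition and an assumption of this sketch; the 10 degree and 0.01 thresholds are likewise hypothetical.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def angle_between(a, b):
    """Angle in radians between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(normalize(a), normalize(b)))
    return math.acos(max(-1.0, min(1.0, dot)))


def is_candidate_road(normals, up=(0.0, 0.0, 1.0),
                      max_angle_deg=10.0, max_variance=0.01):
    """Return True if a sub-region passes both the angle and the variance test."""
    total = tuple(sum(n[i] for n in normals) for i in range(3))
    if math.sqrt(sum(c * c for c in total)) < 1e-12:
        return False  # empty region or normals that cancel out
    mean_dir = normalize(total)
    # Assumed variance definition: mean squared angular deviation (rad^2).
    variance = sum(angle_between(n, mean_dir) ** 2 for n in normals) / len(normals)
    tilt_deg = math.degrees(angle_between(mean_dir, up))
    return tilt_deg < max_angle_deg and variance < max_variance
```

With these thresholds, a flat road patch passes both tests, a curb side face fails the angle test, and a surface whose normals scatter widely around the vertical fails the variance test.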
[0069] For each candidate road surface region, a random sampling consensus algorithm is used for plane fitting to generate the corresponding local plane equation. Through iterative random sampling and verification, the random sampling consensus algorithm can robustly fit the best plane equation from a point cloud containing a certain proportion of outliers, and can further eliminate a small number of non-road points that may exist in the candidate road surface region.
[0070] Specifically, the random sampling consensus algorithm performs plane fitting through the following iterative steps: Three non-collinear points are randomly selected from the point cloud of the candidate road surface region to determine an initial plane equation; the distance from each of the remaining points in the candidate road surface region to this initial plane equation is calculated, and points whose distance is less than a preset interior point distance threshold are included in the interior point set; the above random sampling and interior point counting process is repeated, and the plane equation with the largest number of interior points is selected as the optimal plane for this iteration; using the interior point set corresponding to this optimal plane, the final local plane equation is obtained by refitting using the least squares method. The preset interior point distance threshold can be set according to the measurement accuracy of the point cloud, for example, set to 0.02m to 0.1m.
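The iterative steps above can be sketched as follows. This is a minimal pure-Python illustration, not a definitive implementation: the function names, the iteration count, and the fixed random seed are assumptions, and the least squares refit uses the form z = a*x + b*y + c, which presumes a non-vertical plane (reasonable for road surface candidates).

```python
import random


def plane_from_points(p, q, r):
    """Plane (a, b, c, d) with unit normal through three points; None if collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-12:
        return None
    a, b, c = (comp / norm for comp in n)
    return (a, b, c, -(a * p[0] + b * p[1] + c * p[2]))


def point_plane_distance(plane, pt):
    a, b, c, d = plane
    return abs(a * pt[0] + b * pt[1] + c * pt[2] + d)  # normal has unit length


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def refit_least_squares(pts):
    """Fit z = a*x + b*y + c via the normal equations (Cramer's rule)."""
    sx = sum(p[0] for p in pts)
    sy = sum(p[1] for p in pts)
    sz = sum(p[2] for p in pts)
    sxx = sum(p[0] * p[0] for p in pts)
    sxy = sum(p[0] * p[1] for p in pts)
    syy = sum(p[1] * p[1] for p in pts)
    sxz = sum(p[0] * p[2] for p in pts)
    syz = sum(p[1] * p[2] for p in pts)
    base = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(pts))]]
    rhs = [sxz, syz, sz]
    det = det3(base)
    if abs(det) < 1e-12:
        raise ValueError("degenerate inlier set (e.g. vertical plane)")
    coeffs = []
    for col in range(3):
        m = [row[:] for row in base]
        for row in range(3):
            m[row][col] = rhs[row]
        coeffs.append(det3(m) / det)
    a, b, c = coeffs
    norm = (a * a + b * b + 1.0) ** 0.5
    # Rewrite as a*x + b*y + c*z + d = 0 with an upward unit normal.
    return (-a / norm, -b / norm, 1.0 / norm, -c / norm)


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Return (plane, inliers) for the plane supported by the most inliers."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # collinear sample, draw again
        inliers = [p for p in points if point_plane_distance(plane, p) < threshold]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 3:
        raise ValueError("no plane found")
    return refit_least_squares(best), best
```

The 0.05 m default lies within the 0.02 m to 0.1 m interior point distance range given above.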
[0071] A genuine road surface should contain a large number of points and extend continuously across the terrain. The preset quantity threshold is the minimum number of inliers that a local plane must contain to be considered large enough, and is used to exclude small planes such as vehicle roofs and median barriers. A continuous elevation distribution means that the point cloud height values in the area covered by the plane change smoothly along the road's extension direction, without abrupt jumps. This criterion excludes suspended or fragmented planes caused by occlusion or mis-segmentation, so that the point cloud finally retained can be treated with high confidence as healthy road surface.
[0072] In one specific embodiment, S4 includes the following steps: S411, construct a three-dimensional polygon based on the boundary points of the current suspected waterlogged area, and use the area of the three-dimensional polygon as the waterlogged area of the current suspected waterlogged area.
[0073] S412, calculate the vertical distance of each point in the current suspected waterlogged area relative to the reference road surface plane equation.
[0074] S413, determine the water depth of the suspected waterlogged area based on the vertical distance.
[0075] The boundary of the waterlogged area defines its spread across the ground. Boundary points of the point cloud of the suspected waterlogged area are extracted, sorted by spatial proximity, and connected to construct a closed 3D polygon. The area of this 3D polygon is calculated (either by projecting it onto a reference plane or by directly calculating the 3D surface area), and this area is taken as the waterlogged area of the suspected waterlogged area.
[0076] Specifically, boundary points can be extracted from the point cloud of the suspected water accumulation area using either the Alpha shape algorithm or the concave hull detection algorithm. The Alpha shape algorithm involves rolling a circle with a preset radius α outside the point cloud; the points touched by the circle are the boundary points. A larger value of α results in a coarser extracted boundary, while a smaller value yields a finer one. The concave hull detection algorithm extracts boundary points by finding the smallest concave polygon that encloses the point set, and is suitable for non-convex water accumulation areas. The specific value of the preset radius α can be set by the implementer based on the average spacing of the point cloud, for example, set to 2 to 5 times the average spacing of the point cloud.
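Once the boundary points have been extracted and ordered, the area of the closed three-dimensional polygon can be computed directly. The following sketch assumes that the vertices are already ordered along the boundary and are approximately coplanar (as expected for a water surface); it uses the standard vector-area formula for planar polygons, and the function name is hypothetical.

```python
import math


def polygon_area_3d(vertices):
    """Area of a closed planar polygon given ordered (x, y, z) vertices."""
    sx = sy = sz = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1, z1 = vertices[i]
        x2, y2, z2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        # Accumulate the cross product of consecutive vertex vectors.
        sx += y1 * z2 - z1 * y2
        sy += z1 * x2 - x1 * z2
        sz += x1 * y2 - y1 * x2
    return 0.5 * math.sqrt(sx * sx + sy * sy + sz * sz)
```

The formula also holds for non-convex outlines, which matters because water accumulation areas are often concave.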
[0077] The reference road surface plane represents the road surface height in a waterless state. For each three-dimensional point in the current suspected waterlogged area, the vertical distance from the point to the plane is calculated using the point-to-plane distance formula.
[0078] The absolute value of the vertical distance from each point in the suspected waterlogged area to the reference road surface plane is taken, and the resulting values are statistically analyzed. The maximum, average, or median value is then selected as the water depth of the suspected waterlogged area, depending on the application requirements. Specifically, the maximum vertical distance reflects the water depth at the lowest point of the waterlogged area and is suitable for determining whether a vehicle may bottom out; the average distance reflects the overall water depth of the waterlogged area and is suitable for assessing the risk of wading through the water; the median is robust to outliers and is suitable for scenarios in which individual points are noisy.
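The distance calculation and the choice of statistic can be sketched as follows. The plane is assumed to be given as coefficients (a, b, c, d) of a*x + b*y + c*z + d = 0; the function names and the `mode` parameter are illustrative.

```python
import math
import statistics


def signed_distance(plane, pt):
    """Signed point-to-plane distance for plane a*x + b*y + c*z + d = 0."""
    a, b, c, d = plane
    return (a * pt[0] + b * pt[1] + c * pt[2] + d) / math.sqrt(a * a + b * b + c * c)


def water_depth(points, plane, mode="median"):
    """Water depth of a region as the max, mean, or median absolute distance."""
    distances = [abs(signed_distance(plane, p)) for p in points]
    if mode == "max":
        return max(distances)
    if mode == "mean":
        return statistics.mean(distances)
    if mode == "median":
        return statistics.median(distances)
    raise ValueError("mode must be 'max', 'mean', or 'median'")
```

With one outlying point among otherwise shallow readings, the median stays close to the typical depth while the maximum follows the outlier.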
[0079] The above method filters healthy road surface point clouds by using the mean direction and variance of the normal vectors of point cloud sub-regions and fits them to a reference plane, providing an accurate three-dimensional spatial geometric reference for the quantitative calculation of water accumulation depth and area. It constructs three-dimensional polygons by using the boundary points of suspected water accumulation areas and calculates their areas, as well as calculates the vertical distance of each point in the suspected water accumulation area relative to the reference road surface plane equation and determines the water accumulation depth accordingly. This allows the water accumulation spread range and depth to be output with real physical values, solving the technical deficiency of traditional two-dimensional methods that can only provide pixel percentages.
[0080] In another specific embodiment, the environmental reference objects with known geometric features include at least vehicle wheel hubs and curb stones, and S3 includes the following steps: S321, input the visual image data into the preset target detection network, identify and locate environmental reference objects with known geometric features in the target monitoring area, and obtain the two-dimensional bounding box of the environmental reference objects in the visual image data.
[0081] S322, Based on the two-dimensional bounding box, extract the local image region corresponding to the environmental reference object from the visual image data.
[0082] S323, perform edge detection and / or contour fitting on the local image region to extract geometric feature information of the environmental reference object.
[0083] S324, based on the geometric feature information and the three-dimensional spatial mapping relationship between the two-dimensional pixel coordinates of the visual image data and the spatial geometric features, determine the reference position information of the environmental reference object in the spatial geometric features.
[0084] Among them, environmental reference objects with known geometric features refer to objects that exist naturally or are commonly deployed in the scene and have fixed shapes and sizes that can be known in advance, such as vehicle wheel hubs (round, about 40cm-60cm in diameter) and curb stones (straight, about 15cm-30cm in height). These can replace artificially placed physical rulers and serve as natural geometric benchmarks for the quantitative calculation of water accumulation.
[0085] The target detection network is a pre-trained deep learning model that identifies objects of specified categories in an input image and outputs their rectangular bounding boxes. It can be implemented with a single-stage detection architecture (such as the YOLO series) or a two-stage detection architecture (such as Faster R-CNN). Taking the YOLOv5 network as an example, its structure consists of three parts: a backbone network, a neck network, and a detection head. The backbone network uses the CSPDarknet53 structure, containing multiple convolutional layers and cross-stage partial (CSP) connection modules, to extract multi-level visual feature maps from the input image. The neck network uses a feature pyramid structure and a path aggregation structure to fuse the feature maps of different scales output by the backbone network, enhancing its ability to represent both small and large objects. Based on the fused feature maps, the detection head predicts the object category probability, bounding box coordinates, and confidence score for each anchor box. In the training phase, the target detection network is trained on a labeled image dataset containing environmental reference objects (such as vehicle wheel hubs and curb stones), in which each image is annotated with the category label and bounding box coordinates of each environmental reference object. The weighted sum of the classification loss and the localization loss is used as the total loss function, and the network parameters are updated through backpropagation and gradient descent. After training, the target detection network can recognize and locate environmental reference objects of the specified categories in an input image.
[0086] Vehicle wheel hubs have known circular features, while curb stones have known straight line segment features. Edge detection and contour fitting are performed on local image regions to extract precise geometric feature information of environmental references. For example, Hough circle transform is used to detect the circular contour of the vehicle wheel hub, obtaining the center coordinates and radius; straight line segment detection is used for the curb stone, obtaining the endpoint coordinates and extension direction.
[0087] By combining the established mapping relationship between two-dimensional pixel coordinates and three-dimensional spatial coordinates, geometric feature information is mapped to three-dimensional space, thereby determining the reference position information of environmental reference objects in spatial geometric features.
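One common form of this mapping is pinhole back-projection, sketched below under stated assumptions: the intrinsic parameters `fx`, `fy`, `cx`, `cy` are hypothetical names for the calibrated focal lengths and principal point, the depth is the value from the continuous depth map at that pixel, and the result is expressed in the camera coordinate frame (a further extrinsic transform would be needed for a world frame).

```python
def backproject_pixel(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with depth (metres) to camera-frame (x, y, z)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def pixel_length_to_metres(length_px, depth, fx):
    """Approximate metric size of an image-plane length at a given depth.

    Assumes the measured object is roughly parallel to the image plane.
    """
    return length_px * depth / fx
```

For example, a wheel hub circle detected with a radius of 50 pixels at a depth of 3.0 m and a focal length of 600 pixels corresponds to a radius of about 0.25 m, which is consistent with the 40 cm to 60 cm hub diameter range mentioned above.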
[0088] In another specific embodiment, S4 includes the following steps: S421, Determine the spatial position of the environmental reference object in the spatial geometric features based on the reference position information of the environmental reference object in the spatial geometric features.
[0089] S422, based on the spatial position, and along the vertical direction of the environmental reference object, determine the location of the abrupt change in point cloud density or depth value formed by the water surface in the spatial geometric features as the horizontal truncation plane, and obtain the absolute height of the horizontal truncation plane in the spatial geometric features.
[0090] S423: Obtain the reference height of the reference road surface plane, and use the elevation difference between the absolute height and the reference height as the water depth of the current suspected water accumulation area.
[0091] Here, the water surface forms a clear boundary on the vertical face of the reference object. The part of the reference object above the water retains a complete 3D point cloud, while the region at and below the water surface shows missing points or abrupt depth changes due to specular reflection. Based on this physical phenomenon, the water surface height can be determined directly by scanning the change in point cloud density or depth value along the vertical direction of the reference object.
[0092] A sudden change in point cloud density refers to a dramatic shift in the number of point clouds at the boundary between an environmental reference facade and the water surface, where the facade retains point clouds while the water surface produces almost none. A sudden change in depth value refers to a situation where the point cloud on the facade above the boundary has a normal depth value, while the depth value of the point cloud in the water surface below the boundary shows an abnormal jump due to estimation distortion.
[0093] The horizontal truncation plane is the horizontal boundary formed where the water surface intersects the vertical face of a reference object. The Z value at which the depth changes abruptly or the point cloud goes missing along this boundary is the absolute height of the plane.
[0094] The reference height is the road surface height calculated from the reference road surface plane equation at the bottom position of the reference object, representing the road surface height at this location when there is no water. The elevation difference between the absolute height and the reference height is used as the water depth of the currently suspected waterlogged area.
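The elevation difference can be written out as a short sketch. It assumes that the Z axis is the vertical direction, that the reference road surface plane is given as (a, b, c, d) with a*x + b*y + c*z + d = 0, and that (ref_x, ref_y) is the horizontal position of the bottom of the reference object; all names are illustrative.

```python
def road_height_at(plane, x, y):
    """Height z of the reference road surface plane at horizontal position (x, y)."""
    a, b, c, d = plane
    if abs(c) < 1e-9:
        raise ValueError("plane is vertical; no unique road height")
    return -(a * x + b * y + d) / c


def depth_from_truncation_plane(water_z, plane, ref_x, ref_y):
    """Water depth = absolute water surface height minus the reference height."""
    return water_z - road_height_at(plane, ref_x, ref_y)
```

Evaluating the plane at the reference object's own position, rather than using a single global height, accounts for a sloped road surface.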
[0095] In one specific embodiment, the environmental reference is the vehicle wheel hub, and S422 includes: Based on the center coordinates and radius of the circular profile of the vehicle wheel hub, along the vertical axis of the vehicle wheel hub, the horizontal dividing line of the point cloud density change or depth value change formed by the water surface in the spatial geometric features is detected, and the height of the horizontal dividing line is determined as the absolute height.
[0096] In this system, the vehicle wheel hub is a rigid, circular object perpendicular to the ground. When water submerges the bottom of the hub, the water surface and the hub's vertical surface form a clear spatial segment. The upper half of the hub is above the water, and the point cloud is complete. The lower half is submerged, and due to specular reflection from the water surface, monocular depth estimation cannot generate a valid point cloud. By scanning the density or depth changes within a local spatial range along the hub's vertical axis, the height of the segmented plane can be precisely located, thus directly obtaining the absolute height value of the water surface.
[0097] Specifically, the system acquires the reference position information of the vehicle wheel hub, including the center coordinates and radius in three-dimensional space. Using the center coordinates as a reference point, a vertical axis along the direction of gravity is determined. Along the vertical axis, point clouds within a local spatial range are sampled point by point from the center height downwards with a first preset step size. At each sampling height, the number of point clouds within a preset window thickness centered at that height is counted. The curve of point cloud density changing with height is recorded, and the height position where the point cloud density drops sharply is detected. This abrupt change position is the horizontal dividing line where the water surface intersects the wheel hub, and the height value of the horizontal dividing line is determined as the absolute height of the water surface.
[0098] The specific value of the first preset step size can be set by the implementer according to the actual situation. For example, when the spatial resolution of the 3D point cloud model is high (e.g., the average spacing of the point cloud is about 0.3cm) and the application scenario requires the water depth measurement accuracy to be at the millimeter level, it can be set to 0.5cm; when the spatial resolution of the 3D point cloud model is medium (e.g., the average spacing of the point cloud is about 1.0cm) and the application scenario requires the water depth measurement accuracy to be at the centimeter level, the first preset step size can be set to 1.0cm.
[0099] The specific value of the preset window thickness can be set by the implementer based on the first preset step size and the point cloud density. The preset window thickness is typically set to 1.0 to 2.0 times the first preset step size. When the preset window thickness equals the first preset step size, adjacent sampling windows connect perfectly without overlap; when the preset window thickness is greater than the first preset step size, adjacent sampling windows partially overlap, which can increase the statistical sample size and improve the stability of density estimation, but will increase the computational load. For example, when the first preset step size is 1.0 cm, the preset window thickness can be set to 1.0 cm to 2.0 cm.
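The downward density scan along the wheel hub's vertical axis can be sketched as follows. The sketch assumes that the heights (Z values, in metres) of the points lying within the local spatial range around the axis have already been collected. A "sharp drop" is defined here as the window count falling below a fraction `drop_ratio` of the count at the center height, which is an assumption of this sketch rather than a criterion stated above.

```python
def find_water_surface_height(z_values, center_z, radius,
                              step=0.01, window=0.01, drop_ratio=0.3):
    """Scan downward from the hub center and return the height of the density drop.

    Returns None if no drop is found within one radius below the center.
    """
    half = window / 2.0

    def count_at(h):
        # Number of points inside the window centered at height h.
        return sum(1 for z in z_values if abs(z - h) <= half)

    baseline = count_at(center_z)
    if baseline == 0:
        return None  # no usable points at the hub center
    n_steps = int(radius / step)
    for i in range(1, n_steps + 1):
        h = center_z - i * step
        if count_at(h) < drop_ratio * baseline:
            # Place the boundary midway between the last dense and first sparse sample.
            return h + step / 2.0
    return None
```

The localization error of this scheme is bounded by roughly one step, which is why the first preset step size is chosen according to the required depth accuracy. A curbstone scan from bottom to top would follow the same pattern with the direction reversed.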
[0100] In one specific embodiment, the environmental reference object is a curbstone, and S422 includes: based on the extension direction of the straight edge of the curbstone, the vertical face of the curbstone is determined in the spatial geometric features; the boundary position of the point cloud density change or depth value change formed by the water surface is detected along the vertical face of the curbstone, and the height corresponding to the boundary position is determined as the absolute height.
[0101] Curbstones are typical rigid vertical boundaries, extending along the road edge with highly regular linear geometry and a vertical orientation. When the road surface is flooded, the water surface and the curbstone form a very distinct intersection line. Below this intersection line, due to the specular reflection of the water surface, depth estimation of the lower region of the curbstone produces voids or anomalies; above the intersection line, the curbstone facade can still stably generate a continuous point cloud. The boundary position can be accurately located by scanning the vertical facade.
[0102] Specifically, the reference position information of the curbstone is obtained, including the endpoint coordinates and extension direction vector of the straight edge of the curbstone in three-dimensional space. Starting from the straight edge, a spatial plane perpendicular to the road surface is constructed by extending downwards in the vertical direction; this plane is the vertical face of the curbstone. On this vertical face, point cloud samples are collected vertically from bottom to top at a second preset step size. At each sampling height, the depth distribution or density of the point cloud near that height is statistically analyzed. The boundary position at which the point cloud density or depth value changes drastically is detected, and the height value of this boundary position is determined as the absolute height of the water surface.
[0103] The specific value of the second preset step size can be set by the implementer according to the actual situation. The setting principle is the same as that of the first preset step size, and it can be the same or different from the first preset step size.
[0104] As described above, by using a target detection network to identify naturally existing environmental reference objects in the scene and extract their geometric features, the system can obtain a spatial measurement benchmark without setting up a physical scale, thus realizing scale-free water depth measurement. By detecting abrupt changes in point cloud density or depth values along the vertical direction of the reference object to locate the horizontal truncated plane, the physical features at the interface between the water surface and the reference object are converted into quantitative detection signals, thus achieving accurate acquisition of the true height of the water surface.
[0105] S5. Based on the water accumulation quantification indicators corresponding to all suspected water accumulation areas, the water accumulation detection results of the target monitoring area are obtained.
[0106] Specifically, the water accumulation quantification index of each suspected water accumulation area is compared with the preset warning threshold. When the water accumulation quantification index of a suspected water accumulation area exceeds the preset warning threshold, water accumulation warning information containing the location information and water accumulation quantification index of the suspected water accumulation area is generated as the water accumulation detection result of the target monitoring area.
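The threshold comparison can be illustrated with a minimal sketch. The dictionary keys (`location`, `depth_m`, `area_m2`) and the 0.15 m default warning threshold are hypothetical; only the depth indicator is compared here, and an area threshold could be handled the same way.

```python
def build_warnings(regions, depth_threshold_m=0.15):
    """Return warning records for regions whose water depth exceeds the threshold."""
    warnings = []
    for region in regions:
        if region["depth_m"] > depth_threshold_m:
            warnings.append({
                "location": region["location"],
                "depth_m": region["depth_m"],
                "area_m2": region.get("area_m2"),  # optional indicator
            })
    return warnings
```

Regions at or below the threshold produce no record, so an empty list indicates that no warning is required for the target monitoring area.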
[0107] In practical applications, flood warning information can also be sent to terminal devices, allowing these devices to display the warnings to users. Furthermore, a flood distribution map containing the location information of each suspected flooded area, along with its corresponding water depth and / or area, can be generated as the flood detection result for the target monitoring area.
[0108] The above-mentioned method obtains spatial geometric features by reconstructing three-dimensional spatial mapping from ordinary visual image data. This eliminates the need for expensive hardware such as LiDAR or depth sensors, and can acquire three-dimensional geometric information of the scene using only a monocular camera, significantly reducing deployment and maintenance costs. By detecting suspected water accumulation areas that meet the geometric constraints of a horizontal plane from the spatial geometric features, the physical property of water bodies tending towards a horizontal surface is transformed into quantifiable three-dimensional geometric constraints. This achieves a technological leap from two-dimensional pixel semantic segmentation to three-dimensional spatial geometric detection, effectively filtering out false detections caused by water reflections and changes in lighting, and improving detection robustness. By acquiring reference benchmark information, including reference plane information and / or reference position information of environmental reference objects, a quantifiable reference benchmark can be obtained without pre-deploying physical scales or relying on historical depth maps of the same camera position. Based on the relative spatial position relationship between the reference benchmark information and the suspected water accumulation area, the water accumulation index is calculated, solving the technical pain point that traditional two-dimensional vision methods cannot obtain the true physical depth and area. This provides a high-precision decision-making basis for flood control scheduling and traffic control.
[0109] Embodiment 2: This embodiment of the present invention provides a non-transitory computer-readable storage medium, which can be disposed in an electronic device to store at least one instruction or at least one program for implementing the method of the method embodiment. The at least one instruction or at least one program is loaded and executed by a processor to implement the water accumulation detection method based on horizontal surface features provided in the above embodiment.
[0110] Embodiment 3: This embodiment of the present invention provides an electronic device, which includes a processor and the non-transitory computer-readable storage medium of Embodiment 2.
[0111] The above are merely preferred embodiments of the present invention and are not intended to limit it in any way. Any person skilled in the art may, without departing from the scope of the present invention, modify or alter the technical content disclosed above to create equivalent embodiments. Any simple modification, equivalent change, or alteration made to the above embodiments based on the technical essence of the present invention, without departing from its scope, still falls within the scope of the present invention.
Claims
1. A method for detecting water accumulation based on horizontal surface features, characterized in that the water accumulation detection method includes the following steps: S1, perform three-dimensional spatial mapping and reconstruction on the visual image data of the target monitoring area to obtain spatial geometric features, wherein the spatial geometric features are used to characterize the three-dimensional topological structure and / or relative depth relationship of the scene within the target monitoring area; S2, detect several suspected water accumulation areas that satisfy the horizontal plane geometric constraints from the spatial geometric features; S3, obtain reference benchmark information from the visual image data, wherein the reference benchmark information includes reference plane information and / or reference position information of environmental reference objects with known geometric features; S4, for any suspected water accumulation area, based on the relative spatial position relationship between the reference benchmark information and the current suspected water accumulation area, calculate the water accumulation quantification index of the current suspected water accumulation area relative to the reference plane, wherein the water accumulation quantification index includes water accumulation depth and / or water accumulation area; S5, based on the water accumulation quantification indexes corresponding to all suspected water accumulation areas, obtain the water accumulation detection result of the target monitoring area.
2. The water accumulation detection method based on horizontal surface features according to claim 1, characterized in that, The spatial geometric features are a three-dimensional point cloud model. S1 includes the following steps: S11, The visual image data is input into a preset monocular depth estimation network to obtain a continuous depth map corresponding to the visual image data; S12, based on the camera imaging geometry principle and pre-calibrated camera intrinsic and extrinsic parameters, perform three-dimensional spatial mapping on the two-dimensional pixel coordinates of the visual image data and the corresponding depth values in the continuous depth map to generate a three-dimensional point cloud model representing the three-dimensional topological structure of the scene.
3. The water accumulation detection method based on horizontal surface features according to claim 2, characterized in that, S2 includes the following steps: S21, obtain the depth gradient map corresponding to the continuous depth map, and determine the pixels with depth gradient magnitude greater than the preset gradient threshold as depth jump anomaly points according to the depth gradient map, and / or determine the continuous pixel region with depth value lower than the preset effective depth lower limit as depth missing anomaly region. S22, map the depth jump anomaly points and / or the depth missing anomaly regions to the three-dimensional point cloud model to obtain the corresponding local point cloud regions; S23, fit and generate a reference road surface plane equation in the three-dimensional point cloud model, and calculate the vertical distance from each point in the local point cloud region to the reference road surface plane equation; S24, when the vertical distances are all less than a preset negative threshold, the local point cloud region is determined as a suspected water accumulation region that meets the geometric constraints of the horizontal plane.
4. The water accumulation detection method based on horizontal surface features according to claim 2, characterized in that, S3 includes the following steps: S311, Extract healthy road surface point cloud that is not covered by water from the spatial geometric features; S312, a reference road surface plane equation is generated by fitting the point cloud of the healthy road surface, and the plane represented by the reference road surface plane equation is used as the reference plane information.
5. The water accumulation detection method based on horizontal surface features according to claim 4, characterized in that, S311 includes the following steps: S3111, the three-dimensional point cloud model is divided into several point cloud sub-regions, and the mean direction and variance of the normal vector of each point cloud sub-region are calculated. S3112, the point cloud sub-regions with an angle between the mean direction of the normal vector and the preset gravity direction that is less than a preset angle threshold and the variance of the normal vector that is less than a preset variance threshold are identified as candidate road surface regions. S3112, perform plane fitting on the candidate road surface area to obtain several local plane equations; S3113, the point cloud corresponding to the plane in the local plane equation where the number of points in the plane is greater than a preset threshold and the elevation distribution of the points in the plane is continuous is determined as the healthy road surface point cloud that is not covered by water.
6. The water accumulation detection method based on horizontal surface features according to claim 4, characterized in that, S4 includes the following steps: S411, Construct a three-dimensional polygon based on the boundary points of the current suspected water accumulation area, and use the area of the three-dimensional polygon as the water accumulation area of the current suspected water accumulation area; S412, Calculate the vertical distance of each point in the current suspected waterlogged area relative to the reference road surface plane equation; S413, determine the water depth of the suspected water accumulation area based on the vertical distance.
7. The water accumulation detection method based on horizontal surface features according to claim 4, characterized in that, The environmental reference objects with known geometric features include at least vehicle wheel hubs and curb stones. S3 includes the following steps: S321, The visual image data is input into a preset target detection network to identify and locate environmental reference objects with known geometric features in the target monitoring area, and to obtain the two-dimensional bounding box of the environmental reference objects in the visual image data; S322, Based on the two-dimensional bounding box, extract the local image region corresponding to the environmental reference object from the visual image data; S323, perform edge detection and / or contour fitting on the local image region to extract the geometric feature information of the environmental reference object; S324, Based on the geometric feature information and the two-dimensional pixel coordinates of the visual image data and the three-dimensional spatial mapping relationship between the spatial geometric features, determine the reference position information of the environmental reference object in the spatial geometric features.
8. The water accumulation detection method based on horizontal surface features according to claim 7, characterized in that, S4 includes the following steps: S421, Determine the spatial position of the environmental reference object in the spatial geometric feature based on the reference position information of the environmental reference object in the spatial geometric feature; S422, Based on the spatial location, along the vertical direction of the environmental reference, determine the location of the point cloud density abrupt change or depth value abrupt change formed by the water surface in the spatial geometric feature as a horizontal truncating plane, and obtain the absolute height of the horizontal truncating plane in the spatial geometric feature; S423, obtain the reference height of the reference road surface plane, and use the elevation difference between the absolute height and the reference height as the water depth of the current suspected water accumulation area.
9. A non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores at least one instruction or at least one program segment, characterized in that, The at least one instruction or the at least one program segment is loaded and executed by the processor to implement the water accumulation detection method based on horizontal surface features as described in any one of claims 1-8.
10. An electronic device, characterized in that, Includes a processor and the non-transitory computer-readable storage medium as described in claim 9.