A Two-Way Positioning and Matching Method and System for Drainage Manhole Covers Based on Target Detection and Spatial Feature Indexing
By integrating visible light image recognition, GPS positioning, and inertial sensor data on mobile terminals, a dynamic spatial feature index and bidirectional matching mechanism are constructed. This solves the problems of insufficient positioning accuracy and data fragmentation in the inspection of manhole covers in drainage pipe networks, realizes accurate positioning and file management of manhole covers, and improves inspection efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- YANGTZE ECOLOGY & ENVIRONMENT CO LTD
- Filing Date
- 2025-07-31
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies for manhole cover inspection in drainage pipe networks suffer from insufficient positioning accuracy, poor real-time status recognition, and fragmented multi-source data, resulting in low inspection efficiency. In particular, the location of the manhole cover cannot be accurately matched when GPS signals are interfered with.
A method based on target detection and spatial feature indexing is adopted. Visible light images, GPS positioning and inertial sensor data are collected by mobile terminal. The lightweight target detection model MobileMC-Net is used to identify the manhole cover area. Dynamic spatial feature index is constructed by combining discrete wavelet transform and extended Kalman filter to realize accurate positioning and status management of manhole covers.
It improves the accuracy of manhole cover positioning, breaks through the traditional manual verification mode, and significantly improves the efficiency of manhole cover operation and maintenance. It is especially suitable for the rapid identification and file management of concealed manhole covers in urban drainage pipe networks, with a positioning accuracy rate of over 85%.
Figure CN120953568B_ABST
Description
Technical Field
[0001] This invention relates to the field of intelligent management technology for urban infrastructure, and in particular to a bidirectional positioning and matching method and system for drainage manhole covers based on target detection and spatial feature indexing.
Background Technology
[0002] Currently, the inspection and management of drainage pipe network manhole covers mainly relies on manual visual inspection, which has significant technical shortcomings. Inspectors must carry paper files to check the location and condition of each manhole cover one by one, which is inefficient and has a high rate of missed inspections. In densely built-up areas, GPS positioning signals are affected by multipath interference, and the positioning error often reaches 5-10 meters, making it impossible to accurately match the actual location of the manhole covers.
[0003] Existing technologies attempt to assist in identification through video surveillance; however, such solutions have two major limitations: first, fixed cameras have limited field of view, making it difficult to cover the entire road network of manhole covers; second, they can only achieve single-point status identification and cannot form a dynamic correlation with spatial location and historical records. Inspection personnel still need to confirm on-site, resulting in isolated data such as location information, visual features, and maintenance records. Although recent research has used mobile terminals to collect data, bottlenecks remain in dynamic positioning, failing to solve the problem of real-time positioning drift in mobile scenarios. When inspection vehicles or personnel are moving, factors such as inertial sensor noise and GPS signal loss can cause the positioning trajectory to deviate from the actual path, making it impossible to establish an accurate spatial-temporal mapping relationship.
[0004] Therefore, there is an urgent need for a technical solution that can simultaneously address the three major pain points of insufficient positioning accuracy, poor real-time status recognition, and fragmented multi-source data (dynamic association of location / image / archive) to meet the needs of efficient operation and maintenance of modern urban drainage pipe networks.
Summary of the Invention
[0005] This invention provides a method and system for bidirectional positioning of drainage network manhole covers based on mobile terminals. By innovatively integrating visible light image recognition, GPS positioning, and inertial sensor data, it solves three major technical problems in traditional manhole cover inspection: insufficient positioning accuracy, low efficiency of status recognition, and poor data coordination. The core of this solution lies in constructing a dual mechanism of dynamic spatial feature indexing and bidirectional matching engine, enabling precise positioning and status management of manhole covers on mobile terminals.
[0006] To solve the above-mentioned technical problems, the technical solution adopted by the present invention is a bidirectional positioning and matching method for drainage manhole covers based on target detection and spatial feature indexing, the method comprising:
[0007] S1. Collect visible light images, GPS positioning data, and three-axis inertial sensor data of the manhole cover via a mobile terminal;
[0008] S2. Input the visible light image into the pre-trained lightweight target detection model to identify the manhole cover area and output the visual feature vector;
[0009] S3. Preprocess the triaxial inertial sensor data to calculate the composite acceleration and composite angular velocity; and use discrete wavelet transform to denoise the composite data.
[0010] S4. Integrate GPS coordinates with preprocessed inertial data to construct a dynamic spatial feature index;
[0011] S5. Achieve manhole cover positioning and file association through a two-way matching algorithm:
[0012] Forward matching: Based on the visual feature vector in step S2 and the real-time spatial feature index in step S4, retrieve the associated electronic archive information;
[0013] Reverse positioning: Based on historical coordinates in the electronic archives and combined with real-time inertial data, a navigation path is generated, and the physical location of the target manhole cover is output.
[0014] In the preferred embodiment, the lightweight target detection model in step S2 is MobileMC-Net, a lightweight target detection model for drainage pipe network manhole covers, in which:
[0015] The backbone network is reconstructed using depthwise separable convolution, which reduces the computational cost by splitting each convolution into a channel-wise (depthwise) convolution and a point-wise convolution;
[0016] A global attention mechanism module is embedded in a specific layer of the backbone network, specifically in layers 3, 6, and 9. This module enhances feature representation capabilities and suppresses background interference through channel attention and spatial attention.
[0017] Generate a bounding box of the target region and the corresponding visual feature vector based on the visible light image;
[0018] Local features of the target region are extracted based on the bounding box, and combined with global features to generate the final visual feature vector.
[0019] In the preferred scheme, depthwise separable convolutions are used to reconstruct the backbone network, and channel-wise and point-wise convolutions are used to reduce the computational cost. The computational cost of standard convolution is $K^2 \cdot C_{in} \cdot C_{out} \cdot H \cdot W$, and the computational cost of depthwise separable convolution is $K^2 \cdot C_{in} \cdot H \cdot W + C_{in} \cdot C_{out} \cdot H \cdot W$, where $K$ is the kernel size, $C_{in}$ is the number of input channels, $C_{out}$ is the number of output channels, and $H$ and $W$ are the feature map dimensions.
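The saving from depthwise separable convolution can be checked numerically. The following is a minimal sketch, assuming the usual multiply-accumulate counts for a K x K convolution over an H x W feature map with C_in input and C_out output channels; the function names and the layer sizes in the example are illustrative and do not come from the patent.

```python
def standard_conv_cost(k: int, c_in: int, c_out: int, h: int, w: int) -> int:
    """Multiply-accumulate count of a standard K x K convolution."""
    return k * k * c_in * c_out * h * w


def depthwise_separable_cost(k: int, c_in: int, c_out: int, h: int, w: int) -> int:
    """Cost of a per-channel K x K convolution followed by a 1 x 1 point-wise one."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise


def cost_ratio(k: int, c_out: int) -> float:
    """Closed-form ratio separable / standard, which equals 1/C_out + 1/K^2."""
    return 1.0 / c_out + 1.0 / (k * k)


if __name__ == "__main__":
    # Illustrative layer sizes (assumed, not taken from the patent).
    k, c_in, c_out, h, w = 3, 32, 64, 56, 56
    std = standard_conv_cost(k, c_in, c_out, h, w)
    sep = depthwise_separable_cost(k, c_in, c_out, h, w)
    print(f"standard={std} separable={sep} ratio={sep / std:.4f}")
```

For a 3 x 3 kernel the ratio is dominated by the 1/K^2 term, so the separable form needs roughly one ninth to one eighth of the operations of the standard form.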
[0020] In the preferred solution, the global attention mechanism module includes:
[0021] For the feature map output by the backbone network, calculate the channel attention weights to enhance the feature representation of key channels;
[0022] Based on channel attention weights, a spatial attention distribution is generated to highlight the spatial characteristics of the target region.
[0023] The channel attention weights and spatial attention distributions are fused to generate an enhanced feature map.
[0024] The enhanced feature map is then fed into subsequent convolutional layers to further extract visual features of the target region.
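The order of operations in the attention module (channel weights first, then a spatial distribution derived from them, then fusion) can be illustrated as follows. This is a deliberately simplified sketch with no learned parameters: it assumes global average pooling plus a sigmoid for the channel weights and a channel-weighted mean plus a sigmoid for the spatial weights. The actual layers inside MobileMC-Net are not specified in the text, and all function names here are hypothetical.

```python
import math

# Feature map indexed as [channel][row][col].
FeatureMap = list[list[list[float]]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap: FeatureMap) -> list[float]:
    """One weight per channel from global average pooling (assumed form)."""
    weights = []
    for channel in fmap:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        weights.append(sigmoid(total / count))
    return weights


def spatial_attention(fmap: FeatureMap, ch_weights: list[float]) -> list[list[float]]:
    """One weight per pixel from the channel-weighted mean across channels."""
    n_ch, n_rows, n_cols = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [
            sigmoid(sum(ch_weights[c] * fmap[c][i][j] for c in range(n_ch)) / n_ch)
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


def enhance(fmap: FeatureMap) -> FeatureMap:
    """Fuse both attentions by scaling every activation with both weights."""
    ch_w = channel_attention(fmap)
    sp_w = spatial_attention(fmap, ch_w)
    n_rows, n_cols = len(fmap[0]), len(fmap[0][0])
    return [
        [[fmap[c][i][j] * ch_w[c] * sp_w[i][j] for j in range(n_cols)]
         for i in range(n_rows)]
        for c in range(len(fmap))
    ]
```

In this sketch, a pixel that is strongly activated in an informative channel receives a larger spatial weight than a flat background pixel, which is the background-suppression effect the text describes.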
[0025] In the preferred scheme, in step S3,
[0026] Acquire triaxial acceleration and triaxial angular velocity data from inertial sensor data;
[0027] Calculate the resultant acceleration of the triaxial acceleration and the resultant angular velocity of the triaxial angular velocity;
[0028] Discrete wavelet transform is used to decompose the resultant acceleration and the resultant angular velocity into high-frequency coefficients and low-frequency coefficients;
[0029] High-frequency coefficients are processed using a soft thresholding method to generate denoised motion data;
[0030] The soft threshold function is defined as $\eta(w, \lambda) = \operatorname{sgn}(w) \cdot \max(|w| - \lambda, 0)$, where $w$ is a wavelet coefficient, $\lambda$ is the threshold, and $\operatorname{sgn}(\cdot)$ is the sign function;
[0031] A continuous sequence of motion features is generated based on the denoised motion data.
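The preprocessing chain of step S3 can be sketched in a few lines. The example below assumes a single-level Haar wavelet (the text does not name the wavelet basis or decomposition depth) and the standard soft threshold sgn(w) * max(|w| - lambda, 0); the threshold value is left to the caller, and all function names are illustrative.

```python
import math


def resultant(x: float, y: float, z: float) -> float:
    """Magnitude of a triaxial reading (resultant acceleration or angular velocity)."""
    return math.sqrt(x * x + y * y + z * z)


def haar_dwt(signal: list[float]) -> tuple[list[float], list[float]]:
    """Single-level Haar DWT; the signal length must be even."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail  # low-frequency, high-frequency coefficients


def haar_idwt(approx: list[float], detail: list[float]) -> list[float]:
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out: list[float] = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(w: float, lam: float) -> float:
    """Shrink a coefficient toward zero by lam, keeping its sign."""
    return math.copysign(max(abs(w) - lam, 0.0), w)


def denoise(signal: list[float], lam: float) -> list[float]:
    """Threshold only the high-frequency coefficients, then reconstruct."""
    approx, detail = haar_dwt(signal)
    detail = [soft_threshold(d, lam) for d in detail]
    return haar_idwt(approx, detail)
```

With a threshold of zero the signal is reconstructed unchanged; with a very large threshold every pair of neighbouring samples collapses to its mean, which shows why the threshold has to be tuned to the sensor noise level.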
[0032] In the preferred scheme, the specific process of constructing the dynamic spatial feature index in step S4 is as follows: using GPS coordinates as the reference position point, and based on the denoised inertial data, using extended Kalman filtering to calculate the trajectory and generate a spatial position sequence;
[0033] Furthermore, visual feature vectors are associated with spatial location sequences to form a spatiotemporal joint index.
[0034] In the preferred scheme, GPS coordinate positioning data is used as the reference position to obtain real-time global coordinates;
[0035] Based on the denoised inertial data, a continuous spatial position sequence is generated by trajectory estimation using the extended Kalman filter algorithm, wherein the state vector $x$ includes position (Px, Py, Pz), velocity (Vx, Vy, Vz), attitude, and sensor bias. The state transition equation is $x_k = f(x_{k-1}, u_k) + w_k$, where $f$ is a nonlinear function, $u_k$ is the inertial measurement, and $w_k$ is the process noise. The observation model is $z_k = h(x_k) + v_k$, where $h$ is a nonlinear function, $z_k$ is the positioning measurement vector, and $v_k$ is the measurement noise vector;
[0036] Determine whether the deviation between the positioning data and the spatial location sequence exceeds a preset threshold;
[0037] If the deviation exceeds the preset threshold, the fusion weights are adjusted and the inertial data is used preferentially for position compensation: the velocity is obtained by integrating the acceleration, and the position is obtained by integrating the velocity. The position update formula is $P_{k+1} = P_k + V_k \Delta t + \frac{1}{2} A_k \Delta t^2$, and the velocity update formula is $V_{k+1} = V_k + A_k \Delta t$, where $P$ is position, $V$ is velocity, $A$ is acceleration, and $\Delta t$ is the time interval;
[0038] The visual feature vector is time-aligned with the spatial location sequence to form the spatiotemporal joint index.
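The deviation check and the inertial position compensation can be sketched as below. The sketch assumes constant acceleration over each interval, so that P' = P + V*dt + 0.5*A*dt^2 and V' = V + A*dt, and it models the "adjusted fusion weights" as a simple switch between two hypothetical GPS weights (0.8 when GPS agrees with the inertial track, 0.1 when it does not). The 2-metre default threshold follows the value mentioned in the embodiment; the weights and function names are assumptions.

```python
import math

Vec3 = tuple[float, float, float]


def dead_reckon_step(p: Vec3, v: Vec3, a: Vec3, dt: float) -> tuple[Vec3, Vec3]:
    """Integrate acceleration into velocity and velocity into position."""
    new_p = tuple(pi + vi * dt + 0.5 * ai * dt * dt for pi, vi, ai in zip(p, v, a))
    new_v = tuple(vi + ai * dt for vi, ai in zip(v, a))
    return new_p, new_v


def fuse_position(gps_p: Vec3, ins_p: Vec3, threshold_m: float = 2.0,
                  normal_gps_weight: float = 0.8,
                  degraded_gps_weight: float = 0.1) -> Vec3:
    """Blend GPS and inertial positions, trusting GPS less when they disagree."""
    deviation = math.dist(gps_p, ins_p)
    # Hypothetical weighting rule: a large deviation is treated as GPS drift.
    w = degraded_gps_weight if deviation > threshold_m else normal_gps_weight
    return tuple(w * g + (1.0 - w) * i for g, i in zip(gps_p, ins_p))
```

A hard switch is the simplest possible weighting rule; a deployed system would more likely scale the weight continuously with the deviation or with the GPS quality indicators reported by the receiver.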
[0039] In the preferred scheme, the trajectory is calculated using the extended Kalman filter algorithm, including:
[0040] Construct a state vector, including position, velocity, and attitude information;
[0041] The state vector is updated using a nonlinear state transition equation based on the denoised inertial data.
[0042] By combining positioning data and using an observation model to correct the state vector, a high-precision spatial position sequence is generated.
[0043] When the reliability of the positioning data is lower than a preset threshold, inertial data is used first for state updates.
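The predict/correct cycle with a reliability gate can be illustrated with a much smaller filter than the one described. The sketch below is a one-dimensional linear Kalman filter with state (position, velocity), which is the special case an extended Kalman filter reduces to when the transition and observation functions are linear. It omits attitude and sensor bias, and the noise values, reliability score, and class name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Kalman1D:
    p: float = 0.0  # position estimate
    v: float = 0.0  # velocity estimate
    cov: list = field(default_factory=lambda: [[1.0, 0.0], [0.0, 1.0]])
    q: float = 0.01  # process noise added per step (assumed value)
    r: float = 4.0   # GPS measurement noise variance (assumed value)

    def predict(self, accel: float, dt: float) -> None:
        """Propagate the state with the denoised inertial acceleration."""
        self.p += self.v * dt + 0.5 * accel * dt * dt
        self.v += accel * dt
        (p00, p01), (p10, p11) = self.cov
        # cov = F cov F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I.
        self.cov = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]

    def correct(self, gps_pos: float, reliability: float,
                min_reliability: float = 0.5) -> bool:
        """Apply a GPS position fix unless its reliability is below the gate."""
        if reliability < min_reliability:
            return False  # inertial-only update: keep the predicted state
        (p00, p01), (p10, p11) = self.cov
        s = p00 + self.r              # innovation variance, H = [1, 0]
        k0, k1 = p00 / s, p10 / s     # Kalman gain
        innovation = gps_pos - self.p
        self.p += k0 * innovation
        self.v += k1 * innovation
        self.cov = [[(1 - k0) * p00, (1 - k0) * p01],
                    [p10 - k1 * p00, p11 - k1 * p01]]
        return True
```

Calling `predict` at the inertial sampling rate and `correct` whenever a GPS fix arrives reproduces the behaviour described above: while fixes are rejected, the estimate is carried forward by inertial data alone and its covariance grows until a reliable fix is accepted.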
[0044] In the preferred embodiment, in step S5,
[0045] Based on visual feature vectors, similarity matching is performed in the electronic archives to retrieve corresponding archive information;
[0046] Based on the real-time spatial feature index, verify the consistency between the archive information and the spatial location of the target object;
[0047] If the deviation between the historical location data in the electronic file and the real-time location data exceeds a preset threshold, the position correction amount is calculated through the trajectory estimation algorithm.
[0048] Generate a navigation path based on the position correction and output the physical position coordinates of the target object;
[0049] In reverse positioning, when the deviation between the electronic file coordinates and the real-time GPS coordinates exceeds a threshold δ, the position correction is calculated using a trajectory extrapolation algorithm, and the corrected physical position coordinates are output.
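Forward matching, as described in the first two sub-steps above, amounts to a similarity search followed by a spatial consistency check. The following sketch assumes cosine similarity between feature vectors and great-circle (haversine) distance between coordinates. The archive record layout, the thresholds, and the record identifiers are hypothetical and only serve the illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres on a spherical Earth (radius 6371 km)."""
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def forward_match(query_vec, position, archive,
                  min_similarity=0.8, max_distance_m=5.0):
    """Return (id, similarity, distance) of the best consistent record, or None."""
    best = None
    for record in archive:
        sim = cosine_similarity(query_vec, record["feature"])
        dist = haversine_m(position[0], position[1], record["lat"], record["lon"])
        # A record must pass both the visual and the spatial check.
        if sim >= min_similarity and dist <= max_distance_m:
            if best is None or sim > best[1]:
                best = (record["id"], sim, dist)
    return best
```

Requiring both checks means that two visually similar covers at different locations are not confused, and a cover whose appearance has changed is not matched on location alone.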
[0050] A bidirectional positioning and matching system for drainage manhole covers based on target detection and spatial feature indexing, the system comprising:
[0051] Mobile data acquisition terminal: integrates a camera, GPS module, and three-axis inertial sensor;
[0052] Data processing module: Performs lightweight target detection and inertial data denoising;
[0053] Spatial Indexing Engine: Constructs dynamic spatial feature indexes;
[0054] Two-way matching module: enables forward file retrieval and reverse physical location.
[0055] This invention provides a bidirectional positioning and matching method and system for drainage manhole covers based on target detection and spatial feature indexing. The method and system collect multi-source data via a mobile terminal, accurately identify manhole covers and extract visual features using the lightweight target detection model MobileMC-Net, denoise inertial data using discrete wavelet transform, and then construct a dynamic spatial feature index by fusing GPS and inertial data through extended Kalman filtering. Finally, a bidirectional matching algorithm is used to achieve forward matching and reverse positioning. This effectively solves the problems of difficult on-site positioning of unnumbered manhole covers and low efficiency in file association, breaking through the traditional manual verification mode, and achieving a positioning accuracy rate of over 85%. MobileMC-Net employs depthwise separable convolution and global attention mechanisms to reduce computational load while enhancing feature representation, meeting the real-time processing requirements of mobile devices. Inertial data preprocessing preserves real motion information through synthetic motion features and denoising, providing reliable data for trajectory estimation. Dynamic spatial feature indexing fuses multi-source positioning data, switching to inertial navigation when GPS signals are weak, improving positioning robustness and accuracy in complex environments. A bidirectional matching mechanism enables rapid association and accurate navigation between manhole covers and records, particularly suitable for the rapid identification and record management of concealed manhole covers in urban drainage networks, significantly improving manhole cover maintenance efficiency and providing strong support for the intelligent management of urban drainage networks.
Attached Figure Description
[0056] The present invention will be further described below with reference to the accompanying drawings and embodiments:
[0057] Figure 1 is a block diagram of a bidirectional positioning and matching system for drainage manhole covers based on target detection and spatial feature indexing, according to an embodiment of the invention;
[0058] Figure 2 is a schematic diagram of a bidirectional positioning and matching system for drainage manhole covers based on target detection and spatial feature indexing, according to an embodiment of the invention;
[0059] Figure 3 is a diagram of the architecture of the lightweight object detection model MobileMC-Net, according to an embodiment of the invention;
[0060] Figure 4 is an overall schematic diagram of the bidirectional positioning and matching method for drainage manhole covers according to an embodiment of the invention;
[0061] Figure 5 is a sample diagram of actual data from an embodiment of the invention;
[0062] Figure 6 is a diagram illustrating the effect of the detection method in an embodiment of the invention.
Detailed Implementation
[0063] Example 1
[0064] As shown in Figures 1-6, a bidirectional positioning and matching method for drainage manhole covers based on target detection and spatial feature indexing is proposed. This method includes:
[0065] S1. Collect visible light images, GPS positioning data, and three-axis inertial sensor data of the manhole cover via a mobile terminal;
[0066] S2. Input the visible light image into the pre-trained lightweight target detection model to identify the manhole cover area and output the visual feature vector;
[0067] S3. Preprocess the triaxial inertial sensor data to calculate the composite acceleration and composite angular velocity; and use discrete wavelet transform to denoise the composite data.
[0068] S4. Integrate GPS coordinates with preprocessed inertial data to construct a dynamic spatial feature index;
[0069] S5. Achieve manhole cover positioning and file association through a two-way matching algorithm:
[0070] Forward matching: Based on the visual feature vector in step S2 and the real-time spatial feature index in step S4, retrieve the associated electronic archive information;
[0071] Reverse positioning: Based on historical coordinates in the electronic archives and combined with real-time inertial data, a navigation path is generated, and the physical location of the target manhole cover is output.
[0072] When using this method, a mobile terminal integrating a camera, GPS module and three-axis inertial sensor is used to simultaneously collect visible light images of the manhole cover, GPS positioning data and three-axis inertial sensor data. The visible light images are high-resolution images, the GPS positioning data provides real-time coordinates, and the three-axis inertial sensor data includes three-axis acceleration and angular velocity with a sampling frequency of not less than 100Hz.
[0073] Next, the acquired visible light images are input into the pre-trained lightweight object detection model MobileMC-Net. This model uses depthwise separable convolutions to reconstruct the backbone network and introduces global attention mechanisms in layers 3, 6, and 9, enabling it to accurately identify manhole cover areas and output bounding boxes and visual feature vectors as the visual "fingerprint" of the manhole cover. Then, the triaxial inertial sensor data is preprocessed. First, synthetic acceleration and synthetic angular velocity are calculated, then discrete wavelet transform is used to denoise the synthetic data. High-frequency coefficients are processed using a soft thresholding method to eliminate noise and preserve true motion features. Afterward, a dynamic spatial feature index is constructed by fusing GPS coordinates with the preprocessed inertial data. Using GPS coordinates as a reference, an extended Kalman filter algorithm is used to calculate the trajectory and generate a spatial position sequence. When the GPS signal drift exceeds a 2-meter threshold, the system automatically switches to inertial navigation mode, generating position compensation through integrated angular velocity data. Finally, the visual feature vectors are associated with the spatial position sequence to form a spatiotemporal joint index.
[0074] Finally, a bidirectional matching algorithm is used to link the manhole cover location with the archive. During forward matching, the associated manhole cover number, historical maintenance records and other information are retrieved from the electronic archive based on the visual feature vector and real-time spatial feature index. During reverse positioning, a navigation path is generated based on the historical coordinates in the electronic archive and real-time inertial data. When the deviation between the electronic archive coordinates and the real-time GPS exceeds a threshold, the position correction amount is calculated by the trajectory estimation algorithm and the corrected physical position is output.
[0075] This method breaks through the traditional manual verification mode. Through multi-source data fusion and intelligent algorithms, it solves the problems of difficult on-site positioning of unnumbered manhole covers and low efficiency of file association. The positioning accuracy rate reaches more than 85%, which can significantly improve the operation and maintenance efficiency of manhole covers. It is especially suitable for the rapid identification and file management of hidden manhole covers in urban drainage pipe networks. It realizes efficient collaboration between manhole cover positioning and file management, reduces the errors and inefficiencies caused by manual operation, and provides strong support for the intelligent management of urban drainage pipe networks.
[0076] In the preferred embodiment, the lightweight target detection model in step S2 is MobileMC-Net, a lightweight target detection model for drainage pipe network manhole covers, in which:
[0077] The backbone network is reconstructed using depthwise separable convolution, which reduces the computational cost by splitting each convolution into a channel-wise (depthwise) convolution and a point-wise convolution;
[0078] A global attention mechanism module is embedded in a specific layer of the backbone network, specifically in layers 3, 6, and 9. This module enhances feature representation capabilities and suppresses background interference through channel attention and spatial attention.
[0079] Generate a bounding box of the target region and the corresponding visual feature vector based on the visible light image;
[0080] Local features of the target region are extracted based on the bounding box, and combined with global features to generate the final visual feature vector.
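The text does not state how local and global features are combined. One common choice, shown here purely as an assumption, is to L2-normalize each part, concatenate them, and normalize the result so that the final vector has unit length and is directly usable for cosine-similarity matching. The function names are illustrative.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_features(local_vec: list[float], global_vec: list[float]) -> list[float]:
    """Concatenate normalized local and global features (assumed fusion rule)."""
    return l2_normalize(l2_normalize(local_vec) + l2_normalize(global_vec))
```

Normalizing each part before concatenation keeps either feature source from dominating the other simply because its raw activations are larger.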
[0081] In the preferred scheme, depthwise separable convolutions are used to reconstruct the backbone network, and channel-wise and point-wise convolutions are used to reduce the computational cost. The computational cost of standard convolution is $K^2 \cdot C_{in} \cdot C_{out} \cdot H \cdot W$, and the computational cost of depthwise separable convolution is $K^2 \cdot C_{in} \cdot H \cdot W + C_{in} \cdot C_{out} \cdot H \cdot W$, where $K$ is the kernel size, $C_{in}$ is the number of input channels, $C_{out}$ is the number of output channels, and $H$ and $W$ are the feature map dimensions.
[0082] In the preferred solution, the global attention mechanism module includes:
[0083] For the feature map output by the backbone network, calculate the channel attention weights to enhance the feature representation of key channels;
[0084] Based on channel attention weights, a spatial attention distribution is generated to highlight the spatial characteristics of the target region.
[0085] The channel attention weights and spatial attention distributions are fused to generate an enhanced feature map.
[0086] The enhanced feature map is then fed into subsequent convolutional layers to further extract visual features of the target region.
[0087] When using this lightweight object detection model, the backbone network is first reconstructed using depthwise separable convolution for the specific object of drainage pipe network manhole covers. The computational cost is then reduced through channel-wise and point-wise convolutions: the computational cost of standard convolution is $K^2 \cdot C_{in} \cdot C_{out} \cdot H \cdot W$, and the computational cost of depthwise separable convolution is $K^2 \cdot C_{in} \cdot H \cdot W + C_{in} \cdot C_{out} \cdot H \cdot W$, where $K$ is the kernel size, $C_{in}$ is the number of input channels, $C_{out}$ is the number of output channels, and $H$ and $W$ are the feature map dimensions.
[0088] This ensures detection effectiveness while meeting the real-time processing requirements of mobile terminals. Next, a global attention mechanism module is embedded in layers 3, 6, and 9 of the backbone network. This module first calculates channel attention weights for the feature map output by the backbone network, enhancing the feature representation of key channels. Then, it generates a spatial attention distribution based on the channel attention weights to highlight the spatial features of the target region. Finally, it fuses the channel attention weights and the spatial attention distribution to generate an enhanced feature map, which is then input into subsequent convolutional layers to further extract the visual features of the target region. Background interference is suppressed through dual channel-spatial attention. Afterward, the acquired visible light image of the manhole cover is input into the model. The model generates a bounding box of the target region and its corresponding visual feature vector. Local features of the target region are then extracted based on the bounding box, and combined with global features to generate the final visual feature vector, providing accurate visual feature basis for subsequent bidirectional matching.
[0089] Specifically designed for drainage pipe network manhole covers, this system employs depthwise separable convolution to effectively reduce computational load, enabling real-time processing on mobile terminals and meeting the timeliness requirements of on-site data processing. By embedding a global attention mechanism module in a specific layer of the backbone network, channel-space dual attention enhances feature representation capabilities, effectively suppresses background interference, and improves the accuracy of manhole cover area recognition. The generated visual feature vector combines local and global features, providing accurate and reliable visual feature basis for forward matching and reverse positioning of manhole covers, thereby improving the efficiency and accuracy of the entire drainage manhole cover bidirectional positioning and matching system. This helps solve the problems of difficult on-site positioning of unnumbered manhole covers and low efficiency of file association.
[0090] In the preferred embodiment, step S3 involves acquiring the triaxial acceleration and triaxial angular velocity data from the inertial sensor.
[0091] Calculate the resultant acceleration of the triaxial acceleration and the resultant angular velocity of the triaxial angular velocity;
[0092] Discrete wavelet transform is used to decompose the resultant acceleration and the resultant angular velocity into high-frequency coefficients and low-frequency coefficients;
[0093] High-frequency coefficients are processed using a soft thresholding method to generate denoised motion data;
[0094] The soft threshold function is defined as $\eta(w, \lambda) = \operatorname{sgn}(w) \cdot \max(|w| - \lambda, 0)$, where $w$ is a wavelet coefficient, $\lambda$ is the threshold, and $\operatorname{sgn}(\cdot)$ is the sign function;
[0095] A continuous sequence of motion features is generated based on the denoised motion data.
[0096] When processing inertial sensor data using this scheme, the first step is to acquire two types of raw data: triaxial acceleration and triaxial angular velocity. Next, the composite acceleration and composite angular velocity of the three axes are calculated to integrate the motion information from each axis, forming a more comprehensive motion profile. Then, discrete wavelet transform is used to decompose the composite acceleration and composite angular velocity, yielding high-frequency and low-frequency coefficients. The high-frequency coefficients largely correspond to noise, while the low-frequency coefficients contain more real motion information. Subsequently, a soft thresholding method is used to process the high-frequency coefficients; the soft threshold function is defined as $\eta(w, \lambda) = \operatorname{sgn}(w) \cdot \max(|w| - \lambda, 0)$, where $w$ is a wavelet coefficient, $\lambda$ is the threshold, and $\operatorname{sgn}(\cdot)$ is the sign function. This method effectively removes noise and generates denoised motion data. Finally, a continuous motion feature sequence is generated based on the denoised motion data, providing high-quality data support for subsequent steps such as trajectory extrapolation.
[0097] By calculating the synthetic acceleration and synthetic angular velocity, multi-axis inertial data can be integrated into more representative motion features, facilitating subsequent processing. Using discrete wavelet transform combined with soft thresholding to denoise the synthetic data effectively removes noise while preserving the true motion features to the greatest extent, thus improving data quality. The generated continuous motion feature sequence lays a reliable foundation for accurate trajectory estimation and spatial feature index construction, thereby improving the positioning accuracy and stability of the entire drainage manhole cover positioning and matching system in complex environments. Especially when GPS signals are poor, the system can rely on the denoised inertial data to achieve more accurate position estimation, helping to solve the problem of difficult manhole cover positioning.
[0098] In the preferred scheme, the specific process of constructing the dynamic spatial feature index in step S4 is as follows: using GPS coordinates as the reference position point, and based on the denoised inertial data, using extended Kalman filtering to calculate the trajectory and generate a spatial position sequence;
[0099] Furthermore, visual feature vectors are associated with spatial location sequences to form a spatiotemporal joint index.
[0100] In the preferred scheme, GPS coordinate positioning data is used as the reference position to obtain real-time global coordinates;
[0101] Based on the denoised inertial data, a continuous spatial position sequence is generated by trajectory estimation using the extended Kalman filter algorithm, wherein the state vector $x$ includes position (Px, Py, Pz), velocity (Vx, Vy, Vz), attitude, and sensor bias. The state transition equation is $x_k = f(x_{k-1}, u_k) + w_k$, where $f$ is a nonlinear function, $u_k$ is the inertial measurement, and $w_k$ is the process noise. The observation model is $z_k = h(x_k) + v_k$, where $h$ is a nonlinear function, $z_k$ is the positioning measurement vector, and $v_k$ is the measurement noise vector;
[0102] Determine whether the deviation between the positioning data and the spatial location sequence exceeds a preset threshold;
[0103] If the deviation exceeds the preset threshold, the fusion weights are adjusted and the inertial data is used preferentially for position compensation: the velocity is obtained by integrating the acceleration, and the position is obtained by integrating the velocity. The position update formula is $P_{k+1} = P_k + V_k \Delta t + \frac{1}{2} A_k \Delta t^2$, and the velocity update formula is $V_{k+1} = V_k + A_k \Delta t$, where $P$ is position, $V$ is velocity, $A$ is acceleration, and $\Delta t$ is the time interval;
[0104] The visual feature vector is time-aligned with the spatial location sequence to form the spatiotemporal joint index.
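Time alignment of the visual feature vectors with the position sequence can be sketched as a nearest-timestamp lookup. The example assumes that both streams carry timestamps on a shared clock and that the position track is sorted by time; the maximum allowed gap and the entry layout of the joint index are hypothetical.

```python
from bisect import bisect_left


def nearest_index(timestamps: list[float], t: float) -> int:
    """Index of the timestamp closest to t; timestamps must be sorted and non-empty."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def build_joint_index(detections, track, max_gap_s: float = 0.05) -> list[dict]:
    """Pair each (t, feature) detection with the closest (t, position) sample."""
    times = [t for t, _ in track]
    if not times:
        return []
    index = []
    for t_img, feature in detections:
        j = nearest_index(times, t_img)
        # Skip detections with no position sample close enough in time.
        if abs(times[j] - t_img) <= max_gap_s:
            index.append({"t": t_img, "position": track[j][1], "feature": feature})
    return index
```

Because the inertial stream is sampled far more densely than the camera, a nearest-neighbour lookup is usually sufficient; interpolating between the two neighbouring position samples would be a natural refinement.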
[0105] In the preferred scheme, the trajectory is calculated using the extended Kalman filter algorithm, including:
[0106] Construct a state vector, including position, velocity, and attitude information;
[0107] The state vector is updated using a nonlinear state transition equation based on the denoised inertial data.
[0108] By combining positioning data and using an observation model to correct the state vector, a high-precision spatial position sequence is generated.
[0109] When the reliability of the positioning data is lower than a preset threshold, inertial data is used first for state updates.
[0110] When constructing the dynamic spatial feature index, real-time global coordinates are first obtained using GPS coordinates as the reference position point. Simultaneously, combined with the denoised inertial data, the trajectory is estimated using an extended Kalman filter algorithm. The constructed state vector $x$ includes position (Px, Py, Pz), velocity (Vx, Vy, Vz), attitude (e.g., quaternions qw, qx, qy, qz), and sensor bias. The state vector is updated based on the nonlinear state transition equation $x_k = f(x_{k-1}, u_k) + w_k$, where $f$ is a nonlinear function, $u_k$ is the inertial measurement, and $w_k$ is the process noise. Combined with the positioning data, the state vector is corrected through the observation model $z_k = h(x_k) + v_k$, where $h$ is a nonlinear function, $z_k$ is the positioning measurement vector, and $v_k$ is the measurement noise vector, and a continuous spatial position sequence is generated. Then, it is determined whether the deviation between the positioning data and the spatial position sequence exceeds a preset threshold. If it does, the fusion weights are adjusted, prioritizing the use of inertial data for position compensation, and the position update formula $P_{k+1} = P_k + V_k \Delta t + \frac{1}{2} A_k \Delta t^2$ and the velocity update formula $V_{k+1} = V_k + A_k \Delta t$ (where $P$ is position, $V$ is velocity, $A$ is acceleration, and $\Delta t$ is the time interval) are applied to update position and velocity. Finally, the visual feature vector is aligned in time with the spatial position sequence to form a spatiotemporal joint index. When the reliability of the positioning data is lower than a preset threshold, inertial data is used first for state updates to ensure the continuity and accuracy of trajectory extrapolation.
[0111] By using GPS coordinates as a reference and fusing denoised inertial data, and employing an extended Kalman filter algorithm for trajectory estimation, the advantages of both data types are effectively combined. When GPS signals are stable, its global positioning capability is utilized; when signals are weak, inertial data is used to maintain positioning, improving the accuracy and stability of the spatial location sequence. By judging deviations and dynamically adjusting the fusion weights, effective correction of GPS drift is achieved, ensuring robust positioning, especially in complex environments such as densely built-up areas. The spatiotemporal joint index formed by associating visual feature vectors with spatial location sequences tightly binds the visual attributes of manhole covers to their precise spatiotemporal location, providing accurate foundational data for subsequent two-way matching. This helps improve the efficiency and accuracy of manhole cover positioning and file association, thereby promoting the intelligent management of drainage network manhole covers.
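The fusion logic described above can be illustrated with a deliberately simplified sketch. It is not the invention's full extended Kalman filter: it tracks a single axis with a linear position/velocity state, omits attitude and sensor bias, and models "adjusting the fusion weights" as inflating the GPS measurement noise when the deviation exceeds a gate. The names `AxisState`, `predict`, and `correct`, the noise values `q` and `r`, and the `inflate` factor are all illustrative assumptions; the 2 m gate mirrors the example threshold mentioned in the description.

```python
from dataclasses import dataclass


@dataclass
class AxisState:
    p: float            # position along one axis (m)
    v: float            # velocity along one axis (m/s)
    c_pp: float = 1.0   # entries of the 2x2 covariance [[c_pp, c_pv], [c_pv, c_vv]]
    c_pv: float = 0.0
    c_vv: float = 1.0


def predict(s, accel, dt, q=0.05):
    """Propagate the state with denoised acceleration (inertial step)."""
    p = s.p + s.v * dt + 0.5 * accel * dt * dt   # P += V*dt + 0.5*A*dt^2
    v = s.v + accel * dt                         # V += A*dt
    # Covariance propagation F*C*F^T + Q with F = [[1, dt], [0, 1]], Q = q*I (assumed)
    c_pp = s.c_pp + 2.0 * dt * s.c_pv + dt * dt * s.c_vv + q
    c_pv = s.c_pv + dt * s.c_vv
    c_vv = s.c_vv + q
    return AxisState(p, v, c_pp, c_pv, c_vv)


def correct(s, gps_pos, r=4.0, gate=2.0, inflate=100.0):
    """Correct the state with a GPS position; down-weight GPS when it deviates too far."""
    innovation = gps_pos - s.p
    trusted = abs(innovation) <= gate
    # Assumed weighting rule: an out-of-gate GPS fix gets a much larger noise variance,
    # so the inertial prediction dominates the fused estimate.
    r_eff = r if trusted else r * inflate
    k_p = s.c_pp / (s.c_pp + r_eff)   # Kalman gain for position
    k_v = s.c_pv / (s.c_pp + r_eff)   # Kalman gain for velocity
    fused = AxisState(
        p=s.p + k_p * innovation,
        v=s.v + k_v * innovation,
        c_pp=(1.0 - k_p) * s.c_pp,
        c_pv=(1.0 - k_p) * s.c_pv,
        c_vv=s.c_vv - k_v * s.c_pv,
    )
    return fused, trusted
```

In use, `predict` would run at the inertial sampling rate and `correct` whenever a GPS fix arrives; a fix that lands far from the predicted position barely moves the estimate, which is the intended drift-suppression behaviour.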
[0112] In the preferred scheme, in step S5,
[0113] Based on visual feature vectors, similarity matching is performed in the electronic archives to retrieve corresponding archive information;
[0114] Based on the real-time spatial feature index, verify the consistency between the archive information and the spatial location of the target object;
[0115] If the deviation between the historical location data in the electronic file and the real-time location data exceeds a preset threshold, the position correction amount is calculated through the trajectory estimation algorithm.
[0116] Generate a navigation path based on the position correction and output the physical position coordinates of the target object;
[0117] In reverse positioning, when the deviation between the electronic file coordinates and the real-time GPS exceeds a threshold δ, the position correction is calculated using a trajectory extrapolation algorithm. Output the corrected physical position coordinates.
[0118] A two-way positioning and matching system for drainage well covers based on target detection and spatial feature indexing, the system includes:
[0119] Mobile data acquisition terminal: integrates a camera, GPS module, and three-axis inertial sensor;
[0120] Data processing module: Performs lightweight target detection and inertial data denoising;
[0121] Spatial Indexing Engine: Constructs dynamic spatial feature indexes;
[0122] Two-way matching module: enables forward file retrieval and reverse physical location.
[0123] First, based on the visual feature vectors extracted from the visible light images of the manhole covers, similarity matching is performed in the electronic archive to retrieve the corresponding manhole cover archive information, such as the manhole cover number and historical maintenance records. Next, the consistency between the retrieved archive information and the target manhole cover in spatial location is verified using a real-time spatial feature index, ensuring that the archive information corresponds to the actual manhole cover. When the deviation between the historical location data in the electronic archive and the real-time positioning data exceeds a preset threshold, a position correction amount is calculated using a trajectory extrapolation algorithm. If, during reverse positioning, the deviation between the electronic archive coordinates and the real-time GPS exceeds a threshold δ, the position correction amount is also calculated using a trajectory extrapolation algorithm. Then, a navigation path is generated based on the position correction amount, and the corrected physical location coordinates of the target manhole cover are output.
[0124] When in use, the mobile acquisition terminal integrates a camera, GPS module, and three-axis inertial sensor to simultaneously acquire visible light images of the manhole cover, GPS positioning data, and three-axis inertial sensor data. The data processing module processes the acquired data. On the one hand, it uses a lightweight target detection model to process the visible light images to identify the manhole cover area and extract visual feature vectors. On the other hand, it performs preprocessing such as noise reduction on the inertial sensor data. The spatial indexing engine merges GPS coordinates with the preprocessed inertial data to construct a dynamic spatial feature index. The bidirectional matching module uses the processed data to achieve forward file retrieval and reverse physical positioning.
[0125] Example 2
[0126] This example further elaborates on Example 1 with reference to Figures 1-6. The key terms are defined as follows. MobileMC-Net: a lightweight object detection model proposed by this invention, based on depthwise separable convolution with an embedded global attention mechanism (GAM), specifically designed for high-precision identification of drainage manhole cover areas and extraction of visual features.
[0127] Global Attention Mechanism (GAM): A deep learning attention mechanism that optimizes feature extraction and suppresses background interference through channel-space dual attention.
[0128] Discrete Wavelet Transform (DWT): A signal processing technique used to denoise inertial sensor data, eliminating noise while preserving true motion characteristics.
[0129] Extended Kalman Filter (EKF): A nonlinear system state estimation algorithm used to fuse multi-sensor data (such as GPS and inertial data) to achieve continuous, high-precision position estimation.
[0130] Spatiotemporal Joint Index: A multi-dimensional data structure that links the visual feature vectors of manhole covers with precise spatial location sequences, integrating the appearance and location information of manhole covers.
[0131] Two-way matching algorithm: The core algorithm of this invention includes two modes: "forward matching" (associating manhole covers with archives on site) and "reverse positioning" (navigating to the manhole cover based on the archives).
[0132] Mobile data acquisition terminal: An intelligent device integrating a camera, GPS module and three-axis inertial sensor for on-site multimodal data acquisition.
[0133] According to an embodiment of the present invention, a two-way positioning and matching system for drainage well covers based on target detection and spatial feature indexing is provided, which aims to solve the problems of difficulty in on-site positioning of unnumbered well covers and low efficiency in file association. Figure 1 is a system schematic diagram of an embodiment of the invention. As shown in Figure 1, the system of this embodiment mainly includes: a mobile acquisition terminal, a data processing module, a spatial indexing engine, and a bidirectional matching module.
[0134] In this embodiment of the invention, the main function of the mobile data acquisition terminal is as the data input front-end of the system, typically a smart mobile device integrating multiple sensors. It simultaneously collects various types of data, including real-time visible light images of the manhole cover and its surrounding environment captured by a camera, serving as direct input for visual recognition; real-time Global Positioning System (GPS) coordinates obtained from a GPS module, providing a global reference for the manhole cover's location; and high-frequency recording of the device's linear acceleration and angular velocity along the X, Y, and Z axes from a three-axis inertial sensor, including a three-axis accelerometer and a three-axis gyroscope (angular velocity meter). This data is crucial for high-precision trajectory estimation in environments with unstable or obstructed GPS signals.
[0135] The data processing module is responsible for preprocessing the raw multi-source data and extracting features. Its lightweight target detection and visual feature extraction unit receives the visible light images, inputs them into the pre-trained MobileMC-Net model, identifies the manhole cover areas, and extracts visual feature vectors. Through its lightweight design, MobileMC-Net meets the real-time processing requirements of mobile terminals. Its inertial data denoising unit preprocesses the raw triaxial inertial sensor data: it calculates the composite acceleration and composite angular velocity and applies the Discrete Wavelet Transform (DWT) to remove noise while preserving true motion characteristics, providing clean data for subsequent high-precision trajectory estimation.
[0136] The core of this invention for achieving high-precision manhole cover positioning is a spatial indexing engine, which integrates multiple positioning data to construct a dynamic spatial feature index for the manhole cover. Using data fusion and trajectory estimation, with GPS coordinates as the initial reference, it fuses denoised inertial data output from the data processing module. Continuous trajectory estimation is performed using the Extended Kalman Filter (EKF) algorithm, fusing global GPS positioning with local relative motion from inertial sensors to generate a high-precision spatial position sequence. When the deviation between the real-time GPS coordinates and the calculated trajectory position exceeds a threshold (e.g., 2 meters), the engine dynamically adjusts the fusion weights, relying more on inertial navigation data to calculate position compensation, correcting GPS drift, and improving positioning robustness and accuracy in complex urban environments. Finally, the visual feature vector of the manhole cover is time-aligned and correlated with the precise spatial position sequence generated through EKF and dynamic compensation, forming a "spatiotemporal joint index" for the manhole cover, tightly binding the manhole cover's "visual attributes" with its "precise spatiotemporal position."
[0137] The core decision-making component of the system in this invention is the bidirectional matching module, which enables a "two-way" function of manhole cover positioning and file association. This includes two methods: forward matching and reverse positioning. Forward matching involves quickly retrieving and associating detailed file information (such as manhole cover number, historical maintenance records, etc.) from the electronic archive based on the visual characteristics and real-time location of the manhole cover, solving the problem of identifying unnumbered manhole covers. Reverse positioning uses the historical coordinates of the target manhole cover in the electronic archive, combined with real-time inertial data from the mobile terminal, to generate a navigation path and guide the user to the physical location of the target manhole cover. This function supports position correction, ensuring accurate navigation even when GPS signals are weak. Finally, the electronic archive stores and manages the electronic file information of all drainage manhole covers, including manhole cover number, geographic coordinates, maintenance records, status, etc., and interacts bidirectionally with the bidirectional matching module, supporting information retrieval, updating, and maintenance.
[0138] This invention draws on several key technologies. First, a lightweight convolutional neural network, MobileMC-Net, is designed specifically for mobile devices using depthwise separable convolution. Unlike standard convolution, depthwise separable convolution decomposes a convolution operation into two independent steps. Depthwise convolution: each channel of the input feature map is convolved independently, with the number of filters equal to the number of input channels. Pointwise convolution: a 1x1 convolution kernel linearly combines all channels to generate a new feature map. The computational cost of standard convolution is $K^2 \cdot C_{in} \cdot C_{out} \cdot H \cdot W$, and the computational cost of depthwise separable convolution is $K^2 \cdot C_{in} \cdot H \cdot W + C_{in} \cdot C_{out} \cdot H \cdot W$, where $K$ is the kernel size, $C_{in}$ is the number of input channels, $C_{out}$ is the number of output channels, and $H \times W$ is the feature map size. The ratio of the two costs is $1/C_{out} + 1/K^2$; because $C_{out}$ and $K^2$ are much larger than 1, the parameter count and computational cost are reduced by roughly an order of magnitude. Second, to let the model adaptively focus on the features most important for manhole cover identification while suppressing irrelevant background information, and to capture global contextual information and enhance feature expressiveness, a global attention mechanism (GAM) is used, which enhances features primarily through channel attention and spatial attention. Third, for data fusion and trajectory extrapolation, an Extended Kalman Filter (EKF) performs state prediction and update and effectively fuses GPS and inertial data. The state vector $x$ includes position $(P_x, P_y, P_z)$, velocity $(V_x, V_y, V_z)$, attitude (e.g., the quaternion $q_w, q_x, q_y, q_z$), and sensor bias. The state transition equation of the motion model is $x_k = f(x_{k-1}, u_k) + w_k$, where $f$ is a nonlinear function, $u_k$ is the inertial measurement (acceleration, angular velocity), and $w_k$ is process noise.
The observation model (measurement equation) is $z_k = h(x_k) + v_k$, where $h$ is a nonlinear function, $z_k$ is the GPS measurement vector, and $v_k$ is the measurement noise vector. When GPS signals are unreliable (e.g., deviation > threshold), the system relies primarily on denoised inertial data for trajectory estimation. Velocity is obtained by integrating acceleration, position is obtained by integrating velocity, and attitude change is obtained by integrating angular velocity. The position update formula is $P_{k+1} = P_k + V_k \Delta t + \tfrac{1}{2} A_k \Delta t^2$ and the velocity update formula is $V_{k+1} = V_k + A_k \Delta t$, where $P$ is position, $V$ is velocity, $A$ is acceleration, and $\Delta t$ is the time interval. In practical applications, the attitude transformation and the Earth coordinate system must also be taken into account. Finally, through spatiotemporal joint indexing, the visual feature vector of a specific manhole cover (obtained from MobileMC-Net) is bound to the precise spatial location sequence calculated by the EKF. This allows the system to find the "location" through the "image" or the "image" through the "location", forming a dynamic "fingerprint" of the manhole cover.
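As a worked illustration of the cost comparison, the following sketch counts multiply-accumulate operations for the two convolution types, assuming the usual expressions $K^2 C_{in} C_{out} H W$ for standard convolution and $K^2 C_{in} H W + C_{in} C_{out} H W$ for depthwise separable convolution. The function names and the layer dimensions in the usage note are illustrative and are not taken from MobileMC-Net.

```python
def standard_conv_cost(k, c_in, c_out, h, w):
    """Multiply-accumulates of a standard KxK convolution over an HxW feature map."""
    return k * k * c_in * c_out * h * w


def separable_conv_cost(k, c_in, c_out, h, w):
    """Multiply-accumulates of a depthwise KxK convolution followed by a 1x1 pointwise one."""
    depthwise = k * k * c_in * h * w      # one KxK filter per input channel
    pointwise = c_in * c_out * h * w      # 1x1 convolution mixing channels
    return depthwise + pointwise


def cost_ratio(k, c_in, c_out, h, w):
    """Separable cost divided by standard cost; equals 1/c_out + 1/k**2."""
    return separable_conv_cost(k, c_in, c_out, h, w) / standard_conv_cost(k, c_in, c_out, h, w)
```

For a hypothetical layer with a 3x3 kernel, 32 input channels, 64 output channels, and a 56x56 feature map, `cost_ratio(3, 32, 64, 56, 56)` equals 1/64 + 1/9, or roughly 0.127, so the separable form needs about one eighth of the operations.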
[0139] According to an embodiment of the present invention, a bidirectional positioning and matching method for drainage well covers based on target detection and spatial feature indexing is provided, the overall process of which is shown in Figure 2. This method achieves accurate identification, positioning, and record management of manhole covers through multi-source data fusion and intelligent algorithms. Specifically, it includes the following steps:
[0140] Step 1: Collect visible light images, GPS positioning data, and three-axis inertial sensor data of the manhole cover using a mobile terminal. The visible light images are high-resolution images, represented as $I \in \mathbb{R}^{H \times W \times 3}$, where $H$ and $W$ are the height and width of the image in pixels; this invention uses H=1920, W=1080. The GPS positioning data are real-time Global Positioning System coordinates (latitude, longitude, altitude), with units of degrees (°) for latitude and longitude and meters (m) for altitude. The triaxial inertial sensor data include triaxial acceleration data $a = (a_x, a_y, a_z)$ and triaxial angular velocity data $\omega = (\omega_x, \omega_y, \omega_z)$, whose units are [missing information]. The sampling frequency can be set to [missing information]. The input image data, GPS data, and IMU data are denoted as [missing information].
[0141] The mobile terminal simultaneously collects images of the manhole cover, GPS coordinates, and triaxial acceleration and angular velocity data, laying the foundation for subsequent processing.
[0142] Step 2: Input the visible light image into the pre-trained lightweight object detection model MobileMC-Net to identify the manhole cover area and output a visual feature vector.
[0143] The MobileMC-Net model performs high-precision image recognition, extracting the bounding box and visual feature vector of each manhole cover as its visual "fingerprint." The model design balances accuracy and real-time performance on mobile devices. The input is the visible light image; the outputs after model inference are the manhole cover bounding box and the visual feature vector. To improve detection accuracy, the model draws on several key technologies.
[0144] Step 3: Preprocess the triaxial inertial sensor data: calculate the composite acceleration and composite angular velocity; and use discrete wavelet transform to denoise the composite data.
[0145] The composite magnitudes of the triaxial acceleration and angular velocity are calculated first. Discrete wavelet transform (DWT) is then applied to the composite data for denoising, and the high-frequency coefficients are processed using a soft thresholding method to eliminate noise and preserve true motion characteristics, providing clean data for high-precision trajectory estimation. The composite acceleration is calculated as $a = \sqrt{a_x^2 + a_y^2 + a_z^2}$, and the composite angular velocity is $\omega = \sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2}$. DWT decomposes a signal into wavelet coefficients of different frequency components: it uses multi-scale decomposition to separate the signal into approximation (low-frequency) components and detail (high-frequency) components. Noise typically manifests as high-frequency components. This invention employs the soft thresholding method. Let the wavelet coefficient be $w$ and the threshold be $\lambda$. The soft threshold function is defined as $\eta(w, \lambda) = \operatorname{sign}(w) \cdot \max(|w| - \lambda, 0)$, where $\operatorname{sign}(w)$ is the sign function: it is 1 when $w > 0$, -1 when $w < 0$, and 0 when $w = 0$. This method sets coefficients whose magnitude is below the threshold to zero and shrinks coefficients whose magnitude is above the threshold towards zero, effectively removing noise while retaining most of the useful signal.
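The preprocessing in Step 3 can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a single-level Haar wavelet, whereas the source does not specify the wavelet basis or the number of decomposition levels, and the function names and threshold values are hypothetical.

```python
import math


def magnitude(x, y, z):
    """Composite magnitude of a triaxial sample: sqrt(x^2 + y^2 + z^2)."""
    return math.sqrt(x * x + y * y + z * z)


def soft_threshold(w, lam):
    """Soft thresholding: sign(w) * max(|w| - lam, 0)."""
    if abs(w) <= lam:
        return 0.0
    return math.copysign(abs(w) - lam, w)


def haar_denoise(signal, lam):
    """One-level Haar DWT, soft-threshold the detail coefficients, then invert.

    The Haar basis and the single level are assumptions made for brevity;
    the input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    # Approximation (low-frequency) and detail (high-frequency) coefficients
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [soft_threshold(d, lam) for d in detail]
    # Inverse Haar transform
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / root2)
        out.append((a - d) / root2)
    return out
```

A typical call would first reduce each inertial sample to its composite magnitude, for example `[magnitude(ax, ay, az) for ax, ay, az in samples]`, and then pass the resulting sequence to `haar_denoise` with a threshold chosen for the sensor's noise level.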
[0146] Step 4: Integrate GPS coordinates with preprocessed inertial data to construct a dynamic spatial feature index. Using GPS coordinates as a reference, and employing denoised inertial data, a continuous and high-precision spatial position sequence is generated through Extended Kalman Filter (EKF) to calculate the trajectory. When the GPS deviation exceeds a threshold, the system dynamically compensates for the position error. Finally, the visual feature vector of the manhole cover is associated with this precise spatiotemporal sequence to form a "spatiotemporal joint index" for the manhole cover.
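One possible way to picture the time alignment behind the "spatiotemporal joint index" is shown below. The sketch assumes the position sequence is a non-empty, time-sorted list of `(timestamp, position)` pairs and that each detection carries its own timestamp; nearest-timestamp association is an assumed alignment rule, and `IndexEntry`, `nearest_position`, and `build_joint_index` are illustrative names rather than structures defined by the invention.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class IndexEntry:
    timestamp: float   # time of the image capture (s)
    position: tuple    # fused position closest in time to the capture
    feature: list      # visual feature vector of the manhole cover


def nearest_position(track, t):
    """Return the (timestamp, position) sample of a time-sorted track closest to time t."""
    times = [ts for ts, _ in track]
    i = bisect_left(times, t)
    if i == 0:
        return track[0]
    if i == len(track):
        return track[-1]
    before, after = track[i - 1], track[i]
    return before if t - before[0] <= after[0] - t else after


def build_joint_index(track, detections):
    """Bind each (timestamp, feature) detection to the nearest-in-time track position."""
    entries = []
    for t, feature in detections:
        _, position = nearest_position(track, t)
        entries.append(IndexEntry(timestamp=t, position=position, feature=feature))
    return entries
```

Interpolating between the two neighbouring position samples would be a natural refinement when the position sequence is sparse relative to the camera frame rate.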
[0147] Step 5: Achieve manhole cover location and file association through bidirectional matching algorithm: Forward matching: Based on the visual feature vector of step (2) and the real-time spatial feature index of step (4), retrieve the associated electronic file information; Reverse positioning: Based on the historical coordinates in the electronic file, combine with real-time inertial data to generate a navigation path and output the physical location of the target manhole cover.
[0148] This process comprises two modes: forward matching and reverse positioning. In forward matching, if a manhole cover without a number or with an unclear number is found on-site, its archival information needs to be quickly located. The system searches the electronic archive based on real-time collected visual feature vectors of the manhole cover (extracted via MobileMC-Net) and a real-time spatial feature index (precise real-time location) constructed using EKF. The final matching result is obtained by weighting visual feature matching and spatial location matching. In reverse positioning, a specific manhole cover recorded in the electronic archive (e.g., a manhole cover requiring maintenance) needs to be located, and navigation to its precise location is required. The system generates a navigation path based on the historical coordinates of the target manhole cover stored in the electronic archive, combined with real-time inertial data (and GPS data) collected by the mobile terminal, and outputs the physical location of the target manhole cover. Crucially, when the deviation between the historical coordinates in the electronic archive and the real-time GPS measurement exceeds a preset threshold (this typically occurs when the GPS signal is poor or the historical coordinates are inaccurate), the system will activate a trajectory estimation algorithm to calculate a position correction amount, providing more accurate physical location coordinates.
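The weighted forward matching described above might look like the following sketch. The source states only that visual feature matching and spatial location matching are weighted; the cosine similarity, the exponential distance score, the weight `alpha`, the distance scale `scale_m`, the record layout, and the archive identifiers in the usage note are all assumptions made for illustration. Positions are assumed to be expressed in a local metric coordinate frame.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def spatial_score(p, q, scale_m=5.0):
    """Map the distance in metres between two points to (0, 1]; closer is higher."""
    return math.exp(-math.dist(p, q) / scale_m)


def forward_match(feature, position, archive, alpha=0.6):
    """Return (record id, score) of the archive record with the best weighted score.

    alpha weights visual similarity; (1 - alpha) weights spatial proximity.
    """
    best_id, best_score = None, -1.0
    for record in archive:
        score = (alpha * cosine_similarity(feature, record["feature"])
                 + (1.0 - alpha) * spatial_score(position, record["position"]))
        if score > best_score:
            best_id, best_score = record["id"], score
    return best_id, best_score
```

For example, with an archive holding hypothetical records `MC-001` at (0, 0) with feature `[1, 0]` and `MC-002` at (30, 0) with feature `[0, 1]`, a query with feature `[0.9, 0.1]` observed at (1, 0) selects `MC-001`, because both the visual and the spatial terms favour it.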
[0149] The above embodiments are merely preferred technical solutions of the present invention and should not be considered as limitations on the present invention. The scope of protection of the present invention should be limited to the technical solutions described in the claims, including equivalent substitutions of the technical features described in the claims. That is, equivalent substitutions and improvements within this scope are also within the scope of protection of the present invention.
Claims
1. A target detection and spatial feature index-based two-way positioning and matching method for a drainage cover, characterized in that: The method includes: S1. Collect visible light images, GPS positioning data, and three-axis inertial sensor data of the manhole cover via a mobile terminal; S2. Input the visible light image into the pre-trained lightweight target detection model to identify the manhole cover area and output the visual feature vector; S3. Preprocess the triaxial inertial sensor data to calculate the composite acceleration and composite angular velocity; and use discrete wavelet transform to denoise the composite data. S4. Integrate GPS coordinates with preprocessed inertial data to construct a dynamic spatial feature index; S5. Achieve manhole cover positioning and file association through a two-way matching algorithm: Forward matching: Based on the visual feature vector in step S2 and the real-time spatial feature index in step S4, retrieve the associated electronic archive information; Reverse positioning: Based on historical coordinates in the electronic archives and combined with real-time inertial data, a navigation path is generated, and the physical location of the target manhole cover is output; The lightweight target detection model in step S2 includes MobileMC-Net, a lightweight target detection model for water pipe network manhole covers; The backbone network is reconstructed using depthwise separable convolution, and the computational cost is reduced by channel-wise convolution and point-wise convolution. A global attention mechanism module is embedded in a specific layer of the backbone network, specifically in layers 3, 6, and 9. This module enhances feature representation capabilities and suppresses background interference through channel attention and spatial attention. 
generating a bounding box of a target region for the visible light image and a corresponding visual feature vector; Local features of the target region are extracted based on the bounding box, and combined with global features to generate the final visual feature vector; The global attention mechanism module includes: For the feature map output by the backbone network, calculate the channel attention weights to enhance the feature representation of key channels; Based on channel attention weights, a spatial attention distribution is generated to highlight the spatial characteristics of the target region. The channel attention weights and spatial attention distributions are fused to generate an enhanced feature map. The enhanced feature map is then input into subsequent convolutional layers to further extract visual features of the target region. In step S4, the specific process of constructing the dynamic spatial feature index is as follows: using GPS coordinates as the reference position point, and based on the denoised inertial data, the trajectory is calculated through extended Kalman filtering to generate a spatial position sequence. Furthermore, visual feature vectors are associated with spatial location sequences to form a spatiotemporal joint index; Using GPS coordinate positioning data as a reference position, obtain real-time global coordinates; Based on the denoised inertial data, a continuous spatial position sequence is generated by estimating the trajectory using the extended Kalman filter algorithm; wherein the state vector $x$ includes position $(P_x, P_y, P_z)$, velocity $(V_x, V_y, V_z)$, attitude, and sensor bias; the state transition equation is $x_k = f(x_{k-1}, u_k) + w_k$, where $f$ is a nonlinear function, $u_k$ is the inertial measurement value, and $w_k$ is process noise; the observation model is $z_k = h(x_k) + v_k$, where $h$ is a nonlinear function,
$z_k$ is the positioning measurement vector, and $v_k$ is the measurement noise vector; Determine whether the deviation between the positioning data and the spatial location sequence exceeds a preset threshold; If the deviation exceeds a preset threshold, the fusion weights are adjusted, and the inertial data is used preferentially for position compensation; the velocity is obtained by integrating the acceleration, and the position is obtained by integrating the velocity. The position update formula is $P_{k+1} = P_k + V_k \Delta t + \tfrac{1}{2} A_k \Delta t^2$, and the velocity update formula is $V_{k+1} = V_k + A_k \Delta t$, where $P$ is position, $V$ is velocity, $A$ is acceleration, and $\Delta t$ is the time interval; The visual feature vector is time-aligned with the spatial location sequence to form the spatiotemporal joint index.
2. The bidirectional positioning and matching method for drainage well covers based on target detection and spatial feature indexing as described in claim 1, characterized in that: The backbone network is reconstructed using depthwise separable convolutions, and computational complexity is reduced through channel-wise and pointwise convolutions; the computational cost of standard convolution is $K^2 \cdot C_{in} \cdot C_{out} \cdot H \cdot W$, and the computational cost of depthwise separable convolution is $K^2 \cdot C_{in} \cdot H \cdot W + C_{in} \cdot C_{out} \cdot H \cdot W$, where $K$ is the kernel size, $C_{in}$ is the number of input channels, $C_{out}$ is the number of output channels, and $H$ and $W$ are the feature map dimensions.
3. The bidirectional positioning and matching method for drainage well covers based on target detection and spatial feature indexing as described in claim 1, characterized in that: In step S3, Acquire triaxial acceleration and triaxial angular velocity data from inertial sensor data; Calculate the composite acceleration of the triaxial acceleration and the composite angular velocity of the triaxial angular velocity; Discrete wavelet transform is used to decompose the composite acceleration $a$ and the composite angular velocity $\omega$ to obtain high-frequency coefficients and low-frequency coefficients. High-frequency coefficients are processed using a soft thresholding method to generate denoised motion data; The soft threshold function is defined as $\eta(w, \lambda) = \operatorname{sign}(w) \cdot \max(|w| - \lambda, 0)$, where $w$ is the wavelet coefficient, $\lambda$ is the threshold, and $\operatorname{sign}(\cdot)$ is the sign function; A continuous sequence of motion features is generated based on the denoised motion data.
4. The bidirectional positioning and matching method for drainage well covers based on target detection and spatial feature indexing as described in claim 1, characterized in that: Trajectory estimation is performed using the extended Kalman filter algorithm, including: Construct a state vector, including position, velocity, and attitude information; The state vector is updated using a nonlinear state transition equation based on the denoised inertial data. By combining positioning data and using an observation model to correct the state vector, a high-precision spatial position sequence is generated. When the reliability of the positioning data is lower than a preset threshold, inertial data is given priority for state updates.
5. The bidirectional positioning and matching method for drainage well covers based on target detection and spatial feature indexing as described in claim 1, characterized in that: In step S5, Based on visual feature vectors, similarity matching is performed in the electronic archives to retrieve corresponding archive information; Based on the real-time spatial feature index, verify the consistency between the archive information and the spatial location of the target object; If the deviation between the historical location data in the electronic file and the real-time location data exceeds a preset threshold, the position correction amount is calculated through the trajectory estimation algorithm. Generate a navigation path based on the position correction and output the physical position coordinates of the target object; In reverse positioning, when the deviation between the electronic file coordinates and the real-time GPS exceeds a threshold δ, the position correction is calculated using a trajectory extrapolation algorithm. Output the corrected physical position coordinates.
6. A two-way positioning and matching system for drainage well covers based on target detection and spatial feature indexing, for implementing the bidirectional positioning and matching method for drainage well covers based on target detection and spatial feature indexing as described in any one of claims 1 to 5, characterized in that the system includes: Mobile data acquisition terminal: integrates a camera, GPS module, and three-axis inertial sensor; Data processing module: Performs lightweight target detection and inertial data denoising; Spatial Indexing Engine: Constructs dynamic spatial feature indexes; Two-way matching module: enables forward file retrieval and reverse physical location.