Milepost positioning method, device, equipment and medium based on multi-modal fusion

By using multimodal fusion technology, the real-time pitch angle is calculated using the left and right views, inverse distortion correction and dead reckoning are performed, and virtual anchor points are generated. This solves the problem of mileage marker positioning error of vehicle-mounted cameras in complex environments and achieves high-precision and continuous positioning results.

CN121963153B · Active · Publication Date: 2026-06-23 · CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY
Filing Date
2026-04-02
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

In complex road environments, the dynamic changes in the pitch angle of the vehicle-mounted camera cause a significant increase in the positioning error of the mileage markers. Furthermore, the visual recognition algorithm is prone to errors under conditions such as changes in lighting and rain or fog, and cannot provide three-dimensional coordinates that meet highway engineering specifications.

Method used

By acquiring the left and right views of the inspection vehicle, calculating the real-time pitch angle, performing inverse distortion processing, and combining dead reckoning and virtual anchor point generation, high-precision positioning of mileage markers is achieved.

Benefits of technology

In dynamic driving conditions and complex environments, it significantly improves the positioning accuracy of mileage markers and the spatial continuity of data, ensuring the output of high-precision asset mapping data.



Abstract

The application discloses a milestone positioning method and device based on multi-modal fusion, equipment and medium, relates to the field of image recognition technology, and the method comprises the following steps: firstly, acquiring the left and right views of the front road, the instantaneous speed, the absolute geographical coordinates and the vehicle heading angle collected by the inspection vehicle; secondly, calculating the real-time pitch angle of the camera relative to the road surface according to the left and right lane line vanishing points; simultaneously, performing time integration on the speed and combining historical data to obtain the dead reckoning mileage value; when the target milestone is detected, performing inverse distortion processing on the initial three-dimensional coordinates by using the pitch angle and verifying the identification value; if the target milestone is not detected and the theoretical mileage position is reached, generating a virtual anchor point coordinate based on the geographical coordinates, the heading angle and the preset lateral offset; mapping the calculated value to the milestone and combining the anchor point to obtain the final positioning result. The application can realize high-precision positioning of the road mileage milestone under the actual inspection working conditions of vehicle dynamic driving, target identification misjudgment or missed detection.

Description

Technical Field

[0001] This application relates to the field of image recognition technology, and in particular to a method, apparatus, equipment and medium for mileage marker positioning based on multimodal fusion.

Background Technology

[0002] In the intelligent maintenance of modern highways, the three-dimensional spatial coordinates of road mileage markers are the absolute physical benchmark for locating various road defects. With the deepening of digitalization of transportation infrastructure, high-precision and high-reliability marker positioning data is not only the core physical benchmark variable for building accurate digital twin maps, but also a fundamental requirement for supporting precise and efficient road maintenance decisions.

[0003] Currently, the mainstream automated inspection solutions in the industry usually rely on vehicle-mounted mobile measurement systems. Their standard technical approach mainly utilizes vehicle-mounted binocular stereo cameras to acquire depth information, and combines it with target detection algorithms such as YOLO to automatically select roadside markers. Finally, optical character recognition technology is used to read the specific mileage value on the marker.

[0004] However, the aforementioned vision-based solutions exhibit significant limitations in complex real-world road environments. Existing ranging algorithms typically assume that the camera's optical axis is perfectly parallel to the road surface, so dynamic changes in the camera's pitch angle, caused by slopes or by vehicle bumps over potholes, introduce severe nonlinear geometric distortions into the coordinate calculation and substantially increase the positioning error. Furthermore, vision-based algorithms are prone to misrecognition under drastic changes in outdoor lighting, rain, fog, or foliage occlusion, which causes jumps in the output mileage data, and the system lacks an effective internal logical verification mechanism. In addition, in blind spots where the target is completely occluded, current methods mostly rely on simple mileage interpolation and fail to provide true 3D coordinates that conform to highway engineering specifications and carry spatial confidence, so the continuity of the spatial data cannot be guaranteed. Therefore, achieving high-precision positioning of road mileage markers under real-world inspection conditions involving dynamic vehicle movement and potential misdetection or missed detection of targets has become an urgent problem to be solved.

Summary of the Invention

[0005] The purpose of this application is to provide a method, device, equipment and medium for mileage marker positioning based on multimodal fusion, which aims to solve the technical problem of how to achieve high-precision positioning of road mileage markers under actual inspection conditions such as dynamic vehicle driving and false or missed detection of targets.

[0006] To achieve the above objectives, this application proposes a mileage marker location method based on multimodal fusion, the method comprising:

[0007] Obtain the left and right views corresponding to the images of the road ahead captured by the onboard camera on the inspection vehicle, and obtain the instantaneous speed, absolute geographic coordinates, and heading angle of the inspection vehicle.

[0008] Based on the vanishing point of the lane lines where the left and right lane lines converge in the left view, calculate the real-time pitch angle of the vehicle camera relative to the horizontal road surface.

[0009] The instantaneous velocity is integrated over time based on the initial moment or the previous verification moment, and combined with the determined historical mileage marker values to obtain the dead reckoning mileage value.

[0010] When the target mileage marker to be located is detected in the left view, the initial three-dimensional coordinates of the target mileage marker are subjected to inverse distortion processing based on the real-time pitch angle, and the identification result of the target mileage marker is verified to obtain the mileage marker positioning result.

[0011] When the target mileage marker is not detected in the left view and the dead reckoning mileage value reaches the theoretical mileage position corresponding to the next mileage marker, virtual anchor point coordinates are generated based on the absolute geographic coordinates, the vehicle heading angle, and the preset roadside lateral offset. The theoretical mileage position is determined based on the historical mileage marker values and the preset mileage marker interval value.

[0012] The dead reckoning mileage value is mapped to the reckoning station number value, and the mileage station number positioning result is obtained based on the virtual anchor point coordinates and the reckoning station number value.

[0013] In one embodiment, the step of detecting the target mileage marker to be located in the left view, performing inverse distortion processing on the initial three-dimensional coordinates of the target mileage marker based on the real-time pitch angle, and verifying the identification result of the target mileage marker to obtain the mileage marker location result includes:

[0014] When detecting the target mileage marker to be located in the forward road image in the left view, the initial three-dimensional coordinates of the target mileage marker are determined based on the left view and the right view;

[0015] Based on the real-time pitch angle, the initial three-dimensional coordinates are subjected to inverse distortion processing to obtain the corrected road network coordinates;

[0016] The detected target mileage markers are subjected to character recognition to obtain the recognition results;

[0017] Based on the dead reckoning mileage value, the identification result is verified through a spatiotemporal dual gating mechanism to obtain the identification station number value;

[0018] The mileage and station number positioning result is obtained based on the corrected road network coordinates and the identified station number values.

[0019] In one embodiment, the step of performing inverse distortion processing on the initial three-dimensional coordinates based on the real-time pitch angle to obtain the corrected road network coordinates includes:

[0020] Construct an inverse distortion rotation matrix based on the real-time pitch angle;

[0021] Multiply the initial three-dimensional coordinates by the inverse distortion rotation matrix to obtain the corrected road network coordinates;

[0022] The inverse distortion rotation matrix is represented as follows:

[0023]

[0024] where the angle in the matrix denotes the real-time pitch angle.
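The matrix itself is not reproduced in this text, so the following is only a minimal sketch of what a pitch-compensating rotation could look like. It assumes a camera frame with x to the right, y downward and z forward, and treats a positive pitch angle as the optical axis tilting downward; the function names `inverse_pitch_rotation` and `apply_rotation` are hypothetical and the sign convention may differ from the one used in the patent.

```python
import math


def inverse_pitch_rotation(theta_rad):
    """Return a 3x3 rotation about the camera x-axis that undoes a pitch of theta_rad.

    Assumed convention (not taken from the source): x right, y down, z forward,
    and a positive theta means the optical axis is tilted downward.
    """
    c, s = math.cos(theta_rad), math.sin(theta_rad)
    return [
        [1.0, 0.0, 0.0],
        [0.0, c, s],
        [0.0, -s, c],
    ]


def apply_rotation(matrix, point):
    """Multiply a 3x3 matrix by a 3-vector given as (x, y, z)."""
    return tuple(sum(m * p for m, p in zip(row, point)) for row in matrix)


# A point 10 m along the optical axis of a camera pitched down by 0.1 rad
# ends up slightly below the level line (positive y) and slightly closer in z.
corrected = apply_rotation(inverse_pitch_rotation(0.1), (0.0, 0.0, 10.0))
```

Under this convention the lateral coordinate is left unchanged, and only the vertical and depth components are mixed by the pitch angle.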

[0025] In one embodiment, when detecting the target mileage marker to be located in the forward road image in the left view, the step of determining the initial three-dimensional coordinates of the target mileage marker based on the left view and the right view includes:

[0026] The target mileage marker to be located is detected in the left view by a preset target detection network, and the detection result is obtained.

[0027] When the detection result indicates that a station number exists, select the detection box corresponding to the target mileage station number;

[0028] The pixel center coordinates of the detection box are determined, and the pixel displacement difference between the pixel center coordinates in the left view and the right view is calculated using a semi-global block matching algorithm to obtain the disparity value.

[0029] Obtain the horizontal focal length, vertical focal length, optical center coordinates, and binocular baseline length of the vehicle-mounted camera;

[0030] The depth coordinates of the target mileage marker relative to the vehicle-mounted camera are calculated based on the horizontal focal length, the binocular baseline length, and the parallax value.

[0031] Based on the vertical focal length and the optical center coordinates, spatial mapping calculations are performed on the pixel center coordinates and the depth coordinates to obtain the corresponding horizontal and vertical coordinates;

[0032] The initial three-dimensional coordinates of the target mileage station are obtained based on the depth coordinates, the lateral coordinates, and the vertical coordinates.
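The depth and spatial mapping steps in this embodiment follow the usual pinhole stereo relations. The sketch below illustrates them under the assumption of a rectified stereo pair; the function name `stereo_to_camera_xyz` and the numeric values are hypothetical, and the lateral coordinate is computed with the horizontal focal length, as in the standard pinhole model.

```python
def stereo_to_camera_xyz(u, v, disparity_px, fx, fy, cx, cy, baseline_m):
    """Back-project the pixel centre (u, v) of a detection box into camera coordinates.

    fx, fy: horizontal and vertical focal lengths in pixels.
    cx, cy: optical centre in pixels.
    baseline_m: binocular baseline in metres.
    Returns (x, y, z) in metres, before any pitch correction.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a valid depth")
    z = fx * baseline_m / disparity_px   # depth from disparity
    x = (u - cx) * z / fx                # lateral coordinate
    y = (v - cy) * z / fy                # vertical coordinate
    return x, y, z


# Illustrative numbers only: a 6-pixel disparity with a 0.12 m baseline
# and a 1000-pixel focal length corresponds to a depth of about 20 m.
x, y, z = stereo_to_camera_xyz(1160, 490, 6.0, 1000.0, 1000.0, 960.0, 540.0, 0.12)
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant markers than for nearby ones.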

[0033] In one embodiment, the step of verifying the identification result through a spatiotemporal dual gating mechanism based on the dead reckoning mileage value to obtain the identification station number value includes:

[0034] The scale evolution features are obtained based on the proportion of the detection box area corresponding to the target mileage station to the full image resolution.

[0035] Based on the scale evolution features, determine whether the detection box satisfies the preset replacement condition between adjacent frames to obtain the first gating decision result;

[0036] Calculate the numerical difference between the identification result and the dead reckoning mileage value, and determine whether the numerical difference exceeds a preset tolerance threshold to obtain the second gating decision result;

[0037] When both the first gating decision result and the second gating decision result are yes, the dead reckoning mileage value is mapped to the identification station number value;

[0038] When the first gating decision result or the second gating decision result is negative, the identification result is used as the identification station number value.
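The decision logic of the two gates can be sketched as follows. The text does not spell out the preset replacement condition, so this example assumes one plausible form: the detection box area ratio drops sharply between adjacent frames, as it would when a nearby marker leaves the view and a distant one appears. The names `box_area_ratio` and `verify_station_value`, and the defaults `drop_factor`, `tolerance_km` and `interval_km`, are hypothetical.

```python
def box_area_ratio(box_w, box_h, img_w, img_h):
    """Scale feature: detection box area as a fraction of the full image area."""
    return (box_w * box_h) / (img_w * img_h)


def verify_station_value(recognized_km, reckoned_km, ratio_prev, ratio_curr,
                         drop_factor=0.5, tolerance_km=0.5, interval_km=1.0):
    """Return the station value to use after the two gating checks."""
    # Gate 1 (assumed form): the box shrank sharply between adjacent frames,
    # suggesting that a new, distant marker has replaced the previous one.
    marker_replaced = ratio_curr < ratio_prev * drop_factor
    # Gate 2: the recognized value disagrees with dead reckoning beyond the tolerance.
    reading_conflicts = abs(recognized_km - reckoned_km) > tolerance_km
    if marker_replaced and reading_conflicts:
        # Both gates are "yes": use the dead-reckoned mileage,
        # snapped to the nearest marker interval.
        return round(reckoned_km / interval_km) * interval_km
    # Otherwise the recognition result is kept.
    return recognized_km
```

For instance, a reading of 108 against a dead-reckoned 102.03 km, arriving just after the box area ratio fell from 0.04 to 0.01, would be replaced by 102 in this sketch, while the same reading without a replacement event would be kept, as the embodiment describes.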

[0039] In one embodiment, the step of calculating the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface based on the vanishing point of the lane lines where the left and right lane lines converge in the left view includes:

[0040] The left view is subjected to grayscale conversion and Gaussian filtering to obtain a denoised grayscale image;

[0041] The pixel gradients in the denoised grayscale image are extracted using an edge detection algorithm to obtain edge contour features;

[0042] The straight line segments in the edge contour features are extracted by Hough line transform, and the straight line segments are fitted by the least squares method to obtain the equations of the left lane line and the right lane line.

[0043] By simultaneously solving the equations of the left lane line and the right lane line, the coordinates of the intersection point of the straight lines are obtained, thus yielding the vanishing point of the lane lines.

[0044] Determine the vertical coordinate of the vanishing point of the lane line in the left view based on the principle of perspective projection.

[0045] Based on the ordinate of the vanishing point, the coordinates of the optical center of the vehicle-mounted camera, and the vertical focal length, inverse trigonometric functions are used to calculate the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface.
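The last two steps, intersecting the fitted lane lines and converting the vanishing point ordinate to a pitch angle, can be illustrated with a short sketch. It assumes the lane lines have already been fitted as v = k·u + b in pixel coordinates and that image rows increase downward, so a vanishing point above the optical center yields a positive (downward) pitch; the function names are hypothetical.

```python
import math


def line_intersection(k1, b1, k2, b2):
    """Intersect the lines v = k1*u + b1 and v = k2*u + b2 in pixel coordinates."""
    if math.isclose(k1, k2):
        raise ValueError("lane lines are parallel in the image; no vanishing point")
    u = (b2 - b1) / (k1 - k2)
    return u, k1 * u + b1


def pitch_from_vanishing_point(v_vp, cy, fy):
    """Pitch angle in radians from the vanishing point row v_vp.

    cy is the optical-centre row and fy the vertical focal length in pixels.
    Assumed sign convention: positive when the camera tilts downward.
    """
    return math.atan((cy - v_vp) / fy)


# Illustrative lane-line fits: left line v = -0.5u + 1000, right line v = 0.5u + 40.
u_vp, v_vp = line_intersection(-0.5, 1000.0, 0.5, 40.0)
theta = pitch_from_vanishing_point(v_vp, cy=540.0, fy=1000.0)
```

With these illustrative numbers the vanishing point lies 20 pixels above the optical center, which corresponds to a pitch of roughly 0.02 rad.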

[0046] In one embodiment, the step of integrating the instantaneous velocity over time based on the initial time or the previous verification time, and combining it with the determined historical mileage marker values to obtain the dead reckoning mileage value includes:

[0047] If the current positioning status is the first positioning after the system starts, the instantaneous speed between the initial time and the current time is integrated to obtain the theoretical distance traveled by the vehicle;

[0048] If the current positioning status is not the first positioning after the system starts, the instantaneous speed between the previous verification time and the current time is integrated to obtain the theoretical distance traveled by the vehicle;

[0049] The direction coefficient is determined based on the travel direction of the inspection vehicle;

[0050] The dead reckoning mileage value is calculated based on the theoretical distance, the direction coefficient, and the determined historical mileage marker value.
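A minimal sketch of this calculation is given below. It assumes the speed is available as discrete (time, speed) samples, integrates them with the trapezoidal rule, and takes the direction coefficient as +1 when the vehicle travels toward increasing mileage and -1 otherwise; the exact definition of the coefficient is given in Figure 2, which is not reproduced here, so that choice and the function names are assumptions.

```python
def integrate_speed(samples):
    """Trapezoidal integration of (time_s, speed_mps) samples; returns metres."""
    distance = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        distance += 0.5 * (v0 + v1) * (t1 - t0)
    return distance


def dead_reckoning_mileage(history_km, samples, direction):
    """Dead-reckoned mileage in km.

    history_km: last confirmed mileage marker value.
    samples: speed samples since the initial or previous verification time.
    direction: +1 for increasing mileage, -1 for decreasing (assumed convention).
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    return history_km + direction * integrate_speed(samples) / 1000.0


# 20 s at a constant 20 m/s after passing the marker at 101 km.
samples = [(0.0, 20.0), (10.0, 20.0), (20.0, 20.0)]
mileage_up = dead_reckoning_mileage(101.0, samples, 1)
mileage_down = dead_reckoning_mileage(101.0, samples, -1)
```

Resetting the integration at each verification time keeps the accumulated speed error bounded by the distance between two confirmed markers.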

[0051] Furthermore, to achieve the above objectives, this application also proposes a multimodal fusion-based mileage marker positioning device, the device comprising:

[0052] The data acquisition module is used to acquire the left and right views corresponding to the images of the road ahead captured by the on-board camera on the inspection vehicle, and to acquire the instantaneous speed, absolute geographic coordinates, and heading angle of the inspection vehicle.

[0053] The attitude calculation module is used to calculate the real-time pitch angle of the vehicle camera relative to the horizontal road surface based on the vanishing point of the lane lines where the left and right lane lines converge in the left view.

[0054] The physical calculation module is used to perform time integration on the instantaneous velocity based on the initial time or the previous verification time, and combine it with the determined historical mileage station values to obtain the dead reckoning mileage value.

[0055] The detection and positioning module is used to perform inverse distortion processing on the initial three-dimensional coordinates of the target mileage station number based on the real-time pitch angle when the target mileage station number is detected in the left view, and to verify the identification result of the target mileage station number to obtain the mileage station number positioning result.

[0056] The virtual anchor point generation module is used to generate virtual anchor point coordinates based on the absolute geographic coordinates, the vehicle heading angle, and a preset roadside lateral offset when the target mileage marker is not detected in the left view and the dead reckoning mileage value reaches the theoretical mileage position corresponding to the next mileage marker. The theoretical mileage position is determined based on the historical mileage marker values and the preset mileage marker interval value.

[0057] The blind zone output module is used to map the dead reckoning mileage value to the reckoning station number value, and obtain the mileage station number positioning result based on the virtual anchor point coordinates and the reckoning station number value.

[0058] Furthermore, to achieve the above objectives, this application also proposes a multimodal fusion-based mileage marker positioning device, the device comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being configured to implement the steps of the multimodal fusion-based mileage marker positioning method described above.

[0059] In addition, to achieve the above objectives, this application also proposes a storage medium, which is a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, it implements the steps of the multimodal fusion-based mileage marker positioning method described above.

[0060] In addition, to achieve the above objectives, this application also provides a computer program product, which includes a computer program that, when executed by a processor, implements the steps of the multimodal fusion-based mileage marker positioning method described above.

[0061] One or more technical solutions proposed in this application have at least the following technical effects:

[0062] First, the left and right views corresponding to the road images captured by the vehicle-mounted camera on the inspection vehicle are obtained, and instantaneous speed, absolute geographic coordinates, and vehicle heading angle are acquired in parallel, providing a multi-source heterogeneous data foundation for subsequent multi-dimensional feature fusion and localization. Second, the vanishing points of the lane lines fitted from the left and right lane lines in the left view are used to calculate the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface, which can dynamically correct the target pose distortion caused by road undulations, thereby significantly improving the accuracy of spatial positioning. Subsequently, the instantaneous speed is integrated over time based on the initial time or the previous verification time, and combined with the determined historical mileage marker values to obtain the dead reckoning mileage value, establishing a theoretical position constraint benchmark that conforms to the laws of physical motion for the visual processing results. Then, when the target mileage marker to be located is detected in the left view, the initial three-dimensional coordinates of the target mileage marker are subjected to inverse distortion processing based on the real-time pitch angle, and the identification result of the target mileage marker is verified in combination with the dead reckoning mileage value, thereby effectively filtering out jump false detections caused by changes in lighting or numerical similarity. Finally, when the target mileage marker is not detected and the dead reckoning mileage value reaches the theoretical mileage position corresponding to the next mileage marker, virtual anchor point coordinates are generated based on absolute geographic coordinates, vehicle heading angle and preset roadside lateral offset, and the dead reckoning mileage value is mapped to the reckoning marker value to ensure the absolute continuity of asset mapping data in spatiotemporal distribution. This application can achieve high-precision positioning of road mileage markers under actual inspection conditions of vehicle dynamic driving and target recognition false detection or missed detection.

Attached Figure Description

[0063] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.

[0064] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, for those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0065] Figure 1 is a flowchart illustrating an embodiment of the multimodal fusion-based mileage marker positioning method of this application;

[0066] Figure 2 is a schematic diagram illustrating the definition of the direction coefficient provided in Embodiment 1 of the multimodal fusion-based mileage marker positioning method of this application;

[0067] Figure 3 is a schematic diagram of the virtual anchor point coordinate generation process provided in Embodiment 1 of the multimodal fusion-based mileage marker positioning method of this application;

[0068] Figure 4 is a flowchart illustrating Embodiment 2 of the multimodal fusion-based mileage marker positioning method of this application;

[0069] Figure 5 is a logical diagram of the spatiotemporal dual-gating mechanism provided in Embodiment 2 of the multimodal fusion-based mileage marker positioning method of this application;

[0070] Figure 6 is a simplified flowchart illustrating the multimodal fusion-based mileage marker positioning method provided in Embodiment 2 of this application;

[0071] Figure 7 is a schematic diagram of the module structure of the mileage marker positioning device based on multimodal fusion according to an embodiment of this application;

[0072] Figure 8 is a schematic diagram of the hardware operating environment involved in the multimodal fusion-based mileage marker positioning method in the embodiments of this application.

[0073] The purpose, features, and advantages of this application will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed Implementation

[0074] It should be understood that the specific embodiments described herein are merely illustrative of the technical solutions of this application and are not intended to limit this application.

[0075] To better understand the technical solution of this application, a detailed description will be provided below in conjunction with the accompanying drawings and specific implementation methods.

[0076] It should be noted that the executing entity of this application embodiment can be a computing service device with data processing, network communication, and program execution functions, such as a tablet computer, personal computer, or mobile phone, or an electronic device or a stationing system capable of performing the above functions. The following description uses a stationing system as an example to illustrate this embodiment and the subsequent embodiments.

[0077] The user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data comply with the relevant laws, regulations and standards of the relevant countries and regions.

[0078] Based on this, the embodiments of this application provide a mileage marker positioning method based on multimodal fusion. Referring to Figure 1, Figure 1 is a flowchart illustrating the first embodiment of the mileage marker positioning method based on multimodal fusion of this application.

[0079] In this embodiment, the mileage marker positioning method based on multimodal fusion includes steps S10~S60:

[0080] Step S10: Obtain the left and right views corresponding to the images of the road ahead captured by the vehicle-mounted camera on the inspection vehicle, and obtain the instantaneous speed, absolute geographic coordinates, and heading angle of the inspection vehicle.

[0081] Step S20: Calculate the real-time pitch angle of the vehicle camera relative to the horizontal road surface based on the vanishing point of the lane lines where the left and right lane lines converge in the left view.

[0082] Step S30: Integrate the instantaneous velocity over time based on the initial time or the previous verification time, and combine it with the determined historical mileage marker values to obtain the dead reckoning mileage value.

[0083] Step S40: When the target mileage marker to be located is detected in the left view, the initial three-dimensional coordinates of the target mileage marker are subjected to inverse distortion processing according to the real-time pitch angle, and the identification result of the target mileage marker is verified to obtain the mileage marker positioning result.

[0084] Step S50: When the target mileage marker is not detected in the left view and the dead reckoning mileage value reaches the theoretical mileage position corresponding to the next mileage marker, virtual anchor point coordinates are generated based on the absolute geographic coordinates, the vehicle heading angle, and the preset roadside lateral offset. The theoretical mileage position is determined based on the historical mileage marker values and the preset mileage marker interval value.

[0085] Step S60: Map the dead reckoning mileage value to the reckoning station number value, and obtain the mileage station number positioning result based on the virtual anchor point coordinates and the reckoning station number value.
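The branching between steps S40 and S50 to S60 can be summarized in a small sketch. It assumes that the theoretical mileage position of the next marker is the last confirmed value plus one interval in the direction of travel, and that "reaching" it means the dead-reckoned mileage has met or passed that value; the function names and the returned branch labels are hypothetical.

```python
def theoretical_next_position(history_km, interval_km=1.0, direction=1):
    """Mileage at which the next marker should appear (assumed: one interval ahead)."""
    return history_km + direction * interval_km


def select_branch(marker_detected, reckoned_km, history_km,
                  interval_km=1.0, direction=1):
    """Decide which processing path applies to the current frame."""
    if marker_detected:
        return "visual"  # step S40: pitch correction plus verification
    target = theoretical_next_position(history_km, interval_km, direction)
    # The comparison is made along the direction of travel.
    reached = (reckoned_km - target) * direction >= 0
    # Steps S50 and S60 apply only once the theoretical position is reached.
    return "virtual_anchor" if reached else "keep_reckoning"
```

For example, with the last confirmed marker at 101 km and a 1 km interval, an undetected marker triggers the virtual anchor path only once the dead-reckoned mileage reaches 102 km.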

[0086] It should be noted that the objects of multimodal analysis include the image of the road ahead (visual modality), the instantaneous speed, absolute geographic coordinates, and heading angle of the inspection vehicle (kinematic modality), dynamic pitch angle, and inverse distortion rotation matrix (spatial attitude modality), etc. Multimodal analysis aims to compensate for the shortcomings of visual algorithms in complex environments by fusing data with different attributes from the physical world.

[0087] The image of the road ahead refers to the digital image data, including the road environment, lane lines, and mileage markers, collected in real time by the vehicle-mounted ZED2i binocular stereo camera during the inspection vehicle's movement. The left view is the two-dimensional raw image captured by the left lens of the binocular camera system at a specific moment; it is the primary data source for target detection and vanishing point extraction. The right view is the image captured synchronously with the left view by the right lens; together, they form a parallax pair used to calculate the depth information of the target. Instantaneous speed refers to the real-time movement rate of the inspection vehicle output via the CAN bus or integrated navigation system at the sampling time. Absolute geographic coordinates refer to the vehicle's longitude and latitude position information in Earth's space obtained using a high-precision integrated navigation system (RTK-GPS). The heading angle is the horizontal deflection angle of the vehicle's longitudinal axis centerline relative to geographic true north, used to determine the vehicle's direction of travel. The left lane line refers to the lane edge boundary line located to the left of the vehicle's direction of travel, identified and fitted by machine vision algorithms within the region of interest in the image.

[0088] The right lane line refers to the lane edge boundary located on the right side of the vehicle's direction of travel, identified within the region of interest in the image. The vanishing point (VP) is the point on the two-dimensional image plane at which the left and right lane lines, which are parallel in the three-dimensional physical world, appear to converge as they extend into the distance. The real-time pitch angle is the dynamic angle between the camera's optical axis and the horizontal road surface; its value changes in real time with road gradient variations or vehicle movement. The initial moment refers to the time at which the system starts operating, begins the mileage integration, or records the first set of valid reference data. The previous verification moment refers to the timestamp of the most recent high-confidence target identification and effective mileage calibration performed by the system before the current frame. The historical mileage marker value refers to the value of the most recent physical mileage marker that the system has successfully identified, confirmed, and recorded before the current positioning action. The dead reckoning mileage value refers to the theoretical road mileage position of the vehicle at the current moment, calculated by integrating vehicle kinematic data over time. The target mileage marker refers to the physical mileage marker that appears within the field of view of the inspection camera and requires the system to perform automated location calculation, character recognition, and attribute marking.

[0089] The initial three-dimensional coordinates refer to the original spatial position of the mileage marker in the camera coordinate system, calculated directly from the binocular parallax principle before attitude correction. The mileage marker positioning result refers to the asset mapping data, including the precise mileage value and the corresponding 3D physical coordinates, finally output by the system after attitude decoupling, semantic arbitration correction, or blind spot compensation. The theoretical mileage position corresponding to the next mileage marker refers to the ideal mileage value predicted from highway construction specifications and the previous marker data. The preset roadside lateral offset (ΔL) refers to the fixed horizontal distance (e.g., 3 meters) between the marker installation center point and the driving lane boundary, determined according to highway construction standards. The virtual anchor point coordinates refer to the simulated 3D spatial coordinates of the marker, calculated by applying the lateral offset, projected according to the heading angle, to the GPS position under blind spot conditions where the visual algorithm fails. The preset mileage marker interval value refers to the fixed mileage span (e.g., 1 kilometer) between two adjacent physical markers as specified in highway standards. The reckoning station number value refers to the substitute station number label generated by the dead reckoning logic when the visual recognition result is judged to be a logical conflict or a missed detection.

[0090] Understandably, the stationing system first acquires the left and right views corresponding to the road images captured by the vehicle-mounted camera, and then obtains in real time, through the vehicle's CAN bus or a high-precision integrated navigation system (RTK-GPS + IMU), the instantaneous speed $v_t$ of the inspection vehicle at the current time t (unit: m/s), the absolute geographic coordinates $(Lon_t, Lat_t)$, and the vehicle heading angle. The system uses image processing algorithms to identify the left and right lane lines in the left view. It then uses the lane line vanishing point formed by their intersection in the distance to infer the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface. This mathematically compensates for camera attitude deviations caused by uneven road surfaces or inclines, eliminating systematic ranging errors. Simultaneously, the system continuously integrates the instantaneous speed over time from the initial moment or the previous verification moment, and calculates the current dead reckoning mileage value by combining it with the stored historical mileage marker value, providing a kinematic constraint against which the visual recognition results can be checked.

[0091] Next, when the target mileage marker to be located is detected in the left view, the system uses the real-time pitch angle to construct a rotation matrix, performs inverse distortion correction on the initial three-dimensional coordinates of the target mileage marker to restore its true physical spatial location, and performs spatiotemporal logical verification on the character recognition result of the target mileage marker, thereby producing a high-precision mileage marker positioning result. In addition, suppose the target mileage marker goes undetected in the left view for a sustained period (no detection box is output for N consecutive frames) while the current dead reckoning mileage value has reached the theoretical mileage position of the next mileage marker, calculated from the historical mileage marker value and the preset mileage marker interval value; that is, the kinematic integrator running in the background shows that the vehicle has passed the location where the next mileage marker (such as K102) should theoretically appear. In that case, the system extracts the absolute geographic coordinates and the vehicle heading angle and superimposes the preset roadside lateral offset through triangular geometric projection to generate virtual anchor point coordinates, thereby compensating for data interruptions caused by extreme blind spot scenarios such as occlusion. Finally, the system maps the current dead reckoning mileage value to the estimated station number value and binds it to the generated virtual anchor point coordinates, thereby obtaining a mileage marker positioning result that remains continuous in both spatial distribution and semantic sequence.

[0092] As an example, the step of calculating the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface based on the vanishing point of the lane lines where the left and right lane lines converge in the left view includes: performing grayscale conversion and Gaussian filtering on the left view to obtain a denoised grayscale image; extracting pixel gradients from the denoised grayscale image using an edge detection algorithm to obtain edge contour features; extracting line segments from the edge contour features using Hough line transform, and fitting the line segments using the least squares method to obtain the equations for the left and right lane lines; solving for the coordinates of the intersection of the lines by simultaneously solving the equations for the left and right lane lines to obtain the vanishing point of the lane lines; determining the ordinate of the vanishing point in the left view based on the principle of perspective projection; and performing inverse trigonometric function calculation based on the ordinate of the vanishing point, the optical center coordinates of the vehicle-mounted camera, and the vertical focal length to obtain the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface.

[0093] It should be noted that a denoised grayscale image refers to a single-channel digitized image obtained by converting the original color image into a grayscale image and applying Gaussian filtering to eliminate random noise interference in the original image. Edge detection algorithms are mathematical computational models that identify and extract physical boundary cues of objects by calculating the gradient magnitude and direction of image pixels in the horizontal and vertical directions and using a double thresholding method (e.g., a low threshold of 50 and a high threshold of 150). Edge contour features refer to the set of pixels retained after processing by machine vision algorithms, which can characterize the geometric trend of lane lines, mileage markers, or other physical boundaries in an image. The Hough line transform is a mathematical transformation method that maps edge pixels in image space to polar coordinate parameter space for voting, thereby extracting long line segment features with global geometric significance from a discrete pixel sequence. The left lane line equation is a mathematical expression used to describe the position and direction of the left driving boundary by linearly fitting line segments belonging to the left lane line family within the region of interest of the image using the least squares method. The right lane line equation refers to the linear relationship obtained by mathematically modeling the fitted line segments belonging to the right lane line family in the image, used to quantify the geometric trajectory of the right driving boundary in the image coordinate system. The intersection point coordinates refer to the x and y coordinates of the intersection point of two straight lines in a two-dimensional pixel coordinate system, calculated by simultaneously solving the left and right lane line equations. 
The perspective projection principle refers to the geometric mapping law that parallel straight lines in three-dimensional physical space, when projected onto a two-dimensional image plane, extend further away with increasing distance and eventually converge at a point. The optical center coordinates refer to the positional parameter of the intersection point of the principal optical axis of the vehicle camera lens and the image sensor plane in the digital image pixel coordinate system. The vertical focal length is the equivalent distance from the perspective center of the camera lens to the imaging plane in the vertical pixel dimension, and is a key internal parameter determining the degree of image perspective distortion and its spatial projection relationship.

[0094] Understandably, firstly, the stationing system converts the left view corresponding to the acquired image of the road ahead into a single-channel grayscale format, applies a 5×5 Gaussian filter kernel to smooth the image by convolution, and then crops the lower half of the image as the region of interest (ROI) to obtain a denoised grayscale image, thereby eliminating random noise in the original image and providing a stable pixel basis for subsequent geometric extraction. Next, the stationing system uses an edge detection algorithm (Canny edge detection) to calculate the gradient magnitude and direction of each pixel in the denoised grayscale image, and employs a dual thresholding method (e.g., a low threshold of 50 and a high threshold of 150) to extract clear image edge contours (the edge contour features), so as to accurately capture the physical boundary cues of the lane lines in the two-dimensional imaging plane. Then, the stationing system uses the Hough line transform to map the pixels in the edge contour features to the parameter space for cumulative voting, thereby identifying significant long straight line segments among the discrete edges, and uses the least squares method to fit a line to the set of segments on each side, obtaining the left lane line equation and the right lane line equation that characterize the direction in which the road extends.

[0095] Next, the stationing system solves the left and right lane line equations simultaneously to obtain their common solution, which is the lane line vanishing point, the point at which lines that are parallel in three-dimensional space converge under perspective. This vanishing point serves as the geometric reference for inferring the camera's dynamic spatial attitude. Subsequently, based on the perspective projection principle, the system takes the ordinate of the vanishing point in the left view from the intersection coordinates. This value quantifies the pixel offset of the camera's principal optical axis relative to the horizontal plane. Finally, the system applies an inverse trigonometric formula to the ordinate of the vanishing point, the optical center coordinates, and the vertical focal length of the vehicle-mounted camera, thereby obtaining the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface. This provides the attitude compensation parameter used in subsequent steps to eliminate the nonlinear distortion of the target object's coordinates caused by road surface undulations.
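As an illustrative, non-limiting sketch, the least squares fitting and the simultaneous solution of the two lane line equations might be written as follows. The function names, the choice of fitting x as a function of y (which stays well conditioned for steep lane lines), and the sample pixel values are assumptions made for illustration and are not taken from this application.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates


def fit_line_x_of_y(points: List[Point]) -> Tuple[float, float]:
    """Least-squares fit of x = a * y + b through the given pixel points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points)
    if syy == 0.0:
        raise ValueError("all points share one y value; cannot fit x = a*y + b")
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    a = sxy / syy
    b = mean_x - a * mean_y
    return a, b


def vanishing_point(left: Tuple[float, float],
                    right: Tuple[float, float]) -> Point:
    """Intersect two lines given as (a, b) with x = a * y + b."""
    a1, b1 = left
    a2, b2 = right
    if abs(a1 - a2) < 1e-12:
        raise ValueError("lane lines are parallel in the image")
    y = (b2 - b1) / (a1 - a2)
    x = a1 * y + b1
    return x, y


# Hypothetical segment endpoints, e.g. as returned by a Hough line transform.
left_line = fit_line_x_of_y([(300, 700), (400, 600), (500, 500)])
right_line = fit_line_x_of_y([(980, 700), (880, 600), (780, 500)])
vp_x, vp_y = vanishing_point(left_line, right_line)
```

Only the ordinate `vp_y` is needed for the pitch angle calculation; the abscissa is returned for completeness.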

[0096] The formula for calculating the real-time pitch angle is as follows:

[0097] $$\theta = \arctan\left(\frac{v_{vp} - c_y}{f_y}\right)$$

[0098] where $\theta$ is the real-time pitch angle; $v_{vp}$ is the ordinate of the lane line vanishing point, reflecting the position in the image at which the lane lines converge in the distance; $c_y$ is the ordinate of the optical center coordinates, serving as the reference point of the image coordinates; and $f_y$ is the vertical focal length.
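By way of a hedged example, the inverse trigonometric step can be sketched in a few lines. The sign convention used here (image ordinate increasing downward, so a vanishing point below the optical center gives a positive angle) and the numeric values are assumptions for illustration only.

```python
import math


def pitch_from_vanishing_point(v_vp: float, c_y: float, f_y: float) -> float:
    """Return the pitch angle in radians from the vanishing point ordinate.

    v_vp: ordinate of the lane line vanishing point (pixels)
    c_y:  ordinate of the optical center (pixels)
    f_y:  vertical focal length (pixels)
    """
    if f_y <= 0:
        raise ValueError("vertical focal length must be positive")
    return math.atan((v_vp - c_y) / f_y)


# Hypothetical intrinsics: optical center row 360, vertical focal length 700 px.
pitch_rad = pitch_from_vanishing_point(v_vp=372.0, c_y=360.0, f_y=700.0)
pitch_deg = math.degrees(pitch_rad)
```

When the vanishing point coincides with the optical center row, the function returns zero, which corresponds to an optical axis parallel to the road surface.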

[0099] This embodiment also includes a fallback mechanism based on a confidence assessment: when rain or nighttime conditions leave so few effective line segments from the Hough transform that the intersection point cannot be determined, the system triggers an interrupt and directly reads the gyroscope-integrated pitch angle from the ZED2i's built-in IMU as the alternative value, ensuring that the algorithm remains available around the clock.
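One possible, non-limiting way to express this fallback is shown below. The segment-count threshold `min_segments` and the function name are hypothetical; this application does not specify how the confidence level is computed, and reading the IMU itself is outside the scope of the sketch.

```python
from typing import Optional, Tuple


def select_pitch(vision_pitch: Optional[float],
                 n_segments: int,
                 imu_pitch: float,
                 min_segments: int = 4) -> Tuple[float, str]:
    """Choose between the vision-derived pitch and the IMU pitch.

    vision_pitch is None when no vanishing point could be determined.
    min_segments is an assumed confidence threshold on the number of
    effective Hough line segments.
    """
    if vision_pitch is None or n_segments < min_segments:
        return imu_pitch, "imu"
    return vision_pitch, "vision"


# Clear daytime frame: enough segments, so the vision estimate is kept.
pitch, source = select_pitch(vision_pitch=0.02, n_segments=12, imu_pitch=0.05)
```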

[0100] As an example, the step of integrating the instantaneous speed based on the initial time or the previous verification time and combining it with the determined historical mileage marker values ​​to obtain the dead reckoning mileage value includes: if the current positioning state is the first positioning after the system starts, then integrating the instantaneous speed from the initial time to the current time to obtain the dead reckoning mileage value; if the current positioning state is not the first positioning after the system starts, then integrating the instantaneous speed from the previous verification time to the current time to obtain the theoretical distance traveled by the vehicle; determining the direction coefficient based on the travel direction of the inspection vehicle; and calculating the dead reckoning mileage value based on the theoretical distance, the direction coefficient, and the determined historical mileage marker values.

[0101] It should be noted that the current positioning status refers to the logical judgment stage in which the stationing system processes the data stream. This stage distinguishes whether the current task is the first baseline establishment phase after system startup or an incremental calculation phase based on existing calibration points. The dead reckoning mileage value refers to the theoretical mileage value calculated using the kinematic physical constraints of the inspection vehicle when visual perception fails or when logical consistency verification of the recognition results is required. The theoretical distance traveled by the vehicle refers to the physical displacement length obtained by the stationing system using calculus principles to perform time integration on the instantaneous speed of the inspection vehicle over the sampling period. The direction coefficient is a constant characterizing the logical mapping relationship between the direction of travel of the inspection vehicle and the arrangement pattern of station numbers, used to determine the sign attribute of the displacement during the mileage accumulation process.

[0102] Understandably, the formula for calculating dead reckoning distance is as follows:

[0103] $$M_{dr} = M_{hist} + k \cdot \frac{1}{1000} \int_{t_{prev}}^{t_{now}} v(t)\,dt$$

[0104] where $M_{dr}$ is the dead reckoning mileage value; $k$ is the direction coefficient, equal to +1 for forward travel and -1 for reverse travel; the denominator 1000 converts meters to kilometers (station units); $M_{hist}$ is the historical mileage marker value from the last high-confidence identification; $v$ is the instantaneous speed; $t_{prev}$ is the previous verification time, that is, the time at which the last high-confidence station identification was completed; and $t_{now}$ is the current moment, that is, the time at which the dead reckoning mileage value needs to be calculated.
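As a minimal sketch under stated assumptions, the time integration can be approximated numerically from sampled speeds. The trapezoidal rule, the function name, and the sample values are illustrative choices and are not prescribed by this application.

```python
from typing import Sequence


def dead_reckoning_mileage(hist_km: float,
                           times_s: Sequence[float],
                           speeds_mps: Sequence[float],
                           direction: int) -> float:
    """Historical mileage (km) plus the signed distance travelled since then.

    times_s / speeds_mps: samples from the previous verification time (or the
    initial time) up to the current time; direction is +1 or -1.
    """
    if direction not in (1, -1):
        raise ValueError("direction coefficient must be +1 or -1")
    if len(times_s) != len(speeds_mps) or len(times_s) < 2:
        raise ValueError("need matching time and speed samples (at least two)")
    distance_m = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Trapezoidal rule over one sampling interval.
        distance_m += 0.5 * (speeds_mps[i] + speeds_mps[i - 1]) * dt
    return hist_km + direction * distance_m / 1000.0  # metres -> kilometres


# Hypothetical run: 50 s at a constant 20 m/s after passing marker K101.
times = [0, 10, 20, 30, 40, 50]
speeds = [20.0] * 6
forward_km = dead_reckoning_mileage(101.0, times, speeds, direction=1)
reverse_km = dead_reckoning_mileage(101.0, times, speeds, direction=-1)
```

With a constant 20 m/s over 50 s the vehicle covers 1000 m, so the forward case advances the mileage by one kilometre and the reverse case reduces it by one kilometre.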

[0105] This embodiment assumes that the dead reckoning mileage is... .

[0106] Please refer to Figure 2, which illustrates the definition of the direction coefficient in Embodiment 1 of the multimodal fusion-based mileage marker positioning method of this application, together with the corresponding relationships. In the diagram, two parallel road lanes are marked as driving paths, and the mileage markers along the roads are arranged in sequence as K99, K100, K101, K102, K103, and K104. Each mileage marker is represented by a rectangle standing at the road edge, perpendicular to the road centerline. On the upper road, the vehicle travels to the right and the marker value increases along the travel direction; therefore, the direction coefficient is defined as +1. On the lower road, the vehicle travels to the left and the marker value decreases along the travel direction; therefore, the direction coefficient is defined as -1. The travel direction arrows in the diagram indicate the correspondence between direction and marker changes. This ensures that, when the dead reckoning mileage value is calculated, the mileage is accumulated or decremented with the correct sign for the vehicle's travel direction, keeping the marker value consistent with the vehicle's motion state.

[0107] As an example, the step of generating virtual anchor point coordinates based on the absolute geographic coordinates, the vehicle heading angle, and the preset roadside lateral offset includes: converting the absolute geographic coordinates into vehicle plane coordinates in the East-North-Up (ENU) local plane coordinate system; performing trigonometric function spatial projection transformation based on the vehicle heading angle and the preset roadside lateral offset to obtain an offset vector; and superimposing the offset vector onto the vehicle plane coordinates to obtain the virtual anchor point coordinates.

[0108] It should be noted that the East-North-Up (ENU) local plane coordinate system refers to a local rectangular coordinate system whose origin is a specific point on the Earth's surface and whose three mutually perpendicular axes point to geographic due east, due north, and upward along the normal to the reference ellipsoid. The vehicle plane coordinates refer to the two-dimensional position within the horizontal projection plane of the ENU local plane coordinate system, obtained by transforming the absolute geographic coordinates of the inspection vehicle with a coordinate transformation algorithm. The offset vector refers to the displacement vector in the horizontal plane that represents the installation position of the roadside mileage marker relative to the center of the inspection vehicle, calculated using trigonometric function spatial projection transformation.

[0109] Understandably, the system extracts the vehicle's current high-precision GPS latitude and longitude (i.e., the absolute geographic coordinates) and converts them into coordinates in the East-North-Up (ENU) local plane coordinate system. It then extracts the vehicle's current heading angle (i.e., the vehicle heading angle). Based on highway construction specifications, a prior lateral installation offset value ΔL (i.e., the preset roadside lateral offset) is introduced (e.g., an offset of 3 meters to the right of the carriageway). Using trigonometric projection, the actual coordinates of the virtual anchor point are calculated:

[0110]

[0111]

[0112] The final estimated station number value "K102" and the virtual anchor point coordinates are bound together and output. This step ensures the absolute continuity of asset mapping data in terms of 3D spatial distribution and semantic temporal sequence without requiring any visual data.
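The coordinate conversion and lateral projection described above could, for example, be sketched as follows. Several assumptions are made for illustration and are not stated in this application: a small-area tangent plane approximation is used in place of a full geodetic-to-ENU transformation, the heading angle is measured clockwise from due north, the marker stands to the right of the travel direction, and only the horizontal (east, north) components are computed.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6378137.0  # WGS-84 semi-major axis


def geodetic_to_local_en(lat_deg: float, lon_deg: float,
                         lat0_deg: float, lon0_deg: float) -> Tuple[float, float]:
    """Approximate (east, north) in metres relative to the origin (lat0, lon0).

    Small-area approximation; adequate only near the chosen origin.
    """
    lat0 = math.radians(lat0_deg)
    east = math.radians(lon_deg - lon0_deg) * EARTH_RADIUS_M * math.cos(lat0)
    north = math.radians(lat_deg - lat0_deg) * EARTH_RADIUS_M
    return east, north


def virtual_anchor(east: float, north: float,
                   heading_deg: float, delta_l_m: float) -> Tuple[float, float]:
    """Shift the vehicle position by delta_l_m to the right of its heading.

    Heading is assumed clockwise from north, so the unit vector pointing to
    the right of travel is (cos(psi), -sin(psi)) in (east, north).
    """
    psi = math.radians(heading_deg)
    return (east + delta_l_m * math.cos(psi),
            north - delta_l_m * math.sin(psi))


# Hypothetical vehicle at local position (100 m, 200 m), heading due north,
# with a 3 m roadside offset.
anchor_e, anchor_n = virtual_anchor(100.0, 200.0, heading_deg=0.0, delta_l_m=3.0)
```

Under these conventions a vehicle heading due north places the anchor 3 m to its east, and a vehicle heading due east places it 3 m to its south. A different heading convention or a left-side installation would change the signs.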

[0113] Please refer to Figure 3, which is a schematic diagram of the virtual anchor point coordinate generation process provided in Embodiment 1 of the multimodal fusion-based mileage marker positioning method of this application. First, the system checks whether consecutive target misses have occurred and whether the dead reckoning mileage value has reached the next marker. If the condition is not met, the system performs conventional binocular visual positioning along the left-hand branch. If the condition is met, the vehicle has encountered an extreme blind spot, and the system starts the virtual anchor point spatial mapping mechanism along the right-hand branch. Subsequently, the system extracts the vehicle's instantaneous high-precision GPS coordinates and heading angle to determine the vehicle's position and orientation in the local plane coordinate system. Next, the system loads the prior engineering parameters, including the roadside lateral installation offset ΔL, so that the virtual anchor point position conforms to highway construction specifications. Then, the system performs a trigonometric function spatial projection transformation based on the heading angle and the offset, mapping the vehicle's position and offset information into three-dimensional space and thereby calculating the virtual three-dimensional anchor point coordinates. Finally, the system outputs the generated virtual three-dimensional anchor point coordinates and binds them to the estimated station number value, so that the marker positioning results remain continuous and accurate in three-dimensional space and in semantic temporal sequence under blind spot scenarios.
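The branch condition at the top of this flow might be sketched as below. The parameter names and the sample values are hypothetical; the application specifies only that no detection box is output for N consecutive frames and that the dead reckoning mileage value has reached the theoretical position of the next marker.

```python
def should_generate_virtual_anchor(miss_count: int,
                                   n_frames: int,
                                   dr_km: float,
                                   last_marker_km: float,
                                   interval_km: float,
                                   direction: int) -> bool:
    """Return True when the blind-spot branch should be taken.

    miss_count:     consecutive frames without a detection box
    n_frames:       the preset N
    dr_km:          current dead reckoning mileage value
    last_marker_km: historical mileage marker value
    interval_km:    preset marker interval (e.g. 1 km)
    direction:      direction coefficient, +1 or -1
    """
    if direction not in (1, -1):
        raise ValueError("direction coefficient must be +1 or -1")
    next_marker_km = last_marker_km + direction * interval_km
    # Signed comparison so the check works for both travel directions.
    reached = (dr_km - next_marker_km) * direction >= 0
    return miss_count >= n_frames and reached


# Hypothetical case: K101 was the last confirmed marker and K102 is occluded.
take_blind_spot_branch = should_generate_virtual_anchor(
    miss_count=10, n_frames=5, dr_km=102.003,
    last_marker_km=101.0, interval_km=1.0, direction=1)
```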

[0114] This embodiment provides a mileage marker positioning method based on multimodal fusion. First, it acquires the left and right views of the forward road image captured by the vehicle-mounted camera on the inspection vehicle, and simultaneously acquires the instantaneous speed, absolute geographic coordinates, and vehicle heading angle, providing a multi-source heterogeneous data foundation for subsequent multi-dimensional feature fusion positioning. Second, it calculates the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface from the lane line vanishing point at which the fitted left and right lane lines converge in the left view, which dynamically corrects target pose distortion caused by road undulations and thereby significantly improves the accuracy of spatial positioning. Subsequently, it integrates the instantaneous speed over time from the initial time or the previous verification time and combines the result with the determined historical mileage marker value to obtain the dead reckoning mileage value, establishing a theoretical position constraint benchmark, consistent with the laws of physical motion, for the visual processing results. Next, when the target mileage marker to be located is detected in the left view, the initial three-dimensional coordinates of the target mileage marker are subjected to inverse distortion processing based on the real-time pitch angle, and the identification result of the target mileage marker is verified against the dead reckoning mileage value, thereby effectively filtering out jump-type false detections caused by changes in lighting or similar-looking numerals. 
Finally, when the target mileage marker is not detected and the dead reckoning mileage value reaches the theoretical mileage position corresponding to the next mileage marker, virtual anchor point coordinates are generated based on the absolute geographic coordinates, the vehicle heading angle, and the preset roadside lateral offset. The dead reckoning mileage value is then mapped to the calculated mileage marker value to ensure the absolute continuity of asset mapping data in the spatiotemporal distribution. This embodiment can achieve high-precision positioning of road mileage markers under actual inspection conditions such as dynamic vehicle driving and false or missed target recognition.

[0115] Building on the first embodiment of this application, in the second embodiment, content that is the same as or similar to Embodiment 1 can be understood with reference to the description above and is not repeated here. On this basis, please refer to Figure 4, which is a flowchart of the second embodiment of the multimodal fusion-based mileage marker positioning method of this application. Step S40 of the method includes steps S41 to S45:

[0116] Step S41: When the target mileage marker to be located is detected in the forward road image in the left view, the initial three-dimensional coordinates of the target mileage marker are determined based on the left view and the right view.

[0117] Step S42: Perform inverse distortion processing on the initial three-dimensional coordinates based on the real-time pitch angle to obtain the corrected road network coordinates;

[0118] Step S43: Perform character recognition on the detected target mileage marker to obtain the recognition result;

[0119] Step S44: Based on the dead reckoning mileage value, verify the identification result through a spatiotemporal dual-gating mechanism to obtain the identified station number value;

[0120] Step S45: Based on the corrected road network coordinates and the identified station number values, the mileage station number positioning result is obtained.

[0121] It should be noted that the corrected road network coordinates refer to the coordinates obtained by applying an inverse distortion transformation, using a constructed rotation matrix around the X-axis, to the distorted original 3D coordinates calculated by binocular stereo vision. This eliminates the projection calculation error that arises when road undulations leave the camera's optical axis non-horizontal, restoring the true horizontal projection coordinates in physical space. The identification result refers to the raw mileage information parsed from the mileage marker region detected in the left view by a text recognition network (e.g., PaddleOCR) through feature extraction and semantic analysis, before any multimodal logical verification. The spatiotemporal dual-gating mechanism combines the scale evolution feature, defined by the ratio of the detection box area to the image area, with the kinematic feature, defined by the integrated displacement of the inspection vehicle. It cross-validates the visually recognized value and uses a joint arbitration algorithm to determine whether a logical conflict exists. The identified station number value refers to the physically and logically consistent mileage marker number label output by the system after the spatiotemporal dual-gating mechanism has judged authenticity and resolved semantic conflicts. When the system determines that the identification result is a jump-type false detection, this value is forcibly overwritten by the theoretical mileage value calculated by dead reckoning.

[0122] Understandably, firstly, when the stationing system detects a target mileage marker within the forward road image in the left view, it generates a detection box in the left view by calling the target detection network and extracts the pixel center coordinates of the detection box. Then, it calculates the disparity of the pixel center between the left and right views using the binocular SGBM (semi-global block matching) algorithm and, combining this with the camera's horizontal focal length, vertical focal length, optical center coordinates, and binocular baseline length B, calculates the initial three-dimensional coordinates of the target mileage marker. This process quickly yields the marker's spatial position within the camera coordinate system. Secondly, the stationing system uses the real-time pitch angle to construct a rotation matrix around the X-axis. Multiplying the initial 3D coordinates by this matrix performs the inverse distortion processing and yields the corrected road network coordinates. This eliminates the coordinate distortion caused by pitch angle changes during vehicle movement, improving spatial positioning accuracy. Next, the system performs character recognition on the target mileage marker: the image region within the detection box is input into the character recognition algorithm to obtain the corresponding identification result, which carries the semantic information of the marker. Then, the stationing system uses the dead reckoning mileage value in a spatiotemporal dual-gating mechanism to verify the identification result and determine the reliability of the identified station number value. If a logical conflict exists in the identification result, it is replaced with the reckoned value, keeping the station number value consistent with the vehicle's motion state and avoiding jump errors. 
Finally, the system combines the corrected road network coordinates with the verified recognized station value to generate the final mileage station positioning result, achieving precise positioning of the target in 3D space and ensuring the reliability of the corresponding station value.

[0123] As an example, the step of performing inverse distortion processing on the initial three-dimensional coordinates based on the real-time pitch angle to obtain the corrected road network coordinates includes: constructing an inverse distortion rotation matrix based on the real-time pitch angle; multiplying the initial three-dimensional coordinates by the inverse distortion rotation matrix to obtain the corrected road network coordinates; the inverse distortion rotation matrix is expressed as follows:

[0124] $$R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}$$

[0125] where $\theta$ is the real-time pitch angle.

[0126] Understandably, multiplying the distorted original coordinates $P_{raw}$ by this rotation matrix completes the inverse distortion projection, yielding the absolute horizontal road network coordinates (i.e., the corrected road network coordinates).

[0127]

[0128]

[0129]
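As a non-limiting illustration, the multiplication by a rotation matrix around the X-axis can be written out component by component. The camera frame convention (X to the right, Y downward, Z forward along the optical axis), the direction of rotation, and the function name are assumptions made for this sketch; the opposite sign convention would transpose the rotation.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def level_point(p_raw: Vec3, pitch_rad: float) -> Vec3:
    """Rotate a camera-frame point about the X-axis by the pitch angle.

    Equivalent to multiplying the column vector p_raw by the standard
    rotation matrix around X with rows
    [1, 0, 0], [0, cos, -sin], [0, sin, cos].
    """
    x, y, z = p_raw
    c, s = math.cos(pitch_rad), math.sin(pitch_rad)
    return (x, c * y - s * z, s * y + c * z)


# Sanity check of the assumed convention: a point 10 m away along the
# direction in which the horizon is seen maps onto the level forward axis.
theta = 0.1
horizon_point = (0.0, 10.0 * math.sin(theta), 10.0 * math.cos(theta))
levelled = level_point(horizon_point, theta)
```

The lateral component is unchanged by the rotation, while the vertical and depth components are mixed according to the pitch angle.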

[0130] As an example, when detecting the target mileage marker to be located in the forward road image in the left view, the step of determining the initial three-dimensional coordinates of the target mileage marker based on the left view and the right view includes: detecting the target mileage marker to be located in the left view using a preset target detection network to obtain a detection result; when the detection result indicates the presence of a marker, selecting the detection box corresponding to the target mileage marker; determining the pixel center coordinates of the detection box, and calculating the pixel displacement difference between the pixel center coordinates in the left view and the right view using a semi-global block matching algorithm to obtain a disparity value; acquiring the horizontal focal length, vertical focal length, optical center coordinates, and binocular baseline length of the vehicle-mounted camera; calculating the depth coordinates of the target mileage marker relative to the vehicle-mounted camera based on the horizontal focal length, the binocular baseline length, and the disparity value; performing spatial mapping calculation on the pixel center coordinates and the depth coordinates based on the vertical focal length and the optical center coordinates to obtain the corresponding lateral and vertical coordinates; and obtaining the initial three-dimensional coordinates of the target mileage marker based on the depth coordinates, the lateral coordinates, and the vertical coordinates.

[0131] It should be noted that the preset target detection network refers to a neural network model (e.g., YOLOv8) pre-trained to identify mileage markers in road images. It can quickly locate the marker in the input image and enclose it in a detection box. The detection result refers to the information output by the target detection network for the input image, including whether the target is present and the position and size of the corresponding detection box, which is used for subsequent 3D coordinate calculations. The disparity value refers to the difference in horizontal position of the same point between the left and right views, used to calculate the depth of the object relative to the camera according to the principle of binocular stereo imaging. The horizontal focal length refers to the distance from the optical center of the vehicle-mounted camera lens to the imaging plane expressed in horizontal pixel units, determining the perspective projection ratio of the image in the horizontal direction.

[0132] Vertical focal length refers to the distance from the optical center of the camera lens to the imaging plane in the vertical direction, determining the perspective projection ratio of the image in the vertical direction. Optical center coordinates refer to the pixel position in the image coordinate system where the principal optical axis intersects the sensor on the camera's imaging plane; this is used to map pixel coordinates to the camera coordinate system. Binocular baseline length refers to the horizontal distance between the optical centers of the left and right camera lenses, a fundamental parameter used for depth calculation. Depth coordinates refer to the spatial distance of the target mileage marker relative to the camera along the optical axis, indicating the distance between the marker and the camera; the calculation formula is as follows:

[0133] $$Z = \frac{f_x \cdot B}{d}$$

[0134] where $f_x$ is the horizontal focal length of the camera, $f_y$ is the vertical focal length, $(c_x, c_y)$ are the optical center coordinates, $B$ is the binocular baseline length, and $d$ is the disparity value of the pixel center between the left and right views.

[0135] The lateral coordinate refers to the horizontal position of the target mileage marker in the camera coordinate system, perpendicular to the direction of travel. It is used to determine the spatial position in the left-right direction, and the calculation formula is as follows:

[0136] $$X = \frac{(u - c_x) \cdot Z}{f_x}$$

[0137] where $u$ is the abscissa of the pixel center of the detection box in the image coordinate system, and $v$ is the ordinate of the pixel center of the detection box in the image coordinate system.

[0138] The vertical coordinate refers to the height of the target mileage marker in the camera coordinate system, perpendicular to the horizontal plane. It is used to determine the spatial position of the marker in the vertical direction. The calculation formula is as follows:

[0139] $$Y = \frac{(v - c_y) \cdot Z}{f_y}$$

[0140] Understandably, firstly, the stationing system calls the preset target detection network to detect the target mileage marker in the left view. The network output indicates whether the target exists and provides a detection box (e.g., a box covering the entire image area of the marker). This quickly determines the marker's position in the image and provides a reference for subsequent spatial calculations. Secondly, when the detection result shows that the target mileage marker exists, the system determines the pixel center coordinates of the detection box and uses the semi-global block matching algorithm to find the corresponding position of the same pixel in the left and right views, thereby calculating the pixel displacement difference and obtaining the disparity value. The disparity value reflects the distance between the target and the camera and is the basis for calculating the depth coordinate. Subsequently, the system acquires the horizontal focal length, vertical focal length, optical center coordinates, and binocular baseline length of the vehicle-mounted camera and, together with the disparity value, calculates the depth coordinate using the imaging formula, which determines the spatial position of the marker along the optical axis. Then, combining the vertical focal length and the optical center coordinates, the system spatially maps the pixel center coordinates and the depth coordinate to obtain the lateral and vertical coordinates. This maps the two-dimensional position in the image into the three-dimensional camera coordinate system, fixing the marker's spatial position in the left-right and up-down directions. 
Finally, the system combines the depth, lateral, and vertical coordinates to generate the initial three-dimensional coordinates of the target station number, providing foundational data for subsequent inverse distortion processing and precise positioning, and ensuring accurate and reliable spatial location.
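As a non-limiting illustration, the coordinate calculation described above may be sketched as follows. The sketch assumes the standard pinhole stereo relations (depth equals horizontal focal length times baseline divided by disparity) and assumes that the lateral coordinate is scaled by the horizontal focal length; the names `StereoIntrinsics` and `backproject` are hypothetical, and the disparity is taken as already computed by the matching step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StereoIntrinsics:
    """Camera parameters named in the text (field names are illustrative)."""
    fx: float        # horizontal focal length, in pixels
    fy: float        # vertical focal length, in pixels
    cx: float        # optical center, x coordinate in pixels
    cy: float        # optical center, y coordinate in pixels
    baseline: float  # binocular baseline length, in metres


def backproject(u: float, v: float, disparity: float, cam: StereoIntrinsics):
    """Return (lateral, vertical, depth) in the camera frame for pixel (u, v).

    Assumes an ideal rectified stereo pair; a non-positive disparity carries
    no depth information and is rejected.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive to recover depth")
    depth = cam.fx * cam.baseline / disparity    # along the optical axis
    lateral = (u - cam.cx) * depth / cam.fx      # left-right direction
    vertical = (v - cam.cy) * depth / cam.fy     # up-down direction (image y points down)
    return lateral, vertical, depth
```

Under these assumptions a detection box centered on the optical center maps to zero lateral and vertical offset, and halving the disparity doubles the recovered depth.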

[0141] As an example, the step of verifying the identification result based on the dead reckoning mileage value using a spatiotemporal dual-gating mechanism to obtain the identification station number value includes: obtaining scale evolution features based on the proportion of the detection box area corresponding to the target mileage station number to the full image resolution; determining, based on the scale evolution features, whether the detection box meets a preset replacement condition between adjacent frames, to obtain a first gating decision result; calculating the numerical difference between the identification result and the dead reckoning mileage value, and determining whether the numerical difference exceeds a preset tolerance threshold, to obtain a second gating decision result; when both the first gating decision result and the second gating decision result are yes, mapping the dead reckoning mileage value to the identification station number value; and when either the first gating decision result or the second gating decision result is no, using the identification result as the identification station number value.

[0142] It should be noted that scale evolution features refer to the changes in the size of the target in the image as the vehicle moves or travels a distance, obtained by calculating the proportion of the detection box area corresponding to the target mileage marker to the total image resolution. This reflects the changes in the viewpoint and distance of the mileage marker between different frames. The formula for calculating scale evolution features is as follows:

[0143] $\text{scale evolution feature} = \dfrac{w \times h}{W \times H}$

[0144] Where $w$ and $h$ represent the width and height of the YOLOv8 detection box, respectively; $W$ and $H$ represent the width and height of the image, respectively; and $W \times H$ represents the full image resolution.

[0145] The preset replacement conditions refer to the rules set in advance by the system for determining whether the change in detection box area between adjacent image frames meets the standard for target replacement. For example, replacement is triggered when a new station number appears in the distance in the image and the area of its detection box is smaller than the threshold associated with the nearby station number. The first gating decision result refers to the logical output that determines, based on the scale evolution features and the preset replacement conditions, whether the detection box corresponds to the next station number in the sequence; it is used to determine whether the target has been replaced. The preset tolerance threshold refers to the range of numerical difference allowed between the recognition result and the dead reckoning mileage value, set in advance by the system (e.g., 1 kilometer); it is used to determine whether the recognized station number matches the actual mileage traveled by the vehicle. The second gating decision result refers to the judgment obtained by calculating the numerical difference between the recognition result and the dead reckoning mileage value and comparing it with the preset tolerance threshold; it is used to confirm the reliability of the recognized station number value in time and space.

[0146] Understandably, the mileage marker localization system first calculates the scale evolution features based on the proportion of the detection box area corresponding to the target mileage marker to the total image resolution. Specifically, it obtains the pixel area of the detection box in the current frame and divides it by the total number of pixels in the image, thereby quantifying the relative size of the mileage marker in the image. This reflects the changes in viewpoint and size of the target as the vehicle moves or the distance changes, providing a basis for subsequently judging whether the mileage marker has changed. Second, the system uses the scale evolution features to determine whether the detection box meets the preset replacement conditions between adjacent frames. By checking whether the change in the detection box area proportion between the previous frame and the current frame exceeds a set threshold, it determines whether the current detection box corresponds to the next mileage marker and generates the first gating decision result. This keeps the identified target consistent with the actual mileage marker sequence and avoids erroneous switching or duplication. Third, the system calculates the difference between the identification result and the dead reckoning mileage value and compares it with the preset tolerance threshold to obtain the second gating decision result. This determines whether the mileage value obtained from visual identification matches the vehicle's theoretical mileage, thereby filtering out jump errors caused by lighting, occlusion, or misidentification. Finally, the system determines the identification station number value from the two gating decision results: when both are yes, the dead reckoning mileage value is mapped to the identification station number value to correct visual misdetection (e.g., to avoid K100 being misidentified as K180); when either is no, the identification result is used directly as the identification station number value to maintain continuity and data reliability, thereby keeping the identified value consistent with the actual motion state of the vehicle.
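As a non-limiting illustration, the two gating decisions and the resulting station number value may be sketched as follows. The function names, the default thresholds, and the direction of the two area comparisons are assumptions made for this sketch (the text does not give numerical area thresholds), and rounding is used here as one possible way of mapping the dead reckoning mileage value to a station number.

```python
def scale_feature(box_w: float, box_h: float, img_w: int, img_h: int) -> float:
    """Detection box area as a proportion of the full image resolution."""
    return (box_w * box_h) / (img_w * img_h)


def gate1_target_replaced(prev_scale: float, curr_scale: float,
                          old_thresh: float, new_thresh: float) -> bool:
    """First gate: a large (near) box in the previous frame is followed by a
    small (far) box in the current frame. Comparison directions are assumed."""
    return prev_scale >= old_thresh and curr_scale <= new_thresh


def gate2_conflict(ocr_value: float, dr_value: float, tolerance: float) -> bool:
    """Second gate: the recognized value differs from dead reckoning by more
    than the preset tolerance threshold."""
    return abs(ocr_value - dr_value) > tolerance


def verify_station(ocr_value: int, dr_value: float,
                   prev_scale: float, curr_scale: float, *,
                   old_thresh: float = 0.05,    # hypothetical value
                   new_thresh: float = 0.005,   # hypothetical value
                   tolerance: float = 1.0) -> int:
    """Return the identification station number value.

    Both gates yes -> the dead reckoning value is mapped to the station number.
    Either gate no -> the identification (OCR) result is used as is.
    """
    replaced = gate1_target_replaced(prev_scale, curr_scale, old_thresh, new_thresh)
    conflict = gate2_conflict(ocr_value, dr_value, tolerance)
    if replaced and conflict:
        return round(dr_value)
    return ocr_value
```

With these illustrative thresholds, a recognized value of 180 against a dead reckoning value of 101 is replaced by 101 only when the scale features also indicate that the target has changed.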

[0147] Spatiotemporal dual-gating mechanism:

[0148] Gate 1 (scale state machine): The system traces back through historical frames. If the scale of the detection box at time t-1 satisfies ... (the old station marker is right beside the vehicle window and about to disappear), while at time t a new detection box suddenly appears in the distance of the frame and satisfies ... (indicating that a new station number has just appeared at the far end), the system triggers an "adjacent physical target replacement event" and determines that the visual focus is now locked on the "next marker" in the sequence.

[0149] Here, the first condition uses the area threshold preset by the system for determining whether a detection box corresponds to the old station number, and the second condition uses the area threshold preset by the system for determining whether a detection box corresponds to the new station number.

[0150] Gate 2 (kinematic conflict detection): The system checks the difference between the OCR recognition result and the dead reckoning mileage value. At this point, the OCR recognition result is 180, while the dead reckoning mileage value is 101. The difference of 79 exceeds the tolerance threshold (e.g., ±1), triggering a "serious semantic conflict".

[0151] Multimodal arbitration and forced overwrite:

[0152] The system initiates a mandatory arbitration procedure only when both "Gate 1 triggers target replacement" and "Gate 2 triggers serious conflict" hold. The decision logic is as follows: under the laws of physical space, a vehicle cannot travel 79 kilometers within an extremely short time integration period. Therefore, the "180" extracted by OCR is judged to be an "unreliable jump-type false detection" (such as tree shadow interference or misrecognition caused by similar-looking digits). Action taken: the system forcibly discards the OCR result, uses the theoretical value calculated from dead reckoning (i.e., 101) as the final current station number output, and labels the record with Type:Estimated in the database, preventing dirty data from entering the final road network ledger.

[0153] Please refer to Figure 5, which is a logic diagram of the spatiotemporal dual-gating mechanism provided in Embodiment 2 of the multimodal fusion-based mileage marker positioning method of this application. The system first extracts the semantic information and scale features of the target mileage marker and the vehicle kinematics estimation features to establish a multi-dimensional verification basis. It then enters Gate 1 (scale feature judgment) to determine whether the currently detected target is a replacement by an adjacent target. If the result is negative, the system follows the left-hand branch, maintains the current output state, and continues to track the same target, ensuring output continuity. If the result is positive, it enters Gate 2 (kinematics judgment) to determine whether the recognition result conflicts with the vehicle's motion state. If the result is negative, no abnormal conflict has occurred: the system accepts the detection box, uses the OCR recognition value, and updates the dead reckoning baseline, aligning visual recognition with the kinematic estimate. If the result is positive, an abrupt jump in the station number has occurred: the system initiates multimodal arbitration, determines that the OCR recognition result is a jump-type false detection, forcibly discards the OCR recognition value, and overwrites the output with the dead reckoning value. This keeps the recognized marker value consistent with the vehicle's motion trajectory and prevents abnormal jump data from entering the final positioning result.
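As a non-limiting illustration, the three branches of the Figure 5 flow may be sketched as follows. The names `Action`, `Decision`, and `arbitrate` are hypothetical; the label "Estimated" follows the Type:Estimated label described above, while "Observed" is an assumed label for records that are not overwritten.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP_CURRENT = "keep_current"          # Gate 1 negative: same target, keep output
    ACCEPT_OCR = "accept_ocr"              # Gate 1 positive, Gate 2 negative
    OVERRIDE_WITH_DR = "override_with_dr"  # both gates positive: arbitration


@dataclass(frozen=True)
class Decision:
    action: Action
    station: int
    record_type: str             # "Estimated" when dead reckoning overwrites OCR
    update_dr_baseline: bool     # True when the OCR value re-anchors dead reckoning


def arbitrate(target_replaced: bool, ocr_value: int, dr_value: int,
              current_station: int, tolerance: int = 1) -> Decision:
    """Walk the two gates in order and return the output decision."""
    if not target_replaced:
        # Gate 1 negative: continue tracking the same marker.
        return Decision(Action.KEEP_CURRENT, current_station, "Observed", False)
    if abs(ocr_value - dr_value) <= tolerance:
        # Gate 2 negative: OCR agrees with the vehicle's motion state.
        return Decision(Action.ACCEPT_OCR, ocr_value, "Observed", True)
    # Gate 2 positive: jump-type false detection, discard the OCR value.
    return Decision(Action.OVERRIDE_WITH_DR, dr_value, "Estimated", False)
```

Applied to the example above, a replaced target with an OCR value of 180 and a dead reckoning value of 101 yields the third branch, so 101 is output and the record is labeled as estimated.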

[0154] This embodiment first detects the target mileage marker in the image of the road ahead in the left view, and then calculates the initial three-dimensional coordinates of the marker by combining the left and right views, converting pixel information in the image into a spatial position. This provides the base spatial data for subsequent attitude correction and precise positioning. Second, it performs inverse distortion processing on the initial three-dimensional coordinates using the real-time pitch angle to obtain the corrected road network coordinates; this mathematical mapping eliminates the spatial distortion caused by pitch changes during vehicle movement, thereby improving spatial position accuracy. Third, it performs character recognition on the detected target mileage marker to obtain the identification result, extracting the semantic information of the station number so that the corresponding mileage value is obtained together with the spatial location. Subsequently, based on the dead reckoning mileage value, the identification result is verified through the spatiotemporal dual-gating mechanism to judge the reliability of the identified station number and correct visual false detections when necessary. Finally, the corrected road network coordinates are combined with the verified identification station number value to generate the final mileage marker positioning result, achieving precise positioning of the target in three-dimensional space while ensuring the accuracy and usability of the corresponding mileage value, thus providing high-precision data for road inspection and asset mapping.
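As a non-limiting illustration, the pitch estimation and inverse distortion step may be sketched as follows. The sketch assumes a camera frame with x to the right, y downward, and z forward, a pitch angle that is positive when the camera looks below the horizon, and a rotation about the camera x-axis; the sign convention and the function names are assumptions of this sketch and may differ from the rotation matrix of the application.

```python
import math


def pitch_from_vanishing_point(v_vp: float, cy: float, fy: float) -> float:
    """Pitch angle in radians from the image row of the lane line vanishing point.

    v_vp: row (pixels) where the left and right lane lines converge
    cy:   optical center row (pixels)
    fy:   vertical focal length (pixels)
    A vanishing point above the optical center gives a positive (downward) pitch.
    """
    return math.atan2(cy - v_vp, fy)


def inverse_distortion(point, pitch: float):
    """Rotate a camera-frame point (x, y, z) about the x-axis to undo the pitch.

    Equivalent to multiplying by
        [[1,  0,        0       ],
         [0,  cos(p),   sin(p)  ],
         [0, -sin(p),   cos(p)  ]]
    under the sign convention stated in the lead-in.
    """
    x, y, z = point
    c, s = math.cos(pitch), math.sin(pitch)
    return (x, c * y + s * z, -s * y + c * z)
```

Under this convention, a point lying on the horizon 20 m ahead of a downward-pitched camera is restored to zero height and 20 m of forward distance, and a zero pitch angle leaves the coordinates unchanged.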

[0155] For example, to help understand the implementation process of the multimodal fusion-based mileage marker positioning method obtained by combining this embodiment with Embodiment 1 above, please refer to Figure 6, which provides a simplified flowchart of the multimodal fusion-based mileage marker positioning method. Specifically:

[0156] On the left side of the diagram, the data input layer acquires the ZED binocular images together with the vehicle speed and trajectory information. The system first identifies lane lines in the left view using a lane vanishing point detection module, then calculates the vehicle's dynamic pitch angle and constructs a rotation matrix to correct the original ZED coordinates. After coordinate correction, the corrected road network coordinates are obtained, ensuring the accuracy of the spatial positioning baseline data. In the dual-core processing layer, the system uses the YOLO detection boxes to extract the aspect ratio and scale features, and inputs the detection box region into the OCR recognition module to obtain the recognition value. Simultaneously, it uses dead reckoning to predict the vehicle's theoretical mileage. The system performs the scale judgment and the kinematic judgment in sequence to determine whether the current target has been replaced and whether a jump anomaly exists. When both judgments pass, the system updates the output using the OCR recognition value; if an anomaly is found, the dual-gated arbitration mechanism determines whether to force the dead reckoning value to overwrite the recognition result, thereby ensuring the continuity and reliability of the station numbers. Finally, in the blind spot compensation and result output layer, when vision fails completely or the binocular camera repeatedly fails to detect the station number, the system starts the virtual anchor point generation mechanism, uses the GPS extrapolated coordinates and the offset to calculate a virtual three-dimensional anchor point, and outputs the corrected coordinates and the final station number. This fuses visual recognition, kinematic calculation, and engineering prior information to ensure high-precision mileage marker positioning under various complex working conditions.
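As a non-limiting illustration, the virtual anchor point generation in the blind spot compensation layer may be sketched as follows. The sketch assumes that the heading angle is measured in degrees clockwise from north, that a positive lateral offset points to the right of the driving direction, and that a local flat-earth approximation is adequate over a few metres; only the horizontal position is computed, and the function name and the Earth radius constant are choices of this sketch.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used for the local approximation


def virtual_anchor(lat_deg: float, lon_deg: float,
                   heading_deg: float, lateral_offset_m: float):
    """Return (lat, lon) of a point offset sideways from the vehicle position.

    The offset direction is perpendicular to the heading: heading + 90 degrees
    for a positive (right-hand) offset, and the opposite side for a negative one.
    """
    bearing = math.radians(heading_deg + 90.0)
    d_north = lateral_offset_m * math.cos(bearing)   # metres toward north
    d_east = lateral_offset_m * math.sin(bearing)    # metres toward east
    d_lat = math.degrees(d_north / EARTH_RADIUS_M)
    d_lon = math.degrees(d_east / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return lat_deg + d_lat, lon_deg + d_lon
```

For a vehicle heading due north, a positive offset under these assumptions moves the anchor eastward while leaving the latitude essentially unchanged.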

[0157] It should be noted that the above examples are only for understanding this application and do not constitute a limitation on the multimodal fusion-based mileage marker positioning method of this application. Any simple modifications based on this technical concept are within the protection scope of this application.

[0158] This application also provides a mileage marker positioning device based on multimodal fusion. Referring to Figure 7, the multimodal fusion-based mileage marker positioning device includes:

[0159] The data acquisition module 10 is used to acquire the left and right views corresponding to the images of the road ahead captured by the on-board camera on the inspection vehicle, and to acquire the instantaneous speed, absolute geographic coordinates and heading angle of the inspection vehicle.

[0160] The attitude calculation module 20 is used to calculate the real-time pitch angle of the vehicle camera relative to the horizontal road surface based on the vanishing point of the lane lines where the left and right lane lines converge in the left view.

[0161] The physical calculation module 30 is used to perform time integration on the instantaneous velocity based on the initial time or the previous verification time, and combine it with the determined historical mileage station value to obtain the dead reckoning mileage value.

[0162] The detection and positioning module 40 is used to perform inverse distortion processing on the initial three-dimensional coordinates of the target mileage station number according to the real-time pitch angle when the target mileage station number is detected in the left view, and to verify the identification result of the target mileage station number to obtain the mileage station number positioning result.

[0163] The virtual anchor point generation module 50 is used to generate virtual anchor point coordinates based on the absolute geographic coordinates, the vehicle heading angle, and a preset roadside lateral offset when the target mileage marker is not detected in the left view and the dead reckoning mileage value reaches the theoretical mileage position corresponding to the next mileage marker. The theoretical mileage position is determined based on the historical mileage marker value and the preset mileage marker interval value.

[0164] The blind zone output module 60 is used to map the dead reckoning mileage value to the reckoning station value, and obtain the mileage station positioning result based on the virtual anchor point coordinates and the reckoning station value.
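As a non-limiting illustration, the dead reckoning performed by the physical calculation module 30 and the trigger condition used by the virtual anchor point generation module 50 may be sketched as follows. The sketch assumes speed samples in metres per second taken at a fixed interval since the previous verification time, station values expressed in kilometres, trapezoidal integration, and a 1 km marker interval by default; the function names are hypothetical.

```python
from typing import Sequence


def dead_reckoning_km(last_station_km: float,
                      speeds_mps: Sequence[float],
                      dt_s: float) -> float:
    """Integrate the instantaneous speed over time and add the result to the
    historical station value established at the previous verification time."""
    distance_m = 0.0
    for v0, v1 in zip(speeds_mps, speeds_mps[1:]):
        distance_m += 0.5 * (v0 + v1) * dt_s   # trapezoidal rule per interval
    return last_station_km + distance_m / 1000.0


def reached_next_marker(dr_km: float, last_station_km: float,
                        interval_km: float = 1.0) -> bool:
    """True once the dead reckoning value reaches the theoretical mileage
    position of the next marker (historical value plus the preset interval)."""
    return dr_km >= last_station_km + interval_km
```

Under these assumptions, 50 seconds at a constant 20 m/s after passing station 100 gives a dead reckoning value of 101, at which point a virtual anchor point would be generated if the marker has not been detected.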

[0165] The multimodal fusion-based mileage marker positioning device provided in this application, employing the multimodal fusion-based mileage marker positioning method described in the above embodiments, can solve the technical problem of achieving high-precision positioning of road mileage markers under actual inspection conditions involving dynamic vehicle movement and false or missed target recognition. Compared with the prior art, the beneficial effects of the multimodal fusion-based mileage marker positioning device provided in this application are the same as those of the multimodal fusion-based mileage marker positioning method provided in the above embodiments, and other technical features in the multimodal fusion-based mileage marker positioning device are the same as those disclosed in the methods of the above embodiments, and will not be repeated here.

[0166] This application provides a multimodal fusion-based mileage marker positioning device, which includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, which are executed by the at least one processor to enable the at least one processor to execute the multimodal fusion-based mileage marker positioning method in Embodiment 1 above.

[0167] Reference is now made to Figure 8, which illustrates a structural schematic of a multimodal fusion-based mileage marker positioning device suitable for implementing embodiments of this application. The multimodal fusion-based mileage marker positioning device in the embodiments of this application may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital radio receivers, PDAs (Personal Digital Assistants), PADs (tablet computers), PMPs (Portable Media Players), and vehicle terminals (e.g., vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The multimodal fusion-based mileage marker positioning device shown in Figure 8 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of this application.

[0168] As shown in Figure 8, the multimodal fusion-based mileage marker positioning device may include a processing unit 1001 (e.g., a central processing unit, a graphics processing unit, etc.), which can perform various appropriate actions and processes according to a program stored in ROM (Read Only Memory) 1002 or a program loaded from storage device 1003 into RAM (Random Access Memory) 1004. RAM 1004 also stores various programs and data required for the operation of the multimodal fusion-based mileage marker positioning device. The processing unit 1001, ROM 1002, and RAM 1004 are interconnected via bus 1005. An input/output (I/O) interface 1006 is also connected to the bus. Typically, the following devices can be connected to I/O interface 1006: input devices 1007 including, for example, touchscreens, touchpads, keyboards, mice, image sensors, microphones, accelerometers, gyroscopes, etc.; output devices 1008 including, for example, LCDs (Liquid Crystal Displays), speakers, vibrators, etc.; storage devices 1003 including, for example, magnetic tapes, hard disks, etc.; and communication devices 1009. Communication device 1009 allows the multimodal fusion-based mileage marker positioning device to communicate with other devices, wirelessly or by wire, to exchange data. Although the figure shows a multimodal fusion-based mileage marker positioning device with various components, it should be understood that it is not required to implement or possess all the components shown. More or fewer components may alternatively be implemented.

[0169] Specifically, according to the embodiments disclosed in this application, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments disclosed in this application include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via communication device 1009, installed from storage device 1003, or installed from ROM 1002. When the computer program is executed by processing unit 1001, it performs the functions defined in the methods of the embodiments disclosed in this application.

[0170] The multimodal fusion-based mileage marker positioning device provided in this application, employing the multimodal fusion-based mileage marker positioning method described in the above embodiments, can solve the technical problem of achieving high-precision positioning of road mileage markers under actual inspection conditions involving dynamic vehicle movement and false or missed target recognition. Compared with the prior art, the beneficial effects of the multimodal fusion-based mileage marker positioning device provided in this application are the same as those of the multimodal fusion-based mileage marker positioning method provided in the above embodiments, and other technical features of this multimodal fusion-based mileage marker positioning device are the same as those disclosed in the previous embodiment method, and will not be repeated here.

[0171] It should be understood that the various parts disclosed in this application can be implemented using hardware, software, firmware, or a combination thereof. In the description of the above embodiments, specific features, structures, materials, or characteristics can be combined in any suitable manner in one or more embodiments or examples.

[0172] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

[0173] This application provides a computer-readable storage medium having computer-readable program instructions (i.e., a computer program) stored thereon, which are used to execute the multimodal fusion-based mileage marker positioning method in the above embodiments.

[0174] The computer-readable storage medium provided in this application may be, for example, a USB flash drive, but is not limited thereto; it may also be an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: electrical connections having one or more wires, portable computer disks, hard disks, RAM (Random Access Memory), ROM (Read Only Memory), EPROM (Erasable Programmable Read Only Memory) or flash memory, optical fibers, CD-ROM (Compact Disc Read-Only Memory), optical storage devices, magnetic storage devices, or any suitable combination thereof. In this embodiment, the computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. The program code contained on the computer-readable storage medium may be transmitted using any suitable medium, including but not limited to: wires, optical cables, RF (Radio Frequency), etc., or any suitable combination thereof.

[0175] The aforementioned computer-readable storage medium may be included in the multimodal fusion-based mileage marker positioning device, or it may exist independently without being assembled into the device.

[0176] The aforementioned computer-readable storage medium carries one or more programs that, when executed by a multimodal fusion-based mileage marker positioning device, cause the device to: acquire the left and right views corresponding to the image of the road ahead captured by the vehicle-mounted camera on the inspection vehicle, and acquire the instantaneous speed, absolute geographic coordinates, and heading angle of the inspection vehicle; calculate the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface based on the lane line vanishing point at which the left and right lane lines converge in the left view; perform time integration on the instantaneous speed from the initial time or the previous verification time, and combine the result with the determined historical mileage marker value to obtain the dead reckoning mileage value; when the target mileage marker to be located is detected in the left view, perform inverse distortion processing on the initial three-dimensional coordinates of the target mileage marker based on the real-time pitch angle, and verify the identification result of the target mileage marker to obtain the mileage marker positioning result; when the target mileage marker is not detected in the left view and the dead reckoning mileage value reaches the theoretical mileage position corresponding to the next mileage marker, generate virtual anchor point coordinates based on the absolute geographic coordinates, the vehicle heading angle, and the preset roadside lateral offset, where the theoretical mileage position is determined based on the historical mileage marker value and the preset mileage marker interval value; and map the dead reckoning mileage value to the reckoning station number value, and obtain the mileage marker positioning result based on the virtual anchor point coordinates and the reckoning station number value.

[0177] Computer program code for performing the operations of this application can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network, including a LAN (Local Area Network) or WAN (Wide Area Network), or can be connected to an external computer (e.g., via the Internet using an Internet service provider).

[0178] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.

[0179] The modules described in the embodiments of this application can be implemented in software or hardware. The names of the modules do not necessarily limit the modules themselves.

[0180] The readable storage medium provided in this application is a computer-readable storage medium that stores computer-readable program instructions (i.e., a computer program) for executing the above-described multimodal fusion-based mileage marker positioning method. This solves the technical problem of achieving high-precision positioning of road mileage markers under actual inspection conditions involving dynamic vehicle movement and false or missed target detection. Compared with the prior art, the beneficial effects of the computer-readable storage medium provided in this application are the same as those of the multimodal fusion-based mileage marker positioning method provided in the above embodiments, and will not be repeated here.

[0181] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the multimodal fusion-based mileage marker positioning method described above.

[0182] The computer program product provided in this application can solve the technical problem of achieving high-precision positioning of road mileage markers under actual inspection conditions such as dynamic vehicle driving and false or missed target recognition. Compared with the prior art, the beneficial effects of the computer program product provided in this application are the same as those of the mileage marker positioning method based on multimodal fusion provided in the above embodiments, and will not be repeated here.

[0183] The above description is only a part of the embodiments of this application and does not limit the patent scope of this application. All equivalent structural transformations made under the technical concept of this application and using the contents of the specification and drawings of this application, or direct / indirect applications in other related technical fields, are included in the patent protection scope of this application.

Claims

1. A mileage marker positioning method based on multimodal fusion, characterized in that the method comprises:
obtaining the left view and the right view corresponding to the image of the road ahead captured by the vehicle-mounted camera on the inspection vehicle, and obtaining the instantaneous speed, absolute geographic coordinates, and vehicle heading angle of the inspection vehicle;
calculating the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface based on the lane line vanishing point at which the left and right lane lines converge in the left view;
integrating the instantaneous speed over time from the initial moment or the previous verification moment, and combining the result with the determined historical mileage marker value to obtain the dead reckoning mileage value;
when the target mileage marker to be located is detected in the left view, performing inverse distortion processing on the initial three-dimensional coordinates of the target mileage marker based on the real-time pitch angle, and verifying the identification result of the target mileage marker to obtain the mileage marker positioning result;
when the target mileage marker is not detected in the left view and the dead reckoning mileage value reaches the theoretical mileage position corresponding to the next mileage marker, generating virtual anchor point coordinates based on the absolute geographic coordinates, the vehicle heading angle, and the preset roadside lateral offset, wherein the theoretical mileage position is determined based on the historical mileage marker value and the preset mileage marker interval value;
mapping the dead reckoning mileage value to the reckoning station number value, and obtaining the mileage marker positioning result based on the virtual anchor point coordinates and the reckoning station number value;
wherein the step of, when the target mileage marker to be located is detected in the left view, performing inverse distortion processing on the initial three-dimensional coordinates of the target mileage marker based on the real-time pitch angle and verifying the identification result of the target mileage marker to obtain the mileage marker positioning result comprises:
when the target mileage marker to be located is detected in the image of the road ahead in the left view, determining the initial three-dimensional coordinates of the target mileage marker based on the left view and the right view;
performing inverse distortion processing on the initial three-dimensional coordinates based on the real-time pitch angle to obtain the corrected road network coordinates;
performing character recognition on the detected target mileage marker to obtain the identification result;
verifying the identification result through a spatiotemporal dual gating mechanism based on the dead reckoning mileage value to obtain the identification station number value;
obtaining the mileage marker positioning result based on the corrected road network coordinates and the identification station number value;
wherein the step of performing inverse distortion processing on the initial three-dimensional coordinates based on the real-time pitch angle to obtain the corrected road network coordinates comprises:
constructing an inverse distortion rotation matrix based on the real-time pitch angle;
multiplying the initial three-dimensional coordinates by the inverse distortion rotation matrix to obtain the corrected road network coordinates;
wherein the inverse distortion rotation matrix is expressed as follows: [matrix formula missing from the source text], where the symbol in the formula denotes the real-time pitch angle.
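The rotation matrix itself is not reproduced in this text, so the following is only a minimal sketch of what a pitch-based correction could look like. It assumes a camera frame with x to the right, y downward, and z forward, and a rotation about the x-axis by the pitch angle; the axis convention, the sign of the angle, and the names `pitch_rotation` and `correct_point` are all assumptions and may differ from the patented matrix.

```python
import math

def pitch_rotation(theta):
    """3x3 rotation about the camera x-axis by theta radians.

    Assumed convention: x right, y down, z forward. The patent's own
    matrix is not available here, so the sign may need to be flipped.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [
        [1.0, 0.0, 0.0],
        [0.0, c, -s],
        [0.0, s, c],
    ]

def correct_point(point, theta):
    """Multiply the rotation matrix by an (x, y, z) point."""
    r = pitch_rotation(theta)
    return tuple(sum(r[i][j] * point[j] for j in range(3)) for i in range(3))
```

Under this convention the lateral coordinate is unchanged and only the vertical and depth components are mixed, which matches the idea of compensating a pitch-only disturbance.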

2. The method as described in claim 1, characterized in that the step of, when the target mileage marker to be located is detected in the image of the road ahead in the left view, determining the initial three-dimensional coordinates of the target mileage marker based on the left view and the right view comprises:
detecting the target mileage marker to be located in the left view by a preset target detection network to obtain the detection result;
when the detection result indicates that a mileage marker exists, selecting the detection box corresponding to the target mileage marker;
determining the pixel center coordinates of the detection box, and calculating the pixel displacement of the pixel center coordinates between the left view and the right view using a semi-global block matching algorithm to obtain the disparity value;
obtaining the horizontal focal length, vertical focal length, optical center coordinates, and binocular baseline length of the vehicle-mounted camera;
calculating the depth coordinate of the target mileage marker relative to the vehicle-mounted camera based on the horizontal focal length, the binocular baseline length, and the disparity value;
performing spatial mapping calculations on the pixel center coordinates and the depth coordinate based on the vertical focal length and the optical center coordinates to obtain the corresponding lateral coordinate and vertical coordinate;
obtaining the initial three-dimensional coordinates of the target mileage marker based on the depth coordinate, the lateral coordinate, and the vertical coordinate.
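The back-projection in claim 2 can be illustrated with the standard pinhole stereo relations. This is a sketch under assumptions, not the patented implementation: the disparity is taken as an input (the semi-global block matching step is not shown), the lateral coordinate uses the horizontal focal length as in the usual pinhole model, and the function name `stereo_to_3d` and its parameter names are hypothetical.

```python
def stereo_to_3d(u, v, disparity, fx, fy, cx, cy, baseline):
    """Back-project a pixel (u, v) with known disparity to camera coordinates.

    fx, fy: horizontal and vertical focal lengths in pixels.
    cx, cy: optical center in pixels.
    baseline: distance between the two cameras, in metres.
    Returns (x, y, z): lateral, vertical, and depth coordinates in metres.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive to recover depth")
    z = fx * baseline / disparity   # depth from disparity
    x = (u - cx) * z / fx           # lateral offset from the optical axis
    y = (v - cy) * z / fy           # vertical offset from the optical axis
    return x, y, z
```

For example, with a focal length of 1000 pixels, a 0.5 m baseline, and a 25-pixel disparity, the recovered depth is 20 m.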

3. The method as described in claim 1, characterized in that the step of verifying the identification result through a spatiotemporal dual gating mechanism based on the dead reckoning mileage value to obtain the identification station number value comprises:
obtaining the scale evolution features based on the ratio of the area of the detection box corresponding to the target mileage marker to the full image resolution;
determining, based on the scale evolution features, whether the detection box satisfies the preset replacement condition between adjacent frames to obtain the first gating decision result;
calculating the numerical difference between the identification result and the dead reckoning mileage value, and determining whether the numerical difference exceeds a preset tolerance threshold to obtain the second gating decision result;
when both the first gating decision result and the second gating decision result are yes, mapping the dead reckoning mileage value to the identification station number value;
when the first gating decision result or the second gating decision result is no, using the identification result as the identification station number value.
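The decision logic of the dual gating mechanism in claim 3 can be sketched as follows. The claim does not specify the replacement condition, so this sketch assumes a hypothetical one: the detection box's area ratio drops sharply between adjacent frames, suggesting that a new, more distant marker has replaced the previous one. The function name, the `ratio_drop` factor, and the default `tolerance_m` are illustrative assumptions only.

```python
def gate_station_value(area_ratio_prev, area_ratio_curr,
                       recognized_m, dead_reckoned_m,
                       ratio_drop=0.5, tolerance_m=50.0):
    """Choose between the recognized value and the dead reckoning value.

    area_ratio_*: detection box area divided by full image area.
    recognized_m: mileage read by character recognition, in metres.
    dead_reckoned_m: dead reckoning mileage value, in metres.
    """
    # First gate (assumed form): box shrank sharply between adjacent frames.
    first_gate = area_ratio_curr < area_ratio_prev * ratio_drop
    # Second gate: recognized value disagrees with dead reckoning.
    second_gate = abs(recognized_m - dead_reckoned_m) > tolerance_m
    if first_gate and second_gate:
        return dead_reckoned_m   # both gates "yes": fall back to dead reckoning
    return recognized_m          # otherwise trust the recognition result
```

The recognized value is overridden only when both gates agree, so a single noisy signal does not discard an otherwise valid reading.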

4. The method as described in claim 1, characterized in that the step of calculating the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface based on the lane line vanishing point at which the left and right lane lines converge in the left view comprises:
performing grayscale conversion and Gaussian filtering on the left view to obtain a denoised grayscale image;
extracting the pixel gradients in the denoised grayscale image using an edge detection algorithm to obtain edge contour features;
extracting the straight line segments in the edge contour features by Hough line transform, and fitting the straight line segments by the least squares method to obtain the equations of the left lane line and the right lane line;
solving the equations of the left lane line and the right lane line simultaneously to obtain the coordinates of their intersection point, which is the lane line vanishing point;
determining the vertical coordinate of the lane line vanishing point in the left view based on the principle of perspective projection;
calculating the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface using an inverse trigonometric function, based on the vertical coordinate of the vanishing point, the optical center coordinates of the vehicle-mounted camera, and the vertical focal length.
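The last two steps of claim 4 (line intersection and the inverse trigonometric step) can be sketched as below. The filtering, edge detection, and Hough steps are not shown; the lane lines are assumed to be already fitted in slope-intercept form `y = k * x + b` in pixel coordinates. The arctangent form and its sign convention are assumptions, since the claim only states that an inverse trigonometric function is used, and the function names are hypothetical.

```python
import math

def vanishing_point(left_line, right_line):
    """Intersect two lines given as (k, b) with y = k * x + b."""
    k1, b1 = left_line
    k2, b2 = right_line
    if k1 == k2:
        raise ValueError("lane lines are parallel in the image")
    x = (b2 - b1) / (k1 - k2)
    y = k1 * x + b1
    return x, y

def pitch_from_vanishing_point(v_vp, cy, fy):
    """Pitch angle in radians from the vanishing point's vertical coordinate.

    v_vp: vertical pixel coordinate of the vanishing point.
    cy: vertical coordinate of the optical center; fy: vertical focal length.
    Assumed sign: positive when the vanishing point lies above the center.
    """
    return math.atan((cy - v_vp) / fy)
```

With a level camera the vanishing point falls on the optical center row and the computed pitch is zero.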

5. The method as described in claim 1, characterized in that the step of integrating the instantaneous speed over time from the initial moment or the previous verification moment, and combining the result with the determined historical mileage marker value to obtain the dead reckoning mileage value comprises:
if the current positioning is the first positioning after system start-up, integrating the instantaneous speed between the initial moment and the current moment to obtain the theoretical distance traveled by the vehicle;
if the current positioning is not the first positioning after system start-up, integrating the instantaneous speed between the previous verification moment and the current moment to obtain the theoretical distance traveled by the vehicle;
determining the direction coefficient based on the travel direction of the inspection vehicle;
calculating the dead reckoning mileage value based on the theoretical distance, the direction coefficient, and the determined historical mileage marker value.
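The dead reckoning step of claim 5 can be sketched as a numerical integration of sampled speeds. The trapezoidal rule, the choice of +1 or -1 as the direction coefficient, and the function names are assumptions made for illustration; the claim only states that the speed is integrated over time and combined with a direction coefficient and the historical value.

```python
def integrate_distance(times_s, speeds_mps):
    """Trapezoidal integration of speed samples (m/s) over time (s)."""
    if len(times_s) != len(speeds_mps):
        raise ValueError("times and speeds must have the same length")
    return sum(
        0.5 * (speeds_mps[i] + speeds_mps[i - 1]) * (times_s[i] - times_s[i - 1])
        for i in range(1, len(times_s))
    )

def dead_reckoned_mileage(historical_m, times_s, speeds_mps, increasing=True):
    """Historical mileage plus signed distance traveled since that moment.

    The samples start at the initial moment (first positioning) or at the
    previous verification moment (later positionings).
    increasing: True if the vehicle drives toward larger station numbers.
    """
    direction = 1.0 if increasing else -1.0   # assumed direction coefficient
    return historical_m + direction * integrate_distance(times_s, speeds_mps)
```

For example, two seconds at a constant 20 m/s after a verified marker at 12000 m gives 12040 m when driving toward increasing station numbers and 11960 m in the opposite direction.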

6. A mileage marker positioning device based on multimodal fusion, characterized in that the device employs the multimodal fusion-based mileage marker positioning method as described in any one of claims 1 to 5, and the device comprises:
a data acquisition module, used to obtain the left view and the right view corresponding to the image of the road ahead captured by the vehicle-mounted camera on the inspection vehicle, and to obtain the instantaneous speed, absolute geographic coordinates, and vehicle heading angle of the inspection vehicle;
an attitude calculation module, used to calculate the real-time pitch angle of the vehicle-mounted camera relative to the horizontal road surface based on the lane line vanishing point at which the left and right lane lines converge in the left view;
a physical calculation module, used to integrate the instantaneous speed over time from the initial moment or the previous verification moment, and to combine the result with the determined historical mileage marker value to obtain the dead reckoning mileage value;
a detection and positioning module, used to perform inverse distortion processing on the initial three-dimensional coordinates of the target mileage marker based on the real-time pitch angle when the target mileage marker is detected in the left view, and to verify the identification result of the target mileage marker to obtain the mileage marker positioning result;
a virtual anchor point generation module, used to generate virtual anchor point coordinates based on the absolute geographic coordinates, the vehicle heading angle, and a preset roadside lateral offset when the target mileage marker is not detected in the left view and the dead reckoning mileage value reaches the theoretical mileage position corresponding to the next mileage marker, wherein the theoretical mileage position is determined based on the historical mileage marker value and the preset mileage marker interval value;
a blind zone output module, used to map the dead reckoning mileage value to the reckoning station number value, and to obtain the mileage marker positioning result based on the virtual anchor point coordinates and the reckoning station number value.
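The virtual anchor point and station number mapping can be sketched as follows. Everything specific here is an assumption for illustration: the heading is taken as degrees clockwise from north, the lateral offset is taken as metres to the right of the travel direction, a small-offset spherical-earth approximation is used, and the `K<km>+<m>` station format is assumed rather than taken from the text. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean earth radius, spherical approximation

def virtual_anchor(lat_deg, lon_deg, heading_deg, lateral_offset_m):
    """Shift a geographic position sideways by a roadside lateral offset.

    heading_deg: assumed to be degrees clockwise from north.
    lateral_offset_m: assumed positive to the right of the travel direction.
    """
    bearing = math.radians(heading_deg + 90.0)   # direction of the offset
    d_north = lateral_offset_m * math.cos(bearing)
    d_east = lateral_offset_m * math.sin(bearing)
    d_lat = math.degrees(d_north / EARTH_RADIUS_M)
    d_lon = math.degrees(
        d_east / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg)))
    )
    return lat_deg + d_lat, lon_deg + d_lon

def format_station(mileage_m):
    """Map a mileage in metres to an assumed 'K<km>+<m>' station string."""
    km, m = divmod(int(round(mileage_m)), 1000)
    return f"K{km}+{m:03d}"
```

A vehicle heading due north with a 3 m offset yields an anchor slightly to the east at essentially the same latitude, and a mileage of 12345 m maps to `K12+345` under the assumed format.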

7. A mileage marker positioning equipment based on multimodal fusion, characterized in that the equipment comprises: a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being configured to implement the steps of the multimodal fusion-based mileage marker positioning method as described in any one of claims 1 to 5.

8. A storage medium, characterized in that the storage medium is a computer-readable storage medium on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the multimodal fusion-based mileage marker positioning method as described in any one of claims 1 to 5.