An image-based vehicle pitch angle compensation method and device under critical working conditions

By acquiring visual images of the area in front of the vehicle together with vehicle motion state signals, dynamically defining the region of interest, performing local texture direction estimation and voting calculation, and combining Kalman filtering to estimate the camera pitch angle, the method compensates for vehicle pitch angle changes under critical conditions and improves the perception accuracy and robustness of the intelligent driving system.

CN122265348A · Pending · Publication Date: 2026-06-23 · TSINGHUA UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
TSINGHUA UNIVERSITY
Filing Date
2026-03-09
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Under critical operating conditions, violent movement of the vehicle suspension system causes the vehicle body to pitch, and the camera pitch angle changes rapidly, which disrupts the geometric constraints of the vision system. Existing technologies struggle to compensate for this effectively, which degrades the accuracy and robustness of the intelligent driving perception system.

Method used

Visual images of the area in front of the vehicle and vehicle motion state signals are acquired, the region of interest is dynamically defined, the local texture direction is estimated, the road vanishing point is determined by voting calculation, and Kalman filtering is used to optimally estimate its position. The camera pitch angle is then calculated from the estimated vanishing point and used to compensate for the vehicle pitch angle.

Benefits of technology

It enables real-time, accurate compensation of the vehicle pitch angle under critical conditions, improves the accuracy and robustness of the intelligent driving perception system, and supports the reliable operation of active safety functions in complex dynamic scenarios.


Abstract

The application provides a vehicle pitch angle compensation method and device based on images under critical working conditions, and belongs to the technical field of intelligent automobile environment perception and chassis control. By acquiring a visual image in front of the vehicle and a motion state signal, a dynamic region of interest containing a road vanishing point is demarcated in the image based on the motion state signal; local dominant direction estimation is performed on pixel points in the region of interest to obtain a texture dominant direction corresponding to each pixel point, and a vote calculation is performed accordingly to determine the position of the road vanishing point; the position of the vanishing point is optimally estimated by using Kalman filtering, and the pitch angle of the camera is calculated according to the coordinates of the vanishing point obtained by optimal estimation, so that accurate compensation of the pitch angle of the vehicle is realized. Through adaptive extraction of the dynamic region of interest and texture direction analysis, combined with Kalman filtering estimation, the accuracy and robustness of the pitch angle calculation are improved, and the attitude stability of the vehicle under complex working conditions is improved.

Description

Technical Field

[0001] This invention relates to the field of intelligent vehicle environmental perception and chassis control technology, and in particular to an image-based vehicle pitch angle compensation method, device, equipment and storage medium for critical conditions.

Background Technology

[0002] With the development of advanced driver assistance systems and autonomous driving technology, vision-based environmental perception has become one of the core means for vehicles to understand their surroundings. Cameras are usually fixed inside the vehicle. Although their relative position parameters, such as pitch angle, are initially calibrated relative to the vehicle body, under critical conditions such as emergency braking, fast cornering, and severe road impacts, the vehicle's suspension system will experience significant compression and rebound, causing the vehicle body to pitch violently. This, in turn, causes the camera mounted on it to experience dynamic pitch angle changes. In such critical conditions, the camera's pitch angle changes instantaneously and significantly, which can seriously undermine the geometric constraints upon which the vision system relies for operation.

[0003] Existing technologies can acquire vehicle attitude angles directly using inertial measurement units, but they suffer from sensor noise and drift. Furthermore, these methods primarily address angle changes caused by road gradients and slow steering, and do not provide an online compensation scheme designed for the rapid, large pitch changes that occur under emergency conditions. Therefore, a method is needed that can estimate and compensate for camera pitch angle changes under emergency conditions using existing sensors.

Summary of the Invention

[0004] The present invention aims to at least partially solve one of the technical problems in the related art.

[0005] To address this, this invention proposes an image-based vehicle pitch angle compensation method for critical conditions. The method acquires a visual image of the area in front of the vehicle together with vehicle motion state signals, and dynamically defines a region of interest (ROI) containing the road vanishing point within the image based on the motion state signals. Local dominant direction estimation is performed on pixels within the ROI to extract the dominant texture direction of each pixel. A voting calculation is then performed based on the dominant texture directions to locate the road vanishing point in the image. Finally, Kalman filtering is used to optimally estimate the vanishing point location, and the camera pitch angle is calculated from the estimation result, achieving dynamic compensation for the vehicle pitch angle. This improves the accuracy and robustness of the intelligent driving perception system under critical conditions.

[0006] Another objective of this invention is to provide an image-based vehicle pitch angle compensation device for critical operating conditions.

[0007] A third objective of this invention is to provide a computer device.

[0008] A fourth objective of this invention is to provide a non-transitory computer-readable storage medium.

[0009] To achieve the above objectives, the present invention proposes an image-based vehicle pitch angle compensation method for critical operating conditions, comprising:

[0010] S1, acquiring a visual image of the area in front of the vehicle and vehicle motion state signals;
S2, determining, based on the vehicle motion state signals, a dynamic region of interest containing the road vanishing point in the visual image;
S3, performing local dominant direction estimation on the pixels in the dynamic region of interest to obtain the dominant texture direction corresponding to each pixel;
S4, performing a voting calculation based on the dominant texture direction of each pixel to determine the location of the road vanishing point in the visual image;
S5, optimally estimating the location of the road vanishing point based on Kalman filtering, and calculating the camera pitch angle from the coordinates of the optimally estimated vanishing point, thereby compensating for the vehicle pitch angle.

[0011] The image-based vehicle pitch angle compensation method under critical conditions according to an embodiment of the present invention may also have the following additional technical features.

In one embodiment of the present invention, acquiring the visual image of the area in front of the vehicle and the vehicle motion state signals includes:
S11, acquiring the vehicle motion state signals in real time, including vehicle speed, longitudinal acceleration and lateral acceleration, which are read through the vehicle CAN bus;
S12, acquiring the visual image of the area in front of the vehicle in real time, wherein the visual image is acquired at a preset frequency and a fixed resolution by a vehicle-mounted camera fixedly installed on the inside of the windshield, and the installation position and viewing angle of the vehicle-mounted camera are pre-adjusted so that the proportion of the road area in the acquired image is greater than a set threshold and the longitudinal extension distance of the road area in the image exceeds a preset number of meters;
S13, synchronously transmitting the visual image and the vehicle motion state signals to the on-board industrial control computer, and adding a timestamp based on the system time of the industrial control computer to each image frame so that the images can later be processed in chronological order.

[0012] In one embodiment of the present invention, determining the dynamic region of interest containing the road vanishing point in the visual image based on the vehicle motion state signals includes:
S21, obtaining the reference position of the horizon on the image plane in the initial calibration state of the vehicle, and using it as the longitudinal reference of the dynamic region of interest;
S22, dynamically adjusting the longitudinal position and size range of the dynamic region of interest in the image based on the vehicle speed, longitudinal acceleration, and lateral acceleration signals acquired in real time, wherein the longitudinal position is adaptively translated according to the change in the vehicle's pitch attitude, and the size range is adaptively scaled according to the intensity of the vehicle's movement;
S23, defining a strip-shaped region of preset width in the image, centered on the dynamically adjusted longitudinal position, as the dynamic region of interest; subsequent local dominant direction estimation is performed only on pixels within this region to reduce the computational load.
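Steps S21 to S23 can be illustrated with a minimal sketch. The patent text does not give the mapping from motion signals to ROI position and size, so the linear gains below (`shift_px_per_ms2`, `grow_px_per_ms2`), the base half-height, and the sign convention (image rows grow downward, braking has negative longitudinal acceleration and moves the horizon up in the image) are all assumptions made for illustration. Vehicle speed is left out for brevity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RoiStrip:
    top: int     # first image row of the strip (inclusive)
    bottom: int  # last image row of the strip (exclusive)


def dynamic_roi(horizon_row, image_rows, a_long, a_lat,
                base_half_height=40.0,   # assumed half-height in pixels at rest
                shift_px_per_ms2=6.0,    # hypothetical gain: row shift per m/s^2
                grow_px_per_ms2=4.0):    # hypothetical gain: growth per m/s^2
    """Return a horizontal strip around the expected vanishing-point row."""
    # S22, position: braking (a_long < 0) pitches the nose down, so the
    # horizon moves toward the top of the image (smaller row index).
    center = horizon_row + shift_px_per_ms2 * a_long
    # S22, size: stronger combined acceleration widens the search strip.
    half = base_half_height + grow_px_per_ms2 * math.hypot(a_long, a_lat)
    # S23: clamp the strip to the image.
    top = max(0, round(center - half))
    bottom = min(image_rows, round(center + half))
    return RoiStrip(top, bottom)


if __name__ == "__main__":
    print(dynamic_roi(360, 720, 0.0, 0.0))    # steady driving
    print(dynamic_roi(360, 720, -8.0, 0.0))   # hard braking
```

With these assumed gains, hard braking both raises the strip and roughly doubles its height, which is the qualitative behaviour S22 describes.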

[0013] In one embodiment of the present invention, performing local dominant direction estimation on the pixels within the dynamic region of interest to obtain the dominant texture direction corresponding to each pixel includes:
S31, for each pixel in the dynamic region of interest, performing a convolution operation using Gabor filters with at least four preset direction angles to obtain the Gabor energy response value corresponding to each direction;
S32, dividing the at least four preset direction angles into several orthogonal direction pairs, and performing a competitive suppression operation on the two energy response values in each orthogonal direction pair to obtain the saliency strength of the pixel for that direction pair;
S33, selecting the angle corresponding to the maximum saliency strength over all direction pairs as the initial dominant direction of the pixel, and applying a vector synthesis method to optimize the neighborhood consistency of the initial dominant direction, yielding the final estimated dominant texture direction.
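A minimal pure-Python sketch of S31 to S33 follows. Several details are assumptions, since the text does not specify them: four angles (0°, 45°, 90°, 135°), an isotropic Gaussian envelope, the absolute energy difference as the "competitive suppression" result, and doubled-angle averaging as the vector synthesis step. The filter at angle θ is built so that it responds to texture running along θ, with angles measured from the column axis toward the row axis.

```python
import math

ANGLES_DEG = (0.0, 45.0, 90.0, 135.0)
ORTHOGONAL_PAIRS = ((0, 2), (1, 3))  # index pairs into ANGLES_DEG


def gabor_energy(image, row, col, theta_deg, wavelength=8.0, sigma=4.0, half=10):
    """Energy (even^2 + odd^2) of a Gabor pair centred on one pixel (S31)."""
    theta = math.radians(theta_deg)
    even = odd = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r, c = row + dy, col + dx
            if not (0 <= r < len(image) and 0 <= c < len(image[0])):
                continue
            # u is the offset perpendicular to the direction theta.
            u = -dx * math.sin(theta) + dy * math.cos(theta)
            envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            phase = 2.0 * math.pi * u / wavelength
            even += image[r][c] * envelope * math.cos(phase)
            odd += image[r][c] * envelope * math.sin(phase)
    return even * even + odd * odd


def dominant_direction(image, row, col):
    """Return (angle_deg, saliency) from orthogonal-pair competition (S32, S33)."""
    energies = [gabor_energy(image, row, col, a) for a in ANGLES_DEG]
    best_angle, best_saliency = None, -1.0
    for i, j in ORTHOGONAL_PAIRS:
        saliency = abs(energies[i] - energies[j])  # assumed suppression rule
        if saliency > best_saliency:
            best_saliency = saliency
            best_angle = ANGLES_DEG[i] if energies[i] >= energies[j] else ANGLES_DEG[j]
    return best_angle, best_saliency


def smooth_direction(angles_deg, weights):
    """Weighted mean of orientations (period 180 deg) via doubled angles (S33)."""
    sx = sum(w * math.cos(math.radians(2.0 * a)) for a, w in zip(angles_deg, weights))
    sy = sum(w * math.sin(math.radians(2.0 * a)) for a, w in zip(angles_deg, weights))
    return (math.degrees(math.atan2(sy, sx)) / 2.0) % 180.0


if __name__ == "__main__":
    # Vertical stripes: intensity varies with the column only.
    stripes = [[math.cos(2.0 * math.pi * c / 8.0) for c in range(32)] for _ in range(32)]
    print(dominant_direction(stripes, 16, 16)[0])
```

Doubling the angles before averaging keeps 170° and 10° from cancelling to 90°, which a naive mean of orientations would produce.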

[0014] In one embodiment of the present invention, performing the voting calculation based on the dominant texture direction of each pixel to determine the location of the road vanishing point in the visual image includes:
S41, taking each pixel in the dynamic region of interest as a voting starting point, and generating a ray pointing to the image boundary along its dominant texture direction to establish the voting support relationship between the pixel and the vanishing point position;
S42, calculating the Euclidean distance between each candidate pixel on the ray and the voting starting point, and applying exponential decay weighting based on the Euclidean distance to obtain the voting contribution of the starting point to each candidate pixel on the ray, wherein the contribution is larger at shorter distances and smaller at longer distances;
S43, accumulating the voting contributions of all pixels on the image plane to construct a two-dimensional voting accumulation matrix corresponding to the dynamic region of interest, and selecting the pixel coordinates of the voting peak in the matrix as the observed position of the road vanishing point in the current frame.
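The voting scheme of S41 to S43 can be sketched as below. This is an illustration under stated assumptions rather than the patented implementation: each ray is cast in the upward sense of the orientation (toward smaller row indices), cells are visited in unit steps with nearest-cell rounding, the starting pixel does not vote for itself, and `decay_scale` is a hypothetical decay length in pixels.

```python
import math


def vote_vanishing_point(voters, width, height, decay_scale=40.0):
    """Accumulate distance-weighted ray votes and return (peak, accumulator).

    voters: iterable of (row, col, angle_deg), where angle_deg is the dominant
    texture direction measured from the column axis toward the row axis.
    """
    acc = [[0.0] * width for _ in range(height)]
    for row0, col0, angle_deg in voters:
        theta = math.radians(angle_deg)
        dx, dy = math.cos(theta), math.sin(theta)
        if dy > 0:  # orientations repeat every 180 deg: pick the upward sense
            dx, dy = -dx, -dy
        visited = set()
        step = 1.0
        while True:
            col = math.floor(col0 + step * dx + 0.5)
            row = math.floor(row0 + step * dy + 0.5)
            if not (0 <= row < height and 0 <= col < width):
                break  # S41: the ray ends at the image boundary
            if (row, col) != (row0, col0) and (row, col) not in visited:
                visited.add((row, col))
                # S42: exponential decay with Euclidean distance from the voter
                distance = math.hypot(col - col0, row - row0)
                acc[row][col] += math.exp(-distance / decay_scale)
            step += 1.0
    # S43: the accumulator peak is the observed vanishing point
    _, peak_row, peak_col = max(
        (acc[r][c], r, c) for r in range(height) for c in range(width)
    )
    return (peak_row, peak_col), acc


if __name__ == "__main__":
    # Three voters whose texture directions all pass through row 5, column 20.
    voters = [(25, 20, 90.0), (15, 30, 45.0), (15, 10, 135.0)]
    peak, _ = vote_vanishing_point(voters, width=41, height=31)
    print(peak)
```

Because the weight decays with distance, collinear voters reinforce nearby cells on their shared line more than the distant intersection, so the decay length needs to be chosen relative to the ROI size.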

[0015] In one embodiment of the present invention, optimally estimating the location of the road vanishing point based on Kalman filtering, and calculating the camera pitch angle from the coordinates of the optimally estimated vanishing point, thereby compensating for the vehicle pitch angle, includes:
S51, constructing a state vector containing the image coordinates of the vanishing point and its velocity on the image plane, and performing Kalman prediction of the vanishing point state for the current frame based on a uniform motion model to obtain the predicted state and the corresponding covariance;
S52, using the observed location of the road vanishing point obtained by voting as the measurement input, dynamically adjusting the observation noise covariance according to the confidence of the voting result, calculating the Kalman gain in combination with the predicted state, updating the state vector, and extracting the updated vanishing point image coordinates as the optimally estimated location;
S53, calculating the current camera pitch angle based on the longitudinal coordinate of the optimally estimated vanishing point, combined with the camera's focal length, sensor height, and image size; then, using the pre-calibrated fixed pitch angle deviation between the camera and the vehicle, determining the vehicle's actual pitch angle and outputting it to the vehicle control system and perception module for compensation.
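S51 to S53 can be sketched as follows. To stay short, the filter below tracks only the vanishing-point row with a two-element state (row, row velocity); the column could be handled the same way. The noise variances, the rule that divides the measurement variance by the voting confidence, and the sign conventions are assumptions. The pitch formula is the standard pinhole relation, with the focal length converted from millimetres to pixels using the sensor height and the image row count.

```python
import math


class VanishingRowFilter:
    """Constant-velocity Kalman filter for the vanishing-point row (S51, S52)."""

    def __init__(self, row, pos_var=25.0, vel_var=100.0,
                 accel_var=50.0, base_meas_var=9.0):  # all values assumed
        self.x = [float(row), 0.0]                # [row, row velocity]
        self.P = [[pos_var, 0.0], [0.0, vel_var]]
        self.q = accel_var
        self.r0 = base_meas_var

    def predict(self, dt):
        p, s = self.x
        self.x = [p + s * dt, s]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and white-noise acceleration
        q = self.q
        self.P = [
            [a + dt * (b + c) + dt * dt * d + q * dt ** 4 / 4.0,
             b + dt * d + q * dt ** 3 / 2.0],
            [c + dt * d + q * dt ** 3 / 2.0,
             d + q * dt ** 2],
        ]

    def update(self, measured_row, confidence=1.0):
        # Lower voting confidence inflates the measurement variance.
        r = self.r0 / max(confidence, 1e-6)
        (a, b), (c, d) = self.P
        s = a + r
        k0, k1 = a / s, c / s
        innovation = measured_row - self.x[0]
        self.x = [self.x[0] + k0 * innovation, self.x[1] + k1 * innovation]
        self.P = [[(1.0 - k0) * a, (1.0 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
        return self.x[0]


def camera_pitch_rad(vp_row, image_rows, focal_mm, sensor_height_mm,
                     principal_row=None):
    """Camera pitch from the vanishing-point row (S53); positive looks down."""
    focal_px = focal_mm * image_rows / sensor_height_mm
    if principal_row is None:
        principal_row = image_rows / 2.0
    return math.atan2(principal_row - vp_row, focal_px)


def vehicle_pitch_rad(camera_pitch, mount_offset_rad):
    """Remove the pre-calibrated camera-to-body pitch offset (S53)."""
    return camera_pitch - mount_offset_rad


if __name__ == "__main__":
    kf = VanishingRowFilter(360.0)
    kf.predict(1.0 / 30.0)
    row = kf.update(350.0, confidence=0.8)
    pitch = camera_pitch_rad(row, 720, focal_mm=6.0, sensor_height_mm=4.32)
    print(math.degrees(vehicle_pitch_rad(pitch, mount_offset_rad=0.0)))
```

A low-confidence vote therefore moves the estimate less than a high-confidence one, so the prediction carries the track through frames where the voting peak is weak.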

[0016] To achieve the above objectives, another aspect of the present invention provides an image-based vehicle pitch angle compensation device for critical operating conditions, comprising:
a multi-source signal synchronous acquisition module, used to acquire a visual image of the area in front of the vehicle and vehicle motion state signals;
a dynamic region of interest adaptive localization module, used to determine a dynamic region of interest containing the road vanishing point in the visual image based on the vehicle motion state signals;
a dominant texture direction estimation module, used to perform local dominant direction estimation on the pixels in the dynamic region of interest to obtain the dominant texture direction corresponding to each pixel;
a vanishing point voting calculation and localization module, used to perform a voting calculation based on the dominant texture direction of each pixel to determine the location of the road vanishing point in the visual image;
a pitch angle estimation and dynamic compensation module, used to optimally estimate the location of the road vanishing point based on Kalman filtering, calculate the camera pitch angle from the coordinates of the optimally estimated vanishing point, and thereby compensate for the vehicle pitch angle.

[0017] This invention discloses an image-based vehicle pitch angle compensation method and apparatus for critical conditions. By dynamically defining the region of interest, estimating local texture direction, using vanishing point voting for localization, and employing Kalman filtering for optimal estimation, it effectively solves the perception failure problem caused by sudden changes in camera viewpoint and disruption of visual geometric constraints due to severe vehicle pitch in critical conditions. It achieves integrated modeling from multi-source signal fusion and dynamic region adaptive localization to real-time pitch angle estimation and compensation, accurately restoring the perception baseline under vehicle attitude changes. This significantly improves the environmental perception accuracy and control robustness of the intelligent driving system under extreme conditions, and enhances the engineering applicability of active safety functions in complex dynamic scenarios.

[0018] To achieve the above objectives, a third aspect of this application provides a computer device comprising a processor and a memory; wherein the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory, for implementing an image-based vehicle pitch angle compensation method under critical conditions as described in the first aspect embodiment.

[0019] To achieve the above objectives, a fourth aspect of this application provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements an image-based vehicle pitch angle compensation method for critical conditions as described in the first aspect embodiment.

[0020] Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Description of the Drawings

[0021] The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, wherein:
Figure 1 is a flowchart of an image-based vehicle pitch angle compensation method under critical conditions according to an embodiment of the present invention;
Figure 2 is an overall flowchart of another image-based vehicle pitch angle compensation method under critical conditions according to an embodiment of the present invention;
Figure 3 is a schematic diagram illustrating the conversion from camera pitch angle to vehicle center of gravity pitch angle in another image-based vehicle pitch angle compensation method under critical conditions according to an embodiment of the present invention;
Figure 4 is a schematic diagram of the structure of an image-based vehicle pitch angle compensation device under critical conditions according to an embodiment of the present invention;
Figure 5 shows a computer device according to an embodiment of the present invention.

Detailed Implementation

[0022] It should be noted that, unless otherwise specified, the embodiments and features described in the present invention can be combined with each other. The present invention will now be described in detail with reference to the accompanying drawings and embodiments.

[0023] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.

[0024] The following description, with reference to the accompanying drawings, describes an image-based vehicle pitch angle compensation method, apparatus, device, and storage medium for critical operating conditions according to an embodiment of the present invention.

[0025] The core idea of this invention is to provide a multi-source data foundation for pitch angle compensation by acquiring, in real time, visual images of the area in front of the vehicle and vehicle motion state signals. Based on this, a region of interest (ROI) containing the road vanishing point is dynamically delineated in the image according to the vehicle's motion state signals, achieving adaptive allocation of computing resources and precise localization of the sensing area. Further, local dominant texture direction estimation is performed on pixels within the ROI, extracting the dominant texture direction corresponding to each pixel and constructing geometric constraints pointing to the road vanishing point. A voting calculation is performed based on the dominant texture directions of all pixels, accumulating the voting contributions on the image plane and locating the peak coordinates, thereby determining the observed position of the road vanishing point. To address potential jitter and noise in the vanishing point between consecutive frames, a Kalman filter is introduced to optimally estimate the vanishing point position, suppressing high-frequency jumps and observation errors. Finally, based on the optimally estimated vanishing point ordinate, combined with camera intrinsic parameters and pre-calibrated installation deviations, the current pitch angle of the camera is calculated and converted into the actual vehicle pitch angle, which is then output to the vehicle control system or sensing module for dynamic compensation. This transforms the traditional single-image ranging problem into a compensation system that fuses multi-source signals, dynamically allocates computing resources, suppresses texture noise, and achieves optimal prediction through state estimation. It thereby improves the real-time performance, accuracy, and robustness of vehicle pitch angle estimation under critical conditions, supporting the reliable operation of intelligent driving active safety functions in extreme scenarios.

Example 1

[0026] To achieve the above objective, embodiments of the present invention provide an image-based vehicle pitch angle compensation method for critical operating conditions. As shown in Figure 1, the method includes:
S1, acquiring a visual image of the area in front of the vehicle and vehicle motion state signals.

[0027] Specifically, this step aims to establish a multi-source data foundation for subsequent pitch angle estimation and compensation. Its core principle lies in the fact that visual images contain texture information of the road scene, serving as a direct data source for vanishing point detection; while vehicle motion state signals reflect the vehicle's dynamic response in real time, providing prior information for adaptive localization of the region of interest. By precisely aligning visual perception data and vehicle dynamics data in the time dimension, a mapping relationship between image space and vehicle motion state is constructed, enabling subsequent image-based processing to couple with the vehicle's actual motion state, thereby achieving accurate capture and compensation of pitch angle changes under critical conditions.

[0028] Specifically, an onboard camera, fixedly mounted inside the windshield, acquires real-time images of the road ahead in the vehicle's direction of travel at a preset frame rate and image resolution. The camera's mounting position and viewing angle are pre-calibrated and adjusted to ensure that the road surface occupies a major proportion of the image, and that the longitudinal extension of the road surface in the image meets the field of view requirements for subsequent vanishing point detection. Simultaneously, vehicle motion status signals, including vehicle speed, longitudinal acceleration, and lateral acceleration, are read in real-time via the vehicle's CAN bus. The acquired image data and vehicle signals are synchronously transmitted to the onboard industrial control computer, and a unique timestamp is added to each frame of the image based on the industrial control computer's system time, establishing a one-to-one correspondence between image frames and signal data in the time dimension, forming a time-aligned and formatted input data stream.

[0029] Furthermore, in terms of visual image acquisition, continuous acquisition is performed using a fixed frame rate (e.g., 30 frames per second) and a fixed resolution (e.g., a preset pixel matrix). When installing the camera, the viewing angle is adjusted so that the road surface area occupies more than three-fifths of the frame, and the longitudinal coverage distance of the road surface in the image exceeds 50 meters, ensuring sufficient texture information and geometric constraints for vanishing point detection. For vehicle signal acquisition, vehicle speed, longitudinal acceleration, and lateral acceleration are read via the CAN bus at a sampling frequency no less than the image acquisition frame rate, ensuring that the signals can respond to real-time changes in vehicle dynamics. Timestamp accuracy reaches the millisecond level, ensuring that the synchronization error between the image and the signal is controlled within an acceptable range.

[0030] Furthermore, the multi-source data acquired in this step forms the foundation for the entire pitch angle compensation method and is widely applied in various critical operating conditions. Specifically, under extreme conditions such as automatic emergency braking, high-speed emergency obstacle avoidance, and severe road impacts that cause sudden changes in vehicle attitude, this step continuously provides the system with image sequences containing road texture information and motion state signals reflecting the vehicle's real-time response. The acquired images provide raw material for vanishing point detection, while the vehicle motion state signals provide the triggering basis and parameter input for subsequent real-time adjustments to the dynamic region of interest, enabling the entire compensation method to maintain stable tracking of the road vanishing point throughout the dynamic process of severe vehicle pitch.

[0031] Specifically, on the one hand, high-quality, wide-view road images ensure sufficient texture features and geometric constraints for vanishing point detection; on the other hand, real-time vehicle motion state signals provide accurate prior information for adaptive adjustment of the region of interest, effectively reducing the computational load of subsequent image processing and improving the localization accuracy of the target region. The temporal alignment of images and signals eliminates estimation errors that may be introduced by data asynchrony, ensuring that the final pitch angle compensation accurately reflects the vehicle's current motion state, significantly improving the response speed and estimation accuracy of the entire compensation method under critical conditions.

[0032] Furthermore, S1 includes:
S11, acquiring vehicle motion state signals in real time, including vehicle speed, longitudinal acceleration, and lateral acceleration, which are read through the vehicle's CAN bus.

[0033] Specifically, the core purpose of acquiring vehicle motion state signals in real time is to provide prior information related to changes in vehicle attitude for subsequent dynamic region of interest localization. Vehicle speed, longitudinal acceleration, and lateral acceleration are key parameters characterizing vehicle motion. Vehicle speed reflects the vehicle's stable operating state, longitudinal acceleration directly reflects the pitch tendency caused by braking or acceleration, and lateral acceleration is related to the roll caused by steering or obstacle avoidance. By continuously reading these signals, the system can perceive the dynamic evolution of the vehicle's attitude, thereby providing the image processing module with a basis for adjusting the search area, ensuring that the tracking of the vanishing point of the road always focuses on the most likely image location in critical situations.

[0034] Furthermore, this step involves accessing the vehicle's controller area network (CAN) bus to read, in real time and using a standardized communication protocol, the motion state signals broadcast by the vehicle's electronic control units. Specifically, vehicle speed signals typically originate from wheel speed sensors and are processed by the vehicle controller before being transmitted as digital messages. Longitudinal and lateral acceleration signals originate from the onboard inertial measurement unit or the vehicle stability system. The industrial control computer is physically connected to the bus via a CAN interface card, and filters and parses the relevant signal frames according to preset message identifiers to obtain continuous numerical sequences. All signal reading operations run in parallel with the image acquisition thread, and a timestamp based on a unified system time is added to each signal record to establish a precise correspondence with the image frames.
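The correspondence between timestamped signal records and image frames can be illustrated with a nearest-timestamp match. This is a minimal sketch, assuming both streams carry millisecond timestamps from the same clock and that signal timestamps are sorted; the function name and the `max_skew_ms` tolerance are hypothetical.

```python
import bisect


def align_signals_to_frames(frame_times_ms, signal_times_ms, signals,
                            max_skew_ms=20):
    """Pair each frame with the signal record closest in time.

    Returns a list of (frame_time_ms, signal_or_None). A frame gets None when
    no signal record lies within max_skew_ms of it.
    """
    aligned = []
    for frame_time in frame_times_ms:
        # Index of the first signal record at or after the frame time.
        i = bisect.bisect_left(signal_times_ms, frame_time)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(signal_times_ms)]
        if not candidates:
            aligned.append((frame_time, None))
            continue
        j = min(candidates, key=lambda k: abs(signal_times_ms[k] - frame_time))
        if abs(signal_times_ms[j] - frame_time) <= max_skew_ms:
            aligned.append((frame_time, signals[j]))
        else:
            aligned.append((frame_time, None))
    return aligned


if __name__ == "__main__":
    frame_times = [0, 33, 67]                     # about 30 frames per second
    signal_times = [0, 10, 20, 30, 40, 50, 60, 70]  # 100 Hz records
    records = [{"a_long": -0.1 * k} for k in range(len(signal_times))]
    for frame_time, record in align_signals_to_frames(frame_times, signal_times, records):
        print(frame_time, record)
```

Rejecting matches beyond a skew tolerance keeps a stalled signal stream from silently pairing stale accelerations with fresh frames.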

[0035] Furthermore, to achieve accurate and rapid dynamic capture under critical conditions, signal reading must meet stringent performance specifications. The sampling frequency must be no less than the image acquisition frame rate (e.g., 30 Hz) to ensure that each image frame has corresponding motion state data. The vehicle speed measurement range should cover 0–200 km/h with a resolution of 0.1 km/h; the longitudinal and lateral acceleration measurement range should be at least ±1.5 g with a resolution better than 0.01 g, and the dynamic response time should be less than 10 ms. The signal transmission delay via the CAN bus should be kept at the millisecond level to ensure time synchronization accuracy. All signals must be filtered and verified by the vehicle itself before being transmitted to the industrial control computer to ensure data integrity and reliability.
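These requirements can be expressed as a simple acceptance check. The sketch below is illustrative only: the `ChannelSpec` type, the channel names, and the helper functions are hypothetical, while the numeric limits are the ones stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelSpec:
    lo: float          # lower end of the measurement range
    hi: float          # upper end of the measurement range
    resolution: float  # smallest distinguishable increment


# Minimum requirements taken from the paragraph above.
REQUIRED = {
    "speed_kmh": ChannelSpec(0.0, 200.0, 0.1),
    "accel_g": ChannelSpec(-1.5, 1.5, 0.01),
}


def covers(actual, required):
    """True when a channel's range and resolution meet the requirement."""
    return (actual.lo <= required.lo
            and actual.hi >= required.hi
            and actual.resolution <= required.resolution)


def sampling_rate_ok(timestamps_ms, frame_rate_hz=30.0):
    """True when the mean sampling rate is at least the image frame rate."""
    if len(timestamps_ms) < 2:
        return False
    span_ms = timestamps_ms[-1] - timestamps_ms[0]
    if span_ms <= 0:
        return False
    return 1000.0 * (len(timestamps_ms) - 1) / span_ms >= frame_rate_hz


if __name__ == "__main__":
    imu = ChannelSpec(-2.0, 2.0, 0.005)   # hypothetical sensor data sheet values
    print(covers(imu, REQUIRED["accel_g"]))
    print(sampling_rate_ok([0, 20, 40, 60]))
```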

[0036] For example, during automatic emergency braking, a sharp change in longitudinal acceleration indicates a violent pitch of the vehicle body. In this case, the vehicle speed and acceleration values are used to calculate the longitudinal offset of the region of interest in real time. In high-speed emergency obstacle avoidance, a sudden change in lateral acceleration indicates a risk of roll, and the signal is used to adjust the size of the region of interest to accommodate greater uncertainty in the vanishing point position. By directly integrating the vehicle dynamics response into the vision processing flow, the algorithm can proactively adapt to sudden changes in vehicle attitude rather than respond passively.

[0037] Specifically, on the one hand, vehicle speed and acceleration information enable the region of interest to respond in advance to changes in vehicle attitude, avoiding the risk of losing the vanishing point during sharp pitch in a fixed region, thus significantly improving the robustness of vanishing point detection. On the other hand, high-frequency reading based on the CAN bus ensures low latency and high fidelity of the signal, allowing the compensation system to synchronize with the actual vehicle movement, thereby achieving rapid tracking and accurate compensation of the pitch angle. Ultimately, this method ensures real-time performance and accuracy under critical conditions, laying a solid foundation for the reliable operation of intelligent driving active safety functions.

[0038] S12, acquire a real-time visual image of the front of the vehicle. The visual image is acquired by a vehicle-mounted camera fixedly installed on the inside of the windshield at a preset frequency. The image resolution is fixed. The installation position and viewing angle of the vehicle-mounted camera are pre-adjusted so that the proportion of the road surface area in the acquired image is greater than a set threshold, and the longitudinal extension distance of the road surface area in the image exceeds a preset number of meters.

[0039] Specifically, by rigidly fixing the vehicle-mounted camera to the vehicle body and establishing a fixed transformation relationship between the camera coordinate system and the vehicle coordinate system, changes in road geometric features in the image can be directly mapped to changes in vehicle posture. The road surface area occupies a dominant proportion in the image and has sufficient longitudinal extension distance, ensuring that the key geometric feature of the road vanishing point is always within the image's field of view, and providing sufficient pixel support and stable perspective constraints for the texture-direction-based vanishing point voting algorithm.

[0040] Specifically, the vehicle-mounted camera is an industrial-grade digital camera, fixedly installed with a dedicated bracket in a concealed position inside the windshield and behind the rearview mirror so that it does not obstruct the driver's view. The camera's optical axis points directly ahead of the vehicle. During installation, the tilt angle is adjusted so that the lower edge of the image aligns with or slightly overlaps the edge of the hood, ensuring that the nearby road surface is fully captured. The camera continuously acquires images at a constant frame rate (e.g., 30 frames per second) and a preset fixed resolution (e.g., 1280×720 or 1920×1080 pixels), and transmits the data in real time to the vehicle's industrial control computer via a USB 3.0 or GigE interface. After installation, on-site calibration is performed to record the camera's intrinsic parameters and the initial installation angle relative to the vehicle body, which serve as the baseline for subsequent pitch angle calculations.

[0041] Furthermore, to achieve stable and reliable vanishing point detection, this step sets specific quantitative indicators for image acquisition. The image acquisition frame rate should be no less than 30 frames per second to ensure continuous capture of the vehicle's rapid pitch movements. The resolution should be sufficient to distinguish the texture details of lane lines at a distance of 100 meters; for example, the number of pixels in the horizontal direction should be no less than 1280. The camera viewing angle is adjusted so that the road surface area occupies more than three-fifths of the entire image in the vertical direction, ensuring rich road surface features in the foreground while the distant horizon is located in the upper part of the image. The longitudinal coverage distance of the road surface area in the image should exceed 50 meters; that is, the road texture within a range of at least 50 meters, measured from the road position shown at the bottom edge of the image, should be clearly captured, providing a sufficiently long ray support area for vanishing point voting.

[0042] Specifically, under normal driving conditions, the large road surface area ensures sufficient features such as lane lines and road texture, resulting in stable and reliable vanishing point detection. In critical situations such as emergency braking, rapid acceleration, or bumpy roads, although vehicle pitch causes dynamic changes in the road surface distribution in the image, the initial installation has reserved a sufficient road surface area. Therefore, even with severe vehicle pitch, the vanishing point typically does not move beyond the image boundary, ensuring the algorithm's continued usability. Simultaneously, the road surface coverage distance exceeding 50 meters allows the algorithm to obtain stable orientation estimates based on far-field texture, avoiding the impact of local road surface interference on vanishing point localization.

[0043] Specifically, a large proportion of road surface area ensures that the region of interest for vanishing point detection is always located within the image, avoiding target loss due to insufficient camera field of view. A longitudinal coverage distance exceeding 50 meters significantly enhances the statistical stability of texture orientation estimation, enabling the voting algorithm based on local dominant orientations to obtain more consistent ray convergence points. Continuous acquisition at a fixed frame rate and resolution guarantees data consistency over time, providing a temporally stable observation sequence for subsequent Kalman filtering. Finally, high-quality image input achieves sub-pixel accuracy in vanishing point localization, significantly improving the accuracy of pitch angle estimation and providing fundamental assurance for reliable compensation under critical conditions.

[0044] S13, the visual image and the vehicle motion status signal are synchronously transmitted to the on-board industrial control computer, and a timestamp based on the system time of the industrial control computer is added to each frame of the image so that the images can be processed in chronological order later.

[0045] Specifically, image acquisition and signal reading belong to different hardware channels, each with its own independent transmission path and clock domain. Without a unified time base alignment, image frames and their corresponding vehicle states will be out of sync in time, rendering subsequent operations to adjust the region of interest based on motion state meaningless. By converging both types of data into the same processing unit and adding a unique timestamp to each image frame using a unified system time, the vehicle motion state corresponding to the instant of image acquisition can be precisely locked in the time dimension, establishing a reliable synchronization benchmark for subsequent timing processing and state estimation.

[0046] Specifically, the vehicle-mounted camera is physically connected to the industrial control computer via a USB interface or a GigE Vision interface and transmits the image data stream in real time. The CAN interface card connects to the industrial control computer via a PCIe or USB interface and continuously monitors and parses motion status messages on the vehicle bus. The industrial control computer runs a real-time data acquisition thread. When an image frame has been completely transferred to memory, the thread immediately reads the current system time (with microsecond or millisecond accuracy) as the timestamp of that frame and either embeds the timestamp into the image data structure or records it in a separate timestamp file. Simultaneously, the signal reading thread continuously caches signal data with CAN bus timestamps and performs nearest-neighbor or linear interpolation at the image timestamp to extract the vehicle speed and acceleration values that best match the time of that image frame, forming a synchronized data pair of image and signal. Finally, the synchronized data is written to shared memory or a circular buffer for subsequent processing modules to read in chronological order.
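The alignment step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `interpolate_signal` and `make_synced_pair`, the `(timestamp, value)` buffer layout, and the integer-millisecond time unit are all assumptions introduced here. It uses linear interpolation and falls back to the nearest sample when the image timestamp lies outside the buffered range.

```python
from bisect import bisect_left


def interpolate_signal(samples, t_image):
    """Linearly interpolate sorted (timestamp, value) CAN samples at t_image.

    Outside the buffered time range the nearest sample is returned
    (nearest-neighbour fallback). Buffer layout is a hypothetical choice.
    """
    times = [t for t, _ in samples]
    i = bisect_left(times, t_image)
    if i == 0:
        return samples[0][1]
    if i == len(samples):
        return samples[-1][1]
    (t0, v0), (t1, v1) = samples[i - 1], samples[i]
    if t1 == t0:
        return v1
    w = (t_image - t0) / (t1 - t0)
    return v0 + w * (v1 - v0)


def make_synced_pair(frame_id, t_image, speed_buf, ax_buf, ay_buf):
    """Bundle one image timestamp with the signal values aligned to it."""
    return {
        "frame_id": frame_id,
        "timestamp": t_image,
        "speed": interpolate_signal(speed_buf, t_image),
        "ax": interpolate_signal(ax_buf, t_image),
        "ay": interpolate_signal(ay_buf, t_image),
    }
```

With signals sampled at 100 Hz, an image timestamp that falls midway between two CAN samples receives the average of the two neighbouring values, which keeps the alignment error within the half-interval bound mentioned in the text.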

[0047] Furthermore, to achieve high-precision synchronization, this step sets strict quantitative indicators for timestamp marking and data alignment. The system time synchronization accuracy must reach the millisecond level, and the image timestamp recording delay should be less than one-tenth of the image acquisition interval (for example, no more than about 3 milliseconds for a 33-millisecond interval), to ensure that the timestamp accurately reflects the image exposure time. The signal interpolation time error should be kept within half of the signal sampling interval; for example, when the signal sampling frequency is 100 Hz (a 10-millisecond interval), the interpolation time error should not exceed 5 milliseconds. The end-to-end delay of synchronized data transmission should remain within 50 milliseconds to ensure that the compensation system can respond to vehicle dynamics in real time. The industrial control computer operating system needs to be configured in a real-time or high-precision timing mode, and the system time should be synchronized with an external time source via the NTP or PTP protocol to avoid accumulated clock drift.

[0048] Specifically, under normal driving conditions, the synchronized data stream ensures that the dynamic adjustment of the region of interest remains synchronized with the actual changes in vehicle attitude, avoiding regional lag caused by data delays. In critical situations such as emergency braking, the vehicle attitude can reach its peak pitch angle within hundreds of milliseconds. At this point, millisecond-level synchronization accuracy ensures that each frame of image accurately matches the vehicle speed and acceleration values at the moment of acquisition, enabling Kalman filtering to perform state prediction and updates based on time-consistent data, achieving real-time tracking of the pitch angle. In data playback and offline analysis scenarios, a unified timestamp system supports accurate accident reconstruction and algorithm debugging.

[0049] Specifically, on the one hand, precise alignment of the image and signal eliminates phase errors introduced by data asynchrony, enabling the dynamic region of interest to follow vehicle attitude changes in real time, significantly improving the tracking success rate of vanishing point detection during rapid pitch. On the other hand, a unified time reference provides a time-consistent observation sequence for Kalman filtering, ensuring the convergence and stability of state estimation and effectively suppressing estimation jitter that may be introduced by timing discrepancies. Ultimately, the synchronized data stream enables the compensation system to achieve precise compensation with millisecond-level response under critical conditions, providing crucial support for the reliable triggering and stable operation of intelligent driving active safety functions.

[0050] S2, Based on the vehicle motion state signal, determine the dynamic region of interest containing the road vanishing point in the visual image.

[0051] Specifically, there is a definite geometric mapping relationship between the vertical coordinate position of the road vanishing point on the image plane and the camera pitch angle, and the change in the camera pitch angle is directly governed by the suspension deformation caused by the longitudinal and lateral acceleration of the vehicle body. Therefore, by using the real-time acquired vehicle speed, longitudinal acceleration, and lateral acceleration signals, the current vehicle pitch state can be inferred, thereby predicting the possible location range of the vanishing point in the image. Using this prediction result to dynamically delineate the region of interest allows subsequent texture orientation estimation and voting calculations to focus on the local image region most likely to contain the vanishing point, thus achieving adaptive allocation of computing resources and directional enhancement of perception accuracy.

[0052] Specifically, firstly, the baseline ordinate of the horizon on the image plane in the initial calibration state of the vehicle is obtained as the initial position of the dynamic region of interest (ROI). Then, based on the real-time read vehicle speed, longitudinal acceleration, and lateral acceleration, the offset of the current vehicle pitch angle relative to the baseline state is calculated using a preset mapping model or lookup table. This offset is then converted into pixel displacement on the image plane to adjust the longitudinal position of the ROI. Simultaneously, the intensity of vehicle motion is assessed based on the vehicle speed and acceleration amplitude, and the longitudinal size range of the ROI is adaptively adjusted—the more intense the motion, the greater the longitudinal height of the region, to accommodate greater uncertainty in the vanishing point position. Finally, centered on the adjusted longitudinal position, a horizontal strip of preset width is defined in the image as the dynamic ROI of the current frame. This region covers the entire lateral range of the image, while the longitudinal range is dynamically variable.

[0053] Furthermore, the vertical position adjustment range covers the main portion of the image's vertical direction, typically set as a continuous interval from the lower-middle to the upper-middle of the image. The maximum adjustment amplitude corresponds to the pixel offset corresponding to the maximum possible pitch angle range of the vehicle (e.g., ±5°). The dynamic variation range of the region's vertical size is set to 0.8 to 2.0 times the reference height to accommodate different operating conditions from smooth driving to severe pitch. The region's horizontal width is fixed at the full width of the image to ensure that no possible vanishing point lateral position is lost. The region of interest update frequency is consistent with the image frame rate, ensuring that each frame generates a matching search region based on the latest motion state.

[0054] Specifically, in a smooth highway driving scenario, both longitudinal and lateral accelerations are close to zero, and the region of interest (ROI) remains stable near the horizon reference position with a small size. Only a narrow area near the horizon is precisely calculated for efficient processing. In an emergency braking scenario, longitudinal acceleration increases sharply. Based on this signal, the system rapidly shifts the ROI upwards and appropriately expands the longitudinal range to ensure effective coverage even after the vanishing point shifts upwards due to severe pitch. In continuous curves or emergency obstacle avoidance scenarios, the lateral acceleration signal triggers a scaling of the ROI size to compensate for the increased longitudinal fluctuation range of the vanishing point caused by roll. Through this dynamic adaptive mechanism, the ROI always follows the vehicle's attitude changes, ensuring that the vanishing point remains within the search range.

[0055] Specifically, firstly, computational efficiency is significantly improved—intensive processing is performed only on dynamically defined strip regions rather than the entire image, reducing the number of pixels involved in the calculation by more than 50%, enabling the algorithm to run in real time on an automotive embedded platform. Secondly, the robustness of vanishing point detection is greatly enhanced—adaptive translation and scaling of the dynamic region ensures that the vanishing point does not move out of the search range even under critical conditions of severe vehicle pitch, avoiding target loss due to the failure of fixed regions. Thirdly, anti-interference capability is improved—limiting the search range to a strip region near the horizon effectively eliminates the influence of nearby road texture interference and sky noise, improving the confidence of the voting results. Finally, the dynamic region of interest provides a precise input range for subsequent texture orientation estimation and voting calculation, laying a solid foundation for the real-time performance and accuracy of the entire pitch angle compensation method.

[0056] Furthermore, S2 includes: S21, Obtain the reference position of the horizon on the image plane in the initial calibration state of the vehicle, and use it as the longitudinal reference of the dynamic region of interest.

[0057] Specifically, under ideal conditions where the vehicle is stationary and in the calibrated load state, the camera's pitch angle relative to the horizontal road surface is a known fixed value. In this state, the projection onto the image plane of the road vanishing point at infinity, which lies on the horizon, is uniquely determined and forms the geometric reference for the vehicle's horizontal attitude. When the vehicle subsequently undergoes pitch motion, the longitudinal offset of the vanishing point relative to this reference directly reflects the magnitude of the pitch angle change. Therefore, pre-calibrating and storing this reference position is equivalent to establishing a zero-point reference for subsequent dynamic adjustments, providing a clear starting point and measurement basis for the translation and scaling of the region of interest based on motion state signals.
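The link between the horizon offset and the pitch change can be written out under an idealized pinhole camera model. This is a sketch under stated assumptions rather than a formula given in the source: the focal length in pixels $f_y$, the principal point row $c_y$, the symbols $v_0$, $v_h$, $\theta_0$, $\Delta\theta$, and the sign convention (image rows increase downward, nose-down pitch is positive) are all introduced here, and lens distortion and roll are neglected.

```latex
\begin{align}
v_h(\theta) &= c_y - f_y \tan\theta, \\
v_0 &= v_h(\theta_0) = c_y - f_y \tan\theta_0, \\
\Delta\theta &= \arctan\frac{c_y - v_h}{f_y} - \arctan\frac{c_y - v_0}{f_y}, \\
v_h - v_0 &\approx -f_y\,\Delta\theta \quad (\text{small } \theta_0,\ \Delta\theta).
\end{align}
```

For instance, with an assumed $f_y = 1000$ pixels, a nose-down pitch change of 1° moves the horizon up by roughly $1000 \tan 1^\circ \approx 17.5$ pixels.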

[0058] Specifically, initial calibration is performed on a level road surface before the vehicle leaves the factory or during the initial system installation. The vehicle is parked on a flat surface, ensuring the body is unloaded or under standard load, and the suspension system is in a static equilibrium position. The onboard camera is activated to acquire an image containing a complete view of the road. The position of the distant horizon in the image is identified manually or by an automatic algorithm, for example, by detecting the intersection of lane lines or a clear boundary between the road surface and the sky. The vertical coordinate value of this horizon in the image coordinate system is recorded and stored as a reference position in non-volatile memory. If the camera has autofocus or zoom capabilities, intrinsic parameters such as the current lens focal length must also be recorded in this state to ensure that the reference position is consistent with the optical parameters of subsequently acquired images. After calibration, this reference position will be used as a constant parameter for subsequent real-time processing modules.

[0059] Furthermore, to ensure the accuracy and stability of the benchmark position, this step sets clear quantitative indicators for the calibration process. The calibration site requires a longitudinal slope of less than 1% and a lateral slope of less than 0.5% to ensure the authenticity of the horizontal benchmark. During calibration, the vehicle speed is zero, the vehicle is parked, and the engine is idling to maintain normal power supply. The calibration accuracy of the horizon coordinate should reach the pixel level, typically requiring an error of no more than 2 pixels to support subsequent sub-pixel level pitch angle estimation. If an automatic calibration algorithm is used, the horizon position needs to be extracted from multiple consecutive frames of images and averaged to eliminate the influence of noise in single frames. The calibration results, along with camera intrinsic parameters, installation angle, and other data, need to be written into the configuration file for loading when the system starts.

[0060] Specifically, the reference position determined in this step is the original reference point for the entire dynamic region of interest (ROI) adjustment logic, and it is used throughout the entire vehicle lifecycle. Upon vehicle startup, the system first reads this reference value from memory as the initial value for the longitudinal position of the ROI in the current frame. During normal driving, ROI adjustments caused by motion signals are calculated based on this reference value, rather than relying on the position of the previous frame for recursion, thus avoiding the accumulation and propagation of errors. After vehicle maintenance, modification, or camera reinstallation, this calibration step can be repeated to update the reference position to adapt to changes in hardware status. In special scenarios such as heavy load or no load, the reference position still represents the standard reference for the vehicle's horizontal attitude; all dynamic adjustments use this as the origin, ensuring consistency in attitude estimation under different load conditions.

[0061] Specifically, by pre-calibrating and storing the horizon reference position, this step establishes a precise and stable reference benchmark for the entire pitch angle compensation method. First, this benchmark eliminates individual differences in camera mounting position and angle, allowing the same algorithm model to adapt to different vehicle hardware configurations, thus improving the method's versatility. Second, using a fixed benchmark as the origin for dynamic adjustment avoids the cumulative errors that might be introduced by recursive methods, ensuring the accuracy of region of interest (ROI) positioning during long-term operation. Third, the pixel-level accuracy of the benchmark position provides a reliable initial zero point for subsequent pitch angle inversion based on the vanishing point's ordinate, guaranteeing the absolute accuracy of pitch angle estimation. Finally, the combination of this static benchmark and dynamic motion state signals constitutes a dynamic-static combined ROI adaptive adjustment mechanism, significantly improving the adaptability and robustness of the entire compensation method under different operating conditions.

[0062] S22, based on the real-time acquired vehicle speed, longitudinal acceleration, and lateral acceleration signals, dynamically adjust the longitudinal position and size range of the dynamic region of interest in the image; wherein the longitudinal position is adaptively translated according to the change of the vehicle's pitch attitude, and the size range is adaptively scaled according to the intensity of the vehicle's movement.

[0063] Specifically, during vehicle operation, longitudinal acceleration directly reflects the pitch moment caused by braking or acceleration, which changes the vehicle's pitch angle about the lateral axis and moves the vanishing point up and down along the longitudinal (vertical) direction of the image. Lateral acceleration reflects the roll moment caused by steering, which indirectly affects the pitch angle through suspension coupling; roll itself also alters the longitudinal projection position of the vanishing point. Vehicle speed, as a motion reference, influences the rate of change of acceleration and the dynamic characteristics of the suspension response. Therefore, by using these signals as input, the offset of the current frame's vanishing point relative to the reference position and its possible fluctuation range can be inferred in real time. Based on this, longitudinal translation compensation can be applied to the region of interest, and the longitudinal size of the region can be scaled to accommodate uncertainty, ensuring that the vanishing point is always effectively covered.

[0064] Specifically, a mapping model is first established from motion state signals to adjustments in the region of interest (ROI). For longitudinal position adjustment, the instantaneous values and rates of change of longitudinal and lateral acceleration are input into a preset pitch angle estimation model to calculate the pitch angle increment of the vehicle body relative to the calibration state. This angle increment is then converted into pixel offsets on the image plane using camera intrinsic parameters, thereby determining the longitudinal translation of the ROI. For size range adjustment, the intensity of the current motion is assessed based on a comprehensive index of vehicle speed absolute value, acceleration amplitude, and acceleration derivative. A size scaling factor is dynamically calculated using a preset scaling function; for example, the more intense the motion, the larger the scaling factor, resulting in a corresponding increase in the longitudinal height of the region. Finally, the translated position is used as the center of the region, and the reference height multiplied by the scaling factor is used as the longitudinal half-height of the region, thus defining the dynamically adjusted ROI in the image.
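The translation part of this mapping can be illustrated with a short sketch. The source does not disclose the preset pitch angle estimation model, so the linear gain used here (`gain_deg_per_ms2`) is purely hypothetical, it uses longitudinal acceleration only, and the function names and sign convention (nose-down pitch positive, image rows increasing downward) are assumptions. The pixel conversion assumes an ideal pinhole camera with focal length `fy_px` in pixels.

```python
import math


def pitch_down_deg(ax, gain_deg_per_ms2=0.3):
    """Hypothetical quasi-static model: braking (ax < 0) gives nose-down pitch.

    The linear gain is a placeholder for the unspecified preset model.
    """
    return -gain_deg_per_ms2 * ax


def roi_center_row(v_ref, pitch_deg, fy_px, img_h):
    """Shift the calibrated horizon row by the offset implied by the pitch.

    Nose-down pitch moves the horizon up in the image (smaller row index).
    The result is clamped to the valid row range of the image.
    """
    dv = -fy_px * math.tan(math.radians(pitch_deg))
    return min(max(v_ref + dv, 0.0), img_h - 1.0)
```

Because the center row is always computed from the calibrated reference `v_ref` rather than from the previous frame's position, errors do not accumulate from frame to frame.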

[0065] Furthermore, the response delay for longitudinal position adjustment should be less than one image frame interval (e.g., 33 milliseconds) to ensure that translational movements are synchronized with the actual pitch of the vehicle. The maximum translation range covers the pixel range corresponding to the vehicle's maximum possible pitch angle (e.g., ±6°), typically 20% to 30% of the image's longitudinal dimension. The dynamic range of the scaling factor is set to 0.8 to 2.0, meaning the longitudinal height of the region of interest can continuously vary between 80% and 200% of the baseline value. The mapping relationship between the scaling factor and the intensity of motion can use a linear or non-linear function, and a dead zone threshold is set to avoid frequent scaling caused by minor perturbations. The calculation and update frequency of the adjustment amount is consistent with the image frame rate, ensuring that each frame generates an adapted region of interest based on the latest signal.
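One possible form of the scaling function is sketched below. The weights in `motion_intensity`, the saturation level `i_max`, and the interpretation of the dead zone as "hold the previous factor when the change would be small" are all assumptions made for illustration; only the 0.8 to 2.0 range comes from the text.

```python
import math


def motion_intensity(speed, ax, ay, jerk, w_speed=0.005, w_acc=0.1, w_jerk=0.02):
    """Hypothetical composite index: |speed|, acceleration magnitude, |jerk|."""
    return w_speed * abs(speed) + w_acc * math.hypot(ax, ay) + w_jerk * abs(jerk)


def scale_factor(intensity, prev_scale, i_max=1.0, s_min=0.8, s_max=2.0,
                 dead_zone=0.05):
    """Map motion intensity linearly onto [s_min, s_max] with a dead zone.

    If the new target differs from the previous factor by less than
    dead_zone, the previous factor is kept to avoid frequent rescaling.
    """
    ratio = min(max(intensity / i_max, 0.0), 1.0)
    target = s_min + (s_max - s_min) * ratio
    if abs(target - prev_scale) < dead_zone:
        return prev_scale
    return target
```

With these placeholder weights, steady cruising at 30 m/s leaves a previous factor of 1.0 unchanged, while hard braking saturates the factor at 2.0. A non-linear mapping could be substituted without changing the interface.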

[0066] Specifically, in a smooth cruise scenario on a highway, both longitudinal and lateral accelerations are close to zero, the scaling factor approaches 1.0, and the region of interest (ROI) remains stable near the reference position, requiring only minor adjustments to accommodate road surface undulations. In a scenario with frequent acceleration and deceleration on urban roads, longitudinal acceleration alternates, and the ROI rapidly shifts up and down with each braking and acceleration, ensuring the vanishing point remains within the region. Under extreme conditions such as emergency braking, longitudinal acceleration increases sharply to its peak value, and the system quickly shifts the ROI upwards by a significant amount, while the scaling factor increases to over 1.5 times to accommodate the large-scale movement and uncertainty of the vanishing point during rapid pitch. In a continuous emergency obstacle avoidance scenario, lateral acceleration fluctuates violently, and the scaling factor increases accordingly, providing sufficient margin for longitudinal fluctuations in the vanishing point caused by roll.

[0067] Specifically, regarding vertical position adjustment, the translation mechanism eliminates the vanishing point position shift caused by pitch angle changes, ensuring that the region of interest always moves synchronously with the vanishing point. This avoids target loss due to a fixed region and significantly improves the continuity of vanishing point tracking. Regarding size range adjustment, the scaling mechanism dynamically adjusts the vertical height of the region based on the intensity of motion. Under stable conditions, it maintains a small region to reduce computational load, while under critical conditions, it expands the region to enhance robustness, achieving a dynamic balance between computational efficiency and detection reliability. Ultimately, this dynamic and static adaptive adjustment strategy ensures that subsequent texture orientation estimation and voting calculations always focus on the optimal image subspace, providing a core guarantee for the stable operation of the entire pitch angle compensation method under various conditions.

[0068] S23, using the dynamically adjusted longitudinal position as the center, define a horizontal strip-shaped region of preset width in the image as the dynamic region of interest; local dominant direction estimation is subsequently performed only on pixels within this region to reduce the computational load.

[0069] Specifically, the vanishing point, as the theoretical intersection of all parallel texture direction extensions in the image, has its vertical coordinate dominated by the vehicle's pitch angle, while its horizontal coordinate is mainly determined by the relative relationship between the vehicle's direction of travel and the road direction, showing limited change when the vehicle is traveling straight or with a small curvature. Therefore, limiting the search range to a strip-shaped area covering the full width of the image, but with a dynamically variable and relatively narrow vertical range, can both fully preserve all possible lateral positions of the vanishing point and compress the vertical search space to a limited range matching the current vehicle posture, thus achieving intensive allocation of computational resources without sacrificing detection accuracy.

[0070] Specifically, the system first obtains the longitudinal center coordinate produced by the dynamic adjustment in the previous step. This coordinate, in pixels, gives the row position of the horizontal center line of the current frame's region of interest on the vertical axis of the image. Then, using this position as a reference, a preset half-height is extended upwards and downwards to form a rectangular strip region. The horizontal extent of this region covers the entire width of the image, from the leftmost column (column 0) to the rightmost column; the vertical extent is obtained by subtracting the half-height from, and adding it to, the center position, forming a horizontal strip whose height is fixed or dynamically variable (depending on the scaling factor). After delineation, the system generates a binary mask corresponding to this strip region or directly specifies a range of pixel indices. Subsequent steps perform intensive calculations such as Gabor convolution and orientation estimation only on the pixels within this region; pixels outside the region are skipped and do not participate in any processing.
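The index-range variant of the strip delineation can be sketched in a few lines. The function names and the choice of a half-open row range are assumptions for illustration; the strip always spans the full image width, so only the row limits need to be computed.

```python
def strip_rows(img_h, center_row, half_height):
    """Return (top, bottom) row indices of the strip, bottom exclusive.

    The range is clipped to the image so that a strip near the border
    is shortened rather than rejected.
    """
    top = max(int(round(center_row - half_height)), 0)
    bottom = min(int(round(center_row + half_height)), img_h)
    if bottom <= top:
        raise ValueError("strip lies entirely outside the image")
    return top, bottom


def crop_strip(image, top, bottom):
    """Keep only the rows inside the strip; all columns are retained."""
    return image[top:bottom]
```

Row slicing works the same way on a list of rows and on a two-dimensional array, so later stages can operate directly on the cropped strip.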

[0071] Furthermore, the baseline width (vertical height) of the strip region is determined during initial calibration, typically set to 15% to 25% of the total image height. For example, for a 1080p image, the baseline height is approximately 160 to 270 pixels. Under the dynamic scaling mechanism, the actual height can vary between 80% and 200% of the baseline value, meaning the minimum height is approximately 12% of the total image height and the maximum height is approximately 50%, adapting to different driving conditions from smooth travel to severe pitch. The horizontal range of the strip region is fixed at the full width of the image, ensuring that no possible vanishing point horizontal position is lost. Through these parameter settings, the number of pixels participating in subsequent calculations can typically be controlled between 15% and 30% of the entire image, significantly reducing the computational load.
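The height figures quoted above follow from multiplying the baseline fraction by the scaling range. As a worked check, using symbols introduced here for illustration ($H$ for the image height, $h$ for the strip height, $r$ for the baseline fraction, $s$ for the scaling factor):

```latex
\begin{align}
h &= s\, r\, H, \qquad r \in [0.15,\ 0.25],\quad s \in [0.8,\ 2.0], \\
h_{\min} &= 0.8 \times 0.15\,H = 0.12\,H \approx 130\ \text{px for } H = 1080, \\
h_{\max} &= 2.0 \times 0.25\,H = 0.50\,H = 540\ \text{px for } H = 1080.
\end{align}
```

Because the strip spans the full image width, the share of pixels processed equals $h/H$, which lies between 12% and 50% at the extremes of these ranges.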

[0072] Specifically, under straight-line driving conditions on highways, due to the stable vehicle posture, the longitudinal height of the strip region remains at the baseline value, occupying only a small proportion of the image. The system only needs to process a limited number of pixels to complete the vanishing point detection, achieving efficient operation. Under frequent acceleration and deceleration conditions on urban roads, the strip region dynamically adjusts its longitudinal position and height according to the motion state, always confining the search range within the image strip where the vanishing point is most likely to appear, ensuring that even when the longitudinal position changes rapidly, the vanishing point will not escape the processing area. Under critical conditions such as emergency braking or high-speed obstacle avoidance, the longitudinal height of the strip region may expand to twice the baseline value. Although the number of pixels processed increases, adaptive scaling ensures that the vanishing point is always included, while still avoiding full-image processing and maintaining the real-time feasibility of the algorithm.

[0073] Specifically, in terms of computational efficiency, since dense processing is performed only on a strip-shaped region rather than the entire image, the number of pixels involved in the computation is significantly reduced. This allows complex Gabor convolution and voting calculations to run in real time on the automotive embedded platform at a speed of over 30 frames per second, providing sufficient frame rate support for pitch angle compensation. Regarding detection reliability, limiting the search range to a strip-shaped region near the horizon naturally eliminates interference from irrelevant areas such as nearby road surface textures, sky clouds, and roadside buildings. This allows texture orientation estimation and voting calculations to focus on image features most likely pointing to the vanishing point, significantly improving the signal-to-noise ratio and accuracy of vanishing point localization. Ultimately, this spatial cropping strategy, combined with the dynamic position adjustment mechanism, constitutes an efficient, accurate, and robust region of interest processing framework, laying a crucial foundation for the engineering implementation of the entire pitch angle compensation method.

[0074] S3, perform local dominant direction estimation on the pixels in the dynamic region of interest to obtain the texture dominant direction corresponding to each pixel.

[0075] Specifically, structured textures in road images, such as lane lines, curbs, and road surface cracks, exhibit distinct directional features within their local neighborhoods. Furthermore, the extension directions of these textures, under perspective projection constraints, all point towards the road vanishing point. By analyzing the spatial distribution of grayscale or gradients within the neighborhood of each pixel, the direction with the highest energy concentration is identified as the dominant texture direction at that point, thus constructing a vector field pointing from the pixel to the vanishing point. This process essentially involves extracting the implicit perspective geometry information from the image point by point, providing directional data support for subsequent vanishing point localization based on the voting principle.

[0076] Specifically, for each pixel within the dynamic region of interest, convolution operations are first performed using Gabor filters with multiple preset directional angles. A Gabor filter captures spatial location and frequency direction information simultaneously; its kernel function is a sinusoidal plane wave modulated by a Gaussian envelope, so it responds selectively to textures in a specific direction. For each pixel, the energy response of the filter in each direction is calculated to obtain the response intensity of that pixel in each direction. To enhance the accuracy and robustness of direction selection, the preset directions are divided into several pairs of orthogonal directions. A competition suppression operation is performed on the two response values within each pair to highlight the dominant direction and suppress interference from the orthogonal direction. The angle corresponding to the maximum significant intensity among all direction pairs is selected as the initial dominant direction for that pixel. Finally, a vector synthesis method is used to optimize the neighborhood consistency of the initial direction, eliminating estimation noise from isolated pixels and yielding a stable and reliable estimate of the texture dominant direction.

[0077] Furthermore, to achieve high-precision orientation estimation, this step sets explicit quantitative indicators for the algorithm parameters. The Gabor filter bank has at least four preset orientations, typically the four base orientations 0°, 45°, 90°, and 135° (which form two orthogonal pairs), and the number can be increased to eight to improve angular resolution. The filter wavelength is set according to the image resolution and road texture scale; for example, in a 1080p image, the wavelength is set to 8 to 16 pixels. The standard deviation of the Gaussian envelope determines the spatial support region of the filter and is typically set to half the wavelength. In the competition suppression operation, a threshold is applied when calculating the significant intensity in order to eliminate low-confidence responses, for example by retaining only orientations whose response exceeds 30% of the global maximum response value. The neighborhood window size during vector synthesis is typically set to 5×5 or 7×7 pixels to maintain orientation consistency while avoiding excessive smoothing. The final output texture-dominant orientation angle accuracy should be within 5° to meet the convergence accuracy requirements of subsequent voting calculations.

[0078] Specifically, in well-structured highway scenarios, clear lane lines provide strong directional features for direction estimation, and the Gabor filter can stably output the dominant direction consistent with the lane lines. In urban road scenarios, the road surface may contain various textures such as zebra crossings, arrow markings, and road text. The direction estimation algorithm uses a competitive suppression mechanism to filter out textures consistent with the road's extension direction, suppressing interference from lateral or random textures. In scenarios with drastic lighting changes or shadow coverage, the bandpass characteristics of the Gabor filter are robust to lighting changes and can extract directional information based on local contrast. In extreme scenarios with blurred road textures or covered by ice and snow, even if lane lines are not visible, the granular texture of the road surface itself can still provide weak directionality. Through vector synthesis and neighborhood consistency optimization, the algorithm can still maintain a certain direction estimation capability, providing a basic support for vanishing point detection.

[0079] Specifically, firstly, multi-directional convolution based on Gabor filters can accurately capture the local directional features of the texture, with the angular resolution of the response peak reaching within a few degrees, providing high-precision directional input for subsequent voting calculations. Secondly, the orthogonal competition suppression mechanism effectively suppresses interference from non-dominant directions, such as lateral shadows on roads or vehicle shadows, significantly improving the signal-to-noise ratio of direction estimation. Thirdly, vector synthesis and neighborhood consistency optimization eliminate estimation noise from isolated pixels, ensuring the spatial continuity and smoothness of the direction field, which better conforms to the actual distribution patterns of road textures. Finally, the high-quality direction estimation results lay a solid data foundation for voting-based vanishing point localization, enabling the vanishing point to be stably and accurately identified in complex texture environments, providing a reliable source of geometric constraints for the entire pitch angle compensation method.

[0080] Furthermore, S3 includes: S31, for each pixel in the dynamic region of interest, convolution operation is performed using Gabor filters with at least four preset directional angles to obtain the Gabor energy response value corresponding to each direction.

[0081] Specifically, a multi-directional Gabor filter bank is used for convolution operations on each pixel within the dynamic region of interest. This approach simulates the selective response of simple cells in the mammalian visual cortex to textures in specific directions. The Gabor filter, formed by modulating a complex sinusoidal plane wave with a Gaussian envelope function, achieves optimal joint localization in the spatial and frequency domains and therefore has excellent texture direction and frequency selectivity. When the filter direction aligns with the local texture direction of the image, the convolution output response reaches its maximum; otherwise, the response is significantly suppressed. By setting a filter bank covering the dominant directions, the local extension directions of various structured textures (such as lane lines, curbs, and road surface cracks) in road images can be comprehensively detected, providing raw response data for subsequent extraction of the dominant texture direction.

[0082] Specifically, a set of two-dimensional Gabor filter kernel functions with preset orientation angles is first constructed. These orientation angles are typically uniformly distributed within the range of 0 to π, for example the four basic directions 0°, 45°, 90°, and 135°. For each pixel within the dynamic region of interest, a neighborhood window of a preset size is taken centered on that pixel, and a two-dimensional convolution is performed with the filter kernel of each orientation. The convolution result is a complex number whose real and imaginary parts correspond to the even-symmetric and odd-symmetric responses of the filter, respectively. To obtain a scalar value proportional to the directional intensity, the modulus of this complex number is calculated as the Gabor energy response value, reflecting the saliency of the texture in that direction within the current pixel's neighborhood. The convolution operation can be accelerated using a Fast Fourier Transform or implemented using separable filtering to meet real-time processing requirements. Finally, each pixel has an energy response vector whose length equals the number of orientation angles.

[0083] Furthermore, to ensure the accuracy and robustness of direction estimation, this step sets clear quantitative indicators for various parameters of the Gabor filter. The preset number of direction angles is at least four, and in practical applications, it can be increased to eight (e.g., 0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135°, 157.5°) to improve angular resolution. The filter wavelength λ is selected based on the image resolution and road texture scale; for 1080p images, a typical range is 8–16 pixels. The standard deviation σ of the Gaussian envelope is usually set to 0.5λ–1.0λ to balance direction selectivity and spatial positioning accuracy. The filter kernel size is set to a rectangular window containing 2–3 wavelengths; for example, for λ=8 pixels, the kernel size can be 15×15 or 21×21 pixels. During convolution calculation, the gray values of the pixel neighborhood need to be normalized to suppress the influence of illumination differences.
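The construction in S31 can be illustrated with a minimal NumPy sketch. It is not the applicant's implementation: the function names `gabor_kernel` and `gabor_energy`, the angle convention (radians from the image x axis), the mean subtraction of the kernel, and the global rather than per-neighborhood gray-value normalization are all assumptions made for illustration. The parameter defaults (λ = 8 pixels, σ = 0.5λ, 15×15 kernel) are taken from the ranges stated above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gabor_kernel(theta, wavelength=8.0, sigma=None, size=15):
    """Complex Gabor kernel tuned to texture that extends along direction theta.

    theta is in radians, measured from the image x axis (an assumed convention).
    """
    sigma = 0.5 * wavelength if sigma is None else sigma
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # The plane wave varies across the texture, i.e. perpendicular to theta.
    u = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.exp(2j * np.pi * u / wavelength)
    # Assumption: subtract the mean so a uniformly bright patch gives zero response.
    return kernel - kernel.mean()


def gabor_energy(image, thetas, wavelength=8.0, size=15):
    """Return an array of shape (len(thetas), H, W) with the Gabor energy per direction."""
    img = image.astype(float)
    # Assumption: one global normalization instead of a per-neighborhood one.
    img = (img - img.mean()) / (img.std() + 1e-9)
    half = size // 2
    padded = np.pad(img, half, mode="reflect")
    windows = sliding_window_view(padded, (size, size))  # (H, W, size, size)
    energy = np.empty((len(thetas),) + img.shape)
    for k, theta in enumerate(thetas):
        kernel = gabor_kernel(theta, wavelength, size=size)
        # Correlation with the complex kernel; for a real image its modulus
        # equals the modulus of the corresponding convolution.
        response = np.einsum("hwij,ij->hw", windows, kernel)
        energy[k] = np.abs(response)
    return energy
```

The sliding-window product is chosen for readability on a narrow strip; an FFT-based or separable implementation, as the text mentions, would be the natural choice when real-time throughput matters.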

[0084] Specifically, in well-structured highway scenes, clear lane lines generate strong energy responses in their corresponding directions (typically close to 0° or small angles), while the response perpendicular to the lane lines is weak, creating a stark directional contrast. In urban road scenes, the road surface may contain multi-directional textures such as zebra crossings and arrow markings. Gabor filter banks can simultaneously capture energy from different directions, providing complete response information for subsequent competitive suppression. Under conditions of shadows, rain, snow, or abrupt changes in illumination, the bandpass characteristics of Gabor filters allow them to tolerate local contrast changes to a certain extent, still extracting directional information from weak textures. Even in cases with sparse road surface textures, a particular direction may still dominate among the multi-directional responses, ensuring the continuity of direction estimation.

[0085] Specifically, the multi-directional energy response comprehensively covers all possible directions of road texture, avoiding misjudgment of dominant directions due to insufficient direction sampling. The inherent noise resistance and direction selectivity of the Gabor filter enable the energy response to effectively suppress random noise and interference from irrelevant textures, improving the signal-to-noise ratio of direction information. Pixel-by-pixel calculation ensures that the spatial resolution of direction estimation reaches the pixel level, providing a fine local description for subsequent competition suppression and vector synthesis. Finally, the high-fidelity direction energy response map output by this step ensures the accuracy of subsequent texture-dominant direction estimation, thereby supporting the stable performance of vanishing point voting calculation in complex road environments.

[0086] S32, divide the at least four preset direction angles into several groups of orthogonal direction pairs, perform a competition suppression operation on the two energy response values in each group of orthogonal direction pairs, and obtain the significant intensity of the pixel in the current direction pair.

[0087] Specifically, the at least four preset direction angles are divided into several pairs of orthogonal directions and competitive suppression operations are performed. This principle originates from the lateral inhibition mechanism in visual perception, where adjacent neurons enhance contrast and highlight edges through mutual inhibition. In image texture analysis, orthogonal directions (i.e., directions separated by 90°) are naturally mutually exclusive, because a local patch of real road texture generally does not have two mutually perpendicular dominant directions at the same time. By dividing the preset directions into orthogonal pairs and performing competitive suppression operations on the two energy response values within each pair, the response of the true dominant direction can be strengthened while suppressing interfering responses in the orthogonal direction. This process is essentially a nonlinear feature enhancement operation that can significantly improve the signal-to-noise ratio of direction estimation, making subsequent saliency intensity calculations more focused on the true texture direction.

[0088] Specifically, the at least four preset orientation angles used in step S31 are first grouped according to the principle of pairwise orthogonality. For example, if the preset orientations are 0°, 45°, 90°, and 135°, two orthogonal pairs are formed: (0°, 90°) and (45°, 135°). For each pixel in the dynamic region of interest, its Gabor energy response values in both directions of each orthogonal pair are obtained. For each orthogonal pair, a competitive suppression function is used to calculate the salient intensity of the pixel in that orientation pair. A typical competitive suppression operation uses a nonlinear function of the winner-take-all type: for example, the salient intensity can be defined as the difference between the two response values passed through a nonlinear activation function, or as the larger response value minus the product of the smaller response value and a suppression coefficient. The suppression coefficient can be set empirically to balance the enhancement of the dominant orientation against the suppression of the orthogonal orientation. Finally, each pixel has as many salient intensity values as there are orthogonal pairs.

[0089] Furthermore, the grouping of orthogonal direction pairs must strictly satisfy the condition of an included angle of 90°±1° to ensure the geometric rationality of the suppression mechanism. The suppression coefficient is usually set in the range of 0.3 to 0.7. The larger the suppression coefficient, the stronger the suppression of the orthogonal direction, but an excessively large suppression coefficient may cause the dominant direction itself to be attenuated. The output of the competitive suppression function needs to be normalized, for example by compressing the salient intensity value to the [0,1] interval, to facilitate subsequent comparison across groups and maximum value selection. For low-contrast or weakly textured regions, an energy response threshold can be set. When the response values of both directions are below this threshold, the salient intensity is directly set to zero to avoid amplifying noise.
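The second variant named in S32 (larger response minus the suppression coefficient times the smaller response) can be sketched as follows. This is only one reading of the description: the function names, the relative threshold of 30% of the global maximum (borrowed from the example given earlier for low-confidence responses), and the normalization by the global peak are assumptions.

```python
import numpy as np


def orthogonal_pairs(thetas_deg, tol=1.0):
    """Index pairs (i, j) whose directions differ by 90 degrees within +/- tol."""
    pairs = []
    for i, a in enumerate(thetas_deg):
        for j, b in enumerate(thetas_deg):
            if i < j and abs(((b - a) % 180.0) - 90.0) <= tol:
                pairs.append((i, j))
    return pairs


def competitive_suppression(energy, thetas_deg, k=0.5, rel_threshold=0.3):
    """energy has shape (n_dirs, H, W).

    Returns (salience, winner), each of shape (n_pairs, H, W): the normalized
    salient intensity of every orthogonal pair and the angle (degrees) of the
    stronger direction inside that pair.
    """
    thetas = np.asarray(thetas_deg, dtype=float)
    pairs = orthogonal_pairs(thetas)
    floor = rel_threshold * energy.max()  # assumed form of the energy threshold
    salience = np.zeros((len(pairs),) + energy.shape[1:])
    winner = np.zeros_like(salience)
    for p, (i, j) in enumerate(pairs):
        a, b = energy[i], energy[j]
        strong, weak = np.maximum(a, b), np.minimum(a, b)
        s = np.clip(strong - k * weak, 0.0, None)
        s[strong < floor] = 0.0  # both responses below the threshold
        salience[p] = s
        winner[p] = np.where(a >= b, thetas[i], thetas[j])
    peak = salience.max()
    if peak > 0:
        salience /= peak  # compress to [0, 1]
    return salience, winner


def initial_direction(salience, winner):
    """Per pixel, pick the angle of the pair with the largest salient intensity."""
    best = salience.argmax(axis=0)
    angle = np.take_along_axis(winner, best[None], axis=0)[0]
    return angle, salience.max(axis=0)
```

`initial_direction` corresponds to the first half of S33; pixels whose returned strength is zero carry no usable direction and would be masked out before voting.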

[0090] Specifically, in well-structured highway scenes, lane lines typically have a direction close to 0°, with extremely low response values in their orthogonal 90° direction. After competition suppression, the significant intensity of the 0° direction is further highlighted, while the 90° direction is effectively suppressed, making the dominant direction clearer. In urban road scenes, the road surface may have textures in multiple directions, such as zebra crossings (90° direction) and lane lines (0° direction) coexisting. The competition suppression mechanism forces pixels to choose between two orthogonal directions, thus avoiding the blurring caused by a pixel responding to multiple directions simultaneously. In shadow or unevenly lit areas, the Gabor energy response may be generally low. The competition suppression operation, through relative comparison rather than absolute thresholding, can still extract the relatively dominant direction from the weak response, maintaining the continuity of direction estimation.

[0091] Specifically, this step achieves sharpening and enhancement of directional information through the partitioning of orthogonal direction pairs and competition suppression operations. First, the competition suppression mechanism effectively eliminates interference responses in orthogonal directions, significantly reducing potential directional ambiguity and making the directional features of each pixel purer. Second, this non-linear processing enhances the contrast between the truly dominant and secondary directions, making the dominant direction more advantageous when selecting the maximum saliency intensity, reducing the probability of misselection. Third, by calculating saliency intensity in groups, information from multiple candidate directions is preserved, providing intermediate data for subsequent multi-directional comprehensive judgment. Finally, the saliency intensity map after competition suppression has a higher signal-to-noise ratio and clearer directional boundaries, laying a higher-quality data foundation for subsequent selection of the dominant texture direction and significantly improving the anti-interference capability and angle accuracy of the entire direction estimation process.

[0092] S33, select the angle corresponding to the maximum value of the significant intensity in all direction pairs as the initial dominant direction of the pixel, and use the vector synthesis method to optimize the neighborhood consistency of the initial dominant direction to obtain the final estimated texture dominant direction.

[0093] Specifically, the angle corresponding to the maximum significant intensity among all directional pairs is selected as the initial dominant direction, essentially making direction decisions at the pixel level based on a winner-takes-all mechanism. Since each local neighborhood in a road image typically has only one dominant texture extension direction, this mechanism can filter out the most representative direction candidates from multi-directional responses. However, due to factors such as image noise, texture interruptions, or uneven illumination, the initial dominant direction of individual pixels may deviate from the true texture direction. To address this, a vector synthesis method is introduced for neighborhood consistency optimization. This method utilizes the assumption of spatial continuity of texture directions, jointly estimating the directions of multiple pixels within the neighborhood to suppress isolated outliers. This allows the final direction field to achieve a smooth transition while preserving local details, more accurately reflecting the overall geometric structure of the road texture.

[0094] Specifically, each pixel within the dynamic region of interest is first traversed. From the saliency intensities of the orthogonal pairs calculated in step S32, the maximum value is identified and its corresponding orientation angle is recorded as the initial dominant orientation of that pixel. Then, the initial dominant orientation is vectorized, converting each angle into a unit orientation vector (e.g., in two-dimensional Cartesian coordinates). A neighborhood window of a preset size (e.g., 5×5 or 7×7 pixels) is defined centered on the current pixel, and the orientation vectors of all valid pixels within the window are averaged with weights. The weights can follow a Gaussian function or the inverse of distance, so that the central pixel contributes most and distant pixels contribute less. The magnitude and orientation angle of the weighted average vector are then calculated. If the magnitude exceeds a preset threshold (indicating high orientation consistency), the angle of the average vector is used as the final estimated dominant texture orientation for that pixel. If the magnitude is below the threshold, the pixel is treated as low-confidence: it can either be excluded from subsequent voting or keep its initial orientation.

[0095] Furthermore, the neighborhood window size is determined based on the image resolution and road texture scale, typically ranging from 5×5 to 9×9 pixels. A window that is too small makes noise removal difficult, while a window that is too large may over-smooth details. The standard deviation of the Gaussian weighted average is usually set to 0.3 to 0.5 times the window radius, making the center pixel significantly more weighted than the edge pixels. The threshold for the magnitude after vector averaging is set to 0.3 to 0.5. When the magnitude is below this value, it indicates poor directional consistency within the neighborhood, possibly located in a blurred texture or boundary region. In this case, the initial direction can be retained or invalidated. The angle output accuracy is required to be within 1° to meet the sub-pixel convergence requirements of subsequent vanishing point voting.
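A minimal sketch of the neighborhood consistency step is given below. The source only states that each angle is converted into a unit orientation vector. Because a texture orientation is defined modulo 180°, this sketch averages the vectors (cos 2θ, sin 2θ) and halves the resulting angle, which is a common way to average orientations and is an assumption here, not something the text specifies. The function name and the default values (5×5 window, Gaussian σ of 0.4 times the window radius, magnitude threshold 0.4) are illustrative picks from the stated ranges.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def smooth_orientation(theta, valid, window=5, sigma_ratio=0.4, min_magnitude=0.4):
    """Gaussian-weighted vector averaging of an orientation field.

    theta: (H, W) angles in radians within [0, pi).
    valid: (H, W) boolean mask of pixels that carry a usable direction.
    Returns (smoothed angle, mean-vector magnitude, confidence mask).
    """
    r = window // 2
    sigma = sigma_ratio * r
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    weights = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))

    def weighted_sum(a):
        padded = np.pad(a, r, mode="constant")
        views = sliding_window_view(padded, (window, window))
        return np.einsum("hwij,ij->hw", views, weights)

    # Doubled-angle unit vectors; invalid pixels contribute nothing.
    vx = np.where(valid, np.cos(2.0 * theta), 0.0)
    vy = np.where(valid, np.sin(2.0 * theta), 0.0)
    norm = weighted_sum(valid.astype(float))
    safe = np.where(norm > 0, norm, 1.0)
    mx, my = weighted_sum(vx) / safe, weighted_sum(vy) / safe

    magnitude = np.hypot(mx, my)  # 1.0 means a perfectly consistent window
    smoothed = (0.5 * np.arctan2(my, mx)) % np.pi
    confident = (magnitude >= min_magnitude) & (norm > 0)
    return smoothed, magnitude, confident
```

With these defaults, a single pixel whose direction is perpendicular to an otherwise uniform neighborhood is pulled back to the neighborhood direction with a mean-vector magnitude of about 0.5, so it still passes the 0.4 threshold.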

[0096] Specifically, in well-structured highway scenes, neighborhood consistency further enhances the smoothness of the orientation field, ensuring continuous and stable lane line orientations and providing a highly consistent vector field for vanishing point voting. In complex textured areas of urban roads, such as zebra crossings and road markings, initial orientations may exhibit local abrupt changes. Neighborhood optimization effectively integrates the dominant orientations of surrounding lane lines, avoiding interference from lateral textures and making the orientation field closer to the actual road direction. Under low-contrast conditions such as varying lighting, shadows, or slippery surfaces in rainy weather, the orientation estimation of individual pixels is susceptible to noise. Neighborhood vector synthesis can utilize information from surrounding reliable pixels for correction, maintaining the overall reliability of the orientation field. When critical conditions cause slight motion blur in the image, neighborhood optimization helps stabilize the orientation field and reduce inter-frame jitter.

[0097] Specifically, firstly, the winner-takes-all mechanism ensures that each pixel obtains a clear initial orientation, avoiding multi-directional blurring. Secondly, the vector synthesis method significantly suppresses orientation jumps caused by isolated noise points and local texture anomalies, making the orientation field spatially continuous and smooth, which is more in line with the physical laws of road perspective geometry. The final output texture-dominated orientation has a higher signal-to-noise ratio and stronger anti-interference ability, providing accurate and stable orientation input for subsequent voting-based vanishing point localization, directly improving the repeatability and accuracy of vanishing point coordinates, and thus ensuring the robustness and reliability of pitch angle estimation under critical conditions.

[0098] S4, based on the dominant texture direction of each pixel, perform a voting calculation to determine the location of the road vanishing point in the visual image.

[0099] Specifically, in a real 3D scene, the projected extensions of all straight lines parallel to the road plane (such as lane lines and curbs) on the image plane should intersect at a single point, i.e., the vanishing point. By assigning each pixel its dominant texture direction, it is equivalent to specifying the direction of the line family to which each potential texture fragment in the image belongs. The voting mechanism simulates the process of converging these line extensions onto the image plane—each pixel votes for the candidate vanishing point locations along its dominant direction. When the extensions of the dominant directions of a large number of pixels highly overlap at a certain location, that location is the statistically most likely road vanishing point. This process is essentially a spatial convergence estimation based on orientation field consistency, enabling robust localization of vanishing points even in the absence of complete line detection.

[0100] Furthermore, a two-dimensional voting accumulation matrix corresponding to the size of the dynamic region of interest (ROI) is first constructed. Each element of the matrix is initialized to zero and represents the accumulated votes for that pixel location as a candidate vanishing point. For each valid pixel within the ROI, a ray is generated that starts from its location and points towards the image boundary along its dominant texture direction. Every candidate pixel covered by this ray can serve as a potential vanishing point location, but candidate points at different distances contribute differently to the vote: the closer a candidate is to the starting point, the higher the probability that the ray direction converges at that candidate. Therefore, an exponential decay function based on Euclidean distance is used to calculate the voting contribution value, ensuring that closer candidate points receive higher votes and farther candidate points receive lower votes. The voting contribution value of each pixel is accumulated to the corresponding position in the voting matrix. After traversing all valid pixels within the region, the peak position in the voting matrix corresponds to the observed position of the road vanishing point in the current frame.

[0101] Furthermore, the exponential constant λ of the voting decay function is positive and controls the rate at which voting contributions decay with distance. A typical value ranges from 0.01 to 0.1; a smaller value results in slower decay, retaining more contributions from distant candidate points, while a larger value results in faster decay, focusing on the near-end region. The generation of the voting ray for each pixel requires precise calculation of its intersection with the boundary of the region of interest to ensure complete voting coverage. The resolution of the voting matrix is consistent with the image resolution to ensure pixel-level accuracy in vanishing point localization. To reduce computational load, the voting pixels can be sparsely sampled, for example by casting votes only from every second pixel; the sampling step size is typically set to 1 to 2 pixels, striking a balance between accuracy and efficiency. After voting, the accumulation matrix needs to be Gaussian smoothed to eliminate local noise; the smoothing kernel size is typically 3×3 or 5×5.
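The accumulation described in S4 can be sketched as below. Several details are assumptions, not statements from the source: the ray is cast in the upward sense of the (sign-ambiguous) texture direction, on the premise that the vanishing point lies above the road pixels; the decay has the form exp(−λ·d); the 3×3 smoothing uses a binomial approximation of a Gaussian; and the function names are hypothetical.

```python
import numpy as np


def vote_vanishing_point(theta, valid, decay=0.05, step=1):
    """Accumulate distance-weighted votes along one ray per valid pixel.

    theta: (H, W) dominant direction in radians, measured from the image
           x axis with y pointing down (assumed convention).
    valid: (H, W) boolean mask; step: sparse sampling stride in pixels.
    """
    h, w = theta.shape
    acc = np.zeros((h, w))
    t = np.arange(1, h + w)
    for y0 in range(0, h, step):
        for x0 in range(0, w, step):
            if not valid[y0, x0]:
                continue
            dx, dy = np.cos(theta[y0, x0]), np.sin(theta[y0, x0])
            if dy > 0:  # assumption: vote towards the top of the image
                dx, dy = -dx, -dy
            # DDA stepping: advance exactly one pixel along the major axis.
            scale = max(abs(dx), abs(dy))
            dx, dy = dx / scale, dy / scale
            seg = np.hypot(dx, dy)  # Euclidean length of one step
            xs = np.rint(x0 + t * dx).astype(int)
            ys = np.rint(y0 + t * dy).astype(int)
            inside = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
            n = inside.size if inside.all() else int(np.argmin(inside))
            acc[ys[:n], xs[:n]] += np.exp(-decay * seg * t[:n])
    return acc


def smooth3(acc):
    """3x3 binomial (approximately Gaussian) smoothing with edge replication."""
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    p = np.pad(acc, 1, mode="edge")
    rows = k[0] * p[:-2] + k[1] * p[1:-1] + k[2] * p[2:]
    return k[0] * rows[:, :-2] + k[1] * rows[:, 1:-1] + k[2] * rows[:, 2:]


def locate_peak(acc):
    """Return the (x, y) position of the largest accumulated vote."""
    y, x = np.unravel_index(np.argmax(acc), acc.shape)
    return int(x), int(y)
```

A typical call would be `locate_peak(smooth3(vote_vanishing_point(theta, valid)))`, with `theta` and `valid` taken from the direction estimation stage; the peak height relative to the rest of the matrix could then serve as the confidence value mentioned for the Kalman filter.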

[0102] Specifically, in well-structured highway scenes, lane lines are clear and continuous, the dominant directions of a large number of pixels are highly consistent, the voting accumulation matrix exhibits sharp peaks, and the vanishing point localization accuracy can reach the sub-pixel level. In complex texture scenes of urban roads, there may be multi-directional texture interference, and the voting matrix may show multiple local peaks. Subsequent Kalman filtering can combine historical information to select and track the correct vanishing point. In areas with changing lighting or shadows, the confidence of orientation estimation for some pixels decreases, but through the cumulative voting of a large number of pixels, a reliable vanishing point can still be statistically converged. When critical conditions cause the vehicle to pitch violently, the adaptive adjustment of the dynamic region of interest ensures that the voting calculation always focuses on the effective area, enabling the vanishing point to maintain continuous tracking during rapid movement.

[0103] Specifically, firstly, the voting mechanism inherently possesses anti-interference capabilities—even if the orientation estimation of some pixels has errors, as long as the overall orientation field remains consistent with the true vanishing point, the voting peak can still accurately reflect the convergence position, demonstrating the robustness advantage of statistical methods. Secondly, the distance-based exponential decay weighting makes the voting results more focused on the contribution of near-end rays, avoiding excessive influence of far-end pixels on vanishing point localization and improving localization accuracy. Thirdly, the construction of the voting accumulation matrix provides quantified confidence information for subsequent Kalman filtering—the relative height of the peak can directly reflect the reliability of vanishing point detection in the current frame, used for dynamically adjusting filter parameters. Finally, the high-precision vanishing point observation provides accurate position input for pitch angle calculation, enabling the entire compensation method to maintain stable and reliable performance in complex dynamic scenes.

[0104] Furthermore, S4 includes: S41, taking each pixel in the dynamic region of interest as the starting point for voting, generate a ray pointing to the image boundary along its corresponding dominant texture direction, and establish the voting support relationship between the pixel and the vanishing point position.

[0105] Specifically, according to perspective projection geometry, parallel straight lines on a road plane are represented in the image as a cluster of rays converging at a single point, which is the vanishing point. Each pixel with a definite texture direction essentially represents a sample point of the family of lines to which the texture fragment belongs. By drawing rays from this pixel along its dominant texture direction, the infinite extension of the straight line containing the texture fragment on the image plane is simulated. When a large number of rays from different locations converge densely in a certain area of the image, this convergence point is the statistically most likely vanishing point location. Therefore, the ray generation process establishes a support relationship between each pixel and all possible candidate vanishing point locations, providing a spatial mapping basis for subsequent voting accumulation.

[0106] Specifically, the process first iterates through each valid pixel within the dynamic region of interest (ROI) to obtain the pixel's image coordinates and its corresponding dominant texture direction angle. The dominant texture direction angle is typically expressed in radians or degrees and is defined as the angle between the texture extension direction at that point and the horizontal axis of the image. Starting from this pixel, the direction vector of the ray is calculated from its dominant direction angle. A line rasterization algorithm (such as the Bresenham algorithm or the DDA algorithm) is used to trace the ray path pixel by pixel until it reaches the boundary of the ROI or the image boundary. During the tracing process, the coordinates of all pixels traversed by the ray are recorded; these coordinates constitute the set of candidate vanishing points that the voting starting point may support. Each pixel corresponds to one ray, establishing a one-to-many voting support relationship between the starting point and all candidate points on the ray path. To improve computational efficiency, a ray increment table corresponding to each direction angle can be pre-calculated, and the ray path can be quickly generated by table lookup.

[0107] Furthermore, the angular accuracy of the texture-dominant direction must be within 1° to ensure the geometric accuracy of the ray direction. The ray tracing step size is fixed at 1 pixel to ensure coverage of all possible candidate locations. The ray termination condition is reaching the boundary of the region of interest or the image boundary. When the region of interest is a rectangular strip, the ray may terminate at the upper, lower, left, or right boundary. For each voting starting point, the ray length depends on the distance between the starting point and the boundary, with the maximum possible length corresponding to the diagonal size of the region of interest. During implementation, degenerate cases where the orientation angle is close to horizontal or vertical need to be handled to avoid division-by-zero errors or infinite loops in ray tracing.
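The table-driven ray tracing of S41 might look like the following sketch, which uses a DDA-style stepping. The 1° quantization matches the stated angular accuracy; everything else (function names, the inclusive ROI tuple layout, the choice to cast each ray in the upward sense, and casting horizontal directions to the left) is an assumption for illustration. Normalizing by the larger direction component, instead of computing a slope dy/dx, is what keeps vertical and horizontal directions free of division by zero, and advancing one full pixel along the major axis per step guarantees that the loop leaves a bounded ROI.

```python
import math


def build_increment_table(n_angles=180, eps=1e-12):
    """Precompute (dx, dy, step_length) for orientations k * 180 / n_angles degrees.

    Image coordinates are assumed: x to the right, y downwards. Each ray is
    oriented upwards (dy <= 0), an assumption about where the vanishing point lies.
    """
    table = []
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        dx, dy = -math.cos(theta), -math.sin(theta)
        # Snap near-zero components so horizontal/vertical rays stay exact.
        if abs(dx) < eps:
            dx = 0.0
        if abs(dy) < eps:
            dy = 0.0
        scale = max(abs(dx), abs(dy))  # >= sqrt(0.5) for a unit vector, never zero
        dx, dy = dx / scale, dy / scale
        table.append((dx, dy, math.hypot(dx, dy)))
    return table


def trace_ray(x0, y0, angle_index, table, roi):
    """List the pixels a ray crosses inside roi = (x_min, y_min, x_max, y_max).

    Returns tuples (x, y, distance_from_start); the start pixel is excluded.
    """
    dx, dy, seg = table[angle_index]
    x_min, y_min, x_max, y_max = roi
    path = []
    t = 1
    while True:
        x = int(round(x0 + t * dx))
        y = int(round(y0 + t * dy))
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            break  # reached the ROI boundary
        path.append((x, y, t * seg))
        t += 1
    return path
```

For example, `trace_ray(5, 10, 90, build_increment_table(), (0, 4, 20, 12))` walks straight up from row 9 to row 4 of a strip-shaped ROI, returning six pixels at distances 1 to 6.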

[0108] Specifically, in well-structured highway scenes, pixels on lane lines have a clear texture-dominant direction, and the generated rays accurately point to the true vanishing point region, providing high-quality orientation support for subsequent voting accumulation. In complex textured areas of urban roads, such as intersections or areas with dense pavement markings, some pixels may have orientation estimation biases, but through the superposition of rays from a large number of pixels, the true vanishing point still gains a statistical advantage. In critical situations with drastic changes in vehicle pitch, the adaptive adjustment of the longitudinal position of the dynamic region of interest ensures that ray generation always focuses on the image strip containing the vanishing point, avoiding ray pointing errors caused by an offset search area. In areas with changing lighting or partial texture loss, even if the ray direction of individual pixels is incorrect, the convergence characteristics of the overall ray cluster remain stable.

[0109] Specifically, firstly, ray generation transforms the texture direction information of each pixel into support votes for vanishing point candidate positions, establishing a bridge between pixel-level contribution and global peak detection. Secondly, all candidate points within the ray's coverage area receive votes, ensuring the comprehensiveness of vanishing point detection—even if the true vanishing point is located between two pixels, the convergence of rays from surrounding pixels can still accumulate through voting to form a local peak. Thirdly, the ray-based voting mechanism inherently possesses anti-sparseness and anti-noise capabilities; even if some pixels cannot participate in voting due to missing textures, the rays from the remaining pixels can still converge effectively. Finally, high-quality ray generation lays a solid foundation for subsequent voting accumulation and peak detection, enabling vanishing point localization to maintain pixel-level accuracy and stability in complex dynamic scenes.

[0110] S42, calculate the Euclidean distance between each candidate pixel on the ray and the voting starting point, and perform exponential decay weighting based on the Euclidean distance to obtain the voting contribution value of the voting starting point to each candidate pixel on the ray, wherein the closer the distance, the greater the contribution, and the farther the distance, the smaller the contribution.

[0111] Specifically, on the image plane, candidate points closer to the voting starting point have a stronger geometric correlation between their texture direction and the starting point, because the probability of straight lines bending or shifting within a short distance is lower. Conversely, for candidate points farther away, the confidence of the starting point's texture direction pointing to that point gradually decreases due to image distortion, perspective shortening, or texture interruption. The exponential decay function maps distance to decay weights, achieving spatial adaptive adjustment of voting contributions—nearer candidate points receive high weights, reflecting their high confidence as potential vanishing points; distant candidate points receive low weights, reflecting the weakening of their geometric correlation. This mechanism makes vanishing point localization more focused on regions with highly consistent local orientation fields, avoiding interference from distant noise or texture anomalies on the voting results.

[0112] Furthermore, for each ray generated from the voting starting point, the image coordinates of all candidate pixels along the ray are first obtained. The Euclidean distance, expressed in pixels, is calculated between the coordinates of the voting starting point and the coordinates of each candidate point. Subsequently, an exponential decay function is applied to calculate the voting contribution value corresponding to each candidate point; a typical function form is $w(d) = e^{-\lambda d}$, where $d$ is the Euclidean distance and $\lambda$ is a preset decay constant. The range of the function is (0, 1], with a contribution of 1 when the distance is 0 and approaching 0 as the distance approaches infinity. The calculated weight is used as the voting contribution of the starting point to the current candidate point and accumulated in the corresponding position of the voting accumulation matrix. In practice, a decay weight table corresponding to different distances can be pre-calculated, and the contribution value can be quickly obtained by looking up the table, avoiding the performance overhead of real-time calculation of the exponential function. For candidate points exceeding the boundary of the region of interest, voting is automatically terminated to ensure that the calculation range is controllable.
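A minimal sketch of the distance weighting and the pre-calculated lookup table, assuming the decay takes the form exp(-lam * d). The names `decay_weight`, `build_decay_table`, and `lookup_weight`, the default decay constant of 0.05, and the rounding of distances to integer pixels are all illustrative assumptions.

```python
import math

def decay_weight(d, lam=0.05):
    """Voting weight for a candidate at Euclidean distance d (pixels)."""
    return math.exp(-lam * d)

def build_decay_table(max_dist, lam=0.05):
    """Precompute weights for integer distances 0..max_dist (pixels)."""
    return [math.exp(-lam * d) for d in range(max_dist + 1)]

def lookup_weight(table, d):
    """Table lookup; distances beyond the table contribute nothing."""
    idx = int(round(d))
    return table[idx] if idx < len(table) else 0.0
```

Choosing `max_dist` as the distance truncation threshold makes the table double as the cut-off: candidates farther than the table length receive a weight of zero.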

[0113] Furthermore, the value of the attenuation constant $\lambda$ determines the rate at which the weight decays with distance, typically ranging from 0.01 to 0.1. A smaller value (e.g., 0.01) results in a smoother decay curve that retains some weight for distant candidate points, making it suitable for scenarios with strong texture continuity and long vanishing point distances; a larger value (e.g., 0.1) results in rapid decay that focuses on the near-end region, suitable for scenarios with frequent texture interruptions where local consistency needs to be emphasized. The distance $d$ is measured in pixels, ranging from 0 to the maximum length of the ray, typically tens to hundreds of pixels. To ensure computational stability, a distance truncation threshold can be set; candidate points exceeding this threshold will no longer participate in the voting. For example, the threshold can be set to 80% of the diagonal length of the region of interest. When accumulating voting contributions, floating-point accumulation must be used to ensure accuracy, and the final voting matrix can be normalized to the interval [0, 1].
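As a worked illustration, assume the decay takes the form $w(d) = e^{-\lambda d}$, with $\lambda$ the attenuation constant and $d$ the distance in pixels (both symbols are notation chosen for this example). At a distance of 50 pixels, the two ends of the typical range give very different weights:

```latex
w(50)\big|_{\lambda = 0.01} = e^{-0.5} \approx 0.607,
\qquad
w(50)\big|_{\lambda = 0.1} = e^{-5} \approx 0.0067
```

With the smaller constant a candidate 50 pixels away still contributes about 61% of a full vote, whereas with the larger constant its contribution is below 1%, which is why the larger value emphasizes local consistency.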

[0114] Specifically, in straight-line driving scenarios on highways, lane lines are continuous and oriented in the same direction. Along the ray path from the voting starting point to the true vanishing point, near-end candidate points have high weights, while far-end candidate points have low weights. This naturally concentrates the voting peak in the area near the starting point, closely matching the location of the true vanishing point. At urban road intersections or in areas with complex textures, multiple rays may intersect in different directions. The exponential decay mechanism ensures that the voting contribution of each starting point is mainly concentrated in its near-end area, avoiding interference from far-end intersections and helping to preserve local consistency information in multi-peak scenarios. In critical situations causing severe vehicle pitch, the vanishing point moves rapidly. Exponential decay, by emphasizing near-end contributions, allows the voting peak to quickly follow changes in the starting point's direction, improving dynamic response capabilities. In low-contrast areas with insufficient lighting or blurred textures, the starting point's direction estimation may contain errors. In this case, exponential decay limits the voting range, confining the error's impact to the near end and preventing the accumulation of errors at the far end.

[0115] Specifically, firstly, the distance decay mechanism assigns higher weights to near-end candidate points, making vanishing point localization more reliant on regions with strong local orientation consistency, effectively improving localization accuracy. Secondly, the low weighting of far-end candidate points by the decay function naturally suppresses erroneous voting at long distances caused by texture interruptions, noise interference, or orientation estimation errors, enhancing the noise resistance of the voting process. Thirdly, the smoothing characteristic of exponential decay ensures the continuity and differentiability of the voting matrix, facilitating subsequent peak detection and sub-pixel precision interpolation. Finally, this weighting mechanism, combined with ray generation, ensures that the contribution of each pixel matches its geometric confidence, providing high-quality input to the voting accumulation matrix. This significantly improves the robustness and accuracy of vanishing point localization in complex dynamic scenes, laying a reliable data foundation for pitch angle estimation.

[0116] S43, accumulate the voting contribution values of all pixels on the image plane to construct a two-dimensional voting accumulation matrix corresponding to the dynamic region of interest, and select the pixel coordinates corresponding to the voting peak in the two-dimensional voting accumulation matrix as the observation position of the road vanishing point in the current frame visual image.

[0117] Specifically, the voting contributions of all pixels are accumulated on the image plane to construct a two-dimensional voting accumulation matrix and extract the peak coordinates. The core principle lies in maximizing geometric consistency detection through statistical convergence. Each pixel's contribution to candidate locations along a line, based on its dominant texture direction, is essentially local evidence of that location's probability of being a vanishing point. When a large number of pixels from different spatial locations independently cast their votes, the location corresponding to the true vanishing point will accumulate significantly more votes than other locations due to the convergence of rays from multiple directions. This process utilizes the statistical law revealed by the Central Limit Theorem: even if the direction estimation of a single pixel has errors, as long as the error distribution is unbiased and the number of votes is sufficient, the expected location of the voting peak will still converge to the true vanishing point. Therefore, peak detection using the voting accumulation matrix is essentially a maximum likelihood estimation based on evidence accumulation, enabling robust geometric localization in complex texture environments.

[0118] Specifically, first, a two-dimensional voting accumulation matrix with the exact same size as the dynamic region of interest (ROI) is initialized. The matrix elements are floating-point numbers, and all elements are initialized to zero. Each valid pixel within the ROI is traversed, and the voting contribution value calculated for that pixel to each candidate position on the ray is obtained. These contributions are then accumulated into the corresponding coordinates in the accumulation matrix. The accumulation process must ensure multi-threaded safety or employ atomic operations to avoid concurrent write conflicts. After accumulating the voting contributions of all pixels, the accumulation matrix undergoes optional post-processing, such as applying a small Gaussian kernel for smoothing filtering to eliminate the influence of isolated noise points and enhance the local peak convexity. Subsequently, a peak detection algorithm is used to traverse the accumulation matrix and identify the position of the element with the maximum value. For sub-pixel accuracy, quadratic surface fitting or barycentric interpolation can be performed within the peak neighborhood to obtain a more accurate coordinate estimate. Finally, this coordinate is output as the observed position of the road vanishing point in the current frame's visual image.
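The accumulation and peak extraction steps above can be sketched as follows. This is a single-threaded illustration under stated assumptions: votes arrive as `(x, y, weight)` tuples in ROI coordinates, the optional Gaussian smoothing is omitted, and sub-pixel refinement uses an independent three-point parabola per axis as a simple stand-in for the quadratic surface fitting mentioned in the text. The function names are hypothetical.

```python
import numpy as np

def accumulate_votes(shape, votes):
    """Build a float accumulation matrix of the given (height, width)."""
    acc = np.zeros(shape, dtype=np.float64)
    for x, y, w in votes:
        acc[y, x] += w
    return acc

def find_peak_subpixel(acc):
    """Return (x, y, peak_value) with parabolic sub-pixel refinement."""
    py, px = np.unravel_index(np.argmax(acc), acc.shape)

    def refine(minus, centre, plus):
        # Vertex offset of the parabola through three equally spaced samples.
        denom = minus - 2.0 * centre + plus
        return 0.0 if denom == 0.0 else 0.5 * (minus - plus) / denom

    dx = dy = 0.0
    # Refinement needs both neighbours, so it is skipped on the border.
    if 0 < px < acc.shape[1] - 1:
        dx = refine(acc[py, px - 1], acc[py, px], acc[py, px + 1])
    if 0 < py < acc.shape[0] - 1:
        dy = refine(acc[py - 1, px], acc[py, px], acc[py + 1, px])
    return px + dx, py + dy, acc[py, px]
```

The returned peak value can be compared against a preset threshold to flag low-confidence frames. A multi-threaded version would need per-thread accumulators or atomic adds, as the text notes.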

[0119] Furthermore, the spatial resolution of the voting accumulation matrix is consistent with the pixel resolution of the dynamic region of interest to ensure pixel-level positioning accuracy. The Gaussian kernel size for smoothing filtering is typically 3×3 or 5×5, with a standard deviation set to 0.5~1.0, suppressing noise while avoiding excessive smoothing that could lead to peak shift. The neighborhood comparison range for peak detection is set to 3×3 or 5×5 to ensure that the detected maximum value is the true peak within the local region. If sub-pixel interpolation is used, the fitting window size is typically 3×3 or 5×5, and the interpolation accuracy can reach 0.1~0.2 pixels. Frames with excessively low voting values (such as peak values below a preset threshold) can be identified as low-confidence observations; the vanishing point position of such a frame is marked as invalid, and the subsequent Kalman filtering is used for prediction compensation.

[0120] Specifically, in well-structured highway scenes, the highly consistent voting contributions from a large number of lane line pixels result in a sharp single peak in the accumulation matrix, with clear and well-defined peak coordinates, achieving sub-pixel accuracy in vanishing point localization. In complex urban road scenes, texture interference may exist from multiple directions, potentially causing the voting accumulation matrix to exhibit multiple local peaks. In such cases, the dominant peak is selected according to the height or sharpness of the peaks, or the subsequent Kalman filtering is combined with historical trajectories to select the correct vanishing point. Under conditions of varying illumination or partial texture loss, the peaks of the voting matrix may be relatively flat, but the most likely vanishing point region can still be identified through statistical advantage. In critical situations causing severe vehicle pitch, real-time adjustment of the dynamic region of interest ensures that the voting accumulation matrix always focuses on the effective image range, enabling continuous and stable output of vanishing point observations even during rapid movement.

[0121] Specifically, firstly, the statistical accumulation mechanism inherently possesses noise resistance—even if there are random errors in the orientation estimation of a large number of pixels, as long as the errors are unbiased, the position of the voting peak can still accurately reflect the true vanishing point, demonstrating the robustness of statistical inference. Secondly, peak detection provides a quantitative confidence index—the height of the peak and its relative difference with the secondary peak can directly reflect the reliability of the vanishing point observation in the current frame, providing a basis for the dynamic parameter adjustment of the subsequent Kalman filter. Thirdly, through sub-pixel interpolation technology, the positioning accuracy can break through the pixel-level limitation, reaching 0.1~0.2 pixels, enabling pitch angle estimation to sensitively capture minute changes in vehicle attitude. Finally, the high-precision, high-confidence vanishing point observation provides accurate measurement input for the entire pitch angle compensation method, enabling subsequent attitude estimation and dynamic compensation to maintain excellent performance under critical conditions, significantly improving the reliability and robustness of the intelligent driving active safety system.

[0122] S5, perform optimal estimation of the road vanishing point based on Kalman filtering, calculate the camera's pitch angle from the coordinates of the optimally estimated vanishing point, and thereby compensate for the vehicle's pitch angle.

[0123] Specifically, due to factors such as image noise, illumination variations, and texture loss, the vanishing point observation position obtained from single-frame voting inevitably contains random errors and may exhibit unreasonable high-frequency jumps between consecutive frames. Kalman filtering, as a recursive state estimator, can estimate the true state of the system in a statistically optimal sense by fusing the system dynamics model with noisy observation data. The vanishing point position and its velocity are constructed as a state vector, and a uniform motion model is used to describe its evolution across consecutive frames. Simultaneously, the observation noise covariance is adaptively adjusted based on the confidence level of the voting results, ensuring that the filtered vanishing point trajectory is both smooth and can quickly respond to real pitch changes. Finally, the optimally estimated vanishing point ordinate is substituted into the camera geometric model, combined with pre-calibrated camera mounting angles, to accurately calculate the current camera pitch angle. Based on this, dynamic compensation is performed on the vehicle pitch angle to restore the geometric reference of the visual perception system.

[0124] Specifically, firstly, a Kalman filter state vector is constructed, containing the ordinate and abscissa of the vanishing point on the image plane and its corresponding motion velocity components, typically represented as a four-dimensional vector. A state transition equation is established based on a uniform motion model to predict the mean and covariance of the current frame's state. The vanishing point coordinates obtained from voting in step S4 are used as observation inputs, and the observation noise covariance is dynamically calculated based on the peak signal-to-noise ratio (PSNR) or peak sharpness of the voting matrix: when the voting peak is sharp and the confidence level is high, the observation noise covariance takes a smaller value, giving the observation higher weight; conversely, a larger value is taken, indicating greater confidence in the prediction result. After calculating the Kalman gain, the state estimate is updated in conjunction with the observation values to obtain the optimal estimated position of the vanishing point in the current frame and its covariance. The vanishing point ordinate is extracted from the updated state vector, and combined with the camera focal length, sensor height, and total image height, the current camera's pitch angle relative to the optical axis is calculated using the arctangent function. Utilizing the fixed pitch angle deviation calibrated during camera installation, the camera pitch angle is converted into the actual pitch angle at the vehicle's center of gravity, and finally, this angle value is output to the vehicle control system or perception module for compensation.

[0125] Furthermore, the state transition matrix is constructed based on the image acquisition interval, with a typical frame interval of 33 milliseconds (corresponding to 30 fps). The process noise covariance matrix needs to be adjusted according to the dynamic characteristics of the vehicle's pitch motion, typically setting the standard deviation of position process noise to 0.1~0.5 pixels and the standard deviation of velocity process noise to 0.01~0.1 pixels / frame. The dynamic adjustment range of the observation noise covariance is set to 0.1 to 10 pixels², and the mapping relationship with the peak signal-to-noise ratio can be achieved using a linear or piecewise function. The camera focal length in the pitch angle calculation formula is in pixels and needs to be obtained through offline calibration, with a typical value range of 1000~2000 pixels. The ratio of sensor height to image height is determined by the camera model, and the pitch angle calculation needs to resolve 0.01° or finer to meet sub-pixel level attitude tracking requirements. The final output pitch angle compensation value is updated at the same frequency as the image frame rate to ensure that the compensation action is synchronized with image acquisition.

[0126] Specifically, in smooth highway driving scenarios, the vanishing point changes slowly, and the prediction component of the Kalman filter dominates, resulting in a smooth and natural output trajectory that effectively suppresses minor jitter caused by road texture noise. During emergency braking, the rapid increase in longitudinal acceleration causes the vanishing point to move upwards quickly. At this point, the observation noise covariance is automatically adjusted downwards based on the voting confidence level, enabling the filter to quickly track real pitch changes with a response delay within a few frames. In scenarios with continuously bumpy roads, the vanishing point may exhibit periodic fluctuations. The Kalman filter, by fusing historical motion information, maintains real-time tracking while avoiding excessive tracking of high-frequency noise. In frames where brief texture loss or direct sunlight causes voting failure, the observation noise covariance automatically increases. The filter primarily relies on prediction to maintain the state, quickly correcting it after observation recovery, ensuring the continuity and stability of the compensation signal.

[0127] Specifically, firstly, the recursive estimation mechanism of the Kalman filter effectively eliminates random errors and high-frequency jumps in single-frame voting results, ensuring the vanishing point trajectory remains smooth and continuous in the time dimension, providing a stable input for pitch angle calculation. Secondly, adaptive adjustment of observation noise based on voting confidence enables the filter to dynamically balance the weights of prediction and observation under different operating conditions, taking into account both fast response and anti-interference capabilities. Thirdly, the accurate camera geometry model converts pixel coordinates into physical angles, achieving sub-pixel pitch angle resolution, allowing the compensation system to perceive minute attitude changes on the order of 0.01°. Finally, the real-time output of the vehicle pitch angle compensation value can be directly injected into the visual perception algorithm or chassis control system to restore the geometric constraints under critical conditions, significantly improving the reliability and accuracy of active safety functions such as automatic emergency braking and lane keeping in extreme scenarios, providing key support for the stable operation of intelligent driving systems in all weather and all operating conditions.

[0128] Furthermore, S5 includes: S51, construct a state vector containing the image coordinates of the vanishing point and the motion velocity on the image plane, and perform Kalman filtering to predict the state of the vanishing point of the current frame based on the uniform motion model to obtain the predicted state and the corresponding covariance.

[0129] Specifically, because the vehicle's pitch motion is constrained by the suspension system's dynamics, within extremely short time intervals (such as 33 milliseconds between adjacent frames), the vanishing point's motion can be approximated as uniform motion, meaning its position changes linearly with time while its velocity remains constant. Incorporating both position and velocity into the state vector allows the filter not only to estimate the optimal position of the vanishing point at the current moment but also to maintain continuous tracking of its motion trend. The Kalman filter's prediction step utilizes this motion model, deriving the prior estimate for the current frame based on the optimal estimate of the previous frame, providing a reasonable initial value for subsequent fusion of the current frame's observation data. This mechanism essentially introduces a smoothing constraint in the time dimension, ensuring that the vanishing point trajectory conforms to the laws of physical motion and effectively suppressing high-frequency jitter caused by single-frame observation noise.

[0130] Specifically, the state vector of the Kalman filter is first defined, typically in four-dimensional form, containing the horizontal and vertical coordinates of the vanishing point on the image plane, along with their corresponding horizontal and vertical velocities, expressed as $\mathbf{x} = [u, v, \dot{u}, \dot{v}]^{T}$. A state transition matrix $\mathbf{F}$ is established based on the assumption of uniform motion and is determined by the image acquisition interval $\Delta t$. Its structure is a block matrix, $\mathbf{F} = \begin{bmatrix} \mathbf{I}_2 & \Delta t\,\mathbf{I}_2 \\ \mathbf{0} & \mathbf{I}_2 \end{bmatrix}$: the top-left block is a 2×2 identity matrix representing positional inheritance, the top-right block is $\Delta t$ multiplied by the identity matrix, representing the contribution of velocity to position, and the 2×2 identity matrix in the lower-right block represents velocity retention. For each new frame, starting from the optimal state estimate of the previous frame $\hat{\mathbf{x}}_{k-1}$ and its covariance matrix $\mathbf{P}_{k-1}$, left-multiplication by the state transition matrix yields the predicted state of the current frame $\hat{\mathbf{x}}_{k|k-1} = \mathbf{F}\,\hat{\mathbf{x}}_{k-1}$ and the predicted covariance $\mathbf{P}_{k|k-1} = \mathbf{F}\,\mathbf{P}_{k-1}\,\mathbf{F}^{T} + \mathbf{Q}$. The calculation of the predicted covariance thus also requires adding the process noise covariance matrix $\mathbf{Q}$, which reflects the uncertainty between the uniform velocity model and the actual motion. After the prediction is completed, the predicted state and covariance are temporarily stored, awaiting subsequent updates based on the observation data of the current frame.
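The prediction step can be sketched with the standard constant-velocity Kalman equations. The state ordering `[u, v, du, dv]`, the function names, and the choice of measuring `dt` in frames (so that it matches velocities expressed in pixels / frame) are assumptions made for this example.

```python
import numpy as np

def make_transition(dt):
    """Constant-velocity transition matrix for the state [u, v, du, dv]."""
    F = np.eye(4)
    F[0, 2] = dt  # horizontal position += horizontal velocity * dt
    F[1, 3] = dt  # vertical position   += vertical velocity   * dt
    return F

def kalman_predict(x, P, F, Q):
    """Propagate the state estimate and its covariance by one frame."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

# Illustrative values only: a vanishing point at (640, 360) moving upward
# by 2 pixels per frame, with the initial covariances quoted in the text.
x0 = np.array([640.0, 360.0, 0.0, -2.0])
P0 = np.diag([100.0, 100.0, 10.0, 10.0])
Q = np.diag([0.5**2, 0.5**2, 0.1**2, 0.1**2])  # variances from std devs
x_pred, P_pred = kalman_predict(x0, P0, make_transition(1.0), Q)
```

In this example the predicted position moves to row 358, and the position variance grows from 100 to 110.25 because velocity uncertainty and process noise are both added during prediction.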

[0131] Furthermore, the time interval $\Delta t$ in the state transition matrix needs to correspond accurately to the actual frame interval of image acquisition, typically 33 milliseconds (30 fps) or 40 milliseconds (25 fps), and to be adjusted adaptively for potential frame rate fluctuations. The design of the process noise covariance matrix $\mathbf{Q}$ reflects the prior setting of the confidence level of the motion model. Typically, the standard deviation of position noise in the diagonal elements is set to 0.1~0.5 pixels, and the standard deviation of velocity noise is set to 0.01~0.1 pixels / frame. Smaller values indicate higher confidence in the uniform velocity model and smoother filter output; larger values allow for greater model bias and faster filter response. The initial value of the covariance matrix $\mathbf{P}$ is set when the filter is started. Typically, the diagonal elements are set to have a larger initial uncertainty. For example, the position covariance is initialized to 100 pixels² and the velocity covariance is initialized to 10 pixels² / frame².

[0132] Specifically, in a smooth highway driving scenario, the vanishing point changes slowly, the predicted state differs little from the previous frame, and the prediction covariance increases slowly, providing a stable prior for subsequent observation updates. Under emergency braking conditions, the vanishing point moves rapidly upwards, and the prediction step deduces the approximate position of the current frame based on the velocity estimate from the previous frame, ensuring the filter maintains continuous tracking of the vanishing point even with brief delays or loss of observations. In a continuously bumpy road scenario, the vanishing point exhibits periodic fluctuations, and the prediction step captures the motion trend through the velocity term, enabling the estimated trajectory to follow the phase of the actual motion. In frames where observation data is temporarily lost due to strong light or obstruction, the prediction state becomes the sole output basis, ensuring the continuity of the compensation signal, and allowing the filter to quickly correct accumulated errors once observations are restored.

[0133] Specifically, firstly, introducing the velocity dimension enables the filter to memorize motion trends, predicting future positions based on historical motion patterns, effectively improving the continuity and predictability of tracking. Secondly, although the assumptions of the uniform velocity model are simple, they have high approximate effectiveness within extremely short time intervals between adjacent frames, keeping prediction errors within an acceptable range while maintaining computational efficiency. Thirdly, the recursive calculation of the prediction covariance provides a theoretical measure of uncertainty for subsequent observation updates, allowing the calculation of the Kalman gain to scientifically balance the weights of prediction and observation. Finally, high-quality prediction states lay a solid prior foundation for the entire filtering process, ensuring that the vanishing point trajectory remains smooth and physically reasonable in the time dimension, significantly improving the stability and anti-interference capability of pitch angle estimation.

[0134] S52, use the observed position of the road vanishing point obtained by voting as the measurement input, dynamically adjust the observation noise covariance according to the confidence of the voting results, calculate the Kalman gain in combination with the predicted state, update the state vector, and extract the updated vanishing point image coordinates as the optimal estimated position.

[0135] Specifically, Kalman filtering uses Kalman gain to balance the confidence of the predicted state and the observed values. The gain depends on the relative relationship between the prediction covariance and the observation noise covariance. The confidence level of the voting result directly reflects the reliability of the observed values—when the voting peak is sharp and the signal-to-noise ratio is high, the observation noise should be set lower, and the filter trusts the current observation more; conversely, the observation noise should be set higher, and the filter relies more on the prediction. By establishing a mapping relationship between the confidence index and the observation noise covariance, the filter can dynamically adjust the fusion weights according to the actual quality of each frame, thereby ensuring the smoothness of the estimation while quickly responding to the real vanishing point motion. The vanishing point coordinates extracted from the updated state vector are the optimal estimate that fuses the motion model and the current observation, providing accurate and stable input for subsequent pitch angle calculations.

[0136] Specifically, firstly, the observed coordinates of the road vanishing point in the current frame are obtained from the voting calculation in step S4. Simultaneously, quantitative indicators reflecting the voting confidence are extracted, such as the peak signal-to-noise ratio (PSNR) of the voting matrix, the ratio of the peak to the second-highest peak, or the local sharpness of the peak (e.g., the second moment). Based on a preset mapping function, the confidence indicators are converted into the diagonal elements of the observation noise covariance matrix. Typically, the observation noise covariance matrix is a two-dimensional diagonal matrix, whose value decreases as confidence increases and increases as confidence decreases. Then, the predicted state and prediction covariance of the current frame are obtained from step S51, and the Kalman gain is calculated in conjunction with the observation matrix (mapping the state space to the observation space). The Kalman gain is used to correct the predicted state, resulting in an updated state vector, and the covariance matrix is updated simultaneously. After the update, the first two components (i.e., image coordinates) are extracted from the state vector as the optimal estimated position of the vanishing point in the current frame. The entire process is executed recursively, requiring only the estimation results from the previous frame to be retained, thus meeting real-time computational efficiency requirements.
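A minimal sketch of the measurement update, using the textbook Kalman equations with an observation matrix that selects only the two position components. The state ordering `[u, v, du, dv]` and the function name `kalman_update` are assumptions for illustration.

```python
import numpy as np

# Observation matrix: only the position components (u, v) are measured.
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])

def kalman_update(x_pred, P_pred, z, R):
    """Fuse the predicted state with the voted vanishing point z = (u, v)."""
    innovation = z - H @ x_pred           # observation minus prediction
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain, shape (4, 2)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new, K
```

When `R` is small relative to `P_pred` the gain approaches 1 and the estimate follows the observation; when `R` is very large the gain approaches 0 and the estimate stays at the prediction, which matches the confidence-dependent behaviour described in the text.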

[0137] Furthermore, the dynamic adjustment range of the observation noise covariance is typically set from 0.1 to 10 pixels², and the mapping between the specific value and the voting confidence index can be achieved using linear, exponential, or piecewise functions. For example, when the peak signal-to-noise ratio (PSNR) is higher than 30 dB, the observation noise covariance is set to 0.1 pixels²; when it is lower than 15 dB, it is set to 10 pixels²; linear interpolation is used in the intermediate region. The observation matrix is designed to only observe the position components, ensuring that the dimensionality of the state space matches that of the observation space. The element values of the Kalman gain matrix reflect the degree of confidence in the observations, and the normal range is between 0 and 1. The updated state covariance matrix must maintain positive definiteness, and the square root of its diagonal elements can be used as a measure of estimation uncertainty. The updated vanishing point coordinates should achieve pixel-level or sub-pixel-level accuracy to meet the accuracy requirements of subsequent pitch angle calculations.
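The piecewise-linear example mapping above can be sketched with `numpy.interp`, which clamps to the end values outside the given range. The function name and parameter names are hypothetical; the thresholds are the example values quoted in the text.

```python
import numpy as np

def observation_noise_from_psnr(psnr_db, lo_db=15.0, hi_db=30.0,
                                r_max=10.0, r_min=0.1):
    """Map voting PSNR (dB) to a 2x2 diagonal observation covariance.

    Below lo_db the variance is r_max, above hi_db it is r_min, and it
    is linearly interpolated in between (units: pixels squared).
    """
    r = float(np.interp(psnr_db, [lo_db, hi_db], [r_max, r_min]))
    return np.diag([r, r])
```

For instance, a PSNR of 22.5 dB lies halfway between the two thresholds and yields a variance of 5.05 pixels² on each axis.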

[0138] Furthermore, in well-structured highway scenarios, the voting peaks are sharp with high confidence. The observation noise covariance is automatically lowered, and the Kalman gain is increased, allowing the filter to quickly track the true vanishing point location and ensure the estimated trajectory closely follows the observations. In complex urban road textures or multi-light interference scenarios, voting may exhibit multiple peaks or flat peaks, reducing confidence. The observation noise covariance is automatically increased, and the filter relies more on prediction, avoiding being misled by erroneous observations. In extreme frames where direct sunlight or brief occlusion causes voting to fail, the observations may be completely unusable. In such cases, the observation noise covariance is set to a maximum value, and the Kalman gain approaches zero. The filter relies entirely on prediction to maintain its state, ensuring the continuity of the estimation. In critical situations causing rapid vanishing point movement, the observation confidence is usually high, allowing the filter to respond quickly and achieve low-latency tracking.

[0139] Specifically, firstly, the confidence-weighted observation fusion mechanism enables the filter to automatically adjust its confidence level based on the image quality of each frame. This allows for rapid response to real motion during high-quality observations and maintains smoothness and stability during low-quality observations, achieving an optimal balance between response speed and anti-interference capability. Secondly, real-time calculation of the Kalman gain provides scientific fusion weights, ensuring that the updated state vector statistically minimizes the mean square error and guarantees optimal estimation. Thirdly, the updated vanishing point coordinates effectively suppress random errors and high-frequency jumps in single-frame observations while preserving the true dynamic trend, providing accurate and stable input for pitch angle calculation. Finally, this adaptive fusion mechanism significantly improves the robustness and accuracy of the entire pitch angle compensation method under complex and variable conditions, enabling the vehicle control system to make accurate decisions and compensations based on reliable state estimations.

[0140] S53 calculates the current pitch angle of the camera based on the longitudinal coordinate of the vanishing point in the optimal estimated position, combined with the camera's focal length, sensor height, and image size. Then, using the pre-calibrated fixed pitch angle deviation between the camera and the vehicle, it determines the vehicle's actual pitch angle and outputs this actual pitch angle to the vehicle control system and perception module for compensation.

[0141] Specifically, in the pinhole camera model, the vertical coordinate of the road vanishing point on the image plane is directly determined by the angle between the camera's optical axis and the horizontal road surface—when the camera's pitch angle changes, the vertical coordinate of the vanishing point moves linearly along the vertical direction of the image. By solving this geometric relationship, the pixel coordinates can be inverted into the camera's absolute pitch angle. Since the camera and vehicle are rigidly connected and have a fixed installation tilt angle, the actual pitch angle at the vehicle's center of gravity can be obtained by subtracting the pre-calibrated installation deviation from the camera's pitch angle. Outputting this angle value to the vehicle control system or perception module in real time can dynamically correct the geometric reference of visual perception when the vehicle's attitude changes drastically, ensuring that algorithms such as ranging and target detection maintain their preset accuracy and reliability even in critical situations.

[0142] Specifically, firstly, the optimal estimated ordinate of the vanishing point is extracted from the state vector updated by Kalman filtering. This coordinate, in pixels, represents the position of the vanishing point in the vertical direction of the image. Pre-calibrated camera intrinsic parameters are obtained, including focal length (in pixels), physical height of the sensor, and total image height. Based on the perspective projection formula, the tangent of the angle between the optical axis and the horizontal direction is calculated using the offset of the vanishing point's ordinate relative to the image center, combined with the focal length. The current pitch angle of the camera is then obtained using the arctangent function. Next, the fixed pitch angle deviation between the camera and the vehicle, recorded during vehicle static calibration, is read. This deviation represents the tilt angle of the camera relative to the vehicle's horizontal reference during installation. Subtracting this installation deviation from the camera pitch angle yields the absolute pitch angle of the vehicle on the actual road surface. Finally, this angle value is encapsulated into a standard message conforming to the Controller Area Network (CAN) bus protocol and sent at a frequency synchronized with the image frame rate to the vehicle's electronic stability control system, active suspension system, or visual perception module to compensate for measurement errors caused by changes in vehicle attitude in real time.
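The calculation described above can be sketched as follows. The function names are hypothetical, and the sign convention is an assumption: image rows increase downward, and a positive angle means nose-down pitch, so a vanishing point that rises above the image center gives a positive angle.

```python
import math


def camera_pitch_rad(vp_row_px, center_row_px, focal_px):
    """Camera pitch from the vanishing-point row under a pinhole model.

    Assumed convention: rows grow downward, positive pitch = nose down.
    """
    return math.atan2(center_row_px - vp_row_px, focal_px)


def vehicle_pitch_rad(camera_pitch, install_offset):
    """Subtract the pre-calibrated installation deviation from the camera pitch."""
    return camera_pitch - install_offset


# Hypothetical values: 1000 px focal length, vanishing point 20 rows above centre.
cam = camera_pitch_rad(vp_row_px=340.0, center_row_px=360.0, focal_px=1000.0)
veh = vehicle_pitch_rad(cam, install_offset=math.radians(0.5))
print(math.degrees(cam), math.degrees(veh))
```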

[0143] Furthermore, the camera focal length calibration accuracy must reach the pixel level, typically obtained through a checkerboard calibration method, with a relative focal length error of less than 0.5%. The ratio of sensor height to image height is determined by the camera model and must be accurate to the micrometer level to ensure correct conversion between physical dimensions and pixel units. Pitch angle calculation uses double-precision floating-point arithmetic, with an angle resolution of at least 0.01° to meet the angle sensitivity corresponding to sub-pixel level vanishing point changes. The installation deviation angle calibration is completed statically on a level road surface, with a calibration accuracy better than 0.05°, and must be verified under different vehicle load conditions to ensure the stability of the deviation value. The update frequency of the output message is strictly synchronized with the image acquisition frame rate, typically 30Hz or 50Hz, to ensure real-time correspondence between compensation actions and image acquisition. The angle output range covers the extreme pitch angles that the vehicle may reach, typically -6° to +6° (positive for braking pitch and negative for acceleration pitch), and amplitude limiting protection is set to prevent abnormal value output.
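A minimal sketch of the amplitude limiting mentioned above, together with a helper that shows why a fine angle resolution is needed. The function names are hypothetical, the ±6° limits are the typical range stated in this paragraph, and raising an error on NaN is one possible way to block abnormal values, not necessarily the one used in the embodiment.

```python
import math


def limit_pitch_deg(pitch_deg, lower=-6.0, upper=6.0):
    """Clamp the output pitch angle to the expected physical range."""
    if math.isnan(pitch_deg):
        raise ValueError("pitch angle is NaN")
    return max(lower, min(upper, pitch_deg))


def pitch_per_pixel_deg(focal_px):
    """Pitch change (degrees) caused by a one-pixel vanishing-point shift near the image centre."""
    return math.degrees(math.atan(1.0 / focal_px))
```

With a hypothetical focal length of 1000 pixels, one pixel of vanishing point motion corresponds to roughly 0.057°, so sub-pixel vanishing point estimates call for an angle resolution on the order of 0.01°.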

[0144] Specifically, during the activation of the automatic emergency braking system, this angle compensates in real time for target ranging errors caused by the vehicle's nose dropping, ensuring that braking decisions are based on the true relative distance rather than the apparent distance distorted by attitude. In adaptive cruise control, stable pitch angle compensation prevents changes in the height of the target vehicle ahead from being misinterpreted as lane changes or slope undulations, improving following smoothness. In lane keeping assist systems, the compensated image allows the lane line detection algorithm to operate based on the assumption of a level road surface, avoiding misjudgments of lane line curvature caused by vehicle pitch. In active suspension control, this angle can serve as a feedforward signal to adjust shock absorber damping in advance to suppress vehicle pitch and improve ride comfort. In hill start assist or hill descent control functions, the compensated pitch angle can be used to accurately determine road slope, preventing false triggering or function failure. During vehicle development and testing, this angle value can also be used as a data recording parameter to reproduce and analyze vehicle dynamics under extreme conditions.

[0145] Specifically, firstly, the pitch angle calculation based on the vanishing point's ordinate establishes a precise mapping between image features and physical attitude, enabling the vision system to self-perceive its real-time attitude relative to the road surface without relying on an inertial measurement unit, reducing hardware costs and avoiding sensor drift issues. Secondly, the actual vehicle pitch angle obtained after deducting pre-calibrated installation deviations has physical interpretability and can be directly used for vehicle dynamics control, enhancing compatibility and interoperability with underlying actuators. Thirdly, the real-time output angle value provides the perception algorithm with dynamic calibration capabilities, enabling geometrically constrained vision tasks (such as target ranging and lane line fitting) to maintain preset accuracy when the vehicle's attitude changes drastically, significantly improving the reliability of the active safety system under critical conditions. Finally, this compensation mechanism enables the intelligent driving system to maintain robustness in perception and control in extreme scenarios, providing crucial technical support for the implementation of all-weather, all-condition autonomous driving functions.

[0146] This invention discloses an image-based vehicle pitch angle compensation method for critical conditions. By acquiring real-time visual images and motion state signals of the vehicle's front, it dynamically delineates a region of interest (ROI) containing the road vanishing point. Local texture direction estimation and voting calculations are performed on pixels within this ROI to accurately locate the vanishing point. Kalman filtering is then used to optimally estimate the vanishing point's location. Based on the estimation result, the camera pitch angle is calculated, and a pre-calibrated installation deviation is subtracted to obtain the vehicle's actual pitch angle, which is then output to the control system for dynamic compensation. This method effectively solves the technical problem of visual perception geometric reference destruction and environmental perception failure caused by severe vehicle pitch in critical conditions. It achieves integrated modeling from multi-source data fusion and adaptive region localization to accurate pitch angle estimation and real-time compensation, significantly improving the accuracy of environmental perception and control robustness of intelligent driving systems under extreme conditions, and enhancing the engineering applicability of active safety functions in complex dynamic scenarios.

[0147] Example 2 To achieve the above invention, embodiments of the present invention also provide another image-based vehicle pitch angle compensation method for critical situations. As shown in Figure 2, the method includes the following steps. In one embodiment of the present invention, a visual image of the front of the vehicle and vehicle signals are acquired.

[0148] Specifically, the visual images of the front of the vehicle are captured using an onboard camera, with an image resolution of [resolution missing]. The camera is positioned below the rearview mirror inside the windshield and captures images at a frequency of 30 frames per second. The camera angle is adjusted so that the lowest point of the field of view is at the edge of the hood and the road surface occupies more than three-fifths of the entire image, ensuring that the road area in the captured image extends more than 50 meters ahead. The vehicle's CAN bus is used to acquire signals reflecting its motion status in real time, such as vehicle speed, longitudinal acceleration, and lateral acceleration. The captured images are transmitted to the onboard industrial control computer via USB, and vehicle signals are transmitted to the industrial control computer via the CAN interface. The industrial control computer timestamps each frame of the image using its system time and subsequently processes the images in chronological order.
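One way to use the frame timestamps is to pair each image with the CAN sample closest to it in time. The following is only a sketch under that assumption; the embodiment does not state how images and vehicle signals are associated, and the function name and data layout are hypothetical.

```python
from bisect import bisect_left


def nearest_signal(frame_ts, signal_ts, signals):
    """Return the vehicle signal whose timestamp is closest to the frame timestamp.

    signal_ts must be sorted in ascending order and aligned with signals.
    """
    i = bisect_left(signal_ts, frame_ts)
    if i == 0:
        return signals[0]
    if i == len(signal_ts):
        return signals[-1]
    before, after = signal_ts[i - 1], signal_ts[i]
    # Choose the later sample only when it is strictly closer.
    return signals[i] if after - frame_ts < frame_ts - before else signals[i - 1]
```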

[0149] In one embodiment of the present invention, the region of interest at the vanishing point of the road is determined based on vehicle signals.

[0150] Specifically, to improve computational efficiency and avoid performing calculations on the entire image, a region near the horizon close to the vanishing point of the road is defined as a region of interest. The position and size of this region are not fixed, but are dynamically adjusted according to real-time vehicle speed and acceleration signals, taking the initially calibrated horizon position of the vehicle as the reference. Dense texture direction calculations are performed only within this region, significantly reducing the number of pixels to be processed.
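The embodiment does not give the rule that maps vehicle signals to the region position and size, so the following is purely an illustrative sketch. All gains, the base half-height, and the minimum speed are hypothetical tuning values. It assumes that braking (negative longitudinal acceleration) pitches the nose down and moves the vanishing point toward smaller row indices.

```python
import math


def dynamic_roi(horizon_row, image_h, image_w, speed_mps, accel_long, accel_lat,
                base_half_height=40.0, shift_px_per_mps2=6.0,
                grow_px_per_mps2=8.0, min_speed_mps=1.0):
    """Return (top, bottom, left, right) of a full-width strip around the horizon."""
    if speed_mps < min_speed_mps:
        # Near standstill: keep the region at its calibrated position (assumption).
        accel_long = accel_lat = 0.0
    # Shift the strip in the direction the vanishing point is expected to move.
    center = horizon_row + shift_px_per_mps2 * accel_long
    # Enlarge the strip as the motion becomes more violent.
    half = base_half_height + grow_px_per_mps2 * math.hypot(accel_long, accel_lat)
    top = max(0, int(round(center - half)))
    bottom = min(image_h, int(round(center + half)))
    return top, bottom, 0, image_w
```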

[0151] In one embodiment of the present invention, the local dominant orientation of pixels in the region of interest at the road vanishing point is estimated.

[0152] For any pixel of the region-of-interest image, Gabor convolution is performed at four specific angles (0, [angle missing], [angle missing], [angle missing]): [equation (1) missing]. The real and imaginary parts of the convolution result are then used to calculate the Gabor energy response: [equation (2) missing]. To enhance direction selectivity and suppress noise, an orthogonal direction competition strategy is adopted: the four directions are divided into two orthogonal pairs, and the significance intensity of each pair is calculated: [equation (3) missing]. The angle corresponding to the greater significance intensity is taken as the dominant direction, defined as: [equation (4) missing]. The local dominant direction is then estimated using vector composition: [equation (5) missing].

In one embodiment of the present invention, road vanishing point voting is calculated.
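Equations (1) to (5) are not legible in this text, so the following is only a sketch of one standard way to realise the described steps, not necessarily the formulas of the embodiment. It assumes the four angles are 0, π/4, π/2, and 3π/4, takes the Gabor energy as the magnitude of the complex response, uses the signed energy difference within each orthogonal pair as the competition result, and combines the two pairs by double-angle vector composition. All function names are hypothetical.

```python
import math

# Assumed filter angles; only the first value (0) is legible in the text.
ANGLES = (0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4)


def gabor_energy(real_part, imag_part):
    """Energy of a complex Gabor response, taken here as its magnitude."""
    return math.hypot(real_part, imag_part)


def dominant_direction(energies):
    """Estimate the local dominant direction in [0, pi) from four Gabor energies.

    energies holds the responses at 0, pi/4, pi/2 and 3*pi/4, in that order.
    Returns None when no direction dominates.
    """
    e0, e45, e90, e135 = energies
    s_axis = e0 - e90    # competition within the pair (0, pi/2)
    s_diag = e45 - e135  # competition within the pair (pi/4, 3*pi/4)
    if s_axis == 0.0 and s_diag == 0.0:
        return None
    # Double-angle vector composition: the pairs act as orthogonal components.
    return (0.5 * math.atan2(s_diag, s_axis)) % math.pi
```

When two neighbouring filters respond equally, for example at 0 and π/4, this composition returns the intermediate angle π/8, which is the interpolation effect that vector composition is meant to provide.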

[0153] With a pixel of the region of interest as the origin, a ray is defined along its dominant direction. The contribution of this pixel to the vanishing point vote of each pixel on the ray decays exponentially with the Euclidean distance between them, and is calculated as follows: [equation (6) missing]. In this formula, one parameter is the maximum distance from the origin pixel to the intersection of its ray with the image boundary of the region of interest, and the other is a fixed constant greater than 0 that controls the degree of decay.

[0154] The contributions of all pixels are accumulated to calculate the vanishing point vote: [equation (7) missing]. The pixel with the most votes in the voting matrix is selected as the vanishing point position.
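The voting step can be sketched as follows. Because equations (6) and (7) are not legible in this text, the decay weight exp(-d / (sigma * d_max)) is an assumed form that merely respects the description: it decreases with the Euclidean distance d, is normalised by the maximum ray length d_max, and is controlled by a positive constant sigma. The function names, the unit-step ray walk, and the angle convention (measured from the +x axis with rows increasing downward) are also assumptions.

```python
import math


def cast_votes(width, height, seeds, sigma=0.5):
    """Accumulate distance-weighted votes along each seed's dominant-direction ray.

    seeds is an iterable of (x, y, angle_rad) tuples.
    """
    votes = [[0.0] * width for _ in range(height)]
    for x0, y0, angle in seeds:
        dx, dy = math.cos(angle), math.sin(angle)
        cells = []
        step = 1
        while True:  # walk the ray in unit steps until it leaves the region
            cx = int(round(x0 + step * dx))
            cy = int(round(y0 + step * dy))
            if not (0 <= cx < width and 0 <= cy < height):
                break
            if not cells or cells[-1] != (cx, cy):  # avoid double counting a cell
                cells.append((cx, cy))
            step += 1
        if not cells:
            continue
        d_max = math.hypot(cells[-1][0] - x0, cells[-1][1] - y0)
        for cx, cy in cells:
            d = math.hypot(cx - x0, cy - y0)
            votes[cy][cx] += math.exp(-d / (sigma * d_max))
    return votes


def vote_peak(votes):
    """Return the (x, y) cell with the largest accumulated vote."""
    best = max((v, x, y) for y, row in enumerate(votes) for x, v in enumerate(row))
    return best[1], best[2]
```

In this sketch a cell crossed by several rays accumulates several contributions, which is what lets the intersection of the dominant directions emerge as the peak.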

[0155] In one embodiment of the present invention, camera pitch angle estimation and vehicle pitch angle compensation are based on Kalman filtering.

[0156] Specifically, due to road surface texture noise or changes in lighting, the vanishing point location may exhibit unreasonable high-frequency jumps between consecutive frames, so the true state of the vanishing point must be optimally estimated. A state vector is defined [state vector definition missing], which contains the image coordinates of the vanishing point and two components representing the speed at which the vanishing point moves on the image plane.

[0157] Based on the uniform motion model, the state of the current frame is predicted: [equation (8) missing]. Here the state transition matrix F and the process noise covariance Q are defined as: [equations (9) and (10) missing], where the two parameters are the image acquisition interval and the process noise standard deviation.
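Equations (8) to (10) are not legible in this text. For orientation only, the standard constant-velocity prediction is given below. The state ordering $(u, v, \dot u, \dot v)$, the symbols $\Delta t$ and $\sigma$ for the image acquisition interval and the process noise standard deviation, and in particular the white-noise-acceleration form of $Q$ are assumptions; the embodiment may define $Q$ differently.

```latex
\mathbf{x}_k = \begin{bmatrix} u_k & v_k & \dot u_k & \dot v_k \end{bmatrix}^{\mathsf T},
\qquad
\hat{\mathbf{x}}_{k|k-1} = F\,\hat{\mathbf{x}}_{k-1|k-1},
\qquad
P_{k|k-1} = F\,P_{k-1|k-1}\,F^{\mathsf T} + Q

F = \begin{bmatrix}
1 & 0 & \Delta t & 0 \\
0 & 1 & 0 & \Delta t \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix},
\qquad
Q = \sigma^{2}
\begin{bmatrix}
\tfrac{\Delta t^{4}}{4} & 0 & \tfrac{\Delta t^{3}}{2} & 0 \\
0 & \tfrac{\Delta t^{4}}{4} & 0 & \tfrac{\Delta t^{3}}{2} \\
\tfrac{\Delta t^{3}}{2} & 0 & \Delta t^{2} & 0 \\
0 & \tfrac{\Delta t^{3}}{2} & 0 & \Delta t^{2}
\end{bmatrix}
```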

[0158] The vanishing point coordinates obtained from the road vanishing point voting calculation are used as the observation values, and the Kalman gain is calculated: [equation (11) missing]. Here the observation matrix H and the observation noise covariance R are defined as: [equations (12) and (14) missing], where the parameters are a scaling constant, a minimum value, and the peak signal-to-noise ratio of the voting matrix. The state estimate is then updated: [equation (15) missing], where I is an identity matrix.
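The prediction and update cycle can be sketched with a reduced filter for a single image axis. This is a simplification made for brevity: the state is [position, velocity] for the vanishing point ordinate only, whereas the embodiment uses a state vector covering both image coordinates. The function names, the white-noise-acceleration process noise, and the numeric values in the usage lines are assumptions.

```python
def predict(x, P, dt, sigma_a):
    """Constant-velocity prediction for x = [position, velocity] and 2x2 covariance P."""
    pos, vel = x
    q = sigma_a ** 2
    x_pred = [pos + dt * vel, vel]
    # P_pred = F P F^T + Q with F = [[1, dt], [0, 1]].
    P_pred = [
        [P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q * dt ** 4 / 4,
         P[0][1] + dt * P[1][1] + q * dt ** 3 / 2],
        [P[1][0] + dt * P[1][1] + q * dt ** 3 / 2,
         P[1][1] + q * dt ** 2],
    ]
    return x_pred, P_pred


def update(x_pred, P_pred, z, r):
    """Measurement update with H = [1, 0], so only the position is observed."""
    s = P_pred[0][0] + r            # innovation variance H P H^T + R
    k0 = P_pred[0][0] / s           # gain on position
    k1 = P_pred[1][0] / s           # gain on velocity
    innovation = z - x_pred[0]
    x = [x_pred[0] + k0 * innovation, x_pred[1] + k1 * innovation]
    # P = (I - K H) P_pred
    P = [
        [(1 - k0) * P_pred[0][0], (1 - k0) * P_pred[0][1]],
        [P_pred[1][0] - k1 * P_pred[0][0], P_pred[1][1] - k1 * P_pred[0][1]],
    ]
    return x, P, k0


# Hypothetical frame: prior row 100 px, voted row 110 px, confident observation.
x_pred, P_pred = predict([100.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], dt=1 / 30, sigma_a=0.0)
x_new, P_new, gain = update(x_pred, P_pred, z=110.0, r=0.1)
```

Passing a larger r for a low-confidence frame lowers the gain, so the estimate stays closer to the prediction.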

[0159] After filtering, the optimal position estimate is extracted from the state vector as the final vanishing point of the camera. With F defined as the camera focal length and H as the height of the photosensitive element, the aperture angle of the camera in the vertical direction is: [equation (16) missing]. Given the image size, the camera pitch angle is: [equation (17) missing]. As shown in Figure 3, assuming the camera is rigidly connected to the vehicle, the relative pitch angle difference between them is obtained during camera installation and calibration, and the vehicle pitch angle based on image compensation can be expressed as: [equation (18) missing]. Compared with existing technologies, this invention does not rely on inertial measurement units and can achieve pitch angle estimation using only existing forward-looking cameras. It is also specifically optimized for critical conditions. By detecting the vanishing point of the road in the image, it estimates and compensates for the vehicle's pitch angle, which can effectively suppress the degradation of perception performance caused by violent vehicle movement and significantly improve the active safety capabilities of intelligent vehicles under extreme conditions.
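Equations (16) to (18) are not legible in this text. The relations below are a sketch derived from the pinhole model, not necessarily the exact formulas of the embodiment. Here $F$ and $H$ are the focal length and sensor height in the same physical unit, as defined in this paragraph (distinct from the Kalman matrices of the same letters). The symbols $\alpha$ (vertical aperture angle), $h_{\mathrm{img}}$ (image height in pixels), $v$ (estimated vanishing point row), $\theta_{\mathrm{cam}}$, $\theta_{\mathrm{veh}}$, and $\Delta\theta$ (installation deviation) are assumed names, and the sign convention takes rows increasing downward with positive pitch nose-down.

```latex
\alpha = 2\arctan\!\left(\frac{H}{2F}\right)

\theta_{\mathrm{cam}}
  = \arctan\!\left(\frac{\left(\tfrac{h_{\mathrm{img}}}{2} - v\right) H}{h_{\mathrm{img}}\,F}\right)
  = \arctan\!\left(\left(1 - \frac{2v}{h_{\mathrm{img}}}\right)\tan\frac{\alpha}{2}\right)

\theta_{\mathrm{veh}} = \theta_{\mathrm{cam}} - \Delta\theta
```

The second form follows because the focal length in pixels equals $h_{\mathrm{img}} F / H$, so the row offset divided by it reduces to a fraction of $\tan(\alpha/2)$.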

[0160] Another embodiment of the present invention provides an image-based vehicle pitch angle compensation method for critical conditions. This method acquires real-time visual images and motion state signals from the vehicle's front, dynamically delineates a region of interest (ROI) containing the road vanishing point, and extracts the dominant texture direction of pixels within the ROI for voting calculation to accurately locate the vanishing point. Kalman filtering is then used to optimally estimate the vanishing point location. Based on the estimation result, the camera pitch angle is calculated and a pre-calibrated installation deviation is subtracted to obtain the actual vehicle pitch angle, which is then output to the control system for dynamic compensation. This method effectively solves the technical problem of visual perception geometric reference destruction and environmental perception failure caused by severe vehicle pitch in critical conditions. It achieves integrated modeling from multi-source data fusion and adaptive region localization to accurate pitch angle estimation and real-time compensation, significantly improving the accuracy of environmental perception and control robustness of the intelligent driving system under extreme conditions, and enhancing the engineering applicability of active safety functions in complex dynamic scenarios.

[0161] Example 3 To achieve the above invention, as shown in Figure 4, this embodiment also provides an image-based vehicle pitch angle compensation device 10 for critical situations, the device 10 comprising: a multi-source signal synchronous acquisition module 100, used to acquire visual images of the front of the vehicle and vehicle motion status signals; a dynamic region of interest adaptive localization module 200, used to determine the dynamic region of interest containing the road vanishing point in the visual image based on the vehicle motion state signal; a texture dominance direction estimation module 300, used to estimate the local dominance direction of the pixels in the dynamic region of interest to obtain the texture dominance direction corresponding to each pixel; a vanishing point voting calculation and positioning module 400, used to perform voting calculations based on the dominant texture direction of each pixel to determine the location of the road vanishing point in the visual image; and a pitch angle estimation and dynamic compensation module 500, used to make an optimal estimate of the vanishing point position of the road based on Kalman filtering, calculate the camera pitch angle based on the coordinates of the vanishing point obtained by the optimal estimate, and then compensate the vehicle pitch angle.

[0162] This invention discloses an image-based vehicle pitch angle compensation device for critical operating conditions. By incorporating a multi-source signal synchronous acquisition module, a dynamic region of interest adaptive localization module, a texture-dominant direction estimation module, a vanishing point voting calculation and localization module, and a pitch angle estimation and dynamic compensation module, an integrated processing flow is constructed from data acquisition, region localization, direction estimation, vanishing point detection, and pitch angle compensation. This device effectively solves the technical problem of visual perception geometric reference destruction and environmental perception failure caused by severe vehicle pitch in critical operating conditions. It significantly improves the accuracy of environmental perception and control robustness of intelligent driving systems under extreme conditions, and enhances the engineering applicability and reliability of active safety functions in complex dynamic scenarios.

[0163] To implement the methods of the above embodiments, the present invention also provides a computer device. As shown in Figure 5, the computer device 600 includes a memory 601 and a processor 602, wherein the processor 602 reads the executable program code stored in the memory 601 to run a program corresponding to the executable program code, so as to implement the various steps of the image-based vehicle pitch angle compensation method under critical conditions described above.

[0164] To implement the above embodiments, this application also proposes a non-transitory computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements an image-based vehicle pitch angle compensation method under critical conditions as described in the foregoing embodiments.

[0165] In the description of this specification, the references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., refer to specific features, structures, materials, or characteristics described in connection with that embodiment or example, which are included in at least one embodiment or example of the present invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.

[0166] Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one of that feature. In the description of this invention, "a plurality of" means at least two, such as two, three, etc., unless otherwise explicitly specified.

Claims

1. A vehicle pitch angle compensation method based on images under critical operating conditions, characterized in that it includes: S1, acquiring visual images of the front of the vehicle and vehicle motion status signals; S2, determining, based on the vehicle motion state signal, the dynamic region of interest containing the road vanishing point in the visual image; S3, performing local dominant direction estimation on the pixels in the dynamic region of interest to obtain the texture dominant direction corresponding to each pixel; S4, voting, based on the dominant texture direction of each pixel, to determine the location of the road vanishing point in the visual image; S5, optimally estimating the vanishing point of the road based on Kalman filtering, and calculating the camera's pitch angle based on the coordinates of the vanishing point obtained from the optimal estimation, thereby compensating for the vehicle's pitch angle.

2. The method as described in claim 1, characterized in that, The acquisition of the visual image in front of the vehicle and the vehicle's motion state signal includes: S11, real-time acquisition of vehicle motion status signals, including vehicle speed, longitudinal acceleration and lateral acceleration, which are read through the vehicle CAN bus; S12, real-time acquisition of visual images in front of the vehicle. The visual images are acquired by a vehicle-mounted camera fixedly installed on the inside of the windshield at a preset frequency. The image resolution is fixed. The installation position and viewing angle of the vehicle-mounted camera are pre-adjusted so that the proportion of the road area in the acquired image is greater than a set threshold, and the longitudinal extension distance of the road area in the image exceeds a preset number of meters. S13, the visual image and the vehicle motion status signal are synchronously transmitted to the on-board industrial control computer, and a timestamp based on the system time of the industrial control computer is added to each frame of the image so that the images can be processed in chronological order later.

3. The method as described in claim 1, characterized in that the step of determining the dynamic region of interest containing the road vanishing point in the visual image based on the vehicle motion state signal includes: S21, obtaining the reference position of the horizon on the image plane in the initial calibration state of the vehicle, and using it as the longitudinal reference of the dynamic region of interest; S22, based on the real-time acquired vehicle speed, longitudinal acceleration, and lateral acceleration signals, dynamically adjusting the longitudinal position and size range of the dynamic region of interest in the image, wherein the longitudinal position is adaptively translated according to the change of the vehicle's pitch attitude, and the size range is adaptively scaled according to the intensity of the vehicle's movement; S23, using the dynamically adjusted vertical position as the center, defining a strip-shaped region of preset width in the image as the dynamic region of interest, wherein local dominant orientation estimation of pixels is subsequently performed only within this region to reduce the computational load.

4. The method as described in claim 1, characterized in that the step of estimating the local dominant orientation of pixels within the dynamic region of interest to obtain the texture dominant orientation corresponding to each pixel includes: S31, for each pixel in the dynamic region of interest, performing a convolution operation using Gabor filters with at least four preset directional angles to obtain the Gabor energy response value corresponding to each direction; S32, dividing the at least four preset direction angles into several groups of orthogonal direction pairs, and performing a competitive suppression operation on the two energy response values in each group of orthogonal direction pairs to obtain the significant intensity of the pixel in the current direction pair; S33, selecting the angle corresponding to the maximum value of the significant intensity in all direction pairs as the initial dominant direction of the pixel, and using the vector synthesis method to optimize the neighborhood consistency of the initial dominant direction to obtain the final estimated texture dominant direction.

5. The method as described in claim 1, characterized in that the step of determining the vanishing point location of the road in the visual image by voting based on the dominant texture direction of each pixel includes: S41, taking each pixel in the dynamic region of interest as the starting point of voting, and generating a ray pointing to the image boundary along its corresponding dominant texture direction to establish the voting support relationship between the pixel and the vanishing point position; S42, calculating the Euclidean distance between each candidate pixel on the ray and the starting point of the vote, and performing exponential decay weighting based on the Euclidean distance to obtain the voting contribution value of the starting point of the vote to each candidate pixel on the ray, wherein the closer the distance, the greater the contribution, and the farther the distance, the smaller the contribution; S43, accumulating the voting contribution values of all pixels on the image plane to construct a two-dimensional voting accumulation matrix corresponding to the dynamic region of interest, and selecting the pixel coordinates corresponding to the voting peak in the two-dimensional voting accumulation matrix as the observation position of the road vanishing point in the current frame visual image.

6. The method as described in claim 1, characterized in that, The process of optimally estimating the vanishing point location based on Kalman filtering, calculating the camera's pitch angle based on the optimally estimated vanishing point coordinates, and then compensating for the vehicle's pitch angle includes: S51, construct a state vector containing the image coordinates of the vanishing point and the motion velocity on the image plane, and perform Kalman filtering to predict the state of the vanishing point of the current frame based on the uniform motion model to obtain the predicted state and the corresponding covariance. S52, the observed location of the road vanishing point calculated by voting is used as the measurement input. The observation noise covariance is dynamically adjusted according to the confidence of the voting results. The Kalman gain is calculated in combination with the predicted state, and the state vector is updated. The updated vanishing point image coordinates are extracted as the optimal estimated location. S53 calculates the current pitch angle of the camera based on the longitudinal coordinate of the vanishing point in the optimal estimated position, combined with the camera's focal length, sensor height, and image size. Then, using the pre-calibrated fixed pitch angle deviation between the camera and the vehicle, it determines the vehicle's actual pitch angle and outputs this actual pitch angle to the vehicle control system and perception module for compensation.

7. A vehicle pitch angle compensation device based on images under critical conditions, characterized in that it includes: a multi-source signal synchronous acquisition module, used to acquire visual images of the front of the vehicle and vehicle motion status signals; a dynamic region of interest adaptive localization module, used to determine a dynamic region of interest containing the vanishing point of the road in the visual image based on the vehicle motion state signal; a texture dominance direction estimation module, used to estimate the local dominance direction of pixels in the dynamic region of interest to obtain the texture dominance direction corresponding to each pixel; a vanishing point voting calculation and localization module, used to perform voting calculations based on the dominant texture direction of each pixel to determine the location of the road vanishing point in the visual image; and a pitch angle estimation and dynamic compensation module, used to make an optimal estimate of the vanishing point position of the road based on Kalman filtering, calculate the camera pitch angle based on the coordinates of the vanishing point obtained by the optimal estimate, and then compensate for the vehicle pitch angle.

8. An electronic device, comprising: a processor; and a memory storing executable instructions, wherein when the processor executes the instructions, it implements the image-based vehicle pitch angle compensation method for critical conditions as described in any one of claims 1-6.

9. A computer-readable storage medium storing a computer program, which, when executed by a processor, implements an image-based vehicle pitch angle compensation method for critical conditions as described in any one of claims 1-6.