A trajectory prediction method, system, and medium based on the fusion of multimodal results

By employing multimodal data fusion and trajectory prediction methods, the problems of high computational cost and uninterpretable detection in deep learning for autonomous driving are solved, achieving efficient and interpretable multi-sensor data fusion and vehicle trajectory prediction.

CN117994289B · Active · Publication Date: 2026-06-19 · 东风悦享科技有限公司

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: 东风悦享科技有限公司
Filing Date: 2024-01-25
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing deep learning technologies in autonomous driving suffer from high computational cost, uninterpretable detection results, and black-box issues, making it difficult to effectively integrate multi-sensor data, resulting in insufficient detection accuracy and prediction reliability.

Method used

By employing a multimodal result fusion method, data is acquired using vehicle-mounted cameras, LiDAR, and millimeter-wave radar. The data is then timestamped and aligned. Combined with the multimodal result fusion algorithm and obstacle-map lane matching, a multi-trajectory prediction model is used to predict vehicle trajectories.

Benefits of technology

The detection results are optimized while meeting cost control requirements; detection efficiency and prediction effectiveness improve, and the prediction results become more interpretable.



Abstract

This invention relates to a trajectory prediction method, system, and medium based on fused multimodal results. The method includes: Q1. A vehicle travels on a road, acquiring real-time image data of the road using an onboard camera, real-time point cloud data of a target obstacle using an onboard LiDAR, and real-time position and velocity data of the target obstacle using an onboard millimeter-wave radar; Q2. Based on the image data of the road, the point cloud data of the target obstacle, and the position and velocity data of the target obstacle, performing timestamp calibration and data alignment processing to obtain processed image data of the road. This invention not only significantly optimizes the detection results by assigning the advantageous attributes of various sensors to the reference position of the target object through multiple matching, but also ensures detection efficiency, meets cost control requirements, and guarantees detection effect, thus integrating the advantages of various sensors.

Description

Technical Field

[0001] This invention relates to the field of autonomous driving technology, and in particular to a trajectory prediction method, system, and medium based on the fusion of multimodal results.

Background Technology

[0002] Multimodal target detection technology refers to using sensors with multiple modalities and specific fusion strategies or models to detect targets. This results in a fusion of information from multiple modalities, improving the efficiency and accuracy of target detection and outputting the best and most stable target results. It mainly consists of three steps: First, the raw data from multiple sensors scanning the surrounding environment is preprocessed. The target detection technology corresponding to each sensor is used to detect the preprocessed data, obtaining the detection results from multiple target detection modules. Second, the results obtained in the first step are further preprocessed in the fusion module to obtain the target detection results at the same time within a defined range. Third, a pre-set graph neural network model is used to fuse and track the target detection data, identifying the same target object, thus achieving global information fusion, increasing perception capabilities, and enabling more accurate target identification.

[0003] Trajectory prediction technology refers to the technology used by autonomous vehicles to predict the future states of nearby traffic participants, similar to the predictive driving ability of human drivers. Specifically, it can be described as using the historical states of traffic participants to predict their future states within a given scenario. Historical states can be obtained from both the vehicle and roadside, and state information generally includes the position, speed, acceleration, and orientation of traffic participants. Scenario information generally includes vehicle kinematics (or dynamics), maps, and information about interactions between traffic participants. Rapidly developing deep learning technology is also gaining significant traction in the prediction field and has become a mainstream research direction, reaching state-of-the-art (SOTA) levels. Compared to physics-based and machine learning-based methods, deep learning-based methods can predict states over longer time periods.

[0004] Existing deep learning technologies present two main problems. First, cost is a critical consideration for the practical deployment of autonomous vehicles, which necessitates careful management of the vehicle's controller, the "brain" of the system. Deploying and running neural networks consumes significant computing power and competes with upstream detection and other modules for resources. Second, neural networks are inherently poorly interpretable, and this is especially problematic for trajectory prediction, which is itself subjective. The more uninterpretable neural network models an autonomous vehicle uses, the more likely it is that unpredictable black-box problems will be introduced.

Summary of the Invention

[0005] In view of the above problems, the present invention provides a trajectory prediction method, system and medium based on the fusion of multimodal results. It not only assigns the advantageous attributes of various sensors to the reference position of the target object through multiple matching, which greatly optimizes the detection results, but also ensures the detection efficiency, meets the requirements of cost control, and ensures the detection effect, thus combining the advantages of various sensors.

[0006] To achieve the above and other related objectives, the present invention provides the following technical solution:

[0007] A trajectory prediction method based on the fusion of multimodal results, the method comprising:

[0008] Q1. When a vehicle is driving on the road, it acquires real-time image data of the road based on the vehicle-mounted camera, real-time point cloud data of the target obstacle based on the vehicle-mounted lidar, and real-time position and speed data of the target obstacle based on the vehicle-mounted millimeter-wave radar.

[0009] Q2. Based on the image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, perform timestamp calibration and data alignment processing to obtain the processed image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle;

[0010] Q3. Based on the processed image data of the road, the point cloud data of the target obstacle, and the position and velocity data of the target obstacle, a multimodal result fusion algorithm is used to fuse the data to obtain the fused data information of the target obstacle;

[0011] Q4. Input the fused target obstacle data into the high-precision map, and use the obstacle and map lane matching algorithm to classify the state of the target obstacle to obtain the target obstacle state classification data information;

[0012] Q5. Based on the state classification data of the target obstacle, the vehicle's trajectory is predicted using a multi-trajectory prediction model algorithm to obtain the vehicle's trajectory prediction data.
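
Steps Q1 to Q5 form a linear pipeline. The sketch below shows one way to chain them, assuming each stage is supplied by the caller as a function; the class name `TrajectoryPipeline` and all stage signatures are hypothetical and are not defined in the patent.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TrajectoryPipeline:
    """Chains steps Q2-Q5; every stage is a caller-supplied, hypothetical callable."""
    align: Callable[[Any, Any, Any], tuple]   # Q2: timestamp calibration + alignment
    fuse: Callable[[Any, Any, Any], Any]      # Q3: multimodal result fusion
    classify: Callable[[Any], int]            # Q4: obstacle/map-lane matching -> state class
    predictors: dict                          # Q5: state class -> prediction model

    def run(self, image, point_cloud, radar_state):
        # Q1 is the acquisition itself: the three raw sensor inputs arrive as arguments.
        image, point_cloud, radar_state = self.align(image, point_cloud, radar_state)
        fused = self.fuse(image, point_cloud, radar_state)
        state_class = self.classify(fused)
        # Pick the prediction model that matches the obstacle's state class.
        return state_class, self.predictors[state_class](fused)
```

Keeping the predictors in a table keyed by state class mirrors the idea in Q5 that the obstacle's classified state decides which model produces the trajectory.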

[0013] Furthermore, in step Q2, the timestamp calibration and data alignment process includes:

[0014] Q21. Based on the image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, noise reduction processing is performed to obtain the noise-reduced image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle;

[0015] Q22. Based on the image data information of the road after noise reduction, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, a timestamp calibration algorithm is used to process the data to obtain the image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle after timestamp calibration;

[0016] Q23. Based on the image data information of the road after the timestamp calibration, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, the alignment processing is performed according to the different time points to obtain the processed image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle.

[0017] Furthermore, the alignment process based on different time points involves aligning the image data of the road and the position and velocity data of the target obstacle according to the same time point, based on the point cloud data information of the target obstacle after the timestamp is calibrated.
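
The text states that the LiDAR point cloud timestamps serve as the reference for alignment but does not give the matching rule. A minimal sketch, assuming nearest-timestamp matching with a tolerance; the function names and the `max_gap` parameter are illustrative assumptions.

```python
from bisect import bisect_left


def nearest_index(timestamps, t):
    """Index of the entry in sorted `timestamps` closest to t."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour is closer in time (ties go to the earlier one).
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def align_to_lidar(lidar_ts, camera_ts, radar_ts, max_gap=0.05):
    """For each LiDAR timestamp, pair the nearest camera and radar sample indices.

    max_gap (seconds) is an assumed tolerance; LiDAR frames without a close
    enough camera or radar sample are dropped.
    """
    aligned = []
    for k, t in enumerate(lidar_ts):
        c = nearest_index(camera_ts, t)
        r = nearest_index(radar_ts, t)
        if abs(camera_ts[c] - t) <= max_gap and abs(radar_ts[r] - t) <= max_gap:
            aligned.append((k, c, r))
    return aligned
```

Each returned tuple holds the LiDAR frame index followed by the indices of the camera and radar samples matched to it.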

[0018] Furthermore, in step Q3, the data fusion using a multimodal result fusion algorithm includes:

[0019] Q31. Based on the processed road image data and target obstacle point cloud data, establish a first fusion function G for the multimodal results.

[0020]

[0021] Where A is the transpose matrix, M1 is the processed image data of the road, M2 is the processed point cloud data of the target obstacle, and α and β are weight coefficients; this yields the first fused data information of the multimodal result.

[0022] Q32. Based on the processed point cloud data and position / velocity data of the target obstacle, establish a second fusion function H for the multimodal results.

[0023] H=∫∫(ρ1M2+ρ2M3)dM2dM3+exp(M2·M3),

[0024] Where M2 is the processed point cloud data information of the target obstacle, M3 is the processed position and velocity data information of the target obstacle, ρ1 is the fusion factor of the processed point cloud data information of the target obstacle, and ρ2 is the fusion factor of the processed position and velocity data information of the target obstacle; this yields the second fused data information of the multimodal result.
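
As a worked illustration only, the second fusion function H can be evaluated in closed form if M2 and M3 are treated as scalar stand-ins (an assumption; the text defines them as matrices) and the integration constants are taken as zero (also an assumption):

```latex
\begin{aligned}
\iint (\rho_1 M_2 + \rho_2 M_3)\, dM_2\, dM_3
  &= \int \left( \tfrac{1}{2}\rho_1 M_2^2 + \rho_2 M_2 M_3 \right) dM_3 \\
  &= \tfrac{1}{2}\rho_1 M_2^2 M_3 + \tfrac{1}{2}\rho_2 M_2 M_3^2, \\
H &= \tfrac{1}{2} M_2 M_3 \left( \rho_1 M_2 + \rho_2 M_3 \right) + \exp(M_2 M_3).
\end{aligned}
```

With illustrative values ρ1 = 0.6, ρ2 = 0.4, M2 = 1 and M3 = 2 (chosen for the example, not taken from the patent), this gives H = 1.4 + e² ≈ 8.79.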

[0025] Q33. Based on the second fused data information of the multimodal results and the first fused data information of the multimodal results, establish a fusion function W for the multimodal results.

[0026]

[0027] Where g is the second fused data information of the multimodal results, h is the first fused data information of the multimodal results, and λ1 and λ2 are the fusion parameters of the multimodal results. The data information of the target obstacle is fused to obtain the fused data information of the target obstacle.

[0028] Furthermore, the constraints on the weighting coefficients α and β are as follows:

[0029]

[0030] The constraints for the fusion factor ρ1 of the processed point cloud data information of the target obstacle and the fusion factor ρ2 of the processed position and velocity data information of the target obstacle are as follows:

[0031]

[0032] The fusion parameter λ1 of the multimodal results takes values in the range of (0,1), and the fusion parameter λ2 of the multimodal results takes values in the range of (0,1).

[0033] Furthermore, the processed position and velocity data of the target obstacle is a state matrix data information of the position and velocity of the target obstacle, the processed image data information of the road is a matrix data information of the road image, and the processed point cloud data information of the target obstacle is a point cloud matrix data information of the target obstacle.

[0034] Furthermore, in step Q4, the classification of the state of the target obstacle using the obstacle-map lane matching algorithm includes:

[0035] Q41. Input the fused target obstacle data into a high-precision map, and establish an obstacle-map lane matching function P.

[0036]

[0037] Where n is the sample size, x_i is the fused target obstacle data, and y_i is the data of the map lane points in the high-precision map; this yields the matching degree data information between the obstacle and the map lane;

[0038] Q42. Based on the matching degree data information between the obstacle and the map lane, preset thresholds k1, k2 and k3 are set. If the matching degree data information between the obstacle and the map lane is less than k1, it is a first type of target obstacle. If the matching degree data information between the obstacle and the map lane is between k1 and k2, it is a second type of target obstacle. If the matching degree data information between the obstacle and the map lane is between k2 and k3, it is a third type of target obstacle. If the matching degree data information between the obstacle and the map lane is greater than k3, it is a fourth type of target obstacle.

[0039] Q43. Based on the states of the first type of target obstacle, the second type of target obstacle, the third type of target obstacle, and the fourth type of target obstacle, obtain the state classification data information of the target obstacle.
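
The threshold rule in Q42 can be sketched as a small function. The ordering k1 < k2 < k3 is implied by the text; how values exactly equal to a threshold are assigned is not stated, so the boundary handling below is an assumption, as is the function name.

```python
def classify_obstacle(match_degree, k1, k2, k3):
    """Map a matching degree to obstacle type 1-4 using thresholds k1 < k2 < k3.

    Boundary handling is assumed: a value equal to k1 falls in type 2,
    and a value equal to k2 or k3 falls in type 3.
    """
    if not k1 < k2 < k3:
        raise ValueError("thresholds must satisfy k1 < k2 < k3")
    if match_degree < k1:
        return 1  # below k1: first type
    if match_degree < k2:
        return 2  # between k1 and k2: second type
    if match_degree <= k3:
        return 3  # between k2 and k3: third type
    return 4      # above k3: fourth type
```

For example, with thresholds 0.5, 1.0 and 2.0, a matching degree of 0.7 falls in the second type and a matching degree of 3.0 falls in the fourth type.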

[0040] Furthermore, predicting the vehicle's trajectory with the multi-trajectory prediction model algorithm proceeds as follows. If the target obstacle is in the first-type state, it is traveling straight along the current lane, and a bicycle (single-track) kinematic model referenced at the vehicle center is used to predict the vehicle's trajectory. If the target obstacle is in the second-type state, it is changing lanes to an adjacent lane, and a front-wheel-drive bicycle (single-track) kinematic model is used to predict the vehicle's trajectory. If the target obstacle is in the third-type state, it is turning, and an Ackermann steering geometry kinematic model is used to predict the vehicle's trajectory. If the target obstacle is in the fourth-type state, it has failed to match the map, and an LSTM prediction model is used to predict the vehicle's trajectory.
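
The vehicle-centered kinematic model named for the first type is not specified further in the text. A minimal sketch, assuming it corresponds to the standard kinematic bicycle model referenced at the vehicle center and integrated with a simple Euler step; the axle distances `lf` and `lr`, the time step, and the horizon are illustrative values.

```python
import math


def rollout_bicycle(x, y, yaw, speed, steer, lf=1.4, lr=1.4, dt=0.1, steps=30):
    """Euler rollout of a kinematic bicycle model referenced at the vehicle center.

    lf / lr are assumed distances (m) from the center to the front / rear axle;
    speed (m/s) and steering angle (rad) are held constant over the horizon.
    """
    path = []
    for _ in range(steps):
        # Slip angle of the velocity vector at the vehicle center.
        beta = math.atan(lr / (lf + lr) * math.tan(steer))
        x += speed * math.cos(yaw + beta) * dt
        y += speed * math.sin(yaw + beta) * dt
        yaw += speed / lr * math.sin(beta) * dt
        path.append((x, y, yaw))
    return path
```

With a zero steering angle the rollout reduces to straight-line motion along the current heading, which matches the straight-driving case; a nonzero steering angle bends the path toward the steered side.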

[0041] To achieve the above and other related objectives, the present invention also provides a trajectory prediction system based on fused multimodal results, including a computer device programmed or configured to perform the steps of any of the trajectory prediction methods based on fused multimodal results.

[0042] To achieve the above and other related objectives, the present invention also provides a computer-readable storage medium storing a computer program programmed or configured to perform any of the trajectory prediction methods based on fused multimodal results as described above.

[0043] The present invention has the following positive effects:

[0044] 1. This invention obtains processed road image data, target obstacle point cloud data, and target obstacle position and velocity data by performing timestamp calibration and data alignment. It then combines this with a multimodal result fusion algorithm to obtain fused target obstacle data. Through multiple matching operations, the advantageous attributes of various sensors are assigned to the target's reference position, significantly optimizing the detection results. This ensures detection efficiency, meets cost control requirements, and guarantees detection effectiveness, thus integrating the advantages of various sensors.

[0045] 2. This invention classifies the state of target obstacles by using an obstacle and map lane matching algorithm to obtain the state classification data information of the target obstacles. It then combines this with a vehicle multi-trajectory prediction model algorithm to predict the trajectory of the vehicle. By making full use of information such as high-precision maps and combining a time-series prediction LSTM model, the trajectory prediction function of the target obstacles is completed, which greatly ensures the prediction effect. This not only ensures stable prediction, but also enhances the interpretability of the prediction results.

Attached Figure Description

[0046] Figure 1 is a schematic diagram of the method flow of the present invention;

[0047] Figure 2 is a flowchart illustrating the multimodal result fusion algorithm of the present invention;

[0048] Figure 3 is a schematic diagram illustrating the process of classifying the state of target obstacles using the obstacle and map lane matching algorithm of the present invention;

[0049] Figure 4 is a schematic diagram of the target obstacle detection results of the present invention.

Detailed Implementation

[0050] The exemplary embodiments of this disclosure are described below with reference to the accompanying drawings, including various details of the embodiments to aid understanding, and should be considered merely exemplary. Therefore, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of this disclosure. Similarly, for clarity and brevity, descriptions of well-known functions and structures are omitted in the following description.

[0051] Example 1: As shown in Figure 1, a trajectory prediction method based on the fusion of multimodal results is provided, the method comprising:

[0052] Q1. When a vehicle is driving on the road, it acquires real-time image data of the road based on the vehicle-mounted camera, real-time point cloud data of the target obstacle based on the vehicle-mounted lidar, and real-time position and speed data of the target obstacle based on the vehicle-mounted millimeter-wave radar.

[0053] Q2. Based on the image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, perform timestamp calibration and data alignment processing to obtain the processed image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle;

[0054] Q3. Based on the processed image data of the road, the point cloud data of the target obstacle, and the position and velocity data of the target obstacle, a multimodal result fusion algorithm is used to fuse the data to obtain the fused data information of the target obstacle;

[0055] Q4. Input the fused target obstacle data into the high-precision map, and use the obstacle and map lane matching algorithm to classify the state of the target obstacle to obtain the target obstacle state classification data information;

[0056] Q5. Based on the state classification data of the target obstacle, the vehicle's trajectory is predicted using a multi-trajectory prediction model algorithm to obtain the vehicle's trajectory prediction data.

[0057] In this embodiment, step Q2, the timestamp calibration and data alignment process includes:

[0058] Q21. Based on the image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, noise reduction processing is performed to obtain the noise-reduced image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle;

[0059] Q22. Based on the image data information of the road after noise reduction, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, a timestamp calibration algorithm is used to process the data to obtain the image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle after timestamp calibration;

[0060] Q23. Based on the image data information of the road after the timestamp calibration, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, the alignment processing is performed according to the different time points to obtain the processed image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle.

[0061] In this embodiment, the alignment process based on different time points involves aligning the image data of the road and the position and velocity data of the target obstacle with the same time point, based on the point cloud data information of the target obstacle after the timestamp is calibrated.
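
When a radar measurement does not fall exactly on a LiDAR timestamp, one option is to interpolate the radar state between its two neighbouring samples. The patent does not say how samples are brought to the same time point, so linear interpolation and the function name below are assumptions.

```python
def interpolate_state(t, t0, state0, t1, state1):
    """Linearly interpolate a (position, velocity) tuple to time t, with t0 <= t <= t1."""
    if not t0 <= t <= t1 or t0 == t1:
        raise ValueError("need t0 <= t <= t1 with t0 != t1")
    w = (t - t0) / (t1 - t0)  # 0 at t0, 1 at t1
    return tuple(a + w * (b - a) for a, b in zip(state0, state1))
```

For instance, a radar state of (10.0 m, 2.0 m/s) at 0.0 s and (10.4 m, 2.0 m/s) at 0.2 s interpolates to roughly (10.2 m, 2.0 m/s) at a LiDAR timestamp of 0.1 s.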

[0062] Currently, the immaturity of environmental perception technology remains a major obstacle to improving the overall performance of autonomous vehicles and the biggest hurdle to their large-scale commercialization. Typically, to ensure the safety and reliability of autonomous vehicles during operation, multi-sensor perception solutions are employed, primarily including cameras, millimeter-wave radar, and LiDAR. Cameras offer advantages such as high resolution, high speed, rich information, and low cost; combined with deep learning's strong capacity to learn from complex data, they can significantly improve environmental perception and classification. Millimeter-wave radar offers fast response, simple operation, and immunity to occlusion, and can provide effective target position and velocity under various conditions. LiDAR offers accurate 3D perception, insensitivity to lighting changes, and rich information. However, image data cannot provide accurate spatial information, millimeter-wave radar has extremely low resolution, and LiDAR is very expensive. Furthermore, as sensor performance improves, each sensor delivers more information, which makes it extremely difficult to extract features without losing valuable information. Therefore, efficiently processing and fusing multi-sensor data is a highly challenging task.

[0063] In recent years, deep learning has achieved remarkable success with camera data, significantly improving the speed and accuracy of 2D object detection and proving to be an effective feature extraction method. The development of convolutional neural network models has greatly enhanced the speed and capability of extracting features from autonomous driving camera data. By effectively utilizing these robust, high-quality, and high-accuracy image features, vision-based autonomous vehicles can also achieve good detection results in 3D perception tasks. Deep learning has also shown good performance in processing LiDAR data; with the emergence of networks based on sparse point cloud data, deep learning's ability to learn point cloud characteristics has gradually surpassed some traditional methods. However, when deep learning is used for multi-sensor fusion, problems such as inefficient fusion, data mismatch, and overfitting still exist. Applying multi-sensor fusion technology to obstacle detection in autonomous driving also suffers from insufficient detection accuracy, missed detections, false detections, and inadequate real-time processing capabilities. As the automation level of autonomous vehicles increases, traditional multi-sensor target fusion can no longer meet the perception requirements of decision-making, and the large amount of redundant perception information also makes decision-making considerably harder. Furthermore, because the raw data from multiple sensors differ significantly in information dimension, scope, and quantity, effectively fusing information from multiple sensors becomes extremely difficult.

[0064] Similarly, deep learning techniques also perform well in handling target trajectory prediction problems involving time series data. Common approaches include recurrent neural networks (RNNs), which are widely used in trajectory prediction due to their ability to model sequential data. Models such as Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRUs) can capture temporal correlations and predict future trajectories based on past observations. Compared to traditional methods, deep learning-based methods offer improved performance in capturing complex patterns, handling diverse scenarios, and generating more accurate trajectory predictions. However, they require large amounts of labeled training data and computational resources for training and inference. Furthermore, the interpretability of the learned model can be challenging; therefore, validating predictions and understanding the model's limitations in real-world scenarios is crucial.

[0065] Therefore, this solution addresses the aforementioned pain points by redesigning a processing flow. Starting with data reception and management, including intermediate data processing and usage, and finally sending the result data, the entire process primarily uses LiDAR with precise 3D spatial positioning, supplemented by other sensors. This ensures both the accuracy of the detected target in the final output and the reliability of using the results for prediction.

[0066] Example 2: Based on the trajectory prediction method based on the fusion of multimodal results in Example 1, the present invention will be further explained and described below.

[0067] As shown in Figure 1, a trajectory prediction method based on the fusion of multimodal results is provided, the method comprising:

[0068] Q1. When a vehicle is driving on the road, it acquires real-time image data of the road based on the vehicle-mounted camera, real-time point cloud data of the target obstacle based on the vehicle-mounted lidar, and real-time position and speed data of the target obstacle based on the vehicle-mounted millimeter-wave radar.

[0069] Q2. Based on the image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, perform timestamp calibration and data alignment processing to obtain the processed image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle;

[0070] Q3. Based on the processed image data of the road, the point cloud data of the target obstacle, and the position and velocity data of the target obstacle, a multimodal result fusion algorithm is used to fuse the data to obtain the fused data information of the target obstacle;

[0071] Q4. Input the fused target obstacle data into the high-precision map, and use the obstacle and map lane matching algorithm to classify the state of the target obstacle to obtain the target obstacle state classification data information;

[0072] Q5. Based on the state classification data of the target obstacle, the vehicle's trajectory is predicted using a multi-trajectory prediction model algorithm to obtain the vehicle's trajectory prediction data.

[0073] In this embodiment, as shown in Figure 2, in step Q3, the data fusion using the multimodal result fusion algorithm includes:

[0074] Q31. Based on the processed road image data and target obstacle point cloud data, establish a first fusion function G for the multimodal results.

[0075]

[0076] Where A is the transpose matrix, M1 is the processed image data of the road, M2 is the processed point cloud data of the target obstacle, and α and β are weight coefficients; this yields the first fused data information of the multimodal result.

[0077] Q32. Based on the processed point cloud data and position / velocity data of the target obstacle, establish a second fusion function H for the multimodal results.

[0078] H=∫∫(ρ1M2+ρ2M3)dM2dM3+exp(M2·M3),

[0079] Where M2 is the processed point cloud data information of the target obstacle, M3 is the processed position and velocity data information of the target obstacle, ρ1 is the fusion factor of the processed point cloud data information of the target obstacle, and ρ2 is the fusion factor of the processed position and velocity data information of the target obstacle; this yields the second fused data information of the multimodal result.

[0080] Q33. Based on the second fused data information of the multimodal results and the first fused data information of the multimodal results, establish a fusion function W for the multimodal results.

[0081]

[0082] Where g is the second fused data information of the multimodal results, h is the first fused data information of the multimodal results, and λ1 and λ2 are the fusion parameters of the multimodal results. The data information of the target obstacle is fused to obtain the fused data information of the target obstacle.

[0083] In this embodiment, the constraints on the weighting coefficients α and β are as follows:

[0084]

[0085] The constraints for the fusion factor ρ1 of the processed point cloud data information of the target obstacle and the fusion factor ρ2 of the processed position and velocity data information of the target obstacle are as follows:

[0086]

[0087] The fusion parameter λ1 of the multimodal results takes values in the range of (0,1), and the fusion parameter λ2 of the multimodal results takes values in the range of (0,1).

[0088] In this embodiment, the processed position and velocity data of the target obstacle is the state matrix data of the position and velocity of the target obstacle, the processed image data of the road is the matrix data of the road image, and the processed point cloud data of the target obstacle is the point cloud matrix data of the target obstacle.

[0089] In this embodiment, as shown in Figure 3, in step Q4, the classification of the state of the target obstacle using the obstacle and map lane matching algorithm includes:

[0090] Q41. Input the fused target obstacle data into a high-precision map, and establish an obstacle-map lane matching function P.

[0091]

[0092] Where n is the sample size, x_i is the fused target obstacle data, and y_i is the data of the map lane points in the high-precision map; this yields the matching degree data information between the obstacle and the map lane;

[0093] Q42. Based on the matching degree data information between the obstacle and the map lane, preset thresholds k1, k2 and k3 are set. If the matching degree data information between the obstacle and the map lane is less than k1, it is a first type of target obstacle. If the matching degree data information between the obstacle and the map lane is between k1 and k2, it is a second type of target obstacle. If the matching degree data information between the obstacle and the map lane is between k2 and k3, it is a third type of target obstacle. If the matching degree data information between the obstacle and the map lane is greater than k3, it is a fourth type of target obstacle.

[0094] Q43. Based on the states of the first type of target obstacle, the second type of target obstacle, the third type of target obstacle, and the fourth type of target obstacle, obtain the state classification data information of the target obstacle.
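
The formula for the matching function P is not reproduced in this text, so the sketch below is a hypothetical stand-in, not the patent's function. It is consistent only with the surrounding description: it uses n paired samples x_i and y_i, gives small values when the obstacle follows the lane, and gives large values when the obstacle departs from it. The measure chosen here, a mean Euclidean distance, is an assumption.

```python
import math


def mean_lane_distance(obstacle_points, lane_points):
    """Mean Euclidean distance between paired obstacle samples x_i and lane points y_i."""
    if len(obstacle_points) != len(lane_points) or not obstacle_points:
        raise ValueError("need two non-empty sequences of equal length n")
    n = len(obstacle_points)
    return sum(math.dist(x, y) for x, y in zip(obstacle_points, lane_points)) / n
```

An obstacle track that runs parallel to the lane points at a constant 0.5 m offset yields a score of 0.5, which would then be compared against the preset thresholds k1, k2 and k3.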

[0095] In this embodiment, as shown in Figure 4, the multi-trajectory prediction model algorithm predicts the vehicle's trajectory as follows: If the state of the target obstacle is the state of the first type of target obstacle, then the target obstacle is traveling straight along the current lane, and a vehicle-centric single-vehicle kinematics model is used to predict the vehicle's trajectory. If the state of the target obstacle is the state of the second type of target obstacle, then the target obstacle is changing lanes to an adjacent lane, and a front-wheel-drive single-vehicle kinematics model is used to predict the vehicle's trajectory. If the state of the target obstacle is the state of the third type of target obstacle, then the target obstacle is turning, and an Ackermann steering geometry kinematics model is used to predict the vehicle's trajectory. If the state of the target obstacle is the state of the fourth type of target obstacle, then the target obstacle fails to match the map, and an LSTM prediction model is used to predict the vehicle's trajectory.
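The model selection described above amounts to a lookup from obstacle state to prediction model. The sketch below shows such a lookup together with one possible kinematic rollout. It rests on several assumptions: the "single-vehicle kinematics model" is interpreted here as the standard kinematic bicycle model referenced at the vehicle center, which the text does not confirm; the names select_model and rollout_bicycle_center are hypothetical; and the axle distances, speed, and time step are illustrative values only. The Ackermann and LSTM models are named in the lookup but not implemented.

```python
import math
from typing import List, Tuple

# State type (1 to 4) mapped to the model named in the description.
MODEL_BY_STATE = {
    1: "vehicle-centric single-vehicle kinematics model",
    2: "front-wheel-drive single-vehicle kinematics model",
    3: "Ackermann steering geometry kinematics model",
    4: "LSTM prediction model",
}


def select_model(state_type: int) -> str:
    """Return the name of the prediction model for a given state type."""
    if state_type not in MODEL_BY_STATE:
        raise ValueError(f"unknown state type: {state_type}")
    return MODEL_BY_STATE[state_type]


def rollout_bicycle_center(
    x: float, y: float, yaw: float, speed: float, steer: float,
    lf: float, lr: float, dt: float, steps: int,
) -> List[Tuple[float, float, float]]:
    """Roll out a kinematic bicycle model referenced at the vehicle center.

    ASSUMPTION: constant speed and steering angle over the horizon, with
    forward-Euler integration. lf and lr are the distances from the
    reference point to the front and rear axles.
    """
    trajectory = []
    for _ in range(steps):
        # Slip angle at the reference point induced by the steering angle.
        beta = math.atan(lr / (lf + lr) * math.tan(steer))
        x += speed * math.cos(yaw + beta) * dt
        y += speed * math.sin(yaw + beta) * dt
        yaw += speed / lr * math.sin(beta) * dt
        trajectory.append((x, y, yaw))
    return trajectory


if __name__ == "__main__":
    print(select_model(1))
    # Straight driving: zero steering angle, 1 s horizon at 10 m/s.
    path = rollout_bicycle_center(0.0, 0.0, 0.0, 10.0, 0.0, 1.2, 1.6, 0.1, 10)
    print(path[-1])
```

Keeping the three kinematic models behind a single lookup keeps each prediction traceable to a named motion hypothesis, while the learned model is reserved for obstacles that cannot be matched to the map.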

[0096] In this embodiment, the present invention provides a trajectory prediction system based on fused multimodal results, including a computer device programmed or configured to perform the steps of any of the trajectory prediction methods based on fused multimodal results described above.

[0097] In this embodiment, the present invention provides a computer-readable storage medium storing a computer program programmed or configured to perform any of the trajectory prediction methods based on fused multimodal results as described above.

[0098] Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

[0099] In summary, this invention not only optimizes the detection results by assigning the advantageous attributes of various sensors to the reference position of the target object through multiple matching, but also ensures detection efficiency, meets cost control requirements, and guarantees detection effect, thus combining the advantages of various sensors.

[0100] The specific embodiments described above do not constitute a limitation on the scope of protection of this disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this disclosure should be included within the scope of protection of this disclosure.

Claims

1. A trajectory prediction method based on fusion of multimodal results, characterized in that the method includes:
Q1. When a vehicle is driving on a road, acquiring real-time image data of the road using a vehicle-mounted camera, real-time point cloud data of a target obstacle using a vehicle-mounted LiDAR, and real-time position and velocity data of the target obstacle using a vehicle-mounted millimeter-wave radar;
Q2. Based on the image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, performing timestamp calibration and data alignment processing to obtain processed image data information of the road, processed point cloud data information of the target obstacle, and processed position and velocity data information of the target obstacle;
Q3. Based on the processed image data information of the road, the processed point cloud data information of the target obstacle, and the processed position and velocity data information of the target obstacle, fusing the data using a multimodal result fusion algorithm to obtain fused data information of the target obstacle;
Q4. Inputting the fused target obstacle data into a high-precision map, and classifying the state of the target obstacle using an obstacle and map lane matching algorithm to obtain state classification data information of the target obstacle;
Q5. Based on the state classification data information of the target obstacle, predicting the vehicle's trajectory using a multi-trajectory prediction model algorithm to obtain trajectory prediction data of the vehicle;
wherein, in step Q4, classifying the state of the target obstacle using the obstacle and map lane matching algorithm includes:
Q41. Inputting the fused target obstacle data into the high-precision map, and establishing an obstacle-map lane matching function P, where n is the sample size, x_i is the fused target obstacle data, and y_i is the data information of the map lane points in the high-precision map, to obtain the matching degree data information between the obstacle and the map lane;
Q42. Based on the matching degree data information between the obstacle and the map lane, setting preset thresholds k1, k2 and k3. If the matching degree data information between the obstacle and the map lane is less than k1, it is a first type of target obstacle. If the matching degree data information between the obstacle and the map lane is between k1 and k2, it is a second type of target obstacle. If the matching degree data information between the obstacle and the map lane is between k2 and k3, it is a third type of target obstacle. If the matching degree data information between the obstacle and the map lane is greater than k3, it is a fourth type of target obstacle;
Q43. Based on the states of the first type of target obstacle, the second type of target obstacle, the third type of target obstacle, and the fourth type of target obstacle, obtaining the state classification data information of the target obstacle;
and wherein the multi-trajectory prediction model algorithm predicts the vehicle's trajectory as follows: If the state of the target obstacle is the first type of target obstacle, then the target obstacle is traveling straight along the current lane, and a vehicle-centric single-vehicle kinematics model is used to predict the vehicle's trajectory. If the state of the target obstacle is the second type of target obstacle, then the target obstacle is changing lanes to an adjacent lane, and a front-wheel-drive single-vehicle kinematics model is used to predict the vehicle's trajectory. If the state of the target obstacle is the third type of target obstacle, then the target obstacle is turning, and an Ackermann steering geometry kinematics model is used to predict the vehicle's trajectory. If the state of the target obstacle is the fourth type of target obstacle, then the target obstacle fails to match the map, and an LSTM prediction model is used to predict the vehicle's trajectory.

2. The trajectory prediction method based on fused multimodal results according to claim 1, characterized in that, in step Q2, the timestamp calibration and data alignment processing includes:
Q21. Based on the image data information of the road, the point cloud data information of the target obstacle, and the position and velocity data information of the target obstacle, performing noise reduction processing to obtain noise-reduced image data information of the road, noise-reduced point cloud data information of the target obstacle, and noise-reduced position and velocity data information of the target obstacle;
Q22. Based on the noise-reduced image data information of the road, the noise-reduced point cloud data information of the target obstacle, and the noise-reduced position and velocity data information of the target obstacle, processing the data using a timestamp calibration algorithm to obtain timestamp-calibrated image data information of the road, timestamp-calibrated point cloud data information of the target obstacle, and timestamp-calibrated position and velocity data information of the target obstacle;
Q23. Based on the timestamp-calibrated image data information of the road, the timestamp-calibrated point cloud data information of the target obstacle, and the timestamp-calibrated position and velocity data information of the target obstacle, performing alignment processing according to the different time points to obtain the processed image data information of the road, the processed point cloud data information of the target obstacle, and the processed position and velocity data information of the target obstacle.

3. The trajectory prediction method based on fused multimodal results according to claim 2, characterized in that the alignment processing according to the different time points includes: taking the timestamp-calibrated point cloud data information of the target obstacle as the reference, aligning the image data information of the road and the position and velocity data information of the target obstacle to the same time points.

4. The trajectory prediction method based on fused multimodal results according to claim 1, characterized in that, in step Q3, fusing the data using the multimodal result fusion algorithm includes:
Q31. Based on the processed image data information of the road and the processed point cloud data information of the target obstacle, establishing a first fusion function G for the multimodal results, where A is the transpose matrix, M1 is the processed image data information of the road, M2 is the processed point cloud data information of the target obstacle, and α and β are weight coefficients, to obtain the first fused data information of the multimodal results;
Q32. Based on the processed point cloud data information of the target obstacle and the processed position and velocity data information of the target obstacle, establishing a second fusion function H for the multimodal results, where M2 is the processed point cloud data information of the target obstacle, M3 is the processed position and velocity data information of the target obstacle, ρ1 is the fusion factor of the processed point cloud data information of the target obstacle, and ρ2 is the fusion factor of the processed position and velocity data information of the target obstacle, to obtain the second fused data information of the multimodal results;
Q33. Based on the second fused data information of the multimodal results and the first fused data information of the multimodal results, establishing a fusion function W for the multimodal results, where g is the second fused data information of the multimodal results, h is the first fused data information of the multimodal results, and λ1 and λ2 are the fusion parameters of the multimodal results, and fusing the data information of the target obstacle to obtain the fused data information of the target obstacle.

5. The trajectory prediction method based on fused multimodal results according to claim 4, characterized in that: the constraints on the weight coefficients α and β are as follows: ; the constraints on the fusion factor ρ1 of the processed point cloud data information of the target obstacle and the fusion factor ρ2 of the processed position and velocity data information of the target obstacle are as follows: ; and the fusion parameters λ1 and λ2 of the multimodal results each take values in the range (0, 1).

6. The trajectory prediction method based on fused multimodal results according to claim 4, characterized in that: the processed position and velocity data of the target obstacle are state matrix data of the position and velocity of the target obstacle; the processed image data of the road are matrix data of the road image; and the processed point cloud data of the target obstacle are point cloud matrix data of the target obstacle.

7. A trajectory prediction system based on fusion of multimodal results, comprising a computer device, characterized in that the computer device is programmed or configured to perform the steps of the trajectory prediction method based on fused multimodal results as described in any one of claims 1 to 6.

8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that is programmed or configured to perform the trajectory prediction method based on fused multimodal results as described in any one of claims 1 to 6.