Trajectory prediction model training method, system, device and storage medium
By constructing temporal training samples and calculating the loss function to optimize the trajectory prediction model, the problem of ignoring the correlation between adjacent time moments in the trajectory prediction algorithm is solved, thereby improving the stability of the prediction results and the reliability of the autonomous driving system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ZHEJIANG LINGAI FUTURE TECHNOLOGY CO LTD
- Filing Date
- 2026-01-22
- Publication Date
- 2026-06-16
Description
Technical Field
[0001] This application relates to the field of autonomous driving technology, specifically to trajectory prediction model training methods, systems, devices, and storage media.
Background Technology
[0002] Trajectory prediction is a continuous prediction problem based on time series, and the prediction results at two adjacent time points should be consistent. However, most current trajectory prediction algorithms perform independent prediction processes, meaning that the prediction results at two adjacent time points are independent. This leads to unstable prediction results, which in turn affects downstream planning. For example, if the prediction at one time point is a left turn, and the prediction at the next time point is a right turn, it is not conducive to the autonomous driving system making safe and reliable decisions. Therefore, it is necessary to propose a stable and reliable trajectory prediction scheme.
Summary of the Invention
[0003] This invention provides a method, system, device, and storage medium for training a trajectory prediction model to improve the stability of trajectory prediction.
[0004] Firstly, a method for training a trajectory prediction model is provided, including:
[0005] Obtain the first training sample of the target obstacle. The first training sample includes state data from M historical moments, where M is a positive integer.
[0006] A target sample set is constructed based on the first training sample; the target sample set includes M second training samples;
[0007] The first loss is determined based on the first training sample and the target trajectory prediction model;
[0008] The second loss is determined based on the target sample set and the target trajectory prediction model;
[0009] The target trajectory prediction model is optimized based on the first loss and the second loss to obtain a trained target trajectory prediction model.
In some embodiments, constructing the target sample set based on the first training sample includes:
[0010] In chronological order, each of the M historical moments is taken as the current historical moment.
[0011] Based on the state data of each current historical moment and the state data of all historical moments preceding the current historical moment, a second training sample corresponding to each current historical moment is constructed.
[0012] The target sample set is determined based on the second training sample corresponding to each current historical moment.
[0013] In some embodiments, determining a first loss based on a first training sample and a target trajectory prediction model includes:
[0014] A first candidate trajectory set is determined based on the first training samples and the target trajectory prediction model; the first candidate trajectory set includes K first candidate trajectories.
[0015] The first loss is determined based on the first candidate trajectory set and the preset true trajectory.
[0016] In some embodiments, determining a first loss based on a first candidate trajectory set and a preset true trajectory includes:
[0017] Select the first candidate trajectory with the highest similarity to the preset true value trajectory from the K first candidate trajectories, and use it as the target candidate trajectory;
[0018] The loss between the target candidate trajectory and the preset true trajectory is calculated to obtain the first loss.
[0019] In some embodiments, determining a second loss based on a target sample set and a target trajectory prediction model includes:
[0020] The second candidate trajectory set is determined based on the target sample set and the target trajectory prediction model; each second training sample includes K second candidate trajectories, and the second candidate trajectory set includes K*M second candidate trajectories.
[0021] The second loss is determined based on the second candidate trajectory set.
[0022] In some embodiments, the M second training samples are arranged in ascending order of the number of historical time points; the second loss is determined based on the second candidate trajectory set, including:
[0023] Arrange the K second candidate trajectories of each second training sample in descending order of confidence.
[0024] For every two adjacent second training samples in turn, calculate the third loss for each pair of second candidate trajectories that have the same rank among their K second candidate trajectories;
[0025] The third losses of all same-rank pairs of second candidate trajectories for two adjacent second training samples are summed to obtain the fourth loss for those two adjacent second training samples.
[0026] The fourth losses of all pairs of adjacent second training samples are summed to obtain the second loss.
[0027] In some embodiments, the method for calculating the third loss includes:
[0028] Calculate the fifth loss between the two state data at each shared time point of the two second candidate trajectories with the same rank;
[0029] All the fifth losses are summed to obtain the third loss.
[0030] Secondly, a trajectory prediction model training system is also provided, including:
[0031] The acquisition module is used to acquire the first training sample of the target obstacle. The first training sample includes state data at M historical moments, where M is a positive integer.
[0032] A construction module is used to construct a target sample set based on the first training samples; the target sample set includes M second training samples.
[0033] The first determining module is used to determine the first loss based on the first training samples and the target trajectory prediction model;
[0034] The second determination module is used to determine the second loss based on the target sample set and the target trajectory prediction model;
[0035] The optimization module is used to optimize the parameters of the target trajectory prediction model based on the first loss and the second loss to obtain the trained target trajectory prediction model.
[0036] Thirdly, an electronic device is also provided, including a memory and a processor, wherein a computer program is stored in the memory, and when executed by the processor, the computer program implements the method described in the first aspect.
[0037] Fourthly, a computer-readable storage medium is also provided, on which a computer program is stored, said computer program being loaded by a processor to perform the steps of the method described in the first aspect.
[0038] Beneficial Effects: This application provides a trajectory prediction model training method, system, device, and storage medium. The trajectory prediction model training method includes: acquiring a first training sample of a target obstacle, the first training sample including state data at M historical moments, where M is a positive integer; constructing a target sample set based on the first training sample; the target sample set including M second training samples; determining a first loss based on the first training sample and the target trajectory prediction model; determining a second loss based on the target sample set and the target trajectory prediction model; and optimizing the parameters of the target trajectory prediction model based on the first and second losses to obtain a trained target trajectory prediction model. The trajectory prediction model training method provided in this application constructs multiple second training samples from the historical state data of the target obstacle according to temporal sequence, calculates the first loss based on the first training sample and the target trajectory prediction model respectively, calculates the second loss based on the second training sample and the target trajectory prediction model, and finally optimizes the parameters of the target trajectory prediction model using the first and second losses. Since the second training samples are constructed according to temporal sequence, the second loss takes into account the influence of temporal sequence on trajectory prediction, thereby stably outputting the future predicted trajectory of the target obstacle and improving the stability of trajectory prediction.
Attached Figure Description
[0039] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0040] Figure 1 is a flowchart of a trajectory prediction model training method provided in an embodiment of this application;
[0041] Figure 2 is a framework diagram of a target trajectory prediction model provided in an embodiment of this application;
[0042] Figure 3 is a schematic diagram of model training provided in an embodiment of this application;
[0043] Figure 4 is a schematic diagram of the structure of a trajectory prediction model training system provided in an embodiment of this application.
Detailed Implementation
[0044] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0045] In the description of this application, it should be understood that the terms "center," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," and "outer," etc., indicating orientation or positional relationships based on the orientation or positional relationships shown in the accompanying drawings, are used only for the convenience of describing this application and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation of this application. Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, features defined with "first" and "second" may explicitly or implicitly include one or more of the stated features. In the description of this application, "a plurality of" means two or more, unless otherwise explicitly specified.
[0046] "A and / or B" includes the following three combinations: A only, B only, and a combination of A and B.
[0047] The use of "applies to" or "configured to" in this application implies open and inclusive language, which does not exclude the applicability to or configuration to devices performing additional tasks or steps. Additionally, the use of "based on" implies openness and inclusivity, because processes, steps, calculations, or other actions "based on" one or more of the stated conditions or values may in practice be based on additional conditions or values beyond those stated.
[0048] In this application, the term "exemplary" is used to mean "used as an example, illustration, or description." Any embodiment described as "exemplary" in this application is not necessarily to be construed as being more preferred or advantageous than other embodiments. The following description is provided to enable any person skilled in the art to make and use this application. Details are set forth in the following description for purposes of explanation. It should be understood that those skilled in the art will recognize that this application can be made without using these specific details. In other instances, well-known structures and processes are not described in detail to avoid obscuring the description of this application with unnecessary detail. Therefore, this application is not intended to be limited to the embodiments shown, but is consistent with the broadest scope of the principles and features disclosed in this application.
[0049] The applicant's research revealed that trajectory prediction has always been a key task in the field of autonomous driving. It helps autonomous vehicles plan future trajectories and prevent potential accidents. Because of the high degree of uncertainty surrounding the future, trajectory prediction is inherently a multimodal problem. This means that an ideal trajectory prediction algorithm should generate multiple possible future trajectories.
[0050] Trajectory prediction algorithms typically take the historical states of a target obstacle (called an agent) at multiple points in time and its surrounding scene information (such as the historical states of other obstacles in the vicinity at multiple points in time, a map, etc.) as input, and output multiple possible future trajectories of the agent and a confidence score for each trajectory.
[0051] Currently, most prediction algorithms treat predictions from two consecutive moments as two independent processes, assuming that the prediction at the current moment is unrelated to the prediction at the previous moment. In reality, this assumption is invalid because the input information of the algorithm largely overlaps between two consecutive moments. This assumption easily leads to instability in continuous predictions; for example, the current prediction might be to go straight, but the previous prediction might be to turn right, which is detrimental to the autonomous driving system making safe and reliable decisions.
[0052] In view of this, embodiments of this application provide a trajectory prediction model training method, system, device, and storage medium. The trajectory prediction model training method provided in this application constructs multiple second training samples from the historical state data of the target obstacle according to the temporal sequence, calculates a first loss based on the first training sample and the target trajectory prediction model, calculates a second loss based on the second training sample and the target trajectory prediction model, and finally optimizes the parameters of the target trajectory prediction model through the first loss and the second loss. Since the second training sample is constructed according to the temporal sequence, the second loss takes into account the influence of the temporal sequence on trajectory prediction, thereby stably outputting the future predicted trajectory of the target obstacle and improving the stability of trajectory prediction.
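How the two losses are combined during optimization is not spelled out in this passage. One common choice, shown here purely as an assumption, is a weighted sum, where $L_1$ is the first loss, $L_2$ is the second loss, and $\lambda$ is a hypothetical non-negative weight that trades prediction accuracy against temporal consistency:

```latex
L_{\mathrm{total}} = L_1 + \lambda L_2, \qquad \lambda \ge 0
```

With $\lambda = 0$ this reduces to training on the first loss alone, which corresponds to the independent per-moment prediction criticized above.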
[0053] Figure 1 is a flowchart illustrating a trajectory prediction model training method provided in an embodiment of this application. In one aspect, this embodiment provides a trajectory prediction model training method applicable to autonomous driving control systems, in which the parameters of the target trajectory prediction model are optimized during the training phase to improve trajectory prediction stability. The method can be executed by a trajectory prediction model training system, which can be implemented in software and / or hardware and can be configured in the processor or controller of the autonomous driving control system. Referring to Figure 1, the method includes the following steps:
[0054] Step 110: Obtain the first training sample of the target obstacle.
[0055] The first training sample includes state data from M historical time points, where M is a positive integer. The specific value of M can be set according to the actual situation and is not specifically limited here.
[0056] The target obstacle is any obstacle in the surrounding area other than the ego vehicle (e.g., an autonomous vehicle), such as a roadblock or a pedestrian.
[0057] The state data at the M historical moments refers to the state data of the target obstacle at the M consecutive historical moments before the current moment.
[0058] The historical state data includes the target obstacle's coordinates, heading angle, velocity, acceleration, length, width, and type.
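The fields listed above could be held in a simple record type. The following is a minimal sketch only: the class name `ObstacleState`, the field names, and the choice of a flat 2D coordinate with scalar velocity and acceleration are assumptions made for illustration, not part of the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObstacleState:
    """State of a target obstacle at one historical moment (hypothetical layout)."""
    x: float              # coordinate along the first axis
    y: float              # coordinate along the second axis
    heading: float        # heading angle
    velocity: float       # speed (scalar, by assumption)
    acceleration: float   # acceleration (scalar, by assumption)
    length: float         # obstacle length
    width: float          # obstacle width
    obstacle_type: str    # e.g. "pedestrian" or "roadblock"


# Example: one historical state of a pedestrian.
state = ObstacleState(
    x=12.0, y=-3.5, heading=0.1, velocity=4.2,
    acceleration=0.0, length=0.6, width=0.6, obstacle_type="pedestrian",
)
```

A first training sample would then be a sequence of M such records, one per historical moment.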
[0059] Step 120: Construct the target sample set based on the first training sample.
[0060] Specifically, a target sample set is constructed based on the state data from M historical time points. This target sample set includes M second training samples. These second training samples are training samples composed of state data from multiple consecutive time points (e.g., state data from multiple consecutive historical time points).
[0061] In some embodiments, constructing a target sample set based on the first training sample includes: taking each of the M historical moments as the current historical moment in chronological order; constructing a second training sample corresponding to each current historical moment based on the state data of each current historical moment and the state data of all historical moments preceding the current historical moment; and determining the target sample set based on the second training sample corresponding to each current historical moment.
[0062] In this context, chronological order refers to the sequence of consecutive moments.
[0063] Specifically, the target sample set is constructed as follows. For each target obstacle in the input sample, assume there are M historical moments and N future moments, where M and N are integers and N is generally greater than M. The M historical moments are used in turn as the current historical moment, which yields M second training samples. For example, the moment immediately before the current moment of the target obstacle (i.e., the first historical moment) is used as the current historical moment of the first second training sample, which is constructed from the state data of the first historical moment and the state data of the M-1 historical moments preceding it. Similarly, the moment two steps before the current moment (i.e., the second historical moment) is used as the current historical moment of the second second training sample, which is constructed from the state data of the second historical moment and the state data of the M-2 historical moments preceding it. This process continues until the Mth second training sample is constructed.
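The construction described above amounts to taking every prefix of the history. A minimal sketch follows, assuming the history is stored oldest-first in a plain sequence; the function name `build_target_sample_set` and the ordering of the returned list (longest sample first) are illustrative choices, not taken from the application.

```python
from typing import List, Sequence, TypeVar

State = TypeVar("State")


def build_target_sample_set(history: Sequence[State]) -> List[List[State]]:
    """Build M second training samples from M historical states.

    `history` is ordered oldest -> newest. The i-th returned sample (0-based)
    uses the (i+1)-th most recent moment as its current historical moment and
    keeps that state plus every state that precedes it.
    """
    m = len(history)
    if m < 1:
        raise ValueError("at least one historical state is required")
    # Dropping the i most recent states leaves a prefix of length m - i.
    return [list(history[: m - i]) for i in range(m)]


samples = build_target_sample_set(["s1", "s2", "s3", "s4"])
# The samples hold 4, 3, 2 and 1 historical states respectively.
```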
[0064] Step 130: Determine the first loss based on the first training sample and the target trajectory prediction model.
[0065] The target trajectory prediction model may be any algorithm model that can be used for trajectory prediction, including different prediction approaches such as agent-centric prediction, ego-centric prediction, and joint prediction. The specific model can be chosen according to the actual application scenario and is not specifically limited here.
[0066] Figure 2 is a framework diagram of a target trajectory prediction model provided in an embodiment of this application. Referring to Figure 2, the training process of the target trajectory prediction model is as follows: first, construct the input of the target trajectory prediction model; then feed the input into the designed target trajectory prediction model to obtain multiple possible future trajectories and their confidence scores; finally, use the corresponding loss function to supervise the prediction results in order to train the target trajectory prediction model.
[0067] The inputs include the historical state of the target obstacle, the historical state of other obstacles around the target obstacle, and map information. The output includes multiple possible future trajectories and their confidence levels.
[0068] In some embodiments, determining a first loss based on a first training sample and a target trajectory prediction model includes: determining a first candidate trajectory set based on the first training sample and the target trajectory prediction model; the first candidate trajectory set includes K first candidate trajectories; and determining a first loss based on the first candidate trajectory set and a preset ground truth trajectory. The first training sample includes state data from M historical time points. Assume the target trajectory prediction model uses M historical time points to predict N future time points. After inputting the first training sample into the target trajectory prediction model, it outputs K first candidate trajectories. Each first candidate trajectory includes state data from N future time points.
[0069] Specifically, the first loss is determined as follows. Referring to the framework of the target trajectory prediction model shown in Figure 2, the first training sample (including state data from M historical time points) is input into the model, which outputs K first candidate trajectories. Each first candidate trajectory includes state data from N future time points. A first loss is then calculated based on the K first candidate trajectories and the preset ground truth trajectory.
[0070] The preset true trajectory is as follows: During data collection, vehicle driving information is typically collected over a period of time (e.g., 5 minutes), including its coordinates, speed, acceleration, etc. When constructing training samples, the collected data can be cropped into 9-second segments. The first 3 seconds are used as model input (i.e., historical information), and the last 6 seconds are used as the future true trajectory, i.e., the preset true trajectory.
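The cropping step can be sketched as follows. The sampling rate (`hz=10`), the use of non-overlapping windows, and the function name `crop_segments` are assumptions for illustration; the text above only fixes the 9-second segment length and the 3-second / 6-second split.

```python
from typing import List, Sequence, Tuple, TypeVar

Record = TypeVar("Record")


def crop_segments(
    records: Sequence[Record],
    hz: int = 10,          # assumed sampling rate (records per second)
    history_s: int = 3,    # seconds used as model input
    future_s: int = 6,     # seconds used as the preset true trajectory
) -> List[Tuple[List[Record], List[Record]]]:
    """Split a recording into (history, future) pairs of 3 s + 6 s."""
    n_hist = history_s * hz
    n_fut = future_s * hz
    seg_len = n_hist + n_fut
    segments = []
    # Non-overlapping windows; a trailing partial window is discarded.
    for start in range(0, len(records) - seg_len + 1, seg_len):
        window = list(records[start:start + seg_len])
        segments.append((window[:n_hist], window[n_hist:]))
    return segments


# 20 seconds of data at the assumed 10 Hz yields two full 9-second segments.
segments = crop_segments(list(range(200)))
```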
[0071] In some embodiments, determining a first loss based on a first candidate trajectory set and a preset ground truth trajectory includes: selecting the first candidate trajectory with the highest similarity to the preset ground truth trajectory from K first candidate trajectories and using it as the target candidate trajectory; calculating the loss between the target candidate trajectory and the preset ground truth trajectory to obtain the first loss.
[0072] The first candidate trajectory with the highest similarity to the preset true trajectory refers to the first candidate trajectory that is closest to the preset true trajectory. The specific method for determining the similarity to the preset true trajectory includes calculating the Euclidean distance between each first candidate trajectory and the preset true trajectory, and calculating the cosine similarity between each first candidate trajectory and the preset true trajectory, etc. The specific settings can be set according to the actual situation, and no specific limitation is made here.
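The two similarity measures named above can be sketched in a few lines. This is an illustration under assumptions: each trajectory is reduced to a list of (x, y) points, both measures are applied to the flattened coordinate vector, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _flatten(trajectory: Sequence[Point]) -> List[float]:
    """Turn [(x1, y1), (x2, y2), ...] into [x1, y1, x2, y2, ...]."""
    return [coord for point in trajectory for coord in point]


def euclidean_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Smaller values mean the two trajectories are more similar."""
    fa, fb = _flatten(a), _flatten(b)
    if len(fa) != len(fb):
        raise ValueError("trajectories must have the same number of points")
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(fa, fb)))


def cosine_similarity(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Larger values mean the two trajectories are more similar."""
    fa, fb = _flatten(a), _flatten(b)
    if len(fa) != len(fb):
        raise ValueError("trajectories must have the same number of points")
    dot = sum(u * v for u, v in zip(fa, fb))
    norm_a = math.sqrt(sum(u * u for u in fa))
    norm_b = math.sqrt(sum(v * v for v in fb))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero trajectories
    return dot / (norm_a * norm_b)
```

Note that the two measures point in opposite directions: the most similar candidate minimizes the Euclidean distance but maximizes the cosine similarity.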
[0073] It should be noted that the method for determining the target candidate trajectory, besides selecting the first candidate trajectory with the highest similarity to the preset true trajectory from the K first candidate trajectories, can also be: selecting the first candidate trajectory with the smallest deviation from the preset true trajectory from the K first candidate trajectories as the target candidate trajectory. Alternatively, it can be: calculating the confidence scores of the K first candidate trajectories and selecting the first candidate trajectory with the highest confidence score as the target candidate trajectory, etc. The specific determination method can be set according to the actual situation, and no specific limitations are made here.
[0074] The specific implementation method for calculating the loss between the target candidate trajectory and the preset true trajectory can be: using L2 loss to calculate the loss between the target candidate trajectory and the preset true trajectory to obtain the first loss.
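Putting the selection step and the L2 loss together, the first loss could be computed roughly as below. This is a sketch under assumptions: trajectories are lists of (x, y) points, similarity is measured by the summed point-wise distance, and "L2 loss" is taken to mean the sum of squared coordinate differences (a mean would be an equally plausible reading). All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Trajectory = Sequence[Point]


def summed_distance(a: Trajectory, b: Trajectory) -> float:
    """Sum of point-wise Euclidean distances; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("trajectories must have the same number of points")
    return sum(math.hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b))


def l2_loss(a: Trajectory, b: Trajectory) -> float:
    """Sum of squared coordinate differences over all time points."""
    if len(a) != len(b):
        raise ValueError("trajectories must have the same number of points")
    return sum((ax - bx) ** 2 + (ay - by) ** 2 for (ax, ay), (bx, by) in zip(a, b))


def first_loss(candidates: Sequence[Trajectory], ground_truth: Trajectory) -> float:
    """Pick the candidate closest to the ground truth, then score only that one."""
    target = min(candidates, key=lambda c: summed_distance(c, ground_truth))
    return l2_loss(target, ground_truth)


ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],   # offset by 1 at every point
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)],   # deviates only at the last point
]
loss = first_loss(candidates, ground_truth)  # the second candidate is selected
```

Scoring only the best of the K candidates leaves the remaining candidates free to cover other plausible futures, which fits the multimodal nature of the problem described earlier.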
[0075] Step 140: Determine the second loss based on the target sample set and the target trajectory prediction model.
[0076] The target sample set includes M second training samples. The first second training sample includes state data at M historical time points, the second includes state data at M-1 historical time points, ..., the (M-1)th second training sample includes state data at 2 historical time points, and the Mth second training sample includes state data at 1 historical time point. Correspondingly, the future trajectories of the first and second second training samples overlap at N-1 time points, the future trajectories of the first and third second training samples overlap at N-2 time points, and so on. The resulting M second training samples are thus temporally ordered, so inputting them into the target trajectory prediction model lets the model explicitly observe this temporal sequence. This data-driven approach improves the stability of the target trajectory prediction model's prediction results.
[0077] The reason why the future trajectories of the first and second training samples overlap at N-1 time points is as follows: Assume the target trajectory prediction model uses M historical time points to predict N future time points. When constructing the second training sample, the time point of the first training sample is shifted forward by one time. Therefore, when the second training sample is input into the target trajectory prediction model, its input is still the state data from the M historical time points. However, the corresponding future trajectory output by the target trajectory prediction model has N-1 time points, meaning the future trajectory of the second training sample has N-1 time points. In short, the second training sample is derived from the first training sample shifted forward by one time point; therefore, the future trajectories of the first and second training samples overlap at N-1 time points.
[0078] In some embodiments, determining a second loss based on a target sample set and a target trajectory prediction model includes: determining a second candidate trajectory set based on the target sample set and the target trajectory prediction model, where each second training sample corresponds to K second candidate trajectories and the second candidate trajectory set includes K*M second candidate trajectories; and determining a second loss based on the second candidate trajectory set. Specifically, the second loss is determined as follows. Referring to the framework of the target trajectory prediction model shown in Figure 2, the M second training samples are input into the model at once, and the model outputs M*K second candidate trajectories (K second candidate trajectories for each of the M second training samples). Each second candidate trajectory includes state data for N future time points. A second loss is then obtained based on the M*K second candidate trajectories.
[0079] In some embodiments, the M second training samples are arranged in ascending order of the number of historical moments; determining the second loss based on the second candidate trajectory set includes: arranging the K second candidate trajectories of each second training sample in descending order of confidence; for every two adjacent second training samples in turn, calculating the third loss of each pair of second candidate trajectories that have the same rank among their K second candidate trajectories; summing the third losses of all such same-rank pairs for two adjacent second training samples to obtain the fourth loss of those two adjacent second training samples; and summing the fourth losses of all pairs of adjacent second training samples to obtain the second loss.
[0080] Specifically, the second loss is calculated as follows, taking the first, second, and third second training samples as examples. Each of these samples has K second candidate trajectories. Denote the K second candidate trajectories of the first second training sample as $Y^{(1)}_1, Y^{(1)}_2, \dots, Y^{(1)}_K$, those of the second second training sample as $Y^{(2)}_1, Y^{(2)}_2, \dots, Y^{(2)}_K$, and those of the third second training sample as $Y^{(3)}_1, Y^{(3)}_2, \dots, Y^{(3)}_K$.
[0081] First, the K second candidate trajectories of each second training sample are sorted according to a second preset order, for example in descending order of confidence. Denote the sorted trajectories of the first second training sample as $\tilde{Y}^{(1)}_1, \tilde{Y}^{(1)}_2, \dots, \tilde{Y}^{(1)}_K$, those of the second second training sample as $\tilde{Y}^{(2)}_1, \tilde{Y}^{(2)}_2, \dots, \tilde{Y}^{(2)}_K$, and those of the third second training sample as $\tilde{Y}^{(3)}_1, \tilde{Y}^{(3)}_2, \dots, \tilde{Y}^{(3)}_K$. Here $\tilde{Y}^{(1)}_1$, $\tilde{Y}^{(2)}_1$, and $\tilde{Y}^{(3)}_1$ have the same rank (the first rank); $\tilde{Y}^{(1)}_2$, $\tilde{Y}^{(2)}_2$, and $\tilde{Y}^{(3)}_2$ have the same rank (the second rank); ...; $\tilde{Y}^{(1)}_{K-1}$, $\tilde{Y}^{(2)}_{K-1}$, and $\tilde{Y}^{(3)}_{K-1}$ have the same rank (the (K-1)th rank); and $\tilde{Y}^{(1)}_K$, $\tilde{Y}^{(2)}_K$, and $\tilde{Y}^{(3)}_K$ have the same rank (the Kth rank).
[0082] Then, for every two adjacent second training samples in turn, the third loss is calculated for each pair of second candidate trajectories with the same rank. For the adjacent first and second second training samples, calculate the third loss between $\tilde{Y}^{(1)}_1$ and $\tilde{Y}^{(2)}_1$ for the first rank, between $\tilde{Y}^{(1)}_2$ and $\tilde{Y}^{(2)}_2$ for the second rank, ..., between $\tilde{Y}^{(1)}_{K-1}$ and $\tilde{Y}^{(2)}_{K-1}$ for the (K-1)th rank, and between $\tilde{Y}^{(1)}_K$ and $\tilde{Y}^{(2)}_K$ for the Kth rank. For the adjacent second and third second training samples, calculate the third loss between $\tilde{Y}^{(2)}_1$ and $\tilde{Y}^{(3)}_1$ for the first rank, between $\tilde{Y}^{(2)}_2$ and $\tilde{Y}^{(3)}_2$ for the second rank, ..., between $\tilde{Y}^{(2)}_{K-1}$ and $\tilde{Y}^{(3)}_{K-1}$ for the (K-1)th rank, and between $\tilde{Y}^{(2)}_K$ and $\tilde{Y}^{(3)}_K$ for the Kth rank. This continues until the third loss has been calculated for every same-rank pair of second candidate trajectories of all adjacent second training samples.
[0083] Next, the third losses of all same-rank pairs for two adjacent second training samples are summed to obtain the fourth loss for those two adjacent second training samples. For example, the third loss between $\tilde{Y}^{(1)}_1$ and $\tilde{Y}^{(2)}_1$, the third loss between $\tilde{Y}^{(1)}_2$ and $\tilde{Y}^{(2)}_2$, ..., the third loss between $\tilde{Y}^{(1)}_{K-1}$ and $\tilde{Y}^{(2)}_{K-1}$, and the third loss between $\tilde{Y}^{(1)}_K$ and $\tilde{Y}^{(2)}_K$ are added together to obtain the fourth loss for the adjacent first and second second training samples. Likewise, the third loss between $\tilde{Y}^{(2)}_1$ and $\tilde{Y}^{(3)}_1$, the third loss between $\tilde{Y}^{(2)}_2$ and $\tilde{Y}^{(3)}_2$, ..., the third loss between $\tilde{Y}^{(2)}_{K-1}$ and $\tilde{Y}^{(3)}_{K-1}$, and the third loss between $\tilde{Y}^{(2)}_K$ and $\tilde{Y}^{(3)}_K$ are added together to obtain the fourth loss for the adjacent second and third second training samples. This process is repeated to obtain the fourth loss for all pairs of adjacent second training samples.
[0084] Finally, the fourth losses of all adjacent pairs of second training samples are summed to obtain the second loss. For example, the fourth losses of adjacent first and second second training samples, adjacent second and third second training samples, ..., adjacent (M-1)th and Mth second training samples are summed to obtain the second loss.
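The nested summation described above (third losses summed into a fourth loss, fourth losses summed into the second loss) can be sketched as follows. This is only an illustrative sketch, not the implementation of this application: the function names `sort_by_confidence` and `second_loss`, the array shapes, and the `pair_loss` callback standing in for the third loss are all assumptions introduced here.

```python
import numpy as np

def sort_by_confidence(trajs, conf):
    """Order one sample's K candidate trajectories by descending confidence.

    trajs: array of shape (K, N, D); conf: array of shape (K,). (Assumed layout.)
    """
    order = np.argsort(-conf)
    return trajs[order]

def second_loss(samples, pair_loss):
    """Sum the fourth losses over all adjacent pairs of second training samples.

    samples: list of M (trajs, conf) tuples, ordered so that neighbours in the
             list are adjacent second training samples.
    pair_loss: hypothetical callable returning the third loss of two
               same-ranked candidate trajectories.
    """
    ranked = [sort_by_confidence(t, c) for t, c in samples]
    total = 0.0
    for prev, nxt in zip(ranked[:-1], ranked[1:]):
        # Fourth loss: sum of the third losses over all K ranks.
        fourth = sum(pair_loss(prev[k], nxt[k]) for k in range(prev.shape[0]))
        total += fourth
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = [(rng.normal(size=(3, 8, 2)), rng.random(3)) for _ in range(4)]
    squared_error = lambda a, b: float(np.sum((a - b) ** 2))
    print(second_loss(demo, squared_error))
```

Passing the third loss in as a callback keeps the ranking and aggregation logic separate from the choice of per-trajectory loss.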
[0085] In some embodiments, the method for calculating the third loss includes: for two second candidate trajectories with the same rank, calculating the fifth loss between the two state data at each moment that the two trajectories share; and summing all the fifth losses to obtain the third loss.
[0086] The fifth loss between the two state data at the same moment can be calculated using an L2 loss.
[0087] Figure 3 is a schematic diagram of a model training method provided in an embodiment of this application. Referring to Figure 3, the Mth second training sample includes state data from one historical moment, for example the state data of the Mth historical moment. The (M-1)th second training sample includes state data from two historical moments, for example the state data of the Mth historical moment and the state data of the (M-1)th historical moment. The (M-2)th second training sample includes state data from three historical moments, for example the state data of the Mth, (M-1)th, and (M-2)th historical moments. The (M-3)th second training sample includes state data from four historical moments, for example the state data of the Mth, (M-1)th, (M-2)th, and (M-3)th historical moments.
[0088] For example, the Mth and (M-1)th second training samples form one group (group 1), the (M-1)th and (M-2)th second training samples form another group (group 2), and the (M-2)th and (M-3)th second training samples form a third group (group 3). After these second training samples are input into the target trajectory prediction model, K second candidate trajectories are obtained for each of them. Taking group 1 as an example, Figure 3 shows two second candidate trajectories with the same rank (assumed to be the first rank): second candidate trajectory A of the Mth second training sample and second candidate trajectory B of the (M-1)th second training sample. Referring to Figure 3, second candidate trajectory A and second candidate trajectory B each include states at N future moments; assuming N is 8, each trajectory covers 8 future moments. Since the Mth second training sample is derived from the (M-1)th second training sample shifted forward by one time step, the first future moment of second candidate trajectory A is the same moment as the time of the (M-1)th second training sample, the second future moment of second candidate trajectory A is the same moment as the first future moment of second candidate trajectory B, the third future moment of second candidate trajectory A is the same moment as the second future moment of second candidate trajectory B, ..., and the eighth future moment of second candidate trajectory A is the same moment as the seventh future moment of second candidate trajectory B.
Therefore, the L2 loss is used to calculate the fifth loss between the second future moment of second candidate trajectory A and the first future moment of second candidate trajectory B, the fifth loss between the third future moment of second candidate trajectory A and the second future moment of second candidate trajectory B, ..., and the fifth loss between the eighth future moment of second candidate trajectory A and the seventh future moment of second candidate trajectory B. All the fifth losses are then summed to obtain the third loss for second candidate trajectory A and second candidate trajectory B (i.e., the sum of the fifth losses of the two state data at each shared moment in two second candidate trajectories with the same rank).
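The one-step alignment in this example can be illustrated with a short sketch. It assumes that each candidate trajectory is an array of shape (N, D), with one row per future moment, and that the "L2 loss" is the squared Euclidean distance between the two states; the function name `third_loss` and the `shift` parameter are hypothetical and not taken from this application.

```python
import numpy as np

def third_loss(traj_a, traj_b, shift=1):
    """Third loss of two same-ranked candidate trajectories.

    traj_a: (N, D) states of candidate trajectory A, whose (i + shift)-th
            future moment coincides with the i-th future moment of B.
    traj_b: (N, D) states of candidate trajectory B.
    """
    a = traj_a[shift:]                       # 2nd..Nth future moments of A
    b = traj_b[: traj_b.shape[0] - shift]    # 1st..(N-1)th future moments of B
    # Fifth loss: squared L2 distance between the two states at each shared
    # moment (assumed form of the L2 loss).
    fifth = np.sum((a - b) ** 2, axis=-1)
    # Third loss: sum of all fifth losses.
    return float(np.sum(fifth))

if __name__ == "__main__":
    # A predicts positions for moments 1..8; B predicts moments 2..9.
    traj_a = np.stack([np.arange(1.0, 9.0), np.zeros(8)], axis=1)
    traj_b = np.stack([np.arange(2.0, 10.0), np.ones(8)], axis=1)
    print(third_loss(traj_a, traj_b))
```

With N = 8 and a shift of one step, seven pairs of states contribute to the sum, matching the second-to-eighth and first-to-seventh pairing described above.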
[0089] It can thus be seen that the target trajectory prediction model in related technologies ignores the temporal consistency between training samples. In practical applications, as time passes, the current state becomes a historical state, so the states at two adjacent moments are related and temporal consistency needs to be considered. Ignoring consistency during training while expecting temporally consistent predictions during application is inherently contradictory; in other words, a model trained this way can hardly meet the needs of the application. To solve this problem, this application strengthens temporal consistency during the training phase: within a certain period of time, given continuous inputs, the trajectories generated by the model should be consistent. This effect is achieved through the loss calculation. Specifically, this application constructs M consecutive second training samples based on state data from M historical moments. Since the M second training samples have a temporal order, the second candidate trajectories output when the target trajectory prediction model is trained on them also have a temporal order. Therefore, the second loss calculated from the second candidate trajectory set explicitly reflects the impact of temporal order on trajectory prediction, and optimizing the target trajectory prediction model based on the second loss improves the stability of the prediction.
[0090] Step 150: Optimize the parameters of the target trajectory prediction model based on the first loss and the second loss to obtain a trained target trajectory prediction model. The target trajectory prediction model is trained under supervision using the first loss and the second loss, which optimizes its parameters. This training method constrains the model so that its outputs remain consistent when its inputs are consistent. As a result, the trained and optimized target trajectory prediction model can output a more stable predicted future trajectory of the target obstacle.
[0091] In some embodiments, optimizing the parameters of the target trajectory prediction model based on a first loss and a second loss to obtain the trained target trajectory prediction model includes: adding the first loss and the second loss to obtain a target loss; and optimizing the parameters of the target trajectory prediction model based on the target loss to obtain the trained target trajectory prediction model.
[0092] Specifically, the target trajectory prediction model is trained under supervision based on the target loss. For example, when the target loss is zero, it indicates that the input and output of the target trajectory prediction model are consistent, and no penalty (i.e., no further supervised training optimization) is needed; this is the optimal target trajectory prediction model. When the target loss is greater than zero, it indicates that the input and output of the target trajectory prediction model are inconsistent, so the model needs to be penalized and supervised training optimization continues until the input and output are consistent. Furthermore, since this application constructs M consecutive second training samples based on state data from M historical moments, and the M second training samples have a temporal order, the second candidate trajectories output when training the target trajectory prediction model on them also have a temporal order. The second loss calculated from the second candidate trajectory set therefore explicitly reflects the impact of temporal order on trajectory prediction. Consequently, the target loss obtained from the first loss and the second loss takes the impact of temporal order on each future trajectory into account when supervising the target trajectory prediction model, which improves the stability of the model and thereby the stability of trajectory prediction.
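For reference, the relationship between the five losses described in this embodiment can be written compactly as below. The notation is an assumption introduced here for illustration and does not appear in the original: $L_1$ to $L_5$ denote the first to fifth losses, $m$ indexes the adjacent pairs of second training samples, $k$ indexes the confidence rank, and $T_m$ is the set of future moments shared by the two same-ranked trajectories of pair $m$.

```latex
\begin{align}
L_{\text{target}} &= L_1 + L_2, \\
L_2 &= \sum_{m=1}^{M-1} L_4^{(m)}, \\
L_4^{(m)} &= \sum_{k=1}^{K} L_3^{(m,k)}, \\
L_3^{(m,k)} &= \sum_{t \in T_m} L_5^{(m,k,t)}.
\end{align}
```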
[0093] Figure 4 is a block diagram illustrating the principle structure of a trajectory prediction model training system provided in an embodiment of this application. In another aspect, this embodiment provides a trajectory prediction model training system. Referring to Figure 4, the trajectory prediction model training system 100 includes: an acquisition module 101, used to acquire a first training sample of a target obstacle, the first training sample including state data at M historical moments, where M is a positive integer; a construction module 102, used to construct a target sample set based on the first training sample, the target sample set including M second training samples; a first determination module 103, used to determine a first loss based on the first training sample and the target trajectory prediction model; a second determination module 104, used to determine a second loss based on the target sample set and the target trajectory prediction model; and an optimization module 105, used to optimize the parameters of the target trajectory prediction model based on the first loss and the second loss to obtain a trained target trajectory prediction model.
[0094] This application provides a trajectory prediction model training system. The system constructs multiple second training samples from the historical state data of a target obstacle according to temporal sequence. It then calculates a first loss based on the first training samples and the target trajectory prediction model, and a second loss based on the second training samples and the target trajectory prediction model. Finally, it optimizes the parameters of the target trajectory prediction model using the first and second losses. Because the second training samples are constructed according to temporal sequence, the second loss takes into account the impact of temporal sequence on trajectory prediction, thereby stably outputting the future predicted trajectory of the target obstacle and improving the stability of trajectory prediction.
[0095] In some embodiments, the construction module 102 is further configured to:
[0096] In chronological order, each of the M historical moments is taken as the current historical moment.
[0097] Based on the state data of each current historical moment and the state data of all historical moments preceding the current historical moment, a second training sample corresponding to each current historical moment is constructed.
[0098] The target sample set is determined based on the second training sample corresponding to each current historical moment.
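The construction performed by the construction module 102 can be sketched as building one prefix of the history per historical moment. This is a minimal sketch under the assumption that the state data arrive as a chronologically ordered list; the function name `build_target_sample_set` is hypothetical.

```python
def build_target_sample_set(history):
    """Build M second training samples from M chronologically ordered states.

    history: list of state data for the M historical moments, earliest first.
    The i-th sample takes the i-th moment as the current historical moment and
    contains its state data plus the state data of all preceding moments.
    """
    return [history[: i + 1] for i in range(len(history))]

if __name__ == "__main__":
    states = ["state_1", "state_2", "state_3", "state_4"]
    for sample in build_target_sample_set(states):
        print(sample)
```

The resulting samples are naturally ordered by increasing number of historical moments, so neighbouring samples in the list differ by exactly one time step.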
[0099] In some embodiments, the first determining module 103 is further configured to:
[0100] A first candidate trajectory set is determined based on the first training samples and the target trajectory prediction model; the first candidate trajectory set includes K first candidate trajectories.
[0101] The first loss is determined based on the first candidate trajectory set and the preset true trajectory.
[0102] In some embodiments, the first determining module 103 is further configured to:
[0103] Select, from the K first candidate trajectories, the first candidate trajectory with the highest similarity to the preset true trajectory, and use it as the target candidate trajectory;
[0104] The loss between the target candidate trajectory and the preset true trajectory is calculated to obtain the first loss.
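The selection and loss steps of the first determining module 103 can be sketched as follows. The text does not specify the similarity measure or the form of the loss, so this sketch assumes that similarity is measured by the average displacement error (lower error meaning higher similarity) and that the loss is a summed squared error; the function name `first_loss` and the array shapes are likewise assumptions.

```python
import numpy as np

def first_loss(candidates, true_traj):
    """Pick the candidate closest to the true trajectory and compute its loss.

    candidates: (K, N, D) array of K first candidate trajectories.
    true_traj:  (N, D) array holding the preset true trajectory.
    Returns (index of the target candidate trajectory, first loss).
    """
    # Per-moment Euclidean distance of every candidate to the true trajectory.
    dists = np.linalg.norm(candidates - true_traj, axis=-1)   # shape (K, N)
    # Assumed similarity: smaller average displacement error is more similar.
    best = int(np.argmin(dists.mean(axis=1)))
    # Assumed loss: summed squared error of the target candidate trajectory.
    loss = float(np.sum((candidates[best] - true_traj) ** 2))
    return best, loss

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=(8, 2))
    cands = truth + rng.normal(scale=0.5, size=(6, 8, 2))
    print(first_loss(cands, truth))
```

Only the selected target candidate trajectory contributes to the first loss; the other candidates are left unconstrained by this term.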
[0105] In some embodiments, the second determining module 104 is further configured to:
[0106] The second candidate trajectory set is determined based on the target sample set and the target trajectory prediction model; each second training sample includes K second candidate trajectories, and the second candidate trajectory set includes K*M second candidate trajectories.
[0107] The second loss is determined based on the second candidate trajectory set.
[0108] In some embodiments, the M second training samples are arranged in ascending order of the number of historical moments; the second determining module 104 is further configured to:
[0109] Arrange the K second candidate trajectories of each second training sample in descending order of confidence.
[0110] Calculate, in turn, the third loss for every pair of same-ranked second candidate trajectories among the K second candidate trajectories of two adjacent second training samples;
[0111] Sum the third losses of all pairs of same-ranked second candidate trajectories of two adjacent second training samples to obtain the fourth loss for those two adjacent second training samples.
[0112] Sum the fourth losses of all pairs of adjacent second training samples to obtain the second loss.
[0113] In some embodiments, the second determining module 104 is further configured to:
[0114] For two second candidate trajectories with the same rank, calculate the fifth loss between the two state data at each moment that the two trajectories share;
[0115] Sum all the fifth losses to obtain the third loss.
[0116] This embodiment also provides an electronic device, including a memory and a processor. The memory stores a computer program, and when the computer program is executed by the processor, it implements the method of any of the above embodiments.
[0117] This embodiment also provides a computer-readable storage medium having a computer program stored thereon, the computer program being loaded by a processor to perform the steps of any of the methods in the above embodiments.
[0118] In the embodiments of this application, the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM), etc.
[0119] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions in other embodiments.
[0120] The above provides a detailed description of a trajectory prediction model training method, system, device, and storage medium provided in the embodiments of this application. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the method and core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.
Claims
1. A method for training a trajectory prediction model, characterized in that it includes: acquiring a first training sample of a target obstacle, the first training sample including state data at M historical moments, where M is a positive integer; constructing a target sample set based on the first training sample, the target sample set including M second training samples; determining a first loss based on the first training sample and a target trajectory prediction model; determining a second loss based on the target sample set and the target trajectory prediction model; and optimizing the target trajectory prediction model based on the first loss and the second loss to obtain a trained target trajectory prediction model; wherein constructing the target sample set based on the first training sample includes: taking each of the M historical moments in turn as the current historical moment in chronological order; constructing, based on the state data of each current historical moment and the state data of all historical moments preceding that current historical moment, a second training sample corresponding to each current historical moment; and determining the target sample set based on the second training sample corresponding to each current historical moment.
2. The method according to claim 1, characterized in that determining the first loss based on the first training sample and the target trajectory prediction model includes: determining a first candidate trajectory set based on the first training sample and the target trajectory prediction model, the first candidate trajectory set including K first candidate trajectories; and determining the first loss based on the first candidate trajectory set and a preset true trajectory.
3. The method according to claim 2, characterized in that determining the first loss based on the first candidate trajectory set and the preset true trajectory includes: selecting, from the K first candidate trajectories, the first candidate trajectory with the highest similarity to the preset true trajectory as a target candidate trajectory; and calculating the loss between the target candidate trajectory and the preset true trajectory to obtain the first loss.
4. The method according to claim 1, characterized in that determining the second loss based on the target sample set and the target trajectory prediction model includes: determining a second candidate trajectory set based on the target sample set and the target trajectory prediction model, wherein each second training sample includes K second candidate trajectories and the second candidate trajectory set includes K*M second candidate trajectories; and determining the second loss based on the second candidate trajectory set.
5. The method according to claim 4, characterized in that the M second training samples are arranged in ascending order of the number of historical moments, and determining the second loss based on the second candidate trajectory set includes: arranging the K second candidate trajectories of each second training sample in descending order of confidence; calculating, in turn, the third loss for every pair of same-ranked second candidate trajectories among the K second candidate trajectories of two adjacent second training samples; summing the third losses of all pairs of same-ranked second candidate trajectories of two adjacent second training samples to obtain the fourth loss of the two adjacent second training samples; and summing the fourth losses of all pairs of adjacent second training samples to obtain the second loss.
6. The method according to claim 5, characterized in that the method for calculating the third loss includes: for two second candidate trajectories with the same rank, calculating the fifth loss between the two state data at each moment that the two trajectories share; and summing all the fifth losses to obtain the third loss.
7. A trajectory prediction model training system, characterized in that it includes: an acquisition module, used to acquire a first training sample of a target obstacle, the first training sample including state data at M historical moments, where M is a positive integer; a construction module, used to construct a target sample set based on the first training sample, the target sample set including M second training samples; a first determining module, used to determine a first loss based on the first training sample and a target trajectory prediction model; a second determining module, used to determine a second loss based on the target sample set and the target trajectory prediction model; and an optimization module, used to optimize the parameters of the target trajectory prediction model based on the first loss and the second loss to obtain a trained target trajectory prediction model; wherein the construction module is further configured to: take each of the M historical moments in turn as the current historical moment in chronological order; construct the second training sample corresponding to each current historical moment based on the state data of each current historical moment and the state data of all historical moments before that current historical moment; and determine the target sample set based on the second training sample corresponding to each current historical moment.
8. An electronic device, characterized in that it includes a memory and a processor, wherein the memory stores a computer program that, when executed by the processor, implements the method as described in any one of claims 1-6.
9. A computer-readable storage medium, characterized in that it stores a computer program, which is loaded by a processor to perform the steps of the method as described in any one of claims 1-6.