A task-adaptive trajectory representation learning method

By constructing a mapping relationship between GPS trajectories and road segment trajectories and a learnable task prompt vector, adaptive fusion of cross-modal information is achieved, which solves the problems of insufficient multimodal feature fusion and task adaptation in existing methods and improves the adaptability of trajectory representation.

CN122241265A · Pending · Publication Date: 2026-06-19 · JIANGXI NORMAL UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: JIANGXI NORMAL UNIV
Filing Date: 2026-05-20
Publication Date: 2026-06-19

Patent Timeline

20 May 2026: Application
19 Jun 2026: Publication (CN122241265A)

IPC: G06F18/22; G08G1/01; G06F18/213; G06F18/25; G06F18/20; G06N3/0455; G06N3/0985; G01S19/42; G06F123/02

AI Tagging

Application Domain

Detection of traffic movement; Biological models

Technology Topics

Pattern recognition; Road networks


AI Technical Summary

Technical Problem

Existing trajectory representation learning methods are insufficient in terms of multimodal feature fusion and task adaptation, making it difficult to meet the needs of different downstream tasks, and they do not fully consider the differences in data distribution between continuous GPS trajectory data and discrete road segment topology data.

Method used

By constructing a mapping relationship between GPS trajectories and road segment trajectories, extracting features using pre-trained encoders, and combining a learnable task prompt vector with a hierarchical fusion module, adaptive fusion of cross-modal information is achieved to generate task-specific fusion representations.

Benefits of technology

It improves the fusion effect of multimodal trajectory information, addresses the problem of fixed fusion strategies and difficulty in dynamically adjusting feature contributions in traditional methods, and enhances the adaptability of trajectory representation in multi-task scenarios.

✦ Generated by Eureka AI based on patent content.

[Figure CN122241265A_ABST]

Abstract

This invention discloses a task-adaptive trajectory representation learning method, relating to the field of traffic big data analysis technology. The method includes: acquiring GPS trajectories and corresponding road network data; obtaining road segment trajectories through map matching; constructing an alignment matrix based on the GPS trajectories and road segment trajectories; inputting the two types of trajectories into a pre-trained GPS trajectory encoder and a pre-trained road segment trajectory encoder, respectively, to obtain GPS trajectory embedding sequences and road segment trajectory embedding sequences; constructing a task prompt vector based on the target downstream task and using it as task condition information; fusing the two types of embedding sequences through a hierarchical fusion module to obtain a task-specific fusion representation; and finally, inputting the fusion representation into a prediction module to output the downstream task prediction result, and updating parameters based on the task loss. This invention achieves adaptive fusion of multimodal trajectory features under different downstream task requirements, improving the task specificity and adaptability of trajectory representation.


Description

Technical Field

[0001] This invention relates to the field of traffic big data analysis technology, and in particular to a task-adaptive trajectory representation learning method.

Background Technology

[0002] In the field of traffic big data analytics, trajectory representation learning is used to transform raw trajectory data into feature representations that can be processed and analyzed by computer models. It is a fundamental technology for applications such as trajectory analysis, route planning, and traffic prediction. Existing research typically combines GPS trajectory information with road network topology information to construct multimodal trajectory representations, thereby enhancing the ability to represent the spatiotemporal characteristics of trajectories and the semantic information of paths, thus adapting to the analytical needs of complex traffic scenarios.

[0003] However, existing trajectory representation learning methods still have shortcomings in multimodal feature fusion and task adaptation. First, different downstream tasks emphasize different trajectory features. For example, travel time estimation tasks usually focus more on the spatiotemporal dynamics of GPS trajectories, while road classification tasks focus more on road network path semantics. Existing methods, however, usually adopt fixed multimodal fusion strategies, so the generated trajectory representations do not adapt to the task and struggle to meet the needs of different tasks. Second, when processing continuous GPS trajectory data and discrete road segment topology data, existing methods often do not fully consider the differences in data distribution between the two modalities, making it difficult to fully fuse features across modalities.

[0004] The patent document with publication number CN119558374A, titled "A Deep Learning-Based Trajectory Representation Pre-training System," constructs a deep learning model for multi-scale trajectory structures. It combines static and dynamic features of the road network to obtain initial feature inputs for the trajectory. A multi-scale trajectory attention encoder is used to fuse intra-layer and inter-layer features of trajectory sequence representations at different granularities to obtain rich spatiotemporal semantic information. A self-supervised pre-training module is then used to pre-train the trajectory representation based on a masked trajectory reconstruction task, resulting in trajectory vector representations usable for downstream tasks. However, this method does not introduce an adaptive fusion mechanism for specific downstream tasks, so its ability to dynamically adjust the fusion weights of different features for different downstream tasks still has room for improvement.

[0005] Therefore, how to effectively extract multimodal trajectory features and achieve task-driven dynamic feature fusion according to the needs of different downstream tasks, thereby improving the adaptability of trajectory representation in multi-task scenarios, remains an urgent technical problem to be solved.

Summary of the Invention

[0006] To solve the above technical problems, one technical solution adopted by the present invention is to provide a task-adaptive trajectory representation learning method, comprising:
obtaining a GPS trajectory and corresponding road network data;
performing map matching between the GPS trajectory and the road network data to obtain the corresponding road segment trajectory;
constructing, based on the GPS trajectory and the road segment trajectory, an alignment matrix that maps GPS trajectory points to road segments;
based on the alignment matrix, inputting the GPS trajectory into the pre-trained GPS trajectory encoder for feature extraction to obtain the GPS trajectory embedding sequence, and inputting the road segment trajectory into the pre-trained road segment trajectory encoder for feature extraction to obtain the road segment trajectory embedding sequence;
constructing a learnable task prompt vector based on the target downstream task;
using the task prompt vector as task condition information, adaptively fusing the GPS trajectory embedding sequence and the road segment trajectory embedding sequence through the hierarchical fusion module to obtain a task-specific fusion representation;
inputting the task-specific fusion representation into the prediction module corresponding to the target downstream task to obtain the downstream task prediction result;
calculating the task loss based on the prediction result and the true label;
iteratively updating, based on the task loss, the task prompt vector and the learnable parameters in the hierarchical fusion module and the prediction module.
The road network data includes a road segment set, road segment attribute information, and a topological network structure. The GPS trajectory includes GPS trajectory points arranged in chronological order.

[0007] Furthermore, the map matching includes: A map matching algorithm based on a hidden Markov model is used to match the GPS trajectory with the road network data to obtain the corresponding road segment trajectory.

[0008] Furthermore, the step of constructing, based on the GPS trajectory and the road segment trajectory, an alignment matrix that maps GPS trajectory points to road segments includes: Construct an initial matrix, where the rows correspond to the road segments in the road segment trajectory and the columns correspond to the trajectory points in the GPS trajectory; Initialize each element in the initial matrix to 0; Traverse each road segment in the road segment trajectory and extract the GPS trajectory point subsequence corresponding to each road segment from the GPS trajectory; In the initial matrix, set to 1 each element at the intersection of a road segment's row and the column of each trajectory point in its corresponding GPS trajectory point subsequence; Use the resulting matrix as the alignment matrix.

[0009] Furthermore, the pre-trained GPS trajectory encoder and the pre-trained road segment trajectory encoder are obtained through joint self-supervised pre-training, which includes the following steps:
Based on the alignment matrix, the trajectory points in the GPS trajectory and the road segments in the road segment trajectory are synchronously masked to obtain the masked GPS trajectory and the masked road segment trajectory;
The masked GPS trajectory is input into the GPS trajectory encoder for feature extraction to obtain the GPS trajectory embedding sequence;
The masked road segment trajectory is input into the road segment trajectory encoder for feature extraction to obtain the road segment trajectory embedding sequence;
Average pooling is performed on the GPS trajectory embedding sequence and the road segment trajectory embedding sequence respectively, and the calculation formula is as follows:
$\mathbf{z}^{g} = \mathrm{AvgPool}(\mathbf{H}^{g})$;
$\mathbf{z}^{r} = \mathrm{AvgPool}(\mathbf{H}^{r})$;
where $\mathbf{z}^{g}$ is the GPS trajectory representation vector, $\mathbf{H}^{g}$ is the GPS trajectory embedding sequence, $\mathbf{z}^{r}$ is the road segment trajectory representation vector, $\mathbf{H}^{r}$ is the road segment trajectory embedding sequence, and $\mathrm{AvgPool}(\cdot)$ is the average pooling operation;
Based on the GPS trajectory representation vector and the road segment trajectory representation vector, the contrast loss is calculated;
The GPS trajectory embedding sequence and the road segment trajectory embedding sequence are processed by the cross-modal interaction module to obtain the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence;
Based on the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence, the mask reconstruction loss is calculated;
A joint training loss function is constructed based on the contrast loss and the mask reconstruction loss;
The parameters of the GPS trajectory encoder, the road segment trajectory encoder, and the cross-modal interaction module are updated by minimizing the joint training loss function to obtain the pre-trained GPS trajectory encoder and the pre-trained road segment trajectory encoder.

[0010] Further, the step of inputting the masked GPS trajectory into the GPS trajectory encoder for feature extraction to obtain a GPS trajectory embedding sequence includes: Extract the features of each trajectory point in the masked GPS trajectory and construct a GPS feature matrix; Based on the alignment matrix, determine the GPS trajectory point interval corresponding to each road segment, and slice the GPS feature matrix along the time dimension to obtain the sub-feature matrix of each road segment; Input the sub-feature matrix into a first bidirectional gated recurrent network to model the GPS trajectory point sequence features within the road segment; Concatenate the final forward hidden state and the final backward hidden state of the first bidirectional gated recurrent network to obtain the initial GPS embedding of the road segment; Arrange the initial GPS embeddings into an initial GPS embedding sequence according to the driving order, and input this sequence into a second bidirectional gated recurrent network to model the temporal context dependency between road segments, thereby obtaining the GPS trajectory embedding sequence. The trajectory point features include: longitude coordinates, latitude coordinates, velocity, acceleration, angle increment, time increment, and distance.

[0011] Further, the step of inputting the masked road segment trajectory into the road segment trajectory encoder for feature extraction to obtain the road segment trajectory embedding sequence includes:
Each road segment in the road segment set is initialized to obtain an initial vector representation of each road segment;
Based on the topological network structure, a road network adjacency matrix is constructed. The initial vector representations of each road segment and its neighbors are aggregated using a graph attention network to obtain the road segment spatial feature embedding. The calculation formula is as follows:
$\mathbf{E}^{\mathrm{sp}} = \mathrm{GAT}(\mathbf{E}^{0}_{\mathcal{V}}, \mathbf{A}_{\mathrm{adj}})$;
where $\mathbf{E}^{0}_{\mathcal{V}}$ is the initial vector representation of each road segment in the road segment set, $\mathcal{V}$ is the road segment set, $\mathbf{A}_{\mathrm{adj}}$ is the road network adjacency matrix, $\mathrm{GAT}(\cdot)$ is the graph attention network layer, and $\mathbf{E}^{\mathrm{sp}}$ is the road segment spatial feature embedding;
The time information of each road segment in the road segment trajectory is obtained, and the time information is mapped into a high-dimensional time vector to obtain the road segment time feature embedding;
The road segment spatial feature embedding is added to the road segment time feature embedding to obtain the spatiotemporal representation of each road segment;
The spatiotemporal representations of the road segments are arranged into a sequence according to the driving order and input into a Transformer encoder, thereby obtaining the road segment trajectory embedding sequence.

[0012] Furthermore, the processing performed by the cross-modal interaction module includes:
The GPS trajectory embedding sequence is concatenated with the road segment trajectory embedding sequence to construct a cross-modal long sequence;
For each element in the cross-modal long sequence, the feature dimension is adjusted by linear projection, and the position embedding and modality embedding are superimposed. The calculation formula is as follows:
$\tilde{\mathbf{s}}_{i} = \mathbf{W}_{p}\,\mathbf{s}_{i} + \mathbf{E}_{\mathrm{pos}}(i) + \mathbf{E}_{\mathrm{mod}}(m_{i})$;
where $\tilde{\mathbf{s}}_{i}$ is the enhanced representation, $\mathbf{s}_{i}$ is the $i$-th element of the cross-modal long sequence, $\mathbf{W}_{p}$ is the linear projection matrix, $\mathbf{E}_{\mathrm{pos}}$ is the position embedding, $i$ is the index of the element in the sequence, $\mathbf{E}_{\mathrm{mod}}$ is the modality embedding, and $m_{i}$ is a modality identifier variable indicating the modality to which the element belongs;
The enhanced representations $\tilde{\mathbf{s}}_{i}$ form the sequence $\tilde{\mathbf{S}}$, and cross-modal information interaction is achieved through a multi-head self-attention layer with a residual structure. The calculation formula is as follows:
$\mathbf{O} = \mathrm{LN}\big(\tilde{\mathbf{S}} + \mathrm{MHSA}(\tilde{\mathbf{S}})\big)$;
where $\mathbf{O}$ is the output sequence, $\mathrm{MHSA}(\cdot)$ is the multi-head self-attention operation, $\tilde{\mathbf{S}}$ is the sequence formed by the enhanced representations $\tilde{\mathbf{s}}_{i}$, and $\mathrm{LN}(\cdot)$ is layer normalization;
The output sequence is split back, following the concatenation order used at input, into the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence.

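[0013] Furthermore, the calculation formula for the contrast loss is as follows:
$\mathcal{L}_{\mathrm{con}} = \mathcal{L}_{g \to r} + \mathcal{L}_{r \to g}$;
$\mathcal{L}_{g \to r} = -\frac{1}{B}\sum_{i=1}^{B}\log\frac{\exp\big(\mathrm{sim}(\mathbf{z}^{g}_{i}, \mathbf{z}^{r}_{i})/\tau\big)}{\sum_{j=1}^{B}\exp\big(\mathrm{sim}(\mathbf{z}^{g}_{i}, \mathbf{z}^{r}_{j})/\tau\big)}$;
$\mathcal{L}_{r \to g} = -\frac{1}{B}\sum_{i=1}^{B}\log\frac{\exp\big(\mathrm{sim}(\mathbf{z}^{r}_{i}, \mathbf{z}^{g}_{i})/\tau\big)}{\sum_{j=1}^{B}\exp\big(\mathrm{sim}(\mathbf{z}^{r}_{i}, \mathbf{z}^{g}_{j})/\tau\big)}$;
where $\mathcal{L}_{\mathrm{con}}$ is the contrast loss function, $\mathrm{sim}(\cdot,\cdot)$ is the cosine similarity function, $\mathbf{z}^{g}_{i}$ is the GPS trajectory representation vector of the $i$-th trajectory, $\mathbf{z}^{r}_{i}$ is the road segment trajectory representation vector of the $i$-th trajectory, $\mathbf{z}^{r}_{j}$ is the road segment trajectory representation vector of the $j$-th trajectory, $\mathbf{z}^{g}_{j}$ is the GPS trajectory representation vector of the $j$-th trajectory, $\tau$ is the temperature parameter, $B$ is the batch sample size, $\mathcal{L}_{g \to r}$ is the contrast loss in the direction from the GPS modality to the road segment modality, and $\mathcal{L}_{r \to g}$ is the contrast loss in the direction from the road segment modality to the GPS modality.
The calculation of the mask reconstruction loss includes:
For each mask position, extract the feature vectors of the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence at that mask position;
The feature vectors are input into the prediction layer, and the predicted probability distribution over road segment identifiers at each mask position is obtained through the Softmax function. The calculation formula is as follows:
$\mathbf{p}^{g}_{k} = \mathrm{Softmax}(\mathbf{W}^{g}\tilde{\mathbf{h}}^{g}_{k} + \mathbf{b}^{g})$;
$\mathbf{p}^{r}_{k} = \mathrm{Softmax}(\mathbf{W}^{r}\tilde{\mathbf{h}}^{r}_{k} + \mathbf{b}^{r})$;
where $\mathbf{p}^{g}_{k}$ and $\mathbf{p}^{r}_{k}$ are the predicted probability distributions on the GPS trajectory side and the road segment trajectory side at the $k$-th mask position, $\mathrm{Softmax}(\cdot)$ is the normalization function, $\mathbf{W}^{g}$ and $\mathbf{W}^{r}$ are the weight matrices on the GPS trajectory side and the road segment trajectory side, $\mathbf{b}^{g}$ and $\mathbf{b}^{r}$ are the bias vectors on the GPS trajectory side and the road segment trajectory side, and $\tilde{\mathbf{h}}^{g}_{k}$ and $\tilde{\mathbf{h}}^{r}_{k}$ are the post-interaction GPS trajectory feature vector and the post-interaction road segment trajectory feature vector at the $k$-th mask position;
Based on the probability values corresponding to the true road segment identifiers in the predicted probability distributions, the mask reconstruction loss is calculated using the following formula:
$\mathcal{L}_{\mathrm{mask}} = -\frac{1}{|\mathcal{M}|}\sum_{k \in \mathcal{M}}\big[\log \mathbf{p}^{g}_{k}(y_{k}) + \log \mathbf{p}^{r}_{k}(y_{k})\big]$;
where $|\mathcal{M}|$ is the total number of mask positions, $\mathcal{M}$ is the set of mask positions, $k$ is the mask position index, $y_{k}$ is the true road segment identifier at index $k$, $\mathbf{p}^{g}_{k}(y_{k})$ is the probability value corresponding to the true road segment identifier $y_{k}$ in the GPS trajectory-side predicted probability distribution, $\mathbf{p}^{r}_{k}(y_{k})$ is the probability value corresponding to the true road segment identifier $y_{k}$ in the road segment trajectory-side predicted probability distribution, and $\mathcal{L}_{\mathrm{mask}}$ is the mask reconstruction loss;
The joint training loss function is calculated using the following formula:
$\mathcal{L}_{\mathrm{joint}} = \lambda_{1}\mathcal{L}_{\mathrm{mask}} + \lambda_{2}\mathcal{L}_{\mathrm{con}}$;
where $\lambda_{1}$ and $\lambda_{2}$ are adjustable hyperparameters used to balance the optimization weights between the contrastive learning objective and the mask reconstruction objective, $\mathcal{L}_{\mathrm{mask}}$ is the mask reconstruction loss function, and $\mathcal{L}_{\mathrm{con}}$ is the contrast loss function.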
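The pre-training objective described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: it assumes the common symmetric InfoNCE form for the contrast loss (batch-averaged, with the two directions summed), a cross-entropy averaged over mask positions for the reconstruction loss, and hypothetical function names and default weights throughout.

```python
import math


def mean_pool(sequence):
    """Average pooling over a sequence of equal-length embedding vectors."""
    length = len(sequence)
    return [sum(vec[d] for vec in sequence) / length for d in range(len(sequence[0]))]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def directional_contrast(anchors, targets, tau):
    """InfoNCE in one direction: the i-th anchor should match the i-th target."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, target) / tau for target in targets]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        total -= logits[i] - log_denominator
    return total / len(anchors)


def contrast_loss(z_gps, z_seg, tau=0.1):
    """Assumed combination: sum of the GPS-to-segment and segment-to-GPS terms."""
    return directional_contrast(z_gps, z_seg, tau) + directional_contrast(z_seg, z_gps, tau)


def mask_reconstruction_loss(p_gps_true, p_seg_true):
    """Inputs: probability assigned to the true segment id at each mask position."""
    count = len(p_gps_true)
    total = sum(math.log(pg) + math.log(pr) for pg, pr in zip(p_gps_true, p_seg_true))
    return -total / count


def joint_loss(l_mask, l_con, lam_mask=1.0, lam_con=1.0):
    """Weighted sum of the two objectives; the weights are hypothetical defaults."""
    return lam_mask * l_mask + lam_con * l_con


# Toy batch of two trajectories with 2-dimensional pooled representations.
z_gps = [mean_pool([[1.0, 0.0], [0.8, 0.2]]), mean_pool([[0.0, 1.0], [0.2, 0.8]])]
z_seg = [[0.9, 0.1], [0.1, 0.9]]
total = joint_loss(mask_reconstruction_loss([0.7, 0.6], [0.8, 0.5]),
                   contrast_loss(z_gps, z_seg))
```

Pairing each trajectory's GPS vector with its own road segment vector yields a lower contrast loss than pairing it with another trajectory's vector, which is the behavior the contrastive objective is meant to encourage.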

[0014] Further, the step of obtaining the time information of each road segment in the road segment trajectory and mapping the time information into a high-dimensional time vector to obtain the road segment time feature embedding includes:
Based on the alignment matrix, the starting GPS trajectory point corresponding to each road segment is determined;
The timestamp of the starting GPS trajectory point is used as a reference timestamp, and the minute index and weekday index are calculated from the reference timestamp;
The minute index and the weekday index are input into learnable embedding layers to obtain the minute time embedding vector and the weekday time embedding vector, calculated using the following formula:
$\mathbf{e}^{\mathrm{min}} = f_{\mathrm{min}}\big(\mathrm{idx}_{\mathrm{min}}(t)\big)$;
$\mathbf{e}^{\mathrm{day}} = f_{\mathrm{day}}\big(\mathrm{idx}_{\mathrm{day}}(t)\big)$;
where $f_{\mathrm{min}}$ and $f_{\mathrm{day}}$ are learnable embedding mapping functions, $\mathrm{idx}_{\mathrm{min}}(t)$ is the minute index calculated from timestamp $t$, $\mathrm{idx}_{\mathrm{day}}(t)$ is the weekday index calculated from timestamp $t$, $\mathbf{e}^{\mathrm{min}}$ is the minute time embedding vector, and $\mathbf{e}^{\mathrm{day}}$ is the weekday time embedding vector;
The minute time embedding vector is added to the weekday time embedding vector to obtain the road segment time feature embedding.
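As an illustration of the time feature step, the sketch below derives the two indices from a reference timestamp and sums two lookup-table embeddings. It rests on assumptions the text does not state: the minute index is taken as minute of day (0 to 1439), the weekday index runs from 0 (Monday) to 6, timestamps are Unix seconds interpreted in UTC, and the randomly initialized tables stand in for learnable embedding layers. All names are hypothetical.

```python
import random
from datetime import datetime, timezone

EMBED_DIM = 4  # hypothetical embedding width


def time_indices(unix_seconds):
    """Return (minute-of-day index, weekday index) for a Unix timestamp in UTC."""
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    minute_index = moment.hour * 60 + moment.minute  # assumed range 0..1439
    weekday_index = moment.weekday()                 # Monday = 0 ... Sunday = 6
    return minute_index, weekday_index


# Randomly initialized tables standing in for the learnable embedding layers.
_rng = random.Random(0)
minute_table = [[_rng.uniform(-0.1, 0.1) for _ in range(EMBED_DIM)] for _ in range(1440)]
weekday_table = [[_rng.uniform(-0.1, 0.1) for _ in range(EMBED_DIM)] for _ in range(7)]


def segment_time_embedding(start_timestamp):
    """Add the minute embedding and the weekday embedding element-wise."""
    minute_index, weekday_index = time_indices(start_timestamp)
    return [m + w for m, w in zip(minute_table[minute_index], weekday_table[weekday_index])]


# Reference timestamp of a segment's first GPS point: 1970-01-01 01:30 UTC, a Thursday.
print(time_indices(5400))  # (90, 3)
print(len(segment_time_embedding(5400)))
```

In a trained model the two tables would be parameters updated by gradient descent; here they only show the lookup-and-add structure.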

[0015] Furthermore, the construction of a learnable task prompt vector based on the target downstream task includes: A unique task identifier is assigned to each downstream task, and a task prompt vector lookup table is constructed to store the initial task vector corresponding to each downstream task; The task identifier corresponding to the target downstream task is obtained, and the corresponding initial task vector is extracted from the task prompt vector lookup table through a lookup operation; The initial task vector is input into a fully connected network for dimensional mapping and nonlinear transformation to obtain the task prompt vector.
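A minimal sketch of the lookup-then-transform construction follows. The class name, dimensions, initialization scheme, the single-layer network, and the choice of tanh as the nonlinearity are all assumptions made for illustration; the text only specifies a lookup table followed by a fully connected network with a nonlinear transformation.

```python
import math
import random


class TaskPromptTable:
    """Hypothetical lookup table plus one fully connected layer."""

    def __init__(self, task_names, init_dim, prompt_dim, seed=0):
        rng = random.Random(seed)
        # Unique integer identifier per downstream task.
        self.task_ids = {name: idx for idx, name in enumerate(task_names)}
        # One initial task vector per task (would be learnable in training).
        self.table = [[rng.gauss(0.0, 0.02) for _ in range(init_dim)] for _ in task_names]
        # Fully connected layer mapping init_dim -> prompt_dim.
        self.weight = [[rng.gauss(0.0, 0.1) for _ in range(init_dim)] for _ in range(prompt_dim)]
        self.bias = [0.0] * prompt_dim

    def prompt(self, task_name):
        """Look up the initial task vector, then apply linear map + tanh."""
        initial = self.table[self.task_ids[task_name]]
        return [
            math.tanh(sum(w * x for w, x in zip(row, initial)) + b)
            for row, b in zip(self.weight, self.bias)
        ]


prompts = TaskPromptTable(["travel_time_estimation", "road_classification"],
                          init_dim=8, prompt_dim=4)
p_tte = prompts.prompt("travel_time_estimation")
p_cls = prompts.prompt("road_classification")
```

Because each task has its own row in the table, the two tasks receive different prompt vectors even though they share the fully connected layer.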

[0016] Furthermore, the processing performed by the hierarchical fusion module includes:
For the $i$-th road segment in the trajectory, the task prompt vector, the representation of the $i$-th road segment in the GPS trajectory embedding sequence, and the representation of the $i$-th road segment in the road segment trajectory embedding sequence are concatenated, and a gating weight is generated through nonlinear mapping. The calculation formula is as follows:
$\mathbf{g}_{i} = \sigma\big(\mathbf{W}_{g}\,[\mathbf{p}_{\mathrm{task}} \,\Vert\, \mathbf{h}^{g}_{i} \,\Vert\, \mathbf{h}^{r}_{i}] + \mathbf{b}_{g}\big)$;
where $\mathbf{g}_{i}$ is the gating weight, $\mathbf{h}^{g}_{i}$ is the representation of the $i$-th road segment in the GPS trajectory embedding sequence, $\mathbf{h}^{r}_{i}$ is the representation of the $i$-th road segment in the road segment trajectory embedding sequence, $[\cdot \,\Vert\, \cdot]$ denotes the feature concatenation operation, $\mathbf{b}_{g}$ is the bias vector of the gated mapping, $\mathbf{W}_{g}$ is the weight matrix of the gated mapping, $\sigma(\cdot)$ is the Sigmoid function, and $\mathbf{p}_{\mathrm{task}}$ is the task prompt vector;
The gating weight is used to perform a weighted fusion of the representation of the $i$-th road segment in the GPS trajectory embedding sequence and the representation of the $i$-th road segment in the road segment trajectory embedding sequence to obtain a comprehensive representation of the corresponding road segment, and the comprehensive representations are then combined into a comprehensive feature sequence according to the driving order. The calculation formula is as follows:
$\mathbf{f}_{i} = \mathbf{g}_{i} \odot \mathbf{h}^{g}_{i} + (1 - \mathbf{g}_{i}) \odot \mathbf{h}^{r}_{i}$;
$\mathbf{F} = [\mathbf{f}_{1}, \mathbf{f}_{2}, \ldots, \mathbf{f}_{n}]$;
where $\mathbf{h}^{g}_{i}$ is the representation of the $i$-th road segment in the GPS trajectory embedding sequence, $\mathbf{h}^{r}_{i}$ is the representation of the $i$-th road segment in the road segment trajectory embedding sequence, $\odot$ is element-wise multiplication, $\mathbf{g}_{i}$ is the gating weight, $n$ is the number of road segments in the trajectory, $\mathbf{f}_{i}$ is the comprehensive representation of the $i$-th road segment, and $\mathbf{F}$ is the comprehensive feature sequence;
The query vector is obtained by linear projection of the task prompt vector, and the key matrix and value matrix are obtained by linear projection of the comprehensive feature sequence, respectively. The query vector, the key matrix, and the value matrix are input into a scaled dot-product attention mechanism to obtain the task-specific fusion representation. The calculation formula is as follows:
$\mathbf{q} = \mathbf{p}_{\mathrm{task}}\mathbf{W}_{Q}$;
$\mathbf{K} = \mathbf{F}\mathbf{W}_{K}, \quad \mathbf{V} = \mathbf{F}\mathbf{W}_{V}$;
$\mathbf{z}^{\mathrm{task}} = \mathrm{Softmax}\!\left(\frac{\mathbf{q}\mathbf{K}^{\top}}{\sqrt{d_{k}}}\right)\mathbf{V}$;
where $\mathbf{p}_{\mathrm{task}}$ is the task prompt vector, $\mathbf{W}_{Q}$, $\mathbf{W}_{K}$, and $\mathbf{W}_{V}$ are the query projection matrix, key projection matrix, and value projection matrix, respectively, $d_{k}$ is the dimension of the key vector, $\mathbf{K}^{\top}$ is the transpose of the key matrix, $\mathrm{Softmax}(\cdot)$ is the normalization function, $\mathbf{q}$ is the query vector, $\mathbf{K}$ is the key matrix, $\mathbf{V}$ is the value matrix, $\mathbf{z}^{\mathrm{task}}$ is the task-specific fusion representation, and $\mathbf{F}$ is the comprehensive feature sequence.
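The two levels of the fusion module can be sketched with plain Python lists. The sketch assumes the gate forms a convex combination (gate times the GPS representation plus one minus gate times the road segment representation), that both modalities share the same embedding width, and that the weight matrices are supplied by the caller. Function names and the toy weights are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(prompt, gps_seq, seg_seq, w_gate, b_gate):
    """Level 1: per-segment gate from [prompt || h_gps || h_seg], then blend."""
    fused = []
    for h_gps, h_seg in zip(gps_seq, seg_seq):
        pre_activation = matvec(w_gate, prompt + h_gps + h_seg)
        gate = [sigmoid(a + b) for a, b in zip(pre_activation, b_gate)]
        fused.append([g * x + (1.0 - g) * y for g, x, y in zip(gate, h_gps, h_seg)])
    return fused


def task_attention(prompt, fused, w_q, w_k, w_v):
    """Level 2: the prompt queries the fused sequence via scaled dot-product attention."""
    query = matvec(w_q, prompt)
    keys = [matvec(w_k, f) for f in fused]
    values = [matvec(w_v, f) for f in fused]
    scale = math.sqrt(len(query))
    weights = softmax([sum(q * k for q, k in zip(query, key)) / scale for key in keys])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(len(values[0]))]


# Toy example: 2 road segments, embedding width 2, prompt width 2.
identity = [[1.0, 0.0], [0.0, 1.0]]
prompt = [1.0, 0.0]
gps_seq = [[1.0, 0.0], [0.0, 1.0]]
seg_seq = [[0.0, 1.0], [1.0, 0.0]]
zero_gate = [[0.0] * 6 for _ in range(2)]  # zero weights give a gate of 0.5 everywhere
fused = gated_fusion(prompt, gps_seq, seg_seq, zero_gate, [0.0, 0.0])
z_task = task_attention(prompt, fused, identity, identity, identity)
```

With zero gate weights each fused vector is the plain average of the two modalities; a trained gate would instead shift the balance per dimension depending on the task prompt.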

[0017] Furthermore, the prediction module includes at least one fully connected layer for mapping the task-specific fusion representation to a space that matches the output dimension of the target downstream task; The task loss is calculated based on the prediction results and the true labels, using a loss function adapted to the target downstream task.

[0018] The beneficial effects of this invention are:

[0019] This invention provides a task-adaptive trajectory representation learning method. By introducing a learnable task prompt vector, the model can dynamically focus on relevant features in different modalities according to task requirements, thereby improving the fusion effect of multimodal trajectory information and addressing the problem in traditional methods where the fusion strategy is fixed and it is difficult to dynamically adjust feature contributions according to downstream task requirements.

Attached Figure Description

[0020] Figure 1 is a flowchart of the task-adaptive trajectory representation learning method provided by the present invention.

[0021] Figure 2 is an architecture diagram of the task-adaptive trajectory representation learning method provided by the present invention.

Detailed Implementation

[0022] Exemplary embodiments of the present invention will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided to enable a more thorough understanding of the invention and to fully convey the scope of the invention to those skilled in the art.

[0023] Embodiment 1

[0024] The task-adaptive trajectory representation learning method involved in Embodiment 1 of the present invention includes:

[0025] The process involves: acquiring GPS trajectories and corresponding road network data; performing map matching between the GPS trajectories and the road network data to obtain corresponding road segment trajectories; constructing an alignment matrix for mapping GPS trajectory points to road segments based on the GPS trajectories and road segment trajectories; inputting the GPS trajectories into a pre-trained GPS trajectory encoder for feature extraction to obtain GPS trajectory embedding sequences, and inputting the road segment trajectories into a pre-trained road segment trajectory encoder for feature extraction to obtain road segment trajectory embedding sequences; constructing learnable task prompt vectors based on the target downstream task; and using the task prompt vectors as task condition information. The GPS trajectory embedding sequence and the road segment trajectory embedding sequence are adaptively fused through a hierarchical fusion module to obtain a task-specific fusion representation. The task-specific fusion representation is input into a prediction module corresponding to the target downstream task to obtain the downstream task prediction result. The task loss is calculated based on the prediction result and the real label. Based on the task loss, the task prompt vector, the hierarchical fusion module, and the learnable parameters in the prediction module are iteratively updated. The road network data includes: a set of road segments, road segment attribute information, and a topological network structure. The GPS trajectory includes: GPS trajectory points arranged in chronological order.

[0026] Specifically, Figure 1 shows a flowchart of the task-adaptive trajectory representation learning method according to Embodiment 1 of the present invention, which includes: S1. Obtain the GPS trajectory and corresponding road network data; For example, in this embodiment of the invention, the GPS trajectory within the target area and the road network data of the corresponding area are first obtained. The GPS trajectory consists of multiple GPS trajectory points arranged in chronological order, and each GPS trajectory point includes longitude, latitude, and timestamp. For example, a GPS trajectory can be represented as: $T = \{p_{1}, p_{2}, \ldots, p_{n}\}$, where $p_{i}$ denotes the $i$-th GPS trajectory point.

[0027] Each GPS trajectory point is represented as: $p_{i} = (\mathrm{lon}_{i}, \mathrm{lat}_{i}, t_{i})$; where $\mathrm{lon}_{i}$ denotes the longitude of the $i$-th trajectory point, $\mathrm{lat}_{i}$ denotes the latitude of the $i$-th trajectory point, and $t_{i}$ denotes its timestamp.

[0028] S2. Perform map matching between the GPS trajectory and the road network data to obtain the corresponding road segment trajectory; Further, the map matching in step S2 uses a map matching algorithm based on a hidden Markov model. For example, in this embodiment of the invention, the GPS trajectory points in the GPS trajectory are arranged in chronological order to form an observation sequence, and candidate road segments in the road network data that are spatially adjacent to each GPS trajectory point are used as the set of hidden states. The observation probability is determined based on the spatial distance between each candidate road segment and the corresponding GPS trajectory point, and the state transition probability is determined based on the connectivity between candidate road segments at adjacent times, the road network distance, and the motion continuity between the GPS trajectory points. Then, the Viterbi algorithm is used to solve for the optimal hidden state sequence corresponding to the observation sequence, which gives the most likely road segment sequence matching the GPS trajectory, and this sequence is used as the corresponding road segment trajectory.
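The hidden Markov model matching described in step S2 can be illustrated with a toy Viterbi decoder. Everything concrete here is an assumption for illustration: points and segments live in a planar coordinate system, each segment is reduced to a single representative point, the observation probability is a Gaussian of the point-to-segment distance, and the transition probabilities are fixed constants based on connectivity only (the text also mentions road network distance and motion continuity, which this sketch omits). Every GPS point is assumed to have at least one candidate within the search radius.

```python
import math


def viterbi_map_match(points, segments, successors, sigma=20.0, radius=120.0):
    """points: [(x, y)]; segments: {id: (x, y)}; successors: {id: set of next ids}."""

    def emission_log(point, seg_id):
        distance = math.dist(point, segments[seg_id])
        return -0.5 * (distance / sigma) ** 2

    def transition_log(prev_id, cur_id):
        if prev_id == cur_id:
            return math.log(0.6)    # staying on the same segment
        if cur_id in successors.get(prev_id, set()):
            return math.log(0.39)   # moving to a connected segment
        return math.log(0.01)       # jumping to an unconnected segment

    # Hidden states per time step: segments within the search radius.
    candidates = [[s for s in segments if math.dist(p, segments[s]) <= radius]
                  for p in points]
    score = {s: emission_log(points[0], s) for s in candidates[0]}
    back_pointers = []
    for t in range(1, len(points)):
        new_score, pointers = {}, {}
        for cur in candidates[t]:
            best_prev = max(score, key=lambda prev: score[prev] + transition_log(prev, cur))
            new_score[cur] = (score[best_prev] + transition_log(best_prev, cur)
                              + emission_log(points[t], cur))
            pointers[cur] = best_prev
        score = new_score
        back_pointers.append(pointers)

    # Backtrack from the best final state.
    state = max(score, key=score.get)
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def collapse(path):
    """Merge consecutive repeats into a segment trajectory with point index spans."""
    seg_ids, spans = [], []
    for index, seg_id in enumerate(path):
        if seg_ids and seg_ids[-1] == seg_id:
            spans[-1] = (spans[-1][0], index)
        else:
            seg_ids.append(seg_id)
            spans.append((index, index))
    return seg_ids, spans


segments = {"a": (0.0, 0.0), "b": (100.0, 0.0), "c": (200.0, 0.0)}
successors = {"a": {"b"}, "b": {"c"}}
points = [(2.0, 3.0), (10.0, -4.0), (95.0, 5.0), (110.0, 2.0), (190.0, -3.0)]
path = viterbi_map_match(points, segments, successors)
seg_trajectory, spans = collapse(path)
print(seg_trajectory, spans)
```

The spans returned by `collapse` record which GPS points fall on each matched segment, which is the information the later alignment step needs.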

[0029] S3. Construct, based on the GPS trajectory and the road segment trajectory, an alignment matrix that maps GPS trajectory points to road segments; Further, step S3 includes:
S3.1. Construct an initial matrix, where the rows correspond to the road segments in the road segment trajectory and the columns correspond to the trajectory points in the GPS trajectory;
S3.2. Initialize each element in the initial matrix to 0;
S3.3. Traverse each road segment in the road segment trajectory and extract the GPS trajectory point subsequence corresponding to each road segment from the GPS trajectory;
S3.4. In the initial matrix, set to 1 each element at the intersection of a road segment's row and the column of each trajectory point in its corresponding GPS trajectory point subsequence;
S3.5. Use the resulting matrix as the alignment matrix.

[0030] For example, after obtaining the road segment trajectory, an alignment matrix is further constructed based on the GPS trajectory and the road segment trajectory to establish the mapping relationship between GPS trajectory points and road segments. Specifically, if a trajectory contains $m$ road segments and $n$ GPS trajectory points, an $m \times n$ initial matrix $\mathbf{A}$ is constructed. The rows of the initial matrix correspond to the road segments in the road segment trajectory, and the columns correspond to the trajectory points in the GPS trajectory. Initially, all elements in the matrix are assigned a value of 0.

[0031] Then, each road segment in the road segment trajectory is traversed, and the subsequence of GPS trajectory points corresponding to that road segment is determined based on the map matching results. For example, if the $i$-th road segment corresponds to the $s$-th through $e$-th GPS trajectory points, then the elements in the $i$-th row of the initial matrix $\mathbf{A}$, from column $s$ to column $e$, are assigned the value 1. After completing the traversal, the final alignment matrix is obtained.
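Steps S3.1 to S3.5 translate directly into a few lines of Python. The sketch assumes the map matching result is available as a list of inclusive, 0-based point index ranges, one per road segment; the function name and that input format are hypothetical.

```python
def build_alignment_matrix(num_points, segment_spans):
    """Rows are road segments, columns are GPS points.

    segment_spans holds one (start, end) pair per segment, using inclusive,
    0-based indices of the GPS points matched to that segment.
    """
    # S3.1 and S3.2: initial matrix filled with zeros.
    matrix = [[0] * num_points for _ in segment_spans]
    # S3.3 and S3.4: mark each segment's point subsequence with ones.
    for row, (start, end) in enumerate(segment_spans):
        for col in range(start, end + 1):
            matrix[row][col] = 1
    # S3.5: the filled matrix is the alignment matrix.
    return matrix


# Three segments covering eight GPS points.
alignment = build_alignment_matrix(8, [(0, 2), (3, 4), (5, 7)])
for row in alignment:
    print(row)
# [1, 1, 1, 0, 0, 0, 0, 0]
# [0, 0, 0, 1, 1, 0, 0, 0]
# [0, 0, 0, 0, 0, 1, 1, 1]
```

When every GPS point is matched to exactly one segment, each column contains a single 1, so the column sums give a quick sanity check on the map matching output.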

[0032] S4. Based on the alignment matrix, the GPS trajectory is input into the pre-trained GPS trajectory encoder for feature extraction to obtain the GPS trajectory embedding sequence, and the road segment trajectory is input into the pre-trained road segment trajectory encoder for feature extraction to obtain the road segment trajectory embedding sequence. Furthermore, the pre-trained GPS trajectory encoder and the pre-trained road segment trajectory encoder mentioned in step S4 are obtained through joint self-supervised pre-training, which includes the following steps: S4.1. Based on the alignment matrix, synchronously mask the trajectory points in the GPS trajectory and the road segments in the road segment trajectory to obtain the masked GPS trajectory and the masked road segment trajectory; For example, one or more target road segment positions in the road segment trajectory are randomly selected as mask positions, and the corresponding road segments are replaced with a preset mask token on the road segment trajectory side. At the same time, the subsequence of GPS trajectory points corresponding to each target road segment is determined according to the alignment matrix, and the trajectory points in that subsequence are masked synchronously on the GPS trajectory side.
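A small sketch of the synchronized masking in step S4.1 follows. The mask ratio, the string mask token, and the choice to replace whole GPS points with that token are assumptions; the text only says that a preset mask marker is used on the road segment side and that the aligned GPS points are masked at the same time.

```python
import random

MASK = "[MASK]"  # hypothetical mask token


def synchronized_mask(segment_ids, gps_points, alignment, mask_ratio=0.3, rng=None):
    """Mask random segments and, via the alignment matrix, their GPS points."""
    rng = rng or random.Random(0)
    num_masked = max(1, round(mask_ratio * len(segment_ids)))
    masked_rows = sorted(rng.sample(range(len(segment_ids)), num_masked))

    masked_segments = list(segment_ids)
    masked_points = list(gps_points)
    for row in masked_rows:
        masked_segments[row] = MASK
        # Columns holding a 1 in this row are the GPS points on the segment.
        for col, flag in enumerate(alignment[row]):
            if flag == 1:
                masked_points[col] = MASK
    return masked_segments, masked_points, masked_rows


segment_ids = ["s1", "s2", "s3"]
gps_points = ["p0", "p1", "p2", "p3", "p4"]
alignment = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
]
masked_segments, masked_points, masked_rows = synchronized_mask(
    segment_ids, gps_points, alignment)
print(masked_segments, masked_points, masked_rows)
```

Returning the masked row indices keeps track of the positions at which the reconstruction loss would later be evaluated.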

[0033] S4.2. Input the masked GPS trajectory into the GPS trajectory encoder for feature extraction to obtain the GPS trajectory embedding sequence. Further, step S4.2 includes:

S4.2.1. Extract the features of each trajectory point in the masked GPS trajectory and construct a GPS feature matrix. The trajectory point features include: longitude, latitude, velocity, acceleration, angle increment, time increment, and distance.

[0034] For example, for the $i$-th trajectory point, seven-dimensional feature data is extracted: longitude, latitude, velocity, acceleration, angle increment, time increment, and distance. Velocity and acceleration are calculated from the positional and temporal changes between adjacent GPS trajectory points; angle increment, time increment, and distance represent the change in heading angle, the time interval, and the spatial distance between the current GPS trajectory point and the previous one, respectively. For the trajectory starting point, which has no previous GPS trajectory point, the velocity, acceleration, angle increment, time increment, and distance are initialized to zero.

[0035] The seven-dimensional feature data is arranged in order to form the feature vector of the $i$-th GPS trajectory point, $x_i = [\mathrm{lon}_i, \mathrm{lat}_i, v_i, a_i, \Delta\theta_i, \Delta t_i, \Delta d_i]$, where $\mathrm{lon}_i$ and $\mathrm{lat}_i$ denote the longitude and latitude of the $i$-th GPS trajectory point, $v_i$ its velocity, $a_i$ its acceleration, $\Delta\theta_i$ the change in heading angle relative to the previous GPS trajectory point, $\Delta t_i$ the time interval from the previous GPS trajectory point, and $\Delta d_i$ the spatial distance from the previous GPS trajectory point. The feature vectors of all GPS trajectory points are stacked row by row in chronological order to construct a GPS feature matrix of dimension $N \times 7$, where $N$ denotes the number of GPS trajectory points. The GPS feature matrix can be expressed as $X = [x_1; x_2; \dots; x_N] \in \mathbb{R}^{N \times 7}$.
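As a rough illustration of the feature extraction described above, the sketch below builds the seven-column feature matrix from `(lon, lat, unix_seconds)` tuples. The haversine distance, the great-circle bearing used for the heading, the units (meters, seconds, degrees), and all function names are assumptions; the text does not fix how distance or heading is computed.

```python
import math
import numpy as np

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters (assumed distance measure)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371000.0 * math.asin(math.sqrt(a))

def bearing_deg(lon1, lat1, lon2, lat2):
    """Initial bearing from point 1 to point 2, in degrees within [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360

def gps_feature_matrix(points):
    """points: list of (lon, lat, unix_seconds). Returns an (N, 7) array with
    columns lon, lat, velocity, acceleration, angle incr., time incr., distance."""
    feats = np.zeros((len(points), 7))
    prev_speed, prev_heading = 0.0, None
    for i, (lon, lat, t) in enumerate(points):
        feats[i, 0], feats[i, 1] = lon, lat
        if i == 0:
            continue  # starting point: motion features stay at zero
        plon, plat, pt = points[i - 1]
        dt = t - pt
        dist = haversine_m(plon, plat, lon, lat)
        speed = dist / dt if dt > 0 else 0.0
        accel = (speed - prev_speed) / dt if dt > 0 else 0.0
        heading = bearing_deg(plon, plat, lon, lat)
        # wrap the heading change into [-180, 180)
        dtheta = 0.0 if prev_heading is None else (heading - prev_heading + 180) % 360 - 180
        feats[i, 2:] = [speed, accel, dtheta, dt, dist]
        prev_speed, prev_heading = speed, heading
    return feats

# Hypothetical track heading due north, one fix every 10 seconds.
track = [(115.0, 28.000, 0), (115.0, 28.001, 10), (115.0, 28.002, 20)]
X = gps_feature_matrix(track)
```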

[0036] S4.2.2. Determine the GPS trajectory point interval corresponding to each road segment based on the alignment matrix, and slice the GPS feature matrix along the time dimension to obtain the sub-feature matrix of each road segment.

S4.2.3. Input the sub-feature matrix into the first bidirectional gated recurrent network to model the sequence features of the GPS trajectory points inside the road segment, and concatenate the final-time-step hidden states of the forward and backward directions to obtain the initial GPS embedding of the road segment. For example, the sub-feature matrix of the $k$-th road segment is $X_k \in \mathbb{R}^{n_k \times d}$, where $n_k$ denotes the number of GPS trajectory points contained in the $k$-th road segment and $d$ denotes the feature dimension. The sub-feature matrix $X_k$ is unrolled along the time dimension into a sequence $(x_{k,1}, \dots, x_{k,n_k})$, where $x_{k,t}$ denotes the feature vector of the $k$-th road segment at the $t$-th time step. The sequence is then input into the first bidirectional gated recurrent network for encoding, which computes the forward hidden states and the backward hidden states. The final forward state $\overrightarrow{h}_k$ and the final backward state $\overleftarrow{h}_k$ are concatenated to obtain the initial GPS embedding of the road segment, $e_k = [\overrightarrow{h}_k \,\Vert\, \overleftarrow{h}_k]$, where $\Vert$ denotes vector concatenation.
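The slicing in step S4.2.2 can be sketched as follows. The helper name `slice_by_segment` and the toy data are hypothetical; the recurrent encoder of step S4.2.3 is deliberately left out, since it would require a deep learning framework.

```python
import numpy as np

def slice_by_segment(features, alignment):
    """Return one sub-feature matrix per road segment.

    features:  (num_points, feature_dim) GPS feature matrix
    alignment: (num_segments, num_points) 0/1 alignment matrix
    """
    return [features[np.flatnonzero(row)] for row in alignment]

alignment = np.array([[1, 1, 1, 0, 0],
                      [0, 0, 0, 1, 1]])
features = np.arange(35, dtype=float).reshape(5, 7)  # toy feature values
sub_matrices = slice_by_segment(features, alignment)
```

Each sub-matrix keeps the chronological order of its points, so it can be fed to a sequence encoder one road segment at a time.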

[0037] S4.2.4. The initial GPS embeddings are arranged into an initial GPS embedding sequence in driving order, and the sequence is input into a second bidirectional gated recurrent network to model the temporal context dependency between road segments, thereby obtaining the GPS trajectory embedding sequence.

[0038] For example, the initial GPS embeddings of the road segments are arranged into a sequence in driving order and input into the second bidirectional gated recurrent network to capture long-range temporal dependencies between road segments. After encoding, a forward hidden state and a backward hidden state are obtained for each road segment position. The forward and backward hidden states at the same position are concatenated to obtain a GPS trajectory embedding sequence that fuses contextual information. The embedding at road segment position $k$ is $h^{g}_k = [\overrightarrow{h}^{g}_k \,\Vert\, \overleftarrow{h}^{g}_k]$, where $\overrightarrow{h}^{g}_k$ and $\overleftarrow{h}^{g}_k$ are the forward and backward hidden states at position $k$ and $\Vert$ denotes vector concatenation.

[0039] S4.3. Input the masked road segment trajectory into the road segment trajectory encoder for feature extraction to obtain the road segment trajectory embedding sequence. Further, step S4.3 includes:

S4.3.1. Initialize each road segment in the road segment set to obtain the initial vector representation of each road segment.

S4.3.2. Based on the topological network structure, construct a road network adjacency matrix, and use a graph attention network to aggregate the initial vector representations of each road segment and its neighbors to obtain the road segment spatial feature embedding. The graph attention network layer takes as input the initial vector representations of the road segments in the road segment set together with the road network adjacency matrix, and outputs the road segment spatial feature embedding.

S4.3.3. Obtain the time information of each road segment in the road segment trajectory, and map the time information into a high-dimensional time vector to obtain the road segment time feature embedding. Further, step S4.3.3 includes:

S4.3.3.1. Based on the alignment matrix, determine the starting GPS trajectory point corresponding to each road segment.

S4.3.3.2. Use the timestamp of the starting GPS trajectory point as a reference timestamp, and calculate the minute index and the weekday index from the reference timestamp. For example, under the weekday encoding rule, Monday through Sunday are mapped to consecutive indices. When the reference timestamp is 08:23:45 on May 15, 2024, which falls on a Wednesday, the weekday index is the index assigned to Wednesday. The hour and minute information is then used to convert this time into a minute index within the current day.
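The worked example above can be checked with the standard library. The concrete encodings here are assumptions, since the index values did not survive in the text: the sketch takes the minute index to be `hour * 60 + minute` and the weekday index to be 0-based with Monday as 0 (Python's `datetime.weekday()` convention). A 1-based weekday rule would shift the weekday index by one.

```python
from datetime import datetime

def time_indices(ts):
    """Return (minute_index, weekday_index) for a reference timestamp."""
    minute_index = ts.hour * 60 + ts.minute  # assumed: minutes since midnight
    weekday_index = ts.weekday()             # assumed: Monday=0 ... Sunday=6
    return minute_index, weekday_index

reference = datetime(2024, 5, 15, 8, 23, 45)  # a Wednesday
minute_index, weekday_index = time_indices(reference)
```

Under these assumptions the minute index is 503 and the weekday index is 2; the seconds component is ignored.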

[0040] S4.3.3.3. Input the minute index and the weekday index into learnable embedding layers to obtain the minute time embedding vector and the weekday time embedding vector: $e^{\min}_t = \mathrm{Emb}_{\min}(m_t)$ and $e^{\mathrm{day}}_t = \mathrm{Emb}_{\mathrm{day}}(w_t)$, where $\mathrm{Emb}_{\min}$ and $\mathrm{Emb}_{\mathrm{day}}$ denote learnable embedding mapping functions, $m_t$ is the minute index calculated for timestamp $t$, $w_t$ is the weekday index calculated for timestamp $t$, $e^{\min}_t$ is the minute time embedding vector, and $e^{\mathrm{day}}_t$ is the weekday time embedding vector.

S4.3.3.4. Add the minute time embedding vector to the weekday time embedding vector to obtain the road segment time feature embedding.
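A minimal sketch of the two lookups and their sum is shown below, with randomly initialized NumPy tables standing in for learnable embedding layers. The table sizes (1440 minutes per day, 7 weekdays), the embedding dimension, and the example indices are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 8  # hypothetical embedding dimension

# Stand-ins for learnable embedding tables (trained by backpropagation in practice).
minute_table = rng.normal(size=(1440, EMBED_DIM))  # one row per minute of the day
weekday_table = rng.normal(size=(7, EMBED_DIM))    # one row per day of the week

def time_feature_embedding(minute_index, weekday_index):
    """Road segment time feature embedding = minute embedding + weekday embedding."""
    return minute_table[minute_index] + weekday_table[weekday_index]

e_time = time_feature_embedding(minute_index=503, weekday_index=2)
```

Because the two vectors are added rather than concatenated, both tables must share the same embedding dimension.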

[0041] S4.3.4. Add the road segment spatial feature embedding to the road segment time feature embedding to obtain the spatiotemporal representation of each road segment.

S4.3.5. Arrange the spatiotemporal representations of the road segments into a sequence in driving order, and input the sequence into the Transformer encoder for modeling to obtain the road segment trajectory embedding sequence.

[0042] S4.4. Perform average pooling on the GPS trajectory embedding sequence and the road segment trajectory embedding sequence respectively: $z^{g} = \mathrm{MeanPool}(H^{g})$ and $z^{r} = \mathrm{MeanPool}(H^{r})$, where $z^{g}$ is the GPS trajectory representation vector, $H^{g}$ is the GPS trajectory embedding sequence, $z^{r}$ is the road segment trajectory representation vector, $H^{r}$ is the road segment trajectory embedding sequence, and $\mathrm{MeanPool}$ is the average pooling operation.

S4.5. Calculate the contrast loss based on the GPS trajectory representation vector and the road segment trajectory representation vector. Further, the contrast loss in step S4.5 is built from a cosine similarity function applied to the GPS trajectory representation vectors and the road segment trajectory representation vectors of the $i$-th and $j$-th trajectories in a batch, a temperature parameter, and the batch sample size. It comprises a contrast loss in the direction from the GPS modality to the road segment modality and a contrast loss in the direction from the road segment modality to the GPS modality.

S4.6. Perform cross-modal information interaction on the GPS trajectory embedding sequence and the road segment trajectory embedding sequence through the cross-modal interaction module to obtain the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence. Further, the cross-modal interaction module in step S4.6 includes:

S4.6.1. Concatenate the GPS trajectory embedding sequence with the road segment trajectory embedding sequence to construct a cross-modal long sequence.

S4.6.2. For each element in the cross-modal long sequence, adjust the feature dimension by linear projection and superimpose the position embedding and the modality embedding: $\tilde{u}_i = W u_i + E_{\mathrm{pos}}(i) + E_{\mathrm{mod}}(c_i)$, where $\tilde{u}_i$ is the enhanced representation, $u_i$ is the $i$-th element of the cross-modal long sequence, $W$ is the linear projection matrix, $E_{\mathrm{pos}}$ is the position embedding, $i$ is the index of the element in the sequence, $E_{\mathrm{mod}}$ is the modality embedding, and $c_i$ is a modality identifier variable indicating the modality to which the element belongs.

S4.6.3. The enhanced representations are composed into a sequence, and cross-modal information interaction is achieved through a multi-head self-attention layer with a residual structure. The output sequence is obtained by applying the multi-head self-attention operation to the sequence of enhanced representations, together with layer normalization.

S4.6.4. Based on the concatenation order at input, the output sequence is split back into the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence.
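The pooling of step S4.4 and the bidirectional contrast loss of step S4.5 can be sketched as below. The sketch assumes an InfoNCE-style form, in which matched GPS and road segment vectors of the same trajectory are positives and the other trajectories in the batch are negatives. It also assumes that the two directional losses are averaged with equal weight. Neither the exact formula nor the combination rule is recoverable from the text, and all names are hypothetical.

```python
import numpy as np

def directional_loss(anchor, candidates, tau):
    """Contrast loss from the anchor modality to the candidate modality."""
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    logits = (a @ c.T) / tau                          # cosine similarity / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                # positives lie on the diagonal

def contrast_loss(gps_seq, road_seq, tau=0.1):
    """gps_seq, road_seq: (batch, length, dim) embedding sequences."""
    z_g = gps_seq.mean(axis=1)   # average pooling over the sequence
    z_r = road_seq.mean(axis=1)
    # Assumption: equal-weight average of the two directions.
    return 0.5 * (directional_loss(z_g, z_r, tau) + directional_loss(z_r, z_g, tau))

# Toy batch of 4 trajectories with mutually orthogonal representations.
gps_seq = np.eye(4)[:, None, :]
road_seq = gps_seq.copy()
aligned = contrast_loss(gps_seq, road_seq)          # matched pairs: near zero
shuffled = contrast_loss(gps_seq, road_seq[::-1])   # mismatched pairs: large
```

The loss is small when each trajectory's two modal representations agree and large when they are paired with the wrong trajectory, which is the alignment behaviour the pre-training objective relies on.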

[0043] S4.7. Based on the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence, calculate the mask reconstruction loss. Further, the calculation of the mask reconstruction loss in step S4.7 includes:

S4.7.1. For each mask position, extract the feature vectors of the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence at that mask position.

S4.7.2. Input the feature vectors into the prediction layer, and obtain the predicted probability distribution over road segment identifiers at each mask position through the Softmax function: $P^{g}_i = \mathrm{Softmax}(W_g h^{g}_i + b_g)$ and $P^{r}_i = \mathrm{Softmax}(W_r h^{r}_i + b_r)$, where $P^{g}_i$ and $P^{r}_i$ are the predicted probability distributions on the GPS trajectory side and the road segment trajectory side at the $i$-th mask position, $\mathrm{Softmax}$ is the normalization function, $W_g$ and $W_r$ are the weight matrices on the GPS trajectory side and the road segment trajectory side, $b_g$ and $b_r$ are the corresponding bias vectors, and $h^{g}_i$ and $h^{r}_i$ are the post-interaction GPS trajectory feature vector and the post-interaction road segment trajectory feature vector at the $i$-th mask position.

S4.7.3. Calculate the mask reconstruction loss from the probability values that the predicted probability distributions assign to the true road segment identifiers. The loss is taken over the set of mask positions and uses the total number of mask positions; for each mask position index, it uses the probability value of the true road segment identifier at that position in the GPS-trajectory-side predicted distribution and in the road-segment-trajectory-side predicted distribution.
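A minimal sketch of steps S4.7.2 and S4.7.3 follows. It assumes the loss is the negative log-probability of the true road segment identifier, summed over the two sides and averaged over mask positions; the exact normalization is not recoverable from the text, and the function and variable names are hypothetical.

```python
import numpy as np

def log_softmax(x):
    shifted = x - x.max(axis=1, keepdims=True)  # numerical stability
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def mask_reconstruction_loss(h_gps, h_road, W_g, b_g, W_r, b_r, true_ids):
    """h_gps, h_road: (num_masked, dim) post-interaction features at mask positions.
    W_*: (num_segment_ids, dim) prediction weights; true_ids: (num_masked,) labels."""
    logp_g = log_softmax(h_gps @ W_g.T + b_g)    # GPS-side prediction
    logp_r = log_softmax(h_road @ W_r.T + b_r)   # road-segment-side prediction
    rows = np.arange(len(true_ids))
    # Assumption: both sides contribute equally, averaged over mask positions.
    return -np.mean(logp_g[rows, true_ids] + logp_r[rows, true_ids])

num_segment_ids, dim = 5, 4
h_gps = np.ones((3, dim))
h_road = np.ones((3, dim))
W = np.zeros((num_segment_ids, dim))  # zero weights give uniform predictions
b = np.zeros(num_segment_ids)
loss = mask_reconstruction_loss(h_gps, h_road, W, b, W, b, np.array([0, 2, 4]))
```

With zero weights both prediction layers output a uniform distribution over the five identifiers, so the loss equals 2·ln 5, a useful sanity check before training.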

[0044] S4.8. Construct a joint training loss function based on the contrast loss and the mask reconstruction loss, and update the parameters of the GPS trajectory encoder, the road segment trajectory encoder and the cross-modal interaction module by minimizing the joint training loss function to obtain the pre-trained GPS trajectory encoder and the pre-trained road segment trajectory encoder.

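[0045] Furthermore, the joint training loss function in step S4.8 is calculated as $\mathcal{L} = \lambda_{\mathrm{mask}} \mathcal{L}_{\mathrm{mask}} + \lambda_{\mathrm{con}} \mathcal{L}_{\mathrm{con}}$, where $\lambda_{\mathrm{mask}}$ and $\lambda_{\mathrm{con}}$ are adjustable hyperparameters used to balance the optimization weights between the mask reconstruction objective and the contrastive learning objective, $\mathcal{L}_{\mathrm{mask}}$ is the mask reconstruction loss function, and $\mathcal{L}_{\mathrm{con}}$ is the contrast loss function.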

[0046] S5. Construct a learnable task prompt vector based on the target downstream task. Furthermore, step S5 includes:

S5.1. Assign a unique task identifier to each downstream task and construct a task prompt vector lookup table, which stores the initial task vector corresponding to each downstream task.

S5.2. Obtain the task identifier corresponding to the target downstream task, and extract the corresponding initial task vector from the task prompt vector lookup table through a lookup operation.

S5.3. Input the initial task vector into a fully connected network for dimensional mapping and nonlinear transformation to obtain the task prompt vector.

[0047] In one embodiment of the present invention, the downstream task is trajectory travel time estimation. The corresponding task identifier is queried to obtain the initial task vector, which is passed through the fully connected layer, with its bias vector, and an activation function to obtain the task prompt vector. This vector is used to guide subsequent feature fusion.
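Steps S5.1 to S5.3 amount to a table lookup followed by one fully connected layer. The sketch below uses NumPy arrays as stand-ins for learnable parameters; the task names in `TASK_IDS`, the dimensions, and the choice of `tanh` as the activation are assumptions, since the text does not name the activation function.

```python
import numpy as np

# Hypothetical task identifiers for the downstream tasks named in the description.
TASK_IDS = {
    "travel_time_estimation": 0,
    "trajectory_classification": 1,
    "road_speed_inference": 2,
}

class TaskPromptGenerator:
    """Lookup table of initial task vectors plus one fully connected layer."""

    def __init__(self, num_tasks, init_dim, prompt_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.table = rng.normal(size=(num_tasks, init_dim))        # initial task vectors
        self.W = rng.normal(scale=0.1, size=(prompt_dim, init_dim))  # FC weight
        self.b = np.zeros(prompt_dim)                               # FC bias

    def __call__(self, task_name):
        initial = self.table[TASK_IDS[task_name]]   # S5.2: lookup by task identifier
        return np.tanh(self.W @ initial + self.b)   # S5.3: assumed tanh activation

generator = TaskPromptGenerator(num_tasks=len(TASK_IDS), init_dim=16, prompt_dim=32)
prompt = generator("travel_time_estimation")
```

In a trainable version, the table, weight, and bias would all receive gradients from the task loss, which is what makes the task prompt vector learnable.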

[0048] S6. Using the task prompt vector as task condition information, the GPS trajectory embedding sequence and the road segment trajectory embedding sequence are adaptively fused through the hierarchical fusion module to obtain a task-specific fusion representation. Furthermore, the hierarchical fusion module in step S6 includes:

S6.1. For the $k$-th road segment in the trajectory, concatenate the task prompt vector, the representation of the $k$-th road segment in the GPS trajectory embedding sequence, and the representation of the $k$-th road segment in the road segment trajectory embedding sequence, and generate a gating weight through nonlinear mapping: $g_k = \sigma(W_{\mathrm{gate}} [p \,\Vert\, h^{g}_k \,\Vert\, h^{r}_k] + b_{\mathrm{gate}})$, where $g_k$ is the gating weight, $h^{g}_k$ is the representation of the $k$-th road segment in the GPS trajectory embedding sequence, $h^{r}_k$ is the representation of the $k$-th road segment in the road segment trajectory embedding sequence, $\Vert$ denotes feature concatenation, $b_{\mathrm{gate}}$ is the bias vector of the gated mapping, $W_{\mathrm{gate}}$ is the weight matrix of the gated mapping, $\sigma$ is the Sigmoid function, and $p$ is the task prompt vector.

S6.2. Using the gating weights, the representation of the $k$-th road segment in the GPS trajectory embedding sequence and the representation of the $k$-th road segment in the road segment trajectory embedding sequence are weighted by element-wise multiplication and fused to obtain a comprehensive representation of that road segment. The comprehensive representations are then combined in driving order into a comprehensive feature sequence, whose length is the number of road segments in the trajectory.

S6.3. The task prompt vector is linearly projected to obtain the query vector, and the comprehensive feature sequence is linearly projected to obtain the key matrix and the value matrix respectively.

S6.4. Input the query vector, the key matrix, and the value matrix into the scaled dot-product attention mechanism to obtain the task-specific fusion representation: $q = p W_Q$, $K = F W_K$, $V = F W_V$, and $z = \mathrm{Softmax}\!\left(q K^{\top} / \sqrt{d_k}\right) V$, where $p$ is the task prompt vector, $W_Q$, $W_K$, and $W_V$ are the query, key, and value projection matrices, $d_k$ is the dimension of the key vectors, $K^{\top}$ is the transpose of the key matrix, $\mathrm{Softmax}$ is the normalization function, $q$ is the query vector, $K$ is the key matrix, $V$ is the value matrix, $z$ is the task-specific fusion representation, and $F$ is the comprehensive feature sequence.
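The two levels of the fusion module can be sketched together: a per-segment gate followed by task-queried attention over the fused sequence. The sketch assumes that the gate combines the two modalities as a convex mix, `G * H_g + (1 - G) * H_r`; the text only states that the gating weights are applied by element-wise multiplication. Parameter shapes and names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hierarchical_fusion(p, H_g, H_r, W_gate, b_gate, W_q, W_k, W_v):
    """p: (d_p,) task prompt; H_g, H_r: (n, d) per-segment embeddings."""
    n = H_g.shape[0]
    gate_in = np.concatenate([np.tile(p, (n, 1)), H_g, H_r], axis=1)  # [p || h_g || h_r]
    G = sigmoid(gate_in @ W_gate.T + b_gate)   # (n, d) gating weights
    F = G * H_g + (1.0 - G) * H_r              # assumed convex gated mix
    q = p @ W_q                                # (d_k,) query from the task prompt
    K, V = F @ W_k, F @ W_v                    # (n, d_k) keys and values
    attn = softmax(q @ K.T / np.sqrt(K.shape[1]))  # scaled dot-product attention
    return attn @ V, attn, G

rng = np.random.default_rng(0)
n, d, d_p, d_k = 3, 6, 4, 5
H_g, H_r = rng.normal(size=(n, d)), rng.normal(size=(n, d))
p = rng.normal(size=d_p)
W_gate = rng.normal(scale=0.1, size=(d, d_p + 2 * d))
b_gate = np.zeros(d)
W_q = rng.normal(size=(d_p, d_k))
W_k, W_v = rng.normal(size=(d, d_k)), rng.normal(size=(d, d_k))
z, attn, G = hierarchical_fusion(p, H_g, H_r, W_gate, b_gate, W_q, W_k, W_v)
```

Because the task prompt enters both the gate and the query, changing the downstream task changes which modality dominates each road segment and which road segments dominate the pooled representation.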

[0049] S7. Input the task-specific fusion representation into the prediction module corresponding to the target downstream task to obtain the downstream task prediction result. Calculate the task loss based on the prediction result and the true label. Iteratively update the task prompt vector, the hierarchical fusion module, and the learnable parameters in the prediction module based on the task loss.

[0050] Furthermore, the prediction module in step S7 includes at least one fully connected layer for mapping the task-specific fusion representation to a space that matches the output dimension of the target downstream task. The task loss is calculated from the prediction results and the true labels, using a loss function adapted to the target downstream task.

[0051] For example, cross-entropy loss is used in trajectory classification tasks, while mean squared error loss or mean absolute error loss is used in road speed inference tasks and trajectory travel time estimation tasks.
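The pairing of tasks and loss functions above can be written as a small lookup. This is only a sketch: the dictionary keys are hypothetical task names, and which of mean squared error or mean absolute error to use for each regression task is an arbitrary choice here, since the text allows either.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy for (batch, classes) logits and integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()

def mean_squared_error(pred, target):
    return np.mean((pred - target) ** 2)

def mean_absolute_error(pred, target):
    return np.mean(np.abs(pred - target))

# Hypothetical task names; the regression tasks could use either MSE or MAE.
TASK_LOSSES = {
    "trajectory_classification": cross_entropy,
    "road_speed_inference": mean_squared_error,
    "travel_time_estimation": mean_absolute_error,
}

pred, target = np.array([1.0, 2.0]), np.array([1.0, 4.0])
mse_value = TASK_LOSSES["road_speed_inference"](pred, target)
mae_value = TASK_LOSSES["travel_time_estimation"](pred, target)
ce_value = TASK_LOSSES["trajectory_classification"](np.zeros((2, 3)), np.array([0, 2]))
```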

[0052] Example 2

[0053] Figure 2 illustrates the overall architecture of a task-adaptive trajectory representation learning method provided by an embodiment of the present invention. As shown in Figure 2, the original input includes road network data and GPS trajectories. First, map matching is performed on the GPS trajectories to obtain the corresponding road segment trajectories; then, an alignment matrix between the road segment trajectories and the GPS trajectories is established from the map matching results.

[0054] In the joint self-supervised pre-training phase, the road segment trajectory and the GPS trajectory are synchronously masked before being input into the road segment trajectory encoder and the GPS trajectory encoder, respectively, for encoding. On the one hand, the global representations of the two modalities are constrained to align by the contrast loss; on the other hand, information is exchanged between the two modalities through the cross-modal interaction module, and the reconstruction error at the masked positions is optimized with the mask reconstruction loss. After joint self-supervised pre-training, the pre-trained road segment trajectory encoder and the pre-trained GPS trajectory encoder are obtained.

[0055] In the downstream task stage, the pre-trained road segment trajectory encoder and the pre-trained GPS trajectory encoder are used to extract the road segment trajectory embedding sequence and the GPS trajectory embedding sequence, respectively. Combined with the task prompt vector, the two modal embedding sequences are adaptively fused through a hierarchical fusion module to generate a task-specific fusion representation. Then, the task-specific fusion representation is input into the prediction module to obtain the prediction result corresponding to the target downstream task. Based on the prediction result and the real label, the task loss is calculated to update the relevant parameters of the task prompt vector, the parameters of the hierarchical fusion module, and the parameters of the prediction module.

[0056] Those skilled in the art will understand that although some embodiments herein include certain features included in other embodiments but not others, combinations of features from different embodiments are meant to be within the scope of the invention and form different embodiments.

[0057] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A task-adaptive trajectory representation learning method, characterized in that it includes the following steps: obtaining a GPS trajectory and corresponding road network data; performing map matching between the GPS trajectory and the road network data to obtain the corresponding road segment trajectory; constructing an alignment matrix based on the GPS trajectory and the road segment trajectory to map the GPS trajectory points to the road segments; based on the alignment matrix, inputting the GPS trajectory into the pre-trained GPS trajectory encoder for feature extraction to obtain the GPS trajectory embedding sequence, and inputting the road segment trajectory into the pre-trained road segment trajectory encoder for feature extraction to obtain the road segment trajectory embedding sequence; constructing a learnable task prompt vector based on the target downstream task; using the task prompt vector as task condition information, adaptively fusing the GPS trajectory embedding sequence and the road segment trajectory embedding sequence through the hierarchical fusion module to obtain a task-specific fusion representation; inputting the task-specific fusion representation into the prediction module corresponding to the target downstream task to obtain the downstream task prediction result, calculating the task loss based on the prediction result and the true label, and iteratively updating the task prompt vector, the hierarchical fusion module, and the learnable parameters in the prediction module based on the task loss; wherein the road network data includes a road segment set, road segment attribute information, and a topological network structure, and the GPS trajectory includes GPS trajectory points arranged in chronological order.

2. The task-adaptive trajectory representation learning method as described in claim 1, characterized in that the map matching includes: using a map matching algorithm based on a hidden Markov model to match the GPS trajectory with the road network data to obtain the corresponding road segment trajectory.

3. The task-adaptive trajectory representation learning method as described in claim 1, characterized in that constructing an alignment matrix based on the GPS trajectory and the road segment trajectory to map GPS trajectory points to road segments includes: constructing an initial matrix, where the rows of the matrix correspond to the road segments in the road segment trajectory and the columns of the matrix correspond to the trajectory points in the GPS trajectory; initializing each element of the initial matrix to 0; traversing each road segment in the road segment trajectory and extracting from the GPS trajectory the GPS trajectory point subsequence corresponding to each road segment; in the initial matrix, setting to 1 each element at the intersection of a road segment's row and the column of a trajectory point in its corresponding GPS trajectory point subsequence; and using the initial matrix after assignment as the alignment matrix.

4. The task-adaptive trajectory representation learning method as described in claim 1, characterized in that the pre-trained GPS trajectory encoder and the pre-trained road segment trajectory encoder are obtained through joint self-supervised pre-training, which includes the following steps: based on the alignment matrix, synchronously masking the trajectory points in the GPS trajectory and the road segments in the road segment trajectory to obtain the masked GPS trajectory and the masked road segment trajectory; inputting the masked GPS trajectory into the GPS trajectory encoder for feature extraction to obtain the GPS trajectory embedding sequence; inputting the masked road segment trajectory into the road segment trajectory encoder for feature extraction to obtain the road segment trajectory embedding sequence; performing average pooling on the GPS trajectory embedding sequence and the road segment trajectory embedding sequence respectively, calculated as $z^{g} = \mathrm{MeanPool}(H^{g})$ and $z^{r} = \mathrm{MeanPool}(H^{r})$, where $z^{g}$ is the GPS trajectory representation vector, $H^{g}$ is the GPS trajectory embedding sequence, $z^{r}$ is the road segment trajectory representation vector, $H^{r}$ is the road segment trajectory embedding sequence, and $\mathrm{MeanPool}$ is the average pooling operation; calculating the contrast loss based on the GPS trajectory representation vector and the road segment trajectory representation vector; performing cross-modal information interaction on the GPS trajectory embedding sequence and the road segment trajectory embedding sequence through the cross-modal interaction module to obtain the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence; calculating the mask reconstruction loss based on the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence; constructing a joint training loss function based on the contrast loss and the mask reconstruction loss; and updating the parameters of the GPS trajectory encoder, the road segment trajectory encoder, and the cross-modal interaction module by minimizing the joint training loss function to obtain the pre-trained GPS trajectory encoder and the pre-trained road segment trajectory encoder.

5. The task-adaptive trajectory representation learning method as described in claim 4, characterized in that inputting the masked GPS trajectory into the GPS trajectory encoder for feature extraction to obtain a GPS trajectory embedding sequence includes: extracting the features of each trajectory point in the masked GPS trajectory and constructing a GPS feature matrix; based on the alignment matrix, determining the GPS trajectory point interval corresponding to each road segment, and slicing the GPS feature matrix along the time dimension to obtain the sub-feature matrix of each road segment; inputting the sub-feature matrix into the first bidirectional gated recurrent network to model the sequence features of the GPS trajectory points inside the road segment, and concatenating the final-time-step hidden states of the forward and backward directions of the first bidirectional gated recurrent network to obtain the initial GPS embedding of the road segment; arranging the initial GPS embeddings into an initial GPS embedding sequence in driving order, and inputting the sequence into a second bidirectional gated recurrent network to model the temporal context dependency between road segments, thereby obtaining the GPS trajectory embedding sequence; wherein the trajectory point features include: longitude, latitude, velocity, acceleration, angle increment, time increment, and distance.

6. The task-adaptive trajectory representation learning method as described in claim 4, characterized in that inputting the masked road segment trajectory into the road segment trajectory encoder for feature extraction to obtain the road segment trajectory embedding sequence includes: initializing each road segment in the road segment set to obtain an initial vector representation of each road segment; based on the topological network structure, constructing a road network adjacency matrix, and using a graph attention network to aggregate the initial vector representations of each road segment and its neighbors to obtain the road segment spatial feature embedding, wherein the graph attention network layer takes as input the initial vector representations of the road segments in the road segment set together with the road network adjacency matrix and outputs the road segment spatial feature embedding; obtaining the time information of each road segment in the road segment trajectory, and mapping the time information into a high-dimensional time vector to obtain the road segment time feature embedding; adding the road segment spatial feature embedding to the road segment time feature embedding to obtain the spatiotemporal representation of each road segment; and arranging the spatiotemporal representations of the road segments into a sequence in driving order and inputting the sequence into a Transformer encoder for modeling to obtain the road segment trajectory embedding sequence.

7. The task-adaptive trajectory representation learning method as described in claim 4, characterized in that the cross-modal interaction module includes: concatenating the GPS trajectory embedding sequence with the road segment trajectory embedding sequence to construct a cross-modal long sequence; for each element in the cross-modal long sequence, adjusting the feature dimension by linear projection and superimposing the position embedding and the modality embedding, calculated as $\tilde{u}_i = W u_i + E_{\mathrm{pos}}(i) + E_{\mathrm{mod}}(c_i)$, where $\tilde{u}_i$ is the enhanced representation, $u_i$ is the $i$-th element of the cross-modal long sequence, $W$ is the linear projection matrix, $E_{\mathrm{pos}}$ is the position embedding, $i$ is the index of the element in the sequence, $E_{\mathrm{mod}}$ is the modality embedding, and $c_i$ is a modality identifier variable indicating the modality to which the element belongs; composing the enhanced representations into a sequence and achieving cross-modal information interaction through a multi-head self-attention layer with a residual structure, wherein the output sequence is obtained by applying the multi-head self-attention operation to the sequence of enhanced representations, together with layer normalization; and, based on the concatenation order at input, splitting the output sequence back into the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence.

8. The task-adaptive trajectory representation learning method as described in claim 4, characterized in that the calculation of the contrast loss involves: a cosine similarity function; the GPS trajectory representation vector and the road segment trajectory representation vector of the trajectory at one index in the batch; the road segment trajectory representation vector and the GPS trajectory representation vector of the trajectory at a second index in the batch; a temperature parameter; the batch sample size; the contrast loss in the direction from the GPS modality to the road segment modality; and the contrast loss in the direction from the road segment modality to the GPS modality.

The calculation of the mask reconstruction loss includes: for each mask position, extracting the feature vectors of the post-interaction GPS trajectory embedding sequence and the post-interaction road segment trajectory embedding sequence at that mask position; inputting each feature vector into a prediction layer and obtaining, through the Softmax normalization function, the predicted probability distribution over road segment identifiers at each mask position, where the GPS trajectory side and the road segment trajectory side each have their own weight matrix and bias vector, and the distribution on each side is computed from the post-interaction feature vector of that side at the mask position; and calculating the mask reconstruction loss from the probability values that the predicted probability distributions assign to the true road segment identifiers, where the calculation involves the total number of mask positions, the set of mask positions, the mask position index, the true road segment identifier at that index, the probability assigned to the true road segment identifier by the GPS-trajectory-side predicted distribution, and the probability assigned to it by the road-segment-trajectory-side predicted distribution.

The joint training loss function is calculated from the contrast loss function and the mask reconstruction loss function, using two adjustable hyperparameters that balance the optimization weights between the contrastive learning objective and the mask reconstruction objective.
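
The formulas of claim 8 are not reproduced in this text. As a hedged illustration only, the sketch below assumes the contrast loss is a symmetric InfoNCE-style loss (matching batch indices are the positives, the two directions are averaged), the mask reconstruction loss is the mean negative log-probability of the true road segment identifier summed over both sides, and the joint loss is a weighted sum. The averaging factor, the names, and the toy inputs are assumptions, not taken from the patent.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def directional_loss(anchors, targets, tau):
    """InfoNCE from anchors to targets; the target at the same index is the positive."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [cosine(a, t) / tau for t in targets]
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)

def contrast_loss(gps_vecs, road_vecs, tau=0.1):
    """Average of the GPS-to-road and road-to-GPS directions (assumed weighting)."""
    g2r = directional_loss(gps_vecs, road_vecs, tau)
    r2g = directional_loss(road_vecs, gps_vecs, tau)
    return 0.5 * (g2r + r2g)

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(h, W, b):
    """Prediction layer: Softmax(W h + b) over road segment identifiers."""
    return softmax([sum(w * x for w, x in zip(row, h)) + bi for row, bi in zip(W, b)])

def mask_reconstruction_loss(gps_feats, road_feats, true_ids, W_g, b_g, W_r, b_r):
    """Mean over mask positions of the two sides' negative log-likelihoods."""
    total = 0.0
    for h_g, h_r, y in zip(gps_feats, road_feats, true_ids):
        total -= math.log(predict(h_g, W_g, b_g)[y]) + math.log(predict(h_r, W_r, b_r)[y])
    return total / len(true_ids)

def joint_loss(l_con, l_mask, lam_con=1.0, lam_mask=1.0):
    """Weighted sum of the two objectives; the weights are the tunable hyperparameters."""
    return lam_con * l_con + lam_mask * l_mask

# Toy check: aligned GPS/road pairs should score a lower contrast loss than swapped ones.
gps_vecs = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrast_loss(gps_vecs, [[1.0, 0.0], [0.0, 1.0]])
swapped = contrast_loss(gps_vecs, [[0.0, 1.0], [1.0, 0.0]])

# Untrained (all-zero) prediction layers give a uniform distribution over 4 identifiers.
zeros_W, zeros_b = [[0.0, 0.0] for _ in range(4)], [0.0] * 4
l_mask = mask_reconstruction_loss([[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [1.0, 1.0]],
                                  [2, 0], zeros_W, zeros_b, zeros_W, zeros_b)
total = joint_loss(aligned, l_mask, lam_con=1.0, lam_mask=0.5)
print(aligned < swapped, round(l_mask, 4))
```

With all-zero prediction weights each side predicts a uniform distribution, so the mask loss equals twice the log of the number of identifiers, which is a convenient sanity check before training.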

9. The task-adaptive trajectory representation learning method as described in claim 6, characterized in that the step of obtaining the time information of each road segment in the road segment trajectory and mapping the time information into a high-dimensional time vector to obtain the road segment time feature embedding includes: determining, based on the alignment matrix, the starting GPS trajectory point corresponding to each road segment; using the timestamp of the starting GPS trajectory point as a reference timestamp, and calculating a minute index and a weekday index from the reference timestamp; inputting the minute index and the weekday index into the learnable embedding layer to obtain a minute time embedding vector and a weekday time embedding vector, where two learnable embedding mapping functions are applied, one to the minute index calculated from the timestamp and one to the weekday index calculated from the timestamp; and adding the minute time embedding vector to the weekday time embedding vector to obtain the road segment time feature embedding.
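
The steps of claim 9 can be sketched as follows. This is an illustration under stated assumptions: the alignment matrix is taken to be a binary matrix with one row per road segment and one column per GPS point, the minute index is taken to be the minute of the day (0 to 1439), the weekday index runs from Monday = 0 to Sunday = 6, and timestamps are Unix seconds interpreted in UTC. None of these conventions is stated in the claim, and the table sizes and names are hypothetical.

```python
import random
from datetime import datetime, timezone

def start_point_indices(alignment):
    """For each road segment (row), the index of its first aligned GPS point (column)."""
    return [row.index(1) for row in alignment]

def time_indices(timestamp):
    """Minute-of-day index and weekday index of a Unix timestamp, in UTC."""
    dt = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return dt.hour * 60 + dt.minute, dt.weekday()

def make_table(n_rows, dim, rng):
    """Stand-in for a learnable embedding table: one vector per index."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_rows)]

def segment_time_embeddings(alignment, timestamps, minute_table, weekday_table):
    """Sum of minute and weekday embeddings at each segment's starting GPS point."""
    out = []
    for p in start_point_indices(alignment):
        minute_idx, weekday_idx = time_indices(timestamps[p])
        out.append([a + b for a, b in zip(minute_table[minute_idx],
                                          weekday_table[weekday_idx])])
    return out

rng = random.Random(0)
minute_table = make_table(1440, 8, rng)  # one row per minute of the day
weekday_table = make_table(7, 8, rng)    # one row per weekday

alignment = [[1, 1, 0, 0],               # segment 0 covers GPS points 0 and 1
             [0, 0, 1, 1]]               # segment 1 covers GPS points 2 and 3
timestamps = [0, 30, 5400, 5430]         # seconds since the Unix epoch
embeddings = segment_time_embeddings(alignment, timestamps, minute_table, weekday_table)
print(start_point_indices(alignment), time_indices(5400))
```

In a trained model the two tables would be embedding layers updated by backpropagation; the random tables here only stand in for them.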

10. The task-adaptive trajectory representation learning method as described in claim 1, characterized in that constructing a learnable task prompt vector based on the target downstream task includes: assigning a unique task identifier to each downstream task, and constructing a task prompt vector lookup table that stores the initial task vector corresponding to each downstream task; obtaining the task identifier corresponding to the target downstream task, and extracting the corresponding initial task vector from the task prompt vector lookup table through a lookup operation; and inputting the initial task vector into a fully connected network for dimensional mapping and nonlinear transformation to obtain the task prompt vector.
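
A minimal sketch of the lookup-then-transform step in claim 10, assuming a single fully connected layer followed by a tanh nonlinearity. The task names, the dimensions, the choice of tanh, and the random initial values are all hypothetical; the claim only requires a lookup table and a fully connected network.

```python
import math
import random

rng = random.Random(0)

# Hypothetical downstream tasks, each with a unique task identifier.
TASK_IDS = {"travel_time_estimation": 0, "similar_trajectory_search": 1}
D_INIT, D_PROMPT = 4, 6

# Lookup table: one learnable initial task vector per task identifier.
prompt_table = [[rng.uniform(-0.5, 0.5) for _ in range(D_INIT)] for _ in TASK_IDS]

# Fully connected layer that maps D_INIT to D_PROMPT dimensions.
W = [[rng.uniform(-0.5, 0.5) for _ in range(D_INIT)] for _ in range(D_PROMPT)]
b = [0.0] * D_PROMPT

def task_prompt(task_name):
    """Look up the initial task vector, then apply the layer and a nonlinearity."""
    init = prompt_table[TASK_IDS[task_name]]
    return [math.tanh(sum(w * x for w, x in zip(row, init)) + bi)
            for row, bi in zip(W, b)]

prompt_tte = task_prompt("travel_time_estimation")
prompt_sim = task_prompt("similar_trajectory_search")
print(len(prompt_tte))
```

Because both the table rows and the layer weights are trainable, each task ends up with its own prompt vector while sharing the same transformation network.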

11. The task-adaptive trajectory representation learning method as described in claim 1, characterized in that the hierarchical fusion module performs the following:

For each road segment in the trajectory, the task prompt vector, the representation of that road segment in the GPS trajectory embedding sequence, and the representation of that road segment in the road segment trajectory embedding sequence are concatenated, and a gating weight is generated through nonlinear mapping, where the calculation involves the feature concatenation operation, the weight matrix and the bias vector of the gated mapping, and the Sigmoid function.

The gating weight is used to weight and fuse the representation of the road segment in the GPS trajectory embedding sequence with its representation in the road segment trajectory embedding sequence, giving a comprehensive representation of that road segment; the comprehensive representations are then combined, in driving order, into a comprehensive feature sequence, where the calculation involves element-wise multiplication, the gating weight, and the number of road segments corresponding to the trajectory.

A query vector is obtained by linear projection of the task prompt vector, and a key matrix and a value matrix are each obtained by linear projection of the comprehensive feature sequence; the query vector, the key matrix, and the value matrix are input into a scaled dot-product attention mechanism to obtain the task-specific fusion representation, where the calculation involves the query projection matrix, the key projection matrix, the value projection matrix, the dimension of the key vector, the transpose of the key matrix, and a normalization function.
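
The formulas of claim 11 are not reproduced in this text. The sketch below is one plausible reading, not the claimed formulas: it assumes the gate is Sigmoid(W [prompt; gps; road] + b), the fusion is the convex combination gate * gps + (1 - gate) * road, and the final step is standard scaled dot-product attention with the task prompt as the only query. The complementary (1 - gate) weighting and all names and toy values are assumptions.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gated_fusion(prompt, gps_seq, road_seq, W_g, b_g):
    """Per road segment: gate from [prompt; gps; road], then blend the two views."""
    fused = []
    for h_gps, h_road in zip(gps_seq, road_seq):
        concat = prompt + h_gps + h_road  # feature concatenation
        gate = [sigmoid(s + b) for s, b in zip(matvec(W_g, concat), b_g)]
        fused.append([g * a + (1.0 - g) * r for g, a, r in zip(gate, h_gps, h_road)])
    return fused

def task_attention(prompt, fused, W_q, W_k, W_v):
    """Scaled dot-product attention with the task prompt as the single query."""
    q = matvec(W_q, prompt)
    K = [matvec(W_k, z) for z in fused]
    V = [matvec(W_v, z) for z in fused]
    scale = math.sqrt(len(q))  # square root of the key dimension
    weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in K])
    return [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]

# Toy inputs chosen so the result can be checked by hand.
prompt = [1.0, 0.0]
gps_seq = [[2.0, 0.0], [0.0, 4.0]]
road_seq = [[0.0, 2.0], [4.0, 0.0]]
W_g = [[0.0] * 6 for _ in range(2)]  # zero gate weights: every gate value is 0.5
b_g = [0.0, 0.0]
fused = gated_fusion(prompt, gps_seq, road_seq, W_g, b_g)

identity = [[1.0, 0.0], [0.0, 1.0]]
W_q = [[0.0, 0.0], [0.0, 0.0]]       # zero query: uniform attention weights
task_repr = task_attention(prompt, fused, W_q, identity, identity)
print(fused, task_repr)
```

With zero gate weights each fused vector is the plain average of the two views, and with a zero query the attention output is the mean of the fused sequence; learned parameters would instead let the task prompt decide which modality and which road segments dominate.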

12. The task-adaptive trajectory representation learning method as described in claim 1, characterized in that the prediction module includes at least one fully connected layer for mapping the task-specific fusion representation to a space that matches the output dimension of the target downstream task, and the task loss is calculated from the prediction results and the true labels using a loss function adapted to the target downstream task.
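
As a hedged illustration of claim 12, the sketch below uses one fully connected layer and picks the loss by task type: cross-entropy for a classification-style task and mean squared error for a regression-style task. The two task kinds, the loss choices, and all names are assumptions; the claim only requires a loss function adapted to the target downstream task.

```python
import math

def linear(W, b, x):
    """Fully connected layer: W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability of the true class index."""
    return -math.log(softmax(logits)[label])

def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def task_loss(kind, fused, W, b, label):
    """Map the fused representation to the task output space, then score it."""
    out = linear(W, b, fused)
    if kind == "classification":
        return cross_entropy(out, label)
    if kind == "regression":
        return mse(out, label)
    raise ValueError(f"unknown task kind: {kind}")

fused = [1.0, 2.0]  # stand-in for a task-specific fusion representation

# Regression head with one output: 0.5 * 1.0 + 0.25 * 2.0 = 1.0, target 3.0.
reg_loss = task_loss("regression", fused, [[0.5, 0.25]], [0.0], [3.0])

# Untrained three-class head: all-zero weights give a uniform prediction.
cls_loss = task_loss("classification", fused, [[0.0, 0.0]] * 3, [0.0] * 3, 1)
print(reg_loss, round(cls_loss, 4))
```

Only the output dimension of the layer and the loss function change between tasks, which is what allows the same fused representation to serve different downstream tasks.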

Citation Information

Patent Citations

Trajectory representation pre-training system based on deep learning
CN119558374A
