Travel chain data completion method based on improved RBF neural network model

By using an improved RBF neural network model, combined with multi-source data and navigation API path planning, the system automatically identifies and completes walking or cycling data in the travel chain, solving the problem of incomplete travel chain data in existing technologies and achieving high accuracy and high completeness of travel chain data.

CN122245090A · Pending · Publication Date: 2026-06-19 · BEIJING UNIV OF CHEM TECH +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
BEIJING UNIV OF CHEM TECH
Filing Date
2026-01-26
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

When performing travel chain analysis based on mobile signaling data, existing technologies often fail to provide complete walking or cycling data before and after a user takes public transportation, which affects the overall accuracy and completeness of the travel chain data. Furthermore, traditional methods have limited capabilities in integrating and processing multi-source data, making it difficult to fully extract valuable information from the data.

Method used

An improved RBF neural network model is adopted to acquire and preprocess mobile phone signaling data and public transportation data. Combined with activity trajectory clustering and navigation API route planning, a travel chain structure is constructed, which automatically identifies and completes walking or cycling data. Multi-source data is used for comprehensive analysis to improve the completeness and accuracy of travel chain data.

Benefits of technology

It significantly improves the completeness and accuracy of travel chain data, enhances the model's recognition and generalization capabilities, ensures applicability and stability in different scenarios, and achieves high-precision travel mode identification and data completion.

Abstract

This invention relates to a method for completing travel chain data based on an improved Radial Basis Function (RBF) neural network model, belonging to the field of intelligent transportation technology. The method includes: acquiring a user's current travel chain data, wherein the current travel chain data includes at least one travel segment and at least one missing travel segment, the travel segment including public transportation data, and the missing travel segment including walking or cycling data; inputting the user's current travel chain data into a pre-trained improved RBF neural network model to obtain complete travel chain data, wherein the pre-trained improved RBF neural network model is trained based on historical travel chain data, including historical mobile phone signaling data and historical public transportation card swipe data. Through multi-source data fusion and an improved radial basis function (RBF) neural network model, walking or cycling data in the travel chain is automatically identified and completed.

Description

Technical Field

[0001] This invention relates to the field of intelligent transportation technology, and in particular to a method for completing travel chain data based on an improved RBF neural network model.

Background Technology

[0002] With the implementation of China's intelligent transportation system development strategy, many regions have actively promoted the construction of demonstration platforms for integrated urban travel analysis. In recent years, research has focused on techniques for precisely extracting the spatiotemporal characteristics of urban travel from mobile phone signaling data, and on deep-learning-based models for inferring urban travel chain features. As a new type of data source, mobile phone signaling data offers wide coverage and strong real-time performance. It provides detailed information on users' movement within the city, supports urban traffic planning and management, and promotes the development of the intelligent transportation field.

[0003] Existing technologies for trip chain analysis based on mobile signaling data have some shortcomings. For example, users' walking or cycling data before and after using public transportation are often incomplete, affecting the overall accuracy and completeness of trip chain data. Furthermore, traditional analysis methods have limited capabilities in fusing and processing multi-source data, making it difficult to fully extract valuable information from the data.

[0004] Therefore, there is an urgent need for a method that can effectively integrate multi-source data and automatically identify and supplement walking or cycling data in the travel chain, so as to improve the integrity and accuracy of travel chain data and provide more reliable data support for the development of intelligent transportation systems.

Summary of the Invention

[0005] The technical problem to be solved by the present invention is to provide a method for completing travel chain data based on an improved RBF neural network model, which aims to solve at least one of the above-mentioned technical problems.

[0006] The technical solution of the present invention to solve the above-mentioned technical problems is as follows. In a first aspect, this application provides a method for trip chain data completion based on an improved RBF neural network model, which adopts the following technical solution. A method for trip chain data completion based on an improved RBF neural network model includes: obtaining the user's current travel chain data, which includes at least one travel segment and at least one missing travel segment, where the travel segment includes public transportation data and the missing travel segment includes walking data or cycling data; and inputting the user's current travel chain data into a pre-trained improved RBF neural network model to obtain complete travel chain data, where the pre-trained improved RBF neural network model is trained based on historical travel chain data, which includes historical mobile phone signaling data and historical public transportation card swipe data.

[0007] The beneficial effects of this invention are: by acquiring current travel chain data containing travel segments with public transportation data and missing travel segments containing walking or cycling data, and inputting this data into an improved RBF neural network model trained based on historical mobile phone signaling data and historical public transportation card swiping data, the walking or cycling data in the travel chain can be automatically identified and completed, thereby improving the completeness and accuracy of the travel chain data.

[0008] Based on the above technical solution, the present invention can be further improved as follows.

[0009] Furthermore, obtaining the user's current travel chain data includes: Collect mobile signaling data and public transportation data of users in the current time period. The mobile signaling data represents the user's stay time and spatial coordinates at different locations, and the public transportation data represents the station coordinates and timestamps of the user's origin and destination. The user's mobile signaling data and public transportation data in the current time period are preprocessed to obtain preprocessed mobile signaling data and public transportation data. The preprocessing includes missing data processing, spatiotemporal conflict resolution processing, location merging processing and ping-pong effect processing. Based on preprocessed mobile signaling data and public transportation data, the user's current travel chain data is obtained through stop point identification and travel chain construction.

[0010] The beneficial effects of adopting the above-mentioned further solutions are as follows: collecting mobile phone signaling data and public transportation data in the current time period can obtain detailed information about users' travel; preprocessing these two types of data can ensure the integrity and reliability of the data and reduce redundancy and abnormal data; based on the preprocessed data, stop point identification and travel chain construction can be performed to obtain current travel chain data containing travel segments and missing travel segments, providing a foundation for subsequent completion of walking or cycling data.

[0011] Furthermore, the training method for the pre-trained improved RBF neural network model includes: Acquire spatiotemporal trajectory data of users within a preset historical time period. The spatiotemporal trajectory data includes historical mobile phone signaling data and historical public transportation card swipe data. The historical mobile phone signaling data includes the time and spatial coordinates of the user's stay at different locations. The user's spatiotemporal trajectory data over a preset historical time period is clustered to determine the user's target activity area; Based on the historical public transportation card swipe data and the target activity area, a travel chain structure is constructed, which includes multiple travel segments. Based on the travel chain structure and the path planning results provided by the navigation API, determine the travel mode label for each travel segment in the travel chain structure; A training set is constructed based on multiple travel chain structures and the travel mode labels of each travel segment in the travel chain structure. The pre-constructed improved RBF neural network model is then trained based on the training set to obtain the trained improved RBF neural network model.

[0012] The beneficial effects of adopting the above-mentioned further solutions are as follows: by acquiring spatiotemporal trajectory data of users' preset historical time periods, multi-source data can be integrated for model training; by clustering activity trajectories of spatiotemporal trajectory data to determine target activity areas, the user's activity range can be accurately located; by combining historical public transportation card swiping data and target activity areas to construct a travel chain structure, the training data can be made more consistent with actual travel conditions; by using navigation API route planning results to determine travel mode labels for travel segments, accurate label information can be provided for model training; and by constructing a training set based on these to train and improve the RBF neural network model, the model's recognition and generalization abilities can be enhanced, thereby improving the accuracy and completeness of travel chain data completion.

[0013] Furthermore, the step of clustering the user's spatiotemporal trajectory data over a preset historical time period to determine the user's target activity area includes: performing density clustering on the spatiotemporal trajectory data within the preset historical time period based on the DBSCAN clustering algorithm to generate multiple candidate activity area clusters; calculating the daily activity intensity distribution based on the daily mobile phone signaling data within the preset historical time period; calculating an information entropy value based on the daily activity intensity distribution, and distinguishing weekdays from rest days according to the information entropy value; calculating, for each candidate activity area cluster, a first dwell index during the preset rest period on weekdays and a second dwell index during the preset working period on weekdays; calculating the residence probability of each candidate activity area cluster based on the first dwell index, and the working probability of each candidate activity area cluster based on the second dwell index; determining the user's optimal residence and optimal workplace based on the residence probability and working probability of each candidate activity area cluster; and determining the user's target activity area based on the user's optimal residence and optimal workplace.

[0014] The beneficial effects of adopting the above-mentioned further scheme are as follows: Clustering activity trajectories on spatiotemporal trajectory data within a preset historical time period enables accurate determination of the user's target activity area. Specifically, the DBSCAN clustering algorithm can perform density clustering on spatiotemporal trajectory data, generating multiple candidate activity area clusters; calculating the daily activity intensity distribution and information entropy value based on mobile phone signaling data can distinguish between weekdays and rest days; calculating the dwell time indicators of candidate activity area clusters at different times and deriving the probability of residence and work can determine the user's optimal residence and optimal workplace, thereby determining the target activity area. This improves the accuracy of user activity area identification and provides a precise foundation for the subsequent construction of the travel chain structure.

[0015] Furthermore, the construction of the travel chain structure based on the historical public transportation card swiping data and the target activity area includes: Based on the historical public transportation card swipe data and the target activity area, the user's continuous travel records are divided into multiple travel segments, each of which includes a segment start point, a public transportation boarding start point, a public transportation boarding end point, and a segment end point; Based on timestamp information, multiple travel segments are spatiotemporally correlated and sorted to construct a travel chain structure.

[0016] The beneficial effects of adopting the above-mentioned further scheme are: it can divide the user's continuous travel records into multiple travel segments, including the segment start point, public transportation start point, public transportation end point, and segment end point, according to historical public transportation card swipe data and target activity area. Based on the timestamp information, these segments are spatiotemporally correlated and sorted to construct a travel chain structure. This provides a foundation for subsequently determining the travel mode labels of travel segments and training and improving the RBF neural network model. It helps to more accurately identify and complete the walking or cycling data in the travel chain, and improve the completeness and accuracy of the travel chain data.

[0017] Furthermore, determining the travel mode label for each travel segment in the travel chain structure based on the travel chain structure and the route planning results provided by the navigation API includes: Based on the navigation API, the path planning results of different travel segments in the travel chain structure using multiple candidate travel modes under the same time constraints are obtained. The travel modes include walking, public transportation, driving, and cycling. For each travel segment, calculate the path matching degree between the mobile signaling data corresponding to the travel segment and the navigation path of each candidate travel mode; For each travel segment, calculate the time matching degree between the mobile signaling data corresponding to the travel segment and the navigation path of each candidate travel mode; For each travel segment, based on the path matching degree and the time matching degree, the comprehensive matching degree between the mobile signaling data corresponding to the travel segment and the navigation path of each candidate travel mode is determined, and the travel mode corresponding to the maximum comprehensive matching degree is selected as the travel mode label of the travel segment.

[0018] The beneficial effects of adopting the above-mentioned further solution are as follows: obtaining route planning results of multiple candidate travel modes based on the navigation API can provide a foundation for subsequent matching degree calculation; calculating the path matching degree and time matching degree, and then combining them to obtain the comprehensive matching degree, can comprehensively consider both path and time dimensions, select the travel mode corresponding to the maximum comprehensive matching degree as the label, accurately determine the travel mode of the travel segment, and thus provide accurate training data for training and improving the RBF neural network model, improve the model's ability to identify travel modes, help automatically identify and complete walking or cycling data in the travel chain, and improve the completeness and accuracy of travel chain data.
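The label selection described in [0017] can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented computation: the source does not say how the path and time matching degrees are calculated or combined, so the relative-duration `time_match`, the weight `alpha`, and all function names are hypothetical, and the path matching degree is taken as an already computed input in [0, 1].

```python
from typing import Dict, Tuple


def time_match(observed_min: float, planned_min: float) -> float:
    """Assumed time matching degree: 1 minus the relative duration gap."""
    longest = max(observed_min, planned_min)
    if longest == 0:
        return 1.0
    return 1.0 - abs(observed_min - planned_min) / longest


def label_segment(
    observed_min: float,
    candidates: Dict[str, Tuple[float, float]],
    alpha: float = 0.5,
) -> str:
    """Pick the travel mode with the highest comprehensive matching degree.

    candidates maps a mode name to (path matching degree, planned duration
    in minutes) taken from the navigation route for that mode.
    alpha is an assumed weight between the path and time dimensions.
    """
    scores = {
        mode: alpha * path_match + (1.0 - alpha) * time_match(observed_min, planned)
        for mode, (path_match, planned) in candidates.items()
    }
    return max(scores, key=scores.get)


# Hypothetical segment observed to take 12 minutes in the signaling data.
routes = {
    "walking": (0.90, 13.0),
    "cycling": (0.85, 5.0),
    "driving": (0.40, 3.0),
    "public_transport": (0.30, 10.0),
}
print(label_segment(12.0, routes))
```

Treating the two degrees as a weighted sum keeps both dimensions comparable on a 0 to 1 scale; any other monotone combination would fit the description equally well.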

[0019] Furthermore, the step of training a pre-constructed improved RBF neural network model based on a training set to obtain a trained improved RBF neural network model includes: The improved RBF neural network model is initialized based on the training set and the candidate node count set, and the optimal number of hidden layer nodes of the improved RBF neural network model is determined. Based on the training set, the optimal number of hidden layer nodes, the K-Means clustering algorithm, and the Adam optimization algorithm, the pre-constructed improved RBF neural network model is trained to obtain the trained improved RBF neural network model.

[0020] The beneficial effects of adopting the above-mentioned further scheme are as follows: initializing the improved RBF neural network model based on the training set and the candidate node set can determine the optimal number of hidden layer nodes, providing a more suitable structure for the model; using the K-Means clustering algorithm to select the center point can enable the model to better adapt to the data distribution; using the Adam optimization algorithm to train the model can adaptively adjust the learning rate, accelerate the convergence speed and improve accuracy, ultimately improving the recognition performance and generalization ability of the improved RBF neural network model, so that the trained model can more accurately complete the walking or cycling data in the travel chain data.
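The training recipe named in [0019] (K-Means for the hidden-layer centres, Adam for the weights) can be illustrated with a small pure-Python sketch. Everything specific here is an assumption made for illustration: the single speed feature, the Gaussian width `sigma`, the mean-squared-error loss on a 0/1 label, the hyperparameters, and the function names. The source's "improved" model and its procedure for choosing the number of hidden nodes are not reproduced; in this sketch the node count is simply passed in as `k`.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iters: int = 20) -> List[List[float]]:
    """Plain K-Means; the first k points seed the centres (deterministic)."""
    centres = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centres[c]))
            groups[nearest].append(p)
        for c, members in enumerate(groups):
            if members:  # keep the old centre if a cluster empties
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return centres


def rbf_features(x: Vector, centres: List[List[float]], sigma: float) -> List[float]:
    """Gaussian hidden-layer activations plus a constant bias term."""
    return [math.exp(-sq_dist(x, c) / (2 * sigma ** 2)) for c in centres] + [1.0]


def train_output_weights(X, y, centres, sigma, epochs=500, lr=0.05):
    """Fit linear output weights by minimising mean squared error with Adam."""
    feats = [rbf_features(x, centres, sigma) for x in X]
    n_w = len(centres) + 1
    w, m, v = [0.0] * n_w, [0.0] * n_w, [0.0] * n_w
    b1, b2, eps = 0.9, 0.999, 1e-8
    for step in range(1, epochs + 1):
        grad = [0.0] * n_w
        for f, target in zip(feats, y):
            err = sum(wi * fi for wi, fi in zip(w, f)) - target
            for j in range(n_w):
                grad[j] += 2 * err * f[j] / len(X)
        for j in range(n_w):
            m[j] = b1 * m[j] + (1 - b1) * grad[j]        # first moment
            v[j] = b2 * v[j] + (1 - b2) * grad[j] ** 2   # second moment
            m_hat = m[j] / (1 - b1 ** step)              # bias correction
            v_hat = v[j] / (1 - b2 ** step)
            w[j] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


def predict(x: Vector, centres, sigma: float, w: List[float]) -> float:
    return sum(wi * fi for wi, fi in zip(w, rbf_features(x, centres, sigma)))


# Hypothetical data: mean speed in km/h; label 0 = walking, 1 = cycling.
speeds = [[4.0], [5.0], [4.5], [14.0], [15.0], [16.0]]
labels = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
centres = kmeans(speeds, k=2)
weights = train_output_weights(speeds, labels, centres, sigma=3.0)
print(predict([4.5], centres, 3.0, weights), predict([15.0], centres, 3.0, weights))
```

A prediction above 0.5 would be read as cycling in this toy setup. A candidate node count set, as mentioned in the text, could be handled by repeating the fit for several values of `k` and keeping the one with the lowest validation error, although the source does not specify the selection criterion.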

[0021] In a second aspect, this application provides a trip chain data completion device based on an improved RBF neural network model, which adopts the following technical solution. A trip chain data completion device based on an improved RBF neural network model comprises: an acquisition module, used to acquire the user's current travel chain data, where the current travel chain data includes at least one travel segment and at least one missing travel segment, the travel segment includes public transportation data, and the missing travel segment includes walking data or cycling data; and a travel chain data completion module, used to input the user's current travel chain data into a pre-trained improved RBF neural network model to obtain complete travel chain data, where the pre-trained improved RBF neural network model is trained based on historical travel chain data, which includes historical mobile phone signaling data and historical public transportation card swipe data.

[0022] In a third aspect, this application provides an electronic device that adopts the following technical solution: An electronic device includes a memory and a processor, wherein the memory stores a computer program that can be loaded by the processor to execute the trip chain data completion method based on the improved RBF neural network model according to any implementation of the first aspect.

[0023] In a fourth aspect, this application provides a computer-readable storage medium, which adopts the following technical solution: A computer-readable storage medium storing a computer program that can be loaded by a processor to execute the trip chain data completion method based on an improved RBF neural network model according to any implementation of the first aspect.

[0024] Additional aspects and advantages of this application will be set forth in part in the description which follows, and will become apparent from the description or may be learned by practice of this application.

[0025] The present invention, by adopting the above technical solution, has the following beneficial effects: 1. By fusing multi-source data and using an improved RBF neural network model, walking or cycling data in the travel chain is automatically identified and completed, significantly improving the completeness of travel chain data and solving the problem of incomplete travel chain data in existing technologies.

[0026] 2. By utilizing multi-source data for comprehensive analysis and combining features such as user's geographical location, travel time, and movement speed, a high-precision travel mode identification is achieved through an improved RBF neural network model, ensuring the accuracy of the supplementary data.

[0027] 3. By introducing multiple features (such as time, distance, speed, and geographical location) and optimizing the structural parameters and training algorithm of the RBF neural network, the model's recognition and generalization abilities are improved, ensuring the model's applicability and stability in different scenarios.

Description of the Drawings

[0028] Figure 1 is a flowchart of a trip chain data completion method based on an improved RBF neural network model, provided as an embodiment of the present invention;
Figure 2 is a schematic diagram of the data preprocessing steps provided in one embodiment of the present invention;
Figure 3 is a flowchart of a training method for an improved RBF neural network model provided in an embodiment of the present invention;
Figure 4 is a schematic diagram of the principle of user travel chain construction according to an embodiment of the present invention;
Figure 5 is a schematic diagram of a trip chain data completion device based on an improved RBF neural network model provided in an embodiment of the present invention;
Figure 6 is a schematic diagram of the structure of an electronic device provided in one embodiment of the present invention.

Detailed Description

[0029] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0030] Furthermore, the term "and / or" in this document merely describes the relationship between related objects and indicates that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. Additionally, unless otherwise specified, the character " / " in this document generally indicates that the preceding and following related objects have an "or" relationship.

[0031] This application provides a method for completing travel chain data based on an improved RBF neural network model. This method can be executed by an electronic device, which can be a server or a mobile terminal device. The server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides cloud computing services. The mobile terminal device can be a laptop computer, a desktop computer, etc., but is not limited to these.

[0032] As shown in Figure 1, a method for completing travel chain data based on an improved RBF (Radial Basis Function) neural network model includes: S1, obtaining the user's current travel chain data, which includes at least one travel segment and at least one missing travel segment. The travel segment includes public transportation data, and the missing travel segment includes walking data or cycling data. In this embodiment of the application, the acquisition of current travel chain data includes steps such as data collection, preprocessing, stop point identification, and travel chain construction.

[0033] As shown in Figure 2, obtaining the user's current travel chain data includes: collecting mobile signaling data and public transportation data of the user in the current time period, where the mobile signaling data represents the user's stay time and spatial coordinates at different locations, and the public transportation data represents the station coordinates and timestamps of the user's origin and destination; preprocessing the user's mobile signaling data and public transportation data in the current time period to obtain preprocessed mobile signaling data and public transportation data, where the preprocessing includes missing data processing, spatiotemporal conflict resolution processing, location merging processing, and ping-pong effect processing; and obtaining the user's current travel chain data through stop point identification and travel chain construction, based on the preprocessed mobile signaling data and public transportation data.

[0034] In the above embodiments, mobile signaling data can be obtained through communication between mobile phone base stations and mobile phones, representing the user's dwell time and spatial coordinates at different locations. For example, the switching time and location information of the mobile phone in different base station coverage areas can be recorded. Public transportation data can be obtained from data sources such as transportation card swipe records or public transportation applications, representing the station coordinates and timestamps of the user's origin and destination. The data source can also be replaced by the back-end database of the public transportation operator, etc.

[0035] Preprocessing is performed separately for user mobile signaling data and public transportation data for the current time period. Missing data handling involves filtering out incomplete or missing signaling data to ensure data integrity and reliability. Interpolation methods, such as linear interpolation, can be used to supplement missing data, estimating the missing data based on data from adjacent time points.

[0036] The formula for linear interpolation is: $\hat{x}_t = \frac{x_{t-1} + x_{t+1}}{2}$, where $\hat{x}_t$ is the estimated value at time point $t$, $t$ is the time point at which the missing data occurred, $x_{t-1}$ is the actual observation recorded at the time point $(t-1)$ before the missing time point, and $x_{t+1}$ is the actual observation recorded at the time point $(t+1)$ after the missing time point.
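As a small illustration of the interpolation step described above, the sketch below fills an isolated missing value with the mean of the observations immediately before and after it, which is what linear interpolation gives for equally spaced time points. The function name and the use of `None` to mark a missing record are assumptions for this example; longer gaps are deliberately left unfilled.

```python
from typing import List, Optional


def fill_missing_linear(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill isolated gaps (None) with the mean of the two adjacent observations."""
    filled = list(values)
    for t in range(1, len(values) - 1):
        prev_obs, next_obs = values[t - 1], values[t + 1]
        if values[t] is None and prev_obs is not None and next_obs is not None:
            filled[t] = (prev_obs + next_obs) / 2.0
    return filled


# Hypothetical longitude series with one isolated gap and one two-point gap.
longitudes = [116.30, None, 116.34, 116.35, None, None, 116.40]
print(fill_missing_linear(longitudes))
```

Only the original neighbours are used, so an estimate is never built on top of another estimate.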

[0037] Spatiotemporal conflict resolution addresses anomaly data where users appear in different locations at the same time, selecting the location with the highest credibility as valid data. This selection can be based on a credibility formula, such as combining factors like signal strength, time information, and historical location data to calculate credibility.

[0038] The credibility formula is: $C(x) = w_1 S_{\text{signal}}(x) + w_2 S_{\text{time}}(x) + w_3 S_{\text{hist}}(x)$, where $C(x)$ represents the confidence level of location $x$, and $x$ is one of multiple candidate location data points recorded at the same time point $t$. $w_1$, $w_2$, and $w_3$ correspond to the weights of signal strength, time information, and historical location information, respectively. $S_{\text{time}}(x)$ is the credibility sub-score based on time reasonableness, $S_{\text{signal}}(x)$ is the credibility sub-score based on signal strength, and $S_{\text{hist}}(x)$ is the credibility sub-score based on the user's historical behavior patterns. The candidate with the highest credibility is retained as valid data.
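The conflict resolution step can be sketched as a weighted score followed by a maximum. The weight values, the assumption that each sub-score is already normalised to [0, 1], and the class and function names are all hypothetical; the source only states that signal strength, time information, and historical location data are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    lon: float
    lat: float
    signal_score: float   # sub-score from signal strength, assumed in [0, 1]
    time_score: float     # sub-score from time reasonableness, assumed in [0, 1]
    history_score: float  # sub-score from historical behaviour, assumed in [0, 1]


def credibility(c: Candidate, w_signal: float = 0.5,
                w_time: float = 0.3, w_history: float = 0.2) -> float:
    """Weighted sum of the three sub-scores (weights are illustrative)."""
    return (w_signal * c.signal_score
            + w_time * c.time_score
            + w_history * c.history_score)


def resolve_conflict(candidates: List[Candidate]) -> Candidate:
    """Keep the candidate location with the highest credibility."""
    return max(candidates, key=credibility)


# Two hypothetical locations reported for the same timestamp.
near_home = Candidate(116.31, 39.91, signal_score=0.9, time_score=0.8, history_score=0.1)
far_away = Candidate(116.52, 40.02, signal_score=0.3, time_score=0.6, history_score=0.9)
kept = resolve_conflict([near_home, far_away])
print(kept.lon, kept.lat)
```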

[0039] Location merging combines data from users who appear in the same location multiple times within a short period, reducing the impact of redundant data. This can be achieved by calculating the average of these locations.

[0040] $\bar{p} = \frac{1}{n} \sum_{i=1}^{n} p_i$, where $p_i$ indicates the $i$-th location point and $n$ indicates the number of times it appears.
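Location merging by averaging can be written in a few lines. The sketch assumes the repeated records have already been grouped (for example, by a short time window, which the source does not quantify) and that averaging raw coordinates is acceptable over the small distances involved; the function name is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def merge_locations(points: List[Point]) -> Point:
    """Replace repeated nearby records with their coordinate-wise mean."""
    if not points:
        raise ValueError("at least one location point is required")
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


# Three hypothetical records of the same place within a short period.
print(merge_locations([(116.300, 39.900), (116.302, 39.901), (116.301, 39.899)]))
```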

[0041] Ping-pong effect processing addresses the ping-pong switching phenomenon caused by signal jumps, filtering out frequently switching abnormal data. Thresholds can be set for time difference and position difference; when the time difference exceeds the threshold and the position difference is less than the threshold, the data is considered abnormal and filtered out.

[0042] $\Delta t > T_t$ and $\Delta d < T_d$, where $\Delta t$ indicates the time difference, $\Delta d$ indicates the positional difference, and $T_t$ and $T_d$ represent the corresponding thresholds.

[0043] The stop point identification step is based on preprocessed mobile phone signaling data and public transportation data.

[0044] S2, input the user's current travel chain data into the pre-trained improved RBF neural network model to obtain complete travel chain data. The pre-trained improved RBF neural network model is trained based on historical travel chain data, which includes historical mobile phone signaling data and historical public transportation card swiping data.

[0045] In the embodiments of this application, as shown in Figure 3, the training method for the pre-trained improved RBF neural network model includes:
S21, acquiring the user's spatiotemporal trajectory data within a preset historical time period. The spatiotemporal trajectory data includes historical mobile phone signaling data and historical public transportation card swiping data. The historical mobile phone signaling data includes the user's stay time and spatial coordinates at different locations.
S22, performing activity trajectory clustering on the user's spatiotemporal trajectory data within the preset historical time period to determine the user's target activity area.
In this embodiment of the application, the step of clustering the user's spatiotemporal trajectory data over a preset historical time period to determine the user's target activity area includes: performing density clustering on the spatiotemporal trajectory data within the preset historical time period based on the DBSCAN clustering algorithm to generate multiple candidate activity area clusters; calculating the daily activity intensity distribution based on the daily mobile phone signaling data within the preset historical time period; calculating an information entropy value based on the daily activity intensity distribution, and distinguishing weekdays from rest days according to the information entropy value; calculating, for each candidate activity area cluster, a first dwell index during the preset rest period on weekdays and a second dwell index during the preset working period on weekdays; calculating the residence probability of each candidate activity area cluster based on the first dwell index, and the working probability of each candidate activity area cluster based on the second dwell index; determining the user's optimal residence and optimal workplace based on the residence probability and working probability of each candidate activity area cluster; and determining the user's target activity area based on the user's optimal residence and optimal workplace.

[0046] In the above implementation, the DBSCAN algorithm is a density-based spatial clustering algorithm that can divide data points into different clusters and identify noise points. The DBSCAN clustering algorithm outputs clusters, which are regions formed by dense trajectory points. Based on work and rest periods, it extracts two main residence areas: home and office. Signaling data ensures the accuracy of the extracted residence coordinates.
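To make the density clustering step concrete, here is a compact pure-Python version of the textbook DBSCAN procedure. It is a sketch only: it assumes trajectory points have been projected to planar coordinates in metres so that Euclidean distance is meaningful, and the `eps` and `min_pts` values are placeholders, since the source gives no parameters. In practice a library implementation such as scikit-learn's `DBSCAN` class would normally be used.

```python
from math import hypot
from typing import List, Optional, Tuple

NOISE = -1


def dbscan(points: List[Tuple[float, float]], eps: float, min_pts: int) -> List[int]:
    """Return one cluster label per point; -1 marks noise."""
    labels: List[Optional[int]] = [None] * len(points)

    def neighbours(i: int) -> List[int]:
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:      # not a core point (may become border later)
            labels[i] = NOISE
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:    # noise reachable from a core point: border
                labels[j] = cluster_id
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_pts:   # j is a core point: expand
                queue.extend(j_neighbours)
    return [label if label is not None else NOISE for label in labels]


# Hypothetical stay points: two dense areas and one isolated record.
stay_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(stay_points, eps=1.5, min_pts=3))
```

Each resulting cluster corresponds to one candidate activity area; points labelled -1 are treated as noise and ignored in the later dwell calculations.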

[0047] Activity intensity can be measured by the amount of time a user spends at a particular location.

[0048] Information entropy reflects the uncertainty of data; different activity intensity distributions lead to different information entropy values. Introducing the activity intensity information entropy metric allows for the measurement and analysis of user activity intensity, identifying work and rest periods.

[0049] The information entropy is calculated as: $H(X) = -\sum_{i} p(x_i) \log p(x_i)$, where $X$ represents the set of activity intensities and $p(x_i)$ is the probability of the i-th activity intensity.
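As a small illustration, the entropy of one day's activity intensity distribution could be computed as below. This is a sketch that assumes the standard Shannon entropy with a base-2 logarithm and that the intensities are non-negative stay times; the function name is hypothetical, and the threshold used to separate weekdays from rest days is not given in this application.

```python
from math import log2

def activity_entropy(intensities):
    """Shannon entropy (in bits) of a list of non-negative activity intensities."""
    total = sum(intensities)
    if total <= 0:
        return 0.0
    # Convert intensities to probabilities; zero entries contribute nothing.
    probs = [v / total for v in intensities if v > 0]
    return -sum(p * log2(p) for p in probs)
```

A day in which the user's time is spread evenly over many locations yields a higher entropy than a day concentrated at one location, which is the property the weekday and rest day distinction relies on.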

[0050] By analyzing activity intensity, the grids of the user's main stay areas during work and rest periods are extracted. The probability distribution of work and residence locations within each grid is then calculated, and each grid is scored to determine the user's optimal residence and optimal workplace. The scoring formula is as follows: [formula not reproduced]. For the selected core activity areas, the center point coordinates and radius are calculated to form a list of main activity areas. The "stay time index" is calculated separately for the "rest period" and the "work period" within each grid. The stay time index can be, for example, the sum of stay times.
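The following sketch shows one way the dwell indices could be turned into residential and working probabilities. It assumes the dwell index is the sum of stay times and that each probability is the cluster's share of the total dwell index; this normalization, the function names, and the input layout are illustrative assumptions and do not reproduce the application's scoring formula.

```python
def dwell_probabilities(dwell_index):
    """Normalize per-cluster dwell indices so that they sum to 1 (assumed scoring)."""
    total = sum(dwell_index.values())
    if total == 0:
        return {c: 0.0 for c in dwell_index}
    return {c: v / total for c, v in dwell_index.items()}

def pick_home_and_work(stays, rest_hours, work_hours):
    """stays: list of (cluster_id, hour_of_day, minutes) records on weekdays."""
    rest, work = {}, {}
    for cluster, hour, minutes in stays:
        rest.setdefault(cluster, 0.0)
        work.setdefault(cluster, 0.0)
        if hour in rest_hours:
            rest[cluster] += minutes   # first dwell index (rest period)
        if hour in work_hours:
            work[cluster] += minutes   # second dwell index (working period)
    p_home = dwell_probabilities(rest)
    p_work = dwell_probabilities(work)
    home = max(p_home, key=p_home.get)
    workplace = max(p_work, key=p_work.get)
    return home, workplace, p_home, p_work
```

The cluster with the highest residential probability is taken as the optimal residence and the cluster with the highest working probability as the optimal workplace. The function expects at least one stay record.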

[0051] It should be noted that the clustering algorithm can also be replaced by the K-Means algorithm, etc.

[0052] S23, constructing a travel chain structure based on the historical public transportation card swipe data and the target activity area, the travel chain structure including multiple travel segments. In the embodiments of this application, as shown in Figure 4, the step of constructing a travel chain structure based on the historical public transportation card swipe data and the target activity area includes: based on the historical public transportation card swipe data and the target activity area, dividing the user's continuous travel records into multiple travel segments, each of which includes a segment start point, a public transportation boarding point, a public transportation alighting point, and a segment end point; and, based on timestamp information, spatiotemporally correlating and sorting the multiple travel segments to construct the travel chain structure.

[0053] In the above embodiments, based on the mobile phone signaling data collected in the preceding steps, as well as the optimal residence and optimal workplace, the present invention divides the user's travel chain into multiple segments. Each segment includes the starting point before taking public transportation (such as the residence, the workplace, or the destination of the previous trip), the public transportation boarding point, the public transportation alighting point, and the destination after taking public transportation (such as the residence, the workplace, or the starting point of the next trip). The travel segments are then connected in chronological order to form a complete travel chain.
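The segment structure described above can be sketched as a small data class plus a chronological sort. The field names, the tuple layout of the card swipe records, and the rule that assigns the nearer of residence and workplace as origin or destination are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class TravelSegment:
    origin: Point        # segment start point
    board_stop: Point    # public transportation boarding point
    alight_stop: Point   # public transportation alighting point
    destination: Point   # segment end point
    board_time: int      # e.g. seconds since midnight
    alight_time: int

def nearest(anchors: List[Point], p: Point) -> Point:
    """Return the anchor closest to p (squared planar distance)."""
    return min(anchors, key=lambda a: (a[0] - p[0]) ** 2 + (a[1] - p[1]) ** 2)

def build_travel_chain(card_swipes, home: Point, work: Point) -> List[TravelSegment]:
    """card_swipes: list of (board_stop, board_time, alight_stop, alight_time)."""
    ordered = sorted(card_swipes, key=lambda s: s[1])  # sort by boarding time
    chain: List[TravelSegment] = []
    for b_stop, b_time, a_stop, a_time in ordered:
        # Origin: destination of the previous segment, else nearest anchor.
        origin = chain[-1].destination if chain else nearest([home, work], b_stop)
        destination = nearest([home, work], a_stop)
        chain.append(TravelSegment(origin, b_stop, a_stop, destination, b_time, a_time))
    return chain
```

The gaps between `origin` and `board_stop`, and between `alight_stop` and `destination`, are the walking or cycling legs that the method later completes.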

[0054] S24, determining the travel mode label for each travel segment in the travel chain structure based on the travel chain structure and the path planning results provided by the navigation API. In this embodiment of the application, this step includes:
Based on the navigation API, obtaining the path planning results for each travel segment in the travel chain structure under multiple candidate travel modes and the same time constraints, the candidate travel modes including walking, public transportation, driving, and cycling;
For each travel segment, calculating the path matching degree between the mobile signaling data corresponding to the travel segment and the navigation path of each candidate travel mode;
For each travel segment, calculating the time matching degree between the mobile signaling data corresponding to the travel segment and the navigation path of each candidate travel mode;
For each travel segment, determining, based on the path matching degree and the time matching degree, the comprehensive matching degree between the mobile signaling data corresponding to the travel segment and the navigation path of each candidate travel mode, and selecting the travel mode with the maximum comprehensive matching degree as the travel mode label of the travel segment.

[0055] In the above implementation, for each integrated travel chain, the travel time and distance from the user's origin to the public transportation starting point and from the public transportation ending point to the ending point are calculated. Combining the user's geographic location coordinates and time data, the travel distance and travel time for different modes of transportation are obtained using APIs provided by relevant navigation apps.

[0056] By analyzing the user's speed at each segment of the travel chain, their travel mode can be determined.

[0057] First, given the same origin and destination and the same time frame, navigation routes are requested for four modes of transportation: walking, public transport, driving, and cycling. Then, navigation data matching is performed, which mainly consists of three parts.

[0058] The first part is the path matching calculation, which matches the signaling trajectory data against the navigation path for each travel mode and calculates a matching degree. The path matching formula is as follows: [formula not reproduced]. It is expressed in terms of the matching degree between the signaling trajectory T and the navigation path P, the distance between each signaling trajectory point and the corresponding navigation waypoint, and n, the total number of path points. The purpose of path matching is to measure the similarity between the signaling trajectory and the navigation path through the inverse of the distances between them. The higher the matching degree, the greater the similarity between the signaling trajectory and the navigation path.
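Because the path matching formula itself is not available here, the sketch below uses an assumed form, the mean of 1 / (1 + d) over paired points, purely to illustrate an inverse-distance similarity. It also assumes that the signaling trajectory and the navigation path have been resampled to the same number of points n and are expressed in planar coordinates.

```python
from math import hypot

def path_matching_degree(trajectory, nav_path):
    """Assumed inverse-distance similarity in (0, 1]; 1.0 means identical paths."""
    if not trajectory or len(trajectory) != len(nav_path):
        raise ValueError("both paths must be non-empty and of equal length")
    n = len(trajectory)
    total = 0.0
    for (tx, ty), (px, py) in zip(trajectory, nav_path):
        d = hypot(tx - px, ty - py)   # distance between paired points
        total += 1.0 / (1.0 + d)      # closer points contribute more
    return total / n
```

Under this assumed form, a larger average distance between paired points always yields a lower matching degree, which is consistent with the behavior described in the text.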

[0059] The second part is the time matching calculation, which computes the temporal matching degree between the signaling trajectory data and each navigation path. The time matching formula is as follows: [formula not reproduced]. It is expressed in terms of the time of the signaling trajectory, the time required by the navigation path, and the matching degree between the two. The purpose of time matching is to measure the temporal compatibility between the signaling trajectory and the navigation path through the inverse of the time difference between them. A higher matching degree indicates a smaller time difference between the two.
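As with path matching, the exact time matching formula is not available here. The sketch below assumes the form 1 / (1 + |Δt|), with both durations in the same unit (minutes in this example), only to illustrate an inverse-time-difference score; the function and parameter names are hypothetical.

```python
def time_matching_degree(trajectory_minutes, nav_minutes):
    """Assumed inverse-time-difference score in (0, 1]; 1.0 means equal durations."""
    return 1.0 / (1.0 + abs(trajectory_minutes - nav_minutes))
```

For example, if the signaling data shows a leg lasting 12 minutes, a walking route estimated at 12 minutes scores higher than a cycling route estimated at 4 minutes.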

[0060] The third part is the comprehensive comparison, which combines the path matching degree and the time matching degree into a comprehensive matching degree. The comprehensive comparison formula is as follows: $M = M_{\text{path}} + M_{\text{time}}$, where $M_{\text{path}}$ is the path matching degree, $M_{\text{time}}$ is the time matching degree, and $M$ is the comprehensive matching degree. The comprehensive comparison thus sums the path matching degree and the time matching degree so that both the path and time dimensions are taken into account. The travel mode corresponding to the maximum comprehensive matching degree is selected as the travel mode label of the travel segment.
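The label selection can be sketched as follows, assuming the path and time matching degrees for each candidate mode have already been computed and are passed in as dictionaries keyed by mode name. The function name and the dictionary layout are illustrative.

```python
def label_travel_mode(path_scores, time_scores):
    """Sum the two matching degrees per mode and return the best mode."""
    combined = {mode: path_scores[mode] + time_scores[mode] for mode in path_scores}
    best_mode = max(combined, key=combined.get)
    return best_mode, combined
```

The returned `combined` dictionary keeps the comprehensive matching degree of every candidate mode, which can be useful for inspecting segments where two modes score almost equally.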

[0061] S25, a training set is constructed based on multiple travel chain structures and the travel mode labels of each travel segment in the travel chain structure, and the pre-constructed improved RBF neural network model is trained based on the training set to obtain the trained improved RBF neural network model.

[0062] In this embodiment of the application, the step of training a pre-constructed improved RBF neural network model based on a training set to obtain a trained improved RBF neural network model includes: The improved RBF neural network model is initialized based on the training set and the candidate node count set, and the optimal number of hidden layer nodes of the improved RBF neural network model is determined. Based on the training set, the optimal number of hidden layer nodes, the K-Means clustering algorithm, and the Adam optimization algorithm, the pre-constructed improved RBF neural network model is trained to obtain the trained improved RBF neural network model.

[0063] In this embodiment of the application, a training dataset is constructed using the multi-source data obtained in the aforementioned steps. This training dataset includes the following features: travel time (travel start time and end time), travel distance, travel speed, and geographical coordinates of the origin and destination.

[0064] The improved RBF neural network model was trained on the training dataset. The improvements included optimizing the number of hidden layer nodes. The initial setting was 20 hidden layer nodes. Different numbers of hidden layer nodes, such as 10, 20, 30, and 40, were then tried using a grid search. The model performance was evaluated on the validation set, and the configuration with the highest accuracy and F1 score was selected. The experimental results showed that the model performance was optimal when the number of hidden layer nodes was 30.
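The grid search over hidden layer node counts can be sketched as below. The `evaluate` callable is hypothetical: it is assumed to train a model with the given number of hidden nodes and return an (accuracy, F1) pair measured on the validation set. Ranking by accuracy first and F1 second is also an assumption.

```python
def select_hidden_nodes(candidates, evaluate):
    """Return the node count with the best (accuracy, F1) pair, plus all scores."""
    scores = {n: evaluate(n) for n in candidates}
    # Tuples compare lexicographically: accuracy first, then F1 as tie-breaker.
    best = max(scores, key=lambda n: scores[n])
    return best, scores
```

With the candidate set from the text, the call would look like `select_hidden_nodes([10, 20, 30, 40], evaluate)`.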

[0065] To optimize the selection of radial basis function (RBF) center points, 20 feature vectors from training samples are randomly selected as center points during initialization. K-Means clustering is also attempted to select center points. First, K-Means clustering is performed on the training samples, resulting in 30 cluster centers, which are then used as the center points of the RBF neural network. The model performance of both initialization methods is evaluated on the validation set. Experimental results show that the K-Means clustering center point initialization method slightly improves model performance.
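A minimal sketch of using K-Means cluster centers as RBF center points is shown below. It uses plain Lloyd iterations in pure Python; the deterministic farthest-first seeding is an assumption chosen to keep the example reproducible and is not described in this application.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first_init(samples, k):
    """Seed with the first sample, then repeatedly add the farthest sample."""
    centers = [list(samples[0])]
    while len(centers) < k:
        nxt = max(samples, key=lambda s: min(sq_dist(s, c) for c in centers))
        centers.append(list(nxt))
    return centers

def kmeans_centers(samples, k, iters=100):
    """Return k cluster centers to be used as RBF center points."""
    centers = farthest_first_init(samples, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in samples:
            j = min(range(k), key=lambda idx: sq_dist(x, centers[idx]))
            groups[j].append(x)
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:   # assignments no longer change
            break
        centers = new_centers
    return centers
```

For the configuration described in the text, `k` would be 30 and `samples` would be the training feature vectors.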

[0066] The training algorithm was optimized by initially using the standard gradient descent algorithm for model training. The Adam optimization algorithm was then attempted, and it was found that the Adam algorithm can adaptively adjust the learning rate for each parameter. Compared to standard gradient descent, the Adam algorithm typically converges to a better solution faster. The model performance of both optimization algorithms was evaluated on a validation set, and experimental results show that the model using the Adam optimization algorithm performs better.
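The per-parameter adaptive step of Adam can be illustrated with the short sketch below. The learning rate of 0.01 and the 500 iterations follow the training parameters given in this application; the values of `beta1`, `beta2`, and `eps` are the commonly used defaults and are assumed here, as is the `grad_fn` interface.

```python
from math import sqrt

def adam_minimize(grad_fn, params, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a function given its gradient, using the Adam update rule."""
    params = list(params)
    m = [0.0] * len(params)   # first-moment (mean) estimates
    v = [0.0] * len(params)   # second-moment (uncentered variance) estimates
    for t in range(1, steps + 1):
        g = grad_fn(params)
        for i in range(len(params)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)   # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            params[i] -= lr * m_hat / (sqrt(v_hat) + eps)
    return params
```

Because each parameter is scaled by its own second-moment estimate, parameters with large gradients take proportionally smaller steps. In the RBF model, `params` would hold the weights and center point coordinates, and `grad_fn` would return the gradient of the loss with respect to them.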

[0067] Through the optimizations in the three aspects mentioned above, an RBF neural network model with 30 hidden layer nodes, initialized centroids using K-Means clustering, and trained using the Adam optimization algorithm was finally obtained. The optimized model achieved higher prediction accuracy and F1 score on the validation set, thereby improving the overall recognition performance and generalization ability.

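[0068] The specific formula for the model is as follows: $y(x) = \sum_{i=1}^{N} w_i \, \varphi(\lVert x - c_i \rVert)$, where $y(x)$ is the output function, $w_i$ is the weight, $x$ is the sample input, $c_i$ is the center point, $N$ is the sample size, and $\varphi$ is the radial basis function.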
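A forward pass through an RBF network of this kind can be sketched as follows. The example assumes a Gaussian radial basis function with a single shared width `sigma` and one weight row per output node; the width parameter and the weight layout are illustrative assumptions.

```python
from math import exp

def gaussian_rbf(x, center, sigma):
    """Gaussian radial basis function of the distance between x and center."""
    sq = sum((a - b) ** 2 for a, b in zip(x, center))
    return exp(-sq / (2.0 * sigma ** 2))

def rbf_forward(x, centers, weights, sigma=1.0):
    """weights[k][j] is the weight from hidden unit j to output node k."""
    hidden = [gaussian_rbf(x, c, sigma) for c in centers]
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]
```

The index of the largest output can then be taken as the predicted travel mode. When training with a cross-entropy loss, the outputs would typically be passed through a softmax first.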

[0069] Subsequently, during training, the weights and center points are optimized by minimizing the loss function: [formula not reproduced]. The final parameters of the model are thereby determined. The network structure is as follows: input layer: 8 nodes, corresponding to 8 input features; hidden layer: 20 RBF neurons; output layer: 3 nodes, corresponding to 3 travel modes; activation function: Gaussian radial basis function; hidden layer node initialization: the feature vectors of 20 randomly selected training samples are used as center points.

[0070] Training parameters: learning rate: initial value 0.01, using a dynamic adjustment strategy, batch size 64, number of iterations 500, loss function is cross-entropy loss, optimization algorithm is Adam optimizer.

[0071] This method acquires current travel chain data containing travel segments with public transportation data and missing travel segments with walking or cycling data. It then inputs this data into an improved RBF neural network model trained based on historical mobile phone signaling data and historical public transportation card swipe data. This allows the model to automatically identify and complete walking or cycling data in the travel chain, thereby improving the completeness and accuracy of the travel chain data.

[0072] The pre-trained improved RBF neural network model acquires spatiotemporal trajectory data from users' preset historical time periods, integrating multi-source data for model training. Activity trajectory clustering of the spatiotemporal trajectory data determines the target activity area, accurately locating the user's activity range. Combining historical public transportation card swipe data with the target activity area constructs a travel chain structure, making the training data more closely reflect actual travel conditions. Utilizing navigation API route planning results determines the travel mode labels for travel segments, providing accurate label information for model training. Based on these, a training set is constructed to train the improved RBF neural network model, enhancing the model's recognition and generalization abilities, thereby improving the accuracy and completeness of travel chain data completion.

[0073] Figure 5 shows a schematic diagram of a travel chain data completion device 200 based on an improved RBF neural network model.

[0074] As shown in Figure 5, a travel chain data completion device 200 based on an improved RBF neural network model mainly includes: an acquisition module 201, used to acquire the user's current travel chain data, which includes at least one travel segment and at least one missing travel segment, where the travel segment includes public transportation data and the missing travel segment includes walking data or cycling data; and a travel chain data completion module 202, used to input the user's current travel chain data into a pre-trained improved RBF neural network model to obtain complete travel chain data. The pre-trained improved RBF neural network model is trained based on historical travel chain data, which includes historical mobile phone signaling data and historical public transportation card swipe data.

[0075] In one example, the module in any of the above devices may be one or more integrated circuits configured to implement the above methods, such as one or more application-specific integrated circuits (ASICs), or one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs), or a combination of at least two of these integrated circuit forms.

[0076] As another example, when the modules in a device are implemented by a processing element scheduling a program, the processing element can be a general-purpose processor, such as a central processing unit (CPU) or another processor capable of calling programs. Alternatively, these modules can be integrated together and implemented as a system-on-a-chip (SoC).

[0077] In this application, various objects such as messages / information / devices / network elements / systems / apparatus / actions / operations / processes / concepts may be named. It is understood that these specific names do not constitute a limitation on the relevant objects. The names may be changed depending on the scenario, context, or usage habits. The understanding of the technical meaning of the technical terms in this application should be mainly determined from their functions and technical effects embodied / performed in the technical solution.

[0078] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and modules described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0079] Those skilled in the art will recognize that the modules and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0080] Figure 6 is a structural block diagram of an electronic device 300 according to an embodiment of this application.

[0081] As shown in Figure 6, the electronic device 300 includes a processor 301 and a memory 302, and may further include one or more of an information input / output (I / O) interface 303, a communication component 304, and a communication bus 305.

[0082] The processor 301 controls the overall operation of the electronic device 300 to complete all or part of the steps in the aforementioned travel chain data completion method based on the improved RBF neural network model. The memory 302 stores various types of data to support the operation of the electronic device 300. This data may include, for example, instructions for any application or method operating on the electronic device 300, as well as application-related data. The memory 302 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as one or more of Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.

[0083] The I / O interface 303 provides an interface between the processor 301 and other interface modules, such as keyboards, mice, and buttons. These buttons can be virtual or physical. The communication component 304 is used for wired or wireless communication between the electronic device 300 and other devices. Wireless communication includes Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, or 4G, or a combination thereof. Accordingly, the communication component 304 may include a Wi-Fi component, a Bluetooth component, and an NFC component.

[0084] The communication bus 305 may include a path for transmitting information between the aforementioned components. The communication bus 305 may be a PCI (Peripheral Component Interconnect) bus or an EISA (Extended Industry Standard Architecture) bus, etc. The communication bus 305 may be divided into an address bus, a data bus, a control bus, etc.

[0085] The electronic device 300 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to execute the travel chain data completion method based on the improved RBF neural network model given in the above embodiments.

[0086] The following describes the computer-readable storage medium provided in the embodiments of this application. The computer-readable storage medium described below and the travel chain data completion method based on the improved RBF neural network model described above may be referred to in correspondence with each other.

[0087] This application also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps of the above-described method for completing travel chain data based on an improved RBF neural network model.

[0088] The computer-readable storage medium may include various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0089] The terms “comprising,” “including,” or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus.

[0090] The above description is merely a preferred embodiment of this application and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of this application is not limited to technical solutions formed by specific combinations of the above-described technical features, but should also cover other technical solutions formed by arbitrary combinations of the above-described technical features or their equivalents without departing from the foregoing application concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in this application.

Claims

1. A method for travel chain data completion based on an improved RBF neural network model, characterized in that it comprises: obtaining the user's current travel chain data, which includes at least one travel segment and at least one missing travel segment, wherein the travel segment includes public transportation data and the missing travel segment includes walking data or cycling data; and inputting the user's current travel chain data into a pre-trained improved RBF neural network model to obtain complete travel chain data, wherein the pre-trained improved RBF neural network model is trained based on historical travel chain data, which includes historical mobile phone signaling data and historical public transportation card swipe data.

2. The method for completing travel chain data based on an improved RBF neural network model according to claim 1, characterized in that the acquisition of the user's current travel chain data includes: collecting mobile signaling data and public transportation data of the user in the current time period, wherein the mobile signaling data represents the user's stay time and spatial coordinates at different locations, and the public transportation data represents the station coordinates and timestamps of the user's origin and destination; preprocessing the user's mobile signaling data and public transportation data in the current time period to obtain preprocessed mobile signaling data and public transportation data, wherein the preprocessing includes missing data processing, spatiotemporal conflict resolution, location merging, and ping-pong effect processing; and, based on the preprocessed mobile signaling data and public transportation data, obtaining the user's current travel chain data through stop point identification and travel chain construction.

3. The method for completing travel chain data based on an improved RBF neural network model according to claim 1, characterized in that the training method for the pre-trained improved RBF neural network model includes: acquiring spatiotemporal trajectory data of the user within a preset historical time period, wherein the spatiotemporal trajectory data includes historical mobile phone signaling data and historical public transportation card swipe data, and the historical mobile phone signaling data includes the user's stay time and spatial coordinates at different locations; performing activity trajectory clustering on the user's spatiotemporal trajectory data within the preset historical time period to determine the user's target activity area; constructing a travel chain structure based on the historical public transportation card swipe data and the target activity area, the travel chain structure including multiple travel segments; determining the travel mode label for each travel segment in the travel chain structure based on the travel chain structure and the path planning results provided by the navigation API; and constructing a training set based on multiple travel chain structures and the travel mode labels of each travel segment in the travel chain structures, and training the pre-constructed improved RBF neural network model based on the training set to obtain the trained improved RBF neural network model.

4. The method for completing travel chain data based on an improved RBF neural network model according to claim 3, characterized in that the step of clustering the user's spatiotemporal trajectory data within a preset historical time period to determine the user's target activity area includes: based on the DBSCAN clustering algorithm, performing density clustering on the spatiotemporal trajectory data within the preset historical time period to generate multiple candidate activity area clusters; calculating the daily activity intensity distribution from the daily mobile phone signaling data within the preset historical time period, calculating an information entropy value from the daily activity intensity distribution, and distinguishing weekdays from rest days according to the information entropy value; calculating, for each candidate activity area cluster, a first dwell index during the preset rest period on weekdays and a second dwell index during the preset working period on weekdays; calculating the residential probability of each candidate activity area cluster from the first dwell index, and the working probability of each candidate activity area cluster from the second dwell index; determining the user's optimal residence and optimal workplace based on the residential probability and working probability of each candidate activity area cluster; and determining the user's target activity area based on the user's optimal residence and optimal workplace.

5. The method for completing travel chain data based on an improved RBF neural network model according to claim 4, characterized in that the step of constructing a travel chain structure based on the historical public transportation card swipe data and the target activity area includes: based on the historical public transportation card swipe data and the target activity area, dividing the user's continuous travel records into multiple travel segments, each of which includes a segment start point, a public transportation boarding point, a public transportation alighting point, and a segment end point; and, based on timestamp information, spatiotemporally correlating and sorting the multiple travel segments to construct the travel chain structure.

6. The method for completing travel chain data based on an improved RBF neural network model according to claim 3, characterized in that the step of determining the travel mode label for each travel segment in the travel chain structure based on the travel chain structure and the path planning results provided by the navigation API includes: based on the navigation API, obtaining the path planning results for each travel segment in the travel chain structure under multiple candidate travel modes and the same time constraints, the candidate travel modes including walking, public transportation, driving, and cycling; for each travel segment, calculating the path matching degree between the mobile signaling data corresponding to the travel segment and the navigation path of each candidate travel mode; for each travel segment, calculating the time matching degree between the mobile signaling data corresponding to the travel segment and the navigation path of each candidate travel mode; and, for each travel segment, determining, based on the path matching degree and the time matching degree, the comprehensive matching degree between the mobile signaling data corresponding to the travel segment and the navigation path of each candidate travel mode, and selecting the travel mode with the maximum comprehensive matching degree as the travel mode label of the travel segment.

7. The method for completing travel chain data based on an improved RBF neural network model according to claim 3, characterized in that the step of training the pre-constructed improved RBF neural network model based on the training set to obtain the trained improved RBF neural network model includes: initializing the improved RBF neural network model based on the training set and a set of candidate node counts, and determining the optimal number of hidden layer nodes of the improved RBF neural network model; and training the pre-constructed improved RBF neural network model based on the training set, the optimal number of hidden layer nodes, the K-Means clustering algorithm, and the Adam optimization algorithm to obtain the trained improved RBF neural network model.

8. A travel chain data completion device based on an improved RBF neural network model, characterized in that it comprises: an acquisition module, used to acquire the user's current travel chain data, which includes at least one travel segment and at least one missing travel segment, wherein the travel segment includes public transportation data and the missing travel segment includes walking data or cycling data; and a travel chain data completion module, used to input the user's current travel chain data into a pre-trained improved RBF neural network model to obtain complete travel chain data, wherein the pre-trained improved RBF neural network model is trained based on historical travel chain data, which includes historical mobile phone signaling data and historical public transportation card swipe data.

9. An electronic device, characterized in that it includes a processor coupled to a memory, wherein the processor is configured to execute a computer program stored in the memory to cause the electronic device to perform the method as described in any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that it includes a computer program or instructions that, when run on a computer, cause the computer to perform the method as described in any one of claims 1 to 7.