A vehicle fuel consumption prediction method and system based on vehicle CAN data

By acquiring vehicle CAN data in real time, performing feature extraction and cluster analysis, and combining road parameters and congestion level data, a fuel consumption prediction model is constructed, which solves the problems of real-time performance and accuracy in fuel consumption prediction in existing technologies, and achieves efficient fuel consumption prediction.

CN118015727B · Active · Publication Date: 2026-06-19 · NANCHANG AUTOMOTIVE INST OF INTELLIGENCE & NEW ENERGY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NANCHANG AUTOMOTIVE INST OF INTELLIGENCE & NEW ENERGY
Filing Date
2024-03-20
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies for predicting vehicle fuel consumption suffer from problems such as data noise, nonlinear system characteristics, high computational resource requirements, poor real-time performance, and large prediction bias in complex driving environments, especially making it difficult to accurately predict fuel consumption during traffic congestion.

Method used

By acquiring vehicle CAN data in real time, performing feature extraction and cluster analysis, and combining road parameters and congestion level data, a fuel consumption prediction model is constructed. Key features are selected using mutual information, and the model is trained using a random forest algorithm to achieve fuel consumption prediction.

Benefits of technology

It achieves real-time and accurate fuel consumption prediction, reduces feature redundancy, reduces reliance on additional sensors and computing resources, provides a perspective of both macroscopic and microscopic driving environments, and improves the accuracy of fuel consumption prediction.



Abstract

This invention provides a method and system for predicting vehicle fuel consumption based on in-vehicle CAN data. The method includes: extracting features from the original CAN parameters; performing cluster analysis on the feature variables in the driver operation and driving risk feature set; determining the vehicle's driving parameters and road parameters, and determining the congestion level data of the driving segment; constructing a preset fuel consumption prediction model and dividing the processing dataset into a training set and a test set; inputting the training set into the preset fuel consumption prediction model for training to obtain a trained fuel consumption prediction model, and inputting the test set into the trained fuel consumption prediction model to obtain the predicted fuel consumption result. This invention can improve the real-time performance and accuracy of fuel consumption prediction using only the vehicle's CAN data, without additional sensors or professional monitoring equipment.

Description

Technical Field

[0001] This invention belongs to the technical field of fuel consumption prediction, specifically relating to a method and system for predicting vehicle fuel consumption based on onboard CAN data.

Background Technology

[0002] With rapid social development and continuous growth in transportation demand, the volume of road freight transport has risen year on year, making accurate prediction of vehicle fuel consumption particularly important. The fuel consumption of road freight trucks is related not only to the vehicle's own characteristics, driving conditions, cargo type, and load capacity, but also to the actual driving environment and external conditions. In particular, during traffic congestion, frequent acceleration and deceleration and reduced average speed significantly increase fuel consumption and emissions. Although many earlier technologies attempted to measure and predict fuel consumption directly using various sensors and specialized equipment, they were often limited by high cost, implementation complexity, and poor real-time performance. Fuel consumption estimation methods that rely solely on vehicle dynamics models and neural networks also suffer from errors caused by data noise and nonlinear system characteristics, and require large computational resources. Furthermore, traditional methods show significant prediction bias in complex driving environments and road conditions.

[0003] For example, Chinese patent application No. 202010918106.5, entitled "A Method and Apparatus for Evaluating the Energy Consumption of Heavy-Duty Diesel Vehicles," uses the vehicle's ignition switch signal and engine status signal to determine the mileage and collects driving data based on the CAN bus. It refines and extracts key features and uses the Pearson correlation coefficient to filter out features highly correlated with energy consumption. Subsequently, it uses a pre-trained neural network-based model to combine these features for energy consumption evaluation. Its disadvantages are: the Pearson correlation coefficient has limitations and may overlook some nonlinear or more complex feature relationships; it requires a large amount of data to train the neural network; and it may have evaluation biases for certain specific driving scenarios or vehicle models not covered.

[0004] Chinese patent application No. 202210310009.7, entitled "A Transient Vehicle Fuel Consumption Estimation Method and System Based on Correction of Strongly Correlated Fuel Consumption Parameters," proposes a method for estimating fuel consumption based on steady-state and non-steady-state driving data. This method combines various data analysis techniques, such as principal component analysis, cluster analysis, and polynomial fitting, to establish a vehicle fuel consumption model. Its drawback is that it does not consider the impact of traffic flow conditions. For example, on busy urban roads or in traffic jams, frequent vehicle starts and stops may increase fuel consumption. Ignoring these real-world driving factors may lead to deviations in fuel consumption estimation in certain situations.

[0005] Patent application CN202010234278.0, entitled "Multi-condition Fuel Consumption Prediction Method and System for Fuel Vehicles Based on Gaussian Process Regression," provides a fuel consumption prediction scheme for fuel vehicles. It uses a Gaussian process regression model for training and optimizes the dataset through a sequential sampling algorithm, repeatedly executing the process until a predetermined stopping condition is met before outputting the prediction result. Its drawbacks are: repeated sequential sampling may prolong computation time, and the Gaussian process-based model may require significant computational resources.

Summary of the Invention

[0006] To address the aforementioned technical problems, this invention provides a vehicle fuel consumption prediction method and system based on vehicle CAN data, which solves the technical problems in the prior art.

[0007] In a first aspect, the present invention provides the following technical solution: a method for predicting vehicle fuel consumption based on onboard CAN data, comprising:

[0008] The vehicle's original CAN parameters are acquired in real time, and features are extracted from the original CAN parameters to obtain the first input features. The original CAN parameters include at least vehicle speed, engine speed, throttle position, fuel consumption, brake pressure, driving latitude and longitude data, driving mileage, and key engine parameters.

[0009] A driver operation and driving risk feature set is constructed, and the feature variables in the driver operation and driving risk feature set are clustered using the K-Means unsupervised clustering analysis algorithm to obtain the second input feature;

[0010] Based on the latitude and longitude data of the vehicle, the driving parameters and road parameters of the vehicle are determined, and based on the driving parameters and road parameters, the congestion level data of the driving segment is determined;

[0011] A preset fuel consumption prediction model is constructed, and the first input feature, the second input feature and the traffic congestion level data of the driving segment are stored in the processing dataset. The processing dataset is divided into a training set and a test set.

[0012] The training set is input into the preset fuel consumption prediction model for training to obtain the trained fuel consumption prediction model, and the test set is input into the trained fuel consumption prediction model to obtain the predicted fuel consumption result.

[0013] Compared with existing technologies, the beneficial effects of this invention are as follows: This invention first acquires the vehicle's original CAN parameters in real time, extracts features from the original CAN parameters to obtain the first input features, then constructs a driver operation and driving risk feature set, and uses the K-Means unsupervised clustering analysis algorithm to perform cluster analysis on the feature variables in the driver operation and driving risk feature set to obtain the second input features; then, based on the driving latitude and longitude data, it determines the vehicle's driving parameters and road parameters, and based on the driving parameters and road parameters, it determines the traffic congestion level data of the driving segment; then, it constructs a preset fuel consumption prediction model, stores the first input features, the second input features, and the traffic congestion level data of the driving segment into a processing dataset, and divides the processing dataset into a training set and a test set; finally, it inputs the training set into the preset fuel consumption prediction model for training to obtain the trained fuel consumption prediction model, and inputs the test set into the trained fuel consumption prediction model to obtain the predicted fuel consumption result. This invention reads the vehicle's CAN data in real time, and through comprehensive analysis of massive amounts of data, ensures the real-time performance and accuracy of fuel consumption prediction. This solves the problems of data accuracy and training requirements. At the same time, the MI method is used to accurately screen key vehicle condition features that are highly correlated with fuel consumption, which significantly reduces feature redundancy and avoids the problem of local feature limitations. 
Furthermore, the driver transportation behavior profile is constructed using real vehicle operation data as an independent feature, providing macro and micro perspectives for prediction and helping to consider actual driving environment factors more comprehensively. In addition, based on the ratio of free flow speed to CAN speed, the road congestion index of the vehicle's driving segment is effectively calculated, making fuel consumption prediction more accurate. Compared with other traditional methods, this invention only needs to acquire the vehicle's CAN data, without the need for additional sensors or professional monitoring equipment, and does not rely on a large amount of computing resources.

[0014] Preferably, the step of acquiring the vehicle's original CAN parameters in real time and extracting features from the original CAN parameters to obtain the first input features includes:

[0015] The vehicle's raw CAN parameters are acquired in real time, and the raw CAN parameters are sequentially subjected to smoothing filtering, interpolation and missing data processing to obtain processed CAN parameters.

[0016] Calculate the mutual information $I(X;Y)$ between the fuel consumption variable $Y$ and the processed CAN parameters $X$:

[0017] $$I(X;Y)=\sum_{x\in X}\sum_{y\in Y}p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}$$;

[0018] In the formula, $X$ is the parameter set consisting of all the processed CAN parameters, $p(x,y)$ is the joint probability distribution of parameter $x$ and parameter $y$, and $p(x)$ and $p(y)$ are the marginal probability distributions of parameter $x$ and parameter $y$;

[0019] The processed CAN parameters are sorted by mutual information $I(X;Y)$ from largest to smallest to obtain sorted CAN parameters, and the first few parameters in the sorted CAN parameters are selected as the first input features.

[0020] Preferably, the step of constructing a driver operation and driving risk feature set and using the K-Means unsupervised clustering analysis algorithm to perform clustering analysis on the feature variables in the driver operation and driving risk feature set to obtain the second input feature includes:

[0021] Construct a driver operation and driving risk feature set, which includes at least the number of operating days, total operating time, average daily operating time, total dwell time, average daily dwell time, total operating efficiency, total mileage, average daily mileage, total night driving time, proportion of night driving time, frequency of night driving, and frequency of fatigue driving.

[0022] Determine the number of data points in the driver operation and driving risk feature set and the number of clusters to be divided. Select a set of cluster center points and minimize the distance from each data point to its nearest center point. Iterate continuously until the cluster center no longer changes or the iteration stopping condition is met.

[0023] The number of clusters and the corresponding sum of squared errors (SSE) are determined using the elbow method, and a K-SSE curve is plotted from the number of clusters and the corresponding sum of squared errors.

[0024] The optimal K value, at which the decrease in SSE slows sharply, is determined from the K-SSE curve, and the feature data corresponding to the optimal K value is used as the second input feature.

[0025] Preferably, the step of determining the vehicle's driving parameters and road parameters based on the driving latitude and longitude data includes:

[0026] Based on the latitude and longitude data of the vehicle, the vehicle's driving trajectory is visualized on the map to obtain a driving route map;

[0027] The geographic information system is matched with the driving route map to determine the driving parameters of the vehicle. The driving parameters include at least the specific road, road segment, driving time and driving distance.

[0028] The road type and number of lanes are determined based on the specific road on which the vehicle travels, in order to obtain the vehicle's road parameters.

[0029] Preferably, the step of determining the congestion level data of the driving segment based on the driving parameters and the road parameters includes:

[0030] A dynamic segment service level table is determined based on the driving parameters and the road parameters, and the road free-flow speed is determined based on the dynamic segment service level table.

[0031] Based on the road free-flow speed $v_f$ and the vehicle CAN speed $v_{\mathrm{CAN}}$, determine the road congestion index $I_c$:

[0032] $$I_c=\frac{v_f}{v_{\mathrm{CAN}}}$$;

[0033] Based on the road congestion index $I_c$, assign a congestion level to each road segment the vehicle travels on to obtain the congestion level data of the driving segment.

[0034] Preferably, the step of inputting the training set into the preset fuel consumption prediction model for training to obtain the trained fuel consumption prediction model includes:

[0035] The training set is input into the preset fuel consumption prediction model, and the preset fuel consumption prediction model is trained using the random forest algorithm;

[0036] The preset fuel consumption prediction model is iteratively optimized by grid search and cross-validation until the mean absolute error of the model is no greater than the performance threshold, so as to obtain the optimized fuel consumption prediction model.

[0037] Calculate the prediction score of the optimized fuel consumption prediction model:

[0038] ;

[0039] In the formula, $y_i$ is the $i$-th real fuel consumption value, $\hat{y}_i$ is the $i$-th fuel consumption value predicted by the optimized fuel consumption prediction model, and $n$ is the number of fuel consumption samples.

[0040] Determine whether the prediction score of the optimized fuel consumption prediction model is greater than the scoring threshold. If it is, the optimized fuel consumption prediction model is used as the trained fuel consumption prediction model. If it is not, the parameters of the optimized fuel consumption prediction model are iteratively optimized again.
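The grid search, cross-validation, and mean-absolute-error check described in this step can be sketched as follows. This is a minimal illustration, assuming scikit-learn and NumPy are available; the synthetic data, the parameter grid, and `MAE_THRESHOLD` are hypothetical stand-ins, since the source does not state the tuned parameters or the performance threshold.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV


def tune_random_forest(X_train, y_train, param_grid, cv=3):
    """Grid-search a random forest, scoring each candidate by cross-validated MAE."""
    search = GridSearchCV(
        RandomForestRegressor(random_state=0),
        param_grid,
        scoring="neg_mean_absolute_error",  # scikit-learn maximizes, hence the sign
        cv=cv,
    )
    search.fit(X_train, y_train)
    # Return the refitted best model and its cross-validated MAE (sign flipped back).
    return search.best_estimator_, -search.best_score_


# Synthetic stand-in data (assumption): 3 features, nonlinear fuel-like target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))
y = 10 * X[:, 0] + 5 * X[:, 1] ** 2 + rng.normal(0, 0.1, size=200)

# Hypothetical grid; first 140 rows are the training set (a 7:3 split).
grid = {"n_estimators": [50, 100], "max_depth": [4, 8]}
model, cv_mae = tune_random_forest(X[:140], y[:140], grid)

MAE_THRESHOLD = 1.0  # hypothetical performance threshold
needs_more_tuning = cv_mae > MAE_THRESHOLD
test_mae = mean_absolute_error(y[140:], model.predict(X[140:]))
```

If `needs_more_tuning` is true, the grid would be widened and the search repeated, which mirrors the iterative optimization described above.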

[0041] In a second aspect, the present invention provides the following technical solution: a vehicle fuel consumption prediction system based on onboard CAN data, the system comprising:

[0042] The first determining module is used to acquire the vehicle's original CAN parameters in real time, and to extract features from the original CAN parameters to obtain the first input features. The original CAN parameters include at least vehicle speed, engine speed, throttle position, fuel consumption, brake pressure, driving latitude and longitude data, driving mileage, and key engine parameters.

[0043] The second determining module is used to construct a driver operation and driving risk feature set, and to perform cluster analysis on the feature variables in the driver operation and driving risk feature set using the K-Means unsupervised clustering analysis algorithm to obtain the second input feature;

[0044] The third determining module is used to determine the vehicle's driving parameters and road parameters based on the driving latitude and longitude data, and to determine the congestion level data of the driving road segment based on the driving parameters and road parameters;

[0045] The construction module is used to build a preset fuel consumption prediction model, store the first input feature, the second input feature and the traffic congestion level data of the driving segment into the processing dataset, and divide the processing dataset into a training set and a test set.

[0046] The prediction module is used to input the training set into the preset fuel consumption prediction model for training to obtain a trained fuel consumption prediction model, and to input the test set into the trained fuel consumption prediction model to obtain the predicted fuel consumption result.

[0047] Preferably, the first determining module includes:

[0048] The processing submodule is used to acquire the vehicle's raw CAN parameters in real time, and to perform smoothing filtering, interpolation and missing data processing on the raw CAN parameters in sequence to obtain the processed CAN parameters.

[0049] The mutual information determination submodule is used to calculate the mutual information $I(X;Y)$ between the fuel consumption variable $Y$ and the processed CAN parameters $X$:

[0050] $$I(X;Y)=\sum_{x\in X}\sum_{y\in Y}p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}$$;

[0051] In the formula, $X$ is the parameter set consisting of all the processed CAN parameters, $p(x,y)$ is the joint probability distribution of parameter $x$ and parameter $y$, and $p(x)$ and $p(y)$ are the marginal probability distributions of parameter $x$ and parameter $y$;

[0052] The sorting submodule is used to sort the processed CAN parameters by mutual information $I(X;Y)$ from largest to smallest to obtain sorted CAN parameters, and to select the first few parameters in the sorted CAN parameters as the first input features.

[0053] In a third aspect, the present invention provides the following technical solution: a computer, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the vehicle fuel consumption prediction method based on vehicle CAN data as described above.

[0054] In a fourth aspect, the present invention provides the following technical solution: a storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the vehicle fuel consumption prediction method based on vehicle CAN data as described above.

Attached Figure Description

[0055] To more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0056] Figure 1 is a flowchart of a vehicle fuel consumption prediction method based on vehicle CAN data provided in Embodiment 1 of the present invention;

[0057] Figure 2 is a K-SSE curve diagram provided in Embodiment 1 of the present invention;

[0058] Figure 3 is a structural block diagram of the vehicle fuel consumption prediction system based on vehicle CAN data provided in Embodiment 2 of the present invention;

[0059] Figure 4 is a schematic diagram of the hardware structure of a computer provided in another embodiment of the present invention.

[0060] The embodiments of the present invention will be further described below with reference to the accompanying drawings.

Detailed Implementation

[0061] Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and intended to explain embodiments of the present invention, and should not be construed as limiting the present invention.

[0062] In the description of the embodiments of the present invention, it should be understood that the terms "length", "width", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings. They are only for the convenience of describing the embodiments of the present invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, they should not be construed as limitations on the present invention.

[0063] Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of embodiments of the present invention, "a plurality of" means two or more, unless otherwise explicitly specified.

[0064] In the embodiments of the present invention, unless otherwise explicitly specified and limited, the terms "installation," "connection," "linking," "fixing," etc., should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral part; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; they can refer to the internal communication of two components or the interaction between two components. Those skilled in the art can understand the specific meaning of the above terms in the embodiments of the present invention according to the specific circumstances.

[0065] Embodiment 1

[0066] In Embodiment 1 of the present invention, as shown in Figure 1, a vehicle fuel consumption prediction method based on onboard CAN data includes:

[0067] S1. Acquire the vehicle's original CAN parameters in real time, and extract features from the original CAN parameters to obtain the first input features. The original CAN parameters include at least vehicle speed, engine speed, throttle position, fuel consumption, brake pressure, driving latitude and longitude data, driving mileage, and key engine parameters.

[0068] Specifically, in step S1, the original CAN parameters can be obtained through the CAN bus as initial, unprocessed data. The original CAN parameters include at least vehicle speed, engine speed, throttle position, fuel consumption, brake pressure, driving latitude and longitude data, driving mileage, and key engine parameters, including fuel consumption parameters. Here, fuel consumption parameters refer to the real-time fuel consumption data monitored from the previous moment to the current moment. The purpose of this invention is to predict the fuel consumption at the next moment or in the next time period.

[0069] Specifically, step S1 includes:

[0070] S11. Acquire the vehicle's original CAN parameters in real time, and perform smoothing filtering, interpolation and missing data processing on the original CAN parameters in sequence to obtain processed CAN parameters.

[0071] Specifically, the raw CAN parameters may have abnormalities such as signal jumps and distortions, so data preprocessing is required. For the smoothing and filtering process, a low-pass filter is introduced to smooth the raw signal and filter out noise. For the interpolation and missing data process, linear interpolation is introduced to complete the data for any missing data points. For the abnormal data processing process, impulse noise and other obvious abnormal values in the data are detected and eliminated. It should be noted that the data acquisition frequency in this invention is 1 Hz.
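The three preprocessing operations named above can be sketched in plain Python for a single 1 Hz signal. This is only an illustrative sketch: the rolling-median outlier test, the first-order low-pass filter, the parameter values, and the order of operations (outlier removal, then interpolation, then smoothing) are assumptions, since the source does not specify the filter design or thresholds.

```python
from statistics import median


def remove_impulses(samples, window=5, threshold=3.0):
    """Mark impulse outliers as missing (None) using a rolling-median test."""
    cleaned = list(samples)
    half = window // 2
    for i, value in enumerate(samples):
        if value is None:
            continue
        neighbours = [v for v in samples[max(0, i - half): i + half + 1] if v is not None]
        med = median(neighbours)
        mad = median(abs(v - med) for v in neighbours)
        # Floor of 1.0 on the spread is a hypothetical guard for flat segments.
        if abs(value - med) > threshold * max(mad, 1.0):
            cleaned[i] = None
    return cleaned


def interpolate_linear(samples):
    """Fill missing points by linear interpolation between valid neighbours."""
    known = [i for i, v in enumerate(samples) if v is not None]
    if not known:
        raise ValueError("no valid samples to interpolate from")
    filled = list(samples)
    for i, value in enumerate(samples):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # gap at the start: hold the first valid value
            filled[i] = samples[right]
        elif right is None:         # gap at the end: hold the last valid value
            filled[i] = samples[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = samples[left] + w * (samples[right] - samples[left])
    return filled


def low_pass(samples, alpha=0.5):
    """First-order (exponential) low-pass filter; alpha is a hypothetical setting."""
    out = [samples[0]]
    for v in samples[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


# Hypothetical vehicle-speed trace with one missing sample and one impulse.
raw_speed = [50, 51, None, 53, 200, 55, 56]
processed = low_pass(interpolate_linear(remove_impulses(raw_speed)))
```

Removing impulses before interpolating keeps the spike at 200 from being smeared into neighbouring samples by the filter.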

[0072] S12. Calculate the mutual information $I(X;Y)$ between the fuel consumption variable $Y$ and the processed CAN parameters $X$:

[0073] $$I(X;Y)=\sum_{x\in X}\sum_{y\in Y}p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}$$;

[0074] In the formula, $X$ is the parameter set consisting of all the processed CAN parameters, $p(x,y)$ is the joint probability distribution of parameter $x$ and parameter $y$, and $p(x)$ and $p(y)$ are the marginal probability distributions of parameter $x$ and parameter $y$;

[0075] Specifically, mutual information is a method for measuring the interdependence between two random variables. Unlike the correlation coefficient, mutual information can capture not only linear relationships but also more complex nonlinear relationships. In this invention, feature selection using the MI method is a key step in improving model performance and reducing training time. Especially when dealing with high-dimensional data, selecting features that are highly correlated with the target variable can greatly improve the accuracy of the model.

[0076] S13. Sort the processed CAN parameters by mutual information $I(X;Y)$ from largest to smallest to obtain sorted CAN parameters, and select the first few parameters in the sorted CAN parameters as the first input features.

[0077] Specifically, in this invention, after the processed CAN parameters are sorted, the 10 features with the highest mutual information are selected as the first input features.
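A minimal sketch of this ranking step, assuming a simple histogram (equal-width binning) estimate of mutual information, is shown below. The bin count, the feature names, and the synthetic signals are hypothetical; the source does not say how the probability distributions are estimated. The Pearson coefficient is included only to illustrate the contrast with correlation-based screening drawn earlier.

```python
import math
from collections import Counter


def discretize(values, bins=8):
    """Map continuous values to equal-width bin indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant signals
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X;Y) in nats."""
    xb, yb = discretize(x, bins), discretize(y, bins)
    n = len(xb)
    joint, px, py = Counter(zip(xb, yb)), Counter(xb), Counter(yb)
    mi = 0.0
    for (a, b), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[a] / n) * (py[b] / n)))
    return mi


def pearson(x, y):
    """Pearson correlation coefficient, for comparison only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_by_mi(features, target, top_n=10):
    """Return feature names sorted by mutual information with the target."""
    scores = {name: mutual_information(col, target) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical signals: fuel depends on signal_a quadratically, not on signal_b.
signal_a = [i / 10 - 5 for i in range(101)]
signal_b = [(i * 37) % 101 for i in range(101)]
fuel = [a * a for a in signal_a]

ranking = rank_by_mi({"signal_a": signal_a, "signal_b": signal_b}, fuel)
linear_corr = pearson(signal_a, fuel)
```

In this toy case the linear correlation between `signal_a` and `fuel` is essentially zero because the relationship is symmetric, while the mutual information ranking still places `signal_a` first.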

[0078] S2. Construct a driver operation and driving risk feature set, and use the K-Means unsupervised clustering analysis algorithm to perform clustering analysis on the feature variables in the driver operation and driving risk feature set to obtain the second input feature;

[0079] Step S2 includes:

[0080] S21. Construct a driver operation and driving risk feature set, which includes at least the number of operating days, total operating time, average daily operating time, total stay time, average daily stay time, total operating efficiency, total mileage, average daily mileage, total night driving time, proportion of night driving time, frequency of night driving, and frequency of fatigue driving.

[0081] S22. Determine the number of data points in the driver operation and driving risk feature set and the number of clusters to be divided, select a set of cluster center points and minimize the distance from each data point to its nearest center point, and iterate continuously until the cluster center no longer changes or the iteration stop condition is met.

[0082] Specifically, in step S22, for a dataset $X$ containing n-dimensional data points and to be divided into $K$ clusters, Euclidean distance is used to measure dissimilarity between data objects, and the clustering objective $J$ can be represented as:

[0083] $$J=\sum_{k=1}^{K}\sum_{x_i\in C_k}\lVert x_i-\mu_k\rVert^{2}$$;

[0084] In the formula, $\mu_k$ is the center of the $k$-th cluster $C_k$, $x_i$ represents the $i$-th point in the dataset, and $n_k$ is the number of data points in the $k$-th cluster;

[0085] And the new cluster center $\mu_k$ in the iteration can be represented as:

[0086] $$\mu_k=\frac{1}{n_k}\sum_{x_i\in C_k}x_i$$.

[0087] S23. Determine the number of clusters and the corresponding sum of squared errors based on the elbow method, and plot the K-SSE curve based on the number of clusters and the corresponding sum of squared errors.

[0088] Specifically, in this step, the number of clusters is the K value in the K-SSE curve, and the sum of squared errors is the SSE value in the K-SSE curve.

[0089] S24. Determine, in the K-SSE curve, the optimal K value at which the decrease in SSE slows sharply, and use the feature data corresponding to the optimal K value as the second input feature;

[0090] Specifically, after the K-SSE curve is plotted, the optimal K value is found at the significant inflection point (the elbow) of the curve. As K increases, the SSE decreases continuously; the optimal K value is the one beyond which the decrease in SSE slows sharply. The K-SSE curve is shown in Figure 2. As can be seen from Figure 2, the change in SSE weakens at K = 3, so the driver behavior feature data clusters best when the number of clusters is 3. Therefore, the feature data corresponding to the optimal K value of 3 is used as the second input feature.
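Steps S22 to S24 can be sketched as a small K-Means implementation plus an elbow pick. Two parts are assumptions rather than the patented procedure: the deterministic farthest-point initialization (the source only says a set of center points is selected) and the second-difference rule used to locate the elbow automatically (the source reads it from the plotted curve). The toy points stand in for the driver feature vectors.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centers(points, k):
    """Farthest-point initialization (an assumed, deterministic choice)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Return (centers, SSE) after iterating until the centers stop changing."""
    centers = init_centers(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # New center = mean of the cluster's points; empty clusters keep their center.
        new_centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


def elbow_k(sse_by_k):
    """Pick the K with the largest second difference of SSE (heuristic elbow)."""
    ks = sorted(sse_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = sse_by_k[prev_k] - 2 * sse_by_k[k] + sse_by_k[next_k]
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Toy data (assumption): three well-separated groups of 9 points each.
blobs = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.66)]
offsets = (-0.5, 0.0, 0.5)
points = [(cx + dx, cy + dy) for cx, cy in blobs for dx in offsets for dy in offsets]

sse_by_k = {k: kmeans(points, k)[1] for k in range(1, 7)}
best_k = elbow_k(sse_by_k)
```

The second-difference rule is a convenience; on data without a clear bend it can disagree with a visual reading of the K-SSE curve, so the plot remains the reference.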

[0091] S3. Determine the vehicle's driving parameters and road parameters based on the driving latitude and longitude data, and determine the congestion level data of the driving section based on the driving parameters and road parameters;

[0092] Specifically, step S3 includes: S31, determining the vehicle's driving parameters and road parameters based on the driving latitude and longitude data; S32, determining the traffic congestion level data of the driving section based on the driving parameters and the road parameters.

[0093] Step S31 includes:

[0094] S311. Based on the latitude and longitude data of the driving route, the vehicle's driving trajectory is visualized on the map to obtain a driving route map.

[0095] S312. Match the geographic information system with the driving route map to determine the driving parameters of the vehicle, wherein the driving parameters include at least the specific road, road segment, driving time and driving distance of the vehicle.

[0096] S313. Determine the road type information and number of lanes based on the specific road on which the vehicle is traveling, so as to obtain the vehicle's road parameters.
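Two of the driving parameters in S312, driving time and driving distance, can be derived directly from the 1 Hz latitude and longitude samples. The sketch below uses the haversine great-circle formula, which is an assumed choice; matching the track to specific roads and road segments requires a geographic information system and is not shown. The `(timestamp, lat, lon)` tuple layout is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def driving_parameters(track):
    """track: list of (timestamp_s, lat_deg, lon_deg) tuples in time order."""
    distance = sum(
        haversine_km(a[1], a[2], b[1], b[2]) for a, b in zip(track, track[1:])
    )
    return {
        "driving_distance_km": distance,
        "driving_time_s": track[-1][0] - track[0][0],
    }


# Hypothetical two-point track along the equator.
params = driving_parameters([(0, 0.0, 0.0), (3600, 0.0, 1.0)])
```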

[0097] Step S32 includes:

[0098] S321. Determine a dynamic segment service level table based on the driving parameters and the road parameters, and determine the road free-flow speed based on the dynamic segment service level table;

[0099] Specifically, the free-flow speed of traffic is determined according to the dynamic segment service level table, which is determined based on the driving speed under different congestion levels on different road segments. The specific dynamic segment service level table is as follows:

[0100] Dynamic segment service level rating table

[0101]

| Service level / road grade | Expressway | Main road | Secondary road | Branch road |
|---|---|---|---|---|
| 1 | >55 km/h | >40 km/h | >30 km/h | >30 km/h |
| 2 | >40-55 km/h | >30-40 km/h | >20-30 km/h | >20-30 km/h |
| 3 | >30-40 km/h | >20-30 km/h | >15-20 km/h | >15-20 km/h |
| 4 | >20-30 km/h | >15-20 km/h | >10-15 km/h | >10-15 km/h |
| 5 | ≤20 km/h | ≤15 km/h | ≤10 km/h | ≤10 km/h |

[0102] As can be seen from the table above, service levels 1 to 5 represent increasingly higher levels of congestion, while the free-flow speed of traffic is the speed of traffic on roads with a service level of 1.
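The table lookup can be sketched in a few lines of Python. This is only an illustrative sketch: the road-type keys (`"expressway"`, `"main"`, `"secondary"`, `"branch"`) and the function name `service_level` are assumptions made for the example, while the speed bounds are transcribed from the table.

```python
# Lower speed bounds (km/h, exclusive) for service levels 1 to 4 per road type,
# transcribed from the dynamic segment service level table.
# Speeds at or below the level-4 bound fall into level 5.
# The dictionary keys are hypothetical names chosen for this sketch.
LOWER_BOUNDS_KMH = {
    "expressway": (55, 40, 30, 20),
    "main": (40, 30, 20, 15),
    "secondary": (30, 20, 15, 10),
    "branch": (30, 20, 15, 10),
}


def service_level(road_type: str, speed_kmh: float) -> int:
    """Return the service level (1 = least congested, 5 = most congested)."""
    for level, bound in enumerate(LOWER_BOUNDS_KMH[road_type], start=1):
        if speed_kmh > bound:
            return level
    return 5


level_fast = service_level("expressway", 60)  # above 55 km/h
level_edge = service_level("expressway", 55)  # exactly 55 km/h is not ">55"
level_slow = service_level("branch", 12)      # in the >10-15 km/h band
```

Because every band in the table is open at its lower bound, a speed exactly on a boundary falls into the more congested level.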

[0103] S322. Based on the road free-flow speed $V_f$ and the vehicle CAN speed $V_{CAN}$, determine the road congestion index $I_c$:

[0104] $I_c = \dfrac{V_f}{V_{CAN}}$;

[0105] S323. Based on the road congestion index, assign a congestion level to each road segment the vehicle travels on, so as to obtain the congestion level data of each driving segment.

[0106] Specifically, in this step, the congestion level data of the driving segment is allocated according to the different congestion levels of different segments, and the different congestion levels can be divided into smooth (1), slow (2), congested (3), moderate congestion (4), and severe congestion (5).
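Steps S322 and S323 might be sketched as follows. The sketch assumes that the congestion index is the ratio of the free-flow speed to the CAN speed, as the summary of this embodiment states. The index bounds in `ASSUMED_BOUNDS` are purely hypothetical, since the text does not give the cut-off values that separate the five levels; the function names are likewise illustrative.

```python
# Level names as listed in the text for levels 1 to 5.
LEVEL_NAMES = {
    1: "smooth",
    2: "slow",
    3: "congested",
    4: "moderate congestion",
    5: "severe congestion",
}

# HYPOTHETICAL upper bounds of the index for levels 1 to 4.
# The source does not specify these values.
ASSUMED_BOUNDS = (1.0, 1.5, 2.0, 3.0)


def congestion_index(free_flow_kmh: float, can_speed_kmh: float) -> float:
    """Assumed form: free-flow speed divided by the measured CAN speed."""
    # A small floor avoids division by zero when the vehicle is stationary.
    return free_flow_kmh / max(can_speed_kmh, 1e-6)


def congestion_level(index: float) -> int:
    """Map an index to a level 1..5 using the assumed bounds."""
    for level, upper in enumerate(ASSUMED_BOUNDS, start=1):
        if index <= upper:
            return level
    return 5


index = congestion_index(60.0, 30.0)  # vehicle at half the free-flow speed
level = congestion_level(index)
name = LEVEL_NAMES[level]
```

With a ratio of this form, a larger index means the vehicle is moving further below the free-flow speed, so larger values map to more congested levels.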

[0107] S4. Construct a preset fuel consumption prediction model, store the first input feature, the second input feature and the traffic congestion level data of the driving segment into the processing dataset, and divide the processing dataset into a training set and a test set.

[0108] Specifically, in this step, the preset fuel consumption prediction model is a fuel consumption prediction RF random forest model. When constructing the feature dataset, the first input feature, the second input feature, and the traffic congestion level data of the driving segment are stored in the processing dataset in sequence, and the processing dataset is divided into a training set and a test set in a 7:3 ratio.
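A minimal sketch of the 7:3 division, using only the Python standard library. Shuffling before the split and the fixed seed are assumptions of this example; the text states only the ratio. Each row is assumed to hold the first input features, the second input feature, the congestion level, and the fuel consumption target.

```python
import random


def split_dataset(rows, train_ratio=0.7, seed=0):
    """Shuffle the rows and split them into (training set, test set)."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible (an assumption here).
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Ten placeholder rows stand in for the processing dataset.
train_set, test_set = split_dataset(range(10))
```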

[0109] S5. Input the training set into the preset fuel consumption prediction model for training to obtain the trained fuel consumption prediction model, and input the test set into the trained fuel consumption prediction model to obtain the predicted fuel consumption result.

[0110] Step S5 includes:

[0111] S51. Input the training set into the preset fuel consumption prediction model, and train the preset fuel consumption prediction model using the random forest algorithm.

[0112] S52. Iteratively optimize the preset fuel consumption prediction model through grid search and cross-validation until the model's mean absolute error is no greater than the performance threshold, so as to obtain the optimized fuel consumption prediction model.

[0113] Specifically, the model is trained using a random forest algorithm, and its parameters are optimized through grid search and cross-validation to improve prediction accuracy. A performance threshold is set during parameter optimization. In this application the performance threshold is 2: once the mean absolute error (MAE) is less than or equal to 2, the optimized fuel consumption prediction model is obtained, which ensures that the model achieves the predetermined performance target. At the same time, an early stopping strategy is applied during iterative optimization: training stops when there is no significant improvement after a certain number of consecutive iterations, which prevents overfitting and saves computational resources.
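The search procedure of S52 can be illustrated with a small standard-library sketch. It is not the random forest itself: `fit` is a hypothetical callable that trains any model and returns a prediction function, and the toy `fit_scaled` model exists only so the example runs. The helper names, the interleaved fold layout, and the default of five folds are assumptions; the MAE threshold of 2 comes from the text.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n, k):
    """Yield (train_idx, valid_idx) pairs for k interleaved folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, valid in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, valid


def cv_mae(fit, params, X, y, k=5):
    """Mean absolute error averaged over the k validation folds."""
    errors = []
    for train, valid in k_fold_indices(len(X), k):
        predict = fit([X[i] for i in train], [y[i] for i in train], **params)
        errors.append(mean(abs(y[i] - predict(X[i])) for i in valid))
    return mean(errors)


def grid_search(fit, grid, X, y, k=5, mae_threshold=2.0):
    """Try every parameter combination; return (best_params, best_mae, accepted)."""
    best_params, best_mae = None, float("inf")
    keys = sorted(grid)
    for values in product(*(grid[key] for key in keys)):
        params = dict(zip(keys, values))
        score = cv_mae(fit, params, X, y, k)
        if score < best_mae:
            best_params, best_mae = params, score
    return best_params, best_mae, best_mae <= mae_threshold


# Toy stand-in for a trainable model: predicts scale * x (hypothetical).
def fit_scaled(X_train, y_train, scale):
    return lambda x: scale * x


X = [float(v) for v in range(1, 11)]
y = [2.0 * v for v in X]
best_params, best_mae, accepted = grid_search(
    fit_scaled, {"scale": [1.0, 2.0, 3.0]}, X, y
)
```

In a real pipeline `fit` would wrap a random forest trainer, and the grid would hold hyperparameters such as the number of trees or the maximum depth.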

[0114] S53. Calculate the prediction score $\mathrm{MAPE}$ of the optimized fuel consumption prediction model:

[0115] $\mathrm{MAPE} = \dfrac{1}{n}\sum_{i=1}^{n}\left|\dfrac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$;

[0116] In the formula, $y_i$ is the $i$-th real fuel consumption value, $\hat{y}_i$ is the $i$-th fuel consumption prediction value output by the optimized fuel consumption prediction model, and $n$ is the number of fuel consumption samples.

[0117] Specifically, the prediction score here is the mean absolute percentage error. The larger the error, the lower the model's prediction accuracy; the smaller the error, the higher the model's prediction accuracy.
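As a short illustration, the mean absolute percentage error can be computed as below. The function name `mape` and the sample values are assumptions for the example, and real fuel consumption values are assumed to be non-zero.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the real value.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Two made-up samples, each off by 10 % of the real value.
score = mape([10.0, 20.0], [9.0, 22.0])
```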

[0118] S54. Determine whether the prediction score of the optimized fuel consumption prediction model is greater than the scoring threshold. If the prediction score is greater than the scoring threshold, iteratively optimize the parameters of the optimized fuel consumption prediction model again. If the prediction score is not greater than the scoring threshold, use the optimized fuel consumption prediction model as the trained fuel consumption prediction model.

[0119] Specifically, the scoring threshold here is 5%. When the prediction score exceeds the scoring threshold, the model's prediction accuracy is insufficient, so steps S51-S52 are repeated until the accuracy meets the requirement. When the prediction score of the optimized fuel consumption prediction model is not greater than the scoring threshold, the model's prediction accuracy meets the requirement, so the optimized fuel consumption prediction model is used as the trained fuel consumption prediction model. The test set is then input into the trained fuel consumption prediction model to obtain the predicted fuel consumption result.
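The accept-or-retry decision of S54 can be sketched as a loop. `optimize_once` is a hypothetical callable standing in for one pass of steps S51-S52 that returns a model and its prediction score in percent. The `max_rounds` cap is an assumption added so the loop always terminates; the text does not bound the number of repetitions.

```python
def train_until_accepted(optimize_once, score_threshold=5.0, max_rounds=10):
    """Repeat optimization until the score is not greater than the threshold."""
    for round_no in range(1, max_rounds + 1):
        model, score = optimize_once(round_no)
        if score <= score_threshold:
            # Accuracy is sufficient: this model becomes the trained model.
            return model, score, round_no
    raise RuntimeError("prediction score never reached the scoring threshold")


# Stub whose score improves each round (made-up values for illustration).
stub_scores = {1: 8.2, 2: 6.1, 3: 4.7}


def optimize_stub(round_no):
    return f"model-{round_no}", stub_scores[round_no]


trained_model, final_score, rounds = train_until_accepted(optimize_stub)
```

Because the score is an error measure, a lower value is better, which is why the model is accepted only when the score does not exceed the threshold.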

[0120] The vehicle fuel consumption prediction method based on vehicle CAN data provided in Embodiment 1 of this invention proceeds as follows. First, the vehicle's original CAN parameters are acquired in real time, and features are extracted from them to obtain the first input features. Next, a driver operation and driving risk feature set is constructed, and its feature variables are clustered with the K-Means unsupervised clustering analysis algorithm to obtain the second input features. The vehicle's driving parameters and road parameters are then determined from the driving latitude and longitude data, and the congestion level data of the driving segment is determined from those parameters. A preset fuel consumption prediction model is then constructed, and the first input features, the second input features, and the driving segment congestion level data are stored in a processing dataset, which is divided into a training set and a test set. Finally, the training set is input into the preset fuel consumption prediction model for training to obtain the trained fuel consumption prediction model, and the test set is input into the trained model to obtain the predicted fuel consumption result. By reading vehicle CAN data in real time and comprehensively analyzing a large volume of data, this invention ensures the real-time performance and accuracy of fuel consumption prediction, which addresses the problems of data accuracy and training requirements. At the same time, the mutual information (MI) method is used to screen the key vehicle condition features that are highly correlated with fuel consumption, which significantly reduces feature redundancy and avoids the limitation of local features. Furthermore, a driver transportation behavior profile built from real vehicle operation data is used as an independent feature, which provides macro and micro perspectives for prediction and allows actual driving environment factors to be considered more comprehensively. In addition, the road congestion index of the vehicle's driving segment is calculated from the ratio of the free-flow speed to the CAN speed, making fuel consumption prediction more accurate. Compared with traditional methods, this invention only needs the vehicle's CAN data; it requires no additional sensors or professional monitoring equipment and does not rely on a large amount of computing resources.

[0121] Embodiment 2

[0122] As shown in Figure 3, Embodiment 2 of the present invention provides a vehicle fuel consumption prediction system based on onboard CAN data. The system includes:

[0123] The first determining module 1 is used to acquire the vehicle's original CAN parameters in real time, and to extract features from the original CAN parameters to obtain the first input features. The original CAN parameters include at least vehicle speed, engine speed, throttle position, fuel consumption, brake pressure, driving latitude and longitude data, driving mileage, and key engine parameters.

[0124] The second determining module 2 is used to construct a driver operation and driving risk feature set, and to perform cluster analysis on the feature variables in the driver operation and driving risk feature set using the K-Means unsupervised clustering analysis algorithm to obtain the second input feature;

[0125] The third determining module 3 is used to determine the vehicle's driving parameters and road parameters based on the driving latitude and longitude data, and to determine the congestion level data of the driving road segment based on the driving parameters and road parameters;

[0126] The construction module 4 is used to build a preset fuel consumption prediction model, store the first input feature, the second input feature and the traffic congestion level data of the driving segment into the processing dataset, and divide the processing dataset into a training set and a test set.

[0127] The prediction module 5 is used to input the training set into the preset fuel consumption prediction model for training to obtain a trained fuel consumption prediction model, and to input the test set into the trained fuel consumption prediction model to obtain the predicted fuel consumption result.

[0128] The first determining module 1 includes:

[0129] The processing submodule is used to acquire the vehicle's raw CAN parameters in real time, and to perform smoothing filtering, interpolation and missing data processing on the raw CAN parameters in sequence to obtain the processed CAN parameters.

[0130] The mutual information determination submodule is used to calculate the mutual information $I(X;Y)$ between the fuel consumption variable $Y$ and the processed CAN parameters $X$:

[0131] $I(X;Y) = \sum_{x \in X}\sum_{y \in Y} p(x,y)\,\log\dfrac{p(x,y)}{p(x)\,p(y)}$;

[0132] In the formula, $X$ is the parameter set consisting of all processed CAN parameters, $p(x,y)$ is the joint probability distribution of parameter $x$ and parameter $y$, and $p(x)$ and $p(y)$ are the marginal probability distributions of parameter $x$ and parameter $y$, respectively;

[0133] The sorting submodule is used to sort the processed CAN parameters by mutual information from largest to smallest to obtain sorted CAN parameters, and to select the first few parameters of the sorted CAN parameters as the first input features.
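The mutual information ranking performed by these submodules can be sketched for discrete data as follows. This is an illustrative sketch under assumptions: continuous CAN signals would first need to be binned into discrete values, the natural logarithm is used, and the feature names and sample values are made up for the example.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """I(X;Y) in nats for two equally long sequences of discrete values."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum p(x,y) * log(p(x,y) / (p(x) p(y))) over observed pairs only.
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def top_k_by_mi(features, target, k):
    """Rank feature names by MI with the target, largest first, and keep k."""
    scores = {name: mutual_information(col, target) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up binned signals: engine speed tracks fuel use, brake pressure does not.
fuel = [0, 0, 1, 1]
features = {"engine_speed": [0, 0, 1, 1], "brake_pressure": [0, 1, 0, 1]}
mi_engine = mutual_information(features["engine_speed"], fuel)
mi_brake = mutual_information(features["brake_pressure"], fuel)
selected = top_k_by_mi(features, fuel, 1)
```

A feature that is independent of fuel consumption scores zero, so it sorts to the end and is dropped when only the first few parameters are kept.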

[0134] The second determining module 2 includes:

[0135] The risk feature set determination submodule is used to construct a driver operation and driving risk feature set, which includes at least the number of operating days, total operating time, average daily operating time, total stay time, average daily stay time, total operating efficiency, total mileage, average daily mileage, total night driving time, proportion of night driving time, frequency of night driving, and frequency of fatigue driving.

[0136] The clustering submodule is used to determine the number of data points in the driver operation and driving risk feature set and the number of clusters to be divided, select a set of cluster center points and minimize the distance of each data point to its nearest center point, and iterate continuously until the cluster center no longer changes or the iteration stopping condition is met.

[0137] The curve determination submodule is used to determine the number of different clusters and the corresponding sum of squared errors based on the elbow method, and to plot the K-SSE curve based on the number of different clusters and the corresponding sum of squared errors.

[0138] The K-value determination submodule is used to determine the optimal K value, at which the decline of the K-SSE curve slows sharply, and to use the feature data corresponding to the optimal K value as the second input feature.
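The clustering, curve, and K-value submodules can be illustrated with a small pure-Python sketch. Several choices here are assumptions rather than details from the text: the deterministic farthest-point initialization of the cluster centers, the use of the largest second difference of the SSE curve as the elbow criterion, and the toy two-dimensional points that stand in for the driver feature vectors.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sse(points, k, iters=100):
    """Run Lloyd's k-means and return the sum of squared errors (SSE)."""
    # Assumed initialization: first point, then repeatedly the farthest point.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # centers no longer change
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow_k(sse_by_k):
    """Pick the K where the SSE curve bends most (largest second difference)."""
    ks = sorted(sse_by_k)
    bends = {
        ks[i]: sse_by_k[ks[i - 1]] - 2 * sse_by_k[ks[i]] + sse_by_k[ks[i + 1]]
        for i in range(1, len(ks) - 1)
    }
    return max(bends, key=bends.get)


# Three well-separated toy groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
sse = {k: kmeans_sse(points, k) for k in range(1, 6)}
best_k = elbow_k(sse)
```

Plotting `sse` against K gives the K-SSE curve; on this toy data the SSE drops steeply up to K = 3 and flattens afterwards.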

[0139] The third determining module 3 includes:

[0140] The route map determination submodule is used to visualize the vehicle's driving trajectory on a map based on the driving latitude and longitude data to obtain a driving route map;

[0141] The driving parameter determination submodule is used to match the geographic information system with the driving route map to determine the driving parameters of the vehicle. The driving parameters include at least the specific road, road segment, driving time and driving distance of the vehicle.

[0142] The road parameter determination submodule is used to determine the road type information and number of lanes based on the specific road the vehicle is traveling on, so as to obtain the vehicle's road parameters.

[0143] The third determining module 3 further includes:

[0144] The free-flow speed determination submodule is used to determine a dynamic segment service level table based on the driving parameters and the road parameters, and to determine the road free-flow speed based on the dynamic segment service level table;

[0145] The road congestion index determination submodule is used to determine the road congestion index $I_c$ based on the road free-flow speed $V_f$ and the vehicle CAN speed $V_{CAN}$:

[0146] $I_c = \dfrac{V_f}{V_{CAN}}$;

[0147] The congestion level determination submodule is used to assign, based on the road congestion index, a congestion level to each road segment the vehicle travels on, so as to obtain the congestion level data of each driving segment.

[0148] The prediction module 5 includes:

[0149] The training submodule is used to input the training set into the preset fuel consumption prediction model and train the preset fuel consumption prediction model using the random forest algorithm.

[0150] The iterative submodule is used to iteratively optimize the preset fuel consumption prediction model through grid search and cross-validation until the model's mean absolute error is no greater than the performance threshold, so as to obtain the optimized fuel consumption prediction model.

[0151] The scoring determination submodule is used to calculate the prediction score $\mathrm{MAPE}$ of the optimized fuel consumption prediction model:

[0152] $\mathrm{MAPE} = \dfrac{1}{n}\sum_{i=1}^{n}\left|\dfrac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$;

[0153] In the formula, $y_i$ is the $i$-th real fuel consumption value, $\hat{y}_i$ is the $i$-th fuel consumption prediction value output by the optimized fuel consumption prediction model, and $n$ is the number of fuel consumption samples.

[0154] The judgment submodule is used to determine whether the prediction score of the optimized fuel consumption prediction model is greater than the scoring threshold. If the prediction score is greater than the scoring threshold, the parameters of the optimized fuel consumption prediction model are iteratively optimized again. If the prediction score is not greater than the scoring threshold, the optimized fuel consumption prediction model is used as the trained fuel consumption prediction model.

[0155] In other embodiments of the present invention, the present invention provides the following technical solution: a computer, including a memory 102, a processor 101, and a computer program stored in the memory 102 and executable on the processor 101, wherein the processor 101 executes the computer program to implement the vehicle fuel consumption prediction method based on vehicle CAN data as described above.

[0156] Specifically, the processor 101 may include a central processing unit (CPU), or an application specific integrated circuit (ASIC), or one or more integrated circuits that can be configured to implement the embodiments of the present invention.

[0157] The memory 102 may include a large-capacity memory for data or instructions. For example, and not limitingly, the memory 102 may include a hard disk drive (HDD), a floppy disk drive, a solid-state drive (SSD), flash memory, an optical disk drive, a magneto-optical disk drive, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, the memory 102 may include removable or non-removable (or fixed) media. Where appropriate, the memory 102 may be internal or external to a data processing device. In a particular embodiment, the memory 102 is non-volatile memory. In a particular embodiment, the memory 102 includes read-only memory (ROM) and random access memory (RAM). Where appropriate, the ROM may be a mask-programmed ROM, a programmable read-only memory (PROM), an erasable read-only memory (EPROM), an electrically erasable read-only memory (EEPROM), an electrically alterable read-only memory (EAROM), or flash memory, or a combination of two or more of these. Where appropriate, the RAM can be Static Random-Access Memory (SRAM) or Dynamic Random-Access Memory (DRAM). DRAM can be Fast Page Mode Dynamic Random Access Memory (FPMDRAM), Extended Data Out Dynamic Random Access Memory (EDODRAM), Synchronous Dynamic Random-Access Memory (SDRAM), etc.

[0158] The memory 102 can be used to store or cache various data files that need to be processed and / or used for communication, as well as possible computer program instructions executed by the processor 101.

[0159] The processor 101 reads and executes the computer program instructions stored in the memory 102 to implement the above-mentioned vehicle fuel consumption prediction method based on vehicle CAN data.

[0160] In some embodiments, the computer may further include a communication interface 103 and a bus 100. As shown in Figure 4, the processor 101, the memory 102, and the communication interface 103 are connected through the bus 100 and communicate with each other.

[0161] The communication interface 103 is used to enable communication between the various modules, devices, units, and / or equipment in the embodiments of the present invention. The communication interface 103 can also enable data communication with other components such as external devices, image / data acquisition devices, databases, external storage, and image / data processing workstations.

[0162] Bus 100 includes hardware, software, or both, that couples the components of a computer device together. Bus 100 includes, but is not limited to, at least one of the following: data bus, address bus, control bus, expansion bus, and local bus. For example, and not as a limitation, bus 100 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Extended Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable buses, or a combination of two or more of these. Where appropriate, bus 100 may include one or more buses. Although specific buses are described and illustrated in the embodiments of the present invention, the present invention contemplates any suitable bus or interconnect.

[0163] Once the vehicle fuel consumption prediction system has acquired the vehicle CAN data, the computer can execute the vehicle fuel consumption prediction method based on vehicle CAN data of the present invention, thereby realizing the prediction of vehicle fuel consumption.

[0164] In some further embodiments of the present invention, in conjunction with the above-described vehicle fuel consumption prediction method based on vehicle CAN data, the present invention provides the following technical solution: a storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the above-described vehicle fuel consumption prediction method based on vehicle CAN data.

[0165] Those skilled in the art will understand that the logic and / or steps represented in the flowcharts or otherwise described herein can, for example, be considered an ordered list of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a processor-containing system, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, "computer-readable medium" can mean any means that can contain, store, communicate, propagate, or transmit programs for use by, or in conjunction with, an instruction execution system, apparatus, or device.

[0166] More specific examples of computer-readable media (a non-exhaustive list) include: electrical connections (electronic devices) with one or more wires, portable computer disk drives (magnetic devices), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM). Furthermore, the computer-readable medium can even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it as necessary, and then stored in computer memory.

[0167] It should be understood that various parts of the present invention can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, it can be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logical functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.

[0168] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0169] The embodiments described above are merely illustrative of several implementations of the present invention, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the invention patent. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this invention patent should be determined by the appended claims.

Claims

1. A vehicle fuel consumption prediction method based on vehicle CAN data, characterized in that it includes:

acquiring the vehicle's raw CAN parameters in real time, and extracting features from the raw CAN parameters to obtain first input features, wherein the raw CAN parameters include at least vehicle speed, engine speed, throttle position, fuel consumption, brake pressure, driving latitude and longitude data, driving mileage, and key engine parameters;

constructing a driver operation and driving risk feature set, and performing cluster analysis on the feature variables in the driver operation and driving risk feature set using the K-Means unsupervised clustering analysis algorithm to obtain a second input feature;

determining the driving parameters and road parameters of the vehicle based on the driving latitude and longitude data, and determining the congestion level data of the driving segment based on the driving parameters and the road parameters;

constructing a preset fuel consumption prediction model, storing the first input features, the second input feature and the congestion level data of the driving segment in a processing dataset, and dividing the processing dataset into a training set and a test set;

inputting the training set into the preset fuel consumption prediction model for training to obtain a trained fuel consumption prediction model, and inputting the test set into the trained fuel consumption prediction model to obtain the predicted fuel consumption result;

wherein the step of acquiring the vehicle's raw CAN parameters in real time and extracting features from the raw CAN parameters to obtain the first input features includes:

acquiring the vehicle's raw CAN parameters in real time, and sequentially performing smoothing filtering, interpolation and missing data processing on the raw CAN parameters to obtain processed CAN parameters;

calculating the mutual information $I(X;Y)$ between the fuel consumption variable $Y$ and the processed CAN parameters $X$: $I(X;Y) = \sum_{x \in X}\sum_{y \in Y} p(x,y)\,\log\dfrac{p(x,y)}{p(x)\,p(y)}$; in the formula, $X$ is the parameter set consisting of all processed CAN parameters, $p(x,y)$ is the joint probability distribution of parameter $x$ and parameter $y$, and $p(x)$ and $p(y)$ are the marginal probability distributions of parameter $x$ and parameter $y$, respectively;

sorting the processed CAN parameters by mutual information from largest to smallest to obtain sorted CAN parameters, and selecting the first few parameters of the sorted CAN parameters as the first input features;

wherein the step of constructing the driver operation and driving risk feature set and performing cluster analysis on its feature variables using the K-Means unsupervised clustering analysis algorithm to obtain the second input feature includes:

constructing a driver operation and driving risk feature set, which includes at least the number of operating days, total operating time, average daily operating time, total dwell time, average daily dwell time, total operating efficiency, total mileage, average daily mileage, total night driving time, proportion of night driving time, frequency of night driving, and frequency of fatigue driving;

determining the number of data points in the driver operation and driving risk feature set and the number of clusters to be divided, selecting a set of cluster center points, minimizing the distance from each data point to its nearest center point, and iterating continuously until the cluster centers no longer change or the iteration stopping condition is met;

determining, based on the elbow method, different numbers of clusters and the corresponding sums of squared errors, and plotting a K-SSE curve from the different numbers of clusters and the corresponding sums of squared errors;

determining the optimal K value, at which the decline of the K-SSE curve slows sharply, and using the feature data corresponding to the optimal K value as the second input feature;

wherein the step of determining the congestion level data of the driving segment based on the driving parameters and the road parameters includes:

determining a dynamic segment service level table based on the driving parameters and the road parameters, and determining the road free-flow speed based on the dynamic segment service level table;

determining the road congestion index $I_c$ based on the road free-flow speed $V_f$ and the vehicle CAN speed $V_{CAN}$: $I_c = \dfrac{V_f}{V_{CAN}}$;

assigning, based on the road congestion index, a congestion level to each road segment the vehicle travels on, so as to obtain the congestion level data of each driving segment;

wherein the step of inputting the training set into the preset fuel consumption prediction model for training to obtain the trained fuel consumption prediction model includes:

inputting the training set into the preset fuel consumption prediction model, and training the preset fuel consumption prediction model using the random forest algorithm;

iteratively optimizing the preset fuel consumption prediction model through grid search and cross-validation until the model's mean absolute error is no greater than the performance threshold, so as to obtain an optimized fuel consumption prediction model;

calculating the prediction score $\mathrm{MAPE}$ of the optimized fuel consumption prediction model: $\mathrm{MAPE} = \dfrac{1}{n}\sum_{i=1}^{n}\left|\dfrac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$; in the formula, $y_i$ is the $i$-th real fuel consumption value, $\hat{y}_i$ is the $i$-th fuel consumption prediction value output by the optimized fuel consumption prediction model, and $n$ is the number of fuel consumption samples;

determining whether the prediction score of the optimized fuel consumption prediction model is greater than the scoring threshold;

if the prediction score is greater than the scoring threshold, iteratively optimizing the parameters of the optimized fuel consumption prediction model again; and if the prediction score is not greater than the scoring threshold, using the optimized fuel consumption prediction model as the trained fuel consumption prediction model.

2. The vehicle fuel consumption prediction method based on vehicle CAN data according to claim 1, characterized in that the step of determining the vehicle's driving parameters and road parameters based on the driving latitude and longitude data includes:

visualizing the vehicle's driving trajectory on a map based on the driving latitude and longitude data to obtain a driving route map;

matching the geographic information system with the driving route map to determine the driving parameters of the vehicle, wherein the driving parameters include at least the specific road, road segment, driving time and driving distance;

determining the road type information and number of lanes based on the specific road on which the vehicle travels, so as to obtain the vehicle's road parameters.

3. A vehicle fuel consumption prediction system based on vehicle CAN data, the system employing the vehicle fuel consumption prediction method based on vehicle CAN data as claimed in claim 1, characterized in that the system includes:

a first determining module, used to acquire the vehicle's raw CAN parameters in real time, and to extract features from the raw CAN parameters to obtain the first input features, wherein the raw CAN parameters include at least vehicle speed, engine speed, throttle position, fuel consumption, brake pressure, driving latitude and longitude data, driving mileage, and key engine parameters;

a second determining module, used to construct a driver operation and driving risk feature set, and to perform cluster analysis on the feature variables in the driver operation and driving risk feature set using the K-Means unsupervised clustering analysis algorithm to obtain the second input feature;

a third determining module, used to determine the vehicle's driving parameters and road parameters based on the driving latitude and longitude data, and to determine the congestion level data of the driving segment based on the driving parameters and the road parameters;

a construction module, used to build a preset fuel consumption prediction model, store the first input features, the second input feature and the congestion level data of the driving segment in a processing dataset, and divide the processing dataset into a training set and a test set;

a prediction module, used to input the training set into the preset fuel consumption prediction model for training to obtain a trained fuel consumption prediction model, and to input the test set into the trained fuel consumption prediction model to obtain the predicted fuel consumption result.

4. The vehicle fuel consumption prediction system based on vehicle CAN data according to claim 3, characterized in that the first determining module includes:

a processing submodule, used to acquire the vehicle's raw CAN parameters in real time, and to sequentially perform smoothing filtering, interpolation and missing data processing on the raw CAN parameters to obtain processed CAN parameters;

a mutual information determination submodule, used to calculate the mutual information $I(X;Y)$ between the fuel consumption variable $Y$ and the processed CAN parameters $X$: $I(X;Y) = \sum_{x \in X}\sum_{y \in Y} p(x,y)\,\log\dfrac{p(x,y)}{p(x)\,p(y)}$; in the formula, $X$ is the parameter set consisting of all processed CAN parameters, $p(x,y)$ is the joint probability distribution of parameter $x$ and parameter $y$, and $p(x)$ and $p(y)$ are the marginal probability distributions of parameter $x$ and parameter $y$, respectively;

a sorting submodule, used to sort the processed CAN parameters by mutual information from largest to smallest to obtain sorted CAN parameters, and to select the first few parameters of the sorted CAN parameters as the first input features.

5. A computer comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the vehicle fuel consumption prediction method based on vehicle CAN data as described in any one of claims 1 to 3.

6. A storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the vehicle fuel consumption prediction method based on vehicle CAN data as described in any one of claims 1 to 3.
