Method, device and equipment for optimizing vehicle service

By introducing a multimodal fusion model and a spatiotemporal large model, combined with a multi-agent deep deterministic policy gradient algorithm, the problem of lagging vehicle data value assessment was solved, and dynamic optimization and collaborative decision-making of vehicle service strategies were realized, improving response accuracy and adaptability.

CN122196940A · Pending · Publication Date: 2026-06-12 · AVATR CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
AVATR CO LTD
Filing Date
2026-05-18
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In existing technologies, the lag in the value assessment of vehicle data leads to inaccurate service strategies, which fail to effectively adapt to the collaborative optimization of the interests of multiple parties. Furthermore, the lack of local intelligent processing capabilities on the vehicle side results in data processing delays and bandwidth waste.

Method used

By introducing a multimodal fusion model with gated attention and a spatiotemporal large model, combined with a multi-agent deep deterministic policy gradient algorithm, data weights are dynamically adjusted and fused to predict future value streams and generate service strategies adapted to multi-vehicle collaborative scenarios.

Benefits of technology

It improves the accuracy and adaptability of service strategies to dynamic traffic environments and vehicle physical characteristics, realizes end-to-end value transformation from data assets to decision output, and enhances the synergy and forward-looking decision-making capabilities of in-vehicle services.


Figure CN122196940A_ABST

Abstract

The embodiments of the application relate to the technical field of vehicles and disclose a method, device and equipment for optimizing vehicle service. The method comprises: a vehicle acquires multimodal data; the vehicle processes the multimodal data using a multimodal fusion model based on gated attention to obtain a data asset to be processed that represents an embodied perception feature vector; a server, in response to a preset trigger condition, acquires the data asset to be processed; the server inputs the data asset to be processed into a spatiotemporal large model to obtain target value stream data of the data asset to be processed within a future preset time period; the server then determines, according to the target value stream data and using a multi-agent deep deterministic policy gradient algorithm, the service policy information applied to the vehicle. The technical solution of the application can solve the technical problem in the prior art that service policies for vehicles are inaccurate.

Description

Technical Field

[0001] This invention relates to the field of vehicle technology, specifically to a method, apparatus, and device for optimizing vehicle services.

Background Technology

[0002] As a core node in intelligent connected transportation systems, new energy vehicles continuously generate massive amounts of multimodal data during their operation. This data is of significant value in scenarios such as vehicle-to-everything (V2X) communication, charging network scheduling, personalized user services, and fault prediction and maintenance.

[0003] Current technology treats vehicle data as a fixed attribute for classification and management, assesses its value on that basis, and then generates service strategies for the vehicle. However, this value assessment lags behind the actual state of the data, which can lead to inaccurate service strategies.

Summary of the Invention

[0004] In view of the above problems, embodiments of the present invention provide a method, apparatus and equipment for optimizing vehicle services, which are used to solve the technical problem of inaccurate service strategies for vehicles in the prior art.

[0005] According to one aspect of the present invention, a method for optimizing vehicle services is provided, applied to a server, the method comprising:

[0006] In response to a preset trigger condition, a data asset to be processed representing an embodied perception feature vector is acquired. The data asset to be processed is obtained by the vehicle processing multimodal data using a multimodal fusion model based on gated attention. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0007] The data asset to be processed is input into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a future preset time period. The spatiotemporal large model is trained based on the physical constraints of the vehicle, historical data assets and labeled value stream data.

[0008] Based on the target value stream data, the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is used to determine the service policy information applied to the vehicle.

[0009] According to another aspect of the present invention, a method for optimizing vehicle services is provided, applied to a vehicle, the method comprising:

[0010] Acquire multimodal data, including: Controller Area Network (CAN) bus data, sensor data, and interaction data;

[0011] The multimodal fusion model based on gated attention processes the multimodal data to obtain data assets to be processed that represent embodied perception feature vectors. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0012] The data asset to be processed is uploaded to the server. The server inputs the data asset to be processed into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a future preset time period. The target value stream data is used to determine the service strategy information applied to the vehicle. The spatiotemporal large model is trained based on the physical constraints of the vehicle, historical data assets and labeled value stream data.

[0013] According to another aspect of the present invention, a vehicle service optimization apparatus is provided, applied to a server, the apparatus comprising:

[0014] The first acquisition module is used to acquire, in response to a preset trigger condition, a data asset to be processed that represents the embodied perception feature vector. The data asset to be processed is obtained by the vehicle processing multimodal data using a multimodal fusion model based on gated attention. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0015] The first processing module is used to input the data asset to be processed into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a future preset time period. The spatiotemporal large model is trained based on the physical constraints of the vehicle, historical data assets and labeled value stream data.

[0016] The determination module is used to determine the service strategy information applied to the vehicle based on the target value stream data and using a multi-agent deep deterministic policy gradient algorithm.

[0017] According to another aspect of the present invention, a vehicle service optimization apparatus is provided, applied to a vehicle, the apparatus comprising:

[0018] The second acquisition module is used to acquire multimodal data, which includes: CAN bus data, sensor data, and interaction data;

[0019] The second processing module is used to process the multimodal data based on a gated attention multimodal fusion model to obtain a data asset to be processed that represents an embodied perception feature vector. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0020] The sending module is used to upload the data asset to be processed to the server. The server inputs the data asset to be processed into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a future preset time period. The target value stream data is used to determine the service strategy information applied to the vehicle. The spatiotemporal large model is trained based on the physical constraints of the vehicle, historical data assets and labeled value stream data.

[0021] According to another aspect of the present invention, an electronic device is provided, including: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other through the communication bus;

[0022] The memory is used to store at least one executable instruction that causes the processor to perform operations such as the vehicle service optimization method described above.

[0023] According to another aspect of the present invention, a computer-readable storage medium is provided, the storage medium storing at least one executable instruction that causes an electronic device / vehicle service optimization apparatus to perform operations such as the vehicle service optimization method described above.

[0024] According to another aspect of the present invention, a computer program product is provided, including a computer program that, when executed by a processor, causes a vehicle service optimization device / electronic device to perform the operation of the above-described vehicle service optimization method.

[0025] In this embodiment of the invention, the vehicle acquires multimodal data, including CAN bus data, sensor data, and interaction data. The vehicle then processes the multimodal data using a gated attention-based multimodal fusion model to obtain a data asset to be processed that represents an embodied perception feature vector. This multimodal fusion model dynamically adjusts the weight ratios of different modalities and fuses the adjusted data. The server, in response to a preset trigger condition, acquires the data asset to be processed. The server inputs the data asset into a spatiotemporal large model to obtain target value stream data for the data asset within a future preset time period. The spatiotemporal large model is trained based on the vehicle's physical constraints, historical data assets, and labeled value stream data. Then, based on the target value stream data, the server uses a multi-agent deep deterministic policy gradient algorithm to determine the service policy information applied to the vehicle. In this technical solution, the server acquires the data asset to be processed, which is derived from multimodal data collected by the vehicle, and inputs it into a spatiotemporal large model trained by integrating vehicle physical constraints, historical data assets, and labeled value stream data. This model can predict the value stream of the data asset in future time periods, proactively capture the dynamic changes of data assets, and model high-dimensional spatiotemporal correlations. The multi-agent deep deterministic policy gradient algorithm is then applied to the predicted value stream data to generate service policy information adapted to multi-vehicle collaborative scenarios. This not only improves the accuracy and adaptability of service policies in response to dynamic traffic environments and vehicle physical characteristics, but also realizes end-to-end value transformation from data assets to decision output. Thus, while satisfying safety and efficiency constraints, it enhances the collaborative and forward-looking decision-making capabilities of in-vehicle services.

[0026] The above description is merely an overview of the technical solutions of the embodiments of the present invention. In order to better understand the technical means of the embodiments of the present invention and to implement them in accordance with the contents of the specification, and to make the above and other objects, features and advantages of the embodiments of the present invention more apparent and understandable, specific embodiments of the present invention are described below.

Description of the Drawings

[0027] The accompanying drawings are for illustrative purposes only and are not intended to limit the invention. Furthermore, the same reference numerals denote the same parts throughout the drawings. In the drawings:

[0028] Figure 1 shows a flowchart of a first embodiment of the vehicle service optimization method provided by the present invention;

[0029] Figure 2 shows a flowchart of a second embodiment of the vehicle service optimization method provided by the present invention;

[0030] Figure 3 shows a timing diagram of the multi-agent game provided by the present invention;

[0031] Figure 4 shows a flowchart of a third embodiment of the vehicle service optimization method provided by the present invention;

[0032] Figure 5 shows a flowchart of a fourth embodiment of the vehicle service optimization method provided by the present invention;

[0033] Figure 6 shows an overall flowchart of an example of the vehicle service optimization method provided by the present invention;

[0034] Figure 7 shows an overall flowchart of an example of the vehicle side provided by the present invention;

[0035] Figure 8 shows an overall flowchart of an example of server-side value stream data determination provided by the present invention;

[0036] Figure 9 shows a schematic diagram of the structure of a first embodiment of the vehicle service optimization device provided by the present invention;

[0037] Figure 10 shows a schematic diagram of the structure of a second embodiment of the vehicle service optimization device provided by the present invention;

[0038] Figure 11 shows a schematic diagram of an embodiment of the electronic device provided by the present invention.

Detailed Implementation

[0039] Exemplary embodiments of the invention will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be implemented in various forms and should not be limited to the embodiments set forth herein.

[0040] As a core node in intelligent connected transportation systems, new energy vehicles continuously generate massive amounts of multimodal data during their operation, including: vehicle dynamics (e.g., battery state of charge (SOC), motor speed), user behavior data (e.g., driving habits, charging preferences), environmental perception data (e.g., radar point clouds, camera images), and service interaction data (e.g., navigation routes, Over-the-Air (OTA) upgrade records). This data is of significant value in scenarios such as vehicle-to-everything (V2X), charging network scheduling, personalized user services (e.g., intelligent navigation, energy management), and fault prediction and maintenance.

[0041] For example, in charging scenarios, vehicles need to dynamically adjust charging strategies based on real-time grid load and user trip planning; in safety scenarios, collision warnings are achieved through multi-sensor fusion; and in user service scenarios, customized value-added services need to be provided based on driving behavior data.

[0042] Currently, the main methods for managing the data assets of new energy vehicles to assist vehicle operation, and their shortcomings, are as follows:

[0043] 1) Treating vehicle data as a fixed attribute for classification and management, and evaluating its value through preset rules (such as data type and collection frequency), ignores the dynamic evolution characteristics of data in the spatiotemporal dimension, resulting in a lag in value assessment and difficulty in adapting to real-time scenario requirements.

[0044] 2) Service strategies are usually formulated based on the platform's unilateral goals (e.g., maximizing the utilization of charging stations), without considering the coordinated optimization of multiple interests such as users' personalized needs (e.g., the lowest charging cost), vehicle manufacturers' operational efficiency (e.g., battery health maintenance) and grid dispatch, which can easily lead to an imbalance between local optima and global benefits.

[0045] 3) Traditional deep learning models (such as Long Short-Term Memory (LSTM) and Transformer) are used for data prediction, but the physical laws of the vehicle (such as battery electrochemical characteristics and kinetic equations) are not embedded, which leads to deviations between the prediction results and the actual physical behavior. In addition, the model has poor interpretability and is difficult to support high-risk decisions (such as fault warning).

[0046] 4) The vehicle-mounted device only acts as a data collector and lacks local intelligent processing capabilities. It needs to upload a large amount of raw data to the cloud, resulting in wasted bandwidth and processing delays. Furthermore, high-value information (such as emergency security incidents) cannot be extracted and responded to in a timely manner at the source.

[0047] Based on the aforementioned technical problems, the technical concept of this invention is as follows. A spatiotemporal large model integrating vehicle physical constraints is introduced. Through joint training on historical data assets and labeled value stream data, it acquires the ability to model how data value evolves in the spatiotemporal dimension, thereby transforming static value assessment into dynamic prediction of future value streams. Once the dynamic value distribution of data assets in future periods has been predicted, the formulation of service strategies is recognized as essentially a sequential decision-making problem in a multi-objective, multi-agent, dynamic environment. Current rule-based or single-objective optimization struggles to capture the coupling of multiple interests and the need for real-time response. Therefore, a multi-agent deep deterministic policy gradient algorithm is adopted for its ability to perform collaborative policy optimization in complex dynamic environments. The predicted value stream is used directly as the basis for decision optimization, so that the dynamic characteristics of data value are integrated at the policy generation stage. Service strategies therefore no longer depend on lagging static assessment but on forward-looking modeling of future value streams and multi-objective collaborative optimization, which solves the technical problem of inaccurate policies caused by lagging value assessment.

[0048] Based on the above technical concept, the technical solution of the present invention is described in detail below through specific embodiments. The executing entity of the method of the present invention is a server corresponding to the vehicle. It should be noted that the following specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.

[0049] Figure 1 shows a flowchart of a first embodiment of the vehicle service optimization method provided by the present invention, the method being executed by a server. As shown in Figure 1, the method includes the following steps:

[0050] Step 11: The vehicle acquires multimodal data;

[0051] The multimodal data includes: CAN bus data, sensor data, and interaction data;

[0052] In this step, the vehicle synchronously acquires raw multimodal data through its data acquisition modules, such as: CAN bus data (e.g., vehicle speed, battery SOC, motor torque), environmental perception sensor data (e.g., camera images, millimeter-wave radar point clouds), and human-machine interaction data (e.g., navigation destination settings, multimedia volume adjustment).

[0053] Step 12: The vehicle processes the multimodal data using a gated attention-based multimodal fusion model to obtain the data asset to be processed, which represents the embodied perception feature vector.

[0054] The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0055] In this step, the multimodal fusion model dynamically learns the importance weights of different modal data in the current driving scenario through a gating mechanism. For example, in scenarios with drastic changes in lighting when entering or exiting a tunnel, it automatically increases the fusion weight of radar data and decreases the weight of visual data.

[0056] Furthermore, after acquiring multimodal data, it can be input into a multimodal fusion model to obtain a more accurate and objective embodied perception feature vector that can represent the current vehicle scenario, i.e., the data asset to be processed.
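The gating idea described in Step 12 can be illustrated with a minimal sketch. It assumes each modality has already been encoded into a feature vector of equal length, and that the gate scores (which a trained gating network would produce) are supplied by hand. The names `gated_fusion`, `features` and `tunnel_scores` are hypothetical and are not taken from the application.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(modal_features, gate_scores):
    """Fuse per-modality feature vectors using softmax-normalized gate weights.

    modal_features: dict of modality name -> feature vector (equal lengths).
    gate_scores: dict of modality name -> scalar relevance score. In a trained
        model these would come from a learned gating network; here they are
        hand-set for illustration (assumption).
    """
    names = sorted(modal_features)
    weights = softmax([gate_scores[name] for name in names])
    dim = len(modal_features[names[0]])
    fused = [0.0] * dim
    for name, weight in zip(names, weights):
        for i, value in enumerate(modal_features[name]):
            fused[i] += weight * value
    return fused, dict(zip(names, weights))


# Hypothetical encoded features for the three data sources named in the text.
features = {
    "can_bus": [0.6, 0.1, 0.3],
    "camera": [0.9, 0.8, 0.2],
    "radar": [0.2, 0.4, 0.7],
}
# Tunnel entrance: lighting changes sharply, so the camera score is lowered
# and the radar score is raised.
tunnel_scores = {"can_bus": 0.0, "camera": -2.0, "radar": 2.0}
fused_vector, fusion_weights = gated_fusion(features, tunnel_scores)
```

With these scores the radar modality receives most of the weight, which mirrors the tunnel example above; the fused vector plays the role of the embodied perception feature vector.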

[0057] Step 13: The server responds to the preset trigger conditions and obtains the data assets to be processed;

[0058] Optionally, the preset trigger condition can be one or more of the following conditions to trigger execution:

[0059] 1) Event trigger: When the vehicle reports high-value emergency data (e.g., signs of battery thermal runaway, collision risk warning), or when the user actively initiates a service request (e.g., to find the best charging station), or when the vehicle completes the trip planning settings;

[0060] 2) Periodic triggering: Perform data asset value stream deduction and service strategy update for vehicles at preset time intervals (e.g., every 5 minutes);

[0061] 3) Triggering based on target value stream data: for example, if the target value stream data indicates that the battery will be depleted within the next few minutes, a vehicle-charging objective is triggered.
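The three trigger types listed above could be checked roughly as follows. This is only a sketch: the field names, the `LOW_BATTERY_MINUTES` threshold and the function `fired_triggers` are assumptions made for illustration, while the 5-minute period comes from the example in the text.

```python
from dataclasses import dataclass


@dataclass
class TriggerInputs:
    emergency_reported: bool           # vehicle reported high-value emergency data
    user_request: bool                 # user actively initiated a service request
    trip_plan_set: bool                # vehicle completed its trip-planning settings
    seconds_since_last_run: float      # time since the last periodic update
    predicted_minutes_to_empty: float  # read from target value stream data


PERIOD_SECONDS = 300.0        # "every 5 minutes" in the text
LOW_BATTERY_MINUTES = 10.0    # hypothetical threshold, not from the application


def fired_triggers(inputs):
    """Return the names of all trigger conditions that currently hold."""
    fired = []
    if inputs.emergency_reported or inputs.user_request or inputs.trip_plan_set:
        fired.append("event")
    if inputs.seconds_since_last_run >= PERIOD_SECONDS:
        fired.append("periodic")
    if inputs.predicted_minutes_to_empty <= LOW_BATTERY_MINUTES:
        fired.append("value_stream")
    return fired


example = TriggerInputs(
    emergency_reported=False,
    user_request=True,
    trip_plan_set=False,
    seconds_since_last_run=120.0,
    predicted_minutes_to_empty=45.0,
)
print(fired_triggers(example))  # ['event']
```

Because the conditions are not mutually exclusive, the function returns a list; the server would run Step 13 whenever the list is non-empty.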

[0062] Optionally, the multimodal data includes: CAN bus data, sensor data, and interaction data.

[0063] In this implementation, CAN bus data can be vehicle speed, battery SOC, motor torque, etc.; environmental perception sensor data can be camera images, millimeter-wave radar point clouds, etc.; and human-machine interaction data can be navigation destination settings, multimedia volume adjustment, etc.

[0064] Furthermore, on the vehicle side, the vehicle may upload the data asset to be processed to the server according to a preset value dimension.

[0065] The preset value dimension is determined based on safety relevance, economic relevance, and timeliness relevance.

[0066] In this implementation, the vehicle first determines the method of uploading the data assets to be processed to the server based on dimensions such as safety relevance, economic relevance, and timeliness relevance, so as to realize the value-added and intelligent scheduling of data at the source of the vehicle.

[0067] Among them, safety relevance can measure the direct impact of data assets on vehicle safety (e.g., collision warning signals, signs of battery thermal runaway); economic relevance can measure the potential contribution of data assets to user costs or platform revenue (e.g., charging strategy optimization, grid load balancing); and timeliness relevance can measure the time sensitivity of data assets (e.g., instantaneous response needs, long-term maintenance planning).

[0068] Optionally, one possible implementation for vehicles to upload data assets to the server based on preset value dimensions is:

[0069] Step 1: Based on the preset value dimensions, perform value assessment on the data assets to be processed to obtain the assessment results;

[0070] In this implementation, the vehicle first scores the output data assets to be processed in real time based on dimensions such as safety relevance, economic relevance, and timeliness, and obtains the scoring results.

[0071] Step 2, first possible implementation: If the evaluation result indicates instantaneous-response data involving safety, the local engine in the vehicle makes a decision on the data asset to be processed, and the decision result is uploaded to the server as the data asset to be processed;

[0072] In this implementation, if the evaluation result shows that an item in the data asset to be processed is instantaneous-response data involving safety (e.g., a warning of an impending collision), the highest-priority path is triggered: the vehicle's local decision engine is called directly to execute the preset safety policy (e.g., automatic emergency braking, AEB), and at the same time the complete log of this decision is immediately uploaded to the server as the data asset to be processed.

[0073] Step 2, second possible implementation: If the evaluation result indicates high-value emergency data, the data asset to be processed is uploaded to the server;

[0074] In this implementation, if the evaluation results show high-value emergency data (e.g., signs of battery thermal runaway), the system bypasses the vehicle's local decision-making and directly uploads the data to the server via a high-priority channel, requesting further analysis and intervention from the cloud, i.e., executing the corresponding method on the server.

[0075] Step 2, third possible implementation: If the evaluation result indicates routine data, the data asset to be processed is uploaded to the server according to the state of the communication network between the server and the vehicle.

[0076] In this implementation, if the evaluation results show that the data is routine (e.g., daily driving habit data), the data is temporarily stored in a local cache queue based on the current network signal strength and server load, and then uploaded in batches when the network is idle.
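The three upload paths of step 2 can be sketched as a small routing function. The scores are assumed to lie in the range 0 to 1, and the `HIGH` threshold, the mapping from scores to paths, and all names (`Route`, `ValueScores`, `choose_route`, `should_upload_now`) are hypothetical; the application does not specify how the evaluation result is computed.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LOCAL_DECISION = "decide locally, then upload the decision log"
    PRIORITY_UPLOAD = "upload immediately on a high-priority channel"
    BATCH_UPLOAD = "cache locally, upload in batches when the network is idle"


@dataclass
class ValueScores:
    safety: float      # safety relevance, assumed in [0, 1]
    economic: float    # economic relevance, assumed in [0, 1]
    timeliness: float  # timeliness relevance, assumed in [0, 1]


HIGH = 0.8  # hypothetical threshold, not from the application


def choose_route(scores):
    """Map a value assessment onto one of the three upload paths."""
    if scores.safety >= HIGH and scores.timeliness >= HIGH:
        # Instantaneous-response safety data, e.g. an impending collision.
        return Route.LOCAL_DECISION
    if scores.safety >= HIGH or scores.economic >= HIGH:
        # High-value emergency data, e.g. signs of battery thermal runaway.
        return Route.PRIORITY_UPLOAD
    # Routine data, e.g. daily driving habits.
    return Route.BATCH_UPLOAD


def should_upload_now(route, network_idle):
    """Routine data waits for an idle network; other paths upload at once."""
    return route is not Route.BATCH_UPLOAD or network_idle


collision_warning = ValueScores(safety=0.95, economic=0.1, timeliness=0.95)
thermal_runaway_signs = ValueScores(safety=0.9, economic=0.3, timeliness=0.5)
driving_habits = ValueScores(safety=0.1, economic=0.3, timeliness=0.1)
```

In this sketch only the routine path consults the network state, matching the third implementation; a real system would also account for server load, as noted in the text.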

[0077] Step 14: The server inputs the data asset to be processed into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a future preset time period.

[0078] The spatiotemporal large model is trained based on the physical constraints of the vehicle, historical data assets, and labeled value stream data;

[0079] Optionally, before step 14, the server may also perform:

[0080] Step 1: Obtain historical data assets and the labeled value stream data corresponding to the historical data assets;

[0081] In this implementation, the training phase of the spatiotemporal large model can retrieve a large amount of historical data from the data asset operation layer, including: embodied perception feature vectors uploaded by historical vehicles (i.e., historical data assets), and corresponding labeled value stream data (such as the remaining range value stream data, the charging economic value stream data, or the actual parts maintenance value stream).

[0082] The methods for obtaining labeled value stream data include: manual labeling, rule-based labeling, or inference from actual revenue.

[0083] Step 2: Based on physical constraints, historical data assets, and labeled value stream data, train a multi-scale spatiotemporal attention network to obtain the spatiotemporal large model.

[0084] In this implementation, a multi-scale spatiotemporal attention network is constructed, and a loss function corresponding to physical constraints is introduced during training. For example, if the acceleration sequence predicted by the multi-scale spatiotemporal attention network does not satisfy the vehicle dynamics equations, a penalty is imposed. Furthermore, through supervised learning on massive historical data assets and the labeled value stream data corresponding to those historical data assets, the multi-scale spatiotemporal attention network gradually learns the complex mapping relationship between data features and dynamic value, ultimately converging into a large spatiotemporal model capable of accurately predicting future value streams.

[0085] The multi-scale spatiotemporal attention network processes patterns at different time scales through parallel attention heads: short-term heads capture instantaneous risks (e.g., sudden braking ahead), medium-term heads are used for trip planning (e.g., range estimation), and long-term heads identify slowly changing trends (e.g., component wear). Simultaneously, physical constraints (e.g., vehicle dynamics equations) are embedded as constraints in the loss function during training; for example, a penalty is imposed if the predicted acceleration sequence does not satisfy F=ma.
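One way to picture the physics-constrained loss is a supervised error term plus a penalty on the residual of F = ma. The sketch below assumes the network outputs both a value stream prediction and an acceleration sequence, and that the net force and vehicle mass are known; the function name `physics_informed_loss` and the `penalty_weight` parameter are hypothetical, and the application does not state the exact form of the penalty.

```python
def physics_informed_loss(pred_value, label_value, pred_accel, net_force,
                          mass, penalty_weight=0.1):
    """Supervised MSE on the value stream plus a penalty for violating F = m*a.

    pred_value / label_value: predicted and labeled value stream samples.
    pred_accel: predicted acceleration sequence (m/s^2).
    net_force: net force acting on the vehicle at each step (N).
    mass: vehicle mass (kg).
    penalty_weight: hypothetical trade-off coefficient (assumption).
    """
    supervised = sum((p - y) ** 2 for p, y in zip(pred_value, label_value))
    supervised /= len(pred_value)
    # Residual of Newton's second law; zero when the prediction is consistent.
    residuals = [f - mass * a for f, a in zip(net_force, pred_accel)]
    physics = sum(r * r for r in residuals) / len(residuals)
    return supervised + penalty_weight * physics


labels = [0.2, 0.5]
forces = [1500.0, 3000.0]  # N
vehicle_mass = 1500.0      # kg

# Accelerations consistent with F = m*a add no penalty.
consistent = physics_informed_loss(labels, labels, [1.0, 2.0], forces, vehicle_mass)
# Accelerations that violate F = m*a are penalized.
violating = physics_informed_loss(labels, labels, [1.0, 3.0], forces, vehicle_mass)
```

A soft penalty of this kind nudges the network toward physically plausible outputs during training without guaranteeing that every prediction satisfies the constraint exactly.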

[0086] Step 15: Based on the target value stream data, the server uses a multi-agent deep deterministic policy gradient algorithm to determine the service policy information applied to the vehicle.

[0087] In this step, after the spatiotemporal large model derives the target value stream data, a multi-agent deep deterministic policy gradient algorithm is used based on the target value stream data to perform game optimization, so as to obtain a Pareto optimal equilibrium policy that can be accepted by each agent in the multi-agent group. This policy is recorded as the service policy information applied to the vehicle and then sent to the vehicle for execution.

[0088] The execution of the multi-agent deep deterministic policy gradient algorithm can be as follows: The target value stream data is input into the service game optimization layer, which deploys a game optimization center based on the MADDPG algorithm. The MADDPG algorithm adopts a framework of centralized training and distributed execution. During training, it uses global information (including the state of each agent and shared target value stream data) to coordinate and optimize the policy network of each agent. During execution, each agent (e.g., representing a vehicle, user, or platform) independently generates actions based on its own local observation information (e.g., vehicle remaining battery power, user preferences, grid load) and shared target value stream data. Then, by simulating a multi-round negotiation process, a Pareto optimal equilibrium strategy is solved.
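The split between centralized training and distributed execution can be sketched structurally as follows. This is not a full MADDPG implementation: linear functions stand in for the neural networks, there is no replay buffer, no target networks and no actor gradient step, and all weights, observations and the reward are made-up values. The class names `LinearActor` and `CentralCritic` are hypothetical.

```python
import math


class LinearActor:
    """Deterministic policy: action = tanh(w . [local_obs, shared])."""

    def __init__(self, weights):
        self.weights = list(weights)

    def act(self, local_obs, shared):
        inputs = list(local_obs) + list(shared)
        if len(inputs) != len(self.weights):
            raise ValueError("weight/input length mismatch")
        return math.tanh(sum(w * x for w, x in zip(self.weights, inputs)))


class CentralCritic:
    """Linear Q over the joint observations and actions; training only."""

    def __init__(self, size):
        self.weights = [0.0] * size

    def q(self, joint):
        return sum(w * x for w, x in zip(self.weights, joint))

    def td_update(self, joint, reward, next_joint, gamma=0.9, lr=0.1):
        """One semi-gradient TD(0) step; returns the TD error before the step."""
        td_error = reward + gamma * self.q(next_joint) - self.q(joint)
        self.weights = [w + lr * td_error * x
                        for w, x in zip(self.weights, joint)]
        return td_error


# Hypothetical shared target value stream features and local observations.
shared = [0.7, 0.2]
local_obs = {"vehicle": [0.35], "user": [0.8], "platform": [0.6]}
actors = {
    "vehicle": LinearActor([1.0, 0.5, -0.5]),
    "user": LinearActor([-0.5, 1.0, 0.0]),
    "platform": LinearActor([0.3, 0.3, 0.3]),
}

# Execution: each actor sees only its own observation plus the shared data.
actions = {name: actors[name].act(local_obs[name], shared) for name in actors}

# Training: the critic sees every agent's observation and action at once.
joint = ([x for name in actors for x in local_obs[name]]
         + shared
         + [actions[name] for name in actors])
critic = CentralCritic(len(joint))
first_error = critic.td_update(joint, reward=1.0, next_joint=joint)
second_error = critic.td_update(joint, reward=1.0, next_joint=joint)
```

The point of the sketch is the information asymmetry: `act` never receives another agent's observation, whereas `td_update` operates on the joint vector, which is what allows the critic to coordinate the agents' policies during training.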

[0089] The intelligent agents can include: vehicle-side intelligent agents, user intelligent agents, and platform intelligent agents.

[0090] For example, vehicle-side intelligent agents pursue low energy consumption, low battery degradation, and high safety; user intelligent agents pursue low time costs and high travel reliability; platform intelligent agents (which may have external spatiotemporal characteristics, such as static and dynamic attributes of each charging station) pursue balanced charging station load, stable power grid, and maximized revenue.

[0091] The objective of the game is to negotiate, on the basis of the shared target value stream data, a strategy that is acceptable to all agents and optimal overall. The output service strategy information may be, for example: arrive at charging station A at 15:30 and charge for 20 minutes to 80% SOC.
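The notion of a Pareto optimal strategy can be made concrete with a small filter over candidate strategies. The utilities below are invented numbers ordered as (vehicle, user, platform), and the candidates for stations B and C are hypothetical; only the station A option echoes the example in the text. The sketch shows Pareto filtering alone, not the negotiation process that would pick a single equilibrium.

```python
def dominates(a, b):
    """True if utility vector a is at least as good as b everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    front = {}
    for name, utility in candidates.items():
        dominated = any(
            dominates(other, utility)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front[name] = utility
    return front


# Hypothetical utilities per candidate: (vehicle, user, platform).
candidates = {
    "station_A_15:30_20min": (0.8, 0.7, 0.6),
    "station_B_15:10_25min": (0.6, 0.9, 0.5),
    "station_C_16:00_20min": (0.5, 0.6, 0.5),
}
front = pareto_front(candidates)
print(sorted(front))  # ['station_A_15:30_20min', 'station_B_15:10_25min']
```

Station C is dropped because station A is at least as good for every agent. Choosing between A and B requires a further trade-off among the agents, which is the role the text assigns to the multi-round negotiation.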

[0092] The vehicle service optimization method provided in this invention obtains data assets to be processed in response to preset triggering conditions. These data assets are determined based on multimodal data collected from the vehicle. The data assets to be processed are then input into a spatiotemporal large model to obtain target value stream data for the future time period. The spatiotemporal large model incorporates the physical constraints of the vehicle and is trained based on historical data assets and labeled value stream data. Based on the target value stream data, a multi-agent deep deterministic policy gradient algorithm is used to determine the service strategy information applied to the vehicle. In this technical solution, the server acquires the data assets to be processed based on multimodal data collected from vehicles and inputs them into a spatiotemporal large model trained by integrating vehicle physical constraints, historical data assets, and labeled value stream data. This model can effectively predict the value stream of target data assets in future time periods, proactively capture the dynamic changes of data assets, and deeply model high-dimensional spatiotemporal correlations. Then, the multi-agent deep deterministic policy gradient algorithm is used to optimize the predicted value stream data, generating service policy information adapted to multi-vehicle collaborative scenarios. This not only improves the accuracy and adaptability of service policies in response to dynamic traffic environments and vehicle physical characteristics, but also realizes end-to-end value transformation from data assets to decision output. Thus, under the premise of ensuring safety and efficiency constraints, it enhances the collaborativeness and forward-looking decision-making capabilities of in-vehicle services.

[0093] Based on the above embodiments, Figure 2 shows a flowchart of a second embodiment of the vehicle service optimization method provided by the present invention. As shown in Figure 2, step 15 above (executed by the server) may include:

[0094] Step 21: Construct a game environment that includes vehicle-side intelligent agents, user intelligent agents, and platform intelligent agents;

[0095] In this step, a multi-agent game environment is constructed, in which three core agents can be defined: the vehicle-side agent representing the vehicle itself (e.g., the agent's goal is to reduce energy consumption and extend lifespan), the user agent representing the driver (e.g., the agent's goal is to improve travel efficiency and comfort), and the platform agent representing the service provider (e.g., the agent's goal is to improve operational revenue and resource utilization).

[0096] Step 22: Input the target value stream data into the game environment as shared global state information;

[0097] In this step, the target value stream data derived from the spatiotemporal large model (such as future risk values and charging demand values) is used as the globally shared state; that is, it is input into the game environment as global state information.

[0098] Step 23: Using a multi-agent deep deterministic policy gradient algorithm, the service policy information of the vehicle is determined in the game environment based on the target value stream data and global state information.

[0099] In this step, within the game environment, the MADDPG algorithm can be used to process the target value stream data and global state information, and the solution result can be used to determine the vehicle's service strategy information.
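As a rough illustration of steps 21 to 23, the following Python sketch builds a game environment with the three agents and exposes the target value stream data as shared global state. The class names `Agent` and `GameEnvironment`, the helper `build_environment`, and the dictionary layout of the state are assumptions made for illustration; the description does not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Agent:
    name: str       # "vehicle", "user" or "platform"
    objective: str  # goal of the agent, taken from the description


@dataclass
class GameEnvironment:
    agents: List[Agent]
    # Shared global state: value stream name -> predicted values over time.
    global_state: Dict[str, List[float]] = field(default_factory=dict)

    def set_global_state(self, value_streams: Dict[str, List[float]]) -> None:
        """Step 22: make the target value stream data visible to all agents."""
        self.global_state = dict(value_streams)

    def observe(self, agent_name: str, local_obs: Dict[str, float]) -> Dict[str, object]:
        """Pair one agent's local observation with the shared global state."""
        if agent_name not in {a.name for a in self.agents}:
            raise KeyError(agent_name)
        return {"local": dict(local_obs), "shared": self.global_state}


def build_environment() -> GameEnvironment:
    """Step 21: create the three agents named in the description."""
    return GameEnvironment(agents=[
        Agent("vehicle", "reduce energy consumption and extend lifespan"),
        Agent("user", "improve travel efficiency and comfort"),
        Agent("platform", "improve operational revenue and resource utilization"),
    ])
```

The MADDPG solver of step 23 would then consume the observations returned by `observe`; it is deliberately left out of this sketch.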

[0100] Optionally, one possible implementation of step 23 could be:

[0101] Step 1: Employ the multi-agent deep deterministic policy gradient algorithm to coordinate and optimize the policy networks of each agent during the centralized training phase in the game environment using global state information.

[0102] During the centralized training phase, the central controller uses global state information, including target value stream data, to jointly train the Actor-Critic network of each agent, learning a collaborative strategy that maximizes the interests of all parties.

[0103] Step 2: After coordination and optimization, in the distributed execution phase of the game environment, each agent generates actions based on its own observation information, and obtains the Pareto optimal equilibrium strategy through multiple rounds of negotiation.

[0104] During the distributed execution phase, each agent makes decisions independently based solely on its own local observation information (e.g., the vehicle agent observes the vehicle's own location, and the user agent observes the user's travel preferences) and the shared target value stream data.

[0105] Furthermore, through this multi-round information interaction and strategy iteration among various intelligent agents, the final equilibrium strategy can ensure that the interests of any party cannot be further improved without harming the interests of other parties, thus achieving Pareto optimality and obtaining a Pareto optimal equilibrium strategy.

[0106] Step 3: Determine the Pareto optimal equilibrium strategy as the service strategy information.
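The Pareto condition described above (no party can be made better off without making another worse off) can be checked with a simple dominance test. The sketch below is illustrative only: the function names and the payoff layout (one value per agent, larger is better) are assumptions, and it enumerates candidate strategies instead of running the negotiation itself.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b for every agent and strictly better for one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Names of the candidate strategies that no other candidate dominates."""
    return [
        name for name, payoff in candidates.items()
        if not any(dominates(other_payoff, payoff)
                   for other, other_payoff in candidates.items() if other != name)
    ]
```

For example, with payoffs ordered as (vehicle, user, platform), a candidate that is worse for all three agents than another candidate is excluded from the front.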

[0107] For example, in the above implementation, each agent can adopt the same Actor-Critic network structure. The Actor network of each agent can be a fully connected neural network containing two hidden layers. The input is the local observation vector of the agent (for example, the observation vector of the vehicle agent includes: vehicle position, battery SOC, and vehicle speed; the observation vector of the user agent includes: destination, time preference, and cost sensitivity), and the output is the action probability distribution (for example, the probability of choosing charging stations A, B, and C). Each agent corresponds to a Critic network, which is a fully connected network containing three hidden layers.

[0108] During the centralized training phase, the input to each Critic network consists of the concatenation of the observation and action vectors of all agents, along with the shared global state information (i.e., the target value stream data). The output is the value estimate of the agent's action. During training, an experience replay mechanism is used to sample batches of data. The loss function of each Critic network is the mean squared temporal difference error, and each Actor network is updated along the policy gradient obtained from the output of its corresponding Critic network. Through multiple rounds of iterative training, the policies of the agents converge.
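A minimal sketch of the centralized Critic input and its loss is given below, in plain Python without a deep learning framework. The function names and the discount factor `gamma` are assumptions; the description only states that the loss is the mean square of the temporal difference error.

```python
from typing import List, Sequence


def critic_input(observations: Sequence[Sequence[float]],
                 actions: Sequence[Sequence[float]],
                 global_state: Sequence[float]) -> List[float]:
    """Concatenate all agents' observations and actions with the shared state."""
    vec: List[float] = []
    for obs in observations:
        vec.extend(obs)
    for act in actions:
        vec.extend(act)
    vec.extend(global_state)
    return vec


def td_targets(rewards: Sequence[float], next_q: Sequence[float],
               dones: Sequence[bool], gamma: float = 0.95) -> List[float]:
    """One-step targets r + gamma * Q'(s', a'), with no bootstrap at episode end."""
    return [r + gamma * q * (0.0 if d else 1.0)
            for r, q, d in zip(rewards, next_q, dones)]


def critic_loss(q_values: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared temporal difference error over a sampled batch."""
    errors = [q - y for q, y in zip(q_values, targets)]
    return sum(e * e for e in errors) / len(errors)
```

In a full implementation the batch would be drawn from the experience replay buffer and `next_q` would come from target networks; both are omitted here.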

[0109] In the distributed execution phase, multi-round negotiation is manifested as follows: at each time step, each agent independently generates action proposals based on the current observations; if there is an action conflict (e.g., multiple vehicle agents choose the same charging station), a conflict resolution protocol based on priority or random yielding is triggered; the protocol can be set to iterate up to N rounds (e.g., N=5), and if a consensus is reached, a joint action is output as service strategy information, otherwise the default distributed strategy is adopted.
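The priority-based variant of this conflict resolution protocol can be sketched as follows. The function name `negotiate`, the ordered preference lists, and the integer priorities are assumptions for illustration; random yielding and the default distributed strategy are not modeled, and `None` stands for "no consensus within N rounds".

```python
from collections import defaultdict
from typing import Dict, List, Optional


def negotiate(preferences: Dict[str, List[str]],
              priority: Dict[str, int],
              max_rounds: int = 5) -> Optional[Dict[str, str]]:
    """Return a conflict-free joint action, or None if none is found in time."""
    choice_idx = {agent: 0 for agent in preferences}
    for _ in range(max_rounds):
        # Each agent independently proposes its current best option.
        proposals = {a: preferences[a][i] for a, i in choice_idx.items()}
        by_station = defaultdict(list)
        for agent, station in proposals.items():
            by_station[station].append(agent)
        conflicts = [agents for agents in by_station.values() if len(agents) > 1]
        if not conflicts:
            return proposals  # consensus: joint action becomes the service strategy
        for agents in conflicts:
            winner = max(agents, key=lambda a: priority[a])
            for agent in agents:
                # Lower-priority agents yield to their next preference, if any.
                if agent != winner and choice_idx[agent] + 1 < len(preferences[agent]):
                    choice_idx[agent] += 1
    return None  # caller falls back to the default distributed strategy
```

With three vehicle agents where two initially pick station A, the lower-priority one moves to its second choice and consensus is reached in the second round.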

[0110] Optionally, Figure 3 shows a sequence diagram of a multi-agent game provided by the present invention. As shown in Figure 3, a specific game scenario is presented:

[0111] It includes the following main components: vehicle-side intelligent agent (Agent_V), user intelligent agent (Agent_C), platform intelligent agent (Agent_P), game optimization center (MADDPG), and spatiotemporal large model;

[0112] The implementation process includes:

[0113] Phase 1), Value Stream Information Synchronization and Deduction:

[0114] The vehicle-side intelligent agent reports real-time vehicle status feature vectors to the spatiotemporal large model; the user intelligent agent reports user preferences (e.g., hoping to charge within 30 minutes) to the game optimization center; the platform intelligent agent publishes platform goals (e.g., guiding vehicles to less busy charging stations) to the game optimization center; the spatiotemporal large model infers the data asset value stream based on physical information and pushes it to the game optimization center (e.g., the current value of station A is 0.9, and it will be 0.4 after 30 minutes due to congestion).

[0115] Phase 2), Game Strategy Generation and Negotiation:

[0116] In the game optimization center, the strategy network of each agent computes on the global state (value stream plus the objectives of each party). The center then issues proposals: to the vehicle agent, a recommendation to go to charging station A immediately; to the user agent, a reward of 50 points if the recommendation is accepted; to the platform agent, the expectation that successful guidance will improve regional operational efficiency. The agents respond in turn: the vehicle agent accepts the recommendation, with an estimated arrival time of 5 minutes; the user agent accepts the solution and expects the points to be credited immediately; the platform agent confirms that charging spaces are available at station A and accepts the guidance.

[0117] Phase 3), Strategy Execution and Data Acquisition:

[0118] The vehicle-side intelligent agent displays the recommended route and incentive information on the in-vehicle interface of the user intelligent agent, and the user confirms execution; the vehicle-side intelligent agent feeds back the action actually executed (starting navigation to station A) and continuously reports data such as location and energy consumption during the trip; the spatiotemporal large model pushes updated value streams to the game optimization center (e.g., a minor adjustment to the value of station A due to changes at intersections); after the service is completed, the user intelligent agent gives a satisfaction rating (4.5 / 5.0).

[0119] Phase 4), Value Stream Model and Game Strategy Update:

[0120] The game optimization center calculates the overall reward for this round of the game by integrating feedback from all parties (satisfaction, efficiency improvement, and execution); and uses the reward signal to update the policy network of all agents (using a centralized training and distributed execution paradigm); compares the actual results with the predicted value stream and provides model optimization signals to the spatiotemporal large model; the spatiotemporal large model adjusts the parameters of the physical information model to improve the accuracy of the next value stream inference.

[0121] The vehicle service optimization method provided in this invention constructs a game environment that includes a vehicle-side intelligent agent, a user intelligent agent, and a platform intelligent agent; inputs target value stream data into the game environment as shared global state information; and employs a multi-agent deep deterministic policy gradient algorithm to determine the vehicle's service strategy information in the game environment based on the target value stream data and the global state information. This technical solution first achieves unified modeling of multiple stakeholders in the same dynamic scenario based on multi-agent systems, laying the foundation for coordinating personalized user needs, vehicle manufacturer operational efficiency, and platform operation goals from a global perspective. Then, it employs a multi-agent deep deterministic policy gradient algorithm. During the centralized training phase, it utilizes complete global state information to jointly optimize the policy networks of each agent, enabling each agent to perceive the behavioral intentions of other agents and the global evolution of the environment during training. This effectively overcomes the local optima and policy conflicts that can easily result from unilateral optimization. In the distributed execution phase, each agent can generate collaboratively optimized service policies based solely on its own local observation information. This ensures both the real-time performance and low latency of vehicle-side decision-making and avoids reliance on global communication during the execution phase. While improving the service policy's response speed to real-time scenarios, it also ensures a balance between global benefits and individual needs.

[0122] Based on the above embodiments, Figure 4 shows a flowchart of a third embodiment of the vehicle service optimization method provided by the present invention. As shown in Figure 4, after step 15, the server may also perform:

[0123] Step 41: Send service policy information to the vehicle and obtain real-time feedback data after the vehicle executes the service policy information;

[0124] In this step, service policy information is sent to the vehicle. After the vehicle executes the service policy information, the server can collect real-time feedback data after the execution.

[0125] For example, real-time feedback data could include: user satisfaction ratings for the service, the actual time a vehicle takes to reach a charging station and the amount of electricity consumed, and the platform's actual revenue during that period.

[0126] Step 42: Update the spatiotemporal large model and / or the policy network of each agent based on real-time feedback data.

[0127] In this step, the spatiotemporal large model and / or the policy networks of the individual agents can be adjusted and optimized based on the real-time feedback data.

[0128] For example, real-time feedback data (such as user satisfaction and platform revenue) can be used as reward signals in reinforcement learning to update the policy network and value network of each agent in a multi-agent deep deterministic policy gradient algorithm. For instance, if a user gives a low satisfaction rating to a recommended charging station, the policy network will adjust its weights to avoid similar recommendations.

[0129] Furthermore, by comparing real-time feedback data with target value stream data, the difference can be used as a loss signal, and the parameters of the spatiotemporal large model itself can be optimized through backpropagation algorithm to make its future predictions more closely resemble actual physical behavior.
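The two feedback paths described above (a reward signal for the policy networks and a loss signal for the spatiotemporal large model) can be sketched as follows. The weights, the normalization of revenue to the range 0 to 1, and the function names are assumptions; the description does not fix how the reward is composed.

```python
from typing import Sequence


def reward_from_feedback(satisfaction: float, max_rating: float,
                         revenue_gain: float,
                         w_user: float = 0.5, w_platform: float = 0.5) -> float:
    """Scalar reward from user satisfaction and normalized platform revenue."""
    return w_user * (satisfaction / max_rating) + w_platform * revenue_gain


def prediction_loss(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean squared difference between predicted and realized value stream."""
    if len(predicted) != len(observed):
        raise ValueError("value streams must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)
```

The reward would feed the multi-agent policy update, while the loss would be backpropagated through the spatiotemporal large model; neither update step is shown.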

[0130] Correspondingly, on the vehicle side, the following can be performed: obtain service policy information issued by the server and execute it; upload real-time feedback data after executing the service policy information to the server.

[0131] The vehicle service optimization method provided in this invention sends service policy information to vehicles and obtains real-time feedback data after the vehicles execute the service policy information; based on the real-time feedback data, it updates the spatiotemporal large model and / or the policy networks of each agent. In this technical solution, the immediate capture of the actual effect of the policy provides a real basis for evaluating the accuracy of prediction and decision-making; then, by comparing or jointly analyzing the real-time feedback data with the previously predicted target value stream data, it can effectively identify the deviations in the spatiotemporal large model in value stream prediction and the shortcomings of the multi-agent policy network in collaborative decision-making; based on this, the aforementioned difference information is used to update the spatiotemporal large model and the policy networks of each agent respectively, enabling the spatiotemporal large model to continuously learn the dynamic evolution laws of data assets in the real physical environment, continuously correct prediction errors caused by unmodeled factors or environmental changes, and continuously approach the complexity and timeliness requirements of real-world scenarios during continuous operation, significantly improving the accuracy of service policies in long-term deployment.

[0132] Based on the above embodiments, Figure 5 shows a flowchart of a fourth embodiment of the vehicle service optimization method provided by the present invention. As shown in Figure 5, before step 12, the vehicle may also perform the following:

[0133] Step 51: Perform spatiotemporal synchronization processing on the multimodal data to obtain the processed multimodal data;

[0134] In this implementation, the collected multimodal data is spatiotemporally synchronized and aligned to ensure that all data are unified to the same timestamp and vehicle coordinate system, resulting in processed multimodal data.

[0135] Step 52: According to the preset filtering strategy, filter the processed multimodal data to obtain filtered multimodal data;

[0136] The preset filtering strategy is determined based on data quality and / or data duplication.

[0137] In this implementation, a preset filtering strategy is applied to the processed multimodal data. For example, low-quality data (such as sensor fault frames and outliers) is filtered out by integrity, consistency and anomaly monitoring rules; or redundant data with minimal changes over a continuous period of time (such as static environmental images in a parked state) is detected by hash algorithm and downgraded or discarded, finally obtaining the filtered multimodal data.
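Steps 51 and 52 can be sketched together as a small pipeline. This is a simplified illustration under several assumptions: synchronization is reduced to nearest-timestamp matching (the coordinate transform into the vehicle frame is omitted), redundancy is detected by hashing quantized frame contents with SHA-256, redundant frames are discarded rather than downgraded, and all function names, tolerances, and limits are hypothetical.

```python
import hashlib
import json
from bisect import bisect_left
from typing import Dict, List, Sequence, Tuple


def nearest(ts: Sequence[float], vals: Sequence, t: float, tol: float):
    """Value whose timestamp is closest to t, or None if none lies within tol."""
    i = bisect_left(ts, t)
    idx = [j for j in (i - 1, i) if 0 <= j < len(ts)]
    if not idx:
        return None
    best = min(idx, key=lambda j: abs(ts[j] - t))
    return vals[best] if abs(ts[best] - t) <= tol else None


def synchronize(ref_ts: Sequence[float],
                streams: Dict[str, Tuple[Sequence[float], Sequence]],
                tol: float = 0.05) -> List[dict]:
    """Step 51: resample every stream onto the reference timestamps."""
    return [{"t": t, **{name: nearest(ts, vals, t, tol)
                        for name, (ts, vals) in streams.items()}}
            for t in ref_ts]


def digest(frame: dict, decimals: int = 1) -> str:
    """Hash of the frame content (timestamp excluded, floats quantized)."""
    payload = {k: round(v, decimals) if isinstance(v, float) else v
               for k, v in frame.items() if k != "t"}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def filter_frames(frames: List[dict], required: Sequence[str],
                  limits: Dict[str, Tuple[float, float]]) -> List[dict]:
    """Step 52: drop incomplete, out-of-range, and repeated frames."""
    kept, last = [], None
    for frame in frames:
        if any(frame.get(k) is None for k in required):      # integrity rule
            continue
        if any(not (lo <= frame[k] <= hi) for k, (lo, hi) in limits.items()
               if frame.get(k) is not None):                  # anomaly rule
            continue
        d = digest(frame)
        if d == last:                                         # redundancy rule
            continue
        kept.append(frame)
        last = d
    return kept
```

Quantizing before hashing lets frames that differ only slightly collapse to the same digest, which approximates the "minimal change" case; a perceptual hash would be a closer match for images.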

[0138] The vehicle service optimization method provided in this invention performs spatiotemporal synchronization processing on multimodal data to obtain processed multimodal data; then, according to a preset filtering strategy, filters the processed multimodal data to obtain filtered multimodal data. In this technical solution, the spatiotemporal synchronization of multimodal data ensures the accuracy and synchronization of subsequent data processing; subsequent data filtering avoids unnecessary or redundant data from putting pressure on the server and improves accuracy.

[0139] Based on the above embodiments, Figure 6 shows an overall flowchart of an example of the vehicle service optimization method provided by the present invention. As shown in Figure 6, the system layers corresponding to the vehicle service optimization method in this example may include: vehicle-side embodied perception layer, cloud-based spatiotemporal large model inference layer, service game optimization layer, service execution layer, and data asset operation layer.

[0140] 1) The vehicle-side embodied perception layer first acquires the raw multi-source data streams collected by the vehicle, including data from the CAN bus, sensors, and the Human-Machine Interface (HMI). Within this layer, the following are then performed: multimodal data acquisition and synchronization; embodied intelligence fusion and feature extraction (spatiotemporal attention); and local decision-making and value pre-screening (upload / caching / local execution). Finally, the layer uploads the embodied perception feature vector (the standardized data asset, i.e., the data asset to be processed).

[0141] 2) In the cloud-based spatiotemporal large model inference layer, the standardized data assets are input into the physical information spatiotemporal large model engine; the digital twin state extrapolation and manager performs multi-step prediction; and the data asset value stream construction and evaluator performs value quantification, so as to obtain high-value data asset streams (including future value extrapolation, i.e., the target value stream data).

[0142] 3) The service game optimization layer comprises: the multi-agent game center (using the MADDPG algorithm); the service strategy generator (Pareto optimal solution); and the dynamic pricing and incentive mechanism. It outputs personalized service instructions (i.e., the service strategy information).

[0143] 4) In the service execution layer, services are executed (such as vehicle system / application (Application, APP) push), thereby obtaining service effect data (feedback signals).

[0144] 5) In the data asset operation layer, game effect monitoring (A / B testing, multi-objective pinging) is performed based on the feedback signals; the data asset value stream closed-loop optimization engine performs value re-evaluation and feedback (dynamic weight update); and optimization instructions (feedback signals) are output to the multi-agent game center and to the physical information spatiotemporal large model engine respectively.

[0145] In one possible implementation, the flow is described as the forward flow shown in the attached figure:

[0146] The vehicle-side embodied intelligent perception layer acquires raw data from the CAN bus, sensors, HMI, etc. through the multimodal data acquisition and synchronization module; through the embodied intelligent fusion and feature extraction module (using a spatiotemporal attention mechanism), it generates standardized embodied perception feature vectors (i.e., primary data assets); based on the value density and urgency of the data, the local decision-making and value pre-screening engine decides whether to upload the data immediately as a high-value data asset, upload it in batches after caching, or act on it locally at once (e.g., in an emergency safety response).

[0147] The cloud-based spatiotemporal large model inference layer receives feature vectors uploaded from the vehicle. The physical information spatiotemporal large model engine uses a pre-trained large model embedded with physical laws to perform deep inference. The digital twin state inference and manager updates the vehicle's virtual state and completes multi-step predictions. The data asset value stream construction and evaluation device quantifies the current and future value of various data assets to form a dynamic value stream.

[0148] The service game optimization layer receives value stream information, and the multi-agent game center (using the MADDPG algorithm) coordinates the interests of agents such as the vehicle, user, and platform to generate Pareto optimal service strategies. The service strategy generator transforms these strategies into specific instructions, and dynamic pricing and incentive mechanisms ensure the implementation of the strategies.

[0149] The data asset operation layer is responsible for monitoring the game's effects and completing the iteration of the value model through a value reassessment and feedback closed-loop optimization engine.

[0150] Reverse feedback (top-down): Optimization instructions from the operations layer (e.g., adjusting the weight of a certain type of data asset) can directly affect the lower-level model, forming a complete value-added closed loop of data-asset-value-service-feedback-optimization. The core innovation of this architecture lies in clarifying the division of labor among each layer and clearly depicting the entire path of data asset value flow generation, deduction, game theory, and optimization.

[0151] Based on the above embodiments, Figure 7 shows an overall flowchart of an example of the vehicle side provided by the present invention. As shown in Figure 7, the system layering on the vehicle side in this example may include: a vehicle-side data assetization pipeline and a local value decision loop.

[0152] First, the vehicle-side data assetization pipeline receives the inflow of raw multi-source data and performs spatiotemporal synchronization, unifying the timestamps and coordinate systems of the data from each module. The data then enters the data quality and initial screening module, which monitors integrity, consistency, and anomalies, discards or downgrades low-quality, redundant, or abnormal data, and records logs. For high-quality, valid data, or directly for the monitored data, a multimodal fusion core based on gated attention is used to generate embodied perception feature vectors (standardized preliminary asset data) containing spatiotemporal contextual semantics.
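The idea of a gated fusion core, which dynamically weights each modality and then fuses the weighted features, can be illustrated with the following sketch. It is not the model of the invention: the gate here is a single softmax over per-modality scores, the score is a plain dot product with hypothetical `gate_weights` (which would be learned in practice), and all modalities are assumed to be already projected to the same feature dimension.

```python
import math
from typing import Dict, List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(features: Dict[str, Sequence[float]],
                 gate_weights: Dict[str, Sequence[float]]
                 ) -> Tuple[List[float], Dict[str, float]]:
    """Weight each modality by its gate value and sum the weighted features."""
    names = sorted(features)
    # Gate score per modality: dot product of gate weights and features.
    scores = [sum(w * x for w, x in zip(gate_weights[n], features[n]))
              for n in names]
    gates = dict(zip(names, softmax(scores)))
    dim = len(features[names[0]])
    fused = [sum(gates[n] * features[n][i] for n in names) for i in range(dim)]
    return fused, gates
```

With all gate weights at zero the modalities contribute equally; raising the score of one modality shifts the fused vector toward that modality's features.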

[0153] Secondly, the local value decision loop runs a local lightweight value assessment and decision-making module:

[0154] 1) If the data is of extremely high value and requires instantaneous response (e.g., forward collision warning), the local decision engine immediately executes safety policies (e.g., Automatic Emergency Braking (AEB)); thereby generating service execution logs and uploading them as high-value data assets;

[0155] 2) High-value / urgent data (e.g., collision risk or battery overheating) is given the highest priority and is uploaded to the cloud immediately, with priority, via the vehicle-to-everything (V2X) channel.

[0156] 3) Data of general value (e.g., normal driving behavior, energy consumption) is given medium priority: it is stored in the data cache area and then uploaded to the cloud in batches according to the strategy, taking network conditions and cloud load into account.

[0157] Finally, the cloud-based data receiver receives the data.
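The three branches of the local value decision loop can be sketched as a routing function. The threshold `high_value`, the batch size, and the names `Route`, `route_asset`, and `flush_cache` are assumptions for illustration; the description does not give numeric thresholds.

```python
from enum import Enum
from typing import List, Tuple


class Route(Enum):
    LOCAL_EXECUTE = "execute locally and upload the execution log"
    UPLOAD_NOW = "upload to the cloud immediately"
    CACHE_BATCH = "cache and upload in batches"


def route_asset(value: float, urgent: bool, instant_response: bool,
                high_value: float = 0.8) -> Route:
    """Pick the handling branch for one data asset."""
    if instant_response:                 # e.g. forward collision warning
        return Route.LOCAL_EXECUTE
    if urgent or value >= high_value:    # e.g. battery overheating
        return Route.UPLOAD_NOW
    return Route.CACHE_BATCH             # e.g. normal driving behavior


def flush_cache(cache: List[dict], network_ok: bool,
                batch_size: int = 2) -> Tuple[List[dict], List[dict]]:
    """Return (batch to upload now, assets that stay cached)."""
    if not network_ok:
        return [], cache
    return cache[:batch_size], cache[batch_size:]
```

A real engine would also consider cloud load when flushing the cache; here only a boolean network condition is modeled.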

[0158] Based on the above embodiments, Figure 8 shows an overall flowchart of an example in which the server determines the value stream data, as provided by the present invention. As shown in Figure 8, this example includes:

[0159] Input: historical and real-time data asset sequence feature vectors, together with embedded physical constraints (vehicle dynamics / battery model / thermodynamics). The real-time data asset sequence feature vectors (the data assets to be processed) are input into the physical information spatiotemporal large model (it should be understood that the spatiotemporal large model embeds the physical constraint laws of the vehicle and is trained based on historical data assets and labeled value stream data).

[0160] Core processing: multi-scale spatiotemporal attention, comprising a short-term attention head (next 1-5 minutes), a medium-term attention head (next 0.5-2 hours), and a long-term attention head (next days to weeks);

[0161] Output: Data asset value projection results (target value stream data), including: predictive maintenance value stream (T0: 0.2; T+1h: 0.3; T+24h: 0.8), remaining range value stream (current: 180km; 30 minutes later: 165-175km), and charging economy value stream (current: 0.9; 1h later: 0.6).

[0162] In one possible implementation, described by the forward flow in the attached diagram: the input not only includes historical and real-time data asset sequences (feature vectors from the vehicle), but more importantly, it embeds physical constraints into the model itself. For example, during training, the model is constrained by vehicle dynamics equations (such as F=ma) to ensure that the predicted vehicle acceleration and force relationship conforms to physical laws; the embedding of the battery degradation model makes the prediction of State of Health (SOH) more consistent with electrochemical principles.
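One common way to constrain training with an equation such as F=ma is to add a physics residual to the data loss. The sketch below shows that pattern under stated assumptions: the residual is expressed in acceleration units (F/m minus the predicted acceleration), `weight` is a hypothetical balancing coefficient, and the description does not state that the constraint is implemented as a loss penalty.

```python
from typing import Sequence


def physics_informed_loss(pred_accel: Sequence[float],
                          true_accel: Sequence[float],
                          net_force: Sequence[float],
                          mass: float,
                          weight: float = 0.1) -> float:
    """Data loss plus a penalty for violating a = F / m."""
    n = len(pred_accel)
    # Fit to labeled data.
    data_term = sum((p - t) ** 2 for p, t in zip(pred_accel, true_accel)) / n
    # Deviation of the prediction from Newton's second law.
    physics_term = sum((f / mass - p) ** 2
                       for f, p in zip(net_force, pred_accel)) / n
    return data_term + weight * physics_term
```

A prediction that matches both the labels and the force balance yields zero loss; a prediction that fits noisy labels but contradicts the force balance is still penalized.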

[0163] The core processing component is a multi-scale spatiotemporal attention mechanism. This mechanism uses parallel attention heads to focus on patterns at different time scales: a short-term head captures instantaneous risks (such as sudden braking ahead); a medium-term head is used for journey planning (such as range estimation); and a long-term head identifies slowly changing trends (such as component wear). This design enables the model to handle tasks with different time horizons simultaneously.

[0164] The output is the data asset value stream projection result, which, as an innovation, is no longer a single numerical value but a curve or range estimate that changes over time. For example, the predictive maintenance value stream shows how the probability of failure of a certain component increases over time, so that an alert is triggered when the value reaches a threshold. This output provides accurate and interpretable evidence for forward-looking decision-making and is key to the dynamic valuation of data assets.
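Because the value stream is a curve over time, the point at which it reaches an alert threshold can be estimated from the predicted samples. The sketch below uses the predictive maintenance example values from the description (0.2 now, 0.3 after 1 hour, 0.8 after 24 hours); the threshold of 0.7, the linear interpolation between samples, and the function name are assumptions.

```python
from typing import Optional, Sequence, Tuple


def crossing_time(stream: Sequence[Tuple[float, float]],
                  threshold: float) -> Optional[float]:
    """Hours from now at which the value first reaches the threshold.

    stream holds (hours_ahead, value) pairs sorted by time. Returns None
    if the threshold is not reached within the predicted horizon.
    """
    prev = None
    for t, v in stream:
        if v >= threshold:
            if prev is None:
                return float(t)  # already at or above the threshold
            t0, v0 = prev
            # Linear interpolation between the two surrounding samples.
            return t0 + (threshold - v0) * (t - t0) / (v - v0)
        prev = (t, v)
    return None


maintenance = [(0.0, 0.2), (1.0, 0.3), (24.0, 0.8)]
alert_in_hours = crossing_time(maintenance, 0.7)  # roughly 19.4 hours
```

A scheduler could use such an estimate to propose a maintenance slot before the threshold is actually reached.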

[0165] The technical solutions provided in the embodiments of the present invention have the following technical effects:

[0166] Technical effect 1): Dynamic perception and increased utilization of data asset value. Through value stream deduction, real-time and forward-looking assessment of data asset value is achieved, and the efficiency of identifying and utilizing high-value data assets is improved by more than 60%.

[0167] Technical effect 2): Global optimization of service decision-making and a win-win situation for all parties. The game optimization framework effectively balances the interests of multiple parties such as the vehicle, the user, and the platform. While improving user satisfaction (by more than 40% in practice), it also optimizes platform operation efficiency (e.g., charging station turnover rate increased by 25%) and social benefits (e.g., a reduced peak-valley difference of the power grid).

[0168] Technical effect 3): Significantly improved real-time performance and resource efficiency. Intelligent processing and pre-screening on the vehicle side reduce the latency of critical safety responses to the millisecond level, and optimized data transmission strategies reduce network bandwidth usage by about 30%.

[0169] Technical effect 4): Breakthrough in predictive reliability and self-evolution capabilities. The introduction of the physical information model reduces the long-term prediction error of key indicators such as range and fault warning by more than 40%. The closed-loop learning mechanism ensures continuous evolution in response to environmental changes.

[0170] Based on the above method embodiments, Figure 9 shows a schematic diagram of the structure of a first embodiment of the vehicle service optimization device provided by the present invention. As shown in Figure 9, this device is applied to a server and includes:

[0171] The first acquisition module 91 is used to acquire, in response to a preset trigger condition, the data asset to be processed, which represents the embodied perception feature vector. The data asset to be processed is obtained by the vehicle processing multimodal data based on the gated attention multimodal fusion model. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0172] The first processing module 92 is used to input the data assets to be processed into the spatiotemporal large model to obtain the target value stream data of the data assets to be processed within a future preset time period. The spatiotemporal large model is trained based on the physical constraints of vehicles, historical data assets and labeled value stream data.

[0173] The determination module 93 is used to determine the service strategy information applied to the vehicle based on the target value stream data and using a multi-agent deep deterministic policy gradient algorithm.

[0174] In one or more embodiments, when determining the service policy information applied to the vehicle based on the target value stream data using a multi-agent deep deterministic policy gradient algorithm, the determination module 93 is specifically configured to:

[0175] Construct a game-theoretic environment that includes vehicle-side intelligent agents, user intelligent agents, and platform intelligent agents;

[0176] The target value stream data is input into the game environment as shared global state information;

[0177] A multi-agent deep deterministic policy gradient algorithm is adopted to determine the service policy information of vehicles in a game environment based on target value stream data and global state information.

[0178] In one or more embodiments, when employing the multi-agent deep deterministic policy gradient algorithm to determine the vehicle's service policy information in the game environment based on the target value stream data and global state information, the determination module 93 is specifically configured to:

[0179] A multi-agent deep deterministic policy gradient algorithm is adopted to coordinate and optimize the policy network of each agent by utilizing global state information during the centralized training phase of the game environment.

[0180] After coordination and optimization, in the distributed execution phase of the game environment, each agent generates actions based on its own observation information, and obtains the Pareto optimal equilibrium strategy through multiple rounds of negotiation.

[0181] The Pareto optimal equilibrium strategy is determined as the service strategy information.

[0182] In one or more embodiments, the first processing module 92 is further configured to:

[0183] Send service policy information to vehicles and obtain real-time feedback data after vehicles execute service policy information;

[0184] Update the spatiotemporal large model and / or the policy networks of the individual agents based on the real-time feedback data.

[0185] In one or more embodiments, before inputting the data asset to be processed into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a preset future time period, the first processing module 92 is further configured to:

[0186] Acquire historical data assets and the corresponding labeled value stream data of historical data assets;

[0187] Based on physical constraints, historical data assets, and labeled value stream data, a multi-scale spatiotemporal attention network is trained to obtain the spatiotemporal large model.

[0188] Based on the above method embodiments, Figure 10 shows a schematic diagram of a second embodiment of the vehicle service optimization device provided by the present invention. As shown in Figure 10, the device is applied to a vehicle and includes:

[0189] The second acquisition module 101 is used to acquire multimodal data, which includes CAN bus data, sensor data, and interaction data.

[0190] The second processing module 102 is used to process multimodal data based on a gated attention multimodal fusion model to obtain data assets to be processed that represent embodied perception feature vectors. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0191] The sending module 103 is used to upload the data assets to be processed to the server. The server is used to input the data assets to be processed into the spatiotemporal large model to obtain the target value stream data of the data assets to be processed within a preset time period in the future. The target value stream data is used to determine the service strategy information applied to the vehicle. The spatiotemporal large model is trained based on the physical constraints of the vehicle, historical data assets, and labeled value stream data.
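The idea of dynamically adjusting the weight ratio of different modal data and then fusing the adjusted data can be sketched as follows. This is a simplified illustration, not the model of the embodiments: it assumes each modality has already been encoded into a feature vector of equal length, computes one gate score per modality as a dot product with hypothetical gate parameters (`gate_w`), and normalizes the scores with a softmax. All names and numbers are assumptions.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax; the outputs sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(
    features: Dict[str, List[float]], gate_w: Dict[str, List[float]]
) -> Tuple[List[float], Dict[str, float]]:
    """Weight each modality by its gate score, then sum the weighted vectors."""
    names = list(features)
    # One gate score per modality: dot product of its features and gate parameters.
    scores = [sum(f * w for f, w in zip(features[n], gate_w[n])) for n in names]
    weights = softmax(scores)
    dim = len(features[names[0]])
    fused = [
        sum(weights[i] * features[n][k] for i, n in enumerate(names))
        for k in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Hypothetical encoded features for the three modalities named in the text.
features = {
    "can_bus": [0.2, 0.1, 0.4],
    "sensor": [0.9, 0.3, 0.5],
    "interaction": [0.1, 0.0, 0.2],
}
# Hypothetical gate parameters; in practice these would be learned.
gate_w = {
    "can_bus": [1.0, 0.5, 0.0],
    "sensor": [0.8, 0.2, 0.6],
    "interaction": [0.3, 0.3, 0.3],
}
fused, weights = gated_fusion(features, gate_w)
```

Because the gate scores depend on the current features, the weight given to each modality changes from sample to sample, which is the behavior the text attributes to the gated attention mechanism.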

[0192] In one or more embodiments, before processing the multimodal data using a gated attention-based multimodal fusion model to obtain the data asset to be processed, representing the embodied perception feature vector, the second processing module 102 is further configured to:

[0193] Spatiotemporal synchronization processing is performed on the multimodal data to obtain the processed multimodal data;

[0194] The processed multimodal data is filtered according to a preset filtering strategy to obtain filtered multimodal data. The preset filtering strategy is determined based on data quality and / or data redundancy.
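The two preprocessing steps above (spatiotemporal synchronization, then filtering by data quality and / or redundancy) can be sketched under simple assumptions. Here synchronization is reduced to nearest-timestamp matching within a tolerance, and each sample carries a hypothetical quality score. The sample layout, the tolerance, and the thresholds are all illustrative choices that are not specified in the embodiments.

```python
from typing import List, Optional, Tuple

# Hypothetical sample layout: (timestamp in seconds, value, quality in [0, 1]).
Sample = Tuple[float, float, float]
Pair = Tuple[Sample, Sample]


def nearest(samples: List[Sample], t: float, tol: float) -> Optional[Sample]:
    """Return the sample closest in time to t, or None if none lies within tol."""
    best = min(samples, key=lambda s: abs(s[0] - t), default=None)
    if best is None or abs(best[0] - t) > tol:
        return None
    return best


def synchronize(reference: List[Sample], other: List[Sample], tol: float = 0.05) -> List[Pair]:
    """Pair each reference sample with the time-nearest sample of another modality."""
    pairs = []
    for ref in reference:
        match = nearest(other, ref[0], tol)
        if match is not None:
            pairs.append((ref, match))
    return pairs


def filter_pairs(pairs: List[Pair], min_quality: float = 0.5, min_change: float = 0.01) -> List[Pair]:
    """Drop low-quality pairs, then pairs nearly identical to the last kept pair."""
    kept: List[Pair] = []
    for ref, match in pairs:
        if min(ref[2], match[2]) < min_quality:
            continue  # data quality criterion
        if kept:
            last_ref, last_match = kept[-1]
            unchanged = (
                abs(ref[1] - last_ref[1]) < min_change
                and abs(match[1] - last_match[1]) < min_change
            )
            if unchanged:
                continue  # data redundancy criterion
        kept.append((ref, match))
    return kept


can = [(0.00, 10.0, 0.9), (0.10, 10.0, 0.9), (0.20, 12.0, 0.9), (0.30, 13.0, 0.2), (0.50, 14.0, 0.9)]
cam = [(0.01, 1.0, 0.8), (0.11, 1.0, 0.8), (0.21, 1.5, 0.8), (0.31, 2.0, 0.8)]
pairs = synchronize(can, cam)
kept = filter_pairs(pairs)
```

In this example the last CAN sample has no camera sample within the tolerance and is never paired, the second pair is dropped as redundant, and the fourth pair is dropped for low quality.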

[0195] In one or more embodiments, the sending module 103 uploads the data asset to be processed to the server, specifically for:

[0196] Based on the preset value dimensions, the data assets to be processed are uploaded to the server. The preset value dimensions are determined based on security relevance, economic relevance, and timeliness relevance.

[0197] In one or more embodiments, the sending module 103 uploads the data assets to be processed to the server according to a preset value dimension, specifically for:

[0198] Based on the preset value dimensions, the data assets to be processed are valued and the evaluation results are obtained.

[0199] If the assessment results indicate that the data is related to safety, the local engine in the vehicle makes a decision on the data asset to be processed and uploads the decision result to the server as the data asset to be processed.

[0200] If the assessment results indicate that the data is of high value and urgent importance, upload the data assets to be processed to the server;

[0201] If the assessment result indicates routine data, upload the data assets to be processed to the server based on the communication network status between the server and the vehicle.
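The three-way handling described in paragraphs [0198] to [0201] can be sketched as a small routing function. The score fields, the thresholds, and the returned action labels are hypothetical. In particular, deferring routine uploads when the network is poor is only one possible reading of "based on the communication network status".

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SAFETY = "safety"
    HIGH_VALUE_URGENT = "high_value_urgent"
    ROUTINE = "routine"


@dataclass
class ValueScores:
    """Hypothetical scores in [0, 1] for the three preset value dimensions."""
    safety: float
    economic: float
    timeliness: float


def assess(scores: ValueScores, safety_threshold: float = 0.8, value_threshold: float = 0.7) -> Tier:
    """Map the value scores to one of the three assessment results."""
    if scores.safety >= safety_threshold:
        return Tier.SAFETY
    if scores.economic >= value_threshold and scores.timeliness >= value_threshold:
        return Tier.HIGH_VALUE_URGENT
    return Tier.ROUTINE


def plan_upload(tier: Tier, network_ok: bool) -> str:
    """Choose an action label for the data asset to be processed."""
    if tier is Tier.SAFETY:
        # The local engine decides first; the decision result is what gets uploaded.
        return "decide_locally_then_upload_decision"
    if tier is Tier.HIGH_VALUE_URGENT:
        return "upload_immediately"
    # Routine data: assumed here to wait for a usable network.
    return "upload_now" if network_ok else "defer_upload"


action = plan_upload(assess(ValueScores(safety=0.1, economic=0.3, timeliness=0.2)), network_ok=False)
```

Checking the safety dimension first reflects the ordering of the three cases in the text, where safety-related data is handled on the vehicle before any upload takes place.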

[0202] In one or more embodiments, the second processing module 102 is further configured to:

[0203] Obtain the service policy information issued by the server and execute it;

[0204] The real-time feedback data after the service policy information is executed is uploaded to the server.

[0205] It should be noted that the division of the various modules in the above device is merely a logical functional division. In actual implementation, they can be fully or partially integrated into a single physical element, or they can be physically separated. Furthermore, these modules can be implemented entirely in software through processing element calls, or entirely in hardware. Alternatively, some modules can be implemented through processing element calls in software, while others can be implemented in hardware. Moreover, these modules can be integrated together or implemented independently. The processing element here can be an integrated circuit with signal processing capabilities. During implementation, each step of the above method or each of the above modules can be completed through the integrated logic circuits in the hardware of the processor element or through software instructions.

[0206] As can be seen from the above, the vehicle service optimization device provided in this embodiment of the invention can acquire the data assets to be processed based on multimodal data collected from vehicles, and input them into a spatiotemporal large model trained by integrating vehicle physical constraints, historical data assets, and labeled value stream data. This can effectively predict the value stream of target data assets in future time periods, proactively capture the dynamic change patterns of data assets, and deeply model high-dimensional spatiotemporal correlations. Furthermore, it uses a multi-agent deep deterministic policy gradient algorithm to optimize the predicted value stream data, generating service policy information adapted to multi-vehicle collaborative scenarios. This not only improves the response accuracy and adaptability of service policies to dynamic traffic environments and vehicle physical characteristics, but also realizes end-to-end value transformation from data assets to decision output. Thus, under the premise of ensuring safety and efficiency constraints, it enhances the collaborativeness and forward-looking decision-making capabilities of in-vehicle services.

[0207] Figure 11 shows a schematic diagram of an embodiment of the electronic device provided by the present invention. As shown in Figure 11, the electronic device may include: a processor 112, a communications interface 114, a memory 116, and a communications bus 118.

[0208] The processor 112, communication interface 114, and memory 116 communicate with each other via communication bus 118. Communication interface 114 is used to communicate with other network elements such as clients or other servers. The processor 112 executes program 110, specifically performing the relevant steps in the above method embodiments.

[0209] Specifically, program 110 may include program code, which includes computer-executable instructions.

[0210] Processor 112 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The electronic device includes one or more processors, which may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.

[0211] Memory 116 is used to store program 110. Memory 116 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk storage device.

[0212] Specifically, program 110 can be called by processor 112 to cause the electronic device to perform the following operations:

[0213] 1) When the electronic device is a server:

[0214] In response to a preset trigger condition, a data asset to be processed representing an embodied perception feature vector is acquired. The data asset to be processed is obtained by the vehicle processing multimodal data using a multimodal fusion model based on gated attention. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0215] The data assets to be processed are input into the spatiotemporal large model to obtain the target value stream data of the data assets to be processed within a preset time period in the future. The spatiotemporal large model is trained based on the physical constraints of vehicles, historical data assets, and labeled value stream data.

[0216] Based on the target value stream data, a multi-agent deep deterministic policy gradient algorithm is used to determine the service policy information applied to the vehicle.

[0217] In one or more embodiments, based on the target value stream data, a multi-agent deep deterministic policy gradient algorithm is used to determine service policy information applied to the vehicle, including:

[0218] Construct a game-theoretic environment that includes vehicle-side intelligent agents, user intelligent agents, and platform intelligent agents;

[0219] The target value stream data is input into the game environment as shared global state information;

[0220] A multi-agent deep deterministic policy gradient algorithm is adopted to determine the service policy information of vehicles in a game environment based on target value stream data and global state information.

[0221] In one or more embodiments, a multi-agent deep deterministic policy gradient algorithm is employed to determine vehicle service policy information in a game environment based on target value stream data and global state information, including:

[0222] A multi-agent deep deterministic policy gradient algorithm is adopted to coordinate and optimize the policy network of each agent by utilizing global state information during the centralized training phase of the game environment.

[0223] After coordination and optimization, in the distributed execution phase of the game environment, each agent generates actions based on its own observation information, and obtains the Pareto optimal equilibrium strategy through multiple rounds of negotiation.

[0224] The Pareto optimal equilibrium strategy is determined as the service strategy information.
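The split between centralized training and distributed execution can be illustrated by the information each component is allowed to see. The sketch below is not a full multi-agent deep deterministic policy gradient implementation: it omits the replay buffer, target networks, and gradient updates, and uses hypothetical linear actors and critics with made-up weights. It only shows that each actor acts on its own observation, while each critic, used during training, receives the shared global state and the actions of all agents.

```python
import math
from typing import Dict, List

AGENTS = ["vehicle", "user", "platform"]


def actor(weights: List[float], observation: List[float]) -> float:
    """Deterministic policy: maps an agent's own observation to an action in (-1, 1)."""
    return math.tanh(sum(w * o for w, o in zip(weights, observation)))


def centralized_critic(weights: List[float], global_state: List[float], actions: List[float]) -> float:
    """Training-time critic: scores the joint action given the shared global state."""
    inputs = global_state + actions
    return sum(w * x for w, x in zip(weights, inputs))


# Hypothetical global state, e.g. the predicted value stream for two future steps.
global_state = [0.4, 0.7]
# Hypothetical local observations; each agent sees only its own entry.
observations: Dict[str, List[float]] = {
    "vehicle": [0.4, 0.1],
    "user": [0.7, -0.2],
    "platform": [0.4, 0.7],
}
actor_weights: Dict[str, List[float]] = {
    "vehicle": [0.5, -0.3],
    "user": [0.2, 0.8],
    "platform": [-0.1, 0.4],
}
critic_weights: Dict[str, List[float]] = {name: [0.1, 0.2, 0.3, 0.3, 0.3] for name in AGENTS}

# Distributed execution: each actor uses only its own observation.
actions = [actor(actor_weights[name], observations[name]) for name in AGENTS]

# Centralized training signal: each critic sees the global state and every action.
q_values = {
    name: centralized_critic(critic_weights[name], global_state, actions)
    for name in AGENTS
}
```

In a complete implementation each critic's output would drive the gradient update of the corresponding actor. After training, only the actors are needed, which is why execution can be distributed.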

[0225] In one or more embodiments, the following is also performed:

[0226] Send service policy information to vehicles and obtain real-time feedback data after vehicles execute service policy information;

[0227] Update the spatiotemporal large model and / or the policy networks of individual agents based on real-time feedback data.

[0228] In one or more embodiments, before inputting the data asset to be processed into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a preset future time period, the following is also performed:

[0229] Acquire historical data assets and the corresponding labeled value stream data of historical data assets;

[0230] Based on physical constraints, historical data assets, and labeled value stream data, a multi-scale spatiotemporal attention network is trained to obtain the spatiotemporal large model.
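One common way to bring physical constraints into training is to add a penalty term to the prediction loss. The sketch below assumes this approach and uses a stand-in constraint (the predicted value may not change by more than `max_step` between consecutive time steps). The embodiments do not state which constraints are used or how they enter the training objective, so the constraint, the function name `constrained_loss`, and its parameters are hypothetical.

```python
from typing import List


def constrained_loss(
    pred: List[float],
    target: List[float],
    max_step: float = 0.2,
    penalty_weight: float = 1.0,
) -> float:
    """Mean squared error plus a penalty for violating a rate-of-change limit."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    # Penalize only the part of each step that exceeds the allowed change.
    violation = sum(
        max(0.0, abs(b - a) - max_step) ** 2 for a, b in zip(pred, pred[1:])
    )
    return mse + penalty_weight * violation


smooth = [0.1, 0.2, 0.3]
jumpy = [0.1, 0.9, 0.3]
loss_smooth = constrained_loss(smooth, smooth)
loss_jumpy = constrained_loss(jumpy, jumpy)
```

Both predictions match their labels exactly, so the data term is zero in each case. Only the prediction that breaks the assumed limit receives a nonzero loss, which pushes the model toward physically plausible value streams.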

[0231] 2) When the electronic device is a vehicle:

[0232] Acquire multimodal data, including CAN bus data, sensor data, and interaction data;

[0233] A multimodal fusion model based on gated attention processes multimodal data to obtain data assets to be processed that represent embodied perception feature vectors. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0234] The data assets to be processed are uploaded to the server, which then inputs them into the spatiotemporal large model to obtain the target value stream data of the data assets to be processed within a preset time period in the future. The target value stream data is used to determine the service strategy information applied to the vehicle. The spatiotemporal large model is trained based on the physical constraints of the vehicle, historical data assets, and labeled value stream data.

[0235] In one or more embodiments, before processing the multimodal data using a gated attention-based multimodal fusion model to obtain the data asset to be processed, representing the embodied perception feature vector, the following is also performed:

[0236] Spatiotemporal synchronization processing is performed on the multimodal data to obtain the processed multimodal data;

[0237] The processed multimodal data is filtered according to a preset filtering strategy to obtain filtered multimodal data. The preset filtering strategy is determined based on data quality and / or data redundancy.

[0238] In one or more embodiments, uploading the data asset to be processed to the server includes:

[0239] Based on the preset value dimensions, the data assets to be processed are uploaded to the server. The preset value dimensions are determined based on security relevance, economic relevance, and timeliness relevance.

[0240] In one or more embodiments, uploading the data asset to be processed to the server according to a preset value dimension includes:

[0241] Based on the preset value dimensions, the data assets to be processed are valued and the evaluation results are obtained.

[0242] If the assessment results indicate that the data is related to safety, the local engine in the vehicle makes a decision on the data asset to be processed and uploads the decision result to the server as the data asset to be processed.

[0243] If the assessment results indicate that the data is of high value and urgent importance, upload the data assets to be processed to the server;

[0244] If the assessment result indicates routine data, upload the data assets to be processed to the server based on the communication network status between the server and the vehicle.

[0245] In one or more embodiments, the following is also performed:

[0246] Obtain the service policy information issued by the server and execute it;

[0247] The real-time feedback data after the service policy information is executed is uploaded to the server.

[0248] As can be seen from the above, the electronic device provided in this embodiment of the invention can acquire the data assets to be processed determined based on multimodal data collected from vehicles, and input them into a spatiotemporal large model trained by integrating vehicle physical constraints, historical data assets, and labeled value stream data. This enables effective prediction of the target data asset value stream in future time periods, forward-looking capture of the dynamic change patterns of data assets, and deep modeling of high-dimensional spatiotemporal correlations. Furthermore, by using a multi-agent deep deterministic policy gradient algorithm to optimize the predicted value stream data, service policy information adapted to multi-vehicle collaborative scenarios can be generated. This not only improves the response accuracy and adaptability of service policies to dynamic traffic environments and vehicle physical characteristics, but also realizes end-to-end value transformation from data assets to decision output. Thus, under the premise of ensuring safety and efficiency constraints, it enhances the collaborativeness and forward-looking decision-making capabilities of in-vehicle services.

[0249] This invention provides a computer-readable storage medium storing at least one executable instruction that, when executed on a vehicle service optimization device / electronic device, causes the vehicle service optimization device / electronic device to perform the vehicle service optimization method in any of the above method embodiments.

[0250] Specifically, the executable instructions can be used to cause the vehicle service optimization device / electronic device to perform the following operations:

[0251] 1) When the electronic device is a server:

[0252] In response to a preset trigger condition, a data asset to be processed representing an embodied perception feature vector is acquired. The data asset to be processed is obtained by the vehicle processing multimodal data using a multimodal fusion model based on gated attention. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0253] The data assets to be processed are input into the spatiotemporal large model to obtain the target value stream data of the data assets to be processed within a preset time period in the future. The spatiotemporal large model is trained based on the physical constraints of vehicles, historical data assets, and labeled value stream data.

[0254] Based on the target value stream data, a multi-agent deep deterministic policy gradient algorithm is used to determine the service policy information applied to the vehicle.

[0255] In one or more embodiments, based on the target value stream data, a multi-agent deep deterministic policy gradient algorithm is used to determine service policy information applied to the vehicle, including:

[0256] Construct a game-theoretic environment that includes vehicle-side intelligent agents, user intelligent agents, and platform intelligent agents;

[0257] The target value stream data is input into the game environment as shared global state information;

[0258] A multi-agent deep deterministic policy gradient algorithm is adopted to determine the service policy information of vehicles in a game environment based on target value stream data and global state information.

[0259] In one or more embodiments, a multi-agent deep deterministic policy gradient algorithm is employed to determine vehicle service policy information in a game environment based on target value stream data and global state information, including:

[0260] A multi-agent deep deterministic policy gradient algorithm is adopted to coordinate and optimize the policy network of each agent by utilizing global state information during the centralized training phase of the game environment.

[0261] After coordination and optimization, in the distributed execution phase of the game environment, each agent generates actions based on its own observation information, and obtains the Pareto optimal equilibrium strategy through multiple rounds of negotiation.

[0262] The Pareto optimal equilibrium strategy is determined as the service strategy information.

[0263] In one or more embodiments, the following is also performed:

[0264] Send service policy information to vehicles and obtain real-time feedback data after vehicles execute service policy information;

[0265] Update the spatiotemporal large model and / or the policy networks of individual agents based on real-time feedback data.

[0266] In one or more embodiments, before inputting the data asset to be processed into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a preset future time period, the following is also performed:

[0267] Acquire historical data assets and the corresponding labeled value stream data of historical data assets;

[0268] Based on physical constraints, historical data assets, and labeled value stream data, a multi-scale spatiotemporal attention network is trained to obtain the spatiotemporal large model.

[0269] 2) When the electronic device is a vehicle:

[0270] Acquire multimodal data, including CAN bus data, sensor data, and interaction data;

[0271] A multimodal fusion model based on gated attention processes multimodal data to obtain data assets to be processed that represent embodied perception feature vectors. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data.

[0272] The data assets to be processed are uploaded to the server, which then inputs them into the spatiotemporal large model to obtain the target value stream data of the data assets to be processed within a preset time period in the future. The target value stream data is used to determine the service strategy information applied to the vehicle. The spatiotemporal large model is trained based on the physical constraints of the vehicle, historical data assets, and labeled value stream data.

[0273] In one or more embodiments, before processing the multimodal data using a gated attention-based multimodal fusion model to obtain the data asset to be processed, representing the embodied perception feature vector, the following is also performed:

[0274] Spatiotemporal synchronization processing is performed on the multimodal data to obtain the processed multimodal data;

[0275] The processed multimodal data is filtered according to a preset filtering strategy to obtain filtered multimodal data. The preset filtering strategy is determined based on data quality and / or data redundancy.

[0276] In one or more embodiments, uploading the data asset to be processed to the server includes:

[0277] Based on the preset value dimensions, the data assets to be processed are uploaded to the server. The preset value dimensions are determined based on security relevance, economic relevance, and timeliness relevance.

[0278] In one or more embodiments, uploading the data asset to be processed to the server according to a preset value dimension includes:

[0279] Based on the preset value dimensions, the data assets to be processed are valued and the evaluation results are obtained.

[0280] If the assessment results indicate that the data is related to safety, the local engine in the vehicle makes a decision on the data asset to be processed and uploads the decision result to the server as the data asset to be processed.

[0281] If the assessment results indicate that the data is of high value and urgent importance, upload the data assets to be processed to the server;

[0282] If the assessment result indicates routine data, upload the data assets to be processed to the server based on the communication network status between the server and the vehicle.

[0283] In one or more embodiments, the following is also performed:

[0284] Obtain the service policy information issued by the server and execute it;

[0285] The real-time feedback data after the service policy information is executed is uploaded to the server.

[0286] As can be seen from the above, the vehicle service optimization device / electronic device provided in this embodiment of the invention can acquire the data assets to be processed based on multimodal data collected from vehicles, and input them into a spatiotemporal large model trained by integrating vehicle physical constraints, historical data assets, and labeled value stream data. This can effectively predict the value stream of target data assets in future time periods, and achieve forward-looking capture of the dynamic change law of data assets and deep modeling of high-dimensional spatiotemporal correlation. Furthermore, by using a multi-agent deep deterministic policy gradient algorithm to optimize the predicted value stream data, service policy information adapted to multi-vehicle collaborative scenarios can be generated. This not only improves the response accuracy and adaptability of service policies to dynamic traffic environments and vehicle physical characteristics, but also realizes end-to-end value transformation from data assets to decision output. Thus, under the premise of ensuring safety and efficiency constraints, it enhances the collaboration and forward-looking decision-making capabilities of in-vehicle services.

[0287] This invention provides a computer program product, including a computer program that, when executed by a processor, implements the operation of the above-described vehicle service optimization method.

[0288] Its implementation principle and technical effects are as disclosed above.

[0289] The description of the various embodiments above tends to emphasize the differences between them. For parts that are the same or similar, the embodiments may be referred to one another, and for the sake of brevity they are not repeated here.

[0290] The methods disclosed in the various method embodiments provided by this invention can be arbitrarily combined without conflict to obtain new method embodiments.

[0291] The features disclosed in the various product embodiments provided by this invention can be arbitrarily combined without conflict to obtain new product embodiments.

[0292] The features disclosed in the various method or device embodiments provided by the present invention can be arbitrarily combined without conflict to obtain new method or device embodiments.

[0293] It should be noted that the aforementioned computer-readable storage medium can be ROM, Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Ferromagnetic Random Access Memory (FRAM), Flash Memory, Magnetic Surface Memory, Optical Disc, or Compact Disc Read-Only Memory (CD-ROM), etc. It can also be any of various devices that include one or any combination of the above-mentioned storage media.

[0294] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element.

[0295] The sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0296] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform. Of course, they can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, vehicle terminal, or network device, etc.) to execute the methods described in the various embodiments of the present invention.

[0297] This invention is described with reference to flowchart illustrations and / or block diagrams of methods, apparatuses, devices, and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0298] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0299] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram. The algorithms or displays provided herein are not inherently related to any particular computer, virtual system, or other device. Furthermore, the embodiments of this invention are not directed to any particular programming language.

[0300] It should be noted that the above embodiments are illustrative of the invention and not restrictive, and that those skilled in the art can devise alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses should not be construed as limiting the claims. The word "comprising" does not exclude the presence of elements or steps not listed in the claims. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by the same item of hardware. The use of the words first, second, and third, etc., does not indicate any order. These words can be interpreted as names. The steps in the above embodiments, unless otherwise specified, should not be construed as limiting the order of execution.

Claims

1. A method for optimizing vehicle services, characterized in that, applied to a server, the method includes: In response to a preset triggering condition, a data asset to be processed representing an embodied perception feature vector is acquired. The data asset to be processed is obtained by the vehicle processing multimodal data using a multimodal fusion model based on gated attention. The multimodal fusion model is used to dynamically adjust the weight ratio of different modal data and fuse the adjusted data. The data asset to be processed is input into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a future preset time period. The spatiotemporal large model is trained based on the physical constraints of the vehicle, historical data assets, and labeled value stream data. Based on the target value stream data, a multi-agent deep deterministic policy gradient algorithm is used to determine the service policy information applied to the vehicle.

2. The method according to claim 1, characterized in that, The step of determining the service policy information applied to the vehicle based on the target value stream data using a multi-agent deep deterministic policy gradient algorithm includes: Construct a game-theoretic environment that includes vehicle-side intelligent agents, user intelligent agents, and platform intelligent agents; The target value stream data is input into the game environment as shared global state information; Using the multi-agent deep deterministic policy gradient algorithm, the service policy information of the vehicle is determined in the game environment based on the target value stream data and the global state information.

3. The method according to claim 2, characterized in that, The process of employing the multi-agent deep deterministic policy gradient algorithm to determine the vehicle's service policy information in the game environment based on the target value stream data and the global state information includes: The multi-agent deep deterministic policy gradient algorithm is used to coordinate and optimize the policy networks of each agent during the centralized training phase of the game environment using the global state information. After the coordination optimization, in the distributed execution phase of the game environment, each agent generates actions based on its own observation information, and obtains the Pareto optimal equilibrium strategy through multiple rounds of negotiation. The Pareto optimal equilibrium strategy is determined as the service strategy information.

4. The method according to claim 3, characterized in that, The method further includes: Send the service policy information to the vehicle and obtain real-time feedback data after the vehicle executes the service policy information; Based on the real-time feedback data, update the spatiotemporal large model and / or the policy network of each agent.

5. The method according to any one of claims 1-4, characterized in that, Before inputting the data asset to be processed into the spatiotemporal large model to obtain the target value stream data of the data asset to be processed within a preset future time period, the method further includes: Obtain the historical data assets and the labeled value stream data corresponding to the historical data assets; Based on the physical constraints, the historical data assets, and the labeled value stream data, a multi-scale spatiotemporal attention network is trained to obtain the spatiotemporal large model.

6. A method for optimizing vehicle services, characterized in that, the method is applied to a vehicle and includes: acquiring multimodal data, the multimodal data including CAN bus data, sensor data, and interaction data; processing the multimodal data with a gated-attention-based multimodal fusion model to obtain a data asset to be processed that represents an embodied perception feature vector, wherein the multimodal fusion model is used to dynamically adjust the weight ratios of the different modalities and fuse the adjusted data; and uploading the data asset to be processed to a server, wherein the server is used to input the data asset to be processed into a spatiotemporal large model to obtain target value stream data of the data asset to be processed within a preset future time period, the target value stream data is used to determine service policy information applied to the vehicle, and the spatiotemporal large model is trained based on physical constraints of the vehicle, historical data assets, and labeled value stream data.
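The dynamic weighting and fusion step in claim 6 can be sketched as follows. The sketch assumes that each modality has already been encoded into a feature vector of the same length, that a gate score is a dot product between the feature vector and a gate parameter vector, and that the scores are normalized with a softmax. None of these choices is stated in the claim, and the names `gated_fusion`, `features`, and `gate_params` are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(
    features: Dict[str, List[float]],
    gate_params: Dict[str, List[float]],
) -> Tuple[Dict[str, float], List[float]]:
    """Weight each modality by a softmax-normalized gate score, then sum the vectors."""
    names = list(features)
    # One scalar gate score per modality (assumed form: dot product).
    scores = [sum(g * x for g, x in zip(gate_params[n], features[n])) for n in names]
    weights = softmax(scores)
    dim = len(features[names[0]])
    fused = [sum(w * features[n][i] for w, n in zip(weights, names)) for i in range(dim)]
    return dict(zip(names, weights)), fused


# Made-up feature vectors and gate parameters for the three modalities in the claim.
features = {
    "can_bus": [0.2, 0.4],
    "sensor": [0.9, 0.1],
    "interaction": [0.5, 0.5],
}
gate_params = {
    "can_bus": [1.0, 0.0],
    "sensor": [2.0, 0.0],
    "interaction": [0.0, 1.0],
}
weights, fused = gated_fusion(features, gate_params)
print(weights, fused)
```

Because the weights are recomputed from the current features on every call, a modality whose gate score rises contributes more to the fused vector, which is one way to read "dynamically adjust the weight ratios".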

7. The method according to claim 6, characterized in that, before processing the multimodal data with the gated-attention-based multimodal fusion model to obtain the data asset to be processed that represents the embodied perception feature vector, the method further includes: performing spatiotemporal synchronization on the multimodal data to obtain processed multimodal data; and filtering the processed multimodal data according to a preset filtering strategy to obtain filtered multimodal data, wherein the preset filtering strategy is determined based on data quality and/or data redundancy.
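The preprocessing in claim 7 can be sketched under simplifying assumptions: only the temporal part of the synchronization is shown (nearest-timestamp alignment to a reference clock), quality is a per-sample score in [0, 1], and redundancy means that no value changed noticeably since the last kept frame. The thresholds, the sample layout, and the helpers `synchronize` and `filter_frames` are hypothetical.

```python
from typing import Dict, List, Tuple

# One sample: (timestamp in seconds, value, quality score in [0, 1]).
Sample = Tuple[float, float, float]
Frame = Dict[str, Sample]


def synchronize(streams: Dict[str, List[Sample]], ticks: List[float]) -> List[Frame]:
    """For each reference tick, pick the nearest-in-time sample from every stream."""
    return [
        {name: min(samples, key=lambda s: abs(s[0] - t)) for name, samples in streams.items()}
        for t in ticks
    ]


def filter_frames(
    frames: List[Frame], min_quality: float = 0.5, min_change: float = 0.01
) -> List[Frame]:
    """Drop frames with a low-quality sample or with no change since the last kept frame."""
    kept: List[Frame] = []
    for frame in frames:
        if any(sample[2] < min_quality for sample in frame.values()):
            continue  # data quality criterion
        if kept and all(abs(frame[n][1] - kept[-1][n][1]) < min_change for n in frame):
            continue  # data redundancy criterion
        kept.append(frame)
    return kept


# Made-up streams: the sensor sample near t=0.1 has low quality,
# and the samples near t=0.3 repeat the values seen near t=0.2.
streams = {
    "can_bus": [(0.00, 10.0, 0.9), (0.10, 10.0, 0.9), (0.20, 12.0, 0.9), (0.30, 12.0, 0.9)],
    "sensor": [(0.02, 1.0, 0.8), (0.11, 1.0, 0.2), (0.19, 1.5, 0.8), (0.31, 1.5, 0.8)],
}
frames = synchronize(streams, ticks=[0.0, 0.1, 0.2, 0.3])
kept = filter_frames(frames)
print(len(frames), len(kept))
```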

8. The method according to claim 6 or 7, characterized in that, uploading the data asset to be processed to the server includes: uploading the data asset to be processed to the server according to preset value dimensions, wherein the preset value dimensions are determined based on safety relevance, economic relevance, and timeliness relevance.

9. The method according to claim 8, characterized in that, the step of uploading the data asset to be processed to the server according to the preset value dimensions includes: evaluating the value of the data asset to be processed according to the preset value dimensions to obtain an evaluation result; if the evaluation result indicates safety-related data requiring an instantaneous response, making a decision on the data asset to be processed with a local engine in the vehicle, and uploading the decision result to the server as the data asset to be processed; if the evaluation result indicates high-value urgent data, uploading the data asset to be processed to the server; and if the evaluation result indicates normal data, uploading the data asset to be processed to the server according to the status of the communication network between the server and the vehicle.
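The three branches of claim 9 can be sketched as a small routing function. The claim does not say how the evaluation result is computed or what "according to the status of the communication network" means in practice, so the sketch assumes per-dimension scores in [0, 1], a single threshold, and deferral of normal data when the network is poor. The names `Scores`, `Evaluation`, `evaluate`, and `route`, and the action labels, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Evaluation(Enum):
    SAFETY_INSTANT = "safety_instant"        # safety-related, needs instantaneous response
    HIGH_VALUE_URGENT = "high_value_urgent"  # high-value urgent data
    NORMAL = "normal"                        # normal data


@dataclass
class Scores:
    """Assumed scores in [0, 1] for the three preset value dimensions."""
    safety: float
    economic: float
    timeliness: float


def evaluate(scores: Scores, threshold: float = 0.7) -> Evaluation:
    """Map the value-dimension scores to one of the three evaluation results."""
    if scores.safety >= threshold and scores.timeliness >= threshold:
        return Evaluation.SAFETY_INSTANT
    if scores.economic >= threshold and scores.timeliness >= threshold:
        return Evaluation.HIGH_VALUE_URGENT
    return Evaluation.NORMAL


def route(evaluation: Evaluation, network_ok: bool) -> List[str]:
    """Return the ordered actions the vehicle takes for a data asset."""
    if evaluation is Evaluation.SAFETY_INSTANT:
        # Decide locally first, then upload the decision result as the data asset.
        return ["decide_with_local_engine", "upload_decision_result"]
    if evaluation is Evaluation.HIGH_VALUE_URGENT:
        return ["upload_now"]
    # Normal data: the upload depends on the network status (deferral is an assumption).
    return ["upload_now"] if network_ok else ["defer_upload"]


actions = route(evaluate(Scores(safety=0.9, economic=0.2, timeliness=0.8)), network_ok=False)
print(actions)
```

Handling the safety branch on the vehicle keeps the response independent of the network, while the normal branch leaves room to save bandwidth when the link is poor.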

10. The method according to claim 6 or 7, characterized in that, the method further includes: obtaining the service policy information issued by the server and executing the service policy information; and uploading, to the server, real-time feedback data generated after the service policy information is executed.

11. A vehicle service optimization device, characterized in that, the optimization device is applied to a server or a vehicle; when applied to the server, the optimization device is used to perform the operations of the vehicle service optimization method according to any one of claims 1-5; when applied to the vehicle, the optimization device is used to perform the operations of the vehicle service optimization method according to any one of claims 6-10.

12. An electronic device, characterized by comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other via the communication bus; the memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform the operations of the vehicle service optimization method according to any one of claims 1-10.