A method for energy management of hybrid vehicles

By combining deep reinforcement learning and adaptive equivalent fuel minimization strategy, the optimal reference trajectory of battery SOC is generated, which solves the problem of insufficient utilization of traffic information in the energy management of new energy vehicles and achieves efficient energy allocation and improved fuel economy.

CN116513150B · Active · Publication Date: 2026-06-30 · JIANGSU UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: JIANGSU UNIV
Filing Date: 2023-02-24
Publication Date: 2026-06-30


IPC: B60W20/11

AI Tagging

Technology Topics

Reference modeling New energy




AI Technical Summary

Technical Problem

Existing energy management strategies for new energy vehicles are difficult to effectively utilize traffic information in practical applications, resulting in insufficient fuel economy and real-time performance, as well as poor robustness and generalization.

Method used

By synchronously acquiring traffic conditions and vehicle status information, using deep reinforcement learning agents to train neural networks, and combining an adaptive equivalent fuel minimization strategy, the optimal reference trajectory for battery SOC is generated, enabling real-time optimization of energy management strategies.

Benefits of technology

It improves the real-time performance, robustness, and generalizability of energy management for new energy vehicles, enabling efficient energy distribution under complex traffic conditions and enhancing fuel economy.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides an energy management method for hybrid vehicles. First, traffic condition information and vehicle status information are simultaneously acquired and a database is generated. Then, the acquired information is used as the state space, and the reference SOC trajectory is used as the action space. Combined with a lower-level energy management algorithm, the upper-level deep reinforcement learning agent is trained until convergence. Finally, the neural network of the trained deep reinforcement learning agent is extracted as the optimal reference model for battery SOC. Based on the neural network model and combined with the lower-level energy management algorithm, the energy consumption of the entire vehicle is allocated. This invention not only solves the problem of the difficulty in determining the SOC trajectory in energy management strategies, but also improves the effectiveness, generalization, and stability of energy management strategies for new energy vehicles under actual road conditions, thus possessing many beneficial effects not found in existing technologies.


Description

Technical Field

[0001] This invention relates to the technical field of energy management for new energy vehicles, and specifically to an energy management method for hybrid vehicles.

Background Technology

[0002] With the rapid development of science and technology, the global car ownership is constantly increasing, leading to an energy crisis, exacerbating environmental pollution, and posing a threat to people's lives. On the one hand, traditional cars are limited to a single power source—the engine—and the energy utilization efficiency of the engine is difficult to improve. On the other hand, people are also anxious about the driving range of pure electric vehicles. Therefore, vehicles with multiple power sources have certain characteristics and advantages over traditional cars and electric cars in terms of power, fuel economy, driving comfort, and emissions, and have experienced rapid development in recent years.

[0003] Because vehicles with multiple power sources are generally equipped with at least one electric motor and other power sources, the design of their energy management strategies is crucial. The increasing prevalence of technologies such as the Global Positioning System (GPS), Intelligent Transportation Systems (ITS), and vehicle-to-everything (V2X) communication has given new energy vehicles more channels for acquiring external information, leading to more intelligent energy management strategies. Current research on energy management strategies in this field reveals limitations: rule-based strategies struggle to guarantee fuel economy; global optimization strategies require the whole driving cycle to be predicted in advance, which hinders real-time optimization; and instantaneous optimization strategies also have limitations, for example, the equivalence factor of the Equivalent Consumption Minimization Strategy (ECMS) is difficult to determine. To achieve better fuel economy, many researchers have begun to incorporate more road condition information as a reference for energy management strategies in order to optimize vehicle economy globally. Calculating future vehicle energy demand from traffic information facilitates real-time adjustment of vehicle power distribution and achieves good fuel economy. However, this approach ignores the inherent uncertainty of operating conditions and therefore has certain limitations.

[0004] In summary, existing research on energy management strategies for new energy vehicles is either based solely on standard operating conditions without considering traffic information, or considers only traffic signals that are difficult to obtain in practice, making it quite challenging to apply in real-world situations.

Summary of the Invention

[0005] To address the shortcomings and deficiencies in the current field, this invention provides a new energy vehicle energy management method based on traffic information and reinforcement learning, which can make full use of easily accessible traffic information and improve the real-time performance, robustness, and generalization of energy management strategies.

[0006] The technical solution provided by this invention is as follows:

[0007] A hybrid vehicle energy management method, characterized by the following steps:

[0008] 1) Synchronously acquire traffic condition information and vehicle status information, process them, and generate a database;

[0009] 2) Using the processed traffic condition information and vehicle status information as state inputs and the SOC reference quantity as action outputs, the deep reinforcement learning agent is trained to convergence based on the reward function of environmental feedback and combined with the lower-level energy management algorithm.

[0010] 3) The neural network extracted from the deep reinforcement learning agent will be used as the SOC trajectory generation model, and the neural network and the lower-level energy management algorithm will be combined as the energy management strategy for the energy consumption allocation of the whole vehicle.

[0011] As a further preferred embodiment of the present invention

[0012] In step 1),

[0013] The traffic information includes at least the average travel time for each road segment, the distance between each road segment, the average travel delay time for each road segment, the average number of stops for each road segment, the corresponding road surface gradient, and the total distance traveled.

[0014] The vehicle status information includes at least the corresponding vehicle speed, acceleration, and position information;

[0015] And among them,

[0016] The average travel time, distance, average travel delay, average number of stops, mileage, and vehicle location information for each road segment are obtained through a map app.

[0017] The corresponding road slope, vehicle speed, and vehicle acceleration information are obtained through vehicle sensors.

[0018] As a further preferred embodiment of the present invention

[0019] In step 2),

[0020] The deep reinforcement learning agent includes an upper-layer deep reinforcement learning algorithm, and the deep reinforcement learning algorithm is trained using training and testing datasets.

[0021] As a further preferred embodiment of the present invention

[0022] In step 2), before training the deep reinforcement learning agent to convergence using the processed traffic condition information and vehicle status information, the different information contained therein is first normalized.

[0023] After the deep reinforcement learning agent is trained to convergence, it is combined with a lower-level energy management algorithm as an energy management strategy, and the input and output during its application are kept consistent with those during training.
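The normalization step and the requirement that inputs stay consistent between training and application can be illustrated with a minimal sketch. Min-max scaling is an assumption here (the text only says the information is normalized), and the function names `fit_min_max` and `normalize` are hypothetical.

```python
def fit_min_max(rows):
    """Return one (min, max) pair per feature from equal-length feature rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def normalize(row, bounds):
    """Scale each feature into [0, 1]; a constant feature maps to 0.0."""
    scaled = []
    for value, (lo, hi) in zip(row, bounds):
        scaled.append(0.0 if hi == lo else (value - lo) / (hi - lo))
    return scaled


# Illustrative training rows: [travel time (s), segment length (m), stops]
train_rows = [
    [10.0, 500.0, 2.0],
    [30.0, 1500.0, 0.0],
    [20.0, 1000.0, 1.0],
]

# Bounds are fitted once on the training data ...
bounds = fit_min_max(train_rows)
# ... and reused unchanged at application time, so the network sees
# inputs on the same scale as during training.
scaled = normalize([20.0, 1000.0, 1.0], bounds)
```

Storing `bounds` alongside the extracted network is one simple way to keep the application-time inputs on the training scale.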

[0024] As a further preferred embodiment of the present invention

[0025] In step 2), during the process of training the deep reinforcement learning agent to convergence, the state input space is constructed according to the following formula:

[0026] $\mathrm{State} = \{SOC_{iref},\ SOC_{l},\ L_{l},\ T,\ W,\ S_{l},\ L,\ T_{td},\ T_{sd},\ G_{slope}\}$

[0027] where,

[0028] $SOC_{iref}$ is the SOC planned on the basis of driving mileage;

[0029] $SOC_{l}$ is the currently planned remaining battery charge;

[0030] $L_{l}$ is the remaining mileage, in m;

[0031] $T$ is the current torque, in Nm;

[0032] $W$ is the current rotational speed, in rad/s;

[0033] $S_{l}$ is the average travel time of the current road segment, in s;

[0034] $L$ is the distance of the current road segment, in m;

[0035] $T_{td}$ is the average travel delay time of the current road segment;

[0036] $T_{sd}$ is the average number of stops on the current road segment;

[0037] $G_{slope}$ is the current road surface slope, in %.

[0038] The average travel delay time $T_{td}$ of the current road segment is calculated using the following formula:

[0039] $T_{td} = S_{l} - L / V_{max}$

[0040] where,

[0041] $V_{max}$ is the desired driving speed, in m/s.
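As a rough illustration of the state construction above, the following sketch assembles the ten state components in the order of the formula and computes the travel delay as segment travel time minus free-flow time. The class name `SegmentState`, the helper `travel_delay`, and all numeric values are hypothetical.

```python
from dataclasses import dataclass, astuple


@dataclass
class SegmentState:
    soc_iref: float  # SOC planned from driving mileage, %
    soc_l: float     # currently planned remaining battery charge
    l_l: float       # remaining mileage, m
    torque: float    # T, Nm
    speed: float     # W, rad/s
    s_l: float       # average travel time of the current segment, s
    seg_len: float   # L, m
    t_td: float      # average travel delay of the current segment, s
    t_sd: float      # average number of stops on the current segment
    g_slope: float   # road surface slope, %


def travel_delay(s_l, seg_len, v_max):
    """T_td = S_l - L / V_max, with V_max the desired driving speed in m/s."""
    if v_max <= 0:
        raise ValueError("v_max must be positive")
    return s_l - seg_len / v_max


# A 1000 m segment that takes 120 s on average, desired speed 12.5 m/s:
delay = travel_delay(s_l=120.0, seg_len=1000.0, v_max=12.5)

state = SegmentState(
    soc_iref=62.0, soc_l=60.0, l_l=8000.0, torque=85.0, speed=210.0,
    s_l=120.0, seg_len=1000.0, t_td=delay, t_sd=1.5, g_slope=2.0,
)
state_vector = list(astuple(state))  # ordered as in the State formula
```

A positive delay means the segment is slower than free flow at the desired speed, which is the congestion signal the agent receives.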

[0042] As a further preferred embodiment of the present invention

[0043] In step 2), the action output space is constructed from the SOC reference value, and the calculation formula is as follows:

[0044] $\mathrm{Action} = \{\Delta SOC_{ref}\}$

[0045] where $\Delta SOC_{ref}$ is the current SOC reference value.

[0046] As a further preferred embodiment of the present invention

[0047] In step 2), when $\Delta SOC_{ref}$ is output as the action, a virtual reference SOC curve is set. The virtual reference SOC curve is selected based on the driving mileage planning and is determined according to the following calculation formula:

[0048]

[0049] where,

[0050] $SOC_{iref}$ is the SOC planned on the basis of driving mileage;

[0051] $SOC_{final}$ is the expected final value of the SOC, in %;

[0052] $SOC_{ini}$ is the initial value of the SOC, in %;

[0053] $v$ is the vehicle speed, in m/s.

[0054] As a further preferred embodiment of the present invention

[0055] Construct the reward function according to the following formula:

[0056] $R_1 = \alpha\,[f_{cost}(t) + w \cdot m_{cost}(t)]$

[0057] $R_2 = \beta\,[SOC_{iref}(t) - SOC(t)]^2$ if $(SOC(t) - SOC_{iref}(t))^2 > \varepsilon$, else $R_2 = 0$

[0058] $R_3 = \gamma\,[SOC(t) - SOC_{final}]$ if $SOC(t) - SOC_{final} < \sigma$, else $R_3 = 0$

[0059] $R = R_1 + R_2 + R_3$

[0060] where,

[0061] $R_1$, $R_2$, and $R_3$ are the component reward functions;

[0062] $f_{cost}$ is the fuel consumption cost, in yuan;

[0063] $m_{cost}$ is the electricity consumption cost, in yuan;

[0064] $w$ is the equivalence coefficient;

[0065] $SOC(t)$ is the actual SOC at time t, in %;

[0066] $SOC_{final}$ is the expected final value of the SOC, in %;

[0067] $\alpha$, $\beta$, $\gamma$, $\varepsilon$, and $\sigma$ are coefficients determined based on the vehicle and driving conditions;

[0068] $R$ is the sum of the reward functions.
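The three reward terms can be written out directly. The sketch below follows the formulas as stated; the signs and magnitudes of the coefficients are not given in the text, so the values used here (negative α and β so that cost and deviation act as penalties) are assumptions for illustration only.

```python
def reward(fuel_cost, elec_cost, soc, soc_iref, soc_final,
           *, alpha, beta, gamma, w, eps, sigma):
    """Return R = R1 + R2 + R3 for one time step."""
    # R1: weighted sum of fuel cost and equivalent electricity cost.
    r1 = alpha * (fuel_cost + w * elec_cost)

    # R2: applies only when the squared deviation from the virtual
    # reference SOC exceeds the threshold eps.
    deviation_sq = (soc_iref - soc) ** 2
    r2 = beta * deviation_sq if deviation_sq > eps else 0.0

    # R3: applies only when SOC(t) - SOC_final is below sigma,
    # which guards the lower bound on the SOC.
    gap = soc - soc_final
    r3 = gamma * gap if gap < sigma else 0.0

    return r1 + r2 + r3


# Assumed coefficient values, for illustration only.
coeffs = dict(alpha=-1.0, beta=-0.1, gamma=1.0, w=1.0, eps=9.0, sigma=-2.0)

# SOC is 4 points away from the reference (16 > eps), far above SOC_final.
r_off_track = reward(0.5, 0.2, soc=50.0, soc_iref=54.0, soc_final=30.0, **coeffs)

# SOC is within the tolerance band, so only the cost term R1 remains.
r_on_track = reward(0.5, 0.2, soc=53.0, soc_iref=54.0, soc_final=30.0, **coeffs)
```

With these assumed coefficients the off-track step receives a lower reward than the on-track step, which is the behaviour the thresholded penalty is meant to produce.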

[0069] As a further preferred embodiment of the present invention

[0070] The lower-level energy management algorithm employs an adaptive equivalent fuel minimization strategy and sets an equivalence factor for this strategy. The equivalence factor is calculated according to the following formula:

[0071]

[0072] where,

[0073] $\delta(t)$ is the equivalent factor at time t;

[0074] $\delta_s$ is the constant part of the equivalent factor;

[0075] $k_1$ and $k_2$ are proportionality coefficients;

[0076] $SOC_{ref}(t)$ is the reference SOC at time t, in %;

[0077] $SOC(t)$ is the actual SOC at time t, in %.

[0078] Alternatively, it is calculated using the following formula:

[0079]

[0080] As a further preferred embodiment of the present invention

[0081] In step 3), the energy management strategy adopts an adaptive equivalent fuel minimization strategy to allocate vehicle energy consumption according to the following formula:

[0082]

[0083] where,

[0084] $\min J$ is the minimum energy consumption of the whole vehicle, in g;

[0085] the engine fuel consumption at time t is expressed in g;

[0086] the energy consumption of the motor at time t is expressed in g.

[0087] The superior effects achieved by this invention compared to the prior art include:

[0088] 1) This invention provides an energy management method for hybrid vehicles, which enables efficient energy management of new energy vehicles based on traffic condition information and vehicle status information. This method innovatively integrates traffic condition information with vehicle status information, and, in situations where traffic data is difficult to obtain in practical applications, it can rationally plan the battery's State of Charge (SOC) using available information and deep reinforcement learning algorithms.

[0089] 2) This invention provides an energy management method for hybrid vehicles. First, traffic condition information and vehicle state information are extracted from real-world traffic scenarios; the processed information is used as the state input, and the State of Charge (SOC) reference value is used as the action output. After the agent of the upper-layer deep reinforcement learning algorithm has been trained to convergence in combination with a lower-layer energy management algorithm, its neural network is extracted and used as a model to generate the optimal reference trajectory for the battery SOC. Then, multi-dimensional traffic information for the target driving route is acquired in advance using map apps, so that the reference battery SOC increment or SOC trajectory can be calculated in real time. At the lower layer, the adaptive equivalent fuel minimization strategy (AECMS) can be used to allocate energy optimally within the vehicle, improving the real-time performance, robustness, generalization, and practicality of the energy management strategy.

Description of the Drawings

[0090] Figure 1 is a flowchart of the vehicle energy management method provided by the present invention.

[0091] Figure 2 is an application logic framework diagram of the vehicle energy management method provided by the present invention.

[0092] Figure 3 is a schematic diagram of the input and output when using a deep reinforcement learning algorithm for training in an embodiment of the present invention.

[0093] Figure 4 is a schematic diagram of the input and output during actual application in the embodiments of the present invention.

[0094] Figure 5 is a schematic diagram illustrating the principle of the energy management training model of the present invention.

[0095] Figure 6 is a diagram of the algorithm structure of the Deep Deterministic Policy Gradient (DDPG) algorithm used in this invention.

Detailed Implementation

[0096] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0097] In the description of this invention, it should be noted that the terms "upper," "lower," "inner," "outer," "front end," "rear end," "both ends," "one end," and "the other end," etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings. They are used only for the convenience of describing this invention and for simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, they should not be construed as limitations on this invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance.

[0098] In the description of this invention, it should be noted that, unless otherwise explicitly specified and limited, the terms "installed," "equipped with," "connected," etc., should be interpreted broadly. For example, "connection" can be a fixed connection, a detachable connection, or an integral connection; it can be a mechanical connection or an electrical connection; it can be a direct connection or an indirect connection through an intermediate medium; it can be a connection within two components. Those skilled in the art can understand the specific meaning of the above terms in this invention based on the specific circumstances.

[0099] The technical solution of the present invention is described below with reference to Figures 1 to 6. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort are within the scope of protection of the present invention.

[0100] As shown in Figure 1, this invention provides an energy management method for hybrid vehicles, specifically including the following steps:

[0101] Step 1): Synchronously acquire traffic condition information and vehicle status information, process them, and generate a database; in this step,

[0102] The traffic information includes at least the average travel time for each road segment, the distance between each road segment, the average travel delay time for each road segment, the average number of stops for each road segment, the corresponding road surface gradient, and the total distance traveled.

[0103] The vehicle status information includes at least the corresponding vehicle speed, acceleration, and position information;

[0104] And among them,

[0105] The average travel time, distance, average travel delay, average number of stops, mileage, and vehicle location information for each road segment are obtained through a map app.

[0106] The corresponding road slope, vehicle speed, and vehicle acceleration information are obtained through vehicle sensors.

[0107] Step 2): Using the processed traffic condition information and vehicle status information as state inputs, and the SOC reference value as action outputs, the deep reinforcement learning agent is trained to convergence based on the reward function from environmental feedback and combined with the lower-level energy management algorithm. The deep reinforcement learning agent includes an upper-level deep reinforcement learning algorithm, which is trained using training and testing datasets. Before training the deep reinforcement learning agent to convergence using the processed traffic condition information and vehicle status information in this step, the different information it contains is normalized. After training the deep reinforcement learning agent to convergence, the lower-level energy management algorithm is used as the energy management strategy, ensuring that the input and output during application remain consistent with those during training.

[0108] In this embodiment, the deep reinforcement learning algorithm selected is the DDPG algorithm. As shown in Figures 3 and 6, the DDPG algorithm flow is as follows:

[0109] (1) First initialize Actor and Critic;

[0110] (2) The DDPG algorithm includes a policy network and a Q network;

[0111] (3) Given the input state $S_t$, the Actor outputs action $A$; after action $A$ is applied to the environment, the environment returns the next state $S_{t+1}$ and the reward $R$, so this process can be represented by the tuple $(S_t, A, R, S_{t+1})$;

[0112] (4) The experience pool sampling adopts a priority sampling strategy;

[0113] (5) The target Actor network and the target Critic network are updated using a soft update method, while the current Actor network and the current Critic network are updated using a gradient update method;

[0114] (6) Regularization can be applied to the Actor network to avoid overfitting during training.
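Two of the steps listed above, priority sampling from the experience pool and the soft update of the target networks, can be sketched in a few lines. This is a simplified illustration rather than the embodiment's implementation: network parameters are represented as flat lists of floats, sampling is assumed to be proportional to a stored priority, and the names `soft_update`, `sample_prioritized`, and the step size `tau` are hypothetical.

```python
import random


def soft_update(target_params, current_params, tau):
    """Move each target parameter a fraction tau toward the current one."""
    return [tau * c + (1.0 - tau) * t
            for t, c in zip(target_params, current_params)]


def sample_prioritized(buffer, priorities, batch_size, rng):
    """Draw transitions with probability proportional to their priority."""
    return rng.choices(buffer, weights=priorities, k=batch_size)


# Toy transitions (S_t, A, R, S_t+1) stored as labelled tuples.
buffer = [("s0", 0.1, -1.0, "s1"), ("s1", 0.0, -0.5, "s2"), ("s2", -0.2, -3.0, "s3")]
priorities = [0.0, 0.0, 1.0]  # only the last transition can be drawn

rng = random.Random(0)
batch = sample_prioritized(buffer, priorities, batch_size=4, rng=rng)

# Target network parameters lag behind the current network parameters.
new_target = soft_update([0.0, 1.0], [1.0, 3.0], tau=0.5)
```

A small `tau` keeps the target networks changing slowly, which is the usual reason for preferring a soft update over copying the current networks outright.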

[0115] The traffic and vehicle status information in this step differs from that in step 1; it is obtained through calculation and processing of the information from step 1. Parameters such as the currently planned remaining battery power and the State of Charge (SOC) planned based on mileage need to be manually set. Furthermore, information such as vehicle dimensions, vehicle weight, vehicle powertrain, vehicle braking system, and vehicle electrical components is required to help build the vehicle model.

[0116] In this step, the specific process of using the processed traffic condition information and vehicle status information as state inputs is as follows: During the training of the deep reinforcement learning agent to convergence, the state input space is constructed according to the following formula:

[0117] $\mathrm{State} = \{SOC_{iref},\ SOC_{l},\ L_{l},\ T,\ W,\ S_{l},\ L,\ T_{td},\ T_{sd},\ G_{slope}\}$

[0118] where,

[0119] $SOC_{iref}$ is the SOC planned on the basis of driving mileage;

[0120] $SOC_{l}$ is the currently planned remaining battery charge;

[0121] $L_{l}$ is the remaining mileage, in m;

[0122] $T$ is the current torque, in Nm;

[0123] $W$ is the current rotational speed, in rad/s;

[0124] $S_{l}$ is the average travel time of the current road segment, in s;

[0125] $L$ is the distance of the current road segment, in m;

[0126] $T_{td}$ is the average travel delay time of the current road segment;

[0127] $T_{sd}$ is the average number of stops on the current road segment;

[0128] $G_{slope}$ is the current road surface slope, in %.

[0129] The average travel delay time $T_{td}$ of the current road segment is calculated using the following formula:

[0130] $T_{td} = S_{l} - L / V_{max}$

[0131] where,

[0132] $V_{max}$ is the desired driving speed, in m/s.

[0133] In this step, the specific process of using the SOC reference value as the action output is as follows: Construct the action output space from the SOC reference value, and the calculation formula is as follows:

[0134] $\mathrm{Action} = \{\Delta SOC_{ref}\}$

[0135] where $\Delta SOC_{ref}$ is the current SOC reference value. When $\Delta SOC_{ref}$ is output as the action, a virtual reference SOC curve is set. The virtual reference SOC curve is selected based on the driving mileage planning and is determined according to the following calculation formula:

[0136]

[0137] where,

[0138] $SOC_{iref}$ is the SOC planned on the basis of driving mileage;

[0139] $SOC_{final}$ is the expected final value of the SOC, in %;

[0140] $SOC_{ini}$ is the initial value of the SOC, in %;

[0141] $v$ is the vehicle speed, in m/s.
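A mileage-planned reference of this kind is often taken to fall linearly with the distance travelled. The sketch below assumes that linear form purely for illustration; it is not confirmed by the text here, and the function name `linear_reference_soc`, the time step `dt`, and the total trip distance are hypothetical.

```python
def linear_reference_soc(soc_ini, soc_final, speeds, dt, total_distance):
    """Assumed form: SOC drops from soc_ini to soc_final in proportion
    to the distance covered, obtained by integrating speed over time."""
    if total_distance <= 0:
        raise ValueError("total_distance must be positive")
    travelled = 0.0
    reference = []
    for v in speeds:
        travelled += v * dt                       # distance so far, m
        fraction = min(travelled / total_distance, 1.0)
        reference.append(soc_ini - (soc_ini - soc_final) * fraction)
    return reference


# 40 m trip at a constant 10 m/s, sampled once per second.
soc_ref_curve = linear_reference_soc(
    soc_ini=80.0, soc_final=30.0, speeds=[10.0] * 4, dt=1.0, total_distance=40.0
)
```

Under this assumption the curve reaches the expected final SOC exactly when the planned distance has been covered, regardless of how long the trip takes.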

[0142] In this step, the reward function serves as a guide for the deep reinforcement learning agent during training; it represents its objective function. The exploration in deep reinforcement learning aims to optimize this objective function. The reward function for environmental feedback is constructed according to the following formula:

[0143] When constructing the reward function R1, the cost function can be established by the sum of the vehicle's fuel consumption and the equivalent electricity consumption cost:

[0144] $R_1 = \alpha\,[f_{cost}(t) + w \cdot m_{cost}(t)]$

[0145] It should be noted that R1 can also be in the form of an exponential function.

[0146] When constructing the reward function R2, it should be noted that R2 only penalizes the output reference SOC when it deviates from the virtual reference SOC by a certain range. A relatively large penalty can be applied, which can help the deep reinforcement learning training converge quickly. The formula is as follows:

[0147] $R_2 = \beta\,[SOC_{iref}(t) - SOC(t)]^2$ if $(SOC(t) - SOC_{iref}(t))^2 > \varepsilon$, else $R_2 = 0$

[0149] When constructing the reward function R3, it should be noted that R3 only penalizes the output if it falls a certain distance below the given final SOC, thus guaranteeing a lower bound on the SOC.

[0150] $R_3 = \gamma\,[SOC(t) - SOC_{final}]$ if $SOC(t) - SOC_{final} < \sigma$, else $R_3 = 0$

[0151] $R = R_1 + R_2 + R_3$

[0152] where,

[0153] $R_1$, $R_2$, and $R_3$ are the component reward functions;

[0154] $f_{cost}$ is the fuel consumption cost, in yuan;

[0155] $m_{cost}$ is the electricity consumption cost, in yuan;

[0156] $w$ is the equivalence coefficient;

[0157] $SOC(t)$ is the actual SOC at time t, in %;

[0158] $SOC_{final}$ is the expected final value of the SOC, in %;

[0159] $\alpha$, $\beta$, $\gamma$, $\varepsilon$, and $\sigma$ are coefficients determined based on the vehicle and driving conditions;

[0160] $R$ is the sum of the reward functions.

[0161] Step 3): The neural network extracted from the deep reinforcement learning agent is used as the SOC reference generation model, which produces the optimal reference trajectory. This neural network is then combined with the lower-level energy management algorithm as the energy management strategy for the vehicle's energy consumption allocation. The SOC reference generation model in this step is shown in Figure 5.

[0162] The lower-level energy management algorithm in this step uses an adaptive equivalent fuel minimization strategy, and sets an equivalence factor for the adaptive equivalent fuel minimization strategy.

[0163] In fact, once the reference SOC and an initial value are given, the equivalent factor can be calculated directly from the reference SOC. Therefore, the equivalent factor can be derived from the reference SOC given by deep reinforcement learning, and energy consumption allocation can be achieved according to the adaptive equivalent fuel minimization strategy.

[0164] The equivalent factor is calculated according to the following formula:

[0165]

[0166] where,

[0167] $\delta(t)$ is the equivalent factor at time t;

[0168] $\delta_s$ is the constant part of the equivalent factor;

[0169] $k_1$ and $k_2$ are proportionality coefficients;

[0170] $SOC_{ref}(t)$ is the reference SOC at time t, in %;

[0171] $SOC(t)$ is the actual SOC at time t, in %.

[0172] Alternatively, it is calculated using the following formula:

[0173]

[0174] where,

[0175] $\delta(t)$ is the equivalent factor at time t.
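To make the role of the listed symbols concrete, the sketch below adapts the equivalent factor with a proportional-integral correction on the SOC tracking error. This PI form is an assumption commonly seen in adaptive ECMS work and is not taken from the formulas of this document; the class name `EquivalenceFactor` and all numeric values are hypothetical.

```python
class EquivalenceFactor:
    """Assumed PI adaptation: delta = delta_s + k1 * e + k2 * integral(e),
    where e = SOC_ref(t) - SOC(t)."""

    def __init__(self, delta_s, k1, k2):
        self.delta_s = delta_s   # constant part of the equivalent factor
        self.k1 = k1             # proportional coefficient
        self.k2 = k2             # integral coefficient
        self._integral = 0.0     # accumulated tracking error

    def update(self, soc_ref, soc, dt):
        error = soc_ref - soc
        self._integral += error * dt
        return self.delta_s + self.k1 * error + self.k2 * self._integral


factor = EquivalenceFactor(delta_s=2.5, k1=0.5, k2=0.25)

# Actual SOC is 2 points below the reference: electricity becomes
# more "expensive", which pushes the split toward the engine.
delta_below = factor.update(soc_ref=60.0, soc=58.0, dt=1.0)

# SOC is back on the reference; only the accumulated error remains.
delta_on_ref = factor.update(soc_ref=60.0, soc=60.0, dt=1.0)
```

The point of the sketch is only the direction of the adjustment: a SOC below the reference raises the factor, and a SOC above it lowers the factor.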

[0176] In practical applications, as shown in Figures 2 and 4, the neural network extracted from the trained deep reinforcement learning agent replaces the agent as the generation model for the optimal reference trajectory of the battery's state of charge (SOC). The neural network and the lower-level energy management algorithm work together as the energy management strategy. The required input information can be obtained from the connected platform or from sensors and fed to the extracted neural network. AECMS is then employed in the lower layer of energy management to allocate power between the power sources.

[0177] Since ECMS is highly sensitive to the equivalence factor, a strategy is needed to regulate it. Therefore, after obtaining the equivalence factor, the energy management strategy adopts an adaptive equivalent fuel minimization strategy to allocate vehicle energy consumption according to the following formula:

[0178]

[0179] where,

[0180] $\min J$ is the minimum energy consumption of the whole vehicle, in g;

[0181] the engine fuel consumption at time t is expressed in g;

[0182] the energy consumption of the motor at time t is expressed in g.
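The instantaneous allocation can be pictured as a search over candidate power splits. The sketch below assumes the usual ECMS cost, engine fuel rate plus the equivalent factor times the motor's fuel-equivalent consumption, and uses toy linear consumption models; the function `ecms_split`, the models, and every number are hypothetical and only illustrate how the equivalent factor shifts the split.

```python
def ecms_split(p_demand, delta, fuel_rate, elec_rate, steps=20):
    """Grid-search the motor share of p_demand that minimizes
    fuel_rate(p_engine) + delta * elec_rate(p_motor).
    Returns (cost, p_engine, p_motor)."""
    best = None
    for i in range(steps + 1):
        p_motor = p_demand * i / steps
        p_engine = p_demand - p_motor
        cost = fuel_rate(p_engine) + delta * elec_rate(p_motor)
        if best is None or cost < best[0]:
            best = (cost, p_engine, p_motor)
    return best


# Toy consumption models in g/s per kW (illustrative only).
def toy_fuel_rate(p_kw):
    return 0.08 * p_kw


def toy_elec_rate(p_kw):
    return 0.03 * p_kw


# Low equivalent factor: electricity is cheap, the motor takes the load.
_, eng_low, mot_low = ecms_split(20, 2.0, toy_fuel_rate, toy_elec_rate)

# High equivalent factor: electricity is expensive, the engine takes over.
_, eng_high, mot_high = ecms_split(20, 4.0, toy_fuel_rate, toy_elec_rate)
```

With linear toy models the optimum always sits at one end of the range; realistic engine and motor maps are nonlinear, so the minimum generally falls at an intermediate split.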

[0183] The embodiments of this invention do not imply a specific order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this invention. This invention is applicable to multi-power source new energy vehicles.

Claims

1. A hybrid vehicle energy management method, characterized by including the following steps: 1) synchronously acquiring traffic condition information and vehicle status information, processing them, and generating a database; 2) using the processed traffic condition information and vehicle status information as state inputs and the SOC reference quantity as the action output, training the deep reinforcement learning agent to convergence based on the reward function of environmental feedback and in combination with the lower-level energy management algorithm; 3) using the neural network extracted from the deep reinforcement learning agent as the SOC trajectory generation model, and combining the neural network with the lower-level energy management algorithm as the energy management strategy to allocate the energy consumption of the whole vehicle.
In step 2), during the process of training the deep reinforcement learning agent to convergence, the state input space is constructed according to the following formula: $\mathrm{State} = \{SOC_{iref},\ SOC_{l},\ L_{l},\ T,\ W,\ S_{l},\ L,\ T_{td},\ T_{sd},\ G_{slope}\}$; where $SOC_{iref}$ is the SOC planned on the basis of driving mileage; $SOC_{l}$ is the currently planned remaining battery charge; $L_{l}$ is the remaining mileage, in m; $T$ is the current torque, in Nm; $W$ is the current rotational speed, in rad/s; $S_{l}$ is the average travel time of the current road segment, in s; $L$ is the distance of the current road segment, in m; $T_{td}$ is the average travel delay time of the current road segment; $T_{sd}$ is the average number of stops on the current road segment; and $G_{slope}$ is the current road slope.
The average travel delay time of the current road segment is calculated according to the following formula: $T_{td} = S_{l} - L / V_{max}$; where $V_{max}$ is the desired driving speed, in m/s.
The reward function is constructed according to the following formula: $R_1 = \alpha\,[f_{cost}(t) + w \cdot m_{cost}(t)]$; $R_2 = \beta\,[SOC_{iref}(t) - SOC(t)]^2$ if $(SOC(t) - SOC_{iref}(t))^2 > \varepsilon$, else $R_2 = 0$; $R_3 = \gamma\,[SOC(t) - SOC_{final}]$ if $SOC(t) - SOC_{final} < \sigma$, else $R_3 = 0$; $R = R_1 + R_2 + R_3$; where $R_1$, $R_2$, and $R_3$ are reward functions; $f_{cost}$ is the fuel consumption cost, in yuan; $m_{cost}$ is the electricity consumption cost, in yuan; $w$ is the equivalence coefficient; $SOC(t)$ is the actual SOC at time t, in %; $SOC_{final}$ is the expected final value of the SOC, in %; $\alpha$, $\beta$, $\gamma$, $\varepsilon$, and $\sigma$ are coefficients determined based on the vehicle and its operating conditions; and $R$ is the sum of the reward functions.
The lower-level energy management algorithm employs an adaptive equivalent fuel minimization strategy and sets an equivalence factor for this strategy. The equivalence factor is calculated according to the following formula: ; where $\delta(t)$ is the equivalent factor at time t; $\delta_s$ is the constant part of the equivalent factor; $k_1$ and $k_2$ are proportionality coefficients; $SOC_{ref}(t)$ is the reference SOC at time t, in %; and $SOC(t)$ is the actual SOC at time t, in %; or is calculated using the following formula: .

2. The energy management method for a hybrid vehicle according to claim 1, characterized in that: in step 1), the traffic information includes at least the average travel time for each road segment, the distance of each road segment, the average travel delay time for each road segment, the average number of stops for each road segment, the corresponding road surface gradient, and the total distance traveled; the vehicle status information includes at least the corresponding vehicle speed, acceleration, and position information; wherein the average travel time, distance, average travel delay, average number of stops, mileage, and vehicle location information for each road segment are obtained through a map app, and the corresponding road slope, vehicle speed, and vehicle acceleration information are obtained through vehicle sensors.

3. The energy management method for a hybrid vehicle according to claim 1, characterized in that: In step 2), The deep reinforcement learning agent includes an upper-layer deep reinforcement learning algorithm, and the deep reinforcement learning algorithm is trained using training and testing datasets.

4. The energy management method for a hybrid vehicle according to claim 3, characterized in that: In step 2), before training the deep reinforcement learning agent to convergence using the processed traffic condition information and vehicle status information, the different information contained therein is first normalized. After the deep reinforcement learning agent is trained to convergence, it is combined with a lower-level energy management algorithm as an energy management strategy, and the input and output during its application are kept consistent with those during training.

5. The energy management method for a hybrid vehicle according to claim 3, characterized in that: in step 2), the action output space is constructed from the SOC reference value, and the calculation formula is as follows: $\mathrm{Action} = \{\Delta SOC_{ref}\}$; where $\Delta SOC_{ref}$ is the current SOC reference value.

6. The energy management method for a hybrid vehicle according to claim 5, characterized in that: in step 2), when $\Delta SOC_{ref}$ is output as the action, a virtual reference SOC curve is set. The virtual reference SOC curve is selected based on the driving mileage planning and is determined according to the following calculation formula: ; where $SOC_{iref}$ is the SOC planned on the basis of driving mileage; $SOC_{final}$ is the expected final value of the SOC, in %; $SOC_{ini}$ is the initial value of the SOC, in %; and $v$ is the vehicle speed, in m/s.

7. The energy management method for a hybrid vehicle according to claim 1, characterized in that: in step 3), the energy management strategy adopts an adaptive equivalent fuel minimization strategy to allocate vehicle energy consumption according to the following formula: ; where $\min J$ is the minimum energy consumption of the whole vehicle, in g; the engine fuel consumption at time t is expressed in g; and the energy consumption of the motor at time t is expressed in g.

Citation Information

Patent Citations

Hybrid train energy management method and system based on deep reinforcement learning
CN112116156A
Hybrid electric vehicle hierarchical prediction energy management method fused with deep reinforcement learning
CN113525396A
