
Hybrid power system energy management strategy based on reverse deep reinforcement learning

A technology relating to reinforcement learning and energy management, applied to general control systems, control/regulation systems, instruments, etc. It addresses the problem that global optimization strategies cannot be applied online, and achieves effective deep reinforcement learning with good real-time performance and fast computation speed.

Active Publication Date: 2020-07-03
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

However, global optimization energy management strategies are designed only for known driving cycles and cannot be applied online. Traditional reinforcement learning methods perform well on tasks with small state and action spaces, but they become powerless when the dimensionality of the state and action spaces is high (Chen Xiliang, Cao Lei, He Ming, Li Chenxi, Xu Zhixiong. A review of deep reverse reinforcement learning [J]. Computer Engineering and Applications, 2018, 54(05): 24-35.).

Method used



Examples


Embodiment

[0039] As shown in Figure 1 and Figure 2, a hybrid power system energy management strategy based on reverse deep reinforcement learning comprises the following steps:

[0040] S1: Use an optimization solution method to calculate the global hybrid mode allocation ratio and the globally optimized SOC result for one complete driving cycle, and form expert state-action pairs as expert knowledge for reverse reinforcement learning; the optimization solution methods include the pseudospectral method, dynamic programming, and genetic algorithms.
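Step S1 can be sketched with dynamic programming, one of the optimization solution methods named above. The drive cycle, battery model, grid resolutions, and cost weights below are illustrative placeholders rather than the patent's actual parameters; the point is only how a backward DP sweep over a discretized SOC grid yields expert (SOC, allocation-ratio) pairs:

```python
import numpy as np

# Hypothetical drive-cycle power demand and battery model (placeholder values).
P_dem = np.array([10.0, 20.0, 15.0, 25.0, 12.0])   # demanded power per step
soc_grid = np.linspace(0.3, 0.8, 51)                # discretized SOC states
ratios = np.linspace(0.0, 1.0, 11)                  # engine share of demand
dt, cap = 1.0, 100.0                                # time step, battery capacity

def soc_next(soc, p_batt):
    # simple coulomb-counting battery model, clipped to the SOC grid range
    return np.clip(soc - p_batt * dt / cap, soc_grid[0], soc_grid[-1])

T, N = len(P_dem), len(soc_grid)
V = np.zeros((T + 1, N))                 # cost-to-go table
V[T] = 1000.0 * np.abs(soc_grid - 0.6)  # penalize final SOC away from a target
policy = np.zeros((T, N), dtype=int)

for t in range(T - 1, -1, -1):           # backward DP sweep
    for i, soc in enumerate(soc_grid):
        costs = []
        for u in ratios:
            p_eng = u * P_dem[t]                    # engine supplies u share
            p_bat = (1.0 - u) * P_dem[t]            # battery supplies the rest
            s2 = soc_next(soc, p_bat)
            j = min(np.searchsorted(soc_grid, s2), N - 1)  # coarse grid lookup
            costs.append(p_eng * dt + V[t + 1, j])  # fuel proxy + cost-to-go
        policy[t, i] = int(np.argmin(costs))
        V[t, i] = min(costs)

# Roll out the optimal policy to collect expert (state, action) pairs.
soc, expert_pairs = 0.6, []
for t in range(T):
    i = int(np.argmin(np.abs(soc_grid - soc)))
    u = ratios[policy[t, i]]
    expert_pairs.append((soc, u))
    soc = soc_next(soc, (1.0 - u) * P_dem[t])
```

The resulting `expert_pairs` list plays the role of the expert knowledge consumed by the reverse reinforcement learning stage in S2 onward.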

[0041] S2: Create a reward function neural network and initialize its parameters;

[0042] The reward function neural network is composed of a convolutional neural network, a long short-term memory (LSTM) network, and a fully connected neural network, stacked in that order (convolutional, then LSTM, then fully connected); the fully connected neural network consists ...



Abstract

The invention discloses a hybrid power system energy management strategy based on reverse deep reinforcement learning. The strategy comprises the following steps: calculating a globally optimized SOC result as expert knowledge using an optimization solution method; creating a reward neural network; learning the expert knowledge via reverse reinforcement learning to obtain the parameters of the reward neural network; creating an action neural network and an evaluation neural network; setting an SOC value before vehicle interaction; inputting the pre-interaction SOC value into the reward neural network to obtain a reward value; inputting the pre-interaction SOC value into the action neural network to obtain a mode allocation ratio; interacting with the environment using the mode allocation ratio to obtain a post-interaction SOC value; inputting the pre-interaction SOC value, the mode allocation ratio, the reward value, and the post-interaction SOC value into the evaluation neural network to obtain an evaluation value; and having the agent calculate the gradient of each network and perform backpropagation to update the network parameters until training is finished. The method has the advantage that the optimal reward function can be learned from expert knowledge, so the deep reinforcement learning performs better.
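The interaction loop in the abstract follows an actor-critic pattern with a learned reward. The sketch below mirrors only that data flow, with linear stand-ins for the reward, action, and evaluation networks and a toy SOC plant; the update rules, feature choices, and constants are all simplifying assumptions, not the patent's method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear stand-ins for the three networks (the patent uses deep networks).
w_reward = rng.normal(size=2) * 0.1   # assumed already learned via reverse RL
w_actor  = rng.normal(size=2) * 0.1
w_critic = rng.normal(size=3) * 0.1
alpha, gamma = 0.05, 0.99

def feats(soc):
    return np.array([soc, 1.0])

def reward(soc):                       # reward network on pre-interaction SOC
    return float(w_reward @ feats(soc))

def act(soc):                          # actor squashes output into a [0, 1] ratio
    return float(1.0 / (1.0 + np.exp(-w_actor @ feats(soc))))

def value(soc, u):                     # critic scores the (SOC, ratio) pair
    return float(w_critic @ np.array([soc, u, 1.0]))

def env_step(soc, u):                  # toy plant: the battery share drains SOC
    return float(np.clip(soc - 0.01 * (1.0 - u), 0.0, 1.0))

soc = 0.6                              # SOC value before vehicle interaction
for _ in range(20):
    r  = reward(soc)                   # reward value from the reward network
    u  = act(soc)                      # mode allocation ratio from the actor
    s2 = env_step(soc, u)              # interact -> post-interaction SOC
    td = r + gamma * value(s2, act(s2)) - value(soc, u)   # TD error
    w_critic += alpha * td * np.array([soc, u, 1.0])      # critic update
    w_actor  += alpha * td * feats(soc)                   # crude actor step
    soc = s2
```

A real implementation would replace the hand-written gradient steps with backpropagation through each network, as the abstract describes.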

Description

Technical Field

[0001] The invention relates to the field of hybrid power system energy management, and in particular to a hybrid power system energy management strategy based on reverse deep reinforcement learning.

Background Technology

[0002] The hybrid electromechanical coupling device couples the power of multiple power sources, such as the internal combustion engine and the electric motor, in a hybrid electric vehicle (HEV), performs a reasonable power distribution, and transmits the power to the drive axle to propel the vehicle. By integrating one or more electric motors into the transmission, it can be regarded as an automatic transmission system with electric motors; it is a complex system integrating mechanics, electricity, chemistry, and thermodynamics. China's automobile fuel consumption regulations place very high demands on vehicle manufacturers for energy conservation and emission reduction over the next 10-15 years. With the further increase in the pressure of energ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G05B13/04
CPC: G05B13/042
Inventors: 李梓棋, 赵克刚
Owner SOUTH CHINA UNIV OF TECH