Micro-power-grid energy storage scheduling method and device based on deep Q-value network (DQN) reinforcement learning

A reinforcement learning and energy storage scheduling technology, applied to circuit devices, AC network circuits, AC network load balancing, etc. It addresses problems such as the limited estimation ability of deep Q-value networks and insufficiently stable and accurate learning objectives, and achieves strong estimation ability and stable learning objectives.

Active Publication Date: 2019-02-15
STATE GRID HENAN ELECTRIC POWER ELECTRIC POWER SCI RES INST +3
Cites: 5 · Cited by: 44

AI Technical Summary

Problems solved by technology

[0004] In order to overcome the deficiencies of the prior art, the purpose of the present invention is to provide a microgrid energy storage scheduling method and device based on deep Q-value network reinforcement learning, aiming to solve the problem of applying the de…



Examples


Specific Embodiment 1

[0056] As shown in Figure 1, an embodiment of the present invention provides a microgrid energy storage scheduling method based on deep Q-value network reinforcement learning, including:

[0057] Modeling step S101: establishing a microgrid model;

[0058] Training step S102: according to the microgrid model, performing artificial intelligence training using the deep Q-value network reinforcement learning algorithm;

[0059] Calculation step S103: according to the input parameter characteristic values, calculating the battery operation strategy for microgrid energy storage dispatch.

[0060] As shown in Figure 2, the microgrid model is preferably provided with a sequentially connected battery-pack energy storage system, photovoltaic power generation system, power load, and control device, where the power load and control device are connected to the distribution network through a point of common coupling. The electricity price information of the...
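The microgrid model above (battery pack, PV generation, load, and a grid connection at the point of common coupling) can be sketched as a simple simulation environment for the reinforcement learning agent. The class name, component sizes, price values, and hourly time step below are illustrative assumptions, not values taken from the patent:

```python
# Hypothetical sketch of the microgrid model established in step S101.
# Capacities, power limits, and tariffs are made-up illustrative numbers.

class MicrogridEnv:
    """Battery + PV + load microgrid trading energy with the main grid
    through a point of common coupling (PCC)."""

    def __init__(self, capacity_kwh=100.0, max_power_kw=25.0):
        self.capacity = capacity_kwh   # battery pack capacity
        self.max_power = max_power_kw  # charge/discharge power limit
        self.soc = 0.5                 # state of charge, fraction of capacity
        self.t = 0

    def state(self, pv_kw, load_kw, price):
        # Feature vector fed to the Q-network: SoC, PV output, load, tariff.
        return (self.soc, pv_kw, load_kw, price)

    def step(self, action, pv_kw, load_kw, price, dt_h=1.0):
        """action in {-1: discharge, 0: idle, +1: charge}.
        Reward is the negative cost of energy exchanged with the main grid."""
        battery_kw = action * self.max_power
        energy = battery_kw * dt_h
        # Clip the transfer so the state of charge stays within [0, 1].
        energy = max(-self.soc * self.capacity,
                     min(energy, (1.0 - self.soc) * self.capacity))
        self.soc += energy / self.capacity
        # Positive grid_kw means importing from the distribution network.
        grid_kw = load_kw + energy / dt_h - pv_kw
        reward = -grid_kw * dt_h * price
        self.t += 1
        return self.state(pv_kw, load_kw, price), reward
```

In this sketch the agent earns a positive reward whenever the microgrid exports surplus energy to the main grid, matching the patent's goal of maximizing the running benefit of the interaction with the main power grid.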

Specific Embodiment 2

[0116] As shown in Figure 6, an embodiment of the present invention provides a microgrid energy storage scheduling device based on deep Q-value network reinforcement learning, including:

[0117] A modeling module 201, configured to establish a microgrid model;

[0118] A training module 202, configured to perform artificial intelligence training using the deep Q-value network reinforcement learning algorithm according to the microgrid model;

[0119] A calculation module 203, configured to calculate the battery operation strategy for microgrid energy storage scheduling according to the input parameter characteristic values.

[0120] The embodiment of the present invention uses a deep Q-value network to schedule and manage the energy of the microgrid. The agent interacts with the environment to determine the optimal energy storage scheduling strategy and controls the operation mode of the battery in a constantly changing environment, based on the dynamic decision of the ...
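The "competitive Q-value network model" mentioned in the abstract, which separately computes an evaluation value of the environment and the additional value brought by each action, corresponds to the dueling DQN decomposition Q(s, a) = V(s) + A(s, a) − mean_a A(s, a). A minimal NumPy sketch of that aggregation (random weights and dimensions are illustrative assumptions, not the patent's network):

```python
import numpy as np

rng = np.random.default_rng(0)

def dueling_q(features, w_v, w_a):
    """Combine a state-value head V(s) and an advantage head A(s, a) into
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage makes the V/A split identifiable and,
    as the patent argues, keeps the learning target stable."""
    v = features @ w_v          # scalar evaluation of the environment, V(s)
    a = features @ w_a          # per-action additional value, A(s, a)
    return v + a - a.mean()     # broadcast V across the action dimension

# Illustrative dimensions: 4 state features (SoC, PV, load, price),
# 3 actions (discharge / idle / charge).
features = rng.standard_normal(4)
w_v = rng.standard_normal((4, 1))
w_a = rng.standard_normal((4, 3))
q = dueling_q(features, w_v, w_a)
```

Because the mean advantage is subtracted, the advantages contribute only their relative differences to Q, so the state-value head alone carries the overall evaluation of the environment.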



Abstract

The invention discloses a micro-power-grid energy storage scheduling method and device based on deep Q-value network reinforcement learning. A micro-power-grid model is established; a deep Q-value network reinforcement learning algorithm is used for artificial intelligence training according to the micro-power-grid model; and a battery running strategy for micro-power-grid energy storage scheduling is calculated from input parameter feature values. In the embodiments of the invention, deep Q-value networks are used for scheduling management of micro-power-grid energy: an agent decides the optimal energy storage scheduling strategy through interaction with the environment, the running mode of the battery is controlled in a constantly changing environment, features of energy storage management are dynamically determined on the basis of the micro-power-grid, and the micro-power-grid obtains the maximum running benefit in its interaction with the main power grid. By using a competitive Q-value network model, the network separately calculates an evaluation value of the environment and the additional value brought by each action; decomposing these two parts makes the learning objective more stable and accurate, and strengthens the deep Q-value network's ability to estimate the environment state.

Description

Technical Field

[0001] The invention relates to the technical field of microgrid energy storage scheduling, and in particular to a microgrid energy storage scheduling method and device based on deep Q-value network reinforcement learning.

Background Technique

[0002] At present, machine learning methods are gradually being applied in all walks of life. Using a Deep Q-Network (DQN) to combine convolutional neural networks from deep learning with the traditional Q-value learning algorithm is an emerging research direction. With the experience replay technique, the agent's experience is stored and a random subset of samples is drawn for the network to learn from in each training step; this breaks the correlation between the data and makes the training of the neural network convergent and stable.

[0003] When the deep Q-value network is applied to the management of microgrid energy storage scheduling, the target state-action Q-value function has the problem of overestima...
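The experience replay technique described in paragraph [0002] can be sketched as a fixed-size buffer with uniform random sampling. The class and parameter names below are hypothetical illustrations, not identifiers from the patent:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) transitions.
    Sampling a random minibatch instead of consecutive steps breaks the
    temporal correlation between experiences, which helps the Q-network
    training converge stably, as paragraph [0002] describes."""

    def __init__(self, capacity=10000):
        # deque with maxlen evicts the oldest experience automatically.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random sample without replacement.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

# Store 150 toy transitions in a buffer of capacity 100; the oldest 50
# are evicted, and a minibatch of 32 is drawn for one training step.
buf = ReplayBuffer(capacity=100)
for t in range(150):
    buf.push(state=t, action=t % 3, reward=-float(t), next_state=t + 1)
batch = buf.sample(32)
```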

Claims


Application Information

IPC(8): H02J3/46, H02J3/32, G06Q50/06, G06Q10/06
CPC: H02J3/32, H02J3/46, G06Q10/0631, G06Q10/0637, G06Q50/06, H02J2203/20
Inventor 张江南崔承刚吴坡贺勇赵延平刘海宝唐耀华李冰郝涛
Owner STATE GRID HENAN ELECTRIC POWER ELECTRIC POWER SCI RES INST