Subway station air conditioning system energy-saving control method based on deep reinforcement learning

The invention applies deep reinforcement learning to the control of subway station air-conditioning systems (classified under neural learning methods, biological models, design optimization/simulation, etc.). It addresses long convergence time, limited parameter space, and limited applicability to complex system control, so as to reduce training time, reduce the number of training sessions, and meet the temperature requirements of the station.

Active Publication Date: 2021-08-20
BEIJING UNIV OF CIVIL ENG & ARCHITECTURE

AI Technical Summary

Problems solved by technology

[0003] Existing studies have shown that intelligent control methods have self-adaptation, self-learning and self-coordination capabilities, which can improve the performance and energy-saving effect of air-conditioning systems. Among them, the agent in reinforcement learning (RL) interacts directly with the environment to maximize the reward signal. This can realize globally optimal control of complex systems and is one of the effective ways to exploit the energy-saving potential of the air-conditioning system. Applying reinforcement learning methods to control the air-conditioning system of subway stations can effectively improve the energy-saving effect of the system. However, two problems still need to be solved. First, an agent trained online with a model-free reinforcement learning method takes a long time to converge, which makes it difficult to meet the real-time requirements of the control system.
Second, the state space and action space of the subway station air-conditioning system are both multi-dimensional and continuous. However, most related research results can only handle a limited parameter space and only generate control laws for a single discrete control variable, which limits their applicability to complex system control.


Examples


Embodiment 1

[0075] A method for energy-saving control of subway station air-conditioning systems based on deep reinforcement learning, comprising the following steps:

[0076] S1. Collect the data parameters of the air-conditioning system of the subway station;

[0077] S2. Apply moving-average filtering, normalization and denormalization to the collected data; normalization uses a linear function conversion to map the data into the range 0-1;

[0078] S3. Use a neural network and the data obtained in step S2 to construct the neural network model of the air-conditioning system of the subway station;

[0079] S4. Determine the state variable, action variable, reward signal and structure of the DDPG agent;

[0080] S5. Use the DDPG algorithm to solve for the final control strategy.
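
The preprocessing in step S2 can be illustrated with a minimal sketch. It assumes a trailing moving average and min-max scaling as the "linear function conversion"; the function names, the window handling at the start of the series, and the sample values are hypothetical and are not taken from the patent.

```python
from typing import List, Optional, Tuple


def moving_average(values: List[float], window: int) -> List[float]:
    """Trailing moving average; early samples average over what is available."""
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed: List[float] = []
    running_sum = 0.0
    for i, v in enumerate(values):
        running_sum += v
        if i >= window:
            # Drop the sample that just left the window.
            running_sum -= values[i - window]
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


def normalize(values: List[float],
              lo: Optional[float] = None,
              hi: Optional[float] = None) -> Tuple[List[float], float, float]:
    """Linear (min-max) scaling into [0, 1]; returns the bounds for later reuse."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(scaled: List[float], lo: float, hi: float) -> List[float]:
    """Inverse of normalize: map [0, 1] values back to physical units."""
    return [s * (hi - lo) + lo for s in scaled]
```

Keeping the bounds returned by `normalize` matters here: the same `lo` and `hi` are needed to convert model outputs or agent actions back to physical units with `denormalize`.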

[0081] The above steps S1-S3 are described in further detail below:

[0082] In order to reduce the training time of the agent, it is first necessary to model the syst...

Embodiment 2

[0107] A method for energy-saving control of subway station air-conditioning systems based on deep reinforcement learning, comprising the following steps:

[0108] S1. Collect the data parameters of the air-conditioning system of the subway station;

[0109] S2. Apply moving-average filtering, normalization and denormalization to the collected data; normalization uses a linear function conversion to map the data into the range 0-1;

[0110] S3. Use a neural network and the data obtained in step S2 to construct the neural network model of the air-conditioning system of the subway station;

[0111] S4. Determine the state variable, action variable, reward signal and structure of the DDPG agent;

[0112] S5. Use the DDPG algorithm to solve for the final control strategy.

[0113] Steps S4-S5 are further described below:

[0114] Before the DDPG agent is trained, the control strategy must first determine the state, action, reward signal, ...
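
For orientation, two building blocks of the standard DDPG algorithm can be sketched as follows: the one-step critic target and the soft update of target-network weights. This is a generic illustration, not the patent's Algorithm 1; the function names are hypothetical and weights are represented as flat lists for simplicity.

```python
from typing import List


def td_target(reward: float, next_q: float, gamma: float, done: bool) -> float:
    """One-step critic target: r + gamma * Q'(s', mu'(s')), or r at episode end."""
    return reward if done else reward + gamma * next_q


def soft_update(target: List[float], online: List[float], tau: float) -> List[float]:
    """Move target-network weights a fraction tau toward the online weights."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]
```

A small `tau` makes the target networks change slowly, which is the usual way DDPG stabilizes the critic's regression target.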

Experimental example

[0155] To implement the proposed improved DDPG algorithm, this experimental example uses PyCharm and the TensorFlow framework. An algorithm program is written according to Algorithm 1 and a simulation experiment is carried out, with the neural network model established in Embodiment 1 serving as the learning environment for the DDPG agent. The specific process is shown in Figure 10.
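
Using a learned model as the agent's learning environment can be sketched as a thin wrapper around a one-step predictor. This is only an illustration under assumptions: the class name `SurrogateEnv`, the `reset`/`step` interface, and the `model` and `reward_fn` callables are hypothetical stand-ins, not the patent's TensorFlow program.

```python
from typing import Callable, List, Tuple

State = List[float]
Action = List[float]


class SurrogateEnv:
    """Wraps a learned one-step model so an agent can train against it offline."""

    def __init__(self,
                 model: Callable[[State, Action], State],
                 reward_fn: Callable[[State, Action, State], float],
                 initial_state: State) -> None:
        self._model = model            # hypothetical: predicts the next state
        self._reward_fn = reward_fn    # hypothetical: scores a transition
        self._initial_state = list(initial_state)
        self._state = list(initial_state)

    def reset(self) -> State:
        """Return to the initial state and report it."""
        self._state = list(self._initial_state)
        return list(self._state)

    def step(self, action: Action) -> Tuple[State, float]:
        """Advance one step using the model's prediction instead of the real plant."""
        next_state = self._model(self._state, action)
        reward = self._reward_fn(self._state, action, next_state)
        self._state = list(next_state)
        return list(next_state), reward
```

Because every transition comes from the model rather than the physical system, the agent can be trained offline, which is consistent with the stated aim of reducing training time.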

[0156] Figure 11 shows the score (total reward value) of the DDPG algorithm based on multi-step prediction over 1000 training runs. The reward value fluctuates from run to run during training, for two main reasons: the initial environment differs for each training run, and the algorithm adds random noise to each policy exploration. However, from the perspective of the overall trend of the reward value, during the training process, the total reward value sh...
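
The exploration noise mentioned above can be illustrated with a minimal sketch. It assumes zero-mean Gaussian noise added to each action component and clipping to the normalized range 0-1; the patent excerpt does not state the noise distribution, so the function name, `sigma`, and the bounds are all assumptions.

```python
import random
from typing import List


def noisy_action(action: List[float], sigma: float, rng: random.Random,
                 low: float = 0.0, high: float = 1.0) -> List[float]:
    """Add zero-mean Gaussian noise to each component, then clip to [low, high]."""
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in action]
```

Passing an explicitly seeded `random.Random` makes a training run reproducible, which helps separate fluctuation caused by noise from fluctuation caused by differing initial conditions.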


Abstract

The invention provides a subway station air-conditioning system energy-saving control method based on deep reinforcement learning. The method includes: collecting data parameters of the subway station air-conditioning system; applying moving-average filtering, normalization and denormalization to the acquired data, with the data converted into values in the range 0-1 by a linear function conversion; constructing a neural network model of the subway station air-conditioning system using a neural network and the data obtained in the preceding step; determining the state variable, action variable, reward signal and structure of the DDPG agent; and using the multi-step prediction DDPG algorithm to solve for the final control strategy. The control method has good temperature tracking performance. Compared with a traditional DDPG algorithm, the number of agent training runs is reduced by 86. The system operates stably when the system load changes and meets the station temperature requirement, while saving 17.908% energy compared with the operating system in a current practical project.
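
One common way to bring multi-step information into a critic target is the n-step return, sketched below. Whether the patent's "multi-step prediction" takes exactly this form is not stated in this excerpt, so this is an assumption; the function name and arguments are hypothetical.

```python
from typing import List


def n_step_target(rewards: List[float], bootstrap_q: float, gamma: float) -> float:
    """Discounted sum of n rewards plus gamma**n times a bootstrapped Q estimate."""
    target = bootstrap_q
    # Fold from the last reward backward: target = r + gamma * target.
    for r in reversed(rewards):
        target = r + gamma * target
    return target
```

With an empty reward list the function returns the bootstrap value unchanged, and with a single reward it reduces to the usual one-step target.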

Description

Technical field

[0001] The invention relates to the field of air-conditioning energy saving in subway stations, in particular to an energy-saving control method for air-conditioning systems in subway stations based on deep reinforcement learning.

Background

[0002] As a necessary link in realizing the functionality of urban rail transit, subway stations are of great significance to people's daily life. In recent years, with the rapid construction and operation of many subway stations, their energy consumption has also increased rapidly, and the energy consumption problem has become increasingly prominent. Among the station's systems, the heating, ventilation and air conditioning (HVAC) system is a major source of energy consumption, accounting for more than 40% of the total energy consumption of the station, second only to the train traction system. The equipment of the air-conditioning system of the subway station is generally in accordance with long-term peak hour...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F30/27; F24F11/46; G06N3/00; G06N3/04; G06N3/08; G06F119/08
CPC: G06F30/27; G06N3/08; F24F11/46; G06N3/008; G06F2119/08; G06N3/045
Inventor: 魏东, 焦焕炎, 冉义兵, 冯浩东
Owner BEIJING UNIV OF CIVIL ENG & ARCHITECTURE