Adaptive signal control method and system based on reinforcement learning and phase competition

An adaptive signal control technology based on reinforcement learning, applied in traffic control systems for road vehicles, neural learning methods, etc. It addresses problems such as poor generalization, degraded control performance, and models becoming inapplicable, and achieves good robustness, a reduced state space, and improved convergence.

Pending Publication Date: 2022-04-29
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

However, most current models based on deep reinforcement learning have two disadvantages:
1. Poor generalization. The control performance of a model trained on one set of traffic flow data is likely to degrade sharply on another set of traffic flow data.
2. Dependence on a fixed input dimension. Most current models concatenate all states into a single vector and feed it in directly, so the vector dimension changes whenever the topology or phase setting of the intersection changes, and the model is no longer applicable.
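
The second disadvantage can be illustrated with a minimal sketch. It assumes a hypothetical layout of three features per phase and a hypothetical linear scorer shared across phases; neither is taken from the patent text. It only shows why a concatenated vector changes size with the phase count, while a scorer applied to each phase separately does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature layout: e.g. queue length, waiting vehicles, current-green flag.
FEATURES_PER_PHASE = 3

def concat_state(phase_features):
    """Flatten all phase features into one vector (the approach criticized above)."""
    return np.asarray(phase_features, dtype=float).reshape(-1)

# Hypothetical shared weights, applied to every phase independently.
shared_w = rng.normal(size=FEATURES_PER_PHASE)

def score_phases(phase_features):
    """Score each phase with the same weights; output length follows the phase count."""
    x = np.asarray(phase_features, dtype=float)  # shape (num_phases, FEATURES_PER_PHASE)
    return x @ shared_w                          # shape (num_phases,)

four_phase = rng.random((4, FEATURES_PER_PHASE))
eight_phase = rng.random((8, FEATURES_PER_PHASE))

# An input layer sized for 4 phases cannot accept the 8-phase vector.
print(concat_state(four_phase).shape, concat_state(eight_phase).shape)  # (12,) (24,)
# The shared scorer handles both intersections with unchanged parameters.
print(score_phases(four_phase).shape, score_phases(eight_phase).shape)  # (4,) (8,)
```

The phase with the highest score would then be selected, which is one plausible reading of "phase competition"; the patent's actual network structure is not shown in this excerpt.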



Examples


Embodiment

[0108] After training, the present invention is compared with the traditional signal control method MaxPressure, the reinforcement learning method DQN with an ordinary state representation, and the baseline method FRAP in terms of average travel time, average waiting time, and average queue length. FRAP is the main reference for the present invention, which optimizes FRAP's structure. The results are shown in Table 1: the present invention achieves the best control performance on all three indicators.
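
For context on the MaxPressure baseline, the sketch below shows the rule as it is commonly described: each phase is scored by the summed difference between upstream and downstream queues over the movements it permits, and the highest-scoring phase is chosen. The lane names, phase definitions, and queue counts are hypothetical, and this is not the implementation used in the patent's experiments.

```python
def max_pressure_phase(phases, queue):
    """Return the phase whose permitted movements have the largest total pressure."""
    def pressure(movements):
        # Pressure of a movement: vehicles queued upstream minus vehicles queued downstream.
        return sum(queue[inc] - queue[out] for inc, out in movements)
    return max(phases, key=lambda name: pressure(phases[name]))

# Hypothetical two-phase intersection: each phase lists (incoming lane, outgoing lane) pairs.
phases = {
    "NS_through": [("N_in", "S_out"), ("S_in", "N_out")],
    "EW_through": [("E_in", "W_out"), ("W_in", "E_out")],
}
# Hypothetical queue lengths in vehicles.
queue = {"N_in": 6, "S_in": 4, "E_in": 2, "W_in": 3,
         "N_out": 1, "S_out": 0, "E_out": 2, "W_out": 1}

print(max_pressure_phase(phases, queue))  # NS_through (pressure 9 vs 2)
```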

[0109] Table 1. Comparison of experimental effects

[0110]

[0111] To test generalization performance, the model trained on one data set is tested directly on the other three data sets; the average travel times are shown in Table 2. Compared with the baseline method FRAP, the present invention achieves lower average travel time on the three test dat...



Abstract

The invention relates to an adaptive signal control method and system based on reinforcement learning and phase competition. The method comprises: obtaining the intersection state through interaction with a simulation environment; obtaining a decision from the output of a policy network π_θ; collecting the reward and the next-moment state after the decision to obtain a sample simulation trajectory; and training the PPO network and updating its parameters based on the simulation trajectory, repeating the operation for multiple rounds until convergence. After convergence, the model can adjust and control signals based on the real-time state of the traffic flow. The method adapts to different intersections while preserving the signal control effect, and can be widely applied in the field of urban traffic signal control.
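
The update step in the abstract can be sketched under stated assumptions. The abstract names PPO but not the variant, so the code below assumes the widely used clipped-surrogate form; the function names, the discount factor `gamma`, and the clip range `eps` are illustrative choices, not values from the patent.

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return at every step of one simulated trajectory."""
    out = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out

def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Clipped surrogate objective (to be maximized) over a batch of decisions."""
    new_logp = np.asarray(new_logp, dtype=float)
    old_logp = np.asarray(old_logp, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    ratio = np.exp(new_logp - old_logp)            # pi_theta(a|s) / pi_theta_old(a|s)
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    # Take the pessimistic value so large policy changes are not rewarded.
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))

# Hypothetical trajectory: three decisions with rewards such as negative queue length.
returns = discounted_returns([-3.0, -2.0, -1.0], gamma=0.5)
print(returns)  # [-4.25 -2.5  -1.  ]
```

In a full training loop, the log-probabilities would come from the policy network before and after each gradient step, and the advantages from the returns minus a value estimate; those parts are omitted here.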

Description

Technical field

[0001] The invention relates to the field of urban traffic signal control, in particular to an adaptive signal control method and system based on deep reinforcement learning and phase competition.

Background

[0002] Since the start of the 21st century, China's rapid economic development has greatly improved residents' living standards. How to improve travel efficiency reasonably and effectively, and thereby alleviate traffic congestion, has therefore become a focus of government departments in recent years. Intersections have always been a main cause of traffic congestion, so alleviating congestion by optimizing signal timing has become a research hotspot in recent years.

[0003] Most traditional signal control methods, such as Webster, GreenWave, SCATS, and SCOOT, are modeled and optimized based on domain knowledge from traffic engineering. Howe...


Application Information

IPC(8): G08G1/01, G08G1/07, G06F30/27, G06N3/04, G06N3/08
CPC: G08G1/0104, G08G1/07, G06F30/27, G06N3/08, G06N3/045, Y02T10/40
Inventor: 胡坚明, 吴智楷, 彭黎辉, 裴欣
Owner TSINGHUA UNIV