Traffic organization scheme optimization method based on multi-signal lamp reinforcement learning

A reinforcement learning and optimization technology, applied to road-vehicle traffic control systems, machine learning, and traffic signal control. It addresses problems such as unstable model convergence and unstable convergence speed, and improves the smooth-flow rate of the road network.

Active Publication Date: 2021-11-09

CHENGDU UNIV OF INFORMATION TECH

Cites: 5. Cited by: 1.


Problems solved by technology

Decentralized communication is more practical and scales well because it does not require centralized decision-making, but the model's convergence and convergence speed are often very unstable.


Examples


Embodiment 1

[0062] This embodiment is a method for optimizing a multi-intersection traffic organization scheme based on multi-signal lamp reinforcement learning. It uses multiple agents, an Actor-Critic network, a Subnet network, and trajectory reconstruction to improve the traffic flow rate of the road network. A multi-agent environment is one in which multiple intelligent entities are present at each step; Figure 1 shows the difference between multi-agent and single-agent environments.

[0063] First, an Actor network is constructed. The traffic road network contains multiple intersections, and the signal light at each intersection corresponds to one agent, so multiple agents require multiple corresponding Actor networks. Each Actor network includes a state space set and a behavior space set.
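The one-agent-per-signal arrangement described in [0063] can be sketched roughly as follows. This is a minimal illustration, not the patented network: the `Actor` class, the linear-plus-softmax policy, the intersection IDs, the state size, and the two phase names are all assumptions made for the example.

```python
import math
import random


class Actor:
    """Hypothetical per-intersection policy: one linear layer plus softmax."""

    def __init__(self, state_dim, actions, seed=0):
        self.state_dim = state_dim      # size of the state vector (assumed)
        self.actions = tuple(actions)   # behavior space, e.g. signal phases (assumed)
        rng = random.Random(seed)
        # One weight row per action, initialized with small random values.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(state_dim)]
                        for _ in self.actions]

    def action_probabilities(self, state):
        """Return a softmax distribution over the behavior space."""
        if len(state) != self.state_dim:
            raise ValueError("state has the wrong dimension")
        logits = [sum(w * s for w, s in zip(row, state)) for row in self.weights]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        return [e / total for e in exps]


def build_actors(intersection_ids, state_dim, actions):
    """Create one Actor per signalized intersection (one agent per signal)."""
    return {tls_id: Actor(state_dim, actions, seed=index)
            for index, tls_id in enumerate(intersection_ids)}


# Illustrative values only: three intersections, a 4-value state, two phases.
actors = build_actors(["J1", "J2", "J3"], state_dim=4,
                      actions=("NS_green", "EW_green"))
probs = actors["J1"].action_probabilities([3.0, 0.0, 5.0, 1.0])
```

Keeping the actors in a dictionary keyed by intersection makes the one-to-one correspondence between signals and agents explicit; a real implementation would replace the linear layer with a trained network.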

[0064] The program in the traffic lights changes the state of the road, which in a sense amounts to a short-term road closure for traffic control. In this embodiment, proceeding from the actual situat...

Embodiment 2

[0072] This embodiment is a method for optimizing a traffic organization scheme for a single intersection, based on multi-signal lamp reinforcement learning. The simulation platform used in this embodiment is SUMO, an open-source road traffic simulator. It supports collecting the data required by the simulation experiment, simulating traffic behavior, and constructing the required road network. Most importantly, it can collect the timing data of traffic lights. The code is written in the PyCharm IDE, and tensorflow-gpu 1.4.0 and NumPy are used to build the reinforcement learning and neural network components. The above extensions need to be improved, and the second most important step is to implement the SUMO TraCI traffic control interface. TraCI supports dynamic control of traffic lights, can call the SUMO simulation tools, obtain individual vehicle information, and obtain detailed ...
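As a rough illustration of the kind of TraCI control loop [0072] refers to, the sketch below reads each signal's phase, lets a policy choose the next phase, and accumulates lane waiting time. It is an assumption-laden sketch rather than the patent's code: `run_episode`, `choose_phase`, the per-step waiting-time sum, and the configuration file name `grid.sumocfg` are hypothetical, while the `traci` calls used (`start`, `simulationStep`, `close`, `trafficlight.getIDList`, `trafficlight.getPhase`, `trafficlight.setPhase`, `trafficlight.getControlledLanes`, `lane.getWaitingTime`, `simulation.getMinExpectedNumber`) belong to SUMO's Python client.

```python
def total_waiting_time(conn, tls_id):
    """Sum the waiting time over the lanes controlled by one traffic light."""
    # A lane can appear several times in the list, so deduplicate it first.
    lanes = set(conn.trafficlight.getControlledLanes(tls_id))
    return sum(conn.lane.getWaitingTime(lane) for lane in lanes)


def run_episode(conn, choose_phase, max_steps=3600):
    """Step the simulation while choose_phase picks a phase for every signal.

    conn is the traci module (or any object exposing the same calls).
    Returns the waiting time accumulated over all steps and signals.
    """
    waited = 0.0
    for _ in range(max_steps):
        if conn.simulation.getMinExpectedNumber() <= 0:
            break  # no vehicles left in, or waiting to enter, the network
        for tls_id in conn.trafficlight.getIDList():
            current = conn.trafficlight.getPhase(tls_id)
            conn.trafficlight.setPhase(tls_id, choose_phase(tls_id, current))
        conn.simulationStep()
        waited += sum(total_waiting_time(conn, tls_id)
                      for tls_id in conn.trafficlight.getIDList())
    return waited


def main(sumo_cfg="grid.sumocfg"):  # hypothetical configuration file name
    import traci  # imported lazily so the helpers above work without SUMO
    traci.start(["sumo", "-c", sumo_cfg])
    try:
        # Placeholder policy: keep whatever phase is currently active.
        print(run_episode(traci, lambda tls_id, phase: phase))
    finally:
        traci.close()
```

Calling `main()` requires a SUMO installation and a valid scenario; a learned policy would be passed in place of the placeholder lambda.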

Embodiment 9

[0075] In the 9-grid multi-intersection experimental model of this embodiment, each rectangle represents a signalized intersection, and every two adjacent intersections are connected by two lanes.

[0076] For this embodiment, the following parameters are set in the SUMO simulation software. In the 9-grid environment, a total of 7,000 vehicles enter the simulation system. The model sets the initial number of vehicles to 50, the shortest vehicle driving path to 2, the longest vehicle driving path to 7, and the random seed parameter to 10.
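The 9-grid layout and the parameters listed above can be collected in one place, as in the following sketch. The `GridScenario` class and its field names are hypothetical, and two readings are assumed: that the 9-grid is a 3 x 3 arrangement of intersections, and that the path lengths 2 and 7 are plain counts whose unit the text does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridScenario:
    """Hypothetical container for the 9-grid experiment settings."""
    rows: int = 3                  # assumed: the 9-grid is a 3 x 3 layout
    cols: int = 3
    lanes_per_link: int = 2        # two lanes between adjacent intersections
    total_vehicles: int = 7000     # vehicles entering the simulation
    initial_vehicles: int = 50
    min_route_length: int = 2      # unit not stated in the source
    max_route_length: int = 7      # unit not stated in the source
    random_seed: int = 10

    def intersections(self):
        """Grid coordinates of every signalized intersection."""
        return [(r, c) for r in range(self.rows) for c in range(self.cols)]

    def links(self):
        """Undirected links between horizontally or vertically adjacent nodes."""
        found = []
        for r, c in self.intersections():
            if c + 1 < self.cols:
                found.append(((r, c), (r, c + 1)))
            if r + 1 < self.rows:
                found.append(((r, c), (r + 1, c)))
        return found


scenario = GridScenario()
lane_count = len(scenario.links()) * scenario.lanes_per_link
```

Under these assumptions the grid has 9 intersections joined by 12 links, or 24 lanes in total; roads that connect the grid to the outside are not modeled here.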

[0077] After the experimental model is built, the action mode of each agent is constructed according to its own behavior mode. Under the original conditions, the total waiting time of cars in this environment is 24732 seconds. There are 21 origin-destination (OD) pairs in this experimental traffic environment. In the original environment, the traffic volume in the lower right area of the 9th grid...


Abstract

The invention discloses a traffic organization scheme optimization method based on multi-signal lamp reinforcement learning, and belongs to the field of traffic signal lamp control. First, an Actor network containing a state space set and a behavior space set is constructed. Next, an observed value is introduced, high-dimensional information is compressed into low-dimensional information by a Subnet network, and the behavior deflection probability is calculated. The initial state information, the updated state information, and the behavior deflection probability are then fed into a Critic network for centralized learning. Finally, trajectory reconstruction is carried out. In a multi-intersection traffic environment, multiple intelligent agents improve the unblocked rate of the road network by means of an Actor-Critic algorithm framework. The agents use centralized learning with distributed execution, combining the advantages of both, so that the convergence speed of the algorithm is greatly improved.
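The data flow in the abstract (compress each observation, derive a per-agent probability vector, then hand every agent's states and probabilities to one shared critic) might look like the sketch below. Everything in it is an assumption for illustration: the Subnet is replaced by a fixed linear projection, the compressed features are reused both as softmax logits and as the updated state, and the function names are hypothetical. The source does not specify how the behavior deflection probability is computed.

```python
import math


def compress(observation, projection):
    """Subnet stand-in: linear projection to a lower-dimensional vector."""
    return [sum(w * x for w, x in zip(row, observation)) for row in projection]


def softmax(logits):
    """Turn a list of scores into a probability distribution."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def critic_input(initial_states, updated_states, action_probs):
    """Concatenate every agent's data into one vector for the shared critic."""
    joint = []
    for before, after, probs in zip(initial_states, updated_states, action_probs):
        joint.extend(before)
        joint.extend(after)
        joint.extend(probs)
    return joint


# Two agents, each with a 4-value observation compressed to 2 features.
projection = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
observations = [[4.0, 2.0, 0.0, 6.0], [1.0, 1.0, 3.0, 3.0]]
features = [compress(obs, projection) for obs in observations]
action_probs = [softmax(f) for f in features]
initial_states = [[0.0, 0.0], [0.0, 0.0]]  # placeholder initial states
joint = critic_input(initial_states, features, action_probs)
```

Each agent computes its own features and probabilities from local observations (distributed execution), while only the critic sees the concatenated vector (centralized learning).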

Description

Technical field

[0001] The invention relates to the field of traffic signal lamp control, and in particular to a traffic organization scheme optimization method based on multi-signal lamp reinforcement learning.

Background art

[0002] In the era of technology and information, human life is becoming increasingly abundant. Most families now own a car, which leads to various urban traffic problems such as long waiting times and excessive lane occupancy. With the development of artificial intelligence, many intelligent traffic technologies have emerged and have begun to control traffic behavior effectively. Agent reinforcement learning is one of the artificial intelligence technologies currently under development. Reinforcement learning is currently the mainstream of intelligent transportation technology, and includes algorithms such as Q-learning, Sarsa, and TD(λ).

[0003] How to enable agents to learn efficiently in ...


Application Information


IPC(8): G08G1/01, G08G1/07, G06F30/20, G06N20/00

CPC: G08G1/0125, G08G1/07, G06F30/20, G06N20/00, Y02T10/40

Inventor: 郑皎凌, 吴昊昇, 王茂帆

Owner: CHENGDU UNIV OF INFORMATION TECH
