Multi-intersection vehicle-road cooperative control method, device, and medium based on hierarchical reinforcement learning

A manager agent calculates a global reward offset that dynamically adjusts both traffic light control and connected autonomous vehicle (CAV) trajectory planning. This addresses the weak linkage between levels in hierarchical reinforcement learning traffic control and yields optimized, stable global traffic flow.

CN122245129A · Pending · Publication Date: 2026-06-19 · ZHEJIANG UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: ZHEJIANG UNIV
Filing Date: 2026-05-25
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing hierarchical reinforcement learning traffic control methods lack a flexible, efficient linkage mechanism between the manager and the lower-level agents. As a result, traffic light timing and CAV trajectory planning strategies oscillate, and global optimization is hard to achieve.

Method used

A manager agent collects global traffic information, calculates a global reward offset, and passes it into the reward functions of the traffic light agents and the connected autonomous vehicle (CAV) agents. These agents then dynamically adjust the traffic light phases and CAV trajectory planning, forming a multi-level hierarchical decision-making architecture.
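
The summary does not state how the global reward offset is computed or how it enters the lower-level reward functions. The following is a minimal Python sketch under stated assumptions: the offset is a weighted deviation of regional queue length and speed from target values, and it is simply added to each agent's local reward. The names `GlobalState`, `global_reward_offset`, `shaped_reward`, and all numeric values are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class GlobalState:
    """Aggregated observations shared with the manager (hypothetical fields)."""
    mean_queue_length: float  # vehicles per approach, averaged over the region
    mean_speed: float         # m/s, averaged over CAVs in the region


def global_reward_offset(state: GlobalState,
                         target_queue: float = 5.0,
                         target_speed: float = 10.0,
                         weight: float = 0.1) -> float:
    """Assumed form: negative when queues exceed target or speeds fall short."""
    queue_term = target_queue - state.mean_queue_length
    speed_term = state.mean_speed - target_speed
    return weight * (queue_term + speed_term)


def shaped_reward(local_reward: float, offset: float) -> float:
    """Assumed coupling: the manager's offset is added to the local reward."""
    return local_reward + offset


# A congested region: long queues and slow traffic give a negative offset,
# which lowers the reward of both lower-level agents in the same direction.
state = GlobalState(mean_queue_length=9.0, mean_speed=8.0)
offset = global_reward_offset(state)
rewards = {
    "traffic_light": shaped_reward(-2.0, offset),
    "cav": shaped_reward(1.5, offset),
}
```

Because both agent types receive the same offset, their rewards shift together when the regional state worsens or improves, which is one plausible way to read the "same global objective" described in the summary.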

Benefits of technology

Traffic light timing and CAV trajectory planning co-evolve under the same global objective, which avoids system oscillations and improves the global optimality and flexibility of regional traffic flow.

✦ Generated by Eureka AI based on patent content.

Figure CN122245129A_ABST


Abstract

The invention discloses a multi-intersection vehicle-road cooperative control method, device, and medium based on hierarchical reinforcement learning. The method comprises: acquiring the intersection nodes and road connection relationships of the traffic network to be analyzed and constructing a road network model from them; deploying a manager agent, several traffic light agents, and several connected autonomous vehicle (CAV) agents at each intersection; each traffic light agent collecting the current traffic state of its intersection, and each CAV agent collecting the current traffic flow state of its road; the manager agent at each intersection calculating a global reward offset based on the shared global traffic state and traffic flow state, and passing the global reward offset to the reward functions of the corresponding traffic light agents and CAV agents; and each traffic light agent and each CAV agent adjusting, respectively, the traffic light phase and the CAV trajectory planning based on the global reward offset.
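
The first two steps of the abstract (building the road network model and deploying agents per intersection) can be illustrated with a short Python sketch. The abstract does not specify a data structure, so this assumes a directed adjacency list for the road network and a simple record of agent identifiers per intersection. `build_road_network`, `deploy_agents`, `IntersectionAgents`, and the node names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class IntersectionAgents:
    """Identifiers of the agents deployed at one intersection (illustrative)."""
    manager: str
    traffic_lights: List[str] = field(default_factory=list)
    cavs: List[str] = field(default_factory=list)  # assumed to fill as CAVs arrive


def build_road_network(nodes: List[str],
                       roads: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Build a directed adjacency list: intersection -> downstream intersections."""
    adjacency: Dict[str, List[str]] = {node: [] for node in nodes}
    for upstream, downstream in roads:
        adjacency[upstream].append(downstream)
    return adjacency


def deploy_agents(nodes: List[str]) -> Dict[str, IntersectionAgents]:
    """Attach one manager and one traffic light agent to every intersection."""
    return {
        node: IntersectionAgents(manager=f"manager_{node}",
                                 traffic_lights=[f"light_{node}"])
        for node in nodes
    }


# Three intersections in a line, with two-way roads between neighbours.
nodes = ["A", "B", "C"]
roads = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")]
network = build_road_network(nodes, roads)
agents = deploy_agents(nodes)
```

A directed adjacency list is only one possible representation; it is chosen here because each road direction can carry its own traffic flow state.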