A formation method for unmanned vehicles on highways based on multi-agent reinforcement learning

The invention relates to unmanned vehicles and reinforcement learning and is applied in the field of intelligent vehicles. It addresses the poor stability of formation systems, limited vehicle perception, and high stability requirements. Its effects are enhanced stability and fault tolerance, added control constraints, and avoidance of the high training difficulty of end-to-end methods.

Active Publication Date: 2022-06-03
BEIJING INSTITUTE OF TECHNOLOGY
Cites: 4, Cited by: 0

AI Technical Summary

Problems solved by technology

First, vehicle motion states on expressways are dynamic and complex, which makes it difficult for a formation to coordinate. Second, vehicle perception is limited, so the formation system has poor stability. Third, a fixed formation mode makes the system less flexible and has a greater impact on surrounding vehicles.
[0003] Formation methods based on traditional control require complex controller design, and system-level control places high demands on the stability of each individual vehicle. If a vehicle fails while driving in formation, the control program must be changed manually. In highway scenarios, a fixed control mode also costs the system flexibility and adaptability to environmental changes.
Reinforcement learning is a branch of machine learning. With the development of artificial intelligence and machine learning, it has gradually been applied to autonomous driving tasks, but usually for single-vehicle intelligence; its advantages in multi-agent settings have not been fully exploited.



Examples


Embodiment 1

[0051] S1: Initialize the training environment.

[0063]

[0064]

[0065]

[0070]

[0071] where μ ∈ {S, T} indicates that the curve is divided into two dimensions, S and T; S denotes displacement, T denotes sampling time, and f(t) means

[0072] For lateral control, a proportional controller model was employed to convert the calculated lateral velocity to a heading reference.
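The conversion described in [0072] can be sketched as follows. This is a minimal illustration, not the patent's controller: the function names, the gains `kp_lateral` and `kp_heading`, the steering limit, and the use of `asin` to turn a lateral velocity into a heading angle are all assumptions chosen for the example.

```python
import math


def lateral_velocity_command(lateral_offset, kp_lateral=0.5):
    """Proportional law: drive the lateral offset from the target lane to zero.

    kp_lateral is a hypothetical gain, not a value from the source.
    """
    return -kp_lateral * lateral_offset


def heading_reference(lateral_velocity, speed, lane_heading=0.0):
    """Convert a commanded lateral velocity (m/s) into a heading reference (rad).

    Assumes the lateral velocity is the component of the speed vector that is
    perpendicular to the lane, so sin(heading - lane_heading) = v_lat / speed.
    """
    if speed <= 0.0:
        return lane_heading
    ratio = max(-1.0, min(1.0, lateral_velocity / speed))
    return lane_heading + math.asin(ratio)


def steering_command(heading_ref, heading, kp_heading=1.0, max_steer=math.pi / 6):
    """Proportional heading tracking with angle wrapping and saturation."""
    error = (heading_ref - heading + math.pi) % (2.0 * math.pi) - math.pi
    return max(-max_steer, min(max_steer, kp_heading * error))
```

For example, a vehicle travelling at 20 m/s that is 2 m to one side of the target lane centre would get a lateral velocity command of -1 m/s under these assumed gains, which maps to a small heading reference toward the lane centre.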

[0076] Further, let θ denote the parameters of the Q-MIX network, that is, its weights and biases; the final loss

[0077]

[0080]

[0081] wherein, represents the Target-Q target network.
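The loss formula itself is not recoverable from this excerpt. As a hedged illustration only, the sketch below computes the squared temporal-difference loss commonly used with QMIX-style training, where a separate target network supplies the bootstrap value. The function names, the discount `gamma`, and the array layout are assumptions, not taken from the patent.

```python
import numpy as np


def td_targets(rewards, dones, target_q_tot_next, gamma=0.99):
    """Bootstrap targets y = r + gamma * (1 - done) * Q_tot_target(next).

    target_q_tot_next is assumed to be the target network's joint value for
    the next step, already maximised over each agent's actions.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    target_q_tot_next = np.asarray(target_q_tot_next, dtype=float)
    return rewards + gamma * (1.0 - dones) * target_q_tot_next


def td_loss(q_tot, targets):
    """Mean squared TD error between the mixed value Q_tot and the targets."""
    q_tot = np.asarray(q_tot, dtype=float)
    return float(np.mean((np.asarray(targets, dtype=float) - q_tot) ** 2))
```

In a training loop under these assumptions, the targets would be treated as constants (no gradient flows through the target network), and the target network's parameters would be copied from θ at intervals.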

Embodiment 2

[0084] This embodiment provides a formation decision for unmanned vehicles on highways based on multi-agent reinforcement learning

[0086] In step S2, the local observations of each vehicle are fed as input into the DRQN network. Build two structural phases
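To make the data flow in [0086] concrete, here is a minimal forward-pass sketch of a recurrent Q-network for one vehicle: a local observation goes through a linear layer, a GRU cell carries memory across time steps, and a linear head outputs one Q-value per action. The class name `TinyDRQN`, the layer sizes, the GRU choice, and the three-action setup are illustrative assumptions; the patent's actual architecture is not given in this excerpt.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyDRQN:
    """Forward pass only: obs -> linear + ReLU -> GRU cell -> Q-values."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        self.hidden_dim = hidden_dim
        self.W_in = rng.normal(0.0, scale, (hidden_dim, obs_dim))
        self.b_in = np.zeros(hidden_dim)
        # GRU weights stacked as [update gate, reset gate, candidate state].
        self.W = rng.normal(0.0, scale, (3, hidden_dim, hidden_dim))
        self.U = rng.normal(0.0, scale, (3, hidden_dim, hidden_dim))
        self.b = np.zeros((3, hidden_dim))
        self.W_out = rng.normal(0.0, scale, (n_actions, hidden_dim))
        self.b_out = np.zeros(n_actions)

    def initial_state(self):
        return np.zeros(self.hidden_dim)

    def step(self, obs, h):
        """Return (q_values, new_hidden_state) for one local observation."""
        x = np.maximum(0.0, self.W_in @ obs + self.b_in)
        z = sigmoid(self.W[0] @ x + self.U[0] @ h + self.b[0])
        r = sigmoid(self.W[1] @ x + self.U[1] @ h + self.b[1])
        n = np.tanh(self.W[2] @ x + r * (self.U[2] @ h) + self.b[2])
        h_new = (1.0 - z) * n + z * h
        return self.W_out @ h_new + self.b_out, h_new


# Usage: one vehicle, hypothetical 6-dimensional observation, 3 lane actions.
net = TinyDRQN(obs_dim=6, hidden_dim=8, n_actions=3)
h = net.initial_state()
for _ in range(4):
    q_values, h = net.step(np.ones(6), h)
greedy_action = int(np.argmax(q_values))
```

The hidden state is what lets each vehicle act on a history of partial observations rather than on a single frame.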

[0087] Set up an experience replay buffer (a "memory playback unit") and randomly sample experiences from it for training, which breaks up the training sample
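A replay buffer of the kind described in [0087] can be sketched in a few lines. The class name, the fixed capacity, and the transition layout are assumptions for illustration; the source does not specify them.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest item when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        """Draw batch_size distinct transitions uniformly at random."""
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: each transition is a hypothetical (obs, action, reward, next_obs) tuple.
buf = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buf.push((step, 0, 0.0, step + 1))
batch = buf.sample(2)
```

Sampling uniformly from stored experience, instead of training on consecutive steps, reduces the correlation between samples in a batch.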

[0089]

[0090]

[0092]

[0094] Equality constraints, including position and velocity constraints at the initial time and position constraints at the termination time. On the S dimension

[0095] P

[0096] P

[0097] n·(P

[0098] n·(P

[0099] where the subscript 0 denotes the start point and 3 denotes the end point.
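The constraint formulas in [0095] to [0098] are incomplete in this excerpt. The surviving fragments (control points P with subscripts 0 to 3 and a factor n) are consistent with a degree-n Bézier curve, so the sketch below assumes a cubic Bézier curve in the S dimension over the unit parameter interval. Under that assumption the curve starts at P0, ends at P3, and has initial derivative n·(P1 − P0). The function names and the numeric values are illustrative only.

```python
from math import comb


def bezier(points, t):
    """Evaluate a Bezier curve with scalar control points at t in [0, 1]."""
    n = len(points) - 1
    return sum(
        comb(n, i) * (1.0 - t) ** (n - i) * t ** i * p
        for i, p in enumerate(points)
    )


def equality_residuals(points, s_start, v_start, s_end):
    """Residuals of the boundary conditions; all zeros means satisfied.

    Assumed conditions: initial position, initial velocity (with respect to
    the unit curve parameter), and terminal position.
    """
    n = len(points) - 1
    return (
        points[0] - s_start,
        n * (points[1] - points[0]) - v_start,
        points[-1] - s_end,
    )


# Usage with hypothetical control points in the S dimension.
control_points = [0.0, 2.0, 5.0, 9.0]
residuals = equality_residuals(control_points, s_start=0.0, v_start=6.0, s_end=9.0)
```

If the curve is parameterised by real time over a horizon of length T rather than the unit interval, the derivative terms would be scaled by 1/T.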

[0100] Inequality constraints, including position, velocity, and acceleration constraints on the control points. Inequality constraints are optimizations

[0101] S

[0102] P

[0103]

[0104]
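The inequality formulas are likewise missing here. As a sketch under the same Bézier assumption, position, velocity, and acceleration bounds can be checked on the control points of the curve and of its first and second derivatives. Because a Bézier curve lies inside the convex hull of its control points, bounding those points also bounds the curve. The function names and bound values are hypothetical.

```python
def derivative_control_points(points, order):
    """Control points of the order-th derivative of a Bezier curve on [0, 1]."""
    n = len(points) - 1
    pts = list(points)
    for k in range(order):
        pts = [(n - k) * (b - a) for a, b in zip(pts, pts[1:])]
    return pts


def satisfies_bounds(points, s_bounds, v_bounds, a_bounds):
    """True if position, velocity, and acceleration control points are in range."""
    checks = ((0, s_bounds), (1, v_bounds), (2, a_bounds))
    return all(
        low <= q <= high
        for order, (low, high) in checks
        for q in derivative_control_points(points, order)
    )


# Usage with hypothetical control points and bounds (unit curve parameter).
control_pts = [0.0, 2.0, 5.0, 9.0]
feasible = satisfies_bounds(control_pts, (0.0, 10.0), (0.0, 15.0), (-3.0, 8.0))
too_fast = satisfies_bounds(control_pts, (0.0, 10.0), (0.0, 10.0), (-3.0, 8.0))
```

In an optimizer these checks would appear as linear inequality constraints on the control points, since each derivative control point is a linear combination of the original ones.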

[0106] Step S4, execute ...


Abstract

The invention provides a formation method for unmanned vehicles on highways based on multi-agent reinforcement learning. The vehicle formation problem is treated as a multi-agent cooperation problem in which each vehicle can make independent decisions. This enables flexible formation on the premise of safe and fast driving: when traffic is heavy, vehicles avoid obstacles safely and need not hold the formation, and when traffic is light the formation is restored. End-to-end methods that map image input directly to vehicle control quantities are very difficult to train because of the large action search space. The invention therefore uses multi-agent reinforcement learning only to learn the lane-changing strategy, and then combines it with an S-T graph trajectory optimization method to compute precise control quantities. This adds control constraints, respects kinematic principles, ensures safety, and conforms to human driving habits.

Description

Formation method for unmanned vehicles on expressways based on multi-agent reinforcement learning

Technical field

The invention belongs to the technical field of intelligent vehicles, and in particular relates to a formation method for unmanned vehicles on expressways based on multi-agent reinforcement learning.

Background technique

[0002] Autonomous vehicles have been researched for decades and can replace humans in cumbersome operations in complex scenarios such as high-density, long-cycle, and high-traffic settings, which gives them high social and economic value. Expressways have clear topology, known traffic rules, clear constraints, and relative closure, which makes them a typical scenario for deploying autonomous driving. Among these applications, the formation of intelligent logistics vehicles is a key problem worth studying, and it plays an important role in reducing fuel consumption, improving fleet operating efficiency, and reducing traffic congestion. However, for...


Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06Q10/04, G06F30/27, G06N3/04, G06F111/04, G06F111/08
CPC: G06Q10/04, G06F30/27, G06N3/04, G06F2111/04, G06F2111/08, Y02T10/40
Inventor: 王美玲, 陈思园, 宋文杰, 王凯
Owner: BEIJING INSTITUTE OF TECHNOLOGY