Expressway unmanned vehicle formation method based on multi-agent reinforcement learning

A method combining unmanned vehicles with reinforcement learning, applied in the field of intelligent vehicles. It addresses the poor stability of formation systems, limited vehicle perception, and high stability requirements on individual vehicles, and achieves enhanced stability and fault tolerance, added control constraints, and reduced training difficulty.

Active Publication Date: 2021-08-13
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

First, vehicle motion on an expressway is dynamic and complex, which makes it difficult for a vehicle formation to coordinate. Second, vehicle perception is limited, so the stability of the formation system is poor. Third, a fixed formation mode makes the system less flexible and has a greater impact on surrounding vehicles.
[0003] Formation methods based on traditional control require complex controller design, and system-level control methods place high demands on the stability of each individual vehicle. If a vehicle fails during formation driving, the control program has to be changed manually. In expressway scenarios, fixed control modes also cost the system flexibility and adaptability to environmental changes.
Reinforcement learning is a branch of machine learning. With the development of artificial intelligence and machine learning, reinforcement learning has gradually been applied to autonomous driving tasks, but it usually targets single-vehicle intelligence, and its advantages in multi-agent settings have not been fully exploited.



Examples


Embodiment 1

[0050] As shown in Figure 1, the present invention relates to an expressway unmanned vehicle formation method based on multi-agent reinforcement learning. Environmental information is fed as observations into a trained Q-MIX network, which yields the action decision of each unmanned vehicle and thereby realizes the formation. The training method of the Q-MIX network comprises the following steps:

[0051] S1: Initialize the training environment.

[0052] S2: Input the environmental information of the training environment into the Q-MIX network as observations to obtain the action decision of each unmanned vehicle, that is, the decision strategy each unmanned vehicle adopts in the current scene. There are three possible decisions: change lane to the left, keep lane, and change lane to the right.
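
The decision step in S2 can be illustrated with a minimal sketch. It assumes that each vehicle's agent network has already produced one Q-value per lane decision; the names `LaneAction` and `select_actions`, the index order of the actions, and the `epsilon` exploration parameter are illustrative assumptions and do not come from the patent text.

```python
import random
from enum import IntEnum


class LaneAction(IntEnum):
    """The three discrete decisions named in S2 (index order is assumed)."""
    CHANGE_LEFT = 0
    KEEP_LANE = 1
    CHANGE_RIGHT = 2


def select_actions(q_values, epsilon=0.0, rng=None):
    """Pick one LaneAction per vehicle from its per-action Q-values.

    q_values: list of per-vehicle lists, each holding three Q-values.
    epsilon:  probability of a random exploratory action (assumed
              epsilon-greedy exploration; 0.0 gives the greedy decision).
    """
    rng = rng or random.Random()
    actions = []
    for q in q_values:
        if len(q) != len(LaneAction):
            raise ValueError("expected one Q-value per lane action")
        if rng.random() < epsilon:
            # Explore: pick a uniformly random lane decision.
            actions.append(LaneAction(rng.randrange(len(LaneAction))))
        else:
            # Exploit: pick the decision with the highest Q-value.
            best = max(range(len(q)), key=lambda i: q[i])
            actions.append(LaneAction(best))
    return actions


# Hypothetical Q-values for a three-vehicle formation.
example_q = [[0.1, 0.7, 0.2], [0.9, 0.3, 0.1], [0.0, 0.2, 0.5]]
decisions = select_actions(example_q)
```

With `epsilon=0.0` every vehicle takes its highest-valued decision; a nonzero `epsilon` would typically be used only during training.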

[0053] Further, the observations of environmental information comprise local observations and global observations, wherein the local observations ...

Embodiment 2

[0084] This embodiment provides a formation decision-making method for unmanned vehicles on expressways based on multi-agent reinforcement learning. The method framework is shown in Figure 4. The method divides decision-making and control into two parts. The first part feeds the environmental information as observations into the QMIX network and outputs the current decision of each formation vehicle (lane change to the left, lane keeping, or lane change to the right). The second part, based on the decision information, performs trajectory planning and calculates the control variables (acceleration, direction). The reward a vehicle obtains for performing this action serves as the reward value of QMIX. After training, a decision model for intelligent vehicle formation in expressway scenarios is obtained. That is to say, the present invention trains a set of decision-making and control strategies for intelligent vehicle formations on expressways th...
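
This excerpt does not spell out how QMIX combines the per-vehicle values during training. As a rough sketch of the monotonic mixing idea from the published QMIX algorithm, the example below maps per-vehicle Q-values and a global state to one joint value. The class name `MonotonicMixer`, the layer sizes, the random initialization, and the use of single linear layers as hypernetworks are simplifying assumptions, not the patent's implementation.

```python
import math
import random


def elu(x):
    """ELU activation, monotonically increasing in x."""
    return x if x > 0 else math.exp(x) - 1.0


def linear(rows, vec):
    """Bias-free dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) for row in rows]


class MonotonicMixer:
    """Mixes per-vehicle Q-values into a joint value, QMIX-style."""

    def __init__(self, n_agents, state_dim, embed_dim=4, seed=0):
        rng = random.Random(seed)

        def init(n_out):
            return [[rng.uniform(-1.0, 1.0) for _ in range(state_dim)]
                    for _ in range(n_out)]

        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Hypernetworks (single linear layers here, a simplification):
        # they map the global state to the mixing weights and biases.
        self.hyper_w1 = init(n_agents * embed_dim)
        self.hyper_b1 = init(embed_dim)
        self.hyper_w2 = init(embed_dim)
        self.hyper_b2 = init(1)

    def q_total(self, agent_qs, state):
        # abs() keeps the mixing weights non-negative, so the joint value
        # never decreases when any single vehicle's Q-value increases.
        w1 = [abs(w) for w in linear(self.hyper_w1, state)]
        b1 = linear(self.hyper_b1, state)
        w2 = [abs(w) for w in linear(self.hyper_w2, state)]
        b2 = linear(self.hyper_b2, state)[0]
        hidden = []
        for j in range(self.embed_dim):
            pre = sum(agent_qs[i] * w1[i * self.embed_dim + j]
                      for i in range(self.n_agents)) + b1[j]
            hidden.append(elu(pre))
        return sum(h * w for h, w in zip(hidden, w2)) + b2


# Hypothetical values: three vehicles, a four-dimensional global state.
mixer = MonotonicMixer(n_agents=3, state_dim=4)
state = [0.5, -0.2, 0.1, 0.9]
base = mixer.q_total([0.7, 0.9, 0.5], state)
raised = mixer.q_total([0.7, 1.4, 0.5], state)
```

Because the weights are non-negative and ELU is increasing, `raised` cannot be smaller than `base`. This monotonicity is what allows each vehicle to choose its own best lane decision while remaining consistent with the jointly trained value.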


Abstract

The invention provides an expressway unmanned vehicle formation method based on multi-agent reinforcement learning. The method treats vehicle formation as a multi-agent cooperation problem in which each vehicle has independent decision-making capability. It achieves flexible formation on the premise of safe and fast driving: when traffic flow is heavy, vehicles avoid obstacles safely without having to keep the formation, and when traffic flow is light, the formation is restored. An end-to-end approach that maps image input directly to vehicle control quantities is difficult to train because of its large action search space. The method therefore uses multi-agent reinforcement learning only to learn the lane-changing strategy and computes the precise control quantities with an S-T graph trajectory optimization method, which adds control constraints, respects vehicle kinematics, provides a safety guarantee, and matches human driving habits.

Description

Technical field

[0001] The invention belongs to the technical field of intelligent vehicles, and in particular relates to a formation method for unmanned vehicles on expressways based on multi-agent reinforcement learning.

Background art

[0002] Autonomous vehicles have been researched for decades. They can take over tedious operations from humans in complex scenarios characterized by high density, long cycles, and heavy traffic flow, and have high social and economic value. Expressways have a clear topology, known traffic rules, clear restrictions, and a relatively closed environment, which makes them a typical scenario for deploying autonomous driving. Within this scenario, the formation of intelligent logistics vehicles is a key problem worth studying, since it plays an important role in reducing fuel consumption, improving fleet operating efficiency, and reducing traffic congestion. However, there are still many problems for formation tasks on high-speed st...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06Q10/04, G06F30/27, G06N3/04, G06F111/04, G06F111/08
CPC: G06Q10/04, G06F30/27, G06N3/04, G06F2111/04, G06F2111/08, Y02T10/40
Inventor: 王美玲, 陈思园, 宋文杰, 王凯
Owner: BEIJING INSTITUTE OF TECHNOLOGY