Automatic driving decision-making method and system based on partially observable transfer reinforcement learning

A reinforcement learning and automatic driving technology, applied in control/regulation systems, motor vehicles, non-electric variable control, etc. It addresses problems such as the unsatisfactory performance of driving strategies transferred between dissimilar road scenarios, and achieves improved utilization, strong robustness, and wide application prospects.

Active Publication Date: 2020-04-17
NANJING UNIV

AI Technical Summary

Problems solved by technology

However, because the dynamical systems of the target task and the source task differ, prior knowledge obtained from a single source task cannot always perfectly solve the subproblems in the target task.
Taking autonomous driving as an example, highways and urban roads differ in the number of lanes, traffic density, and speed limit, so a driving strategy learned on urban roads may perform unsatisfactorily on highways.



Examples


Embodiment Construction

[0067] The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments only illustrate the present invention and are not intended to limit its scope. After reading the present invention, modifications of various equivalent forms made by those skilled in the art all fall within the scope defined by the appended claims of the present application.

[0068] In the present invention, a driving scheme consists of an observation set, a driving strategy, and a termination function, and the task to be completed is to travel from one point on the map to another quickly and safely. If a driving scheme drives reliably under the current road conditions, it is given a positive reward; otherwise, it is given a negative reward. To maximize the cumulative reward, we need to find the optimal mapping from road...
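The three-part driving scheme in [0068] can be pictured as a small data structure. The following is only a sketch under stated assumptions: the names `DrivingScheme`, `applies_to`, `reward_for`, and `lane_keep`, the string-valued observations and actions, and the reward magnitudes of +1 and -1 are all hypothetical and are not taken from the patent, which states only that the reward is positive or negative.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class DrivingScheme:
    """Hypothetical container for the triple described in [0068]."""
    observations: FrozenSet[str]         # observation set the scheme covers (assumed discrete)
    policy: Callable[[str], str]         # driving strategy: observation -> action
    termination: Callable[[str], float]  # probability that the scheme ends in an observation

    def applies_to(self, observation: str) -> bool:
        """Return True if the observation belongs to the scheme's observation set."""
        return observation in self.observations


def reward_for(drove_reliably: bool) -> float:
    """Positive reward for reliable driving, negative otherwise (magnitudes assumed)."""
    return 1.0 if drove_reliably else -1.0


# Illustrative instance; the observation and action labels are made up.
lane_keep = DrivingScheme(
    observations=frozenset({"clear_lane", "slow_lead_vehicle"}),
    policy=lambda obs: "brake" if obs == "slow_lead_vehicle" else "cruise",
    termination=lambda obs: 1.0 if obs == "lane_blocked" else 0.0,
)
```

Here `lane_keep.applies_to("clear_lane")` is true, while an observation outside the set, such as `"lane_blocked"`, makes the scheme terminate with probability 1.0 in this toy setup.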


Abstract

The invention discloses an automatic driving decision-making method and system based on partially observable transfer reinforcement learning. It employs a scene-related scheme reuse method, in which existing schemes in a transfer driving scheme database assist in solving driving problems under unfamiliar road conditions. To achieve a good riding experience, reinforcement learning is used to solve the decision problem in the field of automatic driving. The system comprises a scene unit, a sensing unit, a decision-making unit, an action planning unit and a control unit. New environment models are added to a virtual environment database to cope with increasingly complex driving scenarios; a convolution layer is added to the neural network to identify obstacles around the vehicle; a long short-term memory unit is added to the neural network to memorize important historical information; the Q value is estimated more accurately using a weighted deep double-Q network algorithm based on Boltzmann soft maximization; and the probability that each driving scheme is selected is solved using a maximum entropy Mellowmax algorithm.
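The abstract names two soft-maximum operators without defining them. As a rough sketch, assuming the commonly used definitions (Boltzmann softmax as a softmax-weighted average of Q-values, Mellowmax as a scaled log-mean-exp), they could be computed as below. The function names and the parameters `tau` and `omega` are assumptions for illustration. The sketch does not reproduce how the patent combines these operators with the weighted deep double-Q network or how it derives the selection probabilities.

```python
import math
from typing import Sequence


def boltzmann_softmax(q_values: Sequence[float], tau: float = 1.0) -> float:
    """Softmax-weighted average of Q-values; tau is an assumed temperature."""
    top = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / tau) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total


def mellowmax(q_values: Sequence[float], omega: float = 5.0) -> float:
    """Mellowmax: log(mean(exp(omega * q))) / omega, with assumed omega > 0."""
    top = max(q_values)  # log-sum-exp shift keeps exp() from overflowing
    mean_exp = sum(math.exp(omega * (q - top)) for q in q_values) / len(q_values)
    return top + math.log(mean_exp) / omega
```

For positive `tau` and `omega`, both operators return a value between the mean and the maximum of the inputs, and both return the common value when all Q-values are equal.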

Description

Technical field

[0001] The invention relates to an automatic driving decision-making method and system based on partially observable transfer reinforcement learning, which is applicable to partially observable driving environments and belongs to the technical field of automobile automatic driving.

Background technique

[0002] Autonomous driving needs to solve three problems: localization, path planning and selection of driving behavior. At present, the problem of "where am I" can be solved using a variety of sensor fusion technologies, and the problem of "how to go" can be solved using algorithms such as Dijkstra, A*, and dynamic programming. However, the choice of driving behavior, such as how to drive on city roads or how to ensure safety while driving at night, is still a research hotspot today.

[0003] We regard the driving process of the car as a partially observable Markov decision process (Partially Observable Markov Decision Process, POMDP). As a result, the vehi...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G05D1/02
CPC: G05D1/0214, G05D1/0221, G05D1/024, G05D1/0246, G05D1/0257, G05D1/0276, G05D2201/0212
Inventor: 章宗长, 俞扬, 周志华, 王艺深, 蒋俊鹏
Owner: NANJING UNIV