A continuous action online learning control method and system for an autonomous vehicle

An automatic driving and learning control technology, applied in the field of environmental perception for autonomous vehicles. It addresses insufficient data efficiency, slow convergence, and a tendency to oscillate, and thereby preserves learning quality, yields a well-performing control policy, and shortens the learning cycle.

Inactive Publication Date: 2019-06-28
NAT UNIV OF DEFENSE TECH
6 Cites, 15 Cited by

AI Technical Summary

Problems solved by technology

In the adaptive heuristic critic (evaluator) algorithm, the critic is trained with the traditional linear TD(λ) algorithm, which suffers from insufficient data efficiency, and the selection of the learning and training step size needs to ...
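
For reference, the linear TD(λ) update named above can be sketched as follows. This is a minimal textbook-style illustration with accumulating eligibility traces, not code from the patent; the function name `td_lambda_episode`, the transition tuple layout, and the default values of `alpha`, `gamma`, and `lam` are all assumptions made for the example.

```python
def td_lambda_episode(transitions, weights, alpha=0.1, gamma=0.95, lam=0.8):
    """Run one episode of linear TD(lambda) with accumulating traces.

    transitions: list of (features, reward, next_features, done) tuples,
    where each feature vector is a list of floats as long as `weights`.
    Returns a new, updated weight list (the input list is not modified).
    """
    w = list(weights)
    trace = [0.0] * len(w)  # eligibility trace, reset at episode start
    for phi, reward, phi_next, done in transitions:
        v = sum(wi * xi for wi, xi in zip(w, phi))
        v_next = 0.0 if done else sum(wi * xi for wi, xi in zip(w, phi_next))
        delta = reward + gamma * v_next - v  # temporal-difference error
        # decay old credit, then add credit for the current features
        trace = [gamma * lam * e + x for e, x in zip(trace, phi)]
        # every weight moves in proportion to its trace and the step size
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w
```

The fixed step size `alpha` appears directly in the weight update, so its value governs how quickly and how smoothly the estimate settles, and each transition is used once and then discarded.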



Embodiment Construction

[0069] As shown in Figure 3, the continuous action online learning control method for autonomous vehicles in this embodiment comprises the following steps:

[0070] 1) Acquire the current perceptual image I_t;

[0071] 2) Encode the perceptual image I_t through the deep coding network to obtain the encoded state feature s_t;

[0072] 3) Input the encoded state feature s_t into the critic (evaluator) network and the actor (actuator) network of the actor-critic model, respectively. The critic network is a cerebellar model neural network value function network and the actor network is a cerebellar model neural network policy network, so both networks use the cerebellar model neural network;

[0073] 4) Output the action a_t through the actor network, and update the parameters of the actor-critic model through the critic network.
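
The four steps above can be sketched in a few lines of Python. This is a hedged illustration only: `encode` is a hypothetical stand-in for the deep coding network, `CMAC` is a generic tile-coding approximator standing in for the cerebellar model neural network, and `learn` uses a plain one-step TD-error actor-critic (actuator-evaluator) update with Gaussian exploration. None of these names, step sizes, or update rules come from the patent text, which does not give its update equations in this excerpt.

```python
import random
from collections import defaultdict


class CMAC:
    """Minimal CMAC (tile-coding) approximator over inputs in [0, 1]^d."""

    def __init__(self, n_tilings=8, n_bins=4):
        self.n_tilings = n_tilings
        self.n_bins = n_bins
        self.weights = defaultdict(float)  # one weight per visited tile

    def active_tiles(self, state):
        tiles = []
        for k in range(self.n_tilings):
            offset = k / (self.n_tilings * self.n_bins)  # shift each tiling
            coords = tuple(int((x + offset) * self.n_bins) for x in state)
            tiles.append((k, coords))
        return tiles

    def predict(self, state):
        return sum(self.weights[t] for t in self.active_tiles(state))

    def update(self, state, step):
        # Spread the correction evenly over the active tiles.
        for t in self.active_tiles(state):
            self.weights[t] += step / self.n_tilings


def encode(image):
    """Hypothetical encoder (steps 1-2): image rows of 0..255 -> 2 features."""
    half = len(image[0]) // 2
    left = [p for row in image for p in row[:half]]
    right = [p for row in image for p in row[half:]]
    return [sum(left) / (255.0 * len(left)), sum(right) / (255.0 * len(right))]


def select_action(actor, s, sigma, rng):
    """Step 4, first half: actor mean plus Gaussian exploration noise."""
    return actor.predict(s) + rng.gauss(0.0, sigma)


def learn(actor, critic, s, a, reward, s_next,
          alpha_c=0.2, alpha_a=0.1, gamma=0.9):
    """Step 4, second half: the critic's TD error drives both updates."""
    delta = reward + gamma * critic.predict(s_next) - critic.predict(s)
    critic.update(s, alpha_c * delta)
    # Shift the actor's mean toward actions that beat the critic's estimate.
    actor.update(s, alpha_a * delta * (a - actor.predict(s)))
    return delta
```

In an online loop, each control cycle would call `encode` on the newest image, `select_action` with a seeded `random.Random` instance, apply the action to the vehicle or simulator, and then call `learn` with the observed reward and the next encoded state. Because overlapping tilings share tiles between nearby states, a correction at one state also generalizes to its neighbours.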

[0074] As shown in Figure 4, the deep coding network used in step 2) is the HELM network model.

[0075] like ...


Abstract

The invention discloses a continuous action online learning control method and system for an autonomous vehicle. The method comprises the following steps: encoding a perceptual image I_t through a deep coding network to obtain an encoded state feature s_t; inputting the encoded state feature s_t into the critic (evaluator) network and the actor (actuator) network of an actor-critic model, both of which are cerebellar model neural networks; outputting an action a_t through the actor network; and updating the parameters of the actor-critic model through the critic network. The invention combines deep neural network feature encoding with reinforcement learning principles to solve the learning control problem of a continuous action space under high-dimensional state input. It enables online learning control of a continuous action space under large-scale continuous state input, shortens the learning period while preserving the learning effect, converges quickly to a control policy with good performance, and makes good use of the data.

Description

Technical field

[0001] The present invention relates to the field of environment perception of self-driving vehicles, and in particular to a continuous action online learning control method and system for self-driving vehicles, which combine deep neural network feature encoding with reinforcement learning principles to solve the learning control problem in continuous action spaces under high-dimensional state input.

Background technique

[0002] With the development and innovation of artificial intelligence technology and the continued growth of the automobile industry in domestic and foreign markets, intelligent driving vehicles, the product of combining intelligent driving technology with automobiles, have gradually become a focus of attention for major automobile companies, high-tech companies, universities, and research institutes. Under the organic cooperation of environment perception...


Application Information

IPC(8): G06N3/04, G06N3/08, B60W30/18, B60W40/00
Inventors: 徐昕, 曾宇骏, 姚亮
Owner: NAT UNIV OF DEFENSE TECH