Deep reinforcement learning robust training method and device based on neuron coverage rate

A robust training method and device for deep reinforcement learning. It addresses the problems that an agent under conventional training may fail, improve only slightly, or decline in performance, and it achieves enhanced robustness and sufficient logic coverage.

Pending Publication Date: 2021-08-24
ZHEJIANG UNIV OF TECH
Cites: 0 | Cited by: 2

AI Technical Summary

Problems solved by technology

First, most training methods generate a large number of repetitive operations. When the agent first learns a task, it fails easily; as learning continues, failures become less frequent.
The agent then keeps encountering situations it has already mastered, so its performance improves very little at this stage. Such training leaves the agent unreasonably starved of informative data.



Examples


Embodiment Construction

[0032] To make the objectives, technical solution, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit its scope of protection.

[0033] When the overall return of a deep reinforcement learning model approaches convergence in the later stage of training, a large number of repeated successful episodes slow training down, and the agent lacks training on extreme cases. To address this problem, the embodiment of the present invention provides a deep reinforcement learning robust training method and device based on neuron coverage, used to train a deep reinforcement learning model (that is, an agent) in the field of autonomous driving to improve the robustness of the age...
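The excerpt does not define neuron coverage, so the following is only a minimal sketch under an assumption: it uses the definition common in neural-network testing, where a neuron counts as covered if its activation exceeds a threshold for at least one input. The function name `neuron_coverage`, the threshold value, and the activation data are all hypothetical and are not taken from the patent.

```python
from typing import Sequence


def neuron_coverage(activations: Sequence[Sequence[float]],
                    threshold: float = 0.0) -> float:
    """Fraction of neurons whose activation exceeds `threshold` for at
    least one recorded input.

    `activations[i][j]` is the output of neuron j for input (state) i.
    This definition is an assumption; the source text does not give one.
    """
    if not activations or not activations[0]:
        return 0.0
    num_neurons = len(activations[0])
    covered = [False] * num_neurons
    for row in activations:
        for j, value in enumerate(row):
            if value > threshold:
                covered[j] = True
    return sum(covered) / num_neurons


# Hypothetical activations of 4 neurons recorded over 3 states.
acts = [
    [0.9, 0.0, 0.1, 0.0],
    [0.8, 0.0, 0.3, 0.0],
    [0.7, 0.6, 0.0, 0.0],
]
coverage = neuron_coverage(acts, threshold=0.5)  # neurons 0 and 1 are covered
```

Under this reading, a long run of near-identical successful episodes would keep activating the same neurons, so coverage would stop growing, which is one way to quantify the repetition problem described in [0033].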



Abstract

The invention discloses a deep reinforcement learning robust training method and device based on neuron coverage rate. The method comprises the following steps: (1) building an intelligent driving environment, collecting state data from it, and training a deep reinforcement learning model until a set return value is reached; (2) running the trained deep reinforcement learning model in the environment and extracting multiple rounds of state-action pairs; (3) constructing a predictor, which predicts the sequence of state-action pairs at future moments from the historical sequence of state-action pairs, and a classifier, which classifies state-action pairs by quality, and training both on the extracted state-action pairs; and (4) sampling state-action pairs by quality according to a defined adversarial sampling strategy, and retraining the deep reinforcement learning model to improve its robustness.
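The abstract does not specify the adversarial sampling strategy in step (4). As a rough illustration only, the sketch below assumes the classifier outputs a quality score in [0, 1] for each state-action pair and that lower-quality pairs should be drawn more often for retraining. The names `sampling_weights` and `sample_for_retraining`, the weighting formula, and the example data are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

StateAction = Tuple[str, str]  # placeholder type for a (state, action) pair


def sampling_weights(quality_scores: Sequence[float],
                     eps: float = 1e-3) -> List[float]:
    """Map quality scores in [0, 1] to normalized sampling weights.

    Lower quality gives a higher weight (an assumed rule, not the
    patent's); `eps` keeps every pair reachable.
    """
    raw = [(1.0 - q) + eps for q in quality_scores]
    total = sum(raw)
    return [w / total for w in raw]


def sample_for_retraining(pairs: Sequence[StateAction],
                          quality_scores: Sequence[float],
                          k: int,
                          seed: int = 0) -> List[StateAction]:
    """Draw k pairs with replacement, favoring low-quality ones."""
    if len(pairs) != len(quality_scores):
        raise ValueError("pairs and quality_scores must have equal length")
    rng = random.Random(seed)  # seeded for reproducibility
    weights = sampling_weights(quality_scores)
    return rng.choices(list(pairs), weights=weights, k=k)


# Hypothetical state-action pairs with classifier quality scores.
pairs = [("s0", "keep_lane"), ("s1", "brake"), ("s2", "steer_left")]
scores = [0.95, 0.10, 0.50]
weights = sampling_weights(scores)
batch = sample_for_retraining(pairs, scores, k=5)
```

The sampled batch would then feed the retraining step; how the predictor's output enters the sampling decision is not stated in the abstract and is left out here.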

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, in particular to a robust training method and device for deep reinforcement learning based on neuron coverage.

Background art

[0002] With the rapid development of artificial intelligence, deep reinforcement learning algorithms have become some of the most closely watched algorithms in this field. Deep reinforcement learning combines the perception ability of deep learning with the decision-making ability of reinforcement learning, can perform end-to-end control directly from the input information, and solves sequential decision-making problems in high-dimensional state spaces. Because of its excellent performance, deep reinforcement learning is widely used in autonomous driving, automatic translation, dialogue systems, and video detection. However, neural networks are black boxes that lack interpretability, and their security is difficult to guarantee. Therefore, it is very...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/08, G06K9/62
CPC: G06N3/08, G06F18/241, Y02T10/40
Inventor: 陈晋音, 王珏, 章燕, 王雪柯, 胡书隆
Owner: ZHEJIANG UNIV OF TECH