Deep reinforcement learning robust training method and device based on neuron coverage rate

A reinforcement learning training technique, applied to robust training methods and devices for deep reinforcement learning. It addresses problems such as agent failure and small improvement or decline in agent performance, and achieves enhanced robustness and sufficient logic coverage.

Pending Publication Date: 2021-08-24

ZHEJIANG UNIV OF TECH

Cites: 0; Cited by: 2

Problems solved by technology

First, most training methods generate a large number of repetitive operations. When the agent first learns a task, it fails easily; as learning continues, failures become less frequent.

The agent then keeps encountering situations it has already mastered, so its performance improves very little at this stage. Such training leads to unreasonable data starvation.

Embodiment Construction

[0032] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, and do not limit the protection scope of the present invention.

[0033] When the overall return of the deep reinforcement learning model approaches convergence in the later stage of training, a large number of repeated successful episodes slow training down, and the agent lacks training on extreme cases. To address this problem, the embodiment of the present invention provides a deep reinforcement learning robust training method and device based on neuron coverage, used to train a deep reinforcement learning model (that is, an agent) in the field of autonomous driving to improve the robustness of the age...
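The visible text names neuron coverage but does not define it. A minimal sketch, assuming the definition common in neural network testing (the fraction of neurons whose activation exceeds a threshold for at least one input), is shown here. The tiny ReLU network, the `threshold` parameter, and all function names are illustrative assumptions and are not taken from the patent.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Layer = Tuple[Matrix, Sequence[float]]  # (weights[out][in], bias[out])


def relu_layer(x: Sequence[float], weights: Matrix, bias: Sequence[float]) -> List[float]:
    """Apply one dense layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def neuron_coverage(states: Sequence[Sequence[float]],
                    layers: Sequence[Layer],
                    threshold: float = 0.0) -> float:
    """Fraction of neurons activated above `threshold` by at least one state."""
    covered = [[False] * len(bias) for _, bias in layers]
    for state in states:
        x = list(state)
        for li, (weights, bias) in enumerate(layers):
            x = relu_layer(x, weights, bias)
            for ni, activation in enumerate(x):
                if activation > threshold:
                    covered[li][ni] = True
    total = sum(len(c) for c in covered)
    if total == 0:
        return 0.0
    return sum(sum(c) for c in covered) / total


# Hypothetical one-layer policy network with three neurons and 2-D states.
demo_layers: List[Layer] = [
    ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
]
repeated_states = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]   # same situation again and again
diverse_states = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # includes an unusual case

low = neuron_coverage(repeated_states, demo_layers)
high = neuron_coverage(diverse_states, demo_layers)
```

Under this assumed definition, the repeated states exercise only one of the three neurons, while the diverse set exercises all of them, which mirrors the concern that repeated successful episodes leave parts of the agent's decision logic untested.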

Abstract

The invention discloses a deep reinforcement learning robust training method and device based on neuron coverage rate. The method comprises the following steps: (1) building an intelligent driving environment, collecting state data from it, and training a deep reinforcement learning model until a set return value is reached; (2) running the trained deep reinforcement learning model in the environment and extracting multiple rounds of state-action pairs; (3) constructing a predictor that predicts the future state-action pair sequence from the historical state-action pair sequence, and a classifier that classifies state-action pairs by quality, and training both on the extracted state-action pairs; and (4) sampling state-action pairs by quality according to a defined adversarial sampling strategy, and retraining the deep reinforcement learning model to improve its robustness.
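Step (4) can be illustrated with a small sketch. The abstract does not specify the adversarial sampling strategy, so this example assumes one plausible reading: state-action pairs that the classifier scores as low quality are drawn more often for retraining. The score range [0, 1], the `floor` parameter, and the function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

StateAction = Tuple[Tuple[float, ...], int]  # (state vector, discrete action)


def sampling_weights(quality_scores: Sequence[float], floor: float = 0.05) -> List[float]:
    """Turn quality scores in [0, 1] into probabilities that favor low quality.

    `floor` keeps high-quality pairs from vanishing entirely (an assumption).
    """
    raw = [max(floor, 1.0 - q) for q in quality_scores]
    total = sum(raw)
    return [r / total for r in raw]


def sample_for_retraining(pairs: Sequence[StateAction],
                          quality_scores: Sequence[float],
                          k: int,
                          seed: int = 0) -> List[StateAction]:
    """Draw k pairs with replacement, weighted toward low-quality pairs."""
    rng = random.Random(seed)
    weights = sampling_weights(quality_scores)
    return rng.choices(list(pairs), weights=weights, k=k)


# Hypothetical data: two routine pairs and one rare failure case.
pairs: List[StateAction] = [((0.0, 1.0), 0), ((0.1, 0.9), 0), ((5.0, -3.0), 2)]
scores = [0.95, 0.90, 0.10]  # assumed classifier outputs; higher means better

weights = sampling_weights(scores)
batch = sample_for_retraining(pairs, scores, k=8)
```

With these assumed scores, the rare failure case receives most of the sampling mass, so the retraining batch is no longer dominated by repeated successful behavior.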

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, in particular to a robust training method and device for deep reinforcement learning based on neuron coverage.

Background art

[0002] With the rapid development of artificial intelligence, deep reinforcement learning algorithms have become some of the most closely watched algorithms in this field. Deep reinforcement learning combines the perception ability of deep learning with the decision-making ability of reinforcement learning. It can perform end-to-end control directly from the input information and solve sequential decision-making problems in high-dimensional state spaces. Because of their excellent performance, deep reinforcement learning algorithms are widely used in autonomous driving, automatic translation, dialogue systems, and video detection. However, the neural network is a black box that lacks interpretability, and its security is difficult to guarantee. Therefore, it is very...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06N3/08, G06K9/62

CPC: G06N3/08, G06F18/241, Y02T10/40

Inventor: 陈晋音, 王珏, 章燕, 王雪柯, 胡书隆

Owner: ZHEJIANG UNIV OF TECH
