Improved deep reinforcement learning method and system based on Double DQN

A deep reinforcement learning technology, applied in the field of reinforcement learning, which can solve problems such as failure to converge

Inactive Publication Date: 2020-07-28
NANJING UNIV OF SCI & TECH
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

However, this method is greatly affected by

Method used



Examples


Embodiment Construction

[0059] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0060] In one embodiment, with reference to figure 1, an improved deep reinforcement learning method based on the Double Deep Q-Learning Network is provided, which includes the following steps:

[0061] Step 1, initialize the environment and DQN network parameters;

[0062] Here, the environment includes the state space, the action space, and the reward function r; the DQN network parameters include the current value neural network parameters, the target value neural network parameters, the DQN error function, and the replay memory unit. Among them, the neural network parameters include the number of network...
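The record truncates the parameter list here, so the following is only a minimal sketch of Step 1 under the assumption of a PyTorch implementation; the QNet class, layer sizes, and state/action dimensions are hypothetical placeholders, not taken from the patent.

```python
# Hypothetical sketch of Step 1: initialize the environment description and
# the DQN network parameters (current value network, target value network,
# error function, replay memory unit). Names and sizes are assumptions.
from collections import deque

import torch
import torch.nn as nn

class QNet(nn.Module):
    """Fully connected Q-network mapping a state to one value per action."""
    def __init__(self, state_dim, action_dim, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, action_dim),
        )

    def forward(self, state):
        return self.net(state)

# Environment description: the state space, action space and reward function r
# are fixed by the task; only their dimensions are needed here (placeholders).
state_dim, action_dim = 4, 2

# DQN parameters: current value network (Q-eval) and target value network
# (Q-target) with identical structure, the DQN error function, an optimizer,
# and the replay memory unit.
q_eval = QNet(state_dim, action_dim)
q_target = QNet(state_dim, action_dim)
q_target.load_state_dict(q_eval.state_dict())  # start from identical weights
loss_fn = nn.MSELoss()                          # DQN error function
optimizer = torch.optim.Adam(q_eval.parameters(), lr=1e-3)
replay_memory = deque(maxlen=10000)             # replay memory unit
```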



Abstract

The invention discloses an improved deep reinforcement learning method and system based on Double DQN, belonging to the field of reinforcement learning. The method comprises the following steps: initializing the environment and the DQN network parameters; accumulating experience based on an epsilon-greedy strategy and storing the experience in a replay memory unit; and training and optimizing the DQN network using samples from the replay memory unit to obtain a decision network. The method can increase the convergence speed of the Double Deep Q-Learning Network, improve the final convergence value, reduce the interference of noise on the effect of the DQN algorithm, improve the application effect of deep reinforcement learning in actual production and life, and expand the application range of deep reinforcement learning.
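As a rough illustration of the experience-accumulation step summarized above, the sketch below shows an epsilon-greedy interaction loop that stores each transition (s, a, s', r) in a replay memory; the ToyEnv class, network size, and epsilon value are illustrative assumptions rather than the patent's actual setup.

```python
# Hedged sketch of experience accumulation with an epsilon-greedy strategy.
# ToyEnv is a hypothetical stand-in environment, not from the patent.
import random
from collections import deque

import torch
import torch.nn as nn

class ToyEnv:
    """Hypothetical environment with a 4-dimensional state and 2 actions."""
    def reset(self):
        self.t = 0
        return torch.randn(4)

    def step(self, action):
        self.t += 1
        next_state = torch.randn(4)
        reward = 1.0 if action == 0 else 0.0  # arbitrary stand-in reward r
        done = self.t >= 50
        return next_state, reward, done

env = ToyEnv()
q_eval = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
replay_memory = deque(maxlen=10000)   # replay memory unit
epsilon = 0.1                         # exploration rate of epsilon-greedy

state = env.reset()
for _ in range(1000):
    # epsilon-greedy: explore with probability epsilon, otherwise exploit Q-eval
    if random.random() < epsilon:
        action = random.randrange(2)
    else:
        with torch.no_grad():
            action = int(q_eval(state).argmax().item())

    next_state, reward, done = env.step(action)
    # store the transition (s, a, s', r) plus the done flag in replay memory
    replay_memory.append((state, action, next_state, reward, done))
    state = env.reset() if done else next_state
```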

Description

technical field
[0001] The invention belongs to the field of reinforcement learning, and in particular relates to an improved deep reinforcement learning method and system based on Double DQN.
Background technique
[0002] The Double Deep Q-Learning Network is one of the most common frameworks in deep reinforcement learning and performs well in practice. DQN is divided into three parts: the environment, the replay memory unit, and the neural networks. The agent interacts with the environment to obtain the current state s, and obtains the next state s' and the reward r after taking an action a. The replay memory unit stores each tuple (s, a, s', r); after a certain amount has been stored, part of the data is extracted according to a sampling scheme and input to the neural networks for training. There are two neural networks with exactly the same structure, namely the current value network (Q-eval) and the target value network (Q-target). The input of the current value networ...
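The Double DQN idea described in this background, where the current value network selects the greedy action at the next state and the target value network evaluates it, can be sketched as follows; the network sizes, discount factor, and randomly generated minibatch are assumptions used only to make the example self-contained.

```python
# Sketch of a Double DQN update: Q-eval selects argmax_a' Q_eval(s', a') and
# Q-target evaluates that action. Sizes and the random minibatch are assumed.
import torch
import torch.nn as nn

state_dim, action_dim, batch_size, gamma = 4, 2, 32, 0.99
q_eval = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))
q_target = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))
q_target.load_state_dict(q_eval.state_dict())
optimizer = torch.optim.Adam(q_eval.parameters(), lr=1e-3)

# A stand-in minibatch (s, a, s', r) as it would be sampled from replay memory.
states = torch.randn(batch_size, state_dim)
actions = torch.randint(0, action_dim, (batch_size, 1))
next_states = torch.randn(batch_size, state_dim)
rewards = torch.randn(batch_size, 1)

# Double DQN target: Q-eval picks the greedy action at s', Q-target scores it.
with torch.no_grad():
    best_actions = q_eval(next_states).argmax(dim=1, keepdim=True)
    target_q = rewards + gamma * q_target(next_states).gather(1, best_actions)

# DQN error: squared difference between predicted and target action values.
pred_q = q_eval(states).gather(1, actions)
loss = nn.functional.mse_loss(pred_q, target_q)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Periodically, Q-target is synchronized with Q-eval (not shown here).
```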

Claims


Application Information

IPC(8): G06N3/08, G06N3/04
CPC: G06N3/08, G06N3/045
Inventor 奚思遥王力立肖强林高尚杜万年闫晓黄成单梁张永
Owner NANJING UNIV OF SCI & TECH