Acceleration method for deep reinforcement learning of simulation robot

A technology for simulated robots and reinforcement learning, applied to accelerating deep reinforcement learning of robots in a simulation environment. It addresses the high computing cost and long training time that limit the speed of deep reinforcement learning for robots, and it is easy to deploy.

Active Publication Date: 2020-01-24

NAT UNIV OF DEFENSE TECH

Cites: 5; Cited by: 4

Problems solved by technology

Simulation-based learning faces a trade-off. On the one hand, the simulation environment should be as realistic as possible so that the learning results transfer more easily to reality. On the other hand, the more realistic the simulation environment, the greater its computational cost. As a result, the time spent evolving the simulation environment dominates training time and has become a bottleneck that limits the speed of deep reinforcement learning for robots.

Embodiment Construction

[0037] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0038] A method for accelerating deep reinforcement learning of simulated robots, comprising the following steps:

[0039] Step 1: Select one node as the learning node and the other nodes as environment nodes, and perform the initialization operation. The structure of the whole system is shown in Figure 1. The number of environment nodes to start is determined by the parallelization scale required by the application. This step includes the following sub-steps:

[0040] 1.1 Initialize the deep reinforcement learning agent and agent environment that need to be accelerated in the learning node;

[0041] 1.2 Initialize an environment node for each robot simulator instance. The environment node handles the communication details with the robot simulator instance and provides a unified message interface to communicate with t...
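The role of an environment node in sub-step 1.2 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: `ToySimulator` is a hypothetical stand-in for a real robot simulator instance, and the message fields (`type`, `action`, `observation`, `reward`, `done`) are an invented format for the unified interface.

```python
class ToySimulator:
    """Hypothetical stand-in for a robot simulator instance (a 1-D robot on a line)."""

    def __init__(self, goal: float = 5.0):
        self.goal = goal
        self.position = 0.0

    def restart(self) -> float:
        self.position = 0.0
        return self.position

    def advance(self, velocity: float) -> float:
        self.position += velocity
        return self.position


class EnvironmentNode:
    """Hides simulator-specific calls behind a small, uniform message interface."""

    def __init__(self, simulator: ToySimulator):
        self.simulator = simulator

    def handle(self, message: dict) -> dict:
        kind = message["type"]
        if kind == "reset":
            observation = self.simulator.restart()
            return {"type": "state", "observation": observation,
                    "reward": 0.0, "done": False}
        if kind == "step":
            position = self.simulator.advance(message["action"])
            distance = abs(self.simulator.goal - position)
            # Assumed toy reward: negative distance to the goal.
            return {"type": "state", "observation": position,
                    "reward": -distance, "done": distance < 0.5}
        return {"type": "error", "reason": f"unknown message type: {kind}"}
```

With this shape, supporting a different simulator would only require another class exposing the same `handle` messages, so the learning code would not need to change.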

Abstract

The invention belongs to the field of robots and discloses an acceleration method for deep reinforcement learning of a simulation robot. It aims to accelerate the learning process so as to reduce the time spent on research, debugging and deployment of robot deep reinforcement learning. According to the technical scheme, one node is selected as the learning node and the other nodes are selected as environment nodes. Each environment node handles the interaction details with one robot simulator instance and provides a unified environment interaction message interface. The learning node interacts with each environment node through the message interface in a frame-by-frame simulation form, and learning data is collected from multiple environments, so that reinforcement learning is accelerated. Because the environment nodes are abstracted to adapt to various robot simulators, learning algorithm development is decoupled from simulation interaction details. Message communication allows the environment nodes and simulator instances to be deployed in a distributed computing environment, so the method is easy to deploy and extensible.
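The data-collection pattern described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: threads and `queue.Queue` objects stand in for distributed nodes and their message channel, the environment is a toy 1-D simulator, the random policy is a placeholder for a real agent, and the names `environment_node` and `collect` are hypothetical.

```python
import queue
import random
import threading


def environment_node(requests: queue.Queue, replies: queue.Queue, goal: float) -> None:
    """Serve messages for one toy simulator instance until told to close."""
    position = 0.0
    while True:
        message = requests.get()
        if message["type"] == "close":
            break
        if message["type"] == "reset":
            position = 0.0
        elif message["type"] == "step":
            position += message["action"]
        replies.put({"observation": position, "reward": -abs(goal - position)})


def collect(num_envs: int = 4, num_frames: int = 10, seed: int = 0) -> list:
    """Learning node: step all environment nodes frame by frame and gather transitions."""
    rng = random.Random(seed)
    channels = [(queue.Queue(), queue.Queue()) for _ in range(num_envs)]
    workers = [threading.Thread(target=environment_node, args=(req, rep, 5.0))
               for req, rep in channels]
    for worker in workers:
        worker.start()

    # Reset every environment and read the initial observations.
    for req, _ in channels:
        req.put({"type": "reset"})
    observations = [rep.get()["observation"] for _, rep in channels]

    transitions = []
    for _ in range(num_frames):
        # Placeholder policy: random actions instead of a learned network.
        actions = [rng.uniform(-1.0, 1.0) for _ in range(num_envs)]
        # Send one step message to every node first, so they can work concurrently.
        for (req, _), action in zip(channels, actions):
            req.put({"type": "step", "action": action})
        # Then wait for all replies before moving on to the next frame.
        for i, (_, rep) in enumerate(channels):
            reply = rep.get()
            transitions.append((i, observations[i], actions[i],
                                reply["reward"], reply["observation"]))
            observations[i] = reply["observation"]

    for req, _ in channels:
        req.put({"type": "close"})
    for worker in workers:
        worker.join()
    return transitions
```

Calling `collect(num_envs=3, num_frames=5)` returns 15 transitions of the form `(env_index, observation, action, reward, next_observation)`. In a distributed deployment the queues would presumably be replaced by network messaging, which this sketch does not attempt to model.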

Description

Technical field

[0001] The invention belongs to the field of robots and relates to an acceleration method for deep reinforcement learning of robots in a simulation environment. It can be applied to robot control tasks such as obstacle avoidance, navigation, formation, and multi-robot collaboration of intelligent robots.

Background technique

[0002] Reinforcement learning is one of the important technologies used in the field of robotics. Through reinforcement learning, robots can learn a set of action strategies to complete tasks independently through continuous attempts. This self-learning ability is of great significance in complex scenarios where it is difficult to design action strategies manually.

[0003] Reinforcement learning is used to solve sequential decision-making problems. The learner (that is, the agent) tries to make an action according to the action strategy combined with the current environment state (the initial strategy is usually a random...


Application Information


IPC(8): G06N3/08, G06F30/20

CPC: G06N3/08

Inventor: 唐玉华, 黄达, 杨绍武, 徐利洋, 蔡中轩, 李明龙, 粱震

Owner: NAT UNIV OF DEFENSE TECH
