An acceleration method for deep reinforcement learning of simulated robots

A technology for simulated robots and reinforcement learning, applied to accelerating deep reinforcement learning of robots in a simulation environment. It addresses the high computational cost and long training time that limit the speed of deep reinforcement learning for robots, and it is easy to deploy.

Active Publication Date: 2022-03-15

NAT UNIV OF DEFENSE TECH

5 Cites, 0 Cited by


Problems solved by technology

Simulation-based learning faces a trade-off. On the one hand, the simulation environment should be as realistic as possible so that the learning results transfer more easily to reality; on the other hand, the more realistic the simulation environment, the greater its computational cost. In terms of training time, the evolution of the simulation environment has become a bottleneck that limits the speed of deep reinforcement learning for robots.



Embodiment Construction

[0037] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0038] A method for accelerating deep reinforcement learning of simulated robots, comprising the following steps:

[0039] Step 1: Select one node as the learning node and the other nodes as environment nodes, and perform the initialization operation. The structure of the whole system is shown in Figure 1. The number of environment nodes to start is determined by the parallelization scale the application requires. This step includes the following sub-steps:

[0040] 1.1 In the learning node, initialize the deep reinforcement learning agent to be accelerated and its agent environment;

[0041] 1.2 Initialize an environment node for each robot simulator instance; the environment node maintains the communication details with the robot simulator instance and provides a unified message interface to communicate with t...
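The role of an environment node in step 1.2 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: `ToySimulator` is a hypothetical stand-in for a robot simulator instance, and the dictionary message format (`"reset"` and `"step"` messages answered with `state`, `reward`, and `done`) is one possible shape for a unified message interface.

```python
from dataclasses import dataclass


class ToySimulator:
    """Hypothetical stand-in for a robot simulator instance (a 1-D robot)."""

    def __init__(self, goal: float = 5.0):
        self.goal = goal
        self.position = 0.0

    def restart(self) -> None:
        self.position = 0.0

    def apply_velocity(self, velocity: float, dt: float = 1.0) -> None:
        self.position += velocity * dt


@dataclass
class EnvironmentNode:
    """Hides simulator-specific calls behind one message format (assumed)."""

    simulator: ToySimulator

    def handle(self, message: dict) -> dict:
        # Translate a generic message into simulator-specific calls.
        if message["type"] == "reset":
            self.simulator.restart()
        elif message["type"] == "step":
            self.simulator.apply_velocity(message["action"])
        else:
            raise ValueError(f"unknown message type: {message['type']}")
        # Reply in the same shape regardless of which simulator is wrapped.
        distance = abs(self.simulator.goal - self.simulator.position)
        return {
            "state": self.simulator.position,
            "reward": -distance,
            "done": distance < 0.5,
        }


node = EnvironmentNode(ToySimulator(goal=2.0))
first = node.handle({"type": "reset"})
second = node.handle({"type": "step", "action": 2.0})
```

Because the learning side only ever sees the message dictionaries, swapping `ToySimulator` for a different simulator would, under these assumptions, require changes only inside `EnvironmentNode.handle`.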


Abstract

The invention belongs to the field of robots and discloses an acceleration method for deep reinforcement learning of a simulated robot. Its purpose is to accelerate the learning process and thereby reduce the time cost of research, debugging, and deployment of robot deep reinforcement learning. The technical scheme is as follows: one node is selected as the learning node and the other nodes as environment nodes; each environment node handles the interaction details with one robot simulator instance and provides a unified environment-interaction message interface; the learning node interacts with each environment node through the message interface in the form of frame simulation and collects learning data from multiple environments at the same time, thereby accelerating reinforcement learning. Through the abstraction of environment nodes, the invention decouples learning algorithm development from simulation interaction details while adapting to various robot simulators. Message communication allows each environment node and simulator instance to be deployed in a distributed computing environment, which makes the method easy to deploy and scalable.
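The data-collection scheme described in the abstract can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: threads and `queue.Queue` objects stand in for distributed nodes and their message channels, the toy 1-D state replaces a real robot simulator, and "frame simulation" is interpreted here as lock-step operation in which the learning node sends one action to every environment node per frame and waits for all replies. The names `environment_node`, `collect`, and `policy` are hypothetical.

```python
import queue
import threading


def environment_node(node_id, inbox, outbox):
    """Hypothetical environment node: serves messages until told to stop."""
    position = 0.0  # toy 1-D simulator state
    while True:
        message = inbox.get()
        if message["type"] == "stop":
            break
        if message["type"] == "reset":
            position = 0.0
        elif message["type"] == "step":
            position += message["action"]
        outbox.put({"node": node_id, "state": position, "reward": -abs(position)})


def collect(num_nodes, num_frames, policy):
    """Learning node: one action per node per frame, then wait for all replies."""
    inboxes = [queue.Queue() for _ in range(num_nodes)]
    outbox = queue.Queue()
    workers = [
        threading.Thread(target=environment_node, args=(i, inboxes[i], outbox))
        for i in range(num_nodes)
    ]
    for worker in workers:
        worker.start()

    # Reset every environment and record its initial state.
    for inbox in inboxes:
        inbox.put({"type": "reset"})
    states = {}
    for _ in range(num_nodes):
        reply = outbox.get()
        states[reply["node"]] = reply["state"]

    transitions = []
    for _ in range(num_frames):
        # Send this frame's action to every node before reading any reply,
        # so all environments advance concurrently.
        actions = {i: policy(states[i]) for i in range(num_nodes)}
        for i, inbox in enumerate(inboxes):
            inbox.put({"type": "step", "action": actions[i]})
        for _ in range(num_nodes):
            reply = outbox.get()
            i = reply["node"]
            transitions.append((i, states[i], actions[i], reply["reward"], reply["state"]))
            states[i] = reply["state"]

    for inbox in inboxes:
        inbox.put({"type": "stop"})
    for worker in workers:
        worker.join()
    return transitions


batch = collect(num_nodes=4, num_frames=3, policy=lambda state: 1.0 - state)
```

Each frame yields one transition per environment node, so the amount of learning data gathered per frame grows with the number of nodes. In a distributed deployment the in-process queues would be replaced by some network messaging layer; which one is used is outside the scope of this sketch.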

Description

Technical field

[0001] The invention belongs to the field of robots and relates to an acceleration method for deep reinforcement learning of robots in a simulation environment. It can be applied to robot control tasks such as obstacle avoidance, navigation, formation, and multi-robot collaboration of intelligent robots.

Background technique

[0002] Reinforcement learning is one of the important technologies used in the field of robotics. Through reinforcement learning, a robot can learn, by continual trial, a set of action strategies for completing tasks independently. This self-learning ability is of great significance in complex scenarios where action strategies are difficult to design manually.

[0003] Reinforcement learning is used to solve sequential decision-making problems. The learner (that is, the agent) tries an action according to its action strategy combined with the current environment state (the initial strategy is usually a random...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06N3/08, G06F30/20

CPC: G06N3/08

Inventor: 唐玉华, 黄达, 杨绍武, 徐利洋, 蔡中轩, 李明龙, 粱震

Owner: NAT UNIV OF DEFENSE TECH
