Robot Reality Transfer Method Based on Reinforcement Learning and Residual Modeling

A reinforcement learning and robotics technique, applied to reality transfer, robot simulation, and the transfer of robots from a simulated environment to the real environment. It addresses the difficulty of collecting high-performance expert data, poor execution, and low sampling efficiency, and achieves accelerated convergence, strong generalization ability, and broad application prospects.

Active Publication Date: 2022-06-21

NANJING UNIV

5 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Imitation learning and reinforcement learning in a simulator each have limitations: imitation learning requires enough expert demonstration data, and high-performance expert data is difficult to collect in reality; training reinforcement learning algorithms in a simulator requires building a high-fidelity simulator that fully reproduces the real environment, which is laborious and expensive.

Another serious and practical challenge is that there are often discrepancies between simulation and the real world, so strategies that work well in simulation perform poorly in the real world.

[0005] When reinforcement learning is used to train robots for actual scenes and the robot samples by interacting directly with the real environment, two serious problems appear: low sampling efficiency and safety issues.

Embodiment Construction

[0046] The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments only illustrate the present invention and do not limit its scope. Modifications of these embodiments all fall within the scope defined by the appended claims of this application.

[0047] Behavior strategies obtained by training a robot with reinforcement learning in a simulator often perform poorly in the actual scene, so robots trained this way cannot be deployed in practice. To address this problem, a robot reality transfer system and method based on reinforcement learning and residual modeling is proposed. The robot reality transfer system based on reinforcement learning and residual modeling includes an environment simulator building module based on machine learning and reinforcement learning, a residual model building...

Abstract

The invention discloses a robot reality transfer method based on reinforcement learning and residual modeling. Given only a small amount of mixed offline data and a biased simulator, the offline data and the biased simulator are used to construct an optimal simulator. On this optimal simulator, the robot's behavior strategy is trained with a reinforcement learning algorithm so that the robot can adapt to the real environment autonomously; this adaptability lets the behavior strategy extend more effectively to the real environment. The invention proposes a robot training framework and provides a method for reducing the deviation incurred when robot control moves from the simulator environment to the real scene. The method models the residual between the simulator environment state space and the real environment state space from offline data, and uses the learned residual model to correct the original simulator. Finally, the strategy learned by the robot in the corrected simulator is transferred to the real environment.
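The residual-correction idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the 1-D dynamics, the function names (`biased_sim_step`, `real_step`, `fit_residual`, `corrected_sim_step`), and the use of a linear least-squares model for the residual. The patent text shown here does not state which model class it uses.

```python
import numpy as np

def biased_sim_step(state, action):
    """Hypothetical biased simulator: 1-D dynamics with a wrong action gain."""
    return state + 0.8 * action

def real_step(state, action):
    """Stand-in for the real environment; used only to generate offline data."""
    return state + 1.0 * action - 0.05 * state

def fit_residual(states, actions, next_states):
    """Least-squares fit of (real next state - simulated next state)."""
    residuals = next_states - biased_sim_step(states, actions)
    features = np.column_stack([states, actions, np.ones_like(states)])
    weights, *_ = np.linalg.lstsq(features, residuals, rcond=None)
    return weights

def corrected_sim_step(state, action, weights):
    """Biased simulator output plus the learned residual correction."""
    features = np.array([state, action, 1.0])
    return biased_sim_step(state, action) + features @ weights

# A small offline dataset of (state, action, real next state) transitions.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=200)
actions = rng.uniform(-1.0, 1.0, size=200)
next_states = real_step(states, actions)

weights = fit_residual(states, actions, next_states)

# Compare one-step prediction error before and after the correction.
s, a = 0.5, -0.3
err_biased = abs(biased_sim_step(s, a) - real_step(s, a))
err_corrected = abs(corrected_sim_step(s, a, weights) - real_step(s, a))
print(err_biased, err_corrected)
```

In this toy setting the residual is exactly linear in the state and action, so the linear fit recovers it; with real robot data a more expressive model would likely be needed. A behavior strategy would then be trained against `corrected_sim_step` rather than `biased_sim_step`.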

Description

Technical field

[0001] The invention relates to a method for transferring a robot from a simulated environment to a real environment, in particular to a method and system for robot simulation-to-reality transfer based on reinforcement learning and residual modeling, and belongs to the technical field of robot control.

Background

[0002] In recent years, with the wide application of deep learning in many fields, reinforcement learning has achieved remarkable success in simulation environments. Unlike supervised, unsupervised, and semi-supervised learning, reinforcement learning mainly addresses sequential decision-making: the agent continually interacts with the environment and updates its behavior strategy to obtain the maximum cumulative reward. Reinforcement learning environments are generally modeled as Markov decision processes (MDPs); the Markov decision model is defined as <S, A, R, ...
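The interaction loop described in the background can be sketched as follows. The environment `ToyEnv`, its reward values, and the two fixed policies are hypothetical and chosen only to show how a behavior strategy accumulates reward over one episode; they do not come from the patent.

```python
class ToyEnv:
    """Hypothetical 1-D environment: start at 0, the goal is position 3."""

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action is -1 (move left) or +1 (move right)
        self.pos += action
        done = self.pos == 3
        reward = 1.0 if done else -0.1  # small step cost, bonus at the goal
        return self.pos, reward, done

def rollout(env, policy, max_steps=20):
    """Run one episode and return the cumulative (undiscounted) reward."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        state, reward, done = env.step(policy(state))
        total += reward
        if done:
            break
    return total

def always_right(state):
    return 1

def always_left(state):
    return -1

ret_right = rollout(ToyEnv(), always_right)
ret_left = rollout(ToyEnv(), always_left)
print(ret_right, ret_left)
```

A reinforcement learning algorithm would search over policies to increase this return; here the two hand-written policies only show that different strategies yield different cumulative rewards.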

Application Information

Patent Type & Authority: Patents (China)

IPC(8): B25J9/16, G06N3/08, G06N3/04, G06N3/00

Inventor: 俞扬, 刘驭壬, 詹德川, 周志华, 魏宏伟

Owner: NANJING UNIV
