Robot reality migration system and method based on reinforcement learning and residual error modeling

A reinforcement learning and transfer technology, applied in neural learning methods, biological models, instruments, etc. It addresses the difficulty of collecting high-performance expert data, poor execution, and low sampling efficiency, with the effects of faster convergence, strong generalization ability, and wide application prospects.

Active Publication Date: 2022-04-08
NANJING UNIV

AI Technical Summary

Problems solved by technology

These two methods have their own limitations. Imitation learning requires collecting enough expert demonstration data, and high-performance expert data is difficult to collect in reality. Training reinforcement learning algorithms in simulators requires building a high-fidelity simulator that fully reproduces the real environment, which is laborious and expensive.
Another serious and practical challenge is that there are often discrepancies between simulation and the real world, leading to strategies that work well in simulation but perform poorly in the real world.
[0005] When reinforcement learning is used to train robots for real-world scenarios and the robot samples by interacting directly with the real environment, two serious problems appear: low sampling efficiency and safety issues.

Examples

Embodiment Construction

[0046] The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments are used only to illustrate the present invention and are not intended to limit its scope. After reading the present invention, modifications by those skilled in the art to various equivalent forms of the present invention all fall within the scope defined by the appended claims of the present application.

[0047] A behavior policy obtained by training a robot with reinforcement learning in a simulator often performs poorly in the actual scene, which prevents robots trained with reinforcement learning from being applied in practice. To address this problem, a robot reality transfer system and method based on reinforcement learning and residual modeling is proposed. The robot reality transfer system based on reinforcement learning and residual modeling includes environment simula...

Abstract

The invention discloses a robot reality migration system and method based on reinforcement learning and residual modeling. Given only a small amount of mixed offline data and a biased simulator, the method uses both to construct an optimal simulator. A robot behavior strategy is then trained with a reinforcement learning algorithm on the optimal simulator, enabling the robot to adapt to the real environment autonomously. This adaptability allows the behavior strategy to generalize to the real environment more effectively. The invention provides a robot training framework and an innovative method for reducing the deviation that arises when robot control is transferred from a simulator environment to real scenes. In the method, the residual between the simulator environment state space and the real environment state space is modeled from offline data, and the original simulator is corrected with the learned residual model. Finally, the strategy learned by the robot in the corrected simulator is migrated to the real environment.
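
The residual-correction idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say what form the residual model takes, so ordinary linear least squares is swapped in here purely for illustration. The 1-D dynamics, the function names (`biased_sim_step`, `real_step`, `fit_residual`, `corrected_sim_step`) and all numeric constants are hypothetical.

```python
import numpy as np

def biased_sim_step(states, actions):
    # Hypothetical simulator that deviates from reality (wrong action gain).
    return states + 0.8 * actions

def real_step(states, actions):
    # Stand-in for the real environment; in practice it is only observed
    # through logged offline transitions, never called like this.
    return 0.95 * states + 1.0 * actions + 0.1

def features(states, actions):
    # Assumed linear feature map [state, action, bias] for the residual model.
    return np.column_stack([states, actions, np.ones_like(states)])

def fit_residual(states, actions, next_states):
    # Regress the gap between observed next states and simulator predictions.
    targets = next_states - biased_sim_step(states, actions)
    weights, *_ = np.linalg.lstsq(features(states, actions), targets, rcond=None)
    return weights

def corrected_sim_step(states, actions, weights):
    # Corrected simulator = original simulator + learned residual.
    return biased_sim_step(states, actions) + features(states, actions) @ weights

rng = np.random.default_rng(0)

# A small offline dataset of (state, action, next_state) transitions.
states = rng.uniform(-1.0, 1.0, 50)
actions = rng.uniform(-1.0, 1.0, 50)
next_states = real_step(states, actions) + rng.normal(0.0, 0.01, 50)
weights = fit_residual(states, actions, next_states)

# Compare one-step prediction error before and after correction.
test_s = rng.uniform(-1.0, 1.0, 200)
test_a = rng.uniform(-1.0, 1.0, 200)
truth = real_step(test_s, test_a)
raw_err = np.mean(np.abs(biased_sim_step(test_s, test_a) - truth))
corrected_err = np.mean(np.abs(corrected_sim_step(test_s, test_a, weights) - truth))
print(f"raw simulator error: {raw_err:.4f}, corrected error: {corrected_err:.4f}")
```

In this toy setting a policy would then be trained against `corrected_sim_step` rather than `biased_sim_step`. A real robot would have a higher-dimensional state and likely a nonlinear residual model.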

Description

Technical field

[0001] The invention relates to a method for migrating a robot from a simulation environment to a real environment, in particular to a robot simulation-to-reality migration method and system based on reinforcement learning and residual modeling, and belongs to the technical field of robot control.

Background

[0002] In recent years, with the wide application of deep learning in many fields, reinforcement learning has achieved remarkable success in simulation environments. Unlike supervised, unsupervised, and semi-supervised learning, reinforcement learning mainly solves sequential decision-making problems: through continuous interaction with the environment, the behavior strategy is repeatedly updated to obtain the maximum cumulative reward. Reinforcement learning environments are generally modeled as Markov decision processes (MDPs); the Markov decision model is defined as < S , A , R , ...
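
The interaction loop described in the background can be sketched as follows. This is a toy illustration under stated assumptions: `ToyEnv`, its reward values, and the two fixed policies are hypothetical, and no policy update is performed. The sketch only shows how a behavior strategy accumulates reward by interacting with an environment step by step.

```python
class ToyEnv:
    """Hypothetical 1-D environment: the goal is to reach position 3."""

    def __init__(self):
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is +1 (move right) or -1 (move left)
        self.position += action
        done = self.position == 3
        reward = 1.0 if done else -0.1  # small penalty for every non-goal step
        return self.position, reward, done


def run_episode(env, policy, max_steps=20):
    # Roll out one episode and return the (undiscounted) cumulative reward.
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


def always_right(state):
    return 1


def always_left(state):
    return -1


return_right = run_episode(ToyEnv(), always_right)
return_left = run_episode(ToyEnv(), always_left)
print(return_right, return_left)
```

A reinforcement learning algorithm would use returns like these to prefer the policy with the higher cumulative reward, here `always_right`.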

Application Information

Patent Type & Authority: Applications (China)
IPC(8): B25J9/16, G06N3/08, G06N3/04, G06N3/00
Inventor: 俞扬, 刘驭壬, 詹德川, 周志华, 魏宏伟
Owner: NANJING UNIV