
Robot demonstration teaching method based on meta-imitation learning

A robot demonstration teaching method based on meta-imitation learning, applied in the field of robot teaching. It addresses problems of existing approaches such as inefficiency, time-consuming data collection, and the difficulty of designing reward functions.

Pending Publication Date: 2020-11-24
GUANGZHOU INST OF ADVANCED TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

Behavioral cloning is supervised learning of a mapping from observations to actions. It relies on expert data and requires large data sets, which shifts repetitive labor onto the experts and makes the method inefficient and time-consuming.
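As a minimal sketch of what "supervised learning from observation to action" means, the following fits a one-dimensional linear policy to expert (state, action) pairs by least squares. The function name `fit_linear_policy`, the linear policy form, and the data are assumptions made for illustration and do not come from the patent.

```python
def fit_linear_policy(demos):
    """Least-squares fit of a = w*s + b to expert (state, action) pairs."""
    n = len(demos)
    mean_s = sum(s for s, _ in demos) / n
    mean_a = sum(a for _, a in demos) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in demos)
    var = sum((s - mean_s) ** 2 for s, _ in demos)
    w = cov / var
    b = mean_a - w * mean_s
    return w, b


# Made-up expert data following a = 2*s + 1 (illustration only).
expert_pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = fit_linear_policy(expert_pairs)


def policy(s):
    """Cloned policy: predicts the expert's action for state s."""
    return w * s + b
```

Even in this toy setting the policy is only as good as the coverage of the expert pairs, which is the data dependence the paragraph above describes.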
Inverse reinforcement learning uses expert demonstrations to learn a reward function by classification or regression, evaluates the quality of the current state with that reward function, and accumulates the rewards to learn the optimal policy. In many cases, however, the reward function is difficult to design, and additional experience is required to optimize it. Optimizing the reward function by trial and error is time-consuming and laborious, and the method is difficult to apply to multi-stage decision-making.
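The "evaluate each state and accumulate" step can be illustrated with a small sketch. It assumes a hypothetical linear reward over two state features with hand-set weights; it shows only reward evaluation and accumulation, not the inverse reinforcement learning step that would learn the weights from demonstrations.

```python
def reward(state, weights):
    """Hypothetical linear reward: weighted sum of state features."""
    return sum(w * f for w, f in zip(weights, state))


def trajectory_return(trajectory, weights):
    """Accumulate per-state rewards along a trajectory (undiscounted)."""
    return sum(reward(s, weights) for s in trajectory)


# Hand-set weights (assumption): reward progress, penalize deviation.
weights = [1.0, -0.5]

# Two made-up candidate trajectories of (progress, deviation) states.
candidates = {
    "direct": [(1.0, 0.0), (1.0, 0.2)],
    "detour": [(0.5, 1.0), (0.5, 1.0), (1.0, 0.0)],
}

# The preferred trajectory is the one with the highest accumulated reward.
best = max(candidates, key=lambda name: trajectory_return(candidates[name], weights))
```

Changing the hand-set weights changes which trajectory wins, which is why choosing them by trial and error is laborious.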



Examples


Embodiment 1

[0040] Figure 1 shows the overall flow diagram of the present invention. As shown in Figure 1, the present invention provides a robot demonstration teaching method based on meta-imitation learning, comprising the following steps:

[0041] Step S1: Obtain the robot demonstration teaching task set p(T);

[0042] Step S2: Construct the network structure model and obtain the adaptive target loss function L_ψ;

[0043] Step S3: In the meta-training stage, use Algorithm 1 to learn and optimize the adaptive target loss function L_ψ, obtaining the policy parameters θ and ψ;

[0044] Step S4: In the meta-testing stage, use Algorithm 2 to learn from the expert-demonstrated trajectory τ^h and obtain the learned policy π_φ;

[0045] Step S5: Taking the expert-demonstrated trajectory as input, combined with the learned policy π_φ, use the network structure model to generate a robot imitation trajectory, which is combined with the robot state information (s_1, s_2, ..., s_T...
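The excerpt does not reproduce Algorithm 1 or Algorithm 2, so the following is only a toy sketch of the general pattern the steps suggest: a learned loss L_ψ adapts initial parameters θ into φ from one expert demonstration (inner step), and θ and ψ are meta-trained so that the adapted policy imitates well. The scalar policy, the task family indexed by a gain, the constant `HUMAN_SCALE`, and the use of finite differences in place of backpropagation are all assumptions made for illustration.

```python
# Toy sketch of meta-imitation learning with a learned adaptation loss.
# Every name, the scalar policy, and the task family are hypothetical.

STATES = [0.5, 1.0, 1.5]   # robot states visited in every demonstration
HUMAN_SCALE = 2.0          # assumed gap: expert motion is 2x the robot action
ALPHA = 0.2                # inner-loop (adaptation) step size


def expert_demo(gain):
    """Expert trajectory: (state, observed expert motion), no robot actions."""
    return [(s, HUMAN_SCALE * gain * s) for s in STATES]


def robot_demo(gain):
    """Robot trajectory with true actions, used only during meta-training."""
    return [(s, gain * s) for s in STATES]


def adapt(theta, psi, demo):
    """One gradient step on the learned loss L_psi = mean((theta*s - psi*y)^2)."""
    grad = sum(2.0 * (theta * s - psi * y) * s for s, y in demo) / len(demo)
    return theta - ALPHA * grad  # adapted policy parameter phi


def bc_loss(phi, demo):
    """Imitation loss of the policy a = phi * s against robot actions."""
    return sum((phi * s - a) ** 2 for s, a in demo) / len(demo)


def meta_loss(theta, psi, gains):
    """Average post-adaptation imitation loss over the training tasks."""
    losses = [bc_loss(adapt(theta, psi, expert_demo(g)), robot_demo(g)) for g in gains]
    return sum(losses) / len(losses)


def meta_train(gains, steps=2000, beta=0.2, eps=1e-5):
    """Fit theta and psi by gradient descent on meta_loss.

    Central finite differences stand in for backpropagation through adapt().
    """
    theta, psi = 0.5, 0.1
    for _ in range(steps):
        g_theta = (meta_loss(theta + eps, psi, gains)
                   - meta_loss(theta - eps, psi, gains)) / (2 * eps)
        g_psi = (meta_loss(theta, psi + eps, gains)
                 - meta_loss(theta, psi - eps, gains)) / (2 * eps)
        theta, psi = theta - beta * g_theta, psi - beta * g_psi
    return theta, psi


def meta_test(theta, psi, demo):
    """Adapt to one new expert demonstration and return the policy pi_phi."""
    phi = adapt(theta, psi, demo)
    return lambda s: phi * s


theta, psi = meta_train(gains=[0.5, 1.0, 1.5, 2.0])
policy = meta_test(theta, psi, expert_demo(3.0))  # unseen task, one demonstration
imitation = [policy(s) for s in STATES]           # robot imitation trajectory
```

In this toy family a single adaptation step can recover the unseen gain because the learned loss absorbs the fixed scale gap between expert motion and robot action; a real system would replace the scalar policy and hand-built loss with the network structure model of step S2.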


Abstract

The invention discloses a robot demonstration teaching method based on meta-imitation learning and relates to the technical field of machine learning. The method comprises the following steps: obtaining a robot demonstration teaching task set; constructing a network structure model and obtaining an adaptive target loss function; in the meta-training stage, using Algorithm 1 to learn and optimize the loss function together with its initialization values and parameters; in the meta-testing stage, using Algorithm 2 to learn from the trajectory demonstrated by the expert and obtain a learned policy; and, taking the expert demonstration trajectory as input and combining it with the learned policy, generating a robot imitation trajectory with the network structure model and mapping that trajectory to robot actions in combination with the robot state information. The method generalizes quickly to new scenes from a small number of expert demonstrations and requires no task-specific engineering. The robot learns a task-independent policy from the expert demonstration and generates a trajectory from it, so that teaching is fast and needs only a single demonstration.

Description

Technical field

[0001] The invention relates to the technical field of machine learning, in particular to a robot demonstration teaching method based on meta-imitation learning.

Background technique

[0002] With the continuous expansion of the market, and especially with the rapid development of the 3C industry, industrial robots have been widely used in many fields and can complete a series of complex tasks such as grasping, assembly, welding, laser cutting, and spraying. The traditional robot teaching method is complicated to operate and takes a long time. The operator must also master the robot model and other related knowledge, which places high demands on the operator. In the existing technology, in order to simplify the teaching operation of the robot, the operator generally controls the joint movement of the robot manually through the teaching pendant, and uses manual dragging to make the end of the robot move according to the required trajectory, and the robot...


Application Information

IPC(8): G05B13/04
CPC: G05B13/042, B25J9/1664, G05B19/42, G06N3/08, G06N3/006, G06N3/048, G06N3/045
Inventor: 雷渠江, 李秀昊, 徐杰, 桂光超, 梁波, 潘艺芃, 刘纪, 王雨禾, 王卫军, 韩彰秀
Owner: GUANGZHOU INST OF ADVANCED TECH CHINESE ACAD OF SCI