A manipulator model learning method, device, electronic equipment and storage medium

A learning method for manipulator technology, applied in manipulators, program-controlled manipulators, manufacturing computing systems, and related fields. It addresses problems such as reinforcement learning models failing to accurately imitate and complete expert demonstrations, deviations in expert behavior strategies, and the lack of an existing technical solution.

Active Publication Date: 2022-06-03
JIHUA LAB

AI Technical Summary

Problems solved by technology

[0002] At present, in production applications, reinforcement learning can make a manipulator's autonomous interaction more versatile and let it complete complex tasks efficiently. Existing reinforcement learning models generally accelerate convergence by learning an optimal expert behavior strategy from demonstration data. However, the final model can easily fail to imitate the expert's demonstrated behavior accurately, either because the expert behavior strategy deviates or because the model learns only at the lowest imitation cost.
[0003] There is currently no effective technical solution for the above problems.

Examples

Embodiment 1

[0111] To describe the learning process of the manipulator model more clearly, a more detailed embodiment of the manipulator model learning method provided by this application is given below. The learning method includes the following steps:

[0112] 1. Collect original image data from a variety of real production and daily-life scenes, label the objects and behaviors in the original images, and build an image database; alternatively, use ImageNet, the largest database for image recognition in the world, directly as the image database. Pre-train on the image database to obtain a feature extractor;

[0113] 2. Obtain image information of the current job scene through the vision sensor on the manipulator body, and use the pre-trained feature extractor to learn the image data features of the manipulator operation from that image information, the learned image d...

Abstract

The present invention relates to the technical field of intelligent manipulators, and specifically discloses a manipulator model learning method, device, electronic equipment, and storage medium. The learning method includes the following steps: acquiring multiple sets of expert demonstration data for the same task for the manipulator model to learn; generating, based on the expert demonstration data, an expert policy associated with a learning cost function, where the learning cost function is built from the imitation cost required to imitate the expert demonstration data and from the density of the expert demonstration data; minimizing the learning cost function to obtain the optimal expert policy; and training the manipulator model according to the optimal expert policy. The optimal expert policy obtained in this way pushes the manipulator model toward the region where the expert demonstration data is densely distributed, so that the manipulator model can accurately imitate and complete the expert demonstration behavior at the lowest possible imitation cost.
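The idea of combining an imitation cost with the density of the demonstration data can be illustrated with a small sketch. The excerpt does not give the patent's actual cost function, so everything below is an assumption made for illustration: the imitation cost is taken to be the squared distance to the nearest demonstration, the density is a Gaussian kernel density estimate, and the two are combined as `imitation_cost - density_weight * log(density)`. The function names, `density_weight`, and `bandwidth` are hypothetical, and a "policy" is reduced to choosing one candidate point.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kde_density(point, demos, bandwidth=0.5):
    """Gaussian kernel density estimate of the demonstrations at `point`."""
    dim = len(point)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = sum(
        math.exp(-squared_distance(point, d) / (2.0 * bandwidth ** 2))
        for d in demos
    )
    return total / (len(demos) * norm)


def imitation_cost(point, demos):
    """Assumed imitation cost: squared distance to the nearest demonstration."""
    return min(squared_distance(point, d) for d in demos)


def learning_cost(point, demos, density_weight=1.0, bandwidth=0.5):
    """Assumed combined cost: low when close to a demo AND in a dense region."""
    density = kde_density(point, demos, bandwidth)
    # Small epsilon guards against log(0) far away from every demonstration.
    return imitation_cost(point, demos) - density_weight * math.log(density + 1e-12)


def best_candidate(candidates, demos, **kwargs):
    """Pick the candidate that minimizes the combined learning cost."""
    return min(candidates, key=lambda c: learning_cost(c, demos, **kwargs))


# Four demonstrations cluster near the origin; one deviating demo sits at (3, 3).
demos = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (3.0, 3.0)]
candidates = [(0.05, 0.05), (3.0, 3.0)]

imitation_only = min(candidates, key=lambda c: imitation_cost(c, demos))
density_aware = best_candidate(candidates, demos)
```

In this toy setup, minimizing the imitation cost alone selects the candidate that coincides with the deviating demonstration, while the density-aware cost selects the candidate inside the cluster. That mirrors the abstract's claim only qualitatively; the weighting and the density estimator would need to follow the patent's full description.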

Description

Technical field

[0001] The present application relates to the technical field of intelligent manipulators, and in particular to a manipulator model learning method, device, electronic device, and storage medium.

Background

[0002] At present, in production applications, reinforcement learning can make a manipulator's autonomous interaction more versatile and let it complete complex tasks efficiently. Existing reinforcement learning models generally speed up convergence by learning an optimal expert behavior strategy from demonstration data. However, the final model can easily fail to accurately imitate and complete the expert demonstration behavior, either because the expert behavior strategy deviates or because the model learns only at the lowest imitation cost.

[0003] There is currently no effective technical solution for the above-mentioned problems.

Summary of the invention

[0004] The purpose of the present app...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): B25J9/16
CPC: Y02P90/30
Inventor: 焦家辉, 张晟东, 王济宇, 李志建, 蔡维嘉, 李腾, 张立华, 李伟
Owner: JIHUA LAB