Efficient mechanical arm grabbing deep reinforcement learning reward training method and system

A reinforcement learning training technology applied in the field of machine learning. It addresses problems such as poor movement coherence and lack of coordination, improves movement coherence, and avoids complex kinematic calculations.

Active Publication Date: 2021-06-18
NORTHWEST UNIV


Problems solved by technology

[0006] The purpose of the present invention is to provide an efficient deep reinforcement learning reward training method and system for manipulator grasping.



Examples


Embodiment 1

[0041] In step S1, after the depth camera recognizes the end effector of the mechanical arm and the target object, it returns to the computer the coordinates of the end effector (x_h, y_h) and its depth d_h, and the coordinates of the target object (x_o, y_o) and its depth d_o. The Euclidean distance d between the end effector of the manipulator and the target object is then calculated as follows:

[0042] d = √((x_h − x_o)² + (y_h − y_o)² + (d_h − d_o)²) (1)
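As an illustrative sketch of this calculation (the function name, coordinate values, and units below are hypothetical, not taken from the patent):

```python
import math

def end_effector_target_distance(x_h, y_h, d_h, x_o, y_o, d_o):
    """Euclidean distance between the end effector at (x_h, y_h, d_h)
    and the target object at (x_o, y_o, d_o), as returned by the depth camera."""
    return math.sqrt((x_h - x_o) ** 2 + (y_h - y_o) ** 2 + (d_h - d_o) ** 2)

# Hypothetical example: end effector at (0.30, 0.25, 0.60), target at (0.45, 0.20, 0.55)
d = end_effector_target_distance(0.30, 0.25, 0.60, 0.45, 0.20, 0.55)
print(f"d = {d:.3f}")  # d = 0.166
```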

Embodiment 2

[0044] Step S2 initializes the reward r1 for the moving distance of the end effector of the mechanical arm relative to the target object, the step-count reward r2, the reward r3 for the sum of the rotation angles of the steering gears of all degrees of freedom of the mechanical arm, and the grasp-success reward r4, setting all of these values to 0. The total reward R is then:

[0045] R = αr1 + βr2 + γr3 + δr4 (2).
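A minimal sketch of this weighted combination follows; the default weight values α, β, γ, δ below are illustrative placeholders, not values from the patent:

```python
def total_reward(r1, r2, r3, r4, alpha=1.0, beta=0.01, gamma=0.001, delta=10.0):
    """Total reward R = alpha*r1 + beta*r2 + gamma*r3 + delta*r4, as in Eq. (2).
    The default weights are illustrative placeholders only."""
    return alpha * r1 + beta * r2 + gamma * r3 + delta * r4

# At initialisation (step S2) the four component rewards are all zero, so R = 0.
print(total_reward(0.0, 0.0, 0.0, 0.0))  # 0.0
```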

Embodiment 3

[0047] The reward in state S in step S5 is calculated as follows:

[0048] S51: Calculate the moving distance of the end effector of the manipulator relative to the target object, Δd = d′ − d; the moving distance reward r1 = −Δd;

[0049] S52: Step count s = s + 1; step-count reward r2 = −s;

[0050] S53: Calculate the sum of the rotation angles of the steering gears of all degrees of freedom of the manipulator; the rotation angle reward r3 is the negative of this sum;
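The per-step computation of S51 to S53 could be sketched as below; the grasp-success reward r4 named in the abstract is included with an assumed 0/1 value, and the sign of r3 follows the same negative-penalty pattern as r1 and r2 (an assumption, since the original formula is not shown in this excerpt):

```python
def step_component_rewards(d_prev, d_curr, step_count, joint_angle_deltas, grabbed):
    """Component rewards for one step, following S51-S53; the grasp-success
    term r4 uses an assumed 0/1 value (not specified in this excerpt)."""
    delta_d = d_curr - d_prev                        # S51: relative movement d' - d
    r1 = -delta_d                                    # moving closer gives a positive reward
    step_count += 1                                  # S52: s = s + 1
    r2 = -step_count                                 # more steps -> larger penalty
    theta = sum(abs(a) for a in joint_angle_deltas)  # S53: total steering-gear rotation
    r3 = -theta                                      # less joint motion -> smaller penalty
    r4 = 1.0 if grabbed else 0.0                     # assumed grasp-success bonus
    return r1, r2, r3, r4, step_count

# Hypothetical step: the end effector moved from 0.40 to 0.35 away from the target.
print(step_component_rewards(0.40, 0.35, 0, [0.1, -0.05, 0.02], False))
# ≈ (0.05, -1, -0.17, 0.0, 1)
```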



Abstract

The invention discloses an efficient deep reinforcement learning reward training method and system for mechanical arm grasping, and relates to the technical field of machine learning. A depth camera is used to identify the target object and the end of the mechanical arm, the straight-line distance between the center of the mechanical arm's end effector and the center of the target object is calculated, and this distance is returned to a computer as the basis for judging whether the mechanical arm can grasp the target object. Each time the mechanical arm's end effector acts, the computer takes the weighted sum of the moving distance of the end effector relative to the object, the number of steps the mechanical arm has moved, the sum of the rotation angles of the steering gears of all degrees of freedom of the mechanical arm, and whether the target object is successfully grasped, as the reward mechanism of a DDPG (deep deterministic policy gradient) network. The end-to-end training process is completed using DDPG. The problems of poor action coherence and lack of coordination in existing mechanical arm control are solved.
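As a rough sketch of how such per-step rewards would feed a DDPG learner, the loop below stores (state, action, reward, next state, done) transitions in a replay buffer; the random-action policy, the toy environment dynamics, the reward weights, and the grasp threshold are stand-ins for the actual DDPG actor and robot, not the patent's implementation:

```python
import random
from collections import deque

def weighted_reward(r1, r2, r3, r4, w=(1.0, 0.01, 0.001, 10.0)):
    """Weighted sum of the four component rewards; the weights are placeholders."""
    return w[0] * r1 + w[1] * r2 + w[2] * r3 + w[3] * r4

replay_buffer = deque(maxlen=100_000)  # transitions later sampled by the DDPG networks
state = [0.0] * 6                      # toy 6-DOF joint state
d_prev, steps = 0.40, 0                # assumed initial end-effector/target distance

for t in range(100):
    action = [random.uniform(-0.05, 0.05) for _ in state]  # stand-in for the DDPG actor
    next_state = [s + a for s, a in zip(state, action)]
    d_curr = max(0.0, d_prev - 0.01)   # pretend each action moves slightly closer
    steps += 1
    grabbed = d_curr < 0.05            # assumed grasp-success threshold
    R = weighted_reward(-(d_curr - d_prev), -steps,
                        -sum(abs(a) for a in action), 1.0 if grabbed else 0.0)
    replay_buffer.append((state, action, R, next_state, grabbed))  # (s, a, r, s', done)
    state, d_prev = next_state, d_curr
    if grabbed:                        # episode ends once the object is grasped
        break
```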

Description

Technical field

[0001] The present invention relates to the technical field of machine learning, and in particular to an efficient deep reinforcement learning reward training method and system for manipulator grasping.

Background technique

[0002] With the development of artificial intelligence and robotics, robotic arms are being used more and more widely. A robotic arm serves as the arm of a robot, just as our arms serve us. How to use a robotic arm to grasp objects accurately has become one of the key issues in the development of robot technology. At present, precise grasping with robotic arms is mostly achieved through kinematics and inverse kinematics, that is, by solving the inverse kinematics. This approach must consider solvability, namely cases with no solution or multiple solutions, which makes the computation complex and time-consuming. Another, relatively novel, way is to give the robot its own soul through reinforcement learning...


Application Information

IPC(8): B25J9/16; B25J9/04; B25J19/02; B25J19/04
CPC: B25J9/163; B25J9/04; B25J19/02; B25J19/04
Inventor: 刘成, 汪霖, 郑春燕, 张晨升, 李银奎, 赵启轩, 马俊飞, 曲瑞, 王新宇
Owner NORTHWEST UNIV