An efficient robotic arm grasping deep reinforcement learning reward training method and system

A reinforcement learning and training technology, applied in the field of machine learning, which addresses problems such as the lack of coordination and poor movement coherence of existing robotic arm control, improving movement coherence while avoiding complex computation

Active Publication Date: 2022-08-09
NORTHWEST UNIV
AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide an efficient deep reinforcement learning reward training method and system for robotic arm grasping, which addresses the poor action coherence and lack of coordination in existing manipulator control.



Examples


Embodiment 1

[0041] In step S1, after the depth camera identifies the end effector of the manipulator and the target object, it returns to the computer the coordinates (x_h, y_h) and depth d_h of the end effector and the coordinates (x_o, y_o) and depth d_o of the target object, and the Euclidean distance is used to calculate the distance d between the end effector of the manipulator and the target object. The calculation formula is as follows:

[0042] d = √((x_h − x_o)² + (y_h − y_o)² + (d_h − d_o)²) (1)
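
A minimal sketch of this distance computation in Python, assuming the camera reports an image-plane position plus a depth value for both the end effector and the target (function and variable names are illustrative, not from the patent):

```python
import math

def grasp_distance(x_h, y_h, d_h, x_o, y_o, d_o):
    """Euclidean distance d between the end effector (x_h, y_h, d_h)
    and the target object (x_o, y_o, d_o), per Eq. (1)."""
    return math.sqrt((x_h - x_o) ** 2 + (y_h - y_o) ** 2 + (d_h - d_o) ** 2)

# Example: effector at (0.30, 0.25) with depth 0.80 m, object at (0.10, 0.20) with depth 0.60 m
d = grasp_distance(0.30, 0.25, 0.80, 0.10, 0.20, 0.60)  # ≈ 0.287
```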

Embodiment 2

[0044] Step S2 initializes the moving-distance reward r_1 of the end effector of the robotic arm relative to the target object, the step-count reward r_2, the reward r_3 for the sum of the rotation angles of the servos of each degree of freedom of the manipulator, and the grasp-success reward r_4, setting all of these values to 0. The total reward R is then:

[0045] R = αr_1 + βr_2 + γr_3 + δr_4 (2).
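
A small sketch of this weighted combination, assuming illustrative weights α, β, γ, δ (the patent extract does not specify their values):

```python
def total_reward(r1, r2, r3, r4, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """Total reward R = α·r1 + β·r2 + γ·r3 + δ·r4, per Eq. (2).
    The weight values here are placeholders, not taken from the patent."""
    return alpha * r1 + beta * r2 + gamma * r3 + delta * r4

# Step S2 initialization: every component reward starts at zero, so R is zero.
r1 = r2 = r3 = r4 = 0.0
R = total_reward(r1, r2, r3, r4)  # 0.0 before the first action
```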

Embodiment 3

[0047] The reward calculation method in state S in step S5 is:

[0048] S51: Calculate the moving distance Δd = d' − d of the end effector of the robotic arm relative to the target object; the moving-distance reward is r_1 = −Δd;

[0049] S52: Increment the number of steps s = s + 1; the step-count reward is r_2 = −s;

[0050] S53: Calculate the sum of the rotation angles of the servos of each degree of freedom of the manipulator, and obtain the rotation-angle reward r_3 from this sum.
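
A hedged sketch of these per-step reward components (S51–S53). Because the extract does not give the explicit formula for r_3, a negative penalty on the total rotation is assumed below by analogy with r_1 and r_2; variable names are illustrative:

```python
def step_rewards(d_prev, d_curr, step_count, joint_rotations):
    """Component rewards after one action of the end effector.

    d_prev, d_curr  : distance to the target before and after the move
    step_count      : number of steps taken so far (s in the text)
    joint_rotations : rotation angle of each degree-of-freedom servo this step
    """
    delta_d = d_curr - d_prev                            # S51: Δd = d' − d
    r1 = -delta_d                                        # moving closer gives a positive reward
    r2 = -step_count                                     # S52: penalize long episodes
    theta_sum = sum(abs(a) for a in joint_rotations)     # S53: sum of rotation angles
    r3 = -theta_sum                                      # assumed penalty form; not given in the extract
    return r1, r2, r3
```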



Abstract

The invention discloses an efficient deep reinforcement learning reward training method and system for robotic arm grasping, relating to the technical field of machine learning. A depth camera identifies the end effector of the robotic arm and the target object and returns their coordinates and depths to the computer as a basis for judging whether the robotic arm can grasp the target object. Each time the end effector of the manipulator attempts an action, the computer calculates the moving distance of the end effector relative to the object, the number of steps moved by the manipulator, the sum of the rotation angles of the servos of each degree of freedom of the manipulator, and whether the target object has been successfully grasped, and uses these as the reward mechanism of the DDPG (deep deterministic policy gradient) network, completing the end-to-end training process with DDPG. The invention improves the poor action continuity and lack of coordination in existing robotic arm control.
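
The extract describes the composite reward R as the reward signal for a DDPG agent trained end to end. The following compact sketch (PyTorch, assumed; network sizes, dimensions, and hyperparameters are illustrative and not from the patent) shows one DDPG-style update driven by that scalar reward; target networks and the replay buffer are omitted for brevity:

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 7, 6   # hypothetical: camera-derived state, per-joint commands

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
GAMMA = 0.99

def ddpg_update(state, action, reward, next_state, done):
    """One deterministic policy gradient step driven by the scalar reward R.
    All arguments are batched tensors; `reward` is the composite R of Eq. (2)."""
    with torch.no_grad():
        next_action = actor(next_state)
        target_q = reward + GAMMA * (1 - done) * critic(torch.cat([next_state, next_action], -1))
    # Critic: regress Q(s, a) toward the bootstrapped target.
    q = critic(torch.cat([state, action], -1))
    critic_loss = nn.functional.mse_loss(q, target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Actor: ascend the critic's estimate of Q(s, actor(s)).
    actor_loss = -critic(torch.cat([state, actor(state)], -1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```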

Description

Technical Field

[0001] The invention relates to the technical field of machine learning, in particular to an efficient deep reinforcement learning reward training method and system for robotic arm grasping.

Background Technique

[0002] With the development of artificial intelligence and robotics, the uses of robotic arms are becoming more and more varied; a robotic arm acts as the arm of a robot, much like a human arm. How to use a robotic arm to grasp objects accurately has become one of the key issues in the development of robotics. At present, the most popular approach to precise grasping with a robotic arm is based on kinematics and inverse kinematics, achieving precise grasping by solving the inverse kinematics. This method must consider solvability (no solution, multiple solutions, and so on), and the calculation is complex and time-consuming. Another relatively novel way is to endow the robot with its own "soul" through reinforcement learning, so that it c...


Application Information

Patent Type & Authority: Patent (China)
IPC (8): B25J9/16; B25J9/04; B25J19/02; B25J19/04
CPC: B25J9/163; B25J9/04; B25J19/02; B25J19/04
Inventors: 刘成, 汪霖, 郑春燕, 张晨升, 李银奎, 赵启轩, 马俊飞, 曲瑞, 王新宇
Owner: NORTHWEST UNIV