Efficient mechanical arm grabbing deep reinforcement learning reward training method and system

A reinforcement learning training technology applied in the field of machine learning. It addresses problems such as poor movement coherence and lack of coordination, improves movement coherence, and avoids complex kinematic calculations.

Active Publication Date: 2021-06-18
NORTHWEST UNIV


Problems solved by technology

[0006] The purpose of the present invention is to provide an efficient deep reinforcement learning reward training method and system for manipulator grasping.



Examples


Embodiment 1

[0041] In step S1, after the depth camera recognizes the end effector of the mechanical arm and the target object, it returns to the computer the coordinates of the end effector (x_h, y_h) and its depth d_h, and the coordinates of the target object (x_o, y_o) and its depth d_o. The Euclidean distance d between the end effector of the manipulator and the target object is then calculated as follows:

[0042] d = √((x_h − x_o)² + (y_h − y_o)² + (d_h − d_o)²) (1)
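As an illustrative sketch of this calculation (the function name, coordinate values, and units below are hypothetical, not taken from the patent):

```python
import math

def end_effector_target_distance(x_h, y_h, d_h, x_o, y_o, d_o):
    """Euclidean distance between the end effector at (x_h, y_h, d_h)
    and the target object at (x_o, y_o, d_o), as returned by the depth camera."""
    return math.sqrt((x_h - x_o) ** 2 + (y_h - y_o) ** 2 + (d_h - d_o) ** 2)

# Hypothetical example: end effector at (0.30, 0.25, 0.60), target at (0.45, 0.20, 0.55)
d = end_effector_target_distance(0.30, 0.25, 0.60, 0.45, 0.20, 0.55)
print(f"d = {d:.3f}")  # d = 0.166
```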

Embodiment 2

[0044] Step S2 initializes the reward r1 for the moving distance of the end effector of the mechanical arm relative to the target object, the step-count reward r2, the reward r3 for the sum of the rotation angles of the steering gears of all degrees of freedom of the mechanical arm, and the grasp-success reward r4, setting all of these values to 0. The total reward R is then:

[0045] R = αr1 + βr2 + γr3 + δr4 (2).
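A minimal sketch of this weighted combination follows; the default weight values α, β, γ, δ below are illustrative placeholders, not values from the patent:

```python
def total_reward(r1, r2, r3, r4, alpha=1.0, beta=0.01, gamma=0.001, delta=10.0):
    """Total reward R = alpha*r1 + beta*r2 + gamma*r3 + delta*r4, as in Eq. (2).
    The default weights are illustrative placeholders only."""
    return alpha * r1 + beta * r2 + gamma * r3 + delta * r4

# At initialisation (step S2) the four component rewards are all zero, so R = 0.
print(total_reward(0.0, 0.0, 0.0, 0.0))  # 0.0
```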

Embodiment 3

[0047] The reward in state S in step S5 is calculated as follows:

[0048] S51: Calculate the moving distance of the end effector of the manipulator relative to the target object, Δd = d′ − d; the moving distance reward r1 = −Δd;

[0049] S52: Step count s = s + 1; step-count reward r2 = −s;

[0050] S53: Calculate the sum of the rotation angles of the steering gears of all degrees of freedom of the manipulator; the rotation angle reward r3 is the negative of this sum;
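The per-step computation of S51 to S53 could be sketched as below; the grasp-success reward r4 named in the abstract is included with an assumed 0/1 value, and the sign of r3 follows the same negative-penalty pattern as r1 and r2 (an assumption, since the original formula is not shown in this excerpt):

```python
def step_component_rewards(d_prev, d_curr, step_count, joint_angle_deltas, grabbed):
    """Component rewards for one step, following S51-S53; the grasp-success
    term r4 uses an assumed 0/1 value (not specified in this excerpt)."""
    delta_d = d_curr - d_prev                        # S51: relative movement d' - d
    r1 = -delta_d                                    # moving closer gives a positive reward
    step_count += 1                                  # S52: s = s + 1
    r2 = -step_count                                 # more steps -> larger penalty
    theta = sum(abs(a) for a in joint_angle_deltas)  # S53: total steering-gear rotation
    r3 = -theta                                      # less joint motion -> smaller penalty
    r4 = 1.0 if grabbed else 0.0                     # assumed grasp-success bonus
    return r1, r2, r3, r4, step_count

# Hypothetical step: the end effector moved from 0.40 to 0.35 away from the target.
print(step_component_rewards(0.40, 0.35, 0, [0.1, -0.05, 0.02], False))
# ≈ (0.05, -1, -0.17, 0.0, 1)
```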



Abstract

The invention discloses an efficient deep reinforcement learning reward training method and system for mechanical arm grasping, and relates to the technical field of machine learning. A depth camera is used to identify the target object and the end of the mechanical arm, the straight-line distance between the center of the mechanical arm's end effector and the center of the target object is calculated, and this distance is returned to a computer as the basis for judging whether the mechanical arm can grasp the target object. Each time the mechanical arm's end effector acts, the computer takes the weighted sum of the moving distance of the end effector relative to the object, the number of steps the mechanical arm has moved, the sum of the rotation angles of the steering gears of all degrees of freedom of the mechanical arm, and whether the target object is successfully grasped, as the reward mechanism of a DDPG (deep deterministic policy gradient) network. The end-to-end training process is completed using DDPG. The problems of poor action coherence and lack of coordination in existing mechanical arm control are solved.
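As a rough sketch of how such per-step rewards would feed a DDPG learner, the loop below stores (state, action, reward, next state, done) transitions in a replay buffer; the random-action policy, the toy environment dynamics, the reward weights, and the grasp threshold are stand-ins for the actual DDPG actor and robot, not the patent's implementation:

```python
import random
from collections import deque

def weighted_reward(r1, r2, r3, r4, w=(1.0, 0.01, 0.001, 10.0)):
    """Weighted sum of the four component rewards; the weights are placeholders."""
    return w[0] * r1 + w[1] * r2 + w[2] * r3 + w[3] * r4

replay_buffer = deque(maxlen=100_000)  # transitions later sampled by the DDPG networks
state = [0.0] * 6                      # toy 6-DOF joint state
d_prev, steps = 0.40, 0                # assumed initial end-effector/target distance

for t in range(100):
    action = [random.uniform(-0.05, 0.05) for _ in state]  # stand-in for the DDPG actor
    next_state = [s + a for s, a in zip(state, action)]
    d_curr = max(0.0, d_prev - 0.01)   # pretend each action moves slightly closer
    steps += 1
    grabbed = d_curr < 0.05            # assumed grasp-success threshold
    R = weighted_reward(-(d_curr - d_prev), -steps,
                        -sum(abs(a) for a in action), 1.0 if grabbed else 0.0)
    replay_buffer.append((state, action, R, next_state, grabbed))  # (s, a, r, s', done)
    state, d_prev = next_state, d_curr
    if grabbed:                        # episode ends once the object is grasped
        break
```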

Description

Technical field

[0001] The present invention relates to the technical field of machine learning, and in particular to an efficient deep reinforcement learning reward training method and system for manipulator grasping.

Background technique

[0002] With the development of artificial intelligence and robotics, robotic arms are being used more and more widely. A robotic arm serves as the arm of a robot, just as our arms serve us. How to use a robotic arm to grasp objects accurately has become one of the key issues in the development of robot technology. At present, precise grasping with robotic arms is mostly achieved through kinematics and inverse kinematics, that is, by solving the inverse kinematics. This approach must consider solvability, namely cases with no solution or multiple solutions, which makes the computation complex and time-consuming. Another, relatively novel, way is to give the robot its own soul through reinforcement learning...


Application Information

IPC(8): B25J9/16; B25J9/04; B25J19/02; B25J19/04
CPC: B25J9/163; B25J9/04; B25J19/02; B25J19/04
Inventor: 刘成, 汪霖, 郑春燕, 张晨升, 李银奎, 赵启轩, 马俊飞, 曲瑞, 王新宇
Owner NORTHWEST UNIV