An efficient robotic arm grasping deep reinforcement learning reward training method and system

A reinforcement learning and training technology, applied in the field of machine learning, which addresses problems such as the lack of coordination and poor movement coherence of existing robotic arm control, improving movement coherence while avoiding complex computation

Active Publication Date: 2022-08-09
NORTHWEST UNIV
AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide an efficient deep reinforcement learning reward training method and system for robotic arm grasping, which addresses the poor action coherence and lack of coordination in existing manipulator control.



Examples


Embodiment 1

[0041] In step S1, after the depth camera identifies the end effector of the manipulator and the target object, it returns to the computer the coordinates (x_h, y_h) and depth d_h of the end effector and the coordinates (x_o, y_o) and depth d_o of the target object, and the Euclidean distance is used to calculate the distance d between the end effector of the manipulator and the target object. The calculation formula is as follows:

[0042] d = √((x_h − x_o)² + (y_h − y_o)² + (d_h − d_o)²) (1)
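
A minimal sketch of this distance computation in Python, assuming the camera reports an image-plane position plus a depth value for both the end effector and the target (function and variable names are illustrative, not from the patent):

```python
import math

def grasp_distance(x_h, y_h, d_h, x_o, y_o, d_o):
    """Euclidean distance d between the end effector (x_h, y_h, d_h)
    and the target object (x_o, y_o, d_o), per Eq. (1)."""
    return math.sqrt((x_h - x_o) ** 2 + (y_h - y_o) ** 2 + (d_h - d_o) ** 2)

# Example: effector at (0.30, 0.25) with depth 0.80 m, object at (0.10, 0.20) with depth 0.60 m
d = grasp_distance(0.30, 0.25, 0.80, 0.10, 0.20, 0.60)  # ≈ 0.287
```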

Embodiment 2

[0044] Step S2 initializes the moving-distance reward r_1 of the end effector of the robotic arm relative to the target object, the step-count reward r_2, the reward r_3 for the sum of the rotation angles of the servos of each degree of freedom of the manipulator, and the grasp-success reward r_4, setting all of these values to 0. The total reward R is then:

[0045] R = αr_1 + βr_2 + γr_3 + δr_4 (2).
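
A small sketch of this weighted combination, assuming illustrative weights α, β, γ, δ (the patent extract does not specify their values):

```python
def total_reward(r1, r2, r3, r4, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """Total reward R = α·r1 + β·r2 + γ·r3 + δ·r4, per Eq. (2).
    The weight values here are placeholders, not taken from the patent."""
    return alpha * r1 + beta * r2 + gamma * r3 + delta * r4

# Step S2 initialization: every component reward starts at zero, so R is zero.
r1 = r2 = r3 = r4 = 0.0
R = total_reward(r1, r2, r3, r4)  # 0.0 before the first action
```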

Embodiment 3

[0047] The reward calculation method in state S in step S5 is:

[0048] S51: Calculate the moving distance Δd = d' − d of the end effector of the robotic arm relative to the target object; the moving-distance reward is r_1 = −Δd;

[0049] S52: Increment the number of steps s = s + 1; the step-count reward is r_2 = −s;

[0050] S53: Calculate the sum of the rotation angles of the servos of each degree of freedom of the manipulator, and obtain the rotation-angle reward r_3 from this sum.
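
A hedged sketch of these per-step reward components (S51–S53). Because the extract does not give the explicit formula for r_3, a negative penalty on the total rotation is assumed below by analogy with r_1 and r_2; variable names are illustrative:

```python
def step_rewards(d_prev, d_curr, step_count, joint_rotations):
    """Component rewards after one action of the end effector.

    d_prev, d_curr  : distance to the target before and after the move
    step_count      : number of steps taken so far (s in the text)
    joint_rotations : rotation angle of each degree-of-freedom servo this step
    """
    delta_d = d_curr - d_prev                            # S51: Δd = d' − d
    r1 = -delta_d                                        # moving closer gives a positive reward
    r2 = -step_count                                     # S52: penalize long episodes
    theta_sum = sum(abs(a) for a in joint_rotations)     # S53: sum of rotation angles
    r3 = -theta_sum                                      # assumed penalty form; not given in the extract
    return r1, r2, r3
```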



Abstract

The invention discloses an efficient deep reinforcement learning reward training method and system for robotic arm grasping, relating to the technical field of machine learning. A depth camera identifies the end effector of the robotic arm and the target object and returns their coordinates and depths to the computer as a basis for judging whether the robotic arm can grasp the target object. Each time the end effector of the manipulator attempts an action, the computer calculates the moving distance of the end effector relative to the object, the number of steps moved by the manipulator, the sum of the rotation angles of the servos of each degree of freedom of the manipulator, and whether the target object has been successfully grasped, and uses these as the reward mechanism of the DDPG (deep deterministic policy gradient) network, completing the end-to-end training process with DDPG. The invention improves the poor action continuity and lack of coordination in existing robotic arm control.
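
The extract describes the composite reward R as the reward signal for a DDPG agent trained end to end. The following compact sketch (PyTorch, assumed; network sizes, dimensions, and hyperparameters are illustrative and not from the patent) shows one DDPG-style update driven by that scalar reward; target networks and the replay buffer are omitted for brevity:

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 7, 6   # hypothetical: camera-derived state, per-joint commands

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
GAMMA = 0.99

def ddpg_update(state, action, reward, next_state, done):
    """One deterministic policy gradient step driven by the scalar reward R.
    All arguments are batched tensors; `reward` is the composite R of Eq. (2)."""
    with torch.no_grad():
        next_action = actor(next_state)
        target_q = reward + GAMMA * (1 - done) * critic(torch.cat([next_state, next_action], -1))
    # Critic: regress Q(s, a) toward the bootstrapped target.
    q = critic(torch.cat([state, action], -1))
    critic_loss = nn.functional.mse_loss(q, target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Actor: ascend the critic's estimate of Q(s, actor(s)).
    actor_loss = -critic(torch.cat([state, actor(state)], -1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```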

Description

Technical Field

[0001] The invention relates to the technical field of machine learning, in particular to an efficient deep reinforcement learning reward training method and system for robotic arm grasping.

Background Technique

[0002] With the development of artificial intelligence and robotics, the uses of robotic arms are becoming more and more varied; a robotic arm acts as the arm of a robot, much like a human arm. How to use a robotic arm to grasp objects accurately has become one of the key issues in the development of robotics. At present, the most popular approach to precise grasping with a robotic arm is based on kinematics and inverse kinematics, achieving precise grasping by solving the inverse kinematics. This method must consider solvability (no solution, multiple solutions, and so on), and the calculation is complex and time-consuming. Another relatively novel way is to endow the robot with its own "soul" through reinforcement learning, so that it c...


Application Information

Patent Type & Authority: Patent (China)
IPC (8): B25J9/16; B25J9/04; B25J19/02; B25J19/04
CPC: B25J9/163; B25J9/04; B25J19/02; B25J19/04
Inventors: 刘成, 汪霖, 郑春燕, 张晨升, 李银奎, 赵启轩, 马俊飞, 曲瑞, 王新宇
Owner: NORTHWEST UNIV