Robot motion control method and device based on actor-critic method

A robot motion control method, applied in the field of machine learning

Inactive Publication Date: 2016-06-22
SUZHOU UNIV
Cites: 7; Cited by: 30

AI Technical Summary

Problems solved by technology

However, for the off-policy AC method proposed in recent


Examples

Example Embodiment

[0045] Embodiment 1: A robot motion control method. Video data is collected by a camera and processed to obtain the robot's current position, the obstacle distribution, and the given destination; the map is obtained by analyzing the video data. The position of the robot is taken as the state x, and the movement direction of the robot is taken as the action u. As shown in Figure 1, the control method comprises a learning process and motion control.

[0046] The learning process includes the following steps:

[0047] 1. State transition

[0048] The robot's state is transitioned according to the environment model, and the action to perform in the new state is then selected according to the behavior policy. The behavior policy is completely random: in any state, all actions are selected with equal probability, and the probabilities sum to 1.
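A minimal Python sketch of the state transition and the uniform random behavior policy is given here for illustration. The grid map, the four movement directions, and the names `ACTIONS`, `step`, and `sample_action` are assumptions made for the example; the source does not specify the map representation or the action set.

```python
import random

# Assumed action set: four movement directions on a grid map (hypothetical).
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def behavior_probabilities():
    """Uniform random behavior policy: each action has probability 1/|A|."""
    p = 1.0 / len(ACTIONS)
    return {name: p for name in ACTIONS}


def step(state, action, obstacles, width, height):
    """Assumed environment model: move one cell, or stay put if blocked."""
    dx, dy = ACTIONS[action]
    nx, ny = state[0] + dx, state[1] + dy
    outside = not (0 <= nx < width and 0 <= ny < height)
    if outside or (nx, ny) in obstacles:
        return state
    return (nx, ny)


def sample_action(rng):
    """Draw one action name according to the behavior policy."""
    probs = behavior_probabilities()
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]
```

Because the behavior policy ignores the state, the same probability table is reused at every step; only the transition depends on the map.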

[0049] 2. Calculate the off-policy factor...
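The source text is truncated at this step, so the following is only a sketch under an assumption: that the factor in step 2 is the usual importance-sampling ratio between the target policy and the behavior policy, ρ = π(u|x) / b(u|x), as used in off-policy actor-critic methods such as Off-PAC. The softmax target policy and the names `softmax_policy` and `off_policy_factor` are hypothetical and are not taken from the patent.

```python
import math


def softmax_policy(preferences):
    """Assumed target policy: softmax over per-action preference values."""
    m = max(preferences.values())  # subtract the max for numerical stability
    exps = {a: math.exp(v - m) for a, v in preferences.items()}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}


def off_policy_factor(target_probs, behavior_probs, action):
    """Importance-sampling ratio rho = pi(u|x) / b(u|x) (assumed form)."""
    return target_probs[action] / behavior_probs[action]
```

With a uniform behavior policy, every action has nonzero probability, so the ratio is always defined.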

Example Embodiment

[0074] This embodiment: 801.4

[0075] Off-PAC: 1242.4

[0076] OPGTD2(λ): 1125.2

[0077] SARSA: 1747.8

[0078] Results obtained with the method of the present invention for different values of μ:

[0079] VOPAC

Abstract

The invention discloses a robot motion control method and device based on an actor-critic method. The control method comprises the following steps: video data is collected, and the robot's current position, the obstacle distribution, and the given destination are obtained; the position of the robot serves as its state, and the motion direction of the robot serves as the action; a state transition is performed; the off-policy factor is calculated; the approximate average reward and the approximate average squared reward are updated; the current average-reward temporal difference and the current average-squared-reward temporal difference are calculated; the approximate average reward parameters and the approximate average squared reward parameters are updated iteratively; the approximate average reward gradient and the approximate average squared reward gradient are calculated, and the policy parameters are updated; and the current state and action are replaced with their successors. These steps are repeated until the policy parameters converge, thereby achieving motion control of the robot. The method and device achieve intelligent motion control with a stable control result.
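As an illustration of why the abstract tracks both an average reward and an average squared reward, the two quantities together give a variance estimate through the identity Var[r] = E[r²] − (E[r])². The sketch below keeps running estimates of both. The incremental update rule, the step size, and the class name `RewardMoments` are assumptions made for the example; the patent's actual update formulas are not reproduced in this excerpt.

```python
class RewardMoments:
    """Running estimates of the average reward and average squared reward."""

    def __init__(self, step_size=0.1):
        self.step_size = step_size  # assumed constant step size
        self.mean = 0.0             # approximate average reward
        self.mean_sq = 0.0          # approximate average squared reward

    def update(self, reward):
        # Move each estimate a fraction of the way toward the new sample.
        self.mean += self.step_size * (reward - self.mean)
        self.mean_sq += self.step_size * (reward * reward - self.mean_sq)

    def variance(self):
        # Var[r] = E[r^2] - (E[r])^2, clipped at zero against rounding error.
        return max(self.mean_sq - self.mean ** 2, 0.0)
```

A lower variance estimate corresponds to more consistent rewards, which is in line with the abstract's claim of a stable control result.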

Description

Technical field

[0001] The invention relates to a robot motion control method in the field of machine learning, and in particular to a variance-related off-policy actor-critic control method and device.

Background

[0002] With the development of robotics research, intelligent control of robot movement has become a key technical issue for the further development of robots. In the prior art, robot motion control includes human control and automatic control.

[0003] For example, Chinese invention patent application CN105313129A discloses a video-based robot walking motion control method. The robot's camera collects video images, the images are viewed on a mobile terminal (tablet or mobile phone), and the walking motion of the robot is controlled by sliding a finger across the video image on the mobile terminal. This technical solution is a form of human control; although control beyond visual range can be realized thr...


Application Information

IPC(8): B25J9/16
CPC: B25J9/1664
Inventor: 刘全, 许丹, 朱斐
Owner: SUZHOU UNIV