Optimal control method of gait of humanoid robot based on deep Q network

Humanoid robot and network technology, applied to adaptive control, general control systems, and control/regulation systems. It addresses complex movements such as gait walking and achieves fast, stable walking with increased walking speed.

Inactive Publication Date: 2020-02-07

HOHAI UNIV

3 Cites, 18 Cited by


AI Technical Summary

Problems solved by technology

Current deep reinforcement learning methods enable a robot to perform some simple tasks, but they do not address complex movements such as gait walking.

Examples

Embodiment 1

[0042] A gait control method for a humanoid robot based on a deep Q network, comprising:

[0043] Construct the gait model of the humanoid robot to realize the omnidirectional walking of the humanoid robot;

[0044] Obtain the interaction data between the humanoid robot and the environment during walking and store it in the memory data pool, which provides the training samples. Each interaction record is a quadruple (s, a, r, s′), where s represents the state parameters, a represents the action parameters of the humanoid robot in state s, r represents the feedback reward value obtained by the humanoid robot when performing action a in state s, and s′ represents the next state of the humanoid robot after performing action a in state s;

[0045] Build a deep Q-network learning architecture, learn and train the deep Q-network based on the training samples of the memory data pool, and obtain the state-action strategy deep Q-network model of the humanoid ...
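The memory data pool and the training signal described in [0044] and [0045] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the pool capacity, the oldest-first eviction, the uniform sampling, the discount factor gamma, and the names MemoryPool, td_targets, and q_values are all assumptions. The target shown is the standard deep Q-learning target, y = r + gamma * max over a′ of Q(s′, a′), which the source text does not spell out.

```python
import random
from collections import deque, namedtuple

# One interaction record (s, a, r, s') as described in [0044].
Transition = namedtuple("Transition", ["s", "a", "r", "s_next"])


class MemoryPool:
    """Fixed-capacity memory data pool (capacity and eviction order are assumptions)."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest record once the pool is full.
        self.records = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.records.append(Transition(s, a, r, s_next))

    def sample(self, batch_size):
        # Uniform sampling without replacement (an assumption).
        return random.sample(list(self.records), batch_size)

    def __len__(self):
        return len(self.records)


def td_targets(batch, q_values, gamma=0.99):
    """Compute y = r + gamma * max_a' Q(s', a') for every record in the batch.

    q_values is a hypothetical callable that maps a state to a list of
    Q-values, one per discrete action. Terminal states (for example a fall)
    are not handled in this sketch.
    """
    return [t.r + gamma * max(q_values(t.s_next)) for t in batch]
```

In a training loop, a batch drawn with pool.sample(...) would be passed to td_targets(...), and the network would be fitted so that its output for (s, a) moves toward the corresponding target.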

Embodiment 1-1

[0051] Building on Embodiment 1, this embodiment uses a simulation platform to illustrate the gait control and optimization process of the humanoid robot: the NAO simulation robot is the experimental object and the RoboCup 3D simulation platform is the experimental environment. During training, the gait model parameters and state parameters are captured directly through the platform and used to fit the state-action value function generated by the robot's walking. The gait action performed by the robot is selected through the action selection strategy, and a reward function is generated to update the DQN. This reduces the risk of falling into a local optimum caused by the large number of robot parameters in the optimization process, improves the walking speed of the robot, and realizes fast and stable walking of the humanoid robot.
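The excerpt does not say which action selection strategy is used. A common choice with a DQN is epsilon-greedy selection, sketched below purely as an assumption: the function name select_action, the epsilon parameter, and the idea that each discrete action index stands for one candidate set of gait parameters are all hypothetical.

```python
import random


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy selection over discrete actions (an assumed strategy).

    q_values: list of Q-values, one per candidate gait action.
    epsilon:  probability of picking a random action instead of the best one.
    """
    if rng.random() < epsilon:
        # Explore: pick any action index uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the index with the highest Q-value.
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

With epsilon set to 0 the function always returns the highest-valued action; a schedule that decays epsilon during training is a typical design choice, although the source does not describe one.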

[0052] As shown in Figure 1 and Figure 2, this embodiment also includes,...

Abstract

The invention discloses an optimal control method for the gait of a humanoid robot based on a deep Q network. The method comprises the steps of: constructing a gait model, and obtaining interaction data between the humanoid robot and the environment during walking to provide training samples; training the deep Q network on the training samples from a memory data pool to obtain a state-action strategy deep Q network model of the humanoid robot; obtaining the state parameters of the humanoid robot in the action environment as the input of the deep Q network model, and obtaining the action parameters output by the deep Q network model under the current state-action strategy; performing gait control on the humanoid robot using the constructed gait model according to the action parameters output by the deep Q network model; and updating the deep Q network during training by generating a reward function. The optimal control method disclosed by the invention improves the walking speed of the humanoid robot and realizes fast and stable walking.

Description

Technical field

[0001] The invention relates to a gait optimization control method for a humanoid robot based on a deep Q network, and belongs to the technical field of humanoid robots.

Background

[0002] As an important branch of mobile robots, humanoid robots are the most suitable universal mobile and manipulation platforms for working with humans. Of all the human behaviors a robot imitates, walking is the most important function it should have.

[0003] A humanoid robot has many degrees of freedom, and its mechanical structure changes while it walks. Using the 3D Linear Inverted Pendulum Model (3D-LIPM) gait model to realize fast walking requires tuning a large number of gait parameters. However, the traditional manual parameter adjustment method takes a lot of time and may not find the optimal values. Currently, genetic algorithms, particle swarm optimization, and reinforcement learning can all optimi...

Application Information

IPC(8): G05B13/04

CPC: G05B13/042

Inventor: 刘惠义, 袁雯, 陶莹, 刘晓芸

Owner: HOHAI UNIV
