Biped robot adaptive walking control method based on deep reinforcement learning

The technique applies deep reinforcement learning to the adaptive walking control of biped robots. It addresses the mediocre performance of policy gradient algorithms, their failure to converge on complex inputs, and the very large number of parameters, with the aim of achieving strong parameter optimization and update capability, high walking stability, and good real-time performance.

Active Publication Date: 2019-09-20

TONGJI ARTIFICIAL INTELLIGENCE RES INST SUZHOU CO LTD



AI Technical Summary

Problems solved by technology

[0005] 2. Huge parameters

With such a large number of parameters, policy gradient algorithms such as RDPG, DDPG, and Actor-Critic models perform only moderately and can even fail to converge on complex inputs.


Embodiment Construction

[0039] The present invention is described in detail below with reference to the accompanying drawings and specific embodiments. This embodiment is implemented on the basis of the technical solution of the present invention and gives a detailed implementation and specific operating procedure, but the scope of protection of the present invention is not limited to the following embodiments.

[0040] The present invention introduces reinforcement learning. Conventional supervised machine learning models, and deep learning methods built on large-scale neural networks, cannot achieve adaptive control when robot walking places high real-time demands on the data and when consecutive time states depend on one another. Because a reinforcement learning model is trained on the interaction data between the agent (the robot) and the environment, it has unique advantages in this unsupervised scenario. At the same time, in ord...
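The interaction between the agent and the environment described in [0040] can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `ToyWalkerEnv` is a hypothetical one-dimensional stand-in for a walking simulator, not the simulation platform of the patent, and the reward values and the `run_episode` helper are invented for the example.

```python
class ToyWalkerEnv:
    """Hypothetical 1-D stand-in for a walking simulator (not the patent's platform)."""

    def __init__(self, goal=5, max_steps=50):
        self.goal = goal            # position that counts as "arrived"
        self.max_steps = max_steps  # episode length limit
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial state."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an action (-1 = step back, +1 = step forward)."""
        self.position += action
        self.steps += 1
        reached = self.position >= self.goal
        done = reached or self.steps >= self.max_steps
        # Illustrative reward: small cost per step, bonus on reaching the goal.
        reward = 1.0 if reached else -0.01
        return self.position, reward, done


def run_episode(env, policy):
    """Run one episode and collect (state, action, reward, next_state, done) tuples."""
    transitions = []
    state = env.reset()
    done = False
    while not done:
        action = policy(state)
        next_state, reward, done = env.step(action)
        transitions.append((state, action, reward, next_state, done))
        state = next_state
    return transitions


# Usage: a trivial policy that always steps forward.
episode = run_episode(ToyWalkerEnv(goal=3), lambda state: 1)
```

Each tuple records one step of interaction; a learning algorithm would consume these tuples to update its policy, which is the sense in which the model is trained on interaction data rather than on labeled examples.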


Abstract

The invention relates to a biped robot adaptive walking control method based on deep reinforcement learning. The method comprises the following steps: 1) a simulation platform is established; 2) a network model is constructed based on a deep reinforcement learning method that introduces an attention mechanism; 3) the network model is trained on the interaction information of the biped robot in the simulation platform environment, and the interaction information is stored in a replay pool; and 4) the trained network model is used to realize adaptive walking control of the biped robot. Compared with the prior art, the method has advantages including fast convergence, good fitting, and high walking stability.
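The pool that stores interaction information in step 3 is commonly implemented as a fixed-size experience replay buffer. The sketch below shows one such buffer under stated assumptions: the class name `ReplayPool`, the `Transition` field names, and the uniform random sampling are illustrative choices, not details taken from the patent.

```python
import random
from collections import deque, namedtuple

# Hypothetical record layout; the source only says "interaction information".
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayPool:
    """Fixed-capacity pool that discards the oldest transition when full."""

    def __init__(self, capacity):
        # deque with maxlen drops items from the left once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        """Append one transition to the pool."""
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a uniform random batch without replacement (an assumption)."""
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: store five transitions in a pool that holds only three.
pool = ReplayPool(capacity=3)
for t in range(5):
    pool.store(state=t, action=1, reward=0.0, next_state=t + 1, done=False)
batch = pool.sample(2)
```

Sampling past transitions at random, rather than training only on the most recent step, is a common way to reduce the correlation between consecutive training samples.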

Description

Technical field

[0001] The invention relates to a robot control method, in particular to a biped robot adaptive walking control method based on deep reinforcement learning.

Background

[0002] Through the continuous development and innovation of technology, biped robots can now walk in a known environment by means of trajectory planning or trajectory teaching. However, compared with humans, who can adaptively adjust their gait, cross obstacles, and move flexibly in an unknown environment, the walking control of biped robots still has much room for improvement.

[0003] The adaptive walking control of biped robots in complex environments faces several difficulties:

[0004] 1. Various gaits. Robots need to produce many kinds of gaits when crossing complex terrain. Classical robot walking control algorithms such as multi-objective optimization, gradient descent, genetic algorithms, and single-layer CPG cannot satisfy the adaptabil...


Application Information


IPC(8): G05D1/02, G06N3/04, G06N3/08

CPC: G05D1/0257, G06N3/084, G05D1/0223, G05D1/0221, G05D1/028, G05D1/0276, G06N3/045

Inventors: 刘成菊 (Liu Chengju), 马璐 (Ma Lu)

Owner: TONGJI ARTIFICIAL INTELLIGENCE RES INST SUZHOU CO LTD
