Biped robot adaptive walking control method based on deep reinforcement learning

A biped robot adaptive walking control method based on deep reinforcement learning. It addresses problems such as the limited performance of policy gradient algorithms, failure to converge on complex inputs, and very large parameter counts, and it achieves strong parameter optimization and updating, high walking stability, and good real-time performance.

Active Publication Date: 2019-09-20
TONGJI ARTIFICIAL INTELLIGENCE RES INST SUZHOU CO LTD

AI Technical Summary

Problems solved by technology

[0005] 2. Huge parameters
In this case, policy gradient algorithms such as RDPG, DDPG, a


Embodiment Construction

[0039] The present invention is described in detail below with reference to the accompanying drawings and specific embodiments. This embodiment is carried out on the premise of the technical solution of the present invention, and a detailed implementation and specific operating procedure are given, but the protection scope of the present invention is not limited to the following embodiments.

[0040] The present invention introduces reinforcement learning. Common supervised machine learning models, and deep learning methods that rely on large-scale neural networks, cannot achieve adaptive control in robot walking scenarios, where the real-time requirements on the data are high and each state depends on the states before and after it. Because a reinforcement learning model is trained on data from the interaction between the agent (the robot) and the environment, it has unique advantages in this unsupervised scenario. At the same time, in ord...


Abstract

The invention relates to a biped robot adaptive walking control method based on deep reinforcement learning. The method comprises the steps that 1) a simulation platform is established; 2) a network model based on a deep reinforcement learning method introducing an attention mechanism is constructed; 3) the network model is trained according to the interaction information of a biped robot in the environment of the simulation platform, and the interaction information is stored in a playback pool; and 4) the network model which completes training is used to realize adaptive control of walking of the biped robot. Compared with the prior art, the method provided by the invention has the advantages of fast convergence speed, good fitting effect, high walking stability and the like.
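Step 3 of the abstract stores interaction information in a playback pool. A minimal sketch of such a pool is given below, assuming a fixed-capacity buffer with uniform random sampling. The names `Transition`, `ReplayPool`, and `collect`, the toy two-element state, and the reward are all hypothetical illustrations, not taken from the patent, and the attention-based network model itself is not shown.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One step of agent-environment interaction (hypothetical layout)."""
    state: Sequence[float]
    action: Sequence[float]
    reward: float
    next_state: Sequence[float]
    done: bool


class ReplayPool:
    """Fixed-capacity pool; the oldest transitions are dropped when full."""

    def __init__(self, capacity: int) -> None:
        self._buffer: Deque[Transition] = deque(maxlen=capacity)

    def store(self, transition: Transition) -> None:
        self._buffer.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Uniform sampling without replacement; this is an assumption,
        # the patent text shown here does not specify the sampling scheme.
        return rng.sample(list(self._buffer), batch_size)

    def __len__(self) -> int:
        return len(self._buffer)


def collect(pool: ReplayPool, steps: int, rng: random.Random) -> None:
    """Fill the pool from a toy stand-in for the simulation platform."""
    state = [0.0, 0.0]
    for t in range(steps):
        action = [rng.uniform(-1.0, 1.0)]          # random exploratory action
        next_state = [state[0] + action[0], float(t + 1)]
        reward = -abs(next_state[0])               # toy reward: stay near zero
        done = (t + 1) % 50 == 0                   # toy episode length of 50
        pool.store(Transition(state, action, reward, next_state, done))
        state = [0.0, 0.0] if done else next_state


pool = ReplayPool(capacity=100)
collect(pool, steps=250, rng=random.Random(0))
batch = pool.sample(32, random.Random(1))
```

In a training loop, each sampled `batch` would be passed to the network update. Sampling from the pool rather than from consecutive steps reduces the correlation between training samples, which is the usual motivation for this kind of buffer.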

Description

Technical field

[0001] The invention relates to a robot control method, in particular to a biped robot adaptive walking control method based on deep reinforcement learning.

Background art

[0002] Through the continuous development and innovation of technology, biped robots have become able to walk in a known environment through trajectory planning or trajectory teaching. However, compared with humans, who can adaptively adjust their gait, cross obstacles, and move flexibly in an unknown environment, the walking control of biped robots still leaves much room for improvement.

[0003] There are several difficulties in the adaptive walking control of biped robots in complex environments:

[0004] 1. Various gaits. Robots need to produce many kinds of gaits when crossing complex terrains. Classical robot walking control algorithms such as multi-objective optimization, gradient descent, genetic algorithm and single-layer CPG cannot satisfy the adaptabil...


Application Information

IPC(8): G05D1/02, G06N3/04, G06N3/08
CPC: G05D1/0257, G06N3/084, G05D1/0223, G05D1/0221, G05D1/028, G05D1/0276, G06N3/045
Inventor: 刘成菊, 马璐
Owner: TONGJI ARTIFICIAL INTELLIGENCE RES INST SUZHOU CO LTD