A Control Method of Inverted Pendulum Based on Neural Network and Reinforcement Learning

Reinforcement learning and neural network technology, applied in the field of artificial intelligence and control, with the effect of accelerating generation of the control quantity, improving efficiency, and speeding up updates.

Inactive
Publication Date: 2018-11-06
CHINA UNIV OF MINING & TECH
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] To solve the above problems, the present invention provides an inverted pendulum control method based on a neural network and reinforcement learning. It not only achieves fast stabilizing control of the inverted pendulum system, but also uses a reinforcement learning algorithm from the field of artificial intelligence to build and update a neural network that keeps the inverted pendulum balanced without labeled data and without a teacher.



Examples


Embodiment Construction

[0031] One implementation of the inverted pendulum control method based on a neural network and reinforcement learning of the present invention proceeds as follows:

[0032] The overall control framework of the present invention is a reinforcement learning controller. Assume that at each time step $t = 1, 2, \ldots$ the Agent observes the state $s_t$ of the Markov decision process, chooses action $a_t$, receives immediate reward $r_t$, and causes the system to transition to the next state $s_{t+1}$ with transition probability $p(s_t, a_t, s_{t+1})$. The evolution of the first n steps of the system is therefore as follows:

[0033]
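
The interaction described in [0032] can be sketched in Python as a rollout that records one (s_t, a_t, r_t, s_{t+1}) tuple per time step. This is only an illustration and not the formula from the filing: the function names `rollout`, `toy_transition`, and `toy_policy`, as well as the toy integer-state environment, are hypothetical and stand in for the inverted pendulum system.

```python
import random


def rollout(transition, policy, s1, n, seed=0):
    """Record the first n steps as (s_t, a_t, r_t, s_{t+1}) tuples."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    trajectory = []
    s = s1
    for _ in range(n):
        a = policy(s, rng)                 # Agent chooses action a_t in state s_t
        s_next, r = transition(s, a, rng)  # environment returns s_{t+1} and r_t
        trajectory.append((s, a, r, s_next))
        s = s_next
    return trajectory


def toy_transition(s, a, rng):
    """Hypothetical stochastic transition: the action is reversed 10% of the time."""
    move = a if rng.random() < 0.9 else -a
    s_next = s + move
    reward = 1.0 if s_next == 0 else 0.0   # reward only when the state reaches 0
    return s_next, reward


def toy_policy(s, rng):
    """Hypothetical policy: always step toward state 0."""
    return -1 if s > 0 else 1


traj = rollout(toy_transition, toy_policy, s1=3, n=5)
for step in traj:
    print(step)
```

Each tuple's next state is the following tuple's current state, which is the chained structure the paragraph describes.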

[0034] The goal of a reinforcement learning system is to learn a policy π that maximizes the cumulative discounted reward obtained in future time steps.

[0035] Here 0≤γ≤1 is the discount factor, and the policy that attains this maximum is the optimal policy. However, in many real situations, the state transition probability function P and reward function R of the environment ...
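
As a minimal sketch of the quantity being maximized, the snippet below computes a cumulative discounted reward for a finite reward sequence. It assumes the standard textbook form, the sum of γ^k · r_{t+k} over k, which may differ in indexing from the formula in the filing; the function name `discounted_return` is hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k] for k = 0, 1, ... (assumed standard form)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must satisfy 0 <= gamma <= 1")
    total = 0.0
    # Accumulate from the last reward backwards: G_k = r_k + gamma * G_{k+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1 + 0.5 + 0.25 = 1.75
```

With γ close to 0 the Agent weighs only the immediate reward, while γ = 1 weighs all future rewards equally.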


Abstract

The invention belongs to the technical field of artificial intelligence and control. It relates to a neural network and reinforcement learning algorithm, and in particular to an inverted pendulum control method based on a neural network and reinforcement learning that learns by itself to control an inverted pendulum. The method comprises: step one, obtaining inverted pendulum system model information; step two, obtaining state information of the inverted pendulum and initializing a neural network; step three, completing ELM training using a training sample SAM; step four, controlling the inverted pendulum using a reinforcement learning controller; step five, updating the training sample and a BP neural network; and step six, checking whether the control result meets a learning termination condition; if not, returning to step two to continue the loop; if so, ending the algorithm. The invention effectively addresses the curse of dimensionality that readily arises in continuous state spaces, as well as the control of nonlinear systems with continuous states, and it updates quickly.
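
The six-step flow in the abstract can be sketched as a plain Python loop. This is a structural sketch only: the dictionary keys and the function name `run_learning_loop` are hypothetical, the callables are stubs rather than the ELM, BP, or controller logic of the invention, and initializing the network only on the first pass is an assumption, since the abstract does not say whether it is re-initialized on each return to step two.

```python
def run_learning_loop(steps, max_iterations=1000):
    """Run the six-step flow; `steps` maps hypothetical step names to callables."""
    model = steps["get_model"]()                      # step one: model information
    network, samples = None, []
    for iteration in range(1, max_iterations + 1):
        state = steps["observe_state"](model)         # step two: pendulum state
        if network is None:                           # assumption: initialize once
            network = steps["init_network"](state)
        network = steps["train_elm"](network, samples)            # step three
        result = steps["control"](model, network, state)          # step four
        samples, network = steps["update"](samples, network, result)  # step five
        if steps["is_done"](result):                  # step six: termination check
            return network, iteration
    return network, max_iterations                    # safety cap (assumption)


# Stub callables that only exercise the control flow.
stub_steps = {
    "get_model": lambda: {"name": "pendulum"},
    "observe_state": lambda model: 0.0,
    "init_network": lambda state: {"updates": 0},
    "train_elm": lambda net, samples: net,
    "control": lambda model, net, state: net["updates"],
    "update": lambda samples, net, result: (
        samples + [result], {"updates": net["updates"] + 1}),
    "is_done": lambda result: result >= 2,
}

network, iterations = run_learning_loop(stub_steps)
print(network, iterations)
```

Passing the steps in as callables keeps the loop independent of how each step is implemented, so the stubs can be swapped for real components one at a time.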

Description

Technical field

[0001] The invention relates to an inverted pendulum control method based on a neural network and reinforcement learning. It concerns neural network and reinforcement learning algorithms that can learn by themselves to control an inverted pendulum, and it belongs to the field of artificial intelligence and control technology. In particular, it combines the reinforcement learning algorithm with ELM-BP, exploits the generalization performance of the neural network, and adopts the actor-critic architecture to design a new method that can effectively control an inverted pendulum system with a continuous state space.

Background technique

[0002] The inverted pendulum control system is an unstable, complex, nonlinear system. It is an ideal model for testing control theories and methods, and an ideal experimental platform for teaching control theory and carrying out various control experiments. The research on the inverted p...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G05B13/04, G06N3/08
CPC: G05B13/042, G05B2219/32329, G06N3/088
Inventors: 丁世飞, 孟令恒, 王婷婷, 许新征
Owner: CHINA UNIV OF MINING & TECH