A Control Method of Inverted Pendulum Based on Neural Network and Reinforcement Learning

A technique combining reinforcement learning and neural networks, applied in the field of artificial intelligence and control, that speeds up generation of the control quantity, improves efficiency, and updates quickly.

Status: Inactive; Publication Date: 2018-11-06

CHINA UNIV OF MINING & TECH

7 Cites, 0 Cited by

Problems solved by technology

[0005] To solve the above problems, the present invention provides an inverted pendulum control method based on a neural network and reinforcement learning. The method achieves fast stabilization of the inverted pendulum system and, by using a reinforcement learning algorithm from the field of artificial intelligence, builds and updates a neural network that keeps the inverted pendulum balanced without labeled data and without a supervisor.

Embodiment Construction

[0031] One implementation of the inverted pendulum control method based on neural network and reinforcement learning of the present invention proceeds as follows:

[0032] The overall control framework of the present invention is a reinforcement learning controller. Assume that at each time step $t = 1, 2, \ldots$ the Agent observes the state $s_t$ of the Markov decision process, chooses an action $a_t$, receives an immediate reward $r_t$, and the system transitions to the next state $s_{t+1}$ with transition probability $p(s_t, a_t, s_{t+1})$. The evolution of the first n steps of the system is therefore as follows:

[0033]

[0034] The goal of a reinforcement learning system is to learn a policy π such that the cumulative discounted reward obtained in future time steps

[0035] is maximized (0≤γ≤1 is the discount factor). Such a policy is the optimal policy. In many real situations, however, the state transition probability function P and the reward function R of the environment ...
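As a small illustration of the cumulative discounted reward mentioned in [0034] and [0035], the sketch below assumes the standard form $\sum_{k \ge 0} \gamma^k r_{t+k}$, because the formula itself is not reproduced in this text. The function name `discounted_return` and the sample reward sequence are hypothetical and are not taken from the patent.

```python
def discounted_return(rewards, gamma):
    """Return sum over k of gamma**k * rewards[k].

    Assumes the standard discounted-return definition; the patent's own
    formula is not available in this text.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Accumulate from the last reward backwards so that each step
    # multiplies everything after it by gamma exactly once.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


if __name__ == "__main__":
    # Hypothetical episode: no penalty for two steps, then a failure penalty.
    print(discounted_return([0.0, 0.0, -1.0], gamma=0.5))
```

A smaller γ shrinks the weight of rewards that arrive later, so a distant failure penalty matters less to the current decision than an imminent one.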

Abstract

The invention belongs to the technical field of artificial intelligence and control. It relates to neural network and reinforcement learning algorithms, and in particular to an inverted pendulum control method based on a neural network and reinforcement learning that learns by itself to control an inverted pendulum. The method comprises: step one, obtaining model information of the inverted pendulum system; step two, obtaining state information of the inverted pendulum and initializing the neural network; step three, completing ELM training using the training sample SAM; step four, controlling the inverted pendulum with the reinforcement learning controller; step five, updating the training sample and the BP neural network; and step six, checking whether the control result meets the learning termination condition; if not, returning to step two and continuing the loop; if so, ending the algorithm. The invention effectively addresses the curse of dimensionality that readily arises in a continuous state space, as well as the control of nonlinear systems with continuous states, and it updates quickly.
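Step three trains an ELM, but the abstract does not say how. The sketch below assumes the ELM is a standard extreme learning machine: a single hidden layer whose input weights and biases are drawn at random and left fixed, with only the output weights solved by least squares. The function names, the `tanh` activation, the 4-dimensional state, and the toy targets are all illustrative assumptions, not details from the patent.

```python
import numpy as np


def elm_train(X, T, n_hidden, rng):
    """Fit a basic ELM: random fixed hidden layer, least-squares output layer."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                 # output weights via pseudo-inverse
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Evaluate the trained ELM on new inputs."""
    return np.tanh(X @ W + b) @ beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical training set: 200 four-dimensional states with toy targets.
    X = rng.uniform(-1.0, 1.0, size=(200, 4))
    T = np.sin(X.sum(axis=1, keepdims=True))
    W, b, beta = elm_train(X, T, n_hidden=50, rng=rng)
    mse = np.mean((elm_predict(X, W, b, beta) - T) ** 2)
    print(f"training MSE: {mse:.2e}")
```

Because only `beta` is solved for, training is a single linear least-squares step rather than an iterative gradient descent, which fits the fast update speed the abstract claims.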

Description

Technical field

[0001] The invention relates to an inverted pendulum control method based on a neural network and reinforcement learning. It concerns neural network and reinforcement learning algorithms that can learn by themselves to control an inverted pendulum, and it belongs to the field of artificial intelligence and control technology. In particular, it combines the reinforcement learning algorithm with ELM-BP, exploits the generalization performance of the neural network, and adopts the actor-critic architecture to design a new method that can effectively control an inverted pendulum system with a continuous state space.

Background technique

[0002] The inverted pendulum control system is an unstable, complex, nonlinear system. It is an ideal model for testing control theories and methods, and an ideal experimental platform for teaching control theory and carrying out various control experiments. The research on the inverted p...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G05B13/04, G06N3/08

CPC: G05B13/042, G05B2219/32329, G06N3/088

Inventors: 丁世飞, 孟令恒, 王婷婷, 许新征

Owner: CHINA UNIV OF MINING & TECH
