
Reinforced learning algorithm based on immunologic tolerance mechanism

An immune tolerance and reinforcement learning technology, applied in the field of reinforcement learning algorithms, that addresses the tendency of the algorithm to fall into local extrema and fail to converge, and thereby ensures global optimization ability.

Status: Inactive
Publication Date: 2013-07-24
XIAN UNIV OF TECH
Cites: 2; Cited by: 6

AI Technical Summary

Problems solved by technology

TD(λ) based on value function approximation has attracted increasing attention, but the algorithm is prone to falling into local extrema and failing to converge.



Examples


Embodiment

[0107] The implementation process of the reinforcement learning algorithm based on the immune tolerance mechanism in the present invention will be illustrated below through an example of robot path planning.

[0108] (1) First, determine the path map of the robot: a 20×20 grid map represented by a matrix M, in which element 0 represents a passable area and element 1 represents an obstacle.
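The grid map in step (1) can be sketched as a nested list. This is only an illustration: the helper names `make_grid` and `is_passable` and the obstacle positions are hypothetical, since the source specifies only the 20×20 size and the 0/1 encoding.

```python
# Illustrative sketch; only the 20x20 size and the 0/1 encoding come from the source.
SIZE = 20

def make_grid(obstacles, size=SIZE):
    """Build a size x size matrix: 0 marks a passable cell, 1 an obstacle."""
    grid = [[0] * size for _ in range(size)]
    for row, col in obstacles:
        grid[row][col] = 1
    return grid

def is_passable(grid, row, col):
    """Return True if (row, col) lies inside the map and is not an obstacle."""
    inside = 0 <= row < len(grid) and 0 <= col < len(grid[0])
    return inside and grid[row][col] == 0

# Made-up obstacle positions, for demonstration only.
M = make_grid([(3, 4), (3, 5), (10, 10)])
print(is_passable(M, 0, 0))   # free cell inside the map
print(is_passable(M, 3, 4))   # obstacle cell
print(is_passable(M, 20, 0))  # outside the map
```

Treating out-of-range cells as impassable keeps later neighbour checks simple, though the source does not say how map borders are handled.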

[0109] (2) Second, initialize the parameters (see step 1).

[0110] (3) Starting from the starting position, if the position of the robot basically does not change within k time steps, that is, the distance between the position k steps before the current time step and the current position is less than a certain threshold D_max, then use the immune mechanism to optimize the learning system and jump to (4); otherwise jump to (5).
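The stagnation check in step (3) might look like the following sketch. It rests on several assumptions not stated in the source: "basically does not change" is read as a displacement of at most D_max over k steps, distance is Euclidean on grid coordinates, and the class name `StagnationDetector` and the values of k and D_max are hypothetical.

```python
import math
from collections import deque

class StagnationDetector:
    """Flags when the robot has moved at most d_max over the last k steps.

    Assumption: Euclidean distance on (row, col) grid coordinates.
    """

    def __init__(self, k, d_max):
        self.k = k
        self.d_max = d_max
        # Holds the current position plus the k positions before it.
        self.history = deque(maxlen=k + 1)

    def update(self, position):
        """Record a position; return True if the robot appears stuck."""
        self.history.append(position)
        if len(self.history) <= self.k:
            return False  # not enough history yet
        old_row, old_col = self.history[0]  # position k steps ago
        row, col = position
        return math.hypot(row - old_row, col - old_col) <= self.d_max

detector = StagnationDetector(k=3, d_max=1.0)  # made-up parameter values
for pos in [(0, 0), (0, 1), (1, 1), (0, 0)]:
    stuck = detector.update(pos)
print(stuck)  # the robot is back where it was 3 steps earlier
```

When `update` returns True the caller would branch to step (4); otherwise it would continue with step (5).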

[0111] (4) Execute steps 3 to 7 for the weights in the neural network.

[0112] (5) As shown in Figure 6, the 8 locations adjacent to the current location are n...
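Enumerating the 8 adjacent locations mentioned in step (5) can be sketched as below. The offset ordering is arbitrary, because the numbering used in Figure 6 is not recoverable from the text, and the function name `passable_neighbors` is hypothetical.

```python
# The ordering of offsets is an assumption; Figure 6's numbering is not available.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                    (0, -1),           (0, 1),
                    (1, -1),  (1, 0),  (1, 1)]

def passable_neighbors(grid, row, col):
    """Return the adjacent cells that are inside the map and not obstacles."""
    result = []
    for d_row, d_col in NEIGHBOR_OFFSETS:
        r, c = row + d_row, col + d_col
        if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
            result.append((r, c))
    return result

# Tiny 3x3 demonstration map (0 = passable, 1 = obstacle), not from the source.
demo = [[0, 1, 0],
        [0, 0, 0],
        [0, 0, 0]]
print(passable_neighbors(demo, 1, 1))  # every neighbour except the obstacle at (0, 1)
print(passable_neighbors(demo, 0, 0))  # corner cell: only in-map, free neighbours
```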


Abstract

The invention discloses a reinforcement learning algorithm based on an immune tolerance mechanism. The algorithm first designs the basis function vector and the weight vector of TD(λ), then encodes the weight vector as floating-point numbers. When the error between the system's model of the environment and the real environment is larger than a set threshold, the situation is treated as a primary response in an artificial immune system: the environment is being met for the first time, so optimization is carried out by the immune tolerance mechanism and the environmental knowledge is stored in a memory, namely an immune body. The optimal strategy is then selected according to the parameters of the current system, the parameters are updated according to the feedback reward value r, and the next iteration proceeds. When the error is smaller than the threshold, the environment is regarded as a similar environment and treated as a secondary response in the artificial immune system: the system judges the action selection and directly selects the optimal strategy according to its parameters.
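The branch between primary and secondary response described in the abstract can be sketched as a small dispatch function. This is a sketch under stated assumptions, not the patented procedure: the name `learning_step`, the `optimize` callable standing in for the immune tolerance optimization, and the treatment of an error exactly equal to the threshold as a secondary response are all hypothetical.

```python
def learning_step(weights, model_error, threshold, optimize, memory):
    """Choose between primary and secondary response for one iteration.

    Primary response (error > threshold): the environment is treated as new,
    so the weights are passed to `optimize` (a placeholder for the immune
    tolerance optimization) and the result is stored in `memory`.
    Secondary response (otherwise): the current weights are reused directly.
    """
    if model_error > threshold:
        weights = optimize(weights)
        memory.append(list(weights))  # remember the learned knowledge
        return weights, "primary"
    return weights, "secondary"

# Placeholder optimizer for demonstration only; it simply halves each weight.
halve = lambda w: [x * 0.5 for x in w]

memory = []
w, kind = learning_step([1.0, 2.0], model_error=0.9, threshold=0.5,
                        optimize=halve, memory=memory)
print(kind, w)
w, kind = learning_step(w, model_error=0.1, threshold=0.5,
                        optimize=halve, memory=memory)
print(kind, w)
```

Strategy selection and the reward-driven parameter update would follow this branch in either case; they are left out here because the abstract does not give their details.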

Description

Technical field

[0001] The invention relates to a reinforcement learning algorithm based on an immune tolerance mechanism.

Background technique

[0002] Reinforcement learning is a kind of machine learning algorithm that lies between supervised learning and unsupervised learning. It originated in behavioral psychology and was developed in the 1980s. It is currently widely used in game competitions, control systems, scheduling management, and robotics, and is a hotspot in machine learning research.

[0003] Reinforcement learning can learn the environment based on deterministic or non-deterministic rewards without knowing the model. Typical reinforcement learning algorithms are the Sarsa learning algorithm, the Q learning algorithm, and the TD(λ) learning algorithm. Among them, the TD(λ) learning algorithm includes tabular TD(λ) and TD(λ) based on value function approximation. In the Sarsa learning algorithm, Q learning algorithm, and tabular TD(λ), a large amount of storage space is requir...
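For orientation, the textbook form of TD(λ) with linear value function approximation can be sketched as follows. This is the standard update with accumulating eligibility traces, not the patent's modified algorithm; the function name `td_lambda_update` and the parameter values in the example are hypothetical.

```python
def td_lambda_update(w, e, phi_s, phi_next, reward, alpha, gamma, lam):
    """One step of standard linear TD(lambda) with accumulating traces.

    The value estimate is V(s) = w . phi(s), where phi is the feature
    (basis function) vector and w is the weight vector.
    """
    v_s = sum(wi * pi for wi, pi in zip(w, phi_s))
    v_next = sum(wi * pi for wi, pi in zip(w, phi_next))
    delta = reward + gamma * v_next - v_s                 # TD error
    e = [gamma * lam * ei + pi for ei, pi in zip(e, phi_s)]  # decay, then add features
    w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]    # move weights along the trace
    return w, e, delta

# Two one-hot features and made-up hyperparameters, for demonstration only.
w, e, delta = td_lambda_update(w=[0.0, 0.0], e=[0.0, 0.0],
                               phi_s=[1.0, 0.0], phi_next=[0.0, 1.0],
                               reward=1.0, alpha=0.5, gamma=0.9, lam=0.8)
print(w, e, delta)
```

Only a weight vector the size of the feature vector is stored, which is what makes this variant attractive compared with tabular methods that keep a value per state.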


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/00, G06N3/08
Inventor: 王磊, 黑新宏, 金海燕, 林叶, 王玉
Owner: XIAN UNIV OF TECH