Reinforced learning algorithm based on immunologic tolerance mechanism

An immune-tolerance-based reinforcement learning technique, applied in the field of reinforcement learning algorithms, that addresses the algorithm's tendency to fall into local extrema and fail to converge, and preserves its global optimization ability.

Status: Inactive
Publication Date: 2013-07-24

XIAN UNIV OF TECH

Cites: 2 | Cited by: 6


AI Technical Summary

Problems solved by technology

TD(λ) based on value function approximation has attracted growing attention, but the algorithm is prone to falling into local extrema and may fail to converge.


Examples


Embodiment

[0107] The implementation process of the reinforcement learning algorithm based on the immune tolerance mechanism in the present invention will be illustrated below through an example of robot path planning.

[0108] (1) First, determine the path map of the robot. A 20×20 grid map is used, represented by a matrix M, in which element 0 represents a passable area and element 1 represents an obstacle.

[0109] (2) Second, initialize the parameters (see step 1).

[0110] (3) Starting from the start position, check whether the position of the robot has essentially stopped changing over k time steps, that is, whether the distance between its position k steps before the current time step and its current position is less than a threshold D_max. If so, use the immune mechanism to optimize the learning system and jump to (4); otherwise jump to (5).

[0111] (4) Execute steps 3 to 7 for the weights in the neural network.

[0112] (5) As shown in Figure 6, the 8 locations adjacent to the current location are n...
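The map and movement setup in steps (1), (3), and (5) can be sketched roughly as follows. This is an illustration only, not the patent's implementation: the function names `make_grid`, `free_neighbors`, and `is_stagnant` are hypothetical, Euclidean distance is assumed, and the stagnation test assumes that "the position basically does not change" means the displacement over k steps falls below D_max.

```python
import math

def make_grid(size=20, obstacles=()):
    """Build a size x size map M: 0 = passable area, 1 = obstacle."""
    M = [[0] * size for _ in range(size)]
    for r, c in obstacles:
        M[r][c] = 1
    return M

def free_neighbors(M, pos):
    """Return the passable cells among the 8 cells adjacent to pos."""
    rows, cols = len(M), len(M[0])
    r, c = pos
    result = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the current cell itself
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and M[nr][nc] == 0:
                result.append((nr, nc))
    return result

def is_stagnant(path, k, d_max):
    """Assumed reading of step (3): True when the robot has moved
    less than d_max between k steps ago and the current step."""
    if len(path) <= k:
        return False  # not enough history yet
    return math.dist(path[-1], path[-1 - k]) < d_max
```

When `is_stagnant` returns True the loop would branch to step (4), the immune optimization of the network weights; otherwise it would continue with the neighbor evaluation of step (5).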


Abstract

The invention discloses a reinforcement learning algorithm based on an immune tolerance mechanism. The algorithm first designs the basis function vector and the weight vector of TD(λ), then encodes the weight vector as floating-point numbers. When the error between the system's environment and the real environment is larger than a set threshold, the situation is treated as a primary response in an artificial immune system: the environment is being met for the first time, so it is optimized by the immune tolerance mechanism and the environmental knowledge is stored in memory, namely as an antibody. The optimal strategy is then selected according to the current system parameters, the system parameters are updated according to the feedback reward value r, and the next iteration proceeds. When the error between the system's environment and the real environment is smaller than the threshold, the environment is treated as a similar environment, corresponding to a secondary response in the artificial immune system, and the system selects the optimal strategy directly from its parameters when choosing an action.
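The switch between primary and secondary response described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions: `respond` and `immune_optimize` are hypothetical names, the immune tolerance optimization is passed in as a placeholder callable because the abstract does not specify it, and an error exactly equal to the threshold is treated as a secondary response (the abstract leaves that case open).

```python
def respond(weights, model_error, threshold, memory, immune_optimize):
    """Pick the response type for one iteration.

    weights          floating-point encoded weight vector (list of floats)
    model_error      error between the system's environment and the real one
    memory           list of stored weight vectors, standing in for antibodies
    immune_optimize  placeholder for the immune tolerance optimization
    """
    if model_error > threshold:
        # Primary response: unfamiliar environment, so optimize the
        # weights and memorize the result.
        weights = immune_optimize(weights)
        memory.append(list(weights))
        return weights, "primary"
    # Secondary response: similar environment, reuse current parameters.
    return weights, "secondary"
```

After either branch, the algorithm would select an action from the current parameters; only the primary branch pays the cost of the immune optimization.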

Description

Technical field

[0001] The invention relates to a reinforcement learning algorithm based on an immune tolerance mechanism.

Background art

[0002] Reinforcement learning is a class of machine learning algorithms between supervised learning and unsupervised learning. It originated in behavioral psychology, was developed in the 1980s, and is now widely used in game playing, control systems, scheduling management, and robotics. It is a hotspot of machine learning research.

[0003] Reinforcement learning can learn about the environment from deterministic or non-deterministic rewards without knowing the model. Typical reinforcement learning algorithms are the Sarsa learning algorithm, the Q learning algorithm, and the TD(λ) learning algorithm. The TD(λ) learning algorithm includes tabular TD(λ) and TD(λ) based on value function approximation. In the Sarsa learning algorithm, the Q learning algorithm, and tabular TD(λ), a large amount of storage space is requir...
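For orientation, TD(λ) based on value function approximation is commonly written with a linear value estimate, a weight vector and a basis function (feature) vector, updated with accumulating eligibility traces. The sketch below shows that standard textbook form, not necessarily the exact update used in the patent; the name `td_lambda_update` and the default values of `alpha`, `gamma`, and `lam` are assumptions.

```python
def td_lambda_update(theta, trace, phi_s, phi_next, reward,
                     alpha=0.1, gamma=0.9, lam=0.8):
    """One TD(lambda) step with a linear value function V(s) = theta . phi(s).

    theta     weight vector
    trace     eligibility trace vector (same length as theta)
    phi_s     basis function vector of the current state
    phi_next  basis function vector of the next state
    """
    v_s = sum(t * p for t, p in zip(theta, phi_s))
    v_next = sum(t * p for t, p in zip(theta, phi_next))
    delta = reward + gamma * v_next - v_s  # temporal-difference error
    # Accumulating trace: decay the old trace, add the current features.
    trace = [gamma * lam * e + p for e, p in zip(trace, phi_s)]
    theta = [t + alpha * delta * e for t, e in zip(theta, trace)]
    return theta, trace, delta
```

Unlike the tabular variants, only the weight vector and the trace are stored, so memory grows with the number of basis functions rather than with the number of states.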


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06N3/00, G06N3/08

Inventor: 王磊, 黑新宏, 金海燕, 林叶, 王玉

Owner: XIAN UNIV OF TECH
