Reinforcement learning method and device based on short-term access mechanism, and storage medium

A reinforcement learning technology based on a short-term access mechanism, applied in neural learning methods, neural architectures, biological neural network models, etc. It can address problems such as the low learning efficiency of action strategies, and achieves reduced sample complexity, improved learning efficiency, and improved exploration efficiency.

Pending Publication Date: 2020-11-06
TSINGHUA UNIV
Cites: 0; Cited by: 1

AI Technical Summary

Problems solved by technology

[0003] In view of this, the present disclosure proposes a reinforcement learning technical solution based on a short-term access mechanism, to address the low learning efficiency of the agent's action strategy during reinforcement learning.


Examples

Embodiment Construction

[0046] Various exemplary embodiments, features, and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. The same reference numbers in the figures indicate functionally identical or similar elements. While various aspects of the embodiments are shown in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.

[0047] The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as superior or better than other embodiments.

[0048] In addition, numerous specific details are given in the following detailed description in order to better illustrate the present disclosure. Those skilled in the art will understand that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, componen...

Abstract

The invention relates to a reinforcement learning method and device based on a short-term access mechanism, and a storage medium. The method comprises: configuring a state cache list, which stores state increment information obtained from changes in the agent's current environment state when the agent satisfies a preset short-term access mechanism; inputting all actions available to the agent at the next moment into an environment state transition probability model, which outputs the environment state at the next moment corresponding to each action; comparing these next-moment environment states with the state increment information in the state cache list, and taking the action whose environment state differs most from it as a first alternative action for the agent to execute at the next moment; and performing the exploration operation of reinforcement learning according to the first alternative action. The state cache list avoids repeated exploration of environment states that have already been explored, while the environment state transition probability model strengthens and guides the agent's exploration of unknown states, effectively improving learning efficiency.
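The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `StateCache`, `select_exploratory_action`, `grid_model`, and `MOVES` are hypothetical, the cache stores visited states directly rather than state increments, the transition model is deterministic, and "difference" is taken to be the Euclidean distance to the nearest cached state. The abstract does not specify any of these details.

```python
import math
from collections import deque


class StateCache:
    """Bounded record of recently visited states.

    Assumption: the source stores "state increment information"; this
    sketch simply stores the visited states themselves.
    """

    def __init__(self, capacity=16):
        self._items = deque(maxlen=capacity)  # oldest entries drop out first

    def add(self, state):
        self._items.append(tuple(state))

    def novelty(self, state):
        """Euclidean distance from `state` to its nearest cached state."""
        if not self._items:
            return math.inf
        return min(math.dist(state, cached) for cached in self._items)


def select_exploratory_action(state, actions, transition_model, cache):
    """Return the action whose predicted next state differs most from the cache."""
    return max(actions, key=lambda a: cache.novelty(transition_model(state, a)))


# Hypothetical deterministic model of a 2-D grid world (an assumption; the
# source describes a learned environment state transition probability model).
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, 1), "down": (0, -1)}


def grid_model(state, action):
    dx, dy = MOVES[action]
    return (state[0] + dx, state[1] + dy)


cache = StateCache()
for visited in [(0, 0), (1, 0), (1, 1), (1, -1)]:
    cache.add(visited)

# From (1, 0), only "right" leads to a state that is not already cached.
choice = select_exploratory_action((1, 0), list(MOVES), grid_model, cache)
print(choice)  # right
```

The bounded `deque` stands in for the short-term nature of the cache: only recent states influence the choice, so the agent is steered away from what it has just seen rather than from everything it has ever visited.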

Description

Technical field

[0001] The present disclosure relates to the technical field of artificial intelligence, and in particular to a reinforcement learning method, device and storage medium based on a short-term access mechanism.

Background

[0002] Reinforcement learning is a learning method that governs the interaction between an agent and its environment so that the agent obtains the maximum reward in that environment. One of the most important problems in reinforcement learning is how to trade off "exploration" against "exploitation": relying too heavily on "exploration" reduces the learning efficiency of the agent's action strategy, while relying too heavily on "exploitation" prevents the agent from learning more effective action strategies; relying solely on either "exploration" or "exploitation" cannot complete reinforcement learning tasks well. Traditional solutions usually use a count-based method, that is, maintain a global s...
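To make the exploration/exploitation trade-off concrete, here is a minimal sketch of epsilon-greedy action selection. This is a standard textbook technique used purely as an illustration; it is not the method of this disclosure, nor the count-based method mentioned above. The function name `epsilon_greedy` and its parameters are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability `epsilon`, otherwise exploit.

    q_values: estimated value of each action, indexed by action id.
    epsilon:  0.0 means pure exploitation, 1.0 means pure exploration.
    rng:      any object with random() and randrange(), e.g. random.Random(0).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform random action
    # exploit: action with the highest current estimate
    return max(range(len(q_values)), key=lambda a: q_values[a])


estimates = [0.1, 0.9, 0.3]
greedy_action = epsilon_greedy(estimates, epsilon=0.0)
print(greedy_action)  # 1
```

With `epsilon` near 1 the agent rarely uses what it has learned, and with `epsilon` near 0 it rarely tries anything new, which mirrors the two failure modes described in the background.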

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045
Inventors: 季向阳, 张宏昌
Owner: TSINGHUA UNIV