
Method for constructing diversified search strategy model based on deep reinforcement learning network

A search strategy and reinforcement learning technique, applied in the field of deep reinforcement learning, that reduces the probability of an agent falling into a local solution

Active Publication Date: 2022-01-21
INST OF AUTOMATION CHINESE ACAD OF SCI
8 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, traditional deep reinforcement learning exploration methods struggle with misleading rewards when the input is high-dimensional data (for example, environments whose states are images or high-dimensional vectors). These misleading rewards prevent agents from obtaining higher long-term rewards, so the agent eventually becomes stuck in a local solution.



Examples


Embodiment Construction

[0037] To make the purpose, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in these embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments that persons of ordinary skill in the art obtain from the embodiments in the present disclosure without creative effort fall within the protection scope of the present disclosure.

[0038] The first exemplary embodiment of the present disclosure provides a method for building a model based on a deep reinforcement learning network.

[0039] Figure 1 is a flowchart that schematically shows a method for constructing a diversified search strategy model based on a deep reinforcement learning network according to an embodiment...



Abstract

The invention relates to a method for constructing a diversified search strategy model based on a deep reinforcement learning network. The method sets the weight of a virtual reward so that different agents visit different states. Once one agent is trapped by a misleading reward, the weight takes a negative value whenever other agents again visit the series of states that lead to that misleading reward. The virtual reward signal those agents receive is therefore negative, which forces them to stop visiting that series of states. This guarantees that different agents visit different state sets, so that the updated search strategy model, once trained, can find a second target position corresponding to the global optimum. The method addresses the technical problem in the prior art that the global optimum cannot be found when searching high-dimensional data because the agent is trapped by misleading rewards, and it reduces the probability that an agent falls into a local solution because of misleading rewards.
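The virtual-reward idea in the abstract can be illustrated with a small sketch. This is not the patented method: the count-based penalty, the function names `virtual_reward` and `shaped_reward`, the `visit_counts` structure, and the default weight are all assumptions chosen for illustration. The sketch only shows how a negative weight turns visits to states already occupied by other agents into a negative signal.

```python
from collections import Counter

def virtual_reward(state, own_id, visit_counts, weight=-1.0):
    """Hypothetical virtual reward: weight times the number of visits
    that OTHER agents have already made to `state`.

    visit_counts maps agent id -> Counter of visited states (assumed layout).
    With a negative weight, states other agents already reached are penalized.
    """
    others = sum(
        counts[state]
        for agent_id, counts in visit_counts.items()
        if agent_id != own_id
    )
    return weight * others

def shaped_reward(env_reward, state, own_id, visit_counts, weight=-1.0):
    """Environment reward plus the (assumed) virtual reward term."""
    return env_reward + virtual_reward(state, own_id, visit_counts, weight)

# Agent 0 has been drawn to a misleading-reward state three times.
visits = {0: Counter({"s_trap": 3}), 1: Counter()}

# Agent 1 is discouraged from following agent 0 into the same state,
# while an unvisited state keeps its raw environment reward.
r_trap = shaped_reward(1.0, "s_trap", own_id=1, visit_counts=visits, weight=-0.5)
r_new = shaped_reward(1.0, "s_new", own_id=1, visit_counts=visits, weight=-0.5)
```

In this sketch an agent's own visits do not count against it, so only overlap between agents is penalized; a real system would need some way to decide which states actually lead to a misleading reward, which the excerpt above does not specify.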

Description

Technical field

[0001] The present disclosure relates to the fields of deep reinforcement learning and image processing, and in particular to a method for constructing a diversified search strategy model based on a deep reinforcement learning network.

Background art

[0002] With the development of artificial intelligence technology, deep reinforcement learning has been proposed for making decisions in complex scenarios. Deep learning (DL) is a machine learning method that learns representations of data. Reinforcement learning (RL) builds an environment model and learns an optimal policy by exploring an unknown environment. Deep reinforcement learning (DRL) combines the perception ability of deep learning with the decision-making ability of reinforcement learning, and can perform control directly according to the input information...


Application Information

IPC(8): G06N3/08, G06N20/10, G06N3/04
CPC: G06N3/08, G06N20/10, G06N3/04
Inventor: 黄凯奇, 尹奇跃, 张俊格, 徐沛
Owner INST OF AUTOMATION CHINESE ACAD OF SCI