Method for constructing diversified search strategy model based on deep reinforcement learning network

A search strategy and reinforcement learning technology, applied in the field of deep reinforcement learning, to reduce the probability of falling into a local solution

Active Publication Date: 2022-01-21

INST OF AUTOMATION CHINESE ACAD OF SCI

Cites: 8. Cited by: 2.


AI Technical Summary

Problems solved by technology

[0004] However, traditional deep reinforcement learning exploration methods struggle with misleading rewards in scenarios where the input is high-dimensional data (such as environments whose states are images or high-dimensional vectors). These misleading rewards prevent agents from obtaining higher long-term rewards, which eventually leaves the agent stuck in a local solution.

Embodiment Construction

[0037] To make the purpose, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the drawings. The described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in the present disclosure without creative effort fall within the protection scope of the present disclosure.

[0038] The first exemplary embodiment of the present disclosure provides a method for building a model based on a deep reinforcement learning network.

[0039] Figure 1 schematically shows a flowchart of a method for constructing a model of diverse search strategies based on a deep reinforcement learning network according to an embodiment...


Abstract

The invention relates to a method for constructing a diversified search strategy model based on a deep reinforcement learning network. The method sets the weight of a virtual reward so that different agents visit different states. Once one agent is trapped by a misleading reward, the weight is negative for any other agent that again visits the series of states leading to that misleading reward, so the virtual reward signal those agents receive is negative and they are forced to stop visiting that series of states. This ensures that different agents visit different state sets, and after training the updated search strategy model can find a second target position corresponding to the global optimum. The method thereby effectively solves the technical problem in the prior art that the global optimum cannot be found because the search is trapped by misleading rewards when searching high-dimensional data, and it reduces the probability that an agent falls into a local solution due to misleading rewards.
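The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the additive form of the shaped reward, the `penalty` and `virtual_reward` values, and the 1-D chain environment are all assumptions made for illustration, since the abstract does not specify how the weight or the virtual reward is computed.

```python
from typing import Hashable, Set


def virtual_reward_weight(state: Hashable,
                          misleading_states: Set[Hashable],
                          penalty: float = 2.0) -> float:
    """Assumed weighting rule: negative for states on a trajectory that
    already led another agent to a misleading reward, zero otherwise."""
    return -penalty if state in misleading_states else 0.0


def shaped_reward(env_reward: float,
                  state: Hashable,
                  misleading_states: Set[Hashable],
                  virtual_reward: float = 1.0,
                  penalty: float = 2.0) -> float:
    """Assumed additive shaping: environment reward plus weighted virtual reward."""
    weight = virtual_reward_weight(state, misleading_states, penalty)
    return env_reward + weight * virtual_reward


# Hypothetical 1-D chain with positions 0..6: a small misleading reward
# at position 0 and the global optimum at position 6.
env_rewards = {0: 1.0, 6: 10.0}

# Trajectory of a first agent that started at 2 and got stuck at position 0.
trapped_trajectory = [2, 1, 0]
misleading_states = set(trapped_trajectory)

# Reward signal a second agent would receive at each position.
signals = {
    s: shaped_reward(env_rewards.get(s, 0.0), s, misleading_states)
    for s in range(7)
}

for position, signal in signals.items():
    print(position, signal)
```

In this toy setting the second agent sees a negative signal everywhere along the first agent's trajectory, including at the misleading reward itself, while the rest of the chain is unaffected. That is the sense in which the negative weight pushes different agents toward different state sets.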

Description

Technical field

[0001] The present disclosure relates to the fields of deep reinforcement learning and image processing technology, and in particular to a method for constructing a model of diverse search strategies based on a deep reinforcement learning network.

Background

[0002] With the development of artificial intelligence technology, deep reinforcement learning has been proposed as a method for making decisions in complex scenarios. Deep learning (DL) is a method of representation learning for data in machine learning. Reinforcement learning (RL) builds an environment model and learns an optimal strategy by exploring an unknown environment. Deep reinforcement learning (DRL) combines the perception ability of deep learning and the decision-making ability of reinforcement learning, and can be directly controlled according to the input informatio...


Application Information


IPC(8): G06N3/08, G06N20/10, G06N3/04

CPC: G06N3/08, G06N20/10, G06N3/04

Inventor: 黄凯奇, 尹奇跃, 张俊格, 徐沛

Owner: INST OF AUTOMATION CHINESE ACAD OF SCI
