
Method for improving accuracy of reliable action selection in intelligent agent control

An agent-control technology, applied in instruments, computing models, artificial life, etc., that addresses the low sample efficiency of model-based learning, the resulting low performance of learned policies, and the poor accuracy of reliable action selection. It thereby improves sample efficiency, policy performance, and action-selection accuracy.

Pending Publication Date: 2022-03-04
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide a method for improving the accuracy of selecting reliable actions in agent control. The method improves the sample efficiency of model-based reinforcement learning for the agent, thereby solving the problem that low sample efficiency leads to low performance of the learned policy and, in turn, to poor accuracy in selecting reliable actions during control.



Examples


Embodiment 1

[0057] Embodiments of the present invention provide a method for improving the accuracy of selecting reliable actions in agent control, which is a sample-efficient model-based reinforcement learning method for agent control. The conservative model-based actor-critic (CMBAC) method is suitable for the following scenarios:

[0058] Given a target task for an agent in a real-world application, the problem can be modeled as a Markov decision problem, represented by the tuple (S, A, p, r, γ), where S is the state space and A is the action space, both of which are required to be continuous; p is the state transition probability density; r is a deterministic reward function; and γ∈(0,1) is a discount factor;

[0059] In the embodiment of the present invention, the policy, that is, the mapping from a state to a probability distribution over the action space, is denoted π, and π(·|s) ...


Abstract

The invention discloses a method for improving the accuracy of selecting reliable actions in agent control, comprising the following steps:
Step 1: an agent on which a behavior policy network, a probabilistic neural network, and an evaluation scoring network have been deployed in advance interacts with the real environment, according to a preset target task, to collect real-environment data; the probabilistic neural network learns from the collected data to simulate the real environment dynamics, yielding a plurality of dynamics models.
Step 2: the agent learns a plurality of estimates of the evaluation scoring function of the evaluation scoring network based on the plurality of dynamics models.
Step 3: the agent optimizes the policy of the behavior policy network using the average of the k smallest of the obtained estimates of the evaluation scoring function.
Step 4: the optimized policy is used for action selection in agent control.
The method improves the sample efficiency of model-based reinforcement learning for the agent, which improves the performance of the learned policy and thus the accuracy of selecting reliable actions in control.
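
The aggregation in step 3 (averaging the k smallest of several estimates) can be illustrated with a minimal Python sketch. This is not the patent's implementation: the helper name conservative_value, the plain list of scalar estimates, and the sample numbers are all assumptions made for illustration, and the real method would operate on network outputs rather than hand-written values.

```python
import heapq
from statistics import fmean
from typing import Sequence


def conservative_value(estimates: Sequence[float], k: int) -> float:
    """Return the mean of the k smallest value estimates.

    Hypothetical helper: `estimates` stands in for the scores that the
    evaluation scoring function assigns to one state-action pair under
    each learned dynamics model.
    """
    if not 1 <= k <= len(estimates):
        raise ValueError("k must satisfy 1 <= k <= len(estimates)")
    # nsmallest picks the k lowest estimates; fmean averages them.
    return fmean(heapq.nsmallest(k, estimates))


# Made-up estimates, one per dynamics model (illustrative values only).
q_estimates = [4.0, 1.0, 3.0, 2.0, 5.0]
print(conservative_value(q_estimates, k=2))  # mean of 1.0 and 2.0 -> 1.5
```

In this sketch, k equal to the number of estimates reduces to the plain average, while k = 1 takes the single most pessimistic estimate; intermediate values trade off between the two.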

Description

Technical Field

[0001] The invention relates to the field of agent control, in particular to a method for improving the accuracy of selecting reliable actions in agent control.

Background

[0002] Reinforcement learning has achieved great success in decision-making tasks, from playing video games to controlling robots in simulators. However, many of these results are achieved by model-free reinforcement learning methods, which often require a large number of samples; this greatly hinders their application to real-world tasks.

[0003] In contrast, model-based reinforcement learning methods build a model of the environment and generate simulated interactions with it, achieving higher sample efficiency than model-free methods. For agent learning, model-based reinforcement learning is therefore a more promising approach for real-world tasks such as robot control and industrial control. The sample efficiency of modeled rei...


Application Information

IPC(8): G06N3/00, G06N20/00
CPC: G06N3/008, G06N20/00
Inventor: 王杰, 李厚强, 王治海, 周祺
Owner: UNIV OF SCI & TECH OF CHINA