Reinforcement learning method and device

A reinforcement learning and learning-model technology, applied in the field of reinforcement learning, that addresses the low expected return of the policy function and the low training and learning efficiency of reinforcement learning, and improves training efficiency and accuracy.

Pending Publication Date: 2020-09-01
HUAWEI TECH CO LTD
3 Cites, 18 Cited by

AI Technical Summary

Problems solved by technology

The policy function of an agent is usually a deep neural network, but deep neural networks often suffer from low learning efficiency.
In the case of a large number of training ...



Examples


Embodiment Construction

[0050] The technical solution in this application will be described below with reference to the accompanying drawings.

[0051] To describe the embodiments of the present application, several terms used in them are introduced first.

[0052] Artificial Intelligence (AI): A branch of computer science that attempts to understand the nature of intelligence and produce a new class of intelligent machines that respond in ways similar to human intelligence. Research in the field of artificial intelligence includes robotics, language recognition, image recognition, natural language processing, decision-making and reasoning, human-computer interaction, recommendation and search, etc.

[0053] Machine learning is at the heart of artificial intelligence. Practitioners in the industry define machine learning as a process of gradually improving model performance P through a training process E in order to accomplish a task T. For ...


Abstract

The invention relates to artificial intelligence and provides a reinforcement learning method and device that can improve the training efficiency of reinforcement learning. The method comprises: acquiring a structure graph, wherein the structure graph comprises structure information of an environment or an agent obtained through learning; inputting a current state of the environment and the structure graph to a policy function of the agent, the policy function being used to generate an action in response to the current state and the structure graph, and the policy function of the agent being a graph neural network; outputting the action to the environment by using the agent; acquiring, by using the agent, a next state and reward data from the environment in response to the action; and performing reinforcement learning training on the agent according to the reward data.
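The steps in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the three-node structure graph, the single sum-aggregation step standing in for a graph neural network, the toy environment, the REINFORCE-style update, and all names (`STRUCTURE_GRAPH`, `aggregate`, `policy`, `toy_env_step`, `train`). The excerpt does not say which training algorithm or network architecture the applicant uses.

```python
import math
import random

# Hypothetical structure graph (not from the patent): a chain 0 - 1 - 2,
# stored as an adjacency list. It stands in for the learned structure
# information of the environment or agent.
STRUCTURE_GRAPH = {0: [1], 1: [0, 2], 2: [1]}
N_ACTIONS = 2


def aggregate(state, graph):
    """One message-passing step: each node adds its neighbors' values to its own."""
    return [state[n] + sum(state[j] for j in graph[n]) for n in sorted(graph)]


def policy(state, graph, weights):
    """Map (current state, structure graph) to action probabilities via softmax."""
    feats = aggregate(state, graph)
    logits = [sum(w * f for w, f in zip(row, feats)) for row in weights]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps], feats


def toy_env_step(state, action, rng):
    """Assumed toy environment: reward 1 if the action matches the sign of the state sum."""
    reward = 1.0 if action == int(sum(state) > 0) else 0.0
    next_state = [rng.uniform(-1, 1) for _ in state]
    return next_state, reward


def train(steps=2000, lr=0.1, seed=0):
    """Run the interaction loop and update the policy from the reward data."""
    rng = random.Random(seed)
    weights = [[0.0] * len(STRUCTURE_GRAPH) for _ in range(N_ACTIONS)]
    state = [rng.uniform(-1, 1) for _ in STRUCTURE_GRAPH]
    rewards = []
    for _ in range(steps):
        # Input the current state and the structure graph to the policy.
        probs, feats = policy(state, STRUCTURE_GRAPH, weights)
        # Output an action to the environment.
        action = rng.choices(range(N_ACTIONS), weights=probs)[0]
        # Acquire the next state and the reward from the environment.
        next_state, reward = toy_env_step(state, action, rng)
        # REINFORCE-style step: d log pi(a|s) / d logit_b = 1[a == b] - probs[b].
        for b in range(N_ACTIONS):
            grad = (1.0 if b == action else 0.0) - probs[b]
            for k, f in enumerate(feats):
                weights[b][k] += lr * reward * grad * f
        rewards.append(reward)
        state = next_state
    return weights, rewards
```

Calling `train()` returns the learned weights and the per-step rewards, so the average reward over the last few hundred steps can be compared with the 0.5 expected from a uniform random policy in this toy environment.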

Description

Technical field

[0001] This application relates to the field of artificial intelligence, and in particular to a method and device for reinforcement learning.

Background

[0002] Artificial intelligence (AI) is a new technical science that studies theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. Machine learning is at the heart of artificial intelligence. Methods of machine learning include reinforcement learning.

[0003] In reinforcement learning, an agent learns by trial and error: the reward obtained when its actions interact with the environment guides its behavior, and the goal is for the agent to obtain the maximum reward. A policy function is the rule of action used by an agent in reinforcement learning. The policy function is usually a neural network. The policy function of the agent usually adopts a deep neural network, but the dee...


Application Information

IPC(8): G06N3/04, G06N3/08, G06N20/00
CPC: G06N3/08, G06N20/00, G06N3/045, G06N3/092, G06N3/006
Inventor: 刘扶芮, 寸文璟, 陈志堂
Owner: HUAWEI TECH CO LTD