
Target task processing method, device and equipment based on reinforcement learning migration

The technology applies reinforcement learning transfer to target tasks in machine learning, instrumentation, computing, and related fields. It addresses the slow convergence and insufficient learning speed of existing approaches, accelerating convergence and improving task learning speed.

Pending Publication Date: 2022-05-06
CHINA TELECOM CLOUD TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] Therefore, the technical problem to be solved by the present invention is to overcome the defects in the prior art, namely slow convergence and a learning speed that cannot meet growing demand, by providing a target task processing method, device and equipment based on reinforcement learning migration.

Examples

Embodiment Construction

[0028] The technical solutions of the present invention will be clearly and completely described below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

[0029] In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as there is no conflict with each other.

[0030] The embodiments of the present invention improve the learning speed of the target task by transferring the reinforcement learning process of learned tasks to the target task. In each of the following embodiments, the semi-Markov decision process is used to introduce the concept of an option. I...
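The option concept mentioned in [0030] can be illustrated with a minimal sketch. It follows the standard options framework (an initiation set, an intra-option policy, and a termination condition) and is not taken from the patent text, which is truncated here. The names `Option`, `run_option`, `corridor_step` and the toy corridor environment are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

State = int
Action = int


@dataclass(frozen=True)
class Option:
    """A temporally extended action in the standard options framework."""
    initiation_set: FrozenSet[State]      # states in which the option may start
    policy: Callable[[State], Action]     # intra-option policy
    terminates: Callable[[State], bool]   # True once the option should stop


def run_option(option: Option, state: State,
               step: Callable[[State, Action], Tuple[State, float]],
               max_steps: int = 100) -> Tuple[State, float, int]:
    """Run an option until it terminates; return (state, total reward, duration)."""
    if state not in option.initiation_set:
        raise ValueError("option cannot be initiated in this state")
    total, duration = 0.0, 0
    while duration < max_steps:
        state, reward = step(state, option.policy(state))
        total += reward
        duration += 1
        if option.terminates(state):
            break
    return state, total, duration


def corridor_step(state: State, action: Action) -> Tuple[State, float]:
    """Hypothetical toy corridor with states 0..5 and reward -1 per move."""
    return min(max(state + action, 0), 5), -1.0


# Hypothetical option: walk right until the sub-goal state 3 is reached.
go_to_subgoal = Option(frozenset(range(0, 3)), lambda s: 1, lambda s: s == 3)
final_state, total_reward, duration = run_option(go_to_subgoal, 0, corridor_step)
```

Because an option can last several time steps, a decision process over options is a semi-Markov decision process rather than an ordinary MDP, which is why the duration is returned alongside the reward.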

Abstract

The invention provides a target task processing method, device and equipment based on reinforcement learning migration. The method comprises the following steps: obtaining a sub-target set from a preset source task database, where the source tasks are learned tasks and the sub-target set comprises the sub-target tasks in the source tasks that correspond to the target task; extracting trajectory features of the sub-target tasks based on the sub-target set and constructing a candidate item set; and screening reuse candidate items from the candidate item set and performing reinforcement learning on the target task based on the reuse candidate items to obtain a reinforcement learning result for the target task. Because the candidate item set of the learned tasks is transferred to the target task, the convergence of the target task during reinforcement learning is accelerated and the task learning speed is greatly improved.
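The steps in the abstract can be sketched end to end on a toy problem. This is an illustration under stated assumptions, not the patented method: the visible text does not say how trajectory features are extracted or how candidates are screened, so the sketch uses the most frequent action per state as the "feature" and a simple state-space check as the "screening". `SOURCE_DB`, the corridor environment, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Hypothetical source-task database: sub-goal state -> trajectories of
# (state, action) pairs that reached it in already-learned source tasks.
SOURCE_DB = {
    4: [[(0, 1), (1, 1), (2, 1), (3, 1)], [(2, 1), (3, 1)]],
    20: [[(18, 1), (19, 1)]],  # sub-goal that does not exist in the target task
}


def build_candidates(db):
    """Build one candidate option per sub-goal (assumed feature: majority action)."""
    candidates = []
    for subgoal, trajectories in db.items():
        votes = defaultdict(Counter)
        for trajectory in trajectories:
            for state, action in trajectory:
                votes[state][action] += 1
        policy = {s: c.most_common(1)[0][0] for s, c in votes.items()}
        candidates.append({"subgoal": subgoal, "policy": policy})
    return candidates


def screen(candidates, n_states):
    """Assumed screening rule: keep candidates that fit the target state space."""
    return [c for c in candidates
            if c["subgoal"] < n_states and all(s < n_states for s in c["policy"])]


def step(state, action, n_states):
    """Corridor dynamics: reward -1 per move, episode ends at the last state."""
    nxt = min(max(state + action, 0), n_states - 1)
    return nxt, -1.0, nxt == n_states - 1


def q_learning_with_options(options, n_states=8, episodes=200, alpha=0.5,
                            gamma=0.95, epsilon=0.1, seed=0):
    """SMDP Q-learning over primitive actions (-1, +1) plus reused options."""
    rng = random.Random(seed)
    q = defaultdict(float)

    def choices(state):
        usable = [("opt", i) for i, o in enumerate(options) if state in o["policy"]]
        return [("act", -1), ("act", 1)] + usable

    for _ in range(episodes):
        state, done = 0, False
        while not done:
            avail = choices(state)
            if rng.random() < epsilon:
                choice = rng.choice(avail)
            else:
                choice = max(avail, key=lambda c: q[(state, c)])
            start, ret, discount = state, 0.0, 1.0
            while True:  # one primitive move, or an option run to its sub-goal
                kind, value = choice
                action = value if kind == "act" else options[value]["policy"][state]
                state, reward, done = step(state, action, n_states)
                ret += discount * reward
                discount *= gamma
                if kind == "act" or done or state not in options[value]["policy"]:
                    break
            best_next = 0.0 if done else max(q[(state, c)] for c in choices(state))
            q[(start, choice)] += alpha * (ret + discount * best_next - q[(start, choice)])
    return q


candidates = build_candidates(SOURCE_DB)          # candidate item set
reuse_options = screen(candidates, n_states=8)    # reuse candidate items
q_table = q_learning_with_options(reuse_options)  # learning on the target task
```

The reused option lets the agent jump from the start of the corridor to the sub-goal in a single decision, which is the intuition behind the claimed faster convergence; the sketch does not measure that speed-up.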

Description

Technical field

[0001] The present invention relates to the technical field of machine learning, in particular to a target task processing method, device and equipment based on reinforcement learning transfer.

Background

[0002] Reinforcement learning (RL), also known as evaluation learning or enhanced learning, is one of the paradigms and methodologies of machine learning. It describes and solves the problem of an agent learning, through interaction with the environment, a strategy that maximizes rewards or achieves specific goals, and it is an important class of machine learning techniques for sequential decision-making problems. After decades of development, it has been successfully applied to many fields such as automatic control, robotics, recommendation and retrieval. A reinforcement learning task can be expressed as a Markov decision process (MDP). However, in the process of reinforcement learning in the ...
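For reference, the MDP formulation mentioned in [0002] is conventionally written as follows. The notation is the standard textbook one and is an assumption here, since the visible text does not define symbols: S is the state set, A the action set, P the transition probability, R the reward function, and γ the discount factor.

```latex
\mathcal{M} = (S, A, P, R, \gamma), \qquad
P(s' \mid s, a) = \Pr\left(s_{t+1} = s' \mid s_t = s,\ a_t = a\right), \qquad
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, R(s_t, a_t) \right]
```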

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N20/00
CPC: G06N20/00
Inventor: 范顺国, 李兴达, 李文成
Owner: CHINA TELECOM CLOUD TECH CO LTD