Modular reinforcement learning model processing method, system and equipment and storage medium

The technology combines reinforcement learning with componentization. It is applied in the field of artificial intelligence and addresses problems such as complicated operation when a training framework is reused.

Pending Publication Date: 2021-05-28
超参数科技(深圳)有限公司

AI Technical Summary

Problems solved by technology

The various parts of the RLLib framework are interdependent. Reusing an RLLib-based reinforcement learning training process across different businesses therefore requires large-scale modification of the training framework, which makes the operation complicated.

Embodiment Construction

[0077] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0078] The componentized reinforcement learning model processing method provided by this application can be applied in the application environment shown in Figure 1. The server 102 uses the running component 1020 of the reinforcement learning system to control the virtual object as it interacts with the interactive environment, and obtains the interaction data generated during that interaction. The learning component 1022 then iteratively trains the reinforcement learning model on the interaction data. Finally, the evaluation component 102...
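
As a rough illustration of the running component's role described in [0078], the following Python sketch shows a component that drives a virtual object through an environment and records the interaction data. Everything here is an assumption made for illustration: the names `ToyEnvironment`, `RunningComponent`, `Transition` and `always_right` do not come from the application, and the environment is a toy one-dimensional walk rather than anything the application describes.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    """One record of interaction data (hypothetical layout)."""
    state: int
    action: int
    reward: float
    next_state: int
    done: bool


class ToyEnvironment:
    """Hypothetical interactive environment: start at 0, reach `goal` to finish."""

    def __init__(self, goal=3):
        self.goal = goal
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right; any other action moves left (clamped at 0).
        self.state = max(0, self.state + (1 if action == 1 else -1))
        done = self.state >= self.goal
        reward = 1.0 if done else 0.0
        return self.state, reward, done


class RunningComponent:
    """Controls the virtual object with a policy and collects interaction data."""

    def __init__(self, env, policy):
        self.env = env
        self.policy = policy  # callable: state -> action

    def collect(self, max_steps=50):
        data = []
        state = self.env.reset()
        for _ in range(max_steps):
            action = self.policy(state)
            next_state, reward, done = self.env.step(action)
            data.append(Transition(state, action, reward, next_state, done))
            state = next_state
            if done:
                break
        return data


def always_right(state):
    """Placeholder policy standing in for a reinforcement learning model."""
    return 1


runner = RunningComponent(ToyEnvironment(goal=3), always_right)
interaction_data = runner.collect()
print(len(interaction_data), interaction_data[-1])
```

Because the running component only needs a callable policy and an environment with `reset` and `step`, the model behind the policy can be swapped without touching the data collection code, which is the kind of decoupling the paragraph is aiming at.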

Abstract

The invention relates to a modular reinforcement learning model processing method and device, computer equipment, and a storage medium. The method comprises: obtaining interaction data generated while a virtual object interacts with an interactive environment, where the virtual object is controlled by a running component of a reinforcement learning system deployed in the cloud, and the reinforcement learning system further comprises a learning component and an evaluation component; iteratively training the reinforcement learning model on the interaction data through the learning component; during iterative training, evaluating the model obtained so far through the evaluation component, and judging from the evaluation result whether that model meets the interaction condition; and, if it does not, updating the model associated with the running component according to the iteratively trained model, so that the running component controls the virtual object based on the updated reinforcement learning model. The method reduces the complexity of reusing a reinforcement learning model training framework across different services.
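
The control flow in the abstract (collect, train, evaluate, and update the running component's model when the condition is not met) can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the two-action environment, the sample-average value update, the deterministic exploration schedule, the reward threshold used as the "interaction condition", and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field

REWARDS = [0.0, 1.0]  # hypothetical environment: action 1 is the better action


def environment_step(action):
    return REWARDS[action]


@dataclass
class Model:
    """Toy 'reinforcement learning model': one value estimate per action."""
    values: list = field(default_factory=lambda: [0.0, 0.0])
    counts: list = field(default_factory=lambda: [0, 0])

    def greedy_action(self):
        # Ties resolve to the lowest action index.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def copy(self):
        return Model(list(self.values), list(self.counts))


class RunningComponent:
    def __init__(self, model):
        self.model = model
        self.t = 0        # global step counter, persists across collect() calls
        self.updates = 0  # number of model updates received

    def collect(self, steps):
        data = []
        for _ in range(steps):
            if self.t % 2 == 0:
                # Assumed exploration rule: even steps cycle through the actions.
                action = (self.t // 2) % len(self.model.values)
            else:
                action = self.model.greedy_action()
            data.append((action, environment_step(action)))
            self.t += 1
        return data

    def update_model(self, model):
        self.model = model.copy()
        self.updates += 1


class LearningComponent:
    def __init__(self):
        self.model = Model()

    def train(self, data):
        # Incremental sample-average update of each action's value estimate.
        for action, reward in data:
            self.model.counts[action] += 1
            n = self.model.counts[action]
            self.model.values[action] += (reward - self.model.values[action]) / n


class EvaluationComponent:
    def __init__(self, threshold, episodes=10):
        self.threshold = threshold
        self.episodes = episodes

    def evaluate(self, model):
        total = sum(environment_step(model.greedy_action())
                    for _ in range(self.episodes))
        return total / self.episodes

    def condition_met(self, score):
        return score >= self.threshold


def run(max_iterations=5, steps=2):
    learner = LearningComponent()
    runner = RunningComponent(learner.model.copy())
    evaluator = EvaluationComponent(threshold=0.9)
    history = []
    for _ in range(max_iterations):
        learner.train(runner.collect(steps))
        score = evaluator.evaluate(learner.model)
        history.append(score)
        if evaluator.condition_met(score):
            break
        # Condition not met: push the newly trained model to the running component.
        runner.update_model(learner.model)
    return learner.model, history, runner.updates


final_model, history, updates = run()
print(history, final_model.greedy_action(), updates)
```

In this sketch the three components share nothing except the interaction data and the model object, so any one of them could be replaced for a different service without modifying the other two.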

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, in particular to a model training method, device, computer equipment and storage medium.

Background

[0002] With the development of artificial intelligence technology, reinforcement learning is widely used in games, e-commerce recommendation, autonomous driving, intelligent scheduling and other fields. Training a reinforcement learning model is a complex process, so how to train such models is an important question for advancing reinforcement learning technology. In the traditional technology, reinforcement learning models are trained with RLLib (Reinforcement Learning Library), an open-source distributed reinforcement learning framework. However, the various parts of the RLLib framework are interrelated. When the reinforcement learning model training process based on the RLLib framework is reused in vario...

Application Information

IPC(8): G06N20/00
CPC: G06N20/00
Inventor: 朱恒满, 周正, 张正生, 刘永升
Owner: 超参数科技(深圳)有限公司