Training and control method and device, computing equipment and medium

A training method that uses position information, applied in the field of reinforcement learning, addresses the lack of interaction among smart objects and the resulting loss of rationality in their behavior.

Pending Publication Date: 2021-12-31

SENSETIME GRP LTD +1



Problems solved by technology

Because the same reward information is used to train every smart object, each trained smart object attends only to its own behavior in each scenario and ignores the behavior of the surrounding smart objects. This lack of interaction among smart objects also makes their behavior less rational.



Embodiment Construction

[0088] To make the purposes, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, but not all, embodiments of the present disclosure. The components of the disclosed embodiments generally described and illustrated herein may be arranged and designed in a variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure is not intended to limit the scope of the disclosure as claimed, but is merely representative of selected embodiments of the disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by those skilled in the art without creative work fall within the protection ...


Abstract

The invention provides a training and control method and device, computing equipment and a medium. The training method comprises: obtaining initial reward information for each intelligent object in a target system for the current round, where the initial reward information represents the degree to which the corresponding intelligent object completed its task in the current round; for each of at least one intelligent object in the target system, determining target reward information for that intelligent object in the current round based on the position information and initial reward information of each intelligent object and on the local coordination coefficient of the current round; adjusting the network parameters of the intelligent object's operation network based on its target reward information for the current round; determining the initial reward information of the next round based on the adjusted network parameters; and repeating these steps until the training of each intelligent object's operation network reaches a preset training cut-off condition, which yields the trained operation networks.
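The round structure described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented formula: the text available here does not say how position information, initial rewards and the local coordination coefficient are combined, so the sketch assumes a linear blend between an object's own reward and the mean reward of objects within a fixed radius. The names `target_rewards`, `train`, `rollout`, `update`, `coeff_schedule` and `radius`, and the use of a maximum round count as the cut-off condition, are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Position = Tuple[float, float]


def target_rewards(
    positions: Sequence[Position],
    initial_rewards: Sequence[float],
    coordination_coeff: float,
    radius: float = 10.0,
) -> List[float]:
    """Blend each object's own reward with the mean reward of nearby objects.

    ASSUMPTION: linear blend with a coefficient in [0, 1] and a fixed
    neighborhood radius; the source does not specify the actual formula.
    """
    targets = []
    for i, (pos_i, r_i) in enumerate(zip(positions, initial_rewards)):
        # Rewards of the other objects located within `radius` of object i.
        nearby = [
            r_j
            for j, (pos_j, r_j) in enumerate(zip(positions, initial_rewards))
            if j != i and math.dist(pos_i, pos_j) <= radius
        ]
        # ASSUMPTION: an isolated object falls back to its own reward.
        neighborhood = sum(nearby) / len(nearby) if nearby else r_i
        targets.append(
            (1.0 - coordination_coeff) * r_i + coordination_coeff * neighborhood
        )
    return targets


def train(
    rollout: Callable[[], Tuple[Sequence[Position], Sequence[float]]],
    update: Callable[[int, float], None],
    coeff_schedule: Callable[[int], float],
    max_rounds: int,
    radius: float = 10.0,
) -> None:
    """Repeat: collect initial rewards, compute target rewards, update networks.

    `rollout` and `update` are hypothetical callbacks standing in for running
    the current operation networks and adjusting their parameters.
    """
    for round_index in range(max_rounds):  # ASSUMPTION: cut-off = round budget
        positions, rewards = rollout()
        coeff = coeff_schedule(round_index)  # coefficient of the current round
        for i, reward in enumerate(
            target_rewards(positions, rewards, coeff, radius)
        ):
            update(i, reward)
```

With a coefficient of 0 each object is trained on its own reward only, which corresponds to the independent training criticized in the background section; larger values weight the surrounding objects' task completion more heavily.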

Description

Technical field

[0001] The present disclosure relates to the technical field of reinforcement learning, and in particular, to a training and control method, apparatus, computing device and medium.

Background art

[0002] A self-propelled particle system is a common model for multi-intelligent-object systems, for example for modeling a traffic flow system at a traffic intersection or a fish school system. Training the smart objects in a self-propelled particle system makes it possible to simulate their behavior effectively.

[0003] In the prior art, most smart objects are trained by means of reinforcement learning. Specifically, each smart object is iteratively trained independently using fixed reward information. Since the reward information used to train each smart object is the same, each smart object obtained by training only cares about the behavior of the individual in various scenarios, ignoring ...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G05B13/04

CPC: G05B13/042

Inventor: 彭正皓, 黎权毅, 刘春晓, 周博磊

Owner: SENSETIME GRP LTD
