Order assignment method and system based on interactive reinforcement learning

Pending Publication Date: 2021-02-23

SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI

Problems solved by technology

[0004] Existing approaches train fully autonomously, through the interaction between a conventional reinforcement learning agent and its environment. This fully autonomous learning lacks human participation and has several drawbacks: learning takes a long time; the learning process cannot be controlled, so the agent's behavior may produce wrong results; and the learned results struggle to simulate complex real-world scenarios.

Embodiment Construction

[0032] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0033] As shown in Figure 1, the present invention provides an order assignment method based on interactive reinforcement learning, comprising the following steps:

[0034] Step S1: model the order dispatching task for imitation training;

[0035] Step S2: provide demonstration examples of human order dispatching as sequences of states and actions, and imitate the demonstrated order dispatching strategy through autonomous learning;

[0036] Step S3: on entering a catastrophic state or an error state in which human beings are not sat...
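Step S2 describes learning from demonstrations given as state and action sequences. The following is a minimal sketch of that idea only, assuming a small discrete state space and a simple "most frequently demonstrated action" rule. The function name `fit_imitation_policy`, the state encoding (zone, demand level), and the driver identifiers are hypothetical and do not come from the patent, whose actual imitation procedure is not shown in this excerpt.

```python
from collections import Counter, defaultdict


def fit_imitation_policy(demonstrations):
    """Return a dict mapping each state to its most frequently demonstrated action.

    `demonstrations` is an iterable of (state, action) pairs recorded from a
    human dispatcher. States and actions only need to be hashable.
    """
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    # For every state seen in the demonstrations, keep the most common action.
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical demonstration data: the state is (zone, demand level) and the
# action is the driver the human dispatcher chose.
demos = [
    (("zone_a", "peak"), "driver_1"),
    (("zone_a", "peak"), "driver_1"),
    (("zone_a", "peak"), "driver_2"),
    (("zone_b", "off_peak"), "driver_3"),
]
policy = fit_imitation_policy(demos)
```

A tabular count like this cannot generalize to states that were never demonstrated; a real system would more plausibly fit a parametric policy to the same (state, action) pairs.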

Abstract

The invention relates to the technical field of Internet information, and in particular to an order assignment method and system based on interactive reinforcement learning. The invention introduces human-computer interaction into the autonomous learning process, combining three modes of interaction: human demonstration, human intervention, and human evaluation. Learning from human demonstration uses real demonstration data, so that real order assignment scenarios are simulated more faithfully. Learning from human intervention controls the agent's behavior when wrong actions occur during autonomous learning, so that wrong results are avoided. Learning from human evaluation has a person assess the results of autonomous learning, which steers the learning process toward better order assignment strategies, accelerates learning, and yields an optimal order assignment strategy.
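One way to picture how human intervention and human evaluation could enter a learning update is the sketch below. It assumes a generic tabular Q-learning agent, which is a stand-in chosen for illustration; the abstract does not state the update rule the invention uses. The names `q_update` and `interactive_step`, the fixed `penalty` for a blocked action, and the idea of adding `human_score` to the environment reward are all assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Standard one-step tabular Q-learning update."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def interactive_step(q, state, proposed, next_state, env_reward, actions,
                     human_override=None, human_score=0.0,
                     penalty=-1.0, alpha=0.5):
    """Apply one learning step with optional human intervention and evaluation.

    Intervention (assumed form): if the human overrides the proposed action,
    the blocked action's value is pulled toward `penalty` and the human's
    action is executed instead.
    Evaluation (assumed form): `human_score` is added to the environment reward.
    """
    action = proposed
    if human_override is not None and human_override != proposed:
        q[(state, proposed)] += alpha * (penalty - q[(state, proposed)])
        action = human_override
    q_update(q, state, action, env_reward + human_score, next_state, actions, alpha=alpha)
    return action


# Hypothetical usage: the agent proposes action "a", the human replaces it
# with "b" and rates the outcome with a bonus of 0.5.
q = defaultdict(float)
actions = ["a", "b"]
executed = interactive_step(q, "s", "a", "s2", env_reward=1.0, actions=actions,
                            human_override="b", human_score=0.5)
```

Here `next_state` is supplied by the caller as the state reached after the executed action. Treating human evaluation as additive reward is only one option; it could equally be used to rank or filter candidate policies.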

Description

Technical field

[0001] The invention relates to the field of Internet information technology, and in particular to an order assignment method and system based on interactive reinforcement learning.

Background art

[0002] Online ride-hailing platforms have become a new and popular way to provide on-demand transportation services through mobile apps. Ride-hailing applications such as Didi, Uber, and Lyft are now popular around the world; they serve a large number of passengers every day and generate a large number of ride-hailing orders. For example, Didi, China's largest online ride-hailing service provider, needs to process about 11 million orders every day. The order assignment problem of an online ride-hailing service is essentially one of reasonably matching potential passengers with drivers. In many cases, the service is reusable, and the service provider will disappear for a period of time after being matched with the user...
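The view of order assignment as matching passengers with drivers can be illustrated with a small baseline. This is a minimal sketch assuming a greedy nearest-driver rule on a 2D grid with Manhattan distance; it is not the learned strategy the invention describes. The function `assign_orders`, the coordinates, and the identifiers are hypothetical. A matched driver is removed from the free pool, reflecting the point that a provider is unavailable for a while after being matched.

```python
def assign_orders(orders, drivers):
    """Greedily assign each order to the closest free driver.

    `orders` and `drivers` map identifiers to (x, y) grid positions.
    Orders are processed in insertion order; each driver serves at most one.
    Returns a dict mapping order id to driver id.
    """
    free = dict(drivers)  # copy so the caller's dict is not modified
    assignment = {}
    for order_id, (ox, oy) in orders.items():
        if not free:
            break  # no drivers left; remaining orders stay unassigned
        nearest = min(
            free,
            key=lambda d: abs(free[d][0] - ox) + abs(free[d][1] - oy),
        )
        assignment[order_id] = nearest
        del free[nearest]  # the matched driver leaves the pool
    return assignment


orders = {"o1": (0, 0), "o2": (5, 5)}
drivers = {"d1": (4, 4), "d2": (1, 0), "d3": (9, 9)}
result = assign_orders(orders, drivers)
```

A greedy rule decides each order in isolation, so it ignores how the current match affects future supply and demand, which is the kind of long-horizon effect reinforcement learning approaches aim to capture.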

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06Q30/06, G06Q50/30, G06N20/00

CPC: G06Q30/0635, G06N20/00, G06Q50/40

Inventor: 金铭, 王洋, 须成忠

Owner: SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI
