Training method and device for unmanned vehicle control model

A technology for training unmanned vehicle control models. It addresses the sparse effective rewards, long time consumption, and high cost of model training, and thereby solves the sparse reward problem, reduces training cost, and saves training time.

Active Publication Date: 2020-04-21

BEIJING SANKUAI ONLINE TECH CO LTD

6 Cites, 0 Cited by

AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

[0004] However, in the existing process of training a reinforcement learning model, the model output is usually judged to be correct control, and positive feedback is given, when the unmanned vehicle reaches the destination. When a dangerous situation occurs during driving, the model output is judged to be wrong control and negative feedback is given. As a result, usually only the reward for reaching the destination and the feedback for dangerous situations are effective rewards, that is, rewards that make the model parameters converge, while most of the rewards obtained during driving hardly help the model converge. The effective rewards obtained during training are therefore relatively sparse, which makes model training costly and time-consuming.
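The sparsity described in [0004] can be illustrated with a small sketch. The function names, the reward values of +1 and -1, and the per-step danger probability below are illustrative assumptions, not taken from the patent; the sketch only shows how few steps of an episode carry a non-zero reward under this kind of feedback.

```python
import random


def sparse_reward(reached_goal, in_danger):
    """Hypothetical feedback rule: non-zero only at the goal or in danger."""
    if in_danger:
        return -1.0  # negative feedback for a dangerous situation
    if reached_goal:
        return 1.0   # positive feedback for reaching the destination
    return 0.0       # every other driving step gives no effective signal


def rollout(num_steps, danger_prob, seed=0):
    """Simulate one episode and return the reward at every step.

    danger_prob is an assumed per-step chance of a dangerous situation;
    the episode ends early when one occurs.
    """
    rng = random.Random(seed)
    rewards = []
    for t in range(num_steps):
        in_danger = rng.random() < danger_prob
        reached_goal = (t == num_steps - 1) and not in_danger
        rewards.append(sparse_reward(reached_goal, in_danger))
        if in_danger:
            break
    return rewards


def effective_fraction(rewards):
    """Fraction of steps whose reward is non-zero."""
    return sum(1 for r in rewards if r != 0.0) / len(rewards)


if __name__ == "__main__":
    episode = rollout(num_steps=200, danger_prob=0.0)
    print(effective_fraction(episode))
```

With no dangerous situations, only the final step of a 200-step episode yields a reward, so the vast majority of collected samples contribute little to convergence.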

Method used

Embodiment Construction

[0059] In order to make the purpose, technical solutions, and advantages of this specification clearer, the technical solutions of the present application are described clearly and completely below in conjunction with specific embodiments of this specification and the corresponding drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this specification, without creative effort, fall within the protection scope of the present application.

[0060] The technical solutions provided by various embodiments of the present application will be described in detail below in conjunction with the accompanying drawings.

[0061] Figure 1 is a schematic diagram of the training process of the unmanned vehicle control model provided by an embodiment of this specification, including:

[0062] S100: Obtain the curr...

Abstract

The invention discloses a training method and device for an unmanned vehicle control model. For each moment during model training, the method determines a feature matrix composed of the historical environment features used for calculating the reward at the previous moment and the current environment feature determined from the current environment information. Then, based on the importance of the current environment feature and of each historical environment feature to the feature matrix, the features used for calculating the reward at the current moment are selected from the feature matrix, and the reward is determined according to the current environment feature and the selected features to train the unmanned vehicle control model. After training is finished, the unmanned vehicle is controlled according to the trained model. Because the features used for calculating the reward are determined from the overall importance of features that include the historical environment features, more effective rewards can be determined from changes in the environment information during training, which solves the problem of sparse rewards and saves cost and time.
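The per-moment procedure in the abstract (build a feature matrix, select features by importance, compute a reward) can be sketched as follows. The abstract does not say how importance or the reward is computed, so both choices here are assumptions made only for illustration: importance is taken as a row's distance from the column-wise mean of the matrix, and the reward as the mean distance between the current feature and the selected features. All function names and the parameter `k` are hypothetical.

```python
import math


def importance(matrix):
    """Assumed importance score: distance of each row from the column mean."""
    n = len(matrix)
    dim = len(matrix[0])
    mean = [sum(row[j] for row in matrix) / n for j in range(dim)]
    return [math.dist(row, mean) for row in matrix]


def select_features(history, current, k):
    """Stack history and the current feature, then keep the k most important rows."""
    matrix = history + [current]
    scores = importance(matrix)
    ranked = sorted(range(len(matrix)), key=lambda i: scores[i], reverse=True)
    return [matrix[i] for i in ranked[:k]]


def reward(current, selected):
    """Assumed reward: mean distance from the current feature to the selected ones."""
    return sum(math.dist(current, f) for f in selected) / len(selected)


def step(history, current, k):
    """One training moment: returns the reward and the features kept for the next moment."""
    selected = select_features(history, current, k)
    return reward(current, selected), selected


if __name__ == "__main__":
    history = [[0.0, 0.0], [1.0, 0.0]]  # features kept from the previous moment
    current = [0.0, 3.0]                # feature from current environment information
    r, history = step(history, current, k=2)
    print(r, history)
```

In this sketch the selected features become the historical features for the next moment, so the reward at every step depends on how the environment has changed rather than only on reaching the destination or encountering danger.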

Description

Technical field

[0001] The present application relates to the field of unmanned driving technology, in particular to a training method and device for an unmanned vehicle control model.

Background technique

[0002] At present, the main problem to be solved by control methods for unmanned vehicles in the field of unmanned driving technology is how to make unmanned vehicles avoid obstacles. Usually, the obstacle avoidance process of an unmanned vehicle is as follows: the environmental information collected by the unmanned vehicle in real time, its own driving status, and so on are input into a pre-trained model, and the unmanned vehicle is controlled to avoid obstacles according to the output of the model.

[0003] In the prior art, reinforcement learning is usually used for model training, and the model is obtained through continuous "trial and error" training. Specifically, when training the reinforcement learning model, the unmanned vehicle determines the reward according to the impac...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/62, G06N20/00, G05D1/02

CPC: G06N20/00, G05D1/0221, G06F18/211, G06F18/22

Inventor: 任冬淳, 夏华夏, 樊明宇, 丁曙光, 钱德恒

Owner: BEIJING SANKUAI ONLINE TECH CO LTD
