Training method and device for unmanned vehicle control model

A technique for training unmanned vehicle control models, applied in the field of unmanned driving. It addresses the sparse effective rewards, long training time, and high cost of model training, with the effect of reducing training cost, mitigating reward sparsity, and saving training time.

Active Publication Date: 2020-04-21
BEIJING SANKUAI ONLINE TECH CO LTD
Cites: 6; Cited by: 0

AI Technical Summary

Problems solved by technology

[0004] However, when a reinforcement learning model is trained in the existing way, the model output is usually judged to be correct control, and positive feedback is given, only when the unmanned vehicle reaches its destination. When a dangerous situation occurs during driving, the output is judged to be wrong control and negative feedback is given. As a result, usually only the reward for reaching the destination and the feedback for dangerous situations are effective rewards, that is, rewards that drive the model parameters toward convergence. Most of the rewards produced during driving do little to help the model converge. Effective rewards are therefore sparse during training, which makes model training costly and time-consuming.
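The sparsity described in [0004] can be illustrated with a minimal Python sketch. The reward values (+1, -1, 0), the function names, and the 200-step episode are assumptions chosen for illustration only; the source does not specify them.

```python
def sparse_reward(reached_destination: bool, dangerous: bool) -> float:
    """Destination/danger-only feedback as described in [0004].

    The magnitudes +1.0 and -1.0 are illustrative assumptions.
    """
    if dangerous:
        return -1.0  # negative feedback for a dangerous situation
    if reached_destination:
        return 1.0   # positive feedback for reaching the destination
    return 0.0       # every other driving step carries no signal


def effective_fraction(rewards: list[float]) -> float:
    """Share of steps whose reward is non-zero, i.e. 'effective'."""
    if not rewards:
        return 0.0
    return sum(1 for r in rewards if r != 0.0) / len(rewards)


# Hypothetical episode: 199 uneventful steps, then arrival at the destination.
episode = [sparse_reward(False, False) for _ in range(199)]
episode.append(sparse_reward(True, False))
fraction = effective_fraction(episode)
```

In this hypothetical episode only one step in 200 yields a non-zero reward, which is the kind of signal scarcity that slows convergence.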

Examples

Detailed Description of the Embodiments

[0059] To make the purpose, technical solution, and advantages of this specification clearer, the technical solution of this application is described clearly and completely below with reference to specific embodiments of this specification and the corresponding drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments in this specification, without creative effort, fall within the protection scope of the present application.

[0060] The technical solutions provided by various embodiments of the present application will be described in detail below in conjunction with the accompanying drawings.

[0061] Figure 1 is a schematic diagram of the training process of the unmanned vehicle control model provided by an embodiment of this specification, including:

[0062] S100: Obtain the curr...

Abstract

The invention discloses a training method and device for an unmanned vehicle control model. At each moment during model training, the method determines a feature matrix composed of the historical environment features that were used to calculate the reward at the previous moment and the current environment feature determined from the current environment information. Then, based on the importance of the current environment feature and of each historical environment feature to the feature matrix, it selects from the feature matrix the features to be used for calculating the reward at the current moment, and determines the reward from the current environment feature and the selected features in order to train the unmanned vehicle control model. After training is finished, the unmanned vehicle is controlled according to the trained model. Because the features used to calculate the reward are selected according to their importance to the overall feature matrix, which includes the historical environment features, more effective rewards can be determined from changes in the environment information during training. This addresses the problem of sparse rewards and saves cost and time.
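The per-moment loop in the abstract (build a feature matrix, score importance, select features, compute a reward) can be sketched as follows. This is a rough illustration under stated assumptions, not the patented method: the abstract does not define the importance measure or the reward formula, so the mean-distance importance score, the mean-distance reward, the parameter `k`, and all function names here are hypothetical.

```python
import math

Feature = list[float]


def distance(a: Feature, b: Feature) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def importance(matrix: list[Feature]) -> list[float]:
    """ASSUMED importance score: mean distance of each row to all other rows."""
    n = len(matrix)
    if n < 2:
        return [0.0] * n
    return [
        sum(distance(row, other) for j, other in enumerate(matrix) if j != i) / (n - 1)
        for i, row in enumerate(matrix)
    ]


def select_features(matrix: list[Feature], k: int) -> list[Feature]:
    """Keep the k most important rows, preserving their original time order."""
    if k < 1:
        raise ValueError("k must be at least 1")
    scores = importance(matrix)
    ranked = sorted(range(len(matrix)), key=lambda i: scores[i], reverse=True)
    return [matrix[i] for i in sorted(ranked[:k])]


def step_reward(history: list[Feature], current: Feature, k: int):
    """One training moment: returns (reward, features kept for the next moment)."""
    matrix = history + [current]  # historical features plus the current feature
    selected = select_features(matrix, k)
    # ASSUMED reward: mean distance from the current feature to the selected ones.
    reward = sum(distance(current, f) for f in selected) / len(selected)
    return reward, selected


# Hypothetical two-dimensional environment features.
history = [[0.0, 0.0], [0.0, 1.0]]
current = [3.0, 4.0]
reward, next_history = step_reward(history, current, k=2)
```

In this sketch the selected features become the historical features for the next moment, so the reward at every step depends on how the environment has changed rather than only on reaching the destination or encountering danger.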

Description

Technical field

[0001] The present application relates to the field of unmanned driving technology, and in particular to a training method and device for an unmanned vehicle control model.

Background

[0002] At present, the main problem that control methods for unmanned vehicles need to solve is how the unmanned vehicle avoids obstacles. Usually, the obstacle avoidance process is as follows: the environment information collected by the unmanned vehicle in real time, its own driving state, and so on are input into a pre-trained model, and the unmanned vehicle is controlled to avoid obstacles according to the output of the model.

[0003] In the prior art, reinforcement learning is usually used for model training, and the model is obtained through continuous "trial and error" training. Specifically, when training the reinforcement learning model, the unmanned vehicle determines the reward according to the impac...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N20/00, G05D1/02
CPC: G06N20/00, G05D1/0221, G06F18/211, G06F18/22
Inventors: 任冬淳, 夏华夏, 樊明宇, 丁曙光, 钱德恒
Owner: BEIJING SANKUAI ONLINE TECH CO LTD