A reinforcement learning method combined with knowledge transfer, and a learning method applied to autonomous skills of unmanned vehicles

A technique that combines reinforcement learning with knowledge transfer, applied in the field of artificial intelligence. It addresses problems such as growing complexity, increasing scale, and the curse of dimensionality, and it overcomes slow learning by improving learning speed and efficiency.

Active Publication Date: 2019-05-10
UNIV OF SHANGHAI FOR SCI & TECH

AI Technical Summary

Problems solved by technology

[0008] In view of the above problems and needs in the prior art, the purpose of the present invention is to provide a reinforcement learning method combined with knowledge transfer and a learning method applied to autonomous skills of unmanned vehicles. The methods can effectively overcome the slow speed of robot autonomous skill learning and avoid the "curse of dimensionality" that arises as high-dimensional, complex problems grow in scale and complexity.

Examples

Embodiment 1

[0064] As shown in Figures 1 to 5(b), the reinforcement learning method combined with knowledge transfer provided in this embodiment includes the following steps:

[0065] S1. Design a BP (backpropagation) neural network mapping between autonomous tasks, and initialize the target task by mapping the learning experience from the source task, so that the target task starts from a good prior;

[0066] S2. Store the source task learning experience as a case, and build a linear perceptron to learn the action mapping relationship between the source domain and the target domain;

[0067] S3. Use a case-based reasoning mechanism to store the online learning experience of the target task and so expand the case base, and apply a progressive forgetting criterion that clears information left unused for a long time from the case base, reducing matching and retrieval time;

[0068] S4. Carry out similarity calculation and case retrieva...
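The case base of steps S3 and S4 can be illustrated with a minimal sketch. It is not the patented procedure: the source does not give the forgetting criterion or the similarity measure, so the idle-time cutoff (`max_idle`), the retrieval threshold (`min_similarity`), the distance-based similarity, and all class and field names below are hypothetical choices made only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Case:
    state: tuple        # problem description: the state in which the case was stored
    action: int         # solution: the action taken in that state
    last_used: int = 0  # time step at which the case was stored or last retrieved


@dataclass
class CaseBase:
    max_idle: int = 100          # assumed forgetting horizon, in time steps
    min_similarity: float = 0.9  # assumed retrieval threshold
    cases: list = field(default_factory=list)

    def store(self, state, action, now):
        """Add a new case built from online learning experience."""
        self.cases.append(Case(tuple(state), action, last_used=now))

    def forget(self, now):
        """Drop cases that have gone unused for more than max_idle steps."""
        self.cases = [c for c in self.cases if now - c.last_used <= self.max_idle]

    @staticmethod
    def similarity(a, b):
        """Map Euclidean distance into (0, 1]; identical states score 1.0."""
        return 1.0 / (1.0 + math.dist(a, b))

    def retrieve(self, state, now):
        """Return the most similar case at or above the threshold, else None."""
        best, best_sim = None, self.min_similarity
        for c in self.cases:
            sim = self.similarity(c.state, state)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is not None:
            best.last_used = now  # retrieving a case protects it from forgetting
        return best


cb = CaseBase(max_idle=10, min_similarity=0.9)
cb.store((0.0, 0.0), action=1, now=0)
cb.store((1.0, 1.0), action=-1, now=0)
hit = cb.retrieve((0.05, 0.0), now=5)  # matches the first case and refreshes it
cb.forget(now=12)                      # the second case has been idle too long
```

Refreshing `last_used` on retrieval is what lets frequently matched cases survive while stale ones are cleared, which keeps the linear scan in `retrieve` short.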

Embodiment 2

[0124] As shown in Figures 6(a) to 9(b), this embodiment also provides a learning method applied to the autonomous skills of unmanned vehicles. The method of Embodiment 1 is used to learn the autonomous skills of an unmanned vehicle on an unmanned vehicle simulator, and when the source task and the target task are different, distributed retrieval processing is performed on the cases in the case library.

[0125] Consider driving an underpowered car up a steep mountain road, as shown in Figure 6(a). Because the force of gravity is greater than the force the car's engine can generate, it is difficult to charge up the slope even at maximum throttle. The solution is to drive up the opposite slope first, and then rely on inertia and full throttle to rush up the target slope. The reward for each step of this task is -1. There are 3 optional actions: large throttle (+1), reverse large throttle (-1), and empty throt...
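The task described above matches the classic mountain-car benchmark, and a small simulation makes the "back up first" strategy concrete. The sketch below is an illustration under assumptions, not the simulator used in this embodiment: the dynamics constants, position and speed limits, goal position, and start state are taken from the standard textbook formulation of the benchmark, and the third action is assumed to have value 0. Only the per-step reward of -1 and the +1 and -1 actions come from the text.

```python
import math

# Assumed constants from the standard mountain-car benchmark formulation;
# none of these values appear in the patent text.
MIN_POS, MAX_POS = -1.2, 0.5
MAX_SPEED = 0.07
GOAL_POS = 0.5
ACTIONS = (+1, -1, 0)  # large throttle, reverse large throttle, coast (0 is assumed)


def step(position, velocity, action):
    """Advance one time step; returns (position, velocity, reward, done)."""
    velocity += 0.001 * action - 0.0025 * math.cos(3 * position)
    velocity = max(-MAX_SPEED, min(MAX_SPEED, velocity))
    position += velocity
    if position < MIN_POS:  # inelastic wall at the left edge
        position, velocity = MIN_POS, 0.0
    done = position >= GOAL_POS
    position = min(position, MAX_POS)
    return position, velocity, -1.0, done  # reward is -1 on every step


def run(policy, max_steps=1000, start=(-0.5, 0.0)):
    """Run one episode; returns (steps_taken, reached_goal)."""
    position, velocity = start
    for t in range(1, max_steps + 1):
        action = policy(position, velocity)
        position, velocity, _, done = step(position, velocity, action)
        if done:
            return t, True
    return max_steps, False


def always_forward(position, velocity):
    """Naive policy: large throttle toward the goal on every step."""
    return +1


def with_momentum(position, velocity):
    """Push in the direction of motion, so the car backs up before charging."""
    return +1 if velocity >= 0 else -1


steps_naive, reached_naive = run(always_forward)
steps_momentum, reached_momentum = run(with_momentum)
```

Under these assumed dynamics the naive policy stalls partway up the slope, while the momentum policy reaches the goal. Because every step costs -1, an episode's return is the negative of its length, so a learner is rewarded for discovering the swing-back maneuver quickly.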

Abstract

The invention discloses a reinforcement learning method combined with knowledge transfer. The method comprises the following steps: S1, designing a BP neural network mapping relation between autonomous tasks; S2, storing source task learning experience as cases, and constructing a linear perceptron to learn an action mapping relation between a source domain and a target domain; S3, applying a case-based reasoning mechanism; S4, carrying out similarity calculation and case retrieval, and accelerating the learning of related but different tasks by using the experience learned in the case library as a heuristic. The method is also applied to learning the autonomous skills of an unmanned vehicle. The method combines the advantages of reinforcement learning and transfer learning: experience obtained by the robot in a simple domain or source domain can be transferred to accelerate learning in a complex domain or target domain. The learning speed is high, the curse of dimensionality can be avoided, and the speed and efficiency of autonomous skill learning for the unmanned vehicle are remarkably improved.

Description

Technical field

[0001] The invention relates to a reinforcement learning method combined with knowledge transfer and a learning method applied to autonomous skills of unmanned vehicles, belonging to the technical field of artificial intelligence.

Background

[0002] With the advent of the era of artificial intelligence and the rapid development of big data, cloud computing, and the Internet of Things, society has become more and more intelligent, and research on more intelligent robots has become a main strategic development direction for countries around the world. The reindustrialization strategy proposed by the United States, the new robot strategy proposed by Japan, Industry 4.0 proposed by Germany, and the Internet Plus strategy proposed by China all reflect that artificial intelligence has become a necessary part of national development strategies. Nowadays, the development of unmanned factories, speech recognition, computer vision, unmanned ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/08, G06N5/04, G06N20/00
CPC: Y02T10/40
Inventor: 丁子凡, 丁德锐, 王永雄, 魏国亮, 鄂贵
Owner UNIV OF SHANGHAI FOR SCI & TECH