Robot behavior learning model based on utility differential network

A technique for learning models, differential networks

Inactive Publication Date: 2011-05-18
BEIHANG UNIV
3 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

The robot behavior learning model based on a utility differential network addresses two weaknesses of general behavior decision-making models: limited knowledge acquisition and over-reliance on prior experience. It implements both an offline learning process and an online decision-making process, and it addresses the poor real-time performance of the reasoning process.

Examples

Embodiment Construction

[0016] The present invention is described in further detail below with reference to the accompanying drawings and embodiments. The first embodiment describes the offline learning process of the learning model; the second embodiment describes the online decision-making process.

[0017] As shown in Figure 1, the learning model of the present invention includes five parts: a utility fitting network unit 11, a differential signal calculation network unit 12, a confidence evaluation network unit 13, an action decision network unit 14 and an action correction network unit 15. All five parts take part in the offline learning process of the learning model.

[0018] The utility fitting network unit 11 is used to calculate the utility fitted value of the state space vector s_t that results after the action a_t selected at time t has been executed by the action execution unit 16, and o...
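The relationship between the utility fitted value (unit 11) and the differential signal (unit 12) can be illustrated with a standard temporal-difference sketch. The excerpt does not give the patent's formulas, so everything below is an assumption made for illustration: the linear utility fit, the textbook TD error r + gamma * U(s') - U(s), the discount `gamma`, the learning rate `lr`, and all function names are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence


def utility(weights: Sequence[float], state: Sequence[float]) -> float:
    """Assumed linear utility fit U(s) = w . s, standing in for unit 11."""
    return sum(w * x for w, x in zip(weights, state))


def differential_signal(
    weights: Sequence[float],
    state: Sequence[float],
    next_state: Sequence[float],
    reward: float,
    gamma: float = 0.9,
) -> float:
    """Assumed textbook TD error: reward + gamma * U(s') - U(s)."""
    return reward + gamma * utility(weights, next_state) - utility(weights, state)


def update_weights(
    weights: Sequence[float],
    state: Sequence[float],
    delta: float,
    lr: float = 0.1,
) -> List[float]:
    """Move the utility fit for `state` in the direction that shrinks delta."""
    return [w + lr * delta * x for w, x in zip(weights, state)]


if __name__ == "__main__":
    w = [0.0, 0.0]
    s, s_next = [1.0, 0.0], [0.0, 1.0]
    delta = differential_signal(w, s, s_next, reward=1.0)
    w = update_weights(w, s, delta)
    print(delta, w)
```

In this sketch a positive differential signal means the outcome was better than the current utility fit predicted, so the fit for that state is raised; repeating the update on the same transition makes the signal smaller.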

Abstract

The invention relates to a robot behavior learning model based on a utility differential network. The model comprises a utility fitting network unit, a differential signal calculation network unit, a confidence evaluation network unit, an action decision network unit, an action correction network unit and an action execution unit, and it implements both an offline learning process and an online decision-making process. The utility fitting network unit calculates the utility fitted value of the state reached after an action is executed. The differential signal calculation network unit calculates a differential signal. The confidence evaluation network unit calculates a confidence value and outputs it to the action correction network unit. The action decision network unit outputs an action selection function. The action correction network unit corrects the action selection function using the confidence, calculates the selection probability of each action, and outputs the action with the largest probability to the action execution unit for execution. The invention better ensures that the robot acquires complete knowledge of its environment, and better ensures that its behavior decisions are timely and effective.
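The action correction step in the abstract (correct the action selection function with the confidence, compute a selection probability per action, output the most probable action) can be sketched as follows. The abstract does not say how the correction is computed, so this is only one plausible reading: the Gaussian perturbation scaled by `1 - confidence`, the softmax used to turn corrected values into probabilities, and the names `softmax` and `correct_and_select` are all assumptions for illustration, not the patent's method.

```python
import math
import random
from typing import List, Sequence, Tuple


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def correct_and_select(
    preferences: Sequence[float],
    confidence: float,
    rng: random.Random,
) -> Tuple[int, List[float]]:
    """Perturb action preferences by (1 - confidence), then pick the most probable action.

    Assumption: low confidence means more exploration noise; at confidence 1.0
    the preferences pass through unchanged.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    noise_scale = 1.0 - confidence
    corrected = [p + noise_scale * rng.gauss(0.0, 1.0) for p in preferences]
    probs = softmax(corrected)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


if __name__ == "__main__":
    rng = random.Random(0)
    action, probs = correct_and_select([0.2, 1.5, -0.3], confidence=1.0, rng=rng)
    print(action, probs)
```

Under this reading, a fully confident model always executes the action preferred by the action decision network, while a less confident one may select a different action and so keeps exploring during offline learning.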

Description

Technical field

[0001] The invention relates to a robot behavior learning model based on a utility differential network, a new application in the field of artificial intelligence.

Background technique

[0002] Intelligent robot behavior generally refers to the process in which a robot perceives its surroundings and then reasons and makes decisions about how to act. Building an intelligent behavior decision-making model requires acquiring, representing and reasoning over knowledge, together with the ability to evaluate automatically how good or bad a robot behavior is. At present, cognitive behavior models based on reinforcement learning have advantages in knowledge acquisition, adaptability to the decision-making environment, and reusability, which makes them the first choice for intelligent behavior modeling.

[0003] The reinforcement learning process requires exploration of ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06N3/00
Inventor: 宋晓 (Song Xiao), 麻士东 (Ma Shidong), 龚光红 (Gong Guanghong)
Owner: BEIHANG UNIV