Adaptive game playing algorithm based on deep reinforcement learning

An adaptive technique based on reinforcement learning, applied in the field of data processing, that addresses the poor scalability of agents in multi-agent settings.

Inactive Publication Date: 2019-03-19
DONGGUAN UNIV OF TECH
0 Cites, 20 Cited by

AI Technical Summary

Problems solved by technology

[0003] To address the poor scalability of multi-agent methods in the prior art, the present invention provides an adaptive game algorithm based on deep reinforcement learning.

Method used

Examples

Detailed Description of the Embodiments

[0026] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0027] An adaptive game algorithm based on deep reinforcement learning includes the following steps: (A) obtain strategies with different degrees of cooperation; (B) generate strategies with different degrees of cooperation; (C) detect the opponent's cooperation strategy; (D) formulate different coping strategies.

[0028] In step (A), different network structures and/or different forms of target reward are used for training, yielding strategies with different degrees of cooperation.

[0029] In step (A), strategies with different degrees of cooperation are obtained by modifying the key factors in the environment that affect the degree of competition and cooperation, or by modifying the agent's learning objectives.
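One way to picture "modifying the agent's learning objectives" is to blend the agent's own reward with the opponent's reward and train one policy per blend. The sketch below is only an illustration under that assumption: the name `shaped_reward`, the `coop_weight` parameter, and the additive blending rule are hypothetical and do not come from the patent text.

```python
def shaped_reward(own_reward: float, other_reward: float, coop_weight: float) -> float:
    """Blend the agent's own reward with the opponent's reward.

    coop_weight = 0.0 gives a purely selfish objective; coop_weight = 1.0
    weights both rewards equally. The blending rule is an illustrative
    assumption, not a formula taken from the patent.
    """
    if not 0.0 <= coop_weight <= 1.0:
        raise ValueError("coop_weight must lie in [0, 1]")
    return own_reward + coop_weight * other_reward


# Hypothetical cooperation weights, one training objective per weight.
COOP_WEIGHTS = [0.0, 0.5, 1.0]

# Example: a step where the agent gains 1.0 while the opponent loses 2.0.
targets = {
    w: shaped_reward(own_reward=1.0, other_reward=-2.0, coop_weight=w)
    for w in COOP_WEIGHTS
}
```

With a weight of 0.0 the harmful step still looks attractive to the agent; as the weight grows, the same step is rewarded less, which would push the trained policy toward more cooperative behavior.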

[0030] In step (B), the strategies with different degrees of cooperation obtained in step (A) are set as expert networks, ...

Abstract

The invention relates to the field of data processing and discloses an adaptive game playing algorithm based on deep reinforcement learning, comprising the following steps: (A) acquiring strategies with different degrees of cooperation; (B) generating strategies with different degrees of cooperation; (C) detecting the cooperation strategy of the opponent; and (D) developing different coping strategies. The algorithm has the following beneficial effects: it uses the trained detector and the strategies with different degrees of cooperation to implement existing ideas such as tit-for-tat in sequential social dilemmas; it improves the scalability of the agent; and it obtains, in a more intuitive way, a competitive strategy better than its own.
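A tit-for-tat style response can be sketched as matching the opponent's detected degree of cooperation with the closest available strategy. This is a minimal sketch under stated assumptions: the function `select_response`, the representation of a cooperation degree as a number in [0, 1], and the nearest-match rule are all hypothetical; the abstract does not specify how the detector output is mapped to a coping strategy.

```python
from typing import Sequence


def select_response(detected_degree: float, available_degrees: Sequence[float]) -> float:
    """Return the available cooperation degree closest to the detected one.

    detected_degree stands in for the output of a trained detector
    (hypothetical here); available_degrees lists the cooperation degrees
    for which a strategy exists.
    """
    if not available_degrees:
        raise ValueError("at least one strategy is required")
    return min(available_degrees, key=lambda d: abs(d - detected_degree))


# Hypothetical set of strategies: defecting, mixed, fully cooperative.
degrees = [0.0, 0.5, 1.0]
cooperative_reply = select_response(0.8, degrees)  # opponent looks cooperative
defecting_reply = select_response(0.1, degrees)    # opponent looks uncooperative
```

Mirroring the opponent in this way is the core of tit-for-tat: cooperation is answered with cooperation and defection with defection.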

Description

Technical field

[0001] The invention relates to the field of data processing, in particular to an adaptive game algorithm based on deep reinforcement learning.

Background

[0002] Reinforcement learning is used in various fields, from games to robot control. Traditional reinforcement learning uses tables or linear functions to represent value functions or strategies, which makes it difficult to scale to complex problems. Deep reinforcement learning, which combines reinforcement learning with deep learning, uses the ability of neural networks to extract features and approximate functions, and has seen some successful applications [DQN][Alpha Zero][PPO]. The Prisoner's Dilemma (PD) game has long been a research focus among matrix games. The PD game treats cooperation and competition as atomic actions, but in the real world a game consists of a sequence of actions; this temporally extended PD is called the Sequential Prisoner's Dilemma (SPD). In the PD game, most...
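For readers unfamiliar with the matrix-game form of the PD, the following sketch shows a payoff table with cooperation and defection as atomic actions. The payoff values 5, 3, 1, 0 are a common textbook choice satisfying T > R > P > S; they are an assumption for illustration and are not taken from the patent.

```python
C, D = "cooperate", "defect"

# Textbook PD payoffs (assumed values): temptation, reward, punishment, sucker.
T, R, P, S = 5, 3, 1, 0

# (row action, column action) -> (row payoff, column payoff)
PAYOFFS = {
    (C, C): (R, R),
    (C, D): (S, T),
    (D, C): (T, S),
    (D, D): (P, P),
}


def best_response(opponent_action: str) -> str:
    """Return the row player's payoff-maximizing action against a fixed opponent action."""
    return max((C, D), key=lambda a: PAYOFFS[(a, opponent_action)][0])


# Defection is the best response to either action, yet mutual defection
# pays each player less than mutual cooperation: that is the dilemma.
dilemma_holds = (
    best_response(C) == D
    and best_response(D) == D
    and PAYOFFS[(D, D)][0] < PAYOFFS[(C, C)][0]
)
```

In the sequential variant, each of these single choices is replaced by a whole sequence of low-level actions, so the degree of cooperation has to be inferred from behavior rather than read off a single move.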

Claims

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/042, G06N3/045
Inventors: 侯韩旭, 郝建业, 王维勋
Owner: DONGGUAN UNIV OF TECH