Training method and device for reinforcement learning model in battle game

A technology for reinforcement learning in battle games, applied in indoor games, machine learning, computing models, etc. It addresses problems such as unbalanced lineups, declining AI exploration ability, and AI ability that does not meet expectations.

Active Publication Date: 2021-02-26
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0003] Because heroes in multiplayer games differ in strength and difficulty by design, lineups can easily become unbalanced. In the traditional reinforcement learning training process, a certain probability is used to choose between self-play and randomly selecting an opponent from the historical opponent pool. With this training method, weak lineups are suppressed by strong lineups while the AI is trained, which significantly reduces the AI's exploration ability, lowers training efficiency, and leaves the AI's capabilities short of expectations.
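The traditional opponent selection described in [0003] can be sketched as follows. This is only an illustration under stated assumptions: the function name `pick_baseline_opponent`, the parameter `self_play_prob`, and the use of plain strings as model identifiers are hypothetical and do not come from the patent text.

```python
import random


def pick_baseline_opponent(current_model, historical_pool, self_play_prob, rng):
    """Baseline selection: self-play with some probability, else a uniformly
    random historical opponent. All names here are illustrative assumptions."""
    # Fall back to self-play when there is no historical opponent yet.
    if not historical_pool or rng.random() < self_play_prob:
        return current_model
    # Uniform choice ignores how strong the opponent is relative to the
    # current model, which is the weakness paragraph [0003] points out.
    return rng.choice(historical_pool)


rng = random.Random(0)
opponent = pick_baseline_opponent("current", ["v1", "v2", "v3"], 0.5, rng)
```

Because the choice is uniform over the pool, a weak lineup can repeatedly be matched against much stronger opponents, which is the suppression effect the paragraph describes.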


Embodiment Construction

[0061] To make the purpose, technical solutions, and advantages of the application clearer, the application is described in further detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application, without creative effort, fall within the scope of protection of this application.

[0062] The word "exemplary" is used hereinafter to mean "serving as an example, embodiment or illustration". Any embodiment described as "exemplary" is not necessarily to be construed as superior or better than other embodiments.

[0063] The terms "first" and "second" herein are only used for descriptive purposes, and should not be interpreted as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features. Therefor...

Abstract

The invention provides a training method and device for a reinforcement learning model in a battle game. It belongs to the technical field of computers and relates to artificial intelligence and computer vision technologies. The method comprises the following steps: a target battle model and a similar opponent model of the target battle model are acquired, where the similar opponent model is a historical battle model whose grade score differs from that of the target battle model by less than a score threshold, and the grade score is used to evaluate the battle ability of a model; based on the battle state characteristics of the two battle parties, the prediction operation of the target battle model and the prediction operation of the similar opponent model are determined respectively; the target battle model and the similar opponent model control the two battle parties to execute the prediction operations so as to conduct the battle; an operation value of the target battle model in the battle is determined; and the target battle model is trained based on the battle state characteristics, the prediction operation, and the operation value.
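The opponent-acquisition step in the abstract can be sketched as follows. This is a minimal sketch, not the patented implementation: the names `ModelRecord`, `similar_opponents`, and `pick_similar_opponent`, the numeric rating values, and the uniform choice among candidates are all assumptions made for illustration. The abstract only specifies that a similar opponent is a historical model whose grade score differs from the target's by less than a score threshold.

```python
import random
from dataclasses import dataclass


@dataclass
class ModelRecord:
    name: str
    rating: float  # grade score evaluating battle ability (scale is an assumption)


def similar_opponents(target, pool, score_threshold):
    """Historical models whose grade score differs from the target's
    by less than the score threshold."""
    return [m for m in pool if abs(m.rating - target.rating) < score_threshold]


def pick_similar_opponent(target, pool, score_threshold, rng=random):
    """Pick one similar opponent, or None if no historical model qualifies.
    Uniform choice among candidates is an assumption, not from the source."""
    candidates = similar_opponents(target, pool, score_threshold)
    if not candidates:
        return None
    return rng.choice(candidates)


# Hypothetical historical pool and target model.
pool = [
    ModelRecord("v1", 1200.0),
    ModelRecord("v2", 1480.0),
    ModelRecord("v3", 1530.0),
    ModelRecord("v4", 1900.0),
]
target = ModelRecord("current", 1500.0)
candidates = similar_opponents(target, pool, score_threshold=100.0)
```

With these hypothetical ratings, only `v2` and `v3` fall within 100 points of the target, so the target model would not be matched against the much weaker `v1` or the much stronger `v4`.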

Description

Technical field

[0001] The present application relates to the field of computer technology, in particular to a method and device for training reinforcement learning models in battle games.

Background technique

[0002] MOBA (Multiplayer Online Battle Arena) games are also known as ARTS (Action Real-Time Strategy) games. The gameplay of this type of game generally requires purchasing equipment during battles. Players are usually divided into two or more hostile camps and compete with each other on a scattered game map. Each player controls a selected virtual character through the interface to fight against the others. In the game, the virtual characters of the two camps can both be controlled by players, or players and AI (Artificial Intelligence) can control the virtual characters of different camps to fight.

[0003] Due to the unbalanced strength and difficulty of h...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): A63F13/843, G06N20/00
CPC: A63F13/843, G06N20/00
Inventor: 陈光伟, 李思琴, 王亮, 付强
Owner: TENCENT TECH (SHENZHEN) CO LTD