Game control method and device and storage medium

A game control technology, applicable to indoor games, video games, sports accessories, and similar fields. It addresses the instability of deep reinforcement learning, the amplified overestimation of deep reinforcement learning algorithms, and the resulting reduction in event control effect, with the aim of improving the control effect, reducing overestimation, and reducing coupling.

Pending Publication Date: 2022-01-04
CHINA MOBILE SUZHOU SOFTWARE TECH CO LTD +1

AI Technical Summary

Problems solved by technology

However, because Q-Learning is an off-policy algorithm, it inherently increases the instability of the deep reinforcement learning algorithm. In addition, each time a greedy strategy is used to select the action performed in the video game, the Max operator is used to pick the optimal action, which amplifies the overestimation problem of the deep reinforcement learning algorithm.
The instability and overestimation problems in deep reinforcement learning therefore remain, which reduces the control effect of events.
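
The overestimation effect of the Max operator can be illustrated with a small experiment. The following is a minimal sketch and is not taken from the patent: it assumes every action has a true value of 0 and that each estimate carries zero-mean Gaussian noise. The function name `max_operator_bias` and all of its parameters are hypothetical.

```python
import random


def max_operator_bias(num_actions=10, noise_std=1.0, trials=5000, seed=0):
    """Average of max(noisy estimates) when every true action value is 0.

    Any positive result is pure overestimation, because the true
    maximum action value is exactly 0 in this toy setting.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each estimate = true value (0.0) + zero-mean Gaussian noise.
        estimates = [rng.gauss(0.0, noise_std) for _ in range(num_actions)]
        total += max(estimates)
    return total / trials


if __name__ == "__main__":
    print("bias with 10 actions:", max_operator_bias(num_actions=10))
    print("bias with 1 action:  ", max_operator_bias(num_actions=1))
```

With a single action the average stays close to zero, while with several actions the average of the maximum is clearly positive even though no action is actually better than another.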



Examples


Embodiment 1

[0074] An embodiment of the present invention provides a game control method. As shown in Figure 1, the method includes:

[0075] S101. Acquire the current video frame when it is detected that the target video game starts;

[0076] When the device detects that the target video game has started, it takes the first frame image of the target video game as the current video frame. The current video frame represents the current state.

[0077] In some embodiments, when the device detects that the target video game has started, it also sets up the replay memory unit, the total capacity of the replay memory unit, the preset total number of training rounds, the preset total number of training steps for each round, the preset online value network, and the preset target value network. The replay memory unit stores the data generated while controlling the target video game, and the preset total number of training steps is greater than the preset number of samples.
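
A memory unit with a fixed total capacity can be sketched as follows. This is an illustrative assumption rather than the patent's implementation: the class name `ReplayMemory`, its methods, the oldest-first eviction policy, and uniform random sampling are all hypothetical choices, since the text only states that the unit has a total capacity and stores data generated while controlling the game.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity store for records generated during game control."""

    def __init__(self, capacity):
        # Assumption: once full, the oldest record is discarded first.
        self.buffer = deque(maxlen=capacity)

    def store(self, record):
        """Append one record (for example, one tuple of transition data)."""
        self.buffer.append(record)

    def sample(self, batch_size, rng=random):
        """Draw batch_size distinct records uniformly at random (assumption)."""
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


if __name__ == "__main__":
    memory = ReplayMemory(capacity=3)
    for step in range(5):
        memory.store(step)
    print(len(memory), list(memory.buffer))
```

Using `deque(maxlen=...)` keeps the capacity check out of the calling code; a sampling scheme other than uniform would only require changing `sample`.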

[...

Embodiment 2

[0135] Based on the same inventive concept as Embodiment 1, a further description follows.

[0136] An embodiment of the present invention provides a game control device. As shown in Figure 5, the game control device 3 includes:

[0137] The control unit 31, which is used to obtain the current video frame when it detects that the target video game starts; and, based on the current video frame and the preset online value network, to obtain the current grayscale image and the current action, control the target video game to execute the current action, and obtain the current reward value and the next video frame;

[0138] The data construction unit 32, which is used to obtain the current quintuple based on the current grayscale image, the current action, the current reward value, the next video frame, and the preset online value network, and to save it to the preset database;

[0139] The parameter updating unit 33 is used for when the number of obtained current quintuples is greater th...


Abstract

The invention discloses a game control method, a device, and a storage medium. The method comprises: obtaining a current video frame when the start of a target video game is detected; based on the current video frame and a preset online value network, obtaining a current grayscale image and a current action, controlling the target video game to execute the current action, and obtaining a current reward value and a next video frame; obtaining a current quintuple and storing it in a preset database; when the number of obtained current quintuples is greater than or equal to the preset number of samples and the preset total number of training rounds is greater than zero, updating the parameters of the preset online value network based on a preset target value network and the preset database; when the number of current quintuples is an integer multiple of the preset total number of training steps, replacing the parameters of the preset target value network with the parameters of the preset online value network and subtracting one from the preset total number of training rounds; and taking the next video frame as the current video frame and repeating the process. The invention can improve the event control effect.
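
The update schedule described in the abstract can be traced with plain counters. The sketch below is a hedged illustration and not the patent's implementation: it omits the game, the networks, and the database, and only counts when an online-network update and a target-network replacement would be triggered. The function name `run_schedule`, its parameter names, the `max_steps` safety limit, and stopping once the round counter reaches zero are all assumptions.

```python
def run_schedule(total_rounds, steps_per_round, min_samples, max_steps):
    """Count online updates and target replacements under the stated rules.

    total_rounds    -- preset total number of training rounds
    steps_per_round -- preset total number of training steps
    min_samples     -- preset number of samples
    max_steps       -- hypothetical safety limit on collected quintuples
    """
    # The source states that the step total exceeds the sample count.
    assert steps_per_round > min_samples

    rounds_left = total_rounds
    quintuples = 0
    online_updates = 0
    target_replacements = 0

    # Assumption: the loop ends when no training rounds remain.
    while rounds_left > 0 and quintuples < max_steps:
        quintuples += 1  # one quintuple is stored per control step

        # Update the online value network once enough quintuples exist.
        if quintuples >= min_samples and rounds_left > 0:
            online_updates += 1

        # Copy online parameters into the target network on multiples
        # of the step total, then subtract one from the round counter.
        if quintuples % steps_per_round == 0:
            target_replacements += 1
            rounds_left -= 1

    return online_updates, target_replacements, rounds_left


if __name__ == "__main__":
    print(run_schedule(total_rounds=2, steps_per_round=5,
                       min_samples=3, max_steps=100))
```

In a full system the two counters would be replaced by a gradient step on the online value network and a parameter copy into the target value network, respectively.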

Description

Technical field

[0001] The invention relates to dynamic optimization technology, in particular to a game control method and device, and a storage medium.

Background technique

[0002] At present, deep reinforcement learning is an active field of artificial intelligence research. It combines the perception ability of deep learning with the decision-making ability of reinforcement learning, and provides a new control approach for events related to dynamic optimization (such as video games).

[0003] Because deep reinforcement learning suffers from instability and overestimation, related technologies use an improved deep Q network algorithm in the control of video games to try to alleviate the overestimation problem of deep reinforcement learning algorithms. However, because Q-Learning is an off-policy algorithm, it inherently increases the instability of the deep reinforcement learning algorithm, and each time a greedy strategy is used to select t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): A63F13/49, A63F13/55, G06N3/04, G06N3/06
CPC: A63F13/55, A63F13/49, G06N3/061, G06N3/045
Inventor: 夏宗涛
Owner: CHINA MOBILE SUZHOU SOFTWARE TECH CO LTD