Method, apparatus, medium and device for controlling virtual object

By acquiring game state data through action decision models and performing feature processing, the high-cost NPC control problem in existing technologies is solved, achieving efficient operation control and improved accuracy under multiple roles, thus enhancing the user experience.

CN117618912B · Active · Publication Date: 2026-06-19 · BEIJING ZITIAO NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING ZITIAO NETWORK TECH CO LTD
Filing Date
2023-11-29
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

The one-to-one mapping relationship between game NPCs and takeover models in existing technologies leads to high deployment and maintenance costs, and the training process requires a lot of manpower and data annotation.

Method used

An action decision model is adopted. By acquiring game state data, game features and character coding features are determined. Action decisions are made using feature processing units and prediction units to realize operation control under multiple virtual characters, reduce the workload of model deployment, training and maintenance, and improve the accuracy and versatility of action decisions.

Benefits of technology

It effectively reduces the workload of model deployment, training and maintenance, improves the accuracy of action decisions and the precision of virtual object control, and enhances the user's gaming experience.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN117618912B_ABST
Patent Text Reader

Abstract

This disclosure relates to a method, apparatus, medium, and device for controlling a virtual object. The method includes: acquiring game state data of a target virtual object in the current game; determining game features and character encoding features corresponding to the target virtual object based on the game state data, wherein the game features represent the data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features represent the features of the virtual character corresponding to the target virtual object; obtaining a decision action output by the action decision model based on the game features, the character encoding features, and the action decision model, and controlling the target virtual object according to the decision action; wherein the action decision model includes a feature processing unit and a prediction unit, wherein the feature processing unit is used to adjust the game features based on the character encoding features to obtain processed game features, and the prediction unit is used to determine the decision action based on the processed game features.

Description

Technical Field

[0001] This disclosure relates to the field of computer technology, and more specifically, to a method, apparatus, medium, and device for controlling virtual objects.

Background Technology

[0002] The application of artificial intelligence in the gaming industry is developing at a rapid pace, bringing unprecedented revolutionary progress and innovation to the gaming experience. Currently, it is possible to train models based on artificial intelligence through supervised learning, reinforcement learning, and state machines to achieve a one-to-one mapping relationship between game NPCs (Non-Player Characters) and their models.

[0003] However, this one-to-one mapping relationship between game NPCs and takeover models brings high deployment and maintenance costs, and also requires a lot of manpower for model training and data labeling during the training process.

Summary of the Invention

[0004] This summary section is provided to briefly introduce the concepts, which will be described in detail in the detailed description section below. This summary section is not intended to identify key or essential features of the claimed technical solution, nor is it intended to limit the scope of the claimed technical solution.

[0005] In a first aspect, this disclosure provides a method for controlling a virtual object, the method comprising:

[0006] Obtain the game state data of the target virtual object in the current game;

[0007] Based on the game state data, the game features and character encoding features corresponding to the target virtual object are determined, wherein the game features are used to represent the data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features are used to represent the features of the virtual character corresponding to the target virtual object;

[0008] Based on the game features, the character encoding features, and the action decision model, a decision action output by the action decision model is obtained, and the target virtual object is controlled according to the decision action. The action decision model includes a feature processing unit and a prediction unit. The feature processing unit is used to adjust the game features based on the character encoding features to obtain processed game features. The prediction unit is used to determine the decision action based on the processed game features. The action decision model is trained based on training samples corresponding to multiple virtual characters.

[0009] In a second aspect, this disclosure provides a control device for a virtual object, the device comprising:

[0010] The acquisition module is used to acquire the game state data of the target virtual object in the current game.

[0011] The first determining module is used to determine the game features and character encoding features corresponding to the target virtual object based on the game state data, wherein the game features are used to represent the data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features are used to represent the features of the virtual character corresponding to the target virtual object;

[0012] The second determining module is used to obtain the decision action output by the action decision model based on the game features, the character encoding features, and the action decision model, and to control the target virtual object according to the decision action. The action decision model includes a feature processing unit and a prediction unit. The feature processing unit is used to adjust the game features based on the character encoding features to obtain processed game features. The prediction unit is used to determine the decision action based on the processed game features. The action decision model is trained based on training samples corresponding to multiple virtual characters.

[0013] In a third aspect, this disclosure provides a computer-readable medium having a computer program stored thereon, which, when executed by a processing device, implements the steps of the method described in the first aspect.

[0014] In a fourth aspect, this disclosure provides an electronic device, comprising:

[0015] A storage device on which computer programs are stored;

[0016] A processing device for executing the computer program in the storage device to implement the steps of the method described in the first aspect.

[0017] The above technical solution enables the control of virtual objects under multiple virtual characters by deploying a single action decision model, effectively reducing the workload, complexity, and resource consumption required for model deployment, training, and maintenance. Furthermore, adding a feature processing unit to the action decision model allows it to process game features differently based on the character encoding features of different virtual characters, ensuring a high degree of matching between the output decision action and the virtual character, improving the accuracy and effectiveness of action decisions, and enhancing the control precision of virtual objects. This makes the control of virtual objects closer to real user operations, improving the user's gaming experience. Additionally, training based on training data corresponding to multiple virtual characters can also improve the versatility and generalization of the action decision model to some extent.

[0018] Other features and advantages of this disclosure will be described in detail in the following detailed description section.

Attached Figure Description

[0019] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic, and the components and elements are not necessarily drawn to scale. In the drawings:

[0020] Figure 1 is a flowchart of a method for controlling a virtual object according to one embodiment of the present disclosure.

[0021] Figure 2 is a schematic diagram of an action decision model provided according to one embodiment of the present disclosure.

[0022] Figure 3 is a schematic diagram of a feature processing unit in an action decision model provided according to one embodiment of the present disclosure.

[0023] Figure 4 is a block diagram of a control device for a virtual object provided according to one embodiment of the present disclosure.

[0024] Figure 5 is a schematic diagram of the structure of an electronic device suitable for implementing embodiments of the present disclosure.

Detailed Implementation

[0025] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0026] It should be understood that the steps described in the method embodiments of this disclosure may be performed in different orders and / or in parallel. Furthermore, the method embodiments may include additional steps and / or omit the steps shown. The scope of this disclosure is not limited in this respect.

[0027] The term "comprising" and its variations as used herein are open-ended inclusions, meaning "including but not limited to". The term "based on" means "at least partially based on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms will be given in the description below.

[0028] It should be noted that the concepts of "first" and "second" mentioned in this disclosure are used only to distinguish different devices, modules or units, and are not used to limit the order of functions performed by these devices, modules or units or their interdependencies.

[0029] It should be noted that the terms "a" and "a plurality of" used in this disclosure are illustrative rather than restrictive, and those skilled in the art should understand that, unless otherwise expressly indicated in the context, they should be understood as "one or more".

[0030] The names of messages or information exchanged between multiple devices in the embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.

[0031] It is understood that before using the technical solutions disclosed in the various embodiments of this disclosure, users should be informed of the types, scope of use, and usage scenarios of the personal information involved in this disclosure in an appropriate manner in accordance with relevant laws and regulations, and user authorization should be obtained.

[0032] For example, upon receiving a user's active request, a prompt message is sent to the user to explicitly inform them that the requested operation will require the acquisition and use of the user's personal information. This allows the user to independently choose whether to provide personal information to the software or hardware, such as the electronic device, application, server, or storage medium performing the operations of this disclosed technical solution, based on the prompt message.

[0033] As an optional but non-limiting implementation, in response to a user's active request, sending a prompt message to the user can be done via a pop-up window, where the prompt message can be presented in text format. Furthermore, the pop-up window can also include a selection control allowing the user to choose "agree" or "disagree" to provide personal information to the electronic device.

[0034] It is understood that the above notification and user authorization process are merely illustrative and do not constitute a limitation on the implementation of this disclosure. Other methods that comply with relevant laws and regulations may also be applied to the implementation of this disclosure.

[0035] Meanwhile, it is understood that the data involved in this technical solution (including but not limited to the data itself, the acquisition or use of the data) shall comply with the requirements of relevant laws, regulations and related provisions.

[0036] Figure 1 is a flowchart of a virtual object control method according to an embodiment of the present disclosure. As shown in Figure 1, the method includes:

[0037] In step 11, the game state data of the target virtual object in the current game is obtained.

[0038] The target virtual object can be an object not controlled by a real player in the current game. Game state data can be obtained through the game data interface during the current game. This game state data represents the target virtual object's data in the current game. For example, the game state data may include object information of the target virtual object in the current game, as well as object information of other virtual objects in the game. This object information may include one or a combination of information such as health, mana, economy, kills, assists, and deaths. As another example, the game state data may also include game scene information, such as the survival status of minions and monsters in the current game, and the health of turrets.

[0039] It should be noted that the types of data included in this game status data can be preset according to the actual application scenario.

[0040] In step 12, based on the game state data, the game features and character encoding features corresponding to the target virtual object are determined. The game features are used to represent the data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features are used to represent the features of the virtual character corresponding to the target virtual object.

[0041] In this step, feature extraction can be performed based on game state data obtained from game matches to obtain game features. Game features can be divided into three categories, and the dimensions for feature extraction based on game state data can be pre-set. For example, the first category of game features can be game scene features, which can represent common scene features in a game match. Its dimensions can include the current game frame rate, the total number of kills, assists, deaths, and economy for both sides. This feature data can be used to represent the game's progress and the combat performance of both sides. The game scene features can also include information such as whether there are attack targets or summoned creatures within the target virtual object's attack range. The second category of game features can be the combat features of the virtual character corresponding to the target virtual object. Its dimensions can be configured according to the virtual character's relevant attributes, such as health, position, availability of hero skills, level, cooldown time, attack range, and mana cost. These features are related to the virtual character's survivability, attack power, and skill operation ability. The third type of game feature can be the characteristics of the attack target corresponding to the target virtual object in the game scene. For example, the attack target can be soldiers, monsters, buildings, and virtual objects belonging to the enemy camp in the game scene. The characteristics of the attack target can include health characteristics and whether the attack target exists. These characteristics will also affect the outcome of the game and influence the action decision in the battle.

[0042] The dimensions of the game features can be pre-set based on the actual application scenario, and this disclosure does not impose any limitations on this. Therefore, during feature extraction, extraction can be performed on each feature dimension based on the game state data during the game, obtaining the value of the corresponding dimension, thereby obtaining the game feature.
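The per-dimension extraction described above can be sketched as follows. This is a minimal illustration only: the dimension names, the three-group layout of the state dictionary, and the function `extract_game_features` are hypothetical and are not specified by this disclosure.

```python
from typing import Any, Dict, List

# Hypothetical preset dimensions for the three feature categories.
# The disclosure leaves the actual dimensions to the application scenario.
SCENE_DIMS = ["ally_kills", "enemy_kills", "ally_economy", "enemy_economy"]
CHARACTER_DIMS = ["health", "level", "skill_ready", "attack_range"]
TARGET_DIMS = ["target_exists", "target_health"]


def extract_game_features(state: Dict[str, Dict[str, Any]]) -> List[float]:
    """Read one value per preset dimension and concatenate them into a vector."""
    features: List[float] = []
    for group, dims in (("scene", SCENE_DIMS),
                        ("character", CHARACTER_DIMS),
                        ("target", TARGET_DIMS)):
        values = state.get(group, {})
        for dim in dims:
            # A missing dimension defaults to 0.0 so the vector length stays fixed.
            features.append(float(values.get(dim, 0.0)))
    return features


# Assumed shape of the game state data for one frame.
state = {
    "scene": {"ally_kills": 3, "enemy_kills": 5,
              "ally_economy": 8100, "enemy_economy": 9000},
    "character": {"health": 0.75, "level": 6,
                  "skill_ready": True, "attack_range": 5.5},
    "target": {"target_exists": True},
}
vector = extract_game_features(state)
```

Keeping the dimension lists fixed means the features extracted at inference time have the same layout as those used for training, which paragraph [0071] requires.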

[0043] Character encoding features represent the characteristics of the corresponding virtual character, so that different virtual characters can be distinguished by their character encoding features.

[0044] In step 13, based on the game features, the character encoding features, and the action decision model, the decision action output by the action decision model is obtained, and the target virtual object is controlled according to the decision action.

[0045] As shown in Figure 2, the action decision model 20 includes a feature processing unit 21 and a prediction unit 22. The feature processing unit 21 is used to adjust the game features based on the character encoding features to obtain processed game features. The prediction unit 22 is used to determine the decision action based on the processed game features. The action decision model is trained based on training samples corresponding to multiple virtual characters.

[0046] In this embodiment, the action control of multiple virtual characters can be realized based on the action decision model. Since different virtual characters usually have different combat logic and abilities, the action decision model adds a feature processing unit to adjust the game features based on the character encoding features of the virtual characters. This enhances the action decision model's ability to capture the differentiated information between different virtual characters, effectively improving the recognition of different virtual characters and thus improving the accuracy of the action decision model's action decisions for different virtual characters.

[0047] Specifically, the game features and character encoding features can be input into the action decision model, enabling the feature processing unit to identify the virtual character based on the character encoding features. The feature processing unit then adjusts the input game features based on the character encoding features so that they match the virtual character, and the prediction unit predicts the decision action from the adjusted features. The resulting decision action is used to control the target virtual object.
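As a rough illustration of how one model can serve several virtual characters, the sketch below wires a feature processing unit and a prediction unit together and selects the highest-scoring action. Everything in it is an assumption made for illustration: the action list, the function names, and the two toy units are stand-ins, not the trained units of this disclosure.

```python
from typing import Callable, List, Sequence

# Hypothetical action set; the disclosure allows the actions to be preset per game.
ACTIONS = ["move", "basic_attack", "skill", "recall"]


def decide_action(
    game_features: Sequence[float],
    character_encoding: Sequence[float],
    feature_processing_unit: Callable[[Sequence[float], Sequence[float]], List[float]],
    prediction_unit: Callable[[Sequence[float]], List[float]],
) -> str:
    """Adjust the game features for the character, then pick the best-scoring action."""
    processed = feature_processing_unit(game_features, character_encoding)
    scores = prediction_unit(processed)
    best = max(range(len(scores)), key=scores.__getitem__)
    return ACTIONS[best]


def toy_fpu(features: Sequence[float], encoding: Sequence[float]) -> List[float]:
    # Stand-in: scales each game feature by a per-character factor.
    return [f * e for f, e in zip(features, encoding)]


def toy_prediction(processed: Sequence[float]) -> List[float]:
    # Stand-in: uses each processed feature directly as one action score.
    return list(processed)


game_features = [0.2, 0.9, 0.8, 0.1]
warrior_encoding = [1.0, 1.0, 0.5, 1.0]  # hypothetical character A
mage_encoding = [1.0, 0.5, 1.0, 1.0]     # hypothetical character B

warrior_action = decide_action(game_features, warrior_encoding, toy_fpu, toy_prediction)
mage_action = decide_action(game_features, mage_encoding, toy_fpu, toy_prediction)
```

With identical game features, the two encodings lead to different actions (`basic_attack` for the first character, `skill` for the second), which is the behavior the single shared model is meant to provide.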

[0048] The above technical solution enables the control of virtual objects under multiple virtual characters by deploying a single action decision model, effectively reducing the workload, complexity, and resource consumption required for model deployment, training, and maintenance. Furthermore, adding a feature processing unit to the action decision model allows it to process game features differently based on the character encoding features of different virtual characters, ensuring a high degree of matching between the output decision action and the virtual character, improving the accuracy and effectiveness of action decisions, and enhancing the control precision of virtual objects. This makes the control of virtual objects closer to real user operations, improving the user's gaming experience. Additionally, training based on training data corresponding to multiple virtual characters can also improve the versatility and generalization of the action decision model to some extent.

[0049] In one possible embodiment, the character encoding feature includes at least one of the identification feature of the virtual character, the object identifier of the target attack object of the virtual character, and the equipment identifier of the equipment carried by the virtual character.

[0050] For example, the identification feature of the virtual character can be its unique identifier, which can be used to uniquely represent a virtual character, thereby enabling the differentiation of different virtual characters. The target attack object of the virtual character can be used to indicate the focus of the target virtual character in a recent time period, thereby revealing the movement trajectory of the target virtual object during that time period. The equipment identifier of the virtual character's carried equipment can be used to represent the virtual character's attributes and capabilities.

[0051] Character encoding features are used to differentiate virtual characters, so that during training the feature processing unit can learn how strongly each game feature should be weighted for each virtual character. For example, the feature processing unit (FPU) can be implemented using a gating unit. The FPU can include learnable unit parameters that adjust information delivery and selectivity. These parameters determine which of the input game features are passed to the next layer of the network at a given time, while irrelevant features are suppressed or ignored. For instance, these parameters could include weights corresponding to different game features. Model training thus determines the weights of the game features corresponding to different character encoding features, and the game features are then adjusted based on these weights to obtain processed game features. This improves the performance and accuracy of the subsequent prediction unit.

[0052] The above technical solution can represent each virtual character based on the character encoding features, so that the action decision model can adaptively adjust the input game features based on the character encoding features to ensure the prediction accuracy of the action decision model in different scenarios.

[0053] In one possible embodiment, the character encoding feature includes the object identifier of the target attack object of the virtual character;

[0054] The step of determining the game features and character encoding features corresponding to the target virtual object based on the game state data includes:

[0055] Based on the game status data, the virtual character's various attack targets within the most recent target time period are determined.

[0056] The target time period can be set based on specific application scenarios, and this disclosure does not limit it. For example, the target time period can be set to 1 minute, which can determine the attack targets of the virtual character in the most recent 1 minute. Attack data corresponding to the target virtual object within the target time period can be obtained from the game status data, and the corresponding attack targets in the attack data can be further determined.

[0057] The attack targets are ranked from highest to lowest by the number of attacks against each of them, the top N are identified as the target attack objects, and the object identifiers of the target attack objects are added to the character encoding feature, where N is a positive integer.

[0058] The more times a certain object is attacked, the more attention the target virtual object pays to it during the target time period. Therefore, the top N objects can be sorted by the number of attacks from high to low as the target attack objects. In other words, the objects that the target virtual object pays more attention to during the target time period are the target attack objects, thereby accurately representing the attack intent of the target virtual object during the target time period.

[0059] Therefore, by using the above technical solution, the recent attack intent of the target virtual object can be extracted from the game state data and represented, thereby improving the richness and accuracy of the character encoding features. At the same time, it provides reliable data support for the accurate prediction of the target virtual object's decision-making actions. When determining the decision-making action, it can be predicted based on the target virtual object's recent attack intentions, making the determined decision-making action more in line with the target virtual object's combat situation in the current game, and further ensuring the accuracy of the action decision.
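The top-N selection over the most recent target time period can be sketched as below. The event format (a list of `(timestamp_seconds, target_id)` pairs), the identifiers, and the function name `top_attack_targets` are assumptions for illustration; the disclosure only states that attack data is obtained from the game state data.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def top_attack_targets(
    attack_events: Sequence[Tuple[float, str]],
    now: float,
    window_seconds: float,
    n: int,
) -> List[str]:
    """Return the ids of the N most frequently attacked targets in the recent window."""
    # Keep only attacks that fall inside the target time period.
    recent = [target for ts, target in attack_events
              if now - window_seconds <= ts <= now]
    # Counter.most_common sorts by attack count, highest first.
    return [target for target, _ in Counter(recent).most_common(n)]


# Hypothetical attack log for the target virtual object.
events = [
    (10.0, "minion_1"),   # outside the 60-second window
    (70.0, "hero_7"),
    (80.0, "hero_7"),
    (90.0, "turret_2"),
    (95.0, "hero_7"),
    (100.0, "turret_2"),
    (110.0, "minion_3"),
]
targets = top_attack_targets(events, now=120.0, window_seconds=60.0, n=2)
```

Here `targets` contains `hero_7` followed by `turret_2`; these identifiers would then be appended to the character encoding feature.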

[0060] In one possible embodiment, the action decision model can be determined in the following way:

[0061] Training samples are generated based on the game state data of multiple virtual characters in historical matches. The training samples include the training game features corresponding to the training virtual objects, the training decision actions corresponding to the training game features, the training character type and training character encoding features of the virtual characters corresponding to the training virtual objects, and the multiple virtual characters include virtual characters under various character types.

[0062] In practical applications, due to the game experience and game scene settings, multiple character types can be set, and multiple virtual characters can be set under each character type. The character type and the virtual characters under that character type can be configured based on the actual application scenario, and this disclosure does not limit this.

[0063] In this embodiment, in order to improve the efficiency of model deployment and reduce the complexity of model maintenance, the model can be trained uniformly based on virtual characters of different role types, so as to obtain a general model that can be applied to multiple virtual characters, and used to make decisions and control the actions of virtual characters of different role types.

[0064] As an example, after authorization from the relevant user has been obtained, the user's game data within a historical time period can be obtained.

[0065] In one possible embodiment, an exemplary implementation of generating training samples based on game state data from historical matches of multiple virtual characters may include:

[0066] For each historical game's game state data, determine the training character type of the virtual character corresponding to the training virtual object in the game state data and the triggered training decision action in the game state data, and obtain the game image frame when the training decision action is triggered.

[0067] Specifically, the training character type corresponding to the virtual character of the training virtual object can be determined by analyzing game state data or based on the parameters in the game state data that represent the character type of the virtual character. For example, if the game state data can contain the character types of the virtual characters corresponding to each virtual object in the historical game, then after determining the training virtual object, its corresponding virtual character's character type can be directly obtained as the training character type.

[0068] The game state data under the historical game can be the recording data corresponding to the historical game. Then, by performing image recognition on the recording data, the triggered decision actions, such as skill output, recall, etc. can be determined. This can be achieved based on image recognition models or algorithms, and this disclosure does not limit it.

[0069] After determining the corresponding decision action, the recorded image frame that triggers the decision action can be extracted and used as the game image frame for training the decision action.

[0070] Subsequently, for each game image frame triggered by the training decision action, feature extraction is performed on the game image frame to obtain the training game features and training character encoding features corresponding to the training decision action.

[0071] For example, based on the historical game data of the virtual character U1, the determined training decision action is a skill. Feature extraction is performed on the game image frames based on the preset dimensions of the game features and the dimensions of the training character encoding features to obtain the training game features and training character encoding features. The dimensions of the training game features are the same as the dimensions of the game features determined in step 12, and the dimensions of the training character encoding features are the same as the dimensions of the character encoding features determined in step 12.

[0072] Therefore, the above technical solution can automatically generate training samples based on the game state data of multiple virtual characters under various role types in historical matches, effectively reducing the workload required to obtain sample data.

[0073] Next, the training game features and the training character encoding features are input into the feature processing unit to obtain training game processing features; and the training game processing features are input into the prediction unit to obtain the predicted decision action and the predicted character type. Thus, the corresponding predicted decision action and predicted character type can be determined based on the training game features and the training character encoding features.

[0074] The action decision model obtained in this disclosure is used to take over the control of virtual characters under different role types. Since the operations of virtual characters under different role types vary greatly, in this disclosure, obtaining the predicted decision action can be taken as the main task. That is, the above-mentioned game features and character encoding features are used as the input of the main task to train the model to generate decision actions in the game.

[0075] Furthermore, auxiliary tasks are added during the model training process. In neural network training, auxiliary tasks are methods that utilize additional labels or features to assist in training the main task. In this embodiment, the prediction of the virtual character's role type is used as an additional auxiliary task to help the action decision model further learn the association between different role types and game features. In this way, while learning the main task, the model also learns the role type corresponding to the virtual character, thereby improving the model's understanding of the game's overall characteristics.

[0076] It should be noted that when deploying the trained action decision model, there is no need to deploy the processing model for the auxiliary task, thus simplifying the application of the online model. Therefore, this method allows for the addition of extra information to enable the model to learn more comprehensively without increasing the number of network nodes and feature sizes, which helps improve the accuracy of the model's action decisions in game matches.

[0077] Subsequently, based on the training character type and the training decision action, as well as the predicted decision action and the predicted character type, a target loss is determined, and the action decision model is trained based on the target loss.

[0078] In one possible embodiment, determining the target loss based on the training character type and the training decision action, and the predicted decision action and the predicted character type, includes:

[0079] A first loss is determined based on the training character type corresponding to the training sample and the predicted character type. This first loss can be determined using the cross-entropy loss function commonly used in classification models.

[0080] Based on the training decision action and the predicted decision action, a second loss is determined. This second loss can be determined using a cross-entropy loss function. The output decision action can be pre-set, such as action parameters corresponding to a virtual character, including: movement, basic attack, skills, battlefield skills, recall, equipment, canceling skills, and auxiliary skills, etc., and can be set based on different game scenarios.

[0081] The target loss is determined based on the first loss and the second loss.

[0082] Then, the first loss and the second loss can be weighted and summed to obtain the target loss. The weights of the first loss and the second loss can be set based on the actual needs of the main task and the auxiliary task, and this disclosure does not limit this.
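The weighted sum described above can be illustrated with a short sketch. It assumes a single action label per sample (the disclosure lists several action parameters, so a real model may sum one cross-entropy term per parameter), and the function names and the default weights `role_weight=0.2` and `action_weight=1.0` are hypothetical values chosen only for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy between predicted logits and an integer class label."""
    return -math.log(softmax(logits)[label])


def target_loss(role_logits, role_label, action_logits, action_label,
                role_weight=0.2, action_weight=1.0):
    """Weighted sum of the auxiliary (role type) and main (action) losses."""
    first_loss = cross_entropy(role_logits, role_label)        # role-type loss
    second_loss = cross_entropy(action_logits, action_label)   # decision-action loss
    return role_weight * first_loss + action_weight * second_loss
```

A smaller weight on the first loss keeps the role-type prediction in a supporting role relative to the main task; the appropriate ratio is a tuning choice that the disclosure leaves open.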

[0083] Therefore, through the above technical solution, the model can be trained simultaneously with an auxiliary task (i.e., the task of predicting the type of the role). This ensures that the model can better learn the characteristics of different virtual roles, while also ensuring that the model can distinguish between different types of virtual roles. This improves the matching degree between the output decision action and the virtual role, thus enhancing the accuracy of the output results.

[0084] In one possible embodiment, as shown in Figure 3, the feature processing unit includes a linear transformation subunit, a logical transformation subunit, and a processing subunit. The linear transformation subunit is used to perform a linear transformation on the character encoding feature to obtain a first processed feature. The logical transformation subunit is used to map the first processed feature to obtain a second processed feature with a value between 0 and 1. The processing subunit is used to perform element-wise product processing on the second processed feature and the game feature to obtain the processed game feature.

[0085] The linear transformation subunit performs a linear transformation on the role encoding features, that is, it transforms the input role encoding features using a matrix or linear mapping that satisfies additivity and scalar multiplication. The logical transformation subunit can perform its transformation using a logistic function, such as E(x) = 1 / (1 + e^{-x}). The logistic function is a commonly used sigmoid function with central symmetry, and it constrains the output to values between 0 and 1, which aids network learning and optimization. The processing subunit can multiply the corresponding elements of the game features and the second processed feature one by one to obtain the corresponding processed game features.
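The three subunits can be sketched as a small gating function: a linear map of the role encoding, a logistic squashing to the range 0 to 1, and an element-wise product with the game features. The function names, the explicit `weights` and `bias` arguments, and the requirement that the linear map outputs one value per game feature are assumptions made for this illustration.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gate_game_features(game_features, role_encoding, weights, bias):
    """Adjust game features with a gate computed from the role encoding.

    weights has one row per game feature and one column per role-encoding value.
    """
    # Linear transformation subunit: first processed feature.
    first = [sum(w * r for w, r in zip(row, role_encoding)) + b
             for row, b in zip(weights, bias)]
    # Logical transformation subunit: second processed feature in (0, 1).
    second = [sigmoid(v) for v in first]
    # Processing subunit: element-wise product with the game features.
    return [g * s for g, s in zip(game_features, second)]
```

With all-zero weights and bias every gate value is 0.5, so each game feature is halved; learned weights would instead pass or suppress individual features depending on the role encoding.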

[0086] Therefore, through the above technical solution, the input character encoding features can be transformed by the feature processing unit, and the game features can be adjusted based on the transformed second processed features. This makes the processed game features match the input virtual character more closely, so that game features are processed differently for different character encoding features. At the same time, it enhances the action decision model's understanding of the character encoding features, thereby reducing the model's confusion between different virtual characters. Furthermore, by learning adjustable weight parameters, the feature processing unit can automatically and selectively control which information in the game features is passed on or filtered out under different conditions. This allows the model's behavior to adapt to the input character encoding features, so that the action decision model performs well across various virtual character control scenarios. In addition, through the configuration of the feature processing unit, the accuracy of the model can be improved and the number of parameters in the model can be significantly reduced, thereby reducing computational and memory overhead and broadening the model's applicability.

[0087] Based on the same inventive concept, this disclosure also provides a control device for virtual objects. As shown in Figure 4, the device 10 includes:

[0088] The acquisition module 100 is used to acquire the game state data of the target virtual object in the current game.

[0089] The first determining module 200 is used to determine the game features and character encoding features corresponding to the target virtual object based on the game state data, wherein the game features are used to represent the data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features are used to represent the features of the virtual character corresponding to the target virtual object;

[0090] The second determining module 300 is used to obtain the decision action output by the action decision model based on the game features, the character encoding features, and the action decision model, and to control the target virtual object according to the decision action. The action decision model includes a feature processing unit and a prediction unit. The feature processing unit is used to adjust the game features based on the character encoding features to obtain processed game features. The prediction unit is used to determine the decision action based on the processed game features. The action decision model is trained based on training samples corresponding to multiple virtual characters.

[0091] Optionally, the character encoding features include at least one of the identification features of the virtual character, the object identifier of the target attack object of the virtual character, and the equipment identifier of the equipment carried by the virtual character.

[0092] Optionally, the role encoding feature includes the object identifier of the target attack object of the virtual role;

[0093] The first determining module includes:

[0094] The first determining submodule is used to determine, based on the game state data, each attack target of the virtual character within the most recent target time period;

[0095] The second determining submodule is used to rank the attack objects in descending order of their number of attacks, determine the top N objects as the target attack objects, and add the object identifiers of the target attack objects to the role encoding feature.
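The selection of target attack objects can be sketched as a count over a recent time window followed by a top-N cut. The event layout (a list of `(timestamp, target_id)` pairs), the function name, and the parameters `now`, `window_seconds`, and `top_n` are hypothetical and serve only to illustrate the ranking step.

```python
from collections import Counter


def top_attack_targets(attack_events, now, window_seconds, top_n):
    """Return the IDs of the top_n most attacked objects in the recent window.

    attack_events: iterable of (timestamp, target_id) pairs (assumed layout).
    """
    counts = Counter(
        target_id
        for timestamp, target_id in attack_events
        if now - window_seconds <= timestamp <= now
    )
    # most_common sorts by attack count in descending order.
    return [target_id for target_id, _ in counts.most_common(top_n)]
```

The returned identifiers would then be appended to the role encoding feature; how ties are broken and how fewer than N targets are padded are left open here.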

[0096] Optionally, the action decision model is determined by a training module, which includes:

[0097] The generation submodule is used to generate training samples based on the game state data of multiple virtual characters in historical matches. The training samples include training game features corresponding to the training virtual objects, training decision actions corresponding to the training game features, training character types and training character encoding features of the virtual characters corresponding to the training virtual objects, and the multiple virtual characters include virtual characters under multiple character types.

[0098] The first processing submodule is used to input the training game features and the training character encoding features into the feature processing unit to obtain the training game processing features;

[0099] The second processing submodule is used to input the training game processing features into the prediction unit to obtain the predicted decision action and the predicted role type.

[0100] The third determination submodule is used to determine the target loss based on the training role type and the training decision action, as well as the prediction decision action and the prediction role type.

[0101] The training submodule is used to train the action decision model based on the target loss.

[0102] Optionally, the generation submodule includes:

[0103] The fourth determination submodule is used to determine the training character type of the virtual character corresponding to the training virtual object in the game state data and the triggered training decision action in the game state data for each historical game, and to obtain the game image frame when the training decision action is triggered.

[0104] The fifth determining submodule is used to extract features from the game image frame when each training decision action is triggered, and to obtain the training game features and training character encoding features corresponding to the training decision action.
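The sample generation performed by these two submodules can be sketched as a loop over historical matches that keeps only the frames in which a decision action was triggered. The record layout (dictionaries with `role_type`, `frames`, and `action` keys), the `TrainingSample` class, and the caller-supplied `extract_features` function are assumptions for illustration; the disclosure does not specify a data format.

```python
from dataclasses import dataclass


@dataclass
class TrainingSample:
    game_features: list    # training game features for the frame
    role_encoding: list    # training character encoding features
    decision_action: int   # label for the main task
    role_type: int         # label for the auxiliary task


def build_samples(match_logs, extract_features):
    """Build one sample per frame in which a decision action was triggered.

    extract_features(frame) must return (game_features, role_encoding).
    """
    samples = []
    for match in match_logs:
        role_type = match["role_type"]
        for frame in match["frames"]:
            action = frame.get("action")
            if action is None:
                continue  # no decision action was triggered in this frame
            game_features, role_encoding = extract_features(frame)
            samples.append(
                TrainingSample(game_features, role_encoding, action, role_type)
            )
    return samples
```

Because the labels come directly from recorded matches, this sketch needs no manual annotation step.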

[0105] Optionally, the third determining submodule includes:

[0106] The sixth determining submodule is used to determine the first loss based on the training role type corresponding to the training sample and the predicted role type;

[0107] The seventh determination submodule is used to determine the second loss based on the training decision action and the prediction decision action;

[0108] The eighth determining submodule is used to determine the target loss based on the first loss and the second loss.

[0109] Optionally, the feature processing unit includes a linear transformation subunit, a logical transformation subunit, and a processing subunit, wherein the linear transformation subunit is used to perform a linear transformation on the role encoding feature to obtain a first processed feature; the logical transformation subunit is used to map the first processed feature to obtain a second processed feature with a value between 0 and 1; and the processing subunit is used to perform element-wise product processing on the second processed feature and the game feature to obtain the processed game feature.

[0110] Referring now to Figure 5, a structural schematic of an electronic device (e.g., a terminal device or a server) 600 suitable for implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in Figure 5 is merely an example and should not be construed as limiting the functionality and scope of the embodiments disclosed herein.

[0111] As shown in Figure 5, electronic device 600 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 601, which can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 602 or a program loaded from storage device 608 into random access memory (RAM) 603. RAM 603 also stores various programs and data required for the operation of electronic device 600. Processing device 601, ROM 602, and RAM 603 are interconnected via bus 604. Input/output (I/O) interface 605 is also connected to bus 604.

[0112] Typically, the following devices can be connected to I/O interface 605: input devices 606 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output devices 607 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; storage devices 608 including, for example, magnetic tapes, hard disks, etc.; and communication devices 609. Communication device 609 allows electronic device 600 to communicate with other devices, wirelessly or by wire, to exchange data. Although Figure 5 shows an electronic device 600 with various devices, it should be understood that it is not required to implement or possess all of the devices shown. More or fewer devices may alternatively be implemented or possessed.

[0113] In particular, according to embodiments of this disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments of this disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via a communication device 609, or installed from a storage device 608, or installed from a ROM 602. When the computer program is executed by the processing device 601, it performs the functions defined in the methods of embodiments of this disclosure.

[0114] It should be noted that the computer-readable medium described in this disclosure can be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. A computer-readable storage medium can be, for example,—but not limited to—an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In this disclosure, a computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus, or device. In this disclosure, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals can take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wires, optical fibers, RF (radio frequency), etc., or any suitable combination thereof.

[0115] In some implementations, clients and servers can communicate using any currently known or future-developed network protocol, such as HTTP (Hypertext Transfer Protocol), and can be interconnected by digital data communication (e.g., a communication network) of any form or medium. Examples of communication networks include local area networks ("LANs"), wide area networks ("WANs"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.

[0116] The aforementioned computer-readable medium may be included in the aforementioned electronic device; or it may exist independently and not assembled into the electronic device.

[0117] The aforementioned computer-readable medium carries one or more programs that, when executed by the electronic device, cause the electronic device to: acquire game state data of a target virtual object in the current game; determine game features and character encoding features corresponding to the target virtual object based on the game state data, wherein the game features represent data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features represent features of the virtual character corresponding to the target virtual object; obtain a decision action output by the action decision model based on the game features, the character encoding features, and the action decision model, and control the target virtual object based on the decision action, wherein the action decision model includes a feature processing unit and a prediction unit, the feature processing unit is used to adjust the game features based on the character encoding features to obtain processed game features, the prediction unit is used to determine the decision action based on the processed game features, and the action decision model is trained based on training samples corresponding to multiple virtual characters.

[0118] Computer program code for performing the operations of this disclosure can be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (e.g., via the Internet using an Internet service provider).

[0119] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.

[0120] The modules described in the embodiments of this disclosure can be implemented in software or in hardware. The name of a module does not necessarily limit the module itself; for example, an acquisition module can also be described as "a module that acquires game state data of a target virtual object in the current game".

[0121] The functions described above in this document can be performed, at least in part, by one or more hardware logic components. For example, exemplary types of hardware logic components that can be used include, without limitation: Field Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems-on-Chip (SoCs), Complex Programmable Logic Devices (CPLDs), and so on.

[0122] In the context of this disclosure, a machine-readable medium can be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can be, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

[0123] According to one or more embodiments of this disclosure, Example 1 provides a method for controlling a virtual object, wherein the method includes:

[0124] Obtain the game state data of the target virtual object in the current game;

[0125] Based on the game state data, the game features and character encoding features corresponding to the target virtual object are determined, wherein the game features are used to represent the data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features are used to represent the features of the virtual character corresponding to the target virtual object;

[0126] Based on the game features, the character encoding features, and the action decision model, a decision action output by the action decision model is obtained, and the target virtual object is controlled according to the decision action. The action decision model includes a feature processing unit and a prediction unit. The feature processing unit is used to adjust the game features based on the character encoding features to obtain processed game features. The prediction unit is used to determine the decision action based on the processed game features. The action decision model is trained based on training samples corresponding to multiple virtual characters.

[0127] According to one or more embodiments of this disclosure, Example 2 provides the method of Example 1, wherein the character encoding feature includes at least one of the identification feature of the virtual character, the object identifier of the target attack object of the virtual character, and the equipment identifier of the equipment carried by the virtual character.

[0128] According to one or more embodiments of this disclosure, Example 3 provides the method of Example 2, wherein the role encoding feature includes the object identifier of the target attack object of the virtual role;

[0129] The step of determining the game features and character encoding features corresponding to the target virtual object based on the game state data includes:

[0130] Based on the game state data, determining each attack object of the virtual character within the most recent target time period;

[0131] Ranking the attack objects in descending order of their number of attacks, determining the top N objects as the target attack objects, and adding the object identifiers of the target attack objects to the role encoding feature.

[0132] According to one or more embodiments of this disclosure, Example 4 provides the method of Example 1, wherein the action decision model is determined in the following manner:

[0133] Training samples are generated based on the game state data of multiple virtual characters in historical matches. The training samples include training game features corresponding to the training virtual objects, training decision actions corresponding to the training game features, training character types and training character encoding features of the virtual characters corresponding to the training virtual objects, and the multiple virtual characters include virtual characters under various character types.

[0134] The training game features and the training character encoding features are input into the feature processing unit to obtain the training game processing features;

[0135] The training game processing features are input into the prediction unit to obtain the predicted decision action and the predicted role type;

[0136] The target loss is determined based on the training role type and the training decision action, as well as the prediction decision action and the prediction role type.

[0137] The action decision model is trained based on the target loss.

[0138] According to one or more embodiments of this disclosure, Example 5 provides the method of Example 4, wherein generating training samples based on game state data from historical matches of multiple virtual characters includes:

[0139] For each historical game's game state data, determine the training character type of the virtual character corresponding to the training virtual object in the game state data and the triggered training decision action in the game state data, and obtain the game image frame when the training decision action is triggered.

[0140] For each game image frame triggered by the training decision action, feature extraction is performed on the game image frame to obtain the training game features and training character encoding features corresponding to the training decision action.

[0141] According to one or more embodiments of this disclosure, Example 6 provides the method of Example 4, wherein determining the target loss based on the training role type and the training decision action, and the prediction decision action and the prediction role type, includes:

[0142] The first loss is determined based on the training role type corresponding to the training sample and the predicted role type;

[0143] The second loss is determined based on the training decision action and the prediction decision action;

[0144] The target loss is determined based on the first loss and the second loss.

[0145] According to one or more embodiments of this disclosure, Example 7 provides the method of Example 1, wherein the feature processing unit includes a linear transformation subunit, a logical transformation subunit, and a processing subunit, wherein the linear transformation subunit is used to perform a linear transformation on the role encoding feature to obtain a first processed feature; the logical transformation subunit is used to map the first processed feature to obtain a second processed feature with a value between 0 and 1; and the processing subunit is used to perform element-wise product processing on the second processed feature and the game feature to obtain the processed game feature.

[0146] According to one or more embodiments of this disclosure, Example 8 provides a control device for a virtual object, wherein the device includes:

[0147] The acquisition module is used to acquire the game state data of the target virtual object in the current game.

[0148] The first determining module is used to determine the game features and character encoding features corresponding to the target virtual object based on the game state data, wherein the game features are used to represent the data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features are used to represent the features of the virtual character corresponding to the target virtual object;

[0149] The second determining module is used to obtain the decision action output by the action decision model based on the game features, the character encoding features, and the action decision model, and to control the target virtual object according to the decision action. The action decision model includes a feature processing unit and a prediction unit. The feature processing unit is used to adjust the game features based on the character encoding features to obtain processed game features. The prediction unit is used to determine the decision action based on the processed game features. The action decision model is trained based on training samples corresponding to multiple virtual characters.

[0150] According to one or more embodiments of the present disclosure, Example 9 provides a computer-readable medium having a computer program stored thereon that, when executed by a processing device, implements the steps of the method described in any one of Examples 1-7.

[0151] According to one or more embodiments of this disclosure, Example 10 provides an electronic device, including:

[0152] A storage device on which computer programs are stored;

[0153] A processing device for executing the computer program in the storage device to implement the steps of any one of the methods in Examples 1-7.

[0154] The above description is merely a preferred embodiment of this disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of this disclosure is not limited to technical solutions formed by specific combinations of the above-described technical features, but should also cover other technical solutions formed by arbitrary combinations of the above-described technical features or their equivalents without departing from the above-described concept. One example is a technical solution formed by interchanging the above features with technical features having similar functions that are disclosed in (but not limited to) this disclosure.

[0155] Furthermore, while the operations are described in a specific order, this should not be construed as requiring these operations to be performed in the specific order shown or in a sequential order. In certain environments, multitasking and parallel processing may be advantageous. Similarly, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of this disclosure. Certain features described in the context of individual embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented individually or in any suitable sub-combination in multiple embodiments.

[0156] Although the subject matter has been described using language specific to structural features and / or methodological logic, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above. Rather, the specific features and actions described above are merely illustrative forms of implementing the claims. Regarding the apparatus in the above embodiments, the specific manner in which the various modules perform their operations has been described in detail in the embodiments relating to the method, and will not be elaborated upon here.

Claims

1. A method for controlling a virtual object, characterized in that, The method includes: Obtain the game state data of the target virtual object in the current game; Based on the game state data, the game features and character encoding features corresponding to the target virtual object are determined, wherein the game features are used to represent the data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features are used to represent the features of the virtual character corresponding to the target virtual object; Based on the game features, the character encoding features, and the action decision model, a decision action output by the action decision model is obtained, and the target virtual object is controlled according to the decision action. The action decision model includes a feature processing unit and a prediction unit. The feature processing unit is used to adjust the game features based on the character encoding features to obtain processed game features. The prediction unit is used to determine the decision action based on the processed game features. The action decision model is trained based on training samples corresponding to multiple virtual characters.

2. The method according to claim 1, characterized in that the character encoding features include at least one of the following: an identification feature of the virtual character, an object identifier of a target attack object of the virtual character, and an equipment identifier of equipment carried by the virtual character.

3. The method of claim 2, wherein the character encoding features include the object identifier of the target attack object of the virtual character; and the step of determining the game features corresponding to the target virtual object based on the game state data includes: determining, based on the game state data, the attack objects of the virtual character within the most recent target time period; and ranking the attack objects from highest to lowest by the number of attacks against each, identifying the top N objects as the target attack objects, and adding the object identifiers of the target attack objects to the character encoding features.
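The target selection recited in claim 3 can be illustrated with a minimal sketch. It assumes the game state data exposes an attack log of (timestamp, target identifier) pairs; the function name `top_attack_targets`, the log layout, and the sample identifiers are hypothetical and are not taken from the disclosure.

```python
from collections import Counter


def top_attack_targets(attack_log, now, window, n):
    """Return the N most-attacked target ids within the most recent time window.

    attack_log: iterable of (timestamp, target_id) pairs (assumed layout).
    now, window: the window covers timestamps in [now - window, now].
    """
    recent = [target for ts, target in attack_log if now - window <= ts <= now]
    counts = Counter(recent)
    # most_common sorts by attack count, highest first.
    return [target for target, _ in counts.most_common(n)]


# Hypothetical attack log for one virtual character.
log = [
    (1.0, "minion_3"),
    (2.5, "hero_7"),
    (3.0, "hero_7"),
    (3.5, "tower_1"),
    (4.0, "hero_7"),
    (4.2, "tower_1"),
]
targets = top_attack_targets(log, now=5.0, window=3.0, n=2)
print(targets)
```

The returned identifiers would then be appended to the character encoding features; how they are encoded numerically is left open here.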

4. The method of claim 1, wherein the action decision model is determined in the following way: generating training samples based on game state data of multiple virtual characters in historical matches, wherein the training samples include training game features corresponding to training virtual objects, training decision actions corresponding to the training game features, and training character types and training character encoding features of the virtual characters corresponding to the training virtual objects, and the multiple virtual characters include virtual characters of multiple character types; inputting the training game features and the training character encoding features into the feature processing unit to obtain training game processing features; inputting the training game processing features into the prediction unit to obtain a predicted decision action and a predicted character type; determining a target loss based on the training character type, the training decision action, the predicted decision action, and the predicted character type; and training the action decision model based on the target loss.

5. The method of claim 4, wherein generating training samples based on the game state data of multiple virtual characters in historical matches includes: for the game state data of each historical match, determining the training character type of the virtual character corresponding to the training virtual object in the game state data and the training decision actions triggered in the game state data, and obtaining the game image frame at which each training decision action is triggered; and for each game image frame at which a training decision action is triggered, performing feature extraction on the game image frame to obtain the training game features and training character encoding features corresponding to that training decision action.

6. The method of claim 4, wherein determining the target loss based on the training character type, the training decision action, the predicted decision action, and the predicted character type includes: determining a first loss based on the training character type corresponding to the training sample and the predicted character type; determining a second loss based on the training decision action and the predicted decision action; and determining the target loss based on the first loss and the second loss.
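One possible reading of the two-part loss in claim 6 is sketched below. The claim only states that the target loss is determined from the first and second losses; the use of cross-entropy for each term, the weighted sum, and the names `cross_entropy`, `target_loss`, `type_weight`, and `action_weight` are assumptions made for illustration.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under predicted probabilities."""
    return -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)


def target_loss(pred_type_probs, true_type, pred_action_probs, true_action,
                type_weight=1.0, action_weight=1.0):
    # First loss: predicted character type vs. training character type.
    first_loss = cross_entropy(pred_type_probs, true_type)
    # Second loss: predicted decision action vs. training decision action.
    second_loss = cross_entropy(pred_action_probs, true_action)
    # Assumed combination: a weighted sum of the two losses.
    return type_weight * first_loss + action_weight * second_loss


# Hypothetical prediction-unit outputs for one training sample.
loss = target_loss(
    pred_type_probs=[0.5, 0.5], true_type=0,
    pred_action_probs=[0.25, 0.75], true_action=1,
)
print(loss)
```

Under this reading, the character-type term acts as an auxiliary objective that encourages the processed features to retain information about which character is being controlled.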

7. The method of claim 1, wherein the feature processing unit includes a linear transformation subunit, a logical transformation subunit, and a processing subunit; the linear transformation subunit is used to perform a linear transformation on the character encoding features to obtain a first processed feature; the logical transformation subunit is used to map the first processed feature to obtain a second processed feature with values between 0 and 1; and the processing subunit is used to compute the product of the second processed feature and the game features to obtain the processed game features.
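The three subunits of claim 7 can be read as a gating mechanism, sketched below in plain Python. The claim does not name the mapping to the 0 to 1 range or the kind of product; this sketch assumes a sigmoid and an element-wise product, and the weights, bias, and function names are hypothetical.

```python
import math


def linear(x, weights, bias):
    """Linear transformation subunit: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    """Assumed mapping of a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def process_game_features(game_features, char_encoding, weights, bias):
    first = linear(char_encoding, weights, bias)   # first processed feature
    second = [sigmoid(z) for z in first]           # second processed feature
    if len(second) != len(game_features):
        raise ValueError("gate and game features must have the same length")
    # Processing subunit: element-wise product (assumed) of gate and features.
    return [g * f for g, f in zip(second, game_features)]


# Hypothetical 2-dimensional character encoding gating 3 game features.
weights = [[0.0, 0.0], [2.0, 0.0], [-2.0, 0.0]]
bias = [0.0, 0.0, 0.0]
processed = process_game_features(
    game_features=[4.0, 1.0, 1.0],
    char_encoding=[1.0, 0.0],
    weights=weights,
    bias=bias,
)
print(processed)
```

Because each gate value lies between 0 and 1, the character encoding can only attenuate individual game features, which lets one shared model emphasize different inputs for different virtual characters.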

8. A control device for a virtual object, characterized in that the device includes: an acquisition module used to acquire game state data of a target virtual object in a current game; a first determining module used to determine, based on the game state data, game features and character encoding features corresponding to the target virtual object, wherein the game features represent the data corresponding to the current game image frame of the target virtual object in the current game, and the character encoding features represent the features of the virtual character corresponding to the target virtual object; and a second determining module used to obtain, based on the game features, the character encoding features, and an action decision model, a decision action output by the action decision model, and to control the target virtual object according to the decision action; wherein the action decision model includes a feature processing unit and a prediction unit, the feature processing unit is used to adjust the game features based on the character encoding features to obtain processed game features, the prediction unit is used to determine the decision action based on the processed game features, and the action decision model is trained based on training samples corresponding to multiple virtual characters.

9. A computer-readable medium having stored thereon a computer program, characterized in that, when executed by a processing device, the program implements the steps of the method according to any one of claims 1-7.

10. An electronic device, comprising: a storage device on which a computer program is stored; and a processing device for executing the computer program in the storage device to implement the steps of the method according to any one of claims 1-7.