Method, apparatus, medium and device for controlling virtual object

By reconstructing the action space and training an action decision model based on data from multiple virtual characters, the problem of high NPC control costs in existing technologies is solved, and efficient and accurate operation control of multiple virtual characters is achieved.

CN117654046B · Active · Publication Date: 2026-06-23 · BEIJING ZITIAO NETWORK TECH CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: BEIJING ZITIAO NETWORK TECH CO LTD
Filing Date: 2023-11-29
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

In existing technologies, NPC control methods based on model training can only take over fixed NPCs, resulting in high model deployment and maintenance costs, and a lack of versatility and generalization.

Method used

By reconstructing the action space, the actions output by multiple virtual characters are standardized into multiple candidate action types, and the action decision model is trained based on the training data corresponding to multiple virtual characters, so as to realize the operation control of virtual objects under multiple virtual characters.

Benefits of technology

It reduces the workload of model deployment, training, and maintenance, improves the versatility and generalization of action decision models, and enhances the accuracy and efficiency of virtual object control.



Abstract

The present disclosure relates to a virtual object control method, device, medium and equipment, the method comprising: obtaining game state data of a target virtual object in a current game; determining a game feature corresponding to the target virtual object according to the game state data, wherein the game feature comprises a game scene feature and a character feature of a virtual character corresponding to the target virtual object; obtaining action decision parameters output by an action decision model and action parameters corresponding to a plurality of candidate action types according to the game scene feature, the character feature and the action decision model; determining target action parameters of a target action according to the action decision parameters and the action parameters corresponding to the candidate action types, and controlling the target virtual object according to the target action parameters.


Description

Technical Field

[0001] This disclosure relates to the field of computer technology, and more specifically, to a method, apparatus, medium, and device for controlling virtual objects.

Background Technology

[0002] The application of artificial intelligence (AI) in the gaming industry is rapidly developing, bringing unprecedented progress and innovation to the gaming experience. Based on deep learning technology, NPCs (Non-Player Characters) in games can be automatically controlled by models, enabling them to realistically interact with players and provide a more challenging and personalized gaming experience.

[0003] In related technologies, the model used to control an NPC is obtained through model training, and one AI model can take over only one fixed NPC. As a result, the deployment and maintenance costs of such models are very high.

Summary of the Invention

[0004] This summary section is provided to briefly introduce the concepts, which will be described in detail in the detailed description section below. This summary section is not intended to identify key or essential features of the claimed technical solution, nor is it intended to limit the scope of the claimed technical solution.

[0005] In a first aspect, this disclosure provides a method for controlling a virtual object, the method comprising:

[0006] Obtaining the game state data of the target virtual object in the current game;

[0007] Based on the game state data, the game features corresponding to the target virtual object are determined, wherein the game features include game scene features and the character features of the virtual character corresponding to the target virtual object;

[0008] Based on the game scene features, the character features, and the action decision model, the action decision parameters output by the action decision model and the action parameters corresponding to multiple candidate action types are obtained. The action decision model is trained based on training data corresponding to multiple virtual characters. The action decision parameters are used to determine the type of the target action from multiple candidate action types. The action parameters are used to represent the control parameters of the action corresponding to the candidate action type.

[0009] Based on the action decision parameters and the action parameters corresponding to the candidate action types, the target action parameters are determined, and the target virtual object is controlled according to the target action parameters.

[0010] In a second aspect, this disclosure provides a control device for a virtual object, the device comprising:

[0011] The acquisition module is used to acquire the game state data of the target virtual object in the current game.

[0012] The first determining module is used to determine the game features corresponding to the target virtual object based on the game state data, wherein the game features include game scene features and the character features of the virtual character corresponding to the target virtual object;

[0013] The processing module is used to obtain the action decision parameters output by the action decision model and the action parameters corresponding to multiple candidate action types based on the game scene features, the character features and the action decision model. The action decision model is trained based on training data corresponding to multiple virtual characters. The action decision parameters are used to determine the type of the target action from multiple candidate action types. The action parameters are used to represent the control parameters of the action corresponding to the candidate action type.

[0014] The second determining module is used to determine the target action parameters of the target action based on the action decision parameters and the action parameters corresponding to the candidate action types, and to control the target virtual object based on the target action parameters.

[0015] In a third aspect, this disclosure provides a computer-readable medium having a computer program stored thereon, which, when executed by a processing device, implements the steps of the method described in the first aspect.

[0016] In a fourth aspect, this disclosure provides an electronic device, comprising:

[0017] A storage device on which computer programs are stored;

[0018] A processing device for executing the computer program in the storage device to implement the steps of the method described in the first aspect.

[0019] The aforementioned technical solution reconstructs the model's action space, standardizing the actions output by multiple virtual characters into multiple candidate actions. This allows for feature extraction based on current game state data when controlling virtual objects, yielding game features. Action prediction is then based on these game features to obtain the target action and its parameters. Thus, deploying a single action decision model enables the control of virtual objects across multiple virtual characters, effectively reducing the workload, complexity, and resource consumption required for model deployment, training, and maintenance. Furthermore, training the action decision model on training data corresponding to multiple virtual characters improves its versatility and generalization, enabling it to better handle game scenarios not covered by data from a single virtual character, thereby enhancing the accuracy of virtual object control.

[0020] Other features and advantages of this disclosure will be described in detail in the following detailed description section.

Description of the Drawings

[0021] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic, and the components and elements are not necessarily drawn to scale. In the drawings:

[0022] Figure 1 is a flowchart of a method for controlling a virtual object according to one embodiment of the present disclosure.

[0023] Figure 2 is a schematic diagram of the processing flow of an action decision model provided according to one embodiment of the present disclosure.

[0024] Figure 3 is a block diagram of a control device for a virtual object provided according to one embodiment of the present disclosure.

[0025] Figure 4 is a schematic diagram of the structure of an electronic device suitable for implementing embodiments of the present disclosure.

Detailed Description

[0026] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0027] It should be understood that the steps described in the method embodiments of this disclosure may be performed in different orders and / or in parallel. Furthermore, the method embodiments may include additional steps and / or omit the steps shown. The scope of this disclosure is not limited in this respect.

[0028] The term "comprising" and its variations as used herein are open-ended inclusions, meaning "including but not limited to". The term "based on" means "at least partially based on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms will be given in the description below.

[0029] It should be noted that the concepts of "first" and "second" mentioned in this disclosure are used only to distinguish different devices, modules or units, and are not used to limit the order of functions performed by these devices, modules or units or their interdependencies.

[0030] It should be noted that the terms "a" and "a plurality of" used in this disclosure are illustrative rather than restrictive, and those skilled in the art should understand that, unless otherwise expressly indicated in the context, they should be understood as "one or more".

[0031] The names of messages or information exchanged between multiple devices in the embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.

[0032] It is understood that before using the technical solutions disclosed in the various embodiments of this disclosure, users should be informed of the types, scope of use, and usage scenarios of the personal information involved in this disclosure in an appropriate manner in accordance with relevant laws and regulations, and user authorization should be obtained.

[0033] For example, upon receiving a user's active request, a prompt message is sent to the user to explicitly inform them that the requested operation will require the acquisition and use of the user's personal information. This allows the user to independently choose whether to provide personal information to the software or hardware, such as the electronic device, application, server, or storage medium performing the operations of this disclosed technical solution, based on the prompt message.

[0034] As an optional but non-limiting implementation, in response to a user's active request, sending a prompt message to the user can be done via a pop-up window, where the prompt message can be presented in text format. Furthermore, the pop-up window can also include a selection control allowing the user to choose "agree" or "disagree" to provide personal information to the electronic device.

[0035] It is understood that the above notification and user authorization process are merely illustrative and do not constitute a limitation on the implementation of this disclosure. Other methods that comply with relevant laws and regulations may also be applied to the implementation of this disclosure.

[0036] Meanwhile, it is understood that the data involved in this technical solution (including but not limited to the data itself, the acquisition or use of the data) shall comply with the requirements of relevant laws, regulations and related provisions.

[0037] Figure 1 is a flowchart of a virtual object control method according to an embodiment of the present disclosure. As shown in Figure 1, the method may include:

[0038] In step 11, the game state data of the target virtual object in the current game is obtained.

[0039] The target virtual object can be an object not controlled by a real player in the current game. Game state data can be obtained through the game data interface during the current game. This game state data represents the target virtual object's data in the current game. For example, the game state data may include object information of the target virtual object in the current game, as well as object information of other virtual objects in the game. This object information may include one or a combination of information such as health, mana, economy, kills, assists, and deaths. As another example, the game state data may also include game scene information, such as the survival status of minions and monsters in the current game, and the health of turrets.

[0040] It should be noted that the types of data included in the game state data can be preset according to the actual application scenario. In this step, the game state data can be obtained through the corresponding data acquisition interface in the game.

[0041] In step 12, based on the game state data, the game features corresponding to the target virtual object are determined, wherein the game features include game scene features and the character features of the virtual character corresponding to the target virtual object.

[0042] In this step, feature extraction can be performed on the game state data obtained from the game to obtain the game features. The dimensions for feature extraction can be pre-defined. For example, game scene features represent features common to a game match; their dimensions can include the current game frame rate, the total kills, assists, deaths, and economy of both sides, whether there are attack targets or summoned creatures within the attack range of the target virtual object, and the survival status of objects other than virtual characters (such as minions, jungle monsters, and towers).

[0043] Character features can be used to represent the combat characteristics of a virtual character in a game, that is, the characteristics associated with the virtual character. The dimensions included can be configured according to the relevant attributes of the virtual character.

[0044] The dimensions of the game features can be pre-set based on the actual application scenario, and this disclosure does not impose any limitations on this. Therefore, during feature extraction, extraction can be performed on each feature dimension based on the game state data during the game, obtaining the value of the corresponding dimension, thereby obtaining the game feature.
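The per-dimension extraction described above can be sketched as follows. This is a minimal illustration, assuming the game state data arrives as nested dictionaries; the field names (`team_kills`, `enemy_in_range`, and so on) and the chosen dimension lists are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List

# Hypothetical, pre-set feature dimensions (assumed for illustration only).
SCENE_DIMS: List[str] = ["team_kills", "team_deaths", "team_economy", "enemy_in_range"]
CHARACTER_DIMS: List[str] = ["level", "health", "mana", "economy"]


def extract_features(state: Dict[str, float], dims: List[str]) -> List[float]:
    """Read one value per configured dimension; missing fields default to 0.0."""
    return [float(state.get(name, 0.0)) for name in dims]


def build_game_features(game_state: Dict[str, Dict[str, float]]) -> Dict[str, List[float]]:
    """Split game state data into game scene features and character features."""
    return {
        "scene": extract_features(game_state.get("scene", {}), SCENE_DIMS),
        "character": extract_features(game_state.get("character", {}), CHARACTER_DIMS),
    }


example_state = {
    "scene": {"team_kills": 3, "team_deaths": 1, "team_economy": 5200, "enemy_in_range": 1},
    "character": {"level": 4, "health": 820, "mana": 300},
}
features = build_game_features(example_state)
```

Keeping the dimension lists as configuration means the same extraction routine can serve any virtual character, which matches the idea that feature dimensions are pre-set per application scenario.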

[0045] In step 13, based on game scene features, character features, and action decision model, action decision parameters output by the action decision model and action parameters corresponding to multiple candidate action types are obtained. The action decision model is trained based on training data corresponding to multiple virtual characters. The action decision parameters are used to determine the type of target action from multiple candidate action types. The action parameters are used to represent the control parameters of the action corresponding to the candidate action type.

[0046] As an example, the action decision model has multiple output heads. One of them outputs the action decision parameters, and each of the remaining output heads outputs the action parameters corresponding to one candidate action type.

[0047] Figure 2 shows an example of the action decision model. In this example, the action decision model contains nine output heads F1-F9. F1 outputs the action decision parameters, and each of F2-F9 outputs the action parameters of one candidate action type, so that multiple output parameters are obtained from the action decision model.

[0048] In this embodiment, the action decision model is trained based on training data corresponding to multiple virtual characters. The obtained action decision model can automatically control the virtual objects under the multiple virtual characters. Thus, by deploying one action decision model, the virtual objects under multiple virtual characters can be taken over, effectively simplifying the deployment and maintenance of the model.

[0049] The applicant's research revealed that different virtual characters typically differ in how their skills are operated or used. Related technologies train a corresponding model for each virtual character, and that model can configure different output heads based on the skills of that virtual character. However, the action decision model in this embodiment is used to control the actions of multiple virtual characters, and since the skills of different virtual characters differ, output heads cannot be configured individually for each virtual character's different skills. Therefore, this disclosure reconstructs the action space.

[0050] For example, the output actions of multiple virtual characters can be reconstructed into actions of multiple candidate action types. Each candidate action type corresponds to an output head in the action decision model, which can unify the learning of actions of the same type under multiple virtual characters and enhance the generalization of the model.
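A minimal sketch of a shared model with one decision head and one head per candidate action type is given below. It is an assumption-laden illustration, not the disclosed network: the weights are random, the trunk is a single ReLU layer, and the head names and sizes (`decision`, `direction`, `position`, `target`) are hypothetical. Only four heads are shown rather than the nine of the Figure 2 example.

```python
import random
from typing import Dict, List


def linear(x: List[float], weights: List[List[float]]) -> List[float]:
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def make_weights(rng: random.Random, out_dim: int, in_dim: int) -> List[List[float]]:
    """Random weights stand in for trained parameters (illustration only)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


class MultiHeadActionModel:
    """Shared trunk followed by several independent output heads."""

    def __init__(self, in_dim: int, hidden_dim: int, head_dims: Dict[str, int], seed: int = 0):
        rng = random.Random(seed)
        self.trunk = make_weights(rng, hidden_dim, in_dim)
        self.heads = {name: make_weights(rng, dim, hidden_dim) for name, dim in head_dims.items()}

    def forward(self, features: List[float]) -> Dict[str, List[float]]:
        hidden = [max(0.0, h) for h in linear(features, self.trunk)]  # ReLU
        return {name: linear(hidden, w) for name, w in self.heads.items()}


# Hypothetical head layout: one decision head plus one head per candidate action type.
HEAD_DIMS = {"decision": 9, "direction": 8, "position": 16, "target": 10}
model = MultiHeadActionModel(in_dim=8, hidden_dim=12, head_dims=HEAD_DIMS)
outputs = model.forward([0.5] * 8)
```

Because every character shares the `direction`, `position`, and `target` heads, actions of the same type are learned in one place regardless of which character produced the training data.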

[0051] In step 14, the target action parameters are determined based on the action decision parameters and the action parameters corresponding to the candidate action types, and the target virtual object is controlled based on the target action parameters.

[0052] The effective action type can be determined from the candidate action types according to the action decision parameters. The action parameters of that effective action type can then be used as the target action parameters of the target action, and the target virtual object can be controlled based on the target action parameters.

[0053] The aforementioned technical solution reconstructs the model's action space, standardizing the actions output by multiple virtual characters into multiple candidate actions. This allows for feature extraction based on current game state data when controlling virtual objects, yielding game features. Action prediction is then based on these game features to obtain the target action and its parameters. Thus, deploying a single action decision model enables the control of virtual objects across multiple virtual characters, effectively reducing the workload, complexity, and resource consumption required for model deployment, training, and maintenance. Furthermore, training the action decision model on training data corresponding to multiple virtual characters improves its versatility and generalization, enabling it to better handle game scenarios not covered by data from a single virtual character, thereby enhancing the accuracy of virtual object control.

[0054] In one possible embodiment, the character features of the virtual character corresponding to the target virtual object include the skill features of the virtual character and the skill mask features of the virtual character, wherein the skill mask features are used to represent the skill usage range of the skills corresponding to the virtual character.

[0055] As described above, the character features can be used to represent the combat characteristics of the virtual character in the game. Their dimensions can include the virtual character's equipment identifier, level, health, mana, economy, number of kills, number of deaths, number of assists, etc., to represent the attack characteristics of the virtual character in the current game.

[0056] The character features also include the skill features corresponding to the virtual character. The skill features indicate whether each skill of the virtual character is available, as well as skill level information and the like, so as to provide reliable data support for subsequent skill output decisions, avoid predicting the output of unavailable skills, and ensure the accuracy of subsequent decision actions.

[0057] The skill mask feature is used to represent the skill usage range of the skills corresponding to the virtual character. In this embodiment, skills can be divided into three types: directional, positional, and target-oriented. Directional skills are skills whose output is controlled by direction. For example, if the 360 degrees around the virtual object in the game scene are divided into several sections, each corresponding to a direction number, the direction of the skill output can be determined based on the direction number. Positional skills are skills whose output is controlled by position. The game scene can be divided into multiple squares using a chessboard grid, each corresponding to a position number, so the position point of the skill output can be determined based on the position number. Target-oriented skills are skills whose output is controlled by the attack target. For example, if the attackable objects in the game scene are numbered sequentially according to a set order to obtain object number identifiers, the target of the skill release can be determined based on the object number identifier. This set order can be chosen based on the actual application scenario, and this disclosure does not limit it.
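The direction and position numbering can be illustrated with a short sketch. The equal-width sectors, the row-major grid numbering, and the function and parameter names are assumptions made for this example; the disclosure does not fix a particular numbering scheme.

```python
def direction_number(angle_deg: float, num_sectors: int) -> int:
    """Map an angle to the index of the equal-width sector that contains it."""
    sector_width = 360.0 / num_sectors
    return int((angle_deg % 360.0) // sector_width)


def position_number(x: float, y: float, map_width: float, map_height: float,
                    cols: int, rows: int) -> int:
    """Map a point to a row-major cell number in a cols x rows chessboard grid."""
    col = min(int(x / map_width * cols), cols - 1)   # clamp points on the far edge
    row = min(int(y / map_height * rows), rows - 1)
    return row * cols + col


sector = direction_number(95.0, num_sectors=8)        # 45-degree sectors
cell = position_number(50.0, 50.0, 100.0, 100.0, cols=4, rows=4)
```

Discretizing directions and positions this way turns each skill's control parameter into a choice among a fixed number of slots, which is what lets a single output head serve every character that has a skill of that type.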

[0058] Different virtual characters possess different skills, and even skills of the same type may differ between virtual characters. For example, a directional skill might have a usable range of 0-180 degrees for virtual character U1, while for virtual character U2 it might be 90-180 degrees. Therefore, in this embodiment, skill mask features define the usable range of each skill of the virtual character, thereby constraining the action parameters that are subsequently output.

[0059] For example, suppose the usable range of a directional skill of virtual character U2 is 90-180 degrees. Each dimension of the skill mask feature for the directional skill corresponds to a direction number identifier. If the skill can be used within the direction range corresponding to a direction number identifier, the value of that dimension is 1; if the skill cannot be used within that direction range, the value of that dimension is 0, thereby constraining the usable range of the skill.
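As a concrete illustration of such a mask, the sketch below marks a direction number as usable only when its whole sector lies inside the skill's angular range. The eight equal sectors and the "whole sector must fit" rule are assumptions of this example, not requirements stated in the disclosure.

```python
from typing import List


def direction_mask(min_deg: float, max_deg: float, num_sectors: int) -> List[int]:
    """Return 1 for sectors fully inside [min_deg, max_deg], otherwise 0."""
    width = 360.0 / num_sectors
    mask = []
    for i in range(num_sectors):
        start, end = i * width, (i + 1) * width
        mask.append(1 if start >= min_deg and end <= max_deg else 0)
    return mask


mask_u1 = direction_mask(0, 180, num_sectors=8)    # usable range 0-180 degrees
mask_u2 = direction_mask(90, 180, num_sectors=8)   # usable range 90-180 degrees
```

With 45-degree sectors, the two characters share the same mask length but differ in which entries are enabled, so the same directional head can serve both.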

[0060] As another example, a game can involve two factions battling each other, with each faction containing multiple virtual objects. In one scenario, virtual objects within the same faction cannot attack each other. In the skill mask feature corresponding to a target-oriented skill, each dimension corresponds to an object number identifier. Therefore, for a target-oriented skill of the virtual character corresponding to the target virtual object, the value of each dimension whose object number identifier belongs to an object in the same faction is 0, while the values of the other dimensions are 1. Similarly, in another scenario, if the target-oriented skill of virtual character U1 cannot attack objects of type X, then the value in the skill mask feature corresponding to the object number identifier of each type X object is 0.
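The faction and type rules for a target-oriented mask can be sketched as below. The object records, their `faction` and `type` fields, and the `blocked_types` parameter are hypothetical names introduced for this example; list order plays the role of the object number identifier.

```python
from typing import Dict, FrozenSet, List


def target_mask(objects: List[Dict[str, str]], own_faction: str,
                blocked_types: FrozenSet[str] = frozenset()) -> List[int]:
    """One entry per numbered object: 0 for allies or blocked types, 1 otherwise."""
    return [
        0 if obj["faction"] == own_faction or obj["type"] in blocked_types else 1
        for obj in objects
    ]


# Objects listed in their (assumed) numbering order.
scene_objects = [
    {"faction": "blue", "type": "hero"},
    {"faction": "red", "type": "hero"},
    {"faction": "red", "type": "X"},
    {"faction": "red", "type": "minion"},
]
mask_same_faction = target_mask(scene_objects, own_faction="blue")
mask_no_type_x = target_mask(scene_objects, own_faction="blue", blocked_types=frozenset({"X"}))
```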

[0061] Therefore, through the above technical solution, the skill usage range of different skills of each virtual character can be represented in the action decision model that is common to multiple virtual characters. By adding a skill mask mechanism, the action decision model can be effectively constrained in its exploration of the action space of decision actions, reducing the possibility of unreasonable skill selection. This improves the accuracy and rationality of the action decision model's prediction of the virtual object's actions, and further enhances the control accuracy and performance of the target virtual object.
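The disclosure describes the skill mask as an input feature and does not state how the constraint is enforced at the output. One common way to make such a mask binding, shown here purely as an assumption, is to ignore masked entries when picking the highest-scoring slot of a head. The function name `masked_argmax` is hypothetical.

```python
from typing import List, Optional


def masked_argmax(scores: List[float], mask: List[int]) -> Optional[int]:
    """Index of the best-scoring entry among those the mask allows, or None."""
    allowed = [i for i, m in enumerate(mask) if m == 1]
    if not allowed:
        return None  # the skill has no usable slot in the current situation
    return max(allowed, key=lambda i: scores[i])


# The unmasked maximum is slot 0, but the mask only allows slots 2 and 3.
choice = masked_argmax([0.9, 0.1, 0.5, 0.7], [0, 0, 1, 1])
```

Restricting the choice in this way means an out-of-range direction or a forbidden target can never be selected, whatever scores the model assigns to it.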

[0062] In one possible embodiment, the character features of the virtual character corresponding to the target virtual object further include the skill state features corresponding to the virtual character. The skill state features are used to represent the state of the virtual character after using a skill. The skill state features include a state identifier for each state and a state feature corresponding to each state identifier.

[0063] In practical applications, different virtual characters may enter a certain state after releasing a skill. For example, virtual character U1 will be in a frozen state after releasing skill J1, while virtual character U2 will be in an attack buff state after releasing skill J2. Since different virtual characters have different skill states, this embodiment represents the state after skill release based on skill state characteristics.

[0064] Accordingly, determining the game characteristics corresponding to the target virtual object based on the game state data may include:

[0065] For each state identifier in the skill state feature, a state table is queried based on the character identifier of the virtual character, wherein the state table contains the state identifiers associated with the character identifier of each of the multiple virtual characters.

[0066] This state table can be pre-configured based on the skill states of each virtual character. For example, virtual character U1 corresponds to states T1 and T2, virtual character U2 corresponds to state T3, and virtual character U3 corresponds to states T4 and T5. Different state identifiers can represent states of the same kind. For example, T1 and T4 both represent buff states, where T1 represents a speed boost and T4 represents a health regeneration boost.

[0067] If the state identifier in the skill state feature is found among the state identifiers associated with the character identifier of the virtual character, the state feature corresponding to the state identifier is determined based on the game state data;

[0068] If the state identifier in the skill state feature is not found among the state identifiers associated with the character identifier of the virtual character, the state feature corresponding to the state identifier is filled with a set value.

[0069] As an example, there are M state identifiers corresponding to these virtual characters. For one of these state identifiers, T1, if the current virtual character is U1, then state identifier T1 can be queried from the state identifiers associated with virtual character U1 based on the state table. If found, the state feature corresponding to the state identifier can be determined based on the game state data. This state feature can include the number of state layers, state duration, and remaining state time. The number of state layers indicates the number of stacks of the current state; for example, hitting an object with a skill will stack one layer, and the more objects hit, the more state layers there are. The corresponding state features can be obtained by feature extraction from the game state data after the skill is released.

[0070] For another state identifier, T4, the state table can likewise be used to look for T4 among the state identifiers associated with virtual character U1. Since it is not found, the state feature is obtained by filling in the set value. For example, the set value can be -1, which means that the virtual object is not in this state.
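The lookup-or-fill procedure can be sketched as follows, using the example table (U1: T1, T2; U2: T3; U3: T4, T5) and the example set value of -1. The per-state fields (`layers`, `duration`, `remaining`), the dictionary layout of the game state data, and the 0.0 default for an owned but currently inactive state are assumptions of this sketch.

```python
from typing import Dict, List

# State table from the example above: character identifier -> associated state identifiers.
STATE_TABLE = {"U1": {"T1", "T2"}, "U2": {"T3"}, "U3": {"T4", "T5"}}
ALL_STATE_IDS = ["T1", "T2", "T3", "T4", "T5"]      # the M state identifiers, fixed order
STATE_FIELDS = ("layers", "duration", "remaining")   # assumed per-state feature fields
FILL_VALUE = -1.0                                    # set value: "not in this state"


def skill_state_features(character_id: str,
                         observed: Dict[str, Dict[str, float]]) -> List[float]:
    """Build a fixed-length state feature vector shared by all characters."""
    owned = STATE_TABLE.get(character_id, set())
    features: List[float] = []
    for state_id in ALL_STATE_IDS:
        if state_id in owned:
            # Found in the table: read the values from the game state data.
            values = observed.get(state_id, {})
            features.extend(float(values.get(field, 0.0)) for field in STATE_FIELDS)
        else:
            # Not associated with this character: fill with the set value.
            features.extend([FILL_VALUE] * len(STATE_FIELDS))
    return features


u1_features = skill_state_features("U1", {"T1": {"layers": 2, "duration": 5.0, "remaining": 3.5}})
```

Every character produces a vector of the same length, so one model input layout covers all characters even though each character only ever fills its own slots with real values.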

[0071] Therefore, the above technical solution can further accurately represent the state of the virtual character after the skill is released, providing effective data support for subsequent accurate action decisions.

[0072] In one possible embodiment, an exemplary implementation of determining the target action parameters of the target action based on the action decision parameters and the action parameters corresponding to the candidate action types may include:

[0073] The configuration file of the virtual character is queried according to the action decision parameters, wherein the configuration file indicates the action type corresponding to the action decision parameters of the virtual character.

[0074] The action parameters of the candidate action type corresponding to the queried action decision parameters are determined as the target action parameters of the target action.

[0075] The action decision model described in this disclosure is a general model applicable to multiple virtual characters. Therefore, a configuration file for each virtual character can be generated in advance. For example, different action decision parameters can be configured in the configuration file to represent the corresponding action types for the virtual character.

[0076] For example, action decision parameters may include: 0 empty, 1 movement, 2 basic attack, 3 first skill, 4 second skill, 5 third skill, 6 fourth skill, 7 battlefield skill, 8 recall.

[0077] Specifically, the action parameters for movement and battlefield skill direction can be represented based on the representation of the application range of directional skills described above. The action parameters for basic attack and battlefield skill target parameters can be represented based on the representation of the application range of target skills described above. The action parameters for battlefield skill position parameters can be represented based on the representation of the application range of position skills described above.

[0078] If the output action decision parameter is 3, the decided action is the first skill of the virtual character. If the virtual character is U1, the configuration file of virtual character U1 can be queried based on action decision parameter 3, and the configuration file indicates that the action type corresponding to the action decision parameter "3 first skill" is target-oriented. The target-oriented action parameters can then be selected from the action parameters of the candidate action types as the target action parameters of the target action. For example, if the value of the 8th bit of those action parameters is 1, the attack target of the target action is the 8th object. The target virtual object can then be controlled based on the target action parameters.

[0079] If, in another scenario, the action decision parameter output for the virtual character U2 is 3, then the configuration file of the virtual character U2 can be queried based on the action decision parameter 3. Based on the configuration file, it can be determined that the action type corresponding to the action decision parameter "3 first skill" is positional. Then, the positional action parameters can be selected from the action parameters of the candidate action types as the target action parameters of the target action, and the target virtual object can be controlled based on the target action parameters.
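The lookup described for U1 and U2 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the dictionary layout of the configuration file, the action-type names, and the function `select_target_action_params` are all assumptions introduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical names for the candidate action types.
DIRECTIONAL, TARGET, POSITIONAL = "directional", "target", "positional"

# Hypothetical per-character configuration:
# action decision parameter -> action type for that character.
CHARACTER_CONFIG: Dict[str, Dict[int, str]] = {
    "U1": {3: DIRECTIONAL},  # first skill of U1 is directional
    "U2": {3: POSITIONAL},   # first skill of U2 is positional
}


def select_target_action_params(
    character_id: str,
    decision_param: int,
    head_outputs: Dict[str, List[int]],
) -> Tuple[str, List[int]]:
    """Query the character's configuration, then pick the matching parameters."""
    action_type = CHARACTER_CONFIG[character_id][decision_param]
    return action_type, head_outputs[action_type]


# Illustrative action parameters for each candidate action type.
head_outputs = {
    DIRECTIONAL: [0, 0, 1, 0],
    TARGET: [0, 1, 0],
    POSITIONAL: [0, 0, 0, 1, 0, 0],
}
```

With the same decision parameter 3, the call returns the directional parameters for U1 and the positional parameters for U2, which is the behavior the two scenarios describe.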

[0080] Therefore, in the above technical solution, the model can be trained according to the candidate action types. When making action decisions, the corresponding action type can be queried from the relevant configuration file based on the action decision parameters. Then, based on this action type, the corresponding target action parameters are selected from the action parameters of the candidate action types output by the action decision model. Thus, in the action decision model, actions of the same type from different virtual characters can be placed in the same parameter output head for learning, enabling the action decision model to learn a unified operational approach for actions of the same type and further enhancing the model's generalization ability.

[0081] In one possible embodiment, the action decision model is determined in the following manner:

[0082] Training samples are generated based on the game state data of multiple virtual characters under the same character type in historical matches. The training samples contain the training game features, training action decision parameters, and training action parameters corresponding to the training decision actions in the game state data.

[0083] In practical applications, depending on the game experience and scene settings, each character type can correspond to a profession in the game. The operation of virtual characters under different character types may differ significantly, while the operation of virtual characters under the same character type has certain similarities. Based on this, in this embodiment of the disclosure, an action decision model corresponding to a character type can be trained on the historical games of multiple virtual characters under that character type. This action decision model is used to make decisions and control the actions of each virtual character under that character type.

[0084] As an example, after obtaining authorization from the user, game match data for the user within a historical time period can be retrieved. Then, the data can be grouped based on the character type of the virtual character corresponding to the user-controlled virtual object, grouping historical matches of virtual characters with the same character type into the same group.
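The grouping step can be sketched as below. The record fields (`match_id`, `character`), the character-type names, and the function name are assumptions made for illustration only; the disclosure does not specify a data layout.

```python
from collections import defaultdict
from typing import Dict, List


def group_matches_by_character_type(
    matches: List[dict], character_types: Dict[str, str]
) -> Dict[str, List[dict]]:
    """Place historical matches of characters with the same type in one group."""
    groups = defaultdict(list)
    for match in matches:
        character_type = character_types[match["character"]]
        groups[character_type].append(match)
    return dict(groups)


# Hypothetical mapping from virtual character to character type.
character_types = {"U1": "mage", "U2": "mage", "U3": "tank"}
matches = [
    {"match_id": 1, "character": "U1"},
    {"match_id": 2, "character": "U3"},
    {"match_id": 3, "character": "U2"},
]
grouped = group_matches_by_character_type(matches, character_types)
```

Each resulting group would then feed the training of one action decision model for that character type.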

[0085] In one possible embodiment, an exemplary implementation of generating training samples based on game state data from historical matches of multiple virtual characters of the same character type may include:

[0086] For each historical game's game state data, determine the triggered training decision action in the game state data, and obtain the game image frame when the training decision action is triggered.

[0087] The game state data of a historical game can be the recording data of that game. By performing image recognition on the recording data, the triggered decision actions, such as skill output or return to base, can be determined. This can be achieved with image recognition models or algorithms, which this disclosure does not limit.

[0088] After determining the corresponding decision action, the recorded image frame that triggers the decision action can be extracted and used as the game image frame for training the decision action.

[0089] Subsequently, for each game image frame in which a training decision action is triggered, the training action decision parameters corresponding to the training decision action are determined, and feature extraction is performed on the game image frame to obtain the training game features and training action parameters associated with the training decision action, wherein the training game features have the same dimension as the game features corresponding to the target virtual object.

[0090] For example, based on the historical match data of the virtual character U1, the identified training decision action is skill 1, and the corresponding training action decision parameter is determined to be 3. Feature extraction is performed on the game image frames based on preset game feature dimensions to obtain the training game features. The dimensions of the action parameters under different candidate action types can be preset, and the training action parameters can be determined based on the operation of the decision action. For example, if the decision action is to use skill 1 on attack target Q, the value of the dimension for object Q in the training action parameters can be set to 1, and the values of the other dimensions can be set to 0 to obtain the training action parameters.
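The encoding in this example can be sketched as a one-hot vector. The sample layout, the index assigned to object Q, and the helper names are assumptions for illustration.

```python
from typing import Dict, List


def one_hot(index: int, size: int) -> List[int]:
    """Return a vector with 1 at `index` and 0 in every other dimension."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(size)]


def build_training_sample(
    game_features: List[float],
    decision_param: int,
    target_index: int,
    num_targets: int,
) -> Dict[str, object]:
    """Bundle the features, decision parameter, and action parameters."""
    return {
        "features": list(game_features),
        "decision_param": decision_param,
        "action_params": one_hot(target_index, num_targets),
    }


# Skill 1 (decision parameter 3) used on object Q, assumed here to be index 2
# among 5 possible targets.
sample = build_training_sample(
    [0.2, 0.7, 1.0], decision_param=3, target_index=2, num_targets=5
)
```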

[0091] Therefore, the above technical solution can automatically generate training samples based on the game state data of multiple virtual characters of the same character type in their historical matches, effectively reducing the workload required to obtain sample data.

[0092] The model is trained using the training game features as input and the training action decision parameters and the training action parameters as target outputs to obtain the action decision model.

[0093] As an example, the training game features can be input into the model to obtain predicted action decision parameters and predicted action parameters, respectively. The action decision model can be implemented based on a multilayer perceptron (MLP). Then, a first loss can be determined from the predicted action decision parameters and the training action decision parameters, and a second loss can be determined from the training action parameters and the predicted action parameters of the action type corresponding to the training decision action. The model can then be trained based on both the first and second losses.
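A forward pass of such a model can be sketched in plain Python as below. The single shared hidden layer, the layer sizes, the weight initialization, and the head names are assumptions; the disclosure only states that the model can be MLP-based and has one output head for the action decision parameters plus one head per candidate action type.

```python
import random
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def linear(weights: Matrix, bias: List[float], x: List[float]) -> List[float]:
    """Compute weights @ x + bias for one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values: List[float]) -> List[float]:
    return [max(0.0, v) for v in values]


class MultiHeadMLP:
    """Shared hidden layer followed by a decision head and per-type heads."""

    def __init__(self, in_dim: int, hidden_dim: int, num_actions: int,
                 head_dims: Dict[str, int], seed: int = 0) -> None:
        rng = random.Random(seed)

        def init(rows: int, cols: int) -> Matrix:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w_hidden, self.b_hidden = init(hidden_dim, in_dim), [0.0] * hidden_dim
        self.w_action, self.b_action = init(num_actions, hidden_dim), [0.0] * num_actions
        self.heads = {name: (init(dim, hidden_dim), [0.0] * dim)
                      for name, dim in head_dims.items()}

    def forward(self, features: List[float]) -> Tuple[List[float], Dict[str, List[float]]]:
        hidden = relu(linear(self.w_hidden, self.b_hidden, features))
        action_logits = linear(self.w_action, self.b_action, hidden)
        param_logits = {name: linear(w, b, hidden)
                        for name, (w, b) in self.heads.items()}
        return action_logits, param_logits


# Nine decision outputs match the example values 0 to 8; head sizes are illustrative.
model = MultiHeadMLP(in_dim=4, hidden_dim=8, num_actions=9,
                     head_dims={"directional": 8, "target": 10, "positional": 16})
action_logits, param_logits = model.forward([0.1, 0.2, 0.3, 0.4])
```

The first loss would be computed on `action_logits`, and the second loss only on the entry of `param_logits` whose type matches the training decision action.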

[0094] As another example, the training game features include the skill mask features corresponding to the training virtual character;

[0095] The step of training the model using the training game features as model input and the training action decision parameters and the training action parameters as the model target output to obtain the action decision model may include:

[0096] The training game features are input into the model to obtain the predicted action decision parameters output by the model and the predicted action parameters corresponding to each candidate action type.

[0097] The action loss is determined based on the predicted action decision parameters and the training action decision parameters.

[0098] The action loss can be determined based on the cross-entropy loss algorithm.

[0099] The parameter loss is determined based on the action parameters, among the predicted action parameters, of the type corresponding to the training decision action, together with the training action parameters and the skill mask features.

[0100] The skill mask feature is used to represent the skill's usage range. For example, the usage range of a directional skill for a virtual object is 90-180 degrees. If the action parameters corresponding to the type of the training decision action fall within this skill's usage range, the parameter loss is determined based on the predicted action parameters and the training action parameters. If the action parameters corresponding to the type of the training decision action do not fall within this skill's usage range, it indicates that the predicted action parameters are invalid. In this case, it is not necessary to determine the parameter loss based on the predicted action parameters and the training action parameters. Therefore, by combining the skill mask feature, the probability of invalid parameters can be adjusted to be infinitely close to 0 during model training, making the model's gradient updates more reasonable and improving the model's accuracy in predicting actions.

[0101] The parameter loss can be calculated based on the cross-entropy loss algorithm.
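One possible reading of the mask-aware parameter loss is sketched below. It makes two assumptions that the disclosure does not spell out: the skill mask is a 0/1 vector over the parameter dimensions, and invalid dimensions are excluded from the softmax so that their probability is exactly 0. When the training label lies outside the usage range, the loss is skipped, as the description of the skill mask feature suggests.

```python
import math
from typing import List


def masked_softmax(logits: List[float], mask: List[int]) -> List[float]:
    """Softmax over valid dimensions only; masked dimensions get probability 0."""
    valid = [logit for logit, m in zip(logits, mask) if m]
    if not valid:
        raise ValueError("mask excludes every dimension")
    peak = max(valid)  # subtract the maximum for numerical stability
    exps = [math.exp(logit - peak) if m else 0.0 for logit, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def masked_parameter_loss(logits: List[float], label_index: int, mask: List[int]) -> float:
    """Cross-entropy on the masked distribution; skipped for out-of-range labels."""
    if not mask[label_index]:
        return 0.0  # label outside the skill's usage range: no parameter loss
    probs = masked_softmax(logits, mask)
    return -math.log(probs[label_index])


logits = [1.0, 2.0, 3.0, 4.0]
mask = [0, 1, 1, 0]  # hypothetical usage range covering dimensions 1 and 2
probs = masked_softmax(logits, mask)
loss_in_range = masked_parameter_loss(logits, 2, mask)
loss_out_of_range = masked_parameter_loss(logits, 0, mask)
```

Because masked dimensions never receive probability mass, gradients are driven only by parameters the skill can actually use.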

[0102] The target loss of the model is determined based on the action loss and the parameter loss, and the model is trained based on the target loss.

[0103] As an example, the action loss and parameter loss can be weighted and fused to obtain the target loss. The weights corresponding to the action loss and parameter loss can be set based on the actual application scenario, and this disclosure does not limit this. The model can be trained based on the target loss using model training methods commonly used in this field.
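The weighted fusion can be sketched as a simple weighted sum. The function name and the default weights are assumptions; the disclosure leaves the weights to the application scenario.

```python
def fuse_losses(
    action_loss: float,
    parameter_loss: float,
    action_weight: float = 1.0,
    parameter_weight: float = 1.0,
) -> float:
    """Weighted sum of the action loss and the parameter loss."""
    if action_weight < 0 or parameter_weight < 0:
        raise ValueError("loss weights must be non-negative")
    return action_weight * action_loss + parameter_weight * parameter_loss


# Example: weight the parameter loss twice as heavily as the action loss.
target_loss = fuse_losses(0.5, 0.25, action_weight=1.0, parameter_weight=2.0)
```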

[0104] Therefore, through the above technical solution, during model training, historical game data corresponding to multiple virtual characters of the same character type can be combined for training to obtain an action decision model applicable to a single character type. This increases the model's generalization ability and allows it to better handle game scenarios not covered in the data of a single virtual character, thereby enhancing the accuracy of the model's output actions. Furthermore, while achieving the same level of control accuracy, the action decision model obtained in this disclosure requires less data per virtual character on average, alleviating the pressure on game data caching.

[0105] Based on the same inventive concept, this disclosure also provides a control device for virtual objects. As shown in Figure 3, the device 10 includes:

[0106] The acquisition module 100 is used to acquire the game state data of the target virtual object in the current game.

[0107] The first determining module 200 is used to determine the game features corresponding to the target virtual object based on the game state data, wherein the game features include game scene features and the character features of the virtual character corresponding to the target virtual object;

[0108] The processing module 300 is used to obtain the action decision parameters output by the action decision model and the action parameters corresponding to multiple candidate action types based on the game scene features, the character features and the action decision model. The action decision model is trained based on training data corresponding to multiple virtual characters. The action decision parameters are used to determine the type of the target action from multiple candidate action types. The action parameters are used to represent the control parameters of the action corresponding to the candidate action type.

[0109] The second determining module 400 is used to determine the target action parameters of the target action based on the action decision parameters and the action parameters corresponding to the candidate action types, and to control the target virtual object based on the target action parameters.

[0110] Optionally, the character features of the virtual character corresponding to the target virtual object include the skill features of the virtual character and the skill mask features of the virtual character, wherein the skill mask features are used to represent the skill usage range of the skills corresponding to the virtual character.

[0111] Optionally, the character features of the virtual character corresponding to the target virtual object also include the skill state features of the virtual character. The skill state features are used to represent the state of the virtual character after using a skill. The skill state features include a state identifier for each state and a state feature corresponding to each state identifier.

[0112] The first determining module includes:

[0113] The first query submodule is used to query, for each state identifier in the skill state features, the state table based on the character identifier of the virtual character, wherein the state table contains the state identifiers associated with the character identifiers of the multiple virtual characters respectively.

[0114] The first determining submodule is used to determine the state feature corresponding to the state identifier based on the game state data if the state identifier in the skill state feature is found in the state identifier associated with the character identifier of the virtual character; if the state identifier in the skill state feature is not found in the state identifier associated with the character identifier of the virtual character, the state feature corresponding to the state identifier is filled with a set value.
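The query-and-fill behavior of these two submodules can be sketched as follows. The state names, the table layout, and the fill value of -1.0 are assumptions for illustration; the disclosure only says that missing states are filled with a set value.

```python
from typing import Dict, List, Set

FILL_VALUE = -1.0  # hypothetical "set value" for states a character does not have

# Hypothetical state table: character identifier -> associated state identifiers.
STATE_TABLE: Dict[str, Set[str]] = {
    "U1": {"stealth", "shield"},
    "U2": {"shield"},
}

# Fixed order of state identifiers so every character yields the same dimension.
ALL_STATE_IDS: List[str] = ["stealth", "shield", "enraged"]


def build_skill_state_features(character_id: str, game_state: Dict[str, float]) -> List[float]:
    """Read known states from the game state data; fill the rest with FILL_VALUE."""
    known = STATE_TABLE.get(character_id, set())
    features = []
    for state_id in ALL_STATE_IDS:
        if state_id in known:
            features.append(float(game_state.get(state_id, 0.0)))
        else:
            features.append(FILL_VALUE)
    return features


u2_features = build_skill_state_features("U2", {"shield": 1.0, "stealth": 1.0})
```

Filling unknown states keeps the feature vector the same length for every virtual character, which is what lets a single model consume all of them.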

[0115] Optionally, the action decision model is determined through training by a training module, which includes:

[0116] The generation submodule is used to generate training samples based on the game state data of multiple virtual characters under the same character type in the historical game. The training samples include the training game features, training action decision parameters and training action parameters corresponding to the training decision actions in the game state data.

[0117] The training submodule is used to train the model by taking the training game features as model input and the training action decision parameters and the training action parameters as the target output of the model, so as to obtain the action decision model.

[0118] Optionally, the generation submodule includes:

[0119] The second determination submodule is used to determine the triggered training decision action in the game state data for each historical game, and to obtain the game image frame when the training decision action is triggered.

[0120] The third determining submodule is used to determine the training action decision parameters corresponding to each training decision action triggered in the game image frame, and to extract features from the game image frame to obtain the training game features and training action parameters associated with the training decision action, wherein the training game features have the same dimension as the game features corresponding to the target virtual object.

[0121] Optionally, the training game features include the skill mask features corresponding to the training virtual character;

[0122] The training submodule includes:

[0123] The processing submodule is used to input the training game features into the model and obtain the predicted action decision parameters output by the model and the predicted action parameters corresponding to each candidate action type.

[0124] The fourth determination submodule is used to determine the action loss based on the predicted action decision parameters and the training action decision parameters;

[0125] The fifth determination submodule is used to determine the parameter loss based on the action parameters of the type corresponding to the training decision action in the predicted action parameters, as well as the training action parameters and the skill mask features;

[0126] The update submodule is used to determine the target loss of the model based on the action loss and the parameter loss, and to train the model based on the target loss.

[0127] Optionally, the action decision model has multiple output heads, one of which is used to output the action decision parameters, and each of the remaining output heads is used to output the action parameters corresponding to one candidate action type.

[0128] Optionally, the second determining module includes:

[0129] The second query submodule is used to query the configuration file of the virtual character based on the action decision parameters, wherein the configuration file indicates the action type corresponding to the action decision parameters of the virtual character;

[0130] The sixth determination submodule is used to determine the action parameters of the candidate action type corresponding to the queried action decision parameters as the target action parameters of the target action.

[0131] Referring now to Figure 4, a structural schematic of an electronic device (e.g., a terminal device or a server) 600 suitable for implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and fixed terminals such as digital TVs and desktop computers. The electronic device shown in Figure 4 is merely an example and should not be construed as limiting the functionality and scope of the embodiments disclosed herein.

[0132] As shown in Figure 4, electronic device 600 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 601, which can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 602 or a program loaded from storage device 608 into random access memory (RAM) 603. RAM 603 also stores various programs and data required for the operation of electronic device 600. Processing device 601, ROM 602, and RAM 603 are interconnected via bus 604. Input/output (I/O) interface 605 is also connected to bus 604.

[0133] Typically, the following devices can be connected to I/O interface 605: input devices 606 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output devices 607 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; storage devices 608 including, for example, magnetic tapes, hard disks, etc.; and communication devices 609. Communication device 609 allows electronic device 600 to communicate with other devices, wirelessly or by wire, to exchange data. Although Figure 4 shows an electronic device 600 with various devices, it should be understood that not all of the devices shown are required to be implemented or present. More or fewer devices may alternatively be implemented or present.

[0134] In particular, according to embodiments of this disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments of this disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via a communication device 609, or installed from a storage device 608, or installed from a ROM 602. When the computer program is executed by the processing device 601, it performs the functions defined in the methods of embodiments of this disclosure.

[0135] It should be noted that the computer-readable medium described in this disclosure can be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. A computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In this disclosure, a computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus, or device. In this disclosure, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals can take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wires, optical fibers, RF (radio frequency), etc., or any suitable combination thereof.

[0136] In some implementations, clients and servers can communicate using any currently known or future-developed network protocol such as HTTP (Hypertext Transfer Protocol) and can interconnect with digital data communication (e.g., communication networks) of any form or medium. Examples of communication networks include local area networks (“LANs”), wide area networks (“WANs”), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.

[0137] The aforementioned computer-readable medium may be included in the aforementioned electronic device, or it may exist independently without being assembled into the electronic device.

[0138] The aforementioned computer-readable medium carries one or more programs that, when executed by the electronic device, cause the electronic device to: acquire game state data of a target virtual object in the current game; determine game features corresponding to the target virtual object based on the game state data, wherein the game features include game scene features and character features of the virtual character corresponding to the target virtual object; obtain action decision parameters output by the action decision model and action parameters corresponding to multiple candidate action types based on the game scene features, the character features, and the action decision model, wherein the action decision model is trained based on training data corresponding to multiple virtual characters, the action decision parameters are used to determine the type of the target action from multiple candidate action types, and the action parameters are used to represent the control parameters of the action corresponding to the candidate action type; determine the target action parameters of the target action based on the action decision parameters and the action parameters corresponding to the candidate action types, and control the target virtual object based on the target action parameters.

[0139] Computer program code for performing the operations of this disclosure can be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or can be connected to an external computer (e.g., via the Internet using an Internet service provider).

[0140] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.

[0141] The modules described in the embodiments of this disclosure can be implemented in software or in hardware. The name of a module does not necessarily limit the module itself; for example, an acquisition module can also be described as "a module that acquires game state data of a target virtual object in the current game".

[0142] The functions described above in this document can be performed, at least in part, by one or more hardware logic components. For example, exemplary types of hardware logic components that can be used, without limitation, include: Field Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems-on-Chip (SoCs), Complex Programmable Logic Devices (CPLDs), and so on.

[0143] In the context of this disclosure, a machine-readable medium can be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can be, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

[0144] According to one or more embodiments of this disclosure, Example 1 provides a method for controlling a virtual object, wherein the method includes:

[0145] Obtain the game state data of the target virtual object in the current game;

[0146] Based on the game state data, the game features corresponding to the target virtual object are determined, wherein the game features include game scene features and the character features of the virtual character corresponding to the target virtual object;

[0147] Based on the game scene features, the character features, and the action decision model, the action decision parameters output by the action decision model and the action parameters corresponding to multiple candidate action types are obtained. The action decision model is trained based on training data corresponding to multiple virtual characters. The action decision parameters are used to determine the type of the target action from multiple candidate action types. The action parameters are used to represent the control parameters of the action corresponding to the candidate action type.

[0148] Based on the action decision parameters and the action parameters corresponding to the candidate action types, the target action parameters are determined, and the target virtual object is controlled according to the target action parameters.

[0149] According to one or more embodiments of this disclosure, Example 2 provides the method of Example 1, wherein the character features of the virtual character corresponding to the target virtual object include the skill features corresponding to the virtual character and the skill mask features corresponding to the virtual character, and the skill mask features are used to represent the skill usage range of the skills corresponding to the virtual character.

[0150] According to one or more embodiments of this disclosure, Example 3 provides the method of Example 2, wherein the character features of the virtual character corresponding to the target virtual object further include the skill state features of the virtual character, the skill state features are used to represent the state of the virtual character after using a skill, and the skill state features include a state identifier of each state and a state feature corresponding to each state identifier.

[0151] The step of determining the game features corresponding to the target virtual object based on the game state data includes:

[0152] For each state identifier in the skill state features, a state table is queried based on the character identifier of the virtual character, wherein the state table contains the state identifiers associated with the character identifiers of the multiple virtual characters respectively;

[0153] If the state identifier in the skill state features is found among the state identifiers associated with the character identifier of the virtual character, the state feature corresponding to the state identifier is determined based on the game state data;

[0154] If the state identifier in the skill state features is not found among the state identifiers associated with the character identifier of the virtual character, the state feature corresponding to the state identifier is filled with a set value.

[0155] According to one or more embodiments of this disclosure, Example 4 provides the method of Example 1, wherein the action decision model is determined in the following manner:

[0156] Based on the game state data of multiple virtual characters under the same character type in historical matches, training samples are generated. The training samples contain training game features, training action decision parameters and training action parameters corresponding to the training decision actions in the game state data.

[0157] The model is trained using the training game features as input and the training action decision parameters and the training action parameters as target outputs to obtain the action decision model.

[0158] According to one or more embodiments of this disclosure, Example 5 provides the method of Example 4, wherein generating training samples based on game state data from historical matches of multiple virtual characters of the same character type includes:

[0159] For each historical game's game state data, determine the triggered training decision action in the game state data, and obtain the game image frame when the training decision action is triggered.

[0160] For each game image frame in which a training decision action is triggered, the training action decision parameters corresponding to the training decision action are determined, and feature extraction is performed on the game image frame to obtain the training game features and training action parameters associated with the training decision action, wherein the training game features have the same dimension as the game features corresponding to the target virtual object.

[0161] According to one or more embodiments of this disclosure, Example 6 provides the method of Example 4, wherein the training game features include skill mask features corresponding to the training virtual character;

[0162] The step of training the model using the training game features as model input and the training action decision parameters and the training action parameters as the model target output to obtain the action decision model includes:

[0163] The training game features are input into the model to obtain the predicted action decision parameters output by the model and the predicted action parameters corresponding to each candidate action type.

[0164] The action loss is determined based on the predicted action decision parameters and the training action decision parameters;

[0165] Based on the predicted action parameters, the action parameters corresponding to the type of the training decision action, the training action parameters, and the skill mask features, the parameter loss is determined.

[0166] The target loss of the model is determined based on the action loss and the parameter loss, and the model is trained based on the target loss.
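
One way to read the loss construction in Example 6 is sketched below. The text does not specify the loss functions, so this example assumes a cross-entropy action loss, a masked mean-squared-error parameter loss in which the skill mask weights each parameter dimension, and a weighted sum for the target loss. The function names and the `weight` argument are hypothetical.

```python
import math
from typing import List, Sequence


def action_loss(logits: Sequence[float], target_index: int) -> float:
    """Cross-entropy between predicted decision scores and the training action index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target_index]


def parameter_loss(
    predicted_by_type: Sequence[Sequence[float]],
    target_index: int,
    target_params: Sequence[float],
    mask: Sequence[float],
) -> float:
    """Masked MSE, computed only on the head matching the training decision action."""
    predicted = predicted_by_type[target_index]
    active = sum(mask)
    if active == 0:
        return 0.0  # nothing to supervise when the mask disables every dimension
    squared = sum(m * (p - t) ** 2 for p, t, m in zip(predicted, target_params, mask))
    return squared / active


def target_loss(
    logits: Sequence[float],
    predicted_by_type: Sequence[Sequence[float]],
    target_index: int,
    target_params: Sequence[float],
    mask: Sequence[float],
    weight: float = 1.0,  # assumed balancing coefficient
) -> float:
    """Combine the action loss and the parameter loss into one training objective."""
    return action_loss(logits, target_index) + weight * parameter_loss(
        predicted_by_type, target_index, target_params, mask
    )
```

Restricting the parameter loss to the head of the action actually taken means the other heads receive no gradient from that sample, which is one plausible reason for combining the action type with the mask.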

[0167] According to one or more embodiments of this disclosure, Example 7 provides the method of Example 1, wherein the action decision model has a plurality of output heads, one of which is used to output the action decision parameters, and each of the remaining output heads is used to output action parameters corresponding to a candidate action type.
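
The multi-head layout in Example 7 might look like the following dependency-free sketch. The shared trunk, the layer sizes, the ReLU activation and the class name `MultiHeadModel` are assumptions; the text only states that one head outputs the action decision parameters and each remaining head outputs the action parameters of one candidate action type.

```python
import random
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def linear(weights: Matrix, bias: Sequence[float], x: Sequence[float]) -> List[float]:
    """Apply y = W x + b for a weight matrix stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class MultiHeadModel:
    def __init__(self, in_dim: int, hidden_dim: int, num_action_types: int,
                 param_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)

        def init(rows: int, cols: int) -> Matrix:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Shared trunk (an assumption) feeding every output head.
        self.trunk_w, self.trunk_b = init(hidden_dim, in_dim), [0.0] * hidden_dim
        # One head for the action decision parameters.
        self.decision_w, self.decision_b = init(num_action_types, hidden_dim), [0.0] * num_action_types
        # One head per candidate action type for its action parameters.
        self.param_heads = [
            (init(param_dim, hidden_dim), [0.0] * param_dim)
            for _ in range(num_action_types)
        ]

    def forward(self, features: Sequence[float]) -> Tuple[List[float], List[List[float]]]:
        hidden = [max(0.0, h) for h in linear(self.trunk_w, self.trunk_b, features)]
        decision = linear(self.decision_w, self.decision_b, hidden)
        params_by_type = [linear(w, b, hidden) for w, b in self.param_heads]
        return decision, params_by_type
```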

[0168] According to one or more embodiments of this disclosure, Example 8 provides the method of Example 1, wherein determining the target action parameters of the target action based on the action decision parameters and the action parameters corresponding to the candidate action types includes:

[0169] The configuration file of the virtual character is queried according to the action decision parameters, wherein the configuration file indicates the action type corresponding to the action decision parameters of the virtual character;

[0170] The action parameters of the candidate action type corresponding to the queried action decision parameters are determined as the target action parameters of the target action.
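
The configuration lookup in Example 8 can be illustrated with the sketch below. It assumes the action decision parameters are per-type scores reduced to an index, and that the configuration file is a per-character mapping from that index to an action type name; `CHARACTER_CONFIGS`, the action names and `select_target_action` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical per-character configuration: decision index -> action type name.
CHARACTER_CONFIGS: Dict[str, Dict[int, str]] = {
    "mage": {0: "move", 1: "basic_attack", 2: "skill_1"},
    "archer": {0: "move", 1: "basic_attack", 2: "skill_1", 3: "skill_2"},
}


def select_target_action(
    character_id: str,
    decision_scores: Sequence[float],
    params_by_type: Dict[str, List[float]],
) -> Tuple[str, List[float]]:
    """Pick the best-scoring action this character supports and return its parameters."""
    config = CHARACTER_CONFIGS[character_id]
    # Only indices listed in the character's configuration are considered (an assumption).
    decision = max(config, key=lambda index: decision_scores[index])
    action_type = config[decision]
    return action_type, params_by_type[action_type]
```

Keeping the index-to-action mapping in configuration, rather than in the model, is what lets one set of output heads be reused across characters with different skill sets.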

[0171] According to one or more embodiments of this disclosure, Example 9 provides a control device for a virtual object, the device comprising:

[0172] The acquisition module is used to acquire the game state data of the target virtual object in the current game.

[0173] The first determining module is used to determine the game features corresponding to the target virtual object based on the game state data, wherein the game features include game scene features and the character features of the virtual character corresponding to the target virtual object;

[0174] The processing module is used to obtain the action decision parameters output by the action decision model and the action parameters corresponding to multiple candidate action types based on the game scene features, the character features and the action decision model. The action decision model is trained based on training data corresponding to multiple virtual characters. The action decision parameters are used to determine the type of the target action from multiple candidate action types. The action parameters are used to represent the control parameters of the action corresponding to the candidate action type.

[0175] The second determining module is used to determine the target action parameters of the target action based on the action decision parameters and the action parameters corresponding to the candidate action types, and to control the target virtual object based on the target action parameters.
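
The four modules of Example 9 can be read as a simple pipeline. The sketch below is illustrative only: it models each module as a callable, assumes the decision output has already been reduced to an index into the per-type parameter list, and uses the hypothetical names `ControlDevice` and `step`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ControlDevice:
    # Acquisition module: returns the game state data of the target virtual object.
    acquire: Callable[[], Dict]
    # First determining module: game state data -> (scene features, character features).
    extract: Callable[[Dict], Tuple[List[float], List[float]]]
    # Processing module: features -> (decision index, action parameters per candidate type).
    model: Callable[[List[float]], Tuple[int, List[List[float]]]]
    # Control half of the second determining module: applies the target action parameters.
    apply: Callable[[List[float]], None]

    def step(self) -> List[float]:
        state = self.acquire()
        scene, character = self.extract(state)
        decision, params_by_type = self.model(scene + character)
        target_params = params_by_type[decision]  # second determining module: selection
        self.apply(target_params)
        return target_params
```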

[0176] According to one or more embodiments of the present disclosure, Example 10 provides a computer-readable medium having a computer program stored thereon that, when executed by a processing device, implements the steps of the method described in any one of Examples 1-8.

[0177] According to one or more embodiments of this disclosure, Example 11 provides an electronic device, including:

[0178] A storage device on which computer programs are stored;

[0179] A processing device for executing the computer program in the storage device to implement the steps of any one of the methods in Examples 1-8.

[0180] The above description is merely a preferred embodiment of this disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of this disclosure is not limited to technical solutions formed by specific combinations of the above-described technical features, but should also cover other technical solutions formed by arbitrary combinations of the above-described technical features or their equivalents without departing from the above-described concept. One example is a technical solution formed by replacing the above features with technical features of similar function, including but not limited to those disclosed in this disclosure.

[0181] Furthermore, while the operations are described in a specific order, this should not be construed as requiring these operations to be performed in the specific order shown or in a sequential order. In certain environments, multitasking and parallel processing may be advantageous. Similarly, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of this disclosure. Certain features described in the context of individual embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented individually or in any suitable sub-combination in multiple embodiments.

[0182] Although the subject matter has been described using language specific to structural features and/or methodological logical actions, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above. Rather, the specific features and actions described above are merely illustrative forms of implementing the claims. Regarding the apparatus in the above embodiments, the specific manner in which the various modules perform their operations has been described in detail in the embodiments relating to the method, and will not be elaborated upon here.

Claims

1. A method for controlling a virtual object, characterized in that the method comprises: obtaining game state data of a target virtual object in a current game; determining, based on the game state data, game features corresponding to the target virtual object, wherein the game features include game scene features and character features of the virtual character corresponding to the target virtual object; obtaining, based on the game scene features, the character features and an action decision model, action decision parameters output by the action decision model and action parameters corresponding to multiple candidate action types, wherein the action decision model is trained based on training data corresponding to multiple virtual characters, the action decision parameters are used to determine the type of a target action from the multiple candidate action types, and the action parameters are used to represent the control parameters of the action corresponding to each candidate action type; and determining target action parameters of the target action based on the action decision parameters and the action parameters corresponding to the candidate action types, and controlling the target virtual object according to the target action parameters.

2. The method according to claim 1, characterized in that, The character features of the virtual character corresponding to the target virtual object include the skill features of the virtual character and the skill mask features of the virtual character. The skill mask features are used to represent the skill usage range of the skills corresponding to the virtual character.

3. The method according to claim 2, characterized in that the character features of the virtual character corresponding to the target virtual object further include skill state features of the virtual character, the skill state features being used to represent the state of the virtual character after using a skill and including a state identifier for each state and a state feature corresponding to each state identifier; and the step of determining the game features corresponding to the target virtual object based on the game state data includes: for each state identifier in the skill state features, querying a state table based on the character identifier of the virtual character, wherein the state table contains the state identifiers respectively associated with the character identifiers of the multiple virtual characters; if the state identifier is found among the state identifiers associated with the character identifier of the virtual character, determining the state feature corresponding to the state identifier based on the game state data; and if the state identifier is not found among the state identifiers associated with the character identifier of the virtual character, filling the state feature corresponding to the state identifier with a set value.

4. The method according to claim 1, characterized in that, The action decision model is determined in the following way: Based on the game state data of multiple virtual characters under the same character type in historical matches, training samples are generated. The training samples contain training game features, training action decision parameters and training action parameters corresponding to the training decision actions in the game state data. The model is trained using the training game features as input and the training action decision parameters and the training action parameters as target outputs to obtain the action decision model.

5. The method according to claim 4, characterized in that, The step of generating training samples based on game state data from historical matches of multiple virtual characters of the same character type includes: For each historical game's game state data, determine the triggered training decision action in the game state data, and obtain the game image frame when the training decision action is triggered. For each game image frame triggered by the training decision action, the training action decision parameters corresponding to the training decision action are determined, and feature extraction is performed on the game image frame to obtain the training game features and training action parameters associated with the training decision action, wherein the training game features have the same dimension as the game features corresponding to the target virtual object.

6. The method according to claim 4, characterized in that the training game features include skill mask features corresponding to the training virtual characters; and the step of training the model using the training game features as model input and the training action decision parameters and the training action parameters as the model target output to obtain the action decision model includes: inputting the training game features into the model to obtain the predicted action decision parameters output by the model and the predicted action parameters corresponding to each candidate action type; determining the action loss based on the predicted action decision parameters and the training action decision parameters; determining the parameter loss based on the predicted action parameters, the action parameters corresponding to the type of the training decision action, the training action parameters, and the skill mask features; and determining the target loss of the model based on the action loss and the parameter loss, and training the model based on the target loss.

7. The method according to claim 1, characterized in that the action decision model has multiple output heads, one of which is used to output the action decision parameters, and each of the remaining output heads is used to output the action parameters corresponding to one candidate action type.

8. The method according to claim 1, characterized in that, The step of determining the target action parameters of the target action based on the action decision parameters and the action parameters corresponding to the candidate action types includes: The configuration file of the virtual character is queried according to the action decision parameters, wherein the configuration file indicates the action type corresponding to the action decision parameters of the virtual character; The action parameters of the candidate action type corresponding to the queried action decision parameters are determined as the target action parameters of the target action.

9. A control device for a virtual object, characterized in that, The device includes: The acquisition module is used to acquire the game state data of the target virtual object in the current game. The first determining module is used to determine the game features corresponding to the target virtual object based on the game state data, wherein the game features include game scene features and the character features of the virtual character corresponding to the target virtual object; The processing module is used to obtain the action decision parameters output by the action decision model and the action parameters corresponding to multiple candidate action types based on the game scene features, the character features and the action decision model. The action decision model is trained based on training data corresponding to multiple virtual characters. The action decision parameters are used to determine the type of the target action from multiple candidate action types. The action parameters are used to represent the control parameters of the action corresponding to the candidate action type. The second determining module is used to determine the target action parameters of the target action based on the action decision parameters and the action parameters corresponding to the candidate action types, and to control the target virtual object based on the target action parameters.

10. A computer-readable medium having a computer program stored thereon, characterized in that, when the program is executed by a processing device, it implements the steps of the method described in any one of claims 1-8.

11. An electronic device, characterized in that it comprises: a storage device on which a computer program is stored; and a processing device for executing the computer program in the storage device to implement the steps of the method according to any one of claims 1-8.