Method for training action trajectory generation model, action trajectory generation method and device

By extracting features from motion trajectory samples and iteratively training a model that combines a generative sub-model and a discriminative sub-model, the invention addresses the monotonous motion style of motion trajectory generation models, improving the realism of motion trajectories and the diversity of motion styles.

CN116205943B · Active · Publication Date: 2026-06-26 · MASHANG CONSUMER FINANCE CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: MASHANG CONSUMER FINANCE CO LTD
Filing Date: 2022-12-07
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Existing motion trajectory generation models output motion trajectories of target objects that lack realism, have a monotonous motion style, and lack controllability and diversity.

Method used

By extracting action category features and action trajectory features from action trajectory samples, action category feature vectors and preprocessed feature vectors are generated. Iterative training is performed using a generation sub-model and a discriminator sub-model. By combining action category recognition and trajectory authenticity discrimination, the model parameters are optimized to ensure the controllability of action category and the realism of action trajectory.

Benefits of technology

It improves the realism of motion trajectories and the diversity of motion styles output by the motion trajectory generation model, ensures the controllability of motion categories, and generates motion trajectories that conform to the laws of human movement and have diverse motion styles.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a training method of an action trajectory generation model, an action trajectory generation method and device. Action category feature extraction and action trajectory feature extraction are performed on a first action trajectory sample to obtain an action category feature vector and an action trajectory feature vector. Action trajectory prediction is performed based on the action category feature vector and the preprocessed action trajectory feature vector to obtain a first prediction feature vector, so that the first prediction feature vector is obtained based on the two dimensions of the action trajectory category and the action trajectory itself. The model parameters are iteratively updated based on the action category recognition result and the action trajectory discrimination result of the first prediction feature vector. The predicted action trajectory is not required to be consistent with the action trajectory sample, but only the action category loss and the action trajectory authenticity loss are considered to optimize the model parameters, so that the authenticity, controllability and action style diversity of the action trajectory output by the trained model can be ensured.

Description

Technical Field

[0001] This application relates to the field of artificial intelligence technology, and in particular to a training method for a motion trajectory generation model, and to a motion trajectory generation method and apparatus.

Background Technology

[0002] Currently, with the rapid development of artificial intelligence technology, the application of neural network models is becoming increasingly widespread. For example, when neural network models are applied to the field of digital animation, animators only need to draw a small number of known motion image frames. The 3D skeleton information corresponding to these few known motion image frames is then input into a pre-trained motion trajectory generation model, which completes the missing motion image frames and yields a motion trajectory of the target object with multiple motion image frames. Based on this motion trajectory, a digital animation video of the target object can then be rendered. The target object can be any animated character, such as a person or an animal. However, the motion trajectories of the target object output by existing motion trajectory generation models lack realism.

Summary of the Invention

[0003] The purpose of this application is to provide a training method, motion trajectory generation method, and apparatus for a motion trajectory generation model, which can ensure the realism of the motion trajectory output by the model after training.

[0004] To achieve the above purpose, the embodiments of this application are implemented as follows:

[0005] In a first aspect, embodiments of this application provide a training method for an action trajectory generation model, the method comprising:

[0006] Obtain the first sample dataset; the first sample dataset includes multiple first action trajectory samples;

[0007] Based on the first action trajectory sample, action category feature extraction processing is performed to obtain a first action category feature vector; and based on the first action trajectory sample, action trajectory feature extraction processing is performed to obtain an original action trajectory feature vector, and based on the original action trajectory feature vector, preprocessing is performed to obtain a first preprocessed feature vector.

[0008] The first action category feature vector and the first preprocessed feature vector corresponding to each of the first action trajectory samples are input into the model to be trained for iterative training to obtain the action trajectory generation model.

[0009] The model to be trained includes a generation sub-model and a discrimination sub-model; each round of model training is implemented as follows:

[0010] For each of the first motion trajectory samples: the generation sub-model generates a first predicted feature vector based on the first motion category feature vector and the first preprocessed feature vector corresponding to the first motion trajectory sample; the discrimination sub-model performs trajectory authenticity discrimination based on the first predicted feature vector and the original motion trajectory feature vector corresponding to the first motion trajectory sample, and generates a motion trajectory discrimination result set; and performs motion category recognition based on the first predicted feature vector to obtain the motion category recognition result.

[0011] The parameters of the generation sub-model are updated based on the action category recognition result and the action trajectory discrimination result set; and the parameters of the discrimination sub-model are updated based on the action trajectory discrimination result set.
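The two updates described in [0010] and [0011] can be sketched as a pair of loss functions. This is a minimal illustration only: the binary cross-entropy form, the `category_weight` parameter, and all function names are assumptions made for the example and are not taken from the application. The point it shows is that the generation sub-model's loss contains a category term and an authenticity term but no term comparing the prediction with the original sample.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for a single probability (assumed loss form)."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))

def discriminator_loss(real_probs, fake_probs):
    """Discrimination sub-model: original vectors should score 1, predicted vectors 0."""
    losses = [bce(p, 1.0) for p in real_probs] + [bce(p, 0.0) for p in fake_probs]
    return sum(losses) / len(losses)

def generator_loss(fake_probs, category_probs, true_category, category_weight=1.0):
    """Generation sub-model: authenticity loss plus category recognition loss.

    There is deliberately no reconstruction term against the input sample.
    """
    adversarial = sum(bce(p, 1.0) for p in fake_probs) / len(fake_probs)
    category = -math.log(max(category_probs[true_category], 1e-12))
    return adversarial + category_weight * category

# Hypothetical discrimination result set and category recognition result.
d_loss = discriminator_loss(real_probs=[0.9, 0.8], fake_probs=[0.2, 0.1])
g_loss = generator_loss(fake_probs=[0.2, 0.1], category_probs=[0.1, 0.7, 0.2], true_category=1)
```

In an actual training loop the two losses would drive alternating parameter updates of the two sub-models; the optimizer and network architecture are left out because the text above does not specify them.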

[0012] In a second aspect, embodiments of this application provide a method for generating motion trajectories, the method comprising:

[0013] Obtain the motion trajectory to be completed; and obtain the target motion category feature vector, which is used to constrain the motion style of the motion trajectory.

[0014] Based on the motion trajectory to be completed, motion trajectory feature extraction processing is performed to obtain the motion trajectory feature vector to be completed, and the motion trajectory feature vector to be completed is preprocessed to obtain the second preprocessed feature vector.

[0015] The target action category feature vector and the second preprocessed feature vector are input into the trained action trajectory generation model to predict the action trajectory, and the second predicted feature vector is obtained.

[0016] Based on the second predicted feature vector, the completed action trajectory is generated;

[0017] A target motion trajectory is generated based on at least one of the completed motion trajectories corresponding to the motion trajectory to be completed.

[0018] In a third aspect, embodiments of this application provide a training device for a motion trajectory generation model, the device comprising:

[0019] The first acquisition module is used to acquire a first sample dataset; the first sample dataset includes multiple first action trajectory samples.

[0020] The first processing module is configured to perform action category feature extraction processing based on the first action trajectory sample to obtain a first action category feature vector; and to perform action trajectory feature extraction processing based on the first action trajectory sample to obtain an original action trajectory feature vector, and to perform preprocessing based on the original action trajectory feature vector to obtain a first preprocessed feature vector.

[0021] The model training module is used to input the first action category feature vector and the first preprocessed feature vector corresponding to each of the first action trajectory samples into the model to be trained for iterative training to obtain the action trajectory generation model.

[0022] The model to be trained includes a generation sub-model and a discrimination sub-model; each round of model training is implemented as follows:

[0023] For each of the first motion trajectory samples: the generation sub-model generates a first predicted feature vector based on the first motion category feature vector and the first preprocessed feature vector corresponding to the first motion trajectory sample; the discrimination sub-model performs trajectory authenticity discrimination based on the first predicted feature vector and the original motion trajectory feature vector corresponding to the first motion trajectory sample, and generates a motion trajectory discrimination result set; and performs motion category recognition based on the first predicted feature vector to obtain the motion category recognition result.

[0024] The parameters of the generation sub-model are updated based on the action category recognition result and the action trajectory discrimination result set; and the parameters of the discrimination sub-model are updated based on the action trajectory discrimination result set.

[0025] In a fourth aspect, embodiments of this application provide a motion trajectory generation device, the device comprising:

[0026] The second acquisition module is used to acquire the motion trajectory to be completed; and to acquire the target motion category feature vector, wherein the target motion category feature vector is used to constrain the motion style of the motion trajectory.

[0027] The second processing module is used to perform motion trajectory feature extraction processing based on the motion trajectory to be completed, to obtain a motion trajectory feature vector to be completed, and to perform preprocessing based on the motion trajectory feature vector to be completed, to obtain a second preprocessed feature vector.

[0028] The motion trajectory prediction module is used to input the target motion category feature vector and the second preprocessed feature vector into the trained motion trajectory generation model to predict the motion trajectory and obtain the second predicted feature vector.

[0029] The first trajectory generation module is used to generate the completed action trajectory based on the second predicted feature vector;

[0030] The second trajectory generation module is used to generate a target motion trajectory based on at least one of the completed motion trajectories corresponding to the motion trajectory to be completed.

[0031] In a fifth aspect, embodiments of this application provide a computer device, the device comprising:

[0032] A processor; and a memory arranged to store computer-executable instructions configured to be executed by the processor, the executable instructions including steps for performing the methods described in the first or second aspect.

[0033] In a sixth aspect, embodiments of this application provide a storage medium for storing computer-executable instructions that cause a computer to perform the steps of the methods described in the first or second aspect.

[0034] As can be seen, in the embodiments of this application, during the model training phase, action category feature extraction and action trajectory feature extraction are performed on each first action trajectory sample to obtain the corresponding action category feature vector and action trajectory feature vector, and the action trajectory feature vector is preprocessed to obtain a first preprocessed feature vector. The action category feature vector serves as the feature vector extracted along the action trajectory category dimension, and the first preprocessed feature vector serves as the feature vector extracted along the dimension of the action trajectory itself; both are input into the model to be trained. The model to be trained then performs action trajectory prediction based on the action category feature vector and the first preprocessed feature vector to obtain a first predicted feature vector, after which action category recognition and trajectory authenticity discrimination are performed based on the first predicted feature vector. Finally, the parameters of the model to be trained are iteratively updated based on the action category recognition result and the action trajectory discrimination result of each first action trajectory sample. On the one hand, because the first predicted feature vector is obtained from feature vectors in both dimensions, the action category feature vector is taken into account during action trajectory prediction, which ensures the controllability of the action category of the predicted action trajectory. Moreover, because the model parameters are iteratively updated using the action category recognition result and the action trajectory discrimination result obtained from the first predicted feature vector, the predicted action trajectory is not required to be as consistent as possible with the action trajectory sample; only the action category recognition loss and the action trajectory authenticity loss are considered when optimizing the model parameters. As a result, during the iterative update, the predicted action trajectory (i.e., the output action trajectory) is not required to converge strictly to the action style of the action trajectory sample (i.e., the input action trajectory); it is only constrained to be consistent with the input action trajectory at the action category level. The output and input action trajectories may therefore belong to the same action category yet have different action styles, so the trained model can produce diverse action styles when outputting action trajectories of the target action category. In addition, when the trained model is used to complete an action trajectory, the target action category feature vector can be introduced to constrain the action style of the action trajectory, which ensures the controllability of the action style of the output action trajectory. On the other hand, during the iterative update of the model parameters, judging whether an action trajectory is real or fake is learned continuously through multiple rounds of adversarial generation and discrimination. Because the model parameters are adjusted based on the discrimination results of the discrimination sub-model, the generation sub-model is trained further, so the action trajectory it predicts is more realistic and closer to a real action trajectory, which further improves the realism of the action trajectory output by the trained model.

Attached Figure Description

[0035] To illustrate the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some of the embodiments of this application. Those skilled in the art can obtain other drawings from these drawings without creative effort.

[0036] Figure 1 is a first schematic flowchart of the training method for the motion trajectory generation model provided in an embodiment of this application;

[0037] Figure 2 is a schematic diagram of the specific implementation of each round of model training in the training method for the motion trajectory generation model provided in an embodiment of this application;

[0038] Figure 3 is a schematic diagram of the first implementation principle of the training method for the motion trajectory generation model provided in an embodiment of this application;

[0039] Figure 4 is a schematic diagram of the second implementation principle of the training method for the motion trajectory generation model provided in an embodiment of this application;

[0040] Figure 5 is a schematic diagram of the third implementation principle of the training method for the motion trajectory generation model provided in an embodiment of this application;

[0041] Figure 6 is a schematic diagram of the implementation principle of the training method for the action classification model provided in an embodiment of this application;

[0042] Figure 7 is a schematic diagram of the implementation principle of the process for constructing the action feature distribution information corresponding to the target action category provided in an embodiment of this application;

[0043] Figure 8 is a schematic diagram of the fourth implementation principle of the training method for the motion trajectory generation model provided in an embodiment of this application;

[0044] Figure 9 is a flowchart of the motion trajectory generation method provided in an embodiment of this application;

[0045] Figure 10a is a schematic diagram of the first implementation principle of the motion trajectory generation method provided in an embodiment of this application;

[0046] Figure 10b is a schematic diagram of the second implementation principle of the motion trajectory generation method provided in an embodiment of this application;

[0047] Figure 10c is a schematic diagram of the third implementation principle of the motion trajectory generation method provided in an embodiment of this application;

[0048] Figure 10d is a schematic diagram of the fourth implementation principle of the motion trajectory generation method provided in an embodiment of this application;

[0049] Figure 11 is a schematic diagram of the fifth implementation principle of the motion trajectory generation method provided in an embodiment of this application;

[0050] Figure 12 is a schematic diagram of the module composition of the training device for the motion trajectory generation model provided in an embodiment of this application;

[0051] Figure 13 is a schematic diagram of the module composition of the motion trajectory generation device provided in an embodiment of this application;

[0052] Figure 14 is a schematic diagram of the structure of a computer device provided in an embodiment of this application.

Detailed Implementation

[0053] To enable those skilled in the art to better understand the technical solutions in one or more embodiments of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by those skilled in the art from one or more embodiments of this application without creative effort fall within the protection scope of this application.

[0054] It should be noted that, unless otherwise specified, one or more embodiments and features described in this application can be combined with each other. The embodiments of this application will now be described in detail with reference to the accompanying drawings.

[0055] This application provides one or more embodiments of a training method for a motion trajectory generation model, and of a motion trajectory generation method and apparatus. If, during model training, the predicted motion trajectory is compared directly with the actual motion trajectory to calculate a motion trajectory prediction loss, and the model parameters are iteratively updated based on that loss, the model inevitably learns the motion style of the training trajectories. The trained model then outputs motion trajectories with a relatively uniform motion style and offers no choice of style. To address this problem, on the one hand, this technical solution performs motion category feature extraction and motion trajectory feature extraction on each first motion trajectory sample to obtain the corresponding motion category feature vector and motion trajectory feature vector, and preprocesses the motion trajectory feature vector to obtain a first preprocessed feature vector. The model to be trained then predicts the motion trajectory based on the motion category feature vector and the first preprocessed feature vector to obtain a first predicted feature vector. Because the first predicted feature vector is obtained from feature vectors in two dimensions, the motion trajectory category and the motion trajectory itself, the motion category feature vector is taken into account during prediction, which ensures the controllability of the motion category of the predicted motion trajectory. Furthermore, the model parameters are iteratively updated using the motion category recognition result and the motion trajectory discrimination result obtained from the first predicted feature vector. This update does not require the predicted motion trajectory to be as consistent as possible with the motion trajectory sample; it considers only the motion category recognition loss and the motion trajectory authenticity loss when optimizing the model parameters. As a result, during the iterative update, the predicted motion trajectory (i.e., the output motion trajectory) is not required to converge strictly to the motion style of the motion trajectory sample (i.e., the input motion trajectory); it is only constrained to be consistent with the input motion trajectory at the motion category level. The output and input motion trajectories may therefore belong to the same motion category yet have different motion styles, so the trained model can produce diverse motion styles when outputting motion trajectories of the target motion category. In addition, when the trained model is used to complete a motion trajectory, the target motion category feature vector can be introduced to constrain the motion style of the motion trajectory, which ensures the controllability of the motion style of the output motion trajectory. On the other hand, during the iterative update of the model parameters, judging whether a motion trajectory is real or fake is learned continuously through multiple rounds of adversarial generation and discrimination. Because the model parameters are adjusted based on the discrimination results of the discrimination sub-model, the generation sub-model is trained further, so the motion trajectory it predicts is more realistic and closer to a real motion trajectory, which further improves the realism of the motion trajectory output by the trained model.

[0056] Figure 1 is a first schematic flowchart of the training method for a motion trajectory generation model provided in one or more embodiments of this application. The method in Figure 1 can be executed by an electronic device equipped with a motion trajectory generation model training device. This electronic device can be a terminal device or a designated server. The hardware device used for training the motion trajectory generation model (i.e., the electronic device equipped with the motion trajectory generation model training device) and the hardware device used for generating the motion trajectory (i.e., the electronic device equipped with the motion trajectory generation device) can be the same or different. It should be noted that a motion trajectory generation model trained with the model training method provided in this application can be applied to any application scenario that requires rendering animation videos based on motion trajectories, for example rendering game character animation videos or action demonstration videos based on motion trajectories.

[0057] Specifically, as shown in Figure 1, the training process for the motion trajectory generation model includes at least the following steps:

[0058] S102, Obtain the first sample dataset; wherein, the first sample dataset includes multiple first action trajectory samples;

[0059] Each first motion trajectory sample can be a motion trajectory sample containing N pieces of original 3D skeleton information of the target object (i.e., any animated character), where N is an integer greater than 1. Specifically, each first motion trajectory sample is generated from an animated video sample with a preset playback duration. If the animated video sample contains N motion image frames, each motion image frame is converted into its corresponding 3D skeleton information, and the motion trajectory composed of the N pieces of 3D skeleton information is a first motion trajectory sample. If the motion category of the target object in the animated video sample is a specified motion category (e.g., jogging, running, or jumping), then the actual motion category label of the first motion trajectory sample is that specified motion category.

[0060] 3D skeleton information refers to a graph structure composed of multiple joints connected according to their adjacent relationships, used to describe the static posture of a target object. For example, for a target person, 3D skeleton information can also be called human skeleton or human posture. The human skeleton can be divided into 3D skeleton and 2D skeleton. Since the 3D skeleton can describe the spatial positional relationship between joints, it is preferred to construct 3D skeleton information. By connecting the 3D skeleton information at different times, the motion trajectory of the target object can be described.
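As an illustration of the graph structure described in [0060], the following sketch models one piece of 3D skeleton information as named joints plus the edges between adjacent joints, and a motion trajectory as a time-ordered list of such skeletons. The class name `Skeleton3D`, the joint names, and the helper `joint_path` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) position of one joint

@dataclass
class Skeleton3D:
    """Static posture: joints are graph nodes, bones are edges between adjacent joints."""
    joints: Dict[str, Joint]
    bones: List[Tuple[str, str]]

    def neighbours(self, name: str) -> List[str]:
        """Return the joints directly connected to the given joint."""
        out = []
        for a, b in self.bones:
            if a == name:
                out.append(b)
            elif b == name:
                out.append(a)
        return out

def joint_path(trajectory: List[Skeleton3D], name: str) -> List[Joint]:
    """Follow one joint through the skeletons at successive times."""
    return [skeleton.joints[name] for skeleton in trajectory]

# Toy three-joint arm observed at two times (hypothetical data).
bones = [("shoulder", "elbow"), ("elbow", "wrist")]
frame0 = Skeleton3D({"shoulder": (0.0, 1.5, 0.0), "elbow": (0.3, 1.2, 0.0), "wrist": (0.6, 1.0, 0.0)}, bones)
frame1 = Skeleton3D({"shoulder": (0.0, 1.5, 0.0), "elbow": (0.3, 1.3, 0.1), "wrist": (0.6, 1.4, 0.2)}, bones)
trajectory = [frame0, frame1]
```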

[0061] Specifically, the process of generating the first motion trajectory sample based on an animated video sample with a preset playback duration can be achieved by transforming the animated video sample using a preset transformation matrix, so that the 3D skeleton of the intermediate image frame j of the animated video sample is aligned with the standard viewpoint of a preset camera, thereby obtaining the aligned 3D skeleton motion trajectory (i.e., the first motion trajectory sample).
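The alignment step in [0061] can be illustrated by applying one transformation matrix to every joint of every frame. This is a sketch under assumptions: the application does not state the form of the preset transformation matrix, so a 4x4 homogeneous matrix is assumed here, and the functions `apply_transform` and `align_trajectory` are hypothetical names. How the matrix is derived from intermediate frame j and the standard camera viewpoint is not shown.

```python
def apply_transform(matrix, point):
    """Apply a 4x4 homogeneous transform (assumed form) to one 3D point."""
    x, y, z = point
    vec = (x, y, z, 1.0)
    return tuple(sum(matrix[r][c] * vec[c] for c in range(4)) for r in range(3))

def align_trajectory(trajectory, matrix):
    """Transform every joint of every frame with the same preset matrix."""
    return [[apply_transform(matrix, joint) for joint in frame] for frame in trajectory]

# Example matrix: rotate 90 degrees about the z axis, then shift by +1 along x.
preset = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
frames = [[(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]]  # one frame with two joints
aligned = align_trajectory(frames, preset)
```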

[0062] The first sample dataset may include M first action trajectory samples, which may be randomly selected from P candidate action trajectory samples. The M first action trajectory samples participate in one round of model training. In each round of model training, M first action trajectory samples are randomly selected from P candidate action trajectory samples. The M first action trajectory samples selected for different rounds of model training may be completely different or partially the same. P and M are both integers greater than 1 and M is less than P.
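The per-round selection in [0062] amounts to drawing M samples without replacement from the P candidates, independently for each round. A minimal sketch, assuming the candidates are held in a Python list; the function name `sample_round` and the seeded generator are illustrative choices.

```python
import random

def sample_round(candidates, m, rng):
    """Draw M distinct first action trajectory samples from P candidates (1 < M < P)."""
    if not 1 < m < len(candidates):
        raise ValueError("expected 1 < M < P")
    return rng.sample(candidates, m)

rng = random.Random(0)  # seeded only to make the example repeatable
candidates = [f"traj_{i}" for i in range(10)]  # P = 10 hypothetical candidates
rounds = [sample_round(candidates, 4, rng) for _ in range(3)]  # M = 4, three rounds
```

Because each round draws independently, two rounds may share some samples or none, which matches the statement that the selections may be completely different or partially the same.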

[0063] For example, in an application scenario where game character animation videos are rendered based on motion trajectories, P game character animation video samples are obtained. For each game character animation video sample, N pieces of 3D skeleton information of the target game character are determined from the N motion image frames of the target game character in that sample, and the combination of these N pieces of 3D skeleton information is taken as one candidate motion trajectory sample. The motion category corresponding to a given game character animation video sample can be one of n preset motion categories. Correspondingly, the P candidate motion trajectory samples comprise one candidate motion trajectory subset for each of the n preset motion categories. Each candidate motion trajectory subset includes multiple candidate motion trajectory samples belonging to one of the n preset motion categories, where n is an integer greater than 1 and n is less than P.

[0064] S104, based on the first action trajectory sample, perform action category feature extraction processing to obtain a first action category feature vector; and, based on the first action trajectory sample, perform action trajectory feature extraction processing to obtain an original action trajectory feature vector, and perform preprocessing based on the original action trajectory feature vector to obtain a first preprocessed feature vector.

[0065] Specifically, for each first motion trajectory sample, features are extracted from two dimensions: motion trajectory category and motion trajectory itself, to obtain the first motion category feature vector and the original motion trajectory feature vector corresponding to the first motion trajectory sample; then, the original motion trajectory feature vector is preprocessed to obtain the first preprocessed feature vector; wherein, the above preprocessing includes at least one of feature vector occlusion, feature vector completion, and image frame position marking.

[0066] Specifically, for the feature extraction process from the dimension of the motion trajectory itself, the first motion trajectory sample (i.e., the motion trajectory segment containing the 3D skeletons corresponding to N motion image frames) is converted into the original motion trajectory feature vector (i.e., the original motion trajectory representation matrix containing the motion feature vectors corresponding to the N 3D skeletons); whereby the original motion trajectory representation matrix X can be expressed as:

[0067]

[0068] Where d represents the total dimension; N represents the number of 3D skeletons in the first motion trajectory sample (i.e., corresponding to the N motion image frames); P_x indicates the spatial position of the 3D skeleton with index x; Q_x represents the angle information of the 3D skeleton with index x; and the x-th row of X is the motion feature vector corresponding to the 3D skeleton with index x, which is a row matrix, where x takes values from 1 to N.
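The formula at [0067] is not reproduced in this text. One form consistent with the definitions in [0068] is sketched below. The per-frame symbol X_x and the concatenation of position and angle information into a single row of length d are assumptions made for illustration, not the application's own notation.

```latex
X =
\begin{bmatrix}
X_1 \\ X_2 \\ \vdots \\ X_N
\end{bmatrix}
\in \mathbb{R}^{N \times d},
\qquad
X_x = \begin{bmatrix} P_x & Q_x \end{bmatrix} \in \mathbb{R}^{1 \times d},
\qquad x = 1, \dots, N
```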

[0069] Specifically, after the original motion trajectory representation matrix X corresponding to the first motion trajectory sample is obtained, X is preprocessed to obtain the first preprocessed feature vector. The first preprocessed feature vector may include first-type feature vectors, corresponding to the non-target image frames, and second-type feature vectors, corresponding to the target image frames. The target image frames are selected from the N motion image frames; their motion feature vectors are first occluded, and the occluded motion feature vectors are then completed. The first-type feature vectors are the motion feature vectors of the non-target image frames in the original motion trajectory feature vector (i.e., the known motion feature vectors). The second-type feature vectors are obtained by interpolating, according to a kinematic model, the occluded motion feature vectors of the target image frames in the original motion trajectory feature vector (i.e., the newly completed motion feature vectors).
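The occlude-then-complete preprocessing in [0069] can be sketched as follows. The sketch makes two loud assumptions: plain linear interpolation between the nearest known frames stands in for the kinematic model, which the text does not specify, and the image frame position marking is represented by one extra flag column (0 for a known frame, 1 for a completed target frame). The function name `preprocess` is hypothetical.

```python
def preprocess(frames, target_idx):
    """Occlude the target frames and complete them from the nearest known frames.

    frames: list of N equal-length feature rows (the rows of X).
    target_idx: indices of the target image frames to occlude.
    Returns N rows, each with a trailing flag column (assumed marking scheme).
    """
    targets = set(target_idx)
    known = [i for i in range(len(frames)) if i not in targets]
    if not known:
        raise ValueError("at least one frame must stay known")
    out = []
    for i, row in enumerate(frames):
        if i not in targets:
            out.append(list(row) + [0.0])  # first type: known vector, kept as-is
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            filled = list(frames[nxt])  # no earlier known frame: copy the next one
        elif nxt is None:
            filled = list(frames[prev])  # no later known frame: copy the previous one
        else:
            t = (i - prev) / (nxt - prev)
            # Linear interpolation is an assumed stand-in for the kinematic model.
            filled = [(1 - t) * a + t * b for a, b in zip(frames[prev], frames[nxt])]
        out.append(filled + [1.0])  # second type: occluded, then newly completed
    return out

# Frames 1 and 2 are targets; their original values (9.0) are ignored once occluded.
X = [[0.0, 0.0], [9.0, 9.0], [9.0, 9.0], [3.0, 6.0]]
first_preprocessed = preprocess(X, target_idx=[1, 2])
```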

[0070] Next, the first action category feature vector and the first preprocessed feature vector corresponding to each first action trajectory sample are used as model inputs for action trajectory prediction. Among them, the 3D skeleton information corresponding to the non-target image frame can be regarded as a known keyframe skeleton, and the target image frame can be regarded as an image frame to be predicted. Therefore, based on the newly completed action feature vector corresponding to the target image frame in the first preprocessed feature vector, the action trajectory of the target image frame is predicted to obtain the first prediction feature vector, so as to update the model parameters based on the first prediction feature vector corresponding to each first action trajectory sample.

[0071] In each round of model training, the first action category feature vector and the original action trajectory feature vector corresponding to the first action trajectory sample can be generated in real time. However, since the action category feature vector and the original action trajectory feature vector do not change for a given first action trajectory sample, the first action category feature vector and the original action trajectory feature vector corresponding to each candidate action trajectory sample can also be pre-generated. In this way, if a candidate action trajectory sample is selected as the first action trajectory sample during model training, the corresponding first action category feature vector and original action trajectory feature vector can be obtained directly. It should be noted that, because the target image frame can be selected at random from the N action image frames, the first preprocessed feature vector corresponding to the same first action trajectory sample can differ between rounds of model training. Therefore, the first action category feature vector and the original action trajectory feature vector corresponding to the first action trajectory sample can be pre-generated, while the first preprocessed feature vector corresponding to the first action trajectory sample is generated in real time in each round of model training.

[0072] S106, Input the first action category feature vector and the first preprocessed feature vector corresponding to each first action trajectory sample into the model to be trained for iterative training of the model to obtain the action trajectory generation model;

[0073] Specifically, for each round of model training, based on the first action category feature vector and the first preprocessed feature vector corresponding to the M first action trajectory samples selected in the current round, the model parameters of the model to be trained are updated until the model training result of the current round meets the preset model training termination condition, and the action trajectory generation model is obtained. The preset model training termination condition may include: the current model training round number equals the total training round number, the model loss function converges, or the generation sub-model and the discriminator sub-model reach a balance.
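The three termination conditions listed above could be checked roughly as follows. This is only a sketch under stated assumptions: the application does not define how convergence or generator/discriminator balance is measured, so the tolerance `tol`, the margin `balance_eps`, and the idea that balance means the discriminator scores real and generated samples near 0.5 are all hypothetical choices.

```python
from typing import Sequence


def should_stop(round_idx: int, total_rounds: int, losses: Sequence[float],
                d_real_score: float, d_fake_score: float,
                tol: float = 1e-4, balance_eps: float = 0.05) -> bool:
    """Return True when any of the three preset termination conditions holds."""
    # Condition 1: the current round number reaches the total round number.
    reached_budget = round_idx >= total_rounds
    # Condition 2 (assumed test): the loss changed by less than tol since the last round.
    converged = len(losses) >= 2 and abs(losses[-1] - losses[-2]) < tol
    # Condition 3 (assumed test): the discriminator can no longer tell real from
    # generated samples, i.e. both mean scores sit close to 0.5.
    balanced = (abs(d_real_score - 0.5) < balance_eps
                and abs(d_fake_score - 0.5) < balance_eps)
    return reached_budget or converged or balanced
```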

[0074] The iterative training process in step S106 above is explained below. Since the processing procedure is the same in every round of model training, any single round is taken as an example. Specifically, if the model to be trained includes a generation sub-model and a discrimination sub-model, then, as shown in Figure 2, each round of model training can be implemented using the following steps S1061 to S1062:

[0075] S1061, for each first motion trajectory sample: the generation sub-model generates a first predicted feature vector based on the first motion category feature vector and the first preprocessed feature vector corresponding to the first motion trajectory sample; the discrimination sub-model performs trajectory authenticity discrimination based on the first predicted feature vector and the original motion trajectory feature vector corresponding to the first motion trajectory sample, and generates a motion trajectory discrimination result set; and performs motion category recognition based on the first predicted feature vector to obtain the motion category recognition result.

[0076] Specifically, after obtaining the first action category feature vector and the first preprocessed feature vector corresponding to each first action trajectory sample, for each first action trajectory sample, the first action category feature vector and the first preprocessed feature vector corresponding to that sample are input into the generation sub-model. The output of the generation sub-model is the first predicted feature vector corresponding to that sample. The first predicted feature vector includes the action feature vectors corresponding to the N action image frames in the first action trajectory sample. The action feature vector corresponding to the target image frame is predicted by the generation sub-model, while the action feature vector corresponding to a non-target image frame can either be predicted by the generation sub-model or be taken from the original action trajectory feature vector. Then, the first predicted feature vector and the original action trajectory feature vector corresponding to the first action trajectory sample are input into the discrimination sub-model. The output of the discrimination sub-model is the action trajectory discrimination result set corresponding to that sample. The discrimination sub-model is used to judge whether the predicted action trajectory generated by the generation sub-model is as realistic as the actual action trajectory. The action trajectory discrimination result set can include a first discrimination result indicating that the generated sample (corresponding to the first predicted feature vector) is judged to be true, a second discrimination result indicating that the real sample (corresponding to the original action trajectory feature vector) is judged to be true, and a third discrimination result indicating that the generated sample (corresponding to the first predicted feature vector) is judged to be fake.
Furthermore, based on the first predicted feature vector corresponding to the first motion trajectory sample, motion category recognition is performed to obtain a motion category recognition result, which includes the probability that the motion trajectory corresponding to the first predicted feature vector belongs to the target motion category. Specifically, motion category recognition can be performed using motion trajectory matching, or it can be performed using a neural network model. That is, a pre-trained motion classification model is used to perform motion category recognition based on the first predicted feature vector corresponding to the first motion trajectory sample, and the output of the motion classification model is the motion category recognition result corresponding to the first motion trajectory sample.

[0077] S1062, update the parameters of the generation sub-model based on the above action category recognition result and action trajectory discrimination result set; and update the parameters of the discrimination sub-model based on the above action trajectory discrimination result set.

[0078] Specifically, after obtaining the action category recognition result and the action trajectory discrimination result set corresponding to each first action trajectory sample, a first loss value is determined based on the action category recognition result and the action trajectory discrimination result set corresponding to each first action trajectory sample, and the parameters of the generation sub-model are updated based on the first loss value using the gradient descent method. A second loss value is determined based on the action trajectory discrimination result set corresponding to each first action trajectory sample, and the parameters of the discrimination sub-model are updated based on the second loss value using the gradient descent method. After multiple rounds of model training, if the model training result of the current round meets the preset model training termination condition, the action trajectory generation model is determined based on the generation sub-model with the parameters updated in the current round; the action trajectory generation model includes the trained generation sub-model.
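One common way to turn the recognition result and the discrimination result set into the two loss values is the standard cross-entropy form used in generative adversarial training. The sketch below is an assumption, not the formula of this application: the function names, the weighting factor `category_weight`, and the use of negative log-likelihood are illustrative choices.

```python
import math
from typing import Sequence

EPS = 1e-12  # avoids log(0); an implementation detail of this sketch


def generator_loss(p_target_category: Sequence[float],
                   d_generated_true: Sequence[float],
                   category_weight: float = 1.0) -> float:
    """First loss value (assumed form): category loss plus realism loss.

    p_target_category: per-sample probability that the predicted trajectory
        belongs to the target action category (action category recognition result).
    d_generated_true: per-sample first discrimination result, i.e. the
        probability that the generated sample is judged to be true.
    """
    m = len(p_target_category)
    category_loss = -sum(math.log(p + EPS) for p in p_target_category) / m
    realism_loss = -sum(math.log(d + EPS) for d in d_generated_true) / m
    return category_weight * category_loss + realism_loss


def discriminator_loss(d_real_true: Sequence[float],
                       d_generated_fake: Sequence[float]) -> float:
    """Second loss value (assumed form), built from the second discrimination
    result (real judged true) and the third (generated judged fake)."""
    m = len(d_real_true)
    real_term = -sum(math.log(d + EPS) for d in d_real_true) / m
    fake_term = -sum(math.log(d + EPS) for d in d_generated_fake) / m
    return real_term + fake_term
```

Note that neither loss compares the predicted trajectory with the sample trajectory frame by frame; only category membership and judged realism enter the objective.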

[0079] It should be noted that the process of iteratively training the model parameters based on the loss values to obtain the motion trajectory generation model can follow the existing process of adjusting model parameters by backpropagation with gradient descent, which will not be repeated here.

[0080] In this embodiment, on the one hand, since the first predicted feature vector is obtained based on feature vectors of two dimensions, namely the action trajectory category and the action trajectory itself, the action category feature vector is taken into account during action trajectory prediction, which ensures the controllability of the action category of the predicted action trajectory. On the other hand, the model parameters are iteratively updated based on the action category recognition result and the action trajectory discrimination result of the first predicted feature vector. The predicted action trajectory is not required to be as consistent as possible with the action trajectory sample; only the action category recognition loss and the action trajectory authenticity loss are used to optimize the model parameters. As a result, during the iterative update of the model parameters, the action style of the predicted action trajectory (i.e., the output action trajectory) is not forced to converge strictly to that of the action trajectory sample (i.e., the input action trajectory). The output action trajectory is only constrained to be consistent with the input action trajectory at the action category level, that is, the output action trajectory is allowed to belong to the same action category as the input action trajectory while having a different action style. This allows the trained model to produce diverse action styles when outputting motion trajectories belonging to the target action category. Furthermore, when using the trained model for motion trajectory completion, the target action category feature vector can be introduced to constrain the action style of the motion trajectory, thereby ensuring the controllability of the action style of the motion trajectory output by the trained model.
In addition, during the iterative update of the model parameters, the judgment of whether a predicted motion trajectory is true or false is learned continuously through multiple rounds of adversarial generation and discrimination. Because the model parameters are adjusted based on the discrimination results of the discrimination sub-model, the generation sub-model is trained further. This makes the motion trajectory predicted by the trained generation sub-model more realistic and closer to the real motion trajectory, thereby further improving the realism of the motion trajectory output by the trained model.

[0081] Additionally, it should be noted that the realism of motion trajectories in this application refers to the continuity and naturalness of the movements, conforming to the laws of human movement. That is, the movements formed by connecting multiple 3D skeletons in the output motion trajectory are continuous, natural, and conform to the laws of human movement. The controllability of motion trajectories can include the controllability of the motion category of the motion trajectory, that is, the output motion trajectory and the input motion trajectory have the same motion category, or intermediate image frames are introduced as known image frames to better constrain the motion category. The controllability of motion trajectories can also include the controllability of the motion style of the motion trajectory, that is, the motion style of the motion trajectory can be constrained and controlled based on the feature vector of the target motion category. The diversity of motion trajectories means that the motion trajectory belongs to a certain motion category but the motion style can be variable, that is, the output motion trajectory and the input motion trajectory have the same motion category but different motion styles.

[0082] In one specific embodiment, the model to be trained includes a generation sub-model and a discrimination sub-model. As shown in Figure 3, a schematic diagram illustrating the specific implementation principle of the training process for a motion trajectory generation model is presented, including:

[0083] Obtain M first motion trajectory samples; wherein, any first motion trajectory sample q includes N original 3D skeleton information, for example, N is 5;

[0084] For any first action trajectory sample q, using a pre-trained action classification model, action category features are extracted based on the first action trajectory sample q to obtain a first action category feature vector; and,

[0085] Based on the first motion trajectory sample q, motion trajectory feature extraction processing is performed to obtain the original motion trajectory representation matrix X (i.e., the original motion trajectory feature vector, including the known motion feature vectors corresponding to N original 3D skeleton information respectively); the original motion trajectory representation matrix X is preprocessed to obtain the preprocessed motion trajectory representation matrix I (i.e., the first preprocessed feature vector, including the newly completed motion feature vector of the 3D skeleton information corresponding to the target image frame and the known motion feature vector of the 3D skeleton information corresponding to the non-target image frame).

[0086] Input the first action category feature vector corresponding to the first action trajectory sample q and the preprocessed action trajectory representation matrix I into the generator sub-model in the model to be trained to obtain the predicted action trajectory representation matrix Q (i.e. the first predicted feature vector, including the predicted action feature vector of the 3D skeleton information corresponding to the target image frame and the known action feature vector of the 3D skeleton information corresponding to the non-target image frame).

[0087] Based on the predicted motion trajectory representation matrix Q, 3D skeleton transformation is performed to obtain predicted motion trajectory samples; using a pre-trained motion classification model, motion category recognition is performed based on the predicted motion trajectory samples to obtain motion category recognition results.

[0088] The predicted motion trajectory representation matrix Q and the corresponding original motion trajectory representation matrix X are input into the discrimination sub-model in the model to be trained to perform trajectory authenticity discrimination, and the motion trajectory discrimination result set is obtained.

[0089] Based on the above action category recognition result and action trajectory discrimination result set, the parameters of the generation sub-model are iteratively updated, and the parameters of the discrimination sub-model are iteratively updated based on the action trajectory discrimination result set, to obtain the trained action trajectory generation model; wherein, the trained action trajectory generation model may include the trained generation sub-model;

[0090] It should be noted that the two action classification models shown in Figure 3 above can be the same action classification model or different action classification models; this application does not limit this. In Figure 3, the numbers 1 to 5 represent the position markers of the N 3D skeletons in the first motion trajectory sample q. Any marker can be used, and this application does not limit it.

[0091] Furthermore, regarding the process of predicting motion trajectories based on the first action category feature vector and the first preprocessed feature vector, if the feature dimensions of the first action category feature vector and the first preprocessed feature vector are the same, they can be directly concatenated to obtain the first concatenated feature vector. However, considering that the first action category feature vector and the first preprocessed feature vector come from different motion trajectory processing processes, directly concatenating these two feature vectors may result in low accuracy of the first concatenated feature vector. Therefore, in order to improve the concatenation accuracy of the first concatenated feature vector and thus the prediction accuracy of the motion trajectory, this application first transforms the first action category feature vector to obtain a second action category feature vector, and then concatenates the second action category feature vector with the first preprocessed feature vector to obtain the first concatenated feature vector. Furthermore, the feature vector transformation network is used as part of the generation sub-model. During the parameter update of the generation sub-model based on the loss value, the parameters of the action trajectory generation network and the feature vector transformation network are updated simultaneously. This allows the model to better learn, during training, how to accurately concatenate the feature vectors obtained from the two dimensions of action category feature extraction and action trajectory feature extraction. In specific implementations, the above generation sub-model may include a feature vector transformation network, a feature vector concatenation network, and an action trajectory generation network.

[0092] Correspondingly, the generation of the first predicted feature vector based on the first action category feature vector and the first preprocessed feature vector corresponding to the first action trajectory sample in S1061 above specifically includes:

[0093] Step A1: The above-mentioned feature vector transformation network performs feature transformation processing on the first action category feature vector corresponding to the first action trajectory sample to obtain the second action category feature vector;

[0094] Specifically, the main function of the aforementioned feature vector transformation network is to map the first action category feature vector to an intermediate latent space, and obtain the action category latent space vector as the second action category feature vector. This avoids the action category feature vector being too constrained by the output distribution of the action classification model, so that the action category feature vector can be better concatenated with the first preprocessed feature vector.

[0095] Step A2: The above-mentioned feature vector concatenation network concatenates the second action category feature vector with the first preprocessed feature vector corresponding to the first action trajectory sample to obtain the first concatenated feature vector.

[0096] Specifically, taking the first preprocessed feature vector as the preprocessed motion trajectory representation matrix I as an example, after obtaining the second motion category feature vector CE, the second motion category feature vector CE is concatenated with the preprocessed motion trajectory representation matrix I to obtain the concatenated motion trajectory representation matrix R (i.e., the first concatenated feature vector). The dimension of this concatenated motion trajectory representation matrix R is (N+1)×d, that is...

[0097] Step A3: The above-mentioned motion trajectory generation network predicts motion trajectories based on the first concatenated feature vector and generates a first predicted feature vector.

[0098] Specifically, the first concatenated feature vector corresponding to each first action trajectory sample is input into the action trajectory generation network for action trajectory prediction. The output of the action trajectory generation network is the first predicted feature vector corresponding to the first action trajectory sample.

[0099] Specifically, taking the first concatenated feature vector as the concatenated motion trajectory representation matrix R as an example, after obtaining the concatenated motion trajectory representation matrix R, it is input into the motion trajectory generation network for motion trajectory prediction. The output of the motion trajectory generation network is the predicted motion trajectory representation matrix Q (i.e., the first predicted feature vector). The dimension of this predicted motion trajectory representation matrix Q is N×d, that is... The predicted motion trajectory representation matrix Q consists of N row matrices, each row matrix representing the motion feature vector corresponding to a motion image frame.

[0100] It should be noted that the parameters of the aforementioned feature vector transformation network are updated synchronously with the parameters of the motion trajectory generation network. The second action category feature vector is obtained by transforming the first action category feature vector based on the parameters updated in the previous round of the feature vector transformation network. Specifically, the process of updating the parameters of the aforementioned generation sub-model using the gradient descent method based on the first loss value is essentially updating the parameters of both the feature vector transformation network and the motion trajectory generation network simultaneously using the gradient descent method based on the first loss value. This makes the second action category feature vector output by the feature vector transformation network more suitable for concatenation with the first preprocessed feature vector, thereby improving the accuracy of the first concatenated feature vector.
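Steps A1 to A3 can be sketched as three small functions. Everything concrete here is an assumption for illustration: the feature vector transformation network is reduced to a single linear map, CE is prepended as the extra row (the application only states that one row is added), and `toy_network` is a placeholder for the motion trajectory generation network, whose architecture is not described in this passage.

```python
from typing import Callable, List

Matrix = List[List[float]]


def transform_category(c1: List[float], weight: Matrix, bias: List[float]) -> List[float]:
    """Step A1 stand-in: map the first category vector into a d-dimensional latent space."""
    return [sum(w * c for w, c in zip(row, c1)) + b for row, b in zip(weight, bias)]


def concatenate(ce: List[float], I: Matrix) -> Matrix:
    """Step A2: add CE as one extra row, giving one more row than I."""
    if any(len(row) != len(ce) for row in I):
        raise ValueError("CE must have the same dimension d as the rows of I")
    return [list(ce)] + [list(row) for row in I]


def generate(R: Matrix, network: Callable[[Matrix], Matrix]) -> Matrix:
    """Step A3: the generation network maps R to the predicted matrix Q."""
    return network(R)


def toy_network(R: Matrix) -> Matrix:
    """Placeholder only: shifts every frame row by the category row and drops that row."""
    ce, frames = R[0], R[1:]
    return [[v + c for v, c in zip(row, ce)] for row in frames]


# Hypothetical sizes: category vector of length 2, d = 3, four frames.
c1 = [0.2, 0.8]
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.0]
I = [[0.0, 0.0, 0.0] for _ in range(4)]
CE = transform_category(c1, W, b)
R = concatenate(CE, I)
Q = generate(R, toy_network)
```

In a real model the parameters `W` and `b` would be trained together with the generation network, which is the synchronous update the paragraph above describes.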

[0101] In one specific embodiment, on the basis of Figure 3 above, Figure 4 shows a schematic diagram of the specific implementation principle of a motion trajectory generation model training process. If the generation sub-model includes a feature vector transformation network, a feature vector concatenation network, and a motion trajectory generation network, the process of inputting the first action category feature vector corresponding to the first motion trajectory sample q and the preprocessed motion trajectory representation matrix I into the generation sub-model in the model to be trained to obtain the predicted motion trajectory representation matrix Q specifically includes:

[0102] The aforementioned feature vector transformation network performs feature transformation on the first action category feature vector to obtain the second action category feature vector;

[0103] The aforementioned feature vector concatenation network concatenates the second action category feature vector with the preprocessed action trajectory representation matrix I to obtain the concatenated action trajectory representation matrix R (i.e., the first concatenated feature vector).

[0104] The above-mentioned motion trajectory generation network predicts motion trajectories based on the above-mentioned concatenated motion trajectory representation matrix R to obtain an intermediate motion trajectory representation matrix J. Then, the motion feature vectors corresponding to the non-target image frames in the intermediate motion trajectory representation matrix J are replaced with the corresponding known motion feature vectors to obtain the predicted motion trajectory representation matrix Q (i.e., the first predicted feature vector).

[0105] It should be noted that the process of replacing known action feature vectors based on the intermediate action trajectory representation matrix J to obtain the predicted action trajectory representation matrix Q can be implemented by the aforementioned action trajectory generation network or by other processing modules. That is, the output of the aforementioned action trajectory generation network is the intermediate action trajectory representation matrix J, and then other processing modules replace known action feature vectors based on the intermediate action trajectory representation matrix J to obtain the predicted action trajectory representation matrix Q.
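The replacement of known motion feature vectors described above amounts to a row-wise selection between J and X. A minimal sketch follows, assuming zero-based frame indices and a hypothetical helper name `restore_known_rows`:

```python
from typing import Iterable, List

Matrix = List[List[float]]


def restore_known_rows(J: Matrix, X: Matrix, target_idx: Iterable[int]) -> Matrix:
    """Keep predicted rows for target frames; copy known rows from X elsewhere."""
    if len(J) != len(X):
        raise ValueError("J and X must have the same number of frame rows")
    targets = set(target_idx)
    return [list(J[i]) if i in targets else list(X[i]) for i in range(len(J))]


# Toy data (hypothetical): frames 1 and 2 were occluded and predicted.
J = [[9.0, 9.0], [1.0, 1.0], [2.0, 2.0], [9.0, 9.0]]
X = [[0.0, 0.0], [5.0, 5.0], [6.0, 6.0], [7.0, 7.0]]
Q = restore_known_rows(J, X, target_idx=[1, 2])
```

Because the function only reads J and X, it can sit inside the generation network or in a separate processing module, matching either arrangement mentioned above.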

[0106] Specifically, the preprocessing of the original motion trajectory feature vector involves converting known motion feature vectors corresponding to some image frames (i.e., target image frames) in the original motion trajectory feature vector into newly completed motion feature vectors (i.e., the second type of feature vectors mentioned above), resulting in a first preprocessed feature vector. This allows the generation sub-model to predict the motion feature vectors (i.e., the newly completed motion feature vectors) corresponding to the target image frames in the first preprocessed feature vector. The first preprocessed feature vector includes known motion feature vectors corresponding to non-target image frames and newly completed motion feature vectors corresponding to the target image frames. These newly completed motion feature vectors are obtained by occluding and completing the motion feature vectors corresponding to the target image frames in the original motion trajectory feature vector. Specifically, in S104, the preprocessing of the original motion trajectory feature vector to obtain the first preprocessed feature vector includes:

[0107] Step B1: Perform occlusion processing on the action feature vector corresponding to the target image frame in the above-mentioned original action trajectory feature vector to obtain the first action trajectory feature vector. Here, the target image frame is selected from N action image frames, and the N action image frames correspond one-to-one with the N 3D skeleton information in the first action trajectory sample.

[0108] Specifically, during the process of selecting the target image frame from N action image frames, it can be randomly selected, selected according to a preset rule, or randomly selected within the specified image frames (that is, a combination of selecting according to a preset rule and random selection). For example, randomly select among non-first and last image frames, that is, keep the first action image frame and the last action image frame. Since the target image frame is randomly selected, for the same first action trajectory sample, during different rounds of model training, due to different occluded action image frames, the first action trajectory feature vector corresponding to the first action trajectory sample is also different. This can ensure the sufficiency of the action trajectory samples participating in model training.

[0109] Specifically, still taking the above-mentioned original action trajectory feature vector represented as the original action trajectory representation matrix X as an example, perform occlusion processing on the target row matrix (that is, the action feature vector corresponding to the target image frame) in the original action trajectory representation matrix X to obtain the occluded action trajectory representation matrix X_m (that is, the first action trajectory feature vector). The purpose of occlusion processing is to simulate the action trajectory representation matrix corresponding to an incomplete 3D skeleton action trajectory. The occlusion ratio of the target row matrix can change during the model training process, as long as at least the specified row matrix is ensured not to be occluded.
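Step B1 can be sketched as a random choice of target frames followed by masking of the corresponding rows. The sketch assumes that the first and last frames are the specified rows that stay unoccluded (the example given in the text), that the occlusion ratio is applied to N, and that occluded rows are filled with zeros; the fill value and the helper names are not specified by the application.

```python
import random
from typing import Iterable, List

Matrix = List[List[float]]


def choose_target_frames(n_frames: int, ratio: float, rng: random.Random) -> List[int]:
    """Randomly pick target frames while keeping the first and last frame known."""
    if n_frames < 3:
        raise ValueError("need at least 3 frames to keep the first and last one")
    candidates = list(range(1, n_frames - 1))
    k = max(1, min(len(candidates), round(ratio * n_frames)))
    return sorted(rng.sample(candidates, k))


def occlude(X: Matrix, target_idx: Iterable[int], fill: float = 0.0) -> Matrix:
    """Return X_m: a copy of X whose target rows are overwritten with `fill`."""
    targets = set(target_idx)
    return [[fill] * len(row) if i in targets else list(row)
            for i, row in enumerate(X)]


rng = random.Random(0)  # seeded only to make the example repeatable
X = [[float(i), float(i) + 0.5] for i in range(5)]  # toy N = 5, d = 2
targets = choose_target_frames(len(X), ratio=0.4, rng=rng)
X_m = occlude(X, targets)
```

Calling `choose_target_frames` again in each training round gives a different mask for the same sample, which is the source of the extra sample variety mentioned above.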

[0110] Step B2: Perform completion processing on the occluded action feature vector in the above-mentioned first action trajectory feature vector to obtain the second action trajectory feature vector.

[0111] Specifically, after obtaining the occluded action trajectory representation matrix X_m (that is, the first action trajectory feature vector) through feature vector occlusion processing, perform completion processing on the occluded target row matrix in X_m to obtain the completed action trajectory representation matrix (that is, the second action trajectory feature vector). In specific implementation, interpolation according to the kinematic model can be used to complete the occluded target row matrix in X_m. For example, suppose the row matrix corresponding to time t needs to be completed (that is, the action feature vector corresponding to the action image frame at time t is occluded), and the two unoccluded times closest to time t are t1 and t2, where t1 < t2; then:

[0112] Linear interpolation is used to complete the spatial location, that is

[0113] Spherical interpolation is used to complete the rotation angle, i.e. in,

[0114]
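The formulas for this step did not survive in the text above, so the following sketch shows only the textbook forms of linear interpolation and spherical linear interpolation (slerp), not necessarily the exact expressions of this application. It assumes t1 < t < t2, an interpolation weight alpha = (t - t1) / (t2 - t1), and rotations stored as unit quaternions; all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def lerp(p1: Sequence[float], p2: Sequence[float], alpha: float) -> List[float]:
    """Linear interpolation between two spatial positions."""
    return [a + alpha * (b - a) for a, b in zip(p1, p2)]


def slerp(q1: Sequence[float], q2: Sequence[float], alpha: float) -> List[float]:
    """Spherical linear interpolation between two unit quaternions."""
    dot = sum(a * b for a, b in zip(q1, q2))
    if dot < 0.0:  # q and -q encode the same rotation; take the shorter arc
        q2 = [-b for b in q2]
        dot = -dot
    theta = math.acos(min(1.0, dot))
    if theta < 1e-6:  # nearly identical rotations: fall back to lerp
        out = lerp(q1, q2, alpha)
    else:
        s = math.sin(theta)
        w1 = math.sin((1.0 - alpha) * theta) / s
        w2 = math.sin(alpha * theta) / s
        out = [w1 * a + w2 * b for a, b in zip(q1, q2)]
    norm = math.sqrt(sum(v * v for v in out))
    return [v / norm for v in out]


def complete_frame(t: float, t1: float, t2: float,
                   p1: Sequence[float], p2: Sequence[float],
                   q1: Sequence[float], q2: Sequence[float]
                   ) -> Tuple[List[float], List[float]]:
    """Fill the occluded frame at time t from its two nearest known frames."""
    alpha = (t - t1) / (t2 - t1)
    return lerp(p1, p2, alpha), slerp(q1, q2, alpha)


# Toy check: halfway between no rotation and a 90 degree turn about the z axis.
q_a = [1.0, 0.0, 0.0, 0.0]
q_b = [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)]
p_t, q_t = complete_frame(2.0, 1.0, 3.0, [0.0, 0.0, 0.0], [2.0, 0.0, 0.0], q_a, q_b)
```

The interpolated values serve only as an initial guess for the occluded rows; the generation sub-model later replaces them with its own prediction.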

[0115] Step B3: Perform position marking processing on the motion feature vectors corresponding to each motion image frame in the second motion trajectory feature vector to obtain the first preprocessed feature vector.

[0116] Specifically, still taking the second motion trajectory feature vector as the completed motion trajectory representation matrix as an example, position marking processing is performed on each row matrix of the completed motion trajectory representation matrix (i.e., the newly completed motion feature vector corresponding to the target image frame and the known motion feature vector corresponding to the non-target image frame) to obtain the preprocessed motion trajectory representation matrix I (i.e., the first preprocessed feature vector). The purpose of position marking processing is to mark the correspondence between each row matrix of the completed motion trajectory representation matrix and the N 3D skeletons, to help the model distinguish which row matrices correspond to the target image frames. The preprocessed motion trajectory representation matrix I can be expressed as:

[0117]

[0118] Where d represents the total dimension, and N represents the number of 3D skeletons in the first motion trajectory sample (i.e., corresponding to N motion image frames). The row with index x contains the motion feature vector of the 3D skeleton with index x, and PE_x represents the position encoding vector of the 3D skeleton with index x, where x takes values from 1 to N.
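Since the expression for I is not reproduced above, the sketch below shows one plausible reading of position marking: a position encoding vector is added to each row, which keeps the dimension at N x d. Both the additive combination and the sinusoidal form of the encoding are assumptions borrowed from common sequence models, and the function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def positional_encoding(n_frames: int, d: int) -> Matrix:
    """Sinusoidal position encoding, one d-dimensional vector per frame index."""
    pe = []
    for pos in range(n_frames):
        row = []
        for j in range(d):
            angle = pos / (10000 ** (2 * (j // 2) / d))
            row.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe


def mark_positions(X_hat: Matrix, pe: Matrix) -> Matrix:
    """Add the position encoding PE_x to the motion feature vector of row x."""
    if len(X_hat) != len(pe):
        raise ValueError("need one position encoding vector per frame row")
    return [[v + e for v, e in zip(row, pe_row)] for row, pe_row in zip(X_hat, pe)]


# Toy completed matrix (hypothetical): N = 5 frames, d = 4, all zeros.
X_hat = [[0.0] * 4 for _ in range(5)]
PE = positional_encoding(5, 4)
I = mark_positions(X_hat, PE)
```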

[0119] In one specific embodiment, on the basis of Figure 3 above, Figure 5 shows a schematic diagram of the specific implementation principle of the training process for a motion trajectory generation model. Specifically, the process of preprocessing the original motion trajectory representation matrix X to obtain the preprocessed motion trajectory representation matrix I includes:

[0120] Occlusion processing is performed on multiple target row matrices in the original motion trajectory representation matrix X to obtain the occluded motion trajectory representation matrix $X_m$ (i.e., the first motion trajectory feature vector); here, a target row matrix represents the original motion feature vector corresponding to a target image frame, for example, the row matrices with serial numbers 2 and 3 are occluded;

[0121] The occluded target row matrices in the occluded motion trajectory representation matrix $X_m$ are completed to obtain the completed motion trajectory representation matrix (i.e., the second motion trajectory feature vector);

[0122] Based on the position encoding vectors of the N 3D skeletons in the first motion trajectory sample q, position marking processing is performed on each row matrix of the completed motion trajectory representation matrix to obtain the preprocessed motion trajectory representation matrix I (i.e., the first preprocessed feature vector).
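The three preprocessing steps above (occlusion, completion, position marking) can be illustrated with a minimal NumPy sketch. Several choices here are assumptions made only for illustration and are not stated in this passage: occlusion is modeled as zeroing the target rows, completion as per-column linear interpolation between known rows, and position marking as adding a sinusoidal position encoding. The function names `sinusoidal_position_encoding` and `preprocess` are hypothetical.

```python
import numpy as np

def sinusoidal_position_encoding(n, d):
    """Assumed form of PE: one sinusoidal encoding row per 3D skeleton index."""
    pos = np.arange(n)[:, None]
    i = np.arange(d)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def preprocess(X, target_rows):
    """Occlude target rows of X, complete them, then mark positions.

    X           : (N, d) original motion trajectory representation matrix
    target_rows : 0-based indices of the target image frames
    Returns X_m (occluded), X_hat (completed), I (preprocessed), mask.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    mask = np.zeros(n, dtype=bool)
    mask[target_rows] = True

    # Occlusion (assumption: occluded rows are set to zero).
    X_m = X.copy()
    X_m[mask] = 0.0

    # Completion (assumption: linear interpolation between known rows).
    idx = np.arange(n)
    X_hat = X_m.copy()
    for col in range(d):
        X_hat[mask, col] = np.interp(idx[mask], idx[~mask], X_m[~mask, col])

    # Position marking (assumption: additive position encoding).
    I = X_hat + sinusoidal_position_encoding(n, d)
    return X_m, X_hat, I, mask
```

With N = 5 and serial numbers 2 and 3 occluded, the call would be `preprocess(X, [1, 2])`, since the sketch uses 0-based row indices.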

[0123] Specifically, regarding the process of updating model parameters based on the first predicted feature vector, the aforementioned action trajectory discrimination result set may include a first discrimination result indicating that the first predicted feature vector is judged to be true, a second discrimination result indicating that the original action trajectory feature vector is judged to be true, and a third discrimination result indicating that the first predicted feature vector is judged to be fake. Here, the first predicted feature vector can be regarded as a generated sample, and the original action trajectory feature vector can be regarded as a real sample. The discrimination sub-model performs a true/false judgment on the generated sample and the real sample to obtain a true/false discrimination result set, and the model parameters of the generation sub-model are then updated based on this result set to prompt the generation sub-model to generate a more realistic first predicted feature vector, thereby making the generated sample more realistic.

[0124] Correspondingly, in S1062 above, updating the parameters of the generation sub-model based on the action category recognition result and the action trajectory discrimination result set, and updating the parameters of the discrimination sub-model based on the action trajectory discrimination result set, specifically includes:

[0125] Step C1: Update the parameters of the generation sub-model based on the action category recognition result and the first discrimination result.

[0126] Specifically, a first loss value is calculated based on the action category recognition result and the first discrimination result corresponding to each first action trajectory sample, and the parameters of the generation sub-model are then updated based on the first loss value. The first loss value includes not only the conventional cross-entropy classification loss $L_{CE}$ calculated based on the action category recognition result, but also the first adversarial loss $L_{GNet}$ calculated based on the first discrimination result, that is, the adversarial loss calculated based on the probability that the first predicted feature vector (i.e., the generated sample) is predicted as true by the discrimination sub-model. The first loss value is $L = L_{CE} + \lambda_2 L_{GNet}$, where $\lambda_2$ represents the weighting parameter used to balance the proportions of the two loss components (i.e., cross-entropy loss and adversarial loss), and DNet(x) represents the probability that the first predicted feature vector (i.e., the generated sample) is predicted as true by the discrimination sub-model.
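As an illustration of Step C1, the following sketch computes a first loss value of the form $L = L_{CE} + \lambda_2 L_{GNet}$ from batched probabilities. The exact expression for $L_{GNet}$ is not given in this passage, so the sketch assumes the common non-saturating form $-\log \mathrm{DNet}(x)$; the function name `generator_loss` and its argument names are hypothetical.

```python
import numpy as np

def generator_loss(p_target, d_fake, lambda2=1.0):
    """First loss value for the generation sub-model (illustrative only).

    p_target : probability assigned to the true action category of each
               predicted trajectory (the action category recognition result)
    d_fake   : DNet(x), probability that each generated sample is judged true
    lambda2  : weighting parameter balancing the two loss components
    """
    eps = 1e-12
    p_target = np.clip(np.asarray(p_target, dtype=float), eps, 1.0)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0)
    l_ce = -np.mean(np.log(p_target))    # cross-entropy classification loss
    l_gnet = -np.mean(np.log(d_fake))    # assumed form of the adversarial loss
    return float(l_ce + lambda2 * l_gnet)
```

Under this assumed form, the loss falls as the classifier becomes more confident in the target category and as the discrimination sub-model is more often fooled by generated samples.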

[0127] Step C2: Update the parameters of the above-mentioned discriminant sub-model based on the second and third discrimination results.

[0128] Specifically, a second loss value is calculated based on the second and third discrimination results corresponding to each first action trajectory sample, and the parameters of the discrimination sub-model are then updated based on the second loss value. In specific implementation, the second loss value is calculated based on the probability that the original action trajectory feature vector (i.e., the real sample) is predicted as true by the discrimination sub-model and the probability that the first predicted feature vector (i.e., the generated sample) is predicted as fake by the discrimination sub-model. The formula for calculating the second loss value can be:

[0129]

[0130] Where DNet(x) represents the probability that the original action trajectory feature vector (i.e., the real sample) is predicted as true by the discriminant sub-model, and 1-DNet(x) represents the probability that the first predicted feature vector (i.e., the generated sample) is predicted as fake by the discriminant sub-model. It is expected that the larger this value is, the better, indicating that the discriminant sub-model has a stronger ability to distinguish between true and false.
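For Step C2, a minimal sketch of a discriminator objective built from the two probabilities described above is given below. It assumes the standard GAN form, $\log \mathrm{DNet}(x_{real}) + \log(1 - \mathrm{DNet}(x_{gen}))$, which is consistent with the statement that a larger value is better but may differ from the embodiment's own formula. The function names are hypothetical.

```python
import numpy as np

def discriminator_objective(d_real, d_fake):
    """Value to be maximized by the discrimination sub-model (assumed form).

    d_real : DNet(x) on real samples (probability judged true)
    d_fake : DNet(x) on generated samples; 1 - d_fake is the probability
             that a generated sample is judged fake
    """
    eps = 1e-12
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0)
    p_fake = np.clip(1.0 - np.asarray(d_fake, dtype=float), eps, 1.0)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(p_fake)))

def discriminator_loss(d_real, d_fake):
    """Second loss value for gradient descent: the negated objective."""
    return -discriminator_objective(d_real, d_fake)
```

Minimizing `discriminator_loss` by gradient descent is equivalent to maximizing the objective, so a sharper discriminator yields a smaller loss.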

[0131] Furthermore, in the process of action category recognition and action trajectory authenticity determination based on the first predicted feature vector, the action feature vector corresponding to the non-target image frame in the first predicted feature vector can be a predicted feature vector (i.e., the action feature vector obtained by the generation sub-model for action trajectory prediction of the non-target image frame) or a known action feature vector (i.e., the action feature vector corresponding to the non-target image frame in the original action trajectory feature vector). To improve the accuracy of action category recognition and the accuracy of action trajectory authenticity determination, thereby improving the accuracy of model loss calculation, preferably, the action feature vector corresponding to the non-target image frame in the first predicted feature vector is a known action feature vector. That is, after predicting the action trajectory based on the first concatenated feature vector corresponding to the first action trajectory sample to obtain the third action trajectory feature vector, the predicted action feature vector corresponding to the non-target image frame in the third action trajectory feature vector is replaced with the known action feature vector corresponding to the non-target image frame in the original action trajectory feature vector. Specifically, step A3 above, which predicts the action trajectory based on the first concatenated feature vector to generate the first predicted feature vector, specifically includes:

[0132] Step A31: Based on the first concatenated feature vector, predict the action trajectory to obtain the third action trajectory feature vector;

[0133] Here, the third motion trajectory feature vector includes the predicted motion feature vector corresponding to the non-target image frame and the predicted motion feature vector corresponding to the target image frame. That is, the motion feature vectors corresponding to the N motion image frames in the third motion trajectory feature vector are all obtained by the generation sub-model performing motion trajectory prediction based on the first concatenated feature vector.

[0134] Specifically, taking the first concatenated feature vector as the concatenated action trajectory representation matrix R as an example, after obtaining the concatenated action trajectory representation matrix R, it is input into the action trajectory generation network for action trajectory prediction, resulting in the intermediate action trajectory representation matrix J (i.e., the third action trajectory feature vector). The dimension of this intermediate action trajectory representation matrix J is N×d, that is, $J \in \mathbb{R}^{N \times d}$. The intermediate action trajectory representation matrix J consists of N row matrices, each row matrix representing the predicted action feature vector corresponding to a motion image frame.

[0135] Step A32: Based on the third motion trajectory feature vector and the original motion trajectory feature vector, generate the first predicted feature vector;

[0136] The first predicted feature vector includes the action feature vector corresponding to the target image frame in the third action trajectory feature vector and the action feature vector corresponding to the non-target image frame in the original action trajectory feature vector.

[0137] In other words, the first predicted feature vector includes the known action feature vector corresponding to the non-target image frame and the predicted action feature vector corresponding to the target image frame. That is, the action feature vector corresponding to the non-target image frame in the first predicted feature vector is obtained by feature vector replacement.

[0138] Specifically, considering that the first motion trajectory feature vector is obtained by occluding the motion feature vector corresponding to the target image frame, the first motion trajectory feature vector only contains the known motion feature vector corresponding to the non-target image frame. Therefore, the first predicted feature vector can also be generated based on the third motion trajectory feature vector and the first motion trajectory feature vector corresponding to the first motion trajectory sample.

[0139] Specifically, after obtaining the intermediate motion trajectory representation matrix J (i.e., the third motion trajectory feature vector), the row matrices corresponding to the non-target image frames in J (i.e., the predicted motion feature vectors) are replaced with the row matrices corresponding to the non-target image frames in the original motion trajectory representation matrix X (i.e., the known motion feature vectors), thus obtaining the predicted motion trajectory representation matrix Q (i.e., the first predicted feature vector). In practical implementation, the intermediate motion trajectory representation matrix J and the occluded motion trajectory representation matrix $X_m$ can also be fused directly to obtain the predicted motion trajectory representation matrix Q (i.e., the first predicted feature vector). This involves replacing the row matrices corresponding to non-target image frames in J with known motion feature vectors, while the row matrices corresponding to target image frames in J remain the predicted motion feature vectors. Equivalently, the unoccluded row matrices of $X_m$ still represent the known motion feature vectors, while the occluded row matrices of $X_m$ are replaced by predicted motion feature vectors.
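The row replacement described in steps A31 and A32 amounts to a masked selection between two N×d matrices. The sketch below is illustrative; the function name `fuse_prediction` and the boolean `target_mask` representation of the target image frames are assumptions.

```python
import numpy as np

def fuse_prediction(J, X_known, target_mask):
    """Build the predicted representation matrix Q.

    J           : (N, d) intermediate matrix, all rows predicted
    X_known     : (N, d) original matrix X, or the occluded matrix X_m
                  (only its non-target rows are read)
    target_mask : length-N booleans, True for target image frames
    """
    target_mask = np.asarray(target_mask, dtype=bool)
    # Target rows keep the prediction; non-target rows take the known vectors.
    return np.where(target_mask[:, None], J, X_known)
```

Because only the non-target rows of the second argument are used, passing X or the occluded matrix gives the same Q, which matches the two equivalent descriptions above.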

[0140] Furthermore, regarding the feature extraction process from the action trajectory category dimension, considering that the intermediate output data of existing action classification models includes action category feature vectors, the existing action classification models can be fully utilized for action category feature extraction. Specifically, the action category feature extraction process based on the first action trajectory sample in S104 above, to obtain the first action category feature vector, specifically includes:

[0141] Step B4: Based on the above-mentioned first motion trajectory samples, construct the spatiotemporal graph of the first motion trajectory;

[0142] Step B5: Using the pre-trained action classification model, action category features are extracted based on the above-mentioned first action trajectory spatiotemporal graph to obtain the first action category feature vector.

[0143] The aforementioned action classification model can be a Spatiotemporal Graph Convolutional Network (ST-GCN) or another deep neural network used for action classification. Specifically, taking ST-GCN as an example, the spatiotemporal graph of the first action trajectory can be represented as $G(V, E_S, E_T)$, where V represents the vertex set, recording the position of key point i at each time t; $E_S$ represents the spatial edge set, recording the connection relationships of joints on the human skeleton H at a certain time t; and $E_T$ represents the temporal edge set, recording the temporal connection relationships of a given human skeletal joint. The specific construction process of the motion trajectory spatiotemporal graph can be found in existing technologies and will not be repeated here. In addition, the construction process of the motion trajectory spatiotemporal graphs mentioned below is similar.
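To make the structure of $G(V, E_S, E_T)$ concrete, the following sketch enumerates the vertex set and both edge sets for a trajectory of N frames. It is only one plausible construction: the bone list describing the skeleton H, the choice to link each joint to itself in the next frame, and the function name `build_spatiotemporal_graph` are assumptions for illustration.

```python
def build_spatiotemporal_graph(num_frames, num_joints, bones):
    """Enumerate vertices and edges of a skeleton spatiotemporal graph.

    num_frames : N, number of 3D skeletons in the trajectory
    num_joints : number of key points per skeleton
    bones      : list of (joint_a, joint_b) pairs describing skeleton H
    A vertex is identified by the pair (t, i): key point i at time t.
    """
    vertices = [(t, i) for t in range(num_frames) for i in range(num_joints)]
    # Spatial edges: bones of the skeleton within the same frame.
    spatial_edges = [((t, a), (t, b))
                     for t in range(num_frames) for a, b in bones]
    # Temporal edges: the same joint in consecutive frames.
    temporal_edges = [((t, i), (t + 1, i))
                      for t in range(num_frames - 1) for i in range(num_joints)]
    return vertices, spatial_edges, temporal_edges
```

In a full pipeline each vertex would additionally carry the 3D coordinates of its key point as its feature; that part is omitted here.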

[0144] Specifically, taking the action classification model as a spatiotemporal graph convolutional network (ST-GCN) as an example, the spatiotemporal graph of the first action trajectory is input into the spatiotemporal graph convolutional network (ST-GCN), and its output is a feature vector F with a preset dimension. This feature vector F can be used as the feature vector of the first action category mentioned above, that is, it can be expressed as F = STGCN(G).

[0145] It should be noted that the step of constructing the first action trajectory spatiotemporal graph based on the first action trajectory sample is optional and can be adjusted according to the requirements of the action classification model for the input data. For the case where the action classification model is a spatiotemporal graph convolutional network (ST-GCN), since the spatiotemporal graph needs to be used as the input of the spatiotemporal graph convolutional network (ST-GCN), the first action trajectory spatiotemporal graph is constructed. However, for the case where the action classification model is another deep neural network used for action classification, it is necessary to combine the requirements of the deep neural network for the input data and generate the required input data based on the first action trajectory sample.

[0146] Furthermore, regarding the process of identifying the action category of the predicted action trajectory samples, existing action classification models can also be fully utilized for action category identification. Therefore, during model training, by introducing a pre-trained action classification model, it can not only be used to extract action category features from the first action trajectory sample, but also to identify the action category of the predicted action trajectory sample. Specifically, the action category identification based on the first predicted feature vector in S1061 above, to obtain the action category identification result, specifically includes:

[0147] Step A4: Perform 3D skeleton transformation processing based on the first predicted feature vector to obtain the predicted motion trajectory sample.

[0148] Specifically, taking the first predicted feature vector as the predicted motion trajectory representation matrix Q as an example, 3D skeleton transformation processing is performed based on each row of the predicted motion trajectory representation matrix Q to obtain the predicted motion trajectory sample; wherein, the predicted motion trajectory sample includes the target 3D skeleton information corresponding to each row of the predicted motion trajectory representation matrix Q (i.e., the predicted 3D skeleton information corresponding to the target image frame and the original 3D skeleton information corresponding to the non-target image frame).

[0149] Step A5: Based on the above predicted motion trajectory sample, construct a second motion trajectory spatiotemporal graph;

[0150] Specifically, taking the action classification model as a spatiotemporal graph convolutional network (ST-GCN) as an example, since the input data of the spatiotemporal graph convolutional network (ST-GCN) should be a spatiotemporal graph, after generating the corresponding predicted action trajectory sample for the first action trajectory sample, the corresponding second action trajectory spatiotemporal graph is first constructed based on the predicted action trajectory sample, and then the second action trajectory spatiotemporal graph is used as the input of the action classification model to extract action category features.

[0151] Step A6: Using the pre-trained action classification model, action category features are extracted based on the second action trajectory spatiotemporal graph to obtain a third action category feature vector. Then, action category recognition is performed based on the third action category feature vector to obtain the action category recognition result.

[0152] Specifically, after obtaining the second action trajectory spatiotemporal graph, it is input into the pre-trained action classification model to first obtain the third action category feature vector. Then, based on the third action category feature vector, the probability that the action category of the predicted action trajectory sample is the target action category (i.e., the true category label of the first action trajectory sample) is obtained, which is the action category recognition result.

[0153] It should be noted that the above-mentioned motion trajectory generation model may include a trained generation sub-model. However, in the motion trajectory generation stage (i.e., the application stage of the motion trajectory generation model), one implementation is to introduce a reference motion trajectory and use the pre-trained motion classification model to extract motion category features from the reference motion trajectory, obtaining a target motion category feature vector used to constrain the motion style. This target motion category feature vector is then used as the input of the motion trajectory generation model to predict the motion trajectory, so that the motion style of the predicted target motion trajectory is highly similar to that of the reference motion trajectory; that is, the predicted target motion trajectory imitates the motion style of the reference motion trajectory. In this implementation, the above-mentioned motion trajectory generation model may therefore include both the trained generation sub-model and the pre-trained motion classification model.

[0154] Furthermore, regarding the specific training process of the above action classification model, before obtaining the first sample dataset in S102, it also includes:

[0155] Step D1: Obtain the second sample dataset; wherein, the second sample dataset includes multiple second motion trajectory samples;

[0156] Specifically, the second motion trajectory sample may be the same as or different from the first motion trajectory sample. The second sample dataset may be partially different from the first sample dataset, or all of the samples may be different. Each second motion trajectory sample may also be a motion trajectory sample that includes N 3D skeleton information of the target object. Specifically, each second motion trajectory sample is generated based on an animation video sample with a preset playback duration. If the animation video sample includes N motion image frames, each motion image frame is converted into corresponding 3D skeleton information. The motion trajectory composed of N 3D skeleton information is a second motion trajectory sample.

[0157] Specifically, the process of generating a second motion trajectory sample based on an animated video sample with a preset playback duration can be achieved by transforming the animated video sample using a preset transformation matrix, so that the 3D skeleton of the intermediate image frame j of the animated video sample is aligned with the standard viewpoint of a preset camera, thereby obtaining the aligned 3D skeleton motion trajectory (i.e., the second motion trajectory sample).

[0158] Step D2: Based on the second motion trajectory sample mentioned above, construct the spatiotemporal graph of the third motion trajectory;

[0159] Specifically, taking the action classification model to be trained as a spatiotemporal graph convolutional network (ST-GCN) as an example, the spatiotemporal graph of the third action trajectory mentioned above can also be represented as $G(V, E_S, E_T)$, where V represents the vertex set, recording the position of key point i at each time t; $E_S$ represents the spatial edge set, recording the connection relationships of joints on the human skeleton H at a certain time t; and $E_T$ represents the temporal edge set, recording the temporal connection relationships of a given human skeletal joint. The specific construction process of the motion trajectory spatiotemporal graph can be found in existing technologies and will not be elaborated here.

[0160] In practice, to ensure sufficient samples for training the action classification model, the intermediate image frame j used for alignment during the generation of the second action trajectory sample is randomly selected from the animation video sample, which simulates differences in camera shooting angle. Thus, for the same animation video sample, different choices of j yield different aligned 3D skeleton action trajectories (i.e., different second action trajectory samples), and the third action trajectory spatiotemporal graphs constructed from them also differ, thereby achieving the effect of sample data augmentation.

[0161] Step D3: Input the above-mentioned third action trajectory spatiotemporal graph into the action classification model to be trained for action category feature extraction processing to obtain the fourth action category feature vector, and perform action category prediction based on the above-mentioned fourth action category feature vector to obtain the action category prediction result.

[0162] Specifically, after obtaining the third action trajectory spatiotemporal graph, it is input into the action classification model to be trained. First, the fourth action category feature vector is obtained. Then, based on the fourth action category feature vector, the probability that the action category of the second action trajectory sample is the target action category (i.e., the true category label of the second action trajectory sample) is obtained, which is the action category prediction result.

[0163] Step D4: Based on the prediction results of the action category corresponding to each second action trajectory sample and the feature vector deviation information, determine the classification loss value; wherein, the above-mentioned feature vector deviation information includes the deviation information between the fourth action category feature vector and the target center feature vector, and the target center feature vector is the center feature vector corresponding to the true action category of the second action trajectory sample;

[0164] Specifically, considering that it is necessary not only to extract action category features using the trained action classification model to obtain action category feature vectors, but also to construct action feature distribution information corresponding to the target action category based on the action category feature vectors, the above classification loss value includes not only the conventional cross-entropy classification loss $L_{CE}$ calculated based on the action category prediction results, but also the center deviation loss $L_{Centre}$ calculated based on the feature vector deviation information. The center deviation loss is used to constrain the action category feature vectors to be more concentrated around the target center feature vector, making the action feature distribution information constructed based on the action category feature vectors more consistent with a preset probability distribution (such as a Gaussian distribution). The classification loss value is $L = L_{CE} + \lambda_1 L_{Centre}$, where $\lambda_1$ represents the weighting parameter used to balance the proportions of the two loss components (i.e., cross-entropy loss and center deviation loss).

[0165] In practical implementation, the center deviation loss $L_{Centre}$ is computed from F and $C_k$, where F represents the fourth action category feature vector and $C_k$ represents the target center feature vector. Each target action category k corresponds to one target center feature vector $C_k$; $C_k$ can be treated as a parameter that the model needs to learn, and it has the same dimension as the fourth action category feature vector F.
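A minimal sketch of the classification loss $L = L_{CE} + \lambda_1 L_{Centre}$ follows. The exact expression of the center deviation loss is not given in this passage, so the sketch assumes the widely used squared Euclidean form $\tfrac{1}{2}\lVert F - C_k \rVert^2$ averaged over the batch; the function name `classification_loss` and its argument names are hypothetical.

```python
import numpy as np

def classification_loss(probs, features, labels, centers, lambda1=0.5):
    """Cross-entropy plus center deviation loss (illustrative only).

    probs    : (B, K) predicted category probabilities
    features : (B, D) fourth action category feature vectors F
    labels   : (B,)   true action category index k of each sample
    centers  : (K, D) target center feature vectors C_k (learnable in practice)
    lambda1  : weighting parameter balancing the two loss components
    """
    probs = np.asarray(probs, dtype=float)
    features = np.asarray(features, dtype=float)
    centers = np.asarray(centers, dtype=float)
    labels = np.asarray(labels, dtype=int)
    rows = np.arange(len(labels))
    l_ce = -np.mean(np.log(np.clip(probs[rows, labels], 1e-12, 1.0)))
    # Assumed form: half the squared distance to the center of the true class.
    diff = features - centers[labels]
    l_centre = 0.5 * np.mean(np.sum(diff ** 2, axis=1))
    return float(l_ce + lambda1 * l_centre)
```

In an actual training loop the centers would be updated together with the network weights, for example by an automatic differentiation framework; this sketch only evaluates the loss value.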

[0166] Step D5: Based on the above classification loss value, perform iterative training on the above action classification model to obtain the trained action classification model.

[0167] Specifically, after determining the classification loss value corresponding to the second sample dataset, the gradient descent method is used to update the parameters of the action classification model based on the classification loss value. It should be noted that the process of iteratively training the model parameters based on the loss value to obtain the trained action classification model can be found in the existing process of using gradient descent to backpropagate and fine-tune the model parameters, which will not be elaborated here.

[0168] In one specific embodiment, Figure 6 shows a schematic diagram of the specific implementation principle of the training process of the action classification model, including:

[0169] Acquire multiple second motion trajectory samples; wherein any second motion trajectory sample p includes N original 3D skeleton information, for example, N is 5;

[0170] Each second motion trajectory sample is input into the motion classification model to be trained for motion category feature extraction to obtain a fourth motion category feature vector. Based on the fourth motion category feature vector, motion category prediction is performed to obtain the motion category prediction result.

[0171] Based on the prediction results of the action category corresponding to each second action trajectory sample and the feature vector deviation information, the classification loss value is determined; wherein, the aforementioned feature vector deviation information includes the deviation information between the fourth action category feature vector and the target center feature vector, and the target center feature vector is the center feature vector corresponding to the true action category of the second action trajectory sample;

[0172] Based on the above classification loss value, the above action classification model is iteratively trained to obtain the trained action classification model.

[0173] Furthermore, in the motion trajectory generation stage (i.e., the application stage of the motion trajectory generation model), in order to improve the diversity of motion styles of the motion trajectories generated by the model without introducing reference motion trajectories, another approach is to pre-construct corresponding motion feature distribution information for each target motion category, so that a candidate feature vector can be sampled from the motion feature distribution information as the target motion category feature vector. Since different candidate feature vectors correspond to the same motion category but different motion styles, this can also improve the diversity of motion styles of the generated motion trajectories. Moreover, if the motion style of the motion trajectory generated by the model does not meet expectations, the motion trajectory generation step can be re-executed to obtain a motion trajectory with another motion style, thereby making the motion style of the motion trajectory selectable. Constructing the corresponding motion feature distribution information for each target motion category specifically includes the following steps, performed after step D5 (iteratively training the above motion classification model based on the above classification loss value to obtain the trained motion classification model):

[0174] Step D6: Obtain the third sample dataset; wherein, the third sample dataset includes multiple third action trajectory samples under the target action category;

[0175] The target action category mentioned above can be any action category among multiple action categories involved in the target action trajectory generation scenario. For each target action category, a third sample dataset corresponding to the target action category is obtained so as to construct the action feature distribution information corresponding to the target action category based on the third sample dataset.

[0176] Specifically, the third motion trajectory sample may be the same as or different from the second motion trajectory sample. The third sample dataset may be partially different from the second sample dataset, or all of the samples may be different. Each third motion trajectory sample may also be a motion trajectory sample that includes N 3D skeleton information of the target object. Specifically, each third motion trajectory sample is generated based on an animation video sample with a preset playback duration. If the animation video sample includes N motion image frames, each motion image frame is converted into corresponding 3D skeleton information. The motion trajectory composed of N 3D skeleton information is a third motion trajectory sample.

[0177] Step D7: Based on the third motion trajectory sample mentioned above, construct the spatiotemporal graph of the fourth motion trajectory;

[0178] Specifically, taking the action classification model as a spatiotemporal graph convolutional network (ST-GCN) as an example, since the input data of the spatiotemporal graph convolutional network (ST-GCN) should be a spatiotemporal graph, after obtaining the third action trajectory sample, the corresponding fourth action trajectory spatiotemporal graph is first constructed based on the third action trajectory sample. Then, the fourth action trajectory spatiotemporal graph is used as the input of the action classification model to extract action category features, obtain the corresponding action category feature vector, and then construct the corresponding action feature distribution information.

[0179] Step D8: Input the spatiotemporal graphs of each fourth action trajectory into the trained action classification model to extract action category features and obtain the fifth action category feature vector.

[0180] Specifically, the process of using the action classification model to output the feature vector of the fifth action category is equivalent to the application process of the trained action classification model.

[0181] Step D9: Based on the feature vector of the fifth action category corresponding to each third action trajectory sample, construct the action feature distribution information corresponding to the target action category;

[0182] The aforementioned action feature distribution information includes a set of candidate feature vectors, which includes candidate feature vectors corresponding to various action styles under the target action category. The candidate feature vectors are used as target action category feature vectors during the action trajectory generation stage, and the target action category feature vectors are used to constrain the action style of the action trajectory.

[0183] Specifically, since the action categories of the third action trajectory samples are all the target action category, while the action styles of different third action trajectory samples can be the same or different, the fifth action category feature vectors are obtained by extracting action category features from action trajectory samples of multiple action styles under the target action category. Then, based on the multiple fifth action category feature vectors, the action feature distribution information corresponding to the target action category is constructed (i.e., a probability distribution function is fitted based on the feature vectors corresponding to many action styles). This action feature distribution information covers the feature vectors of a wider range of action styles. Therefore, in the action trajectory generation stage (i.e., the application stage of the action trajectory generation model), a candidate feature vector is sampled from the action feature distribution information corresponding to the target action category as the target action category feature vector. This target action category feature vector can be used as the input of the action trajectory generation model, that is, as a controllable condition for the action trajectory generation process that constrains the action style of the generated action trajectory. The sampling can be random or can follow preset rules.

[0184] In practical implementation, the action feature distribution information corresponding to any target action category k can be represented as a Gaussian distribution function.

[0185] Where i represents the sequence number of the fourth action trajectory spatiotemporal graph, with values ranging from 1 to N_k; N_k represents the number of fourth action trajectory spatiotemporal graphs under the target action category k; and F_i represents the fifth action category feature vector corresponding to the fourth action trajectory spatiotemporal graph with sequence number i.
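
As an illustration only, the following Python sketch fits a Gaussian to a set of fifth action category feature vectors and samples one candidate feature vector from it. The text above does not reproduce the exact expression of the Gaussian distribution function, so the per-dimension (diagonal) form, the function names fit_gaussian and sample_candidate, and the toy feature vectors are all assumptions of this sketch rather than the method of the application.

```python
import math
import random


def fit_gaussian(vectors):
    """Fit a per-dimension (diagonal) Gaussian to equal-length feature vectors.

    Assumption: dimensions are treated as independent; the source text does
    not specify the covariance structure.
    """
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n for d in range(dim)]
    return mean, var


def sample_candidate(mean, var, rng):
    """Draw one candidate feature vector from the fitted distribution."""
    return [rng.gauss(m, math.sqrt(s)) for m, s in zip(mean, var)]


# Toy fifth action category feature vectors for one target action category.
features = [[0.0, 1.0], [2.0, 1.0], [0.0, 3.0], [2.0, 3.0]]
mean, var = fit_gaussian(features)

# Random sampling; a seeded generator keeps the example reproducible.
candidate = sample_candidate(mean, var, random.Random(0))
```

Each call to sample_candidate with a different random state yields a different candidate feature vector, which corresponds to the idea that different candidates stand for different action styles within the same action category.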

[0186] In one specific embodiment, Figure 7 shows a schematic diagram of the specific implementation principle of constructing the action feature distribution information corresponding to a target action category, which specifically includes:

[0187] Obtain multiple third motion trajectory samples under the target motion category; wherein any third motion trajectory sample f includes N pieces of original 3D skeleton information, for example, N is 5;

[0188] Each third motion trajectory sample is input into the trained motion classification model to extract motion category features, resulting in the fifth motion category feature vector.

[0189] Based on the fifth action category feature vector corresponding to each third action trajectory sample, construct the action feature distribution information corresponding to the target action category.

[0190] It should be noted that any action category can be used as the target action category, and corresponding action feature distribution information can be constructed for any action category.

[0191] In practical implementation, the aforementioned feature vector transformation network can be regarded as part of the generative sub-model. Alternatively, because the feature vector transformation network can be implemented with an existing feature encoding network structure, and the generative and discriminative sub-models can be implemented with an existing adversarial neural network structure, the feature vector transformation network can be kept relatively independent of the generative sub-model, with the generative and discriminative sub-models together constituting the adversarial neural network. When an existing adversarial neural network structure is used, the output of the generative sub-model is typically the aforementioned intermediate motion trajectory representation matrix J. In this case, the model to be trained includes a feature vector transformation sub-model (equivalent to the aforementioned feature vector transformation network), a generative sub-model (equivalent to the aforementioned motion trajectory generation network), and a discriminative sub-model, and the replacement of known motion feature vectors in the intermediate motion trajectory representation matrix J to obtain the predicted motion trajectory representation matrix Q is implemented by other processing modules. Based on this, Figure 8 shows a schematic diagram of the specific implementation principle of the training process for a motion trajectory generation model. The training process specifically includes:

[0192] Obtain M first motion trajectory samples; wherein any first motion trajectory sample q includes N pieces of original 3D skeleton information, for example, N is 5;

[0193] For any first action trajectory sample q, a pre-trained action classification model is used to extract action category features based on the first action trajectory sample q, resulting in a first action category feature vector. This first action category feature vector is then input into the feature vector transformation submodel in the model to be trained for feature transformation, resulting in a second action category feature vector.

[0194] Based on the first motion trajectory sample q, motion trajectory feature extraction processing is performed to obtain the original motion trajectory representation matrix X (i.e. the original motion trajectory feature vector, including the known motion feature vectors corresponding to N original 3D skeleton information respectively);

[0195] Occlusion processing is performed on multiple target row matrices (i.e., the known action feature vectors corresponding to target image frames) in the original motion trajectory representation matrix X to obtain the occluded motion trajectory representation matrix X_m (i.e., the first motion trajectory feature vector); for example, the row matrices with serial numbers 2 and 3 are occluded;

[0196] The occluded target row matrices in the above occluded motion trajectory representation matrix X_m are completed to obtain the completed motion trajectory representation matrix (i.e., the second motion trajectory feature vector);

[0197] Based on the position encoding vectors of the N 3D skeletons in the first motion trajectory sample q, position marking processing is performed on each row matrix of the above completed motion trajectory representation matrix to obtain the preprocessed motion trajectory representation matrix I (i.e., the first preprocessed feature vector, including the newly completed motion feature vectors, after position marking, of the 3D skeleton information corresponding to the target image frames and the known motion feature vectors, after position marking, of the 3D skeleton information corresponding to the non-target image frames).

[0198] The second action category feature vector is concatenated with the preprocessed action trajectory representation matrix I to obtain the concatenated action trajectory representation matrix R (i.e., the first concatenated feature vector).

[0199] The concatenated action trajectory representation matrix R corresponding to the first action trajectory sample q is input into the generator sub-model in the model to be trained to predict the action trajectory, and the intermediate action trajectory representation matrix J is obtained.

[0200] Replace the motion feature vectors corresponding to non-target image frames in the intermediate motion trajectory representation matrix J with the corresponding known motion feature vectors to obtain the predicted motion trajectory representation matrix Q (i.e., the first predicted feature vector, which includes the predicted motion feature vectors of the 3D skeleton information corresponding to the target image frame and the known motion feature vectors of the 3D skeleton information corresponding to the non-target image frame).

[0201] Based on the predicted motion trajectory representation matrix Q, 3D skeleton transformation is performed to obtain predicted motion trajectory samples; using a pre-trained motion classification model, motion category recognition is performed based on the predicted motion trajectory samples to obtain motion category recognition results.

[0202] The predicted motion trajectory representation matrix Q and the corresponding original motion trajectory representation matrix X are input into the discriminant sub-model in the model to be trained to perform trajectory authenticity discrimination, resulting in a motion trajectory discrimination result set. The motion trajectory discrimination result set includes a first discrimination result in which the generated sample (corresponding to the predicted motion trajectory representation matrix Q) is judged as true, a second discrimination result in which the real sample (corresponding to the original motion trajectory representation matrix X) is judged as true, and a third discrimination result in which the generated sample (corresponding to the predicted motion trajectory representation matrix Q) is judged as fake.

[0203] Based on the above action category recognition results and the first discrimination result, the parameters of the above generation sub-model and feature vector transformation sub-model are iteratively updated, and the parameters of the above discrimination sub-model are iteratively updated based on the above second discrimination result and the third discrimination result, so as to obtain the trained action trajectory generation model; wherein, the trained action trajectory generation model may include the trained feature vector transformation sub-model, generation sub-model and discrimination sub-model.
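
The preprocessing steps in the training process above (occlusion, completion, and position marking) can be sketched as follows. This is a minimal illustration under stated assumptions: linear interpolation is used for completion, position marking is done by adding a position code to each row, and the function names, toy matrix, and position codes are hypothetical. The source does not fix any of these choices in this passage.

```python
def occlude(matrix, target_rows):
    """Return a copy of the matrix with the target rows set to None (occluded)."""
    return [None if i in target_rows else list(row) for i, row in enumerate(matrix)]


def complete_by_interpolation(occluded):
    """Fill occluded rows by linear interpolation between the nearest known rows.

    Assumption: the first and last rows are known, so every occluded row
    has a known neighbour on each side.
    """
    known = [i for i, row in enumerate(occluded) if row is not None]
    completed = []
    for i, row in enumerate(occluded):
        if row is not None:
            completed.append(list(row))
            continue
        left = max(k for k in known if k < i)
        right = min(k for k in known if k > i)
        t = (i - left) / (right - left)
        completed.append(
            [(1 - t) * a + t * b for a, b in zip(occluded[left], occluded[right])]
        )
    return completed


def mark_positions(matrix, position_codes):
    """Add a position code to each row (element-wise addition is an assumption)."""
    return [
        [x + p for x, p in zip(row, code)]
        for row, code in zip(matrix, position_codes)
    ]


# Original representation matrix X with N = 5 rows (toy 2-D feature vectors).
X = [[0.0, 0.0], [5.0, 1.0], [7.0, 2.0], [3.0, 6.0], [4.0, 8.0]]

# Occlude the rows with serial numbers 2 and 3 (0-based indices 1 and 2).
X_m = occlude(X, target_rows={1, 2})
X_completed = complete_by_interpolation(X_m)

# Hypothetical position codes, one per row.
codes = [[0.1 * i, 0.1 * i] for i in range(len(X))]
I_pre = mark_positions(X_completed, codes)
```

Because the occluded rows are rebuilt from their neighbours rather than copied from X, the completed matrix only provides a starting estimate for the target image frames, which the generative sub-model then replaces with its own predictions.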

[0204] In the training method for the motion trajectory generation model of this embodiment of the application, during the model training phase, motion category features and motion trajectory features are extracted from the first motion trajectory samples to obtain the motion category feature vector and the motion trajectory feature vector corresponding to each first motion trajectory sample, and the motion trajectory feature vector is preprocessed to obtain a first preprocessed feature vector. The motion category feature vector corresponding to the first motion trajectory sample serves as the first-dimension feature vector, extracted from the dimension of the motion trajectory category, and the first preprocessed feature vector serves as the second-dimension feature vector, extracted from the dimension of the motion trajectory itself; both are input into the model to be trained. The model to be trained then predicts the motion trajectory using the motion category feature vector and the first preprocessed feature vector to obtain the first predicted feature vector, and motion category identification and trajectory authenticity determination are performed based on the first predicted feature vector. Finally, the parameters of the model to be trained are iteratively updated based on the motion category identification results and motion trajectory determination results of each first motion trajectory sample. First, since the first predicted feature vector is obtained from feature vectors of both the motion trajectory category and the motion trajectory itself, the motion category feature vector is taken into account during motion trajectory prediction, which ensures the controllability of the motion category of the predicted motion trajectory.
Second, the model parameters are iteratively updated using the motion category identification results and motion trajectory determination results obtained from the first predicted feature vector, without requiring the predicted motion trajectory to be as consistent as possible with the motion trajectory samples; only the action category recognition loss and the motion trajectory authenticity loss are considered when optimizing the model parameters. As a result, during parameter iteration the predicted motion trajectory (i.e., the output motion trajectory) is not required to converge strictly in action style to the motion trajectory sample (i.e., the input motion trajectory); the output motion trajectory is only constrained to be consistent with the input motion trajectory at the action category level. That is, the output and input motion trajectories are allowed to belong to the same action category while having different action styles, so that the trained model can output motion trajectories of the target action category with diverse action styles. Furthermore, when the trained model is used for motion trajectory completion, the target action category feature vector can be introduced to constrain the action style of the motion trajectory, thereby ensuring the controllability of the action style of the motion trajectory output by the trained model.
Third, during the iterative update of the model parameters, the distinction between real and fake motion trajectory predictions is continuously learned through multiple rounds of adversarial generation and discrimination. Because of the discriminative sub-model, the model parameters are adjusted based on its discrimination results, which further trains the generative sub-model so that the motion trajectory predicted by the trained generative sub-model is closer to a real motion trajectory, thereby further improving the realism of the motion trajectory output by the trained model.
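
To make the two-part optimization objective concrete, the sketch below combines an action category recognition loss with an adversarial authenticity loss and deliberately contains no term that compares the predicted trajectory with the input sample. The cross-entropy and log-loss forms, the weighting parameter, and all function names are assumptions of this sketch; the passage above names the losses but does not give their formulas.

```python
import math


def cross_entropy(probs, target_index):
    """Action category recognition loss for one predicted trajectory sample."""
    return -math.log(probs[target_index])


def generator_loss(category_probs, target_category, d_fake_as_real, weight=1.0):
    """Category loss plus adversarial loss from the first discrimination result.

    d_fake_as_real: probability (assumed in (0, 1]) that the discriminator
    judges the generated sample to be real. No reconstruction term is used.
    """
    adversarial = -math.log(d_fake_as_real)
    return cross_entropy(category_probs, target_category) + weight * adversarial


def discriminator_loss(d_real_as_real, d_fake_as_fake):
    """Loss from the second (real judged real) and third (generated judged fake) results."""
    return -math.log(d_real_as_real) - math.log(d_fake_as_fake)


# A generated sample that is confidently recognized as the right category and
# that fools the discriminator gives a lower generator loss than an uncertain one.
good = generator_loss([0.9, 0.1], target_category=0, d_fake_as_real=0.9)
poor = generator_loss([0.5, 0.5], target_category=0, d_fake_as_real=0.5)
```

In this form the generator is rewarded only for producing a trajectory of the right category that looks real, which is what leaves room for the output to differ in style from the input sample.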

[0205] Corresponding to the training method for the motion trajectory generation model described above with reference to Figures 1 to 8, and based on the same technical concept, an embodiment of this application also provides a motion trajectory generation method. Figure 9 is a flowchart of the motion trajectory generation method provided in an embodiment of this application. The method in Figure 9 can be executed by an electronic device equipped with a motion trajectory generation device, which can be a terminal device or a designated server. It should be noted that the motion trajectory generation device provided in this application embodiment can be applied to any specific application scenario that requires rendering animation videos based on motion trajectories, such as rendering game character animation videos based on motion trajectories, or rendering action demonstration videos based on motion trajectories.

[0206] Specifically, regarding the process of generating motion trajectories, as shown in Figure 9, the method includes at least the following steps:

[0207] S902, obtain the motion trajectory to be completed; and obtain the target motion category feature vector, which is used to constrain the motion style of the motion trajectory.

[0208] The motion trajectory to be completed includes multiple pieces of known 3D skeleton information (i.e., 3D skeleton information corresponding to known image frames). The known 3D skeleton information in the motion trajectory to be completed can be regarded as known keyframe skeletons, and the motion image frames corresponding to the known keyframe skeletons can be regarded as known image frames. The motion image frames corresponding to the missing positions in the motion trajectory to be completed can be regarded as image frames to be completed. The known 3D skeleton information can be manually drawn by animators or selected from an already completed motion trajectory.

[0209] After obtaining the target action category feature vector, the 3D skeleton information at the missing positions in the action trajectory to be completed is predicted based on the target action category feature vector. Then, the completed action trajectory is obtained. Since the 3D skeleton information at the missing positions in the completed action trajectory is obtained based on the target action category feature vector, the action style of the completed action trajectory is similar to the action style constrained by the target action category feature vector. In other words, the target action category feature vector can constrain the action style of the 3D skeleton information at the missing positions in the action trajectory to be completed, that is, it constrains the action style of the target object in the image frame to be completed, thereby constraining the action style of the completed action trajectory.

[0210] It should be noted that the known 3D skeleton information can be any one of the 3D skeleton information corresponding to the first image frame, the 3D skeleton information corresponding to the last image frame, or the 3D skeleton information corresponding to the intermediate image frames; the known image frames involved in the motion trajectory generation stage are similar to the non-target image frames mentioned above, and the image frames to be completed involved in the motion trajectory generation stage are similar to the target image frames mentioned above.

[0211] The process of determining the target action category feature vector can be achieved by extracting action category features based on a reference action trajectory, or by sampling from the action feature distribution information corresponding to the target action category.

[0212] S904, based on the above-mentioned action trajectory to be completed, the action trajectory feature extraction process is performed to obtain the action trajectory feature vector to be completed, and based on the above-mentioned action trajectory feature vector to be completed, the second preprocessed feature vector is obtained.

[0213] The process of generating the feature vector of the motion trajectory to be completed can refer to the process of generating the feature vector of the original motion trajectory, and the process of generating the second preprocessed feature vector can refer to the process of generating the first preprocessed feature vector. These will not be repeated here.

[0214] It should be noted that during the motion trajectory generation stage, since the motion feature vectors corresponding to some image frames in the motion trajectory feature vector to be completed are missing and need to be predicted by the model, the above preprocessing includes feature vector completion and image frame position marking. That is, the preprocessing here does not include feature vector occlusion.

[0215] S906, Input the above target action category feature vector and the second preprocessed feature vector into the trained action trajectory generation model to predict the action trajectory and obtain the second predicted feature vector.

[0216] The trained motion trajectory generation model can be obtained by iterative training using the training method described above. Specifically, the generation process of the second predicted feature vector can refer to the generation process of the first predicted feature vector, and will not be repeated here.

[0217] S908, Based on the second predicted feature vector mentioned above, generate the completed action trajectory;

[0218] Specifically, a 3D skeleton transformation is performed based on the second predicted feature vector to obtain the completed motion trajectory (i.e., the predicted completed motion trajectory). The process of generating the completed motion trajectory can refer to the process of generating the predicted motion trajectory sample mentioned above, and will not be repeated here.

[0219] S910, Generate a target motion trajectory based on the completed motion trajectory corresponding to at least one motion trajectory to be completed.

[0220] Specifically, the completed motion trajectory corresponding to a single motion trajectory to be completed can be used directly as the target motion trajectory. However, the number N of pieces of 3D skeleton information in a completed motion trajectory may be less than the expected number (for example, every completed motion trajectory generated by the model contains N 3D skeletons, while the actual motion trajectory needs to contain more), and a segment of the completed motion trajectory may not meet expectations (for example, the transition between adjacent 3D skeletons in the completed motion trajectory generated by the model is not natural or smooth). In these cases, the motion trajectory generation model can be used to generate the corresponding completed motion trajectories for multiple motion trajectories to be completed, and these completed motion trajectories can then be synthesized to obtain the target motion trajectory.

[0221] It should be noted that the motion trajectory generation process actually involves two motion trajectory completion processes. The first motion trajectory completion is during the preprocessing stage, and the second motion trajectory completion is when the motion trajectory generation model predicts the motion trajectory of the image frame to be completed. That is, firstly, interpolation is used to add the motion feature vector corresponding to the image frame to be completed, and then the motion trajectory generation model is used to predict the motion trajectory based on the interpolated motion feature vector corresponding to the image frame to be completed, and update the motion feature vector corresponding to the image frame to be completed (that is, update the interpolated motion feature vector corresponding to the image frame to be completed with the predicted motion feature vector).

[0222] In one specific embodiment, Figure 10a shows a schematic diagram of the specific implementation principle of a motion trajectory generation process, which includes:

[0223] Obtain the motion trajectory to be completed; the motion trajectory to be completed includes multiple known 3D skeleton information. For example, taking N=5 as an example, the position labels of the multiple known 3D skeleton information are 1, 4, and 5 respectively. That is, the unknown 3D skeleton information with position labels 2 and 3 needs to be completed.

[0224] Obtain the target action category feature vector; the target action category feature vector is used to constrain the action style of the action trajectory, and the target action category feature vector is determined based on the reference action trajectory or the action feature distribution information corresponding to the target action category;

[0225] Based on the above-mentioned motion trajectory to be completed, motion trajectory feature extraction processing is performed to obtain the motion trajectory representation matrix to be completed (i.e., the motion trajectory feature vector to be completed, including the known motion feature vectors corresponding to multiple known 3D skeleton information in the motion trajectory to be completed); the motion trajectory representation matrix to be completed is preprocessed to obtain the corresponding preprocessed motion trajectory representation matrix (i.e., the second preprocessed feature vector, including the new completed motion feature vector corresponding to the unknown 3D skeleton information in the motion trajectory to be completed and the known motion feature vector corresponding to the known 3D skeleton information).

[0226] The trained motion trajectory generation model is used to predict motion trajectories based on the target motion category feature vector and the preprocessed motion trajectory representation matrix, resulting in the predicted motion trajectory representation matrix (i.e., the second predicted feature vector).

[0227] Based on the above-mentioned predicted motion trajectory representation matrix, 3D skeleton transformation processing is performed to obtain the completed motion trajectory.

[0228] Generate the target motion trajectory based on the completed motion trajectory corresponding to at least one motion trajectory to be completed.

[0229] It should be noted that the aforementioned second predicted feature vector may include the predicted action feature vector corresponding to the unknown 3D skeleton information in the action trajectory to be completed and the predicted action feature vector corresponding to the known 3D skeleton information; in addition, the predicted action feature vector corresponding to the known 3D skeleton information may be replaced with the known action feature vector. Therefore, the aforementioned second predicted feature vector may also include the predicted action feature vector corresponding to the unknown 3D skeleton information in the action trajectory to be completed and the known action feature vector corresponding to the known 3D skeleton information.
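
The replacement of predicted vectors by known vectors described above can be sketched in a few lines. The function name replace_known_rows, the dictionary layout of the known vectors, and the toy values are hypothetical and serve only to illustrate the N = 5 example in which positions 1, 4, and 5 are known.

```python
def replace_known_rows(intermediate, known):
    """Keep the known action feature vectors and take predictions elsewhere.

    intermediate: rows predicted by the model for every position
    known: dict mapping a 0-based row index to its known feature vector
    """
    return [
        list(known[i]) if i in known else list(row)
        for i, row in enumerate(intermediate)
    ]


# Model output for all 5 positions (toy values).
J = [[0.9, 0.1], [1.2, 2.1], [2.2, 3.9], [2.8, 6.3], [4.1, 7.7]]

# Known vectors at position labels 1, 4 and 5 (0-based indices 0, 3 and 4).
known = {0: [1.0, 0.0], 3: [3.0, 6.0], 4: [4.0, 8.0]}

Q = replace_known_rows(J, known)
```

After this step only the rows for the image frames to be completed come from the model, so the known keyframes are preserved exactly in the completed trajectory.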

[0230] Specifically, regarding the generation process of the second predicted feature vector, the action trajectory generation model can include a feature vector transformation network, a feature vector concatenation network, and an action trajectory generation network.

[0231] Correspondingly, in S906 above, the target action category feature vector and the second preprocessed feature vector are input into the trained action trajectory generation model to predict the action trajectory, resulting in the second predicted feature vector, which specifically includes:

[0232] The aforementioned feature vector transformation network performs feature transformation on the target action category feature vector to obtain the transformed action category feature vector.

[0233] Specifically, the process of generating the transformed action category feature vector can refer to the process of generating the second action category feature vector described above, and will not be repeated here.

[0234] The aforementioned feature vector concatenation network concatenates the transformed action category feature vector with the aforementioned second preprocessed feature vector to obtain the second concatenated feature vector.

[0235] Specifically, the process of generating the second concatenated feature vector can refer to the process of generating the first concatenated feature vector described above, and will not be repeated here.

[0236] The aforementioned motion trajectory generation network predicts motion trajectories based on the aforementioned second concatenated feature vector, generating a second predicted feature vector.

[0237] Specifically, the process of generating the second predicted feature vector can refer to the process of generating the first predicted feature vector described above, and will not be repeated here.

[0238] In one specific embodiment, on the basis of Figure 10a above, Figure 10b shows a schematic diagram of the specific implementation principle of a motion trajectory generation process. Where the motion trajectory generation model includes a generation sub-model that specifically includes a feature vector transformation network, a feature vector concatenation network, and a motion trajectory generation network, the process of using the trained motion trajectory generation model to predict the motion trajectory based on the target motion category feature vector and the preprocessed motion trajectory representation matrix, to obtain the predicted motion trajectory representation matrix (i.e., the second predicted feature vector), specifically includes:

[0239] The aforementioned feature vector transformation network performs feature transformation on the target action category feature vector to obtain the transformed action category feature vector;

[0240] The aforementioned feature vector concatenation network concatenates the transformed action category feature vector with the preprocessed action trajectory representation matrix corresponding to the action trajectory to be completed, to obtain the corresponding concatenated action trajectory representation matrix (i.e., the second concatenated feature vector).

[0241] The above-mentioned motion trajectory generation network predicts motion trajectories based on the concatenated motion trajectory representation matrix corresponding to the motion trajectory to be completed, and obtains the corresponding intermediate motion trajectory representation matrix. Then, the predicted feature vectors corresponding to the known 3D skeleton information in the intermediate motion trajectory representation matrix are replaced with the corresponding known motion feature vectors to obtain the predicted motion trajectory representation matrix (i.e., the second predicted feature vector) corresponding to the motion trajectory to be completed.

[0242] It should be noted that the process of replacing known action feature vectors based on the intermediate action trajectory representation matrix to obtain the predicted action trajectory representation matrix can be implemented by the aforementioned action trajectory generation network or by other processing modules. That is, the output of the aforementioned action trajectory generation network is the intermediate action trajectory representation matrix, and then other processing modules replace known action feature vectors based on the intermediate action trajectory representation matrix to obtain the predicted action trajectory representation matrix.

[0243] Specifically, regarding the generation process of the target action category feature vector, obtaining the target action category feature vector in S902 above specifically includes:

[0244] Obtain a reference motion trajectory; use a pre-trained motion classification model to perform motion category feature extraction based on the spatiotemporal graph of the motion trajectory corresponding to the above reference motion trajectory to obtain the target motion category feature vector;

[0245] or,

[0246] Obtain the action feature distribution information corresponding to the target action category; sample a target action category feature vector from the above action feature distribution information.

[0247] Specifically, in the motion trajectory generation stage (i.e., the application stage of the motion trajectory generation model), the aforementioned motion trajectory generation model can include a trained generation sub-model and a pre-trained motion classification model. Correspondingly, for the generation process of the target motion category feature vector, one implementation is to introduce a reference motion trajectory and use the pre-trained motion classification model to extract motion category features from the reference motion trajectory to obtain a target motion category feature vector used to constrain the motion style. Then, the target motion category feature vector is used as the input of the motion trajectory generation model to perform motion trajectory prediction, so that the predicted target motion trajectory has a high similarity to the motion style of the reference motion trajectory, that is, the motion style of the predicted target motion trajectory imitates the motion style of the reference motion trajectory.

[0248] Specifically, in the motion trajectory generation stage (i.e., the application stage of the motion trajectory generation model), the aforementioned motion trajectory generation model can include a trained generation sub-model. To improve the diversity of motion styles generated by the model without introducing reference motion trajectories, another approach to generating the target motion category feature vector is to obtain the motion feature distribution information corresponding to the target motion category and to sample a candidate feature vector from it as the target motion category feature vector. This target motion category feature vector can be used as an input to the motion trajectory generation model, i.e., as a controllable condition for the motion trajectory generation process, to constrain the motion style of the generated motion trajectory. Since different candidate feature vectors correspond to the same motion category but different motion styles, the diversity of motion styles of the generated motion trajectories can also be improved. Furthermore, if the motion style of the currently generated motion trajectory does not meet expectations, the motion trajectory generation step can be re-executed to obtain a motion trajectory with a different motion style, which makes the motion style of the motion trajectory selectable. The motion feature distribution information can be constructed during the training stage of the motion trajectory generation model, and the motion feature distribution information corresponding to any target motion category k can be represented as a Gaussian distribution function.

[0249] In one specific embodiment, on the basis of Figure 10a above, Figure 10c shows a schematic diagram of the specific implementation principle of a motion trajectory generation process. Specifically, the process of obtaining the target motion category feature vector includes:

[0250] The first approach involves obtaining a reference motion trajectory and then using a pre-trained motion classification model to extract motion category features from the reference motion trajectory, thereby obtaining a target motion category feature vector used to constrain the motion style.

[0251] The second implementation method is to obtain the action feature distribution information corresponding to the target action category, and sample a candidate feature vector from that distribution information as the target action category feature vector.

[0252] It should be noted that either the first or second implementation method can be chosen to obtain the target action category feature vector based on actual needs.

[0253] Specifically, regarding the generation process of the target motion trajectory, step S910 above, in which the target motion trajectory is generated based on the completed motion trajectory corresponding to at least one motion trajectory to be completed, includes:

[0254] Based on the completed action trajectory corresponding to at least one action trajectory to be completed, a synthetic action trajectory is determined; wherein, if the current action trajectory to be completed is the first action trajectory to be completed, the completed action trajectory corresponding to it is directly determined as the synthetic action trajectory.

[0255] If the above synthesized motion trajectory does not meet the preset constraints, the process continues to obtain the next motion trajectory to be completed, so as to generate the completed motion trajectory corresponding to the motion trajectory to be completed.

[0256] If the above synthesized motion trajectory meets the preset constraints, then the synthesized motion trajectory will be determined as the target motion trajectory.

[0257] Specifically, the aforementioned preset constraints may include at least one of the following: the number of 3D skeletons in the synthesized motion trajectory is greater than or equal to the expected number, and the synthesized motion trajectory meets the expected animation effect. If the currently obtained synthesized motion trajectory does not meet the preset constraints, the target motion trajectory can be obtained by performing multiple rounds of motion trajectory completion using the motion trajectory generation model.
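The loop formed by the three steps above and the preset constraints can be sketched as follows. This is a hedged illustration rather than the application's implementation: the function names, the `max_rounds` safety cap, and the toy stand-ins (a "skeleton" reduced to a single number, completion by midpoint insertion, and a merge rule that simply takes the newest result) are all assumptions made so that the sketch runs on its own.

```python
def generate_target_trajectory(first_incomplete, complete, next_incomplete,
                               merge, constraints, max_rounds=10):
    """Repeat completion until the synthetic trajectory meets every constraint."""
    # First trajectory to be completed: its completion is used directly.
    synthetic = complete(first_incomplete)
    for _ in range(max_rounds):
        if all(check(synthetic) for check in constraints):
            return synthetic  # becomes the target motion trajectory
        # Constraints not met: obtain the next trajectory to be completed.
        incomplete = next_incomplete(synthetic)
        synthetic = merge(synthetic, complete(incomplete))
    raise RuntimeError("constraints not met within max_rounds")


# Toy stand-ins (hypothetical): each "skeleton" is one float.
def toy_complete(trajectory):
    """Insert a midpoint between every pair of neighbouring skeletons."""
    result = [trajectory[0]]
    for a, b in zip(trajectory, trajectory[1:]):
        result += [(a + b) / 2, b]
    return result


target = generate_target_trajectory(
    first_incomplete=[0.0, 4.0],
    complete=toy_complete,
    next_incomplete=lambda synthetic: synthetic,   # user keeps every skeleton
    merge=lambda old, new: new,                    # newest result replaces old
    constraints=[lambda t: len(t) >= 5],           # expected number of skeletons
)
```

Passing the constraints as a list reflects that the preset constraints may include any subset of the listed conditions.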

[0258] Specifically, the motion trajectory to be completed may include 3D skeleton information manually drawn by the target user (such as an animator). For example, the first motion trajectory to be completed may include the first 3D skeleton, at least one intermediate 3D skeleton, and the last 3D skeleton manually drawn by the target user. The motion trajectory to be completed may also include 3D skeleton information selected by the target user (such as an animator) from the currently obtained synthetic motion trajectory. For example, for a non-first motion trajectory to be completed, the target user selects multiple 3D skeletons from the currently obtained synthetic motion trajectory, and the next motion trajectory to be completed is generated based on the selected 3D skeleton information and the corresponding position information. Optionally, if the currently obtained synthetic motion trajectory contains motion trajectory segments that do not meet expectations, the target user selects multiple specified 3D skeletons from those segments; the first specified 3D skeleton in order is taken as the first 3D skeleton in the motion trajectory to be completed, the specified 3D skeletons in intermediate positions are taken as the intermediate 3D skeletons, and the last specified 3D skeleton in order is taken as the last 3D skeleton.

[0259] In one specific embodiment, on the basis of Figure 10a above, Figure 10d shows a schematic diagram of the specific implementation principle of a motion trajectory generation process. Taking the motion trajectory to be completed in Figure 10a as motion trajectory to be completed 1 (i.e., the first motion trajectory to be completed) and the corresponding completed motion trajectory as completed motion trajectory 1 as an example, the process of generating the target motion trajectory based on the completed motion trajectory specifically includes:

[0260] If the completed motion trajectory 1 does not meet the preset constraints, the process continues to obtain the next motion trajectory to be completed (i.e., motion trajectory 2 to be completed). For example, if the number of 3D skeletons in the completed motion trajectory 1 is less than the expected number (e.g., the expected number is 7) and the motion trajectory segment with sequence numbers 1 to 3 is not smooth enough, then the 3D skeletons with sequence numbers 1 and 3 in the completed motion trajectory 1 can be used as the first and last 3D skeletons in the motion trajectory 2 to be completed. In practice, the target user can specify, through human-computer interaction on the completed motion trajectory 1, the known 3D skeletons of the next motion trajectory to be completed. Specifically, the known skeleton selection information input by the target user based on the completed motion trajectory 1 is received, and the next motion trajectory to be completed (i.e., motion trajectory 2 to be completed) is generated from it. The motion trajectory 2 to be completed includes two pieces of known 3D skeleton information with position identifiers 1 and 5 respectively, meaning that the unknown 3D skeleton information with position identifiers 2, 3, and 4 needs to be completed.

[0261] Obtain the target action category feature vector; this target action category feature vector may be the same as or different from the target action category feature vector used in the process of generating the completed action trajectory 1.

[0262] Based on the above-mentioned motion trajectory 2 to be completed, motion trajectory feature extraction processing is performed to obtain motion trajectory representation matrix 2 to be completed; motion trajectory representation matrix 2 to be completed is preprocessed to obtain the corresponding preprocessed motion trajectory representation matrix 2.

[0263] The trained motion trajectory generation model is used to predict motion trajectories based on the target motion category feature vector and the preprocessed motion trajectory representation matrix 2, resulting in the predicted motion trajectory representation matrix 2.

[0264] Based on the above-mentioned predicted motion trajectory representation matrix 2, 3D skeleton transformation processing is performed to obtain the completed motion trajectory 2.

[0265] Based on the completed motion trajectory 1 and completed motion trajectory 2, the target motion trajectory is generated.
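The Figure 10d example can be traced with a small sketch. Two points are assumptions and are not stated in the application: that completed motion trajectory 1 contains five 3D skeletons, and that the target trajectory is formed by replacing the segment with sequence numbers 1 to 3 with completed motion trajectory 2. The function names and the string labels standing in for 3D skeletons are hypothetical.

```python
def build_next_incomplete(completed, first_idx, last_idx, n_positions):
    """Put two known skeletons at positions 1 and n_positions; None = unknown."""
    incomplete = [None] * n_positions
    incomplete[0] = completed[first_idx - 1]    # sequence numbers are 1-based
    incomplete[-1] = completed[last_idx - 1]
    return incomplete


def splice(completed_1, completed_2, first_idx, last_idx):
    """Assumed merge rule: swap segment first_idx..last_idx for trajectory 2."""
    return completed_1[:first_idx - 1] + completed_2 + completed_1[last_idx:]


# Completed trajectory 1 (assumed to hold five skeletons, labelled s1..s5).
completed_1 = ["s1", "s2", "s3", "s4", "s5"]

# Skeletons 1 and 3 become the known positions 1 and 5 of trajectory 2.
incomplete_2 = build_next_incomplete(completed_1, first_idx=1, last_idx=3,
                                     n_positions=5)
unknown_positions = [i + 1 for i, s in enumerate(incomplete_2) if s is None]

# Stand-in for the model output: n2..n4 are the newly predicted skeletons.
completed_2 = ["s1", "n2", "n3", "n4", "s3"]

target = splice(completed_1, completed_2, first_idx=1, last_idx=3)
```

Under these assumptions the target trajectory has seven skeletons, which matches the expected number used in the example.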

[0266] It should be noted that Figure 10d above only illustrates the specific implementation process of generating a target motion trajectory based on the completed motion trajectories corresponding to two motion trajectories to be completed. The process of generating a target motion trajectory based on the completed motion trajectories corresponding to more than two motion trajectories to be completed can refer to the specific implementation process described above, and will not be repeated here.

[0267] In practical implementation, the aforementioned feature vector transformation network can be regarded as part of the generation sub-model. Alternatively, considering that the feature vector transformation network can be implemented using an existing feature encoding network structure, and that the generation sub-model and the discrimination sub-model can be implemented using an existing adversarial neural network structure, the feature vector transformation network can be relatively independent of the generation sub-model, with the generation sub-model and the discrimination sub-model together constituting an adversarial neural network. Additionally, when an existing adversarial neural network structure is used, the output of the generation sub-model is typically the aforementioned intermediate motion trajectory representation matrix. In this case, the aforementioned motion trajectory generation model includes a feature vector transformation sub-model (equivalent to the aforementioned feature vector transformation network) and a generation sub-model (equivalent to the aforementioned motion trajectory generation network), and the process of replacing known motion feature vectors in the intermediate motion trajectory representation matrix to obtain the predicted motion trajectory representation matrix is implemented by other processing modules. Based on this, on the basis of Figure 10c above, Figure 11 shows a schematic diagram of the specific implementation principle of a motion trajectory generation process. Each step of generating the completed motion trajectory based on the motion trajectory to be completed specifically includes:

[0268] Obtain the trajectory of the action to be completed, and generate the target action category feature vector using either the first or second implementation method described above;

[0269] The target action category feature vector is input into the feature vector transformation sub-model in the action trajectory generation model for feature transformation processing, resulting in the transformed action category feature vector; and

[0270] Based on the motion trajectory to be completed, the motion trajectory feature extraction process is performed to obtain the motion trajectory representation matrix to be completed.

[0271] The row matrices to be completed in the above motion trajectory representation matrix to be completed are filled in to obtain the completed motion trajectory representation matrix.

[0272] Based on the position encoding vectors of the known and unknown 3D skeletons in the motion trajectory to be completed, position marking processing is performed on each row of the above completed motion trajectory representation matrix to obtain the preprocessed motion trajectory representation matrix.

[0273] The transformed action category feature vectors are concatenated with the preprocessed action trajectory representation matrix to obtain the concatenated action trajectory representation matrix.

[0274] The concatenated motion trajectory representation matrix corresponding to the motion trajectory to be completed is input into the generation sub-model in the motion trajectory generation model to predict the motion trajectory and obtain the intermediate motion trajectory representation matrix.

[0275] Replace the predicted feature vector corresponding to the known 3D skeleton in the intermediate motion trajectory representation matrix corresponding to the motion trajectory to be completed with the corresponding known motion feature vector to obtain the predicted motion trajectory representation matrix.

[0276] Based on the above-mentioned predicted motion trajectory representation matrix, 3D skeleton transformation processing is performed to obtain the completed motion trajectory.
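The data flow of the steps above, from the representation matrix to be completed through to the replacement of known rows, can be sketched as follows. The sketch makes several assumptions that the application does not specify: unknown rows are filled with zeros, position marking appends a known/unknown flag and a normalised position to each row, the category vector (taken here as already transformed) is concatenated to every row, and `stub_generator` is a placeholder that returns constant rows instead of a trained generation sub-model. All names are hypothetical.

```python
def fill_missing_rows(rows, dim):
    """Fill rows of unknown skeletons with zeros (assumed placeholder value)."""
    return [row if row is not None else [0.0] * dim for row in rows]


def mark_positions(rows, known_mask):
    """Append a known/unknown flag and a normalised position (assumed encoding)."""
    last = max(len(rows) - 1, 1)
    return [row + [1.0 if known else 0.0, i / last]
            for i, (row, known) in enumerate(zip(rows, known_mask))]


def concat_category(rows, category_vec):
    """Concatenate the transformed category vector to every row."""
    return [row + category_vec for row in rows]


def stub_generator(rows, dim):
    """Placeholder for the generation sub-model: one constant row per input row."""
    return [[0.5] * dim for _ in rows]


def restore_known_rows(intermediate, original_rows):
    """Overwrite predictions for known skeletons with their known vectors."""
    return [orig if orig is not None else pred
            for pred, orig in zip(intermediate, original_rows)]


dim = 2
to_complete = [[1.0, 2.0], None, None, [3.0, 4.0]]   # None marks unknown rows
known_mask = [row is not None for row in to_complete]

filled = fill_missing_rows(to_complete, dim)
marked = mark_positions(filled, known_mask)
joined = concat_category(marked, [0.1, 0.2, 0.3])
intermediate = stub_generator(joined, dim)
predicted = restore_known_rows(intermediate, to_complete)
```

Restoring the known rows after prediction keeps the user-specified skeletons exactly as given, so that only the unknown positions are taken from the model output.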

[0277] In the motion trajectory generation method of this embodiment of the application, motion trajectory feature extraction is first performed to obtain the feature vector of the motion trajectory to be completed, and this feature vector is preprocessed to obtain a second preprocessed feature vector; at the same time, a target action category feature vector is obtained to constrain the motion style of the motion trajectory. The second preprocessed feature vector and the target action category feature vector are then used as input to the motion trajectory generation model, which predicts the motion trajectory to obtain a second predicted feature vector. On the one hand, since the second predicted feature vector is obtained from feature vectors of two dimensions, the motion trajectory category and the motion trajectory itself, the action category feature vector is taken into account during motion trajectory prediction, which ensures the controllability of the action category of the predicted motion trajectory. On the other hand, the target action category feature vector can be set according to actual needs, and during the training phase of the motion trajectory generation model only the action category recognition loss and the motion trajectory authenticity loss are considered when optimizing the model parameters. As a result, during the iterative update of the model parameters, the predicted motion trajectory (i.e., the output motion trajectory) is not required to strictly match the motion trajectory sample (i.e., the input motion trajectory) in motion style; the only constraint is that the output motion trajectory and the input motion trajectory are consistent at the action category level. That is, the output motion trajectory and the input motion trajectory are allowed to belong to the same action category but have different action styles. Therefore, the trained model allows for diverse action styles when outputting motion trajectories belonging to the target action category. Furthermore, when the trained model is used for motion trajectory completion, the target action category feature vector can be introduced to constrain the action style of the motion trajectory, thereby ensuring the controllability of the action style of the motion trajectory output by the trained model. In addition, the motion trajectory generation model is obtained by continuously learning to distinguish real from fake motion trajectory predictions through multiple rounds of adversarial learning based on generation and discrimination, with iterative parameter updates. That is, the model parameters are adjusted based on the discrimination results of the discrimination sub-model, which further trains the generation sub-model. Therefore, the motion trajectory predicted by the trained generation sub-model is more realistic and closer to a real motion trajectory, thereby further improving the realism of the motion trajectory output by the trained model.

[0278] It should be noted that this embodiment in this application is based on the same inventive concept as the previous embodiment in this application. Therefore, the specific implementation of this embodiment can be referred to the implementation of the training method of the aforementioned motion trajectory generation model, and the repeated parts will not be described again.

[0279] Corresponding to the training method for the motion trajectory generation model described in Figures 1 to 8 above, and based on the same technical concept, embodiments of this application also provide a training device for the motion trajectory generation model. Figure 12 is a schematic diagram of the module composition of the training device for the motion trajectory generation model provided in this application embodiment. The device is used to execute the training method for the motion trajectory generation model described in Figures 1 to 8. As shown in Figure 12, the device includes:

[0280] The first acquisition module 1202 is used to acquire a first sample dataset; the first sample dataset includes multiple first motion trajectory samples.

[0281] The first processing module 1204 is used to perform action category feature extraction processing based on the first action trajectory sample to obtain a first action category feature vector; and to perform action trajectory feature extraction processing based on the first action trajectory sample to obtain an original action trajectory feature vector, and to perform preprocessing based on the original action trajectory feature vector to obtain a first preprocessed feature vector.

[0282] Model training module 1206 is used to input the first action category feature vector and the first preprocessed feature vector corresponding to each first action trajectory sample into the model to be trained for iterative training to obtain the action trajectory generation model;

[0283] The model to be trained includes a generation sub-model and a discrimination sub-model; each round of model training is implemented as follows:

[0284] For each of the first motion trajectory samples: the generation sub-model generates a first predicted feature vector based on the first motion category feature vector and the first preprocessed feature vector corresponding to the first motion trajectory sample; the discrimination sub-model performs trajectory authenticity discrimination based on the first predicted feature vector and the original motion trajectory feature vector corresponding to the first motion trajectory sample, and generates a motion trajectory discrimination result set; and performs motion category recognition based on the first predicted feature vector to obtain the motion category recognition result.

[0285] The parameters of the generation sub-model are updated based on the action category recognition result and the action trajectory discrimination result set; and the parameters of the discrimination sub-model are updated based on the action trajectory discrimination result set.
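The division of update signals described above can be illustrated with a minimal loss sketch. It assumes standard binary cross-entropy for trajectory authenticity and cross-entropy for action category recognition, with a hypothetical `category_weight`; the application does not specify the loss formulas, so these choices and all function names are assumptions. Note that no term compares the predicted trajectory with the input sample.

```python
import math

EPS = 1e-12  # guards against log(0)


def bce(prob, label):
    """Binary cross-entropy for one probability and a 0/1 label."""
    p = min(max(prob, EPS), 1.0 - EPS)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def category_loss(class_probs, true_class):
    """Cross-entropy of the recognised category of the predicted trajectory."""
    return -math.log(max(class_probs[true_class], EPS))


def discriminator_loss(d_real, d_fake):
    """Discrimination sub-model: score original features as real, predictions as fake."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake, class_probs, true_class, category_weight=1.0):
    """Generation sub-model: fool the discriminator and keep the category."""
    adversarial = bce(d_fake, 1.0)
    return adversarial + category_weight * category_loss(class_probs, true_class)


# Example discrimination result set and recognition result (invented numbers).
d_real, d_fake = 0.9, 0.2
class_probs, true_class = [0.7, 0.2, 0.1], 0

d_loss = discriminator_loss(d_real, d_fake)
g_loss = generator_loss(d_fake, class_probs, true_class)
```

In this sketch the generator loss falls as the discriminator's score for the predicted trajectory rises, and it rises when the recognised category drifts away from the input category, which mirrors the two signals named in the paragraph above.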

[0286] In the training device for the motion trajectory generation model of this embodiment of the application, during the model training phase, motion category feature extraction and motion trajectory feature extraction are performed on the first motion trajectory samples to obtain the motion category feature vector and the motion trajectory feature vector corresponding to each first motion trajectory sample, and the motion trajectory feature vector is preprocessed to obtain a first preprocessed feature vector. Specifically, the motion category feature vector corresponding to the first motion trajectory sample is used as the first-dimension feature vector, extracted from the motion trajectory category dimension, and is input into the model to be trained together with the first preprocessed feature vector corresponding to the first motion trajectory sample, which is used as the second-dimension feature vector, extracted from the dimension of the motion trajectory itself. The model to be trained then predicts the motion trajectory based on the motion category feature vector and the first preprocessed feature vector to obtain the first predicted feature vector. Next, motion category recognition and trajectory authenticity discrimination are performed based on the first predicted feature vector. Finally, the parameters of the model to be trained are iteratively updated based on the motion category recognition results and motion trajectory discrimination results of each first motion trajectory sample. On the one hand, since the first predicted feature vector is obtained from feature vectors of two dimensions, the motion trajectory category and the motion trajectory itself, the motion category feature vector is taken into account during motion trajectory prediction, ensuring the controllability of the motion category of the predicted motion trajectory. On the other hand, the model parameters are iteratively updated using the motion category recognition results and motion trajectory discrimination results obtained from the first predicted feature vector, without requiring the predicted motion trajectory to be as consistent as possible with the motion trajectory samples; only the action category recognition loss and the motion trajectory authenticity loss are considered when optimizing the model parameters. As a result, during the iterative update of the model parameters, the predicted motion trajectory (i.e., the output motion trajectory) is not required to strictly match the motion trajectory sample (i.e., the input motion trajectory) in action style; the only constraint is that the output motion trajectory is consistent with the input motion trajectory at the action category level. That is, the output motion trajectory and the input motion trajectory are allowed to belong to the same action category but have different action styles. Thus, the trained model allows for diverse action styles when outputting motion trajectories belonging to the target action category. Furthermore, when the trained model is used for motion trajectory completion, the target action category feature vector can be introduced to constrain the action style of the motion trajectory, thereby ensuring the controllability of the action style of the motion trajectory output by the trained model. In addition, during the iterative update of the model parameters, the model continuously learns to distinguish real from fake motion trajectory predictions through multiple rounds of adversarial learning based on generation and discrimination. Owing to the discrimination sub-model, the model parameters are adjusted based on its discrimination results, which further trains the generation sub-model, so that the motion trajectory predicted by the trained generation sub-model is more realistic and closer to a real motion trajectory, thereby further improving the realism of the motion trajectory output by the trained model.

[0287] It should be noted that the embodiments of the training device for the motion trajectory generation model in this application and the embodiments of the training method for the motion trajectory generation model in this application are based on the same inventive concept. Therefore, the specific implementation of this embodiment can be referred to the implementation of the corresponding training method for the motion trajectory generation model mentioned above, and the repeated parts will not be described again.

[0288] Corresponding to the motion trajectory generation method described in Figures 9 to 11 above, and based on the same technical concept, embodiments of this application also provide a motion trajectory generation device. Figure 13 is a schematic diagram of the module composition of the motion trajectory generation device provided in the embodiments of this application. The device is used to perform the motion trajectory generation method described in Figures 9 to 11. As shown in Figure 13, the device includes:

[0289] The second acquisition module 1302 is used to acquire the motion trajectory to be completed; and to acquire the target motion category feature vector, wherein the target motion category feature vector is used to constrain the motion style of the motion trajectory.

[0290] The second processing module 1304 is used to perform motion trajectory feature extraction processing based on the motion trajectory to be completed, to obtain a motion trajectory feature vector to be completed, and to perform preprocessing based on the motion trajectory feature vector to be completed, to obtain a second preprocessed feature vector.

[0291] The motion trajectory prediction module 1306 is used to input the target motion category feature vector and the second preprocessed feature vector into the trained motion trajectory generation model to predict the motion trajectory and obtain the second predicted feature vector.

[0292] The first trajectory generation module 1308 is used to generate a completed action trajectory based on the second predicted feature vector;

[0293] The second trajectory generation module 1310 is used to generate a target motion trajectory based on at least one of the completed motion trajectories corresponding to the motion trajectory to be completed.
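The way modules 1302 to 1310 hand data to one another can be sketched as a simple composition. This is an illustrative arrangement only: the class name, the use of injected callables, and the toy lambdas (which treat each "skeleton" as a single number) are assumptions and are not described in the application.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TrajectoryGenerationDevice:
    acquire: Callable        # second acquisition module 1302
    preprocess: Callable     # second processing module 1304
    predict: Callable        # motion trajectory prediction module 1306
    to_trajectory: Callable  # first trajectory generation module 1308
    assemble: Callable       # second trajectory generation module 1310

    def run(self):
        """Pass data through the five modules in order."""
        incomplete, category_vec = self.acquire()
        features = self.preprocess(incomplete)
        predicted = self.predict(category_vec, features)
        completed = self.to_trajectory(predicted)
        return self.assemble([completed])


# Toy stand-ins for each module (hypothetical behaviour).
device = TrajectoryGenerationDevice(
    acquire=lambda: ([1.0, None, 3.0], [0.1]),
    preprocess=lambda traj: [0.0 if s is None else s for s in traj],
    predict=lambda cat, feats: [f + cat[0] for f in feats],
    to_trajectory=lambda pred: [round(p, 3) for p in pred],
    assemble=lambda completed_list: [s for t in completed_list for s in t],
)
result = device.run()
```

Keeping each module as a replaceable callable mirrors the module-level description, in which the prediction module wraps the trained model while the other modules handle acquisition, preprocessing, and assembly.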

[0294] In the motion trajectory generation device of this embodiment, motion trajectory feature extraction is first performed to obtain the feature vector of the motion trajectory to be completed, and this feature vector is preprocessed to obtain a second preprocessed feature vector; at the same time, a target action category feature vector is obtained to constrain the motion style of the motion trajectory. The second preprocessed feature vector and the target action category feature vector are then used as input to the motion trajectory generation model, which predicts the motion trajectory to obtain a second predicted feature vector. On the one hand, since the second predicted feature vector is obtained from feature vectors of two dimensions, the motion trajectory category and the motion trajectory itself, the motion category feature vector is taken into account during motion trajectory prediction, thus ensuring the controllability of the motion category of the predicted motion trajectory. On the other hand, the target action category feature vector can be set according to actual needs, and during the training phase of the motion trajectory generation model only the motion category recognition loss and the motion trajectory authenticity loss are considered when optimizing the model parameters. As a result, during the iterative update of the model parameters, the predicted motion trajectory (i.e., the output motion trajectory) is not required to strictly match the motion trajectory sample (i.e., the input motion trajectory) in motion style; the only constraint is that the output motion trajectory and the input motion trajectory are consistent at the action category level. That is, the output motion trajectory and the input motion trajectory are allowed to belong to the same action category but have different action styles. Therefore, the trained model allows for diverse action styles when outputting motion trajectories belonging to the target action category. Furthermore, when the trained model is used for motion trajectory completion, the target action category feature vector can be introduced to constrain the action style of the motion trajectory, thereby ensuring the controllability of the action style of the motion trajectory output by the trained model. In addition, the motion trajectory generation model is obtained by continuously learning to distinguish real from fake motion trajectory predictions through multiple rounds of adversarial learning based on generation and discrimination, with iterative parameter updates. That is, the model parameters are adjusted based on the discrimination results of the discrimination sub-model, which further trains the generation sub-model. Therefore, the motion trajectory predicted by the trained generation sub-model is more realistic and closer to a real motion trajectory, thereby further improving the realism of the motion trajectory output by the trained model.

[0295] It should be noted that the embodiments of the motion trajectory generation device in this application and the embodiments of the motion trajectory generation method in this application are based on the same inventive concept. Therefore, the specific implementation of this embodiment can be referred to the implementation of the corresponding motion trajectory generation method mentioned above, and the repeated parts will not be described again.

[0296] Furthermore, corresponding to the methods described in Figures 1 to 11 above, and based on the same technical concept, embodiments of this application also provide a computer device for executing the above-described training method for the motion trajectory generation model or the above-described motion trajectory generation method, as shown in Figure 14.

[0297] Computer devices can vary significantly due to differences in configuration or performance. They may include one or more processors 1401 and memory 1402, with memory 1402 storing one or more application programs or data. Memory 1402 can be temporary or persistent storage. The application programs stored in memory 1402 may include one or more modules (not shown), each module including a series of computer-executable instructions for the computer device. Furthermore, processor 1401 may be configured to communicate with memory 1402 and execute the series of computer-executable instructions stored in memory 1402 on the computer device. The computer device may also include one or more power supplies 1403, one or more wired or wireless network interfaces 1404, one or more input / output interfaces 1405, one or more keyboards 1406, etc.

[0298] In one specific embodiment, the computer device includes a memory and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer-executable instructions for use in the computer device, and is configured to be executed by one or more processors. The one or more programs include computer-executable instructions for performing the following:

[0299] Obtain the first sample dataset; the first sample dataset includes multiple first action trajectory samples;

[0300] Based on the first action trajectory sample, action category feature extraction processing is performed to obtain a first action category feature vector; and based on the first action trajectory sample, action trajectory feature extraction processing is performed to obtain an original action trajectory feature vector, and based on the original action trajectory feature vector, preprocessing is performed to obtain a first preprocessed feature vector.

[0301] The first action category feature vector and the first preprocessed feature vector corresponding to each of the first action trajectory samples are input into the model to be trained for iterative training to obtain the action trajectory generation model.

[0302] The model to be trained includes a generation sub-model and a discrimination sub-model; each round of model training is implemented as follows:

[0303] For each of the first motion trajectory samples: the generation sub-model generates a first predicted feature vector based on the first motion category feature vector and the first preprocessed feature vector corresponding to the first motion trajectory sample; the discrimination sub-model performs trajectory authenticity discrimination based on the first predicted feature vector and the original motion trajectory feature vector corresponding to the first motion trajectory sample, and generates a motion trajectory discrimination result set; and performs motion category recognition based on the first predicted feature vector to obtain the motion category recognition result.

[0304] The parameters of the generation sub-model are updated based on the action category recognition result and the action trajectory discrimination result set; and the parameters of the discrimination sub-model are updated based on the action trajectory discrimination result set.

[0305] In the computer device of this embodiment of the application, during the model training phase, action category feature extraction and action trajectory feature extraction are performed on each first action trajectory sample to obtain the corresponding action category feature vector and action trajectory feature vector, and the action trajectory feature vector is preprocessed to obtain a first preprocessed feature vector. The action category feature vector serves as the first feature vector, extracted from the action category dimension, and the first preprocessed feature vector serves as the second feature vector, extracted from the dimension of the action trajectory itself; both are input into the model to be trained. The model to be trained then performs action trajectory prediction based on the action category feature vector and the first preprocessed feature vector to obtain a first predicted feature vector, and action category recognition and trajectory authenticity discrimination are performed on the first predicted feature vector. Finally, the parameters of the model to be trained are iteratively updated based on the action category recognition result and the action trajectory discrimination result of each first action trajectory sample. On the one hand, because the first predicted feature vector is obtained from feature vectors of both dimensions, the action category and the action trajectory itself, the action category feature vector is taken into account during action trajectory prediction, which ensures the controllability of the action category of the predicted action trajectory. On the other hand, the model parameters are iteratively updated using the action category recognition result and the action trajectory discrimination result obtained from the first predicted feature vector. These updates do not require the predicted action trajectory to be as consistent as possible with the action trajectory sample; only the action category recognition loss and the action trajectory authenticity loss are used to optimize the model parameters. As a result, the predicted action trajectory (the output action trajectory) is not forced to converge strictly to the action style of the action trajectory sample (the input action trajectory); it is only constrained to be consistent with the input action trajectory at the action category level. In other words, the output and input action trajectories may belong to the same action category while having different action styles, so the trained model can produce diverse action styles when outputting action trajectories of the target action category. Furthermore, when the trained model is used to complete an action trajectory, the target action category feature vector can be introduced to constrain the action style, which ensures the controllability of the action style of the output action trajectory. In addition, during the iterative update of the model parameters, the authenticity of the predicted action trajectory is learned continuously through multiple rounds of adversarial training between generation and discrimination. Because the model parameters are adjusted based on the discrimination results of the discrimination sub-model, the generation sub-model is trained further, so that the action trajectory it predicts is more realistic and closer to a real action trajectory, which further improves the realism of the action trajectory output by the trained model.
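As a rough illustration of the optimization target described in [0305], the two objectives might be written as below. The notation is an assumption introduced here and does not come from the application: $G$, $D$ and $C$ stand for the generation sub-model, the discrimination sub-model and the action category recognizer, $x$ is the original action trajectory feature vector, $\hat{x}$ the first predicted feature vector, $y$ the action category of the sample, and $\lambda$ a hypothetical weighting factor. The concrete form of each loss term is not specified in the text.

```latex
\mathcal{L}_G = \mathcal{L}_{\mathrm{cls}}\bigl(C(\hat{x}),\, y\bigr)
              + \lambda\, \mathcal{L}_{\mathrm{adv}}\bigl(D(\hat{x})\bigr),
\qquad
\mathcal{L}_D = \mathcal{L}_{\mathrm{real}}\bigl(D(x)\bigr)
              + \mathcal{L}_{\mathrm{fake}}\bigl(D(\hat{x})\bigr)
```

Under this reading, $\mathcal{L}_G$ contains no reconstruction term such as $\lVert \hat{x} - x \rVert$, which is what leaves the action style of the output free to differ from that of the input sample.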

[0306] In another specific embodiment, the computer device includes a memory and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer-executable instructions for use in the computer device, and is configured to be executed by one or more processors. The one or more programs include computer-executable instructions for performing the following:

[0307] Obtain the motion trajectory to be completed; and obtain the target motion category feature vector, which is used to constrain the motion style of the motion trajectory.

[0308] Based on the motion trajectory to be completed, motion trajectory feature extraction processing is performed to obtain the motion trajectory feature vector to be completed, and the motion trajectory feature vector to be completed is preprocessed to obtain the second preprocessed feature vector.

[0309] The target action category feature vector and the second preprocessed feature vector are input into the trained action trajectory generation model to predict the action trajectory, and the second predicted feature vector is obtained.

[0310] Based on the second predicted feature vector, the completed action trajectory is generated;

[0311] A target motion trajectory is generated based on at least one of the completed motion trajectories corresponding to the motion trajectory to be completed.
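The steps in [0307] to [0311] can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names, the use of `None` to mark frames that still need completing, the rule of keeping observed frames unchanged, and the choice of the first completed trajectory as the target are all hypothetical and are not taken from the application. The feature extractor, preprocessor and trained model are passed in as callables, and the stand-ins used in the example are toy placeholders.

```python
from typing import Callable, List, Optional, Sequence

Frame = List[float]  # one motion feature vector per image frame


def complete_action_trajectory(
    trajectory: Sequence[Optional[Frame]],
    target_category_vec: Frame,
    extract: Callable,
    preprocess: Callable,
    model: Callable,
) -> List[Frame]:
    """Run extraction, preprocessing and prediction, then build the completed trajectory."""
    features = extract(trajectory)                 # feature vector of the trajectory to be completed
    pre = preprocess(features)                     # second preprocessed feature vector
    predicted = model(target_category_vec, pre)    # second predicted feature vector
    # Assumption: observed frames are kept, predicted frames fill only the gaps.
    return [
        list(obs) if obs is not None else list(pred)
        for obs, pred in zip(features, predicted)
    ]


def generate_target_trajectory(completed: Sequence[List[Frame]]) -> List[Frame]:
    """Assumption: the target trajectory is simply the first completed candidate."""
    return completed[0]


# Toy stand-ins (hypothetical) so the sketch can be executed.
def identity_extract(trajectory):
    return list(trajectory)


def zero_fill(features):
    dim = len(next(f for f in features if f is not None))
    return [list(f) if f is not None else [0.0] * dim for f in features]


def toy_model(category_vec, frames):
    # A real model would condition on both inputs; this one echoes the category vector.
    return [list(category_vec) for _ in frames]


partial = [[1.0, 2.0], None, [3.0, 4.0]]
completed = complete_action_trajectory(
    partial, [0.5, 0.5], identity_extract, zero_fill, toy_model
)
target = generate_target_trajectory([completed])
```

Calling the pipeline several times with different target action category feature vectors would yield several completed trajectories for the same input, from which the target trajectory can then be chosen.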

[0312] In the computer device of this embodiment, on the one hand, action trajectory feature extraction is performed to obtain the feature vector of the action trajectory to be completed, and this feature vector is preprocessed to obtain a second preprocessed feature vector; at the same time, a target action category feature vector for constraining the action style of the action trajectory is obtained. The second preprocessed feature vector and the target action category feature vector are then input into the action trajectory generation model, which performs action trajectory prediction to obtain a second predicted feature vector. Because the second predicted feature vector is obtained from feature vectors of both dimensions, the action category and the action trajectory itself, the action category feature vector is taken into account during action trajectory prediction, which ensures the controllability of the action category of the predicted action trajectory. On the other hand, the target action category feature vector can be set according to actual needs, and in the training phase of the action trajectory generation model only the action category recognition loss and the action trajectory authenticity loss are used to optimize the model parameters. As a result, during the iterative update of the model parameters, the predicted action trajectory (the output action trajectory) is not forced to converge strictly to the action style of the action trajectory sample (the input action trajectory); it is only constrained to be consistent with the input action trajectory at the action category level. In other words, the output and input action trajectories may belong to the same action category while having different action styles, so the trained model can produce diverse action styles when outputting action trajectories of the target action category. Furthermore, when the trained model is used to complete an action trajectory, the target action category feature vector can be introduced to constrain the action style, which ensures the controllability of the action style of the output action trajectory. In addition, the action trajectory generation model is obtained by continuously learning the authenticity of predicted action trajectories through multiple rounds of adversarial training between generation and discrimination and by iteratively updating the parameters. Because the model parameters are adjusted based on the discrimination results of the discrimination sub-model, the generation sub-model is trained further, so that the action trajectory it predicts is more realistic and closer to a real action trajectory, which further improves the realism of the action trajectory output by the trained model.

[0313] It should be noted that the embodiments concerning computer devices in this application and the embodiments concerning the training method or motion trajectory generation method of the motion trajectory generation model in this application are based on the same inventive concept. Therefore, the specific implementation of this embodiment can refer to the implementation of the corresponding training method or motion trajectory generation method of the aforementioned motion trajectory generation model, and the repeated parts will not be described again.

[0314] Furthermore, corresponding to the methods shown in Figures 2 to 7 above and based on the same technical concept, this application also provides a storage medium for storing computer-executable instructions. In one specific embodiment, the storage medium can be a USB flash drive, optical disc, hard disk, etc. When the computer-executable instructions stored in the storage medium are executed by a processor, they implement the following process:

[0315] Obtain the first sample dataset; the first sample dataset includes multiple first action trajectory samples;

[0316] Based on the first action trajectory sample, action category feature extraction processing is performed to obtain a first action category feature vector; and based on the first action trajectory sample, action trajectory feature extraction processing is performed to obtain an original action trajectory feature vector, and based on the original action trajectory feature vector, preprocessing is performed to obtain a first preprocessed feature vector.

[0317] The first action category feature vector and the first preprocessed feature vector corresponding to each of the first action trajectory samples are input into the model to be trained for iterative training to obtain the action trajectory generation model.

[0318] The model to be trained includes a generation sub-model and a discrimination sub-model; each round of model training is implemented as follows:

[0319] For each of the first motion trajectory samples: the generation sub-model generates a first predicted feature vector based on the first motion category feature vector and the first preprocessed feature vector corresponding to the first motion trajectory sample; the discrimination sub-model performs trajectory authenticity discrimination based on the first predicted feature vector and the original motion trajectory feature vector corresponding to the first motion trajectory sample, and generates a motion trajectory discrimination result set; and performs motion category recognition based on the first predicted feature vector to obtain the motion category recognition result.

[0320] The parameters of the generation sub-model are updated based on the action category recognition result and the action trajectory discrimination result set, and the parameters of the discrimination sub-model are updated based on the action trajectory discrimination result set.

[0321] In this embodiment of the application, when the computer-executable instructions stored in the storage medium are executed by the processor, during the model training phase, action category feature extraction and action trajectory feature extraction are performed on each first action trajectory sample to obtain the corresponding action category feature vector and action trajectory feature vector, and the action trajectory feature vector is preprocessed to obtain a first preprocessed feature vector. The action category feature vector serves as the first feature vector, extracted from the action category dimension, and the first preprocessed feature vector serves as the second feature vector, extracted from the dimension of the action trajectory itself; both are input into the model to be trained. The model to be trained then performs action trajectory prediction based on the action category feature vector and the first preprocessed feature vector to obtain a first predicted feature vector, and action category recognition and trajectory authenticity discrimination are performed on the first predicted feature vector. Finally, the parameters of the model to be trained are iteratively updated based on the action category recognition result and the action trajectory discrimination result of each first action trajectory sample. On the one hand, because the first predicted feature vector is obtained from feature vectors of both dimensions, the action category and the action trajectory itself, the action category feature vector is taken into account during action trajectory prediction, which ensures the controllability of the action category of the predicted action trajectory. On the other hand, the model parameters are iteratively updated using the action category recognition result and the action trajectory discrimination result obtained from the first predicted feature vector. These updates do not require the predicted action trajectory to be as consistent as possible with the action trajectory sample; only the action category recognition loss and the action trajectory authenticity loss are used to optimize the model parameters. As a result, the predicted action trajectory (the output action trajectory) is not forced to converge strictly to the action style of the action trajectory sample (the input action trajectory); it is only constrained to be consistent with the input action trajectory at the action category level. In other words, the output and input action trajectories may belong to the same action category while having different action styles, so the trained model can produce diverse action styles when outputting action trajectories of the target action category. Furthermore, when the trained model is used to complete an action trajectory, the target action category feature vector can be introduced to constrain the action style, which ensures the controllability of the action style of the output action trajectory. In addition, during the iterative update of the model parameters, the authenticity of the predicted action trajectory is learned continuously through multiple rounds of adversarial training between generation and discrimination. Because the model parameters are adjusted based on the discrimination results of the discrimination sub-model, the generation sub-model is trained further, so that the action trajectory it predicts is more realistic and closer to a real action trajectory, which further improves the realism of the action trajectory output by the trained model.

[0322] In another specific embodiment, the storage medium can be a USB flash drive, optical disc, hard disk, etc., and the computer-executable instructions stored on the storage medium can achieve the following process when executed by the processor:

[0323] Obtain the motion trajectory to be completed; and obtain the target motion category feature vector, which is used to constrain the motion style of the motion trajectory.

[0324] Based on the motion trajectory to be completed, motion trajectory feature extraction processing is performed to obtain the motion trajectory feature vector to be completed, and the motion trajectory feature vector to be completed is preprocessed to obtain the second preprocessed feature vector.

[0325] The target action category feature vector and the second preprocessed feature vector are input into the trained action trajectory generation model to predict the action trajectory, and the second predicted feature vector is obtained.

[0326] Based on the second predicted feature vector, the completed action trajectory is generated;

[0327] A target motion trajectory is generated based on at least one of the completed motion trajectories corresponding to the motion trajectory to be completed.

[0328] When the computer-executable instructions stored in the storage medium in this embodiment are executed by the processor, on the one hand, action trajectory feature extraction is performed to obtain the feature vector of the action trajectory to be completed, and this feature vector is preprocessed to obtain a second preprocessed feature vector; at the same time, a target action category feature vector for constraining the action style of the action trajectory is obtained. The second preprocessed feature vector and the target action category feature vector are then input into the action trajectory generation model, which performs action trajectory prediction to obtain a second predicted feature vector. Because the second predicted feature vector is obtained from feature vectors of both dimensions, the action category and the action trajectory itself, the action category feature vector is taken into account during action trajectory prediction, which ensures the controllability of the action category of the predicted action trajectory. On the other hand, the target action category feature vector can be set according to actual needs, and in the training phase of the action trajectory generation model only the action category recognition loss and the action trajectory authenticity loss are used to optimize the model parameters. As a result, during the iterative update of the model parameters, the predicted action trajectory (the output action trajectory) is not forced to converge strictly to the action style of the action trajectory sample (the input action trajectory); it is only constrained to be consistent with the input action trajectory at the action category level. In other words, the output and input action trajectories may belong to the same action category while having different action styles, so the trained model can produce diverse action styles when outputting action trajectories of the target action category. Furthermore, when the trained model is used to complete an action trajectory, the target action category feature vector can be introduced to constrain the action style, which ensures the controllability of the action style of the output action trajectory. In addition, the action trajectory generation model is obtained by continuously learning the authenticity of predicted action trajectories through multiple rounds of adversarial training between generation and discrimination and by iteratively updating the parameters. Because the model parameters are adjusted based on the discrimination results of the discrimination sub-model, the generation sub-model is trained further, so that the action trajectory it predicts is more realistic and closer to a real action trajectory, which further improves the realism of the action trajectory output by the trained model.

[0329] It should be noted that the embodiments concerning storage media in this application are based on the same inventive concept as the embodiments concerning the training method or motion trajectory generation method of the motion trajectory generation model in this application. Therefore, the specific implementation of this embodiment can refer to the implementation of the corresponding training method or motion trajectory generation method of the aforementioned motion trajectory generation model, and the repeated parts will not be described again.

[0330] The foregoing has described specific embodiments of this application. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired results. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

[0331] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, embodiments of this application can take the form of entirely hardware embodiments, entirely software embodiments, or embodiments combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-readable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0332] This application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0333] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0334] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, causing a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0335] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.

[0336] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0337] Computer-readable media include both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0338] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0339] The embodiments of this application can be described in the general context of computer-executable instructions, such as program modules, that are executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a specific task or implement a specific abstract data type. One or more embodiments of this application can also be practiced in distributed computing environments where tasks are performed by remote processing devices connected via a communication network. In a distributed computing environment, program modules can reside in local and remote computer storage media, including storage devices.

[0340] The various embodiments in this application are described in a progressive manner. For identical or similar parts, the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are basically similar to the method embodiments, so their description is relatively brief; for the relevant parts, refer to the description of the method embodiments.

[0341] The above description is merely an embodiment of this document and is not intended to limit the scope of this document. Various modifications and variations can be made to this document by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this document should be included within the scope of the claims of this document.

Claims

1. A training method for an action trajectory generation model, characterized in that the method comprises: obtaining a first sample dataset, the first sample dataset including multiple first action trajectory samples; performing action category feature extraction on each first action trajectory sample to obtain a first action category feature vector, performing action trajectory feature extraction on the first action trajectory sample to obtain an original action trajectory feature vector, and preprocessing the original action trajectory feature vector to obtain a first preprocessed feature vector; and inputting the first action category feature vector and the first preprocessed feature vector corresponding to each first action trajectory sample into a model to be trained for iterative training to obtain the action trajectory generation model; wherein the model to be trained includes a generation sub-model and a discrimination sub-model, and each round of model training is implemented as follows: for each first action trajectory sample, the generation sub-model generates a first predicted feature vector based on the first action category feature vector and the first preprocessed feature vector corresponding to the first action trajectory sample; the discrimination sub-model performs trajectory authenticity discrimination based on the first predicted feature vector and the original action trajectory feature vector corresponding to the first action trajectory sample to generate an action trajectory discrimination result set; action category recognition is performed based on the first predicted feature vector to obtain an action category recognition result; the parameters of the generation sub-model are updated based on the action category recognition result and the action trajectory discrimination result set; and the parameters of the discrimination sub-model are updated based on the action trajectory discrimination result set.

2. The method according to claim 1, characterized in that the generation sub-model includes a feature vector transformation network, a feature vector concatenation network, and an action trajectory generation network; and generating the first predicted feature vector based on the first action category feature vector and the first preprocessed feature vector corresponding to the first action trajectory sample includes: the feature vector transformation network performs feature transformation on the first action category feature vector corresponding to the first action trajectory sample to obtain a second action category feature vector; the feature vector concatenation network concatenates the second action category feature vector with the first preprocessed feature vector corresponding to the first action trajectory sample to obtain a first concatenated feature vector; and the action trajectory generation network performs action trajectory prediction based on the first concatenated feature vector to generate the first predicted feature vector.
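The three networks named in claim 2 can be illustrated with a toy forward pass. This is a minimal sketch under assumptions that the claim does not state: each network is reduced to a single linear layer, the transformation uses a ReLU, and the transformed category vector is concatenated onto every frame of the preprocessed feature vector. All function names and weights are hypothetical.

```python
from typing import List

Vector = List[float]


def linear(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Plain dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def transform_category(category_vec: Vector, weights, bias) -> Vector:
    """Feature vector transformation network (assumed: linear layer + ReLU)."""
    return [max(0.0, v) for v in linear(category_vec, weights, bias)]


def concatenate(category_vec2: Vector, frames: List[Vector]) -> List[Vector]:
    """Feature vector concatenation network (assumed: prepend to every frame)."""
    return [list(category_vec2) + list(frame) for frame in frames]


def generate(concat_frames: List[Vector], weights, bias) -> List[Vector]:
    """Action trajectory generation network (assumed: per-frame linear layer)."""
    return [linear(frame, weights, bias) for frame in concat_frames]


# Toy inputs: a 2-d category vector and two 1-d preprocessed frames.
first_category_vec = [1.0, -1.0]
preprocessed = [[2.0], [3.0]]

second_category_vec = transform_category(
    first_category_vec, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
)
first_concat = concatenate(second_category_vec, preprocessed)
first_predicted = generate(first_concat, [[1.0, 1.0, 1.0]], [0.0])
```

In practice the generation network would more plausibly be a sequence model that attends across frames; the per-frame layer here only keeps the example short.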

3. The method according to claim 1, characterized in that, The preprocessing based on the original motion trajectory feature vector to obtain a first preprocessed feature vector includes: The motion feature vector corresponding to the target image frame in the original motion trajectory feature vector is occluded to obtain the first motion trajectory feature vector; the target image frame is selected from N motion image frames, and the N motion image frames correspond one-to-one with the N 3D skeleton information in the first motion trajectory sample; The occluded motion feature vector in the first motion trajectory feature vector is completed to obtain the second motion trajectory feature vector. The position marking process is performed on the motion feature vectors corresponding to each motion image frame in the second motion trajectory feature vector to obtain the first preprocessed feature vector.
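The three preprocessing operations in claim 3 (occlusion, completion, position marking) might look as follows. The sketch assumes that occluded frames are represented by `None`, that completion fills them with a constant placeholder vector, and that position marking appends the frame index to each motion feature vector. None of these choices is specified in the claim, and the function names are hypothetical.

```python
from typing import List, Optional, Set

Frame = List[float]


def occlude(features: List[Frame], target_frames: Set[int]) -> List[Optional[Frame]]:
    """Hide the motion feature vectors of the selected target image frames."""
    return [None if i in target_frames else list(f) for i, f in enumerate(features)]


def complete(features: List[Optional[Frame]], fill: float = 0.0) -> List[Frame]:
    """Replace every occluded frame with a placeholder vector (assumed: constant fill)."""
    dim = len(next(f for f in features if f is not None))
    return [list(f) if f is not None else [fill] * dim for f in features]


def mark_positions(features: List[Frame]) -> List[Frame]:
    """Append the frame index so that the model can tell frames apart by position."""
    return [list(f) + [float(i)] for i, f in enumerate(features)]


original = [[1.0], [2.0], [3.0]]                 # original action trajectory feature vector
first_traj = occlude(original, {1})              # first action trajectory feature vector
second_traj = complete(first_traj)               # second action trajectory feature vector
first_preprocessed = mark_positions(second_traj)
```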

4. The method according to claim 1, characterized in that the action trajectory discrimination result set includes a first discrimination result indicating that the first predicted feature vector is judged to be real, a second discrimination result indicating that the original action trajectory feature vector is judged to be real, and a third discrimination result indicating that the first predicted feature vector is judged to be fake; and updating the parameters of the generation sub-model based on the action category recognition result and the action trajectory discrimination result set, and updating the parameters of the discrimination sub-model based on the action trajectory discrimination result set, includes: updating the parameters of the generation sub-model based on the action category recognition result and the first discrimination result; and updating the parameters of the discrimination sub-model based on the second discrimination result and the third discrimination result.
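The routing of the three discrimination results in claim 4 can be made concrete with a small loss calculation. The sketch assumes that the discrimination sub-model outputs a probability that its input is real and that each result is scored with binary cross-entropy; the claim does not name a loss function, and `adv_weight` is a hypothetical weighting factor.

```python
import math


def bce(prob: float, label: float) -> float:
    """Binary cross-entropy for a single probability, clamped for numerical safety."""
    eps = 1e-12
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def generator_loss(category_loss: float, d_on_predicted: float, adv_weight: float = 1.0) -> float:
    """Category recognition loss plus the first result (prediction judged as real)."""
    return category_loss + adv_weight * bce(d_on_predicted, 1.0)


def discriminator_loss(d_on_original: float, d_on_predicted: float) -> float:
    """Second result (original judged as real) plus third result (prediction judged as fake)."""
    return bce(d_on_original, 1.0) + bce(d_on_predicted, 0.0)


# A discriminator that cannot tell the two apart outputs 0.5 for both inputs.
g_loss = generator_loss(category_loss=0.3, d_on_predicted=0.5)
d_loss = discriminator_loss(d_on_original=0.5, d_on_predicted=0.5)
```

The same discriminator output on the predicted vector is scored against label 1 for the generation sub-model and against label 0 for the discrimination sub-model, which is what makes the two updates adversarial.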

5. The method according to claim 2, characterized in that, The step of predicting the action trajectory based on the first concatenated feature vector to generate the first predicted feature vector includes: Based on the first concatenated feature vector, the motion trajectory is predicted to obtain the third motion trajectory feature vector; Based on the third motion trajectory feature vector and the original motion trajectory feature vector, a first predicted feature vector is generated; the first predicted feature vector includes the motion feature vector corresponding to the target image frame in the third motion trajectory feature vector and the motion feature vector corresponding to the non-target image frame in the original motion trajectory feature vector.
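The composition rule in claim 5 is a frame-wise selection: predicted frames are used only at the target (occluded) image frames, and original frames are kept everywhere else. A minimal sketch, with a hypothetical function name and toy data:

```python
from typing import List, Set

Frame = List[float]


def merge_frames(third_traj: List[Frame], original: List[Frame], target_frames: Set[int]) -> List[Frame]:
    """Take target frames from the prediction and all other frames from the original."""
    return [
        list(third_traj[i]) if i in target_frames else list(original[i])
        for i in range(len(original))
    ]


third_traj = [[9.0], [8.0], [7.0]]    # third action trajectory feature vector (model output)
original = [[1.0], [2.0], [3.0]]      # original action trajectory feature vector
first_predicted = merge_frames(third_traj, original, {1})
```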

6. The method according to claim 1, characterized in that, The step of extracting action category features based on the first action trajectory sample to obtain a first action category feature vector includes: Based on the first action trajectory sample, construct a spatiotemporal graph of the first action trajectory; The pre-trained action classification model is used to extract action category features based on the first action trajectory spatiotemporal graph to obtain the first action category feature vector.

7. The method according to claim 1, characterized in that, The step of performing action category recognition based on the first predicted feature vector to obtain the action category recognition result includes: Based on the first predicted feature vector, a 3D skeleton transformation process is performed to obtain a predicted motion trajectory sample. Based on the predicted action trajectory samples, a second action trajectory spatiotemporal graph is constructed; The pre-trained action classification model is used to extract action category features based on the second action trajectory spatiotemporal map to obtain a third action category feature vector. Then, action category recognition is performed based on the third action category feature vector to obtain the action category recognition result.

8. The method according to claim 6 or 7, characterized in that, Before obtaining the first sample dataset, the following is also included: Obtain the second sample dataset; the second sample dataset includes multiple second action trajectory samples; Based on the second action trajectory sample, a third action trajectory spatiotemporal graph is constructed; The spatiotemporal graphs of the third motion trajectories are input into the motion classification model to be trained for motion category feature extraction to obtain the fourth motion category feature vector. Based on the fourth motion category feature vector, motion category prediction is performed to obtain the motion category prediction result. Based on the action category prediction results corresponding to each of the second action trajectory samples and the feature vector deviation information, the classification loss value is determined; the feature vector deviation information includes the deviation information between the fourth action category feature vector and the target center feature vector, and the target center feature vector is the center feature vector corresponding to the true action category of the second action trajectory sample; The action classification model is iteratively trained based on the classification loss value to obtain the trained action classification model.
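The classification loss in claim 8 combines a category prediction term with the deviation of the feature vector from the center of its true category. One plausible instance, assumed here rather than taken from the claim, is softmax cross-entropy plus a weighted squared Euclidean distance to the class center; `center_weight` and all function names are hypothetical.

```python
import math
from typing import Dict, List


def cross_entropy(logits: List[float], true_idx: int) -> float:
    """Softmax cross-entropy computed with the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[true_idx]


def center_deviation(feature: List[float], center: List[float]) -> float:
    """Half the squared Euclidean distance between a feature vector and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def classification_loss(
    logits: List[float],
    true_idx: int,
    feature: List[float],
    centers: Dict[int, List[float]],
    center_weight: float = 0.1,
) -> float:
    """Prediction loss plus weighted deviation from the true category's center."""
    deviation = center_deviation(feature, centers[true_idx])
    return cross_entropy(logits, true_idx) + center_weight * deviation


# Two categories, uninformative logits, feature one unit away from its center per axis.
loss = classification_loss(
    logits=[0.0, 0.0],
    true_idx=0,
    feature=[1.0, 1.0],
    centers={0: [0.0, 0.0], 1: [5.0, 5.0]},
)
```

A deviation term of this kind pulls feature vectors of the same category together, which would help make the category feature space usable for the sampling described in claim 9.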

9. The method according to claim 8, characterized in that, after iteratively training the action classification model based on the classification loss value to obtain the trained action classification model, the method further includes: obtaining a third sample dataset, the third sample dataset including multiple third action trajectory samples under a target action category; constructing a fourth action trajectory spatiotemporal graph based on the third action trajectory sample; inputting the fourth action trajectory spatiotemporal graph into the trained action classification model for action category feature extraction to obtain a fifth action category feature vector; and constructing, based on each of the fifth action category feature vectors, action feature distribution information corresponding to the target action category, wherein the action feature distribution information includes a set of candidate feature vectors for different action styles under the target action category, the set of candidate feature vectors is used for sampling the target action category feature vector in the action trajectory generation stage, and the target action category feature vector is used to constrain the action style of the action trajectory.
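
As a rough illustration of building the action feature distribution information, the sketch below runs a feature extractor over every spatiotemporal graph of one target category and keeps the resulting vectors as the candidate set. The `extract_features` callable stands in for the trained action classification model, and the stored `mean` is an optional summary added for illustration; both are assumptions:

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def build_feature_distribution(
    graphs: Sequence[object],
    extract_features: Callable[[object], Sequence[float]],
) -> Dict[str, object]:
    """Collect one category feature vector per sample of the target category."""
    if not graphs:
        raise ValueError("at least one sample of the target category is required")
    candidates: List[Vector] = [list(extract_features(g)) for g in graphs]
    dim = len(candidates[0])
    # Hypothetical summary statistic; the claim only requires the candidate set.
    mean = [sum(v[i] for v in candidates) / len(candidates) for i in range(dim)]
    return {"candidates": candidates, "mean": mean}


# Stand-in extractor: a real system would call the trained classifier here.
def toy_extractor(graph: object) -> Vector:
    return [float(len(str(graph))), 1.0]


distribution = build_feature_distribution(["ab", "abcd"], toy_extractor)
```

Each stored candidate corresponds to one observed sample, so drawing different candidates at generation time yields different styles within the same category.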

10. A method for generating action trajectories, characterized in that the method includes: obtaining an action trajectory to be completed, and obtaining a target action category feature vector, wherein the target action category feature vector is used to constrain the action style of the action trajectory; performing action trajectory feature extraction based on the action trajectory to be completed to obtain an action trajectory feature vector to be completed, and preprocessing the action trajectory feature vector to be completed to obtain a second preprocessed feature vector; inputting the target action category feature vector and the second preprocessed feature vector into the trained action trajectory generation model for action trajectory prediction to obtain a second predicted feature vector, wherein the action trajectory generation model is trained by the method according to any one of claims 1 to 9; generating a completed action trajectory based on the second predicted feature vector; and generating a target action trajectory based on at least one completed action trajectory corresponding to the action trajectory to be completed.

11. The method according to claim 10, characterized in that the action trajectory generation model includes a feature vector transformation network, a feature vector concatenation network, and an action trajectory generation network; and the step of inputting the target action category feature vector and the second preprocessed feature vector into the trained action trajectory generation model for action trajectory prediction to obtain the second predicted feature vector includes: performing, by the feature vector transformation network, feature transformation processing on the target action category feature vector to obtain a transformed action category feature vector; concatenating, by the feature vector concatenation network, the transformed action category feature vector with the second preprocessed feature vector to obtain a second concatenated feature vector; and performing, by the action trajectory generation network, action trajectory prediction based on the second concatenated feature vector to generate the second predicted feature vector.
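
The three-stage data flow in this claim (transform, concatenate, generate) can be sketched with plain lists. Both networks are reduced to single linear maps purely for illustration; the claim does not specify their layers, and the weight matrices here are hypothetical:

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, vec: Sequence[float]) -> Vector:
    """Matrix-vector product, standing in for a learned network."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def generate(
    category_vec: Sequence[float],
    preprocessed_vec: Sequence[float],
    transform_w: Matrix,
    generator_w: Matrix,
) -> Vector:
    # 1. Feature vector transformation network: transform the category vector.
    transformed = linear(transform_w, category_vec)
    # 2. Feature vector concatenation network: join it with the trajectory features.
    concatenated = transformed + list(preprocessed_vec)
    # 3. Action trajectory generation network: predict from the joined vector.
    return linear(generator_w, concatenated)


predicted = generate(
    category_vec=[1.0, 2.0],
    preprocessed_vec=[4.0],
    transform_w=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # maps 2 dims to 3
    generator_w=[[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 1.0]],  # maps 4 dims to 2
)
```

Note that the generator's input width must equal the transformed category width plus the preprocessed feature width, which is the practical consequence of the concatenation step.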

12. The method according to claim 10, characterized in that obtaining the target action category feature vector includes: obtaining the action feature distribution information corresponding to the target action category, and sampling the target action category feature vector from the action feature distribution information; or obtaining a reference action trajectory, and using a pre-trained action classification model to perform action category feature extraction based on the action trajectory spatiotemporal graph corresponding to the reference action trajectory to obtain the target action category feature vector.
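
The two alternatives in this claim can be expressed as one helper with two branches. This is a sketch under assumptions: uniform sampling from the candidate set is one possible sampling rule, and `classifier_features` is a hypothetical stand-in for the pre-trained action classification model:

```python
import random
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def get_target_category_vector(
    candidates: Optional[Sequence[Sequence[float]]] = None,
    reference_graph: Optional[object] = None,
    classifier_features: Optional[Callable[[object], Sequence[float]]] = None,
    rng: Optional[random.Random] = None,
) -> Vector:
    """Return a style-constraining vector by sampling or from a reference."""
    if candidates:
        # Branch 1: sample from the stored distribution (uniform, by assumption).
        picker = rng or random.Random()
        return list(picker.choice(list(candidates)))
    if reference_graph is not None and classifier_features is not None:
        # Branch 2: extract features from the reference trajectory's graph.
        return list(classifier_features(reference_graph))
    raise ValueError("provide either candidates or a reference graph with a classifier")


stored = [[0.1, 0.9], [0.8, 0.2]]
sampled = get_target_category_vector(candidates=stored, rng=random.Random(0))
from_reference = get_target_category_vector(
    reference_graph="reference", classifier_features=lambda g: [0.5, 0.5]
)
```

Sampling suits cases where only the category is specified and style may vary; the reference branch suits cases where a specific example's style should be imitated.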

13. The method according to claim 10, characterized in that the step of generating a target action trajectory based on at least one completed action trajectory corresponding to the action trajectory to be completed includes: determining a synthesized action trajectory based on the completed action trajectory corresponding to at least one action trajectory to be completed; if the synthesized action trajectory does not meet the preset constraints, continuing to obtain the next action trajectory to be completed, so as to generate the completed action trajectory corresponding to that action trajectory to be completed; and if the synthesized action trajectory meets the preset constraints, determining the synthesized action trajectory as the target action trajectory.
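
The control flow of this claim is a loop that keeps completing trajectories until the synthesized result satisfies the preset constraints. A minimal sketch, assuming synthesis is simple concatenation and the constraint is a minimum length; the claim specifies neither, and `complete` is a hypothetical stand-in for the trained generation model:

```python
from typing import Callable, Iterable, List, Optional

Trajectory = List[float]


def synthesize_target_trajectory(
    pending: Iterable[Trajectory],
    complete: Callable[[Trajectory], Trajectory],
    meets_constraints: Callable[[Trajectory], bool],
) -> Optional[Trajectory]:
    """Complete trajectories one by one until the synthesized result is accepted."""
    completed: List[Trajectory] = []
    for segment in pending:
        completed.append(complete(segment))
        # Assumed synthesis rule: concatenate all completed trajectories so far.
        synthesized = [frame for part in completed for frame in part]
        if meets_constraints(synthesized):
            return synthesized
    return None  # inputs exhausted before the constraints were met


target = synthesize_target_trajectory(
    pending=[[1.0], [2.0], [3.0]],
    complete=lambda seg: seg + [seg[-1] + 0.5],  # toy completion: append one frame
    meets_constraints=lambda traj: len(traj) >= 4,  # toy constraint: minimum length
)
```

In this toy run the constraint is met after the second segment, so the third is never completed.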

14. A training device for an action trajectory generation model, characterized in that the device includes: a first acquisition module, used to acquire a first sample dataset, the first sample dataset including multiple first action trajectory samples; a first processing module, used to perform action category feature extraction based on the first action trajectory sample to obtain a first action category feature vector, to perform action trajectory feature extraction based on the first action trajectory sample to obtain an original action trajectory feature vector, and to perform preprocessing based on the original action trajectory feature vector to obtain a first preprocessed feature vector; and a model training module, used to input the first action category feature vector and the first preprocessed feature vector corresponding to each of the first action trajectory samples into a model to be trained for iterative training to obtain the action trajectory generation model; wherein the model to be trained includes a generation sub-model and a discrimination sub-model, and each round of model training is implemented as follows: for each of the first action trajectory samples, the generation sub-model generates a first predicted feature vector based on the first action category feature vector and the first preprocessed feature vector corresponding to the first action trajectory sample; the discrimination sub-model performs trajectory authenticity discrimination based on the first predicted feature vector and the original action trajectory feature vector corresponding to the first action trajectory sample to generate an action trajectory discrimination result set, and performs action category recognition based on the first predicted feature vector to obtain an action category recognition result; the parameters of the generation sub-model are updated based on the action category recognition result and the action trajectory discrimination result set; and the parameters of the discrimination sub-model are updated based on the action trajectory discrimination result set.
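
The update rule in this claim gives the two sub-models different loss inputs: the generation sub-model uses both the category recognition result and the discrimination results, while the discrimination sub-model uses the discrimination results only. The sketch below shows that split using the standard GAN log-loss. The loss forms, the interpretation of the discrimination result set as a (real score, fake score) pair of probabilities, and `category_weight` are assumptions, not taken from the claim:

```python
import math
from typing import Sequence

EPS = 1e-12  # guards against log(0)


def discriminator_loss(real_score: float, fake_score: float) -> float:
    """Reward scoring the original features as real and the predicted ones as fake."""
    return -(math.log(real_score + EPS) + math.log(1.0 - fake_score + EPS))


def generator_loss(
    fake_score: float,
    category_probs: Sequence[float],
    true_category: int,
    category_weight: float = 1.0,  # hypothetical balance between the two terms
) -> float:
    """Authenticity loss plus action category loss; no reconstruction term."""
    authenticity = -math.log(fake_score + EPS)
    category = -math.log(category_probs[true_category] + EPS)
    return authenticity + category_weight * category


# Toy values: the discriminator is undecided and the classifier is at chance.
d_loss = discriminator_loss(real_score=0.5, fake_score=0.5)
g_loss = generator_loss(fake_score=0.5, category_probs=[0.5, 0.5], true_category=0)
```

Neither loss compares the predicted trajectory with the sample directly, which matches the abstract's statement that the prediction is not required to be consistent with the action trajectory sample.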

15. An action trajectory generation device, characterized in that the device includes: a second acquisition module, used to acquire an action trajectory to be completed and to acquire a target action category feature vector, wherein the target action category feature vector is used to constrain the action style of the action trajectory; a second processing module, used to perform action trajectory feature extraction based on the action trajectory to be completed to obtain an action trajectory feature vector to be completed, and to perform preprocessing based on the action trajectory feature vector to be completed to obtain a second preprocessed feature vector; an action trajectory prediction module, used to input the target action category feature vector and the second preprocessed feature vector into the trained action trajectory generation model for action trajectory prediction to obtain a second predicted feature vector, wherein the action trajectory generation model is trained based on the first sample dataset and the corresponding action category labels; a first trajectory generation module, used to generate a completed action trajectory based on the second predicted feature vector; and a second trajectory generation module, used to generate a target action trajectory based on at least one completed action trajectory corresponding to the action trajectory to be completed.

16. A computer device, characterized in that the device includes: a processor; and a memory configured to store computer-executable instructions, the computer-executable instructions being configured to be executed by the processor and including steps for performing the method according to any one of claims 1 to 9 or any one of claims 10 to 13.

17. A storage medium, characterized in that the storage medium is used to store computer-executable instructions that cause a computer to perform the method according to any one of claims 1 to 9 or any one of claims 10 to 13.