A controllable smoke animation generation method and device based on interpretable basis decomposition

By constructing an interpretable basis decomposition network and a generative adversarial network, and combining physical parameters for temporal prediction, the interpretability and controllability issues of smoke animation generation under complex boundary conditions are solved, enabling fast and controllable smoke animation generation and improving generation performance and the effectiveness of prediction results.

CN117541687B · Active · Publication Date: 2026-06-19 · NAT UNIV OF DEFENSE TECH +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: NAT UNIV OF DEFENSE TECH
Filing Date: 2023-11-13
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing deep learning networks lack interpretability and control over the output of intermediate layers when generating smoke animations under complex boundary conditions, resulting in the inability to generate smoke animations with good visual effects.

Method used

A smoke animation generation method based on interpretable basis decomposition is adopted. By constructing an interpretable basis decomposition network, a generation network, and an evaluation network, an interpretable basis of the fluid scene is extracted using a multi-scale encoder network, and a generative adversarial network is designed for temporal prediction. Combined with physical parameters for control, smoke animation is generated.

Benefits of technology

Given an initial smoke scene, it can quickly and controllably generate subsequent smoke animation sequences, supports fine-grained editing of physical parameters at arbitrary boundaries, improves generation performance, and ensures the effectiveness and robustness of prediction results.

✦ Generated by Eureka AI based on patent content.


Abstract

This application relates to a method and apparatus for controllable generation of smoke animation based on interpretable basis decomposition. The method includes: obtaining a sample training dataset comprising a fluid density field and a fluid velocity field; constructing a controllable smoke animation generation network model, which includes an interpretable basis decomposition network model, an interpretable basis generation network model, and an interpretable basis evaluation network model; inputting the sample training dataset into the interpretable basis decomposition network model for training to obtain the smoke interpretable basis; inputting the smoke interpretable bases of adjacent frames, as a secondary sample training dataset, together with control parameters into the interpretable basis generation network model for dimensionality reduction residual training to obtain the secondary smoke interpretable basis and the control parameters at its corresponding time step; inputting both into the interpretable basis evaluation network model for evaluation training to obtain the evaluation result; and generating the smoke animation based on the evaluation result and the time series. Given the boundary conditions of an initial smoke scene, the method can generate smoke animation quickly and effectively.

Description

Technical Field

[0001] This application relates to the field of computer graphics technology, and in particular to a method and apparatus for controllable generation of smoke animation based on interpretable basis decomposition.

Background Technology

[0002] Realistic fluid simulation is increasingly used in film and television special effects, virtual reality, and other fields. Its main performance bottleneck is that traditional fluid simulation methods in computer graphics require significant computational resources and complex parameter tuning, so many applications that demand real-time results fall back on physically unrealistic fluid simulation methods. In recent years, many methods have used deep learning networks to accelerate realistic fluid simulation at high resolutions. However, most of these methods treat deep neural networks as black boxes: they do not analyze the physically meaningful, interpretable parts of the network's intermediate layer outputs, and they cannot directly control the network's output.

[0003] To address this issue, Kim et al., in their paper "Deep fluids: A generative network for parameterized fluid simulations" (2019) published in *Computer Graphics Forum*, used a few parameters as input to guide a generative network in generating smoke animations for simple scenes (e.g., unidirectional constant external force, stationary solids with regular shapes). However, for scenes with complex boundary conditions (e.g., arbitrarily distributed external forces, moving solids with arbitrary curved surfaces), a small number of parameters cannot store enough information, so the approach fails to generate visually appealing smoke animations.

Summary of the Invention

[0004] Therefore, it is necessary to address the aforementioned technical problems by providing a controllable smoke animation generation method and apparatus based on interpretable basis decomposition for rapidly and controllably generating smoke animation sequences in complex boundary scenarios.

[0005] A method for controllable generation of smoke animation based on interpretable basis decomposition, the method comprising:

[0006] Obtain a sample training dataset of the smoke scene, which includes the fluid density field and the fluid velocity field.

[0007] A controllable generation network model for smoke animation is constructed, which includes: an interpretable basis decomposition network model, an interpretable basis generation network model, and an interpretable basis evaluation network model.

[0008] The sample training dataset is input into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset. The multi-scale sample dataset is then input into the autoencoder network for training to obtain a trained interpretable basis decomposition network model. The smoke interpretable basis is obtained through the trained interpretable basis decomposition network model.

[0009] The smoke interpretable bases of adjacent frames, serving as the secondary sample training dataset, are input together with the preset control parameters into the interpretable basis generation network model for dimensionality reduction residual training, so as to obtain the secondary smoke interpretable basis and the control parameters at the time step corresponding to that basis.

[0010] The secondary smoke interpretable bases of adjacent frames and the control parameters at their corresponding time steps are used as the third-level sample training dataset and input into the interpretable basis evaluation network model for evaluation training to obtain the evaluation results.

[0011] The fluid density field of the current frame is generated based on the evaluation results, the fluid velocity field of the secondary smoke interpretable basis of the next frame, and the fluid density field of the secondary smoke interpretable basis of the previous frame. The smoke animation is then generated from the per-frame fluid density fields in time-series order.

[0012] In one embodiment, the fluid motion of the smoke scene is described using the incompressible Navier-Stokes equations:

[0013]

[0014] Where u is the smoke velocity, t is the time step, ρ is the smoke density, p is the pressure, and f is the external force acting on the smoke.
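The equation itself does not survive in this text. As a hedged reconstruction, a standard form of the incompressible Navier-Stokes equations that uses only the symbols listed above would read as follows. The omission of a viscosity term is an assumption, made because no viscosity symbol is defined, and f is assumed to be a force per unit mass:

```latex
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\,\mathbf{u}
  = -\frac{1}{\rho}\,\nabla p + \mathbf{f},
\qquad
\nabla \cdot \mathbf{u} = 0
```

The first equation is the momentum balance and the second is the incompressibility (divergence-free) constraint on the velocity field.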

[0015] In one embodiment, the method further includes: obtaining a sample training dataset by simulating fluid motion in different smoke scenarios using the Euler method.

[0016] In one embodiment, the method further includes: constructing an interpretable basis decomposition network model, which includes a basis decomposition input layer, downsampled residual blocks, upsampled residual blocks, an autoencoder network, and a basis decomposition output layer.

[0017] In one embodiment, the multi-scale sample training dataset includes a multi-scale fluid velocity field and a multi-scale fluid density field. The method further includes: inputting the sample training dataset into the basis decomposition input layer and processing it through the downsampling residual blocks to obtain a sample training dataset with doubled dimensionality; processing the doubled-dimensionality sample training dataset through the upsampling residual blocks to obtain a sample training dataset of doubled size; connecting, at each layer, the doubled-dimensionality sample training dataset with the doubled-size sample training dataset to obtain the multi-scale sample training dataset; inputting the multi-scale sample training dataset into the autoencoder network for dimensionality-weighted summation to obtain a weighted sample training dataset; performing backpropagation training based on the weighted sample training dataset to obtain the trained interpretable basis decomposition network model; and obtaining the smoke interpretable basis through the trained interpretable basis decomposition network model.

[0018] In one embodiment, the interpretable basis generation network model includes: a basis generation input layer, a dimensionality reduction residual block, and a basis generation output layer. It further includes: inputting adjacent frame smoke interpretable bases as secondary sample training datasets along with preset control parameters into the basis generation input layer; then reducing the dimensionality of the control parameters to the same dimension as the secondary sample training dataset using the dimensionality reduction residual block, thus obtaining a parameter-matched smoke interpretable base. Backpropagation is performed based on the parameter-matched smoke interpretable base to obtain a trained interpretable basis generation network model; the trained interpretable basis generation network model is then used to obtain the secondary smoke interpretable base and the control parameters at the corresponding time steps of the secondary smoke interpretable base.

[0019] In one embodiment, the interpretable basis evaluation network model includes: an evaluation input layer, an evaluation residual block, and an evaluation output layer. The number of nodes in the evaluation input layer is twice the dimension of the smoke interpretable basis. The error function of the evaluation output layer is:

[0020]

[0021] Where CE is the cross-entropy function, PE1 is the qualitative evaluation result, and PE2 is the quantitative evaluation result. The two bases to be evaluated are the secondary smoke interpretable basis of the current frame and the secondary smoke interpretable basis of the next frame, p_t is the control parameter at the corresponding time step, and the sgn() function sets each currently active control parameter to 1 and each inactive control parameter to 0.

[0022] A controllable smoke animation generation device based on interpretable basis decomposition, the device comprising:

[0023] The sample training dataset acquisition module is used to acquire the sample training dataset for the smoke scene. The sample training dataset includes the fluid density field and the fluid velocity field.

[0024] The module for constructing a controllable generation network model for smoke animation is used to build a controllable generation network model for smoke animation. The controllable generation network model for smoke animation includes: an interpretable basis decomposition network model, an interpretable basis generation network model, and an interpretable basis evaluation network model.

[0025] The smoke interpretable basis acquisition module is used to input the sample training dataset into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset. The multi-scale sample dataset is then input into the autoencoder network for training to obtain a trained interpretable basis decomposition network model. The smoke interpretable basis is then obtained through the trained interpretable basis decomposition network model.

[0026] The secondary smoke interpretable basis and control parameter matching module is used to input the smoke interpretable bases of adjacent frames, as the secondary sample training dataset, together with the preset control parameters into the interpretable basis generation network model for dimensionality reduction residual training, so as to obtain the secondary smoke interpretable basis and the control parameters at the time step corresponding to that basis.

[0027] The evaluation result acquisition module is used to input the secondary smoke interpretable bases of adjacent frames and the control parameters at their corresponding time steps, as the third-level sample training dataset, into the interpretable basis evaluation network model for evaluation training to obtain the evaluation results.

[0028] The smoke animation generation module is used to generate the fluid density field of the current frame based on the evaluation results, the fluid velocity field of the secondary smoke interpretable basis of the next frame, and the fluid density field of the secondary smoke interpretable basis of the previous frame, and to generate the smoke animation from the per-frame fluid density fields in time-series order.

[0029] A computer device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program performing the following steps:

[0030] Obtain a sample training dataset of the smoke scene, which includes the fluid density field and the fluid velocity field.

[0031] A controllable generation network model for smoke animation is constructed, which includes: an interpretable basis decomposition network model, an interpretable basis generation network model, and an interpretable basis evaluation network model.

[0032] The sample training dataset is input into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset. The multi-scale sample dataset is then input into the autoencoder network for training to obtain a trained interpretable basis decomposition network model. The smoke interpretable basis is obtained through the trained interpretable basis decomposition network model.

[0033] The smoke interpretable bases of adjacent frames, serving as the secondary sample training dataset, are input together with the preset control parameters into the interpretable basis generation network model for dimensionality reduction residual training, so as to obtain the secondary smoke interpretable basis and the control parameters at the time step corresponding to that basis.

[0034] The secondary smoke interpretable bases of adjacent frames and the control parameters at their corresponding time steps are used as the third-level sample training dataset and input into the interpretable basis evaluation network model for evaluation training to obtain the evaluation results.

[0035] The fluid density field of the current frame is generated based on the evaluation results, the fluid velocity field of the secondary smoke interpretable basis of the next frame, and the fluid density field of the secondary smoke interpretable basis of the previous frame. The smoke animation is then generated from the per-frame fluid density fields in time-series order.

[0036] The aforementioned method and apparatus for controllable generation of smoke animation based on interpretable basis decomposition first employ a multi-scale encoder network to extract interpretable bases representing different physical meanings in a fluid scene. Then, a temporal predictor based on a generative adversarial network is designed, taking user-customized physical parameters and the velocity field of the previous frame as input, to predict step by step the temporal evolution of each type of interpretable basis in subsequent frames. Furthermore, a parametric interpretable basis evaluation network is designed, taking the smoke velocity of two adjacent frames as input, to evaluate whether the velocity change between the two frames conforms to the effect of the relevant physical parameters. Finally, the density field of the current frame is calculated from the predicted velocity field and the density field of the previous frame. Given an initial smoke scene, this method enables rapid and controllable generation of subsequent smoke animation sequences, supports the editing of physical parameters or control parameters at arbitrary boundaries, significantly improves the generation performance of smoke scenes, and ensures the effectiveness and robustness of the prediction results.

Attached Figure Description

[0037] Figure 1 is a flowchart illustrating a method for controllable generation of smoke animation based on interpretable basis decomposition in one embodiment;

[0038] Figure 2 is a controllable generation model for smoke animation based on multi-scale interpretable basis decomposition in one embodiment;

[0039] Figure 3 is a schematic diagram of the network structure of an interpretable basis decomposition network in one embodiment;

[0040] Figure 4 is a schematic diagram of the network structure of an interpretable basis generation network in one embodiment;

[0041] Figure 5 is a schematic diagram of the network structure of an interpretable basis evaluation network in one embodiment;

[0042] Figure 6 is a schematic diagram of a training dataset generated through simulation in one embodiment;

[0043] Figure 7 is a schematic diagram illustrating the smoke generation effect in a two-dimensional scene in one embodiment;

[0044] Figure 8 is a schematic diagram illustrating the effect of local external force control in a three-dimensional scene in one embodiment;

[0045] Figure 9 is a schematic diagram illustrating the solid boundary control effect in a three-dimensional scene in one embodiment;

[0046] Figure 10 is a structural block diagram of a controllable smoke animation generation device based on interpretable basis decomposition in one embodiment;

[0047] Figure 11 is an internal structural diagram of a computer device in one embodiment.

Detailed Implementation

[0048] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0049] In one embodiment, as shown in Figure 1, a controllable generation method for smoke animation based on interpretable basis decomposition is provided, including the following steps:

[0050] Step 102: Obtain the sample training dataset for the smoke scene.

[0051] The sample training dataset includes the fluid density field and the fluid velocity field.

[0052] Specifically, physical parameters such as randomly varying smoke sources, arbitrarily distributed external forces, and arbitrary boundary obstacles are set. The density field, velocity field, and signed distance field of the simulated fluid, together with the physical parameters corresponding to each time step, are then generated by simulation with the Euler method and collected to form the training dataset.

[0053] Furthermore, the fluid motion is described using the incompressible Navier-Stokes equations, which are as follows:

[0054]

[0055] Where: u represents the smoke velocity, t represents the time step, ρ represents the smoke density, p represents the pressure, and f represents the external force acting on the smoke.

[0056] Furthermore, based on the Euler method, fluid sequences under different smoke scene conditions are simulated, and the smoke velocity field u_t and density field ρ_t recorded from frame t to frame (t+n) form one set of samples. Several sets of samples are collected as the training dataset.

[0057] Step 104: Construct a controllable generation network model for smoke animation. The controllable generation network model for smoke animation includes: an interpretable basis decomposition network model, an interpretable basis generation network model, and an interpretable basis evaluation network model.

[0058] Step 106: Input the sample training dataset into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset. Input the multi-scale sample dataset into the autoencoder network for training to obtain a trained interpretable basis decomposition network model. Obtain the smoke interpretable basis through the trained interpretable basis decomposition network model.

[0059] Specifically, a basis decomposition network model for the smoke scene is constructed and trained. As shown in Figure 3, the interpretable basis decomposition network model includes: a basis decomposition input layer, downsampling residual blocks, upsampling residual blocks, an autoencoder network, and a basis decomposition output layer. Using the trained basis decomposition network, the velocity field, density field, and physical parameter sequence of the input smoke animation are decomposed to obtain a set of interpretable bases corresponding to each physical parameter.

[0060] Furthermore, 1) the number of nodes in the basis decomposition input layer is the same as the dimension of the input data, and the value of each dimension in the input data of each sample is the same as the value of the corresponding node in the basis decomposition input layer.

[0061] 2) The input data is downsampled by three downsampled residual block structures. Each downsampled residual block structure halves the input size and doubles the dimension.

[0062] 3) The downsampled sample is upsampled by three upsampled residual block structures, where each upsampled residual block structure doubles the input size and halves the dimension.

[0063] 4) Connect the output of the residual block structure of each downsampling part to the output of the residual block structure of each upsampling part to obtain the multi-scale representation of the sample data.

[0064] 5) The autoencoder network output is obtained by weighted summation of the multi-scale information of the samples.

[0065] 6) After obtaining the error of the base decomposition output layer, the backpropagation algorithm is used to obtain the node error in the inner hidden layer, and the gradient descent algorithm is used to adjust the weights between nodes to improve the nonlinear fitting ability.
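The size and dimension bookkeeping in steps 2) to 4) can be sketched as follows. This is only an illustration of how the shapes evolve through three downsampling and three upsampling blocks; the function name `trace_shapes`, the starting spatial size of 64, and the starting channel dimension of 16 are assumptions chosen for the example and do not come from the patent text.

```python
def trace_shapes(size, channels, levels=3):
    """Return (size, channels) after each downsampling and each upsampling block.

    Each downsampling block halves the spatial size and doubles the channel
    dimension; each upsampling block does the reverse.
    """
    down = []
    for _ in range(levels):
        size, channels = size // 2, channels * 2
        down.append((size, channels))
    up = []
    for _ in range(levels):
        size, channels = size * 2, channels // 2
        up.append((size, channels))
    return down, up


# Hypothetical input: 64 cells per axis, 16 feature channels.
down, up = trace_shapes(64, 16)
print(down)  # [(32, 32), (16, 64), (8, 128)]
print(up)    # [(16, 64), (32, 32), (64, 16)]
```

Under these assumptions the first two upsampling outputs have the same shapes as the second and first downsampling outputs, which is what makes it possible to connect outputs of the two paths at matching scales to form a multi-scale representation.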

[0066] Furthermore, the error function of the output layer in step 6) is:

[0067]

[0068] Where u_t and ρ_t denote the velocity field and density field input at time t, respectively, and û_t and ρ̂_t denote the velocity field and density field output by the autoencoder network at time t, respectively.

[0069] Step 108: Input the smoke interpretable bases of adjacent frames, as the secondary sample training dataset, together with the preset control parameters into the interpretable basis generation network model for dimensionality reduction residual training, and obtain the secondary smoke interpretable basis and the control parameters at the time step corresponding to that basis.

[0070] An interpretable basis generation network model includes: a basis generation input layer, a dimension reduction residual block, and a basis generation output layer.

[0071] Specifically, a temporally continuous smoke interpretable basis is obtained from the pre-trained interpretable basis decomposition network model. This smoke interpretable basis is then used as the training sample for the interpretable basis generation network model. Based on the long-term mechanism, the interpretable basis generation network is constructed and trained. The trained interpretable basis generation network is then used to predict the state of subsequent time steps based on the distribution of smoke interpretable basis corresponding to different control parameters (i.e., physical parameters) at the input time step.

[0072] Furthermore, the smoke velocity and density fields recorded at adjacent frames t, t+1, and t+2 are passed through the trained basis decomposition network to obtain the smoke interpretable bases b_t, b_{t+1}, and b_{t+2}. The first two bases are taken as input data and the third basis as output data to form a set of samples; several sets of samples are collected as the training dataset.
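As a minimal sketch of this sample construction, the following builds (first two bases, third basis) samples from a sequence of per-frame bases with a sliding window. The function name `make_triplet_samples`, the flat (frames, dimension) array layout, and the choice to concatenate the two input bases into one vector are assumptions for illustration only.

```python
import numpy as np


def make_triplet_samples(bases):
    """Build training samples from consecutive bases b_t, b_{t+1}, b_{t+2}.

    bases: array of shape (T, D), one basis vector per frame (assumed layout).
    Returns inputs of shape (T-2, 2*D) holding [b_t, b_{t+1}] and targets of
    shape (T-2, D) holding b_{t+2}.
    """
    bases = np.asarray(bases)
    inputs = np.concatenate([bases[:-2], bases[1:-1]], axis=1)
    targets = bases[2:]
    return inputs, targets


# Toy sequence of 5 frames with 2-dimensional bases.
toy = np.arange(10).reshape(5, 2)
x, y = make_triplet_samples(toy)
print(x.shape, y.shape)  # (3, 4) (3, 2)
```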

[0073] Furthermore, an interpretable basis generation network model is constructed and trained, as shown in Figure 4, which specifically includes:

[0074] 1) The number of nodes in the basis generation input layer is the sum of the output dimension and the physical parameter dimension of the basis decomposition network. The value of each dimension in the input data of each sample is the same as the value of the corresponding node in the basis generation input layer.

[0075] 2) The input data is reduced to the same dimension as the interpretable basis by using a structure of seven dimensionality reduction residual blocks.

[0076] 3) After obtaining the error of the base generation output layer, the backpropagation algorithm is used to obtain the node error in the inner hidden layer, and the gradient descent algorithm is used to adjust the weights between nodes to improve the nonlinear fitting ability.

[0077] Furthermore, the error function for generating the output layer in step 3) is:

[0078]

[0079] Here, the basis term in the error function represents the interpretable basis for the m-th parameter at time t.

[0080] Step 110: The secondary smoke interpretable bases of adjacent frames and the control parameters at their corresponding time steps are used as the third-level sample training dataset and input into the interpretable basis evaluation network model for evaluation training to obtain the evaluation results.

[0081] An interpretable base evaluation network model includes: an evaluation input layer, an evaluation residual block, and an evaluation output layer.

[0082] Specifically, temporally continuous smoke interpretable bases obtained from the pre-trained interpretable basis generation network are used as the training samples for the interpretable basis evaluation network. The bases of two adjacent frames are used as the input of the interpretable basis evaluation network, and the physical parameters of the corresponding time step are used to supervise the output of the network during training. The weights of the evaluation network are then fixed, and the consecutive bases output by the basis generation network are evaluated to check whether they conform to the corresponding physical parameters.

[0083] Furthermore, the smoke velocity and density fields of frames t, t+1, ..., t+n are recorded and passed through the trained interpretable basis decomposition network to obtain the bases b_t, b_{t+1}, ..., b_{t+n}. As shown in Figure 5, two adjacent bases are taken as input data and the parameter p_t at the corresponding time step is taken as output data to form a set of samples; several sets of samples are collected as the training dataset.

[0084] Furthermore, 1) the number of nodes in the evaluation input layer is twice the output dimension of the interpretable basis generation network, and the value of each dimension in the input data of each sample is the same as the value of the corresponding node in the evaluation input layer.

[0085] 2) The network was evaluated from both qualitative and quantitative perspectives.

[0086] 3) After obtaining the evaluation output layer error, the backpropagation algorithm is used to obtain the node error in the inner hidden layer, and the gradient descent algorithm is used to adjust the weights between nodes to improve the nonlinear fitting ability.

[0087] In addition, the error function for evaluating the output layer in step 3) is:

[0088]

[0089] Where CE is the cross-entropy function, PE1 is the qualitative assessment, and PE2 is the quantitative assessment.

[0090] Step 112: Generate the fluid density field of the current frame based on the evaluation results, the fluid velocity field of the secondary smoke interpretable basis of the next frame, and the fluid density field of the secondary smoke interpretable basis of the previous frame. Generate the smoke animation from the per-frame fluid density fields in time-series order.

[0091] Specifically, based on the predicted velocity field and the density field of the previous frame, the differentiable MacCormack advection method takes the velocity field u_t and the density field ρ_t at time t as input to compute the density field ρ_{t+1} at time t+1. The smoke animation is then generated over the length of the time series.
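The density update can be illustrated with a simplified one-dimensional MacCormack advection step on a periodic grid. This is a sketch under stated assumptions, not the patent's implementation: the function names, the 1D periodic domain, the linear interpolation, and the global min/max limiter are all choices made for brevity. It is written in NumPy, so it is not itself differentiable; a differentiable variant would express the same operations in an automatic differentiation framework.

```python
import numpy as np


def semi_lagrangian(rho, u, dt, dx=1.0):
    """Advect rho by tracing each cell centre backwards along u (periodic 1D grid)."""
    n = rho.shape[0]
    x = np.arange(n, dtype=float)
    src = (x - u * dt / dx) % n              # departure points in index units
    base = np.floor(src)
    i0 = base.astype(int) % n                # left neighbour (wrapped)
    i1 = (i0 + 1) % n                        # right neighbour (wrapped)
    w = src - base                           # linear interpolation weight
    return (1.0 - w) * rho[i0] + w * rho[i1]


def maccormack_step(rho, u, dt, dx=1.0):
    """One MacCormack step: forward pass, backward pass, then error correction."""
    forward = semi_lagrangian(rho, u, dt, dx)
    backward = semi_lagrangian(forward, -u, dt, dx)
    corrected = forward + 0.5 * (rho - backward)
    # Simplified limiter (assumption): keep values within the previous global range.
    return np.clip(corrected, rho.min(), rho.max())


rho_t = np.array([0.0, 1.0, 2.0, 3.0, 0.0, 0.0])
u_t = np.ones_like(rho_t)                    # uniform velocity of one cell per step
rho_next = maccormack_step(rho_t, u_t, dt=1.0)
print(rho_next)                              # density shifted by exactly one cell
```

With a uniform velocity of one cell per step, the backward trace lands exactly on grid nodes, so the correction term vanishes and the density profile simply shifts by one cell.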

[0092] In the aforementioned method for controllable generation of smoke animation based on interpretable basis decomposition, a multi-scale encoder network is first used to extract interpretable bases representing different physical meanings in the fluid scene. Then, a temporal predictor based on a generative adversarial network is designed, taking user-customized physical parameters and the velocity field of the previous frame as input, to predict the temporal evolution of each type of interpretable base in subsequent frames step by step. Furthermore, a parametric interpretable basis evaluation network is designed, taking the smoke velocity of two adjacent frames as input, to evaluate whether the velocity change between the two frames conforms to the effect of the relevant physical parameters. Finally, based on the predicted velocity field and the density field of the previous frame, the density field of the current frame is calculated using the differentiable MacCormack advection method. Applying this method enables the rapid and controllable generation of subsequent smoke animation sequences given an initial smoke scene, supports arbitrary boundary fine-grained editing of physical parameters, greatly improves the generation performance of smoke scenes, and ensures the effectiveness and robustness of the prediction results.

[0093] In one embodiment, the fluid motion of the smoke scene is described using the incompressible Navier-Stokes equations:

[0094]

[0095] Where u is the smoke velocity, t is the time step, ρ is the smoke density, p is the pressure, and f is the external force acting on the smoke.

[0096] In one embodiment, a sample training dataset is obtained by simulating fluid motion in different smoke scenarios using the Euler method.

[0097] In one embodiment, an interpretable basis decomposition network model is constructed, which includes a basis decomposition input layer, downsampled residual blocks, upsampled residual blocks, an autoencoder network, and a basis decomposition output layer.

[0098] In one embodiment, the multi-scale sample training dataset includes a multi-scale fluid velocity field and a multi-scale fluid density field. The method further includes: inputting the sample training dataset into the basis decomposition input layer and processing it through the downsampling residual blocks to obtain a sample training dataset with doubled dimensionality; processing the doubled-dimensionality sample training dataset through the upsampling residual blocks to obtain a sample training dataset of doubled size; connecting, at each layer, the doubled-dimensionality sample training dataset with the doubled-size sample training dataset to obtain the multi-scale sample training dataset; inputting the multi-scale sample training dataset into the autoencoder network for dimensionality-weighted summation to obtain a weighted sample training dataset; performing backpropagation training based on the weighted sample training dataset to obtain the trained interpretable basis decomposition network model; and obtaining the smoke interpretable basis through the trained interpretable basis decomposition network model.

[0099] In one embodiment, the interpretable basis generation network model includes a basis generation input layer, a dimensionality reduction residual block, and a basis generation output layer. The smoke interpretable bases of adjacent frames, serving as the secondary sample training dataset, are input together with the preset control parameters into the basis generation input layer. The dimensionality reduction residual block then reduces the control parameters to the same dimension as the secondary sample training dataset, yielding a parameter-matched smoke interpretable basis. Backpropagation training on the parameter-matched smoke interpretable basis produces the trained interpretable basis generation network model, which is then used to obtain the secondary smoke interpretable basis and the control parameters at its corresponding time steps.
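The dimension-matching step can be pictured with a small sketch in which a single linear projection stands in for the dimensionality reduction residual block. Everything here is an assumption made for illustration: the function names, the use of one fixed weight matrix, and the choice to concatenate the matched parameters with the two adjacent-frame bases.

```python
def project(vector, weight):
    """Multiply a weight matrix (one row per output dimension) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weight]

def match_parameters(params, weight, basis_dim):
    """Reduce the control parameters to the dimension of the interpretable basis."""
    matched = project(params, weight)
    if len(matched) != basis_dim:
        raise ValueError("projection does not produce the basis dimension")
    return matched

def build_generator_input(prev_basis, next_basis, params, weight):
    """Concatenate two adjacent-frame bases with the dimension-matched parameters."""
    matched = match_parameters(params, weight, len(prev_basis))
    return list(prev_basis) + list(next_basis) + matched

# Usage with assumed values: four parameter values reduced to a basis dimension of two.
weight = [[0.5, 0.5, 0.0, 0.0],
          [0.0, 0.0, 0.5, 0.5]]
generator_input = build_generator_input([0.1, 0.2], [0.3, 0.4], [1.0, 3.0, 5.0, 7.0], weight)
```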

[0100] In one embodiment, the interpretable basis evaluation network model includes: an evaluation input layer, an evaluation residual block, and an evaluation output layer. The number of nodes in the evaluation input layer is twice the dimension of the smoke interpretable basis. The error function of the evaluation output layer is:

[0101]

[0102] Where CE is the cross-entropy function, PE1 is the qualitative evaluation result, and PE2 is the quantitative evaluation result. One basis to be evaluated is the second-order smoke interpretable basis of the current frame, and the other is the second-order smoke interpretable basis of the next frame. The sgn() function sets the currently active control parameter to 1 and the inactive control parameter to 0.
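The ingredients named here (a cross-entropy term, a qualitative result PE1, a quantitative result PE2, and an activity mask from sgn()) can be combined in many ways. The sketch below shows one possible combination purely as an illustration; it is not the error function of this application, whose formula is not reproduced in the text. Treating a parameter as "active" when it is non-zero, comparing PE2 with the parameter values by a masked squared error, and all function names are assumptions.

```python
import math

def active_mask(params):
    """Return 1.0 for an active (here: non-zero) control parameter, else 0.0."""
    return [1.0 if p != 0 else 0.0 for p in params]

def binary_cross_entropy(predicted, target, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for q, y in zip(predicted, target):
        q = min(max(q, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(q) + (1.0 - y) * math.log(1.0 - q)
    return total / len(target)

def evaluation_error(pe1, pe2, params):
    """Qualitative term: which parameters act. Quantitative term: how strongly."""
    mask = active_mask(params)
    qualitative = binary_cross_entropy(pe1, mask)
    quantitative = sum(m * (q - p) ** 2 for m, q, p in zip(mask, pe2, params)) / len(params)
    return qualitative + quantitative

# Usage with assumed values: three control parameters, the second one inactive.
error = evaluation_error(pe1=[0.9, 0.2, 0.8], pe2=[0.25, 0.0, -0.4], params=[0.3, 0.0, -0.5])
```

Masking the quantitative term keeps inactive parameters from contributing to the error, which is the role the text assigns to sgn().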

[0103] In one embodiment, as shown in Figure 2, a controllable fluid animation generation model based on multi-scale interpretable basis decomposition is proposed, including a basis decomposition network, a basis generation network, and a basis evaluation network. In a smoke scene, the density and velocity fields of the fluid over the continuous time frames t to t+n are acquired and input into the controllable fluid animation generation model based on multi-scale interpretable basis decomposition. First, after training the basis decomposition network, a temporally continuous interpretable basis for the smoke is obtained. Second, the interpretable basis for the smoke is matched with the physical parameters set according to the current smoke scene conditions at different time steps. After training the basis generation network, a predicted interpretable basis (i.e., the interpretable basis to be evaluated) is obtained. Then, the weights of the basis evaluation network are fixed, and the continuous bases output by the basis generation network are evaluated to determine whether they conform to the corresponding physical parameters. The interpretable bases to be evaluated at time t and at time t+1 are evaluated both quantitatively and qualitatively. The evaluation results are used to train the basis evaluation network by backpropagation. Finally, using the differentiable MacCormack advection method, the density field ρ_{t+1} at time t+1 is calculated from the velocity field u_t and the density field ρ_t at time t, generating realistic smoke animation over the continuous time period.
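The frame-by-frame generation described in this embodiment can be sketched as a short rollout loop. The sketch assumes two hypothetical callables that are not defined by this application: `predict_velocity`, standing in for the trained basis generation network, and `advect`, standing in for the differentiable MacCormack advection step.

```python
def rollout(density, velocity, params_per_step, predict_velocity, advect):
    """Generate a density sequence from an initial state and per-step control parameters."""
    frames = [density]
    for params in params_per_step:
        # Density at t+1 from the density and velocity at t.
        density = advect(density, velocity)
        # Velocity at t+1 from the velocity at t and the user-set parameters.
        velocity = predict_velocity(velocity, params)
        frames.append(density)
    return frames

# Usage with toy stand-ins (assumptions): the velocity is an integer cell shift and
# the "network" simply adds the external-force parameter to it.
def toy_advect(density, shift):
    return density[-shift:] + density[:-shift] if shift else list(density)

def toy_predict(velocity, params):
    return velocity + params

frames = rollout([1, 0, 0, 0], 1, [0, 1], toy_predict, toy_advect)
```

Because each step consumes only the previous frame and the current parameters, the control parameters can be changed between any two frames of the rollout.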

[0104] It should be understood that, although the steps in the flowchart of Figure 1 are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless otherwise specified herein, there is no strict order in which these steps are executed, and they can be performed in other orders. At least some of the steps in Figure 1 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these sub-steps or stages is not necessarily sequential; they can be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

[0105] In one embodiment, Figure 6 shows a smoke animation simulated using this method. Horizontal external force, vorticity coefficient, and arbitrary boundary obstacles were selected as physical parameters. The external force ranged from -0.5 to 0.5, and the vorticity coefficient ranged from 1.0 to 4.0. Obstacles were models from the NTU 3D model library. The simulation generated a dataset containing 300 scenes, with 100 frames per scene. 80% of the data was used for training, and the remaining 20% was used for testing.
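The dataset layout described here can be sketched as follows. The ranges, scene count, frame count, and 80/20 ratio come from the text; the uniform sampling, the split by scene rather than by frame, the seed, and the function names are assumptions made for illustration.

```python
import random

def sample_scene_parameters(num_scenes, seed=0):
    """Draw one external force and one vorticity coefficient per scene (uniform, assumed)."""
    rng = random.Random(seed)
    return [
        {
            "scene_id": i,
            "external_force": rng.uniform(-0.5, 0.5),
            "vorticity": rng.uniform(1.0, 4.0),
        }
        for i in range(num_scenes)
    ]

def split_scenes(scenes, train_fraction=0.8, seed=0):
    """Shuffle the scenes and split them into training and test subsets."""
    shuffled = list(scenes)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

FRAMES_PER_SCENE = 100
scenes = sample_scene_parameters(300)
train_scenes, test_scenes = split_scenes(scenes)
total_frames = len(scenes) * FRAMES_PER_SCENE
```

Splitting by scene keeps all frames of one simulation on the same side of the split, which avoids near-duplicate frames appearing in both training and test data.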

[0106] In one embodiment, as shown in Figure 7, the smoke animation generation results in a two-dimensional scene at times t = 80, 120, 160, and 200 are presented and compared with physics-based smoke generation results and with the results of other deep-learning-based methods. The results generated by this method maintain the overall motion trend well.

[0107] In one embodiment, Figure 8 shows the smoke animation generation results in a 3D scene with localized external force control at times t = 20, 40, 60, 80, and 100. Opposite external forces were applied to the left and right sides of the scene. The generated results successfully demonstrate the effect of the external forces tearing the smoke apart toward both sides.

[0108] In one embodiment, Figure 9 shows the smoke animation generation results in a 3D scene with local solid boundary editing at times t = 40, 60, 80, 100, 120, 140, and 160. At t = 80, the solid boundary was edited by removing the ear portion of the rabbit-shaped solid. The generated result successfully reflects the effect of smoke billowing outwards from the inside of the rabbit-shaped solid.

[0109] In one embodiment, as shown in Figure 10, a controllable smoke animation generation device based on interpretable basis decomposition is provided, including: a sample training dataset acquisition module 1002, a smoke animation controllable generation network model construction module 1004, a smoke interpretable basis acquisition module 1006, a second-order smoke interpretable basis and control parameter matching module 1008, an evaluation result acquisition module 1010, and a smoke animation generation module 1012, wherein:

[0110] The sample training dataset acquisition module 1002 is used to acquire the sample training dataset for the smoke scene. The sample training dataset includes the fluid density field and the fluid velocity field.

[0111] The smoke animation controllable generation network model construction module 1004 is used to construct a smoke animation controllable generation network model, which includes: an interpretable basis decomposition network model, an interpretable basis generation network model, and an interpretable basis evaluation network model.

[0112] The smoke interpretable basis acquisition module 1006 is used to input the sample training dataset into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset. The multi-scale sample dataset is then input into the autoencoder network for training to obtain a trained interpretable basis decomposition network model. The smoke interpretable basis is then obtained through the trained interpretable basis decomposition network model.

[0113] The second-level smoke interpretable basis and control parameter matching module 1008 is used to input the smoke interpretable basis of adjacent frames as the second-level sample training dataset and the preset control parameters into the interpretable basis generation network model for dimensionality reduction residual training, so as to obtain the second-level smoke interpretable basis and the control parameters of the corresponding time step of the second-level smoke interpretable basis.

[0114] The evaluation result acquisition module 1010 is used to input the second-level smoke interpretable bases of adjacent frames, together with the control parameters at their corresponding time steps, as the third-level sample training dataset into the interpretable basis evaluation network model for evaluation training, thereby obtaining the evaluation result.

[0115] The smoke animation generation module 1012 is used to generate the fluid density field of the current frame based on the evaluation results, the fluid velocity field of the second-order smoke interpretable basis of the next frame, and the fluid density field of the second-order smoke interpretable basis of the previous frame. The smoke animation is then generated from the fluid density fields of the current frames in time-series order.

[0116] Specific limitations regarding the controllable generation device for smoke animation based on interpretable basis decomposition can be found in the above description of the controllable generation method for smoke animation based on interpretable basis decomposition, and will not be repeated here. Each module in the aforementioned controllable generation device for smoke animation based on interpretable basis decomposition can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the operations corresponding to each module.

[0117] In one embodiment, a computer device is provided, which may be a terminal, and its internal structure may be as shown in Figure 11. The computer device includes a processor, memory, network interface, display screen, and input devices connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The network interface is used to communicate with external terminals via a network connection. When executed by the processor, the computer program implements a controllable generation method for smoke animation based on interpretable basis decomposition. The display screen can be a liquid crystal display (LCD) or an e-ink display. The input devices can be a touch layer covering the display screen; buttons, a trackball, or a touchpad mounted on the computer device casing; or an external keyboard, touchpad, or mouse.

[0118] Those skilled in the art will understand that the structures shown in Figures 10-11 are merely block diagrams of the portions of the structure related to the present application and do not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figures, combine certain components, or have different component arrangements.

[0119] In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program, the processor executing the computer program to perform the following steps:

[0120] Obtain a sample training dataset of the smoke scene, which includes the fluid density field and the fluid velocity field.

[0121] A controllable generation network model for smoke animation is constructed, which includes: an interpretable basis decomposition network model, an interpretable basis generation network model, and an interpretable basis evaluation network model.

[0122] The sample training dataset is input into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset. The multi-scale sample dataset is then input into the autoencoder network for training to obtain a trained interpretable basis decomposition network model. The smoke interpretable basis is obtained through the trained interpretable basis decomposition network model.

[0123] The smoke interpretable bases of adjacent frames, serving as the secondary sample training dataset, are input together with the preset control parameters into the interpretable basis generation network model for dimensionality reduction residual training, so as to obtain the secondary smoke interpretable basis and the control parameters of the corresponding time step of the secondary smoke interpretable basis.

[0124] The second-level smoke interpretable bases of adjacent frames, together with the control parameters at their corresponding time steps, are used as the third-level sample training dataset and input into the interpretable basis evaluation network model for evaluation training to obtain the evaluation results.

[0125] The fluid density field of the current frame is generated based on the evaluation results, the fluid velocity field of the second-order smoke interpretable basis of the next frame, and the fluid density field of the second-order smoke interpretable basis of the previous frame. The smoke animation is then generated from the fluid density fields of the current frames in time-series order.

[0126] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, storage, databases, or other media used in the embodiments provided in this application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

[0127] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0128] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.

Claims

1. A method for controllable generation of smoke animation based on interpretable basis decomposition, characterized in that the method includes: obtaining a sample training dataset of a smoke scene, wherein the sample training dataset includes a fluid density field and a fluid velocity field; constructing a controllable generation network model for smoke animation, which includes an interpretable basis decomposition network model, an interpretable basis generation network model, and an interpretable basis evaluation network model; inputting the sample training dataset into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset, inputting the multi-scale sample training dataset into the autoencoder network for training to obtain the trained interpretable basis decomposition network model, and obtaining the smoke interpretable basis through the trained interpretable basis decomposition network model; inputting the smoke interpretable bases of adjacent frames, as a secondary sample training dataset, together with preset control parameters into the interpretable basis generation network model for dimensionality reduction residual training to obtain the secondary smoke interpretable basis and the control parameters of the corresponding time step of the secondary smoke interpretable basis; inputting the second-level smoke interpretable bases of adjacent frames, together with the control parameters at their corresponding time steps, as the third-level sample training dataset into the interpretable basis evaluation network model for evaluation training to obtain the evaluation result; generating the fluid density field of the current frame based on the evaluation result, the fluid velocity field of the secondary smoke interpretable basis of the next frame, and the fluid density field of the secondary smoke interpretable basis of the previous frame; and generating a smoke animation from the fluid density fields of the current frames in time-series order.

2. The method according to claim 1, characterized in that the fluid motion of the smoke scene is described using the incompressible Navier-Stokes equations, where u is the smoke velocity, t is the time step, ρ is the smoke density, p is the pressure, and f is the external force acting on the smoke.

3. The method according to claim 2, characterized in that obtaining a sample training dataset of a smoke scene includes: obtaining the sample training dataset by simulating fluid motion in different smoke scenes using the Eulerian method.

4. The method according to any one of claims 1 to 3, characterized in that, before the step of inputting the sample training dataset into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset, the method further includes: constructing the interpretable basis decomposition network model, which includes a basis decomposition input layer, a downsampling residual block, an upsampling residual block, an autoencoder network, and a basis decomposition output layer.

5. The method according to claim 4, characterized in that the multi-scale sample training dataset includes a multi-scale fluid velocity field and a multi-scale fluid density field; and the steps of inputting the sample training dataset into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset, inputting the multi-scale sample training dataset into the autoencoder network for training to obtain a trained interpretable basis decomposition network model, and obtaining the smoke interpretable basis through the trained interpretable basis decomposition network model include: after the sample training dataset is input into the basis decomposition input layer, processing it by the downsampling residual block to obtain the sample training dataset with twice the dimension; processing the sample training dataset with twice the dimension by the upsampling residual block to obtain the sample training dataset with twice the size; connecting the sample training dataset with twice the dimension of each layer with the sample training dataset with twice the size to obtain the multi-scale sample training dataset; inputting the multi-scale sample training dataset into the autoencoder network for dimension-weighted summation to obtain a weighted sample training dataset; performing backpropagation training based on the weighted sample training dataset to obtain the trained interpretable basis decomposition network model; and obtaining the smoke interpretable basis through the trained interpretable basis decomposition network model.

6. The method according to claim 3, characterized in that the interpretable basis generation network model includes a basis generation input layer, a dimension reduction residual block, and a basis generation output layer; and the step of inputting the smoke interpretable bases of adjacent frames, as a secondary sample training dataset, together with preset control parameters into the interpretable basis generation network model for dimensionality reduction residual training to obtain the secondary smoke interpretable basis and the control parameters of the corresponding time step of the secondary smoke interpretable basis includes: after the smoke interpretable bases of adjacent frames, as the secondary sample training dataset, and the preset control parameters are input into the basis generation input layer, reducing the dimension of the control parameters to the same dimension as the secondary sample training dataset through the dimension reduction residual block, so as to obtain the parameter-matched smoke interpretable basis; performing backpropagation training based on the parameter-matched smoke interpretable basis to obtain the trained interpretable basis generation network model; and obtaining, through the trained interpretable basis generation network model, the second-level smoke interpretable basis and the control parameters of the corresponding time step of the second-level smoke interpretable basis.

7. The method according to claim 6, characterized in that the interpretable basis evaluation network model includes an evaluation input layer, an evaluation residual block, and an evaluation output layer; the number of nodes in the evaluation input layer is twice the dimension of the smoke interpretable basis; and in the error function of the evaluation output layer, CE is the cross-entropy function, PE1 is the qualitative evaluation result, PE2 is the quantitative evaluation result, one basis to be evaluated is the second-order smoke interpretable basis of the current frame, the other basis to be evaluated is the secondary smoke interpretable basis of the next frame, and the sgn() function sets the currently active control parameter to 1 and the inactive control parameter to 0.

8. A controllable smoke animation generation device based on interpretable basis decomposition, characterized in that the device includes: a sample training dataset acquisition module, used to acquire a sample training dataset of a smoke scene, wherein the sample training dataset includes a fluid density field and a fluid velocity field; a smoke animation controllable generation network model construction module, used to construct a smoke animation controllable generation network model, which includes an interpretable basis decomposition network model, an interpretable basis generation network model, and an interpretable basis evaluation network model; a smoke interpretable basis acquisition module, used to input the sample training dataset into the interpretable basis decomposition network model for sampling to obtain a multi-scale sample training dataset, input the multi-scale sample training dataset into the autoencoder network for training to obtain the trained interpretable basis decomposition network model, and obtain the smoke interpretable basis through the trained interpretable basis decomposition network model; a secondary smoke interpretable basis and control parameter matching module, used to input the smoke interpretable bases of adjacent frames, as a secondary sample training dataset, together with preset control parameters into the interpretable basis generation network model for dimensionality reduction residual training, so as to obtain the secondary smoke interpretable basis and the control parameters of the corresponding time step of the secondary smoke interpretable basis; an evaluation result acquisition module, used to input the second-level smoke interpretable bases of adjacent frames, together with the control parameters at their corresponding time steps, as the third-level sample training dataset into the interpretable basis evaluation network model for evaluation training, and obtain the evaluation result; and a smoke animation generation module, used to generate the fluid density field of the current frame based on the evaluation results, the fluid velocity field of the second-order smoke interpretable basis of the next frame, and the fluid density field of the second-order smoke interpretable basis of the previous frame, and to generate the smoke animation from the fluid density fields of the current frames in time-series order.

9. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 7.